Code Canvas Technologies
RapidJ - Rapid Java Web Development
 

Forum Ferret

Forum Ferret is a simple Internet forum scanner developed using RapidJ v1.1 and now distributed with RapidJ as an example project. Code Canvas hopes that you find it useful and that it gives you some insight into how applications generated by RapidJ can be customised.

Note: Prior to RapidJ v1.1, Forum Ferret was provided as a download.

Note: Before scanning a website, make sure that doing so does not violate its terms of use.

Screenshots

1. Entity configuration
2. Creating the database
3. GUI component configuration
4. Generated code
5. Viewing customisations
6. Scanner configuration
7. Scanning for messages
   

Features

Forum Ferret has the following features:

  • Scans Internet forums for new messages.
  • Identifies messages with new replies.
  • Lets you mark messages as important.
  • Supports flexible scanner configuration using regular expressions.
  • Lets you sort messages by various fields.

Think something is missing? Try adding it yourself!

Installation

As of RapidJ v1.1, Forum Ferret is included with RapidJ as an example project.

To install the Forum Ferret example project:

  • Open the Forum Ferret example project from the examples directory in the RapidJ installation directory.
  • Install Forum Ferret in the same way as the other example projects. (For help with this, see the Customer Database example under Getting Started in the Help Contents.)
    NOTE: By default, Forum Ferret uses the forumferret database in the bundled HSQLDB database server. Uncheck Test records when generating the database script so that no test data is created.
  • Open Forum Ferret. If you have installed it into the bundled Resin application server, open it at http://localhost:8080/forumferret/
  • Now you are ready to configure some scanners.

Scanner Configuration

To configure a scanner you need the following:

  • URL of the forum - The URL must be a link to a forum page that displays a list of messages, each with a title, link and optionally either the number of posts or the number of replies. The URL should include any necessary query parameters.
  • A regular expression - The regular expression must match each message on the page and capture the URL, the title and the number of replies (or posts), each in its own group.

To add a new scanner, go to the Scanner List page and press the Add New button.

The following table describes the scanner fields and shows an example configuration that searches the struts-user mailing list on the Mailing list ARChives for the month of July 2005.

Field: Name
Description: A meaningful name.
Example: Struts User - Mailing list ARChives

Field: Url
Description: The location of the forum.
Example: http://marc.theaimsgroup.com/?l=struts-user&r=1&b=200506&w=2

Field: Reg Expr
Description: The regular expression used to match each message on the page.
Example: \[((<a href=".*?">)|(<font color=".*?">))(\d*?)((</a>)|(</font>))] <a href="(\?l=struts-user&m=.*?)">(.*?)</a>

Field: Title Group
Description: The group index that corresponds to the title of the message.
Example: 9

Field: Url Group
Description: The group index that corresponds to the URL of the message.
Example: 8

Field: Num Replies Group
Description: The group index that corresponds to the number of replies to the message.
Example: -1

Field: Num Posts Group
Description: The group index that corresponds to the total number of posts for the message thread.
Example: 4

Note: Group 0 is the entire matched string. The first group that you explicitly specify starts from an index of 1. Specify a group of -1 to indicate that there is no group for a particular field.
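
To see how the group indexes in the example configuration line up, the following sketch applies the example Reg Expr with java.util.regex and reads groups 9, 8 and 4. It is only an illustration: the sample line is invented, the class and method names are hypothetical, and it assumes the expression behaves the same way under java.util.regex as it does inside Forum Ferret.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupIndexDemo {
    // The example Reg Expr, written as a Java string literal
    // (backslashes and double quotes are escaped for Java).
    static final String REG_EXPR =
        "\\[((<a href=\".*?\">)|(<font color=\".*?\">))(\\d*?)((</a>)|(</font>))] "
        + "<a href=\"(\\?l=struts-user&m=.*?)\">(.*?)</a>";

    // Hypothetical sample line, invented for illustration only.
    static final String SAMPLE =
        "[<a href=\"?t=1&r=1&w=2\">3</a>] "
        + "<a href=\"?l=struts-user&m=12345&w=2\">Validation question</a>";

    // Returns {title, url, numPosts} for the first message found, or null if none matches.
    static String[] extract(String html) {
        Matcher m = Pattern.compile(REG_EXPR).matcher(html);
        if (!m.find()) {
            return null;
        }
        // Groups are numbered by the position of their opening parenthesis,
        // counting from 1; group 0 is the entire matched string.
        return new String[] { m.group(9), m.group(8), m.group(4) };
    }

    public static void main(String[] args) {
        String[] fields = extract(SAMPLE);
        System.out.println("Title:     " + fields[0]);
        System.out.println("Url:       " + fields[1]);
        System.out.println("Num posts: " + fields[2]);
    }
}
```

Counting opening parentheses from the left gives the same numbers as the table: the fourth is (\d*?), the eighth is the message link and the ninth is the title.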

Note: Usually forums will display either the number of replies for a message or the total number of messages in that thread. This means you only need to specify one of Num Replies Group or Num Posts Group. Specify a group of -1 for the field that is not used.
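
As a rough sketch of the -1 convention, the following example handles the opposite case to the table above: a forum that shows a reply count but no post count. The forum layout, the regular expression, the class name and the groupOrNull helper are all hypothetical and are not taken from the Forum Ferret source.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnusedGroupDemo {
    // Hypothetical configuration for an invented forum layout.
    static final String REG_EXPR = "<a href=\"(.*?)\">(.*?)</a> \\((\\d+) replies\\)";
    static final int URL_GROUP = 1;
    static final int TITLE_GROUP = 2;
    static final int NUM_REPLIES_GROUP = 3;
    static final int NUM_POSTS_GROUP = -1; // this forum shows no post count

    // Returns the captured text, or null when the index is -1 (field not used).
    static String groupOrNull(Matcher m, int index) {
        return index == -1 ? null : m.group(index);
    }

    // Returns {title, url, replies, posts} for the first message found, or null if none matches.
    static String[] firstMessage(String html) {
        Matcher m = Pattern.compile(REG_EXPR).matcher(html);
        if (!m.find()) {
            return null;
        }
        return new String[] {
            groupOrNull(m, TITLE_GROUP),
            groupOrNull(m, URL_GROUP),
            groupOrNull(m, NUM_REPLIES_GROUP),
            groupOrNull(m, NUM_POSTS_GROUP)
        };
    }

    public static void main(String[] args) {
        String html = "<li><a href=\"/thread/12\">Install problem</a> (3 replies)</li>";
        String[] msg = firstMessage(html);
        System.out.println("title=" + msg[0] + " url=" + msg[1]
            + " replies=" + msg[2] + " posts=" + msg[3]);
    }
}
```

Checking for -1 before calling Matcher.group matters because group throws an IndexOutOfBoundsException when given an index that does not exist in the pattern.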

Regular Expressions

The following resources can help you learn more about regular expressions in Java:

Feedback/Bugs

Want to let us know what you think of Forum Ferret? Found a bug? Please send feedback.