Make your own Java newsreader

XML feeds are today’s way to keep tabs on what’s happening in the blogosphere and to learn about site updates and new additions. I had heard about two popular, feature-rich open source Java APIs for dealing with feeds, Rome and Informa, but had never had a chance to try them. For one of my own sites I more or less hand-built the aggregator, because Myjavaserver does not allow deploying external JARs or taglibs. So when the opportunity came up, I had to try them.

The following is a very elementary example of using the two news aggregation libraries to imitate a newsreader. The examples are pretty basic, and they hit the specified feed URL every time you call the JSPs. The code snippets are not meant to demonstrate good coding practice.

Both libraries support almost all versions of RSS, RDF and Atom, and offer features such as dynamic discovery of the feed format. Feature-wise Informa probably has the upper hand: it supports OPML, recognizes the Enclosure element (which makes it suitable for podcast feeds) and can be configured to use a persistence mechanism built on Hibernate. What it lacks is documentation. There are no primers at the site, and the code is very poorly commented, so the Javadocs do little to help you get up to speed quickly. The two articles I could find through Google were outdated, as I used the 0.6.5 version of the library. Rome, on the other hand, has very nice documentation available at its site, complete with code examples. Unfortunately, many desirable features are still on its TODO list. While I have not investigated them, there are a number of sub-projects based on both Rome and Informa; for example, there is a JSP tag library based on Informa. There is a short review of various libraries here, but I guess much of the material on Informa is no longer relevant since its latest release.

It is fair to say that each API has its own strengths: the Informa library is pretty bulky but supports OPML, while Rome has wider support for all kinds of XML feeds and a pluggable architecture. The good thing about these APIs is that they offer pretty much everything you may want to do with feeds: reading them, generating your own, creating a digest from multiple feeds, and so on.

To run these JSPs, needless to say, you need to download the Informa and Rome libraries. I ran them on jakarta-tomcat-5.0.28 / j2sdk1.4.2_06, and the only dependency I was missing was the JDOM jar that Rome needs.

Click here for a bare-minimum Newsreader based on Informa. Click here for a bare-minimum Newsreader based on Rome.
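The linked JSPs are not reproduced here. As a rough, library-free illustration of what a bare-minimum newsreader has to do, here is a sketch that lists the title and link of every item in an RSS 2.0 document using only the JDK's DOM parser. It does not use Rome or Informa and does not show their APIs; the class name MiniNewsreader, the helper names and the sample feed are all made up for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical stand-in for a newsreader; not the Rome or Informa API.
class MiniNewsreader {

    // Returns one "title -> link" string per <item> in an RSS 2.0 document.
    static List<String> readItems(String rssXml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(rssXml)));
            NodeList items = doc.getElementsByTagName("item");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                result.add(text(item, "title") + " -> " + text(item, "link"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse feed", e);
        }
    }

    // Text of the first descendant element with the given name, or "" if absent.
    static String text(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        // Made-up sample feed; a real reader would fetch this from a feed URL.
        String sample = "<rss version=\"2.0\"><channel><title>Demo</title>"
                + "<item><title>First post</title><link>http://example.com/1</link></item>"
                + "<item><title>Second post</title><link>http://example.com/2</link></item>"
                + "</channel></rss>";
        for (String line : readItems(sample)) {
            System.out.println(line);
        }
    }
}
```

A real reader would also have to cope with RDF and Atom, encodings and caching, which is exactly the work the two libraries take off your hands.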

I hope this post helped you. Do leave your comments here.

Disclaimer: The information provided on this page comes without any warranty whatsoever. Use it at your own risk.


DateTime and other problems

Bloglet subscribers to my feed may not have been getting my posts by email for about a week now. The problem seems to have surfaced after some changes were made to the JRoller RSS feed XML format. Bloglet now throws the error “String was not recognized as a valid DateTime” for JRoller feeds. Monsur of Bloglet wrote in reply to my mail:

Bloglet is rejecting RSS feeds with dates in the following format: “Tue, 21 Oct 2003 24:56:21 -0500”. Its probably a good idea to change your date to adhere to the XML-RPC spec, found here. We have registered it as an outstanding issue though.
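Bloglet's own parser is not shown in the mail, so the following is only a sketch of how a strict date parser reacts to such a value, using Java's SimpleDateFormat as a stand-in. One plausible culprit in the quoted date is the hour field: 24 is outside the 00-23 range that RFC 822 style dates allow. The class and method names are invented for the example.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;

// Hypothetical helper; not Bloglet's or JRoller's actual code.
class FeedDateCheck {

    // True if the value parses as an RFC 822 style date in strict mode.
    static boolean isValidFeedDate(String value) {
        SimpleDateFormat format =
                new SimpleDateFormat("EEE, dd MMM yyyy HH:mm:ss Z", Locale.US);
        format.setLenient(false); // reject out-of-range fields instead of rolling them over
        try {
            format.parse(value);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The date quoted in the mail, with hour 24.
        System.out.println(isValidFeedDate("Tue, 21 Oct 2003 24:56:21 -0500"));
        // The same date with an in-range hour.
        System.out.println(isValidFeedDate("Tue, 21 Oct 2003 23:56:21 -0500"));
    }
}
```

In lenient mode (the default) SimpleDateFormat would quietly roll hour 24 over into the next day, which is why the sketch switches leniency off.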

Incidentally, I was playing with this tag library for parsing RSS XML, and I had to change some of its classes so that the permalinks to the posts (carried in the guid element instead of the expected link element) are recognized properly. I am planning to host this JSP at mycgiserver, but it seems they do not allow uploading third-party jars there. I want to use the JSP to generate the “Recent Posts” panel on my blog. Currently the panel is generated by Blogstreet; however, Blogstreet's RSS Panel tool does not understand the permalinks either and generates the links incorrectly. I am unsure how I will be able to use the JSP unless mycgiserver allows custom tag libraries.
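The change to the tag library classes is not shown here. As a sketch of the general idea only, the snippet below picks a permalink for a single RSS item: it prefers the link element and falls back to guid, which RSS 2.0 treats as a permalink unless isPermaLink="false" is set. It uses the JDK's DOM parser, not the tag library, and the class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; not the tag library's actual classes.
class PermalinkResolver {

    // Returns the item's <link>, else its <guid> if that is a permalink, else "".
    static String permalink(String itemXml) {
        Element item;
        try {
            item = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(itemXml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse item", e);
        }
        NodeList links = item.getElementsByTagName("link");
        if (links.getLength() > 0) {
            String link = links.item(0).getTextContent().trim();
            if (!link.isEmpty()) {
                return link;
            }
        }
        NodeList guids = item.getElementsByTagName("guid");
        if (guids.getLength() > 0) {
            Element guid = (Element) guids.item(0);
            // getAttribute returns "" when isPermaLink is absent, which counts as true.
            if (!"false".equalsIgnoreCase(guid.getAttribute("isPermaLink"))) {
                return guid.getTextContent().trim();
            }
        }
        return "";
    }

    public static void main(String[] args) {
        // Made-up item that carries its permalink only in <guid>.
        System.out.println(permalink(
                "<item><title>Post</title><guid>http://example.com/post/1</guid></item>"));
    }
}
```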


Castor and the Namespace bug

While working with a new version of Castor I recently encountered a strange error during unmarshalling (creating Java objects from the corresponding XML document). The error was as follows:

java.lang.IllegalArgumentException: The prefix ‘xml’ is reserved (XML 1.0 Specification) and cannot be declared.

Luckily, some Googling brought me to this thread, which explains the probable reasons for this “bug”. As it suggests (and it has worked since I incorporated it), we need to set the namespaces property to true in the Castor properties file (castor.properties) as follows (caveat: the workaround was tested with Xerces 2.5 only):

org.exolab.castor.parser.namespaces = true

Following is a quote from the said thread, which is in fact a reply from Keith Visco, the Castor XML project lead, that throws light on the cause of the bug:

The issue seems to be with newer versions of Xerces. The older version that ships with Castor works fine. For some reason, when the newer version of Xerces encounters an “xml” prefixed attribute, such as “xml:lang” it tries to automatically start a prefix mapping for “xml”. Which, in my opinion, is technically incorrect. They shouldn’t be doing that. According to the w3c, the “xml” prefix should never be declared.

The reason it started appearing in the new Castor was because of a switch to SAX 2 by default during unmarshalling. I found a simple workaround (tested with Xerces 2.5). At first I thought about disabling namespace processing in Xerces, but then realized that it’s already disabled by default by Castor…so I have no idea why they call #startPrefixMapping when namespace processing has been disabled. But in any event…explicitly enabling namespace processing seems to fix the problem.
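The startPrefixMapping callback mentioned in the quote is part of the SAX 2 ContentHandler interface. The sketch below, which uses the parser bundled with the JDK rather than Castor or Xerces 2.5, shows how namespace processing is switched on at the JAXP level and how to record the prefixes the parser reports. It does not reproduce the bug, and which prefixes get reported depends on the parser in use; the class name and the sample document are made up.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical probe; not part of Castor.
class PrefixMappingProbe {

    // Parses the document with namespace processing on and returns
    // every prefix -> URI pair reported through startPrefixMapping.
    static Map<String, String> reportedPrefixes(String xml) {
        final Map<String, String> seen = new LinkedHashMap<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // explicitly enable namespace processing
            factory.newSAXParser().parse(
                    new InputSource(new StringReader(xml)),
                    new DefaultHandler() {
                        @Override
                        public void startPrefixMapping(String prefix, String uri) {
                            seen.put(prefix, uri);
                        }
                    });
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return seen;
    }

    public static void main(String[] args) {
        // Made-up document with a declared prefix and an xml:lang attribute.
        String xml = "<root xmlns:a=\"urn:a\" xml:lang=\"en\"><a:child/></root>";
        System.out.println(reportedPrefixes(xml));
    }
}
```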