Geeks With Blogs
Jason Whitehorn MarshalByRefObject.net
UPDATE (12/17/2007): My blog has moved. This post is now located at: http://jason.whitehorn.ws/2007/06/26/Parsing-RSS-From-C.aspx




I set out to write an RSS parser in C#. I know that several existing libraries are available for .NET that parse RSS streams, but out of curiosity I wanted to give it a go anyway.
Before I get started, for those interested in RSS libraries for .NET, check out RSS.NET or RSSConnect, to name just two.

As .NET developers we have some really powerful tools available to us that are built directly into the .NET Framework. Take, for example, serialization. In .NET we can turn an object into an XML resource using the XmlSerializer class, provided that the class is public and has a public parameterless constructor (the [Serializable] attribute is not required for XML serialization). The XmlSerializer works in the other direction too: it can populate an object from an XML resource.
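To make the round trip concrete, here is a minimal sketch. The `Book` type and the `XmlRoundTrip` helper are hypothetical names invented for this illustration; they have nothing to do with RSS yet and only show XmlSerializer writing an object out as XML and reading it back.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical type used only for this illustration.
// XmlSerializer works with public types that have a public
// parameterless constructor, and it reads and writes their
// public fields and properties.
public class Book
{
    public string Title { get; set; }
    public int Pages { get; set; }
}

public static class XmlRoundTrip
{
    // Serialize a Book into an XML string.
    public static string ToXml(Book book)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Book));
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, book);
            return writer.ToString();
        }
    }

    // Populate a new Book from an XML string.
    public static Book FromXml(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Book));
        using (StringReader reader = new StringReader(xml))
        {
            return (Book)serializer.Deserialize(reader);
        }
    }
}
```

Calling `XmlRoundTrip.FromXml(XmlRoundTrip.ToXml(book))` should hand back a copy of `book` with the same property values.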

Visual Studio ships with a tool called XSD (xsd.exe), which can turn an XML Schema Definition into C# classes. Knowing that RSS is expressed as an XML resource, I set out to find an XSD for RSS. After some searching I found this site, which contains an XSD for RSS that the site's author wrote. For posterity I have placed a mirror of that file on my own site, which you can find here.

Before continuing, I need to pause and look ahead. The XSD that I found did not work out of the box, and I had to make a small modification to get it to work: I removed a reference to the RSS schema namespace from the XSD. The modified XSD can be downloaded from here.

Now, armed with an XSD, we can have the XSD program make some classes for us. From a command prompt (preferably the "Visual Studio Command Prompt") type:

xsd RSS20.xsd /classes


Provided that RSS20.xsd is the name of the RSS XSD, and that you are currently in the same directory as RSS20.xsd, the XSD program should output a single file called RSS20.cs. The C# source file that XSD produced contains a class definition for each RSS entity, most notably the root node called "rss".

With classes that represent RSS, we can now use the XmlSerializer object to easily populate an object with RSS data. For example:

//requires: using System.IO; using System.Xml.Serialization;
string rssXml = "... your rss data here ...";
XmlSerializer helper = new XmlSerializer(typeof(rss));
rss obj = (rss)helper.Deserialize(new StringReader(rssXml));


The above code snippet will create an rss instance called "obj" from raw RSS data. But you don't want to have to do this every time you want to parse RSS. Instead, it would be helpful if that logic were encapsulated by the rss class itself. Fortunately, the classes produced by the XSD program are all partial classes. So, in a separate .cs file, we can re-declare the rss class and expand upon its functionality.

//declare another portion of the "rss" class.
public partial class rss {
    //add more methods to "rss" here
}
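For readers who have not used partial classes before, the sketch below shows the mechanism in isolation. `Feed` is a hypothetical, heavily simplified stand-in for a generated class, not the real output of the XSD program. The compiler merges both declarations into a single type, so the hand-written method can use the field declared in the "generated" half.

```csharp
// File 1: stands in for generated code such as RSS20.cs
// (hypothetical and heavily simplified).
public partial class Feed
{
    public string Title;
}

// File 2: hand-written additions. Because both declarations are
// marked partial, the compiler merges them into one Feed class.
public partial class Feed
{
    public string Describe()
    {
        return "Feed: " + Title;
    }
}
```

Keeping the additions in their own file means the generated file can be regenerated from the XSD without losing any hand-written code.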


Ideally it would be nice to have both a method to turn an RSS stream into an rss object, and a method to output RSS from an rss object. For example:

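public partial class rss {

    public static rss Parse(string rssXml) {
        //turn an RSS string into an rss object.
        throw new NotImplementedException(); //placeholder
    }

    public override string ToString() {
        //turn the rss object into an RSS string.
        throw new NotImplementedException(); //placeholder
    }
}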


The above two methods can both be implemented using the XmlSerializer object, as it provides methods to serialize and deserialize an object. I ended up with the following implementation.

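//requires: using System.IO; using System.Xml.Serialization;
public partial class rss {

    //turn an RSS string into an rss object.
    public static rss Parse(string rssXml) {
        using (StringReader reader = new StringReader(rssXml)) {
            return (rss)Serializer.Deserialize(reader);
        }
    }

    //turn the rss object into an RSS string.
    //note: StringWriter reports UTF-16, so the XML declaration will read encoding="utf-16".
    public override string ToString() {
        using (StringWriter sw = new StringWriter()) {
            Serializer.Serialize(sw, this);
            return sw.ToString();
        }
    }

    //one shared serializer, created on first use and reused afterwards.
    //note: this lazy initialization is not synchronized; under contention
    //two threads may each create an instance, which is wasteful but harmless.
    private static XmlSerializer Serializer {
        get {
            if (_serializer == null)
                _serializer = new XmlSerializer(typeof(rss));
            return _serializer;
        }
    }
    private static XmlSerializer _serializer;
}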


The above implementation, combined with the output from the XSD program, is a functional RSS parser. It can be used to process any RSS 2.0 feed, such as blog output or price feeds from your favorite online retailer.
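If you want to try the approach without running the XSD program first, the sketch below uses a small hand-written stand-in for the generated classes. The names `MiniRss`, `MiniChannel`, `MiniItem`, and `MiniRssDemo` are hypothetical, and the stand-in covers only a handful of RSS elements; the real generated rss class will have different member names and many more members. The sample feed is made up for the illustration.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hand-written, simplified stand-in for the xsd-generated "rss" class.
[XmlRoot("rss")]
public class MiniRss
{
    [XmlAttribute("version")]
    public string Version;

    [XmlElement("channel")]
    public MiniChannel Channel;

    // Same idea as rss.Parse: deserialize an RSS string into an object.
    public static MiniRss Parse(string rssXml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(MiniRss));
        using (StringReader reader = new StringReader(rssXml))
        {
            return (MiniRss)serializer.Deserialize(reader);
        }
    }
}

public class MiniChannel
{
    [XmlElement("title")]
    public string Title;

    // Repeated <item> elements map onto an array.
    [XmlElement("item")]
    public MiniItem[] Items;
}

public class MiniItem
{
    [XmlElement("title")]
    public string Title;

    [XmlElement("link")]
    public string Link;
}

public static class MiniRssDemo
{
    // Made-up sample feed, used only for this illustration.
    public const string Sample =
        "<rss version=\"2.0\"><channel><title>Example Feed</title>" +
        "<item><title>First post</title><link>http://example.com/1</link></item>" +
        "<item><title>Second post</title><link>http://example.com/2</link></item>" +
        "</channel></rss>";

    // Parse the sample and write each item's title and link to the console.
    public static void PrintItems()
    {
        MiniRss feed = MiniRss.Parse(Sample);
        Console.WriteLine(feed.Channel.Title);
        foreach (MiniItem item in feed.Channel.Items)
        {
            Console.WriteLine(item.Title + " -> " + item.Link);
        }
    }
}
```

With the real generated classes the call site looks the same: pass the raw feed text to Parse and walk the resulting object graph.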

As an added bonus, I wrote two more methods for rss in addition to those outlined above.

//optionally, you can add these two methods to "rss".
//requires: using System; using System.Net;

//download the feed at the given location and parse it.
public static rss CreateFrom(Uri location) {
    using (WebClient helper = new WebClient()) {
        string rawRss = helper.DownloadString(location);
        return Parse(rawRss);
    }
}

//convenience overload that accepts the location as a string.
public static rss CreateFrom(string location) {
    return CreateFrom(new Uri(location));
}


For those interested, a compiled version of this code can be downloaded from here.

Posted on Tuesday, June 26, 2007 10:07 PM | .NET

