Geeks With Blogs
Paul Chapman .Net Musings

For one of my projects I need to have Business News headlines displayed on the front page, to give me greater control as to which headlines and from which feeds I use I'm going to have the aggregator call each feed and store the article details in a SQL database.


I'm going to use schemas in designing my database. The use of these changed in SQL 2005 so that each one is a distinct namespace, independent of the user who created it. This will allow me to group related tables together, but it also allows you to use the same table name for different items. For example in an accounts system you have Sales accounts and Purchase accounts. You could of course create one table called SalesAccount and another called PurchaseAccount. With schemas we create ourselves two schemas one called SalesLedger and one called PurchaseLedger. Each schema has a table called 'Accounts'.

So first task is to create the schema that will hold our News. The following line of T/SQL will create our Namespace;


Creating our tables

Now we have our schema the next thing is the tables that will store the feeds we are going to read, and the news headlines. Not a lot special in these ones to be honest. We need two tables; one to store the feeds that we will be aggregating and the second to store the headlines. The following T/SQL will create these tables;

CREATE TABLE [NewsAggregator].[NewsStories](
                  [Id] [bigint] IDENTITY(1,1) NOT NULL,
                  [Title] [varchar](127) NOT NULL,
                  [Link] [varchar](max) NOT NULL,
                  [Desc] [varchar](max) NULL,
                  [Supplier] [varchar](20) NOT NULL,
                  [DateAdded] [datetime] NOT NULL
CONSTRAINT [DF_NewsStories_DateAdded]
DEFAULT (getdate())


CREATE TABLE [NewsAggregator].[NewsFeeds](
                  [Id] [bigint] IDENTITY(1,1) NOT NULL,
                  [Link] [varchar](max) NOT NULL,
                  [Supplier] [varchar](30) NOT NULL,
                  [Weight] [smallint] NULL

So there is our two tables. I have added a 'weight' field to the table of feeds. How this works will be clear later, however it is here to allow us to control how many articles from each feed we are displaying.

Now when developing I like to keep things separated. For this reason I like to do all my database access via Stored Procedure. The practical upshot of this is that any differences in the dialect of SQL you are using should be hidden behind the stored procedure making it a lot easier to covert the whole thing to another database.

Stored Procedures

Our feeds are retrieved by making a call to RetrieveNewsFeeds. Taking note of the fact that I am using a schema called NewsAggregator the code to create this stored procedure is as follows;

CREATE PROCEDURE [NewsAggregator].[RetrieveNewsFeeds]
     SELECT Id,
     FROM   NewsAggregator

This simply returns the results of select statement. I'm not interested in the order so to speed things along no sorting is done. Next having retrieved our news feeds we need to have a means of writing the headlines to our database;

CREATE PROCEDURE [NewsAggregator].[WriteNewsStory]
      @Title      VarChar(127),
      @Link       VarChar(MAX),
      @Desc       VarChar(MAX),
      @Supplier   VarChar(20)

      Declare @StoryCount BigInt


      Set @StoryCount = (Select Count(*) As Counter From NewsAggregator.NewsStories
                         Where Link = @Link)

      If @StoryCount = 0
            INSERT INTO NewsAggregator.NewsStories
                        (Title, Link, [Desc], Supplier)
            VALUES      (@Title,@Link,@Desc,@Supplier)

This is fairly simple as a stored procedure. We first look for the story link in the database to see if it has been read before. If not then the story gets added to the database.

Now we have our feed suppliers, and we can write the stories to our database. The final stored procedure is to recover the files from the database.

First off the procedure opens up a cursor to access the list of feed suppliers. Normally I would not use a cursor which are notoriously slow in SQL Server but it does provide a simple mechanism for progressing through a table.

CREATE PROCEDURE [NewsAggregator].[RetrieveLatestStories]
Declare @Supplier VarChar(30)
     Declare @Weight SmallInt
     Declare @Result As Table
          Title VarChar(127),
          Link VarChar(Max),
          [Desc] VarChar(Max)
FeedSuppliers Cursor
          SELECT Supplier, 
          FROM   NewsAggregator.NewsFeeds

     Fetch FeedSuppliers Into @Supplier,@Weight

     While @@Fetch_Status = 0
          Insert Into @Result (Title,Link,[Desc])
          Select   Top (@Weight)Title,Link,[Desc]
          From     NewsAggregator.NewsStories
          Where    Supplier = @Supplier
          Order By DateAdded Desc

          Fetch FeedSuppliers Into @Supplier,@Weight

     Deallocate FeedSuppliers

     Select * From @Result


While we are moving through the table of suppliers we retrieve from the list of stories the latest for each supplier based on the weight. If the weight is 1 then we retrieve 1 story, 2 and 2 stories 3, 3 stories etc. This weight places a bias in the display of stories allowing us to control the stories from each supplier.
and the following will store the news headlines
Posted on Wednesday, September 3, 2008 10:48 PM Silverlight , VB.Net | Back to top

Comments on this post: Implementing a News Aggregator in Silverlight and VB.Net #1 - Creating SQL Database to store the news details

No comments posted yet.
Your comment:
 (will show your gravatar)

Copyright © paulschapman | Powered by: