Michael Ballhaus (MCPD + SCJP) Java + .Net - Coding from the fence

This article is the first of a two-part series on the LinqToWikipedia provider. It covers the basic concepts of Linq and the client-side usage of this particular provider; the second article will explore the inner workings of the LinqToWikipedia provider to give you an understanding of what it takes to create your own IQueryable provider.


NOTE: You should download the latest build from Codeplex so you can follow along with the code samples.

What is Linq?

Let's spend a moment on what Linq is all about at a high level.

Linq (Language Integrated Query) is a .NET programming model that provides a consistent, SQL-like query syntax (called query expressions) over various data sources. These data sources can be SQL Server, Oracle, XML, in-memory objects, web services... just about anything. Several Linq providers ship with .NET, such as Linq to SQL, Linq to XML, and Linq to Objects, and many third-party providers are available, such as Linq to Oracle, Linq to Amazon, and Linq to MySql.

The following code is an example of a Linq query that finds customers whose names start with the letter "M" and who are older than 11, sorts them by age in descending order, and binds the results to an ASP.NET DataGrid.

 

using System;
using System.Collections.Generic;
using System.Linq;

// In the page's code-behind class:
protected void Page_Load(object sender, EventArgs e)
{
    // Filter, sort, and project the in-memory customer list.
    var customers =
        from c in Customers.GetCustomers()
        where c.Name.StartsWith("M") && c.Age > 11
        orderby c.Age descending
        select c;

    dg_customers.DataSource = customers;
    dg_customers.DataBind();
}

public class Customers
{
    public string Name { get; set; }
    public int Age { get; set; }

    // Returns a small in-memory data set for the example.
    public static List<Customers> GetCustomers()
    {
        List<Customers> customers = new List<Customers>();

        customers.Add(new Customers() { Name = "Elo", Age = 6 });
        customers.Add(new Customers() { Name = "Myranda", Age = 11 });
        customers.Add(new Customers() { Name = "Mikayli", Age = 12 });
        customers.Add(new Customers() { Name = "Elias", Age = 13 });
        customers.Add(new Customers() { Name = "Memori", Age = 15 });

        return customers;
    }
}


Output:

 

Name     Age
Memori   15
Mikayli  12



The compiler actually translates a Linq query expression into calls to extension methods that take lambda expressions, so if you wanted to you could write this query with method calls and lambdas directly:

 

IEnumerable<Customers> customers = Customers.GetCustomers()
    .Where(c => c.Name.StartsWith("M") && c.Age > 11)
    .OrderByDescending(c => c.Age);


But as you can see, the query expression is more intuitive, especially when the where clause has multiple conditions. Either way, this code will work with any Linq provider that exposes a Customers entity, whether the data is stored in a database, in XML files, behind a web service, etc. In this case the Customers entities happen to be in-memory objects.
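To make the equivalence of the two forms concrete, here is a minimal console sketch that runs the same filter both ways over the same in-memory data and prints the names. The Customer and QueryForms classes are stand-ins invented for this illustration; they are not part of LinqToWikipedia.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity used only for this illustration.
public class Customer
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class QueryForms
{
    // Same sample data as the DataGrid example.
    public static List<Customer> GetCustomers()
    {
        return new List<Customer>
        {
            new Customer { Name = "Elo", Age = 6 },
            new Customer { Name = "Myranda", Age = 11 },
            new Customer { Name = "Mikayli", Age = 12 },
            new Customer { Name = "Elias", Age = 13 },
            new Customer { Name = "Memori", Age = 15 }
        };
    }

    // Query expression form.
    public static List<string> WithQueryExpression()
    {
        var names =
            from c in GetCustomers()
            where c.Name.StartsWith("M") && c.Age > 11
            orderby c.Age descending
            select c.Name;
        return names.ToList();
    }

    // Extension method form with lambdas.
    public static List<string> WithMethodCalls()
    {
        return GetCustomers()
            .Where(c => c.Name.StartsWith("M") && c.Age > 11)
            .OrderByDescending(c => c.Age)
            .Select(c => c.Name)
            .ToList();
    }

    public static void Main()
    {
        // Both lines print: Memori, Mikayli
        Console.WriteLine(string.Join(", ", WithQueryExpression().ToArray()));
        Console.WriteLine(string.Join(", ", WithMethodCalls().ToArray()));
    }
}
```

Which form you use is largely a matter of taste; the query expression tends to read better as the number of conditions grows.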

LinqToWikipedia search options

Before we dive into actually building the Linq query, let's review the two types of searches that LinqToWikipedia offers via the MediaWiki API:

  • Open Search
  • Keyword Search

Open Search

OpenSearch is a standardized collection of simple formats for the sharing of search results. OpenSearch was created by A9.com, an Amazon.com company, and the OpenSearch format is now in use by hundreds of search engines and search applications around the Internet. LinqToWikipedia can format the returned data so that it adheres to this standard. For more information on the OpenSearch standard visit OpenSearch.org.

Open Search will return the following data elements from a Wikipedia search when querying by a single search term:

  • Text - Title of the page
  • Description - Short description of the page
  • Url - The absolute Url to the page
  • ImageUrl - The absolute Url to the main image of the page (if it exists)

See a live demo of the Open Search function.

This search option returns up to 15 records at a time; you specify how many you want with the .Take() query extension. You will see an example of this code later on.

Keyword Search

This query option allows you to search the Wikipedia database using multiple keywords and returns data in the following format:

 

  • Title - Title of the page
  • Description - Short description of the page
  • Url - The absolute Url to the page
  • WordCount - Total count of words
  • TimeStamp - The date/time the page was last updated
  • RecordCount - The total number of records in the result set (this is used for paging)

See a live demo of the Keyword search function.

This query option will return up to 100 records at a time and also supports paging through the data by using a combination of .Skip() and .Take().
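If .Skip() and .Take() are new to you, the following sketch shows how they carve a page out of a sequence. It runs against plain in-memory integers with Linq to Objects rather than against Wikipedia, and the PagingDemo class and GetPage helper are names made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PagingDemo
{
    // Returns one page of the source. pageIndex is zero-based.
    public static List<int> GetPage(IEnumerable<int> source, int pageIndex, int pageSize)
    {
        return source
            .Skip(pageIndex * pageSize) // jump over the earlier pages
            .Take(pageSize)             // keep at most one page of items
            .ToList();
    }

    public static void Main()
    {
        IEnumerable<int> records = Enumerable.Range(1, 25); // 1..25

        List<int> first = GetPage(records, 0, 10); // items 1..10
        List<int> last = GetPage(records, 2, 10);  // items 21..25 (a short final page)

        Console.WriteLine(first.Count + " items, starting at " + first[0]);
        Console.WriteLine(last.Count + " items, starting at " + last[0]);
    }
}
```

Note that .Take() returns fewer items than requested when the sequence runs out, so the last page needs no special handling.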

Using the LinqToWikipedia provider

Now let's move on to using the LinqToWikipedia provider. You have two options:

  1. Download the source code, add the provider project to your existing solution, and make a project reference or...
  2. Download the source code and just copy linqtowikipedia.dll into your project and reference it

Now that you have a reference, add a using directive to your code to import the namespace:

 

using LinqToWikipedia;


Next you need to instantiate the LinqToWikipedia object via the WikipediaContext class.

 

WikipediaContext datacontext = new WikipediaContext();
//WikipediaContext datacontext = new WikipediaContext(new WebProxy("yourproxy", 80), new NetworkCredential("username", "password", "domain"));


Note: If your application is running behind a firewall/proxy, you can alternatively instantiate the provider by passing a new WebProxy and NetworkCredential to the constructor.

Now we will create our Linq code to query Wikipedia using the OpenSearch method.

We will query using soccer as our keyword and request that the provider return 10 records by calling the .Take(10) query extension method:

 

var opensearch = (
    from wikipedia in datacontext.OpenSearch
    where wikipedia.Keyword == "soccer"
    select wikipedia).Take(10);


The result set is of type IWikipediaQueryable<WikipediaOpenSearchResult>, and since this type implements the IQueryable<T> interface (which in turn extends IEnumerable<T>), we can iterate over the results with a foreach loop.

 

StringBuilder sb = new StringBuilder();

foreach (WikipediaOpenSearchResult result in opensearch)
{
    sb.Append("Text = " + result.Text + "<br />");
    sb.Append("Description = " + result.Description + "<br />");
    sb.Append("Url = " + result.Url + "<br />");
    sb.Append("ImageUrl = " + result.ImageUrl + "<br /><br />");
}

//ASP.NET Label control
lbl_diplay.Text = sb.ToString();
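The reason a foreach loop works on an IQueryable<T> is that the interface extends IEnumerable<T>. The sketch below demonstrates this with an ordinary list wrapped by AsQueryable(); it does not use the LinqToWikipedia types, and the QueryableDemo class is a name invented for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QueryableDemo
{
    // Walks any IQueryable<string> with foreach and collects what it sees.
    public static List<string> Collect(IQueryable<string> source)
    {
        // IQueryable<T> extends IEnumerable<T>, so this assignment needs no cast.
        IEnumerable<string> sequence = source;

        List<string> seen = new List<string>();
        foreach (string item in sequence)
        {
            seen.Add(item);
        }
        return seen;
    }

    public static void Main()
    {
        IQueryable<string> titles =
            new List<string> { "Association football", "Racing Post" }.AsQueryable();

        foreach (string title in Collect(titles))
        {
            Console.WriteLine("Text = " + title);
        }
    }
}
```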


Here is a sample of the results:

Text = Association football
Description = Association football, more commonly known as football or soccer, is a team sport played between two teams of eleven players using a spherical ball.
Url = http://en.wikipedia.org/wiki/Association_football
ImageUrl = http://upload.wikimedia.org/wikipedia/commons/thumb/7/7a/Football_header.JPG/50px-Football_header.JPG

Text = Racing Post
Description = The Racing Post is a British daily horse racing, greyhound racing and sports betting newspaper, currently the only one appearing in print form.
Url = http://en.wikipedia.org/wiki/Racing_Post
ImageUrl =



Now we will create our Linq code to query Wikipedia using the KeywordSearch method.

We will query using soccer, los angeles, galaxy, and donovan as our keywords and request that the provider return 10 records by calling the .Take(10) query extension method, but we also want to skip the first 10 records in the result set by using the .Skip(10) query extension method. Using .Skip() and .Take() in tandem effectively gives us recordset paging. To determine the total number of pages in the recordset, divide the RecordCount property by the .Take() value and round up. Then for each new query, increase the .Skip() value by the .Take() value to move to the next page until we reach the last page of records. Finally, we will display the results in an ASP.NET DataList.

First we need to declare a private field on the page that will hold the total number of records in the result set.
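The paging arithmetic described here can be sketched as two small helpers. PageMath, TotalPages, and SkipFor are hypothetical names introduced for this example and are not part of the provider; the record count of 95 is an arbitrary sample value.

```csharp
using System;

public static class PageMath
{
    // Number of pages needed to show recordCount records, rounding up
    // so that a partial final page is counted.
    public static int TotalPages(int recordCount, int pageSize)
    {
        if (pageSize <= 0)
        {
            throw new ArgumentOutOfRangeException("pageSize");
        }
        return (recordCount + pageSize - 1) / pageSize;
    }

    // Value to pass to .Skip() for a one-based page number.
    public static int SkipFor(int pageNumber, int pageSize)
    {
        return (pageNumber - 1) * pageSize;
    }

    public static void Main()
    {
        int pageSize = 10;     // the .Take() value
        int recordCount = 95;  // sample value standing in for RecordCount

        Console.WriteLine(TotalPages(recordCount, pageSize)); // 10
        Console.WriteLine(SkipFor(2, pageSize));              // 10
    }
}
```

With a page size of 10, page 2 corresponds to .Skip(10).Take(10).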

 

private int totalrecords = 0;

Now we write our Linq query and set the results as the DataSource of our DataList.

 

WikipediaContext datacontext = new WikipediaContext();

var query = (
    from wikipedia in datacontext.KeywordSearch
    where
        wikipedia.Keyword == "soccer" &&
        wikipedia.Keyword == "los angeles" &&
        wikipedia.Keyword == "galaxy" &&
        wikipedia.Keyword == "donovan"
    select wikipedia).Skip(10).Take(10);

dl_results.DataSource = query;
dl_results.DataBind();


NOTE: To use paging, you would want to set the .Skip() number to a variable that you can increment with each subsequent lookup.

Since we are not looping through the results (like we did in the OpenSearch example), we need to find a way to capture the RecordCount value and store it in our totalrecords field. To solve this we can use the ASP.NET DataList ItemDataBound event.

 

protected void dl_results_ItemDataBound(object sender, DataListItemEventArgs e)
{
    if (e.Item.ItemType == ListItemType.Item || e.Item.ItemType == ListItemType.AlternatingItem)
    {
        // Capture RecordCount once, from the first bound item.
        if (this.totalrecords == 0)
        {
            this.totalrecords = ((WikipediaKeywordSearchResult)e.Item.DataItem).RecordCount;
        }
    }
}
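An alternative you might consider is to materialize the page into a list before binding and read RecordCount from the first item, which avoids the event handler. The sketch below shows the idea with a stand-in SearchResult class; it assumes, as the ItemDataBound code above implies, that each result carries the total in its RecordCount property. SearchResult, RecordCountDemo, and ReadTotal are hypothetical names and not part of LinqToWikipedia.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for a keyword search result (assumption for illustration only).
public class SearchResult
{
    public string Title { get; set; }
    public int RecordCount { get; set; }
}

public static class RecordCountDemo
{
    // Enumerates the query once, returns the total, and hands back the page
    // as a list that can then be used as the DataSource.
    public static int ReadTotal(IEnumerable<SearchResult> query, out List<SearchResult> page)
    {
        page = query.ToList();
        return page.Count > 0 ? page[0].RecordCount : 0;
    }

    public static void Main()
    {
        IEnumerable<SearchResult> query = new List<SearchResult>
        {
            new SearchResult { Title = "Sample page A", RecordCount = 42 },
            new SearchResult { Title = "Sample page B", RecordCount = 42 }
        };

        List<SearchResult> page;
        int total = ReadTotal(query, out page);

        Console.WriteLine(total + " records in total, " + page.Count + " on this page");
    }
}
```

The trade-off is that the results are held in memory as a list, which is usually fine for pages of this size.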



Part II - Explore the inner workings of the LinqToWikipedia provider to create your own IQueryable provider. (coming soon)

Posted on Saturday, January 16, 2010 11:21 AM


Comments on this post: LinqToWikipedia - A custom .NET Linq provider - Part I

# re: LinqToWikipedia - A custom .NET Linq provider
Great job!!! I am going to use WikiMedia API in a project of mine, now I'm going to use LinqToWikipedia!!!
Again, excellent job!

Breno
Left by Breno Ferreira on Jan 24, 2010 6:43 PM
