My main blog url now is: weblogs.asp.net/meligy

Using MSN and Google Search Webservices

Yesterday, I was implementing site-wide search for the site I'm currently working on. The site consists of both static XHTML pages and dynamic ones (dynamic in the sense that their content comes from a DB based on query strings), and it is going to be hosted on a shared hosting service. That ruled out implementing the search via DB queries or some sort of indexing service (or even writing my own indexing engine as a Windows service or so). So I thought, why not use an online search service? Here's how it went...

If you know me well, you'd know I'm a 100% Microsoft guy who wouldn't use a competing technology unless he has to. So the first service I tried was MSN; later I checked the Yahoo service, and I finally used Google as the last choice. To explain why, here is what I found with each service:

  • The common in all 3 services
    • The good:
      • All 3 services provide not only web search but other search types as well, such as news and local search (which fits the US best). Even for web search, typical extras like suggestions and safe search filtering are provided.
      • All 3 services allow you to restrict your search to a certain site by putting something like "site:http://www.MyDomain.com" in the search query itself.
      • All 3 have built-in "paging" support: you specify the offset (start index) of the results and the count of results for the service to return. Each search request returns only the requested count of results after the specified offset, while still reporting the total number of results available for the submitted query.
    • The bad:
      • Each service requires every site (domain) using it to send a certain application license key when requesting search results. Each service allows a limited number of daily search queries per key, and the number of keys available per user account also varies from one service to another. (This is not really bad, as the lowest limit I found was 1,000. I list the limits below.)
      • Well, nothing is complete in this world! No service is issue-free, and all of them are marked as pre-release.
  • Yahoo Search Service
    • This one had all the good parts mentioned above, with up to 5,000 daily queries per key. The problem is that it isn't a standard SOAP web service described by WSDL. Instead, it uses a "REST" style interface: you send a plain HTTP request and get XML back, so there is no WSDL to generate a proxy class from. To me, that was a stopping point!
  • MSN Search Web Service
    • The good:
      • The daily query limit is 10,000 queries, and you can seemingly get an unlimited number of keys per .NET Passport. That's surely the best deal you can get for free.
      • The web service API is very well designed. It seems you can request multiple search types in a single request.
      • The SDK is quite comprehensive (C#/VB.NET).
      • The returned results include many fields for each result, and you can choose which fields to return.
      • The service is very fast, at least compared with the other 2 services.
      • It can highlight the query words in the results if you configure it to, although this doesn't always work!
    • The bad:
      • Although the API is well designed, it has some oddities, like requiring you to fill in redundant fields. Also, the flexibility and power of the API make it a bit complicated. Thanks to the SDK, this shouldn't be a real problem.
      • There's a weird culture info field that accepts ONLY US English and a small number of Eastern European cultures. You have to set this field, as it has no default value (it is typically set to "en-US"). I couldn't see that it does any good, and the problem I'm facing with setting it to "en-US" is that sometimes a little square shows up on both the right and left sides of each result title (that's in IE6; in Firefox these come out as question marks. This was tested on a page using UTF-8 encoding for the page content, request and response), even for English titles on English sites with an English search query! They appear and vanish apparently at random for the same result when re-querying!!
      • The number of results is not tiny, but it is not enough either. Running the same query on the MSN Search homepage itself returns more results.
      • For many results, the title is returned as the URL, and the description is returned as an empty string! This happened to about 30% to 40% of each result set, even when all the results had English values in their HTML "<title>" tags and had no META descriptions (yet the description did appear for the rest of the result set, although those pages had no META descriptions either)! The weirdest part is that submitting the same query on the MSN Search homepage itself doesn't show this behavior. I first tried playing with the API and searching online for a fix, and finally had to look for alternative solutions.
    • Code snippet (part of what I had in my code, without error checks and the like):
// You can have multiple search types (source requests) per single search request,
// so you need to create at least one.
MSNSearchService.SourceRequest source = new MSNSearchService.SourceRequest();
// Specifying the fields I need to be returned in the result set.
source.ResultFields =
    MSNSearchService.ResultFieldMask.Title
    | MSNSearchService.ResultFieldMask.Description
    | MSNSearchService.ResultFieldMask.Url;
// Setting my offset and page size for paging.
source.Offset = CurrentOffset;
source.Count = PageSize;

MSNSearchService.SearchRequest request = new MSNSearchService.SearchRequest();
// A must-have field.
request.CultureInfo = "en-US";
// This didn't really affect the results when I tried different values.
request.Flags = MSNSearchService.SearchFlags.DisableHostCollapsing;
// Setting the application ID.
request.AppID = "MY_APP_ID";
// Adding the search types I created. In this sample, I only had one.
request.Requests = new MSNSearchService.SourceRequest[] { source };
// Testing sending my query and limiting the search to a certain site.
request.Query = Query + " site:http://msdn.microsoft.com";

// This is the web service proxy object, therefore it's a typical IDisposable object.
using (MSNSearchService.MSNSearchService service = new MSNSearchService.MSNSearchService())
{
    // Trying to fix the problem with having '?' and '[]' on the
    // right and left sides of the result title.
    service.RequestEncoding = System.Text.Encoding.UTF8;

    // Sending the query to the remote server. I sent one source request,
    // so I read the first (and only) source response.
    MSNSearchService.SourceResponse response = service.Search(request).Responses[0];

    // Getting the number of pages served by the query in my "TotalPages",
    // using the service response "Total" property and my defined PageSize.
    // Math.Ceiling rounds a partial last page up to a full page.
    TotalPages = (int)System.Math.Ceiling(response.Total / (double)PageSize);

    // Binding the whole result list (not just its first item)
    // before disposing the service.
    ResultsDisplay.DataSource = response.Results;
    ResultsDisplay.DataBind();
}

  • Google Web Service:
    • The good:
      • The API is extremely simple (if you ignore the parameters you don't know about). You just call a single method and pass it all the needed parameters.
      • The search results are more accurate than those of any other service. Typical for Google!
      • It applies simple "<b>" highlighting to the query terms in the results, and this doesn't even seem to be optional!
    • The bad:
      • You can have only 1,000 daily queries per key. This is the lowest limit among the 3 services. You can only have 1 key per Google account (which is something like a .NET Passport, not necessarily a Gmail account).
      • The total page count returned is not always right. When moving between pages, you sometimes suddenly find that the total page count has changed. This is a problem with Google search in general, not only the web service.
      • The service is slightly slow. I didn't use any profiling tools or run precise performance tests, but it's noticeably slow. It's not too slow to use anyway, but compared to MSN...
      • The API doesn't describe the weird parameters of its single method (which are NOT well named). I had to search for them.
      • Each result summary has a line break "<br>" in a fixed position, and there's no clear way to change the location of the line break or even remove it. This hurts when the results are displayed in a wide space: the user just doesn't understand why the summary didn't continue to the end of the line before moving to the next one. (For me, I used the typical String.Replace method to get rid of it.)
    • Code snippet (part of what I had in my code, without error checks and the like):
// Creating a reference to the result to use after the service is disposed.
Google.GoogleSearchResult result;
// Creating the web service proxy itself, which is a typical IDisposable object.
using (Google.GoogleSearchService service = new Google.GoogleSearchService())
{
    // Sending the query to the remote server.
    result = service.doGoogleSearch(
        "MY_APP_ID",                              // key
        // Limiting the search to a sample site, my company's.
        Query + " site:http://www.gnsegroup.com", // q
        // Sending my offset and page size for paging.
        CurrentOffset, PageSize,                  // start, maxResults
        true, "", true, "",                       // filter, restrict, safeSearch, lr
        "utf-8", "utf-8");                        // ie, oe
}

// Getting the number of pages served by the query in my "TotalPages",
// using the service result "estimatedTotalResultsCount" (note: estimated) property
// and my defined PageSize. Math.Ceiling rounds a partial last page up.
int count = (int)System.Math.Ceiling(result.estimatedTotalResultsCount / (double)PageSize);
// Showing at most 100 pages in the pager.
TotalPages = System.Math.Min(count, 100);

// Had to prepare the paging numbers before binding, as they are set in HTML
// via binding expressions <%# ... %>.
// Doing the binding.
ResultsDisplay.DataSource = result.resultElements;
ResultsDisplay.DataBind();
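
Both snippets above repeat the same little bits of paging arithmetic, so here is a minimal sketch of how they could be pulled into one helper. Note that `SearchPaging` and its method names are hypothetical (they are not part of the MSN or Google SDKs), and I'm assuming the offset both services expect is zero-based and that page numbers in the pager start at 1. It uses integer math only, and also includes the String.Replace trick mentioned above for the fixed "<br>" in Google summaries.

```csharp
using System;

// Hypothetical helper, not part of either SDK: the paging arithmetic
// shared by the MSN and Google snippets, plus the "<br>" cleanup.
public static class SearchPaging
{
    // Zero-based offset of the first result on a 1-based page number.
    // Assumption: the search service expects a zero-based offset.
    public static int OffsetForPage(int pageNumber, int pageSize)
    {
        if (pageNumber < 1) throw new ArgumentOutOfRangeException("pageNumber");
        if (pageSize < 1) throw new ArgumentOutOfRangeException("pageSize");
        return (pageNumber - 1) * pageSize;
    }

    // Number of pages needed to show totalResults, rounding a partial
    // last page up, and never returning more than maxPages.
    public static int TotalPages(long totalResults, int pageSize, int maxPages)
    {
        if (pageSize < 1) throw new ArgumentOutOfRangeException("pageSize");
        if (totalResults <= 0 || maxPages <= 0) return 0;
        long pages = (totalResults + pageSize - 1) / pageSize; // ceiling division
        return (int)Math.Min(pages, (long)maxPages);
    }

    // Replaces the literal "<br>" inside a result summary with a space,
    // so the text can wrap naturally in a wide layout.
    public static string StripLineBreaks(string summary)
    {
        if (summary == null) return string.Empty;
        return summary.Replace("<br>", " ");
    }
}
```

With this in place, something like `TotalPages = SearchPaging.TotalPages(result.estimatedTotalResultsCount, PageSize, 100);` could replace the hand-written calculation, and `SearchPaging.OffsetForPage(3, 10)` would give the offset 20 for the third page of 10 results.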

I hope that this can help anyone trying to implement any of the mentioned search services.

By the way, both Google and Yahoo provide other AJAX-based search services that use a standard AJAX data exchange format called JSON (JavaScript Object Notation), which is said to be supported by both Atlas and Ajax.NET Professional, and even has a dedicated .NET library, "Json.NET". I had no previous experience with JSON and had to get the search module up and running REAL quick, so for AJAX I just used the relatively old Atlas UpdatePanel, carrying a DataList to which I bind the results, and implemented my own paging in the footer template.

Posted on Sunday, July 16, 2006 3:36 AM | Filed Under [ ASP.NET WS & Distributed Apps ]
