Mehfuz Hossain
Learning never stops....

Moved to asp.net weblogs

Recenly , i have moved to

http://weblogs.asp.net/mehfuzh

This will be my default blogging home from now and on. However  feeburner  address is same  as before, which is  http://feeds.feedburner.com/burncsharp.

Thanks,

Mehfuz.

Making simple search solution

With the advent of web , search is everywhere. To make every contentful application really useful for users, we need to have a good  search capablity inour app to get the right content when users ask for it. My focus here , is to create a pretty simple search solution using the "Full-Text" capability of Sql Server 2005.

Some key features that i belive to watch out.

  1. When searching with multiple words, it should not get useless inofrmation. For ex . when we search  "asp.net ajax", it should not return content having "php ajax".
  2. Low on CPU usage.
  3. Chances of index hit is pretty high.
  4. Does have a word index methods for which searching with same keyword is faster.

 Some sqls like we always do

Select *

from Person where [Firsname] like '%mehfuz%' or [LastName] like '%hossain%'

First of all , this search is high on IO  and almost neve hits an Index , moreover , it high on cpu usage and really slow.

FullText search on the other hand, ensures that your query hits index, low on logical reads and also faster for searching same keyword. Bascially , To enable fulltext , we need to create a Categlog, that maintains a dictionary for searched keywords, thats why , similar search is always faster.

Now, lets dig into full-text. I have divided this up into some steps, step 1 , 2, and 3, can be done manullay  DB->Stroage - > FullTextCatelogs - > Create New.

Step 1 : Full-Text Initialization

sp_fulltext_database 'enable' - intializes the database for full text search.

Step 2 : Catelog Creation

CREATE

FULLTEXT CATALOG <catelogName>

IN

PATH N'C:\Database\'

WITH

ACCENT_SENSITIVITY = ON

AUTHORIZATION

[dbo]

This will create the catelog file under - > DB - > storage -> FullTextCatelog, Accent_sensitivity means  Café(French) and Cafe(English)   will be treated differently.

Step 3 : Index Creation

Add the table and the key(Primay Key, that will be used to track down results) , on which the the search will be made.

CREATE FULLTEXT INDEX ON [dbo].[<TableName>] KEY INDEX [<ClusterIndex Name>] ON [<CatelogName>] WITH CHANGE_TRACKING AUTO

Add the column on which the search to be made.

ALTER

FULLTEXT INDEX ON [dbo].[<tableName>] ADD ([<Column on which the search will be made>])

Enable the Index, that will do a re-build operation on sql server for full text.

ALTER

FULLTEXT INDEX ON [dbo].[<tablename>] ENABLE

Step 4  : Query and Result

Select

* from <TableName> where Contains([FullTextColumn], 'TV NEAR star')

Contains is a full text routine, That accept keyword LIke 'AND', 'OR' , 'NEAR', you can put columns names in Contains, or put an '*' that will do search on all mapped full text columns.

"NEAR' is useful when used with containsTable  and we want to sort by Rank

Here is the piece of code that i found from MSDN - bit modified

SELECT

*

FROM

 tabletobeSearchOn AS FT_TBL INNER JOIN

CONTAINSTABLE(tabletobeSearchOn , *, 'tv NEAR star') AS KEY_TBL

ON FT_TBL.[ID] = KEY_TBL.[KEY]

ORDER

BY KEY_TBL.[Rank] asc

Note :The more closer the search keyword will be ,the more higher wil be the rank(asc).

 

Now, to make the search more better, i would suggest that  if you have FirstName, lastName, username, sex, columns to be search on , then combine data of all theses columns and dump them to a single Column, Lets say 'FullTextCOlumn' , whenever a change will occur in the table, use trigger to udpate that column with changed data. In this way ur search will get more accurate if you want to search on discreate columns as a whole, or lets say , if you want to simulate a box model.

What is Box model ? Lets say, if you have  'Box1' with balls marked 'Mehfuz', 'Hossain', 'Male' , 'mehfuz@gmail.com' and 'Box2' with balls marked 'Mehfuz', 'Carlos' , 'Male' , calors@newworld.com'. Now if you search for 'Mehfuz Hossain male' , it should get you Box1, not box 2 although there is 'Mehfuz' in FirstName and 'Male in Sex , column of Box2. To get this done the best solution is to combine  columns, in a way that i have just said.

Although, there are plenty of ways to make a search solution , i found the Full-text one more compelling and easy to understand and implement.

Alternate to Cursor

Most of the cases we use cursor, for loop like senarios, like calutating total order for an item and put the sum in a specific column, doing some work on particular table on basis of values of some other tables.

Now, cursor is fine for smaller amount of data items, but it becomes really lousy , as the number of items start to grow.

Here, i will discuss about a totally alternative to Cursor which is more faster and effective for working with huge number of data items. In my example , i will work with tables from  Sample Database called "AdventureWorks" which you can easily download from MSDN site for Sql server 2005.

Now , My objective is to Calulate the Total amount for sales item from SalesDetail Table and store it in the Sales Table (for each row).

To get this sample work , i have created  a sample Column Called TotalAmt in SalesHeader Table, there is already  a Column called TotalDue, that stores the agrregatated value of the same calculated sum that i am going to store in this TotalAmt Column, only i didnt want to mess things up for test purpose and as i want to compare my result with the main result, i decided to go for creating a separate column.

 

Therefore first thing  is to create the sample Column, which goes something like this.

 

alter table Sales.SalesOrderHeader add TotalAmt real not null default 0

 

Next, i have created the query the using the old cursor method that does the same work which my optimized sql will do.

declare @orderId int
declare @taxAmt real
declare @subTotal real
declare @freight real

declare adventure_cursor cursor for

Select SalesOrderID, SubTotal, TaxAmt, Freight From Sales.SalesOrderHeader

Open adventure_cursor

Fetch next From adventure_cursor

into @orderId , @subTotal, @taxAmt , @freight

while @@fetch_status = 0

begin

update Sales.SalesOrderHeader

Set

TotalAmt = @subTotal + @taxAmt + @freight

where SalesOrderID = @orderId

Fetch next From adventure_cursor

into @orderId , @subTotal, @taxAmt , @freight

end

close adventure_cursor

deallocate adventure_cursor

The query shows that , it sums the total TaxAmt, Feight and SubTotal (Total sales for quantity) and stores it in TotalAmt of  Sales.SalesOrderHeader  table for each sales order.

In the AdventureWorks Db there are around 31465 rows . It took around 19 seconds to sum and update value for each item.

 

Now, lets change query to my more optimized update statement solution

 

update Sales.SalesOrderHeader

Set TotalAmt = list.SubTotal + list.TaxAmt + list.Freight

From Sales.SalesOrderHeader as SalesOrder

inner join (

Select SalesOrderID , SubTotal, TaxAmt, Freight from

(

Select

SalesOrderID,

SubTotal,

TaxAmt,

Freight

From Sales.SalesOrderHeader

)

as T ) as list on list.SalesOrderID = SalesOrder.SalesOrderID

 

First of all the query does not use intermidiate varible to store data , therefore it does everything on the fly, secondly i have used  T table of sql server 2005 , which does the processing in memory  but way to faster than any Cursor query.

And, yes the query completes the opeartion in 3 seconds.Amazing!!! You must be thinking 19 seconds was not a big deal at all , but think about 3 milion rows intead of 30,000 rows , in that case surely Cursor is of no use.

Easiest Thread-Safe implementation to work on multiple items at a time.

We have heaard of lot of thread safe practices, semaphore , mutex and others. One goal of all ,is  to create some app/tool which excecutes task faster than traditional single process app. One of the best example of this, creating some console importing tool.

As i said , there are loads of way to make an app multi-threaded, but the simpliest but robust one , which i am going to speak in the next few lines.

Let's kick off by creating a global  List/Array items, that will hold the list of items to work on , there will be a global index and  a object instace , which we will use as a "lock" key.

static

object workerLocker = new object();

private static int _Index = 0;

private static IList<SomeObject> objects= null;

static

int runningWorkers = 0;

Next, there will be an starter routine that will fire off some specific amount of thread , which is definined in app.config

public static int MAX_MODULE_COUNT = int.Parse(ConfigurationSettings.AppSettings["Max_Count"]);

Let's say in main method we have something like :

 

objects =  GetListOfItems(); // objects defined globally

runningWorkers = objects .Count;

List<Thread> threadList = new List<Thread>();

for (int index = 0; index < THREAD_COUNT; index++)

{

Thread thread = new Thread(WorkerRoutine);

thread.Start();

threadList.Add(thread);

}

Now, we have to make sure, while the threads are running the app does not quits , so we will  do a little join , that will wait untill all the threads are finished up.

foreach

(Thread thread in threadList)

{

thread.Join();

//Console.WriteLine(i++);

}

Now, inside the WorkingRountine(our main process routine), we will need to take a item from the list , increase the index and need to work on it.  But the main theme here , the thread will not quit unless all the items are worked on.  Lets say, you have started with 100 threads, those will remain active unless  all the items get dried out. The concept will work more like a factory production line.

Indside "WorkingRoutine"  we will the following:

SomeObject object= null;

while (_Index < runningWorkers)

{

lock (workerLocker)

{

object= objects[_Index++];

}

try

{

// do something useful here

}

catch (Exception ex)

{

//log the exception or do whatever appropiate

}

} // end loop

 

Here , to remember that "lock" has to be done for very little amount of time, so that no two thread cant access the list at the same time. And , as soon as, we took an object from the queue and increased the index, we open it out.

 

Although, there are lots of way to achive this out, i found it really easy to implement and effective.

Separate Browser Specific CSS

Sometimes it is necessary that we do some special tactics to make a  particular CSS work in diiferent browser . For example , the following css.

.logo

{

background-image:none;

filter: progid:DXImageTransform.Microsoft.AlphaImageLoader (src="##BASE_PATH##images/logo.png", sizingMethod="crop");

}

This is needed only for IE6 to make alpha blending work.Now, if the logo is used as a background image, then we will see no logo in Firefox unless we do some css hacks. Ironic !

Rather, using the hacks stuff for aiming at different browsers , which is, in a way cumbersome and not always side effect free, i belive that it's a good idea to create a css handler that will  prepare a combined output depending on browser needs.

 

 

The basic concept is to use a hander class that will stand as an intermidiate processor , which actually get all the css files , depending on the browser on which the app is running it combines them and finally and emits the content to the client.

Here, i need to tell that emiting the content type is a must, although IE is smart enough to recongize the output as CSS by the reference type but firefox wont be able to recognize it at all.

Full Code.

<%

@ WebHandler Language="C#" Class="CSSHanler" %>

using

System;

using

System.Web;

using

System.IO;

public

class CSSHanler: IHttpHandler {

public void ProcessRequest (HttpContext context) {

string css = string.Empty;

//general css

string commonCss = GetRawContent("default.css");

//the following line is useful if you need full path access in css , for ex. DXImageTransform full url.

commonCss = commonCss.Replace(

"##BASE_PATH##", "http://geekswithblogs.net/");;

css = commonCss;

css +=

"\r\n";

if (context.Request.Browser.IsBrowser("IE"))

{

string ieCss = GetRawContent("defaultIE.css");

ieCss = ieCss.Replace(

"##BASE_PATH##", "http://geekswithblogs.net/");

css += ieCss;

css +=

"\r\n";

}

// TOOD : add other browser checks as well.

context.Response.ContentType =

"text/css"; // Must for FireFox

context.Response.ContentEncoding = System.Text.

Encoding.UTF8;

context.Response.Write(css);

context.Response.End();

}

public string GetRawContent(string fileName)

{

TextReader configPath = new StreamReader(System.AppDomain.CurrentDomain.BaseDirectory + "YourDirectory\\" + fileName);

string content = configPath.ReadToEnd();

configPath.Close();

return content;

}

public bool IsReusable {

get {

return false;

}

}

}

To start off, you just to  add the reference in your Default.aspx or Default.master which looks like:

<link rel="stylesheet" href="<%=YOURSITEPREFIX%>CSSHandler.ashx" type="text/css" media="screen" />

And,Replace the site prefix to whatever you want.And that's it, you are ready to throw your hacks away.