Gino Abraham's Blog

December 2009 Entries

Adding Intelligence to your Search Engines.

Conventional Way of Searching :

When a website needs a search feature over its knowledge base, the first thing that comes to mind is a free-text search on a few keywords. With SQL Server, we would run a free-text query against a keyword index; otherwise we might look for a third-party indexing utility to fetch the best-matching results for the end user. This is called keyword searching. The search keywords would look like "CEO Infosys".
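To make keyword searching concrete, here is a minimal sketch in Python of a tiny inverted index. It is only an illustration, not how SQL Server or any indexing utility works internally; the sample documents and the names `DOCS`, `build_index` and `keyword_search` are made up for this example.

```python
from collections import defaultdict

# Made-up knowledge-base entries, keyed by document id (illustration only).
DOCS = {
    1: "the CEO of Infosys spoke at the annual meeting",
    2: "Infosys published its annual report",
    3: "what a CEO does in a large company",
}

def build_index(docs):
    """Map each lowercased word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def keyword_search(index, query):
    """Return the ids of documents that contain every keyword in the query."""
    words = query.lower().split()
    if not words:
        return []
    hits = [index.get(word, set()) for word in words]
    return sorted(set.intersection(*hits))

INDEX = build_index(DOCS)
print(keyword_search(INDEX, "CEO Infosys"))                # [1]
print(keyword_search(INDEX, "who is the CEO of Infosys"))  # []
```

The second query returns nothing because words like "who" and "is" do not appear in any document. This strict match-every-word search only works well when the user types bare keywords.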

Adding Intelligence to your Search : 

How do you add intelligence to your search? There are many ways to do it. If you search Google for "Semantic Search", you will get many results. My intention here is to show newbies how they can add intelligence to their search using some open source tools.

The intelligence I am talking about is decoding a sentence that the end user types into the search box. If the end user enters a sentence like "who is the CEO of Infosys", the search engine should be able to decode it and fetch results accordingly.

How can we Achieve this :

We need natural language processing capability at our end to achieve this. We can leverage a few open source tools to get our natural-language sentence processed. One such tool is Sharp NLP, which is a .NET version of OpenNLP. Sharp NLP is available for download. Sharp NLP uses Princeton University's WordNet library as a backend (a stored list of words) to process our sentence. Processing here means knocking the unwanted words off the search sentence.

For example, take our search sentence "who is the CEO of Infosys". A sentence can contain verbs, conjunctions, adjectives, nouns and so on. Sharp NLP can be used to find the part of speech of each word, so we can drop the unwanted words from the sentence and then run the search on our database.
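The filtering idea can be sketched in a few lines of Python. This is not Sharp NLP's API: the small `TOY_POS` lexicon below is a hypothetical stand-in for a real part-of-speech tagger, and treating every unknown word as a proper noun is an assumption made only to keep the sketch short.

```python
# Toy part-of-speech lexicon: a hypothetical stand-in for a real tagger.
TOY_POS = {
    "who": "PRONOUN",
    "is": "VERB",
    "the": "DETERMINER",
    "of": "PREPOSITION",
    "ceo": "NOUN",
}

# Parts of speech we treat as worth searching on.
KEEP = {"NOUN", "PROPER_NOUN"}

def tag(sentence):
    """Return (word, tag) pairs; unknown words are assumed to be proper nouns."""
    words = sentence.replace("?", " ").split()
    return [(word, TOY_POS.get(word.lower(), "PROPER_NOUN")) for word in words]

def extract_keywords(sentence):
    """Keep only the words whose part of speech is in KEEP."""
    return [word for word, pos in tag(sentence) if pos in KEEP]

print(extract_keywords("who is the CEO of Infosys"))  # ['CEO', 'Infosys']
```

The surviving words are the same "CEO Infosys" keywords a user would have typed by hand, so they can go straight into the existing keyword search.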

A few websites do natural language processing on the search text and respond with an answer if the text is a question. For example, for the search sentence above they would respond as follows.

Kris Gopalakrishnan is the CEO of Infosys
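As a very rough sketch of how an answer like the one above could be produced, the Python below matches one fixed question pattern and looks the result up in a fact table. This is an assumption-heavy toy, not how those websites actually work: the `FACTS` table, the regular expression and the `answer` function are all hypothetical, and the only fact in the table is the example from this post.

```python
import re

# Hypothetical fact table; its single entry is the example used in this post.
FACTS = {("ceo", "infosys"): "Kris Gopalakrishnan"}

# Matches questions shaped like "who is the <role> of <organisation>".
QUESTION = re.compile(
    r"^who is the (?P<role>.+?) of (?P<org>.+?)\??$", re.IGNORECASE
)

def answer(question):
    """Return a sentence answering the question, or None if we cannot."""
    match = QUESTION.match(question.strip())
    if not match:
        return None
    role, org = match.group("role"), match.group("org")
    person = FACTS.get((role.lower(), org.lower()))
    if person is None:
        return None
    return f"{person} is the {role} of {org}"

print(answer("who is the CEO of Infosys"))
# Kris Gopalakrishnan is the CEO of Infosys
```

A real system would need far more than one pattern and one fact, which is why a proper natural language processing library is the better starting point.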

I am doing research on how to achieve this; you can soon expect a detailed blog post on it with hosted samples. Till then, happy coding.