Performing a Case-Insensitive Search in an XML Document

XML is one of the most widely adopted standards of recent years for storing and retrieving data. XML documents are plain text files, so they can hold as much information as required while staying easy to store and exchange. To some extent, XML has also become a replacement for databases.

However, XML has some restrictions (rather good ones).

XML is case sensitive. Man, MAN, man and mAn are all different when it comes to XML.

Every start tag in XML needs a matching end tag. So a <Customer> tag definitely needs a </Customer> tag to mark its end.

Security, scalability and maintainability are issues which still need to be looked into.

A variety of XML parsers have been built and used over the years, and standards such as XPath and XQuery have been defined for querying XML documents just as one would query a database table.

.NET provides many methods and APIs for working with XML documents, and its thorough support for querying and displaying XML data is one of its major advantages over its predecessors.

XPath, a W3C standard for querying XML documents, is supported in .NET (the System.Xml classes implement XPath 1.0). You can search through an XML document using an XPath expression and retrieve or modify the values it finds.

However, since XML is case sensitive, searches for "Whidbey", "WHIDBEY" and "wHiDBey" will not produce the same results. The text has to match exactly, character for character, for the XPath expression to identify the node.

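If we would like to implement a case-insensitive search, we need to modify our XPath expression a little using the translate function, which replaces each character found in its second argument with the character at the same position in its third argument. We will see a sample XPath expression.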
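To make the effect of translate concrete, here is a minimal sketch that evaluates it on the three spellings mentioned earlier. The class name TranslateDemo and the helper LowerViaXPath are made up for illustration; the sketch assumes the standard System.Xml classes and an empty XmlDocument used only as an evaluation context.

```csharp
using System;
using System.Xml;
using System.Xml.XPath;

class TranslateDemo
{
    const string Upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    const string Lower = "abcdefghijklmnopqrstuvwxyz";

    // Hypothetical helper: lower-cases an ASCII word by evaluating translate().
    // Assumes the word contains no apostrophe, since it is embedded in a quoted literal.
    public static string LowerViaXPath(string word)
    {
        // An empty document is enough: the expression only uses string literals.
        XPathNavigator nav = new XmlDocument().CreateNavigator();
        string expr = "translate('" + word + "', '" + Upper + "', '" + Lower + "')";
        return (string)nav.Evaluate(expr);
    }

    static void Main()
    {
        foreach (string word in new[] { "Whidbey", "WHIDBEY", "wHiDBey" })
        {
            Console.WriteLine(LowerViaXPath(word)); // prints "whidbey" each time
        }
    }
}
```

Only the 26 letters listed in the two alphabets are mapped; any other character passes through unchanged.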

Let us consider the following Books.xml

<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>

<book category="Cooking">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>

<book category="xml">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>

<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>

<book category="XML">
<title lang="en">XQuery Kick Start</title>
<author>James McGovern</author>
<author>Per Bothner</author>
<author>Kurt Cagle</author>
<author>James Linn</author>
<author>Vaidyanathan Nagarajan</author>
<year>2003</year>
<price>49.99</price>
</book>

<book category="xMl">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>


Suppose we would like to find all books in the above XML document that fall under the category "XML", regardless of the case in which the category is written in the file (xml, XML, xMl and so on). The following XPath expression retrieves the matching nodes irrespective of the case of the category.

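string searchtext = "Xml";

// translate() lower-cases the attribute value; ToLower() lower-cases the keyword.
string xpath = "/bookstore/book[translate(@category, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz') = '" + searchtext.ToLower() + "']";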

As is evident from the above, we assign the search keyword "Xml" to the variable searchtext and convert it to lower case with the ToLower() method of the string class while building the XPath expression. In the same way, inside the expression the translate function converts each @category value to lower case, so the comparison succeeds irrespective of whether the category is stored as "xml", "XML" or "xMl".

When the above XPath expression is executed against an XmlDocument or XPathDocument, the books that fall under the category "XML" are retrieved irrespective of whether the category is written as "xml", "XML" or "xMl".
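Put together, running the expression might look like the sketch below. It uses XmlDocument.SelectNodes on a cut-down inline copy of Books.xml (titles only). The class name CaseInsensitiveSearch and the helper FindByCategory are hypothetical names chosen for this illustration, not part of any .NET API.

```csharp
using System;
using System.Xml;

class CaseInsensitiveSearch
{
    const string Upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    const string Lower = "abcdefghijklmnopqrstuvwxyz";

    // Hypothetical helper: returns the <book> nodes whose category matches
    // searchtext, ignoring ASCII case. Assumes searchtext has no apostrophe.
    public static XmlNodeList FindByCategory(XmlDocument doc, string searchtext)
    {
        string xpath = "/bookstore/book[translate(@category, '" + Upper + "', '" + Lower + "') = '"
                       + searchtext.ToLower() + "']";
        return doc.SelectNodes(xpath);
    }

    static void Main()
    {
        // Trimmed, well-formed stand-in for Books.xml.
        string xml =
            "<bookstore>" +
            "<book category='Cooking'><title>Everyday Italian</title></book>" +
            "<book category='xml'><title>Learning XML</title></book>" +
            "<book category='CHILDREN'><title>Harry Potter</title></book>" +
            "<book category='XML'><title>XQuery Kick Start</title></book>" +
            "<book category='xMl'><title>Learning XML</title></book>" +
            "</bookstore>";

        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        XmlNodeList matches = FindByCategory(doc, "Xml");
        Console.WriteLine(matches.Count); // prints 3
        foreach (XmlNode book in matches)
        {
            // Show the category exactly as stored, followed by the title.
            Console.WriteLine(book.Attributes["category"].Value + ": "
                              + book.SelectSingleNode("title").InnerText);
        }
    }
}
```

Because the keyword is concatenated straight into the expression, a search text containing an apostrophe would break the quoting, so real input would need to be checked first.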

Thus, we can implement a case-insensitive search through XML documents by using the translate function in an XPath expression.

Cheers and Happy XPathing !!!

Posted on Monday, September 12, 2005 1:17 PM

Comments on this post

# re: Performing a Case In-sensitive search in an XML Document

Thank you very much. It was very helpful.
Left by Shahid H. Faruqi on Jan 13, 2006 8:26 PM

# re: Performing a Case In-sensitive search in an XML Document

Brilliant - thanks very much for this post Harish!!!
Left by Tim on Jul 06, 2006 11:02 AM

# re: Performing a Case In-sensitive search in an XML Document

Works great!!! Thanks
Left by Achintya Jha on Nov 14, 2006 7:20 PM

# re: Performing a Case In-sensitive search in an XML Document

thanks a lot for this helpful tip. Saved me a lot of effort. Thanks again
Left by Gaurav Sawant on Nov 28, 2006 6:39 AM

# re: Performing a Case In-sensitive search in an XML Document

awesome dude
Left by nabeel on Mar 08, 2007 1:17 PM

# re: Performing a Case In-sensitive search in an XML Document

I don't understand how that works, since you do not know the locale of the text. Transforming the case of text is locale dependent, so it seems to me that your trick only works for English...
Left by Remi on Mar 08, 2007 9:30 PM

# re: Performing a Case In-sensitive search in an XML Document

Very nice. It's a shame that SQL Server's implementation of XQuery does not include the translate() function ... :<
Left by Ed Graham on Feb 10, 2009 7:21 PM

# re: Performing a Case In-sensitive search in an XML Document

Thanks, really it helped.
Left by Pawan Gupta on Jul 01, 2009 6:26 AM

# re: Performing a Case In-sensitive search in an XML Document

you do not know the locale of the text. Transforming the case of text is locale dependent,
Left by mario oyunları on Sep 21, 2009 10:50 AM

# re: Performing a Case In-sensitive search in an XML Document

Left by Danny on Apr 09, 2010 3:17 PM

# re: Performing a Case In-sensitive search in an XML Document

I was looking for this to perform case-insensitive queries with a crawler script. Thanks a lot!
Left by bic on May 17, 2010 2:23 AM

# re: Performing a Case In-sensitive search in an XML Document

Works good for me!
Thank you very much.
Left by Chris on May 23, 2010 12:43 PM

# re: Performing a Case In-sensitive search in an XML Document

hmm..
It works like charm..thanks a lot...
Left by Rama Selvam on Aug 03, 2010 2:12 AM

# re: Performing a Case In-sensitive search in an XML Document

I searched a lot.. and got loads of google results but i got it correct from your example..


thanks
Left by Viren on May 25, 2011 10:55 AM

# re: Performing a Case In-sensitive search in an XML Document

Good example. If I need to get a child element case-insensitively, rather than matching on an attribute, is there a way? How would the XPath look in that case?

Thanks in advance..
Left by Karthik on Aug 23, 2011 1:16 AM
