XML is one of the most widely adopted standards of recent years for storing and retrieving data. XML documents can store as much information as required in flat files that are easy to store and move around. To some extent, XML has become a replacement for databases.
However, XML imposes some restrictions, and they are rather good ones.
XML is case sensitive. Man, MAN, man and mAn are all different names when it comes to XML.
Every start tag needs a matching end tag. So a <Customer> tag needs a </Customer> tag to mark its end (or the element must be written in the self-closing form <Customer/>).
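Both restrictions can be observed with a few lines of C#. This is only a minimal sketch: the class name WellFormednessDemo and the helper IsWellFormed are made up for illustration, and it assumes that XmlDocument.LoadXml is an acceptable way to check well-formedness.

```csharp
using System;
using System.Xml;

static class WellFormednessDemo
{
    // Hypothetical helper: returns true when the text parses as well-formed XML.
    public static bool IsWellFormed(string xml)
    {
        try
        {
            new XmlDocument().LoadXml(xml);
            return true;
        }
        catch (XmlException)
        {
            // LoadXml throws XmlException for mismatched or missing end tags.
            return false;
        }
    }

    static void Main()
    {
        Console.WriteLine(IsWellFormed("<Customer>Ann</Customer>")); // True: tags match
        Console.WriteLine(IsWellFormed("<Customer>Ann</customer>")); // False: case differs
        Console.WriteLine(IsWellFormed("<Customer>Ann"));            // False: no end tag
    }
}
```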
Security, scalability and maintainability are issues that still need to be looked into.
A variety of XML parsers have been built and used over the years, and standards such as XPath and XQuery have been defined for querying XML documents much as one would query a database table.
.NET provides a rich set of APIs for working with XML documents, and its strong support for querying and displaying XML data is one of its major advantages over its predecessors.
XPath, a W3C standard for querying XML documents, is supported in .NET. You can search an XML document with an XPath expression and then read or modify the values of the nodes it selects.
However, since XML is case sensitive and XPath compares strings exactly, searches for "Whidbey", "WHIDBEY" and "wHiDBey" will not produce the same results. The text has to match exactly for the XPath expression to identify the node.
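The exact-match behaviour is easy to see on a tiny document. The sketch below is illustrative only: the <products> document, the class name ExactMatchDemo and the helper CountMatches are assumptions made up for this example.

```csharp
using System;
using System.Xml;

static class ExactMatchDemo
{
    // Hypothetical helper: number of nodes the XPath expression selects.
    public static int CountMatches(string xml, string xpath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.SelectNodes(xpath).Count;
    }

    static void Main()
    {
        // Made-up sample document with a single product named "Whidbey".
        string xml = "<products><product name='Whidbey'/></products>";

        Console.WriteLine(CountMatches(xml, "/products/product[@name='Whidbey']")); // 1
        Console.WriteLine(CountMatches(xml, "/products/product[@name='WHIDBEY']")); // 0
        Console.WriteLine(CountMatches(xml, "/products/product[@name='wHiDBey']")); // 0
    }
}
```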
To implement a case-insensitive search, we need to modify the XPath expression a little using the translate function. Let us look at a sample XPath expression.
Consider the following Books.xml:
<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>
<book category="Cooking">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="xml">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="XML">
<title lang="en">XQuery Kick Start</title>
<author>James McGovern</author>
<author>Per Bothner</author>
<author>Kurt Cagle</author>
<author>James Linn</author>
<author>Vaidyanathan Nagarajan</author>
<year>2003</year>
<price>49.99</price>
</book>
<book category="xMl">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>
Suppose we want to find all books in the above XML document that fall under the category "XML", and we do not care about the case in which the category is written in the file (xml, XML, xMl, etc.). The following XPath expression retrieves those nodes irrespective of the case of the category:
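// Search keyword; its case does not matter.
string searchtext = "Xml";
// Lower-case @category inside XPath and compare it with the lower-cased keyword.
string xpath = "/bookstore/book[translate(@category, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz') = '" + searchtext.ToLower() + "']";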
As is evident from the above code, we assign the search keyword "Xml" to the variable searchtext and convert it to lower case with the ToLower() method of the string class while building the XPath expression. In the same way, inside the expression, the translate function converts each @category value to lower case, so the comparison succeeds whether the category is stored as "xml", "XML" or "xMl".
Upon executing the above XPath expression against an XmlDocument or XPathDocument, the books that fall under the category "XML" are retrieved irrespective of whether the category is written "xml", "XML" or "xMl".
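Putting the pieces together, a complete run might look like the sketch below. It is one possible arrangement under stated assumptions, not the only way to do it: the class name CaseInsensitiveSearch and the helper FindTitles are made up for illustration, and the document is a trimmed inline copy of Books.xml that keeps only the category and title of each book.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

static class CaseInsensitiveSearch
{
    const string Upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    const string Lower = "abcdefghijklmnopqrstuvwxyz";

    // Trimmed copy of Books.xml (assumption: only category and title are kept).
    public const string BooksXml = @"<bookstore>
  <book category='Cooking'><title lang='en'>Everyday Italian</title></book>
  <book category='xml'><title lang='en'>Learning XML</title></book>
  <book category='CHILDREN'><title lang='en'>Harry Potter</title></book>
  <book category='XML'><title lang='en'>XQuery Kick Start</title></book>
  <book category='xMl'><title lang='en'>Learning XML</title></book>
</bookstore>";

    // Hypothetical helper: titles of all books whose category equals
    // searchtext, ignoring the case of the ASCII letters A-Z.
    public static List<string> FindTitles(string xml, string searchtext)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        // translate() lower-cases @category inside the XPath expression;
        // the keyword is lower-cased in C# before it is concatenated in.
        string xpath = "/bookstore/book[translate(@category, '" + Upper + "', '" + Lower + "') = '"
                       + searchtext.ToLower() + "']";

        List<string> titles = new List<string>();
        foreach (XmlNode book in doc.SelectNodes(xpath))
        {
            titles.Add(book.SelectSingleNode("title").InnerText);
        }
        return titles;
    }

    static void Main()
    {
        // Prints: Learning XML, XQuery Kick Start, Learning XML (one per line).
        foreach (string title in FindTitles(BooksXml, "Xml"))
        {
            Console.WriteLine(title);
        }
    }
}
```

Two caveats apply to this sketch. The translate call only folds the 26 ASCII letters listed in it, so accented or other non-ASCII letters are still compared exactly. Also, because the keyword is concatenated into the expression, a keyword that contains an apostrophe would produce an invalid XPath expression and would need to be handled separately.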
Thus, we can implement a case-insensitive search through XML documents using the translate function of XPath.
Cheers and Happy XPathing !!!
posted @ Monday, September 12, 2005 1:17 PM