Geeks With Blogs
NIEM Guy - NIEM, XML, and .NET Piecing it all together

This is part one of a multi-part post where I will show some of the techniques we've been using to parse NIEM XML documents using LINQ. 

Microsoft has definitely put a lot of thought into the System.Xml.Linq features.  When working with NIEM XML the essential thing to remember is that the only thing that ever really changes is the name of the tag and sometimes the namespace.  This makes NIEM very easy to parse with LINQ, because LINQ to XML does not force you to walk the document as a strict hierarchy the way the DOM does.  It can also treat the document as a flat sequence of elements, and that is the power we will leverage.
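To illustrate the difference, here is a minimal sketch that compares a step-by-step hierarchical walk with a flat lookup.  The module name FlatLookupDemo, both function names, and the tiny inline document are made up for this illustration and are not part of our parser.

```vb
Imports System
Imports System.Linq
Imports System.Xml.Linq

Module FlatLookupDemo
    ' Hierarchical walk: every intermediate element must be named in order.
    Function GivenNameByPath(ByVal doc As XDocument, ByVal nc As XNamespace) As String
        Return doc.Root.Element(nc + "Person").Element(nc + "PersonName").Element(nc + "PersonGivenName").Value
    End Function

    ' Flat lookup: Descendants searches the whole tree, so the nesting depth does not matter.
    Function GivenNameFlat(ByVal doc As XDocument, ByVal nc As XNamespace) As String
        Return doc.Descendants(nc + "PersonGivenName").First().Value
    End Function

    Sub Main()
        Dim nc As XNamespace = "http://niem.gov/core/2.0"
        ' Illustrative document only; the root element name is an assumption.
        Dim doc As XDocument = XDocument.Parse( _
            "<root xmlns:nc='http://niem.gov/core/2.0'>" & _
            "<nc:Person><nc:PersonName><nc:PersonGivenName>Chris</nc:PersonGivenName></nc:PersonName></nc:Person>" & _
            "</root>")

        Console.WriteLine(GivenNameByPath(doc, nc)) ' Chris
        Console.WriteLine(GivenNameFlat(doc, nc))   ' Chris
    End Sub
End Module
```

Both calls return the same value here, but only the flat version keeps working if the Person element is moved deeper into the document.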

The key to this is a couple of functions (they need to be refactored and could probably be abstracted a little more) named FindElement and FindElements.  FindElement returns the first instance of the element found, based on the namespace and element name.  FindElements does the same but returns an IEnumerable of XElement containing every match.

' Returns the first matching descendant, or Nothing when there is no match.
Public Shared Function FindElement(ByVal doc As XDocument, ByVal elementName As String, ByVal xNamespace As XNamespace) As XElement
    Return doc.Descendants(xNamespace + elementName).FirstOrDefault()
End Function

Public Shared Function FindElements(ByVal doc As XDocument, ByVal elementName As String, ByVal xNamespace As XNamespace) As IEnumerable(Of XElement)
    Return doc.Descendants(xNamespace + elementName)
End Function

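I tucked those methods in a class called XmlParser.  Notice the use of + and not &.  In VB you normally join strings with &; + also concatenates two strings, but it is discouraged because with mixed operand types it can be treated as arithmetic addition or fail to compile under Option Strict.  To join an XNamespace with a string element name, though, you must use +: XNamespace overloads the + operator to produce an XName, while & is strictly string concatenation.  I didn't see this exact case documented anywhere; I just took a guess after & gave me compiler errors.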
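To make the + behavior concrete, here is a small sketch.  The module name XNameDemo and the helper BuildName are hypothetical names chosen for this example; the only thing it relies on is that adding a String to an XNamespace yields an XName, which is the type Descendants expects.

```vb
Imports System
Imports System.Xml.Linq

Module XNameDemo
    ' Hypothetical helper: combines a namespace and a local name into an XName.
    Function BuildName(ByVal ns As XNamespace, ByVal localName As String) As XName
        ' XNamespace overloads +, so the result is an XName, not a String.
        Return ns + localName
    End Function

    Sub Main()
        ' A String converts implicitly to an XNamespace.
        Dim nc As XNamespace = "http://niem.gov/core/2.0"
        Dim givenName As XName = BuildName(nc, "PersonGivenName")

        Console.WriteLine(givenName.LocalName)     ' PersonGivenName
        Console.WriteLine(givenName.NamespaceName) ' http://niem.gov/core/2.0
        Console.WriteLine(givenName.ToString())    ' {http://niem.gov/core/2.0}PersonGivenName
    End Sub
End Module
```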

Here is a sample doc, declared inline using the XML literals feature in VB:
Dim doc As XDocument = <?xml version="1.0"?>
                       <myDoc xmlns="http://niemguy.com/mydoc/1.0" xmlns:mydoc="http://niemguy.com/mydoc/1.0" xmlns:nc="http://niem.gov/core/2.0">
                           <people>
                               <nc:Person>
                                   <nc:PersonName>
                                       <nc:PersonGivenName>Chris</nc:PersonGivenName>
                                   </nc:PersonName>
                                   <nc:PersonBirthDate>
                                       <nc:Date>1978-08-12</nc:Date>
                                   </nc:PersonBirthDate>
                               </nc:Person>
                               <nc:Person>
                                   <nc:PersonName>
                                       <nc:PersonGivenName>Sheila</nc:PersonGivenName>
                                   </nc:PersonName>
                                   <nc:PersonBirthDate>
                                       <nc:Date>1977-02-24</nc:Date>
                                   </nc:PersonBirthDate>
                               </nc:Person>
                               <nc:Person>
                                   <nc:PersonName>
                                       <nc:PersonGivenName>David</nc:PersonGivenName>
                                   </nc:PersonName>
                                   <nc:PersonBirthDate>
                                       <nc:Date>1982-06-05</nc:Date>
                                   </nc:PersonBirthDate>
                               </nc:Person>
                           </people>
                       </myDoc>

And finally, here is how we can use the FindElement and FindElements functions to find some elements.

Dim nameEl As XElement = XmlParser.FindElement(doc, "PersonGivenName", GetXmlNamespace(nc))
Console.WriteLine(nameEl.Value)

Dim birthDays As IEnumerable(Of XElement) = XmlParser.FindElements(doc, "PersonBirthDate", GetXmlNamespace(nc))

For Each birthday In birthDays
    Console.WriteLine(birthday...<nc:Date>.Value)
Next

A key thing to note here is the use of GetXmlNamespace.  GetXmlNamespace is a Visual Basic operator, resolved by the compiler, that returns the XNamespace for the prefix of a namespace you have imported at the project or file level.  For reference, here are the file-level imports I used:

Imports <xmlns:mydoc="http://niemguy.com/mydoc/1.0">
Imports <xmlns:nc="http://niem.gov/core/2.0">

The nice thing about GetXmlNamespace is that you get IntelliSense on the namespaces you have imported in your project or file, which makes it really handy.  The drawback is that GetXmlNamespace only knows about prefixes imported at compile time; I have not found a way to make it pick them up from an external source or from the document itself.  For most exchanges that is fine if you're custom tailoring the parser a little, but if you're after a generic parser then this may not be the way to go.
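For a more generic parser, one possible route that sidesteps GetXmlNamespace is to ask the document for its own declarations at run time.  The sketch below assumes the prefix is declared on the root element; the module name RuntimeNamespaceDemo and the helper ResolvePrefix are hypothetical, and this has not been tried against real NIEM exchanges.  XElement.GetNamespaceOfPrefix does the lookup and returns Nothing for an undeclared prefix.

```vb
Imports System
Imports System.Linq
Imports System.Xml.Linq

Module RuntimeNamespaceDemo
    ' Hypothetical helper: resolves a prefix from the declarations in scope on the root element.
    ' Returns Nothing when the document has no root or the prefix is not declared there.
    Function ResolvePrefix(ByVal doc As XDocument, ByVal prefix As String) As XNamespace
        If doc.Root Is Nothing Then Return Nothing
        Return doc.Root.GetNamespaceOfPrefix(prefix)
    End Function

    Sub Main()
        Dim doc As XDocument = XDocument.Parse( _
            "<myDoc xmlns='http://niemguy.com/mydoc/1.0' xmlns:nc='http://niem.gov/core/2.0'>" & _
            "<nc:Person><nc:PersonName><nc:PersonGivenName>Chris</nc:PersonGivenName></nc:PersonName></nc:Person>" & _
            "</myDoc>")

        Dim nc As XNamespace = ResolvePrefix(doc, "nc")
        If nc IsNot Nothing Then
            ' Same Descendants lookup as before, but with a namespace discovered at run time.
            Dim givenName As XElement = doc.Descendants(nc + "PersonGivenName").FirstOrDefault()
            If givenName IsNot Nothing Then Console.WriteLine(givenName.Value) ' Chris
        End If
    End Sub
End Module
```

The trade-off is that you lose IntelliSense and compile-time checking of the prefix, since the namespace is only known once the document has been loaded.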

The next post will describe an abstract (MustInherit) base class that we created as a base document parser and then extend for special documents.

Posted on Sunday, March 9, 2008 2:56 PM

