What is reference data?
By reference data we typically mean an attribute of an object within a system that is restricted to a specific list of values. In integration scenarios, when you pass a message from one system to another, these lists of reference data usually differ between the systems.
This means that if you have a product type id or value from System A, some kind of mapping will usually be required to translate it to an equivalent value that System B will understand.
The below diagram illustrates this.
From my experience on BizTalk projects I have seen some different ways that this has been implemented using BizTalk maps. These include the following:
|C# Helper Class (with switch or if statements)
|When I have seen these, they are usually in the form of one class (often called MappingHelper.cs) full of methods, each containing an if statement that determines which value gets returned. Although these do the job, I often find that this helper class is a mess to read because it is just a dumping ground for random mapping methods. It also tends to be poorly tested, which leads to an anti-pattern I have blogged about in the past (see: http://geekswithblogs.net/michaelstephenson/archive/2006/12/18/101501.aspx).
|Scripting Functoid with inline code
|Again this can do the job. It is like the above, but the if or switch is written inline within the functoid. The problem is that the code is only tested at the level of testing the map, which in most cases does not have enough test cases executed to ensure that we cover all scenarios.
|Custom Database for mapping
|I have seen cases where a custom database (usually in SQL Server, as this is where BizTalk sits) has been used to hold mapping data, and a C# helper has then been written which accesses the database and does the mapping.
The main common problems with some of the ways I have seen used are:
· They are not very stable because in most cases they have not been tested properly
· They are difficult to maintain
· They require more work than is needed
I feel that in most cases one of the three ways I will discuss can be used to implement reference data mapping within a BizTalk Map. I also feel that a couple of suggestions on how to implement them will help to ensure you can do this in a way which will make your life easier.
Pattern 1: Simple C# Helper
The first pattern is a simple C# Helper. OK, this initially sounds very similar to the method above, of which I said I had seen a number of poor examples. I think in principle this can be a good technique, but it should be used in the right place and in the right way. Below are some details of the pattern:
|Reference Data: Simple Helper Class
|This can be considered when the following conditions are met:
· The data does not change
· We need to minimise the overhead during the processing of the map
· There are not many different scenarios
|Things to be aware of:
|· If the data changes you will need to change your code, which means a redeployment.
· You might have to have multiple methods to map between the different combinations of systems.
|I think when you have the scenario where the data is static and you have to map between 3 or 4 systems you could extend this pattern to map to a common value for the entity then have a class for each system which maps its values to and from the common equivalents. I may blog about this in the future.
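As a rough idea of what that extension might look like, here is a minimal sketch. Everything in it is an assumption made for illustration: the CustomerType enum, the SystemACustomerType and SystemBCustomerType class names, and the code values are all hypothetical. Each system's class only knows how to map its own values to and from the common value, so adding a third or fourth system means adding one class rather than a method per pair of systems.

```csharp
using System;

// Common (canonical) customer types. All names and values here are hypothetical.
public enum CustomerType
{
    Retail,
    Wholesale
}

// System A's class only maps its own values to and from the common value.
public static class SystemACustomerType
{
    private const string Retail = "1";
    private const string Wholesale = "2";

    public static CustomerType ToCommon(string value)
    {
        switch (value)
        {
            case Retail: return CustomerType.Retail;
            case Wholesale: return CustomerType.Wholesale;
            default: throw new ArgumentOutOfRangeException("value", value, "Unknown System A customer type.");
        }
    }

    public static string FromCommon(CustomerType common)
    {
        switch (common)
        {
            case CustomerType.Retail: return Retail;
            case CustomerType.Wholesale: return Wholesale;
            default: throw new ArgumentOutOfRangeException("common", common, "Unknown common customer type.");
        }
    }
}

// System B follows the same shape with its own (hypothetical) values.
public static class SystemBCustomerType
{
    private const string Retail = "RET";
    private const string Wholesale = "WHS";

    public static CustomerType ToCommon(string value)
    {
        switch (value)
        {
            case Retail: return CustomerType.Retail;
            case Wholesale: return CustomerType.Wholesale;
            default: throw new ArgumentOutOfRangeException("value", value, "Unknown System B customer type.");
        }
    }

    public static string FromCommon(CustomerType common)
    {
        switch (common)
        {
            case CustomerType.Retail: return Retail;
            case CustomerType.Wholesale: return Wholesale;
            default: throw new ArgumentOutOfRangeException("common", common, "Unknown common customer type.");
        }
    }
}
```

Mapping from System A to System B then becomes two hops through the common value, for example SystemBCustomerType.FromCommon(SystemACustomerType.ToCommon(value)).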
I think the following are good practices to use when implementing this pattern:
· Don’t have 1 helper class which is full of methods
· Have 1 class which is a Converter for each reference data type (e.g. CustomerTypeIdConverter)
· Use constants to represent the values for each system (this makes it much easier to read)
· Fully unit test it and ensure you get 100% code coverage before you use the class in BizTalk (this will mean you shouldn’t have any issues with the code once you get it in BizTalk)
The following is an example of how you might implement this:
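A minimal sketch along these lines is shown here. Only the CustomerTypeIdConverter name comes from the practices above; the method names, the two systems and all of the code values are made up for illustration and are not taken from a real project.

```csharp
using System;

// One converter class per reference data type.
// The system names and values below are hypothetical.
public static class CustomerTypeIdConverter
{
    // Values used by System A
    private const string SystemARetail = "1";
    private const string SystemAWholesale = "2";

    // Values used by System B
    private const string SystemBRetail = "RET";
    private const string SystemBWholesale = "WHS";

    // Maps a System A customer type id to the System B equivalent.
    public static string SystemAToSystemB(string systemAValue)
    {
        switch (systemAValue)
        {
            case SystemARetail:
                return SystemBRetail;
            case SystemAWholesale:
                return SystemBWholesale;
            default:
                // Fail loudly rather than passing an unmapped value downstream.
                throw new ArgumentOutOfRangeException(
                    "systemAValue", systemAValue, "No System B customer type is mapped for this value.");
        }
    }

    // Maps a System B customer type id to the System A equivalent.
    public static string SystemBToSystemA(string systemBValue)
    {
        switch (systemBValue)
        {
            case SystemBRetail:
                return SystemARetail;
            case SystemBWholesale:
                return SystemAWholesale;
            default:
                throw new ArgumentOutOfRangeException(
                    "systemBValue", systemBValue, "No System A customer type is mapped for this value.");
        }
    }
}
```

Because each method is a small switch over named constants, it is straightforward to write one unit test per case plus one for the unmapped value, which is what makes 100% coverage realistic here.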
This pattern is used a lot but I think using the few practices I mention above will give you a number of benefits in your implementation.
Pattern 2: BizTalk Xref Data
This pattern uses the cross reference data mapping feature of BizTalk. I don't think this feature is particularly well known, as I haven't seen it discussed in many books; however, if used it can be effective. Basically, there is a set of tables within BizTalk which allow you to set up the mappings for reference data between systems. The following table provides some details on this pattern.
|Reference Data: BizTalk Id and Value Cross Reference
|This can be considered when the following conditions are met: · The data may change regularly and we need to be able to change and add new reference data without redeploying the application
|Things to be aware of:
|· Each time you retrieve a value it will hit the database so one mapping from one system to another will have two database hits. In messages where there are lots of fields which require reference data mapping this database activity may affect performance.
|As described below, this can easily be integrated into your development build process.
In a BizTalk map you can use some standard database functoids to retrieve the data from the management database. The following picture shows which functoids can be used:
The following picture shows how a message might be mapped using the reference data functoids:
You can see that the Get Common ID functoid is used to retrieve a common id based on the process type from the input message. Using this common id, we then retrieve an application-specific id for the output message.
The BizTalk map side of this implementation is easy; however, you may be asking how the reference data gets into the management database. There is a handy tool called BTSXRefImport.exe which is located in the BizTalk directory. This tool lets you specify all of your reference data in a set of XML files and then import them by calling the tool.
The following link is to an example of how to use this: http://home.comcast.net/~sdwoodgate/xrefseed.zip
When I have used this pattern I usually use the files as described in the link above. Then, during the build process for my application, I use the PurgeData.sql file to clear the database, followed by a call to the tool to reimport all of my reference data. The following example is taken from one of my build files (note that it uses the Microsoft.Sdc MSBuild tasks for the Sql task).
<Exec Command='"C:\Program Files\Microsoft BizTalk Server 2006\BTSXRefImport.exe" -file="C:\Development\EdenbrookTFS\Acme.BizTalk.BuildDemo\SetupFiles.xml"' />
Pattern 3: Custom Xref Data
This is a pattern I have thought about recently. I wanted to have something conceptually similar to the BizTalk pattern above but I wanted to have an alternative which I could use where performance was more important. I wanted to create a pattern using C# where I could have an interface which looked similar to the BizTalk Cross Reference functoids and which was also similar to use, but rather than hitting the database every time I wanted to hold the data in memory. I also wanted to provide a pattern which could be flexible in its implementation so a developer could use different sources for the data to come from. Some information about the pattern is:
|Reference Data: Custom Cross Reference
|This can be considered when some of the following conditions are met:
· You want to use the BizTalk Id and Value Cross Reference pattern but you are concerned about the effect on performance of all of the database hits
· You don't want to use SQL Server to store the reference data; maybe you want to read it in from the different external systems
· You have small to medium amounts of reference data
|Things to be aware of:
|· The reference data is held in memory, so you don't want to hold too much of it in there.
· As it is held in memory, different processes and machines will each have their own instance of the cache of reference data. You would need to be aware of keeping these in sync.
|The pattern is intended to allow flexibility in how the data is loaded. It could also use things like the SqlDependency object to load data from a database but still be aware of changes.
I have created a base class called ReferenceData which any reference data class should inherit from. In the example associated with this post I have created a class in the utilities project called ProductTypeId, which is one of the types of reference data. In its constructor it calls methods to load the reference data for each system which is associated with the Product Type Id.
In the example I have just loaded some hard-coded strings using the LoadReferenceData method on the base class. This load could get the data from wherever the developer chooses, as they would be coding it.
On the base class there are a couple of methods which allow you to retrieve the common or application-specific value in a similar way to the BizTalk Cross Reference pattern.
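To give a feel for the shape of this, here is a minimal sketch of such a base class and a derived ProductTypeId class. Only the names ReferenceData, ProductTypeId and LoadReferenceData come from the description above. The method signatures, the GetCommonValue and GetAppValue names, the system names and the hard-coded values are all assumptions made for illustration, so the downloadable source may well differ.

```csharp
using System;
using System.Collections.Generic;

// In-memory cross reference base class (sketch, signatures are assumed).
public abstract class ReferenceData
{
    // application name -> (application value -> common value)
    private readonly Dictionary<string, Dictionary<string, string>> _toCommon =
        new Dictionary<string, Dictionary<string, string>>();

    // application name -> (common value -> application value)
    private readonly Dictionary<string, Dictionary<string, string>> _fromCommon =
        new Dictionary<string, Dictionary<string, string>>();

    // Loads (common value, application value) pairs for one application.
    // A derived class decides where the pairs come from.
    protected void LoadReferenceData(
        string application, IEnumerable<KeyValuePair<string, string>> commonToAppValues)
    {
        var toCommon = new Dictionary<string, string>();
        var fromCommon = new Dictionary<string, string>();
        foreach (var pair in commonToAppValues)
        {
            fromCommon[pair.Key] = pair.Value;
            toCommon[pair.Value] = pair.Key;
        }
        _toCommon[application] = toCommon;
        _fromCommon[application] = fromCommon;
    }

    // Returns the common value for an application-specific value.
    public string GetCommonValue(string application, string appValue)
    {
        return Lookup(_toCommon, application, appValue);
    }

    // Returns the application-specific value for a common value.
    public string GetAppValue(string application, string commonValue)
    {
        return Lookup(_fromCommon, application, commonValue);
    }

    private static string Lookup(
        Dictionary<string, Dictionary<string, string>> maps, string application, string key)
    {
        Dictionary<string, string> map;
        string value;
        if (maps.TryGetValue(application, out map) && map.TryGetValue(key, out value))
        {
            return value;
        }
        throw new KeyNotFoundException(
            "No mapping for '" + key + "' in application '" + application + "'.");
    }
}

// One class per type of reference data; the values here are hard-coded and hypothetical.
public sealed class ProductTypeId : ReferenceData
{
    public ProductTypeId()
    {
        LoadReferenceData("SystemA", new Dictionary<string, string> { { "Book", "1" }, { "Music", "2" } });
        LoadReferenceData("SystemB", new Dictionary<string, string> { { "Book", "BK" }, { "Music", "MU" } });
    }
}
```

With this sketch, translating a System A value to System B is two in-memory lookups: GetCommonValue("SystemA", "1") gives the common value, and passing that to GetAppValue("SystemB", ...) gives the System B value. The dictionaries are only written in the constructor, so reads after construction do not change any state.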
When you come to implement this in a map it will look similar to the above pattern, except that rather than using the built-in functoids you will use 2 scripting functoids which reference your ProductTypeId reference data class. It will look like the picture below.
I hope these patterns will help you to have a consistent set of options when you are dealing with reference data mapping in BizTalk. I have found them beneficial, but it is important to remember to choose the one which best fits each situation.
The source code used for this example is available at the following location….. (TBC)
The source has a map which demonstrates the use of each technique. Please feel free to add any comments or thoughts below.