Michael Stephenson

Microsoft BPM/SOA Adventures

Saturday, November 29, 2008

What's the future for BAM?

I was at the SOA/BPM Conference in Reading this week, watching one of the presentations, which discussed the differences between BizTalk and Dublin.

It occurred to me that with BizTalk being pitched as the integration product and Dublin as the application server product, BAM is strategically misplaced inside BizTalk.

This is just my opinion but I think that in the future it would make sense to move BAM to be part of the SQL Server BI offering.  I think that if it was part of SQL Server then it would expose BAM to more products within the Microsoft space.  A summary of my thoughts is:

  • It would allow you to easily use it with Dublin as it would be likely you are using SQL Server for persistence and tracking anyway. 
  • It would still support BizTalk which is based on SQL Server anyway
  • It would allow other products which are based on SQL Server to use this without requiring BizTalk (eg: Dynamics)
  • It would allow ISVs to develop hooks into BAM.  A number of the ISV products which complement the Microsoft BPM offering often use SQL Server but not necessarily BizTalk
  • In the future it would probably be easier to hook models developed with Oslo into BAM if it was part of SQL Server

These are just my random thoughts and ideas. I have not seen anything from Microsoft which indicates this is a planned direction, but if anyone has an opinion on this, feel free to comment below.

 

posted @ Saturday, November 29, 2008 12:04 PM | Feedback (1) | Filed Under [ BizTalk ]

Thursday, October 30, 2008

Missing Tracking Data Problem

For a few weeks now we have had some problems with BizTalk tracking data not being right in some testing environments on a large project.  The link below is to a forum post which discusses our problems (thanks Thiago for your help troubleshooting this):

http://forums.microsoft.com/TechNet/ShowPost.aspx?PostID=3956734&SiteID=17

The basic problem was that for some reason tracking data seemed to be backing up in the MessageBox and not getting through to the Tracking Database.  We went through all of the usual suspects and some more advanced troubleshooting techniques, none of which had really identified the problem.  We are now in a situation where our environments are working.  For some of them I know the cause of the problem and for one other I have not confirmed the cause.  I will discuss these below.

Production like Environment

In this environment the tracking is now working correctly (although we seem to have lost the old data and only new tracking data is being processed). This environment proved a bit frustrating, as I was unable to get dedicated time to troubleshoot the issue because other parts of the environment were in use. When I did finally get the environment, I wanted to try the last thing Thiago had suggested, which came from a post he had seen about the Windows authentication mode. My plan was to do the following:

  • Check the current state
  • Restart the SQL Servers
  • Change the authentication mode
  • Restart the SQL Servers
  • Analyse any changes to our state

Before I could do this, my first check found that tracking was now kind of working. Before I'd been given the environment for my analysis, some fail over testing had been performed by one of the test teams and I know they had restarted some of the BizTalk boxes. At this point I kind of assumed that restarting the BizTalk box which was hosting the tracking host might have been the fix.

It's a bit frustrating when you get this kind of situation where it's fixed but you're not sure why. I also had to give the environment back to be rebuilt so couldn't look into this again for a while.

Small scale test environment

We got the same issue as above but in a different test environment. This 2nd environment is much smaller with just a single BizTalk box and a single SQL Server box. It was the same symptoms though.

One of the product support guys eventually came back to me on the above issue on the same day (a stroke of good luck) and mentioned the TDDS_StreamStatus table. On investigation I realised its sequence numbers were out of sync with the data in the trackingdata_n_n tables within the MessageBox. I confirmed with SQL Profiler that TDDS executes a stored procedure (I can't remember the name, but it was something like GetNextTrackingData) which uses the next expected sequence number from the stream status. Because the sequence number in the stream status table was much higher than those in the tracking data tables, no data was being returned and therefore nothing was passing along the chain to be available in HAT.

This posed the question of how this got out of sync?

Upon investigation I found that, while I believed our deployment team were only refreshing the BizTalk setup with updates, they were also running some clean up tasks based on this article. While this post does clean some of the stuff up, it isn't really an officially supported approach, and if you look at the dtasp_CleanHMData stored procedure it does not touch the TDDS_StreamStatus table. The result of this cleanup process would seem to be that the MessageBox tracking tables are truncated, meaning their next sequence numbers will be 1, but TDDS_StreamStatus will still hold whatever it held before. This means that your tracking data will not start getting into HAT until enough of it has accumulated in the MessageBox for the next sequence number to be higher than the one last processed. Your tracking data will then start going through, but only the data after this point.

In the case of the small scale environment I was able to just manually change the data in TDDS_StreamStatus to make all of the backed up tracking data get processed (although I would never recommend doing this in a production environment without speaking to Microsoft first).
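The mismatch can be illustrated with a toy model. This is only a sketch of the behaviour I observed, not how TDDS is actually implemented: the function name fetch_next_tracking_rows and the in-memory structures are hypothetical stand-ins for the stored procedure and tables described above.

```python
# Toy model of the sequence-number check described above.
# All names here are hypothetical; this only mimics the observed behaviour.


def fetch_next_tracking_rows(tracking_rows, last_processed_seq):
    """Return rows whose sequence number is higher than the last one processed."""
    return [row for row in tracking_rows if row["seq"] > last_processed_seq]


# The stream status remembers the last sequence number that was moved
# to the tracking database before the cleanup ran.
stream_status = {"last_seq": 5000}

# After the cleanup the MessageBox tracking table is truncated,
# so numbering of new rows restarts at 1.
tracking_rows = [{"seq": n, "data": f"event {n}"} for n in range(1, 101)]

# Nothing is returned: every new row is numbered below the stored value.
stuck = fetch_next_tracking_rows(tracking_rows, stream_status["last_seq"])

# Manually resetting the stored value (test environments only) releases the backlog.
stream_status["last_seq"] = 0
released = fetch_next_tracking_rows(tracking_rows, stream_status["last_seq"])

print(len(stuck), len(released))
```

In this model the backlog only drains on its own once more than 5000 new rows have accumulated, which matches the "only the data after this point" behaviour.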

I have been told the cleanup process was not run on the production like environment, but I cannot confirm this, so this problem may or may not be the cause for both environments.

Lessons Learnt

Based on my experiences in troubleshooting this I'd make the following recommendations:

  • Be wary of the clean up scripts. They are not normally expected to be used in a production system and are aimed at development and testing support. I'm not convinced that they do everything they need to.
  • Use MOM to monitor the tracking data counter. If this had been our production setup we would have detected the growth in this counter and have been alerted to the problem. In test we aren't always using MOM so only found these problems while trying to troubleshoot other problems
  • On larger projects where deployments are not done by the development team ensure you have good communications between teams and try to be aware of the processes used for deploying and managing environments

I also know a little more about what TDDS does under the hood now as there isn't too much documentation around on the subject.

posted @ Thursday, October 30, 2008 2:18 AM | Feedback (0) | Filed Under [ BizTalk ]

Saturday, October 25, 2008

BRE Survey

Apparently the CSD/BizTalk Product Group are conducting a survey on the BRE to analyse its usage.  I guess this is an opportunity to contribute to the future of this component, so it would be useful for BizTalk and probably WF people to take it.

https://live.datstat.com/MSCSD-Collector/Survey.ashx?Name=BRE_Usage_Survey_Blog

posted @ Saturday, October 25, 2008 11:32 PM | Feedback (0) | Filed Under [ BizTalk ]

Thursday, October 02, 2008

MVP

I've just been awarded a Microsoft MVP for BizTalk.

From what the other guys have said about it I'm really looking forward to the opportunity to be involved in interesting stuff over the next year.

 

posted @ Thursday, October 02, 2008 10:09 PM | Feedback (3) | Filed Under [ BizTalk ]

Sunday, September 21, 2008

NCache and BizTalk Cross Referencing Example

Article Source: http://geekswithblogs.net/michaelstephenson

Following a recent post about the different approaches to caching you might consider when implementing reference data mapping in BizTalk, one of the things that stood out most was that teams who had used a caching approach often ended up not using the BizTalk Cross Referencing features. As I've mentioned many times, I prefer to use these unless there is good cause not to (there are cases where you might not want to). I feel development teams often add custom databases to a solution without considering the extra work this requires in development, testing, deployment and management.

In most cases, why would you want to do this when you already have a data store designed for this purpose? One criticism I would make of BizTalk is that the product does not do a very good job of making the cross referencing features easy to use from a developer's point of view, but this can be worked around with few problems.

Anyway I have decided that I will produce this sample showing how I have combined the use of NCache and BizTalk Cross Referencing to get a solution which does not need custom databases yet can still have a high performance caching solution which will not increase the BizTalk hosts process memory unnecessarily. The sample can be downloaded from the bottom of the article.

Prerequisites for the sample

You can obviously review the code in this sample, but if you want to run it you will need to do the following things:

  • Install NCache Express Edition

NCache Express Edition is available free from the following link http://www.alachisoft.com/ncache/. I assume you will be installing this to the default location, but if not you might need to modify the msbuild script where I configure NCache.

  • Modify Cross Referencing Setup

The SetupFiles.xml file in the solution contains the xml used to set up the cross referencing data in BizTalk. This file requires absolute paths to work so you will need to tweak these to suit your location. The below picture shows what the xml looks like.

Setting up the sample

In the sample you will notice there is a file which is called Setup.cmd. If you run this file it will perform the appropriate actions to configure things for this sample. The actions it will take are as follows:

  • Stop the cache in NCache if it is already running
  • Clear the BizTalk cross referencing tables
  • Stop the NCache windows service
  • Copy the pre-configured NCache config files to their appropriate places to configure NCache with the cache we will use in this sample
  • Start the NCache windows service
  • Start the custom cache
  • Load the BizTalk cross referencing tables using BTSXRefImport

These actions are all done in an msbuild script (picture below) which should make it easy for you to see how this is done.

You should now be able to run the sample.

My Cross Referencing Component

To keep the sample simple I have developed a component which provides the same interface as BizTalk cross referencing. I provide a class called CrossReferencingFacade which implements the façade pattern to give you an easy way to obtain the common and application specific IDs. The below picture shows this:

There is also a test in the test project which shows how to consume this component. It is as easy to consume as the BizTalk cross referencing dll. If you look in the CrossReferencingManager class you will see there are two key methods which are discussed below:

  • LoadXRefIDData

This will use some data access code to retrieve all of the cross reference data for one specific type of cross reference (xrefId) for example all of the mappings for Product Type. It will then return them to the calling method.

  • GetCrossReferenceIDData

This method will check the cache to see if the data is already there for the requested cross reference data type. If present it will be returned from the cache, and if not the data will be loaded using the LoadXRefIDData, and then placed in the cache.

The result is that the data is cached once for both the GetAppID and GetCommonID methods.

One interesting bit on this (and there may be better ways to do this) is that to allow you to search for the appropriate mapping data from the same source by both CommonID and AppId I have held the data in a container object which houses a dictionary of the reference data with a unique key for each one, and then I also have 2 dictionaries of the app specific keys and common id keys. This just makes it possible to hold the data just once but search for it in different ways. As mentioned I'm sure if I have a think about this there are better ways to do this but it will do for this sample. (note although this last bit may have sounded overly complicated this is encapsulated so the consumer does not need to care about this)
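The two ideas above (load on a cache miss, and hold the data once with two lookup indexes) can be sketched as follows. This is an illustrative Python sketch, not the code from the sample: the class and function names mirror the description, but the sample data, the plain dictionary standing in for the out of process cache, and all signatures are assumptions.

```python
from dataclasses import dataclass, field

# Hypothetical reference data: xref type -> (app instance, app id, common id)
SAMPLE_DATA = {
    "ProductType": [("CRM", "P-1", "C100"), ("ERP", "77", "C100")],
}


@dataclass
class XRefContainer:
    """Holds each mapping once, with two indexes pointing at the same rows."""
    rows: dict = field(default_factory=dict)          # unique key -> mapping row
    by_app_id: dict = field(default_factory=dict)     # (app instance, app id) -> key
    by_common_id: dict = field(default_factory=dict)  # (app instance, common id) -> key

    def add(self, app_instance, app_id, common_id):
        key = len(self.rows)
        self.rows[key] = (app_instance, app_id, common_id)
        self.by_app_id[(app_instance, app_id)] = key
        self.by_common_id[(app_instance, common_id)] = key


_cache = {}      # stand-in for the real cache (assumption)
load_calls = []  # records loader calls so the caching effect is visible


def load_xref_id_data(xref_id):
    """Pretend data access: load every mapping for one cross reference type."""
    load_calls.append(xref_id)
    container = XRefContainer()
    for app_instance, app_id, common_id in SAMPLE_DATA.get(xref_id, []):
        container.add(app_instance, app_id, common_id)
    return container


def get_cross_reference_id_data(xref_id):
    """Return the cached container, loading and caching it on a miss."""
    container = _cache.get(xref_id)
    if container is None:
        container = load_xref_id_data(xref_id)
        _cache[xref_id] = container
    return container


def get_common_id(xref_id, app_instance, app_id):
    container = get_cross_reference_id_data(xref_id)
    key = container.by_app_id.get((app_instance, app_id))
    return None if key is None else container.rows[key][2]


def get_app_id(xref_id, app_instance, common_id):
    container = get_cross_reference_id_data(xref_id)
    key = container.by_common_id.get((app_instance, common_id))
    return None if key is None else container.rows[key][1]


print(get_common_id("ProductType", "CRM", "P-1"))
print(get_app_id("ProductType", "ERP", "C100"))
```

Both lookups are served from the same cached container, so the loader only runs once per cross reference type.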

The NCache bit

So from the above hopefully you can see I have provided a handy way to use BizTalk cross referencing within this sample. The next thing to discuss is NCache. I believe there are a number of additional features which come with the Enterprise version such as security features and tools to manage caches so for any production usage I would definitely recommend that version. For the purposes of this sample the Express edition is more than sufficient.

You can see from the below picture that the code to interact with NCache is very simple.

(Note: In the above picture you can only partly see it, but the cache allows you to insert objects with expiration parameters and also a cache dependency)

With NCache one of the things that I like is I don't need to worry about configuration within my application, so you will notice there aren't any app.config files in the sample which are used by the consumers of the cache. That said there is some configuration for the caching service. You will see in the NCache folder there is some config which controls how your caches are set up. The below picture from the config.ncconfig file shows how I have configured my cache for this sample.

You will notice here that I'm able to control whether my cache runs in process or out of process, which is how I'm able to move the cached data outside of my BizTalk host process, and there are a bunch of other possible settings. This configuration is held alongside the caching service.

Plugging it into BizTalk

This component is now very easy to add to a BizTalk implementation by using the call external assembly feature of the scripting functoid to call the component. You will now be able to use the cross referenced features of BizTalk but with out of process caching of the data.

Summary

Hopefully you will see that it is not that difficult to implement a good caching solution which addresses a combination of the considerations I discussed in my previous article. I quite like this approach and based on my limited experience of the different caching systems available I would probably at present choose NCache over Memcached because it is an established 3rd party system which comes with additional tools and features to support it. That said I will be keeping an eye on the "velocity" project as I think this will definitely be one to watch for the future.

If you have any experiences with this I would be interested to hear your thoughts. The sample is available below:

posted @ Sunday, September 21, 2008 7:48 PM | Feedback (2) | Filed Under [ BizTalk ]

Caching Strategies for Reference Data Mapping with BizTalk

I've been asked the same question a few times recently by a couple of BizTalk projects about how to map their reference data. When this question comes up we often get involved in a discussion about the pros and cons of caching the reference data and increasing memory usage versus hitting the database every time.

As a rule I tend to use the BizTalk Cross Referencing features for this data mapping unless there is a specific requirement which requires some custom approach. I've blogged about this kind of thing a few times before but I thought it's worth a post with some thoughts on the different approaches I've seen used when people have wanted to use caching.

I mentioned in a previous post that the Value cross referencing features already implement a simple caching mechanism. In my opinion though the value cross referencing is aimed more at mapping data type values between types of systems rather than business reference data which would be held in instances of systems which is what I feel the ID cross referencing is aimed more at.

Anyway when it comes to this design decision the things people are usually trying to balance are as follows:

  • Performance – If I have a lot of things to map I don't want to be hitting the database thousands of times
  • Performance – If I cache the reference data is there a risk it will consume a fair bit of memory and potentially cause throttling when the host process memory threshold is hit
  • Manageability – If I cache the data there will be an instance of the cache in each host instance that uses it. How will I ensure these stay in sync
  • Manageability – Caching will mean I have to restart all the hosts when the data changes

There are a number of possible ways to solve this problem and each have their own considerations which are discussed in the rest of this article.

Simple Singleton Approach

This is probably the most common approach I've seen. In this approach I've normally seen a custom database implemented to manage the reference data. The developer would then implement a custom data access method and a singleton which would be used to control access to the reference data. This is a pretty standard use of the singleton pattern. In this approach I think some of the considerations which need to be made are:

Pros

  • Fast access to the data
  • Easy to implement in terms of the C# coding

Cons

  • In most cases there is additional development of a database to manage the data. This then involves additional development/testing/deployment and management work
  • The data is cached in the host process so you need to watch for the impact on the process memory of the BizTalk host
  • If you access this reference data from BizTalk maps running in different hosts then you may end up with multiple instances of the cached data on each server
  • By default your cache usually will not detect changes to the underlying data, however with additional coding you can monitor the custom data and update any changes
  • In most cases the hosts need to be restarted to pick up changes
  • The cache will not be cleared when the data is no longer used
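As a rough illustration of the pattern and of the "changes are not detected" drawback, here is a minimal sketch. It is written in Python rather than C# purely for brevity, and the class name ReferenceDataCache, the loader function and the sample data are all hypothetical.

```python
import threading


class ReferenceDataCache:
    """Process-wide singleton that loads the reference data once and keeps it."""

    _instance = None
    _lock = threading.Lock()

    def __init__(self, loader):
        self._loader = loader
        self._data = None

    @classmethod
    def instance(cls, loader):
        # Create the single shared instance on first use.
        with cls._lock:
            if cls._instance is None:
                cls._instance = cls(loader)
            return cls._instance

    def lookup(self, key):
        # Load from the database on the first lookup only.
        with self._lock:
            if self._data is None:
                self._data = self._loader()
        return self._data.get(key)


# Stand-in for the custom reference data database (assumption).
database = {"GB": "United Kingdom"}
db_hits = []


def load_from_database():
    db_hits.append(1)
    return dict(database)


cache = ReferenceDataCache.instance(load_from_database)
first = cache.lookup("GB")

# The underlying data changes, but the singleton keeps serving the old copy
# until the process (the BizTalk host, in the real scenario) is restarted.
database["GB"] = "Great Britain"
second = ReferenceDataCache.instance(load_from_database).lookup("GB")

print(first, second, len(db_hits))
```

The second lookup still returns the old value and the database is only hit once, which is both the benefit and the drawback of this approach.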

Caching the Response from a Web Service

Sometimes I've seen an approach where a custom database has been implemented then a web service façade has been implemented on top of it. The web service will access the data and return it. In consuming this from BizTalk a C# assembly has been developed which uses the web service to get the reference data which is then consumed by a map.

Pros

  • The caching is outside of the BizTalk process
  • The caching can be relatively easily configured
  • If the web service is located on the BizTalk box then a local machine hop would be quicker than going remotely, and also with WCF you could optimise this further using appropriate channels

Cons

  • There is a lot of additional development in this approach, custom coding of the web service, development of the database
  • There is a lot of additional management and deployment effort for the database and virtual directories etc for the web service

     

Using the HTTPCache

In this approach I've normally seen it implemented in the same way as the singleton approach above. The key difference is that the reference data is usually held locally in a static hash table in the singleton approach, whereas in this approach the HttpCache object from the System.Web namespace is used. This gives a couple of options around sliding and absolute expiration which will remove unused data from the cache, helping to control the memory usage. You can also add one of the .net cache dependency objects, which gives you a way to detect changes and refresh the cache.
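To make the difference between the two expiration styles concrete, here is a small generic sketch. It is not the System.Web API: the ExpiringCache class, its parameters and the fake clock are all hypothetical, and Python is used only to keep the example short.

```python
import time


class ExpiringCache:
    """Tiny cache supporting absolute and sliding expiration per item."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = {}  # key -> [value, absolute deadline, sliding seconds, last access]

    def insert(self, key, value, absolute_seconds=None, sliding_seconds=None):
        now = self._clock()
        deadline = None if absolute_seconds is None else now + absolute_seconds
        self._items[key] = [value, deadline, sliding_seconds, now]

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, deadline, sliding, last_access = entry
        now = self._clock()
        # Absolute: expires at a fixed time. Sliding: expires after a period of no use.
        expired = (deadline is not None and now >= deadline) or (
            sliding is not None and now - last_access >= sliding
        )
        if expired:
            del self._items[key]
            return None
        entry[3] = now  # each successful read restarts the sliding window
        return value


# A fake clock makes the behaviour easy to follow without real waiting.
fake_now = [0.0]
cache = ExpiringCache(clock=lambda: fake_now[0])
cache.insert("rates", {"GBP": 1.0}, absolute_seconds=60)
cache.insert("countries", ["GB"], sliding_seconds=30)

fake_now[0] = 20
touched = cache.get("countries")      # used at t=20, sliding window restarts
fake_now[0] = 45
still_there = cache.get("countries")  # 25 seconds since last use, still cached
fake_now[0] = 61
rates = cache.get("rates")            # past the absolute deadline, removed
fake_now[0] = 100
countries = cache.get("countries")    # unused for 55 seconds, removed
```

Sliding expiration suits reference data that is only needed in bursts, while absolute expiration puts an upper bound on how stale an item can get.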

Pros

  • Would be fast access to data
  • Relatively simple to implement ways to detect changes
  • Ability to clear the cache for unused data

Cons

  • Again usually has a custom database for the reference data
  • This is in process caching so you need to be aware of the memory usage

     

Using Enterprise Library/Caching Application Block

Enterprise Library has a caching block which provides a number of features which could help you solve this problem. One of the key benefits of enterprise library is that it supports different types of stores for the cached data including:

  • Null – means just stored in memory
  • Database
  • Isolated Storage

If I remember right the cache supports the same features as the HTTPCache approach which allows you to have a dependency and also expirations. There is an article at the following location which discusses using Enterprise Library Caching in BizTalk http://www.malgreve.net/2007/07/using-enterprise-library-in-biztalk.html.

Enterprise Library can also integrate with external backing stores to support out of process caching.

Pros

  • Ability to abstract the caching store from the consuming code
  • Standard caching feature set

Cons

  • Again usually some requirement for custom data store for reference data
  • Enterprise Library usually requires lots of configuration to set up and manage
  • Most commonly cached in process so you need to be aware of memory usage

 

Out of Process Caching

One approach I quite like involves caching the data outside of the BizTalk process. This provides the benefit that you can cache without having to worry about the impact on the BizTalk process memory usage. There are a number of caching tools which you can use to help here such as:

Alachisoft offer an express version of their caching product which is free and a version for a relatively small cost which comes with some management tools for their distributed caching system.

Memcached is an open source distributed caching system. I know of some guys who have used this very successfully on a .net project with a major UK company.

Velocity is an initiative at Microsoft at present to create a distributed in memory caching platform. I feel that as this evolves it is important to keep an eye on this as it will in the future be likely to become the best approach to this.

 

These distributed caching systems offer the benefit of taking the memory usage out of your process, but offer fast access to the data via their API. Most of these products also offer high availability and synchronisation across a group of caches when you distribute them across your server group. I have in particular looked at NCache for this example and it is setup as a windows service which you would deploy on each BizTalk box. These services would then be configured to work as a cluster meaning they would synchronise themselves when changes were made.

Pros

  • Out of process caching still offers fast access to the cached data, but removes the likelihood that the cache might affect BizTalk performance
  • These caches are designed for high performance such as NCache which is intended for high performance customer facing ASP.net applications
  • They can be integrated with caching frameworks such as Enterprise Library (NCache comes with this out of the box)
  • NCache supports cache dependencies and expiration

Cons

  • Again requires work and management of additional components. I think NCache (the paid version) offers a better management and operations experience
  • Potentially brings up the 3rd party or open source debate around which cache system to use

Summary

Hopefully this article has highlighted the many options available when you are considering a caching solution to support your BizTalk implementation. There are many considerations to weigh up and, as with most design decisions, there isn't always a one size fits all rule. I think one of the things that stands out from this discussion is that most of the approaches above end up using a custom database to manage the reference data. I think in a future post I will look at how to combine some of the approaches discussed here with the BizTalk Cross Referencing features to produce a fairly simple yet effective combination of all of the approaches.

posted @ Sunday, September 21, 2008 7:42 PM | Feedback (0) | Filed Under [ BizTalk ]

Saturday, September 20, 2008

Basic Server Diff Tool

We had a few problems on a project recently where the server builds did not set machines up exactly the same.  This is a pretty major problem, as if the software installed differs across your BizTalk Group you cannot expect to receive consistent or expected results.  Based on this I reviewed a few servers using PsInfo recently, but being a manual thing it's a bit of a pain trying to identify any differences.

I'm sure there must be tools to do this but a quick google didn't throw up too much joy so I knocked up a quick tool to help.

The tool will let you list a bunch of servers which you expect to be the same and will then run PsInfo against them and report any differences which are identified.

There are a couple of things to note here though: it will not do things like compare registry settings etc, and I've not used the switch to get disk information from PsInfo, as if the servers have been used then their free disk space is likely to differ.
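The comparison idea itself is simple, and can be sketched like this. This is not the code of the ServerDiff tool: the function names are hypothetical, the sketch is in Python for brevity, and the sample text is invented "name: value" output rather than real PsInfo output.

```python
def parse_report(text):
    """Turn 'name: value' lines into a dictionary, ignoring other lines."""
    properties = {}
    for line in text.splitlines():
        name, separator, value = line.partition(":")
        if separator:
            properties[name.strip()] = value.strip()
    return properties


def find_differences(reports):
    """Return {property: {server: value}} for properties that are not identical."""
    all_names = set()
    for properties in reports.values():
        all_names.update(properties)

    differences = {}
    for name in sorted(all_names):
        values = {
            server: properties.get(name, "<missing>")
            for server, properties in reports.items()
        }
        if len(set(values.values())) > 1:
            differences[name] = values
    return differences


# Invented sample output for two servers that should be identical.
reports = {
    "BTS01": parse_report("Service pack: 2\nProcessors: 4\nPhysical memory: 4096 MB"),
    "BTS02": parse_report("Service pack: 1\nProcessors: 4\nPhysical memory: 4096 MB"),
}

diff = find_differences(reports)
for name, values in diff.items():
    print(name, values)
```

A property that is missing on one server is reported as a difference too, which helps catch things like a hot fix that was only applied to part of the group.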

Anyway it's a bit basic but it saved me some hassle so I've shared it on codeplex: http://www.codeplex.com/ServerDiff

 

posted @ Saturday, September 20, 2008 9:14 PM | Feedback (0) |

Friday, September 19, 2008

Problem with orchestration deployment during build process

I came across this problem ages ago when trying to deploy an assembly containing orchestrations during our automated build process. When deploying the application we would get the error shown below.

My first thought was that there are no bindings, as this is a new application being deployed: my msbuild process cleans up/deletes the old application before it deploys the new one.

At the time I think I'd fixed it without really realising what I'd done and just got on with things (I think I'd updated the version of the assembly) and the build process just continued to work. I'd also been unable to recreate the problem.

This week the problem came up again so I spent a little more time looking into it. A big thanks goes out here to James French who commented on a post by Neil Thompson about this.

To quote James the problem is

"On deployment the deployment program which is run by Visual Studio creates/updates one or more BindingInfo.xml files in the folder C:\Documents and Settings\[user name]\Application Data\Microsoft\BizTalk Server\Deployment\BindingFiles. There are two versions of this file depending on whether you are deploying to a new application or are redeploying.

If you deploy a new application the file is created based on reflection of the Assembly which contains the orchestration and is prefixed with a ~ (tilda) character.

If you re-deploy an existing application the binding file is created based on the binding info held in the BizTalkMgmt database for the assembly being published. This file does not have the ~ (tilda) prefix.

The deployment process attempts to apply these bindings after (re)deployment.

I have found that by deleting the Binding.Info.Xml files in the temp folder as well as deleting the Application via the BizTalk management console that the above mentioned error no longer occurs because the deployment program always uses a fresh binding.info.xml file based on the assembly and not what was previously in the database. (PS: Note that I am developing on one machine and deploying onto a second machine)."

 

Cheers, this saved me a lot of hassle!

posted @ Friday, September 19, 2008 1:47 PM | Feedback (0) |

Thursday, September 18, 2008

Schema problem with source control

I came across a frustrating little problem today. We have a schema for a web service we call and have done for ages. The schema is generated by the WSE adapter wizard.

We have had a couple of issues we have had to troubleshoot and it became apparent it would be useful to track a couple of promoted properties from this schema to help with the diagnosis.

So you would think it's pretty straightforward. Add a property schema and then apply the promotion to the schema. It all works great until you save the schema.

Normally a BizTalk schema is in Unicode encoding, but the WSE adapter generates one with ANSI encoding. When you try to save the schema you get a warning because Visual Studio changes the encoding during the save.
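If you want to check which schema files are affected before you hit the save warning, one rough approach is to look for a byte order mark at the start of each file. The sketch below is only an illustration: the function name is hypothetical, it ignores UTF-32, and a file without a BOM could be ANSI or plain UTF-8, so treat the result as a hint rather than proof.

```python
import codecs


def detect_bom(data: bytes) -> str:
    """Guess the encoding family of a file from its first bytes."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8 with BOM"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16 little endian"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16 big endian"
    return "no BOM (possibly ANSI or plain UTF-8)"


# Two in-memory examples standing in for schema files on disk.
unicode_schema = codecs.BOM_UTF16_LE + "<schema/>".encode("utf-16-le")
ansi_schema = "<schema/>".encode("cp1252")

print(detect_bom(unicode_schema))
print(detect_bom(ansi_schema))
```

For real files you would read the first few bytes in binary mode and pass them to the same function.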

The knock on to this is that we are unfortunately using the worst source control system in the world (not naming any names) which then won't allow you to check in your change because it cannot diff the files due to the encoding change.

Bit of a pain in the bum but you can work around this. I'd be interested to know if there is a reason the adapter uses a different format to everything else.

posted @ Thursday, September 18, 2008 8:04 PM | Feedback (0) |

Sunday, September 14, 2008

BizTalk XRef Undocumented feature

I found something out about the BizTalk Cross Referencing features the other day that I didn't know and haven't seen in the documentation.

If you use the ID cross referencing it will hit the database every time you use the functoids or access the dll directly.

If you use the VALUE cross referencing it implements a simple cache using a singleton, so if you update data in the database you need to restart each host instance that uses it before the changes will take effect.

 

posted @ Sunday, September 14, 2008 5:35 PM | Feedback (0) |

Saturday, September 13, 2008

Reviewing a BizTalk Group

I've recently been reviewing some BizTalk setups for various reasons. These include:

  • Is the setup correct
  • Performance analysis and issues
  • General troubleshooting

I thought it would be useful for me and others who might want to look at doing a review of a BizTalk setup to make some notes on some of the activities you might want to do.

 

Comparing Servers

I've come across a couple of instances previously when servers had been setup incorrectly with missing hot fixes or in one case a missing service pack for the .net framework on one server in a group. When you have a BizTalk group or a clustered SQL Server you want to ensure the servers in the group are all the same. In most cases they should have the same hardware and same software installed on them.

I had a bit of a Google on this and there doesn't seem to be that many tools which easily allow you to do this (if you know of any, or I have missed something obvious let me know), but the main way for doing this is with the PsInfo tool from Microsoft/SysInternals. The key thing with this tool is that I'm looking to validate that all servers have been setup with the correct software and hardware and they are all consistent. PsInfo is available from the following location and there are examples to show you how to do this.

http://technet.microsoft.com/en-us/sysinternals/bb897550.aspx

 Update - I've done a tool to help with this: http://www.codeplex.com/ServerDiff

 

Reviewing Security with Microsoft Base Line Security Analyser

This is a free tool from Microsoft which will allow you to compare a set of servers against most of the recommended security standards. It will identify any vulnerabilities on the servers you have analysed and highlight them and offer recommendations on how to fix or mitigate the risk associated with it.

I would use this tool on all BizTalk and SQL Servers in the setup. This tool is available from the below link:

http://technet.microsoft.com/en-us/security/cc184924.aspx

 

Reviewing SQL Server with SQL Server Best Practice Analyser

Most people are probably familiar with the SQL Server BPA, but if not this tool will collect information about your SQL Server instance and then compare this against best practice rules to highlight any issues with your setup. The tool also makes recommendations on how to resolve these issues. This is a very useful tool and is available from the following location:

http://www.microsoft.com/downloads/details.aspx?FamilyId=DA0531E4-E94C-4991-82FA-F0E3FBD05E63&displaylang=en

 

Reviewing BizTalk Server Group with BizTalk Best Practice Analyser

The BizTalk BPA is a tool which will inspect your BizTalk Group and compare it against well known best practice rules for a BizTalk setup and identify any issues you may have. It also offers recommendations on how to resolve these. The BizTalk BPA is available from the following location:

http://www.microsoft.com/downloads/details.aspx?familyid=dda047e3-408e-48ba-83f9-f397226cd6d4&displaylang=en

 

Reviewing BizTalk Group with Message Box Viewer

MsgBoxViewer is an excellent tool developed by Jean-Pierre Auconie which allows you to run a whole bunch of queries against your BizTalk Group. It will provide you lots and lots of information about the group and also perform some analysis to advise you of things you might need to review. Rather than get into a lot of detail about it, I would like to refer you to Jean-Pierre's post where he gives you all the information you could want.

This tool is available from the following link: http://blogs.technet.com/jpierauc/pages/msgboxviewer.aspx

 

Reviewing BizTalk Application setup with the BizTalk Documenter

When your application has been deployed, one common problem is incorrect configuration of one of the many settings that are available. By using the BizTalk Documenter you can quickly and easily document your BizTalk system and then review the chm file to see what settings are applied. This gives you a chance to look over the settings and potentially spot any problems.

The BizTalk Documenter is available from here: http://www.codeplex.com/BizTalkDocumenter

 

Reviewing Activity by Parsing the IIS Logs

Lots of BizTalk projects will utilise IIS to expose web services to other applications. Using Log Parser to analyse the IIS logs is a good way to get an idea of the activity you are getting in IIS and also identify any errors that are happening. Below are some of the queries I often use to analyse IIS.

  • IIS Calls by hour: This will give you a breakdown of the number of calls per hour across the chosen day
  • Calls by hour and URL: This will give you a breakdown of the calls by hour and URL across the day
  • Calls by minute: This allows you to get an analysis of the load on the server in terms of IIS calls per minute
  • Calls by minute and URL: As above but with an additional split by URL
  • IIS Errors: This will list all IIS requests which have responded with an error on that day
  • IIS Errors by URL: As above but summarised by URL
  • Request message size: Lists any large messages which have been sent to IIS (note this requires additional IIS logging parameters to be enabled)
  • Response message size: Lists any large response messages which have been returned through IIS (note this requires additional IIS logging parameters to be enabled)
  • Time taken: Lists the calls in terms of the overall duration of the request (note this requires additional IIS logging parameters to be enabled)
  • Requests per day: Breaks down the calls to count how many there have been for each URL for a given day

 

I have provided a sample of these queries from the samples folder at the bottom of the document.

The RunIISQueries.cmd file shows how to run the queries. You basically need to provide the following parameters:

  • The output path where to place the reports
  • The path to the IIS log files which can be a remote server
  • The path to LogParser
  • The date to inspect (in the format yyyy-mm-dd)

These queries tend to be useful on a project when you have limited information about expected volumes. We were able to analyse various servers and work out what the current volumes were, and we also used them to occasionally review the usage on test environments. The beauty of Log Parser is that it is very easy to do your own queries, so if you use some different queries feel free to mention them in the comments.
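
If you want to see what the "calls by hour" report boils down to without Log Parser to hand, the sketch below does the same grouping in Python. It is not one of the queries from the samples; it assumes the logs are in the W3C extended format with a `#Fields:` header that includes the `date` and `time` fields, and the sample log lines and URLs are made up.

```python
from collections import Counter

SAMPLE_LOG = """\
#Software: Microsoft Internet Information Services 6.0
#Fields: date time cs-uri-stem sc-status
2008-09-12 09:15:01 /Orders/Service.asmx 200
2008-09-12 09:47:30 /Orders/Service.asmx 500
2008-09-12 10:02:11 /Invoices/Service.asmx 200
2008-09-13 10:02:11 /Invoices/Service.asmx 200
""".splitlines()


def calls_by_hour(log_lines, day):
    """Count requests per hour (as 'HH') for one day (yyyy-mm-dd)."""
    fields = []
    counts = Counter()
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The header names the columns used by the rows that follow
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#"):
            continue
        row = dict(zip(fields, line.split()))
        if row.get("date") == day:
            counts[row["time"][:2]] += 1
    return dict(sorted(counts.items()))


print(calls_by_hour(SAMPLE_LOG, "2008-09-12"))
```

Splitting by URL as well is just a matter of counting on `(hour, row["cs-uri-stem"])` instead of the hour alone.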

Reviewing Activity by parsing the Event Log

The event log is one of the most important resources available to anyone managing or reviewing a BizTalk setup. It will contain information about any problems that have been experienced. On one of my projects we have set up some daily reports which analyse the event log and give us different views on what has been logged each day. This is in addition to any alerts which get created by monitoring the event log with Openview or MOM.

Some of the event log queries we use are as follows:

This query will review the event log and find all events for the BizTalk event source

logparser "Select TO_STRING( TimeGenerated, 'yyyy-MM-dd' ) as Date, TimeGenerated, EventId as ID, SourceName Into <OutputPath>\EventLogBizTalkEvents.txt From \\<ComputerName>\Application Where Date Like '%<Date>%%' And SourceName Like 'BizTalk%%' Order By SourceName, EventId, TimeGenerated" -i:EVT -rtp:-1

This query will review the event log and find all events logged for our custom event source

logparser "Select TO_STRING( TimeGenerated, 'yyyy-MM-dd' ) as Date, TimeGenerated, EventId as ID, SourceName Into <OutputPath>\EventLogBupaEvents.txt From \\<ComputerName>\Application Where Date Like '%<Date>%%' And SourceName Like '<CustomEventSourceName>%%' Order By SourceName, EventId, TimeGenerated" -i:EVT -rtp:-1

This will review the event log and give a count of events for each event source and number

logparser "Select TO_STRING( TimeGenerated, 'yyyy-MM-dd' ) as Date, EventId as ID, Count(*) as NoEvents, SourceName Into <OutputPath>\EventLogSummary.txt From \\<ComputerName>\Application Where Date Like '%<Date>%%' Group By Date, ID, SourceName Order By SourceName" -i:EVT -rtp:-1

 

On the back of this analysis we keep a library of known events in SharePoint so we have information about all of the events we would typically expect to see. We run these reports and build up our knowledge base through the testing phases, so that by the time we move into production we already have a good ability to support our solution.
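
The idea of checking a day's events against a library of known events can be sketched in a few lines of Python. This is only an illustration of the approach: it assumes the events have already been exported into a list of records, and the event sources, IDs and notes are made up rather than taken from our real library.

```python
from collections import Counter

# Hypothetical library of known events: (source, event id) -> support note
KNOWN_EVENTS = {
    ("BizTalk Server 2006", 9001): "Transient FTP failure, retried automatically",
    ("AcmeCustomSource", 9002): "Batch received out of sequence, see recovery guide",
}

# Hypothetical export of one day's event log entries
events = [
    {"source": "BizTalk Server 2006", "event_id": 9001},
    {"source": "BizTalk Server 2006", "event_id": 9001},
    {"source": "BizTalk Server 2006", "event_id": 9003},
    {"source": "AcmeCustomSource", "event_id": 9002},
]


def unknown_events(events, known_events):
    """Count events per (source, id) and keep only those not in the library."""
    counts = Counter((e["source"], e["event_id"]) for e in events)
    return {
        key: count
        for key, count in sorted(counts.items())
        if key not in known_events
    }


new_to_us = unknown_events(events, KNOWN_EVENTS)
print(new_to_us)
```

Anything returned here is an event nobody has documented yet, which makes it a good candidate for investigation during the testing phases.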

Quite often we have also come across intermittent problems which are difficult to track down. Using Log Parser helps us to analyse the event log for information across multiple servers, which might help us to solve them.

 

Reviewing Activity from HAT with the Orchestration Profiler

The BizTalk orchestration profiler is another useful tool to help you review activity on a BizTalk Group. One of the most common ways I use the tool is integrated into an automated testing process to ensure I have good code coverage. But it can also be used to point at a BizTalk Group and just analyse how the orchestrations are being executed. To be honest it is not really one of the first things I would do when doing a review, but it can help you look for unused orchestration paths and also be useful if you are doing a BizTalk upgrade project and want to see how often orchestrations are used. If I remember right you can also analyse the execution of orchestrations to look for performance bottlenecks within an orchestration.

There is lots of information in the community about this tool so rather than going into it in detail here I would like to refer you to the following site: http://www.codeplex.com/BiztalkOrcProfiler

 

Reviewing activity from HAT with some custom queries

All BizTalk developers will be familiar with HAT. There are a bunch of out of the box queries which can give you various information and you can create your own queries specific to your solution. This section is more for completeness of the article as there is plenty of documentation about the standard queries, and you will make your own up to suit your own needs.

 

Performance Analysis with Perfmon and PAL

I'm often surprised, when I interview BizTalk candidates and ask how they would analyse a specific performance issue, by how rarely anyone mentions perfmon. There are loads of performance counters for all of the different features of BizTalk, and you can also create your own to complement your solution. These counters can give you valuable information about the health of your BizTalk Group.

If you want to perform a very simple performance analysis of your BizTalk Group both Perfmon and PAL can help you do this. The steps would be as follows:

  1. Use Chapter 9 of Darren Jefford's excellent book to review a simple way to set up your perfmon trace for your BizTalk Group using the spreadsheet he provides.
  2. Start your Perfmon trace and run it while your Group is doing its work
  3. Stop the perfmon trace
  4. Run the PAL tool (available here) against the Perfmon output file
  5. Review the PAL report

The PAL tool will review the captured performance information against recognised benchmarks and produce a report which will highlight any problems or things to be reviewed.
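
To show the general idea of comparing captured counters against thresholds, here is a small Python sketch. It is not how PAL is implemented, and the thresholds are illustrative assumptions of mine rather than PAL's real benchmarks; the samples are made up as well.

```python
from statistics import mean

# Illustrative thresholds (assumptions, not PAL's values):
# "max" means the average should stay below the limit, "min" above it.
THRESHOLDS = {
    r"\Processor(_Total)\% Processor Time": ("max", 80),
    r"\Memory\Available MBytes": ("min", 500),
}

# Hypothetical samples taken from a perfmon trace
samples = {
    r"\Processor(_Total)\% Processor Time": [70, 90, 95],
    r"\Memory\Available MBytes": [900, 1100, 1000],
}


def review_counters(samples, thresholds):
    """Return (counter, average, kind, limit) for every breached threshold."""
    findings = []
    for counter, (kind, limit) in thresholds.items():
        values = samples.get(counter)
        if not values:
            continue  # counter was not captured in this trace
        average = mean(values)
        breached = average > limit if kind == "max" else average < limit
        if breached:
            findings.append((counter, round(average, 1), kind, limit))
    return findings


findings = review_counters(samples, THRESHOLDS)
for finding in findings:
    print(finding)
```

A real report would look at more than the average (peaks and trends matter too), which is why it is worth letting PAL do the work.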

Review Configuration Files

One of the other common deployment mistakes comes when you have configuration information held in configuration files such as BTNTSVC.exe.config or possibly held in SSO or some other configuration store. When you have this kind of configuration it often changes depending on which environment you have deployed to and it is a common issue in deployment that the wrong configuration values have been deployed.

I usually give these files a quick manual sanity check. To reduce the risk of the problem happening in the first place, I also use the approach discussed in my article about configuring binding files: I keep a template and generate the configuration and binding files for each environment by abstracting the environment-specific values out of the binding/configuration file template.
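
The template idea is simple enough to sketch in Python. This is not the tooling from my binding file article, just a minimal illustration of the principle; the setting names, URLs and server names are hypothetical.

```python
from string import Template

# Hypothetical fragment of a configuration file with placeholders
CONFIG_TEMPLATE = Template(
    "<appSettings>\n"
    '  <add key="OrderServiceUrl" value="$order_service_url" />\n'
    '  <add key="BatchDbServer" value="$batch_db_server" />\n'
    "</appSettings>\n"
)

# Hypothetical per-environment values, kept apart from the template
ENVIRONMENTS = {
    "test": {
        "order_service_url": "http://test-server/orders",
        "batch_db_server": "TESTSQL01",
    },
    "production": {
        "order_service_url": "http://prod-server/orders",
        "batch_db_server": "PRODSQL01",
    },
}


def render_config(environment):
    """Produce the configuration text for one environment."""
    # substitute() raises KeyError if a value is missing, so a gap in the
    # environment settings fails the build instead of reaching a server.
    return CONFIG_TEMPLATE.substitute(ENVIRONMENTS[environment])


print(render_config("test"))
```

Because every environment is generated from the same template, the only thing left to review is the small table of values.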

There is also the spreadsheet option available as part of the deployment framework which Scott Colestock created. Check the following link for details: http://www.traceofthought.net/PermaLink,guid,b9c45d34-85c8-449f-b1a6-deafc2d89084.aspx

 

Operational Monitoring Tool

Finally if you read my blog regularly you will have seen me going on about how important it is for organisations to monitor their systems with a tool such as MOM/SCOM or HP Openview. These tools will highlight lots of the things which you might find using the above reviewing techniques because they would have preconfigured alerts in them. My two most common questions when reviewing a BizTalk solution related to this are:

  • Is the customer monitoring their BizTalk environments with something like MOM
  • Are they staying on top of the alerts that are being raised

If the answer to both questions is yes then it's a good indicator that the overall review will go well.

 

Summary

This article was intended to discuss some of the various techniques you might use to perform different types of review of a BizTalk environment. These reviews would aim to ensure that a group has been set up and deployed correctly, and that it is then managed and operated effectively when it is used.

I'd be interested in any thoughts on this, or in other things people do in this area.

 

 

posted @ Saturday, September 13, 2008 8:59 PM | Feedback (0) |

Friday, September 05, 2008

UK BizTalk Events

Microsoft have asked us to advertise the following events to members of the UK SOA/BPM User Group

BizTalk RFID – Connecting the Edge to Enterprise – 29th October 2008

SOA & BPM Vision Briefing – 23rd Sept 2008

posted @ Friday, September 05, 2008 10:32 PM | Feedback (0) |

SOA/BPM User Group Meeting

The next session of the SOA/BPM User Group has been arranged. We are still finalising content but the event registration is now available.

If you would like to see the details or sign up refer to the following url:

http://sbug.org.uk/forums/p/68/99.aspx#99

 

posted @ Friday, September 05, 2008 10:07 PM | Feedback (2) |

Strange Service Window Behaviour

Thought I would add a quick post about some strange behaviour we have been monitoring recently with service windows.  Our situation was as follows:

We have 4 ports which monitor FTP locations; 2 of them have service windows and 2 don't.  The service windows are specifically for a 1/2 period and have a 5 minute polling period in them.  Our servers are not configured for high availability because there are issues with being able to cluster the servers, so we monitor the hosts using HP Openview and, if they go down, a manual intervention is used to start a host on the other server.  We receive 4 files per day, one for each location.

The symptoms we experience are as follows:

  • The ports which do not have a service window always seem to work fine
  • The ports with a service window were working fine for a couple of months and then seemed to stop working
  • If we remove the service window the ports work fine
  • We cannot see any record in the external FTP server's logs that the port has polled for the file on the ports with a service window
  • We have enabled the logging/diagnostics file for the FTP adapter but nothing is recorded for the service window ports
  • If you restart the host it will pick up the file if it is within the service window, but the next day the port does not pick the next file up
  • We have been able to validate from the FTP logs the file was there at the expected time
  • We have been unable to simulate this on another environment, it only happens on our production environment

Based on the above it's a bit strange. We have been able to get it working by swapping the server the host instance runs on, but this being our production environment I don't really want to be doing any more analysis here.  It will be interesting to see if this comes up again, and also whether it still breaks if we move the host instance back to the first server.

In the analysis I came across Saravana's post about some issues he had with service windows, and I fully support his suggestion.  I think that in addition it would be very useful if there was an option to log the start and finish of a service window for a given port to the event log.  This would be very useful from a monitoring perspective, as you could configure your monitoring tool to look for these events and work out if a service window had been missed.
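
BizTalk does not log these events today, so as a stand-in here is a Python sketch of the check a monitoring tool could make if it had timestamps to work with. It assumes the poll times come from the external FTP server's logs, and the window times and dates are made up.

```python
from datetime import date, datetime, time


def missed_windows(poll_times, window_start, window_end, days):
    """Return the days on which no poll was seen inside the service window."""
    polled_days = {
        stamp.date()
        for stamp in poll_times
        if window_start <= stamp.time() <= window_end
    }
    return [day for day in days if day not in polled_days]


# Hypothetical window and poll timestamps
window_start, window_end = time(6, 0), time(6, 30)
polls = [
    datetime(2008, 9, 1, 6, 5),   # inside the window
    datetime(2008, 9, 2, 9, 0),   # outside the window
]
days = [date(2008, 9, 1), date(2008, 9, 2), date(2008, 9, 3)]

missed = missed_windows(polls, window_start, window_end, days)
print(missed)
```

With start and finish events in the event log the same check could be done without relying on the other party's FTP logs.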

posted @ Friday, September 05, 2008 10:03 PM | Feedback (0) |

Tuesday, August 12, 2008

I hate days like this

Had one of those days yesterday and today. You know what it's like when you find a bug/problem and you don't really get a chance to look at it properly because you're doing a million things at once.

It is so often the case that the cause of the problem is really obvious but you just can't see the wood for the trees.

This is a reminder for myself as much as anything.

I was writing an MsBuild task to wrap CruiseControl's ICruiseManager so I can get the latest build label of a project that has been built.

The problem was that when I tested the task using MsTest from Visual Studio it worked fine, but when I ran it from an MsBuild script it kept failing, complaining about not being able to find the ThoughtWorks......Remote assembly.  I ensured that the assembly was where my task's assembly was and, frustratingly, it still wasn't working.

When my brain returned I spotted that the assembly was being searched for in the directory in which the process started (the .net framework directory where MsBuild.exe is), so by copying the assembly there it now works.

posted @ Tuesday, August 12, 2008 7:38 PM | Feedback (0) |

UK SOA/BPM User Group Slides

All of the slide decks from last month's UK SOA/BPM User Group session are now available from our site

http://sbug.org.uk (in the media section)

The decks covered were:

Oslo - Robert Hogg

Composite Applications - Andy James

SOA in 2015 - Darren Hallett

posted @ Tuesday, August 12, 2008 7:25 PM | Feedback (0) |

Frends Newsletter

Just been reading the quarterly newsletter from Frends (http://www.frends.com/), it was nice to see a little plug for my recent article about scheduling requirements for BizTalk solutions where I discuss the use of their product as one of the possible ways to implement this kind of requirement:

http://geekswithblogs.net/michaelstephenson/archive/2008/05/16/122203.aspx

posted @ Tuesday, August 12, 2008 7:22 PM | Feedback (1) |

Tuesday, August 05, 2008

BizTalk Documenter in my build

From previous posts you may have read how I integrated the Microsoft BizTalk documenter into my MsBuild process.

In general this has been working fine, but every now and again I kept getting a build failure when trying to generate the documentation as follows:

System.IO.DirectoryNotFoundException: Could not find a part of the path 'C:\Documents and Settings\<UserAccount>\Local Settings\Temp\BTS2K4Doc\Application\fb7b6ba7-016c-4a1f-9c7d-ec9037911456.html'       

If you get this it seems to be caused by an hhc.exe (Microsoft Help Compiler) process still being alive from a previous run.  Just kill it and the builds will work again.

 

 

 

 

 

 

posted @ Tuesday, August 05, 2008 12:17 AM | Feedback (0) |

Tuesday, July 29, 2008

MsBuild Task for consuming WCF Services

Article Source: http://geekswithblogs.net/michaelstephenson

Recently at the UK SOA/BPM User Group Yossi Dahan and I chatted briefly about using MsBuild and BizTalk. More specifically, Yossi mentioned that he wished it was possible to regenerate the schemas for consuming a service automatically.

Following this discussion and a few other things I had a go at seeing how difficult this would be to do. I think there is a lot of value in having this approach as it fits well with a contract based development approach and continuous integration. I want to be able to regenerate my schemas every time I build the solution, and if a breaking change has been introduced I want the solution to break so it is corrected.

This fits with what I believe is a good practice to identify these kinds of problems as early as possible in the development cycle.

The aims of the task are:

  1. Act in a similar way to the Generate Schema feature for BizTalk/Visual Studio except that it will be driven by an MsBuild task rather than a GUI
  2. Just deal with the schemas, I wasn't too bothered about regenerating the orchestration and binding file samples

(The sample is available for download at the bottom of the article)

The Sample

In the sample the quickest way to demonstrate this is as follows:

  1. Modify the BuildProcess.xml file to point to your url for the MexEndpoint
  2. Double click the RunBuild.cmd file
  3. Check the Acme.BizTalk.Schemas.WCFService folder and you should see your schemas have been updated

It is expected that you would include the schemas in your project, but exclude them from source control so they can be rebuilt every time you build the solution.

Digging Deeper

I managed to work out how to do this by inspecting the Microsoft.BizTalk.Adapter.Wcf.Consuming assembly with Reflector. It would have been nice if a few of the classes were externally available from the component; as it was, I needed to disassemble the component to progress this idea. In the disassembled component I created a new version of the Consumer class in which I removed some of the functionality and changed the Consume method's signature so it wasn't dependent on being passed a Visual Studio Project object.

With a couple of other tweaks I was able to get this working

Conclusion

Although this task seems to do the job, it has not been extensively tested, and it currently only works with the MEXEndpoint and not yet with metadata files.

Hopefully this idea will help a few people, and the idea might get back to the product team as a possible future enhancement as this kind of thing can certainly help your development process be more effective.

posted @ Tuesday, July 29, 2008 1:16 AM | Feedback (0) |

Saturday, July 26, 2008

Custom Persistence Points

Article Source: http://geekswithblogs.net/michaelstephenson

A few weeks back I got a comment to one of my blog posts by a guy who said he wished he had more control over persistence points in an orchestration. In his example he basically wanted to reduce the number of persistence points as he needed to improve performance in what sounded like a request response scenario.

Often in these kinds of scenarios, if BizTalk has been well optimised and you still do not get the latency you need, then the likelihood is you are trying to solve a problem with BizTalk when a different solution would be more appropriate.

Anyway what did make me wonder was the opposite of what this guy wanted to do. My thoughts were:

  1. Is there a situation where I would actually want to persist the state of an orchestration myself?
  2. If I wanted to do this how could I do it?
  3. If I can actually do it where might I have used it?
  4. Would we benefit from an explicit "save" shape?

Anyway when you travel around a lot like myself you tend to get plenty of boring delays in the airport so on one such occasion I had a crack to see if you could do this.

How would I do it?

I wanted to do the simplest example which could demonstrate how I could persist the orchestration state from C# and then do something to show how if an error occurs and the orchestration suspends then I can resume it and it will complete as expected.

In the example orchestration I will receive a message which I won't really do anything with, it is simply to start the process. From here I will initialize a counter and then perform a loop from 1 to 20. Normally I will just trace out the current ordinal, but on the 10th iteration I will cause an error to be thrown (I simulate this by checking if a file exists on disk and if so throwing an error). When the orchestration suspends it can be resumed manually and it will not throw an error when it replays the 10th iteration.

After 20 iterations it will exit. The picture below shows the orchestration.

 

 

Now normally in the above orchestration, because there are no persistence points within the loop you would get an output in DebugView like in the below picture.

Here you can see after the error is thrown the process will restart from the beginning again. The next picture below shows what happens when I persisted the orchestration at each iteration in the loop. You can see how the process was resumed and started by replaying iteration 10 and continuing to the end.
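
If you cannot see the pictures, the difference between the two runs can be reproduced with a small Python simulation. This is only a model of the behaviour described above, not BizTalk code: the "saved state" is a plain dictionary standing in for the persisted orchestration state, and the failure on iteration 10 is passed in as a parameter.

```python
def run_loop(state, fail_on=None, checkpoint_each_iteration=False):
    """Run iterations state['next']..20.

    Returns (trace, saved_state, completed). saved_state is what a resume
    would start from: the last checkpoint, or the initial state if none.
    """
    saved = dict(state)
    trace = []
    i = state["next"]
    while i <= 20:
        if checkpoint_each_iteration:
            saved = {"next": i}          # persist before doing the work
        if i == fail_on:
            return trace, saved, False   # simulated error: instance suspends
        trace.append(i)                  # "trace out the current ordinal"
        i += 1
    return trace, saved, True


# Without checkpoints: the resume replays everything from iteration 1
_, saved_plain, _ = run_loop({"next": 1}, fail_on=10)
resumed_plain, _, _ = run_loop(saved_plain)

# With a checkpoint per iteration: the resume replays iteration 10 onwards
first_run, saved_cp, _ = run_loop({"next": 1}, fail_on=10,
                                  checkpoint_each_iteration=True)
resumed_cp, _, completed = run_loop(saved_cp, checkpoint_each_iteration=True)

print(resumed_plain[0], resumed_cp[0])
```

The first resumed run starts again at 1 and the second at 10, which is the same pattern the DebugView output shows.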

Ok so for most people there is nothing new here, but that is just background information. The real question in this example is: since I did not use any of the normal shapes that cause the state to be persisted, how did I achieve this?

Within the Microsoft.BizTalk.XLANGs.BTXEngine namespace there is a BTXService class. With the below line of code I was able to persist the state of the current orchestration.

BTXService.RootService.Persist(false, BTXService.RootService.RootContext, false, false, false, false);

Thoughts now

Ok so I have been able to do what I was wondering about, but as you can see to do it you have to use some of the "under the hood" BizTalk components, so it isn't really a recommended way of doing this. This is another "just because you can doesn't mean you should" thing.

I was discussing this with some friends in the pub a while back and between us our thoughts were:

  1. None of us could think of a specific example of where we would have used this technique, so that probably means we don't need a "Save" shape

     

  2. If we were doing something like this we would possibly have it within an atomic transaction scope, so a persistence point would be created before our work anyway. We kind of thought that this was one of the benefits of the XLANG engine's approach to developing orchestration workflow: it protects the developer from things like this, where if you can do too much you are likely to get yourself into trouble.

I'd be interested to know if anyone has come across a situation where they felt this feature to explicitly persist the orchestration would have helped.

 If you are interested in the example it can be downloaded below:

 

posted @ Saturday, July 26, 2008 11:55 PM | Feedback (0) |

Friday, July 25, 2008

Refactoring Tales: Long running splitter pattern

Article Source: http://geekswithblogs.net/michaelstephenson

I have often come across situations where I have been asked to look at a process (usually in BizTalk) where it isn’t quite running as the customer would like. I have decided to start a series of posts which I will call refactoring tales. These posts will discuss the process implementation and the problems encountered with it. I will then discuss the approach taken to improve things and what the benefits were.

Background
This particular process had been implemented as a sort of splitter pattern. Each day a file would be received which was then to be split and loaded into two different systems. The split was made by checking each record against a third system which would indicate which system the record should be loaded into.
This example is relatively common and the implementation which was developed was as follows:
1.       The file was received as a complex XML file via the FTP adapter, which had a service window applied to it. A normal file would contain about 5000 records and was approximately 10MB. Exceptionally, there have been files as large as 40MB.
2.       A pipeline component was applied in the decode stage which would remove a pre-processing instruction from the file
3.       The file was delivered to the Messagebox
4.       An orchestration would subscribe to the message and start processing
5.       Some validation logic is applied to ensure the message is in sequence. The logic involves using the SQL adapter to check a database.
a.       If the message is not in sequence then an event is logged, a manual recovery process is initiated and the orchestration is suspended
6.       The orchestration would iterate through the records in the file, extracting each record.
7.       In each iteration of the loop the extracted record is checked against an external system, using the Oracle adapter, to see where to send it
8.       If the message is to be sent to system A then a web service call is made using the WSE 2 adapter to load the message
9.       If the message is to be sent to system B then an in memory message is built up for all of the messages for system B which will then be sent in batch
10.   When all of the records have been processed, the records for system B are sent via FTP to that system. The batch message (about half of the original file) is transformed in the send pipeline to a flat file format.
11.   The batch management database (a very simple custom database which supports the sequence number validation) is updated to mark this batch complete
12.   Finally an email is sent to the business users to confirm the results of the batch processing (how many went to each system)
The Pain
The following pain was experienced with this process:
Length of time it runs
Each record in the file would take approximately 4-5 seconds to process with a little time before and after the loop. It was common for the file to take in excess of 6 hours to fully process. The result of this was a file received in the morning would often take almost the whole business day to be loaded.
Cannot scale
One of the key limitations of this design is that because the process is sequential it cannot be scaled out. If the file were to suddenly become 10,000 records you would expect the processing time to double. Adding more BizTalk servers to the group would have no impact on this duration.
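
The arithmetic behind these durations is easy to check. The Python snippet below uses only the figures quoted above (4-5 seconds per record, processed strictly one after another) and ignores the small amount of time before and after the loop.

```python
def sequential_hours(records, seconds_per_record):
    """Elapsed hours when records are processed one at a time."""
    return records * seconds_per_record / 3600


for records in (5000, 10000):
    low = sequential_hours(records, 4)
    high = sequential_hours(records, 5)
    print(f"{records} records: {low:.1f} to {high:.1f} hours")
```

Because the elapsed time is simply records multiplied by seconds per record, doubling the file doubles the duration, and the number of BizTalk servers does not appear in the calculation at all.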
Does not recover well from errors
During the first use of this service there were some network connectivity issues, and this meant that sometimes the 3 retries on the web service port would all fail. When that happened the exception was caught and the process stopped until someone looked at the problem. Once the problem was resolved the orchestration was resumed and would continue. But as you can imagine there were two significant impacts on the amount of time the overall process would take.
The first was the response time from a manual operator. This could add anything from a few minutes to an hour that the whole process would be stopped.
The second was that each port error would initiate one of the port retries about 5 minutes later. In the example here about 10% of the calls failed (250+ because not all of the records went to system A) and you can imagine the impact of these extra minutes on the overall processing time. One file took 2.5 days to fully load.
Persistence Points & SQL Transaction Log Backups
One of the behind the scenes problems was that the orchestration was holding one large message in memory from the start and building up another throughout the execution of the process. This meant that every persistence point was expensive, with the orchestration state being between 10-20 MB for the standard message size.
There were approximately 2 persistence points per record, so for a standard file (5000 records) that means 10,000 persistence points. If you do the calculations you can see we were persisting somewhere in the region of 100 GB worth of data to the messagebox during the processing of this 10 MB file, just to save the state of the orchestration.
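For anyone who wants to check the calculation, here it is as a few lines of Python. It uses the figures quoted above and treats 1 GB as 1000 MB for simplicity.

```python
def persisted_gb(records, points_per_record, state_mb):
    """Total orchestration state written to the messagebox, in GB."""
    return records * points_per_record * state_mb / 1000


low = persisted_gb(5000, 2, 10)    # state at the small end of the range
high = persisted_gb(5000, 2, 20)   # state at the large end of the range
print(f"{low:.0f} to {high:.0f} GB")
```

So 100 GB is the conservative end of the estimate; with the state nearer 20 MB it is closer to 200 GB.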
The SQL Transaction log backups for the messagebox were eating drive space like crazy.
 
There were a couple of other smaller things too, but as you can see there is a clear business and operational need to refactor this.
There were a couple of lessons to learn from this situation, particularly around performance testing and the difference in how a process will be perceived to run on a developer machine versus how it will run in a live environment, but I will not go into these within this post.
 
The Refactoring Challenge
Refactoring is different to re-writing. If we were doing this from scratch there would be a number of different things you could do, but one of my favourite things about these kinds of situations is that there are things you can change and things you can’t. The challenge is to make it work much more effectively while minimizing change/cost. Some of the factors which affected how I could refactor this are as follows:
  • No change to external systems: In this case I could not change any of the external message formats or communication mechanisms. There was no opportunity to change the external systems functionality.
  • Short timescale: I only had a short amount of time to POC a new idea for this and then it would be implemented by other developers.
  • The business users still want their email: The email feature for the users screams out as a BAM opportunity. However in this refactoring there is no opportunity to change this or retrain users on how to use the BAM portal etc.
  • Limitations on technologies to use: This particular solution could potentially use SSIS to help with the implementation, but in this case it was preferred to keep as much as possible in BizTalk so that it stayed within what the support team had already been trained on.
 
The New Implementation
After weighing up all of the factors and components I had to work with I decided that the best way to refactor this process would be as follows:
1.       The file was received as a complex XML file via the FTP adapter which had a service window applied to it 
2.       A pipeline component was applied in the decode stage which would remove a pre-processing instruction from the file
3.       An XmlDisassembler was added next in the pipeline to promote some metadata about the batch from the message header to the message context
4.       A new custom pipeline component was added to the pipeline to rip apart the batch.
This new pipeline component would use the XPathReader to read the message and extract each record from it. The database which this solution already used to manage batches was extended with a table to provide temporary storage for the batch records. The pipeline component would save all of the records to the database and then replace the message coming out of the pipeline component with a very small trigger message, which would be sent to the messagebox and contain the ID of the batch and the number of records it contains.
5.       A new orchestration would subscribe to this trigger message and start a controller orchestration which would manage the overall process
6.       The controller would iterate through a loop from 1 to the number of records in the batch and create a small simple command message which would be sent to the messagebox with an instruction to process a specific record from this batch
7.       The controller orchestration would then sit polling the database until all of the records have been processed
8.       For each of the command messages which had been sent to the messagebox an instance of an orchestration would be initiated which would process that particular record. If there were 5000 records you would get 5000 instances of the Record Processor orchestration
9.       In the Record Processor orchestration the message it receives will tell the orchestration which batch and record number to process so it can get its data from the temporary database. This orchestration will then call the system which helps decide if the record is sent to system A or system B
10.   If the record is to be sent to system A then the web service is called. The orchestration will then update the temporary database to indicate where the record is intended to be sent.
11.   When all records have been processed the Controller orchestration will detect this and stop polling the database.
12.   The controller orchestration will extract all of the records to be sent to system B and then pass this to the messagebox so it can be mapped and delivered to system B via a port subscription.
13.   The controller orchestration will query the temporary database to get the values to create the email which advises the business users how the batch was split and send it via the SMTP adapter
14.   Finally the controller orchestration will clean out the temporary storage of the batch and update the sequencing ready for the next batch
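The controller and Record Processor steps above can be sketched roughly as follows. This is a minimal Python illustration, not BizTalk code: an in-memory SQLite table stands in for the temporary batch storage, a thread pool stands in for the orchestration instances started by the command messages, and the names create_store, choose_destination, process_record and run_controller are all made up for the example. The routing rule in choose_destination is purely hypothetical.

```python
import sqlite3
import threading
import time
from concurrent.futures import ThreadPoolExecutor

DB_LOCK = threading.Lock()  # one shared connection, so serialise access to it


def create_store(batch_id, payloads):
    """Stand-in for the temporary batch table filled by the pipeline component."""
    db = sqlite3.connect(":memory:", check_same_thread=False)
    db.execute(
        "CREATE TABLE batch_record ("
        "batch_id TEXT, record_no INTEGER, payload TEXT, destination TEXT)"
    )
    db.executemany(
        "INSERT INTO batch_record VALUES (?, ?, ?, NULL)",
        [(batch_id, n, p) for n, p in enumerate(payloads, start=1)],
    )
    db.commit()
    return db


def choose_destination(payload):
    """Hypothetical routing rule standing in for the external decision system."""
    return "A" if payload.startswith("a") else "B"


def process_record(db, command, sent_to_a):
    """Record Processor: load one record, route it, record the destination."""
    batch_id, record_no = command
    with DB_LOCK:
        (payload,) = db.execute(
            "SELECT payload FROM batch_record WHERE batch_id = ? AND record_no = ?",
            (batch_id, record_no),
        ).fetchone()
    destination = choose_destination(payload)
    if destination == "A":
        sent_to_a.append(payload)  # stand-in for the system A web service call
    with DB_LOCK:
        db.execute(
            "UPDATE batch_record SET destination = ? "
            "WHERE batch_id = ? AND record_no = ?",
            (destination, batch_id, record_no),
        )
        db.commit()


def run_controller(db, trigger, workers=8):
    """Controller: fan out commands, poll until done, then finish the batch."""
    batch_id, count = trigger["batch_id"], trigger["record_count"]
    sent_to_a = []
    pool = ThreadPoolExecutor(max_workers=workers)
    for record_no in range(1, count + 1):  # one command message per record
        pool.submit(process_record, db, (batch_id, record_no), sent_to_a)
    while True:  # poll the temporary store until no record is left unrouted
        with DB_LOCK:
            (remaining,) = db.execute(
                "SELECT COUNT(*) FROM batch_record "
                "WHERE batch_id = ? AND destination IS NULL",
                (batch_id,),
            ).fetchone()
        if remaining == 0:
            break
        time.sleep(0.01)
    pool.shutdown()
    with DB_LOCK:
        batch_for_b = [row[0] for row in db.execute(
            "SELECT payload FROM batch_record "
            "WHERE batch_id = ? AND destination = 'B' ORDER BY record_no",
            (batch_id,),
        )]
        summary = dict(db.execute(
            "SELECT destination, COUNT(*) FROM batch_record "
            "WHERE batch_id = ? GROUP BY destination",
            (batch_id,),
        ).fetchall())
        # Clean out the temporary storage ready for the next batch.
        db.execute("DELETE FROM batch_record WHERE batch_id = ?", (batch_id,))
        db.commit()
    return {"sent_to_a": sent_to_a, "batch_for_b": batch_for_b, "summary": summary}
```

Calling run_controller(create_store("batch-1", ["a1", "b1", "a2"]), {"batch_id": "batch-1", "record_count": 3}) returns the records routed to each system plus the counts that would feed the summary email. The sketch has no timeout or failure handling, so a record that throws would leave the controller polling forever; a real version would need to deal with that.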
Some additional points to note here are:
·         If the receive pipeline gets an error then the batch would stop saving to the temporary storage. If the instance was resumed then the import would begin again from scratch
·         I considered using a normal BizTalk debatching approach but encountered problems once the batch size got above 1000 records (can’t remember the error message now). Based on this I chose to implement a custom debatching solution
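As a rough illustration of the custom debatching idea, here is a small Python sketch (not the actual pipeline component, which used the XPathReader in .NET). It streams through the batch, stores each record in a temporary table and hands back a tiny trigger message. The element name Record, the table layout and the function names are assumptions made for the example, and clearing any half-imported rows before a restart is my guess at one way to make "begin again from scratch" safe.

```python
import io
import sqlite3
import xml.etree.ElementTree as ET


def create_batch_table(db):
    """Create the (hypothetical) temporary storage table for batch records."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS batch_record ("
        "batch_id TEXT, record_no INTEGER, payload TEXT)"
    )


def debatch_to_store(db, batch_id, xml_stream, record_tag="Record"):
    """Stream the batch, store each record, and return a small trigger message."""
    # Assumption: a resumed import clears any rows left by a failed attempt.
    db.execute("DELETE FROM batch_record WHERE batch_id = ?", (batch_id,))
    count = 0
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        if elem.tag != record_tag:
            continue
        count += 1
        db.execute(
            "INSERT INTO batch_record VALUES (?, ?, ?)",
            (batch_id, count, ET.tostring(elem, encoding="unicode").strip()),
        )
        elem.clear()  # release the record's content once it is stored
    db.commit()
    # Only the batch ID and record count travel on to the messagebox.
    return {"batch_id": batch_id, "record_count": count}


sample = b"<Batch><Header id='42'/><Record>one</Record><Record>two</Record></Batch>"
db = sqlite3.connect(":memory:")
create_batch_table(db)
trigger = debatch_to_store(db, "42", io.BytesIO(sample))
print(trigger)
```

The point of the design is that the large message never reaches the messagebox; only the small trigger does, and every later step works from the temporary table.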
 
The Benefits
As a result of the above refactoring the benefits were:
·         Because the records were being processed in parallel I have been able to process a normal-sized 10MB file in around 15 minutes, compared to the previous 6 hours
·         Any errors with one record do not cause a backlog of all of the other records
·         The improved performance means all of the records are loaded before the business users start their working day, whereas with the previous solution they had almost finished their day before all of the records were processed
·         The messages being processed in the orchestrations are much smaller than before (approximately 2kb), so even though there are now more persistence points overall (about 4 per record), the total data persisted to the messagebox to maintain the state of the orchestrations has reduced from somewhere in the region of 100GB to < 40MB
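The persistence figure can be sanity checked with some quick arithmetic. The Python snippet below is only a back-of-envelope sketch: it assumes a batch of 5000 records (the illustrative number used in step 8, and roughly what a 10MB file of 2kb records would hold) and treats 1MB as 1000kb. The function name persisted_mb is made up for the example.

```python
def persisted_mb(records, persistence_points, message_kb):
    """Total state written to the messagebox, in MB (1 MB = 1000 kB here)."""
    return records * persistence_points * message_kb / 1000


records = 5000  # assumed batch size, taken from the illustrative figure in step 8
new_mb = persisted_mb(records, persistence_points=4, message_kb=2)
old_mb = 100 * 1000  # "somewhere in the region of 100GB", expressed in MB

print(new_mb)           # total for the refactored process, in MB
print(old_mb / new_mb)  # how many times less data is persisted
```

With those assumptions the refactored process lands at about 40MB, which is in line with the figure quoted above and is roughly 2500 times less than before.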
Conclusion
Hopefully this post gives a real world example of how being pragmatic in a refactoring exercise can help you make significant improvements to a process with minimal cost to the project. This article may be a little unclear about how this has been implemented. The way I implemented it is fairly generic, so if there is some interest I could potentially do a future post with an example.
 
 
 
 
 
 

posted @ Friday, July 25, 2008 10:36 PM | Feedback (0) |

Wednesday, July 23, 2008

Detecting BizTalk Event with BizUnit

I came across an annoying one the other day. I haven't had time to look into it in more detail, but here are some notes about it.

In some of the tests we do with BizUnit we check the event log to see that certain messages have occurred.  The other day I came across a case I hadn't noticed before.

In my test I do a bunch of stuff then I wait until a custom event with a specific Event Id is logged to the event log.  I use the BizUnit event log check step and it finds my event fine.

Later in the test I will do the following:

  • Stop the application pool for a web service I will call
  • Allow BizTalk to call the service and then suspend the service instance because the web service is unavailable
  • My test will detect the event logged by BizTalk (Event ID 5754) to indicate a call has failed.  I will also use the ValidationRegEx node to confirm the event message relates to my port
  • I will then start the application pool back up
  • I will then use a custom BizUnit step to resume the suspended instance
  • Finally I will check the process worked correctly

This was working as far as the point of detecting the BizTalk event.  I could see in the event log that the event I was expecting was there, but for some reason my step didn't spot it.

Because I was using the regex feature this was my first suspect, but the expression was simple and even when I took it away the event still wasn't spotted.

When I wrote a little C# to check what comes back from the event log, it turned out that although the event displays in Event Viewer with the event number 5754, the event log entry from System.Diagnostics comes back with the following property values:

- Event Id: 12588666

- Instance Id: 3233814138
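
The three numbers look unrelated but they line up once you treat them as bit fields. The Python sketch below (Python just for the arithmetic, the function name is made up) assumes the usual Windows event identifier layout, where the top two bits carry the severity and the low 16 bits carry the code that Event Viewer displays. The .NET documentation describes EventID as the InstanceId with the top two bits masked off, which matches the values above.

```python
def decompose_instance_id(instance_id):
    """Split a 32-bit event instance id into the parts discussed above."""
    return {
        # InstanceId with the top two (severity) bits masked off
        "event_id": instance_id & 0x3FFFFFFF,
        # low 16 bits: the event number shown in Event Viewer
        "displayed_id": instance_id & 0xFFFF,
        # top two bits: severity (3 corresponds to an error)
        "severity": instance_id >> 30,
    }


parts = decompose_instance_id(3233814138)
print(parts)
```

For 3233814138 this gives an event_id of 12588666 and a displayed_id of 5754, so a check that needs to match what Event Viewer shows would have to compare against the low 16 bits rather than the raw property values.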

When I