Geeks With Blogs

Michael Stephenson keeping your feet on premise while your heads in the cloud

For a few weeks now we have had problems with BizTalk tracking data not showing up correctly in some test environments on a large project. The forum post linked below discusses our problems (thanks Thiago for your help troubleshooting this):

http://forums.microsoft.com/TechNet/ShowPost.aspx?PostID=3956734&SiteID=17

The basic problem was that, for some reason, tracking data seemed to be backing up in the MessageBox and not getting through to the Tracking Database. We went through all of the usual suspects and some more advanced troubleshooting techniques, none of which identified the problem. We are now in a situation where our environments are working. For one of them I know the cause of the problem, and for the other I have not confirmed the cause. I will discuss both below.

Production-like Environment

In this environment the tracking is now working correctly (although we seem to have lost the old data and only new tracking data is being processed). This environment proved to be a bit frustrating, as I was unable to get dedicated time to troubleshoot the issue because other parts of the environment were in use. When I did finally get the environment, I wanted to try the last thing Thiago had suggested, a change to the Windows authentication mode that he had seen mentioned in a post. My plan was to do the following:

  • Check the current state
  • Restart the SQL Servers
  • Change the authentication mode
  • Restart the SQL Servers
  • Analyse any changes to our state

Before I could do this, my first check found that tracking was now more or less working. Before I'd been given the environment for my analysis, one of the test teams had performed some failover testing, and I know they had restarted some of the BizTalk boxes. At this point I assumed that restarting the BizTalk box hosting the tracking host might have been the fix.

It's a bit frustrating when you get this kind of situation where it's fixed but you're not sure why. I also had to give the environment back to be rebuilt, so I couldn't look into this again for a while.

Small-scale test environment

We got the same issue as above in a different test environment. This second environment is much smaller, with just a single BizTalk box and a single SQL Server box. The symptoms were the same, though.

One of the product support guys eventually came back to me on the above issue that same day (a stroke of good luck) and mentioned the TDDS_StreamStatus table. On investigation I realised that the sequence numbers in it were out of sync with the data in the trackingdata_n_n tables within the MessageBox. I confirmed with SQL Profiler that TDDS was executing a stored procedure (I can't remember the name, but it was something like GetNextTrackingData) which used the next expected sequence number from the stream status. Because the sequence numbers in the stream status table were much higher than those in the tracking data tables, no data was being returned, and therefore nothing was passing along the chain to become available in HAT.

This raised the question of how they got out of sync.

Upon investigation I found that, while I believed our deployment team were just refreshing the BizTalk setup with updates, they were also running some clean up tasks based on this article. While that article does clean some of the stuff up, it isn't really an officially supported approach, and if you look at the dtasp_CleanHMData stored procedure it does not touch the TDDS_StreamStatus table. The result of this cleanup process would seem to be that the MessageBox tracking tables are truncated, meaning their next sequence numbers will be 1, but TDDS_StreamStatus will still hold whatever it held before. This means that your tracking data will not start getting into HAT until enough of it has accumulated in the MessageBox for the next sequence number to be higher than the last one processed. Your tracking data will then start going through, but only the data written after this point.
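To make the mismatch easier to picture, here is a minimal Python sketch of the behaviour as I understand it. It is a toy model only: the class, method and field names are hypothetical, and it is not the real BizTalk schema or the actual TDDS logic. It assumes the reader only asks for rows whose sequence number is higher than the last one it processed.

```python
from dataclasses import dataclass, field


@dataclass
class TrackingStream:
    """Toy model of one tracking stream. All names here are hypothetical."""
    rows: list = field(default_factory=list)  # sequence numbers waiting in the tracking table
    next_table_seq: int = 1                   # next sequence number the table will hand out
    last_processed: int = 0                   # what the stream status row remembers

    def write(self, count):
        """Simulate `count` new tracking rows being written to the table."""
        for _ in range(count):
            self.rows.append(self.next_table_seq)
            self.next_table_seq += 1

    def poll(self):
        """Simulate the reader asking for rows after the last processed number."""
        batch = [seq for seq in self.rows if seq > self.last_processed]
        if batch:
            self.last_processed = max(batch)
            self.rows = [seq for seq in self.rows if seq not in batch]
        return batch

    def cleanup_without_status_reset(self):
        """Simulate a cleanup that truncates the table but leaves the status alone."""
        self.rows.clear()
        self.next_table_seq = 1


if __name__ == "__main__":
    stream = TrackingStream()
    stream.write(100)
    print(len(stream.poll()), "rows moved before the cleanup")

    stream.cleanup_without_status_reset()
    stream.write(60)
    print(len(stream.poll()), "rows moved after the cleanup")
    print(len(stream.rows), "rows stuck in the table")
```

In this model, rows written after the cleanup only start moving again once the table's sequence numbers climb past the stale value held in the status record, and everything written before that point stays behind.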

In the case of the small scale environment I was able to just manually change the data in TDDS_StreamStatus to make all of the backed up tracking data get processed (although I would never recommend doing this in a production environment without speaking to Microsoft first).

I have been told the cleanup process was not run on the production-like environment, but I cannot confirm this, so this problem may or may not be the cause for both environments.

Lessons Learnt

Based on my experiences in troubleshooting this, I'd make the following recommendations:

  • Be wary of the clean up scripts. They are not normally expected to be used in a production system and are aimed at development and testing support. I'm not convinced that they do everything they need to.
  • Use MOM to monitor the tracking data counter. If this had been our production setup, we would have detected the growth in this counter and been alerted to the problem. In test we aren't always using MOM, so we only found these problems while trying to troubleshoot other problems.
  • On larger projects where deployments are not done by the development team, ensure you have good communications between teams and try to be aware of the processes used for deploying and managing environments.
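As a rough illustration of the monitoring point, here is a small Python sketch of the kind of check an alert could be based on. It is not a MOM rule or any MOM API; the function name and the threshold are hypothetical, and it assumes you already have periodic readings of the tracking data counter from somewhere.

```python
def is_backing_up(samples, min_samples=4):
    """Return True when the last `min_samples` readings rise every single time.

    `samples` is a list of counter readings, oldest first. Both the name and
    the default threshold are illustrative assumptions, not MOM settings.
    """
    if len(samples) < min_samples:
        return False
    recent = samples[-min_samples:]
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))


if __name__ == "__main__":
    print(is_backing_up([10, 20, 35, 60]))   # steadily growing backlog
    print(is_backing_up([10, 20, 15, 30]))   # counter drained at one point
```

A counter that drains between readings is normal. It is the sustained growth that would have pointed us at the problem much earlier.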

I also know a little more about what TDDS does under the hood now as there isn't too much documentation around on the subject.

Posted on Thursday, October 30, 2008 2:18 AM BizTalk


Comments on this post: Missing Tracking Data Problem

# re: Missing Tracking Data Problem
Hi,

We had the same issue. Thanks for sharing the solution.

Greetings,
Birgen
Left by Birgen on May 23, 2013 2:25 AM



Copyright © Michael Stephenson | Powered by: GeeksWithBlogs.net