It has been quite a while since I signed up for this blog site and high time that something was posted. I have a list of topics that I will be working through and posting. Some I am sure will have been posted by others, but I will be sticking to the technical problems and challenges that I’ve recently faced, and the solutions that worked for me. My motto when learning something new has always been “My kingdom for an example!”, and I plan on delivering useful examples here so others can learn from my efforts, failures and successes.
A bit of background about me… My name is Clayton Groom. I am a founding Partner of the consulting firm, Covenant Technology Partners, LLC, located in St. Louis, Missouri. I focus on SQL Server Data Warehouse design, Analysis Services and Enterprise Reporting solutions. I have been working with SQL Server since the early nineties, when it still only ran on OS/2. I love solving puzzles and technical challenges.
Enough about me… On to a real problem…
SSIS Connection Time outs versus Command Time outs
Last week, I was working on automating the processing for a large Analysis Services cube. I had reworked an SSIS package and script task originally posted by Vidas Matelis that automates the process of adding new and dropping old partitions to/from an Analysis Services cube. I had the package working great, tested, and ready for deployment. It basically performs a query against the source system to determine if there is new data in the warehouse that will require a new partition to be added to the cube, and it checks the cube to see if there are any partitions that are present that are no longer needed in a rolling 60 month window.
My client uses Tivoli for running all their production jobs, and not SQL Agent, so I had to build a command line file for Tivoli to use to run the package.
Everything was going great. I had tested the command file from my development workstation using an XML configuration file to pass in server-specific parameters into the package when executed using the DTExec utility. With all the pieces ready, I updated the dtsconfig file to point to the UAT environment and started working with the Tivoli developer to test the job. On the first run, the job failed, and from what I could see in the SSIS log, it had failed because of a timeout. Other errors in the log made me think that perhaps the connection string had not been passed into the package correctly.
We bumped the Connection Manager timeout values from 20 seconds to 120 seconds and tried again. The job still failed.
After changing the command line to use the /SET option instead of the /CONFIGFILE option, we tested again, and again failure.
After a number more failed attempts, and getting the Teradata DBA involved to monitor and see if we were connecting and failing or just failing to connect, we determined that the job was indeed connecting to the server and then disconnecting itself after 30 seconds. This seemed odd, as we had the timeout values for the connection manager set to 180 seconds by then. At this point one of the DBA’s found a post on the Teradata forum that had the clues to the puzzle:
There is a separate “CommandTimeout” custom property on the Data source object that may needed to be adjusted for longer running queries. I opened up the SSIS package, opened the data flow task that generated the partition list table and right-clicked on the data source.
from the context menu, I selected “Show Advanced Editor” and found the property. Sure enough, it was set to 30 seconds.
The CommandTimeout property can also be edited in the SSIS Properties sheet.
In order to determine how long the timeout needed to be, I ran the query from the task in the development environment and received a response in a matter of seconds. I then tried the same query against the production database and waited several minutes for a response. This did not seem to be a reasonable response time for the query involved, and indeed it wasn’t. The Teradata DBA’s adjusted the query governor settings for the service account I was testing with, and we were able to get the response back down under a minute. Still, I set the CommandTimeout property to a much higher value in case the job was ever started during a time of high-demand on the production server.
With this change in place, the job finally completed successfully. The lesson learned for me was two-fold:
- Always compare query execution times between development and production environments, and don’t assume that production will always be faster. With higher user demands, query governors, and a whole lot more data, the execution time of even what might seem to be simple queries can vary greatly.
- SSIS Connection time out settings do not affect command time outs. Connection timeouts control how long the package will wait for a response from the server before assuming the server is not available or is not responding. Command time outs control how long a task will wait for results to start being returned before deciding that the server is not responding.
Both lessons seem pretty straight forward, and I felt pretty sheepish once I finally figured out what the issue was. To be fair though, In the 5+ years that I have been working with SSIS, I could only recall one other time where I had to set the CommandTimeout property, and that memory only resurfaced while I was penning this post.