Mostly working...

Adventures in Coding
posts - 4 , comments - 0 , trackbacks - 0

Mysterious query timeouts


A mysterious problem plagued one of our web based applications:  Usually searches are very fast and provide results within a couple seconds at most.  Very rarely though searches in the system would time out and keep the UI hanging on a loading screen.  Once this happened, the next searches would most of the time be very fast again and the problem could not be reproduced – I shrugged it off many times as the “warm-up effect”. Surely enough, a few weeks later the same thing would happen – a search times out. Eventually we had one system where the “warm-up effect” would not “solve” the problem – the system was stuck and did not return any search results.

What happened here?



 
Our searches are executed by a back-end powered by SQL Server full text search which keeps statistics internally to make an “informed” decision on what query plan to generate for a given query. Exactly when these statistics are updated depends on when SQL Server decides the existing statistics are stale. As a rule of thumb for a decently large table whenever more than 20% of the rows in a table have been changed (additions, deletions or modifications), the corresponding statistics are probably considered stale. This happens to be around 20 days for most of our databases since they are producing new data and trimming old data on a 24/7 basis.

The default setting in a SQL Server database to deal with statistics updating is to automatically create statistics and to automatically update statistics.  This sounds like a sensible default.



The problem lies in the fact that the statistics are updated at the time the query comes in – If there is an incoming query but statistics are stale, SQL Server will update the statistics first before creating the query plan for the query and executing the query.

Our database just happens to be fairly large and content changes very often (real-time), so statistics updating takes time and is frequently needed. Unfortunately above approach of first updating the statistics will make the query time out on a large database, and in some cases even timeout on the statistics updating itself, so the statistics are never updated. Rinse and repeat for the next search, and no search will ever complete.

The solution was (of course! in hindsight) to set the statistics updating to “Auto Update Statistics Asynchronously” which allows the query to complete using the stale statistics for query plan generation but triggers a statistics update in a separate thread.  After this change all searches again are very fast, and timeouts are a thing of the past. 
 
References:
http://www.sql-server-performance.com/2007/asynchronous-statistics/
http://michaeljswart.com/2009/12/tracking-down-elusive-query-timeouts/
 
 

Print | posted on Friday, December 9, 2011 10:01 PM | Filed Under [ performance SQL Server ]

Feedback

No comments posted yet.
Post A Comment
Title:
Name:
Email:
Comment:
Verification:
 

Powered by: