Mostly working...

Adventures in Coding

Wednesday, February 22, 2012

Query Performance Logging with MiniProfiler


I have started to experiment with MiniProfiler to log our query performance. We currently use a mix of pure SQL and Entity Framework – since we need full-text search, some pure SQL is unavoidable. Depending on the hotspots, more queries will move back to pure SQL, but for simplicity's sake trivial queries will remain in EF.

Getting started is relatively easy once you figure out which part of the documentation applies – I found my way through various Stack Overflow questions and GitHub.

Starting point is a RESTful ASP.NET Web API. To enable profiling I had to add a Profiling database connection to the web.config file and update Global.asax.
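The relevant web.config entries look something like this – server and database names are placeholders for illustration; only the "Profiling" connection string name and the "enable_profiling" flag are referenced by the code:

    <appSettings>
      <add key="enable_profiling" value="true" />
    </appSettings>
    <connectionStrings>
      <add name="Profiling"
           connectionString="Data Source=.;Initial Catalog=Profiling;Integrated Security=True"
           providerName="System.Data.SqlClient" />
    </connectionStrings>

And the changes in Global.asax: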

 
        //flag read from web.config so profiling can be switched off without a redeploy
        private static bool enableProfiling;

        void Application_Start(object sender, EventArgs e)
        {
            RegisterRoutes();

            enableProfiling = bool.Parse(ConfigurationManager.AppSettings["enable_profiling"]);

            if (enableProfiling)
            {
                //store profiling results in the Profiling database
                var storage = new SqlServerStorage(ConfigurationManager.ConnectionStrings["Profiling"].ConnectionString);
                MiniProfiler.Settings.Storage = storage;
                MiniProfiler.Settings.SqlFormatter = new SqlServerFormatter();

                //wrap the default connection factory so EF connections are profiled
                var factory = new SqlConnectionFactory(ConfigurationManager.ConnectionStrings["Profiling"].ConnectionString);
                var profiled = new MvcMiniProfiler.Data.ProfiledDbConnectionFactory(factory);
                Database.DefaultConnectionFactory = profiled;
            }
        }

        protected void Application_BeginRequest()
        {
            if (enableProfiling) { MiniProfiler.Start(); } //or any number of other checks, up to you
        }

        protected void Application_EndRequest()
        {
            if (enableProfiling)
                MiniProfiler.Stop(); //stop as early as you can, even earlier with MvcMiniProfiler.MiniProfiler.Stop(discardResults: true);
        }

 

After this the only thing left to do is to add the actual query logging. For now I am only interested in EF SQL queries, so I added a helper method which I use instead of directly instantiating the DB context:

 

    public static class FooEntitiesHelper
    {
        public static FooEntities Create()
        {
            //extract the underlying SQL connection string from the EF connection string
            var builder = new EntityConnectionStringBuilder(ConfigurationManager.ConnectionStrings["FooEntities"].ConnectionString);
            var sqlConnection = new SqlConnection(builder.ProviderConnectionString);

            //wrap the connection so MiniProfiler sees every command that goes through it
            var profiledConnection = new EFProfiledDbConnection(sqlConnection, MiniProfiler.Current);
            return profiledConnection.CreateObjectContext<FooEntities>();
        }
    }

Now every query that is using FooEntities is logged to the profiling database:

    using (var context = FooEntitiesHelper.Create())
    {
        //...
    } 
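Pure SQL queries can be hooked up the same way by wrapping the plain SqlConnection in a ProfiledDbConnection. A minimal sketch, assuming a "Foo" connection string in web.config (the name is made up for illustration):

    public static DbConnection CreateProfiledConnection()
    {
        //commands executed on the wrapped connection are logged by MiniProfiler
        var sqlConnection = new SqlConnection(ConfigurationManager.ConnectionStrings["Foo"].ConnectionString);
        return new MvcMiniProfiler.Data.ProfiledDbConnection(sqlConnection, MiniProfiler.Current);
    }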

 

This is only the starting point – but I now have the data to see where the hotspots and problems are, and can rewrite the queries or update the database accordingly.

Posted On Wednesday, February 22, 2012 9:11 PM | Comments (0) | Filed Under [ performance C# .NET SQL Server ]

Friday, December 9, 2011

Mysterious query timeouts


A mysterious problem plagued one of our web based applications: usually searches are very fast and return results within a couple of seconds at most. Very rarely, though, a search would time out and leave the UI hanging on a loading screen. Once this happened, the next searches would usually be fast again and the problem could not be reproduced – I shrugged it off many times as the "warm-up effect". Sure enough, a few weeks later the same thing would happen – a search times out. Eventually we had one system where the "warm-up effect" would not "solve" the problem – the system was stuck and did not return any search results.

What happened here?



 
Our searches are executed by a back-end powered by SQL Server full-text search, which keeps statistics internally to make an "informed" decision on what query plan to generate for a given query. Exactly when these statistics are updated depends on when SQL Server decides the existing statistics are stale. As a rule of thumb, for a decently large table the statistics are considered stale once more than 20% of the rows have changed (additions, deletions or modifications). For most of our databases that threshold is reached roughly every 20 days, since they produce new data and trim old data on a 24/7 basis.
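A quick way to check when statistics were last updated is STATS_DATE – this is just a diagnostic sketch, the table name is a placeholder:

    SELECT s.name, STATS_DATE(s.object_id, s.stats_id) AS last_updated
    FROM sys.stats AS s
    WHERE s.object_id = OBJECT_ID('dbo.SearchItems'); --placeholder table name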

The default setting in a SQL Server database is to automatically create statistics and to automatically update them when they become stale. This sounds like a sensible default.



The problem lies in the fact that the statistics are updated at the time the query comes in – if a query arrives while the statistics are stale, SQL Server will update the statistics first, before creating the query plan and executing the query.

Our database just happens to be fairly large, and its content changes very often (in real time), so updating statistics takes time and is frequently needed. Unfortunately, the above approach of updating the statistics first makes the query time out on a large database – and in some cases the statistics update itself times out, so the statistics are never updated. Rinse and repeat for the next search, and no search will ever complete.

The solution was (of course! in hindsight) to set the statistics updating to "Auto Update Statistics Asynchronously", which allows the query to complete using the stale statistics for query plan generation while triggering a statistics update in a separate thread. After this change all searches are fast again, and timeouts are a thing of the past.
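The equivalent T-SQL – the database name is a placeholder (note that AUTO_UPDATE_STATISTICS must remain ON for the asynchronous variant to take effect):

    ALTER DATABASE [Foo] SET AUTO_UPDATE_STATISTICS_ASYNC ON;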
 
References:
http://www.sql-server-performance.com/2007/asynchronous-statistics/
http://michaeljswart.com/2009/12/tracking-down-elusive-query-timeouts/
 
 

Posted On Friday, December 9, 2011 10:01 PM | Comments (0) | Filed Under [ performance SQL Server ]

Sunday, October 23, 2011

Writing to an HttpListenerResponse output stream: Small Change - Big Impact

Recently I looked at a performance problem when transferring image data from a custom HttpListener to web browser clients.

In the end it came down to one change I had made: writing the image data to a memory stream first. This was required so I could set the correct Content-Length header based on the actual image size, which some of my clients required. Before this change I had written the image directly to the output stream as it was generated; the modified code looked something like this:
 
using (var someImage = GenerateImageResource())
{
    using (MemoryStream ms = new MemoryStream())
    {
        someImage.WriteTo(ms, ImageFormat.Jpeg);
        response.ContentLength64 = ms.Length;
        ms.Position = 0; //rewind – CopyTo copies from the current position onwards
        ms.CopyTo(outputStream);
        outputStream.Flush();
        outputStream.Close();
    }
}
 
This looked like a straightforward change, and initial testing showed the same performance as the original code – but only on Windows 7. When deployed on Windows Server 2008 the performance dropped by a factor of 10 compared to writing the image directly to the output stream.
 
What happened here?
 
 
 
A closer look at what MemoryStream.CopyTo() actually does behind the scenes (using the free JustDecompile) produced the expected result – besides a bunch of error checking, it copies all the image data by reading it into a fixed-size buffer and writing it to the output stream in chunks:
 
private void InternalCopyTo(Stream destination, int bufferSize)
{
    int num;
    byte[] numArray = new byte[bufferSize];

    //read a chunk at a time and write only the bytes actually read
    while ((num = this.Read(numArray, 0, (int)numArray.Length)) != 0)
    {
        destination.Write(numArray, 0, num);
    }
}
 
The drop-off in performance can only mean that the target stream handles the chunked transfer differently from writing the full array right away (which is what Image.Save() does internally). Diving into the depths of writing to the output stream of an HttpListenerResponse revealed many special cases involving IIS and it gets quite complicated – but the point is that this could be, and apparently is, OS and platform dependent.
 
Switching to the optimized WriteTo() method of MemoryStream, which is essentially equivalent to stream.Write(buffer, 0, length), fixes the performance problem:
 
 
using (var someImage = GenerateImageResource())
{
    using (MemoryStream ms = new MemoryStream())
    {
        someImage.WriteTo(ms, ImageFormat.Jpeg);
        response.ContentLength64 = ms.Length;
        ms.WriteTo(outputStream); //writes the whole internal buffer in a single Write call
        outputStream.Flush();
        outputStream.Close();
    }
}
 
 
Small change – big impact. If you have access to the full data you intend to send, pass it in one go so the underlying stream can optimize how it sends the data. This is especially important for a network stream.
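The same principle applies whenever the payload is already in memory as a byte array – a sketch, where GetImageBytes is a made-up stand-in for however the data is produced:

    byte[] payload = GetImageBytes(); //hypothetical helper returning the complete image
    response.ContentLength64 = payload.Length;
    outputStream.Write(payload, 0, payload.Length); //one Write call for the whole payload
    outputStream.Flush();
    outputStream.Close();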

Posted On Sunday, October 23, 2011 12:20 AM | Comments (0) | Filed Under [ performance ]

Friday, September 23, 2011

Ajax request bundling

The setup

In the later stages of testing an ASP.NET web application recently, I fired up Fiddler to look at the network traffic. It was disheartening to see: there were around 30 Ajax requests per minute on one particularly crowded page, some even synchronous. Even after cutting back the most obvious offenders this is beyond "chatty", and it caused real problems in this application since the number of outstanding Ajax requests is limited and the application can block if the pool is exhausted. Makes you want to scream...

 


 

Now the root cause of this problem was rather noble – the page was made up of several user controls that are independently configured and update themselves through an update URL from which they get their new data. This approach allows placing a user control on any page, since there are no outside dependencies (besides the appropriate endpoints on the server), and enables potential reuse in other projects.

Several problems arose from this design at runtime though:

  • Duplicate data requests: Since each user control requested its own data, duplicate requests were made – the same data just requested by user control A is requested by user control B a few seconds later.
  • No caching: Since the data for each user control was only "good" until the next update anyway and not reused by anyone else, there was no point in caching anything.
  • Data transfer even when there is no new data: Most of the time the updates requested by the user controls were simply looking for changes in the data they were already displaying. 99% of the time there was no change and the processing ended up doing nothing (and spending a lot of effort doing so!)

 

The solution

I decided to attack all of these problems and ended up with an Ajax request bundler and caching object. The small downside is that I had to add a reference to the bundler to all of the user controls, but the results were well worth it:

Eliminating duplicate data requests

All user controls ask the bundler for data. Being a central object, the bundler can cache data for a period of time (the "time to live", or TTL) and return the cached data if it is requested again within that window. This eliminates all the duplicate data requests.

Bundling of Ajax requests

Even with duplicates eliminated, there were many wasteful Ajax calls made one after the other in a predictable fashion. Combining them into a single request that fetches all of the needed data in one round trip cut down the total number of requests.

Detect "no changes"

Each data item in the bundler has an associated update id. The back-end endpoint of the bundler returns a new update id with each updated data item. Since update ids are in sync between client and server for a given client session, the server knows whether the current data item is the same as the one the client already has and can simply omit the data in the response – returning the same update id is enough. This reduces the average response size dramatically since only new data is actually returned.
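For reference, the wire format the client code below assumes looks roughly like this: the request body is a comma-separated list of "requestType:updateId" pairs, and the response is an XML document with one element per request type (the element names are made-up examples):

    <Updates>
      <NewsTicker UpdateId="42">
        <Items>...</Items> <!-- new content, the client updates its cache -->
      </NewsTicker>
      <StockQuotes UpdateId="17" /> <!-- same update id as the client: no payload needed -->
    </Updates>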

To sum it up, we ended up with:

  • Fewer Ajax requests
  • Dramatically reduced average payload on the return
  • More responsive UI because of caching

So what interface / methods should the Ajax bundler have? From a user's perspective we want to be able to:

  • Get a result for a given key
  • Request an immediate update for a given key and get its result
  • Register a callback for updates if content changes for a given key

To provide this functionality the Ajax Bundler:

  • Must be able to determine that the data for a given key is expired, given its TTL
  • Must request updates from the server and pass in the current update ids for each key that needs updating

Armed with these requirements I ended up with something close to this (partial):

function AjaxBundler()
{
    var updateUrl = "someServerUpdateUrl";
    var DEFAULT_TTL = 60000; //60 seconds time to live by default
    var ALMOST_EXPIRED = 5000; //anything within the next 5 seconds is considered "almost expired"
    var that = this;
    var resultCache = {};
    var isUpdatePending = false; //only allow one concurrent update

    this.RegisterForUpdate = function (dataRequestType, callback, millisecondsToLive)
    {
        that.AddOrUpdate(dataRequestType, callback, millisecondsToLive);
        resultCache[dataRequestType].UpdateId = 0;
        resultCache[dataRequestType].Callbacks.push(callback);
        that.CheckForUpdates();
    }

    this.AddOrUpdate = function (dataRequestType, callback, millisecondsToLive)
    {
        if (!resultCache[dataRequestType])
        {
            resultCache[dataRequestType] =
                    {
                        UpdateId: 0,
                        Xml: '',
                        TTL: new Date().getTime(),
                        LifeTime: millisecondsToLive || DEFAULT_TTL,
                        Callbacks: []
                    };
        }
    }

    //Expire data request type and fetch new results from server asynchronously
    this.ForceUpdate = function (dataRequestType)
    {
        that.Expire(dataRequestType);
        that.CheckForUpdates();
    }

    //Fetch new result
    this.GetNewResult = function (dataRequestType)
    {
        that.Expire(dataRequestType);
        return that.GetResult(dataRequestType);
    }

    //Fetch cached or new result (if expired)
    this.GetResult = function (dataRequestType, millisecondsToLive)
    {
        if (that.IsExpired(dataRequestType) || !resultCache[dataRequestType].Xml)
        {
            that.AddOrUpdate(dataRequestType, null, millisecondsToLive);
            that.CheckForUpdates(true);
        }
        return resultCache[dataRequestType].Xml;
    }

    //Expire a request type
    this.Expire = function (dataRequestType)
    {
        if (resultCache[dataRequestType])
        {
            var now = new Date().getTime();
            resultCache[dataRequestType].TTL = now;
        }
    }
 
    //Check if a request type is expired
    this.IsExpired = function (dataRequestType, additionalTimeOffsetMilliseconds)
    {
        additionalTimeOffsetMilliseconds = additionalTimeOffsetMilliseconds || 0;
        var now = new Date(new Date().getTime() + additionalTimeOffsetMilliseconds).getTime();
        return (!resultCache[dataRequestType] || resultCache[dataRequestType].TTL - now <= 0);
    }

    //Fetch list of expired requests
    this.GetExpiredCacheList = function (offset)
    {
        var expiredItems = [];
        offset = offset || 0;

        for (var dataRequestType in resultCache)
        {
            if (that.IsExpired(dataRequestType, offset))
                expiredItems.push(dataRequestType + ":" + resultCache[dataRequestType].UpdateId);
        }
        return expiredItems.join(",");
    }

    //Check the server for updates for all registered data types that are expired or about to expire based on their TTL
    this.CheckForUpdates = function (forceUpdate)
    {
        if (!isUpdatePending || forceUpdate)
        {
            isUpdatePending = true;
            var requestUpdateList = that.GetExpiredCacheList(ALMOST_EXPIRED);

            if (requestUpdateList)
            {
                //BlockingHttpRequest and AsyncHttpRequest are our own XHR helpers (not shown in this partial listing)
                if (forceUpdate)
                    that.Update(BlockingHttpRequest(updateUrl, "POST", true, requestUpdateList));
                else
                    AsyncHttpRequest(updateUrl, that.Update, "POST", true, requestUpdateList);
            }
            else
                isUpdatePending = false;
        }
        else
            setTimeout(function () { that.CheckForUpdates(forceUpdate); }, 100);
    }

    //Parse response from server and notify via callbacks if any update
    this.Update = function (updateXml)
    {
        var callbacks = [];
        for (var dataRequestType in resultCache)
        {
            var dataResult = $(updateXml).find(dataRequestType);
            if (dataResult.length > 0) //xml returned from bundler for this request type
            {
                var updateId = parseInt(dataResult.attr('UpdateId'), 10);

                //update TTL
                resultCache[dataRequestType].TTL = new Date(new Date().getTime() + resultCache[dataRequestType].LifeTime).getTime();

                if (updateId != resultCache[dataRequestType].UpdateId)
                {
                    var xmlContent = dataResult.find(">:first-child");
                    if (xmlContent.length == 1)
                    {
                        //content has changed, notify all registered callbacks
                        callbacks.push(dataRequestType);
                        var xml = xmlContent[0].xml;
                        resultCache[dataRequestType].Xml = xml;
                        resultCache[dataRequestType].UpdateId = updateId;
                    }
                }
            }
        }

        //notify all callbacks with updated xml
        for (var updatedType in callbacks)
        {
            $.each(resultCache[callbacks[updatedType]].Callbacks, function (i, callback)
            {
                try
                {
                    callback(resultCache[callbacks[updatedType]].Xml);
                }
                catch (e) { }
            });
        }
        isUpdatePending = false;
    }
}
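Usage from a user control then looks something like this – a sketch, where the "NewsTicker" key and renderNewsTicker are made-up examples:

var bundler = new AjaxBundler();

//user control A registers for "NewsTicker" updates with a 30 second TTL
bundler.RegisterForUpdate("NewsTicker", function (xml)
{
    renderNewsTicker(xml); //hypothetical render function of the user control
}, 30000);

//user control B reads the same data from the cache without triggering another request
var cachedXml = bundler.GetResult("NewsTicker");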

Posted On Friday, September 23, 2011 11:16 PM | Comments (0) | Filed Under [ performance ]
