Cloud Lesson Learned: Exponential Backoff

This post is the first in a series on programming practices and lessons learned in cloud computing. Most developers will be familiar, at least conceptually, with the techniques presented, but I will provide background information and code samples to explain why they are so critical in cloud software development. Most of the examples use Windows Azure and/or SQL Azure, but the concepts apply to cloud computing in general.

Exponential Backoff

In this post I will discuss exponential backoff (EB). EB is a retry technique that treats failure as normal: it retries the operation, waiting exponentially longer before each attempt, until a maximum retry count has been reached. This technique accounts for the fact that cloud resources may be unavailable for more than a few seconds, for reasons outside your control. In the case of SQL Azure, for example, a database may be moved to another server at any time, making it unavailable for a few seconds or a few minutes, depending on the scenario.
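To make the growth of the wait time concrete, here is a minimal sketch that lists the wait before each retry. The class and method names (BackoffSchedule, GetDelays) are hypothetical and exist only for illustration; the base of 3 and the limit of five attempts match the example later in this post.

```csharp
using System;
using System.Collections.Generic;

public static class BackoffSchedule
{
    // Hypothetical helper: returns the wait before each retry.
    // The first attempt is immediate, so maxAttempts attempts
    // produce maxAttempts - 1 waits of baseSeconds^retry seconds.
    public static IList<TimeSpan> GetDelays(int maxAttempts, double baseSeconds)
    {
        var delays = new List<TimeSpan>();
        for (int retry = 1; retry < maxAttempts; retry++)
        {
            delays.Add(TimeSpan.FromSeconds(Math.Pow(baseSeconds, retry)));
        }
        return delays;
    }
}
```

Calling BackoffSchedule.GetDelays(5, 3) yields waits of 3, 9, 27, and 81 seconds, so in the worst case a caller blocks for 120 seconds plus the time spent in the failed attempts themselves. That total is worth keeping in mind when choosing the base and the retry count.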

In addition to resource availability, EB also takes into account the fact that the cloud provider may decide to throttle, or limit, the availability of resources when usage is too high. For example, issuing too many connection requests in quick succession may be viewed by the cloud provider as a Denial of Service attack. Backing off connection requests to SQL Azure exponentially therefore provides a mechanism to scale back those requests once a capacity threshold has been reached.

Example

The following C# example shows an extension method called TryOpen that applies EB when opening a connection fails, for example because of a connection timeout. In your own code you may want to add further exception management, such as a way to cancel the EB if the user presses a Cancel button. In this code snippet, the TryOpen method tries to open a database connection up to 5 times in a row, waiting exponentially longer before each retry (3 seconds, 9 seconds, 27 seconds, then 81 seconds).

[the code below was edited on 6/4/11 to fix a bug in the sleep timeout calculation]

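// Requires: using System; using System.Data.SqlClient;
static SqlConnection TryOpen(this SqlConnection connection)
{
  const int maxAttempts = 5;
  SqlException lastError = null;

  for (int attempts = 0; attempts < maxAttempts; attempts++)
  {
    try
    {
      // The first attempt is immediate; each retry waits 3^attempts seconds (3, 9, 27, 81).
      if (attempts > 0)
        System.Threading.Thread.Sleep(((int)Math.Pow(3, attempts)) * 1000);
      connection.Open();
      return connection;
    }
    catch (SqlException ex)
    {
      // Remember the failure and retry; other exception types propagate immediately.
      lastError = ex;
    }
  }
  // Every attempt failed: keep the last SqlException as the inner exception for diagnosis.
  throw new Exception("Unable to obtain a connection to SQL Server or SQL Azure.", lastError);
}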

Finally, assuming the above code was placed in a static class, your primary code could simply use the TryOpen method this way:

SqlConnection connection = new SqlConnection("your_connection_string");
connection.TryOpen();

As you can see, hiding connection retries is simple when you leverage extension methods in .NET. This lets you keep a rather complex routine in one place, which makes it easier to maintain.
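The cancellation mechanism mentioned earlier could be sketched as an overload that accepts a CancellationToken (available since .NET 4). This is only one possible approach, not a definitive implementation: the class name is hypothetical, and the retry count and base of 3 are simply carried over from the example above. The key difference is that the wait is performed on the token's wait handle rather than with Thread.Sleep, so a cancellation request interrupts the backoff immediately.

```csharp
using System;
using System.Data.SqlClient;
using System.Threading;

public static class CancellableConnectionExtensions
{
    // Hypothetical overload: same backoff as TryOpen, but the caller can cancel it.
    public static SqlConnection TryOpen(this SqlConnection connection, CancellationToken token)
    {
        SqlException lastError = null;

        for (int attempts = 0; attempts < 5; attempts++)
        {
            if (attempts > 0)
            {
                TimeSpan delay = TimeSpan.FromSeconds(Math.Pow(3, attempts));
                // WaitOne returns true if the token was signaled before the delay elapsed.
                if (token.WaitHandle.WaitOne(delay))
                    throw new OperationCanceledException(token);
            }

            // Do not start a new attempt if cancellation was already requested.
            token.ThrowIfCancellationRequested();

            try
            {
                connection.Open();
                return connection;
            }
            catch (SqlException ex)
            {
                // Remember the failure and retry after the next wait.
                lastError = ex;
            }
        }

        throw new Exception("Unable to obtain a connection to SQL Server or SQL Azure.", lastError);
    }
}
```

A Cancel button handler would then call Cancel() on the CancellationTokenSource whose Token was passed to TryOpen, and the calling code would catch OperationCanceledException. Note that this sketch only interrupts the wait between attempts; an Open call that is already in progress still runs until it succeeds or fails.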

 

Posted on Thursday, May 26, 2011 1:07 PM

Comments on this entry:

# re: Cloud Lesson Learned: Exponential Backoff
by perpetualKid at 6/3/2011 11:53 PM

How often have you tested that code? The ^ operator in C# is a bitwise exclusive OR, so the waits will be 3001 ms, 3002 ms and so on.
And if you really use the correct function (Math.Pow), the second attempt will already wait for 9000 seconds (3000^2 ms) instead of 9 seconds.

Nice thought, but wrong implementation.

# re: Cloud Lesson Learned: Exponential Backoff
by Herve Roggero at 6/4/2011 2:02 AM

Yes - thank you for pointing that out. Initially I had a simple multiplier for a linear backoff implementation, and I made an incorrect change without testing it that ended up in this post. I edited the post to include the proper timeout calculation. Thanks again!