If you develop an application for the cloud, such as Microsoft Azure, you may need to create a distributed workload. Distributing workload in the cloud is one of the fundamental aspects of scalability, because you can breakdown the work into smaller pieces and use multiple machines to process the workload. A good workload distribution technique exposes the following important characteristics:
- The more machines you use, the faster the overall workload will be processed
- An increase of machines to process the workload scales in a linear fashion
- When a machine crashes another one can pick the load from where the other one left off
Let’s first review the overall architecture of a workload distribution topology. Let’s assume we have a Distribution Service (DS), which is responsible for preparing the overall workload (for example, we need to backup 100 databases). The workload is broken down into 100 individual requests; each request is then saved into an Azure Queue (possibly as an XML). The Azure Queue is the favorite tool for workload distribution because it guarantees that only 1 service can retrieve a message at a time, and there are built-in recovery mechanisms in case of failure of the processing service (PS).
The Processing Service (PS) is a Worker Role responsible for reading items from an Azure Queue and perform the operation requested. Depending on the amount of work involved, you could code your PS to process a single request at a time, or process multiple messages at the same time. Each PS could have more than one processing thread, if needed. Finally, you could deploy multiple PSs to further increase the throughput of the workload.
The following diagram shows a simple logical deployment model that implements a single DS and multiple PSs to process the workload. From a physical deployment standpoint, the DS could live inside the same worker role as the PS, on a different thread. This works well if the DS and the PS do not compete too much for the same resources. If the DS has significant pre-processing needs, the DS should reside on its own worker role. In the following sections we will improve upon this architecture to handle DS high availability and Azure scalability targets.
To calculate the maximum throughput of your distributed workload using the architecture above, let’s assume the DS does not have any bottlenecks and does not need to scale. Let’s further assume that you can processing 10 (T) parallel threads in each PS (each thread consuming 1 message at a time), using 1 PS instance (X) and it takes 5 seconds (S) to process each message. This gives you T * X / S messages per second maximum throughput, or up to 2 messages per second. If you deploy your PS worker role a second time, you now can perform 4 messages per second. Generally speaking, this model is linear in nature as long as you don’t share any resource. However, as you will see further below, you are sharing the same Azure Storage Account and Azure Queue, so after a certain point, you reach a throughput plateau unless you improve the design.
Also note that smaller processing times makes a more scalable system. It is usually better to have many small requests than a few large ones; it is usually easier to handle failure/retry for smaller units of work, and the system is overall more responsive when it has to deal with smaller retries than large ones. So choosing what your unit of work should be is fundamental. As a general rule, I try to select workloads that are a few seconds in duration and do not depend on the outcome of other workloads, or the sequencing of work units.
Azure Scalability Target
As discussed above, the simple workload distribution model scales well up to a point; specifically, Windows Azure introduces performance limitations on various services, and scalability targets on its storage account. As a result, the scalability of the architecture discussed so far is bound by Azure performance boundaries.There are ways to design around those limitations however. Let’s take a look.
The new scalability targets for an Azure Queue is 2,000 messages per second. Note that this is not a guarantee; it is a performance target. You also have other limitations worth noting: 60MB/Sec per partition (in our case that would be the Azure Queue) and 20,000 transactions per Storage Account (and up to 15GB/sec per storage account). See this link for more information: http://blogs.msdn.com/b/windowsazure/archive/2012/11/02/windows-azure-s-flat-network-storage-and-2012-scalability-targets.aspx.
So in order to go beyond these scalability targets we need to adjust our design and allow our DS and PS to communicate through more than one storage account and, as a result, more than one Azure Queue. It may be possible to use more than one Azure Queue in each storage account, however for simplicity we are creating one queue per storage account in this design. This design implies that the DS creates messages across multiple storage accounts/queues, and that each PS reads from each available queue too, in a round robin mechanism. Over time, the system is able to write and read from multiple queues progressively and can process an even much larger number of items. This technique is called Storage Account Sharding and allows distributed systems to scale significantly.
By default, each Windows Azure account (Live ID) can have up to 20 storage accounts (you may need to call Microsoft Support to obtain 20 storage accounts). So technically, this approach still has a scalability limit. Nevertheless, it is easy to open additional Microsoft Azure Accounts and continue to create additional storage accounts.
DS Redundancy / Locking
There is one more item to consider: how to make the DS highly available. In theory, all you need to do is to deploy the DS to multiple worker roles, and you naturally achieve redundancy. In our design however, this could create a problem because each DS instance is unaware of the other; so this could lead to duplicate messages in the Azure Queues. To go around this issue, you simply need to implement a shared locking mechanism that allows each DS to determine if it needs to run the distribution, of if it should sleep and try again later. If one DS goes down, any other DS will be able to pick up the work going forward.
The easiest way to implement a shared locking mechanism that relies on a wake up call (let’s say, run every 60 seconds), is to use an Azure Table. The Azure Table is shared by each DS, and a single entity representing the workload is created with a NextTimeToRun datetime property. Each DS then attempts to read then update the entity; by default, the update will fail if any of the properties has changed since it was read. So only 1 DS will successfully update the Azure Table at a time, and thus be responsible for running the next distribution.
Ultimately, the proposed distributed workload architecture would look like this. Note that the Azure Table will not be used extensively in this architecture,
Note that this architecture can be further improved. For example this approach doesn’t prescribe the handling of poison messages, nor does it handle how to efficiently store data in SQL Database when dealing with large amounts of data. However, this should provide a good start for a distributed architecture in the cloud.
Other Use Cases
We talked about the DS being a worker role (or multiple worker roles) running on a schedule. However the DS could be a client facing application for example, not running on any specific schedule. If you deploy an application to thousands of workstations, each workstation could act as a DS and could publish data in the Azure Queues directly. This represents a specific use case of the scenario described previously. Here how this use case would look like:
About Herve Roggero
Herve Roggero, Windows Azure MVP in South Florida, is the founder of Blue Syntax Consulting, a company specialized in cloud computing products and services. Herve's experience includes software development, architecture, database administration and senior management with both global corporations and startup companies. Herve holds multiple certifications, including an MCDBA, MCSE, MCSD. He also holds a Master's degree in Business Administration from Indiana University. Herve is the co-author of "PRO SQL Azure" from Apress and runs the Azure Florida Association (on LinkedIn: http://www.linkedin.com/groups?gid=4177626). For more information on Blue Syntax Consulting, visit www.bluesyntax.net.