A lot of the noise around cloud, and in particular Microsoft Windows Azure, is about the ability to scale up or down at will. This is a very important aspect of the whole cloud story, but it is also something that must be treated with respect.
Customers have been asking why Microsoft didn’t provide an automatic scaling feature, and the reason why not gives us an insight into the whole issue around scaling.
So when should we scale? The need to scale up and down comes in a number of forms or patterns. Simply put, they can be considered either expected or unexpected. Unexpected patterns in load can be bursts, ramps up or down, or just completely random fluctuations. Expected patterns are typically linked to events such as Christmas or tax submission day, or to times of the day such as lunch hour or straight after work. So should we scale for both?
The expected patterns are an obvious target for ‘controlled scaling’ or even some form of auto-scaling. The unexpected patterns are a different challenge altogether. To decide, we must understand the impact the extra load has on our application and why it could be a problem. Considering the typical causes helps to give us a better idea. For instance, a sudden burst in customer traffic to a web site might simply be a random event such as an unexpected mention in a blog or on Twitter! In planning for this we have to be careful not to over-anticipate. Measuring a 2 or 3 fold increase in visits to our site shouldn’t necessarily mean we spin up loads more web instances.
For unexpected ‘bursts’ or increases the most sensible approach is to assess the load over a period of time, say 2 or 3 hours. Then we can consider the best way of acting. It could be to spin up more web instances, or simply to invoke an asynchronous approach so that we process data in the down time rather than keeping the user waiting.
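As a rough illustration of assessing load over a period rather than reacting to a single reading, here is a minimal Python sketch. The class name `SustainedLoadMonitor`, its parameters and the 80% figure are all hypothetical choices for illustration, not part of any Azure API. It only reports an increase as sustained once a full window of samples has been collected and most of them sit above a multiple of the normal baseline.

```python
from collections import deque


class SustainedLoadMonitor:
    """Hypothetical helper: flags a load increase only when it persists."""

    def __init__(self, baseline, window_size, factor=2.0, required_fraction=0.8):
        # A sample counts as 'high' when it reaches baseline * factor (assumed rule).
        self.threshold = baseline * factor
        # Fraction of the window that must be high before we call it sustained.
        self.required_fraction = required_fraction
        # Only the most recent window_size samples are kept.
        self.samples = deque(maxlen=window_size)

    def record(self, requests_per_minute):
        self.samples.append(requests_per_minute)

    def is_sustained(self):
        # Never decide on a partial window: a single spike is not enough.
        if len(self.samples) < self.samples.maxlen:
            return False
        high = sum(1 for s in self.samples if s >= self.threshold)
        return high / len(self.samples) >= self.required_fraction
```

With one sample every 5 minutes, a `window_size` of 36 would cover roughly 3 hours. What to do once `is_sustained()` returns true (more web instances, or pushing work to an asynchronous queue) is still a decision for you, not for the monitor.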
Finally, having worked with Azure for a couple of years, here are some key thoughts around scaling:
- Don’t instantly react to ‘spikes’! It may be just that! Consider that it takes around 10 minutes to start a new worker role, so the spike could be gone even before the worker role has spun up! Consider other options in your code, such as threading, before scaling up that extra role.
- The process of scaling up or down takes a finite time, typically 10 to 15 minutes. Azure bills by the hour, so to maximise the use I suggest you scale up no more than 10 minutes before the start of the next hour, ensuring that you are not ‘burning’ your hour before the instance is up and running. Similarly, plan the shutdown time before the end of the hour to ensure you don’t incur the charge for another hour!
- If you are going ‘automated’ then set the sample times to be long enough to be meaningful, and set sensible upper and lower boundaries.
- Consider the events occurring that may cause increased access to your site. Prepare in advance rather than simply reacting or letting the automatic system do it.
- Lastly, treat Azure as any other resource: it must be managed, it must be measured and it must be monitored.
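The timing and boundary points in the list above can be sketched roughly as follows. This is only an illustration under stated assumptions: billing is taken to be per clock hour as described above, the 10 and 15 minute margins are taken from the figures in this post, and every function name, default threshold and instance limit here is hypothetical rather than anything Azure provides.

```python
from datetime import datetime, timedelta


def _next_hour(now):
    # Start of the next clock hour (assumed billing boundary).
    return now.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)


def scale_up_request_time(now, lead_minutes=10):
    """When to request a new instance so that at most lead_minutes
    of the current billed hour are spent while it is still starting."""
    request_at = _next_hour(now) - timedelta(minutes=lead_minutes)
    # If we are already inside the lead window, request straight away.
    return max(now, request_at)


def shutdown_start_time(now, shutdown_minutes=15):
    """When to begin shutting down so it completes before the hour ends."""
    start = _next_hour(now) - timedelta(minutes=shutdown_minutes)
    # Too late for this hour: aim for the end of the following hour instead.
    if now > start:
        start += timedelta(hours=1)
    return start


def target_instances(current, avg_cpu, lower=30.0, upper=75.0,
                     min_instances=2, max_instances=10):
    """Step one instance at a time, only when the averaged load
    crosses the upper or lower boundary (all values assumed)."""
    if avg_cpu > upper:
        return min(current + 1, max_instances)
    if avg_cpu < lower:
        return max(current - 1, min_instances)
    return current
```

For example, at 10:20 `scale_up_request_time` suggests waiting until 10:50, and `shutdown_start_time` suggests starting the shutdown at 10:45. The gap between `lower` and `upper` is deliberate: it stops an automated system from flapping up and down on small fluctuations.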