SQL Server IaaS and Retry Logic

Recently I had an interesting discussion with a customer and a question came up: should we still worry about Retry Handling in our application code when our SQL Server runs in virtual machines in an Infrastructure as a Service (IaaS) implementation?

More about the question

First, let's review this question in more detail. Let's assume you currently run an application on virtual machines (or even physical machines) hosted at your favorite hosting provider, and you are interested in moving this application to the Microsoft cloud (Microsoft Azure). Your decision is to keep as much of your current architecture in place as possible and move your application "as-is" to virtual machines in Azure; depending on who you talked to, you probably heard that moving to IaaS (in Azure or elsewhere) is a simple forklift operation. Except that your application depends on SQL Server, and now you have a choice to make: will your database server run on a virtual machine (IaaS), or will you use the Platform as a Service offering, SQL Database (PaaS)? For the remainder of this discussion I will assume that you "can" go either way; that's not always true, because some applications use features that are only available in SQL Server (IaaS), although the gap between SQL Server and SQL Database is now relatively small.

What could go wrong

SQL Database (PaaS) is an environment that is highly load balanced and can fail over to other server nodes automatically and frequently (more frequently than with your current hosting provider). As a result, your application could experience more frequent disconnections. More often than not, applications are not designed to automatically retry their database requests when such disconnections occur. That's because, most of the time, a disconnection is a bad thing and there are bigger problems to solve (such as a hard drive failure). However, in the cloud (and specifically with PaaS databases), disconnections happen for a variety of reasons and are not necessarily an issue; they come and go quickly. As a result, implementing retry logic in your application code makes your application significantly more robust in the cloud, and more resilient to transient connection issues (for more information, look up the guidance on handling transient faults in Azure).

However, applications that use SQL Server in VMs (IaaS) in Microsoft Azure may also experience random disconnections. Although there are no published metrics comparing the availability of VMs to that of PaaS implementations, VMs are bound to restart at some point (due to host O/S upgrades, for example) or to suffer rack failures, causing downtime of your SQL Server instance (or a failover event if you run in a cluster). While VMs in Microsoft Azure that run in a load-balanced configuration can have a service uptime that exceeds 99.95%, VMs running SQL Server are never load balanced; they can be clustered at best (and even in clustered configurations there are no uptime guarantees, since the VMs are not load balanced). VMs also depend on underlying storage that is prone to "throttling" (read this blog post about Azure Storage Throttling for more information), which can also cause temporary slowdowns or timeouts. So, for a variety of reasons, an application that uses SQL Server in VMs can experience sporadic, temporary disconnections that warrant a retry at the application layer.

Retry Handling

As a result, regardless of your implementation decision (SQL Server IaaS or SQL Database PaaS), it is prudent (if not highly recommended) to modify your application code to include some form of retry logic; retry logic hides the actual connection failure, so that at worst your application appears to slow down briefly instead of failing outright. There are a few implementation models, but the most popular on the Microsoft Azure platform is the Transient Fault Handling Application Block (mostly used with ADO.NET code). This application block helps you implement two kinds of retries: connection retries and transaction retries. Connection retries are performed when your code is unable to connect to the database for a short period of time, and transaction retries attempt to resubmit a database request when the previous request failed for transient reasons. The framework is extensible and gives you the flexibility to decide whether you want to retry in a linear manner or through a form of exponential backoff.
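As an illustration, here is a minimal sketch of what such a retry policy can look like, assuming the Enterprise Library Transient Fault Handling Application Block (the EnterpriseLibrary.TransientFaultHandling and EnterpriseLibrary.TransientFaultHandling.Data NuGet packages); the connection string and the dbo.Orders table are placeholders, not part of any sample from this post.

using System;
using System.Data.SqlClient;
using Microsoft.Practices.EnterpriseLibrary.TransientFaultHandling;

class RetryExample
{
    static void Main()
    {
        // Exponential backoff: up to 5 attempts, waiting between 1 and 30 seconds
        var backoff = new ExponentialBackoff(5,
            TimeSpan.FromSeconds(1), TimeSpan.FromSeconds(30), TimeSpan.FromSeconds(2));

        // The detection strategy decides which SQL Server errors are transient
        var policy = new RetryPolicy<SqlDatabaseTransientErrorDetectionStrategy>(backoff);

        // Both the connection attempt and the command execution are wrapped in the policy
        var orderCount = policy.ExecuteAction(() =>
        {
            using (var connection = new SqlConnection("YOUR_CONNECTION_STRING"))
            using (var command = new SqlCommand("SELECT COUNT(*) FROM dbo.Orders", connection))
            {
                connection.Open();
                return (int)command.ExecuteScalar();
            }
        });

        Console.WriteLine("Orders: " + orderCount);
    }
}

If an error is classified as transient, the policy waits and retries; any other exception is rethrown immediately, so permanent failures still surface to your code.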

Note that Entity Framework version 6 and higher includes automatic retry policies; see this article for more information.
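For example, EF6 can enable its built-in SQL Azure retry strategy through a code-based configuration class; the sketch below is an illustration only, and the retry count and maximum delay shown are arbitrary values.

using System;
using System.Data.Entity;
using System.Data.Entity.SqlServer;

// EF6 discovers a DbConfiguration class automatically when it lives in the
// same assembly as your DbContext.
public class AppDbConfiguration : DbConfiguration
{
    public AppDbConfiguration()
    {
        // Retry transient SQL Server failures up to 4 times, waiting at most 10 seconds
        SetExecutionStrategy("System.Data.SqlClient",
            () => new SqlAzureExecutionStrategy(4, TimeSpan.FromSeconds(10)));
    }
}

With this in place, EF retries failed commands automatically, without any changes to the rest of your data access code.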

About Herve Roggero

Herve Roggero, Microsoft Azure MVP, @hroggero, is the founder of Blue Syntax Consulting (http://www.bluesyntaxconsulting.com). Herve's experience includes software development, architecture, database administration and senior management with both global corporations and startup companies. Herve holds multiple certifications, including an MCDBA, MCSE, MCSD. He also holds a Master's degree in Business Administration from Indiana University. Herve is the co-author of "PRO SQL Azure" and “PRO SQL Server 2012 Practices” from Apress, a PluralSight author, and runs the Azure Florida Association.

Monitoring Flights and Sending SMS with Taskmatics Scheduler and Enzo Unified

Software developers need to build solutions quickly so that businesses remain competitive and agile. This blog post shows you how Taskmatics Scheduler (http://www.taskmatics.com) and Enzo Unified (http://www.enzounified.com) can help developers build and deploy solutions very quickly by removing two significant pain points: the learning curve of new APIs, and orchestrating Internet services.

Sample Solution

Let's build a solution that checks incoming flights in Miami, Florida, and sends a text message (SMS) to one or more phone numbers when new flights arrive. To track flight arrivals, we will use FlightAware's (http://www.flightaware.com) service, which provides a REST API to retrieve flight information. To send SMS messages, we will use Twilio's (http://www.twilio.com) service, which also provides an API for sending messages.

To remove the learning curve of these APIs, we used Enzo Unified, a Backend as a Service (BaaS) platform that enables the consumption of services through native SQL statements. Enzo Unified abstracts communication with, and simplifies development against, a large number of internal systems and Internet services. In this example, Enzo Unified is hosted on the Microsoft Azure platform for scalability and operational efficiency.

To orchestrate and schedule the solution, we used the Taskmatics Scheduler platform. Taskmatics calls into your custom .NET code on a schedule that you specify; that code is configured to connect to Enzo Unified in the cloud. The call to Enzo Unified is made using ADO.NET, sending native SQL statements to pull information from FlightAware and to send an SMS message through Twilio. At a high level, the solution looks like this:

clip_image002

Figure 1 – High Level call sequence between Taskmatics Scheduler and Enzo Unified

How To Call FlightAware and Twilio with Enzo Unified

Developers can call Enzo Unified using a REST interface, or a native SQL interface. In this example, the developer uses the SQL interface, leveraging ADO.NET. The following code connects to Enzo Unified as a database endpoint using the SqlConnection class, and sends a command to fetch flights from a specific airport code using an SqlCommand object. Fetching FlightAware data is as simple as calling the “Arrived” stored procedure against the “flightaware” database schema.

var results = new List<ArrivedFlightInfo>();

// Connect to Enzo Unified using SqlConnection
using (var connection = new SqlConnection(parameters.EnzoConnectionString))
// Prepare the call to FlightAware's Arrived procedure
using (var command = new SqlCommand("flightaware.arrived", connection))
{
    connection.Open();
    command.CommandType = System.Data.CommandType.StoredProcedure;
    command.Parameters.Add(new SqlParameter("airport", airportCode));
    command.Parameters.Add(new SqlParameter("count", 10));
    command.Parameters.Add(new SqlParameter("type", "airline"));

    // Call FlightAware's Arrived procedure
    using (var reader = command.ExecuteReader())
        while (reader.Read())
            results.Add(new ArrivedFlightInfo
            {
                Ident = (String)reader["ident"],
                AircraftType = (String)reader["aircrafttype"],
                OriginICAO = (String)reader["origin"],
                OriginName = (String)reader["originName"],
                DestinationName = (String)reader["destinationName"],
                DestinationCity = (String)reader["destinationCity"]
                // ... additional code removed for clarity ...
            });
}

Calling Twilio is just as easy. A simple ADO.NET call to the SendSMS stored procedure in the “Twilio” schema is all that’s needed (the code is simplified to show the relevant part of the call).

// Establish a connection to Enzo Unified
using (var connection = new SqlConnection(parameters.EnzoConnectionString))
using (var command = new SqlCommand("twilio.sendsms", connection))
{
    connection.Open();
    command.CommandType = System.Data.CommandType.StoredProcedure;
    command.Parameters.Add(new SqlParameter("phones", phoneNumbers));
    command.Parameters.Add(new SqlParameter("message", smsMessage));

    // Call Twilio's SendSMS method
    command.ExecuteReader();
}

If you inspect the above code carefully, you will notice that it does not reference the APIs of FlightAware or Twilio. Indeed, both FlightAware and Twilio were called using ADO.NET against Enzo Unified; because Enzo Unified behaves like a native database server (without the need to install special ODBC drivers), authenticating, making the actual API calls, and interpreting the REST results were entirely abstracted away from the developer and replaced by a SQL interface, which dramatically increases developer productivity. Database developers can call Enzo Unified directly to test FlightAware and Twilio using SQL Server Management Studio (SSMS). The following picture shows the results of calling Enzo Unified from SSMS to retrieve arrived flights from FlightAware.

image

Figure 2 – Calling the FlightAware service using simple SQL syntax in SQL Server Management Studio

Sending an SMS text message using Twilio is just as simple from SSMS:

image

Figure 3 – Calling the Twilio service using simple SQL syntax in SQL Server Management Studio

How To Schedule The Call With Taskmatics Scheduler

In order to run and schedule this code, we are using Taskmatics Scheduler, which provides an enterprise grade scheduling and monitoring platform. When a class written in .NET inherits from the Taskmatics.Scheduler.Core.TaskBase class, it becomes automatically available as a custom task inside the Taskmatics Scheduler user interface. This means that a .NET library can easily be scheduled without writing additional code. Furthermore, marking the custom class with the InputParameters attribute provides a simple way to specify input parameters (such as the airport code to monitor, and the phone numbers to call) for your task through the Taskmatics user interface.

The following simplified code shows how a custom task class is created so that it can be hosted inside the Taskmatics Scheduler platform. Calling Context.Logger.Log gives developers the ability to log information directly to Taskmatics Scheduler for troubleshooting purposes.

namespace Taskmatics.EnzoUnified.FlightTracker
{
    // Mark this class so it is visible in the Taskmatics interface
    [InputParameters(typeof(FlightNotificationParameters))]
    public class FlightNotificationTask : TaskBase
    {
        // Override the Execute method called by Taskmatics on a schedule
        protected override void Execute()
        {
            // Retrieve parameters as specified inside Taskmatics
            var parameters = (FlightNotificationParameters)Context.Parameters;

            // Invoke the method that calls FlightAware through Enzo Unified
            var arrivedFlights = GetArrivedFlights(parameters);

            // Do more work here… such as identifying new arrivals
            var newFlights = FlightCache.FilterNewArrivals(arrivedFlights);

            // Do we have new arrivals since the last call?
            if (newFlights.Count > 0)
            {
                // Invoke the method that calls Twilio through Enzo Unified
                var results = SendArrivedFlightsViaSMS(newFlights, parameters);

                // Update the cache so these flights won't be sent through SMS again
                FlightCache.SaveFlightsToCache(newFlights);
            }
            else
            {
                Context.Logger.Log("SMS phase skipped due to no new arrivals.");
            }

            Context.Logger.Log("Job execution complete.");
        }
    }
}
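For reference, here is a minimal sketch of what the FlightNotificationParameters class referenced by the [InputParameters] attribute could look like; the property names are assumptions based on the snippets shown in this post, not the actual class from the sample project.

// Hypothetical parameter class surfaced in the Taskmatics user interface.
public class FlightNotificationParameters
{
    // ADO.NET connection string pointing to the Enzo Unified endpoint
    public string EnzoConnectionString { get; set; }

    // ICAO code of the airport to monitor, e.g. "KMIA"
    public string AirportCode { get; set; }

    // Comma-separated list of phone numbers to notify through Twilio
    public string PhoneNumbers { get; set; }
}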

Installing the task into the Taskmatics Scheduler platform is very straightforward. Log into the user interface and create a definition for the flight tracker task. This step allows you to import your library into the system to serve as a template for the new scheduled task that we will create next.

clip_image006

Figure 4 - Import your custom task as a definition

Once you have created your definition, go to the “Scheduled Tasks” section of the user interface, and create the task by selecting the definition that you just created from the Task dropdown. This is also where you will schedule the time and frequency that the task will run as well as configure the input parameters for the task.

clip_image008

Figure 5 - Schedule your custom task to run on the days and times you specify.

clip_image010

Figure 6 - Configure the parameters for the scheduled task.

Finally, from the Dashboard screen, you can run your task manually and watch the output live, or look at a past execution of the task to see the outcome and logs from that run. In the image below, you can see the execution of the Flight Tracking task where we monitored recent arrivals into the Miami International Airport (KMIA).

clip_image012

Figure 7 - Review and analyze previous task executions or watch your tasks live as they run.

Conclusion

This blog post shows how developers can easily build integrated solutions using simple SQL statements, without having to learn complex APIs, thanks to Enzo Unified's BaaS platform. In addition, developers can easily orchestrate and schedule their libraries using the Taskmatics Scheduler platform. By combining the strengths of Enzo Unified and Taskmatics, organizations can reap the following benefits:

  • Rapid application development by removing the learning curve associated with APIs
  • Reduced testing and simple deployment by leveraging already tested services
  • Service orchestration spanning Internet services and on-premises systems
  • Enterprise grade scheduling and monitoring

You can download the full sample project on GitHub here: https://github.com/taskmatics-45/EnzoUnified-FlightTracking

About Blue Syntax Consulting

Our mission is to make your business successful through the technologies we build, to create innovative solutions that are relevant to the technical community, and to help your company adopt cloud computing where it makes sense. We are now making APIs irrelevant with Enzo® Unified. For more information about Enzo Unified and how developers can access services easily using SQL statements or a simple REST interface, visit http://www.enzounified.com or contact Blue Syntax Consulting at info@bluesyntaxconsulting.com.

About Taskmatics

Taskmatics was founded by a group of developers looking to improve the productivity of their peers. Their flagship application, Taskmatics Scheduler, aims to boost developer productivity and reduce the effort involved in creating consistent and scalable tasks while providing a centralized user interface to manage all aspects of your task automation. For more information and a free 90-day trial, visit http://taskmatics.com or email us at info@taskmatics.com.

Copy Files From Your Local Computer To An Azure VM and Back

Do you need to copy files from your local workstation to Azure Virtual Machines? There is no need to use FTP or to push files through cloud drives or blobs. All you need to do is access your local drives from your cloud VM and pull the files over. You can use the same approach to pull files from your cloud VM to your local machine. Here is how it's done.

First, log on to Microsoft Azure and browse to your Virtual Machine in the portal. From there, select your Virtual Machine and click the Connect button at the bottom (make sure you select the VM first, by clicking on its status field for example, so that the row clearly shows with a darker blue background).

AzureMgmt

After clicking on Connect, you will be prompted to Open or Save the RDP file for the remote session to your VM. Click on Save, and click on Open folder once the save operation is complete.  This will open the directory where the RDP file was saved and automatically select the file for you.

Save

SaveComplete

Right-click on your RDP file, and choose Edit from the dropdown menu. You will see the Remote Desktop Connection configuration window. Click on the Local Resources tab, and in the Local devices and resources section, click on the More… button.

RemoteLocalResources

Expand the Drives node, check the local drives you want to share with your Virtual Machine (I selected my C drive), then click OK.

LocalDrives

Finally, click the Connect button to log on to your Virtual Machine. Once logged on, you will see your local drives available from within Explorer as mapped drives, as shown below. You can now easily copy and move files between your Virtual Machine and your workstation. Because you can change any RDP file in the same way, you can do the same with virtual machines hosted on other cloud platforms or on your own network.

Explorer

 

About Herve Roggero

Herve Roggero, Microsoft Azure MVP, @hroggero, is the founder of Blue Syntax Consulting (http://www.bluesyntaxconsulting.com). Herve's experience includes software development, architecture, database administration and senior management with both global corporations and startup companies. Herve holds multiple certifications, including an MCDBA, MCSE, MCSD. He also holds a Master's degree in Business Administration from Indiana University. Herve is the co-author of "PRO SQL Azure" and “PRO SQL Server 2012 Practices” from Apress, a PluralSight author, and runs the Azure Florida Association.

How To Create A PowerShell Script To Backup Your Azure Blobs

In this post I will show you how easy it is to create a PowerShell script using Visual Studio that can back up your Azure Blobs. The same concept can be applied to backing up Azure Tables and SQL Database. To successfully follow this example, you will need a few things. First, we are coding in C# with Visual Studio 2012. Second, the actual backup operation is rather complex, so we will be using the free API provided by Blue Syntax (http://www.bluesyntaxconsulting.com) to perform it; the PowerShell script will simply call the API.

Install And Configure Enzo Cloud Backup

As a prerequisite, you will need to install the Enzo Cloud Backup tool and the Enzo Cloud Backup API version 3. Both can be found on the Blue Syntax website (http://www.bluesyntaxconsulting.com/backup30.aspx). Click the Download button found on the product description page, and install both Enzo Cloud Backup and the Enzo Cloud Backup API.

Once installed, start Enzo Cloud Backup. You will see a Login page. You will need to specify an Azure Storage Account where Enzo will store its configuration information. It is recommended to use a Storage Account that is used by Enzo only. Once you have created the Storage Account in your Azure subscription, enter the Account Name and an Account Key on this screen to proceed.  For detailed information on how to create an Azure Storage Account and access the Account Key, read this blog post: http://azure.microsoft.com/en-us/documentation/articles/storage-create-storage-account/.

EnzoLogin

Once logged in, you will need to register your product in order to obtain license keys; this is a very simple step and ensures that you obtain the license keys needed to run this example (to register, start Enzo Cloud Backup and go to Help –> Request Permanent License Keys). There is no charge for the Community Edition of Enzo Cloud Backup, and the API comes at no charge as well. Once registered, you will receive an email with your license keys, so make sure you enter a valid email address in the registration form.

Register

Create a Class Library Project

Now that the Enzo Cloud Backup tool is installed along with the API, let’s create a new project in Visual Studio 2012. The project type is a class library. I named this project PSEnzoBackup. Make sure you select a Visual C# project; the code provided further down is written in C#.

NewProject

Configure Your Project

Once the project has been created, rename the Class1.cs file to BackupBlob.cs and make sure the class name is also renamed to BackupBlob. The end result should look like this in your Solution Explorer.

 RenameClass

Once the class has been renamed, add the following references to your project:

  • System.Management.Automation
  • CloudBackupAPI
  • EnzoBackupAPI

You can find the Automation DLL on your development machine (usually under C:\Program Files (x86)\Reference Assemblies\Microsoft\WindowsPowerShell). The other two libraries can be found in the directory where you installed the Enzo Cloud Backup API (they are usually found in a subdirectory of C:\Program Files (x86)\BlueSyntax\).

Add A Backup Method

At this point, the project is ready for development. Paste the following code into your BackupBlob.cs file as-is.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Management.Automation;

namespace PSEnzoBackup
{
    [Cmdlet(VerbsCommon.New, "BlobBackup")]
    public class BackupBlob : PSCmdlet
    {
        private string[] _blobNames;

        [Parameter(
            Mandatory=true,
            ValueFromPipelineByPropertyName = true,
            ValueFromPipeline = true,
            Position = 0,
            HelpMessage = "The list of blobs to backup (container/blob) separated by a comma. Can use * as a wildcard."
            )
        ]
        [Alias("ListBlob")]
        public string[] BlobNames
        {
            get { return _blobNames; }
            set { _blobNames = value; }
        }

        protected override void ProcessRecord()
        {
            WriteObject("Starting the backup operation...");

            BSC.Backup.Helper.AzureBlobBackupHelper backup = new BSC.Backup.Helper.AzureBlobBackupHelper(
                "ENZO_STORAGE_ACCOUNT",
                "ENZO_STORAGE_KEY",
                false,
                "YOUR_API_LICENSE_KEY");
            backup.Location = BSC.Backup.Helper.DeviceLocation.LocalFile;
            backup.DeviceURI = @"c:\tmp\backupfile.bkp";
            backup.Override = true;
            backup.SpecificBlobs = _blobNames.ToList();
            backup.UseCloudAgent = false;

            string operationId = backup.Backup();

        }

    }
}

The ProcessRecord() method is a protected override of the method that will be called by the PowerShell command. In this method, we simply create a reference to the AzureBlobBackupHelper class; the constructor requires the account name and keys of the Azure Storage Account used by Enzo Cloud Backup, and the API license key. Additional properties are available on the backup object; we are specifying that a local file backup device will be created, and will override any existing file. The SpecificBlobs property is set to the list of blob names provided as parameters to the PowerShell command. The Backup() method returns an OperationId which can be used by other Helper classes to check on the progress of the backup operation. We are hard-coding the account credentials and the name of the backup file, but it would be easy to provide parameters instead (as we did with the BlobNames parameter).
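For example, here is a hypothetical sketch of how the backup file path could be exposed as an optional parameter inside the BackupBlob class shown above, mirroring the BlobNames pattern; the parameter name and default value are assumptions, not part of the original sample.

// Inside the BackupBlob class (sketch):
private string _deviceUri = @"c:\tmp\backupfile.bkp";

[Parameter(Mandatory = false, HelpMessage = "Full path of the local backup file to create.")]
public string DeviceURI
{
    get { return _deviceUri; }
    set { _deviceUri = value; }
}

// ...and in ProcessRecord(), replace the hard-coded path with:
// backup.DeviceURI = this.DeviceURI;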

Before compiling this code, you will need to replace a few things: the ENZO_STORAGE_ACCOUNT and ENZO_STORAGE_KEY values are those used by the backup application when you log in (see the prerequisites above). As indicated earlier, you must run the Enzo Backup application at least once to create the necessary configuration tables that the API uses; the AccountName and AccountKey you use to log in to Enzo Backup are the ones you need to specify here. YOUR_API_LICENSE_KEY is the API license key you received by email when registering your product with Blue Syntax. Once you have specified those values, you can compile this code. Note the path where the DLL is created when you compile, as you will need it soon.

DllPath 

Test the PowerShell Script

Open a PowerShell command prompt and type the following command to load the PowerShell library we just compiled.

Import-Module "C:\YOUR_FULL_PATH_TO_PROJECT_DIRECTORY\PSEnzoBackup\bin\Debug\PSEnzoBackup.dll"

And finally, run the following command to back up all your blobs. The BlobNames parameter is a list of blob names (specified as ContainerName/BlobName) separated by commas. The list of blobs supports wildcards (*/* means all containers and all blobs). For more information about this parameter, and other properties available through this API, visit the online help here: http://www.bluesyntaxconsulting.com/EnzoCloudBackup30/APIBackupAB.aspx.

New-BlobBackup -BlobNames */*

You can easily wrap the commands provided by the Enzo Cloud Backup API in a PowerShell module, including backing up Azure Tables and SQL Databases, and even restoring them. This example shows how to back up to disk, but you can also back up to blobs. To learn more about the API, visit http://www.bluesyntaxconsulting.com/Backup30API.aspx.

About Herve Roggero

Herve Roggero, Microsoft Azure MVP, @hroggero, is the founder of Blue Syntax Consulting (http://www.bluesyntaxconsulting.com). Herve's experience includes software development, architecture, database administration and senior management with both global corporations and startup companies. Herve holds multiple certifications, including an MCDBA, MCSE, MCSD. He also holds a Master's degree in Business Administration from Indiana University. Herve is the co-author of "PRO SQL Azure" and “PRO SQL Server 2012 Practices” from Apress, a PluralSight author, and runs the Azure Florida Association.

Backing up Azure Blobs

As customers adopt the Microsoft Azure platform, the need to backup Azure Blobs is becoming increasingly important. That’s because Azure Blobs are used by both the Platform as a Service (PaaS) and Infrastructure as a Service (IaaS) components of Microsoft Azure. In fact, Azure Blobs are also becoming more popular with local software applications that need a central location to store configuration information and application logs. Since Blobs are a form of data, there is a need to back them up.

Until now, the way to back up Azure Blobs was to copy the files to local hard drives, which many third-party tools currently support. However, Azure Blobs contain additional information that files can't carry, such as custom metadata, HTTP properties, and Access Control List (ACL) details providing security information about those blobs. And Azure Blobs are not the only components that need to be backed up: Azure Blob Containers are also important. They are the equivalent of directory structures, and also carry metadata and ACL information. Existing third-party tools do not provide a mechanism to save this additional information.

I created a free utility called Enzo Cloud Backup, version 3, now available for download (http://www.bluesyntaxconsulting.com/). This tool allows you to perform backup and restore operations on Azure Blobs (in addition to Azure Tables and Azure SQL Database). It is designed to handle a very large number of blobs and containers. In addition, an API is also available at no charge so that you can programmatically initiate backup and restore operations.

The free edition provides all the features of the advanced edition, but is limited in the number of backups that can be performed monthly. Give it a try; you will be surprised by how easy it is to use. And if you have any feedback, please send it to me!

About Herve Roggero

Herve Roggero, Microsoft Azure MVP, @hroggero, is the founder of Blue Syntax Consulting (http://www.bluesyntaxconsulting.com). Herve's experience includes software development, architecture, database administration and senior management with both global corporations and startup companies. Herve holds multiple certifications, including an MCDBA, MCSE, MCSD. He also holds a Master's degree in Business Administration from Indiana University. Herve is the co-author of "PRO SQL Azure" and “PRO SQL Server 2012 Practices” from Apress, a PluralSight author, and runs the Azure Florida Association.

The Business Case for a Data Server

As a developer you are familiar with Web Servers and Database Servers. Both service data, in different ways, and this creates an interesting challenge. Let's face it: accessing data is hard. Data lives in databases, in XML documents, in PDF documents, in flat files, on FTP servers in proprietary formats, in cubes, in no-SQL stores… you name it. In addition to those storage formats, data is made available in a large variety of file formats, through a multitude of protocols, and serviced by an ever-increasing set of providers, each with its own authentication and authorization implementation. But the storage of data and the way to access it isn't the only challenge. Who wants to access all this data? Developers, DBAs, report writers, systems integrators, consultants, managers, business users, office workers, applications…

What’s the real problem?

Accessing data is hard because of those three forces: storage formats, protocols, and consumers. Different consumers need the same data in different ways, because they consume data in different ways. When the data is needed by applications (such as a mobile phone), then it is best accessed using REST requests returning JSON documents; this is usually best accomplished by servicing data through a web server. But if the data is needed by a report (such as Excel, or a cube for analytical purposes), it is usually best accessed using SQL commands, which is usually best accomplished using a database server.

This creates an interesting challenge: who are the consumers for a given set of data, and through which protocol should it be accessed? This simple question is hard to answer because, for a given set of data, the consumers change over time, but the data stores and the protocols do not. The challenge is usually resolved by hiring consultants (or spending a lot of time with internal resources) to build bridges that move or copy data from one storage format to the next, so it can be used by the consumers that need it in a specific format; that's why integration tools are so popular: let's copy the data from point A to points B and C, so it can be used by consumers (business partners, reports, executives…). All this takes time and money. Copying no-sql data to a relational database or a cube so it can be reported on takes effort. Extracting flat files from an FTP site daily and loading them into a database so they can be used by a web application requires building a complex set of programs to orchestrate the work.

As a result, the difficulty of making data access ubiquitous prevents some companies from making timely decisions, because the necessary data is not immediately available in the right format at the right time. And as mentioned previously, the need for data in various formats is clear. How many deployments of SSIS (SQL Server Integration Services), Joomla, BizTalk, Informatica, scheduled jobs and ETL processes that move files around are you aware of? Some are needed because complex transformations are necessary, but a vast number of those implementations exist simply because the source data is in the wrong format. [Note: I am not saying these tools are unnecessary; I am naming them to outline the need for data movement in general.]

Introducing the Data Server

With this challenge continuing to grow with every new platform, API, consumer, and storage type, I suggest that a new technology be built, which I will simply call a Data Server, so that data can be serviced in real time through virtually any protocol regardless of where it comes from. In other words, it shouldn't matter who needs the data or through which protocol: a mobile application needs data through REST/JSON? No problem. A report needs the exact same data through SQL? No problem. The same data, coming from the same source, should be accessible in real time regardless of consumer preferences. If data were available in such a ubiquitous manner, regardless of the protocol being used, a large number of integration routines would become obsolete or be largely simplified.

So what are the attributes of a Data Server?  It should be able to hide the complexities of the underlying data sources, and present data in a uniform way, through multiple protocols. For example, a Data Server would present Tweets through a REST/JSON interface and through an SQL interface (the same data in real-time; not a copy of the data). SharePoint lists could be made available through the same REST/JSON interface, or SQL as well. An FTP server could be accessed through REST/JSON and SQL too, and so would WMI, no-sql, CICS screens, flat files, SOAP endpoints… Regardless of the origin of the data, it would be serviced through a uniform response in the desired protocol.

The Data Server could also abstract security by shielding the authentication and authorization details of the underlying source of data. This makes a lot of sense because most sources of data use different security protocols, or variations of published standards. For example, authenticating to Microsoft Bing, Azure Blobs, Google Maps, FTP, SharePoint or Twilio is difficult because they all use different implementations. So abstracting authentication through a single REST layer or SQL interface, and adding a layer of authorization on top of these data endpoints, makes things much easier and more secure. It also becomes possible to monitor data consumption across private and public sources of data, which can be important in certain organizations.

Data Cross-Cutting Concerns

A Data Server would also help in implementing data cross-cutting concerns that are not usually easy to configure (or use), such as caching, asynchronous processing, scheduling, logging and more. For example, caching becomes much more interesting through a Data Server because it doesn't matter which interface is used to access the data; cached data sets can be made available to both the REST/JSON and SQL interfaces at the same time, which means the data needs to be cached only once and remains consistent no matter which consumer reads it.

Asynchronous processing is also an interesting cross-cutting concern; consumers can start a request without waiting for its completion, through REST or SQL equally. For example, a REST command could initiate an asynchronous request, and a SQL command could check the completion of the request and fetch the results. Since the protocol used by consumers becomes an implementation choice, the data takes center stage in a Data Server. Accessing, managing, recycling and updating data across a vast array of data sources becomes protocol agnostic.

Conclusion

To accelerate data-intensive projects and to help organizations consume the data they need efficiently, data should be made available in a uniform way regardless of the client protocol, no matter where it comes from, so that it doesn’t matter anymore who needs to consume that data. By creating this level of abstraction, the authentication and authorization mechanism can be streamlined too, so that there is only one way to secure the information. And since a Data Server channels requests to a multitude of data endpoints, it becomes a hub where common data related concerns can be implemented, including caching, scheduling and asynchronous requests.

About Herve Roggero

Herve Roggero, Microsoft Azure MVP, @hroggero, is the founder of Blue Syntax Consulting (http://www.bluesyntaxconsulting.com). Herve's experience includes software development, architecture, database administration and senior management with both global corporations and startup companies. Herve holds multiple certifications, including an MCDBA, MCSE, MCSD. He also holds a Master's degree in Business Administration from Indiana University. Herve is the co-author of "PRO SQL Azure" and “PRO SQL Server 2012 Practices” from Apress, a PluralSight author, and runs the Azure Florida Association.

To SaaS or not to SaaS

Many companies servicing specific industries have technology assets that could be transformed into a Software as a Service (SaaS) offering. Your business may have a unique, proprietary technology that your industry could use, so packaging your intellectual property into a SaaS offering may make sense; you could essentially rent your 'savoir faire' to your industry, or even to multiple industries depending on how generic the technology is. For example, if you know how to process insurance claims effectively for your own business, could you offer an Insurance Claim SaaS platform to your industry? When done properly, offering a SaaS solution can open the door to new revenue and a more predictable revenue model, based on subscriptions for example. And if you are considering a SaaS offering, it may even be possible to transform your technology into a more developer-centric solution as a PaaS offering, so the following discussion applies equally to PaaS-enabling your technology stack.

What’s my low hanging fruit?

To begin answering the question of whether you should build a SaaS offering, make an inventory of the high-level technologies you have built over the years. They may be related to licensing, data transfer, security models, industry-specific calculations, a mathematical model and so forth. Or it could be something as simple, or generic, as credit card transactions that you feel are unique to your industry because of compliance requirements or other constraints.

Once you have established this list, find the technology that is likely the easiest to package. If the technology is well understood by your staff, if you own all the rights to the technology, if the technology does not need too much rework to be used by multiple customers, and if making the technology available to the competition won’t put you out of business, then you may have a winner!

What else is involved?

If you identify a technology that makes sense to package as a SaaS offering, you need to evaluate the energy involved in creating your offering. By energy I mean associated costs and servicing readiness of your company. From a cost standpoint, you may need to invest in making your technology more generic to handle a greater variety of conditions that your online customers will need; or perhaps you may need to redesign a component in your technology to handle multiple customer requests at once.

From a servicing standpoint, will you need to build a management portal for your SaaS offering? Will you need to train your staff on the new kinds of issues your technology may face when exposed publicly? Keep in mind that building an online service may require your company to be prepared to function like a software company, with release management, higher quality control, patch management and so forth.

Example of a failed service

Although creating a SaaS offering can be lucrative, it is usually not easy to build and service over time. Some services seem to make sense, but the technology stack, or the actual customer demands are such that it may not be a viable financial option in the long run. A technology that does not scale very well to an increasing number of users could spell doom for your SaaS ideas.

Although there are many services that have failed over the years, I am cherry-picking the Azure Reporting Service as the prime example because it was put together by a company that now builds cloud services as part of its mainstream business: Microsoft. Of course, Microsoft has built many successful online services (SaaS, PaaS and IaaS), and as such has developed a leading expertise in packaging and bringing remarkable solutions to market. And yet, the reporting service offered a couple of years ago failed from a business standpoint. The service itself was certainly in high demand; configuring an on-premises Reporting Services environment is complex and hard to maintain, so a SaaS/PaaS offering made sense because of the demand for simpler configuration. However, the technology is likely very hard to scale; running reports can use a lot of resources, which makes it difficult to predict (and contain) the amount of resources used by hundreds (or thousands) of customers. Ultimately, while the offering was awesome (and so was the service itself), it did not appear to be scalable enough, and Microsoft probably pulled the plug on it due to the high operating costs (high operating costs meant a high price point for the service, which in turn negatively impacted customer demand).

Conclusion

If your business services a specific industry, transforming an existing technology you built over the years into a SaaS (or PaaS) offering could provide a new and more scalable business model to your company, and create a more predictable revenue stream over time. One way to enter this business model is to leverage your own technology stack and repurpose what you already do well into a publicly available service. However it can be challenging to transform your business into an online operation. If your company is considering entering this space, you should consider the various business impacts of running a SaaS service, including whether or not your technology needs to be upgraded and how you may need to change your business to function like a software company.

About Herve Roggero

Herve Roggero, Microsoft Azure MVP, @hroggero, is the founder of Blue Syntax Consulting (http://www.bluesyntaxconsulting.com). Herve's experience includes software development, architecture, database administration and senior management with both global corporations and startup companies. Herve holds multiple certifications, including an MCDBA, MCSE, MCSD. He also holds a Master's degree in Business Administration from Indiana University. Herve is the co-author of "PRO SQL Azure" and “PRO SQL Server 2012 Practices” from Apress, a PluralSight author, and runs the Azure Florida Association.

Microsoft Azure and Threat Modeling Your Apps

In this blog post I will introduce you to the Microsoft Threat Modeling Tool (TMT) and how it can be used specifically with the Microsoft Azure environment for application development and application deployment purposes. In order to do this, I will take a specific example, using the Enzo Cloud Backup tool my company is building, and show you how to perform a high level threat model for a specific operation: a backup request.

Introduction

First things first… what is the Threat Modeling Tool (TMT)? It's a pretty cool piece of software that allows you to lay down your application architecture, with its various dependencies and protocols; it becomes a canvas for understanding and identifying potential threats. The tool itself may provide certain recommendations and a list of checkpoints, but even if you decide to use it as a simple architecture diagramming tool, it is very effective at that. You can use TMT to create simple, system-level diagrams, or to dive into specific operations to document the data flow of your system. I refer to system-level diagrams as Level 0, and data flow diagrams as Level 1. The deeper you go, the deeper the analysis. Interestingly enough, I am starting to use TMT as a troubleshooting guide as well, since it does a good job of representing inter-system connections and data flows.

As you draw your diagram, you will choose from a list of processes, external components, data stores, data flows and trust boundaries. Selecting the correct item in this list is usually simple; you also have Generic items that allow you to specify every security property manually.

image

Enzo Cloud Backup – Level 0 Diagram

Let’s start with a Level 0 system diagram of the Enzo Cloud Backup system. As we can expect, this is a very high level overview which provides context for other more detailed diagrams. The Level 0 diagram shows two primary trust domains: the corporate network where Enzo Cloud Backup is installed, and the Microsoft Azure trust domain, where the backup agent is installed. This diagram also introduces the concept that the agent and the client communicate directly, and through an Azure Storage account. As a result, two communication paths need to be secured, and both happen to take place over an HTTPS connection. The cloud agent also communicates to an Azure Storage account through HTTPS within the Microsoft Azure trust boundary, and to a data source (that is being backed up or restored). A more complex diagram would also show the fact that multiple Virtual Machines could be deployed in the cloud, all communicating to the same storage account, but for simplicity I am only showing one Virtual Machine. Also note that in my Level 0 diagram I choose not to document the sequence of events; in fact my Level 0 diagrams are not specific enough to dive into specific events or data flows.

Although the diagram below shows boxes and circles, selecting the right stencil is important for future analysis. There are no specific visual cues on the diagram itself indicating whether you are dealing with a Managed or Unmanaged process, for example, but the tool shows this information in the Properties window (shown later) and uses it to provide a set of recommendations. In addition, each item in the diagram comes with a set of properties that you can set, such as whether a data flow is secured, and whether it requires authentication and/or authorization.

image

Enzo Cloud Backup – Level 1 Remote Backup Operation

Now let's dive into a Level 1 diagram. To do so, let's explore the systems and components used by a specific process: a remote backup operation. Level 1 diagrams are usually more complex, and as a result need to show as many data stores as possible, and optionally identify the order of data flows. While TMT doesn't natively give you the option to document the sequence of your data flows, you can simply name each data flow by adding a number in front of it to achieve a similar result. By convention, I use 0 for initialization tasks and 1 for the starting request, and sometimes I even use decimals to document what happens within a specific step. But feel free to use any mechanism that works for you if you need to document the sequence of your data flows.

My Level 1 diagram below shows that a human starts the backup request through Enzo Cloud Backup, which then sends a request through a Queue. The backup service picks up the work request and starts a backup thread that performs the actual work. The thread itself is the one creating entries in the history data store, which the client reads to provide a status to the user. As we can see in this diagram, the cross-boundary communications are as expected from our Level 0 diagram. Also worth noting is that the backup agent reads from a configuration file; from a security standpoint this could represent a point of failure and disrupt the backup process if it is tampered with. The client depends on the local machine registry, which is therefore also a store to secure properly. Last but not least, the numbering in front of each data flow allows you to walk through the steps taken during the backup request.

image

As mentioned previously, you can configure each item in the diagram with important properties which will ultimately help TMT provide recommendations. For example, I selected an HTTPS data flow for the Monitoring Status reading from the History Data store. Since I selected an HTTPS data flow, the tool automatically set some of the security attributes; I also provided information about the payload itself (it is a REST payload for example). Other properties, not shown below, are also available. And you can create your own properties if necessary, for documentation purposes.

image

Going Further with TMT

TMT is being used by large corporations to produce much more detailed application diagrams, going as far as internal memory access. In fact, the tool allows you to add notes and keep a checklist of important security verifications for your diagrams. TMT also comes with a built-in security report that tells you which steps you should consider to make your system more secure. The following diagram is a short sample of this report for the Registry Hive access. The more information you provide, including the properties on your data stores, data flows, processes and trust boundaries, the better.

image

Last but not least, the tool comes with an Analysis view of your diagram, which is the basis for generating the report shown above. For example TMT identified that the History Data store could become unavailable; while this sounds like a simple problem to solve through a retry mechanism, this could be used as a Denial of Service attack against the tool. Note that TMT will highlight the component in question to make it easy to understand the area at risk.

image

Give it a ride! TMT is easy to use and provides a solid framework for documenting and analyzing your systems; and since most corporations have a vested interest in securing their applications in the cloud, TMT offers the framework to do so. To learn more about TMT, check out this link: http://blogs.microsoft.com/cybertrust/2014/04/15/introducing-microsoft-threat-modeling-tool-2014/.

About Herve Roggero

Herve Roggero, Microsoft Azure MVP, @hroggero, is the founder of Blue Syntax Consulting (http://www.bluesyntaxconsulting.com). Herve's experience includes software development, architecture, database administration and senior management with both global corporations and startup companies. Herve holds multiple certifications, including an MCDBA, MCSE, MCSD. He also holds a Master's degree in Business Administration from Indiana University. Herve is the co-author of "PRO SQL Azure" and “PRO SQL Server 2012 Practices” from Apress, a PluralSight author, and runs the Azure Florida Association.

DocumentDB vs Azure SQL vs Azure Table

Microsoft Azure is now offering a third storage option for developers: DocumentDB.  DocumentDB is a no-sql storage service that stores JSON documents natively and provides indexing capabilities along with other interesting features. Microsoft Azure also offers Azure Tables, another no-sql storage option, and Azure SQL, which is a relational database service. This blog post provides a very high level comparison of the three storage options.

Storage

Purely from a storage perspective, the only relational storage offering is Azure SQL. This means that only Azure SQL will provide a schema and a strongly typed data store. This gives Azure SQL some interesting advantages related to search performance on complex indexes. Both Azure Table and DocumentDB are no-sql storage types, which means they do not enforce schema on their records (a record is called an Entity in Azure Table, and a Document in DocumentDB). However DocumentDB stores its data as a native JSON object, and allows attachments to a document, while Azure Table stores its data in XML (although you can retrieve Azure Table entities as JSON objects). The advantage of storing native JSON objects is that the data maps directly into JSON objects in the client application code. It is also worth noting that Azure SQL comes with storage capacity limitations; it can only store up to 500GB of data in a single database.

Indexing

All three options provide a form of indexing. Azure SQL offers the most advanced indexing capability allowing you to define indexes with multiple columns and other advanced options (such as a WHERE clause as part of the index filter). DocumentDB offers automatically generated indexes allowing queries to efficiently retrieve information. Azure Tables offer only a single index on its PartitionKey; this means that querying on properties other than the PartitionKey can take a significant amount of time if you have a lot of data.

Programmability

Azure SQL provides advanced server-side programming capabilities, through triggers, column-level integrity checks, views and functions (UDF). The programming language for Azure SQL is T-SQL. DocumentDB offers a similar programmability surface using JavaScript, which provides a more uniform experience for developers. Azure Tables do not offer server-side programmability.

Referential Integrity and Transactions

All three options provide some form of data consistency, although limited for Azure Tables. Azure SQL offers full-blown RDBMS referential integrity and transactional support; with T-SQL developers can nest transactions server-side and perform commit or rollback operations. And Azure SQL offers strong Referential Integrity through foreign keys, unique constraints, NOT NULL constraints and more. DocumentDB does not offer strong Referential Integrity (there is no concept for foreign keys for example) but supports transactions through JavaScript. As a result, developers can use JavaScript to enforce server-side referential integrity programmatically. Azure Tables offers basic transaction support through the use of its BATCH operation, although there are a few stringent requirements: up to 100 entities can be batched together, and all must share the same PartitionKey.
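To illustrate the BATCH constraint, here is a minimal sketch using the WindowsAzure.Storage client library; the table name, entity type and values are made up for this example.

using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Table;

// Hypothetical entity; every entity in a batch must share the same PartitionKey.
public class FlightEntity : TableEntity
{
    public FlightEntity() { }
    public FlightEntity(string airport, string ident)
    {
        PartitionKey = airport;   // common PartitionKey for the whole batch
        RowKey = ident;
    }
    public string AircraftType { get; set; }
}

class BatchExample
{
    static void Main()
    {
        var account = CloudStorageAccount.Parse("YOUR_STORAGE_CONNECTION_STRING");
        var table = account.CreateCloudTableClient().GetTableReference("flights");
        table.CreateIfNotExists();

        // Up to 100 operations per batch, all targeting the same PartitionKey
        var batch = new TableBatchOperation();
        batch.Insert(new FlightEntity("KMIA", "AAL1234") { AircraftType = "B738" });
        batch.Insert(new FlightEntity("KMIA", "DAL5678") { AircraftType = "A320" });

        // The batch executes as a single entity group transaction
        table.ExecuteBatch(batch);
    }
}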

Service Cost

Azure SQL pricing recently changed and is now driven by the level of service you desire, which makes it difficult to compare with other storage options. Assuming a Basic level of service and a relatively small database (up to 2GB), you can expect to pay roughly $5.00 per month starting Nov 1st 2014. Pricing of Azure SQL goes up significantly with higher levels of service, and can reach over $3,000 per month for the more demanding databases (the price point impacts the maximum size of the database and the throughput available). DocumentDB offers a starting price point at $45.00 per month per Capacity Unit, which represents a storage quantity and specific throughput objectives (note: the price for DocumentDB is estimated since it is currently in preview). Azure Tables only have a storage price point which makes this storage option very affordable. A 1GB storage requirement with Azure Tables will cost about $0.12 per month with high availability and geo-redundancy.

All the storage options incur additional charges for transactions, which are not included in this analysis to simplify the comparison.

Summary

The three storage options provided by Microsoft Azure (Azure SQL, DocumentDB and Azure Tables) provide an array of implementation options at various price points depending on the needs of your application. Azure SQL provides higher data integrity but is limited in storage capacity, DocumentDB offers a no-sql implementation with configurable consistency levels and JavaScript server-side logic, and Azure Table offers simple no-sql storage capacity.

The following summary is my interpretation of the Microsoft Azure storage options used for storing objects or tables.

 

Azure SQL
  • Storage: Database; up to 500GB
  • Indexing: Yes; multi-column
  • Server-Side Programmability: Yes (T-SQL)
  • Ref. Integrity and Transaction Support: Yes for R.I.; yes for Tx support
  • Service Cost: Starts at $5 / DB / month (DB = Database)

DocumentDB
  • Storage: JSON; petabytes
  • Indexing: Yes; automatically created/maintained
  • Server-Side Programmability: Yes (JavaScript)
  • Ref. Integrity and Transaction Support: Through JavaScript; supports Tx
  • Service Cost: Starts at $45 / CU / month (CU = Capacity Unit)

Azure Table
  • Storage: XML; 200TB
  • Indexing: Primary Key only; no secondary index
  • Server-Side Programmability: No
  • Ref. Integrity and Transaction Support: Limited (uses BATCH operation)
  • Service Cost: $0.12 per GB

 

NOTE 1: This blog post was updated on 9/11/14 at 9:48pm ET to correct an inaccuracy on DocumentDB indexing.
NOTE 2: This blog post was updated on 9/14/14 at 6:37pm ET to correct a typo.

About Herve Roggero

Herve Roggero, Microsoft Azure MVP, @hroggero, is the founder of Blue Syntax Consulting (http://www.bluesyntaxconsulting.com). Herve's experience includes software development, architecture, database administration and senior management with both global corporations and startup companies. Herve holds multiple certifications, including an MCDBA, MCSE, MCSD. He also holds a Master's degree in Business Administration from Indiana University. Herve is the co-author of "PRO SQL Azure" and “PRO SQL Server 2012 Practices” from Apress, a PluralSight author, and runs the Azure Florida Association.

About the new Microsoft Innovation Center (MIC) in Miami

Last night I attended a meeting at the new MIC in Miami, run by Blain Barton (@blainbar), Sr IT Pro Evangelist at Microsoft. The meeting was well attended and is meant to be run as a user group format in a casual setting. Many of the local Microsoft MVPs and group leaders were in attendance as well, which allows technical folks to connect with community leaders in the area.

If you live in South Florida, I highly recommend looking out for future meetings at the MIC; most meetings will be about the Microsoft Azure platform, covering either IT Pro or Dev topics.

For more information on the MIC, check out this announcement:  http://www.microsoft.com/en-us/news/press/2014/may14/05-02miamiinnovationpr.aspx.

About Herve Roggero

Herve Roggero, Microsoft Azure MVP, @hroggero, is the founder of Blue Syntax Consulting (http://www.bluesyntaxconsulting.com). Herve's experience includes software development, architecture, database administration and senior management with both global corporations and startup companies. Herve holds multiple certifications, including an MCDBA, MCSE, MCSD. He also holds a Master's degree in Business Administration from Indiana University. Herve is the co-author of "PRO SQL Azure" and “PRO SQL Server 2012 Practices” from Apress, a PluralSight author, and runs the Azure Florida Association.