Blog Stats
  • Posts - 13
  • Articles - 0
  • Comments - 2
  • Trackbacks - 0

 

Saturday, March 13, 2010

Writing a SQL Azure Book - Notes

Over the last few months I have had the opportunity to ramp up significantly on SQL Azure.  In fact I will be the co-author of Pro SQL Azure, published by Apress. This is going to be a book on how to best leverage SQL Azure, both from a technology and design standpoint. Talking about design, one of the things I realized is that understanding the key limitations and boundary parameters of Azure in general, and more specifically SQL Azure, will play an important role in making sounds design decisions that both meet reasonable performance requirements and minimize the costs associated with running a cloud computing solution.
 
The book touches on many design considerations including link encryption, pricing model, design patterns, and also some important performance techniques that need to be leveraged when developing in Azure, including Caching, Lazy Properties and more.
 
Finally I started working with Shards and how to implement them in Azure to ensure database scalability beyond the current size limitations. Implementing shards is not simple, and the book will address how to create a shard technology within your code to provide a scale-out mechanism for your SQL Azure databases.
 
As you can see, there are many factors to consider when designing a SQL Azure database. While we can think of SQL Azure as a cloud version of SQL Server, it is best to look at it as a new platform to make sure you don’t make any assumptions on how to best leverage it.

Tuesday, January 26, 2010

SQL Azure - Execution Plans

Quick note on SQL Azure...

While I was performing a demo of SQL Azure at the last SQL Saturday in Tampa, I tried to pull an execution plan to see if that worked. I was happily surprised to find out that you can actually get actual execution plans back. This is really cool considering that there is no current support for server-side traces. At least we can tell whether indexes are actually being used in  the cloud.

Just be ready to wait a little - execution plans are 'heavy' on packet transfers - so a far link connection on top of SSL makes displaying execution plans a slow operation. But it's worth the wait!

 

Sunday, January 10, 2010

Designing an application to work with SQL Azure

Using SQL Azure for your applications is relatively straightforward. All you need is... a connection string... and since SQL Azure uses TDS as its underlying communication protocol, just like SQL Server, it may seem natural that you don't need to change much in your application design approach when connecting a SQL Azure database. However, this may not necessarily be true. SQL Azure was designed for ease of deployment and scalability; and as such you may need to take certain things into consideration when creating a new application against SQL Azure.

The following is not meant to be an extensive list of things to consider, but a starting to that should help you figure out which next steps you need to take.

Performance and Application Responsiveness for Windows Applications
Web development may not be too different from a performance standpoint since all data communication takes place within the data center. However for Windows Applications, where the user interface runs local to a user's workstation, having to connect to the cloud to retrieve data may pose some challenges. First, the communication takes place over SSL; no choice here. So it's always encrypted (good from a security standpoint, not so good for performance). Second, your data can be far, very far away. So retrieving larger recordsets may actually take some time (more than what your users may be expecting). As a result your application design should most likely include careful row selection, smart indexing, multi-threading (for faster UI response time) and even caching for somewhat static data, so you don't have to do database roundtrips every time you need to display the list of States for example. Most of those design choices apply whether you are desiging a web application or a Windows application.

SQL Azure Limitations
SQL Azure also imposes certain limitations you need to be aware of that may impact your design. For example, depending on your service plan your database may be limited to 10GB. As a result, if you plan in storing that kind of data, you will need to plan for data partitioning, or use a technique called Sharding. Also certain capabilities are not yet available, such as using the CLR for more advanced stored procedures. So some of your business logic may need to remain in your client-side code, or in business objects in Windows Azure through the use of REST services. Finally, consider performance of your queries as extremely important; SQL Azure will protect itself against poorly written SQL code and may disconnect your session if it feels you are consuming too much bandwidth, or if you left a connection left open for too long... so automatic reconnection may also be necessary in your design.

Security
Last but not least, security should be strongly considered as part of your design. Storing data in the cloud presents new risks for organizations. As a result, plan for encryption more than usual, and plan to have to implement your routines in your .NET code since Encryption is not yet supported in SQL Azure.

These should be considered high-level thoughts for designing applications that use SQL Azure as their backend storage. Make sure to read about SQL Azure on MSDN for more information.

 

SQL Saturday in Tampa - January 23rd

Just in case you are around... please come to the SQL Saturday in Tampa, FL on Saturday January 23rd. I will be there!!! And many other SQL Server speakers. Visit http://www.sqlsaturday.com/32/eventhome.aspx for more information...

Many topics will be covered including SSIS, ELT, Powershell, SQL Azure,  CTEs, Dashboards, Service Broker and many more!

 

Sunday, November 29, 2009

Cloud Computing, Agile Development and Infrastructure

A few weeks ago I was part of a meeting that outlined some of the challenges that Agile development was inducing on an existing corporate infrastructure. And Cloud Computing became a possible solution to the fast pace changes required by Agile development initiatives.

Agile projects tend to be short in nature, with fewer documentation and evolving requirements. From a project management standpoint, development cycles are divided in Sprints, which can be a couple of weeks long in certain cases. The fast-pace development approach creates code that needs to be deployed in production (or at least in a staging environment) as early as possible in the project lifecycle to obtain customer feedback. And since requirements are bound to change over time, infrastructure requirements are also bound to change.

For larger organizations, provisioning of hardware and infrastructure configuration can take days, or weeks, when taking into consideration bandwidth needs, security and server acquisition. In addition, making changes to production (or staging) systems can be as slow, or difficult, putting additional pressure on IT departments. In this Agile world, IT departments now have to provision hardware in a matter of hours (or a few short days) and need to be able to adapt to rapid requirements changes.

This is where Cloud Computing comes in. While Cloud Computing doesn't necessarily remove the involvement of the IT Department, it could provide the necessary flexible platform businesses are looking for to obtain an environment that can be used for system testing and staging purposes, and eventually production as well. Since Cloud Computing promises faster provisioning time (the hardware is available on demand, and so is the bandwidth capacity), some companies are looking at Cloud Computing as a way to quickly provision infrastructure requests.

It will be interesting to observe how larger corporations adapt the Cloud over the next few years. Cloud computing may be more than a new marketing spin on Hosting Services; it may actually give IT organizations the means to become more effective in supporting fast-pace development initiatives.

 

Saturday, November 07, 2009

SQL Azure - Auditing Choices

As I am digging more into SQL Azure, it seems choices for auditing will become a little bit more restricted.

Generally speaking there are four ways to audit SQL Server statements; these mechanisms are used by various software vendors to deliver auditing capabilities for compliance mandates and for security reviews. However as we will see, many of the products will stop from working for SQL Azure due to some limitations imposed by the database.  At a high level, the four auditing mechanisms are:

  1. Server-side Traces
    Uses server-side traces to trace and store statements executed on a database server
  2. Log File Auditing
    Uses backups of log files to discover which statements were previously executed
  3. Network Sniffing
    Sniffing of network packets and storing SQL Server packets' content, including statements
  4. Database Proxying
    Captures all incoming network packets directly before forwarding them to the database server

It turns out that option number 1 is no longer available. Many SQL Server auditing products rely on this mechanism. Since server-side tracing stored procedures are no longer available, these products will not be able to audit SQL Azure statements.

Regarding option number 2, it seems that no option is available to obtain a backup of the log files. The BACKUP and RESTORE operations are not supported since cloud computing takes care of high availability concerns.

Regarding option number 3, packet sniffing will no longer work either since all direct connections to SQL Azure require SSL. As a result, all packets are encrypted and cannot be analyzed.

The only remaining option is to use a database proxy that handles the SSL handshake on both ends and stored the database statements going through. However, if a database connection can be made around the proxy, those statements will not be captured. While SQL Azure allows firewall settings to limit connections (by IP), it would be difficult to prove from an auditing standpoint that the firewall settings were not altered over time.

At this point at least, there appears to be no real silver bullet for auditing a SQL Azure database; at least not yet...

Still, most applications using the SQL Azure platform will not likely store any sensitive data, initially. As the SQL Azure platform grows in its use, I would expect some of the options above to be enabled, or new options to become available.

 

 

Wednesday, October 07, 2009

SQL Azure - Getting Started

I started to look into SQL Azure, the database cloud computing initiative from Microsoft, and I was completely blown away. Microsoft is providing an interesting option that strikes an interesting balance between a plain-old RDBMS implementation and a scalable platform.  By definition, relational databases have difficulty scaling due to the many features and constraints that make them popular. However, SQL Azure removes some of those features in order to provide a more scalable SQL Server, while keeping intact its strong relational capabilities.

For example, outside of hashing, SQL Azure does not support encryption that uses certificates, which makes somewhat sense or those certificates would need to be deployed virtually everywhere.  Considering that encryption is CPU intensive anyway, it is probably best to leave the encryption details to the consumer.

I was initially surprised that the USE <database> command didn't work, although it doesn't generate an error either. I guess every database created could be sitting on any hardware... which makes sense from a scalability standpoint.

Finally, I had to get used to sqlcmd.exe again... but it works like a charm. There are ways to get a SQL Azure database available through Visual Studio. However you should expect that not all the T-SQL commands are available, or fully supported. For example, none of the system procedures will work... (bummer!).

To get started, go to http://www.microsoft.com/azure/sql.mspx - from there you can sign up for the CTP. 

Saturday, September 19, 2009

Strong Password Hashing with SQL Server

While SQL Server security features continue to improve, hashing in SQL Server using native functions is simple, but not necessarily up to speed on the latest security specifications. The good news however, is that using extended stored procedures allows database developers to take advantage of the strength of .NET using the System.Security.Cryptography namespace.

I started to evaluate Hashing capabilities within SQL 2005 and 2008 and as it turns out, hashing is extremely simple using the HASHBYTES function. This function takes two parameters: the hashing algorithm (sha1, md5...) and the input parameter.

For example, the following code hashes a password. Pay attention to the Encoding chosen; the type of the variable dictates which encoding will be used (varchar and nvarchar).

DECLARE @password1 varchar(100)   -- UTF8  Encoding
DECLARE @password2 nvarchar(100) -- Unicode
SET @password1 = 's3cret'
SET @password2 = 's3cret'
SELECT HASHBYTES('sha1', @password1)
SELECT
HASHBYTES('sha1', @password2) 

Output:

0xFEF341F85D87439E7D91A2D465B9871EF66B5E98 
0xC06DCADF544BC3D6ECE7C64F485D2846E7A93F55

Unfortunately, SQL Server does not support any of the stronger hashing mechanisms, such as SHA256. In addition, storing passwords by simply hashing them is not considered sufficient as they are vulnerable to dictionary attacks; indeed, hashing a given input always yields the same output. To provide stronger hashing, one would need to obtain a cryptographically strong random byte array, known as a vector (or a salt value). To use stronger hashing algorithms and create vectors, we need to use the .NET framework through an extended function.

Using .NET Hashing

The objective of our .NET code is to generate a single output that is never the same (avoiding the dictionary vulnerability) and uses SHA256 as the hashing mechanism. The following shows how to use a .NET extended function that takes a password as an input and returns a byte array to SQL Server that contains both the hash and the vector.

Note that your SQL Server database should allow .NET execution. Here is the SQL statement used to enabled CLR on SQL Server:

sp_configure 'clr enabled' , '1'
go
reconfigure

The following T-SQL uses the strong .NET hashing functions found later in this post. The output is a byte array that can be stored as binary(48).

DECLARE hash binary(48)
SET hash = (SELECT dbo.pyn_encryption_hashPassword('s3cret'))
SELECT dbo.pyn_encryption_verifyPassword('wrongpwd', hash) -- returns 0
SELECT dbo.pyn_encryption_verifyPassword('s3cret', hash) -- returns 1

The above code makes it simple to offer a strong password hashing function that combines SHA256 and a vector. Since the vector is embedded in the password, there is no need to store the vector in a separate column.

Here is the .NET code that creates and verifies a password using a strong hashing method. The methods below use UTF8 encoding. To use Unicode, replace the method calls GetNonUnicodeBytes() to GetUnicodeBytes().

using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;

public partial class UserDefinedFunctions
{
[Microsoft.SqlServer.Server.SqlFunction]
public static SqlBytes pyn_encryption_hashPassword(SqlString password)
{
 // Create a strong vector
 byte[] vector = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
 System.Security.Cryptography.RandomNumberGenerator.Create().GetNonZeroBytes(vector); 

 // Get the password bytes  
 byte[] pwdBytes = password.GetNonUnicodeBytes(); 

 // Add the vector bytes to the password bytes and hash them both at the same time 
 System.Security.Cryptography.SHA256Managed sha256 = new System.Security.Cryptography.SHA256Managed();
 byte[] outputBytes = sha256.ComputeHash(AddBytes(pwdBytes, vector)); 

 // Return the resulting hash, and append the vector again to it so it can extracted later 
 return new SqlBytes(AddBytes(outputBytes, vector));
}

[Microsoft.SqlServer.Server.SqlFunction]
public static SqlBoolean pyn_encryption_verifyPassword(SqlString password, SqlBytes hash)
{
 byte[] vector = new byte[16];
 byte[] pwdAndHash = new byte[32]; 

 // Split the hash and vector into separate variables 
 Array.Copy(hash.Value, 32, vector, 0, 16);
 Array.Copy(hash.Value, 0, pwdAndHash, 0, 32); 

 // Get the password bytes that will be tested against the hash 
 byte[] pwdBytes = password.GetNonUnicodeBytes(); 

 // Compute a hash using the password provided, and the vector extracted from the hash 
 System.Security.Cryptography.SHA256Managed sha256 = new System.Security.Cryptography.SHA256Managed();
 byte[] testHash = sha256.ComputeHash(AddBytes(pwdBytes, vector)); 

 // Compare hash values to determine if the password provided matches 
 return new SqlBoolean(BitConverter.ToString(pwdAndHash) == BitConverter.ToString(testHash));
}

private static byte[] AddBytes(byte[] array1, byte[] array2)

 // Add two byte arrays 
 byte[] array3 = new byte[array1.Length + array2.Length];
 Array.Copy(array1, array3, array1.Length);
 Array.Copy(array2, 0, array3, array1.Length, array2.Length);
 return array3;
}

};

Conclusion

While SQL Server hashing capabilities offer good support for simple hashing needs, extending SQL Server with .NET can provide a significant security advantage when it comes to encryption. Since the code above does not access any local resources on the database server, it can run in Safe permission mode.

 

Using SQL Server Hashing

Sunday, July 12, 2009

SQL Saturday - South Florida on August 8th 2009

Scott Klein (http://geekswithblogs.net/ScottKlein/Default.aspx) and I are organizing the first SQL Saturday of South Florida, coming on August 8th 2009. It will be held in Miramar at Devry University. If you are in the area, please drop by! We have over 20 speakers and a great deal of giveaways. To find out more, please go to:

http://www.sqlsaturday.com/eventhome.aspx?eventid=20

Thanks

Herve

 

Sunday, June 28, 2009

Data Masking - in-flight vs. at-rest

Many organizations are faced with the threat of data theft, from which legal battles, hefty fines and negative publicity can arise. Interestingly enough stealing data is not always that difficult. In this post we will review certain aspects of data masking, a technique used to disguise personable and sensitive information.

Data masking encompasses two key areas: in-flight and at-rest. In-flight data masking is different than encryption; the goal of this technique is to temporarily transform data from its original value before it is viewed on screen, or printed, depending on predefined security rules, such as a login name or Active Directory group. This is perhaps better described as a "need-to-know" transformation. The real data is not changed, but its representation is transformed based on who sees it. For example a supervisor may see a customer's full credit card information but a customer representative may only see the last four digits. While some applications may provide built-in support for similar features, most do not offer this capability.

At-rest masking is a technique typically used by corporations that need to restore backup copies of their production systems for quality assurance, package implementation testing or software development projects. This type of masking is typically considered much more difficult for many reasons, including process changes, testing tool and script compatibility, discovery of logical entity relationships within a database and across systems and more. For example, from a process standpoint, who in your organization would be responsible for defining and maintaining the data transformation of Social Security Numbers? Should a Social Security Number be transformed the same way on DB2, Oracle and SQL Server databases? How will testing batch processes be affected if cross-system data integrity is no longer maintained after data masking? Is it possible to stage data masking over multiple days depending on the size of each system? How often should non-production systems be refreshed?

In some cases data resides in flat files, XML documents, PDF documents (such as invoices) or images. Very few vendors seem to be able to mask data outside of what an ODBC driver can provide.

While in-flight data masking is technically difficult to achieve (and very few vendors are available), at-rest data masking may seem technically simpler but poses significant challenges that impact software development life cycle and package implementation projects alike. Perhaps the most difficult aspect of a masking project is to identify the business and technical requirements so that a solution can be picked, be it a vendor or a custom solution.

Some of the vendors that offer masking solutions include Compuware and IBM. There are a few other vendors that are more specialized if your masking needs are limited to Oracle for example. On SQL Server specifically, it is possible to devise a complex and effective masking solution using SSIS.

However the real complexity of the project is not the technology, but rather the process, as discussed above. My advice with this type of project is to carefully select vendors and possibly leverage consulting firms that may be able to guide you in the difficult task of gathering requirements and selecting a vendor. 

 

 

 

Copyright © Herve Roggero