Social Meet up on Twitter for Meet Windows Azure on June 7th

Get ready for a cool event on June 7th. Register to watch live (starting at 1PM PDT): here – The event will be presented by Scott Guthrie. To keep updated on this event follow the twitter conversations @WindowsAzure, #MEETAzure, #WindowsAzure.

You can also register for the social meetup on Twitter event, organized by MVP Magnus Martensson.

I assure you; you don’t want to miss this event…

MEET Windows Azure Blog Relay:

SQL Azure: Database Size History

You may end up in a situation where you need to know how your databases have evolved over time from a storage standpoint, and how much they were used. For example, you may want to know how your databases were used last month if your customers complained about performance. Or you may need to see how fast your database is growing to help you plan for future capacity.

Two system views are now available (see disclaimer further down): sys.resource_usage and sys.resource_stats. Both views contain a time stamp, a database name, the database SKU (web, enterprise), usage_in_seconds and storage_in_megabytes. The resource_storage view also has an end_time column, which does not show in the resource_stats view.

Although both views contain similar information, they appear to work on a different time scale. The resource_usage stores hourly information and can go back a few months (three months in my case), while the resource_stats contains information as granular as every five minutes but only goes back a few weeks.

For example, the following statement returns the total usage (in seconds) of a database, including the minimum and maximum storage required by my database on a monthly basis.

SELECT
    CAST(year(time) as nvarchar(4)) + '-' + CAST(month(time) as nvarchar(2)) as YearMonth,
    database_name, 
    sum(usage_in_seconds) as total_usage_sec,
    MIN(storage_in_megabytes) as Min_MB,
    MAX(storage_in_megabytes) as Max_MB
FROM sys.resource_usage
WHERE database_name = 'EnzoLog'
GROUP BY database_name, CAST(year(time) as nvarchar(4)) + '-' + CAST(month(time) as nvarchar(2))

The above statement returns the following rows for my EnzoLog database:

YearMonth database_name total_usage_sec Min_MB Max_MB
2012-1 EnzoLog 170 4.50 5.37
2012-2 EnzoLog 417 5.10 11.18
2012-3 EnzoLog 855 5.10 11.15
2012-4 EnzoLog 497 4.45 5.45

 

Because the resource_stats table returns information up to 5 minutes increments, you can have a more granular view of your database usage patterns during the day.

NOTE: The statements above are running against database objects found in the master database of a SQL Azure server. They are not currently supported nor are they documented. This means that these objects could change at any time, or even be dropped in the future. Do not use these objects in production systems until they are documented by Microsoft®.

SQL Saturday #141–June 16 1012 near Fort Lauderdale

SQLSaturday #141 South Florida 2012 will be here in less than 3 months! This is a training event for SQL Server professionals and those wanting to learn about SQL Server. This event will be held June 16, 2012 at Nova Southeastern University, Davie, FL 33314. The focus is on providing a good variety of topics, and making it all happen through the efforts of volunteers. Admittance to this event is free, all costs are covered by donations and sponsorships. Please register soon as seating is limited, and let friends and colleagues know about the event. For more information, or to register, please visit http://www.sqlsaturday.com/141/eventhome.aspx

We are also looking for more speakers. Speaker registration is open!

Ways To Load Data In SQL Azure

This blog provides links to a presentation and a sample application that shows how to load data in SQL Azure using different techniques and tools. The presentation compares the following techniques: INSERT, BCP, INSERT BULK, SQLBulkCopy and a few tools like SSIS and Enzo Data Copy.

The presentation contains load tests performed from a local machine with 4 CPUs, using 8 threads (leveraging the Task Parallel Library), to a SQL Azure database using the code provided below. The test loads 40,000 records. Note that however the test was not conducted from a highly controlled environment, so results may vary. However enough differences were found to show a trend and demonstrate the speed of the SQLBulkCopy API versus other programmatic techniques.

You can download the entire presentation deck here: presentation (PDF)

image

Enzo Data Copy and SSIS

The presentation deck also shows that the Enzo Data Copy wizard loads data efficiently with large tables. However it performs slower for very small databases. The reason the Enzo Data Copy is fast with larger databases is due to its internal chunking algorithm and highly tuned parallel insert operations tailored for SQL Azure. In addition, Enzo Data Copy is designed to retry failed operations that could be the result of network connection issues or throttling; this resilience to connection issues ensures that large databases are more likely to be transferred successfully the first time. The Enzo Data Copy tool can be found here: Enzo Data Copy

In this test, with SSIS left with its default configuration, SSIS was 25% slower than the Enzo Data Copy Wizard with 2,000,000 records to transfer.  The SSIS Package created was very basic; the UseBulkInsertWhenPossible property was set to true which controls the use of the INSERT BULK command. Note that a more advanced SSIS developer will probably achieve better results by tuning the SSIS load; the comparison is not meant to conclude that SSIS is slower than the Enzo Data Copy utility; rather it is meant to show that the utility compares with SSIS in load times with larger data sets. Also note that the utility is designed to be a SQL Server Migration tool; not a full ETL product.

image

 

Note about the source code

Note that the source code is provided as-is for learning purposes. In order to use the source code you will need to change the connection string to a SQL Azure database, create a SQL Azure database if you don’t already have one, and create the necessary database objects (the T-SQL commands to run can be found in the code). 

The code is designed to give you control over how many threads you want to use and the technique to load the data. For example, this command loads 40,000 records (1000 x 40) using 8 threads in batches of 1000: ExecuteWithSqlBulkCopy(1000, 40, 8); While this command loads 40,000 records (500 x 80) using 4 threads in batches of 500: ExecuteWithPROC(500, 80, 4);

Here is the link to the source code: source code

Here is a sample output of the code:

image

How To Copy A Schema Container To Another SQL Azure Database

This article is written to assist SQL Azure customers to copy a SCHEMA container from one SQL Azure database to another. Schema separation (or compress shard) is a technique used by applications that hold multiple customer “databases” inside the same physical database, but separated logically by a SCHEMA container. At times it may be necessary to copy a given SCHEMA container from one database to another. Copying a SCHEMA container from one database to another can be very difficult because you need to only extract and import the data contained in the tables found in that schema.

If your SQL Azure database has multiple schema containers in it, and you would like to copy the objects under that schema to another database, you can do so by following the steps below. The article uses the Enzo Backup tool (http://bluesyntax.net/backup.aspx), designed to help in achieving this complex task.

Example Overview

Let’s assume you have a database with multiple SCHEMA containers and you would like to copy one of the schema containers to another database. A SCHEMA container holds tables, foreign keys, default constraints, indexes and more. In the screenshot below the selected database has 5 schema contains: the DBO schema and 4 custom SCHEMA container. The tool allows you to copy the objects from any schema into another database.

image

Backing Up Your SCHEMA Container

The first step is to backup your schema. Enzo Backup gives you the options needed to backup a single SCHEMA container at a time.

1- If you have registered your databases previously with the backup tool, select the database from the list of databases on the left pane, right-click on the SCHEMA container (i.e. your logical database) and select Backup to Blob (or file). Otherwise click on the Backup –> To Blob Device from the menu and enter the server credentials, and specify the name of the SCHEMA to backup from the Advanced screen.

2- Type the name of your backup device

3- Optionally select a cloud agent if you are saving to a Blob on the Advanced tab (note: a cloud agent must be deployed separately)

4- Click Start

image

Restoring Your SCHEMA Container

Once the backup is complete, you can restore the backup device to the database server of your choice. Note that you could restore to a local SQL Server database or to another SQL Azure database server.

1- Click on Backups on the left pane to view your backup devices and find your device

2- Right-click on your backup device (file or blob)

3- Enter the credentials of the server you are restoring to and the name of the database

4- If the database does not exist, select the Create If… option

5- Optionally, if you are using a Blob device and restoring to a SQL Azure database, check the Use Cloud Agent (note: a cloud agent must be deployed separately)

6- Click Start

image

If for some reason the database you are restoring to is not empty, you may see a warning indicating that the database has existing objects. Click “Yes” to continue. When the operation is complete, you can inspect your database to verify the presence of your logical database.

Considerations

- You can restore additional SCHEMA containers on an existing database. You would simply need to repeat the above steps for each SCHEMA container.

- If your intent was to “move” the SCHEMA container, you will need to clean up the original database. Once your SCHEMA container has been copied, and you have verified all you data is present in the destination database, you will need to manually drop all the objects in the source database before dropping the SCHEMA container.

- This tool does not support backing up SQL Server database. However you can restore a backup device created with Enzo Backup on a local SQL Server database if desired.

Quick Review of Backup tools for SQL Azure

The landscape of SQL Azure backups is changing rapidly.  A few tools are becoming available at no charge and Microsoft is adding capabilities over time. Here a quick update.

Microsoft Tools

Microsoft offers two primary backup mechanisms so far:

  • Export/Import feature available on the Azure Management portal
  • The COPY operation as part of its T-SQL CREATE DATABASE statement

These mechanisms do not offer a scheduling component and do not work together. To obtain a transactionally consistent backup, you first need to perform the COPY operation manually, then run the Export function.

Free Tools

You have other tools on the market that are a bit more comprehensive at no charge. Here are a few:

  • Enzo Backup for SQL Azure (Standard Edition)
  • Red Gate's SQL Azure Backup (backup only; no restore; no cloud backup)

These are the only two third-party products available at no charge I am aware of right now. Note that Red Gate's product provides a simple copy operation to a local SQL Server database, not really a backup/restore solution. However it does the job well if you want to get a local copy of your data. Enzo Backup is more comprehensive and offers many more functions, such as a built-in scheduler, cloud backup devices (in Blobs) and a restore capability.

Paid-For Tools

  • Enzo Backup for SQL Azure (Advanced Edition)
  • SQLAzureBackup (basic command-line BCP wrapper, no cloud backup)

Regarding the tools you can purchase, SQLAzureBackup is a product you can purchase, although it seems somewhat limited in its capabilities. You can nonetheless use it to export your data locally and restore the database back into SQL Azure. Enzo Backup Advanced Edition is a more powerful flavor of the free version that leverages multithreading for faster operations.

Here is a link to an MSDN article that offers additional information on how to backup SQL Azure: http://blogs.msdn.com/b/davidmcg/archive/2011/09/29/data-backup-strategies-for-windows-and-sql-azure.aspx

For the time being Enzo Backup appears to have significantly more capabilities than the other products, and it can be used at no charge by visiting Blue Syntax's website (http://www.bluesyntax.net).

[Disclaimer: I am the author of Enzo Backup for SQL Azure; this blog intends to provide a quick overview of the current tools available on the market; please investigate the tools referenced here and visit the MSDN link provided to make an educated decision]

 

Azure Florida Association: New user group announcement

I am proud to announce the creation of a new virtual user group: the Azure Florida Association.

The missiong of this group is to bring national and internaional speakers to the forefront of the Florida Azure community. Speakers include Microsoft employees, MVPs and senior developers that use the Azure platform extensively.

How to learn about meetings and the group

Go to http://www.linkedin.com/groups?gid=4177626

First Meeting Announcement

Date: January 25 2012 @4PM ET

Topic: Demystifying SQL Azure

Description: What is SQL Azure, Value Proposition, Usage scenarios, Concepts and Architecture, What is there and what is not, Tips and Tricks

Bio: Vikas is a versatile technical consultant whose knowledge and experience ranges from products to projects, from .net to IBM Mainframe Assembler.  He has lead and mentored people on different technical platforms, and has focused on new technologies from Microsoft for the past few years.  He is also takes keen interest in Methodologies, Quality and Processes.
 

How to Detect If You Are Connected To A Federation Root

If you develop applications in SQL Azure, you may end up in a situation where your code needs to know if it is operating in the root database or in a Federation Member, or if it is connected to a regular user database. Two ways can be used to make this determination: using system tables, or using the USE FEDERATION command.
Using System Tables
This approach is the easiest to implement since it relies a SQL call only, without much logic in your application code. Assuming your code is currently connected to a SQL Azure database, you could use the sys.databases table to read the is_federation_member column.  
Here is the statement you would use:
SELECT is_federation_member FROM sys.databases WHERE name = db_name()
If you are currently connected to a federation member, is_federation_member will be 1; otherwise it will be 0.
Note that as of this writing, running the above statement in SQL Azure will throw an error because the is_federation_member column is not yet available in regular SQL Azure user databases. However, if you connect to a federation, the SQL code above will run.
Using the USE FEDERATION Command
Alternatively you can also use code in .NET to make that determination. Connect to the database as usual, using the SqlConnection class for example, then perform the following tasks:
-          Execute the USE FEDERATION ROOT WITH RESET command, which forces the connection back to the root database (in case you are not current in the root database)
-          Execute SELECT db_name() which fetches the name of the database of the root database
-          Compare the original database name you connected to with the one returned by the previous call; if the names are the same, you originally connected to the root database
Here is the sample code:
string sql = "USE FEDERATION ROOT WITH RESET";
string sql2 = "SELECT db_name()";
bool isFederationRoot = false;
 
try
{
 SqlConnection sqlConn = new...; // Your connection string goes here
 sqlConn.Open();
 (
new SqlCommand(sql, sqlConn)).ExecuteNonQuery(); // connect to the root db
 SqlDataReader dr = (new SqlCommand(sql2, sqlConn)).ExecuteReader(); // Get the db_name()
 
 if (dr.Read())
   isFederationRoot = (dr[0].ToString().ToLower() == dbName.ToLower());
 dr.Close();
}
catch { }
 
 

 

Solving Schema Separation Challenges

Introduction
To save on hosting costs and simplify maintenance, Software as a Service (SaaS) providers typically rely on schema separation to host multiple customers' records. This implementation relies on a specific SQL Server and SQL Azure feature called a schema object. A schema object behaves like a container, or a namespace in programmatic terms, allowing multiple tables (and other objects) to be stored with the same name in a single database.
However schema separation comes with its own set of challenges. The challenge in using schema separation is that few tools support it. For example, until recently, it was not possible to easily backup or restore a single schema.  Other challenges include security and account management. Let's explore further.
Basics of Schema Separation
Let's assume you have two customers, and you need to track historical information. So you need to create a table called tbl_history for each customer.  You could either create a database for each customer with the tbl_history table in each database (the linear shard pattern), or create a schema container for each customer in a single database, and create the tbl_history table in each schema container (the compressed shard pattern). For a description of sharding patterns review this white paper:  http://www.bluesyntax.net/files/EnzoFramework.pdf.
Here is an example of a schema-based storage for different customers. A database is created with 2 schema containers (cust1 and cust2). The tbl_history table is then added to each schema container. The following script assumes that the database has already been created.
CREATE SCHEMA cust1
GO
CREATE SCHEMA cust2
GO
 
CREATE TABLE cust1.tbl_history(id int identity(1,1) primary key, dateadded datetime default (getdate()), productid int, quantity int)
CREATE TABLE cust2.tbl_history(id int identity(1,1) primary key, dateadded datetime default (getdate()), productid int, quantity int)
 
INSERT INTO cust1.tbl_history(productid, quantity) values (1, 5)
INSERT INTO cust1.tbl_history(productid, quantity) values (2, 7)
INSERT INTO cust2.tbl_history(productid, quantity) values (107, 22)
 
SELECT * FROM cust1.tbl_history
SELECT * FROM cust2.tbl_history
At this point we have two tables with the same name, each in a different schema container, but within a single database. Here is more information about schema containers: http://msdn.microsoft.com/en-us/library/ms189462.aspx
Securing Access
An important feature of schema containers is their support for security. You could easily create a user (mapped to a login account) and grant that user SELECT rights to cust1.  The following statements should be executed against the master database in SQL Azure.
CREATE LOGIN logincust1 WITH PASSWORD = 'p@ssw0rd001'
CREATE LOGIN logincust2 WITH PASSWORD = 'p@ssw0rd002'
Then back in the user database, create two users.
CREATE USER user1 FOR LOGIN logincust1
CREATE USER user2 FOR LOGIN logincust2
 
Finally, you need to authorize each user to execute statements against the history table. We will authorize user1 to use all the objects in schema cust1, and user2 in cust2.
GRANT SELECT, EXECUTE ON SCHEMA::cust1 TO user1
GRANT SELECT, EXECUTE ON SCHEMA::cust2 TO user2
 
At this point, user1 can only select and execute stored procedures in the cust1 schema. user1 cannot access schema cust2.  Here is more information about the Create Login statement: http://msdn.microsoft.com/en-us/library/ee336268.aspx
Customer Accounts vs. Service Accounts
The previous section discusses the creation of logins and users that provide security to each schema container. It should be noted that each login/user account created in the user database should be treated as a service account, not an actual customer account. Indeed, if you create customer accounts in SQL Azure directly you could negatively impact connection pooling, and hence performance. You should authorize users first, using a specific application authentication component (such as ASP.NET Membership), then map the customer account to the service account to use.
You typically implement the mapping (from customer account to service account) using a separate customer account database. A column in your customer account table would store the database connection string, which would contain the service account to use. Here is an example:
CREATE TABLE custmapping(customerid int primary key, custconnection nvarchar(255) NOT NULL)
Note that in a production environment, you would likely encrypt the connection string. The above table does not implement encryption to illustrate a simple scenario.
You would then add a record for each customer:
INSERT INTO custmapping VALUES (1, 'server=....;UID=logincust1;PWD=p@ssw0rd001')
INSERT INTO custmapping VALUES (2, 'server=....;UID=logincust2;PWD=p@ssw0rd002')
 
When a customer logins with their account, your code would read the custmapping table to retrieve the connection string to use for the duration of the session.
Moving A Schema
If a customer grows significantly, or is abusing your SaaS application, you may be facing the need to move that customer's data to a different database so that the other customers (called tenants in SaaS terms) are not affected negatively by the increase in resources needed by that customer.
There are very few options available today to move a single schema container, and its associated objects, from one database to another. You could manually create the schema container and its objects first, then use BCP or SSIS to move the data. However this can be error prone and lengthy.
A tool recently released by Blue Syntax, called Enzo Backup for SQL Azure, provides the ability to backup a single schema. This tool will backup the associated users in addition to the related objects (tables, stored procedures and so forth). The restore process will recreate the schema in the chosen database server and all the objects (and data) in that schema. Here is more information about this tool: http://www.bluesyntax.net/backup.aspx
Multitenant Frameworks
Due to the level of complexity in building multitenant environments, certain companies are turning to specialized frameworks. Although these frameworks can require some learning curve, they provide certain capabilities that would be difficult to build, such as fan-out, caching, and other capabilities. Here are a few .NET frameworks:
·         The Enzo Sharding Library (open-source); also supports Data Federation: http://enzosqlshard.codeplex.com/releases/view/72791
·         The CloudNinja project (open-source): http://cloudninja.codeplex.com/
·         The Enzo Multitenant Framework: http://www.bluesyntax.net/scale.aspx
Schema Separation and Data Federation
Data Federation is an upcoming feature of SQL Azure that will provide a new mechanism to distribute data. In a way, Data Federation allows SaaS vendors to design a monolithic database, in which all customers are located, and distribute records of one or more tables across databases when the time comes. Data Federation is essentially a compressed shard, similarly to schema separation, with the added benefit of tooling support and easier repartitioning based on performance and storage needs. In addition, Data Federation can help you distribute data on almost any dimension; not just customer id.
Data Federation and Schema Separation are not competing solutions for SaaS vendors. Each have specific benefits and challenges. SaaS vendors will need to determine which one serves their needs best, or even use both.
Using both Schema Separation and Data Federation delivers even greater flexibility.  For example, if the history table of customer 2 becomes too large, you could leverage Data Federation to split that table, without affecting customer 1. A specific use of this technique could be to use Data Federation to split tables across databases every year for all customers (or only those that have a lot of records). This dual layering technique (using schema separation for customer records, and data federation for archiving) can deliver remarkable performance results and scalability.
Conclusion
Many vendors have successfully built a SaaS solution using schema separation. While tooling has been lacking, some vendors are adding schema separation support in their products. Understanding the options available at your finger tips, including the frameworks already developed, and how to combine sharding models can give you a significant advantage when building your SaaS solutions.
  

Sharding Library for SQL Azure Data Federation

The ability to write scale out applications for SQL Azure will soon become much easier, thanks to the upcoming SQL Azure Data Federation capabilities. In an earlier post, I outlined specific steps developers can take to prepare for this key enhancement.  Some of the key capabilities of SQL Azure Data Federation is to distribute large data sets across multiple databases, hence providing a mechanism to avoid the current database size limitations of SQL Azure.

While it may be tempting to look at this new feature as a workaround to an existing size limitation, the real benefit of federating data is to give each data set its own set of independent processing power; indeed each federation member is assigned its own CPU, storage, memory, TempDB and network bandwidth. In other words, Data Federation provides the basic foundation for a scale out database model, typically needed by SaaS applications and historical data sets.

Distributed Needs

Since data stored in federation members are actually stored across multiple databases, certain capabilities taken for granted in typical development environments are not available just yet.  And with this type of new features, new needs are surfacing.  In a nutshell, some of the capabilities currently missing include:

-          The ability to fan out queries when running a query across federation members

-          The ability to perform basic joins across federation members to exclude data sets

-          The ability to execute simple aggregated functions such as MIN, MAX... across federated members

Since these capabilities are not yet supported by the Data Federation feature, it is necessary to leverage parallel processing on the client (or middle-tier) application in order to efficiently query across federation members (i.e. databases).

Open-Source Enzo Library

In order to provide support the above capabilities across federation members, I updated the Enzo Shard Library open-source project (http://enzosqlshard.codeplex.com/releases/view/72791), which is currently in Version 2.0 Beta. This updated library offers new important features, including support for Data Federation. A new class (called the Distributed Query) allows developers to execute a simple SELECT query across federation members. For example, a Distributed Query that fetches customer data across federation members (hence databases) to return the Maximum CustomerID would look like this:

SELECT MAX(CustomerId) USING (select customerid from customers) FEDERATED ON (customerfederation)

The above statement is interpreted by the Enzo library; the inner statement (select customerid from customers) is the T-SQL part that is actually executed across each federation member in parallel against SQL Azure. Then the library aggregates the results and returns the maximum customer id from all the databases involved in the federation (see the details on CodePlex for a complete specification of the Distributed Query).

There are many new features available in this library, which will be the subject of my talk at the PASS Conference 2011 in October (more information).

 

Twitter