Shaun Xu

The Sheep-Pen of the Shaun


News

logo

Shaun, the author of this blog is a semi-geek, clumsy developer, passionate speaker and incapable architect with about 10 years experience in .NET. He hopes to prove that software development is art rather than manufacturing. He's into cloud computing platform and technologies (Windows Azure, Aliyun) as well as WCF and ASP.NET MVC. Recently he's falling in love with JavaScript and Node.js.

Currently Shaun is working at IGT Technology Development (Beijing) Co., Ltd. as the architect responsible for product framework design and development.

MVP

My Stats

  • Posts - 97
  • Comments - 345
  • Trackbacks - 0

Tag Cloud


Recent Comments


Recent Posts


Archives


Post Categories



This is the second post about SQL Azure Federation. In my first post I described a little bit about the theory of data partitioning, included the different between the horizontal partitioning and the vertical partitioning. I also talked about the features need to be done when we implemented the horizontal partitioning. And finally, I described some basic concept about SQL Azure Federation.

In this post, I will demonstrate how to use SQL Azure Federation in the SQL Server Management Studio (SSMS). I will also demonstrate the benefit of using SQL Azure Federation to build the multi-tenant data solution.

 

Create Federation and Tables

SQL Azure Federation is officially available for every Windows Azure data centers. There no additional register or sign up process. When we create a new server or database in SQL Azure, we can use SQL Azure Federation. For example, I had created a new database in my SQL Azure server which located in East Asia data center. Then I can open my SSMS and connect to this SQL Azure server and use SQL Azure Federation.

Do not forget to set the SQL Azure firewall before you connect to the server from local tools such as SSMS.

In this post I would like to take a very simple scenario as the example. Assuming we have a multi-tenant CRM system which contains accounts, contacts for each tenants. We also have some lookup data such as countries and titles. We also have one table contains some metadata for each tenants. So the database diagram would be like this.

Drawing1

In SQL Azure Federation, if we decided to split some tables we must have the federation ready, then create the tables that wanted to be federated based on the federation. There is no way in SQL Azure Federation to switch tables between the federated and non-federated or reference. So before we start to create the tables we need to firstly define the federation.

Since we decided to have the tenant ID as the key when splitting the database, the federation distribution should be INT type, which is the same as the type of TenantID in the tables.

In order to create the federation, in SSMS connect to the database and execute the T-SQL below.

   1: CREATE FEDERATION Tenant_Fed (TID INT RANGE)
   2: GO

There’s nothing special happened after we executed this T-SQL in SSMS. But in SQL Azure fabric, it created a database which represent our first federation member. The federation member database name was specified by SQL Azure which was a GUID with “system-” prefixed. And the original database, which we created and are connecting now, became the root database.

image

After that, all tables in our system should be in one of these three types.

Table Type Description Where
Federated Table Tables that will be split based on the federation and their federation columns. Federation Members
Reference Table Tables that will not be split, but should be referred by the federated table. These tables will be copied across all federation members. Federation Members
Centre Table Tables that will not be split, and very rarely join-select with the tables in federation members. Federation Root

For the federated table and reference table, we need to create them in a federation member, so that when we split this member, SQL Azure Federation will help us to create them in the new member. In order to connect to a federation member we need to execute the T-SQL below.

   1: USE FEDERATION Tenant_Fed (TID = 0) WITH RESET, FILTERING = OFF
   2: GO

If you are familiar with SQL Azure you should be already know that the USE keyword doesn’t work in SQL Azure. But if we specify the FEDERATION keyword and the federation name it will switch our connection to the federation member database, which contains the distribution key value we specified here (TID = 0). And if we connected to this federation member we can retrieve the database name by using the db_name() function. As you can see in my workstation my first federation member database name was “system-06d60081-6737-413e-85b1-df65cb55f1c9”.

image

Now we have been connecting to the federation member and the next step is to create the federated tables. Based on our design the Account and Contact table should be partitioning by their TenantID column so let’s create them by the following T-SQL.

   1: CREATE TABLE [dbo].[Account](
   2:     [ID] [int] NOT NULL,
   3:     [TenantID] [int] NOT NULL,
   4:     [Name] [varchar](50) NOT NULL,
   5:     [CountryID] [int] NOT NULL,
   6:  CONSTRAINT [PK_Account] PRIMARY KEY CLUSTERED 
   7: (
   8:     [ID] ASC
   9: )WITH (STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF)
  10: )
  11: FEDERATED ON (TID = TenantID)
  12: GO

In the T-SQL above we are going to create the Account table with fields and primary key. And at the end of the statement we defined that the TenantID column is the federated column of this table. This means, when SQL Azure Federation split the table it will look up the value of TenantID, decided which rows should be in which member.

But when we executed we will get an error that the primary key index could not be created since the index doesn’t contain the federated column.

image

This is a limitation in SQL Azure Federation. The federated column in a federated table must be the clustered index, or be in the clustered index. Since we defined the ID as the primary key of the Account table, to satisfied this limitation we must include the TenantID as the primary key as well. So our T-SQL should be like this.

   1: CREATE TABLE [dbo].[Account](
   2:     [ID] [int] NOT NULL,
   3:     [TenantID] [int] NOT NULL,
   4:     [Name] [varchar](50) NOT NULL,
   5:     [CountryID] [int] NOT NULL,
   6:  CONSTRAINT [PK_Account] PRIMARY KEY CLUSTERED 
   7: (
   8:     [ID] ASC,
   9:     [TenantID] ASC
  10: )WITH (STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF)
  11: )
  12: FEDERATED ON (TID = TenantID)
  13: GO

And similarly, in Contact table we also need to add the TenantID column as the primary key. We also need to amend the foreign keys as well. The T-SQL should be like this.

There is no constraint that the federation column name should be the same in each federated tables. We can specify the federation key to the different columns that has different names in each federated tables when created the tables. For example in Account table we link TID = TenantID while in Contact table we link TID = Tenant_ID or TID = T_ID, etc..

   1: CREATE TABLE [dbo].[Contact](
   2:     [ID] [int] NOT NULL,
   3:     [TenantID] [int] NOT NULL,
   4:     [AccountID] [int] NOT NULL,
   5:     [Name] [varchar](50) NOT NULL,
   6:     [TitleID] [int] NOT NULL,
   7:  CONSTRAINT [PK_Contact] PRIMARY KEY CLUSTERED 
   8: (
   9:     [ID] ASC,
  10:     [TenantID] ASC
  11: )WITH (STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF)
  12: )
  13: FEDERATED ON (TID = [TenantID])
  14: GO
  15:  
  16: ALTER TABLE [dbo].[Account]  WITH CHECK ADD  CONSTRAINT [FK_Account_Country] FOREIGN KEY([CountryID])
  17: REFERENCES [dbo].[Country] ([ID])
  18: GO
  19: ALTER TABLE [dbo].[Account] CHECK CONSTRAINT [FK_Account_Country]
  20: GO
  21: ALTER TABLE [dbo].[Contact]  WITH CHECK ADD  CONSTRAINT [FK_Contact_Account] FOREIGN KEY([AccountID], [TenantID])
  22: REFERENCES [dbo].[Account] ([ID], [TenantID])
  23: GO
  24: ALTER TABLE [dbo].[Contact] CHECK CONSTRAINT [FK_Contact_Account]
  25: GO
  26: ALTER TABLE [dbo].[Contact]  WITH CHECK ADD  CONSTRAINT [FK_Contact_Title] FOREIGN KEY([TitleID])
  27: REFERENCES [dbo].[Title] ([ID])
  28: GO
  29: ALTER TABLE [dbo].[Contact] CHECK CONSTRAINT [FK_Contact_Title]
  30: GO

Now we have the federated tables created in our first federation member. Next let’s create the reference tables. In this case the reference tables are County and Title. They don’t have the column related with the federation distribution key, which is the tenant ID in our example. But they need to be selected alone with the federated tables. For example we might need to retrieve the account information for a tenant with the country name. So they need to be added as reference tables.

To create a reference table in a federation member would be the same as what we did on a normal database, no need to specify the federated column in the CREATE TABLE statement.

   1: CREATE TABLE [dbo].[Title](
   2:     [ID] [int] NOT NULL,
   3:     [Title] [varchar](50) NOT NULL,
   4:  CONSTRAINT [PK_Title] PRIMARY KEY CLUSTERED 
   5: (
   6:     [ID] ASC
   7: )WITH (STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF)
   8: )
   9: GO
  10:  
  11: CREATE TABLE [dbo].[Country](
  12:     [ID] [int] NOT NULL,
  13:     [Country] [varchar](50) NOT NULL,
  14:  CONSTRAINT [PK_Country] PRIMARY KEY CLUSTERED 
  15: (
  16:     [ID] ASC
  17: )WITH (STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF)
  18: )
  19: GO
  20:  
  21: ALTER TABLE [dbo].[Account]  WITH CHECK ADD  CONSTRAINT [FK_Account_Country] FOREIGN KEY([CountryID])
  22: REFERENCES [dbo].[Country] ([ID])
  23: GO
  24: ALTER TABLE [dbo].[Account] CHECK CONSTRAINT [FK_Account_Country]
  25: GO
  26:  
  27: ALTER TABLE [dbo].[Contact]  WITH CHECK ADD  CONSTRAINT [FK_Contact_Title] FOREIGN KEY([TitleID])
  28: REFERENCES [dbo].[Title] ([ID])
  29: GO
  30: ALTER TABLE [dbo].[Contact] CHECK CONSTRAINT [FK_Contact_Title]
  31: GO

And in the T-SQL above we also added the foreign keys between the reference tables and federated tables. It’s possible to add the foreign keys between the federated tables, like what we added between the Account and Contact. It’s possible to add the foreign key from the federated table to reference table, like the key between Account and Country. But it’s not allowed to add the foreign key from a non-federated table to a federated table. For example we cannot add a foreign key between the Account to a table in the federation root.

Now we have all tables and keys ready in our first federation member. Let’s add some sample data.

   1: INSERT INTO Country VALUES (1, 'China')
   2: INSERT INTO Country VALUES (2, 'US')
   3: INSERT INTO Country VALUES (3, 'UK')
   4:  
   5: INSERT INTO Title VALUES (1, 'Mr')
   6: INSERT INTO Title VALUES (2, 'Ms')
   7:  
   8: INSERT INTO Account VALUES (1, 1, 'Tenant 1 - Account 1', 1)
   9: INSERT INTO Account VALUES (2, 1, 'Tenant 1 - Account 2', 2)
  10: INSERT INTO Account VALUES (3, 2, 'Tenant 2 - Account 3', 3)
  11: INSERT INTO Account VALUES (4, 2, 'Tenant 2 - Account 4', 1)
  12: INSERT INTO Account VALUES (5, 3, 'Tenant 3 - Account 5', 2)
  13: INSERT INTO Account VALUES (6, 3, 'Tenant 3 - Account 6', 3)
  14: INSERT INTO Account VALUES (7, 4, 'Tenant 4 - Account 7', 1)
  15: INSERT INTO Account VALUES (8, 4, 'Tenant 4 - Account 8', 2)
  16: INSERT INTO Account VALUES (9, 5, 'Tenant 5 - Account 9', 3)
  17: INSERT INTO Account VALUES (10, 5, 'Tenant 5 - Account 10', 1)
  18:  
  19:  
  20: INSERT INTO Contact VALUES (1, 1, 1, 'Tenant 1 - Account 1 - Contact 1', 1)
  21: INSERT INTO Contact VALUES (2, 1, 1, 'Tenant 1 - Account 1 - Contact 2', 2)
  22: INSERT INTO Contact VALUES (3, 1, 1, 'Tenant 1 - Account 1 - Contact 3', 1)
  23: INSERT INTO Contact VALUES (4, 1, 2, 'Tenant 1 - Account 2 - Contact 4', 2)
  24: INSERT INTO Contact VALUES (5, 1, 2, 'Tenant 1 - Account 2 - Contact 5', 1)
  25: INSERT INTO Contact VALUES (6, 1, 2, 'Tenant 1 - Account 2 - Contact 6', 2)
  26: INSERT INTO Contact VALUES (7, 2, 3, 'Tenant 2 - Account 3 - Contact 7', 1)
  27: INSERT INTO Contact VALUES (8, 2, 3, 'Tenant 2 - Account 3 - Contact 8', 2)
  28: INSERT INTO Contact VALUES (9, 2, 3, 'Tenant 2 - Account 3 - Contact 9', 1)
  29: INSERT INTO Contact VALUES (10, 2, 4, 'Tenant 2 - Account 4 - Contact 10', 2)
  30: INSERT INTO Contact VALUES (11, 2, 4, 'Tenant 2 - Account 4 - Contact 11', 1)
  31: INSERT INTO Contact VALUES (12, 2, 4, 'Tenant 2 - Account 4 - Contact 12', 2)
  32: INSERT INTO Contact VALUES (13, 3, 5, 'Tenant 3 - Account 5 - Contact 13', 1)
  33: INSERT INTO Contact VALUES (14, 3, 5, 'Tenant 3 - Account 5 - Contact 14', 2)
  34: INSERT INTO Contact VALUES (15, 3, 5, 'Tenant 3 - Account 5 - Contact 15', 1)
  35: INSERT INTO Contact VALUES (16, 3, 6, 'Tenant 3 - Account 6 - Contact 16', 2)
  36: INSERT INTO Contact VALUES (17, 3, 6, 'Tenant 3 - Account 6 - Contact 17', 1)
  37: INSERT INTO Contact VALUES (18, 3, 6, 'Tenant 3 - Account 6 - Contact 18', 2)
  38: INSERT INTO Contact VALUES (19, 4, 7, 'Tenant 4 - Account 7 - Contact 19', 1)
  39: INSERT INTO Contact VALUES (20, 4, 7, 'Tenant 4 - Account 7 - Contact 20', 2)
  40: INSERT INTO Contact VALUES (21, 4, 7, 'Tenant 4 - Account 7 - Contact 21', 1)
  41: INSERT INTO Contact VALUES (22, 4, 8, 'Tenant 4 - Account 8 - Contact 22', 2)
  42: INSERT INTO Contact VALUES (23, 4, 8, 'Tenant 4 - Account 8 - Contact 23', 1)
  43: INSERT INTO Contact VALUES (24, 4, 8, 'Tenant 4 - Account 8 - Contact 24', 2)
  44: INSERT INTO Contact VALUES (25, 5, 9, 'Tenant 5 - Account 9 - Contact 25', 1)
  45: INSERT INTO Contact VALUES (26, 5, 9, 'Tenant 5 - Account 9 - Contact 26', 2)
  46: INSERT INTO Contact VALUES (27, 5, 9, 'Tenant 5 - Account 9 - Contact 27', 1)
  47: INSERT INTO Contact VALUES (28, 5, 10, 'Tenant 5 - Account 10 - Contact 28', 2)
  48: INSERT INTO Contact VALUES (29, 5, 10, 'Tenant 5 - Account 10 - Contact 29', 1)
  49: INSERT INTO Contact VALUES (30, 5, 10, 'Tenant 5 - Account 10 - Contact 30', 2)

After executed these T-SQL we had put all data in our first federation member. It contains two references tables (Country, Title) and two federated tables (Account, Contact), and some foreign keys between the reference table and federated table.

image

And we can select the data in this federation member with the tables joining, for example the T-SQL below will list all accounts and contacts information.

   1: SELECT Account.Name, Country.Country, Contact.Name, Title.Title FROM Contact
   2: LEFT JOIN Account ON AccountID = Account.ID
   3: LEFT JOIN Country ON Account.CountryID = Country.ID
   4: LEFT JOIN Title ON Contact.TitleID = Title.ID

 

Split Federation Member

The key feature of SQL Azure Federation is to split a federation member into two based on the federated value specified without any downtime. Since we have inserted some data in our first federation member, let’s split it into two.

Split a federation member is very easy. We use the ALTER FEDERATION command and specify from what value of the distribution key to split. For example, since the federation distribution key is the TID (tenant ID), we will split all federated tables based on their federated column value at 3. This means all data that the tenant ID is less than 3 will be in the federation member 1 while others will be in the federation member 2.

In order to split the federation we need to firstly connect to the federation root, and then execute the ALTER FEDERATION command specifying the boundary value.

   1: USE FEDERATION ROOT WITH RESET
   2: GO
   3:  
   4: ALTER FEDERATION Tenant_Fed SPLIT AT (TID = 3)
   5: GO

When execute this command the SQL Azure engine will perform the operations listed below.

  • First, it will configure the federation metadata information on the federation root, to indicate that the federation should be split.
  • Then it will create two databases for the new federation members. Please note that even though we have had a federation member already when executing the split command, SQL Azure Federation will NOT use this member in the future. It will create two new members.
  • SQL Azure will create the tables, keys and constraints, etc. from the current federation member to the new members.
  • It will copy the records in the reference tables to new members.
  • It will copy the records in the federated tables to new members based on the boundary value specified. Records with the federation column value less than the boundary value will be put into the low member while those equals or more than the boundary value will be put into the high member.
  • Finally, SQL Azure will re-map the connections from the original federation member to the new members and drop the original member.

After the split command finished we can connect one federation member by using the USE FEDERATION command. In SQL Azure Federation we cannot connect to a member database by specifying its database name, instead we need to tell the SQL Azure Federation the value of the distribution key we want to us, then SQL Azure Federation will rout us to that member database. For example, we split our data at tenant ID = 3, then if we want to read the data of tenant 4 we can use the T-SQL below.

   1: USE FEDERATION Tenant_Fed (TID = 4) WITH RESET, FILTERING = OFF
   2: GO

Then execute the command we had just preformed before to see the data in this federation member.

   1: SELECT Account.Name, Country.Country, Contact.Name, Title.Title
   2: FROM Contact
   3: LEFT JOIN Account ON AccountID = Account.ID
   4: LEFT JOIN Country ON Account.CountryID = Country.ID
   5: LEFT JOIN Title ON Contact.TitleID = Title.ID

And as you can see, the SSMS returned the data that the tenant ID is equal or more than 4.

image

And if we want to see the data in the first member, we can just specify the TID = 0, 1 or 2, any value less than the boundary value of that federation member.

image

Connect to Atomic Unit

When we connected to a federation member we use the USE FEDERATION command. There’s a clause in this T-SQL statement which is FILTERING = OFF. In the example above we can select all data in the federation member unless we didn’t put any criteria in WHERE clause in SELECT command. This is because we utilized FILTERING = OFF. It will make the connection scope set to the full range covered by the federation member containing the specified key value. The connection behaves the same if it were connected to the member through its physical federation member name (the database name).

SQL Azure Federation provides another valuable feature that can force the connection scope set to the federation key instance (federation key value) in a federation member rather than to the full range of federation member, by using FILTERING = ON clause.

image

So let’s have a try and to see what will happen if we set the FILTERING = ON. At this time we still specify the SSMS to use federation where the key equals to 2 but specify the FILTERING = ON, and select the data without any criteria.

   1: USE FEDERATION Tenant_Fed (TID = 2) WITH RESET, FILTERING = ON
   2: GO
   3:  
   4: SELECT Account.Name, Country.Country, Contact.Name, Title.Title
   5: FROM Contact
   6: LEFT JOIN Account ON AccountID = Account.ID
   7: LEFT JOIN Country ON Account.CountryID = Country.ID
   8: LEFT JOIN Title ON Contact.TitleID = Title.ID

Now we can see that only the records that the TenantID = 2 were returned. We didn’t provide anything in the WHERE clause in SELECT command but it worked as if we specified WHERE TenantID = 2. This is the benefit that using the FILTERING = ON in the USE FEDERATION command.

image

We mentioned in the previous post, in SQL Azure Federation all records that related with the same federation key value in a federation member is called an Atomic Unit. In the case above we connected to the federation member’s atomic unit which the value is TenantID = 2, by using the FILTERING = ON clause.

This feature is very useful especially when we need to migrate a single tenant application to multi-tenant style. As you know we might need to add the tenant identity column in the tables that tenant-award when migrating, but this might lead us to rewrite all related SQL scripts. For example if we have a SQL script that retrieves all data in table Order, it might look like this in single tenant application.

   1: SELECT Order.ID, Order.Name FROM Order

But if we implemented the multi-tenant feature we have to add the TenantID column in this table to indicate which tenant the order is. And accordingly we have to rewrite the SQL script if a participant tenant user is currently logging on.

   1: SELECT Order.ID, Order.Name FROM Order
   2: WHERE Order.TenantID = @UserTenentID

Assuming that we have 50 tables that are tenant-award and 10 scripts for each table. Then we need to amend 50 * 10 = 500 SQL scripts. And almost all changes are very similar.

But if we are using SQL Azure Federation we can make the tenant ID as the federation distribution key and let the tables split based on their TenantID column. Then if we have an user logged in under a tenant, we can invoke the USE FEDERATION statement specifying the TenantID value with FILTERING = ON before any data command, then all following SQL scripts will no need to be changed since the SQL Azure Federation helped us to filter the connection scope to this tenant. This can be done very easily by using AOP.

 

Summary

In this post I demonstrated how to use SQL Azure Federation. We talked about how to create federation, federation members and tables. And we demonstrated how to split a federation member, which is horizontal data partitioning without any downtime.

I also talked about the feature of FILTERING = ON in USE FEDERATION statement and the benefit in multi-tenant solution.

SQL Azure Federation is a very powerful tool for us to build scale-out and/or multi-tenant application. In the coming few posts I would like to discuss the metadata system of the federation, the pricing and how to split the federation members based on the data size or record count.

And I would like to introduce on how to implement our own horizontal partitioning data access layer without SQL Azure Federation available.

 

Hope this helps,

Shaun

All documents and related graphics, codes are provided "AS IS" without warranty of any kind.
Copyright © Shaun Ziyan Xu. This work is licensed under the Creative Commons License.