I was recently on a project that required me to move SharePoint content to SQL Database. As I searched the best way to do this, I thought of a timer job that would run nightly and move the data to SQL, a co-worker who is a SQL expert suggested I use SSIS to move the data over. I started searching online and came up with a couple of options the most notable being the technet article. The problem with the example on technet is that it uses a custom connector that connect one Document library or List to a SQL table. I was looking to capture data from many Lists not just one. I also did not know how many as that number of lists was dynamic. The following is the solution implemented using SSIS and SharePoint.
Step 1: Understand the moving parts
My SharePoint environment was a stand-alone installation using SQL Server 2008 and Windows Server 2008 both in one box. I needed to move the content over to another SQL server on the network but not on the same machine as that of SharePoint. I installed SSIS on the SQL Server that was on my stand alone installation of SharePoint. That SSIS package would would run on my SharePoint box and in fact be run by a SQL agent on a timed basis.
Step 2: Install the required
In order to implement the same solution you will need to install SSIS on the SharePoint box. You can do this by either going to Add/Remove programs on the Stand Alone installation and edit the SQL 2008 instance installed. Follow the wizard and install SSIS. You can also do this by using the original ISO or DVD and using the installation wizard.
My development environment also has Visual Studio 2008 and I think that as part of the SSIS installation I had access to SSIS templates, or where they available as part of VS2008? Anyway, if you have installed SSIS you will have access to SSIS templates.
Step 3: Add all entries into Config file
As I finished developing the SSIS package I had hardcoded all the connection strings, and any other essential information to complete the data migration. It is a good Idea to start thinking about how you will be able to quickly deploy this SSIS package from dev to UAT to Prod environments. There are 2 things that work hand in hand here, "Variables" and "Package Configurations".
Variables - are simply that, variables, you will use through out your SSIS package which can be accessed by following the instructions in figure 1.
Package Configurations - Access package configurations as shown in the figure 2. Once you have finished developing the SSIS package you will need to configure the config. A wizard will guide you through which fields you want to promote as configurations and also what config file you want to use. You will create your own and save it along with your project, like shown in figure 3. This way you will be able to edit the config values right through visual studio and it exists within the project.
Step 4: Access SharePoint content
In order to get at the content you want to export I had to write code that accessed my SharePoint data. Now remember, this code included in SSIS package will run on the SharePoint server, therefore you can use the SharePoint object model to get at the data. The important part isn't so much accessing the content but binding it to an SSIS action that will allow us to manipulate and save it to SQL.
First, switch to "Data Flow" and add a data flow transformation item that allows you to write custom code. I used Script Component.
Second, make sure the variables you have created and configured can be used in this Script component. In my example I used Provinces and configured a custom string that i could use in code, but was not hard coded. Under Custom properties in Figure 4, you will see all the read only variables I have allowed this script to have access to. By clicking on the Read only variables field you will get to another screen that allows you to check off(Include) variables from the available variables on the package.
Third, Create columns you want your data to bind to. Basically you will be building a table with all your data. Figure 5 shows you the Input/Output section, Make sure you configure the datatype, of your column and make sure you add the right type of data to the column in the custom code.
Fourth, you now need to write the code and bind to the right columns. Access the Script by clicking the "Edit Script" button that is shown in Figure 4. This will launch another Visual Studio instance and this is where you will write the code that accompanies the script component. You will have to do the usual and include the Microsoft.SharePoint.dll as a reference and another thing I did because my code uses Lambda functions is made sure that the target source was .Net 3.5. This is configured in the properties of the project that just opened up. The point I wan to make is not so much the SharePoint code but how to bind to to the columns and how to access the variables you have made available to the Script Component.
Access Output Columns In code: When you create a script component you will have 3 override methods included in code, "PreExecute", "PostExecute", and "CreateNewOutputRows". The method you will really be working with is "CreateNewOutputRows". In order to create rows you will need to write the following code:
and then to access the rows you will need to following code:
MyOutputBuffer.Name = "Whatever Name you want";
then to create a new row you will need to write the above code again. Fill up the output buffer with as many rows as you want.
Access the variables in code: In order to access the variables you have made available simply access them through the variable object. The following lines of code are example:
Step 5: Export SharePoint content to SQL
Once you have your data in output columns as part of the script component you can move that content to SQL. Before you do that, you might need to do a couple of things, I had to do "Data Conversion". There might be times where the output column does not match the type that is created in SQL. I had times where the package would run properly and it would tell me as part of logs that new rows were created in SQL but at the end of the day nothing was added to sql. This was because some of the output columns created were not of the same type. I also had a couple Lookup columns , but that's up to you to explore.
To add to SQL, use the action "OLE DB Destination" under Data Flow Destinations. You should keep in mind that now that a SQL Connection is created through the settings in the action, you now have new fields you can add to the config settings described in Step 2. I you have not created the config section, then you can do that now and you will have the ability to include the SQL Connection as a configurable value. If you have already created the config file, you can EDIT the existing config file to include the new SQL Connection and any other value you are using.
Integration Services is a very handy way to move SharePoint data over from SharePoint to SQL. The SharePoint object model can still be used to access SharePoint content but the SSIS package will need to executed on the SharePoint server. If the SSIS package was needed to run remotely, Web services can be used form the script component and bound to the right columns. I Know this blog post is not as detailed as you might like but it does highlight some of the things you need to do to build and configure an SSIS package. I also configured this SSIS package to run an a nightly basis with a SQL Agent. Deployment was not covered in this Blog post. If anyone would like some help with the deployment and configuring the agent, let me know and I'll write a post with details.