News

Tweets













The SSIS Expression Way–Skipping an Unwanted File

Yesterday at the MSDN SSIS Forum somebody asked a question if it is possible to skip a particular file conditionally while keep processing the others.

Such a situation arises in cases when a file mask/wildcard set for the file names to pick for processing in a ForEach Loop container collects extra files.

For example, if we have a folder with files as in the picture below, and we do not want to process (for a specific reason) say File2.txt, our only option to specify the files eligible for processing would be File*.txt only (alas, the Collection’s property page “Files:” text box does not support Regular Expressions).

image

So what is the solution? I proposed to implement the proper conditional processing. How this can be done? The answer is by using the Precedence Constraints.

As an aside, the Precedence Constraint is an often overlooked entity, but a very powerful decision control mechanism in SSIS, yet it seems that many SSIS developers tend to avoid or dislike it.

OK, so let’s tackle the package up!

To start off, let‘s create a SSIS project, add or modify the existing/created package by dragging and dropping the ForEach Loop container in which will set the file mask for the files to be collected. It may look like below in BIDS/Visual Studio 2008:

SNAGHTML3be511

At this point let’s map the file name picked to a new package scope variable

image

We will also need one more variable to keep the name of the file to exclude:

SNAGHTML3f428c

( only the bottom variable is relevant)

By the way, this approach also permits to save this variable to a configuration file to get the ability of changing its value after the package is deployed, but involving a config file is beyond the scope of this post.

Next step is to continue adding components to the package and at this point it will up to what needs to be achieved by the package, so in my case for the sake of simplicity I will use a Script Task (to just serve the purpose of the processing inception) which will display and/or output the currently processed file to the Output window of BIDS and two Data Flow Tasks for each category of files (those that need to be processed and the one that not).

In short I was able to arrive to a design like this:

image

I will share for the sake of completeness what is in the Script Task to just let you know where the file names get generated:

   1:  public void Main()
   2:          {
   3:              // Uncomment for a visual representation
   4:              // MessageBox.Show (Dts.Variables["pkVarCurrentFile"].Value.ToString());
   5:   
   6:              // Log the file name being processed to the Output Window
   7:              bool fireAgain = true;
   8:              this.Dts.Events.FireInformation(1, "", "Current file: " + Dts.Variables["pkVarCurrentFile"].Value.ToString(), String.Empty, 0, ref fireAgain);
   9:              
  10:              Dts.TaskResult = (int)ScriptResults.Success;
  11:          }

To display a graphical message box one should un-comment the line #4, but I find more SSISish to use the FireInformation function.

Now is the main part: the Precedence Constraint logic. If you remember we used two variables. One is to store the currently processed file and another is the one with the file name to exclude: pkVarCurrentFile and pkVarFilesToSkip.

What goes into the Precedence Constraints is this expression:

FINDSTRING( @[User::pkVarCurrentFile],@[User::pkVarFilesToSkip],1 ) < 1

Into the one leading the DFT that will process the file, and this one into the one that will not:

FINDSTRING( @[User::pkVarCurrentFile],@[User::pkVarFilesToSkip],1 ) >= 1

As you can see only the boolean condition is reversed. That’s all! See a picturised representation:

SNAGHTML50f789

At this point we are ready to test!

I added a breakpoint to the DFT that supposed to not to process the unwanted file (the File2.txt if you remember). Such a DFT can merely be empty, but I chose to use a single Trash Component.

So to make sure we do process the needed files we do a test run:

image

And our Script Task is reporting:

SSIS package "DontProcessUnwantedFiles.dtsx" starting.
Information: 0x1 at Show the current file name in Output Window: Current file: C:\Temp\FilesToProcess\File1.txt
Information: 0x4004300A at Process this file, SSIS.Pipeline: Validation phase is beginning.
Information: 0x4004300C at Process this file, SSIS.Pipeline: Execute phase is beginning.
Information: 0x4004300B at Process this file, SSIS.Pipeline: "component "Trash Destination" (1)" wrote 0 rows.
Information: 0x1 at Show the current file name in Output Window: Current file: C:\Temp\FilesToProcess\File1_2.txt
Information: 0x4004300A at Process this file, SSIS.Pipeline: Validation phase is beginning.
Information: 0x4004300C at Process this file, SSIS.Pipeline: Execute phase is beginning.
Information: 0x4004300B at Process this file, SSIS.Pipeline: "component "Trash Destination" (1)" wrote 0 rows.
Information: 0x1 at Show the current file name in Output Window: Current file: C:\Temp\FilesToProcess\File1_3.txt
Information: 0x4004300A at Process this file, SSIS.Pipeline: Validation phase is beginning.
Information: 0x4004300C at Process this file, SSIS.Pipeline: Execute phase is beginning.
Information: 0x4004300B at Process this file, SSIS.Pipeline: "component "Trash Destination" (1)" wrote 0 rows.
Information: 0x1 at Show the current file name in Output Window: Current file: C:\Temp\FilesToProcess\File2.txt
SSIS breakpoint 'Break when the container receives the OnPreExecute event' at executable 'Skip this file' has been hit

Excellent. We have just hit the breakpoint, and at that stage the DFT was processing the needed files.

Now we let the package continue and …

Information: 0x1 at Show the current file name in Output Window: Current file: C:\Temp\FilesToProcess\File2.txt
SSIS breakpoint 'Break when the container receives the OnPreExecute event' at executable 'Skip this file' has been hit
Information: 0x4004300A at Skip this file, SSIS.Pipeline: Validation phase is beginning.
Information: 0x4004300C at Skip this file, SSIS.Pipeline: Execute phase is beginning.
Information: 0x4004300B at Skip this file, SSIS.Pipeline: "component "Trash Destination" (1)" wrote 0 rows.
Information: 0x1 at Show the current file name in Output Window: Current file: C:\Temp\FilesToProcess\File2_1.txt
Information: 0x4004300A at Process this file, SSIS.Pipeline: Validation phase is beginning.

voilà, the unwanted file was skipped!

Wednesday, January 11, 2012 11:51 PM

Feedback

# re: The SSIS Expression Way–Skipping an Unwanted File

Hi, Very nice and useful information shared, this blog is very good to acknowledge yourself and to remain updated, especially your writing style is very attractive, keep it up.

---------------------

PokerStars Review 1/12/2012 12:45 AM | angelina

# re: The SSIS Expression Way–Skipping an Unwanted File

This blog is very good...

-------

çilingir 1/14/2012 2:46 PM | tommy

Post A Comment
Title:
Name:
Email:
Comment:
Verification: