The sum of all SQL Server matter that exist,
and the space in which all SQL Server events
occur or could occur.
Welcome to SQL Server Universe.com Sign in | Join | Help
Home SS SLUG Forums Articles Photos Downloads

Nuggets from My SSIS Experience by Dinesh Priyankara





























































































































































































































  

[ ABOUT THE AUTHOR ]

I have been working on a SQL Server Integration Project for the last few months, and it made me perform a whole lot of research and also gave me a lot of good experiences. I thought about writing this note based on some of the issues I faced and how I solved them. This article lists three issues and the solutions I applied to each of them. One thing I need to mention, is that the solutions described may not be the best solution to the issue. I am sure a lot of you guys have addressed these kinds of issues before, hence I look forward on seeing your feedback or alternative solutions. The issues listed are;

Issue #1: Continue the execution flow after handling errors

Initially I faced this problem with "Foreach Loop Container", but this is applicable for any situation. The "Foreach Loop Container" traverses a folder and accesses the files available. Tasks inside the loop validate the files with certain procedures and record them in the database, and lot more. What I wanted to configure was, continue with all files even after an error occurred for the particular file. After spending some time, I figured it out that how to implement it. I found some interesting points too. Here is the way I did it.

Let's try to simulate the issue. Follow below steps;

  • Create a SQL Server Integration Services project and open the default package. Add a "Foreach Loop Container" to the "Control Flow".

  • Set the Enumerator of the "Foreach Loop Container" as Foreach File Enumerator. Set a folder (in may case, it is "Folder I".) to access that has at least three files (again, in my case, files are TextFile1.txt, TextFile2.txt and TextFile3.txt).

  • Select Fully qualified for the Retrieve file name. Do not check the Traverse sub folders.

  • Go to the Variable Mappings and set the Index 0 to new string type variable. Name the variable as FilePath.

  • Create a new string type variable  and name it as NewFilePath.

  • Add a "Script Task" within the "Foreach Loop Container" and name it as Set the NewFilePath variable.

  • Open the "Script Task Editor" and go to "Script" section. Add User::FilePath as read-only variable and User::NewFilePath as read-write variable. Add below script to the task. It sets the destination folder as "Folder II".

    ' Setting the destination folder to the file. The folder "D:\TestFolder II" is the destination folder.
    Dts.Variables("User::NewFilePath").Value = "D:\TestFolder II\" & Dts.Variables("User::FilePath").Value.ToString().Substring(Dts.Variables("User::FilePath").Value.ToString().LastIndexOf("\") + 1)

  • Add a "File System Task" into the "Foreach Loop Container" and connect it with above "Script Task".  Name it as Copy file to the destination. Set the operation as Copy File. Set SourceVariable as User::FilePath and DestinationVariable as User::NewFilePath. Make sure that property OverwriteDestination set as False.

  • Add another "Script Task" into the loop and connect it with "File System Task". Script it, just to show a message saying "File is successfully transferred.". Now your package looks like this;

Done. If you execute the package now, it should copy all the files available in the "Folder I" to "Folder II". Execute and make sure it runs without any problems. Once copied, delete all files in the "Folder II" except "TextFile2.txt". Execute the package again. Here is the result;

            

You get a success message for the TextFile1.txt and then it stops because it cannot copy the TextFile2.txt. It exists in the "Folder II". This is the kind of issue I faced. I wanted to continue the execution with rest of files, in this case, it should copy the TextFile3.txt even after the error occurred with the TextFile2.txt. This is how I started addressing it;

  • I added the another "Script Task" that handles the error and connected it with "Copy file to the destination". The precedence constraint set as "Failure". The newly added "Script Task" displays the error message (Originally, it should be used for logging the error.).

  • Run the package again. Make sure that the "Folder II" contains only "TextFile2.txt".

  • You get the success message for the first file (TextFile1.txt) and get the error message for the second file (TextFile2.txt). But still it does not continue with the third file.


  • We should tell to the SSIS engine that "ignore the error and continue". The only way to pass the message to the engine is, setting the property MaximumErrorCount of "Copy file to the destination" task. The value in this property says that "continue the execution until the number of specified errors are reached", that has default value of 1.

  • Let's set the MaximumErrorCount of "Copy file to the destination" to 100 and see. Again delete all files in the "Folder II" except "TextFile2.txt". Execute the package. Here is the result;

      

    Now it does not go the "Error message" task at the processing of "TextFile2.txt". It evaluates the result of "Copy file to the destination" as success and execute the "Success message" task, that is wrong. Then it fails the "Foreach loop container" without executing the "TextFile3.txt". What we want is, execute the "Error message" task at the error and continue the "Foreach loop container" without stopping, in my case process the "TextFile3.txt".

  • Set the property MaximumErrorCount of "Copy file to the destination" back to 1. This fails the task at the error.  Set the MaximumErrorCount of "Foreach Loop Container" to 100. This causes to continue the "Foreach loop container" at the error inside.

  • Delete files except "TextFile2.txt" in the "Folder II" and execute the package. Here is the result, it is as we expected;


This is how I addressed the issue. Now it handles the errors inside and continues the execution. If you need to capture the error message generated from the system, implement "Error message" task in the OnError event handler of the "Copy file to the destination" task. The error message can be taken from the System::ErrorDescription variable.

Issue #2: Errors are not getting captured - System::ErrorDescription

This is a funny mistake I had made. You may do the same (my apologies, if I underestimate you :)), so make sure you do this workaround. I had a "Execute SQL Task" that perform a DDL operation and errors of the task had been handled inside the OnError event handler. Errors are logged into the database, it sends error info by executing a SP via  another "Execute SQL Task" inside the event handler. The issue was, not all errors are logged, of course, because of a mistake I  had made. Let me show it with a simple example.

  • Create a package and add an "Execute SQL Task". Run below script in one of your test databases. It adds two tables and one stored procedure.

    CREATE TABLE Test (Id int PRIMARY KEY, [Value] varchar(5))
    GO
    CREATE TABLE Errors (Error varchar(1000))
    GO
    CREATE PROC AddErrors @ErrorDescription varchar(1000)
    AS
    BEGIN

       INSERT INTO
    Errors VALUES (@ErrorDescription)
    END

  • Create a connection for the test database and add the connection to the task. Set a simple "INSERT" statement to the "Execute SQL Task".

  • Open OnError event handler of the "Execute SQL Task" and add another "Execute SQL Task" to the event. Set the same connection to the task. We will be calling the SP like below;



    But, as we need to capture the original error message, we have to set the SQLStatement with SQLStatementSource property expression. You will be setting the expression like below;

     

  • Done. Test the package by executing it. It should insert a record without any problem.

  • Okay. Executing again causes to insert a duplicate record into the table, that leads to an error. Since we have a task to log the error, error should be logged in the "Errors" table. Execute again. it throws an error. Has it been inserted to the "Errors" table. It has not. At a glance I could not find the exact reason for this but finally I found that all because of single quotes contain in the error message. So, I had to do little formatting on it, ended up with below expression;



    What I have done here is, replace single quote with two single quotes. Problem solved.

How did you handle this?

Issue #3: Validating error on Data Flow Task

This is another mistake I had made. One of my packages contained "Data Flow Task" and the connection had been used inside the "Data Flow Task" were dynamically set at the control flow. The design of the package was done in my  machine and the  connection had been made to the local server, even though the connection string was assigned at the run-time. The issue was related to the "validation" of the "Data Flow Task", when it runs in a different machine, or in a machine that has no database similar to the design-time set database. If the issue want to be simulated, we would be having a simple package like this;

You can have anything inside the "Load Data" - "Data Flow Task" that does the extracting data from a source and loading them into a destination. In my case, I added an "OLE  DB Source" to get some data from SQL Server and pass them to the added "Recordset Destination". The connection "DatabaseConnection" is the one used by the "OLE DB Source". At the design-time, the "DatabaseConnection" is set to the local database called "TestDatabase". The "Script Task" dynamically adjust the connection string of "DatabaseConnection" (Assume that connection string that is written for "TestDatabase", comes from either configuration file or from the registry). Once the package is deployed, it is supposed to get the connection string for different database and set it to the "DatabaseConnection". Okay. Let's test this;

  • Create a package like above and set the connection to local database (TestDatabase).

  • Create a variable called "ConnectionString" and set a configuration file to get a value for it. Make sure the the connection string coming from the configuration is written for "TestDatabase".

  • Code the "Set the connection string" task to assign the value from "ConnectionString" variable to the connection.

    Dts.Connections(
    "SourceConnection").ConnectionString = Dts.Variables("User::ConnectionString").Value.ToString()

  • Run the package and see. It should work without any problems.

  • Now change the configuration file. Set connection string to different database (The structure of the database should be equal to the TestDatabase) and rename the "TestDatabase" to "_TestDatabase". This make sure that local database set at the design-time is not available.

  • Run the package outside the Business Intelligence Development Studio. Now it should take the new connection string from the configuration file. Does it run without any problem? Here is the result of mine;

  • It does not execute properly. The error says that "TestDatabase" is not available. It is supposed to get the new connection string from the configuration file and use it. But seems it is still looking for the one we used at the design-time. This has to be stopped, and of course, it can be simply solved.

  • Open the package again through BIDS. Get the property of the "Data Flow Task". Change the property DelayValidation to True.

  • Run the package again with the configuration file. Now it does not check the availability (or no validation against the data source that is used at the design-time) of "TestDatabase". It runs without any problems.

The reason for this issue is, the "Data Flow Task" is being validated with its component at the run-time before executing. Since the correct connection string is not assigned the time of validating, it uses the old one (or the one that is set at the design-time). By setting the property DelayValidation to True, we can avoid these issues. This is not valid to tasks such as "Execute SQL Task". You may set the same when you execute package (child package) inside another package.

[ RATE this ARTICLE, and COMMENT on it ]      [ ABOUT THE AUTHOR ]

Powered by Community Server, by Telligent Systems