Our package is ready we have made it dynamic (partly) and we are ready to go. What if there are any errors in the package execution? How to troubleshoot or debug the SSIS Package?
There are various ways to approach this:
- Look at the progress tab for initial symptoms
- Debug the Package to pin point the issue
Let’s look at the Progress tab first. Open the SSIS Solution built in Chapter 3.
Go to the Solution Explorer, select the package, right click it and select Execute Package. Once the package execution starts, you will notice a new tab would appear after the Package Explorer tab as Progress. Go to this tab after the package completion. If everything is fine the data flow task will turn green and the Progress tab would look as in te figure below:
Now what you can do is rename the Input file in your system from Input.csv to Input_1.csv. Run the package again and notice what happens:
- In the Control Flow tab you will see the Data Flow Task turn Red
- Go to the Data Flow tab, you will notice the Flat File Source turn Red.
- Now go to the Progress tab. It will be slightly different from what you noticed earlier. Look for the 2 Error statements with a red Exclamation mark icon beside them. Also there is red icon with cross towards the end in Fig 2.
- If you read the first error message it states: “
[Flat File Source ] Error: Cannot open the datafile "I:\InputPath\Input.csv".”
Reading the above message it is clear that there is an issue with the file name. We need to rectify this, either we change the file name in the system to Input.csv or in the flat file connection browse to the file path and select the new file that we have input_1.csv. Either case the issue will be resolved and the package will execute successfully and the progress tab will have no Red marks.
The above approach can be applied to trouble shoot basic issues like connection managers or variables missing etc. But what if the issue is not that simple?
For debugging a package, we have 3 basic methods:
- Setting breakpoints in a package, container or task
- Setting breakpoint in Script task
- Using Data Viewer in the data flow task
- Setting Error redirection in the Data Flow task
Setting up breakpoint in a task:
For example, we have for each loop that iterates through each file in a given folder (the set up for this will be shown in later sections). Now we need to see the value that is populated in the variable containing the file name, in each iteration. Here we need to debug the package and below are the steps you need to take.
Click the Control Flow tab and place the cursor anywhere on the background of the design surface or a task or a container, right-click, and then click Edit Breakpoints.
Depending on where you right click the Set BreakPoint – Package or task you clicked popup window would open as in the figure below.
You may select the various events where you want the breakpoint to be set. While execution you can have a watch to view the variable values at these breakpoints.
Setting up break point in Script Task:
Open the script task you want to set the breakpoint in VSTA. Go to the line where you wish to setup the break point. Right click on that line and select point to Breakpoint, and then click Insert Breakpoint.
Using Data Viewer in Data Flow Task:
This is one of the most commonly used techniques for debugging. As SSIS is an ETL tool and we are mostly dealing with data, in case we need to see what kind of data is passing between 2 Data Flow Tasks we can use this method. Here are the steps.
- Go to the Data Flow tab
- Double click the green arrow connecting the Flat File source with the OLEDB Destination. Data Flow Path Editor Window would open as in the figure below:
- Select Data Viewer on the left panel and then click the Add button. Configure Data Viewer Window would open below figure
- In the General tab select Grid, the other options are generally not used, such as, the histogram, scatter plot or column chart. You could try them out later for practice.
- Now go to the Grid tab and select all the columns you wish to view while package execution. By moving the columns from displayed columns to unused Columns panel.
- This completes the setup of Data Viewer. Now execute the package you would see a popup with the tabular display of data in the input csv file as in the figure below:
- In the above window, you can analyse the data that is coming out of any transform and if you see there are wrong transformation of data happening or unexpected data coming in like data type error you could take corrective action.
Setting Error redirection in the Data Flow task:
Remember that the above action will help only when the data passes successfully from the source transform. What if the data is not able to be read in the source transform? How can we monitor the data then? Here we would need to use the Error redirection in data flow task.
Before starting we should introduce some error in the source csv file data. Open the csv file and set the name column to a long sting greater than 50 characters. Save the file and re-run the package. You will see that the package fails at the Flat File source in the Data Flow Task.
Getting around this issue we need to know what is wrong with the data. Right now the input file is small and we can get this issue by simply looking into the file but what is there is a single error in a file having millions of records and hundreds of columns. Would be able to detect it by just viewing the file? Probably not.
We use SSIS for this. Let’s say we want to see the error row redirection for all the corrupt data coming from the Flat file source. For this double click the Flat file source. Select the Error Output on the left panel. On the right, you will see a tabular representation of all possible combinations.
Let us take the example of the first column “Name”. We can have 3 scenarios for Name column namely: Error of any kind or Truncation i.e. the source value is larger that the value that the SSIS source can accommodate for this column.
The default behaviour for both Error and Truncation is that the component in question would fail leading other components to fail. We have the option of setting this property to “Ignore failure” or “Redirect Row”
If this is selected then for this column if error occurs this row would be passed to output buffer but the value would be set to…… why not you try this out yourself. If we set this option then the number of records entering the transform = number of transforms exiting the transform.
What this option will do is redirect the row as it is to another flow (for now think that all the good records will go to the green path while the bad records to Red path). In case we set the redirect row for each column for error and Truncation. We have 100 input records and 10 records have error of some kind or the other. Then 90 records will go along the green arrow and 10 along the red arrow. We can have another destination after the red arrow to view these error records later for analysis.