Getting Started with Adobe After Effects - Part 6: Motion Blur


Upload Image Close it
Select File

SSIS scenarios...
Browse by Tags · View All
MSBI 30
SSIS 27
BRH 15
SQL Server 15
#BI 14
Script Component 9
#SQLSERVER 8
#SQL Server 7
Flat File Source 5
Script Component Source 5

Archive · View All
August 2010 4
March 2011 3
September 2010 3
February 2013 2
June 2011 2
November 2010 2
May 2010 2
November 2009 2
March 2010 2
May 2012 1

SSIS - Script Component, add missing column or ignore extra columns

Mar 9 2011 5:00AM by Sudeep Raj   

Today I got a query asking me if we can have a script component which reads data from a text file which is supposed to have 5 input columns. However the data that they receive has at time extra columns or less columns for certain records. So the question arises how do we go about this.

Input:

Col1,Col2,Col3,Col4,Col5
1,2,3,4,5
1,2,3,4,5,6,7,
1,2,3

Output:

Col4	Col3	Col2	Col1	Col
5	4	3	2	1
5	4	3	2	1
NULL	NULL	3	2	1

There are 3 ways that come to my mind straight away.

  1. Script Component
  2. Flat file source which reads all data into one column and then splits them based on index of the comma in a derived column
  3. Flat file source with delimited settings to read just 5 columns.

 

Let me start from the 3rd approach: What will happen in this case is when we get 5 columns things will work fine. In case we have more columns in the input and our column 5 has ample width all the additional columns will be appended as part of column 5. So we might use a derived column in the next step to eliminate the extra data which is simple. The problem will come for the case where we have less number of columns coming. In this case what SSIS will do is read the next line data as part of the missing column of the current record. So this will make the data corrupt(you could try this out to see how SSIS actually works). So this approach will not be of use.

If we go about with the 2nd approach. It is possible to achieve the goal but to write the expression in the derived column task to split the data will be a point of discussion. The moment the expression becomes complicated I try to avoid it as the maintenance of the code is difficult and is prone to error. There is no good editor for editing the expressions. So I would not use this approach as well. But yes this approach can be taken at your risk.

So now we are left with the 1st method that is to go with Script Component. This makes out lives simple and the code manageable. Yes for those who are not familiar with any development language will find this a bit challenging (for that matter expressions are more challenging and less readable). So I would provide just the code here. For other setting you could have a look at my post Script Component - Source Part 1. So here we go:

C#

using System;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;
using System.IO;

[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{

    //StramReader to read the input file stream
    private StreamReader textReader;

    //String to save the source file path
    private string SrcFilePath;

    //Int to count the number of records read.
    private int i = 0;

    //Override the AcquireConnections Method to set up the connection once for the file.
    public override void AcquireConnections(object Transaction)
    {
        SrcFilePath = @"I:\SSIS\Packages\Input\DifferentCols.csv";
    }

    public override void PreExecute()
    {
        base.PreExecute();

        //Set the textReader at the PreExecute Phase so that we donot initialize it for each record.
        textReader = new StreamReader(SrcFilePath);
    }


    public override void CreateNewOutputRows()
    {
        string nextLine;
        string[] columns;
        char[] delimiters;
        delimiters = ",".ToCharArray();

        //Read next line from the file to the string variable
        nextLine = textReader.ReadLine();

        //Read the file till nextLine variable is not NULL ie. EOF
        while (nextLine != null)
        {
            if (i > 0 && nextLine.Length>0)
            {
                this.Output0Buffer.AddRow();
                //Split the records by commas to later extract the data and not the column Name in each record.
                columns = nextLine.Split(delimiters);
                {
                    this.Output0Buffer.Column = columns.Length > 0 ? columns[0]: null;
                    this.Output0Buffer.Column1 = columns.Length > 1 ? columns[1]: null ;
                    this.Output0Buffer.Column2 = columns.Length > 2 ? columns[2]: null;
                    this.Output0Buffer.Column3 = columns.Length > 3 ? columns[3]: null;
                    this.Output0Buffer.Column4 = columns.Length > 4 ? columns[4] : null;
                }
            }
            i++;
            //Read the next line
            nextLine = textReader.ReadLine();
        }
    }

    public override void PostExecute()
    {
        base.PostExecute();

        //Close the Text reader once the file has been read in the PostExecute Phase.
        textReader.Close();
    }
}

VB.NET

Imports System.Data
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper
Imports System.IO

 _
Public Class ScriptMain
	Inherits UserComponent

	'StramReader to read the input file stream
	Private textReader As StreamReader

	'String to save the source file path
	Private SrcFilePath As String

	'Int to count the number of records read.
	Private i As Integer = 0

	'Override the AcquireConnections Method to set up the connection once for the file.
	Public Overrides Sub AcquireConnections(Transaction As Object)
		SrcFilePath = "I:\SSIS\Packages\Input\DifferentCols.csv"
	End Sub

	Public Overrides Sub PreExecute()
		MyBase.PreExecute()

		'Set the textReader at the PreExecute Phase so that we donot initialize it for each record.
		textReader = New StreamReader(SrcFilePath)
	End Sub


	Public Overrides Sub CreateNewOutputRows()
		Dim nextLine As String
		Dim columns As String()
		Dim delimiters As Char()
		delimiters = ",".ToCharArray()

		'Read next line from the file to the string variable
		nextLine = textReader.ReadLine()

		'Read the file till nextLine variable is not NULL ie. EOF
		While nextLine IsNot Nothing
			If i > 0 AndAlso nextLine.Length > 0 Then
				Me.Output0Buffer.AddRow()
				'Split the records by commas to later extract the data and not the column Name in each record.
				columns = nextLine.Split(delimiters)
				If True Then
					Me.Output0Buffer.Column = If(columns.Length > 0, columns(0), Nothing)
					Me.Output0Buffer.Column1 = If(columns.Length > 1, columns(1), Nothing)
					Me.Output0Buffer.Column2 = If(columns.Length > 2, columns(2), Nothing)
					Me.Output0Buffer.Column3 = If(columns.Length > 3, columns(3), Nothing)
					Me.Output0Buffer.Column4 = If(columns.Length > 4, columns(4), Nothing)
				End If
			End If
			i += 1
			'Read the next line
			nextLine = textReader.ReadLine()
		End While
	End Sub

	Public Overrides Sub PostExecute()
		MyBase.PostExecute()

		'Close the Text reader once the file has been read in the PostExecute Phase.
		textReader.Close()
	End Sub
End Class

Hope this helps :)

Tags: SSIS, Script Component, SQL Server, MSBI, #BI, BRH, Script Component Source, Flat File Source, #TSQL, #SQL Server,


Sudeep Raj
12 · 13% · 4303
2
 
0
Lifesaver
 
0
Refreshed
 
0
Learned
 
0
Incorrect



Submit

3  Comments  

  • Sudeep... it's a good article.. in fact it was one of my interview Questions.... Can you achieve the same using SQL

    commented on Sep 7 2011 9:06PM
    vishalchitrala@gmail.com
    289 · 0% · 148
  • Hi Vishal,

    Thanks.

    In SQL depends whats the scenario. If it is a table you could use ISNULL() or coalesce function to handle nulls...

    commented on Sep 11 2011 10:45AM
    Sudeep Raj
    12 · 13% · 4303
  • Sudeep great article, but worth mentioning that you need to set to Asynchronous input (setting SynchronousInputID to None in the output's properties). Otherwise the Script component will keep yelling and spitting errors such 'Output0Buffer is not a member of .......'.

    commented on Mar 5 2012 2:17PM
    Jason Yousef
    156 · 1% · 319

Your Comment


Sign Up or Login to post a comment.

"SSIS - Script Component, add missing column or ignore extra columns" rated 5 out of 5 by 2 readers
SSIS - Script Component, add missing column or ignore extra columns , 5.0 out of 5 based on 2 ratings
    Copyright © Rivera Informatic Private Ltd Contact us      Privacy Policy      Terms of use      Report Abuse      Advertising      [ZULU1097]