Scenario: You need to select files based on the uniqueness of a part of the file name.
Let's say the file name format is:
XXXX_MMDDYYYY_garbage.csv
where XXXX is a four-character code, MMDDYYYY is a date in that format, and garbage is any sequence of characters that is legal in a file name.
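To make the naming convention concrete, here is a minimal Python sketch that splits a file name into its three parts. It only illustrates the format and is not part of an SSIS design. The name `parse_name` is hypothetical, and the pattern assumes that XXXX contains no underscore and that the date is eight digits; the scenario itself only says "4 characters".

```python
import re

# Assumption: XXXX is four non-underscore characters and MMDDYYYY is eight digits.
NAME_PATTERN = re.compile(
    r"^(?P<code>[^_]{4})_(?P<date>\d{8})_(?P<garbage>.+)\.csv$"
)

def parse_name(file_name):
    """Return (code, date, garbage) or None if the name does not fit the format."""
    match = NAME_PATTERN.match(file_name)
    if match is None:
        return None
    return match.group("code"), match.group("date"), match.group("garbage")

print(parse_name("A001_01012011_xyz123.csv"))
```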
In the INPUT folder we have files like:
A001_01012011_xyz.csv
A001_01012011_xyz123.csv
A001_01012011_xsyz456.csv
A001_01022011_xyz.csv
A001_01022011_xyz123.csv
A001_01022011_xsyz456.csv
A002_01012011_xyz.csv
A002_01012011_xyz123.csv
A002_01012011_xsyz456.csv
A002_01022011_xyz.csv
A002_01022011_xyz123.csv
A002_01022011_xsyz456.csv
A002_01032011_xyz.csv
A002_01032011_xyz123.csv
A002_01032011_xsyz456.csv
Files are grouped by the XXXX and MMDDYYYY parts of the file name. The data in every file of a group is the same, so you can copy the data from any one file in the group. The output folder needs exactly one file per group, named without the garbage part. For the scenario above, the output file names should be:
A001_01012011.csv
A001_01022011.csv
A002_01012011.csv
A002_01022011.csv
A002_01032011.csv
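The expected mapping from input files to output files can be sketched in Python as follows. This only shows the grouping rule, not an SSIS solution. The function name `plan_output` is hypothetical, and the sketch assumes every name follows the fixed-width format, so the group key is simply the first 13 characters (4 + 1 + 8). Because all files in a group hold the same data, taking the first name in sorted order is an arbitrary choice.

```python
KEY_LENGTH = 13  # len("XXXX") + len("_") + len("MMDDYYYY")

def plan_output(file_names):
    """Map each output file name to one source file chosen from its group."""
    chosen = {}
    for name in sorted(file_names):
        output_name = name[:KEY_LENGTH] + ".csv"
        # Keep only the first file seen for each group.
        chosen.setdefault(output_name, name)
    return chosen

input_files = [
    "A001_01012011_xyz.csv", "A001_01012011_xyz123.csv", "A001_01012011_xsyz456.csv",
    "A001_01022011_xyz.csv", "A001_01022011_xyz123.csv", "A001_01022011_xsyz456.csv",
    "A002_01012011_xyz.csv", "A002_01012011_xyz123.csv", "A002_01012011_xsyz456.csv",
    "A002_01022011_xyz.csv", "A002_01022011_xyz123.csv", "A002_01022011_xsyz456.csv",
    "A002_01032011_xyz.csv", "A002_01032011_xyz123.csv", "A002_01032011_xsyz456.csv",
]

for output_name, source_name in plan_output(input_files).items():
    print(output_name, "<-", source_name)
```

In an SSIS package, the same fixed-position idea would presumably be expressed with a substring expression on the file name rather than with code.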
So, how would you design the SSIS package for this scenario? Provide the design you feel would be most efficient.
Note: Use of Script Task is not allowed.