Many times during ETL we have a requirement where based on the file name we have to perform certain operations. Normally within a For Each Loop with File enumerator in Control Flow Task there is no direct way of doing this. If the file mask matches, well and good but nothing beyond that. I demonstrate this with a scenario:
We have a lot of files in a folder and there is no specific file mask that you can be applied apart from *.txt. So you need to pick up all the files from the folder matching the above scenario. Now the requirement says that we need to process only only those records whose file name's length is greater than 2(just an example), if not, do not process the files. File could be processed in a Data Flow Task(DFT) and other tasks may follow.
To achieve this we have 2 options:
Have a foreach loop with file enumerator and mask set to *.txt. Within the loop have a Script task as the 1st task where check the length of the filename and then set up a flag to TRUE or FALSE which can be later used in a precedence constraint to manipulate the flow of control to the DFT.
I call this DUMMY SCRIPT TASK method (nothing official about it). Steps are the same as above but instead of putting code in the script task we have the script task blank without any code or any variables. It is still the first task in the loop which will be connected to a DFT for further operations and have Precedence constraint set between them for controlling the flow.
Below are the steps to meet the scenario by the Dummy Script Task Method.
Create a string variable(strFileName) to store the file name from the foreach loop. The package layout would be:
Setting up the for each loop, points to note,
On Collection tab
On the Variable mapping tab:
- Select the variable from the drop down list or create a new string variable to store the filename for each iteration of the loop. I am using a user variable strFileName.
- Set the index to 0(numeric zero) as we need to map just one value.
Drag a Script task & a DFT inside the foreach loop and don’t make any changes to the Script Task. All you need to do is link them with a precedence constraint.
Setting up the Precedence constraint(Many people know it by THE green arrow).
Double click on the link joining the Script task with the DFT. The Precedence Constraint Editor would open.
Set the Evaluation option to Expression
After doing the above step you now need to set up the expression.
Set it to:
This implies that only when the length of the filename is greater than 2 should the control go to the DFT.
You could use lot more expressions based on your requirement mostly we see that the need is to validate the file name against today’s date. For more examples of expression check this.
Now proceed with the flow as per your need and execute your package. It will meet the requirement we set up at the beginning.