In my previous post,
SSIS #97 – When MERGE JOIN is your friend?, I showed you a scenario where
Merge Join can be you real friend in your ETL work. I also showed you two ways to
satisfy the sorting requirement by the Merge Join task.
One of the ways is to sort your data in a query in the OLE DB Source. Then you can
just “tell” SSIS that your data set is indeed sorted. This is where I missed one
important piece of information in my previous post, that is, sorted on what.
SSIS is a powerful tool for ETL, with a reputation of steep learning curve to go
with it. This is one of the evidences, in my humble opinion.
OK, let’s have a recap.
Step 1 – Write a query with sorting in the OLE DB Source editor
Go the OLE DB Source editor.
I erased all the irrelevant info except the order by clause.
Step 2 – Go to the Advanced Editor for the OLE DB Source to set IsSorted to True
You need to tell SSIS two things:
- The output data set is sorted
- It is sorted on the column LOAN_NR
First, set IsSorted = True on the Input and Output Properties tab.
Step 3 – Set SortKeyPosition to 1
Next, expand the “Output Columns” branch, and select the column that corresponds
to the first column in your ORDER BY clause. Set its “SortKeyPosition” property
to the number 1.
Number 1 means that the column is the first column in my ORDER BY clause. Since
I only have one column in my ORDER BY clause, my work is done. Otherwise, repeat
the above step for each column in your ORDER BY clause, increasing the SortKeyPosition
by one each time (1,2,3,…).
Now the Merge Join will understand and acknowledge that the incoming data is sorted
and will not require a Sort transformation task between the source and a Merge Join