Getting Started with Adobe After Effects - Part 6: Motion Blur


Upload Image Close it
Select File

Browse by Tags · View All
#SQLServer 46
SQL Server 44
SSRS EXPERT 41
BI 36
#BI 33
brh 28
SSIS BEST PRACTICES 23
TSQL 19
MS BI STACKS 18
SSRS 17

Archive · View All
March 2011 24
July 2011 17
February 2011 15
April 2010 13
June 2011 11
May 2011 11
April 2011 11
July 2010 11
March 2010 8
December 2011 7

SSIS #97 – When MERGE JOIN is your friend?

Aug 22 2011 2:12AM by Sherry Li   

What does Merge Join in SSIS do?

Merge Join task in SSIS is actually a very straightforward task to use, if you know this:

  • The Merge Join Transformation joins two data sets on a common key (or keys).
  • It is the same as performing an SQL Join – the options you have are Inner Join, Full Outer Join and Left Outer Join.

What is the difference between Merge Join in SSIS and SQL Joins?

If you are SQL developer, you would ask, “If I can do joins in SQL, why would I use merge Join”? On the other hand, if you are an open-minded SQL developer, you would ask, “when do I use Merge Join in SSIS, instead of SQL Join”? If you are an open-minded and a very stubborn SQL developer, you would ask, “what is the real differences between Merge Join in SSIS and SQL Join”? Here are the differences:

  • In Merge Join, it is SSIS rather than SQL is doing the join, so it’s an in-memory activity.
  • Merge Join is a blocking transformation, that is, it needs to receive all data from all inputs before it can proceed to merge the data.
  • Merge Join can do joins on data sets from difference data sources, and SQL joins cannot.

When did Merge Join in SSIS become my friend?

Your first reaction is performance hit, just like me, until you see the last difference. You want to give it a try!I don’t use Merge Join very often in my ETL work, because I am too much a STAGING person. But there are a few places where I would rather use Merge Join is SSIS, than staging any of those ugly/huge tables in DB2. They are ugly, because they are very difficult to use, and they are huge with 10-20 million records. I got frustrated with all the wrong where clauses that gave me the wrong/missing data. My life as a BI developer will be so much easier, only if I can:

  • get all the data from the ugly/huge table in DB2 (with only the where clause I know for sure, so that no data will be missing)
  • join to my nice/small table on SQL server

This is when I decided that I wanted to be friend with the Merge Join!

Data sets must be sorted when using Merge Join in SSIS

Once the decision is made, using and configuring the Merge Join is actually very straightforward, but only if you know this:

My data sets from DB2 and SQL Server must be sorted on the common key(s) first

Two ways to meet the sorting requirement

There are actually two ways to meet the sorting requirement:

  • You can ask SSIS to sort the data set for you by using the Sort Transformation task, or,
  • You can tell SSIS that your data sets are already sorted by setting the IsSorted property of the OLE DB Source Output to true (must use the advanced Editor

Finally, I’ll show you some basic steps to make it work.

Steps when using Merge Join in SSIS

The is the final screen shot. On the left, in the OLE DB Source, I have a query that creates a data set from DB2; on the right, in the OLE DB Source, I have a query that creates a data set from SQL Server.

The left data set can be a huge data set, which can have about 10 million records. The right data set is a very nice and small data set that has only about 5K records.

Step 1: Create your input data sets.

I assume that you know how to write queries for OLE DB Source to create data sets.

Step 2: Optionally, set IsSorted = True for your OLE DB Source Output

If you query has done the sorting, you can set the IsSorted property to true on the Advanced Editor.

On the Advanced Editor, go to the last tab, Input and Output Properties.

My example showed False. You can certainly set it to True if you have done the sorting in your query.

Step 3: Optionally, ask SSIS to sort the data sets

In my example, I asked SSIS to sort the data set on one column.

Remember, both of the data sets must be sorted on the same key(s).

Step 4: Configure the Merge Join

Here is the good news: there is no Advanced Editor for this Merge Join transformation task.

  1. Join Type: pick from these three options. I picked Inner Join.

  2. Pick Join Key from the two data sets. SSIS has picked the right key for me, because those are the key I’ve sorted the data set on.
  3. Pick the out put fields from the two data sets.

You should be done by now…but hang on…

Step 5: Two more things about Merge Join

If you look at the Merge Join properties, you will notice two more properties that might need your attention.

MaxBuffersPerInput – if your data sets are huge and the sorting is somehow incorrect, you will need to experiment with this number.

TreatNullsAsEqual –this property decides whether to join Null values or not. By default it does. In my example, there should be no NULL value. So I didn’t bother with this.

One last note, the result from the Merge Join is also sorted! This is a good news. If you need to use the Merge Join output in other transformations, there is no more sorting needed.

Tags: SSIS BEST PRACTICES, SQL Server,


Sherry Li
14 · 12% · 3827
1
 
0
Lifesaver
 
0
Refreshed
 
0
Learned
 
0
Incorrect



Submit

2  Comments  

  • The only time I'd consider an SSIS Merge Join is your last point: Joining data from disparate data sources.

    But one caveat: You can join data from two different datasources in SQL Server if you are open to using Linked Servers, OPENDATASOURCE, OPENROWSET or OPENQUERY. This requires that an ODBC or OLE-DB driver for the foreign data source be installed on the SQL Server. Be aware that the resultset will be streamed to tempdb prior to the join within SQL Server.

    commented on Sep 2 2011 5:54AM
    Marc Jellinek
    95 · 2% · 586
  • Good article, can you please let me know how to merge an XML having 2 nodes using Merge join transformation

    commented on Nov 22 2011 12:01AM
    Karthikeyan Anbarasan
    52 · 4% · 1176

Your Comment


Sign Up or Login to post a comment.

"SSIS #97 – When MERGE JOIN is your friend?" rated 5 out of 5 by 1 readers
SSIS #97 – When MERGE JOIN is your friend? , 5.0 out of 5 based on 1 ratings
    Copyright © Rivera Informatic Private Ltd Contact us      Privacy Policy      Terms of use      Report Abuse      Advertising      [ZULU1097]