..... and the Winners are:
Adam Machanic
Adam Machanic is a Boston-based independent database consultant, writer, and speaker. He has been involved in dozens of SQL Server implementations for both high-availability OLTP and large-scale data warehouse applications, and has optimized data access layer performance for several data-intensive applications. Adam has written for numerous web sites and magazines, including SQLblog, Simple Talk, Search SQL Server, SQL Server Professional, CoDe, and VSJ. He has also contributed to several books on SQL Server, including "Expert SQL Server 2005 Development" (Apress, 2007) and "Inside SQL Server 2005: Query Tuning and Optimization" (Microsoft Press, 2007). Adam regularly speaks at user groups, community events, and conferences on a variety of SQL Server and .NET-related topics. He is a Microsoft Most Valuable Professional (MVP) for SQL Server, Microsoft Certified IT Professional (MCITP), and a member of the INETA North American Speakers
Ashish Gilhotra
Ashish Gilhotra a Web Developer from Jaipur, Rajasthan. He is an MCAD in .net with Sql Server.
Leonid Koyfman
Leonid Koyfman is a Senior Developer with Razorfish ( San Francisco). He is in software development over 10 years. His focus is data visualization and delivering reporting solutions from various data sources, but spending most of the time with SQL Server.
I would like to congratulate the winners. In the coming days, I will publish more details about these people so that all of you get to know more about them.
Solving the problem
Let us now have a look at how this challenge can be solved. The solution lies in the way we join @b and @c. This is the tricky part that we need to solve.
A join between @b and @c will produce duplicate rows. The "code" column in @b contains 2 records having value "1". Similarly the "code" column in @c contains 3 records having value "1". If you try to join them with a FULL JOIN you will receive 12 records. But our query should produce only 3 rows. Similarly, our query should produce only 2 records for rows having "2" in the column "code". The total number of rows we are looking for is 5 where as a FULL JOIN between @b and @c currently produces 16 records. See this example:
SELECT * FROM @b b
FULL JOIN @c c ON b.code = b.code
/*
code bname code cname
----------- ---------- ----------- ----------
1 aaa 1 xxx
1 bbb 1 xxx
2 ccc 1 xxx
2 ddd 1 xxx
1 aaa 1 yyy
1 bbb 1 yyy
2 ccc 1 yyy
2 ddd 1 yyy
1 aaa 1 zzz
1 bbb 1 zzz
2 ccc 1 zzz
2 ddd 1 zzz
1 aaa 2 www
1 bbb 2 www
2 ccc 2 www
2 ddd 2 www
*/
How do we remove those duplicate rows? Well, the join that we apply should do the following. It should take the first row in @b with value "1" and join it with the first row in @c having "1". Then it should take the second row in @b having "1" and join it with the second row in @c having value "1". It should then take the third row in @c having value "1". This does not have a matching row in @b, so it will return the information from @c and will return NULL from @b. If we can apply such a join, we can receive back exactly 3 records.
To join the rows in the manner described above, we need to generate a sequence number for each row. The sequence number should reset for each distinct value in the "code" column. So the trick here is to generate this sequence number.
SQL Server 2005 introduced ROW_NUMBER() that can be used to generate a sequence number. However it is not available in SQL Server 2000. So, how do we generate a sequence number (and reset it for each group) in SQL Server 2000?
In SQL Server 2000, this can be achieved by doing a self join and counting the number of records smaller than equal to the current row. This trick is demonstrated in the below example.
SELECT
b1.bname,
b1.code,
COUNT(*) AS row
FROM @b b1
INNER JOIN @b b2 ON
b1.code = b2.code
AND b2.bname <= b1.bname
GROUP BY
b1.bname,
b1.code
/*
bname code row
---------- ----------- -----------
aaa 1 1
bbb 1 2
ccc 2 1
ddd 2 2
*/
This trick is going to help us solve the problem. Using this trick, we can join @b and @c on Code + row and it will make each row unique and will eliminate the duplicate values from the results. A join of @b and @c using this method will produce the following result.
SELECT * FROM (
SELECT
b1.bname,
b1.code,
COUNT(*) AS row
FROM @b b1
INNER JOIN @b b2 ON
b1.code = b2.code
AND b2.bname <= b1.bname
GROUP BY
b1.bname,
b1.code
) b FULL JOIN (
SELECT
c1.cname,
c1.code,
COUNT(*) AS row
FROM @c c1
INNER JOIN @c c2 ON
c1.code = c2.code
AND c2.cname <= c1.cname
GROUP BY
c1.cname,
c1.code
) c ON b.Code = c.Code AND b.row = c.row
/*
bname code row cname code row
---------- ----------- ----------- ---------- ----------- -----------
ccc 2 1 www 2 1
aaa 1 1 xxx 1 1
bbb 1 2 yyy 1 2
NULL NULL NULL zzz 1 3
ddd 2 2 NULL NULL NULL
*/
This leads us to the final solution. The final result can be achieved by simply joining this result with @a. Here is the complete code listing.
DECLARE @a TABLE (code INT, aname VARCHAR(10))
INSERT INTO @a(code, aname) SELECT 1,'Cat'
INSERT INTO @a(code, aname) SELECT 2,'Dog'
INSERT INTO @a(code, aname) SELECT 3,'Bird'
DECLARE @b TABLE (code INT, bname VARCHAR(10))
INSERT INTO @b(code, bname) SELECT 1,'aaa'
INSERT INTO @b(code, bname) SELECT 1,'bbb'
INSERT INTO @b(code, bname) SELECT 2,'ccc'
INSERT INTO @b(code, bname) SELECT 2,'ddd'
DECLARE @c TABLE (code INT, cname VARCHAR(10))
INSERT INTO @c(code, cname) SELECT 1,'xxx'
INSERT INTO @c(code, cname) SELECT 1,'yyy'
INSERT INTO @c(code, cname) SELECT 1,'zzz'
INSERT INTO @c(code, cname) SELECT 2,'www'
SELECT
a.*,
b.bname,
c.cname
FROM @a a
LEFT JOIN (
(
SELECT
b1.bname,
b1.code,
COUNT(*) AS row
FROM @b b1
JOIN @b b2 ON
b1.code = b2.code
AND b2.bname <= b1.bname
GROUP BY
b1.bname,
b1.code
) b
FULL JOIN
(
SELECT
c1.cname,
c1.code,
COUNT(*) AS row
FROM @c c1
JOIN @c c2 ON
c1.code = c2.code
AND c2.cname <= c1.cname
GROUP BY
c1.cname,
c1.code
) c ON
b.code = c.code
AND b.row = c.row
) ON a.code IN (b.code, c.code)
/*
code aname bname cname
----------- ---------- ---------- ----------
1 Cat aaa xxx
1 Cat bbb yyy
1 Cat NULL zzz
2 Dog ccc www
2 Dog ddd NULL
3 Bird NULL NULL
*/
I would like to thank everyone who participated in the challenge. Are you ready for the next challenge?