beyondrelational.com



October 2009 - Posts

TSQL Challenge 15 - matching data between rows and columns

This challenge is derived from a realistic reporting scenario that requires the transformation of rows into columns. This is commonly known as a matrix, pivot, or crosstab query. This challenge involves two primary tables @Row and @Cols. The objective will be to create a matrix result set. Before I explain the rest of the requirements, let me show you the sample tables. Column Table:

Col
-----------
1
2
3
4
5
6
7
8
9

Row Table:

Row
-----------
100
104
101
99
77
20
10

The final output should look like this:

Row         1    2    3    4    5    6    7    8    9
----------- ---- ---- ---- ---- ---- ---- ---- ---- ----
10          x    x              x                   
20          x    x         x    x                   
77          x                             x         
99          x         x                             x
100         x    x         x    x                   
101         x                                       
104         x    x         x                   x    

The challenge is to mark a coordinate, with a value of X, if and only if the row value is divisible by the col value, i.e. it has a modulo of zero. The additional requirements are: the final query must work with random row values and the pivot operator should be used.

Sample Data

Use the following script to generate the sample data needed for this challenge

DECLARE @Cols TABLE(Col INT);

INSERT INTO @Cols VALUES (1);
INSERT INTO @Cols VALUES (2);
INSERT INTO @Cols VALUES (3);
INSERT INTO @Cols VALUES (4);
INSERT INTO @Cols VALUES (5);
INSERT INTO @Cols VALUES (6);
INSERT INTO @Cols VALUES (7);
INSERT INTO @Cols VALUES (8);
INSERT INTO @Cols VALUES (9);

DECLARE @Rows TABLE(Row INT);

INSERT INTO @Rows VALUES (100);
INSERT INTO @Rows VALUES (104);
INSERT INTO @Rows VALUES (101);
INSERT INTO @Rows VALUES (99);
INSERT INTO @Rows VALUES (77);
INSERT INTO @Rows VALUES (20);
INSERT INTO @Rows VALUES (10);

Challenge Requirements Summary

  1. Only coordinates where the column value modulo the row value equals zero should be marked, with an "X"
  2. Number of rows in the table is not fixed. The query should work with variable number of rows in the table.

Notes

  1. Read the Submission Guidelines and make sure that your solution follows them.
  2. The solution should work on SQL Server 2005, 2008 or later versions
  3. Last date to submit your entries: Monday 11/2/2009
  4. Use this forum for any questions related to TSQL Challenge #15

About the Authors.


TSQL Challenge 11 - Solution by Marcos Kirchner
4TDQPM9Y9CM6[1]

Marcos is a support analyst for an ISV company in Joinville, Brazil, where he helps customers properly configure and administer their databases. He is also an enthusiastic .NET and SQL Server developer and holds some Microsoft certifications (MCDBA and MCTS/MCITP for SQL 2005 and 2008).
He is currently attending a postgraduate course and helps in local forums whenever time allows.

How did Marcos arrive at his solution?

This is how Macros explained his solution to us:

This is a challenge that looks simplistic at first, but quickly becomes complex. I thought I would be able to select from @t and CROSS JOIN with @c twice and then apply the discount calculation.

There are two major problems with these logical steps:

  • The discounts must be cumulative and order does matter for percentage coupons.
  • The maximum discount may not consist of two coupons (perhaps not even one) for a given product. Remember the discount cannot be greater than 30% the original price or 30 dollars.

Instead, I created four CTEs that progressively apply discounts and check business rules.

The Discounts1 CTE crosses join the products with the discounts.

WITH Discounts1 AS
(SELECT t.ID, t.PRICE,  c1.ID AS CID1, c1.NAME AS CNAME1,
	t.PRICE - CASE c1.IS_PERCENT WHEN 0 THEN c1.VALUE
		WHEN 1 THEN t.PRICE * c1.VALUE / 100 END AS PRICE1
FROM @T AS t CROSS JOIN @C AS c1),

The Discounts2 CTE then checks that we are not breaking any rules, with one coupon applied. The Discount2 CTE also applies the second coupon (another cross join); however, this time I am applying filtering. The USE1 column indicates whether or not the discount is in compliance with the business rules. A USE1 value of 1 indicates that the discount is still in compliance and a value of 0 indicates non-compliance.

Discounts2 AS
(SELECT d.*,
	CASE WHEN (d.PRICE - d.PRICE1 <= 30) AND
		(d.PRICE1 / d.PRICE >= .7) THEN 1 ELSE 0 END AS USE1,
	c2.ID AS CID2, c2.NAME AS CNAME2,
	d.PRICE1 - CASE c2.IS_PERCENT WHEN 0 THEN c2.VALUE
		WHEN 1 THEN d.PRICE1 * c2.VALUE / 100 END AS PRICE2
FROM Discounts1 AS d CROSS JOIN @C AS c2
WHERE d.CID1 <> c2.ID),

The Discounts3 checks that the discount is still in compliance, with two coupons applied. DiscountCTE3 also creates an USE2 column. If USE2 = 1 both coupons are valid to use for a given product. After Discounts3, we have three price columns, PRICE, PRICE1 and PRICE2. Each column corresponds to original price the original proces less one and two discounts respecitvely. We also have the USE1 and USE2 columns that tell us if PRICE1 or PRICE2 columns violate the rules.

Discounts3 AS
(SELECT *,
	CASE WHEN (USE1 = 1) AND (PRICE - PRICE2 <= 30) AND
		(PRICE2 / PRICE >= .7) THEN 1 ELSE 0 END AS USE2
FROM Discounts2),

So, the Discounts4 CTE picks PRICE2, PRICE1, or PRICE column as appropriate, and generates the COUPON_NAMES column. The Discounts4 generates a list of products with all legal combinations of discounts applied. Once we have them all, we just need to pick the one with the lowest price. This is done in the final select using a cross apply and top 1.

Discounts4 AS
(SELECT ID, PRICE,
	CASE WHEN USE2 = 1 THEN PRICE2
		WHEN USE1 = 1 THEN PRICE1
		ELSE PRICE END AS DISC_PRICE,
	CASE WHEN USE2 = 1 THEN CNAME1 + ' + ' + CNAME2
		WHEN USE1 = 1 THEN CNAME1
		ELSE '' END AS COUPON_NAMES
FROM Discounts3)

Lastly we handle the results format. While this solution is not overly complex, it is very inflexible. Currently the code is limited by a finite number of coupons; however, this process does not scale well with more and more coupons. Consider if we needed to check three or more coupons. To check three or more additonal coupons requires even more CTEs, which increases complexity, reduces speed, and increases maintenance overhead. Because I use common table expressions and the cross apply operator, this solution is limited to SQL Server 2005 or greater. Here is the final solution:

WITH Discounts1 AS
(SELECT t.ID, t.PRICE,  c1.ID AS CID1, c1.NAME AS CNAME1,
	t.PRICE - CASE c1.IS_PERCENT WHEN 0 THEN c1.VALUE
		WHEN 1 THEN t.PRICE * c1.VALUE / 100 END AS PRICE1
FROM @T AS t CROSS JOIN @C AS c1),

Discounts2 AS
(SELECT d.*,
	CASE WHEN (d.PRICE - d.PRICE1 <= 30) AND
		(d.PRICE1 / d.PRICE >= .7) THEN 1 ELSE 0 END AS USE1,
	c2.ID AS CID2, c2.NAME AS CNAME2,
	d.PRICE1 - CASE c2.IS_PERCENT WHEN 0 THEN c2.VALUE
		WHEN 1 THEN d.PRICE1 * c2.VALUE / 100 END AS PRICE2
FROM Discounts1 AS d CROSS JOIN @C AS c2
WHERE d.CID1 <> c2.ID),

Discounts3 AS
(SELECT *,
	CASE WHEN (USE1 = 1) AND (PRICE - PRICE2 <= 30) AND
		(PRICE2 / PRICE >= .7) THEN 1 ELSE 0 END AS USE2
FROM Discounts2),

Discounts4 AS
(SELECT ID, PRICE,
	CASE WHEN USE2 = 1 THEN PRICE2
		WHEN USE1 = 1 THEN PRICE1
		ELSE PRICE END AS DISC_PRICE,
	CASE WHEN USE2 = 1 THEN CNAME1 + ' + ' + CNAME2
		WHEN USE1 = 1 THEN CNAME1
		ELSE '' END AS COUPON_NAMES
FROM Discounts3)

SELECT d.ID, t.NAME,
	CAST(d.PRICE AS VARCHAR(20)) + '$' AS PRICE,
	CAST(d.DISC_PRICE AS VARCHAR(20)) + '$' AS DISC_PRICE,
	CAST(d.PRICE - d.DISC_PRICE AS VARCHAR(20)) + '$' AS TOT_DISC,
	CAST(ROUND(100 -  d.DISC_PRICE / (d.PRICE / 100), 2) AS VARCHAR(20)) + '%' AS RATE,
	COUPON_NAMES
FROM @T AS t CROSS APPLY
	(SELECT TOP 1* FROM Discounts4 d
	WHERE t.ID = d.ID ORDER BY DISC_PRICE) AS d

Thanks Marcos for your explanation!

What can we learn from this?

To resolve this challenge one should sit down and really thing through the methodology that he would like to implement.  As with any great challenge, the developer has to break this complex problem into more manageable pieces.  This challenge requires organizational skills, computational skills, TSQL skills and most of all patience. 

 
The first two questions one should ask himself are:


•    How do I find the correct combination of coupons to maximize discount, while complying with the business rules?
•    How do I verify that my solution is compliant with the business rules? 

Final Word:

Congratulations Marcos for your solution and in-depth explanation!

All TSQL Challenge 11 winning solutions have been presented, but please remember that we are still accepting solutions. New and better solution are always surfacing, and the next winning solution may be yours.

Stay tuned TSQL Challenge 15, will be online next Monday. Please bookmark our RSS Feed in your favorite RSS reader, so you do not forget!

 


TSQL Challenge 11 - Solution by Brad Schulz
Brad Schulz
My Photo Brad has been working with databases for 30 years. He was a Microsoft Visual FoxPro (VFP) MVP from 1994-2000, and has written articles for various VFP publications. He now devotes most of his time to SQL Server and is a moderator at the MSDN T-SQL Forums. He has a T-SQL blog at http://bradsruminations.blogspot.com/

 

How did Brad work on the Challenge?

Let’s start with the full solution of Brad:

;with CouponCombos as
(
  select Coupon_Names=c1.Name+' + '+c2.Name
        ,Value1=c1.Value
        ,Is_Percent1=c1.Is_Percent
        ,Value2=c2.Value
        ,Is_Percent2=c2.Is_Percent
  from @c c1
  join @c c2 
	on c1.ID<>c2.ID and 
	(c1.ID>c2.ID or c1.Is_Percent=1 or c2.Is_Percent=1)
  union all
  select Name
        ,Value
        ,Is_Percent
        ,0
        ,0
  from @c
)
,PriceRankings as
(
  select t.ID
        ,Name
        ,Price
        ,Disc_Price
        ,Tot_Disc
        ,Rate
        ,Coupon_Names
        ,RowNum=row_number() over 
		(partition by ID order by Disc_Price)
  from @t t
  cross join CouponCombos
  cross apply (select Disc1=case when Is_Percent1=1
                            then Price*Value1/100 
                            else Value1 
                            end) ApplyDisc1
  cross apply (select Disc2=case when Is_Percent2=1
                            then (Price-Disc1)*Value2/100
                            else Value2
                            end) ApplyDisc2
  cross apply (select Disc_Price=Price-Disc1-Disc2
                     ,Tot_Disc=Disc1+Disc2
                     ,Rate=(Disc1+Disc2)/Price) CalcRate
  where Rate<=0.30 and Tot_Disc<=30
)
select 'ID NAME    PRICE    DISC_PRICE  TOT_DISC  RATE    COUPON_NAMES'
union all
select 
'-- ------  -------- ----------- --------- ------- -------------------------'
union all
select cast(ID as char(3))
      +cast(Name as char(8))
      +cast(convert(varchar,Price)+'$' as char(9))
      +cast(convert(varchar,Disc_Price)+'$' as char(12))
      +cast(convert(varchar,Tot_Disc)+'$' as char(10))
      +cast(convert(varchar,Rate*100)+'%' as char(8))
      +Coupon_Names
from PriceRankings
where RowNum=1

And here is how he explained to us :

The query first uses a CTE called CouponCombos, which combines the coupons creating  almost all possible combinations, including UNIONing single coupons.  Any two coupons that are both for flat (non-percent) dollar discounts, will result in the same dollar discount no matter what order the two were applied. Next I filter the combination where the ID of the first coupon is smaller than the ID of the second one. 

For example, Coupon 1 and Coupon 4 are for $15 and $12 respectively.  It doesn't matter whether you apply 1 then 4 or 4 then 1.  But the WHERE clause in my CTE will only choose the “4 then 1” option.  This is not a big deal, but it does make the final result, of the query, more deterministic and forces it to match its output with the expected output of the challenge.

I then employ a second CTE called PriceRankings, which CROSS JOINs the @T table (Products) with the CouponCombos CTE.  I then use CROSS APPLY three times to calculate (1) the Discount on the first coupon, then (2) the Discount on the second coupon (which is calculated on top of the first discount), and then finally (3) the Discounted Price and Total Discount and Rate.  By using CROSS APPLY this way, I can refer to more meaningful column names in the SELECT list and in the WHERE clause.  In addition to being clearer, this also makes the query more compact because it eliminates the introduction of either overly-complicated expressions or more CTE's.  And, what’s best, the readability of the CROSS APPLY method comes at no cost whatsoever.  I’m a huge fan of the often-misunderstood, underused APPLY operator.

The WHERE clause of the PriceRankings CTE only includes the items whose Discounted Price is more than 70% of the original price (or Rate<=0.30) and the items whose Total Discount was no more than $30.

Finally, the PriceRankings CTE employs the ROW_NUMBER() function to PARTITION BY the ID number of the Product table (@T) and ORDERs BY the Discounted Price.

The main query just pulls out the rows from the PriceRankings CTE that have a RowNum value of 1 (i.e. those rows with the lowest Discounted Price for each ID).

Note that I left the Rate column in the PriceRankings CTE as a decimal and didn’t multiply it by 100 until the main query to get a final percent.  Normally I wouldn't do this (I'd do it in one single equation), but for Product 4, the challenge’s expected output was 29.28%, when, in fact, the discount rate percent figure (when doing it as one equation) is 20.50 / 70.00 * 100 = 29.2857 which rounds to 29.29.  So this is why I separated it... By just dividing the two money fields 20.50 / 70.00, it comes out to 0.2928, thus when I multiply that by 100 in the main query, it finally comes out to 29.28.

Finally, the main query uses UNION ALL to incorporate the titles and output together so that the result came out EXACTLY as the challenge had requested.  If one runs the query with SET NOCOUNT ON and sets the Query Results To Text option to not include column headers, then the Results Text Window will appear exactly as the challenge had requested, character by character.

 

Thanks Brad for this great explanation!

Why join TSQL Challenge Team ?

Some weeks after this challenge, Brad has decided to join the TSQL Challenge team, here is a few words about it:

I first discovered the T-SQL Challenges with Challenge #10.  I submitted three completely different solutions for it, but unfortunately, none of them got selected as a winner (Hmmm... I'll have to have a word with the other team members about that... just kidding).  I looked forward to each new challenge... They were a lot of fun, and they gave the brain a good little workout every few weeks.  I also enjoyed reading how previous winners had come up with their solutions... there are lots of very talented people out there with great ideas.

So when I was asked to join the team, I accepted immediately.  I look forward to working with the other team members, whom I greatly respect and admire, and I also look forward to seeing how different people attack various problems.  I hope the team enjoys some of the crazy challenge ideas that I've got brewing in my head.

A Final Word

Congratulations again Brad for this solution and thanks again for your explanations!

We will finish tomorrow with the solution of Marcos and don’t forget TSQL Challenge 15 next week!

 

About the Authors.

Posted: 10-15-2009 6:08 PM by Rui Carvalho | with no comments
Filed under: , ,

TSQL Challenge 11 - Solution by Peter Larsson
Peter Larsson
4TD2939KPP22[1]

Professional database developer since 2001.
Started to use Microsoft SQL Server in 1993, and have been true since then.
He mostly working with performance troubleshooting and database architecturing.

How did Peter work on the challenge?

This is the solution provided by Peter and how he explained it to us:

The challenge is about finding the optimal strategy to maximize coupon discount.
One of the rules is that no more than two coupons can be applied to a single purchase.

This information made me think that a CROSS JOIN would be perfect solution, because it produces all possible combinations.
So I did and got the result I wanted.

Three other important rules were defined as

1) The coupons are cumulative; the second coupons is based on the result for first coupon
2) The discount price cannot be less than 70% of the original price
3) The total amount of the discount cannot exceed 30$

With this in mind, I started to see the Cartesian product is missing some combinations.  The Cartesian product is missing combinations because it produced all combinations of exactly two coupons.
What if a single coupon was enough, because using two coupons violated rule #2 and #3, ie too much discount.

The solution was to add a "No go" coupon! And that's basically all there is to this challenge.
Rest of solution is mostly for doing the math behind the combinations of coupons, this is done in a CASE clause for readability.

So here is my solution for the challenge problem.

Start by creating all combinations of coupons, including the "No go" coupon.

;WITH 
c (Name, Value, Is_Percent) AS ( 
	-- Get all present coupons from source table 
	SELECT Name, Value, Is_Percent 
	FROM @c UNION ALL 
	-- This is the "No go" coupon which is calculated 
	-- as $0 discount 
	SELECT '', 0, 0 ), 
Yak (ID, Name, Price, DiscPrice, cn, LowerLimit) AS ( 
	SELECT t.ID, 
	-- Fetch the Product ID 
	t.Name, 
	-- Fetch the Product Name 
	t.Price, 
	-- Fetch the Product Price 
	-- Now calculate the [discount value] for 
	-- all combinations of possible discount types. 
	-- There are four of them, since we can use up to 
	-- two coupons and every coupon can have either 
	-- percent or value discount 
	-- We also make sure to comply with rule #1 
	CASE 
		WHEN c1.IS_PERCENT = 1 AND c2.IS_PERCENT = 1 
		THEN t.Price * 	(1.0 - c1.Value / 100.0) * 
						(1.0 - c2.Value / 100.0) 
		WHEN c1.IS_PERCENT = 1 AND c2.IS_PERCENT = 0 
		THEN t.Price * (1.0 - c1.Value / 100.0) - c2.Value 
		WHEN c1.IS_PERCENT = 0 AND c2.IS_PERCENT = 1 
		THEN (t.Price - c1.Value ) * (1.0 - c2.Value / 100.0) 
		ELSE t.Price - c1.Value - c2.Value 
	END AS DiscPrice, 
	-- Here we calculate the coupon names to display 
	c1.Name + 
	CASE WHEN c2.Name = '' THEN '' ELSE ' + ' + c2.Name END AS cn, 
	-- This calculation is vital (rule #2 and #3), 
	-- because it determines the actual lower limit 
	-- allowed for combination of coupon values 
	CASE WHEN t.Price * 0.7 < t.Price - 30 
	THEN t.Price - 30 
	ELSE t.Price * 0.7 
	END AS LowerLimit 
	FROM @t AS t 
	-- This is the product table 
	CROSS JOIN c AS c1 
	-- This is the coupon table, 
	-- aliased c1 for first coupon 
	INNER JOIN c AS c2 ON c2.Name <> c1.Name 
	-- This is the coupon table, 
	-- aliased c2 for second coupon 
	-- Note that it isn't exactly a cartesian product, 
	-- because we can't use same coupon twice 
) 

With this base to work with; Product information, combined coupons discount value and allowed lower price for product,
we now can work with the data and select the optimal strategy for our buy.

This is now trivial, but I'll explain anyway since it can be confusing:

SELECT ID, 
	-- This is the Product ID 
	Name, 
	-- This is the Product Name 
	LTRIM(STR(Price, 15, 2)) + '$' AS Price, 
	-- This is the formatted Product Price (original) 
	LTRIM(STR(DiscPrice, 15, 2)) + '$' AS Disc_Price, 
	-- This is the formatted Product Price(discounted) 
	LTRIM(STR(Price - DiscPrice, 15, 2)) + '$' AS Tot_Disc, 
	-- This is the formatted Discount value (dollar) 
	LTRIM( STR( 
		ROUND( 100.0 - 100.0 * DiscPrice / Price, 2, 1),
		15, 2)) + '%' AS Rate, 
	-- This is the formatted Discount value (percent) 
	cn AS Coupon_Names 
	-- This is the coupon combination. 
	-- Note that these are in order of use 
FROM ( 
	SELECT ID, 
	Name, 
	Price, 
	DiscPrice, 
	cn, 
	ROW_NUMBER() OVER ( 
		PARTITION BY ID 
		ORDER BY DiscPrice, cn DESC) AS recID 
	-- This is vital. It makes sure that we 
	-- only use one combination of coupons for 
	-- any given product. 
	-- Lowest price is fetched first 
	FROM Yak WHERE DiscPrice >= LowerLimit 
	-- Make sure no combination of coupons exceeds 
	-- lower allowed price 
) AS d 
WHERE recID = 1 
-- Get only one combination for each product 
ORDER BY ID -- Sort the list for easier reading 

Thanks Peter for this explanation!

 

What can we learn from this?

Peter has thoroughly explained how to approach this problem from a TSQL perspective, but would like to reiterate that this challenge requires organizational and mathematical computation skills too, more explanation below.

1.  Write down all [business] rules. It's much easier to organize them and categorize them

2.  Practice your math skills. They will be needed over and over again

That’s a really important point, regarding your business needs you should optimize your query as needed directly from query conception. Adding rules is not only adding a condition to a query or joining another table ;-)

The Final Word

Congratulations Peter! Thank you for solution and in-depth explanation.

In the next few days, we will show Brad and Marcos solutions. Stay tuned for TSQL Challenge 15 next week!

About the Authors.

Posted: 10-14-2009 11:06 AM by Rui Carvalho | with no comments
Filed under: , ,

TSQL Challenge 11 Winners

Thanks to All

First and foremost TSQL Challenges would like to say thank you to all the participants. TSQL Challenges relies on your participation to bring the SQL Server community together.  Our mission at TSQL Challenges is to get the SQL Server community excited about TSQL and increase set based query awareness. 

Our participation metrics show an upward trend in community participation, for each and every challenge issued.  With that said many of you have expressed concerns about the turnaround time between solution submissions and announcing winners.   We know the results mean a lot to you and that is why we are currently working to improve our workflow and solution evaluation processes. 

In addition to better processes, TSQL Challenges is recruiting more talent to help with these processes.   It is our hope that in the very near future challenge submission and evaluation will be handled more quickly, so you can see the results in a timelier manner.

The Challenge

TSQL Challenge 11 is about calculating the lowest price of an Item by applying the best combination of discount coupons. The target of this challenge is to use differing coupon combinations to maximize savings and minimize cost. The code should be scalable and work with any number of coupons.

The Winners

Congratulations to the winners!

We are very pleased to announce the two winners of this challenge.  It is also worthy of noting that both winners have recently joined the TSQL Challenges team. We are honored to have them. Welcome aboard Brad and Peter!

Here is the winners:

Brad Schulz
My Photo Brad has been working with databases for 30 years. He was a Microsoft Visual FoxPro (VFP) MVP from 1994-2000, and has written articles for various VFP publications. He now devotes most of his time to SQL Server and is a moderator at the MSDN T-SQL Forums. He has a T-SQL blog at http://bradsruminations.blogspot.com/

 

Peter Larsson
4TD2939KPP22[1]

Professional database developer since 2001.
Started to use Microsoft SQL Server in 1993, and have been true since then.
He mostly working with performance troubleshooting and database architecturing.

 

4TDQPM9Y9CM6[1]

Marcos is a support analyst for an ISV company in Joinville, Brazil, where he helps customers properly configure and administer their databases. He is also an enthusiastic .NET and SQL Server developer and holds some Microsoft certifications (MCDBA and MCTS/MCITP for SQL 2005 and 2008).
He is currently attending a postgraduate course and helps in local forums whenever time allows.

You will see their solutions in the next days and don’t forget to check for the new challenge on next Monday.

 

About the Authors.

Posted: 10-13-2009 12:00 PM by Rui Carvalho | with no comments
Filed under: ,

TSQL Challenges - All in one
The All in one page with all our challenges and solutions reference moved to this url: http://beyondrelational.com/blogs/tc/pages/tsql-challenge-all-in-one-list-of-publications-and-solutions.aspx Please update your bookmarks!
Copyright © Beyondrelational.com