Facebook Sign in | Join
Pushing database changes needn't be hard work with SQL Compare
Getting Started with Adobe After Effects - Part 6: Motion Blur
First Time? You can support us by signing up. It takes only 5 seconds. Click here to sign up. If you already have an account, click here to login.

TSQL Challenge 6 Winners

Again there were a lot of participants for this challenge, and a lot of different solutions.

Regarding all these solutions this is the short list of the winners we choose for this challenge:

  • Arnaud Lhopiteau

    Arnaud Lhopiteau is a MCSD .NET & Senior developer at Alltitude.com, his father introduced him to database systems with R:Base in late 80’s .
    He loves to take care of his family, read Joe Celko’s book, smoke pipe and drink tea. Currently he uses SQL Server 2008 in his day to day work.
    Arnaud's solution is a nice GROUP BY WITH ROLLUP one.

 

  • Aurelien already closed the initial winners of the TSQL Challenge 5. He's solution for this challenge is a nice GROUP BY WITH ROLLUP one too.  

    Aurelien Verla

    Aurelien Verla is Online Strategies Team Manager at Wygwam. Web Developer since 10 years, his first approach with databases systems was Access !

    MVP around ASP.NET Platform, he also spend a lot of time around SQL Server for one reason: If the UI is slow, first look at the database ! You can find it's english blog here and the french one here.

  • Leonid Koyfman
     
    Again for this challenge, Leonid Koyfman provides us an elegant solution. In fact, Leonid sent two solutions: one using GROUP BY WITH ROLLUP like Aurelien's and Arnaud's one, and the other using one UNION for each level.
    There will be for sure some discussions about performance issues ! 
    Leonid is a ‘many times winner’ of the challenge and SQL Server Expert.

 

  • Cleber Augusto Martins

    Cleber is a new challenger. He provides us a nice and readable solution using UNION and CROSS JOIN.
    Cleber Martins works for ACI Worldwide as a Sr. Business Solutions Analyst for the Risk products. Cleber is  based in Brazil and has 8 years of experience supporting financial institutions to prevent fraud and money laundering activities with neural models, rules based technology and consulting services.
    Cleber's experience with SQL Server is very successful in transactional warehouses handling 1/2Billion transactions a month on terabyte databases where peaks can go over 4MM transactions per hour.

    Cleber said about TSQL Challenges:
    "I've been on engagements where customers want to extract very complex and detailed information from our product's databases, so once I'm not a SQL professional then I'm always researching the internet for clues. Now that I found beyondrelational.com and the challenges then I'm able to grow my knowledge on weekly basis and do my job better everyday."

 

We will publish in the few next days explanations for each solution.

The key point of this challenge was to show how it is possible to produce a full report inside the database without any other reporting facilities. I am not saying that Reporting Services or any other reporting solution is not good, I just say that most of the times this reporting computation should be done simply directly from the database in Tsql. What are the benefits of this solution? In a few words we can say:

  • all is done in the database, you should hope better performance ;-)
  • decrease unneeded network traffic
  • just get what you need

The idea is that you should have better performance if you avoid retrieving a lot of data to analyze in your application (that’s quite obvious isn’t it?). But all is not ideal, because with that you will have other troubles:

  • you will introduce domain logic in your database (and you should probably don’t want doing this if you work with an ORM)
  • You will have to write all in Tsql (as you do it by the hand, there is no drag and drop reporting facilities ;-).
  • It should be harder to write, understand and maintain (one big query is always harder than a lot of smaller queries).

All of that just to says that the best choice for doing reporting depends on your needs ;-). I just want with this challenge to present an old-fashioned way of doing reporting that is still a good option today!

The problem in this challenge is that we have to produce a complete report table with rows of data (the original rows of data) but also introduce some consolidation rows.

There are different ways to handle the problem in pure TSql. The options are:

  • Iterate through a CTE on each consolidation level
  • Suppose that you have a predefined number of levels and create one query for each consolidation level that you join with unions
  • Use the old ROLLUP functionality of TSQL (which may be deprecated in future versions but still useful right now)

I think we have already talked a lot about CTE in this challenges, and that’s why I present a solution based on the ROLLUP grouping.

So, for newcomers what is that? The grouping with ROLLUP function is a TSql option of the GROUP BY that will add automatically a consolidation row for each group in a hierarchical way. Let’s see a short example:

DECLARE @T TABLE (ID INT,Label CHAR(4),Tag CHAR(4))
INSERT INTO @T (ID,Label,Tag) 
SELECT 1,'A','SQL' UNION SELECT 2,'B','XML' 
UNION SELECT 3,'C','SQL' UNION SELECT 4,'B','SQL' 
UNION SELECT 5,'A','XML' UNION SELECT 8,'C','SQL' 
UNION SELECT 7,'B','SQL' UNION SELECT 6,'A','XML'
 
SELECT    Tag,Label,COUNT(ID) AS ItemsCount,
        GROUPING(Tag)+GROUPING(Label) As Level
FROM @T GROUP BY Tag,Label WITH ROLLUP

Which will produce the following table:

Tag    Label    ItemsCount   Level
SQL     A       1            0
SQL     B       2            0
SQL     C       2            0
SQL     NULL    5            1
XML     A       2            0
XML     B       1            0
XML     NULL    3            1
NULL    NULL    8            2

We can see that for each consolidation level of the hierarchy a row is added, a null is set for group by fields that are under the current level and that the aggregates fields are automatically computed for this level.

One other interesting feature is the GROUPING function.  The GROUPING function returns 0 if the column value came from the fact data, and 1 if the column value is a NULL generated by the aggregation. Adding a grouping for each group by column we can create the hierarchical levels.

We can tune our original example query to have better label information using this method:

SELECT  Tag = CASE WHEN GROUPING(Tag) = 1 THEN 'ALL' ELSE Tag END,
        Label= 
        CASE GROUPING(Tag)+GROUPING(Label)
            WHEN 1 THEN 'Total ' + Tag
            WHEN 2 THEN 'Total ALL TAGS'
            ELSE Label END,
        COUNT(ID) AS ItemsCount,
        GROUPING(Tag)+GROUPING(Label) As Level
FROM @T GROUP BY Tag,Label WITH ROLLUP

 

Which produce:

Tag  Label          ItemsCount  Level
---- -------------- ----------- -----
SQL  A              1           0
SQL  B              2           0
SQL  C              2           0
SQL  Total SQL      5           1
XML  A              2           0
XML  B              1           0
XML  Total XML      3           1
ALL  Total ALL TAGS 8           2

Coming back  to the challenge, there is one solution using this feature. It is not the shortest or the best, but is relatively readable and (I hope) maintainable:

SELECT [PERIOD_ID]= 
        CASE [LEVEL]
        WHEN 0 THEN CAST([YEAR] AS NVARCHAR)+ CAST([WEEK] AS NVARCHAR) 
             + CAST([WEEKDAY]    AS NVARCHAR)        
        WHEN 1 THEN CAST([YEAR] AS NVARCHAR)+ CAST([WEEK] AS NVARCHAR) + 'T'
        WHEN 2 THEN CAST([YEAR] AS NVARCHAR)+ 'T'
        END    
        ,[LEVEL]
        ,[PERIOD] =
            CASE [LEVEL]
                WHEN 0 THEN '   ' + [WEEKDAYNAME] + ' ' + [MONTHDAY]                
                WHEN 1 THEN 'TOTAL WEEK ' + CAST([WEEK] AS NVARCHAR)
                WHEN 2 THEN 'TOTAL YEAR ' + CAST([YEAR] AS NVARCHAR)
            END
        ,[HOME] = SUM(CASE WHEN Page = 'Home' THEN TOTAL ELSE 0 END)
        ,[CONTACT] = SUM(CASE WHEN Page = 'Contact' THEN TOTAL ELSE 0 END)
        ,[PRODUCTS] = SUM(CASE WHEN Page = 'Products' THEN TOTAL ELSE 0 END)
FROM (
SELECT    Page
        ,[YEAR] = DATEPART(YEAR,VisitDate)
        ,[WEEK] = DATEPART(WEEK,VisitDate)
        ,[WEEKDAY] = DATEPART(WEEKDAY,VisitDate)
        ,[WEEKDAYNAME] = LEFT(UPPER(DATENAME(WEEKDAY,VisitDate)),3)
        ,[MONTHDAY] = CAST(DATEPART(DAY,VisitDate) AS NVARCHAR) + '/' 
                    + CAST(DATEPART(MONTH,VisitDate) AS NVARCHAR)
        ,[TOTAL] = SUM(NbVisitors)
        ,[LEVEL] = GROUPING(DATEPART(WEEKDAY,VisitDate)) 
                 + GROUPING(DATEPART(YEAR,VisitDate)) 
                 + GROUPING(DATEPART(WEEK,VisitDate)) 
                 + GROUPING(CAST(DATEPART(DAY,VisitDate) AS NVARCHAR) 
                 + '/' + CAST(DATEPART(MONTH,VisitDate) AS NVARCHAR))
FROM @T
GROUP BY Page
        ,DATEPART(YEAR,VisitDate)
        ,DATEPART(WEEK,VisitDate)
        ,DATEPART(WEEKDAY,VisitDate)
        ,DATENAME(WEEKDAY,VisitDate)
        ,CAST(DATEPART(DAY,VisitDate) AS NVARCHAR) + '/' 
         + CAST(DATEPART(MONTH,VisitDate) AS NVARCHAR)
WITH ROLLUP
) T
WHERE 1=1
    AND [LEVEL] < 3
GROUP BY 
        CASE [LEVEL] --> by Period Id
        WHEN 0 THEN CAST([YEAR] AS NVARCHAR)+ CAST([WEEK] AS NVARCHAR) 
             + CAST([WEEKDAY]    AS NVARCHAR)        
        WHEN 1 THEN CAST([YEAR] AS NVARCHAR)+ CAST([WEEK] AS NVARCHAR) + 'T'
        WHEN 2 THEN CAST([YEAR] AS NVARCHAR)+ 'T'
        END,     
        CASE [LEVEL] --> by period
        WHEN 0 THEN '   ' + [WEEKDAYNAME] + ' ' + [MONTHDAY]                
        WHEN 1 THEN 'TOTAL WEEK ' + CAST([WEEK] AS NVARCHAR)
        WHEN 2 THEN 'TOTAL YEAR ' + CAST([YEAR] AS NVARCHAR)
        END
        ,[LEVEL]
ORDER BY [PERIOD_ID]

If you have questions or comments about this solution, discuss it on the dedicated forum: http://beyondrelational.com/groups/tsqlchallenge/forum/t/125.aspx

If you have generic questions or comments about the TSQL Challenge 6 you can go to this forum: http://beyondrelational.com/forums/t/37.aspx

Stay tunned and don't forget to check for TSQL Challenge 9


Share

Comments

# re: TSQL Challenge 6 Winners

Wednesday, June 03, 2009 2:03 PM by Leonid Koyfman

I like the idea to produce a full report inside the database without any other reporting facilities.

In addition to the benefits of this solution you listed I would add

- ability to schedule data retrieval and

- email the results from SQL Server.

In conjunction with ability to format output as HTML table using FOR XML PATH

(see bookonline example Sending an HTML e-mail message in msdn.microsoft.com/.../ms190307.aspx )

it opens a wide array of reporting options.

Report can be emailed on a schedule with HTML formatted body.

I have dozens of reports produced this way for myself for database monitoring and for business users.

Formatting is limited only by HTML skills.

Even better if this part can be delegated to somebody from creative or marketing group, and they prepare HTML template with portlets placeholders).

Then replacing those placeholders with output formatted as HTML table(s) is just a modification of book online example.

I think for fixed recurrent reports this is better than having users to login on Reporting Services and hit database every time to see static results.

Taking this approach raises importance of many output formatting T-SQL functions and hacks that arguably considered more suitable for a front end.

I mean all flavors of DATENAME, CONVERT etc.

If old-fashioned way of doing reporting is still alive, it means it's reliable and in many cases good enough option.


Copyright © Rivera Informatic Private Ltd.