June 2009 - Posts
Thanks to all Challengers
I would like to cheerfully thanks all challengers for their solutions to this challenge.
It's the members of a community that grow that community. Information you are sharing here is the very heart of this learning community's value!
The Challenge
So, challenge 8 was about processing HIERARCHIES… without using the new data type HIERARCHYID of SQL Server 2008.
Using a CTE was required to write a query that can take a Manager Name as a parameter and list all the employees reporting to that manager, directly or indirectly
/*
Jacob
Jess
Steve
John
Bob
Michael
Paul
Rui
Smith
Lana
Johnson
Bobbey
Mic
Stev
Jacobson
Steffi
Paulson
Bracha
Jessica
*/
Adding Complexity
This query is pretty easy to write using a recursive CTE. To make the query complex, so that it will meet the complexity level expected for the ‘TSQL Challenges’ series, the following restriction was added:
“The query should be written using a recursive CTE, but the filter for "@manager” should not be applied inside the CTE”
For the sake of simplicity, let’s start with a filter inside the CTE.
This a Rob Farley’s code. Rob sent this solution before this restriction on the filter for @manager was added.
Thanks Rob for this great code and for clear comments !
DECLARE @Employees TABLE (EmpID INT, EmpName VARCHAR(20), ReportsTo INT)
INSERT INTO @Employees(EmpID, EmpName, ReportsTo)
SELECT 1, 'Jacob', NULL UNION ALL
SELECT 2, 'Rui', NULL UNION ALL
SELECT 3, 'Jacobson', NULL UNION ALL
SELECT 4, 'Jess', 1 UNION ALL
SELECT 5, 'Steve', 1 UNION ALL
SELECT 6, 'Bob', 1 UNION ALL
SELECT 7, 'Smith', 2 UNION ALL
SELECT 8, 'Bobbey', 2 UNION ALL
SELECT 9, 'Steffi', 3 UNION ALL
SELECT 10, 'Bracha', 3 UNION ALL
SELECT 11, 'John', 5 UNION ALL
SELECT 12, 'Michael', 6 UNION ALL
SELECT 13, 'Paul', 6 UNION ALL
SELECT 14, 'Lana', 7 UNION ALL
SELECT 15, 'Johnson', 7 UNION ALL
SELECT 16, 'Mic', 8 UNION ALL
SELECT 17, 'Stev', 8 UNION ALL
SELECT 18, 'Paulson', 9 UNION ALL
SELECT 19, 'Jessica', 10
DECLARE @manager VARCHAR(20)
SELECT @manager = 'Jacob'
-- Your query here:
; /* Previous query should've been ended with a semi-colon */
with OrgChart as
(
/* Base case for recursive CTE, with an empty varbinary to start with */
select cast('' as varbinary(max)) as orderby, e.EmpId, e.EmpName
from @Employees e
where e.EmpName = @manager
union all
/* Recursive case, padding onto the end of the varbinary to maintain correct order */
select orderby + cast(e.EmpID as varbinary(4)), e.EmpId, e.EmpName
from OrgChart o
join
@Employees e
on e.ReportsTo = o.EmpID
)
/* Now see the length of the orderby, divided by 4 (the length of an integer), and put this many pads in */
/* NB: As the pad is four spaces, this could've been done using a single space and no division, but I prefer it this way for flexibility */
select replicate(' ',len(orderby) / 4) + EmpName
from OrgChart
order by orderby;
Possible pittfalls
If you make a first attempt using @manager at CTE’s anchor. When getting the filter on @manager outside the CTE, do not forget to restrict CTE’s anchors to employees reporting to no one (ReportsTo IS NULL), or you’ll have a huge quantity of unnecessary reads!
Cosmetics: In the output resultset, do not pad the manager.
The Winners
Congratulations to the winners :-)
We’ll see in a few days how Matthieu, Syed and Leonid did manage this challenge.
- Matthieu Hodin
Matthieu Hodin is Project Manager Wygwam, developper since he got an HP48GX in hands (who remember?), and web developper since 2002.
Interested in "data" and "interface".
Activity preferred at job : put MVP's at work and challenge them to find solutions (for customers and not only for these SQL challenges ;p) ) !
- Syed Mehroz Alam
Syed Mehroz Alam, living in Karachi, Pakistan, is primarily a developer focusing Microsoft technologies. He has completed his bachelors as a Computer Systems Engineer in 2006 and is currently pursuing a Masters degree in Computer Science. Despite developing rich internet applications, he loves to work with SQL Business Intelligence platform that enhances his TSQL expertise. He is fond of logical challenges and has won several speed programming competitions which are listed here.
Mehroz writes articles at CodeProject, and expresses his experiences with .NET and SQL server at his blog. When he has time, he contributes to MSDN and Silverlight forums. He loves to play football (Soccer) and computer games, especially first person shooters and RPGs.
- Leonid Koyfman
Leonid is a ‘many times winner’ of the challenge and SQL Server Expert.
If you have generic questions about this challenge the dedicated is still open to discuss: http://beyondrelational.com/groups/tsqlchallenge/forum/t/66.aspx
Thanks again, and stay tuned for next challenges!
Syed Mehroz Alam prensents us a nice solution using all the principles already described. As obvious, he’s solution is one of the sortests. But let’s present Syed berore running into it:
Syed Mehroz Alam, living in Karachi, Pakistan, is primarily a developer focusing Microsoft technologies. He has completed his bachelors as a Computer Systems Engineer in 2006 and is currently pursuing a Masters degree in Computer Science. Despite developing rich internet applications, he loves to work with SQL Business Intelligence platform that enhances his TSQL expertise. He is fond of logical challenges and has won several speed programming competitions which are listed here.
Mehroz writes articles at CodeProject, and expresses his experiences with .NET and SQL server at his blog. When he has time, he contributes to MSDN and Silverlight forums. He loves to play football (Soccer) and computer games, especially first person shooters and RPGs.
Here is it’s solution and explanation:
This is the original solution:
create table #temp
(
[database] nvarchar(MAX),
[table] nvarchar(MAX),
[rows] int,
[reserved_size] nvarchar(100),
[data_size] nvarchar(100),
[index_size] nvarchar(100),
[unused_space] nvarchar(100)
)
declare @sql nvarchar(MAX)
set @sql=replace('if !~! not in (!master!,!model!,!msdb!,!tempdb!) exec [~].dbo.sp_msforeachtable "insert into #temp([table], [rows], [reserved_size], [data_size], [index_size], [unused_space]) exec [~].dbo.sp_spaceused !?!"','!',char(39))
EXEC sp_MSForEachDB
@command1=@sql,
@command2="update #temp set [database]='~' where [database] is null",
@replacechar='~'
select top(5) [database] as base, [table], [data_size] as size, [rows] as rows
from #temp
order by Cast(LEFT([data_size],len([data_size])-3) as int) desc
drop table #temp
So, I started by created a temporary table with columns (database, table, rows, reserved_size, data_size, index_size, unused_space) for the output. I used the two undocumented stored procedures sp_MSforeachdb and sp_MSforeachtable to iterate through all the tables in all the databases and executed sp_spaceused as described in the following pseudo code:
foreach(database db in serverDatabases)
if (db not in 'master', 'msdb', 'model', 'tempdb')
foreach(table t in db.Tables)
{
insert into #temp (table, rows, reserved_size, data_size, index_size, unused_space)
execute sp_spaceused for table 't'
--at this point, our #temp table will be populated with data for each table
--but the 'database' column will be 'null', so now replace it with the name of database
update #temp
set [database] = 'db' where [database] is null
}
The most important part is that I am using an update operation for storing the database name in the temporary table. Thanks to Microsoft that we can give a set of 3 commands to the above mentioned undocumented stored procedures. Another hard part was to create a single t-sql statement that iterates for all tables inside a database and execute sp_spaceused. I did this by a complex combination of single quotes, double quotes and the replace function. In the last, I am just selecting the top(5) rows ordered by size.
To enter the contest, I reduced the script length by replacing all the variable/column names with a single length identifier. Here was my final submission:
create table #t(d nvarchar(MAX),t nvarchar(MAX),r int,x nvarchar(100),s nvarchar(100),y nvarchar(100),z nvarchar(100))
declare @s nvarchar(MAX)
set @s=replace('if !~! not in (!master!,!model!,!msdb!,!tempdb!) exec [~].dbo.sp_msforeachtable "insert into #t(t, r,x,s,y,z) exec [~].dbo.sp_spaceused !?!"','!',char(39))
EXEC sp_MSForEachDB @command1=@s, @command2="update #t set d='~' where d is null", @replacechar='~'
select top(5) d as base, t as [table], s as size, r as rows from #t order by Cast(LEFT(s,len(s)-3) as int) desc
drop table #t
Thanks again Syed for sharing this solution with us and congratulations for being part of the winners!
As the TSQL Challenge #10 is available don’t forget to have a look on it.
I also invite you to see the post about “Invitation to volunteer with TSQL Challenges Committee”.
Happy Challenge now!
WALDAR’S SQLING AND DATAWAREHOUSING PLACE An interesting SQL Server blog by Fabien, which is about SQL (mostly Oracle and SQL Server) and Datawarehousing.One can expect tips and tricks around SQL and Datawarehousing, driven around what he runs against...
Wow! Free training to solve TSQL Challenges
Along with TSQL Challenge #10, we are introducing a FREE training program to help more people capable of solving real-life TSQL challenges. If you are able to solve a given challenge, please go ahead and send us your entries. However, if you could not solve it, register for a FREE training webcast (after the challenge closes) that will help you to learn how to solve problems similar to the one given in the challenge. Details of the training program is given at the end of this post.
Challenge #10
I think this challenge will be very interesting as you need to sort the data horizontally and vertically to solve this. To understand this challenge better, let us look at the source data and expected result.
Source Data
C1 C2 C3
---- ---- ----
2 1 3
3 2 1
Z X Y
B C D
Y Z X
B C A
Expected Result
Here is the expected output. Write a single query that operates on the above data and produces the output given below.
C1 C2 C3
---- ---- ----
1 2 3
A B C
B C D
X Y Z
This challenge requires the following steps:
- Sort the values horizontally. Arrange the values from smallest to the largest. for example, the first row contains values “2”, “1” and “3”. This should be arranged as “1”, “2” and “3”.
- Sort the rows vertically. This is the regular sorting that we are familiar with. Final result should be sorted as shown in the ‘expected result’ listing.
- Remove duplicates – Duplicate rows should be removed from the final output.
Sample Data
You can use the following script to generate the sample data.
DECLARE @t TABLE (
c1 CHAR(1),
c2 CHAR(1),
c3 CHAR(1)
)
insert into @t (c1, c2, c3) values ('2','1','3')
insert into @t (c1, c2, c3) values ('3','2','1')
insert into @t (c1, c2, c3) values ('Z','X','Y')
insert into @t (c1, c2, c3) values ('B','C','D')
insert into @t (c1, c2, c3) values ('Y','Z','X')
insert into @t (c1, c2, c3) values ('B','C','A')
SELECT * FROM @t
Notes
- As usual, write a single query that produces the expected output. CTEs are acceptable.
- Send your entries to tc@beyondrelational.com
- Do not include your solutions in the body of the email. Send them as an attachment in the email.
- Add ‘TSQL Challenge #10’ in the subject line of the email.
- Last date to submit your entries: 29 June 2009
- Use this forum for any questions related to TSQL Challenge #10
Free Training Webcast
With this challenge, we start arranging a free training webcast for those people who are interested in the challenge, but not able to solve it. The training event will be right after the closing date of the challenge. People attending the training webcast can still send their solutions, but will be evaluated separately. Winners will be identified only from the first submission. If some one writes a good solution after attending the training webcast, we will publish the solution, but will not be considered as a winner.
The training webcast may not directly try to solve the given challenge. However, it will explain how to write SET based queries that move data from one shape to another and the tips and tricks explained in the webcast can be used to solve the given challenge.
To attend the training webcast, register at: http://tsqlchallenge10.eventbrite.com/
Barry Young provides us a nice solution using the functions MsForeach that we already talked about but also with a very interesting inline-decompressing feature.
R. Barry Young has been a Principal Database and Development Consultant for Proactive Performance Solutions, a software consulting company in Northern Delaware for the last 15 years. Prior to that he was a performance and capacity planning Consultant for Digital Equipment Corporation.
Databases that he has used include SQL Server, Oracle, Sybase, VAX RDB and VAX DBMS. Lanuages that has programed in include SQL, VB.net, C#, VB, C++, VBA, C, Pascal, Fortan, Cobol, Bliss, TPU, DCL, Teco, Javascript, DHTML+CSS, Modula, Macro-32, Macro-11, etc. Industries that he has worked for or consulted in include Insurance, Finance, Aerospace, Chemical, Software, Pharmacuetical, Health & Medical, Defense, Manufacturing and Government.
Barry has been programming for 35 years, profesionally for 30 years, and as a professional consultant for over 25 years. He has a B.S. in Theoretical mathematics. He is a frequent poster at SQLServerCentral.com and has had several articles published there as well. He is also active as a presenter in his local SQL Server & .Net SIGs.
You can found of him on his blog site is MovingSQL.com, and you can usually found him at SQLServerCentral.com.
you can find below the solution of Barry with a full explanation.
So how did I come to develop what is possibly the ugliest, most inelegant and incomprehensible kludge of SQL code ever seen by human eyes?
Well, I initially came up with two different approaches to this TSQL challenge, one based on sp_spaceused and the undocumented procedures sp_MSForEachDB and sp_MSForEachTable to re-execute it for every table in every non-system database, and the other using nested aggregated dynamic SQL to access the the allocation DMV's directly. The sp_MSForEachDB approach as shown in uncompacted form proved to be significantly shorter:
-- Temp Table to hold SP_SPACEUSED output for each table in a
-- single database.
CREATE TABLE #T
(
[TABLE] NVARCHAR(128),
ROWS CHAR(11),
V VARCHAR(128),
SIZE VARCHAR(128),
I VARCHAR(128),
U VARCHAR(128)
)
-- Temp Table to hold table space info accumulated and labeled
-- for each database.
CREATE TABLE #TT
(
BASE NVARCHAR(128),
[TABLE] NVARCHAR(128),
ROWS CHAR(11),
V VARCHAR(128),
SIZE VARCHAR(128),
I VARCHAR(128),
U VARCHAR(128)
)
-- Execute commands for each database:
EXEC SP_MSFOREACHDB '
USE [^] -- Switch to the next database
IF DB_ID()>4 -- Skip System databases
BEGIN
-- For Each Table in this database:
EXEC SP_MSFOREACHTABLE
-- Execute SP_SPACEUSED on the table and Insert the output into #T
''INSERT INTO #T EXEC SP_SPACEUSED ''''?'''''',
-- Filter out Temp tables and table variables in TempDB which
-- cannot be accessed this way (turns out I didnt need this,
-- but I missed it).
@WHEREAND=''AND O.NAME NOT LIKE ''''#[0-9]%'''''',
-- Save local data into #TT adding DB name label.
-- (in fact I only need two tables so that I can add the
-- DB name labels)
@POSTCOMMAND=''INSERT INTO #TT SELECT DB_NAME(),* FROM #T
DELETE FROM #T''
END
', '^'
-- Change the DB substitution symbol, because '?' is used for
-- the table names
-- Display the top 5 tables in the whole Server Instance:
SELECT TOP 5 BASE ,
[TABLE] ,
SIZE ,
ROWS
FROM #TT
-- Order by size descending:
ORDER BY CAST(LEFT(SIZE ,LEN(SIZE )-3) AS INT) DESC
-- CleanUp
DROP TABLE #TT
DROP TABLE #T
Unfortunately, it was still huge. Even after taking out all of the comments and replacing all of the carriage returns and line feeds and all whitespace with single spaces, it was still 634 characters long. I was pretty sure that this was not nearly short enough to win, so I had to find some other way to make it shorter.
It occurred to me that there was a lot of redundancy in the script that it would probably compress pretty well. However, obviously I could not implement a ZIP algorithm to compress it. Not because it couldn't be done in T-SQL, in fact it is quite doable, but because the corresponding decompression code that would need to be included would be much larger than the original uncompressed script. However, I thought that a simple many-to-one character substitution compression might give me enough compression, if the decompression code that I would have ti embed could be made small enough.
The obvious approach was to just use REPLACE to substitute the longest and most frequently occurring strings for bytes in the character set that I was not otherwise using. Unfortunately this resulted in a decompression script that consisted of about 30 lines like this:
SELECT @S=REPLACE(@s,'`','CATCH ')
SELECT @S=REPLACE(@s,'~','TRY ')
SELECT @S=REPLACE(@s,'>','END ')
SELECT @S=REPLACE(@s,'<','BEGIN ')
SELECT @S=REPLACE(@s,';','), ')
SELECT @S=REPLACE(@s,'!','TABLE')
SELECT @S=REPLACE(@s,'$','var¥')
SELECT @S=REPLACE(@s,'¥','char(')
This meant that each replacement was going to add an additional 34 characters plus a the length of a copy of the string being replaced, resulting in a total characters saved = (N*L) -N*1 -34 -L. Where "L" is the length of the substring being replaced, "N" is the number of occurrences of that substring, so "(N*L)" is the total number of characters removed, "-N*1" is the number of single characters that replace them, "-34" is the base overhead per replacement and "-L" is the copy of the original string that remains in the decompressing REPLACE statement. Simplifying this gives Total_Savings = (N-1)*(L-1) - (34+1).
This was a very large cost of replacement, requiring at least 2 occurrences of a substring 37 or more characters long to result in a total savings that was positive, or 37 occurrences of a 2 character substring or anything in between. A cost this high was not going to result in in many substitution opportunities nor much overall compression. I determined that I could save an enormous amount by re-compressing the decompression statements themselves, replacing the string "') SELECT @S=REPLACE(@s,'" and "','" each with a single character. This would be very complicated because I would have to nest it and actually have two embedded decompression routines along with all of the quote-doubling that goes into nesting dynamic SQL, but it would change the cost equation to (N-1)*(L-1) - (2+1), which would mean that cases as small as 2 occurrences of a 5 character string or 5 occurrences of a 2 character string could be replaced with a net gain in total length.
That was pretty good, but the additional cost and complexity of two decompression routines and having to double-quotes twice was a concern. I felt that there must be a simpler way to both decrease the replacement cost and simplify the decompression.Obviously I could decompress the string by doing the REPLACE in a loop, but then where could I efficiently store and retrieve the REPLACE parameters? Finally it occurred to me that since the substitution strings where always a single character I could prefix them onto the beginning of the compressed command string and have the decompresser remove them as it used them. I could do the same thing with the replacement strings but since they were variable length I would need an unused character as a separator ("~"). Thus if the starting command string was this:
"CREATE TABLE #T( ... CREATE TABLE #TT( ...
DROP TABLE #T DROP TABLE #TT"
substituting "$" for the repeated string " TABLE #T" would look like this:
"$ TABLE #T~CREATE$( ... CREATE$T( ... DROP$ DROP$T"
resulting in a total savings of 21 characters. This has further reduced our base cost from 2 to 1, making the savings formula now (N-1)*(L-1) - (1+1) meaning that occurrences as small as four two-character strings or two four-character strings would be compressible. plus the decompressor was relatively simple, resulting in the following SQL:
declare @s nvarchar(999)
Set @s='Œ''~‚,ROWS~€ŒŒ~}and~|le~{char(1~\e ~Q Se|ct ~Jexec
sp_~< drop $+~;creat\$+~: From+~7SIZ\~6€INSERT into+~/JMSFor
Each~+ #t~&€Œ~$TAB|~"[$] n!‚ {1),v !,7!,i !,u !) ~!var{28)~;
(";t(BAS\n!,"/DB ŒUS\[^]IF DB_ID()>4 BEGIN /$ 6 Jspaceused
&Œ?&&,@where}=€} o.nam\not lik\&Œ#[0-9]%&&,@postcomm}=6tQdb_
name(),*: De|te:€ END Œ, Œ^ŒQTOP 5 BAS\,[$] ,7‚:t order by
cast(|ft(7,|n(7)-3) as int) desc<' ASCII(@s)< WHILE>99
SELECT @s=REPLACE
(
Right( @s, Len(@s) - (Charindex('~', @s)) ),
Left( @s, 1 ),
substring( @s, 2, Charindex('~', @s)-2 )
)
EXEC(@s)
(Editor note: the @s string was originally in one row but for formatting reasons it was exploded in multiple lines, same remark for the next code)
Of course, I had to compact this as well (remove line breaks, whitespace, etc.), with the final command string being:
declare @s nvarchar(999) Set @s='Œ''~‚,ROWS~€ŒŒ~}and~|le~{char(1~\e ~Q Se|ct ~Jexec sp_~< drop $+~;creat\$+~: From+~7SIZ\~6€INSERT into+~/JMSForEach~+ #t~&€Œ~$TAB|~"[$] n!‚ {1),v !,7!,i !,u !) ~!var{28)~;(";t(BAS\n!,"/DB ŒUS\[^]IF DB_ID()>4 BEGIN /$ 6 Jspaceused &Œ?&&,@where}=€} o.nam\not lik\&Œ#[0-9]%&&,@postcomm}=6tQdb_name(),*: De|te:€ END Œ, Œ^ŒQTOP 5 BAS\,[$] ,7‚:t order by cast(|ft(7,|n(7)-3) as int) desc<' ASCII(@s)< WHILE>99SELECT @s=REPLACE(Right(@s,Len(@s)-(Charindex('~',@s))),left(@s,1),substring(@s,2,Charindex('~',@s)-2)) EXEC(@s)
With a compressed length of 384 characters, I had shortened the original string by almost 50%, however, adding in the additional external commands, including the decompresser brought the total length back up to 550 characters. I had managed to achieve a final reduction of over 80 characters, not nearly as much as I had hoped, but still significant and after a week of trying, discarding and retrying dozens of different approaches and compression attempts, I was happy to settle for it.
Well, this has gotten pretty long and I have skipped over a lot of stuff (such as the compression routines, a whole story in themselves), so if anyone has any questions about this, I would request that you post them as a new thread at SQLServerCentral.com
As Barry said compression routines are whole story in themselves and I hope we will have time to talk about them in a near future!
Again congratulations to Barry to share with us this really interesting solution!
We are very happy to see the excitement and interest that the SQL Server community has shown towards TSQL Challenges. We take this opportunity to thank you all for your continuous support and participation in TSQL Challenges.
TSQL Challenges constantly aim at helping people to enhance their SET based query writing skills. With TSQL Challenges, some times you learn stuff that you don’t know, some times you will see better ways of doing stuff that you already know and some times you will be able to use your expertise to help others to learn TSQL querying skills. Even SQL Server experts love TSQL Challenges because every challenge inspires them to come up with new better ways of solving the given problem. The story narrated by Aurelien and Matthieu is a very good example: http://beyondrelational.com/blogs/tc/archive/2009/06/14/aurelien-and-matthieu-have-fun-with-tsql-challenges.aspx
We are making plans to come up with a number of TSQL activities and would love to welcome those of you interested in joining and volunteering with the TSQL Challenges Team. There are two specific areas where we need your help.
Volunteers for Identifying and Creating new challenges
Resolving challenges is funny, for sure. But writing an interesting TSQL puzzle and looking at the different ways people solve it might be much more interesting.
Volunteers for Evaluating Solutions
It is a great exercise to go through the many ways to solve a given TSQL problem. It will be a great opportunity to discuss with TSQL experts and learn query writing and optimization tricks from them.
If it sounds interesting, please send us an email at tc@beyondrelational.com.
Support TSQL Challenges
If you don’t have time to contribute in creating and evaluating challenges, you can still support TSQL Challenges by spreading the word in the community. If you have a website or blog and can link to us, we will be very happy. It will be also helpful, if you can make a post on your blog or website when a new challenge or solution is published. We will be further delighted if you can tweet or bookmark the challenges using social networking media.
Thanks for your continuous support!
TSQL Challenges Team
First of all, I want to thank all challengers for their contribution. There is more and more people participating, contributing and giving great feedback. All of this, encourage us to continue and help us developing TSQL Challenges community. I hope that everyone have fun with these challenges and always learn something.
Back to the TSQL Challenge 7, this is the list of the winners:
We will publish in a few days the details for each solution.
As for TSQL Challenge 4, the key point of this challenge was to write the shortest code.
You should ask me about the interest of writing the shortest code. You should argue that writing short and obscure code is rarely a symptom of quality and you will be right!
The idea is not to write the shortest-shortest code without any limitation of sense but to try through the challenge to see if there isn’t a more optimal solution than the first simple solution we have in mind. It is also the opportunity to see some tips and tricks from experts or some functions less well known.
As you will in the details of the solutions, this challenge was the occasion to deal with some undocumented functions of SQL Server: sp_MsForEachDb and sp_MsForEachTable. One may argue that these functions are not documented nor supported. In fact, you should use them carefully in your projects, but currently they are present and very useful, mainly for managing tasks like the one in this challenge.
For their defense, I should add, that their source are available, so you can verify by yourself the content and adapt it to your own procedure if it is not what you want. In all the cases, they are interesting learning points.
To find the code of each of them, navigate on your database to Procedures, System procedures, sys.sp_MSforeachdb and sys.sp_MSforeachtable, then right click Modify. This is the signature for these two procedures:
ALTER proc [sys].[sp_MSforeachdb]
@command1 nvarchar(2000),
@replacechar nchar(1) = N'?',
@command2 nvarchar(2000) = null,
@command3 nvarchar(2000) = null,
@precommand nvarchar(2000) = null,
@postcommand nvarchar(2000) = null
as
...
The 3 commands are used in the process for each table/db, then pre and post commands are used outside the iterative process. They use the same worker process which is defined by the procedure sys.spMSforeach_worker.
By the way, most people use only the @command1 parameter with the default ‘?’ char for replacing table/db names. The tip here was to use different replace chars for each one and write a double dynamic query. You should also have troubles with the quotes…
One solution should be to use simple quotes and double quotes. This is an example that should print all tables for all non-system databases
exec sp_MSforeachdb '
select "$",0,0,0,0,"x";
if db_id("$")>4
exec $..sp_Msforeachtable "exec sp_spaceused ''?''";',N'$'
For the table size calculation of the challenge, there is different methods for calculating that. You should do this by hand using the data from the view sys.dm_db_partition_stats. But it is not always easy or needed to have a deep dive in it. There is also two very useful system procedures for this too:
- sp_spaceused @table_name
- sp_MStablespace @table_name
the first one is more verbose and usually for printing ‘as is’. The second is more useful for additional computation. They return both the Kbytes of data used and number of rows asked by the challenge.
Then the last problem of the challenge was to return an ordered table with both database and tables names. This means that you need to mark the database name in your iteration and be able to store data in a temp table that you can query and reorder. This not obviously the most difficult part but if you want to create the shortest query you need to get the data directly out of the stream produced by the MSforeach procedures…
One more interesting tip if you need to manage the output table of a procedure is to introduce an identity column at the end. If you create a temp table that maps to the output of the procedure, you have the possibility to add an identity column at the end that can help you managing the output data (you can see it in action in the example at the end of this post).
In all the cases you will need to use a temp table. For these kind of problems I prefer to use table variables but temp tables or global temp tables should also work too. Global temp tables should be useful to store and exchange data between the main context and dynamic queries context.
You will see in the next posts the solutions by each challengers, they are all based on these principles.
A special mention to Barry, that create a compressed version of its query with decompression on the fly! He has not get the shortest query but its compression solution made him win 30% of the query size! That’s really amazing ;-)
Just to finish, this the solution I made while writing the challenge that uses the principles described here:
--fill data into a temp table
declare @a table(
a char(128),b int,c char(16),d char(16),e char(16)
,f char(16),id int identity(1,1))
insert into @a
exec sp_MSforeachdb '
select "$",0,0,0,0,"x";
if db_id("$")>4
exec $..sp_Msforeachtable "exec sp_spaceused ''?''";',
N'$'
-- filter and order final data
select top 5
BASE=(
select top 1 a from @a
where id <= db2.id - 1 and f='x'
order by id desc),
[TABLE]=a,SIZE=d,ROWS=b
from @a db2 where db2.f<>'x'
order by str(rtrim(replace(d,'KB','')))desc
If you have questions or comments about this solution you should discuss it on the dedicated forum: http://beyondrelational.com/groups/tsqlchallenge/forum/t/186.aspx.
If you have generic questions about this challenge the dedicated is still open to discuss: http://beyondrelational.com/forums/t/62.aspx
Congratulations to the winners, thank you to all challengers and stay tuned for next challenges!
Any challenge can be source of fun and teasing between people. TSQL Challenges obviously do! Aurelien Verla had a lot of fun solving TSQL Challenge 6 along with his workmate Matthieu Hodin last week. Aurelien and Matthieu are workmates at Wygwam.

Aurelien narrated the story as follows:
"Sometimes, there is a real story behind the scene. For TSQL Challenge 6, the story involved my colleague Matthieu Hodin and myself as key actors !
Here’s a little resume of that story"
Day 1 : TSQL Challenges #6 is online and my first attempt was made with the “UNION” trick.
Aurelien: "Matthieu, look at my query, I solved the TSQL Challenges 6 !"
Matthieu: "HAHAHA ! What a noob, you didn’t used the GROUP BY WITH ROLLUP trick !"
Day 2 : Disappointed by Matthieu, I worked on a GROUP BY WITH ROLLUP version of my query.
Aurelien: "Matthieu, look at my query, I solved the TSQL Challenges 6 with the f@!!#? GROUP BY WITH ROLLUP trick"
Matthieu: "HAHAHA ! What a noob, the query uses DATENAME() to establish the MON/TUE…, this will never work on my FRENCH SQL SERVER INSTANCE !"
Aurelien To himself: "I HATE YOU MATTHIEU ! J"
"At this point, the race began with Matthieu, my goal was to bust him with the “perfect” query, less characters than him, more beautiful tricks and so on.
This is where the SUBSTRING() helped me to take a real advantage. Working a lot with Javascript, this trick come from my attempts of creating a nice localized client side Calendar."
When we received Matthieu Hodin’s solution, we were a bit confused. It was a very valuable solution but difficult to read.
At this time, we didnt know the story behind the scenes :-)
Here is Matthieu’s solution:
SELECT
a+ISNULL(s,'')+ISNULL(LEFT(MIN(L)+j,1),'T') PERIOD_ID
,grouping(j)+grouping(s)LEVEL
,COALESCE(
SUBSTRING('SATSUNMONTUEWEDTHUFRI',1+((MIN(L)+@@datefirst)%7)*3,3)
+' '+J ,'TOTAL WEEK '+s ,'TOTAL YEAR '+a
)PERIOD
,SUM((P-P%72*2)/72*V)HOME
,SUM((P-P%67*2)/67*V)CONTACT
,SUM(P/80*V)PRODUCTS
FROM(
SELECT
*
,DATENAME(dd,d)+'/'+CAST(DATEPART(mm,d)AS CHAR)j
,DATENAME(yy,d)A ,CAST(DATEPART(dw,D)AS VARCHAR)L
,DATENAME(ww,d)s
FROM(
SELECT
VisitDate D,
ASCII(page)P
,NbVisitors V
FROM @t
)t
)t
GROUP BY A,s,J
WITH rollup
HAVING grouping(a)=0
ORDER BY PERIOD_ID
Aurelien’s solution and comments is here.
We see this story as really encouraging, and it should also be for all SQL Server developers in keeping looking for better ways of writing common TSQL Code.
Thanks to Aurelien and Matthieu !
What is new with TSQL Challenges
‘TSQL Challenges Team’ is working very hard to take this series to the next level. As part of this process, we have redefined the goal and structure of TSQL Challenges.
Goal
The goal of TSQL Challenges is to help people enhance their SET based query writing skills. Most of the times, a SET based query performs better than a row-by-row (often referred as RBAR – Row By Agonizing Row). Most people find trouble writing set based operations because they are more familiar with procedural programming model. It takes some practice and exercises to develop the SET based thinking and query writing skills.
The challenges we bring through ‘TSQL Challenges’ series will encourage people to write SET based queries to solve common problems. It will help people who are not familiar with SET based queries, to start learning it. It will help people who are already familiar with SET based queries, to learn different, and often better ways of writing queries.
Structure
The challenges we publish through ‘TSQL Challenges’ series will follow a uniform structure. All the challenges will focus on transforming data from one form to another. Each challenge will show a set of source data and expected results. The solution of the challenge should be written using a single TSQL Query. The query can use CTEs if needed and it will be considered as part of the query that uses the CTE.
Your comments
We will be very happy to hear your comments and feedback on what we do with TSQL Challenges. You can post a comment or send an email to tc@beyondrelational.com.
TSQL Challenge 9
Let us move to Challenge 9. This is a real-life challenge that our ‘Challenge Team Lead’, Rui encountered recently. Here is the problem description in his own words.
The Context
“We are managing a messaging system that needs some consolidation functionalities. There are two systems that exchanges messages. For each message, there is a flag that indicates that the message was sent or not and another flag that indicates that the message was received or not. Obviously, the receive flag can not be true if sent flag is false. Delivery of a message may fail due to a number of reasons such as high network traffic, packet loss, network failure, server busy and so on.
We want to build a synchronization tool that is able to analyze the status of the messages and ask for synchronization updates during low network traffic periods. This is process workflow:
- multiple times per day when the traffic is low, the 'target server' will ask for a synchronization report to the source server
- The source server will build a report of ranges of messages depending on their status
- The target server will ask the source server to re-send messages range by range.
- The target server will send acknowledgement for each message received
to from the source server.”
The Challenge
The challenge is to build the ranges of sequential values. The table with these ranges will have the following properties:
- The first message identifier of the range
- The last message identifier of the range
- The Send status of the range
- The Acknowledgement status
The ranges of the messages to take will be selected between two dates. The table has an IDENTITY column to maintain the sequence.
Source Data
ID CreationDate Content SendState AckState
----------- ----------------------- ---------- --------- --------
1 2009-05-27 22:15:43.647 Msg #1 0 0
2 2009-05-28 00:39:43.647 Msg #2 0 0
3 2009-05-28 03:03:43.647 Msg #3 1 1
4 2009-05-28 05:27:43.647 Msg #4 1 1
5 2009-05-28 07:51:43.647 Msg #5 1 1
6 2009-05-28 10:15:43.647 Msg #6 1 0
7 2009-05-28 12:39:43.647 Msg #7 1 0
8 2009-05-28 15:03:43.647 Msg #8 1 0
9 2009-05-28 17:27:43.647 Msg #9 1 0
10 2009-05-28 19:51:43.647 Msg #10 1 1
Result Data
FirstIdInclusive LastIdInclusive SendState AcknoledgeState
---------------- --------------- --------- ---------------
1 2 0 0
3 5 1 1
6 9 1 0
10 10 1 1
Performance is a key factor for this challenge. There are plenty of solutions for this challenge but the work to build sequential ranges is not natural in SQL and not all solutions will perform and scale as well. You should suppose that messages selected between two dates may grow up to 1 million rows and that the ranges size may be one to thousands. As the process should be launched a lot of times per day, it needs to be as fast as possible.
You have complete freedom to choose any approach to solve this problem. There is no restriction on the version of SQL server you should target to. Your solution will be acceptable as long as it produces the correct results and considers performance and scalability.
Sample Data
Use the following script to generate the sample data for this challenge.
DECLARE @tc9 TABLE(
ID INT IDENTITY(1,1),
CreationDate DATETIME,
Content NVARCHAR(10),
SendState BIT,
AckState BIT
)
INSERT INTO @tc9 (CreationDate,Content,SendState,AckState)
SELECT GETDATE()-1.0,'Msg #1',0,0 UNION
SELECT GETDATE()-0.9,'Msg #2',0,0 UNION
SELECT GETDATE()-0.8,'Msg #3',1,1 UNION
SELECT GETDATE()-0.7,'Msg #4',1,1 UNION
SELECT GETDATE()-0.6,'Msg #5',1,1 UNION
SELECT GETDATE()-0.5,'Msg #6',1,0 UNION
SELECT GETDATE()-0.4,'Msg #7',1,0 UNION
SELECT GETDATE()-0.3,'Msg #8',1,0 UNION
SELECT GETDATE()-0.2,'Msg #9',1,0 UNION
SELECT GETDATE()-0.1,'Msg #10',1,1
SELECT * FROM @tc9
Notes
- Send your entries to tc@beyondrelational.com with subject ‘TSQL Challenge 9’ and send your code as an attachment (.sql file)
- Along with your entry, write an explanation of your code and logic used. Put your explanation in a .txt file and send as an attachment along with the solution.
- Make sure that you have created your profile at beyondrelational.com and a photo is uploaded to the profile. If you have not registered with the site, you can do so by clicking on the ‘join’ link on the top right corner. If your entry wins, we will link to your profile when we publish the results.
- By submitting your entry for a challenge, you authorize beyondrelational.com to ‘edit’ (if needed) and ‘publish’ it.
- Last date to submit your entries: 15 June 2009
- Visit this forum for general questions on TSQL Challenges.
- Visit this forum for any question on TSQL Challenge 9.

Arnaud Lhopiteau is a MCSD .NET & Senior developer at Alltitude.com, his father introduced him to database systems with R:Base in late 80’s .
He loves to take care of his family, read Joe Celko’s book, smoke pipe and drink tea. Currently he uses SQL Server 2008 in his day to day work.
Here is TSQL Challenge 6 Arnaud’s solution. Comments are inside code.
SET NOCOUNT ON
SET DATEFORMAT YMD
SET DATEFIRST 1
DECLARE @T TABLE (ID INT IDENTITY(1,1), VisitDate DATETIME, Page NVARCHAR(128),NbVisitors INT)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-23','Home',10)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-24','Home',14)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-26','Home',22)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-27','Home',3)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-29','Home',4)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-04-01','Home',33)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-04-03','Home',2)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-26','Contact',22)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-27','Contact',10)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-29','Contact',35)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-30','Contact',13)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-27','Products',8)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-29','Products',12)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-04-01','Products',16)
--Third stage of the query, the “display part”, the Period_Id,
--Level and Period labels are elaborated.
--Notice the "RIGHT(CAST([Week]+100 AS CHAR(3)),2)" that allows
--to display the week number always over 2 digits.
SELECT CAST([YEAR] AS CHAR(4))
+ COALESCE(RIGHT(CAST([Week]+100 AS CHAR(3)),2),'T')
+ CASE WHEN [Week] IS NULL THEN ''
ELSE COALESCE(CAST(DATEPART(weekday,[Date]) AS CHAR(2)),'T')
END PERIOD_ID,
--the Order of the WHEN clauses testing for null values is important,
--it goes from left (week) to right (date)
--this because a yearly total has a NULL value for the Week column,
--but also a NULL value in the Day column
--as would have a weekly total
CASE WHEN [Week] IS NULL THEN 2
WHEN [Date] IS NULL THEN 1
ELSE 0 END [LEVEL],
CASE WHEN [Week] IS NULL THEN 'TOTAL YEAR ' + CAST([YEAR] AS CHAR(4))
WHEN [Date] IS NULL THEN 'TOTAL WEEK ' + CAST([Week] AS CHAR(2))
ELSE UPPER(CAST(DATENAME(WEEKDAY, [Date]) AS CHAR(3)))+ ' '
+ CAST(DATEPART(day,[Date]) AS VARCHAR(2))+ '/'
+ CAST(DATEPART(MONTH,[Date]) AS VARCHAR(2)) END [PERIOD],
HOME,
CONTACT,
PRODUCTS
FROM (
--Second stage, the “grouping part”,they are grouped for each date,
--Years and weeks are extracted
SELECT YEAR(VisitDate) [YEAR],
DATEPART(week,VisitDate) [Week],
--But we keep the Date as a column instead of decomposing it as a
--weekday and a day number in the month because it carries both
--information.
--TRICKY POINT, the Coalesce is necessary because it decorelates
--the original Visitdate of the first stage query from the date we want
--to return.
--To better understand the mechanism, test the query after removing the
--coalesce from the select and group by clause.
COALESCE(VisitDate,'') [Date],
--VisitDate [Date],
SUM(HOME) [Home],
SUM(CONTACT) [Contact],
SUM(PRODUCTS) [Products]
FROM(
--First stage of the query, the “unpivot part”, the values are
--unpivoted over the 3 categories
SELECT VisitDate,
CASE Page WHEN 'Home' THEN NbVisitors ELSE 0 END HOME,
CASE Page WHEN 'Contact' THEN NbVisitors ELSE 0 END CONTACT,
CASE Page WHEN 'Products' THEN NbVisitors ELSE 0 END PRODUCTS
FROM @T
) X
GROUP BY YEAR(VisitDate),DATEPART(week,VisitDate),COALESCE(VisitDate,'')
--We create the sub-total line thanks a rollup,
WITH ROLLUP
--the grand total is removed by an having clause filtering the line with a
--null value for the year column.
HAVING YEAR(VisitDate) IS NOT NULL
--Please note that no ORDER BY clause is necessary because the GROUP BY
--clause does an ascendant sort on the Year, Week and Date parts.
) Y
If you have questions or comments about this solution, discuss it on the
dedicated forum.
If you have generic questions or comments about the TSQL Challenge 6 you can go to
this forum.
Congratulations Arnaud for this nice solution and thanks for sharing it with us!
Cleber is a new challenger. He provides us a nice and readable solution using UNION and CROSS JOIN.
Cleber Martins works for ACI Worldwide as a Sr. Business Solutions Analyst for the Risk products. Cleber is based in Brazil and has 8 years of experience supporting financial institutions to prevent fraud and money laundering activities with neural models, rules based technology and consulting services.
Cleber's experience with SQL Server is very successful in transactional warehouses handling 1/2Billion transactions a month on terabyte databases where peaks can go over 4MM transactions per hour.
Cleber said about TSQL Challenges:
"I've been on engagements where customers want to extract very complex and detailed information from our product's databases, so once I'm not a SQL professional then I'm always researching the internet for clues. Now that I found beyondrelational.com and the challenges then I'm able to grow my knowledge on weekly basis and do my job better everyday."
Here is TSQL Challenge 6 Cleber’s comments and code:
My hundreds first thoughts for the Challenge 6 was to do multiple queries, but once the challenge stated that one query would be better score then I've assumed that I should be able to do it. I've take advantage of the unique ID's, so I've joined the table with itself what allowed me to use the first tree ID's to get the results for each of the consolidations.
SET NOCOUNT ON
SET DATEFORMAT YMD
SET DATEFIRST 1
DECLARE @T TABLE (ID INT IDENTITY(1,1), VisitDate DATETIME, Page NVARCHAR(128),
NbVisitors INT)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-23','Home',10)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-24','Home',14)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-26','Home',22)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-27','Home',3)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-29','Home',4)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-04-01','Home',33)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-04-03','Home',2)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-26','Contact',22)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-27','Contact',10)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-29','Contact',35)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-30','Contact',13)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-27','Products',8)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-29','Products',12)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-04-01','Products',16)
select "PERIOD_ID"
= case a.ID
when 1 then convert(varchar, datepart(yy, min(b.VisitDate))) +
convert(varchar, datepart(ww, min(b.VisitDate))) +
convert(varchar, datepart(dw, min(b.VisitDate)))
when 2 then convert(varchar, datepart(yy, min(b.VisitDate))) +
convert(varchar, datepart(ww, min(b.VisitDate))) + 'T'
when 3 then convert(varchar, datepart(yy, min(b.VisitDate))) + 'T'
end
,"LEVEL"
= a.ID -1
from @T a, @T b
where a.ID <= 3
select
"PERIOD_ID"
= case a.ID
when 1 then convert(varchar, datepart(yy, min(b.VisitDate))) +
convert(varchar, datepart(ww, min(b.VisitDate))) +
convert(varchar, datepart(dw, min(b.VisitDate)))
when 2 then
convert(varchar, datepart(yy, min(b.VisitDate))) +
convert(varchar, datepart(ww, min(b.VisitDate))) + 'T'
when 3 then convert(varchar, datepart(yy, min(b.VisitDate))) + 'T'
end
,"LEVEL"
= a.ID -1
,"PERIOD"
= case a.ID
when 1 then
case datepart(dw, min(b.VisitDate))
when 1 then 'MON' when 2 then 'TUE' when 3 then 'WED' when 4 then 'THU'
when 5 then 'FRI' when 6 then 'SAT' when 7 then 'SUN'
end
+ ' ' + convert(varchar, datepart(dd, min(b.VisitDate))) + '/' +
convert(varchar, datepart(mm, min(b.VisitDate)))
when 2 then
'TOTAL WEEK ' + convert(varchar, datepart(ww, min(b.VisitDate)))
when 3 then
'TOTAL YEAR ' + convert(varchar, datepart(yy, min(b.VisitDate)))
end
,"HOME"
= sum(case when b.Page = 'Home' then b.NbVisitors else 0 end)
,"CONTACT"
= sum(case when b.Page = 'Contact' then b.NbVisitors else 0 end)
,"PRODUCTS"
= sum(case when b.Page = 'Products' then b.NbVisitors else 0 end)
from @T a, @T b
where a.ID <= 3
group by
a.ID
, case a.ID
when 1 then b.VisitDate
when 2 then
dateadd(ww, datepart(ww, b.VisitDate), datepart(yy, b.VisitDate))
when 3 then datepart(yy, b.VisitDate)
end
order by "PERIOD_ID"
If you have questions or comments about this solution, discuss it on the dedicated forum.
If you have generic questions or comments about the TSQL Challenge 6 you can go to this forum.
Stay tunned and don't forget to check the current TSQL Challenge 9.
Congratulations Cleber for this nice solution and thanks for sharing it with us!
Again for this challenge, Leonid Koyfman provides us an elegant solution.
Leonid is a ‘many times winner’ of the challenge and SQL Server Expert.
This time Leonid presented 2 solutions: one using GROUP BY with ROLLUP operator and the other that combines all summary levels with UNION ALL operator.
There will be for sure some discussions about performance issues !
Here are the two solutions:
SET DATEFIRST 1
SET DATEFORMAT YMD
SET NOCOUNT ON
DECLARE @T TABLE (ID INT IDENTITY(1,1), VisitDate DATETIME, Page NVARCHAR(15),
NbVisitors INT)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-23','Home',10)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-24','Home',14)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-26','Home',22)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-27','Home',3)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-29','Home',4)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-04-01','Home',33)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-04-03','Home',2)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-26','Contact',22)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-27','Contact',10)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-29','Contact',35)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-30','Contact',13)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-27','Products',8)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-29','Products',12)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-04-01','Products',16)
---------------------- Solution #1 ------------------------
SELECT
PERIOD_ID = CASE level
WHEN 0 THEN yyyy+WW+DW
WHEN 1 THEN yyyy+WW+'T'
WHEN 2 THEN yyyy+'T'
-- ELSE 'GT'
END
, [LEVEL]
, PERIOD = CASE level
WHEN 0 then ' '+DW_name+' '+CONVERT(char(5),VisitDate,103)
WHEN 1 then 'TOTAL WEEK ' +ww
WHEN 2 then 'TOTAL YEAR ' +yyyy
-- ELSE 'GRAND TOTAL'
END
, HOME
, CONTACT
, PRODUCTS
FROM(
SELECT
yyyy
, WW
, DW
, DW_name = MAX(DW_name)
, VisitDate = MAX(VisitDate)
, [level] = GROUPING(yyyy)+GROUPING(WW)+GROUPING(DW)
, Home = SUM(CASE Page when 'Home' THEN NbVisitors ELSE 0 END)
, Contact = SUM(CASE Page when 'Contact' THEN NbVisitors ELSE 0 END)
, Products = SUM(CASE Page when 'Products' THEN NbVisitors ELSE 0 END)
FROM(
SELECT
yyyy = DATENAME(yy,VisitDate)
, WW = RIGHT('00'+DATENAME(ww,VisitDate),2)
, DW = CAST(DATEPART(dw,VisitDate) as char(1))
, DW_name= LEFT(DATENAME(dw,VisitDate),3)
, VisitDate
, Page
, NbVisitors
FROM @t
)x
GROUP BY
yyyy
, WW
, DW
WITH ROLLUP
)x
WHERE 1=1
AND [LEVEL] <=2
ORDER BY PERIOD_ID
---------------------- Solution #2 ------------------------
SELECT
PERIOD_ID
, [LEVEL]
, PERIOD
, HOME =SUM(CASE Page WHEN 'Home' THEN NbVisitors ELSE 0 END)
, CONTACT =SUM(CASE Page WHEN 'Contact' THEN NbVisitors ELSE 0 END)
, PRODUCTS =SUM(CASE Page WHEN 'Products' THEN NbVisitors ELSE 0 END)
FROM
(
SELECT
Period_ID =DATENAME(yy,VisitDate)+DATENAME(ww,VisitDate)+
CAST(DATEPART(dw,VisitDate) as char(1))
, [level] =0
, Period =' '+LEFT(DATENAME(dw,VisitDate),3)+' '+
CONVERT(char(5),VisitDate,103)
, page
, NbVisitors
FROM @t
UNION all
SELECT
Period_ID =DATENAME(yy,VisitDate)+DATENAME(ww,VisitDate)+'T'
, [level] =1
, Period ='TOTAL WEEK ' +datename(ww,VisitDate)
, page
, NbVisitors
FROM @t
UNION ALL
SELECT
Period_ID =DATENAME(yy,VisitDate)+'T'
, [level] =2
, Period ='TOTAL YEAR ' +DATENAME(yy,VisitDate)
, page
, NbVisitors
FROM @t
-- UNION ALL
-- SELECT
-- Period_ID ='GT'
-- , [level] =3
-- , Period ='GRAND TOTAL'
-- , page
-- , NbVisitors
-- from @t
)x
GROUP BY
PERIOD_ID
, [LEVEL]
, PERIOD
ORDER BY PERIOD_ID
Here are Leonid’s comments about his solution:
First I'd like to highlight lines related to output formatting and crostab transformation that are the same in both solutions.
.
For this challenge it's more appropriate to use not DATEPART, but DATENAME function where it's possible.
Not all of the options available for DATEPART presented in DATENAME and vise versa.
When we need to generate labels like "TOTAL for year 2009", use of DATENAME allows us to avoid CASTing in subsequent concatenations.
Both SELECT DATEPART(YY,getdate()) and SELECT DATENAME(YY,getdate()) return 2009, but first returns integer and other one returns string.
In line WW=RIGHT('00'+DATENAME(ww,VisitDate),2) I left padded week number with 0 to have values in Period_id column properly aligned for any week.
Maintaining consistent alignment increases readability of any report.
Code LEFT(DATENAME(dw,VisitDate),3) returns 3 letter abbreviation for a weekday(Mon,Tue etc.)
Line with CONVERT(char(5),VisitDate,103) takes advantage of built-in format that displays date in European style: DD/MM
Cross tabulation for Pages was achived using
, Home =SUM(CASE Page when 'Home' THEN NbVisitors ELSE 0 END)
, Contact =SUM(CASE Page when 'Contact' THEN NbVisitors ELSE 0 END)
, Products =SUM(CASE Page when 'Products' THEN NbVisitors ELSE 0 END)
with
GROUP BY yyyy,WW,DW
Knowing in advance that there were only three pages in the website we could use PIVOT operator for this purpose.
But performance wise it wouldn't give us any gain.
For Solution #1 I used ROLLUP operator to summarize data on multiple levels of hierarchy(year,week,day in our case) and GROUPING aggregate function to indicate those levels in the output.
It's just one query that scales well and available in all versions.
I calculated levels as [LEVEL]=GROUPING(yyyy)+GROUPING(WW)+GROUPING(DW) in the derived table to have more readable CASEs for PERIOD_ID and PERIOD columns in outermost SELECT.
ROLLUP operator also generates a line with grand total summary.
Challenge sample resultset did not include grand total, so I added statement WHERE [LEVEL] <=2
But if we expand initial data set with records from the other year e.g:
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2008-04-01','Products',100)
it will reveal the need of having 'GRAND TOTAL' as a part of the report.
Then to achieve this we just need to remove that filter on [LEVEL] and uncomment ELSE cases for Period_ID and Period.
What I like about Solution #2 is that it's simple, easy to understand and maintain.
This solution is not scalable because requires table scan in each section.
But if dataset is small and performance is acceptable this would be the one that I put into production.
It can be easily expanded if we need to add yet another level of aggregation(e.g. by Quarter) by just adding
UNION ALL
SELECT
Period_ID =DATENAME(yy,VisitDate)+'Q'+DATENAME(q,VisitDate)
, [level] =3
, Period ='TOTAL Q' +DATENAME(q,VisitDate)+' YEAR ' +DATENAME(yy,VisitDate)
, page
, NbVisitors
FROM @t
Again, as a reports developer, I anticipate the need of Grand Total to handle the case when we have data from different years.
I left commented the section that does this job.
If you have questions or comments about this solution, discuss it on the dedicated forum.
If you have generic questions or comments about the TSQL Challenge 6 you can go to this forum.
Stay tunned and don't forget to check the current TSQL Challenge 9.
Congratulations Leonid for those nice solutions and thank you for being part of the TC community!
Aurelien already closed the initial winners of the TSQL Challenge 5. He's solution for this challenge is a nice GROUP BY WITH ROLLUP one.

Aurelien Verla is Online Strategies Team Manager at Wygwam. Web Developer since 10 years, his first approach with databases systems was Access !
MVP around ASP.NET Platform, he also spend a lot of time around SQL Server for one reason: If the UI is slow, first look at the database ! You can find it's english blog here and the french one here.
Aurelien took time for commenting on his solution, though he is really busy actually with the organization of the Wygday conference.
Here are Aurelien’s comments:
Here’s the different “parts” of the query who helped me to make it concise and “original” :
- Using an HAVING clause to get rid of the global summary row of the GROUP BY WITH ROLLUP
- Using DATENAME instead of CAST for less characters
- Try to avoid CASE WHEN END except on the static pivot part
o ISNULL is your friend for the PERIOD_ID
o COALESCE is your friend for the PERIOD label: Using concatenation between a varchar and NULL produce NULL !
- Don’t forget the Modulo around DATEFIRST
o Changing the setting of the SQL SERVER INSTANCE doesn’t impact the fact that March 23, 2009 is a Monday AND will be a Monday whatever setting you put on SQL.
o Use the SUBSTRING trick to avoid a 7 parts CASE WHEN END: Don’t forget it when you work with fixed length data, in this case, all the outputs are 3 characters length.
Here is the code:
SELECT
A.[Y] + ISNULL(A.[W], '') + ISNULL(A.[D], 'T') AS 'PERIOD_ID',
GROUPING(A.[W]) + GROUPING(A.[D]) AS 'LEVEL',
COALESCE(
SUBSTRING('SATSUNMONTUEWEDTHUFRI',
((A.[D] + @@DATEFIRST) % 7) * 3 + 1, 3) + ' ' + MIN(A.[DM]),
'TOTAL WEEK ' + A.[W],
'TOTAL YEAR ' + A.[Y]
) AS 'PERIOD',
SUM(A.[H]) AS 'HOME',
SUM(A.[C]) AS 'CONTACT',
SUM(A.[P]) AS 'PRODUCTS'
FROM (
SELECT
DATENAME(YEAR, [VisitDate]) AS 'Y',
DATENAME(WEEK, [VisitDate]) AS 'W',
CAST(DATEPART(WEEKDAY, [VisitDate]) AS VARCHAR(2)) AS 'D',
DATENAME(DAY, [VisitDate]) + '/' +
CAST(MONTH([VisitDate]) AS VARCHAR(2)) AS 'DM',
CASE WHEN [Page] = 'Home' THEN [NbVisitors] ELSE 0 END AS 'H',
CASE WHEN [Page] = 'Contact' THEN [NbVisitors] ELSE 0 END AS 'C',
CASE WHEN [Page] = 'Products' THEN [NbVisitors] ELSE 0 END AS 'P'
FROM
@T
) AS A
GROUP BY A.[Y], A.[W], A.[D] WITH ROLLUP
HAVING GROUPING(A.[Y]) = 0
ORDER BY PERIOD_ID
If you have questions or comments about this solution, discuss it on the dedicated forum.
If you have generic questions or comments about the TSQL Challenge 6 you can go to this forum.
Stay tunned and don't forget to check the current TSQL Challenge 9.
Congratulations Aurelien for this nice solution and thank you for being part of the TC community!