May 2009 - Posts
Again there were a lot of participants for this challenge, and a lot of different solutions.
Regarding all these solutions this is the short list of the winners we choose for this challenge:
- Arnaud Lhopiteau
Arnaud Lhopiteau is a MCSD .NET & Senior developer at Alltitude.com, his father introduced him to database systems with R:Base in late 80’s .
He loves to take care of his family, read Joe Celko’s book, smoke pipe and drink tea. Currently he uses SQL Server 2008 in his day to day work.
Arnaud's solution is a nice GROUP BY WITH ROLLUP one.
-
Aurelien already closed the initial winners of the TSQL Challenge 5. He's solution for this challenge is a nice GROUP BY WITH ROLLUP one too.

Aurelien Verla is Online Strategies Team Manager at Wygwam. Web Developer since 10 years, his first approach with databases systems was Access !
MVP around ASP.NET Platform, he also spend a lot of time around SQL Server for one reason: If the UI is slow, first look at the database ! You can find it's english blog here and the french one here.
- Leonid Koyfman
Again for this challenge, Leonid Koyfman provides us an elegant solution. In fact, Leonid sent two solutions: one using GROUP BY WITH ROLLUP like Aurelien's and Arnaud's one, and the other using one UNION for each level.
There will be for sure some discussions about performance issues !
Leonid is a ‘many times winner’ of the challenge and SQL Server Expert.
- Cleber Augusto Martins
Cleber is a new challenger. He provides us a nice and readable solution using UNION and CROSS JOIN.
Cleber Martins works for ACI Worldwide as a Sr. Business Solutions Analyst for the Risk products. Cleber is based in Brazil and has 8 years of experience supporting financial institutions to prevent fraud and money laundering activities with neural models, rules based technology and consulting services.
Cleber's experience with SQL Server is very successful in transactional warehouses handling 1/2Billion transactions a month on terabyte databases where peaks can go over 4MM transactions per hour.
Cleber said about TSQL Challenges:
"I've been on engagements where customers want to extract very complex and detailed information from our product's databases, so once I'm not a SQL professional then I'm always researching the internet for clues. Now that I found beyondrelational.com and the challenges then I'm able to grow my knowledge on weekly basis and do my job better everyday."
We will publish in the few next days explanations for each solution.
The key point of this challenge was to show how it is possible to produce a full report inside the database without any other reporting facilities. I am not saying that Reporting Services or any other reporting solution is not good, I just say that most of the times this reporting computation should be done simply directly from the database in Tsql. What are the benefits of this solution? In a few words we can say:
- all is done in the database, you should hope better performance ;-)
- decrease unneeded network traffic
- just get what you need
The idea is that you should have better performance if you avoid retrieving a lot of data to analyze in your application (that’s quite obvious isn’t it?). But all is not ideal, because with that you will have other troubles:
- you will introduce domain logic in your database (and you should probably don’t want doing this if you work with an ORM)
- You will have to write all in Tsql (as you do it by the hand, there is no drag and drop reporting facilities ;-).
- It should be harder to write, understand and maintain (one big query is always harder than a lot of smaller queries).
All of that just to says that the best choice for doing reporting depends on your needs ;-). I just want with this challenge to present an old-fashioned way of doing reporting that is still a good option today!
The problem in this challenge is that we have to produce a complete report table with rows of data (the original rows of data) but also introduce some consolidation rows.
There are different ways to handle the problem in pure TSql. The options are:
- Iterate through a CTE on each consolidation level
- Suppose that you have a predefined number of levels and create one query for each consolidation level that you join with unions
- Use the old ROLLUP functionality of TSQL (which may be deprecated in future versions but still useful right now)
I think we have already talked a lot about CTE in this challenges, and that’s why I present a solution based on the ROLLUP grouping.
So, for newcomers what is that? The grouping with ROLLUP function is a TSql option of the GROUP BY that will add automatically a consolidation row for each group in a hierarchical way. Let’s see a short example:
DECLARE @T TABLE (ID INT,Label CHAR(4),Tag CHAR(4))
INSERT INTO @T (ID,Label,Tag)
SELECT 1,'A','SQL' UNION SELECT 2,'B','XML'
UNION SELECT 3,'C','SQL' UNION SELECT 4,'B','SQL'
UNION SELECT 5,'A','XML' UNION SELECT 8,'C','SQL'
UNION SELECT 7,'B','SQL' UNION SELECT 6,'A','XML'
SELECT Tag,Label,COUNT(ID) AS ItemsCount,
GROUPING(Tag)+GROUPING(Label) As Level
FROM @T GROUP BY Tag,Label WITH ROLLUP
Which will produce the following table:
Tag Label ItemsCount Level
SQL A 1 0
SQL B 2 0
SQL C 2 0
SQL NULL 5 1
XML A 2 0
XML B 1 0
XML NULL 3 1
NULL NULL 8 2
We can see that for each consolidation level of the hierarchy a row is added, a null is set for group by fields that are under the current level and that the aggregates fields are automatically computed for this level.
One other interesting feature is the GROUPING function. The GROUPING function returns 0 if the column value came from the fact data, and 1 if the column value is a NULL generated by the aggregation. Adding a grouping for each group by column we can create the hierarchical levels.
We can tune our original example query to have better label information using this method:
SELECT Tag = CASE WHEN GROUPING(Tag) = 1 THEN 'ALL' ELSE Tag END,
Label=
CASE GROUPING(Tag)+GROUPING(Label)
WHEN 1 THEN 'Total ' + Tag
WHEN 2 THEN 'Total ALL TAGS'
ELSE Label END,
COUNT(ID) AS ItemsCount,
GROUPING(Tag)+GROUPING(Label) As Level
FROM @T GROUP BY Tag,Label WITH ROLLUP
Which produce:
Tag Label ItemsCount Level
---- -------------- ----------- -----
SQL A 1 0
SQL B 2 0
SQL C 2 0
SQL Total SQL 5 1
XML A 2 0
XML B 1 0
XML Total XML 3 1
ALL Total ALL TAGS 8 2
Coming back to the challenge, there is one solution using this feature. It is not the shortest or the best, but is relatively readable and (I hope) maintainable:
SELECT [PERIOD_ID]=
CASE [LEVEL]
WHEN 0 THEN CAST([YEAR] AS NVARCHAR)+ CAST([WEEK] AS NVARCHAR)
+ CAST([WEEKDAY] AS NVARCHAR)
WHEN 1 THEN CAST([YEAR] AS NVARCHAR)+ CAST([WEEK] AS NVARCHAR) + 'T'
WHEN 2 THEN CAST([YEAR] AS NVARCHAR)+ 'T'
END
,[LEVEL]
,[PERIOD] =
CASE [LEVEL]
WHEN 0 THEN ' ' + [WEEKDAYNAME] + ' ' + [MONTHDAY]
WHEN 1 THEN 'TOTAL WEEK ' + CAST([WEEK] AS NVARCHAR)
WHEN 2 THEN 'TOTAL YEAR ' + CAST([YEAR] AS NVARCHAR)
END
,[HOME] = SUM(CASE WHEN Page = 'Home' THEN TOTAL ELSE 0 END)
,[CONTACT] = SUM(CASE WHEN Page = 'Contact' THEN TOTAL ELSE 0 END)
,[PRODUCTS] = SUM(CASE WHEN Page = 'Products' THEN TOTAL ELSE 0 END)
FROM (
SELECT Page
,[YEAR] = DATEPART(YEAR,VisitDate)
,[WEEK] = DATEPART(WEEK,VisitDate)
,[WEEKDAY] = DATEPART(WEEKDAY,VisitDate)
,[WEEKDAYNAME] = LEFT(UPPER(DATENAME(WEEKDAY,VisitDate)),3)
,[MONTHDAY] = CAST(DATEPART(DAY,VisitDate) AS NVARCHAR) + '/'
+ CAST(DATEPART(MONTH,VisitDate) AS NVARCHAR)
,[TOTAL] = SUM(NbVisitors)
,[LEVEL] = GROUPING(DATEPART(WEEKDAY,VisitDate))
+ GROUPING(DATEPART(YEAR,VisitDate))
+ GROUPING(DATEPART(WEEK,VisitDate))
+ GROUPING(CAST(DATEPART(DAY,VisitDate) AS NVARCHAR)
+ '/' + CAST(DATEPART(MONTH,VisitDate) AS NVARCHAR))
FROM @T
GROUP BY Page
,DATEPART(YEAR,VisitDate)
,DATEPART(WEEK,VisitDate)
,DATEPART(WEEKDAY,VisitDate)
,DATENAME(WEEKDAY,VisitDate)
,CAST(DATEPART(DAY,VisitDate) AS NVARCHAR) + '/'
+ CAST(DATEPART(MONTH,VisitDate) AS NVARCHAR)
WITH ROLLUP
) T
WHERE 1=1
AND [LEVEL] < 3
GROUP BY
CASE [LEVEL] --> by Period Id
WHEN 0 THEN CAST([YEAR] AS NVARCHAR)+ CAST([WEEK] AS NVARCHAR)
+ CAST([WEEKDAY] AS NVARCHAR)
WHEN 1 THEN CAST([YEAR] AS NVARCHAR)+ CAST([WEEK] AS NVARCHAR) + 'T'
WHEN 2 THEN CAST([YEAR] AS NVARCHAR)+ 'T'
END,
CASE [LEVEL] --> by period
WHEN 0 THEN ' ' + [WEEKDAYNAME] + ' ' + [MONTHDAY]
WHEN 1 THEN 'TOTAL WEEK ' + CAST([WEEK] AS NVARCHAR)
WHEN 2 THEN 'TOTAL YEAR ' + CAST([YEAR] AS NVARCHAR)
END
,[LEVEL]
ORDER BY [PERIOD_ID]
If you have questions or comments about this solution, discuss it on the dedicated forum: http://beyondrelational.com/groups/tsqlchallenge/forum/t/125.aspx
If you have generic questions or comments about the TSQL Challenge 6 you can go to this forum: http://beyondrelational.com/forums/t/37.aspx
Stay tunned and don't forget to check for TSQL Challenge 9
Aurelien is a new challenger that closes the initial winners of the TSQL Challenge 5.

Aurelien Verla is Online Strategies Team Manager at Wygwam. Web Developer since 10 years, his first approach with databases systems was Access !
MVP around ASP.NET Platform, he also spend a lot of time around SQL Server for one reason: If the UI is slow, first look at the database ! You can find it's english blog here and the french one here.
Here is his solution:
;WITH [FILTER_IDX] ([id], [data], [idx], [start], [stop]) AS (
SELECT [id], [data], 1, 1, CHARINDEX(',', [data])
FROM @filter
UNION ALL
SELECT [id], [data], [idx] + 1, [stop] + 1,
CHARINDEX(',', [data], [stop] + 1)
FROM [FILTER_IDX]
WHERE [stop] > 0
), [BLOG_IDX] ([id], [tags], [idx], [start], [stop]) AS (
SELECT [id], [tags], 1, 1,
CHARINDEX(',', [tags])
FROM @blog
UNION ALL
SELECT [id], [tags], [idx] + 1, [stop] + 1,
CHARINDEX(',', [tags], [stop] + 1)
FROM [BLOG_IDX]
WHERE [stop] > 0
), [FILTER_SPLIT] ([id], [data], [value]) AS (
SELECT [id], [data], SUBSTRING([data], [start],
CASE WHEN [stop] > 0
THEN [stop] - [start]
ELSE LEN([data]) END)
FROM [FILTER_IDX]
), [BLOG_SPLIT] ([id], [tags], [value]) AS (
SELECT [id], [tags], SUBSTRING([tags], [start],
CASE WHEN [stop] > 0
THEN [stop] - [start]
ELSE LEN([tags]) END)
FROM [BLOG_IDX]
)
SELECT T1.[id], T1.[data], T3.[name], T2.[tags], COUNT(*) AS 'RELEVANCE'
FROM [FILTER_SPLIT] AS T1
LEFT JOIN [BLOG_SPLIT] AS T2 ON T1.[value] = T2.[value]
INNER JOIN @blog AS T3 ON T3.[id] = T2.[id]
GROUP BY T1.[id], T1.[data], T3.[name], T2.[tags]
ORDER BY T1.[id], [RELEVANCE] DESC
Here is his comments about the solution:
- My first goal was to make a clear and “read-able” query. That’s why I used CTE to split the query into logical parts: separator positions, split and then the join for relevance.
- Probably not the best performance here, but a nice compromise between being sure that someone else could understand it and performance.
In fact that this is also a key point that I think is really important in Sql, the readability! Most of the times, the Tsql code do its job, but if one wants to modify it after some time, it always needs to re-write it.
Congratulations Aurelien for this solutions and thanks for sharing it with us!
Welcome to TSQL Challenge 8!
This challenge is more about processing HIERARCHIES. SQL Server 2008 introduced a new data type HIERARCHYID which is pretty good for processing hierarchies. However, the problem explained in this challenge should be fixed without using the HIERARCHYID data type.
The Problem
The problem is all about identifying all the employees directly or indirectly reporting to a given manager. Here is the organization chart of my fictitious company.
/*
Jacob
Jess
Steve
John
Bob
Michael
Paul
Rui
Smith
Lana
Johnson
Bobbey
Mic
Stev
Jacobson
Steffi
Paulson
Bracha
Jessica
*/
The Challenge
The challenge is to write a query that can take a Manager Name as a parameter and list all the employees reporting to that manager, directly or indirectly. If the parameter contains “Smith” the query should return:
If the parameter passed is ‘Jacob’, the query should return:
Jacob
Jess
Steve
John
Bob
Michael
Paul
Your query should ideally look like the following:
DECLARE @manager VARCHAR(20)
SELECT @manager = 'Smith'
-- Your query here:
Adding Complexity
This query is pretty easy to write using a recursive CTE. To make the query complex, so that it will meet the complexity level expected for the ‘TSQL Challenges’ series, I am adding the following restriction.
“The query should be written using a recursive CTE, but the filter for "@manager” should not be applied inside the CTE”
Sample Data
Use the following sample data for writing your queries:
DECLARE @Employees TABLE (EmpID INT, EmpName VARCHAR(20), ReportsTo INT)
INSERT INTO @Employees(EmpID, EmpName, ReportsTo)
SELECT 1, 'Jacob', NULL UNION ALL
SELECT 2, 'Rui', NULL UNION ALL
SELECT 3, 'Jacobson', NULL UNION ALL
SELECT 4, 'Jess', 1 UNION ALL
SELECT 5, 'Steve', 1 UNION ALL
SELECT 6, 'Bob', 1 UNION ALL
SELECT 7, 'Smith', 2 UNION ALL
SELECT 8, 'Bobbey', 2 UNION ALL
SELECT 9, 'Steffi', 3 UNION ALL
SELECT 10, 'Bracha', 3 UNION ALL
SELECT 11, 'John', 5 UNION ALL
SELECT 12, 'Michael', 6 UNION ALL
SELECT 13, 'Paul', 6 UNION ALL
SELECT 14, 'Lana', 7 UNION ALL
SELECT 15, 'Johnson', 7 UNION ALL
SELECT 16, 'Mic', 8 UNION ALL
SELECT 17, 'Stev', 8 UNION ALL
SELECT 18, 'Paulson', 9 UNION ALL
SELECT 19, 'Jessica', 10
Notes:
- The query should run on SQL Server 2005 and 2008.
- Write a single query that returns the results. No temp tables, functions etc should be used.
- Send your entries to tc@beyondrelational.com with subject ‘TSQL Challenge 8’ and send your code as an attachment (.sql file)
- Use 4 spaces for indentation while generating the output hierarchy.
- Siblings within a parent node may be sorted either by Name or by ID (I have sorted them by ID in the sample given above)
- Last date to submit your entries: 1 June 2009
- Visit this forum for general questions on TSQL Challenges.
- Visit this forum for any question on TSQL Challenge 8.
The Context
You have done some Sql Server development tasks for a small company that has one Sql Server with several bases. Before you leave, as they will not have any dba to monitor databases, they ask you to write a simple script to help them monitor space used by their data.
The Challenge
You have to write the shortest script to list the 5 biggest tables on the server. It should include all databases that are not system databases. The resulting table will include only the following fields: BASE as database name, TABLE as table name, SIZE as table size used only by data (no index or unused reserved space) in KB and then ROWS as the number of rows in the table (in order to correlate if the space used is big by data or because of the number of rows).
This is an example resulting table:
BASE TABLE SIZE ROWS
----------------------------------
base1 T1 162720 KB 14233
base2 T2 38000 KB 8462
base1 T3 10720 KB 5343
base1 T4 9040 KB 5006
base3 T5 2800 KB 1899
The script should work on any SQL Server “as is”.
Notes:
- The target of the challenge is to write the shortest code. We will only count the chars for this challenge, so don’t try to make it readable. Keep just the essential code.
- You should write it as you want with any of the capabilities provided by a standard installation of SQL Server.
- The script should target any version of SQL Server and it is not limited to only one query.
- The script should not create any persistent objects (you should use temp tables or create proc/udf if you remove them after but not leave them on the server after your script runs).
- The result has to be a table as described before (this means not a print) and should be reusable for future actions (we can plan for example to send the result table by mail or stored it on an other data table every 5 min with a job…)
- Send your entries to tc@beyondrelational.com
- Do not paste the code in the email body. Include it as an attachment: if possible a file named firstname_lastname_tsqlchallenge7.sql
- Mention ‘TSQL Challenge 7’ in the email subject.
- Last Date: 25 May 2009.
- There is a forum linked to 'TSQL Challenges' for general questions and discussions on the 'TSQL Challenges' series.
- For any questions or comments specific to TSQL Challenge 7, visit the dedicated thread
To be able to connect better with the admirers of TSQL Challenges, we have been working on creating our workspace on a number of social networking websites/applications. Many thanks to my friend Rui, who is very passionate about TSQL Challenges and is taking lead on this front.
You can connect with ‘TSQL Challenges’ on any or all of the following locations.
In addition to this, there is a dedicated group within www.beyondrelational.com for TSQL Challenges.
Happy Social Networking!
Friedrich Paul is another many-times-winner of the TSQL Challenges.
He uses like Leonid only one CTE with the same split parsing recursively mechanism . This solution is very goods in terms of code optimization (even if it is too much compressed for a good readability in my point of view ;-)).
Here is his solution:
;with extract(id,name,tags,old_pos,new_pos)
as (
select id,name,
substring(tags,0,
case when charindex(',',tags,0) = 0
then len(tags) else charindex(',',tags,0) end),
0 as old_pos,charindex(',',tags,0) + 1 as new_pos
from @blog --Anchor member function
Union all
select b.id ,b.name,
substring(b.tags,new_pos,
case when charindex(',', b.tags,new_pos) = 0
then len(b.tags)
else charindex(',',b.tags,new_pos) - new_pos end),
charindex(',',b.tags,new_pos) as old_pos,
charindex(',',b.tags,new_pos) + 1 as new_pos
from @blog b
inner join extract e on b.id = e.id
where new_pos ! = 1 --Recursive member function
)
select f.id,f.data,e.name,b.tags,
count(*) as relevance
from @filter f,extract e,@blog b
where b.id = e.id and f.data like '%' + e.tags + '%'
group by f.id,f.data,e.name,b.tags
order by id,relevance desc
You can find all the explanations of this solution on the Friedrich's blog post about that: http://datawarehousejourney.blogspot.com/2009/05/solution-for-tsql-challenge-5.html
If you have questions or comments about this solution, discuss it on the dedicated forum: http://beyondrelational.com/forums/t/56.aspx
If you have generic questions or comments about the TSQL Challenge 5 you can go to this forum: http://beyondrelational.com/forums/t/18.aspx
Stay tunned and don't forget to check the TSQL Challenge #6.
Congratulations Freddy for this solutions and thanks for sharing it with us!
Divya is a new challenger that sends us a very interesting solution. She show us that it can be done without CTE in one very short and simple query. We really enjoyed her solution not only because it is a completely different solution but also because of the global ergonomics of the query

She is working as a SQL Server developer and had written many articles on MsSqlTips and SqlServerCentral.
Here is her solution:
select id,data,name,tags,COUNT(relevance) as Relevance
from
(
select F.id,
F.data,
B.name,
B.tags,
filterdata,
case when CHARINDEX(filterdata,B.tags) >0
then 1 else 0 end as relevance
from (
select F1.id,
F1.data,
O.filterdata
from
(
select *,
cast('<X>'+
replace(F.data,',','</X><X>')+
'</X>' as XML)
as xmlfilter
from @filter F
)F1
cross apply
(
select fdata.D.value('.','varchar(50)') as filterdata
from f1.xmlfilter.nodes('X') as fdata(D)) O
) F
,@blog B
)F1
where relevance =1
group by F1.id,F1.data,F1.name,F1.tags
order by F1.id,Relevance desc
The key point of this solution is to transform the coma separated values to an xml values series then to get the values by a cross apply. Simple but efficient!.
Congratulations Divya for this solutions and thanks for sharing it with us!
There were a lot of different solutions for this challenge but globaly the key point to release the split in line in an unique query was to use CTEs. As most people send us valuable queries with CTEs, we took in consideration also other notation points as the length, the ergonomy of the query and brevity/simplicity.
Again for this challenge, Leonid Koyfman provides us a simple and sexy solution. There is only one CTE, the code is clear, readable, and also one with the best performances, just the things we need. Leonid is a ‘many times winner’ of the challenge and SQL Server Expert.
Here is it's solution:
;with cte
as(
select
id,
tag=case charindex(',',tags)
when 0 then tags
else substring(tags,1,charindex(',',tags,1)-1)
end ,
txt=substring(tags,charindex(',',tags,2)+1,len(tags))
from @blog
union all
select
id,
tag=case charindex(',',txt)
when 0 then txt
else substring(txt,1,charindex(',',txt,1)-1)
end ,
txt=substring(txt,charindex(',',txt,2)+1,len(txt))
from cte
where txt!=tag
)
select
f.id,
f.data,
b.[name],
b.tags,
RELEVANCE=count(*)
from
cte join
@blog b on b.id=cte.id join
@filter f
on charindex(','+cte.tag+',' , ','+f.data+',')>0
group by
f.id,
f.data,
b.[name],
b.tags
order by
id,
count(*)desc,
b.[name]
You can find a very detailled explanation of his solution on his blog: http://beyondrelational.com/blogs/leonid/archive/2009/05/11/solution-for-challenge-5.aspx.
Again, congratulations Leonid and thanks for sharing this solution with us!
I spent a lot of time reviewing all the solutions and regarding the key points of our current notation for the challenge this is the list of the winners:
- Leonid Koyfman which is a many times winner of the challenge and provide as usual a clean and very efficient solution
- Divya Agrawal which share the first place with Leonid and which provide the shortest solution (and without CTE!)
- Friedrich Paul and Aurélien Verla which provide a more traditional but efficient solution.
We will publish in the next days explanations for each solution.
I would like to mention also the fantastic solution of Michael Coles which provide a full Xml solution that we also talk about in a few days.
A few words about this challenge. The key starting point for this one was to wonder if it is possible to find some solutions to realize inline, inside a stand alone sql query something like the traditional sql split that all us sql developpers have used at least one time.
The exercice was interesting because in definitive, coma separated values are very often used (even if I admit it is not in most cases a good practice…).
We can imagine in this exemple that our query should help to populate an auto-complete search list by tags in a blog application. Coma separated values are not as easy as it seems to share between the application and the database. Most of the times, developpers move this difficulty from sql side to application side my doing the same with objects manipulation. I am sure we can write a pretty linq query with the same effect. But most of the times for larger data volumes that’s not the good performance solution.
Awaiting to present the different solutions of each winners, this is my first solution. It’s one solution, not the better one as you will see in next posts:
;WITH FILTER (id , data, VALUE) AS (
SELECT id
,data = CAST(data AS NVARCHAR(MAX))
,VALUE = CAST('' AS NVARCHAR(MAX))
FROM @filter
UNION ALL
SELECT id, data =
CASE WHEN CHARINDEX(',',data) > 0
THEN SUBSTRING(
data,
CHARINDEX(',',data)+1,
LEN(data)-CHARINDEX(',',data))
ELSE '' END
, VALUE =
CASE WHEN CHARINDEX(',',data) > 0
THEN SUBSTRING(data,1,CHARINDEX(',',data)-1)
ELSE data END
FROM FILTER
WHERE LEN(data) > 0
),
BLOG_TAGS (id , tags, VALUE) AS (
SELECT id
,tags = CAST(tags AS NVARCHAR(MAX))
,VALUE = CAST('' AS NVARCHAR(MAX))
FROM @blog
UNION ALL
SELECT id, tags =
CASE WHEN CHARINDEX(',',tags) > 0
THEN SUBSTRING(
tags,
CHARINDEX(',',tags)+1,
LEN(tags)-CHARINDEX(',',tags))
ELSE '' END
, VALUE =
CASE WHEN CHARINDEX(',',tags) > 0
THEN SUBSTRING(tags,1,CHARINDEX(',',tags)-1)
ELSE tags END
FROM BLOG_TAGS
WHERE LEN(tags) > 0
)
SELECT F.ID,F.data,B.name,B.tags,RELEVANCE = COUNT(B.ID)
FROM(
SELECT F.ID,F.data,FILTER.VALUE
FROM FILTER INNER JOIN @filter F ON FILTER.ID = F.ID
WHERE LEN(VALUE) > 0
) F
INNER JOIN
(
SELECT B.ID, B.name, B.tags,T.VALUE
FROM @blog B INNER JOIN BLOG_TAGS T ON B.ID = T.ID
WHERE LEN(VALUE) > 0
) B ON F.VALUE = B.VALUE
GROUP BY F.ID,F.data,B.name,B.tags
ORDER BY F.ID,RELEVANCE DESC
Stay tunned and don't forget to check the TSQL Challenge #6
cheers!
The context
For this challenge, you will have to analyse logs on the web server of your company.
Here is the sample data you have to work with:
ID VisitDate Page NbVisitors
----------- ----------------------- --------------- -----------
1 2009-03-23 00:00:00.000 Home 10
2 2009-03-24 00:00:00.000 Home 14
3 2009-03-26 00:00:00.000 Home 22
4 2009-03-27 00:00:00.000 Home 3
5 2009-03-29 00:00:00.000 Home 4
6 2009-04-01 00:00:00.000 Home 33
7 2009-04-03 00:00:00.000 Home 2
8 2009-03-26 00:00:00.000 Contact 22
9 2009-03-27 00:00:00.000 Contact 10
10 2009-03-29 00:00:00.000 Contact 35
11 2009-03-30 00:00:00.000 Contact 13
12 2009-03-27 00:00:00.000 Products 8
13 2009-03-29 00:00:00.000 Products 12
14 2009-04-01 00:00:00.000 Products 16
The Challenge
There is an existing application that takes tabular data with hierarchical consolidation inside. Your boss ask you to produce a report about web site activity using this application. In order to use the hierarchical capabilites of this application, each row of the report you provide needs to have an id and hierarchical level information. The Id is a construction based on year, week number and day in week number. For each consolidation row A T is added at the end (see the resulting table).
The key point of this report is to count for each day, each week, each year, the number of visitors of each page of the site. A period label should be also produced with the following format : '3 letter of the name of the day in week' day / month (exemple: MON 23/3) for the day rows, TOTAL WEEK XX (exemple TOTAL WEEK 25) for the week rows and finally TOTAL YEAR XXXX for the year rows.
We supose that we only have 3 sections on the site corresponding to Home,Contact and Products. This should be considered as static information and will not be updated in the future.
This is the expected output based on the input data provided.
PERIOD_ID LEVEL PERIOD HOME CONTACT PRODUCTS
---------- ----- --------------- ----------- ----------- -----------
2009131 0 MON 23/3 10 0 0
2009132 0 TUE 24/3 14 0 0
2009134 0 THU 26/3 22 22 0
2009135 0 FRI 27/3 3 10 8
2009137 0 SUN 29/3 4 35 12
200913T 1 TOTAL WEEK 13 53 67 20
2009141 0 MON 30/3 0 13 0
2009143 0 WED 1/4 33 0 16
2009145 0 FRI 3/4 2 0 0
200914T 1 TOTAL WEEK 14 35 13 16
2009T 2 TOTAL YEAR 2009 88 80 36
Notes:
- You should write this in any number of queries to produce the final correct table (but solutions with only one query will have better score)
- You have to use only pure T-SQL, no CLR.
- The query can assume that there are only three pages in the website: home, contact and products.
- The Query should target any version of Sql Server (but a solution that works on all will be prefered)
- You have to use the exact sample data provided (please don't re-write the sample data with your naming,etc...)
- Send your entries to jacob@beyondrelational.com
- Do not paste the code in the email body. Include it as an attachment (.sql file)
- Mention ‘TSQL Challenge 6’ in the email subject
- Last Date: 18 May 2009
- There is a forum linked to 'TSQL Challenges' for general questions and discussions on the 'TSQL Challenges' series.
- For any questions specific to 'TSQL Challenge 6', visit this thread.
Sample Data:
SET DATEFIRST 1
SET DATEFORMAT YMD
SET NOCOUNT ON
DECLARE @T TABLE (ID INT IDENTITY(1,1), VisitDate DATETIME, Page NVARCHAR(15),NbVisitors INT)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-23','Home',10)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-24','Home',14)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-26','Home',22)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-27','Home',3)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-29','Home',4)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-04-01','Home',33)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-04-03','Home',2)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-26','Contact',22)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-27','Contact',10)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-29','Contact',35)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-30','Contact',13)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-27','Products',8)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-03-29','Products',12)
INSERT INTO @T (VisitDate,Page,NbVisitors) VALUES ('2009-04-01','Products',16)