Jacob Sebastian, SQL Server MVP
See this post that explains how to format TSQL code listing when posting in this forum.
Hi, Is the data in Column "Name" using SPACES for the indenting or TAB?
I'm totally confused regarding the definition of "SET based approach" that has been used to evaluate challenge #13
"Several solutions were rejected for the following reasons
I absolutely agree in terms of a WHILE loop (or a cursor). But it's the very first time I read that an UPDATE operation was not accepted as being a SET based operation...
Regarding temp tables and table variables:
Those usually can be defined inside a CTE. However, there are cases where an intermediate table combined with a CTE performs way better than having the SELECT statement inside the CTE (and some of those cases are shown within your challenges, I'm sure). Most probably, some better scaling solutions have been eliminated down the road due to the current (new?) interpretation of "SET based".
It would have helped the previous challenges a lot if the requirements were as clear as they are now....
1) is there guaranteed to be only one top-level employee in the hierarchy?
2) what is the maximum hierarchy depth we should code for?
also, sorry to be an anal programmer-type (funny how you get those in here!), but the challenge states, "The column 'by_sub' should show the total of orders created by all the employees reporting to him/her." This makes it sound like you just want the direct-report employees, when in actuality it's all the employees underneath the given employee (i.e. in his/her hierarchy tree). This is clear from the sample data. I guess it depends on how you define "reports to," but in the sample data column names you use "ReportsTo" in a way which means "Reports Directly To" so it creates some confusion.
Can you please clarify what criteria you employ to test the output? I am wondering if the columns need to be of a certain type and length?
If so, can you please detail that?
Please note that we are not trying to define what is a SET based operation and what is not. The rules indicate the type of operations that the challenges allow. We are not trying to set or redefine the standards. "UPDATE operations" are not allowed, not because they are not SET based, but because we dont allow multi-statement-queries.
In no way we want to add new 'interpretation to SET BASED'.
We agree that the rules were not very clear previously, but they are getting better now. We are working hard to make it even clearer in the future.
#1 - There may be multiple top level employee in the hierarchy
#2 - The number of hierarchy levels are not defined. Your code should handle the maximum possible levels.
You are right. the 'by sub' column should include the total order of all employees directly or indirectly reports to the given employee. Not only just the immediate subordinates.
The column types does not really matter, but the number of columns and the name and order of columns that your query produce should match exactly with the sample output given with the challenge.
Since I now can 'spam' the forum too (thx Jacob!) and this might be the more appropriate place to ask this, I'll repeat my question here :
Why do you guys "insist" to have the
input data in Table variables ? They kind of behave different (read :
worse) than ordinary tables which is where data normally resides (eg.
statistics, indexes, ...) And they make testing slow as for each test
one has to declare + load the table again (from an existing table).
=> Wouldn't it be a lot easier to have an empty database with
just those tables in it ? The added benefit of (potentially) having
indexes on said input data might make the challenge a bit more
realistic. Besides, let's agree : @emp and @order don't even have
proper PK's !?!? What kind of an example is that in an advanced SQL
Challenge ? =/
ps: as for the 'only one query allowed, no variables etc' - rules,
I'm no fan of that either as (often) means working around limitations
instead of using the full potential of the system. Then again, nobody
is forcing me to take part in this =)
In other words, I like these challenges as they 'evangelize' SET based SQL, yet I fear that due to the restrictions imposed they actually push people away from it rather than make them see the benefit.
Each of the solutions will go through 3 levels of testing. We don’t collect the performance data in the first two levels (basic testing and logic testing).
Solutions that pass both the above testing will undergo load testing with a large set of data. This test is done on a physical table (not on a table variable). The evaluation script will replace the reference to the table variable with a physical table having a large number of rows (usually 1 million) and run the tests on that. We collect performance statistics only at this testing phase.
Currently it is a pain even on our side and we really wish to come up with a better idea. So I hope the idea of accepting solutions that work with table variables might change very soon :-)
Why do you not just publish the table schema and data generation script so that everyone can optimise their queries to produce the best result?
I personally generate test data in a table to load test my solution, then I have to convert it to your table variable to submit it, and you then have to convert it back again to test it yourselves....what a waste of all of our valuable time....
So, come on, give us the test data generation script and save everyone the trouble.
This is something we have in the wish list. Probably we will be able to do it with the next challenge
Although not shown in the sample data, would you expect employees without any orders to be included in the results? Or will this never be the case?