This is a problem that I had seen in one of the forums many years back. This problem is related to a reporting requirement from the data stored by a 'poorly designed' system. The system has one table that stores the appointments scheduled for each employee. There is another table that keeps tracks of the activities performed by employees.
Red Gate SQL Prompt, a SSMS plug-in that increases how fast you work with SQL, is sponsoring this week’s TSQL challenge 24. The top winner of TSQL Challenge 24 will win a license of SQL Prompt Pro.
Your task is to generate a report that links the scheduled tasks and activities performed in the given output format.
Appointments Table
The system uses a schedule table to keep track of the activities scheduled by each employee. Here is how this table looks like.
EmpName Activity StartTime Duration
------- ---------- ------------------- --------
John HR Meeting 2010-01-01 10:15:00 01:30
John Lunch 2010-01-01 13:00:00 00:45
John Training 2010-01-01 15:00:00 01:00
Mike HR Meeting 2010-01-01 10:15:00 01:30
Mike Lunch 2010-01-01 13:00:00 00:45
Jessica Training 2010-01-01 11:20:00 00:30
The table does not have a primary key. The name of the employee is used as the key across the system. Employee names are guaranteed to be unique in this system. Here is how the first record is interpreted: "John is supposed to attend the 'HR Meeting' at 10:15 on 1st January 2010. The meeting will be 90 minutes long".
Activity Log Table
Another system keeps track of the actual activities of the employees against the scheduled activities. This system writes the activities to an 'Acitvity Log' table. This system logs the activies every 30 minutes. This is how the data in this table looks like:
EmpName Activity StartTime Duration
------- ---------- ------------------- --------
John HR Meeting 2010-01-01 10:00:00 00:15
John HR Meeting 2010-01-01 10:30:00 00:30
John HR Meeting 2010-01-01 11:00:00 00:30
John HR Meeting 2010-01-01 11:30:00 00:15
John Lunch 2010-01-01 13:00:00 00:30
John Lunch 2010-01-01 13:30:00 00:15
John Training 2010-01-01 15:00:00 00:30
John Training 2010-01-01 15:30:00 00:20
Mike HR Meeting 2010-01-01 10:00:00 00:15
Mike HR Meeting 2010-01-01 10:30:00 00:30
Mike HR Meeting 2010-01-01 11:00:00 00:30
Mike HR Meeting 2010-01-01 11:30:00 00:10
Mike Lunch 2010-01-01 13:00:00 00:25
Mike Lunch 2010-01-01 13:30:00 00:25
Jessica Training 2010-01-01 11:00:00 00:05
Jessica Training 2010-01-01 11:30:00 00:20
The way data is logged in this table looks little odd. The first 4 records are to be interpreted as "Between 10 and 10.30 AM, John attended 'HR Meeting' for 15 minutes" which means that John attended the meeting at 10:15 sharp. The meeting continued and he attended for another 30 minutes between 10:30 and 11:00. He then attended for another 30 minutes betwee 11:00 and 11:30 and finally attended for 15 minutes between 11:30 and 12:00. So to summarize, it shows that John attended the HR Meeting from 10:15 to 11:45 for 90 minutes.
This is what you need to produce from the above sample data.
Date EmpName Activity SchSt SchDur ActSt ActDur
---------- ------- ---------- ----- ------ ----- ------
2010-01-01 Jessica Training 11:20 00:30 11:25 00:25
2010-01-01 John HR Meeting 10:15 01:30 10:15 01:30
2010-01-01 John Lunch 13:00 00:45 13:00 00:45
2010-01-01 John Training 15:00 01:00 15:00 00:50
2010-01-01 Mike HR Meeting 10:15 01:30 10:15 01:25
2010-01-01 Mike Lunch 13:00 00:45 13:05 00:50
The last four columns are abbreviated and they stand for 'Scheduled Start Time', 'Scheduled Duration', 'Actual Start Time' and 'Actual Duration'.
FAQ
- What will be the maximum duration of an activity?
- The maxinum duration of an activity will be 24 hours.
- Can a record in the Activity Log Table have a duration value that is more than 30 minutes?
- No, the activity log table is updated every 30 minutes and the maximum duration that a record can store is 30 minutes.
- Does the Activity Log always get recorded at the 00 and 30 minute of the hour? For example, can it run at 10:32?
- The activity log will always update the table at the 00 and 30th minute of every hour. It cannot run at other times, such as 10:32 am.
- Can an event start on one day and end on the next day?
- No, all events will start and end on the same day
- What is the sorting requirement of the output?
- Sort the output by Date, Employee Name and Schedule Start Time.
- Is it possible to have overlapping activities in the appointments table?
- No, one employee can have only one activity scheduled for any given period.
- Is it possible to have more than one activitiy within 30 minutes in the appointments/activity table?
- Yes, it is acceptable to have one schedule end at 10:15 and another start at 10:15. As long as they are not overlapping, they are fine. However, it is not possible to have two appointments that starts and ends within the span of 30 minutes. For example, the following is not possible: 10:00 to 10:15 Activity1, 10:15 to 10:30 Activity2.
- Is it possible to have more than one entry per starttime in the activity log?
- Yes, it is possible. A typical example is the case where one appointment ends at 10:15 (which may have started before 10:00) and another starts at 10:15 (which might be ending at some time after 10:30). In such a case there will be two records at 10:00. (Note that in such a case either the first shift should have started earlier than 10:00 or the second should end after 10:30)
- Is it possible to have Orphan records in either table? Can there be rows in the activity log table without an appointment?
- Yes, it is possible. There may be rows in the activity log table without an appointment. There may be rows in the appointment table without an activity log record. You should ignore such rows while processing.
- Can the same appointment be scheduled more than once on the same day by the same person?
- Do we have to assume that there is no "gaps" in the activity log table?
- An employee cannot have a break between an activity. So there will be no gaps in the activity table for a single user activity.
- If an activity is scheduled to start at 10:00 for 20 minutes and if I see an entry for that activity in the log table at 10:00 for 20 minutes, how do I calculate the start time of the activity? The activity may have started at any time between 10:00 and 10:10
- In such a case, you should assume that the activity started on time (10:00). However, if you see only 15 minutes in the activity log then you should assume that the activity started at 10:15.
- Can an activity start prior to the scheduled time?
- No, an activity cannot start before the scheduled time. However, it can start late. If an activity is scheduled to start at 10:30 and if you find an activty log entry at 14:00 for 5 minutes, you should assume that the activity started at 14:25.
More Examples
The early review team had come up with a number of grey areas that need more explanation. One of the scenarios that added a lot of confusion to the processing logic is the case when two activities of an employee are recorded at the same hour slot. This is discussed in FAQ 7, 8 and 12. Here is another illustration that explains this.
Here is the entry in the schedule table
EmpName Activity StartTime Duration
------- ---------- ------------------- --------
John HR Meeting 2010-01-01 10:15:00 01:30
John Training 2010-01-01 15:00:00 01:00
Activity Log Table Example 1
EmpName Activity StartTime Duration
------- ---------- ------------------- --------
John Training 2010-01-01 16:30:00 00:05' –- 16:55 to 17:00
John Training 2010-01-01 17:00:00 00:05' –- 17:00 to 17:05
John HR Meeting 2010-01-01 17:00:00 00:10' –- 17:05 to 17:15
Activity Log Table Example 2
EmpName Activity StartTime Duration
------- ---------- ------------------- --------
John Training 2010-01-01 17:00:00 00:10 –- 17:00 to 17:10
John HR Meeting 2010-01-01 17:00:00 00:05 –- 17:25 to 17:30
John HR Meeting 2010-01-01 17:30:00 00:05 –- 17:30 to 17:35
The above illustration demonstrates how the activity start time should be calculated in the scenarios where two entries for the same employee are recorded at the same 30 minute slot.
Example 1: There are two entries for Trainnig (16:30 and 17:00, 5 minutes each), which indicates that the meeting started in the 16:30 to 17:00 slot and continued to 17:00 to 17:30 slot for 5 minutes. That essentially tells us that the meeting started at 16:55 and ended at 17:05 (10 minutes). HR meeting has no second entry, so it will start at the earliest possible time starting at 17:00 and therefore will follow the Training activity (17:05-17:15).
For the second Example two entries for HR meeting (17:00 and 17:30), so it will be from 17:25 till 17:35 using the same logic we saw earlier. Since Training has no second entry it will start at the earliest possible time which is 17:00.
Sample Data
Here is the script to genreate the source tables and populate the sample data
IF OBJECT_ID('TC24_Schedules','U') IS NOT NULL BEGIN
DROP TABLE TC24_Schedules
END
GO
CREATE TABLE TC24_Schedules (
SchID INT IDENTITY PRIMARY KEY,
EmpName VARCHAR(10),
Activity VARCHAR(15),
StartTime SMALLDATETIME,
Duration VARCHAR(5)
)
GO
INSERT INTO TC24_Schedules (EmpName, Activity, StartTime, Duration)
SELECT 'John', 'HR Meeting', '2010-01-01 10:15:00', '01:30' UNION ALL
SELECT 'John', 'Lunch', '2010-01-01 13:00:00', '00:45' UNION ALL
SELECT 'John', 'Training', '2010-01-01 15:00:00', '01:00' UNION ALL
SELECT 'Mike', 'HR Meeting', '2010-01-01 10:15:00', '01:30' UNION ALL
SELECT 'Mike', 'Lunch', '2010-01-01 13:00:00', '00:45' UNION ALL
SELECT 'Jessica', 'Training', '2010-01-01 11:20:00', '00:30'
IF OBJECT_ID('TC24_ActivityLog','U') IS NOT NULL BEGIN
DROP TABLE TC24_ActivityLog
END
GO
CREATE TABLE TC24_ActivityLog (
LogID INT IDENTITY PRIMARY KEY,
EmpName VARCHAR(10),
Activity VARCHAR(15),
StartTime SMALLDATETIME,
Duration VARCHAR(5)
)
GO
INSERT INTO TC24_ActivityLog(EmpName, Activity, StartTime, Duration)
SELECT 'John', 'HR Meeting', '2010-01-01 10:00:00', '00:15'
UNION ALL
SELECT 'John', 'HR Meeting', '2010-01-01 10:30:00', '00:30'
UNION ALL
SELECT 'John', 'HR Meeting', '2010-01-01 11:00:00', '00:30'
UNION ALL
SELECT 'John', 'HR Meeting', '2010-01-01 11:30:00', '00:15'
UNION ALL
SELECT 'John', 'Lunch', '2010-01-01 13:00:00', '00:30'
UNION ALL
SELECT 'John', 'Lunch', '2010-01-01 13:30:00', '00:15'
UNION ALL
SELECT 'John', 'Training', '2010-01-01 15:00:00', '00:30'
UNION ALL
SELECT 'John', 'Training', '2010-01-01 15:30:00', '00:20'
UNION ALL
SELECT 'Mike', 'HR Meeting', '2010-01-01 10:00:00', '00:15'
UNION ALL
SELECT 'Mike', 'HR Meeting', '2010-01-01 10:30:00', '00:30'
UNION ALL
SELECT 'Mike', 'HR Meeting', '2010-01-01 11:00:00', '00:30'
UNION ALL
SELECT 'Mike', 'HR Meeting', '2010-01-01 11:30:00', '00:10'
UNION ALL
SELECT 'Mike', 'Lunch', '2010-01-01 13:00:00', '00:25'
UNION ALL
SELECT 'Mike', 'Lunch', '2010-01-01 13:30:00', '00:25'
UNION ALL
SELECT 'Jessica', 'Training', '2010-01-01 11:00:00', '00:05'
UNION ALL
SELECT 'Jessica', 'Training', '2010-01-01 11:30:00', '00:20'
Notes
- The solution should be a single query that starts with a "SELECT", ";WITH" or "WITH"
- Read the Submission Guidelines and make sure that your solution follows them.
- The solution should work on SQL Server 2005, 2008 or later versions
- Use this forum for any questions related to TSQL Challenge #24
- Last Date to submit your entries: 08 March 2010 Midnight GMT.
- Please pay attention to the rules explained in the FAQ, especially FAQ 12 which adds a good deal of complexity to the challenge.
And the winner gets..
In addition to SQL Stars the winner who comes on top will get a license of Red Gate SQL Prompt Pro.
SQL Prompt is a plug-in that increases how fast you can work with SQL.
- Code-completion for SQL Server
- Automated SQL reformatting to match your preferred style
- Rapid access to your database schema information
- Snippets cut down repetitive typing
Click here to download a trial copy of Red Gate SQL Prompt
Special thanks to the Early Review Team, especially lmu92, dishdy, Aurelien Verla, Sergejack and Brad Schulz who reviewed the challenge and provided valuable inputs to make the challenge description and rules more comprehensible.
I am very glad to announce the winners of TSQL Challenge 16. This is a bit special for two reasons. First of all, there was a long gap between announcing the winners of the previous challenge and this one. We have been working hard improving the evaluation system which caused a temporary delay. We will be able to move fast now and clear all the back logs.
Secondly, this is the first challenge which went over a full evaluation cycle using the new software we have been building. We were very excited evaluating the solutions using the new software which will reduce the delay in announcing the results considerably.
And the winners are…
- Kevin Suchlicki
- Mark Cowne
You can find the details of the winners here.
There were more solutions that passed the Logic Testing. We did a load testing of those solutions using the data posted here. The solutions were tested with a large table containing 50,000 rows and with a timeout of 10 minutes. Solutions that took more than 10 minutes were discarded from the final results. Both the winning solutions took an average time below 5 minutes. You can find the performance comparison of the solutions here.
With this challenge, we are publishing a little bit more information about the performance evaluation we did. Along with the performance summary, we are also posting the results of individual runs of the solutions. Each solution was executed against the load testing data 5 times and the result of each execution is posted with the final results. You can click on the CPU, Reads, Writes or Duration on the summary page and it will take you to the detailed information. You can see an example here.
Congratulations Kevin and Mark!
It is again time to play with some date/time values. This challenge involves reading the audit log entries generated by a monitoring application and create a report that shows the up-time and down-time of a system that the application is monitoring.
Red Gate SQL Backup Pro is sponsoring TSQL Challenge 23. The top winner of TSQL Challenge 23 will win a license of SQL Backup Pro worth $795, and a Red Gate goodie bag including a T-shirt, frisbee and more.
The Context
NP Corporation is using a Legacy Application that their employees use to perform some of the day-to-day operations. The system is quite old, the people that built the system is not around any more. Due to some unknown reasons the system hangs frequently and the only way to bring it up is to reboot the server. Though it is not an acceptable behavior, there is no alternative and the management of NP corporation has decided to migrate to a new system soon. The management wants to keep track of the uptime and downtime of the system and created a monitoring application that populate an audit table as given below.
AuditTime IsOnline
----------------------- --------
2010-01-01 10:11:23.000 1
2010-01-01 10:14:03.000 1
2010-01-01 10:19:36.000 1
2010-01-01 10:23:59.000 0
2010-01-01 10:27:14.000 0
2010-01-01 10:31:24.000 1
2010-01-01 10:34:15.000 1
2010-01-01 10:36:11.000 0
2010-01-01 10:39:40.000 1
2010-01-01 10:42:23.000 1
The monitoring application checks the status of the application preriodically and logs the status to the audit table. If the server is online, it writes "1" and "0" if the system is not online. "AuditTime" column stores the time at which the monitoring application checked the status of the server. The monitoring application has a custom logic that decides the interval at which the audit takes place.
I know, none of you are happy with the way the system works and the monitoring application behaves. Some of you might even refuse to work in such an environment arguing the whole setup does not make sense. However, for the purpose of this challenge, assume that you have no choice than to live with it!
The Challenge
Your job is to write a query that produces a report showing the uptime and downtime of the system. Here is the output that you need to produce, based on the given input data.
TimeFrom TimeTo IsOnline
------------------- ------------------- --------
2010-01-01 10:00:00 2010-01-01 10:21:47 Yes
2010-01-01 10:21:47 2010-01-01 10:29:19 No
2010-01-01 10:29:19 2010-01-01 10:35:13 Yes
2010-01-01 10:35:13 2010-01-01 10:37:55 No
2010-01-01 10:37:55 2010-01-01 11:00:00 Yes
Rules
There are a few points to keep in mind when generating the output data.
- The solution should be a single query that starts with a "SELECT", ";WITH" or "WITH"
- The first and last periods start and end on the hour preceding and following. (Note that the "TimeFrom" on the first row is 10 AM and "TimeTo" in the last row is 11 AM. For example, if the first reading is at 08:23:45, the first period in the resultset should start at 8:00:00. If the last reading is at 21:23:45, the last period in the result should end at 22:00:00.
- All rows except for the first and last should finish at the midpoint between the readings. For example the reading at 10:19 found the system to be online and the reading at 10:23 found it to be offline. Since the server went offline some time between the two readings, you should take the midpoint between the readings: 10:21
- The first and last periods should be whatever status the first and last readings indicate (rather than assuming that the period starts up or down).
- The result should use the datetime type, rounded down to the nearest second. Each reading is taken at an exact second - there is no millisecond component.
Sample Data
Here is the script to genreate the source tables and populate the sample data
IF OBJECT_ID('TC23','U') IS NOT NULL BEGIN
DROP TABLE TC23
END
GO
CREATE TABLE TC23(
AuditTime DATETIME,
IsOnline BIT -- 0:Down, 1:Up
)
GO
INSERT INTO TC23(AuditTime, IsOnline)
SELECT '2010-01-01T10:11:23', 1 UNION ALL
SELECT '2010-01-01T10:14:03', 1 UNION ALL
SELECT '2010-01-01T10:19:36', 1 UNION ALL
SELECT '2010-01-01T10:23:59', 0 UNION ALL
SELECT '2010-01-01T10:27:14', 0 UNION ALL
SELECT '2010-01-01T10:31:24', 1 UNION ALL
SELECT '2010-01-01T10:34:15', 1 UNION ALL
SELECT '2010-01-01T10:36:11', 0 UNION ALL
SELECT '2010-01-01T10:39:40', 1 UNION ALL
SELECT '2010-01-01T10:42:23', 1
SELECT * FROM TC23
/*
AuditTime IsOnline
----------------------- --------
2010-01-01 10:11:23.000 1
2010-01-01 10:14:03.000 1
2010-01-01 10:19:36.000 1
2010-01-01 10:23:59.000 0
2010-01-01 10:27:14.000 0
2010-01-01 10:31:24.000 1
2010-01-01 10:34:15.000 1
2010-01-01 10:36:11.000 0
2010-01-01 10:39:40.000 1
2010-01-01 10:42:23.000 1
*/
Notes
- Read the Submission Guidelines and make sure that your solution follows them.
- The solution should work on SQL Server 2005, 2008 or later versions
- Use this forum for any questions related to TSQL Challenge #23
- Last Date to submit your entries: 22 February 2010 Midnight GMT.
And the Winner Gets...
Red Gate SQL Backup Pro is sponsoring TSQL Challenge 23. The top winner of TSQL Challenge 23 will win a license of SQL Backup Pro worth $795, and a Red Gate goodie bag including a T-shirt, frisbee and more.
Note: In case more than one user comes in the first place, we will select the winner through a lucky draw
Special thanks to the Early Review Team, especially
Sergejack and dave ballantyne who reviewed the challenge and provided valuable inputs to make the challenge description and rules more comprehensible. Rui, Peter and Rob from the TSQL Challenge Team worked really hard on this challenge to get it to the way you see it now.