Consolidate start/end times in a dataset

tokash

Registered User.
Local time
Today, 09:46
Joined
Sep 10, 2015
Messages
15
I'm using SQL 2012.

My current dataset looks like this:

shft_id code_id start_time end_time
9090 100 1/28/2015 07:15 1/28/2015 08:00
9090 100 1/28/2015 08:00 1/28/2015 08:30
9090 120 1/28/2015 08:30 1/28/2015 11:30
9090 110 1/28/2015 11:30 1/28/2015 13:00
9090 100 1/28/2015 13:00 1/28/2015 16:00
9090 100 1/28/2015 16:00 1/28/2015 16:30
9091 120 1/28/2015 11:00 1/28/2015 13:00
9091 110 1/28/2015 13:00 1/28/2015 15:30
9091 110 1/28/2015 15:30 1/28/2015 18:00
9091 110 1/28/2015 18:00 1/28/2015 18:30
9091 120 1/28/2015 18:30 1/28/2015 19:30

I need to consolidate consecutive start/end times based on the shft_id and code_id to the following desired output:

shft_id code_id start_time end_time
9090 100 1/28/2015 07:15 1/28/2015 08:30
9090 120 1/28/2015 08:30 1/28/2015 11:30
9090 110 1/28/2015 11:30 1/28/2015 13:00
9090 100 1/28/2015 13:00 1/28/2015 16:30
9091 120 1/28/2015 11:00 1/28/2015 13:00
9091 110 1/28/2015 13:00 1/28/2015 18:30
9091 120 1/28/2015 18:30 1/28/2015 19:30

Would anyone know the SQL that could accomplish this?
 
would think a groupby query would do it - group by shift_id and code_id, and select min start_time and max end_time
 
would think a groupby query would do it - group by shift_id and code_id, and select min start_time and max end_time
Thanks for the suggestion CJ, but doing a simple group by on the shft_id, code_id with min/max on the start/end times will give inaccurate results.
For example, such a query would give a result of the very first record being:

shft_id code_id start_time end_time
9090 100 1/28/2015 07:15 1/28/2015 16:30

which is inaccurate
 
i already posted similar query just a week ago.
 
SELECT DISTINCT shft_id, code_id, DMIN("start_time","yourTableName","shft_id = " & shft_id & " AND code_id = " & code_id) AS start_time, DMAX("start_time","yourTableName","shft_id = " & shft_id & " AND code_id = " & code_id) AS end_time FROM yourTableName
 
Data does not have a sense of 'order' - previous/next etc. Even Excel, which still needs the data sorted or 'placed' the right way to get the right result. In this case the order can be determined by having a start_time of one record equalling the end_time of another.

This relies on not having two records with the same start_time or end_time and (with the exception of the first and last record) there is always a start_time to match an end_time

Try this, not tested, change names to suit

Code:
 SELECT DISTINCT shft_id, code_id,
     nz((SELECT TOP 1 start_time FROM myTable AS T WHERE end_time=myTable.start_time and shft_id=myTable.shft_id and code_id=myTable.code_id),start_time) as starttime, end_time
 FROM myTable
 WHERE not exists (SELECT * FROM myTable AS T WHERE start_time=myTable.end_time and shft_id=myTable.shft_id and code_id=myTable.code_id)

The theory is:

to select the start time the nz function and subquery is basically saying 'if there is a record with the same end time as this records start time and the same shft and code id's then there is a previous record for this shft and code id so use that records start time, otherwise there is no matching previous record so this is the first record, so use this records start time.

however this would still bring through all records so we now need to exclude the starting records which is what the Exists subquery does.

this is looking to see if a record exists where the start time equals this records end time and has the same shft and code id. If it does exist then the current record is not the last record for this shft and code id so is excluded.
 
@arnelgp. Although your solution is a common one, it won't work in this case because there are duplicates of the shift and code id's - an additional field would be required to indicate a batch number or similar. It effectively produces the same result as the group by query I was suggesting
 
thinking about it another solution which will be faster is would be

Code:
SELECT ThisRec.shift_id, ThisRec.code_id, Nz([PrevRec].[start_time],[thisrec].[start_time]) AS StartTime, ThisRec.end_time
FROM (myTable AS ThisRec LEFT JOIN myTable AS PrevRec ON (ThisRec.start_time = PrevRec.end_time) AND (ThisRec.code_id = PrevRec.code_id) AND (ThisRec.shift_id = PrevRec.shift_id)) LEFT JOIN myTable AS NextRec ON (ThisRec.end_time = NextRec.start_time) AND (ThisRec.code_id = NextRec.code_id) AND (ThisRec.shift_id = NextRec.shift_id)
WHERE (((NextRec.shift_id) Is Null))
 
thinking about it another solution which will be faster is would be

Code:
SELECT ThisRec.shift_id, ThisRec.code_id, Nz([PrevRec].[start_time],[thisrec].[start_time]) AS StartTime, ThisRec.end_time
FROM (myTable AS ThisRec LEFT JOIN myTable AS PrevRec ON (ThisRec.start_time = PrevRec.end_time) AND (ThisRec.code_id = PrevRec.code_id) AND (ThisRec.shift_id = PrevRec.shift_id)) LEFT JOIN myTable AS NextRec ON (ThisRec.end_time = NextRec.start_time) AND (ThisRec.code_id = NextRec.code_id) AND (ThisRec.shift_id = NextRec.shift_id)
WHERE (((NextRec.shift_id) Is Null))
You, sir, are my hero! It's a simple but elegant solution that does exactly what I am looking for. I'm only sad that I can give one thanks per post...
But thank you thank you thank you!
 
SELECT DISTINCT shft_id, code_id, DMIN("start_time","yourTableName","shft_id = " & shft_id & " AND code_id = " & code_id) AS start_time, DMAX("start_time","yourTableName","shft_id = " & shft_id & " AND code_id = " & code_id) AS end_time FROM yourTableName
aernelgp, thank you so much for taking the time to read my post and offer a solution. CJ hit the nail on the head and his sql does exactly what I was looking for. But I appreciate your help and feedback as well!
 

Users who are viewing this thread

Back
Top Bottom