I have a table below where I know the the Start and End Positions in the Position Column. The table is order by Start Time in DESC order.
| ID | PREV End Time | Start Time | End Time | Position |
|---|---|---|---|---|
| 1 | 1/1/2022 10:00 | 1/1/2022 10:30 | 1/1/2022 11:30 | Start |
| 1 | 1/1/2022 12:30 | 1/1/2022 13:30 | 1/1/2022 14:30 | null |
| 1 | 1/1/2022 15:30 | 1/1/2022 16:30 | 1/1/2022 17:30 | End |
| 1 | 1/1/2022 18:30 | 1/1/2022 19:30 | 1/1/2022 20:30 | Start |
| 1 | 1/1/2022 21:30 | 1/1/2022 22:30 | 1/1/2022 23:30 | null |
| 1 | 1/2/2022 0:30 | 1/2/2022 1:30 | 1/2/2022 2:30 | null |
| 1 | 1/2/2022 3:30 | 1/2/2022 4:30 | 1/2/2022 5:30 | End |
I want to be able to make two new columns where I can identify Trip 1 and 2 distinctly. And I also want to number each leg for each trip. Below, is the table I desire.
| ID | PREV End Time | Start Time | End Time | Position | LEG | Trip |
|---|---|---|---|---|---|---|
| 1 | 1/1/2022 10:00 | 1/1/2022 10:30 | 1/1/2022 11:30 | Start | 1 | 1 |
| 1 | 1/1/2022 12:30 | 1/1/2022 13:30 | 1/1/2022 14:30 | null | 2 | 1 |
| 1 | 1/1/2022 15:30 | 1/1/2022 16:30 | 1/1/2022 17:30 | End | 3 | 1 |
| 1 | 1/1/2022 18:30 | 1/1/2022 19:30 | 1/1/2022 20:30 | Start | 1 | 2 |
| 1 | 1/1/2022 21:30 | 1/1/2022 22:30 | 1/1/2022 23:30 | null | 2 | 2 |
| 1 | 1/2/2022 0:30 | 1/2/2022 1:30 | 1/2/2022 2:30 | null | 3 | 2 |
| 1 | 1/2/2022 3:30 | 1/2/2022 4:30 | 1/2/2022 5:30 | End | 4 | 2 |
Sometimes the data, isn’t as perfect as the first row starting as the Start position. Sometimes it starts in the middle of a trip. How can I exclude? And sometimes the end, ends in the middle of a trip. How can I exclude?
Please let me know if you have any questions. Thank you for taking the time to read.
>Solution :
From Oracle 12, you can use MATCH_RECOGNIZE to do row-by-row processing:
SELECT ID,
PREV_End_Time,
Start_Time,
End_Time,
Position,
Leg,
Trip
FROM table_name
MATCH_RECOGNIZE(
PARTITION BY ID
ORDER BY start_time
MEASURES
MATCH_NUMBER() AS trip,
COUNT(*) AS leg
ALL ROWS PER MATCH
PATTERN (start_row intermediate_row* end_row)
DEFINE
start_row AS position = 'Start',
intermediate_row AS position IS NULL,
end_row AS position = 'End'
)
Which, for the sample data:
CREATE TABLE table_name (ID, PREV_End_Time, Start_Time, End_Time, Position) AS
SELECT 1, DATE '2022-01-01' + INTERVAL '10:00' HOUR TO MINUTE, DATE '2022-01-01' + INTERVAL '10:30' HOUR TO MINUTE, DATE '2022-01-01' + INTERVAL '11:30' HOUR TO MINUTE, 'Start' FROM DUAL UNION ALL
SELECT 1, DATE '2022-01-01' + INTERVAL '12:30' HOUR TO MINUTE, DATE '2022-01-01' + INTERVAL '13:30' HOUR TO MINUTE, DATE '2022-01-01' + INTERVAL '14:30' HOUR TO MINUTE, NULL FROM DUAL UNION ALL
SELECT 1, DATE '2022-01-01' + INTERVAL '15:30' HOUR TO MINUTE, DATE '2022-01-01' + INTERVAL '16:30' HOUR TO MINUTE, DATE '2022-01-01' + INTERVAL '17:30' HOUR TO MINUTE, 'End' FROM DUAL UNION ALL
SELECT 1, DATE '2022-01-01' + INTERVAL '18:30' HOUR TO MINUTE, DATE '2022-01-01' + INTERVAL '19:30' HOUR TO MINUTE, DATE '2022-01-01' + INTERVAL '20:30' HOUR TO MINUTE, 'Start' FROM DUAL UNION ALL
SELECT 1, DATE '2022-01-01' + INTERVAL '21:30' HOUR TO MINUTE, DATE '2022-01-01' + INTERVAL '22:30' HOUR TO MINUTE, DATE '2022-01-01' + INTERVAL '23:30' HOUR TO MINUTE, NULL FROM DUAL UNION ALL
SELECT 1, DATE '2022-01-02' + INTERVAL '00:30' HOUR TO MINUTE, DATE '2022-01-02' + INTERVAL '01:30' HOUR TO MINUTE, DATE '2022-01-01' + INTERVAL '02:30' HOUR TO MINUTE, NULL FROM DUAL UNION ALL
SELECT 1, DATE '2022-01-02' + INTERVAL '03:30' HOUR TO MINUTE, DATE '2022-01-02' + INTERVAL '04:30' HOUR TO MINUTE, DATE '2022-01-01' + INTERVAL '05:30' HOUR TO MINUTE, 'End' FROM DUAL;
Outputs:
ID PREV_END_TIME START_TIME END_TIME POSITION LEG TRIP 1 2022-01-01 10:00:00 2022-01-01 10:30:00 2022-01-01 11:30:00 Start 1 1 1 2022-01-01 12:30:00 2022-01-01 13:30:00 2022-01-01 14:30:00 null 2 1 1 2022-01-01 15:30:00 2022-01-01 16:30:00 2022-01-01 17:30:00 End 3 1 1 2022-01-01 18:30:00 2022-01-01 19:30:00 2022-01-01 20:30:00 Start 1 2 1 2022-01-01 21:30:00 2022-01-01 22:30:00 2022-01-01 23:30:00 null 2 2 1 2022-01-02 00:30:00 2022-01-02 01:30:00 2022-01-01 02:30:00 null 3 2 1 2022-01-02 03:30:00 2022-01-02 04:30:00 2022-01-01 05:30:00 End 4 2
db<>fiddle here