I have 3 columns like below:
| ID | Status | Date |
|---|---|---|
| 001 | T | 2022-09-27 |
| 001 | T | 2022-09-26 |
| 001 | T | 2022-09-25 |
| 001 | T | 2022-09-24 |
| 001 | T | 2022-09-23 |
| 001 | T | 2022-09-22 |
| 001 | T | 2022-09-21 |
| 001 | R | 2022-09-20 |
| 001 | R | 2022-09-19 |
| 001 | R | 2022-09-18 |
| 001 | R | 2022-09-17 |
| 001 | R | 2022-09-16 |
| 001 | T | 2022-09-15 |
| 001 | T | 2022-09-14 |
| 001 | T | 2022-09-13 |
| 001 | T | 2022-09-12 |
| 001 | T | 2022-09-11 |
| 001 | T | 2022-09-10 |
| 001 | T | 2022-09-09 |
| 001 | T | 2022-09-08 |
| 001 | T | 2022-09-07 |
| 001 | T | 2022-09-06 |
How can I generate an output like this in Snowflake?
| ID | Status | Start_date | End_date |
|---|---|---|---|
| 001 | T | 2022-09-21 | 2022-09-27 |
| 001 | T | 2022-09-06 | 2022-09-15 |
>Solution :
It is "gaps and islands" class problem. The easiest way is to use MATCH_RECOGNIZE clause:
SELECT *
FROM test
MATCH_RECOGNIZE (
PARTITION BY ID
ORDER BY Date
MEASURES
CLASSIFIER() AS Status
,MIN(Date) AS Start_date
,MAX(Date) AS End_date
PATTERN (T+)
DEFINE T AS Status = 'T'
) AS mr;
Output:
