Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Determining how far away is the next ID

So I have some data, subset as follows:

ID  data start_time 
001    X 2021-12-29 10:54:12.429 +0000
002    Y 2022-01-16 05:07:55.708 +0000 
003    Y 2021-12-31 12:25:12.980 +0000
002    A 2022-01-03 12:49:41.866 +0000
001    A 2021-12-30 16:32:13.736 +0000
001    A 2022-01-17 10:10:10.736 +0000

I would like to determine in minutes, the time difference between a given ID and the next occurrence in the dataframe, in order of start_time. So if an ID appears at 12:00 and 12:01, I would like the ID to show the time of the next entry as well as the diff in minutes, using SQL/Snowflake. CTE preferred.

The following fields should be added:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

  • next_timestamp: The timestamp of the following entry
  • time_diff: The difference in minutes between start_time and next_timestamp.
  • entry_order: The number of how many of this ID has been.

Expected output:

ID  data start_time                       next_timestamp                 time_diff  entry_order
001    X 2021-12-29 10:54:12.429 +0000    2021-12-30 16:32:13.736 +0000  1778       1
001    A 2021-12-30 16:32:13.736 +0000    2022-01-17 10:10:10.736 +0000  25537      2
003    Y 2021-12-31 12:25:12.980 +0000    NULL                           NULL       1
002    A 2022-01-03 12:49:41.866 +0000    2022-01-16 05:07:55.708 +0000  18258      1
002    Y 2022-01-16 05:07:55.708 +0000    NULL                           NULL       2
001    A 2022-01-17 10:10:10.736 +0000    NULL                           NULL       3

Notice, the resulting output is ordered by the timestamp, ascending.

>Solution :

The LEAD function can be used to find the next start_time per ID.

And the ROW_NUMBER function can return a unique sequencial number per ID.

SELECT *
, LEAD(start_time) OVER (PARTITION BY ID ORDER BY start_time) AS next_timestamp
, DATEDIFF(minute, start_time, LEAD(start_time) OVER (PARTITION BY ID ORDER BY start_time)) AS time_diff
, ROW_NUMBER() OVER (PARTITION BY ID ORDER BY start_time) AS entry_order
FROM your_table
ORDER BY start_time
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading