How to calculate time elapsed since an event occurred in a specific column – Pandas DataFrames

I am analyzing readings from my continuous glucose monitor as a diabetic. I have a spreadsheet where I’ve logged the date/time, the type of entry, and my blood glucose level. Each row represents a new entry, and entries can be of various types, for example:

  • DOSE_INSULIN (number of units of insulin injected),
  • NEW_SENSOR (recording that I swapped out CGM sensors), or
  • TEXT (any text-based information I wanted to note down for myself).

What I am trying to do is create a new column that tracks the time (in hours) elapsed since the most recent sensor swap, so that I can eventually determine whether there is a relationship between sensor age and BGL control/variability.

Here is an example table beforehand:

date              type
11/21/21 12:55AM  TEXT
11/21/21 1:16AM   DOSE_INSULIN
11/21/21 2:05AM   NEW_SENSOR
11/21/21 2:12AM   DOSE_INSULIN
11/21/21 2:34AM   DOSE_INSULIN
11/21/21 2:44AM   NEW_SENSOR

And here is what I would like it to look like afterwards:

date              type          hours_since_new_sensor
11/21/21 12:55AM  TEXT          NaN
11/21/21 1:16AM   DOSE_INSULIN  NaN
11/21/21 2:05AM   NEW_SENSOR    0
11/21/21 2:12AM   DOSE_INSULIN  0.12
11/21/21 2:34AM   DOSE_INSULIN  0.48
11/21/21 2:44AM   NEW_SENSOR    0

I’ve found a few other Stack Overflow pages with similar questions, but I am having trouble adapting them to my specific issue.

  • This one creates the new columns based on two separate groups.
  • This one uses grouper, but also groups by an ID, which is confusing me.

So far, I have only been able to get to this line:

df['date'].where(df['type'] == 'NEW_SENSOR')

As I understand it, this outputs a Series(?) of null values, except where the type is NEW_SENSOR, in which case it outputs the date of that event. I can’t figure out how to expand on this to get what I really want, though.

Any help or advice would be greatly appreciated, thank you so much!

Solution:

Try groupby, using a running count of NEW_SENSOR rows as the group key:

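import pandas as pd
df["date"] = pd.to_datetime(df["date"])
# running count of NEW_SENSOR rows; it increases by one at every sensor swap
sensor_id = df["type"].eq("NEW_SENSOR").cumsum()
# subtract each group's earliest timestamp (the swap itself, if rows are in chronological order)
df["hours_since_new_sensor"] = df["date"] - df.groupby(sensor_id)["date"].transform("min")
# reset the values before the first NEW_SENSOR to null
df["hours_since_new_sensor"] = df["hours_since_new_sensor"].where(sensor_id > 0)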

>>> df
                 date          type hours_since_new_sensor
0 2021-11-21 00:55:00          TEXT                    NaT
1 2021-11-21 01:16:00  DOSE_INSULIN                    NaT
2 2021-11-21 02:05:00    NEW_SENSOR        0 days 00:00:00
3 2021-11-21 02:12:00  DOSE_INSULIN        0 days 00:07:00
4 2021-11-21 02:34:00  DOSE_INSULIN        0 days 00:29:00
5 2021-11-21 02:44:00    NEW_SENSOR        0 days 00:00:00
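
To see why this works, it may help to look at the intermediate values. The sketch below rebuilds the sample table from the question (the explicit `format` string and the names `sensor_id` and `group_start` are assumptions made for illustration) and shows that `df["type"].eq("NEW_SENSOR").cumsum()` labels every row with the number of sensor swaps seen so far, so each swap starts a new group.

```python
import pandas as pd

# Sample data copied from the question; the format string is an assumption
# about how the spreadsheet dates are written.
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["11/21/21 12:55AM", "11/21/21 1:16AM", "11/21/21 2:05AM",
         "11/21/21 2:12AM", "11/21/21 2:34AM", "11/21/21 2:44AM"],
        format="%m/%d/%y %I:%M%p",
    ),
    "type": ["TEXT", "DOSE_INSULIN", "NEW_SENSOR",
             "DOSE_INSULIN", "DOSE_INSULIN", "NEW_SENSOR"],
})

# True on swap rows; the cumulative sum turns that into a group label
# that increases by one at every NEW_SENSOR row.
sensor_id = df["type"].eq("NEW_SENSOR").cumsum()

# Earliest timestamp within each group, broadcast back to every row.
group_start = df.groupby(sensor_id)["date"].transform("min")

print(pd.DataFrame({"type": df["type"],
                    "sensor_id": sensor_id,
                    "group_start": group_start}))
```

Rows with `sensor_id` equal to 0 come before any swap, so their group minimum is just the first log entry rather than a sensor change. That is why the answer masks them out with `where(... > 0)`.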

To convert the timedelta to hours, you can do:

df["hours_since_new_sensor"] = df["hours_since_new_sensor"].dt.total_seconds().div(3600)

>>> df
                 date          type  hours_since_new_sensor
0 2021-11-21 00:55:00          TEXT                     NaN
1 2021-11-21 01:16:00  DOSE_INSULIN                     NaN
2 2021-11-21 02:05:00    NEW_SENSOR                0.000000
3 2021-11-21 02:12:00  DOSE_INSULIN                0.116667
4 2021-11-21 02:34:00  DOSE_INSULIN                0.483333
5 2021-11-21 02:44:00    NEW_SENSOR                0.000000
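
As a possible alternative, the `where` line from the question can be extended directly: forward-filling the masked dates gives, for every row, the timestamp of the most recent sensor swap. This is only a sketch under the assumption that the rows are already in chronological order; the name `last_swap` and the explicit `format` string are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question; the format string is an assumption
# about how the spreadsheet dates are written.
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["11/21/21 12:55AM", "11/21/21 1:16AM", "11/21/21 2:05AM",
         "11/21/21 2:12AM", "11/21/21 2:34AM", "11/21/21 2:44AM"],
        format="%m/%d/%y %I:%M%p",
    ),
    "type": ["TEXT", "DOSE_INSULIN", "NEW_SENSOR",
             "DOSE_INSULIN", "DOSE_INSULIN", "NEW_SENSOR"],
})

# Keep the date only on swap rows (NaT elsewhere), then carry each swap
# date forward until the next swap. Rows before the first swap stay NaT.
last_swap = df["date"].where(df["type"] == "NEW_SENSOR").ffill()

# Timedelta since the latest swap, converted to fractional hours.
df["hours_since_new_sensor"] = (df["date"] - last_swap).dt.total_seconds() / 3600

print(df)
```

Because the rows before the first NEW_SENSOR never receive a swap date, they come out as NaN without a separate masking step.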