How can I flag rows in a temporal database that are between specific datetimes? (specifically flagging asleep time)

I have a dataset where each row is a 15-s epoch. I have sleep start and sleep end times for each day and merged that into my epoch-level dataset. so each row has three variables that look like this:

    datetime            sleep_start_time    sleep_end_time
0   2022-03-17 14:45:15 2022-03-16 23:15:00 2022-03-17 07:17:00
1   2022-03-17 14:45:30 2022-03-16 23:15:00 2022-03-17 07:17:00
2   2022-03-17 14:45:45 2022-03-16 23:15:00 2022-03-17 07:17:00
3   2022-03-17 14:46:00 2022-03-16 23:15:00 2022-03-17 07:17:00
4   2022-03-17 14:46:15 2022-03-16 23:15:00 2022-03-17 07:17:00

(these rows would all evaluate to 0 in my desired output)

the three variables are all currently datetime64[ns] and I think I’m having issues because of the space that is in that format? I’m really pretty stumped. all the other solutions i find have to do with timedeltas between dates only, or counting minutes between times, neither of which seems helpful.

also, i know that the midnight crossover can be a tricky thing–i believe my data source has already accounted for this and the appropriate sleep start/end times are assigned to each day, so let’s assume that’s not going to be an issue.

My goal is to simply evaluate a boolean true/false column in the dataframe for whether the row’s datetime is between that row’s sleep start and sleep end time or not.
the code i have been trying to use looks like this:

if (merged[merged['sleep_start_time']] <= merged[merged['datetime']]) & (merged[merged['datetime']] <= merged[merged['sleep_end_time']]):
    merged['sleep'] = 1

which generates a KeyError:

KeyError: "None of [DatetimeIndex([‘2022-03-16 23:15:00’, ‘2022-03-16 23:15:00’,\n ‘2022-03-16 23:15:00’, ‘2022-03-16 23:15:00’,\n ‘2022-03-16 23:15:00’, ‘2022-03-16 23:15:00’,\n ‘2022-03-16 23:15:00’, ‘2022-03-16 23:15:00’,\n ‘2022-03-16 23:15:00’, ‘2022-03-16 23:15:00’,\n …\n ‘2023-02-12 22:48:00’, ‘2023-02-12 22:48:00’,\n ‘2023-02-12 22:48:00’, ‘2023-02-12 22:48:00’,\n ‘2023-02-12 22:48:00’, ‘2023-02-12 22:48:00’,\n ‘2023-02-12 22:48:00’, ‘2023-02-12 22:48:00’,\n ‘2023-02-12 22:48:00’, ‘2023-02-12 22:48:00′],\n dtype=’datetime64[ns]’, length=1359360, freq=None)] are in the [columns]"

>Solution :

You could do that like so:

merged['sleep'] = (merged['sleep_start_time'] <= merged['datetime']) & (merged['datetime'] <= merged['sleep_end_time'])

Leave a Reply