I have a dataset where each row is a 15-s epoch. I have sleep start and sleep end times for each day and merged that into my epoch-level dataset. so each row has three variables that look like this:
datetime sleep_start_time sleep_end_time
0 2022-03-17 14:45:15 2022-03-16 23:15:00 2022-03-17 07:17:00
1 2022-03-17 14:45:30 2022-03-16 23:15:00 2022-03-17 07:17:00
2 2022-03-17 14:45:45 2022-03-16 23:15:00 2022-03-17 07:17:00
3 2022-03-17 14:46:00 2022-03-16 23:15:00 2022-03-17 07:17:00
4 2022-03-17 14:46:15 2022-03-16 23:15:00 2022-03-17 07:17:00
(these rows would all evaluate to 0 in my desired output)
the three variables are all currently datetime64[ns] and I think I’m having issues because of the space that is in that format? I’m really pretty stumped. all the other solutions i find have to do with timedeltas between dates only, or counting minutes between times, neither of which seems helpful.
also, i know that the midnight crossover can be a tricky thing–i believe my data source has already accounted for this and the appropriate sleep start/end times are assigned to each day, so let’s assume that’s not going to be an issue.
My goal is to simply evaluate a boolean true/false column in the dataframe for whether the row’s datetime is between that row’s sleep start and sleep end time or not.
the code i have been trying to use looks like this:
if (merged[merged['sleep_start_time']] <= merged[merged['datetime']]) & (merged[merged['datetime']] <= merged[merged['sleep_end_time']]):
merged['sleep'] = 1
which generates a KeyError:
KeyError: "None of [DatetimeIndex([‘2022-03-16 23:15:00’, ‘2022-03-16 23:15:00’,\n ‘2022-03-16 23:15:00’, ‘2022-03-16 23:15:00’,\n ‘2022-03-16 23:15:00’, ‘2022-03-16 23:15:00’,\n ‘2022-03-16 23:15:00’, ‘2022-03-16 23:15:00’,\n ‘2022-03-16 23:15:00’, ‘2022-03-16 23:15:00’,\n …\n ‘2023-02-12 22:48:00’, ‘2023-02-12 22:48:00’,\n ‘2023-02-12 22:48:00’, ‘2023-02-12 22:48:00’,\n ‘2023-02-12 22:48:00’, ‘2023-02-12 22:48:00’,\n ‘2023-02-12 22:48:00’, ‘2023-02-12 22:48:00’,\n ‘2023-02-12 22:48:00’, ‘2023-02-12 22:48:00′],\n dtype=’datetime64[ns]’, length=1359360, freq=None)] are in the [columns]"
>Solution :
You could do that like so:
merged['sleep'] = (merged['sleep_start_time'] <= merged['datetime']) & (merged['datetime'] <= merged['sleep_end_time'])