Pandas: identify is there any specific row within time range

December 27, 2023

I have dataframe like this

data = {'event': [1, 1, 3, 2, 3, 1, 2, 3, 4, 5], 'team': ['A', 'B', 'B', 'A', 'B', 'C', 'D', 'C', 'D', 'D'], 'timeStamp': ['2023-07-23 14:57:13.357', '2023-07-23 14:57:14.357', '2023-07-23 14:57:15.357', '2023-07-23 14:57:16.357', '2023-07-23 14:57:20.357', '2023-07-23 14:57:13.357', '2023-07-23 14:57:18.357', '2023-07-23 14:57:23.357', '2023-07-23 14:57:23.357', '2023-07-23 14:57:25.357']}

I need to identify is there row with event "2" in 7 seconds after each row for each team.

Output should be like this

event	team	timeStamp	is_2_in_7_sec
1	A	2023-07-23 14:57:13.357	TRUE
1	A	2023-07-23 14:57:14.357	TRUE
3	B	2023-07-23 14:57:15.357	FALSE
2	A	2023-07-23 14:57:16.357	FALSE
3	B	2023-07-23 14:57:20.357	FALSE
1	C	2023-07-23 14:57:13.357	TRUE
2	C	2023-07-23 14:57:18.357	FALSE
3	C	2023-07-23 14:57:23.357	FALSE
4	D	2023-07-23 14:57:23.357	FALSE
5	D	2023-07-23 14:57:25.357	FALSE

In R it’s quite easy, but in Python I’m stuck at this task.

>Solution :

You can use a groupby.rolling max after setting up a boolean column, then merge to the original DataFrame:

out = df.merge(df.assign(is_2_in_7_sec=df['event'].eq(2))[::-1]
                 .groupby(df['team'])
                 .rolling('7s', on='timeStamp')
                 ['is_2_in_7_sec'].max().astype(bool)
                 .reset_index()
               )

Output:

   event team               timeStamp  is_2_in_7_sec
0      1    A 2023-07-23 14:57:13.357           True
1      1    A 2023-07-23 14:57:14.357           True
2      3    B 2023-07-23 14:57:15.357          False
3      2    A 2023-07-23 14:57:16.357           True
4      3    B 2023-07-23 14:57:20.357          False
5      1    C 2023-07-23 14:57:13.357          False
6      2    D 2023-07-23 14:57:18.357           True
7      3    C 2023-07-23 14:57:23.357          False
8      4    D 2023-07-23 14:57:23.357          False
9      5    D 2023-07-23 14:57:25.357          False