Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

how do I filter data that come before row that contains certain value for each group in dataframe

How do I get only the rows that come after the ‘click’ in the column ‘action_type’ for each client_id
the toy data.

df = pd.DataFrame({
  'user_client_id': [1,1, 1, 1, 1,1, 1,1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
   'timestamp':['2021-12-18 09:15:59', '2021-12-18 10:33:49', '2021-12-18 10:34:08',
'2021-12-18 10:34:09', '2021-12-18 10:57:02','2021-12-18 10:57:33','2021-12-18 10:58:01','2021-12-18 10:58:02','2021-12-18 10:58:17',
'2021-12-18 10:58:29','2021-12-18 10:58:31','2021-12-18 10:58:34', '2021-12-18 10:58:34','2021-12-18 10:58:47', '2021-12-18 10:59:12',
'2021-12-18 10:59:28','2021-12-18 10:59:35','2021-12-18 10:59:38','2021-12-18 11:05:13', '2021-12-18 11:05:58','2021-12-18 11:06:08','2021-12-18 11:06:10','2021-12-18 11:06:12','2021-12-18 11:07:42',
 '2021-12-18 11:10:07','2021-12-18 11:10:23', '2021-12-18 11:10:53', '2021-12-18 11:10:58', '2021-12-18 11:13:04', '2021-12-18 11:13:06',
'2021-12-18 14:56:32','2021-12-18 17:16:40'],
'action_type ': ['to_cart','to_cart','to_cart','to_cart','click', 'to_cart', 'to_cart', 'increment', 'remove', 'to_cart', 'increment', 'click', 'to_cart', 'increment', 'to_cart', 'to_cart', 'remove', 'to_cart', 'increment', 'to_cart', 'to_cart', 'click', 'increment',
 'to_cart', 'to_cart', 'to_cart', 'click', 'increment', 'to_cart', 'increment', 'to_cart', 'increment'] })

For the client with id 1 everything that comes before the click at 2021-12-18 10:57:02 should be filtered
for the client with id 2 everything that comes before the click at 2021-12-18 11:06:10 should be filtered

I’ve tried this way, but it only works for the client 1, but doesn’t work for the client 2

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

df.iloc[df.loc[df['action_type']=='click'].index[0]:,:]

>Solution :

Use boolean mask:

m = df.groupby('user_client_id')['action_type'] \
      .apply(lambda x: x.eq('click').cumsum().astype(bool))

out = df[m]

Output:

>>> out
    user_client_id            timestamp action_type
4                1  2021-12-18 10:57:02       click
5                1  2021-12-18 10:57:33     to_cart
6                1  2021-12-18 10:58:01     to_cart
7                1  2021-12-18 10:58:02   increment
8                1  2021-12-18 10:58:17      remove
9                1  2021-12-18 10:58:29     to_cart
10               1  2021-12-18 10:58:31   increment
11               1  2021-12-18 10:58:34       click
12               1  2021-12-18 10:58:34     to_cart
13               1  2021-12-18 10:58:47   increment
14               1  2021-12-18 10:59:12     to_cart
21               2  2021-12-18 11:06:10       click
22               2  2021-12-18 11:06:12   increment
23               2  2021-12-18 11:07:42     to_cart
24               2  2021-12-18 11:10:07     to_cart
25               2  2021-12-18 11:10:23     to_cart
26               2  2021-12-18 11:10:53       click
27               2  2021-12-18 11:10:58   increment
28               2  2021-12-18 11:13:04     to_cart
29               2  2021-12-18 11:13:06   increment
30               2  2021-12-18 14:56:32     to_cart
31               2  2021-12-18 17:16:40   increment

Boolean mask:

>>> pd.concat([df, m], axis=1)
    user_client_id            timestamp  action_type  action_type
0                1  2021-12-18 09:15:59      to_cart        False
1                1  2021-12-18 10:33:49      to_cart        False
2                1  2021-12-18 10:34:08      to_cart        False
3                1  2021-12-18 10:34:09      to_cart        False
4                1  2021-12-18 10:57:02        click         True
5                1  2021-12-18 10:57:33      to_cart         True
6                1  2021-12-18 10:58:01      to_cart         True
7                1  2021-12-18 10:58:02    increment         True
8                1  2021-12-18 10:58:17       remove         True
9                1  2021-12-18 10:58:29      to_cart         True
10               1  2021-12-18 10:58:31    increment         True
11               1  2021-12-18 10:58:34        click         True
12               1  2021-12-18 10:58:34      to_cart         True
13               1  2021-12-18 10:58:47    increment         True
14               1  2021-12-18 10:59:12      to_cart         True
15               2  2021-12-18 10:59:28      to_cart        False
16               2  2021-12-18 10:59:35       remove        False
17               2  2021-12-18 10:59:38      to_cart        False
18               2  2021-12-18 11:05:13    increment        False
19               2  2021-12-18 11:05:58      to_cart        False
20               2  2021-12-18 11:06:08      to_cart        False
21               2  2021-12-18 11:06:10        click         True
22               2  2021-12-18 11:06:12    increment         True
23               2  2021-12-18 11:07:42      to_cart         True
24               2  2021-12-18 11:10:07      to_cart         True
25               2  2021-12-18 11:10:23      to_cart         True
26               2  2021-12-18 11:10:53        click         True
27               2  2021-12-18 11:10:58    increment         True
28               2  2021-12-18 11:13:04      to_cart         True
29               2  2021-12-18 11:13:06    increment         True
30               2  2021-12-18 14:56:32      to_cart         True
31               2  2021-12-18 17:16:40    increment         True

Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading