Filtering rows before a certain criterion by group in pandas

I want to filter rows in pandas based on a criterion whose position differs for each group. The dummy data and the desired result are shown below.

Dummy data

import pandas as pd

data = {'ID':['1111', '1111', '1111', '1111','1112','1112','1112','1112','1112'],
        'Category':[1,2,2,2,1,3,2,2,1]}
df = pd.DataFrame(data)
df
ID      Category
1111    1
1111    2
1111    2
1111    2
1112    1
1112    3
1112    2
1112    2
1112    1

Now I am trying to filter the dataframe so that, for each ID, it keeps all the rows before the first occurrence of Category 2. The desired result looks like this:

ID      Category
1111    1
1112    1
1112    3

Solution:

Try this:

df[(df['Category'] != 2).groupby(df['ID']).cumprod()] 

Output:

     ID  Category
0  1111         1
4  1112         1
5  1112         3

Details:
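Create a boolean Series that is True where Category is not equal to 2, then take the cumulative product within each ID group. True and False behave as 1 and 0 in the product, so once the first Category == 2 (a False) is reached, the result stays False for the rest of that group, and only the rows before it are kept.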
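
To make the steps above easier to follow, here is a minimal sketch that builds the mask in two stages using the dummy data from the question. The variable names `not_two`, `keep`, and `result` are only illustrative. The explicit `astype(int)` and `astype(bool)` conversions are an addition, not part of the original one-liner; they are there on the assumption that the dtype returned by a grouped `cumprod` on booleans may differ between pandas versions.

```python
import pandas as pd

data = {'ID': ['1111', '1111', '1111', '1111', '1112', '1112', '1112', '1112', '1112'],
        'Category': [1, 2, 2, 2, 1, 3, 2, 2, 1]}
df = pd.DataFrame(data)

# Step 1: 1 where Category is not 2, 0 where it is 2.
not_two = df['Category'].ne(2).astype(int)

# Step 2: running product within each ID. Once a 0 appears in a group,
# every later value in that group is also 0.
keep = not_two.groupby(df['ID']).cumprod().astype(bool)

# Step 3: keep only the rows where the running product is still 1.
result = df[keep]
print(result)
```

If an ID never contains Category 2, the running product never drops to 0, so all rows for that ID are kept.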
