Using Pandas to drop the extra rows after satisfying a condition

My sample df looks like this

id   year    success
1    2000      N
1    2001      N
1    2002      Y
1    2003      N
1    2004      N
2    2000      N
2    2001      N
2    2002      N
3    2000      N
3    2001      N
3    2002      Y
....

Here, we can see that id==1 and id==3 each have both success==Y and success==N rows, while id==2 only has success==N
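For experimenting, the sample frame above can be reconstructed as a minimal sketch (the trailing "...." rows are omitted):

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'id':      [1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3],
    'year':    [2000, 2001, 2002, 2003, 2004, 2000, 2001, 2002, 2000, 2001, 2002],
    'success': ['N', 'N', 'Y', 'N', 'N', 'N', 'N', 'N', 'N', 'N', 'Y'],
})

# id 1 and id 3 contain a 'Y'; id 2 does not
has_y = df.groupby('id')['success'].apply(lambda s: s.eq('Y').any())
print(has_y)
```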

What I want to do?


  • I want to keep, for each id, only the rows up to and including the first success==Y; once the first success==Y is found, drop the remaining rows for that group, e.g. id==1

This is how the new df should look.

id   year    success
1    2000      N
1    2001      N
1    2002      Y
2    2000      N
2    2001      N
2    2002      N
3    2000      N
3    2001      N
3    2002      Y
....

Here, in the above df, we removed the extra rows after encountering success==Y. Since id==2 does not have any success==Y, no rows were removed there, and for id==3 the last row is the Y, so no rows were removed either.

What I did?

  • I tried grouping by id, but I need all the rows in the result even though the ids are duplicated, so that did not work.

Could someone please help me achieve this result?

>Solution:

First, compare the success column against the value Y, which gives True wherever success == 'Y'. To mark every row that comes after the first Y within each group, apply cummax to this condition per id group and shift it down by one so the first Y itself survives. Finally, use the negated mask to filter the data frame:

mask = (
    df['success'].eq('Y')                               # True on the 'Y' rows
      .groupby(df['id'])                                # work within each id group
      .transform(lambda g: g.cummax().shift(fill_value=False))  # True after the first 'Y'
)
df[~mask]

    id  year success
0    1  2000       N
1    1  2001       N
2    1  2002       Y
5    2  2000       N
6    2  2001       N
7    2  2002       N
8    3  2000       N
9    3  2001       N
10   3  2002       Y
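The same mask can be built and checked end to end; the sketch below assumes the sample data from the question and uses transform rather than apply, which keeps the mask aligned to the original index across pandas versions:

```python
import pandas as pd

df = pd.DataFrame({
    'id':      [1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3],
    'year':    [2000, 2001, 2002, 2003, 2004, 2000, 2001, 2002, 2000, 2001, 2002],
    'success': ['N', 'N', 'Y', 'N', 'N', 'N', 'N', 'N', 'N', 'N', 'Y'],
})

# True on rows strictly after the first 'Y' in their id group:
# cummax turns [F, F, T, F, F] into [F, F, T, T, T], and the shift
# moves it to [F, F, F, T, T] so the first 'Y' itself is kept.
after_first_y = (
    df['success'].eq('Y')
      .groupby(df['id'])
      .transform(lambda g: g.cummax().shift(fill_value=False))
)

result = df[~after_first_y]
print(result)
```

Groups without any Y (id==2) produce an all-False mask, so none of their rows are dropped, matching the desired output above.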