Isolate rows containing IDs in a column based on another column value, yet keeping all the records of original ID

I’d prefer to explain it graphically, as it’s hard for me to sum it up in the title.

Given a dataframe like this one below:

id        type
1         new
2         new
2         new repeater
2         repeater
3         repeater
4         new
4         new repeater
5         new repeater
5         repeater
6         new

I would like to filter it so that it keeps only the ids that have at least one row whose type is exactly new. Once an id meets this condition, all the remaining records belonging to that id should stay in the resulting DataFrame. In other words, it should look as follows:

id        type
1         new
2         new
2         new repeater
2         repeater
4         new
4         new repeater
6         new

>Solution :

Use GroupBy.cummax on a boolean mask to flag every row from the first match onward within each id, then filter with boolean indexing:

# True where type is exactly 'new'; cummax carries that True forward within each id
mask = df['type'].eq('new').groupby(df['id']).cummax()
df = df[mask]
print(df)
   id          type
0   1           new
1   2           new
2   2  new repeater
3   2      repeater
5   4           new
6   4  new repeater
9   6           new
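
For reference, here is a minimal, self-contained sketch that rebuilds the sample data and applies the cummax filter. It also shows a possible alternative, `transform('any')`, which is not part of the answer above: it keeps every row of an id that has a `new` row, regardless of row order. The `unordered` frame (id 7) is a hypothetical case added only to illustrate where the two approaches would differ.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "id": [1, 2, 2, 2, 3, 4, 4, 5, 5, 6],
    "type": ["new", "new", "new repeater", "repeater", "repeater",
             "new", "new repeater", "new repeater", "repeater", "new"],
})

is_new = df["type"].eq("new")  # True only where type is exactly 'new'

# cummax: True from the first 'new' row onward within each id
from_first_new = is_new.groupby(df["id"]).cummax()
result_cummax = df[from_first_new]

# transform('any'): True for every row of an id with at least one 'new' row
has_new = is_new.groupby(df["id"]).transform("any")
result_any = df[has_new]

print(result_cummax)
print(result_any)

# Hypothetical case: the 'new' row is not the first row of its id
unordered = pd.DataFrame({"id": [7, 7], "type": ["repeater", "new"]})
u_new = unordered["type"].eq("new")
kept_cummax = unordered[u_new.groupby(unordered["id"]).cummax()]
kept_any = unordered[u_new.groupby(unordered["id"]).transform("any")]
print(kept_cummax)  # only the 'new' row survives
print(kept_any)     # both rows of id 7 survive
```

On the sample data both masks select the same rows, because `new` is always the first row of its id. If that ordering is not guaranteed, the `transform('any')` variant is probably closer to "keep all the records of the original ID".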