I’d prefer to explain it graphically, as it’s hard for me to sum it up in the title.
Given a dataframe like this one below:
id type
1 new
2 new
2 new repeater
2 repeater
3 repeater
4 new
4 new repeater
5 new repeater
5 repeater
6 new
I would like to filter it so that it returns only the ids that have at least one row where type is exactly new. Once an id meets this condition, its remaining records should also stay in the resulting DataFrame. In other words, the result should look as follows:
id type
1 new
2 new
2 new repeater
2 repeater
4 new
4 new repeater
6 new
>Solution :
Use GroupBy.cummax on a boolean mask to flag every row from the first match onward within each id group, then filter with boolean indexing:
df = df[df['type'].eq('new').groupby(df['id']).cummax()]
print(df)
id type
0 1 new
1 2 new
2 2 new repeater
3 2 repeater
5 4 new
6 4 new repeater
9 6 new
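
To make the one-liner easier to follow, here is a minimal, self-contained sketch that rebuilds the sample frame from the question and splits the mask into steps. The intermediate names (is_new, from_first_new, has_new) are only illustrative, and the astype(bool) cast is a defensive assumption rather than part of the original answer. The transform("any") variant at the end is a different technique, shown only for comparison.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 2, 2, 2, 3, 4, 4, 5, 5, 6],
    "type": ["new", "new", "new repeater", "repeater", "repeater",
             "new", "new repeater", "new repeater", "repeater", "new"],
})

# Step 1: True only where type is exactly 'new'
# ('new repeater' does not count as a match).
is_new = df["type"].eq("new")

# Step 2: running maximum per id, so every row from the first True
# onward within a group becomes True. The cast to bool is defensive,
# in case the dtype is not preserved.
from_first_new = is_new.groupby(df["id"]).cummax().astype(bool)

# Step 3: boolean indexing keeps the flagged rows and their original index.
result = df[from_first_new]
print(result)

# Variant (not from the answer): keep ALL rows of any id that has a
# 'new' row anywhere, even rows that come before it.
has_new = is_new.groupby(df["id"]).transform("any")
result_any = df[has_new]
```

With the sample data both masks select the same rows, because the new row comes first for every id that has one. They would differ only if an id had other rows listed before its new row: cummax would drop those earlier rows, while transform("any") would keep them.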