Check if a value exists per group and remove groups without this value in a pandas df

I have a pandas df that looks like this:

import pandas as pd
d = {'value1': [1, 1, 1, 2, 3, 3, 4, 4, 4, 4], 'value2': ['A', 'B', 'C', 'C', 'A', 'B', 'B', 'A', 'A', 'B']}
df = pd.DataFrame(data=d)
df

For each group in column value1, I would like to check whether that group contains at least one 'C' in column value2. If a group doesn't have a 'C' value, I would like to exclude the whole group.

    value1  value2
    1       A
    1       B
    1       C
    2       C
    3       A
    3       B
    4       B
    4       A
    4       A
    4       B

The resulting df should look like this:

    value1  value2
    1       A
    1       B
    1       C
    2       C

What’s the best way to achieve this?

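Solution:

Use groupby with filter. The function passed to filter is called once per group, and only the groups for which it returns True are kept: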

# Keep only the value1 groups where at least one row has value2 == 'C'
df.groupby('value1').filter(lambda x: x['value2'].eq('C').any())
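For reference, here is a self-contained sketch that runs the filter approach on the df from the question and compares it with two possible alternatives that build a boolean mask instead of calling a Python function per group. The variable names (kept_filter, has_c, kept_mask, groups_with_c, kept_isin) are only illustrative and not part of the original answer; whether the mask versions are actually faster will depend on your data.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'value1': [1, 1, 1, 2, 3, 3, 4, 4, 4, 4],
    'value2': ['A', 'B', 'C', 'C', 'A', 'B', 'B', 'A', 'A', 'B'],
})

# Approach 1: groupby + filter (keeps groups where the function returns True)
kept_filter = df.groupby('value1').filter(lambda g: g['value2'].eq('C').any())

# Approach 2 (alternative): broadcast "group has a C" back to every row
has_c = df['value2'].eq('C').groupby(df['value1']).transform('any')
kept_mask = df[has_c]

# Approach 3 (alternative): collect the value1 keys that have a C, then use isin
groups_with_c = df.loc[df['value2'].eq('C'), 'value1']
kept_isin = df[df['value1'].isin(groups_with_c)]

print(kept_filter)
```

All three keep the rows of groups 1 and 2 and drop groups 3 and 4, which matches the desired result in the question.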