How to extract ids and rows only if a column has all of the designated values

I have the following dataframe:

id  date     other variables..  
A   2019Q4      
A   2020Q4        
A   2021Q4 
B   2018Q4
B   2019Q4
B   2020Q4
B   2021Q4
C   2020Q4
C   2021Q4
D   2021Q4
E   2018Q4
E   2019Q4
E   2020Q4
E   2021Q4
.       .      

I want to group by id and keep only the ids whose dates include all of the designated values (i.e. 2019Q4, 2020Q4, 2021Q4), then extract the rows that correspond to those values. isin() alone won't work because it won't drop C and D, which have only some of the values.

Desired output:
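To make the problem concrete, here is a minimal sketch of why isin() on its own is not enough. The frame is rebuilt from the sample rows above with the other columns left out, and the name `partial` is only illustrative.

```python
import pandas as pd

# Sample frame rebuilt from the rows shown in the question (other columns omitted)
df = pd.DataFrame({
    'id': list('AAABBBBCCDEEEE'),
    'date': ['2019Q4', '2020Q4', '2021Q4',
             '2018Q4', '2019Q4', '2020Q4', '2021Q4',
             '2020Q4', '2021Q4',
             '2021Q4',
             '2018Q4', '2019Q4', '2020Q4', '2021Q4'],
})

target = ['2019Q4', '2020Q4', '2021Q4']

# isin() tests each row separately, so it cannot tell whether an id
# has every target date; C and D still have matching rows and survive
partial = df[df['date'].isin(target)]
print(sorted(partial['id'].unique()))
```

The printed ids still include C and D, so a per-id check is needed on top of the row filter.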

A   2019Q4      
A   2020Q4        
A   2021Q4 
B   2019Q4
B   2020Q4
B   2021Q4
E   2019Q4
E   2020Q4
E   2021Q4
.       .      

Solution:

You can use set operations to find the ids that contain every target date, and isin to keep only the rows with those dates:

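target = {'2019Q4', '2020Q4', '2021Q4'}

# one boolean per id: True if the id's dates include every target value
id_ok = df.groupby('id')['date'].agg(lambda x: target.issubset(x))

# keep rows whose date is a target value and whose id passed the check
df2 = df[df['date'].isin(target) & df['id'].map(id_ok)]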
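For anyone who wants to run the snippet above end to end, here is a self-contained sketch. The frame is rebuilt from the sample rows in the question with the other columns left out, so treat it as an assumption about the real data.

```python
import pandas as pd

# Sample frame rebuilt from the question (other columns omitted)
df = pd.DataFrame({
    'id': list('AAABBBBCCDEEEE'),
    'date': ['2019Q4', '2020Q4', '2021Q4',
             '2018Q4', '2019Q4', '2020Q4', '2021Q4',
             '2020Q4', '2021Q4',
             '2021Q4',
             '2018Q4', '2019Q4', '2020Q4', '2021Q4'],
})

target = {'2019Q4', '2020Q4', '2021Q4'}

# One boolean per id: True when the id's dates cover every target value
id_ok = df.groupby('id')['date'].agg(lambda x: target.issubset(x))

# Row filter (date is a target) combined with the per-id check
df2 = df[df['date'].isin(target) & df['id'].map(id_ok)]
print(df2)
```

The original row index is preserved, which is why the result skips the labels of the dropped rows.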

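Or, using transform, which broadcasts the per-id result back to every row so no map is needed:

target = {'2019Q4', '2020Q4', '2021Q4'}

# one boolean per row: True if the row's id has every target value
mask = df.groupby('id')['date'].transform(lambda x: target.issubset(x))

df2 = df[df['date'].isin(target) & mask]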
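The transform variant can be tried the same way. This is a sketch under the same assumption as before: the frame is rebuilt from the question's sample rows, without the other columns.

```python
import pandas as pd

# Sample frame rebuilt from the question (other columns omitted)
df = pd.DataFrame({
    'id': list('AAABBBBCCDEEEE'),
    'date': ['2019Q4', '2020Q4', '2021Q4',
             '2018Q4', '2019Q4', '2020Q4', '2021Q4',
             '2020Q4', '2021Q4',
             '2021Q4',
             '2018Q4', '2019Q4', '2020Q4', '2021Q4'],
})

target = {'2019Q4', '2020Q4', '2021Q4'}

# transform repeats each group's True/False on every row of that group,
# so the mask is already aligned with df's index
mask = df.groupby('id')['date'].transform(lambda x: target.issubset(x))

df2 = df[df['date'].isin(target) & mask]
print(df2)
```

Because the mask has one entry per row, it can be combined with the isin() filter directly.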

output:

   id    date  other
0   A  2019Q4    NaN
1   A  2020Q4    NaN
2   A  2021Q4    NaN
4   B  2019Q4    NaN
5   B  2020Q4    NaN
6   B  2021Q4    NaN
11  E  2019Q4    NaN
12  E  2020Q4    NaN
13  E  2021Q4    NaN

id_ok:

id
A     True
B     True
C    False
D    False
E     True
Name: date, dtype: bool