How to filter rows based on the group-by results of a column value

I have the following DataFrame:

import pandas as pd
data = [['A', '2022-07-01', 3], ['A', '2022-07-01', 4], ['A', '2022-07-01', 5], ['A', '2022-07-02', 5], ['A', '2022-07-03', 6], ['A', '2022-07-03', 2],
        ['B', '2022-07-01', 3], ['B', '2022-07-01', 4], ['B', '2022-07-02', 5], ['B', '2022-07-02', 6], ['B', '2022-07-03', 2],
        ['C', '2022-07-01', 3], ['C', '2022-07-02', 4], ['C', '2022-07-02', 5], ['C', '2022-07-03', 6], ['C', '2022-07-04', 2]]
df = pd.DataFrame(data, columns=['category', 'date', 'Value'])


I would like to get all the rows of every category that has more than one distinct date with duplicate entries.

Category A has three entries for 2022-07-01 and two entries for 2022-07-03, so its count of distinct duplicated dates (2022-07-01, 2022-07-03) is two, which is more than 1.

Category B has two entries for 2022-07-01 and two entries for 2022-07-02, so its count of distinct duplicated dates (2022-07-01, 2022-07-02) is two, which is more than 1.

Whereas category C has only one duplicated date (2022-07-02), so all rows of C should be removed from the output; only the rows of A and B are required.

Please advise. I tried groupby with transform(len) > 1, but it doesn't help.

Solution:

First run value_counts on category + date to flag the pairs that occur more than once, then count the flagged dates per category with groupby.sum:

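# Flag each (category, date) pair that occurs more than once,
# then count the flagged dates per category.
s = (
    df[['category', 'date']].value_counts().gt(1)
    .groupby(level='category').sum()
)

# Keep the categories with more than one duplicated date.
out = list(s[s.gt(1)].index)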

Output: ['A', 'B']
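
The snippet above yields the category labels, while the question asks for the rows themselves. One way to get them is to filter the original DataFrame with isin. This is a minimal sketch, assuming the same data as in the question; the names dup_pairs, n_dup_dates, keep and filtered are illustrative and do not come from the original answer.

```python
import pandas as pd

data = [['A', '2022-07-01', 3], ['A', '2022-07-01', 4], ['A', '2022-07-01', 5],
        ['A', '2022-07-02', 5], ['A', '2022-07-03', 6], ['A', '2022-07-03', 2],
        ['B', '2022-07-01', 3], ['B', '2022-07-01', 4], ['B', '2022-07-02', 5],
        ['B', '2022-07-02', 6], ['B', '2022-07-03', 2],
        ['C', '2022-07-01', 3], ['C', '2022-07-02', 4], ['C', '2022-07-02', 5],
        ['C', '2022-07-03', 6], ['C', '2022-07-04', 2]]
df = pd.DataFrame(data, columns=['category', 'date', 'Value'])

# True for every (category, date) pair that appears more than once.
dup_pairs = df[['category', 'date']].value_counts().gt(1)

# Number of distinct duplicated dates per category (True counts as 1).
n_dup_dates = dup_pairs.groupby(level='category').sum()

# Categories with more than one duplicated date.
keep = n_dup_dates[n_dup_dates.gt(1)].index

# Rows of those categories, in their original order and with their original index.
filtered = df[df['category'].isin(keep)]

print(filtered['category'].unique().tolist())  # ['A', 'B']
print(len(filtered))                           # 11
```

All six rows of A and all five rows of B are kept, and the five rows of C are dropped, because C has only one duplicated date.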
