Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Get data that occur in all specified years of column in Pandas

I need only those company name with year and estimate grp size those are present in all three year eg 2019, 2020, 2021

year    Company/Account Name    EstimatedGroupSize
0   2019    Unknown             19550
1   2019    Mayo Clinic         7754
2   2019    Deloitte            6432
3   2019    Rizona State        5582
4   2019    Intel Corporation   4595
5   2020    Deloitte            4063
6   2020    Unknown             3490
7   2021    Unknown             3484
8   2020    Intel Corporation   3460
9   2021    Intel Corporation   3433
10  2021    Deloitte            3250

So my output should be

year    Company/Account Name    EstimatedGroupSize
0   2019    Unknown             19550
2   2019    Deloitte            6432
4   2019    Intel Corporation   4595
5   2020    Deloitte            4063
6   2020    Unknown             3490
7   2021    Unknown             3484
8   2020    Intel Corporation   3460
9   2021    Intel Corporation   3433
10  2021    Deloitte            3250

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

Here is solution for filter year with Company/Account Name if present at least one row and filter original DataFrame by inner merge:

#if need filter ony some years first
df = df[df['year'].isin([2019, 2020, 2021])]

df1 = pd.crosstab(df['year'], df['Company/Account Name'])

df = df.merge(df1.loc[:, df1.gt(0).all()].stack().index.to_frame(index=False))
print (df)
   year Company/Account Name  EstimatedGroupSize
0  2019              Unknown               19550
1  2019             Deloitte                6432
2  2019    Intel Corporation                4595
3  2020             Deloitte                4063
4  2020              Unknown                3490
5  2021              Unknown                3484
6  2020    Intel Corporation                3460
7  2021    Intel Corporation                3433
8  2021             Deloitte                3250
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading