Grouping values in a column by a criterion and getting their mean using Python / Pandas

I have data on movies. Every movie has an IMDB score, but some do not have a Metacritic score.

For example:

Name  IMDB Score  Meta Score
B     8           86
C     8           90
D     8           null
E     8           91
F     7           66
G     3           44

I want to fill the null values in the Meta Score column with the mean Meta Score of the movies that have the same IMDB score,
so the null value in this table should be replaced by the mean of movies B, C and E: (86 + 90 + 91) / 3 = 89.

How would I achieve this with NumPy / Pandas?

I searched online, and the closest solution I could find was averaging all the Metacritic scores and replacing the null values with that average.

>Solution :

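Use groupby + fillna. Calling transform keeps the original row index, whereas apply on recent pandas versions prepends the group key to the index:

df.groupby('IMDB Score')['Meta Score'].transform(lambda x: x.fillna(x.mean()))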

output:

0    86.0
1    90.0
2    89.0
3    91.0
4    66.0
5    44.0
Name: Meta Score, dtype: float64

Assign the result back to the Meta Score column.
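
For completeness, here is a minimal end-to-end sketch of that last step. The DataFrame is rebuilt from the table in the question, with NaN assumed to stand in for the "null" entry. It uses transform("mean") followed by fillna, which is a slight variation on the lambda above rather than the exact same call, and the variable name group_mean is only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; NaN is assumed to represent "null".
df = pd.DataFrame({
    "Name": ["B", "C", "D", "E", "F", "G"],
    "IMDB Score": [8, 8, 8, 8, 7, 3],
    "Meta Score": [86, 90, np.nan, 91, 66, 44],
})

# Mean Meta Score of each IMDB Score group, broadcast back to every row.
# NaN values are skipped when the mean is computed.
group_mean = df.groupby("IMDB Score")["Meta Score"].transform("mean")

# Fill only the missing entries; existing scores are left untouched.
df["Meta Score"] = df["Meta Score"].fillna(group_mean)

print(df)
```

Note that if every movie in a group is missing its Meta Score, the group mean is itself NaN, so those rows stay unfilled and would need a fallback such as the overall mean.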
