Calculate the average of the lowest n percentile

I have the following dataset and want to find the average of the runs that fall in the lowest 20% of the data. For example:
If I divide the runs column into 5 batches, the first two rows fall in the lowest 20%, so the average for those two rows is (1+2)/2 = 1.5.
How do I divide the data frame into 5 batches (after sorting) and then find the average of that specific group?

I have tried the following, but the output is 2.8 instead of 3:

d.runs.quantile(0.2)

Input:


import pandas as pd

ODI_runs = {'name': ['Tendulkar', 'Sangakkara', 'Ponting',
                     'Jayasurya', 'Jayawardene', 'Kohli',
                     'Haq', 'Kallis', 'Ganguly', 'Dravid'],
            'runs': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]}
d = pd.DataFrame(ODI_runs)

name            runs
Tendulkar       1
Sangakkara      2
Ponting         3
Jayasurya       4
Jayawardene     5
Kohli           6
Haq             7
Kallis          8
Ganguly         9
Dravid          10

Output:

1.5

Solution:

You could use the Series.quantile method: d["runs"].quantile(0.2) gives the value that separates the lowest 20% of the data. It returns 2.8 here because pandas interpolates linearly between 2 and 3 by default, so it is a cutoff value, not the group average. The rest is plain pandas: use loc to select the rows at or below that cutoff together with the runs column, then take the .mean() of those values:

>>> d.loc[d["runs"] <= d["runs"].quantile(0.2), "runs"].mean()
1.5
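
The question also asks how to split the data into 5 batches explicitly. One way to sketch that is with pandas.qcut, which assigns each row to a quantile-based bin without sorting the frame first. The helper name lowest_batch_mean and the batch column are made up for this illustration, and the sketch assumes the values are distinct enough for qcut to form unique bin edges (heavily tied data can make it raise an error).

```python
import pandas as pd

ODI_runs = {'name': ['Tendulkar', 'Sangakkara', 'Ponting',
                     'Jayasurya', 'Jayawardene', 'Kohli',
                     'Haq', 'Kallis', 'Ganguly', 'Dravid'],
            'runs': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]}
d = pd.DataFrame(ODI_runs)


def lowest_batch_mean(frame, column, n_batches=5):
    """Mean of `column` over the lowest of `n_batches` quantile batches.

    Hypothetical helper for illustration only.
    """
    # labels=False makes qcut return integer batch numbers 0..n_batches-1
    batch = pd.qcut(frame[column], q=n_batches, labels=False)
    return frame.loc[batch == 0, column].mean()


# Average of the lowest batch only
lowest = lowest_batch_mean(d, "runs")

# Average of every batch at once, in case more than the lowest one is needed
batch_means = (
    d.assign(batch=pd.qcut(d["runs"], q=5, labels=False))
     .groupby("batch")["runs"]
     .mean()
)

print(lowest)
print(batch_means)
```

For this dataset the lowest batch holds the runs 1 and 2, so the result matches the quantile-based answer. If only a fixed share of rows is needed, d["runs"].nsmallest(2).mean() is a simpler alternative, with 2 being 20% of the 10 rows.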