Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Calculate statistics based on values from multiple columns

I have a dataframe like as shown below

ID_1,value_1,value_2,ID_2         
1,21,0,A1
1,31,5,A2
1,0,0,A3
1,21,0,A4
1,21,21,A5
2,202,0,B1
2,310,159,B2
2,0,0,B3
2,201,310,B4
2,210,214,B5
2,178,190,B6

tdf = pd.read_clipboard(sep=',')

I would like to compute the mean, median and mode for each of the ID_1

However, I would like to exclude rows which has zeroes for both value_1 AND value_2

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

For ex: We will ignore/filter out row index 2 and 7.

Once, we ignore rows with zero values, we compute mean, median and mode based on values from value_1 and value_2 for each ID_1. Meaning, we concatenate the values in both columns value_1 and value_2 and then compute the statistics

I was trying something like belowm but it is incorrect

tdf = tdf[~(tdf['value_1']==0) & (tdf['value_2']==0)]  #this is incorrect logic
t1_count = tdf.groupby('ID')['value_1'].size().reset_index()
t1_values = tdf.groupby('ID')['value_1'].agg(list)

I expect my output to be like as shown below

    ID_1    mean    mode    median
0   1   18.8    21  21.0
1   2   183.5   310 201.5

>Solution :

If need remove rows with both 0 use:

tdf = tdf[~((tdf['value_1']==0) & (tdf['value_2']==0))]

Or:

tdf = tdf[(tdf['value_1']!=0) | (tdf['value_2']!=0)]

Or:

tdf = tdf[tdf[['value_1','value_1']].ne(0).all(axis=1)]

And then:

df = (tdf.melt(id_vars='ID_1', 
              value_vars=['value_1','value_2'])
          .groupby('ID_1')['value'].agg(['mean','median',lambda x: x.mode().iat[0]])
          .rename(columns={'<lambda_0>':'mode'})
          .reset_index())
print (df)
   ID_1   mean  median  mode
0     1   15.0    21.0    21
1     2  197.4   201.5   310
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading