Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to get average values from specific value type?

I have a dataframe:

id            value
a       0:3,1:0,2:0,3:4
a       0:0,1:0,2:2,3:0
a       0:0,1:5,2:4,3:0

How to get average values of keys in column value?

For example for 0:3,1:0,2:0,3:4 it must be (0+0+0+3+3+3+3)/7 = 1.71.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

For 0:0,1:0,2:2,3:0 it must be 2+2/2=2.

For 0:0,1:5,2:4,3:0 it must be (1+1+1+1+1+2+2+2+2)/9 = 1.44.

So desired result is:

id            value
a              1.71
a              1.66
a              1.44

How to do that? Its not a dictionary values, so I dont really understand how to operate with keys and values here

>Solution :

You could use:

g = (df.join(df['value'].str.split(',')
                        .explode()
                        .str.split(':', expand=True)
             .astype(int)
            )
       .assign(value=lambda d: d[0]*d[1])
       .set_index('id', append=True)
       .groupby(level=[0,1])
      )
g['value'].sum()/g[1].sum()

output:

   id
0  a       1.7143
1  a       2.0000
2  a       1.4444

how it works:

1- splitting the values and create new columns
(df.join(df['value'].str.split(',')
                        .explode()
                        .str.split(':', expand=True)
             .astype(int)
            )
   .assign(value=lambda d: d[0]*d[1])
   .set_index('id', append=True)
)
      value  0  1
  id             
0 a       0  0  3
  a       0  1  0
  a       0  2  0
  a      12  3  4
1 a       0  0  0
  a       0  1  0
  a       4  2  2
  a       0  3  0
2 a       0  0  0
  a       5  1  5
  a       8  2  4
  a       0  3  0
2- computing sum of values and sum of weights to get the mean
>>> g['value'].sum()
   id
0  a     12
1  a      4
2  a     13

>>> g[1].sum()
   id
0  a     7
1  a     2
2  a     9 

>>> g['value'].sum()/g[1].sum()
   id
0  a       1.7143
1  a       2.0000
2  a       1.4444
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading