groupby year calculation on two different columns

I have the following dataframe:

import pandas as pd

date = ['2015-02-03 23:00:00', '2015-02-03 23:30:00', '2016-02-04 00:00:00', '2016-02-04 00:30:00']
v1 = [33.24, 31.71, 34.39, 34.49]
v2 = [35.24, 33.71, 36.39, 36.49]

df = pd.DataFrame({'V1': v1, 'V2': v2}, index=pd.to_datetime(date))

print(df.head())
    
                        V1     V2
2015-02-03 23:00:00  33.24  35.24
2015-02-03 23:30:00  31.71  33.71
2016-02-04 00:00:00  34.39  36.39
2016-02-04 00:30:00  34.49  36.49

I would like to divide every row in column V1 by the sum of column V2 over that row's year. How can I do that efficiently?

So far I have tried something like this, but I can't get it to work:

df["result"] = df.groupby(df.index.year).apply(lambda x: x["V1"]/x['V2'].sum())

Solution:

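Use DatetimeIndex.year to get each row's year, and groupby.transform('sum') to broadcast each year's V2 total back onto every row, so the division aligns with the original index: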

df['result'] = df['V1'].div(df.groupby(df.index.year)['V2'].transform('sum'))

Output:

                        V1     V2    result
2015-02-03 23:00:00  33.24  35.24  0.482088
2015-02-03 23:30:00  31.71  33.71  0.459898
2016-02-04 00:00:00  34.39  36.39  0.471872
2016-02-04 00:30:00  34.49  36.49  0.473244
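
To see why transform fits here, it can help to look at the intermediate values. The following is a minimal sketch that rebuilds the frame from the question; the names yearly_sum, per_row_sum and result are illustrative and not part of the original answer.

```python
import pandas as pd

date = ['2015-02-03 23:00:00', '2015-02-03 23:30:00',
        '2016-02-04 00:00:00', '2016-02-04 00:30:00']
df = pd.DataFrame({'V1': [33.24, 31.71, 34.39, 34.49],
                   'V2': [35.24, 33.71, 36.39, 36.49]},
                  index=pd.to_datetime(date))

# Plain aggregation: one V2 total per year, indexed by year (2015, 2016).
yearly_sum = df.groupby(df.index.year)['V2'].sum()

# transform('sum'): the same totals, repeated on every row of the
# corresponding year and indexed like df itself.
per_row_sum = df.groupby(df.index.year)['V2'].transform('sum')

# Both operands share df's index, so the division is row by row.
result = df['V1'].div(per_row_sum)

print(yearly_sum)
print(per_row_sum)
print(result)
```

The aggregated yearly_sum has one entry per year, so it cannot be divided into V1 directly; per_row_sum has one entry per row, which is what makes the one-line solution work.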