Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

calculate weights from a column in a dataframe with conditions R

I need to calculate weights from a column in a dataframe based on some conditions.

I have Total Assets from several banks, with different countries, years and specializations.

For each bank I want to calculate a weight (w) where w(i) = Tot_Asset (bank) / sum (Tot_Ass of all banks within same year, country and Specialization)

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Example dataframe:

banks <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15)
Country <- c("NL", "ES", "IT", "IT", "ES", "NL", "FR", "NL", "ES", "NL", "IT", "IT", "NL", "ES", "NL")
year <- c(2020, 2019, 2018, 2019, 2020, 2020, 2018, 2019, 2019, 2019, 2018, 2019, 2020, 2018, 2020)
Specialization <- c("cooperative", "saving", "cooperative", "cooperative", "saving", "cooperative", "saving", "cooperative", "cooperative", "saving", "cooperative", "saving", "cooperative", "cooperative", "cooperative")
Tot_Assets <- c(100, 200, 145, 300, 200, 345, 543, 190, 150, 120, 310, 210, 110, 210, 220)
data <- data.frame(banks, Country, year, Specialization, Tot_Assets)

As an example of what I would like to obtain:

The bank 1 is in NL, is a cooperative and the Total Asset is from 2020; bank 6, 13 and 15 has the same caracteristics. So the program has to do :

  • w(1) = 100 / (100 + 345 + 110 + 220) = 0.13
  • w(6) = 345 / (100 + 345 + 110 + 220) = 0.45
  • and so on

I hope that I explained myself and that you can help me, thanks in advance!

>Solution :

We may group by ‘Country’, ‘year’, ‘Specialization’, and create the ‘weights’ column by dividing the ‘Tot_Assets’ with the sum of ‘Tot_Assets’

library(dplyr)
data <- data %>% 
  group_by(Country, year, Specialization) %>%
  mutate(weights = Tot_Assets/sum(Tot_Assets)) %>%
  ungroup

-output

data
# A tibble: 15 × 6
   banks Country  year Specialization Tot_Assets weights
   <dbl> <chr>   <dbl> <chr>               <dbl>   <dbl>
 1     1 NL       2020 cooperative           100   0.129
 2     2 ES       2019 saving                200   1    
 3     3 IT       2018 cooperative           145   0.319
 4     4 IT       2019 cooperative           300   1    
 5     5 ES       2020 saving                200   1    
 6     6 NL       2020 cooperative           345   0.445
 7     7 FR       2018 saving                543   1    
 8     8 NL       2019 cooperative           190   1    
 9     9 ES       2019 cooperative           150   1    
10    10 NL       2019 saving                120   1    
11    11 IT       2018 cooperative           310   0.681
12    12 IT       2019 saving                210   1    
13    13 NL       2020 cooperative           110   0.142
14    14 ES       2018 cooperative           210   1    
15    15 NL       2020 cooperative           220   0.284
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading