Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to create a Weighted Sum Score based on a second dataset for specific variables

I have to create a weighted sum score (WSum) based on several variables. For instance, mydata has three variables (A, B, and C). I have to create a ‘WSum’ score in which I multiple each of these variables by the weights present in the weights dataset, and sum all of them (A*0.5 + B*1.2 + C*2). I have to do all of that without altering other variables in the dataset, for instance, in this case, the variable ID.

PS: I can definitely do it manually mutate(Wsum = (A*0.5 + B*1.2 + C*2)). But I would like to automatize this process, because, as you might guess, I don’t have only 3 variables, and that will not be a one-time job. Thanks in advance.

Example data

Packages required

library(dplyr)

Example Data

mydata <- tibble(ID = (01, 02
                 A = c(1, 2, 3),
                 B = c(2, 5, 4),
                 C = c(3, 6, 1))

>    # A tibble: 3 x 4
>     ID     A     B     C
>  <dbl> <dbl> <dbl> <dbl>
>1     1     1     2     3
>2     2     2     5     6
>3     3     3     4     1

Data Weight Dictionary

weights <- tibble( variables = c("A", "B", "C"),
                   w = c(0.5, 1.2, 2))

># A tibble: 3 x 2
>  variables     w
>  <chr>     <dbl>
>1 A           0.5
>2 B           1.2
>3 C           2 

Expected result

># A tibble: 3 x 5
>     ID     A     B     C  Wsum
>  <dbl> <dbl> <dbl> <dbl> <dbl>
>1     1     1     2     3   8.9
>2     2     2     5     6  19  
>3     3     3     4     1   8.3

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

In base R, use %*% or crossprod after converting to matrix

mydata$Wsum <- as.vector(as.matrix(mydata[weights$variables]) %*% 
       with(weights, setNames(w, variables)))

-output

mydata$Wsum
[1]  8.9 19.0  8.3

Or using tidyverse, loop across the columns in the ‘mydata’, get the corresponding value from deframed (converted to named vector) ‘weights’, multiply and get the rowSums to create the ‘Wsum’ column

library(dplyr)
library(tibble)
mydata <- mydata %>%
  mutate(Wsum = rowSums(across(all_of(weights$variables),
    ~ .x * deframe(weights)[[cur_column()]])))
mydata
# A tibble: 3 × 4
      A     B     C  Wsum
  <dbl> <dbl> <dbl> <dbl>
1     1     2     3   8.9
2     2     5     6  19  
3     3     4     1   8.3
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading