I have to create a weighted sum score (WSum) based on several variables. For instance, mydata has three variables (A, B, and C). I have to create a ‘WSum’ score in which I multiple each of these variables by the weights present in the weights dataset, and sum all of them (A*0.5 + B*1.2 + C*2). I have to do all of that without altering other variables in the dataset, for instance, in this case, the variable ID.
PS: I can definitely do it manually mutate(Wsum = (A*0.5 + B*1.2 + C*2)). But I would like to automatize this process, because, as you might guess, I don’t have only 3 variables, and that will not be a one-time job. Thanks in advance.
Example data
Packages required
library(dplyr)
Example Data
mydata <- tibble(ID = (01, 02
A = c(1, 2, 3),
B = c(2, 5, 4),
C = c(3, 6, 1))
> # A tibble: 3 x 4
> ID A B C
> <dbl> <dbl> <dbl> <dbl>
>1 1 1 2 3
>2 2 2 5 6
>3 3 3 4 1
Data Weight Dictionary
weights <- tibble( variables = c("A", "B", "C"),
w = c(0.5, 1.2, 2))
># A tibble: 3 x 2
> variables w
> <chr> <dbl>
>1 A 0.5
>2 B 1.2
>3 C 2
Expected result
># A tibble: 3 x 5
> ID A B C Wsum
> <dbl> <dbl> <dbl> <dbl> <dbl>
>1 1 1 2 3 8.9
>2 2 2 5 6 19
>3 3 3 4 1 8.3
>Solution :
In base R, use %*% or crossprod after converting to matrix
mydata$Wsum <- as.vector(as.matrix(mydata[weights$variables]) %*%
with(weights, setNames(w, variables)))
-output
mydata$Wsum
[1] 8.9 19.0 8.3
Or using tidyverse, loop across the columns in the ‘mydata’, get the corresponding value from deframed (converted to named vector) ‘weights’, multiply and get the rowSums to create the ‘Wsum’ column
library(dplyr)
library(tibble)
mydata <- mydata %>%
mutate(Wsum = rowSums(across(all_of(weights$variables),
~ .x * deframe(weights)[[cur_column()]])))
mydata
# A tibble: 3 × 4
A B C Wsum
<dbl> <dbl> <dbl> <dbl>
1 1 2 3 8.9
2 2 5 6 19
3 3 4 1 8.3