R: adding up scores in a variable using mutate and case_when

I want to create a score based on dummy variables, where different combinations add up to a certain ‘stringency’.

set.seed(2)
df <- data.frame(id = 1:20,
                 a = rbinom(20, 1, 0.6),
                 b = rbinom(20, 1, 0.6),
                 c = rbinom(20, 1, 0.6),
                 d = rbinom(20, 1, 0.6),
                 e = rbinom(20, 1, 0.6))

Which looks like

id a b c d e
1 1 0 0 0 1
2 0 1 1 0 0
3 1 0 1 0 1
4 1 1 1 1 1
5 0 1 0 0 1
6 0 1 0 1 0
7 1 1 0 1 0
8 0 1 1 1 1
9 1 0 1 1 0
10 1 1 0 1 1
11 1 1 1 1 0
12 1 1 1 1 1
13 0 0 0 1 1
14 1 0 0 1 1
15 1 1 1 1 1
16 0 0 0 0 1
17 0 0 0 1 1
18 1 1 0 0 1
19 1 0 0 1 1
20 1 1 0 1 1

Now I am trying to create the following variable:

df <- df %>% mutate(stringency = case_when(a == 1 ~ 3,
                                           a == 1 & b == 1 ~ 6,
                                           a == 1 & c == 1 ~ 6,
                                           a == 1 & b == 1 & c == 1 ~ 7,
                                           a == 1 & d == 1 ~ 11,
                                           a == 1 & e == 1 ~ 9,
                                           a == 1 & b == 1 & d == 1 ~ 9,
                                           a == 1 & b == 1 & e == 1 ~ 9,
                                           TRUE ~ 0))

However, this produces a result where only the first condition (a == 1 ~ 3) ever applies:

id a b c d e stringency
1 1 0 0 0 1          3
2 0 1 1 0 0          0
3 1 0 1 0 1          3
4 1 1 1 1 1          3
5 0 1 0 0 1          0
6 0 1 0 1 0          0
7 1 1 0 1 0          3
8 0 1 1 1 1          0
9 1 0 1 1 0          3
10 1 1 0 1 1         3
11 1 1 1 1 0         3
12 1 1 1 1 1         3
13 0 0 0 1 1         0
14 1 0 0 1 1         3
15 1 1 1 1 1         3
16 0 0 0 0 1         0
17 0 0 0 1 1         0
18 1 1 0 0 1         3
19 1 0 0 1 1         3
20 1 1 0 1 1         3

What I want is that it ‘builds up’: if you have just a, you get 3; if you have a and b, you get 6; etc.

Any ideas on how I can do this? Many thanks

Solution:

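You have to pay attention to the order of your conditions. For each row, case_when uses the first condition that is TRUE and ignores any later conditions that also match. Because every one of your conditions includes a == 1, the first line (a == 1 ~ 3) catches all of those rows. Therefore you want your most specific conditions at the beginning and a == 1 at the end.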
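To see the ordering rule in isolation, here is a minimal sketch on a toy vector. The vector x and the labels are made up for illustration and are not part of the question's data.

```r
library(dplyr)

# Toy input, purely illustrative
x <- c(1, 5, 10)

# Broad condition first: every value is > 0, so the second
# condition is never reached for any element
broad_first <- case_when(x > 0 ~ "positive",
                         x > 4 ~ "large",
                         TRUE ~ "other")

# Specific condition first: values > 4 are labelled before the
# broader x > 0 condition gets a chance to match them
specific_first <- case_when(x > 4 ~ "large",
                            x > 0 ~ "positive",
                            TRUE ~ "other")

broad_first
#> [1] "positive" "positive" "positive"
specific_first
#> [1] "positive" "large"    "large"
```

The same thing happens in the question: a == 1 plays the role of x > 0 and shadows everything listed after it.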

library(dplyr)

set.seed(2)
df <- data.frame(id = 1:20,
                 a = rbinom(20, 1, 0.6),
                 b = rbinom(20, 1, 0.6),
                 c = rbinom(20, 1, 0.6),
                 d = rbinom(20, 1, 0.6),
                 e = rbinom(20, 1, 0.6))


df <- df %>% mutate(stringency = case_when(a == 1 & b == 1 & c == 1 ~ 7,
                                           a == 1 & b == 1 & d == 1 ~ 9,
                                           a == 1 & b == 1 & e == 1 ~ 9,
                                           a == 1 & d == 1 ~ 11,
                                           a == 1 & e == 1 ~ 9,
                                           a == 1 & b == 1 ~ 6,
                                           a == 1 & c == 1 ~ 6,
                                           a == 1 ~ 3,
                                           TRUE ~ 0))

df
#>    id a b c d e stringency
#> 1   1 1 0 0 0 1          9
#> 2   2 0 1 1 0 0          0
#> 3   3 1 0 1 0 1          9
#> 4   4 1 1 1 1 1          7
#> 5   5 0 1 0 0 1          0
#> 6   6 0 1 0 1 0          0
#> 7   7 1 1 0 1 0          9
#> 8   8 0 1 1 1 1          0
#> 9   9 1 0 1 1 0         11
#> 10 10 1 1 0 1 1          9
#> 11 11 1 1 1 1 0          7
#> 12 12 1 1 1 1 1          7
#> 13 13 0 0 0 1 1          0
#> 14 14 1 0 0 1 1         11
#> 15 15 1 1 1 1 1          7
#> 16 16 0 0 0 0 1          0
#> 17 17 0 0 0 1 1          0
#> 18 18 1 1 0 0 1          9
#> 19 19 1 0 0 1 1         11
#> 20 20 1 1 0 1 1          9

Created on 2023-04-05 by the reprex package (v2.0.1)

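Note that row 4 gets 7 only because a == 1 & b == 1 & c == 1 is listed first. That row has all five variables set to 1, so it matches every condition, and a different order could give it any of the other values. If rows like this need one particular score, add more specific conditions.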
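One way to find the rows whose score depends on the order is to count how many of the conditions each row satisfies. The sketch below does this on a small hand-built data frame. Its rows are hypothetical, not the seeded data above, and n_matches is a name chosen only for this example.

```r
library(dplyr)

# Hypothetical rows chosen to cover a few different combinations
df_small <- data.frame(id = 1:4,
                       a = c(1, 1, 1, 0),
                       b = c(0, 1, 1, 1),
                       c = c(0, 0, 1, 1),
                       d = c(0, 1, 1, 0),
                       e = c(0, 0, 1, 1))

# Each parenthesised condition is TRUE/FALSE per row; adding them
# counts how many case_when conditions the row would satisfy
overlap <- df_small %>%
  mutate(n_matches = (a == 1 & b == 1 & c == 1) +
                     (a == 1 & b == 1 & d == 1) +
                     (a == 1 & b == 1 & e == 1) +
                     (a == 1 & d == 1) +
                     (a == 1 & e == 1) +
                     (a == 1 & b == 1) +
                     (a == 1 & c == 1) +
                     (a == 1))

overlap$n_matches
#> [1] 1 4 8 0
```

Rows with n_matches greater than 1 take their score from whichever matching condition is listed first, so those are the rows to check when deciding on the order or adding conditions.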
