Use tidy-select to create a new summary variable with mutate

I am trying to make the creation of a summary column in my data "future proof" in case another variable is added to the dataset. I tried to use tidy-select syntax for this but could not get it to work. A minimal example:

library(dplyr)  # for %>%, group_by(), mutate(), and tribble()

variables = c("v1", "v2")
tst = tribble(
  ~ID, ~time, ~v1, ~v2,
  "1a", 1,  T,  F,
  "1b", 1,  F,  T,
  "1c", 1,  F,  F,
  "1a", 2,  T,  F,
  "1b", 2,  T,  T,
  "1c", 2,  F,  F
)

tst %>%
  group_by(ID, time) %>%
  mutate(any_v = any(v1,v2)) # works

tst %>%
  group_by(ID, time) %>%
  mutate(any_v = any(all_of(variables))) # this doesn't work!

What I want is a setup where a variable "v3" can be added to both the tribble and the variables vector while the mutate call stays as it is. Any help is appreciated!

I have already tried most of the tidy-select helpers, but I am no expert here.


Solution:

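Use pick() to pull out those columns. all_of() is a selection helper, so it only works inside a function that performs tidy selection; pick() (available in dplyr 1.1.0 and later) provides that selection context inside mutate() and returns the chosen columns as a data frame that any() can then reduce.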

tst %>%
  group_by(ID, time) %>%
  mutate(any_v = any(pick(all_of(variables)))) 
#   ID     time v1    v2    any_v
#   <chr> <dbl> <lgl> <lgl> <lgl>
# 1 1a        1 TRUE  FALSE TRUE 
# 2 1b        1 FALSE TRUE  TRUE 
# 3 1c        1 FALSE FALSE FALSE
# 4 1a        2 TRUE  FALSE TRUE 
# 5 1b        2 TRUE  TRUE  TRUE 
# 6 1c        2 FALSE FALSE FALSE

You will get the same answer if you add a v3 column with all TRUE values, because v3 is not listed in variables and is therefore ignored:

tst %>%
  mutate(v3=TRUE) %>% 
  group_by(ID, time) %>%
  mutate(any_v = any(pick(all_of(variables))))
#   ID     time v1    v2    v3    any_v
#   <chr> <dbl> <lgl> <lgl> <lgl> <lgl>
# 1 1a        1 TRUE  FALSE TRUE  TRUE 
# 2 1b        1 FALSE TRUE  TRUE  TRUE 
# 3 1c        1 FALSE FALSE TRUE  FALSE
# 4 1a        2 TRUE  FALSE TRUE  TRUE 
# 5 1b        2 TRUE  TRUE  TRUE  TRUE 
# 6 1c        2 FALSE FALSE TRUE  FALSE
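The question also asks what happens once "v3" is added to the variables vector as well. The following is a minimal sketch of that case, assuming dplyr 1.1.0 or later; the names tst3 and res are hypothetical and only used for illustration. The mutate() call is the same as above and only the data and the vector change.

```r
library(dplyr)  # pick() requires dplyr >= 1.1.0

variables <- c("v1", "v2")
tst <- tribble(
  ~ID,  ~time, ~v1,   ~v2,
  "1a", 1,     TRUE,  FALSE,
  "1b", 1,     FALSE, TRUE,
  "1c", 1,     FALSE, FALSE,
  "1a", 2,     TRUE,  FALSE,
  "1b", 2,     TRUE,  TRUE,
  "1c", 2,     FALSE, FALSE
)

# Hypothetical later change: a new column arrives and is registered in `variables`.
tst3 <- tst %>% mutate(v3 = TRUE)
variables <- c(variables, "v3")

# Same mutate() call as before; it now also considers v3.
res <- tst3 %>%
  group_by(ID, time) %>%
  mutate(any_v = any(pick(all_of(variables)))) %>%
  ungroup()

res
```

Because v3 is TRUE in every row, any_v should now be TRUE for every group.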

Or, if you always want to use every column whose name starts with "v", use starts_with():

tst %>%
  mutate(v3=TRUE) %>% 
  group_by(ID, time) %>%
  mutate(any_v = any(pick(starts_with("v"))))
#   ID     time v1    v2    v3    any_v
#   <chr> <dbl> <lgl> <lgl> <lgl> <lgl>
# 1 1a        1 TRUE  FALSE TRUE  TRUE 
# 2 1b        1 FALSE TRUE  TRUE  TRUE 
# 3 1c        1 FALSE FALSE TRUE  TRUE 
# 4 1a        2 TRUE  FALSE TRUE  TRUE 
# 5 1b        2 TRUE  TRUE  TRUE  TRUE 
# 6 1c        2 FALSE FALSE TRUE  TRUE 
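Note that any(pick(...)) reduces over all rows and all selected columns within each group. In this example every ID/time group contains exactly one row, so the result is effectively a per-row flag. If a per-row flag is what is wanted regardless of grouping, one possible alternative (not part of the answer above) is if_any(), sketched here under the assumption that dplyr 1.0.4 or later is installed; res_rowwise is a hypothetical name.

```r
library(dplyr)  # if_any() requires dplyr >= 1.0.4

variables <- c("v1", "v2")
tst <- tribble(
  ~ID,  ~time, ~v1,   ~v2,
  "1a", 1,     TRUE,  FALSE,
  "1b", 1,     FALSE, TRUE,
  "1c", 1,     FALSE, FALSE,
  "1a", 2,     TRUE,  FALSE,
  "1b", 2,     TRUE,  TRUE,
  "1c", 2,     FALSE, FALSE
)

# if_any() combines the selected logical columns with `|`, row by row.
# identity() is passed explicitly so each column is used unchanged.
res_rowwise <- tst %>%
  mutate(any_v = if_any(all_of(variables), identity))

res_rowwise
```

With this variant no group_by() is needed, and adding "v3" to variables extends the check in the same way as with pick().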