R data frame: select rows that meet logical conditions over multiple columns (variables) indexed by name

OK, this example should clarify what I am looking for:

set.seed(123456789)

df <- data.frame(
  x1 = sample(c(0,1), size = 10, replace = TRUE),
  x2 = sample(c(0,1), size = 10, replace = TRUE),
  z1 = sample(c(0,1), size = 10, replace = TRUE)
  )

I want to select all rows where both x1 and x2 equal 1. That is,

df[df$x1==1 & df$x2==1,]

which returns

   x1 x2 z1
1   1  1  1
4   1  1  1
6   1  1  1
10  1  1  0

but I want to do it in a way that scales to many x variables (e.g. x1, x2, …, x40), so I would like to select the columns by their "x" prefix rather than having to write df$x1==1 & df$x2==1 & … & df$x40==1. Note that I want to keep the z1 variable in the resulting data set: the rows are selected based on the x variables, but I am not looking to keep only the x columns. Is this possible?

Solution:

A possible solution, based on dplyr:

library(dplyr)

set.seed(123456789)

df <- data.frame(
  x1 = sample(c(0,1), size = 10, replace = TRUE),
  x2 = sample(c(0,1), size = 10, replace = TRUE),
  z1 = sample(c(0,1), size = 10, replace = TRUE)
)

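# if_all() keeps rows where every selected column passes the test
# (dplyr >= 1.0.4; across() inside filter() is deprecated)
df %>% 
  filter(if_all(starts_with("x"), ~ .x == 1))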

#>   x1 x2 z1
#> 1  1  1  1
#> 2  1  1  1
#> 3  1  1  1
#> 4  1  1  0
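
If you would rather avoid the dplyr dependency, the same idea can be sketched in base R. This is only a minimal sketch: the names x_cols, keep and result are introduced here for illustration, and it assumes the x columns contain no NA values (an NA would propagate through rowSums() and produce NA rows).

```r
set.seed(123456789)

df <- data.frame(
  x1 = sample(c(0,1), size = 10, replace = TRUE),
  x2 = sample(c(0,1), size = 10, replace = TRUE),
  z1 = sample(c(0,1), size = 10, replace = TRUE)
)

# Names of all columns that start with "x" (hypothetical helper variable)
x_cols <- grep("^x", names(df), value = TRUE)

# df[x_cols] == 1 is a logical matrix; a row qualifies when every
# x column is TRUE, i.e. the row sum equals the number of x columns
keep <- rowSums(df[x_cols] == 1) == length(x_cols)

# Keep all columns (including z1) for the qualifying rows
result <- df[keep, , drop = FALSE]
result
```

Unlike filter(), this keeps the original row names, so the result should match df[df$x1==1 & df$x2==1,] from the question.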