I am trying to use the values in a list as column names to filter against in a for loop in R. However, R seems to treat the values as strings, not as column names. For example, say I have a data frame with 3 columns: PERSON_NAME, SCORE_1 and SCORE_2, and I store "SCORE_1" and "SCORE_2" as values in a list. I want to write a for loop that creates two data frames: one with the students who scored >90 on the first exam and another with those who scored >90 on the second exam.
The following does it without a for loop.
library(dplyr)

# Create a dataframe of test scores
df <- data.frame(
  PERSON_NAME = c('JIM', 'SALLY', 'JOHN', 'SUE'),
  SCORE_1 = c(100, 95, 80, 85),
  SCORE_2 = c(95, 75, 90, 97)
)

# Find students with SCORE_1 greater than 90
df %>%
  filter(SCORE_1 > 90) %>%
  select(PERSON_NAME, SCORE_1)

# Find students with SCORE_2 greater than 90
df %>%
  filter(SCORE_2 > 90) %>%
  select(PERSON_NAME, SCORE_2)
When I try a for loop, it doesn’t work. I get two data frames with the right columns, but the rows are not filtered. How do I adjust the code below to make the filter step work? Thank you!
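# Column names to loop over (described above but not shown in the post;
# assumed here to be a character vector)
list_of_tests <- c('SCORE_1', 'SCORE_2')

for (i in 1:length(list_of_tests)) {
  x <- df %>%
    filter(
      list_of_tests[i] > 90
    ) %>%
    select(
      PERSON_NAME,
      list_of_tests[i]
    )
  assign(list_of_tests[i], x)
}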
>Solution :
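Using tidyverse, loop over the column names with purrr::map() and pick each column with all_of(). Here threshold_cols is assumed to be a named character vector of the score columns (it is not defined in the question):
library(purrr)
library(dplyr)
# Assumed definition; the names become the names of the output list
threshold_cols <- c(SCORE_1 = "SCORE_1", SCORE_2 = "SCORE_2")
map(threshold_cols, ~ df %>%
  filter(if_any(all_of(.x), ~ .x > 90)))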
-output
$SCORE_1
  PERSON_NAME SCORE_1 SCORE_2
1         JIM     100      95
2       SALLY      95      75

$SCORE_2
  PERSON_NAME SCORE_1 SCORE_2
1         JIM     100      95
2         SUE      85      97
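If you would rather keep the for loop from the question, the following is a minimal sketch of one way to do it. It assumes list_of_tests is a character vector of column names (the question does not show its definition), and it collects the results in a list called results (a name introduced here for illustration) instead of calling assign(). The original filter compares the string itself to 90, which yields a single logical value that is recycled across all rows rather than a per-row test. The .data pronoun from dplyr looks the column up by name instead.

```r
library(dplyr)

df <- data.frame(
  PERSON_NAME = c("JIM", "SALLY", "JOHN", "SUE"),
  SCORE_1 = c(100, 95, 80, 85),
  SCORE_2 = c(95, 75, 90, 97)
)

# Assumption: the column names are stored as a character vector
list_of_tests <- c("SCORE_1", "SCORE_2")

# Hypothetical container for the per-test data frames
results <- list()

for (test in list_of_tests) {
  results[[test]] <- df %>%
    # .data[[test]] refers to the column whose name is stored in `test`
    filter(.data[[test]] > 90) %>%
    # all_of() selects a column from a character vector of names
    select(PERSON_NAME, all_of(test))
}

results$SCORE_1
results$SCORE_2
```

Each element of results then holds PERSON_NAME plus the one score column it was filtered on, so results$SCORE_1 plays the role of the SCORE_1 data frame that assign() would have created.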