dplyr: If a column is present then evaluate an expression. If not return a `FALSE`

I am getting myself confused with dplyr and if_any. I am trying to perform something along these lines:

If a column is present then evaluate an expression. If not return a FALSE.

So these three scenarios capture what I am thinking:

library(dplyr)

dat <- data.frame(x = 1)

## GOOD: if foo_col is NA then return FALSE
dat %>%
  mutate(foo_col = NA_character_) %>%
  mutate(present = if_any(matches("foo_col"), ~ !is.na(.x)))
#>   x foo_col present
#> 1 1    <NA>   FALSE

## GOOD: if foo_col is not NA then return TRUE
dat %>%
  mutate(foo_col = "value") %>%
  mutate(present = if_any(matches("foo_col"), ~ !is.na(.x)))
#>   x foo_col present
#> 1 1   value    TRUE

## NOT GOOD: if foo_col is absent, return TRUE? Want this to be FALSE.
dat %>%
  mutate(present = if_any(matches("foo_col"), ~ !is.na(.x)))
#>   x present
#> 1 1    TRUE

Can anyone suggest a way to check the is.na condition and also check whether the column is actually there?

Solution:

If the last case needs to return FALSE while the other two cases keep their TRUE/FALSE results, add a check that the selection matched at least one column:

library(dplyr)
dat %>%
  mutate(present = ncol(pick(matches("foo_col"))) > 0 &
                   if_any(matches("foo_col"), ~ !is.na(.x)))

Output:

  x present
1 1   FALSE
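
To see what the column-count guard contributes, the selection can be inspected on its own. The following is a minimal sketch that uses select() instead of pick() so it runs outside mutate(); the variable names (n_absent, n_present, n_regex, n_exact) are illustrative and not from the original answer. It also shows that matches() treats its argument as a regular expression, so a column such as foo_col_2 would count as present, whereas any_of() selects by exact name.

```r
library(dplyr)

dat <- data.frame(x = 1)

# Column absent: the selection is empty, so the guard evaluates to FALSE.
n_absent <- ncol(select(dat, matches("foo_col")))

# Column present (even if all NA): one column is selected, so the guard passes
# and the result is decided by the is.na() test alone.
n_present <- ncol(select(mutate(dat, foo_col = NA_character_), matches("foo_col")))

# matches() is a regex match, so similarly named columns are picked up too.
other <- data.frame(x = 1, foo_col_2 = "value")
n_regex <- ncol(select(other, matches("foo_col")))

# any_of() selects by exact name and silently ignores names that are absent.
n_exact <- ncol(select(other, any_of("foo_col")))
```

If only an exact column name should count, swapping matches("foo_col") for any_of("foo_col") in the guard is one option to consider.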

Or, as @boshek mentioned in the comments, rlang::is_empty should work as well:

dat %>%
  mutate(present = !rlang::is_empty(across(matches("foo_col"))) &
                   if_any(matches("foo_col"), ~ !is.na(.x)))

Output:

  x present
1 1   FALSE

For the other two cases, the results are unchanged:

> dat %>%
+   mutate(foo_col = NA_character_) %>%
+   mutate(present = ncol(pick(matches("foo_col"))) > 0 & if_any(matches("foo_col"), ~ !is.na(.x)))
  x foo_col present
1 1    <NA>   FALSE
> dat %>%
+   mutate(foo_col = "value") %>%
+   mutate(present = ncol(pick(matches("foo_col"))) > 0 & if_any(matches("foo_col"), ~ !is.na(.x)))
  x foo_col present
1 1   value    TRUE

NOTE: This test cannot distinguish a FALSE that comes from an NA value from a FALSE that comes from the column not being found.
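
If the two FALSE cases need to be told apart, one possibility is to return NA when the column is absent. The following is a minimal sketch in base R, not part of the original answer: present_or_na is a hypothetical helper name, and it looks the column up by exact name rather than by the regular expression that matches() uses.

```r
# Hypothetical helper (not a dplyr function): three-state presence check.
#   NA    -> the column does not exist
#   FALSE -> the column exists but the value is NA
#   TRUE  -> the column exists and the value is not NA
present_or_na <- function(data, col) {
  if (!col %in% names(data)) {
    return(rep(NA, nrow(data)))
  }
  !is.na(data[[col]])
}

absent  <- present_or_na(data.frame(x = 1), "foo_col")
missing <- present_or_na(data.frame(x = 1, foo_col = NA_character_), "foo_col")
filled  <- present_or_na(data.frame(x = 1, foo_col = "value"), "foo_col")
```

Here absent is NA, missing is FALSE, and filled is TRUE, so a later step can treat "column not found" separately from "value is NA".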
