Error when using case_when with mutate and functions: getting closest number to zero with NA

For each person, I am trying to flag the number of days closest to zero between an accident and when they got insurance. 0 means the same day, a negative number means insurance was obtained before the accident, and a positive number means it was obtained after the accident.

Data

df <- data.frame(id=1:4, accident1=c(-1,3,4, NA), accident2=c(100, -2, NA, NA), accident3=c(-3,1,2, NA))
    
> df
  id accident1 accident2 accident3
1  1        -1       100        -3
2  2         3        -2         1
3  3         4        NA         2
4  4        NA        NA        NA

Code:

library(DescTools)
library(dplyr)

df %>%
  rowwise() %>%
  mutate(magic=
           case_when(
             accident1 <0 |accident2<0 |accident3<0 ~ as.numeric(pmax(accident1, accident2, accident3, na.rm=T)),
             accident1 >0 |accident2>0 | accident3>0 ~ as.numeric(pmin(accident1, accident2, accident3, na.rm=T)),
             accident1 > 0 & accident2 > 0 & accident3>0 ~ as.numeric(pmin(accident1, accident2, accident3, na.rm=T)),
             accident1 < 0 & accident2 < 0 & accident3 < 0 ~ as.numeric(pmax(accident1, accident2, accident3, na.rm=T)),
             TRUE ~ NA_real_)) %>%
         rowwise() %>% 
  # not working
  mutate(magic= case_when(
             (accident1 >0 |accident2<0 |accident3<0) & (accident1 >0 |accident2>0 | accident3>0) ~ 
               Closest(as.numeric(unlist(c(accident1, accident2, accident3))), 0, na.rm=T), TRUE~magic))

Result when the last row (all NAs) is left out of the data:

# A tibble: 3 × 5
# Rowwise: 
     id accident1 accident2 accident3 magic
  <int>     <dbl>     <dbl>     <dbl> <dbl>
1     1        -1       100        -3    -1
2     2         3        -2         1     1
3     3         4        NA         2     2

However, when I run this with the all-NA last row included, I get:

Error: Problem with `mutate()` column `magic`.
ℹ `magic = case_when(...)`.
ℹ `magic` must be size 1, not 0.
ℹ Did you mean: `magic = list(case_when(...))` ?
ℹ The error occurred in row 4.
Run `rlang::last_error()` to see where the error occurred.
In addition: Warning message:
Problem with `mutate()` column `magic`.
ℹ `magic = case_when(...)`.
ℹ no non-missing arguments to min; returning Inf
ℹ The warning occurred in row 4. 

Any advice on how to get the code to work on row 4 with NAs?

Solution:

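The error comes from row 4. Every value in that row is NA, and because Closest is called with na.rm = TRUE, all of them are dropped, so it returns numeric(0). That result has length 0, while a rowwise mutate needs a value of size 1. Indexing the result with [1] returns the first element, and for an empty vector that is NA.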

library(dplyr)
library(DescTools)  # provides Closest()

df %>% 
  mutate(magic =
           case_when(
             accident1 < 0 | accident2 < 0 | accident3 < 0 ~ as.numeric(pmax(accident1, accident2, accident3, na.rm = TRUE)),
             accident1 > 0 | accident2 > 0 | accident3 > 0 ~ as.numeric(pmin(accident1, accident2, accident3, na.rm = TRUE)),
             accident1 > 0 & accident2 > 0 & accident3 > 0 ~ as.numeric(pmin(accident1, accident2, accident3, na.rm = TRUE)),
             accident1 < 0 & accident2 < 0 & accident3 < 0 ~ as.numeric(pmax(accident1, accident2, accident3, na.rm = TRUE)),
             TRUE ~ NA_real_)) %>%
  rowwise() %>%
  mutate(magic = case_when(
    (accident1 > 0 | accident2 < 0 | accident3 < 0) & (accident1 > 0 | accident2 > 0 | accident3 > 0) ~
      # [1] turns the empty result from an all-NA row into NA
      Closest(as.numeric(unlist(c(accident1, accident2, accident3))), 0, na.rm = TRUE)[1],
    TRUE ~ magic))

-output

# A tibble: 4 × 5
# Rowwise: 
     id accident1 accident2 accident3 magic
  <int>     <dbl>     <dbl>     <dbl> <dbl>
1     1        -1       100        -3    -1
2     2         3        -2         1     1
3     3         4        NA         2     2
4     4        NA        NA        NA    NA
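
The warning in the question ("no non-missing arguments to min; returning Inf") appears to come from the same all-NA row. The base R sketch below reproduces the mechanics without DescTools. The variable names are illustrative, and it is an assumption that Closest drops the NAs and then looks for the minimum distance in roughly this way.

```r
# Row 4 of df: every value is NA
row4 <- c(accident1 = NA_real_, accident2 = NA_real_, accident3 = NA_real_)

# What na.rm = TRUE amounts to: nothing is left
kept <- row4[!is.na(row4)]
length(kept)                              # 0

# min() over an empty vector warns and returns Inf
min_dist <- suppressWarnings(min(abs(kept - 0)))
min_dist                                  # Inf

# No element is at distance Inf, so the selection is still empty
picked <- kept[abs(kept - 0) == min_dist]
length(picked)                            # 0, not the size 1 a rowwise mutate() needs
```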

It is easier to see what happens if we call Closest on its own for each row:

> apply(df[-1], 1, function(x) Closest(x, 0, na.rm = TRUE))
[[1]]
accident1 
       -1 

[[2]]
accident3 
        1 

[[3]]
accident3 
        2 

[[4]]
named numeric(0)  ####

Indexing with [1], as the solution does, turns numeric(0) into NA:

numeric(0)[1]
[1] NA
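
If the DescTools dependency is not wanted, the same idea can be sketched in base R. The helper name closest_to_zero is hypothetical (it is not part of dplyr or DescTools). It returns NA explicitly when nothing is left after dropping NAs, and on ties it keeps the first value in column order, which may differ from what Closest reports.

```r
df <- data.frame(id = 1:4,
                 accident1 = c(-1, 3, 4, NA),
                 accident2 = c(100, -2, NA, NA),
                 accident3 = c(-3, 1, 2, NA))

# Hypothetical helper: value nearest to zero, or NA if all inputs are NA
closest_to_zero <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) return(NA_real_)
  x[[which.min(abs(x))]]
}

# Apply to the accident columns row by row; vapply() enforces one number per row
df$magic <- vapply(seq_len(nrow(df)),
                   function(i) closest_to_zero(unlist(df[i, -1])),
                   numeric(1))
df$magic
# [1] -1  1  2 NA
```

Because the helper always returns exactly one number, it should also work inside a rowwise mutate in place of Closest(...)[1], for example mutate(magic = closest_to_zero(c(accident1, accident2, accident3))).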