Groupby, mutate a new column based on conditions of one column is in the specific ranges of another numeric column

June 10, 2022

Given a df as follows:

df <- structure(list(group = c("A", "A", "A", "A", "A", "B", "B", "B", 
"B"), pred_val = c(22.52, 21.87, 31.45, 21.45, 19.99, 13.96, 
15.97, 6.5, 19.89), actual_val = c(21L, 21L, 21L, 21L, 21L, 16L, 
16L, 16L, 16L)), class = "data.frame", row.names = c(NA, -9L))

Out:

group pred_val actual_val
A   22.52   21      
A   21.87   21      
A   31.45   21      
A   21.45   21      
A   19.99   21      
B   13.96   16      
B   15.97   16      
B   6.50    16      
B   19.89   16

Let’s say I need to group by the group column and then create a new column acc_level. More specifically, for each group: if pred_val is within actual_val ±2, return good as acc_level; if it is within actual_val ±5 but not within actual_val ±2, return medium; if it is outside both ranges, return poor.

How can I achieve that using dplyr or other packages in R? Thanks.

Pseudo code:

df %>% group_by(group) %>%
  mutate(acc_level = case_when((pred_val isin actual_val ±2) ~ 'good', (pred_val isin actual_val ±5) ~ 'medium', otherwise ~ 'poor'))

Expected output:

Solution:

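library(dplyr)

df %>%
  group_by(group) %>%
  # acc_level first holds the absolute error, then is overwritten with its label
  mutate(acc_level = abs(pred_val - actual_val),
         # case_when() uses the first condition that matches, so <= 2 is tested before <= 5
         acc_level = case_when(acc_level <= 2 ~ 'good',
                               acc_level <= 5 ~ 'medium',
                               TRUE ~ 'poor'))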

# A tibble: 9 x 4
# Groups:   group [2]
  group pred_val actual_val acc_level
  <chr>    <dbl>      <int> <chr>    
1 A         22.5         21 good     
2 A         21.9         21 good     
3 A         31.4         21 poor     
4 A         21.4         21 good     
5 A         20.0         21 good     
6 B         14.0         16 medium   
7 B         16.0         16 good     
8 B          6.5         16 poor     
9 B         19.9         16 medium
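
For comparison, here is a minimal base R sketch of the same binning, assuming dplyr is not available. It swaps case_when() for cut() on the absolute error; the helper name abs_err is only illustrative and does not appear in the question. The breaks are chosen so that the intervals are [0, 2], (2, 5] and (5, Inf], which should match the <= 2 and <= 5 conditions above.

```r
# Sample data from the question
df <- data.frame(
  group = c("A", "A", "A", "A", "A", "B", "B", "B", "B"),
  pred_val = c(22.52, 21.87, 31.45, 21.45, 19.99, 13.96, 15.97, 6.5, 19.89),
  actual_val = c(21L, 21L, 21L, 21L, 21L, 16L, 16L, 16L, 16L)
)

# Absolute difference between prediction and actual value
# (abs_err is an illustrative name, not from the original post)
abs_err <- abs(df$pred_val - df$actual_val)

# cut() uses right-closed intervals by default; include.lowest = TRUE
# closes the first interval on the left so an exact match (error 0) is "good"
df$acc_level <- as.character(cut(
  abs_err,
  breaks = c(0, 2, 5, Inf),
  labels = c("good", "medium", "poor"),
  include.lowest = TRUE
))

print(df)
```

Because the rule compares pred_val with actual_val on the same row, the labels do not depend on the grouping, which is why this sketch skips the group-by step.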