Recode a subset of variables using case_when in R

November 8, 2021

I am trying to recode some survey data in R. Here is some data similar to what I actually have.

df <- data.frame(
  A = rep("Y", 5),
  B = seq(as.POSIXct("2014-01-13"), as.POSIXct("2014-01-17"), by = "days"),
  C = c("Neither agree nor disagree",
        "Somewhat agree",
        "Somewhat disagree",
        "Strongly agree",
        "Strongly disagree"),
  D = c("Neither agree nor disagree",
        "Somewhat agree",
        "Somewhat disagree",
        "Strongly agree",
        "Strongly disagree")
)

I looked up some other posts and wrote the code below:

library(dplyr)

init2 <- df %>%
  mutate_at(vars(c(1:4)), function(x) case_when(
    x == "Neither agree nor disagree" ~ 3,
    x == "Somewhat agree"             ~ 4,
    x == "Somewhat disagree"          ~ 2,
    x == "Strongly agree"             ~ 5,
    x == "Strongly disagree"          ~ 1
  ))

But this throws the error

Error: Problem with `mutate()` column `B`.
i `B = (function (x) ...`.
x character string is not in a standard unambiguous format

Run `rlang::last_error()` to see where the error occurred.

My input dates are POSIXct. Should I change their format? What is the fix for this issue? Thanks.

Solution:

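It does not make sense to recode a POSIXct column to your Likert scale. The error comes from column B: comparing a date-time with a string such as "Somewhat agree" makes R try to parse that string as a date-time, which fails. Nor does it make sense to me to recode the "Y" column (A), though at least that one does not raise an error.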
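The failure on column B can be reproduced without dplyr. The snippet below is a minimal sketch in base R; the variable names `b` and `msg` are only for illustration. It compares a single POSIXct value with one of the Likert labels and captures the resulting error message instead of stopping.

```r
# A date-time value like those in column B of the question
b <- as.POSIXct("2014-01-13", tz = "UTC")

# `==` on a POSIXct value converts a character operand with as.POSIXct(),
# so a Likert label that is not a date raises an error
msg <- tryCatch(
  b == "Somewhat agree",
  error = function(e) conditionMessage(e)
)

print(msg)
```

The captured message should be the same "standard unambiguous format" complaint shown in the question, which is why the fix is to leave B out of the recoding rather than to change its format.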

I suggest one of the following.

Explicitly mutate the columns you want:

df %>%
  mutate(across(c(C, D), ~ case_when(
    . == "Neither agree nor disagree" ~ 3,
    . == "Somewhat agree"             ~ 4,
    . == "Somewhat disagree"          ~ 2,
    . == "Strongly agree"             ~ 5,
    . == "Strongly disagree"          ~ 1
  )))
#   A          B C D
# 1 Y 2014-01-13 3 3
# 2 Y 2014-01-14 4 4
# 3 Y 2014-01-15 2 2
# 4 Y 2014-01-16 5 5
# 5 Y 2014-01-17 1 1

Explicitly exclude the columns you don’t want:

df %>%
  mutate(across(-c(A, B), ~ case_when(
    . == "Neither agree nor disagree" ~ 3,
    . == "Somewhat agree"             ~ 4,
    . == "Somewhat disagree"          ~ 2,
    . == "Strongly agree"             ~ 5,
    . == "Strongly disagree"          ~ 1
  )))

Conditionally process them via some filter (though this is not infallible):

df %>%
  mutate(across(where(~ all(grepl("agree", .))), ~ case_when(
    . == "Neither agree nor disagree" ~ 3,
    . == "Somewhat agree"             ~ 4,
    . == "Somewhat disagree"          ~ 2,
    . == "Strongly agree"             ~ 5,
    . == "Strongly disagree"          ~ 1
  )))
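The grepl() filter is not infallible because it accepts any column whose values all happen to contain "agree", including free text. A stricter variant is sketched below, assuming the five labels in the sample data are the only valid responses. The names `likert` and `is_likert` are hypothetical helpers, and a named lookup vector stands in for the case_when() call; it is an alternative, not a required change.

```r
library(dplyr)

# Hypothetical lookup table: label -> score (assumes these five labels are exhaustive)
likert <- c(
  "Strongly disagree"          = 1,
  "Somewhat disagree"          = 2,
  "Neither agree nor disagree" = 3,
  "Somewhat agree"             = 4,
  "Strongly agree"             = 5
)

# Sample data shaped like the question's data frame
df <- data.frame(
  A = rep("Y", 5),
  B = seq(as.POSIXct("2014-01-13"), as.POSIXct("2014-01-17"), by = "day"),
  C = names(likert),
  D = rev(names(likert)),
  stringsAsFactors = FALSE
)

# TRUE only for character columns whose non-missing values are all known labels
is_likert <- function(x) {
  is.character(x) && all(x %in% names(likert) | is.na(x))
}

# Recode the selected columns by looking each label up in the table
recoded <- df %>%
  mutate(across(where(is_likert), ~ unname(likert[.x])))

print(recoded)
```

Because `is_likert()` checks the type first, the POSIXct column is never compared with a string, and column A is skipped because "Y" is not a known label.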

FYI, according to https://dplyr.tidyverse.org/reference/mutate_all.html (on 2021 Nov 7):