Converting multiple columns to factors and releveling with mutate(across)

dat <- data.frame(Comp1Letter = c("A", "B", "D", "F", "U", "A*", "B", "C"),
                   Comp2Letter = c("B", "C", "E", "U", "A", "C", "A*", "E"),
                   Comp3Letter = c("D", "A", "C", "D", "F", "D", "C", "A"))  

GradeLevels <- c("A*", "A", "B", "C", "D", "E", "F", "G", "U")

I have a data frame that looks something like the above (but with many other columns I don't want to change).

The columns I am interested in changing contain letter grades, but they are currently character vectors and not in the right order.

I need to convert each of these columns into a factor with the correct level order. I've been able to get this to work using the code below:

factordat <-
    dat %>%
      mutate(Comp1Letter = factor(Comp1Letter, levels = GradeLevels)) %>%
      mutate(Comp2Letter = factor(Comp2Letter, levels = GradeLevels)) %>%
      mutate(Comp3Letter = factor(Comp3Letter, levels = GradeLevels)) 

However, this is verbose and takes up a lot of space.

Looking at some other questions, I’ve tried to use a combination of mutate() and across(), as seen below:

factordat <-
  dat %>%
    mutate(across(c(Comp1Letter, Comp2Letter, Comp3Letter) , factor(levels = GradeLetters))) 

However, when I do this the columns remain character vectors.

Could someone please tell me what I’m doing wrong or offer another option?

Solution:

You can pass across() an anonymous function, like this:

dat <- data.frame(Comp1Letter = c("A", "B", "D", "F", "U", "A*", "B", "C"),
                   Comp2Letter = c("B", "C", "E", "U", "A", "C", "A*", "E"),
                   Comp3Letter = c("D", "A", "C", "D", "F", "D", "C", "A"))  

GradeLevels <- c("A*", "A", "B", "C", "D", "E", "F", "G", "U")

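library(dplyr)  # provides %>%, mutate() and across()

# parse_factor() is exported by readr (not forcats); it warns about values missing from levels
dat %>%
  tibble::as_tibble() %>%
  mutate(across(c(Comp1Letter, Comp2Letter, Comp3Letter), ~ readr::parse_factor(., levels = GradeLevels)))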

# # A tibble: 8 × 3
#   Comp1Letter Comp2Letter Comp3Letter
#   <fct>       <fct>       <fct>      
# 1 A           B           D          
# 2 B           C           A          
# 3 D           E           C          
# 4 F           U           D          
# 5 U           A           F          
# 6 A*          C           D          
# 7 B           A*          C          
# 8 C           E           A     

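You were close; all that was left was to wrap the factor call in an anonymous function. across() expects a function that it can apply to each selected column, whereas factor(levels = ...) is a call that R evaluates immediately, before any column has been passed to it. The wrapper can be written either with ~ and . (the tidyverse formula shorthand) or with function(x) and x in base R.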
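
For comparison, here is a minimal sketch of the same idea using base R's factor() inside across(), which is closer to the code in the question. The Other column and the ends_with("Letter") selection are assumptions added for illustration: they stand in for the "many other columns" mentioned in the question and only work if the grade columns really share that suffix.

```r
library(dplyr)

dat <- data.frame(
  Comp1Letter = c("A", "B", "D", "F", "U", "A*", "B", "C"),
  Comp2Letter = c("B", "C", "E", "U", "A", "C", "A*", "E"),
  Comp3Letter = c("D", "A", "C", "D", "F", "D", "C", "A"),
  Other = 1:8  # hypothetical extra column that should stay untouched
)

GradeLevels <- c("A*", "A", "B", "C", "D", "E", "F", "G", "U")

# Base R anonymous function: across() calls it once per selected column.
factordat <- dat %>%
  mutate(across(c(Comp1Letter, Comp2Letter, Comp3Letter),
                function(x) factor(x, levels = GradeLevels)))

# Formula shorthand, selecting the columns by their shared name suffix.
factordat2 <- dat %>%
  mutate(across(ends_with("Letter"), ~ factor(.x, levels = GradeLevels)))

str(factordat)
```

Note that factor() silently turns any value not listed in levels into NA, while readr::parse_factor() raises a warning in that case, so the choice between them depends on how strictly the grades should be checked.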
