What is the correct way of applying separate_rows to a data.frame?

I have example data which looks as follows:

library(dplyr)
library(tidyr)

# example data frame
df <- data.frame(
  col1 = c("A;B;C", "A;B", "B;C", "A;C", "B", "A;B;C;D"),
  col2 = c("X;Y;Z", "X;Y", "Y;Z", "X;Z", "Z", "W;X;Y;Z"),
  col3 = c("1;2", "1", "2;3", "3", "4;5;6", "7"),
  col4 = c(1, 2, 3, 4, 5, 6),
  col5 = c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE)
)

# select columns to separate
selected_cols <- c("col1", "col2", "col3", "col4", "col5")

However, the following code does not work:

# separate rows within selected columns that are character columns
df_separated <- df %>% 
  mutate(across(where(is.character), ~ separate_rows(., sep = ";")))

It gives the error:

Error in `mutate()`:
ℹ In argument: `across(where(is.character), ~separate_rows(., sep
  = ";"))`.
Caused by error in `across()`:
! Can't compute column `col1`.
Caused by error in `UseMethod()`:
! no applicable method for 'separate_rows' applied to an object of class "character"
Run `rlang::last_error()` to see where the error occurred.

My understanding is that separate_rows is meant to be applied to character columns, so I must be doing something wrong.

Background

I wanted to make a bar chart of every column in my data set, for which I found this very nice solution by Ronak Shah:

library(ggplot2)

# one bar chart per column; after_stat(count) replaces the deprecated ..count..
list_plots <- lapply(names(df), function(col) {
  ggplot(df, aes(.data[[col]], after_stat(count))) +
    geom_bar(aes(fill = .data[[col]]), position = "dodge")
})

The problem is that some of my columns contain multiple answers per cell, so the code does not count each answer separately.

Solution:

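First of all, separate_rows is not meant to be used inside mutate. It operates on a whole data frame and can change the number of rows, whereas across passes it a single character vector, hence the "no applicable method" error. Second, separating multiple columns at once only works if they contain the same number of elements in each row. As that is not the case for your columns, one option would be to reshape to long format, as suggested by @ChrisRuehlemann.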
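A minimal sketch of both points, using the example data from the question. The first call passes the data frame itself to separate_rows instead of a single column. The second reshapes to long format first, so cells with different numbers of elements no longer have to line up. The names `df_col1`, `df_long`, `variable` and `value` are illustrative choices, not part of the original answer.

```r
library(dplyr)
library(tidyr)

# example data frame from the question
df <- data.frame(
  col1 = c("A;B;C", "A;B", "B;C", "A;C", "B", "A;B;C;D"),
  col2 = c("X;Y;Z", "X;Y", "Y;Z", "X;Z", "Z", "W;X;Y;Z"),
  col3 = c("1;2", "1", "2;3", "3", "4;5;6", "7"),
  col4 = c(1, 2, 3, 4, 5, 6),
  col5 = c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE)
)

# separate_rows() takes the data frame, followed by the column(s) to split
df_col1 <- separate_rows(df, col1, sep = ";")

# reshape the character columns to long format, then split the single
# value column (names "variable" and "value" are arbitrary)
df_long <- df %>%
  pivot_longer(where(is.character),
               names_to = "variable", values_to = "value") %>%
  separate_rows(value, sep = ";")

nrow(df_col1)  # 14 rows: one per element of col1
nrow(df_long)  # 38 rows: one per element across col1, col2 and col3
```

In tidyr 1.3.0 and later, separate_rows is superseded by separate_longer_delim, which takes the separator as `delim` instead of `sep`.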

However, as your final goal is to make a bar chart of each column, another option would be to move the separate_rows step into your plotting function:

library(ggplot2)
library(tidyr)

lapply(c("col1", "col3"), function(col) {
  separate_rows(df, all_of(col), sep = ";") |>
    ggplot(aes(.data[[col]])) +
    geom_bar(aes(fill = .data[[col]]))
})
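If hard-coding the column names is inconvenient, the character columns could also be detected programmatically. This is a sketch building on the answer above rather than part of it; `char_cols` and `list_plots` are illustrative names, and the native pipe `|>` assumes R 4.1 or later.

```r
library(ggplot2)
library(tidyr)

# example data frame from the question
df <- data.frame(
  col1 = c("A;B;C", "A;B", "B;C", "A;C", "B", "A;B;C;D"),
  col2 = c("X;Y;Z", "X;Y", "Y;Z", "X;Z", "Z", "W;X;Y;Z"),
  col3 = c("1;2", "1", "2;3", "3", "4;5;6", "7"),
  col4 = c(1, 2, 3, 4, 5, 6),
  col5 = c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE)
)

# names of all character columns (here col1, col2 and col3)
char_cols <- names(df)[vapply(df, is.character, logical(1))]

# split one column at a time, so differing element counts do not matter
list_plots <- lapply(char_cols, function(col) {
  separate_rows(df, all_of(col), sep = ";") |>
    ggplot(aes(.data[[col]])) +
    geom_bar(aes(fill = .data[[col]]))
})

# name the list so a plot can be retrieved as list_plots[["col1"]]
names(list_plots) <- char_cols
```

Because each column is separated in its own call, the original `df` is never modified and the numeric and logical columns are left untouched.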