Split a comma-separated variable into an ordered data frame

July 11, 2022

I have a data frame like this, where the values in each row are separated by commas.

# Events
# A,B,C
# C,D
# B,A
# D,B,A,E
# A,E,B

I would like to get the following data frame:

# Event1  Event2  Event3  Event4  Event5
# A       B       C       NA      NA
# NA      NA      C       D       NA
# A       B       NA      NA      NA
# A       B       NA      D       E
# A       B       NA      NA      E

I have tried cSplit, but it does not give me the desired data frame. Is this possible?

NOTE: The position of a value within the comma-separated string does not necessarily match the Event column it belongs to in the second data frame.

>Solution :

Another approach using tidyverse:

library(dplyr)
library(purrr)
library(stringr)

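# Sample input from the question
Events <- c("A,B,C", "C,D", "B,A", "D,B,A,E", "A,E,B")

# Sorted distinct values; named event_levels to avoid masking base R's `letters`
event_levels <- Events %>% str_split(",") %>% unlist() %>% unique() %>% sort()

df <- data.frame(Events)

# One output column per value: keep the value where the row contains it, NA otherwise
df %>%
  map2_dfc(.y = event_levels, ~ ifelse(str_detect(.x, fixed(.y)), .y, NA)) %>%
  set_names(nm = paste0("Events", seq_along(event_levels)))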

#> # A tibble: 5 × 5
#>   Events1 Events2 Events3 Events4 Events5
#>   <chr>   <chr>   <chr>   <chr>   <chr>  
#> 1 A       B       C       <NA>    <NA>   
#> 2 <NA>    <NA>    C       D       <NA>   
#> 3 A       B       <NA>    <NA>    <NA>   
#> 4 A       B       <NA>    D       E      
#> 5 A       B       <NA>    <NA>    E

Created on 2022-07-11 by the reprex package (v2.0.1)
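
For comparison, here is a minimal base R sketch of the same idea, not part of the original answer. It assumes the values may be longer than one character, so it splits each row into tokens and tests exact membership instead of detecting substrings (with substring detection, a value such as "A" would also match inside "AB"). The names `events`, `tokens`, `event_levels`, and `result` are illustrative only.

```r
# Sample input from the question
events <- c("A,B,C", "C,D", "B,A", "D,B,A,E", "A,E,B")

# Split every row on commas and trim stray whitespace around each token
tokens <- lapply(strsplit(events, ",", fixed = TRUE), trimws)

# Sorted distinct values; each one gets its own output column
event_levels <- sort(unique(unlist(tokens)))

# For each value, keep it where the row contains that exact token, NA otherwise
cols <- lapply(event_levels, function(lv) {
  present <- vapply(tokens, function(tk) lv %in% tk, logical(1))
  ifelse(present, lv, NA_character_)
})
names(cols) <- paste0("Event", seq_along(event_levels))

result <- as.data.frame(cols, stringsAsFactors = FALSE)
result
```

Building the columns with `lapply()` rather than `sapply()` keeps the result a list of equal-length vectors even when the input has a single row, so `as.data.frame()` should always return one column per distinct value.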