R – Splitting a dataframe by using strsplit, but keep delimiter

I have a dataframe like the following:

ref = c("ab/1bc/1", "dd/1", "cc/1", "2323")
text = c("car", "train", "mouse", "house")

data = data.frame(ref, text)

Which produces this:

       ref  text
1 ab/1bc/1   car
2     dd/1 train
3     cc/1 mouse
4     2323 house


If a cell in the ref column contains /1, I want to split it after each /1 and duplicate the row for every resulting piece.

I.e. the table above should look like this:

   ref  text
1 ab/1   car
2 bc/1   car
3 dd/1 train
4 cc/1 mouse
5 2323 house

I have the following code, which splits the cell at /1 but also removes the /1. I thought about appending /1 back onto every ref, but not all refs contain it.

data1 = data %>%
    mutate(ref = strsplit(as.character(ref), "/1")) %>%
    unnest(ref)

Other answers use a regex to split on single characters such as &, /, comma, or period, but none of them cover a multi-character delimiter like /1 that has to be kept. Any ideas?

Solution:

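With separate_rows and a look-behind, which matches the position right after each /1 without consuming it (the filter drops the empty string left after a trailing /1):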

library(tidyr)
library(dplyr)
data %>%
  separate_rows(ref, sep = "(?<=/1)") %>% 
  filter(ref != "")

Output:

# A tibble: 5 × 2
  ref   text 
  <chr> <chr>
1 ab/1  car  
2 bc/1  car  
3 dd/1  train
4 cc/1  mouse
5 2323  house
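
To see why the look-behind keeps the delimiter, it may help to compare the two patterns on a single string in base R. This is only a small illustration; the variable name x is made up for the example.

```r
# Hypothetical single value taken from the ref column
x <- "ab/1bc/1"

# Plain pattern: the matched "/1" is consumed, so it disappears from the pieces
plain <- strsplit(x, "/1")[[1]]

# Look-behind: matches the empty position right after "/1", so nothing is consumed
# (perl = TRUE is needed because look-behind is a PCRE feature)
kept <- strsplit(x, "(?<=/1)", perl = TRUE)[[1]]

print(plain)
print(kept)
```

The first call should give the pieces without /1, the second the pieces with /1 still attached.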

Or with strsplit, passing perl = TRUE so that the look-behind is supported:

data %>%
  mutate(ref = strsplit(ref, "(?<=/1)", perl = TRUE)) %>%
  unnest(ref)
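
If you would rather avoid the tidyverse, the same idea can be sketched in base R. This is a minimal sketch, not part of the original answer: it assumes ref is a character column (hence stringsAsFactors = FALSE), and the names parts and result are made up for the example.

```r
# Same data as in the question; keep ref as character so strsplit accepts it
ref  <- c("ab/1bc/1", "dd/1", "cc/1", "2323")
text <- c("car", "train", "mouse", "house")
data <- data.frame(ref, text, stringsAsFactors = FALSE)

# Split after every "/1"; values without "/1" come back as a single piece
parts <- strsplit(data$ref, "(?<=/1)", perl = TRUE)

# Repeat each row once per piece, then overwrite ref with the pieces
result <- data[rep(seq_len(nrow(data)), lengths(parts)), ]
result$ref <- unlist(parts)
rownames(result) <- NULL  # reset row names after duplicating rows

print(result)
```

Here lengths(parts) gives the number of pieces per row, so rep duplicates only the rows whose ref was actually split.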