Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Separate columns that contain an unequal mixture of numbers and text in R

I have a very odd data frame that looks like this

library(tidyverse)

df <- tibble(position=c(100,200,300),
             correction=c("62M89S", 
                     "8M1D55M88S",
                     "1S25M1P36M89S"))

df
#> # A tibble: 3 × 2
#>   position correction   
#>      <dbl> <chr>        
#> 1      100 62M89S       
#> 2      200 8M1D55M88S   
#> 3      300 1S25M1P36M89S

Created on 2022-03-02 by the reprex package (v2.0.1)
the correction column will be used to correct the positions.
I want to separate the correction column in a way that my data frame looks like this

df    col1  col2 col3 col4 col5
100                    62M  89S
200          8M  1D    55M  88S
300     1S  25M  1P    36M  89S

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

Use tidyr::separate with fill = "left":

library(tidyr)
library(dplyr)
df %>% 
  separate(correction, into = str_c("col", 1:5), 
           sep = "(?<=\\D)(?=\\d)", fill = "left")

# A tibble: 3 x 6
  position col1  col2  col3  col4  col5 
     <dbl> <chr> <chr> <chr> <chr> <chr>
1      100 NA    NA    NA    62M   89S  
2      200 NA    8M    1D    55M   88S  
3      300 1S    25M   1P    36M   89S  
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading