Separate a column based on a symbol while keeping the symbol in the first column

I have a dataset in which I need to split a column into two at the : symbol. However, I want to keep the : in the first column. How can I achieve that?

Here is the dataset:

dd <- data.frame(col1=c("*MOT:0 .",
"*CHI:byebye .",
"*MOT:yeah byebye .",
"*CHI:0 [>] .",
"*MOT:<what are you gonna do now> [<] ?",
"*CHI:gonna do .",
"*MOT:<what's that [= block]> [>] ?"))

dd
                                 col1
                               *MOT:0 .
                          *CHI:byebye .
                     *MOT:yeah byebye .
                           *CHI:0 [>] .
 *MOT:<what are you gonna do now> [<] ?
                        *CHI:gonna do .
     *MOT:<what's that [= block]> [>] ?

In the end, I want this:

  col1   col2
  *MOT:  0 .
  *CHI:  byebye .
  *MOT:  yeah byebye .
  *CHI:  0 [>] .
  *MOT:  <what are you gonna do now> [<] ?
  *CHI:  gonna do .
  *MOT:  <what's that [= block]> [>] ?

Any help will be greatly appreciated!

>Solution :

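You can use tidyr::separate with a lookbehind regex. The pattern (?<=:) matches the empty position immediately after a colon, so the split itself consumes no characters and the : stays at the end of col1: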

tidyr::separate(dd, col1, '(?<=:)', into = c('col1', 'col2'))
#>    col1                              col2
#> 1 *MOT:                               0 .
#> 2 *CHI:                          byebye .
#> 3 *MOT:                     yeah byebye .
#> 4 *CHI:                           0 [>] .
#> 5 *MOT: <what are you gonna do now> [<] ?
#> 6 *CHI:                        gonna do .
#> 7 *MOT:     <what's that [= block]> [>] ?
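
If you would rather not depend on tidyr, the same split can be sketched in base R by locating the first colon and cutting the string at that position. This is only an illustrative alternative, not part of the answer above; the names pos and out are made up for the example, and it assumes every row contains at least one colon (a row without one would end up with an empty col1 and the whole string in col2).

```r
# Small subset of the example data from the question
dd <- data.frame(col1 = c("*MOT:0 .",
                          "*CHI:byebye .",
                          "*MOT:<what's that [= block]> [>] ?"))

# Position of the first ":" in each string (fixed = TRUE: literal match, no regex)
pos <- regexpr(":", dd$col1, fixed = TRUE)

# Everything up to and including the colon goes to col1, the rest to col2
out <- data.frame(
  col1 = substr(dd$col1, 1, pos),
  col2 = substr(dd$col1, pos + 1, nchar(dd$col1))
)

out
```

Because only the first colon is used, any later colons inside the utterance stay in col2. With tidyr::separate, passing extra = "merge" should give the same behaviour if your real data contains more than one colon per row.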
