I have a weird data format and I need to split a column into two.
col=c("142343-2344343(+)", "546354-4775458(-)", "374637463")
I want to split col into col1 and col2, using the first parenthesis as the separator.
I want something like this:
col1 col2
142343-2344343 +
546354-4775458 -
374637463 NA
I'd love your help!
>Solution :
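We may use base R with read.csv: strip the parentheses, insert a comma before the trailing sign, and parse the result as two-column CSV text.
# Remove "(" and ")", then put a comma before a trailing "+" or "-"
# so that read.csv can split each string into two fields
read.csv(text = sub("(.*)([+-])$", "\\1,\\2",
    gsub("\\(|\\)", "", col)), header = FALSE, na.strings = "",
    col.names = c("col1", "col2"))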
-output
            col1 col2
1 142343-2344343    +
2 546354-4775458    -
3      374637463 <NA>
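If you would rather not go through read.csv, the same split can be sketched with sub and grepl alone. This is a minimal sketch that assumes each string has at most one parenthesized part, at the end. The names has_paren and res are only illustrative.

```r
col <- c("142343-2344343(+)", "546354-4775458(-)", "374637463")

# col1: drop everything from the first "(" onwards
# (strings without "(" are returned unchanged)
col1 <- sub("\\(.*$", "", col)

# col2: the text between the parentheses, or NA when there are none
has_paren <- grepl("\\(", col)
col2 <- ifelse(has_paren,
               sub("^[^(]*\\(([^)]*)\\).*$", "\\1", col),
               NA_character_)

res <- data.frame(col1, col2)
res
```

This gives the same col1 and col2 values as the read.csv approach. It does not depend on the sign being "+" or "-", since it captures whatever sits inside the parentheses.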
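With tidyr (version 1.3.0 or later), an option is separate_wider_regex:
library(tidyr)
library(dplyr)
library(tibble)
# Named patterns become columns; unnamed patterns ("\\(" and "\\)") are matched and dropped.
# too_few = "align_start" fills the missing pieces with NA when a string has no parentheses.
tibble(col) %>%
    separate_wider_regex(col, c(col1 = ".*", "\\(", col2 = "[^)]",
        "\\)"), too_few = "align_start")
-output
# A tibble: 3 × 2
  col1           col2
  <chr>          <chr>
1 142343-2344343 +
2 546354-4775458 -
3 374637463      <NA>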