I have
x<-"1, A | 2, B | 10, C "
x is always this way formatted, | denotes a new row and the first value is the variable1, the second value is variable2.
I would like to convert it to a data.frame
variable1 variable2
1 1 A
2 2 B
3 10 C
I haven’t found any package that can understand the escape character |
How can I convert it to data.frame?
>Solution :
We may use read.table from base R to read the string into two columns after replacing the | with \n
read.table(text = gsub("|", "\n", x, fixed = TRUE), sep=",",
header = FALSE, col.names = c("variable1", "variable2"), strip.white = TRUE )
-output
variable1 variable2
1 1 A
2 2 B
3 10 C
Or use fread from data.table
library(data.table)
fread(gsub("|", "\n", x, fixed = TRUE), col.names = c("variable1", "variable2"))
variable1 variable2
1: 1 A
2: 2 B
3: 10 C
Or using tidyverse – separate_rows to split the column and then create two columns with separate
library(tidyr)
library(dplyr)
tibble(x = trimws(x)) %>%
separate_rows(x, sep = "\\s*\\|\\s*") %>%
separate(x, into = c("variable1", "variable2"), sep=",\\s+", convert = TRUE)
# A tibble: 3 × 2
variable1 variable2
<int> <chr>
1 1 A
2 2 B
3 10 C