I have dataframe with two pieces of information per row: the name of a substrate and its mass. I want to pivot longer the data frame and in the output the name and the mass of each substrate should go into seperate columns. Here is an example of the data frame:
library(tidyr)
df <- tibble(
ID = c("A", "B", "C"),
LOC = c("GER", "FR", "GB"),
`Substrat 1 Name` = c("silage", "slurry", "biowaste"),
`Substrat 1 Menge in t` = c(10000, 5000, 3000),
`Substrat 2 Name` = c("grass", "silage", "corn"),
`Substrat 2 Menge in t` = c(6700, 3400, 1200),
`Substrat 12 Name` = c("slurry", "grass", "silage"),
`Substrat 12 Menge in t` = c(200, 4000, 3700)
)
I would like to get the following as a result:
res <- tibble(
ID = c(rep("A", 3), rep("B", 3), rep("C", 3)),
LOC = c(rep("GER", 3), rep("FR", 3), rep("GB", 3)),
Substrat = rep(c("Substrat 1", "Substrat 2", "Substrat 12"), 3),
Name = c("silage", "grass", "slurry", "slurry", "silage", "grass", "biowaste", "corn", "silage"),
'Menge in t' = c(10000, 6700, 200, 5000, 3400, 4000, 3000, 1200, 3700)
)
To pivot longer I tried the following:
res <- df |>
pivot_longer(cols = c(`Substrat 1 Name`:`Substrat 12 Menge in t`),
names_to = c(".value", "Substrat"),
names_pattern = "(\\w+\\s\\w+)\\s(.+)",
values_drop_na = T)
However, it says:
Error in pivot_longer():
! Can’t combine Substrat 1 Name and Substrat 1 Menge in t .
Can anybody give an advice how to solve this issue?
>Solution :
Try the following regex :
res <- df %>%
pivot_longer(cols = c(`Substrat 1 Name`:`Substrat 12 Menge in t`),
names_to = c("Substrat", ".value"), # LINE TO CHANGE
names_pattern = "(Substrat \\d+) (.+)", # LINE TO CHANGE
values_drop_na = T)