I have a df like this:
df <- data.frame(id=c("j1", "j2", "j3/j9", "j5", "j2/j8", "j3/j4"), dad=c("j10", "j11", "", "j13", "", ""), mom=c("k2", "k4", "", "k6", "", ""))
I am trying to split only those cells that contain a slash "/" in the column "id". I want to get the split string in the existing columns "dad" and "mom". The desired output would be this:
df2 <- data.frame(id=c("j1", "j2", "j3/j9", "j5", "j2/j8", "j3/j4"), dad=c("j10", "j11", "j3", "j13", "j2", "j3"), mom=c("k2", "k4", "j9", "k6", "j8", "j4"))
I am trying this code:
df3 <- tidyr::separate(data = df, col = "id", into = c("dad", "mom"), sep = "/")
but that split the entire column "id" in two new columns. Any idea how to solve this?
>Solution :
You could use:
library(dplyr)
library(stringr)
df %>%
mutate(dad = if_else(str_detect(id, "/"), str_extract(id, ".*(?=/)"), dad),
mom = if_else(str_detect(id, "/"), str_extract(id, "(?<=/).*"), mom))
This returns
id dad mom
1 j1 j10 k2
2 j2 j11 k4
3 j3/j9 j3 j9
4 j5 j13 k6
5 j2/j8 j2 j8
6 j3/j4 j3 j4