I have a df:
A <- c("A", "A123", "A123", "B123", "B123", "B")
B <- c("NA", "as", "bp", "df", "kl", "c")
df <- data.frame(A, B)
and I would like to create a new data frame that looks like this:
A <- c("A", "A123", "B123", "B")
C <- c("NA", "as;bp", "df;kl", "c")
df2 <- data.frame(A,C)
The new column C is built from column B: when a value in column A is duplicated, the corresponding values in column B are combined into a single entry separated by ";". When a value in column A is unique, its value in column B is carried over to column C unchanged.
Any help with code that produces column C would be appreciated, as I don't even know where to begin.
Thank you!
>Solution :
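Use reframe from dplyr (available from dplyr 1.1.0, which also introduced the .by argument) to paste the non-missing ‘B’ values for each ‘A’ group; if all values in a group are missing, return the ‘B’ values as they are. Note that "NA" in the example data is a character string, not a missing value, so is.na() is FALSE for it and it is carried through unchanged.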
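library(dplyr)    # reframe(), %>%
library(stringr)  # str_c()

df %>%
  reframe(
    # if every B in the group is missing, keep B as is; otherwise join the non-missing values with ";"
    C = if (all(is.na(B))) B else str_c(B[complete.cases(B)], collapse = ";"),
    .by = "A"  # group by column A for this call only
  )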
-output
     A     C
1    A    NA
2 A123 as;bp
3 B123 df;kl
4    B     c
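If dplyr is not available, the same grouping can be sketched in base R. This is only an illustrative alternative to the reframe call above, not part of the original answer. The names groups and df2 are chosen here for the example, and the sketch assumes column B contains no true missing values (paste would turn a real NA into the text "NA").

```r
# Sample data from the question
df <- data.frame(
  A = c("A", "A123", "A123", "B123", "B123", "B"),
  B = c("NA", "as", "bp", "df", "kl", "c"),
  stringsAsFactors = FALSE
)

# Split B by A; fixing the factor levels to unique(df$A) keeps the groups
# in order of first appearance instead of sorting them alphabetically
groups <- split(df$B, factor(df$A, levels = unique(df$A)))

# Collapse each group into one string separated by ";"
# (assumption: no real NA values in B, see the note above)
df2 <- data.frame(
  A = names(groups),
  C = unname(vapply(groups, paste, character(1), collapse = ";")),
  stringsAsFactors = FALSE
)

print(df2)
```

Groups with a single value pass through paste unchanged, so unique values of A keep their original B value in column C.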