R dplyr pivot wider with duplicates and generate variable names

March 7, 2022

How can I go from

df <- data.frame(id = c("A", "B", "B"), res = c("one", "two", "three"))
df

to

df.output <- data.frame(id = c("A", "B"), res1 = c("one", "two"), res2 = c(NA, "three"))
df.output

with dplyr?

I do not know a priori the number of duplicates in id (in this example B has 2 occurrences), so the number of resX variables in the output data frame has to be generated on the fly.

>Solution :

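You just need a row identifier that counts the rows within each id, which you can create with dplyr (group_by() plus row_number()). tidyr::pivot_wider() then turns that counter into column names, so you get as many resX variables as the largest group needs and the shorter groups are filled with NA.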

library(dplyr)
library(tidyr)

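df %>%
  group_by(id) %>%
  # number the rows within each id: 1, 2, ...
  mutate(
    no = row_number()
  ) %>%
  ungroup() %>%
  pivot_wider(
    # name id_cols explicitly; recent tidyr versions deprecate passing it by position
    id_cols = id,
    names_from = no,
    names_prefix = "res",
    values_from = res
  )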
#> # A tibble: 2 × 3
#>   id    res1  res2 
#>   <chr> <chr> <chr>
#> 1 A     one   <NA> 
#> 2 B     two   three
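
To see why this works, it can help to look at the intermediate table before pivoting. The following is a minimal, self-contained sketch of the same approach split into two steps; the names `numbered` and `wide` are only illustrative and do not come from the original answer.

```r
library(dplyr)
library(tidyr)

# Input from the question
df <- data.frame(id = c("A", "B", "B"), res = c("one", "two", "three"))

# Step 1: add a per-id row counter (restarts at 1 for every id)
numbered <- df %>%
  group_by(id) %>%
  mutate(no = row_number()) %>%
  ungroup()

# Step 2: one column per counter value, named res1, res2, ...
wide <- numbered %>%
  pivot_wider(
    id_cols = id,
    names_from = no,
    names_prefix = "res",
    values_from = res
  )

print(numbered)  # long form with the extra `no` column
print(wide)      # one row per id
```

Because the column names are built from the values of `no`, an id with three or more rows would simply produce res3, res4 and so on, with no change to the code.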