How to pivot dataframe by adding duplicate columns

February 21, 2023

I have a dataframe with duplicate values in what I want to be a unique identifier column. The other columns of the dataframe contain data that I want to preserve by creating new "duplicate" columns. For example, given the following dataframe:

sample_df <- data.frame(
  id = c(1, 1, 1, 2, 2, 3, 3, 3, 3),
  test = c("A", "B", "C", "A", "B", "A", "B", "C", "D"),
  result = c(93, 88, 89, 73, 89, 71, 73, 73, 75)
)
sample_df

#   id test result
# 1  1    A     93
# 2  1    B     88
# 3  1    C     89
# 4  2    A     73
# 5  2    B     89
# 6  3    A     71
# 7  3    B     73
# 8  3    C     73
# 9  3    D     75

Is there a way to pivot this by adding new iterations of the non-id columns? I’m thinking of something like this:

#   id test1 result1 test2 result2 test3 result3 test4 result4
# 1  1     A      93     B      88     C      89  <NA>      NA
# 2  2     A      73     B      89  <NA>      NA  <NA>      NA
# 3  3     A      71     B      73     C      73     D      75

Where the maximum number of duplicates in the id column determines how many new iterations of the non-id columns are joined to the dataframe.

I’ve tried doing this with tidyr::pivot_wider() but with this function I’ve only been able to generate output columns based on the values of the columns (for example, returning A, B, C, and D as the output columns rather than test1, result1, test2, result2, etc.). I may also just be lacking the proper terminology to find other sources for answers.

>Solution :

You could create an "observation number" column within each id and pass it to names_from in pivot_wider():

library(dplyr) # >= 1.1.0
library(tidyr)

sample_df %>%
  mutate(obs = row_number(), .by = id) %>%
  pivot_wider(
    values_from = test:result, 
    names_from = obs, 
    names_sep = "",
    names_vary = "slowest"
  )

# A tibble: 3 × 9
     id test1 result1 test2 result2 test3 result3 test4 result4
  <dbl> <chr>   <dbl> <chr>   <dbl> <chr>   <dbl> <chr>   <dbl>
1     1 A          93 B          88 C          89 <NA>       NA
2     2 A          73 B          89 <NA>       NA <NA>       NA
3     3 A          71 B          73 C          73 D          75