Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to pivot dataframe by adding duplicate columns

I have a dataframe with duplicate values in what I want to be a unique identifier column. The other columns of the dataframe contain data that I want to preserve by creating new "duplicate" columns. For example, given the following dataframe:

sample_df <- data.frame(
  id = c(1, 1, 1, 2, 2, 3, 3, 3, 3),
  test = c("A", "B", "C", "A", "B", "A", "B", "C", "D"),
  result = c(93, 88, 89, 73, 89, 71, 73, 73, 75)
)
sample_df

#   id test result
# 1  1    A     93
# 2  1    B     88
# 3  1    C     89
# 4  2    A     73
# 5  2    B     89
# 6  3    A     71
# 7  3    B     73
# 8  3    C     73
# 9  3    D     75

Is there a way to pivot this by adding new iterations of the non-id columns? I’m thinking of something like this:

#   id test1 result1 test2 result2 test3 result3 test4 result4
# 1  1     A      93     B      88     C      89  <NA>      NA
# 2  2     A      73     B      89  <NA>      NA  <NA>      NA
# 3  3     A      71     B      73     C      73     D      75

Where the maximum number of duplicates in the id column determines how many new iterations of the non-id columns are joined to the dataframe.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

I’ve tried doing this with tidyr::pivot_wider() but with this function I’ve only been able to generate output columns based on the values of the columns (for example, returning A, B, C, and D as the output columns rather than test1, result1, test2, result2, etc.). I may also just be lacking the proper terminology to find other sources for answers.

>Solution :

You could create an "observation number" column within each id and pass it to names_from in pivot_wider():

library(dplyr) # >= 1.1.0
library(tidyr)

sample_df %>%
  mutate(obs = row_number(), .by = id) %>%
  pivot_wider(
    values_from = test:result, 
    names_from = obs, 
    names_sep = "",
    names_vary = "slowest"
  )
# A tibble: 3 × 9
     id test1 result1 test2 result2 test3 result3 test4 result4
  <dbl> <chr>   <dbl> <chr>   <dbl> <chr>   <dbl> <chr>   <dbl>
1     1 A          93 B          88 C          89 <NA>       NA
2     2 A          73 B          89 <NA>       NA <NA>       NA
3     3 A          71 B          73 C          73 D          75
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading