Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to avoid transposition of duplicates into lists with pivot_wider?

I have duplicates on the first 3 columns that I would like to keep after pivot_wider transposition, but not in list format.
How to do it?

Initial data with duplicates:

dat0 <-
structure(list(id = c("P1", "P1", "P1", "P1", "P1", "P1", "P1", 
"P1", "P1", "P1", "P2", "P2", "P2", "P2", "P2", "P2"), analyte = c("A", 
"A", "B", "B", "B", "B", "C", "C", "D", "D", "B", "B", "B", "B", 
"D", "D"), analyzer = c("X", "Y", "X", "Y", "X", "Y", "X", "Y", 
"X", "Y", "X", "Y", "X", "Y", "X", "Y"), result = c(0.7, 0.9, 
1.26, 1.23, 1.24, 1.22, 5.7, 5.3, 4.1, 4.2, 1.22, 1.23, 1.21, 
1.22, 4.4, 4.5)), row.names = c(NA, -16L), class = c("tbl_df", 
"tbl", "data.frame")) 

enter image description here

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

What pivot_wider produces after running, with the following message:

dat1 <- dat0 %>% 
  pivot_wider(names_from = analyzer, values_from = result)  

enter image description here

Values from `result` are not uniquely identified; output will contain list-cols.
• Use `values_fn = list` to suppress this warning.
• Use `values_fn = {summary_fun}` to summarise duplicates.
• Use the following dplyr code to identify duplicates.
  {data} |>
  dplyr::summarise(n = dplyr::n(), .by = c(id, analyte, analyzer)) |>
  dplyr::filter(n > 1L) 

Desired output with duplicates:

enter image description here

Thanks for help

>Solution :

You can use row_number() within grouped data to distinguish the duplicates, e.g.,

dat0 %>%
    mutate(grp = row_number(), .by = c(id, analyte, analyzer)) %>%
    pivot_wider(names_from = analyzer, values_from = result) %>%
    select(-grp)

which gives

# A tibble: 8 × 4
  id    analyte     X     Y
  <chr> <chr>   <dbl> <dbl>
1 P1    A        0.7   0.9
2 P1    B        1.26  1.23
3 P1    B        1.24  1.22
4 P1    C        5.7   5.3
5 P1    D        4.1   4.2
6 P2    B        1.22  1.23
7 P2    B        1.21  1.22

Another option is using unnest on top of what you have achieved

dat0 %>%
    pivot_wider(names_from = analyzer, values_from = result, values_fn = list) %>%
    unnest(cols = c("X", "Y"))

which gives the desired output as well

# A tibble: 8 × 4
  id    analyte     X     Y
  <chr> <chr>   <dbl> <dbl>
1 P1    A        0.7   0.9
2 P1    B        1.26  1.23
3 P1    B        1.24  1.22
4 P1    C        5.7   5.3
5 P1    D        4.1   4.2
6 P2    B        1.22  1.23
7 P2    B        1.21  1.22
8 P2    D        4.4   4.5
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading