Using tidyr::pivot_wider when some keys have multiple values

Advertisements

I have a long data frame that i want to widen, but one key has two different values:

df <- data.frame(ColA=c("A", "B", "B", "C"), ColB=letters[23:26])

  ColA ColB
1    A    w
2    B    x
3    B    y
4    C    z

I want my output to be a paste of the two values for this key together:

  ColA ColB
1    A    w
2    B    xy
3    C    z

A regular pivot_wider() will throw a warning and convert the values to lists:

df.wide <- df %>%
    pivot_wider(names_from=ColA, values_from=ColB)

Warning message:
Values are not uniquely identified; output will contain list-cols.
* Use `values_fn = list` to suppress this warning.
* Use `values_fn = length` to identify where the duplicates arise
* Use `values_fn = {summary_fun}` to summarise duplicates 

# A tibble: 1 x 3
  A         B         C        
  <list>    <list>    <list>   
1 <chr [1]> <chr [2]> <chr [1]>

Based on the warning it looks like pivot_wider() with a value_fn() is similar to what I want as an intermediate step:

# intermediate step
df.wide <- df %>%
    pivot_wider(names_from=ColA, values_from=ColB, values_fn=SOMETHING)

   A  B  C
1  w  xy z

But it looks like values_fn() only takes summary functions, and not something that would work on character data (like paste())

The closest I can get is:

df %>%
  pivot_wider(names_from=ColA, values_from=ColB, values_fn=list) %>% 
  mutate(across(everything(), as.character)) %>%
  pivot_longer(cols=everything(), names_to="ColA", values_to="ColB")

# A tibble: 3 x 2
  ColA  ColB             
  <chr> <chr>            
1 A     "w"              
2 B     "c(\"x\", \"y\")"
3 C     "z" 

With an additional mutating gsub()-type function. Surely there’s an easier way! Preferably within the tidyverse, but also open to other packages.

Thanks

>Solution :

I don’t think you need to pivot here, unless your real data is more complicated than the example shown.

library(dplyr)

df %>% 
  group_by(ColA) %>% 
  summarise(ColB = paste0(ColB, collapse = ""))

Result:

# A tibble: 3 × 2
  ColA  ColB 
  <chr> <chr>
1 A     w    
2 B     xy   
3 C     z  

Leave a ReplyCancel reply