In R, I want to pivot_wider within group_by without merging to one row

August 9, 2023

I have grouped data that I want to change to wide format – but using pivot_wider merges all the rows in to lists.

Please see example data, code and wanted output below.

# Example data
name <- c("pre", "post", "pre", "post")
id <- c(1, 1, 2, 2)
gender <- as.factor(c("M", NA, "F", NA))
text <- c("one two", "three four", "A B", "C D")
df <- tibble(name, id, gender, text)

# Works to pivot one group
df %>% 
  group_by(id) %>% 
  filter(id == 1) %>% 
  pivot_wider(names_from = name,
              values_from = c(id, gender, text),
              values_fn = list(. =  ~ toString(unique(.))))

# Does not pivot within each group.  
df %>% 
  group_by(id) %>% 
  #  filter(id == 1) %>% 
  pivot_wider(names_from = name,
              values_from = c(id, gender, text),
              values_fn = list(. =  ~ toString(unique(.))))

# This is how I would like the data to be
id_pre  <- c(1, 2)
id_post  <- c(1, 2)
gender_pre  <- as.factor(c("M", "F"))
gender_post  <- c(NA, NA)
text_pre  <- c("one two", "A B")
text_post <- c("three four", "C D")
tibble(id_pre,id_post, gender_pre, gender_post,
       text_pre,text_post)

>Solution :

pivot_wider doesn’t work with group_by – instead you should use a column grouper as an id_col in pivot_wider.

This case is a little strange because you both want to use the id column as a grouper column and a value column. I think it works best if you duplicate the column to use one copy in each way. You can, of course, drop it at the end:

df |>
  mutate(id_orig = id) |>
  pivot_wider(
    names_from = name,
    values_from = c(id, gender, text),
    values_fn = list(. =  ~ toString(unique(.))))
  )
# # A tibble: 2 × 7
#   id_orig id_pre id_post gender_pre gender_post text_pre text_post 
#     <dbl>  <dbl>   <dbl> <fct>      <fct>       <chr>    <chr>     
# 1       1      1       1 M          NA          one two  three four
# 2       2      2       2 F          NA          A B      C D

(Use, e.g., ... |> select(-id_orig) to drop the column. I just leave it in to illustrate.)

Since id doesn’t change pre- to post- another option would be to eliminate it from the pivot values (just using it as an id_col) and copy/rename it after:

 df |>
  pivot_wider(
    names_from = name,
    values_from = c(gender, text),
    values_fn = list(. =  ~ toString(unique(.x)))
  ) |>
  select(
    id_pre = id,
    id_post = id,
    everything()
  )
# # A tibble: 2 × 6
#   id_pre id_post gender_pre gender_post text_pre text_post 
#    <dbl>   <dbl> <fct>      <fct>       <chr>    <chr>     
# 1      1       1 M          NA          one two  three four
# 2      2       2 F          NA          A B      C D