Group by, combine and list into a new column in R

I have the following dataframe:

df <- data.frame(name = c("michael", "michael", "michael", "jim", "jim", "pam", "dwight", "dwight", "dwight"),
           thing = c("mug", "bandana", "tiny tv", "pranks", "face", "reception", "bear", "beets", "battlestar galactica"))

I would like to group "thing" by "name" and add the result as a new column. This column must be a list column, in which each row holds all the "thing" values for that name, one list element per value.

I’ve tried this:

df_1 <- df %>% group_by(name) %>% 
  mutate(new_col = paste0(as.list(thing), collapse = "\", \"")) %>% 
  mutate(new_col = paste0("c('", new_col)) %>%
  mutate(new_col= as.list(str_trim(new_col, side = "both"))) %>% 
  ungroup() %>% 
  select(-thing)

df_1$new_col <- as.list(paste(df_1$new_col, "\")", sep = ""))

df_1 <- as.tibble(unique(df_1))

# A tibble: 4 × 2
  name    new_col  
  <chr>   <list>   
1 michael <chr [1]>
2 jim     <chr [1]>
3 pam     <chr [1]>
4 dwight  <chr [1]>

However, I want this:

# A tibble: 4 × 2
  name    new_col  
  <chr>   <list>   
1 michael <chr [3]>
2 jim     <chr [2]>
3 pam     <chr [1]>
4 dwight  <chr [3]>

Thank you in advance.

Solution:

A list-column doesn't need paste or anything similar; wrapping the grouped values in list() inside summarize is enough:

library(dplyr)

# one row per name; list(thing) keeps each group's values as a character vector
out <- df %>%
  group_by(name) %>%
  summarize(new_col = list(thing))
out
# # A tibble: 4 × 2
#   name    new_col  
#   <chr>   <list>   
# 1 dwight  <chr [3]>
# 2 jim     <chr [2]>
# 3 michael <chr [3]>
# 4 pam     <chr [1]>
out$new_col[[1]]
# [1] "bear"                 "beets"                "battlestar galactica"
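
Note that summarize returns the groups sorted by name, whereas the desired output in the question lists them in order of first appearance. As a minimal sketch (the name out_ordered is made up for illustration, and stringsAsFactors = FALSE is set explicitly only so the example behaves the same on older R versions), the list lengths can be checked with lengths() and the rows reordered with match():

```r
library(dplyr)

df <- data.frame(
  name = c("michael", "michael", "michael", "jim", "jim", "pam",
           "dwight", "dwight", "dwight"),
  thing = c("mug", "bandana", "tiny tv", "pranks", "face", "reception",
            "bear", "beets", "battlestar galactica"),
  stringsAsFactors = FALSE
)

out <- df %>%
  group_by(name) %>%
  summarize(new_col = list(thing))

# lengths() reports how many values each list entry holds
lengths(out$new_col)

# Reorder the rows to follow the first appearance of each name in df
# (out_ordered is a hypothetical name used only in this sketch)
out_ordered <- out[match(unique(df$name), out$name), ]
out_ordered$name
```

This keeps new_col as a list column; only the row order changes.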

Note that there is also group_nest():

df %>%
  group_nest(name, .key = "new_col")
# # A tibble: 4 × 2
#   name               new_col
#   <chr>   <list<tibble[,1]>>
# 1 dwight             [3 × 1]
# 2 jim                [2 × 1]
# 3 michael            [3 × 1]
# 4 pam                [1 × 1]

in which each element of new_col is a nested data frame (a tibble), each of which here has just one column:

out <- df %>%
  group_nest(name, .key = "new_col")
out$new_col[[1]]
# # A tibble: 3 × 1
#   thing               
#   <chr>               
# 1 bear                
# 2 beets               
# 3 battlestar galactica
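
If plain character vectors are wanted rather than nested tibbles, the nested column can be flattened afterwards. The following is a rough sketch, assuming the same df as in the question; the names nested and flat are hypothetical and used only here:

```r
library(dplyr)

df <- data.frame(
  name = c("michael", "michael", "michael", "jim", "jim", "pam",
           "dwight", "dwight", "dwight"),
  thing = c("mug", "bandana", "tiny tv", "pranks", "face", "reception",
            "bear", "beets", "battlestar galactica"),
  stringsAsFactors = FALSE
)

# Each element of new_col is a one-column tibble
nested <- df %>%
  group_nest(name, .key = "new_col")

# Pull the "thing" column out of every nested tibble,
# leaving a list of character vectors
flat <- nested %>%
  mutate(new_col = lapply(new_col, function(d) d$thing))

flat$new_col[[1]]
```

For a single column like this, summarize(new_col = list(thing)) gets to the same shape more directly; group_nest is more useful when several columns need to travel together.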