Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Merging cells in R based on a value in another row, with character format (not numeric)

I would like to merge several cells in a dataframe, based on a value in another column. However, I would also like to keep the data and not lose any values when merging, please see below.

I’ve tried to use the aggregate function but I don’t think it works as my desired column is in the format character, rather than numeric.

Current dataframe format:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

id     food
1      salt
1      chicken
1      pasta
2      rice
2      broccoli
3      cream
3      pasta
3      garlic
3      lemon
...    ...

Desired dataframe format:

id     food
1      salt chicken pasta
2      rice broccoli
3      cream pasta garlic lemon
...    ...

I was wondering if it was even possible to merge cells that have character/string properties, and if so how would I go about doing this?

Thank you very much

Data

dat  <- structure(list(id = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L), food = c("salt",
"chicken", "pasta", "rice", "broccoli", "cream", "pasta", "garlic",
"lemon")), class = "data.frame", row.names = c(NA, -9L))

>Solution :

You can do this with dplyr:

library(dplyr)
dat  |>
    group_by(id)  |>
    summarise(food = paste(food, collapse = " "))
# A tibble: 3 x 2
#      id food
#   <int> <chr>
# 1     1 salt chicken pasta      
# 2     2 rice broccoli
# 3     3 cream pasta garlic lemon

Alternatively here is a data.table approach:

library(data.table)
setDT(dat)

dat[, .(food = paste(food, collapse = " ")), by = id]
#    id                     food
# 1:  1       salt chicken pasta
# 2:  2            rice broccoli
# 3:  3 cream pasta garlic lemon

And for the sake of completeness a base R approach:

aggregate(
    list(food = dat$food), 
    by = list(id = dat$id), 
    paste
)
#   id                        food
# 1  1        salt, chicken, pasta
# 2  2              rice, broccoli
# 3  3 cream, pasta, garlic, lemon
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading