R Dataframe: Combine rows / values when two other values match

I have a dataframe that looks like this:

Name  Fruit Cost
Adam  Orange   2
Adam  Apple    3
Bob   Orange   3
Cathy Orange   4
Cathy Orange   5

Dataframe creation:

df <- data.frame(
  Name  = c("Adam", "Adam", "Bob", "Cathy", "Cathy"),
  Fruit = c("Orange", "Apple", "Orange", "Orange", "Orange"),
  Cost  = c(2, 3, 3, 4, 5)
)

I would like to combine rows so that when Name and Fruit both match, the Cost values are added together and the duplicate row is removed. For the example, the result would look like this, with the two Cathy costs combined because the Name and Fruit are the same:

Name  Fruit Cost
Adam  Orange   2
Adam  Apple    3
Bob   Orange   3
Cathy Orange   9

I was thinking of writing a for loop to compare the rows line by line and value by value, add the matching costs, and then delete the extra row. But I have to imagine there's a faster/cleaner way.

Solution:

What you are trying to do is sum Cost within each group, where a group is a unique combination of Name and Fruit.

In base R:

aggregate(Cost ~ Name + Fruit, df, sum)

Or using dplyr:

library(dplyr)

df %>% 
  group_by(Name, Fruit) %>% 
  summarize(Cost = sum(Cost), .groups = "drop")
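
To check the result against the expected output in the question, here is a minimal self-contained sketch using only base R. The names `result` and `cathy` are introduced here for illustration and are not part of the original answer.

```r
# Recreate the data frame from the question
df <- data.frame(
  Name  = c("Adam", "Adam", "Bob", "Cathy", "Cathy"),
  Fruit = c("Orange", "Apple", "Orange", "Orange", "Orange"),
  Cost  = c(2, 3, 3, 4, 5)
)

# Sum Cost within each unique Name/Fruit combination
result <- aggregate(Cost ~ Name + Fruit, data = df, FUN = sum)

# Pull out the row that should hold the combined Cathy/Orange cost
cathy <- result[result$Name == "Cathy" & result$Fruit == "Orange", ]

print(result)
print(cathy$Cost)
```

Note that aggregate() returns the groups sorted by the grouping columns, so the rows may not appear in the same order as in the input. The dplyr version should give the same totals per group.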