Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Combining multiple Dataframes with different row names

Hi I am looking to combine multiple data frames based on a common row name, but some of these data frames may have different row names that may exist in some, and not in others.

For example, I have these data frames

df1

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Payments Safeway Costco Fredmeyer
Gas 3.34 3.23 3.37
Grocery 10 22 17
Membership 0 80 0
Utility 89 67 78
Annual 120 112 98

Another data frame:
df2

Payments SamsClub Walmart
Gas 3.39 3.09
Grocery 13 28
Rating 4.2 4.8
Basic 4 7

A third data frame,
df3

Payments TraderJoes WholeFoods PayLess
Grocery 15 28 9
Utility 77 97 65
Annual 0 0 5

With all of these data tables, I want to keep the different store names as the columns, and then combine based on the "Payments" name column. But if a certain data frame does not have that specific payment/value available, then I would still include it for that store, but place the value as 0 for it. In the end, all of these Payments would then be rearranged to alphabetical order, with the final result looking like below:

df4

Payments Safeway Costco Fredmeyer SamsClub Walmart TraderJoes WholeFoods Payless
Annual 120 112 98 0 0 0 0 5
Basic 0 0 0 5 7 0 0 0
Gas 3.34 3.23 3.37 3.39 3.09 0 0 0
Grocery 10 22 17 13 28 15 28 9
Membership 0 80 0 0 0 0 0 0
Rating 0 0 0 4.2 4.8 0 0 0
Utility 89 67 78 0 0 77 97 65

Please know if this is possible. I have been finding it difficult to solve this, and would appreciate any help on this. Thank you!

I also went ahead and prepared the code for the data frames if that helps with solving this faster:

# Make df1
Payments <- c("Gas", "Grocery", "Membership", "Utility", "Annual")
Safeway <- c(3.34, 10, 0, 89, 120)
Costco <- c(3.23, 22, 80, 67, 112)
Fredmeyer <- (3.37, 17, 0, 78, 98)
df1 <- data.frame(Payments, Safeway, Costco, Fredmeyer)

# Make df2
Payments <- c("Gas", "Grocery", "Rating", "Basic")
SamsClub <- c(3.39, 13, 4.2, 2)
Walmart <- c(3.09, 28, 4.8, 7)
df2 <- data.frame(Payments, SamsClub, Walmart)

# Make df3
Payments <- c("Grocery", "Utility", "Annual")
TraderJoes <- c(15, 77, 0)
WholeFoods <- c(28, 97, 0)
Payless <- c(9, 65, 5)
df3 <- data.frame(Payments, TraderJoes, Wholefoods, Payless)

>Solution :

We can bind the datasets together with bind_rows and do a group by sum

library(dplyr)
bind_rows(df1, df2, df3) %>%
  group_by(Payments) %>% 
   summarise(across(everything(), sum, na.rm = TRUE))

-output

# A tibble: 7 × 9
  Payments   Safeway Costco Fredmeyer SamsClub Walmart TraderJoes WholeFoods Payless
  <chr>        <dbl>  <dbl>     <dbl>    <dbl>   <dbl>      <dbl>      <dbl>   <dbl>
1 Annual      120    112        98        0       0             0          0       5
2 Basic         0      0         0        2       7             0          0       0
3 Gas           3.34   3.23      3.37     3.39    3.09          0          0       0
4 Grocery      10     22        17       13      28            15         28       9
5 Membership    0     80         0        0       0             0          0       0
6 Rating        0      0         0        4.2     4.8           0          0       0
7 Utility      89     67        78        0       0            77         97      65
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading