Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Merge multiple data frames with similar names and add a new column indicating origin

I have several data frames that I want to merge into one large dataframe. In addition, I want to add a column to this new dataframe that indicates which data frame the data originated from. I did make a rather crude way of doing this:

exports_all<- rbind((mutate(Dai15_Dai3_1, sentiment = factor("week1"))),
                    (mutate(Dai15_Dai3_2, sentiment = factor("week2"))),
                    (mutate(Dai15_Dai3_3, sentiment = factor("week3"))),
                    (mutate(Dai15_Dai3_4, sentiment = factor("week4"))),
                    (mutate(Dai15_Dai3_5, sentiment = factor("week5"))),
                    (mutate(Dai15_Dai3_6, sentiment = factor("week6"))),
                    (mutate(Dai15_Dai3_7, sentiment = factor("week7"))))

Produces:

     Region_ID  Sv_mean  Week
1        10    -64.01115 week1
2        10    -64.96363 week1
3        10    -67.98841 week1
4        13    -66.88734 week2
5        13    -69.79789 week2
6        13    -69.94071 week2
7        15    -66.04855 week3
8        15    -68.31167 week3
9        15    -68.67383 week3

I’m trying to find a more elegant way of doing this. The data I used here was broken into 7 weeks, but next years data could be different (also the year before had 12 weeks). I’m trying to find a solution that will not choke up when encountering differing week numbers.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

I did make some progress but the only solution I found inserts a column with the dataframe of origin, not exactly what I need.

exports_all<- mget(ls(pattern = 'Dai15_Dai3_*')) %>%
  map_df(I, .id = 'Week')

>Solution :

We may have to modify the names of the list as mget returns a named list with the names being the object names.

library(dplyr)
mget(ls(pattern = 'Dai15_Dai3_*')) %>% 
  setNames(paste0("Week", seq_along(.))) %>%
  bind_rows(.id = "Week")

If it should replace the substring in the original names

library(stringr)
mget(ls(pattern = 'Dai15_Dai3_*')) %>%
  setNames(str_replace(names(.), ".*_(\\d+)$", "Week\\1")) %>% 
  bind_rows(.id = "Week")

NOTE: bind_rows is enough as we are not applying any functions in each of the list elements (in the OP’s code, I (identity) was used as a placeholder for .f)

Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading