I am working with the R programming language.
I am starting with these 3 data frames – these frames have different numbers of columns and different column names:
df_1 = data.frame(col1 = c("A", "B", "C"), col2 = c(2,4,6), col3 = c(5, "B", "F"))
df_2 = data.frame(col4 = c(5,6,7), col5 = c("A", "D", "Z"))
df_3 = data.frame(col6 = c("dog", "cat"), col7 = c("bear", "wolf"), col8 = c("lion", "tiger"), col9 = c("horse", "pig"), col10 = c("shark", "whale"))
My Question: Using these 3 data frames, I am trying to get the following output ("final") where columns from 3 data frames are stacked on top of each other based on their positions :
final
new_col_1 new_col_2 new_col_3 new_col_4 new_col_5 new_col_6
1 A 2 5 <NA> <NA> <NA>
2 B 4 B <NA> <NA> <NA>
3 C 6 F <NA> <NA> <NA>
4 5 A <NA> <NA> <NA> <NA>
5 6 D <NA> <NA> <NA> <NA>
6 7 Z <NA> <NA> <NA> <NA>
7 dog bear <NA> lion horse shark
8 cat wolf <NA> tiger pig whale
I tried to use the advice from this question here (Combine two data frames by rows (rbind) when they have different sets of columns) to solve my problem :
library(gtools)
smartbind(df_1, df_2, df_3)
col1 col2 col3 col4 col5 col6 col7 col8 col9 col10
1:1 A 2 5 NA <NA> <NA> <NA> <NA> <NA> <NA>
1:2 B 4 B NA <NA> <NA> <NA> <NA> <NA> <NA>
1:3 C 6 F NA <NA> <NA> <NA> <NA> <NA> <NA>
2:1 <NA> NA <NA> 5 A <NA> <NA> <NA> <NA> <NA>
2:2 <NA> NA <NA> 6 D <NA> <NA> <NA> <NA> <NA>
2:3 <NA> NA <NA> 7 Z <NA> <NA> <NA> <NA> <NA>
3:1 <NA> NA <NA> NA <NA> dog bear lion horse shark
3:2 <NA> NA <NA> NA <NA> cat wolf tiger pig whale
However, this code is not giving me the desired results – can someone please tell me what I am doing wrong and how I can fix this?
Thanks!
>Solution :
Two problems:
- The names are different. Easily remedied, since you don’t care about any of the original column names.
- The column classes are different. Easily remedied, we will convert everything to
character. I don’t think this is fully always desired, you can always usetype.convert(., as.is=TRUE)as needed.
With this, we can use either dplyr::bind_rows or data.table::rbindlist to combine them.
frames <- list(df_1, df_2, df_3)
frames2 <- lapply(frames, function(z) {
z <- setNames(z, paste0("new_col_", seq_along(z)))
z[] <- lapply(z, as.character)
z
})
out <- bind_rows(frames2)
out
# new_col_1 new_col_2 new_col_3 new_col_4 new_col_5
# 1 A 2 5 <NA> <NA>
# 2 B 4 B <NA> <NA>
# 3 C 6 F <NA> <NA>
# 4 5 A <NA> <NA> <NA>
# 5 6 D <NA> <NA> <NA>
# 6 7 Z <NA> <NA> <NA>
# 7 dog bear lion horse shark
# 8 cat wolf tiger pig whale
### similarly
out <- rbindlist(frames2, use.names = TRUE, fill = TRUE)
If you want whatever might be "number-like" to return to this number-like format (will do nothing with this data), you can run:
out[] <- lapply(out, type.convert, as.is = TRUE)