I have this list of data frames:
my_list <- list(structure(list(observations = c(1L, 5L), variables = c(4L,
8L)), class = "data.frame", row.names = c("asp_202003...Copy.xlsx",
"asp_202003.xlsx")), structure(list(observations = c(3L, 1L),
variables = 5:4), class = "data.frame", row.names = c("eay_201008_a.xlsx",
"eay_202003.xlsx")), structure(list(observations = 3:4, variables = c(4L,
6L)), class = "data.frame", row.names = c("wana_202309...Copy.xlsx",
"wana_202309.xlsx")))
I merge the data frames like so:
library(dplyr); library(purrr)  # provide %>%, full_join() and reduce()
my_merge <- my_list %>% reduce(full_join)
Output:
my_merge
# observations variables
#1 1 4
#2 5 8
#3 3 5
#4 3 4
#5 4 6
But I would like to keep the row names (or extract them) in a new column called ‘file’, like so:
Desired output:
# file observations variables
# asp_202003...Copy.xlsx 1 4
# asp_202003.xlsx 5 8
# etc.
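Note also that the desired output should have 6 rows, not the 5 in the current my_merge object. Because full_join() matches on both shared columns, the identical row (observations = 1, variables = 4) that appears in the first and second data frames is collapsed into one, so a row is ‘lost’. This is also why I want to keep the file name as an identifier.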
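To make the lost row concrete, here is a minimal sketch using only the first two data frames from the list. The names a, b, joined and stacked are introduced here for illustration and are not part of the original code. It compares the row count after full_join() with the row count after simply stacking the rows:

```r
library(dplyr)

# First two data frames from my_list, rebuilt with data.frame() for brevity
a <- data.frame(observations = c(1L, 5L), variables = c(4L, 8L),
                row.names = c("asp_202003...Copy.xlsx", "asp_202003.xlsx"))
b <- data.frame(observations = c(3L, 1L), variables = 5:4,
                row.names = c("eay_201008_a.xlsx", "eay_202003.xlsx"))

# Joining on both shared columns matches the (1, 4) row in a with the
# (1, 4) row in b, so the two rows become one
joined <- full_join(a, b, by = c("observations", "variables"))
nrow(joined)

# Stacking does no matching on values, so every input row survives
stacked <- bind_rows(a, b)
nrow(stacked)
```

The join returns one row fewer than the stack, which is the same effect that shrinks the full merge from 6 rows to 5.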
>Solution :
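You could convert each data frame to a tibble, saving the row names as a column called file, and then stack them with bind_rows(). Unlike full_join(), bind_rows() does not match rows on their values, so all 6 rows are kept.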
library(dplyr)
my_list <- list(structure(list(observations = c(1L, 5L), variables = c(4L, 8L)), class = "data.frame",
row.names = c("asp_202003...Copy.xlsx", "asp_202003.xlsx")),
structure(list(observations = c(3L, 1L), variables = 5:4), class = "data.frame",
row.names = c("eay_201008_a.xlsx", "eay_202003.xlsx")),
structure(list(observations = 3:4, variables = c(4L, 6L)), class = "data.frame",
row.names = c("wana_202309...Copy.xlsx", "wana_202309.xlsx")))
bind_rows(purrr::map(my_list, ~as_tibble(.x, rownames="file")))
#> # A tibble: 6 × 3
#> file observations variables
#> <chr> <int> <int>
#> 1 asp_202003...Copy.xlsx 1 4
#> 2 asp_202003.xlsx 5 8
#> 3 eay_201008_a.xlsx 3 5
#> 4 eay_202003.xlsx 1 4
#> 5 wana_202309...Copy.xlsx 3 4
#> 6 wana_202309.xlsx 4 6
Created on 2024-03-18 with reprex v2.0.2
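As a possible variation on the same idea, tibble::rownames_to_column() can move the row names into a column while keeping each element a plain data frame. This is a sketch rather than part of the answer above. The object name result is introduced here for illustration, and the list is rebuilt with data.frame() only to keep the example self-contained:

```r
library(dplyr)
library(purrr)
library(tibble)

# Same data as my_list above, written with data.frame() for brevity
my_list <- list(
  data.frame(observations = c(1L, 5L), variables = c(4L, 8L),
             row.names = c("asp_202003...Copy.xlsx", "asp_202003.xlsx")),
  data.frame(observations = c(3L, 1L), variables = 5:4,
             row.names = c("eay_201008_a.xlsx", "eay_202003.xlsx")),
  data.frame(observations = 3:4, variables = c(4L, 6L),
             row.names = c("wana_202309...Copy.xlsx", "wana_202309.xlsx"))
)

# Move row names into a leading "file" column, then stack all data frames
result <- my_list %>%
  map(rownames_to_column, var = "file") %>%
  bind_rows()

nrow(result)   # one row per input file
names(result)  # "file" comes first, followed by the original columns
```

Either approach avoids the join entirely, so rows with identical values in different files are never merged.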