Flattening nested lists and selecting data in R

My data comes as a large number of nested lists that look like this, but much larger:

data_in <-list(a=list(list(info=c(ID="C.1", treatment="C", color="green"), parameters=c(v=2, d=2), data=mtcars),
                      list(info=c(ID="C.2", treatment="C", color="green"), parameters=c(v=2, d=2), data=mtcars),
                      list(info=c(ID="L.1", treatment="L", color="green"), parameters=c(v=2, d=2), data=mtcars)),
               b=list(list(info=c(ID="C.1", treatment="C", color="green"), parameters=c(v=2, d=2), data=mtcars),
                      list(info=c(ID="C.2", treatment="C", color="green"), parameters=c(v=2, d=2), data=mtcars),
                      list(info=c(ID="L.1", treatment="L", color="green"), parameters=c(v=2, d=2), data=mtcars)),
               c=list(list(info=c(ID="C.1", treatment="C", color="green"), parameters=c(v=2, d=2), data=mtcars),
                      list(info=c(ID="C.2", treatment="C", color="green"), parameters=c(v=2, d=2), data=mtcars),
                      list(info=c(ID="L.1", treatment="L", color="green"), parameters=c(v=2, d=2), data=mtcars)))

Is there an elegant solution (using map() or unlist()?) to convert this list of lists to a list of dataframes containing only selected data?

I was looking for a result like this:

$a
   ID treatment v d
1 C.1         C 2 2
2 C.2         C 2 2
3 L.1         L 2 2

$b
   ID treatment v d
1 C.1         C 2 2
2 C.2         C 2 2
3 L.1         L 2 2

$c
   ID treatment v d
1 C.1         C 2 2
2 C.2         C 2 2
3 L.1         L 2 2

Thank you very much in advance!

Solution:

If we want to use only the tidyverse, we can loop down to the inner nested layer with map, extract the elements of interest (‘info’, ‘parameters’), convert each to a one-row tibble, bind the rows within each outer element (the _dfr suffix), and then remove the ‘color’ column.

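library(dplyr)
library(purrr)
library(tibble)  # provides as_tibble_row()
# Outer map: one data frame per top-level element (a, b, c)
map(data_in,  ~
   # Inner map_dfr: one row per entry, bound together row-wise
   map_dfr(.x, ~ .x[c("info", "parameters")] %>% 
   # each named vector becomes a one-row tibble, bound column-wise
   map_dfc(as_tibble_row)) %>% select(-color))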
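
To see what the innermost step does, here is a minimal sketch on a single entry. The object names `one` and `one_row` are made up for illustration, and the `data` element is left out to keep it short:

```r
library(purrr)
library(tibble)

# One inner entry, shaped like those in data_in (without the data frame)
one <- list(info = c(ID = "C.1", treatment = "C", color = "green"),
            parameters = c(v = 2, d = 2))

# Each named vector becomes a one-row tibble; map_dfc binds them side by side
one_row <- map_dfc(one[c("info", "parameters")], as_tibble_row)

# one_row is a single row with columns ID, treatment, color, v, d
names(one_row)
```

map_dfr then stacks one such row per entry, which is why ‘color’ can be dropped once at the end with select().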

Or use a recursive function (rrapply) to keep only the ‘info’ and ‘parameters’ elements and convert each to a one-row tibble (each is a named vector). Then loop over the outer list with map, binding the entries of each inner layer, unnesting the resulting list columns, and removing ‘color’.

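library(rrapply)
library(tidyr)
library(tibble)  # as_tibble_row(); dplyr and purrr are loaded above
# Keep only the "info" and "parameters" leaves, each as a one-row tibble
rrapply(data_in, condition = \(x, .xname) .xname %in%
    c("info", "parameters"), f = as_tibble_row, how = "prune"  ) %>% 
  # For each outer element: bind entries, flatten list columns, drop color
  map(~ bind_rows(.x) %>%
   unnest(where(is.list)) %>%
   select(-color))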

Output:

$a
# A tibble: 3 × 4
  ID    treatment     v     d
  <chr> <chr>     <dbl> <dbl>
1 C.1   C             2     2
2 C.2   C             2     2
3 L.1   L             2     2

$b
# A tibble: 3 × 4
  ID    treatment     v     d
  <chr> <chr>     <dbl> <dbl>
1 C.1   C             2     2
2 C.2   C             2     2
3 L.1   L             2     2

$c
# A tibble: 3 × 4
  ID    treatment     v     d
  <chr> <chr>     <dbl> <dbl>
1 C.1   C             2     2
2 C.2   C             2     2
3 L.1   L             2     2
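
If extra packages are not an option, the same idea can be sketched in base R with lapply and do.call(rbind, ...). This is a swapped-in alternative rather than the approach above, and it returns plain data frames instead of tibbles. The names `make_entry`, `nested` and `flatten_group` are hypothetical, and the small `nested` list only stands in for data_in:

```r
# Small stand-in for data_in (same shape; the `data` element is omitted)
make_entry <- function(id, trt) {
  list(info = c(ID = id, treatment = trt, color = "green"),
       parameters = c(v = 2, d = 2))
}
nested <- list(a = list(make_entry("C.1", "C"), make_entry("L.1", "L")),
               b = list(make_entry("C.2", "C")))

# Build one data frame per outer element
flatten_group <- function(group) {
  rows <- lapply(group, function(el) {
    # Select the wanted info fields by name, so color is never included
    data.frame(as.list(el$info[c("ID", "treatment")]),
               as.list(el$parameters),
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)  # stack the one-row data frames
}

result <- lapply(nested, flatten_group)
result$a
```

Selecting the ‘info’ fields by name keeps the wanted columns explicit, so any other fields in ‘info’ are ignored rather than having to be dropped afterwards.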
