I have a list of data frames that I would like to split based on a column, in this case the
cluster column.
d1 <- data.frame(y1=c(1,2,3), cluster=c(1,2,6))
d2 <- data.frame(y1=c(3,2,1), cluster=c(6,2,4))
my.list <- list(d1, d2)
Using
lapply(my.list, function(x) split(x, x$cluster))
returns the split data frames as sublists. Is it possible to split the data frames so that each piece becomes its own entry in a single flat list?
The desired output would be something like this:
my.list2 <- list(df1_cl1, df1_cl2, df1_cl6, df2_cl6, df2_cl2, df2_cl4)
Thank you!
>Solution :
Your first step is correct. To get the data into the required structure, flatten the nested result by one level using unlist with recursive = FALSE, so that the data frames themselves are left intact.
my.list2 <- unlist(lapply(my.list , function(x)
split(x, x$cluster)), recursive = FALSE)
my.list2
#$`1`
#  y1 cluster
#1  1       1
#$`2`
#  y1 cluster
#2  2       2
#$`6`
#  y1 cluster
#3  3       6
#$`2`
#  y1 cluster
#2  2       2
#$`4`
#  y1 cluster
#3  1       4
#$`6`
#  y1 cluster
#1  3       6
length(my.list2)
#[1] 6
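To see why recursive = FALSE matters here, the following minimal sketch compares it with the default. The names pieces, flat and too_flat are only illustrative and do not appear in the question.

```r
d1 <- data.frame(y1 = c(1, 2, 3), cluster = c(1, 2, 6))
d2 <- data.frame(y1 = c(3, 2, 1), cluster = c(6, 2, 4))
my.list <- list(d1, d2)

# Nested result: a list of 2 sublists, each holding 3 one-row data frames
pieces <- lapply(my.list, function(x) split(x, x$cluster))

# recursive = FALSE removes only the outer level of nesting,
# so every element is still a data frame
flat <- unlist(pieces, recursive = FALSE)
is.data.frame(flat[[1]])
# [1] TRUE

# The default (recursive = TRUE) also descends into the data frames,
# because a data frame is itself a list of columns; the result is a
# plain numeric vector of 6 pieces x 2 columns x 1 row = 12 values
too_flat <- unlist(pieces)
is.atomic(too_flat)
# [1] TRUE
length(too_flat)
# [1] 12
```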
You can drop the names of the list with unname(my.list2).
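If you would rather have descriptive names such as df1_cl1 than duplicated cluster labels, one possible approach is to name the input list first. This is only a sketch: the names df1 and df2 and the "_cl" separator are assumptions taken from the desired output in the question.

```r
d1 <- data.frame(y1 = c(1, 2, 3), cluster = c(1, 2, 6))
d2 <- data.frame(y1 = c(3, 2, 1), cluster = c(6, 2, 4))

# Assumption: the outer list is named df1/df2 (the original list is unnamed)
my.list <- list(df1 = d1, df2 = d2)

my.list2 <- unlist(lapply(my.list, function(x) split(x, x$cluster)),
                   recursive = FALSE)

# unlist() joins outer and inner names with a dot
names(my.list2)
# [1] "df1.1" "df1.2" "df1.6" "df2.2" "df2.4" "df2.6"

# Replace the first literal dot to get the df1_cl1 style
names(my.list2) <- sub(".", "_cl", names(my.list2), fixed = TRUE)
names(my.list2)
# [1] "df1_cl1" "df1_cl2" "df1_cl6" "df2_cl2" "df2_cl4" "df2_cl6"
```

Note that split() orders the pieces by sorted cluster value, so the entries for the second data frame come out as 2, 4, 6 rather than in the 6, 2, 4 order of its rows.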