Convert a grouped data.frame into a list in R dplyr

I have a large data.frame that looks like this.
I want to group the data.frame by tissue and create a list with one element (a tibble) per tissue.

library(tidyverse)
df <- tibble(tissue=c("A","A","B","B"), genes=c('CD79B','CD79A','CD19','CD180'))
df
#> # A tibble: 4 × 2
#>   tissue genes
#>   <chr>  <chr>
#> 1 A      CD79B
#> 2 A      CD79A
#> 3 B      CD19 
#> 4 B      CD180

Created on 2022-10-21 with reprex v2.0.2

I want my data to look like this

#> [[1]]
#> # A tibble: 2 × 2
#>   tissue genes
#>   <chr>  <chr>
#> 1 A      CD79B
#> 2 A      CD79A
#> 
#> [[2]]
#> # A tibble: 2 × 2
#>   tissue genes
#>   <chr>  <chr>
#> 1 B      CD19 
#> 2 B      CD180

What have I tried so far?
I have used group_map, but the tissue column is missing from the result!

library(tidyverse)
df <- tibble(tissue=c("A","A","B","B"), genes=c('CD79B','CD79A','CD19','CD180'))
  

df1 <- df |> 
  group_by(tissue) |> 
  group_map(~.)

df1  
#> [[1]]
#> # A tibble: 2 × 1
#>   genes
#>   <chr>
#> 1 CD79B
#> 2 CD79A
#> 
#> [[2]]
#> # A tibble: 2 × 1
#>   genes
#>   <chr>
#> 1 CD19 
#> 2 CD180

Created on 2022-10-21 with reprex v2.0.2

Any help or guidance is appreciated.

Solution:

Here are three ways to achieve the same result. I am posting this answer because you asked about a solution that uses the base R native pipe:

library(tidyverse)
df <- tibble(tissue=c("A","A","B","B"), genes=c('CD79B','CD79A','CD19','CD180'))

# base R, no pipe
split(df, df$tissue)
#> $A
#> # A tibble: 2 × 2
#>   tissue genes
#>   <chr>  <chr>
#> 1 A      CD79B
#> 2 A      CD79A
#> 
#> $B
#> # A tibble: 2 × 2
#>   tissue genes
#>   <chr>  <chr>
#> 1 B      CD19 
#> 2 B      CD180

# base R with pipe
df |> {\(.) split(., .$tissue)}()

# or
df |> (\(.) split(., .$tissue))()

#> $A
#> # A tibble: 2 × 2
#>   tissue genes
#>   <chr>  <chr>
#> 1 A      CD79B
#> 2 A      CD79A
#> 
#> $B
#> # A tibble: 2 × 2
#>   tissue genes
#>   <chr>  <chr>
#> 1 B      CD19 
#> 2 B      CD180

# dplyr
df %>% 
  group_split(tissue)
#> <list_of<
#>   tbl_df<
#>     tissue: character
#>     genes : character
#>   >
#> >[2]>
#> [[1]]
#> # A tibble: 2 × 2
#>   tissue genes
#>   <chr>  <chr>
#> 1 A      CD79B
#> 2 A      CD79A
#> 
#> [[2]]
#> # A tibble: 2 × 2
#>   tissue genes
#>   <chr>  <chr>
#> 1 B      CD19 
#> 2 B      CD180
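
As a follow-up to the group_map attempt in the question: a minimal sketch, assuming a dplyr version in which group_map() accepts the .keep argument (dplyr 1.0 or later). By default group_map() drops the grouping column from each chunk, and .keep = TRUE should retain it. The second part shows one way to give the group_split() result names like the split() output. The object names by_tissue, grouped, and by_tissue_named are made up for illustration.

```r
library(dplyr)

df <- tibble(tissue = c("A", "A", "B", "B"),
             genes  = c("CD79B", "CD79A", "CD19", "CD180"))

# group_map() drops grouping columns unless .keep = TRUE is set,
# so each element of the list keeps its tissue column here.
by_tissue <- df |>
  group_by(tissue) |>
  group_map(~ .x, .keep = TRUE)

# group_split() returns an unnamed list; group_keys() lists the groups
# in the same order, so its tissue column can be used as names.
grouped <- df |> group_by(tissue)
by_tissue_named <- grouped |>
  group_split() |>
  setNames(group_keys(grouped)$tissue)
```

With names attached, a single group can be pulled out as by_tissue_named[["A"]], the same way as with the split() result.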