How to group interconnected elements in R dplyr

I have a data frame that looks like the one below.
Elements in col1 are connected, directly or indirectly, with elements in col2.
For example, 1 is connected with 2 and 3, and 2 is connected with 3,
so 1, 2, and 3 should all end up in the same group.

library(tidyverse)

df1 <- tibble(col1=c(1,1,2,5,5,6), 
              col2=c(2,3,3,6,7,7))
df1
#> # A tibble: 6 × 2
#>    col1  col2
#>   <dbl> <dbl>
#> 1     1     2
#> 2     1     3
#> 3     2     3
#> 4     5     6
#> 5     5     7
#> 6     6     7

Created on 2022-03-15 by the reprex package (v2.0.1)

I want my data to look like this

#>    col1  col2  col3
#>   <dbl> <dbl>  <chr>
#> 1     1     2  group1
#> 2     1     3  group1
#> 3     2     3  group1
#> 4     5     6  group2
#> 5     5     7  group2
#> 6     6     7  group2

I would appreciate any help with solving this.
Thank you for your time.

>Solution :

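One option is igraph: treat each row as an edge, find the connected components of the resulting graph, and label each row with the component that col1 belongs to.

library(igraph)
library(dplyr)
library(stringr)

# Each row of df1 is an edge; direction does not matter for grouping
g <- graph_from_data_frame(df1, directed = FALSE)

# membership is named by vertex label, so index it with col1 as a character key
df1 %>% 
   mutate(col3 = str_c("group", components(g)$membership[as.character(col1)]))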

-output

# A tibble: 6 × 3
   col1  col2 col3  
  <dbl> <dbl> <chr> 
1     1     2 group1
2     1     3 group1
3     2     3 group1
4     5     6 group2
5     5     7 group2
6     6     7 group2
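
To see why looking up the group by col1 is enough, it may help to inspect what the component search returns. The following is a minimal sketch, assuming the same six edges as in the question; the names edges, comp, memb and groups are introduced here only for illustration.

```r
library(igraph)

# Same edge list as df1 in the question (rebuilt here so the sketch is self-contained)
edges <- data.frame(col1 = c(1, 1, 2, 5, 5, 6),
                    col2 = c(2, 3, 3, 6, 7, 7))

# Build an undirected graph: one vertex per distinct value, one edge per row
g <- graph_from_data_frame(edges, directed = FALSE)

# components() returns the component id of every vertex and the component count
comp <- components(g)
memb <- comp$membership   # vector named by vertex label ("1", "2", ...)

# List the vertex labels that fall into each component
groups <- split(names(memb), memb)
print(comp$no)
print(groups)
```

Both endpoints of a row are joined by an edge, so they always share a component; that is why indexing the membership vector with col1 alone (or equally with col2) labels the whole row.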