Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Create a table that record the number of row pairs that are not zero in R

Apologies if the title is confusing, but below is what I would like to accomplish. Let’s say I have a table as seen below.

df <- data.frame(
  patient = paste0("patient",seq(1:6)),
  gene_1 = c(10,5,0,0,1,0),
  gene_2 = c(0,26,4,5,6,1),
  gene_3 = c(1,3,5,12,44,1)
)
patient gene_1 gene_2 gene_3
patient1 10 0 1
patient2 5 26 3
patient3 0 4 5
patient4 0 5 12
patient5 1 6 44
patient6 0 1 1

What I want is another table that records the total number of pairs only if both values are non-zero. The table would look like so:

col1 col2 number-of-pairs
gene1 gene2 2
gene1 gene3 3
gene2 gene3 5

Any help is appreciated. Thank you.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

We can do this by pivoting your data to a long format, doing a self-join, and then filtering:

library(tidyr)
library(dplyr)
## Long format, keep only non-zeros
long_data = pivot_longer(df, -patient) %>%
  filter(value != 0) %>%
  select(-value)

## Self join on patient,
## Remove exact matches (can't pair with yourself)
## And use < to remove doublecounts
long_data %>%
  left_join(long_data, by = "patient") %>%
  filter(name.x != name.y & name.x < name.y) %>%
  count(name.x, name.y)
# # A tibble: 3 × 3
#   name.x name.y     n
#   <chr>  <chr>  <int>
# 1 gene_1 gene_2     2
# 2 gene_1 gene_3     3
# 3 gene_2 gene_3     5
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading