Calculate Ratio Matrix using R

I was wondering if there is a simple method to calculate a ratio matrix for each element in a data frame. Example –

gene sample1 sample2 sample3 sample4 .....
aa     2       2       3      2
aa     1       5       2      1
aa     4       1       2      3
bb     1       2       1      2
bb     2       1       1      2 

and I want the ratio of each element in sample1 to sample4 to its column total, where the total is taken over the rows that share the same gene value. The calculation would be like this –

gene sample1 sample2 sample3 sample4 .....
aa     2/7     2/8     3/7      2/6
aa     1/7     5/8     2/7      1/6
aa     4/7     1/8     2/7      3/6
bb     1/3     2/3     1/2      2/4
bb     2/3     1/3     1/2      2/4 
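
To make the denominators explicit, here is a minimal base R sketch that computes the per-gene column totals. The data frame `dd` is rebuilt from the example rows above, so its contents are an assumption about what the real data looks like.

```r
# Example data copied from the table above (only the five rows shown).
dd <- data.frame(
  gene    = c("aa", "aa", "aa", "bb", "bb"),
  sample1 = c(2, 1, 4, 1, 2),
  sample2 = c(2, 5, 1, 2, 1),
  sample3 = c(3, 2, 2, 1, 1),
  sample4 = c(2, 1, 3, 2, 2)
)

# rowsum() adds up the rows of each numeric column within each gene.
# These totals are the denominators used in the ratios above.
group_sums <- rowsum(dd[, -1], group = dd$gene)
group_sums
```

For this data the totals are 7, 8, 7, 6 for `aa` and 3, 3, 2, 4 for `bb`, which matches the fractions written out above.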

The result would be like this –

gene  sample1  sample2  sample3  sample4 .....
aa     .28       .25       .42      .33
aa     .14       .62       .28      .16
aa     .57       .12       .28      .5
bb     .33       .66       .5       .5
bb     .66       .33       .5       .5 

What I have tried in a loop is this –

tf <- dd %>%
        group_by(symbol) %>%
        summarise_if(is.numeric, mean)

This summarises each group, but it does not calculate a value for each element or keep the dimensions of the initial data frame (here, dd). Any suggestion would be most appreciated.

>Solution :

You can do:

library(dplyr)

dat %>%
  group_by(gene) %>%
  mutate(across(everything(), proportions)) %>% 
  ungroup()

# A tibble: 5 x 5
  gene  sample1 sample2 sample3 sample4
  <chr>   <dbl>   <dbl>   <dbl>   <dbl>
1 aa      0.286   0.25    0.429   0.333
2 aa      0.143   0.625   0.286   0.167
3 aa      0.571   0.125   0.286   0.5  
4 bb      0.333   0.667   0.5     0.5  
5 bb      0.667   0.333   0.5     0.5  
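
If dplyr is not available, the same per-group rescaling can be sketched in base R with `ave()`. This is an alternative technique, not the approach used above; the names `value_cols` and `res` are illustrative only.

```r
# Same data as in the answer, rebuilt here so the snippet runs on its own.
dat <- data.frame(
  gene    = c("aa", "aa", "aa", "bb", "bb"),
  sample1 = c(2, 1, 4, 1, 2),
  sample2 = c(2, 5, 1, 2, 1),
  sample3 = c(3, 2, 2, 1, 1),
  sample4 = c(2, 1, 3, 2, 2)
)

# Every column except the grouping column holds values to rescale.
value_cols <- setdiff(names(dat), "gene")

res <- dat
# ave() applies FUN within each level of dat$gene and returns a vector
# aligned with the original rows, so the data frame keeps its shape.
res[value_cols] <- lapply(
  dat[value_cols],
  function(x) ave(x, dat$gene, FUN = function(v) v / sum(v))
)
res
```

The result has the same five rows and five columns as `dat`, with each sample column summing to 1 within each gene.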

If you have missing values that you’d like to ignore, use:

dat %>%
  group_by(gene) %>%
  mutate(across(everything(), ~ .x / sum(.x, na.rm = TRUE))) %>%
  ungroup()
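
To see why the `na.rm = TRUE` variant matters, here is a small sketch on a single vector. The values in `x` are made up for illustration and stand in for one sample column within one gene; `proportions()` needs R 4.0.1 or later.

```r
# Hypothetical values for one gene in one sample, with one missing entry.
x <- c(2, NA, 4)

# proportions() divides by sum(x), which is NA here,
# so every element of the result is NA.
with_na <- proportions(x)

# Dropping NAs from the sum only: the missing entry stays NA,
# the others are divided by 2 + 4 = 6.
ignoring_na <- x / sum(x, na.rm = TRUE)

with_na
ignoring_na
```

Note that the missing entry itself is still `NA` in the second result; only the denominator ignores it.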

Data:

dat <- structure(list(gene = c("aa", "aa", "aa", "bb", "bb"), sample1 = c(2, 
1, 4, 1, 2), sample2 = c(2, 5, 1, 2, 1), sample3 = c(3, 2, 2, 
1, 1), sample4 = c(2, 1, 3, 2, 2)), class = "data.frame", row.names = c(NA, 
-5L))