Using dplyr to calculate geomean in a row wise fashion

I’d like to calculate the geometric mean across three columns for each row. I found solutions that calculate it from the values in a single column (example), but not across a row.

Here’s a simplified example:

data <- structure(list(fs_id = structure(1:8, levels = c("CON1", "NC", 
"water", "SCR1", "FAN1_1", "CON2", "SCR2", "FAN1_2"), class = "factor"), 
    twodct_ATP5B = c(1.06960527260684, 0.00241424406360917, NA, 
    0.953100847649869, 0.404512354245938, 0.934924336678708, 
    1.32283164360403, 0.194667767059346), twodct_EIF4A2 = c(1.07741209897215, 
    NA, NA, 1.01873805854745, 0.467988708062081, 0.928149963188649, 
    1.31762036152893, 0.33377442013251), twodct_GAPDH = c(1.04388739915294, 
    0.000156497290441042, NA, 0.972431569982792, 0.547030142788418, 
    0.957957726869246, 0.942311505534324, 0.337842927620691)), row.names = c(NA, 
-8L), class = c("tbl_df", "tbl", "data.frame"))

The table looks like this:

> data
# A tibble: 8 × 4
  fs_id  twodct_ATP5B twodct_EIF4A2 twodct_GAPDH
  <fct>         <dbl>         <dbl>        <dbl>
1 CON1        1.07            1.08      1.04    
2 NC          0.00241        NA         0.000156
3 water      NA              NA        NA       
4 SCR1        0.953           1.02      0.972   
5 FAN1_1      0.405           0.468     0.547   
6 CON2        0.935           0.928     0.958   
7 SCR2        1.32            1.32      0.942   
8 FAN1_2      0.195           0.334     0.338

I want to get the row-wise geometric mean of columns twodct_ATP5B, twodct_EIF4A2 and twodct_GAPDH.

I tried the following, but it doesn’t seem to work:

data %>%
  rowwise() %>%
  dplyr::mutate(geomean = exp(mean(log(select(., c("twodct_ATP5B", "twodct_EIF4A2", "twodct_GAPDH")))))) %>%
  ungroup()

Solution:

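This is a good use case for c_across() inside rowwise(). It collects the selected columns of the current row into a single vector, so log() and mean() operate on that row's values only. In the original attempt, select(., ...) returns a data frame containing every row rather than the current row's values.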

data %>%
  rowwise() %>%
  dplyr::mutate(geomean = exp(mean(log(c_across(c(twodct_ATP5B, twodct_EIF4A2, twodct_GAPDH)))))) %>%
  ungroup()
# # A tibble: 8 × 5
#   fs_id  twodct_ATP5B twodct_EIF4A2 twodct_GAPDH geomean
#   <fct>         <dbl>         <dbl>        <dbl>   <dbl>
# 1 CON1        1.07            1.08      1.04       1.06 
# 2 NC          0.00241        NA         0.000156  NA    
# 3 water      NA              NA        NA         NA    
# 4 SCR1        0.953           1.02      0.972      0.981
# 5 FAN1_1      0.405           0.468     0.547      0.470
# 6 CON2        0.935           0.928     0.958      0.940
# 7 SCR2        1.32            1.32      0.942      1.18 
# 8 FAN1_2      0.195           0.334     0.338      0.280
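
In the output above, the NC row is NA because one of its three values is missing. If skipping missing values within a row is acceptable for your analysis (an assumption, since the question does not say), a minimal sketch is to pass na.rm = TRUE to mean(). The second variant is a vectorised alternative using rowMeans() instead of rowwise(); it assumes dplyr 1.1.0 or later for pick(). The small tibble and the names cols, res_rowwise and res_vector are illustrative only, with values rounded from the example data.

```r
library(dplyr)

# Illustrative subset of the example data (values rounded)
data <- tibble(
  fs_id         = c("CON1", "NC", "water"),
  twodct_ATP5B  = c(1.07, 0.00241, NA),
  twodct_EIF4A2 = c(1.08, NA, NA),
  twodct_GAPDH  = c(1.04, 0.000156, NA)
)

# Columns that go into the geometric mean
cols <- c("twodct_ATP5B", "twodct_EIF4A2", "twodct_GAPDH")

# Option 1: rowwise() + c_across(), dropping NAs within each row
res_rowwise <- data %>%
  rowwise() %>%
  mutate(geomean = exp(mean(log(c_across(all_of(cols))), na.rm = TRUE))) %>%
  ungroup()

# Option 2: vectorised, no rowwise(); take logs of the whole matrix,
# then average across each row (pick() assumes dplyr >= 1.1.0)
res_vector <- data %>%
  mutate(geomean = exp(rowMeans(log(as.matrix(pick(all_of(cols)))), na.rm = TRUE)))

# A row where every value is NA (here "water") yields NaN, not NA,
# because the mean of zero values is NaN
res_rowwise
res_vector
```

With na.rm = TRUE, the NC row is computed from its two available values only, so decide whether a geometric mean over fewer genes is meaningful before using it. The rowMeans() variant avoids per-row evaluation and is typically faster on large tables.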