Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

rstatix::wilcox_test doesn't take formula as variable

I am trying to do boxplots and add p-values from rstatix::wilcox_test(). But I want to do this in a loop. so the column names for the formula for wilcox_test is stored in a variable. When I use the variable name in the wilcox_test, it throws error

Error in `pull()`:
! Can't extract columns that don't exist.
✖ Column `rs_id` doesn't exist.

Can someone please help me how to use variables in the formula for wilcox_test()

here is my code

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

rs_ids <- c('rs2160669', 'rs964184')
phenos <- c('HBA1C', 'TG')
for (rs_id in rs_ids){
  for(phen in phenos){
    bxp <- ggboxplot(d.k, x = rs_id, y = phen)
    stat.test <- d.k %>%  wilcox_test(.data[[`phen`]] ~ .data[[`rs_id`]]) %>%  add_significance()
    stat.test <- stat.test %>% add_xy_position(x = .data[[rs_id]]) %>% mutate(myformatted.p = ifelse(p < 0.001, 'p<0.001', paste0('p=',signif(p, digits = 2))))
  
    bxp <-  bxp +   geom_jitter(position=position_jitter(0.2), alpha = 0.5)   +  theme_bw() + theme(legend.position="none", axis.text=element_text(size=12), axis.title=element_text(size=14,face="bold")) + xlab(rs_id) + ylab(phen)
    plot <- bxp + stat_pvalue_manual(stat.test, label = "{myformatted.p}", tip.length = 0.01, step.increase = 0.1)
  }
}

sample data is below

structure(list(HBA1C = c(4.5, 5.9, 8.02, 5.5, 5.4, 6, 7.1, 9.9, 
5.58, 5, 7.6, 8, 5.5, 6.4, 6.8, 9, 6.2, 4.9, 5.7, 6.2, 6.3, 5.7, 
5.9, 6.9, 7.3, 5.7, 7.2, 10.4, 5.4, 5.3, 4.8, 5.5, 7.1, 6.2, 
7, 7.4, 11.4, 5.4, 5.5, 5.3, 5.5, 5.9, 5.5, 5.79, 4.8, 6.16, 
5.63, 7.41, 7.68, 5.82), TG = c(0.83, 0.95, 2.48, 0.78, 0.48, 
1.16, 1.45, 1.32, 1.03, 3.77, 3.87, 2.34, 1.98, 2.59, 2.92, 2.71, 
6.88, 1.25, 3.68, 1.15, 2.33, 3.37, 0.61, 1.02, 1.63, 1.32, 1.21, 
1.35, 0.85, 3.97, 4.04, 3.63, 2.62, 1.46, 2.33, 2.46, 1.09, 1.46, 
2.77, 3.1, 3.13, 2.55, 1.91, 0.97, 0.87, 1.46, 1.45, 1.15, 2.61, 
2.15), rs2160669 = c(3, 2, 3, 3, 3, 3, 3, 3, 3, 2, 3, 2, 3, 2, 
1, 2, 3, 3, 3, 3, 2, 2, 3, 3, 3, 3, 3, 2, 3, 3, 3, 3, 3, 3, 3, 
2, 3, 3, 3, 2, 3, NA, 3, 3, 3, 3, 3, 3, 2, 2), rs964184 = c(1, 
2, 1, 2, 1, 2, 1, 1, 1, 2, 2, 2, 1, 2, 3, 2, 1, 1, 1, 1, 2, 2, 
1, 1, 1, 2, 1, 2, 1, 1, 2, 1, 1, 1, 1, 2, 1, 1, 1, 2, 1, NA, 
2, 1, 3, 1, 1, 2, 2, 2)), row.names = c(NA, -50L), class = c("tbl_df", 
"tbl", "data.frame"))

>Solution :

A few issues here:

  1. Backticks are wrong, even in dplyr verbs that recognize .data you’d need to use .data[[phen]] and .data[[rs_id]]. (C.f., d.k %>% count(.data[[phen]])).

  2. .data can only be used in tidyverse-aware functions (e.g., dplyr), it won’t work in wilcox_test.

I suggest you make the formula as a string and then as.formula it. This will work in the %>%-pipe since the first argument to the test is data=, so the formula (as we form it) will be the second arg, formula=.

rs_id <- "rs2160669"
phen <- "HBA1C"
as.formula(paste(phen, rs_id, sep = "~"))
# HBA1C ~ rs2160669

library(dplyr) # just for %>% which is magrittr, but I'm inferring your use of dplyr
library(rstatix)
d.k %>%
  wilcox_test(as.formula(paste(phen, rs_id, sep = "~")))
# # A tibble: 3 x 9
#   .y.   group1 group2    n1    n2 statistic     p p.adj p.adj.signif
# * <chr> <chr>  <chr>  <int> <int>     <dbl> <dbl> <dbl> <chr>       
# 1 HBA1C 1      2          1    12         7 0.923  1    ns          
# 2 HBA1C 1      3          1    36        25 0.542  1    ns          
# 3 HBA1C 2      3         12    36       280 0.13   0.39 ns          
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading