Running a paired Wilcoxon test on rows of two data frames

I have two large data frames (around 19,000 rows and 71 columns) as follows:
df1

      sample1 sample2 sample3
gene1       5      10      15
gene2       2       8      10
gene3       3       9      10

df2

      sample1 sample2 sample3
gene1      40      50      65
gene2      12      18       0
gene3      31      19      10

I am trying to perform a paired Wilcoxon test (with paired = T this is the signed-rank test, not the rank-sum test) on the rows with the same index, but the code is taking forever on Google Colab!
My code so far:

wilc_results <- c()
for (x in 1:nrow(df1)) {
  for (y in 1:nrow(df2)) {
    result <- wilcox.test(as.numeric(df2[y, ]), as.numeric(df1[x, ]),
                          alternative = 'two.sided', paired = T)
    wilc_results[length(wilc_results) + 1] <- result$p.value
  }
}

Is there a much faster way to get the desired output?

Solution:

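There is no need to loop twice. You only want to compare rows that share the same index, and both data frames have the same rows and the same number of columns, so a single loop over the row index is enough. The nested loop runs nrow(df1) × nrow(df2) tests (about 361 million for 19,000 rows), while the single loop runs only nrow(df1). It runs in about 10 seconds on a similarly sized dataset on my computer.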

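# One paired test per gene: row i of df1 against row i of df2
wilc_results <- list()
for (i in seq_len(nrow(df1))) {
  result <- wilcox.test(as.numeric(df1[i, ]), as.numeric(df2[i, ]),
                        alternative = 'two.sided', paired = TRUE)
  wilc_results[[i]] <- result$p.value
}
# unlist(wilc_results) turns the list into a plain numeric vector of p-values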
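
The same row-by-row idea can also be written without an explicit loop. The following is only a sketch, not a benchmarked replacement. It assumes both data frames are entirely numeric and have identical row names in the same order. The names m1, m2 and paired_wilcox_p are made up for illustration, and the toy data is copied from the question. Converting to a matrix once avoids extracting a data frame row on every iteration, which tends to be the slow part.

```r
# Toy data copied from the question (3 genes x 3 samples).
genes <- c("gene1", "gene2", "gene3")
df1 <- data.frame(sample1 = c(5, 2, 3), sample2 = c(10, 8, 9),
                  sample3 = c(15, 10, 10), row.names = genes)
df2 <- data.frame(sample1 = c(40, 12, 31), sample2 = c(50, 18, 19),
                  sample3 = c(65, 0, 10), row.names = genes)

# Convert once; indexing a matrix row is cheaper than indexing a data frame row.
m1 <- as.matrix(df1)
m2 <- as.matrix(df2)

# Assumption: rows and columns line up one-to-one between the two inputs.
stopifnot(identical(dim(m1), dim(m2)),
          identical(rownames(m1), rownames(m2)))

# Hypothetical helper: p-value of the paired test for row i.
paired_wilcox_p <- function(i) {
  # wilcox.test() warns when ties or zero differences prevent an exact
  # p-value; the warning is suppressed here to keep the output readable.
  suppressWarnings(
    wilcox.test(m1[i, ], m2[i, ],
                alternative = "two.sided", paired = TRUE)$p.value
  )
}

# vapply() returns a plain numeric vector with one p-value per gene.
wilc_results <- vapply(seq_len(nrow(m1)), paired_wilcox_p, numeric(1))
names(wilc_results) <- rownames(m1)
print(wilc_results)
```

The result is a named numeric vector, so wilc_results["gene2"] looks up a single gene directly. Whether this is noticeably faster than the loop above depends on the data, so it is worth timing both on a subset first.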