Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Remove columns in R by two conditions in two different columns

I am using this data.frame. I need to apply statistical tests (wilcox.test) for each column by comparing the ‘0’ to the ‘1’ groups, but I can only do that if each group has at least 2 values. How can I remove all columns for which the group size of ‘0’ or the group size of ‘1’ is smaller than 2? Then I can be running my code without errors. So in this example the pear and cherry columns would be removed.

 df <- data.frame(group=c(rep(0,10),rep(1,10)),
      apple = as.numeric(c(runif(20, -1, 18))),
      pear = as.numeric(c(rep("NA",12), runif(8, 2, 7))),
      banana = as.numeric(c(runif(10, 1, 3), runif(10, 2.5, 6))),
      cherry = as.numeric(c(runif(9, 5, 12), rep("NA",10), 4.31)),
      kiwi = as.numeric(c(rep("NA",8), runif(12, -1, 6))))

>Solution :

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

You can use select + where to select variables with a function. I anticipated using select with group_by to deal with this issue, but dplyr seems unable to support that. So a workaround is to use tapply(or ave) for grouping:

library(dplyr)

df %>%
  select(where(~ all(tapply(.x, df$group, \(x) sum(!is.na(x)) >= 2))))

   group      apple   banana        kiwi
1      0  7.9768511 1.183422          NA
2      0 -0.6611309 1.948172          NA
3      0  0.6690410 1.556230          NA
4      0  1.3582682 1.063583          NA
5      0  4.5359535 2.972903          NA
6      0  8.8755979 2.074685          NA
7      0  2.9280202 1.734720          NA
8      0  7.4065231 1.460041          NA
9      0  0.8837726 1.109268  1.54898128
10     0 -0.9704649 2.447073  4.27753379
11     1  3.2403002 4.839462 -0.88546624
12     1  0.4561026 4.703763  2.50467817
13     1 10.2888012 3.920268  2.62292534
14     1  3.4619229 3.010228  4.67953823
15     1  0.2207555 5.582971  3.71465882
16     1 -0.3694006 3.326906  4.17280678
17     1 13.1442999 3.018943  3.39256613
18     1  6.7433707 2.989773  0.04379258
19     1 16.0372570 2.839262  4.41795547
20     1 15.7012046 2.982483  3.13632483
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading