Evaluating boolean expression over the columns of a matrix against the elements of a vector in R

I have a matrix of data, and for each row I want to check whether the absolute value of the entry in every column falls within that column's limit. I would then like to calculate the proportion of rows for which this holds across all columns at once. I know how to do this manually, but I would like to write it generally, without a loop, so that it works whenever the user gives me a matrix x and a vector y of any size. The only additional piece of information is that the number of columns of x will always equal the length of y. I would also like to do this in base R if possible. Here is my R code:

set.seed(42)

# Made up data
x <- matrix(rnorm(27), nrow = 9)

y <- c(.2, .5, 2)



> sum(abs(x[,1]) <= y[1] & abs(x[,2]) <= y[2] & abs(x[,3]) <= y[3]) / nrow(x)
[1] 0.2222222

So ideally I would want something like

sum(abs(x) <= y) / nrow(x)

Solution:

sum(rowSums(t(t(abs(x)) <= y)) == ncol(x)) / nrow(x)
# [1] 0.2222222

Walk-through:

  • Unfortunately, abs(x) <= y does recycle y across x, but it does so column-wise (R stores matrices in column-major order), so it is effectively doing c(abs(x[1,1]) <= y[1], abs(x[2,1]) <= y[2], abs(x[3,1]) <= y[3], abs(x[4,1]) <= y[1], ...), which is not what we want. We can transpose x so that y is recycled correctly, with y[j] lined up against column j of the original x, and then transpose the result again to get it back in the same shape as x (not strictly required).

    t(t(abs(x)) <= y)
    #        [,1]  [,2]  [,3]
    #  [1,] FALSE  TRUE FALSE
    #  [2,] FALSE FALSE  TRUE
    #  [3,] FALSE FALSE  TRUE
    #  [4,] FALSE FALSE  TRUE
    #  [5,] FALSE  TRUE  TRUE
    #  [6,]  TRUE  TRUE  TRUE
    #  [7,] FALSE FALSE  TRUE
    #  [8,]  TRUE  TRUE  TRUE
    #  [9,] FALSE FALSE  TRUE
    
  • Now we want to know which rows have as many TRUEs as x has columns, done with rowSums(.) == ncol(x). sum(.) then counts those rows, and dividing by nrow(x) gives the proportion.
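
Since the question asks for something that works for any x and y, the steps above could be wrapped in a small function. The sketch below is only one way to do it: the name prop_rows_within and the input check are made up for illustration, and it uses base R's sweep() in place of the double transpose (a swapped-in alternative, not the form used in the answer) to compare column j of abs(x) against y[j].

```r
# Hypothetical helper: the name and the stopifnot() check are illustrative only.
prop_rows_within <- function(x, y) {
  # Assumes x is a numeric matrix with one limit in y per column
  stopifnot(is.matrix(x), length(y) == ncol(x))

  # sweep() over margin 2 lines y[j] up with column j of abs(x)
  within <- sweep(abs(x), 2, y, FUN = "<=")

  # A row counts only if all of its columns pass; mean() of the
  # resulting logical vector is the proportion of such rows
  mean(rowSums(within) == ncol(x))
}

set.seed(42)
x <- matrix(rnorm(27), nrow = 9)
y <- c(.2, .5, 2)

prop_rows_within(x, y)
# [1] 0.2222222

# The sweep() mask should agree element-wise with the double-transpose form
all(sweep(abs(x), 2, y, FUN = "<=") == t(t(abs(x)) <= y))
```

Taking mean() of the logical vector is just sum(.) / nrow(x) written more compactly. Note that this sketch does nothing special about NA values in x; how those should be treated is left open here.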
