R: using mapply for a function of two vectors

I have an R function that calculates the Hamming distance of two vectors:

Hamming = function(x, y) {
    # Count the positions where x and y differ; NA comparisons are ignored
    get_dist = sum(x != y, na.rm = TRUE)
    return(get_dist)
}

that I would like to apply to each pair of corresponding rows of two matrices M1 and M2 without using a for loop. What I currently have (where L is the number of rows in M1 and M2) is this very time-consuming loop:

xdiff = c()
for(i in 1:L){
    xdiff = c(xdiff, Hamming(M1[i,],M2[i,]))
}

I thought that this could be done by executing

mapply(Hamming, t(M1), t(M2))

(with the transpose because mapply works across columns), but this doesn’t produce a length-L vector with one Hamming distance per row, so perhaps I’m misunderstanding what mapply is doing.

Is there a straightforward application of mapply or something else in the R apply family that would work?

Solution:

If dim(M1) and dim(M2) are identical, then you can simply do:

rowSums(M1 != M2, na.rm = TRUE)
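As a quick sanity check, here is a minimal sketch that compares the vectorized form with a row-by-row loop. The matrix values are made up for illustration, since the question does not show the real M1 and M2, and the loop preallocates its result instead of growing it with c().

```r
# Illustrative data only; not taken from the question.
M1 <- matrix(c(1, 2, 3,
               4, 5, 6), nrow = 2, byrow = TRUE)
M2 <- matrix(c(1, 0, 3,
               0, 0, 6), nrow = 2, byrow = TRUE)

Hamming <- function(x, y) sum(x != y, na.rm = TRUE)

# Row-by-row loop with a preallocated result vector
loop_result <- numeric(nrow(M1))
for (i in seq_len(nrow(M1))) {
  loop_result[i] <- Hamming(M1[i, ], M2[i, ])
}

# Vectorized: compare all elements at once, then count TRUEs per row
vec_result <- rowSums(M1 != M2, na.rm = TRUE)

print(loop_result)
print(vec_result)
```

Both vectors should hold one count per row. M1 != M2 is a logical matrix, and rowSums treats TRUE as 1, so each row sum is the number of mismatches in that row.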

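Your attempt with mapply didn’t work because an m-by-n matrix is stored as a vector of length m*n, and mapply iterates over it element by element. Hamming is therefore called on pairs of single values rather than pairs of rows, and the result has m*n entries instead of m. To accomplish this with mapply, you would need to split each matrix into a list of row vectors: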

mapply(Hamming, asplit(M1, 1L), asplit(M2, 1L))
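The difference between the two mapply calls can be seen in a small sketch. The data below is invented for illustration, and asplit assumes R 3.6.0 or later.

```r
# Illustrative data only; not taken from the question.
M1 <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, byrow = TRUE)
M2 <- matrix(c(1, 0, 3, 0, 0, 6), nrow = 2, byrow = TRUE)

Hamming <- function(x, y) sum(x != y, na.rm = TRUE)

# Matrices are treated as plain vectors, so Hamming sees one element
# from each matrix per call: the result has one entry per element.
elementwise <- mapply(Hamming, t(M1), t(M2))
print(length(elementwise))

# asplit(M, 1L) yields a list with one entry per row, so Hamming sees
# a pair of whole rows per call: the result has one entry per row.
by_row <- mapply(Hamming, asplit(M1, 1L), asplit(M2, 1L))
print(length(by_row))
print(by_row)
```

Here the first call returns a vector as long as the number of matrix elements, while the second returns one distance per row. On large matrices the rowSums form is likely the faster of the two, since it avoids calling Hamming once per row.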