Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Pairwise subtraction of two numeric vectors and selection of results within a given range

I have two dataframes of the form

df.f <- data.frame(ID=c("A_1", "A_2", "A_3"), B=c(130, 250, 330))
df.r <- data.frame(ID=c("B_1", "B_2", "B_3"), B=c(170, 350, 480))

I want to pairwise subtract df.f$B from df.r$B (direction is important), and selects those pairs which subtraction value is within a range of 200 and 300.

I know how to outer:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

    outer(df.r$B, df.f$B, '-')

     [,1] [,2] [,3]
[1,]   40  -80 -160
[2,]  220  100   20
[3,]  350  230  150

and could potentially melt the resulting matrix and then select upon the value column, but outer doesnt keep names:

library(reshape2)
res <- melt(as.matrix(outer(df.r$B, df.f$B, '-')))
final <- res[res$value >= 200 & res$value <= 300,]

> final
  Var1 Var2 value
2    2    1   220
6    3    2   230

Any idea how to achieve a result in the form of:

Var1 Var2 Diff
A_1 B_2 220
A_2 B_3 230

I guess it can be done with finding the indices in the original table.

Many thanks!

>Solution :

You can assign names after outer:

o <- outer(df.r$B, df.f$B, '-')
dimnames(o) <- list(df.r$ID, df.f$ID)
o
#     A_1 A_2  A_3
# B_1  40 -80 -160
# B_2 220 100   20
# B_3 350 230  150

or to each vector:

o <- outer(setNames(df.r$B, df.r$ID), setNames(df.f$B, df.f$ID), '-')
o
#     A_1 A_2  A_3
# B_1  40 -80 -160
# B_2 220 100   20
# B_3 350 230  150

Which melts well:

res <- reshape2::melt(o)
res
#   Var1 Var2 value
# 1  B_1  A_1    40
# 2  B_2  A_1   220
# 3  B_3  A_1   350
# 4  B_1  A_2   -80
# 5  B_2  A_2   100
# 6  B_3  A_2   230
# 7  B_1  A_3  -160
# 8  B_2  A_3    20
# 9  B_3  A_3   150

res[res$value >= 200 & res$value <= 300,]
#   Var1 Var2 value
# 2  B_2  A_1   220
# 6  B_3  A_2   230

Alternatively, you can do an outer-join (merge on nothing) and some simple assignment:

with(merge(df.f, df.r, by = c()),
     data.frame(Var1=ID.x, Var2=ID.y, Diff=B.y-B.x))
#   Var1 Var2 Diff
# 1  A_1  B_1   40
# 2  A_2  B_1  -80
# 3  A_3  B_1 -160
# 4  A_1  B_2  220
# 5  A_2  B_2  100
# 6  A_3  B_2   20
# 7  A_1  B_3  350
# 8  A_2  B_3  230
# 9  A_3  B_3  150

(and subset as desired).

Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading