Distance between point coordinates from two different data frames in R

I have two shapefiles with point coordinates.

library(sf)

#Shapefile 1:

id <- c(98,76,88)
lat <- c(28.54265,28.54474,28.54463)
long <- c(77.20034,77.19437,77.19354)
score1 <- c(3,2,0)
score2 <- c(2,1,2)
file1 <- data.frame(id,lat,long,score1,score2)

file1_sf <- st_as_sf(file1, coords = c("long", "lat"), crs = 4326L)

#Shapefile 2:

name <- c("A","B")
lat <- c(28.6705,28.6735)
long <- c(77.41588,77.28998)
feature <- c("red","yellow")

file2 <- data.frame(name,lat,long,feature)

file2_sf <- st_as_sf(file2, coords = c("long", "lat"), crs = 4326L)

Now, for each point in File 1, I want to find the closest point in File 2 and the distance between them, while retaining all the columns.

I used st_distance() and then rowwise() to get the minimum distance. However, I am not able to retain all the columns.

Is there an elegant way of solving this problem? I have 40k locations in file 1 and 200 coordinates in file 2.

>Solution :

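With 40k rows in file 1 and 200 points in file 2, the full distance matrix has 8 million entries, so computing it once and running Rfast::rowMins() twice comes at an acceptable cost.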

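# Pairwise distances in metres: one row per file-1 point, one column per file-2 point
x = sf::st_distance(file1_sf, file2_sf)
# Column index of the nearest file-2 point for each row
i = Rfast::rowMins(x)
# The minimum distance itself for each row
d = Rfast::rowMins(x, value = TRUE)
cbind.data.frame(file1_sf, "NearestPointIn2" = file2_sf$name[i], "Distance" = d)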

  id score1 score2                  geometry NearestPointIn2 Distance
1 98      3      2 POINT (77.20034 28.54265)               B 16978.60
2 76      2      1 POINT (77.19437 28.54474)               B 17090.98
3 88      0      2 POINT (77.19354 28.54463)               B 17145.58
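
If Rfast is not installed, the same two quantities (index of the row minimum and the minimum itself) can be obtained with base R. The sketch below uses a small made-up numeric matrix rather than real distances; with the st_distance() result it is probably safest to convert to a plain numeric matrix first, for instance with units::drop_units(x), which is an assumption about your setup and not part of the answer above.

```r
# Toy distance matrix (made-up values for illustration only):
# rows = points in file 1, columns = points in file 2.
x <- matrix(c(5, 2,
              1, 4,
              3, 3),
            nrow = 3, byrow = TRUE)

# Column index of the smallest value in each row.
# max.col() finds row maxima, so negate the matrix; ties go to the first column.
i <- max.col(-x, ties.method = "first")

# The minimum itself, picked out with (row, column) index pairs.
d <- x[cbind(seq_len(nrow(x)), i)]

i  # 2 1 1
d  # 2 1 3
```

This is likely slower than Rfast on large matrices, but it avoids the extra dependency.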

A merged version:

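x = sf::st_distance(file1_sf, file2_sf)
i = Rfast::rowMins(x)
d = Rfast::rowMins(x, value = TRUE)
# Attach the nearest name and distance, then join the remaining file-2 columns by "name"
merge(cbind.data.frame(file1_sf, "name" = file2_sf$name[i], "Distance" = d),
      file2_sf, by = "name")
# rm(x, i, d)  # optional clean-up of the intermediate objects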

  name id score1 score2                geometry.x Distance feature               geometry.y
1    B 98      3      2 POINT (77.20034 28.54265) 16978.60  yellow POINT (77.28998 28.6735)
2    B 76      2      1 POINT (77.19437 28.54474) 17090.98  yellow POINT (77.28998 28.6735)
3    B 88      0      2 POINT (77.19354 28.54463) 17145.58  yellow POINT (77.28998 28.6735)

I do not know how the naming should be done and cannot do any better than using "name". But this is just aesthetics and can be changed at any time.
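
As an alternative that stays within sf and never builds the full distance matrix, one could try st_nearest_feature() together with st_distance(..., by_element = TRUE). This is a different technique from the Rfast approach above, shown here as a minimal sketch on the question's sample data; the result name res and the column name distance are arbitrary choices for illustration.

```r
library(sf)

# Sample data from the question.
file1 <- data.frame(id = c(98, 76, 88),
                    lat = c(28.54265, 28.54474, 28.54463),
                    long = c(77.20034, 77.19437, 77.19354),
                    score1 = c(3, 2, 0),
                    score2 = c(2, 1, 2))
file2 <- data.frame(name = c("A", "B"),
                    lat = c(28.6705, 28.6735),
                    long = c(77.41588, 77.28998),
                    feature = c("red", "yellow"))
file1_sf <- st_as_sf(file1, coords = c("long", "lat"), crs = 4326L)
file2_sf <- st_as_sf(file2, coords = c("long", "lat"), crs = 4326L)

# For each point in file 1, the row index of the nearest point in file 2.
i <- st_nearest_feature(file1_sf, file2_sf)

# Distance between each file-1 point and its matched file-2 point only
# (element-wise, so no 40k x 200 matrix is created).
d <- st_distance(file1_sf, file2_sf[i, ], by_element = TRUE)

# Keep all file-1 columns and add the matched file-2 attributes.
res <- file1_sf
res$name <- file2_sf$name[i]
res$feature <- file2_sf$feature[i]
res$distance <- d
res
```

The result is still an sf object with the file-1 geometry, and the distance column keeps its units, which may or may not be what you want downstream.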
