I wrote this code to find the distances between stations, but the output contains only one value. Can you find the error?
df <- data.frame(
  station = rep(c("A", "B", "C", "D"), each = 20),
  temperature = rnorm(80),
  latitude = c(40.7128, 34.0522, 41.8781, 39.9526),
  longitude = c(-74.0060, -118.2437, -87.6298, -75.1652)
)
stations <- unique(df$station)
my_points <- matrix(NA, nrow = length(unique(df$station)), ncol = length(unique(df$station)))
# Loop through each station combination
for (i in 1:length(stations)) {
  for (j in 1:length(stations)) {
    # Get coordinates for the two stations
    lat1 <- df$latitude[df$station == stations[i]]
    lon1 <- df$longitude[df$station == stations[i]]
    lat2 <- df$latitude[df$station == stations[j]]
    lon2 <- df$longitude[df$station == stations[j]]
    my_points[i, j] <- as.vector(dist(matrix(c(lon1, lon2, lat1, lat2),
                                             nrow = 2)))
  }
}
distance_df <- as.data.frame(my_points)
>Solution :
There are two issues here:
- Your input data frame might not look the way you expect it to: the latitude and longitude columns hold only four values, so they are recycled across all 80 rows and each station ends up with several different coordinates. Because every station receives the same recycled set of coordinates, every pair of stations produces the same distance. Try wrapping the latitude and longitude values in rep() as well as station.
- In your code, lat1 <- df$latitude[df$station == stations[i]] returns a vector, because there are multiple matches. I think you're expecting a single value. Use only the first matching element (all elements of that vector are the same once rep() has been added as above):
df <- data.frame(
  station = rep(c("A", "B", "C", "D"), each = 20),
  temperature = rnorm(80),
  latitude = rep(c(40.7128, 34.0522, 41.8781, 39.9526), each = 20),
  longitude = rep(c(-74.0060, -118.2437, -87.6298, -75.1652), each = 20)
)
stations <- unique(df$station)
my_points <- matrix(NA, nrow = length(unique(df$station)), ncol = length(unique(df$station)))
# Loop through each station combination
for (i in 1:length(stations)) {
  for (j in 1:length(stations)) {
    # Get coordinates for the two stations (first match only)
    lat1 <- df$latitude[df$station == stations[i]][1]
    lon1 <- df$longitude[df$station == stations[i]][1]
    lat2 <- df$latitude[df$station == stations[j]][1]
    lon2 <- df$longitude[df$station == stations[j]][1]
    my_points[i, j] <- as.vector(dist(matrix(c(lon1, lon2, lat1, lat2),
                                             nrow = 2)))
  }
}
distance_df <- as.data.frame(my_points)
This gives:
V1 V2 V3 V4
1 0.000000 44.73631 13.67355 1.386235
2 44.736313 0.00000 31.59835 43.480707
3 13.673546 31.59835 0.00000 12.612446
4 1.386235 43.48071 12.61245 0.000000
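To see why the original data frame gave the same value for every pair, here is a minimal sketch that rebuilds it under a hypothetical name, df_bad (the temperature column is left out because it plays no part), and counts the distinct latitudes recorded for each station:

```r
# Same construction as in the question: station has 80 entries,
# but latitude and longitude only have 4, so data.frame() recycles them.
df_bad <- data.frame(
  station = rep(c("A", "B", "C", "D"), each = 20),
  latitude = c(40.7128, 34.0522, 41.8781, 39.9526),
  longitude = c(-74.0060, -118.2437, -87.6298, -75.1652)
)

# Number of distinct latitudes recorded for each station
n_lat <- tapply(df_bad$latitude, df_bad$station, function(x) length(unique(x)))
print(n_lat)

# Stations A and B end up with exactly the same latitude vector
same_coords <- identical(
  df_bad$latitude[df_bad$station == "A"],
  df_bad$latitude[df_bad$station == "B"]
)
print(same_coords)
```

Each station carries all four latitudes, and the vectors are identical across stations, so the loop compares the same set of numbers for every i and j.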
A slightly better way is to find the unique stations together with their coordinates:
unique_df <- unique(df[, c("station", "latitude", "longitude")])
You can then loop over that data frame instead, with no need for [1]:
# Loop through each station combination
for (i in 1:length(stations)) {
  for (j in 1:length(stations)) {
    # Get coordinates for the two stations
    lat1 <- unique_df$latitude[unique_df$station == stations[i]]
    lon1 <- unique_df$longitude[unique_df$station == stations[i]]
    lat2 <- unique_df$latitude[unique_df$station == stations[j]]
    lon2 <- unique_df$longitude[unique_df$station == stations[j]]
    my_points[i, j] <- as.vector(dist(matrix(c(lon1, lon2, lat1, lat2),
                                             nrow = 2)))
  }
}
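As a possible alternative to the nested loop, dist() can be applied once to the whole table of unique coordinates. This is only a sketch, not part of the approach above: coords and dist_mat are names introduced here for illustration, and the data frame is rebuilt so the block runs on its own.

```r
df <- data.frame(
  station = rep(c("A", "B", "C", "D"), each = 20),
  temperature = rnorm(80),
  latitude = rep(c(40.7128, 34.0522, 41.8781, 39.9526), each = 20),
  longitude = rep(c(-74.0060, -118.2437, -87.6298, -75.1652), each = 20)
)

# One row per station with its coordinates
unique_df <- unique(df[, c("station", "latitude", "longitude")])

# Numeric matrix of coordinates, labelled by station name
coords <- as.matrix(unique_df[, c("longitude", "latitude")])
rownames(coords) <- unique_df$station

# dist() returns all pairwise Euclidean distances between rows;
# as.matrix() expands the result into a full square matrix
dist_mat <- as.matrix(dist(coords))
distance_df <- as.data.frame(dist_mat)
print(distance_df)
```

The rows and columns are then labelled A to D instead of V1 to V4. Note that dist() computes plain Euclidean distance on the longitude and latitude values, so these numbers are in degrees, not kilometres.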