I’m using a dataset from the R package track2KBA, which contains tracking data for a seabird species. I want to measure the time difference between consecutive relocations, grouped by individual bird.
But when I run my script, I don’t get the differences I’d expect. For instance, I expected the first difference to be 6 seconds.
track_id date_gmt time_gmt longitude latitude lon_colony lat_colony datetime difference
<int> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dttm> <drtn>
1 69303 2012-07-21 11:01:54 -5.73 -16.0 -5.73 -16.0 2012-07-21 11:01:54 NA secs
2 69302 2012-07-21 11:02:00 -5.73 -16.0 -5.73 -16.0 2012-07-21 11:02:00 NA secs
3 69303 2012-07-21 11:03:33 -5.73 -16.0 -5.73 -16.0 2012-07-21 11:03:33 99 secs
4 69302 2012-07-21 11:03:42 -5.73 -16.0 -5.73 -16.0 2012-07-21 11:03:42 102 secs
5 69303 2012-07-21 11:05:13 -5.73 -16.0 -5.73 -16.0 2012-07-21 11:05:13 100 secs
6 69302 2012-07-21 11:05:26 -5.73 -16.0 -5.73 -16.0 2012-07-21 11:05:26 104 secs
Here’s my code:
library(track2KBA)
library(tidyverse)
library(lubridate)
boobies$datetime <- paste(boobies$date_gmt, boobies$time_gmt)
boobies <- boobies %>%
  mutate(datetime = lubridate::ymd_hms(datetime)) %>%
  group_by(track_id) %>%
  arrange(datetime) %>%
  mutate(difference = datetime - lag(datetime))
And some sample data which comes from the package:
boobies <- structure(list(track_id = c(69303L, 69302L, 69303L, 69302L, 69303L,
69302L), date_gmt = c("2012-07-21", "2012-07-21", "2012-07-21",
"2012-07-21", "2012-07-21", "2012-07-21"), time_gmt = c("11:01:54",
"11:02:00", "11:03:33", "11:03:42", "11:05:13", "11:05:26"),
longitude = c(-5.72769, -5.72639, -5.72769, -5.72635, -5.72769,
-5.72639), latitude = c(-16.00749, -16.00713, -16.00749,
-16.00723, -16.00749, -16.0071), lon_colony = c(-5.73, -5.73,
-5.73, -5.73, -5.73, -5.73), lat_colony = c(-16.01, -16.01,
-16.01, -16.01, -16.01, -16.01), datetime = c("2012-07-21 11:01:54",
"2012-07-21 11:02:00", "2012-07-21 11:03:33", "2012-07-21 11:03:42",
"2012-07-21 11:05:13", "2012-07-21 11:05:26")), row.names = c(NA, 6L), class = c("data.table", "data.frame"))
>Solution :
The issue is with your expectation rather than with the code. The result you got (with two NAs at the start of the difference column) appears to be the correct one for a grouped calculation, because the first two rows are the first data points of two different track_ids (which I presume correspond to birds). Neither has an earlier point in its own group to refer back to, hence both are NA. The 6 seconds you expected is the gap between row 1 and row 2, which belong to different birds.
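To make the arithmetic concrete, here is a minimal base-R sketch using only the track_id and timestamp values from the sample data. It is an illustration rather than the code used in this answer; the variable names (secs, ungrouped, grouped) are made up for the example, and the timestamps are assumed to be UTC.

```r
# Timestamps and bird IDs copied from the sample data in the question
track_id <- c(69303L, 69302L, 69303L, 69302L, 69303L, 69302L)
datetime <- as.POSIXct(
  c("2012-07-21 11:01:54", "2012-07-21 11:02:00", "2012-07-21 11:03:33",
    "2012-07-21 11:03:42", "2012-07-21 11:05:13", "2012-07-21 11:05:26"),
  tz = "UTC"  # assumption: the *_gmt columns are in UTC
)

# Work in plain seconds so diff() returns ordinary numbers
secs <- as.numeric(datetime)

# Seconds since the previous row, ignoring which bird each row belongs to
ungrouped <- c(NA, diff(secs))

# Seconds since the previous row of the same bird; ave() applies the function
# within each track_id and returns the values in the original row order
grouped <- ave(secs, track_id, FUN = function(x) c(NA, diff(x)))

print(ungrouped)  # NA 6 93 9 91 13
print(grouped)    # NA NA 99 102 100 104
```

The first value per bird is always NA in the grouped version, which is why two NAs appear when there are two birds.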
Anyway, here are the two ways of doing it: grouped and ungrouped.
library(tidyverse)
# not grouped by track_id (this gets the 6 second difference you were looking for)
mutate(boobies, difference = difftime(datetime, lag(datetime), units = "secs"))
# Output
track_id date_gmt time_gmt longitude latitude lon_colony lat_colony
<int> <date> <chr> <dbl> <dbl> <dbl> <dbl>
1 69303 2012-07-21 11:01:54 -5.73 -16.0 -5.73 -16.0
2 69302 2012-07-21 11:02:00 -5.73 -16.0 -5.73 -16.0
3 69303 2012-07-21 11:03:33 -5.73 -16.0 -5.73 -16.0
4 69302 2012-07-21 11:03:42 -5.73 -16.0 -5.73 -16.0
5 69303 2012-07-21 11:05:13 -5.73 -16.0 -5.73 -16.0
6 69302 2012-07-21 11:05:26 -5.73 -16.0 -5.73 -16.0
datetime difference
<dttm> <drtn>
1 2012-07-21 11:01:54 NA secs
2 2012-07-21 11:02:00 6 secs
3 2012-07-21 11:03:33 93 secs
4 2012-07-21 11:03:42 9 secs
5 2012-07-21 11:05:13 91 secs
6 2012-07-21 11:05:26 13 secs
# grouped by track_id (the .by argument requires dplyr 1.1.0 or later)
mutate(boobies, difference = difftime(datetime, lag(datetime), units = "secs"), .by = track_id)
# Output:
# A tibble: 6 × 9
track_id date_gmt time_gmt longitude latitude lon_colony lat_colony
<int> <date> <chr> <dbl> <dbl> <dbl> <dbl>
1 69303 2012-07-21 11:01:54 -5.73 -16.0 -5.73 -16.0
2 69302 2012-07-21 11:02:00 -5.73 -16.0 -5.73 -16.0
3 69303 2012-07-21 11:03:33 -5.73 -16.0 -5.73 -16.0
4 69302 2012-07-21 11:03:42 -5.73 -16.0 -5.73 -16.0
5 69303 2012-07-21 11:05:13 -5.73 -16.0 -5.73 -16.0
6 69302 2012-07-21 11:05:26 -5.73 -16.0 -5.73 -16.0
datetime difference
<dttm> <drtn>
1 2012-07-21 11:01:54 NA secs
2 2012-07-21 11:02:00 NA secs
3 2012-07-21 11:03:33 99 secs
4 2012-07-21 11:03:42 102 secs
5 2012-07-21 11:05:13 100 secs
6 2012-07-21 11:05:26 104 secs
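The snippets above assume that datetime has already been parsed and that the rows are in time order. As a hedged end-to-end sketch, the following starts from the raw date_gmt and time_gmt strings, parses them, sorts by bird and time, and then computes the per-bird difference. The data frame is a cut-down copy of the sample data (only the columns needed here), and result is a name chosen for the example.

```r
library(dplyr)

# Sample rows copied from the question (only the columns needed here)
boobies <- data.frame(
  track_id = c(69303L, 69302L, 69303L, 69302L, 69303L, 69302L),
  date_gmt = rep("2012-07-21", 6),
  time_gmt = c("11:01:54", "11:02:00", "11:03:33",
               "11:03:42", "11:05:13", "11:05:26")
)

result <- boobies %>%
  # Parse date and time into one date-time column (ymd_hms defaults to UTC)
  mutate(datetime = lubridate::ymd_hms(paste(date_gmt, time_gmt))) %>%
  # Sort by bird, then time, so lag() always sees that bird's previous fix
  arrange(track_id, datetime) %>%
  # .by needs dplyr >= 1.1.0; on older versions use group_by(track_id) first
  mutate(
    difference = difftime(datetime, lag(datetime), units = "secs"),
    .by = track_id
  )

print(result)
```

Sorting by track_id first changes the row order compared with the output above, but the per-bird differences are the same: NA, 102 and 104 seconds for 69302, and NA, 99 and 100 seconds for 69303.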