group_by() and mutate() do not match sizes

October 20, 2022

I have a large data table with multiple columns and a custom function. The data table looks something like this, and bird_ID takes eight different values:

   GPS_ID bird_ID device_ID devicetype           timestamp       date
1:     NA    350E    202927   ornitela 2022-05-02 00:03:59 2022-05-02
2:     NA    350E    202927   ornitela 2022-05-02 00:03:59 2022-05-02
3:     NA    350E    202927   ornitela 2022-05-02 00:03:59 2022-05-02
4:     NA    350E    202927   ornitela 2022-05-02 00:03:59 2022-05-02
5:     NA    350E    202927   ornitela 2022-05-02 00:03:59 2022-05-02
6:     NA    350E    202927   ornitela 2022-05-02 00:03:59 2022-05-02

The custom function calculates the time difference between the timestamps of consecutive rows and assigns a number in a new column named Position.Burst.ID. If the difference is at least 5 seconds, the number advances; otherwise the row keeps the previously assigned number.

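pbid <- function(data_table) {
  # Gaps between consecutive rows, forced to seconds so the comparison never depends on auto-chosen units
  gaps <- as.numeric(diff(as.POSIXct(data_table$timestamp, tz = "UTC")), units = "secs")
  # Row indices where a new burst starts (the first row always starts one)
  newbout <- which(c(TRUE, gaps >= 5))
  # Repeat each burst number once for every row in that burst
  boutind <- rep(seq_along(newbout), diff(c(newbout, nrow(data_table) + 1)))
  # Return the vector of burst IDs explicitly
  boutind
}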

This function works great with one bird_ID.

   GPS_ID bird_ID device_ID devicetype           timestamp       date Position.Burst.ID   
1:     NA    350E    202927   ornitela 2022-05-02 00:03:59 2022-05-02                 1
2:     NA    350E    202927   ornitela 2022-05-02 00:03:59 2022-05-02                 1
3:     NA    350E    202927   ornitela 2022-05-02 00:03:59 2022-05-02                 1
4:     NA    350E    202927   ornitela 2022-05-02 00:03:59 2022-05-02                 1
5:     NA    350E    202927   ornitela 2022-05-02 00:03:59 2022-05-02                 1
6:     NA    350E    202927   ornitela 2022-05-02 00:03:59 2022-05-02                 1
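
To make the numbering rule concrete, here is a minimal base R sketch that applies the same rule to a made-up vector of timestamps. The helper name burst_id, its gap_secs argument, and the sample times are illustrative assumptions, not part of the original data. It uses cumsum() on the "new burst" flags, which should give the same numbering as the which()/rep() construction above.

```r
# Hypothetical helper (name and arguments are assumptions for illustration):
# numbers bursts in a vector of timestamps sorted in ascending order.
burst_id <- function(timestamp, gap_secs = 5) {
  if (length(timestamp) == 0) return(integer(0))
  ts <- as.POSIXct(timestamp, tz = "UTC")
  # Gaps between consecutive rows, forced to seconds
  gaps <- as.numeric(diff(ts), units = "secs")
  # TRUE marks the first row of each burst; cumsum() turns the marks into IDs
  cumsum(c(TRUE, gaps >= gap_secs))
}

# Made-up timestamps with gaps of 2 s, 10 s, 1 s and 60 s
ts <- c("2022-05-02 00:03:59", "2022-05-02 00:04:01",
        "2022-05-02 00:04:11", "2022-05-02 00:04:12",
        "2022-05-02 00:05:12")
ids <- burst_id(ts)
print(ids)
# → 1 1 2 2 3
```

Only the 10 s and 60 s gaps reach the threshold, so the five rows fall into three bursts.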

I wanted to group_by(bird_ID) so that the numbering restarts from 1 for each bird_ID:

data_table %>%
  group_by(bird_ID) %>%
  mutate(Position.Burst.ID = pbid(data_table))

That didn't work; it fails with this error:

`Position.Burst.ID` must be size 419335 or 1, not 4592293.

Any ideas on how to approach this?

I have already tried putting the function inside a loop, but that was also a dead end, and I would really like to avoid a for loop with this amount of data.

Solution:

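The error occurs because pbid(data_table) is handed the entire table (4592293 rows) for every group, while mutate() expects a result as long as the current group (419335 rows for the first bird_ID). Inside mutate(), refer to the column itself, which dplyr evaluates per group. Here's how I'd do it: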

data_table %>%
  group_by(bird_ID) %>%
  mutate(Position.Burst.ID = cumsum(timestamp - lag(timestamp, default = timestamp[1]) >= 5) + 1)
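
As a quick check of the grouped approach, here is a small self-contained sketch on made-up data for two birds. The toy tibble, the second bird ID "351F", and the intermediate gap column are assumptions for illustration only. It assumes timestamp is POSIXct and already sorted within each bird, and it uses difftime(..., units = "secs") to make the unit explicit.

```r
library(dplyr)

# Made-up data: two birds, timestamps sorted within each bird
toy <- tibble(
  bird_ID = c("350E", "350E", "350E", "351F", "351F"),
  timestamp = as.POSIXct(
    c("2022-05-02 00:03:59", "2022-05-02 00:04:00", "2022-05-02 00:04:10",
      "2022-05-02 00:03:59", "2022-05-02 00:04:30"),
    tz = "UTC"
  )
)

result <- toy %>%
  group_by(bird_ID) %>%
  mutate(
    # Seconds since the previous row of the same bird (0 for the bird's first row)
    gap = as.numeric(difftime(timestamp,
                              lag(timestamp, default = first(timestamp)),
                              units = "secs")),
    # Every gap of at least 5 s starts a new burst; numbering restarts per bird
    Position.Burst.ID = cumsum(gap >= 5) + 1
  ) %>%
  ungroup()

print(result$Position.Burst.ID)
# → 1 1 2 1 2
```

Because mutate() evaluates column names within each group, a helper that takes the timestamp vector rather than the whole table could be called the same way, for example some_helper(timestamp), where some_helper is a placeholder name.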

dplyr

by MR

Published October 20, 2022
