Group by unique ids to calculate number of days in R

February 24, 2022

I want to find how many days each unique id has worked in this df by subtracting the min date from the max date.

df <- structure(list(id = c("f1", "f2", "f2", "f4", "f5", "f2", "f3", 
"f1", "f1", "f3", "f4", "f2", "f2", "f2", "f2"), 
 date = structure(c(18687, 18687, 18687, 18687, 18687, 
18687, 18687, 18687, 18687, 18687, 18687, 18687, 18687, 18687, 
18687), class = "Date")), row.names = c(2L, 4L, 8L, 15L, 17L, 
18L, 21L, 25L, 36L, 37L, 38L, 40L, 42L, 48L, 52L), class = "data.frame")

I have tried this, but I get weird numbers:

df_total_days_per_id<-df %>%
  group_by(id) %>%
  mutate(xx1 = max(date)-min(date)) %>% #by user find range
  group_by(id) %>%
  summarise(number_of_days = sum(xx1))

>Solution :

Based on your code, I can guess why you are getting "weird numbers".

The function mutate() adds a new column to the existing table. Let’s forget about groups for a moment and assume the following dates:

2022-01-01 
2022-01-11 
2022-01-21

You want the answer to be 20, right? 21 minus 1. However, with your mutate() call you create this:

2022-01-01   20 days
2022-01-11   20 days
2022-01-21   20 days

and then in the last operation you sum all these 20s to reach 60 days. Obviously this is wrong.
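The effect can be reproduced with just those three dates. This is a minimal sketch that assumes dplyr is installed; the names `dates` and `with_range` are made up for illustration, and `as.integer()` is only there to turn the difftime into a plain number.

```r
library(dplyr)

# The three example dates from above, as a one-column data frame
dates <- data.frame(date = as.Date(c("2022-01-01", "2022-01-11", "2022-01-21")))

# mutate() keeps all three rows and repeats the range on each of them
with_range <- dates %>%
  mutate(xx1 = as.integer(max(date) - min(date)))

with_range$xx1        # 20 20 20
sum(with_range$xx1)   # 60, i.e. the range counted once per row
```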

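Instead of using mutate() we use summarize(), which returns a single row per group, like so:

df %>%
  group_by(id) %>%
  summarize(number_of_days = max(date) - min(date) + 1) # +1 so that both the first and the last day are counted

After this there is no need for another group_by() or sum(). Note that the +1 makes the span inclusive, so the three example dates above give 21 rather than 20; drop it if you want the plain difference.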
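To see the per-id result on data where the dates actually differ, here is a small sketch. The `toy` data frame is hypothetical (it is not the data from the question), and `as.integer()` is an optional addition that returns a plain integer column instead of a difftime.

```r
library(dplyr)

# Hypothetical data: f1 spans three weeks, f2 only has rows on one date
toy <- data.frame(
  id   = c("f1", "f1", "f1", "f2", "f2"),
  date = as.Date(c("2022-01-01", "2022-01-11", "2022-01-21",
                   "2022-03-05", "2022-03-05"))
)

# One row per id: inclusive span between the first and last date
span_per_id <- toy %>%
  group_by(id) %>%
  summarise(number_of_days = as.integer(max(date) - min(date)) + 1L)

span_per_id
# f1: 21 (1 Jan to 21 Jan, both ends counted)
# f2: 1  (all rows fall on the same date)
```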

Edit:

I noticed that you are talking about the number of days worked. If we take the example above, you might want the output to be 3 rather than 20, because the days in between were skipped. If that is the case, you should use a different function in your summarize() altogether.

For this we can use n_distinct(), which counts the number of unique values in a group, like so:

df %>%
  group_by(id) %>%
  summarise(number_of_days = n_distinct(date))
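The difference between counting rows, counting distinct dates and taking the span is easiest to see side by side. The sketch below uses a hypothetical `toy` data frame in which f1 has two rows on the same day; the column names `rows`, `distinct_days` and `span_in_days` are illustrative only.

```r
library(dplyr)

# Hypothetical data: f1 has two entries on 2022-01-01
toy <- data.frame(
  id   = c("f1", "f1", "f1", "f1", "f2"),
  date = as.Date(c("2022-01-01", "2022-01-01", "2022-01-11", "2022-01-21",
                   "2022-03-05"))
)

per_id <- toy %>%
  group_by(id) %>%
  summarise(
    rows          = n(),                                      # every row counts
    distinct_days = n_distinct(date),                         # each date counts once
    span_in_days  = as.integer(max(date) - min(date)) + 1L    # first to last day, inclusive
  )

per_id
# f1: rows = 4, distinct_days = 3, span_in_days = 21
# f2: rows = 1, distinct_days = 1, span_in_days = 1
```

On the data from the question every row has the same date, so both n_distinct(date) and the inclusive span would come out as 1 for each id.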

by MR
