
dplyr group_by retaining extra columns after summarise

June 21, 2022

I am at a total loss with this one. I am playing with the "pedestrian" dataset from tsibble and want to get total counts for each month/year. I started by adding a month_year column, then summarised the data with sum(), like so:

library("tidyverse")
library("tsibble")

df1 <- pedestrian
df1$month_year <- format(as.Date(df1$Date), "%Y-%m")

count_all <- df1 %>%  
  dplyr::group_by(month_year) %>% 
  dplyr::summarise(total = sum(Count))

A summary of count_all looks like this:

  month_year          Date_Time                         total      
 Length:17542       Min.   :2015-01-01 00:00:00.0   Min.   :   12  
 Class :character   1st Qu.:2015-07-02 17:15:00.0   1st Qu.:  349  
 Mode  :character   Median :2016-01-01 11:30:00.0   Median : 2090  
                    Mean   :2016-01-01 11:44:40.2   Mean   : 2593  
                    3rd Qu.:2016-07-02 04:45:00.0   3rd Qu.: 4455  
                    Max.   :2016-12-31 23:00:00.0   Max.   :15990

Why is Date_Time being retained? And how can I prevent it from affecting the summary (that is, from giving me 17,542 rows instead of the expected 24)? If I remove the column before the summary like so:

df1$Date_Time <- NULL

Then it works fine, and a summary of the result looks like this:

  month_year            total        
 Length:24          Min.   :1148276  
 Class :character   1st Qu.:1756898  
 Mode  :character   Median :1927154  
                    Mean   :1895161  
                    3rd Qu.:2066043  
                    Max.   :2393675

This solution is fine, but I would like to know the cause of the issue so that I can avoid it in future (it was easy to catch the problem this time, but it may not always be so straightforward).

Thanks in advance for the help!

Solution:

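The pedestrian dataset is a tsibble with Sensor as the key and Date_Time as the index. dplyr verbs applied to a tsibble keep the index, so summarise() aggregates within each Date_Time value instead of collapsing it, which is why you get one row per timestamp rather than one per month. You can drop the index by converting back to a tibble first.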

pedestrian %>%
  as_tibble() %>% 
  mutate(ym = yearmonth(Date)) %>% 
  dplyr::group_by(ym) %>% 
  dplyr::summarise(total = sum(Count))
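
To spot this situation in future, it may help to check whether an object is a tsibble and which columns it treats as key and index before summarising. If you would rather keep the result as a tsibble, index_by() is tsibble's way of regrouping the index at a coarser resolution. The following is a minimal sketch, assuming a recent tsibble and dplyr; the names monthly and month_year are only illustrative.

```r
library(dplyr)
library(tsibble)

# Inspect the object: a tsibble carries a key and an index that
# dplyr verbs will not silently drop.
is_tsibble(pedestrian)  # TRUE for a tsibble, FALSE for a plain tibble
key_vars(pedestrian)    # name(s) of the key column(s)
index_var(pedestrian)   # name of the index column

# Regroup the index by year-month, then summarise. In the formula,
# `.` refers to the index column (Date_Time here).
monthly <- pedestrian %>%
  index_by(month_year = ~ yearmonth(.)) %>%
  summarise(total = sum(Count))

monthly
```

Because no key grouping is applied, this sums across all sensors, which matches the as_tibble() version above. Adding group_by_key() before index_by() should give monthly totals per sensor instead.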


byMR
