Aggregation and mean calculation with dplyr

I have a chunk of code that aggregates the timestamps of a large dataset (see below). Each timestamp represents a tweet. The code aggregates the tweets per week and works fine. I also have a column with the sentiment value of each tweet, and I would like to know whether it is possible to calculate the mean sentiment of the tweets per week. Ideally, I would end up with a single dataset containing the number of tweets per week and the mean sentiment of those tweets. Please let me know if you have any hints 🙂

Kind regards,
Daniel

weekly_counts_2 <- df_bw %>% 
  drop_na(Timestamp) %>%             
  mutate(weekly_cases = floor_date(   
    Timestamp,
    unit = "week")) %>%            
  count(weekly_cases) %>%
  tidyr::complete(                
    weekly_cases = seq.Date(          
      from = min(weekly_cases),      
      to = max(weekly_cases),         
      by = "week"),                   
    fill = list(n = 0))

Solution:

It is difficult to verify the answer since no data has been shared, but based on the description, here is a solution you can try.

library(dplyr)
library(tidyr)
library(lubridate)

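weekly_counts_2 <- df_bw %>%
  drop_na(Timestamp) %>%
  # Assign each tweet to the start of its week
  mutate(weekly_cases = floor_date(Timestamp, unit = "week")) %>%
  group_by(weekly_cases) %>%
  # One row per week: mean sentiment and number of tweets
  summarise(mean_sentiment = mean(sentiment_value, na.rm = TRUE),
            count = n()) %>%
  # Add weeks without tweets; count becomes 0, mean_sentiment stays NA
  complete(weekly_cases = seq.Date(min(weekly_cases),
                                   max(weekly_cases), by = "week"),
           fill = list(count = 0))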

I have assumed the column with the sentiment value is called sentiment_value; change it to match your data.
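Since no data was shared, here is a small sketch on made-up data to check how the pipeline behaves. The toy df_bw, its values, and the name weekly_summary are assumptions for illustration only, and Timestamp is assumed to be of class Date (seq.Date requires that). Note that fill names the count column, so weeks without tweets get a count of 0, while their mean sentiment stays NA.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Hypothetical toy data: four dated tweets plus one with a missing timestamp.
# The week starting 2023-01-08 has no tweets at all.
df_bw <- tibble(
  Timestamp = as.Date(c("2023-01-02", "2023-01-03", "2023-01-04",
                        "2023-01-17", NA)),
  sentiment_value = c(0.5, -0.1, 0.2, 0.8, 0.3)
)

weekly_summary <- df_bw %>%
  drop_na(Timestamp) %>%
  # floor_date() uses Sunday as the week start by default
  mutate(weekly_cases = floor_date(Timestamp, unit = "week")) %>%
  group_by(weekly_cases) %>%
  summarise(
    mean_sentiment = mean(sentiment_value, na.rm = TRUE),
    count = n()
  ) %>%
  # Insert the missing week; only count is filled, mean_sentiment stays NA
  complete(
    weekly_cases = seq.Date(min(weekly_cases), max(weekly_cases), by = "week"),
    fill = list(count = 0)
  )

print(weekly_summary)
```

The result should have one row per week between the first and last tweet, including the empty week. If you prefer weeks to start on Monday, floor_date() accepts week_start = 1.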
