Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Averaging df columns/rows by year

I have the following df and i wish to average the values in each year to leave me with 1 value for each of the years in the df (1 value for overall and 1 for eastern here), this would leave me with a df of 3 columns and 3 rows on this example. I tried to use tapply for this but had no joy as yet

df 
    date       overall  eastern
1   1997-12-15  17.548  18.751
2   1998-01-15  16.189  17.155
3   1998-02-15  15.667  16.504
4   1998-03-15  15.509  16.208
5   1998-04-15  16.709  17.702
6   1998-05-15  18.822  19.660
7   1998-06-15  22.722  23.512
8   1998-07-15  25.372  25.912
9   1998-08-15  26.597  27.226
10  1998-09-15  25.256  26.151
11  1998-10-15  22.857  24.440
12  1998-11-15  20.242  21.867
13  1998-12-15  17.179  18.797
14  1999-01-15  16.003  17.206
15  1999-02-15  15.140  16.345
16  1999-03-15  15.522  16.689
17  1999-04-15  16.537  17.521
18  1999-05-15  19.658  20.740
19  1999-06-15  23.245  24.161
20  1999-07-15  25.313  26.053
21  1999-08-15  26.753  27.112
22  1999-09-15  26.040  26.597
23  1999-10-15  23.843  24.940
24  1999-11-15  20.940  22.375
25  1999-12-15  17.842  19.439

>Solution :

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

We convert the ‘date’ to Date class (with lubridate::ymd or as.Date from base R), extract the year as grouping variable and summarise across the columns to get the mean value

library(dplyr)
library(lubridate)
df1 %>%
    group_by(year = year(ymd(date))) %>%
    summarise(across(overall:eastern, mean, na.rm = TRUE))

-output

# A tibble: 3 × 3
   year overall eastern
  <dbl>   <dbl>   <dbl>
1  1997    17.5    18.8
2  1998    20.3    21.3
3  1999    20.6    21.6

If we also want by seasons, create a key value dataset with month and seasons values, join and do a group by mean

keydat <- tibble(seasons = rep(c("Winter", "Spring", "Summer", "Fall"),
    each = 3), month = c("Dec", month.abb[-length(month.abb)]))
df1 %>%
   mutate(date = as.Date(date), month = format(date, '%b'),
   year = format(date, '%Y')) %>% 
  left_join(keydat) %>% 
  group_by(year, seasons) %>% 
  summarise(across(c(overall, eastern), mean, na.rm = TRUE), 
    .groups = 'drop')

-output

# A tibble: 9 × 4
  year  seasons overall eastern
  <chr> <chr>     <dbl>   <dbl>
1 1997  Winter     17.5    18.8
2 1998  Fall       22.8    24.2
3 1998  Spring     17.0    17.9
4 1998  Summer     24.9    25.6
5 1998  Winter     16.3    17.5
6 1999  Fall       23.6    24.6
7 1999  Spring     17.2    18.3
8 1999  Summer     25.1    25.8
9 1999  Winter     16.3    17.7

Or in base R

aggregate(.~ year, transform(df1, year = format(as.Date(date), 
   '%Y'))[-1], FUN = mean)
  year  overall  eastern
1 1997 17.54800 18.75100
2 1998 20.26008 21.26117
3 1999 20.56967 21.59817

data

df1 <- structure(list(date = c("1997-12-15", "1998-01-15", "1998-02-15", 
"1998-03-15", "1998-04-15", "1998-05-15", "1998-06-15", "1998-07-15", 
"1998-08-15", "1998-09-15", "1998-10-15", "1998-11-15", "1998-12-15", 
"1999-01-15", "1999-02-15", "1999-03-15", "1999-04-15", "1999-05-15", 
"1999-06-15", "1999-07-15", "1999-08-15", "1999-09-15", "1999-10-15", 
"1999-11-15", "1999-12-15"), overall = c(17.548, 16.189, 15.667, 
15.509, 16.709, 18.822, 22.722, 25.372, 26.597, 25.256, 22.857, 
20.242, 17.179, 16.003, 15.14, 15.522, 16.537, 19.658, 23.245, 
25.313, 26.753, 26.04, 23.843, 20.94, 17.842), eastern = c(18.751, 
17.155, 16.504, 16.208, 17.702, 19.66, 23.512, 25.912, 27.226, 
26.151, 24.44, 21.867, 18.797, 17.206, 16.345, 16.689, 17.521, 
20.74, 24.161, 26.053, 27.112, 26.597, 24.94, 22.375, 19.439)), 
class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", 
"14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", 
"25"))
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading