Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Pick the cumulative max value in Panel data by groups

I have a panel dataset with thousands of observations. Let me simplify things:

library(dplyr)
x1 = c(NA,NA,NA,5.1,5.0,5.4,5.15,4.9,5,6)
x2 = c(5.9,5.85,5.8,6,5.95,5.98,5.99,6.1,6.15,6.14)
df = data.frame(countrycode = c(replicate(10,"ITA"), replicate(10,"UK")),
                year = c(replicate(2,2010:2019)), LogGdp = c(x1,x2))

Then, I’m looking for an efficient way to pick the maximum value of a given variable (by country code) which updates when a new maximum is found. The outcome should look like the record column below

z1 = c(NA,NA,NA,5.1,5.1,5.4,5.4,5.4,5.4,6) 
z2 = c(5.9,5.9,5.9,6,6,6,6,6.1,6.15,6.15)
df = data.frame(countrycode = c(replicate(10,"ITA"), replicate(10,"UK")),
                year = c(replicate(2,2010:2019)), LogGdp = c(x1,x2), record = c(z1,z2))

That’s my attempt:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

df %>%
  group_by(countrycode) %>%
  mutate(record = cummax(LogGdp))

>Solution :

You can replace missing values with -Inf, use cummax() to find cumulative maxima, and then convert -Inf back to NA with na_if().

df %>%
  mutate(record = replace(LogGdp, is.na(LogGdp), -Inf) %>% cummax %>% na_if(-Inf),
         .by = countrycode)

#    countrycode year LogGdp record
# 1          ITA 2010     NA     NA
# 2          ITA 2011     NA     NA
# 3          ITA 2012     NA     NA
# 4          ITA 2013   5.10   5.10
# 5          ITA 2014   5.00   5.10
# 6          ITA 2015   5.40   5.40
# 7          ITA 2016   5.15   5.40
# 8          ITA 2017   4.90   5.40
# 9          ITA 2018   5.00   5.40
# 10         ITA 2019   6.00   6.00
# 11          UK 2010   5.90   5.90
# 12          UK 2011   5.85   5.90
# 13          UK 2012   5.80   5.90
# 14          UK 2013   6.00   6.00
# 15          UK 2014   5.95   6.00
# 16          UK 2015   5.98   6.00
# 17          UK 2016   5.99   6.00
# 18          UK 2017   6.10   6.10
# 19          UK 2018   6.15   6.15
# 20          UK 2019   6.14   6.15
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading