I have the following data frame:
tibble(
company = rep(x = "google", each = 15),
date = c(
"2019q1",
"2019q2",
"2019q3",
"2019q4",
"2019end",
"2020q1",
"2020q2",
"2020q3",
"2020q4",
"2020end",
"2021q1",
"2021q2",
"2021q3",
"2021q4",
"2021end"
),
values = c(1:15)
)
How can i apply the following calculation to this data frame:
This will yield a new column called new_value
e.g. for the first row of 2019q1 this would be:
new_value = 2019q1 + 2018end + 2018q1
More detailed result post 2019 years to follow
I am aware 2018 values do not exist so this will be simply NA
but when you get to 2020q1 the calculation would be this:
2020q1 + 2019end + 2019q1
2020q2 + 2019end + 2019q2
2020q3 + 2019end + 2019q3
2020q4 + 2019end + 2019q4
2020end + 2019end + 2020end
And so on and so forth.
In essence the formula per row is the following:
current date + end date value of previous year + previous date value
I have had some issues with trying to do this in long format, pivotting the data doesn’t yield the desired result either
>Solution :
Here is a solution using left_joins of mutated versions of the dataframe to align the correct values in each row, prior to a final rowwise addition and cleanup.
library(tidyverse)
tibble(
company = rep(x = "google", each = 15),
date = c(
"2019q1",
"2019q2",
"2019q3",
"2019q4",
"2019end",
"2020q1",
"2020q2",
"2020q3",
"2020q4",
"2020end",
"2021q1",
"2021q2",
"2021q3",
"2021q4",
"2021end"
),
values = c(1:15)
) %>%
mutate(date = str_replace(date, "end", "q5")) %>%
separate(date, into = c("year", "quarter"), sep = "q") %>%
mutate(across(c("year", "quarter", "values"), as.integer)) %>%
left_join(., group_by(., year) %>%
slice_tail(n = 1) %>%
mutate(year = year + 1), by = "year") %>%
left_join(., mutate(., year = year + 1), by = c("year", "quarter.x")) %>%
mutate(new_values = values.x.x + values.y.x + values.x.y) %>%
select(company.x.x, year, quarter.x, values.x.x, new_values) %>%
rename_with(~str_remove(., "\\..+"))
# # A tibble: 15 × 5
# company year quarter values new_values
# <chr> <dbl> <int> <int> <int>
# 1 google 2019 1 1 NA
# 2 google 2019 2 2 NA
# 3 google 2019 3 3 NA
# 4 google 2019 4 4 NA
# 5 google 2019 5 5 NA
# 6 google 2020 1 6 12
# 7 google 2020 2 7 14
# 8 google 2020 3 8 16
# 9 google 2020 4 9 18
# 10 google 2020 5 10 20
# 11 google 2021 1 11 27
# 12 google 2021 2 12 29
# 13 google 2021 3 13 31
# 14 google 2021 4 14 33
# 15 google 2021 5 15 35