I have a dataset with 30 years of daily air temperature for a city. The dataset looks like this:
Year Julian_date temperature
1991 1 2.1
1991 2 2.2
... ... ...
1991 365 2.3
1992 1 2.1
... ... ...
1992 365 2.5
... ... ...
2020 366 2.5
I would like to calculate the 90th percentile of temperature for each Julian date (across all years) and return the results like this:
Julian_date value(the 90th percentile)
1 2.4
2 2.6
... ...
365 2.5
How should I write the code in R?
>Solution :
You can first group by Julian_date, then call quantile() with probs = 0.9 inside summarise() to get the 90th percentile of temperature for each day across all years.
library(tidyverse)

df %>%
  group_by(Julian_date) %>%   # one group per day of year, pooling all years
  summarise("value (the 90th percentile)" = quantile(temperature, probs = 0.9, na.rm = TRUE))
Output
  Julian_date `value (the 90th percentile)`
        <int>                         <dbl>
1           1                           2.1
2           2                           2.2
3         365                           2.5
Data
df <- structure(list(Year = c(1991L, 1991L, 1991L, 1992L, 1992L, 2020L
), Julian_date = c(1L, 2L, 365L, 1L, 365L, 365L), temperature = c(2.1,
2.2, 2.3, 2.1, 2.5, 2.5)), class = "data.frame", row.names = c(NA,
-6L))
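
If you would rather not load the tidyverse, the same grouping can be sketched in base R with aggregate(). This is not part of the answer above, just an alternative under the same sample data. The result column name p90 is my own choice, and unname() is only there to drop the "90%" label that quantile() attaches.

```r
# Same sample data as above, written out as a plain data.frame
df <- data.frame(
  Year        = c(1991L, 1991L, 1991L, 1992L, 1992L, 2020L),
  Julian_date = c(1L, 2L, 365L, 1L, 365L, 365L),
  temperature = c(2.1, 2.2, 2.3, 2.1, 2.5, 2.5)
)

# One row per Julian_date: 90th percentile of temperature across years
p90 <- aggregate(
  temperature ~ Julian_date,
  data = df,
  FUN  = function(x) unname(quantile(x, probs = 0.9, na.rm = TRUE))
)

# Rename the value column (p90 is an illustrative name)
names(p90)[2] <- "p90"
p90
```

Note that Julian date 366 only occurs in leap years, so its percentile is based on far fewer values than the other days.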