Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

R create multiple columns in one mutate command

I have a data.frame like this.

library(tidyverse)

df <- tibble(
  name = rep(c("a", "b"), each = 100),
  value = runif(100*2),
  date = rep(Sys.Date() + days(1:100), 2)
  )

I would like to do something very similar to the code below. Is there a way to create these 10 columns in one go? Basically, I am trying to find out how much does 99th percent quantile change if we remove one observation, and then 2, and then 3 and so on.

df %>%
  nest_by(name) %>%
  mutate(
    q99_lag_0 = data %>% pull(value) %>% quantile(.99),
    q99_lag_1 = data %>% pull(value) %>% tail(-1) %>% quantile(.99),
    q99_lag_2 = data %>% pull(value) %>% tail(-2) %>% quantile(.99),
    q99_lag_3 = data %>% pull(value) %>% tail(-3) %>% quantile(.99),
    q99_lag_4 = data %>% pull(value) %>% tail(-4) %>% quantile(.99),
    q99_lag_5 = data %>% pull(value) %>% tail(-5) %>% quantile(.99),
    q99_lag_6 = data %>% pull(value) %>% tail(-6) %>% quantile(.99),
    q99_lag_7 = data %>% pull(value) %>% tail(-7) %>% quantile(.99),
    q99_lag_8 = data %>% pull(value) %>% tail(-8) %>% quantile(.99),
    q99_lag_9 = data %>% pull(value) %>% tail(-9) %>% quantile(.99),
    q99_lag_10 = data %>% pull(value) %>% tail(-10) %>% quantile(.99)
  )

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

First, reproducible random data:

library(dplyr)
library(purrr) # map_dfx
set.seed(42)
df <- tibble(
  name = rep(c("a", "b"), each = 100),
  value = runif(100*2),
  date = rep(Sys.Date() + 1:100, 2)
)
head(df)
# # A tibble: 6 x 3
#   name  value date      
#   <chr> <dbl> <date>    
# 1 a     0.915 2021-12-14
# 2 a     0.937 2021-12-15
# 3 a     0.286 2021-12-16
# 4 a     0.830 2021-12-17
# 5 a     0.642 2021-12-18
# 6 a     0.519 2021-12-19

Then the call:

df %>%
  nest_by(name) %>%
  mutate(
    q99_lag_0 = quantile(data$value, 0.99),
    map_dfc(-1:-10, ~ tibble("q99_lag_{-.x}" := quantile(tail(data$value, .x), 0.99)))
  ) %>%
  ungroup()
# # A tibble: 2 x 13
#   name                data q99_lag_0 q99_lag_1 q99_lag_2 q99_lag_3 q99_lag_4 q99_lag_5 q99_lag_6 q99_lag_7 q99_lag_8 q99_lag_9 q99_lag_10
#   <chr> <list<tbl_df[,2]>>     <dbl>     <dbl>     <dbl>     <dbl>     <dbl>     <dbl>     <dbl>     <dbl>     <dbl>     <dbl>      <dbl>
# 1 a              [100 x 2]     0.983     0.983     0.983     0.983     0.983     0.983     0.983     0.983     0.983     0.983      0.983
# 2 b              [100 x 2]     0.963     0.963     0.963     0.963     0.963     0.963     0.946     0.946     0.946     0.947      0.947
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading