I have a list of customers who viewed a house and who bought a house.
I’d like to group_by customer and filter for customers who bought a house within a month of viewing.
Example:
library(tibble)  # provides tibble()

customer <- c(1, 2, 3, 3, 4, 4, 4, 5)
action <- c("view", "view", "view", "buy", "view", "view", "buy", "view")
date <- c("2022/01/01", "2022/03/01", "2022/01/01", "2022/12/01", "2022/01/01", "2022/03/01", "2022/03/31", "2022/01/01")
df <- tibble(customer, action, date)
In this case I’d like the filter to return customer 4, who viewed twice and bought within a month of the second viewing.
thanks!
Solution:
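library(tidyverse)
library(lubridate)

df %>%
  # Parse the "YYYY/MM/DD" strings into Date objects
  mutate(date = as.Date(date, format = "%Y/%m/%d")) %>%
  # One row per customer; view and buy become list-columns of dates
  pivot_wider(names_from = action,
              values_from = date,
              values_fn = list) %>%
  # Expand the list-columns so each viewing is paired with the purchase
  unnest(everything()) %>%
  # Length of the view-to-buy interval in months
  mutate(diff = interval(view, buy) %>%
           as.numeric("months")) %>%
  # Keep purchases made on or after the viewing and less than a month later
  filter(diff >= 0, diff < 1)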
# A tibble: 1 x 4
  customer view       buy         diff
     <dbl> <date>     <date>     <dbl>
1        4 2022-03-01 2022-03-31 0.986
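
The pivot approach pairs viewings and purchases by position, so it may not behave as intended when a customer has several purchases. As a hedged alternative (not part of the answer above), here is a minimal sketch that joins every viewing to every purchase for the same customer. It assumes "within a month" means one calendar month, which differs slightly from the average-month length used by `as.numeric("months")`. The names `views`, `buys`, and `result` are illustrative.

```r
library(dplyr)
library(lubridate)

# Same example data as in the question, with dates parsed up front
df <- tibble(
  customer = c(1, 2, 3, 3, 4, 4, 4, 5),
  action   = c("view", "view", "view", "buy", "view", "view", "buy", "view"),
  date     = as.Date(c("2022/01/01", "2022/03/01", "2022/01/01", "2022/12/01",
                       "2022/01/01", "2022/03/01", "2022/03/31", "2022/01/01"),
                     format = "%Y/%m/%d")
)

# Split into one table of viewings and one of purchases
views <- df %>% filter(action == "view") %>% select(customer, view = date)
buys  <- df %>% filter(action == "buy")  %>% select(customer, buy = date)

# Pair each viewing with each purchase by the same customer, then keep
# pairs where the purchase falls within one calendar month after the viewing
# (assumption: calendar month, via lubridate's %m+%)
result <- views %>%
  inner_join(buys, by = "customer") %>%
  filter(buy >= view, buy <= view %m+% months(1))

result
```

For the example data this keeps only customer 4's second viewing (2022-03-01) paired with the purchase on 2022-03-31. Adding `distinct(customer)` at the end would reduce the result to just the customer IDs.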