Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Number of months since last email received

I have the below table that shows a monthly view of customers that received an email in a particular month.

cust_id eom_date email_sent
1 2023-06-30 1
1 2023-05-31
1 2023-04-30 1
1 2023-03-31
1 2023-02-28
1 2023-01-31 1
2 2023-06-30 1
2 2023-05-31
2 2023-04-30 1
2 2023-03-31
2 2023-02-28
2 2023-01-31
2 2022-12-31 1

I need to find out the number of months that has passed since the last email a customer received. This is the desired outcome:

cust_id eom_date email_sent months_since_last_email
1 2023-06-30 1 0
1 2023-05-31 1
1 2023-04-30 1 0
1 2023-03-31 2
1 2023-02-28 1
1 2023-01-31 1 0
2 2023-06-30 1 0
2 2023-05-31 1
2 2023-04-30 1 0
2 2023-03-31 3
2 2023-02-28 2
2 2023-01-31 1
2 2022-12-31 1 0

I tried the below in BigQuery but it only gives me cust_ids for previous periods, basically getting nowhere:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

SELECT cust_id, 
       eom_date,
       lag(cust_id, 1) over (
         partition by cust_id order by eom_date) t1,
       lag(cust_id, 2) over (
         partition by cust_id order by eom_date) t2
  FROM table

Thanks in advance.

>Solution :

This is for SQL Server. Don’t have BigQuery, but these are pretty commonly supported functions.

The general idea should work for BigQuery using date_diff instead of datediff

Fiddle: https://dbfiddle.uk/Hor4MnrD

CREATE TABLE your_table_name (
    cust_id INT,
    eom_date DATE,
    email_sent INT
);

INSERT INTO your_table_name (cust_id, eom_date, email_sent)
VALUES 
(1, '2023-06-30', 1),
(1, '2023-05-31', NULL),
(1, '2023-04-30', 1),
(1, '2023-03-31', NULL),
(1, '2023-02-28', NULL),
(1, '2023-01-31', 1),
(2, '2023-06-30', 1),
(2, '2023-05-31', NULL),
(2, '2023-04-30', 1),
(2, '2023-03-31', NULL),
(2, '2023-02-28', NULL),
(2, '2023-01-31', NULL),
(2, '2022-12-31', 1);

Option A: Correlated subquery. Find the last email sent by cust_id and take the date diff.

select *,
       datediff(
       month,
       (
         select max(t2.eom_date)
           from your_table_name t2
          where t1.cust_id = t2.cust_id
            and t1.eom_date >= t2.eom_date
            and t2.email_sent = 1
          group
             by t2.cust_id
       ),
       eom_date) as months_since_last_email
  from your_table_name t1
 order
    by cust_id,
       eom_date desc;

Option B: Window function. Send non-email months as null, so the max in the unbounded preceding window will be the last sent date.

select *,
       datediff(
       month,
       max(case when email_sent = 1 then 
                     eom_date
            end
          ) over ( partition by cust_id
                       order by eom_date
                      rows between unbounded preceding and current row
                 ),
       eom_date) as months_since_last_email
  from your_table_name t1
 order
    by cust_id,
       eom_date desc;
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading