Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Calculate time span with different patterns

The calculation of a time span with gaps and islands was already discussed.
But what about different patterns of time spans present?

The present data is a combination of gaps between time spans and contiguous time spans. These are available in different forms.
ID 1 has 5 time spans of which 4 lie within the first time span from 2000-08-08 to 2019-03-31 but for ID 2 the time spans are present in such a way that the next time span in the next line falls within the end date of the following line. ID 3 has a gap between the first enddate and the following startdate.

My code works just for ID 2 and 3.
How do I find the time span for ID 1 as well?

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

In exactly this case my code does not work.

The problem occurs with ID 1 as the code is not able to solve it with 1 iteration, because the data is present in a different way.

id startdate enddate
1 2000-08-08 2019-03-31
1 2007-06-08 2007-09-09
1 2008-02-08 2011-08-04
1 2012-08-01 2012-09-20
1 2018-02-01 2019-03-31
2 2013-02-18 2019-03-28
2 2018-04-12 2020-07-28
3 2015-01-11 2015-04-02
3 2016-02-08 2021-11-22

db<>fiddle

My code:

WITH cte1 AS (
  SELECT
    id,
    startdate,
    enddate,
    CASE
      WHEN LAG(enddate) OVER (PARTITION BY id ORDER BY startdate) >= DATEADD(day, -1, startdate) THEN 0
      ELSE 1
    END AS new_grp
  FROM tab1
), cte2 AS (
  SELECT
    cte1.*,
    SUM(new_grp) OVER (PARTITION BY id ORDER BY startdate) AS grp_num
  FROM cte1
)
SELECT
  id,
  MIN(startdate) AS startdate,
  MAX(enddate) AS enddate
FROM cte2
GROUP BY id, grp_num
ORDER BY id, startdate;

But when I want to derive the time span for ID 1 it does need 4 iterations – so the classical gaps and islands approach is not feasible.

What I expect:

id startdate enddate
1 2000-08-08 2019-03-31
2 2013-02-18 2020-07-28
3 2015-01-11 2015-04-02
3 2016-02-08 2021-11-22

>Solution :

You can take max(enddte) in preceding rows.

Change

CASE
      WHEN LAG(enddate) OVER (PARTITION BY id ORDER BY startdate) >= DATEADD(day, -1, startdate) THEN 0
      ELSE 1
    END AS new_grp

with

    CASE
      WHEN MAX(enddate) 
        OVER (PARTITION BY id ORDER BY startdate ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
          >= DATEADD(day, -1, startdate) 
      THEN 0
      ELSE 1
    END AS new_grp

Query example (from your fiddle)

WITH cte1 AS (
  SELECT
    id,
    startdate,
    enddate,
    CASE
      WHEN MAX(enddate) 
        OVER (PARTITION BY id ORDER BY startdate ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
          >= DATEADD(day, -1, startdate) 
      THEN 0
      ELSE 1
    END AS new_grp
  FROM tab1
), cte2 AS (
  SELECT
    cte1.*,
    SUM(new_grp) OVER (PARTITION BY id ORDER BY startdate) AS grp_num
  FROM cte1
)
SELECT
  id,
  MIN(startdate) AS startdate,
  MAX(enddate) AS enddate
FROM cte2
GROUP BY id, grp_num
ORDER BY id, startdate;
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading