Calculate time span with different patterns

September 1, 2023

The calculation of a time span with gaps and islands was already discussed.
But what about different patterns of time spans present?

The present data is a combination of gaps between time spans and contiguous time spans. These are available in different forms.
ID 1 has 5 time spans of which 4 lie within the first time span from 2000-08-08 to 2019-03-31 but for ID 2 the time spans are present in such a way that the next time span in the next line falls within the end date of the following line. ID 3 has a gap between the first enddate and the following startdate.

My code works just for ID 2 and 3.
How do I find the time span for ID 1 as well?

In exactly this case my code does not work.

The problem occurs with ID 1 as the code is not able to solve it with 1 iteration, because the data is present in a different way.

id	startdate	enddate
1	2000-08-08	2019-03-31
1	2007-06-08	2007-09-09
1	2008-02-08	2011-08-04
1	2012-08-01	2012-09-20
1	2018-02-01	2019-03-31
2	2013-02-18	2019-03-28
2	2018-04-12	2020-07-28
3	2015-01-11	2015-04-02
3	2016-02-08	2021-11-22

db<>fiddle

My code:

WITH cte1 AS (
  SELECT
    id,
    startdate,
    enddate,
    CASE
      WHEN LAG(enddate) OVER (PARTITION BY id ORDER BY startdate) >= DATEADD(day, -1, startdate) THEN 0
      ELSE 1
    END AS new_grp
  FROM tab1
), cte2 AS (
  SELECT
    cte1.*,
    SUM(new_grp) OVER (PARTITION BY id ORDER BY startdate) AS grp_num
  FROM cte1
)
SELECT
  id,
  MIN(startdate) AS startdate,
  MAX(enddate) AS enddate
FROM cte2
GROUP BY id, grp_num
ORDER BY id, startdate;

But when I want to derive the time span for ID 1 it does need 4 iterations – so the classical gaps and islands approach is not feasible.

What I expect:

id	startdate	enddate
1	2000-08-08	2019-03-31
2	2013-02-18	2020-07-28
3	2015-01-11	2015-04-02
3	2016-02-08	2021-11-22

>Solution :

You can take max(enddte) in preceding rows.

Change

CASE
      WHEN LAG(enddate) OVER (PARTITION BY id ORDER BY startdate) >= DATEADD(day, -1, startdate) THEN 0
      ELSE 1
    END AS new_grp

with

    CASE
      WHEN MAX(enddate) 
        OVER (PARTITION BY id ORDER BY startdate ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
          >= DATEADD(day, -1, startdate) 
      THEN 0
      ELSE 1
    END AS new_grp

Query example (from your fiddle)

WITH cte1 AS (
  SELECT
    id,
    startdate,
    enddate,
    CASE
      WHEN MAX(enddate) 
        OVER (PARTITION BY id ORDER BY startdate ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
          >= DATEADD(day, -1, startdate) 
      THEN 0
      ELSE 1
    END AS new_grp
  FROM tab1
), cte2 AS (
  SELECT
    cte1.*,
    SUM(new_grp) OVER (PARTITION BY id ORDER BY startdate) AS grp_num
  FROM cte1
)
SELECT
  id,
  MIN(startdate) AS startdate,
  MAX(enddate) AS enddate
FROM cte2
GROUP BY id, grp_num
ORDER BY id, startdate;