Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

T-SQL Sum Since Condition Last Met

I’m trying to do a sort of running total with a variable window size. I want to sum a value from the current row to the last time another row met a condition.

Mock Data: I want a formula for the last column

ID condition_col col_to_sum running_total_since_last_true
01 TRUE 10 10
02 FALSE 8 18
03 FALSE 12 30
04 TRUE 5 5
05 FALSE 8 13
06 FALSE 10 23
07 FALSE -2 21
08 TRUE 1 1
09 FALSE 1 2

What I have tried:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

select ID
     , condition_col
     , col_to_sum
     , (
        select sum(col_to_sum) as total
          from tbl t1
         where t1.ID between tbl.ID and (select top 1 t2.ID 
                                           from tbl t2 
                                          where condition_col = TRUE 
                                       order by t2.ID))
        ) as running_total_since_last_true
  from tbl

Because of the size of the dataset and the other things going on in the query (not shown here) this resulted in the query taking several minutes to run for a few thousand rows.

Now I’m looking at using sum() over(rows between current row and (select…)

I feel like I’m on the right track, but I keep getting stuck.

I also tried lag() but I couldn’t make the lag reference itself properly.

Any ideas?

>Solution :

You would be better off putting your data into groups first and then using a cumulative SUM. You can put your data into groups by using a conditional cumulative COUNT (inside a CTE):

WITH Grps AS(
    SELECT ID,
           condition_col,
           col_to_sum,
           COUNT(CASE condition_col WHEN 'TRUE' THEN 1 END) OVER (ORDER BY ID ASC
                                                                  ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Grp
    FROM (VALUES(01,'TRUE',10,10),
                (02,'FALSE',8,18),
                (03,'FALSE',12,30),
                (04,'TRUE',5,5),
                (05,'FALSE',8,13),
                (06,'FALSE',10,23),
                (07,'FALSE',-2,21),
                (08,'TRUE',1,1),
                (09,'FALSE',1,2))V(ID,condition_col,col_to_sum,running_total_since_last_true))
SELECT ID,
       condition_col,
       col_to_sum,
       SUM(col_to_sum) OVER (PARTITION BY Grp ORDER BY ID
                             ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total_since_last_true
FROM Grps;
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading