Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Get cumulative sum that reset for each year

Please consider this table:

Year    Month   Value   YearMonth
2011      1      70      201101
2011      1      100     201101
2011      2      200     201102
2011      2      50      201102
2011      3      80      201103
2011      3      250     201103
2012      1      100     201201
2012      2      200     201202
2012      3      250     201203

I want to get a cumulative sum based on each year. For the above table I want to get this result:

Year    Month       Sum
-----------------------
2011      1         170
2011      2         420  <--- 250 + 170
2011      3         750  <--- 330 + 250 + 170
2012      1         100
2012      2         300  <--- 200 + 100
2012      3         550  <--- 250 + 200 + 100

I wrote this code:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Select c1.YearMonth, Sum(c2.Value) CumulativeSumValue
From @Tbl c1,  @Tbl c2
Where c1.YearMonth >= c2.YearMonth
Group By c1.YearMonth
Order By c1.YearMonth Asc

But its CumulativeSumValue is calculated twice for each YearMonth:

YearMonth   CumulativeSumValue
  201101          340           <--- 170 * 2
  201102          840           <--- 420 * 2
  201103          1500
  201201          850
  201202          1050
  201203          1300

How can I achieve my desired result?

I wrote this query:

select year, (Sum (aa.[Value]) Over (partition by aa.Year Order By aa.Month)) as 'Cumulative Sum'
from @Tbl aa

But it returned multiple records for 2011:

Year    Cumulative Sum
2011        170
2011        170
2011        420
2011        420
2011        750
2011        750
2012        100
2012        300
2012        550

>Solution :

You are creating a cartesian product here. In your ANSI-89 implicit JOIN (you really need to stop using those and switch to ANSI-92 syntax) you are joining on c1.YearMonth >= c2.YearMonth.

For your first month you have two rows with the same value of the year and month, so each of those 2 rows joins to the other 2; this results in 4 rows:

Year Month Value1 Value2
2011 1 70 70
2011 1 70 100
2011 1 100 70
2011 1 100 100

When you SUM this value you get 340, not 170, as you have 70+70+100+100.

Instead of a triangular JOIN however, you should be using a windowed SUM. As you want to also get aggregate nmonths into a single rows, you’ll need to also aggregate inside the windowed SUM like so:

SELECT V.YearMonth,
       SUM(SUM(V.Value)) OVER (PARTITION BY Year ORDER BY V.YearMonth) AS CumulativeSum
FROM (VALUES (2011, 1, 70, 201101),
             (2011, 1, 100, 201101),
             (2011, 2, 200, 201102),
             (2011, 2, 50, 201102),
             (2011, 3, 80, 201103),
             (2011, 3, 250, 201103),
             (2012, 1, 100, 201201),
             (2012, 2, 200, 201202),
             (2012, 3, 250, 201203)) V (Year, Month, Value, YearMonth)
GROUP BY V.YearMonth,
         V.Year;
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading