Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Status of values generated by OVER and PARTITION BY in SQL Server

I have a source table as follow:

enter image description here

I want to get the below result:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

enter image description here

EXPLANATION OF THE SOURCE TABLE:

This table contains requests with their related tasks on specific dates, each request can have multiple tasks and these tasks can occur many times, I marked each request and it related tasks with different color

For example, the request = ‘NC2’ in blue has 3 tasks:

  1. task1 occurs 3 times over time.
  2. task2 occurs 2 times over time.
  3. task3 occurs 1 time over time.

EXPLANATION OF THE RESULT TABLE:

The score indicates the number of tasks for each request based on time of occurrence.

For example, the request = ‘NC2’ in blue has 3 tasks:

SCORE COLUMN:

task1: first occurrence was on 1/5/2022, then it occurs on 1/7/2022 and lastly on 1/9/2022 "for the same request number"

, as marked on yellow, the same for task two it only occurs twis on 1/6/2022 and 1/8/2022 and for the task3 also.

You can see the score column contains the scores of the tasks based on their occurrence over time for specific request number.

STATUS COLUMN:

For the status it shows the first and last occurrence of a task for a specific request, and it ignores the middles like task1 marked in red it’s not the first occurrence and neither the last, so it was ignored.

What I was able to achieve:

With the query below I was able to get SCORE column:

select RequestNumber, Task, StartDate, row_number() over(partition by RequestNumber, TaskName 
order by START_DATE) as score
from [SOURCE_TABLE] order by RequestNumber, START_DATE

for the STATUS I am lost, and I couldn’t solve it, probably I am thinking inside the box, so any suggestion are welcome.

>Solution :

A slightly more efficient version of @Larnu’s excellent answer:

SELECT
  RequestNumber,
  Task,
  StartDate,
  ROW_NUMBER() OVER (PARTITION BY RequestNumber, TaskName ORDER BY StartDate) AS score,
  CASE WHEN ROW_NUMBER()    OVER (PARTITION BY RequestNumber, TaskName ORDER BY StartDate) = 1
         THEN 'First'
       WHEN LEAD(StartDate) OVER (PARTITION BY RequestNumber, TaskName ORDER BY StartDate) IS NULL
         THEN 'Last'
       END AS Status
FROM dbo.[SOURCE_TABLE];

It uses LEAD instead of another ROW_NUMBER. This means that the same windowing clause can be used, and therefore does not require another sort, nor a spool.

Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading