SQL query – tracking which rows in an interrupted sequence are related to eachother

Advertisements

I have a table consisting of actions that are logged when users are viewing files. When a user views a file, this can result in multiple actions over time. As multiple users can be active at the same time, their actions can become intertwined.

I am looking to create an additional column in my query that tells me when a user started viewing this file (let’s call this StartId).

When a user stops viewing a file to view some other file, and then goes back to viewing the first file, this should be regarded as a new viewing session.

This table illustrates my problem and my desired result:

Id User File StartId (desired result)
1 A X 1
2 A X 1
3 B Y 3
4 A X 1
5 B Y 3
6 A Y 6
7 A X 7

The closest I have come is with this line:

StartId = FIRST_VALUE(Id) OVER (PARTITION BY User, File ORDER BY Id)

However, this has the following result for the last action in the example:

Id User File StartId
7 A X 1

Can someone point me in the right direction with this?

>Solution :

you can use LAG to see if the previous file for the same user was the same as the current file for that user and so whether we are in a new session or not and then use that result accordingly.

WITH T AS
(
SELECT *, 
       CASE WHEN "File" = LAG("File") OVER (PARTITION BY "User" ORDER BY "Id") THEN NULL ELSE "Id" END AS NewSessionFlag
FROM  YourTable
)
SELECT *, 
       MAX(NewSessionFlag) OVER (PARTITION BY "User" ORDER BY "Id" ROWS UNBOUNDED PRECEDING)
FROM T
ORDER BY "Id"

DB Fiddle

Leave a ReplyCancel reply