Hi I have a table and want to return the session_id where rn=1 for each customer, rn is not always the same.
rn here is row_number over (partition by customer_id, order_id by date_key) rn
| customer_id | session_id | order_id | date_key | rn |
|---|---|---|---|---|
| 1 | 2342 | A123 | 2022-01-01 | 1 |
| 1 | 2343 | A125 | 2022-01-01 | 2 |
| 1 | 2344 | A126 | 2022-01-01 | 3 |
| 2 | 2345 | A127 | 2022-01-01 | 1 |
| 2 | 2346 | A128 | 2022-01-01 | 2 |
| 2 | 2346 | A129 | 2022-01-01 | 3 |
| 2 | 2346 | A130 | 2022-01-01 | 4 |
| customer_id | session_id | order_id | date_key | rn | firstSession |
|---|---|---|---|---|---|
| 1 | 2342 | A123 | 2022-01-01 | 1 | 2342 |
| 1 | 2343 | A125 | 2022-01-01 | 2 | 2342 |
| 1 | 2344 | A126 | 2022-01-01 | 3 | 2342 |
| 2 | 2345 | A127 | 2022-01-01 | 1 | 2345 |
| 2 | 2346 | A128 | 2022-01-01 | 2 | 2345 |
| 2 | 2346 | A129 | 2022-01-01 | 3 | 2345 |
| 2 | 2346 | A130 | 2022-01-01 | 4 | 2345 |
I could basically do a
with firstsession as
(select * from table1 where rn=1)
select
t1.*
,fs.session_id firstSession
from table1 t1
left join
firstsession fs
on t1.customer_id=fs.customer_id and t1.date_key=fs.date_key
How do I do a lag function to get the same result?
>Solution :
Consider below approach
select *,
first_value(session_id) over session as firstSession
from your_table
window session as (partition by customer_id order by date_key)
if applied to sample data in your question – output is
