I have a PostgreSQL table with the following structure:
CREATE TABLE cte1 (
entity_id INT,
assignedtogroup INT,
time BIGINT
);
INSERT INTO cte1 (entity_id, assignedtogroup, time)
VALUES
(1, 435198, 1687863949740),
(1, 435198, 1687863949741),
(1, NULL, 1687863949742),
(1, NULL, 1687863949743),
(1, 435224, 1687863949744),
(1, 435224, 1687863949745),
(1, 435143, 1687863949746),
(1, 435143, 1687863949747),
(1, 435191, 1687863949748),
(1, NULL, 1687863949749),
(2, 435143, 1690452125291),
(2, 435143, 1690452125292),
(2, 435191, 1690452125293),
(2, NULL, 1690452125294);
I would like to fill the empty values in the assignedtogroup column using the previous row’s (time just before the current row and same entity_id) non-null value. The expected result should be:
| entity_id | assignedtogroup | time |
|---|---|---|
| 1 | 435198 | 1687863949740 |
| 1 | 435198 | 1687863949741 |
| 1 | 435198 | 1687863949742 |
| 1 | 435198 | 1687863949743 |
| 1 | 435224 | 1687863949744 |
| 1 | 435224 | 1687863949745 |
| 1 | 435143 | 1687863949746 |
| 1 | 435143 | 1687863949747 |
| 1 | 435191 | 1687863949748 |
| 1 | 435191 | 1687863949749 |
| 2 | 435143 | 1690452125291 |
| 2 | 435143 | 1690452125292 |
| 2 | 435191 | 1690452125293 |
| 2 | 435191 | 1690452125294 |
Is there a way to achieve this using only a SELECT statement?
I tried using the LAG function:
SELECT
entity_id,
COALESCE(
assignedtogroup,
LAG(assignedtogroup) OVER (PARTITION BY entity_id ORDER BY time)
) AS filled_assignedtogroup
FROM cte1;
However, I still have a NULL value and for the entity_id 2, the values are completely mixed.
You can find the DB Fiddle : https://www.db-fiddle.com/f/m52Rgq8jtK85g9yvaDMJqz/3
>Solution :
You would be better-off imo using a simple correlation here:
select Entity_Id, Coalesce(assignedtogroup, (
select assignedtogroup
from cte1 cte2 where cte2.entity_id = cte1.entity_id
and cte2.time < cte1.time
and cte2.assignedtogroup is not null
order by time desc
limit 1
)), time
from cte1;
Updated DB fiddle https://www.db-fiddle.com/f/m52Rgq8jtK85g9yvaDMJqz/3