I want to use PostgreSQL’s functionality in order to obtain the row that has the longest path in either ind or tar column, for the same id, ind_id and tar_id.
For example:
At the table below, since rows 1, 2, 3 have the same id=1, ind_id=2 and tar_id=3, I want to only take the 3rd row because the array tar={7,8,9}, is longer than {7,8} (2nd row) and {7} (1st row).
Similarly, rows #4 and #5, have the same values for id, ind_id and tar_id, and since the length of the array in column ind is longer in row #5, I want to keep this row only, and drop row #4.
Since I am a newbie in the world of PostgreSQL, can someone give me a hint how to do that?
text id ind_id tar_id ind tar
1 bla bla 1 2 3 {4} {7}
2 bla bla 1 2 3 {4} {7,8}
3 bla bla 1 2 3 {4} {7,8,9}
4 bla bla 1 2 33 {5} {10}
5 bla bla 1 2 33 {5,6,7} {10}
>Solution :
Use distinct on and order by id, ind_id, tar_id and the greatest array length of ind and tar.
select distinct on (id, ind_id, tar_id) *
from the_table
order by id, ind_id, tar_id,
greatest(array_length(ind::integer[], 1), array_length(tar::integer[], 1)) desc;