I have the following nested data in two columns:
Categories_A Categories_B
{"A"} {"B","F","C"}
{"B","A"} {"Z","B","F"}
{"B","F"} {"A","E","R"}
I would like to return in a new column the missing category/categories in categories_B when compared to categories_A – what is in Categories_A that is not in categories_B. So ideally:
Categories_A Categories_B Missing_Category
{"A"} {"B","F","C"} {"A"}
{"B","A"} {"Z","B","F"} {"A"}
{"B","F"} {"A","E","R"} {"B","F"}
So far what I managed is to return if there is a complete match between the columns:
select Categories_A, Categories_B,
case
when Categories_A = Categories_B then 'TRUE'
else 'FALSE'
end is_a_match
from facts_themes
Categories_A Categories_B is_a_match
{"A"} {"B","F","C"} FALSE
{"B","A"} {"Z","B","F"} FALSE
{"B","F"} {"A","E","R"} FALSE
>Solution :
There is no built-in function for that, but it’s easy to write your own:
create function array_except(p_one anyarray, p_two anyarray)
returns anyarray
as
$$
select array_agg(e)
from (
select e
from unnest(p_one) as p1(e)
except
select e
from unnest(p_two) as p2(e)
) x
$$
language sql
immutable
;
Then you can use it like this:
select categories_a, categories_b,
array_except(categories_a, categories_b) as missing_categories
from facts_themes