Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to write a faster sql query to break apart custom dictionary in Postgres column

I’m working with this fun data that has this custom dictionary-ish format in one of the columns.

id parameters
1 x_id x_value; y_id y_value; z_id z_value;
2 y_id y_value2; z_id z_value2;

And looking to get it into this format…

id x_id y_id z_id
1 x_value y_value z_value
2 NULL y_value2 z_value2

I’d prefer for it to be completely dynamic, but I can live with knowing all possible column names in advance if it reduces complexity/improves performance. I can also guarantee this pattern is consistent. No additional nesting of dictionaries, etc.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Not being a sql master, this is the naive implementation I came up with but it seems quite slow. Is there a more performant way to do this?

select 
        string_agg(x_id, ',') as x_id,
        string_agg(y_id, ',') as y_id,
        string_agg(z_id, ',') as z_id
    from (
        select
            t.id,
            case when 'x_id' LIKE kvs.key then kvs.value else null end as x_id,
            case when 'y_id' LIKE kvs.key then kvs.value else null end as y_id,
            case when 'z_id' LIKE kvs.key then kvs.value else null end as z_id
        from my_table as t
        join (
            SELECT
                id,
                split_part(trim(both ' ' FROM unnest(string_to_array(parameters, ';'))), ' ', 1) "key",
                split_part(trim(both ' ' FROM unnest(string_to_array(parameters, ';'))), ' ', 2) "value"
            FROM my_table
        ) as kvs
            on t.id = kvs.id
    ) as params
    group by params.id

>Solution :

I would convert this to a JSON value, then you can access each key quite easily:

select id, 
       parameters ->> 'x_id' as x_id,
       parameters ->> 'y_id' as y_id,
       parameters ->> 'z_id' as z_id
from (       
  select t.id, 
         jsonb_object_agg(split_part(p.parm, ' ', 1), split_part(p.parm, ' ', 2)) as parameters
  from the_table t
    left join unnest(regexp_split_to_array(trim(';' from parameters), '\s*;\s*')) as p(parm) on true
  group by id
) x
order by id;  

The trim(';' from p_input) is necessary to remove the trailing ;, otherwise regexp_split_to_array() will return one empty array element.

You can put this into a function to make things easier:

create or replace function convert_to_json(p_input text)
  returns jsonb
as
$$
  select jsonb_object_agg(elements[1], elements[2]) as parameters
  from (
    select string_to_array(p.parm, ' ') as elements
    from unnest(regexp_split_to_array(trim(';' from p_input), '\s*;\s*')) as p(parm)
  ) t1;
$$
language sql
immutable;

The this gets a bit simpler:

select id, 
       convert_to_json(parameters) ->> 'x_id' as x_id,
       convert_to_json(parameters) ->> 'y_id' as y_id,
       convert_to_json(parameters) ->> 'z_id' as z_id
from the_table
order by id;
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading