Snowflake/SQL – Nested json objects and arrays

Advertisements

I got stuck – I have a json file with a few nested json objects and arrays and I can’t figure out how to query this data properly.

{
  "dim":[
    "ENT1",
    "ENT2",
    "ENT3",
    "ENT4"
  ],
  "col":[
    [
      "ST1",
      "ST2",
      "ST3"
    ]
  ],
  "row":[
    {
      "head":[
        "AC1"
      ],
      "data":[
        "1",
        "2",
        "3"
      ]
    },
    {
      "head":[
        "AC2"
      ],
      "data":[
        "4",
        "5",
        "6"
      ]
    },
    {
      "head":[
        "AC3"
      ],
      "data":[
        "7",
        "8",
        "9"
      ]
    }
  ]
}

Here are the expected resluts:

ENT1, ENT2, ENT3, ENT4, ST1, AC1, 1
ENT1, ENT2, ENT3, ENT4, ST1, AC2, 4
ENT1, ENT2, ENT3, ENT4, ST1, AC3, 7
ENT1, ENT2, ENT3, ENT4, ST2, AC1, 2
ENT1, ENT2, ENT3, ENT4, ST2, AC2, 5
ENT1, ENT2, ENT3, ENT4, ST2, AC3, 8
ENT1, ENT2, ENT3, ENT4, ST3, AC1, 3
ENT1, ENT2, ENT3, ENT4, ST3, AC2, 6
ENT1, ENT2, ENT3, ENT4, ST3, AC3, 9

Here are the results I’m receiving:

ENT1, ENT2, ENT3, ENT4, ST1, AC1, 1
ENT1, ENT2, ENT3, ENT4, ST1, AC1, 2
ENT1, ENT2, ENT3, ENT4, ST1, AC1, 3
ENT1, ENT2, ENT3, ENT4, ST1, AC2, 4
ENT1, ENT2, ENT3, ENT4, ST1, AC2, 5
ENT1, ENT2, ENT3, ENT4, ST1, AC2, 6
ENT1, ENT2, ENT3, ENT4, ST1, AC3, 7
ENT1, ENT2, ENT3, ENT4, ST1, AC3, 8
ENT1, ENT2, ENT3, ENT4, ST1, AC3, 9
ENT1, ENT2, ENT3, ENT4, ST2, AC1, 1
ENT1, ENT2, ENT3, ENT4, ST2, AC1, 2
ENT1, ENT2, ENT3, ENT4, ST2, AC1, 3
ENT1, ENT2, ENT3, ENT4, ST2, AC2, 4
ENT1, ENT2, ENT3, ENT4, ST2, AC2, 5
ENT1, ENT2, ENT3, ENT4, ST2, AC2, 6
ENT1, ENT2, ENT3, ENT4, ST2, AC3, 7
ENT1, ENT2, ENT3, ENT4, ST2, AC3, 8
ENT1, ENT2, ENT3, ENT4, ST2, AC3, 9
ENT1, ENT2, ENT3, ENT4, ST3, AC1, 1
ENT1, ENT2, ENT3, ENT4, ST3, AC1, 2
ENT1, ENT2, ENT3, ENT4, ST3, AC1, 3
ENT1, ENT2, ENT3, ENT4, ST3, AC2, 4
ENT1, ENT2, ENT3, ENT4, ST3, AC2, 5
ENT1, ENT2, ENT3, ENT4, ST3, AC2, 6
ENT1, ENT2, ENT3, ENT4, ST3, AC3, 7
ENT1, ENT2, ENT3, ENT4, ST3, AC3, 8
ENT1, ENT2, ENT3, ENT4, ST3, AC3, 9

And here is my query:

`SELECT t.$1:dim[0]::string AS pov1,
t.$1:dim[1]::string AS pov2,
t.$1:dim[2]::string AS pov3,
t.$1:dim[3]::string AS pov4,
s.value::string AS st,
a.value:head::string AS ac,
am.value::string AS at
FROM my_table t
, table(flatten( input => array_flatten(t.$1:col) ) s
, lateral flatten ( input => t.$1:row ) a
, table(flatten( input => a.value:data)) am;`

My SQL skills are not great so if anyone can enlighten me it would be greatly appriciated.

>Solution :

It’s not clear what you want, but it seems like from your output you’re trying to get the row value at a particular column index. If this is the case, snowflake provides a lot of meta information from flattened arrays, including the index of an element.

So you can just use the column array index to get the data for the corresponding row without even having to create another table alias since you already have row.

SELECT t.$1:dim[0]::string AS pov1,
  t.$1:dim[1]::string AS pov2,
  t.$1:dim[2]::string AS pov3,
  t.$1:dim[3]::string AS pov4,
  s.value::string AS st,
  a.value:head::string AS ac,
  a.value:data[s.index] AS at -- Get the data value corresponding to the column's index
FROM my_table t
  , table(flatten( input => array_flatten(t.$1:col) )) s
  , lateral flatten ( input => t.$1:row ) a

Leave a ReplyCancel reply