Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Split delimiter based rows in to new rows – SQL

I have a following data and I want to split the each row by delimiter into new rows.

Sample Data:

**Row No. | Data**
        1 | abc,pqr,lmn,omr
        2 | aaa,bbb,ccc,ddd

And the output should be:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

**Row No. | Data**
        1 | abc
        2 | pqr
        3 | lmn
        4 | omr
        5 | aaa
        6 | bbb
        6 | ccc
        6 | ddd

I have a following code snippet but it only works for the first row.

    SELECT * EXCEPT(c) REPLACE(c AS col) 
    FROM `project.dataset.table`, 
    UNNEST(SPLIT(col)) c;

>Solution :

Use the following:

WITH myData AS (
  select 
    id, 
    values
    FROM 
      UNNEST([
        STRUCT<id int, values string>(1, 'abc,pqr,lmn,omr'),
        STRUCT<id int, values string>(2, 'aaa,bbb,ccc,ddd')
      ])
)
SELECT 
  id,
  data
FROM myData,
UNNEST(SPLIT(values, ',')) AS data
ORDER BY id ASC

I start by just creating a sample temporary table with your test data. The gist of what you’re looking for is there at the end. You’re on the right track using UNNEST to break out the elements of the array out to separate rows, but you were missing the SPLIT function to actually break apart the individual values in the string by their comma delimiters.

This query yields the following:

id data
1 abc
1 pqr
1 lmn
1 omr
2 aaa
2 bbb
2 ccc
2 ddd

And for just a tweak on that – if you simply want a monotonically increasing number as the ID instead of it repeating as it does there (e.g. no need to link back to an ID), update your query to the following:

WITH myData AS (
  select 
    id, 
    values
    FROM 
      UNNEST([
        STRUCT<id int, values string>(1, 'abc,pqr,lmn,omr'),
        STRUCT<id int, values string>(2, 'aaa,bbb,ccc,ddd')
      ])
)
SELECT ROW_NUMBER() OVER() AS id,
  data
FROM myData,
UNNEST(SPLIT(values, ',')) AS data
ORDER BY id ASC

Here I simply dropped use of the id field in the SELECT expression and replaced with an incrementing row number that increments over all the records in the data set. This yields the following:

id data
1 abc
2 pqr
3 lmn
4 omr
5 aaa
6 bbb
7 ccc
8 ddd
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading