Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Create an associated matching ID based on multiple columns in the same table

I have an interesting issue where I need to create a unique identifier based on match groups for a set of data. This is is based on multiple criteria, but generally what I need to happen is to take this input:

SOURCE_ID MATCH_ID PHONE
1 1 (999)9999999
1 2 (999)9999999
2 1 (999)9999999
213710 707187 (001)2548987
213710 759263 (100)8348243
213705 2416730 (156)6676200
213705 12116102 (132)3453523

And it needs to look like this as the output:

SOURCE_ID MATCH_ID PHONE GENERATED_ID
1 1 (999)9999999 1
1 2 (999)9999999 1
2 1 (999)9999999 1
213710 707187 (001)2548987 2
213710 759263 (100)8348243 2
213705 2416730 (156)6676200 3
213705 12116102 (132)3453523 3

I’ve utilized the DENSE_RANK() function to create two separate IDs, one sorting on PHONE, the other SOURCEID column. The PHONE sort gives me the correct output for lines 1-3, but incorrect for 4-7, while the SOURCE_ID sort works on lines 4-6, but not 1-3.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

How can I combine these in a way that gives the desired output above? I’ve tried combining the columns in every format possible, but no luck there either.

Output from testing, highlighted correct results. Each TEST## column is noted below

The SQL for reference:

SELECT SOURCE_ID, 
    MATCH_ID, 
    PHONE, 
    DENSE_RANK() OVER(ORDER BY PHONE) AS PHONE_SORT
    DENSE_RANK() OVER(ORDER BY SOURCE_ID) AS SOURCE_ID_SORT
    DENSE_RANK() OVER(ORDER BY MATCH_ID, INTERNAL_ROW_ID) AS TEST1,
    DENSE_RANK() OVER(ORDER BY SOURCE_ID, MATCH_ID) AS TEST2,
    DENSE_RANK() OVER(ORDER BY MATCH_ID) AS TEST3,
    DENSE_RANK() OVER(ORDER BY MATCH_ID, SOURCE_ID, PHONE) AS TEST4,
    DENSE_RANK() OVER(ORDER BY MATCH_ID, PHONE, SOURCE_ID) AS TEST5,
    DENSE_RANK() OVER(ORDER BY SOURCE_ID, MATCH_ID, PHONE) AS TEST6,
    DENSE_RANK() OVER(ORDER BY SOURCE_ID, PHONE, MATCH_ID) AS TEST7,
    DENSE_RANK() OVER(ORDER BY PHONE, SOURCE_ID, MATCH_ID) AS TEST8,
    DENSE_RANK() OVER(ORDER BY PHONE, MATCH_ID, SOURCE_ID) AS TEST9,
    DENSE_RANK() OVER(ORDER BY PHONE, SOURCE_ID) AS TEST10,
    DENSE_RANK() OVER(ORDER BY PHONE, MATCH_ID) AS TEST11,     
    DENSE_RANK() OVER(ORDER BY SOURCE_ID, PHONE) AS TEST12,
    DENSE_RANK() OVER(ORDER BY MATCH_ID, PHONE) AS TEST13      
FROM MY_TABLE; 

TIA!

>Solution :

that’s some rather obtuse logic.

SELECT 
    column1
    ,column2
    ,column3
    ,dense_rank() over (order by rankable)
FROM (    
    SELECT *
        ,count(column1) over (partition by column1) c_c1
        ,count(column3) over (partition by column3) c_c3
        ,iff(c_c1> c_c3, column1::text, column3) as rankable
    FROM VALUES
    (1,1,'(999)9999999'),
    (1, 2,'(999)9999999'),
    (2, 1,'(999)9999999'),
    (213710,    707187,'(001)2548987'),
    (213710,    759263,'(100)8348243'),
    (213705,    2416730,'(156)6676200'),
    (213705,    12116102,'(132)3453523')
)

gives:

COLUMN1 COLUMN2 COLUMN3 DENSE_RANK() OVER (ORDER BY RANKABLE)
1 1 (999)9999999 1
1 2 (999)9999999 1
2 1 (999)9999999 1
213,705 2,416,730 (156)6676200 2
213,705 12,116,102 (132)3453523 2
213,710 707,187 (001)2548987 3
213,710 759,263 (100)8348243 3
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading