Home Create an associated matching ID based on multiple columns in the same table

Questions

Create an associated matching ID based on multiple columns in the same table

March 22, 2022

I have an interesting issue where I need to create a unique identifier based on match groups for a set of data. This is is based on multiple criteria, but generally what I need to happen is to take this input:

SOURCE_ID	MATCH_ID	PHONE
1	1	(999)9999999
1	2	(999)9999999
2	1	(999)9999999
213710	707187	(001)2548987
213710	759263	(100)8348243
213705	2416730	(156)6676200
213705	12116102	(132)3453523

And it needs to look like this as the output:

SOURCE_ID	MATCH_ID	PHONE	GENERATED_ID
1	1	(999)9999999	1
1	2	(999)9999999	1
2	1	(999)9999999	1
213710	707187	(001)2548987	2
213710	759263	(100)8348243	2
213705	2416730	(156)6676200	3
213705	12116102	(132)3453523	3

I’ve utilized the DENSE_RANK() function to create two separate IDs, one sorting on PHONE, the other SOURCEID column. The PHONE sort gives me the correct output for lines 1-3, but incorrect for 4-7, while the SOURCE_ID sort works on lines 4-6, but not 1-3.

How can I combine these in a way that gives the desired output above? I’ve tried combining the columns in every format possible, but no luck there either.

Output from testing, highlighted correct results. Each TEST## column is noted below

The SQL for reference:

SELECT SOURCE_ID, 
    MATCH_ID, 
    PHONE, 
    DENSE_RANK() OVER(ORDER BY PHONE) AS PHONE_SORT
    DENSE_RANK() OVER(ORDER BY SOURCE_ID) AS SOURCE_ID_SORT
    DENSE_RANK() OVER(ORDER BY MATCH_ID, INTERNAL_ROW_ID) AS TEST1,
    DENSE_RANK() OVER(ORDER BY SOURCE_ID, MATCH_ID) AS TEST2,
    DENSE_RANK() OVER(ORDER BY MATCH_ID) AS TEST3,
    DENSE_RANK() OVER(ORDER BY MATCH_ID, SOURCE_ID, PHONE) AS TEST4,
    DENSE_RANK() OVER(ORDER BY MATCH_ID, PHONE, SOURCE_ID) AS TEST5,
    DENSE_RANK() OVER(ORDER BY SOURCE_ID, MATCH_ID, PHONE) AS TEST6,
    DENSE_RANK() OVER(ORDER BY SOURCE_ID, PHONE, MATCH_ID) AS TEST7,
    DENSE_RANK() OVER(ORDER BY PHONE, SOURCE_ID, MATCH_ID) AS TEST8,
    DENSE_RANK() OVER(ORDER BY PHONE, MATCH_ID, SOURCE_ID) AS TEST9,
    DENSE_RANK() OVER(ORDER BY PHONE, SOURCE_ID) AS TEST10,
    DENSE_RANK() OVER(ORDER BY PHONE, MATCH_ID) AS TEST11,     
    DENSE_RANK() OVER(ORDER BY SOURCE_ID, PHONE) AS TEST12,
    DENSE_RANK() OVER(ORDER BY MATCH_ID, PHONE) AS TEST13      
FROM MY_TABLE;

TIA!

>Solution :

that’s some rather obtuse logic.

SELECT 
    column1
    ,column2
    ,column3
    ,dense_rank() over (order by rankable)
FROM (    
    SELECT *
        ,count(column1) over (partition by column1) c_c1
        ,count(column3) over (partition by column3) c_c3
        ,iff(c_c1> c_c3, column1::text, column3) as rankable
    FROM VALUES
    (1,1,'(999)9999999'),
    (1, 2,'(999)9999999'),
    (2, 1,'(999)9999999'),
    (213710,    707187,'(001)2548987'),
    (213710,    759263,'(100)8348243'),
    (213705,    2416730,'(156)6676200'),
    (213705,    12116102,'(132)3453523')
)

gives:

COLUMN1	COLUMN2	COLUMN3	DENSE_RANK() OVER (ORDER BY RANKABLE)
1	1	(999)9999999	1
1	2	(999)9999999	1
2	1	(999)9999999	1
213,705	2,416,730	(156)6676200	2
213,705	12,116,102	(132)3453523	2
213,710	707,187	(001)2548987	3
213,710	759,263	(100)8348243	3