How to remove duplicate words within columns using group_concat?

May 8, 2022

I have a table with records in TableA.ColumnA like this:

New York City
New York
New Rochelle
York Town

I am doing

Select group_concat(ColumnA SEPARATOR ‘;’) from TableA

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.
Visit Medevel

but naturally end up with;

New York City; New York; New Rochelle; York Town

when what I am looking for is:

New; York; City; Rochelle; Town

In other words I want each word in each columnA to compare to each other words in ColumnA and remove any matching words so I end up with no duplicate ‘keywords’.

Can I do this somehow in the group_concat or is there some process I need to run prior?

>Solution :

The easiest way assuming your mySql supports json_table is to convert the strings to a json array and split into rows, then you can aggregate the distinct list:

select group_concat(words separator '; ') result
from (
  select distinct words
  from tableA t
  join json_table(
    replace(json_array(t.columnA), ' ', '","'),
    '$[*]' columns (words varchar(50) path '$')
  ) j
)t

See example Fiddle