Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

MS Power Query – eliminating duplicates in a set of delimited values

I have a excel file with data something like below:

enter image description here

A1 | 13,15,14,16,17

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

A2 | 13,16

I want the data to look like

A3 | 13, 15, 16, 17

Merge the selected cells data into single cell, avoid duplicate values.

Also, while doing this compare the values to other cells and then exclude those cells.

Say, B1 | 13,14

Then Final Output should not include values from B1

A3 | 15, 16, 17

I tried using Text Join, Unique functions but none of them are working. I found out about XMLconvert function but that is also not supporting excel.

>Solution :

Since OP has tagged Excel Formula then one could try the following :

enter image description here


• Formula used in cell A3

=TEXTJOIN(",",,UNIQUE(TOCOL(TEXTSPLIT(TEXTAFTER(","&A1:A2,",",SEQUENCE(,MAX(LEN(A1:A2)-LEN(SUBSTITUTE(A1:A2,",",))+1))),","),3)))

Or,

=ARRAYTOTEXT(DROP(UNIQUE(REDUCE("",A1:A2,LAMBDA(x,y,VSTACK(x,TEXTSPLIT(y,,","))))),1))

With Power Query, you can try using the following M Code:

enter image description here


let
    Source = Excel.CurrentWorkbook(){[Name="DelimitedTab"]}[Content],
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(Source, {{"Column1", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Column1"),
    #"Removed Duplicates" = Table.Distinct(#"Split Column by Delimiter"),
    Column1 = Text.Combine(#"Removed Duplicates"[Column1],",")
in
    Column1

Or, if the data is limited then one could use the following as well:

enter image description here


=ARRAYTOTEXT(UNIQUE(TEXTSPLIT(TEXTJOIN(",",,A1:A2),,",")))

Edit:

As per OP’s comment:

Also, while doing this compare the values to other cells and then
exclude those cells. Say, B1 | 13,14 Then Final Output should not
include values from B1 A3 | 15, 16, 17


enter image description here


• Formula used in cell B1

=LET(α, TEXTSPLIT(A1,","), ARRAYTOTEXT(FILTER(α, ISNA(XMATCH(α,TEXTSPLIT(A2,","))))))

Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading