Remove duplicates from a DataTable with LINQ without keeping any of the duplicated entries

I have a DataTable with several columns, and I want to remove all duplicates from it like this:

Dt1 = Dt1.AsEnumerable().GroupBy(r => new { filename = r.Field<string>("filename1"), filesize = r.Field<string>("filesizeinkb") }).Select(g => g.First()).CopyToDataTable();

However, the code above leaves one entry of each duplicate set (the first one found) in the DataTable, because of the Select(g => g.First()) at the end of the LINQ query.

Is there a way to remove all duplicates and keep none?

Edit:
An example of what the code does now and what it should do.

A DataTable with entries like this:

Name   Filesize  Filename
One    50        Fileone
Two    50        Fileone
Three  50        Filetwo
Four   50        Filethree

The LINQ query above removes line 2, because its Filename and Filesize are the same as those of line 1. However, line 1 stays, because the query selects the first entry of each group of duplicates.

I want both line 1 and line 2 removed from the DataTable.

Solution:

Dt1 = Dt1.AsEnumerable()
         .GroupBy(r => new { filename = r.Field<string>("filename1"), filesize = r.Field<string>("filesizeinkb") })
         .Where(g => g.Count() == 1)
         .Select(g => g.First())
         .CopyToDataTable();

That discards every group with more than one row, then takes the first (and only) row from each of the remaining groups.
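To check the behaviour against the sample data from the question, here is a minimal sketch. The class name UniqueRowsDemo, the helper KeepUniqueRows, and the "name" column are assumptions made for illustration. Only "filename1" and "filesizeinkb" come from the question, and the file size is assumed to be stored as a string because the query reads it with Field<string>. The sketch also guards against one known pitfall: CopyToDataTable throws an InvalidOperationException when the sequence contains no rows, which happens if every row has a duplicate.

```csharp
using System;
using System.Data;
using System.Linq;

public static class UniqueRowsDemo
{
    // Hypothetical helper: keeps only the rows whose (filename, filesize)
    // pair occurs exactly once in the source table.
    public static DataTable KeepUniqueRows(DataTable source)
    {
        var unique = source.AsEnumerable()
            .GroupBy(r => new
            {
                filename = r.Field<string>("filename1"),
                filesize = r.Field<string>("filesizeinkb")
            })
            .Where(g => g.Count() == 1)   // drop every group that has duplicates
            .Select(g => g.First())       // the single row left in each group
            .ToList();

        // CopyToDataTable throws on an empty sequence, so fall back to an
        // empty table with the same schema when nothing is unique.
        return unique.Count > 0 ? unique.CopyToDataTable() : source.Clone();
    }

    public static void Main()
    {
        // Sample data from the question; the "name" column is an assumption.
        var dt = new DataTable();
        dt.Columns.Add("name", typeof(string));
        dt.Columns.Add("filesizeinkb", typeof(string));
        dt.Columns.Add("filename1", typeof(string));
        dt.Rows.Add("One", "50", "Fileone");
        dt.Rows.Add("Two", "50", "Fileone");
        dt.Rows.Add("Three", "50", "Filetwo");
        dt.Rows.Add("Four", "50", "Filethree");

        DataTable result = KeepUniqueRows(dt);
        foreach (DataRow row in result.Rows)
        {
            Console.WriteLine(row["name"]);   // prints "Three", then "Four"
        }
    }
}
```

Rows One and Two share the same file name and size, so both are dropped. Returning source.Clone() for the empty case is one possible choice: it keeps the column schema without any rows.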
