Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Delete duplicate rows in a large table

I have a table with 50+ columns that contains duplicate rows which I would like to delete. I would normally use the ROW_NUMBER function, however I don’t know the unique combination of columns in my table to put in the partition by clause. Is there a way to delete duplicate rows in a table without typing out all the column names?

>Solution :

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Since you use ROW_NUMBER i assume you use MS SQL-Server. You can select all columns easily in SSMS. Go to your table and drag the Columns folder under it to the editor. Now you have all columns comma separated. You just need to copy&paste it into this query:

WITH CTE AS(
   SELECT [col1], [col2], [col3], [col4], [col5], [col6], [col7], ... [col50]
       RN = ROW_NUMBER()OVER(PARTITION BY [col1], [col2], [col3], [col4], [col5], [col6], [col7], ..., [col50] ORDER BY col1)
   FROM dbo.Table1
)
DELETE FROM CTE WHERE RN > 1
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading