Best SQL group by – find, select and remove duplicate rows

Ever wanted to find duplicate data and select the actual rows?
Download complete example as sql-script

with MYCTE  as (
  SELECT DuplicateKey1
        ,DuplicateKey2 --optional
        ,count(*) X
  FROM MyTable
  group by DuplicateKey1, DuplicateKey2
  having count(*) > 1
) 
SELECT E.*
FROM MyTable E
JOIN MYCTE cte
ON E.DuplicateKey1=cte.DuplicateKey1
   AND E.DuplicateKey2=cte.DuplicateKey2
ORDER BY E.DuplicateKey1, E.DuplicateKey2, CreatedAt

If you want to keep only one of the rows. Choose if you keep the newest or oldest data.

with MYCTE as (
  SELECT ROW_NUMBER() OVER (
    PARTITION BY DuplicateKey1
                ,DuplicateKey2 -- optional
    ORDER BY CreatedAt -- the first row among duplicates will be kept, other rows will be removed
 -- ORDER BY NEWID() -- if the duplicates are identical and you dont care which of the duplicates will be kept
  ) RN
  FROM MyTable
)
DELETE FROM MYCTE
WHERE RN > 1

Leave a Reply

Your email address will not be published. Required fields are marked *