Following code is useful to delete duplicate records. The table must have identity column, which will be used to identify the duplicate records. Table in example is has ID as Identity Column and Columns which have duplicate data are DuplicateColumn1, DuplicateColumn2 and DuplicateColumn3.
DELETE
FROM MyTable
WHERE ID NOT IN
(
SELECT MAX(ID)
FROM MyTable
GROUP BY DuplicateColumn1, DuplicateColumn2, DuplicateColumn2)
if table dont have primary key:
CREATE TABLE [dbo].[Example]
(
[ID] [int] NOT NULL,
[Name] [nvarchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
)
ON [PRIMARY]
INSERT INTO [Example] ([ID],[Name]) VALUES (1,’Chirag’)
INSERT INTO [Example] ([ID],[Name]) VALUES (1,’Chirag’)
INSERT INTO [Example] ([ID],[Name]) VALUES (2,’Shailesh’)
INSERT INTO [Example] ([ID],[Name]) VALUES (3,’Dipak’)
INSERT INTO [Example] ([ID],[Name]) VALUES (4,’Mihir’)
INSERT INTO [Example] ([ID],[Name]) VALUES (4,’Mihir’)
INSERT INTO [Example] ([ID],[Name]) VALUES (4,’Piyush’)
SELECT
[ID],[NAME],COUNT([ID])
FROM
[Example]
GROUP BY
[ID],[NAME]
HAVING
COUNT([ID]) > 1
WITH
Test AS (SELECT ROW_NUMBER () OVER ( PARTITION BY ID ORDER BY ID) AS A FROM example )
DELETE FROM Test WHERE A > 1
Eg:2
Code Snippet
CREATE TABLE [DAS](
[INSTANCE_NAME] Varchar(16) NULL,
[SESSIONID] float NOT NULL,
[IMPORT_DATE] smalldatetime NULL
)
INSERT INTO DAS VALUES(‘IN1’, 1, DATEADD(dd,1,GETDATE()))
INSERT INTO DAS VALUES(‘IN2’, 1, DATEADD(dd,2,GETDATE()))
INSERT INTO DAS VALUES(‘IN4’, 2, DATEADD(dd,3,GETDATE()))
INSERT INTO DAS VALUES(‘IN4’, 2, DATEADD(dd,3,GETDATE()))
INSERT INTO DAS VALUES(‘IN5’, 2, DATEADD(dd,4,GETDATE()))
INSERT INTO DAS VALUES(‘IN6’, 3, DATEADD(dd,6,GETDATE()))
INSERT INTO DAS VALUES(‘IN5’, 3, DATEADD(dd,6,GETDATE()))
WITH CTE
AS
(
SELECT ROW_NUMBER() OVER (PARTITION BY sessionid ORDER BY [Import_Date] desc) as rn, instance_name, sessionid, Import_date
FROM
DAS
)
SELECT instance_name, sessionid, Import_date FROM CTE WHERE rn > 1
Results
—————————–
IN1 1 2008-10-22 17:07:00
IN4 2 2008-10-24 17:07:00
IN4 2 2008-10-24 17:07:00
IN5 3 2008-10-27 17:07:00
eg:3
DELETE TOP(1) FROM [Example] WHERE [ID] = 1
Fig – (3) Delete single row from duplicate rows.
Here I have used TOP(1) , If you have n rows which has all the values same than you have to use TOP(n-1) so that only 1 row will be remain after delete statement. To delete all the duplicate rows you need to write a cursor as shown below,
DECLARE @ID int
DECLARE @NAME NVARCHAR(50)
DECLARE @COUNT int
DECLARE CUR_DELETE CURSOR FOR
SELECT [ID],[NAME],COUNT([ID]) FROM [Example] GROUP BY [ID],[NAME] HAVING COUNT([ID]) > 1
OPEN CUR_DELETE
FETCH NEXT FROM CUR_DELETE INTO @ID,@NAME,@COUNT
/* Loop through cursor for remaining ID */
WHILE @@FETCH_STATUS = 0
BEGIN
DELETE TOP(@COUNT -1) FROM [Example] WHERE ID = @ID
FETCH NEXT FROM CUR_DELETE INTO @ID,@NAME,@COUNT
END
CLOSE CUR_DELETE
DEALLOCATE CUR_DELETE
Fig – (4) Cursor to delete all duplicate records
This is all about deleting duplicate rows in SQL Server 2005.
For removing duplicates using row_number?
select ROW_NUMBER() over(partition by name order by name) as a,* from example
a ID Name
——————– ———– ————————————————–
1 1 Chirag
2 1 Chirag
1 3 Dipak
1 4 Mihir
2 4 Mihir
1 4 Piyush
1 2 Shailesh
using rank():
ID Name rownumber
———– ————————————————– ——————–
1 Chirag 1
1 Chirag 1
2 Shailesh 3
3 Dipak 4
4 Mihir 5
4 Mihir 5
4 Piyush 5
with test as
(
select *,rank() over(order by id) as rownumber from example
)
select distinct rownumber,name from test
rownumber name
——————– ————————————————–
1 Chirag
3 Shailesh
4 Dipak
5 Mihir
5 Piyush
(5 row(s) affected)
ex:
select *,rank() over(order by id) as rownumber from example
ID Name rownumber
———– ————————————————– ——————–
1 Chirag 1
1 Chirag 1
2 Shailesh 3
3 Dipak 4
4 Mihir 5
4 Mihir 5
4 Piyush 5
select *,rank() over(order by name) as rownumber from example
ID Name rownumber
———– ————————————————– ——————–
1 Chirag 1
1 Chirag 1
3 Dipak 3
4 Mihir 4
4 Mihir 4
4 Piyush 6
2 Shailesh 7
(7 row(s) affected)