SQL Training Online

  • Training Courses
  • Beginner Book
  • Blog

November 9, 2012 by Joey Blue Leave a Comment

How to Remove Duplicate Rows in SQL Server

Video: How to Remove Duplicate Rows in SQL Server – SQL Training Online
In this video, I show you a trick on removing duplicate rows in SQL Server 2012.
[sharebox4 sharetext=”Share This Page”] [/sharebox4]

How to Remove Duplicate Rows in SQL Server?

I first need a table that has duplicate records to remove, so here is a script to create some duplicates in an email address table.

create table email_list
(
name varchar(100)
,email varchar(300)
)
go
insert into email_list values ('Jan','[email protected]');
insert into email_list values ('Bob','[email protected]');
insert into email_list values ('Jill','[email protected]');
insert into email_list values ('Bob','[email protected]');
insert into email_list values ('Jill','[email protected]');
insert into email_list values ('Bob','[email protected]');
insert into email_list values ('Jill','[email protected]');

Remove Duplicate Rows in SQL Server

Now, the first method to remove the duplicates is in the following script that I adapted from Alexander’s Post on duplicates.

WITH list_rownumbers AS
(
SELECT name,email,
ROW_NUMBER() OVER (ORDER BY name,email) AS 'RowNumber'
FROM email_list
)
DELETE list_rownumbers WHERE RowNumber not in
(SELECT min(RowNumber) FROM list_rownumbers GROUP BY name,email)

 

If you just look at the SQL With statement (CTE), you will see that it simply numbers all of the name, email combinations using the Row_Number function. Then it uses the minimum rows number per name, email group as an exclusion to remove the duplicate rows.

The second script is and adaptation of Ritesh’s Post on duplicates.

WITH list_duplicates (name, email, duplicate_count) AS
(
SELECT name,email,
ROW_NUMBER() OVER(PARTITION BY name,email ORDER BY name,email) AS duplicate_count
FROM email_list
)
DELETE
FROM list_duplicates
WHERE duplicate_count > 1

 

If we focus on the CTE query, we see that is uses the Partition By statement to actually reset the row_number count on each name, email group.

It makes the delete statement a little easier to follow since all we have to do is delete where the row_number (duplicate_count) is larger than 1.

And that’s it.  That is how you Delete Duplicates in SQL.

 

Let me know what you think by commenting or sharing on twitter, facebook, google+, etc.

 

SQL Training Online: https://www.sqltrainingonline.com

Twitter: http://www.twitter.com/sql_by_joey

Google+: https://plus.google.com/#100925239624117719658/posts

LinkedIn: http://www.linkedin.com/in/joeyblue

Facebook: http://www.facebook.com/sqltrainingonline

Filed Under: SQL Training Tagged With: SQL, SQL CTE, SQL Delete, SQL Duplicates, SQL Server 2012

Recent Posts

  • SQL Database Normalization – 1NF, 2NF, 3NF, and 4NF
  • SQL Joins Tutorial for Beginners – Inner Join, Left Join, Right Join, Full Outer Join – SQL Training Online
  • Zillow House Price Analysis from CSV Archive – #PowerBI 002
  • Learn Basic SQL – 1 Hour Training Course – SQL Training Online
  • Create Table Statement in SQL Server and Inserting Baseball Homerun Leader Dataset – SQL Training Online

Popular Posts

  • SQL Functions
  • SQL Jobs in Oracle and Microsoft SQL Server
  • Troubleshooting a SQL Join

Have Questions? Connect with me.

Link to my Facebook Page
Link to my Pinterest Page
Link to my Rss Page
Link to my Twitter Page
Link to my Youtube Page
Copyright © 2023 · SQLTrainingOnline.com · Consulting at EmbarkBlue.com