SQL Query to Find Duplicate Names in a Table
Duplicate records in a database can be problematic sometimes as they create confusion or they might generate wrong outputs as well. So, it is better to remove all the duplicate records from the database as it will save our time and space. However, the best practice is to put unique constraints (Primary key constraints) on a table to prevent that.
This error could be because of various reasons such as –
- Application flaws,
- User error
- Bad database design
- Or because of some unknown external source.
We recommend you first go through this article for a better understanding.
To find the duplicate Names in the table, we have to follow these steps:
- Defining the criteria: At first, you need to define the criteria for finding the duplicate Names. You might want to search in a single column or more than that.
- Write the query: Then simply write the query to find the duplicate Names.
Let’s get started-
Suppose you are working with a database of an e-commerce website. Now, some usernames are saved more than once and so are their email ids. This is going to cause erroneous analytical results for the e-commerce website as saving this data more than once is unnecessary.
Now let’s first create our demo database,
Step 1: Creating the database
Create a new database named User_details and then use that.
CREATE DATABASE User_details; USE User_details;
Step 2: Defining the table
Create a table named Users1 and add these four columns ID, Names, EmailId and Age.
CREATE Table Users1 (ID VARCHAR(20) Primary Key, Names VARCHAR(30), EmailId VARCHAR(30), Age INT);
Step 3: Insert rows into the table and Insert these six rows in the table.
INSERT INTO Users1 VALUES('O1201', 'Radhika Malhotra', 'email@example.com', 21); INSERT INTO Users1 VALUES('O1202', 'Aryan Ray', 'Ar13@gmail.com', 25); INSERT INTO Users1 VALUES('O1203', 'Sam Das', 'Sam1@gmail.com', 54); INSERT INTO Users1 VALUES('O1204', 'Radhika Malhotra', 'firstname.lastname@example.org', 21); INSERT INTO Users1 VALUES('O1205', 'Aryan Ray', 'Ar13@gmail.com', 25); INSERT INTO Users1 VALUES('O1206', 'Radhika Malhotra', 'email@example.com', 21);
Step 4: Viewing Inserted data
Run this command to see our table.
SELECT * FROM Users1;
Step 5: Now, let’s make our query to find duplicate Names in this table.
- Defining the Criteria: Here we define criteria for only the Names column is selected from the Users1 table.
SELECT Names,COUNT(*) AS Occurrence FROM Users1 GROUP BY Names HAVING COUNT(*)>1;
This query is simple. Here, we are using the GROUP BY clause to group the identical rows in the Names column. Then we are finding the number of duplicates in that column using the COUNT() function and show that data in a new column named Occurrence. Having a clause only keeps the groups that have more than one occurrence.
We have found the duplicate Names and their Occurrence in our table. This information can help us to remove the duplicate rows from the table in the future.