How to Find Duplicates Values Across Multiple Columns in SQL?
Last Updated :
19 Nov, 2021
In SQL, sometimes we need to find duplicate entries across multiple columns in a table in a single query. We will use the GROUP BY and COUNT keywords to achieve this. For this, we use a specific kind of query shown in the below demonstration. For this article, we will be using the Microsoft SQL Server as our database and Select keyword.
Step 1: Create a Database. For this use the below command to create a database named GeeksForGeeks.
Query:
CREATE DATABASE GeeksForGeeks
Output:
Step 2: Use the GeeksForGeeks database. For this use the below command.
Query:
USE GeeksForGeeks
Output:
Step 3: Create a table of POSTINGS inside the database GeeksForGeeks. This table has 4 columns namely POSTING_ID, OFFICER_NAME, TEAM_SIZE and POSTING_LOCATION containing the id, the name of the officer, the size of the team and locations of the postings of Foreign Services in different countries.
Query:
CREATE TABLE POSTINGS(
POSTING_ID INT,
OFFICER_NAME VARCHAR(10),
TEAM_SIZE INT,
POSTING_LOCATION VARCHAR(10));
Output:
Step 4: Describe the structure of the table POSTINGS.
Query:
EXEC SP_COLUMNS POSTINGS;
Output:
Step 5: Insert 5 rows into the POSTINGS table.
Query:
INSERT INTO POSTINGS VALUES(1,'RYAN',10,'GERMANY');
INSERT INTO POSTINGS VALUES(2,'JACK',6,'ROMANIA');
INSERT INTO POSTINGS VALUES(3,'JANE',4,'HAWAII');
INSERT INTO POSTINGS VALUES(4,'JIM',10,'GERMANY');
INSERT INTO POSTINGS VALUES(5,'TIM',10,'GERMANY');
INSERT INTO POSTINGS VALUES(6,'RYAN',11,'GERMANY');
INSERT INTO POSTINGS VALUES(7,'RYAN',10,'GERMANY');
INSERT INTO POSTINGS VALUES(8,'RYAN',10,'GERMANY');
INSERT INTO POSTINGS VALUES(9,'JACK',6,'CUBA');
INSERT INTO POSTINGS VALUES(10,'JACK',6,'HAITI');
Output:
Step 6: Display all the rows of the POSTINGS table.
Query:
SELECT * FROM POSTINGS;
Output:
Step 7: Find duplicates in 3(multiple) columns i.e. in OFFICER_NAME, TEAM_SIZE and POSTING_LOCATION in the table POSTINGS. To achieve the, we need to group the records by these three columns and display those which have the count greater than 1 i.e. have matching values. Use the keywords GROUP BY and COUNT.
Syntax:
SELECT COLUMN_NAME1, COLUMN_NAME2,
COLUMN_NAME3, COUNT(*) AS ALIAS
FROM TABLE_NAME
GROUP BY COLUMN_NAME1, COLUMN_NAME2,
COLUMN_NAME3 HAVING COUNT(*)>1;
Query:
SELECT OFFICER_NAME, TEAM_SIZE,
POSTING_LOCATION, COUNT(*) AS QTY
FROM POSTINGS
GROUP BY OFFICER_NAME, TEAM_SIZE,
POSTING_LOCATION HAVING COUNT(*)>1;
Output:
Step 8: Find duplicates in 2(multiple) columns i.e. in TEAM_SIZE and POSTING_LOCATION in the table POSTINGS.
Query:
SELECT TEAM_SIZE, POSTING_LOCATION,
COUNT(*) AS QTY
FROM POSTINGS
GROUP BY TEAM_SIZE, POSTING_LOCATION
HAVING COUNT(*)>1;
Output:
Step 9: Find duplicates in 2(multiple) columns i.e. in OFFICER_NAME and TEAM_SIZE in the table POSTINGS.
Query:
SELECT OFFICER_NAME, TEAM_SIZE,
COUNT(*) AS QTY
FROM POSTINGS
GROUP BY OFFICER_NAME,
TEAM_SIZE HAVING COUNT(*)>1;
Output:
Like Article
Suggest improvement
Share your thoughts in the comments
Please Login to comment...