How to Find Duplicates Values Across Multiple Columns in SQL?

Last Updated : 19 Nov, 2021

In SQL, sometimes we need to find duplicate entries across multiple columns in a table in a single query. We will use the GROUP BY and COUNT keywords to achieve this. For this, we use a specific kind of query shown in the below demonstration. For this article, we will be using the Microsoft SQL Server as our database and Select keyword.

Step 1: Create a Database. For this use the below command to create a database named GeeksForGeeks.

Query:

CREATE DATABASE GeeksForGeeks

Output:

Step 2: Use the GeeksForGeeks database. For this use the below command.

Query:

USE GeeksForGeeks

Output:

Step 3: Create a table of POSTINGS inside the database GeeksForGeeks. This table has 4 columns namely POSTING_ID, OFFICER_NAME, TEAM_SIZE and POSTING_LOCATION containing the id, the name of the officer, the size of the team and locations of the postings of Foreign Services in different countries.

Query:

CREATE TABLE POSTINGS(
POSTING_ID INT,
OFFICER_NAME VARCHAR(10),
TEAM_SIZE INT,
POSTING_LOCATION VARCHAR(10));

Output:

Step 4: Describe the structure of the table POSTINGS.

Query:

EXEC SP_COLUMNS POSTINGS;

Output:

Step 5: Insert 5 rows into the POSTINGS table.

Query:

INSERT INTO POSTINGS VALUES(1,'RYAN',10,'GERMANY');
INSERT INTO POSTINGS VALUES(2,'JACK',6,'ROMANIA');
INSERT INTO POSTINGS VALUES(3,'JANE',4,'HAWAII');
INSERT INTO POSTINGS VALUES(4,'JIM',10,'GERMANY');
INSERT INTO POSTINGS VALUES(5,'TIM',10,'GERMANY');
INSERT INTO POSTINGS VALUES(6,'RYAN',11,'GERMANY');
INSERT INTO POSTINGS VALUES(7,'RYAN',10,'GERMANY');
INSERT INTO POSTINGS VALUES(8,'RYAN',10,'GERMANY');
INSERT INTO POSTINGS VALUES(9,'JACK',6,'CUBA');
INSERT INTO POSTINGS VALUES(10,'JACK',6,'HAITI');

Output:

Step 6: Display all the rows of the POSTINGS table.

Query:

SELECT * FROM POSTINGS;

Output:

Step 7: Find duplicates in 3(multiple) columns i.e. in OFFICER_NAME, TEAM_SIZE and POSTING_LOCATION in the table POSTINGS. To achieve the, we need to group the records by these three columns and display those which have the count greater than 1 i.e. have matching values. Use the keywords GROUP BY and COUNT.

Syntax:

SELECT COLUMN_NAME1, COLUMN_NAME2, 
COLUMN_NAME3, COUNT(*) AS ALIAS
FROM TABLE_NAME
GROUP BY COLUMN_NAME1, COLUMN_NAME2,
COLUMN_NAME3 HAVING COUNT(*)>1;

Query:

SELECT OFFICER_NAME, TEAM_SIZE,
POSTING_LOCATION, COUNT(*) AS QTY
FROM POSTINGS
GROUP BY OFFICER_NAME, TEAM_SIZE,
POSTING_LOCATION HAVING COUNT(*)>1;

Output:

Step 8: Find duplicates in 2(multiple) columns i.e. in TEAM_SIZE and POSTING_LOCATION in the table POSTINGS.

Query:

SELECT TEAM_SIZE, POSTING_LOCATION, 
COUNT(*) AS QTY
FROM POSTINGS
GROUP BY TEAM_SIZE, POSTING_LOCATION
 HAVING COUNT(*)>1;

Output:

Step 9: Find duplicates in 2(multiple) columns i.e. in OFFICER_NAME and TEAM_SIZE in the table POSTINGS.

Query:

SELECT OFFICER_NAME, TEAM_SIZE, 
COUNT(*) AS QTY
FROM POSTINGS
GROUP BY OFFICER_NAME, 
TEAM_SIZE HAVING COUNT(*)>1;

Output:

Suggest improvement

How to Exclude Records With Certain Values in SQL Select?

How to Update Multiple Records Using One Query in SQL Server?

Share your thoughts in the comments

How to Find Duplicates Values Across Multiple Columns in SQL?

Please Login to comment...

Similar Reads

What kind of Experience do you want to share?