Open In App

How to Delete Duplicate Rows in PL/SQL?

Last Updated : 26 Apr, 2024
Improve
Improve
Like Article
Like
Save
Share
Report

Inconsistencies and inefficiencies in data management are frequently caused by duplicate rows in a database table. Eliminating duplicate rows is a typical PL/SQL activity to maintain data integrity and improve database performance.

This article will guide you on how to remove duplicated rows in PL/SQL. It provides you with easy-to-understand examples and terminology to help you better understand the procedure.

PL/SQL Deleting Duplicate Rows

In PL/SQL, duplicate rows are removed by finding and eliminating rows with the same values in a subset of columns. Utilizing the ROWID pseudo-column to uniquely identify rows is one method of doing this. The main procedure entails eliminating rows with duplicate ROWIDs after choosing unique rows based on predetermined columns. The following is a summary of the PL/SQL syntax for removing duplicate rows:

Syntax:

DELETE FROM table_name

WHERE rowid not in

(SELECT MIN(rowid)

FROM table_name

GROUP BY column1, column2, …);

In each unique combination of columns, this statement removes rows from table_name where the ROWID is not the minimum ROWID.

Examples of PL/SQL deleting duplicate rows

Example 1: Deleting Duplicate Rows

Consider a scenario where we have a students table with duplicate entries, and we want to remove these duplicates based on the student_id column.

-- Schema for the students table
CREATE TABLE students (
student_id NUMBER,
student_name VARCHAR2(50),
age NUMBER
);

-- Insert some sample data with duplicates
INSERT INTO students VALUES (1, 'John Doe', 20);
INSERT INTO students VALUES (2, 'Jane Smith', 22);
INSERT INTO students VALUES (3, 'John Doe', 20); -- Duplicate
INSERT INTO students VALUES (4, 'Alice Johnson', 21);
INSERT INTO students VALUES (5, 'Jane Smith', 22); -- Duplicate

Output:

student_id

student_name

age

1

John Doe

20

2

Jane Smith

22

3

John Doe

20

4

Alice Johnson

21

5

Jane Smith

22

-- Delete duplicate rows
DELETE FROM students
WHERE rowid not in
(SELECT MIN(rowid)
FROM students
GROUP BY student_id);

Output:

student_id

student_name

age

1

John Doe

20

2

Jane Smith

22

3

Alice Johnson

21

2 rows deleted.

After executing the above code, duplicate rows based on the student_id column are removed from the students table.

Example 2: Deletion in different Case

Let’s now consider a different scenario where we have a sales table with duplicate entries based on multiple columns, and we want to remove these duplicates.

-- Schema for the sales table
CREATE TABLE sales (
order_id NUMBER,
product_id NUMBER,
quantity NUMBER
);

-- Insert some sample data with duplicates
INSERT INTO sales VALUES (1, 101, 5);
INSERT INTO sales VALUES (2, 102, 3);
INSERT INTO sales VALUES (3, 101, 5); -- Duplicate
INSERT INTO sales VALUES (4, 103, 2);
INSERT INTO sales VALUES (5, 102, 3); -- Duplicate

Output:

order_id

product_id

quantity

1

101

5

2

102

3

3

101

5

4

103

2

5

102

3

-- Delete duplicate rows
DELETE FROM sales
WHERE rowid not in
(SELECT MIN(rowid)
FROM sales
GROUP BY order_id, product_id, quantity);

Output:

order_id

product_id

quantity

1

101

5

2

102

3

3

103

2

2 rows deleted.

After executing the above code, duplicate rows based on multiple columns (order_id, product_id, quantity) are removed from the sales table.

Conclusion

In PL/SQL, removing duplicate rows is crucial to preserving data integrity and maximising database efficiency. With the help of the examples and syntax offered, users may efficiently find and remove duplicate entries from their database tables. By preventing redundancies and maintaining accuracy, this procedure serves to increase the quality and dependability of the data.


Like Article
Suggest improvement
Previous
Next
Share your thoughts in the comments

Similar Reads