How to Restrict Results to top N Rows per Group in PL/SQL?

Last Updated : 19 Mar, 2024

In database work, retrieving the top N rows from each group is a common but sometimes tricky task. Whether you are summarizing large datasets or looking for specific insights within groups of data, the ability to restrict output to the top N rows per group is valuable.

PL/SQL is Oracle’s procedural extension to SQL. The queries in this article are plain Oracle SQL, so they can be run on their own or embedded in PL/SQL blocks. The sections below walk through several methods for returning the top N rows per group.

How to Restrict Results to Top N Rows per Group in PL/SQL

Suppose a table contains several groups and you need the top N records from each group according to some criterion. Typical examples are the top salespeople per region or the top employees per department.

  • Using Analytic Functions
  • Using Subqueries
  • Using Common Table Expressions (CTEs)
  • Using RANK() Function

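We’ll discuss four ways to extract the top N records per group: analytic functions, subqueries, Common Table Expressions (CTEs), and the RANK() function. The first three examples all rely on ROW_NUMBER() and differ mainly in how the query is structured, while RANK() differs in how it treats ties.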

Set Up an Environment

First, create a sales_data table and insert some sample rows:

CREATE TABLE sales_data (
region VARCHAR2(50),
salesperson VARCHAR2(50),
sales_amount NUMBER
);

INSERT INTO sales_data (region, salesperson, sales_amount) VALUES ('North', 'Alice', 1000);
INSERT INTO sales_data (region, salesperson, sales_amount) VALUES ('North', 'Bob', 1500);
INSERT INTO sales_data (region, salesperson, sales_amount) VALUES ('North', 'Charlie', 1200);
INSERT INTO sales_data (region, salesperson, sales_amount) VALUES ('South', 'David', 1100);
INSERT INTO sales_data (region, salesperson, sales_amount) VALUES ('South', 'Emma', 1300);
INSERT INTO sales_data (region, salesperson, sales_amount) VALUES ('South', 'Frank', 900);

Output:

[Image: contents of the sales_data table]

1. Using Analytic Functions

Analytic Functions like ROW_NUMBER() can be employed to partition data by groups and assign row numbers based on specific criteria, enabling the selection of top N rows per group.

Query:

SELECT *
FROM (
    -- Oracle requires a qualified s.* when other expressions follow in the select list
    SELECT s.*,
           ROW_NUMBER() OVER (PARTITION BY region ORDER BY sales_amount DESC) AS row_num
    FROM sales_data s
)
WHERE row_num <= 2;

Output:

[Image: query result, top 2 rows per region with row_num]

Explanation: The output displays the top 2 rows per region from the sales_data table, ordered by sales amount in descending order, with row numbers assigned accordingly.
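Since the article is about PL/SQL, it may help to see the same query run from a PL/SQL block with N held in a variable. The following is a minimal sketch, assuming the sales_data table from the setup section; the variable name l_top_n and the output format are illustrative choices, not part of the original article. In SQL*Plus or SQLcl, run SET SERVEROUTPUT ON first so the printed lines are visible.

```sql
-- Sketch: top N rows per region from an anonymous PL/SQL block.
-- l_top_n is a hypothetical variable name; change its value to change N.
DECLARE
    l_top_n PLS_INTEGER := 2;  -- number of rows to keep per region
BEGIN
    -- Cursor FOR loop: the PL/SQL variable is bound into the query automatically
    FOR rec IN (
        SELECT region, salesperson, sales_amount, row_num
        FROM (
            SELECT s.*,
                   ROW_NUMBER() OVER (PARTITION BY region
                                      ORDER BY sales_amount DESC) AS row_num
            FROM sales_data s
        )
        WHERE row_num <= l_top_n
        ORDER BY region, row_num
    ) LOOP
        DBMS_OUTPUT.PUT_LINE(rec.region || ' #' || rec.row_num || ': '
                             || rec.salesperson || ' (' || rec.sales_amount || ')');
    END LOOP;
END;
/
```

With the sample rows inserted earlier, this should print Bob and Charlie for North, then Emma and David for South.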

2. Using Subqueries

Subqueries can be utilized to filter data based on the results of inner queries, enabling the selection of top N rows per group.

Query:

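SELECT *
FROM sales_data t
-- Match on region and amount together so that equal amounts in other regions are not picked up
WHERE (t.region, t.sales_amount) IN (
    SELECT region, sales_amount
    FROM (
        SELECT region,
               sales_amount,
               ROW_NUMBER() OVER (PARTITION BY region ORDER BY sales_amount DESC) AS row_num
        FROM sales_data
    )
    WHERE row_num <= 2
);

Output:

[Image: query result, top 2 rows per region]

Explanation: The inner query numbers the rows in each region by descending sales amount and keeps the top 2. The outer query then returns the sales_data rows whose (region, sales_amount) pair matches one of them. Comparing on sales_amount alone would also return rows from other regions that happen to have the same amount. If two rows in a region tie on sales_amount at the cutoff, both are returned.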
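The query above still relies on an analytic function inside the subquery. As a sketch of a subquery-only alternative (not from the original article), a correlated subquery can count how many rows in the same region have a strictly higher sales amount and keep the rows where that count is below N. It assumes the sales_data table from the setup section.

```sql
-- Sketch: top 2 per region using only a correlated subquery.
-- A row is kept when fewer than 2 rows in its region have a higher sales_amount.
SELECT t.region, t.salesperson, t.sales_amount
FROM sales_data t
WHERE (
    SELECT COUNT(*)
    FROM sales_data s
    WHERE s.region = t.region
      AND s.sales_amount > t.sales_amount
) < 2
ORDER BY t.region, t.sales_amount DESC;
```

This form treats ties the way RANK() does, so it can return more than two rows for a region when amounts are equal. On large tables it is likely to be slower than the analytic version, because the inner count is evaluated per outer row unless the optimizer rewrites it.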

3. Common Table Expressions (CTEs)

CTEs provide a readable and reusable way to define temporary result sets, facilitating the selection of top N rows per group.

Query:

WITH ranked_data AS (
    -- Number the rows in each region, highest sales_amount first
    SELECT s.*,
           ROW_NUMBER() OVER (PARTITION BY region ORDER BY sales_amount DESC) AS row_num
    FROM sales_data s
)
SELECT *
FROM ranked_data
WHERE row_num <= 2;

Output:

[Image: query result, top 2 rows per region with row_num]

Explanation: The output selects the top 2 rows per region from the sales_data table, ordered by sales amount in descending order, using a common table expression to assign the row numbers.
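ROW_NUMBER() chooses arbitrarily among rows with the same sales amount, so the selected rows can vary between runs when ties exist. One way to make the result repeatable is to add a secondary sort key. The sketch below uses salesperson as the tie-breaker, which is an assumption made for illustration; any column or combination that is unique within a region would serve.

```sql
-- Sketch: CTE with a deterministic tie-breaker (salesperson is an assumed choice).
WITH ranked_data AS (
    SELECT s.region,
           s.salesperson,
           s.sales_amount,
           ROW_NUMBER() OVER (PARTITION BY s.region
                              ORDER BY s.sales_amount DESC, s.salesperson) AS row_num
    FROM sales_data s
)
SELECT region, salesperson, sales_amount
FROM ranked_data
WHERE row_num <= 2
ORDER BY region, row_num;
```

Listing the columns explicitly instead of using * also keeps the helper column row_num out of the final result.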

4. Using RANK() Function

The RANK() function assigns a rank to each row within a partition, allowing for the selection of top N rows per group based on ranking.

Query:

SELECT *
FROM (
    -- Rows with equal sales_amount in a region receive the same rank
    SELECT s.*,
           RANK() OVER (PARTITION BY region ORDER BY sales_amount DESC) AS rank_num
    FROM sales_data s
)
WHERE rank_num <= 2;

Output:

[Image: query result, rows with rank_num <= 2 per region]

Explanation: The output presents records from the sales_data table where each row is assigned a rank based on the sales amount within its region, sorted in descending order. Only rows with ranks less than or equal to 2 are displayed. Because RANK() gives tied rows the same rank, more than two rows per region can be returned when sales amounts tie.
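The ranking functions differ only when sales amounts tie, and placing them side by side makes that easier to see. The sketch below adds DENSE_RANK(), an Oracle analytic function that the original article does not cover, next to ROW_NUMBER() and RANK(). It assumes the sales_data table from the setup section.

```sql
-- Sketch: compare the three ranking functions on the same ordering.
SELECT region,
       salesperson,
       sales_amount,
       ROW_NUMBER() OVER (PARTITION BY region ORDER BY sales_amount DESC) AS row_num,
       RANK()       OVER (PARTITION BY region ORDER BY sales_amount DESC) AS rank_num,
       DENSE_RANK() OVER (PARTITION BY region ORDER BY sales_amount DESC) AS dense_rank_num
FROM sales_data
ORDER BY region, sales_amount DESC;
```

The sample data has no ties, so all three columns show the same values. For a region with amounts 1500, 1500, and 1200, ROW_NUMBER() would give 1, 2, 3, RANK() would give 1, 1, 3, and DENSE_RANK() would give 1, 1, 2. Filtering on `<= 2` would therefore keep two, two, and three rows respectively.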

Conclusion

Returning the top N rows per group opens up many options for data analysis and reporting. Analytic functions, subqueries, CTEs, and other methods all let developers and analysts work with grouped data precisely and flexibly. Choosing the method that fits the requirements and constraints of the task, such as how ties should be treated, makes it easier to pull useful information out of large, multi-group datasets.


