Open In App

How to Restrict Results to Top N Rows per Group in MySQL

Last Updated : 12 Mar, 2024
Improve
Improve
Like Article
Like
Save
Share
Report

When working with MySQL databases, the need to limit results to the top N rows per group is a common requirement. This is particularly useful in scenarios where you aim to retrieve a specific number of records for each distinct group in your dataset. Achieving this involves leveraging the ROW_NUMBER() window function along with the PARTITION BY clause to effectively partition the data into groups.

How to Restrict Results to Top N Rows per Group in MySQL

When dealing with databases, limiting results to the top N rows within specific groups is crucial. We’ll explore three approaches in MySQL to efficiently handle this scenario.

  • Using Subquery with ROW_NUMBER()
  • Using Common Table Expression (CTE)
  • Using DENSE_RANK() for Tie-Breaking

Syntax:

SELECT *

FROM (

SELECT

your_columns,

ROW_NUMBER() OVER (PARTITION BY your_grouping_column ORDER BY your_ordering_column DESC) AS row_num

FROM

your_table

) ranked

WHERE

row_num <= N;

Using ROW_NUMBER() and PARTITION BY

The ROW_NUMBER() function, combined with PARTITION BY, allows efficient grouping and filtering based on assigned row numbers.

SELECT
your_columns,
ROW_NUMBER() OVER (PARTITION BY your_grouping_column ORDER BY your_ordering_column DESC) AS row_num
FROM
your_table;

Example 1: Restricting Sales Data to the Top 2 Products per Category

Let’s consider a scenario where you have a table named sales with columns product, category, and revenue. The goal is to retrieve the top 2 products based on revenue for each category.

-- Sample Data
CREATE TABLE sales (
product VARCHAR(50),
category VARCHAR(50),
revenue INT
);

INSERT INTO sales VALUES ('Laptop', 'Electronics', 1000);
INSERT INTO sales VALUES ('Smartphone', 'Electronics', 1200);
INSERT INTO sales VALUES ('Refrigerator', 'Appliances', 800);
INSERT INTO sales VALUES ('Washing Machine', 'Appliances', 900);
INSERT INTO sales VALUES ('Tablet', 'Electronics', 500);
INSERT INTO sales VALUES ('Coffee Maker', 'Appliances', 600);

-- Query to Retrieve Top 2 Products per Category
SELECT
product,
category,
revenue
FROM (
SELECT
product,
category,
revenue,
ROW_NUMBER() OVER (PARTITION BY category ORDER BY revenue DESC) AS row_num
FROM
sales
) ranked
WHERE
row_num <= 2;


Output:

product

category

revenue

Smartphone

Electronics

1200

Laptop

Electronics

1000

Washing Machine

Appliances

900

Refrigerator

Appliances

800

Explanation: The query retrieves the top 2 products per category from the ‘sales’ table based on descending revenue. The output displays columns ‘product‘, ‘category‘, and ‘revenue‘. It utilizes the ROW_NUMBER() window function to assign row numbers within each category, and filters rows where the row number is 1 or 2.

Example 2: Retrieving Top 3 Employees per Department Based on Salary

Consider a table named employees with columns employee_id, department, and salary. You want to retrieve the top 3 employees based on salary for each department.

-- Sample Data
CREATE TABLE employees (
employee_id INT PRIMARY KEY,
department VARCHAR(50),
salary INT
);

INSERT INTO employees VALUES (1, 'HR', 50000);
INSERT INTO employees VALUES (2, 'IT', 60000);
INSERT INTO employees VALUES (3, 'HR', 55000);
INSERT INTO employees VALUES (4, 'IT', 65000);
INSERT INTO employees VALUES (5, 'Finance', 70000);

-- Query to Retrieve Top 3 Employees per Department
SELECT
employee_id,
department,
salary
FROM (
SELECT
employee_id,
department,
salary,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS row_num
FROM
employees
) ranked
WHERE
row_num <= 3;


Output:

employee_id

department

salary

2

IT

60000

4

IT

65000

5

Finance

70000

1

HR

50000

3

HR

55000

Explanation: The output displays the top 3 employees with the highest salaries in each department. It includes their ‘employee_id‘, ‘department‘, and ‘salary’. The result is obtained by ranking employees within each department based on salary in descending order using ROW_NUMBER().

Using Common Table Expression (CTE)

Utilizing a Common Table Expression enhances readability and provides a clear structure for the query.

Syntax:

WITH ranked_data AS (
SELECT
your_columns,
ROW_NUMBER() OVER (PARTITION BY your_grouping_column ORDER BY your_ordering_column DESC) AS row_num
FROM
your_table
)
SELECT *
FROM ranked_data
WHERE row_num <= N;

Example

WITH ranked_data AS (
SELECT
product,
category,
revenue,
ROW_NUMBER() OVER (PARTITION BY category ORDER BY revenue DESC) AS row_num
FROM
sales
)
SELECT *
FROM ranked_data
WHERE row_num <= 2;

Output:

| product          | category    | revenue | row_num |
|------------------|-------------|---------|---------|
| Smartphone | Electronics | 1200 | 1 |
| Laptop | Electronics | 1000 | 2 |
| Washing Machine | Appliances | 900 | 1 |
| Refrigerator | Appliances | 800 | 2 |
| Coffee Maker | Appliances | 600 | 3 |

Using DENSE_RANK()

Use DENSE_RANK() instead of ROW_NUMBER() for scenarios where tied rows should share the same rank.

Syntax:

SELECT
your_columns,
DENSE_RANK() OVER (PARTITION BY your_grouping_column ORDER BY your_ordering_column DESC) AS rank
FROM
your_table;

Example:


SELECT
product,
category,
revenue,
DENSE_RANK() OVER (PARTITION BY category ORDER BY revenue DESC) AS rank
FROM
sales;

Output:

| product          | category    | revenue | rank |
|------------------|-------------|---------|------|
| Smartphone | Electronics | 1200 | 1 |
| Laptop | Electronics | 1000 | 2 |
| Tablet | Electronics | 500 | 3 |
| Washing Machine | Appliances | 900 | 1 |
| Refrigerator | Appliances | 800 | 2 |
| Coffee Maker | Appliances | 600 | 3 |

The result shows the products, their respective categories, revenues, and the calculated ranks. Adjust the column names and table accordingly for your specific use case.

Conclusion

So, overall to Restricting results to the top N rows per group in MySQL is a powerful capability that can be achieved by using the ROW_NUMBER() window function along with PARTITION BY. This technique allows you to efficiently filter and retrieve specific records for each distinct group in your data, providing valuable insights into top-performing items or individuals within each category. Whether it’s sales data, employee records, or any other dataset, the flexibility of this approach makes it a valuable tool for data analysis and reporting in MySQL.



Like Article
Suggest improvement
Share your thoughts in the comments

Similar Reads