Top-N Analysis in SQL deals with How to limit the number of rows returned from ordered sets of data in SQL.
Top-N queries ask for the n smallest or largest values of a column. Both smallest and largest values sets are considered Top-N queries. Following this type of searching technique could save lot of time and complexities. Top-N analysis are useful in cases where the need is to display only the n bottom-most or the n top-
most records from a table based on a condition. This result set can be used for further analysis.
For example, using Top-N analysis we can perform the following types of queries:
- The top five products that have had the maximum sales in the last two months.
- The top three agents who sold the maximum policies.
- The least two students who scored minimum marks in end semester exams.
Performing Top-N Analysis
Following the below mentioned queries we can easily understand the working of Top-N analysis in SQL:
SELECT [column_list], ROWNUM FROM (SELECT [column_list] FROM table_name ORDER BY Top-N_clolumn) WHERE ROWNUM<=N;
We will perform the various commands on the following table named Employee:
Input : SELECT ROWNUM as RANK, first_name, last_name, employee_id, salary FROM (SELECT salary, first_name, last_name, employee_id FROM Employee ORDER BY salary) WHERE ROWNUM<=3;
Explanation: In the above SQL statement, the required fields are displayed for employees with top 3 highest salaries. The result is displayed in decreasing order of their salaries.
Input : SELECT ROWNUM as RANK, first_name, employee_id, hire_date FROM (SELECT first_name, employee_id, hire_date FROM Employee ORDER BY hire_date) WHERE ROWNUM<=3;
Explanation: In the above SQL statement, the required fields are displayed for those 3 employees who were hired earliest. The result is displayed in increasing order of their hire date.
Different styles for using Top-N analysis
- Inline View and ROWNUM : The classic Top-N style query uses an ordered inline view to force the data into the correct order which then finally uses the ROWNUM check to limit the data returned.
Input : SELECT first_name, last_name FROM (SELECT first_name, last_name FROM Employee ORDER BY salary DESC) WHERE ROWNUM<=4;
Explanation: In the above SQL statement, the required fields are displayed for highest paid 4 employees. The altering is done by ORDER BY clause.
- Nested Inline View and ROWNUM : This method can also be used for paging through data, like paged web reports.
Input : SELECT employee_id, first_name, salary FROM (SELECT employee_id, first_name, salary, rownum AS rnum FROM (SELECT employee_id, first_name, salary FROM Employee ORDER BY salary) WHERE rownum<=4) WHERE rnum>=2;
Explanation: In the above SQL statement, first of all the inside query runs and gives its output to the outer query which then finally gives us the desired output.
- Using RANK function : The RANK analytic function assigns a sequential rank to each distinct value in output.
Input : SELECT dpartment_id, first_name FROM (SELECT dpartment_id, first_name, RANK() OVER (ORDER BY dpartment_id DESC) AS rnum FROM Employee) WHERE rnum<=3;
Explanation: In the above SQL statement, RANK() function also acts as a virtual field whose value is restricted at the end. RANK() function doesn’t give us the top N rows or the top N distinct values. The number of rows returned is dependent on the number of duplicates in the data.
- Using DENSE_RANK function : The DENSE_RANK analytic function is similar to RANK() function. The difference is that the ranks are compacted due to which there are no gaps.
Input : SELECT dpartment_id, first_name FROM (SELECT dpartment_id, first_name, DENSE_RANK() OVER (ORDER BY dpartment_id DESC) AS rnum FROM Employee) WHERE rnum<=3;
Explanation: In the above SQL statement, DENSE_RANK() function also assigns same rank to the duplicate values but there is no gap in the rank sequence. Thus it always gives us a Top N distinct values result.
- Using ROW_NUMBER function : The ROW_NUMBER analytic function is similar to ROWNUM virtual column but like all analytic functions its action can be limited to a specific output of data based on the order of data.
Input : SELECT dpartment_id, first_name FROM (SELECT dpartment_id, first_name, ROW_NUMBER() OVER (ORDER BY dpartment_id DESC) AS rnum FROM Employee) WHERE rnum<=4;
Explanation: In the above SQL statement, ROW_NUMBER() will only select the top N values irrespective of their being duplicate.
- SQL | SUB Queries
- SQL | Sub queries in From Clause
- Perform the given queries on the rooted tree
- Array range queries for elements with frequency same as value
- Queries to check whether a given digit is present in the given Range
- Range Queries for Longest Correct Bracket Subsequence
- Queries for the minimum element in an array excluding the given index range
- Mitigation of SQL Injection Attack using Prepared Statements (Parameterized Queries)
- Binary Indexed Tree : Range Update and Range Queries
- Queries to find the last non-repeating character in the sub-string of a given string
- SQL queries on clustered and non-clustered Indexes
- Evolution of Foldable Smartphones
- PLSQL | UPPER Function
- How to wait for a promise to finish before returning the variable of a function?
If you like GeeksforGeeks and would like to contribute, you can also write an article using contribute.geeksforgeeks.org or mail your article to firstname.lastname@example.org. See your article appearing on the GeeksforGeeks main page and help other Geeks.
Please Improve this article if you find anything incorrect by clicking on the "Improve Article" button below.