SQL | Top-N Queries

Top-N Analysis in SQL deals with How to limit the number of rows returned from ordered sets of data in SQL.
Top-N queries ask for the n smallest or largest values of a column. Both smallest and largest values sets are considered Top-N queries. Following this type of searching technique could save lot of time and complexities. Top-N analysis are useful in cases where the need is to display only the n bottom-most or the n top-
most
records from a table based on a condition. This result set can be used for further analysis.

For example, using Top-N analysis we can perform the following types of queries:

  • The top five products that have had the maximum sales in the last two months.
  • The top three agents who sold the maximum policies.
  • The least two students who scored minimum marks in end semester exams.

Performing Top-N Analysis

Following the below mentioned queries we can easily understand the working of Top-N analysis in SQL:
Syntax:

SELECT [column_list], ROWNUM
FROM (SELECT [column_list]
      FROM table_name
      ORDER BY Top-N_clolumn)
WHERE ROWNUM<=N;

We will perform the various commands on the following table named Employee:

Example 1:

Input : 
SELECT ROWNUM as RANK, first_name, last_name, employee_id, salary
FROM (SELECT salary, first_name, last_name, employee_id
      FROM Employee
      ORDER BY salary)
WHERE ROWNUM<=3;

Output :

Explanation: In the above SQL statement, the required fields are displayed for employees with top 3 highest salaries. The result is displayed in decreasing order of their salaries.

Example 2:

Input : 
SELECT ROWNUM as RANK, first_name, employee_id, hire_date
FROM (SELECT first_name, employee_id, hire_date
      FROM Employee
      ORDER BY hire_date)
WHERE ROWNUM<=3;

Output :

Explanation: In the above SQL statement, the required fields are displayed for those 3 employees who were hired earliest. The result is displayed in increasing order of their hire date.

Different styles for using Top-N analysis

  • Inline View and ROWNUM : The classic Top-N style query uses an ordered inline view to force the data into the correct order which then finally uses the ROWNUM check to limit the data returned.

    Example:

    Input : 
    SELECT first_name, last_name
    FROM (SELECT first_name, last_name
          FROM Employee
          ORDER BY salary DESC)
    WHERE ROWNUM<=4;
    

    Output :

    Explanation: In the above SQL statement, the required fields are displayed for highest paid 4 employees. The altering is done by ORDER BY clause.

  • Nested Inline View and ROWNUM : This method can also be used for paging through data, like paged web reports.
    Example:

    Input :
    SELECT employee_id, first_name, salary
    FROM   (SELECT employee_id, first_name, salary, rownum AS rnum
            FROM   (SELECT employee_id, first_name, salary
                    FROM Employee
                    ORDER BY salary)
            WHERE rownum<=4)
    WHERE  rnum>=2;
    

    Output :

    Explanation: In the above SQL statement, first of all the inside query runs and gives its output to the outer query which then finally gives us the desired output.

  • Using RANK function : The RANK analytic function assigns a sequential rank to each distinct value in output.
    Example:

    
    Input : 
    SELECT dpartment_id, first_name
    FROM (SELECT dpartment_id, first_name,
          RANK() OVER (ORDER BY dpartment_id DESC) AS rnum 
          FROM Employee)
    WHERE rnum<=3;
    

    Output :

    Explanation: In the above SQL statement, RANK() function also acts as a virtual field whose value is restricted at the end. RANK() function doesn’t give us the top N rows or the top N distinct values. The number of rows returned is dependent on the number of duplicates in the data.

  • Using DENSE_RANK function : The DENSE_RANK analytic function is similar to RANK() function. The difference is that the ranks are compacted due to which there are no gaps.
    Example:

    Input : 
    SELECT dpartment_id, first_name
    FROM (SELECT dpartment_id, first_name,
          DENSE_RANK() OVER (ORDER BY dpartment_id DESC) AS rnum 
          FROM Employee)
    WHERE rnum<=3;
    

    Output :

    Explanation: In the above SQL statement, DENSE_RANK() function also assigns same rank to the duplicate values but there is no gap in the rank sequence. Thus it always gives us a Top N distinct values result.

  • Using ROW_NUMBER function : The ROW_NUMBER analytic function is similar to ROWNUM virtual column but like all analytic functions its action can be limited to a specific output of data based on the order of data.
    Example:

    
    Input : 
    SELECT dpartment_id, first_name
    FROM (SELECT dpartment_id, first_name,
          ROW_NUMBER() OVER (ORDER BY dpartment_id DESC) AS rnum 
          FROM Employee)
    WHERE rnum<=4;
    

    Output :

    Explanation: In the above SQL statement, ROW_NUMBER() will only select the top N values irrespective of their being duplicate.



My Personal Notes arrow_drop_up

Check out this Author's contributed articles.

If you like GeeksforGeeks and would like to contribute, you can also write an article using contribute.geeksforgeeks.org or mail your article to contribute@geeksforgeeks.org. See your article appearing on the GeeksforGeeks main page and help other Geeks.

Please Improve this article if you find anything incorrect by clicking on the "Improve Article" button below.




Article Tags :
Practice Tags :


1


Please write to us at contribute@geeksforgeeks.org to report any issue with the above content.