Order of Execution of SQL Queries

Last Updated : 08 Sep, 2023

Structured Query Language (SQL) is a standard language for managing relational databases. For any SQL query, the SQL Engine will execute the clauses in some specific standard order, which is referred to as the Order of Execution of SQL Queries.

Why Ordering is Important?

The order of execution of SQL Query optimizes the query, reduces the amount of data to be processed, and affects the performance of the Query processing significantly. The order makes the SQL engine process the queries faster, and efficiently and obtains optimized query results. Understanding this SQL query execution order helps to debug the code, write efficient queries, and trace the output of SQL accurately.

Order of Execution

The order of execution of an SQL query’s clauses is as follows:

First, the table on which the DML operation is performed has to be processed. So, the FROM clause is evaluated first in an SQL Query. If the query contains JOIN clauses, tables are combined by merging rows involved, before FROM clause. So, JOIN predates FROM in statements with JOIN. Now table data is acquired before any filter or group is done. So subsequent clauses can be evaluated based on this data which is much smaller than the original tables before JOIN operations. This also processes the subqueries.
SQL may create a temporary table internally to evaluate this. To make code memory efficient, it is good to have simpler table JOINs it would be highly memory intensive.
WHERE: After the table data on which other operations take place is processed by JOIN and FROM clause, WHERE clause is evaluated. WHERE clause filters the rows based on conditions from the table evaluated by the FROM clause. This WHERE clause discards rows that don’t satisfy the conditions, thus reducing the rows of data that need to be processed further in other clauses.
GROUP BY: If the query has a GROUP BY clause, it is executed next. Here, the data is grouped based on the common value in the column specified in the GROUP BY clause. This reduces the number of rows further equal to no of distinct values in the GROUP BY column. This helps to calculate aggregate functions.
HAVING: If the query had a GROUP BY clause, the HAVING clause is evaluated immediately after GROUP BY. HAVING clause is not compulsory for GROUP BY. Similar to WHERE operations, this clause also filters the table group processed before by the GROUP BY clause. This HAVING also discards rows that don’t satisfy the conditions, thus reducing the rows of data that need to be processed further in other clauses
SELECT: The SELECT is executed next after GROUP BY and HAVING. It computes expressions (arithmetic, aggregate, etc.) and aliases given in the SELECT clause. The computation is now performed on the smallest dataset after much filtering and grouping operations done by previous clauses.
DISTINCT: The DISTINCT clause is executed after evaluating expressions, and alias references in the previous step. It filters any duplicate rows and returns only unique rows.
ORDER BY: After executing all the above clauses, the data to be displayed or processed is computed. Now ORDER BY is executed to sort it based on particular column(s) either in ascending or descending order. It is left associative, that is it is sorted based on the first specified column and then by the second, and so on.
LIMIT/OFFSET: At last, after the order of data to be processed is evaluated, LIMIT and OFFSET clauses are evaluated to display only the rows that fall within the LIMIT. So, it is generally not recommended to LIMIT only certain rows from many rows evaluated before, since It is not efficient and waste of computation.

Example: The order of Execution of SQL is explained using an SQL query.

Assume there is a table with the name “orders” which contains data about orders made by customers, like order_ID, customer_ID, customer_city, order_date, and total_amount of the order.

The below query retrieves the total amount of all orders named “TOTAL” made by customers who are living in New York and placed their orders between the dates January 1, 2022, and March 31, 2022, sorted by the total amount in descending order.

Syntax:

SELECT customer_ID, SUM(total_amount) AS “Total”
FROM orders
WHERE order_date BETWEEN ‘2022-01-01’ AND ‘2022-03-31’
AND customer_city = ‘New York’
GROUP BY customer_id
ORDER BY Total DESC;

The above query is executed in the following way. First, the FROM clause is executed to identify the table involved, which is “orders.” Next, the WHERE clause is executed to filter the rows based on the conditions, which here is the date range using BETWEEN operation and customer_city value. Next, the GROUP BY clause is executed which groups the rows based on Customer_ID. This makes us use the SUM aggregate function now. Next, the SELECT clause is executed which retrieves the columns customer_ID and SUM of total_amount corresponding to each ID with the alias name “TOTAL”. At last, the ORDER BY clause is executed which sorts the results based on “TOTAL” in descending order. The order of evaluation is described in the subset view below.

Order of Execution of a SQL Query. The order is in the bottom right. The first clause within the Red box is evaluated followed by the one in Yellow, then Green, Blue, and Violet.

If the above is executed in a different order, some incorrect results would have been evaluated or evaluation would be slow. For instance, if the ORDER BY clause is evaluated before the WHERE clause, the data would be sorted first then grouping would have taken place and It would have returned incorrect results.

By understanding the order of execution of SQL and applying optimization techniques, efficient queries that give the desired results faster and more accurately can be developed.

FAQs on Order of Execution of SQL

1. Can the order of these phases be changed or optimized?

The order of execution is generally fixed according to the Logical Query Processing Order, as it ensures the correct and consistent retrieval of data. However, query optimization techniques, such as index usage, caching, and query rewriting, can influence the performance of a query without changing the logical order of execution.

2. Are there any exceptions to this order of execution?

While the Logical Query Processing Order provides a general guideline, certain database systems or query optimizers may employ optimizations that can rearrange the order of execution for improved performance, as long as the final result is consistent with the logical order. These optimizations are often transparent to the SQL developer.

3. What should I consider when writing efficient SQL queries?

When writing SQL queries, consider the following to improve efficiency:

Properly index columns used in WHERE and JOIN conditions.
Use appropriate data types and avoid unnecessary data type conversions.
Limit the use of SELECT * and only retrieve the columns you need.
Minimize the use of subqueries and consider JOIN alternatives.
Monitor and analyze query performance using database-specific tools and profiling.

Suggest improvement

Query-Execution Plan in SQL

Share your thoughts in the comments