Difference between Where and Group By

Prerequisite – WHERE Clause, GROUP BY, Having vs Where Clause
Where and Gropb By clauses are used to filter rows returned by the query based on the condition. The differences are as following below.

WHERE clause specifies search conditions for the rows returned by the Query and limits rows to a specific row-set. If a table has huge amount of records and if someone wants to get the particular records then using ‘where’ clause is useful.

GROUP BY clause summaries identical rows into a single/distinct group and returns a single row with the summary for each group, by using appropriate Aggregate function in the SELECT list, like COUNT(), SUM(), MIN(), MAX(), AVG(), etc.



Use Case:
Suppose some sales company wants to get a list of Customers who bought some number of items last year, so that they can sell more some stuff to them this year.
There is table called SalesOrder with columns CustomerId, SalesOrderId, Order_Date, OrderNumber, OrderItem, UnitPrice, OrderQty
Now we need to get the customers who made orders last year i.e. 2017

Using Where clause –

SELECT * 
FROM [Sales].[Orders]
WHERE Order_Date >= '2017-01-01 00:00:00.000'
AND Order_Date < '2018-01-01 00:00:00.000' 

This will return the row set with all the Customers and corresponding Orders of year 2017.

Using Group By clause –

SELECT CustomerID, COUNT(*) AS OrderNumbers
FROM [Sales].[Orders]
WHERE Order_Date >= '2017-01-01 00:00:00.000'
AND Order_Date < '2018-01-01 00:00:00.000'
GROUP BY CustomerId 

This will return the row set of the Customers (CustomerId) who made orders in year 2017 and total count of orders each Customer made.

Using Having Clause –
Having clause is used to filter values in Group By clause. The below query filters out some of the rows

SELECT SalesOrderID,
         SUM(UnitPrice* OrderQty) AS TotalPrice
FROM     Sales.SalesOrderDetail
GROUP BY SalesOrderID
HAVING   TotalPrice > 5000 

Since the WHERE clause’s visibility is one row at a time, there isn’t a way for it to evaluate the SUM across all SalesOrderID’s. The HAVING clause is evaluated after the grouping is created.

You can use ‘Where’ clause with ‘Having’ clause as well. The WHERE clause is applied first to the individual rows in the tables. Only the rows that meet the conditions in the WHERE clause are grouped. The HAVING clause is then applied to the rows in the result set.

Example:

SELECT SalesOrderID,
         SUM(UnitPrice * OrderQty) AS TotalPrice
FROM     Sales.SalesOrderDetail
WHERE    SalesOrderID > 500
GROUP BY SalesOrderID
HAVING   SUM(UnitPrice * OrderQty) > 10000 

So here, the having clause will be applied on the rows that are filtered by where clause. Having clause can only compare results of aggregated functions or column part of the group by.

Conclusion:

  1. WHERE is used to filter records before any groupings take place that is on single rows.
  2. GROUP BY aggregates/ groups the rows and returns the summary for each group.
  3. HAVING is used to filter values after they have been groups.


My Personal Notes arrow_drop_up

Check out this Author's contributed articles.

If you like GeeksforGeeks and would like to contribute, you can also write an article using contribute.geeksforgeeks.org or mail your article to contribute@geeksforgeeks.org. See your article appearing on the GeeksforGeeks main page and help other Geeks.

Please Improve this article if you find anything incorrect by clicking on the "Improve Article" button below.