
DISTINCT vs GROUP BY in SQL

SQL (Structured Query Language) is used to manage and manipulate the data in relational databases. It can be used for tasks such as database querying, data editing, database and table creation and deletion, and granting user permissions.

We can use either the DISTINCT keyword or the GROUP BY clause to obtain the unique records of a table in SQL. Although both can produce the same result in this case, they are written differently and work differently.



DISTINCT Keyword

The DISTINCT keyword retrieves unique values or records from a table, eliminating duplicates. It is applied in the SELECT statement to obtain the unique values of one or more columns. Consider a sales table with order ID, product, customer name, and price. To find the unique products in the sales table, we can use the DISTINCT keyword.

Syntax:



SELECT DISTINCT column(s) FROM table_name;

GROUP BY

The GROUP BY clause groups rows that share the same values in the specified columns and is typically used with aggregate functions such as AVG(), SUM(), MAX(), MIN(), and COUNT(). It is part of the SELECT statement and comes after the WHERE clause, if one is present. Unlike DISTINCT, GROUP BY doesn't simply discard duplicate rows; it collapses each set of matching rows into a single group, to which aggregate functions can be applied.

Syntax:

SELECT column_name1, aggregate_function(column_name) FROM table_name GROUP BY column_name1;
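As a minimal sketch of where GROUP BY sits relative to WHERE, the query below filters rows first and then groups what remains. It assumes the sales table described earlier has Product and Price columns (the names match the example table used later in this article); the 1000 threshold and the total_sales alias are purely illustrative.

```sql
-- WHERE filters individual rows before any grouping happens;
-- GROUP BY then forms one group per remaining Product value.
SELECT Product, SUM(Price) AS total_sales
FROM sales
WHERE Price >= 1000
GROUP BY Product;
```

Rows that fail the WHERE condition never reach a group, so they do not contribute to the sum.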

Examples of DISTINCT and GROUP BY in SQL

Let’s take a closer look at the functionality and use cases of GROUP BY and DISTINCT using a simple example in MySQL.

Sales Table:

-- Create sales table
CREATE TABLE sales(
OrderID INT PRIMARY KEY,
Product VARCHAR(50),
Name VARCHAR(50),
Price INT
);

To add rows to the sales table:

-- Insert data into sales table
INSERT INTO sales (OrderID,Product, Name, Price) VALUES
(1, 'Laptop','John', 1200),
(2, 'Smartphone','Alice', 800),
(3, 'Tablet','John', 500),
(4, 'Laptop','Bob', 1200),
(5, 'Laptop','Alice', 1200),
(6, 'Tablet','Emily', 400);

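Resulting sales table:

| OrderID | Product | Name | Price |
|---|---|---|---|
| 1 | Laptop | John | 1200 |
| 2 | Smartphone | Alice | 800 |
| 3 | Tablet | John | 500 |
| 4 | Laptop | Bob | 1200 |
| 5 | Laptop | Alice | 1200 |
| 6 | Tablet | Emily | 400 |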

DISTINCT keyword

Example 1: Unique Products from Sales Table

SELECT DISTINCT Product FROM sales;

Output:

| Product |
|---|
| Laptop |
| Smartphone |
| Tablet |

Explanation: The result lists each value in the Product column exactly once, with duplicates removed.
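DISTINCT can also appear inside an aggregate function. As a small sketch against the same sales table, the query below counts the unique products rather than listing them; the alias unique_products is just an illustrative name.

```sql
-- Count how many different products appear in the sales table.
SELECT COUNT(DISTINCT Product) AS unique_products
FROM sales;
```

With the six rows inserted above, this returns 3 (Laptop, Smartphone, and Tablet).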

Example 2: Find the distinct product and price combinations

SELECT DISTINCT Product, Price FROM sales;

Output:

| Product | Price |
|---|---|
| Laptop | 1200 |
| Smartphone | 800 |
| Tablet | 500 |
| Tablet | 400 |

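Explanation: When DISTINCT is used with multiple columns, it applies to the combination of all selected columns, not to each column separately. Tablet therefore appears twice here, because it was sold at two different prices. Select only the columns that are needed so that extra columns do not reintroduce rows you expected to be collapsed.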

GROUP BY clause

Example 1: Find the total sales of each product

SELECT Product, SUM(Price) FROM sales GROUP BY Product;

Output:

| Product | SUM(Price) |
|---|---|
| Laptop | 3600 |
| Smartphone | 800 |
| Tablet | 900 |

Explanation: This SQL query retrieves the total sales for each unique product from the “sales” table. The output displays product names alongside their corresponding aggregated sum of prices.

Example 2: Find the total sales and quantity of each product

SELECT Product, SUM(Price) AS total_sales, COUNT(*) AS quantity FROM sales GROUP BY Product;

Output:

| Product | total_sales | quantity |
|---|---|---|
| Laptop | 3600 | 3 |
| Smartphone | 800 | 1 |
| Tablet | 900 | 2 |

Explanation: This SQL query calculates the total sales and quantity for each unique product in the “sales” table. The quantity is the number of rows (orders) in each group, as counted by COUNT(*). The output presents product names, their total sales, and the respective quantities.
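GROUP BY also accepts more than one column. The sketch below, run against the same sales table, groups by the Product and Price pair and counts the rows in each group; the alias orders is an illustrative name.

```sql
-- One group per distinct (Product, Price) pair,
-- with the number of rows that fall into each group.
SELECT Product, Price, COUNT(*) AS orders
FROM sales
GROUP BY Product, Price;
```

With the sample data this gives four groups: (Laptop, 1200) with 3 rows, and (Smartphone, 800), (Tablet, 500), and (Tablet, 400) with 1 row each. These are the same pairs that SELECT DISTINCT Product, Price returns, with a count attached.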

DISTINCT vs GROUP BY in SQL

| Feature | DISTINCT | GROUP BY |
|---|---|---|
| Used for | Unique values from one or more columns | Grouped data (by one or more columns), along with aggregate function calculations |
| Syntax | SELECT DISTINCT column1 FROM table_name; (multiple columns can be added) | SELECT column1, aggregate_function(column_name) FROM table_name GROUP BY column1; (multiple columns can be added) |
| Goal | Removes duplicate rows from the result | Groups rows based on the specified columns and applies aggregate functions |
| Columns in SELECT | Include only the column(s) for which uniqueness is desired | Include the columns specified in the GROUP BY clause as well as the columns to which aggregate functions are applied |
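To illustrate the overlap between the two, the following pair of queries, a sketch against the same sales table, should return the same set of products. GROUP BY with no aggregate function simply collapses each group to one row.

```sql
-- Unique products using DISTINCT
SELECT DISTINCT Product FROM sales;

-- The same set of rows using GROUP BY without an aggregate function
SELECT Product FROM sales GROUP BY Product;
```

When no aggregate is needed, DISTINCT usually states the intent more directly; GROUP BY becomes the natural choice once a per-group calculation such as SUM() or COUNT() is required.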

Conclusion

In conclusion, DISTINCT and GROUP BY in SQL, though serving the common purpose of obtaining unique records, are applied differently. DISTINCT is suitable for obtaining unique values from one or more columns, while GROUP BY is employed for grouping data based on one or more columns along with aggregate function calculations. Understanding when to use each is crucial for crafting efficient and effective SQL queries.

