Open In App

SELECT DISTINCT vs GROUP BY in MySQL

In MySQL, the two most common ways of managing and retrieving unique values are with SELECT and GROUP BY. However, they are used for different reasons. With SELECT, we can get different values from the same column, so we don’t have to worry about duplicates. With GROUP BY, we can aggregate data and group results based on specific columns. However, there are some differences between the two operators. Both can be used to generate the same output. But we need to know the difference for better utilization of resources and time.

SELECT DISTINCT

The SELECT DISTINCT statement is used to retrieve unique values from a single column or a combination of columns in a table. It is often used when you want to eliminate duplicate rows from the result set.



Syntax:

SELECT DISTINCT column1, column2 

FROM table_name

Parameters Used:

GROUP BY

The GROUP BY clause is used to group rows that have the same values in specified columns into summary rows, like calculating aggregates (e.g., COUNT, SUM, AVG) for each group. It is typically used with aggregate functions.



Syntax:

SELECT column1, column2, ..., aggregate_function(column_name)
FROM table_name
WHERE condition
GROUP BY column1, column2, ...;

Parameters Used:

Let’s understand the difference between SELECT DISTINCT and GROUP BY through some examples:

Example of SELECT DISTINCT vs GROUP BY in MySQL

Let’s assume we have a customers table and an orders table. We are going to use these tables to show how DISTINCT and GROUP BY can be used for different use cases:

Query:

CREATE TABLE customers (
customer_id INT PRIMARY KEY,
name VARCHAR(255) NOT NULL,
city VARCHAR(255) NOT NULL
);

INSERT INTO customers (customer_id, name, city) VALUES
(1, 'John Doe', 'New York'),
(2, 'Jane Smith', 'London'),
(3, 'Mike Brown', 'Paris'),
(2, 'Jane Smith', 'London');

Output:

customers table

Orders Table

Query:

CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id INT NOT NULL,
product VARCHAR(255) NOT NULL,
price DECIMAL(10,2) NOT NULL,
FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);

INSERT INTO orders (order_id, customer_id, product, price) VALUES
(1, 1, 'Phone', 100.00),
(2, 2, 'Laptop', 500.00),
(3, 1, 'Tablet', 200.00),
(4, 2, 'Watch', 150.00);

Output:

orders table

Example 1: Find distinct customer cities

Here, we want to get names of distinct cities only, so DISTINCT will be the obvious choice here.

Use SELECT DISTINCT to retrieve unique cities where customers reside:

SELECT DISTINCT city
FROM customers;

Output:

Distinct clause output

Explanation: The SQL query retrieves unique values from the “city” column in the “customers” table. The output provides a list of distinct cities where customers are located, eliminating duplicate entries and showcasing the unique cities in the dataset.

Example 2: Count Orders Per Customer City

Here, we also want to count the number of orders per city. Since an aggregate function(COUNT) is involved here, we will be using GROUP BY.

Use GROUP BY to group customers by city and count their orders:

SELECT city, COUNT(*) AS order_count
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
GROUP BY city;

Output:

order by

Explanation: The SQL query joins the “customers” and “orders” tables on the customer_id, grouping the results by city. It counts the number of orders for each city, providing an output that displays the distinct cities along with the corresponding count of orders placed by customers in each city.

Key Differences Observed in This Example

Purposes and Functionalities

SELECT DISTINCT:

GROUP BY:

Choosing between them

Feature

SELECT DISTINCT

GROUP BY

Purpose

Remove duplicate rows

Group data and aggregate

Aggregate functions

No

Yes

Sorting

No (optional)

Yes (default)

Performance

Can be faster (no index)

Slower (sorting)

Syntax

Simpler

More Complex

Conclusion

In MySQL, SELECT DISTINCT is used to retrieve unique values from one or more columns in a result set, eliminating duplicates without aggregation. Conversely, GROUP BY is employed to group rows sharing common values in specified columns, enabling aggregation functions to summarize data within each group. While SELECT DISTINCT is suitable for obtaining unique values, GROUP BY is essential for performing complex aggregations and generating summary results based on grouping criteria, each serving distinct purposes in MySQL query operations.


Article Tags :