Group By Vs Distinct Difference In SQL Server

Last Updated : 02 Feb, 2024

SQL Server is a relational database management system. It offers a wide range of features and tools that handle different needs, from small-scale applications to large-scale enterprise solutions. Two of those features, GROUP BY and DISTINCT, can both return unique values: GROUP BY is the tool for aggregations over groups of rows, including on large datasets, while DISTINCT is the simpler and clearer choice when the only goal is to obtain unique values.

In this article, we will look at the difference between GROUP BY and DISTINCT in SQL Server, with examples.

Introduction to Group By Vs Distinct Clause

The GROUP BY and DISTINCT clauses can both be used to get the unique values from a column or a set of columns, but they differ in how they are used.

  • Functionality of DISTINCT: it removes duplicates.
  • Functionality of GROUP BY: the functionality of DISTINCT + applying an aggregate function to each group.

Let’s create a table to understand both clauses. We will be creating a table of employees.

Query to Create a table

CREATE TABLE Employee 
(
EmployeeID INT PRIMARY KEY,
FirstName VARCHAR(50),
LastName VARCHAR(50),
Department VARCHAR(50),
Salary DECIMAL(10, 2)
);

Query to Insert data into it

INSERT INTO Employee (EmployeeID, FirstName, LastName, Department, Salary)
VALUES
(1, 'John', 'Doe', 'IT', 60000.00),
(2, 'Jane', 'Smith', 'HR', 55000.00),
(3, 'Bob', 'Johnson', 'IT', 65000.00),
(4, 'Alice', 'Williams', 'Finance', 70000.00),
(5, 'Charlie', 'Brown', 'HR', 60000.00),
(6, 'David', 'Miller', 'Finance', 75000.00),
(7, 'Eva', 'Davis', 'IT', 62000.00),
(8, 'Frank', 'Clark', 'Finance', 72000.00),
(9, 'Grace', 'Moore', 'HR', 58000.00),
(10, 'Harry', 'Young', 'IT', 63000.00),
(11, 'Isabel', 'Hall', 'HR', 59000.00),
(12, 'Jack', 'Baker', 'Finance', 71000.00),
(13, 'Olivia', 'Turner', 'IT', 60000.00),
(14, 'Paul', 'Moore', 'Finance', 73000.00),
(15, 'Quinn', 'Parker', 'HR', 60000.00),
(16, 'Ryan', 'Scott', 'IT', 64000.00),
(17, 'Samantha', 'Bryant', 'HR', 61000.00),
(18, 'Tyler', 'Ward', 'Finance', 70000.00),
(19, 'Ursula', 'Hill', 'IT', 61000.00),
(20, 'Victor', 'Gomez', 'HR', 59000.00),
(21, 'Wendy', 'Fisher', 'IT', 62000.00),
(22, 'Xavier', 'Jordan', 'Finance', 71000.00),
(23, 'Yvonne', 'Lopez', 'HR', 58000.00),
(24, 'Zachary', 'Evans', 'IT', 63000.00),
(25, 'Ava', 'Hernandez', 'Finance', 69000.00);

Our table looks like this:

[Image: Employee table]

DISTINCT Clause

The DISTINCT clause gives us the unique values from a column. For example, if we have to list the distinct departments, we will write the query using the DISTINCT clause.

Example 1: Department Details Using DISTINCT Clause

Let’s fetch all the distinct departments from the Employee table using the DISTINCT clause.

Query:

-- Getting the Distinct Departments from the Table
SELECT DISTINCT Department
FROM Employee;

Output:

[Image: output showing the distinct departments]

Explanation: The query returns one row for each department in the sample data (Finance, HR, and IT). DISTINCT is mostly used to check for redundancy in a table. For example, if we are working with a student table and there are multiple entries for the same student, we can find that out using the DISTINCT keyword. We can also use multiple columns with the DISTINCT clause. If we have to find the sum of the salaries paid in each department and an employee has been entered twice, the sum will be wrong. To check for such duplicates, we can use three columns in the DISTINCT clause.
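As a hedged sketch of the salary case, the query below removes repeated rows in a derived table before summing. It assumes that a duplicate is a row repeating the same FirstName, LastName, Department, and Salary under a different EmployeeID; Salary is added to the three columns so that it can be summed, and the alias Deduplicated is a name chosen here for illustration.

```sql
-- Sketch: sum salaries per department after collapsing duplicate entries.
-- Assumption: a duplicate repeats the same name, department, and salary
-- under a different EmployeeID.
SELECT Department, SUM(Salary) AS Total_Salary
FROM (
    SELECT DISTINCT FirstName, LastName, Department, Salary
    FROM Employee
) AS Deduplicated   -- hypothetical alias for the derived table
GROUP BY Department;
```

The sample data contains no such duplicates, so here the totals match a plain GROUP BY. Note that two different employees who share a name, department, and salary would also be collapsed into one row, so this approach is only as reliable as the columns chosen to identify a person.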

Example 2: Employees Details Using DISTINCT Clause

Let’s fetch the FirstName, LastName, and Department from the employee table using DISTINCT Clause.

Query:

SELECT DISTINCT FirstName, LastName, Department
FROM Employee;

Output:

[Image: output listing FirstName, LastName, and Department]

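Explanation: Comparing the number of rows returned with the number of rows in the table gives us an idea of the duplicates. Here all 25 employees have different names, so the query returns 25 rows. DISTINCT is generally used where we do not need aggregate functions.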
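DISTINCT only hints at duplicates through the row count. One possible way to list them directly is to group on the same columns and keep the groups that occur more than once. This is a sketch under the assumption that the three columns identify an employee; the alias Occurrences is illustrative.

```sql
-- Sketch: list name/department combinations that appear more than once.
-- Assumption: FirstName + LastName + Department identifies an employee.
SELECT FirstName, LastName, Department, COUNT(*) AS Occurrences
FROM Employee
GROUP BY FirstName, LastName, Department
HAVING COUNT(*) > 1;   -- keep only groups with repeated rows
```

On the sample data this returns no rows, because every combination is unique.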

GROUP BY Clause

The GROUP BY clause groups the rows of a table based on the values of one or more columns, so that aggregate functions can be applied to each group.

Example 1: Number of Employees in Each Department Using GROUP BY Clause

If we have to find the number of employees in each department, we can use the GROUP BY clause.

Query:

SELECT Department, COUNT(*) AS Count
FROM Employee
GROUP BY Department;

Output:

[Image: number of employees in each department]

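Explanation: The query returns the number of employees in each department (9 in IT, 8 in HR, and 8 in Finance). Similarly, if we have to find the sum of the salaries paid in each department, or the maximum salary in each department, we can do this very easily using the GROUP BY clause.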

Example 2: Total Salary and Maximum Salary with Department

Let’s calculate the total salary of employees and also find the maximum salary along with the department grouped by Department.

Query:

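-- GO is a batch separator for tools such as SSMS and sqlcmd.
-- It goes on its own line and must not be followed by a semicolon.
SELECT Department, SUM(Salary) AS Total_Salary
FROM Employee
GROUP BY Department;
GO

SELECT Department, MAX(Salary) AS Maximum_Salary
FROM Employee
GROUP BY Department;
GO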

Output:

[Image: output of the two queries]

Explanation: The GROUP BY clause is usually used with aggregate functions, which compute one result for each group.
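The two results can also be produced in one pass, since several aggregates may share the same GROUP BY. The following is a minimal sketch against the Employee table above; the ORDER BY is added only to make the row order predictable.

```sql
-- Sketch: total and maximum salary per department in a single query.
SELECT Department,
       SUM(Salary) AS Total_Salary,
       MAX(Salary) AS Maximum_Salary
FROM Employee
GROUP BY Department
ORDER BY Department;   -- explicit ordering; GROUP BY alone does not guarantee one
```

On the sample data this gives 571000.00 and 75000.00 for Finance, 470000.00 and 61000.00 for HR, and 560000.00 and 65000.00 for IT.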

DISTINCT Clause Vs GROUP BY Clause

Before jumping to the differences, let’s look at the query execution plans of both clauses in SQL Server.

Query:

SELECT DISTINCT Department
FROM Employee;
GO

SELECT Department
FROM Employee
GROUP BY Department;
GO

Output:

[Image: Query Execution Plan]

Explanation: The result set of both queries is the same. Now let’s look at the execution plans.

Difference Between Group By Vs Distinct Clause

We can see that both plans are the same, because behind the scenes DISTINCT and GROUP BY work similarly when the query contains no aggregate or other clause that changes its meaning.
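The plans can also be compared as text rather than in the graphical viewer. The sketch below uses SET SHOWPLAN_TEXT, a T-SQL setting that makes SQL Server return the estimated plan instead of executing the query. It has to be the only statement in its batch, hence the GO separators, which assume a tool such as SSMS or sqlcmd.

```sql
-- Sketch: print the estimated plans of both queries as text.
SET SHOWPLAN_TEXT ON;
GO

SELECT DISTINCT Department
FROM Employee;
GO

SELECT Department
FROM Employee
GROUP BY Department;
GO

-- Switch back so that later queries are executed normally.
SET SHOWPLAN_TEXT OFF;
GO
```

The exact operators shown depend on the SQL Server version, the indexes, and the data, so the output on another system may differ from the plan pictured here.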

| GROUP BY Clause | DISTINCT Clause |
|---|---|
| Used to group rows by one or more columns or expressions and apply aggregate functions | Used to return only the unique values from a column or a set of columns |
| Can be used without aggregate functions, but not recommended | Simpler and clearer to use when no aggregation is needed |
| Does not guarantee the order of the rows unless ORDER BY is used | Does not guarantee the order of the rows unless ORDER BY is used |

Conclusion

GROUP BY and DISTINCT behave alike when they are used alone, but adding an aggregate or another clause changes the behavior of the query. When GROUP BY is used alone, SQL Server typically produces the same execution plan as for the equivalent DISTINCT query. Thus, if the goal is to find the unique values, go with DISTINCT, and if you want to calculate something for each group, go with the GROUP BY clause.
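The two can also work together: DISTINCT may appear inside an aggregate while GROUP BY defines the groups. The query below is an illustrative sketch against the Employee table from this article; the aliases Employees and DistinctSalaries are names chosen here.

```sql
-- Sketch: GROUP BY forms one group per department,
-- DISTINCT inside COUNT ignores repeated salary values within each group.
SELECT Department,
       COUNT(*)               AS Employees,
       COUNT(DISTINCT Salary) AS DistinctSalaries
FROM Employee
GROUP BY Department;
```

Here COUNT(*) counts every employee in a department, while COUNT(DISTINCT Salary) counts each salary value only once.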


