Aggregation Pipeline Stages in MongoDB – Set 1

Last Updated : 17 Oct, 2022

In MongoDB, the aggregation method takes the stages which work in the sequence. In this, the data go through multiple stages and return the resulting output. There are different types of stages in the MongoDB called aggregation pipeline stages.

Syntax:

db.<collection_name>.aggregate(pipeline, options)

or

db.<collection_name>.aggregate( [ { <stage1> }, {<stage2>},… ] )

A pipeline is a sequence of aggregation operations and stages. It is of array type. If the pipeline is not mentioned then it returns all the documents from the collection.
Aggregation Stages always work in sequence means stage1 will execute first and pass the result to stage2 and so on.
Options are optional and only used when you specify pipeline as an array.
Aggregation is used to enhance performance by optimizing the query.
Use variable and field names with a dollar sign($) and string without a dollar sign.

$addFields Stage

$addFields stage is used to add new fields to the existing documents. It returns documents that contain all the existing fields and newly added fields.

Syntax :

db.<collection_name>.aggregate(
[
{
$addFields: { <newField1>: <expression>, <newField2>: <expression> … }
}
]);

Example:

In the following example, the database name is ‘company’ and the collection name is ’employee_details’. Collection ’employee_details’ contains 4 documents. The following aggregation operation uses $addFields aggregation pipeline stages where $addField stage adds a new field that is ‘isActive : true’ to all the documents. This operation returns all the documents with the existing fields as well as a new field that is ‘isActive : true’.

db.employee_details.aggregate([
{
$addFields: {“isActive” : true }
}
])

$set Stage

$set stage is used to add new fields to the existing documents. It returns documents that contain all the existing fields and newly added fields. It is the same as that of the $addFields stage. If the name of the new field is the same as an existing field (including _id), $set overwrites the existing value of that field with the new value.

Note: $set stage is new staging in version 4.2.

Syntax:

db.<collection_name>.aggregate(
[
{
$set: { <newField>: <expression>, … }
}
]);

Example:

In the following example, the database name is ‘company’ and the collection name is ’employee_details’. Collection ’employee_details’ contains 4 documents. The following aggregation operation uses $set aggregation pipeline stages where $set stage adds a new field that is ‘company_name’: ‘GFG’ to all the documents. This operation returns all the documents with the existing fields as well as a new field that is ‘company_name’: ‘GFG’.

db.employee_details.aggregate([
{
$set: {“company_name” : “GFG”}
}
])

$unset Stage

$unset stage is used to remove/exclude fields from the documents. If you want to remove the field or fields from the embedded document then use the dot notation.

Note: $unset stage is new staging in version 4.2.

Syntax:

To remove a single field, The $unset stage takes a string that specifies the field to remove.

db.<collection_name>.aggregate(
[
{
$unset: “<field>”
}
]);

To remove multiple fields, The $unset takes an array of fields to remove.

db.<collection_name>.aggregate(
[
{
$unset: [ “<field1>”, “<field2>”, … ]
}
]);

Example:

In the following example, the database name is ‘company’ and the collection name is ’employee_details’. The Collection ’employee_details’ contains 4 documents. The following aggregation operation uses the $unset stage to remove the field or fields. ‘address.phone.type’ is the embedded field that we need to remove in this example. To remove ‘address.phone.type’ we set the value of $unset stage to ‘address.phone.type’. This operation returns all the documents with all the fields except the removed ‘address.phone.type’ field.

db.employee_details.aggregate([
{
$unset: “address.phone.type”
}
]);

$project Stage

$project stage is used to include fields, exclude _id field, add new fields and replace the existing field value in the output document. The following is the table that describes the specification and its description.

Specification	Description
field : <1 or true>	include the field in the output document
_id : <0 or false>	exclude the _id from the output document
field : <expression>	add the new field or replace the existing field value
field : <0 or false>	exclude the field from the output document. If you use the field to exclude instead of _id then you can’t use any other specification.

Syntax:

db.<collection_name>.aggregate(
[
{
$project: { <specification>}
}
])

Example:

In the following example, the database name is ‘company’ and the collection name is ’employee_details’. The Collection ’employee_details’ contains 4 documents. The following aggregation operation uses the $project stage to include, exclude, add and update the field. “name.firstName” : 1, “name.lastName” : 1, “salary” : 1, “_id” : 0 are the specifications. 1 and 0 denote inclusion and exclusion respectively. This operation returns the document with fields name.fistName, name.lastName, salary, and excludes the _id.

db.employee_details.aggregate(
[
{
$project:{“name.firstName”:1, “name.lastName”:1, “salary”:1, “_id”:0}
}
]);

$count Stage

$count returns the count of the documents present in the selected collection.

Note: To get count of the documents in the collection we can also use db.<collection_name>.count(). It only returns the count we can’t specify any output field.

Syntax:

db.<collection_name>.aggregate(
[
{
$count: <String>
}
]);

<string> is the name of the output field which has the count as its value.

Example:

In the following example, the database name is ‘company’ and the collection name is ’employee_details’. The Collection ’employee_details’ contains 4 documents. The following aggregation operation uses the $count stage to return the number of documents in the ’employee_details’ collection. ’employees_count’ is the name of the output field which has the number of documents as its value.

db.employee_details.aggregate(
[
{ $count: “employees_count” }
]);

The above operation returns the following document.

{ “employees_count” : 4 }

$unwind Stage

$unwind stage is used to deconstruct an array field from the input documents to output documents. If mentioned field path is not an array but is also not missing, null, or an empty array then $unwind treats the non-array field as a single element array and the includeArrayIndex for that field is null.

Note: $unwind stage does not include documents whose field is missing, null or empty array until you use preserveNullAndEmptyArrays : true

Syntax:

db.<collection_name>.aggregate(
[
{
$unwind:
{
path: <field path>,
includeArrayIndex: <string>,
preserveNullAndEmptyArrays: <boolean>
}
}
]);

Path specifies which field you want to unwind.
includeArrayIndex includes array index only for array, not an application for fields that are not an array, null or empty.
preserveNullAndEmptyArrays, If true then this option is used to include documents whose field is missing, null or empty array field. By default it is false.

Example:

In the following example, the database name is ‘company’ and the collection name is ’employee_details’. The Collection ’employee_details’ contains 4 documents. The following aggregation operation uses the $skip and $unwind stage. ‘skills’ is the array field by which we deconstruct the document. Here, we are using two stages. In the first stage, the $skip stage skips the 3 documents out of 4 documents and passes the result to the next stage. In the next stage, the $unwind stage takes only 1 document coming from the first stage and deconstructs that document on the basis of mentioned ‘skills’ field.

db.employee_details.aggregate(
[
{$skip : 3},
{ $unwind: “$skills” }
]
);

The following is the output of the above operation. In this, we break down the document on the basis of an array field that is skills.

$limit Stage

$limit stage is used to limit the number of documents passed to the next stage in the pipeline.

Syntax:

db.<collection_name>.aggregate(
[
{
$limit: <positive 64-bit integer>
}
]);

Example:

In the following example, the database name is ‘company’ and the collection name is ’employee_details’. The Collection ’employee_details’ contains 4 documents. The following aggregation operation uses the $limit and $count stage. Here, we are using two stages, In the first stage, the $limit stage limits the number of documents by 2 and passes the 2 documents out of 4 documents to the next stage. In the next stage, the $count stage counts the number of documents coming from the previous stage.

db.employee_details.aggregate(
[
{$limit : 2},
{$count : “employees_count” }
]);

The following is the output of the above operation.

{ “employees_count” : 2 }

$sort Stage

$sort stage is used to sort the documents. If sorting is on multiple fields, the sort order is evaluated from left to right.
For example, documents are first sorted by <field1> and then documents with the same <field1> values are further sorted by <field2>.

Note: You can sort a maximum of 32 keys.

Syntax:

db.<collection_name>.aggregate(
[
{ $sort: { <field1>: <sort order>, <field2>: <sort order> … } }
]);

<field> represents the document field on the basis of which we sort the document either ascending or descending depends on the <sort-order>. <sort order> has the following values :

Value	Order
1	Ascending
-1	Descending

Example:

Ascending Order

In the following example, the database name is ‘company’ and the collection name is ’employee_details’. The Collection ’employee_details’ contains 4 documents. The following aggregation operation uses the $sort stage. In this example, we sort the documents in ascending order. ‘name.firstName’ is the <field> name on the basis of which we sort the document. ‘1’ is the <sort-order> which means the documents will be sorted in ascending order. The following operation returns the documents sorted in ascending order on the basis of the ‘name.firstName’ field.

db.employee_details.aggregate(
[
{$sort : {“name.firstName” : 1}}
]);

Descending Order

In the following example, the database name is ‘company’ and the collection name is ’employee_details’. The Collection ’employee_details’ contains 4 documents. The following aggregation operation uses the $sort stage. In this example, we sort the documents in descending order. ‘name.firstName’ is the <field> name on the basis of which we sort the document. ‘-1’ is the <sort-order> which means the documents will be sorted in descending order. The following operation returns the documents sorted in descending order on the basis of the ‘name.firstName’ field.

db.employee_details.aggregate(
[
{$sort : {“name.firstName” : -1}}
]);

$sortByCount Stage

$sortByCount stage is used to group the documents on the basis of field path (document field) and then count the number of documents in each different group and the documents are sorted by count in descending order. The $sortByCount stage returns the documents. Each document has two fields ‘_id’ field containing the different grouping values and the ‘count’ field containing the number of documents belonging to that group.

Note: The $sortByCount stage is equivalent to the following: $group + $sort sequence: { $group: { _id: <expression>, count: { $sum: 1 } } }, { $sort: { count: -1 } }

Syntax:

db.<collection_name>.aggregate(
[
{$sortByCount : “<field path>”}
]);

Example:

In the following example, the database name is ‘company’ and the collection name is ’employee_details’. The Collection ’employee_details’ contains 4 documents. The following aggregation operation uses $unwind and $sortByCount stage. ‘skills’ is the <field path> name on the basis of which we group the documents. Here, we are using two stages, In the first stage, the $unwind stage deconstructs the document on the basis of the ‘skills’ field and passes the result to the next stage. In the next stage, the $sortByCount stage groups the document coming from the previous stage on the basis of the ‘skills’ field and then count the number of documents in each separate group.

db.employee_details.aggregate(
[
{$unwind : “$skills”},
{$sortByCount : “$skills”}
]);

The above operation returns the following documents containing two fields ‘_id’ and ‘count’. The documents are sorted by count in descending order.

{ “_id” : “Javascript”, “count” : 4 }
{ “_id” : “MongoDB”, “count” : 4 }
{ “_id” : “React”, “count” : 2 }
{ “_id” : “Java”, “count” : 2 }

$skip Stage

$skip stage is used to skip the number of documents and passes the remaining documents to the next stage in the pipeline. It returns the remaining documents.

Syntax:

db.<collection_name>.aggregate(
[
{ $skip: <positive 64-bit integer> }
]);

Example:

In the following example, the database name is ‘company’ and the collection name is ’employee_details’. The Collection ’employee_details’ contains 4 documents. The following aggregation operation uses the $skip stage. Here, we ignore the first 3 documents out of 4 documents and pass the remaining 1 document to the next stage. This operation returns the remaining 1 document after ignoring the first 3 documents.