Surrogate Key in DBMS

Last Updated : 06 Nov, 2023

A key is a column, or group of columns, in a database management system (DBMS) that uniquely identifies every row in a table. Natural keys and surrogate keys are the two categories of keys.

Natural Key: A column, or group of columns, that is generated from the table’s data is known as a natural key. For instance, since it uniquely identifies every client in the table, the customer ID column in a customer table serves as a natural key.
Surrogate key: A column that is not generated from the data in the database is known as a surrogate key. Rather, the DBMS generates a unique identifier for you. In database tables, surrogate keys are frequently utilized as primary keys.

Surrogate Key

A surrogate key also called a synthetic primary key, is generated when a new record is inserted into a table automatically by a database that can be declared as the primary key of that table. It is the sequential number outside of the database that is made available to the user and the application or it acts as an object that is present in the database but is not visible to the user or application.

We can say that, in case we do not have a natural primary key in a table, then we need to artificially create one in order to uniquely identify a row in the table, this key is called the surrogate key or synthetic primary key of the table. However, the surrogate key is not always the primary key. Suppose we have multiple objects in a database that are connected to the surrogate key, then we will have a many-to-one association between the primary keys and the surrogate key and the surrogate key cannot be used as the primary key.

Features of the Surrogate Key

It is automatically generated by the system.
It holds an anonymous integer.
It contains a unique value for all records of the table.
The value can never be modified by the user or application.
The surrogate key is called the factless key as it is added just for our ease of identification of unique values and contains no relevant fact(or information) that is useful for the table.

Consider an example:
Suppose we have two tables of two different schools having the same column registration_no, name, and percentage, each table having its own natural primary key, that is registration_no.

Table of school A:

registration_no	name	percentage
210101	Harry	90
210102	Maxwell	65
210103	Lee	87
210104	Chris	76

Table of school B:

registration_no	name	percentage
CS107	Taylor	49
CS108	Simon	86
CS109	Sam	96
CS110	Andy	58

Now, suppose we want to merge the details of both the schools in a single table.
Resulting table will be:

surr_no	registration_no	name	percentage
1	210101	Harry	90
2	210102	Maxwell	65
3	210103	Lee	87
4	210104	Chris	76
5	CS107	Taylor	49
6	CS108	Simon	86
7	CS109	Sam	96
8	CS110	Andy	58

As we can observe the above table and see that registration_no cannot be the primary key of the table as it does not match with all the records of the table though it is holding all unique values of the table . Now , in this case, we have to artificially create one primary key for this table. We can do this by adding a column surr_no in the table that contains anonymous integers and has no direct relation with other columns . This additional column of surr_no is the surrogate key of the table.

Why use Surrogate Key in DBMS?

There are several reasons to use surrogate keys in database tables:

Uniqueness: Data integrity is improved by the guaranteed uniqueness of surrogate keys.
Stability: Since surrogate keys do not depend on any business rules or data value, they have a lower chance of changing over time.
Efficiency: Compared to natural keys, surrogate keys are frequently smaller and process more quickly.
Flexibility: In the event that the natural key changes, rows can still be uniquely identified using surrogate keys.

Advantages of the Surrogate Key

As there is no direct information related with the table, so the changes are only based on the requirements of the application.
Performance is enhanced as the value of the key is relatively smaller.
The key value is guaranteed to contain unique information .
As it holds smaller constant values , this makes integration of the table easy .
Enables us to run fast queries (as compared to the natural primary key)

Disadvantages of the Surrogate Key

The surrogate key value can never be used as a search key.
As the key value has no relation to the data of the table, so third normal form is violated.
The extra column for surrogate key will require extra disk space.
We will need extra IO when we have to insert or update data of the table.

Examples of Surrogate Key

System date & time stamp
Random alphanumeric string

Conclusion

Surrogate keys are an important tool for designing and implementing databases. They can be applied to enhance database systems’ flexibility, stability, and performance.

Suggest improvement

Authorization and Integrity Manager in DBMS

Difference between “INNER JOIN” and “OUTER JOIN”

Share your thoughts in the comments