Open In App

Normal Forms in DBMS

Normalization is the process of minimizing redundancy from a relation or set of relations. Redundancy in relation may cause insertion, deletion, and update anomalies. So, it helps to minimize the redundancy in relations. Normal forms are used to eliminate or reduce redundancy in database tables.

Normalization of DBMS

In database management systems (DBMS), normal forms are a series of guidelines that help to ensure that the design of a database is efficient, organized, and free from data anomalies. There are several levels of normalization, each with its own set of guidelines, known as normal forms.



Important Points Regarding Normal Forms in DBMS

Normal forms help to reduce data redundancy, increase data consistency, and improve database performance. However, higher levels of normalization can lead to more complex database designs and queries. It is important to strike a balance between normalization and practicality when designing a database.

Advantages of Normal Form

Overall, using normal forms in DBMS helps to improve data quality, increase database efficiency, and simplify database design and maintenance.



First Normal Form

If a relation contain composite or multi-valued attribute, it violates first normal form or a relation is in first normal form if it does not contain any composite or multi-valued attribute. A relation is in first normal form if every attribute in that relation is singled valued attribute.

Example

ID   Name   Courses
------------------
1 A c1, c2
2 E c3
3 M C2, c3
ID   Name   Course
------------------
1 A c1
1 A c2
2 E c3
3 M c2
3 M c3

Second Normal Form

To be in second normal form, a relation must be in first normal form and relation must not contain any partial dependency. A relation is in 2NF if it has No Partial Dependency, i.e., no non-prime attribute (attributes which are not part of any candidate key) is dependent on any proper subset of any candidate key of the table. Partial Dependency – If the proper subset of candidate key determines non-prime attribute, it is called partial dependency.

STUD_NO            COURSE_NO        COURSE_FEE
1 C1 1000
2 C2 1500
1 C4 2000
4 C3 1000
4 C1 1000
2 C5 2000


       Table 1                                                           Table 2
STUD_NO COURSE_NO COURSE_NO COURSE_FEE
1 C1 C1 1000
2 C2 C2 1500
1 C4 C3 1000
4 C3 C4 2000
4 C1 C5 2000
AB -> C  [A and B together determine C]
BC -> D [B and C together determine D]

In the above relation, AB is the only candidate key and there is no partial dependency, i.e., any proper subset of AB doesn’t determine any non-prime attribute.

X is a super key.
Y is a prime attribute (each element of Y is part of some candidate key).

Example 1: In relation STUDENT given in Table 4, FD set: {STUD_NO -> STUD_NAME, STUD_NO -> STUD_STATE, STUD_STATE -> STUD_COUNTRY, STUD_NO -> STUD_AGE}

Candidate Key: {STUD_NO}

For this relation in table 4, STUD_NO -> STUD_STATE and STUD_STATE -> STUD_COUNTRY are true.

So STUD_COUNTRY is transitively dependent on STUD_NO. It violates the third normal form.

To convert it in third normal form, we will decompose the relation STUDENT (STUD_NO, STUD_NAME, STUD_PHONE, STUD_STATE, STUD_COUNTRY_STUD_AGE) as: STUDENT (STUD_NO, STUD_NAME, STUD_PHONE, STUD_STATE, STUD_AGE) STATE_COUNTRY (STATE, COUNTRY)

Consider relation R(A, B, C, D, E) A -> BC, CD -> E, B -> D, E -> A All possible candidate keys in above relation are {A, E, CD, BC} All attributes are on right sides of all functional dependencies are prime.

Example 2: Find the highest normal form of a relation R(A,B,C,D,E) with FD set as {BC->D, AC->BE, B->E}

Step 1: As we can see, (AC)+ ={A,C,B,E,D} but none of its subset can determine all attribute of relation, So AC will be candidate key. A or C can’t be derived from any other attribute of the relation, so there will be only 1 candidate key {AC}.

Step 2: Prime attributes are those attributes that are part of candidate key {A, C} in this example and others will be non-prime {B, D, E} in this example.

Step 3: The relation R is in 1st normal form as a relational DBMS does not allow multi-valued or composite attribute. The relation is in 2nd normal form because BC->D is in 2nd normal form (BC is not a proper subset of candidate key AC) and AC->BE is in 2nd normal form (AC is candidate key) and B->E is in 2nd normal form (B is not a proper subset of candidate key AC).

The relation is not in 3rd normal form because in BC->D (neither BC is a super key nor D is a prime attribute) and in B->E (neither B is a super key nor E is a prime attribute) but to satisfy 3rd normal for, either LHS of an FD should be super key or RHS should be prime attribute. So the highest normal form of relation will be 2nd Normal form.

For example consider relation R(A, B, C) A -> BC, B -> A and B both are super keys so above relation is in BCNF.

Third Normal Form

A relation is said to be in third normal form, if we did not have any transitive dependency for non-prime attributes. The basic condition with the Third Normal Form is that, the relation must be in Second Normal Form.

Below mentioned is the basic condition that must be hold in the non-trivial functional dependency X -> Y:

For more, refer to Third Normal Form in DBMS.

BCNF

BCNF (Boyce-Codd Normal Form) is just a advanced version of Third Normal Form. Here we have some additional rules than Third Normal Form. The basic condition for any relation to be in BCNF is that it must be in Third Normal Form.

We have to focus on some basic rules that are for BCNF:

1. Table must be in Third Normal Form.
2. In relation X->Y, X must be a superkey in a relation.

For more, refer to BCNF in DBMS.

Fourth Normal Form

Fourth Normal Form contains no non-trivial multivaued dependency except candidate key. The basic condition with Fourth Normal Form is that the relation must be in BCNF.

The basic rules are mentioned below.

1. It must be in BCNF.
2. It does not have any multi-valued dependency.

For more, refer to Fourth Normal Form in DBMS.

Fifth Normal Form

Fifth Normal Form is also called as Projected Normal Form. The basic conditions of Fifth Normal Form is mentioned below.

Relation must be in Fourth Normal Form.
The relation must not be further non loss decomposed.

For more, refer to Fifth Normal Form in DBMS.

Applications of Normal Forms in DBMS

Some Important Points about Normal Forms

Conclusion

In Conclusion, relational databases can be arranged according to a set of rules called normal forms in database administration (1NF, 2NF, 3NF, BCNF, 4NF, and 5NF), which reduce data redundancy and preserve data integrity. By resolving various kinds of data anomalies and dependencies, each subsequent normal form expands upon the one that came before it. The particular requirements and properties of the data being stored determine which normal form should be used; higher normal forms offer stricter data integrity but may also result in more complicated database structures.

Previous Year Question Links

FAQs on Normal Form

Q.1: Why is Normalization Important in DBMS?

Answer:

Normalization helps in preventing database from anomalies, that ultimately ensures the consistency of database and helps in easy maintenance of database.

Q.2: Is it possible to over-normalize the database?

Answer:

Yes, excessive normalization will go to complex queries and also reduces performance. It strikes balance between normlization and practicality.

Q.3: Is it necessary to normalize a database to Highest Normal Form like (BCNF or 4NF)?

Answer:

There is no certain necessary condition for any database normalization. Manytimes, lower form can be sufficient for specific performance and simplicity.


Article Tags :