What is Normalization in DBMS?

In this course, we will study what is normalization in DBMS and its types: first normal forms, second normal forms, third normal forms, Boyce and Codd normal forms, fourth normal form, fifth normal form and domain key normal form.

What is Normalization in DBMS?

Normalization in DBMS is a process of organizing the data inside the database in such a way that we can ignore data redundancy, insertion anomaly, update anomaly & deletion anomaly.

It can be seen as a process used for minimizing redundancy from a relation or set of relations.

Redundancy in relation leads to insertion, deletion, and update anomalies. To eliminate, or reduce redundancy in database tables normal forms are used.

In normalization, we split a large table into smaller units or tables and then define the relationships between them to increase the clarity of data.

What is Normalization in DBMS?

Types of Normalization in DBMS

  1. First normal form (1NF)
  2. Second normal form (2NF)
  3. Third normal form (3NF)
  4. Boyce and Codd normal form (BCNF)
  5. Fourth normal form (4NF)
  6. Fifth normal form (5NF)
  7. Domain key normal form (DKNF)

First normal form

Data in the database is stored in the form of tables that contain rows and columns.
As per the rule of the first normal form, an attribute or the column of a table can only hold single values. It can hold only atomic values. In case a relation comprises a composite or multi-valued attribute then it is said to violate the rules of the first normal form or relation.

Student number Student name Grade Phone number State
1 Anand A 9486436877, 8686876422 Punjab
2 Kamal A+ 9557939701 Andhra Pradesh
3 Mohan B 8236475802 Rajasthan

As we know that there should be only one value in each column. So we need to normalize the table.

Student number Student name Grade Phone number State
1 Anand A 9486436877 Punjab
2 Anand A 8686876422 Punjab
3 Kamal A+ 9557939701 Andhra Pradesh
4 Mohan B 8236475802 Rajasthan

Second normal form

  1. For a relation to be in second normal form it must be in first normal form.
  2. There should be no partial dependency.
  3. Partial dependency means that non-prime attributes can be determined using a proper subset of the primary key.

Consider an example, R(PQRS) in which PQ->R and Q->S
In this example, PQ is the candidate key and therefore is a part of the prime attributes, and R and S are the non-prime attributes. S not being dependent on the entire candidate key depends on part of the candidate key and this type of dependency is known as partial dependency.

Now to make it in the second normal form we will break the tables and make new relations out of it. R(PQRS)->R(PQR) and R(QS).Now we have it in second normal form as there is no partial dependency.

Third normal form

  1. For a relation to be in the third normal form it must be in the second normal form which means it should be free from partial dependency.
  2. No non-prime attribute should be driving another non-prime attribute.
  3. There should be no transitive dependency.

Transitive dependency: if A->B and B->C are two functional dependencies then A->C is called transitive dependency.

Consider an example, R(PQRS) in which PQ->R and R->S
In this example, PQ is the candidate key and therefore is a part of prime attributes and R and S are non-prime attributes. S is dependent on another non-prime attribute and this type of dependency is known as a transitive dependency.

Now to make it in the third normal form we will break the tables and make new relations out of it. R(PQRS)->R(PQR) and R(RS). Now we have it in the third normal form as there is no transitive dependency.

Boyce Codd normal form

  • For a relation to be in Boyce Codd normal form it must be in the third normal form which means it should be free from transitive as well as partial dependency.
  • For every relation, the left-hand side should be a super key.
  • It is free from redundancy.
  • Sometimes there can be a loss of functional dependency during decomposition.
  • When all the attributes in a relation are prime attributes then it must be in BCNF.
  • In every functional dependency if on the left-hand side there is a super key then it is in BCNF.

Consider an example, R(PQR) in which PQ->R and R->Q 
In this example, PQ and PR are the candidate keys and therefore are a part of prime attributes. R is not a super key and hence it is not in BCNF.

Now to make it in the Boyce and Codd normal form we will break the tables and make new relations out of it. R(PQR)->R(PR) and R(RQ) because there should be a common attribute among the decomposed tables and in at least one of the decompositions the common attribute should be the candidate key. Now we have it in the Boyce and Codd normal form(BCNF).

Fourth normal form

  • For a relation to be in the fourth normal form it must be in Boyce Codd normal form.
  • A relation should not contain more than one multivalued attribute.
  • It eliminates the independent many-to-one relationship between the columns.

Fifth normal form

For a relation to be in the fifth normal form it must be in the fourth normal form.
When there is no division of tables possible to reduce redundancy.

Domain key normal form

When all the insertion and deletion anomalies are removed then the domain key normal form is attained.

Thanks for giving me your valuable time to learn What is Normalization in DBMS? We publish similar content Data Model in DBMS I hope you like this.