What is normalization?
NORMALIZATION is a database design technique that organizes tables in such a way as to reduce data redundancy and dependency. Normalization divides large tables into smaller tables and links them using relationships. The purpose of normalization is to eliminate redundant (useless) data and provide logical data storage.
The inventor of the relational model, Edgar Codd, proposed the theory of normalization with the introduction of first normal form, and he went on to extend the theory to second and third normal forms. He later joined Raymond F. Boyce to develop the Boyce-Codd normal form theory.
Database Normal Forms
The theory of data normalization in SQL is still evolving. For example, there are discussions even on 6th normal form. However, in most practical applications, normalization reaches its best value in 3rd normal form . The evolution of normalization theories is illustrated below:
Database normalization example:
We will explore normalization with a case study. Let's say the video library maintains a database of movies for rent. Without any normalization, all information is stored in one table as shown below.
Here you can see that the Movies Rented column has multiple values. Now let's move on to 1st normal forms:
1NF (First Normal Form) Rules
- Each table cell must contain one value.
- Each entry must be unique.
Table above in 1NF-
Table 1: in 1NF form
Before we continue, let's understand a few things −
What is a key?
KEY is a value used to uniquely identify an entry in a table. A KEY can be a single column or a combination of multiple columns
Note. Columns in a table that are NOT used to uniquely identify a record are called non-key columns.
What is a primary key?
A primary is a value in a single column that is used to uniquely identify a record in a database.
Has the following attributes
- Primary key cannot be NULL
- Primary key value must be unique
- Primary key values should rarely be changed
- The primary key must be assigned a value when a new record is inserted.
What is a composite key?
A composite key is a primary key that consists of multiple columns used to uniquely identify a record.
There are two people in our database with the same name Robert Phil, but they live in different places.
Therefore, we require that the full name and address uniquely identify the entry. This is a composite key.
Let's move to second normal form 2NF
2NF (second normal form) rules
- Rule 1 - be in 1NF
- Rule 2 - Single Column Primary Key
It is clear that we cannot make our simple database in the form of 2nd normalization if we do not parse the table above.
We have split our 1NF table into two tables viz. Table 1 and Table 2. Table 1 contains information about the participants. Table 2 contains information about rented films.
We have introduced a new column called Membership_id which is the primary key for table 1. Records can be uniquely identified in table 1 using the membership id.
Database - Foreign Key
In table 2, Membership_ID is a foreign key
The foreign key refers to the primary key of another table! It helps to connect your tables
- A foreign key may have a different name than its primary key.
- This ensures that rows in one table have corresponding rows in another
- Unlike the primary key, they do not have to be unique. Most often they are not
- Foreign keys can be null although primary keys cannot
Why do you need a foreign key?
Suppose a newbie inserts a record in table B such as
You will only be able to insert into your foreign key those values that exist in the unique key of the parent table. It helps in referential integrity.
The above problem can be overcome by declaring the membership id from table 2 as a foreign key of the membership id from table 1
Now, if someone tries to insert a value into the membership id field that is not in the parent table, an error message will appear!
What are transitive functional dependencies?
Transitive Functional Dependency - Changing a non-key column may cause any other non-key column to change.
Consider Table 1. Changing a non-key column The full name can change the salutation.
Let's go to 3NF
3NF (third normal form) rules
- Rule 1 - be in 2NF
- Rule 2 - no transitive functional dependencies
In order to move our 2NF table to 3NF, we again need to partition our table again.
We split our tables again and created a new table that holds the greetings.
There are no transitive functional dependencies, and hence our table in 3NF
In table 3, the hello id is the primary key, and in table 1, the hello id is external to the primary key in table 3.
Now our little example is at a level that cannot be further decomposed to achieve higher forms of normalization. In fact, it is already in higher forms of normalization. Separate efforts to move to the next levels of data normalization are usually required in complex databases. However, we will briefly discuss the next levels of normalization in the following.
BCNF (Boyce-Codd normal form)
Even if the database is in 3rd normal form, anomalies can still occur if it has more than one candidate key .
Sometimes BCNF is also called 3.5 normal form.
4NF (Fourth Normal Form) Rules
If no instance of a database table contains two or more independent and multivalued data describing the corresponding object, it is in 4th normal form.
5NF (Fifth Normal Form) Rules
A table is in 5th normal form only if it is in 4NF and cannot be decomposed into any number of smaller tables without data loss.
6NF (Sixth Normal Form) Proposed
6th normal form is not standardized, however, it has been discussed by database experts for some time. Hopefully we will have a clear and standardized definition for 6th normal form in the near future...
This is all for normalization!!!
- Database design is critical to the successful implementation of a database management system that meets the data requirements of an enterprise system.
- Normalization helps create database systems that are cost effective and have better security models.
- Functional dependencies are a very important component of the data normalization process.
- Most database systems are normalized databases up to third normal form.
- The primary key uniquely identifies a record in a table and cannot be null
- The foreign key helps join the table and refers to the primary key