
Database Normalization


This page discusses the levels of normalization for a database.

Traditional databases are organized by fields, records, and tables.  A field is a single piece of information, a record is one complete set of fields, and a table is a collection of records.  A telephone book, for example, is analogous to a table.  It contains a list of records, each consisting of three fields: name, address, and telephone number.
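
As a rough illustration of these terms, the telephone book can be sketched as a single table using Python's built-in sqlite3 module.  The table name phone_book, the column names, and the sample entries are invented for this example.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# The table is the collection of records; each column is a field.
conn.execute(
    "CREATE TABLE phone_book (name TEXT, address TEXT, telephone TEXT)"
)

# Each tuple is one record: a complete set of the three fields.
records = [
    ("Ada Smith", "12 Elm St", "555-0100"),
    ("Bob Jones", "34 Oak Ave", "555-0101"),
]
conn.executemany("INSERT INTO phone_book VALUES (?, ?, ?)", records)

# Read back a single field (telephone) from a single record.
row = conn.execute(
    "SELECT telephone FROM phone_book WHERE name = ?", ("Ada Smith",)
).fetchone()
print(row[0])
```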

Database normalization is the process of efficiently organizing data in a database.  The normalization process has two goals: (1) eliminate redundant data (for example, the same data stored in more than one table) and (2) ensure that data dependencies make sense (only related data is stored together in a table).  Both goals are worthwhile: they reduce the amount of space the database consumes and ensure that the data is stored logically.
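
To see why redundant data is a problem, consider the following minimal sketch.  It assumes a hypothetical orders table that repeats the customer's city on every row.  Updating only one copy leaves the data contradicting itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Unnormalized: the customer's city is repeated on every order row.
conn.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "Ada Smith", "Boston"), (2, "Ada Smith", "Boston")],
)

# The customer moves, but only one of the redundant copies is updated.
conn.execute("UPDATE orders SET city = 'Denver' WHERE order_id = 1")

# The same customer now appears to live in two cities.
cities = sorted(
    r[0]
    for r in conn.execute(
        "SELECT DISTINCT city FROM orders WHERE customer = 'Ada Smith'"
    )
)
print(cities)
```

Storing the city once, in a table of its own, would make this kind of inconsistency impossible.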

Normalization Ranking

The database community has developed a series of guidelines for ensuring normalization.  These are referred to as "Normal Forms" and are numbered from 1 (first normal form, or 1NF) through 5 (fifth normal form, or 5NF).  In practical applications, you will often see 3NF along with the occasional 4NF.  The fifth normal form is very rarely seen.  The definitions of the normal form levels follow.

1st Normal Form (1NF)

This level sets the most basic rules for an organized database and is the lowest form of normalization.

Guideline(s):
  • Eliminate duplicate (repeating) columns from the same table
  • Create separate tables for each group of related data
  • Identify each row with a unique column or set of columns (the primary key [PK])
The primary key (PK) uniquely identifies each record in a table.  It can either be a normal attribute that is guaranteed to be unique or it can be generated by the DBMS.  It may consist of a single attribute (field) or multiple attributes in combination.
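
The 1NF guidelines can be sketched as follows, assuming a hypothetical contact table that originally held phone1 and phone2 columns.  The repeating columns move to their own table, and each table gets a primary key: one generated by the DBMS, one made of two columns in combination.  All table and column names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Before (not 1NF): contact(name, phone1, phone2) repeats a column.
# After: one row per phone number, and every table has a primary key.
conn.executescript(
    """
    CREATE TABLE contact (
        contact_id INTEGER PRIMARY KEY,   -- value generated by the DBMS
        name       TEXT NOT NULL
    );
    CREATE TABLE contact_phone (
        contact_id INTEGER NOT NULL,
        phone      TEXT NOT NULL,
        PRIMARY KEY (contact_id, phone)   -- multi-column primary key
    );
    """
)

cur = conn.execute("INSERT INTO contact (name) VALUES ('Ada Smith')")
ada_id = cur.lastrowid  # key value assigned by SQLite

conn.executemany(
    "INSERT INTO contact_phone VALUES (?, ?)",
    [(ada_id, "555-0100"), (ada_id, "555-0199")],
)

# The primary key rejects a second record with the same key values.
try:
    conn.execute("INSERT INTO contact_phone VALUES (?, ?)", (ada_id, "555-0100"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(duplicate_rejected)
```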

2nd Normal Form (2NF)

This level goes further in removing duplicate data from the system.

Guideline(s):

  • Meet all the requirements of the 1NF
  • Remove subsets of data that apply to multiple rows of a table and place them into separate tables
  • Create relationships between these new tables and their predecessors through the use of foreign keys (FK)
The foreign key (FK) is a relationship or link between two tables which ensures that the data stored in the database is consistent.  The foreign key link is set up by matching column(s) in one table (child) to the primary key column(s) in another table (parent).
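
A minimal sketch of the 2NF guidelines follows.  It assumes a hypothetical order_item table in which the product name used to be repeated on every order line.  The name moves to a parent product table, and the child table points to it through a foreign key.  Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON; other DBMSs typically enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default

conn.executescript(
    """
    -- Parent: product details are stored once.
    CREATE TABLE product (
        product_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL
    );
    -- Child: each order line refers to a product through a foreign key.
    CREATE TABLE order_item (
        order_id   INTEGER NOT NULL,
        product_id INTEGER NOT NULL REFERENCES product (product_id),
        quantity   INTEGER NOT NULL,
        PRIMARY KEY (order_id, product_id)
    );
    INSERT INTO product VALUES (1, 'Widget');
    INSERT INTO order_item VALUES (100, 1, 3);
    """
)

# A child row that points to a missing parent is rejected.
try:
    conn.execute("INSERT INTO order_item VALUES (101, 99, 1)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# The product name is recovered by joining child to parent.
rows = conn.execute(
    "SELECT oi.order_id, p.name, oi.quantity "
    "FROM order_item AS oi JOIN product AS p ON p.product_id = oi.product_id"
).fetchall()
print(orphan_rejected, rows)
```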
 

3rd Normal Form (3NF)

This form goes one large step further.

Guideline(s):
  • Meet all the requirements of the 2NF.
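  • Remove columns that are not directly dependent upon the primary key (for example, columns that depend on another non-key column) and place them into separate tables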
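
As an illustration of 3NF, assume a hypothetical employee table that once carried both dept_id and dept_name.  The department name depends on dept_id rather than on the employee's primary key, so in this sketch it lives in its own table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- dept_name depends on dept_id, not on emp_id, so it gets its own table.
    CREATE TABLE department (
        dept_id   INTEGER PRIMARY KEY,
        dept_name TEXT NOT NULL
    );
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        dept_id INTEGER NOT NULL REFERENCES department (dept_id)
    );
    INSERT INTO department VALUES (10, 'Sales');
    INSERT INTO employee VALUES (1, 'Ada Smith', 10), (2, 'Bob Jones', 10);
    """
)

# Renaming the department touches exactly one row.
cur = conn.execute(
    "UPDATE department SET dept_name = 'Revenue' WHERE dept_id = 10"
)
rows_changed = cur.rowcount

# Every employee in the department sees the new name through the join.
names = conn.execute(
    "SELECT e.name, d.dept_name FROM employee AS e "
    "JOIN department AS d ON d.dept_id = e.dept_id ORDER BY e.emp_id"
).fetchall()
print(rows_changed, names)
```
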

4th Normal Form (4NF)

This form has one additional requirement.

Guideline(s):
  • Meet all the requirements of the 3NF.
  • Remove multi-valued dependencies: a relation is in 4NF if it has no non-trivial multi-valued dependencies other than those determined by a key
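
A multi-valued dependency arises when one table records two independent facts about the same key.  The sketch below uses plain Python sets and made-up data (an employee's skills and languages) to show the combined table, its 4NF decomposition into two tables, and a check that joining them loses nothing.

```python
from itertools import product

# Hypothetical data: an employee's skills and languages are independent facts.
skills = {"Ada": {"SQL", "Python", "Go"}}
languages = {"Ada": {"English", "French"}}

# One table holding both facts must list every skill/language combination.
combined = {
    (emp, skill, lang)
    for emp in skills
    for skill, lang in product(skills[emp], languages[emp])
}

# 4NF decomposition: one table per independent multi-valued fact.
emp_skill = {(emp, skill) for emp, skill, _ in combined}
emp_language = {(emp, lang) for emp, _, lang in combined}

# Joining the two tables on the employee reproduces the original rows.
rejoined = {
    (emp, skill, lang)
    for emp, skill in emp_skill
    for emp2, lang in emp_language
    if emp == emp2
}

print(len(combined), len(emp_skill) + len(emp_language), rejoined == combined)
```

The combined table needs one row per combination (3 skills times 2 languages), while the decomposed tables need only one row per skill plus one row per language.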

NOTE: It is important to point out that these are guidelines only.  Occasionally, it becomes necessary to stray from them to meet business requirements.  When you do, evaluate the problems the variation could cause in your system and account for possible inconsistencies.