DeveloperBarn Forums

Posted by AOG123, 03-31-2008, 08:01 AM
Relational Database Normalization Basics

Normalization is the process of organizing data in a database. This includes creating tables and establishing relationships between those tables according to rules designed both to protect the existing data (against accidental deletions or amendments) and to make the database more flexible by eliminating redundancy and inconsistent dependency.

Redundant data wastes disk space and creates database maintenance problems. If data that exists in more than one place must be changed, the data must be changed in exactly the same way in all locations. A customer address change is much easier to implement if that data is stored only in the Customers table and nowhere else in the database.

What is an "inconsistent dependency"? While it is intuitive for a user to look in the Customers table for the address of a particular customer, it may not make sense to look there for the salary of the employee who calls on that customer. The employee's salary is related to, or dependent on, the employee and thus should be moved to the Employees table. Inconsistent dependencies can make data difficult to access because the path to find the data may be missing or broken.

There are a few rules for database normalization. Each rule is called a "normal form". If the first rule is observed, the database is said to be "in first normal form". If the first three rules are observed, the database is considered to be "in third normal form". Although other levels of normalization are possible, third normal form is considered the highest level necessary for most applications.

As with many formal rules and specifications, real world scenarios do not always allow for perfect compliance. In general, normalization requires additional tables and some designers find this first difficult and then cumbersome. If you decide to violate one of the first three rules of normalization, make sure that your application anticipates any problems that could occur, such as redundant data and inconsistent dependencies.

First Normal Form

Eliminate repeating groups in individual tables.
Create a separate table for each set of related data.
Identify each set of related data with a primary key.

Do not use multiple fields in a single table to store similar data. For example, to track an inventory item that may come from two possible sources, an inventory record may contain fields for Vendor Code 1 and Vendor Code 2. But what happens when you add a third vendor? Adding a field is not the answer; it requires program and table modifications and does not smoothly accommodate a dynamic number of vendors. Instead, place all vendor information in a separate table called Vendors, then link inventory to vendors with an item number key, or vendors to inventory with a vendor code key.
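
The vendor example can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the table and column names (Inventory, Vendors, InventoryVendors, ItemNo, VendorCode), the sample rows, and the use of a link table to relate items and vendors are not from the original article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Inventory (
        ItemNo      INTEGER PRIMARY KEY,
        Description TEXT NOT NULL
    );
    CREATE TABLE Vendors (
        VendorCode TEXT PRIMARY KEY,
        VendorName TEXT NOT NULL
    );
    -- One row per (item, vendor) pair replaces the Vendor Code 1 / Vendor Code 2 fields.
    CREATE TABLE InventoryVendors (
        ItemNo     INTEGER NOT NULL REFERENCES Inventory(ItemNo),
        VendorCode TEXT    NOT NULL REFERENCES Vendors(VendorCode),
        PRIMARY KEY (ItemNo, VendorCode)
    );
""")

# Hypothetical sample data.
conn.execute("INSERT INTO Inventory VALUES (1, 'Widget')")
conn.executemany("INSERT INTO Vendors VALUES (?, ?)",
                 [("V1", "Acme"), ("V2", "Globex"), ("V3", "Initech")])

# A third vendor is just another row; the table definition does not change.
conn.executemany("INSERT INTO InventoryVendors VALUES (?, ?)",
                 [(1, "V1"), (1, "V2"), (1, "V3")])

vendors_for_item = [row[0] for row in conn.execute(
    "SELECT VendorCode FROM InventoryVendors WHERE ItemNo = ? ORDER BY VendorCode",
    (1,))]
print(vendors_for_item)
```

With this layout the number of vendors per item is a matter of data, not of schema.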

Second Normal Form

Create separate tables for sets of values that apply to multiple records.
Relate these tables with a foreign key.

Records should not depend on anything other than a table's primary key (a compound key, if necessary). For example, consider a customer's address in an accounting system. The address is needed by the Customers table, but also by the Orders, Shipping, Invoices, Accounts Receivable and Collections tables. Instead of storing the customer's address as a separate entry in each of these tables, store it in one place, either in the Customers table or in a separate Addresses table.
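
A rough sketch of the "store it in one place" idea, again using sqlite3. The column names (CustomerID, OrderID, Address) and the sample values are assumptions for illustration; only the Customers and Orders tables are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (
        CustomerID INTEGER PRIMARY KEY,
        Name       TEXT NOT NULL,
        Address    TEXT NOT NULL
    );
    -- Orders carries only the customer's key, never a copy of the address.
    CREATE TABLE Orders (
        OrderID    INTEGER PRIMARY KEY,
        CustomerID INTEGER NOT NULL REFERENCES Customers(CustomerID)
    );
""")

# Hypothetical sample data.
conn.execute("INSERT INTO Customers VALUES (1, 'Pat', '1 Old Road')")
conn.executemany("INSERT INTO Orders VALUES (?, ?)", [(100, 1), (101, 1)])

# One UPDATE changes the address seen by every order for this customer.
conn.execute("UPDATE Customers SET Address = '2 New Street' WHERE CustomerID = 1")

order_addresses = conn.execute("""
    SELECT o.OrderID, c.Address
    FROM Orders AS o
    JOIN Customers AS c ON c.CustomerID = o.CustomerID
    ORDER BY o.OrderID
""").fetchall()
print(order_addresses)
```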

Third Normal Form

Eliminate fields that do not depend on the key.

Values in a record that do not depend on that record's key do not belong in the table. In general, any time the contents of a group of fields may apply to more than a single record in the table, consider placing those fields in a separate table.

For example, in an Employee Recruitment table, a candidate's university name and address may be included. But you need a complete list of universities for group mailings. If university information is stored in the Candidates table, there is no way to list universities with no current candidates. Create a separate Universities table and link it to the Candidates table with a university code key.
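
The university example might look like the following sqlite3 sketch. The column names (UniversityCode, CandidateID) and the sample rows are made up for illustration; the point is only that a university can exist without any candidate referring to it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Universities (
        UniversityCode TEXT PRIMARY KEY,
        Name           TEXT NOT NULL,
        Address        TEXT NOT NULL
    );
    CREATE TABLE Candidates (
        CandidateID    INTEGER PRIMARY KEY,
        Name           TEXT NOT NULL,
        UniversityCode TEXT REFERENCES Universities(UniversityCode)
    );
""")

# Hypothetical sample data: U2 has no current candidates.
conn.executemany("INSERT INTO Universities VALUES (?, ?, ?)",
                 [("U1", "State University", "10 Campus Way"),
                  ("U2", "City College", "20 Main Street")])
conn.execute("INSERT INTO Candidates VALUES (1, 'Lee', 'U1')")

# Every university is available for a mailing, including those with no candidates.
mailing_list = conn.execute("""
    SELECT u.Name, COUNT(c.CandidateID)
    FROM Universities AS u
    LEFT JOIN Candidates AS c ON c.UniversityCode = u.UniversityCode
    GROUP BY u.UniversityCode
    ORDER BY u.UniversityCode
""").fetchall()
print(mailing_list)
```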

EXCEPTION: Adhering to the third normal form, while theoretically desirable, is not always practical. If you have a Customers table and you want to eliminate all possible inter-field dependencies, you must create separate tables for cities, ZIP codes, sales representatives, customer classes, and any other factor that may be duplicated in multiple records. In theory, normalization is worth pursuing. However, many small tables may degrade performance or exceed open file and memory capacities.

It may be more feasible to apply third normal form only to data that changes frequently. If some dependent fields remain, design your application to require the user to verify all related fields when any one is changed.

Other Normalization Forms

Boyce-Codd Normal Form (BCNF), a stricter version of third normal form, as well as fourth and fifth normal forms do exist, but are rarely considered in practical design. Disregarding these rules may result in less than perfect database design, but should not affect functionality.

Normalizing an Example Table

These steps demonstrate the process of normalizing a fictitious student table.
Un-normalized table:

Code:
StudentNo    Advisor    AdvRoom    Class1    Class2    Class3
1022         Jones      412        101-07    143-01    159-02
4123         Smith      216        201-01    211-02    214-01

First Normal Form: No Repeating Groups

Tables should have only two dimensions. Since one student has several classes, these classes should be listed in a separate table. Fields Class1, Class2, and Class3 in the above records are indications of design trouble.
Spreadsheets often use the third dimension, but tables should not. Another way to look at this problem: with a one-to-many relationship, do not put the one side and the many side in the same table. Instead, create another table in first normal form by eliminating the repeating group (ClassNo), as shown below:

Code:
StudentNo    Advisor    AdvRoom    ClassNo
1022         Jones      412        101-07
1022         Jones      412        143-01
1022         Jones      412        159-02
4123         Smith      216        201-01
4123         Smith      216        211-02
4123         Smith      216        214-01
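
The step from the un-normalized table to the table above can be sketched in plain Python. The function name to_first_normal_form and the use of dictionaries as rows are just illustrative choices.

```python
# Un-normalized rows: one column per class (Class1, Class2, Class3).
unnormalized = [
    {"StudentNo": 1022, "Advisor": "Jones", "AdvRoom": 412,
     "Class1": "101-07", "Class2": "143-01", "Class3": "159-02"},
    {"StudentNo": 4123, "Advisor": "Smith", "AdvRoom": 216,
     "Class1": "201-01", "Class2": "211-02", "Class3": "214-01"},
]


def to_first_normal_form(rows):
    """Emit one row per (student, class) instead of one column per class."""
    result = []
    for row in rows:
        for column in ("Class1", "Class2", "Class3"):
            result.append({
                "StudentNo": row["StudentNo"],
                "Advisor": row["Advisor"],
                "AdvRoom": row["AdvRoom"],
                "ClassNo": row[column],
            })
    return result


first_nf = to_first_normal_form(unnormalized)
for row in first_nf:
    print(row["StudentNo"], row["Advisor"], row["AdvRoom"], row["ClassNo"])
```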

Second Normal Form: Eliminate Redundant Data

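Note the multiple ClassNo values for each StudentNo value in the above table. ClassNo is not functionally dependent on StudentNo, so each row can only be identified by the compound key (StudentNo, ClassNo). Advisor and AdvRoom, however, depend on StudentNo alone, which is only part of that key, so this relationship is not in second normal form.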

The following two tables demonstrate second normal form:

Table 'Students'
Code:
StudentNo    Advisor    AdvRoom
1022         Jones      412
4123         Smith      216

Table 'Registration'

Code:
StudentNo    ClassNo
1022         101-07
1022         143-01
1022         159-02
4123         201-01
4123         211-02
4123         214-01
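
A short Python sketch of the same split, starting from the first-normal-form rows. Representing rows as tuples and the variable names students and registration are illustrative assumptions.

```python
# First-normal-form rows: (StudentNo, Advisor, AdvRoom, ClassNo).
first_nf = [
    (1022, "Jones", 412, "101-07"),
    (1022, "Jones", 412, "143-01"),
    (1022, "Jones", 412, "159-02"),
    (4123, "Smith", 216, "201-01"),
    (4123, "Smith", 216, "211-02"),
    (4123, "Smith", 216, "214-01"),
]

# Students: one row per StudentNo. Keying a dict on StudentNo drops the repeats.
students = {}
for student_no, advisor, adv_room, _class_no in first_nf:
    students[student_no] = (advisor, adv_room)

# Registration: one row per (StudentNo, ClassNo) pair.
registration = [(student_no, class_no)
                for student_no, _advisor, _adv_room, class_no in first_nf]

print(students)
print(registration)
```

The advisor and room now appear once per student rather than once per class.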

Third Normal Form: Eliminate Data Not Dependent On Key

In the last example, AdvRoom (the advisor's office number) is functionally dependent on the Advisor attribute. The solution is to move that attribute from the Students table to the Faculty table, as shown below:
Table 'Students'
Code:
StudentNo    Advisor
1022         Jones
4123         Smith

Table 'Faculty'
Code:
Name     Room    Dept
Jones    412     42
Smith    216     42
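
The two tables above can be tried out with sqlite3. Treating the advisor's name as the Faculty primary key is an assumption made here to match the sample data; a real design would more likely use a separate faculty ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Faculty (
        Name TEXT PRIMARY KEY,   -- assumed key, for illustration only
        Room INTEGER NOT NULL,
        Dept INTEGER NOT NULL
    );
    CREATE TABLE Students (
        StudentNo INTEGER PRIMARY KEY,
        Advisor   TEXT NOT NULL REFERENCES Faculty(Name)
    );
""")
conn.executemany("INSERT INTO Faculty VALUES (?, ?, ?)",
                 [("Jones", 412, 42), ("Smith", 216, 42)])
conn.executemany("INSERT INTO Students VALUES (?, ?)",
                 [(1022, "Jones"), (4123, "Smith")])

# The advisor's room is stored once in Faculty and looked up through a join.
advisor_rooms = conn.execute("""
    SELECT s.StudentNo, s.Advisor, f.Room
    FROM Students AS s
    JOIN Faculty AS f ON f.Name = s.Advisor
    ORDER BY s.StudentNo
""").fetchall()
print(advisor_rooms)
```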

Notes: To avoid storing redundant data in a database, design each table to store only the data that pertains to a single entity. Foreign keys then represent the relationships between the entities. There are formal rules that define how data can be separated; those rules are referred to as normal forms. A database that complies with a given normal form is said to be normalized to that form.

Normalization to the third normal form is sufficient in most practical situations.

Each table in a normalized database should contain only the data that pertains to a single entity. Related tables reference one another with foreign keys. Foreign Key constraints formally define foreign keys and enforce the referential integrity of data. Before you can create a foreign key constraint, you must create a Primary Key or Unique constraint or a Unique Index on the column(s) that will be referenced by the Foreign Key constraint.
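
As a rough illustration of referential integrity, here is a sqlite3 sketch reusing the Students and Registration tables from the example. The details are SQLite-specific assumptions: other database systems use different syntax, and SQLite only enforces foreign keys once the foreign_keys pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- The referenced column needs a primary key (or unique) constraint first.
    CREATE TABLE Students (
        StudentNo INTEGER PRIMARY KEY
    );
    CREATE TABLE Registration (
        StudentNo INTEGER NOT NULL,
        ClassNo   TEXT    NOT NULL,
        PRIMARY KEY (StudentNo, ClassNo),
        FOREIGN KEY (StudentNo) REFERENCES Students(StudentNo)
    );
""")

conn.execute("INSERT INTO Students VALUES (1022)")
# Accepted: the parent row for student 1022 exists.
conn.execute("INSERT INTO Registration VALUES (1022, '101-07')")

try:
    # Student 9999 does not exist, so the constraint rejects this row.
    conn.execute("INSERT INTO Registration VALUES (9999, '101-07')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Registration").fetchone()[0]
print(rejected, row_count)
```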

Microsoft Support Article ID: 283878

Comments on this post
don94403 agrees: Good summary. Thanks.
lewy agrees: Very well explained