Normalization is the process of organizing the data in a database efficiently. It has two goals: reducing redundant data (for example, storing the same data in more than one table) and ensuring that data dependencies make sense (storing only related data in a table). Both are worthy goals: they reduce the amount of space a database consumes and ensure that data is stored logically.
Normalization Avoids
1. Duplication of Data: The same data is listed in multiple rows of the database.
2. Insert Anomaly: A record about an entity cannot be inserted into the table without first inserting information about another entity. For example, a customer cannot be entered without a sales order.
3. Delete Anomaly: A record cannot be deleted without also deleting a record about a related entity. For example, a sales order cannot be deleted without deleting all of the customer's information.
4. Update Anomaly: Information cannot be updated without changing it in many places. For example, to update customer information, it must be updated on every sales order the customer has placed.
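The anomalies above can be seen in a minimal Python sketch. It assumes a single unnormalized table, held as a list of rows, in which each sales order repeats the customer's details. The table layout, the customer names, and the helper functions are hypothetical and serve only as illustration.

```python
# Hypothetical unnormalized table: one row per sales order, with the
# customer's details repeated on every row (duplication of data).
orders = [
    {"order_id": 1, "customer": "Acme", "phone": "555-0100", "total": 250},
    {"order_id": 2, "customer": "Acme", "phone": "555-0100", "total": 75},
    {"order_id": 3, "customer": "Birch", "phone": "555-0199", "total": 40},
]


def update_phone(rows, customer, new_phone):
    """Update anomaly: every row for the customer has to be changed."""
    touched = 0
    for row in rows:
        if row["customer"] == customer:
            row["phone"] = new_phone
            touched += 1
    return touched


def delete_order(rows, order_id):
    """Return the rows that remain after removing one sales order."""
    return [row for row in rows if row["order_id"] != order_id]


def known_customers(rows):
    """Customers the table still knows about."""
    return {row["customer"] for row in rows}


# One logical change (a new phone number) touches two rows.
rows_changed = update_phone(orders, "Acme", "555-0111")

# Delete anomaly: removing Birch's only order also removes Birch.
remaining = delete_order(orders, 3)
lost = known_customers(orders) - known_customers(remaining)
```

In this sketch there is also no way to record a customer who has not yet placed an order, since every row requires an order_id. That is the insert anomaly.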
Normalization is a three-stage process. After the first stage, the data is said to be in first normal form; after the second, it is in second normal form; and after the third, it is in third normal form.
Before Normalization
1. Begin with a list of all of the fields that must appear in the database. Think of this as one big table.
2. Do not include computed fields.
3. A good place to start gathering this information is a printed document used by the system.
4. Attributes beyond those for the entities described on the document can be added to the database.
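As a rough illustration of these steps, the sketch below builds the starting field list in Python. The field names are assumptions, imagined as read off a printed sales invoice, and the customer_credit_limit attribute is a hypothetical example of a field the system needs that does not appear on the document.

```python
# Hypothetical fields read off a printed sales invoice (step 3). The
# second element marks fields that can be computed from other fields.
invoice_fields = [
    ("order_number", False),
    ("order_date", False),
    ("customer_name", False),
    ("item_number", False),
    ("quantity", False),
    ("unit_price", False),
    ("line_total", True),   # quantity * unit_price
    ("order_total", True),  # sum of the line totals
]

# Attribute that is not printed on the document but is still needed (step 4).
extra_fields = [("customer_credit_limit", False)]


def starting_table(fields):
    """Keep stored fields only; computed fields are left out (step 2)."""
    return [name for name, computed in fields if not computed]


# The "one big table" that normalization starts from (step 1).
big_table = starting_table(invoice_fields + extra_fields)
```

The computed fields are dropped because their values can always be recalculated from the stored ones, so keeping them would add redundant data.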