The basic component of a file in a file system is a data item,  which is the smallest named
Unit of data.   A group of related data items treated as a unit  by an application  is called a
Record.  A file is a collection of records of a single type.
In a relation database, a data item is called an attribute, a record  is called a row or tuple,
And a file is called a table.      A  database   is a collection of interrelated stored data  that 
Serves  the needs of multiple   users within   one or  more organizations.        A  database 
management system (DBMS) is a generalized software system for manipulating database.
A  DBMS supports a logical  view (schema,subschema);   physical view (access methods,
Data clustering);   data definition language;  data manipulation language;   and  important
Utilities such as  transaction management and  concurrency control,  data integrity,  crash
Recovery  and  security.   Relational  DBMSs   provide   a  much  higher  degree  of  data
Independence and also make database conversion and reorganization much easier.

Steps: Database Life Cycle
1. Requirements Analysis
2. Logical Design
             a) ER Modeling
             b) View Integration
             c) Transformation of the ER Model to SQL Tables
             d) Normalization of Tables
3. Usage Refinement
4. Data Distribution
5. Local Schema and Physical Design
6. Database Implementation, monitoring and modification.

Basic Objects: Entities, Relationships, Attributes 

The  basic  ER  model  consists  of  three classes of objects:  Entities,  Relationships  and
Attributes.
Entities
Entities are the principal  data objects  about which  information  is to be collected;  they 
Usually denote a person,  place, thing ,  or event of informational interrest.   A particular
Occurrence of an entity is called an entity instance or sometimes an  entity  occurrence.
Relationships
Relationships represent real-world associations among one or more entities, and as such
Have no physical or conceptual  existence  other than that which is  inherited  from their
Entity associations.    A  particular occurrence of a relationship  is called a  relationship
Instance or relationship occurrence.     A role is defined as the function an entity  plays 
In a relationship.
Attributes
Attributes  are characteristics of entities that provide  descriptive detail about them.   A
Particular occurrence  of  an  attribute  within  an  entity   or  relationship  is  called  an 
Attribute  value.    Attributes of  an entity  such as Employee may include emp_id,emp_
Name,emp_address,phone_no,fax_no,job_title , and so on.
Transformation Rules and SQL Constructs
The basic transformations can be described in terms of the three types of tables they
Produce:
--An Entity table with the same information content as the original entity
This  transformation  always  occurs  for entities with   binary relationships  that  are 
Many-to-many,  one-to-many  on the 'one' (parent) side,  or one-to-one  on one  side;
Entities with binary recursive relationships that are many-to-many; and entities with
Any ternary or higher-degree relationship, or a generalization hierarchy.
--An Entity table with the embedded foreign key of the parent entity
This transformation always occurs for entities with binary relationships that are one-
To-many for the entity on the 'many'(child)side, for one-to-one relationships for one
Of the entities, and for each entity with a binary recursive relationship that is one-to-
One or one-to-many.
--A relationship table with the foreign keys of all the entities in the relationship
This  transformation  always  occurs for relationships  that  are  binary and  many-to-
Many, relationships that are binary recursive and many-to-many, and all relationship
That are of ternary or higher degree.    This  is  other most common way CASE tools
Handle relationships in the ER model.
Binary Relationships
A  one-to-one binary  relationship  between  two  entities   -   When both entities  are
Mandatory, each entity becomes a table and the key of either entity can appear in the
Other  entity's table  as a foreign key.   One of the  entities in an optional in the other
Should  contain  the foreign key  of the other entity  in its  transformed table.  When 
both entities are optional, either entity can contain  the embedded foreign key of the
Other entity, with nulls allowed in the foreign keys.
The one-to-many  relationship can be shown  as either mandatory or optional  on the
'many' side, without affecting the transformation.  On the  'one'  side it may be either
mandatory  or optional.   In all the cases the foreign key  must appear  on the  'many'
side,   which represents the child entity,  with nulls allowed for foreign keys only  in 
the optional  'one'  case.    Foreign key constraints  are set  according  to the  specific
meaning of the relationship and may vary from one relationship to another.
The many-to-many relationship, as completely  optional,requires a relationship table
with  primary keys of  both entities.    The same transformation applies to  either  the 
optional or mandatory case, including the fact the not null cause must appear for  the
Foreign  keys  in both  cases.    Foreign key  constraints  on delete  and  update  must
always be cascade because each entry in the SQL table depends on the current  value
or existence of the referenced primary key.
Binary Recursive Relationships
 A single entity with a one-to-one relationship implies some form  of entity occurance
pairing,  as  indicated  by  the   relationship  name.   This  pairing  may  be completely 
optional,  completely  mandatory,  or  neither.   In all of these cases, the pairing entity 
key appears as a foreign key in the resulting table.    The  two key attributes are taken
from the  same domain  but are  given  different  names to designate their unique  use.
The one-to-many relationship  requires a  foreign key in the entity table.   The foreign
key constraints  can vary with the particular relationship.  The many-to-many relation
-ship uses a relationship table; it could also be defined as mandatory .
Multiple Relationships
Multiple  relationships  among  n  entities are always considered to be completely 
independent.  One-to-one or one-to-many binary or binary recursive relationships 
that result in entity tables  that  are either equivalent or differ only in the addition 
of a  foreign key  can simply be  merged  into  a single entity table containing all 
the  foreign   keys.     Many-to-many   or   ternary   relationships  that   result   in 
relationship tables tend to be unique and cannot be merged.


Transformation Steps
The  list  that  follows  summarizes  the  basic  transformation  steps  from an ER
Diagram to SQL.
1) Transform each entity into a table  containing the key and  non-key attributes
      of the entity.
2) Transform every many-to-many binary or binary recursive relationship into  a
Relationship  table  with  the  keys  of  the  entities  and  the  attributes  of  the 
      relationship.
3 ) Transform  every ternary  or  higher level n-ary relationship into a relationship
      table.


                                                   NORMALIZATION

Fundamentals
Relational database tables, whether they are derived from ER models or from some
other design method, sometimes suffer from some rather serious problems in terms
of performance, integrity, and maintainability.    For example, when the entire data-
base is defined as a single large table,   it  can result in a large amount of redundant 
data and lengthy searches for just a small number of target rows.    It can also result
in  long  and  expensive  updates,    and   deletions  in  particular  can  result  in  the 
elimination of useful data as an unwanted side effect.
For example, products, salespersons, customers, and orders are all stored in a single
Table called 'sales'.  In this table, certain product and customer information is stored
redundantly, wasting storage space.  Also, updates such as changing the address   of
the customer would require changing many rows.   Finally,  deleting  the  only  out-
standing order by a valued customer also deletes the only copy  of  his  address  and
credit rating as a side  effect.   Such  information  may  be  difficult  ( or  sometimes 
impossible) to recover.  
If we had a method of breaking up such a large table into smaller tables so that these
types of problems would be eliminated, the database would be much more   efficient
and reliable.   Classes  of  relational  database schemes  or table  definitions,   called 
normal forms, are  commonly  used to accomplish  this  goal.    The  creation  of  a 
normal form database table is called normalization.  It is accomplished by analyzing
the interdependencies among individual attributes  associated with those tables  and 
taking projections (subsets of columns) of larger tables to form smaller ones.

First Normal Form
A Table is in first normal form (1NF) if and only if all columns contain only atomic
values; that is, there are no repeating groups (columns) within a row.
A repeating group occurs in a relational table when a multivalued attribute is allowed
to have more than one value represented within a single row. When this happen, rows
must either be defined as variable length or defined with enough attribute positions to
accommodate the maximum possible set of values.
The  advantages  of  1NF  over unnormalized tables are its representational simplicity
and the ease with which one can develop a query language for it.  The disadvantage is
the requirement of duplicate data.

Super keys, Candidate keys, and Primary keys
A table in 1NF often suffers from data duplication,  update performance,  and  update
Integrity  problems.     A  superkey  is a set  of one  or more  attributes  which,  taken
collectively,  allows  us to identify uniquely  an entity  or  table.     Any subset of  the 
attributes of a superkey that is also a superkey  and not reducible to another superkey 
is  called  a  candidate key.    A  primary key  is  selected  arbitrarily  from the set  of 
candidate keys to be used in an index for that table.

Second Normal Form
A Table is in second normal form (2NF)  if and only if it is in 1NF and every nonkey
attribute is fully dependent on the primary key.  An attribute is fully dependent on the
primary key  if  it  is on  the right side of an FD  for which  the left side  is  either  the 
primary key  itself  or  something that can be derived from the primary key  using  the 
transitivity of FDs.
The property  of one or more attributes  that uniquely   determine the value of  one  or 
more other attributes is called functional dependence.
An example of a transitive FD in report(table) is the following:
		report_no -> dept_no
            	dept_no   -> dept_name
Therefore we can derive  the  FD  ( report_no  ->  dept_name ),   since  dept_name  is 
Transitively dependent on report_no.

Third Normal Form
A Table is in third normal form (3NF) if and only if for every nontrivial functional
dependency X->A, where X and A are either simple or composite attributes, one of
two conditions must hold.    Either  attribute  X  is  a  superkey,  or  attribute A  is a 
member of a candidate key.   If  attribute  A  is  a  member  of a candidate key, A is 
called a prime attribute.
Note: A trivial FD is of the form YZ->Z.

Boyce-Codd Normal Form
Third normal form,   which  eliminates  most  of the anomalies known in databases
today, is the most  common  standard  for  normalization  in  commercial databases
and  CASE  tools.   The  few remaining anomalies can be eliminated by the Boyce-
Codd normal form and higher normal forms.
"A table R is in Boyce_Codd normal form (BCNF) if for every nontrivial FD X->A,
X is a superkey".
BCNF is a stronger form of normalization than 3NF because it eliminates the second
condition  for   3NF ,  which  allowed the right side of the FD to be a prime attribute.
Thus,  every  left side  of an  FD  in a table  must be a superkey.   Every table  that is 
BCNF is also 3NF, 2NF and 1NF, by the previous definitions.
A  loseless  decomposition  requires  that  when you decompose  the  table  into  two 
smaller  tables  by projecting the original table  over  two overlapping  subsets of the
scheme,  the  natural join  of  those  subset  tables  must  result  in  the  original  table 
without  any extra  unwanted rows.    The  simplest  way to avoid the delete anomaly
for this  kind of situation is  to create a separate table  for each of the  two  assertions.
These  two tables are partially redundant,  enough so as to  avoid the  delete anomaly.
This  decomposition  is  lossless ( trivially )  and  preserves  functional dependencies,
but   it  also   degrades   update   performance  due  to  redunancy,  and   necessitates
additional   storage  space.     The  trade-off  is  often  worth  it  bacause   the   delete 
anomaly is avoided.

Fourth and Fifth Normal Forms
Normal forms  up to BCNF  were  defined  solely  on  FDs,  and  for  most  database
practitioners,   either  3NF or BCNF is a sufficient level of normalization.  However,
there are  in fact  two more  normal forms that are needed to eliminate the rest of the
Currently known anomalies.                           (constraints: Multivalued dependencies
and join dependencies).

In a multivalued dependency (MVD), X->>Y holds on table R with table schemeRS
If , whenever a valid instance of table R(X,Y,Z) contains a pair of rows  that  contain
duplicate values of X ,  then  the instance also  contains the  pair of rows obtained by
interchanging  the Y  values in the original pair.   This includes situations where only
pairs  of  rows  exist.    Note  that  X  and  Y  may  contain either single or composite
attributes.
An MVD  X->>Y  is trivial if Y is a subset of X,  or if X union Y = RS .   Finally, an
FD implies an MVD , which implies that a single row with a given value of X is also
An MVD , albeit a trivial form.

Fourth Normal Form
A Table R is in fourth normal form (4NF)  if and  only if it is in BCNF and,  when-
ever there exists an  MVD in R  ( say X->>Y ),  at least  one of the  following holds:
The MVD is trivial  or X is a superkey for R.
The goal of  fourth normal form  is to  eliminate nontrivial   MVDs  from a table  by
projecting them onto separate smaller tables and thus eliminate the update anomalies
associated with the MVDs.   Decomposition of a table into 4NF tables results in less
data redundancy.

Fifth Normal Form
A Table is in fifth normal form (5NF) if it cannot have a lossless decomposition by
the projection operation into any number of smaller tables.
A lossless decomposition  of a  table implied  that it  can be decomposed  by  two  or 
more  projections, followed  by a  natural  join  of  those  projections  (in  any  order)
that results in the original table, without any spurious or missing rows.   The general
lossless decomposition constraint, involving any number of projections, is also known
as  a  join dependency (JD).  In other words, a table is not in 5NF  if it can be lossless
decomposed/joined via some n1 projections.
A lossless decomposition of two projections is equivalent  to an  MVD,  and the table
may be in BCNF  ( if the MVD is trivial)  or  4NF  ( if the MVD  is  trivial  or the left 
side is a superkey).      A  lossless  decomposition  of   three  or  more  projections   is 
equivalent to a JD,   and the table is only in 4NF.   Thus, an MVD is a special case of
A JD where the number of projections is two.
A table may have constraints that are FDs,  MVDs,  and  JDs.    An MVD is a special 
case of a JD.    In order to  determine the level of normalization of the table,  analyze 
the FDs first to determine normalization through BCNF ;  then analyze the MVDs to
determine which BCNF tables are also  4NF;  and,  finally,  analyze  the JDs  to find
which 4NF tables are also 5NF.
A many-to-many-to-many ternary relationship is:
1. BCNF if it can be replaced by two binary relationships.
2. 4NF if it can only be replaced by three binary relationships, and
3. 5NF if it cannot be replaced in any way (and thus is a true ternary
      relationship).

1