Upload
kiranqa
7/25/2019 ER Model Info.docx
The ER model defines the conceptual view of a database. It works around real-world entities and
the associations among them. At view level, the ER model is considered a good option for
designing databases.
Entity
An entity can be a real-world object, either animate or inanimate, that can be easily identified. For
example, in a school database, students, teachers, classes, and courses offered can be considered
as entities. All these entities have some attributes or properties that give them their identity.
An entity set is a collection of similar types of entities. An entity set may contain entities whose
attributes share similar values. For example, a Students set may contain all the students of a
school; likewise a Teachers set may contain all the teachers of a school from all faculties. Entity
sets need not be disjoint.
Attributes
Entities are represented by means of their properties, called attributes. All attributes have values.
For example, a student entity may have name, class, and age as attributes.
There exists a domain or range of values that can be assigned to attributes. For example, a
student's name cannot be a numeric value. It has to be alphabetic. A student's age cannot be
negative, etc.
Types of Attributes
Simple attribute: Simple attributes are atomic values, which cannot be divided further. For
example, a student's phone number is an atomic value of 10 digits.
Composite attribute: Composite attributes are made of more than one simple attribute.
For example, a student's complete name may have first_name and last_name.
Derived attribute: Derived attributes are attributes that do not exist in the physical
database; their values are derived from other attributes present in the database. For
example, average_salary in a department should not be saved directly in the database;
instead it can be derived. As another example, age can be derived from date_of_birth.
Single-value attribute: Single-value attributes contain a single value. For example:
Social_Security_Number.
Multi-value attribute: Multi-value attributes may contain more than one value. For
example, a person can have more than one phone number, email_address, etc.
These attribute types can be combined as follows:
simple single-valued attributes
simple multi-valued attributes
composite single-valued attributes
composite multi-valued attributes
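The attribute types above can be sketched in code. The following is a minimal illustration only, assuming Python dataclasses stand in for entities; the class names, field names, and sample values are hypothetical and do not come from the original text.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class FullName:
    # Composite attribute: built from two simple attributes.
    first_name: str
    last_name: str


@dataclass
class Student:
    roll_number: int      # simple, single-valued
    name: FullName        # composite, single-valued
    date_of_birth: date   # simple, single-valued
    phone_numbers: List[str] = field(default_factory=list)  # simple, multi-valued

    def age(self, today: date) -> int:
        # Derived attribute: computed from date_of_birth, never stored.
        years = today.year - self.date_of_birth.year
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return years if had_birthday else years - 1


# Hypothetical sample entity.
student = Student(1, FullName("Asha", "Rao"), date(2005, 9, 14), ["5550100", "5550101"])
```

Calling student.age(some_date) recomputes the age each time, which is the point of a derived attribute: only date_of_birth is kept.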
Entity-Set and Keys
A key is an attribute or collection of attributes that uniquely identifies an entity within an entity set.
For example, the roll_number of a student makes him/her identifiable among students.
Super Key: A set of attributes (one or more) that collectively identifies an entity in an entity
set.
Candidate Key: A minimal super key is called a candidate key. An entity set may have
more than one candidate key.
Primary Key: A primary key is one of the candidate keys chosen by the database designer
to uniquely identify the entities in the entity set.
Relationship
The association among entities is called a relationship. For example, an employee works_at a
department, a student enrolls in a course. Here, Works_at and Enrolls are called relationships.
Relationship Set
A set of relationships of similar type is called a relationship set. Like entities, a relationship too can
have attributes. These attributes are called descriptive attributes.
Degree of Relationship
The number of participating entities in a relationship defines the degree of the relationship.
Binary = degree 2
Ternary = degree 3
n-ary = degree n
Mapping Cardinalities
Cardinality defines the number of entities in one entity set that can be associated with
entities of the other set via a relationship set.
One-to-one: One entity from entity set A can be associated with at most one entity of entity
set B and vice versa.
One-to-many: One entity from entity set A can be associated with more than one entity
of entity set B; however, an entity from entity set B can be associated with at most one
entity of entity set A.
Many-to-one: More than one entity from entity set A can be associated with at most one
entity of entity set B; however, an entity from entity set B can be associated with more than
one entity from entity set A.
Many-to-many: One entity from A can be associated with more than one entity from B and
vice versa.
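As a rough illustration of the four mapping cardinalities, the sketch below classifies a relationship set given as (a, b) pairs. The function name and the sample identifiers are hypothetical. Note that it only describes the pairs it is given, whereas a cardinality in an ER model is a design rule about all possible data.

```python
from collections import Counter


def mapping_cardinality(pairs):
    """Classify a relationship set given as (a, b) pairs, a from set A and b from set B."""
    pairs = set(pairs)
    per_a = Counter(a for a, _ in pairs)  # how many B entities each A entity links to
    per_b = Counter(b for _, b in pairs)  # how many A entities each B entity links to
    a_many = any(n > 1 for n in per_a.values())
    b_many = any(n > 1 for n in per_b.values())
    if a_many and b_many:
        return "many-to-many"
    if a_many:
        return "one-to-many"
    if b_many:
        return "many-to-one"
    return "one-to-one"


# One department (A) with two employees (B), each employee in one department.
print(mapping_cardinality([("d1", "e1"), ("d1", "e2")]))
```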
Let us now learn how the ER Model is represented by means of an ER diagram. Any object, for
example, entities, attributes of an entity, relationship sets, and attributes of relationship sets, can be
represented with the help of an ER diagram.
Entity
Entities are represented by means of rectangles. Rectangles are named with the entity set they
represent.
Attributes
Attributes are the properties of entities. Attributes are represented by means of ellipses. Every
ellipse represents one attribute and is directly connected to its entity (rectangle).
If the attributes are composite, they are further divided in a tree-like structure. Every node is then
connected to its attribute. That is, composite attributes are represented by ellipses that are
connected with an ellipse.
Multivalued attributes are depicted by a double ellipse.
Derived attributes are depicted by a dashed ellipse.
Relationship
Relationships are represented by a diamond-shaped box. The name of the relationship is written inside
the diamond box. All the entities (rectangles) participating in a relationship are connected to it by a
line.
Binary Relationship and Cardinality
A relationship where two entities are participating is called a binary relationship. Cardinality is the
number of instances of an entity that can be associated with the relationship.
One-to-one: When only one instance of an entity is associated with the relationship, it is
marked as '1:1'. Only one instance of each entity should be associated with the relationship.
This depicts a one-to-one relationship.
One-to-many: When more than one instance of an entity is associated with a relationship,
it is marked as '1:N'. Only one instance of the entity on the left and more than one instance
of the entity on the right can be associated with the relationship. This depicts a one-to-many
relationship.
Many-to-one: When more than one instance of an entity is associated with the relationship, it
is marked as 'N:1'. More than one instance of the entity on the left and only one instance of
the entity on the right can be associated with the relationship. This depicts a many-to-one
relationship.
Many-to-many: More than one instance of the entity on the left and more than one instance
of the entity on the right can be associated with the relationship. This depicts a many-to-many
relationship.
Generalization
As mentioned above, the process of generalizing entities, where the generalized entity contains the
properties of all the entities it generalizes, is called generalization. In generalization, a number of
entities are brought together into one generalized entity based on their similar characteristics. For
example, pigeon, house sparrow, crow and dove can all be generalized as Birds.
Specialization
Specialization is the opposite of generalization. In specialization, a group of entities is divided into
sub-groups based on their characteristics. Take a group ?
For example, the attributes of a
What is a relational database?
Though you are probably aware of the term database, what you're actually referring to
is a Database Management System. A database in specific terms is a collection of
related data managed as a single unit and stored on some form of persistent storage
device such as a hard disk. This collection of data is managed by the Relational
Database Management System (RDBMS), which acts to control access to the database.
The RDBMS is a collection of applications sitting on top of the database providing a
number of vital functions which include:
Allow users to create databases
Allow users to query the data stored
Support and maintain large amounts of data
Allow multiple users to access the data concurrently
Most importantly the RDBMS provides consistency, integrity and security of the data
it holds and, as such, the reliability and robustness of the relational database it looks
after.
Separation of Concept
When we consider a Relational Database we can illustrate it as a set of conceptual
components and interfaces. This helps us get a better overview of how each of the
elements to be learnt fit together and interact with each other.
Database: can be considered as a collection of related data managed as a single unit
DBMS: a collection of applications providing a number of vital
functions for the database
The other four are our perception of design, implementation, use and maintenance of
the database and how we interact with it to achieve our goals of efficient data storage
and information retrieval. These are the areas we will be looking at within this unit but
for now they are:
Design: Allows us to model our business and translate that into a database design
(database schema). For this we use two design techniques.
Entity Modelling: Top-down design through information modelling
Normalisation: Bottom-up design through data modelling
Implementation: Once the database schema has been realised we use the relational
database access language, Structured Query Language (SQL). More specifically we use a
subset of SQL called the Data Definition Language (DDL) to create the database schema
in the database in the form of relational tables.
Database Queries: Once the database has been created we need a means of adding,
removing, updating and querying the data. This is normally achieved through some form
of Application Programming Interface (API) or Database Client using another subset of
SQL called Data Manipulation Language (DML).
Database Administration: Administration tasks are carried out by a Database
Administrator (DBA). The DBA is responsible for the maintenance and security of the
database and, though they might use DDL and DML for maintenance, their main tool of
control is the third subset of SQL called the Data Control Language (DCL), which allows
the DBA to set up security rules about how the data might be accessed and by whom.
The Database Terms of Reference
Introduction
When we communicate about a particular subject area it is very important we use the
common terms of reference. Relational database design is no different with its own set
of unique terms, and in order to express or define our ideas and concepts we need to
know the language to use. With Relational Databases this set of terms goes further,
being used to express and define the database design process itself, and is used
throughout the industry from publications and education to conversations between
database professionals. In this section we are going to look at some of these terms and
where they are used.
The process of creating a database can be broadly divided into two main stages:
1. Data analysis, using a formalised methodology to create a database design. Two
widely used methods are Entity Relationship Modelling (ERM) and Normalisation.
2. Physical implementation of that design in a database system. There are many
examples of Relational databases including MySQL, Oracle and SQL Server, to mention but a few.
As you move from a database design to a physical implementation, different
terminology is used. It is important to understand these differences and ensure the
correct terms are used for the appropriate methodology or stage you are discussing or
presenting.
The following table identifies each of the different disciplines and their equivalent
terms in relation to the other disciplines.
It is a common misunderstanding that an Entity is like a Relation or that a
Relation is a table. This is not true, as they stem from very different disciplines within
the Relational Database model and as such represent different descriptive types
specific to that discipline. In saying this, the rationale of each being representative of
the other within the different disciplines can hold true. Therefore an Entity can be
compared to a Relation in terms of design and used as validation. Likewise an Entity
or Relation can become a table during the transition to implementation. This may
seem somewhat pedantic but clarity of definition and scope will help to ensure you
can correctly communicate your needs or requests to yourself and those around you.
Poor use of terms can lead to confusion, misrepresentation or poor implementation.
Let us now examine each of the terms in the above table in more detail.
Entity Relationship Modelling Terms
Entity:
A uniquely identifiable object of importance from a top level
perspective of an organisation or business model.
Entity Occurrence:
A single instance of an entity.
Attribute:
An identified element within an entity.
If we consider a college as an example of something we might be modelling, a
department, student or module might be examples of a uniquely identifiable object of
importance. It is worth noting that by convention we tend to name entities in the
singular, as in student rather than students.
Continuing with this example, within a Student entity we would have attributes such
as studentId and studentName, and an entity occurrence would be a single instance of
these attributes.
Normalisation Terms
Relation:
A bottom up view of a design concept of a realisation of a
potential database table taken from the movement of
information within an organisation or business model.
Tuple:
An ordered finite set of values of a relation.
Domain:
Defines the constraint and type of a single value element of a
relation.
If we take a hotel as an example we might have Customer or Booking as examples of
Relations, with examples of Domains within a Booking being bookingRef,
bookingDate and roomNo.
Relational Database Terms
Table:
A table is the conceptual view of the database's internal
structure in the context of the 3 layer model.
Record/Row:
A row/record is a set of related data values of a common item.
Column:
A column is a data value of a particular item type.
This is the implementation of the design and an example can be seen in the following
SQL statement
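The SQL statement from the original did not survive extraction, so the sketch below is only an illustrative stand-in. It assumes an in-memory SQLite database driven from Python's standard sqlite3 module; the table name, column names and sample rows are hypothetical. It shows a table being created with two columns and then populated with rows (records).

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# DDL: create the table; each column has a name and a type.
conn.execute("""
    CREATE TABLE student (
        student_id   INTEGER PRIMARY KEY,
        student_name TEXT NOT NULL
    )
""")

# DML: add rows (records), then query them back.
conn.executemany(
    "INSERT INTO student (student_id, student_name) VALUES (?, ?)",
    [(1, "John Smith"), (2, "Mary Jones")],
)
rows = conn.execute(
    "SELECT student_id, student_name FROM student ORDER BY student_id"
).fetchall()
print(rows)
```

Each tuple returned by the query corresponds to one row of the table, and each position in the tuple to one column.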
Models and Schemas
Relational Model
is representative of a single entity or relation within the context
of a relational database where each of the elements of the
entity or relation have been defined. For example a single
entity has its attributes, constraints and keys defined which are
representative of the completed table to be implemented.
Relational Schema
is a realisation of a relational model. It is the implementation of
this model into the relational database. In other words, it is a
physical single table within the database.
Relational Database Schema
is the collection of Relational Schemas and their relationships
to each other as implemented into a relational database. In
other words, it is a collection of physical tables and their
relationships that make up the database as a whole.
Database Keys
Introduction
For the purposes of clarity we will refer to keys in terms of RDBMS tables but the
same definition, principle and naming applies equally to Entity Modelling and
Normalisation.
Keys are, as their name suggests, a key part of a relational database and a vital part of
the structure of a table. They ensure each record within a table can be uniquely
identified by one or a combination of fields within the table. They help enforce
integrity and help identify the relationship between tables. There are three main types
of keys: candidate keys, primary keys and foreign keys. There is also an alternative
key or secondary key that can be used, as the name suggests, as a secondary or
alternative key to the primary key.
Super Key
A Super key is any combination of fields within a table that uniquely identifies each
record within that table.
Candidate Key
A candidate key is a minimal super key: a single field or the least
combination of fields that uniquely identifies each record in the table. The least
combination of fields distinguishes a candidate key from a super key. Every table
must have at least one candidate key but at the same time can have several.
As an example we might have a student_id that uniquely identifies the students in a
student table. This would be a candidate key. But in the same table we might have the
student's first name and last name that also, when combined, uniquely identify the
student in a student table. These would both be candidate keys.
In order to be eligible as a candidate key, a field or combination of fields must pass certain criteria.
It must contain unique values
It must not contain null values
It contains the minimum number of fields to ensure uniqueness
It must uniquely identify each record in the table
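The criteria above can be checked mechanically against sample data. The following is a minimal sketch, assuming rows are held as Python dictionaries; the function names and the sample rows are hypothetical. Bear in mind that a key found this way is only a hypothesis: uniqueness has to hold for all data the table may ever contain, not just the sample.

```python
from itertools import combinations


def is_super_key(rows, fields):
    """True if no value is null and no two rows share the same values for `fields`."""
    seen = set()
    for row in rows:
        values = tuple(row[f] for f in fields)
        if None in values or values in seen:
            return False
        seen.add(values)
    return True


def candidate_keys(rows, all_fields):
    """Return the minimal field combinations that uniquely identify every row."""
    found = []
    for size in range(1, len(all_fields) + 1):
        for combo in combinations(all_fields, size):
            # Skip combinations that contain a smaller key: they are not minimal.
            if any(set(key) <= set(combo) for key in found):
                continue
            if is_super_key(rows, combo):
                found.append(combo)
    return found


# Hypothetical sample rows for a student table.
students = [
    {"student_id": 1, "first_name": "John", "last_name": "Smith"},
    {"student_id": 2, "first_name": "John", "last_name": "Jones"},
    {"student_id": 3, "first_name": "Mary", "last_name": "Smith"},
]
keys = candidate_keys(students, ["student_id", "first_name", "last_name"])
print(keys)
```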
Once your candidate keys have been identified you can now select one to be your
primary key.
Primary Key
A primary key is a candidate key that is most appropriate to be the main reference key
for the table. As its name suggests, it is the primary key of reference for the table and
is used throughout the database to help establish relationships with other tables. As
with any candidate key the primary key must contain unique values, must never be
null and must uniquely identify each record in the table.
As an example, a student id might be a primary key in a student table, or a department
code in a table of all departments in an organisation. This module has the code D56D
67 that is no doubt used in a database somewhere to identify RDBMS as a unit in a
table of modules. In the table below we have selected the candidate key student_id to
be our most appropriate primary key.
Primary keys are mandatory for every table; each record must have a value for its
primary key. When choosing a primary key from the pool of candidate keys always
choose a single simple key over a composite key.
Foreign Key
A foreign key is generally a primary key from one table that appears as a field in
another where the first table has a relationship to the second. In other words, if we had
a table A with a primary key X that linked to a table B where X was a field in B, then
X would be a foreign key in B.
An example might be a student table that contains the course_id the student is
attending. Another table lists the courses on offer with course_id being the primary
key. The 2 tables are linked through course_id and as such course_id would be a
foreign key in the student table.
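The student and course example can be sketched as follows. This is an illustration under assumptions, not the original author's schema: it uses an in-memory SQLite database through Python's sqlite3 module, and the column names other than course_id, as well as the sample rows, are hypothetical. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON, which other database systems do not require.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked

conn.execute("CREATE TABLE course (course_id TEXT PRIMARY KEY, course_name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        course_id  TEXT NOT NULL REFERENCES course (course_id)  -- foreign key
    )
""")

conn.execute("INSERT INTO course VALUES ('C1', 'Computing')")
conn.execute("INSERT INTO student VALUES (1, 'John Smith', 'C1')")  # course exists

try:
    conn.execute("INSERT INTO student VALUES (2, 'Mary Jones', 'C9')")  # no such course
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

student_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```

The second insert is refused because 'C9' does not appear as a primary key in the course table, which is how a foreign key helps enforce integrity between tables.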
Secondary Key or Alternative Key
A table may have one or more choices for the primary key. Collectively these are
known as candidate keys, as discussed earlier. One is selected as the primary key. Those
not selected are known as secondary keys or alternative keys.
For example, in the table showing candidate keys above we identified two candidate
keys, studentId and firstName + lastName. The studentId would be the most
appropriate for a primary key, leaving the other candidate key as a secondary or
alternative key. It should be noted that for the other key to be a candidate key, we are
assuming you will never have two people with the same first and last name combination.
As this assumption is unlikely to hold, we might consider firstName + lastName to be a
suspect candidate key as it would be restrictive of the data you might enter. It would
seem a shame to not allow John Smith onto a course just because there was already
another John Smith.
Simple Key
Any of the keys described before (i.e. primary, secondary or foreign) may comprise one
or more fields. For example, if firstName and lastName was our key this would be a
key of two fields, whereas studentId is only one. A simple key consists of a single
field to uniquely identify a record. In addition the field in itself cannot be broken
down into other fields. For example, studentId, which uniquely identifies a particular
student, is a single field and therefore is a simple key. No two students would have the
same student number.
Compound Key
A compound key consists of more than one field to uniquely identify a record. A
compound key is distinguished from a composite key because each field which
makes up the primary key is also a simple key in its own right. An example might be
a table that represents the modules a student is attending. This table has a studentId
and a moduleCode as its primary key. Each of the fields that make up the primary key
is a simple key because each represents a unique reference when identifying a student
in one instance and a module in the other.
Composite Key
A composite key consists of more than one field to uniquely identify a record. This
differs from a compound key in that one or more of the attributes which make up the
key are not simple keys in their own right. Taking the example from compound key,
imagine we identified a student by their firstName + lastName. In our table
representing students on modules our primary key would now be firstName +
lastName + moduleCode. Because firstName + lastName together represent a unique reference
to a student, they are not each simple keys; they have to be combined in order to
uniquely identify the student. Therefore the key for this table is a composite key.
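A key made of more than one field can be declared as in the sketch below, which assumes SQLite via Python's sqlite3 module; the table name student_module and the sample values are hypothetical. SQL itself uses the same PRIMARY KEY (a, b) declaration whether the key is compound or composite in the sense described above; the distinction lies in what each field means, not in the syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student_module (
        student_id  INTEGER NOT NULL,
        module_code TEXT    NOT NULL,
        PRIMARY KEY (student_id, module_code)   -- key made of two fields
    )
""")
conn.execute("INSERT INTO student_module VALUES (1, 'M101')")
conn.execute("INSERT INTO student_module VALUES (1, 'M102')")  # same student, other module
conn.execute("INSERT INTO student_module VALUES (2, 'M101')")  # same module, other student

try:
    conn.execute("INSERT INTO student_module VALUES (1, 'M101')")  # pair already present
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM student_module").fetchone()[0]
```

Neither field is unique on its own here; only the combination is, so repeating a student or a module is allowed but repeating the pair is not.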
Normalisation Overview
The Normalisation Process
Normalisation was developed by Dr. E.F. Codd in the early 1970s as part of the Relational
Database Theory as a means of breaking data into its related groups and defining the
relationships between those groups. It is said the name Normalisation was initially a
political gag taken from President Nixon and his initiative for 'Normalising' relations
with China. Codd figured if you can Normalise relations with a country you should be
able to normalise the relations with data as well.
Normalisation is a specific relational database analysis and design technique used to
model groups of related data within an organisation. Its purpose is to ensure data
stored within the database adheres to best practices by following a set of rules with the
purpose of eliminating redundancies and optimising the process of information
retrieval. Normalisation leaves us with a structure that groups like data into relational
models referenced by keys and linked to other relational models to form a relational
database schema.
Normalisation is represented by a logical set of steps that follow simple rules that are
applied to each stage of the modelling process. At the highest level the stages are
separated into something called Normal Forms, identified by a particular named
process.
Initially there were only three normal forms, First Normal Form (
2. First Normal Form (1NF) Repeating
2. Using the form or document from step one, select a sample of data to create rows
under the column headings. Try to create at least 3 rows of data taken directly from
the form, then create at least 3 more model data rows to provide a good range of
data. These rows of data represent normalisation tuples and are a very important part
of the process, as without good model data it is harder to achieve good model design.
3. We now need to select a suitable key from our domains that will allow us to have a
unique reference. Identify the candidate keys and from these select a suitable Primary
Key. Underline the selected domain(s); this will be our starting key.
4. Our table should be looking complete but the last thing we must do is remove any
repeating data as this will help us with our first normal form. Repeating data is data
that, because of its direct relationship with the Primary key, repeats itself in each of
the tuples where the key is the same. You must be careful not to misread domains
where the data appears to repeat but this is due to the restrictions of the model data
selected and not because of its relation with the key.
First Normal Form (1NF)
With our un-normalised relation now complete we are ready to start the normalisation
process. First Normal Form is probably the most important step in the normalisation
process as it facilitates the breaking up of our data into its related data groups, with the
following normalised forms fine tuning the relationships between and within the
grouped data.
With First Normal Form we are looking to remove repeating groups. A repeating
group is a domain or set of domains, directly relating to the key, that repeat data
across tuples in order to cater for other domains where the data is different for each
tuple.
In this example with Student ID as the primary key we see the three domains,
StudentName, Year and Semester, repeat themselves across the tuples for each of the
different UnitCode and UnitName entries. Though workable it means our relation
could potentially be huge, with loads of repeating data taking up valuable space and
costing valuable time to search through.
The rules of First Normal Form break this relation into two and relate them to each
other so the information needed can be found without storing unneeded data. So from
our example we would have one table with the student information and another with
the Unit Information, with the two relations linked by a domain common to both, in
this case, the StudentId.
So the steps from UNF to 1NF are:
3. The original primary key will not now be unique so assign a new primary key to the
relation using the original primary key as part of a compound or composite key.
4. Underline the domains that make up the key to distinguish
them from the other domains.
Taking our original example, once we have followed these simple steps we have
relations that look like this:
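The original illustration is not available, so the sketch below shows one way the split might look in code. It is an assumption-laden example: the domain names follow the text (studentId, studentName, year, semester, unitCode, unitName) but the data values and the function name are hypothetical.

```python
# Un-normalised rows: student details repeat for every unit the student takes.
unf_rows = [
    # (studentId, studentName, year, semester, unitCode, unitName)
    (1, "John Smith", 1, 1, "U1", "Databases"),
    (1, "John Smith", 1, 1, "U2", "Programming"),
    (2, "Mary Jones", 2, 1, "U1", "Databases"),
]


def to_first_normal_form(rows):
    """Split rows into a student relation and a student-unit relation."""
    students = {}       # studentId -> repeating group, now stored once
    student_units = []  # key is now (studentId, unitCode)
    for student_id, name, year, semester, unit_code, unit_name in rows:
        students[student_id] = (name, year, semester)
        student_units.append((student_id, unit_code, unit_name))
    return students, student_units


students, student_units = to_first_normal_form(unf_rows)
print(students)
print(student_units)
```

The student details are now held once per student, and the second relation carries studentId as part of its key so the two can still be linked.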
Second Normal Form (2NF)
Now our data is grouped into sets of related data we still need to check we are not
keeping more data than we need to in our relation. We know we don't have any
repeating groups as we removed these with First Normal Form. But if we look at our
example we can see for every UnitCode we are also storing the UnitName.
Would it not seem more sensible to have a different relation we could use to look up
@BC7>6 and find the unit name 'Advanced Database'? This way we wouldn't have
to store lots of additional duplicate information in our Student/Unit relation.
This is exactly what we aim to achieve with Second Normal Form and its purpose is
to remove partial dependencies.
We can consider a relation to be in Second Normal Form when: the relation is in First
Normal Form and all partial key dependencies are removed so that all non-key
domains are functionally dependent on all of the domains that make up the primary
key.
Before we start with the steps, if we have a table with only a single simple key this
can't have any partial dependencies as there is only one domain that is a key; therefore
these relations can be moved directly to 2nd normal form.
For the rest the steps from 1NF to 2NF are:
1. Take each non-key domain in turn and check if it is only dependent on part of the key.
2. If yes:
a. Remove the non-key domain along with a copy of the part of the key it is dependent
upon to a new relation.
b. Underline the copied key as the primary key of the new relation.
3. Move down the relation to each of the domains repeating steps 1 and 2 till you have
covered the whole relation.
4. Once completed with all partial dependencies removed, the table is in 2nd normal form.
In our example above, unitName is only dependent on unitCode and has no
dependency on studentId. Applying the steps above we move the unitName to a new
relation with a copy of the part of the key it is dependent upon. Our table in second
normal form would subsequently look like this:
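The check in step 1 and the move in step 2 can be sketched as below. This is a minimal illustration, assuming rows held as Python dictionaries; the helper name depends_on and the sample values are hypothetical. The check only reports what holds in the sample rows, so it supports, rather than proves, a dependency.

```python
def depends_on(rows, determinant, attribute):
    """True if, in `rows`, equal `determinant` values always give the same `attribute` value."""
    seen = {}
    for row in rows:
        key = tuple(row[d] for d in determinant)
        if seen.setdefault(key, row[attribute]) != row[attribute]:
            return False
    return True


# First normal form relation with the key (studentId, unitCode).
student_unit = [
    {"studentId": 1, "unitCode": "U1", "unitName": "Databases"},
    {"studentId": 1, "unitCode": "U2", "unitName": "Programming"},
    {"studentId": 2, "unitCode": "U1", "unitName": "Databases"},
]

# Step 1: unitName is fixed by unitCode alone, which is only part of the key.
partial = depends_on(student_unit, ["unitCode"], "unitName")

# Step 2: move unitName out with a copy of unitCode as the new primary key.
unit = {row["unitCode"]: row["unitName"] for row in student_unit}
student_unit_2nf = [(row["studentId"], row["unitCode"]) for row in student_unit]
```

After the move, each unit name is stored once in the unit relation and the student-unit relation holds only its key.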
Third Normal Form (3NF)
Third Normal Form deals with something called 'transitive' dependencies. This means
if we have a primary key A and non-key domains B and C where C is more
dependent on B than A and B is directly dependent on A, then C can be considered
transitively dependent on A.
Another way to look at it is a bit like a stepping stone across a river. If we consider the
primary key A to be the far bank of the river and our non-key domain C to be our
current location, in order to get to A, our primary key, we need to step on a stepping
stone B, another non-key domain, to help us get there. Of course we could jump
directly from C to A, but it is easier, and we are less likely to fall in, if we use our
stepping stone B. Therefore current location C is transitively dependent on A through
our stepping stone B.
Before we start with the steps, if we have any relations with zero or only one non-key
domain we can't have a transitive dependency so these move straight to 3rd Normal
Form.
For the rest the steps from 2NF to 3NF are:
1. Take each non-key domain in turn and check if it is more dependent on another
non-key domain than the primary key.
2. If yes:
a. Move the dependent domain, together with a copy of the non-key attribute upon
which it is dependent, to a new relation.
b. Make the non-key domain, upon which it is dependent, the key in the new relation.
c. Underline the key in this new relation as the primary key.
d. Leave the non-key domain, upon which it was dependent, in the original relation
and mark it as a foreign key.
3. Move down the relation to each of the domains repeating steps 1 and 2 till you have
covered the whole relation.
4. Once completed with all transitive dependencies removed, the table is in 3rd normal form.
In our example above, we have unitCode as our primary key; we also have a
courseName that is dependent on courseCode, and courseCode is dependent on
unitCode. Though courseName could be dependent on unitCode it is more dependent on
courseCode, therefore it is transitively dependent on unitCode.
So following the steps, remove courseName with a copy of courseCode to another
relation and make courseCode the primary key of the new relation. In the original
table mark courseCode as our foreign key.
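The move described above can be sketched as follows. The domain names unitCode, courseCode and courseName come from the text; unitName, the sample values and the function name are assumptions made for the illustration.

```python
# Second normal form relation keyed by unitCode, with a transitive dependency:
# unitCode -> courseCode -> courseName.
units = [
    # (unitCode, unitName, courseCode, courseName)
    ("U1", "Databases", "C1", "Computing"),
    ("U2", "Programming", "C1", "Computing"),
    ("U3", "Accounting", "C2", "Business"),
]


def to_third_normal_form(rows):
    """Move courseName into its own relation keyed by courseCode."""
    course = {}  # new relation: courseCode (primary key) -> courseName
    unit = []    # original relation keeps courseCode as a foreign key
    for unit_code, unit_name, course_code, course_name in rows:
        course[course_code] = course_name
        unit.append((unit_code, unit_name, course_code))
    return unit, course


unit, course = to_third_normal_form(units)
print(unit)
print(course)
```

The course name is now stored once per course, and each unit still reaches it through the courseCode it keeps as a foreign key.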
Multiple and Nested Repeating Groups
As we mentioned earlier, First Normal Form is probably the most important step in
the normalisation process as mistakes here will have a ripple effect on the rest of the
normalisation process. Where some people get caught out is in applying