DESIGNING OF DATABASEENTITY-RELATIONSHIP MODEL
1
HOW TO DESIGN THE DATABASE?
2
HOW TO DESIGN THE DATABASE?There are two approaches E-R Modeling (top down approach):
Identifying entity and relations Normalization(bottom up approach):
Refinement of database designing
3
ENTITY-RELATION MODEL
4
E-R MODEL The Entity-Relationship (ER) model was
originally proposed by Peter in 1976 The ER model is a conceptual data model
that views the real world as entities and relationships.
A basic component of the model is the Entity-Relationship diagram, which is used to visually represent data objects.
5
6
BASIC CONSTRUCTS OF E-R MODELING A database can be modeled as:
a collection of entities,relationship among entities.
An entity is an object that exists and is distinguishable from other objects.Example: specific person, company,
event, plant Entities have attributes
Example: people have names and addresses
An entity set is a set of entities of the same type that share the same properties.Example: set of all persons, companies,
trees, holidays 7
BASIC CONSTRUCTS OF E-R MODELING Entities Entities are the principal data object
about which information is to be collected. Entities are usually recognizable concepts, either concrete or abstract, such as person, places, things, or events, which have relevance to the database. Some specific examples of entities are EMPLOYEES, PROJECTS, and INVOICES. An entity is analogous to a table in the relational model. 8
Relationships
A Relationship represents an association between two or more entities. Relationships are classified in terms of degree, connectivity, cardinality, and existence. An example of a relationship would be:
Employees are assigned to projects
Projects have subtasks
Departments manage one or more projects
9
ENTITY SETS CUSTOMER AND LOAN
10
RELATIONSHIP SET BORROWER
11
ATTRIBUTES Attributes describe the properties of the
entity of which they are associated. We can classify attributes as following:
Simple Composite Single-valued Multi-valued Derived
12
13
14
15
EXAMPLE
16
DEGREE OF A RELATIONSHIP
The degree of a relationship is the number of entities associated with the relationship. The n-ary relationship is the general form for degree n. Special cases are the binary, and ternary, where the degree is 2, and 3, respectively.
17
CONNECTIVITY AND CARDINALITYThe connectivity of a relationship describes the mapping of associated entity instances in the relationship. The values of connectivity are "one" or "many". The cardinality of a relationship is the actual number of related occurrences for each of the two entities. The basic types of connectivity for relations are:
One to One (1:1) One to Many (1:M) Many to One (M:1) Many to Many (M:M)
18
19
20
21
22
DIRECTION The direction of a relationship indicates the
originating entity of a relationship. The entity from which a relationship originates is
the parent entity; the entity where the relationship terminates is the child entity.
The type of the relation is determined by the direction of line connecting relationship component and the entity.
To distinguish different types of relation, we draw either a directed line or an undirected line between the relationship set and the entity set.
Directed line is used to indicate one occurrence and undirected line is used to indicate many occurrences in a relation.
23
24
25
26
E-R NOTATION Entities are represented by labeled rectangles. The
label is the name of the entity. Entity names should be singular nouns.
Attributes are represented by Ellipses. A solid line connecting two entities represents
relationships. The name of the relationship is written above the line. Relationship names should be verbs and diamonds sign is used to represent relationship sets.
Attributes, when included, are listed inside the entity rectangle. Attributes, which are identifiers, are underlined. Attribute names should be singular nouns.
Multi-valued attributes are represented by double ellipses.
Directed line is used to indicate one occurrence and undirected line is used to indicate many occurrences in a relation. 27
28
E-R NOTATION
29
EXAMPLE
30
31
CUSTOMER-LOAN RELATIONSHIP
32
EXERCISE Consider the following
database: S (S#, SSNAME, STATUS, CITY) P (P#, PNAME, COLOR,
WEIGHT, CITY) J ( J#, JNAME, CITY) SPJ( S#, P#, J#, QTY)
Here, S indicates information of suppliers, P Parts, J Projects and SPJ indicates the supplied quantity details.
33
34
TOTAL PARTICIPATION
Total participation (indicated by double line): every entity in the entity set participates in at least one relationship in the relationship set E.g. participation of loan in borrower is total
every loan must have a customer associated to it via borrower
Partial participation: some entities may not participate in any relationship in the relationship set Example: participation of customer in borrower is partial
35
36
37
SOME MORE EXAMPLES
38
REPRESENTATION OF CARDINALITY IN THE E-R DIAGRAM The cardinality of a relationship is the actual number
of related occurrences for each of the two entities E-R diagrams also provide a way to indicate
cardinality of the relationship. An edge between an entity set and a relationship set
can have an associated minimum and maximum cardinality, shown in the form l..h, where l is the minimum and h the maximum cardinality.
A minimum value of 1 indicates total participation of the entity set in the relationship set.
A maximum value of 1 indicates that the entity participates in at most one relationship, while a maximum value * indicates no limit.
The label 1..* on an edge is equivalent to a double line. 39
It is easy to misinterpret the 0..* on the edge between customer and Cust_Loan, and think that the relationship Cust_Loan is many to one from customer to loan, this is exactly the reverse of the correct interpretation.
Customer Loan
C_name
Address
Phone_no
City amountLoan_No
Cust_ Loan0..* 1..1
40
For example,the edge between loan and Cust_Loan has a cardinality constraint of 1..1, meaning the minimum and the maximum cardinality are both 1.
That is, each loan must have exactly one associated customer. The limit 0..* on the edge from customer to Cust_Loan indicates that a customer can have zero or more loans.
Thus, the relationship Cust_Loan is one to many from customer to loan, and further the participation of loan in Cust_Loan is total. 41
Customer Loan
C_name
Address
Phone_no
City amountLoan_No
Cust_ Loan
0..* 1..1
42
If both edges from a binary relationship have a maximum value of 1, the relationship is one to one. If we had specified a cardinality limit of 1..* on the edge between customer and Cust_Loan, we would be saying that each customer must have at least one loan.
43
SOME MORE EXAMPLES
44
SOME MORE EXAMPLES
45
SOME MORE EXAMPLES
46
Consider the following database:
S (S#, SSNAME, STATUS, CITY)
P (P#, PNAME, COLOR, WEIGHT, CITY)
J ( J#, JNAME, CITY)
SPJ( S#, P#, J#, QTY)
Here S indicates information of suppliers, P Parts, J Projects and SPJ indicates the supplied quantity details.
47
48
Car-insurance company
•It has a set of customers, each of who owns one or more cars.
•Car may have any number of customers.
•Each car has associated with it zero to any number of recorded accidents. System should store date and location of accident.
•Car insurance company will pay damage amount for the accidental cars to concerned driver.
49
50
CASE STUDY OF UNIVERSITY MANAGEMENT SYSTEM Consider, a university contains many
departments. Each department can offer any number of courses. Many teachers can work in a department. A teacher can work only in one department. For each department there is a Head. A teacher can be head of only one department. Each teacher can take any number of courses. A course can be taken by only one instructor. A student can enroll for any number of courses. Each course can have any number of students. 51
STEPS TO DESIGN E-R DIAGRAM First Step to Identify the Entities Second Step to find relationships among
these entities Step 3 to identify the key attributes Step 4 to identify other relevant
attributes Step 5 to draw the complete e-r
diagram
52
FIRST STEP TO IDENTIFY THE ENTITIES In order to identify the entities collect all the
noun in the requirement sheet which has some properties and are important for the system.
We can identify the following nouns: University, Department, Course, Teacher,
Student. Here, the database is of only one university.
If an entity has a single instance then that entity is ignored. Thus, the final entities are:
DEPARTMENT COURSE TEACHER STUDENT 53
SECOND STEP TO FIND RELATIONSHIPS AMONG THESE ENTITIES Each department can offer any number of courses
and we can assume that each course belongs to only one department. Then the connectivity of the relation ship among DEPARTMENT and COURSE is One to Many. If a course can run in more than one department then it is Many to Many.
Many teachers can work in a department and a teacher can work only in one department. Thus the connectivity among DEPARTMENT and TEACHER is one to many.
For each department there is a Head and a teacher can be head of only one department. Hence, the connectivity is one to one.
54
Each teacher can take any number of courses and a course can be taken by only one instructor. Thus, the connectivity between TEACHER and COURSE is one to many.
A student can enroll for any number of courses and each course can have any number of students. Thus, the connectivity between STUDENT and COURSE is many to many.
55
STEP 3 TO IDENTIFY THE KEY ATTRIBUTES Following are the primary key attributes
for each entity set: Dno (Department number) is the key
attribute for the Entity DEPARTMENT. C_code (Course number) is the key
attribute for COURSE Entity. Roll_no (Roll number) is the key attribute
for STUDENT Entity. T_code (Teacher code) is the key attribute
for TEACHER Entity.56
STEP 4 TO IDENTIFY OTHER RELEVANT ATTRIBUTES Following are the other relevant attributes for
each entity set: DEPARTMENT entity will have other relevant
attributes as dname, loc. For COURSE entity, c_name, credits. For TEACHER entity, name, mob_no For STUDENT entity, name, address
57
STEP 5 TO DRAW THE COMPLETE E-R DIAGRAM
DEPARTMENT
TEACHER STUDENT
COURSEoffers
has enrollteach
heads
dno dname loc c_code credits
roll_no addressnamet_code mob_noname
c_name
58
Strong and Weak Entity Sets
The entity set which does not has sufficient attributes to form a primary key is called as weak entity set.
An entity set that has a primary key is called as Strong entity set. Consider an entity set Payment which has three attributes: payment_number, payment_date and payment_amount.
Although each payment entity is distinct but payment for different loans may share the same payment number. Thus, this entity set does not have a primary key and it is a weak entity set.
Each weak set must be a part of one-to-many relationship set.
59
A member of a strong entity set is called dominant entity and member of weak entity set is called as subordinate entity.
A weak entity set does not have a primary key but we need a means of distinguishing among all those entries in the entity set that depend on one particular strong entity set.
The discriminator of a weak entity set is a set of attributes that allows this distinction to be made.
For example, payment_number is acts as discriminator for payment entity set. It is also called as the Partial key of the entity set.
60
The primary key of a weak entity set is formed by the primary key of the strong entity set on which the weak entity set is existence dependent, plus the weak entity set’s discriminator. In the above example {loan_number, payment_number} acts as primary key for payment entity set.
61
The relationship between weak entity and strong entity set is called as Identifying Relationship. In example, loan-payment is the identifying relationship for payment entity. A weak entity set is represented by doubly outlined box and corresponding identifying relation by a doubly outlined diamond
62
63
64
CASE STUDYRepresent each of the following requirements with an ER diagram: A regional council requires the design of a database system that
can provide information on all schools in the region. The requirements collection and analysis phase of the database design process has provided the following data requirements for the schools database system.(a) Every school has many pupils and many teachers. Each pupil is assigned to one school and each teacher works for one school only.(b) Each teacher teaches more than one subject but a subject may be taught by more than one teacher. The database should store the number of hours a teacher spent teaching a subject. Data held on each teacher includes his/her national Insurance Number (NIN), name (first and last), sex, and qualifications. The data held on each subject includes subject title and type.(c) Each pupil can study more than one subject and a subject may be studied by more than one pupil. Data held on each pupil includes the pupil's code, name (first and last), sex, and date of birth.(d) Each school is managed by one of its teachers. The database should keep track of the date he/she started managing the school. Data stored on each school includes the school's code, name, address (town, street, and post code) and phone. 65
DESIGN ISSUES USE OF ENTITY SETS VERSUS
ATTRIBUTES Consider the entity set employee with attributes
- employee name and -telephone number. It can easily be argued that a telephone is an
entity in its own right with attributes -telephone number and location ( the office where the
). telephone is located If we take this point of, view we must redefine the employee entity set
:as • The employee entity set with attribute
-employee name • The telephone entity set with attributes
- telephone number and location • The relationship set -emp telephone, which
denotes the association between employees and the telephones that they have 66
, , WHAT THEN IS THE MAIN DIFFERENCE BETWEEN THESE TWO DEFINITIONS OF
?AN EMPLOYEE Treating a telephone as an attribute
- telephone number implies that employeeshave .precisely one telephone number each
Treating a telephone as an entity telephonepermits employees to have several
( ) telephone numbers including zero associated with them
, However we could instead easily define- telephone number as a multivalued
attribute to allow multiple telephones per.employee 67
The main difference then is that treating a telephone as an entity better models a
situation where one may want to keep , extra information about a telephone
, ( , such asits location or its type mobile , ), video phone or plain old telephone or
who all share . the telephone , Thus treating telephone as an entity is
more general than treating it as an attribute and is appropriate when the
.generality may be useful 68
69
, In contrast it would not be appropriate to treat the attribute - employee name as an ; entity
it is difficult to argue that - employee name is ( an entity in its own right in contrast to the
).telephone , Thus it is appropriate to have -employee
name as an attribute of the employee entity.set
70
: Two natural questions thus arise What , constitutes an attribute and what
constitutes ? an entity set , . Unfortunately there are no simple answers The distinctions mainly depend on the
- structure of the real world enterprise being, modeled and on the semantics associated
with the attribute in .question
71
USE OF ENTITY SETS VERSUS RELATIONSHIP SETS
It is not always clear whether an object is best expressed by an entity set or a
relationship . set W e assumed that a bank loan is modeled as
.an entity An alternative is to model a loan not as an
, entity but rather as a relationship betwee , customers and branches with - loan number
and amount . as descriptive attributes Each loan is represented by a relationship
.between a customer and a branch 72
If every loan is held by exactly one customer and is associated with exactly one , branch we may find satisfactory the design where a loan
.is represented as a relationship
73
Generalization: A bottom-up design process
A generalization hierarchy is a form of abstraction that specifies that two or more entities that share common attributes can be generalized into a higher-level entity type called a supertype or generic entity.
The lower level of entities becomes the subtype, or categories, to the super type. Subtypes are dependent entities.
Generalization is used to emphasize the similarities among lower-level entity sets and to hide differences.
It makes ER diagram simpler because shared attributes are not repeated. Generalization is denoted through a triangle component labeled ‘IS A”, 74
75
Specialization: Top-down design process
Specialization is the process of taking subsets of a higher-level entity set to form lower level entity sets. It is a process of defining a set of subclasses of an entity type, which is called as superclas of the specialization. The process of defining subclass is based on the basis of some distinguish characteristics of the entities in the super class.
For example, specialization of the Employee entity type may yield the set of subclasses namely Salaried_Employee and Hourly_Employee on the method of pay
76
77
78
Difference between Specialization and Generalization
Specialization is the process of taking subsets of a higher-level entity set to form lower level entity sets. Specialization emphasizes differences among entities within the set by creating distinct lower-level entity sets. These lower-level entity sets may have attributes, or may participate in relationships, that do not apply to all entities in the higher-level entity set.
Generalization proceeds from the recognition that a number of entities set share some common features, which are described by the same, attributes and participate in the same relationship sets. Generalization is used to emphasize the similarities among lower-level entity sets and to hide differences.
79
DESIGN CONSTRAINTS ON A SPECIALIZATION/GENERALIZATION Constraint on which entities can be members of a
given lower-level entity set. condition-defined OR Attribute Defined
Example: all customers over 65 years are members of senior-citizen entity set; senior-citizen ISA person.
All account entities are evaluated on the defining account-type attribute. Only those entities that satisfy the condition account-type = “savings account” are allowed to belong to the lower-level entity set person. All entities that satisfy the condition account-type = “checking account” are included in checking account. Since all the lower-level entities are evaluated on the basis of the same attribute (in this case, on account-type), this type of generalization is said to be attribute-defined.
80
user-defined
User-defined lower-level entity sets are not constrained by a membership condition; rather, the database user assigns entities to a given entity set.
For instance, let us assume that, after 3 months of employment, bank employees are assigned to one of four work teams.
We therefore represent the teams as four lower-level entity sets of the higher-level employee entity set.
A given employee is not assigned to a specific team entity automatically on the basis of an explicit defining condition.
Instead, the user in charge of this decision makes the team assignment on an individual basis. The assignment is implemented by an operation that adds an entity to an entity set.
81
Constraint on whether or not entities may belong to more than one lower-level entity set within a single generalization.
Disjoint
A disjointness constraint requires that an entity belong to no more than one lower-level entity set. In our example, an account entity can satisfy only one condition for the account-type attribute; an entity can be either a savings account or a checking account, but cannot be both.
82
Overlapping
an entity can belong to more than one lower-level entity set
83
Overlapping. In overlapping generalizations, the same entity may belong to -more than one lower
. level entity set within a single generalization
For an , illustration consider the employee work team, example and assume that certain managers
participate in more .than one work team
A given employee may therefore appear in more -than one of the team entity sets that are lower level
entity sets of employee. ,Thus the generalization is.overlapping
84
,As another example suppose generalization applied to entity sets customer and employee
- leads to a higher level entity set person. The generalization is overlapping if an employee
.can also be a customer
85
DESIGN CONSTRAINTS ON A SPECIALIZATION/GENERALIZATION (CONT.) Completeness constraint -- specifies
whether or not an entity in the higher-level entity set must belong to at least one of the lower-level entity sets within a generalization. total : an entity must belong to one of the lower-
level entity sets partial: an entity need not belong to one of the
lower-level entity sets
86
.Partial generalization is the default
- We can specify total generalization in an E R diagram by using a double line to connect the box representing - the higher level entity set .to the triangle symbol
( This notation is similar to the notation for totalparticipation in a .)relationship
The account : generalization is total All account entities must be either a savings account or a checking
account
87
- Because the higher level entity set arrived atthrough generalization is generally composed ofonly - those entities in the lower level entity , sets
the completeness constraint for a generalized- higher level entity set is usually . total
, - When the generalization is partial a higher level entity is not constrained to -appear in a lower
.level entity set
The work team entity sets illustrate a partial.specialization 88
Since employees are assigned to a team only 3 after months on the ,job some employee
entities may not be members of any of the- .lower level team entity sets We may
characterize the team entity sets more fully , as a partial overlapping specialization of
employee.
89
The generalization of - checking account and -savings account into account , is a total disjoint
. generalization
,The completeness and disjointness constraints , however do . not depend on each other
Constraint patterns may also be - partial disjoint and- .total overlapping
We can see that certain insertion and deletion requirements follow from the constraints that apply
to a given . generalization or specialization90
, For instance when a total completeness , constraint is in place an entity inserted into a
- higher level entity set must also be inserted - into at least one of the lower level entity . sets
With a - , -condition defined constraint all higher level entities that satisfy the condition must
be - . inserted into that lower level entity set
, Finally an entity that is deleted from a -higher level entity set also is deleted from all the
- associated lower level entity sets to which it.belongs 91
Aggregation
One limitation of the E-R model is that it cannot express relationships among relationships.
The best way to model a situation like this is by the use of aggregation. Thus, the relationship set work_on relating the entity sets Employee, Branch and Job is a higher-level entity set. Such an entity set is treated in the same manner, as is any other entity set. We can then create a binary relationship Manages between work_on and Manager to represent who manages what tasks.
92
93
94
Suppose a customer loan pair may have a bank emp Who is a loan officer for that particular pair.
CUST LOANBORROWER
LOAN-OFFICER
EMP 95
The best way to model the situation is to use aggregation. Aggregation is asn abstraction through which relationship are treated
as higher level entities
CUST LOANBORROWER
LOAN-OFFICER
EMP 96
E-R DIAGRAM OF BAKING SYSTEM
97
CASE STUDY: DATABASE DESIGN FOR BANKING
ENTERPRISE Here are the major characteristics of the
.banking enterprise • . The bank is organized into branches Each
branch is located in a particular city and is . identified by a unique name The bank
monitors the assets of each .branch
98
99
100