Chapter No 3 Relational Model


3.1 Codd’s rules

A relational DBMS must use its relational facilities exclusively to manage and interact with the database. The 12 rules of E. F. Codd: These rules were defined by Dr. E. F. Codd, the IBM researcher who first developed the relational model in 1970. In 1985, Dr. Codd published a list of 12 rules that define an ideal relational database and provide a guideline for the design of all relational database systems.

• They specify what a relational database must support in order to be relational.

• Rule 1-Information rule:

• All information in the database should be represented in one and only one way - as values in a table.

• Rule 2-Guaranteed access rule:

• All the data should be accessible without ambiguity.

• Every value can be accessed by providing table name, column name and key.

• Rule 3-Systematic Treatment of Null values:

A field should be allowed to remain empty.

• This involves the support of a null value which is distinct from an empty string or a number with a value of zero.

• Of course this cannot apply to primary keys.

• In addition, most database implementations support the concept of a not-null field constraint that prevents null values in a specific table column.
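The distinction drawn in Rule 3 can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the employee table, its phone column and the sample rows are hypothetical and are not part of the original slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: emp_name may never be null, phone may be left unknown.
conn.execute(
    "CREATE TABLE employee ("
    " emp_no INTEGER PRIMARY KEY,"
    " emp_name TEXT NOT NULL,"
    " phone TEXT)"
)
conn.execute("INSERT INTO employee VALUES (179, 'Silva', NULL)")  # phone unknown
conn.execute("INSERT INTO employee VALUES (857, 'Perera', '')")   # phone is an empty string

# A null is distinct from an empty string: each query matches a different row.
null_rows = conn.execute("SELECT emp_no FROM employee WHERE phone IS NULL").fetchall()
empty_rows = conn.execute("SELECT emp_no FROM employee WHERE phone = ''").fetchall()

# The not-null constraint rejects a row whose name is missing.
try:
    conn.execute("INSERT INTO employee VALUES (342, NULL, NULL)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

Note that a null must be tested with IS NULL; comparing with = never matches a null.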

• Rule 4-Dynamic On-Line Catalog based on the relational model:

• The database description is represented at the logical level in the same way as ordinary data, so authorized users can apply the same relational language to its interrogation as they apply to regular data.

• A Catalog (data dictionary) can be queried by authorized users as part of the database.
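As a small sketch of Rule 4, SQLite keeps its catalog in a table named sqlite_master that can be queried with ordinary SELECT statements. The two tables created here are assumptions made for the illustration; other systems expose their catalogs under different names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema, created only so that the catalog has something to describe.
conn.execute("CREATE TABLE employee (emp_no INTEGER PRIMARY KEY, emp_name TEXT, dept_no INTEGER)")
conn.execute("CREATE TABLE department (dept_no INTEGER PRIMARY KEY, dept_name TEXT)")

# The catalog is interrogated with the same relational language as regular data.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
```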

• Rule 5-Comprehensive Data Sublanguage Rule:

• There must be at least one language that can be used both interactively and embedded within programs.

• Its statements use a well-defined syntax, are expressible as character strings, and comprehensively support all of the following:

• Data definition

• View definition

• Data manipulation (interactive and by program)

• Integrity constraints

• Authorization

• Transaction boundaries (begin, commit, and rollback).

• Today this means: must support SQL. All relational databases use SQL (Structured Query Language) for this purpose.

• Rule 6-View Updating Rule:

• All views that are theoretically updateable are also updateable by the system.

• Views are virtual tables.

• They appear to behave as conventional tables except that they are built dynamically when the query is run.

• It is not always theoretically possible to update or delete rows through a view, so in practice this rule is not fully supported.

• Rule 7-High-level Insert, Update, and Delete Rule: This rule states that insert, update, and delete operations should be supported for any retrievable set rather than just for a single row in a single table.

• EX: UPDATE mytable SET mycol = value WHERE condition; Many rows may be updated with this single statement.
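A minimal sketch of such a set-level update, using Python's sqlite3 module. The salary table borrows the sample values used later in this chapter; the raise of 500 and the threshold of 9000 are assumptions chosen for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salary (emp_no INTEGER, eff_date TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO salary VALUES (?, ?, ?)",
    [(179, "1/1/98", 8000), (857, "3/7/94", 9000), (342, "28/1/97", 7500)],
)

# One statement updates every row that satisfies the condition.
cursor = conn.execute("UPDATE salary SET amount = amount + 500 WHERE amount < 9000")
updated = cursor.rowcount

amounts = conn.execute("SELECT emp_no, amount FROM salary ORDER BY emp_no").fetchall()
```

Here two of the three rows are changed by the single UPDATE; no row-by-row loop is written by the user.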

• Rule 8-Physical data independence:

• The user is isolated from the physical method of storing and retrieving information from the database.

• Changes can be made to the underlying architecture (hardware, disk storage methods) without affecting how the user accesses it.

• Rule 9-Logical data independence:

• How data is viewed should not change when the logical structure (the tables' structure) of the database changes.

• This rule is particularly difficult to satisfy.

• Most databases rely on strong ties between the data viewed and the actual structure of the underlying tables.

• Rule 10-Integrity independence:

• The database language (like SQL) should support constraints on user input that maintain database integrity.

• This rule is not fully implemented by most major vendors.

• At a minimum, all databases do preserve two constraints through SQL:

• No component of a primary key can have a null value.

• If a foreign key is defined in one table, any value in it must exist as a primary key in another table.

• Rule 11-Distribution independence:

• A user should be totally unaware of whether or not the database is distributed (whether parts of the database exist in multiple locations).

• A variety of reasons make this rule difficult to implement.

• Rule 12-Non subversion rule:

• There should be no way to modify the database structure other than through the database language (like SQL).

• Security and integrity of the database must not be violated.

• Rule Zero for RDBMS:

• Many new DBMSs claim to be relational while also supporting extended features.

• EX: PostgreSQL is an RDBMS with extended object-oriented features.

• Codd's rule zero specifies a criterion for an RDBMS:

• "For any system that is advertised as, or claimed to be, a relational database management system, that system must be able to manage databases entirely through its relational capabilities and support all the above mentioned Codd rules, no matter what additional capabilities the system may support."

• In 1990, Codd extended the 12 rules to 18 to include rules on catalog, data types (domains), authorization etc.

3.2.1 Relational Model Concepts

• The relational Model of Data is based on the concept of a Relation.

• A Relation is a mathematical concept based on the ideas of sets.

• The strength of the relational approach to data management comes from the formal foundation provided by the theory of relations.

• Relational Model Properties:

• Each relation (or table) in a database has a unique name.

• An entry at the intersection of each row and column is atomic (or single-valued); i.e. there can be no multi-valued attributes in a relation.

• Each row is unique; i.e. no two rows in a relation are identical.

• Each attribute (or column) within a table has a unique name.

• The relational data model has three major components:

1. Relational database objects: Allow data structures to be defined.

2. Relational operators: Allow manipulation of stored data.

3. Relational integrity constraints: Allow business rules to be defined and ensure data integrity.

The Relational Objects

• Relation:

A named, two dimensional table of data.

• Database:

– A collection of tables and related objects organised in a structured fashion.

– Several database vendors use schema interchangeably with database.

Tables are comprised of rows and a fixed number of named columns.

Data is presented to the user as tables:

[Figure: a table drawn as a grid of rows (Row) against named columns (Column 1 to Column 4)]

Columns are attributes describing an entity. Each column must have a unique name and a data type.

• Structure of a relation (e.g. Employee):

• Employee (Name, Designation, Department).

Rows are records that present information about a particular entity occurrence

Relational model terminology

• Row is called a ‘tuple’

• Column header is called an ‘attribute’

• Table is called a ‘relation’

• The data type describing the type of values that can appear in each column is called a ‘domain’

• Eg:-

– Names : The set of names of persons

– Employee_ages : Value between 15 & 80 years old

The above is called ‘logical definitions of domains’.

– A data type or format can also be specified for each domain.

Eg: The employee age is an integer between 15 and 80.
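One way to approximate the Employee_ages domain in SQL is a column data type combined with a CHECK constraint. The sketch below uses Python's sqlite3 module; the table layout and the two sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The domain "integer between 15 and 80" expressed as a type plus a CHECK constraint.
conn.execute(
    "CREATE TABLE employee ("
    " name TEXT NOT NULL,"
    " age INTEGER CHECK (age BETWEEN 15 AND 80))"
)

conn.execute("INSERT INTO employee VALUES ('Silva', 42)")  # inside the domain

try:
    conn.execute("INSERT INTO employee VALUES ('Perera', 12)")  # outside the domain
    accepted = True
except sqlite3.IntegrityError:
    accepted = False

stored = conn.execute("SELECT name, age FROM employee").fetchall()
```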

Characteristics of relations:

• Ordering of tuples: Tuples in a relation don’t have any particular order.

– However, in a file they may be physically ordered based on some criterion; this ordering is not part of the relational model.

• Ordering of values within a tuple:

The ordering of values within a tuple is unnecessary, hence a tuple can be considered as a ‘set’ of (attribute, value) pairs.

– But when a relation is implemented as a file, attributes may be physically ordered.

• Values in a tuple are atomic:

Values in the tuple or row should not be divisible, i.e. they should be atomic.

3.2.2 Relational constraints

• Domain constraints:

Specifies that the value of each attribute ‘A’ must be an atomic value, and from the specified domain.

• Key constraints:

– There is a subset of the attributes of a relation schema with the property that no two tuples have the same combination of values for those attributes.

– Any such subset of attributes is called a ‘superkey’.

– A ‘superkey’ can have redundant attributes.

– A key is a minimal superkey.

– If a relation has more than one key, they are called candidate keys.

– One of them is chosen as the primary key.
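The superkey property can be checked mechanically against a given set of tuples. The helper below, is_superkey, is a hypothetical function written for this illustration, and the rows are the Salary sample data used in this chapter. It only tests one instance of the relation, whereas a key is a constraint that must hold for every legal instance, so a passing check suggests a key but does not prove it.

```python
def is_superkey(rows, attrs):
    """Return True if no two rows share the same values for attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False  # two tuples agree on attrs, so attrs cannot be a superkey
        seen.add(value)
    return True


# Salary sample data (four tuples).
salary = [
    {"E-No": 179, "Eff-Date": "1/1/98", "Amt": 8000},
    {"E-No": 857, "Eff-Date": "3/7/94", "Amt": 9000},
    {"E-No": 179, "Eff-Date": "1/6/97", "Amt": 7000},
    {"E-No": 342, "Eff-Date": "28/1/97", "Amt": 7500},
]

eno_alone = is_superkey(salary, ("E-No",))                 # 179 appears twice
eno_and_date = is_superkey(salary, ("E-No", "Eff-Date"))   # combination is unique
all_attributes = is_superkey(salary, ("E-No", "Eff-Date", "Amt"))
```

The full attribute set is always a superkey; it merely contains redundant attributes.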

Keys: Primary Key:

An attribute (or combination of attributes) that uniquely identifies each row in a relation.

Employee (Emp_No, Emp_Name, Department).

Composite Key: A primary key that consists of more than one attribute.

Salary(Emp_No, Eff_Date, Amount)

Employee (primary key: E-No)

E-No   E-Name   D-No
179    Silva    7
857    Perera   4
342    Dias     7

Salary (composite key: E-No, Eff-Date)

E-No   Eff-Date   Amt
179    1/1/98     8000
857    3/7/94     9000
342    28/1/97    7500

Salary

E-No   Eff-Date   Amt
179    1/1/98     8000
857    3/7/94     9000
179    1/6/97     7000
342    28/1/97    7500

The cardinality of a table refers to the number of rows in the table. The degree of a table refers to the number of columns.

Salary table: Degree = 3, Cardinality = 4
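As a rough sketch, the composite key, degree and cardinality of the Salary table can be reproduced with Python's sqlite3 module. The lower-case column names are assumptions; the rows are the four Salary tuples shown in this section.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Composite primary key: neither emp_no nor eff_date alone identifies a row.
conn.execute(
    "CREATE TABLE salary ("
    " emp_no INTEGER, eff_date TEXT, amount INTEGER,"
    " PRIMARY KEY (emp_no, eff_date))"
)
conn.executemany(
    "INSERT INTO salary VALUES (?, ?, ?)",
    [
        (179, "1/1/98", 8000),
        (857, "3/7/94", 9000),
        (179, "1/6/97", 7000),
        (342, "28/1/97", 7500),
    ],
)

cursor = conn.execute("SELECT * FROM salary")
degree = len(cursor.description)      # number of columns
cardinality = len(cursor.fetchall())  # number of rows

# A second row with the same (emp_no, eff_date) combination violates the key.
try:
    conn.execute("INSERT INTO salary VALUES (179, '1/1/98', 9999)")
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False
```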

Entity integrity, referential integrity/foreign keys

• Entity integrity constraint specifies that no primary key can be null.

• The referential integrity constraint is specified between two relations and is used to maintain the consistency among tuples of the two relations.

• Informally what this means is that a tuple in one relation that refers to another relation must refer to an existing tuple.

• To define referential integrity we use the concept of foreign keys.

Foreign Key:

An attribute in a relation of a database that serves as the primary key of another relation in the same database.

Employee(Emp_No, Emp_Name, Department)

Department(Dept_No, Dept_Name, M_No)

Employee === works for ==> Department

A foreign key is a set of columns in one table that serve as the primary key in another table

Data is presented to the user as tables:

Department (primary key: D-No)

D-No   D-Name    M-No
4      Finance   857
7      Sales     179

Employee (primary key: E-No; foreign key: D-No, referring to Department)

E-No   E-Name   D-No
179    Silva    7
857    Perera   4
342    Dias     7
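The referential integrity constraint between Employee and Department can be sketched with Python's sqlite3 module. SQLite only enforces foreign keys after PRAGMA foreign_keys = ON; the rejected employee (500, 'Nobody', 99) is a hypothetical row added to trigger the violation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default

conn.execute(
    "CREATE TABLE department ("
    " d_no INTEGER PRIMARY KEY, d_name TEXT, m_no INTEGER)"
)
conn.execute(
    "CREATE TABLE employee ("
    " e_no INTEGER PRIMARY KEY, e_name TEXT,"
    " d_no INTEGER REFERENCES department(d_no))"
)
conn.executemany(
    "INSERT INTO department VALUES (?, ?, ?)",
    [(4, "Finance", 857), (7, "Sales", 179)],
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(179, "Silva", 7), (857, "Perera", 4), (342, "Dias", 7)],
)

# Department 99 does not exist, so this tuple would refer to a missing tuple.
try:
    conn.execute("INSERT INTO employee VALUES (500, 'Nobody', 99)")
    dangling_accepted = True
except sqlite3.IntegrityError:
    dangling_accepted = False

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```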

3.2.3 Relational Algebra

• Relational algebra is a collection of operations that are used to manipulate entire relations.

• Relational Operators: Relational operators are used for manipulating relations.

• Properties:

• Relational operations are specified using Structured Query Language (SQL), a standard for relational database access.

• Relational operations are set level, meaning that they operate on multiple rows, rather than one record at a time.

• SQL is non-procedural, meaning that the user specifies what data is to be retrieved rather than how to retrieve the data.

• Each operator takes one or more tables as its operand(s) and produces a table as its result.

• Any column value in a table can be referenced, not just keys.

• Operations can be combined to form complex operations.

• Relational Algebra operations are usually divided into two groups:

• Set theory operations.

• Operations specifically developed for relational databases.

• Relational algebra expressions are considered too technical for ordinary users, hence came SQL.

• They are written as a sequence of steps which, when executed, produce the result.

• Hence the user must say ‘how’ the result is to be obtained, not just ‘what’ is needed.

• Relational calculus:

• Relational calculus is another formal query language, one which states ‘what’ is required and not ‘how’ to obtain it.

• Eg:- {t.FNAME, t.LNAME | EMPLOYEE(t) and t.SALARY>500}

The equivalent SQL query is:

SELECT T.FNAME, T.LNAME
FROM EMPLOYEE AS T
WHERE T.SALARY>500

• Relational algebra operations are divided into nine operations:

• Selection, Projection, Product, Join, Union, Intersection, Difference, Divide, and Assignment

Relational Operators

1. Selection (σ):

• The operation which selects only some of the tuples of a relation, based on a condition, is known as the selection operation.

• It yields a horizontal subset of a given relation.

• It is represented by the ‘σ’ symbol and is sometimes also known as the restriction operation.

• Example: A selection on the Personnel table selects only those tuples in which the value of the attribute Id is less than 105 (the condition).

Select * from Personnel where Id < 105;

Selection: horizontal subset of a table

Employee

E-No   E-Name   D-No
179    Silva    7
857    Perera   4
342    Dias     7

Sales Employee

E-No   E-Name   D-No
179    Silva    7
342    Dias     7

Sales-Emp = σ D-No=7 (Employee)

2. Projection (π):

• The projection of a relation is defined as a projection of all its tuples over some set of attributes.

• It yields a vertical subset of the relation.

• The projection operation is used either to reduce the number of attributes in a relation or to reorder attributes.

• Ex: Select Id, name from Personnel;

• The number of tuples may also be reduced, because duplicate tuples are deleted from the projected relation.

• Ex: Select distinct name from Personnel;

Projection: vertical subset of a table

Employee

E-No   E-Name   D-No
179    Silva    7
857    Perera   4
342    Dias     7

Employee Names

E-No   E-Name
179    Silva
857    Perera
342    Dias

Emp-Names = π E-No, E-Name (Employee)

• SELECT and PROJECT:

• SELECT and PROJECT can be used by combining both operations together.

• Ex:

To get a list of employee numbers for employees in department number 1:

π Eno (σ dep_no=1 (Employee))

Select Eno from Employee where dep_no=1;
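Selection, projection and their combination map onto the WHERE clause and the column list of a SELECT statement. The sketch below runs them with Python's sqlite3 module on the Employee sample data; the lower-case column names are assumptions, and department 7 is used because department 1 does not occur in the sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (e_no INTEGER PRIMARY KEY, e_name TEXT, d_no INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(179, "Silva", 7), (857, "Perera", 4), (342, "Dias", 7)],
)

# Selection: a horizontal subset (rows that satisfy the condition).
selection = conn.execute(
    "SELECT * FROM employee WHERE d_no = 7 ORDER BY e_no"
).fetchall()

# Projection: a vertical subset; DISTINCT removes the duplicate tuples.
projection = conn.execute(
    "SELECT DISTINCT d_no FROM employee ORDER BY d_no"
).fetchall()

# Selection followed by projection in a single statement.
combined = conn.execute(
    "SELECT e_no FROM employee WHERE d_no = 7 ORDER BY e_no"
).fetchall()
```

Without DISTINCT, the projection on d_no would return the value 7 twice, since SQL tables may contain duplicate rows while relations may not.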

3. Cartesian Product (*):

• Creates a single table from two tables by concatenating the tuples of the two relations: the new relation consists of all the possible combinations of their tuples, i.e. R = P * Q.

• It is sometimes also called the Cross Product or Cross Join.

• Example: In the example below, the resultant table Emp-Info contains every combination of a tuple from the Employee table (3 rows) with a tuple from the Department table (2 rows), i.e. 3 * 2 = 6 rows.

Department (2 rows)

D-No   D-Name    M-No
4      Finance   857
7      Sales     179

Employee (3 rows)

E-No   E-Name   D-No
179    Silva    7
857    Perera   4
342    Dias     7

Emp-Info (6 rows)

E-No   E-Name   D-No   D-No   D-Name    M-No
179    Silva    7      4      Finance   857
857    Perera   4      4      Finance   857
342    Dias     7      4      Finance   857
179    Silva    7      7      Sales     179
857    Perera   4      7      Sales     179
342    Dias     7      7      Sales     179

Emp-Info = Employee * Department

SELECT E.*, D.*
FROM Employee E, Department D
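The row count of the product can be verified with a short sketch using Python's sqlite3 module. The lower-case column names are assumptions; the data is the Employee and Department sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (e_no INTEGER, e_name TEXT, d_no INTEGER)")
conn.execute("CREATE TABLE department (d_no INTEGER, d_name TEXT, m_no INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(179, "Silva", 7), (857, "Perera", 4), (342, "Dias", 7)],
)
conn.executemany(
    "INSERT INTO department VALUES (?, ?, ?)",
    [(4, "Finance", 857), (7, "Sales", 179)],
)

# Every employee tuple is paired with every department tuple.
product = conn.execute(
    "SELECT e.*, d.* FROM employee AS e CROSS JOIN department AS d"
).fetchall()

row_count = len(product)        # 3 rows * 2 rows
column_count = len(product[0])  # 3 columns + 3 columns
```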

4. Join:

• The operator, as the name suggests, allows the combining of two or more relations or tables to form a single new relation.

• The tuples from the operand relations that participate in the operation and contribute to the result are related.

• The join operation allows the processing of relationships existing between the operand relations.

• Join is basically the Cartesian product of the relations followed by a selection operation.

• Two common and very useful variants of the join are the equi-join and the natural join.

• In an equi-join the comparison operator is always the equality operator (=).

• Natural Join: Similarly, in a natural join the comparison operator is always the equality operator; however, only one of the two sets of domain-compatible attributes is retained in the result relation of the natural join.

Join: Creates a single table from two tables.

Department

D-No   D-Name    M-No
4      Finance   857
7      Sales     179

Employee

E-No   E-Name   D-No
179    Silva    7
857    Perera   4
342    Dias     7

Emp-Info = Employee ⋈ (E.D-No = D.D-No) Department

Emp-Info

E-No   E-Name   D-No   D-No   D-Name    M-No
179    Silva    7      7      Sales     179
857    Perera   4      4      Finance   857
342    Dias     7      7      Sales     179

Natural Join: Creates a single table from two tables, keeping only one copy of the common attribute D-No.

Emp-Info = Employee ⋈ Department (natural join on D-No)

Emp-Info

E-No   E-Name   D-No   D-Name    M-No
179    Silva    7      Sales     179
857    Perera   4      Finance   857
342    Dias     7      Sales     179
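The difference between the equi-join and the natural join shows up in the number of result columns. A minimal sketch with Python's sqlite3 module follows; the lower-case column names are assumptions, and the common column is named d_no in both tables so that NATURAL JOIN can match it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (e_no INTEGER, e_name TEXT, d_no INTEGER)")
conn.execute("CREATE TABLE department (d_no INTEGER, d_name TEXT, m_no INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(179, "Silva", 7), (857, "Perera", 4), (342, "Dias", 7)],
)
conn.executemany(
    "INSERT INTO department VALUES (?, ?, ?)",
    [(4, "Finance", 857), (7, "Sales", 179)],
)

# Equi-join: both copies of d_no appear in the result.
cursor = conn.execute(
    "SELECT * FROM employee AS e JOIN department AS d ON e.d_no = d.d_no ORDER BY e.e_no"
)
equi_columns = len(cursor.description)
equi_rows = cursor.fetchall()

# Natural join: the common column d_no is kept only once.
cursor = conn.execute("SELECT * FROM employee NATURAL JOIN department ORDER BY e_no")
natural_columns = len(cursor.description)
natural_rows = cursor.fetchall()
```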

• Equi-Join:

• In an equi-join the comparison operator is always the equality operator (=).

• Equi-joins are also called simple or inner joins.

• An equi-join selects only those records from both database tables that have matching values.

• Records with values in the joined field that do not appear in both of the database tables will be excluded from the query.

• The equi-join or inner join is the default join.

• To determine an employee’s department name, you compare the value in the D-No column in the Employee table with the D-No values in the Department table.

• So the relationship between the Employee and Department tables is an equi-join; that is, values in the D-No column of both tables must be equal.

• Q.1) Write the query to retrieve the employee name Silva, his department id and department name.

Employee

E-No   E-Name   D-No
179    Silva    7
857    Perera   4
342    Dias     7

Department

D-No   D-Name
4      Finance
7      Sales

1. Example of Equi-Join/Inner Join (the hyphenated column names are quoted so that they are read as identifiers):

SELECT Employee.*, Department.*
FROM Employee, Department
WHERE Employee."D-No" = Department."D-No";

Emp-Info

E-No   E-Name   D-No   D-No   D-Name
179    Silva    7      7      Sales
857    Perera   4      4      Finance
342    Dias     7      7      Sales

Adding the condition AND Employee."E-Name" = 'Silva' to the WHERE clause keeps only the row for Silva.

2. Example Equi/Inner Join:

Person

Person_id   Name          Address_id
1           Fred Bloggs   57
2           Joe Smith     92
3           Jane Doe      (null)
4           Sue Jones     111

Address

Address_id   Address_Desc
57           1, Acacia Avenue, Anytown
92           13, High Street, Anywhere
113          52, Main Road, Sometown

SELECT person.Name, address.Address_desc
FROM person, address
WHERE person.Address_id = address.Address_id

Result

person.Name   address.Address_desc
Fred Bloggs   1, Acacia Avenue, Anytown
Joe Smith     13, High Street, Anywhere
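The same inner join can be run as a sketch with Python's sqlite3 module, using the Person and Address rows of this example. It uses the explicit JOIN ... ON form, which is equivalent here to listing both tables in FROM and comparing the keys in WHERE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE address (address_id INTEGER PRIMARY KEY, address_desc TEXT)")
conn.execute("CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT, address_id INTEGER)")
conn.executemany(
    "INSERT INTO address VALUES (?, ?)",
    [
        (57, "1, Acacia Avenue, Anytown"),
        (92, "13, High Street, Anywhere"),
        (113, "52, Main Road, Sometown"),
    ],
)
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [(1, "Fred Bloggs", 57), (2, "Joe Smith", 92), (3, "Jane Doe", None), (4, "Sue Jones", 111)],
)

# Only persons whose address_id matches an existing address are returned.
inner = conn.execute(
    "SELECT person.name, address.address_desc"
    " FROM person JOIN address ON person.address_id = address.address_id"
    " ORDER BY person.person_id"
).fetchall()
```

Jane Doe (null address) and Sue Jones (address 111, which has no matching row) are excluded, as is the unmatched address 113.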

• Outer Join:

• Outer joins enable rows to be returned from a join where one of the tables does not contain matching rows for the other table.

• The unmatched rows are also retrieved, with null values filling the columns of the table that has no matching row.

• In Oracle's notation, the (+) is put against the column name of the deficient table, i.e. the one with the missing rows.

• There are three types of outer joins, namely:

• Left Outer Join: Retrieves all the rows from the first table, irrespective of whether they have a match.

• Right Outer Join: Retrieves all the rows from the second table, irrespective of whether they have a match.

• Full Outer Join: Retrieves all the rows from both tables, irrespective of whether they have a match.

Left Outer Join Example

Person

Person_id   Name          Address_id
1           Fred Bloggs   57
2           Joe Smith     92
3           Jane Doe      (null)
4           Sue Jones     110

Address

Address_id   Address_Desc
57           1, Acacia Avenue, Anytown
92           13, High Street, Anywhere
111          52, Main Road, Sometown

SELECT person.Name, address.Address_desc
FROM person, address
WHERE person.Address_id = address.Address_id (+)

Result

person.Name   address.Address_desc
Fred Bloggs   1, Acacia Avenue, Anytown
Joe Smith     13, High Street, Anywhere
Jane Doe      null
Sue Jones     null

Right Outer Join Example

The Person and Address tables are the same as in the left outer join example.

SELECT person.Name, address.Address_desc
FROM person, address
WHERE person.Address_id (+) = address.Address_id

Result

person.Name   address.Address_desc
Fred Bloggs   1, Acacia Avenue, Anytown
Joe Smith     13, High Street, Anywhere
null          52, Main Road, Sometown
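The (+) notation is specific to Oracle. The sketch below reproduces both outer join results with the standard LEFT OUTER JOIN keyword through Python's sqlite3 module. As an assumption made for portability, the right outer join is written as a left outer join with the operands swapped, because older SQLite versions do not accept RIGHT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE address (address_id INTEGER PRIMARY KEY, address_desc TEXT)")
conn.execute("CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT, address_id INTEGER)")
conn.executemany(
    "INSERT INTO address VALUES (?, ?)",
    [
        (57, "1, Acacia Avenue, Anytown"),
        (92, "13, High Street, Anywhere"),
        (111, "52, Main Road, Sometown"),
    ],
)
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [(1, "Fred Bloggs", 57), (2, "Joe Smith", 92), (3, "Jane Doe", None), (4, "Sue Jones", 110)],
)

# Left outer join: every person is kept; unmatched persons get a null address.
left_outer = conn.execute(
    "SELECT person.name, address.address_desc"
    " FROM person LEFT OUTER JOIN address ON person.address_id = address.address_id"
    " ORDER BY person.person_id"
).fetchall()

# Right outer join, written with swapped operands: every address is kept.
right_outer = conn.execute(
    "SELECT person.name, address.address_desc"
    " FROM address LEFT OUTER JOIN person ON person.address_id = address.address_id"
    " ORDER BY address.address_id"
).fetchall()
```

In the Python results, the SQL null appears as None.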

Relational Set Operators

Set operators: Union, Intersection, Difference

Set operations are from mathematical set theory.

The examples use the following two tables.

Student

Fname    Lname
Kapila   Dias
Nimal    Perera
Ajith    Silva
Rohan    Mendis

Instructor

FN       LN
Sunil    De Silva
Kamal    Soysa
Saman    Silva
Kapila   Dias
Nimal    Perera

Union Operator

Stu-Inst = Student ∪ Instructor

Fname    Lname
Kapila   Dias
Nimal    Perera
Ajith    Silva
Rohan    Mendis
Sunil    De Silva
Kamal    Soysa
Saman    Silva

Intersection Operator

Stu-Inst = Student ∩ Instructor

Fname    Lname
Kapila   Dias
Nimal    Perera

Difference Operator

Stu-Inst = Student - Instructor

Fname    Lname
Ajith    Silva
Rohan    Mendis

Inst-Stu = Instructor - Student

Fname    Lname
Sunil    De Silva
Kamal    Soysa
Saman    Silva
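In SQL the three set operators are written UNION, INTERSECT and EXCEPT. The following sketch runs them on the Student and Instructor data with Python's sqlite3 module and collects each result into a Python set, since the order of the result rows is not significant.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (fname TEXT, lname TEXT)")
conn.execute("CREATE TABLE instructor (fn TEXT, ln TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [("Kapila", "Dias"), ("Nimal", "Perera"), ("Ajith", "Silva"), ("Rohan", "Mendis")],
)
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?)",
    [("Sunil", "De Silva"), ("Kamal", "Soysa"), ("Saman", "Silva"),
     ("Kapila", "Dias"), ("Nimal", "Perera")],
)


def run(operator, first, second):
    """Apply a SQL set operator to two SELECT statements and return a set of rows."""
    return set(conn.execute(f"{first} {operator} {second}").fetchall())


students = "SELECT fname, lname FROM student"
instructors = "SELECT fn, ln FROM instructor"

union = run("UNION", students, instructors)             # duplicates appear once
intersection = run("INTERSECT", students, instructors)  # rows present in both
stu_minus_inst = run("EXCEPT", students, instructors)   # Student - Instructor
inst_minus_stu = run("EXCEPT", instructors, students)   # Instructor - Student
```

Both operands must have the same number of columns with compatible types; the column names may differ, as they do here.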

• Self-Join:

• A self-join is used to join a table to itself, i.e. to access related records in the same table.

• Division Operator:

• It creates a new relation by selecting the rows in one relation that match every row in the other relation.

• It is analogous to ordinary arithmetic division, in that it acts as the inverse of the Cartesian product.

• Assignment Operator:

• Assignment is a relational algebra operation that gives a name to a relation.

• Ex: A := Select(Student: Student_Id = ‘Ram’), so in this example the name ‘A’ is assigned to the relation produced by the selection on Student.
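A self-join and a division can both be sketched with Python's sqlite3 module. SQL has no division operator, so the division below is expressed with a doubly nested NOT EXISTS, one common workaround. The completed and required tables and their rows are hypothetical and were invented for this illustration; the employee rows are the sample data from this chapter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Self-join: the employee table is joined to itself under two aliases.
conn.execute("CREATE TABLE employee (e_no INTEGER PRIMARY KEY, e_name TEXT, d_no INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(179, "Silva", 7), (857, "Perera", 4), (342, "Dias", 7)],
)
colleagues = conn.execute(
    "SELECT a.e_name, b.e_name"
    " FROM employee AS a JOIN employee AS b"
    "   ON a.d_no = b.d_no AND a.e_no < b.e_no"  # same department, each pair once
).fetchall()

# Division (hypothetical data): students who completed every required course.
conn.execute("CREATE TABLE completed (student TEXT, course TEXT)")
conn.execute("CREATE TABLE required (course TEXT)")
conn.executemany(
    "INSERT INTO completed VALUES (?, ?)",
    [("Ajith", "DB"), ("Ajith", "OS"), ("Rohan", "DB")],
)
conn.executemany("INSERT INTO required VALUES (?)", [("DB",), ("OS",)])

# Keep a student if there is no required course that the student has not completed.
quotient = conn.execute(
    "SELECT DISTINCT c.student FROM completed AS c"
    " WHERE NOT EXISTS ("
    "   SELECT 1 FROM required AS r"
    "   WHERE NOT EXISTS ("
    "     SELECT 1 FROM completed AS c2"
    "     WHERE c2.student = c.student AND c2.course = r.course))"
).fetchall()
```

Rohan completed only one of the two required courses, so only Ajith appears in the quotient.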