3.1 Codd’s rules
A relational DBMS must use its relational facilities exclusively to manage and interact with the database.
Codd's 12 rules: These rules were defined by Dr. E. F. Codd, the IBM researcher who first developed the relational model in 1970. In 1985, Dr. Codd published a list of 12 rules that define an ideal relational database and have served as a guideline for the design of relational database systems.
• They specify what a relational database must support in order to be relational.
• Rule 1-Information rule:
• All information in the database should be represented in one and only one way: as values in a table.
• Rule 2-Guaranteed access rule:
• All the data should be accessible without ambiguity.
• Every value can be accessed by providing the table name, column name and key.
• Rule 3-Systematic Treatment of Null values:
A field should be allowed to remain empty.
• This involves the support of a null value which is distinct from an empty string or a number with a value of zero.
• Of course this cannot apply to primary keys.
• In addition, most database implementations support the concept of a not-null field constraint that prevents null values in a specific table column.
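The difference between a null, an empty string and zero can be checked with a small sketch. It assumes SQLite (through Python's standard sqlite3 module) as the engine, and the contact table with its columns is a made-up example, not part of the original material.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: name may never be null, phone and visits may.
conn.execute("CREATE TABLE contact (name TEXT NOT NULL, phone TEXT, visits INTEGER)")
conn.executemany(
    "INSERT INTO contact VALUES (?, ?, ?)",
    [("Silva", None, None), ("Perera", "", 0)],
)

# Only the row with a genuinely missing phone matches IS NULL.
missing_phone = [r[0] for r in conn.execute("SELECT name FROM contact WHERE phone IS NULL")]
# The empty string is an ordinary value, not a null.
empty_phone = [r[0] for r in conn.execute("SELECT name FROM contact WHERE phone = ''")]

# The NOT NULL constraint rejects a null in the name column.
try:
    conn.execute("INSERT INTO contact VALUES (NULL, '123', 1)")
    null_name_rejected = False
except sqlite3.IntegrityError:
    null_name_rejected = True
```

Silva's missing phone is found only by IS NULL, while Perera's empty string and zero are ordinary stored values.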
• Rule 4-Dynamic On-Line Catalog based on the relational model:
• The database description is represented at the logical level in the same way as ordinary data, so authorized users can apply the same relational language to its interrogation as they apply to regular data.
• A Catalog (data dictionary) can be queried by authorized users as part of the database.
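As an illustration of querying the catalog with the ordinary query language, the sketch below uses SQLite, whose catalog is exposed as the table sqlite_master. Other systems expose their catalogs under different names, and the employee and department tables here are assumed for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_no INTEGER PRIMARY KEY, emp_name TEXT)")
conn.execute("CREATE TABLE department (dept_no INTEGER PRIMARY KEY, dept_name TEXT)")

# sqlite_master is SQLite's catalog table; it is queried with an ordinary SELECT.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
```

The list of tables comes back as a normal query result, just like regular data.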
• Rule 5-Comprehensive Data Sublanguage Rule:
• The language can be used interactively and embedded within programs.
• It supports data definition, data manipulation, security, integrity constraints and transaction processing.
• In practice today, this means the system must support SQL.
• There must be at least one language whose statements are expressible, per some well-defined syntax, as character strings, and whose ability to support all of the following is comprehensive:
• Data definition
• View definition
• Data manipulation (interactive and by program)
• Integrity constraints
• Authorization
• Transaction boundaries (begin, commit, and rollback)
• Most relational databases use SQL (Structured Query Language) for this purpose.
• Rule 6-View Updating Rule:
• All views that are theoretically updatable are also updatable by the system.
• Views are virtual tables.
• They appear to behave as conventional tables, except that they are built dynamically when the query is run.
• It is not always theoretically possible to update or delete rows through a view, so this rule is not fully supported in practice.
• Rule 7-High-level Insert, Update, and Delete Rule: This rule states that insert, update, and delete operations should be supported for any retrievable set rather than just for a single row in a single table.
• EX: UPDATE mytable SET mycol = value WHERE condition; Many rows may be updated with this single statement.
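A set-level update of this kind can be tried out as follows. This is only a sketch using SQLite through Python's sqlite3 module; the employee table and the new department number 9 are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_no INTEGER, emp_name TEXT, dept_no INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(179, "Silva", 7), (857, "Perera", 4), (342, "Dias", 7)],
)

# One statement moves every employee of department 7 to department 9.
cursor = conn.execute("UPDATE employee SET dept_no = 9 WHERE dept_no = 7")
rows_updated = cursor.rowcount
```

Both rows of department 7 are changed by the single UPDATE, without any row-by-row loop in the program.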
• Rule 8-Physical data independence:
• The user is isolated from the physical method of storing and retrieving information from the database.
• Changes can be made to the underlying architecture (hardware, disk storage methods) without affecting how the user accesses it.
• Rule 9-Logical data independence:
• How data is viewed should not change when the logical structure (the tables' structure) of the database changes.
• This rule is particularly difficult to satisfy.
• Most databases rely on strong ties between the data viewed and the actual structure of the underlying tables.
• Rule 10-Integrity independence:
• The database language (like SQL) should support constraints on user input that maintain database integrity.
• This rule is not fully implemented by most major vendors.
• At a minimum, all databases preserve two constraints through SQL:
• No component of a primary key can have a null value.
• If a foreign key is defined in one table, any value in it must exist as a primary key in another table.
• Rule 11-Distribution independence:
• A user should be totally unaware of whether or not the database is distributed (whether parts of the database exist in multiple locations).
• A variety of reasons make this rule difficult to implement.
• Rule 12-Non subversion rule:
• There should be no way to modify the database structure other than through the database language (like SQL).
• Security and integrity of the database must not be violated.
• Rule Zero for RDBMS:
• Many new DBMSs claim to be relational while also supporting extended features.
• EX: PostgreSQL is an RDBMS with extended object-oriented features.
• Codd's rule zero specifies a criterion for an RDBMS:
• "For any system that is advertised as, or claimed to be, a relational database management system, that system must be able to manage databases entirely through its relational capabilities and support all the above mentioned Codd rules, no matter what additional capabilities the system may support."
• In 1990, Codd extended the 12 rules in Version 2 of the relational model, which lists 333 features grouped into 18 classes covering the catalog, data types (domains), authorization, etc.
3.2.1 Relational Model Concepts
• The relational Model of Data is based on the concept of a Relation.
• A Relation is a mathematical concept based on the ideas of sets.
• The strength of the relational approach to data management comes from the formal foundation provided by the theory of relations.
• Relational Model Properties:
• Each relation (or table) in a database has a unique name.
• An entry at the intersection of each row and column is atomic (or single-valued); i.e. there can be no multi-valued attributes in a relation.
• Each row is unique; i.e. no two rows in a relation are identical.
• Each attribute (or column) within a table has a unique name.
• The relational data model has three major components:
1. Relational database objects:
Allow data structures to be defined.
2. Relational operators:
Allow manipulation of stored data.
3. Relational integrity constraints:
Allow business rules to be defined and ensure data integrity.
The Relational Objects
• Relation:
A named, two dimensional table of data.
• Database:
– A collection of tables and related objects organised in a structured fashion.
– Several database vendors use schema interchangeably with database.
Tables are comprised of rows and a fixed number of named columns.
Data is presented to the user as tables:
Table:
        Column 1   Column 2   Column 3   Column 4
Row
Row
Row
Columns are attributes describing an entity. Each column must have a unique name and a data type.
• Structure of a relation (e.g. Employee):
• Employee (Name, Designation, Department).
Rows are records that present information about a particular entity occurrence.
Relational model terminology
• Row is called a ‘tuple’
• Column header is called an ‘attribute’
• Table is called a ‘relation’
• The data type describing the type of values that can appear in each column is called a ‘domain’
• E.g.:
– Names: the set of names of persons
– Employee_ages: values between 15 and 80 years
The above are called ‘logical definitions of domains’.
– A data type or format can also be specified for each domain.
E.g.: the employee age is an integer between 15 and 80.
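One way to approximate the Employee_ages domain in a real system is a CHECK constraint. The sketch below assumes SQLite through Python's sqlite3 module; the employee table and the helper try_insert are hypothetical names introduced for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint plays the role of the domain "integer between 15 and 80".
conn.execute(
    "CREATE TABLE employee ("
    " name TEXT NOT NULL,"
    " age INTEGER NOT NULL CHECK (age BETWEEN 15 AND 80))"
)

def try_insert(name, age):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?)", (name, age))
        return True
    except sqlite3.IntegrityError:
        return False

in_domain_accepted = try_insert("Perera", 15)
out_of_domain_accepted = try_insert("Dias", 95)
```

An age inside the domain is stored, while an age outside it is refused by the database itself rather than by application code.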
Characteristics of relations:
• Ordering of tuples: Tuples in a relation don’t have any particular order.
– However, in a file they may be physically ordered based on some criterion; this ordering is not part of the relational model.
• Ordering of values within a tuple:
The ordering of values within a tuple is unnecessary when each value is identified by its attribute name, hence a tuple can be considered as a ‘set’ of (attribute, value) pairs.
– But when a relation is implemented as a file, attributes may be physically ordered.
• Values in a tuple are atomic:
Values in the tuple or rows should not be divisible, i.e. they should be atomic.
3.2.2 Relational constraints
• Domain constraints:
Specifies that the value of each attribute ‘A’ must be an atomic value, and from the specified domain.
• Key constraints:
– There is a subset of attributes of a relation schema with the property that no two tuples have the same combination of values for those attributes.
– Any such subset of attributes is called a ‘superkey’.
– A ‘superkey’ can have redundant attributes.
– A key is a minimal superkey.
– If a relation has more than one key, they are called candidate keys.
– One of them is chosen as the primary key.
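The definitions of superkey and key can be made concrete with a short sketch. The rows below are hypothetical data chosen for illustration. Note also that a key is really a property declared on the schema: a sample of rows can show that a set of attributes is not a key, but cannot prove that it is one.

```python
from itertools import combinations

attributes = ("emp_no", "eff_date", "amount")
# Hypothetical salary rows, chosen so that no single attribute is unique.
rows = [
    (179, "1/1/98", 8000),
    (179, "1/6/97", 8000),
    (857, "1/1/98", 8000),
    (342, "28/1/97", 7500),
]

def is_superkey(attrs):
    """True if no two rows share the same combination of values on attrs."""
    idx = [attributes.index(a) for a in attrs]
    projected = [tuple(row[i] for i in idx) for row in rows]
    return len(set(projected)) == len(projected)

# Every subset of attributes that is unique across the rows is a superkey.
superkeys = [
    set(combo)
    for size in range(1, len(attributes) + 1)
    for combo in combinations(attributes, size)
    if is_superkey(combo)
]
# A key is a minimal superkey: no proper subset of it is also a superkey.
candidate_keys = [k for k in superkeys if not any(s < k for s in superkeys)]
```

With this data the full attribute set is a superkey with a redundant attribute, and the only minimal superkey is the pair (emp_no, eff_date).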
Keys: Primary Key:
An attribute (or combination of attributes) that uniquely identifies each row in a relation.
Employee (Emp_No, Emp_Name, Department).
Composite Key: A primary key that consists of more than one attribute.
Salary(Emp_No, Eff_Date, Amount)
Employee (primary key: E-No)
E-No   E-Name   D-No
179    Silva    7
857    Perera   4
342    Dias     7

Salary (composite key: E-No, Eff-Date)
E-No   Eff-Date   Amt
179    1/1/98     8000
857    3/7/94     9000
342    28/1/97    7500

Salary
E-No   Eff-Date   Amt
179    1/1/98     8000
857    3/7/94     9000
179    1/6/97     7000
342    28/1/97    7500

The cardinality of a table refers to the number of rows in the table. The degree of a table refers to the number of columns.
Salary table: Degree = 3, Cardinality = 4
Entity integrity, referential integrity/foreign keys
• Entity integrity constraint specifies that no primary key can be null.
• The referential integrity constraint is specified between two relations and is used to maintain the consistency among tuples of the two relations.
• Informally what this means is that a tuple in one relation that refers to another relation must refer to an existing tuple.
• To define referential integrity we use the concept of foreign keys.
Foreign Key:
An attribute in a relation of a database that serves as the primary key of another relation in the same database.
Employee(Emp_No, Emp_Name, Department)
Department(Dept_No, Dept_Name, M_No)
Employee === works for ==> Department
A foreign key is a set of columns in one table that serve as the primary key in another table
Data is presented to the user as tables:
Department (primary key: D-No)
D-No   D-Name    M-No
4      Finance   857
7      Sales     179

Employee (primary key: E-No; foreign key: D-No, referencing Department)
E-No   E-Name   D-No
179    Silva    7
857    Perera   4
342    Dias     7
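Referential integrity between Employee and Department can be sketched as below. The example assumes SQLite through Python's sqlite3 module, where foreign keys are enforced only after the PRAGMA shown. The snake_case column names and department number 9 are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE department (d_no INTEGER PRIMARY KEY, d_name TEXT)")
conn.execute(
    "CREATE TABLE employee ("
    " e_no INTEGER PRIMARY KEY,"
    " e_name TEXT,"
    " d_no INTEGER REFERENCES department (d_no))"
)
conn.executemany("INSERT INTO department VALUES (?, ?)", [(4, "Finance"), (7, "Sales")])

def accepted(sql):
    """Return True if the statement succeeds, False on an integrity violation."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

# Department 7 exists, so the reference is valid.
valid_reference = accepted("INSERT INTO employee VALUES (179, 'Silva', 7)")
# Department 9 does not exist, so this row would break referential integrity.
dangling_reference = accepted("INSERT INTO employee VALUES (857, 'Perera', 9)")
```

The employee row that points at a non-existent department is rejected, so every stored D-No refers to an existing Department tuple.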
3.2.3 Relational Algebra
• Relational algebra is a collection of operations that are used to manipulate entire relations.
• Relational Operators: Relational operators are used for manipulating relations.
• Properties:
• Relational operations are specified using Structured Query Language (SQL), a standard for relational database access.
• Relational operations are set level, meaning that they operate on multiple rows, rather than one record at a time.
• SQL is non-procedural, meaning that the user specifies what data is to be retrieved rather than how to retrieve the data.
• Each operator takes one or more tables as its operand(s) and produces a table as its result.
• Any column value in a table can be referenced, not just keys.
• Operations can be combined to form complex operations.
• Relational Algebra operations are usually divided into two groups:
• Set theory operations.
• Operations specifically developed for relational databases.
• Relational algebra expressions are considered too technical for ordinary users, hence came SQL.
• They are written as a sequence of steps that, when executed, produce the results.
• Hence the user must say “how”, and not just “what”, is needed.
• Relational calculus:
• Relational calculus is another formal query language; it states ‘what’ is required, and not ‘how’ to obtain it.
• E.g.: {t.FNAME, t.LNAME | EMPLOYEE(t) and t.SALARY>500}
SELECT T.FNAME, T.LNAME
FROM EMPLOYEE AS T
WHERE T.SALARY>500
• Relational algebra comprises nine operations:
• Selection, Projection, Product, Join, Union, Intersection, Difference, Divide, and Assignment
Relational Operators
1. Selection (σ):
The operation that selects only some of the tuples of a relation, based on a condition, is known as the selection operation.
It yields a horizontal subset of a given relation.
It is represented by the ‘σ’ symbol and is sometimes also known as the restriction operation.
Example: A selection on the Personnel table selects only those tuples in which the value of the attribute ID is less than 105 (the condition).
Select Id,Name from Personnel where id< 105;
Selection: horizontal subset of a table
Employee
E-No   E-Name   D-No
179    Silva    7
857    Perera   4
342    Dias     7

Sales Employee
E-No   E-Name   D-No
179    Silva    7
342    Dias     7

Sales-Emp = σ D-No=7 (Employee)
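The Sales-Emp selection can be mimicked in a few lines of Python. This is a minimal sketch in which a relation is modelled as a list of dictionaries; the function name select and the lowercase attribute names are assumptions for the example.

```python
employee = [
    {"e_no": 179, "e_name": "Silva", "d_no": 7},
    {"e_no": 857, "e_name": "Perera", "d_no": 4},
    {"e_no": 342, "e_name": "Dias", "d_no": 7},
]

def select(relation, predicate):
    """Selection: keep the tuples for which the predicate holds."""
    return [row for row in relation if predicate(row)]

# Equivalent of selecting the Employee tuples with D-No = 7.
sales_emp = select(employee, lambda row: row["d_no"] == 7)
```

The result keeps all three attributes but only two of the three rows, which is why selection is called a horizontal subset.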
2. Projection (π):
• The projection of a relation is defined as a projection of all its tuples over some set of attributes.
• It yields a vertical subset of the relation.
• The projection operation is used either to reduce the number of attributes in a relation or to reorder attributes.
• Ex: Select Id, name from Personnel;
• The result may also contain fewer tuples, because duplicate tuples in the projected relation are deleted (in SQL, duplicates are removed only when DISTINCT is specified).
• Ex: Select name from Personnel;
Projection: vertical subset of a table
Employee
E-No   E-Name   D-No
179    Silva    7
857    Perera   4
342    Dias     7

Employee Names
E-No   E-Name
179    Silva
857    Perera
342    Dias

Emp-Names = π E-No, E-Name (Employee)
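Projection, including the removal of duplicate tuples, can be sketched as follows. Relations are modelled here as lists of Python tuples; the function name project and the lowercase column names are assumptions for the example.

```python
columns = ("e_no", "e_name", "d_no")
employee = [
    (179, "Silva", 7),
    (857, "Perera", 4),
    (342, "Dias", 7),
]

def project(relation, attrs):
    """Projection: keep only attrs and drop duplicate tuples (first occurrence wins)."""
    idx = [columns.index(a) for a in attrs]
    result = []
    for row in relation:
        projected = tuple(row[i] for i in idx)
        if projected not in result:
            result.append(projected)
    return result

emp_names = project(employee, ("e_no", "e_name"))
dept_numbers = project(employee, ("d_no",))
```

Projecting on E-No and E-Name keeps all three rows, while projecting on D-No alone leaves two rows because the duplicate value 7 is eliminated.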
• SELECT and PROJECT:
• SELECT and PROJECT can be combined in a single expression.
• Ex:
To get a list of employee numbers for employees in department number 1:
π Eno (σ dep_no=1 (Employee))
Select Eno from Employee where dep_no=1;
3. Cartesian Product (*):
Creates a single table from two tables by concatenating the tuples of the two relations. The new relation consists of all possible combinations of the tuples,
i.e. R = P * Q
It is sometimes also called the Cross Product or Cross Join.
Example: In the example below, the resultant table Emp-Info pairs every row of the Employee table with every row of the Department table, so 3 * 2 = 6 rows are produced.
Department (2 rows)
D-No   D-Name    M-No
4      Finance   857
7      Sales     179

Employee (3 rows)
E-No   E-Name   D-No
179    Silva    7
857    Perera   4
342    Dias     7

Emp-Info (6 rows)
E-No   E-Name   D-No   D-No   D-Name    M-No
179    Silva    7      4      Finance   857
857    Perera   4      4      Finance   857
342    Dias     7      4      Finance   857
179    Silva    7      7      Sales     179
857    Perera   4      7      Sales     179
342    Dias     7      7      Sales     179

Emp-Info = Employee * Department
SELECT E.*, D.*
FROM Employee E, Department D
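The six-row product of Employee and Department can be reproduced with a minimal Python sketch. Relations are modelled as lists of tuples, and itertools.product from the standard library generates the combinations.

```python
from itertools import product

employee = [(179, "Silva", 7), (857, "Perera", 4), (342, "Dias", 7)]
department = [(4, "Finance", 857), (7, "Sales", 179)]

# Concatenate every Employee tuple with every Department tuple.
emp_info = [e + d for e, d in product(employee, department)]
```

The result has 3 * 2 = 6 tuples of degree 6, most of which pair an employee with a department the employee does not belong to.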
4. Join:
• The join operator, as the name suggests, allows the combining of two or more relations or tables to form a single new relation.
• The tuples from the operand relations that participate in the operation and contribute to the result are related.
• The join operation allows the processing of relationships existing between the operand relations.
• Join is basically the Cartesian product of the relations followed by a selection operation.
• Two common and very useful variants of the join are equi-join and natural join.
• In equi-join the comparison operator is always the equality operator (=).
• Natural Join: Similarly, in natural join the comparison operator is always the equality operator; however, only one of the two sets of domain-compatible attributes is retained in the result relation of the natural join.
Join: Creates a single table from two tables.
Department
D-No   D-Name    M-No
4      Finance   857
7      Sales     179

Employee
E-No   E-Name   D-No
179    Silva    7
857    Perera   4
342    Dias     7

Emp-Info = Employee ⋈ (E.D-No = D.D-No) Department

Emp-Info
E-No   E-Name   D-No   D-No   D-Name    M-No
179    Silva    7      7      Sales     179
857    Perera   4      4      Finance   857
342    Dias     7      7      Sales     179
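The statement that a join is a Cartesian product followed by a selection can be sketched directly. Relations are modelled as lists of tuples, and the positional indexes used for D-No are an assumption of this representation.

```python
from itertools import product

employee = [(179, "Silva", 7), (857, "Perera", 4), (342, "Dias", 7)]
department = [(4, "Finance", 857), (7, "Sales", 179)]

# Step 1: Cartesian product. Step 2: selection on E.D-No = D.D-No.
emp_info = [
    e + d
    for e, d in product(employee, department)
    if e[2] == d[0]  # e[2] is Employee.D-No, d[0] is Department.D-No
]
```

Only the three combinations with equal department numbers survive, and the D-No value appears twice in each result tuple, as in the Emp-Info table of the equi-join.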
Natural Join: Creates a single table from two tables.
Department
D-No   D-Name    M-No
4      Finance   857
7      Sales     179

Employee
E-No   E-Name   D-No
179    Silva    7
857    Perera   4
342    Dias     7

Emp-Info
E-No   E-Name   D-No   D-Name    M-No
179    Silva    7      Sales     179
857    Perera   4      Finance   857
342    Dias     7      Sales     179

Emp-Info = Employee ⋈ Department (joined on the common attribute D-No)
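A natural join can be sketched by matching on every attribute name the two relations share and keeping a single copy of it. Here each tuple is modelled as a Python dictionary; the function name natural_join is an assumption for the example.

```python
employee = [
    {"E-No": 179, "E-Name": "Silva", "D-No": 7},
    {"E-No": 857, "E-Name": "Perera", "D-No": 4},
    {"E-No": 342, "E-Name": "Dias", "D-No": 7},
]
department = [
    {"D-No": 4, "D-Name": "Finance", "M-No": 857},
    {"D-No": 7, "D-Name": "Sales", "M-No": 179},
]

def natural_join(left, right):
    """Join on all attribute names common to both relations, keeping one copy."""
    result = []
    for lrow in left:
        for rrow in right:
            common = lrow.keys() & rrow.keys()
            if all(lrow[a] == rrow[a] for a in common):
                # Merging the dictionaries keeps a single D-No entry.
                result.append({**lrow, **rrow})
    return result

emp_info = natural_join(employee, department)
```

Each result tuple has five attributes instead of six, because the shared D-No column is retained only once.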
• Equi-Join:
• In equi-join the comparison operator is always the equality operator (=).
• Equi-joins are also called simple or inner joins.
• An equi-join selects only those records from both database tables that have matching values.
• Records with values in the joined field that do not appear in both of the database tables will be excluded from the query.
• The equi-join or inner join is the default join.
• To determine an employee’s department name, you compare the value in the D-No column in the Employee table with the D-No values in the Department table.
• So the relationship between the Employee and Department tables is an equi-join; that is, values in the D-No column of both tables must be equal.
• Q.1) Write the query to retrieve the employee name Silva, his department ID and department name.
Employee
E-No   E-Name   D-No
179    Silva    7
857    Perera   4
342    Dias     7

Department
D-No   D-Name
4      Finance
7      Sales

1. Example of Equi-Join/Inner Join:
SELECT Employee.*, Department.*
FROM Employee, Department
WHERE Employee."D-No" = Department."D-No";

Emp-Info
E-No   E-Name   D-No   D-No   D-Name
179    Silva    7      7      Sales
857    Perera   4      4      Finance
342    Dias     7      7      Sales
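One possible answer to Q.1 is sketched below. It assumes SQLite through Python's sqlite3 module and uses snake_case column names (e_no, e_name, d_no, d_name) in place of the hyphenated names of the slides, so that no identifier quoting is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (e_no INTEGER, e_name TEXT, d_no INTEGER)")
conn.execute("CREATE TABLE department (d_no INTEGER, d_name TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(179, "Silva", 7), (857, "Perera", 4), (342, "Dias", 7)],
)
conn.executemany("INSERT INTO department VALUES (?, ?)", [(4, "Finance"), (7, "Sales")])

# Equi-join on d_no, restricted to the employee named Silva.
silva = conn.execute(
    "SELECT e.e_name, d.d_no, d.d_name"
    " FROM employee e, department d"
    " WHERE e.d_no = d.d_no AND e.e_name = 'Silva'"
).fetchall()
```

The join condition pairs Silva with department 7, and the extra condition on the name restricts the result to that single row.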
2. Example Equi/Inner Join:
Person
Person_id   Name          Address_id
1           Fred Bloggs   57
2           Joe Smith     92
3           Jane Doe
4           Sue Jones     111

Address
Address_id   Address_Desc
57           1, Acacia Avenue, Anytown
92           13, High Street, Anywhere
113          52, Main Road, Sometown

SELECT person.Name, address.Address_desc
FROM person, address
WHERE person.Address_id = address.Address_id

Result
person.Name   address.Address_desc
Fred Bloggs   1, Acacia Avenue, Anytown
Joe Smith     13, High Street, Anywhere
• Outer Join:
• Outer joins enable rows to be returned from a join where one of the tables does not contain matching rows for the other table.
• Unmatched rows are also retrieved, with null values in the columns of the table that has no match.
• In Oracle's notation, the (+) is put against the column name on the deficient table, i.e. the one with the missing rows; standard SQL uses the LEFT, RIGHT and FULL OUTER JOIN keywords instead.
• There are three types of outer joins, namely:
• Left Outer Join: For retrieving all the rows from the first table irrespective of whether there is a match.
• Right Outer Join: For retrieving all the rows from the second table irrespective of whether there is a match.
• Full Outer Join: For retrieving all the rows from both tables irrespective of whether there is a match.
Left Outer Join Example
Person
Person_id   Name          Address_id
1           Fred Bloggs   57
2           Joe Smith     92
3           Jane Doe
4           Sue Jones     110

Address
Address_id   Address_Desc
57           1, Acacia Avenue, Anytown
92           13, High Street, Anywhere
111          52, Main Road, Sometown

SELECT person.Name, address.Address_desc
FROM person, address
WHERE person.Address_id = address.Address_id (+)

Result
person.Name   address.Address_desc
Fred Bloggs   1, Acacia Avenue, Anytown
Joe Smith     13, High Street, Anywhere
Jane Doe      null
Sue Jones     null
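The same left outer join can be written with the standard LEFT OUTER JOIN keywords instead of the (+) notation. The sketch below assumes SQLite through Python's sqlite3 module and lowercase column names; the ORDER BY is added only to make the row order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (person_id INTEGER, name TEXT, address_id INTEGER)")
conn.execute("CREATE TABLE address (address_id INTEGER, address_desc TEXT)")
conn.executemany("INSERT INTO person VALUES (?, ?, ?)", [
    (1, "Fred Bloggs", 57), (2, "Joe Smith", 92),
    (3, "Jane Doe", None), (4, "Sue Jones", 110),
])
conn.executemany("INSERT INTO address VALUES (?, ?)", [
    (57, "1, Acacia Avenue, Anytown"),
    (92, "13, High Street, Anywhere"),
    (111, "52, Main Road, Sometown"),
])

# Every person is kept; a missing address shows up as None (SQL null).
rows = conn.execute(
    "SELECT person.name, address.address_desc"
    " FROM person LEFT OUTER JOIN address"
    " ON person.address_id = address.address_id"
    " ORDER BY person.person_id"
).fetchall()
```

Jane Doe (no address id) and Sue Jones (address id 110, which does not exist in Address) both appear with a null description.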
Right Outer Join Example
Person
Person_id   Name          Address_id
1           Fred Bloggs   57
2           Joe Smith     92
3           Jane Doe
4           Sue Jones     110

Address
Address_id   Address_Desc
57           1, Acacia Avenue, Anytown
92           13, High Street, Anywhere
111          52, Main Road, Sometown

SELECT person.Name, address.Address_desc
FROM person, address
WHERE person.Address_id (+) = address.Address_id

Result
person.Name   address.Address_desc
Fred Bloggs   1, Acacia Avenue, Anytown
Joe Smith     13, High Street, Anywhere
null          52, Main Road, Sometown
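A right outer join of Person and Address gives the same rows as a left outer join with the two tables swapped. The sketch below uses that swapped form because older SQLite versions do not accept RIGHT OUTER JOIN. It assumes SQLite through Python's sqlite3 module and lowercase column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (person_id INTEGER, name TEXT, address_id INTEGER)")
conn.execute("CREATE TABLE address (address_id INTEGER, address_desc TEXT)")
conn.executemany("INSERT INTO person VALUES (?, ?, ?)", [
    (1, "Fred Bloggs", 57), (2, "Joe Smith", 92),
    (3, "Jane Doe", None), (4, "Sue Jones", 110),
])
conn.executemany("INSERT INTO address VALUES (?, ?)", [
    (57, "1, Acacia Avenue, Anytown"),
    (92, "13, High Street, Anywhere"),
    (111, "52, Main Road, Sometown"),
])

# Every address is kept; an address with no person shows None for the name.
rows = conn.execute(
    "SELECT person.name, address.address_desc"
    " FROM address LEFT OUTER JOIN person"
    " ON person.address_id = address.address_id"
    " ORDER BY address.address_id"
).fetchall()
```

The address at 52, Main Road has no matching person, so it is returned with a null name, while the two unmatched persons are dropped.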
Relational Set Operators
Set operators:
Union
Intersection
Difference
Set operations are from mathematical set theory.
Union Operator
Student
Fname    Lname
Kapila   Dias
Nimal    Perera
Ajith    Silva
Rohan    Mendis

Instructor
FN       LN
Sunil    De Silva
Kamal    Soysa
Saman    Silva
Kapila   Dias
Nimal    Perera

Stu-Inst
Fname    Lname
Kapila   Dias
Nimal    Perera
Ajith    Silva
Rohan    Mendis
Sunil    De Silva
Kamal    Soysa
Saman    Silva

Stu-Inst = Student ∪ Instructor
Intersection Operator
Student
Fname    Lname
Kapila   Dias
Nimal    Perera
Ajith    Silva
Rohan    Mendis

Instructor
FN       LN
Sunil    De Silva
Kamal    Soysa
Saman    Silva
Kapila   Dias
Nimal    Perera

Stu-Inst
Fname    Lname
Kapila   Dias
Nimal    Perera

Stu-Inst = Student ∩ Instructor
Difference Operator
Stu-Inst = Student - Instructor
Inst-Stu = Instructor - Student

Student
Fname    Lname
Kapila   Dias
Nimal    Perera
Ajith    Silva
Rohan    Mendis

Instructor
FN       LN
Sunil    De Silva
Kamal    Soysa
Saman    Silva
Kapila   Dias
Nimal    Perera

Stu-Inst
Fname    Lname
Ajith    Silva
Rohan    Mendis

Inst-Stu
Fname    Lname
Sunil    De Silva
Kamal    Soysa
Saman    Silva
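Because relations are sets of tuples, the three set operators map directly onto Python's built-in set type. The sketch below reuses the Student and Instructor data from the examples above; the variable names are assumptions.

```python
student = {
    ("Kapila", "Dias"), ("Nimal", "Perera"), ("Ajith", "Silva"), ("Rohan", "Mendis"),
}
instructor = {
    ("Sunil", "De Silva"), ("Kamal", "Soysa"), ("Saman", "Silva"),
    ("Kapila", "Dias"), ("Nimal", "Perera"),
}

union = student | instructor          # Student ∪ Instructor
intersection = student & instructor   # Student ∩ Instructor
stu_inst = student - instructor       # Student - Instructor
inst_stu = instructor - student       # Instructor - Student
```

The union has seven tuples rather than nine because the two people who appear in both relations are stored only once.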
• Self-Join:
• A self-join is used to access related records in the same table, i.e. it joins a table to itself.
• Division Operator:
• It creates a new relation by selecting the rows in one relation that match every row in the other relation.
• It is analogous to arithmetic division: it acts as the inverse of the Cartesian product.
• Assignment Operator:
• Assignment is a relational algebra operation that gives a name to a relation.
• Ex: A := Select(Student: Student_Id = ‘Ram’); in this example the name ‘A’ is assigned to the result of the selection on Student.
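The division operator is the least intuitive of the nine, so a small sketch may help. The data below (students and the courses they completed) is hypothetical, and the function name divide is an assumption. The dividend is a set of (x, y) pairs and the divisor a set of y values.

```python
# Hypothetical data: which student has completed which course.
completed = {
    ("Ram", "Databases"), ("Ram", "Networks"),
    ("Sita", "Databases"),
    ("Hari", "Databases"), ("Hari", "Networks"), ("Hari", "Compilers"),
}
required = {"Databases", "Networks"}

def divide(dividend, divisor):
    """Return each x such that (x, y) is in the dividend for every y in the divisor."""
    candidates = {x for x, _ in dividend}
    return {x for x in candidates if all((x, y) in dividend for y in divisor)}

finished_all = divide(completed, required)
```

Ram and Hari are paired with every required course, so they are in the quotient; Sita is missing Networks and is left out.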