10/7/2010
ITCS 3160 Midterm exam review
General Information
• Time: Oct 14
• Type of questions:
  – True / False
  – Multi/Single-selection
  – Fill blanks
  – Expected result
  – Open questions (write queries, draw ER diagrams, draw schema diagrams, etc.)
Database and DBMS
• A database is a collection of related data
  – Data: known facts that can be recorded and that have implicit meaning
• A DBMS is a collection of programs that enables users to create and maintain a database
Main Functions of DBMS
• Defining a database
  – Specify the data types, structures, and constraints of the data to be stored
  – Meta-data
    • Database definition or descriptive information
    • Stored by the DBMS in the form of a database catalog or dictionary
• Constructing a database
  – Store data on a storage medium controlled by the DBMS
Main Functions of DBMS (cont'd)
• Manipulating a database
  – Query and update the database miniworld
    • Application program accesses database by sending queries to DBMS
    • Query causes some data to be retrieved
  – Generate reports
• Sharing a database
  – Allow multiple users and programs to access the database simultaneously
Characteristics of database approach
• Self‐describing nature of a database system
• Insulation between programs and data, and data abstraction
• Support of multiple views of the data
• Sharing of data and multiuser transaction processing
Actors on the scene
• Database administrators (DBA)
• Database designers
• End users
  – Casual end users
  – Naive or parametric end users
  – Sophisticated end users
  – Standalone users
• System analysts
• Application programmers
Workers behind the Scene
• DBMS system designers and implementers
• Tool developers
• Operators and maintenance personnel
Categories of Data Models
• High-level or conceptual data models
  – Close to the way many users perceive data
• Low-level or physical data models
  – Describe the details of how data is stored on computer storage media
• Representational or implementation data models
  – Easily understood by end users
  – Also similar to how data is organized in computer storage
Conceptual Data Models
• Entity
  – Represents a real-world object or concept
• Attribute
  – Represents some property of interest
  – Further describes an entity
• Relationship among two or more entities
  – Represents an association among the entities
• Entity-Relationship model
Implementation Data Models
• Relational data model
  – Used most frequently in traditional commercial DBMSs
• Legacy network and hierarchical models
• Record-based data models refer to relational, network, and hierarchical data models
• Object data model
  – New family of higher-level implementation data models
  – Closer to conceptual data models
Database Schema
• Database schema
  – Description of a database
• Schema diagram
  – Displays selected aspects of schema
• Schema construct
  – Each object in the schema
Database State
• Database state or snapshot
  – Data in database at a particular moment in time
  – Also called the current set of occurrences or instances in the database
  – Each schema construct has its own current set of instances
• Define a new database
  – Specify database schema to the DBMS only
  – Database is in empty state with no data
• Initial state
  – Populated or loaded with the initial data
• Valid state
  – Satisfies the structure and constraints specified in the schema
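The difference between schema and state can be sketched in a few lines. This is only an illustration, assuming Python's built-in sqlite3 module as a stand-in DBMS; the department table and its rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Defining the database: only the schema is given to the DBMS.
conn.execute(
    "CREATE TABLE department (number INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
empty_state = conn.execute("SELECT * FROM department").fetchall()

# Initial state: the database is loaded with its first data.
conn.executemany(
    "INSERT INTO department VALUES (?, ?)",
    [(5, "Research"), (4, "Administration")],
)
initial_state = conn.execute(
    "SELECT * FROM department ORDER BY number"
).fetchall()

# A change that would violate a schema constraint (duplicate key) is
# rejected, so the database stays in a valid state.
try:
    conn.execute("INSERT INTO department VALUES (5, 'Duplicate')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(empty_state, initial_state, rejected)
```

The schema is defined once; the state changes with every successful update.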
Three-Schema Architecture
• Internal level
  – Describes physical storage structure of the database
• Conceptual level
  – Describes structure of the whole database for a community of users
• External or view level
  – Describes part of the database that a particular user group is interested in
Data Independence
DBMS Languages
• Data definition language (DDL)
  – Defines both conceptual and internal schemas, or the conceptual schema only
• Storage definition language (SDL)
  – Specifies the internal schema
• View definition language (VDL)
  – Specifies user views/mappings to conceptual schema
• Data manipulation language (DML)
  – Allows retrieval, insertion, deletion, modification
• Comprehensive integrated language such as SQL
  – SQL is a combination of DDL, VDL, and DML
DBMS Interfaces
• Menu-based interfaces for Web clients or browsing
• Forms-based interfaces
• Graphical user interfaces
• Natural language interfaces
• Speech input and output
• Interfaces for parametric users
• Interfaces for the DBA
Two‐Tier Client/Server Architectures for DBMSs
• Server handles
  – Query and transaction functionality related to SQL processing
• Client handles
  – User interface programs and application programs
Three‐Tier and n‐Tier Architectures for Web Applications
• Application server or Web server
• N-tier
  – Divide the layers between the user and the stored data further into finer components
Classification of Database Management Systems
• Data model
  – Relational
  – Object
  – …
• Number of users
• Number of sites
• Cost
• Types of access path options
• General or special-purpose
Overview of db design
• Requirement analysis
  – Data to be stored
  – Applications to be built
  – Operations (most frequent) subject to performance requirement
• Conceptual db design
  – Description of the data (including constraints)
  – By high level model such as ER
• Logical db design
  – Choose DBMS to implement
  – Convert conceptual db design into database schema
• Schema refinement (normalization)
• Physical db design
  – Analyze the workload
  – Refine db design to meet performance criteria
• Security design
Data Modeling Using the Entity-Relationship (ER) Model
ER Model Basics
[Figure: entity type EMPLOYEE with attributes Ssn, Name, Salary]
• Entity: Real-world object distinguishable from other objects. An entity is described (in DB) using a set of attributes.
• Entity Set: A collection of similar entities. E.g., all employees.
  – All entities in an entity set have the same set of attributes. (Until we consider ISA hierarchies)
  – An entity set usually has a key.
  – Each attribute has a domain (value set).
ER Model Basics (Cont’d)
[Figure: ER diagram with EMPLOYEE (Ssn, Name, Salary) and DEPARTMENT (Number, Name, Budget) linked by the relationship set WORKS_FOR (Since); relationship set SUPERVISION relates EMPLOYEE to itself in the roles supervisor and supervisee]
• Relationship: Association among 2 or more entities. E.g., John Smith works in Research department.
• Relationship Set: Collection of similar relationships.
  – An n-ary relationship set R relates n entity sets E1 ... En; each relationship in R involves entities e1 ∈ E1, ..., en ∈ En
• Same entity set could participate in different relationship sets, or in different "roles" in same set.
Cardinality Ratio Constraints
• Consider WORKS_FOR: An employee works for one department; a dept can have many employees.
• Cardinality ratio: The maximum number of relationship instances that an entity can participate in
• In WORKS_FOR
  – EMPLOYEE : DEPARTMENT is of cardinality ratio N:1
[Figure: EMPLOYEE (Ssn, Name, Salary) and DEPARTMENT (Number, Name, Budget); WORKS_FOR is marked N : 1 and MANAGES is marked 1 : 1]
Participation (Minimum Cardinality) Constraints
• Does every department have a manager?
  – If so, the participation of DEPARTMENT in MANAGES is said to be total (vs. partial).
    • Every Number value in DEPARTMENT table must appear in a row of the MANAGES table (with a non-NULL Ssn value!)
[Figure: EMPLOYEE (Ssn, Name, Salary) and DEPARTMENT (Number, Name, Budget) connected by MANAGES (Since) and WORKS_FOR (Since); each participation edge is labeled Partial or Total]
Weak Entities
• A weak entity can be identified uniquely only by considering the primary key of another (owner) entity.
  – Owner entity set and weak entity set must participate in a one-to-many relationship set (1 owner, many weak entities).
  – Weak entity set must have total participation in this identifying relationship set.
  – Partial key of a weak entity set uniquely identifies the weak entities that are related to the same owner entity
[Figure: EMPLOYEE (Ssn, Name, Salary) linked through DEP_OF to DEPENDENT (Name, Age); callouts: "Weak Entity", "Identifying Relationship", "Primary Key for weak entity"]
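One common way to realize a weak entity set in tables is sketched below. It is an assumption-laden illustration: it uses Python's built-in sqlite3 module, and the column names (essn, name, age) and sample rows are hypothetical. The weak entity's key combines the owner's key with the partial key, and deleting the owner removes its dependents.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE employee (ssn TEXT PRIMARY KEY, name TEXT, salary REAL);
CREATE TABLE dependent (
    essn TEXT NOT NULL,   -- owner's key; NOT NULL models total participation
    name TEXT NOT NULL,   -- partial key, unique only per owner
    age  INTEGER,
    PRIMARY KEY (essn, name),
    FOREIGN KEY (essn) REFERENCES employee(ssn) ON DELETE CASCADE
);
INSERT INTO employee VALUES ('111', 'Smith', 30000), ('222', 'Wong', 40000);
INSERT INTO dependent VALUES ('111', 'Alice', 5), ('222', 'Alice', 7);
""")

# Two dependents share the partial key 'Alice' but belong to different owners.
same_name = conn.execute(
    "SELECT COUNT(*) FROM dependent WHERE name = 'Alice'"
).fetchone()[0]

# Deleting an owner removes the weak entities that depend on it.
conn.execute("DELETE FROM employee WHERE ssn = '111'")
remaining = conn.execute("SELECT essn, name FROM dependent").fetchall()

print(same_name, remaining)
```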
ISA ('is a') Hierarchies
• As in C++, or other PLs, attributes are inherited.
• If we declare A ISA B, every A entity is also considered to be a B entity.
• Overlap constraints: Can Joe be an Hourly_Emps as well as a Contract_Emps entity? (Allowed/disallowed)
• Completeness constraints: Does every Employees entity also have to be an Hourly_Emps or a Contract_Emps entity? (Yes/no)
• Reasons for using ISA:
  – To add descriptive attributes specific to a subclass.
  – To identify entities that participate in a relationship.
[Figure: EMPLOYEE (Ssn, Name, Salary) with ISA subclasses HOURLY_EMP (Hourly_wages, Hours_worked) and CONTRACT_EMP (Contractid)]
Aggregation
• Used when we have to model a relationship involving (entity sets and) a relationship set.
  – Aggregation allows us to treat a relationship set as an entity set for purposes of participation in (other) relationships.
  – Monitors mapped to table like any other relationship set.
[Figure: EMPLOYEE (Ssn, Name, Salary) connected by MONITOR (Until) to the aggregation of SPONSORS, which relates PROJECT (Id, Started_on, Pbudget) and DEPARTMENT (Number, Name, Budget)]
Relational Database Design by ER-to-Relational Mapping
ER‐to‐Relational Mapping Algorithm
• Step 1: Mapping of Regular Entity Types
• Step 2: Mapping of Weak Entity Types
• Step 3: Mapping of Binary 1:1 Relationship Types
• Step 4: Mapping of Binary 1:N Relationship Types
• Step 5: Mapping of Binary M:N Relationship Types
• Step 6: Mapping of Multivalued Attributes
• Step 7: Mapping of N‐ary Relationship Types
This approach can be used for 1:1 and 1:N relationships
Translating ISA Hierarchies to Relations
• General approach:
  – 3 relations: Employees, Hourly_Emps and Contract_Emps.
    • Hourly_Emps: Every employee is recorded in Employees. For hourly emps, extra info recorded in Hourly_Emps (hourly_wages, hours_worked, ssn); must delete Hourly_Emps tuple if referenced Employees tuple is deleted.
    • Queries involving all employees easy, those involving just Hourly_Emps require a join to get some attributes.
• Alternative: Just Hourly_Emps and Contract_Emps.
  – Hourly_Emps: ssn, name, lot, hourly_wages, hours_worked.
  – Each employee must be in one of these two subclasses.
Raghu Ramakrishnan
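The general (three-relation) approach can be sketched as follows. This is a minimal illustration assuming Python's built-in sqlite3 module; the sample rows are invented, and the column lists follow the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for ON DELETE CASCADE in SQLite
conn.executescript("""
CREATE TABLE employees (ssn TEXT PRIMARY KEY, name TEXT, lot INTEGER);
CREATE TABLE hourly_emps (
    ssn TEXT PRIMARY KEY REFERENCES employees(ssn) ON DELETE CASCADE,
    hourly_wages REAL,
    hours_worked REAL
);
CREATE TABLE contract_emps (
    ssn TEXT PRIMARY KEY REFERENCES employees(ssn) ON DELETE CASCADE,
    contractid INTEGER
);
INSERT INTO employees VALUES ('111', 'Joe', 12), ('222', 'Ann', 7);
INSERT INTO hourly_emps VALUES ('111', 15.0, 40.0);
INSERT INTO contract_emps VALUES ('222', 9001);
""")

# Queries over all employees touch only the superclass table.
all_names = [r[0] for r in conn.execute("SELECT name FROM employees ORDER BY ssn")]

# Queries about hourly employees need a join to reach inherited attributes.
hourly = conn.execute("""
    SELECT e.name, h.hourly_wages
    FROM employees e JOIN hourly_emps h ON e.ssn = h.ssn
""").fetchall()

# Deleting the Employees tuple also deletes the Hourly_Emps tuple.
conn.execute("DELETE FROM employees WHERE ssn = '111'")
hourly_left = conn.execute("SELECT COUNT(*) FROM hourly_emps").fetchone()[0]

print(all_names, hourly, hourly_left)
```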
Directed arc from each foreign key to the relation it references
The Relational Data Model and Relational Database Constraints
Relational Model
• Represents data as a collection of relations
• Relation - Table of values
  – Row
    • Represents a collection of related data values
    • Fact that typically corresponds to a real-world entity or relationship
    • Tuple
  – Table name and column names
    • Interpret the meaning of the values in each row
    • Attribute
Relation Schema
• Relation schema R
  – Denoted by R(A1, A2, ..., An)
  – Made up of a relation name R and a list of attributes, A1, A2, ..., An
• Attribute Ai
  – Column header; name of a role played by some domain D in the relation schema R
• Degree (or arity) of a relation
  – Number of attributes n of its relation schema
Relation or Relation State
• Set of n‐tuples r = {t1, t2, ..., tm}
• Each n-tuple t
  – Ordered list of n values t = <v1, v2, ..., vn>
  – Each value vi, 1 ≤ i ≤ n, is an element of dom(Ai) or is a special NULL value
NULLs in Tuples (cont’d.)
– Meanings for NULL values
  • Value unknown
  • Value exists but is not available
  • Attribute does not apply to this tuple (also known as value undefined)
Relational Model Constraints
• Constraints
  – Restrictions on the actual values in a database state
  – Derived from the rules in the miniworld that the database represents
• Inherent model-based constraints or implicit constraints
  – Inherent in the data model
Domain Constraints
• In each tuple, value of attribute A must be from dom(A)
• Data types associated with domains:
  – Numeric data types for integers and real numbers
  – Characters
  – Booleans
  – Fixed-length strings
  – Variable-length strings
  – Date, time, timestamp
  – Money
  – Other special data types
Key Constraints and Constraints on NULL Values
• Superkey
• Key
• Primary key
• Foreign key
Foreign Keys
• Referential integrity constraint
  – Specified between two relations
  – Maintains consistency among tuples in two relations
  – States that a tuple in one relation that refers to another relation must refer to an existing tuple in that relation
Foreign Keys (cont’d.)
• Foreign key rules:
  – The attributes in FK of R1 have the same domain(s) as the primary key attributes PK of R2
  – Value of FK in a tuple t1 of the current state r1(R1) either occurs as a value of PK for some tuple t2 in the current state r2(R2) or is NULL
  – R1: Referencing relation
  – R2: Referenced relation
  – A foreign key can refer to its own relation
Note: a foreign key can have multiple attributes
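The foreign key rules can be checked on a small example. The sketch below assumes Python's built-in sqlite3 module; the self-referencing super_ssn column and the rows are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("""
    CREATE TABLE employee (
        ssn       TEXT PRIMARY KEY,
        name      TEXT,
        super_ssn TEXT REFERENCES employee(ssn)  -- FK into its own relation
    )
""")

# A NULL foreign key value is allowed.
conn.execute("INSERT INTO employee VALUES ('888', 'Borg', NULL)")
# A value that matches an existing primary key is allowed.
conn.execute("INSERT INTO employee VALUES ('333', 'Wong', '888')")

# A value that matches no existing primary key is rejected.
try:
    conn.execute("INSERT INTO employee VALUES ('999', 'Ghost', '000')")
    dangling_accepted = True
except sqlite3.IntegrityError:
    dangling_accepted = False

row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(dangling_accepted, row_count)
```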
Operations of Relational Model
• Operations of the relational model can be categorized into retrievals and updates
• Basic operations that change the states of relations in the database:
  – Insert
  – Delete
  – Update (or Modify)
The Insert Operation
• Provides a list of attribute values for a new tuple t that is to be inserted into a relation R
• Can violate any of the four types of constraints (domain, key, entity integrity, referential integrity)
• If an insertion violates one or more constraints
  – Default option is to reject the insertion
Sample question: Will the statement Insert <‘Celilia’, ‘f’, ‘Kolonsky’, NULL, ‘1960-04-05’, ‘6532 Windy Lane, Katy, TX’, F, 28000, NULL, 4> be accepted?
The Delete Operation
• Can violate only referential integrity
  – If tuple being deleted is referenced by foreign keys from other tuples
  – Restrict
    • Reject the deletion
  – Cascade
    • Propagate the deletion by deleting tuples that reference the tuple that is being deleted
  – Set null or set default
    • Modify the referencing attribute values that cause the violation
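The three options can be compared side by side. This is a sketch only, assuming Python's built-in sqlite3 module; the helper delete_department and the two tiny tables are hypothetical names made up for the illustration.

```python
import sqlite3

def delete_department(action):
    """Delete department 5 while one employee references it,
    with the foreign key declared ON DELETE <action>."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE department (dnumber INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE employee ("
        " ssn TEXT PRIMARY KEY,"
        f" dno INTEGER REFERENCES department(dnumber) ON DELETE {action})"
    )
    conn.execute("INSERT INTO department VALUES (5)")
    conn.execute("INSERT INTO employee VALUES ('123', 5)")
    try:
        conn.execute("DELETE FROM department WHERE dnumber = 5")
    except sqlite3.IntegrityError:
        return "rejected"
    return conn.execute("SELECT ssn, dno FROM employee").fetchall()

results = {a: delete_department(a) for a in ("RESTRICT", "CASCADE", "SET NULL")}
print(results)
```

With RESTRICT the deletion is refused, with CASCADE the referencing employee row disappears, and with SET NULL the employee row survives with a NULL department number.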
The Update Operation
• Necessary to specify a condition on attributes of relation
  – Select the tuple (or tuples) to be modified
• If attribute not part of a primary key nor of a foreign key
  – Usually causes no problems
• Updating a primary/foreign key
  – Similar issues as with Insert/Delete
SQL
SQL
• SQL language
  – Considered one of the major reasons for the commercial success of relational databases
• SQL
  – Structured Query Language
  – Statements for data definitions, queries, and updates (both DDL and DML)
  – Core specification
  – Plus specialized extensions
• Terminology:
  – Table, row, and column used for relational model terms relation, tuple, and attribute
Attribute Data Types and Domains in SQL (cont’d.)
• Domain
  – Name used with the attribute specification
  – Makes it easier to change the data type for a domain that is used by numerous attributes
  – Improves schema readability
  – Example:
    • CREATE DOMAIN SSN_TYPE AS CHAR(9);
Specifying Key and Referential Integrity Constraints
• PRIMARY KEY clause
  – Specifies one or more attributes that make up the primary key of a relation
  – If primary key has only one attribute
    • Id NUMBER PRIMARY KEY;
  – If primary key has two or more attributes
    • CREATE TABLE course
      (Course_id NUMBER,
       Section_id NUMBER,
       PRIMARY KEY (Course_id, Section_id));
• UNIQUE clause
  – Specifies alternate (secondary) keys
  – Dname VARCHAR2(15) UNIQUE;
Specifying Key and Referential Integrity Constraints (cont’d.)
• FOREIGN KEY clause
  – Default operation: reject update on violation
  – Attach referential triggered action clause
    • Options include SET NULL, CASCADE, and SET DEFAULT
    • ON DELETE and ON UPDATE
    • Action for SET NULL or SET DEFAULT is the same for both ON DELETE and ON UPDATE
    • CASCADE different for ON DELETE and ON UPDATE
Summary of SQL Queries
Aliasing and Tuple Variables
• Aliases or tuple variables
  – Declare alternative relation names
SELECT E.Fname, E.Name, E.Address
FROM EMPLOYEE E, DEPARTMENT D
WHERE D.Name='Research' AND D.Dnumber=E.Dnumber;
/* or FROM EMPLOYEE AS E, DEPARTMENT AS D */
Unspecified WHERE Clause
• Missing WHERE clause
  – Indicates no condition on tuple selection
Use of the Asterisk
• Specify an asterisk (*)
  – Retrieve all the attribute values of the selected tuples
Question: What does each query do?
SELECT ALL and SELECT DISTINCT
• SQL does not automatically eliminate duplicate tuples in query results
• Use the keyword DISTINCT in the SELECT clause
  – Only distinct tuples should remain in the result
Set Operations
• Set operations (duplicate tuples are eliminated)
  – UNION
  – EXCEPT (difference)
  – INTERSECT
• Corresponding multiset operations: UNION ALL, EXCEPT ALL, INTERSECT ALL
• Two relations must have the same attributes and the attributes appear in the same order
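A small sketch of the set versus multiset behavior, assuming Python's built-in sqlite3 module and invented sample rows. Note that SQLite supports UNION, UNION ALL, INTERSECT and EXCEPT but not EXCEPT ALL or INTERSECT ALL, so only the first four appear here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (ssn TEXT PRIMARY KEY, lname TEXT);
CREATE TABLE department (dnumber INTEGER PRIMARY KEY, mgr_ssn TEXT);
INSERT INTO employee VALUES ('1', 'Smith'), ('2', 'Wong'), ('3', 'Borg');
INSERT INTO department VALUES (5, '2'), (4, '3'), (1, '3');
""")

# UNION removes duplicates.
union = [r[0] for r in conn.execute(
    "SELECT ssn FROM employee UNION SELECT mgr_ssn FROM department ORDER BY 1"
)]

# UNION ALL keeps every row from both inputs.
union_all_count = len(conn.execute(
    "SELECT ssn FROM employee UNION ALL SELECT mgr_ssn FROM department"
).fetchall())

# EXCEPT: employees whose Ssn never appears as a manager Ssn.
non_managers = [r[0] for r in conn.execute("""
    SELECT lname FROM employee
    WHERE ssn IN (SELECT ssn FROM employee
                  EXCEPT
                  SELECT mgr_ssn FROM department)
""")]

print(union, union_all_count, non_managers)
```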
Example
Practice: Make a list of the names of all Employees who are not managers of any department
Substring Pattern Matching
• LIKE comparison operator
  – Used for string pattern matching
  – % replaces an arbitrary number of zero or more characters
  – underscore (_) replaces a single character
  – Retrieve all employees whose address is in Houston, Texas
SELECT Fname, Lname
FROM EMPLOYEE
WHERE Address LIKE '%Houston,TX%'
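The two wildcards can be tried out on a few rows. This sketch assumes Python's built-in sqlite3 module; the employee rows are made up, and one address deliberately contains a space after the comma to show that the pattern must match the stored text exactly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (fname TEXT, lname TEXT, address TEXT)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", [
    ("John", "Smith", "731 Fondren, Houston,TX"),
    ("Franklin", "Wong", "638 Voss, Houston, TX"),   # note the extra space
    ("Ahmad", "Jabbar", "980 Dallas, Houston,TX"),
    ("Joyce", "English", "5631 Rice, Bellaire,TX"),
])

# % matches any run of characters (including none).
houston = conn.execute("""
    SELECT fname, lname FROM employee
    WHERE address LIKE '%Houston,TX%'
    ORDER BY lname
""").fetchall()

# Each underscore matches exactly one character: 'Jo' plus two more.
four_letter_jo = conn.execute(
    "SELECT fname FROM employee WHERE fname LIKE 'Jo__'"
).fetchall()

print(houston, four_letter_jo)
```

Wong is missed by the first query because his address reads "Houston, TX" with a space, which the pattern '%Houston,TX%' does not cover.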
Ordering of Query Results
• Use ORDER BY clause
  – Keyword DESC to see result in a descending order of values
  – Keyword ASC to specify ascending order explicitly (default)
  – SELECT *
    FROM EMPLOYEE
    ORDER BY Salary
  – How will the results be ordered with
    ORDER BY D.Dname DESC, E.Lname ASC, E.Fname ASC
INSERT, DELETE, and UPDATE Statements in SQL
• Three commands used to modify the database:
  – INSERT, DELETE, and UPDATE
The INSERT Command
• Specify the relation name and a list of values for the tuple
• If explicit attribute names are specified, attributes with NULL allowed or DEFAULT values can be left out
INSERT INTO EMPLOYEE (Fname, Lname, Ssn, Dno)
VALUES ('Lisa', 'Williams', '234553321', 5);
The INSERT Command (cont’d)
• Insert values with a subquery
The DELETE Command
• Removes tuples from a relation
  – Includes a WHERE clause to select the tuples to be deleted
True or false: Each DELETE statement will remove one tuple from a table
The UPDATE Command
• Modify attribute values of one or more selected tuples
• Additional SET clause in the UPDATE command
  – Specifies attributes to be modified and new values
Nested Queries, Tuples, and Set/Multiset Comparisons
• Nested queries
  – Complete select-from-where blocks within WHERE clause of another query
  – Outer query
EXISTS
• EXISTS(Q) function
  – Check whether the result of a correlated nested query is empty or not
• EXISTS and NOT EXISTS
  – Typically used in conjunction with a correlated nested query
SELECT E.Fname, E.Lname
FROM EMPLOYEE AS E
WHERE EXISTS (SELECT *
              FROM DEPENDENT AS D
              WHERE E.Ssn = D.Essn
                AND E.Sex = D.Sex
                AND E.Fname = D.Dependent_name);
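The query above can be run against a few rows to see which employees it returns. This is a sketch assuming Python's built-in sqlite3 module; the rows are invented for the example, and the NOT EXISTS query is an added illustration (employees with no dependents at all).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (ssn TEXT PRIMARY KEY, fname TEXT, lname TEXT, sex TEXT);
CREATE TABLE dependent (essn TEXT, dependent_name TEXT, sex TEXT);
INSERT INTO employee VALUES
    ('1', 'John', 'Smith', 'M'),
    ('2', 'Alicia', 'Zelaya', 'F'),
    ('3', 'Franklin', 'Wong', 'M');
INSERT INTO dependent VALUES
    ('1', 'John', 'M'), ('1', 'Mary', 'F'), ('3', 'Joy', 'F');
""")

# Employees who have a dependent with the same first name and sex.
same_name = conn.execute("""
    SELECT E.fname, E.lname
    FROM employee AS E
    WHERE EXISTS (SELECT *
                  FROM dependent AS D
                  WHERE E.ssn = D.essn
                    AND E.sex = D.sex
                    AND E.fname = D.dependent_name)
""").fetchall()

# Employees for whom the correlated subquery returns nothing.
no_dependents = conn.execute("""
    SELECT E.fname, E.lname
    FROM employee AS E
    WHERE NOT EXISTS (SELECT * FROM dependent AS D WHERE D.essn = E.ssn)
""").fetchall()

print(same_name, no_dependents)
```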
Aggregate Functions in SQL
• Used to summarize information from multiple tuples into a single-tuple summary
• Grouping
  – Create subgroups of tuples before summarizing
• Built-in aggregate functions
  – COUNT, SUM, MAX, MIN, and AVG
• NULL values discarded when aggregate functions are applied to a particular column
• Functions can be used in the SELECT clause or in a HAVING clause
HAVING
• HAVING clause
  – Provides a condition on the summary information
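Grouping, aggregate functions, NULL handling, and HAVING can be seen together in one query. The sketch assumes Python's built-in sqlite3 module; the employee rows, including the NULL salary, are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (ssn TEXT PRIMARY KEY, salary REAL, dno INTEGER)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", [
    ("1", 30000, 5), ("2", 40000, 5), ("3", None, 5),
    ("4", 25000, 4), ("5", 43000, 4),
    ("6", 55000, 1),
])

# COUNT(*) counts rows; COUNT(salary) and AVG(salary) skip NULL salaries.
# HAVING filters whole groups, here keeping departments with 2+ employees.
summary = conn.execute("""
    SELECT dno, COUNT(*), COUNT(salary), AVG(salary)
    FROM employee
    GROUP BY dno
    HAVING COUNT(*) > 1
    ORDER BY dno
""").fetchall()

print(summary)
```

Department 1 is dropped by HAVING, and department 5 averages over two salaries even though it has three employees.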
The Relational Algebra and Relational Calculus
Formal Relational Query Languages
Two mathematical query languages form the basis for "real" languages (e.g. SQL), and for implementation:
• Relational Algebra: More operational, very useful for representing execution plans.
  – Relational algebra
    • Basic set of operations for the relational model
  – Relational algebra expression
    • Sequence of relational algebra operations
• Relational Calculus: Lets users describe what they want, rather than how to compute it. (Non-operational, declarative.)
Sequences of Operations and the RENAME Operation
• In‐line expression:
• Sequence of operations:
• Rename attributes in intermediate results
  – R(First_name, Last_name, Salary)
• RENAME operation
Operations of Relational Algebra
Operations of Relational Algebra (cont’d.)
Relational Calculus
Requirement: understand simple TRCs
Relational Calculus
• Declarative expression
  – Specify a retrieval request; nonprocedural language
• Any retrieval that can be specified in basic relational algebra
  – Can also be specified in relational calculus
Relational Calculus
• Comes in two flavors: Tuple relational calculus (TRC) and Domain relational calculus (DRC).
• Calculus has variables, constants, comparison ops, logical connectives and quantifiers.
  – TRC: Variables range over (i.e., get bound to) tuples.
  – DRC: Variables range over domain elements (= field values).
  – Both TRC and DRC are simple subsets of first-order predicate logic.
• Expressions (predicates) in the calculus are called formulas. An answer tuple is essentially an assignment of constants to variables that make the formula evaluate to true.
Tuple Relational Calculus
• Query: {T | P(T)}
  – T is a tuple variable
  – P(T) is a formula that describes T
• Result: the set of all tuples t for which P(t) evaluates to true.
  – Find all sailors with a rating above 7.
  – {S | S ∈ Sailors ∧ S.rating > 7}
  – In our book: Sailors(S) specifies that the range relation of tuple variable S is Sailors (S may take as its value any individual tuple from Sailors)
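A TRC query reads much like a set comprehension, which the following sketch tries to make concrete. It is an analogy rather than a calculus implementation; the Sailor record type and the sample tuples are assumptions made for the illustration.

```python
from collections import namedtuple

# Hypothetical Sailors relation: each element is one tuple of the relation.
Sailor = namedtuple("Sailor", "sid sname rating age")
sailors = {
    Sailor(22, "Dustin", 7, 45.0),
    Sailor(31, "Lubber", 8, 55.5),
    Sailor(58, "Rusty", 10, 35.0),
}

# {S | S in Sailors and S.rating > 7}: S ranges over the tuples of Sailors,
# and the condition plays the role of the formula P(S).
result = {s for s in sailors if s.rating > 7}

print(sorted(s.sname for s in result))
```

The comprehension states which tuples qualify without saying how to find them, which is the declarative flavor of the calculus.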
Tuple Relational Calculus
• Atomic formula
  – R ∈ Rel, e.g. S ∈ Sailors (in our book: Sailors(S)); Rel is the range relation of R
  – R.a op S.b, where op is one of <, >, =, ≤, ≥, ≠
  – R.a op constant, e.g. S.rating > 7
TRC
• Formula
  – Any atomic formula
  – ¬p, p ∧ q, p ∨ q (in our book: NOT(p), p AND q, p OR q)
  – Existential quantifiers: ∃R (p(R))
  – Universal quantifiers: ∀R (p(R))