Distributed Database Design COSC 5040 Week One

Distributed Database Design


Page 1: Distributed Database Design

Distributed Database Design

COSC 5040
Week One

Page 2: Distributed Database Design

Jiangping Wang, Webster University, Distributed Database Design

Outline

Introduction
Course overview
Database systems concepts
Relational database model
Structured query language (SQL)

Page 3: Distributed Database Design

Database System Concept

Data
  Known facts
Database
  A collection of related data
Database Management System (DBMS)
  A software system to facilitate the defining, constructing, manipulating, and sharing of a computerized database
Database System
  The DBMS software together with the data itself
  Sometimes, the applications are also included

Page 4: Distributed Database Design

Typical DBMS Functionality

Define a database
Construct and load the database
Manipulate the database
  Querying, generating reports, insertions, deletions, and modifications
Concurrent processing and sharing
Protection or security
Presentation and visualization

Page 5: Distributed Database Design


Database System Environment

Page 6: Distributed Database Design

Example of a Database

Figure 1.2 A database that stores student and course information.

Page 7: Distributed Database Design

Example of a Database

Page 8: Distributed Database Design

Database Manipulation

Database manipulation involves querying and updating
P. 9
  Examples of queries
  Examples of updates

Page 9: Distributed Database Design

Database Approach Characteristics

Self-describing nature of a database system
  Meta-data
Insulation between programs and data, data abstraction
  Program-data independence
Support of multiple views of the data
  Virtual data
Sharing of data and multi-user transaction processing
  Concurrency control

Page 10: Distributed Database Design

Database Users

Actors on the scene
  Database administrators (DBA)
    Authorizing access to the database
    Acquiring software and hardware resources
    Controlling and monitoring efficiency of operations
  Database designers
    Define content, structure, constraints, and functions or transactions
    Communicate with the end-users
  End-users
    Queries, reports
    Update the database content
Actors behind the scene

Page 11: Distributed Database Design


Database Users

Page 12: Distributed Database Design

Advantages of Database Approach

Controlling redundancy
Restricting unauthorized access
Providing persistent storage
Providing storage structures for efficient query processing
Providing backup and recovery
Providing multiple interfaces
Representing complex relationships among data
Enforcing integrity constraints
Drawing inferences and actions

Page 13: Distributed Database Design

Historical Development

Early database applications
  Hierarchical model
  Network model
Relational model based systems
Object-oriented applications: OODBs and ORDBMSs
Web and e-commerce applications
Database for new applications

Page 14: Distributed Database Design

Data Models

Data model
  Data abstraction
  A collection of concepts that can be used to describe the structure of a database
  Entities, attributes, relationships
  Data types, constraints
Categories of data models
  Conceptual (high-level, semantic) data models
  Implementation (representational) data models
  Physical (low-level, internal) data models

Page 15: Distributed Database Design

Schemas and Instances

Database schema
  Description of a database
Schema diagram
  Diagrammatic display of a database schema
Database state
  Actual data in the database at a particular moment in time
  Current set of occurrences or instances

Page 16: Distributed Database Design


Schema Diagram

Page 17: Distributed Database Design


Three-Schema Architecture

Page 18: Distributed Database Design

Data Independence

Logical data independence
  The capacity to change the conceptual schema without having to change the external schemas and their application programs
Physical data independence
  The capacity to change the internal schema without having to change the conceptual schema

Page 19: Distributed Database Design

DBMS Languages

Structured query language (SQL)
  Data definition language (DDL)
    To specify database conceptual schema
  Data manipulation language (DML)
    To specify database retrievals and updates

DBMS Interfaces
  Stand-alone query language interfaces
  Programmer interfaces for embedding DML in programming languages

Page 20: Distributed Database Design

Database System Utilities

To perform certain functions such as:
  Loading data stored in files into a database
  Data conversion tools
  Backing up the database periodically
  Reorganizing database file structures
  Report generation utilities
  Performance monitoring utilities
  Sorting, user monitoring, data compression
  Data dictionary

Page 21: Distributed Database Design

Client-Server Architectures

Centralized architecture
Client-server architecture
  Client
    Provides appropriate interfaces and a client version of the system to access and utilize the server resources
  Server
    Provides services to clients
    Database server provides database query and transaction services to clients

Page 22: Distributed Database Design


Three Tier Client-Server Architecture

Page 23: Distributed Database Design

Classification of DBMS

Based on data model
  Relational
  Network
  Hierarchical
  Object-oriented
  Object-relational
Other classifications
  Single-user vs. multi-user
  Centralized vs. distributed

Page 24: Distributed Database Design

Relational Model Concepts

The relational model is based on the concept of a relation
A relation is a mathematical concept based on the ideas of sets
Relation: A table of values
  Contains a set of rows and columns

Page 25: Distributed Database Design


Example of a Relation

Page 26: Distributed Database Design

Definitions

The schema, or description of a relation
  R(A1, A2, ..., An)
  CUSTOMER(Cust-id, Cust-name, Address, Phone#)
A tuple is an ordered set of values
  Each value is derived from an appropriate domain
A domain is a set of atomic values
  Data type or format
An attribute designates the role played by the domain

Page 27: Distributed Database Design

Definitions

A relation is a subset of the Cartesian product of the sets
Each set has values from a domain
That domain is used in a specific role, which is the attribute name
Given R(A1, A2, ..., An)
  r(R) ⊆ dom(A1) × dom(A2) × ... × dom(An)
R: schema of the relation
r of R: a specific "value" or population of R

Page 28: Distributed Database Design

Example

Let R(A1, A2)
  Let dom(A1) = {0,1}
  Let dom(A2) = {a,b,c}
Then, for example:
  r(R) = {<0,a>, <0,b>, <1,c>}
  is one possible “state” or “population” or “extension” r of the relation R, defined over domains D1 = dom(A1) and D2 = dom(A2)
  It has three tuples
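
The example above can be checked mechanically. The following is a minimal sketch in Python (the variable names are illustrative, not from the slides) that builds the Cartesian product of the two domains and confirms that the given state is a subset of it.

```python
from itertools import product

# Domains from the slide.
dom_a1 = {0, 1}
dom_a2 = {"a", "b", "c"}

# Cartesian product dom(A1) x dom(A2): every tuple that could ever appear.
cartesian = set(product(dom_a1, dom_a2))

# The relation state r(R) given on the slide.
r = {(0, "a"), (0, "b"), (1, "c")}

# A valid state must be a subset of the Cartesian product.
is_valid_state = r <= cartesian

print(len(cartesian), len(r), is_valid_state)
```

Any of the 2^6 subsets of the product would be a possible state of R; the slide shows one of them.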

Page 29: Distributed Database Design

Definition Comparison

Informal Terms        Formal Terms
Table                 Relation
Column                Attribute
Row                   Tuple
Values in a column    Domain
Table Definition      Schema of a Relation
Populated Table       State of the Relation

Page 30: Distributed Database Design

Characteristics of Relations

Ordering of tuples in a relation r(R)
  The tuples are not considered to be ordered
Ordering of values within each tuple
  The attributes in R(A1, A2, ..., An) and the values in t=<v1, v2, ..., vn> are ordered
Values in a tuple
  All values are considered atomic (indivisible)
  A special null value is used to represent values that are unknown or inapplicable to certain tuples

Page 31: Distributed Database Design

Relational Integrity Constraints

Constraints are conditions that must hold on all valid relation instances
Types of constraints
  Domain constraints
  Key constraints
  Entity integrity constraints
  Referential integrity constraints

Page 32: Distributed Database Design

Key Constraints

Uniqueness
  A set of attributes of R such that no two tuples in any valid relation instance r(R) have the same combination of values for those attributes
Minimal
  Removal of any attribute results in a set of attributes that is not a key
If a relation has several candidate keys, one is chosen to be the primary key
The primary key value is used to uniquely identify each tuple in a relation
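
The two key properties can be sketched as small Python checks. This is an illustration under assumptions: the function names and the sample rows are hypothetical, and a single relation state can only refute a key, never prove it, because a key must hold in every valid state.

```python
def is_superkey(rows, attrs):
    """Uniqueness: no two rows agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def is_key(rows, attrs):
    """Uniqueness plus minimality: dropping any attribute breaks uniqueness."""
    if not is_superkey(rows, attrs):
        return False
    return all(
        not is_superkey(rows, [a for a in attrs if a != dropped])
        for dropped in attrs
    )


# Hypothetical enrollment rows, used only to exercise the checks.
enroll = [
    {"SSN": "1", "Course": "A", "Grade": "B"},
    {"SSN": "1", "Course": "B", "Grade": "A"},
    {"SSN": "2", "Course": "A", "Grade": "B"},
]

print(is_key(enroll, ["SSN", "Course"]))           # unique and minimal
print(is_key(enroll, ["SSN", "Course", "Grade"]))  # unique but not minimal
```

In this sample state, {SSN, Course} passes both tests, while adding Grade keeps uniqueness but loses minimality.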

Page 33: Distributed Database Design

Foreign Key

A set of attributes in one relation that references the primary key in another relation
  Same domain(s)
  Value of foreign key either occurs as a value of primary key or is null

Page 34: Distributed Database Design


Entity and Referential Integrity

Entity integrity constraint

No primary key value can be null

Referential integrity constraint

Foreign key value can be either an existing primary key value or a null value

Page 35: Distributed Database Design

Update Operations

Update operations
  Insert a tuple (p. 76)
  Delete a tuple (p. 77)
  Update a tuple (p. 78)
Maintain integrity constraints
  Child insert restrict
  Child update restrict
  Parent update restrict
  Parent delete restrict
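
How a DBMS rejects updates that would violate entity or referential integrity can be observed with a small experiment. The sketch below uses Python's built-in sqlite3 module; the two-table schema and the sample values are simplified assumptions, and SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.execute(
    "CREATE TABLE DEPT (DNUMBER INTEGER NOT NULL PRIMARY KEY, "
    "DNAME VARCHAR(10) NOT NULL)"
)
conn.execute(
    "CREATE TABLE EMP (SSN CHAR(9) NOT NULL PRIMARY KEY, "
    "DNO INTEGER REFERENCES DEPT(DNUMBER))"
)
conn.execute("INSERT INTO DEPT VALUES (5, 'Research')")


def attempt(sql, params=()):
    """Run one statement; report whether a constraint rejected it."""
    try:
        conn.execute(sql, params)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


results = [
    attempt("INSERT INTO EMP VALUES (?, ?)", ("111111111", 5)),     # FK matches a parent
    attempt("INSERT INTO EMP VALUES (?, ?)", ("222222222", None)),  # null FK is allowed
    attempt("INSERT INTO EMP VALUES (?, ?)", ("333333333", 9)),     # child insert restrict
    attempt("INSERT INTO EMP VALUES (?, ?)", (None, 5)),            # entity integrity
    attempt("INSERT INTO EMP VALUES (?, ?)", ("111111111", 5)),     # key constraint
    attempt("DELETE FROM DEPT WHERE DNUMBER = ?", (5,)),            # parent delete restrict
]
print(results)
```

The first two statements succeed; the remaining four are rejected, one for each rule they break.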

Page 36: Distributed Database Design


Relational Database Schema

Page 37: Distributed Database Design


Page 38: Distributed Database Design

Exercise 3.16

Consider the following relations for a database that keeps track of student enrollment in courses and the books adopted for each course:

STUDENT(SSN, Name, Major, Bdate)

COURSE(Course#, Cname, Dept)

ENROLL(SSN, Course#, Quarter, Grade)

BOOK_ADOPTION(Course#, Quarter, Book_ISBN)

TEXT(Book_ISBN, Book_Title, Publisher, Author)

Specify the foreign keys for this schema, stating any assumptions you make.

Page 39: Distributed Database Design

SQL

Structured query language (SQL)
  SQL-86 or SQL1
  SQL-92 or SQL2
  SQL-99 or SQL3
Comprehensive database language
  Data definition (DDL)
  Data manipulation (DML)
    Query
    Update

Page 40: Distributed Database Design

Data Definition Language (DDL)

Used to CREATE, DROP, and ALTER the descriptions of the tables (relations) of a database
Data types
  Numeric
  Character string
  Boolean
  Date/time

Page 41: Distributed Database Design

CREATE TABLE

Specifies a new table's name, its attributes, and their data types
A constraint NOT NULL may be specified

CREATE TABLE DEPARTMENT (
    DNAME VARCHAR(10) NOT NULL,
    DNUMBER INTEGER NOT NULL,
    MGRSSN CHAR(9),
    MGRSTARTDATE CHAR(9));

Page 42: Distributed Database Design

CREATE TABLE

Use the CREATE TABLE command for specifying
  Primary key attributes
  Secondary keys, and
  Referential integrity constraints (foreign keys)

CREATE TABLE DEPT (
    DNAME VARCHAR(10) NOT NULL,
    DNUMBER INTEGER NOT NULL,
    MGRSSN CHAR(9),
    MGRSTARTDATE CHAR(9),
    PRIMARY KEY (DNUMBER),
    UNIQUE (DNAME),
    FOREIGN KEY (MGRSSN) REFERENCES EMP);

Page 43: Distributed Database Design


DROP TABLE and ALTER TABLE

Remove a relation (base table) and its definition

DROP TABLE DEPENDENT;

Add an attribute to one of the base relations

ALTER TABLE EMPLOYEE ADD JOB VARCHAR(12);

Page 44: Distributed Database Design

Retrieval Queries in SQL

One basic statement for retrieving information from a database
  SELECT statement
Basic form is a SELECT-FROM-WHERE block

SELECT <attribute list>
FROM <table list>
WHERE <condition>

Page 45: Distributed Database Design

Simple SQL Queries

Query 0:
  Retrieve the birthdate and address of the employee whose name is 'John B. Smith'

SELECT BDATE, ADDRESS
FROM EMPLOYEE
WHERE FNAME='John'
  AND MINIT='B'
  AND LNAME='Smith';

Page 46: Distributed Database Design

Simple SQL Queries

Query 1:
  Retrieve the name and address of all employees who work for the 'Research' department

SELECT FNAME, LNAME, ADDRESS
FROM EMPLOYEE, DEPARTMENT
WHERE DNAME='Research'
  AND DNUMBER=DNO;

DNAME='Research' is a selection condition
DNUMBER=DNO is a join condition
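
To see the selection and join conditions working together, Query 1 can be run against a tiny in-memory database. This is a sketch using Python's sqlite3 module; the reduced schema and the three employee rows are made-up sample data, and the ORDER BY clause is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DEPARTMENT (DNAME TEXT, DNUMBER INTEGER PRIMARY KEY);
CREATE TABLE EMPLOYEE (FNAME TEXT, LNAME TEXT, ADDRESS TEXT, DNO INTEGER);
INSERT INTO DEPARTMENT VALUES ('Research', 5), ('Administration', 4);
INSERT INTO EMPLOYEE VALUES
    ('Ann', 'Lee', '1 Elm St', 5),
    ('Bob', 'Ray', '2 Oak St', 4),
    ('Cy',  'Fox', '3 Ash St', 5);
""")

# Query 1: the join condition pairs each employee with their department,
# and the selection condition keeps only the 'Research' pairs.
rows = conn.execute("""
    SELECT FNAME, LNAME, ADDRESS
    FROM EMPLOYEE, DEPARTMENT
    WHERE DNAME = 'Research'
      AND DNUMBER = DNO
    ORDER BY LNAME
""").fetchall()

for row in rows:
    print(row)
```

Only the two employees whose DNO matches the DNUMBER of 'Research' are returned.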

Page 47: Distributed Database Design

Simple SQL Queries

Query 2:
  For every project located in 'Stafford', list the project number, the controlling department number, and the department manager's last name, address, and birthdate

SELECT PNUMBER, DNUM, LNAME, ADDRESS, BDATE
FROM PROJECT, DEPARTMENT, EMPLOYEE
WHERE DNUM=DNUMBER AND MGRSSN=SSN
  AND PLOCATION='Stafford';

There are two join conditions
  DNUM=DNUMBER relates a project to its controlling department
  MGRSSN=SSN relates the controlling department to the employee who manages that department

Page 48: Distributed Database Design

Aliases

A query that refers to two or more attributes with the same name must qualify the attribute name with the relation name
Some queries need to refer to the same relation twice
Query 8:
  For each employee, retrieve the employee's name, and the name of his or her immediate supervisor

SELECT E.FNAME, E.LNAME, S.FNAME, S.LNAME
FROM EMPLOYEE E, EMPLOYEE S
WHERE E.SUPERSSN=S.SSN;
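
The self-join in Query 8 can be tried on a small table. The sketch below uses Python's sqlite3 module with made-up rows; the ORDER BY clause is an addition for a stable result order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (FNAME TEXT, LNAME TEXT, SSN TEXT PRIMARY KEY, SUPERSSN TEXT);
INSERT INTO EMPLOYEE VALUES
    ('Ann', 'Lee', '1', NULL),
    ('Bob', 'Ray', '2', '1'),
    ('Cy',  'Fox', '3', '1');
""")

# E plays the role of the supervisee, S the role of the supervisor.
rows = conn.execute("""
    SELECT E.FNAME, E.LNAME, S.FNAME, S.LNAME
    FROM EMPLOYEE E, EMPLOYEE S
    WHERE E.SUPERSSN = S.SSN
    ORDER BY E.SSN
""").fetchall()

for row in rows:
    print(row)
```

Note that the employee with a null SUPERSSN does not appear in the result, because a null never satisfies the equality in the join condition.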

Page 49: Distributed Database Design

Unspecified Where-Clause

Query 9:
  Retrieve the SSN values for all employees

SELECT SSN
FROM EMPLOYEE;

Query 10:
  Retrieve the SSN and department name values for all employees

SELECT SSN, DNAME
FROM EMPLOYEE, DEPARTMENT;

With more than one relation in the FROM clause and no WHERE clause, the result is the CARTESIAN PRODUCT of the relations
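
The size of the Cartesian product is easy to verify by counting rows. In this sketch (Python's sqlite3 module, made-up sample rows), three employees and two departments produce six combinations without a WHERE clause, and three rows once a join condition is added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (SSN TEXT PRIMARY KEY, DNO INTEGER);
CREATE TABLE DEPARTMENT (DNAME TEXT, DNUMBER INTEGER PRIMARY KEY);
INSERT INTO EMPLOYEE VALUES ('111', 5), ('222', 4), ('333', 5);
INSERT INTO DEPARTMENT VALUES ('Research', 5), ('Administration', 4);
""")

# Query 10 as written: every employee paired with every department.
no_where = conn.execute(
    "SELECT SSN, DNAME FROM EMPLOYEE, DEPARTMENT"
).fetchall()

# The same query with a join condition: each employee paired with their own department.
with_join = conn.execute(
    "SELECT SSN, DNAME FROM EMPLOYEE, DEPARTMENT WHERE DNO = DNUMBER"
).fetchall()

print(len(no_where), len(with_join))
```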

Page 50: Distributed Database Design

Use of Asterisk *

To retrieve all the attribute values

Q1C:
SELECT *
FROM EMPLOYEE
WHERE DNO=5;

Q1D:
SELECT *
FROM EMPLOYEE, DEPARTMENT
WHERE DNAME='Research' AND
      DNO=DNUMBER;

Page 51: Distributed Database Design

Use Of Distinct

To eliminate duplicate tuples in a query result, the keyword DISTINCT is used

Q11:
SELECT SALARY
FROM EMPLOYEE;

Q11A:
SELECT DISTINCT SALARY
FROM EMPLOYEE;
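
The difference between Q11 and Q11A shows up as soon as two employees share a salary. A minimal sketch with Python's sqlite3 module follows; the salary values are made up, and the results are sorted in Python because neither query guarantees an order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (SSN TEXT PRIMARY KEY, SALARY INTEGER);
INSERT INTO EMPLOYEE VALUES ('111', 30000), ('222', 30000), ('333', 40000);
""")

# Q11 keeps duplicates: SQL query results are multisets, not sets.
all_salaries = sorted(
    row[0] for row in conn.execute("SELECT SALARY FROM EMPLOYEE")
)

# Q11A removes them.
distinct_salaries = sorted(
    row[0] for row in conn.execute("SELECT DISTINCT SALARY FROM EMPLOYEE")
)

print(all_salaries)
print(distinct_salaries)
```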

Page 52: Distributed Database Design

Set Operations

UNION, MINUS (EXCEPT in standard SQL), and INTERSECT operations
Query 4:
  Make a list of all project names for projects that involve an employee whose last name is 'Smith' as a worker or as a manager of the department that controls the project

(SELECT PNAME
 FROM PROJECT, DEPARTMENT, EMPLOYEE
 WHERE DNUM=DNUMBER AND MGRSSN=SSN
   AND LNAME='Smith')
UNION
(SELECT PNAME
 FROM PROJECT, WORKS_ON, EMPLOYEE
 WHERE PNUMBER=PNO AND ESSN=SSN
   AND LNAME='Smith');

Page 53: Distributed Database Design

Substring Matching

Query 12:
  Retrieve all employees whose address is in Houston, Texas

SELECT FNAME, LNAME
FROM EMPLOYEE
WHERE ADDRESS LIKE '%Houston, TX';
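
In a LIKE pattern, % matches any sequence of zero or more characters, so the position of the % signs matters. The sketch below (Python's sqlite3 module; the addresses and the helper name `names_matching` are made up for illustration) compares the pattern from Query 12 with a variant that also has a trailing %.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (FNAME TEXT, LNAME TEXT, ADDRESS TEXT);
INSERT INTO EMPLOYEE VALUES
    ('Ann', 'Lee', '12 Main St, Houston, TX'),
    ('Bob', 'Ray', '9 Pine Rd, Dallas, TX'),
    ('Cy',  'Fox', '4 Bay Ave, Houston, TX 77001');
""")


def names_matching(pattern):
    """Return last names whose address matches the LIKE pattern."""
    cursor = conn.execute(
        "SELECT LNAME FROM EMPLOYEE WHERE ADDRESS LIKE ? ORDER BY LNAME",
        (pattern,),
    )
    return [row[0] for row in cursor]


ends_with = names_matching("%Houston, TX")   # address must end with the text
contains = names_matching("%Houston, TX%")   # text may appear anywhere

print(ends_with)
print(contains)
```

The pattern without a trailing % misses the address that continues with a postal code, which is worth keeping in mind when address formats vary.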

Page 54: Distributed Database Design

Arithmetic Operations

Query 13:
  Show the resulting salaries if every employee on the 'ProductX' project is given a 10 percent raise

SELECT FNAME, LNAME, 1.1*SALARY AS INCREASED_SAL
FROM EMPLOYEE, WORKS_ON, PROJECT
WHERE SSN=ESSN AND PNO=PNUMBER AND PNAME='ProductX';

Page 55: Distributed Database Design

Ordering of Query Results

Query 15:
  Retrieve a list of employees and the projects they are working on, ordered by department and, within each department, ordered alphabetically by last name, first name

SELECT DNAME, LNAME, FNAME, PNAME
FROM DEPARTMENT, EMPLOYEE, WORKS_ON, PROJECT
WHERE DNUMBER=DNO AND SSN=ESSN AND PNO=PNUMBER
ORDER BY DNAME, LNAME, FNAME;

Page 56: Distributed Database Design

Specifying Updates in SQL

There are three SQL commands to modify the database
  INSERT
  DELETE, and
  UPDATE

Page 57: Distributed Database Design

INSERT

U1:
INSERT INTO EMPLOYEE
VALUES ('Richard', 'K', 'Marini', '653298653', '1962-12-30', '98 Oak Forest,Katy,TX', 'M', 37000, '987654321', 4);

U1A:
INSERT INTO EMPLOYEE (FNAME, LNAME, SSN)
VALUES ('Richard', 'Marini', '653298653');
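
When an INSERT names only some attributes, as in U1A, the remaining attributes are set to NULL (or to a default, if one is declared). A sketch with Python's sqlite3 module follows; the reduced EMPLOYEE schema is an assumption made to keep the example short, and the second insert uses made-up values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE EMPLOYEE (
        FNAME  TEXT NOT NULL,
        LNAME  TEXT NOT NULL,
        SSN    TEXT NOT NULL PRIMARY KEY,
        SALARY INTEGER,
        DNO    INTEGER)
""")

# U1A: only three attributes are given values.
conn.execute(
    "INSERT INTO EMPLOYEE (FNAME, LNAME, SSN) "
    "VALUES ('Richard', 'Marini', '653298653')"
)
row = conn.execute(
    "SELECT FNAME, LNAME, SSN, SALARY, DNO FROM EMPLOYEE"
).fetchone()
print(row)  # SALARY and DNO come back as None (NULL)

# Leaving out a NOT NULL attribute that has no default is rejected.
try:
    conn.execute("INSERT INTO EMPLOYEE (FNAME, SSN) VALUES ('Ann', '111111111')")
    missing_lname = "ok"
except sqlite3.IntegrityError:
    missing_lname = "rejected"
print(missing_lname)
```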

Page 58: Distributed Database Design

DELETE

U4:
DELETE FROM EMPLOYEE WHERE LNAME='Brown';
DELETE FROM EMPLOYEE WHERE SSN='123456789';
DELETE FROM EMPLOYEE WHERE DNO=5;
DELETE FROM EMPLOYEE;

The last statement has no WHERE clause, so it deletes every tuple; the table definition itself remains

Page 59: Distributed Database Design

UPDATE

U5:
  Change the location and controlling department number of project number 10 to 'Bellaire' and 5, respectively

UPDATE PROJECT
SET PLOCATION = 'Bellaire', DNUM = 5
WHERE PNUMBER=10;

U6:
  Give all employees in the 'Research' department a 10% raise in salary

UPDATE EMPLOYEE
SET SALARY = SALARY * 1.1
WHERE DNO = 5;
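
An update like U6 can also select its target rows by department name rather than by a hard-coded department number. The sketch below (Python's sqlite3 module) uses a nested query for that; the subquery form, the reduced schema, and the sample salaries are assumptions for illustration, not taken from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DEPARTMENT (DNAME TEXT, DNUMBER INTEGER PRIMARY KEY);
CREATE TABLE EMPLOYEE (SSN TEXT PRIMARY KEY, SALARY REAL, DNO INTEGER);
INSERT INTO DEPARTMENT VALUES ('Research', 5), ('Administration', 4);
INSERT INTO EMPLOYEE VALUES ('111', 30000, 5), ('222', 25000, 4), ('333', 40000, 5);
""")

# The subquery looks up the department number(s) named 'Research';
# the outer UPDATE then touches only employees in those departments.
cursor = conn.execute("""
    UPDATE EMPLOYEE
    SET SALARY = SALARY * 1.1
    WHERE DNO IN (SELECT DNUMBER
                  FROM DEPARTMENT
                  WHERE DNAME = 'Research')
""")
updated_count = cursor.rowcount

salaries = dict(conn.execute("SELECT SSN, SALARY FROM EMPLOYEE"))
print(updated_count, salaries)
```

The multiplication is done in floating point, so the new salaries may differ from the exact decimal values by a tiny rounding error.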

Page 60: Distributed Database Design

Reading and Homework

Readings
  Chapters 1, 2, 3, and 4
Week one homework