
SJSU -- CmpE © 2003-2006 Dr. M. E. Fayad

Database Design

Dr. M.E. Fayad, Professor

Computer Engineering Department, Room #283I

College of Engineering

San José State University

One Washington Square

San José, CA 95192-0180

http://www.engr.sjsu.edu/~fayad,

[email protected]

Lesson 1: Infinite Relational Database

Lesson Objectives

• Understand infinite relational databases
• Explore the view level
• Understand the logical view
• Abstract data types

Infinite Relational Databases

Data abstraction – allows people to forget unimportant details

– View level – a way of presenting data to a group of users
– Logical level – how data is understood to be when writing queries

The View Level

• The highest level of data abstraction is the view level.
• A view is a way of presenting data to a particular group of users.
• Data presentation may depend on users' preferences.
• Each view has to be functional for the users. This means that when designing a view we must keep in mind the functions to be performed on the data.
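In relational systems, one common concrete form of a view is a SQL view. The sketch below is only an illustration, not part of the lecture: it uses Python's built-in sqlite3 module, borrows the Taxrecord scheme and tax figures from the example tables later in this lesson, and invents a view name, IncomeSummary, for users who need total income but not its breakdown.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Taxrecord ("
    "SSN TEXT, Wages INTEGER, Interest INTEGER, Capital_gain INTEGER)"
)
conn.executemany(
    "INSERT INTO Taxrecord VALUES (?, ?, ?, ?)",
    [
        ("123-45-6789", 100000, 3400, 0),
        ("987-65-4321", 83640, 2821, 3400),
        ("567-89-0123", 46000, 501, 1200),
    ],
)

# IncomeSummary is a hypothetical view: it presents one total per taxpayer
# and hides the individual income columns from this group of users.
conn.execute(
    "CREATE VIEW IncomeSummary AS "
    "SELECT SSN, Wages + Interest + Capital_gain AS Income FROM Taxrecord"
)

summary = conn.execute(
    "SELECT SSN, Income FROM IncomeSummary ORDER BY SSN"
).fetchall()
for ssn, income in summary:
    print(ssn, income)
```

The stored table is unchanged; only its presentation differs, and several such views could be defined over the same relation for different user groups.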

The View Level

View-level presentation of the data: science, art, or both? (discussion)

We will illustrate examples from different computer fields, such as computer graphics, for view-level presentation of complex data, especially spatiotemporal data, such as the realistic display of images and movies.

The View Level

Examples:

– Charts
– Graphs
– Drawings
– Maps
– Video or animation

• Examples?
• What is a view?
• What is a model?
• What are the differences between a model and a view?

The Logical Level

Example: Infinite relational data model

• Relation – table
  (Each table has a name and defines a relation.)

• Relation scheme – top row / list of attributes
  (Each entry in the top row of a table is called an attribute name. The ordered set of attributes of a table is called a relation scheme.)

• Arity or dimension – number of attributes of a relation
  (We will use arity and dimension interchangeably, with a preference for dimension in the case of spatiotemporal relations.)

The Logical Level

Example: Infinite relational data model

• Database schema – set of relation names and schemes

• Tuple / Point – each row below the scheme
  (We will use these two terms interchangeably, with a preference for point in the case of spatiotemporal relations.)

• Instance – the set of tuples in a table
  (Each row describes an instance of the scheme.)
  (Please remember that relation schemes are usually fixed, while relation instances may change over time due to database updates.)

Example (1)

SSN          Surname  First Name(s)   Telephone Number
123-45-6789  Doe      Jane Q.         512-555-1234
987-65-4321  Fulano   Juan            210-543-9876
567-89-0123  Roe      Richard Rodney  512-987-6431

SSN          Wages    Interest  Capital Gain
123-45-6789  100,000  3,400     0
987-65-4321  83,640   2,821     3,400
567-89-0123  46,000   501       1,200

Example (2)

• Name the relations!
• What is the arity of each relation?
• What is the relation scheme of each relation?
• What is the database scheme?
• How many tuples are in each relation?
• How many instances are there of each of these relations?
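The logical-level vocabulary can be made concrete with a small Python sketch. It is one possible modeling under assumptions, not the lecture's notation: the Relation class is invented for illustration, "Person" is a hypothetical name for the first table (naming it is left as an exercise above), and "Taxrecord" is taken from the example scheme used later in this lesson.

```python
from dataclasses import dataclass, field


@dataclass
class Relation:
    name: str                                  # relation name
    scheme: tuple                              # ordered attribute names
    tuples: set = field(default_factory=set)   # the current instance

    @property
    def arity(self) -> int:
        # Arity (dimension) is the number of attributes in the scheme.
        return len(self.scheme)


# "Person" is a hypothetical name chosen for this sketch.
person = Relation(
    "Person",
    ("SSN", "Surname", "First Name(s)", "Telephone Number"),
    {
        ("123-45-6789", "Doe", "Jane Q.", "512-555-1234"),
        ("987-65-4321", "Fulano", "Juan", "210-543-9876"),
        ("567-89-0123", "Roe", "Richard Rodney", "512-987-6431"),
    },
)
taxrecord = Relation(
    "Taxrecord",
    ("SSN", "Wages", "Interest", "Capital Gain"),
    {
        ("123-45-6789", 100000, 3400, 0),
        ("987-65-4321", 83640, 2821, 3400),
        ("567-89-0123", 46000, 501, 1200),
    },
)

# The database schema is the set of relation names with their schemes.
database_schema = {r.name: r.scheme for r in (person, taxrecord)}

# An update changes the instance but leaves the scheme untouched.
# A deletion stands in here for any kind of update.
scheme_before = taxrecord.scheme
taxrecord.tuples.discard(("567-89-0123", 46000, 501, 1200))

print(person.arity, len(person.tuples))
print(taxrecord.arity, len(taxrecord.tuples), taxrecord.scheme == scheme_before)
```

Storing the instance as a set of rows works only because these two relations are finite; the infinite relations discussed later in the lesson need a different representation.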

Relation Schemes & Instances (1)

T or F:

• Relation schemes are usually fixed
• Relation instances change with updates

Example schemes:

Taxrecord(SSN, Wages, Interest, Capital_gain)
Taxtable(Income, Tax)

Relation Schemes & Instances (2)

Example: Streets(Name, X, Y)

Streets contains pairs of street names and (x, y) points such that the point belongs to the street. There are an infinite number of (x, y) locations associated with each street.

Example: Crops(Corn, Rye, Sunflower, Wheat)

Crops contains all possible combinations of four crops that a farmer could plant. There are an infinite number of tuples in any instance of this relation.
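An infinite relation cannot be stored row by row, but it can be described finitely by a condition that decides whether a given tuple belongs to it. The Python sketch below illustrates that idea under loud assumptions: the street names and their straight-line geometry are made up, and Crops is read as acreages limited by a hypothetical 100-acre farm. The lecture itself specifies none of these details.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]

# Hypothetical geometry: each street is a straight segment between two endpoints.
SEGMENTS: Dict[str, Tuple[Point, Point]] = {
    "Main St": ((0.0, 0.0), (10.0, 0.0)),
    "Oak Ave": ((5.0, -5.0), (5.0, 5.0)),
}


def in_streets(name: str, x: float, y: float, tol: float = 1e-9) -> bool:
    """Return True if (name, x, y) is a tuple of the infinite relation Streets."""
    if name not in SEGMENTS:
        return False
    (x1, y1), (x2, y2) = SEGMENTS[name]
    # The point must be collinear with the endpoints (zero cross product) ...
    cross = (x - x1) * (y2 - y1) - (y - y1) * (x2 - x1)
    if abs(cross) > tol:
        return False
    # ... and lie between them.
    return (min(x1, x2) - tol <= x <= max(x1, x2) + tol
            and min(y1, y2) - tol <= y <= max(y1, y2) + tol)


TOTAL_ACRES = 100.0  # hypothetical farm size


def in_crops(corn: float, rye: float, sunflower: float, wheat: float) -> bool:
    """Return True if the planting plan is a tuple of the infinite relation Crops."""
    amounts = (corn, rye, sunflower, wheat)
    return all(a >= 0 for a in amounts) and sum(amounts) <= TOTAL_ACRES


print(in_streets("Main St", 2.5, 0.0))   # a point on the street
print(in_streets("Main St", 2.5, 1.0))   # a point off the street
print(in_crops(25, 25, 25, 25))          # a feasible plan
print(in_crops(60, 50, 0, 0))            # exceeds the assumed acreage
```

Each function accepts infinitely many tuples (any real-valued point on a segment, any feasible split of the acreage) while its description stays finite.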

Infinite Relational Data Model

Other examples:

• Temporal data
• Spatial data
• Operations research

Temporal & Spatial Data

In many application areas of machine learning and data mining, researchers face challenges entailed by temporal and spatial data.

What are the differences between temporal and spatial data?

Temporal Data Type (1)

The user-defined temporal data type is a time representation specially designed to meet the specific needs of the user. For example, the designers of a database used for class scheduling in a school might base it on a "Year:Term:Day:Period" format. Terms belonging to a user-defined temporal data type get the same query language support as do terms belonging to built-in temporal data types such as the DATE data type.
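A user-defined temporal type along the lines of "Year:Term:Day:Period" might look like the following Python sketch. The class name SchoolTime, the numeric encoding of each field, and the sample values are assumptions made for illustration; the lecture fixes only the format string.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class SchoolTime:
    """Hypothetical user-defined time in Year:Term:Day:Period format."""
    year: int
    term: int
    day: int
    period: int

    @classmethod
    def parse(cls, text: str) -> "SchoolTime":
        # Split "Year:Term:Day:Period" into its four numeric fields.
        year, term, day, period = (int(part) for part in text.split(":"))
        return cls(year, term, day, period)

    def __str__(self) -> str:
        return f"{self.year}:{self.term}:{self.day}:{self.period}"


first = SchoolTime.parse("2006:1:3:2")
second = SchoolTime.parse("2006:2:1:1")

# order=True compares field by field, so values sort chronologically,
# much as built-in date values do.
print(first < second)
print(sorted([second, first])[0])
```

Supporting comparison and sorting is what lets such a type take part in queries the same way a built-in date type would.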

Temporal Databases

A temporal database is a database that supports some aspect of time, not counting user-defined time.

Spatiotemporal

The term spatiotemporal is used to indicate that the modified concept concerns simultaneous support of some aspect of time and some aspect of space, in one or more dimensions.

Abstract Data Types (1)

• Domain – range of values for an attribute (e.g., strings, integers, or real numbers)
• Scalar domain – always a single value (e.g., a string, integer, or real number)
• Abstract data type domains – composed of scalar domains

Abstract Data Types (2)

Example: Vertices(Cities)

The domain of Cities is a set of strings.

Example: Streets(Name, Extent)

The domain of Extent is a set of (x, y) points.
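The way an abstract data type domain is built from scalar domains can be sketched with Python type definitions. The names Point, Extent, and Street and the sample street are illustrative assumptions. The finite set of points shown is only a sample, since a real extent contains infinitely many points.

```python
from typing import FrozenSet, NamedTuple


class Point(NamedTuple):
    # Two scalar (real-number) components.
    x: float
    y: float


# Abstract data type domain: a set of points, each composed of scalars.
Extent = FrozenSet[Point]


class Street(NamedTuple):
    name: str        # scalar domain: string
    extent: Extent   # abstract data type domain


# Hypothetical street with a few sampled points of its extent.
elm = Street("Elm St", frozenset({Point(0.0, 0.0), Point(1.0, 0.0), Point(2.0, 0.0)}))

print(elm.name, len(elm.extent))
print(Point(1.0, 0.0) in elm.extent)
```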

Database Glossary (1)

• A database is a collection of related data.
• A database management system (DBMS) is a collection of programs that enables users to create and maintain a database.
• A database system = database + DBMS

Database Glossary (2)

A database can be of any size and of varying complexity.

IRS database

• Assume there are 100 million taxpayers.
• Each taxpayer files an average of 5 forms.
• Each form is approx. 200 chars.
• Assume also that the IRS keeps the past three returns for each taxpayer.

What is the size of the IRS’s database?

100 × 10^6 × 200 × 5 = 10^11 characters per filing year; counting the current return plus the past three, 4 × 10^11 characters = 400 gigabytes (at one byte per character)
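The size estimate can be checked with a few lines of Python. Two assumptions are spelled out in the code because the slide leaves them implicit: one character occupies one byte, and the 400-gigabyte figure counts the current return together with the past three (four returns in total).

```python
TAXPAYERS = 100 * 10**6      # 100 million taxpayers
FORMS_PER_TAXPAYER = 5       # average number of forms filed
CHARS_PER_FORM = 200         # approximate characters per form
BYTES_PER_CHAR = 1           # assumption: one byte per character
RETURNS_KEPT = 4             # assumption: current return + past three

bytes_per_year = TAXPAYERS * FORMS_PER_TAXPAYER * CHARS_PER_FORM * BYTES_PER_CHAR
total_bytes = bytes_per_year * RETURNS_KEPT
total_gigabytes = total_bytes / 10**9   # decimal gigabytes

print(f"{bytes_per_year:.0e} bytes per filing year")
print(f"{total_bytes:.0e} bytes in total = {total_gigabytes:.0f} GB")
```

Counting only the three past returns, without the current one, would give 3 × 10^11 bytes instead.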

Characteristics of the Database Approach

Self-describing nature of a database system

• A database system contains not only the database itself but also a definition or description of the database structure and constraints.
• The definition is stored in the system catalog, which contains information such as the structure of each file, the type and storage format of each data item, and various constraints on the data.
• The information stored in the catalog is called meta-data.
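SQLite offers a convenient way to see a catalog in practice. In this sketch, which uses Python's built-in sqlite3 module, the Taxtable scheme comes from this lesson, while the column types and the NOT NULL and CHECK constraints are assumptions added so that the catalog has something to record.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Taxtable ("
    "Income INTEGER NOT NULL, "
    "Tax INTEGER NOT NULL CHECK (Tax >= 0))"
)

# sqlite_master is SQLite's catalog table: it stores the definition
# of every table, index, and view in the database.
catalog = conn.execute("SELECT type, name, sql FROM sqlite_master").fetchall()

# PRAGMA table_info reports per-column meta-data:
# (cid, name, type, notnull, dflt_value, pk)
columns = conn.execute("PRAGMA table_info(Taxtable)").fetchall()
column_info = [
    (name, ctype, bool(notnull))
    for _cid, name, ctype, notnull, _default, _pk in columns
]

print(catalog)
print(column_info)
```

No application program had to be consulted to recover the structure; the database describes itself through its catalog.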

Characteristics of the Database Approach

Insulation between programs and data, and data abstraction

• In OO databases, users can define operations on data as part of the database definitions.
• An operation (also called a function) is specified in two parts: the interface (or signature) and the implementation.
• Data abstraction
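The split between an operation's interface and its implementation can be sketched in Python with an abstract base class. All names here (TaxOperations, InMemoryTaxOperations, total_income, report) are hypothetical, and the single record reuses figures from the earlier example table.

```python
from abc import ABC, abstractmethod
from typing import Dict, Tuple


class TaxOperations(ABC):
    """Interface (signature): what callers may rely on."""

    @abstractmethod
    def total_income(self, ssn: str) -> int:
        """Return the total income recorded for the given SSN."""


class InMemoryTaxOperations(TaxOperations):
    """One possible implementation; callers never depend on its internals."""

    def __init__(self, records: Dict[str, Tuple[int, int, int]]) -> None:
        self._records = dict(records)

    def total_income(self, ssn: str) -> int:
        wages, interest, capital_gain = self._records[ssn]
        return wages + interest + capital_gain


def report(ops: TaxOperations, ssn: str) -> str:
    # Application code is written against the interface only.
    return f"{ssn}: {ops.total_income(ssn)}"


ops = InMemoryTaxOperations({"123-45-6789": (100000, 3400, 0)})
print(report(ops, "123-45-6789"))
```

Because report depends only on the signature, the storage behind total_income could change without touching the application program, which is the insulation this slide describes.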

Characteristics of the Database Approach

Support multiple views of the data

• Dealing with raw data
• Many users = different perspectives or views of the database
• Facilities for multiple views

Characteristics of the Database Approach

Sharing of data and multiuser transaction processing

• A multiuser DBMS must allow multiple users to access the database at the same time.
• Concurrency control – to ensure that several users trying to update the same data do so in a controlled manner so that the result of the updates is correct.
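The need for controlled updates can be illustrated, in a much simplified form, with threads and a lock in Python. This is an analogy rather than a description of how any particular DBMS implements concurrency control; the Account class and the numbers are made up.

```python
import threading


class Account:
    def __init__(self, balance: int) -> None:
        self.balance = balance
        self._lock = threading.Lock()

    def deposit(self, amount: int) -> None:
        # The read-modify-write sequence runs under a lock, so two
        # concurrent deposits cannot overwrite each other's result.
        with self._lock:
            current = self.balance
            self.balance = current + amount


account = Account(0)


def worker() -> None:
    for _ in range(1000):
        account.deposit(1)


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(account.balance)
```

Without the lock, two threads could both read the same balance and each write back its own sum, losing one of the updates. Preventing that kind of anomaly is what concurrency control is for.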

Actors on the Scene

• Database administrators
• Database designers
• End users (casual end users, naïve or parametric end users, sophisticated end users, and stand-alone users)
• System analysts and application programmers or software engineers

Workers Behind the Scene

• DBMS system designers and implementers
• Tool developers
• Operators and maintenance personnel

Advantages of Using DBMS (1)

Controlling redundancy

Redundancy means storing the same data multiple times, which leads to several problems:

1. Duplication of effort
2. Waste of storage space
3. Inconsistent data

Advantages of Using DBMS (1)

Restricting unauthorized access

• A DBMS should provide security and authorization mechanisms that specify account restrictions.
• A DBMS should enforce these restrictions automatically.

Advantages of Using DBMS (1)

Providing persistent storage for program objects and data structures

• In OO database systems, an object is said to be persistent if it survives the termination of program execution and can later be retrieved by another program.
• Compatibility – OODBs offer data structures compatible with one or more OO programming languages.
• Traditional DB systems often suffer from the so-called impedance mismatch problem.
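Persistence of a program object can be imitated in a few lines of Python. The sketch writes an object to a JSON file and reads it back. It is a stand-in chosen for simplicity, not an object-oriented database, and the Taxpayer class and file name are hypothetical. The explicit conversion to and from a dictionary is a small taste of the mismatch between program objects and stored data.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass


@dataclass
class Taxpayer:
    ssn: str
    surname: str


path = os.path.join(tempfile.mkdtemp(), "taxpayer.json")

# "Program 1": create an object and store it before terminating.
original = Taxpayer("123-45-6789", "Doe")
with open(path, "w", encoding="utf-8") as f:
    json.dump(asdict(original), f)
del original  # the in-memory object is gone

# "Program 2": retrieve the object later from persistent storage.
with open(path, encoding="utf-8") as f:
    restored = Taxpayer(**json.load(f))

print(restored)
```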

Advantages of Using DBMS (1)

Permitting inferencing and actions using rules

• Some database systems provide capabilities for defining deduction rules for inferencing new information from the stored database facts.
• Such systems are called deductive database systems.
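A deduction rule can be sketched in Python as a loop that keeps deriving new facts until nothing changes. The parent facts and the ancestor rule below are hypothetical, and the naive iteration shown is just one simple evaluation strategy, not a claim about how deductive database systems work internally.

```python
# Stored facts (hypothetical): parent(X, Y) means X is a parent of Y.
parent = {("Ann", "Bob"), ("Bob", "Cid"), ("Cid", "Dee")}


def derive_ancestors(parent_facts):
    """Apply two rules until no new facts appear:
         ancestor(X, Y) if parent(X, Y)
         ancestor(X, Z) if parent(X, Y) and ancestor(Y, Z)
    """
    ancestor = set(parent_facts)
    while True:
        new = {
            (x, z)
            for (x, y) in parent_facts
            for (y2, z) in ancestor
            if y == y2
        }
        if new <= ancestor:   # nothing new was inferred: fixpoint reached
            return ancestor
        ancestor |= new


ancestors = derive_ancestors(parent)
print(sorted(ancestors))
```

Only three facts are stored, yet six ancestor facts can be inferred from them, including ("Ann", "Dee"), which appears nowhere in the stored data.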


Advantages of Using DBMS (2)

Providing multiple user interfaces

Representing complex relationships among data

Enforcing integrity constraints

Providing backup and recovery

Additional Advantages of Using DBMS (2)

• Potential for enforcing standards
• Reducing application development time
• Flexibility
• Availability of up-to-date information
• Economies of scale

Discussion Questions

T/F:

a. A view is a way of presenting data to a particular group of users.
b. Any relation can be presented by multiple views.
c. Arity = the number of columns in the relation.
d. An instance = any row of a relation.
e. A spatial database is a database that supports some aspect of time, not counting user-defined time.
f. Spatial data is in the form of two- or three-dimensional images.
g. Spatial data is any information about the location and shape of, and relationships among, geographic features. This includes remotely sensed data as well as map data.

Tasks for Next Lecture

Task 1: Data Modeling Using Entity-Relationship Model