Upload
taryn-ralphs
View
227
Download
2
Tags:
Embed Size (px)
Citation preview
Database Management Systems 1 Ramakrishnan & Gehrke
Introduction to Database Systems
Chapter 1
Instructor: Mirsad Hadzikadic
Database Management Systems 2 Ramakrishnan & Gehrke
http://www.sigmod.org/record/issues/0606/index.html
Database Management Systems 3 Ramakrishnan & Gehrke
History
60s C. Bachman GE, network data model, CODASYL Late 60s IBM IMS, hierarchical data model, SABRE, AA-IBM 70 Edgar Codd, IBM, relational model 80s SQL, IBM, System R project, concurrent
transactions management, J. Gray Late 80-90s DB2, Oracle, Informix, Sybase ERP, MRP Baan, Oracle, PeopleSoft, SAP, Siebel
Common tasks – inventory, HR, financial analysis 90s DW, Internet Object-oriented, object-relational DBs Turing award and Turing test?
Database Management Systems 4 Ramakrishnan & Gehrke
Why Study Databases?
Data everywhere Shift from computation to information
– At the “low end:” scramble to Web space– At the “high end:” scientific applications
Datasets increasing in diversity and volume – Digital libraries, interactive video, Human Genome
project, EOS project – ... need for DBMS exploding
DBMS encompasses most of CS– OS, languages, theory, “A”I, multimedia, logic
Database Management Systems 5 Ramakrishnan & Gehrke
What Is a DBMS?
Files vs. Database A very large, integrated collection of data. Models real-world enterprise
– Entities (e.g., students, courses)– Relationships (e.g., Madonna is taking ITCS
6160) A Database Management System (DBMS)
is a software package designed to maintain and utilize databases
Database Management Systems 6 Ramakrishnan & Gehrke
Why Use a DBMS?
Data independence and efficient access Data integrity and security Uniform data administration Concurrent access, recovery from
crashes Reduced application development time When not to use a DB?
Database Management Systems 7 Ramakrishnan & Gehrke
Data Models A data model is a collection of concepts for
describing data A schema is a description of a particular
collection of data, using the given data model The relational model of data is the most
widely used model today– Main concept: relation, basically a table with rows
and columns– Every relation has a schema, which describes the
columns, or fields
Database Management Systems 8 Ramakrishnan & Gehrke
Levels of Abstraction
Many (external) views, single conceptual (logical) schema and physical schema.– Views describe how
users see the data
– Conceptual schema defines logical structure
– Physical schema describes the files and indexes used Views and Schemas are defined using DDL; data is modified/queried using DML
Physical Schema
Conceptual Schema
View 1 View 2 View 3
Database Management Systems 9 Ramakrishnan & Gehrke
Example: University Database
Conceptual schema: – Students(sid: string, name: string, login: string,
age: integer, gpa:real)– Courses(cid: string, cname:string, credits:integer) – Enrolled(sid:string, cid:string, grade:string)
Physical schema:– Relations stored as unordered files – Index on first column of Students
External Schema (View): – Course_info(cid:string, enrollment:integer)
Database Management Systems 10 Ramakrishnan & Gehrke
Data Independence
Applications insulated from how data is structured and stored
Logical data independence: Protection from changes in logical structure of data
Physical data independence: Protection from changes in physical structure of data
One of the most important benefits of using a DBMS!
Database Management Systems 11 Ramakrishnan & Gehrke
Structure of a DBMS
A typical DBMS has a layered architecture
The figure does not show the concurrency control and recovery components
This is one of several possible architectures; each system has its own variations
Query Optimizationand Execution
Relational Operators
Files and Access Methods
Buffer Management
Disk Space Management
DB
These layersmust considerconcurrencycontrol andrecovery
Database Management Systems 12 Ramakrishnan & Gehrke
Transaction Management: ACID properties
AAtomicity: All actions in the Xact happen, or none happen
CConsistency: If each Xact is consistent, and the DB starts consistent, it ends up consistent
IIsolation: Execution of one Xact is isolated from that of other Xacts
DDurability: If a Xact commits, its effects persist
The Recovery Manager guarantees Atomicity & Durability
Database Management Systems 13 Ramakrishnan & Gehrke
Motivation of concurrency control
Consistency Isolation Example
– Two parallel transactions T1 and T2– Serial execution– Execution with interleaving actions– Example
Database Management Systems 14 Ramakrishnan & Gehrke
Motivation of recovery management
Atomicity: – Transactions may abort (“Rollback”)
Durability:– What if DBMS stops running? (Causes?)
crash! Desired Behavior after
system restarts:– T1, T2 & T3 should be
durable– T4 & T5 should be
aborted (effects not seen)
T1T2T3T4T5
Database Management Systems 15 Ramakrishnan & Gehrke
Databases make these folks happy ...
End users and DBMS vendors DB application programmers
– E.g. smart webmasters Database administrator (DBA)
– Designs logical /physical schemas– Handles security and authorization– Data availability, crash recovery – Database tuning as needs evolve
Must understand how a DBMS works!
Database Management Systems 16 Ramakrishnan & Gehrke
Summary DBMS is used to maintain, query large datasets Benefits include recovery from system crashes,
concurrent access, quick application development, data integrity and security
Levels of abstraction give data independence A DBMS typically has a layered architecture DBAs hold responsible jobs and are
well-paid! DBMS R&D is one of the broadest,
most exciting areas in CS