V. Data Storage Management
Yunsheng Liu
Software College, HUST
2007.11
5.1 The Memory Hierarchy
5.1.1 The Storage Levels
[Figure: the storage hierarchy. From the CPU downward: Cache, Main Memory, Flash Memory (EEPROM, electrically erasable), Magnetic Disk, Optical Disk (CD-ROM), Magnetic Tape. The levels are grouped into primary, secondary and tertiary storage. Requests for data travel down the hierarchy; data satisfying the request travels back up.]
5.1.2 Disk Property
[Figure: structure of a disk. Labels: Tracks, Block, Cylinder, Platter, Sector, Gaps, Disk head, Disk arm, Spindle]
2. Performance properties of disks
1). Data must be in main memory for the DBMS to operate on it.
2). The unit of data transfer between main memory and disk is a block. Reading or writing a disk block is called an I/O.
3). Block access time is the time from when a read or write is issued to when the block appears in main memory:
access time = seek time + rotational delay + transfer time
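The access-time formula can be tried out with a few numbers. The following is a minimal sketch, not a model of any particular drive: the function name, its parameters, and the sample figures (9 ms seek, 7200 rpm, 4 KB block, 100 MB/s) are assumptions chosen for illustration, and the rotational delay is taken as half a rotation on average.

```python
def block_access_time_ms(seek_ms, rpm, block_bytes, transfer_mb_per_s):
    """access time = seek time + rotational delay + transfer time (all in ms)."""
    # Assumption: on average the head waits half a rotation for the block.
    rotational_delay_ms = 0.5 * 60_000 / rpm
    # Time to move the block itself once the head is over it.
    transfer_ms = block_bytes / (transfer_mb_per_s * 1_000_000) * 1000
    return seek_ms + rotational_delay_ms + transfer_ms


# Hypothetical drive: 9 ms seek, 7200 rpm, 4 KB block, 100 MB/s transfer rate.
t = block_access_time_ms(seek_ms=9.0, rpm=7200, block_bytes=4096,
                         transfer_mb_per_s=100)
print(f"{t:.2f} ms per block access")
```

With figures like these the transfer time is a small fraction of the total, which is why the optimizations on the following slide aim at the seek and rotational components.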
3. Performance measures of disks
- Capacity
- Access latency (seek time + rotational latency time)
- Data transfer rate
4. Optimization of disk-block access
- File organization
- Scheduling (disk-arm, i.e. I/O)
- Nonvolatile RAM for writing: battery-backed-up RAM
- Log disk: devoted to writing the log, in much the same way as nonvolatile RAM
- Log-based file system
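To illustrate why disk-arm scheduling matters, the sketch below compares serving requests in arrival order with an elevator-style order (the LOOK variant: sweep upward from the current cylinder, then back down). The slides do not say which scheduling policy is meant, so the policy, the function names, and the request queue are all assumptions used only as an example.

```python
def seek_distance(start, order):
    """Total number of cylinders the arm crosses when serving `order` in sequence."""
    total, pos = 0, start
    for cyl in order:
        total += abs(cyl - pos)
        pos = cyl
    return total


def elevator_order(start, requests):
    """Elevator-style (LOOK) order: serve cylinders at or above the head going up,
    then the remaining ones going down."""
    up = sorted(c for c in requests if c >= start)
    down = sorted((c for c in requests if c < start), reverse=True)
    return up + down


# Hypothetical request queue (cylinder numbers) with the head at cylinder 53.
queue = [98, 183, 37, 122, 14, 124, 65, 67]
print(seek_distance(53, queue))                      # arrival order
print(seek_distance(53, elevator_order(53, queue)))  # elevator order
```

Reordering the same eight requests cuts the arm movement by more than half in this example, which directly reduces the seek-time term of the access time.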
5.1.3 RAID Concept
RAID: Redundant Arrays of Independent (historically, Inexpensive) Disks
A disk array is an arrangement of several disks, organized so as to
- Increase performance: data striping
- Improve reliability: redundancy
RAID Levels
- Level 0: Nonredundant striping
- Level 1: Mirrored disks
- Level 2: Error-correcting code (ECC)
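The two ideas behind the first two levels, striping and mirroring, can be sketched in a few lines. This is an illustration only: the round-robin placement rule and the in-memory dictionaries standing in for disks are assumptions, not a description of any real RAID controller.

```python
def stripe_location(block_no, n_disks):
    """Level 0 style striping: logical block i goes to disk i mod D,
    at offset i div D on that disk (round-robin placement, assumed)."""
    return block_no % n_disks, block_no // n_disks


def mirrored_write(disks, block_no, data):
    """Level 1 style mirroring: every write goes to all copies."""
    for disk in disks:
        disk[block_no] = data


def mirrored_read(disks, block_no):
    """Read from the first copy that still holds the block."""
    for disk in disks:
        if block_no in disk:
            return disk[block_no]
    return None


# Striping: consecutive blocks land on different disks, so they can be read in parallel.
print([stripe_location(b, 4) for b in range(6)])

# Mirroring: the block survives the loss of one copy.
disk_a, disk_b = {}, {}
mirrored_write([disk_a, disk_b], 7, b"payload")
disk_a.clear()  # simulate failure of the first disk
print(mirrored_read([disk_a, disk_b], 7))
```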
5.2 Stored Data Management
5.2.1 Introduction
1. Stored data kinds: user database, data dictionary/directory, log
2. Stored data structures:
   Arrangement: sequential, random
   Connection: address adjacent, chaining
3. Access modes: sequential, indexed, hashing
4. I/O buffer management
5. Interface to OS
5.2.2 Storage Management Structures
1. Logical structure: logical file, page, record, field
2. Physical structure
   Stored structure: stored file, stored record, stored item
   Device structure: device, volume, cylinder, track, physical record/sector
3. Allocation structure: extent, block
4. Mapping From Logical Structure to Physical Structure
[Figure: mapping from logical structure to physical structure, in four columns:
Logical structure: Logic File, Page, Logic Record, Field
Stored structure: Stored File, Stored Record, Stored Item
Allocation structure: Extent, Block
Physical structure: Volume, Cylinder, Track, Sector]
5.2.3 Overview of File Organizations
1. Stored Data Arrangement
2. Access Modes
[Figure: taxonomy of file organizations (File Org.). Labels: Sequential File, Random File, Heap File, Sorted File, Indexed File (General Index File; Tree Index File: B-Tree, B+-Tree), Hash File (Static Hash, Dynamic Hash)]
3. Classification of File Organizations
[Figure: classification of file organizations by storage structure and access mode. Labels: Adjacent, Chained, Chain, Sequential, Indexed sequential, Tree-structural, Static hash, Dynamic hash; Sequential processing, Random processing; Heap, Sorted, Indexed, Hashed]
5.3 Sequential File Structure
- How to organize the blocks/pages of a file so as to support creating and destroying a file; getting, inserting and deleting a record; and scanning all records in the file
Adjacent (contiguous) arrangement of blocks
[Figure: a file header (Fid, P), data blocks 1 to N, and free blocks, laid out in frames 1 to N+m]
Problems: how to insert and delete? How many free slots/pages?
Example

Student
S#         SName   SAge  Dept
200103001  李红光  23    SW
200405840  何清溪  19    MS
200203123  刘要武  20    CS
200101015  李 光   22    EE
200203101  刘 民   20    CS
200305103  张一清  21    MS
200403123  张扬名  18    SW
200201123  王克勤  21    EE
(a). Natural Sequence Structure

Student
S#         SName   SAge  Dept
200101015  李 光   22    EE
200103001  李红光  23    SW
200201123  王克勤  21    EE
200203101  刘 民   20    CS
200203123  刘要武  20    CS
200305103  张一清  21    MS
200403123  张扬名  18    SW
200405840  何清溪  19    MS
(b). Ordered Sequence Structure
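The practical difference between the two structures shows up at lookup time. The sketch below is an illustration under assumptions: it keeps only the S#, SAge and Dept columns of the example, holds the records in Python lists rather than disk blocks, and the function names are hypothetical. A natural-sequence file has to be scanned, while an ordered file can be searched by binary search on S#.

```python
from bisect import bisect_left

# (S#, SAge, Dept) rows of the example, in natural (insertion) order.
natural = [
    (200103001, 23, "SW"), (200405840, 19, "MS"), (200203123, 20, "CS"),
    (200101015, 22, "EE"), (200203101, 20, "CS"), (200305103, 21, "MS"),
    (200403123, 18, "SW"), (200201123, 21, "EE"),
]
# Ordered sequence structure: the same records sorted on S#.
ordered = sorted(natural)


def scan_lookup(records, key):
    """Linear scan; returns (record, number of records examined)."""
    for examined, rec in enumerate(records, start=1):
        if rec[0] == key:
            return rec, examined
    return None, len(records)


def sorted_lookup(records, key):
    """Binary search on S#; only valid when `records` is sorted on S#."""
    keys = [rec[0] for rec in records]
    i = bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return records[i]
    return None


print(scan_lookup(natural, 200201123))    # worst case: the last record in the file
print(sorted_lookup(ordered, 200201123))
```

The price of the ordered structure is paid on insertion, where the order has to be maintained.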
5.4 Chained List File Structure
[Figure (a). Hybrid Chain: a file header (Fid, P) and a chain of data pages]
[Figure (b). Separated Chain: a file header (Fid, P1, P2) and two chains of data blocks]
- The chains need space for pointers
- With variable-length records, the list of full pages will be virtually empty
5.5 Index Structures
5.5.1 Overview of Indexes
1. Concepts
- An index is an auxiliary data structure intended to help us find the Rids of records with a given search key value
- An index is a file/collection of records, referred to as index entries, which are usually pairs (k, Rid), where Rid is a pointer to a record with search key value k
- An index is a KTA (Key to Address) mechanism
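The (k, Rid) idea can be made concrete with a small sketch. Everything here is an assumption for illustration: a Rid is modelled as a (page, slot) pair, the data file is a dictionary from Rid to record, and the index is simply a sorted list of entries.

```python
# Data file: Rid -> record, where a Rid is a hypothetical (page, slot) pair
# and a record is (S#, Dept).
data_file = {
    (0, 0): (200103001, "SW"),
    (0, 1): (200405840, "MS"),
    (0, 2): (200203123, "CS"),
    (1, 0): (200101015, "EE"),
    (1, 1): (200203101, "CS"),
}


def build_index(data_file, key_of):
    """Build index entries (k, Rid), sorted by search key value k."""
    return sorted((key_of(rec), rid) for rid, rec in data_file.items())


def lookup(index, k):
    """Key to Address: return the Rids of all records whose search key equals k."""
    return [rid for key, rid in index if key == k]


dept_index = build_index(data_file, key_of=lambda rec: rec[1])
rids = lookup(dept_index, "CS")
print(rids)
print([data_file[rid] for rid in rids])
```

The index is auxiliary in the sense that the data file is untouched: the same file can carry several such indexes on different search keys.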
2. Generic index structure
[Figure: indexing on a search key SK. For a value k in the domain of SK, the index holds the index entries (ki, ridi), (kj, ridj), (kr, ridr); the rids point to the records in the data file that have the value k of SK]
3. Index file organizations
How can index entries be organized to support rapid retrieval of the entries with a given search key value? For example:
- Sequential indexes
- Various tree-structured indexes
- Hash-based indexes (scatter tables)
5.5.2 Properties of Indexes
1. Clustered vs. Unclustered
Clustered: the ordering of data records is the same as (or close to) the ordering of index entries
- The two orderings match each other
Unclustered: the two orderings do not match each other
2. Dense vs. Sparse
Dense: one index entry for each individual data record
Sparse: one index entry for a set (usually a block/page) of data records
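The dense/sparse distinction can be sketched on a small sorted file. The page size, the key values and the function name below are assumptions for illustration. The dense index has one entry per record; the sparse index has one entry per page (the first key on the page) and relies on the data file being sorted on the search key.

```python
from bisect import bisect_right

PAGE_SIZE = 3  # records per page (hypothetical)
keys = [3, 5, 11, 13, 14, 22, 23, 52]  # data file sorted on the search key
pages = [keys[i:i + PAGE_SIZE] for i in range(0, len(keys), PAGE_SIZE)]

# Dense index: one entry per data record, key -> (page, slot).
dense = {k: (p, s) for p, page in enumerate(pages) for s, k in enumerate(page)}

# Sparse index: one entry per page, holding the first key stored on that page.
sparse = [page[0] for page in pages]


def sparse_lookup(key):
    """Find the last page whose first key is <= key, then search inside that page."""
    p = bisect_right(sparse, key) - 1
    if p < 0:
        return None  # key is smaller than every key in the file
    page = pages[p]
    return (p, page.index(key)) if key in page else None


print(len(dense), len(sparse))        # number of index entries in each index
print(dense[14], sparse_lookup(14))   # both locate the same record
```

The sparse index is much smaller, but it only works because the file order matches the index order, that is, on a clustered index.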
3. Primary vs. Secondary
Primary index: an index on the primary key
Secondary index: an index on a candidate/secondary key
4. Simple vs. Composite Key
Simple key: a single field
Composite key: more than one field
5.6 B-Tree Structured Indices
1. Nonleaf Node structure
[Figure: layout of a nonleaf node (root node and inner nodes): P0, K1, r1, P1, K2, r2, P2, ..., Pn-1, Kn, rn, Pn, where each ri points to the data record R(Ki)]
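2. Leaf Node Structure
[Figure: layout of a leaf node: K1, r1, K2, r2, ..., Km, rm, where each ri points to the data record R(Ki)]
3. A B-tree structure
[Figure: an example B-tree.
Root: 50
Inner nodes: (10, 20) and (60, 70)
Leaves: (3, 5), (11, 13, 14), (22, 23), (52, 54, 55), (61, 63, 65, 69), (74, 78)
Keys in nonleaf nodes point directly to their data records: R(10), R(20), R(50), R(60), R(70)]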
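Because every node of a B-tree stores record pointers alongside its keys, a search can stop as soon as the key is found, even in the root. The sketch below is a minimal illustration on the key values of the example tree (root 50; inner nodes 10, 20 and 60, 70; six leaves); the Node class and the search function are assumptions, and record pointers are left out so that only the navigation is shown.

```python
from bisect import bisect_left


class Node:
    """A B-tree node: sorted keys plus child pointers (no children for a leaf)."""

    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []


def search(node, key, visited=1):
    """Return how many nodes were read to find `key`, or None if it is absent."""
    i = bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return visited  # found in this node; a B-tree search can stop here
    if not node.children:
        return None     # reached a leaf without finding the key
    # Child i holds the keys between keys[i-1] and keys[i].
    return search(node.children[i], key, visited + 1)


root = Node([50], [
    Node([10, 20], [Node([3, 5]), Node([11, 13, 14]), Node([22, 23])]),
    Node([60, 70], [Node([52, 54, 55]), Node([61, 63, 65, 69]), Node([74, 78])]),
])

for k in (50, 20, 63, 64):
    print(k, search(root, k))
```

Keys stored high in the tree are found with fewer node reads than keys stored in the leaves.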
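5.7 B+-Tree Structured Indices
1. Nonleaf Node structure
[Figure: layout of a nonleaf node: P0, K1, P1, K2, P2, ..., Pn-1, Kn, Pn (keys and child pointers only, no record pointers)]
2. Leaf Node Structure
[Figure: layout of a leaf node: K1, r1, K2, r2, ..., Km, rm]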
3. A B+-tree structure
[Figure: a B+-tree over a data file. The B+-tree consists of the index set and the sequence set; the data file holds the record set. Labels: Random access, Sequential access]
5.8 Hashing File Structures
1. General Hashing Structure
[Figure: a hash function h maps a key ki to h(ki), which selects one of the blocks B0, B1, ..., Bn-1 in the major data area; each block consists of record slots, and there is a separate overflow area]
2. Bucket Hashing
[Figure: bucket hashing. Labels: Key, Hashing Function, KTA Transformation, Buckets (Bu1, Bu2, ..., Bun), Primary blocks]
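A minimal sketch of a static hashing file along these lines follows. It rests on several assumptions that the slides do not fix: the hash function is key mod n, each bucket has one primary block with a fixed number of record slots, and each bucket keeps its own overflow list (a shared overflow area would work equally well). The class and method names are hypothetical.

```python
class StaticHashFile:
    """Static hashing: a fixed number of buckets, each with one primary block
    of `slots_per_block` record slots plus an overflow list."""

    def __init__(self, n_buckets=4, slots_per_block=2):
        self.n = n_buckets
        self.slots = slots_per_block
        self.primary = [[] for _ in range(n_buckets)]
        self.overflow = [[] for _ in range(n_buckets)]

    def h(self, key):
        """KTA transformation: key -> bucket number (assumed: key mod n)."""
        return key % self.n

    def insert(self, key, record):
        b = self.h(key)
        # Use the primary block while it has a free slot, else the overflow list.
        if len(self.primary[b]) < self.slots:
            self.primary[b].append((key, record))
        else:
            self.overflow[b].append((key, record))

    def lookup(self, key):
        b = self.h(key)
        for k, record in self.primary[b] + self.overflow[b]:
            if k == key:
                return record
        return None


f = StaticHashFile(n_buckets=4, slots_per_block=2)
for key in (3, 7, 11, 8):
    f.insert(key, f"r{key}")
print(f.primary[3], f.overflow[3])  # keys 3, 7, 11 all hash to bucket 3
print(f.lookup(11), f.lookup(15))
```

A lookup reads only the one bucket selected by the hash function, at the cost of longer overflow lists as buckets fill up.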