1 Lecture 16: Data Storage Friday, November 5, 2004

Preview:

Citation preview

1

Lecture 16: Data Storage

Friday, November 5, 2004

2

Outline

SQL Security – 8.7

Part II: Database Implementation

Today: Data Storage– The memory hierarchy – 11.2– Disks – 11.3

3

Discretionary Access Control in SQL

GRANT privileges ON object TO users [WITH GRANT OPTIONS]GRANT privileges ON object TO users [WITH GRANT OPTIONS]

privileges = SELECT | INSERT(column-name) | DELETE | REFERENCES(column-name)object = table | attribute

4

Examples

GRANT INSERT, DELETE ON Customers TO Yuppy WITH GRANT OPTIONSGRANT INSERT, DELETE ON Customers TO Yuppy WITH GRANT OPTIONS

Queries allowed to Yuppy:

Queries denied to Yuppy:

INSERT INTO Customers(cid, name, address) VALUES(32940, ‘Joe Blow’, ‘Seattle’)

DELETE Customers WHERE LastPurchaseDate < 1995

INSERT INTO Customers(cid, name, address) VALUES(32940, ‘Joe Blow’, ‘Seattle’)

DELETE Customers WHERE LastPurchaseDate < 1995

SELECT Customer.addressFROM CustomerWHERE name = ‘Joe Blow’

SELECT Customer.addressFROM CustomerWHERE name = ‘Joe Blow’

5

Examples

GRANT SELECT ON Customers TO MichaelGRANT SELECT ON Customers TO Michael

No Michael can SELECT, but not INSERT or DELETE

6

Examples

GRANT SELECT ON Customers TO Michael WITH GRANT OPTIONSGRANT SELECT ON Customers TO Michael WITH GRANT OPTIONS

Michael can say this: GRANT SELECT ON Customers TO Yuppi

Now Yuppi can SELECT on Customers

7

Examples

GRANT UPDATE (price) ON Product TO LeahGRANT UPDATE (price) ON Product TO Leah

Leah can update, but only Product.price, but not (say) Product.name

8

Examples

GRANT REFERENCES (cid) ON Customer TO BillGRANT REFERENCES (cid) ON Customer TO Bill

Customer(cid, name, address, …..)Orders(. . ., cid, …) cid=foreign key

Customer(cid, name, address, …..)Orders(. . ., cid, …) cid=foreign key

Now Bill can INSERT tuples into Orders

Bill has INSERT/UPDATE rights to Orders.BUT HE CAN’T INSERT ! (why ?)

9

Views and Security

• David has SELECT rights on table Customers• John is a debt collector: should see the delinquent

customers only:

David says:

CREATE VIEW DelinquentCustomers SELECT * FROM Customers WHERE balance < -1000GRANT SELECT ON DelinquentCustomers TO John

David says:

CREATE VIEW DelinquentCustomers SELECT * FROM Customers WHERE balance < -1000GRANT SELECT ON DelinquentCustomers TO John

Views are an important security mechanism

10

Revokation

REVOKE [GRANT OPTION FOR] privileges ON object FROM users { RESTRICT | CASCADE }

REVOKE [GRANT OPTION FOR] privileges ON object FROM users { RESTRICT | CASCADE }

Administrator says:

REVOKE SELECT ON Customers FROM David CASCADEREVOKE SELECT ON Customers FROM David CASCADE

John loses SELECT privileges on DelinquentCustomers

11

Revocation

Joe: GRANT [….] TO Art …Art: GRANT [….] TO Bob …Bob: GRANT [….] TO Art …Joe: GRANT [….] TO Cal …Cal: GRANT [….] TO Bob …Joe: REVOKE [….] FROM Art CASCADE

Joe: GRANT [….] TO Art …Art: GRANT [….] TO Bob …Bob: GRANT [….] TO Art …Joe: GRANT [….] TO Cal …Cal: GRANT [….] TO Bob …Joe: REVOKE [….] FROM Art CASCADE

Same privilege,same object,

GRANT OPTION

What happens ??

12

Revocation

Admin

Joe Art

Cal Bob

0

1

234

5

Revoke

According to SQL everyone keeps the privilege

13

New Challenges in Data Security

• SQL Injection

• Data collusion

• View Leakage

14

Three Attacks

• SQL injectionChris Anley, Advanced SQL Injection In SQL

Server Applications, www.ngssoftware.com

• Latanya Sweeney’s finding

• Leakage in Views

15

SQL InjectionGo to your favorite shopping Website and login:

Normal use:

Search order by date: 9/15/04’; drop table user; --

Now this:

Search order by date: 9/15/04

Search order by date:

16

SQL Injection

• The DBMS works perfectly. So why is SQL injection possible so often ?

17

Latanya Sweeney’s Finding

• In Massachusetts, the Group Insurance Commission (GIC) is responsible for purchasing health insurance for state employees

• GIC has to publish the data:

GIC(zip, dob, sex, diagnosis, procedure, ...)GIC(zip, dob, sex, diagnosis, procedure, ...)

18

Latanya Sweeney’s Finding

• Sweeney paid $20 and bought the voter registration list for Cambridge Massachusetts:

GIC(zip, dob, sex, diagnosis, procedure, ...)VOTER(name, party, ..., zip, dob, sex)

GIC(zip, dob, sex, diagnosis, procedure, ...)VOTER(name, party, ..., zip, dob, sex)

19

Latanya Sweeney’s Finding

• William Weld (former governor) lives in Cambridge, hence is in VOTER

• 6 people in VOTER share his dob

• only 3 of them were man (same sex)

• Weld was the only one in that zip

• Sweeney learned Weld’s medical records !

zip, dob, sex

20

Latanya Sweeney’s Finding

• All systems worked as specified, yet an important data has leaked

• How do we protect against that ?

Some of today’s research in data security address breachesthat happen even if all systems work correctly

21

Leakage in Views

Employee(name, department, phone)

CREATE VIEW Emp SELECT name, department FROM Employee

CREATE VIEW Phons SELECT department, phone FROM Employee

CREATE VIEW Emp SELECT name, department FROM Employee

CREATE VIEW Phons SELECT department, phone FROM Employee

Want to keep secretthe employees’ phonenumbers

Does this work ?

Important leakage !

22

New Trend:Fine-grained Access Control

• SQL provides only coarse-grained control

• Hence, implemented by the application.

• BIG PROBLEMS:– Security policies checked at each user interface– Easy to get it wrong: SQL injection !

23

Policy Specification Language

CREATE AUTHORIZATION VIEW PatientsForDoctors AS SELECT Patient.* FROM Patient, Treats, Doctor WHERE Patient.pid = Treats.pid and Treats.did = Doctor.did and Doctor.uid = %userId and %accessMode in (‘local’, ‘ssh’)

CREATE AUTHORIZATION VIEW PatientsForDoctors AS SELECT Patient.* FROM Patient, Treats, Doctor WHERE Patient.pid = Treats.pid and Treats.did = Doctor.did and Doctor.uid = %userId and %accessMode in (‘local’, ‘ssh’)

[Oracle 7i], [Rizvi et al.2004]Context

parameters[Oracle]

Several policy languages for XML

(Too) many exists. The good ones re-use a declarative querylanguage, e.g. SQL, Xpath, XQuery

24

Enforcementby query analysis/modification

SELECT Patient.name, Patient.ageFROM PatientWHERE Patient.disease = ‘flu’

SELECT Patient.name, Patient.ageFROM PatientWHERE Patient.disease = ‘flu’

SELECT Patient.name, Patient.ageFROM Patient, Treats, DoctorWHERE Patient.disease = ‘flu’ and Patient.pid = Treats.pid and Treats.did = Doctor.did and Doctor.userID = %currentUser

SELECT Patient.name, Patient.ageFROM Patient, Treats, DoctorWHERE Patient.disease = ‘flu’ and Patient.pid = Treats.pid and Treats.did = Doctor.did and Doctor.userID = %currentUser

e.g. Oracle

25

Semantics

• The Truman Model: transform reality– ACCEPT all queries

– REWRITE queries

– Sometimes misleading results

• The non-Truman model: reject queries– ACCEPT or REJECT queries

– Execute query UNCHANGED

– Subtle semantics: instance dependent or independent

[Rizvi et al. SIGMOD 2004]

SELECT count(*)FROM Patients

SELECT count(*)FROM Patients

26

Part II of this Course:Database Implementation

Outline:

• Buffer manager

• Transaction manager (recovery, concurrency)

• Operator execution

• Optimizer

27

What Should a DBMS Do?

• Store large amounts of data

• Process queries efficiently

• Allow multiple users to access the database concurrently and safely.

• Provide durability of the data.

• How will we do all this??

28

Generic Architecture

Query compiler/optimizer

Execution engine

Index/record mgr.

Buffer manager

Storage manager

storage

User/Application

Queryupdate

Query executionplanRecord,

indexrequests

Page commands

Read/writepages

Transaction manager:•Concurrency control•Logging/recovery

Transactioncommands

29

The Memory HierarchyMain Memory Disk Tape

•Volatile•limited address spaces • expensive• average access time: 10-100 ns

• 10-40 MB/S transmission rates• 100s GB storage• average time to access a block: 10-15 msecs.• Need to consider seek, rotation, transfer times.• Keep records “close” to each other.

• 1.5 MB/S transfer rate• 280 GB typical capacity• Only sequential access• Not for operational data

Cache: access time 10 nano’s

30

Main Memory

• Fastest, most expensive

• Today: 2GB are common on PCs

• Many databases could fit in memory– New industry trend: Main Memory Database– E.g TimesTen, DataBlitz

• Main issue is volatility– Still need to store on disk

31

Secondary Storage

• Disks

• Slower, cheaper than main memory

• Persistent !!!

• Used with a main memory buffer

32

How Much Storage for $200

33

Buffer Management in a DBMS

• Data must be in RAM for DBMS to operate on it!• Table of <frame#, pageid> pairs is maintained.• LRU is not always good.

DB

MAIN MEMORY

DISK

disk page

free frame

Page Requests from Higher Levels

BUFFER POOL

choice of frame dictatedby replacement policy

34

Buffer Manager

Manages buffer pool: the pool provides space for a limited number of pages from disk.

Needs to decide on page replacement policy• LRU• Clock algorithm

Enables the higher levels of the DBMS to assume that theneeded data is in main memory.

35

Buffer Manager

Why not use the Operating System for the task??

- DBMS may be able to anticipate access patterns- Hence, may also be able to perform prefetching- DBMS needs the ability to force pages to disk.

36

Tertiary Storage

• Tapes or optical disks

• Extremely slow: used for long term archiving only

37

The Mechanics of Disk

Mechanical characteristics:

• Rotation speed (5400RPM)

• Number of platters (1-30)

• Number of tracks (<=10000)

• Number of bytes/track(105)

Platters

Spindle

Disk head

Arm movement

Arm assembly

Tracks

Sector

Cylinder

38

Disk Access Characteristics

• Disk latency = time between when command is issued and when data is in memory

• Disk latency = seek time + rotational latency– Seek time = time for the head to reach cylinder

• 10ms – 40ms

– Rotational latency = time for the sector to rotate• Rotation time = 10ms• Average latency = 10ms/2

• Transfer time = typically 40MB/s• Disks read/write one block at a time (typically 4kB)

39

Average Seek Time

Suppose we have N tracks, what is the average seek time ?

• Getting from cylinder x to y takes time |x-y|

3)

66(

1

)2

)(

2(

1

))()((1

||1

33

2

0

22

2

0 02

0 02

NNN

N

dxxNx

N

dxdyxydyyxN

dxdyyxN

N

N N

x

x

N N

=+

=−

+

=−+−

=−

∫ ∫∫

∫∫