29
CS 4432 1 CS4432: Database Systems II Logical Plan Rewriting

CS 44321 CS4432: Database Systems II Logical Plan Rewriting

Embed Size (px)

Citation preview

Page 1: CS 44321 CS4432: Database Systems II Logical Plan Rewriting

CS 4432 1

CS4432: Database Systems II

Logical Plan Rewriting

Page 2: CS 44321 CS4432: Database Systems II Logical Plan Rewriting

CS 4432 query processing 2

parse

convert

apply laws

estimate result sizes

consider physical plans estimate costs

pick best

execute

{P1,P2,…..}

{(P1,C1),(P2,C2)...}

Pi

answer

SQL query

parse tree

logical query plan

“improved” l.q.p

l.q.p. +sizes

statistics

Page 3: CS 44321 CS4432: Database Systems II Logical Plan Rewriting

CS 4432 3

Query in SQL Query Plan in Algebra (logical) Other Query Plan in Algebra (logical)

Page 4: CS 44321 CS4432: Database Systems II Logical Plan Rewriting

CS 4432 4

Query plan 1 (in relational algebra)

B,D

R.A=“c” S.E=2 R.C=S.C

XR S

Page 5: CS 44321 CS4432: Database Systems II Logical Plan Rewriting

CS 4432 5

Query plan 2 (in relational algebra)

B,D

R.A = “c” S.E = 2

R S

natural join on R.C=S.C

Page 6: CS 44321 CS4432: Database Systems II Logical Plan Rewriting

CS 4432 6

Relational algebra optimization

• What are transformation rules ?– preserve equivalence

• What are good transformations?– reduce query execution costs

Page 7: CS 44321 CS4432: Database Systems II Logical Plan Rewriting

CS 4432 7

Rules: Natural join rewriting.

R S = S R(R S) T = R (S T)

R S S T

T R

Can also write as trees, e.g.:

Page 8: CS 44321 CS4432: Database Systems II Logical Plan Rewriting

CS 4432 8

Rules: Other binary operators ?

R S = S R(R S) T = R (S T)

What about :• Cross product?• Condition join?• Union?• Intersection ?• Difference ?

Page 9: CS 44321 CS4432: Database Systems II Logical Plan Rewriting

CS 4432 11

Rules: Selects

p1p2(R)= p1 [ p2 (R)]

[ p1 (R)] U [ p2 (R)]

p1vp2(R) =

Page 10: CS 44321 CS4432: Database Systems II Logical Plan Rewriting

CS 4432 12

Bags vs. Sets

R = {a,a,b,b,b,c}S = {b,b,c,c,d}What about union R U S = ?

• Option 1 SUMR U S = {a,a,b,b,b,b,b,c,c,c,d}

• Option 2 MAXR U S = {a,a,b,b,b,c,c,d}

Page 11: CS 44321 CS4432: Database Systems II Logical Plan Rewriting

CS 4432 13

Which option makes this rule work ?

p1vp2 (R) = p1(R) U p2(R)

Example: R={a,a,b,b,b,c}P1 satisfied by a,b; P2 satisfied by b,c

p1vp2 (R) = {a,a,b,b,b,c}

p1(R) = {a,a,b,b,b}

p2(R) = {b,b,b,c}

p1(R) U p2 (R) = {a,a,b,b,b,c}

Let us try MAX():

Page 12: CS 44321 CS4432: Database Systems II Logical Plan Rewriting

CS 4432 14

Which option makes this rule work ?

p1vp2 (R) = p1(R) U p2(R)

Example: R={a,a,b,b,b,c}P1 satisfied by a,b; P2 satisfied by b,c

p1vp2 (R) = {a,a,b,b,b,c}

p1(R) = {a,a,b,b,b}

p2(R) = {b,b,b,c}

p1(R) U p2 (R) = {a,a,b,b,b,b,b,b,c}

What about Sum()?

Page 13: CS 44321 CS4432: Database Systems II Logical Plan Rewriting

CS 4432 15

Which option makes this rule work ?

p1p2(R)= p1 [ p2 (R)]

Example: R={a,a,b,b,b,c}P1 satisfied by a,b; P2 satisfied by b,c

What about MAX versus SUM ?

Page 14: CS 44321 CS4432: Database Systems II Logical Plan Rewriting

CS 4432 17

Yet another example !

Senators (……) Reps (……)

T1 = yr,state Senators; T2 = yr,state Reps

T1 Yr State T2 Yr State 97 CA 99 CA 99 CA 99 CA 98 AZ 98 CA

Union?

“Sum” option makes more sense!

Page 15: CS 44321 CS4432: Database Systems II Logical Plan Rewriting

CS 4432 18

Executive Decision

-> Use “SUM” option for bag unions

-> CAREFUL ! Some rules cannot be used for

bags

Page 16: CS 44321 CS4432: Database Systems II Logical Plan Rewriting

CS 4432 19

Rules: Project

Let: X = set of attributesY = set of attributesXY = X U Y

xy (R) = x [y (R)]

Page 17: CS 44321 CS4432: Database Systems II Logical Plan Rewriting

CS 4432 20

Let p = predicate with only R attributes q = predicate with only S attributes m = predicate with both R and S attribs

p (R S) =

q (R S) =

Rules: combined

[p (R)] S

R [q (S)]

Page 18: CS 44321 CS4432: Database Systems II Logical Plan Rewriting

CS 4432 21

pq (R S) = ?

Rules: combined

Rule can be derived !

Page 19: CS 44321 CS4432: Database Systems II Logical Plan Rewriting

CS 4432 22

Derivation for rule :

pq (R S) =

p [q (R S) ] =

p [ R q (S) ] =

[p (R)] [q (S)]

Page 20: CS 44321 CS4432: Database Systems II Logical Plan Rewriting

CS 4432 23

More Rules can be Derived:

pq (R S) =

pqm (R S) =

pvq (R S) =

Rules: combined (continued)

Page 21: CS 44321 CS4432: Database Systems II Logical Plan Rewriting

CS 4432 24

We did one, do others on your own :

pq (R S) = [p (R)] [q (S)]

pqm (R S) =

m [(p R) (q S)]pvq (R S) =

[(p R) S] U [R (q S)]

Page 22: CS 44321 CS4432: Database Systems II Logical Plan Rewriting

CS 4432 25

Rules: combined

Let x = subset of R attributes z = attributes in predicate P

(subset of R attributes)

x[p (R) ] = {p [ x

(R) ]} x

xz

Page 23: CS 44321 CS4432: Database Systems II Logical Plan Rewriting

CS 4432 26

Rules: combined

Let x = subset of R attributes y = subset of S attributes

z = intersection of R,S attributes

xy (R S) = xy{[xz (R) ] [yz

(S) ]}

Page 24: CS 44321 CS4432: Database Systems II Logical Plan Rewriting

CS 4432 29

Which are “good” transformations?

Page 25: CS 44321 CS4432: Database Systems II Logical Plan Rewriting

CS 4432 30

Conventional wisdom: do projects early

Example: relation R(A,B,C,D,E) predicate P: (A=3) (B=“cat”)

E {p (R)} vs. E {p{ABE(R)}}

Page 26: CS 44321 CS4432: Database Systems II Logical Plan Rewriting

CS 4432 31

What if we have A, B indexes?

B = “cat” A=3

Intersect pointers to get

pointers to matching tuples!

But

Then better to do projection later !

Page 27: CS 44321 CS4432: Database Systems II Logical Plan Rewriting

CS 4432 32

p1p2 (R) p1 [p2 (R)]

p (R S) [p (R)] S

R S S R

x [p (R)] x {p [xz (R)]}

Which are “good” transformations?

Page 28: CS 44321 CS4432: Database Systems II Logical Plan Rewriting

CS 4432 33

Bottom line:

• Some heuristics : – Early selection is usually good

• No transformation is always good

• Rule application defines a search space– Need cost criteria to make decision

Page 29: CS 44321 CS4432: Database Systems II Logical Plan Rewriting

CS 4432 34

In textbook: more transformations

• Chapter 16.2, 16.3.3

• More rewrite rules

• Other operations, such as, duplicate elimination, etc.