Index Interactions in Physical Design Tuning: Modeling, Analysis, and Applications

Karl Schnaitter, UC Santa Cruz
Neoklis Polyzotis, UC Santa Cruz
Lise Getoor, Univ. of Maryland

VLDB 2009, Lyon, France

Index Selection

• Index selection problem:
  – Given a query workload
  – Choose indices that improve workload performance
• Does index benefit depend on other indices?
  – If so, this is called index interaction
• Index “benefit” is a key concept
  – Informally, for an index i,
    [benefit of i] = [exec cost without i] − [exec cost with i]
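The informal definition can be illustrated with a minimal Python sketch. The cost table and the `benefit` helper below are assumptions made up for illustration; they are not figures from the talk.

```python
# Hypothetical execution costs of one query under different index sets.
# The numbers are invented for illustration only.
COSTS = {
    frozenset(): 100.0,
    frozenset({"a"}): 60.0,
    frozenset({"b"}): 70.0,
    frozenset({"a", "b"}): 55.0,
}

def benefit(index, others=frozenset()):
    """[exec cost without index] - [exec cost with index], given `others`."""
    others = frozenset(others)
    return COSTS[others] - COSTS[others | {index}]

print(benefit("a"))         # benefit of a when no other index exists
print(benefit("a", {"b"}))  # benefit of a when b already exists
```

In this toy table the two values differ, so the benefit of a depends on whether b is present, which is exactly the situation the slide calls index interaction.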

Related Work

• Interactions are a key concern in physical tuning
  – [Whang et al. 1981] make assumptions implying that indices on different tables do not interact
  – [Finkelstein et al. 1988] assume that indices do not interact if they are relevant to separate queries
  – [Bruno and Chaudhuri 2007] explicitly account for some interactions in on-line index selection
  – Many more…
• These studies treat interactions as a secondary issue, and often rely on ad hoc assumptions

Index Interactions

• Let S be a set of indices relevant to a query Q
• cost(X) = cost of Q if only X ⊆ S is available
• benefit(Y, X) = cost(X) − cost(Y ∪ X)

[Figure: cost(X), cost(X ∪ {a}), cost(X ∪ {b}), and cost(X ∪ {a,b}), with benefit({a}, X) and benefit({a}, X ∪ {b}) marked]

Indices a,b are independent with respect to X

Index Interactions

[Figure: cost(X), cost(X ∪ {a}), cost(X ∪ {b}), and cost(X ∪ {a,b}), with benefit({a}, X) and benefit({a}, X ∪ {b}) marked]

Indices a,b positively interact with respect to X

Index Interactions

[Figure: cost(X), cost(X ∪ {a}), cost(X ∪ {b}), and cost(X ∪ {a,b}), with benefit({a}, X) and benefit({a}, X ∪ {b}) marked]

Indices a,b negatively interact with respect to X

Degree of Interaction

• doi(a,b,X) = degree of interaction between a,b with respect to X

    doi(a,b,X) = |benefit({a},X) − benefit({a},X ∪ {b})| / cost(X ∪ {a,b})
               = |cost(X ∪ {a}) − cost(X) − cost(X ∪ {a,b}) + cost(X ∪ {b})| / cost(X ∪ {a,b})

• doi is symmetric: doi(a,b,X) = doi(b,a,X)
• doi(a,b) = max over X ⊆ S of doi(a,b,X)
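As a rough sketch of the definition, the Python below evaluates doi(a,b,X) and its maximum over all X ⊆ S for a small cost table. It assumes the numerator is taken in absolute value (suggested by the later worked example, where a numerator of −35 over a cost of 20 yields 1.75). The cost numbers and the helper names `subsets`, `doi_given`, and `doi` are illustrative assumptions.

```python
from itertools import combinations

def subsets(S):
    """Yield every subset of S as a frozenset."""
    items = sorted(S)
    return (frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r))

def doi_given(cost, a, b, X):
    """Degree of interaction between a and b with respect to index set X.
    Assumption: the numerator is an absolute value."""
    X = frozenset(X)
    numerator = abs(cost(X | {a}) - cost(X) - cost(X | {a, b}) + cost(X | {b}))
    return numerator / cost(X | {a, b})

def doi(cost, a, b, S):
    """Maximum of doi_given over all X that are subsets of S."""
    return max(doi_given(cost, a, b, X) for X in subsets(S))

# Hypothetical cost table for S = {a, b}; the numbers are invented.
TOY = {
    frozenset(): 100.0,
    frozenset("a"): 60.0,
    frozenset("b"): 70.0,
    frozenset("ab"): 55.0,
}
cost = TOY.__getitem__

print(doi(cost, "a", "b", {"a", "b"}))
```

Any X that already contains a or b contributes zero, so only X = ∅ matters in this toy case; swapping the roles of a and b gives the same value, in line with the symmetry noted on the slide.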

Problem Statement

• Which indices in S interact?
• How strong are the interactions?
• The Degree of Interaction Problem:
    Compute doi(a,b) for all a,b ∈ S

Outline

• Properties of Query Optimization
• Degree of Interaction Algorithm
• Applying Interaction Information

Query Optimization

• Computing doi(a,b) is not practical if the optimizer is totally arbitrary
  – Need to compute doi(a,b,X) for all X ⊆ S
• In practice, query optimization is not arbitrary
  – E.g., we expect cost(∅) ≥ cost({a})
• We put mild assumptions on query optimization:
  – Plans are selected from some fixed space P
  – Optimizer chooses the cheapest feasible plan from P
  – Ties are broken consistently

Index Benefit Graph

• An Index Benefit Graph (IBG) encodes the selection of optimal plans for a query
  – Introduced by [Frank, Omiecinski, and Navathe 1992]
• Example IBG when S = {a,b,c,d}

[Figure: example IBG. Each node is an index set, annotated with the indices used in the optimal plan and the cost of that plan.]

  – There are 16 subsets of S
  – IBG has 8 nodes
  – But IBG can compute cost(X) for all X ⊆ S
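One way to see how a small IBG can answer cost(X) for every subset is the following Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the plan space `PLANS`, the toy `optimize` function, and the helpers `build_ibg` and `ibg_cost` are all hypothetical. The optimizer picks the cheapest feasible plan from a fixed space and breaks ties consistently, matching the assumptions stated for query optimization.

```python
from itertools import combinations

# Hypothetical plan space P: (indices the plan requires, plan cost).
PLANS = [
    (frozenset(), 100.0),
    (frozenset("a"), 60.0),
    (frozenset("b"), 70.0),
    (frozenset("ab"), 40.0),
    (frozenset("c"), 90.0),
]

def optimize(available):
    """Toy optimizer: cheapest feasible plan; ties go to the earliest plan."""
    feasible = [(cost, pos, req) for pos, (req, cost) in enumerate(PLANS)
                if req <= available]
    cost, _, used = min(feasible, key=lambda p: (p[0], p[1]))
    return cost, used

def build_ibg(S, optimize):
    """Each node Y stores (cost, used); Y gets a child Y - {a} per used index a."""
    root = frozenset(S)
    nodes, stack = {}, [root]
    while stack:
        Y = stack.pop()
        if Y in nodes:
            continue
        cost, used = optimize(Y)
        nodes[Y] = (cost, used)
        stack.extend(Y - {a} for a in used)
    return root, nodes

def ibg_cost(root, nodes, X):
    """Walk down from the root until the node's plan uses only indices in X."""
    X = frozenset(X)
    Y = root
    while True:
        cost, used = nodes[Y]
        missing = used - X
        if not missing:
            return cost  # this plan is feasible with X, and X is a subset of Y
        Y = Y - {min(missing)}  # drop one used index that X lacks

S = frozenset("abcd")
root, nodes = build_ibg(S, optimize)
all_subsets = [frozenset(c) for r in range(len(S) + 1)
               for c in combinations(sorted(S), r)]
print(len(nodes), "IBG nodes answer cost(X) for", len(all_subsets), "subsets")
print(ibg_cost(root, nodes, {"c"}))
```

The lookup is valid because a node Y reached this way always contains X, so a plan that is optimal for Y and uses only indices in X is also optimal for X. The optimizer is called once per IBG node rather than once per subset.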

Naive Algorithm

• Recall that we want the degree of interaction between all pairs of indices in S
• Each doi(a,b) may be computed directly

    For all a,b ∈ S
        Initialize T[a,b] = 0
        For all X ⊆ S
            Let d = |cost(X ∪ {a}) − cost(X) − cost(X ∪ {a,b}) + cost(X ∪ {b})| / cost(X ∪ {a,b})
            Assign T[a,b] = max(d, T[a,b])

• Upon termination, T[a,b] = doi(a,b) for all a,b
• Can save time using an IBG as a cache of the cost function
• Downside: iteration over all subsets of S
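The naive procedure can be sketched in Python as follows. This is a minimal illustration, assuming the numerator of d is an absolute value; the plan space `PLANS`, the `cost` function, and the name `naive_doi_table` are hypothetical and exist only to make the example runnable.

```python
from itertools import combinations

# Hypothetical plan space: (indices the plan requires, plan cost).
PLANS = [
    (frozenset(), 100.0),
    (frozenset("a"), 60.0),
    (frozenset("b"), 70.0),
    (frozenset("ab"), 40.0),
    (frozenset("c"), 90.0),
]

def cost(X):
    """Cost of the cheapest plan whose required indices are all in X."""
    X = frozenset(X)
    return min(c for req, c in PLANS if req <= X)

def subsets(S):
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def naive_doi_table(S, cost):
    """T[a,b] = max over all X subset of S of doi(a,b,X); exponential in |S|."""
    T = {}
    for a, b in combinations(sorted(S), 2):
        T[a, b] = 0.0
        for X in subsets(S):
            d = abs(cost(X | {a}) - cost(X)
                    - cost(X | {a, b}) + cost(X | {b})) / cost(X | {a, b})
            T[a, b] = max(d, T[a, b])
    return T

for pair, value in naive_doi_table({"a", "b", "c"}, cost).items():
    print(pair, round(value, 3))
```

The inner loop visits all 2^|S| subsets for every pair, which is the downside the slide points out; memoizing `cost` reduces repeated optimizer calls but not the number of subsets visited.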

The QINTERACT Algorithm

• We should avoid evaluating doi(a,b,X) for all X ⊆ S
• QINTERACT algorithm processes two index sets per IBG node

Naive Algorithm (condensed)

    For all a,b ∈ S
        Initialize T[a,b] = 0
        For all X ⊆ S
            Assign T[a,b] = max(doi(a,b,X), T[a,b])

QINTERACT Algorithm

    For all a,b ∈ S
        Initialize T[a,b] = 0
        For all IBG nodes Y
            Construct two index sets X1, X2 ⊆ S (see paper)
            Assign T[a,b] = max(doi(a,b,X1), doi(a,b,X2), T[a,b])

QINTERACT Example

• Let’s calculate doi(a,b) on the graph below
• What happens on iteration Y = {u}?

[Figure: IBG with nodes labeled a b u v = 20, a u v = 30, b u v = 30, a u = 40, u v = 40, b v = 40, u = 50, v = 50, with node Y highlighted; beside it, a diagram of cost(∅), cost(a), cost(b), cost(ab), cost(u), cost(ua), cost(ub), cost(aub)]

    X1 = {u}:  doi(a,b,X1) = |40 − 50 − 20 + 30| / 20 = 0
    X2 = ∅:    doi(a,b,X2) = |40 − 50 − 20 + 40| / 20 = 0.5

Interleaved IBG Processing

• In QINTERACT, the IBG is built, then analyzed
  – I.e., IBG construction and analysis are serial
• We can discover interactions in a partial IBG
• IBG construction and analysis may be interleaved
  – Improves accuracy of doi over time

[Figure: partially constructed IBG over {a,b,c,d}, with the nodes not yet built shown as “. . .”]

    doi(b,d,{a,c}) = |45 − 80 − 20 + 20| / 20 = 1.75

Outline

• Properties of Query Optimization
• Degree of Interaction Algorithm
• Applying Interaction Information
  – Visualizing Index Interactions
  – Scheduling Index Creation

Visualizing Index Interactions

• We can visualize the doi function as a graph
  – Nodes correspond to indices
  – Edge between a and b has weight doi(a,b)

[Figure: interaction graph for TPC-H Query 7. Nodes: O(CK,OK), C(CK,NK), C(NK,CK), LI(SK,SD,D,EP,OK), LI(SD,D), LI(SD,Q), S(NK,N,SK), S(NK,SK), S(SK,NK). Edge weights range from 0.01 to 0.09.]

Interaction Graph

• The connected components have special meaning
  1. The benefit of any X ⊆ Ci does not depend on S − Ci
  2. Refining the partition loses property (1)
  3. This is the only partition with properties (1) and (2)

[Figure: interaction graph with three connected components C1, C2, C3]
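The connected components of an interaction graph can be computed with a plain graph traversal. The Python sketch below is illustrative only: the doi values, the index names, and the function `interaction_components` are assumptions. It also accepts an optional threshold `tau`, so that edges with doi at or below the threshold are ignored.

```python
def interaction_components(indices, doi_table, tau=0.0):
    """Connected components of the graph with an edge wherever doi > tau."""
    adjacent = {i: set() for i in indices}
    for (a, b), d in doi_table.items():
        if d > tau:
            adjacent[a].add(b)
            adjacent[b].add(a)
    seen, components = set(), []
    for start in sorted(indices):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:  # depth-first traversal from `start`
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacent[node] - component)
        seen |= component
        components.append(component)
    return components

# Hypothetical doi values; pairs not listed are assumed to have doi = 0.
DOI = {("a", "b"): 0.30, ("b", "c"): 0.05, ("d", "e"): 0.20}
INDICES = {"a", "b", "c", "d", "e"}

print(interaction_components(INDICES, DOI))           # every nonzero edge
print(interaction_components(INDICES, DOI, tau=0.1))  # ignore minor edges
```

With `tau = 0` the components correspond to the partition described on the slide. Raising `tau` splits components further, at the price that property (1) then holds only approximately.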

Scheduling Index Creation

• Suppose we want to materialize new indices
• In what order should they be created?

[Figure: three plots of benefit against materialized indices, one per schedule: Schedule = a,b,c (∅, a, a,b, a,b,c); Schedule = b,a,c (∅, b, a,b, a,b,c); Schedule = c,a,b (∅, c, a,c, a,b,c)]

Choose first schedule to maximize benefit over time (shaded area)

Scheduling Index Creation

• We define an optimization problem
  – M = preexisting indices
  – {a1, …, an} = new indices to create
  – Permute new indices as t1, …, tn to maximize
        Σ_{i=1..n} benefit({t1, …, ti}, M)
• This problem is computationally hard
  – There is a connection to the Set Cover problem, since each new index “covers” more benefit
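For a handful of indices, the scheduling objective can be evaluated by brute force. The Python below is a sketch under assumptions: the plan space `PLANS`, the `cost` function, and the helpers `schedule_value` and `best_schedule` are hypothetical, and exhaustive search over permutations is only feasible for very small n.

```python
from itertools import permutations

# Hypothetical plan space: (indices the plan requires, plan cost).
PLANS = [
    (frozenset(), 100.0),
    (frozenset("a"), 60.0),
    (frozenset("b"), 70.0),
    (frozenset("ab"), 40.0),
    (frozenset("c"), 90.0),
]

def cost(X):
    """Cost of the cheapest plan whose required indices are all in X."""
    X = frozenset(X)
    return min(c for req, c in PLANS if req <= X)

def schedule_value(schedule, M, cost):
    """Sum over i of benefit({t1..ti}, M) = cost(M) - cost(M + {t1..ti})."""
    M = frozenset(M)
    total, built = 0.0, set()
    for t in schedule:
        built.add(t)
        total += cost(M) - cost(M | built)
    return total

def best_schedule(new_indices, M, cost):
    """Exhaustive search over all n! orderings of the new indices."""
    return max(permutations(sorted(new_indices)),
               key=lambda s: schedule_value(s, M, cost))

best = best_schedule({"a", "b", "c"}, set(), cost)
print(best, schedule_value(best, set(), cost))
```

Each term of the sum is the benefit available after the i-th index has been built, so the total approximates the shaded area under the benefit curve when every index takes one time unit to create.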

Greedy Scheduling

• We are tempted to use a greedy heuristic
• This results in the third schedule
• Greedy schedule can be suboptimal by a factor of about (n − 1)

[Figure: three plots of benefit against materialized indices, one per schedule: Schedule = a,b,c (∅, a, a,b, a,b,c); Schedule = b,a,c (∅, b, a,b, a,b,c); Schedule = c,a,b (∅, c, a,c, a,b,c)]
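A small Python sketch shows how a greedy heuristic can go wrong when two indices interact. The plan space below is a made-up example in which a and b help only when both exist, while c gives a small benefit on its own; `PLANS`, `cost`, `greedy_schedule`, and `schedule_value` are hypothetical names, and the greedy rule used here (always build the index with the largest immediate benefit) is one plausible reading of the heuristic.

```python
# Hypothetical plan space: a and b are useful only together.
PLANS = [
    (frozenset(), 100.0),
    (frozenset("c"), 90.0),
    (frozenset("ab"), 10.0),
]

def cost(X):
    """Cost of the cheapest plan whose required indices are all in X."""
    X = frozenset(X)
    return min(c for req, c in PLANS if req <= X)

def schedule_value(schedule, M, cost):
    """Sum over i of benefit({t1..ti}, M)."""
    M = frozenset(M)
    total, built = 0.0, set()
    for t in schedule:
        built.add(t)
        total += cost(M) - cost(M | built)
    return total

def greedy_schedule(new_indices, M, cost):
    """Repeatedly build the index with the largest immediate benefit."""
    built = frozenset(M)
    remaining = sorted(new_indices)
    schedule = []
    while remaining:
        # Ties resolve to the alphabetically first index.
        best = max(remaining, key=lambda t: cost(built) - cost(built | {t}))
        schedule.append(best)
        remaining.remove(best)
        built |= {best}
    return schedule

greedy = greedy_schedule({"a", "b", "c"}, set(), cost)
print(greedy, schedule_value(greedy, set(), cost))
print(["a", "b", "c"], schedule_value(["a", "b", "c"], set(), cost))
```

In this toy case greedy builds c first because a and b each look useless in isolation, and the resulting schedule c,a,b scores well below a,b,c. The immediate benefit of a single index can hide a strong interaction.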

Interaction-Aware Scheduling

• Scheduling can use interaction graph

[Figure: interaction graph with connected components C1, C2, C3]

• Idea: First find optimal sub-schedules for each Ci
• Then choose the best interleaving of sub-schedules
• This heuristic avoids the pitfalls of greedy scheduling
• We can also show stronger performance guarantees

Conclusions

• Index interactions provide useful insights for physical design tuning
• The doi metric is an effective characterization of interaction relationships
• We can analyze interactions efficiently when the Index Benefit Graph has limited size
• Future work?

Thank You

Performance Evaluation

• QINTERACT implementation in Java
  – Uses JDBC to connect to IBM DB2 database
• Experiments use 22 TPC-H benchmark queries
• We generate indices based on the DB2 advisor
  – SALL = all indices recommended by DB2
  – S1C = indices in SALL with first column only
• We monitor the progress of the “serial” and “interleaved” approaches over time

Experimental Results

[Figure: two panels, “SALL index set, 0.1 threshold” and “S1C index set, 0.1 threshold”]

Applications

• QINTERACT returns doi(a,b) for all a,b
• We propose two applications of this information
  – Visualizing index interactions
    • Illustrates the global interactions as a graph
    • Useful when manually tuning the index set
  – Scheduling index construction
    • Want to choose when new indices will be created
    • Goal is to increase performance as quickly as possible
    • Knowledge of index interactions can help

Problem Statement

• Which indices in S interact?
• How strong are the interactions?
• The Degree of Interaction Problem:
    Compute doi(a,b) for all a,b ∈ S
• It may be useful to ignore “minor” interactions
• A threshold-based variant:
    Decide if doi(a,b) > τ for all a,b ∈ S

Index Selection

• Index selection problem:
    W = a query workload
    S = a set of indices relevant to W
    cost(M) = cost of W when indices M ⊆ S are available
    Want to find M ⊆ S to minimize cost(M)
• We can quantify the benefit of an index:
    a = any index
    X = set of other indices
    benefit(a, X) = cost(X) − cost(X ∪ {a})
• Does benefit(a, X) depend on X?
  – If so, this is called index interaction

Future Work

• Expand our support for updates
• Implementation of visualization tool
• Experiments with materialization scheduling
• Incremental updates to doi function
• Exploring stronger assumptions on query optimization
  – Efficient upper bounds on doi function?