Selectivity Estimation of XPath for Cyclic Graphs Yun Peng


Page 1: Selectivity Estimation of XPath for Cyclic Graphs

Selectivity Estimation of XPath for Cyclic Graphs

Yun Peng

Page 2: Selectivity Estimation of XPath for Cyclic Graphs

Outline

Motivation

Problem definition

Prime number labeling

Selectivity estimation

Implementation

Page 3: Selectivity Estimation of XPath for Cyclic Graphs
Page 4: Selectivity Estimation of XPath for Cyclic Graphs

Motivation

To retrieve subgraphs from large graph databases efficiently, selectivity estimation is one of the most important query optimization techniques.

Page 5: Selectivity Estimation of XPath for Cyclic Graphs

An Example

The query q=//faculty[//RA][//TA] lists all faculty nodes that have both an RA and a TA. There are two plans for evaluating it.

Plan 1: Find the faculty nodes that have an RA (result set size 3), then find those with a TA among the intermediate results.

Plan 2: Find the faculty nodes that have a TA (result set size 2), then find those with an RA among the intermediate results.

Plan 2 produces the smaller intermediate result, so knowing the selectivities in advance lets the optimizer choose it.

[Figure: tree with a department root, four faculty children, and name, RA, and TA leaf nodes]

Page 6: Selectivity Estimation of XPath for Cyclic Graphs

Problem Definition

Selectivity estimation: given a query, estimate how many results it produces without performing a costly evaluation.

[Figure: tree with a department root, four faculty children, and name, RA, and TA leaf nodes]

q=//faculty[//RA]

Selectivity(q) = 3
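To make the definition concrete, the sketch below counts exact selectivities on a small tree and picks the cheaper first predicate for q=//faculty[//RA][//TA]. The tree layout is an assumption: the figure did not survive in this transcript, so the four faculty nodes and their children are hypothetical and chosen only to match the stated counts (3 faculty nodes with an RA, 2 with a TA). The names `FACULTY`, `selectivity`, and `cheaper_first_predicate` are illustrative.

```python
# Hypothetical department tree: each faculty node maps to its child labels.
# The layout is assumed; only the counts (3 with RA, 2 with TA) come from the slides.
FACULTY = {
    "faculty1": ["name", "RA"],
    "faculty2": ["name", "TA", "RA"],
    "faculty3": ["name", "TA"],
    "faculty4": ["name", "RA", "RA"],
}


def selectivity(required_labels):
    """Count faculty nodes that have at least one child for every required label."""
    return sum(
        1
        for children in FACULTY.values()
        if all(label in children for label in required_labels)
    )


def cheaper_first_predicate(labels):
    """Pick the predicate with the smallest selectivity to evaluate first."""
    return min(labels, key=lambda label: selectivity([label]))


print(selectivity(["RA"]))                    # faculty nodes with an RA
print(selectivity(["TA"]))                    # faculty nodes with a TA
print(cheaper_first_predicate(["RA", "TA"]))  # predicate to evaluate first
```

Here the counts are computed by scanning the data, which is exactly the cost an estimator is meant to avoid; the sketch only shows what quantity is being estimated and how it drives plan choice.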

Page 7: Selectivity Estimation of XPath for Cyclic Graphs

Our methodology skeleton

Step 1: Label the graph nodes (precomputed, before any query arrives).

Step 2: Estimate query selectivity from the precomputed labels (when a query arrives).

Page 8: Selectivity Estimation of XPath for Cyclic Graphs

Prime number labeling

Label each graph node with an integer that is a product of prime numbers.

Page 9: Selectivity Estimation of XPath for Cyclic Graphs

Prime number labeling (cont.)

Divisibility of labels implies an ancestor-descendant relationship.

For example, 3*5*7*11 is divisible by 11, so node g is a descendant of node a.
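The divisibility test can be sketched in a few lines. Only the two labels from the example (3*5*7*11 for node a and 11 for node g) come from the slide; the function name `is_descendant` and the unrelated label 13 are assumptions used for illustration, and the scheme for assigning primes to nodes is not shown here.

```python
def is_descendant(ancestor_label: int, node_label: int) -> bool:
    """Return True when the ancestor's label is divisible by the node's label."""
    return ancestor_label % node_label == 0


label_a = 3 * 5 * 7 * 11  # label of node a, from the slide
label_g = 11              # label of node g, from the slide
label_x = 13              # hypothetical label of a node that is not below a

print(is_descendant(label_a, label_g))  # g against a
print(is_descendant(label_a, label_x))  # x against a
```

Because labels are products of primes, they grow quickly with the size of the graph, which is one motivation for the vector representation on the following slides.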

Page 10: Selectivity Estimation of XPath for Cyclic Graphs

Optimization

Replace integers by vectors

a    1 1 1 1
b    1 1 0 0
c    1 0 1 1
d    1 0 0 0
e    0 1 0 0
f    0 0 1 0
g    1 0 0 1

Page 11: Selectivity Estimation of XPath for Cyclic Graphs

Optimization (cont.)

$VL(a) - VL(b) \geq 0$ (component-wise) implies that node b is a descendant of node a
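The sketch below assumes the vector test mirrors divisibility: node b is a descendant of node a when every 1 in VL(b) is also a 1 in VL(a). That reading is an assumption, as are the helper names `to_mask` and `is_descendant`; the seven vectors are the ones listed on the "Replace integers by vectors" slide.

```python
# Vector labels from the slide, one entry per column.
VL = {
    "a": (1, 1, 1, 1),
    "b": (1, 1, 0, 0),
    "c": (1, 0, 1, 1),
    "d": (1, 0, 0, 0),
    "e": (0, 1, 0, 0),
    "f": (0, 0, 1, 0),
    "g": (1, 0, 0, 1),
}


def to_mask(vector):
    """Pack a 0/1 vector into an integer bitmask."""
    mask = 0
    for bit in vector:
        mask = (mask << 1) | bit
    return mask


def is_descendant(ancestor, node):
    """Assumed test: every 1 in the node's vector is also set in the ancestor's."""
    a, b = to_mask(VL[ancestor]), to_mask(VL[node])
    return ancestor != node and (b & ~a) == 0


print(is_descendant("a", "g"))
print(is_descendant("b", "f"))
```

Packing each vector into one machine word turns the test into a single AND and a comparison, instead of a division on a potentially very large integer.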

Page 12: Selectivity Estimation of XPath for Cyclic Graphs

Our methodology skeleton

Step 1: Label the graph nodes (precomputed, before any query arrives).

Step 2: Estimate query selectivity from the precomputed labels (when a query arrives).

Page 13: Selectivity Estimation of XPath for Cyclic Graphs

Selectivity Estimation

Two-dimensional histogram

Originally designed for selectivity estimation on trees [Jagadish 2004]

Label each tree node with an interval (l, r)

Represent the interval as a point (l, r) in the XOY coordinate system

Partition the XOY plane into grid cells that serve as buckets

Estimate results using this histogram
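The steps above can be sketched as follows. This is not the estimator from the cited work: it is a simplified illustration that labels a tree with depth-first intervals, buckets the (l, r) points into a grid, and derives lower and upper bounds on the number of ancestor-descendant pairs from cell counts alone. The tree, the cell width, and the names `interval_labels`, `histogram`, and `pair_bounds` are all assumptions.

```python
from collections import Counter

# Hypothetical tree (adjacency lists); not taken from the slides.
TREE = {
    "department": ["faculty1", "faculty2"],
    "faculty1": ["name1", "RA1"],
    "faculty2": ["name2", "TA2", "RA2"],
}


def interval_labels(tree, root):
    """Label each node with (l, r): its entry and exit positions in a depth-first walk."""
    labels, clock = {}, 0

    def visit(node):
        nonlocal clock
        left = clock
        clock += 1
        for child in tree.get(node, []):
            visit(child)
        labels[node] = (left, clock)
        clock += 1

    visit(root)
    return labels


def histogram(points, width):
    """Count points per grid cell; a point (l, r) falls in cell (l // width, r // width)."""
    return Counter((l // width, r // width) for l, r in points)


def pair_bounds(anc_hist, desc_hist):
    """Lower and upper bounds on ancestor-descendant pairs, using cell counts only."""
    low = high = 0
    for (i, j), n_anc in anc_hist.items():
        for (k, m), n_desc in desc_hist.items():
            if k > i and m < j:      # descendant cell strictly inside: always contained
                low += n_anc * n_desc
                high += n_anc * n_desc
            elif k >= i and m <= j:  # shares a cell boundary: may or may not be contained
                high += n_anc * n_desc
    return low, high


labels = interval_labels(TREE, "department")
faculty = [labels["faculty1"], labels["faculty2"]]
ras = [labels["RA1"], labels["RA2"]]
print(pair_bounds(histogram(faculty, 2), histogram(ras, 2)))
print(pair_bounds(histogram(faculty, 4), histogram(ras, 4)))
```

A node d is a descendant of a exactly when a's interval contains d's, so a descendant cell that lies strictly to the right of and below the ancestor cell is a guaranteed match. Coarser cells shrink the histogram but widen the gap between the two bounds.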

Page 14: Selectivity Estimation of XPath for Cyclic Graphs

Selectivity Estimation (cont.)

Page 15: Selectivity Estimation of XPath for Cyclic Graphs

Optimization

Replace integers by vectors

a    1 1 1 1
b    1 1 0 0
c    1 0 1 1
d    1 0 0 0
e    0 1 0 0
f    0 0 1 0
g    1 0 0 1

Page 16: Selectivity Estimation of XPath for Cyclic Graphs

Consecutive Ones Property Matrix

Given a 0/1 matrix, if the columns can be ordered so that the 1s in every row are consecutive, the matrix is called a consecutive ones property matrix (C1P matrix).

Finding such a column reordering takes linear time.

Finding the largest C1P submatrix is NP-hard, and if every column contains more than three 1s, it cannot be approximated in polynomial time.
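The definition can be checked directly on small matrices. The sketch below is a brute-force test that tries every column order, so it runs in factorial time and is for illustration only; it is not the linear-time reorganization mentioned above. Both example matrices and the names `has_consecutive_ones` and `find_c1p_order` are assumptions, not taken from the slides.

```python
from itertools import permutations


def has_consecutive_ones(row):
    """True when the 1s in the row form a single unbroken run (or there are none)."""
    ones = [i for i, bit in enumerate(row) if bit]
    return not ones or ones[-1] - ones[0] + 1 == len(ones)


def find_c1p_order(matrix):
    """Return a column order that makes every row's 1s consecutive, or None."""
    n_cols = len(matrix[0])
    for order in permutations(range(n_cols)):
        if all(has_consecutive_ones([row[c] for c in order]) for row in matrix):
            return order
    return None


# Illustrative matrices, not taken from the slides.
reorderable = [
    [1, 0, 1],
    [0, 1, 1],
]
not_c1p = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 1],
]

print(find_c1p_order(reorderable))
print(find_c1p_order(not_c1p))
```

The second matrix fails because each pair of its three columns must be adjacent, and in any linear order of three columns the two end columns are not.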

Page 17: Selectivity Estimation of XPath for Cyclic Graphs

Add extra columns

     0 1 2 3
a    1 1 1 1
b    1 1 0 0
c    1 0 1 1
d    1 0 0 0
e    0 1 0 0
f    0 0 1 0
g    1 0 0 1

Map: 4 → 0

     0 1 2 3 4
a    1 1 1 1 0
b    1 1 0 0 0
c    0 0 1 1 1
d    1 0 0 0 0
e    0 1 0 0 0
f    0 0 1 0 0
g    0 0 0 1 1
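The transformation on this slide can be replayed in code. The sketch assumes that the map from column 4 to column 0 means "column 4 is a copy of column 0", and that rows c and g are the ones redirected to the copy, which is what the two matrices suggest. The function names are hypothetical.

```python
# Vector labels from the slide (columns 0 to 3).
LABELS = {
    "a": [1, 1, 1, 1],
    "b": [1, 1, 0, 0],
    "c": [1, 0, 1, 1],
    "d": [1, 0, 0, 0],
    "e": [0, 1, 0, 0],
    "f": [0, 0, 1, 0],
    "g": [1, 0, 0, 1],
}


def has_consecutive_ones(row):
    """True when the 1s in the row form a single unbroken run (or there are none)."""
    ones = [i for i, bit in enumerate(row) if bit]
    return not ones or ones[-1] - ones[0] + 1 == len(ones)


def add_copy_of_column(labels, source_col, rows_using_copy):
    """Append a copy of source_col; the listed rows move their 1 from the original to the copy."""
    result = {}
    for node, row in labels.items():
        new_row = list(row) + [0]
        if node in rows_using_copy and row[source_col]:
            new_row[source_col] = 0
            new_row[-1] = 1
        result[node] = new_row
    return result


extended = add_copy_of_column(LABELS, source_col=0, rows_using_copy={"c", "g"})
for node, row in extended.items():
    print(node, row, has_consecutive_ones(row))
```

In the original column order, rows c and g each have a gap between their 1s; after the copy is appended, every row is a single run in the order 0 to 4. An estimator that reads the extended matrix needs the map to know that columns 0 and 4 stand for the same original column.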

Page 18: Selectivity Estimation of XPath for Cyclic Graphs

Add extra columns

Open question: given a 0/1 matrix, is adding the minimum number of extra columns so that the resulting matrix is a C1P matrix NP-hard?

Page 19: Selectivity Estimation of XPath for Cyclic Graphs

Heuristic algorithm

Duplicate

Merge

      1 2 3 4 5 6
r1    1 1 1 0 1 1
r2    0 1 1 0 1 0
r3    0 0 1 1 1 1

Page 20: Selectivity Estimation of XPath for Cyclic Graphs

Heuristic algorithm (cont.)

Page 21: Selectivity Estimation of XPath for Cyclic Graphs

Heuristic Algorithm (cont.)

      1 2 3 4 5 6
r1    1 1 1 0 1 1
r2    0 1 1 0 1 0
r3    0 0 1 1 1 1

      1 2 3 6 5 4
r1    1 1 1 1 1 0
r2    0 1 1 0 1 0
r3    0 0 1 1 1 1
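The column reordering shown on this slide can be reproduced mechanically. The sketch below applies the order 1 2 3 6 5 4 to the three rows and reports which rows end up with consecutive 1s. The matrix and the order are from the slide; the function names are assumptions, and the duplicate and merge steps of the heuristic are not implemented because the transcript does not describe them.

```python
COLUMNS = [1, 2, 3, 4, 5, 6]
ROWS = {
    "r1": [1, 1, 1, 0, 1, 1],
    "r2": [0, 1, 1, 0, 1, 0],
    "r3": [0, 0, 1, 1, 1, 1],
}
NEW_ORDER = [1, 2, 3, 6, 5, 4]


def reorder(row, columns, new_order):
    """Rearrange a row's entries to follow new_order (column names, not indexes)."""
    position = {name: i for i, name in enumerate(columns)}
    return [row[position[name]] for name in new_order]


def has_consecutive_ones(row):
    """True when the 1s in the row form a single unbroken run (or there are none)."""
    ones = [i for i, bit in enumerate(row) if bit]
    return not ones or ones[-1] - ones[0] + 1 == len(ones)


reordered = {name: reorder(row, COLUMNS, NEW_ORDER) for name, row in ROWS.items()}
for name, row in reordered.items():
    print(name, row, has_consecutive_ones(row))
```

Under this order r1 and r3 each form a single run of 1s, while r2 as transcribed still has a gap at column 6.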

Page 22: Selectivity Estimation of XPath for Cyclic Graphs

Selectivity Estimation (cont.)

Page 23: Selectivity Estimation of XPath for Cyclic Graphs

Implementation

Page 24: Selectivity Estimation of XPath for Cyclic Graphs

Implementation

Page 25: Selectivity Estimation of XPath for Cyclic Graphs

Implementation

Page 26: Selectivity Estimation of XPath for Cyclic Graphs