Page 1: Creating Competitive Products

Creating Competitive Products

Qian Wan[1], Raymond Chi-Wing Wong[1], Ihab F. Ilyas[2], M. Tamer Ozsu[2], Yu Peng[1]

[1] Hong Kong University of Science and Technology [2] University of Waterloo

Presented by Qian Wan
Prepared by Qian Wan

Page 2: Creating Competitive Products

Creating Competitive Products | VLDB '09 2

Outline

• Background
  – Skyline, Related Work
• Motivation
  – Example, Problem Definition
• Algorithm
  – Framework, Grouping, Pruning
• Experiments
  – Synthetic, Real data
  – 6 factors, 4 measurements
• Conclusions

Page 3: Creating Competitive Products

Skyline

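• Definition
  – The skyline contains the points that are not dominated by any other point
  – A point dominates another if it is at least as good on every attribute and strictly better on at least one
• Hotel searching problem
  – Distance to beach vs. price
  – Dominance
  – Skyline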

[Figure: hotels H1 to H9 plotted by distance to beach (Dist) against price (Price); a second panel with the same axes shows H1 and H2.]
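
The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the (distance, price) values are hypothetical because the figure's coordinates are not recoverable, the names `dominates` and `skyline` are illustrative, and smaller is assumed to be better on both attributes.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """a dominates b: no worse on every attribute, strictly better on one.

    Assumes smaller values are better on every attribute.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points: Dict[str, Point]) -> List[str]:
    """Names of the points that no other point dominates (quadratic scan)."""
    return [
        name
        for name, p in points.items()
        if not any(dominates(q, p) for other, q in points.items() if other != name)
    ]


# Hypothetical (distance-to-beach, price) values, not taken from the slide.
hotels = {
    "H1": (1.0, 90.0),
    "H2": (2.0, 95.0),  # dominated by H1: farther and more expensive
    "H3": (3.0, 60.0),
    "H4": (5.0, 40.0),
    "H5": (6.0, 70.0),  # dominated by H3
}
result = skyline(hotels)
print(result)
```

The quadratic scan is only for clarity; the algorithms listed under Related Work exist to avoid exactly this cost.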

Page 4: Creating Competitive Products

Related Work

• Skyline Queries in DBMS [S. Borzsonyi, 2001]
• Single-Table Skyline Queries
  – Bitmaps [K.L. Tan, 2001], Nearest Neighbor [D. Kossmann, 2002], Branch and Bound Skylines [D. Papadias, 2005]
• Multi-Table Skyline Queries
  – Natural Join [W. Jin, 2007] [D. Sun, 2008]
  – Our Work
    • Join different source tables via a “Cartesian product”-like procedure

Page 6: Creating Competitive Products

A Travel Agency’s Database

Source Tables ($T_1$, $T_2$)

Hotel | Distance-to-beach | Hotel-class | Hotel-cost
H1 | 100 | 3 | 100
H2 | 200 | 2 | 90
H3 | 400 | 1 | 80

Flight | No-of-stops | Flight-cost
F1 | 0 | 120
F2 | 1 | 100

Existing Vacation Packages ($T_E$)

Package | No-of-stops | Distance-to-beach | Hotel-class | Price
P1 | 0 | 130 | 2 | 250
P2 | 1 | 140 | 2 | 170
P3 | 1 | 300 | 1 | 150
P4 | 1 | 150 | 4 | 300

Newly Created Vacation Packages ($T_Q$)

Package | No-of-stops | Distance-to-beach | Hotel-class | Price
Q1(F1,H1) | 0 | 100 | 3 | 220
Q2(F1,H2) | 0 | 200 | 2 | 210
Q3(F1,H3) | 0 | 400 | 1 | 200
… | … | … | … | …
Q24(f4,h6) | 2 | 200 | 3 | 210

Skyline tuples

1. Direct attributes
2. Indirect attributes
3. One indirect attribute characteristic, e.g. Travel Agency (Price), PC Manufacturer (Price)

Page 7: Creating Competitive Products

Finding Competitive Products

• Given a set of source tables $T_1, T_2, \ldots, T_k$
• Market packages $T_E$
• New packages $T_Q$
• Then a tuple $q$ in $T_Q$ is said to be a competitive product if $q$ is in the skyline with respect to $T_E \cup T_Q$

Page 8: Creating Competitive Products

Naïve Solution

Source Tables

Hotel | Distance-to-beach | Hotel-class | Hotel-cost
H1 | 100 | 3 | 100
H2 | 200 | 2 | 90
H3 | 400 | 1 | 80
H4 | 150 | 2 | 150
H5 | 170 | 2 | 140
H6 | 200 | 3 | 120

Flight | No-of-stops | Flight-cost
F1 | 0 | 120
F2 | 1 | 100
F3 | 2 | 80
F4 | 2 | 90

Existing Vacation Packages

Package | No-of-stops | Distance-to-beach | Hotel-class | Price
P1 | 0 | 130 | 2 | 250
P2 | 1 | 140 | 2 | 170
P3 | 1 | 300 | 1 | 150
P4 | 1 | 150 | 4 | 300

Newly Created Vacation Packages

Package | No-of-stops | Distance-to-beach | Hotel-class | Price
Q1(f1,h1) | 0 | 100 | 3 | 220
Q2(f1,h2) | 0 | 200 | 2 | 210
Q3(f1,h3) | 0 | 400 | 1 | 200
… | … | … | … | …
Q7(f2,h1) | 1 | 100 | 3 | 200
… | … | … | … | …
Q13(f3,h1) | 2 | 100 | 3 | 180
… | … | … | … | …
Q24(f4,h6) | 2 | 200 | 3 | 210

1. Intra-dominance checking
2. Inter-dominance checking

Competitive Products

Package | No-of-stops | Distance-to-beach | Hotel-class | Price
Q1(f1,h1) | 0 | 100 | 3 | 220
Q2(f1,h2) | 0 | 200 | 2 | 210
Q3(f1,h3) | 0 | 400 | 1 | 200
… | … | … | … | …
Q7(f2,h1) | 1 | 100 | 3 | 200
… | … | … | … | …
Q13(f3,h1) | 2 | 100 | 3 | 180
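
The naïve solution on this slide can be reproduced with a short Python sketch. It builds every flight-hotel pair, then keeps the pairs that nothing in the existing or newly created packages dominates. Two assumptions are mine rather than the slides': smaller is treated as better on every attribute (this is consistent with the tables shown), and the key format `"F1+H1"` is only an illustrative naming.

```python
from itertools import product


def dominates(a, b):
    """a dominates b when smaller is better on every attribute (assumption)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


# Source tables from the slide: flights (stops, cost), hotels (distance, class, cost).
flights = {"F1": (0, 120), "F2": (1, 100), "F3": (2, 80), "F4": (2, 90)}
hotels = {
    "H1": (100, 3, 100), "H2": (200, 2, 90), "H3": (400, 1, 80),
    "H4": (150, 2, 150), "H5": (170, 2, 140), "H6": (200, 3, 120),
}
# Existing packages: (stops, distance, class, price).
existing = {
    "P1": (0, 130, 2, 250), "P2": (1, 140, 2, 170),
    "P3": (1, 300, 1, 150), "P4": (1, 150, 4, 300),
}

# Cartesian product: the price of a new package is flight cost + hotel cost.
new_packages = {
    f"{f}+{h}": (stops, dist, cls, f_cost + h_cost)
    for (f, (stops, f_cost)), (h, (dist, cls, h_cost))
    in product(flights.items(), hotels.items())
}

# A new package is competitive if nothing in the union of both sets dominates it.
everything = {**existing, **new_packages}
competitive = [
    name
    for name, q in new_packages.items()
    if not any(dominates(p, q) for other, p in everything.items() if other != name)
]
print(competitive)
```

On this data the five survivors correspond to Q1, Q2, Q3, Q7 and Q13 in the Competitive Products table, but every one of the 24 combinations had to be materialised and compared first.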

Page 10: Creating Competitive Products

Algorithm Overview

• Intra-dominance checking
  – To find the skyline in source tables
• Inter-dominance checking
  – Skyline in existing market packages
  – R* Tree indices in existing market packages
  – Full pruning
  – Partial pruning
• Post-processing

Page 11: Creating Competitive Products

Intra-dominance Checking

Source Tables

Hotel | Distance-to-beach | Hotel-class | Hotel-cost
H1 | 100 | 3 | 100
H2 | 200 | 2 | 90
H3 | 400 | 1 | 80
H4 | 150 | 2 | 150
H5 | 170 | 2 | 140
H6 | 200 | 3 | 120

Flight | No-of-stops | Flight-cost
F1 | 0 | 120
F2 | 1 | 100
F3 | 2 | 80
F4 | 2 | 90

Skyline Tuples of Source Tables ($T_1'$, $T_2'$)

Hotel | Distance-to-beach | Hotel-class | Hotel-cost
H1 | 100 | 3 | 100
H2 | 200 | 2 | 90
H3 | 400 | 1 | 80
H4 | 150 | 2 | 150
H5 | 170 | 2 | 140

Flight | No-of-stops | Flight-cost
F1 | 0 | 120
F2 | 1 | 100
F3 | 2 | 80

Newly Created Vacation Packages (conceptual) ($T_Q'$)

Package | No-of-stops | Distance-to-beach | Hotel-class | Price
Q1(f1,h1) | 0 | 100 | 3 | 220
Q2(f1,h2) | 0 | 200 | 2 | 210
Q3(f1,h3) | 0 | 400 | 1 | 200
… | … | … | … | …
Q7(f2,h1) | 1 | 100 | 3 | 200
… | … | … | … | …
Q13(f3,h1) | 2 | 100 | 3 | 180
… | … | … | … | …
Q15(f3,h5) | 2 | 170 | 3 | 200

1. No intra-dominance checking (one indirect attribute)
2. No competitive products are missed

Competitive Products

Package | No-of-stops | Distance-to-beach | Hotel-class | Price
Q1(f1,h1) | 0 | 100 | 3 | 220
Q2(f1,h2) | 0 | 200 | 2 | 210
Q3(f1,h3) | 0 | 400 | 1 | 200
… | … | … | … | …
Q7(f2,h1) | 1 | 100 | 3 | 200
… | … | … | … | …
Q13(f3,h1) | 2 | 100 | 3 | 180
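
A small Python sketch of this step, under the assumption that smaller is better on every attribute: each source table is reduced to its own skyline before any packages are combined. The helper names are illustrative. The reduction is safe with a single additive price because a hotel dominated by another hotel yields, with any flight, a package dominated by the same flight paired with the better hotel.

```python
from itertools import product


def dominates(a, b):
    """a dominates b when smaller is better on every attribute (assumption)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(table):
    """Keep only the tuples of one table that no other tuple dominates."""
    return {
        name: t
        for name, t in table.items()
        if not any(dominates(u, t) for other, u in table.items() if other != name)
    }


# Source tables from the slide: flights (stops, cost), hotels (distance, class, cost).
flights = {"F1": (0, 120), "F2": (1, 100), "F3": (2, 80), "F4": (2, 90)}
hotels = {
    "H1": (100, 3, 100), "H2": (200, 2, 90), "H3": (400, 1, 80),
    "H4": (150, 2, 150), "H5": (170, 2, 140), "H6": (200, 3, 120),
}

sky_flights = skyline(flights)  # F4 drops out: F3 has the same stops and costs less
sky_hotels = skyline(hotels)    # H6 drops out: H1 is closer, same class, cheaper

# Only combinations of skyline tuples remain as candidates.
candidates = list(product(sky_flights, sky_hotels))
print(sorted(sky_flights), sorted(sky_hotels), len(candidates))
```

Here the candidate set shrinks from 24 combinations to 15, matching the reduced tables on the slide.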

Page 13: Creating Competitive Products

Inter-dominance Checking

Existing Vacation Packages ($T_E$)

Package | No-of-stops | Distance-to-beach | Hotel-class | Price
P1 | 0 | 130 | 2 | 250
P2 | 1 | 140 | 2 | 170
P3 | 1 | 300 | 1 | 150
P4 | 1 | 150 | 4 | 300

Skyline in Existing Vacation Packages ($T_E'$)

Package | No-of-stops | Distance-to-beach | Hotel-class | Price
P1 | 0 | 130 | 2 | 250
P2 | 1 | 140 | 2 | 170
P3 | 1 | 300 | 1 | 150

• No competitive products are missed
• R* Tree will speed up the inter-dominance checking

[Figure: spatial index (R* Tree) with nodes R0 to R5; inter-dominance checking is performed as a range query.]
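
The reduction from $T_E$ to $T_E'$ can be sketched as follows, again assuming smaller is better on every attribute. The linear scan in `is_dominated` (an illustrative name) merely stands in for the R* Tree range query mentioned on the slide; it is not an index. Checking against $T_E'$ alone loses nothing because dominance is transitive: whatever P4 dominates, P2 dominates as well.

```python
def dominates(a, b):
    """a dominates b when smaller is better on every attribute (assumption)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


# Existing packages from the slide: (stops, distance, class, price).
existing = {
    "P1": (0, 130, 2, 250), "P2": (1, 140, 2, 170),
    "P3": (1, 300, 1, 150), "P4": (1, 150, 4, 300),
}

# T_E': the skyline of the existing packages. P4 is dominated by P2.
te_sky = {
    name: p
    for name, p in existing.items()
    if not any(dominates(o, p) for other, o in existing.items() if other != name)
}


def is_dominated(q, packages):
    """Linear scan standing in for the R* Tree range query."""
    return any(dominates(p, q) for p in packages.values())


q_bad = (1, 200, 2, 190)   # the (f2, h2) combination: dominated by P2
q_good = (0, 100, 3, 220)  # Q1: no existing package dominates it
print(sorted(te_sky), is_dominated(q_bad, te_sky), is_dominated(q_good, te_sky))
```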

Page 15: Creating Competitive Products

Full Pruning

Existing Vacation Packages ($T_E'$)

Package | No-of-stops | Distance-to-beach | Hotel-class | Price
P1 | 0 | 130 | 2 | 250
P2 | 1 | 140 | 2 | 170
P3 | 1 | 300 | 1 | 150

Skyline Tuples of Source Tables ($T_1'$, $T_2'$)

Hotel | Distance-to-beach | Hotel-class | Hotel-cost
H1 | 100 | 3 | 100
H2 | 200 | 2 | 90
H3 | 400 | 1 | 80
H4 | 150 | 2 | 150
H5 | 170 | 2 | 140

Flight | No-of-stops | Flight-cost
F1 | 0 | 120
F2 | 1 | 100
F3 | 2 | 80

Clusters of source-table tuples: A1, A2, B1, B2
Groups: C1 = {A1, B1}, C4 = {A2, B2}

Newly Created Vacation Packages (conceptual) ($T_Q'$)

Package | No-of-stops | Distance-to-beach | Hotel-class | Price
Q1(f1,h1) | 0 | 100 | 3 | 220
Q2(f1,h2) | 0 | 200 | 2 | 210
Q3(f1,h3) | 0 | 400 | 1 | 200
… | … | … | … | …
Q7(f2,h1) | 1 | 100 | 3 | 200
… | … | … | … | …
Q13(f3,h1) | 2 | 100 | 3 | 180
… | … | … | … | …
Q15(f3,h5) | 2 | 170 | 3 | 200

Competitive Products

Package | No-of-stops | Distance-to-beach | Hotel-class | Price
Q1(f1,h1) | 0 | 100 | 3 | 220
Q2(f1,h2) | 0 | 200 | 2 | 210
Q3(f1,h3) | 0 | 400 | 1 | 200
… | … | … | … | …
Q7(f2,h1) | 1 | 100 | 3 | 200
… | … | … | … | …
Q13(f3,h1) | 2 | 100 | 3 | 180

Page 16: Creating Competitive Products

Full Pruning

Existing Vacation Packages ($T_E'$)

Package | No-of-stops | Distance-to-beach | Hotel-class | Price
P1 | 0 | 130 | 2 | 250
P2 | 1 | 140 | 2 | 170
P3 | 1 | 300 | 1 | 150

Best Representative | Groups
B1 | C1
B2 | C2
… | …
Bi | Ci
… | …
Bj | Cj
… | …
Bk | Ck

Members of one group ($T_Q'$)

Package | No-of-stops | Distance-to-beach | Hotel-class | Price
Q(f2,h4) | 1 | 150 | 4 | 250
Q’(f2,h5) | 1 | 170 | 4 | 240

Best Representative

Package | No-of-stops | Distance-to-beach | Hotel-class | Price
Min | 1 | 150 | 4 | 240

Quality of the best representative (tightness of each group): clustering, e.g. KMeans
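
The best representative on this slide is the attribute-wise minimum of a group, which can be sketched directly. The sketch assumes smaller is better on every attribute, and the function names are illustrative rather than taken from the paper. If an existing package dominates this virtual tuple, it also dominates every member, since each member is no better than the representative on any attribute, so the whole group can be discarded without looking at its members.

```python
def dominates(a, b):
    """a dominates b when smaller is better on every attribute (assumption)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def best_representative(group):
    """Attribute-wise minimum: a virtual tuple at least as good as every member."""
    return tuple(min(column) for column in zip(*group))


# Skyline of existing packages from the slide: (stops, distance, class, price).
te_sky = {"P1": (0, 130, 2, 250), "P2": (1, 140, 2, 170), "P3": (1, 300, 1, 150)}

# The two group members shown on the slide: Q(f2,h4) and Q'(f2,h5).
group = [(1, 150, 4, 250), (1, 170, 4, 240)]

rep = best_representative(group)
fully_pruned = any(dominates(p, rep) for p in te_sky.values())
print(rep, fully_pruned)
```

The tighter a group is, the closer its representative sits to the real members and the more often this single test succeeds, which is why the slide ties the quality of the representative to clustering.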

Page 18: Creating Competitive Products

Partial Pruning

• Full pruning prunes all members in the group
• Partial pruning prunes some members in the group
• Direct attributes do not change
• Estimate the best possible value for indirect attributes
• Use tuples in $T_E'$ to conduct a range query in each source table
• Eliminate dominated combinations if
  – they are dominated on all direct attributes
  – they are dominated on all indirect attributes according to their best estimation
• Partial pruning is used when full pruning cannot be applied

Page 19: Creating Competitive Products

Partial Pruning

Existing Vacation Packages

Package | No-of-stops | Distance-to-beach | Hotel-class | Price
P1 | 0 | 130 | 2 | 250
P2 | 1 | 140 | 2 | 170
P3 | 1 | 300 | 1 | 150

Skyline Tuples of Source Tables

Hotel | Distance-to-beach | Hotel-class | Hotel-cost
H1 | 100 | 3 | 100
H2 | 200 | 2 | 90
H3 | 400 | 1 | 80
H4 | 150 | 2 | 150
H5 | 170 | 2 | 140

Flight | No-of-stops | Flight-cost
F1 | 0 | 120
F2 | 1 | 100
F3 | 2 | 80

Clusters: A1, B1
Group: C1 = {A1, B1}
Full Pruning

Newly Created Vacation Packages

Package | No-of-stops | Distance-to-beach | Hotel-class | Price
Q1(f1,h1) | 0 | 100 | 3 | 220
Q2(f1,h2) | 0 | 200 | 2 | 210
Q3(f1,h3) | 0 | 400 | 1 | 200
… | … | … | … | …
Q7(f2,h1) | 1 | 100 | 3 | 200
… | … | … | … | …
Q13(f3,h1) | 2 | 100 | 3 | 180
… | … | … | … | …
Q15(f3,h5) | 2 | 170 | 3 | 200

Competitive Products

Package | No-of-stops | Distance-to-beach | Hotel-class | Price
Q1(f1,h1) | 0 | 100 | 3 | 220
Q2(f1,h2) | 0 | 200 | 2 | 210
Q3(f1,h3) | 0 | 400 | 1 | 200
… | … | … | … | …
Q7(f2,h1) | 1 | 100 | 3 | 200
… | … | … | … | …
Q13(f3,h1) | 2 | 100 | 3 | 180

Page 20: Creating Competitive Products

Meta Transformation

Existing Vacation Packages

Package | No-of-stops | Distance-to-beach | Hotel-class | Price
P1 | 0 | 130 | 2 | 250
P2 | 1 | 140 | 2 | 170
P3 | 1 | 300 | 1 | 150

Selected package

Package | No-of-stops | Distance-to-beach | Hotel-class | Price
P2 | 1 | 140 | 2 | 170

Projections of P2

Package | No-of-stops | Price
P2 | 1 | 170

Package | Distance-to-beach | Hotel-class | Price
P2 | 140 | 2 | 170

Source Tables

Hotel | Distance-to-beach | Hotel-class | Hotel-cost
H1 | 100 | 3 | 100
H2 | 200 | 2 | 90
H3 | 400 | 1 | 80
Min | 400 | 1 | 80

Flight | No-of-stops | Flight-cost
F1 | 0 | 120
F2 | 1 | 100
Min | 1 | 100

Meta-Hotel

Hotel | Distance-to-beach | Hotel-class | Hotel-cost
H1 | 100 | 3 | 200
H2 | 200 | 2 | 190
H3 | 400 | 1 | 180

Meta-Flight

Flight | No-of-stops | Flight-cost
F1 | 0 | 200
F2 | 1 | 180

Pruned sets: A1, B1

• No inter-dominance checking for {F2} × {H2}
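
The numbers on this slide can be reproduced with a short Python sketch. The reading of the meta tables is inferred from those numbers rather than stated on the slide: each tuple's cost is raised by the cheapest cost in the other source table, giving the best price any package containing it could reach. Smaller is assumed better on every attribute, all names are illustrative, and the list comprehensions stand in for the range queries. The test used is weak (no better on any attribute), so a candidate that ties P2 on every attribute would need a separate check.

```python
# Source tables from the slide: flights (stops, cost), hotels (distance, class, cost).
flights = {"F1": (0, 120), "F2": (1, 100)}
hotels = {"H1": (100, 3, 100), "H2": (200, 2, 90), "H3": (400, 1, 80)}
p2 = (1, 140, 2, 170)  # existing package P2: (stops, distance, class, price)

min_flight_cost = min(t[-1] for t in flights.values())  # 100, from F2
min_hotel_cost = min(t[-1] for t in hotels.values())    # 80, from H3

# Meta tables: cost replaced by the best package price the tuple could reach.
meta_flights = {k: (stops, cost + min_hotel_cost) for k, (stops, cost) in flights.items()}
meta_hotels = {k: (dist, cls, cost + min_flight_cost) for k, (dist, cls, cost) in hotels.items()}


def no_better(t, p):
    """True if t is at least as large (no better) than p on every attribute."""
    return all(x >= y for x, y in zip(t, p))


# Project P2 onto each meta table's attributes and collect the tuples it covers.
p_flight = (p2[0], p2[3])
p_hotel = (p2[1], p2[2], p2[3])
a1 = [k for k, t in meta_flights.items() if no_better(t, p_flight)]
b1 = [k for k, t in meta_hotels.items() if no_better(t, p_hotel)]
print(meta_flights, meta_hotels, a1, b1)
```

Any flight in `a1` combined with any hotel in `b1` is no better than P2 on the direct attributes, and its real price is at least the optimistic estimate, so the combinations in `a1` × `b1` can be skipped without individual inter-dominance checks.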

Page 22: Creating Competitive Products

Post-processing

• More than one indirect attribute
  – Calculation
    • Previous algorithm
    • Intra-dominance checking
  – Any existing skyline algorithm
  – Post-processing cost depends on the size of Competitive Products

Page 24: Creating Competitive Products

Experiments

• Pentium IV 2.4GHz PC with 4GB memory, Linux platform, C++
• Synthetic anti-correlated datasets
• Real datasets, Travel Agency A and Travel Agency B
  – A: 296 packages, 1014 hotels and 4394 flights
  – B: 149 packages, 995 hotels and 866 flights
• Implementation
  – Algorithm for Creating Competitive Products (ACCP)
  – Baseline algorithm
  – Naïve algorithm

Algorithm | Skyline in tables | R* Tree | Full & Partial Pruning
ACCP | Yes | Yes | Yes
Baseline | Yes | Yes | No
Naïve | No | No | No

Page 25: Creating Competitive Products

Synthetic Datasets

Parameters | Default value
No. of attributes in each source table | 4
No. of indirect attributes in a product table | 1
No. of source tables | 2
No. of clusters in each source table | 2
Size of existing packages | 5M
Size of each source table | 100k

• Schema is similar to our example
• Anti-correlated
• 6 factors
• Measurement
  – Execution time
  – Pruning power
  – Ratio of competitive products out of all combinations
  – Memory usage

Page 26: Creating Competitive Products

Experiments

From 100k to 500k

[Figure annotations: Full pruning & partial pruning; TQ, TQ’, TR, SKY; pruning power slightly increases.]

Parameters | Default value
No. of attributes in each source table | 4
No. of indirect attributes in a product table | 1
No. of source tables | 2
No. of clusters in each source table | 6
Size of existing packages | 5M
Size of each source table | 100k

Page 27: Creating Competitive Products

Experiments

From 2.5M to 10M

[Figure annotations: More competitive; slightly decreases.]

Parameters | Default value
No. of attributes in each source table | 4
No. of indirect attributes in a product table | 1
No. of source tables | 2
No. of clusters in each source table | 6
Size of existing packages | 5M
Size of each source table | 100k

Page 28: Creating Competitive Products

Experiments

Travel Agency A Package Generation Set

1. A: 296 packages, 1014 hotels and 4394 flights. B: 149 packages, 995 hotels and 866 flights
2. Source tables from B, and packages from A
3. Vary discount from 0 to 0.50
4. Efficiency: ACCP (44.74s) and Baseline (84.47s)
5. |SKY|/|TQ|
6. |DOM|/|TE|

[Figure labels: DOM, SKY.]

Page 30: Creating Competitive Products

Conclusions

• Creating Competitive Products
  – Example
  – Problem Definition
• Algorithms
  – Framework
  – Intra-dominance checking
  – Inter-dominance checking
  – Post-processing
• Experiments
  – Synthetic anti-correlated datasets
  – Real datasets

Page 31: Creating Competitive Products

Thank you!

Q&A