20101217 mtg

Preview:

DESCRIPTION

 

Citation preview

Minimization of SuperSQL Execution Time by Query

Decomposition Toyama Laboratory

Midterm Presentation

Ria Mae Borromeo25 December 2010

2

an extension of SQL that provides the capability of generating various kinds of application data directly as a result of a query

Introduction (1)

4

Introduction (2)GENERATE HTML { { imagefile(path="C:\webpage", "prc_logoback.gif"), imagefile(path="C:\webpage", "main_top02.gif") @{valign=middle, align=center} } @{bgcolor="#296793", width="600"}! ""@{border-width=0, height=20}! "August 2010 Physician Licensure Exam"@{font-size=20, align="center"}! { {"List of Exams"@{align="center", font-size=14}! [link(t.t_name@{font-size=12, width=150}, file="", att="")@{border-width=0}]! }@{valign=top}, {"List of Passers"@{align="center", font-size=14}! [e.e_name@{font-size=10, border-width=0}]! }@{width=330, align=left, valign=top}, {"List of Schools"@{align="center", font-size=14}! [s.s_name@{font-size=10, border-width=0}]! }@{width=330, align=left, valign=top} }@{align="center"}} FROM examinee e, school s, examtype t

Remove decorations

5

Introduction (3)

GENERATE HTML { {"List of Exams"![t.t_name]!}, {"List of Passers"![e.e_name]!}, {"List of Schools"![s.s_name]!}} FROM examtype t, examinee e, school s

SELECT DISTINCT t.t_name, e.e_name, s.s_name, FROM examtype t, examinee e, school s

SQL Query

Problem Illustration

SELECT DISTINCT t.t_name, e.e_name, s.s_name, FROM examtype t, examinee e, school s

t

(20)

e

(250)

s

(50)

x x

t e s

Expected Output

Intermediate Table (250,000)20 + 250 + 50 = 320 tuples

6

Proposal (1)

SELECT t_nameFROM examtype

SELECT e_nameFROM examinee

SELECT s_nameFROM school

GENERATE HTML { {"List of Exams"![t.t_name]!}, {"List of Passers"![e.e_name]!}, {"List of Schools"![s.s_name]!}} FROM examtype t, examinee e, school s

t_name

(20)

e_name

(250)

s_name

(50)

Expected Output

… … …

SELECT DISTINCT t.t_name, e.e_name, s.s_name, FROM examtype t, examinee e, school s

8

Proposal (2)

3 Parts1. Detection of divisible queries2. Division of Queries

▪ Creation of SQL queries▪ Execution of SQL queries

3. Combination of query results

start

SuperSQL

query

Make SQL Statemen

t

Parse Query

Make Schema

Execute SQL Statement

Make Tree Structure

Generate Code

Application Data

end9O

rig

inal P

rocess Parse Query

Page decoration and query attributes are separated

Parse Query

Make Schema

Data

C

on

str

ucti

on

10

Schema

A tree-structured representation of the layout of the attributes

[ [t.t_name], [e.e_name], [s.s_name] ]

start

SuperSQL

query

Make SQL Statemen

t

Parse Query

Make Schema

Execute SQL Statement

Make Tree Structure

Generate Code

Application Data

end11O

rig

inal P

rocess Parse Query

Page decoration and query attributes are separated

Make Tree Structure The schema and the

tuples from the database are combined into a tree-structure

Generate Code Using the tree

structure, code for application data is generated

Parse Query

Make Schema

Make SQL Statemen

t

Execute SQL Statement

Make Tree Structure

Generate Code

Data

C

on

str

ucti

on

start

SuperSQL

query

Check Divisibility

Execute SQL

Statements

Combine results into

a tree structure

Generate Code

Application Data

end

Make SQL Statements

YN

12Pro

posed

Pro

cess

Make SQL Statemen

t

Make Schema

Execute SQL Statement

Make Tree Structure

Parse Query

Check Divisibility

13

Check Divisibility Algorithm

[ [t.t_name], [e.e_name], [s.s_name] ]

1. Get the attributes from the schemaGENERATE HTML { {"List of Exams"![t.t_name]!}, {"List of Passers"![e.e_name]!}, {"List of Schools"![s.s_name]!}} FROM examtype t, examinee e, school s

Schema

14

Check Divisibility Algorithm

2. Make a graph to identify relationshipst.t_nam

es.s_nam

ee.e_nam

e

Vertices : attributesEdges:

If two attributes are from the same table If two attributes are equated in a where

condition If two attributes are grouped in the

schema

15

Check Divisibility Algorithm

3. Find connected components using Depth-First Search

t.t_name

s.s_name

e.e_name

Connected Component - a subgraph that contains a path between all pairs of vertices in the subgraph

Depth-First Search - A search algorithm that extends the current path as far as possible before backtracking to the last choice point and trying the next alternative path. 

16

Example 2

GENERATE HTML [ f.title !{ { { "Director" ! f.dir } ! { "Running" ! f.length } }, { "Starring" , [ r.name % {r.name , { "Birthday" , r.bday }! "Biography" ! r.bio } } ]! }}! f.story ]!FROM film f, actor r, act aWHERE a.film = f.idAND a.act = r.id

1. Get the attributes from the schema

[[f.title, f.dir, f.length, [actor.name, actor.name, actor.birthday, actor.biography], f.story]]

17

Example 2

2. Make a graph to identify relationships

f.title

f.dir

f.len

f.story

f.id

r.name

r.bday

r.bio

r.ida.film

a.act

18

Example 2

3. Find connected components by Depth-First Search

f.title

f.dir

f.len

f.story

f.id

r.name

r.bday

r.bio

r.ida.film

a.act

f.lenf.title

f.dir

f.id

a.film

a.act

r.id

r.bio

r.bday

r.name

start

SuperSQL

query

Check Divisibility

Execute SQL

Statements

Combine results into

a tree structure

Generate Code

Application Data

end

Make SQL Statements

YN

19Pro

posed

Pro

cess

Make SQL Statemen

t

Make Schema

Execute SQL Statement

Make Tree Structure

Parse Query

Check Divisibility Make SQL

Statements

Execute SQL

Statements

Combine results into

a tree structure

20

Current Status

3 Parts1. Detection of divisible

queries2. Division of Queries

▪ Creation of SQL queries▪ Execution of SQL queries

3. Combination of query results

▪ Trivial case: columns are independent of each other

▪ Others

21

Experiments (1)

Intel (R) Core (TM)2 Duo CPU 2.09 GHz 1.97 GB RAM Microsoft Windows XP Professional

2002 SP3 Java 6 Update 22 Standard Edition

22

Experiments (2)

No. of Queries: 25

No. of Attributes per query

2 – 10

Best Case Running Time (ms)

4.043531

Worst Case Running Time (ms)

7.87614

Average Running Time (ms)

5.41567164

Standard Deviation (ms)

1.153155682

Check Divisibility Function

23

Experiments (3)

10 15 20 25

Orig-inal

195.8656734

194.7371448

199.2996856

207.803956

New 197.955362

194.8125736

201.3419246

201.3696432

187.5192.5197.5202.5207.5

Data Construction Time of Small Intermediate Table Size

Tim

e (

ms)

No. of Attributes

24

Experiments (4)

25000 50000 75000 100000 125000

Origi-nal

4503.630762

9322.29402

14360.094452

29402.622868

39187.224832

New

308.163948

296.253179

305.315075

308.774652

316.301103

2500

17500

32500

Data Construction Time of Large Intermediate Table Size

Execu

tion

Tim

e (

ms)

No. of Attributes

25

Summary

Implemented an algorithm for checking if SuperSQL queries can be decomposed into several SQL Queries

Queries are decomposed into several SQL queries when possible

Experiments show that the procedure reduces the execution time of SuperSQL in some query cases

27

Future Work

Continue implementation of the MakeTree function to address other query classes

Testing

More experiments

Statistical Analysis of data

28

Comments from Panel

Research Suggestions: Use database information For the example, use embedded system (with

limited memory) that generates webpages

Presentation Suggestion: Take note of the number of decimal points in the

data

Question: How can you be sure that the output generated by

the new one is the same as the original? It can be verified by checking the tree structure

Thank you for listening.