Graduate Course Spatial Data

Page 1: Graduate Course Spatial Data

Page 1KUT

Graduate Course: Spatial Data

Jun-Ki Min (민준기), Korea University of Technology (한국기술대학교)

Page 2: Graduate Course Spatial Data

Spatial Data

• Traditional Data
  – Single dimension
  – Value, text

• New Applications
  – GIS
  – CAD
  – LBS
  – Multimedia data
  – Multi-dimensional data

Page 3: Graduate Course Spatial Data

Spatial Access Method(SAM)

• Support efficient access of Spatial Data

• B-Tree
  – Only one-dimensional data
  – Not appropriate for multi-dimensional data

• One of the best-known spatial indexes
  – R-Tree

Page 4: Graduate Course Spatial Data

R-Trees : A Dynamic Index Structure for Spatial Searching

• R-Tree
  – A height-balanced tree with index records in its leaf nodes containing pointers to data objects.
  – Dynamic structure: inserts and deletes can be intermixed with searches, and no periodic reorganization is required.

Page 5: Graduate Course Spatial Data

R-Trees : A Dynamic Index Structure for Spatial Searching

• R-Tree
  – It is difficult to handle pure spatial data directly
  – Based on MBR (minimum bounding rectangle) approximation

[Figure: example R-tree with node R1, bounding rectangles A1 and A2, and data rectangles a1, a2, a3, a4]
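The MBR approximation can be illustrated with a minimal sketch. The tuple layout `(xmin, ymin, xmax, ymax)`, the helper names, and the sample coordinates are assumptions made for illustration; they do not come from the slides.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # assumed layout: (xmin, ymin, xmax, ymax)


def mbr_of_points(points: Iterable[Point]) -> Rect:
    """Smallest axis-aligned rectangle containing every point of an object."""
    pts = list(points)
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    return (min(xs), min(ys), max(xs), max(ys))


def mbr_union(a: Rect, b: Rect) -> Rect:
    """Smallest rectangle enclosing both a and b, as a parent entry must."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


# Hypothetical data: a triangular object approximated by its MBR,
# then grouped with a second rectangle under one covering rectangle.
a1 = mbr_of_points([(1.0, 2.0), (3.0, 0.0), (2.0, 5.0)])
a2 = (4.0, 1.0, 6.0, 3.0)
A1 = mbr_union(a1, a2)
```

Only the four numbers of each MBR are stored in the index, so the exact geometry still has to be checked after the index lookup.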

Page 6: Graduate Course Spatial Data

R-Tree Structure

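• Node = $(E_1, \dots, E_M)$

• $E_i = (I, \text{pointer})$, where $I = (I_0, \dots, I_{d-1})$, d is the number of dimensions, and each $I_j = [a, b]$ is the extent of the rectangle along dimension j

• Let M be the maximum number of entries in a node, and let $m \le M/2$ be the minimum number of entries in a node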

Page 7: Graduate Course Spatial Data

Property of R-tree

• Every leaf node contains between m and M index records unless it is the root.

• For each index record (I, pointer) in a leaf node, I is the smallest rectangle that spatially contains the n-dimensional data object represented by the indicated tuple.

• Every non-leaf node has between m and M children unless it is the root.

• For each entry (I, pointer) in a non-leaf node, I is the smallest rectangle that spatially contains the rectangles in the child node.

• The root node has at least two children unless it is a leaf.

• All leaves appear on the same level.

Page 8: Graduate Course Spatial Data

Property of R-Tree

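• The height of an R-Tree containing N index records is at most $\lceil \log_m N \rceil - 1$
  – The maximum number of nodes is $\lceil N/m \rceil + \lceil N/m^2 \rceil + \dots + 1$, where the first term $\lceil N/m \rceil$ is the number of leaf nodes
  – Worst-case space utilization for all nodes except the root node is m/M.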
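As a worked example of the two bounds above, the following sketch evaluates them with integer arithmetic. The function names and the sample values N = 1000, m = 10 are assumptions chosen for illustration.

```python
def ceil_log(n: int, base: int) -> int:
    """Smallest k with base**k >= n, computed without floating point."""
    if base < 2:
        raise ValueError("base (the minimum fill m) must be at least 2")
    k, power = 0, 1
    while power < n:
        power *= base
        k += 1
    return k


def max_height(n_records: int, m: int) -> int:
    """Upper bound ceil(log_m N) - 1 on the tree height."""
    return ceil_log(n_records, m) - 1


def max_nodes(n_records: int, m: int) -> int:
    """Upper bound ceil(N/m) + ceil(N/m^2) + ... + 1 on the node count."""
    if m < 2:
        raise ValueError("m must be at least 2")
    total, level = 0, n_records
    while True:
        level = -(-level // m)  # ceiling division: nodes on the next level up
        total += level
        if level <= 1:
            return total


height_bound = max_height(1000, 10)
node_bound = max_nodes(1000, 10)
```

With these sample values the node bound is 100 leaf nodes, 10 nodes above them, and 1 root.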

Page 9: Graduate Course Spatial Data

R-Tree Search

• Due to the overlap of MBRs, many index nodes may be visited.

Search(MBR)
  if (leaf node) {
    check all entries in this node and report those that overlap MBR
  } else {
    for each child node nx whose rectangle overlaps MBR
      nx.Search(MBR)
  }
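The search procedure above can be sketched as runnable code. The `Node` class, the `(xmin, ymin, xmax, ymax)` rectangle layout, and the sample tree are assumptions made for illustration, not a definitive implementation.

```python
from dataclasses import dataclass, field
from typing import Any, List, Tuple

Rect = Tuple[float, float, float, float]  # assumed layout: (xmin, ymin, xmax, ymax)


def overlaps(a: Rect, b: Rect) -> bool:
    """True if the two rectangles share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


@dataclass
class Node:
    is_leaf: bool
    # Leaf entries hold (mbr, object id); non-leaf entries hold (mbr, child Node).
    entries: List[Tuple[Rect, Any]] = field(default_factory=list)


def search(node: Node, query: Rect) -> List[Any]:
    """Collect the ids of all leaf entries whose MBR overlaps the query."""
    results: List[Any] = []
    for mbr, payload in node.entries:
        if not overlaps(mbr, query):
            continue
        if node.is_leaf:
            results.append(payload)
        else:
            # Several children may overlap the query, so several subtrees are visited.
            results.extend(search(payload, query))
    return results


# Hypothetical two-level tree.
leaf1 = Node(True, [((0, 0, 2, 2), "a1"), ((1, 1, 3, 3), "a2")])
leaf2 = Node(True, [((5, 5, 6, 6), "a3"), ((6, 4, 8, 7), "a4")])
root = Node(False, [((0, 0, 3, 3), leaf1), ((5, 4, 8, 7), leaf2)])
hits = search(root, (2.5, 2.5, 5.5, 5.5))
```

In this sample the query window overlaps the covering rectangles of both leaves, so both are visited even though each contributes only one result.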

Page 10: Graduate Course Spatial Data

R-Tree Insertion

• Algorithm Insertion(newMBR)
  – Find position for new record
    • Call ChooseLeaf to select a leaf node
  – Add record to leaf node
    • If the node is full, call SplitNode
  – Propagate changes upward
    • Call AdjustTree
  – Grow tree taller (if the root was split, create a new root)

Page 11: Graduate Course Spatial Data

R-Tree Insert

• Algorithm ChooseLeaf
  CL1 Set N to be the root.
  CL2 If N is a leaf, return N. Otherwise, choose the entry in N whose rectangle needs the least area enlargement to include the new data. Resolve ties by choosing the entry with the smallest rectangle.
  CL3 Set N to be the child node pointed to by the child pointer of the chosen entry.
  CL4 Repeat from CL2.
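A minimal sketch of ChooseLeaf follows. The `Node` class, the rectangle layout `(xmin, ymin, xmax, ymax)`, and the sample tree are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from typing import Any, List, Tuple

Rect = Tuple[float, float, float, float]  # assumed layout: (xmin, ymin, xmax, ymax)


@dataclass
class Node:
    is_leaf: bool
    # Leaf entries hold (mbr, object id); non-leaf entries hold (mbr, child Node).
    entries: List[Tuple[Rect, Any]] = field(default_factory=list)


def area(r: Rect) -> float:
    return (r[2] - r[0]) * (r[3] - r[1])


def union(a: Rect, b: Rect) -> Rect:
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def enlargement(r: Rect, new: Rect) -> float:
    """Extra area r would need in order to include new."""
    return area(union(r, new)) - area(r)


def choose_leaf(root: Node, new_rect: Rect) -> Node:
    node = root                                  # CL1
    while not node.is_leaf:                      # CL2
        # Least enlargement first; ties go to the smaller rectangle.
        _, child = min(
            node.entries,
            key=lambda e: (enlargement(e[0], new_rect), area(e[0])),
        )
        node = child                             # CL3, then repeat (CL4)
    return node


# Hypothetical two-level tree.
leaf1 = Node(True, [((0, 0, 2, 2), "a1"), ((1, 1, 3, 3), "a2")])
leaf2 = Node(True, [((5, 5, 6, 6), "a3"), ((6, 4, 8, 7), "a4")])
root = Node(False, [((0, 0, 3, 3), leaf1), ((5, 4, 8, 7), leaf2)])
target = choose_leaf(root, (2.0, 2.0, 2.5, 2.5))
```

The new rectangle already lies inside the first covering rectangle, so that entry needs no enlargement and its leaf is chosen.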

Page 12: Graduate Course Spatial Data

R-Tree Insert

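• If there is no room, invoke SplitNode
  – Split the entries so as to minimize the size of the resulting MBRs

• Optimal SplitNode -> examine every way of dividing the M+1 entries into two subsets -> $O(2^{M-1})$ cases

[Figure: a bad split compared with a good split]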

Page 13: Graduate Course Spatial Data

R-Tree Insert

• Approximation (see details)
  – Quadratic ($O(M^2)$)
  – Linear
    • Select the two entries that are farthest apart
    • Insert the remaining entries into the groups
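The linear approach can be sketched roughly as follows. This is a simplified illustration under stated assumptions: seeds are picked by the greatest separation normalized by the total extent along a dimension, the remaining entries go to the group whose MBR grows least, and the minimum fill m is not enforced. All names and sample rectangles are hypothetical.

```python
from typing import List, Sequence, Tuple

Rect = Tuple[float, float, float, float]  # assumed layout: (xmin, ymin, xmax, ymax)


def area(r: Rect) -> float:
    return (r[2] - r[0]) * (r[3] - r[1])


def union(a: Rect, b: Rect) -> Rect:
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def pick_seeds_linear(rects: Sequence[Rect]) -> Tuple[int, int]:
    """Indices of the two rectangles with the greatest normalized separation."""
    best = (float("-inf"), 0, 1)
    for lo, hi in ((0, 2), (1, 3)):  # x dimension, then y dimension
        highest_low = max(range(len(rects)), key=lambda i: rects[i][lo])
        lowest_high = min(range(len(rects)), key=lambda i: rects[i][hi])
        if highest_low == lowest_high:
            continue
        width = max(r[hi] for r in rects) - min(r[lo] for r in rects)
        separation = (rects[highest_low][lo] - rects[lowest_high][hi]) / (width or 1.0)
        if separation > best[0]:
            best = (separation, lowest_high, highest_low)
    return best[1], best[2]


def linear_split(rects: Sequence[Rect]) -> Tuple[List[int], List[int]]:
    """Split entry indices into two groups (minimum fill m is ignored here)."""
    a, b = pick_seeds_linear(rects)
    groups: Tuple[List[int], List[int]] = ([a], [b])
    mbrs = [rects[a], rects[b]]
    for i, r in enumerate(rects):
        if i in (a, b):
            continue
        growth = [area(union(mbrs[g], r)) - area(mbrs[g]) for g in (0, 1)]
        g = 0 if growth[0] <= growth[1] else 1
        groups[g].append(i)
        mbrs[g] = union(mbrs[g], r)
    return groups


# Hypothetical overflowing node with two obvious clusters.
entries = [(0, 0, 1, 1), (0.5, 0.5, 1.5, 1.5), (8, 8, 9, 9), (8.5, 8, 9.5, 9)]
group_a, group_b = linear_split(entries)
```

A production split would also guarantee that each group ends up with at least m entries.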

Page 14: Graduate Course Spatial Data

R-Tree Insertion

• Adjust covering rectangles and propagate node splits as necessary

• Ascend from leaf node L to the root

AdjustTree Algorithm

• [Initialize] N = L

• [Check if done] If N is the root, stop

• [Adjust covering rectangle in parent entry]
  – Let P be the parent of N, and let E_N be N's entry in P
  – Modify the MBR of E_N to enclose all MBRs in N

• [Propagate node split upward]
  – If N has a partner NN resulting from an earlier split, create a new entry E_NN and add E_NN to P
  – If P has no room, invoke SplitNode to produce P and PP

• [Move up to next node]
  – Set N = P, and NN = PP if a split occurred; go to step 2 ([Check if done])
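The covering-rectangle part of AdjustTree can be sketched as below. The sketch deliberately omits split propagation (the NN / PP handling), and the `Node` class with a `parent` pointer is an assumption made for illustration.

```python
from typing import List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # assumed layout: (xmin, ymin, xmax, ymax)


class Node:
    def __init__(self, is_leaf: bool, parent: Optional["Node"] = None):
        self.is_leaf = is_leaf
        self.parent = parent
        # Each entry is a mutable pair [mbr, payload]; payload is an
        # object id in a leaf and a child Node otherwise.
        self.entries: List[list] = []


def cover(node: Node) -> Rect:
    """Smallest rectangle enclosing every entry of the node."""
    rects = [e[0] for e in node.entries]
    return (
        min(r[0] for r in rects),
        min(r[1] for r in rects),
        max(r[2] for r in rects),
        max(r[3] for r in rects),
    )


def adjust_tree(leaf: Node) -> None:
    """Ascend from the leaf to the root, tightening each parent entry."""
    n = leaf                                  # [Initialize]
    while n.parent is not None:               # [Check if done]
        p = n.parent
        for entry in p.entries:               # [Adjust covering rectangle]
            if entry[1] is n:
                entry[0] = cover(n)
        n = p                                 # [Move up to next node]


# Hypothetical tree: insert a2 into the leaf, then repair the parent entry.
root = Node(is_leaf=False)
leaf = Node(is_leaf=True, parent=root)
leaf.entries.append([(0, 0, 1, 1), "a1"])
root.entries.append([cover(leaf), leaf])
leaf.entries.append([(2, 2, 4, 3), "a2"])
adjust_tree(leaf)
```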

Page 15: Graduate Course Spatial Data

Processing and Optimization of Multiway Spatial Joins Using R-trees

• Cost-Based Query Optimizer
  – Join selectivity
    • Probability that a tuple is in the result
  – Used to generate the most efficient query execution plan

• Spatial Join Selectivity
  – Multi-dimensional attributes
    • Commonly 2 dimensions

• This work focuses on computing the cost of the filter step (i.e., considering only MBRs)

Page 16: Graduate Course Spatial Data

Previous Work

• Assumption
  – $[0,1)^d$
    • d-dimensional work space
    • Data is uniformly distributed
    • Each dimension is independent

Page 17: Graduate Course Spatial Data

Previous Work

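• Window Query
  – Find all points contained in window q
  – $S(q) = \prod_{i=1}^{d} |q_i|$, where $|q_i|$ is the size of q along dimension i (this is $|q|^d$ when q has the same size in every dimension)

[Figure: query window q with side lengths $q_x$ and $q_y$]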
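Under the uniformity and independence assumptions, the window selectivity is simply the volume of the window inside the unit work space. A minimal sketch, with hypothetical function names and sample numbers:

```python
import math
from typing import Sequence


def window_selectivity(extents: Sequence[float]) -> float:
    """Fraction of uniformly distributed points falling inside the window.

    extents[i] is the window size |q_i| along dimension i, measured in the
    unit work space [0, 1)^d.
    """
    return math.prod(extents)


def expected_hits(n_points: int, extents: Sequence[float]) -> float:
    """Expected number of points returned by the window query."""
    return n_points * window_selectivity(extents)


# Hypothetical 2-D window covering 10% of x and 20% of y.
selectivity = window_selectivity([0.1, 0.2])
hits = expected_hits(10_000, [0.1, 0.2])
```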

Page 18: Graduate Course Spatial Data

Previous Work

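• 2-Way Join Query
  – Find the pairs in which Ra intersects Rb
  – $S(R_a, R_b) = (|S_a| + |S_b|)^d$, where $|S_i|$ is the average size of $R_i$ on one dimension and d is the number of dimensions

[Figure: region of extent $|S_{a,x}| + |S_{b,x}|$ along x and $|S_{a,y}| + |S_{b,y}|$ along y]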

Page 19: Graduate Course Spatial Data

Previous Work

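• M-Way Linear Queries (Acyclic Queries)
  – Ra intersects Rb and Rb intersects Rc: $S(R_a, R_b, R_c) = (|S_a| + |S_b|)^d \, (|S_b| + |S_c|)^d$
  – Generalization: $\prod_{\forall i,j:\, Q(i,j) = \text{TRUE}} (|S_i| + |S_j|)^d$

[Figure: chain of rectangles with sizes $|S_a|$, $|S_b|$, $|S_c|$]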
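The pairwise formula and its generalization to acyclic queries can be evaluated with a short sketch. The dictionary of average sizes, the edge list standing in for the predicate Q(i, j), and the sample values are assumptions for illustration.

```python
from typing import Dict, Iterable, Tuple


def pairwise_selectivity(size_a: float, size_b: float, d: int) -> float:
    """(|S_a| + |S_b|)^d for average side lengths in the unit work space."""
    return (size_a + size_b) ** d


def acyclic_selectivity(
    sizes: Dict[str, float], edges: Iterable[Tuple[str, str]], d: int
) -> float:
    """Product of pairwise selectivities over every join edge Q(i, j)."""
    result = 1.0
    for i, j in edges:
        result *= pairwise_selectivity(sizes[i], sizes[j], d)
    return result


# Hypothetical chain query: Ra intersects Rb, and Rb intersects Rc.
sizes = {"a": 0.01, "b": 0.02, "c": 0.03}
chain = acyclic_selectivity(sizes, [("a", "b"), ("b", "c")], d=2)
```

Each join edge contributes one independent factor, which is why this product form applies only when the query graph has no cycles.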

Page 20: Graduate Course Spatial Data

Previous Work

• M-Way Clique Join Query (M ≥ 3)
  – Papadias, Mamoulis, Theodoridis (ACM PODS 99)
  – Clique: if a set of rectangles mutually intersect, then they must share a common area

[Figure: query graph over R1, R2, R3 and the corresponding spatial relationship of S1, S2, S3]

Page 21: Graduate Course Spatial Data

Previous Work

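– Common Area ($q_n$): side length of the area shared by n mutually intersecting rectangles

$$|q_n| = \frac{\prod_{i=1}^{n} |S_i|}{\sum_{i=1}^{n} \prod_{j=1,\, j \neq i}^{n} |S_j|}$$

– Proof (by induction), base case n = 2:

$$|q_2| = \frac{|s_1|\,|s_2|}{|s_1| + |s_2|}$$

[Figure: the three relative positions of two intersecting intervals $s_1$ and $s_2$ along one dimension, with $|s_1| \le |s_2|$]

Probability: $\frac{|s_1|}{|s_1| + |s_2|}$, $\frac{|s_2| - |s_1|}{|s_1| + |s_2|}$, $\frac{|s_1|}{|s_1| + |s_2|}$

Representative value: $\frac{|s_1|}{2}$, $|s_1|$, $\frac{|s_1|}{2}$

Summing probability times representative value over the three cases gives $|q_2|$.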

Page 22: Graduate Course Spatial Data

Previous Work

– Selectivity of M-Way Clique Join Query

Prob(s2 intersects s1) * Prob(s3 intersects s1 ∧ s3 intersects s2 | s1, s2 mutually intersect) = Prob(s2 intersects s1) * Prob(s3 intersects the common intersection area of s1 and s2)

$$(|s_1| + |s_2|)^d \left( |s_3| + \frac{|s_1|\,|s_2|}{|s_1| + |s_2|} \right)^d = \left( |s_1||s_2| + |s_2||s_3| + |s_3||s_1| \right)^d$$

– General Case:

$$\left( \sum_{i=1}^{n} \prod_{j=1,\, j \neq i}^{n} |S_j| \right)^d$$
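The clique formulas can be checked numerically with a small sketch. The function names and the sample side lengths are assumptions for illustration; the check confirms that the step-by-step product for three rectangles matches the general closed form.

```python
import math
from typing import Sequence


def sum_of_products(sizes: Sequence[float]) -> float:
    """Sum over i of the product of all |S_j| with j != i."""
    return sum(
        math.prod(s for j, s in enumerate(sizes) if j != i)
        for i in range(len(sizes))
    )


def common_side(sizes: Sequence[float]) -> float:
    """Side length |q_n| of the area shared by n mutually intersecting rectangles."""
    return math.prod(sizes) / sum_of_products(sizes)


def clique_selectivity(sizes: Sequence[float], d: int) -> float:
    """General-case selectivity of an n-way clique join in d dimensions."""
    return sum_of_products(sizes) ** d


# Hypothetical average side lengths for three data sets, d = 2.
s1, s2, s3, d = 0.1, 0.2, 0.3, 2
closed_form = clique_selectivity([s1, s2, s3], d)
# Step by step: s2 meets s1, then s3 meets their common area q_2.
step_by_step = (s1 + s2) ** d * (s3 + common_side([s1, s2])) ** d
```

For two rectangles the closed form reduces to the 2-way join selectivity, since the sum of products is then just the sum of the two side lengths.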