15
1 C. Shahabi CSCI585 CSCI585 Spatial Index Structures Cyrus Shahabi Computer Science Department University of Southern California [email protected] 2 C. Shahabi CSCI585 CSCI585 Outline Introduction Spatial Indexing R-Tree R + -Tree Quad Trees

Spatial Index Structures - InfoLab · CSCI585 Spatial Indexing Mapping spatial object into point – In either same, lower, or higher dimensional spaces – Good for storage purposes

  • Upload
    others

  • View
    28

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Spatial Index Structures - InfoLab · CSCI585 Spatial Indexing Mapping spatial object into point – In either same, lower, or higher dimensional spaces – Good for storage purposes

1

C. Shahabi

CSCI585CSCI585

Spatial Index Structures

Cyrus ShahabiComputer Science Department

University of Southern [email protected]

2

C. Shahabi

CSCI585CSCI585 Outline

IntroductionSpatial IndexingR-TreeR+-TreeQuad Trees

Page 2: Spatial Index Structures - InfoLab · CSCI585 Spatial Indexing Mapping spatial object into point – In either same, lower, or higher dimensional spaces – Good for storage purposes

2

3

C. Shahabi

CSCI585CSCI585 Introduction

Spatial objects– Points, lines, rectangles, regions, …

Hierarchical data structures– Based on recursive decomposition, similar to divide and

conquer method, like B-tree.Why not B-Tree?– More than one dimension– Concept of closeness relies on all the dimensions of the

spatial dataSpatial index structures demo– http://www.cs.umd.edu/~brabec/quadtree/index.html

4

C. Shahabi

CSCI585CSCI585 Spatial Indexing

Mapping spatial object into point– In either same, lower, or higher dimensional spaces– Good for storage purposes– Problems with queries like finding the nearest objects

Bucketing methods– Based on spatial occupancy– Decomposing the space from which the data is drawn

• Minimum bounding rectangle (MBR) : e.g., R-Tree• Disjoint cells: e.g., R+-Tree• Blocks of uniform size• Distribution of the data: e.g., quadtree

data-dependent

greater degree of data-independence

Page 3: Spatial Index Structures - InfoLab · CSCI585 Spatial Indexing Mapping spatial object into point – In either same, lower, or higher dimensional spaces – Good for storage purposes

3

5

C. Shahabi

CSCI585CSCI585 R-Tree[proposed in 1984 by Guttman]

Based on Minimum Bounding Rectangle (m,M)=(1,3)

R4

d g h

R3

R1

a b

a b

dg

h

R2

R5 R6

c i e fc

ie

f

�bounding rectangles could overlap each others (e.g., R3 vs R4)�an object is only associated with one bounding rectangle

6

C. Shahabi

CSCI585CSCI585 R-Tree[proposed in 1984 by Guttman]

Height-balanced tree similar to B-tree for k-dimensionsEvery leaf node contains between m (m ≤M/2) and M index records, unless it is the rootFor each index record (I, tuple-identifier) in a leaf node, I is the MBR that contains the n-dimensional data object represented by the indicated tupleEvery non-leaf node has between m and M children unless it is the rootFor each entry (I, child-pointer) in a non-leaf node, I is the MBR that spatially contains the rectangles in the child node.All leaves appear on the same levelThe root node has at least two children unless it is a leaf

Page 4: Spatial Index Structures - InfoLab · CSCI585 Spatial Indexing Mapping spatial object into point – In either same, lower, or higher dimensional spaces – Good for storage purposes

4

7

C. Shahabi

CSCI585CSCI585 Insertion Processes

d

ef

i

gm

A

Bh

k lj

CX

A new index entry

X

ChooseLeaf

Has room

Install X

Yes

SplitNode

No

C

AdjustTree

C Adjust all related entries

Different variant:•Exhaustive•Quadratic•Linear•Packed•Hilbert Packed•…etc.

8

C. Shahabi

CSCI585CSCI585 Processes of Quadratic Spilt(page 52 in Guttman’s paper)

Pick first entry for each groupRun PickSeeds d

ef

i

gm

A

Bh

k lj

CX

d e f g h i j k l m

CBA

Page 5: Spatial Index Structures - InfoLab · CSCI585 Spatial Indexing Mapping spatial object into point – In either same, lower, or higher dimensional spaces – Good for storage purposes

5

9

C. Shahabi

CSCI585CSCI585 Processes of Quadratic Spilt(page 52 in Guttman’s paper)

d

ef

i

gm

A

Bh

k lj

CX

d e f g h i j k l m

CBA

PickSeedsPS1 [Calculate inefficiency of grouping entries together]For each pair of E1 and E2, compose a rectangle R including E1 and E2Calculate d = area(R) - area(E1) - area(E2)

PS2 [Choose the most wasteful pair ] Choose the pair with the largest d

lj

k ll

Xm

l

10

C. Shahabi

CSCI585CSCI585 Processes of Quadratic Spilt(page 52 in Guttman’s paper)

Pick first entry for each group(PickSeeds)

d e f g h i j k l m

CBA

Check if done

G1 G2l m

Select entry to assign(PickNext)

No

i

gm

d

ef

A

Bh

k lj

G1

XG2

Page 6: Spatial Index Structures - InfoLab · CSCI585 Spatial Indexing Mapping spatial object into point – In either same, lower, or higher dimensional spaces – Good for storage purposes

6

11

C. Shahabi

CSCI585CSCI585 Processes of Quadratic Spilt(page 52 in Guttman’s paper)

Pick first entry for each group(PickSeeds)

i

gm

d

ef

A

Bh

k lj

G1

X

Check if done

G1 G2l m

PickNextPN1 [Determine cost of putting each entry in each group] For each entry Ecalculate d1 = the increased MBR area required for G1 calculate d2 = the increased MBR area required for G2 PN2 [Find entry with greatest preference for one group]Choose the entry with the maximum difference between d1 and d2

No

G2

G1

G2

kG2

G1m

X

G2

G1j

12

C. Shahabi

CSCI585CSCI585 Processes of Quadratic Spilt(page 52 in Guttman’s paper)

Pick first entry for each group(PickSeeds)

d e f g h i j k l m

CBA

Check if done

G1 G2l m

Select entry to assign(PickNext)

No

G1 G2l mk

i

gm

d

ef

A

Bh

k lj

G1

XG2

Page 7: Spatial Index Structures - InfoLab · CSCI585 Spatial Indexing Mapping spatial object into point – In either same, lower, or higher dimensional spaces – Good for storage purposes

7

13

C. Shahabi

CSCI585CSCI585

G2G2G1

G1

Processes of Quadratic Spilt(page 52 in Guttman’s paper)

Pick first entry for each group(PickSeeds)

Check if done

G1 G2l m

Select entry to assign(PickNext)

No

G1 G2l mk

i

gm

d

ef

A

Bh

k lj

G1

XG2

jX

14

C. Shahabi

CSCI585CSCI585 Processes of Quadratic Spilt(page 52 in Guttman’s paper)

Pick first entry for each group(PickSeeds)

Check if done

G1 G2l m

Select entry to assign(PickNext)

No

i

gm

d

ef

A

Bh

k lj

G1

XG2

G1 G2l mk X

Page 8: Spatial Index Structures - InfoLab · CSCI585 Spatial Indexing Mapping spatial object into point – In either same, lower, or higher dimensional spaces – Good for storage purposes

8

15

C. Shahabi

CSCI585CSCI585 Processes of Quadratic Spilt(page 52 in Guttman’s paper)

Pick first entry for each group(PickSeeds)

Check if done

G1 G2l m

Select entry to assign(PickNext)

No

i

gm

d

ef

A

Bh

k lj

G1

XG2

G1 G2l mk X

j

d e f g h i l k

G2G1BA

m x j

16

C. Shahabi

CSCI585CSCI585 Your Exercise Before Midterm(m,M)=(2,4)

Build a R-Tree for these spatial dataHint: You could use the Spatial index structures demo application step by step

DF G

E HJ

K

L

Page 9: Spatial Index Structures - InfoLab · CSCI585 Spatial Indexing Mapping spatial object into point – In either same, lower, or higher dimensional spaces – Good for storage purposes

9

17

C. Shahabi

CSCI585CSCI585 Search Object in R-Tree

i

g m

d

ef

A

Bh

k lj

G1

xG2

d e f g h i l k

G2G1BA

m x j

Input (T, j)

Leaf?

Yes

Search each subtree E of T

No

Overlap

(E,j)

Yes

Check each entry E of T

Overlap

Qualifyingrecord

Yes

18

C. Shahabi

CSCI585CSCI585 Main Drawbacks of R-Tree

R-tree is not unique, rectangles depend on how objects are inserted and deleted from the tree.In order to search some object you might have to go through several rectangles or the whole database– Why?– Solution?

Page 10: Spatial Index Structures - InfoLab · CSCI585 Spatial Indexing Mapping spatial object into point – In either same, lower, or higher dimensional spaces – Good for storage purposes

10

19

C. Shahabi

CSCI585CSCI585 R+-Tree

Overcome problems with R-TreeIf node overlaps with several rectangles insert the node in allDecompose the space into disjoint cells

R2

R3 R4 R5 R6

R1

a bd g h c i e fh i c i

g

d

h

ci

f

e

b

20

C. Shahabi

CSCI585CSCI585 R+-Tree PropertiesR+-tree and cell-trees used approach of discomposing space into cells– R+-trees deals with collection of objects bounded by rectangles– Cell tree deals with collection of objects bounded by convex

polyhedronR+-trees is extension of k-d-B-treeRetrieval times are smallerWhen summing the objects, needs eliminate duplicatesAgain, it is data-dependent

Page 11: Spatial Index Structures - InfoLab · CSCI585 Spatial Indexing Mapping spatial object into point – In either same, lower, or higher dimensional spaces – Good for storage purposes

11

21

C. Shahabi

CSCI585CSCI585 Quad Trees

Region Quadtree– The blocks are required to be disjoint– Have standard sizes (squares whose sides are power of two)– At standard locations– Based on successive subdivision of image array

into four equal-size quadrants– If the region does not cover the entire array, subdivide into

quadrants, sub-quadrants, etc.– A variable resolution data structure

22

C. Shahabi

CSCI585CSCI585 Example of Region Quadtree

12 3

4 5

67 8

9 10

11 12

13 14

15 16

17 1819

A

B C E

2

1

3 4 5 6 11 12D 13 14 19F

15 16 17 187 8 9 10

NW NE SW SE

3

311

11

Page 12: Spatial Index Structures - InfoLab · CSCI585 Spatial Indexing Mapping spatial object into point – In either same, lower, or higher dimensional spaces – Good for storage purposes

12

23

C. Shahabi

CSCI585CSCI585 PR Quadtree

PR (Point-Region) quadtreeRegular decomposition (similar to Region quadtree)Independent of the order in which data points are inserted into it�: if two points are very close, decomposition can be very deep

24

C. Shahabi

CSCI585CSCI585 Example of PR Quadtree

(0,0) (100,0)

(100,100)(0,100)

Seattle(1,55)

Toronto(62,77)

Buffalo(82,65)

Denver(5,45)

Chicago(35,42)

Omaha(27,35) Mobile

(52,10)

Atlanta(85,15)

Miami(90,5)

A

Seattle(1,55)

Toronto(62,77)

Buffalo(82,65)

Denver(5,45)

Chicago(35,42)

Omaha(27,35)

Mobile(52,10)

Atlanta(85,15)

Miami(90,5)

B C E

D F

Subdivide into quadrants until the two points are located in different regions

Page 13: Spatial Index Structures - InfoLab · CSCI585 Spatial Indexing Mapping spatial object into point – In either same, lower, or higher dimensional spaces – Good for storage purposes

13

25

C. Shahabi

CSCI585CSCI585 PM Quadtree

PM (Polygonal-Map) quadtree family– PM1 quadtree, PM2 quadtree, PM3 quadtree, PMR quadtree, … etc.

PM1 quadtree– Based on regular decomposition of space– Vertex-based implementation– Criteria

• At most one vertex can lie in a region represented by a quadtree leaf• If a region contains a vertex, it can contain no partial-edge that does not

include that vertex• If a region contains no vertices, it can contain at most one partial-edge

26

C. Shahabi

CSCI585CSCI585 PM Quadtree

PM1 quadtree

PM2 quadtree

PM3 quadtree

Page 14: Spatial Index Structures - InfoLab · CSCI585 Spatial Indexing Mapping spatial object into point – In either same, lower, or higher dimensional spaces – Good for storage purposes

14

27

C. Shahabi

CSCI585CSCI585 Example of PM1 Quadtree

(0,0) (100,0)

(100,100)(0,100) Each node in a PM quadtreeis a collection of partial edges (and a vertex)Each point record has two field (x,y)Each partial edge has four field (starting_point, ending_point, left region, right region)

28

C. Shahabi

CSCI585CSCI585 Question?

Question?

Page 15: Spatial Index Structures - InfoLab · CSCI585 Spatial Indexing Mapping spatial object into point – In either same, lower, or higher dimensional spaces – Good for storage purposes

15

29

C. Shahabi

CSCI585CSCI585 B-Tree: Definition

A B-Tree of order M is a height-balanced tree 1. All leaves are on the same level2. All nodes have at most M children3. All internal nodes except the root have at least M/2 children

=m

30

12 15 22

11

51

35 41

66 78

2 7 54 55 60 68 71 75 89 95100Back to R+ Tree

30

C. Shahabi

CSCI585CSCI585 B-Tree: InsertionA new index entry

ChooseLeaf

Has room

Install X

Yes

SplitNode

No

AdjustTreeback