Persistent Data Structures Computational Geometry, WS 2007/08 Lecture 12 Prof. Dr. Thomas Ottmann Khaireel A. Mohamed Algorithmen & Datenstrukturen, Institut für Informatik Fakultät für Angewandte Wissenschaften Albert-Ludwigs-Universität Freiburg

Persistent Data Structures

Computational Geometry, WS 2007/08
Lecture 12

Prof. Dr. Thomas Ottmann
Khaireel A. Mohamed

Algorithmen & Datenstrukturen, Institut für Informatik
Fakultät für Angewandte Wissenschaften
Albert-Ludwigs-Universität Freiburg

Overview

• Versions and persistence in data structures
• Making structures persistent
• Partial persistence
  – Fat node method
  – Path-copying method
  – Node-copying (DSST) method
• Revisit: Planar point location
  – Sarnak-Tarjan solution
  – Dobkin-Lipton observation

Data Structures in the Temporal Sense

A data structure is called
• Ephemeral – no mechanisms to revert to previous states.
  – Usually, a single transitory structure where a change to the structure destroys the old version.
• Persistent – supports access to multiple versions. Furthermore, a structure is
  – partially persistent if all versions can be accessed but only the newest version can be modified, and
  – fully persistent if every version can be both accessed and modified.

A Linked Data Structure

Pre-definitions:
• A linked data structure has a finite collection of nodes.
• Each node contains a fixed number of named fields.
• All nodes in the structure are of exactly the same type.
• Access to the linked structure is by pointers indicating nodes of the structure.

In our deliberations:
• We shall use the binary search tree as our linked data structure for all running examples throughout the lecture.

Persisted Versions

Versions are directly related to the operations incurred on the data structure, mainly:

• Update operations
• Access operations

• After an update operation, the current and all previous states of the data structure are archived in a manner that makes them accessible (via access operations) from their version identities.

Terminologies

• Current version – Version v_i of the data structure, on which the current operation is about to be performed.

• Current operation – An update operation performed on the current version v_i of the data structure, which results in the newest version v_{i+1}, spawned after successful completion of the operation.

Making Structures Persistent: Naïve I

Naïve Structure-Copy Method
• Make a copy of the data structure each time it is changed
• At current operation:
  – A new version v_{i+1} is spawned by completely copying the current version
  – The update operation is performed on the newest version
• Costs (for structure of size n):
  – Per update: Time Ω(n), Space Ω(n)
  – For m updates: Time Ω(mn), Space Ω(mn)

Making Structures Persistent: Naïve II

Naïve Log-File Method
• Store a log-file of all updates
• At current operation:
  – Update log-file
• To access version i:
  – Sequentially carry out i updates, starting from the initial structure, to generate version i.
• Costs (for structure of size n):
  – Per update: Time O(1), Space O(1) (to append the log entry)
  – For m updates: Time O(m), Space O(m)
  – Per access: Time Ω(i) for version i, i.e. Ω(m) in the worst case

Hybrid Method

Structure-Copying with Log-file
• Store the complete sequence of updates in a log-file
• Store every k-th version of the data structure, for a suitably chosen k
• To access version i:
  – Retrieve the stored structure of version k⌊i/k⌋
  – Sequentially apply the remaining logged updates to obtain version i
• Tradeoff compared with the log-file method:
  – Any choice of k increases either the storage space or the access time by a factor of √m
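The hybrid scheme can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the "data structure" is a plain Python set, and the class name HybridVersionedSet and its methods are hypothetical, not taken from the lecture.

```python
import copy


class HybridVersionedSet:
    """Hypothetical sketch: full update log plus a stored copy of every k-th version."""

    def __init__(self, k):
        self.k = k
        self.log = []                # log[j] turns version j into version j + 1
        self.snapshots = {0: set()}  # version number -> stored copy (multiples of k)
        self.current = set()

    @staticmethod
    def _apply(s, op, key):
        if op == "insert":
            s.add(key)
        elif op == "delete":
            s.discard(key)
        else:
            raise ValueError(f"unknown operation: {op}")

    def update(self, op, key):
        """Apply an update to the newest version and return the new version number."""
        self._apply(self.current, op, key)
        self.log.append((op, key))
        version = len(self.log)
        if version % self.k == 0:
            # Every k-th version is archived as a complete copy.
            self.snapshots[version] = copy.copy(self.current)
        return version

    def access(self, i):
        """Rebuild version i from the stored version k * floor(i / k)."""
        if not 0 <= i <= len(self.log):
            raise IndexError(f"no such version: {i}")
        base = self.k * (i // self.k)
        s = copy.copy(self.snapshots[base])
        for op, key in self.log[base:i]:
            self._apply(s, op, key)
        return s


h = HybridVersionedSet(k=3)
for key in (5, 20, 8, 15):
    h.update("insert", key)
h.update("delete", 20)
print(sorted(h.access(4)))  # version 4 is rebuilt from the stored version 3
```

A small k makes access cheap but stores many copies; a large k saves space but replays more of the log on each access.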

Ideals

We seek more efficient techniques:

Ideally, we want (on average) to have
• Storage space used by the persistent structure to be O(1) per update step, and
• Time per operation to increase by only a constant factor over the time in the ephemeral structure

Fat Node Method – Partial Persistence

• Record all changes made to node fields in the nodes themselves
• Nodes are allowed to become arbitrarily “fat” to include version history; i.e. a list of version stamps

• A version stamp indicates the version in which the named field was changed to the specified value

• Each fat node has its own version stamp to indicate the version in which it was created

• However, a version stamp is not unique; i.e. several Fat nodes can have the same version stamp

Update Operation – Fat Node Method

Consider update operation i.

Each ephemeral update step has a persistent (Fat Node Method) counterpart:

• Ephemeral: creates a new node
  – Persistent: creates a new Fat node with version stamp i and all original field values
• Ephemeral: changes a field
  – Persistent: stores the new field value plus version stamp i

Update Operation – Example

• (Versions 1 to 9) Insert: 5, 20, 8, 15, 6, 2, 1, 28, 12
• (Versions 10 to 12) Delete: 20, 5, 1

[Figure: binary search tree on the keys 5, 20, 8, 15, 6, 2, 28, 12, 1]

Access Operation – Fat Node Method

Accessing any version i ≤ m in the persistent structure:
• Find the root node at version i.
• Then traverse nodes in the structure, choosing only version values with the maximum version stamp ≤ i.

Example: Given this persistent structure, access version v11

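[Figure: fat-node structure on the keys 5, 20, 8, 15, 6, 2, 28, 12, 1; field values carry version stamps v2 to v12, the root pointer to 5 is stamped v1-v10, and a second root pointer is stamped v11-v12]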
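A minimal Python sketch of the fat node idea for insertions into an unbalanced binary search tree follows. The class names FatField, FatNode and FatNodeBST are hypothetical, deletions are left out, and a sorted list searched with bisect stands in for whatever ordered structure holds the version stamps inside a node.

```python
import bisect


class FatField:
    """One named field together with its history of (version stamp, value) pairs."""

    def __init__(self):
        self.stamps = []  # version stamps in increasing order
        self.values = []  # values[j] was written in version stamps[j]

    def set(self, version, value):
        # Partial persistence: only the newest version is modified,
        # so stamps arrive in increasing order and can be appended.
        self.stamps.append(version)
        self.values.append(value)

    def get(self, version):
        # Value with the maximum version stamp <= version, or None.
        pos = bisect.bisect_right(self.stamps, version)
        return self.values[pos - 1] if pos else None


class FatNode:
    def __init__(self, key, version):
        self.key = key
        self.created = version  # version in which the node was created
        self.left = FatField()
        self.right = FatField()


class FatNodeBST:
    def __init__(self):
        self.root = FatField()  # the root pointer is versioned as well
        self.version = 0

    def insert(self, key):
        v = self.version + 1
        field = self.root
        node = field.get(v)
        while node is not None:
            if key == node.key:
                return self.version  # key already present, no new version
            field = node.left if key < node.key else node.right
            node = field.get(v)
        field.set(v, FatNode(key, v))
        self.version = v
        return v

    def keys(self, version):
        """In-order keys of the tree as it looked in the given version."""
        out = []

        def walk(node):
            if node is not None:
                walk(node.left.get(version))
                out.append(node.key)
                walk(node.right.get(version))

        walk(self.root.get(version))
        return out


tree = FatNodeBST()
for key in (5, 20, 8, 15, 6, 2, 1, 28, 12):
    tree.insert(key)
print(tree.keys(3))  # the tree after the first three insertions
```

Each insertion adds one stamped value to one field, while every field read during an access costs a search among that field's stamps.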

Analysis – Fat Node Method

Assumption: The version stamps in a Fat node are ordered and stored in a balanced binary search tree.

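Update operation
• Space per update: O(1) per ephemeral update step
• Time per update: O(log m) per update step, for m versions

Access operation
• Time per access: O(log m) per access step (multiplicative slow-down)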

Path-Copying Method – Partial Persistence

• Creates a set of search trees, one per update, having different roots but sharing common subtrees

• Copy only the nodes in which changes are made; any node in the current version that contains a pointer to a copied node must itself be copied

• In our linked data structure, each node contains pointers to its children

• Copying one node in the current version requires copying the entire path from the node to the root – hence the name “Path-Copying”

Update Operation – Path-Copying

Consider update operation i.

• Identify the node in the current version that will be affected by the update operation

• Make a copy of this node (and hence the path to the root in the current version)

• Modify the copied path according to the operation
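The update steps above can be sketched in Python for insertions into an unbalanced binary search tree. The names Node, insert, inorder and roots are hypothetical; nodes are immutable, so an insertion returns a new root whose path to the new key is copied while every other subtree is shared with the previous version.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root, key):
    """Return the root of a new version; only nodes on the search path are copied."""
    if root is None:
        return Node(key)
    if key < root.key:
        return root._replace(left=insert(root.left, key))
    if key > root.key:
        return root._replace(right=insert(root.right, key))
    return root  # key already present: the version is unchanged


def inorder(root):
    if root is None:
        return []
    return inorder(root.left) + [root.key] + inorder(root.right)


# roots[i] is the root of version i; version 0 is the empty tree.
roots = [None]
for key in (5, 20, 8, 15, 6, 2, 1, 28, 12):
    roots.append(insert(roots[-1], key))

print(inorder(roots[7]))                   # version 7, before 28 and 12 were inserted
print(roots[8].left is roots[7].left)      # the subtree rooted at 2 is shared
```

Keeping the roots in a list indexed by version number makes access to any version a single lookup followed by an ordinary search.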

Update Operation – Example (Insert)

• (Versions 1 to 9) Insert: 5, 20, 8, 15, 6, 2, 1, 28, 12

[Figure: version v7, a binary search tree on the keys 5, 20, 8, 15, 6, 2, 1]

[Figure: version v8 (insert 28) adds copies of the nodes 5 and 20 and shares all other nodes with v7]

[Figure: version v9 (insert 12) adds copies of the nodes 5, 20, 8, and 15 and shares all other nodes with earlier versions]

Update Operation – Example (Delete)

• (Versions 10 to 12) Delete: 1, 20, 5

[Figure: version v9, a binary search tree on the keys 5, 20, 8, 15, 6, 2, 1, 12, 28]

[Figure: version v10 (delete 1) adds copies of the nodes 5 and 2]

[Figure: version v11 (delete 20) adds new nodes 5, 15, and 8]

[Figure: version v12 (delete 5) adds a new root 2]

Access Operation – Path-Copying

Assumption: The version roots are ordered and stored in some accessible structure on top of all the m persisted versions.

To access any version v_i:

• We only need to locate the correct root from the accessible top structure to access the required version i

Analysis – Path-Copying Method

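Update operation
• Space per update: O(log n), the length of the copied path (assuming a balanced search tree)
• Time per update: O(log n)

Access operation
• Time per access: O(log n), plus the time to locate the version root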

Node-Copying Method – Partial Persistence

• An improvement to the Fat node method
• We do not allow nodes to become arbitrarily “fat”, but fix this number
• When we run out of space for version stamps, we then create a new copy of the node
• In our deliberation, we allow only 1 additional pointer, contained in the node, and call it the version stamp modification box.

[Figure: node layout with key k, original left pointer lp (to the left child with version before vt), original right pointer rp (to the right child with version before vt), and the version stamp modification box holding vt: ptr]

Update Operation – Node-Copying

Consider update operation i.

• Identify the node in the current version that will be affected by the update operation

• Make a copy of this node if the version stamp modification box is not empty

• Modify the node according to the operation
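The update rule can be sketched in Python for insertions only. This is a simplified illustration, not the full DSST method: deletions and rebalancing are omitted, and the names Node, NodeCopyingBST, mod and root_stamps are hypothetical. Each node has one modification box; when the box is already in use, the node is copied with its newest pointer values and the pointer change moves up to the parent.

```python
import bisect


class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left    # original left pointer
        self.right = right  # original right pointer
        self.mod = None     # modification box: (version stamp, field name, pointer)

    def get(self, field, version):
        """Pointer value of the field as seen in the given version."""
        if self.mod is not None:
            stamp, mod_field, ptr = self.mod
            if mod_field == field and stamp <= version:
                return ptr
        return getattr(self, field)


class NodeCopyingBST:
    def __init__(self, initial_root=None):
        self.version = 0
        self.root_stamps = [0]       # version in which each root became valid
        self.roots = [initial_root]

    def root(self, version):
        return self.roots[bisect.bisect_right(self.root_stamps, version) - 1]

    def insert(self, key):
        path = []  # (node, direction taken) pairs along the newest version
        node = self.roots[-1]
        while node is not None:
            if key == node.key:
                return self.version  # key already present, no new version
            field = "left" if key < node.key else "right"
            path.append((node, field))
            node = node.get(field, self.version)
        self.version += 1
        self._set_child(path, len(path) - 1, Node(key))
        return self.version

    def _set_child(self, path, idx, child):
        if idx < 0:  # the root itself changed: record a new version root
            self.root_stamps.append(self.version)
            self.roots.append(child)
            return
        node, field = path[idx]
        if node.mod is None:  # box is free: record the change in place
            node.mod = (self.version, field, child)
            return
        # Box is full: copy the node with its newest pointers, then fix the parent.
        clone = Node(node.key, node.get("left", self.version),
                     node.get("right", self.version))
        setattr(clone, field, child)
        self._set_child(path, idx - 1, clone)

    def keys(self, version):
        out = []

        def walk(node):
            if node is not None:
                walk(node.get("left", version))
                out.append(node.key)
                walk(node.get("right", version))

        walk(self.root(version))
        return out


# Version 0 holds the keys 5, 20, 8; versions 1 to 6 insert 15, 6, 2, 1, 28, 12.
n8 = Node(8)
n20 = Node(20, left=n8)
tree = NodeCopyingBST(Node(5, right=n20))
for key in (15, 6, 2, 1, 28, 12):
    tree.insert(key)
print(tree.root_stamps)  # versions in which a new root was created
```

In this run the second insertion below node 8 finds its box occupied and triggers a copy, and the insertion of 28 cascades up to the root, which is why a second root appears.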

Update Operation – Example (Insert)

• (Versions 1 to 6) Insert: 15, 6, 2, 1, 28, 12

[Figure: initial version v0 with nodes 5, 20, and 8]

[Figure: after versions v1 and v2. Node 8 carries v1:rp and node 20 carries v2:lp; the new nodes are 15, a copy of 8, and 6]

[Figure: after versions v3 to v6. Node 5 carries v3:lp, node 2 carries v4:lp, and node 15 carries v6:lp; the new nodes are 2, 1, 28, 12, and copies of 20 and 5. The original root 5 serves v0-v4 and its copy serves v5-v6]

Update Operation – Example (Delete)

• (Versions 7 and 8) Delete: 1, 20

[Figure: the structure after version v6, with the original root 5 for v0-v4 and its copy for v5-v6, on which the deletions are carried out]

Access Operation – Node-Copying

Navigating through this persistent structure is exactly the same as the Fat node method.

To access any version v_i:

• Find the root node at version i.
• Then traverse nodes in the structure, choosing only version values with the maximum version stamp ≤ i.

Exercise: From the previous figure, access version v6

Analysis – Node-Copying Method

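Update operation
• Space per update: O(1) amortized per update step
• Time per update: O(1) amortized per update step (constant-factor overhead)

Access operation
• Time per access: O(1) per access step (constant-factor overhead)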

Planar Point Location

Planar Point Location: Sarnak-Tarjan Solution

• Idea: (partial) persistence
  – Query time: O(log n), Space: O(n)
  – Relies on Dobkin-Lipton construction and Cole’s observation.
• Dobkin-Lipton:
  – Partition the plane into vertical slabs by drawing a vertical line through each endpoint.
  – Within each slab, the lines are totally ordered.
  – Allocate a search tree per slab containing the lines, and with each line associate the polygon above it.
  – Allocate another search tree on the x-coordinates of the vertical lines.

Dobkin-Lipton Construction

• Partition the plane into vertical slabs.

• Locate a point with two binary searches. Query time: O(log n).
• Nice, but space-inefficient: it can require O(n²) space.
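The slab construction and the two binary searches can be sketched in Python as follows. The sketch assumes non-crossing, non-vertical segments given as ((x1, y1), (x2, y2)) with x1 < x2, uses sorted lists in place of search trees, and returns the segment directly below the query point rather than the polygon associated with it. The function names build_slabs, segment_below and y_at are hypothetical.

```python
import bisect


def y_at(seg, x):
    """y-coordinate of the segment at abscissa x."""
    (x1, y1), (x2, y2) = seg
    return y1 + (y2 - y1) * (x - x1) / (x2 - x1)


def build_slabs(segments):
    """One sorted list of segments per slab between consecutive endpoint x-values."""
    xs = sorted({x for seg in segments for (x, _) in seg})
    slabs = []
    for left, right in zip(xs, xs[1:]):
        mid = (left + right) / 2
        spanning = [s for s in segments if s[0][0] <= left and right <= s[1][0]]
        # Within a slab the segments do not cross, so they are totally ordered.
        spanning.sort(key=lambda s: y_at(s, mid))
        slabs.append(spanning)
    return xs, slabs


def segment_below(xs, slabs, point):
    """Segment directly below (or through) the point, or None."""
    px, py = point
    i = bisect.bisect_right(xs, px) - 1  # first binary search: find the slab
    if i < 0 or i >= len(slabs):
        return None
    slab = slabs[i]
    lo, hi = 0, len(slab)
    while lo < hi:                       # second binary search: within the slab
        mid = (lo + hi) // 2
        if y_at(slab[mid], px) <= py:
            lo = mid + 1
        else:
            hi = mid
    return slab[lo - 1] if lo else None


segments = [((0, 0), (4, 0)), ((1, 2), (3, 2)), ((0, 4), (4, 4))]
xs, slabs = build_slabs(segments)
print(segment_below(xs, slabs, (2, 3)))
print(sum(len(slab) for slab in slabs))  # total number of stored segment entries
```

The last line counts how many segment entries are stored over all slabs; a segment that spans many slabs is stored once per slab, which is where the quadratic space comes from.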

Worst-Case Example

• Θ(n) segments in each slab, and Θ(n) slabs.

Cole’s Observation

[Figure: two contiguous slabs, A and B]

• Sets of line segments intersecting contiguous slabs are similar.
• Reduces the problem to storing a “persistent” sorted set.

Improving the Space Bound

• Create the search tree for the first slab.

• Then obtain the next one by deleting the lines that end at the corresponding vertex and adding the lines that start at that vertex.

• Total number of insertions / deletions: 2n
  – One insertion and one deletion per segment.

Planar Point Location and Persistence

• Updates should be persistent (since we need all search trees at the end).

• Partial persistence is enough (Sarnak and Tarjan).

• Method 1: Path-copying method; simple and powerful (Driscoll et al., Overmars).

– O(n log n) space + O(n log n) preprocessing time.

• Method 2: Node-copying method
  – We can improve the space bound to O(n).
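The two space bounds follow from a short count. The sketch below assumes that the search trees are balanced, that the n segments cause 2n updates in total, and that each update changes only O(1) pointers outside the search path (as in suitably balanced trees); the symbols S and T are introduced here only for illustration.

```latex
\begin{align*}
S_{\text{path-copying}} &= \sum_{j=1}^{2n} O(\log n) = O(n \log n), \\
S_{\text{node-copying}} &= \sum_{j=1}^{2n} O(1)\ \text{(amortized)} = O(n), \\
T_{\text{query}} &= \underbrace{O(\log n)}_{\text{locate the slab}}
                 + \underbrace{O(\log n)}_{\text{search in that version}} = O(\log n).
\end{align*}
```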
