Rank-Balanced Trees. Siddhartha Sen, Princeton University. WADS 2009. Joint work with Bernhard Haeupler and Robert E. Tarjan.

Page 1: Rank-Balanced Trees

Rank-Balanced Trees

Siddhartha Sen, Princeton University

WADS 2009

Joint work with Bernhard Haeupler and Robert E. Tarjan

Page 2: Rank-Balanced Trees

Observation

Computer science is (still) a young field.

We often settle for the first (good) solution.

It may not be the best: the design space is rich.

Page 3: Rank-Balanced Trees

Research Agenda

For fundamental problems, systematically explore the design space to find the best solutions, seeking

elegance: “a quality of neatness and ingenious simplicity in the solution of a problem (especially in science or mathematics).”

wordnet.princeton.edu/perl/webwn

Keep the design simple, allow complexity in the analysis.

Page 4: Rank-Balanced Trees

Searching: Dictionary Problem

Maintain a set of items, so that

• Access: find a given item
• Insert: add a new item
• Delete: remove an item

are efficient.

Assumption: items are totally ordered, so that binary comparison is possible

Page 5: Rank-Balanced Trees

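Binary Search Tree

Symmetric order: keys in the left subtree are smaller (<) than the key at a node, and keys in the right subtree are larger (>).

[Figure: binary search tree with root e. e has children c and i; c has left child a; i has children g and n; n has left child k.]

Why not hashing?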

Page 6: Rank-Balanced Trees

Binary Search Tree

Access k

[Figure: the search path for k in the same tree: k > e, k > i, k < n.]

Page 7: Rank-Balanced Trees

Binary Search Tree

Insert h

[Figure: the search path for h: h > e, h < i, h > g. The search falls off the tree, and h becomes the right child of g.]
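The access and insert walks on the preceding slides can be sketched in Python as follows. This is a minimal illustration, not code from the talk: the names `Node`, `access`, and `insert` are assumptions, and the insertion order is chosen only so that the resulting tree matches the one drawn on the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def access(root, key):
    """Return the node holding key, or None, by following one root-to-leaf path."""
    node = root
    while node is not None and node.key != key:
        node = node.left if key < node.key else node.right
    return node


def insert(root, key):
    """Insert key as a new leaf (no rebalancing) and return the root."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key == node.key:
            return root  # key already present, nothing to do
        side = "left" if key < node.key else "right"
        child = getattr(node, side)
        if child is None:
            setattr(node, side, Node(key))  # the search fell off the tree here
            return root
        node = child


root = None
for k in "eciagnk":  # hypothetical order that reproduces the slide's tree
    root = insert(root, k)
root = insert(root, "h")  # h > e, h < i, h > g
```

Both operations take time proportional to the depth reached, which is why the rest of the talk is about bounding the height.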

Page 8: Rank-Balanced Trees

Binary Search Tree

Delete i

[Figure: the tree with h below g and a new key m below k. Steps: find the successor of i, which is k; swap i and k; delete i from its new position.]
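Deletion by successor swap can be sketched as below. This is an illustrative version under stated assumptions: `Node`, `insert`, `delete`, and `inorder` are hypothetical names, and instead of swapping nodes the sketch copies the successor's key upward and then deletes the successor, which has the same effect on the key order.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(node, key):
    """Plain (unbalanced) recursive insertion; returns the subtree root."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    return node


def delete(node, key):
    """Delete key from the subtree rooted at node; returns the new subtree root."""
    if node is None:
        return None
    if key < node.key:
        node.left = delete(node.left, key)
    elif key > node.key:
        node.right = delete(node.right, key)
    elif node.left is None:
        return node.right  # at most one child: splice the node out
    elif node.right is None:
        return node.left
    else:
        succ = node.right  # successor = leftmost node of the right subtree
        while succ.left is not None:
            succ = succ.left
        node.key = succ.key  # the successor's key moves up
        node.right = delete(node.right, succ.key)
    return node


def inorder(node):
    """Keys in symmetric order."""
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)


root = None
for k in "eciagnkhm":  # hypothetical order matching the slide's tree
    root = insert(root, k)
root = delete(root, "i")
```

After the call, k sits where i was and m has taken k's old place under n.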

Page 9: Rank-Balanced Trees

Problem: imbalance

[Figure: a degenerate tree in which a, b, c, d, e, f form a single path.]

How to bound the height?

• Maintain a local balance condition, rebalance after insert or delete ⇒ balanced tree
• Restructure after each access ⇒ self-adjusting tree

Store balance information in nodes, guarantee O(log n) height

After (during) insert/delete, restore balance bottom-up (top-down):

• Update balance information
• Restructure along access path

Page 10: Rank-Balanced Trees

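Restructuring primitive: (Single) Rotation

Preserves symmetric order

Changes heights

Takes O(1) time

[Figure: node y with left child x, where x has subtrees A and B and y has right subtree C. A right rotation makes x the parent of y, with A under x and B and C under y. A left rotation is the inverse.]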
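A rotation only rewires a constant number of pointers. The sketch below is an illustration with assumed names (`Node`, `rotate_right`, `rotate_left`, `inorder`); the single-node subtrees A, B, C stand in for arbitrary subtrees.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def rotate_right(y):
    """Rotate right at y: its left child x becomes the subtree root."""
    x = y.left
    y.left = x.right  # subtree B moves from x to y
    x.right = y
    return x


def rotate_left(x):
    """Rotate left at x: its right child y becomes the subtree root."""
    y = x.right
    x.right = y.left  # subtree B moves from y to x
    y.left = x
    return y


def inorder(node):
    """Keys in symmetric order (by position, not by comparison)."""
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)


A, B, C = Node("A"), Node("B"), Node("C")
y = Node("y", left=Node("x", left=A, right=B), right=C)
before = inorder(y)
x = rotate_right(y)
```

The symmetric order is the same before and after, while x and A move one level up and y and C move one level down.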

Page 11: Rank-Balanced Trees

Known Balanced Trees

• AVL trees (“passé” according to one author)
• weight-balanced trees
• 2,3 trees (not binary)
• B trees (not binary)
• red-black trees
• etc.

Goal: small height, little rebalancing, simple algorithms

Page 12: Rank-Balanced Trees

Ranks

Each node has an integer rank, a proxy for height

Convention: leaves have rank 0, missing nodes have rank -1

rank of tree = rank of root

rank difference of a child = rank of parent - rank of child

i-child: node of rank difference i

i,j-node: children have rank differences i and j
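The rank terminology can be made concrete with a few helper functions. This is a sketch with assumed names (`Node`, `rank`, `rank_differences`, `node_type`); only the conventions themselves (missing nodes have rank -1, rank difference = parent rank minus child rank) come from the slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: str
    rank: int = 0
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def rank(node):
    """Rank of a node; a missing node has rank -1 by convention."""
    return -1 if node is None else node.rank


def rank_differences(node):
    """Rank differences of the two children (missing ones included), smallest first."""
    return tuple(sorted((node.rank - rank(node.left), node.rank - rank(node.right))))


def node_type(node):
    """Name of the node in the slide's i,j-node notation."""
    i, j = rank_differences(node)
    return f"{i},{j}-node"


leaf = Node("a")                      # rank 0, both children missing
unary = Node("b", rank=1, left=leaf)  # one real child, one missing child
```

Here `node_type(leaf)` is `"1,1-node"` and `node_type(unary)` is `"1,2-node"`, since the missing right child of `unary` has rank difference 1 - (-1) = 2.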

Page 13: Rank-Balanced Trees

Example of a rank-balanced tree

[Figure: the binary search tree on a, c, e, g, h, i, k, n with root e, annotated with node ranks and rank differences.]

If all rank differences are positive, rank ≥ height

Page 14: Rank-Balanced Trees

Rank Rules

AVL trees: every node is a 1,1- or 1,2-node

Rank-balanced trees: every node is a 1,1-, 1,2-, or 2,2-node (rank differences are 1 or 2)

Red-black trees: all rank differences are 0 or 1, no 0-child is the parent of another

All need one balance bit per node.
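The three rank rules can be written as predicates over a ranked tree. The sketch below checks only the rank-difference conditions stated on the slide; the names (`Node`, `rank`, `diffs`, `nodes`, `is_avl`, `is_rank_balanced`, `is_red_black`) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: str
    rank: int = 0
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def rank(node):
    return -1 if node is None else node.rank  # missing nodes have rank -1


def diffs(node):
    """Sorted rank differences of the two children of node."""
    return tuple(sorted((node.rank - rank(node.left), node.rank - rank(node.right))))


def nodes(root):
    """Yield every node of the tree."""
    if root is not None:
        yield root
        yield from nodes(root.left)
        yield from nodes(root.right)


def is_avl(root):
    return all(diffs(n) in {(1, 1), (1, 2)} for n in nodes(root))


def is_rank_balanced(root):
    return all(diffs(n) in {(1, 1), (1, 2), (2, 2)} for n in nodes(root))


def is_red_black(root):
    """Rank differences are 0 or 1, and no 0-child is the parent of a 0-child."""
    def ok(node, is_zero_child):
        if node is None:
            return True
        for child in (node.left, node.right):
            d = node.rank - rank(child)
            if d not in (0, 1) or (d == 0 and is_zero_child):
                return False
            if not ok(child, d == 0):
                return False
        return True
    return ok(root, False)


avl_like = Node("b", 1, Node("a"))            # a 1,2-node over a leaf
two_two = Node("b", 2, Node("a"), Node("c"))  # a 2,2-node over two leaves
red_child = Node("b", 0, Node("a"))           # a is a 0-child of b
red_chain = Node("c", 0, Node("b", 0, Node("a")))
```

`two_two` is rank-balanced but not AVL, which is exactly the relaxation the slide describes; `red_chain` violates the red-black rule because a 0-child has a 0-child.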

Page 15: Rank-Balanced Trees

Height bounds

$n_k$ = minimum n for rank k

AVL trees:

$n_0 = 1$, $n_1 = 2$, $n_k = n_{k-1} + n_{k-2} + 1$

$n_k = F_{k+3} - 1$ and $F_{k+2} \ge \phi^k$ ⇒ $k \le \log_\phi n \approx 1.44 \lg n$, where $\phi = (1 + \sqrt{5})/2$

Rank-balanced trees:

$n_0 = 1$, $n_1 = 2$, $n_k = 2 n_{k-2}$

$n_k = 2^{\lceil k/2 \rceil}$ ⇒ $k \le 2 \lg n$

Same bound for red-black trees
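The two recurrences for the minimum number of nodes at rank k can be checked numerically. This is a small sanity check rather than a proof; the function names are assumptions, and the Fibonacci numbering uses F_0 = 0, F_1 = 1.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio


def fib(n):
    """Fibonacci numbers with F_0 = 0, F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def min_nodes_avl(k):
    """n_0 = 1, n_1 = 2, n_k = n_{k-1} + n_{k-2} + 1."""
    n = [1, 2]
    for i in range(2, k + 1):
        n.append(n[i - 1] + n[i - 2] + 1)
    return n[k]


def min_nodes_rank_balanced(k):
    """n_0 = 1, n_1 = 2, n_k = 2 * n_{k-2}."""
    n = [1, 2]
    for i in range(2, k + 1):
        n.append(2 * n[i - 2])
    return n[k]


# Constant in the AVL height bound k <= log_phi(n) = lg(n) / lg(phi).
avl_constant = 1 / math.log2(PHI)
```

For every k tried, `min_nodes_avl(k)` equals `fib(k + 3) - 1` and `min_nodes_rank_balanced(k)` equals `2 ** ((k + 1) // 2)`, and `avl_constant` is about 1.44.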

Page 16: Rank-Balanced Trees

Insertion example

[Figure: inserting a, b, c in order into an empty tree. Insert a. Insert b: promote a. Insert c: promote b, rotate left at b, demote a. The result has root b (rank 1) with children a and c (rank 0).]

Page 17: Rank-Balanced Trees

Insertion example

[Figure: Insert d: promote c, promote b. Insert e: promote d, rotate left at d, demote c. The result has root b (rank 2) with children a (rank 0) and d (rank 1); d has children c and e (rank 0).]

Page 18: Rank-Balanced Trees

Insertion example

[Figure: Insert f: promote e, promote d, rotate left at d, demote b.]

Page 19: Rank-Balanced Trees

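Insertion example

[Figure: the tree after inserting f: root d (rank 2) with children b and e (rank 1); a and c are the children of b, and f is the right child of e (all rank 0).]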

Page 20: Rank-Balanced Trees

Rebalancing: insertion

Non-terminal

0, 1, or 2 rotations

O(log n) rank changes

No 2,2-nodes = AVL trees

$\log_\phi n$ height!
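Bottom-up insertion rebalancing can be sketched recursively: after inserting into a child, the parent checks whether that child has become a 0-child, then either promotes itself (non-terminal) or finishes with one or two rotations. This is an illustrative reading of the rules, not the authors' code; `Node`, `rank`, `fix`, `insert`, and `is_rank_balanced` are assumed names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: str
    rank: int = 0
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def rank(node):
    return -1 if node is None else node.rank  # missing nodes have rank -1


def rotate_right(p):
    x = p.left
    p.left, x.right = x.right, p
    return x


def rotate_left(p):
    x = p.right
    p.right, x.left = x.left, p
    return x


def fix(p, x_is_left):
    """Repair a possible 0-child x of p; returns the root of the repaired subtree."""
    x, s = (p.left, p.right) if x_is_left else (p.right, p.left)
    if p.rank != x.rank:
        return p                   # x is not a 0-child: nothing to do
    if p.rank - rank(s) == 1:
        p.rank += 1                # promote p (non-terminal: p may now be a 0-child)
        return p
    # p is a 0,2-node: one or two rotations terminate the rebalancing
    inner = x.right if x_is_left else x.left
    if x.rank - rank(inner) == 1:  # inner grandchild is a 1-child: double rotation
        if x_is_left:
            p.left = rotate_left(x)
        else:
            p.right = rotate_right(x)
        inner.rank += 1            # promote the grandchild
        x.rank -= 1                # demote x
    p.rank -= 1                    # demote p
    return rotate_right(p) if x_is_left else rotate_left(p)


def insert(node, key):
    """Insert key and rebalance on the way back up; returns the subtree root."""
    if node is None:
        return Node(key)           # new leaf, rank 0
    if key < node.key:
        node.left = insert(node.left, key)
        return fix(node, True)
    if key > node.key:
        node.right = insert(node.right, key)
        return fix(node, False)
    return node


def is_rank_balanced(node):
    """True if every rank difference in the subtree is 1 or 2."""
    if node is None:
        return True
    return all(node.rank - rank(c) in (1, 2) and is_rank_balanced(c)
               for c in (node.left, node.right))


root = None
for k in "abcdef":  # the sequence used in the insertion example
    root = insert(root, k)
```

Running the sequence a through f yields root d at rank 2 with children b and e at rank 1, the same tree the insertion example ends with. The recursion checks every ancestor for simplicity; an iterative version would stop at the first ancestor that needs no change.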

Page 21: Rank-Balanced Trees

Deletion example

[Figure: deletions from the tree built in the insertion example. Delete a. Delete d: swap with successor, delete. Delete f: double rotate at c, double promote c, demote b, double demote e.]

Page 22: Rank-Balanced Trees

Deletion example

[Figure: the tree after deleting f: root c (rank 2) with children b and e (rank 0), each a 2-child.]

Page 23: Rank-Balanced Trees

Rebalancing: deletion

Non-terminal

0, 1, or 2 rotations

O(log n) rank changes

Page 24: Rank-Balanced Trees

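Amortized (time-averaged) analysis

If $t_i$ is the actual time of operation i and $\Phi_i$ is the potential of the data structure after operation i, the amortized time of operation i is

$a_i = t_i + \Phi_i - \Phi_{i-1}$

Summing over all operations, with $\Phi_0$ the initial and $\Phi_F$ the final potential:

$\sum_i t_i = \sum_i (a_i + \Phi_{i-1} - \Phi_i) = \sum_i a_i + \Phi_0 - \Phi_F \le \sum_i a_i$ if $\Phi_F \ge \Phi_0$, e.g., if $\Phi_0 = 0$ and $\Phi_F \ge 0$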

Page 25: Rank-Balanced Trees

Non-terminal cases

Must decrease potential!

Must decrease potential!

Page 26: Rank-Balanced Trees

Insertions: 1,1-node → 1,2-node

Deletions: 2,2-node → 1,1- or 1,2-node

$\Phi$ = #1,1-nodes + 2 #2,2-nodes

Non-terminating steps are free; last insertion step: $\Delta\Phi \le 2$; last deletion step: $\Delta\Phi \le 3$

If there are m inserts and d deletes (n = m - d), the number of rebalancing steps is O(m + d)
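The potential used in this argument is easy to compute for a concrete tree. The sketch below counts 1,1-nodes once and 2,2-nodes twice; the names `Node`, `rank`, and `potential` are assumptions, and the two sample trees are the six-key tree that the insertion example ends with and the three-key tree that the deletion example ends with.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: str
    rank: int = 0
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def rank(node):
    return -1 if node is None else node.rank  # missing nodes have rank -1


def potential(node):
    """Phi = (number of 1,1-nodes) + 2 * (number of 2,2-nodes) in the subtree."""
    if node is None:
        return 0
    diffs = tuple(sorted((node.rank - rank(node.left), node.rank - rank(node.right))))
    own = {(1, 1): 1, (2, 2): 2}.get(diffs, 0)  # 1,2-nodes contribute nothing
    return own + potential(node.left) + potential(node.right)


after_insertions = Node("d", 2,
                        Node("b", 1, Node("a"), Node("c")),
                        Node("e", 1, None, Node("f")))
after_deletions = Node("c", 2, Node("b"), Node("e"))
```

In the first tree the leaves a, c, f and the nodes b and d are 1,1-nodes and e is a 1,2-node, so the potential is 5; in the second tree the two leaves contribute 1 each and the 2,2-node c contributes 2, for a total of 4.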

Page 27: Rank-Balanced Trees

Are rank-balanced trees better?

Rank-Balanced Trees

• height ≤ 2 lg n
• ≤ 2 rotations per rebalancing
• O(1) amortized rebalancing time

Red-Black Trees

• height ≤ 2 lg n
• ≤ 3 rotations per rebalancing
• O(1) amortized rebalancing time

Yes. No.

Page 28: Rank-Balanced Trees

Better height bound?

Sequential Insertions:

• rank-balanced: height = lg n (best)
• red-black: height = 2 lg n (worst)

Theorem. The height of a rank-balanced tree is at most $\log_\phi m$.

Degrades gracefully from AVL trees as d/m → 1

Page 29: Rank-Balanced Trees

Proof

Give a node a count of 1 when it is inserted. Total amount of count in tree is m.

Potential of a node = total count in its subtree

When a node is deleted, its count is added to its parent if it has one

Let $\Phi_k$ be the minimum potential of a node of rank k

Claim: $\Phi_k$ satisfies

$\Phi_0 = 1$, $\Phi_1 = 2$, $\Phi_k = 1 + \Phi_{k-1} + \Phi_{k-2}$ for k > 1

⇒ $m \ge F_{k+3} - 1 \ge \phi^k$

Page 30: Rank-Balanced Trees

Proof of Claim

$\Phi_k = 1 + \Phi_{k-1} + \Phi_{k-2}$ for k > 1

Easy for 1,1- and 1,2-nodes

Harder for 2,2-nodes (created by deletions)

But counts are inherited

Page 31: Rank-Balanced Trees

Rebalancing frequency

How high does rebalancing propagate?

O(m + d) rebalancing steps total, which implies

O((m + d) / k) insertions/deletions at rank k

Actually, we can show:

Theorem. There are $O((m + d) / 2^{k/3})$ rebalancing steps at rank k.

Page 32: Rank-Balanced Trees

Proof

Use an exponential potential:

1,1- and 2,2-nodes of rank i get potential $b^i$

1,2-nodes of rank i get potential $b^{i-2}$

where $b = 2^{1/3}$

The potential change in the non-terminal steps telescopes. Combine this effect with initialization and terminal step.

Page 33: Rank-Balanced Trees

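Telescoping potential

[Figure: a chain of non-terminal promotions at ranks i through i+3. The potential changes at the promoted nodes are $b^{i-1} - b^i$, $b^i - b^{i+1}$, $b^{i+1} - b^{i+2}$, and $b^{i+2} - b^{i+3}$, which telescope to $b^{i-1} - b^{i+3}$.]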

Page 34: Rank-Balanced Trees

Fix k. Cut off growth in potential at rank k:

1,1- and 2,2-nodes of rank i: $b^{\min\{i, k-3\}}$

1,2-nodes of rank i: $b^{\min\{i-2, k-3\}}$

Then a rebalancing that propagates to rank k or above decreases the potential by $b^{k-3}$.

The same idea works for red-black trees (we think).

Page 35: Rank-Balanced Trees

Preliminary evaluation

                             Red-black trees                    Rank-balanced trees
Test         N     X         # rots   # bals   avg.    max.     # rots   # bals   avg.    max.
                             (×10^6)  (×10^6)  pLen    pLen     (×10^6)  (×10^6)  pLen    pLen
id           8192  67108864  26.443   116.070  10.472  15.627   29.553   133.737  10.390  15.092
queue_id     8192  67108864  50.317   285.129  11.375  22.501   50.325   184.527  11.195  13.999
work_set_id  8192  67108864  41.714   185.348  10.510  16.181   43.686   159.689  10.445  15.345
zipf_id      8192  67108864  25.238   112.858  10.413  15.458   28.272   130.927  10.338  15.045
dyn_zipf_id  8192  67108864  23.176   103.472  10.477  15.661   26.038   125.985  10.404  15.158


Page 39: Rank-Balanced Trees

Conclusion

Rank-balanced trees are a relaxation of AVL trees with behavior at least as good as red-black trees and better in important ways.

Especially the height bound of $\min\{2 \lg n, \log_\phi m\}$

Exponential potential functions yield new insights into the efficiency of rebalancing. We anticipate more applications.

Page 40: Rank-Balanced Trees

Is rebalancing necessary?

For insertions, yes. But what about deletions?

Deletion rebalancing is complicated, ignored by textbooks, and many database systems do not do it

So, can we avoid deletion rebalancing?

Yes. A relaxation of AVL trees, ravl trees, achieves $\log_\phi m$ access time using lg lg m + 1 balance bits

… and no rebalancing during deletion!

Page 41: Rank-Balanced Trees

Thank you

Page 42: Rank-Balanced Trees

Extra slides