Upload
pisces
View
32
Download
2
Embed Size (px)
DESCRIPTION
Rank-Balanced Trees. Siddhartha Sen, Princeton University WADS 2009 Joint work with Bernhard Haeupler and Robert E. Tarjan. Observation. Computer science is (still) a young field. We often settle for the first (good) solution. It may not be the best: the design space is rich. - PowerPoint PPT Presentation
Citation preview
Rank-Balanced Trees
Siddhartha Sen, Princeton University
WADS 2009
Joint work with Bernhard Haeupler and
Robert E. Tarjan 1
Observation
Computer science is (still) a young field.
We often settle for the first (good) solution.
It may not be the best: the design space is rich.
2
3
Research Agenda
For fundamental problems, systematically explore the design space to find the best solutions, seeking
elegance: “a quality of neatness and ingenious simplicity in the solution of a problem (especially in science or mathematics).”
wordnet.princeton.edu/perl/webwn
Keep the design simple, allow complexity in the analysis.
4
Searching: Dictionary Problem
Maintain a set of items, so that
Access: find a given itemInsert: add a new itemDelete: remove an item
are efficient.
Assumption: items are totally ordered, so that binary comparison is possible
Binary Search Tree
Symmetric order
< >e
c i
a g n
k
Why not hashing?5
6
Binary Search Tree
Access k
e
c i
a g n
k
>
>
<
7
Binary Search Tree
Insert h
e
c i
a g n
k
>
h
<
>
8
Binary Search Tree
Delete i
e
c i
a g n
kh
m
Find successorSwap
i
kDelete
k
Problem: imbalance
How to bound the height?• Maintain local balance condition,
rebalance after insert or delete balanced tree
• Restructure after each access self-adjusting tree
Store balance information in nodes, guarantee O(log n) height
After (during) insert/delete, restore balance bottom-up (top-down):
• Update balance information• Restructure along access path
a
b
c
d
e
f
9
10
Restructuring primitive:(Single) Rotation
Preserves symmetric order
Changes heights
Takes O(1) time
y
x
A B
C
x
y
B C
A
right
left
11
Known Balanced Trees
AVL trees (“passé” according to one author)weight balanced trees2,3 treesB treesred-black treesetc.
Goal: small height, little rebalancing, simple algorithms
not binary
12
Ranks
Each node has an integer rank, a proxy for heightConvention: leaves have rank 0, missing nodes have rank -1
rank of tree = rank of root
rank difference of a child =rank of parent - rank of child
i-child: node of rank difference ii,j-node: children have rank differences i and j
13
Example of a rank-balanced tree
e
c i
a g n
kh
3
0 1
1 2 2 1
1 1 1 1
0 1 0 1
If all rank differences are positive, rank height
14
Rank Rules
AVL trees: every node is a 1,1- or 1,2-node
Rank-balanced trees: every node is a 1,1-, 1,2-, or 2,2-node (rank differences are 1 or 2)
Red-black trees: all rank differences are 0 or 1, no 0-child is the parent of another
All need one balance bit per node.
15
Height boundsnk = minimum n for rank k
AVL trees:
n0 = 1, n1 = 2, nk = nk-1 + nk-2 + 1, nk = Fk+3 - 1
nk = Fk+3 - 1, Fk+2 < k k log n 1.44lg n
Rank-balanced trees:
n0 = 1, n1 = 2, nk = 2nk-2,
nk = 2k/2 k 2lg n
Same bound for red-black trees
2/)51(
16
b
Insert bInsert c
0
c
1
a
Insert a
>
>
10
Rotate left at b
01
Insertion example
Demote a
0
0
1
Promote a
Promote b
17
>
e
d
2
Insert e
0
0
1
1
01 c
b>
Insert d
a
1
>Rotate left at d
12
Insertion example
Insert c
Demote c0 0
0
0
1
1
Promote c
Promote b
Promote d
18
01 e
2
1
1d
b
a
c
2
Insertion example
Insert eInsert f
>
>
>
f 0
2
1
0 Rotate left at d
Demote b
1
0 0
0
0
1
2
Promote e
Promote d
19
1
Insert f
f
1 1e
d
b
2
Insertion example
a c 11
1
0 0 0
1
20
Rebalancing: insertion
Non-terminal
0, 1, or 2 rotations
O(log n) rank changes
No 2,2 nodes = AVL trees
log n height!
21
210 def
e
Delete aDelete fDelete d
1
Swap with successor
Delete
1f
1
d
b
2
Deletion example
a c 11
1
0 0 0
Double rotate at cDouble promote c
Demote b
Double demote e
22
e
c
b
Delete f
Deletion example
20 20
2
23
Rebalancing: deletion
Non-terminal
0, 1, or 2 rotations
O(log n) rank changes
24
Amortized (time-averaged) analysis
If ti is the actual time of operation i and i is the potential of the data structure after operation i, the amortized time of operation i is
0 and 0if e.g.,
0if
0
0
0
1
F
Fii
Fii
iiii
at
ta
ta
25
Non-terminal cases
Must decrease potential!
Must decrease potential!
Insertions: 1,1-node 1,2-node
Deletions: 2,2-node 1,1- or 1,2-node
= #1,1-nodes + 2 #2,2-nodes
non-terminating steps are free,last insertion step: Δ 2,last deletion step: Δ 3
If there are m inserts and d deletes (n = m - d), the number of rebalancing steps is O(m + d)
26
27
Rank-Balanced Trees
height 2lg n
2 rotations per rebalancing
O(1) amortized rebalancing time
Red-Black Trees
height 2lg n
3 rotations per rebalancing
O(1) amortized rebalancing time
Yes. No.
Are rank-balanced trees better?
Better height bound?
Sequential Insertions:
rank-balanced red-black
height = lg n (best) height = 2lg n (worst)
Theorem. The height of a rank-balanced tree is at most log m.
Degrades gracefully from AVL trees as d/m 1
28
29
Proof
Give a node a count of 1 when it is insertedTotal amount of count in tree is m
Potential of a node = total count in its subtree
When a node is deleted, its count is added to its parent if it has one
Let k be the minimum potential of a node of rank k
Claim: k satisfies
0 = 1, 1 = 2, k = 1 + k-1 + k-2 for k > 1
m Fk+3 - 1 k
30
Proof of Claim
k = 1 + k-1 + k-2 for k > 1
Easy for 1,1- and 1,2-nodes
Harder for 2,2-nodes (created by deletions)But counts are inherited
Rebalancing frequency
How high does rebalancing propagate?
O(m + d) rebalancing steps total, which implies
O((m + d) / k) insertions/deletions at rank k
Actually, we can show:
Theorem. There are O((m + d) / 2k/3) rebalancing steps at rank k.
31
32
Proof
Use an exponential potential:
1,1- and 2,2-nodes of rank i get potential bi
1,2-nodes of rank i get potential bi-2
where b = 21/3
The potential change in the non-terminal steps telescopes. Combine this effect with initialization and terminal step.
33
0
01
1
0
0
1
11
1 2
2
2
2
Telescoping potential
…
0 1
1
1
…
bibi-1 -
bi+1bi -
bi+2bi+1 -
bi+3bi+2 -
0
= -bi+3
34
Fix k. Cut off growth in potential at rank k:
1,1- and 2,2-nodes of rank i: bmin{i,k-3}
1,2-nodes of rank i: bmin{i-2,k-3}
Then a rebalancing that propagates to rank k or above decreases the potential by bk-3.
The same idea works for red-black trees (we think).
Preliminary evaluation
35
Test N X Red-black trees Rank-balanced trees# rots 106
# bals 106
avg.pLen
max.pLen
# rots 106
# bals 106
avg.pLen
max.pLen
id 8192 67108864 26.443 116.070 10.472 15.627 29.553 133.737 10.390 15.092
queue_id 8192 67108864 50.317 285.129 11.375 22.501 50.325 184.527 11.195 13.999
work_set_id 8192 67108864 41.714 185.348 10.510 16.181 43.686 159.689 10.445 15.345
zipf_id 8192 67108864 25.238 112.858 10.413 15.458 28.272 130.927 10.338 15.045
dyn_zipf_id. 8192 67108864 23.176 103.472 10.477 15.661 26.038 125.985 10.404 15.158
Preliminary evaluation
36
Test N X Red-black trees Rank-balanced trees# rots 106
# bals 106
avg.pLen
max.pLen
# rots 106
# bals 106
avg.pLen
max.pLen
id 8192 67108864 26.443 116.070 10.472 15.627 29.553 133.737 10.390 15.092
queue_id 8192 67108864 50.317 285.129 11.375 22.501 50.325 184.527 11.195 13.999
work_set_id 8192 67108864 41.714 185.348 10.510 16.181 43.686 159.689 10.445 15.345
zipf_id 8192 67108864 25.238 112.858 10.413 15.458 28.272 130.927 10.338 15.045
dyn_zipf_id. 8192 67108864 23.176 103.472 10.477 15.661 26.038 125.985 10.404 15.158
Preliminary evaluation
37
Test N X Red-black trees Rank-balanced trees# rots 106
# bals 106
avg.pLen
max.pLen
# rots 106
# bals 106
avg.pLen
max.pLen
id 8192 67108864 26.443 116.070 10.472 15.627 29.553 133.737 10.390 15.092
queue_id 8192 67108864 50.317 285.129 11.375 22.501 50.325 184.527 11.195 13.999
work_set_id 8192 67108864 41.714 185.348 10.510 16.181 43.686 159.689 10.445 15.345
zipf_id 8192 67108864 25.238 112.858 10.413 15.458 28.272 130.927 10.338 15.045
dyn_zipf_id. 8192 67108864 23.176 103.472 10.477 15.661 26.038 125.985 10.404 15.158
Preliminary evaluation
38
Test N X Red-black trees Rank-balanced trees# rots 106
# bals 106
avg.pLen
max.pLen
# rots 106
# bals 106
avg.pLen
max.pLen
id 8192 67108864 26.443 116.070 10.472 15.627 29.553 133.737 10.390 15.092
queue_id 8192 67108864 50.317 285.129 11.375 22.501 50.325 184.527 11.195 13.999
work_set_id 8192 67108864 41.714 185.348 10.510 16.181 43.686 159.689 10.445 15.345
zipf_id 8192 67108864 25.238 112.858 10.413 15.458 28.272 130.927 10.338 15.045
dyn_zipf_id. 8192 67108864 23.176 103.472 10.477 15.661 26.038 125.985 10.404 15.158
Conclusion
Rank-balanced trees are a relaxation of AVL trees with behavior at least as good as red-black trees and better in important ways.
Especially the height bound of min{2lg n, log m}
Exponential potential functions yield new insights into the efficiency of rebalancing. We anticipate more applications.
39
For insertions, yes. But what about deletions?
Deletion rebalancing is complicated, ignored by textbooks, and many database systems do not do it
So, can we avoid deletion rebalancing?
Yes. Relaxation of AVL trees, ravl trees, achieves log m access time using lglg m + 1 balance bits
… and no rebalancing during deletion!
Is rebalancing necessary?
40
41
Thank you
42
Extra slides