B-Trees and Red Black Trees

Binary Trees

• Binary trees spread data all over
  – Fine for memory
  – Bad on disks

Disk Blocks

• Disk data stored in blocks/sectors
  – 512 bytes to 4 kB
  – Increment for all reads/writes
  – Like to maximize useful data

B Trees

• B Tree
  – Each node can store up to n values
  – Each value chooses between two pointers
    • Max degree = num pointers = n + 1

B Trees

• B Tree
  – Node designed to fit block size
  – May have degree of 100+
    • First level 100 keys
    • Second level about 10,000 keys
    • Third level about 1,000,000 keys!
  – Still compare against keys one at a time within a node
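To see how quickly capacity grows with each level, here is a small sketch. It assumes every node is completely full, and the function name `btree_capacity` is made up for illustration.

```python
def btree_capacity(keys_per_node: int, levels: int) -> list[int]:
    """Maximum keys held on each level of a completely full B-tree."""
    degree = keys_per_node + 1      # max degree = num pointers = n + 1
    capacities = []
    nodes = 1                       # the first level is just the root
    for _ in range(levels):
        capacities.append(nodes * keys_per_node)
        nodes *= degree             # each node fans out to `degree` children
    return capacities


print(btree_capacity(100, 3))
```

Each extra level multiplies the capacity by roughly the degree, which is why a handful of block reads can reach any key.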

B Tree Simulation

• BTree Inserts
  – Always insert at leaves
  – Split at max size, median value moves to parent
  – Tree gets deeper when root splits
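The insert steps above can be sketched as follows. This is a minimal in-memory illustration, not a disk-backed implementation; the names `BNode`, `BTree`, `max_keys` and `in_order` are assumptions, and a node is split as soon as it holds more than `max_keys` keys.

```python
import bisect


class BNode:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []   # an empty list marks a leaf


class BTree:
    def __init__(self, max_keys=3):
        self.max_keys = max_keys
        self.root = BNode()

    def insert(self, key):
        split = self._insert(self.root, key)
        if split is not None:
            median, right = split
            # The root split: the tree gets one level deeper.
            self.root = BNode([median], [self.root, right])

    def _insert(self, node, key):
        i = bisect.bisect_left(node.keys, key)
        if not node.children:
            node.keys.insert(i, key)             # always insert at a leaf
        else:
            split = self._insert(node.children[i], key)
            if split is not None:
                median, right = split            # child split: adopt its median
                node.keys.insert(i, median)
                node.children.insert(i + 1, right)
        if len(node.keys) > self.max_keys:
            return self._split(node)
        return None

    def _split(self, node):
        mid = len(node.keys) // 2
        median = node.keys[mid]                  # median moves to the parent
        right = BNode(node.keys[mid + 1:], node.children[mid + 1:])
        node.keys = node.keys[:mid]
        node.children = node.children[:mid + 1]
        return median, right


def in_order(node):
    """Return all keys below `node` in sorted order."""
    if not node.children:
        return list(node.keys)
    out = []
    for i, child in enumerate(node.children):
        out.extend(in_order(child))
        if i < len(node.keys):
            out.append(node.keys[i])
    return out
```

Because splits only ever push a key upward, every leaf stays at the same depth without any separate rebalancing step.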

B Tree Simulation

• BTree Deletes
  – Remove from node
  – Empty node
    • Steal from sibling if possible
    • Else merge with sibling & steal key from parent

[Figure panels: Start, Removed 104, Removed 112]

So…

• Btrees self balance
• Can represent as a binary tree…

Red Black Tree

• Red Black tree can be seen as binary version of Btree
  – Red nodes are part of their parent
    • 1 red + 1 black = node degree 3
    • 2 red + 1 black = node degree 4
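One way to picture "red nodes are part of their parent" is to collect a black node and its red children into a single B-tree node. The sketch below does that; `RBNode` and `btree_node` are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RBNode:
    key: int
    red: bool = False
    left: Optional["RBNode"] = None
    right: Optional["RBNode"] = None


def btree_node(black: RBNode) -> tuple[list[int], int]:
    """Return (keys, degree) of the B-tree node formed by a black node
    together with any red children it has."""
    keys = []
    if black.left is not None and black.left.red:
        keys.append(black.left.key)
    keys.append(black.key)
    if black.right is not None and black.right.red:
        keys.append(black.right.key)
    return keys, len(keys) + 1       # degree = number of keys + 1


three_node = RBNode(10, left=RBNode(5, red=True))
four_node = RBNode(10, left=RBNode(5, red=True), right=RBNode(15, red=True))
print(btree_node(three_node))
print(btree_node(four_node))
```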

Red Black Rules

• Two views of same tree

Red Black Rules

• Rules
  1. The root is black – if it becomes red, turn it back to black
  2. Null values are black
  3. A red node must have black children
  4. Every path from root to leaf must have the same number of black nodes
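The four rules can be checked mechanically. The following is a sketch of such a checker, assuming a simple `Node` class with a boolean `red` flag and `None` standing in for null leaves; none of these names come from the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    red: bool
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def is_red(node):
    # Rule 2: a null value counts as black.
    return node is not None and node.red


def black_height(node):
    """Count black nodes on every path below `node` (nulls not counted).
    Raises ValueError if rule 3 or rule 4 is violated."""
    if node is None:
        return 0
    if node.red and (is_red(node.left) or is_red(node.right)):
        raise ValueError(f"red node {node.key} has a red child")      # rule 3
    left = black_height(node.left)
    right = black_height(node.right)
    if left != right:
        raise ValueError(f"unequal black counts below {node.key}")    # rule 4
    return left + (0 if node.red else 1)


def is_valid_red_black(root):
    if is_red(root):                                                  # rule 1
        return False
    try:
        black_height(root)
    except ValueError:
        return False
    return True
```

For example, `is_valid_red_black(Node(10, False, Node(5, True), Node(15, True)))` passes all four rules, while a red node with a red child is rejected.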

Nodes vs Edges

• Note: Can think of edges or nodes as red/black

Guarantee

• Worst and best case in terms of red nodes for black height = 2

• If L = num leaves, B = black height
  $2^{B} \le L \le 2^{2B}$, i.e. $2^{B} \le L \le 4^{B}$

Guarantee

• So…
  – $2^{B} \le L$, therefore $\log_2(L) \ge B$

Height

• Black height is O(log L)
• Total height at most 2B
  – 2 · O(log L) = O(log L)
  – Height is O(log L)
• Counting null leaves, total nodes N = 2L – 1, so L ≈ N/2
  – O(log(N/2)) = O(log N)
• Guaranteed log N performance
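The height argument can be written as one chain of inequalities. Here $h$ (total height) and $k$ (number of key-holding nodes) are symbols introduced only for this sketch; it uses the fact that a binary tree with $k$ keys has $k + 1$ null leaves.

```latex
\begin{align*}
2^{B} \le L \;&\Rightarrow\; B \le \log_2 L \\
h \le 2B \;&\Rightarrow\; h \le 2\log_2 L \\
L = k + 1 \;&\Rightarrow\; h \le 2\log_2 (k + 1) = O(\log k)
\end{align*}
```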

Actual Work

• Insert as normal
  – New node is always red
  – Two reds in a row need to be fixed…

Fixes

• Red parent of red child, where the parent has a red sibling
  – Push up redness of the siblings to the grandparent (both siblings turn black, grandparent turns red)
    • Fix the grandparent if necessary
    • If root becomes red, make it black

Splitting a 4 node

Fixes

• Red parent of red child, where the parent has a black sibling or no sibling
  – AVL style rotation to balance
  – New parent becomes black, new child red

Turn a 3 node + child into 4 node
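The two fixes can be combined into one insert routine. What follows is a sketch using parent pointers and `None` for null leaves. The class and method names are assumptions, and the code follows the common textbook formulation of red-black insertion rather than any implementation given in the slides.

```python
class Node:
    def __init__(self, key, parent=None):
        self.key = key
        self.red = True                  # new node is always red
        self.left = self.right = None
        self.parent = parent


class RedBlackTree:
    def __init__(self):
        self.root = None

    def _rotate(self, node, to_left):
        """Move `node` down; its right child (to_left) or left child moves up."""
        child = node.right if to_left else node.left
        if to_left:
            node.right = child.left
            if child.left is not None:
                child.left.parent = node
            child.left = node
        else:
            node.left = child.right
            if child.right is not None:
                child.right.parent = node
            child.right = node
        child.parent = node.parent
        if node.parent is None:
            self.root = child
        elif node.parent.left is node:
            node.parent.left = child
        else:
            node.parent.right = child
        node.parent = child

    def insert(self, key):
        parent, cur = None, self.root
        while cur is not None:           # insert as normal (plain BST descent)
            parent, cur = cur, (cur.left if key < cur.key else cur.right)
        node = Node(key, parent)
        if parent is None:
            self.root = node
        elif key < parent.key:
            parent.left = node
        else:
            parent.right = node
        self._fix(node)

    def _fix(self, node):
        while node.parent is not None and node.parent.red:
            parent = node.parent
            grand = parent.parent        # exists: a red parent is never the root
            parent_is_left = grand.left is parent
            sibling = grand.right if parent_is_left else grand.left
            if sibling is not None and sibling.red:
                # Parent has a red sibling: push redness up to the grandparent.
                parent.red = sibling.red = False
                grand.red = True
                node = grand             # the grandparent may now need fixing
            else:
                # Parent has a black or no sibling: rotate.
                if (parent.left is node) != parent_is_left:
                    # Zig-zag shape: line node and parent up first.
                    self._rotate(parent, to_left=parent_is_left)
                    node, parent = parent, node
                self._rotate(grand, to_left=not parent_is_left)
                parent.red = False       # new parent becomes black
                grand.red = True         # new child becomes red
        self.root.red = False            # if root becomes red, make it black
```

Inserting 1, 2, 3 in order triggers the rotation case and leaves 2 as a black root with red children 1 and 3.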

Binary Tree Comparisons

• BST
  – Hopefully O(log n)
  – Could be O(n)

• Splay
  – Amortized O(log n)
  – Ideal for consecutive accesses
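The "could be O(n)" case for a plain BST shows up as soon as keys arrive in sorted order. The sketch below compares that against a friendlier insertion order; all names are made up for illustration, and both helpers are iterative so that a long chain does not hit Python's recursion limit.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = None


def insert(root, key):
    """Plain, unbalanced BST insert; returns the (possibly new) root."""
    if root is None:
        return Node(key)
    cur = root
    while True:
        if key < cur.key:
            if cur.left is None:
                cur.left = Node(key)
                break
            cur = cur.left
        else:
            if cur.right is None:
                cur.right = Node(key)
                break
            cur = cur.right
    return root


def height(root):
    """Number of nodes on the longest root-to-leaf path."""
    best, stack = 0, [(root, 1)]
    while stack:
        node, depth = stack.pop()
        if node is None:
            continue
        best = max(best, depth)
        stack.append((node.left, depth + 1))
        stack.append((node.right, depth + 1))
    return best


def median_first(keys):
    """Reorder sorted keys so each subtree's median is inserted first."""
    if not keys:
        return []
    mid = len(keys) // 2
    return [keys[mid]] + median_first(keys[:mid]) + median_first(keys[mid + 1:])


def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root


keys = list(range(1, 128))
print(height(build(keys)))                # sorted input: one long chain
print(height(build(median_first(keys))))  # median-first input: balanced
```

The same 127 keys give a height of 127 or 7 depending only on arrival order, which is the gap the self-balancing trees in this comparison are designed to close.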

Binary Tree Comparisons

• AVL
  – Guaranteed O(log n)
  – High constant factors on insert/delete
  – Height limited to ~1.44 log(n)

Binary Tree Comparisons

• Red/Black
  – Guaranteed O(log n)
  – Height limited to ~2 log(n)
    • Less balanced than AVL
  – Faster insert/remove than AVL
  – Slower find than AVL
  – Standard implementation for most library BSTs

Tree Comparisons

• BTree
  – Not binary – can pick any node size
  – Still sorted
  – Self balancing
  – Ideal for slower storage
  – Model for red/black trees

Tree Comparisons

• Other trees
  – Represent tree structure
  – Not necessarily sorted