Upload
clemence-carol-mccarthy
View
223
Download
5
Embed Size (px)
Citation preview
CS121 © JAS 2005 2-1
CS121 Data Structures
Trees
Each entry in a List (Stack, Queue) has at most one predecessor and one successor.
In a Tree each entry has at most one predecessor, but can have more than one successor.
CS121 © JAS 2005 2-2
CS121 Data Structures
In general trees can have any number of successors, but usually we restrict ourselves to a maximum of two successors - BINARY TREES.
A tree is either
• Empty, or
• Consists of a value and two sub-trees (left and right)
CS121 © JAS 2005 2-3
CS121 Data Structures
For each non-empty tree there is a unique node with no predecessor – the root node
Nodes which have no successors are termed leaf nodes
All other nodes have exactly one predecessor and either one or two successors – internal nodes
CS121 © JAS 2005 2-4
CS121 Data Structures
Various subsets of trees can be identified –
• Binary Trees (maximum of two successors)
• Strictly Binary Trees (nodes have either 0 or 2 successors)
• Full Binary Trees (all leaves at same depth)
• Complete Binary Trees (all levels except last one are full, and last level is filled from left)
CS121 © JAS 2005 2-5
CS121 Data Structures
Various characteristics of trees can be measured
• Number of nodes - Nodes
• Number of leaves - Leaves
• Depth (longest path from root to leaf) – Depth
Nodes(EmptyTree) = 0
Nodes(Tree) = 1 + Nodes(Tree.Left) + Nodes(Tree.Right)
CS121 © JAS 2005 2-6
CS121 Data Structures
Leaves(EmptyTree) = 0
Leaves(RootOnly) = 1
Leaves(Tree) = Leaves(Tree.Left) + Leaves(Tree.Right)
Depth(Tree) = 1 + maximum(Depth(Tree.Left), Depth(Tree.Right))
Depth(EmptyTree) = – 1
CS121 © JAS 2005 2-7
CS121 Data Structures
Example Application – Representing an Arithmetic Expression
Nodes represent operators
Leaves represent operands
/
– *
^ * 2 a
b 2 * c
4 a
b ^ 2 – 4 * a * c / 2 * a(( ) )
CS121 © JAS 2005 2-8
CS121 Data Structures
Traversing a tree (visit all nodes in a tree)
• Left, Root, Right - Inorder or Symmetric• Left, Right, Root - Postorder • Root, Left, Right - Preorder or Depth-First
• Level by Level - Breadth-First
Each of these traversals can be reversed
CS121 © JAS 2005 2-9
CS121 Data Structures
Defining a tree by operations/methods
Create a tree
either an empty a tree
or, a tree consisting solely of a root
IsEmpty – determine if a tree is empty
CS121 © JAS 2005 2-10
CS121 Data Structures
Accessing/Moving round a treeGetData – view data in a nodeGetLeft – move to/access left sub-treeGetRight – move to/access right sub-tree
Adding to a treecan be application dependentsimplest methods add a sub-tree to an empty left (or right) subtreeAdding elsewhere involves reshaping the tree
CS121 © JAS 2005 2-11
CS121 Data Structures
Defining a Tree Class
public class BTNode
{ private Object data;
private BTNode left, right;
public BTNode(Object initialData,
BTNode initialLeft,
BTNode initialRight)
{ data = initialData;
left = initialLeft;
right = initialRight;
}
CS121 © JAS 2005 2-12
CS121 Data Structures
public Object getData()
{ return data; }
public BTNode getLeft()
{ return left; }
public BTNode getRight()
{ return right; }
CS121 © JAS 2005 2-13
CS121 Data Structures
public Object getLeftmostData(){ if (left == null)
return data; else return left.getLeftmostData();}
public Object getRightmostData(){
………}
CS121 © JAS 2005 2-14
CS121 Data Structures
public void inorderPrint()
{ if (left != null)
left.inorderPrint();
System.out.println(data);
if (right != null)
right.inorderPrint();
}
CS121 © JAS 2005 2-15
CS121 Data Structures
public BTNode removeLeftmost()
{ if (left == null)
return right; //leftmost node is root
else
{ left = left.removeLeftmost();
return this;
}
}
CS121 © JAS 2005 2-16
CS121 Data Structures
public void setData(Object newData)
{ data = newData; }
public void setleft(BTNode newLeft)
{ left = newLeft; }
CS121 © JAS 2005 2-17
CS121 Data Structures
public static BTNode treeCopy(BTNode src){ BTNode leftCopy, RightCopy; if (src == null)
return null; else { leftCopy = treeCopy(src,left);
rightCopy = treeCopy(src.right); return new
BTNode(src.data,leftCopy,rightCopy); }}
CS121 © JAS 2005 2-18
CS121 Data Structures
Binary Search Trees
In a binary search tree, a node N with a key K is inserted so that
• the keys in the left subtree of N are less than K, and • the keys in the right subtree of N are greater than K.
To summarise, for each node N in a binary search tree:
Keys in left subtree of N <
Key K in node N <
Keys in right subtree of N.
CS121 © JAS 2005 2-19
CS121 Data Structures
public BTNode addEntry(Object newData){ if (this == null) return (new BTNode(newData,null,null));
else { if newData.lessthan(data) left = left.addEntry(newData);
else right = right.addEntry(newData);
}return this;
}
CS121 © JAS 2005 2-20
CS121 Data Structures
public boolean findEntry(Object wanted)
{ if this == null
return false;
else
if wanted.lessthan(data)
return left.findEntry(wanted);
else
if wanted.greaterthan(data)
return right.findEntry(wanted);
else //assume equal to data
return true;
}
CS121 © JAS 2005 2-21
CS121 Data Structures
Deleting From a Binary Search Tree
• If node to be deleted is a leaf – no problem, just delete
• If node has only one subtree – replace node with child
• If node has two subtrees
– Determine larger sub-tree (left/right)
– Select least/greatest node of larger sub-tree
– Move that node up
– This will always be a leaf node or a node with only one sub-tree so process ends
CS121 © JAS 2005 2-22
CS121 Data Structures
hd l
b j n
a c e g i k m o
f
e
CS121 © JAS 2005 2-23
CS121 Data Structures
Binary Tree Application - Huffman CodesAim is to encode more frequent letters with shorter bit sequences
Algorithm
• Count frequencies of letters
• Arrange into ascending order (linked list or binary search tree?)
• Using two least frequent letters construct binary tree such that– Left leaf is least frequent letter; label left branch 1
– Right leaf is next least frequent letter; label right branch 0
– Root is given value = sum of leafs
• Replace first two entries in frequency list with entry representing sum (ensure list remains sorted)
• Repeat
CS121 © JAS 2005 2-24
CS121 Data Structures
Encoding a string
Replace each letter with bit code derived from path from root of tree to leaf containing that letter
Decoding a string
Read binary code such that1 – descend left0 – descend rightat leaf – output letter
CS121 © JAS 2005 2-25
CS121 Data Structures
This is a simple messagea 2e 3g 1h 1i 3l 1m 2p 1s 5t 1
g hgh 2lp 2at 3
l p a t
m
mgh 4lpat 5ei 6smgh 9
e is
1
1
1
1
1
1
1
1
1
0
0
00
0
0 0
0
0
CS121 © JAS 2005 2-26
CS121 Data Structures
Binary Tree Application - Priority QueuesA heap is a complete binary tree such that no node has a
value bigger than its parent. This can provide a representation of a priority queue.The highest priority entry will always be at the root.Removal of the root does, however, require restructuring the
tree.To maintain the complete tree property – move the last item
in the tree (rightmost leaf on bottom level) to the root.To re-establish the priority property swap new root with
highest priority child and repeat until it is moved to the correct position
CS121 © JAS 2005 2-27
CS121 Data Structures
Implementation of Heaps
Obviously Heaps can be implemented using references, but as (almost) all the operations are based on swapping values and the trees are complete it is also feasible to use arrays to implement a heap tree.
CS121 © JAS 2005 2-28
CS121 Data Structures
Trees implemented using an array
a b c d e f g h i j k l m n
ab c
d e f g
h i j k l m n
o
o
CS121 © JAS 2005 2-29
CS121 Data Structures
If a Tree is represented by array ATree
Tree.root == ATree[0]
If Tree == ATree[i]
then Tree.left == ATree[2i+1]
and Tree.right == ATree[2i+2]
CS121 © JAS 2005 2-30
CS121 Data Structures
Analysis of Binary Tree
For a complete binary tree every comparison halves the size of the tree to be searched.
Searching is thus said to be O(logn)n is the number of items to be searchedlog is the logarithm to base 2
If we inserted sorted data into a binary search tree we would end up with a linear list
For a linear list the search time is O(n)
CS121 © JAS 2005 2-31
CS121 Data Structures
Balanced binary search trees have the best search times – O(logn)
In the worst case, unbalanced trees can have O(n) search times
AVL trees – as each new key is inserted, the tree remains balanced
Guarantees O(logn) search times
To keep tree balanced insertion worst case is O(n)
CS121 © JAS 2005 2-32
CS121 Data Structures
AVL (Adelson-Velskii and Landis) trees are almost balanced binary trees
They have both O(logn) insertion (for one key) and search times
Height of binary tree = length of longest path from the root to some leaf
(The empty binary tree is considered to have a height of -1)
The AVL property for a node, N, in a binary tree is that the heights of the left and right subtrees of node N are either equal or if they differ by 1
An AVL tree is a binary tree in which each of its nodes has the AVL property.
CS121 © JAS 2005 2-33
CS121 Data Structures
If inserting into an AVL tree causes the AVL property to be lost at a node, we can apply some shape-changing tree transformations, rotations, to restore the AVL property
There are four different rotationsSingle leftSingle rightDouble left = single right then single leftDouble right = single left then single right
CS121 © JAS 2005 2-34
CS121 Data Structures
private AVLNode RotateRight( AVLNode tree)
{
AVLNode newTree = tree.Left;
tree.Left = newTree.Right;
newTree.Right = tree;
return newTree;
}
CS121 © JAS 2005 2-35
CS121 Data Structures
BRU
ORY
JFK
Right Rotation
treenewTree
0 0 01 1
0 0 01 1
2 2
CS121 © JAS 2005 2-36
CS121 Data Structures
MEX
ORDGLA
DUSARN
GCM NRT
ZRH
BRU ORY
JFK
Double Left
Right then Left
CS121 © JAS 2005 2-37
CS121 Data Structures
AVL Tree SummaryDifference in height between the left and right subtrees of every node is
0 or 1If this property is lost by inserting a new key, rotations can be
performed on the subtree to regain the propertyAn extra field will be required in the node record to retain the height of
the subtreesIt can be shown that in the worst case insertions, searches and deletions
take at most O(logn) time
Remember binary trees have a worst case of O(n) when a chain occurs. In fact it can be proven using fibonacci trees that AVL trees are at worst 44% less efficient than complete trees.
Proof not required for course, but can be found in text books.
CS121 © JAS 2005 2-38
CS121 Data Structures
Two-Three Trees (2-3 Trees)
Another option is to arrange for all subtrees to be perfectly balanced with respect to their heights, but to permit the number of search keys stored in nodes to vary
Each node in a 2-3 tree is permitted to contain either one or two search keys, and to have either two or three descendants
All leaves of the tree are empty trees that lie on exactly one bottom level.
CS121 © JAS 2005 2-39
CS121 Data Structures
H
D J N
IA E F K L O P
LH < L
J < L < N
CS121 © JAS 2005 2-40
CS121 Data Structures
H
D J N
IA E F K L O P
B
B < H
B<D
A B
CS121 © JAS 2005 2-41
CS121 Data Structures
H
D J N
IA E F K L O P
MH < M
J < M < N
L ^
NJ
K M
H L
CS121 © JAS 2005 2-42
CS121 Data Structures
The worst case 2-3 tree is one in which every node has 1 key and two children – a complete binary tree
For a 2-3 tree with k levels and n elements we have
n = 2 (k+1) - 1 Hence k+1=log(n+1)
As splits are passed up the tree to parent nodes, there can only ever be O(logn) splits during the insertion
Thus we can determine that search and deletions take O(logn) in the worst cases.
CS121 © JAS 2005 2-43
CS121 Data Structures
B-Trees
Increase the number of children in a non-root node of a 2-3 tree to be, say, from 100 to 200, then we have a B-tree of order 200
Such a tree can store 8 million records in just 3 levels
2003 = 8 million or log200
8000000=3
Therefore when storing records on disc we would only need 3 disc accesses in order to find any record in our tree of 8 million records
A binary tree would require around 23 disc access
A B-tree of order 200 would require 3 or in the worst case 5 accesses
CS121 © JAS 2005 2-44
CS121 Data Structures
Binary trees may be fast when stored in fast primary memory, but trees with higher branching factors are far more efficient when stored on slow external memory devices.
For efficiency, all of an ordered key sequence in a B-tree is read into internal memory at once, and a fast binary search is used to find a key in this ordered sequence. This either gives the key itself, or the node which must be read from disc next.
In general, in a B-tree of order m , each internal node except the root and the leaves must have between upper(m/2) and m children. The root can either be a leaf or it can contain from 2 to m children. All leaves lie on the same bottom-most level and are empty.
CS121 © JAS 2005 2-45
CS121 Data Structures
Summary
Concepts and terminology of trees– branches, nodes, children, parents, ancestors, descendants, internal nodes, leaves, levels, paths
Binary trees, definition, and recursive definition, complete binary tree definition, representing expressions using binary trees, traversing binary trees
Implementing trees in Java
Binary search trees
Huffman coding tree
Deleting from a binary tree
CS121 © JAS 2005 2-46
CS121 Data Structures
Analysis of the binary tree – big-O notation
Priority queues; represented by heaps; reheapifying
AVL trees – definition, how to calculate the heights and balance factors, how to identify AVL and non-AVL trees, how to insert, how to perform rotations, benefits
2-3 trees – how to insert into, benefits
B-trees – benefits, general idea of how they work
CS121 © JAS 2005 2-47
CS121 Data Structures