47
CS121 © JAS 2005 2-1 CS121 Data Structures Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry has at most one predecessor, but can have more than one successor.

CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

Embed Size (px)

Citation preview

Page 1: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-1

CS121 Data Structures

Trees

Each entry in a List (Stack, Queue) has at most one predecessor and one successor.

In a Tree each entry has at most one predecessor, but can have more than one successor.

Page 2: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-2

CS121 Data Structures

In general trees can have any number of successors, but usually we restrict ourselves to a maximum of two successors - BINARY TREES.

A tree is either

• Empty, or

• Consists of a value and two sub-trees (left and right)

Page 3: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-3

CS121 Data Structures

For each non-empty tree there is a unique node with no predecessor – the root node

Nodes which have no successors are termed leaf nodes

All other nodes have exactly one predecessor and either one or two successors – internal nodes

Page 4: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-4

CS121 Data Structures

Various subsets of trees can be identified –

• Binary Trees (maximum of two successors)

• Strictly Binary Trees (nodes have either 0 or 2 successors)

• Full Binary Trees (all leaves at same depth)

• Complete Binary Trees (all levels except last one are full, and last level is filled from left)

Page 5: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-5

CS121 Data Structures

Various characteristics of trees can be measured

• Number of nodes - Nodes

• Number of leaves - Leaves

• Depth (longest path from root to leaf) – Depth

Nodes(EmptyTree) = 0

Nodes(Tree) = 1 + Nodes(Tree.Left) + Nodes(Tree.Right)

Page 6: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-6

CS121 Data Structures

Leaves(EmptyTree) = 0

Leaves(RootOnly) = 1

Leaves(Tree) = Leaves(Tree.Left) + Leaves(Tree.Right)

Depth(Tree) = 1 + maximum(Depth(Tree.Left), Depth(Tree.Right))

Depth(EmptyTree) = – 1

Page 7: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-7

CS121 Data Structures

Example Application – Representing an Arithmetic Expression

Nodes represent operators

Leaves represent operands

/

– *

^ * 2 a

b 2 * c

4 a

b ^ 2 – 4 * a * c / 2 * a(( ) )

Page 8: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-8

CS121 Data Structures

Traversing a tree (visit all nodes in a tree)

• Left, Root, Right - Inorder or Symmetric• Left, Right, Root - Postorder • Root, Left, Right - Preorder or Depth-First

• Level by Level - Breadth-First

Each of these traversals can be reversed

Page 9: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-9

CS121 Data Structures

Defining a tree by operations/methods

Create a tree

either an empty a tree

or, a tree consisting solely of a root

IsEmpty – determine if a tree is empty

Page 10: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-10

CS121 Data Structures

Accessing/Moving round a treeGetData – view data in a nodeGetLeft – move to/access left sub-treeGetRight – move to/access right sub-tree

Adding to a treecan be application dependentsimplest methods add a sub-tree to an empty left (or right) subtreeAdding elsewhere involves reshaping the tree

Page 11: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-11

CS121 Data Structures

Defining a Tree Class

public class BTNode

{ private Object data;

private BTNode left, right;

public BTNode(Object initialData,

BTNode initialLeft,

BTNode initialRight)

{ data = initialData;

left = initialLeft;

right = initialRight;

}

Page 12: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-12

CS121 Data Structures

public Object getData()

{ return data; }

public BTNode getLeft()

{ return left; }

public BTNode getRight()

{ return right; }

Page 13: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-13

CS121 Data Structures

public Object getLeftmostData(){ if (left == null)

return data; else return left.getLeftmostData();}

public Object getRightmostData(){

………}

Page 14: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-14

CS121 Data Structures

public void inorderPrint()

{ if (left != null)

left.inorderPrint();

System.out.println(data);

if (right != null)

right.inorderPrint();

}

Page 15: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-15

CS121 Data Structures

public BTNode removeLeftmost()

{ if (left == null)

return right; //leftmost node is root

else

{ left = left.removeLeftmost();

return this;

}

}

Page 16: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-16

CS121 Data Structures

public void setData(Object newData)

{ data = newData; }

public void setleft(BTNode newLeft)

{ left = newLeft; }

Page 17: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-17

CS121 Data Structures

public static BTNode treeCopy(BTNode src){ BTNode leftCopy, RightCopy; if (src == null)

return null; else { leftCopy = treeCopy(src,left);

rightCopy = treeCopy(src.right); return new

BTNode(src.data,leftCopy,rightCopy); }}

Page 18: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-18

CS121 Data Structures

Binary Search Trees

In a binary search tree, a node N with a key K is inserted so that

• the keys in the left subtree of N are less than K, and • the keys in the right subtree of N are greater than K.

To summarise, for each node N in a binary search tree:

Keys in left subtree of N <

Key K in node N <

Keys in right subtree of N.

Page 19: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-19

CS121 Data Structures

public BTNode addEntry(Object newData){ if (this == null) return (new BTNode(newData,null,null));

else { if newData.lessthan(data) left = left.addEntry(newData);

else right = right.addEntry(newData);

}return this;

}

Page 20: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-20

CS121 Data Structures

public boolean findEntry(Object wanted)

{ if this == null

return false;

else

if wanted.lessthan(data)

return left.findEntry(wanted);

else

if wanted.greaterthan(data)

return right.findEntry(wanted);

else //assume equal to data

return true;

}

Page 21: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-21

CS121 Data Structures

Deleting From a Binary Search Tree

• If node to be deleted is a leaf – no problem, just delete

• If node has only one subtree – replace node with child

• If node has two subtrees

– Determine larger sub-tree (left/right)

– Select least/greatest node of larger sub-tree

– Move that node up

– This will always be a leaf node or a node with only one sub-tree so process ends

Page 22: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-22

CS121 Data Structures

hd l

b j n

a c e g i k m o

f

e

Page 23: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-23

CS121 Data Structures

Binary Tree Application - Huffman CodesAim is to encode more frequent letters with shorter bit sequences

Algorithm

• Count frequencies of letters

• Arrange into ascending order (linked list or binary search tree?)

• Using two least frequent letters construct binary tree such that– Left leaf is least frequent letter; label left branch 1

– Right leaf is next least frequent letter; label right branch 0

– Root is given value = sum of leafs

• Replace first two entries in frequency list with entry representing sum (ensure list remains sorted)

• Repeat

Page 24: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-24

CS121 Data Structures

Encoding a string

Replace each letter with bit code derived from path from root of tree to leaf containing that letter

Decoding a string

Read binary code such that1 – descend left0 – descend rightat leaf – output letter

Page 25: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-25

CS121 Data Structures

This is a simple messagea 2e 3g 1h 1i 3l 1m 2p 1s 5t 1

g hgh 2lp 2at 3

l p a t

m

mgh 4lpat 5ei 6smgh 9

e is

1

1

1

1

1

1

1

1

1

0

0

00

0

0 0

0

0

Page 26: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-26

CS121 Data Structures

Binary Tree Application - Priority QueuesA heap is a complete binary tree such that no node has a

value bigger than its parent. This can provide a representation of a priority queue.The highest priority entry will always be at the root.Removal of the root does, however, require restructuring the

tree.To maintain the complete tree property – move the last item

in the tree (rightmost leaf on bottom level) to the root.To re-establish the priority property swap new root with

highest priority child and repeat until it is moved to the correct position

Page 27: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-27

CS121 Data Structures

Implementation of Heaps

Obviously Heaps can be implemented using references, but as (almost) all the operations are based on swapping values and the trees are complete it is also feasible to use arrays to implement a heap tree.

Page 28: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-28

CS121 Data Structures

Trees implemented using an array

a b c d e f g h i j k l m n

ab c

d e f g

h i j k l m n

o

o

Page 29: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-29

CS121 Data Structures

If a Tree is represented by array ATree

Tree.root == ATree[0]

If Tree == ATree[i]

then Tree.left == ATree[2i+1]

and Tree.right == ATree[2i+2]

Page 30: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-30

CS121 Data Structures

Analysis of Binary Tree

For a complete binary tree every comparison halves the size of the tree to be searched.

Searching is thus said to be O(logn)n is the number of items to be searchedlog is the logarithm to base 2

If we inserted sorted data into a binary search tree we would end up with a linear list

For a linear list the search time is O(n)

Page 31: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-31

CS121 Data Structures

Balanced binary search trees have the best search times – O(logn)

In the worst case, unbalanced trees can have O(n) search times

AVL trees – as each new key is inserted, the tree remains balanced

Guarantees O(logn) search times

To keep tree balanced insertion worst case is O(n)

Page 32: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-32

CS121 Data Structures

AVL (Adelson-Velskii and Landis) trees are almost balanced binary trees

They have both O(logn) insertion (for one key) and search times

Height of binary tree = length of longest path from the root to some leaf

(The empty binary tree is considered to have a height of -1)

The AVL property for a node, N, in a binary tree is that the heights of the left and right subtrees of node N are either equal or if they differ by 1

An AVL tree is a binary tree in which each of its nodes has the AVL property.

Page 33: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-33

CS121 Data Structures

If inserting into an AVL tree causes the AVL property to be lost at a node, we can apply some shape-changing tree transformations, rotations, to restore the AVL property

There are four different rotationsSingle leftSingle rightDouble left = single right then single leftDouble right = single left then single right

Page 34: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-34

CS121 Data Structures

private AVLNode RotateRight( AVLNode tree)

{

AVLNode newTree = tree.Left;

tree.Left = newTree.Right;

newTree.Right = tree;

return newTree;

}

Page 35: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-35

CS121 Data Structures

BRU

ORY

JFK

Right Rotation

treenewTree

0 0 01 1

0 0 01 1

2 2

Page 36: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-36

CS121 Data Structures

MEX

ORDGLA

DUSARN

GCM NRT

ZRH

BRU ORY

JFK

Double Left

Right then Left

Page 37: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-37

CS121 Data Structures

AVL Tree SummaryDifference in height between the left and right subtrees of every node is

0 or 1If this property is lost by inserting a new key, rotations can be

performed on the subtree to regain the propertyAn extra field will be required in the node record to retain the height of

the subtreesIt can be shown that in the worst case insertions, searches and deletions

take at most O(logn) time

Remember binary trees have a worst case of O(n) when a chain occurs. In fact it can be proven using fibonacci trees that AVL trees are at worst 44% less efficient than complete trees.

Proof not required for course, but can be found in text books.

Page 38: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-38

CS121 Data Structures

Two-Three Trees (2-3 Trees)

Another option is to arrange for all subtrees to be perfectly balanced with respect to their heights, but to permit the number of search keys stored in nodes to vary

Each node in a 2-3 tree is permitted to contain either one or two search keys, and to have either two or three descendants

All leaves of the tree are empty trees that lie on exactly one bottom level.

Page 39: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-39

CS121 Data Structures

H

D J N

IA E F K L O P

LH < L

J < L < N

Page 40: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-40

CS121 Data Structures

H

D J N

IA E F K L O P

B

B < H

B<D

A B

Page 41: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-41

CS121 Data Structures

H

D J N

IA E F K L O P

MH < M

J < M < N

L ^

NJ

K M

H L

Page 42: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-42

CS121 Data Structures

The worst case 2-3 tree is one in which every node has 1 key and two children – a complete binary tree

For a 2-3 tree with k levels and n elements we have

n = 2 (k+1) - 1 Hence k+1=log(n+1)

As splits are passed up the tree to parent nodes, there can only ever be O(logn) splits during the insertion

Thus we can determine that search and deletions take O(logn) in the worst cases.

Page 43: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-43

CS121 Data Structures

B-Trees

Increase the number of children in a non-root node of a 2-3 tree to be, say, from 100 to 200, then we have a B-tree of order 200

Such a tree can store 8 million records in just 3 levels

2003 = 8 million or log200

8000000=3

Therefore when storing records on disc we would only need 3 disc accesses in order to find any record in our tree of 8 million records

A binary tree would require around 23 disc access

A B-tree of order 200 would require 3 or in the worst case 5 accesses

Page 44: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-44

CS121 Data Structures

Binary trees may be fast when stored in fast primary memory, but trees with higher branching factors are far more efficient when stored on slow external memory devices.

For efficiency, all of an ordered key sequence in a B-tree is read into internal memory at once, and a fast binary search is used to find a key in this ordered sequence. This either gives the key itself, or the node which must be read from disc next.

In general, in a B-tree of order m , each internal node except the root and the leaves must have between upper(m/2) and m children. The root can either be a leaf or it can contain from 2 to m children. All leaves lie on the same bottom-most level and are empty.

Page 45: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-45

CS121 Data Structures

Summary

Concepts and terminology of trees– branches, nodes, children, parents, ancestors, descendants, internal nodes, leaves, levels, paths

Binary trees, definition, and recursive definition, complete binary tree definition, representing expressions using binary trees, traversing binary trees

Implementing trees in Java

Binary search trees

Huffman coding tree

Deleting from a binary tree

Page 46: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-46

CS121 Data Structures

Analysis of the binary tree – big-O notation

Priority queues; represented by heaps; reheapifying

AVL trees – definition, how to calculate the heights and balance factors, how to identify AVL and non-AVL trees, how to insert, how to perform rotations, benefits

2-3 trees – how to insert into, benefits

B-trees – benefits, general idea of how they work

Page 47: CS121 Data Structures CS121 © JAS 2005 2-1 Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry

CS121 © JAS 2005 2-47

CS121 Data Structures