victor-palmar
Trees1
Lecture 5
Tree Data Structure
Basic Tree Concepts
A tree consists of a finite set of elements, called nodes, and a finite set of directed lines, called branches, that connect the nodes.
The number of branches associated with a node is the degree of the node.
Tree
[Figure: a simple unordered tree. In this diagram, the node labelled 7 has two children, labelled 2 and 6, and one parent, labelled 2. The root node, at the top, has no parent.]
A tree is a widely-used data structure that emulates a hierarchical tree structure with a set of linked nodes.
When the branch is directed toward the node, it is an in degree branch.
When the branch is directed away from the node, it is an out degree branch.
The sum of the in degree and out degree branches is the degree of the node.
If the tree is not empty, the first node is called the root.
The in degree of the root is, by definition, zero.
With the exception of the root, all of the nodes in a tree must have an in degree of exactly one; that is, they may have only one predecessor.
All nodes in the tree can have zero, one, or more branches leaving them; that is, they may have out degree of zero, one, or more.
A leaf is any node with an out degree of zero, that is, a node with no successors.
A node that is not a root or a leaf is known as an internal node.
A node is a parent if it has successor nodes; that is, if it has out degree greater than zero.
A node with a predecessor is called a child.
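The degree and role definitions above can be checked mechanically. The following is a minimal sketch, assuming a hypothetical `Node` class (not part of the slides) that stores its children and a back-reference to its parent:

```python
class Node:
    """General tree node (hypothetical helper, not from the slides)."""

    def __init__(self, label, children=None):
        self.label = label
        self.children = children or []
        self.parent = None
        for child in self.children:
            child.parent = self

    def out_degree(self):
        # Number of branches leaving the node.
        return len(self.children)

    def in_degree(self):
        # Zero for the root, exactly one for every other node.
        return 0 if self.parent is None else 1

    def degree(self):
        return self.in_degree() + self.out_degree()

    def kind(self):
        # A root is reported as "root" even if it has no children.
        if self.parent is None:
            return "root"
        return "leaf" if not self.children else "internal"


b = Node("B", [Node("C"), Node("D")])
a = Node("A", [b, Node("E")])

for node in (a, b, b.children[0]):
    print(node.label, node.kind(), node.degree())
```

Here `A` is the root with degree 2, `B` is an internal node with degree 3 (one branch in, two out), and `C` is a leaf with degree 1.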
Two or more nodes with the same parent are called siblings.
An ancestor is any node in the path from the root to the node.
A descendant is any node in the path below the parent node; that is, all nodes in the paths from a given node to a leaf are descendants of that node.
A path is a sequence of nodes in which each node is adjacent to the next node.
The level of a node is its distance from the root. The root is at level 0, its children are at level 1, and so on.
The height of the tree is the level of the leaf in the longest path from the root plus 1. By definition the height of any empty tree is -1.
A sub tree is any connected structure below the root.
The first node in the sub tree is known as the root of the sub tree.
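As a rough illustration of levels and height, the sketch below computes both for a small general tree. The `Node` class and the function names are assumptions made for this example. The height follows the convention stated above (level of the deepest leaf plus 1, and -1 for an empty tree).

```python
class Node:
    """General tree node (hypothetical helper for this example)."""

    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)


def levels(root, level=0, out=None):
    """Map each label to its level; the root is at level 0."""
    out = {} if out is None else out
    out[root.label] = level
    for child in root.children:
        levels(child, level + 1, out)
    return out


def height(root):
    """Level of the deepest leaf plus 1; -1 for an empty tree, as stated above."""
    if root is None:
        return -1
    return max(levels(root).values()) + 1


tree = Node("A", [
    Node("B", [Node("C"), Node("D")]),
    Node("E"),
    Node("F", [Node("G"), Node("H"), Node("I")]),
])

print(levels(tree)["A"], levels(tree)["F"], levels(tree)["I"])
print(height(tree))
```

For this tree the root `A` is at level 0, `F` at level 1, and `I` at level 2, so the height is 3.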
Recursive definition of a tree
A tree is a set of nodes that either:
• is empty, or
• has a designated node, called the root, from which hierarchically descend zero or more sub trees, which are also trees.
Tree Representation
General Tree – organization chart format
Indented list – bill-of-materials system in which a parts list represents the assembly structure of an item
Parenthetical Listing
Parenthetical Listing – the algebraic expression, where each open parenthesis indicates the start of a new level and each closing parenthesis completes the current level and moves up one level in the tree.
A (B (C D) E F (G H I) )
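A parenthetical listing can be produced by a short recursive function. The sketch below is one possible way to do it, assuming a hypothetical `Node` class with a `children` list. It omits the space before the final closing parenthesis that appears in the listing above.

```python
class Node:
    """General tree node (hypothetical helper for this example)."""

    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)


def listing(node):
    """Return the parenthetical listing of the tree rooted at node."""
    if not node.children:
        return node.label
    # An open parenthesis starts a new level; the closing one ends it.
    inner = " ".join(listing(child) for child in node.children)
    return f"{node.label} ({inner})"


tree = Node("A", [
    Node("B", [Node("C"), Node("D")]),
    Node("E"),
    Node("F", [Node("G"), Node("H"), Node("I")]),
])

print(listing(tree))  # A (B (C D) E F (G H I))
```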
Binary Trees
In a binary tree, no node can have more than two children.
• Properties
• Binary Tree Traversals
• Expression Trees
• Huffman Code
A binary tree is a tree in which no node can have more than two sub trees; the maximum out degree for a node is two.
In other words, a node can have zero, one, or two sub trees.
These sub trees are designated as the left sub tree and the right sub tree.
A null tree is a tree with no nodes.
Some Properties of Binary Trees
The height of a binary tree can be predicted mathematically.
Given that we need to store N nodes in a binary tree, the maximum height is
$$H_{\max} = N$$
A tree with a maximum height is rare. It occurs when all of the nodes in the entire tree have only one successor.
The minimum height of a binary tree is determined as follows:
$$H_{\min} = \lfloor \log_2 N \rfloor + 1$$
For instance, if there are three nodes to be stored in the binary tree (N=3) then Hmin=2.
Given a height of the binary tree, H, the minimum number of nodes in the tree is given as follows:
$$N_{\min} = H$$
The formula for the maximum number of nodes is derived from the fact that each node can have at most two children.
Given a height of the binary tree, H, the maximum number of nodes in the tree is given as follows:
$$N_{\max} = 2^H - 1$$
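The four bounds on height and node count can be evaluated directly. The sketch below uses hypothetical function names. It also relies on the fact that, for a positive integer N, Python's `N.bit_length()` equals floor(log2 N) + 1, which avoids floating-point rounding.

```python
def max_height(n):
    """Tallest binary tree with n nodes: every node has one successor."""
    return n


def min_height(n):
    """floor(log2 n) + 1, computed exactly for n >= 1."""
    return n.bit_length()


def min_nodes(h):
    """Fewest nodes in a binary tree of height h."""
    return h


def max_nodes(h):
    """Most nodes in a binary tree of height h: every level is full."""
    return 2 ** h - 1


print(min_height(3), max_height(3))  # 2 3
print(min_nodes(3), max_nodes(3))    # 3 7
```

With N = 3 the minimum height is 2, matching the example in the slides, and a tree of height 3 holds between 3 and 7 nodes.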
The children of any node in a tree can be accessed by following only one branch path, the one that leads to the desired node.
The nodes at level 1, which are children of the root, can be accessed by following only one branch; the nodes of level 2 of a tree can be accessed by following only two branches from the root, etc.
The balance factor of a binary tree is the difference in height between its left and right sub trees:
$$B = H_L - H_R$$
[Figure: Balance of the tree. Example trees annotated with balance factors B = 0, 1, -1, 2, and -2.]
A binary tree is balanced if the heights of its sub trees differ by no more than one (its balance factor is -1, 0, or 1) and its sub trees are also balanced.
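The balance factor and the balance test translate almost directly into code. This is a minimal sketch, assuming a hypothetical `BNode` class. It gives an empty sub tree height -1, but since the balance factor is a difference of two heights, any consistent base value gives the same result.

```python
class BNode:
    """Binary tree node (hypothetical helper for this example)."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def height(node):
    # Assumption: empty sub tree has height -1, a single node has height 0.
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))


def balance(node):
    """B = H_L - H_R for the tree rooted at node."""
    return height(node.left) - height(node.right)


def is_balanced(node):
    """True if every node has a balance factor of -1, 0, or 1."""
    if node is None:
        return True
    return (abs(balance(node)) <= 1
            and is_balanced(node.left)
            and is_balanced(node.right))


full = BNode(2, BNode(1), BNode(3))
left_chain = BNode(3, BNode(2, BNode(1)))

print(balance(full), is_balanced(full))              # 0 True
print(balance(left_chain), is_balanced(left_chain))  # 2 False
```

This version recomputes heights at every node, which is simple but quadratic in the worst case. A production version would typically return the height and the balance verdict in a single pass.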
Complete and nearly complete binary trees
A complete tree has the maximum number of entries for its height. The maximum number is reached when the last level is full.
A tree is considered nearly complete if it has the minimum height for its nodes and all nodes in the last level are found on the left.
Implementing Binary Trees
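Just like other ADTs, we can implement a binary tree using pointers or arrays. In a pointer-based implementation, each node holds a data value together with a Left pointer and a Right pointer, and a Root pointer refers to the first node.
[Figure: Root pointing to a node with Data Value, Left, and Right fields; the Left and Right pointers lead to further nodes of the same form, etc.]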
Traversal through BSTs
Remember that a binary tree is either empty or in the form of a Root with two sub trees.
If the Root is empty, then the traversal algorithm should take no action (i.e., this is an empty tree, a "degenerate" case).
If the Root is not empty, then we need to print the information in the root node and start traversing the left and right sub trees.
When a sub tree is empty, then we stop traversing it.
The recursive traversal algorithm is:

    Traverse(Root)
        If the Tree is not empty then
            Visit the node at the Root
            Traverse(Left sub tree)
            Traverse(Right sub tree)

When traversing any binary tree, the algorithm has three choices of when to process the root: before it traverses both sub trees, after it traverses the left sub tree, or after it traverses both sub trees. These traversal methods are named preorder, inorder, and postorder, respectively.
Binary Tree Traversal
A binary tree traversal requires that each node of the tree be processed once and only once in a predetermined sequence.
In a depth-first traversal, processing proceeds along a path from the root through one child to the most distant descendant of that first child before processing a second child.
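Preorder Traversal
It would traverse the following tree as: 60, 20, 10, 5, 15, 40, 30, 70, 65, 85
[Figure: binary tree with root 60; 60 has children 20 and 70; 20 has children 10 and 40; 10 has children 5 and 15; 40 has the single child 30; 70 has children 65 and 85.]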
Inorder Traversal
It would traverse the same tree as: 5, 10, 15, 20, 30, 40, 60, 65, 70, 85
Notice that this type of traversal produces the numbers in order. Search trees can be set up so that all of the nodes in the left sub tree are less than the nodes in the right sub tree.
Postorder Traversal
It would traverse the same tree as: 5, 15, 10, 30, 40, 20, 65, 85, 70, 60
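The three traversal orders can be reproduced with a few lines of code. The sketch below assumes a hypothetical `Node` class and rebuilds the example tree, taking 30 to be the left child of 40 (the only placement consistent with the inorder sequence listed above).

```python
class Node:
    """Binary tree node (hypothetical helper for this example)."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def preorder(n):
    # Root first, then the left and right sub trees.
    return [] if n is None else [n.key] + preorder(n.left) + preorder(n.right)


def inorder(n):
    # Left sub tree, then root, then right sub tree.
    return [] if n is None else inorder(n.left) + [n.key] + inorder(n.right)


def postorder(n):
    # Both sub trees first, root last.
    return [] if n is None else postorder(n.left) + postorder(n.right) + [n.key]


root = Node(60,
            Node(20,
                 Node(10, Node(5), Node(15)),
                 Node(40, Node(30))),
            Node(70, Node(65), Node(85)))

print(preorder(root))   # [60, 20, 10, 5, 15, 40, 30, 70, 65, 85]
print(inorder(root))    # [5, 10, 15, 20, 30, 40, 60, 65, 70, 85]
print(postorder(root))  # [5, 15, 10, 30, 40, 20, 65, 85, 70, 60]
```

Building lists by concatenation keeps the example short. For large trees, a generator or an explicit accumulator list would avoid the repeated copying.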
Tree Representations
There are many different ways to represent trees.
Common representations represent the nodes as records allocated on the heap with pointers to their children, their parents, or both, or as items in an array, with relationships between them determined by their positions in the array (e.g., binary heap).
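To illustrate the array option, the sketch below uses the index arithmetic commonly used for binary heaps: with the root at index 0, the children of the node at index i sit at 2i + 1 and 2i + 2. The function names and the sample array are assumptions made for this example.

```python
def left(i):
    """Index of the left child of the node stored at index i."""
    return 2 * i + 1


def right(i):
    """Index of the right child of the node stored at index i."""
    return 2 * i + 2


def parent(i):
    """Index of the parent of the node stored at index i (i > 0)."""
    return (i - 1) // 2


# A full binary tree stored level by level: 60 is the root,
# 20 and 70 are its children, and so on.
tree = [60, 20, 70, 10, 40, 65, 85]

print(tree[left(0)], tree[right(0)])  # 20 70
print(tree[parent(4)])                # 20
```

No pointers are stored at all. The trade-off is that a sparse or degenerate tree wastes array slots, which is why this layout is mostly used for complete or nearly complete trees.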
Trees and Graphs
The tree data structure can be generalized to represent directed graphs by removing the constraints that a node may have at most one parent, and that no cycles are allowed.
Edges are still abstractly considered as pairs of nodes; however, the terms parent and child are usually replaced by different terminology (for example, source and target).
Relationship with Trees in Graph Theory
• In graph theory, a tree is a connected acyclic graph; unless stated otherwise, trees and graphs are undirected.
• There is no one-to-one correspondence between such trees and trees as data structure.
• We can take an arbitrary undirected tree, arbitrarily pick one of its vertices as the root, direct all its edges away from the root (producing an arborescence), and assign an order to all the nodes. The result corresponds to a tree data structure. Picking a different root or a different ordering produces a different one.
Common Operations
• Enumerating all the items
• Enumerating a section of a tree
• Searching for an item
• Adding a new item at a certain position on the tree
• Deleting an item
• Removing a whole section of a tree (called pruning)
• Adding a whole section to a tree (called grafting)
• Finding the root for any node
Common Uses
• Manipulate hierarchical data
• Make information easy to search
• Manipulate sorted lists of data
• As a workflow for compositing digital images for visual effects
• Router algorithms
Search Trees
A binary search tree can be created so that the elements in it satisfy an ordering property. This allows elements to be searched for quickly.
All of the elements in the left sub-tree are less than the element at the root, which is less than all of the elements in the right sub-tree, and this property applies recursively to all the sub-trees.
The great advantage of this is that when searching for an element, a comparison with the root will either find the element or indicate which one sub-tree to search.
The ordering is an invariant property of the search tree. All routines that operate on the tree can make use of it provided that they also keep it holding true.
Binary Search Trees
Key property
• Value at node
• Smaller values in left sub-tree
• Larger values in right sub-tree
Example
• X > Y
• X < Z
[Figure: node X with left child Y and right child Z.]
Binary Search Trees
Examples
[Figure: three example trees over the keys 2, 5, 10, 25, 30, 45. Two are labelled "Binary search trees" and one is labelled "Not a binary search tree".]
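Example Binary Searches
Find ( root, 2 )
[Figure: the two binary search trees over the keys 2, 5, 10, 25, 30, 45, one with root 10 and one with root 5.]
Tree with root 10: 10 > 2, left; 5 > 2, left; 2 = 2, found
Tree with root 5: 5 > 2, left; 2 = 2, found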
Example Binary Searches
Find ( root, 25 )
[Figure: the same two binary search trees, one with root 10 and one with root 5.]
Tree with root 10: 10 < 25, right; 30 > 25, left; 25 = 25, found
Tree with root 5: 5 < 25, right; 45 > 25, left; 30 > 25, left; 10 < 25, right; 25 = 25, found
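The search traces above can be reproduced with an iterative lookup that records every node it compares against. This is a sketch only: the `Node` class and the `find` signature are assumptions, and the two trees are rebuilt from the comparison sequences in the examples.

```python
class Node:
    """Binary search tree node (hypothetical helper for this example)."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def find(node, key):
    """Return the list of keys visited, or None if key is absent."""
    path = []
    while node is not None:
        path.append(node.key)
        if key == node.key:
            return path
        # Smaller keys live in the left sub tree, larger in the right.
        node = node.left if key < node.key else node.right
    return None


balanced = Node(10, Node(5, Node(2)), Node(30, Node(25), Node(45)))
skewed = Node(5, Node(2), Node(45, Node(30, Node(10, None, Node(25)))))

print(find(balanced, 25))  # [10, 30, 25]
print(find(skewed, 25))    # [5, 45, 30, 10, 25]
print(find(balanced, 2))   # [10, 5, 2]
print(find(skewed, 2))     # [5, 2]
```

Both trees hold the same keys, but the number of comparisons depends on the shape of the tree, not on the number of keys alone.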
Types of Binary Trees
• Degenerate – only one child
• Complete – always two children
• Balanced – “mostly” two children
More formal definitions exist; the above are intuitive ideas.
[Figure: a degenerate binary tree, a balanced binary tree, and a complete binary tree.]
Binary Trees Properties
Degenerate
• Height = O(n) for n nodes
• Similar to linked list
Balanced
• Height = O( log(n) ) for n nodes
• Useful for searches
[Figure: a degenerate binary tree and a balanced binary tree.]
Binary Search Properties
Time of search
• Proportional to height of tree
Balanced binary tree
• O( log(n) ) time
Degenerate tree
• O( n ) time
• Like searching linked list / unsorted array
Binary Search Tree Construction
How to build & maintain binary trees?
• Insertion
• Deletion
Maintain key property (invariant)
• Smaller values in left sub tree
• Larger values in right sub tree
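Example Insertion
Insert ( 20 )
[Figure: binary search tree with root 10, children 5 and 30, and leaves 2, 25, 45; the new node 20 is attached below 25.]
10 < 20, right
30 > 20, left
25 > 20, left
Insert 20 on left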
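The insertion walk above can be written as a short recursive function. This is a minimal sketch, assuming a hypothetical `Node` class and assuming that duplicate keys are simply ignored (the slides do not say how duplicates should be handled).

```python
class Node:
    """Binary search tree node (hypothetical helper for this example)."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def insert(node, key):
    """Insert key into the tree rooted at node and return the new root."""
    if node is None:
        return Node(key)          # empty spot found: attach the new leaf here
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    # Assumption: an equal key is already present, so nothing changes.
    return node


root = Node(10, Node(5, Node(2)), Node(30, Node(25), Node(45)))
root = insert(root, 20)

print(root.right.left.left.key)  # 20
```

The new key always ends up as a leaf, so the ordering invariant holds without moving any existing node.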
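Example Deletion (Leaf)
Delete ( 25 )
[Figure: binary search tree with root 10, children 5 and 30, and leaves 2, 25, 45, shown before and after the leaf 25 is removed.]
10 < 25, right
30 > 25, left
25 = 25, delete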
Example Deletion (Internal Node)
Delete ( 10 )
[Figure: three stages of the tree with root 10, children 5 and 30, and leaves 2, 25, 45.]
1. Replacing 10 with largest value in left subtree (5)
2. Replacing 5 with largest value in left subtree (2)
3. Deleting leaf
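Example Deletion (Internal Node)
Delete ( 10 )
[Figure: three stages of the tree with root 10, children 5 and 30, and leaves 2, 25, 45; the resulting tree has root 25, children 5 and 30, and leaves 2 and 45.]
1. Replacing 10 with smallest value in right subtree (25)
2. Deleting leaf
3. Resulting tree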
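The deletion cases can be sketched as one recursive function. The version below follows the second variant (replace the key with the smallest value in the right sub tree); using the largest value in the left sub tree instead is symmetric. The `Node` class and helper names are assumptions made for this example.

```python
class Node:
    """Binary search tree node (hypothetical helper for this example)."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def delete(node, key):
    """Remove key from the tree rooted at node and return the new root."""
    if node is None:
        return None                       # key not present
    if key < node.key:
        node.left = delete(node.left, key)
    elif key > node.key:
        node.right = delete(node.right, key)
    elif node.left is None:
        return node.right                 # leaf or right child only
    elif node.right is None:
        return node.left                  # left child only
    else:
        succ = node.right                 # smallest key in the right sub tree
        while succ.left is not None:
            succ = succ.left
        node.key = succ.key
        node.right = delete(node.right, succ.key)
    return node


def inorder(n):
    return [] if n is None else inorder(n.left) + [n.key] + inorder(n.right)


root = Node(10, Node(5, Node(2)), Node(30, Node(25), Node(45)))
root = delete(root, 10)

print(root.key)       # 25
print(inorder(root))  # [2, 5, 25, 30, 45]
```

After the deletion the inorder sequence is still sorted, which is a quick way to confirm that the key property (invariant) was maintained.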
Balanced Search Trees
Kinds of balanced binary search trees
• Height balanced vs. weight balanced
• “Tree rotations” used to maintain balance on insert/delete
Non-binary search trees
• 2-3 trees
• Each internal node has 2 or 3 children
• All leaves at same depth (height balanced)
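The slides mention tree rotations without showing one. As a rough illustration only, the sketch below performs a single right rotation, which is the usual remedy when a node is left-heavy. The `Node` class and the function name are assumptions, and a full balanced-tree implementation would also need the mirror-image left rotation and the double rotations.

```python
class Node:
    """Binary search tree node (hypothetical helper for this example)."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def rotate_right(y):
    """Rotate the sub tree rooted at y to the right and return its new root."""
    x = y.left
    y.left = x.right   # x's right sub tree holds keys between x and y
    x.right = y
    return x


# A degenerate left chain 3 -> 2 -> 1 ...
root = Node(3, Node(2, Node(1)))
# ... becomes a balanced tree with 2 at the root.
root = rotate_right(root)

print(root.left.key, root.key, root.right.key)  # 1 2 3
```

The rotation changes only two pointers and preserves the inorder sequence, so the search-tree ordering is unaffected while the height drops by one.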
Balanced Search Trees
B-trees
• Generalization of 2-3 trees
• Each node has an array of pointers to children
• Widely used in databases
AVL Tree
AVL (Adelson-Velskii and Landis) tree. Also called a height-balanced binary search tree.
An AVL tree is identical to a BST except:
• The heights of the left and right sub trees can differ by at most 1.
• The height of an empty tree is defined to be -1.
• Every sub tree is an AVL tree.
[Figure: An AVL Tree. A binary search tree on the keys 1, 2, 3, 4, 5, 7, 8, drawn against a height scale from 0 to 3.]
[Figure: Not an AVL tree. A binary search tree drawn against the same height scale from 0 to 3.]
Example
An example of an AVL tree where the heights are shown next to the nodes.
[Figure: AVL tree on the keys 17, 32, 44, 48, 50, 62, 78, 88, each annotated with its height, from 1 at the leaves to 4 at the root.]
Balanced Binary Tree
The height of a binary tree is the maximum level of its leaves (also called the depth).
The balance of a node in a binary tree is defined as the height of its left sub tree minus height of its right sub tree.
Each node has an indicated balance of 1, 0, or –1.
B-Tree
A B-Tree of order m is an m-way tree, such that:
• All leaves are on the same level
• All internal nodes except the root are constrained to have at most m non-empty children and at least m/2 non-empty children
• The root has at most m non-empty children
B-Tree of minimum degree 2 (order 4 under the definition above), also known as a 2-3-4 tree:
[Figure: example B-tree containing the keys 2, 4, 5, 6, 7, 8, 9, 11, 12, 16, 17, 18, 20, 21, 22, 23, 25, 26, 27, 29, 30, 31, 32, 35.]
B-Tree
A B-tree is a tree data structure that keeps data sorted and allows searches, sequential access, insertions, and deletions in logarithmic time.
The B-tree is a generalization of a binary search tree in that a node can have more than two children. The B-tree is optimized for systems that read and write large blocks of data.
It is commonly used in databases and file systems.
A B-tree is a specialized multi-way tree designed especially for use on disk. In a B-tree each node may contain a large number of keys.
The number of sub trees of each node, then, may also be large.
A B-tree is designed to branch out in this large number of directions and to contain a lot of keys in each node so that the height of the tree is relatively small.
This means that only a small number of nodes must be read from disk to retrieve an item.
The goal is to get fast access to the data, and with disk drives this means reading a very small number of records.
For example, the following is a multiway search tree of order 4. Note that the first row in each node shows the keys, while the second row shows the pointers to the child nodes.
There is a record of data associated with each key, so that the first row in each node might be an array of records where each record contains a key and its associated data.
Another approach would be to have the first row of each node contain an array of records where each record contains a key and a record number for the associated data record, which is found in another file.
This is often used when the data records are large.
Records are stored in locations called leaves. This name derives from the fact that records always exist at end points; there is nothing beyond them.
The maximum number of children per node is the order of the tree.
The number of required disk accesses is the depth.
The image at left shows a binary tree for locating a particular record in a set of eight leaves.
The image at right shows a B-tree of order three for locating a particular record in a set of eight leaves (the ninth leaf is unoccupied, and is called a null).
The binary tree at left has a depth of four; the B-tree at right has a depth of three.
Clearly, the B-tree allows a desired record to be located faster, assuming all other system parameters are identical.
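A back-of-the-envelope calculation shows why a higher order shortens the tree. The sketch below uses a hypothetical helper, `levels_needed`, that counts how many branching levels are required before a tree with a given fan-out can reach a given number of leaves. The depths of four and three quoted above appear to count the leaf level as well, which is one more than the values computed here.

```python
def levels_needed(n_leaves, fanout):
    """Branching levels required to reach n_leaves with the given fan-out."""
    levels, capacity = 0, 1
    while capacity < n_leaves:
        capacity *= fanout   # each extra level multiplies the reachable leaves
        levels += 1
    return levels


print(levels_needed(8, 2))  # 3 (binary tree, eight leaves)
print(levels_needed(8, 3))  # 2 (order three, nine leaf slots, one unused)

# Assumed illustrative figures: one million records.
print(levels_needed(1_000_000, 2))    # 20
print(levels_needed(1_000_000, 100))  # 3
```

The gap widens quickly as the data set grows, which is why disk-based indexes favour nodes with many children.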
A sophisticated program is required to execute the operations in a B-tree. But this program is stored in RAM, so it runs fast.