8/3/2019 aaoa chap 1
Advanced Analysis of Algorithms
Chapter 1
Dr. M. Sikander Hayat Khiyal
Chairperson
Department of Computer Science/Software
Engineering,
Fatima Jinnah Women University,
Rawalpindi, PAKISTAN
EVERY CASE TIME COMPLEXITY
For a given algorithm, T(n) is defined as the number of times the algorithm does its basic operation for every instance of size n. T(n) is called the every-case time complexity of the algorithm, and the determination of T(n) is called an every-case time complexity analysis.
WORST CASE TIME COMPLEXITY
For a given algorithm, W(n) is defined as the maximum number of times the algorithm will ever do its basic operation for an input of size n. W(n) is called the worst-case time complexity of the algorithm, and the determination of W(n) is called a worst-case time complexity analysis. If T(n) exists, then clearly
W(n) = T(n).
AVERAGE CASE TIME COMPLEXITY
For a given algorithm, A(n) is defined as the average (expected) number of times the algorithm does its basic operation for an input of size n. A(n) is called the average-case time complexity of the algorithm, and the determination of A(n) is called an average-case time complexity analysis. If T(n) exists, then clearly
A(n) = T(n).
BEST CASE TIME COMPLEXITY
For a given algorithm, B(n) is defined as the minimum number of times the algorithm will ever do its basic operation for an input of size n. B(n) is called the best-case time complexity of the algorithm, and the determination of B(n) is called a best-case time complexity analysis. If T(n) exists, then clearly
B(n) = T(n).
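These four measures can be made concrete with a small Python sketch (an added illustration, not from the original slides): an instrumented sequential search whose basic operation is a key comparison. For this algorithm B(n) = 1, W(n) = n, and A(n) = (n+1)/2 for a successful search with all slots equally likely; no every-case T(n) exists because the count varies with the input.

```python
def sequential_search(S, x):
    """Return (index of x in S or -1, number of key comparisons done)."""
    comparisons = 0
    for i, key in enumerate(S):
        comparisons += 1          # basic operation: compare x with a key
        if key == x:
            return i, comparisons
    return -1, comparisons

S = [7, 1, 9, 4, 2]
n = len(S)
assert sequential_search(S, 7)[1] == 1          # best case: B(n) = 1
assert sequential_search(S, 2)[1] == n          # worst case: W(n) = n
avg = sum(sequential_search(S, x)[1] for x in S) / n
assert avg == (n + 1) / 2                       # average case: A(n) = (n+1)/2
```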
ANALYSIS SUMMARY OF SORTING ALGORITHMS
Algorithm         | Comparisons of keys              | Assignments of records          | Extra space usage
----------------- | -------------------------------- | ------------------------------- | -----------------
Exchange Sort     | T(n) = n^2/2                     | W(n) = 3n^2/2, A(n) = 3n^2/4    | In place
Insertion Sort    | W(n) = n^2/2, A(n) = n^2/4       | W(n) = n^2/2, A(n) = n^2/4      | In place
Selection Sort    | T(n) = n^2/2                     | T(n) = 3n                       | In place
Merge Sort        | W(n) = n lg n, A(n) = n lg n     | T(n) = 2n lg n                  | Θ(n) records
Merge Sort (D.P.) | W(n) = n lg n, A(n) = n lg n     | T(n) = 0                        | Θ(n) links
Quick Sort        | W(n) = n^2/2, A(n) = 1.38 n lg n | A(n) = 0.69 n lg n              | Θ(lg n) indices
Heap Sort         | W(n) = 2n lg n, A(n) = 2n lg n   | W(n) = n lg n, A(n) = n lg n    | In place
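To see where a count like W(n) = n^2/2 in the table comes from, here is a small instrumented insertion sort (an added Python sketch, not from the original slides); a reverse-ordered input forces the maximum n(n-1)/2 key comparisons:

```python
def insertion_sort(a):
    """Sort a copy of a; return (sorted list, number of key comparisons)."""
    a = list(a)
    comps = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0:
            comps += 1              # basic operation: compare key with a[j]
            if a[j] > key:
                a[j + 1] = a[j]     # shift the larger record right
                j -= 1
            else:
                break
        a[j + 1] = key
    return a, comps

n = 10
_, worst = insertion_sort(list(range(n, 0, -1)))   # worst case: reverse order
assert worst == n * (n - 1) // 2                   # 45 comparisons, about n^2/2
```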
COMPUTATIONAL COMPLEXITY
Computational complexity is the study of all possible algorithms that can solve a given problem. A computational complexity analysis tries to determine a lower bound on the efficiency of all algorithms for a given problem. It is a field that goes hand in hand with algorithm design and analysis.
LOWER BOUND FOR SORTING ALGORITHMS
Insertion sort:
In general, we are concerned with sorting n distinct keys that come from any ordered set. However, without loss of generality, we can assume that the keys to be sorted are simply the positive integers 1, 2, …, n, because we can substitute 1 for the smallest key, 2 for the second smallest key, and so on. For example, the alphabetic input [Ralph, Clyde, Dave] corresponds to [3, 1, 2] if we associate 1 with Clyde, 2 with Dave, and 3 with Ralph. Any algorithm that sorts these integers only by comparisons of keys would have to do the same number of comparisons to sort the three names.
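The rank substitution described above can be sketched in a few lines of Python (an added illustration, not from the original slides):

```python
names = ["Ralph", "Clyde", "Dave"]
# replace each key by its rank: 1 for the smallest, 2 for the second smallest, ...
ranks = {name: r for r, name in enumerate(sorted(names), start=1)}
assert [ranks[name] for name in names] == [3, 1, 2]   # Clyde=1, Dave=2, Ralph=3
```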
A permutation of the first n positive integers can be thought of as an ordering of these integers; because there are n! permutations of the first n positive integers, there are n! different orderings.
For example, for the first three positive integers there are six permutations (orderings):
[1,2,3] [1,3,2] [2,1,3] [2,3,1] [3,1,2] [3,2,1]
This means that there are n! different inputs containing n distinct keys; the six permutations above are the different inputs of size 3. A permutation is denoted [k1, k2, …, kn].
An inversion in a permutation is a pair (ki, kj) such that i < j and ki > kj.
For example, the permutation [3,2,4,1,6,5] contains the inversions (3,2), (3,1), (2,1), (4,1), and (6,5). Clearly a permutation contains no inversions if and only if it is the sorted ordering [1, 2, …, n]. This means that the task of sorting n distinct keys is the removal of all inversions in a permutation.
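The inversion definition is easy to state in code; this added Python sketch (not from the original slides) reproduces the example above:

```python
def inversions(perm):
    """List all inversion pairs (perm[i], perm[j]) with i < j and perm[i] > perm[j]."""
    return [(perm[i], perm[j])
            for i in range(len(perm))
            for j in range(i + 1, len(perm))
            if perm[i] > perm[j]]

assert inversions([3, 2, 4, 1, 6, 5]) == [(3, 2), (3, 1), (2, 1), (4, 1), (6, 5)]
assert inversions([1, 2, 3, 4, 5, 6]) == []   # sorted order has no inversions
```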
THEOREM 1:
Any algorithm that sorts n distinct keys only by comparisons of keys and removes at most one inversion after each comparison must in the worst case do at least
n(n-1)/2 comparisons of keys
and on the average do at least
n(n-1)/4 comparisons of keys.
PROOF:
For the worst case, the input [n, n-1, …, 2, 1] contains n(n-1)/2 inversions (an arithmetic sum), so at least that many comparisons are required to remove them all.
For the average case, pair each permutation [kn, …, k2, k1] with the permutation [k1, k2, …, kn]; this reversal is called the transpose of the original permutation. Let r and s be integers between 1 and n such that s > r.
Given a permutation and its transpose, each pair (s, r) is an inversion in either the permutation or its transpose, but not in both. There are n(n-1)/2 such pairs of integers between 1 and n, so a permutation and its transpose together contain n(n-1)/2 inversions, and the average number of inversions in a permutation and its transpose is
(1/2)(n(n-1)/2) = n(n-1)/4.
Therefore, if we consider all permutations equally probable for the input, the average number of inversions in the input is also n(n-1)/4. Because we assumed that the algorithm removes at most one inversion after each comparison, on the average it must do at least this many comparisons to remove all inversions and thereby sort the input.
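The average-case count can be verified exhaustively for small n (an added Python sketch, not from the original slides): over all permutations of {1, …, 4}, the average number of inversions is exactly n(n-1)/4 = 3, and the reverse ordering attains the worst case n(n-1)/2 = 6.

```python
from itertools import permutations

def count_inversions(p):
    """Number of pairs i < j with p[i] > p[j]."""
    return sum(1 for i in range(len(p))
                 for j in range(i + 1, len(p))
                 if p[i] > p[j])

n = 4
perms = list(permutations(range(1, n + 1)))
avg = sum(count_inversions(p) for p in perms) / len(perms)
assert count_inversions((4, 3, 2, 1)) == n * (n - 1) // 2   # worst case: 6
assert avg == n * (n - 1) / 4                               # average: 3.0
```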
Lower bounds for sorting only by comparison of keys
Decision trees for sorting algorithms:
We can associate a binary tree with a procedure that sorts three keys a, b, and c by placing the comparison of a and b at the root; the left subtree handles the case a < b and the right subtree the case a > b, with the remaining comparisons placed in the same way.
This tree is called a decision tree, because at each node a decision must be made as to which node to visit next.
A decision tree is called valid for sorting n keys if, for each permutation of the n keys, there is a path from the root to a leaf that sorts that permutation; that is, it can sort every input of size n. The above tree is valid, but it would no longer be valid if we removed any branch from the tree.
Now draw a decision tree for exchange sort when sorting three keys.
[Figure: decision tree for exchange sort when sorting three keys]
A decision tree is called pruned if every leaf can be reached from the root by making a consistent sequence of decisions. Note that for exchange sort some comparisons in the unpruned tree have outcomes that are already determined by earlier comparisons on the path; the corresponding branches can never be reached and are removed in the pruned tree.
LEMMA 1:
To every deterministic algorithm for sorting n distinct keys there corresponds a pruned, valid, binary decision tree containing exactly n! leaves.
PROOF: There is a pruned, valid decision tree corresponding to any algorithm for sorting n keys. When all the keys are distinct, the result of a comparison is always < or >. Therefore, each node in that tree has at most two children, which means that it is a binary tree.
Because there are n! different inputs that contain n distinct keys, and because a decision tree is valid for sorting n distinct keys only if it has a leaf for every input, the tree has at least n! leaves. Because there is a unique path in the tree for each of the n! different inputs, and because every leaf in a pruned decision tree must be reachable, the tree can have no more than n! leaves. Therefore, the tree has exactly n! leaves.
LOWER BOUND FOR WORST CASE BEHAVIOUR
LEMMA 2: The worst case number of comparisons done by a decision tree is equal to its depth.
PROOF: Given some input, the number of comparisons done by a decision tree is the number of internal nodes on the path followed for that input, and the number of internal nodes on that path is the same as the length of the path. Therefore, the worst case number of comparisons done by a decision tree is the length of the longest path to a leaf, which is the depth of the decision tree.
LEMMA 3: If m is the number of leaves in a binary tree and d is the depth, then d ≥ ⌈lg m⌉.
PROOF: By induction, we show that 2^d ≥ m.
Induction base: A binary tree with depth 0 has one node that is both the root and the only leaf. Therefore, for such a tree, the number of leaves m equals 1, and 2^0 ≥ 1.
Induction hypothesis: Assume that for every binary tree with depth d, 2^d ≥ m, where m is the number of leaves.
Induction step: We need to show that, for a binary tree with depth d+1,
2^(d+1) ≥ m,
where m is the number of leaves.
If we remove all the leaves from such a tree, we have a tree with depth d whose leaves are the parents of the leaves in the original tree. If m′ is the number of these parents, then by the induction hypothesis
2^d ≥ m′.
Because each parent can have at most two children, m ≤ 2m′. Thus
2^(d+1) ≥ 2m′ ≥ m. This completes the induction.
Now taking lg of 2^d ≥ m gives d ≥ lg m; because d is an integer, d ≥ ⌈lg m⌉.
THEOREM 2: Any deterministic algorithm that sorts n distinct keys only by comparisons of keys must in the worst case do at least ⌈lg(n!)⌉ comparisons of keys.
PROOF: By Lemma 1, to the algorithm there corresponds a pruned, valid binary decision tree with n! leaves; by Lemma 3, the depth of that tree is at least ⌈lg(n!)⌉. The theorem then follows from Lemma 2, because the worst case number of comparisons done by a decision tree equals its depth.
LEMMA 4: For any positive integer n,
lg(n!) ≥ n lg n - 1.45n.
PROOF: lg(n!) = lg[n(n-1)(n-2)⋯(2)(1)] = Σ_{i=2}^{n} lg i
Since lg 1 = 0,
lg(n!) = Σ_{i=2}^{n} lg i ≥ ∫_1^n lg x dx = (1/ln 2)[n ln n - n + 1] ≥ n lg n - 1.45n. Proof completed.
(The constant comes from 1/ln 2 ≈ 1.443 < 1.45.)
THEOREM 3: Any deterministic algorithm that sorts n distinct keys only by comparisons of keys must in the worst case do at least
n lg n - 1.45n comparisons of keys.
PROOF: By Theorem 2 and Lemma 4.
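These bounds are easy to check numerically; the sketch below (an added Python snippet, not from the original slides) compares the exact bound ⌈lg(n!)⌉ of Theorem 2 with the approximation of Theorem 3:

```python
import math

for n in [2, 8, 32, 128]:
    exact = math.ceil(math.log2(math.factorial(n)))   # Theorem 2 bound
    approx = n * math.log2(n) - 1.45 * n              # Theorem 3 bound
    assert exact >= approx   # the approximation never exceeds the exact bound
```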
LOWER BOUNDS FOR AVERAGE CASE BEHAVIOUR
Definition: A binary tree in which every nonleaf contains exactly two children is called a 2-tree.
LEMMA 5: To every pruned, valid binary decision tree for sorting n distinct keys there corresponds a pruned, valid decision 2-tree that is at least as efficient as the original tree.
PROOF: If the pruned, valid binary decision tree corresponding to a deterministic algorithm for sorting n distinct keys contains any comparison nodes with only one child, we can replace each such node by its child and prune the child to obtain a decision tree that sorts using no more comparisons than the original (Fig. 2). Every nonleaf in the new tree will contain exactly two children.
Definition: The external path length (EPL) of a tree is the total length of all paths from the root to the leaves. For the tree of Figure 1,
EPL = 2+3+3+3+3+2 = 16.
The EPL of a pruned, valid decision tree is the total number of comparisons done over all n! different inputs of size n; therefore, the average number of comparisons is EPL/n!.
LEMMA 6: Any deterministic algorithm that sorts n distinct keys only by comparisons of keys must on the average do at least
(min EPL(n!)) / n!
comparisons of keys, where min EPL(m) denotes the minimum EPL over all 2-trees with m leaves.
PROOF: By Lemma 1, to every deterministic algorithm for sorting n distinct keys there corresponds a pruned, valid binary decision tree containing n! leaves. By Lemma 5, we can convert that binary tree to a 2-tree that is at least as efficient; because the original tree has n! leaves, so must the 2-tree we obtain from it. Its EPL is therefore at least min EPL(n!), and the average number of comparisons is at least min EPL(n!)/n!. Hence proved.
LEMMA 7: Any 2-tree that has m leaves and whose EPL equals min EPL(m) must have all of its leaves on at most the bottom two levels.
PROOF: Suppose that some 2-tree does not have all its leaves on the bottom two levels. Let d be the depth of the tree, let A be a leaf in the tree that is not on one of the bottom two levels, and let k be the depth of A.
Because nodes at the bottom level have depth d, k ≤ d-2.
We show that this tree cannot minimize the EPL among trees with the same number of leaves by constructing a 2-tree with the same number of leaves and a smaller EPL. Choose a nonleaf B at level d-1 in the original tree, remove its two children, and give two children to A. Clearly the new tree has the same number of leaves as the original tree.
[Figure: the original 2-tree with m leaves, showing leaf A at level k and nonleaf B at level d-1 with two children at level d; and the new 2-tree with m leaves and decreased EPL, in which A has two children at level k+1 and B is a leaf at level d-1]
In the new tree, neither A nor the children of B are leaves, but they are leaves in the old tree. Therefore, we have decreased the EPL by the length of the path to A and by the lengths of the two paths to B's children, that is, by
k + d + d = k + 2d.
In the new tree, B and the two new children of A are leaves, but they are not leaves in the old tree. Therefore, we have increased the EPL by the length of the path to B and the lengths of the two paths to A's new children, that is, by
(d-1) + (k+1) + (k+1) = d + 2k + 1.
Because k ≤ d-2, the net change in the EPL is
(d+2k+1) - (k+2d) = k - d + 1 ≤ (d-2) - d + 1 = -1.
Since the net change in the EPL is negative, the new tree has a smaller EPL. Thus the old tree cannot minimize the EPL among trees with the same number of leaves.
LEMMA 8: Any 2-tree that has m leaves and whose EPL equals min EPL(m) must have
2^d - m leaves at level d-1 and
2m - 2^d leaves at level d,
and have no other leaves, where d is the depth of the tree.
PROOF: By Lemma 7, all leaves are at the bottom two levels. Because nonleaves in a 2-tree have exactly two children, there must be 2^(d-1) nodes at level d-1. Therefore, if r is the number of leaves at level d-1, the number of nonleaves at that level is 2^(d-1) - r. Because nonleaves in a 2-tree have exactly two children, for every nonleaf at level d-1 there are two leaves at level d. Because there are only leaves at
level d, the number of leaves at level d equals 2(2^(d-1) - r). Because Lemma 7 says that all leaves are at level d or d-1,
r + 2(2^(d-1) - r) = m,
which gives
r = 2^d - m leaves at level d-1.
Therefore, the number of leaves at level d is
m - r = m - (2^d - m) = 2m - 2^d.
LEMMA 9: For any 2-tree that has m leaves and whose EPL equals min EPL(m), the depth d is given by d = ⌈lg m⌉.
PROOF: Consider first the case that m is a power of 2, so that m = 2^k for some integer k. Let d be the depth of a minimizing tree. By Lemma 8, the number r of leaves at level d-1 is
r = 2^d - m = 2^d - 2^k.
Because r ≥ 0, we must have d ≥ k. Assuming that d > k leads to a contradiction:
if d > k, then
r = 2^d - 2^k ≥ 2^(k+1) - 2^k = 2^k(2-1) = 2^k = m.
Because r ≤ m, this means r = m, and all leaves are at level d-1. But there must be some leaves at level d. This contradiction implies that d = k, which means r = 0;
thus,
2^d - m = 0, so 2^d = m and d = lg m.
Because ⌈lg m⌉ = lg m when m is a power of 2, d = ⌈lg m⌉. (A similar argument handles the case in which m is not a power of 2.) Proof completed.
LEMMA 10:
For all integers m ≥ 1, min EPL(m) ≥ m ⌊lg m⌋.
PROOF: By Lemma 8, any 2-tree that minimizes the EPL must have 2^d - m leaves at level d-1, have 2m - 2^d leaves at level d, and have no other leaves. Therefore we have
min EPL(m) = (2^d - m)(d-1) + (2m - 2^d)d = md + m - 2^d.
By Lemma 9, d = ⌈lg m⌉, so
min EPL(m) = m⌈lg m⌉ + m - 2^⌈lg m⌉.
If m is a power of 2, then ⌈lg m⌉ = ⌊lg m⌋ and 2^⌈lg m⌉ = m, so min EPL(m) = m⌊lg m⌋.
If m is not a power of 2, then ⌈lg m⌉ = ⌊lg m⌋ + 1, so
min EPL(m) = m(⌊lg m⌋ + 1) + m - 2^⌈lg m⌉ = m⌊lg m⌋ + 2m - 2^⌈lg m⌉ > m⌊lg m⌋,
because 2m > 2^⌈lg m⌉.
Therefore min EPL(m) ≥ m⌊lg m⌋.
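The closed form for min EPL(m) derived in Lemmas 8-10 can be checked by brute force over all 2-tree shapes (an added Python sketch, not from the original slides). The recurrence relies on the fact that splitting m leaves between the two subtrees of the root lengthens every root-to-leaf path by one edge, adding m to the EPL.

```python
from functools import lru_cache
import math

@lru_cache(maxsize=None)
def min_epl(m):
    """Minimum external path length over all 2-trees with m leaves."""
    if m == 1:
        return 0
    # putting k leaves in the left subtree and m-k in the right
    # adds one edge to every root-to-leaf path, i.e. adds m to the EPL
    return min(min_epl(k) + min_epl(m - k) + m for k in range(1, m))

for m in range(1, 65):
    d = math.ceil(math.log2(m))
    assert min_epl(m) == m * d + m - 2**d               # closed form, Lemmas 8-9
    assert min_epl(m) >= m * math.floor(math.log2(m))   # Lemma 10
```

Note that min_epl(6) = 16, matching the EPL = 16 example given earlier for a 2-tree with 3! = 6 leaves.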
THEOREM 4: Any deterministic algorithm that sorts n distinct keys only by comparisons of keys must on the average do at least
⌊n lg n - 1.45n⌋ comparisons of keys.
PROOF: By Lemma 6, any such algorithm must on the average do at least
min EPL(n!) / n! comparisons of keys.
By Lemma 10, this quantity is greater than or equal to
n! ⌊lg(n!)⌋ / n! = ⌊lg(n!)⌋.
By Lemma 4, ⌊lg(n!)⌋ ≥ ⌊n lg n - 1.45n⌋. Proved.
Lower bounds for searching only by comparison of keys
The problem of searching for a key can be described as follows: given an array S containing n keys and a key x, find an index i such that x = S[i] if x equals one of the keys; if x does not equal one of the keys, report failure.
To obtain a lower bound, we can associate a decision tree with every deterministic algorithm that searches for a key x in an array of n keys. Figure 1 shows a decision tree corresponding to binary search when searching seven keys, and Figure 2 shows a decision
tree corresponding to the sequential search algorithm. In these trees, each large node represents a comparison of an array item with the search key x, and each small node (leaf) contains a result that is reported. When x is in the array, we report an index of the item that equals x, and when x is not in the array, we report an F for failure. In the figures we use S[1] = S1, S[2] = S2, …, S[7] = S7.
Each leaf in a decision tree for searching n keys for a key x represents a point at which the algorithm stops and either reports an index i such that x = Si or reports failure. Every internal node represents a comparison.
[Figure 1: The decision tree corresponding to binary search when searching seven keys. Each internal node is a three-way comparison (<, =, >) of x with one of S1, …, S7; each leaf reports an index 1-7 or F for failure.]
[Figure 2: The decision tree corresponding to sequential search when searching seven keys. The nodes compare x with S1, S2, …, S7 in turn; each leaf reports an index 1-7 or F for failure.]
A decision tree is called valid for searching n keys for a key x if for each possible outcome there is a path from the root to a leaf that reports that outcome; that is, there must be a path for x = Si for each 1 ≤ i ≤ n and a path that leads to failure.
A decision tree is called pruned if every leaf is reachable. Every algorithm that searches for a key x in an array of n keys has a corresponding pruned, valid decision tree.
Lower bounds for worst case behavior
LEMMA 11: If n is the number of nodes in a binary tree and d is its depth, then
d ≥ ⌊lg n⌋.
PROOF: We have
n ≤ 1 + 2 + 2^2 + 2^3 + … + 2^d,
because there can be only one root, at most two nodes with depth 1, at most 2^2 nodes with depth 2, …, and at most 2^d nodes with depth d. Applying the geometric sum,
n ≤ 2^(d+1) - 1,
which means that n < 2^(d+1), so lg n < d + 1 and d > lg n - 1. Because d is an integer, d ≥ ⌊lg n⌋.
LEMMA 12: To be a pruned, valid decision tree for searching n distinct keys for a key x, the binary tree consisting of the comparison nodes must contain at least n nodes.
PROOF: Let Si for i = 1, 2, …, n be the values of the n keys. First we show that every Si must be in at least one comparison node. Suppose that for some i this is not the case. Take two inputs that are identical for all keys except the i-th key and different for the i-th key, and let x have the value of Si in one of the inputs. Because Si is not involved in any comparison and all the other keys are the same in both inputs, the decision tree must behave the same for both inputs. However, it must report i for one of the inputs and must not report i for the other. This contradiction shows that every Si must be in at least one comparison node.
Because every Si must be in at least one comparison node, the only way we could have fewer than n comparison nodes would be to have at least one key Si involved only in comparisons with other keys, that is, one Si that is never compared with x. Suppose we do have such a key.
Take two inputs that are equal everywhere except for Si, with Si being the smallest key in both inputs, and let x be the i-th key in one of the inputs. A comparison node containing Si must branch in the same direction for both inputs, and all other keys are the same in both inputs. Therefore the decision tree must behave the same for the two inputs. However, it must report i for one of them and must not report i for the other. This contradiction proves the lemma.
THEOREM 5: Any deterministic algorithm that searches for a key x in an array of n distinct keys only by comparisons of keys must in the worst case do at least
⌊lg n⌋ + 1 comparisons of keys.
PROOF: Corresponding to the algorithm, there is a pruned, valid decision tree for searching n distinct keys for a key x. The worst case number of comparisons is the number of nodes in the longest path from the root to a leaf in the binary tree consisting of the comparison nodes in that decision tree. This number is the depth of that binary tree plus 1. Lemma 12 says that this binary tree has at least n nodes. Therefore, by Lemma 11, its depth is greater than or equal to ⌊lg n⌋. This proves the theorem.
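This bound is achieved by binary search; the instrumented sketch below (an added Python snippet, not from the original slides) confirms that the worst case number of three-way comparisons is exactly ⌊lg n⌋ + 1:

```python
import math

def binary_search(S, x):
    """Return (index of x in sorted S or -1, number of three-way comparisons)."""
    low, high, comps = 0, len(S) - 1, 0
    while low <= high:
        mid = (low + high) // 2
        comps += 1                       # one comparison of x with S[mid]
        if x == S[mid]:
            return mid, comps
        elif x < S[mid]:
            high = mid - 1
        else:
            low = mid + 1
    return -1, comps

for n in [7, 15, 100]:
    S = list(range(1, n + 1))
    worst = max(binary_search(S, x)[1] for x in S)
    assert worst == math.floor(math.log2(n)) + 1   # Theorem 5's bound, attained
```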
Lower bounds for average case behavior
A binary tree of depth d is called a nearly complete binary tree if it is complete down to depth d-1. Every essentially complete binary tree is nearly complete, but not every nearly complete binary tree is essentially complete.
[Fig(a): an essentially complete binary tree. Fig(b): a nearly complete binary tree that is not essentially complete]
LEMMA 13: The tree consisting of the comparison nodes in the pruned, valid decision tree corresponding to binary search is a nearly complete binary tree.
LEMMA 14: The total node distance (TND) of a binary tree, that is, the sum over all nodes of the number of nodes in the path from the root to that node, is equal to min TND(n) for a tree containing n nodes if and only if the tree is nearly complete.
PROOF: First we show that if a tree's TND equals min TND(n), the tree is nearly complete. Suppose that some binary tree is not nearly complete. Then there must be some node, not at one of the bottom two levels, that has at most one child. We can remove any node A from the bottom level and make it a child of that node. The resulting tree will be a binary tree containing n nodes. The number of nodes in the path to A in that tree will be at least 1 less than the number of nodes in the path to A in
the original tree. The number of nodes in the paths to all other nodes will be the same. Therefore, we have created a binary tree containing n nodes with a TND smaller than that of our original tree, which means that our original tree did not have the minimum TND.
The TND is the same for all nearly complete binary trees containing n nodes; therefore, every such tree must have the minimum TND.
LEMMA 15: Suppose that we are searching n keys, the
search key x is in the array, and all array slots are
equally probable. Then the average case timecomplexity for binary search is given by
min TND(n)/n
PROOF: The proof follows from Lemmas 13 and 14.
LEMMA 16: If we assume that x is in the array and that all array slots are equally probable, the average case time complexity of any deterministic algorithm that searches for a key x in an array of n distinct keys is bounded below by
min TND(n)/n.
PROOF: By Lemma 12, every array item Si must be compared with x at least once in the decision tree corresponding to the algorithm. Let Ci be the number of nodes in the shortest path to a node containing a comparison of Si with x. Because each key has the same probability 1/n of being the search key x, a lower bound on the average case time complexity is given by
C1(1/n) + C2(1/n) + … + Cn(1/n) = (1/n) Σ_{i=1}^{n} Ci,
and
Σ_{i=1}^{n} Ci ≥ min TND(n).
THEOREM 6:
Among deterministic algorithms that search for a key x in an array of n distinct keys only by comparisons of keys, binary search is optimal in its average case performance if we assume that x is in the array and that all array slots are equally probable. Therefore, under these assumptions, any such algorithm must on the average do at least approximately lg n - 1 comparisons of keys.
PROOF:
The proof follows from Lemmas 15 and 16.
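As an added numerical check (Python, not from the original slides): for n = 7, the average number of three-way comparisons binary search does over all successful searches equals min TND(7)/7 = 17/7, where 17 is the TND of the nearly complete binary tree with 7 nodes (depths 0, 1, 1, 2, 2, 2, 2).

```python
def binary_search_comps(S, x):
    """Number of three-way comparisons binary search makes to find x in sorted S."""
    low, high, comps = 0, len(S) - 1, 0
    while low <= high:
        mid = (low + high) // 2
        comps += 1
        if x == S[mid]:
            return comps
        elif x < S[mid]:
            high = mid - 1
        else:
            low = mid + 1
    return comps

n = 7
S = list(range(1, n + 1))
avg = sum(binary_search_comps(S, x) for x in S) / n
min_tnd = 1 + 2 * 2 + 4 * 3      # nearly complete tree, 7 nodes: 1+(2+2)+(3+3+3+3)
assert avg == min_tnd / n        # 17/7, the lower bound of Lemma 16, attained
```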