8/3/2019 aaoa chap 1
Advanced Analysis of Algorithms
Chapter 1
Dr. M. Sikander Hayat Khiyal
Chairperson
Department of Computer Science/Software
Engineering,
Fatima Jinnah Women University,
Rawalpindi, PAKISTAN
EVERY CASE TIME COMPLEXITY
For a given algorithm, T(n) is defined as the number of times the algorithm does its basic operation for every instance of size n. T(n) is called the every-case time complexity of the algorithm, and the determination of T(n) is called an every-case time complexity analysis.
WORST CASE TIME COMPLEXITY
For a given algorithm, W(n) is defined as the maximum number of times the algorithm will ever do its basic operation for an input of size n. W(n) is called the worst-case time complexity of the algorithm, and the determination of W(n) is called a worst-case time complexity analysis. If T(n) exists, then clearly
W(n) = T(n).
AVERAGE CASE TIME COMPLEXITY
For a given algorithm, A(n) is defined as the average (expected) number of times the algorithm does its basic operation for an input of size n. A(n) is called the average-case time complexity of the algorithm, and the determination of A(n) is called an average-case time complexity analysis. If T(n) exists, then clearly
A(n) = T(n).
BEST CASE TIME COMPLEXITY
For a given algorithm, B(n) is defined as the minimum number of times the algorithm will ever do its basic operation for an input of size n. B(n) is called the best-case time complexity of the algorithm, and the determination of B(n) is called a best-case time complexity analysis. If T(n) exists, then clearly
B(n) = T(n).
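These four measures can be made concrete with a small Python sketch (an added illustration, not from the original slides): an instrumented sequential search whose basic operation is a key comparison. For this algorithm B(n) = 1, W(n) = n, and A(n) = (n+1)/2 for a successful search with all slots equally likely; no every-case T(n) exists because the count varies with the input.

```python
def sequential_search(S, x):
    """Return (index of x in S or -1, number of key comparisons done)."""
    comparisons = 0
    for i, key in enumerate(S):
        comparisons += 1          # basic operation: compare x with a key
        if key == x:
            return i, comparisons
    return -1, comparisons

S = [7, 1, 9, 4, 2]
n = len(S)
assert sequential_search(S, 7)[1] == 1          # best case: B(n) = 1
assert sequential_search(S, 2)[1] == n          # worst case: W(n) = n
avg = sum(sequential_search(S, x)[1] for x in S) / n
assert avg == (n + 1) / 2                       # average case: A(n) = (n+1)/2
```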
ANALYSIS SUMMARY OF SORTING ALGORITHMS
Algorithm         | Comparisons of keys              | Assignments of records          | Extra space usage
----------------- | -------------------------------- | ------------------------------- | -----------------
Exchange Sort     | T(n) = n^2/2                     | W(n) = 3n^2/2, A(n) = 3n^2/4    | In place
Insertion Sort    | W(n) = n^2/2, A(n) = n^2/4       | W(n) = n^2/2, A(n) = n^2/4      | In place
Selection Sort    | T(n) = n^2/2                     | T(n) = 3n                       | In place
Merge Sort        | W(n) = n lg n, A(n) = n lg n     | T(n) = 2n lg n                  | Θ(n) records
Merge Sort (D.P.) | W(n) = n lg n, A(n) = n lg n     | T(n) = 0                        | Θ(n) links
Quick Sort        | W(n) = n^2/2, A(n) = 1.38 n lg n | A(n) = 0.69 n lg n              | Θ(lg n) indices
Heap Sort         | W(n) = 2n lg n, A(n) = 2n lg n   | W(n) = n lg n, A(n) = n lg n    | In place
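To see where a count like W(n) = n^2/2 in the table comes from, here is a small instrumented insertion sort (an added Python sketch, not from the original slides); a reverse-ordered input forces the maximum n(n-1)/2 key comparisons:

```python
def insertion_sort(a):
    """Sort a copy of a; return (sorted list, number of key comparisons)."""
    a = list(a)
    comps = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0:
            comps += 1              # basic operation: compare key with a[j]
            if a[j] > key:
                a[j + 1] = a[j]     # shift the larger record right
                j -= 1
            else:
                break
        a[j + 1] = key
    return a, comps

n = 10
_, worst = insertion_sort(list(range(n, 0, -1)))   # worst case: reverse order
assert worst == n * (n - 1) // 2                   # 45 comparisons, about n^2/2
```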
COMPUTATIONAL COMPLEXITY
Computational complexity is the study of all possible algorithms that can solve a given problem. A computational complexity analysis tries to determine a lower bound on the efficiency of all algorithms for a given problem. It is a field that goes hand in hand with algorithm design and analysis.
LOWER BOUND FOR SORTING ALGORITHMS
Insertion sort:
In general, we are concerned with sorting n distinct keys that come from any ordered set. However, without loss of generality, we can assume that the keys to be sorted are simply the positive integers 1, 2, …, n, because we can substitute 1 for the smallest key, 2 for the second smallest key, and so on. For example, the alphabetic input [Ralph, Clyde, Dave] corresponds to [3, 1, 2] if we associate 1 with Clyde, 2 with Dave, and 3 with Ralph. Any algorithm that sorts these integers only by comparisons of keys would have to do the same number of comparisons to sort the three names.
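The rank substitution described above can be sketched in a few lines of Python (an added illustration, not from the original slides):

```python
names = ["Ralph", "Clyde", "Dave"]
# replace each key by its rank: 1 for the smallest, 2 for the second smallest, ...
ranks = {name: r for r, name in enumerate(sorted(names), start=1)}
assert [ranks[name] for name in names] == [3, 1, 2]   # Clyde=1, Dave=2, Ralph=3
```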
A permutation of the first n positive integers can be thought of as an ordering of these integers; because there are n! permutations of the first n positive integers, there are n! different orderings.
For example, for the first three positive integers there are six permutations (orderings):
[1,2,3] [1,3,2] [2,1,3] [2,3,1] [3,1,2] [3,2,1]
This means that there are n! different inputs containing n distinct keys; the six permutations above are the different inputs of size 3. A permutation is denoted [k1, k2, …, kn].
An inversion in a permutation is a pair (ki, kj) such that i < j and ki > kj.
For example, the permutation [3,2,4,1,6,5] contains the inversions (3,2), (3,1), (2,1), (4,1), and (6,5). Clearly a permutation contains no inversions if and only if it is the sorted ordering [1, 2, …, n]. This means that the task of sorting n distinct keys is the removal of all inversions in a permutation.
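The inversion definition is easy to state in code; this added Python sketch (not from the original slides) reproduces the example above:

```python
def inversions(perm):
    """List all inversion pairs (perm[i], perm[j]) with i < j and perm[i] > perm[j]."""
    return [(perm[i], perm[j])
            for i in range(len(perm))
            for j in range(i + 1, len(perm))
            if perm[i] > perm[j]]

assert inversions([3, 2, 4, 1, 6, 5]) == [(3, 2), (3, 1), (2, 1), (4, 1), (6, 5)]
assert inversions([1, 2, 3, 4, 5, 6]) == []   # sorted order has no inversions
```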
THEOREM 1:
Any algorithm that sorts n distinct keys only by comparisons of keys and removes at most one inversion after each comparison must in the worst case do at least
n(n-1)/2 comparisons of keys
and on the average do at least
n(n-1)/4 comparisons of keys.
PROOF:
For the worst case, the input [n, n-1, …, 2, 1] contains n(n-1)/2 inversions (an arithmetic sum), so at least that many comparisons are required to remove them all.
For the average case, pair each permutation [kn, …, k2, k1] with the permutation [k1, k2, …, kn]; this reversal is called the transpose of the original permutation. Let r and s be integers between 1 and n such that s > r.
Given a permutation and its transpose, each pair (s, r) is an inversion in either the permutation or its transpose, but not in both. There are n(n-1)/2 such pairs of integers between 1 and n, so a permutation and its transpose together contain n(n-1)/2 inversions, and the average number of inversions in a permutation and its transpose is
(1/2)(n(n-1)/2) = n(n-1)/4.
Therefore, if we consider all permutations equally probable for the input, the average number of inversions in the input is also n(n-1)/4. Because we assumed that the algorithm removes at most one inversion after each comparison, on the average it must do at least this many comparisons to remove all inversions and thereby sort the input.
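The average-case count can be verified exhaustively for small n (an added Python sketch, not from the original slides): over all permutations of {1, …, 4}, the average number of inversions is exactly n(n-1)/4 = 3, and the reverse ordering attains the worst case n(n-1)/2 = 6.

```python
from itertools import permutations

def count_inversions(p):
    """Number of pairs i < j with p[i] > p[j]."""
    return sum(1 for i in range(len(p))
                 for j in range(i + 1, len(p))
                 if p[i] > p[j])

n = 4
perms = list(permutations(range(1, n + 1)))
avg = sum(count_inversions(p) for p in perms) / len(perms)
assert count_inversions((4, 3, 2, 1)) == n * (n - 1) // 2   # worst case: 6
assert avg == n * (n - 1) / 4                               # average: 3.0
```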
Lower bounds for sorting only by comparison of keys
Decision trees for sorting algorithms:
We can associate a binary tree with a procedure that sorts three keys a, b, and c by placing the comparison of a and b at the root; the left subtree handles the case a < b and the right subtree the case a > b, with the remaining comparisons placed in the same way.
This tree is called a decision tree, because at each node a decision must be made as to which node to visit next.
A decision tree is called valid for sorting n keys if, for each permutation of the n keys, there is a path from the root to a leaf that sorts that permutation; that is, it can sort every input of size n. The above tree is valid, but it would no longer be valid if we removed any branch from the tree.
Now draw a decision tree for exchange sort when sorting three keys.
[Figure: decision tree for exchange sort when sorting three keys]
A decision tree is called pruned if every leaf can be reached from the root by making a consistent sequence of decisions. Note that for exchange sort some comparisons in the unpruned tree have outcomes that are already determined by earlier comparisons on the path; the corresponding branches can never be reached and are removed in the pruned tree.
LEMMA 1:
To every deterministic algorithm for sorting n distinct keys there corresponds a pruned, valid, binary decision tree containing exactly n! leaves.
PROOF: There is a pruned, valid decision tree corresponding to any algorithm for sorting n keys. When all the keys are distinct, the result of a comparison is always < or >. Therefore, each node in that tree has at most two children, which means that it is a binary tree.
Because there are n! different inputs that contain n distinct keys, and because a decision tree is valid for sorting n distinct keys only if it has a leaf for every input, the tree has at least n! leaves. Because there is a unique path in the tree for each of the n! different inputs, and because every leaf in a pruned decision tree must be reachable, the tree can have no more than n! leaves. Therefore, the tree has exactly n! leaves.
LOWER BOUND FOR WORST CASE BEHAVIOUR
LEMMA 2: The worst case number of comparisons done by a decision tree is equal to its depth.
PROOF: Given some input, the number of comparisons done by a decision tree is the number of internal nodes on the path followed for that input, and the number of internal nodes on that path is the same as the length of the path. Therefore, the worst case number of comparisons done by a decision tree is the length of the longest path to a leaf, which is the depth of the decision tree.
LEMMA 3: If m is the number of leaves in a binary tree and d is the depth, then d ≥ ⌈lg m⌉.
PROOF: By induction, we show that 2^d ≥ m.
Induction base: A binary tree with depth 0 has one node that is both the root and the only leaf. Therefore, for such a tree, the number of leaves m equals 1, and 2^0 ≥ 1.
Induction hypothesis: Assume that for every binary tree with depth d, 2^d ≥ m, where m is the number of leaves.
Induction step: We need to show that, for a binary tree with depth d+1,
2^(d+1) ≥ m,
where m is the number of leaves.
If we remove all the leaves from such a tree, we have a tree with depth d whose leaves are the parents of the leaves in the original tree. If m′ is the number of these parents, then by the induction hypothesis
2^d ≥ m′.
Because each parent can have at most two children, m ≤ 2m′. Thus
2^(d+1) ≥ 2m′ ≥ m. This completes the induction.
Now taking lg of 2^d ≥ m gives d ≥ lg m; because d is an integer, d ≥ ⌈lg m⌉.
THEOREM 2: Any deterministic algorithm that sorts n distinct keys only by comparisons of keys must in the worst case do at least ⌈lg(n!)⌉ comparisons of keys.
PROOF: By Lemma 1, to the algorithm there corresponds a pruned, valid binary decision tree with n! leaves; by Lemma 3, the depth of that tree is at least ⌈lg(n!)⌉. The theorem then follows from Lemma 2, because the worst case number of comparisons done by a decision tree equals its depth.
LEMMA 4: For any positive integer n,
lg(n!) ≥ n lg n - 1.45n.
PROOF: lg(n!) = lg[n(n-1)(n-2)⋯(2)(1)] = Σ_{i=2}^{n} lg i
Since lg 1 = 0,
lg(n!) = Σ_{i=2}^{n} lg i ≥ ∫_1^n lg x dx = (1/ln 2)[n ln n - n + 1] ≥ n lg n - 1.45n. Proof completed.
(The constant comes from 1/ln 2 ≈ 1.443 < 1.45.)
THEOREM 3: Any deterministic algorithm that sorts n distinct keys only by comparisons of keys must in the worst case do at least
n lg n - 1.45n comparisons of keys.
PROOF: By Theorem 2 and Lemma 4.
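These bounds are easy to check numerically; the sketch below (an added Python snippet, not from the original slides) compares the exact bound ⌈lg(n!)⌉ of Theorem 2 with the approximation of Theorem 3:

```python
import math

for n in [2, 8, 32, 128]:
    exact = math.ceil(math.log2(math.factorial(n)))   # Theorem 2 bound
    approx = n * math.log2(n) - 1.45 * n              # Theorem 3 bound
    assert exact >= approx   # the approximation never exceeds the exact bound
```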
LOWER BOUNDS FOR AVERAGE CASE BEHAVIOUR
Definition: A binary tree in which every nonleaf contains exactly two children is called a 2-tree.
LEMMA 5: To every pruned, valid binary decision tree for sorting n distinct keys there corresponds a pruned, valid decision 2-tree that is at least as efficient as the original tree.
PROOF: If the pruned, valid binary decision tree corresponding to a deterministic algorithm for sorting n distinct keys contains any comparison nodes with only one child, we can replace each such node by its child and prune the child to obtain a decision tree that sorts using no more comparisons than the original (Fig. 2). Every nonleaf in the new tree will contain exactly two children.
Definition: The external path length (EPL) of a tree is the total length of all paths from the root to the leaves. For the tree of Figure 1,
EPL = 2+3+3+3+3+2 = 16.
The EPL of a pruned, valid decision tree is the total number of comparisons done over all n! different inputs of size n; therefore, the average number of comparisons is EPL/n!.
LEMMA 6: Any deterministic algorithm that sorts n distinct keys only by comparisons of keys must on the average do at least
(min EPL(n!)) / n!
comparisons of keys, where min EPL(m) denotes the minimum EPL over all 2-trees with m leaves.
PROOF: By Lemma 1, to every deterministic algorithm for sorting n distinct keys there corresponds a pruned, valid binary decision tree containing n! leaves. By Lemma 5, we can convert that binary tree to a 2-tree that is at least as efficient; because the original tree has n! leaves, so must the 2-tree we obtain from it. Its EPL is therefore at least min EPL(n!), and the average number of comparisons is at least min EPL(n!)/n!. Hence proved.
LEMMA 7: Any 2-tree that has m leaves and whose EPL equals min EPL(m) must have all of its leaves on at most the bottom two levels.
PROOF: Suppose that some 2-tree does not have all its leaves on the bottom two levels. Let d be the depth of the tree, let A be a leaf in the tree that is not on one of the bottom two levels, and let k be the depth of A.
Because nodes at the bottom level have depth d, k ≤ d-2.
We show that this tree cannot minimize the EPL among trees with the same number of leaves by constructing a 2-tree with the same number of leaves and a smaller EPL. Choose a nonleaf B at level d-1 in the original tree, remove its two children, and give two children to A. Clearly the new tree has the same number of leaves as the original tree.
[Figure: the original 2-tree with m leaves, showing leaf A at level k and nonleaf B at level d-1 with two children at level d; and the new 2-tree with m leaves and decreased EPL, in which A has two children at level k+1 and B is a leaf at level d-1]
In the new tree, neither A nor the children of B are leaves, but they are leaves in the old tree. Therefore, we have decreased the EPL by the length of the path to A and by the lengths of the two paths to B's children, that is, by
k + d + d = k + 2d.
In the new tree, B and the two new children of A are leaves, but they are not leaves in the old tree. Therefore, we have increased the EPL by the length of the path to B and the lengths of the two paths to A's new children, that is, by
(d-1) + (k+1) + (k+1) = d + 2k + 1.
Because k ≤ d-2, the net change in the EPL is
(d+2k+1) - (k+2d) = k - d + 1 ≤ (d-2) - d + 1 = -1.
Since the net change in the EPL is negative, the new tree has a smaller EPL. Thus the old tree cannot minimize the EPL among trees with the same number of leaves.
LEMMA 8: Any 2-tree that has m leaves and whose EPL equals min EPL(m) must have
2^d - m leaves at level d-1 and
2m - 2^d leaves at level d,
and have no other leaves, where d is the depth of the tree.
PROOF: By Lemma 7, all leaves are at the bottom two levels. Because nonleaves in a 2-tree have exactly two children, there must be 2^(d-1) nodes at level d-1. Therefore, if r is the number of leaves at level d-1, the number of nonleaves at that level is 2^(d-1) - r. Because nonleaves in a 2-tree have exactly two children, for every nonleaf at level d-1 there are two leaves at level d. Because there are only leaves at
level d, the number of leaves at level d equals 2(2^(d-1) - r). Because Lemma 7 says that all leaves are at level d or d-1,
r + 2(2^(d-1) - r) = m,
which gives
r = 2^d - m leaves at level d-1.
Therefore, the number of leaves at level d is
m - r = m - (2^d - m) = 2m - 2^d.
LEMMA 9: For any 2-tree that has m leaves and whose EPL equals min EPL(m), the depth d is given by d = ⌈lg m⌉.
PROOF: Consider first the case that m is a power of 2, so that m = 2^k for some integer k. Let d be the depth of a minimizing tree. By Lemma 8, the number r of leaves at level d-1 is
r = 2^d - m = 2^d - 2^k.
Because r ≥ 0, we must have d ≥ k. Assuming that d > k leads to a contradiction:
if d > k, then
r = 2^d - 2^k ≥ 2^(k+1) - 2^k = 2^k(2-1) = 2^k = m.
Because r ≤ m, this means r = m, and all leaves are at level d-1. But there must be some leaves at level d. This contradiction implies that d = k, which means r = 0;
thus,
2^d - m = 0, so 2^d = m and d = lg m.
Because ⌈lg m⌉ = lg m when m is a power of 2, d = ⌈lg m⌉. (A similar argument handles the case in which m is not a power of 2.) Proof completed.
LEMMA 10:
For all integers m ≥ 1, min EPL(m) ≥ m ⌊lg m⌋.
PROOF: By Lemma 8, any 2-tree that minimizes the EPL must have 2^d - m leaves at level d-1, have 2m - 2^d leaves at level d, and have no other leaves. Therefore we have
min EPL(m) = (2^d - m)(d-1) + (2m - 2^d)d = md + m - 2^d.
By Lemma 9, d = ⌈lg m⌉, so
min EPL(m) = m⌈lg m⌉ + m - 2^⌈lg m⌉.
If m is a power of 2, then ⌈lg m⌉ = ⌊lg m⌋ and 2^⌈lg m⌉ = m, so min EPL(m) = m⌊lg m⌋.
If m is not a power of 2, then ⌈lg m⌉ = ⌊lg m⌋ + 1, so
min EPL(m) = m(⌊lg m⌋ + 1) + m - 2^⌈lg m⌉ = m⌊lg m⌋ + 2m - 2^⌈lg m⌉ > m⌊lg m⌋,
because 2m > 2^⌈lg m⌉.
Therefore min EPL(m) ≥ m⌊lg m⌋.
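The closed form for min EPL(m) derived in Lemmas 8-10 can be checked by brute force over all 2-tree shapes (an added Python sketch, not from the original slides). The recurrence relies on the fact that splitting m leaves between the two subtrees of the root lengthens every root-to-leaf path by one edge, adding m to the EPL.

```python
from functools import lru_cache
import math

@lru_cache(maxsize=None)
def min_epl(m):
    """Minimum external path length over all 2-trees with m leaves."""
    if m == 1:
        return 0
    # putting k leaves in the left subtree and m-k in the right
    # adds one edge to every root-to-leaf path, i.e. adds m to the EPL
    return min(min_epl(k) + min_epl(m - k) + m for k in range(1, m))

for m in range(1, 65):
    d = math.ceil(math.log2(m))
    assert min_epl(m) == m * d + m - 2**d               # closed form, Lemmas 8-9
    assert min_epl(m) >= m * math.floor(math.log2(m))   # Lemma 10
```

Note that min_epl(6) = 16, matching the EPL = 16 example given earlier for a 2-tree with 3! = 6 leaves.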
THEOREM 4: Any deterministic algorithm that sorts n distinct keys only by comparisons of keys must on the average do at least
⌊n lg n - 1.45n⌋ comparisons of keys.
PROOF: By Lemma 6, any such algorithm must on the average do at least
min EPL(n!) / n! comparisons of keys.
By Lemma 10, this quantity is greater than or equal to
n! ⌊lg(n!)⌋ / n! = ⌊lg(n!)⌋.
By Lemma 4, ⌊lg(n!)⌋ ≥ ⌊n lg n - 1.45n⌋. Proved.
Lower bounds for searching only by comparison of keys
The problem of searching for a key can be described as follows: given an array S containing n keys and a key x, find an index i such that x = S[i] if x equals one of the keys; if x does not equal one of the keys, report failure.
To obtain a lower bound, we can associate a decision tree with every deterministic algorithm that searches for a key x in an array of n keys. Figure 1 shows a decision tree corresponding to binary search when searching seven keys, and Figure 2 shows a decision
tree corresponding to the sequential search algorithm. In these trees, each large node represents a comparison of an array item with the search key x, and each small node (leaf) contains a result that is reported. When x is in the array, we report an index of the item that equals x, and when x is not in the array, we report an F for failure. In the figures we use S[1] = S1, S[2] = S2, …, S[7] = S7.
Each leaf in a decision tree for searching n keys for a key x represents a point at which the algorithm stops and either reports an index i such that x = Si or reports failure. Every internal node represents a comparison.
[Figure 1: The decision tree corresponding to binary search when searching seven keys. Each internal node is a three-way comparison (<, =, >) of x with one of S1, …, S7; each leaf reports an index 1-7 or F for failure.]
[Figure 2: The decision tree corresponding to sequential search when searching seven keys. The nodes compare x with S1, S2, …, S7 in turn; each leaf reports an index 1-7 or F for failure.]
A decision tree is called valid for searching n keys for a key x if for each possible outcome there is a path from the root to a leaf that reports that outcome; that is, there must be a path for x = Si for each 1 ≤ i ≤ n and a path that leads to failure.
A decision tree is called pruned if every leaf is reachable. Every algorithm that searches for a key x in an array of n keys has a corresponding pruned, valid decision tree.
Lower bounds for worst case behavior
LEMMA 11: If n is the number of nodes in a binary tree and d is its depth, then
d ≥ ⌊lg n⌋.
PROOF: We have
n ≤ 1 + 2 + 2^2 + 2^3 + … + 2^d,
because there can be only one root, at most two nodes with depth 1, at most 2^2 nodes with depth 2, …, and at most 2^d nodes with depth d. Applying the geometric sum,
n ≤ 2^(d+1) - 1,
which means that n < 2^(d+1), so lg n < d + 1 and d > lg n - 1. Because d is an integer, d ≥ ⌊lg n⌋.
LEMMA 12: To be a pruned, valid decision tree for searching n distinct keys for a key x, the binary tree consisting of the comparison nodes must contain at least n nodes.
PROOF: Let Si for i = 1, 2, …, n be the values of the n keys. First we show that every Si must be in at least one comparison node. Suppose that for some i this is not the case. Take two inputs that are identical for all keys except the i-th key and different for the i-th key, and let x have the value of Si in one of the inputs. Because Si is not involved in any comparison and all the other keys are the same in both inputs, the decision tree must behave the same for both inputs. However, it must report i for one of the inputs and must not report i for the other. This contradiction shows that every Si must be in at least one comparison node.
Because every Si must be in at least one comparison node, the only way we could have fewer than n comparison nodes would be to have at least one key Si involved only in comparisons with other keys, that is, one Si that is never compared with x. Suppose we do have such a key.
Take two inputs that are equal everywhere except for Si, with Si being the smallest key in both inputs, and let x be the i-th key in one of the inputs. A comparison node containing Si must branch in the same direction for both inputs, and all other keys are the same in both inputs. Therefore the decision tree must behave the same for the two inputs. However, it must report i for one of them and must not report i for the other. This contradiction proves the lemma.
THEOREM 5: Any deterministic algorithm that searches for a key x in an array of n distinct keys only by comparisons of keys must in the worst case do at least
⌊lg n⌋ + 1 comparisons of keys.
PROOF: Corresponding to the algorithm, there is a pruned, valid decision tree for searching n distinct keys for a key x. The worst case number of comparisons is the number of nodes in the longest path from the root to a leaf in the binary tree consisting of the comparison nodes in that decision tree. This number is the depth of that binary tree plus 1. Lemma 12 says that this binary tree has at least n nodes. Therefore, by Lemma 11, its depth is greater than or equal to ⌊lg n⌋. This proves the theorem.
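This bound is achieved by binary search; the instrumented sketch below (an added Python snippet, not from the original slides) confirms that the worst case number of three-way comparisons is exactly ⌊lg n⌋ + 1:

```python
import math

def binary_search(S, x):
    """Return (index of x in sorted S or -1, number of three-way comparisons)."""
    low, high, comps = 0, len(S) - 1, 0
    while low <= high:
        mid = (low + high) // 2
        comps += 1                       # one comparison of x with S[mid]
        if x == S[mid]:
            return mid, comps
        elif x < S[mid]:
            high = mid - 1
        else:
            low = mid + 1
    return -1, comps

for n in [7, 15, 100]:
    S = list(range(1, n + 1))
    worst = max(binary_search(S, x)[1] for x in S)
    assert worst == math.floor(math.log2(n)) + 1   # Theorem 5's bound, attained
```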
Lower bounds for average case behavior
A binary tree of depth d is called a nearly complete binary tree if it is complete down to depth d-1. Every essentially complete binary tree is nearly complete, but not every nearly complete binary tree is essentially complete.
[Fig(a): an essentially complete binary tree. Fig(b): a nearly complete binary tree that is not essentially complete]
LEMMA 13: The tree consisting of the comparison nodes in the pruned, valid decision tree corresponding to binary search is a nearly complete binary tree.
LEMMA 14: The total node distance (TND) of a binary tree, that is, the sum over all nodes of the number of nodes in the path from the root to that node, is equal to min TND(n) for a tree containing n nodes if and only if the tree is nearly complete.
PROOF: First we show that if a tree's TND equals min TND(n), the tree is nearly complete. Suppose that some binary tree is not nearly complete. Then there must be some node, not at one of the bottom two levels, that has at most one child. We can remove any node A from the bottom level and make it a child of that node. The resulting tree will be a binary tree containing n nodes. The number of nodes in the path to A in that tree will be at least 1 less than the number of nodes in the path to A in
the original tree. The number of nodes in the paths to all other nodes will be the same. Therefore, we have created a binary tree containing n nodes with a TND smaller than that of our original tree, which means that our original tree did not have the minimum TND.
The TND is the same for all nearly complete binary trees containing n nodes; therefore, every such tree must have the minimum TND.
LEMMA 15: Suppose that we are searching n keys, the
search key x is in the array, and all array slots are
equally probable. Then the average case timecomplexity for binary search is given by
min TND(n)/n
PROOF: The proof follows from Lemmas 13 and 14.
LEMMA 16: If we assume that x is in the array and that all array slots are equally probable, the average case time complexity of any deterministic algorithm that searches for a key x in an array of n distinct keys is bounded below by
min TND(n)/n.
PROOF: By Lemma 12, every array item Si must be compared with x at least once in the decision tree corresponding to the algorithm. Let Ci be the number of nodes in the shortest path to a node containing a comparison of Si with x. Because each key has the same probability 1/n of being the search key x, a lower bound on the average case time complexity is given by
C1(1/n) + C2(1/n) + … + Cn(1/n) = (1/n) Σ_{i=1}^{n} Ci,
and
Σ_{i=1}^{n} Ci ≥ min TND(n).
THEOREM 6:
Among deterministic algorithms that search for a key x in an array of n distinct keys only by comparisons of keys, binary search is optimal in its average case performance if we assume that x is in the array and that all array slots are equally probable. Therefore, under these assumptions, any such algorithm must on the average do at least approximately lg n - 1 comparisons of keys.
PROOF:
The proof follows from Lemmas 15 and 16.
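As an added numerical check (Python, not from the original slides): for n = 7, the average number of three-way comparisons binary search does over all successful searches equals min TND(7)/7 = 17/7, where 17 is the TND of the nearly complete binary tree with 7 nodes (depths 0, 1, 1, 2, 2, 2, 2).

```python
def binary_search_comps(S, x):
    """Number of three-way comparisons binary search makes to find x in sorted S."""
    low, high, comps = 0, len(S) - 1, 0
    while low <= high:
        mid = (low + high) // 2
        comps += 1
        if x == S[mid]:
            return comps
        elif x < S[mid]:
            high = mid - 1
        else:
            low = mid + 1
    return comps

n = 7
S = list(range(1, n + 1))
avg = sum(binary_search_comps(S, x) for x in S) / n
min_tnd = 1 + 2 * 2 + 4 * 3      # nearly complete tree, 7 nodes: 1+(2+2)+(3+3+3+3)
assert avg == min_tnd / n        # 17/7, the lower bound of Lemma 16, attained
```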