Data Structures and Algorithms (CS210/ESO207/ESO211), Lecture 31: Average running time of QuickSort

Lecture-31-CS210-2012.pptx


Page 1

Data Structures and Algorithms (CS210/ESO207/ESO211)

Lecture 31

• Average running time of Quick sort


Page 2

Overview of this lecture

Main objective:
• Analyzing the average time complexity of QuickSort using a recurrence:
  – using mathematical induction,
  – solving the recurrence exactly.

• The outcome of this analysis will be quite surprising!

Extra benefits:
• You will learn a standard way of using mathematical induction to bound the time complexity of an algorithm. You must try to internalize it.


Page 3

QuickSort


Page 4

Pseudocode for QuickSort(S)

QuickSort(S)
{   if (|S| > 1)
        Pick and remove an element x from S;
        (S1, S2) ← Partition(S, x);
        return Concatenate(QuickSort(S1), x, QuickSort(S2))
}

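The pseudocode above can be rendered as a short Python sketch (the function name and list-comprehension style are my own; like the lecture, it assumes all elements are distinct):

```python
def quicksort(s):
    # Base case: a list with at most one element is already sorted.
    if len(s) <= 1:
        return list(s)
    x, rest = s[0], s[1:]                # pick and remove the pivot x
    s1 = [e for e in rest if e < x]      # elements smaller than x
    s2 = [e for e in rest if e > x]      # elements greater than x
    # Concatenate(QuickSort(s1), x, QuickSort(s2))
    return quicksort(s1) + [x] + quicksort(s2)
```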

Page 5

Pseudocode for QuickSort(A, l, r), when the input is stored in an array

QuickSort(A, l, r)
{   if (l < r)
        i ← Partition(A, l, r);
        QuickSort(A, l, i − 1);
        QuickSort(A, i + 1, r)
}

Partition selects an element x from A[l..r] as a pivot element, and permutes the subarray A[l..r] such that the elements preceding x are smaller than x and the elements succeeding x are greater than x. It achieves this task in O(r − l) time using only O(1) extra space.

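A minimal in-place Python sketch of this array version, using a Lomuto-style partition with the first element as pivot (one common way to meet the O(r − l) time and O(1) space requirement stated above; the exact Partition used in the lecture may differ):

```python
def partition(a, l, r):
    # Pivot is the first element of the subarray a[l..r].
    x = a[l]
    i = l                          # a[l+1..i] holds elements < x
    for j in range(l + 1, r + 1):  # each iteration costs one comparison
        if a[j] < x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[l], a[i] = a[i], a[l]        # move pivot to its final position i
    return i

def quicksort(a, l, r):
    if l < r:
        i = partition(a, l, r)
        quicksort(a, l, i - 1)
        quicksort(a, i + 1, r)
```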

Page 6

Example: Partition(A, 0, 8). Without loss of generality, the first element is selected as the pivot element.

Question: What happens after Partition(A, 0, 8)?

Figure: an array A[0..8] with pivot x = A[0]; after Partition, the elements < x precede x and the elements > x follow it.

Page 7

Example: Partition(A, 0, 8), continued.

Figure: the array A[0..8] after Partition: the elements < x occupy the prefix, x is in its final position, and the elements > x occupy the suffix.

Page 8

Analyzing average time complexity of QuickSort

Part 1: Deriving the recurrence

Page 9

Analyzing average time complexity of QuickSort

Assumptions (for a neat analysis):
• All elements are distinct.
• Each recursive call selects the first element as the pivot element.

Let e_i : the i-th smallest element of S.
Observation: the running time of QuickSort depends upon the permutation of the e_i's and not on the values taken by the e_i's.

A(n) : average running time of QuickSort on an input of size n.
Question: average over what?
Answer: average over all n! possible permutations of {e_1, e_2, …, e_n}.
Hence A(n) = (1/n!) Σ_π Q(π), where Q(π) is the time complexity (or number of comparisons) when the input is the permutation π.

Calculating A(n) from the definition/scratch is impractical, if not impossible.
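Still, for tiny n the definition can be checked by brute force: average the comparison count over all n! permutations (a sketch; it charges n − 1 comparisons per Partition call, first element as pivot):

```python
from itertools import permutations

def comparisons(s):
    # Comparisons made by QuickSort on s with first-element pivot:
    # Partition costs len(s) - 1, plus the two recursive calls.
    if len(s) <= 1:
        return 0
    x = s[0]
    smaller = [e for e in s[1:] if e < x]
    larger = [e for e in s[1:] if e > x]
    return (len(s) - 1) + comparisons(smaller) + comparisons(larger)

def average_comparisons(n):
    # A(n): comparison count averaged over all n! permutations of 1..n.
    perms = list(permutations(range(1, n + 1)))
    return sum(comparisons(list(p)) for p in perms) / len(perms)
```

This is only feasible for very small n (n! grows too fast), which is exactly why the lecture turns to a recurrence.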

Page 10


Analyzing average time complexity of QuickSort

Let P(i) be the set of all those permutations of {e_1, e_2, …, e_n} that begin with e_i.
Question: What fraction of all permutations constitutes P(i)?
Answer: 1/n.
Let A_i(n) be the average running time of QuickSort over P(i).

Question: What is the relation between A(n) and the A_i(n)'s?
Answer: A(n) = (1/n) Σ_{i=1}^{n} A_i(n).
Observation: We now need to derive an expression for A_i(n). For this purpose, we need to have a closer look at the execution of QuickSort over P(i).

Figure: the set of all permutations of {e_1, …, e_n} partitioned into the classes P(1), …, P(n), where P(i) contains the permutations beginning with e_i.

Page 11

Analyzing average time complexity of QuickSort

Question: How does QuickSort proceed on a permutation from P(i)?
Answer: The procedure Partition() permutes the array so that some permutation of e_1, …, e_{i−1} precedes e_i and some permutation of e_{i+1}, …, e_n follows it.

Observation: For any given implementation of Partition() and a permutation from P(i), the resulting permutation is well-defined. Let S(i) be the set of permutations that are outcomes of executing Partition() on P(i).

Inference: Partition() can be viewed as a mapping from the set P(i) to S(i).
Question: What kind of mapping from P(i) to S(i) is defined by Partition() in this way?
Lemma 1: The mapping defined by Partition() is a many-to-one uniform mapping from P(i) to S(i). In particular, exactly (n−1)!/((i−1)! ⨯ (n−i)!) permutations from P(i) get mapped to a single permutation in S(i).

Figure: the array after Partition on a permutation from P(i): a permutation of e_1, …, e_{i−1}, then e_i, then a permutation of e_{i+1}, …, e_n.
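Lemma 1 can be verified empirically for small n. The sketch below models one particular Partition (a stable one that preserves the relative order of the smaller and of the larger elements; the lemma holds for any fixed implementation) and counts how many permutations of P(i) map to each outcome:

```python
from itertools import permutations
from collections import Counter

def partition_outcome(p):
    # One fixed Partition: pivot p[0]; smaller elements keep their
    # relative order, then the pivot, then the larger elements.
    x = p[0]
    return (tuple(e for e in p if e < x) + (x,)
            + tuple(e for e in p if e > x))

def mapping_multiplicities(n, i):
    # Multiplicities of the map from P(i) (permutations of {1..n}
    # beginning with i) to S(i); Lemma 1 predicts the single value
    # (n-1)! / ((i-1)! * (n-i)!).
    counts = Counter(partition_outcome(p)
                     for p in permutations(range(1, n + 1))
                     if p[0] == i)
    return set(counts.values())
```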

Page 12

Analyzing average time complexity of QuickSort

It follows from Lemma 1 that after Partition() on P(i), each permutation of e_1, …, e_{i−1} occurs with the same (uniform) frequency in A[0..i−2], and each permutation of e_{i+1}, …, e_n occurs with the same (uniform) frequency in A[i..n−1].

Hence we can express A_i(n) (the time complexity of QuickSort averaged over P(i)) as:
A_i(n) = (n − 1) + A(i − 1) + A(n − i) ----(1)
We showed previously that:
A(n) = (1/n) Σ_{i=1}^{n} A_i(n) ----(2)

Question: Can you express A(n) recursively using (1) and (2)?
A(n) = (n − 1) + (1/n) Σ_{i=1}^{n} (A(i − 1) + A(n − i))
     = (n − 1) + (2/n) Σ_{i=0}^{n−1} A(i)
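This recurrence is easy to evaluate numerically (a sketch; for n = 2 it gives 1, and for n = 3 it gives 8/3, which matches a direct average over all 6 permutations of three distinct elements):

```python
def A(n):
    # A(n) = (n - 1) + (2/n) * sum(A(i) for i in 0..n-1),
    # with A(0) = A(1) = 0.
    vals = [0.0, 0.0]
    for m in range(2, n + 1):
        vals.append((m - 1) + 2.0 * sum(vals) / m)
    return vals[n]
```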

Page 13

Analyzing average time complexity of QuickSort

Part 2: Solving the recurrence through mathematical induction

Page 14

A(1) = 0, A(n) = (n − 1) + (2/n) Σ_{i=0}^{n−1} A(i)

Assertion A(n): A(n) ≤ 2 n ln n for all n ≥ 1.
Base case: holds for n = 1, since A(1) = 0.
Induction step: assuming the assertion holds for all m < n, we have to prove it for n.
A(n) ≤ (n − 1) + (2/n) Σ_{i=1}^{n−1} 2 i ln i
     ≤ (n − 1) + (4/n) ∫_1^n x ln x dx          (x ln x is increasing)
     = (n − 1) + (4/n) (n² ln n / 2 − n²/4 + 1/4)
     = (n − 1) + 2 n ln n − n + 1/n
     = 2 n ln n − 1 + 1/n
     ≤ 2 n ln n for n ≥ 1.

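A quick numerical sanity check of the bound (a sketch; it evaluates the recurrence directly and confirms A(n) ≤ 2 n ln n over a range of n):

```python
import math

def A(n):
    # A(n) = (n - 1) + (2/n) * sum(A(i) for i in 0..n-1); A(0) = A(1) = 0.
    vals = [0.0, 0.0]
    for m in range(2, n + 1):
        vals.append((m - 1) + 2.0 * sum(vals) / m)
    return vals[n]

# The induction's bound should hold for every n >= 2.
bound_holds = all(A(n) <= 2 * n * math.log(n) for n in range(2, 200))
```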

Page 15

Analyzing average time complexity of QuickSort

Part 3: Solving the recurrence exactly

Page 16

Some elementary tools

Question: How do we approximate the harmonic number H_n = 1 + 1/2 + 1/3 + ⋯ + 1/n?
Answer: H_n → ln n + γ as n increases, where γ is Euler's constant, γ ≈ 0.577 (≈ 0.58).
Hint: compare the areas of the bars of heights 1, 1/2, 1/3, … with the area under the curve f(x) = 1/x.

We shall calculate the average number of comparisons during QuickSort using:
• our knowledge of solving recurrences by substitution,
• our knowledge of solving recurrences by unfolding,
• our knowledge of simplifying a partial fraction (from JEE days).
Students should try to internalize the way the above tools are used.


Figure: bars of heights 1, 1/2, 1/3, 1/4, 1/5, 1/6, … overlaid on the curve f(x) = 1/x; relating the total bar area to the integral of 1/x gives H_n ≈ ln n + γ.
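The approximation is easy to check numerically (γ = 0.5772…, which the slide rounds to 0.58):

```python
import math

def harmonic(n):
    # H_n = 1 + 1/2 + ... + 1/n
    return sum(1.0 / k for k in range(1, n + 1))

# H_n - ln n approaches Euler's constant gamma as n grows.
gamma_estimate = harmonic(10**6) - math.log(10**6)
```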

Page 17

B(n) : average number of comparisons during QuickSort on n elements.
B(0) = 0, B(1) = 0, and B(n) = (n − 1) + (2/n) Σ_{i=0}^{n−1} B(i).
Multiplying by n:
n B(n) = n(n − 1) + 2 Σ_{i=0}^{n−1} B(i) -----(1)
Question: How will this equation appear for n − 1?
(n − 1) B(n − 1) = (n − 1)(n − 2) + 2 Σ_{i=0}^{n−2} B(i) -----(2)
Subtracting (2) from (1), we get
n B(n) − (n − 1) B(n − 1) = n(n − 1) − (n − 1)(n − 2) + 2 B(n − 1) = 2(n − 1) + 2 B(n − 1)
⟹ n B(n) = (n + 1) B(n − 1) + 2(n − 1)
Question: How to solve/simplify it further?
Divide both sides by n(n + 1):
B(n)/(n + 1) = B(n − 1)/n + 2(n − 1)/(n(n + 1))


Page 18

Substituting C(n) = B(n)/(n + 1), the recurrence becomes
C(n) = C(n − 1) + 2(n − 1)/(n(n + 1)), where C(0) = C(1) = 0.
Question: How to simplify the RHS term 2(n − 1)/(n(n + 1))?
By partial fractions, (n − 1)/(n(n + 1)) = 2/(n + 1) − 1/n, so
C(n) = C(n − 1) + 4/(n + 1) − 2/n.


Page 19

Unfolding the recurrence from C(1) = 0:
C(n) = Σ_{k=2}^{n} (4/(k + 1) − 2/k)
Question: How to calculate this sum?
Σ_{k=2}^{n} 4/(k + 1) = 4 (H_{n+1} − 1 − 1/2) and Σ_{k=2}^{n} 2/k = 2 (H_n − 1),
so C(n) = 4 H_{n+1} − 6 − 2 H_n + 2 = 2 H_n + 4/(n + 1) − 4.
Hence B(n) = (n + 1) C(n) = 2 (n + 1) H_n + 4 − 4(n + 1) = 2 (n + 1) H_n − 4n.

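The closed form can be checked against the original recurrence numerically (a sketch):

```python
def B(n):
    # Average comparisons via the recurrence
    # B(n) = (n - 1) + (2/n) * sum(B(i) for i in 0..n-1).
    vals = [0.0, 0.0]
    for m in range(2, n + 1):
        vals.append((m - 1) + 2.0 * sum(vals) / m)
    return vals[n]

def closed_form(n):
    # B(n) = 2 (n + 1) H_n - 4 n
    h = sum(1.0 / k for k in range(1, n + 1))
    return 2 * (n + 1) * h - 4 * n
```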

Page 20

B(n) = 2 (n + 1) H_n − 4n ≈ 2 (n + 1)(ln n + 0.58) − 4n = 2 n ln n − 2.84 n + O(ln n) = 1.39 n log₂ n − 2.84 n + O(ln n)

Theorem: The average number of comparisons during QuickSort on n elements approaches 2 n ln n − 2.84 n = 1.39 n log₂ n − O(n).

The best-case number of comparisons during QuickSort on n elements ≈ n log₂ n.
The worst-case number of comparisons during QuickSort on n elements = n(n − 1)/2.

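The worst case is reached on an already-sorted array with the first-element pivot: every Partition strands the pivot at one end, giving (n − 1) + (n − 2) + ⋯ + 1 = n(n − 1)/2 comparisons. A sketch:

```python
def comparisons(s):
    # Comparisons made by QuickSort with first-element pivot:
    # Partition costs len(s) - 1, plus the two recursive calls.
    if len(s) <= 1:
        return 0
    x = s[0]
    return ((len(s) - 1)
            + comparisons([e for e in s[1:] if e < x])
            + comparisons([e for e in s[1:] if e > x]))

worst = comparisons(list(range(100)))   # sorted input of size 100
```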

Page 21

QuickSort versus MergeSort
Theorem: The average number of comparisons by QuickSort on n elements = 1.39 n log₂ n − O(n).
Question: What does MergeSort on 8 elements look like?

Theorem: The worst-case number of comparisons during MergeSort on n elements = n log₂ n + O(n).
So the average number of comparisons of QuickSort is about 39% more than the worst-case number of comparisons of MergeSort.
Is the fact mentioned above not surprising, given that QuickSort outperforms MergeSort in real life almost always?

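The MergeSort count can be checked via its own recurrence (a sketch; merging two runs of total length n costs at most n − 1 comparisons, and for n a power of two the recurrence solves to n log₂ n − n + 1):

```python
def merge_worst(n):
    # Worst-case comparisons of MergeSort:
    # M(n) = M(floor(n/2)) + M(ceil(n/2)) + (n - 1), with M(0) = M(1) = 0.
    if n <= 1:
        return 0
    return merge_worst(n // 2) + merge_worst(n - n // 2) + n - 1
```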

Page 22

QuickSort versus MergeSort

Question: What makes QuickSort outperform MergeSort in real life?
Answer:
• Fewer recursive calls (MergeSort: 2n − 1, QuickSort: n).
• QuickSort makes better use of the cache (a fast but small memory unit between the processor and the RAM).
• No extra space is used by QuickSort.
• No overhead of copying the array, as is carried out during merging.
• The likelihood of QuickSort deviating from its average-case behavior drops drastically as the input size increases (read the following theorem).

Theorem: On a permutation of 1 million elements selected uniformly at random from all possible permutations, the probability that QuickSort takes more than twice the average time is vanishingly small.

Proof of this theorem is discussed in almost every course on randomized algorithms.
