
CS208: Algorithms and Complexity
Lecture 3: Search and Sort

Thomas Selig

University of Strathclyde

2 February 2017

T. Selig (Univ. Strathclyde) CS208: A & C 2 February 2017 0 / 26


Assignments

They’re meant to help you.

Some questions can have several correct answers.

Working together is fine, even encouraged, but...

You are allowed to start working on them more than 48 hours before they are due.

I look for answers that are clear and reasonably concise. Beyond that, I’m not bothered by format.


In the lectures so far we...

We introduced complexity, asymptotic complexity.

Formalised asymptotic complexity using big-O notation.

Determined the asymptotic complexity of polynomials.

We considered non-recursive algorithms.

Determined complexity of non-recursive algorithms.

Rules for if, while, for and sequential algorithms.

We considered recursive algorithms.

Determined complexity of recursive algorithms.

Obtained a recurrence relation and quoted its solution.


Overview of Lecture 3

Today: Complexity of algorithms for searching in and sorting lists.

Initial questions: In this lecture

Input Size: Length of list.

Primitive Operations: Comparisons of numbers.

Algorithms: As a by-product, this lecture is also revision

Reminders about recursive algorithms

Two observations

Two algorithms for searching

Selection Sort analysis

Insertion Sort analysis

Merge Sort analysis


Recursive algorithms

Recursive algorithms are algorithms which call themselves.

Recursive algorithms have recursively defined complexity functions.

The worst case asymptotic complexity is governed by a recurrence relation.

You should be able to:
Derive the recurrence relation for a recursive algorithm.
Know the complexity class of functions defined by recurrence relations.
Know a little bit about logarithms.


What does O(1) mean?

O(1) doesn’t ‘mean’ anything by itself.

To say that another function f is O(1) does mean something...

To say that an algorithm A is O(1) usually means that the asymptotic worst case time complexity of the algorithm for input size n, the function TA(n), is O(1).

I’ll return to this point again.


Small observation

Suppose T (n) = k + T (n/2). Then

T (256) = k + T (128)

= k + k + T (64)

= k + k + k + T (32)

= k + k + k + k + T (16)

= k + k + k + k + k + T (8)

= k + k + k + k + k + k + T (4)

= k + k + k + k + k + k + k + T (2)

= k + k + k + k + k + k + k + k + T (1).

So T (256) = 8k + T (1) = k log2(256) + constant.

By the same reasoning, T (n) = k log2(n) + constant

This argument is used to show that T (n) is O(log2 n).
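This halving argument is easy to check numerically. A minimal sketch in Python (the helper name `halving_steps` is my own; the count is exactly log2(n) when n is a power of 2):

```python
def halving_steps(n):
    """Count how many times n can be halved (integer division) before reaching 1."""
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
    return steps

print(halving_steps(256))  # 8, matching the eight k's in the unwinding above
```

Each halving contributes one k, so T(256) picks up 8k before bottoming out at T(1).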


A searching algorithm

Problem: Searching a list of integers L for m.

Algorithm search1(L,m)

Require: List L and integer m
if L = [] then
    answer ← false
else if head(L) = m then
    answer ← true
else
    answer ← search1(tail(L), m)
end if
return answer
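The pseudocode translates almost line for line into Python; a sketch, with head and tail rendered as `L[0]` and the slice `L[1:]`:

```python
def search1(L, m):
    """Linear search: compare the head of L to m, then recurse on the tail."""
    if not L:                   # L = []
        return False
    if L[0] == m:               # head(L) = m
        return True
    return search1(L[1:], m)    # search1(tail(L), m)

print(search1([3, 1, 4, 1, 5], 4))  # True
print(search1([3, 1, 4, 1, 5], 9))  # False
```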


Analysis of search1

Complexity: Primitive operation is comparing list head to value. Defined recursively by

Tsearch1(n) =
    0                        if L is empty
    1                        if the head of L is m
    1 + Tsearch1(n − 1)      otherwise.

Recurrence relation: Worst-case asymptotic complexity given by

Tsearch1(n) = 1 + Tsearch1(n − 1)

so Tsearch1(n) is O(n) (WB). We say ‘search1 has linear worst-case time complexity’.

Can we do better? In general, no. However...


Another searching Example

Suppose you want to search a list which is ordered.

Algorithm search2(L,m)

Require: Ordered list L and integer m
if L = [] then
    answer ← false
else
    a ← middle value(L)
    if a = m then
        answer ← true
    else if a > m then
        answer ← search2(first half(L), m)
    else
        answer ← search2(second half(L), m)
    end if
end if
return answer
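A sketch of search2 in Python, assuming first half and second half split around (and exclude) the middle element:

```python
def search2(L, m):
    """Binary search on an ordered list: compare the middle value to m,
    then recurse on the half that could still contain m."""
    if not L:
        return False
    mid = len(L) // 2
    a = L[mid]                          # middle value(L)
    if a == m:
        return True
    elif a > m:
        return search2(L[:mid], m)      # first half
    else:
        return search2(L[mid + 1:], m)  # second half

print(search2([1, 3, 5, 7, 9, 11], 7))  # True
print(search2([1, 3, 5, 7, 9, 11], 4))  # False
```

Each call discards half of the remaining list, which is where the T(n/2) term in the analysis below comes from.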


Analysis of search2

Recurrence Relation: Primitive operation is comparing middle list value to m.
Worst-case asymptotic complexity is given by

Tsearch2(n) = 1 + Tsearch2(n/2)

so Tsearch2(n) is O(log2(n)) (by using the Fact at the bottom of slide 5). In this case we say ‘search2 has logarithmic (log) worst-case time complexity’.

Terminology: We call algorithms such as this one, which break down a problem into two (or more) sub-problems of the same type, divide and conquer algorithms.

Why are such algorithms usually efficient?


Selection Sort

Description: To sort a list

Head of the result is the smallest member of the list.

To obtain the tail of the result, remove the element we have just found and then sort the remaining elements recursively.

Algorithm selsort(L)

Require: Input is a list L
1: if L = [] then
2:     return []
3: else
4:     x ← smallest(L)
5:     L′ ← selsort(remove(x, L))
6:     J ← cons([x], L′)
7:     return J
8: end if

Auxiliary Functions: Selection sort uses three other functions.

smallest returns the smallest element of a list.

remove deletes an element from a list.

cons is the concatenation operator.
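A sketch of selsort in Python, with the three auxiliary functions played by `min`, `list.remove` and list concatenation:

```python
def selsort(L):
    """Selection sort as on the slide: cons the smallest element onto the
    recursively sorted remainder."""
    if not L:
        return []
    x = min(L)                 # smallest(L)
    rest = list(L)
    rest.remove(x)             # remove(x, L): deletes one occurrence of x
    return [x] + selsort(rest) # cons([x], L')

print(selsort([5, 2, 8, 2, 1]))  # [1, 2, 2, 5, 8]
```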


Analysis of Selection Sort

Key Idea: Selection sort uses other functions.

So the complexity of selection sort depends on the complexity of the other functions.

Step 1: Derive the complexity by using the 4 rules (for if, while, for and sequential algorithms)

T (n) =
    0                          if L = []
    T4(n) + T5(n) + T6(n)      if L ≠ []
      = S(n) + (R(n) + T (n − 1)) + 0
      = S(n) + R(n) + T (n − 1).

(Here S(n) and R(n) denote the complexities of smallest and remove respectively.)

Step 2: It is easy to see that S(n) = a1n + b1 and R(n) = a2n + b2 for some numbers a1, a2, b1, b2. (What exactly they are we do not care.) The important point is that they are both linear functions of n, and therefore their sum will also be a linear function of n. So

T (n) = a3n + b3 + T (n − 1)

for some numbers a3 and b3. (Again we do not care what they are!)


Another way to say this is that since S(n) and R(n) are both O(n), their sum S(n) + R(n) is O(n).

The previous equation can be written in the following (incorrect!) but highly suggestive form:

T (n) = O(n) + O(n) + T (n − 1) = O(n) + T (n − 1)

What has just happened?

We have just performed arithmetic using big-O notation.

Can we make sense of the equation T (n) = O(n) + T (n − 1)?
What about T (n) − T (n − 1) = O(n)?
What about T (n) − T (n − 1) is O(n)?


Resolving T (n) = O(n) + T (n − 1)

The equation T (n) = O(n) + T (n − 1) ‘means’

T (n)− T (n − 1) = O(n),

which in turn ‘means’
T (n) − T (n − 1) is O(n).

So from the definition of big-O, this tells us there are numbers N and c such that

T (n)− T (n − 1) ≤ cn

for all n > N.

If T (n) = O(n) + T (n − 1) then T (n) is O(n²). (WB)

More generally, if T (n) = O(nᵃ) + T (n − 1) for some a ≥ 0, then T (n) is O(nᵃ⁺¹).
Example for a = 0.
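A quick numerical check of the boxed fact, assuming its simplest instance T(n) = n + T(n − 1) with T(0) = 0: unwinding the recurrence gives the closed form n(n + 1)/2, which is quadratic in n.

```python
def T(n):
    """Unwind T(n) = n + T(n - 1), T(0) = 0, iteratively."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

# The closed form n(n + 1)/2 is quadratic, as the fact predicts.
for n in (10, 100, 1000):
    assert T(n) == n * (n + 1) // 2
print(T(100))  # 5050
```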


The asymptotic complexity of Selection Sort

For Selection Sort, we have that T (n) = O(n) + T (n − 1).

We can therefore conclude that T (n) is O(n²),
i.e. Selection sort has worst case time complexity that is quadratic.


Insertion Sort

Description: Insertion sort is defined as follows

insert inserts an item in a sorted list

inssort recursively sorts the tail and insert puts the head in the correct place.

Algorithm inssort(L)

if L = [] then
    return L
else
    a ← head(L)
    return insert(a, inssort(tail(L)))
end if

Algorithm insert(a, L)

if L = [] then
    return [a]
else
    b ← head(L)
    if a ≤ b then
        return cons([a], L)
    else
        return cons([b], insert(a, tail(L)))
    end if
end if
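Both functions carry over directly to Python; a sketch with cons rendered as list concatenation:

```python
def insert(a, L):
    """Insert a into the sorted list L, keeping it sorted."""
    if not L:
        return [a]
    b = L[0]
    if a <= b:
        return [a] + L                  # cons([a], L)
    return [b] + insert(a, L[1:])       # cons([b], insert(a, tail(L)))

def inssort(L):
    """Sort the tail recursively, then insert the head in the right place."""
    if not L:
        return []
    return insert(L[0], inssort(L[1:]))

print(inssort([5, 2, 8, 2, 1]))  # [1, 2, 2, 5, 8]
```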


Insertion Sort

We can re-write the algorithm as follows.

Description: Insertion sort is defined as follows

insert inserts an item in a sorted list

inssort recursively sorts the tail and insert puts the head in the correct place.

Algorithm inssort(L)

inssort([]) ← []
inssort(cons(a, L′)) ← insert(a, inssort(L′))

insert(a, []) ← cons(a, [])
insert(a, cons(b, L)) ← cons(a, cons(b, L)) when a ≤ b
insert(a, cons(b, L)) ← cons(b, insert(a, L)) when a > b


Complexity of Insertion Sort

The primitive operations to be counted are comparisons of pairs of integers.
Insert: First the complexity of insert

Tinsert(n) =
    0                       if n = 0
    1                       if a ≤ head(L)
    1 + Tinsert(n − 1)      otherwise

and so insert is a linear, i.e. O(n), algorithm.

Insertion Sort: The complexity of insertion sort is given by

Tinssort(n) =
    0                                      if n = 0
    Tinsert(n − 1) + Tinssort(n − 1)       otherwise

So Tinssort(n) = O(n) + Tinssort(n − 1).

Hence: Insertion sort is therefore a quadratic, i.e. O(n²), algorithm.


Merge Sort

Description: Merge sort is defined as follows

Divide the list L into two (as far as possible) equal parts first(L) and second(L).

Recursively Merge sort each part, giving two sorted lists.

Merge the two resulting lists, e.g.

mge([1, 10, 20, 22], [5, 15, 21]) = [1, 5, 10, 15, 20, 21, 22].

Algorithm msort(L)

Require: Input is a list L
msort([]) ← []
msort([e]) ← [e]
msort(L) ← mge(msort(first(L)), msort(second(L))) when |L| > 1

Algorithm mge(L, L′)

Require: Input is a pair of sorted lists (L, L′)
mge([], L) ← L
mge(L, []) ← L
mge(cons(a1, L1), cons(a2, L2)) ← cons(a1, mge(L1, cons(a2, L2))) when a1 ≤ a2
mge(cons(a1, L1), cons(a2, L2)) ← cons(a2, mge(cons(a1, L1), L2)) when a1 > a2
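A sketch of msort and mge in Python, using slicing for first and second:

```python
def mge(L1, L2):
    """Merge two sorted lists into one sorted list."""
    if not L1:
        return L2
    if not L2:
        return L1
    if L1[0] <= L2[0]:
        return [L1[0]] + mge(L1[1:], L2)
    return [L2[0]] + mge(L1, L2[1:])

def msort(L):
    """Split L into two (as far as possible) equal halves, sort each
    recursively, then merge."""
    if len(L) <= 1:
        return L
    mid = len(L) // 2
    return mge(msort(L[:mid]), msort(L[mid:]))

print(mge([1, 10, 20, 22], [5, 15, 21]))  # [1, 5, 10, 15, 20, 21, 22]
print(msort([5, 2, 8, 2, 1]))             # [1, 2, 2, 5, 8]
```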


Complexity of Merge Sort

Merge: To merge two lists whose total length is n, in the worst case we need n − 1 steps.

Merge Sort:

Tmsort(n) =
    0                           if n = 0
    0                           if n = 1
    O(n) + 2Tmsort(n/2)         otherwise.

Fact: A recurrence relation of the form

T (n) = O(n) + 2T (n/2)

has a solution which is O(n log2(n)). (WB)

Conclusion: Merge sort is asymptotically more efficient than both insertion sort and selection sort.


Quick sort

Description: Given list L

Choose a pivot x in L.

Move elements of L so that
all elements to the left of x are less than it, and
all elements to the right of x are greater than it.

Apply this procedure to the lists left and right of x

The Quick sort routine on a list L is executed by calling q2sort(L, 1, length(L))
(the additional parameters are necessary for the recursive definition).


Algorithm qsort(L)
Require: Input is a list L
1: q2sort(L, 1, length(L))
2: return L

Algorithm q2sort(L, p, r)
Require: Input is a list L, and two integers p and r
1: if p < r then
2:     q ← partition(L, p, r)
3:     q2sort(L, p, q − 1)
4:     q2sort(L, q + 1, r)
5: end if

Algorithm partition(L, p, r)
Require: Input is a list L, and two integers p and r
1: x ← L[r]
2: i ← p − 1
3: for j = p to r − 1 do
4:     if L[j] ≤ x then
5:         i ← i + 1
6:         swap L[i] and L[j]
7:     end if
8: end for
9: swap L[i + 1] and L[r]
10: return i + 1
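The three routines transcribe to Python as follows; a sketch using 0-based indices in place of the slide's 1-based ones:

```python
def partition(L, p, r):
    """Partition L[p..r] around the pivot L[r]; return the pivot's final index."""
    x = L[r]
    i = p - 1
    for j in range(p, r):
        if L[j] <= x:
            i += 1
            L[i], L[j] = L[j], L[i]   # swap L[i] and L[j]
    L[i + 1], L[r] = L[r], L[i + 1]   # put the pivot between the two parts
    return i + 1

def q2sort(L, p, r):
    """Recursively quicksort L[p..r] in place."""
    if p < r:
        q = partition(L, p, r)
        q2sort(L, p, q - 1)
        q2sort(L, q + 1, r)

def qsort(L):
    """In-place quicksort of the whole list."""
    q2sort(L, 0, len(L) - 1)
    return L

print(qsort([5, 2, 8, 2, 1]))  # [1, 2, 2, 5, 8]
```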


Complexity of Quicksort

Let Tqsort(n) be the time taken to quicksort an input of size n.

Then
Tqsort(n) = Tpartition(n) + Tqsort(m) + Tqsort(n − 1 − m)

for some value m. Now Tpartition(n) = n, or at least is O(n).

So
Tqsort(n) = n + Tqsort(m) + Tqsort(n − 1 − m)

for some m ∈ {0, 1, . . . , n − 1}.
Analysing the behaviour of qsort is difficult due to the form of the above equation.

It depends on a quantity m which can vary greatly depending upon the particular input.


Complexity of Quicksort

However, for worst case time complexity we may write

Tqsort(n) ≤ n + max_{m ∈ {0, . . . , n − 1}} {Tqsort(m) + Tqsort(n − 1 − m)}.

Solving such recursive inequalities is a delicate art.

The solution T is generally well behaved and one expects the argument of max to be achieved for m = 0, n − 1, or (n − 1)/2. It is difficult to explain why this is so, but the ‘level 1’ reason is a combination of considering how quickly T increases along with a symmetry argument.

To cut a long derivation short, the answer is Tqsort(n) ∈ O(n²).


Complexity of Quicksort

We have shown that the worst-case complexity of Quicksort is O(n²), i.e. quadratic.

In principle, this is less efficient than Mergesort.

However, recall:

Tqsort(n) = n + Tqsort(m) + Tqsort(n − 1 − m)

for some m ∈ {0, 1, . . . , n − 1}.

If, in general, m = 0 or m = n − 1, we get T (n) = O(n) + T (n − 1), so T (n) is O(n²).

But if, in general, m = n/2, we get T (n) = O(n) + 2T (n/2) and T (n) is O(n log2(n)).

Same as Mergesort! This is what happens on average.


Comparing the sorting algorithms

Selection Sort has complexity that is quadratic for best and worst cases.

Insertion Sort has worst case time complexity that is quadratic, but its best case is linear (if the list is already sorted).

Merge Sort has worst case time complexity O(n log2(n)) (also average case).

Quick sort has worst-case complexity that is quadratic, but average case that isO(n log2(n)).

Quick sort vs Merge sort? Beyond the scope of these lectures (depends on how lists are implemented...).
