Computing Science 1P

Lecture 21: Wednesday 18th April

Simon Gay
Department of Computing Science

University of Glasgow

2006/07


2006/07 Computing Science 1P Lecture 21 - Simon Gay 2

What's coming up?

Wed 18th April (today): lecture
Fri 20th April: lecture
Mon 23rd April, 12.00-14.00: FPP demo session; PRIZES!
Mon 23rd – Wed 25th April: labs: exam preparation / lab exam preparation
Wed 25th April: lecture / tutorial
Friday 27th April: lecture: revision / questions with Peter Saffrey
Mon 30th April – Wed 2nd May: Lab Exam
Wed 2nd May: No lecture
Fri 4th May: No lecture

THE END


Lab Exam 2

As you know, there will be a second lab exam, during the week beginning 30th April. It is worth 10% of the module mark. The question has been available since Monday afternoon.

In the first lab exam, although there were many very good submissions, a worrying number of submissions showed very little evidence of advance preparation.

You can prepare for the lab exam in any way you want to, including asking a friend to explain the solution to you. On the day, of course, you are on your own.

Trying to solve the problem from scratch in the exam is just making life difficult for yourself.


Algorithms

We have looked at algorithms for sorting; we saw that choosing a better algorithm can have a dramatic effect on the efficiency of a program.

Now let's consider searching, another basic computing task.

The general problem: find a desired item of data in a collection.

Example: in a collection of (name,address) pairs, find a particular person's address.


Searching in unstructured data

Imagine that we have a list of (key,value) pairs and we do not know anything about the order. We can easily define:

def find(key, data):
    for i in data:
        if i[0] == key:
            return i[1]
    # KeyError replaces the string exception "KeyNotFound", which current Python rejects
    raise KeyError(key)
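As a usage illustration, here is a minimal sketch of this linear search in Python 3, applied to the (name,address) example. The names and addresses are invented for the illustration, and raising the built-in KeyError for a missing key is a choice made for this sketch.

```python
def find(key, data):
    """Linear search over a list of (key, value) pairs."""
    for pair in data:
        if pair[0] == key:
            return pair[1]
    # Assumption for this sketch: KeyError signals a missing key.
    raise KeyError(key)


# Hypothetical sample data, invented for this illustration.
addresses = [
    ("Morag", "12 Byres Road"),
    ("Hamish", "3 Kelvin Way"),
    ("Isla", "40 University Avenue"),
]

print(find("Hamish", addresses))  # 3 Kelvin Way
```

Looking up a name that is not in the list, such as find("Nobody", addresses), raises KeyError after every pair has been examined.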


Searching in unstructured data

What can we say about the time taken by find? Just as for sorting, the relevant measure is the number of comparisons.

Clearly it is possible that the key we are looking for is at the end of the list. In that case we have to compare the given key with every key in the list.

If we imagine testing find repeatedly with a large number of random lists, on average it will have to search half way along the list.

When analysing algorithms, sometimes we talk about the average case and sometimes the worst case. In this situation they have the same order: about n/2 comparisons on average and n in the worst case are both order n, where n is the length of the list.
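The worst-case and average-case counts can be checked with a small experiment. This is a sketch in Python 3 under stated assumptions: the helper comparisons_to_find and the test data are hypothetical, and instead of many random lists it averages over every possible position of the key in one list, which gives the same average.

```python
def comparisons_to_find(key, data):
    """Return how many key comparisons a linear search makes before finding key."""
    count = 0
    for pair in data:
        count += 1
        if pair[0] == key:
            return count
    raise KeyError(key)


n = 1000
data = [(k, str(k)) for k in range(n)]  # hypothetical data: keys 0 .. n-1

worst = comparisons_to_find(n - 1, data)  # the key is at the end of the list
average = sum(comparisons_to_find(k, data) for k in range(n)) / n

print(worst)    # 1000
print(average)  # 500.5
```

Both figures grow in proportion to n, which is what "order n" expresses.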


Searching in unstructured data

It's obvious that we can't do better than order n for searching in an unstructured list, because we can't avoid the possibility that the desired key is at the end.

Remarkably, there is an algorithm for quantum computers which only takes square root of n operations to search in an unstructured list. However, quantum computers of a useful size have not yet been built.

To find out more, look up Grover's algorithm.

But let's stick to conventional algorithms…


More efficient search

If we can't improve the algorithm for search in an unstructured list, the only alternative is to change the data structure: don't use an unstructured list!

The first idea is quite simple: use an ordered list instead. In other words, put the data in the list in such a way that the keys are in order. Often this means alphabetical order or numerical order, but other more complex orders could be defined.

Example: in a dictionary, the words are in alphabetical order, and we can take advantage of this to find words quickly.
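One way to obtain an ordered list in Python is the built-in sorted function with a key argument. The sketch below uses invented (key,value) pairs; the second ordering (by word length, then alphabetically) is only an illustration of a "more complex" order, not something taken from the lecture.

```python
# Hypothetical (key, value) pairs in no particular order.
pairs = [("door", 4), ("android", 7), ("cat", 3), ("badger", 6)]

# Alphabetical order on the key (the first component of each pair).
ordered = sorted(pairs, key=lambda pair: pair[0])
print(ordered)
# [('android', 7), ('badger', 6), ('cat', 3), ('door', 4)]

# A different order: shorter keys first, ties broken alphabetically.
by_length = sorted(pairs, key=lambda pair: (len(pair[0]), pair[0]))
print(by_length)
# [('cat', 3), ('door', 4), ('badger', 6), ('android', 7)]
```

Whatever order is chosen, the search must compare keys using the same order that was used to arrange the list.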

Binary search

 0  android
 1  badger
 2  cat
 3  door
 4  ending
 5  fireman
 6  garage
 7  handle
 8  iguana
 9  jumper
10  kestrel
11  lemon

Search for the key: cat

It could be anywhere in the list.

The list has length 12. Divide it by 2 and look at position 6.

cat < garage

Because the list is ordered, we now know that cat must be before garage, i.e. it is in the first half of the list.

Now repeat, searching in a list of length 6 (positions 0 to 5).

Divide by 2 and look at position 3.

cat < door

We now know that cat must be before door.

Now repeat, searching in a list of length 3 (positions 0 to 2).

Divide by 2 and look at position 1.

cat > badger

We now know that cat must be after badger.

We have narrowed down the possible position of cat to just one place (position 2). And in fact cat is there, so we have found it.

If a different word is there, then cat is not in the list.


Binary search

The idea of binary search is very simple, but implementing it correctly requires care: there are many possibilities for "off by one" errors.

Searching in a dictionary is often used as an example of binary search, but we don't really use dictionaries in exactly this way.

Usually we flick through the pages quickly to find the right letter, then do something similar to binary search. A typical dictionary has extra structure to support this process (e.g. words in the page headers; thumbholes for indexing).

Another example

Search for the key: handle

[Figure: the same ordered list, positions 0 (android) to 11 (lemon), with the highlighted search region shrinking step by step around handle at position 7.]

We could stop here, but if we follow the algorithm strictly, we continue dividing the region in two.

Now we can certainly stop.
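The steps of this example can be reproduced with a short trace. This is a sketch in Python 3, not the lecture's own code: the name trace_find, the recorded list of midpoints and the use of KeyError are assumptions made for the illustration. It always keeps halving until one position is left.

```python
def trace_find(key, data):
    """Binary search by repeated halving, recording each position examined.

    Returns (position, midpoints). The // operator is integer division.
    """
    lower = 0
    upper = len(data) - 1
    length = upper - lower + 1
    midpoints = []
    while length > 1:
        midpoint = lower + length // 2
        midpoints.append(midpoint)
        if key < data[midpoint]:
            upper = midpoint - 1
        else:
            lower = midpoint
        length = upper - lower + 1
    if key == data[lower]:
        return lower, midpoints
    raise KeyError(key)  # assumption: KeyError signals a missing key


words = ["android", "badger", "cat", "door", "ending", "fireman",
         "garage", "handle", "iguana", "jumper", "kestrel", "lemon"]

position, midpoints = trace_find("handle", words)
print(position)   # 7
print(midpoints)  # [6, 9, 7, 8]
```

The third position examined (7) already holds handle, which is plausibly where one "could stop"; the strict procedure examines position 8 as well, leaving a region of length 1. For cat the same sketch examines positions 6, 3, 1 and 2: one step more than the hand-worked example, because this code does not distinguish "greater than" from "equal to" at position 1.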


Analysing binary search

Remember that we are interested in the number of comparisons.

Suppose that we are searching in a list of length n.

We compare the middle item with the search key. The result might tell us we have found the key, but in general it narrows down the region of the list in which we are searching.

The possible region of the list is now half the size it was.


Analysing binary search

We keep halving the size of the region, until we narrow it down to a single position in which the key should be found.

How many times do we have to halve the size?

n = 16: 8, 4, 2, 1            4 comparisons
n = 64: 32, 16, 8, 4, 2, 1    6 comparisons

It is the logarithm of n to base 2, i.e. the power of 2 which gives n.

Binary search is an order log n algorithm.
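The halving count can be computed directly. A minimal sketch in Python 3 (the function name halvings is hypothetical); for a power of 2 the result is exactly the logarithm to base 2, and for other values of n it is that logarithm rounded down.

```python
def halvings(n):
    """Count how many times n can be halved (integer division) before reaching 1."""
    count = 0
    while n > 1:
        n = n // 2
        count += 1
    return count


print(halvings(16))  # 4
print(halvings(64))  # 6
```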


Analysing binary search

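We can compare the efficiency of an order n algorithm with that of an order log n algorithm. The times shown correspond to one microsecond per comparison, and the log n column uses approximate values (log n to base 2 grows by roughly 3 for each factor of 10).

n             log n    time           n             time
10            3                       10
100           6                       100
1 000         9        9 microsec     1 000         1 millisec
10 000        12       12 microsec    10 000        10 millisec
100 000       15       15 microsec    100 000       100 millisec
1 000 000     18       18 microsec    1 000 000     1 sec
10 000 000    21       21 microsec    10 000 000    10 sec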
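The figures in this comparison can be recomputed. The sketch below, in Python 3, assumes one microsecond per comparison (the rate that matches the timings in the table) and uses hypothetical names; it rounds the logarithm up, so its step counts are slightly larger than the rounded values in the table, without changing the conclusion.

```python
import math

MICROSECONDS_PER_COMPARISON = 1  # assumption matching the table's timings


def binary_steps(n):
    """Number of halvings needed for a list of length n, rounded up."""
    return math.ceil(math.log2(n))


for exponent in range(1, 8):
    n = 10 ** exponent
    log_time = binary_steps(n) * MICROSECONDS_PER_COMPARISON  # order log n search
    linear_time = n * MICROSECONDS_PER_COMPARISON             # order n search, worst case
    print(n, binary_steps(n), log_time, linear_time)
```

For n = 1 000 this gives 10 steps and for n = 1 000 000 it gives 20, against one thousand and one million comparisons for the order n search.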


Implementing binary search

def find(key, data):
    lower = 0
    upper = len(data) - 1
    length = upper - lower + 1
    while length > 1:
        midpoint = lower + length // 2   # integer division
        if key < data[midpoint]:
            upper = midpoint - 1
        else:
            lower = midpoint
        length = upper - lower + 1
    if key == data[lower]:
        return lower
    else:
        # KeyError replaces the string exception "KeyNotFound", which current Python rejects
        raise KeyError(key)
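For comparison, Python's standard library includes the bisect module, which performs the same kind of halving search on a sorted list. The sketch below is an alternative to the function above, not the lecture's method; the wrapper name find_with_bisect and the use of KeyError are assumptions made for the illustration.

```python
import bisect


def find_with_bisect(key, data):
    """Locate key in a sorted list using the standard library's bisect module."""
    # bisect_left returns the leftmost index at which key could be inserted
    # while keeping data sorted.
    position = bisect.bisect_left(data, key)
    if position < len(data) and data[position] == key:
        return position
    raise KeyError(key)


words = ["android", "badger", "cat", "door", "ending", "fireman",
         "garage", "handle", "iguana", "jumper", "kestrel", "lemon"]

print(find_with_bisect("cat", words))     # 2
print(find_with_bisect("handle", words))  # 7
```

Writing the loop by hand, as in the lecture, is still the better way to understand where the "off by one" errors can arise.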

[Figure: the find function shown beside the word list (positions 0 to 11), with markers for lower, upper and midpoint.]


Termination

When writing programs with loops, we have to be sure that they terminate, i.e. eventually stop.

In almost all of our previous programs, it has been obvious that loops terminate.

In a for loop, the number of iterations is known before we start, e.g. for x in range(10)

In a while loop, the condition can be anything, but we have always used a simple structure:

i = 0
while i < 10:
    # code inside the loop, not changing i
    i = i + 1


Termination

Binary search uses a while loop with a more complex structure:

length = upper - lower + 1
while length > 1:
    midpoint = lower + length // 2   # integer division
    if key < data[midpoint]:
        upper = midpoint - 1
    else:
        lower = midpoint
    length = upper - lower + 1

Let's prove that eventually length <= 1, so that the loop terminates.


Proving termination

Consider one iteration of the loop. At the beginning we have values lower, upper and length. At the end of the body of the loop we have new values lower', upper' and length'.

We will prove that length' < length.

Therefore as we go round the loop repeatedly, length gets smaller and smaller. It is always an integer value, so eventually it must reach 1 or less.


Proving termination

At the top of the loop we have values lower, upper.

We have length = upper - lower + 1

Assume that length > 1, so that we go into the loop.

At the bottom of the loop we have new values lower', upper'

and we have a new value length' = upper' - lower' + 1

We also have midpoint = lower + length/2, where length/2 is integer division (rounding down).

Now we consider the possible ways of calculating lower', upper'


Proving termination

Case 1: we take the first branch of the if statement.

upper' = midpoint - 1
lower' = lower

length' = upper' - lower' + 1
        = midpoint - 1 - lower + 1
        = midpoint - lower
        = lower + length/2 - lower
        = length/2
        < length    because length > 1


Proving termination

Case 2: we take the second branch of the if statement.

upper' = upper
lower' = midpoint

length' = upper' - lower' + 1
        = upper - midpoint + 1
        = upper - (lower + length/2) + 1
        = upper - lower - length/2 + 1
        = length - length/2
        < length    because length > 1, so length/2 >= 1


Proving termination

We have proved that whichever path we take through the body of the while loop, length decreases.

Therefore the loop must terminate.

With further calculation of a similar kind (exercise!) we can prove that, for a non-empty list, when the loop terminates, length = 1 (not < 1), meaning that we really have identified one location in the list where the key should be found if it is present at all.
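A proof covers every case, but the claims can also be checked by experiment on small lists. The sketch below, in Python 3, records the successive values of length; the function name loop_lengths and the test data (lists of even numbers) are assumptions made for the illustration, and a check of this kind supports the proof without replacing it.

```python
def loop_lengths(key, data):
    """Run the binary search loop and return the successive values of length."""
    lower = 0
    upper = len(data) - 1
    length = upper - lower + 1
    lengths = [length]
    while length > 1:
        midpoint = lower + length // 2  # integer division
        if key < data[midpoint]:
            upper = midpoint - 1
        else:
            lower = midpoint
        length = upper - lower + 1
        lengths.append(length)
    return lengths


# Try every list size from 1 to 50, with keys present in and absent from the list.
for size in range(1, 51):
    data = list(range(0, 2 * size, 2))   # hypothetical data: 0, 2, 4, ...
    for key in range(-1, 2 * size + 1):  # odd keys and -1 are not in the list
        lengths = loop_lengths(key, data)
        # length strictly decreases on every iteration ...
        assert all(a > b for a, b in zip(lengths, lengths[1:]))
        # ... and the loop stops with length exactly 1
        assert lengths[-1] == 1

print(loop_lengths(5, [0, 2, 4, 6]))  # [4, 2, 1]
```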


Refining binary search

It might turn out that when we look at the midpoint of the list, the key we want happens to be there. We might as well take advantage of that case:

while length > 1:
    midpoint = lower + length // 2   # integer division
    if key < data[midpoint]:
        upper = midpoint - 1
    elif key > data[midpoint]:
        lower = midpoint
    else:
        return midpoint
    length = upper - lower + 1   # as before; needed for the loop to terminate

but notice that we have introduced an extra comparison. What can we do about this?


The function cmp

The problem is that when we compare two values, there are three possible results: equal, first one smaller, second one smaller.

The comparisons < <= > >= == return a boolean result, so they only tell us one of two possible results.

To solve this problem, Python 2 provides the built-in function cmp (it was removed in Python 3).

cmp(a,b) returns  0 if a == b
                 -1 if a < b
                  1 if a > b
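In Python 3 a function with the same behaviour can be defined in one line. This is a sketch of a stand-in, not the original built-in; note that it performs two ordinary comparisons internally, so it reproduces the interface of cmp rather than its efficiency.

```python
def cmp(a, b):
    """Three-way comparison: -1 if a < b, 0 if a == b, 1 if a > b."""
    # Each comparison yields True or False, which count as 1 or 0 when subtracted.
    return (a > b) - (a < b)


print(cmp("cat", "garage"))  # -1
print(cmp("cat", "cat"))     # 0
print(cmp("cat", "badger"))  # 1
```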


Using cmp

while length > 1:
    midpoint = lower + length // 2   # integer division
    r = cmp(key, data[midpoint])
    if r == -1:
        upper = midpoint - 1
    elif r == 1:
        lower = midpoint
    else:
        return midpoint
    length = upper - lower + 1   # as before; needed for the loop to terminate
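Putting the loop into a complete function gives the following sketch in Python 3. Several details are assumptions made for the illustration rather than part of the lecture: the local cmp stand-in (Python 3 has no built-in cmp), the use of KeyError for a missing key, and the extra guard for an empty list.

```python
def cmp(a, b):
    # Stand-in for Python 2's built-in cmp, which Python 3 no longer provides.
    return (a > b) - (a < b)


def find(key, data):
    """Binary search with an early exit when the midpoint holds the key."""
    lower = 0
    upper = len(data) - 1
    length = upper - lower + 1
    while length > 1:
        midpoint = lower + length // 2  # integer division
        r = cmp(key, data[midpoint])
        if r == -1:
            upper = midpoint - 1
        elif r == 1:
            lower = midpoint
        else:
            return midpoint
        length = upper - lower + 1
    # Added guard: an empty list has no position to inspect.
    if data and key == data[lower]:
        return lower
    raise KeyError(key)


words = ["android", "badger", "cat", "door", "ending", "fireman",
         "garage", "handle", "iguana", "jumper", "kestrel", "lemon"]

print(find("garage", words))  # 6, found at the first midpoint
print(find("cat", words))     # 2
```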