Searching Chapter 7

Searching Chapter 7. Objectives Introduce sequential search. – Calculate the computational complexity of a successful search. Introduce binary search



Searching

Chapter 7


Objectives

• Introduce sequential search.
  – Calculate the computational complexity of a successful search.
• Introduce binary search.
  – 4 different versions.
  – Calculate the computational complexity.
• Discuss comparison trees and how they can be used to analyze algorithm performance.
  – Internal path length
  – External path length
  – Average path length


Homework Overview

• Written (max 40 points)
  – 7.2 E3 (4 pts)
  – 7.3 E1 (a, b, c, d) (2 pts each)
  – 7.4 E1 (a, b, c, d) (3 pts each)
  – 7.4 E2 (5 pts)
  – 7.4 E3 (10 pts)
  – 7.6 E1 (a, b, c, d) (2 pts each)
  – 7.6 E2 (6 pts)
  – 7.6 E5 (a, b, c, d, e, f, g, h) (1 pt each)
  – 7.6 E6 (a, b, c, d) (2 pts each)
• Programming (max 20 points)
  – 7.2 E4 (8 pts)
  – 7.2 P2 (12 pts)
  – 7.4 P1 (15 pts)


Searching

• A very common problem in computer science is trying to find a particular data entry.

• There are two main strategies.
  – Use a general storage type and then produce an algorithm to search within that type.
  – Design special storage types that make searching more efficient.
• In general we assume each entry has a key.
  – Name
  – ID number
  – Value
  – etc.

• We search the entries until we find the desired key.


Sequential Search

• If there is no organizational structure to the data then the only real strategy is a sequential search.
  – We start at one end of the list and examine each key in turn.

for (position = 0; position < size; position++) {
   the_list.retrieve(position, data);
   if (data == target) return success;
}
return not_present;


Complexity of Sequential Search

• To determine the computational cost of doing this search we count how many times some representative operation occurs.
  – We will choose to count the number of comparisons.
  – For some data types, comparisons may be very expensive.
    • For example comparing long strings.
  – How many times is the == operator used?
• The answer depends on where (or if) the target key is stored in the list.
  – We could get lucky and hit the target on the first comparison.
  – We could find the key on the last comparison.
  – If the key is not in the list we will need to look at every entry to make sure.


Complexity of Sequential Search

• Let’s assume we know the key is in the list so the search will be successful.
  – Let’s also assume the key has an equal probability of being in any location in the list.
• Let n be the number of entries in the list.
• We could find the desired key after 1, 2, 3, …, n comparisons, all with equal probability.
• The average search time is (1 + 2 + 3 + … + n)/n = (n + 1)/2 comparisons.
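The averaging argument can be checked by brute force. The following is a minimal sketch, not the slides' code: it assumes a plain std::vector<int> instead of the List class, and the helper names sequential_comparisons and average_successful are made up for illustration. It searches once for every stored key, so every position is equally likely.

```cpp
#include <vector>

// Count the == comparisons a sequential search makes before it stops.
// Hypothetical helper for illustration; it is not part of the slides' code.
long sequential_comparisons(const std::vector<int> &keys, int target)
{
    long comparisons = 0;
    for (int key : keys) {
        ++comparisons;                 // one == test per entry examined
        if (key == target) break;
    }
    return comparisons;
}

// Average comparisons over one successful search for each of the n keys
// (assumes n > 0).
double average_successful(int n)
{
    std::vector<int> keys;
    for (int i = 0; i < n; i++) keys.push_back(2 * i + 1);   // odd keys only
    long total = 0;
    for (int key : keys) total += sequential_comparisons(keys, key);
    return static_cast<double>(total) / n;
}
```

For n = 1000 the total is 1 + 2 + … + 1000 = 500500 comparisons, so the average is 500.5, in line with (n + 1)/2.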


Key Class

• In order to count the number of comparisons in the run of an actual program, it is helpful to create a custom key class.

• This class will represent the key (any data type) but more importantly it will allow us to overload the comparison operators.

• The class will contain a static variable that will count the number of comparisons.

• Each time a comparison is made, the overloaded operator will add one to the comparison count.

• We can examine the comparison count at the end of the program.


Key Class Definition

class Key {
   int key;
public:
   static int comparisons;     // running count of key comparisons
   Key (int x = 0);
   int the_key() const;
};

bool operator == (const Key &x, const Key &y);
bool operator >  (const Key &x, const Key &y);
bool operator <  (const Key &x, const Key &y);
bool operator >= (const Key &x, const Key &y);
bool operator <= (const Key &x, const Key &y);
bool operator != (const Key &x, const Key &y);

int Key::comparisons = 0;

• Note the static variable is assigned its initial value outside of any function.
  – It is accessed using the class name and the scope resolution operator.


Key Class Methods

• The constructor and accessor methods are simple.

Key::Key(int x)
{
   key = x;
}

int Key::the_key() const
{
   return key;
}

• The operators are also simple.
  – They use the accessor method and the default comparison to do their job.
  – They increment the comparison count.
  – They are all similar to the following.

bool operator == (const Key &x, const Key &y)
{
   Key::comparisons++;                 // count this comparison
   return x.the_key() == y.the_key();
}


Sequential Search Testing Program

• Now that we have a key class that can count comparisons for us, we can write a program to test sequential search.

• We will generate a list of odd entries from a known range of values.

• We will repeatedly select a random value that we know is in the list and search for it.
  – Computing the average number of comparisons over a large number of runs.
  – We will also calculate the run time for these searches.
• We will then repeatedly select a random entry we know is not in the list and search for it.
  – Computing the average number of comparisons over a large number of runs.
  – We will also calculate the run time for these searches.


Random, Timer and List

• To generate the random numbers we will use the Random class defined in Appendix B of the book.

• To calculate the run times we will use the Timer class defined in Appendix C of the book.

• We have used similar code before so we won’t go into the details here.

• Finally, we will use one of the list packages (all should work) that we developed in the last chapter.
  – You will need to add not_present to the enumeration of the return values.


Main Function

• First, we will just be storing Keys in the list.

typedef Key Record;

• The main function asks the user for details, creates the list and then calls test_search.

int main()
{
   int items, searches;
   List<Record> the_list;
   Key::comparisons = 0;

   cout << "How many items should be stored in the list? " << flush;
   cin >> items;
   if (items < 0) {
      cout << "Error: the number of items must be nonnegative." << endl;
      exit(1);
   }

   cout << "How many searches should be performed? " << flush;
   cin >> searches;
   if (searches <= 0) {
      cout << "Error: the number of searches must be positive." << endl;
      exit(1);
   }

   // Store the odd values 1, 3, 5, ... so that even targets are never found.
   for (int i = 0; i < items; i++)
      the_list.insert(i, 2 * i + 1);
   test_search(searches, the_list);
}


test_search Function

void test_search(int searches, List<Record> &the_list)
/* Pre:  The number searches is a positive integer and the List the_list has been
         filled with some number of integers.
   Post: Statistics are printed about the performance of searching algorithms when
         the searched for key is present in the list and when it is absent.
   Uses: The List class, the Random number class, the Key class, the Timer class,
         and the function sequential_search. */
{
   int list_size = the_list.size();
   if (searches <= 0 || list_size <= 0) {   // an empty list cannot supply a successful target
      cout << " Exiting test: " << endl
           << " The number of searches must be positive." << endl
           << " The number of list entries must exceed 0." << endl;
      return;
   }

   int i, target, found_at;
   Key::comparisons = 0;
   Random number;
   Timer clock;

   // Successful searches: every odd value 1, 3, ..., 2 * list_size - 1 is stored.
   for (i = 0; i < searches; i++) {
      target = 2 * number.random_integer(0, list_size - 1) + 1;
      if (sequential_search(the_list, target, found_at) == not_present)
         cout << "Error: Failed to find expected target " << target << endl;
   }
   print_out("Successful", clock.elapsed_time(), Key::comparisons, searches);

   // Unsuccessful searches: even values are never stored.
   Key::comparisons = 0;
   clock.reset();
   for (i = 0; i < searches; i++) {
      target = 2 * number.random_integer(0, list_size);
      if (sequential_search(the_list, target, found_at) == success)
         cout << "Error: Found unexpected target " << target
              << " at " << found_at << endl;
   }
   print_out("Unsuccessful", clock.elapsed_time(), Key::comparisons, searches);
}


Sequential Search Function

Error_code sequential_search(const List<Record> &the_list, const Key &target, int &position)
/* Post: If an entry in the_list has key equal to target, then return success and
         the output parameter position locates such an entry within the list.
         Otherwise return not_present and position becomes invalid. */
{
   int s = the_list.size();
   for (position = 0; position < s; position++) {
      Record data;
      the_list.retrieve(position, data);
      if (data == target) return success;
   }
   return not_present;
}


print_out Function

void print_out(const char *search, double time, int comparisons, int searches)
/* Pre:  search is a string describing a search.
   Post: Statistics about the search are printed out. */
{
   cout << "The search " << search << " took " << time << " seconds and "
        << comparisons << " comparisons to make " << searches << " searches." << endl;

   // Note: comparisons / searches is integer division, so the average is truncated.
   cout << "This results in an average search time of " << time / searches
        << " and an average number of comparisons of " << comparisons / searches
        << "." << endl;
}


Sample Output

• Here is a sample of the output of the testing program.

How many items should be stored in the list? 1000
How many searches should be performed? 100
The search Successful took 0.002343 seconds and 49474 comparisons to make 100 searches.
This results in an average search time of 2.343e-05 and an average number of comparisons of 494.
The search Unsuccessful took 0.004342 seconds and 100000 comparisons to make 100 searches.
This results in an average search time of 4.342e-05 and an average number of comparisons of 1000.

• We expected the average number of comparisons for a successful search to be 1001/2 = 500.5, which is slightly different from the measured 49474/100 = 494.74 (printed as 494 because print_out uses integer division).

• The time for a successful search was on average about half the time for an unsuccessful search.


Computational Complexity

• For successful searches:
  – The average number of comparisons is approximately half of the number of items in the list.
• For both successful and unsuccessful searches:
  – When the number of entries (and the number of comparisons) increases by a factor of 10, the run time increases by a little less than 10 times.
  – Both the run time and the number of comparisons are linear functions of the number of items.
  – There are a lot of details here that depend on the computer, compiler, language, programmer skill, etc.
  – We use a shorthand notation, O(n), to say the runtime is a linear function of the list size.

          Successful Searches           Unsuccessful Searches
n         Ave. comp.   Ave. time (s)    Ave. comp.   Ave. time (s)
10        5            3.70e-07         10           5.50e-07
100       49           2.46e-06         100          4.96e-06
1000      494          2.34e-05         1000         4.34e-05
10000     4942         0.00018757       10000        0.00033711
100000    49425        0.00126353       100000       0.00294095


Homework – Section 7.2 (page 276)

• Written
  – E3 (written on paper) (4 pts)
• Programming
  – E4 (email code) (8 pts)
  – P2 (email code and written report) (12 pts)


Binary Search

• If the list is ordered (say from the smallest key to the largest key) then we can do much better than sequential search.

• With binary search we can divide the list in two and eliminate the half that we know does not contain the desired key.

• We divide the list in half.

• Since the keys are ordered and the desired key is larger than the mid key we know that the desired entry (if it exists) is in the top half of the list.


Binary Search

• We have cut the size of the problem in half with one comparison!

• We can repeat the problem, resetting bottom and top to indicate the part of the list that still might contain the desired key.


Binary Search Termination

• There are several options when implementing the binary search algorithm.

• First, when do we terminate?
• There are two options for terminating the division:
  1. Stupid condition: We have a list with one entry (top == bottom).
     • With this method we might keep going after we have “found” the target key.
  2. Clever condition: We have a list with one entry or we find the key (top == bottom || data == target).
     • This has the penalty of an extra comparison.


Recursion

• We can also implement the binary search recursively or iteratively.

• This leaves us with 4 possible solutions:

  Recursive, Stupid     Recursive, Clever
  Iterative, Stupid     Iterative, Clever

• Which method is fastest and has the least number of comparisons?
  – Let’s implement them all and test them just the way we did with the sequential search.


Ordered Lists

• To enforce the fact that our list must be ordered, we will create an extension of the list class.

• The new class will be called Ordered_list and will redefine (replace) the inherited insert and replace methods, and add a new insert that takes no position.
  – The new versions will ensure that the list is always in sorted order.
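The idea of an insert that keeps the list sorted can be sketched without the book's List class. The version below is an illustration only: it assumes a std::vector<int> of keys, and ordered_insert is a made-up name. It relies on std::lower_bound, which returns the first position whose entry is not less than the new key, that is, the slot just after the last strictly smaller entry.

```cpp
#include <algorithm>
#include <vector>

// Insert key so that keys stays in nondecreasing order.
// Assumes keys is already sorted on entry.
void ordered_insert(std::vector<int> &keys, int key)
{
    // First position whose entry is >= key.
    std::vector<int>::iterator slot =
        std::lower_bound(keys.begin(), keys.end(), key);
    keys.insert(slot, key);
}
```

Inserting 5, 1, 7, 3 in that order into an empty vector leaves it holding 1, 3, 5, 7.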


Ordered List Class

class Ordered_list: public List<Record> {
public:
   Ordered_list();
   /* Post: The Ordered_list is initialized to be empty. */

   Error_code insert(const Record &data);
   /* Post: If the Ordered_list is not full, the function succeeds: the Record
            data is inserted into the list following the last entry of the list
            with a strictly lesser key (or in the first position if no element
            has a lesser key).
      Else: the function fails with the diagnostic Error_code overflow. */

   Error_code insert(int position, const Record &data);
   /* Post: If the Ordered_list is not full, 0 <= position <= n, where n is the
            number of elements in the list, and the Record data can be inserted
            at position in the list, without disturbing the list order, then the
            function succeeds: Any entry formerly in position and all later
            entries have their position numbers increased by 1 and data is
            inserted at position of the List.
      Else: the function fails with a diagnostic Error_code. */

   Error_code replace(int position, const Record &data);
   /* Post: If the entry at position can be replaced with data without disturbing
            the list order, then the function succeeds and the entry is replaced.
      Else: the function fails with a diagnostic Error_code. */
};


Ordered List Methods

Ordered_list::Ordered_list()
/* Post: The Ordered_list is initialized to be empty. */
{
   count = 0;
}

Error_code Ordered_list::insert(const Record &data)
/* Post: If the Ordered_list is not full, the function succeeds: the Record
         data is inserted into the list following the last entry of the list with
         a strictly lesser key (or in the first position if no element has a
         lesser key).
   Else: the function fails with the diagnostic Error_code overflow. */
{
   int s = size();
   int position;
   for (position = 0; position < s; position++) {
      Record list_data;
      retrieve(position, list_data);
      if (data <= list_data) break;   // first entry that is not strictly less than data
   }
   return List<Record>::insert(position, data);
}


Ordered List Methods

Error_code Ordered_list::insert(int position, const Record &data)
/* Post: If the Ordered_list is not full, 0 <= position <= n, where n is the
         number of elements in the list, and the Record data can be inserted at
         position in the list, without disturbing the list order, then the
         function succeeds: Any entry formerly in position and all later entries
         have their position numbers increased by 1 and data is inserted at
         position of the List.
   Else: the function fails with a diagnostic Error_code. */
{
   Record list_data;
   if (position > 0) {                 // data must not be smaller than its predecessor
      retrieve(position - 1, list_data);
      if (data < list_data) return fail;
   }
   if (position < size()) {            // data must not be larger than its successor
      retrieve(position, list_data);
      if (data > list_data) return fail;
   }
   return List<Record>::insert(position, data);
}


Ordered List Methods

Error_code Ordered_list::replace(int position, const Record &data)
/* Post: If the entry at position can be replaced with data without
         disturbing the list order, then the function succeeds and the entry is
         replaced.
   Else: the function fails with a diagnostic Error_code. */
{
   Record list_data;
   if (position > 0) {                 // data must not be smaller than its predecessor
      retrieve(position - 1, list_data);
      if (data < list_data) return fail;
   }
   if (position + 1 < size()) {        // a successor exists only before the last entry
      retrieve(position + 1, list_data);
      if (data > list_data) return fail;
   }
   return List<Record>::replace(position, data);
}


Binary Search Algorithm

• Binary search is famous for being coded incorrectly – be careful.

• We need to carefully define our variables:
  – top and bottom will be indices enclosing the part of the list in which we are searching for the target key.
• At each step we will reduce the region between top and bottom by about half.
• The following is our loop invariant:
  – The target key, provided it is present in the list, will be found between the indices bottom and top inclusive.
• We will start with the following values:
  – bottom = 0
  – top = list.size() – 1


Binary Search Algorithm

• To actually do the searching we calculate the midpoint in the list:
    mid = (bottom + top) / 2
• We will compare the target key to the key at position mid.
  – If the target key is greater than the key at position mid then the target can only lie in the top half of the list.
    • bottom = mid + 1.
  – If the target key is less than or equal to the key at position mid then the target can only lie in the bottom half of the list.
    • top = mid.
• This process repeats until top <= bottom.
  – Alternatively we could also terminate when the target key == the key at position mid.

• The process can be either iterative or recursive.
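The steps above can be sketched as a self-contained function. This is only an illustration under stated assumptions: it works on a sorted std::vector<int> rather than the Ordered_list class, it returns an index (or -1) instead of an Error_code, and the name stupid_binary_search and the comparisons counter parameter are made up here.

```cpp
#include <vector>

// Binary search that tests for equality only once, after the range has
// shrunk to a single entry. Returns the index of target in keys, or -1 if
// it is absent. comparisons is incremented once per key comparison.
int stupid_binary_search(const std::vector<int> &keys, int target, long &comparisons)
{
    if (keys.empty()) return -1;
    int bottom = 0, top = static_cast<int>(keys.size()) - 1;
    while (bottom < top) {
        // Same value as (bottom + top) / 2, written so the sum cannot overflow.
        int mid = bottom + (top - bottom) / 2;
        ++comparisons;
        if (keys[mid] < target) bottom = mid + 1;   // target can only lie above mid
        else top = mid;                             // target is at mid or below
    }
    ++comparisons;                                  // the single equality test
    return keys[bottom] == target ? bottom : -1;
}
```

On the keys 1 through 10, searching for 7 returns index 6 after 5 comparisons, and searching for the absent key 11 returns -1 after 4 comparisons.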


Stupid Recursive Algorithm

Simplification Step: If the target key > the key at position mid then repeat the problem with
    bottom = mid + 1
Otherwise repeat with
    top = mid

Base Case: If top <= bottom then the list has at most one entry. Check this entry to see if it is the target.


Stupid Recursive Version

Error_code recursive_binary_1(const Ordered_list &the_list, const Key &target,
                              int bottom, int top, int &position)
/* Pre:  The indices bottom to top define the range to search for the target.
   Post: If a Record in the range from bottom to top in the_list has key equal to
         target, then position locates one such entry and success is returned.
         Otherwise, not_present is returned and position becomes undefined. */
{
   Record data;
   if (bottom < top) {                 // List has more than one entry.
      int mid = (bottom + top) / 2;
      the_list.retrieve(mid, data);
      if (data < target)               // Reduce to top half of the list.
         return recursive_binary_1(the_list, target, mid + 1, top, position);
      else                             // Reduce to bottom half of the list.
         return recursive_binary_1(the_list, target, bottom, mid, position);
   }
   else if (top < bottom)
      return not_present;              // List is empty.
   else {                              // List has exactly one entry.
      position = bottom;
      the_list.retrieve(bottom, data);
      if (data == target) return success;
      else return not_present;
   }
}


Stupid Recursive Version

• So that a user of this algorithm can call it like any other searching algorithm, we introduce a simple function to arrange the parameters into the correct format for the recursion.

Error_code run_recursive_binary_1(const Ordered_list &the_list, const Key &target, int &position)
/* Post: If a Record in the_list has key equal to target, then position locates
         one such entry and a code of success is returned. Otherwise, the
         Error_code of not_present is returned and position becomes undefined.
   Uses: recursive_binary_1 and methods of the classes Ordered_list and Record. */
{
   return recursive_binary_1(the_list, target, 0, the_list.size() - 1, position);
}


Stupid Iterative Algorithm

• Since the recursion is tail recursion, it is fairly simple to write an iterative version of the same algorithm.

• In this case we do not need a special function just to set up the correct parameters.


Stupid Iterative Version

Error_code binary_search_1(const Ordered_list &the_list, const Key &target, int &position)
/* Post: If a Record in the_list has key equal to target, then position
         locates one such entry and a code of success is returned. Otherwise,
         the Error_code of not_present is returned and position becomes undefined.
   Uses: Methods of the classes Ordered_list and Record. */
{
   Record data;
   int bottom = 0, top = the_list.size() - 1;
   while (bottom < top) {
      int mid = (bottom + top) / 2;
      the_list.retrieve(mid, data);
      if (data < target)
         bottom = mid + 1;             // Reduce to top half of the list.
      else
         top = mid;                    // Reduce to bottom half of the list.
   }
   if (top < bottom) return not_present;   // List is empty.
   else {
      position = bottom;
      the_list.retrieve(bottom, data);
      if (data == target) return success;
      else return not_present;
   }
}


Clever Versions

• If we check to see if the target key == the key at position mid then we might get lucky and get to quit early.

• The modifications to the code are fairly simple.

• Will the possibility of quitting early be worth the extra comparison at each step?– We will run an experiment to see.
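The trade-off can be made concrete by counting comparisons in a stand-alone version. The sketch below is an assumption-laden illustration: it uses a sorted std::vector<int> instead of Ordered_list, returns an index or -1, and clever_binary_search and its comparisons parameter are hypothetical names. Each pass through the loop costs two comparisons unless the equality test succeeds.

```cpp
#include <vector>

// Binary search that tests for equality at every step and stops early on a hit.
// Returns the index of target in keys, or -1 if it is absent.
// comparisons is incremented once per key comparison.
int clever_binary_search(const std::vector<int> &keys, int target, long &comparisons)
{
    int bottom = 0, top = static_cast<int>(keys.size()) - 1;
    while (bottom <= top) {
        int mid = bottom + (top - bottom) / 2;
        ++comparisons;                              // equality test
        if (keys[mid] == target) return mid;
        ++comparisons;                              // ordering test
        if (keys[mid] < target) bottom = mid + 1;   // discard mid and everything below
        else top = mid - 1;                         // discard mid and everything above
    }
    return -1;                                      // range became empty
}
```

On the keys 1 through 10, the middle key 5 is found with a single comparison, but finding 7 takes 7 comparisons and an unsuccessful search for 11 takes 8.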


Clever Recursive Version

Error_code recursive_binary_2(const Ordered_list &the_list, const Key &target,
                              int bottom, int top, int &position)
/* Pre:  The indices bottom to top define the range in the list to search for the
         target.
   Post: If a Record in the range of locations from bottom to top in the_list has
         key equal to target, then position locates one such entry and a code of
         success is returned. Otherwise, the Error_code of not_present is returned
         and position becomes undefined.
   Uses: recursive_binary_2 and methods of the classes Ordered_list and Record. */
{
   Record data;
   if (bottom <= top) {
      int mid = (bottom + top) / 2;
      the_list.retrieve(mid, data);
      if (data == target) {
         position = mid;
         return success;
      }
      else if (data < target)          // Reduce to top half of the list.
         return recursive_binary_2(the_list, target, mid + 1, top, position);
      else                             // Reduce to bottom half of the list.
         return recursive_binary_2(the_list, target, bottom, mid - 1, position);
   }
   else return not_present;            // List is empty.
}


Clever Iterative Version

Error_code binary_search_2(const Ordered_list &the_list, const Key &target, int &position)
/* Post: If a Record in the_list has key equal to target, then position
         locates one such entry and a code of success is returned. Otherwise,
         the Error_code of not_present is returned and position becomes undefined.
   Uses: Methods of the classes Ordered_list and Record. */
{
   Record data;
   int bottom = 0, top = the_list.size() - 1;
   while (bottom <= top) {
      position = (bottom + top) / 2;
      the_list.retrieve(position, data);
      if (data == target) return success;
      if (data < target)
         bottom = position + 1;        // Reduce to top half of the list.
      else
         top = position - 1;           // Reduce to bottom half of the list.
   }
   return not_present;
}


Modify Main

• The main function we used to test the sequential search can be modified in a fairly obvious manner to test these 4 different versions of binary search.
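One way to avoid copying the test loop four times is to pass the search routine in as a parameter. The sketch below shows the idea only; it is not the slides' main or test_search. It assumes searches over a std::vector<int> that return an index or -1 and add to a comparison counter, and it tries every stored key once rather than drawing random targets. The names Search, average_successful_comparisons and sequential are hypothetical.

```cpp
#include <functional>
#include <vector>

// Signature shared by the searches under test (an assumption of this sketch):
// return the index found or -1, and add to comparisons.
typedef std::function<int(const std::vector<int> &, int, long &)> Search;

// Average comparisons over one successful search per stored key (assumes n > 0).
double average_successful_comparisons(const Search &search, int n)
{
    std::vector<int> keys;
    for (int i = 0; i < n; i++) keys.push_back(2 * i + 1);   // odd keys only
    long comparisons = 0;
    for (int i = 0; i < n; i++) search(keys, keys[i], comparisons);
    return static_cast<double>(comparisons) / n;
}

// Plain sequential search, included so the harness has something to run.
int sequential(const std::vector<int> &keys, int target, long &comparisons)
{
    for (int i = 0; i < static_cast<int>(keys.size()); i++) {
        ++comparisons;
        if (keys[i] == target) return i;
    }
    return -1;
}
```

Calling average_successful_comparisons(sequential, 1000) yields 500.5; any binary search written with the same signature can be passed in place of sequential.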


Comparison of Methods

• First, let’s look at successful searches.

• The clever versions are faster for short lists, but as the lists get longer, eventually the stupid versions win.
  – If comparisons were more expensive, the stupid version would be the clear winner.

• Iterative versions are generally a little faster than the recursive versions.

• All of these are much faster than sequential search for long lists.

Successful searches (Comp = average comparisons, Time = average time in seconds)

        Sequential          Stupid, recursive   Stupid, iterative   Clever, recursive   Clever, iterative
n       Comp   Time         Comp   Time         Comp   Time         Comp   Time         Comp   Time
10      5      3.70e-07     4      4e-07        4      3.5e-07      4      3e-07        4      2.3e-07
100     49     2.46e-06     7      1.08e-06     7      9.8e-07      10     7.8e-07      10     6.9e-07
1000    494    2.34e-05     10     5.17e-06     10     4.29e-06     17     4.36e-06     17     4.15e-06
10000   4942   0.00018757   14     4.772e-05    14     2.928e-05    23     2.687e-05    23     2.643e-05
100000  49425  0.00126353   17     0.0003288    17     0.00024821   30     0.00025079   30     0.00025482


Comparison of Methods

• Next, let’s look at unsuccessful searches.

• Unlike sequential search the run times are similar between successful and unsuccessful searches.

• The clever version needs even more comparisons than it does for successful searches; the stupid version needs the same number.

• Otherwise the results are similar.

Unsuccessful searches (Comp = average comparisons, Time = average time in seconds)

        Sequential          Stupid, recursive   Stupid, iterative   Clever, recursive   Clever, iterative
n       Comp   Time         Comp   Time         Comp   Time         Comp   Time         Comp   Time
10      10     5.50e-07     4      3.1e-07      4      2.7e-07      7      2.9e-07      7      3.8e-07
100     100    4.96e-06     7      1.1e-06      7      9.7e-07      13     8.2e-07      13     1e-06
1000    1000   4.34e-05     10     4.39e-06     10     4.28e-06     19     4.41e-06     20     5.89e-06
10000   10000  0.00033711   14     3.107e-05    14     2.613e-05    26     2.68e-05     26     2.701e-05
100000  100000 0.00294095   17     0.0003201    17     0.0003494    33     0.00026173   33     0.00025522


Conclusions

• It seems that the clever version is not worth the trouble, particularly if comparisons are expensive.– For example comparing strings.

• Similarly, the recursive version has slightly worse performance and the iterative version may be easier to understand.

• Winner – the stupid iterative version!


Computational Complexity

• Notice the relationship between the number of comparisons and the logarithm of the size of the list.

• We say that binary search is O(log n).

n         Comp.   log2 n
10        4       3.32
100       7       6.64
1000      10      9.97
10000     14      13.29
100000    17      16.61
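The relationship in the table can be checked directly. The sketch below is an illustration, not the slides' test program: it counts comparisons for the first ("stupid") binary search on a std::vector<int> holding 1 through n, and verifies that every successful search costs between floor(log2 n) + 1 and ceil(log2 n) + 1 comparisons. The names stupid_search_comparisons and within_log_bounds are hypothetical.

```cpp
#include <cmath>
#include <vector>

// Number of key comparisons the equality-at-the-end binary search makes.
long stupid_search_comparisons(const std::vector<int> &keys, int target)
{
    long comparisons = 0;
    int bottom = 0, top = static_cast<int>(keys.size()) - 1;
    while (bottom < top) {
        int mid = bottom + (top - bottom) / 2;
        ++comparisons;
        if (keys[mid] < target) bottom = mid + 1;
        else top = mid;
    }
    if (top == bottom) ++comparisons;   // final equality test on the one remaining entry
    return comparisons;
}

// True if every successful search in a list of n keys stays within the
// logarithmic bounds described above (assumes n > 0).
bool within_log_bounds(int n)
{
    std::vector<int> keys;
    for (int i = 1; i <= n; i++) keys.push_back(i);
    double lg = std::log2(static_cast<double>(n));
    long low = static_cast<long>(std::floor(lg)) + 1;
    long high = static_cast<long>(std::ceil(lg)) + 1;
    for (int key : keys) {
        long c = stupid_search_comparisons(keys, key);
        if (c < low || c > high) return false;
    }
    return true;
}
```

For n = 1000, for example, log2 n is about 9.97, so every successful search takes 10 or 11 comparisons, consistent with the average of 10 in the table.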


Homework – Section 7.3 (page 285)

• Written
  – E1 (a, b, c, d) (written on paper) (2 pts each)


Comparison trees

• Our analysis of binary search algorithms depended on:
  – A particular implementation
  – A particular computer
  – A particular operating system
  – A particular language
  – A particular compiler

• It would be nice to have a general analysis that would avoid all these issues.

• One method is to construct a comparison tree (decision tree or search tree).

• This tree represents each comparison (or decision) in the algorithm.


Sequential Search Comparison Tree

• Suppose we are searching the list 1, 2, 3, …, n using sequential search.

• The following is the comparison tree.
  – Each circle represents a comparison.
  – Each square represents a possible result of the search.
  – The F result means the target key was not in the list.

• First the target is compared to entry 1.
  – If they are equal we have found the target and we are done.
  – If they are not equal then move on to entry 2.
  – etc.
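The walk down this tree can be sketched in Python as follows. The function name `sequential_search` and its return shape are assumptions made for illustration. Each loop iteration corresponds to one circle in the tree, and returning `None` corresponds to the F result.

```python
def sequential_search(entries, target):
    """Return (position or None, number of key comparisons made)."""
    comparisons = 0
    for position, key in enumerate(entries):
        comparisons += 1              # one circle in the comparison tree
        if key == target:
            return position, comparisons
    return None, comparisons          # the F leaf: target not in the list

entries = list(range(1, 11))
print(sequential_search(entries, 1))    # (0, 1)
print(sequential_search(entries, 10))   # (9, 10)
print(sequential_search(entries, 42))   # (None, 10)
```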


Stupid Binary Search Comparison Tree

• Suppose we want to search the list 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 using the stupid version of binary search.

• This is the search tree.

• The height of the tree represents the number of comparisons in the worst case.
  – In this case there might be 5 comparisons, but many cases need only 4.


Root, Leaf and Path Length

• The initial comparison is called the root of the tree and will be made in all cases.

• Each ultimate result is a leaf of the tree.

• The path length is the number of interior vertices (circles) between the root and the leaf.

• The path length for a particular target key corresponds to the number of comparisons needed for the search.

• For example, if the target key is 7, the path involves the following comparisons.
  – Compare to 5 (greater than)
  – Compare to 8 (less than or equal to)
  – Compare to 7 (less than or equal to)
  – Compare to 6 (greater than)
  – Compare to 7 (equal)
  – The total number of comparisons is 5
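The path for target key 7 can be reproduced with a small Python sketch. It assumes the stupid version keeps halving the range with a single "<" comparison and tests for equality only once at the end. The function name `stupid_search_trace` is hypothetical.

```python
def stupid_search_trace(data, target):
    """Return (found, list of keys the target was compared against)."""
    bottom, top = 0, len(data) - 1
    compared = []
    while bottom < top:
        mid = (bottom + top) // 2
        compared.append(data[mid])    # "<" comparison against the middle key
        if data[mid] < target:
            bottom = mid + 1
        else:
            top = mid
    if not data:
        return False, compared
    compared.append(data[bottom])     # single equality comparison at the end
    return data[bottom] == target, compared

found, path = stupid_search_trace(list(range(1, 11)), 7)
print(found, path)                    # True [5, 8, 7, 6, 7]
```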


Stupid Average Path Length

• We want to know the average number of comparisons.

• For the searches using the stupid version:
  – 12 paths of length 4
  – 8 paths of length 5

• All of these paths begin at the root and end at a leaf and are called external paths.

• Adding up the lengths of all the external paths produces the external path length of the tree.

• The average number of comparisons in a search is $\frac{12 \cdot 4 + 8 \cdot 5}{20} = \frac{88}{20} = 4.4$.
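The path counts can be checked with a short Python sketch, again assuming the equality-last version of the search. The helper name `comparisons` is hypothetical. Each final equality test has two leaves at the same depth (found and not found), so the tally over the 10 keys is doubled to cover all 20 leaves.

```python
from collections import Counter

def comparisons(data, target):
    """Number of key comparisons in the equality-last binary search."""
    bottom, top, count = 0, len(data) - 1, 0
    while bottom < top:
        mid = (bottom + top) // 2
        count += 1
        if data[mid] < target:
            bottom = mid + 1
        else:
            top = mid
    return count + 1                  # plus the final equality test

keys = list(range(1, 11))
per_key = Counter(comparisons(keys, k) for k in keys)
# Two leaves (found / not found) hang below every final equality test.
paths = {length: 2 * count for length, count in per_key.items()}
external_path_length = sum(length * count for length, count in paths.items())
average = external_path_length / sum(paths.values())
print(sorted(paths.items()), external_path_length, average)
```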


Clever Binary Search Comparison Tree

• Suppose we want to search the list 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 using the clever version of binary search.

• This is the search tree.

• In this case there might be anywhere between 1 and 8 comparisons.


Clever Binary Search Comparison Tree

• The tree for the clever method is somewhat complicated.

• We can simplify it by combining pairs of comparisons into a single circle.

• Here a circle represents:
  – One comparison if the target key is found.
  – Two comparisons if the target is not found.


Clever Path Length

• In this clever implementation, if the target key is 7, the path involves the following comparisons.
  – 5 (not equal)
  – 5 (greater than)
  – 8 (not equal)
  – 8 (less than)
  – 6 (not equal)
  – 6 (greater than)
  – 7 (equal)
  – There are a total of 7 comparisons.
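A hedged Python sketch of the clever version that records each comparison, assuming it tests for equality first and then uses a second comparison to pick a half. The function name `clever_search_trace` and the labels are illustrative only. The labels describe the target relative to the key, as in the list above.

```python
def clever_search_trace(data, target):
    """Return (found, list of (key, outcome) pairs, one per comparison)."""
    bottom, top = 0, len(data) - 1
    trace = []
    while bottom <= top:
        mid = (bottom + top) // 2
        if data[mid] == target:
            trace.append((data[mid], "equal"))
            return True, trace
        trace.append((data[mid], "not equal"))
        if data[mid] < target:
            trace.append((data[mid], "greater than"))   # target > key
            bottom = mid + 1
        else:
            trace.append((data[mid], "less than"))      # target < key
            top = mid - 1
    return False, trace

found, trace = clever_search_trace(list(range(1, 11)), 7)
for key, outcome in trace:
    print(key, outcome)
print(found, len(trace))              # True 7
```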


Clever Average Successful Path Length

• Now a search could terminate at any vertex.
  – Successful searches end at interior vertices.
  – Unsuccessful searches end at leaves.

• To calculate the average number of comparisons for a successful search we need the length of all the paths from the root to an interior vertex.
  – This is the interior path length of the tree.


Clever Average Successful Path Length

• There are 10 interior paths with lengths:
  – 1 path of length 0 (one comparison)
  – 2 paths of length 1 (three comparisons)
  – 4 paths of length 2 (five comparisons)
  – 3 paths of length 3 (seven comparisons)

• The total interior path length is $1 \cdot 0 + 2 \cdot 1 + 4 \cdot 2 + 3 \cdot 3 = 19$.

• Every vertex in the path represents 2 comparisons plus one for the terminating node.

• The average number of comparisons in a successful search is $\frac{2 \cdot 19 + 10}{10} = \frac{48}{10} = 4.8$.
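The interior path length can be recomputed with a Python sketch that walks the clever search for every key in the list 1 to 10. The helper name `depth_of` is hypothetical, and the sketch assumes the midpoint is chosen as `(bottom + top) // 2`.

```python
def depth_of(data, target):
    """Number of vertices above the vertex where target is found."""
    bottom, top, depth = 0, len(data) - 1, 0
    while bottom <= top:
        mid = (bottom + top) // 2
        if data[mid] == target:
            return depth
        if data[mid] < target:
            bottom = mid + 1
        else:
            top = mid - 1
        depth += 1
    raise ValueError("target is not in the list")

keys = list(range(1, 11))
depths = [depth_of(keys, k) for k in keys]
interior_path_length = sum(depths)
q = len(keys)                                     # one interior vertex per key
average = (2 * interior_path_length + q) / q      # 2 per vertex passed, 1 at the end
print(sorted(depths), interior_path_length, average)
```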


Clever Average Unsuccessful Path Length

• There are 11 unsuccessful searches using the clever version.

• They all end in leaves so we will calculate the external path length.

• There are:
  – 5 paths of length 3
  – 6 paths of length 4

• The total external path length is $5 \cdot 3 + 6 \cdot 4 = 39$.

• Every vertex in these paths represents 2 comparisons.

• Average number of comparisons is $\frac{2 \cdot 39}{11} = \frac{78}{11} \approx 7.09$.

• The big penalty for the clever version comes in the unsuccessful case.
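A sketch of the same count in Python, assuming the 11 unsuccessful outcomes are represented by targets that fall in the gaps around the keys 1 to 10 (0.5, 1.5, …, 10.5). The helper name `vertices_visited` is hypothetical.

```python
def vertices_visited(data, target):
    """Number of interior vertices passed before the clever search gives up."""
    bottom, top, visited = 0, len(data) - 1, 0
    while bottom <= top:
        mid = (bottom + top) // 2
        visited += 1
        if data[mid] == target:
            return visited            # not reached for the gap targets below
        if data[mid] < target:
            bottom = mid + 1
        else:
            top = mid - 1
    return visited

keys = list(range(1, 11))
gaps = [k + 0.5 for k in range(0, 11)]            # one target per leaf
lengths = [vertices_visited(keys, g) for g in gaps]
external_path_length = sum(lengths)
average = 2 * external_path_length / len(gaps)    # 2 comparisons per vertex
print(sorted(lengths), external_path_length, round(average, 2))
```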


Homework – Section 7.4 (page 296)

• Exercises 7.4 (page 296)
  – E1 (a, b, c, d) (written on paper) (3 pts each)
  – E2 (written on paper) (5 pts)

• Programming
  – P1 (email code, written report) (15 pts)


Extending to Larger Trees

• We want to extend our results to larger cases without going through the pain of actually drawing the decision trees.

• A 2-tree is a tree in which every vertex except the leaves has exactly two children.

• This means we can predict the maximum possible number of vertices at each level.

Level   Max. # of vertices
0       1
1       2
2       4
3       8
…       …
t       2^t


Extending to Larger Trees

• This means that if we know we have k vertices on level t then $k \le 2^t$, or equivalently $t \ge \lg k$.


Extending to Larger Trees

• We often want to round our results and there are two possibilities.
  – The floor of x (written $\lfloor x \rfloor$) is the largest integer less than or equal to x. (round down)
  – The ceiling of x (written $\lceil x \rceil$) is the smallest integer greater than or equal to x. (round up)

• Notice that $x - 1 < \lfloor x \rfloor \le x \le \lceil x \rceil < x + 1$.


Analysis of Stupid Method

• Suppose we are searching a list of n items.

• There are n successful outcomes and the last step is to check for equality with two possible outcomes.

• Therefore, there are 2n leaves.

• The number of levels on the tree must be $t = \lceil \lg 2n \rceil$.

• This is also the maximum number of comparisons.

• Notice that we can either end at level t or at level t-1 and that $t - 1 = \lceil \lg 2n \rceil - 1 \ge \lg 2n - 1 = \lg n$ and $t = \lceil \lg 2n \rceil < \lg 2n + 1 = \lg n + 2$.

• In all cases there will be between lg n and lg n + 2 comparisons.
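These bounds can be spot-checked numerically. The Python sketch below assumes the equality-last version of the search and only tests list sizes 1 to 200, so it is a sanity check, not a proof. The names `count_comparisons` and `check_bounds` are hypothetical.

```python
import math

def count_comparisons(data, target):
    """Key comparisons made by the equality-last binary search."""
    bottom, top, count = 0, len(data) - 1, 0
    while bottom < top:
        mid = (bottom + top) // 2
        count += 1
        if data[mid] < target:
            bottom = mid + 1
        else:
            top = mid
    return count + 1                  # final equality test

def check_bounds(max_n=200):
    for n in range(1, max_n + 1):
        data = list(range(n))
        counts = [count_comparisons(data, key) for key in data]
        t = math.ceil(math.log2(2 * n))           # height of the comparison tree
        if max(counts) != t:
            return False
        if not all(math.log2(n) <= c <= math.log2(n) + 2 for c in counts):
            return False
    return True

print(check_bounds())
```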


Analysis of Clever Method – Unsuccessful Searches

• In the clever method all unsuccessful searches end in leaves on the last two levels.

• If we are searching a list of n items then there are n+1 leaves.
  – Less than the smallest key.
  – Between each pair of adjacent keys.
  – More than the largest key.

• The height of the tree is $t = \lceil \lg(n+1) \rceil$.

• Each level in the tree corresponds to two comparisons and the leaves are on either level t or level t-1.

• This means the number of comparisons will be between $2(t-1) = 2\lceil \lg(n+1) \rceil - 2$ and $2t = 2\lceil \lg(n+1) \rceil$.

• As we have seen before this is around twice the number with the stupid method.
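A numerical spot check of this range, as a Python sketch. It assumes the clever version tests equality first at every step and models each unsuccessful outcome by a target half a step away from a key. The names `clever_comparisons` and `check_unsuccessful` are hypothetical, and only list sizes 1 to 200 are tested.

```python
import math

def clever_comparisons(data, target):
    """Key comparisons made by the equality-first binary search."""
    bottom, top, count = 0, len(data) - 1, 0
    while bottom <= top:
        mid = (bottom + top) // 2
        count += 1                    # equality test
        if data[mid] == target:
            return count
        count += 1                    # "<" test to choose a half
        if data[mid] < target:
            bottom = mid + 1
        else:
            top = mid - 1
    return count

def check_unsuccessful(max_n=200):
    for n in range(1, max_n + 1):
        data = list(range(n))
        t = math.ceil(math.log2(n + 1))
        gaps = [k - 0.5 for k in range(n + 1)]    # n + 1 unsuccessful outcomes
        counts = {clever_comparisons(data, g) for g in gaps}
        if not counts <= {2 * (t - 1), 2 * t}:
            return False
    return True

print(check_unsuccessful())
```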


Internal vs. External Vertices

• To compute the average number of comparisons in a successful search we need a fact about the relationship between the internal and external path lengths of a 2-tree.
  – Let E be the external path length.
  – Let I be the internal path length.
  – Let q be the number of internal vertices (not leaves).

• It is a general fact that E = I + 2q.


Internal vs. External Vertices

• To see that E = I + 2q we need to use a proof by induction.

• Base case: Suppose a tree contains only the root. In this case E = I = q = 0 so the equation is true.

• Induction step: We build a larger 2-tree from a simpler one.
  – Suppose we have a 2-tree (with values $E_1$, $I_1$ and $q_1$) where $E_1 = I_1 + 2q_1$.
  – Pick a leaf v with path length k from the root.
  – Add two children to v so that it is no longer a leaf.
  – This produces a new 2-tree (with values $E_2$, $I_2$ and $q_2$).
  – Notice that v is in both trees but in the new tree it is no longer a leaf so $q_2 = q_1 + 1$.
  – Also the internal path length is now $I_2 = I_1 + k$.
  – Finally, there are two new leaves at level k + 1 but one fewer leaf at level k.
  – This means $E_2 = E_1 + 2(k+1) - k$.
  – Now notice that $E_2 = E_1 + 2(k+1) - k = I_1 + 2q_1 + k + 2 = (I_1 + k) + 2(q_1 + 1) = I_2 + 2q_2$.
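The induction step can be mirrored in a few lines of Python. This is an illustrative sketch, not part of the proof: it grows a 2-tree by repeatedly replacing a randomly chosen leaf with an internal vertex that has two leaf children, and checks E = I + 2q after every step. The function name `grow_two_tree` and the list-of-leaf-depths representation are assumptions.

```python
import random

def grow_two_tree(steps, seed=0):
    """Grow a 2-tree leaf by leaf and return (E, I, q) at the end."""
    rng = random.Random(seed)
    leaf_depths = [0]                 # base case: the tree is only the root
    internal_path_length = 0
    internal_count = 0
    for _ in range(steps):
        # Pick a leaf v at depth k and give it two children.
        k = leaf_depths.pop(rng.randrange(len(leaf_depths)))
        leaf_depths += [k + 1, k + 1]
        internal_path_length += k     # v is now an internal vertex
        internal_count += 1
        external_path_length = sum(leaf_depths)
        assert external_path_length == internal_path_length + 2 * internal_count
    return sum(leaf_depths), internal_path_length, internal_count

E, I, q = grow_two_tree(1000)
print(E, I, q, E == I + 2 * q)
```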


Analysis of Clever Method – Successful Searches

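• In the clever method the path length to the leaves is either $\lfloor \lg(n+1) \rfloor$ or $\lceil \lg(n+1) \rceil$.

• There are n+1 leaves so the external path length is approximately $E \approx (n+1)\lg(n+1)$.

• Each internal node corresponds to a unique list key, so q = n.

• This means $I = E - 2q \approx (n+1)\lg(n+1) - 2n$.

• Recall that the total number of comparisons over all n successful searches is 2I + q.
  – Every node on each path makes 2 comparisons.
  – The terminating node makes 1.

• Thus the average number of comparisons is $\frac{2I + q}{n} \approx \frac{2(n+1)}{n}\lg(n+1) - 3$.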
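To see how close the estimate $\frac{2(n+1)}{n}\lg(n+1) - 3$ comes to the true average, the Python sketch below computes the exact internal path length of the clever comparison tree and compares the two. It assumes the midpoint rule `(bottom + top) // 2`. The function names are hypothetical.

```python
import math

def success_depth(n, index):
    """Depth of the vertex where the clever search finds position `index`."""
    bottom, top, depth = 0, n - 1, 0
    while True:
        mid = (bottom + top) // 2
        if mid == index:
            return depth
        if mid < index:
            bottom = mid + 1
        else:
            top = mid - 1
        depth += 1

def exact_average(n):
    """Exact average comparisons over all n successful searches: (2I + q) / n."""
    internal_path_length = sum(success_depth(n, i) for i in range(n))
    return (2 * internal_path_length + n) / n

def estimate(n):
    """The approximation derived from E = I + 2q."""
    return 2 * (n + 1) / n * math.log2(n + 1) - 3

for n in (10, 100, 1000):
    print(n, round(exact_average(n), 3), round(estimate(n), 3))
```

The estimate is slightly low because $(n+1)\lg(n+1)$ is only an approximation of the external path length when the leaves are spread over two levels.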


Analysis of Clever Method – Successful Searches

• Notice that for large n, $\frac{n+1}{n} \approx 1$ and $\lg(n+1) \approx \lg n$.

• So the average number of comparisons in a successful search is approximately $2\lg n - 3$.

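• Compared with the roughly 2 lg n comparisons of an unsuccessful clever search, the only thing different here is the -3; the stupid method needs between lg n and lg n + 2 comparisons in every case.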

• The clever method also carries a big penalty for unsuccessful searches.

• Moral:
  – For short lists (<= 8) use sequential search.
  – For longer lists use the stupid binary search.


Homework – Section 7.4 (page 297)

• Written
  – E3 (written on paper) (10 pts)


Asymptotics

• When we are talking about the run time of an algorithm we often are only worried about what happens for large problems.

• We also don’t want to focus on details that would depend on a particular system.

• We want to compare our run times to a “library” of basic functions.
  – g(n) = 1 (constant)
  – g(n) = log n (logarithmic)
  – g(n) = n (linear)
  – g(n) = n^2 (quadratic)
  – g(n) = n^3 (cubic)
  – g(n) = 2^n (exponential)


Asymptotics

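Test: $\lim_{n \to \infty} \frac{f(n)}{g(n)} = 0$
Conclusion: f(n) has a smaller order of magnitude than g(n). f(n) is growing slower than g(n).

Test: $\lim_{n \to \infty} \frac{f(n)}{g(n)}$ is finite (not 0, not infinity)
Conclusion: f(n) has the same order of magnitude as g(n). The growth of f(n) and g(n) only differs by a multiplied constant.

Test: $\lim_{n \to \infty} \frac{f(n)}{g(n)} = \infty$
Conclusion: f(n) has a larger order of magnitude than g(n). f(n) is growing faster than g(n).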
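The limit test can be explored numerically by evaluating the ratio f(n)/g(n) for growing n. The Python sketch below uses a hypothetical helper `ratios` and three example pairs chosen for illustration. A handful of sample values only suggests what the limit is; it does not establish it.

```python
import math

def ratios(f, g, sizes=(10, 100, 1000)):
    """Evaluate f(n) / g(n) at a few increasing values of n."""
    return [f(n) / g(n) for n in sizes]

# lg n against n: the ratio shrinks toward 0 (smaller order of magnitude).
print(ratios(math.log2, lambda n: n))

# 3n^2 + n against n^2: the ratio settles near 3 (same order of magnitude).
print(ratios(lambda n: 3 * n**2 + n, lambda n: n**2))

# 2^n against n^3: the ratio grows without bound (larger order of magnitude).
print(ratios(lambda n: 2**n, lambda n: n**3))
```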


Big O Notation

• We introduce a new notation to express the different asymptotic growth rates.

Notation    Name         Comparison   $\lim_{n \to \infty} f(n)/g(n)$
o(g(n))     little o     <            = 0
O(g(n))     big O        <=           >= 0, finite
Θ(g(n))     big theta    =            nonzero, finite
Ω(g(n))     big omega    >=           nonzero, could be infinite


Comparisons By Growth Rate

• A chart can make the relationships between our “library” functions clear.

n      1   lg n   n lg n   n^2         n^3         2^n
1      1   0      0        1           1           2
10     1   3.32   33       100         1000        1024
100    1   6.64   664      10,000      1,000,000   1.268 x 10^30
1000   1   9.97   9970     1,000,000   10^9        1.072 x 10^301
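A chart like this can be regenerated with a short Python sketch. The function name `growth_row` is hypothetical. The n lg n column is computed from the unrounded logarithm, so it may differ slightly from a chart that multiplies the rounded lg n value.

```python
import math

def growth_row(n):
    """Return the values of the library functions at n."""
    lg = math.log2(n)
    return (n, 1, round(lg, 2), round(n * lg), n**2, n**3, f"{2**n:.3e}")

for n in (1, 10, 100, 1000):
    print(*growth_row(n), sep="\t")
```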


Rules for Big O Calculations

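1. Ignore multiplied constants.
  – so $c \cdot f(n)$ is $O(f(n))$ for any constant $c > 0$.

2. Ignore all but the fastest growing term.
  – so $f(n) + g(n)$ is $O(f(n))$ whenever $g(n)$ is $O(f(n))$.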


Rules for Big O Calculations

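3. The base for logarithms doesn’t matter.
  – so $\log_a n$ is $\Theta(\log_b n)$ for any bases $a, b > 1$, because $\log_a n = \frac{\log_b n}{\log_b a}$.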
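A quick numerical illustration of rule 3 in Python: the ratio between logarithms in two different bases is the same constant for every n, so changing the base only multiplies by a constant. The helper name `base_ratio` and the choice of bases 2 and 10 are assumptions made for the example.

```python
import math

def base_ratio(n):
    """Ratio of the base-2 logarithm to the base-10 logarithm of n."""
    return math.log2(n) / math.log10(n)

for n in (10, 1000, 10**6, 10**9):
    print(n, base_ratio(n))

print(math.log2(10))                  # the constant every ratio above approximates
```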


Homework – Section 7.6 (page 312)

• Exercises 7.6 (page 312)
  – E1 (a, b, c, d) (written on paper) (2 pts each)
  – E2 (written on paper) (6 pts)
  – E5 (a, b, c, d, e, f, g, h) (written on paper) (1 pt each)
  – E6 (a, b, c, d) (written on paper) (2 pts each)