The Linear and Binary Search and more Lecture Notes 9


Linear Search

Suppose we have a vector of ints. How do we search for a target?

- Write the code to do this.
- Analyze your code.

Is it working under all conditions? Have you tested all possible inputs?

Linear Search Algorithm

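    public int linearSearch(int target) {
        // A is assumed to be a Vector<Integer> field of the enclosing class
        for (int i = 0; i < A.size(); i++)
            if (A.elementAt(i) == target)
                return i;      // found: return the index
        return -1;             // not found
    }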

Linear Search - analysis

Worst case: the target is not in the list, or is the last item
- N comparisons
- Order(n)

Best case: the target is at the first position
- Order(1) complexity

Average case (target present, every position equally likely)
- (n+(n-1)+(n-2)+…+1)/n = (n+1)/2 comparisons
- Average complexity is Order(n)
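The average-case figure above can be checked by counting comparisons directly. The following is a minimal C++ sketch, not the course's reference code; the names linearSearchCounted and averageComparisons are hypothetical, and it assumes a non-empty vector of distinct values with every position equally likely to hold the target.

```cpp
#include <cstddef>
#include <vector>

// Linear search that also reports how many element comparisons it made.
// Returns the index of target, or -1 if it is absent.
int linearSearchCounted(const std::vector<int>& a, int target, int& comparisons) {
    comparisons = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        ++comparisons;
        if (a[i] == target) return static_cast<int>(i);
    }
    return -1;
}

// Average number of comparisons over all successful searches.
// Assumes a is non-empty and holds distinct values (hypothetical helper).
double averageComparisons(const std::vector<int>& a) {
    long total = 0;
    for (int value : a) {
        int c = 0;
        linearSearchCounted(a, value, c);
        total += c;
    }
    return static_cast<double>(total) / a.size();
}
```

For the five values 12 34 56 89 98 the searches cost 1, 2, 3, 4 and 5 comparisons, so the average is 15/5 = 3, which matches (n+1)/2 for n = 5.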

Binary Search

Linear search is highly time-consuming for large arrays (vectors).

So what if the array is sorted? Is there a better way to do the search? Answer: yes. Look at the middle element:

- If the target is smaller, look in the left sub-array
- If the target is larger, look in the right sub-array

Binary Search ctd..

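Consider 12 34 56 89 98. With n = 5 elements, the middle element is at position (n+1)/2 = 3 (counting from 1), which is 56. If the target is 15, then 15 < 56, so

- Search left (12 34)
- Otherwise (target larger than the middle element) search right

Write the code. Analyze the algorithm.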

Binary Search Algorithm

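    public int binarySearch(int target) {
        // A is assumed to be a Vector<Integer> sorted in ascending order
        int first = 0, last = A.size() - 1;
        while (first <= last) {
            int middle = (first + last) / 2;
            if (A.elementAt(middle) > target) {
                last = middle - 1;       // target can only be in the left half
            } else if (A.elementAt(middle) < target) {
                first = middle + 1;      // target can only be in the right half
            } else {
                return middle;           // found
            }
        }
        return -1;                       // not found
    }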

Analysis of the Binary Search
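Each pass through the loop discards half of the remaining range, so a vector of n elements needs at most floor(log2 n) + 1 probes: Order(log n) in the worst case and Order(1) in the best case, when the target sits in the middle. The C++ sketch below counts probes to illustrate this bound. The names binarySearchCounted and maxProbes are hypothetical, and the vector is assumed to be sorted in ascending order.

```cpp
#include <vector>

// Binary search over a sorted vector; probes counts how many middle
// elements were inspected. Returns the index of target, or -1.
int binarySearchCounted(const std::vector<int>& a, int target, int& probes) {
    probes = 0;
    int first = 0;
    int last = static_cast<int>(a.size()) - 1;
    while (first <= last) {
        int middle = first + (last - first) / 2;  // same midpoint, no overflow
        ++probes;
        if (a[middle] > target) {
            last = middle - 1;
        } else if (a[middle] < target) {
            first = middle + 1;
        } else {
            return middle;
        }
    }
    return -1;
}

// floor(log2(n)) + 1 for n >= 1: the most probes the loop can need.
int maxProbes(int n) {
    int k = 0;
    while (n > 0) {
        n /= 2;
        ++k;
    }
    return k;
}
```

Searching 12 34 56 89 98 for 15 probes 56, then 12, then 34 and gives up: three probes, which equals floor(log2 5) + 1.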

Other Search Algorithms

Breadth-first search is one of the two most common graph search algorithms (the other is depth-first search)

The approach of breadth-first search is to start with a queue containing a list of nodes to visit

A node is a state or a value; in programming it is usually represented as a structure containing particular information about the environment or domain of the problem.

Other Search Algorithms ctd..

The algorithm starts by placing the initial state of the problem into the head (beginning) of the queue

The search then proceeds by visiting the first node and adding all nodes connected to that node to the queue. When viewed as a tree graph, it would move from left to right from the current node (usually represented as a circle on a graph) along links (represented as connecting lines in between the node circles) to connected nodes, adding the connected nodes to the queue.

As the head node of the queue is visited it is removed. The search then moves to the next node on the queue and continues until the goal is reached.
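The queue-based procedure described above can be sketched in C++ as follows. This is an illustration under stated assumptions rather than the course's code: nodes are numbered 0 to n-1, the links are stored as adjacency lists, and the function name breadthFirstSearch is hypothetical. The seen vector is an addition not mentioned in the notes; it stops a node from being queued twice when the graph is not a tree.

```cpp
#include <queue>
#include <vector>

// Breadth-first search on a graph stored as adjacency lists.
// Returns the nodes in the order they were visited, stopping at goal.
std::vector<int> breadthFirstSearch(const std::vector<std::vector<int>>& adjacent,
                                    int start, int goal) {
    std::vector<bool> seen(adjacent.size(), false);
    std::vector<int> visited;
    std::queue<int> toVisit;

    toVisit.push(start);          // initial state goes to the head of the queue
    seen[start] = true;

    while (!toVisit.empty()) {
        int node = toVisit.front();
        toVisit.pop();            // the head node is removed as it is visited
        visited.push_back(node);
        if (node == goal) break;  // goal reached

        for (int next : adjacent[node]) {
            if (!seen[next]) {    // queue each connected node only once
                seen[next] = true;
                toVisit.push(next);
            }
        }
    }
    return visited;
}
```

On a small tree where node 0 links to 1 and 2, node 1 links to 3 and 4, and node 2 links to 5, a search from 0 for goal 5 visits 0, 1, 2, 3, 4, 5: level by level, left to right.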

Sorts

Many programs will execute more efficiently if the data they process is sorted before processing begins.

- We first looked at a linear search
  - it doesn’t care whether the data is sorted or not
  - the algorithm starts at the first element in the vector and looks at every element in the vector, in sequence, until it finds what it is looking for or comes to the end of the vector
- With small data sets this algorithm performs acceptably
- If the data sets are of significant size, then performance can become unacceptable

We have since looked at Binary Search, which improves performance, IF the data is sorted

Sorts

If we know that a data set is sorted in some order (lowest to greatest, largest to smallest, highest priority to lowest priority), then we can write searches to take advantage of this fact.

- If you have misplaced your calculator in the library in the afternoon, you do not have to retrace your steps from when you arrived on campus in the morning to find it
  - You start your search in the library
- When looking up a phone number in the phone book, you do not start at page 1 and scan each page in sequence until you find the name you are looking for

Sorting Comparison

There are all kinds of sorts. For our purposes a sort is a rearrangement of data into either ascending or descending order.

Fast sorts versus slow sorts: O(N log2 N) versus O(N^2)

Slow sorts are easy to code and sufficient when the amount of data is small.

    N          N^2               N * log2(N)
    10         100               33
    100        10,000            664
    1,000      1,000,000         9,966
    10,000     100,000,000       132,877
    100,000    10,000,000,000    1,660,964

Bubble Sort - N^2 Sort

Strategy
- ‘bubble’ the smallest item to the left (slot 0)
- ‘bubble’ the next smallest item to slot 1
- ‘bubble’ the third smallest item to slot 2

Algorithm (for one pass)
    walk through the Vector
        if this item (in spot j) is smaller than the item in spot 0
            swap item in spot j with the item in spot 0

Code
    for (int j = 0; j < count; j++) {
        if (nums[j] < nums[0]) {
            int temp = nums[j];
            nums[j] = nums[0];
            nums[0] = temp;
        }
    }

Bubble Sort (con’t)

This code for one pass finds the smallest element in the vector and puts it into slot 0.

Now wrap this code in a loop that will find each succeeding smallest element and put it into the proper position in the Vector.

    // outer for loop controls the spot that gets the appropriate smallest value
    for (int left = 0; left < count - 1; left++) {
        // code from previous slide, with starting point being left + 1
        // instead of 0, and comparing nums[j] to nums[left]
        for (int j = left + 1; j < count; j++) {
            if (nums[j] < nums[left]) {
                int temp = nums[j];
                nums[j] = nums[left];
                nums[left] = temp;
            }
        }
    }
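For comparison, the variant most often presented under the name bubble sort swaps adjacent neighbours rather than comparing everything against one fixed slot. The sketch below is that adjacent-swap variant, not the code from these notes: it moves the largest remaining value to the right end on each pass, and the name bubbleSortAdjacent and the early-exit flag are assumptions added for illustration. It is still an N^2 sort.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Adjacent-swap bubble sort: each pass carries the largest remaining
// value to the right end of the unsorted part.
void bubbleSortAdjacent(std::vector<int>& nums) {
    for (std::size_t end = nums.size(); end > 1; --end) {
        bool swapped = false;
        for (std::size_t j = 1; j < end; ++j) {
            if (nums[j] < nums[j - 1]) {
                std::swap(nums[j], nums[j - 1]);
                swapped = true;
            }
        }
        if (!swapped) break;  // a pass with no swaps means the data is sorted
    }
}
```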

Selection Sort

We have noticed that the most “expensive” part of the bubble sort is swapping (three lines of code to execute).

A selection sort reduces the number of swaps by waiting until the end of each pass, when we know what the smallest remaining value is, and only then placing it appropriately.

Strategy
- find the smallest item, and swap it with slot 0
- find the next smallest item and swap it with slot 1
- find the third smallest item and swap it with slot 2

Algorithm (for one pass)
    walk through the Vector
        the first item is labeled the smallest
        every other element is compared to the smallest
            if it is smaller, then it is labeled the smallest

At the end of the walkthrough, the first is swapped with the smallest.

Selection Sort (con’t)

The algorithm for one pass finds the smallest element in the vector and puts it into slot 0.

Now wrap this code in a loop that will find each succeeding smallest element and put it into the proper position in the Vector.

    // outer for loop controls the spot that gets the appropriate smallest
    // value (same as bubble sort)
    for (int left = 0; left < count - 1; left++) {   // last one will be correct
        int smallest = left;    // we will keep the index to the smallest
        for (int j = left + 1; j < count; j++) {
            if (nums[j] < nums[smallest]) {
                smallest = j;
            }
        }
        if (smallest != left) {   // no sense swapping if left is smallest
            int temp = nums[smallest];
            nums[smallest] = nums[left];
            nums[left] = temp;
        }
    }

Use findSmallest() routine with SelectionSort()

We have spent a lot of time looking at findSmallest() routines… you should know how to do that.

Incorporate that knowledge into selection sort.

    // prototype
    int findSmallest(const int nums[], int count, int left);

    // outer for loop controls the spot that gets the appropriate smallest
    // value (same as bubble sort)
    for (int left = 0; left < count - 1; left++) {
        // use what we already know: pass the starting point of the
        // remainder of the vector to look at
        int smallest = findSmallest(nums, count, left);
        if (smallest != left) {
            int temp = nums[smallest];
            nums[smallest] = nums[left];
            nums[left] = temp;
        }
    }

    int findSmallest(const int nums[], int count, int left) {
        int smallest = left;   // start with first index as the smallest
        for (int j = left + 1; j < count; j++) {
            if (nums[j] < nums[smallest]) {
                smallest = j;
            }
        }
        return smallest;
    }
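The saving in swaps can be made visible by counting them. The sketch below restates the selection sort on a std::vector and returns the number of swaps performed. The names findSmallestIndex and selectionSortCountingSwaps are hypothetical, and the use of std::vector and std::swap instead of a raw array is an assumption made to keep the example self-contained.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Index of the smallest element in nums[left .. size-1].
std::size_t findSmallestIndex(const std::vector<int>& nums, std::size_t left) {
    std::size_t smallest = left;
    for (std::size_t j = left + 1; j < nums.size(); ++j) {
        if (nums[j] < nums[smallest]) smallest = j;
    }
    return smallest;
}

// Selection sort; returns how many swaps were performed.
int selectionSortCountingSwaps(std::vector<int>& nums) {
    int swaps = 0;
    for (std::size_t left = 0; left + 1 < nums.size(); ++left) {
        std::size_t smallest = findSmallestIndex(nums, left);
        if (smallest != left) {   // at most one swap per pass
            std::swap(nums[smallest], nums[left]);
            ++swaps;
        }
    }
    return swaps;
}
```

Sorting 30 12 23 10 28 this way takes two swaps (10 with 30, then 28 with 30). In general there is at most one swap per pass, so never more than count - 1 in total, although the number of comparisons is still of order N^2.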

Quick Sort

- One of the most widely used sorting algorithms
- Invented in 1960 by C.A.R. Hoare
- Not difficult to implement
- Good general-purpose sort (works well in most cases)
- Average performance is n log n
- Recursive (drawback)
- Worst-case performance is n^2 (why is that?)
  - file already sorted: time = n^2/2 and space = n
- Precise mathematical analysis backed up by empirical results
- Divide and conquer algorithm

Quick Sort Algorithm

- Partitioning step
  - Choose a pivot element, say a = v[j]
  - Determine its final position in the sorted array
    - a >= v[i] for all i < j
    - a <= v[i] for all i > j
- Recursive step
  - Perform the above step on the left array and the right array

An early look at quicksort code (incomplete)

    void quicksort(vector<int> &A, int left, int right) {
        int i;
        if (right > left) {
            Pivot(A, left, right);           // make sure A[left] <= A[right]
            i = Partition(A, left, right);   // pivot ends up at index i
            quicksort(A, left, i - 1);
            quicksort(A, i + 1, right);
        }
    }

Quick Sort Code ctd..

More detailed look at the partition code

    // Partition(): rearrange A into 3 sublists, a sublist
    // A[left] … A[j-1] of values at most A[j], a sublist A[j],
    // and a sublist A[j+1] … A[right] of values at least A[j]
    int Partition(vector<int> &A, int left, int right) {
        int pivot = A[left];
        int i = left;
        int j = right + 1;
        do {
            do ++i; while (A[i] < pivot);   // scan right for a value >= pivot
            do --j; while (A[j] > pivot);   // scan left for a value <= pivot
            if (i < j) {
                Swap(A[i], A[j]);
            }
        } while (i < j);
        Swap(A[j], A[left]);                // put the pivot in its final position
        return j;
    }

Quick Sort Code ctd..

More detailed look at the pivot code

    // Pivot(): prepare A for partitioning
    void Pivot(vector<int> &A, int left, int right) {
        if (A[left] > A[right])
            Swap(A[left], A[right]);
    }

e.g. trace the quick sort code for 30 12 23 10 28
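To follow the trace, it may help to have the three routines assembled into one compilable unit. The sketch below keeps the structure of the slides, with two assumptions: std::swap stands in for the Swap() helper, which the notes do not define, and the pivot is stored as an int. Pivot() matters for correctness here: by guaranteeing A[left] <= A[right], it leaves a value at the right end that stops the upward scan of i inside the range.

```cpp
#include <utility>
#include <vector>

// Pivot(): make sure A[left] <= A[right], so A[right] stops the upward scan.
void Pivot(std::vector<int>& A, int left, int right) {
    if (A[left] > A[right]) std::swap(A[left], A[right]);
}

// Partition(): move the pivot A[left] to its final index and return that index.
int Partition(std::vector<int>& A, int left, int right) {
    int pivot = A[left];
    int i = left;
    int j = right + 1;
    do {
        do ++i; while (A[i] < pivot);   // scan right for a value >= pivot
        do --j; while (A[j] > pivot);   // scan left for a value <= pivot
        if (i < j) std::swap(A[i], A[j]);
    } while (i < j);
    std::swap(A[j], A[left]);
    return j;
}

void quicksort(std::vector<int>& A, int left, int right) {
    if (right > left) {
        Pivot(A, left, right);
        int p = Partition(A, left, right);
        quicksort(A, left, p - 1);
        quicksort(A, p + 1, right);
    }
}
```

For 30 12 23 10 28, Pivot() first swaps the ends to give 28 12 23 10 30. Partition() then takes 28 as the pivot, stops i at index 4 and j at index 3, swaps A[3] with A[0] and returns 3, leaving 10 12 23 28 30. The recursive calls on indexes 0 to 2 and on index 4 make no further changes.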

Quick Sort Analysis

Assume each partitioning step divides the file in half. The recurrence relation is

- C(n) = 2C(n/2) + n
- 2C(n/2) is the cost of sorting the two partitions of size n/2
- n is the cost of examining each element

Using the above recurrence relation, prove that quick sort is of order n log n.

Let us assume, without loss of generality, that $n = 2^k$ for some $k$.

\[
\begin{aligned}
C(n) &= 2C(n/2) + n \\
     &= 2\,(2C(n/4) + n/2) + n \\
     &= 2\,(2\,(2C(n/8) + n/4) + n/2) + n \\
     &= 2^3\,C(n/2^3) + 3n \qquad \text{(after 3 iterations)}
\end{aligned}
\]

So after $k$ iterations we have

\[
\begin{aligned}
C(n) &= 2^k\,C(n/2^k) + k\,n \\
     &= 2^k\,C(1) + k\,n \qquad \text{(since } n = 2^k\text{)} \\
     &= 2^{\log_2 n} \cdot 1 + k\,n \\
     &= n + n \log_2 n \\
     &= O(n \log n)
\end{aligned}
\]

In Lab 5 you must use the recursive quicksort.
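The result of the derivation, n + n log n, can be checked numerically against the recurrence. The sketch below is only an illustration: the function names cost and closedForm are hypothetical, C(1) is taken to be 1 as in the derivation, and n is assumed to be a power of two.

```cpp
// C(n) = 2 C(n/2) + n with C(1) = 1, for n a power of two.
long cost(long n) {
    if (n == 1) return 1;
    return 2 * cost(n / 2) + n;
}

// Closed form n + n * log2(n), valid when n is a power of two.
long closedForm(long n) {
    long k = 0;
    for (long m = n; m > 1; m /= 2) ++k;  // k = log2(n)
    return n + n * k;
}
```

For example, cost(8) unfolds as 2(2(2·1 + 2) + 4) + 8 = 32, and the closed form gives 8 + 8·3 = 32.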
