8/3/2019 DAA Lecture 3
Design and Analysis of Algorithms
Graduate Course-Number CSC5011
Fall Semester 2011
Lecture 3: Searching Tactics
Dr. Md. Shamim Akhter
Assistant Professor, Computer Science Department
American International University Bangladesh
Email: [email protected]
Searching Concept (1/3)
Searching is a common problem in computer science.
It involves storing and maintaining a large dataset, and then searching the data for particular values.
Data storage and retrieval are key to many industry applications.
Search algorithms are necessary for storing and retrieving data efficiently.
Searching Concept (2/3)
For instance, a program that checks the spelling of words searches for them in a dictionary, which is just an ordered list of words.
Problems of this kind are called searching problems.
Searching Concept (3/3)
There are many searching algorithms.
The natural searching method is linear search (also called sequential search or exhaustive search).
  It is very simple but takes a long time on large lists.
A binary search repeatedly subdivides the list to locate an item much faster than linear search.
Like a binary search, an interpolation search repeatedly subdivides the list to locate an item.
Linear / Sequential Search
Special case of brute-force search
This is a very simple algorithm.
It uses a loop to sequentially step through the array.
It compares each element with the value being searched for and stops when that value is found or the end of the array is reached.
Linear Search (2/8)
Sub LinearSearch(x: Int, a[]: Int, loc: Int)
    i = 1
    While (i <= n And x <> a[i])
        i = i + 1
    End While
    If i <= n Then loc = i Else loc = 0
End Sub
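As a runnable counterpart to the pseudocode, here is a minimal Python sketch of linear search. The function name `linear_search` and the convention of returning a 1-based position (0 for "not found") are assumptions chosen to mirror the `loc` output of the slide; the sample values are only illustrative.

```python
def linear_search(a, x):
    """Return the 1-based position of x in list a, or 0 if x is absent."""
    # Step through the list from the first element to the last.
    for position, value in enumerate(a, start=1):
        if value == x:
            return position  # stop as soon as the value is found
    return 0                 # reached the end without a match


numlist = [17, 23, 5, 11, 2, 29, 3]   # illustrative, unsorted data
print(linear_search(numlist, 11))     # found after examining 17, 23, 5, 11
print(linear_search(numlist, 7))      # every element examined, not found
```

The list does not need to be sorted, which is the main practical advantage of this method.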
Linear Search (3/8)
Array numlist contains
17 23 5 11 2 29 3
Searching for the value 11, linear search examines 17, 23, 5, and 11 -> Found
Searching for the value 7, linear search examines 17, 23, 5, 11, 2, 29, and 3 -> Not Found
Linear Search (4/8)
The advantage is its simplicity.
  It is easy to understand.
  It is easy to implement.
  It does not require the array to be in order.
The disadvantage is its inefficiency.
  If there are 20,000 items in the array and what you are looking for is in the 19,999th element, you need to search through almost the entire list.
Linear Search (5/8)
Whenever the number of entries doubles, so does the running time, roughly.
If a machine does 1 million comparisons per second, 4 billion comparisons take about 67 minutes (about half that for an average successful search).
Linear Search (6/8)
Linear Search (7/8)
Use a Sentinel to Improve the Performance
Sub LinearSearch2(x: Int, a[]: Int, loc: Int)
    a[n + 1] = x: i = 1
    While (x <> a[i])
        i = i + 1
    End While
    If i <= n Then loc = i Else loc = 0
End Sub
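The sentinel idea can be sketched in Python as follows. This is an illustration under assumptions, not the lecture's code: the name `linear_search_sentinel` is hypothetical, and the sketch searches a copy of the list so the caller's data is not modified. Placing the target after the last element guarantees the loop stops, so the loop needs only one test per step instead of two.

```python
def linear_search_sentinel(a, x):
    """Return the 1-based position of x in a, or 0 if x is absent."""
    n = len(a)
    work = a + [x]          # copy of the list with the sentinel appended
    i = 0
    while work[i] != x:     # no bounds check needed: the sentinel stops us
        i += 1
    # If we stopped on the sentinel itself, x was not in the original list.
    return i + 1 if i < n else 0


print(linear_search_sentinel([17, 23, 5, 11, 2, 29, 3], 11))
print(linear_search_sentinel([17, 23, 5, 11, 2, 29, 3], 7))
```

Copying the list costs extra time in Python; in an array-based language the sentinel would normally be written into a spare slot at position n + 1.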
Linear Search (8/8)
Apply Linear Search to Sorted Lists
Sub LinearSearch3(x: Int, a[]: Int, loc: Int)
    i = 1
    While (i < n And x > a[i])
        i = i + 1
    End While
    If a[i] = x Then loc = i Else loc = 0
End Sub
Binary Search (1/9)
Can We Search More Efficiently?
Yes, provided the list is in some kind of order, for example alphabetical order with respect to the names.
If this is the case, we use a divide and conquer strategy to find an item quickly.
This strategy is what one would use in a number guessing game, for example.
Binary Search (2/9)
I'm Thinking of a Number between 1 and 1000. Guess it!
Is it 750? Nope, too high. Is it 625? etc.
This strategy guarantees a correct guess in no more than ten guesses!
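The ten-guess bound can be checked by brute force. The sketch below is an illustration only (the function name `guesses_needed` is an assumption): it plays the halving strategy against every possible secret from 1 to 1000 and records the largest number of guesses used. The bound holds because 2^10 = 1024 > 1000.

```python
def guesses_needed(secret, low=1, high=1000):
    """Count the guesses the halving strategy needs to hit `secret`."""
    count = 0
    while low <= high:
        guess = (low + high) // 2   # always guess the middle of the range
        count += 1
        if guess == secret:
            return count
        if guess < secret:          # "too low": discard the lower part
            low = guess + 1
        else:                       # "too high": discard the upper part
            high = guess - 1
    raise ValueError("secret is outside the range")


worst = max(guesses_needed(s) for s in range(1, 1001))
print(worst)  # largest number of guesses over all 1000 secrets
```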
Binary Search (3/9)
Apply This Strategy to Searching
The resulting algorithm is called the Binary Search algorithm.
We check the middle key in our list.
  If it is beyond what we are looking for (too high), we look only at the 1st half of the list.
  If it's not far enough in (too low), we look at the 2nd half.
Then iterate!
Binary Search (4/9)
1. Divide a sorted array into three sections:
     the middle element
     the elements on one side of the middle element
     the elements on the other side of the middle element
2. If the middle element is the correct value, done. Otherwise, go to step 1, using only the half of the array that may contain the correct value.
Binary Search (5/9)
3. Continue steps 1 and 2 until either the value is found or there are no more elements to examine.
Binary Search (6/9)
Binary Search Example
Array numlist2 contains
2 3 5 11 17 23 29
Searching for the value 11, binary search examines 11 and stops. Found.
Searching for the value 7, binary search examines 11, 3, 5, and stops. Not Found.
Binary Search (7/9)
Algorithm for Binary Search
Sub BinarySearch(x: Int, a[]: Int, loc: Int)
    i = 1: j = n
    While (i < j)
        m = (i + j) \ 2
        If x > a[m] Then i = m + 1 Else j = m
    End While
    If x = a[i] Then loc = i Else loc = 0
End Sub
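A Python sketch of the same scheme may help when experimenting. It assumes, as the slide's variable names suggest, that the interval is narrowed until a single candidate remains and only then compared with the target; the function name `binary_search` and the 1-based return value (0 for "not found") are illustrative choices.

```python
def binary_search(a, x):
    """Return the 1-based position of x in sorted list a, or 0 if absent."""
    if not a:
        return 0
    i, j = 1, len(a)            # 1-based bounds of the candidate interval
    while i < j:
        m = (i + j) // 2        # middle position, rounded down
        if x > a[m - 1]:        # a[m - 1] is the m-th element (Python is 0-based)
            i = m + 1           # target can only be in the upper half
        else:
            j = m               # target is at m or in the lower half
    return i if a[i - 1] == x else 0


numlist2 = [2, 3, 5, 11, 17, 23, 29]
print(binary_search(numlist2, 11))
print(binary_search(numlist2, 7))
```

This variant makes one comparison per halving and a single equality test at the end, so it does not stop early when the middle element happens to match.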
Binary Search (8/9)
The worst-case number of comparisons grows by only 1 every time the list size is doubled.
Only 32 comparisons would be needed on a list of 4 billion using Binary Search.
Sequential Search could need 4 billion comparisons, which takes over an hour at 1 million comparisons per second!
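The figures in this comparison can be reproduced with a few lines of arithmetic. The sketch below assumes the rate of 1 million comparisons per second quoted earlier in the lecture; the variable names are illustrative. It shows that examining all 4 billion entries takes about 67 minutes, while an average successful linear search, which examines about half the list, takes about 33 minutes, close to the 30-minute figure.

```python
import math

n = 4_000_000_000          # list size used on the slides
rate = 1_000_000           # comparisons per second (figure from the slides)

binary_worst = math.ceil(math.log2(n))   # about log2(n) halvings
linear_worst_minutes = n / rate / 60     # every element is examined
linear_avg_minutes = n / 2 / rate / 60   # about half are examined on a hit

print(binary_worst)                      # 32
print(round(linear_worst_minutes, 1))    # 66.7
print(round(linear_avg_minutes, 1))      # 33.3
```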
Binary Search (9/9)
Benefit
  Much more efficient than linear search.
  For an array of N elements, performs at most about log2 N comparisons.
Disadvantage
  Requires that array elements be sorted.
Interpolation Search (1/9)
Binary search is a great improvement over linear search.
  It eliminates large portions of the list without actually examining all the eliminated values.
If the values are fairly evenly distributed, interpolation can be used to eliminate more values at each step.
Interpolation Search (2/9)
Interpolation is the process of using knowledge to guess the position of an unknown value.
  Indexes of known values in the list are used to guess what index the target value should have.
Interpolation search selects the dividing point by interpolation using the following code:
m = l + (x - a[l]) * (r - l) / (a[r] - a[l])
Interpolation Search (3/9)
Compare x to a[m]:
  If x = a[m]: Found.
  If x < a[m]: set r = m - 1
  If x > a[m]: set l = m + 1
If the search is not yet finished, continue searching with the new l and r.
Stop searching when Found, or when x < a[l] or x > a[r].
Interpolation Search (4/9)
Example: Find the key x = 32 in the list (m is rounded to the nearest integer)
Index: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
Value: 1 4 7 9 9 12 13 17 19 21 24 32 36 44 45 54 55 63 66 70
1: l=1, r=20 -> m=1+(32-1)*(20-1)/(70-1) = 10
   a[10]=21<32=x -> l=11
2: l=11, r=20 -> m=11+(32-24)*(20-11)/(70-24) = 13
   a[13]=36>32=x -> r=12
3: l=11, r=12 -> m=11+(32-24)*(12-11)/(32-24) = 12
   a[12]=32=x -> Found at m = 12
Interpolation Search (5/9)
Example: Find the key x = 30 in the list (m is rounded to the nearest integer)
Index: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
Value: 1 4 7 9 9 12 13 17 19 21 24 32 36 44 45 54 55 63 66 70
1: l=1, r=20 -> m=1+(30-1)*(20-1)/(70-1) = 9
   a[9]=19<30=x -> l=10
2: l=10, r=20 -> m=10+(30-21)*(20-10)/(70-21) = 12
   a[12]=32>30=x -> r=11
3: l=10, r=11 -> m=10+(30-21)*(11-10)/(24-21) = 13
   m=13>11=r: Not Found
Interpolation Search (6/9)
Private Sub Interpolation(a[]: Int, x: Int, n: Int, Found: Boolean)
    l = 1: r = n
    Found = False
    Do While (r > l)
        m = l + ((x - a[l]) / (a[r] - a[l])) * (r - l)
        ' Verify and decide what to do next
    Loop
End Sub
Interpolation Search (7/9)
Verify and decide what to do next
If (m < l) Or (m > r) Then
    ' The guess fell outside the current range, so x is not in the list
    Found = False
    Exit Do
ElseIf (a[m] = x) Then
    Found = True
    Exit Do
ElseIf (a[m] < x) Then
    l = m + 1
Else
    r = m - 1
End If
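For experimentation, the loop and the verification step can be combined into one Python function. This is a sketch under stated assumptions rather than the lecture's code: the name `interpolation_search` is hypothetical, positions are returned 1-based (0 for "not found"), the guess is rounded to the nearest integer with Python's built-in `round`, and a range check before each guess replaces the `m < l` / `m > r` test and also avoids dividing by zero when all remaining values are equal.

```python
def interpolation_search(a, x):
    """Return the 1-based position of x in sorted list a, or 0 if absent."""
    l, r = 0, len(a) - 1
    # Keep going only while x can still lie between a[l] and a[r].
    while l <= r and a[l] <= x <= a[r]:
        if a[l] == a[r]:
            return l + 1        # all remaining values equal x
        # Guess the position from how far x lies between a[l] and a[r].
        m = l + round((x - a[l]) * (r - l) / (a[r] - a[l]))
        if a[m] == x:
            return m + 1
        if a[m] < x:
            l = m + 1           # discard everything up to m
        else:
            r = m - 1           # discard everything from m on
    return 0


values = [1, 4, 7, 9, 9, 12, 13, 17, 19, 21,
          24, 32, 36, 44, 45, 54, 55, 63, 66, 70]
print(interpolation_search(values, 32))   # 12
print(interpolation_search(values, 30))   # 0
```

With the 20-value list from the examples, the sketch finds 32 at position 12 and reports 30 as absent.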
Interpolation Search (8/9)
Binary search is very fast (O(log n)), but on evenly distributed values interpolation search is much faster on average (O(log log n)).
For n = 2^32 (four billion items)
  Binary search took 32 steps of verification
  Interpolation search took only 5 steps of verification.
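The two step counts match the growth rates when logarithms are taken base 2. This is a rough check that ignores constant factors:

```latex
\[
  \log_2 n = \log_2 2^{32} = 32,
  \qquad
  \log_2 \log_2 n = \log_2 32 = 5 .
\]
```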
Interpolation Search (9/9)
Interpolation search performance time is nearly constant for a large range of n.
This speed would be even more useful if the data had been stored on a hard disk or other relatively slow device.
Binary Search Tree (BST)
It's a binary tree!
For each node in a BST
  every key in its left subtree is smaller than it; and
  every key in its right subtree is greater than it.
Search Operation
Search operation takes time O(h), where h is the height of a BST
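A small Python sketch makes the O(h) bound concrete. The class and function names (`Node`, `insert`, `search`) are illustrative assumptions, and duplicate keys are simply ignored. The search returns the number of nodes visited, which never exceeds the number of nodes on the longest root-to-leaf path.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into the BST rooted at root and return the root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root                      # duplicates are ignored


def search(root, key):
    """Return (node or None, number of nodes visited)."""
    visited = 0
    node = root
    while node is not None:
        visited += 1
        if key == node.key:
            return node, visited
        # Smaller keys live in the left subtree, larger keys in the right.
        node = node.left if key < node.key else node.right
    return None, visited


balanced = None
for k in [11, 3, 23, 2, 5, 17, 29]:   # insertion order gives a balanced tree
    balanced = insert(balanced, k)

chain = None
for k in [2, 3, 5, 11, 17, 23, 29]:   # sorted order gives a chain of 7 nodes
    chain = insert(chain, k)

print(search(balanced, 29)[1], search(chain, 29)[1])
```

The same seven keys need 3 visits in the balanced tree but 7 in the chain, which is the worst case discussed next.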
Operation Insert
Worst Case
Performance
Depends on the shape of the tree
Best Case: perfectly balanced tree, about log N nodes from root to leaf
Worst Case: N nodes in a search path
Average Case: about 1.39 log N comparisons (log base 2) per search in a tree built from N random keys
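The constant 1.39 can be traced to a standard result, stated here as background rather than derived in the lecture: a search in a BST built from N randomly ordered keys uses about 2 ln N comparisons on average. Converting the natural logarithm to base 2 gives the constant:

```latex
\[
  2 \ln N \;=\; 2 \ln 2 \cdot \log_2 N \;\approx\; 1.386 \, \log_2 N \;\approx\; 1.39 \, \log_2 N .
\]
```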
Balanced Tree
Tree structures support various basic dynamic set operations in time proportional to the height of the tree
  e.g.: Search, Predecessor, Successor, Minimum, Maximum, Insert, Delete
Ideally, a tree will be balanced and the height will be log n, where n is the number of nodes in the tree
Balancing ensures that the height of the tree is as small as possible and therefore provides the best running time
Balanced BST
BST worst case is O(N)
The tree needs to be balanced
Approach:
  Recursive rebalancing in linear time
  However, frequent rebalancing makes the insertion cost quadratic
Is there a type of BST that guarantees every insert and search will be logarithmic?
Top Down 2-3-4 Trees
Nodes store 1, 2, or 3 keys and have 2, 3, or 4 children, respectively
All leaves have the same depth
2-3-4 Tree Nodes
Introduction of nodes with more than 1 key, and more than 2 children
2 Node: same as a binary node
3 Node: 2 keys, 3 links
4 Node: 3 keys, 4 links
Why 2-3-4? (1/2)
Why not minimize height by maximizing children in a d-tree?
Let each node have d children so that we get O(log_d N) search time! Right?
That means if d = N^(1/2), we get a height of 2
Why 2-3-4? (2/2)
However, searching out the correct child on each level requires O(log N^(1/2)) by binary search
2 log N^(1/2) = O(log N), which is not as good as we had hoped for!
2-3-4 trees will guarantee O(log N) height using only 2, 3, or 4 children per node
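Written out, the argument is that a wide node only moves the work from "following links" to "searching inside a node". With d = N^(1/2) children per node and a binary search among the children at each level:

```latex
\[
  h = \log_d N = \frac{\log_2 N}{\log_2 N^{1/2}} = \frac{\log_2 N}{\tfrac12 \log_2 N} = 2,
  \qquad
  \underbrace{2}_{\text{levels}} \cdot \underbrace{\log_2 N^{1/2}}_{\text{search per node}}
  = 2 \cdot \tfrac12 \log_2 N = \log_2 N .
\]
```

The total is still logarithmic in N, the same order as a balanced binary tree.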
Insertion into 2-3-4 Trees (1/3)
Insert the new key at the lowest internal node reached in the search
  2-node becomes 3-node
  3-node becomes 4-node
What about a 4-node?
  We can't insert another key!
Insertion into 2-3-4 Trees (2/3)
On our way down the tree, whenever we reach a 4-node, we break it up into two 2-nodes, and move the middle element up into the parent node
Insertion into 2-3-4 Trees (3/3)
Now we can perform the insertion using one of the previous two cases
Since we follow this method from the root down to the leaf, it is called top-down insertion
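The top-down scheme can be sketched in Python as below. This is a minimal illustration under assumptions, not the lecture's implementation: the names `Node234`, `split_child`, `insert`, `inorder` and `leaf_depths` are hypothetical, keys are assumed distinct, and a "leaf" here means the lowest node that holds keys. Every 4-node met on the way down is split before it is entered, so the node that finally receives the key always has room.

```python
class Node234:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []            # 1 to 3 keys, kept sorted
        self.children = children or []    # empty for a leaf

    def is_leaf(self):
        return not self.children

    def is_four_node(self):
        return len(self.keys) == 3


def split_child(parent, index):
    """Split the 4-node parent.children[index]; move its middle key up."""
    node = parent.children[index]
    left = Node234(node.keys[:1], node.children[:2])
    right = Node234(node.keys[2:], node.children[2:])
    parent.keys.insert(index, node.keys[1])
    parent.children[index:index + 1] = [left, right]


def insert(root, key):
    """Insert key top-down and return the (possibly new) root."""
    if root is None:
        return Node234([key])
    if root.is_four_node():
        # Splitting the root is the only step that makes the tree taller.
        root = Node234([], [root])
        split_child(root, 0)
    node = root
    while not node.is_leaf():
        index = sum(1 for k in node.keys if k < key)   # child to follow
        if node.children[index].is_four_node():
            split_child(node, index)       # parent is never a 4-node here
            if key > node.keys[index]:
                index += 1                 # key belongs right of the moved key
        node = node.children[index]
    node.keys.append(key)                  # 2-node -> 3-node or 3-node -> 4-node
    node.keys.sort()
    return root


def inorder(node):
    """Return all keys in sorted order."""
    if node is None:
        return []
    if node.is_leaf():
        return list(node.keys)
    out = []
    for i, child in enumerate(node.children):
        out += inorder(child)
        if i < len(node.keys):
            out.append(node.keys[i])
    return out


def leaf_depths(node, depth=0):
    """Return the set of depths at which leaves occur."""
    if node.is_leaf():
        return {depth}
    return set().union(*(leaf_depths(c, depth + 1) for c in node.children))


root = None
for k in range(1, 51):          # sorted input, the worst case for a plain BST
    root = insert(root, k)
print(inorder(root) == list(range(1, 51)), leaf_depths(root))
```

Even with sorted input, all leaves end up at the same depth, because the tree only grows when the root is split.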
Splitting the Tree
As we travel down the tree, if we encounter any 4-node we will break it up into 2-nodes.
This guarantees that we will never have the problem of inserting the middle element of a former 4-node into its parent 4-node.
Splitting the Tree
Splitting the Tree
Time Complexity of Insertion in 2-3-4 Trees
Time complexity:
  A search visits O(log N) nodes
  An insertion requires at most O(log N) node splits
  Each node split takes constant time
Operations Search and Insert each take time O(log N)
Beyond 2-3-4 Trees
What do we know about 2-3-4 Trees?
Balanced
O(log N) search time
Different node structures
Can we get 2-3-4 tree advantages in a binary tree format?
Welcome to the world of Red-Black Trees!!!
Best of both methods
  Search as in a BST
  Insert as in a 2-3-4 search tree
Red-Black Tree
A red-black tree is a binary search tree with the following properties:
  edges are colored red or black
  no two consecutive red edges on any root-leaf path
  same number of black edges on any root-leaf path (= black height of the tree)
  edges connecting leaves are black
Red-Black Tree
2-3-4 Tree Evolution
How 2-3-4 trees relate to red-black trees
Insertion into Red-Black Tree
1. Perform a standard search to find the leaf where the key should be added
2. Replace the leaf with an internal node with the new key
3. Color the incoming edge of the new node red
4. Add two new leaves, and color their incoming edges black
Insertion into Red-Black Tree
If the parent had an incoming red edge, we now have two consecutive red edges!
We must reorganize the tree to remove that violation.
What must be done depends on the sibling of the parent.
Insertion - Plain and Simple
Right Left Rotation
Restructuring
Case 2: Incoming edge of p is red, and its sibling is black
Similar to a right rotation, we can do a left rotation...
Double Rotation
What if the new node is between its parent and grandparent in the inorder sequence?
We must perform a double rotation (which is no more difficult than a single one)
This would be called a left-right double rotation
Last of the Rotations
And this would be called a right-left double rotation
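Since the rotation figures are not reproduced here, a small Python sketch may help show what single and double rotations do to the tree shape. It is an illustration under assumptions: edge colors are not modeled, the names `Node`, `rotate_right`, `rotate_left`, `rotate_left_right`, `rotate_right_left` and `inorder` are hypothetical, and the naming of the double rotations (first rotation at the parent, second at the grandparent) is one common convention that may differ from the figures in the lecture. Each function returns the new root of the rotated subtree.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def rotate_right(g):
    """Lift g's left child p above g; return p as the new subtree root."""
    p = g.left
    g.left = p.right     # p's right subtree lies between p and g
    p.right = g
    return p


def rotate_left(g):
    """Mirror image: lift g's right child p above g."""
    p = g.right
    g.right = p.left
    p.left = g
    return p


def rotate_left_right(g):
    """New node is the right child of g's left child: rotate twice."""
    g.left = rotate_left(g.left)
    return rotate_right(g)


def rotate_right_left(g):
    """New node is the left child of g's right child: rotate twice."""
    g.right = rotate_right(g.right)
    return rotate_left(g)


def inorder(node):
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)


# Grandparent 30, parent 10, new node 20 lies between them in inorder.
g = Node(30, left=Node(10, right=Node(20)))
root = rotate_left_right(g)
print(root.key, root.left.key, root.right.key, inorder(root))
```

In every case the inorder sequence of keys is unchanged, so the result is still a binary search tree; only the heights of the subtrees change.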
Bottom-Up Rebalancing
Case 3: Incoming edge of p is red and its sibling is also red
We call this a promotion
Note how the black depth remains unchanged for all of the descendants of g
This process will continue upward beyond g if necessary: rename g as n and repeat.
Summary of Insertion
If two red edges are present, we do either
  a restructuring (with a simple or double rotation) and stop, or
  a promotion and continue
A restructuring takes constant time and is performed at most once. It reorganizes an off-balanced section of the tree.
Promotions may continue up the tree and are executed O(log N) times.
The time complexity of an insertion is O(log N).