1
IS703:Decision Support and Optimization
Week 10: Local Search
Lau Hoong Chuin, School of Information Systems
2
Reference:
• Johnson and McGeoch, The Traveling Salesman Problem: A Case Study in Local Optimization. In Local Search in Combinatorial Optimization, E. H. L. Aarts and J. K. Lenstra (eds), John Wiley and Sons, 1997
Objectives:
• To learn the basic local search paradigm for solving (NP-hard) optimization problems. Why?
• To design LS-based meta-heuristics (tabu search, simulated annealing, genetic algorithms, ACO, etc.)
Local Search
3
Heuristics
• Optimization problem: min_{x ∈ S} f(x), where S = the space of feasible solutions (find a feasible x such that f(x) is minimized)
• Heuristic algorithms do not guarantee to find an optimal solution (as opposed to exact algorithms)
• However, they are usually designed to find a "reasonably good" solution quickly
• Solutions to optimization problems:
– Global Optimum: equal to or better than all other solutions
– Local Optimum: equal to or better than all solutions in a certain neighborhood
4
Global vs. Local Optimum
5
Why Do We Need Heuristics?
• Many real-world applications are too large to be solved by exact methods (e.g. Branch and Bound, Cutting Planes, etc.)
• Solutions are usually needed fast (sometimes even in real time)
– for hard problems, the time to find an optimal solution is usually much greater than the time to find a near-optimal solution
• Optimal solutions, although desirable, are often not needed in the real world
6
Local Search
Definitions:
• N (neighborhood): S → 2^S
• The size of a neighborhood is the number of solutions in the neighborhood set

Algorithm:
1. Pick a starting solution s  // by another algorithm
2. If f(s) ≤ min_{s' ∈ N(s)} f(s'), stop  // what is the time complexity?
3. Else set s = s' such that f(s') = min_{s'' ∈ N(s)} f(s'')
4. Go to Step 2
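The four steps above can be sketched as a short Python function (an illustrative sketch, not part of the slides; the names `local_search`, `f` and `neighbors` are assumptions):

```python
def local_search(start, f, neighbors):
    """Steepest-descent local search: move to the best neighbor
    until no neighbor improves on the current solution."""
    s = start
    while True:
        best = min(neighbors(s), key=f)   # scan the whole neighborhood
        if f(best) >= f(s):               # s is a local optimum: stop
            return s
        s = best                          # move and repeat

# Toy usage: minimize f(x) = x^2 over the integers, neighbors are x-1 and x+1.
result = local_search(7, lambda x: x * x, lambda x: [x - 1, x + 1])
```

Evaluating the whole neighborhood at every iteration is exactly what makes Step 2 cost O(|N(s)|) evaluations of f.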
7
Local Search
An iteration in a local search:
Current Solution → Generate Neighborhood → Evaluate Neighborhood → Choose Neighbor
8
Example: Traveling Salesman Problem
Given:
• A set of n cities {c1, c2, …, cn}
• A cost matrix, containing the cost to travel between each pair of cities
Find a minimum-cost tour.
9
Solving TSP Heuristically
3 Broad Steps:
1. Construction Heuristics
– Constructing a tour from scratch
– Nearest neighbor, insertion heuristics
2. Local Search Algorithms
– Improving an existing tour
3. Extensions
– "Escaping" from locally optimal tours
– Meta-heuristics
10
Construction Heuristics (Nearest Neighbor)
Step 1: Choose a starting city p at random
Step 2: Scan the remaining cities for the city NNp nearest to p
Step 3: Add edge (p, NNp) to the sequence
Step 4: Set p = NNp
Step 5: Repeat from Step 2 until all cities are added
Step 6: Add the last edge to complete the tour
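Steps 1–6 translate directly into Python (a sketch under the assumption that cities are 2-D points; `nearest_neighbor_tour` is a made-up name):

```python
import math

def nearest_neighbor_tour(cities, start=0):
    """Greedy nearest-neighbor TSP construction.
    `cities` is a list of (x, y) points; returns a tour as a list of
    indices; the closing edge tour[-1] -> tour[0] completes the cycle."""
    unvisited = set(range(len(cities))) - {start}
    tour = [start]
    while unvisited:                                   # Steps 2-5
        p = tour[-1]
        nearest = min(unvisited,
                      key=lambda c: math.dist(cities[p], cities[c]))
        tour.append(nearest)
        unvisited.remove(nearest)
    return tour
```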
11
Some Variants of Nearest Neighbour
• Double-Sided Nearest Neighbour
– Grow the nearest-neighbour sequence from both ends
• Randomized Nearest Neighbour
– Instead of using the nearest neighbour of a city, pick a neighbour at random from the set of the remaining k nearest neighbours of the city and add it to the sequence
12
Multiple Fragment Heuristic
Basic Idea: Build a tour one edge at a time, by repeatedly adding the shortest remaining available edge to the tour edge set.

Step 1: Order the edges in a list by increasing length
Step 2: Pick the first available edge on the list (one that neither gives a city degree 3 nor closes a subtour prematurely)
Step 3: Remove the edge from the list and add it to the tour edge set
Step 4: Repeat from Step 2 until there are no more available edges on the list

Note: similarity with Kruskal's algorithm for MSTs
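A sketch of the multiple-fragment heuristic in Python; as in Kruskal's algorithm, a union-find structure rejects edges that would close a subtour early, and a degree check keeps every city on at most two chosen edges (the point representation and names are assumptions):

```python
import math
from itertools import combinations

def greedy_edge_tour(cities):
    """Multiple-fragment (greedy-edge) TSP construction: scan edges in
    increasing length, keeping one only if both endpoints still have
    degree < 2 and it does not close a subtour early."""
    n = len(cities)
    parent = list(range(n))          # union-find over tour fragments

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    degree = [0] * n
    chosen = []
    edges = sorted(combinations(range(n), 2),
                   key=lambda e: math.dist(cities[e[0]], cities[e[1]]))
    for u, v in edges:
        # the last edge (len(chosen) == n - 1) is allowed to close the cycle
        if degree[u] < 2 and degree[v] < 2 and (
                find(u) != find(v) or len(chosen) == n - 1):
            chosen.append((u, v))
            degree[u] += 1
            degree[v] += 1
            parent[find(u)] = find(v)
    return chosen                    # n edges forming one Hamiltonian cycle
```

An edge is "available" in the slide's sense exactly when it passes both checks.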
13
Multiple Fragment Tour:
14
Insertion Heuristics
• Basic Idea: Starting with a subtour made of 2 arbitrarily chosen cities, grow the tour by inserting cities that fulfill some criterion.
• The criterion usually depends on the cities already in the tour.

Step 1: Form a subtour by choosing 2 cities at random
Step 2: Scan the remaining cities for the city that fulfills the insertion criterion
Step 3: Insert that city into the subtour
Step 4: Repeat from Step 2 until all cities are in the tour
15
Some Insertion Criteria
• Nearest Insertion:
– Insert the city that has the shortest distance to a tour city
• Farthest Insertion:
– Insert the city whose shortest distance to a tour city is maximized
• Cheapest Insertion:
– Choose the city whose insertion causes the least increase in the length of the resulting subtour
• Random Insertion:
– Select the city to be inserted at random
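The insertion scheme with the nearest-insertion criterion might look like this in Python (an illustrative sketch; it also splices the chosen city at the cheapest position, a detail the slides leave open, and `nearest_insertion_tour` is a made-up name):

```python
import math

def nearest_insertion_tour(cities):
    """Nearest-insertion TSP construction: repeatedly pick the outside
    city closest to any tour city, and splice it into the tour at the
    position of least added length."""
    d = lambda a, b: math.dist(cities[a], cities[b])
    tour = [0, 1]                        # arbitrary 2-city starting subtour
    outside = set(range(2, len(cities)))
    while outside:
        # nearest-insertion criterion: city closest to the current tour
        c = min(outside, key=lambda x: min(d(x, t) for t in tour))
        # insert c where it increases the tour length the least
        i = min(range(len(tour)),
                key=lambda i: d(tour[i], c) + d(c, tour[(i + 1) % len(tour)])
                              - d(tour[i], tour[(i + 1) % len(tour)]))
        tour.insert(i + 1, c)
        outside.remove(c)
    return tour
```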
16
Nearest vs Cheapest Insertion
17
Analysis
Number of steps: n − 2 cities are to be inserted
For each insertion, the set of remaining cities has to be scanned: O(n)
Worst-case run time: O(n^2)
18
Local Search Algorithm for TSP
• Create an initial tour  // discussed earlier
• Define a local search neighborhood
• Two tours are neighbors if they share most of their edges
• Edge-exchange neighborhood: delete k edges in the current tour, then add k edges that form a new feasible tour. This neighborhood is called k-opt.
19
2-opt
• Delete 2 edges, add 2 edges to restore the tour
• 2-opt removes the “crossings” of edges in a tour
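A 2-opt local search in Python: deleting two edges and reversing the segment between them is the unique way to reconnect the tour (a first-improvement sketch; the function names are assumptions):

```python
import math

def tour_length(cities, tour):
    """Total length of the closed tour, including the closing edge."""
    return sum(math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def two_opt(cities, tour):
    """2-opt: delete edges (i, i+1) and (j, j+1), reconnect by reversing
    the segment between them; repeat until no improving exchange exists."""
    improved = True
    while improved:
        improved = False
        for i in range(len(tour) - 1):
            for j in range(i + 2, len(tour)):
                if i == 0 and j == len(tour) - 1:
                    continue                 # would delete the same edge pair
                cand = tour[:i + 1] + tour[i + 1:j + 1][::-1] + tour[j + 1:]
                if tour_length(cities, cand) < tour_length(cities, tour) - 1e-12:
                    tour, improved = cand, True
    return tour
```

On the unit square, starting from the crossing tour [0, 2, 1, 3], a single exchange removes the crossing.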
20
2-opt Analysis
• How many possible exchanges? Size of the 2-opt neighborhood:
– Number of edge pairs = n(n − 3)/2
– Exactly 1 way to restore the tour for each pair
• At each step of Local Search, evaluate O(n^2) new tours and choose the best available.
Worst-case run time per step: O(n^2)
21
3-opt
• Delete 3 edges, add 3 edges to restore the tour
• 3-opt can remove subsequences of the tour
22
3-opt Analysis
• Size of the 3-opt neighborhood:
– Number of edge triples = C(n, 3) = n(n − 1)(n − 2)/6
– Each triple has 8 possible ways to be reconnected
– Some neighbors may not be valid
• At each step of Local Search, evaluate O(n^3) new tours
Worst-case run time per step: O(n^3)
23
k-opt
• It may seem sensible to increase k even further.
• However, once k edges of a tour have been deleted, there are O(n^k) new tours. Expensive!
• Observe that some 3-opt moves are equivalent to applying two 2-opt moves.
24
Lin-Kernighan Heuristic
• First proposed by Lin and Kernighan
• One of the most successful heuristics for combinatorial optimization problems
• "Adaptive" k-opt
• Builds k-opt moves (with variable k) from a sequence of 2-opt moves
• "Tabu lists" for edges added and deleted
• Multiple restarts
25
Vehicle Routing Problem with Time Windows
Given
– k vehicles, each with a fixed capacity
– a set of customers, each having a location, time window, service duration and demand
– a cost matrix [cij]
Find min-cost vertex-disjoint feasible routes to cover all customers.
Finding a feasible solution is NP-hard, even for k = 1!
26
Vehicle Routing Problem with Time Windows
[Figure: a depot serving customers with time windows and demands: 11:00~12:00 / 10 units, 11:30~12:30 / 20 units, 10:30~12:00 / 10 units, 11:00~11:30 / 30 units]
27
Vehicle Routing Problem with Time Windows
Neighborhood moves (illustrated):
• Relocate
• Distribute
• Exchange
28
Vehicle Routing Problem with Time Windows
Neighborhood moves (illustrated):
• 2-Opt
Question: What is the time complexity of each neighborhood?
29
Iterated Local Search
• Problems with Local Search:
– may get stuck in a local optimum of poor quality
– very dependent on the starting tour
• Solutions:
– apply Local Search to more than one tour
– search diversification
30
Multi-start
• Step 1: Create a feasible tour (randomly or by using a construction heuristic)
• Step 2: Apply Local Search until local optimality
• Step 3: Repeat from Step 1 until some stopping criterion is met
• Step 4: Output the best tour found during this process
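Steps 1–4 amount to a small driver loop (a generic sketch; `make_tour`, `improve` and `cost` stand for any construction heuristic, local search, and objective function):

```python
import random

def multi_start(make_tour, improve, cost, restarts=10, seed=0):
    """Multi-start local search: repeatedly construct a solution,
    run local search on it, and keep the best local optimum found."""
    random.seed(seed)
    best = None
    for _ in range(restarts):
        s = improve(make_tour())               # Steps 1-2
        if best is None or cost(s) < cost(best):
            best = s                           # remember the best tour
    return best
```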
31
Multi-start: Random vs Constructive Restarts
• Random:
– creating a random tour is fast: O(n)
– Local Search is likely to be slow
– Local Search usually does not perform very well
• Construction heuristics:
– creating a new tour is slow (best case O(n^2))
– Local Search is usually quite fast
– Local Search usually performs better when started from good tours
32
Chained Local Search
Motivation: Why abandon a tour that is already locally optimal?

Basic concept: Instead of restarting local search from a new tour, modify the current tour (a "kick") and restart local search on the modified tour.
33
Chained Local Search
Step 1: Create an initial tour
Step 2: Apply Local Search
Step 3: Modify the locally optimal tour in a way that Local Search cannot repair
Step 4: Apply Local Search only to the modified parts of the tour
Step 5: Repeat from Step 3 until a stopping criterion is met
Step 6: Output the best tour found
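A classic kick for Step 3 on the TSP is the double bridge: cut the tour into four segments A|B|C|D and reconnect them as A|C|B|D. 2-opt cannot undo this move in one step, which is exactly what Step 3 asks for (the name `double_bridge` is standard in the literature, but the code is an illustrative sketch):

```python
import random

def double_bridge(tour, rng=random):
    """Double-bridge kick: split the tour into segments A|B|C|D at three
    random cut points and reconnect them as A|C|B|D."""
    n = len(tour)
    i, j, k = sorted(rng.sample(range(1, n), 3))
    return tour[:i] + tour[j:k] + tour[i:j] + tour[k:]
```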
34
Local Search for Satisfiability

• WalkSat(C)
  Guess an initial assignment
  While unsatisfied and not timed out do
    Select from C an unsatisfied clause c = ±Xi ∨ ±Xj ∨ ±Xk
    Select a variable v in the unsatisfied clause c
    Flip v

• GSat(C)
  Guess an initial assignment
  While unsatisfied and not timed out do
    Flip the value assigned to the variable that yields the greatest number of satisfied clauses (note: flip even if there is no improvement)
  If a satisfying assignment is not found, repeat the entire process, starting from a different initial random assignment

3SAT: Input: a 3CNF formula. Output: Yes/No, whether the input can be satisfied.
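The WalkSat loop translates almost line for line into Python (a sketch; clauses are lists of nonzero ints, with −3 meaning "X3 negated", and the names are assumptions):

```python
import random

def walksat(clauses, n_vars, max_flips=10000, seed=0):
    """WalkSat sketch: while some clause is unsatisfied, pick one such
    clause at random and flip a random variable in it.
    Returns a satisfying assignment dict, or None on timeout."""
    rng = random.Random(seed)
    assign = {v: rng.choice([True, False]) for v in range(1, n_vars + 1)}
    sat = lambda lit: assign[abs(lit)] == (lit > 0)
    for _ in range(max_flips):
        unsat = [c for c in clauses if not any(sat(l) for l in c)]
        if not unsat:
            return assign
        clause = rng.choice(unsat)            # an unsatisfied clause
        v = abs(rng.choice(clause))           # a variable in it
        assign[v] = not assign[v]             # flip
    return assign if all(any(sat(l) for l in c) for c in clauses) else None

# (X1 v ~X2) ^ (X2 v X3) ^ (~X1 v ~X3)
model = walksat([[1, -2], [2, 3], [-1, -3]], 3)
```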
35
Example:
36
Mixing Random Walk with Greedy
• With probability p, make a random-walk move; otherwise make a greedy move
• The value of p is determined empirically, by finding the best setting for the problem class
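A sketch of the mixture (assuming, as the following slides suggest, that "MixedWalkSat[p]" makes a WalkSat-style random flip with probability p and a GSAT-style greedy flip otherwise; all names here are assumptions):

```python
import random

def mixed_walksat(clauses, n_vars, p=0.5, max_flips=10000, seed=0):
    """With probability p flip a random variable of a random unsatisfied
    clause (walk step); otherwise flip the variable that maximizes the
    number of satisfied clauses (greedy step)."""
    rng = random.Random(seed)
    assign = {v: rng.choice([True, False]) for v in range(1, n_vars + 1)}
    sat = lambda lit: assign[abs(lit)] == (lit > 0)

    def num_sat():
        return sum(any(sat(l) for l in c) for c in clauses)

    for _ in range(max_flips):
        if num_sat() == len(clauses):
            return assign
        if rng.random() < p:                   # random-walk step
            unsat = [c for c in clauses if not any(sat(l) for l in c)]
            v = abs(rng.choice(rng.choice(unsat)))
        else:                                  # greedy (GSAT-style) step
            def score(v):
                assign[v] = not assign[v]      # try the flip ...
                s = num_sat()
                assign[v] = not assign[v]      # ... and undo it
                return s
            v = max(range(1, n_vars + 1), key=score)
        assign[v] = not assign[v]
    return assign if num_sat() == len(clauses) else None
```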
37
Finding the best value of p
• Q: What value for p?
• Let Q[p, c] be the quality of using MixedWalkSat[p] on problem c:
– Q[p, c] = time to return an answer, or
– Q[p, c] = 1 if MixedWalkSat[p] returns a (correct) answer within 5 minutes and 0 otherwise, or
– perhaps some combination of both
• Then find the p that optimizes the average performance of MixedWalkSat[p] on a set of challenge problems.
38
Experimental Results: "Hard" Random 3CNF
• Time in seconds
• Effectiveness: probability that a random initial assignment leads to a solution
• Test instances with up to 400 variables
– MixedWalkSat better than Simulated Annealing
– MixedWalkSat better than Basic GSAT
Source: Selman and Kautz, 1993
39
Advanced Local Search Paradigms (Meta-heuristics)
• "An iterative master process that guides and modifies the operations of a subordinate heuristic to efficiently produce high quality solutions" (Voss et al., 2002)
• Different philosophies and forms
• Single-point
– Tabu Search
– Simulated Annealing
– Greedy Randomized Adaptive Search Procedure
• Population-based
– Evolutionary (Genetic) Algorithms
– Ant Colony Optimization
– Particle Swarm
• Hybrids, e.g. Hyper-Heuristics
40
What can go wrong with Local Search?
• Caught in a local optimum
– Diversify: multiple (random) restarts, etc.
• Cycling: revisiting the same solutions over and over
41
What can go wrong with Local Search? Analogy
42
Tabu Search

• A tabu list of forbidden (tabu) moves is maintained, to avoid cycling
• Tabu moves are based on the short- and long-term history of the search process
• An aspiration criterion is a condition that allows the tabu status of a move to be overridden, so that the move can be considered at that iteration
• The next move is the best move among the feasible moves from the neighborhood of the current solution. A tabu move is taken only if it satisfies the aspiration criterion.
Fred Glover (1986)
43
Tabu Search Algorithm

1. construct an initial solution s
2. while not finished
3.   compute N(s), T(s), A(s)
4.   choose s' ∈ (N(s) − T(s)) ∪ A(s) s.t. cost(s') is minimum
5.   s = s'
6. endwhile

• s: current solution
• N(s): neighborhood set of s, where N(s) ⊆ S
• T(s): tabu set of s, where T(s) ⊆ N(s)
• A(s): aspiration set of s, where A(s) ⊆ T(s)
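The algorithm can be sketched in Python, with T(s) kept implicitly as a dict from moves to the iteration at which their tabu status expires (an illustrative sketch; here the aspiration set A(s) is taken to be "neighbors better than the best solution so far", one common choice):

```python
def tabu_search(start, cost, neighbors, max_iters=100, tenure=5):
    """Tabu-search sketch: each iteration moves to the cheapest neighbor
    that is either non-tabu or aspired (better than the best so far).
    `neighbors(s)` yields (neighbor, move) pairs; the move is what gets
    declared tabu for `tenure` iterations."""
    s = best = start
    tabu = {}                                    # move -> iteration it expires
    for it in range(max_iters):
        candidates = [(cost(n), n, m) for n, m in neighbors(s)
                      if tabu.get(m, -1) < it    # non-tabu ...
                      or cost(n) < cost(best)]   # ... or aspired
        if not candidates:
            break
        _, s, m = min(candidates, key=lambda t: t[0])
        tabu[m] = it + tenure                    # forbid this move for a while
        if cost(s) < cost(best):
            best = s
    return best
```

Here the "move" is simply the value moved to, so recently visited solutions are forbidden, which is what breaks cycling.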
44
Tabu Search Algorithm
[Flowchart:
Step 1: Set the current solution
Step 2: Define the neighborhood
Step 3: The objective function is used to evaluate each of the neighbors; the tabu list and aspiration criteria are consulted
Step 4: A move operates on the best-picked neighbor to generate a new solution
Step 5: Update the tabu list and trigger any related events
Step 6: If the stopping conditions are met, tabu search stops]
45
Example: Constrained MST Problem
[Figure: a graph on edges x1, …, x7 with weights 6, 2, 0, 8, 12, 18, 9]
Constraints:
(1) x1 + x2 + x6 ≤ 1
(2) x1 ≤ x3
Penalty for each constraint violation = 50
46
Example
• Neighborhood: standard "edge swap"
• An edge is tabu if it was added within the last two iterations
• The aspiration criterion is satisfied if the tabu move would create a tree that is better than the best tree so far
47
Example
• Iteration 1 (initial MST)
[Figure: the initial tree, cost = 16 + 100 (two constraint violations); one edge is added and one dropped]
48
Example
• Iteration 2
[Figure: cost = 28; one edge is now tabu; another add/drop swap is applied]
49
Example
• Iteration 3
[Figure: cost = 32; two edges are tabu; a drop/add swap is applied. Edge x3 is aspired.]
50
Example
• Iteration 4
[Figure: cost = 23; two edges are tabu]
51
Tabu Search for TSP: Tabu List
Proposal 1: Tabu-ing the solution
• Record the complete tour
• Strictly prevents solution cycling
• Verification requires checking all rotations
Proposal 2: Tabu-ing the move
• Record the recently made moves
• Easy to verify
• Less restrictive: does not prevent cycling entirely
Proposal 3: Tabu-ing the objective value
• Record the solution's objective value
• Easy to verify
• More restrictive: may accidentally miss out good solutions
52
Tabu Search: Advanced Strategies
Reactive Tabu List
• Tabu tenure varies according to the search history
• Lengthen the tenure if a series of poor solutions is encountered
• Shorten the tenure if a good (elite) solution is encountered
Intensification
• Focus on solutions with "good" characteristics
• The idea is to find better solutions around elite solutions
• Usually applied at local optima
Diversification
• Force the search to move away from specific characteristics
• The idea is to explore new regions of the search space
• Usually applied when the potential of finding a better solution is low
53
Simulated Annealing

Let T be non-negative
Loop:
  pick a random neighbor s' in N(s)
  let D = f(s) − f(s')  (the improvement)
  if D > 0 (better), s = s'
  else, with probability e^{D/T}, set s = s'

As T tends to 0: hill climbing
As T tends to ∞: random walk
In general, start with a large T and decrease it slowly.

Kirkpatrick et al. (1983); Metropolis et al. (1953)
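The loop above, with a geometric cooling schedule, in Python (an illustrative sketch; the parameter values and names are arbitrary assumptions):

```python
import math
import random

def simulated_annealing(start, f, neighbors, t0=10.0, alpha=0.95,
                        steps=2000, seed=0):
    """Simulated annealing for minimizing f: always accept an improving
    neighbor; accept a worsening one with probability e^(D/T) where
    D = f(s) - f(s') <= 0, while T cools geometrically."""
    rng = random.Random(seed)
    s = best = start
    t = t0
    for _ in range(steps):
        s2 = rng.choice(neighbors(s))
        d = f(s) - f(s2)                 # D > 0 means s2 is an improvement
        if d > 0 or rng.random() < math.exp(d / t):
            s = s2
        if f(s) < f(best):
            best = s                     # track the best solution seen
        t = max(t * alpha, 1e-9)         # geometric cooling schedule
    return best

# Toy run: minimize (x - 3)^2 over the integers with +/-1 moves.
best = simulated_annealing(40, lambda x: (x - 3) ** 2,
                           lambda x: [x - 1, x + 1])
```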
54
Intuition Behind Simulated Annealing
• A stochastic variation on hill climbing in which downhill moves can be made
• The probability of making a downhill move decreases with time (the length of the exploration path from the start state)
• Modeled after the physical process of annealing metals (cooling molten metal to a solid, minimal-energy state)
– solutions = states
– cost of a solution = energy of a state
55
Intuition Behind Simulated Annealing
• During the annealing process in metals, there is a probability p that a transition to a higher-energy (sub-optimal) state occurs. This probability is e^{dE/T}, where
– dE = (energy of previous state) − (energy of current state) (it is < 0)
– T = temperature of the metal
• p is higher when T is higher; movement to higher-energy states becomes less likely as the temperature comes down
• The rate at which the system is cooled is called the annealing schedule
56
Effect of Varying dE for fixed T = 10

dE    e^{dE/10}
0     1.00
-13   0.27
-43   0.01

The smaller (more negative) the value of dE, the bigger the downhill step, and the lower the probability of taking this downhill move.
57
Effect of Varying T for fixed dE = -13
0.9999…1010
0.5650
0.0000021e -13/TT
The greater thevalue of T, the smaller the relativeimportance of dEand the higher theprobability of choosing a downhillmove.
58
Logistic (Sigmoid) Function
• Select any move (uphill or downhill) with probability 1/(1 + e^{−dE/T})
[Figure: p plotted against dE from −20 to 20, with p = 0.5 at dE = 0; for T very high the curve flattens, for T near 0 it approaches a step function]
59
Annealing Schedule

After each step, how should T be updated?
• Logarithmic: T_i = γ / log(i + 2)
– provable properties, slow
• Geometric: T_i = T_0 · γ^⌊i/L⌋
– lots of parameters to tune
• Adaptive
– change T_i dynamically
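The first two schedules are one-liners (a sketch; γ, T_0 and L are tuning parameters, and the values used here are arbitrary):

```python
import math

def logarithmic(i, gamma=10.0):
    """T_i = gamma / log(i + 2): provably convergent, but cools very slowly."""
    return gamma / math.log(i + 2)

def geometric(i, t0=10.0, gamma=0.95, length=100):
    """T_i = t0 * gamma^(i // L): hold T for L steps, then multiply by gamma."""
    return t0 * gamma ** (i // length)
```

An adaptive schedule would instead update T from the state of the search, e.g. from the observed acceptance rate.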
60
Properties of Simulated Annealing
• If the annealing schedule lowers T slowly enough, the algorithm can find the global optimum
• If a large running time is allowed, SA generally outperforms all other meta-heuristics with respect to the quality of solutions
61
Population-Based Meta-HeuristicsHybrid Meta-Heuristics
Next Week…