Optimization for Learning, Planning and Problem Solving
Introduction and Concepts
Joshua Knowles
School of Computer Science
The University of Manchester
COMP60342 - Week 1 2.15, Mar 13th 2015
TRAVELING SALESPERSON
Instance: Complete graph $G = (V, E)$ and a set of edge weights $w_{i,j}$ for each edge $(i, j) \in E$. Number $k$.
Question: Is there a tour starting and ending at $v_0$ and visiting every other vertex of $V$ exactly once, of total edge weight less than $k$?
COMP60342-Introduction 2 2.15, Mar 13th 2015
This Course
Aim: a practical introduction to solving optimization problems computationally.
• Develop skill at classifying and formulating problems
• Practise programming skills
• Develop algorithm design skills
• Learn:
- a few general-purpose algorithms for optimization
- how to adapt these to different problems
• Learn to reason about complexity of a problem and methods
Less important: deeply mathematical approaches to optimization.
Structure of the Course
Week 1: Intro to Optimization; Basic exact methods: Enumeration and Greedy
Week 2: Branch-and-Bound; Complexity/NP-Completeness
Week 3: Dynamic Programming; Stochastic DP Problems
Week 4: Evolutionary Algorithms and Local Search
Week 5: Multiobjective Optimization

Assessments:
1st Lab (two weeks): Knapsack Problems, 10%
2nd Lab (one week): Dynamic Programming Applications, 10%
3rd Lab (one week): Weighted Graph Matching Problem, 10%
4th Lab (one week): Multiobjective MAX-SAT, 10%
Examination (2 hours):
- 20 multiple choice questions, 40%
- 2 long questions from 4, 60%
The Course Text
Relevant Chapters / Sections:
1. Optimization Problems
1.A Mathematical Revision Notes
8. Algorithms and Complexity
12. Minimum Spanning Tree
15. NP-Complete Problems
16.2 Strongly NP-hard
17. Approximation Algorithms
18. Branch-and-Bound; Dynamic Programming
19. Local Search
This Lecture
• Problems, and Optimization Problems
• Applications of optimization (a brief tour)
• Formulation of optimization problems
• Algorithms and basic complexity concepts
The Meaning of Problem
In mathematics, one general way to define a problem is
a problem is a function that partitions a set of candidate solutions into two sets, solutions and non-solutions.
For example
2 + 2 = ?
partitions the set of candidate solutions (say the natural numbers) into the set of solutions {4} and non-solutions {0, 1, 2, 3, 5, ...}.
Equally,
10 + 10 > ?
partitions the set of candidate solutions (the natural numbers) into the set of solutions {0, 1, 2, ..., 19} and the set of non-solutions {20, 21, ...}.
To solve a problem is to find one or more (or all) of the solutions.
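The two examples above can be sketched in Python by treating a problem as a predicate that splits a candidate set. This is only an illustration: the names `partition`, `sols_eq`, `sols_gt` and `non_gt` are hypothetical, and because the natural numbers are infinite the sketch assumes a cut-off at 30.

```python
def partition(candidates, is_solution):
    """Split candidates into (solutions, non_solutions) using a predicate."""
    solutions = [c for c in candidates if is_solution(c)]
    non_solutions = [c for c in candidates if not is_solution(c)]
    return solutions, non_solutions

# Assumption: only the natural numbers 0..29 are examined.
candidates = range(30)

# "2 + 2 = ?" has exactly one solution.
sols_eq, _ = partition(candidates, lambda x: 2 + 2 == x)

# "10 + 10 > ?" has many solutions.
sols_gt, non_gt = partition(candidates, lambda x: 10 + 10 > x)

print(sols_eq)      # the solutions of 2 + 2 = ?
print(sols_gt[-1])  # the largest solution of 10 + 10 > ?
```

Solving the problem then amounts to returning one, several, or all elements of the first list.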
The Meaning of Optimization Problem
An optimization problem is just a mathematical problem concerned with finding an optimum (= minimum or maximum) of a set of candidate solutions.
The Meaning of Optimization
Maximize a measure of utility over a set of candidate solutions.
E.g. In a given graph, find a clique of greatest total weight.
[Figure: a set of candidate solutions and their utility values]
As in this problem, the problems we will be concerned with on this course have a discrete and finite set of candidate solutions.
The Meaning of Optimization
Aim: find the best from a set of possible alternatives.
The set of all possible alternatives is the search space or search domain.
Alternatives are often (though not always) described as vectors of variables. They are known as solutions, candidates or configurations.
The “best” is defined as that which maximizes some well-defined measure of performance, profit, fitness, utility or quality, or minimizes some cost or penalty.
Alternatives have to be feasible. Feasibility is defined as meeting a number of constraints.
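As a minimal sketch of how these terms fit together, the Python below enumerates a tiny search space of 0/1 vectors for a toy knapsack. The item values, weights and capacity are made-up illustrative data, and the names `utility`, `feasible` and `search_space` are hypothetical.

```python
from itertools import product

# Hypothetical data: three items with a value and a weight each.
values = [6, 10, 12]
weights = [1, 2, 3]
capacity = 5  # constraint: total weight of chosen items must not exceed this

def utility(x):
    """Measure to maximize: total value of the chosen items."""
    return sum(v for v, take in zip(values, x) if take)

def feasible(x):
    """Constraint check: chosen items must fit within the capacity."""
    return sum(w for w, take in zip(weights, x) if take) <= capacity

# Search space: every 0/1 vector of length 3, i.e. 2^3 = 8 candidates.
search_space = list(product([0, 1], repeat=len(values)))

# The best alternative is the feasible candidate of greatest utility.
best = max((x for x in search_space if feasible(x)), key=utility)
print(best, utility(best))
```

Each candidate is a vector of variables, `feasible` encodes the constraints, and `utility` is the measure being maximized.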
The Difficulty of Optimization
As described, optimization seems mathematically trivial, even easier than, say, sorting. (Why?)
However, in most optimization problems the input objects define the feasible alternatives only indirectly. We must search over many or all of the combinations (or permutations) of the input objects.
Hence, while in sorting there are only $N$ objects to sort, in optimization there are often $2^N$ or $N!$ candidates to search and test (i.e., a number that grows exponentially or faster with the input size $N$).
Such problems are called combinatorial optimization problems because of this search over combinations.
The very large number of alternatives that are associated with such problems is known as a combinatorial explosion.
Optimization is about how to organize the searching of this set to obtain results efficiently in time (and sometimes space).
Another difficulty of optimization is that the evaluation of even a single candidate may be time-consuming.
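To get a feel for the combinatorial explosion, the short Python sketch below prints the number of subsets ($2^N$) and orderings ($N!$) for a few input sizes. The function names are hypothetical, and the chosen values of N are arbitrary.

```python
from itertools import permutations
from math import factorial

def count_subsets(n):
    """Each of n objects is either in or out of a subset: 2^n combinations."""
    return 2 ** n

def count_orderings(n):
    """Number of distinct orderings (permutations) of n objects: n!."""
    return factorial(n)

for n in (5, 10, 20):
    print(n, count_subsets(n), count_orderings(n))

# Cross-check the formula against explicit enumeration for a small n.
explicit = len(list(permutations(range(4))))
print(explicit == count_orderings(4))
```

Even at N = 20 the number of orderings is far beyond what can be enumerated one by one in reasonable time.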
Machine Learning ... Chess
Simple: What move would maximize value of captured pieces?
Tricky: What move now would force a win?
Machine Learning ... Poker
Given the state of play, how much should I bet this round?
Machine Learning ... Simulated Car Racing
What can we optimize in a shoot-em-up or real-time game?
Machine Learning ... Feature Selection
Red: genes up-regulated in diseased tissue
Green: genes up-regulated in healthy tissue
What selection of features (genes) makes it easy to detect the presence (or progression) of a disease?
Planning ... the Shortest Tour
Circuit boards must be drilled.
What is the shortest route to move the drill from its ‘home’ position, to visit each hole, and return home?
Planning ... the London Olympics
Athletes need rest, spectators need to travel, finals must follow semi-finals, ...
Planning ... where to point Hubble
• Hubble is a shared resource, so teams need to be scheduled fairly
• Targets for observation may specify time windows
Problem Solving
What is the minimum number of moves (from here) to obtain a solved cube?
What about from ANY configuration?
Exercise 1: Problem Formulation
In pairs or threes:
Select a problem (given above or one of your own) and prepare answers to these questions.
• What is the measure of optimality? (Could there be more than one?)
• What are the variables?
• What are the constraints?
• Are there any notable features to this problem, or difficulties in formulating it?
More on Applications...
Codebreaking was among the first uses of early computers. It is a kind of optimization/search process.
Operations Research: military operations, mining industries, finance/investment, logistics and transportation, large-scale infrastructure decisions, floor-planning, scheduling, ...
Used by:
• Governments / think-tanks
• Economists and Bankers
• Evolutionary biologists
• Pharmaceutical companies
• Engineers and Designers
• AI and Machine Learning Scientists
...and many more
Optimization Techniques (Development)
And who studies/develops optimization methods?
• Mathematicians
• Business/management scientists and economists
• Computer scientists
They each have a different purpose and approach (though with much overlap).
Computer scientists focus on algorithms.
The General Nonlinear Optimization Problem
A general form is the following.
$$
\begin{aligned}
\text{minimize} \quad & f(x) \\
\text{subject to} \quad & g_i(x) \ge 0, \quad i = 1, \dots, m \\
& h_j(x) = 0, \quad j = 1, \dots, p
\end{aligned}
\tag{1}
$$
where $x \in \mathbb{R}^n$ is a vector of variables, known as a (candidate) solution, $f$ is the cost or objective function to be minimized, and $g$ and $h$ are sets of constraint functions.
When $f$, $g$ and $h$ are general (i.e., they are not restricted to linear functions), the problem is the general nonlinear optimization problem.
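To make form (1) concrete, here is a small Python sketch with one inequality constraint and one equality constraint. The particular functions `f`, `g` and `h`, the grid, and the tolerance are all illustrative assumptions, and the crude grid search is only a way to show the roles of the three functions, not a recommended method.

```python
def f(x):
    # Hypothetical objective: squared distance from the origin.
    return x[0] ** 2 + x[1] ** 2

def g(x):
    # Hypothetical inequality constraint g(x) >= 0, i.e. x1 + x2 >= 1.
    return x[0] + x[1] - 1

def h(x):
    # Hypothetical equality constraint h(x) = 0, i.e. x1 = x2.
    return x[0] - x[1]

def is_feasible(x, tol=1e-9):
    # Small tolerance because the coordinates are floating-point numbers.
    return g(x) >= -tol and abs(h(x)) <= tol

# Candidate points on a grid over [-2, 2] x [-2, 2] with spacing 0.1.
grid = [(i / 10, j / 10) for i in range(-20, 21) for j in range(-20, 21)]

# Keep the feasible grid point with the smallest objective value.
best = min((x for x in grid if is_feasible(x)), key=f)
print(best, f(best))
```

For this toy instance the constraints force $x_1 = x_2 \ge 0.5$, so the minimizer on the grid sits at the boundary of the inequality constraint.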
[Figure: a nonlinear (curvy) cost surface over a nonlinear feasible space]
Combinatorial Optimization Problem
General verbal definition:
Given a definition of a finite set of objects, find an object from that set which is optimal, in some given sense.
E.g.: Traveling Salesperson Problem. Given a complete edge-weighted graph, find a tour in the graph that has minimum total weight.
A tour is an object and there are certainly a finite number of distinct tours within any finite graph. An optimal tour has been defined as one with minimum total weight.
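Because the set of tours is finite, the Traveling Salesperson Problem can in principle be solved by enumerating every tour. The Python sketch below does this for a made-up 4-vertex graph; the weight matrix `w` and the function names are hypothetical, and the approach is only practical for very small graphs.

```python
from itertools import permutations

# Hypothetical symmetric edge weights for a complete graph on 4 vertices.
w = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]

def tour_weight(tour):
    """Sum edge weights around the cycle, including the edge back to the start."""
    return sum(w[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))

def brute_force_tsp(n):
    """Fix vertex 0 as the start and try every ordering of the other vertices."""
    tours = ([0] + list(p) for p in permutations(range(1, n)))
    return min(tours, key=tour_weight)

best = brute_force_tsp(4)
print(best, tour_weight(best))
```

The number of orderings examined is $(n-1)!$, which is why enumeration stops being feasible after a handful of vertices.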
Combinatorial Optimization - Alternative Definitions
Combinatorial Optimization Problem: Given some finite set $N = \{1, \dots, n\}$, weights $c_j$ for each $j \in N$, and a set $F$ of feasible subsets of $N$, find a minimum weight feasible subset:
$$\min_{S \subseteq N} \left\{ \sum_{j \in S} c_j : S \in F \right\}.$$
(paraphrased from Laurence Wolsey “Integer Programming”)
For example, the change-making problem: given a set of coins of different denominations, find a minimum cardinality set that has total value exactly $V$.
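The change-making example fits the subset form above with every weight $c_j = 1$ and $F$ the subsets whose values sum to $V$. A minimal brute-force sketch in Python follows; the function name `min_coin_subset` and the coin list are hypothetical, and the search is exhaustive rather than efficient.

```python
from itertools import combinations

def min_coin_subset(coins, target):
    """Return a smallest subset of the given coins summing exactly to target.

    Subsets are tried in order of increasing size, so the first feasible
    subset found has minimum cardinality. Returns None if none is feasible.
    """
    for size in range(len(coins) + 1):
        for subset in combinations(coins, size):
            if sum(subset) == target:
                return subset
    return None

# Hypothetical coins available (each coin may be used at most once).
coins = [1, 1, 2, 5, 10, 10, 20]

print(min_coin_subset(coins, 27))
print(min_coin_subset(coins, 100))
```

The second call has no feasible subset because the coins total less than the target.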
Combinatorial Optimization - Alternative Definitions
Papadimitriou and Steiglitz (course text) first define an instance of a problem, where an instance specifies a set $F$ of feasible points and a cost function $c$ that maps $F$ to $\mathbb{R}$.
The problem associated with this instance is to find a point $f \in F$ such that $c(f)$ is a minimum among all $c(y)$, $y \in F$.
A problem (or problem class) is then just defined as a set $I$ of instances.
E.g., an instance of the traveling salesman problem refers to a particular set of cities to visit and their distances from one another. The problem, TSP, in general consists of a set of instances (possibly an infinite set), usually all generated in the same way (Papadimitriou and Steiglitz, page 4).
Hierarchy of Problems
We will concern ourselves with Matching and especially NP-complete Integer Programming Problems.
Algorithms
Recall: the course is about algorithms for optimization.
What is an algorithm?
...intimately tied up with concepts that Turing, Post and others developed: the concept of a “machine” or computer.
An algorithm is an effective procedure for achieving a specific well-defined effect (the output) on a given specific type of input. It mechanically solves any given instance of a problem.
It must halt. A non-halting program is not an algorithm.
−→ There is no algorithm for running the economy perfectly, or purchasing an ideal family home, etc., since these problems are not mathematically well-defined.
Algorithms — Historical Note
These are algorithms (high-school maths)
109 66
901 + 24 x
---- ----
1010 264
1320
----
1584
These arithmetic procedures were described by the Persian mathematician al-Khwarizmi in c. 800-820 AD. His name became associated with any “mechanical” or step-by-step procedure: al-Khwarizmi −→ Algorithm.
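The long-multiplication layout shown above can be written out step by step. The Python sketch below is one possible rendering: the function name `long_multiply` is hypothetical, and it assumes non-negative integers, producing one shifted partial product per digit of the multiplier.

```python
def long_multiply(a, b):
    """Grade-school multiplication of non-negative integers.

    Returns (partial_products, total): one shifted partial product per
    digit of b, least significant digit first, and their sum.
    """
    partials = []
    for position, digit in enumerate(reversed(str(b))):
        # Multiply by a single digit, then shift left by its position.
        partials.append(a * int(digit) * 10 ** position)
    return partials, sum(partials)

partials, total = long_multiply(66, 24)
print(partials, total)
```

For 66 x 24 the partial products are the two intermediate rows of the worked example, and their sum is the final row.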
Complexity Hierarchy of Problems
[Figure: problems arranged by increasing difficulty.
Easy (polynomial-time): Integer addition, Shortest Path, Spanning Tree
Fairly hard (?): Integer Factorization
Hard (NP-complete / NP-hard): Satisfiability, Traveling Salesman, Knapsack
Impossible (non-computable / undecidable): Halting Problem, Post’s Translation, Tiling the Plane]
NB: This is a crude division of problems based on their difficulty. There are manymore classes than these.
Analysis of Algorithms — Big-O
Algorithm running time grows with input size.
The running time can be expressed as a function f (n), where n is input size.
Some functions grow faster than others.
Constant: no growth
Logs: slow growth
Roots: a bit quicker
Polynomials: $n^{10}$ grows faster than $n^9$, which grows faster than $n^8$, ...
Exponentials: $10^n$ grows faster than $2^n$
Factorials: grow faster than all the above
[Figure: plot of $50n$ (linear), $n^3$ (polynomial) and $2^n$ (exponential) for $n$ from 0 to 10; vertical axis from 0 to 2500.]
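The three functions from the plot can be compared numerically. The Python sketch below finds, for each pair, the point from which the faster-growing function stays ahead; the helper name `ahead_for_good_from` and the search limit of 100 are assumptions made for illustration.

```python
def ahead_for_good_from(f, g, limit=100):
    """Smallest n0 with f(n) > g(n) for every n in n0..limit, or None."""
    n0 = None
    for n in range(limit, -1, -1):  # scan downwards from the limit
        if f(n) > g(n):
            n0 = n
        else:
            break  # f is not ahead here, so stop
    return n0

def linear(n):
    return 50 * n

def cubic(n):
    return n ** 3

def exponential(n):
    return 2 ** n

print(ahead_for_good_from(cubic, linear))       # cubic vs linear
print(ahead_for_good_from(exponential, linear)) # exponential vs linear
print(ahead_for_good_from(exponential, cubic))  # exponential vs cubic
```

The large constant in $50n$ keeps the linear function on top for small $n$, which is why growth rates are compared for sufficiently large $n$.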
$O(f(n))$ expresses that the rate of growth is of the order of $f(n)$ (or less). This means that the running time is at most a constant multiple of $f(n)$ for sufficiently large $n$ (asymptotically).
Space complexity: We may also be concerned with the amount of space (memory) an algorithm uses.
Understanding Big-O as a Set
A function $f(n)$ is in the set $O(g(n))$ if $f(n)$ grows more slowly than, or at the same rate as, $g(n)$.
In particular, $f(n) \in O(g(n))$ if and only if there is some positive constant $k$ and number $n_0$ such that $f(n) \le k \cdot g(n)$ for all $n > n_0$.
Here are a few members of the set $O(n^2)$.
[Figure: the set $O(n^2)$ drawn as a region containing, among others, $2n^2$, $n \log n$, $25$ and $n + 100$.]
Not in the set
$n^3$ is an example of a function not in the set $O(n^2)$. It is not there because $n^3$ always ‘overtakes’ $n^2$ for sufficiently large $n$. And $n^3$ will be bigger than even, say, $10^6 \cdot n^2$ for sufficiently large $n$.
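The set definition of big-O can be probed numerically by proposing witnesses $k$ and $n_0$ and checking the inequality over a finite range. The Python sketch below does this; the helper name `bounded_on_range`, the chosen witnesses and the range limit are assumptions, and a finite check is evidence rather than a proof.

```python
def bounded_on_range(f, g, k, n0, up_to=10_000):
    """True if f(n) <= k * g(n) for every n with n0 < n <= up_to.

    This is numerical evidence that f is in O(g), not a proof.
    """
    return all(f(n) <= k * g(n) for n in range(n0 + 1, up_to + 1))

def square(n):
    return n ** 2

# Candidate witnesses (k, n0) for two members of O(n^2).
ok_linear = bounded_on_range(lambda n: n + 100, square, k=1, n0=10)
ok_quadratic = bounded_on_range(lambda n: 2 * n ** 2, square, k=2, n0=0)

# n^3 defeats any fixed k: it exceeds k * n^2 as soon as n > k.
k = 10 ** 6
cubic_escapes = (k + 1) ** 3 > k * (k + 1) ** 2

print(ok_linear, ok_quadratic, cubic_escapes)
```

Whatever constant $k$ is proposed for $n^3$, choosing $n = k + 1$ breaks the bound, which is the sense in which $n^3$ always overtakes.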
Complexity — The Class P
The class P, or Polynomial, is the class of problems for which there exists a polynomial-time bounded algorithm.
Polynomial-time bounded includes any algorithm whose big-O is a polynomial, e.g. $O(n^4)$.
It also includes any algorithm whose big-O is bounded by a polynomial, e.g. $O(n^{2.5})$ and $O(n^2 \log n)$, since these are both bounded by $O(n^3)$.
N.B.: the program must be able to solve every instance within this time bound. Complexity is worst-case by default.
Why is the class P practically important? (see pp.163–165 of the core text).
The class P represents problems that are tractable. This means that reasonably large instances of these problems can be solved reasonably quickly on computers.
An important fact about P is that problems in P benefit fully from the (exponential) growth in computer speed: each time machines get faster by some factor, the largest instance solvable in a fixed time grows by a multiplicative factor too. So problems that were slow a few years ago can now be solved very quickly.
However, for problems not in P, in particular problems with exponential time complexity or worse, the same growth in computer power helps far less: for a $2^n$-time algorithm, a machine 1000 times faster can handle instances with $n$ only about 10 larger in the same time. So a problem (not in P) that was intractable to solve on a computer a few years ago is still intractable to solve today. And it will remain intractable for many, many years to come.
For these problems, we may need to satisfy ourselves with approximate solutions only.
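The effect of faster hardware on tractable and intractable problems can be illustrated with a small calculation. In the Python sketch below, the operation budget, the speed-up factor and the two cost functions ($n^3$ and $2^n$) are hypothetical choices, and `largest_solvable` assumes the cost function is increasing in $n$.

```python
def largest_solvable(cost, budget):
    """Largest n with cost(n) <= budget, assuming cost is increasing in n."""
    n = 0
    while cost(n + 1) <= budget:
        n += 1
    return n

def cubic_cost(n):
    return n ** 3

def exponential_cost(n):
    return 2 ** n

budget = 10 ** 9   # hypothetical: operations affordable on the old machine
speedup = 1000     # hypothetical: the new machine is 1000 times faster

poly_before = largest_solvable(cubic_cost, budget)
poly_after = largest_solvable(cubic_cost, budget * speedup)
exp_before = largest_solvable(exponential_cost, budget)
exp_after = largest_solvable(exponential_cost, budget * speedup)

print(poly_before, poly_after)  # polynomial: size multiplied
print(exp_before, exp_after)    # exponential: size increased by an additive amount
```

Under these assumptions the polynomial algorithm's reachable instance size is multiplied by ten, while the exponential algorithm gains only ten extra units of $n$.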