MELJUN CORTES Automata Theory (Automata21)
CSC 3130: Automata theory and formal languages
Polynomial time
Fall 2008
MELJUN P. CORTES, MBA, MPA, BSCS, ACS
Efficient algorithms
• The running time of an algorithm depends on the input
• For longer inputs, we allow more time
• Efficiency is measured as a function of input size
[Figure: region "efficient" drawn inside region "decidable"; A_TM and PCP are marked]
Examples of running time
problem        algorithm    running time
0^n 1^n                     O(n)
parsing        LR(1)        O(n)
parsing        CYK          O(n^3)
short paths    Dijkstra     O(n log n)
matching       Edmonds      O(n^2)
n = input size
problem            running time
routing            2^{O(n)}
scheduling         2^{O(n log n)}
theorem proving    2^{O(n)}
Input representation
• Since we measure efficiency in terms of input size, how the input is represented will make a difference
• For us, any “reasonable” representation will be okay
The number 17
  OK: 17
  OK: 10001 (17 in base two)
  NO: 11111111111111111 (17 in unary)
This graph
  [Figure: graph on vertices 1, 2, 3, 4]
  OK: 0000,0010,0001,0010
  OK: (2,3),(3,4)
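A minimal sketch of why the string of ones is not a "reasonable" representation: it compares how many symbols each encoding of the same number needs. The helper name `encoding_lengths` is made up for this illustration.

```python
def encoding_lengths(value):
    """Return the number of symbols needed to write `value` in each encoding."""
    return {
        "decimal": len(str(value)),
        "binary": len(bin(value)[2:]),  # strip the "0b" prefix
        "unary": value,                 # one symbol per unit
    }

for value in (17, 1_000, 1_000_000):
    print(value, encoding_lengths(value))
```

Decimal and binary lengths differ only by a constant factor, while the unary length is exponential in both. An algorithm that looks polynomial when n is the unary length can therefore be exponential in the binary length.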
Measuring running time
• What does it mean when we say: “This algorithm runs in 1000 steps”
• One step in
    java:            if (x > 0) y = 5*y + x;
    RAM machine:     write r3;
    Turing Machine:  δ(q3, a) = (q7, b, R)
  all mean different things!
Example
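L = {0^n 1^n : n > 0}
in java:
  M(string x) {
    n = x.len;
    if (n == 0 || n % 2 != 0) reject;    // need n > 0 and an even length
    for (i = 0; i < n/2; i++) {
      if (x[i] != '0') reject;           // first half must be all 0s
      if (x[n-1-i] != '1') reject;       // second half must be all 1s
    }
    accept;
  }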
running time = O(n)
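The O(n) bound can be checked with a runnable sketch of the same decider. This is an illustrative Python version, not the slide's code; the function name `decide_0n1n` and the returned loop counter are assumptions added for the example.

```python
def decide_0n1n(x):
    """Decide whether x is in {0^n 1^n : n > 0}.

    Returns (accepted, loop_iterations) so the linear bound is visible.
    """
    n = len(x)
    iterations = 0
    if n == 0 or n % 2 != 0:
        return False, iterations
    for i in range(n // 2):
        iterations += 1
        # Position i must hold a 0 and its mirror position must hold a 1.
        if x[i] != "0" or x[n - 1 - i] != "1":
            return False, iterations
    return True, iterations

print(decide_0n1n("000111"))
print(decide_0n1n("0101"))
```

The loop runs at most n/2 times and does constant work per iteration, which is where the O(n) comes from in this model.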
But how about:
RAM machine?
Turing Machine?
multitape Turing Machine?
nondeterministic TM?
Efficiency and the Church-Turing thesis
• The Church-Turing thesis says all these models are equivalent in power…
  … but not in running time!
[Figure: models of computation: java, RAM machine, Turing Machine, multitape TM, UNIVAC]
The Cobham-Edmonds thesis
• However, there is an extension to the Church-Turing thesis that says
  For any realistic models of computation M1 and M2:
  M1 can be simulated on M2 with at most polynomial slowdown
• So any task that takes time T on M1 can be done in time (say) T^2 or T^3 on M2
Efficient simulation
• The running time of a program depends on the model of computation…
… but in the grand scheme, this is irrelevant
[Figure: models ordered from fast to slow: java, RAM machine, multitape TM, ordinary TM]
Every reasonable model of computation can be simulated efficiently on every other
Example of efficient simulation
• Recall simulating multiple tapes on a single tape
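[Figure: 3-tape machine M with tape contents "0 1 0 …", "0 1 …", "1 0 0 …" over tape alphabet Γ = {0, 1, ☐}]
[Figure: single tape machine S holding the three tapes on one tape, separated by #, with marked symbols recording the head positions, over tape alphabet Γ’ = {0, 1, ☐, marked 0, marked 1, marked ☐, #}]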
Running time of simulation
• Each move of the multiple tape TM might require traversing the whole single tape
  1 step of 3-tape TM → O(s) steps of single tape TM
  s = rightmost cell ever visited
  after t steps: s ≤ 3t + 4
  t steps of 3-tape TM → O(ts) = O(t^2) single tape steps
multi-tape TM → single tape TM: quadratic slowdown
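The quadratic bound can be written out as a short calculation. It is a sketch that assumes some constant c (introduced here only for illustration) bounds the cost of one simulated step in terms of the current tape length, and uses s ≤ 3i + 4 after i steps:

```latex
\[
\text{total single tape steps}
\;\le\; \sum_{i=1}^{t} c\,(3i + 4)
\;=\; c\left(\frac{3t(t+1)}{2} + 4t\right)
\;=\; O(t^2).
\]
```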
Simulation slowdown
• Cobham-Edmonds Thesis: M1 can be simulated on M2 with at most polynomial slowdown
[Figure: simulations between java, RAM machine, multi-tape TM and single tape TM; each arrow is labeled with its slowdown, O(t) or O(t^2)]
Running time of nondeterministic TM
• What about nondeterministic TMs?
• For ordinary TMs, the running time of M on input x is the number of transitions M makes before it halts
• But a nondeterministic TM can run for a different time on different “computation paths”
Example
[Figure: nondeterministic TM with states q0, q1, qacc, qrej and transitions labeled 1/1R and 0/0R, run on input 10001]
what is the running time? 5
• Definition of running time for nondeterministic TM
  computation path: any possible sequence of transitions
  running time = max length of any computation path
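The "max length of any computation path" definition can be sketched as a recursive search over all choices. The transition table `DELTA` below is a hypothetical machine written for this example, not the one drawn on the slide; a configuration with no applicable transition is treated as halting.

```python
BLANK = "_"
HALTING = {"qacc", "qrej"}

# Hypothetical transition relation (not the machine drawn on the slide):
# (state, symbol) -> list of (next_state, symbol_to_write, head_move)
DELTA = {
    ("q0", "0"): [("q0", "0", +1)],
    ("q0", "1"): [("q0", "1", +1), ("q1", "1", +1)],  # nondeterministic choice
    ("q1", "0"): [("qrej", "0", +1)],
    ("q1", "1"): [("qacc", "1", +1)],
}

def running_time(state, tape, head=0, budget=500):
    """Length of the longest computation path from the given configuration."""
    if budget == 0:
        raise RuntimeError("a computation path exceeded the step budget")
    if state in HALTING:
        return 0
    symbol = tape[head] if head < len(tape) else BLANK
    longest = 0
    # Explore every transition allowed here and keep the longest continuation.
    for next_state, written, move in DELTA.get((state, symbol), []):
        new_tape = tape[:head] + written + tape[head + 1:]
        new_head = max(0, head + move)
        longest = max(longest, 1 + running_time(next_state, new_tape, new_head, budget - 1))
    return longest

print(running_time("q0", "10001"))
```

For this hypothetical machine the longest path on 10001 is the one that reads all five input symbols, so the running time on that input is 5 even though other paths halt after two transitions.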
Simulation of nondeterministic TM
[Figure: nondeterministic TM N is simulated by a multi-tape TM M with three tapes: input tape x, simulation tape z, address tape a]
The address tape represents possible choices at each step
Each a describes a possible computation path

For all k > 0
  For all possible strings a of length k
    Copy x to z.
    Simulate N on input z using a as choices
    If a specifies an invalid choice or simulation loops/rejects, abandon simulation.
    If N enters its accept state, accept and halt.
  If N rejected on all as of length k, reject and halt.
Simulation slowdown for nondeterminism

For all k > 0
  For all possible strings a of length k
    Copy x to z.
    Simulate N on input z using a as choices
    If a specifies an invalid choice or simulation loops/rejects, abandon simulation.
    If N enters its accept state, accept and halt.
  If N rejected on all as of length k, reject and halt.

running time of N is t → simulation will halt when k = t
running time of simulation = (running time for specific a) × (number of as of length ≤ t)
                           = O(t) × 2^{O(t)}
                           = 2^{O(t)}
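The "for all k, for all strings a of length k" loop can be sketched in Python. This is an illustration under stated assumptions, not the TM construction itself: the nondeterministic procedure is modeled as a hypothetical function `run(choices)` that replays the computation with the given choice string, and the names `simulate`, `guess_bits` and `always_reject` are invented for the example.

```python
from itertools import product

ACCEPT, REJECT, OUT_OF_CHOICES = "accept", "reject", "out of choices"

def simulate(run, num_options, max_k=20):
    """Deterministically try every choice string a of length k = 1, 2, ...

    Returns (accepted, number_of_choice_strings_tried).
    """
    tried = 0
    for k in range(1, max_k + 1):
        all_rejected = True
        for a in product(range(num_options), repeat=k):
            tried += 1
            outcome = run(a)
            if outcome == ACCEPT:
                return True, tried
            if outcome != REJECT:      # this path needs a longer choice string
                all_rejected = False
        if all_rejected:               # every path of length k rejected
            return False, tried
    raise RuntimeError("no decision within max_k choices")

def guess_bits(target):
    """Toy nondeterministic procedure: guess the bits of `target` one by one."""
    def run(choices):
        for i, bit in enumerate(target):
            if i >= len(choices):
                return OUT_OF_CHOICES
            if choices[i] != bit:
                return REJECT
        return ACCEPT
    return run

def always_reject(choices):
    """Toy procedure that makes two choices and then rejects on every path."""
    return REJECT if len(choices) >= 2 else OUT_OF_CHOICES

print(simulate(guess_bits((1, 0, 1)), num_options=2))
print(simulate(always_reject, num_options=2))
```

The counter shows where the exponential factor comes from: with two options per step there are 2^k choice strings of length k, and the loop may have to go through all of them for every k up to the running time of the nondeterministic procedure.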
Simulation slowdown
[Figure: simulations between java, RAM machine, multi-tape TM and single tape TM with slowdowns O(t) or O(t^2); simulating a nondeterministic TM has slowdown 2^{O(t)}]
Do nondeterministic TMs violate the Cobham-Edmonds thesis?
Nondeterminism and the CE thesis
• Cobham-Edmonds Thesis says:
  Any two realistic models of computation can be simulated with polynomial slowdown
• But is nondeterministic computation realistic?
Example
• Recall the scheduling problem
  Can you schedule final exams so that there are no conflicts?
  [Figure: conflict graph on the exams CSC 3230, CSC 2110, CSC 3160, CSC 3130, colored with slots Y, R, B]
  Exams → vertices
  Slots → colors
  Conflicts → edges
• Scheduling with nondeterminism:
  schedule(int n, Edges edges) {
    for i := 1 to n:
      choose { c[i] := Y; } or { c[i] := R; } or { c[i] := B; }
    for all e in edges:
      if c[e.left] == c[e.right] reject;
    accept;
  }
Example
  schedule(int n, Edges edges) {
    for i := 1 to n:
      choose { c[i] := Y; } or { c[i] := R; } or { c[i] := B; }
    for all e in edges:
      if c[e.left] == c[e.right] reject;
    accept;
  }
In reality, programming languages don’t allow us to choose
We have to tell the computer how to make these choices
Nondeterminism does not seem like a realistic feature of a programming language or computer
... but if we had it, we could schedule in linear time!
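One way to "tell the computer how to make these choices" is to try them all. The sketch below replaces `choose` with an exhaustive loop; the conflict lists in the usage lines are hypothetical, since the slide's conflict graph is only given as a picture.

```python
from itertools import product

def schedule(n, edges, slots=("Y", "R", "B")):
    """Try every assignment of slots to exams 1..n.

    Returns a conflict-free assignment as a dict, or None if there is none.
    """
    for assignment in product(slots, repeat=n):
        c = dict(zip(range(1, n + 1), assignment))
        # Same check as the nondeterministic version: no edge may be monochromatic.
        if all(c[left] != c[right] for left, right in edges):
            return c
    return None

# Hypothetical conflicts between four exams.
print(schedule(4, [(1, 2), (1, 3), (2, 3), (3, 4)]))
# Four mutually conflicting exams cannot fit into three slots.
print(schedule(4, [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]))
```

The check for a single assignment is linear in the number of edges, as in the nondeterministic program, but the outer loop may visit all 3^n assignments.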
Nondeterministic simulation
[Figure: nondeterministic TM → multi-tape TM: 2^{O(t)} slowdown]
Is this the best we can do?
• If we can do better, this would improve all known combinatorial optimization algorithms!
Millennium prize problems
• Recall how in 1900, Hilbert gave 23 problems that guided mathematics in the 20th century
• In 2000, the Clay Mathematics Institute gave 7 problems for the 21st century, each with a $1,000,000 prize
  1 P versus NP (computer science)
  2 The Hodge conjecture
  3 The Poincaré conjecture (Perelman 2006, refused money)
  4 The Riemann hypothesis (Hilbert’s 8th problem)
  5 Yang–Mills existence and mass gap
  6 Navier–Stokes existence and smoothness
  7 The Birch and Swinnerton-Dyer conjecture
The P versus NP question
• Among other things, this asks:
  – Is nondeterminism a realistic feature of computation?
  – Can the choose construct be efficiently implemented?
  – Can we efficiently optimize any “well-posed” problem?
[Figure: nondeterministic TM → ordinary TM with slowdown poly(t)?]
Can a nondeterministic TM be simulated on an ordinary TM with polynomial slowdown?
Most people think not, but nobody knows for sure!
The class P
P is the class of all languages that can be decided on an ordinary TM whose running time is some polynomial in the length of the input
By the CE thesis, we can replace “ordinary TM” by any realistic model of computation (multi-tape TM, java, RAM)
[Figure: nested classes, from innermost to outermost: regular, context-free, efficient, decidable]
Examples of languages in P
problem        algorithm    running time
0^n 1^n                     O(n)
parsing        LR(1)        O(n)
parsing        CYK          O(n^3)
short paths    Dijkstra     O(n log n)
matching       Edmonds      O(n^2)
n = input size
L01 = {0^n 1^n : n > 0}
LG = {x: x is generated by G}, where G is some CFG
PATH = {(G, a, b, L): G is a graph with a path of length L from a to b}
MATCH = {G: G is a graph with a “perfect” matching}
[Figure: context-free inside P (efficient) inside decidable, with L01, LG, PATH and MATCH placed in P]
Languages believed to be outside P
problem        running time of best-known algorithm
routing        2^{O(n)}
scheduling     2^{O(n)}
thm-proving    2^{O(n)}
We do not know if these problems have faster algorithms, but we suspect not
[Figure: LG, PATH and MATCH inside P (efficient); ROUTE, SCHED and PROVE inside decidable, with a "?" on whether they lie outside P]
To explain why, first we need to understand what these problems have in common
More problems
[Figure: graph G on vertices 1, 2, 3, 4]
A clique is a subset of vertices that are all interconnected
{1, 4}, {2, 3, 4}, {1} are cliques
An independent set is a subset of vertices so that no pair is connected
{1, 2}, {1, 3}, {4} are independent sets
there is no independent set of size 3
A vertex cover is a set of vertices that touches (covers) all edges
{2, 4}, {3, 4}, {1, 2, 3} are vertex covers
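The three definitions translate directly into short checks. In this sketch the edge set `EDGES` is an assumption: the drawing of G is not in the text, so the edges are inferred from the example sets listed above (1-4, 2-3, 2-4 and 3-4 is the only edge set consistent with all of them). The function names are invented for the example.

```python
from itertools import combinations

VERTICES = {1, 2, 3, 4}
# Assumed edge set, inferred from the listed cliques and independent sets.
EDGES = {frozenset(e) for e in [(1, 4), (2, 3), (2, 4), (3, 4)]}

def is_clique(s):
    """Every pair of vertices in s is connected."""
    return all(frozenset(pair) in EDGES for pair in combinations(s, 2))

def is_independent_set(s):
    """No pair of vertices in s is connected."""
    return all(frozenset(pair) not in EDGES for pair in combinations(s, 2))

def is_vertex_cover(s):
    """Every edge has at least one endpoint in s."""
    return all(edge & set(s) for edge in EDGES)

print(is_clique({2, 3, 4}), is_independent_set({1, 2}), is_vertex_cover({2, 4}))
print(any(is_independent_set(s) for s in combinations(VERTICES, 3)))
```

Each check looks at every pair or every edge once, so it takes time polynomial in the size of the graph.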
Boolean formula satisfiability
• A boolean formula is an expression made up of variables, ands, ors, and negations, like
  (x1 ∨ ¬x2) ∧ (x2 ∨ x3 ∨ ¬x4) ∧ (¬x1)
• The formula is satisfiable if one can assign values to the variables so the expression evaluates to true
  Above formula is satisfiable because this assignment makes it true:
  x1 = F x2 = F x3 = T x4 = T
Status of these problems
CLIQUE = {(G, k): G is a graph with a clique of k vertices}
IS = {(G, k): G is a graph with an independent set of k vertices}
VC = {(G, k): G is a graph with a vertex cover of k vertices}
SAT = {f: f is a satisfiable Boolean formula}
problem    running time of best-known algorithm
CLIQUE     2^{O(n)}
IS         2^{O(n)}
SAT        2^{O(n)}
VC         2^{O(n)}
What do these problems have in common?
Checking solutions efficiently
• We don’t know how to solve them efficiently
• But if someone told us the solution, we would be able to check it very quickly
Example: Is (G, 5) in CLIQUE?
[Figure: graph G on vertices 1 to 15]
Proposed solution: 1,5,9,12,14
Cliques via nondeterminism
• Checking solutions efficiently is equivalent to designing efficient nondeterministic algorithms
Example: Is (G, k) in CLIQUE?
  clique(Graph G, int k) {
    C = {};                               % potential clique
    for i := 1 to G.n:                    % choose clique
      choose { C := union(C, {i}); } or {}
    if size(C) != k reject;               % check size is k
    for i := 1 to G.n:                    % check all edges
      for j := 1 to G.n:                  % are in
        if i != j and i in C and j in C
          if G.isedge(i,j) == false reject;
    accept;
  }
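Without a `choose` construct, a deterministic program has to enumerate the candidate sets itself. The sketch below is one such replacement and makes two assumptions: it enumerates only the subsets of size k rather than all subsets, and the example graph in the usage lines is hypothetical.

```python
from itertools import combinations

def has_clique(n, edges, k):
    """Decide whether the graph on vertices 1..n has a clique of k vertices.

    Returns (answer, number_of_candidate_sets_checked).
    """
    edge_set = {frozenset(e) for e in edges}
    checked = 0
    for candidate in combinations(range(1, n + 1), k):
        checked += 1
        # Same verification as in the nondeterministic program: all pairs must be edges.
        if all(frozenset(pair) in edge_set for pair in combinations(candidate, 2)):
            return True, checked
    return False, checked

# Hypothetical example graph with edges 1-4, 2-3, 2-4, 3-4.
edges = [(1, 4), (2, 3), (2, 4), (3, 4)]
print(has_clique(4, edges, 3))
print(has_clique(4, edges, 4))
```

Verifying one candidate is fast, but the number of candidates grows exponentially when k grows with n, which is the gap between checking and searching that the slides point at.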
Example: Formula satisfiability
f = (x1 ∨ ¬x2) ∧ (x2 ∨ x3 ∨ ¬x4) ∧ (¬x1)
Checking solution:
  FFTT
  substitute x1 = F x2 = F x3 = T x4 = T
  evaluate formula f = (F ∨ T) ∧ (F ∨ T ∨ F) ∧ (T)
  can be done in linear time
Nondeterministic algorithm:
  sat(Formula f) {
    x = new bool[f.n];
    for i := 1 to f.n:
      choose { x[i] := true; } or { x[i] := false; }
    if f.eval(x) == true accept;
    else reject;
  }
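Both halves of this slide can be sketched in a few lines. The sketch assumes the formula is f = (x1 ∨ ¬x2) ∧ (x2 ∨ x3 ∨ ¬x4) ∧ (¬x1), which is what the evaluated form (F ∨ T) ∧ (F ∨ T ∨ F) ∧ (T) implies for the assignment FFTT; the clause encoding and the function names are invented for the example.

```python
from itertools import product

# Each clause is a list of literals: +i stands for x_i, -i for "not x_i".
# Assumed formula: (x1 or not x2) and (x2 or x3 or not x4) and (not x1)
F = [[1, -2], [2, 3, -4], [-1]]

def evaluate(formula, assignment):
    """Check an assignment (dict: variable index -> bool) in time linear in the formula size."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in formula
    )

def brute_force_sat(formula, num_vars):
    """Deterministic stand-in for `choose`: try all 2^num_vars assignments."""
    for values in product([False, True], repeat=num_vars):
        assignment = dict(zip(range(1, num_vars + 1), values))
        if evaluate(formula, assignment):
            return assignment
    return None

print(evaluate(F, {1: False, 2: False, 3: True, 4: True}))
print(brute_force_sat(F, 4))
```

`evaluate` plays the role of checking a proposed solution, while `brute_force_sat` shows the exponential search a deterministic program falls back on when it cannot guess the assignment.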
The class NP
• The class NP:
  NP is the class of all languages that can be decided on a nondeterministic TM whose running time is some polynomial in the length of the input
• L can be solved on a nondeterministic TM in polynomial time iff its solutions can be checked in time polynomial in the input length
P versus NP
P is contained in NP, because an ordinary TM is only weaker than a nondeterministic one
Conceptually, finding solutions can only be harder than checking them
[Figure: P (efficient), containing LG, PATH and MATCH, inside NP (efficiently checkable), containing CLIQUE, SAT, IS and VC, inside decidable]
P versus NP
• The answer to the question
    Is P equal to NP? ($1,000,000)
  is not known. But one reason it is believed to be negative is that, intuitively, searching is harder than verifying
• For example, solving homework problems (searching for solutions) is harder than grading (verifying the solution is correct)
Searching versus verifying
Mathematician: Given a mathematical claim, come up with a proof for it.
Scientist: Given a collection of data on some phenomena, find a theory explaining it.
Engineer: Given a set of constraints (on cost, physical laws, etc.) come up with a design (of an engine, bridge, etc) which meets them.
Detective: Given the crime scene, find “who’s done it”.