MELJUN CORTES Automata Theory (Automata19)
CSC 3130: Automata theory and formal languages
Undecidable problems for CFGs and descriptive complexity
MELJUN P. CORTES, MBA, MPA, BSCS, ACS
Decidable vs. undecidable
• undecidable: “TM M accepts w”, “TM M halts on w”, “TM M accepts some input”, “TM M accepts all inputs”, “TM M and M’ accept same inputs”
• decidable: “DFA M accepts w”, “DFA M accepts all inputs”, “PDA M accepts w”
• other kinds of problems? “PDA P accepts all inputs”, “CFG G is ambiguous”
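Computation is local
Computation tableau of M on input lotus (one configuration per row; the state symbol is written just left of the cell under the head; qa is the accepting state qacc):
› q0 l o t u s
› o q6 o t u s
› o k q3 t u s
› o k r q0 u s
› o k r a q1 s
› o k r qa a ☐
The changes between rows occur in a 2x3 window (two rows, three columns)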
Computation histories as strings
• If M halts on w, we can represent the computation tableau by a string t over the alphabet Γ ∪ Q ∪ {#, ›} (Γ: tape alphabet, Q: states)
• The rows of the tableau are written one after another, each followed by #:
›q0lotus#›oq6otus#...#›okrqaa☐#
M accepts w ⇔ qa occurs in string t
M rejects w ⇔ qa does not occur in t
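The encoding of a tableau as a string can be tried out with a small simulator. The following is a minimal sketch, not the lecture's construction: the transition table `DELTA` is read off the example tableau (lotus to okra), while the function name `history`, the step cap `max_steps`, and the treatment of a missing transition as halting are assumptions made for illustration.

```python
BLANK = "☐"      # blank tape symbol, as drawn on the slides
LEFT_END = "›"   # left-end marker that starts every configuration

# Transitions read off the example tableau:
# (state, symbol read) -> (new state, symbol written, head move)
DELTA = {
    ("q0", "l"): ("q6", "o", "R"),
    ("q6", "o"): ("q3", "k", "R"),
    ("q3", "t"): ("q0", "r", "R"),
    ("q0", "u"): ("q1", "a", "R"),
    ("q1", "s"): ("qa", BLANK, "L"),
}

def history(word, delta, start="q0", accept="qa", max_steps=1000):
    """Return the configurations of the run, each followed by '#'."""
    tape = list(word) or [BLANK]
    head, state = 0, start
    rows = []
    for _ in range(max_steps):          # step cap: a real TM may loop forever
        # The state symbol is written just left of the cell under the head.
        rows.append(LEFT_END + "".join(tape[:head]) + state + "".join(tape[head:]))
        if state == accept or (state, tape[head]) not in delta:
            break                       # accepted, or halted with no move defined
        state, written, move = delta[(state, tape[head])]
        tape[head] = written
        head = head + 1 if move == "R" else max(head - 1, 0)
        if head == len(tape):
            tape.append(BLANK)          # extend the tape with a blank on demand
    return "#".join(rows) + "#"

t = history("lotus", DELTA)
print(t)
print("accepting history:", "qa" in t)
```

The test `"qa" in t` mirrors the slide's criterion; it is only safe here because no tape symbol sequence in the example spells "qa" by accident.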
Undecidable problems for PDAs
ALLPDA = { 〈P〉 : P is a PDA that accepts all inputs }
• Theorem: The language ALLPDA is undecidable.
• Proof: We will show that if ALLPDA can be decided, so can ATM.
Undecidable problems for PDAs
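• Suppose A decides ALLPDA: on input 〈P〉, A accepts if P accepts all inputs and rejects if not
• Reduction: 〈M〉, w → 〈P〉 → A, which accepts if M rejects or loops on w, and rejects if M accepts w
• For this we need a PDA P such that
– P accepts all inputs if M rejects or loops on w
– P rejects some input if M accepts w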
Undecidability via computation histories
• P accepts all inputs if M rejects or loops on w
• P rejects some input if M accepts w
• Input to P: a candidate computation history of M on w
– P rejects accepting histories, e.g. ›q0lotus#›oq6otus#...#›okrqaa☐
– P accepts every other string
• M accepts w ⇒ P rejects the accepting history t
• M rejects or loops on w ⇒ there are no accepting histories ⇒ P accepts everything
Undecidability via computation histories
• Task: Design a PDA P that takes a candidate computation history t of M on w, rejects accepting histories and accepts everything else
• Expect t of the form w1#w2#...#wk#
– If w1 ≠ ›q0w, accept t.
– If t does not contain qa, accept t.
– If two consecutive blocks wi#wi+1 do not correspond to a proper transition of M, accept t.
Implementing P
On input t, nondeterministically make one of the following choices:
• Look in the first block w1 of t. If w1 ≠ ›q0w, accept t.
• Look for the appearance of qa. If t does not contain qa, accept t.
• Look for the beginning of the ith block of t. If two consecutive blocks wi#wi+1 do not represent a valid transition of M, accept t.
wi#wi+1 represents a valid transition if all 3x2 windows (three columns, two rows) correspond to possible transitions of M. Example of a valid transition:
› o k q3 t u s #
› o k r q0 u s
Valid and invalid windows
Each window is written as top row / bottom row:
• … c a t … / … c a p …: invalid window
• … t t u … / … t t u …: valid window
• … t t u … / … t t q3 …: valid window
• … t q3 u … / … t a q7 …: valid if δ(q3, u) = (q7, a, R)
• … q3 t u … / … k t q0 …: invalid window
• … c a t … / … b a t …: valid window
Implementing P
• We need to check that wi#wi+1 represent a valid transition of M
• To check this it is better to write t in boustrophedon: alternate rows are written in reverse
›q0lotus#›oq6otus#...#›okrqaa☐#
›q0lotus#sutoq6o›#...#›okrqaa☐#
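The boustrophedon rewriting is easy to state as code. This is a small sketch under the assumption that each row is given as a list of symbols, so that a state such as q6 stays one symbol when the row is reversed; the function name `boustrophedon` is made up for the example.

```python
def boustrophedon(rows):
    """Join rows with '#', writing every second row in reverse.

    Each row is a list of symbols, so multi-character state names
    such as 'q6' are reversed as a unit, not character by character.
    """
    parts = []
    for index, row in enumerate(rows):
        symbols = row if index % 2 == 0 else list(reversed(row))
        parts.append("".join(symbols))
    return "#".join(parts) + "#"

ROWS = [
    ["›", "q0", "l", "o", "t", "u", "s"],
    ["›", "o", "q6", "o", "t", "u", "s"],
]
print(boustrophedon(ROWS))
```

With the second row reversed, matching positions of wi and wi+1 sit at equal distances from the separating #, which is the kind of comparison a stack can handle.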
Implementing P
• To check that wi#wi+1 represent a valid transition of M, e.g. the proper transition
› o k q3 t u s
› o k r q0 u s
written in boustrophedon as …#›okq3tus#suq0rko›#… (wi, then wi+1 reversed):
– Nondeterministically look for the beginning of a 3x2 window
– Remember the first row of the window in the state
– Use the stack to detect the beginning of the second row
– Remember the second row of the window in the state
– If the window is not valid, accept, otherwise reject.
The Post Correspondence Problem
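• Input: A set of tiles, each with a top string and a bottom string
[Figure: example tiles, among them bab/cc, c/ab and a/ab (top/bottom), and a sequence of tiles whose tops and bottoms are lined up]
• Given an infinite supply of such tiles, can you match top and bottom?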
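To make the matching question concrete, here is a brute-force sketch that tries all tile sequences up to a given length. The tile set `TILES` is an illustrative example that is not taken from the slides, and the names `find_match` and `max_len` are assumptions. Because of the length bound this is only a search for short matches and not a decision procedure.

```python
from itertools import product

# Illustrative tiles as (top, bottom) pairs; not the tiles drawn on the slide.
TILES = [("b", "ca"), ("a", "ab"), ("ca", "a"), ("abc", "c")]

def find_match(tiles, max_len):
    """Return a list of tile indices whose tops and bottoms spell the
    same string, trying all sequences of length 1..max_len, else None."""
    for length in range(1, max_len + 1):
        for seq in product(range(len(tiles)), repeat=length):
            top = "".join(tiles[i][0] for i in seq)
            bottom = "".join(tiles[i][1] for i in seq)
            if top == bottom:
                return list(seq)
    return None   # no match within the bound; longer matches may still exist

match = find_match(TILES, 5)
print(match)
if match is not None:
    print("".join(TILES[i][0] for i in match))
```

Tiles may be reused any number of times, which is why the search ranges over sequences with repetition.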
Undecidability of PCP
PCP = { D : D is a collection of tiles that contains a top-bottom match }
• Theorem: The language PCP is undecidable.
• Proof: We will show that if PCP can be decided, so can ATM.
Undecidability of PCP
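• Idea: Matches represent accepting histories
• 〈M〉, w → T (collection of tiles)
– If M accepts w, then T can be matched
– If M rej/loops on w, then T cannot be matched
• In a match, top and bottom both spell out an accepting history:
›q0lotus#›oq6otus#›okq3t...#›qa☐☐☐☐
[Figure: tiles building this history: a starting tile carrying ›q0lotus#, transition tiles ›q0l/›oq6 and oq6o/okq3, and copy tiles t/t, u/u, s/s, #/#, ›/›, o/o, …]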
Some technicalities
• We will assume that
– Before accepting, TM M erases its tape
– One of the PCP tiles is marked as a starting tile
• These assumptions can be made without loss of generality (we will see why later)
[Figure: example tiles, one of them marked s as the starting tile]
Undecidability of PCP
• To decide ATM, we construct these tiles for PCP
• 〈M〉, w → T (collection of tiles)
– If M accepts w, then T can be matched
– If M rej/loops on w, then T cannot be matched
• Tiles in T (top/bottom):
– the starting tile s, with ›q0w# on the bottom
– a1 qi a3 / b1 b2 b3 for each valid window of this form
– a / a for all a in Γ ∪ {#, ›}
– “final” tiles, built from #, ›, qa and ☐
• In a match, top and bottom both spell out
›q0lotus#›oq6otus#...#›oq1☐☐☐#›qa☐☐☐☐
Undecidability of PCP
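• If M accepts w, the tiles can be matched so that top and bottom spell out the accepting computation history
›q0lotus#›oq6otus#...#›oq1☐☐☐#›qa☐☐☐☐
[Figure: the tile types used: starting tile, window tiles a1 qi a3 / b1 b2 b3, copy tiles a/a, “final” tiles]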
Undecidability of PCP
• If M rejects on input w, then qr appears on bottom at some point, but it cannot be matched on top
• If M loops on w, then matching keeps going forever
A technicality
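• We assumed that one tile is marked as the starting tile
• We can remove this assumption by changing the tiles a bit
[Figure: tiles bab/cc, c/ab and the starting tile s = a/baba, rewritten with * between letters]
– “starting tile” begins with *: *a* / *b*a*b*a
– “middle tiles”: b*a*b* / *c*c and c* / *a*b
– “ending tile” matches last *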
Ambiguity of CFGs
AMB = { G : G is an ambiguous CFG }
• Theorem: The language AMB is undecidable.
• Proof: We will show that if AMB can be decided, so can PCP.
Ambiguity of CFGs
• Proof: T (collection of tiles) → G (CFG)
– If T can be matched, then G is ambiguous
– If T cannot be matched, then G is unambiguous
• Step 1: Number the tiles
1: bab/cc   2: c/ab   3: a/ab   (top/bottom)
Ambiguity of CFGs
• T (collection of tiles) → G (CFG), for the tiles 1: bab/cc, 2: c/ab, 3: a/ab
• Terminals: a, b, c, 1, 2, 3
• Variables: S, T, B
• Productions:
S → T | B
T → babT1 | cT2 | aT3
B → ccB1 | abB2 | abB3
T → bab1 | c2 | a3
B → cc1 | ab2 | ab3
Ambiguity of CFGs
• Each sequence of tiles gives two derivations
• If the tiles match, these two derive the same string
Example: the tile sequence 1, 2, 2 (bab/cc, c/ab, c/ab)
S → T → babT1 → babcT21 → babcc221
S → B → ccB1 → ccabB21 → ccabab221
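The two derivations can be reproduced mechanically. This is a minimal sketch for the three numbered tiles; the helper names `productions` and `derived_string` are made up, and the terminating productions of the form T → ai i and B → bi i are an assumption inferred from the last step of the derivations above. Tile numbers are assumed to be single digits so that the trailing index string stays unambiguous.

```python
# Tiles from the slide as number -> (top, bottom).
TILES = {1: ("bab", "cc"), 2: ("c", "ab"), 3: ("a", "ab")}

def productions(tiles):
    """List the productions of G; T spells tops, B spells bottoms."""
    rules = ["S → T | B"]
    for i, (top, bottom) in tiles.items():
        rules.append(f"T → {top}T{i} | {top}{i}")
        rules.append(f"B → {bottom}B{i} | {bottom}{i}")
    return rules

def derived_string(tiles, seq, side):
    """String derived through T (side=0) or B (side=1) for tile sequence seq:
    the chosen halves in order, then the tile numbers in reverse order."""
    halves = "".join(tiles[i][side] for i in seq)
    numbers = "".join(str(i) for i in reversed(seq))
    return halves + numbers

for rule in productions(TILES):
    print(rule)
print(derived_string(TILES, [1, 2, 2], 0))   # derivation through T
print(derived_string(TILES, [1, 2, 2], 1))   # derivation through B
```

The two strings are equal exactly when the tops and bottoms of the sequence agree, that is, when the sequence is a match.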
Ambiguity of CFGs
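• T (collection of tiles) → G (CFG)
– If T can be matched, then G is ambiguous ✓
– If T cannot be matched, then G is unambiguous
• Argue by contradiction: if G is ambiguous, then the ambiguity must look like this
S → T → a1Tn1 → a1a2Tn2n1 → … → a1a2…ai ni…n2n1
S → B → b1Bm1 → b1b2Bm2m1 → … → b1b2…bj mj…m2m1
(derivations through T alone, or through B alone, are never ambiguous, because the trailing numbers determine which tiles were used)
• Then n1...ni = m1…mj
• So there is a match: the tiles n1, n2, …, ni have tops a1, a2, …, ai and bottoms b1, b2, …, bi ✓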
Descriptive complexity
Roulette
• In a game of roulette, you bet $1 on even or odd
• The outcome is a number between 1 and 36
– If you guessed correctly, you double your bet
– Otherwise, you lose
17 6 5 16 5 2
11 8 31 18 7 4
5 2 29 8 1 12
Randomness
• If we write E for even, O for odd, what we saw is
OEOEOEOEOEOEOEOEOEOE
• It seems the wheel is crooked. If it weren’t, we would expect something more like
OOOEEOEOOEOEOOOEEEOE
• But both sequences have the same probability! Why does one appear less random than the other?
Turing Machines with output
• The goal of a Turing Machine with output is to write something on the output tape and go into state qhalt
[Figure: machine M with a work tape and an output tape]
Descriptive complexity
• The descriptive complexity K(x) of x is the length of the shortest description of a Turing Machine that outputs x
• We will assume x is long
Andrey Kolmogorov (1903-1987)
Example of descriptive complexity
x = “OE...OE” = (OE)^n (n = 1,000,000,000)
• Repeat for n steps:
– At odd step print O
– At even step print E
• Turing machine implementation:
– Write n in binary on work tape (≈ log2 n states)
– While work tape not equal to 0: subtract 1 from number on work tape; if number is odd, write O; if number is even, write E (≈ 3 + 15 + 2 states for the loop)
• K(x) ≈ log2 n + 20
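The same effect shows up with any fixed description language. In this sketch a Python expression stands in for the Turing machine description, which is an assumption made purely for illustration: the expression for (OE)^n grows only with the number of digits of n, while the string itself grows with n. The function name `description` is hypothetical.

```python
def description(n):
    """A Python expression (stand-in for a TM description) for (OE)^n."""
    return f"'OE'*{n}"

# The description works: evaluating it reproduces the string (small n only).
assert eval(description(10)) == "OE" * 10

for n in (10, 1_000, 1_000_000_000):
    # Length of the output versus length of its description.
    print(n, 2 * n, len(description(n)))
```

Here n is written in decimal, so the description length is about log10 n plus a constant; the slide's count uses binary and states, which changes the constants but not the logarithmic growth.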
Bounds on descriptive complexity
• Theorem 1: For every x of length n, K(x) is at most O(n)
• Proof: Let x = x1...xn and consider the following TM:
– Write x1 to output tape and move right
– Write x2 to output tape and move right
– ...
– Write xn to output tape and halt.
• This TM has n + O(1) states
Descriptive complexity and randomness
• Theorem 2: For 99% of strings of length n, K(x) ≥ n – 10.
• On the scale of K(x), from 0 to n + O(1):
– around O(log n): “simple” strings such as 111...1, OEOE...OE, 3.14159265, 1212321234321
– below n – 10: “randomness-deficient” strings
– from n – 10 to n + O(1): “random-looking” strings
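The slides state Theorem 2 without proof. One standard way to see it is a counting argument, sketched here under the assumption that descriptions are binary strings and K(x) is measured in bits: there are too few short descriptions to cover most strings of length n, since each description outputs at most one string.

```latex
\[
\#\{\text{descriptions of length} < n-10\}
  \;\le\; \sum_{k=0}^{n-11} 2^{k}
  \;=\; 2^{\,n-10}-1
  \;<\; 2^{\,n-10},
\]
\[
\frac{\#\{x \in \{0,1\}^{n} : K(x) < n-10\}}{2^{n}}
  \;<\; \frac{2^{\,n-10}}{2^{n}}
  \;=\; 2^{-10}
  \;<\; 1\%.
\]
```

So under these assumptions more than 99% of the strings of length n satisfy K(x) ≥ n – 10.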
Evaluating randomness
• How do we know if the casino is crooked?
• Idea: Compute K(sequence). If it is much less than n, this indicates that the sequence is not random
17 6 5 16 5 2
11 8 31 18 11 13
5 2 29 8 1 12
12 14 12 8 31 4
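K itself cannot be evaluated in practice, but a general-purpose compressor gives a crude upper-bound proxy. The sketch below uses zlib as that stand-in, which is an assumption and not part of the lecture: a short compressed length suggests a short description, while a long one proves nothing. The helper name `compressed_len` and the seeded pseudo-random sequence are illustrative choices.

```python
import random
import zlib

def compressed_len(outcomes):
    """Bytes zlib needs for the outcome string (proxy upper bound, not K)."""
    return len(zlib.compress(outcomes.encode("ascii"), 9))

n = 2000
rng = random.Random(0)                         # fixed seed for repeatability
alternating = "OE" * (n // 2)                  # what the crooked wheel produced
random_like = "".join(rng.choice("OE") for _ in range(n))

print("alternating:", compressed_len(alternating))
print("random-like:", compressed_len(random_like))
```

The alternating sequence compresses to far fewer bytes than the random-looking one of the same length, in line with the idea that it has a much shorter description.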
Computing descriptive complexity
• Theorem: It is not possible to compute K(x).
• Proof: Suppose it is, fix n and consider this TM M:
Output the first x of length n (in lexicographic order) such that K(x) ≥ n – 10
• Let x = output of M, then K(x) ≥ n – 10 but K(x) ≤ |〈M〉| = log2 n + O(1)
• So (when n is large) n – 10 > log2 n + O(1), and we get K(x) > K(x), impossible!
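The machine M in the proof is a plain search loop around a hypothetical procedure for K. The sketch below takes that procedure as a parameter `K`; since no such procedure exists, the demonstration plugs in `toy_K`, a made-up stand-in that merely counts runs of equal symbols and is not descriptive complexity. The names `first_complex_string` and `slack` are assumptions as well.

```python
from itertools import product

def first_complex_string(n, K, slack=10, alphabet="01"):
    """Return the first x of length n (lexicographic order) with K(x) >= n - slack.

    K is a HYPOTHETICAL procedure computing descriptive complexity; the
    theorem says it cannot exist. This loop is the machine M of the proof.
    """
    for symbols in product(alphabet, repeat=n):   # lexicographic enumeration
        x = "".join(symbols)
        if K(x) >= n - slack:
            return x
    return None

def toy_K(x):
    """Stand-in for demonstration only: number of runs of equal symbols."""
    return 1 + sum(a != b for a, b in zip(x, x[1:]))

print(first_complex_string(4, toy_K, slack=2))
```

Apart from the number n, the loop has constant size, which is where the bound log2 n + O(1) on the description of M comes from.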