CSC 3130: Automata theory and formal languages
Undecidable problems for CFGs and descriptive complexity
MELJUN P. CORTES, MBA, MPA, BSCS, ACS

Page 1: MELJUN CORTES Automata Theory (Automata19)

CSC 3130: Automata theory and formal languages

Undecidable problems for CFGs and descriptive complexity

MELJUN P. CORTES, MBA, MPA, BSCS, ACS

Page 2: MELJUN CORTES Automata Theory (Automata19)

Decidable vs. undecidable

• undecidable
– “TM M accepts w”
– “TM M halts on w”
– “TM M accepts some input”
– “TM M accepts all inputs”
– “TM M and M’ accept same inputs”

• decidable
– “DFA M accepts w”
– “DFA M accepts all inputs”
– “PDA M accepts w”

• other kinds of problems?
– “PDA P accepts all inputs” ?
– “CFG G is ambiguous” ?

Page 3: MELJUN CORTES Automata Theory (Automata19)

Computation is local

Computation tableau of M (one configuration per row):

tape     state         row of the tableau
lotus    q0            › q0 l  o  t  u  s
ootus    q6            › o  q6 o  t  u  s
oktus    q3            › o  k  q3 t  u  s
okrus    q0            › o  k  r  q0 u  s
okras    q1            › o  k  r  a  q1 s
okra     qacc (= qa)   › o  k  r  qa a  ☐

The changes between rows occur in a 2x3 window

Page 4: MELJUN CORTES Automata Theory (Automata19)

Computation histories as strings

• If M halts on w, we can represent the computation tableau by a string t over the alphabet Γ ∪ Q ∪ {#, ›}

› q0 l  o  t  u  s
› o  q6 o  t  u  s
› o  k  q3 t  u  s
› o  k  r  q0 u  s
› o  k  r  a  q1 s
› o  k  r  qa a  ☐

t = ›q0lotus#›oq6otus#...#›okrqaa☐#

M accepts w ⇔ qa occurs in string t

M rejects w ⇔ qa does not occur in t
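The flattening of a tableau into a history string can be sketched in a few lines of Python. This is only an illustration: representing each cell as a token (so that the state q0 stays one unit) and the names ROWS, encode_history and is_accepting are assumptions made here, not part of the slides.

```python
# Sketch: flatten a computation tableau into one history string.
# Each tableau cell is a token; treating states such as "q0" as single
# tokens is an illustrative convention, not something fixed by the slides.
ROWS = [
    ["›", "q0", "l", "o", "t", "u", "s"],
    ["›", "o", "q6", "o", "t", "u", "s"],
    ["›", "o", "k", "q3", "t", "u", "s"],
    ["›", "o", "k", "r", "q0", "u", "s"],
    ["›", "o", "k", "r", "a", "q1", "s"],
    ["›", "o", "k", "r", "qa", "a", "☐"],
]


def encode_history(rows):
    """Concatenate the rows, terminating each one with '#'."""
    return "".join("".join(row) + "#" for row in rows)


def is_accepting(rows, accept_state="qa"):
    """M accepts w exactly when the accepting state occurs in the history."""
    return any(accept_state in row for row in rows)


history = encode_history(ROWS)
print(history)
print(is_accepting(ROWS))
```

Checking for qa at the token level avoids confusing the state qa with the letter pair "q", "a" on the tape.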

Page 5: MELJUN CORTES Automata Theory (Automata19)

Undecidable problems for PDAs

ALLPDA = { 〈P〉 : P is a PDA that accepts all inputs }

• Theorem: The language ALLPDA is undecidable.

• Proof: We will show that if ALLPDA can be decided, so can ATM.

Page 6: MELJUN CORTES Automata Theory (Automata19)

Undecidable problems for PDAs

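• Suppose A decides ALLPDA: on input 〈P〉, A accepts if P accepts all inputs and rejects if not

• On input 〈M〉, w: construct a PDA P, run A on 〈P〉 and output A's answer
– reject if M accepts w
– accept if M rej/loops on w
– flipping this answer decides ATM

• For this to work we need a PDA P such that

P accepts all inputs if M rejects or loops on w

P rejects some input if M accepts w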

Page 7: MELJUN CORTES Automata Theory (Automata19)

Undecidability via computation histories

P accepts all inputs if M rejects or loops on w

P does not accept some input if M accepts w

• Input to P: a candidate computation history of M on w
– reject accepting histories, e.g. ›q0lotus#›oq6otus#...#›okrqaa☐
– accept every other string

M accepts w ⇒ there is an accepting history t ⇒ P rejects t

M rej/loops on w ⇒ no accepting histories ⇒ P accepts everything

Page 8: MELJUN CORTES Automata Theory (Automata19)

Undecidability via computation histories

• Task: Design a PDA P that takes a candidate computation history t of M on w and rejects exactly the accepting histories

[Figure: computation tableau of M on w]

• Expect t of the form w1#w2#...#wk#
– If w1 ≠ ›q0w, accept t.
– If t does not contain qa, accept t.
– If two consecutive blocks wi#wi+1 do not correspond to a proper transition of M, accept t.

Page 9: MELJUN CORTES Automata Theory (Automata19)

Implementing P

On input t:

Nondeterministically make one of the following choices
– Look in the first block w1 of t: if w1 ≠ ›q0w, accept t.
– Look for the appearance of qa: if t does not contain qa, accept t.
– Look for the beginning of the ith block of t: if two consecutive blocks wi#wi+1 do not represent a valid transition of M, accept t.

wi#wi+1 represents a valid transition if all 3x2 windows correspond to possible transitions of M

Example of a valid transition: ›okq3tus#›okrq0us
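The three conditions that P tests can be sketched as an ordinary Python check. This describes the language P has to recognize, not how a PDA does it: a PDA cannot simulate M directly, which is why the slides work with windows and nondeterminism. The transition table DELTA is hypothetical, read off the example tableau, and all names are illustrative.

```python
BLANK = "☐"

# Hypothetical transition table inferred from the example tableau:
# (state, scanned symbol) -> (new state, written symbol, head move).
DELTA = {
    ("q0", "l"): ("q6", "o", "R"),
    ("q6", "o"): ("q3", "k", "R"),
    ("q3", "t"): ("q0", "r", "R"),
    ("q0", "u"): ("q1", "a", "R"),
    ("q1", "s"): ("qa", BLANK, "L"),
}
STATES = {"q0", "q1", "q3", "q6", "qa"}


def step(config):
    """Return the successor configuration, or None if there is none."""
    i = next((k for k, tok in enumerate(config) if tok in STATES), None)
    if i is None:
        return None                      # block contains no state symbol
    if i + 1 == len(config):
        config = config + [BLANK]        # head is on fresh blank tape
    move = DELTA.get((config[i], config[i + 1]))
    if move is None:
        return None                      # M has no move from here
    new_state, written, direction = move
    if direction == "R":
        return config[:i] + [written, new_state] + config[i + 2:]
    if i == 0:
        return None                      # cannot move left of the tape start
    return config[:i - 1] + [new_state, config[i - 1], written] + config[i + 2:]


def is_accepting_history(blocks, w):
    """True if blocks is an accepting computation history of M on w."""
    starts_right = blocks[:1] == [["›", "q0"] + list(w)]
    contains_qa = any("qa" in block for block in blocks)
    all_valid = all(step(a) == b for a, b in zip(blocks, blocks[1:]))
    return starts_right and contains_qa and all_valid


def p_accepts(blocks, w):
    """P accepts every candidate that is not an accepting history."""
    return not is_accepting_history(blocks, w)


HISTORY = [["›", "q0"] + list("lotus")]
while "qa" not in HISTORY[-1]:
    HISTORY.append(step(HISTORY[-1]))
print(p_accepts(HISTORY, "lotus"))
```

Here HISTORY is generated by running the hypothetical machine, so P rejects it; changing any single cell makes at least one of the three tests fail and P accepts.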

Page 10: MELJUN CORTES Automata Theory (Automata19)

Valid and invalid windows

… c  a  t …
… c  a  p …
invalid window

… t  t  u …
… t  t  u …
valid window

… t  t  u  …
… t  t  q3 …
valid window

… t  q3 u  …
… t  a  q7 …
valid if δ(q3, u) = (q7, a, R)

… q3 t  u  …
… k  t  q0 …
invalid window

… c  a  t …
… b  a  t …
valid window

Page 11: MELJUN CORTES Automata Theory (Automata19)

Implementing P

• To check that wi#wi+1 represent a valid transition of M, it is better to write t in boustrophedon

[Figure: computation tableau of M on w]

standard: ›q0lotus#›oq6otus#...#›okrqaa☐#

boustrophedon: ›q0lotus#sutoq6o›#...#›okrqaa☐#

Alternate rows are written in reverse
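A minimal sketch of the boustrophedon encoding, assuming each row is stored as a list of tokens so that a state such as q6 is reversed as one unit. The names ROWS and boustrophedon are illustrative.

```python
# Rows of the example tableau, one token per cell (illustrative convention).
ROWS = [
    ["›", "q0", "l", "o", "t", "u", "s"],
    ["›", "o", "q6", "o", "t", "u", "s"],
    ["›", "o", "k", "q3", "t", "u", "s"],
    ["›", "o", "k", "r", "q0", "u", "s"],
]


def boustrophedon(rows):
    """Write rows 1, 3, 5, ... left to right and rows 2, 4, ... reversed."""
    blocks = []
    for index, row in enumerate(rows):
        cells = row if index % 2 == 0 else list(reversed(row))
        blocks.append("".join(cells) + "#")
    return "".join(blocks)


print(boustrophedon(ROWS))
```

With this layout the cells of a window in row i and row i+1 sit at equal distances on either side of the separating #, which is what lets a stack line them up.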

Page 12: MELJUN CORTES Automata Theory (Automata19)

Implementing P

• Checking that wi#wi+1 represent a valid transition of M

wi = ›okq3tus, wi+1 = ›okrq0us (a proper transition)

In boustrophedon: …#›okq3tus#suq0rko›#…

– Nondeterministically look for the beginning of a 3x2 window
– Remember the first row of the window in the state
– Use the stack to detect the beginning of the second row
– Remember the second row of the window in the state
– If the window is not valid, accept; otherwise reject.

Page 13: MELJUN CORTES Automata Theory (Automata19)

The Post Correspondence Problem

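• Input: A set of tiles, each with a top string and a bottom string

[Figure: tiles written as top/bottom pairs, e.g. bab/cc, c/ab, a/ab]

• Given an infinite supply of such tiles, can you match top and bottom?

[Figure: a sequence of tiles whose tops and bottoms spell the same string]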
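To make the matching question concrete, here is a bounded brute-force search in Python. It uses a small illustrative tile set, not the one on the slide, and the bound max_tiles is an assumption of the sketch: PCP is undecidable, so no fixed bound works for every tile set.

```python
from collections import deque

# Illustrative tiles as (top, bottom) pairs; not the tiles from the slide.
TILES = [("b", "ca"), ("a", "ab"), ("ca", "a"), ("abc", "c")]


def find_match(tiles, max_tiles):
    """Breadth-first search for at most max_tiles tile indices whose
    concatenated tops equal their concatenated bottoms."""
    queue = deque([((), "", "")])
    while queue:
        sequence, top, bottom = queue.popleft()
        if sequence and top == bottom:
            return list(sequence)
        if len(sequence) == max_tiles:
            continue
        for index, (t, b) in enumerate(tiles):
            new_top, new_bottom = top + t, bottom + b
            # Keep only partial arrangements where one string is a
            # prefix of the other; anything else can never match.
            if new_top.startswith(new_bottom) or new_bottom.startswith(new_top):
                queue.append((sequence + (index,), new_top, new_bottom))
    return None


match = find_match(TILES, 5)
print(match)
if match is not None:
    print("".join(TILES[i][0] for i in match))
```

If the search returns None, that only says there is no match using at most max_tiles tiles; it says nothing about longer arrangements.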

Page 14: MELJUN CORTES Automata Theory (Automata19)

Undecidability of PCP

PCP = { D : D is a collection of tiles that contains a top-bottom match }

• Theorem: The language PCP is undecidable.

• Proof: We will show that if PCP can be decided, so can ATM.

Page 15: MELJUN CORTES Automata Theory (Automata19)

Undecidability of PCP

• Idea: Matches represent accepting histories

〈M〉, w → T (collection of tiles)

– If M accepts w, then T can be matched
– If M rej/loops on w, then T cannot be matched

top:    ›q0lotus#›oq6otus#›okq3t...#›qa☐☐☐☐
bottom: ›q0lotus#›oq6otus#›okq3t...#›qa☐☐☐☐

• Tiles (top/bottom): the starting tile supplies ›q0lotus#, followed by ›q0l/›oq6, o/o, t/t, u/u, s/s, #/#, ›/›, oq6o/okq3, ...

Page 16: MELJUN CORTES Automata Theory (Automata19)

Some technicalities

• We will assume that
– Before accepting, TM M erases its tape
– One of the PCP tiles is marked as a starting tile

• These assumptions can be made without loss of generality (we will see why later)

[Figure: a collection of tiles with one tile marked s as the starting tile]

Page 17: MELJUN CORTES Automata Theory (Automata19)

Undecidability of PCP

• To decide ATM, we construct these tiles for PCP

〈M〉, w → T (collection of tiles)

– If M accepts w, then T can be matched
– If M rej/loops on w, then T cannot be matched

• Tiles in T
– the starting tile (marked s), which supplies the first block ›q0w#
– a1qia3 / b1b2b3 for each valid window of this form
– a / a for all a in Γ ∪ {#, ›}
– “final” tiles involving ›, qa, ☐ and #

Page 18: MELJUN CORTES Automata Theory (Automata19)

Undecidability of PCP

• A match spells out an accepting computation history

top:    ›q0lotus#›oq6otus#...#›oq1☐☐☐#›qa☐☐☐☐
bottom: ›q0lotus#›oq6otus#...#›oq1☐☐☐#›qa☐☐☐☐

[Figure: the tile types used: starting tile, window tiles a1qia3 / b1b2b3, copy tiles a / a, “final” tiles]

Page 19: MELJUN CORTES Automata Theory (Automata19)

Undecidability of PCP

• If M rejects on input w, then qr appears on bottom at some point, but it cannot be matched on top

• If M loops on w, then matching keeps going forever

Page 20: MELJUN CORTES Automata Theory (Automata19)

A technicality

• We assumed that one tile is marked as the starting tile

• We can remove this assumption by changing the tiles a bit

original tiles: bab/cc, c/ab, a/baba (marked s)

modified tiles:
– “starting tile” begins with *: *a* / *b*a*b*a
– “middle tiles”: b*a*b* / *c*c and c* / *a*b
– “ending tile” matches last *

Page 21: MELJUN CORTES Automata Theory (Automata19)

Ambiguity of CFGs

AMB = { G : G is an ambiguous CFG }

• Theorem: The language AMB is undecidable.

• Proof: We will show that if AMB can be decided, so can PCP.

Page 22: MELJUN CORTES Automata Theory (Automata19)

Ambiguity of CFGs

• Proof: T (collection of tiles) → G (CFG)

– If T can be matched, then G is ambiguous
– If T cannot be matched, then G is unambiguous

Step 1: Number the tiles

tile 1: bab/cc, tile 2: c/ab, tile 3: a/ab

Page 23: MELJUN CORTES Automata Theory (Automata19)

Ambiguity of CFGs

T (collection of tiles) → G (CFG)

tile 1: bab/cc, tile 2: c/ab, tile 3: a/ab

Terminals: a, b, c, 1, 2, 3

Variables: S, T, B

Productions:
S → T | B
T → babT1 | cT2 | aT3 | bab1 | c2 | a3
B → ccB1 | abB2 | abB3 | cc1 | ab2 | ab3

Page 24: MELJUN CORTES Automata Theory (Automata19)

Ambiguity of CFGs

• Each sequence of tiles gives two derivations

• If the tiles match, these two derive the same string

Example: the sequence of tiles 1, 2, 2 (bab/cc, c/ab, c/ab)

S → T → babT1 → babcT21 → babcc221

S → B → ccB1 → ccabB21 → ccabab221
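The two derivations can be reproduced with a short Python sketch. The dictionary TILES holds the three numbered tiles from the slide; the helper names derived_string and gives_ambiguity are introduced here for illustration only.

```python
# Numbered tiles as (top, bottom) pairs, as on the slide.
TILES = {1: ("bab", "cc"), 2: ("c", "ab"), 3: ("a", "ab")}
TOP, BOTTOM = 0, 1


def derived_string(sequence, side):
    """String derived from T (side=TOP) or B (side=BOTTOM) when the
    productions for the tiles in sequence are applied in order."""
    letters = "".join(TILES[n][side] for n in sequence)
    numbers = "".join(str(n) for n in reversed(sequence))
    return letters + numbers


def gives_ambiguity(sequence):
    """The T-derivation and the B-derivation yield the same string
    exactly when the sequence of tiles is a match."""
    return derived_string(sequence, TOP) == derived_string(sequence, BOTTOM)


print(derived_string([1, 2, 2], TOP))
print(derived_string([1, 2, 2], BOTTOM))
print(gives_ambiguity([1, 2, 2]))
```

The tile numbers are written in reverse because each production places its number to the right of the variable, so the first tile used ends up last.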

Page 25: MELJUN CORTES Automata Theory (Automata19)

Ambiguity of CFGs

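T (collection of tiles) → G (CFG)

– If T can be matched, then G is ambiguous
– If T cannot be matched, then G is unambiguous

• Argue by contradiction:
– If G is ambiguous, then some string has two different derivations. The tile numbers at the end of the string determine every step of a derivation through T (and likewise through B), so the ambiguity must look like this:

S → T → a1Tn1 → a1a2Tn2n1 → ... → a1a2...ai ni...n2n1

S → B → b1Bm1 → b1b2Bm2m1 → ... → b1b2...bj mj...m2m1

– Then n1...ni = m1...mj
– So there is a match: the tiles a1/b1, a2/b2, ..., ai/bi, numbered n1, n2, ..., ni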

Page 26: MELJUN CORTES Automata Theory (Automata19)

Descriptive complexity

Page 27: MELJUN CORTES Automata Theory (Automata19)

Roulette

• In a game of roulette, you bet $1 on even or odd

• The outcome is a number between 1 and 36
– If you guessed correctly, you double your bet
– Otherwise, you lose

17 6 5 16 5 2

11 8 31 18 7 4

5 2 29 8 1 12

Page 28: MELJUN CORTES Automata Theory (Automata19)

Randomness

• If we write E for even, O for odd, what we saw is

OEOEOEOEOEOEOEOEOEOE

• It seems the wheel is crooked. If it weren’t, we would expect something more like

OOOEEOEOOEOEOOOEEEOE

• But both sequences have the same probability! Why does one appear less random than the other?

Page 29: MELJUN CORTES Automata Theory (Automata19)

Turing Machines with output

• The goal of a Turing Machine with output is to write something on the output tape and go into state qhalt

[Figure: machine M with a work tape and an output tape, each shown holding 0 1 0 …]

Page 30: MELJUN CORTES Automata Theory (Automata19)

Descriptive complexity

• The descriptive complexity K(x) of x is the length of the shortest description of a Turing Machine that outputs x

• We will assume x is long

Andrey Kolmogorov (1903-1987)

Page 31: MELJUN CORTES Automata Theory (Automata19)

Example of descriptive complexity

x = “OE...OE” = (OE)^n   (n = 1,000,000,000)

Repeat for n steps:
  At odd step print O
  At even step print E

• Turing machine implementation:

Write n in binary on work tape   (≈ log2 n states)
While work tape not equal to 0,   (≈ 3 states)
  Subtract 1 from number on work tape   (≈ 15 states)
  If number is odd, write O; if number is even, write E   (≈ 2 states)

K(x) ≈ log2 n + 20
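The same effect can be seen with Python expressions standing in for Turing machine descriptions. This is only an analogy, and the function description is a made-up helper: the point is that the description grows with the number of digits of n, while the described string grows with n itself.

```python
def description(n):
    """A short Python expression that evaluates to the string (OE)^n."""
    return f'"OE" * {n}'


# The description length grows with the number of digits of n,
# while the string it describes has 2n characters.
for n in (10, 1_000, 1_000_000_000):
    print(n, len(description(n)), 2 * n)

# Evaluating a small instance confirms the description is faithful.
print(eval(description(3)))
```

The loop only measures the description for n = 1,000,000,000 and never builds that string.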

Page 32: MELJUN CORTES Automata Theory (Automata19)

Bounds on descriptive complexity

• Theorem 1: For every x of length n, K(x) is at most O(n)

• Proof: Let x = x1...xn and consider the following TM:

Write x1 to output tape and move right
Write x2 to output tape and move right
...
Write xn to output tape and halt.

This TM has n + O(1) states.

Page 33: MELJUN CORTES Automata Theory (Automata19)

Descriptive complexity and randomness

• Theorem 2: For 99% of strings of length n, K(x) ≥ n – 10.

Scale of K(x): 0 ... O(log n) ... n – 10 ... n + O(1)

– K(x) around O(log n): “simple” strings 111...1, OEOE...OE, 3.14159265, 1212321234321
– K(x) in between: “randomness-deficient” strings
– K(x) between n – 10 and n + O(1): “random-looking” strings
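The 99% figure follows from counting descriptions. The sketch below assumes that descriptions are binary strings and that x ranges over binary strings of length n; the function name is illustrative.

```python
from fractions import Fraction


def fraction_with_short_description(n, c=10):
    """Upper bound on the fraction of n-bit strings x with K(x) < n - c."""
    # There are 2**0 + 2**1 + ... + 2**(n-c-1) = 2**(n-c) - 1 binary
    # descriptions shorter than n - c bits, and each description
    # produces at most one string.
    short_descriptions = 2 ** (n - c) - 1
    return Fraction(short_descriptions, 2 ** n)


bound = fraction_with_short_description(100)
print(float(bound))
print(bound < Fraction(1, 100))
```

The bound is below 2^-10 for every n, so fewer than 1% of strings can have K(x) < n – 10.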

Page 34: MELJUN CORTES Automata Theory (Automata19)

Evaluating randomness

• How do we know if the casino is crooked?

• Idea: Compute K(sequence).

If K(sequence) is much less than n, this indicates the sequence is not random

[Figure: table of roulette outcomes]
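K itself cannot be computed, but a general-purpose compressor gives a crude, computable upper-bound stand-in. The sketch below uses Python's zlib for this purpose; the choice of compressor, the sequence length and the seed are assumptions made for illustration, and the compressed size is not K.

```python
import random
import zlib


def compressed_size(outcomes):
    """Bytes zlib needs for an E/O string: a rough proxy, not K itself."""
    return len(zlib.compress(outcomes.encode("ascii"), 9))


n = 10_000
periodic = "OE" * (n // 2)

rng = random.Random(0)  # fixed seed so the run is repeatable
random_looking = "".join(rng.choice("OE") for _ in range(n))

print(compressed_size(periodic))
print(compressed_size(random_looking))
```

A small compressed size shows the sequence has a short description; a large one proves nothing, since a cleverer description might still exist.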

Page 35: MELJUN CORTES Automata Theory (Automata19)

Computing descriptive complexity

• Theorem: It is not possible to compute K(x).

• Proof: Suppose it is, fix n and consider this TM M:

Output the first x of length n (in lexicographic order) such that K(x) ≥ n – 10

Let x = output of M, then K(x) ≥ n – 10 but K(x) ≤ |〈M〉| = log2 n + O(1)

So (when n is large) log2 n + O(1) < n – 10 and we get K(x) < K(x), impossible!