Context Free Languages - University of Georgia

Context Free Languages

CSCI 2670

Computer Science Department

Fall 2014

CSCI 2670 Context Free Languages

Outline

I Context Free Grammars (CFGs)

I Ambiguity

I Chomsky Normal Form

I Pushdown Automata (PDAs)

I Equivalence between CFGs and PDAs

I The Pumping Lemma


Formal Grammars

I We’ve seen that regular languages can be recognized by finite automata. Other types of automata recognize other classes of languages.

I Formal languages can also be described by formal grammars.

I Intuitively, a grammar is a set of variables and terminal symbols, together with a set of rules that state when a variable can be replaced with another string.

Example

A → 0A1
A → B
B → #

I A, B are variables (A is the “start”), and 0, 1, # are terminals. The grammar generates the language {0^n#1^n | n ≥ 0}.

I Starting with A, a string of the language is obtained by a sequence of rule applications, each replacing a variable that appears on the left side of a rule with the string on that rule’s right side.

I The language above is not regular. It is context free, however.


Formal Grammar

Definition

A formal grammar G = (V ,Σ,R,S) is a 4-tuple consisting of

1. a finite set V of nonterminal symbols or variables;

2. a finite set Σ of terminal symbols;

3. a finite set R of substitution (production) rules u → v, where u and v are both strings from (V ∪ Σ)∗, and u contains a variable;

4. a designated start symbol S ∈ V ;

Sets V , Σ, and R are disjoint.

I Placing restrictions on the form of rules determines whether the language of the grammar falls into a particular class (e.g., regular).


Context Free Grammars and Languages

Definition

I A formal grammar G = (V, Σ, R, S) is context free if and only if for each rule u → v in R, u is a single variable.

I If u, v and w are strings from (V ∪ Σ)∗ and A → w is a rule of R, then uAv yields uwv, written uAv ⇒ uwv.

I u derives v, written u ⇒∗ v, if and only if u = v or there is a sequence

u ⇒ u_1 ⇒ u_2 ⇒ ... ⇒ u_k ⇒ v, with k ≥ 0.

I L(G) = {w ∈ Σ∗ | S ⇒∗ w} is the language of the grammar.

Example

A → 0A1
A → B
B → #

I A ⇒ 0A1 ⇒ 00A11 ⇒ 000A111 ⇒ 000B111 ⇒ 000#111

I L(G) = {0^n#1^n | n ≥ 0}
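The derivation above can be replayed mechanically. The following is a minimal sketch, assuming the grammar is stored as a Python dictionary and that variables are the dictionary keys; the function name `generate` and the length bound are illustrative choices, not part of the slides. It explores sentential forms breadth first and collects every terminal string up to a given length.

```python
from collections import deque

# Rules of the example grammar; the keys are the variables.
RULES = {"A": ["0A1", "B"], "B": ["#"]}

def generate(start, max_len):
    """Return all terminal strings of length <= max_len derivable from start."""
    found, seen = set(), {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        # Locate the leftmost variable of the sentential form, if any.
        pos = next((i for i, c in enumerate(form) if c in RULES), None)
        if pos is None:
            found.add(form)          # no variables left: a string of L(G)
            continue
        for rhs in RULES[form[pos]]:
            new = form[:pos] + rhs + form[pos + 1:]
            # No rule of this grammar shortens a form, so pruning by length is safe.
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(found, key=len)

print(generate("A", 7))
```

The shortest string found is `#` (from A ⇒ B ⇒ #), which corresponds to n = 0.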


Formal Grammars

I Multiple rules for the same variable are often written using “|”.

Example

A → 0A1
A → B
B → #

is equivalent to

A → 0A1 | B
B → #


Example

I Here, variables are indicated with ‘(’ and ‘)’.

Example

(SENTENCE) → (NOUN-PHRASE) (VERB-PHRASE)
(NOUN-PHRASE) → (CMPLX-NOUN) | (CMPLX-NOUN) (PREP-PHRASE)
(VERB-PHRASE) → (CMPLX-VERB) | (CMPLX-VERB) (PREP-PHRASE)
(PREP-PHRASE) → (PREP) (CMPLX-NOUN)
(CMPLX-NOUN) → (ARTICLE) (NOUN)
(CMPLX-VERB) → (VERB) | (VERB) (NOUN-PHRASE)
(ARTICLE) → a | the
(NOUN) → boy | girl | flower
(VERB) → touches | likes | sees
(PREP) → with

The above grammar generates strings such as

a girl with a flower likes the boy


Combining Grammars

I Grammars can be combined to form new grammars.

I This fact can be used to design grammars for particular languages.

Example

I Let G1 be S_1 → 0 S_1 1 | ε. L(G1) = {0^n 1^n | n ≥ 0}.
I Let G2 be S_2 → 1 S_2 0 | ε. L(G2) = {1^n 0^n | n ≥ 0}.
I Let G be

S → S_1 | S_2
S_1 → 0 S_1 1 | ε
S_2 → 1 S_2 0 | ε

I L(G) = {1^n 0^n | n ≥ 0} ∪ {0^n 1^n | n ≥ 0}.


CFGs for Regular Languages

I Context free grammars can be easily constructed for regular languages.

I Let D = (Q,Σ, δ, q0,F ) be a DFA.

I Construct a CFG G = (V ,Σ,R,S) as follows:

1. V = {R_i | q_i ∈ Q} (make a variable for each state of Q).
2. S = R_0 (S corresponds to the start state q_0).
3. For each transition δ(q_i, a) = q_j, add rule R_i → a R_j to R.
4. For each q_i ∈ F, add rule R_i → ε to R.
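The four construction steps can be sketched in a few lines. This is only an illustration under stated assumptions: the DFA is given as plain Python sets and a dictionary, states are named `q1`, `q2`, ... so that the variable for `qi` can be called `Ri`, and the sample DFA (a hypothetical two-state machine accepting strings that end in 1) is chosen for the demonstration.

```python
def dfa_to_grammar(states, delta, start, accept):
    """Build rules R_i -> a R_j and R_i -> ε from a DFA.

    delta maps (state, symbol) to the next state.
    """
    # Step 1: one variable per state (the naming q1 -> R1 is an assumption).
    var = {q: "R" + q[1:] for q in states}
    rules = {var[q]: [] for q in states}
    # Step 3: each transition δ(q_i, a) = q_j gives the rule R_i -> a R_j.
    for (q, a), target in delta.items():
        rules[var[q]].append(a + var[target])
    # Step 4: each accept state q_i gives the rule R_i -> ε.
    for q in accept:
        rules[var[q]].append("ε")
    # Step 2: the start variable corresponds to the start state.
    return rules, var[start]

# Sample DFA: accepts binary strings that end in 1.
delta = {("q1", "0"): "q1", ("q1", "1"): "q2",
         ("q2", "0"): "q1", ("q2", "1"): "q2"}
rules, start = dfa_to_grammar(["q1", "q2"], delta, "q1", ["q2"])
for head, bodies in rules.items():
    print(head, "→", " | ".join(bodies))
```

Every rule produced this way has the shape A → aB or A → ε.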

I In general, a grammar is regular if all rules are of the form:

    I A → aB
    I A → a
    I A → ε

where A,B ∈ V and a ∈ Σ.

I A language is regular if some regular grammar generates it.


In-class Questions

Give the regular grammar for the language which can be accepted by the following DFA M.

The above DFA M can be formally defined as follows:

I Q = {q1, q2}.
I Σ = {0, 1}.
I δ is given by the table:

δ     0     1
q1    q1    q2
q2    q1    q2

I q0 = q1.

I F = {q2}.

L(M) = {w | w ends in a 1}.

In-class Questions

Give the regular grammar for the language which can be accepted by the following DFA M.

The regular grammar G can be formally defined as follows:

I V = {R1, R2}.
I Σ = {0, 1}.
I the set of rules R:

R1 → 0R1 | 1R2
R2 → 0R1 | 1R2 | ε

I S = R1.


Parse Trees

I The derivation of a string using a grammar can also be represented using a parse tree.


Ambiguity

I In a given CFG, it’s possible for there to be multiple parse trees for the same string w. Such a grammar is said to be ambiguous.

Definition

I A derivation of a string w in a grammar is a leftmost derivation if, at each step of the derivation, it is the leftmost variable of the string that is replaced.

I A string w is derived ambiguously in CFG G if it has two or more leftmost derivations.

I A CFG G is ambiguous if it generates some string ambiguously.

Some languages are inherently ambiguous: they can only be generated by ambiguous grammars.


Ambiguity

Example

This grammar is ambiguous (but the language generated by it is not inherently ambiguous).

<EXPR> → <EXPR> + <EXPR> | <EXPR> × <EXPR> | ( <EXPR> ) | a
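Since each parse tree corresponds to exactly one leftmost derivation, ambiguity of a particular string can be checked by counting its parse trees. The sketch below is an illustration only: it hard-codes the four rules of the expression grammar, and the function name `count_parse_trees` is hypothetical. A count of 2 or more means the string is derived ambiguously.

```python
from functools import lru_cache

def count_parse_trees(s):
    """Count parse trees of s under E -> E+E | E×E | (E) | a."""
    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of parse trees deriving the substring s[i:j] from E.
        if j - i < 1:
            return 0
        if j - i == 1:
            return 1 if s[i] == "a" else 0      # rule E -> a
        total = 0
        if s[i] == "(" and s[j - 1] == ")":     # rule E -> (E)
            total += count(i + 1, j - 1)
        for k in range(i + 1, j - 1):           # rules E -> E+E and E -> E×E
            if s[k] in "+×":
                total += count(i, k) * count(k + 1, j)
        return total
    return count(0, len(s))

for expr in ["a", "a+a×a", "(a+a)×a"]:
    print(expr, count_parse_trees(expr))
```

The string a+a×a has two parse trees (one grouping the sum first, one grouping the product first), while the parenthesized string (a+a)×a has only one.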


Chomsky Normal Form

Definition

A CFG G is in Chomsky Normal Form (CNF) if and only if every rule of the grammar is of the form:

A → BC
A → a
S → ε

where S is the start variable, A is any variable, B and C are variables other than S, and a is any terminal.

Theorem

For each CFL L, there exists a grammar G in CNF such that L(G ) = L.


Chomsky Normal Form

Theorem

For each CFL L, there exists a grammar G in CNF such that L(G ) = L.

Proof.

1. Add a new start symbol and rule S0 → S , where S is the old start.

2. For each rule A → ε (A ≠ S0), delete it and, for each rule B → w in which A appears in w, add rules B → w′, where w′ is w with one or more occurrences of A removed. If w′ = ε and B → ε was previously deleted, don’t re-add it. Repeat until all ε-rules are deleted.

3. Remove each unit rule A → B: whenever a rule B → u exists, add rule A → u unless it is a unit rule that was previously removed.

4. For each rule of the form A → u_1 u_2 ... u_n, where each u_i ∈ V ∪ Σ and n ≥ 3, replace the rule with rules A → u_1 A_1, A_1 → u_2 A_2, ..., A_{n−2} → u_{n−1} u_n, where each A_i is a new variable.

5. In each rule with two symbols on the RHS, replace every terminal u with a new variable U and add the rule U → u. For example, A → uB becomes A → UB and U → u.


Chomsky Normal Form

Convert the following to CNF:

S → ASA | aB
A → B | S
B → b | ε

Step 1: Add a new start symbol and rule.

S0 → S
S → ASA | aB
A → B | S
B → b | ε

Step 2: Process ε-rules (remove B → ε).

S0 → S
S → ASA | aB | a
A → B | S | ε
B → b


Chomsky Normal Form

Step 2: Process ε-rules (remove A → ε).

S0 → S
S → ASA | aB | a | AS | SA | S
A → B | S
B → b

Step 3: Process unit rules (S → S). Here, no new rules need to be added.

S0 → S
S → ASA | aB | a | AS | SA
A → B | S
B → b


Chomsky Normal Form

Step 3: Process unit rules (S0 → S).

S0 → ASA | aB | a | AS | SA
S → ASA | aB | a | AS | SA
A → B | S
B → b

Step 3: Process unit rules (A → S).

S0 → ASA | aB | a | AS | SA
S → ASA | aB | a | AS | SA
A → B | ASA | aB | a | AS | SA
B → b


Chomsky Normal Form

Step 3: Process unit rules (A → B).

S0 → ASA | aB | a | AS | SA
S → ASA | aB | a | AS | SA
A → b | ASA | aB | a | AS | SA
B → b

Step 4: Process remaining rules (split ASA using the new variable A1).

S0 → AA1 | aB | a | AS | SA
S → AA1 | aB | a | AS | SA
A → b | AA1 | aB | a | AS | SA
B → b
A1 → SA


Chomsky Normal Form

Step 5: Process remaining rules (replace the terminal a in aB with the new variable U).

S0 → AA1 | UB | a | AS | SA
S → AA1 | UB | a | AS | SA
A → b | AA1 | UB | a | AS | SA
B → b
A1 → SA
U → a

At this point, the grammar is in Chomsky Normal Form.
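The claim can be checked mechanically against the three rule shapes of the definition. The following is a small sketch, not part of the course material: the text format accepted by `parse` (symbols separated by spaces, alternatives by `|`) and both function names are assumptions made for the illustration.

```python
def parse(text):
    """Parse lines such as 'A -> b | A A1' into {head: [tuple_of_symbols, ...]}."""
    rules = {}
    for line in text.strip().splitlines():
        head, bodies = line.split("->")
        rules[head.strip()] = [
            tuple(sym for sym in body.split() if sym != "ε")  # ε becomes ()
            for body in bodies.split("|")
        ]
    return rules

def is_cnf(rules, start):
    """Return True if every rule is A -> BC, A -> a, or start -> ε."""
    variables = set(rules)
    for head, bodies in rules.items():
        for body in bodies:
            if len(body) == 0:        # ε-rule: allowed only for the start variable
                ok = head == start
            elif len(body) == 1:      # must be a single terminal
                ok = body[0] not in variables
            elif len(body) == 2:      # two variables, neither the start variable
                ok = all(sym in variables and sym != start for sym in body)
            else:
                ok = False
            if not ok:
                return False
    return True

FINAL = parse("""
S0 -> A A1 | U B | a | A S | S A
S  -> A A1 | U B | a | A S | S A
A  -> b | A A1 | U B | a | A S | S A
B  -> b
A1 -> S A
U  -> a
""")
print(is_cnf(FINAL, "S0"))
```

Running the same check on the original grammar (S → ASA | aB, A → B | S, B → b | ε) fails immediately, because ASA has three symbols on the right-hand side.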


Stack


Conceptual View of a Stack


Stack: Push Operation


Stack: Pop Operation


Pushdown Automata

I Context free languages are recognized by pushdown automata (PDAs).

I A PDA has a memory in the form of a stack.

I At each step of computation,

    I the PDA occupies some state q,
    I a symbol a can be read from the input string (optionally, nothing need be read), and
    I a symbol b can be popped from the top of the stack (optionally, nothing need be popped).

I Based on the state, input symbol, and stack symbol,

    I the PDA optionally pushes a symbol onto the stack, and
    I changes state.

I A PDA accepts a string w if it occupies an accept state after reading the last symbol of the input.

I Note that PDAs can be nondeterministic.

I Nondeterministic PDAs are more powerful than deterministic PDAs.


Pushdown Automata

Definition

A pushdown automaton P is a 6-tuple (Q, Σ, Γ, δ, q0, F), where

1. Q is a finite set of states,

2. Σ is a finite input alphabet,

3. Γ is a finite stack alphabet,

4. δ : Q × Σ_ε × Γ_ε → P(Q × Γ_ε) is the transition function, where Σ_ε = Σ ∪ {ε} and Γ_ε = Γ ∪ {ε},

5. q0 is the start state,

6. F ⊆ Q is a set of accept states.

I δ maps triples (q, a, b), where q is a state, a an input symbol, and b a stack symbol, to sets of pairs (q′, b′), where q′ is a state and b′ a stack symbol. a, b, and b′ can also be ε.

I δ is nondeterministic. The PDA needn’t read an input or stack symbol, and it needn’t write a stack symbol. In general, there might be multiple transitions for a given state, input symbol, and stack symbol.


Transition Tables for PDAs

Example

Let M1 = (Q,Σ, Γ, δ, q1,F ), where

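1. Q = {q1, q2, q3, q4},
2. Σ = {0, 1},
3. Γ = {0, $},
4. F = {q1, q4}, and

5. δ is given by the table below, which lists only the nonempty entries (every other combination of state, input, and stack symbol maps to ∅).

State   Input   Stack   δ
q1      ε       ε       {(q2, $)}
q2      0       ε       {(q2, 0)}
q2      1       0       {(q3, ε)}
q3      1       0       {(q3, ε)}
q3      ε       $       {(q4, ε)}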

I The symbol $ is, by convention, used to mark an empty stack.

I As shown in the table, when started, the PDA immediately pushes it onto the stack.


State Diagrams for PDAs

Example

The PDA M1 recognizes the language {0^n 1^n | n ≥ 0}. The PDA may be represented using the following state diagram.

A label a, b → c means that a is read from the input, b is popped from the stack, and c is pushed onto the stack.


Acceptance for PDAs

Definition

Let P = (Q, Σ, Γ, δ, q0, F) be a PDA and w = w_1 w_2 ... w_m a string such that each w_i ∈ Σ_ε.

P accepts w if and only if there exists

I a sequence r_0, r_1, ..., r_m of states of Q, and

I a sequence of strings s_0, s_1, ..., s_m, each s_i ∈ Γ∗ (these represent the stack contents at a given step), such that

1. r_0 = q0 and s_0 = ε (initially, the stack is empty),

2. for each 0 ≤ i < m, (r_{i+1}, b) ∈ δ(r_i, w_{i+1}, a), where s_i = at and s_{i+1} = bt for some a, b ∈ Γ_ε and t ∈ Γ∗, and

3. r_m ∈ F.
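The acceptance condition can be tested by searching over configurations (state, input position, stack contents). The sketch below does this for the PDA M1 defined earlier (accept states q1 and q4). It is an illustration under assumptions: ε is represented by the empty string, the stack is a string whose first character is the top, and the search terminates here only because M1 has no cycle of ε-moves that keeps pushing symbols.

```python
from collections import deque

EPS = ""  # stands for ε

# δ of M1: (state, input symbol or ε, popped symbol or ε) -> {(state, pushed symbol or ε)}
DELTA = {
    ("q1", EPS, EPS): {("q2", "$")},
    ("q2", "0", EPS): {("q2", "0")},
    ("q2", "1", "0"): {("q3", EPS)},
    ("q3", "1", "0"): {("q3", EPS)},
    ("q3", EPS, "$"): {("q4", EPS)},
}
START, ACCEPT = "q1", {"q1", "q4"}

def accepts(w):
    """Breadth-first search over configurations (state, position, stack)."""
    first = (START, 0, "")
    seen, queue = {first}, deque([first])
    while queue:
        state, pos, stack = queue.popleft()      # stack[0] is the top symbol
        if pos == len(w) and state in ACCEPT:
            return True
        for (q, a, b), targets in DELTA.items():
            if q != state:
                continue
            if a != EPS and (pos >= len(w) or w[pos] != a):
                continue                         # input symbol does not match
            if b != EPS and not stack.startswith(b):
                continue                         # top of stack does not match
            rest = stack[len(b):]                # stack after popping b
            for nxt_state, push in targets:
                nxt = (nxt_state, pos + len(a), push + rest)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return False

for w in ["", "01", "0011", "001", "10"]:
    print(repr(w), accepts(w))
```

The empty string is accepted because the start state q1 is itself an accept state; every other accepted string passes through q2 and q3 and ends in q4 after the $ marker is popped.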


Example PDAs

Example

The PDA M2 below recognizes the language {a^i b^j c^k | i, j, k ≥ 0 and (i = j or i = k)}.


Example PDAs

Example

The PDA M3 below recognizes the language {w w^R | w ∈ {0, 1}∗}.


Pushdown Automata

Example

Give an informal description of a PDA recognizing the language below.

1. {a^i b^j c^k | i = j or j = k, where i, j, k ≥ 0}


Equivalence between CFGs and PDAs

Theorem

A language is context free if and only if it is recognized by some pushdown automaton.

This can be broken down into two lemmas (which will be proven).

Lemma

If a language is context free, then it is recognized by a pushdown automaton.

Lemma

If a language is recognized by a pushdown automaton, then it is context free.



Lemma

If a language is context free, then it is recognized by a pushdown automaton.

We construct a PDA to recognize a context free language L generated by a grammar G = (V, Σ, R, S). We modify the definition of the PDA slightly to allow whole strings to be pushed onto the stack in a single step (this causes no harm).

I The machine has only three states: q_start, q_loop, and q_accept.

I The input alphabet of the PDA is Σ, the terminal alphabet of the grammar.

I The stack alphabet is {$} ∪ V ∪ Σ.

I The transition function of the PDA is as follows:

    I δ(q_start, ε, ε) = {(q_loop, S$)}, where S is the start variable of the grammar.
    I If A → w is a rule of G, then (q_loop, w) ∈ δ(q_loop, ε, A).
    I If a ∈ Σ, then (q_loop, ε) ∈ δ(q_loop, a, a).
    I (q_accept, ε) ∈ δ(q_loop, ε, $).



I The conversion produces a PDA of the form below.

I It accepts w if it enters q_accept (and there is no more input).

I (Undefined transitions go to a trap state).



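I Pushing a compound string onto the stack is implemented with a chain of new intermediate states that push the string one symbol at a time, last symbol first.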



I Construct a PDA for the following grammar.

S → aTb | b
T → Ta | ε





I Construct a PDA for the following grammar.

S → SS | (S) | ()

Try processing (()()).
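The behavior of the constructed PDA on this grammar can be explored with a short simulation. The sketch below is an assumption-laden illustration rather than the construction itself: it tracks only the input position and the stack while the machine sits in q_loop, represents the stack as a string with the top first, and prunes configurations whose stack (without $) is longer than the remaining input. That pruning is valid for this particular grammar because every stack symbol other than $ must still produce at least one input symbol; it would not be valid for a grammar with ε-rules.

```python
from collections import deque

RULES = {"S": ["SS", "(S)", "()"]}   # grammar S -> SS | (S) | ()

def pda_accepts(w, rules=RULES, start="S"):
    """Simulate the q_start / q_loop / q_accept PDA built from the grammar."""
    first = (0, start + "$")                 # q_start has pushed S$ and moved to q_loop
    seen, queue = {first}, deque([first])
    while queue:
        pos, stack = queue.popleft()
        top, rest = stack[0], stack[1:]
        if top == "$":                       # pop $ and enter q_accept
            if pos == len(w):
                return True                  # accepted: no input remains
            continue
        if top in rules:                     # variable on top: replace it by a rule body
            successors = [(pos, rhs + rest) for rhs in rules[top]]
        elif pos < len(w) and w[pos] == top: # terminal on top: match it against the input
            successors = [(pos + 1, rest)]
        else:
            successors = []                  # mismatch: this branch dies
        for cfg in successors:
            # Pruning: symbols above $ may not outnumber the remaining input.
            if len(cfg[1]) - 1 <= len(w) - cfg[0] and cfg not in seen:
                seen.add(cfg)
                queue.append(cfg)
    return False

print(pda_accepts("(()())"))
```

On input (()()) one accepting branch expands S to (S), then the inner S to SS, and each remaining S to (), matching one terminal against the input after each expansion.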



Lemma

If a language is recognized by a pushdown automaton, then it is context free.

For this direction, we construct a grammar to generate the language recognized by the PDA P. P must have a certain form.

I It has a single accept state, q_accept. An existing PDA can be modified by adding a new accept state and connecting the old accept states to it via ε-edges.

I It can only accept a string if its stack is empty. This can be accomplished by adding loops ε, a → ε to the accept state for each a ∈ Γ.

I Each step pops or pushes a stack symbol but never both. This can be accomplished by adding new states and modifying transitions:

1. If a transition pops and pushes, split the transition into separate pop and push transitions going through a new state.

2. If a transition neither pops nor pushes, add transitions to push and then pop an arbitrary symbol, going through a new state.



Given PDA P = (Q, Σ, Γ, δ, q0, {q_accept}), we construct the grammar G = (V, Σ, R, S) as follows:

I V = {A_pq | p ∈ Q and q ∈ Q} (we add a variable for each pair of states).

I S = A_{q0,q_accept}. R is defined as follows:

    I For each p ∈ Q, add rule A_pp → ε.
    I For each p, q, r ∈ Q, add rule A_pq → A_pr A_rq.
    I For each p, q, r, s ∈ Q, t ∈ Γ, and a, b ∈ Σ_ε, if (r, t) ∈ δ(p, a, ε) and (q, ε) ∈ δ(s, b, t), then add rule A_pq → a A_rs b.

I In the grammar, if one starts from variable A_pq and applies the rules, exactly the strings that take the PDA P from state p (with an empty stack) to state q (with an empty stack) will be generated.

I A_{q0,q_accept} generates all strings that take P from q0 to q_accept (beginning and ending with an empty stack). I.e., A_{q0,q_accept} generates L(P).

I A computation that begins and ends with an empty stack never looks below its starting point, so the same string also takes P from p to q starting with any stack contents and ending with those contents unchanged.
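The three kinds of rules can be generated directly from a transition table. The sketch below is illustrative only: a variable A_pq is represented by the pair `(p, q)`, ε by the empty string, and the sample PDA is M1 restricted to the single accept state q4 (an assumption made so that the PDA has the required form; with that restriction it recognizes {0^n 1^n | n ≥ 1}).

```python
from itertools import product

def pda_to_cfg(states, delta, start, accept):
    """Generate the rules of G as (head, body) pairs; A_pq is the pair (p, q).

    delta maps (state, input or "", popped or "") to a set of (state, pushed or "").
    The PDA is assumed to push or pop exactly one symbol on every move.
    """
    rules = set()
    for p in states:
        rules.add(((p, p), ()))                          # A_pp -> ε
    for p, q, r in product(states, repeat=3):
        rules.add(((p, q), ((p, r), (r, q))))            # A_pq -> A_pr A_rq
    # Push moves (p, a, r, t): (r, t) ∈ δ(p, a, ε).
    pushes = [(p, a, r, t)
              for (p, a, pop), outs in delta.items() if pop == ""
              for (r, t) in outs if t != ""]
    # Pop moves (s, b, t, q): (q, ε) ∈ δ(s, b, t).
    pops = [(s, b, t, q)
            for (s, b, t), outs in delta.items() if t != ""
            for (q, pushed) in outs if pushed == ""]
    for (p, a, r, t), (s, b, t2, q) in product(pushes, pops):
        if t == t2:                                      # same stack symbol pushed and popped
            rules.add(((p, q), (a, (r, s), b)))          # A_pq -> a A_rs b
    return rules, (start, accept)

DELTA = {
    ("q1", "", ""): {("q2", "$")},
    ("q2", "0", ""): {("q2", "0")},
    ("q2", "1", "0"): {("q3", "")},
    ("q3", "1", "0"): {("q3", "")},
    ("q3", "", "$"): {("q4", "")},
}
rules, start = pda_to_cfg(["q1", "q2", "q3", "q4"], DELTA, "q1", "q4")
# Show only the rules of the third kind (body of the form a A_rs b).
for head, body in sorted(r for r in rules if len(r[1]) == 3):
    print(head, "->", body)
```

For this PDA the third kind of rule yields A_q1q4 → A_q2q3, A_q2q3 → 0 A_q2q2 1, and A_q2q3 → 0 A_q2q3 1, which together with A_q2q2 → ε generate the strings 0^n 1^n with n ≥ 1.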



Proposition

If A_pq generates x, then x can bring P from p with empty stack to q with empty stack.

The proof is by induction on the length of the derivation of x from variable A_pq.

Basis: If the derivation takes only one step, then the rule used must be A_pp → ε, since all other rules contain variables on the RHS. And so x = ε and q = p. It is clearly possible to go from p to p reading ε (by staying put).



Induction: Suppose the claim holds for derivations of length k or less, k ≥ 1, and suppose the derivation of x from A_pq takes k + 1 steps. Since the derivation is of length greater than 1, the first rule used must have either the form A_pq → a A_rs b or the form A_pq → A_pr A_rq. We consider each in turn.

I If the rule has the form A_pq → a A_rs b, then x = awb and A_rs ⇒∗ w. By the construction of G, (r, t) ∈ δ(p, a, ε) and (q, ε) ∈ δ(s, b, t), for some a, b ∈ Σ_ε and t ∈ Γ. That is, one may go from p to r reading input a and pushing t onto the stack. By the inductive hypothesis, w can bring P from state r to state s, leaving the stack in the same state as when it started (here, with t as the top symbol). Given that (q, ε) ∈ δ(s, b, t), x thus can bring P from p with empty stack to q with empty stack.

I If the rule has the form A_pq → A_pr A_rq, then x can be split into two parts y and z such that A_pr ⇒∗ y and A_rq ⇒∗ z. By the inductive hypothesis, y can bring P from p with empty stack to r with empty stack, and z can bring P from r with empty stack to q with empty stack. And so x = yz can bring P from p with empty stack to q with empty stack.

One of these cases must hold, and so the claim holds.



Proposition

If x can bring P from p with empty stack to q with empty stack, then A_pq generates x.

The proof is by induction on the number of steps in the computation that brings P from p with empty stack to q with empty stack on input x.

Basis: Suppose the computation has 0 steps. Then p = q and x must be ε. Since A_pp → ε is a rule of G, A_pq generates x.



Induction: Suppose the claim holds for all computations taking k or fewer steps, k ≥ 0, and suppose a computation of P on input x takes k + 1 steps.

I Suppose the stack is empty only at the beginning and end of the computation. Then an input symbol a ∈ Σ_ε is read and a symbol t is pushed onto the stack on the first move, and P enters state r. t must be popped from the stack in the last move. Let b ∈ Σ_ε be the symbol read in the last move and s the state of P before the last move. Then (r, t) ∈ δ(p, a, ε) and (q, ε) ∈ δ(s, b, t). As such, A_pq → a A_rs b is a rule of G.

Let x = ayb. y can bring P from state r to state s never popping t from the stack, and so y can bring P from r with empty stack to s with empty stack. This computation takes k − 1 steps, so by the ind. hyp., A_rs generates y. From this, A_pq generates x.

I If the stack is empty at some intermediary step with corresponding state r, then the computations from p to r and from r to q both have length less than k + 1. Let y be the input read going from state p to r in the first computation, and let z be the input read during the second computation, so that x = yz. By the ind. hyp., A_pr generates y and A_rq generates z. As A_pq → A_pr A_rq is a rule of G, it follows that A_pq generates x.


Every regular language is context free

I It has been proved that pushdown automata recognize the class of context-free languages.

I Every regular language is recognized by a finite automaton.

I Every finite automaton is automatically a pushdown automaton that simply ignores its stack.

I Hence, every regular language is a context-free language.

{ regular languages } ⊂ { context free languages }


A Pumping Lemma for CFLs

I Like regular languages, CFLs have strings that can be pumped.

I For large strings s, the parse tree for s has a branch with a repeated variable R.

I A later occurrence of R can be replaced with the subtree under an earlier occurrence.

I Similarly, an earlier occurrence can be replaced with the subtree under the last occurrence. As seen below, s = uvxyz, and v and y can be pumped.



Theorem

If A is a context free language, then there exists an integer p (the pumping length) such that every s ∈ A with |s| ≥ p may be divided into five pieces s = uvxyz such that

1. u v^i x y^i z ∈ A for each i ≥ 0,

2. |vy| > 0, and

3. |vxy| ≤ p.

I Observe that v or y can be empty, but not both. (Without this condition, the theorem would be trivial.)
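The three conditions can be checked concretely on a context free language. The sketch below uses {0^n 1^n | n ≥ 0} with a hand-picked split; the value p = 4 is a hypothetical pumping length chosen only for the demonstration, and the helper names `in_language` and `pump` are assumptions.

```python
def in_language(s):
    """Membership test for {0^n 1^n | n >= 0}."""
    n = len(s) // 2
    return s == "0" * n + "1" * n

def pump(u, v, x, y, z, i):
    """Return the pumped string u v^i x y^i z."""
    return u + v * i + x + y * i + z

p = 4                                  # hypothetical pumping length for this demo
s = "0" * p + "1" * p
# Split around the boundary between the 0s and the 1s.
u, v, x, y, z = "0" * (p - 1), "0", "", "1", "1" * (p - 1)

assert u + v + x + y + z == s          # the five pieces reassemble s
assert len(v + y) > 0                  # condition 2
assert len(v + x + y) <= p             # condition 3
results = [in_language(pump(u, v, x, y, z, i)) for i in range(6)]
print(results)                         # condition 1, checked for i = 0..5
```

Pumping v and y together keeps the number of 0s and 1s equal, which is why every pumped string stays in the language.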



Let G be a CFG and let b be the maximum number of symbols on the RHS of any rule in G. It is clear that the branching factor of any parse tree generated by G will be at most b, and so if a parse tree for a string s has height h, then s has at most b^h symbols. Conversely, if |s| ≥ b^h + 1, it must be that any parse tree for s has height at least h + 1.

(Proof of the Pumping Lemma)

I Let A be a CFL with grammar G, and let p = b^(|V|+1), where b is the maximum number of symbols on the RHS of any rule of G.

I Let s be a string generated by G such that |s| ≥ p.

I Let τ be a minimal parse tree for s (minimal in the sense that no other parse tree for s has fewer nodes than τ). Assume the root of τ is T.

I Since |s| ≥ b^(|V|+1) ≥ b^|V| + 1 (taking b ≥ 2), it follows that τ has height ≥ |V| + 1, and so there exists a branch B of length (in edges) ≥ |V| + 1. Such a branch contains at least |V| + 1 variables, so some variable R must appear at least twice on the branch.




I We may choose any branch with length ≥ |V| + 1, and so we take B to be the longest branch of τ. We choose R to be a variable that occurs more than once among the last |V| + 1 variables on B, and we let R1 and R2 be the last two occurrences of R on the branch.

I It is clear that s can be divided into uvxyz, where vxy consists of the leaves of the tree rooted at R1, x consists of the leaves of the tree rooted at R2, and uvxyz consists of the leaves of the tree rooted at T.

I Since R1 and R2 are occurrences of R, it is possible to replace the subtree rooted at R2 with the one rooted at R1, yielding a larger tree. This process can be repeated indefinitely, giving parse trees for u v^i x y^i z for each i > 1. And so the resulting strings are all in A.

I Similarly, it is possible to replace the subtree rooted at R1 with the one rooted at R2, yielding the tree τ′ and corresponding string u v^0 x y^0 z = uxz. If both v and y were empty, then uxz would equal uvxyz. Since τ′ has fewer nodes than τ, this would mean that τ is not a minimal tree for s. But we have assumed τ to be minimal, and so one of v or y must be nonempty.




I Since R1 and R2 appear among the last |V| + 1 variables on B, it must be that the subtree rooted at R1 has height at most |V| + 1. But a tree of that height can generate strings of length at most b^(|V|+1), where b is the maximum number of symbols on the RHS of any rule of G.

I And so |vxy| ≤ b^(|V|+1). However, p = b^(|V|+1), and so |vxy| ≤ p.

□



Example

B = {a^n b^n c^n | n ≥ 0} is not context free.

Proof.

Suppose that B is context free, and let s = a^p b^p c^p, where p is the pumping length of B.

By the pumping lemma, s = uvxyz such that |vy| > 0 and |vxy| ≤ p. Given that |vxy| ≤ p, v and y together can contain at most two types of symbols (e.g., a and b); neither will contain the third (e.g., c). Thus, pumping up v and y either places symbols out of order or increases the number of at most two of the symbols while leaving the third unchanged. Either way the three counts are no longer equal, which implies that u v^2 x y^2 z ∉ B, contradicting the pumping lemma.
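For a small value of p the argument can be confirmed by brute force: no split of a^p b^p c^p that satisfies |vy| > 0 and |vxy| ≤ p survives pumping. The sketch below is an illustration, not a proof; p = 4 is an assumed value (the true pumping length of a hypothetical grammar is unknown), only the exponents 0 and 2 are tried, and the function names are hypothetical.

```python
def in_B(s):
    """Membership test for {a^n b^n c^n | n >= 0}."""
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n

def pumpable_split(s, p, member):
    """Return a split (u, v, x, y, z) with |vy| > 0 and |vxy| <= p that stays
    in the language for the exponents 0 and 2, or None if there is none."""
    n = len(s)
    for i in range(n + 1):                 # u = s[:i]
        end = min(i + p, n)                # vxy must end within p symbols of i
        for j in range(i, end + 1):        # v = s[i:j]
            for k in range(j, end + 1):    # x = s[j:k]
                for e in range(k, end + 1):    # y = s[k:e], z = s[e:]
                    u, v, x, y, z = s[:i], s[i:j], s[j:k], s[k:e], s[e:]
                    if len(v + y) == 0:
                        continue
                    if all(member(u + v * m + x + y * m + z) for m in (0, 2)):
                        return (u, v, x, y, z)
    return None

p = 4
print(pumpable_split("a" * p + "b" * p + "c" * p, p, in_B))
```

For contrast, running the same search on 0^p 1^p with a membership test for {0^n 1^n} does find a split, as expected for a context free language.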



Example

B = {a^i b^j c^k | 0 ≤ i ≤ j ≤ k} is not context free.

Proof.

Suppose that B is context free, and let s = a^p b^p c^p, where p is the pumping length of B.

By the pumping lemma, s = uvxyz such that |vy| > 0 and |vxy| ≤ p. Given that |vxy| ≤ p, v and y together can contain at most two types of symbols (e.g., a and b); neither will contain the third (e.g., c). We consider the possibilities.

I If neither contains c, then v and y can be pumped up so that the number of a’s and/or b’s exceeds the number of c’s.

I If neither contains b, then v and y must be all a’s (or empty) or else all c’s. In the former case, the a’s can be pumped up to outnumber the b’s. In the latter, v and y can be pumped down so that the b’s outnumber the c’s.

I If neither contains a, we can again pump down so that the a’s outnumber the b’s, or the b’s outnumber the c’s.
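The case analysis shows that, unlike in the previous example, the direction of pumping matters: some splits only fail when pumped up and others only when pumped down. The sketch below checks this by brute force for the assumed value p = 3; it is an illustration rather than a proof, and the function names are hypothetical.

```python
import re

def in_B(s):
    """Membership test for {a^i b^j c^k | 0 <= i <= j <= k}."""
    m = re.fullmatch(r"(a*)(b*)(c*)", s)
    return bool(m) and len(m.group(1)) <= len(m.group(2)) <= len(m.group(3))

def breaking_exponents(s, p):
    """For every split with |vy| > 0 and |vxy| <= p, record which of the
    exponents 0 and 2 take the string out of B. Returns None if some split
    survives both exponents."""
    n, needed = len(s), set()
    for i in range(n + 1):
        end = min(i + p, n)
        for j in range(i, end + 1):
            for k in range(j, end + 1):
                for e in range(k, end + 1):
                    u, v, x, y, z = s[:i], s[i:j], s[j:k], s[k:e], s[e:]
                    if not v + y:
                        continue
                    bad = frozenset(m for m in (0, 2)
                                    if not in_B(u + v * m + x + y * m + z))
                    if not bad:
                        return None
                    needed.add(bad)
    return needed

p = 3
print(breaking_exponents("a" * p + "b" * p + "c" * p, p))
```

A split with v inside the a’s fails only for exponent 2, and a split with v inside the c’s fails only for exponent 0, which is why the proof needs both pumping up and pumping down.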



Example

1. The language A1 = {0^n 1^n 0^n 1^n | n ≥ 0}
2. The language A2 = {w#w | w ∈ {0, 1}∗}
3. The language A3 = {0^n # 0^(2n) # 0^(3n) | n ≥ 0}
4. The language A4 = {w#t | w is a substring of t, where w, t ∈ {a, b}∗}

Proof.

???

