Context free languages 1. Equivalence of context free grammars 2. Normal forms


Context free languages

1. Equivalence of context free grammars
2. Normal forms

Context-free grammars

In a context free grammar, all productions are of the form A -> w, where A is a nonterminal or the start symbol S, and w is a string from (N ∪ T)*.

Handles, recursive productions

In the production A -> xw, the prefix x, if a single symbol, is called the handle of the production, whether x is in N or T.

A production A -> Aw is called left-recursive.

A production A -> wA is called right-recursive.

Repeated sentential forms

In a derivation S -> … -> wAx -> … -> wAx -> …, the sentential form wAx is called a repeated sentential form. All the steps between its two occurrences are wasted steps.

Leftmost derivations and minimal leftmost derivations

A derivation is a leftmost derivation if at each step only the leftmost nonterminal symbol is replaced using some rule of the grammar.

A leftmost derivation is called minimal if no sentential form is repeated in the derivation.

Weak equivalence

Two context-free grammars G1 and G2 are called weakly equivalent if L(G1) = L(G2).

Example of weak equivalence

G1: S -> S01; S -> 1
L(G1) = { 1(01)* }

G2: S -> S0S; S -> 1
L(G2) = { 1(01)* }
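Weak equivalence cannot be decided in general, but for these two small grammars it can be spot-checked by enumerating every string up to a length bound. The sketch below is only an illustration: the function name `language_up_to`, the dictionary representation, and the convention that uppercase letters are nonterminals are all assumptions made here, and the search relies on no rule having an empty right-hand side.

```python
from collections import deque

def language_up_to(rules, start, max_len):
    """Return every terminal string of length <= max_len derivable from start.

    rules maps a nonterminal (a single uppercase letter) to its right-hand
    sides. Assumption: no right-hand side is empty, so sentential forms
    never shrink and any form longer than max_len can be dropped.
    """
    seen = {start}
    queue = deque([start])
    words = set()
    while queue:
        form = queue.popleft()
        # Position of the leftmost nonterminal, if there is one.
        pos = next((i for i, c in enumerate(form) if c.isupper()), None)
        if pos is None:
            words.add(form)  # only terminals left
            continue
        for rhs in rules.get(form[pos], []):
            new = form[:pos] + rhs + form[pos + 1:]
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return words

G1 = {"S": ["S01", "1"]}
G2 = {"S": ["S0S", "1"]}
same_up_to_7 = language_up_to(G1, "S", 7) == language_up_to(G2, "S", 7)
```

Agreement up to a bound is evidence, not a proof, that L(G1) = L(G2).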

Strong equivalence

Two CFGs G1 and G2 are called strongly equivalent if they are weakly equivalent and, for each string w of terminals in L(G1) = L(G2), the minimal leftmost derivations of w in G1 and the minimal leftmost derivations of w in G2 are exactly the same in number, and so can be put into one-to-one correspondence.

Thus G1 and G2 must both be unambiguous, or must both be ambiguous in exactly the same number of ways, for each string w in T*.

Weakly equivalent but not strongly equivalent

G1: Grammar of expressions S:
S -> T | S + T;
T -> F | T * F;
F -> a | ( S );

G2: Grammar of expressions S:
S -> E;
E -> E + E | E * E | (E) | a;

L(G1) = L(G2) = valid expressions using a, +, *, (, and ). G1 has operator precedence.
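The difference between the two expression grammars can be made concrete by counting parse trees, which correspond one-to-one with leftmost derivations. The following is a minimal sketch under stated assumptions: `count_parses` is a name invented here, symbols are single characters with uppercase meaning nonterminal, no right-hand side is empty, and there is no cycle of unit rules (so every leftmost derivation is already minimal).

```python
from functools import lru_cache

def count_parses(rules, start, word):
    """Count the parse trees (leftmost derivations) of word from start."""

    @lru_cache(maxsize=None)
    def derive(symbols, i, j):
        # Number of ways the symbol string `symbols` derives word[i:j].
        if not symbols:
            return 1 if i == j else 0
        head, rest = symbols[0], symbols[1:]
        total = 0
        # head covers word[i:k]; each symbol in rest needs >= 1 character.
        for k in range(i + 1, j - len(rest) + 1):
            if head.isupper():
                ways = sum(derive(rhs, i, k) for rhs in rules.get(head, ()))
            else:
                ways = 1 if (k == i + 1 and word[i] == head) else 0
            if ways:
                total += ways * derive(rest, k, j)
        return total

    return derive(start, 0, len(word))

G1 = {"S": ["T", "S+T"], "T": ["F", "T*F"], "F": ["a", "(S)"]}
G2 = {"S": ["E"], "E": ["E+E", "E*E", "(E)", "a"]}
g1_count = count_parses(G1, "S", "a+a*a")
g2_count = count_parses(G2, "S", "a+a*a")
```

For the string a+a*a the counts differ (one parse in G1, two in G2), so the grammars are not strongly equivalent even though they generate the same language.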

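Example: Strong equivalence

G1: S -> A; A -> 1B; A -> 1; B -> 0A
L(G1) = { (10)*1 }

G2: S -> B; B -> A1; B -> 1; A -> B0
L(G2) = { 1(01)* }

The expressions (10)*1 and 1(01)* describe the same set of strings: 1, 101, 10101, and so on.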

Elementary transformations of context free grammars

- substitution
- expansion
- removal of useless productions
- removal of non-generative productions
- removal of left recursive productions

Substitution

If G has the A-rule A -> uBv, and all the B-rules are B -> w1, B -> w2, . . . , B -> wk, then

1. Remove the A-rule A -> uBv
2. Add the A-rules: A -> uw1v, A -> uw2v, . . . , A -> uwkv
3. Keep all the other rules of G, including the B-rules

Example of substitution

G1: S -> H;
H -> TT;
T -> S; T -> aSb; T -> c

G2: S -> H;
H -> ST; H -> aSbT; H -> cT;
T -> S; T -> aSb; T -> c;
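The three substitution steps can be sketched in a few lines. This is an illustrative version, not a definitive implementation: `substitute` is a hypothetical helper, a grammar is assumed to be a list of (left side, right side) pairs with single-character symbols, and `pos` selects which occurrence of B in the right side is replaced.

```python
def substitute(rules, a_rule, pos):
    """Substitute the B-rules into a_rule = (A, uBv), where B = rhs[pos]."""
    lhs, rhs = a_rule
    b = rhs[pos]
    u, v = rhs[:pos], rhs[pos + 1:]
    new_rules = []
    for rule in rules:
        if rule == a_rule:
            # Steps 1 and 2: drop A -> uBv, add A -> u w_i v for each B -> w_i.
            new_rules.extend((lhs, u + w + v) for x, w in rules if x == b)
        else:
            # Step 3: every other rule, including the B-rules, is kept.
            new_rules.append(rule)
    return new_rules

G1 = [("S", "H"), ("H", "TT"), ("T", "S"), ("T", "aSb"), ("T", "c")]
G2 = substitute(G1, ("H", "TT"), 0)  # replace the first T in H -> TT
```

Applied to G1 from the example, this yields the rules listed for G2.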

Strong equivalence after substitution

The grammar G, and the grammar G’ obtained by substitution of B into the A-rule, are strongly equivalent if steps 2 and 3 do not introduce duplicate rules.

Expansion

If a grammar has the A-rule A -> uv, remove this A-rule and replace it with the two rules A -> Xv; X -> u; or with A -> uY; Y -> v, where X (or Y) is a new non-terminal symbol of the grammar.

Strong equivalence after expansion

If G is context free, and G’ is obtained from G by expansion, then G and G’ are strongly equivalent.

Useful production

A production A -> w of a cfg G is useful if there is a string x from T* such that
S -> . . -> uAv -> uwv -> . . -> x

Otherwise the production A -> w is useless.

Thus, a production that is never used to derive a string of terminals is useless.

Removing useless productions

1. T-marking
2. S-marking

Productions that are both T-marked and S-marked are useful. All other productions can be removed.

T-marking

Construct a sequence P0, P1, P2, . . . of subsets of P, and a sequence N0, N1, N2, . . . of subsets of N, as follows:

P0 = empty, N0 = empty, j = 0
P[j+1] = { A -> w in P | w in (N[j] + T)* }
N[j+1] = { A in N | P[j+1] contains a rule A -> w }

Continue until P[j] = P[j+1] = P[T].

S-marking

Construct a sequence Q1, Q2, Q3, . . . of subsets of P[T] as follows:

Q1 = { S -> w in P[T] }
Q[j+1] = Q[j] + { A -> w in P[T] | Q[j] contains a rule B -> uAv }

Continue until Q[j] = Q[j+1] = P[S].

The productions in P[S] are now the useful productions.

Example: T/S-marking

Rule          T mark   S mark
1. S -> H       2        1
2. H -> AB
3. H -> aH      2        2
4. H -> a       1        2
5. B -> Hb      2
6. C -> aC

Thus only rules 1, 3, and 4 are useful.
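The two marking passes can be sketched as a pair of fixed-point loops and checked against the six rules of this example. The code is an illustration under assumptions made here: `useful_productions` is a hypothetical name, rules are (left side, right side) pairs, and uppercase letters are nonterminals. It tracks rule indices rather than the mark numbers shown in the table.

```python
def useful_productions(rules, start="S"):
    """Return the rules that are both T-marked and S-marked."""
    # T-marking: grow P[j] (as rule indices) and N[j] until nothing changes.
    generating = set()  # N[j]
    t_marked = set()    # P[j]
    while True:
        new = {
            i for i, (_, w) in enumerate(rules)
            if all(not c.isupper() or c in generating for c in w)
        }
        if new == t_marked:
            break
        t_marked = new
        generating = {rules[i][0] for i in t_marked}

    # S-marking: start from the T-marked S-rules and follow right-hand sides.
    s_marked = {i for i in t_marked if rules[i][0] == start}
    while True:
        reachable = {c for i in s_marked for c in rules[i][1] if c.isupper()}
        new = s_marked | {i for i in t_marked if rules[i][0] in reachable}
        if new == s_marked:
            break
        s_marked = new
    return [rules[i] for i in sorted(s_marked)]

rules = [("S", "H"), ("H", "AB"), ("H", "aH"),
         ("H", "a"), ("B", "Hb"), ("C", "aC")]
useful = useful_productions(rules)
```

The result contains S -> H, H -> aH and H -> a, matching rules 1, 3, and 4.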

Strong equivalence after removal of useless productions

If grammar G’ is obtained from grammar G after removal of useless productions of grammar G, then G and G’ are strongly equivalent.

Removing non-generative productions

Removing left-recursive rules

Let all the X-rules of grammar G be:

X -> u1 | u2 | . . . | uk
X -> Xw1 | Xw2 | . . . | Xwh

Then these rules may be replaced by the following:

X -> u1 | u2 | . . . | uk
X -> u1Z | u2Z | . . . | ukZ
Z -> w1 | w2 | . . . | wh
Z -> w1Z | w2Z | . . . | whZ

where Z is a new non-terminal symbol.

Example: Removing left-recursive rules

Before:
S -> E;
E -> T | aT | bT;
E -> EaT | EbT;
T -> F;
T -> TcF | TdF;
F -> n | xEy

After:
S -> E;
E -> T | aT | bT;
E -> TG | aTG | bTG;
G -> aT | bT;
G -> aTG | bTG;
T -> F;
T -> FH;
H -> cF | dF;
H -> cFH | dFH;
F -> n | xEy
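The replacement scheme for the X-rules translates directly into code. The sketch below handles only immediate left recursion for one nonterminal at a time; `remove_left_recursion` is a hypothetical helper, right-hand sides are strings of single-character symbols, and it assumes there is no rule X -> X.

```python
def remove_left_recursion(alternatives, x, z):
    """Rewrite the X-rules so that none is left-recursive.

    alternatives: right-hand sides of all X-rules.
    x: the nonterminal X; z: a fresh nonterminal Z not used elsewhere.
    """
    recursive = [w[1:] for w in alternatives if w.startswith(x)]  # the w_i
    others = [u for u in alternatives if not u.startswith(x)]     # the u_i
    if not recursive:
        return {x: list(alternatives)}  # nothing to do
    return {
        x: others + [u + z for u in others],        # X -> u_i | u_i Z
        z: recursive + [w + z for w in recursive],  # Z -> w_i | w_i Z
    }

e_rules = remove_left_recursion(["T", "aT", "bT", "EaT", "EbT"], "E", "G")
t_rules = remove_left_recursion(["F", "TcF", "TdF"], "T", "H")
```

On the E-rules and T-rules of the example, this produces the G-rules and H-rules shown above.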

Strong equivalence after removal of left-recursive rules

If grammar G’ is obtained from grammar G by replacing the left-recursive rules of G by right-recursive rules, then G and G’ are strongly equivalent.

Well-formed grammars

A context free grammar G = (N,T,P,S) is well-formed if each production has one of the forms:

S -> ε (ε denotes the empty string)
S -> A
A -> w

where A ∈ N and w ∈ (N+T)* − N, and each production is useful.

Example of well-formed grammars

Parenthesis grammar:
S -> A;
A -> AA;
A -> (A);
A -> ();

Chomsky normal form

A context free grammar G = (N,T,P,S) is in normal form (Chomsky normal form) if each production has one of the forms:

S -> ε (ε denotes the empty string)
S -> A
A -> BC
A -> a

where A, B, C ∈ N and a ∈ T.

Example of Chomsky normal form grammar

Parenthesis grammar, before:
S -> A;
A -> AA;
A -> (A);
A -> ();

After:
S -> A;
A -> AA;
A -> BC;
B -> (;
C -> AD;
D -> );
A -> BD;
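Whether a grammar meets this definition of normal form is a purely syntactic test on each production, so it is easy to check mechanically. The sketch below follows the four production forms from the definition, including S -> A; the name `is_chomsky_normal_form` and the rule representation (pairs of strings, uppercase letters as nonterminals, the empty string for an empty right side) are assumptions for illustration.

```python
def is_chomsky_normal_form(rules, start="S"):
    """Check each production against the four permitted forms."""
    def is_nt(c):
        return c.isupper()

    for lhs, rhs in rules:
        if lhs == start and rhs == "":
            continue  # S -> empty string
        if lhs == start and len(rhs) == 1 and is_nt(rhs):
            continue  # S -> A
        if len(rhs) == 2 and is_nt(rhs[0]) and is_nt(rhs[1]):
            continue  # A -> BC
        if len(rhs) == 1 and not is_nt(rhs):
            continue  # A -> a
        return False
    return True

before = [("S", "A"), ("A", "AA"), ("A", "(A)"), ("A", "()")]
after = [("S", "A"), ("A", "AA"), ("A", "BC"), ("B", "("),
         ("C", "AD"), ("D", ")"), ("A", "BD")]
before_ok = is_chomsky_normal_form(before)
after_ok = is_chomsky_normal_form(after)
```

The original parenthesis grammar fails the check because of A -> (A) and A -> (), while the converted grammar passes.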

Chomsky Normal Form Theorem

From any context free grammar, one can construct a strongly equivalent grammar in Chomsky normal form.

Greibach normal form (standard form)

A context free grammar G = (N,T,P,S) is in standard form (Greibach normal form) if each production has one of the forms:

S -> ε (ε denotes the empty string)
S -> A
A -> aw

where A ∈ N, a ∈ T, and w ∈ (N+T)*.

Example: converting to Greibach standard form

First remove left-recursive rules.

Before:
S -> E;
E -> T;
E -> EaT;
T -> n;
T -> xEy;

After:
S -> E;
E -> T;
E -> TF;
F -> aT;
F -> aTF;
T -> n;
T -> xEy;

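Converting to Greibach

Then substitute for the nonterminal handles, so that every production other than S -> E has a terminal handle.

Before:
S -> E;
E -> T;
E -> TF;
F -> aT;
F -> aTF;
T -> n;
T -> xEy;

After:
S -> E;
E -> n | xEy;
E -> nF | xEyF;
F -> aT;
F -> aTF;
T -> n;
T -> xEy;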
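The handle substitution in this example can be sketched as a single pass over the rules. This is only an illustration under assumptions made here: `substitute_handles` and `is_standard_form` are hypothetical names, rules are (left side, right side) pairs, and uppercase letters are nonterminals. One pass is enough for this grammar because every T-rule already starts with a terminal; in general the pass would have to be repeated, and left recursion must be removed first.

```python
def substitute_handles(rules, start="S"):
    """Replace each nonterminal handle by the right sides of its rules."""
    result = []
    for lhs, rhs in rules:
        start_unit = lhs == start and len(rhs) == 1 and rhs.isupper()
        if rhs and rhs[0].isupper() and not start_unit:
            # A -> Bw becomes A -> vw for every rule B -> v.
            result.extend((lhs, v + rhs[1:]) for b, v in rules if b == rhs[0])
        else:
            result.append((lhs, rhs))  # S -> A or A -> aw: keep as is
    return result

def is_standard_form(rules, start="S"):
    """Check the forms S -> empty, S -> A, and A -> aw."""
    for lhs, rhs in rules:
        if lhs == start and (rhs == "" or (len(rhs) == 1 and rhs.isupper())):
            continue
        if rhs and not rhs[0].isupper():
            continue
        return False
    return True

before = [("S", "E"), ("E", "T"), ("E", "TF"), ("F", "aT"),
          ("F", "aTF"), ("T", "n"), ("T", "xEy")]
after = substitute_handles(before)
```

The output reproduces the rules in the converted grammar, and `is_standard_form(after)` holds while `is_standard_form(before)` does not.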

Standard Form Theorem

From any context free grammar, one can construct a strongly equivalent grammar in standard form (Greibach normal form).

Pumping Lemma for context free languages

If L is a context free language, then there exists a positive integer p such that: if w ∈ L and |w| > p, then w = xuyvz, with uv and y nonempty and x u^k y v^k z ∈ L for all k ≥ 0.
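To see what pumping looks like on a concrete string, the sketch below pumps one decomposition of (()) in the language of the parenthesis grammar (nonempty balanced strings). The decomposition and the helper names `pump` and `is_balanced` are chosen here for illustration. The lemma only asserts that some suitable decomposition exists for every sufficiently long string, so checking one decomposition for a few values of k demonstrates the idea but proves nothing.

```python
def pump(x, u, y, v, z, k):
    """Build the string x u^k y v^k z."""
    return x + u * k + y + v * k + z

def is_balanced(s):
    """Membership test for nonempty balanced strings over '(' and ')'."""
    depth = 0
    for c in s:
        depth += 1 if c == "(" else -1
        if depth < 0:
            return False  # a ')' closed more than was opened
    return depth == 0 and s != ""

# w = "(())" split as x = "", u = "(", y = "()", v = ")", z = "".
parts = ("", "(", "()", ")", "")
pumped = [pump(*parts, k) for k in range(4)]
all_in_language = all(is_balanced(s) for s in pumped)
```

With k = 0 the string shrinks to (), and each larger k adds one more matching pair around it, so every pumped string stays in the language.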