5
1 Section 3.3 Grammars A grammar is a finite set of rules, called productions, that are used to describe the strings of a language. Notational Example. The productions take the form , where and are strings over an alphabet of terminals and nonterminals. Read as, produces ,” “ derives ,” or “ is replaced by .” The following four expressions are productions for a grammar. {a, b}, the alphabet of the language. s: {S, B}, the grammar symbols (uppercase letters), disjoint from ter : S, a specified nonterminal alone on the left side of some product rm: any string of terminals and/or nonterminals. a transformation of sentential forms by means of productions as If xy is a sentential form and is a production, then the repla in xy to obtain xy is a derivation step, which we denote by xy xy. ivation: S aSB aaSBB aaBB aabBB aabbB aabbb. leftmost derivation, where each step replaces the leftmost nonterminal. l + means one or more steps and * means zero or more steps. So we aabbb or S * aabbb or aSB * aSB, and so on. S aSB S B bB B b. Alternative Short Form S aSB | B bB | b.

1 Section 3.3 Grammars A grammar is a finite set of rules, called productions, that are used to describe the strings of a language. Notational Example

Embed Size (px)

Citation preview

Page 1: 1 Section 3.3 Grammars A grammar is a finite set of rules, called productions, that are used to describe the strings of a language. Notational Example

1

Section 3.3 GrammarsA grammar is a finite set of rules, called productions, that are used to describe the strings of a language.

Notational Example. The productions take the form , where and are strings over an alphabet of terminals and nonterminals. Read as, “produces,” “derives,” or “is replaced by.” The following four expressions are productions for a grammar.

Terminals: {a, b}, the alphabet of the language.

Nonterminals: {S, B}, the grammar symbols (uppercase letters), disjoint from terminals.

Start symbol: S, a specified nonterminal alone on the left side of some production.

Sentential form: any string of terminals and/or nonterminals.

Derivation: a transformation of sentential forms by means of productions as follows: If xy is a sentential form and is a production, then the replacement of by in xy to obtain xy is a derivation step, which we denote by xy xy.

Example Derivation: S aSB aaSBB aaBB aabBB aabbB aabbb.

This is a leftmost derivation, where each step replaces the leftmost nonterminal.The symbol + means one or more steps and * means zero or more steps. So we could write S + aabbb or S * aabbb or aSB * aSB, and so on.

S aSB S B bBB b.

Alternative Short Form S aSB | B bB | b.

Page 2: 1 Section 3.3 Grammars A grammar is a finite set of rules, called productions, that are used to describe the strings of a language. Notational Example

2

The Language of a GrammarThe language of a grammar is the set of terminal strings derived from the start symbol.

Example. Can we find the language of the grammar S aSB | and B bB | b?

Solution: Examine some derivations to see if a pattern emerges.

S S aSB aB abS aSB aB abB abbB abbb S aSB aaSBB aaBB aabB aabbS aSB aaSBB aaBB aabBB aabbBB aabbbB aabbbb.

So we have a pretty good idea that the language of the grammar is {anbn+k | n, k N}.

Quiz (1 minute). Describe the language of the grammar S a | bcS.

Solution: {(bc)na | n N}.

Construction of Grammars

Example. Find a grammar for {anb | n N}.

Solution: We need to derive any string of a’s followed by b. The production S aS can beused to derive strings of a’s. The production S b will stop the derivation and produce thedesired string ending with b. So a grammar for the language is S aS | b.

Quiz (1 minute). Find a grammar for {ban | n N}.

Solution: S Sa | b.

Quiz (1 minute). Find a grammar for {(ab)n | n N}.

Solution: S Sab | or S abS | .

Page 3: 1 Section 3.3 Grammars A grammar is a finite set of rules, called productions, that are used to describe the strings of a language. Notational Example

3

Rules for Combining GrammarsLet L and M be two languages with grammars that have start symbols A and B, respectively, and with disjoint sets of nonterminals. Then the following rules apply.

• L M has a grammar starting with S A | B.• LM has a grammar starting with S AB.• L* has a grammar starting with S AS | .

Example. Find a grammar for {ambmcn | m, n N}.

Solution: The language is the product LM, where L = {ambm | m N} and M = {cn | n N}. So a grammar for LM can be written in terms of grammars for L and M as follows.

S ABA aAb | B cB | .

Example. Find a grammar for the set Odd, of odd decimal numerals with no leading zeros, where, for example, 305 Odd, but 0305 Odd.

Solution: Notice that Odd can be written in the form Odd = (PD*)*O, where O = {1, 3, 5, 7, 9}, P = {1, 2, 3, 4, 5, 6, 7, 8, 9}, and D = {0} P.

Grammars for O, P, and D can be written with start symbols A, B, and C as: A 1 | 3 | 5 | 7 | 9, B A | 2 | 4 | 6 | 8, and C B | 0.

Grammars for D* and PD* and (PD*)* can be written with start symbols E, F, and G as: E CE | , F BE, and G FG | .

So a grammar for Odd with start symbol S isS GA.

Page 4: 1 Section 3.3 Grammars A grammar is a finite set of rules, called productions, that are used to describe the strings of a language. Notational Example

4

Example. Find a grammar for the language L defined inductively by,

Basis: a, b, c L.Induction: If x, y L then ƒ(x), g(x, y) L.

Solution: We can get some idea about L by listing some of its strings.

a, b, c, ƒ(a), ƒ(b), …, g(a, a), …, g(ƒ(a), ƒ(a)), …, ƒ(g(b, c)), …, g(ƒ(a), g(b, ƒ(c))), …

So L is the set of all algebraic expressions made up from the letters a, b, c, and the function symbols ƒ and g of arities 1 and 2, respectively. A grammar for L can be written as

S a | b | c | ƒ(S) | g(S, S).

For example, a leftmost derivation of g(ƒ(a), g(b, ƒ(c))) can be written as

S g(S, S) g(ƒ(S), S) g(ƒ(a), S) g(ƒ(a), g(S, S))

g(ƒ(a), g(b, S)) g(ƒ(a), g(b, ƒ(S))) g(ƒ(a), g(b, ƒ(c))).

S

g ( S , S )

ƒ ( S ) b

a

A Parse Tree is a tree that represents a derivation. The root is the start symbol and the children of a nonterminal node are the symbols (terminals, nonterminals, or ) on the right side of the production used in the derivation step that replaces that node.

Example. The tree shown in the picture is the parse tree for the following derivation:

S g(S, S) g(ƒ(S), S) g(ƒ(a), S) g(ƒ(a), b).

Page 5: 1 Section 3.3 Grammars A grammar is a finite set of rules, called productions, that are used to describe the strings of a language. Notational Example

5

Example. Is the grammar S SaS | b ambiguous?Solution: Yes. For example, the string babab has two distinct leftmost derivations:

S SaS SaSaS baSaS babaS babab.S SaS baS baSaS babaS babab.

The parse trees for the derivations are pictured.

S

S a S

S a S

b b

b

S

S a S

S a Sb

b b

Ambiguous Grammar: Means there is at least one string with two distinct parse trees, or equivalently, two distinct leftmost derivations or two distinct rightmost derivations.

Quiz (2 minutes). Show that the grammar S abS | Sab | c is ambiguous.Solution: The string abcab has two distinct leftmost derivations:

S abS abSab abcab and S Sab abSab abcab.

Unambiguous Grammar: Sometimes one can find a grammar that is not ambiguous for the language of an ambiguous grammar.

Example. The previous example showed S SaS | b is ambiguous. The language of the grammar is {b, bab, babab, …}. Another grammar for the language is S baS | b. It is unambiguous because S produces either baS or b, which can’t derive the same string.

Example. The previous quiz showed S c | abS | Sab is ambiguous. Its language is {(ab)mc(ab)n | m, n N}. Another grammar for the language is

S abS | cT and T abT | .

It is unambiguous because S produces either abS or cT, which can’t derive the same string. Similarly, T produces either abT or , which can’t derive the same string.