Lecture 5 Grammars Topics Moving on from Lexical Analysis Grammars Derivations CFLs Readings: 4.1 January 25, 2006 CSCE 531 Compiler Construction


Lecture 5 Grammars

Topics
  Moving on from Lexical Analysis
  Grammars
  Derivations
  CFLs

Readings: 4.1

January 25, 2006

CSCE 531 Compiler Construction


– 2 – CSCE 531 Spring 2006

Overview

Last Time
  Symbol table - hash table from K&R
  DFA review
  Simulating DFA figure 3.22
  NFAs
  Thompson Construction: re → NFA
  Examples
  NFA → DFA, the subset construction
    ε-closure(s), ε-closure(T), move(T,a)

Today's Lecture
  Flex example Fig 3.28 revisited

References

Pop Quiz - I will be a couple of minutes late

Draw the NFA that recognizes (00 | 11)* (01 | 10).

Given an NFA M_R that recognizes the language denoted by a regular expression R, build a machine that recognizes R_even, that is, one that matches R an even number of times.

Lexical analyzer for subset of C

  int constants: int, octal, hex
  Float constants
  C identifiers
  Keywords: for, while, if, else
  Relational operators: < > >= <= != ==
  Arithmetic, Boolean and bit operators: + - * / && || ! ~ & |
  Other symbols: ; { } [ ] * ->

Write core.l Flex Specification

Due Monday Jan 30

Notes
  1. Install identifiers and constants into the symbol table.
  2. Return a separate token code for each relational operator. Not as in text!!

Homework 02 Due Thursday Jan 26 (now Saturday 28)
  1. Construct an NFA for recognizing (a|b|ε)(ab)*
  2. Convert to DFA

Flex example Fig 3.28 revisited

/class/csce531-001/Examples/Flex
  Put “e=/class/csce531-001/Examples/” in your .bash_profile in your home directory (note the period makes it hidden).
  Then when you log in you can use “cd $e” to move to the Examples directory.

Files
  ex0.l, ex1.l (note last character is lowercase “L”)
  ex3.18.l, Makefile, y.tab.h
  Fixed a few things so it would actually compile and run

Building and Running ex3.18

Preliminary steps
  cp $e/Flex/ex3.18.l .      // copy lex-spec to current directory
  cp $e/Flex/Makefile .
  cp $e/Flex/y.tab.h .

  flex ex3.18.l              // creates the file lex.yy.c
  ls
  gcc lex.yy.c -lfl
  ./a.out
  if then else xbar
  (output)

Routines section

%%
int main(void) {
    int tok;

    /* Token codes are positive; the loop stops on 0 (what yylex()
       returns by default at end of input) as well as on EOF. */
    while ((tok = yylex()) > 0) {
        printf("Token code %d\t lexeme %s \n", tok, yytext);
    }
    return 0;
}

/* Code for install_id() and install_num(); bodies are left as stubs */
int
install_id() {
    return 0;   /* placeholder */
}

int
install_num() {
    return 0;   /* placeholder */
}

Regular Languages

  Regular Expressions
  NFA
  DFA

All specify/recognize the same languages; in formal language theory these languages are called regular.

Example of a Non-Regular Language

L = { 0^n 1^n | n > 0 } is non-regular.

Proof: Suppose that L were a regular language; then there would exist some DFA M that accepts L.

Suppose that M has k states.

Consider the collection of strings
  0
  00
  000
  …
  0^k
  0^(k+1)

Then by the pigeonhole principle, if you start at q_0 and follow the paths determined by the k+1 strings above, two of the strings, say 0^i and 0^j with i ≠ j, leave you in the same state q.

But then from state q, following the path determined by the string 1^i must leave you in a final state, because 0^i 1^i is in L.

But then 0^j 1^i must be accepted also, even though it is not in L because j ≠ i.

This is a contradiction, which proves that L is not regular.

QED

Intuitively a DFA can count only a finite (bounded) number of things.

The language of balanced parentheses is non-regular also.
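The pigeonhole step of the proof can be tried out on a concrete machine. The sketch below is only an illustration: `DELTA` is a made-up 3-state DFA over {0, 1} (not one from the lecture), and `find_collision` and `run` are hypothetical helper names. It finds two strings 0^i and 0^j that land in the same state, and then confirms that the machine cannot tell 0^i 1^i from 0^j 1^i.

```python
def find_collision(delta, start, k):
    """Return (i, j) with i < j such that 0^i and 0^j reach the same state.

    delta maps (state, symbol) -> state; k is the number of states.
    """
    seen = {}                      # state -> number of 0s read when first reached
    state = start
    for count in range(1, k + 2):  # the k+1 strings 0^1 ... 0^(k+1)
        state = delta[(state, "0")]
        if state in seen:
            return seen[state], count
        seen[state] = count
    raise AssertionError("unreachable: k+1 strings but only k states")


def run(delta, start, text):
    """Follow the path determined by text and return the final state."""
    state = start
    for ch in text:
        state = delta[(state, ch)]
    return state


# Hypothetical 3-state DFA, chosen only for the demonstration.
DELTA = {
    ("A", "0"): "B", ("A", "1"): "A",
    ("B", "0"): "C", ("B", "1"): "A",
    ("C", "0"): "B", ("C", "1"): "C",
}

i, j = find_collision(DELTA, "A", 3)
same = run(DELTA, "A", "0" * i + "1" * i) == run(DELTA, "A", "0" * j + "1" * i)
```

Whatever set of final states is chosen for this machine, it accepts both strings or neither, which is exactly the contradiction used in the proof.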

Moving on Up to the Parsing Side

Lexical analysis can’t do it all

Syntax analysis recognizes things from context.

It is the process of discovering the structure for some sentence or program.

Need a mathematical model of syntax: a grammar G

Need an algorithm for testing membership in L(G)

The Role of the Parser

Figure 4.1

Context Free Grammars

A context free grammar is a formal mathematical model that has 4 components, G = (N, T, P, S), where

N is a set of grammar symbols called nonterminals

T is a set of terminals (or tokens)

P is a set of productions or rewrite rules of the form
    Nonterminal → string of grammar symbols
  E.g., N → a b N
  Terminology: left hand side, right hand side
  grammar symbols = N ∪ T

S is the start symbol (a nonterminal)

Generally a grammar is specified by listing the productions.

Example Context Free Grammars

Example: G = (N, T, P, S)
  N = {S, T}
  T = {a, b, c}
  P = { S → aS, S → bT, T → c }

Notational conventions
  Nonterminals are typically represented by capital letters N, T, P, S, … or lower case strings in italics, e.g., expr
  Terminals are typically represented by lower case letters a, b, … z, punctuation symbols, operators, parentheses, digits
  Unless otherwise stated the nonterminal of the first production is the start symbol
  “|” shorthand: S → aS | bT is shorthand for the two S productions S → aS, S → bT
  Lower case Greek symbols represent strings of grammar symbols
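The four components map naturally onto a small data structure. The following is a minimal sketch, assuming Python; the class name `Grammar`, its field names, and the `is_well_formed` check are illustrative choices rather than anything defined in the lecture. It encodes the example grammar above and checks the basic conditions from the definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    nonterminals: frozenset
    terminals: frozenset
    productions: tuple      # (lhs, rhs) pairs; rhs is a tuple of grammar symbols
    start: str

    def is_well_formed(self):
        """Check the conditions stated in the definition of G = (N, T, P, S)."""
        symbols = self.nonterminals | self.terminals
        return (
            self.start in self.nonterminals                 # S is a nonterminal
            and not (self.nonterminals & self.terminals)    # N and T are disjoint
            and all(
                lhs in self.nonterminals and all(s in symbols for s in rhs)
                for lhs, rhs in self.productions
            )
        )


# The example grammar: S -> aS | bT, T -> c
G = Grammar(
    nonterminals=frozenset({"S", "T"}),
    terminals=frozenset({"a", "b", "c"}),
    productions=(("S", ("a", "S")), ("S", ("b", "T")), ("T", ("c",))),
    start="S",
)
```

Storing each right-hand side as a tuple of symbols, rather than a single string, keeps multi-character names such as `expr` usable later.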

Derivations

The derives (=>) relation is a binary relation between strings of grammar symbols.

We define derives as below:

If T → X_1 X_2 … X_n is a production and α and β are strings of grammar symbols then we say αTβ derives αX_1X_2…X_nβ and denote this by αTβ => αX_1X_2…X_nβ

Example
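As one possible example, the sketch below applies the one-step derives relation to the grammar S → aS | bT, T → c from the earlier example slide. It assumes every grammar symbol is a single character, so a sentential form can be an ordinary string; `PRODUCTIONS` and `one_step` are names invented for the illustration.

```python
# Productions of the example grammar, as (left hand side, right hand side).
PRODUCTIONS = [("S", "aS"), ("S", "bT"), ("T", "c")]


def one_step(form, productions=PRODUCTIONS):
    """Return every string that `form` derives in exactly one step (=>)."""
    results = set()
    for lhs, rhs in productions:
        for pos, symbol in enumerate(form):
            if symbol == lhs:
                # form = alpha lhs beta  =>  alpha rhs beta
                results.add(form[:pos] + rhs + form[pos + 1:])
    return results


# One possible chain: S => aS => abT => abc
step1 = one_step("S")      # {"aS", "bT"}
step2 = one_step("aS")     # {"aaS", "abT"}
step3 = one_step("abT")    # {"abc"}
```

A string of terminals such as abc derives nothing further, so `one_step("abc")` is the empty set.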

Review of Properties of Binary Relations

If R is a binary relation on A then R is a subset of A x A

Symmetric if a R b implies b R a

Transitive if a R b and b R c implies a R c

The transitive closure of R is the minimal subset of A x A that contains R and is a transitive relation.

Henceforth we will use =>* (read derives) to denote the transitive closure of “=>”, the “one-step” derives on the previous slide.

α =>* β means α => α_1 => α_2 => … => α_n = β

Thus α =>* β means that one can apply a sequence of productions and rewrite α as α_1, then apply a production to α_1 to obtain α_2, … and eventually obtain β
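For a finite relation the transitive closure can be computed directly by adding pairs until nothing changes. This is a small sketch under that assumption (the derives relation of a grammar is usually infinite, so this only illustrates the definition); `transitive_closure` and `STEPS` are hypothetical names, and `STEPS` holds three one-step derivations from the grammar S → aS | bT, T → c.

```python
def transitive_closure(relation):
    """Smallest transitive relation containing `relation` (a set of pairs)."""
    closure = set(relation)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                # a R b and b R d force a R d
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Three one-step derivations: S => aS, aS => abT, abT => abc
STEPS = {("S", "aS"), ("aS", "abT"), ("abT", "abc")}
DERIVES = transitive_closure(STEPS)
```

The closure adds the multi-step pairs, for instance ("S", "abc"), without adding anything in the reverse direction.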

Derivations and Sentential Forms

If α =>* β, or α => α_1 => α_2 => … => α_n = β, then we say the sequence of rewrites forms a derivation of β from α.

The purpose of a grammar is to rewrite strings of grammar symbols until we obtain a string of terminals (tokens).

If G = (N, T, P, S) is a grammar then α is a sentential form if
  1. The start symbol derives α, S =>* α
  2. α derives a string of tokens, α =>* ω, where ω ∈ T*

Or written more concisely
  S =>* α =>* ω, where ω ∈ T*

Language Generated by a grammar

If G = (N, T, P, S) is a grammar then the language generated by G, denoted by L(G), is

L(G) = { x ∈ T* | S =>* x }

Example
  S → 0 S 1 | ε
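To see what L(G) looks like for S → 0 S 1 | ε, one can enumerate derivations breadth first and collect the terminal strings. The sketch below does this up to a length bound; `language_up_to` is a hypothetical helper, and the pruning relies on the fact that neither production ever removes a terminal.

```python
from collections import deque


def language_up_to(max_len):
    """Terminal strings of length <= max_len that S derives, shortest first."""
    found = set()
    seen = {"S"}
    queue = deque(["S"])
    while queue:
        form = queue.popleft()
        if "S" not in form:
            found.add(form)            # no nonterminal left: a string in L(G)
            continue
        for rhs in ("0S1", ""):        # the two productions; "" stands for ε
            new = form.replace("S", rhs, 1)
            # Terminals never disappear, so prune forms that are already too long.
            if len(new.replace("S", "")) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(found, key=len)


strings = language_up_to(4)
```

The result suggests L(G) = { 0^n 1^n | n ≥ 0 }, which is the non-regular language from earlier in the lecture with the empty string added.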

Parse Trees

A parse tree is a graphical presentation of a derivation, satisfying

  The root is the start symbol
  Each leaf is a token or ε (note different font from text)
  Each interior node is a nonterminal
  If A is a parent with children X_1, X_2 … X_n then A → X_1 X_2 … X_n is a production

Top down vs. Bottom Up Construction of Parse Trees

G:
  S → (E)*S
  S → (E)
  E → id + id

[Figure: parse tree with root S and children ( E ) * S; the inner S expands to ( E ), and each E expands to id + id]

Bottom Up Construction of Parse Trees

G:
  S → (E)*S
  S → (E)
  E → id + id

X * Y + Z * W

Leftmost (Rightmost) derivations

A derivation S =>* ω, where ω ∈ (N ∪ T)*, is called leftmost if at each step you rewrite the leftmost nonterminal in the sentential form.

If we want to emphasize that this is a leftmost derivation we will write S =>LM ω, read S leftmost derives ω.

Example
  E → E + E
  E → E * E
  E → id

We will henceforth use the ‘|’ shorthand and write this grammar as
  E → E + E | E * E | id

Rightmost derivations are defined in a similar manner.

Ambiguity

A grammar is ambiguous if there is a string of terminals that has two distinct parse trees (or two distinct LM derivations or 2 RM derivations)

Example: E → E + E | E * E | id
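The ambiguity of E → E + E | E * E | id can be checked by counting parse trees. The sketch below is specific to this one grammar: every tree for a span of more than one token has an operator at its root, so the count for a span is the sum, over each operator position, of the counts for the left and right parts multiplied together. The function name `count_parse_trees` is an assumption made for the example.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Number of distinct parse trees for `tokens` under E -> E+E | E*E | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # Number of ways E derives tokens[lo:hi].
        if hi - lo == 1:
            return 1 if tokens[lo] == "id" else 0
        total = 0
        for mid in range(lo + 1, hi - 1):
            if tokens[mid] in ("+", "*"):       # operator at the root of the tree
                total += count(lo, mid) * count(mid + 1, hi)
        return total

    return count(0, len(tokens))


trees = count_parse_trees("id + id * id".split())
```

A count above one for any string of terminals is enough to show that the grammar is ambiguous; here `id + id * id` has two trees, one grouping the sum first and one grouping the product first.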

Eliminating Ambiguity

Rewriting the grammar is the approach taken.

However, there are certain languages for which every grammar that generates them is ambiguous.

These languages are called inherently ambiguous languages. We will not consider any of these languages in this class.

Consider the grammar for expressions

E → E + E | E * E | id

Derivations and Precedence

This grammar has no notion of precedence!

To add precedence
  Create a non-terminal for each level of precedence
  Isolate the corresponding part of the grammar
  Force the parser to recognize high precedence subexpressions first

For algebraic expressions
  Multiplication and division, first (level one)
  Subtraction and addition, next (level two)

Rewriting the Expression Grammar

Add nonterminals for each level of precedence
  Term (product) for components of sums
  Factor for components of products (terms)

  Expr   → Expr + Term
  Expr   → Expr - Term
  Expr   → Term
  Term   → Term * Factor
  Term   → Term / Factor
  Term   → Factor
  Factor → ID
  Factor → NUMBER
  Factor → ( Expr )

Derivation of 5 * X + 3 * Y
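One way to see the structure this derivation produces is a small hand-written parser for the Expr/Term/Factor grammar. The sketch below assumes the Term productions use * and / (the level-one operators named on the precedence slide), and it replaces the left-recursive productions with loops, which is a common implementation technique rather than something taken from the lecture. The names `tokenize` and `parse`, and the nested-tuple tree format, are choices made for the example.

```python
import re


def tokenize(text):
    """Split into identifiers, numbers, operators and parentheses."""
    return re.findall(r"[A-Za-z_]\w*|\d+|[-+*/()]", text)


def parse(text):
    """Return a nested tuple (op, left, right) for an arithmetic expression."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def expr():                       # Expr -> Expr + Term | Expr - Term | Term
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    def term():                       # Term -> Term * Factor | Term / Factor | Factor
        node = factor()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, factor())
        return node

    def factor():                     # Factor -> ID | NUMBER | ( Expr )
        tok = peek()
        if tok == "(":
            advance()
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected )")
            advance()
            return node
        if tok is None or tok in {"+", "-", "*", "/", ")"}:
            raise SyntaxError(f"unexpected token {tok!r}")
        return advance()

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree


tree = parse("5 * X + 3 * Y")
```

Both products end up below the sum, so the multiplications are grouped first, as the precedence levels require.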

Notes on rewritten grammar

It is more complex; more nonterminals, more productions.

It requires more steps in the derivation

But it does eliminate the ambiguity.

Ambiguous Grammar 2 If-else

The leftmost and rightmost derivations for a sentential form may differ, even in an unambiguous grammar

Classic example: the if-then-else problem

Stmt → if Expr then Stmt
     | if Expr then Stmt else Stmt
     | other stmts

Ambiguity

This sentential form has two derivations

  if Expr1 then if Expr2 then Stmt1 else Stmt2

Removing the ambiguity

To eliminate the ambiguity
  We must rewrite the grammar to avoid generating the problem
  We must associate each else with the innermost unmatched if

S withElse

Ambiguity

Removing the ambiguity
  Must rewrite the grammar to avoid generating the problem
  Match each else to innermost unmatched if

With this grammar, the example has only one derivation

  1  Stmt     → WithElse
  2           | NoElse
  3  WithElse → if Expr then WithElse else WithElse
  4           | OtherStmt
  5  NoElse   → if Expr then Stmt
  6           | if Expr then WithElse else NoElse

Intuition: a NoElse always has no else on its last cascaded else if statement

Ambiguity

if Expr1 then if Expr2 then Stmt1 else Stmt2

This binds the else controlling S2 to the inner if

  Rule  Sentential Form
  -     Stmt
  2     NoElse
  5     if Expr then Stmt
  ?     if E1 then Stmt
  1     if E1 then WithElse
  3     if E1 then if Expr then WithElse else WithElse
  ?     if E1 then if E2 then WithElse else WithElse
  4     if E1 then if E2 then S1 else WithElse
  4     if E1 then if E2 then S1 else S2

Deeper Ambiguity

Ambiguity usually refers to confusion in the CFG

Overloading can create deeper ambiguity

  a = f(17)

In many Algol-like languages, f could be either a function or a subscripted variable

Disambiguating this one requires context
  Need values of declarations
  Really an issue of type, not context-free syntax
  Requires an extra-grammatical solution (not in CFG)
  Must handle these with a different mechanism
    Step outside grammar rather than use a more complex grammar

Regular Languages and Grammars

A grammar where all productions are of the form

  A → a or A → a B, where A, B ∈ N and a ∈ T

is called right-linear or sometimes a regular grammar.

It turns out that the language generated by a right-linear grammar is a regular language.

How would you prove that?
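One standard route to a proof is to read the grammar as an NFA: each nonterminal becomes a state, A → a B becomes a transition from A to B on a, and A → a becomes a transition from A to a single extra accepting state. The sketch below simulates that NFA directly; it is an illustration of the idea, not a proof, and the function name `accepts`, the triple encoding of productions, and the use of the grammar S → aS | bT, T → c are all assumptions made for the example.

```python
def accepts(productions, start, text):
    """Simulate the NFA obtained from a grammar with productions A -> aB or A -> a.

    productions is a list of triples (A, a, B); B is None for A -> a.
    """
    FINAL = object()                  # the one extra accepting state
    current = {start}
    for ch in text:
        nxt = set()
        for lhs, terminal, target in productions:
            if lhs in current and terminal == ch:
                nxt.add(FINAL if target is None else target)
        current = nxt
    return FINAL in current


# S -> aS | bT, T -> c, which generates a*bc
RULES = [("S", "a", "S"), ("S", "b", "T"), ("T", "c", None)]
```

Since the accepting state has no outgoing transitions, a string is accepted exactly when its last symbol was produced by a production of the form A → a.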

Context Free Languages

A language L is called a context free language (CFL) if there exists a context free grammar G that generates it, i.e., L = L(G).

Left recursion

A → A α | β

Elimination of Immediate Left Recursion
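The usual textbook transformation replaces A → A α | β with A → β A' and A' → α A' | ε, where A' is a new nonterminal. The sketch below applies that rule to one nonterminal at a time; the function name, the list-of-symbols encoding, and the convention of naming the new nonterminal by appending a prime (assumed not to clash with an existing symbol) are illustrative choices.

```python
def eliminate_immediate_left_recursion(nonterminal, alternatives):
    """Rewrite A -> A alpha | beta as A -> beta A', A' -> alpha A' | epsilon.

    alternatives is a list of right-hand sides, each a list of symbols.
    Returns a dict mapping each nonterminal to its new list of alternatives.
    """
    # The alpha parts: what follows A in each left-recursive alternative.
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    # The beta parts: alternatives that do not start with A.
    others = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not recursive:
        return {nonterminal: alternatives}
    fresh = nonterminal + "'"          # assumed not to be in use already
    return {
        nonterminal: [beta + [fresh] for beta in others],
        fresh: [alpha + [fresh] for alpha in recursive] + [[]],  # [] stands for ε
    }


result = eliminate_immediate_left_recursion(
    "Expr", [["Expr", "+", "Term"], ["Expr", "-", "Term"], ["Term"]]
)
```

For the Expr productions of the expression grammar this yields Expr → Term Expr' and Expr' → + Term Expr' | - Term Expr' | ε.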

Error handling

  Error detection
  Error recovery

Fig 3.27 NFA → DFA

State {0,1,2,4,7}
  a: {3,8,6,1,2,4,7} = {1,2,3,4,6,7,8}
  b: {5,6,1,2,4,7} = {1,2,4,5,6,7}

State {1,2,3,4,6,7,8}
  a: {3,6,1,2,4,7,8}, a loop
  b: ε-clos{5,9} = {5,9,6,1,2,4,7} = {1,2,4,5,6,7,9}

State {1,2,4,5,6,7}
  a: ε-clos{3,8} = {3,8,6,1,2,4,7} = {1,2,3,4,6,7,8}
  b: ε-clos{5} = {5,6,1,2,4,7} = {1,2,4,5,6,7}, a loop

State {1,2,4,5,6,7,9}
  a: ε-clos{3,8} = {1,2,3,4,6,7,8}
  b: ε-clos{5,10} = {5,10,6,1,2,4,7} = {1,2,4,5,6,7,10}

State {1,2,4,5,6,7,10}
  a: ε-clos{3,8} = {1,2,3,4,6,7,8}
  b: ε-clos{5} = {1,2,4,5,6,7}
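The table can be reproduced mechanically with the subset construction. The sketch below assumes the NFA is the usual Thompson-construction machine for (a|b)*abb with states 0 to 10; the transition tables `EPS` and `MOVES` were written to be consistent with the state sets on this slide and should be checked against Figure 3.27 before relying on them. The function names follow the ε-closure and move operations from the previous lecture.

```python
# Assumed NFA: ε-edges and labelled edges for (a|b)*abb, states 0..10.
EPS = {0: {1, 7}, 1: {2, 4}, 3: {6}, 5: {6}, 6: {1, 7}}
MOVES = {(2, "a"): {3}, (4, "b"): {5}, (7, "a"): {8}, (8, "b"): {9}, (9, "b"): {10}}


def eps_closure(states):
    """All NFA states reachable from `states` using ε-edges only."""
    stack = list(states)
    closure = set(states)
    while stack:
        s = stack.pop()
        for t in EPS.get(s, ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)


def move(states, symbol):
    """NFA states reachable from `states` on one `symbol` edge."""
    result = set()
    for s in states:
        result |= MOVES.get((s, symbol), set())
    return result


def subset_construction(start=0, alphabet="ab"):
    """Return (start set, table) where table[T][symbol] is the next DFA state."""
    start_set = eps_closure({start})
    table = {}
    worklist = [start_set]
    while worklist:
        current = worklist.pop()
        if current in table:
            continue
        table[current] = {}
        for symbol in alphabet:
            target = eps_closure(move(current, symbol))
            table[current][symbol] = target
            if target not in table:
                worklist.append(target)
    return start_set, table


START, TABLE = subset_construction()
```

Under these assumptions the construction yields the same five DFA states and transitions as the table above.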