CDT314 FABER Formal Languages, Automata and Models of Computation Lecture 4

Preview:

DESCRIPTION

CDT314 FABER Formal Languages, Automata and Models of Computation Lecture 4 School of Innovation, Design and Engineering Mälardalen University 2012. 1. Content. - PowerPoint PPT Presentation

Citation preview

11

CDT314

FABER

Formal Languages, Automata and Models of Computation

Lecture 4

School of Innovation, Design and Engineering Mälardalen University

2012

2

Regular Expressions and Regular Languages

NFADFA Subset Construction (Delmängdskonstruktion)

FA RE State Elimination (Tillståndseliminering)

Minimizing DFA by Set Partitioning (Särskiljandealgoritmen)

Grammar: Linear grammar. Regular grammar.

Application: Compiler. Lexical Analysis.

Content

3

Two Basic Theoremsabout Finite State Automata

Based on C Busch, RPI, Models of Computation

4

1: KleeneTheoremNFADFA

Let M be NFA accepting L. Then there exists DFA M´ that accepts L as well.

For each NFA there exists corresponding DFA

that accepts the same language.

5

II: Finite Language TheoremAny finite language is regular.

Any finite language is FSA-acceptable.

Proof: every finite language is a union of single strings which are regular because described by regular expressionsExample L = {abba, abb, abab}L is expressed by the regular expression “abba|abb|abab“

FSA = Finite State Automaton = DFA/NFA

6

Some Properties of Regular Languages,

Summary

7

Properties

1L 2L

21LLConcatenation:

*1LStar:

21 LL Union:

Are RegularLanguages

For regular languages and we will prove that:

8

Regular languages are closed under

21LLConcatenation:

*1LStar:

21 LL Union:

9

1LRegular language

11 LML

1M

Single final state

NFA

2LRegular language

2M

Single final state

22 LML

NFA

10

Example

}{1 baL na

b

1M

baL 2ab

2M

11

Union (Thompson’s construction)

NFA for

1M

2M

21 LL

12

ab

ab

}{1 baL n

}{2 baL

}{}{21 babaLL n NFA for

Example

13

Concatenation (Thompson’s construction)

NFA for 21LL

1M 2M

14

Example NFA for }{}}{{21 bbaababaLL nn

ab ab

}{1 baL n}{2 baL

15

Star Operation (Thompson’s construction)

NFA for *1L

1M

*1L

16

Example

NFA for *}{*1 baL n

ab

}{1 baL n

17

Reverse of a Regular Language

18

Theorem

The reverse of a regular languageis a regular language

RL L

Proof idea

Construct NFA that accepts :RL

invert the transitions of the NFAthat accepts L

19

ProofSince is regular, there is NFA that accepts

L

Example:

baabL *a

b

ba

L

20

Invert Transitions

a

b

ba

a

b

ba

21

Make old initial state a final state

and vice versa

a

b

ba

a

b

ba

22

Add a new initial state

a

b

ba

a

b

ba

23

Resulting machine accepts RL

baabL *ababLR *

a

b

ba

RL is regular

24

Regular Expressionsand

Regular Languages

25

Theorem: Regular expressions are representation of regular languages and they are equivalent.

Regular expressions describe regular languages in formal language theory. They have the same expressive power as regular grammars.

26

Theorem - Part 1

LanguagesGenerated byRegular Expressions

RegularLanguages

1. For any regular expression the language is regular

r)(rL

27

Theorem - Part 2

LanguagesGenerated byRegular Expressions

RegularLanguages

LrL )(2. For any regular language there is a regular expression with

Lr

28

Proof - Part 1

r)(rL

For any regular expression the language is regular

Proof by induction on the size of r

29

Induction BasisPrimitive Regular Expressions: ,,NFAs

)()( 1 LML

)(}{)( 2 LML RegularLanguages

)(}{)( 3 aLaML a

(where )

30

Inductive Hypothesis

1r 2rAssume

for regular expressions and

that and are regular languages)( 1rL )( 2rL

31

Inductive StepWe will prove:

regular languages

))(()(

)()(

1

1

21

21

rLrL

rrLrrL

32

By definition of regular expressions:

)())(())(()(

)()()()()()(

11

*11

2121

2121

rLrLrLrL

rLrLrrLrLrLrrL

33

)( 1rL )( 2rLBy inductive hypothesis we know: and are regular languages

Regular languages are closed under

*1

21

21

rLrLrLrLrL union

concatenation

star

34

Therefore

** 11

2121

2121

rLrL

rLrLrrL

rLrLrrL

regularlanguages

And trivially

))(( 1rL is a regular language

35

Proof – Part 2

Lr LrL )(

For any regular language there is a regular expression with

Proof by construction of regular expression

36

LML )(

Single final state

LM

Since is regular take the NFA that accepts it

37

From construct the equivalentGeneralized Transition Graph

transition labels are regular expressions

M

Example

a

ba,

cM

a

ba

c

QED

38

Summary: Operations on Regular Expressions

RE Regular language description

a+b {a,b} (a+b)(a+b) {aa, ab, ba, bb} a* {, a, aa, aaa, …} a*b {b,ab, aab, aaab, …} (a+b)* {, a, b, aa, ab, ba, aaa, bbb…}

39

Algebraic Properties

Axiom Description r +s = s+r + is commutative (r +s)+t = r +(s+t) + is associative (rs)t = r (st) concatenation is associative r (s+t) = rs+rt (s+t)r = sr +tr

concatenation distributes over +

r = r r = r

is the identity element for concatenation

r* = ( r +)* relation between * and r** = r* * is idempotent

40

Operator Precedence

1. Kleene star 2. Concatenation 3. Union

allows dropping unnecessary parentheses.

41

Converting Regular Expression to a DFA

42

Example: From a Regular Expression to an NFA

Example : (a+b)*abb step by step construction

2a

4 5b

6

1

3

(a+b)

43

Example: From a Regular Expression to an NFA

Example : (a+b)*abb

2a

4 5b

1

6

3

0 7

(a+b)*

44

Example: From a Regular Expression to an NFA

Example : (a+b)*abb

2 a

4 5b

10

10

a

b

b

76

3

8

9

(a+b)*abb

45

Converting FA to a Regular Expression

State Elimination

http://www.math.uu.se/~salling/Movies/FA_to_RegExpression.mov

46

Add a new start (s) and a new final (f) state:• From s to the ex-starting state (1)• From each ex-final state (1,3) to f

Example

ba ,a

bb

1 2

3

a

47

ba ,a

b

b

1 2

3

a

s

f

Let’s remove state 1!Each combination of input/output to 1 will generate a new path once state 1 is gone

; ba , ; ba a; , a ;

The original:

ba ,a

bb

1 2

3

a

ba ,a

bb

1 2

3

a

48

When state 1 is gone we must be able to make all those transitions!

ba ,a

bb

1 2

3

a

s

f

ba( ); ba ,

ba,a

bba

Previous:

49

;

ba ,a

bb

1 2

3

a

s

f

ba( )

50

ba ,a

b

b

1 2

3

a

s

f

ba a; ,

ba( )a

ba( )

51

A common mistake: having several arrows between the same pair of states. Join the two arrows (by union of their regular expressions) before going on to the next step.

ba ,a

b1 2

3

a

s

f

ba( )

bba( )a

b

52

ba ,a

b1 2

3

a

s

f

a ; ba( )

a

bba( )a

53

Union again..

ba ,a

b1 2

3

a

s

f

ba( )

a

bba( )a

54

ab

2

3

s

f

ba( )

a

bba( )a

Without state 1...

55

ab

2

3

s

f

ba( )

a

ba(a

Now we repeat the same procedure for the state 2...

b

56

Following the path s-2-3 we concatenate all strings on our way…don’t forget a* in 2!

ab

2

3

s

f

ba( )

a

bba( )a

bba( ) a

57

When 2 is removed, the path 3 - 2 –3 has also to be preserved, so we concatenate 3-2 string, take a* loop and go back 2-3 with string b!

ab 2

3

s

f

ba( )

bba(a

a

bba( ) a

( bba( )a ba)

58

3

s

f

This is how the FA looks like without state 2:

a

bba( ) a

( bba( )a ba)

59

3

s

f

Finally we remove state 3…

a

bba( ) a

( bba( )a ba)

60

3

s

f

...so we concatenate strings s-3, loop in 3 and 3-f

)))((( babbaa

babbaa ))((

baba *)(

a

baba )( )( a

61

Now we can omit state 3!

s

f

From s we have two choices, empty string or the long expression

OR is represented by union , as usually

)))((( babbaababa )( )( a

62

So union the arrows...

s f

...and we are done!

)))((( babbaababa )( )( a

63

Converting FA to a Regular Expression-

An Algorithm for State Elimination

64

• We expand our notion of NFA- to allow transitions on arbitrary regular expressions, not simply single symbols or .

• Successively eliminate states, replacing transitions that enter and leave a state with a more complicated regular expression, until eventually there are only two states left: a start state, an accepting state, and a single transition connecting them, labeled with a regular expression.

• The resulting regular expression is then our answer.

65

• To begin with, the automaton should have a start state that has no transitions into it (including self-loops), and which is not accepting.

• If your automaton does not obey this property, add a new, non-accepting start state s, and add a -transition from s to the original start state.

• The automaton should also have a single accepting final state with no transitions leaving it, and no self-loops.

• If your automaton does not have it, add a new final state q, change all other accepting states to non-accepting, and add -transitions from them to q.

This change clearly doesn't change the language accepted by the automaton.

66

Repeat the following steps, which eliminate a state:

1. Pick a non-start, non-accepting state q to eliminate. The state q will have i transitions in and j transitions out. Each will be labeled with a regular expression. For each of the ij combinations of transitions into and out of q, replace:

A

B

p qC

r

withCBA *

p r

And delete state q.

67

2. If several transitions go between a pair of states, replace them with a single transition that is the union of the individual transitions.

E.g. replace: A

B

p r

with

p rBA

68

Example ba

a

b1 3 4a2

b

abab *a

b1 3 4

baabab *)*( 1 4

69

N.B. common mistake!

ba,

a b

*)( bameans

See example on s 46 and 47 in Sallings book

i.e.

NOT *)*( ba

70

Minimizing DFAby Set Partitioning

http://www.math.uu.se/~salling/Movies/SubsetConstruction.mov

71

Minimization of DFA

The deterministic finite automata are not always the smallest possible accepting the source language.

There may be states with the same "acceptance behavior". This applies to states p and q, if for all input words, the automaton always or never moves to a final state from p and q.

72

State Reduction by Set Partitioning (Särskiljandealgoritmen)

The set partitioning technique is similar to one used for partitioning people into groups based on their responses to questionnaire.

The following slides show the detailed steps for computing equivalent state sets of the starting DFA and constructing the corresponding reduced DFA.

73

bb a

b b

a

a

b

ab

a

4

3

1

2

a

5

0

b

aa, b

3

1,2

a

b4,5

0a, b

Starting DFA

Reduced DFA

74

Step 0: Partition the states into two groups accepting and non-accepting.

State Reduction by Set Partitioning

{ 3, 4, 5 } { 0, 1, 2 }

P1 P2

bb a

b b

aa

b

ab

a

4

31

2

a

5

0

75

Step 1: Get the response of each state for each input symbol. Notice that States 3 and 0 show different responses from the ones of the other states in the same set.

P1 P2

p1 p1 p1 p2 p1 p1

a a {3, 4, 5 } {0,1, 2 }b b

p2 p1 p1 p2 p1 p1

State Reduction by Set Partitioning

bb a

b b

aa

b

ab

a

4

31

2

a

5

0

P1P2

76

P11 P12 P21 P22

p11 p11 p12 p12

a a {4, 5} {3} {1, 2} {0}

b b p11 p11 p11 p11

Step 2: Partition the sets according to the responses, and go to Step 1 until no partition occurs.

No further partition is possible for the sets P11 and P21 . So the final partition results are as follows:

{4, 5} {3} {1, 2} {0}

77

{4, 5} {3} {1, 2} {0}

Minimized DFA consists of four states of the final partition, and the transitions are the one corresponding to the starting DFA.

b

aa, b

3

1,2

a

b4,5

0a, b

bb a

b b

aa

b

ab

a

4

31

2

a

5

0

b

aa, b

3

1,2

a

b4,5

0a, b

Minimized DFA Starting DFA

78

DFA Minimization Algorithm

The algorithm

P { F, {Q-F}}while ( P is still changing) T { } for each set s P for each partition s by into s1, s2, …, sk

T T s1, s2, …, sk if T P then P T

Why does this work?Partition P 2Q (power set)Start off with 2 subsets of Q

{F} and {Q-F}

While loop takes PiPi+1 by splitting one or more sets

Pi+1 is at least one step closer to the partition with |Q | sets

Maximum of |Q | splits

Partitions are never combinedInitial partition ensures that final

states are intact

This is a fixed-point algorithm!

79

Set Partition Algorithm (Salling book)

Example 2.39 We apply step by step set partition algorithm on the following DFA: },{ ba

a

ba

a

a

b

b

21

35

4

6

ba ab

Two strings x,y are distinguished by language (DFA) if there is a (distinguishing) string z such as that only one of strings xz, yz ends in accepting state (belongs to language).

80

}6,5,4,3,2,1{

}6,5,4,2,1{ }3{Non-accepting: Accepting:

What happens when we read ?a

}3{

}3{}5{

}5,4,2{ }6,1{

From 1 and 6 by a we reach 3 which is accepting. They form a special group.

}6,1{}4,2{

What happens when we read ?b

a

ba

a

a

b

b

21

35

4

6

ba ab

We search for strings that are distinguished by DFA!

Distinguishing string: ora b

From 5 we end in 6 by b, leaving its group.

81

}3{}5{ }6,1{}4,2{The minimal automaton has four states:

a

a

ab

b

}6,1{ b}3{

}5{}4,2{

a

a

ba

a

a

b

b

21

35

4

6

ba ab

82

The Chomsky Hierarchy

83

Regular Languages

}{ nnba }{ RwwContext-Free LanguagesNon-regular languages

}0:{ ! nan}0,:{ lncba lnln

84

Some Additional Examplesfor your excersize

85

NFA N for (a+b)*abb

1start

2

1010

a 3

4 5

0 6

b

87 9 a b b

Example of subset construction for NFA DFA

86

Translation table for DFA

STATEINPUT SYMBOL

a b

ABCDE

BBBBB

CDCEC

Example of subset construction for NFA DFA

87

Result of applying the subset construction of NFA for (a+b)*abb.

Astart

Ba

Db b

10E

b

a a

a

bC

b

a

Example of subset construction for NFA DFA

88

Another Example of State Elimination

ba,a

b

b0q 1q 2q

b

ba a

b

b0q 1q 2q

b

89

ba a

b

b0q 1q 2q

b

0q 2q

babb*

)(* babb

Another Example of State Elimination

90

0q 2q

babb*

)(* babb

*)(**)*( bbabbabbr

LMLrL )()(

Another Example of State Elimination

Resulting Regular Expression

91

In General

Removing states

iq q jqa b

cde

iq jq

dae* bce*dce*

bae*

92

Obtaining the final regular expression

0q fq

1r

2r

3r4r

*)*(* 213421 rrrrrrr

LMLrL )()(

93

Example: From an NFA to a DFA

states a bA B CB B DC B CD B EE B C

A B

C

D Ea b b

ab b

b

aa

a

94

Example: From an NFA to a DFA

states a bA B A

B B D

D B E

E B A

A B D Ea b b

b

a

a

a

b

95

Example: DFA Minimization

s0

as1

b

s3

bs4

s2

a

b

b

a

a

a

b

s0 , s2

as1

b

s3

bs4

b

a

a

a

b

final state

Current Partition Split on a Split on bP0 {s4} {s0, s1, s2, s3} none {s0, s1, s2} {s3}P1 {s4}{s3}{s0, s1, s2} none {s0, s2}{s1}P2 {s4}{s3}{s1}{s0, s2} none none

96

Example: DFA Minimization

What about a ( b + c )* ?

First, the subset construction:

q0 q1 a

q4 q5 b

q6 q7 c

q3 q8 q2 q9

-closure(move(s,*))NFA states a b c

s0 q0 q1, q2, q3, q4, q6, q9

none none

s1 q1, q2, q3,q4, q6, q9

none q5, q8, q9,q3, q4, q6

q7, q8, q9,q3, q4, q6

s2 q5, q8, q9,q3, q4, q6

none s2 s3

s3 q7, q8, q9,q3, q4, q6

none s2 s3

s3

s2

s0 s1

c

ba

b

b

c

c

Final states

97

Example: DFA MinimizationThen, apply the minimization algorithm

To produce the minimal DFAs3

s2

s0 s1

c

ba

b

b

c

c

Split onCurrent Partition a b c

P0 { s1, s2, s3} {s0} none none none

s0 s1

a

b , c

final states

98

Grammars

99

GrammarsGrammars express languages

Example: the English language

verbpredicate

nounarticlephrasenoun

predicatephrasenounsentence

_

_

100

barksverbsingsverb

dognounbirdnoun

thearticleaarticle

101

A derivation of “the bird sings”:

ñáñáñáÞñáñáñáÞ

ñáñáñáÞñáñáñáÞ

ñáñáñáÞñáñáÞñá

birdtheverbbirdtheverbnounthe

verbnounarticlepredicatenounarticlepredicatephrasenounsentence

sings

_

102

A derivation of “a dog barks”:

barksdogaverbdogaverbnouna

verbnounarticleverbphrasenounpredicatephrasenounsentence

Þ

Þ

Þ

Þ

Þ

Þ

__

103

The language of the grammar:

}

{

"ingssogdthe",barks"ogdthe"

,"ingssogda",barks"ogda",sings"birdthe",barks"birdthe",sings"birda"

,barks"birda"L

"ingssogdhet","barksogdhet"

,"ingssogda","barksogda","ingssbirdhet","barksbirdhet"

,"ingssbirda","barksbirda"{L

}

}

104

Notation

dognoun

birdnoun

Non-terminal (Variable)

TerminalProduction rule

105

Example

Derivation of sentence:

SaSbS

ab

abaSbS ÞÞ

aSbS S

Grammar:

106

aabbaaSbbaSbS ÞÞÞ

aSbS S

aabb

SaSbSGrammar:

Derivation of sentence

107

Other derivations

aaabbbaaaSbbbaaSbbaSbS ÞÞÞÞ

aaaabbbbaaaaSbbbbaaaSbbbaaSbbaSbS

ÞÞÞÞÞ

108

The language of the grammar

SaSbS

}0:{ nbaL nn

109

Formal Definition

Grammar PSTVG ,,,

:V Set of variables

:T Set of terminal symbols

:S Start variable

:P Set of production rules

110

ExampleGrammar

SaSbS

G

}{SV },{ baT

},{ SaSbSP

PSTVG ,,,

111

Sentential Form A sentence that contains variables and terminals

Example

aaabbbaaaSbbbaaSbbaSbS ÞÞÞÞ

sentential forms Sentence(sats)

112

We write:

Instead of:

aaabbbS*Þ

aaabbbaaaSbbbaaSbbaSbS

ÞÞÞÞ

113

nww*

1 Þ

nwwww ÞÞÞÞ 321

In general we write

if

By default ww*Þ( )

114

Example

SaSbS

aaabbbS

aabbS

abS

S

*

*

*

*

Þ

Þ

Þ

Þ

Grammar Derivations

115

baaaaaSbbbbaaSbbÞ

aaSbbSÞ

SaSbS

GrammarExample

Derivations

116

Another Grammar Example

AaAbAAbS

Derivations

aabbbaaAbbbaAbbSabbaAbbAbS

bAbS

ÞÞÞÞÞÞ

ÞÞ

GGrammar

117

More Derivations

aaaabbbbbaaaaAbbbbbaaaAbbbbaaAbbbaAbbAbS

ÞÞÞÞÞÞ

bbaS

bbbaaaaaabbbbS

aaaabbbbbS

nn

Þ

Þ

Þ

118

The Language of a Grammar

For a grammar with start variable G S

}:{)( wSwGLÞ

String of terminals

119

ExampleFor grammar

AaAbAAbS

}0:{)( nbbaGL nn

Since bbaS nnÞ

G

120

Notation

AaAbA

|aAbA

thearticleaarticle

theaarticle |

121

Linear Grammars

122

Linear Grammars

Grammars with at most one variable (non-terminal) at the right side of a production

AaAbAAbS

SaSbS

Examples:

123

A Non-Linear Grammar

bSaSaSbS

SSSS

Grammar G

)}()(:{)( wnwnwGL ba

124

A Linear Grammar

Grammar

AbBaBAAS

|

}0:{)( nbaGL nn

G

125

Right-Linear Grammars

All productions have form: xBA

xAor

aSabSS

Example

126

Left-Linear Grammars

All productions have form BxA

aBBAabA

AabS

|

xAor

Example

127

Regular Grammars

A grammar is a regular grammar if and only if it is right-linear or left-linear.

128

Regular Grammars Generate

Regular Languages

129

Theorem

LanguagesGenerated byRegular Grammars

RegularLanguages

130

Theorem - Part 1

LanguagesGenerated byRegular Grammars

RegularLanguages

Any regular grammar generatesa regular language

131

Theorem - Part 2

Any regular language is generated by a regular grammar

LanguagesGenerated byRegular Grammars

RegularLanguages

132

Proof – Part 1

The language generated by any regular grammar is regular

)(GLG

LanguagesGenerated byRegular Grammars

RegularLanguages

133

The case of Right-Linear Grammars

Let be a right-linear grammar

We will prove: is regular

Proof idea We will construct NFA with

M)()( GLML

G

)(GL

134

Grammar is right-linearG

Example

aBbBBaaABaAS

|

|

135

Construct NFA such that every state is a grammar variable:

M

aBbBBaaABaAS

|

|

A

B

S specialfinal stateFV

136

Add edges for each production:

S FV

A

B

a

aBbBBaaABaAS

|

|

137

S FV

A

B

a

aBbBBaaABaAS

|

|

138

S FV

A

B

a

a

a

aBbBBaaABaAS

|

|

139

S FV

A

B

a

a

a

baBbB

BaaABaAS

|

|

140

S FV

A

B

a

a

a

b

a

aBbBBaaABaAS

|

|

141

aaabaaaabBaaaBaAS ÞÞÞÞ

S FV

A

B

a

a

a

b

a

142

S FV

A

B

a

a

a

b

a

abBBBaaABaAS

|

|

GM GrammarNFA

abaaaabGLML

**)()(

143

In GeneralA right-linear grammar

has variables:

and productions:

G

,,, 210 VVV

jmi VaaaV 21

mi aaaV 21or

144

We construct the NFA such that:

each variable corresponds to a node:

M

iV

0V 1V 2V FV

specialfinal state

….

145

For each production:

we add transitions and intermediate nodes

jmi VaaaV 21

iV jV………

1a 2a ma

Example:http://www.cs.duke.edu/csed/jflap/tutorial/grammar/toFA/index.htmlConvert Right-Linear Grammar to FA by JFLAP

146

Example

M

)()( MLGL

04

5193

452

48433421

23110

|

||

aVaVaV

VaVVaaaVaaV

VaVaV

0VFV

1V

2V

3V

4V

1a

3a3a

4a

8a

2a 4a5a

9a5a

9a

147

The case of Left-Linear Grammars

Let be a left-linear grammar

We will prove: is regular

Proof idea We will construct a right-linear grammar with

G

)(GL

G RGLGL )()(

148

Since is left-linear grammarthe productions look like:

G

kaaBaA 21

kaaaA 21

149

Construct right-linear grammar G

In :G kaaBaA 21

In :G BaaaA k 12

vBA

BvA R

150

Construct right-linear grammar G

In :G kaaaA 21

In :G 12aaaA k

vA

RvA

151

It is easy to see that:

Since is right-linear, we have:

RGLGL )()(

)(GL RGL )(

G

)(GLRegularLanguage

RegularLanguage

RegularLanguage

152

Proof - Part 2

Any regular language is generated by some regular grammar

LG

LanguagesGenerated byRegular Grammars

RegularLanguages

153

Proof idea

Any regular language is generated by some regular grammar

LG

Construct from a regular grammar such that

M G)()( GLML

Since is regularthere is an NFA such that

LM )(MLL

154

Example

*)*(* abbababL)(MLL

a

b

a

b

M1q 2q

3q

0q

155

a

b

a

b

M0q 1q 2q

3q

3

13

32

21

11

10

qqqbqqaqqbqqaqq

G

LMLGL )()(

Convert to a right-linear grammarM

156

In General

For any transition:aq p

Add production: apq

variable terminal variable

157

For any final state: fq

Add production: fq

158

Since is right-linear grammar

is also a regular grammar with

G

LMLGL )()(

G

159

Regular GrammarsA regular grammar is any right-linear or left-linear grammar

aSabSS

aBBAabA

AabS

|

1G 2G

Examples

160

ObservationRegular grammars generate regular languages

aSabSS

aabGL *)()( 1

aBBAabA

AabS

|

*)()( 2 abaabGL

1G 2GExamples

161

Regular Languages

Chomsky’s Language Hierarchy

Non-regular languages

162

Application: CompilerLexical Analysis

(Lexikalisk analys i Kompilatorteori)

163

What is a compiler?A compiler is program that translates a source

language into an equivalent target language

while (i > 3) { a[i] = b[i]; i ++}

mov eax, ebxadd eax, 1cmp eax, 3jcc eax, edx

C program

assemblyprogram

compiler does this

164

What is a compiler?

class foo { int bar; ...}

struct foo { int bar; ...}

Java program

compiler does this

C program

165

What is a compiler?

class foo { int bar; ...}

........

.........

........

Java program

compiler does this

Java virtual machine program

166

Phases of a compiler

Lexical AnalyzerScanner

Parser

Semantic Analyzer

Source Program

Syntax Analyzer

Tokens

Parse Tree

Abstract Syntax Tree withattributes

167

Compilation in a Nutshell

Source code(character stream)

Lexical analysis

Parsing

Token stream

Abstract syntax tree(AST)

Semantic Analysis

if (b == 0) a = b;

if ( b ) a = b ;0==

if==

b 0=

a b

if==

int b int 0=

int alvalue

int b

booleanDecorated AST

int ;

;

168

Stages of Analysis

Lexical Analysis – breaking the input up into individual words/tokens

Syntax Analysis – parsing the phrase structure ofthe program

Semantic Analysis – calculating the program’s meaning

169

Lexical Analyzer

Input is a stream of charactersProduces a stream of names, keywords &

punctuation marksDiscards white space and comments

170

Recognize “tokens” in a program source code.The tokens can be variable names, reserved

words, operators, numbers, … etc.Each kind of token can be specified as an RE,

e.g., a variable name is of the form [A-Za-z][A-Za-z0-9]*.

We can then construct an -NFA to recognize it automatically.

Lexical Analyzer

171

By putting all these -NFA’s together, we obtain one which can recognize all different kinds of tokens in the input string.

We can then convert this -NFA to NFA and then to DFA, and implement this DFA as a deterministic program - the lexical analyzer.

Lexical Analyzer

172

RE NFA DFA Minimal DFA

Thompson’s Contruction

Subset Contruction

Hopcroft Minimization

Recommended