The Coke Vending Machine
⢠Vending machine dispenses soda for $0.45 a pop.
⢠Accepts only dimes ($0.10) and quarters ($0.25).
⢠Eats your money if you donât have correct change.
⢠Youâre told to âimplementâ this functionality.
Vending Machine Java Code
Soda vend(){int total = 0, coin;while (total != 45){
receive(coin);if ((coin==10 && total==40) ||(coin==25 && total>=25))
reject(coin);else
total += coin;}return new Soda();
} Overkill?!?
Why this was overkillâŚ
⢠Vending machines have been around long before computers.
â Or Java, for that matter.
⢠Donât really need intâs.
â Each int introduces 232
possibilities.
⢠Donât need to know how to add integers to model vending machineâ total += coin.
⢠Java grammar, if-then-else, etc. complicate the essence.
Vending Machine âLogicsâ
Why was this simpler than Java Code?
⢠Only needed two coin types âDâ and âQââ symbols/letters in alphabet
⢠Only needed 7 possible current total amounts
â states/nodes/vertices
⢠Much cleaner and more aesthetically pleasing than Java lingo
⢠Next: generalize and abstractâŚ
Alphabets and Strings
⢠Definitions:
⢠An alphabet is a set of symbols (characters, letters).
⢠A string (or word) over is a sequence of symbols.
â The empty string is the string containing no symbols at all, and is denoted by
.
â The length of the string is the number of symbols, e.g. || = 0.
Questions:
1) What is ?
2) What are some good or bad strings?
3) What does signify here?
Finite Automaton Example
0
1
0
1
1 1 0 0 1
01
sourceless
arrow
denotes
âstartâ
double
circle
denotes
âacceptâ
input put
on tape
read left
to right
What strings are âacceptedâ?
Formal Definition of a Finite Automaton
A finite automaton (FA) is a 5-tuple (đ, ÎŁ, đż, đ0, đš),where
⢠Q is a finite set called the states
⢠is a finite set called the alphabet
⢠đż: đ Ă ÎŁ â đ is the transformation function
⢠đ0 â đ is the start state
⢠đš â đ is the set of accept states (a.k.a. final states).
Formal Definition of a Finite Automaton
A finite automaton (FA) is a 5-tuple (đ, ÎŁ, đż, đ0, đš),where
⢠Q is a finite set called the states
⢠is a finite set called the alphabet
⢠đż: đ Ă ÎŁ â đ is the transformation function
⢠đ0 â đ is the start state
⢠đš â đ is the set of accept states (a.k.a. final states).
⢠The âinput stringâ and the tape containing it are implicit in the definition.The definition only deals with the static view.
Further explaining is needed for understanding how FAâs interact withtheir input.
Accept States
⢠How does an FA operate on strings?
Imagine an auxiliary tape containing the string.
The FA reads the tape from left to right with each new character
causing the FA to go into another state.
When the string is completely read, the string is accepted
depending on whether the FAâs final state was an accept state.
Accept States
⢠How does an FA operate on strings?
Imagine an auxiliary tape containing the string.
The FA reads the tape from left to right with each new character
causing the FA to go into another state.
When the string is completely read, the string is accepted
depending on whether the FAâs final state was an accept state.
⢠Definition: A string u is accepted by an automaton M iff (if and only if )the path starting at q
0 which is labeled by u ends in an accept state.
Accept States
⢠How does an FA operate on strings?
Imagine an auxiliary tape containing the string.
The FA reads the tape from left to right with each new character
causing the FA to go into another state.
When the string is completely read, the string is accepted
depending on whether the FAâs final state was an accept state.
⢠Definition: A string u is accepted by an automaton M iff (if and only if )the path starting at q
0 which is labeled by u ends in an accept state.
⢠This definition is somewhat informal. To really define what it means for a
string to label a path, you need to break u up into its sequence of
characters and apply d repeatedly, keeping track of states.
Language
⢠Definition:
The language accepted by an finite automaton M is the set of all strings
which are accepted by M. The language is denoted by L(M).
We also say that M recognizes L(M), or that M accepts L(M).
⢠Think of all the possible ways of getting from the start to any accept state.
Language
⢠Definition:
The language accepted by an finite automaton M is the set of all strings
which are accepted by M. The language is denoted by L(M).
We also say that M recognizes L(M), or that M accepts L(M).
⢠Think of all the possible ways of getting from the start to any accept state.
⢠We will eventually see that not all languages can be described as the
accepted language of some FA.
⢠A language L is called a regular language if there exists a FA M that
recognizes the language L.
Designing Finite Automata
⢠This is essentially a creative processâŚ
⢠âYou are the automatonâ methodâ Given a language (for which we must design an automaton).
â Pretending to be automaton, you receive an input string.
â You get to see the symbols in the string one by one.
â After each symbol you must determine whether string
seen so far is part of the language.
â Like an automaton, you donât see the end of the string,so you must always be ready to answer right away.
⢠Main point: What is crucial, what defines the language?!
Find the automata forâŚ
1) = {0,1},
Language consists of all strings with odd number of ones.
2) = {0,1},
Language consists of all strings with substring â001â,for example 100101, but not 1010110101.
More examples in the book and in the exercisesâŚ
Definition of Regular Language
⢠Recall the definition of a regular language:
A language L is called a regular language if there exists
a FA M that recognizes the language L.
⢠We would like to understand what types of languages are regular.
Languages of this type are amenable to super-fast recognition.
Definition of Regular Language
⢠Recall the definition of a regular language:
A language L is called a regular language if there exists
a FA M that recognizes the language L.
⢠We would like to understand what types of languages are regular.
Languages of this type are amenable to super-fast recognition.
⢠Are the following languages regular?
â Unary prime numbers: { 11, 111, 11111, 1111111, 11111111111, ⌠}= {1
2, 1
3, 1
5, 1
7, 1
11, 1
13, ⌠} = { 1p| p is a prime number }
â Palindromic bit strings: {, 0, 1, 00, 11, 000, 010, 101, 111, âŚ}
Finite Languages
⢠All the previous examples had the following property in common:
infinite cardinality
⢠Before looking at infinite languages, we should look at finite languages.
Finite Languages
⢠All the previous examples had the following property in common:
infinite cardinality
⢠Before looking at infinite languages, we should look at finite languages.
⢠Question:
Is the singleton language containing one string regular?
For example, is the language {banana} regular?
Languages of Cardinality 1
⢠Answer: Yes.
Languages of Cardinality 1
⢠Answer: Yes.
⢠Question: Huh? Whatâs wrong with this automaton?!?What if the automation is in state q
1and reads a âbâ?
Languages of Cardinality 1
⢠Answer: Yes.
⢠Question: Huh? Whatâs wrong with this automaton?!?What if the automation is in state q
1and reads a âbâ?
⢠Answer:
This a first example of a nondeterministic FA. The difference between a
deterministic FA (DFA) and a nondeterministic FA (NFA) is that every DFA
state has one exiting transition arrow for each symbol of the alphabet.
Languages of Cardinality 1
⢠Answer: Yes.
⢠Question: Huh? Whatâs wrong with this automaton?!?What if the automation is in state q
1and reads a âbâ?
⢠Answer:
This a first example of a nondeterministic FA. The difference between a
deterministic FA (DFA) and a nondeterministic FA (NFA) is that every DFA
state has one exiting transition arrow for each symbol of the alphabet.
⢠Question: Is there a way of fixing it and making it deterministic?
Arbitrary Finite Number of Finite Strings
⢠Theorem: All finite languages are regular.
Arbitrary Finite Number of Finite Strings
⢠Theorem: All finite languages are regular.
⢠Proof:
One can always construct a tree whose leaves are word-ending.
Make word endings into accept states, add a fail sink-state and
add links to the fail state to finish the construction.
Since thereâs only a finite number of finite strings,the automaton is finite.
Arbitrary Finite Number of Finite Strings
⢠Theorem: All finite languages are regular.
⢠Proof:
One can always construct a tree whose leaves are word-ending.
Make word endings into accept states, add a fail sink-state and
add links to the fail state to finish the construction.
Since thereâs only a finite number of finite strings,the automaton is finite.
⢠Example for {banana, nab, ban, babba}:
ba a
b
a
n b
a
n
b
an
Regular Operations
⢠You may have come across the regular operations when doing advanced
searches utilizing programs such as emacs, egrep, perl, python, etc.
⢠There are four basic operations we will work with:
â Union
â Concatenation
â Kleene-Star
â Kleene-Plus (which can be defined using the other three)
Regular Operations â Summarizing Table
Operation Symbol UNIX version Meaning
Union ⪠| Match one of the patterns
Concatenation implicit in UNIX Match patterns in sequence
Kleene-star * * Match pattern 0 or more times
Kleene-plus + + Match pattern 1 or more times
Regular Operations matters!
Regular operations: Union
⢠In UNIX, to search for all lines containing vowels in a text one could use
the command
â egrep -i `a|e|i|o|uââ Here the pattern âvowelâ is matched by any line containing a vowel.â A good way to define a pattern is as a set of strings, i.e. a language.
The language for a given pattern is the set of all strings satisfying the
predicate of the pattern.
Regular operations: Union
⢠In UNIX, to search for all lines containing vowels in a text one could use
the command
â egrep -i `a|e|i|o|uââ Here the pattern âvowelâ is matched by any line containing a vowel.â A good way to define a pattern is as a set of strings, i.e. a language.
The language for a given pattern is the set of all strings satisfying the
predicate of the pattern.
⢠In UNIX, a pattern is implicitly assumed to occur as a substring of the
matched strings. In our course, however, a pattern needs to specify
the whole string, not just a substring.
Regular operations: Union
⢠In UNIX, to search for all lines containing vowels in a text one could use
the command
â egrep -i `a|e|i|o|uââ Here the pattern âvowelâ is matched by any line containing a vowel.â A good way to define a pattern is as a set of strings, i.e. a language.
The language for a given pattern is the set of all strings satisfying the
predicate of the pattern.
⢠In UNIX, a pattern is implicitly assumed to occur as a substring of the
matched strings. In our course, however, a pattern needs to specify
the whole string, not just a substring.
⢠Computability: Union is exactly what we expect.
If you have patterns A = {aardvark}, B = {bobcat}, C = {chimpanzee},
the union of these is AB C = {aardvark, bobcat, chimpanzee}.
Regular operations: Concatenation
⢠To search for all consecutive double occurrences of vowels, use:
â egrep -i `(a|e|i|o|u)(a|e|i|o|u)ââ Here the pattern âvowelâ has been repeated. Parentheses have been
introduced to specify where exactly in the pattern the concatenation is
occurring.
Regular operations: Concatenation
⢠To search for all consecutive double occurrences of vowels, use:
â egrep -i `(a|e|i|o|u)(a|e|i|o|u)ââ Here the pattern âvowelâ has been repeated. Parentheses have been
introduced to specify where exactly in the pattern the concatenation is
occurring.
⢠Computability: Consider the previous result:
L = {aardvark, bobcat, chimpanzee}.
When we concatenate L with itself we obtain:
LL = {aardvark, bobcat, chimpanzee} {aardvark, bobcat, chimpanzee} =
{aardvarkaardvark, aardvarkbobcat, aardvarkchimpanzee,
bobcataardvark, bobcatbobcat, bobcatchimpanzee,
chimpanzeeaardvark, chimpanzeebobcat, chimpanzeechimpanzee}
Regular operations: Kleene-*
⢠We continue the UNIX example: now search for lines consisting purely of
vowels (including the empty line):
â egrep -i `^(a|e|i|o|u)*$ââ Note: ^ and $ are special symbols in UNIX regular expressions which
respectively anchor the pattern at the beginning and end of a line.
The trick above can be used to convert any Computability regular
expression into an equivalent UNIX form.
Regular operations: Kleene-*
⢠We continue the UNIX example: now search for lines consisting purely of
vowels (including the empty line):
â egrep -i `^(a|e|i|o|u)*$ââ Note: ^ and $ are special symbols in UNIX regular expressions which
respectively anchor the pattern at the beginning and end of a line.
The trick above can be used to convert any Computability regular
expression into an equivalent UNIX form.
⢠Computability: Suppose we have a language B = {ba, na}.
Question: What is the language B*?
Regular operations: Kleene-*
⢠We continue the UNIX example: now search for lines consisting purely of
vowels (including the empty line):
â egrep -i `^(a|e|i|o|u)*$ââ Note: ^ and $ are special symbols in UNIX regular expressions which
respectively anchor the pattern at the beginning and end of a line.
The trick above can be used to convert any Computability regular
expression into an equivalent UNIX form.
⢠Computability: Suppose we have a language B = {ba, na}.
Question: What is the language B*?
⢠Answer: B * = { ba,na }* = {, ba, na, baba, bana, naba, nana,
bababa, babana, banaba, banana, nababa, nabana, nanaba, nanana,
babababa, bababana, ⌠}
Regular operations: Kleene-+
⢠Kleene-+ is just like Kleene-* except that the pattern is forced to
occur at least once.
⢠UNIX: search for lines consisting purely of vowels (not including the
empty line):
â egrep -i `^(a|e|i|o|u)+$â
Regular operations: Kleene-+
⢠Kleene-+ is just like Kleene-* except that the pattern is forced to
occur at least once.
⢠UNIX: search for lines consisting purely of vowels (not including the
empty line):
â egrep -i `^(a|e|i|o|u)+$â
⢠Computability:
Suppose we have a language B = {ba, na}.
What is B+and how does it defer from B*?
Regular operations: Kleene-+
⢠Kleene-+ is just like Kleene-* except that the pattern is forced to
occur at least once.
⢠UNIX: search for lines consisting purely of vowels (not including the
empty line):
â egrep -i `^(a|e|i|o|u)+$â
⢠Computability:
Suppose we have a language B = {ba, na}.
What is B+and how does it defer from B*?
B+= {ba, na}
+ = {ba, na, baba, bana, naba, nana, bababa, babana,
banaba, banana, nababa, nabana, nanaba, nanana, babababa,
bababana, ⌠}
The only difference is the absence of
Closure of Regular Languages
⢠When applying regular operations to regular languages, regular languages
result. That is, regular languages are closed under the operations of
union, concatenation, and Kleene-*.
⢠Goal: Show that regular languages are closed under regular operations.
In particular, given regular languages L1
and L2, show:
1. L1 L
2 is regular,
2. L1 L
2 is regular,
3. L1* is regular.
⢠No.âs 2 and 3 are deferred until we learn about NFAâs.⢠However, No. 1 can be achieved immediately.
Union Example
⢠Problem: Draw the FA for
L = { x {0,1}* | |x|=even or x ends with 11}
Letâs start by drawing M1
and M2,
the automaton recognizing L1
and L2
⢠L1
= { x {0,1}* | x has even length}
⢠L2
= { x {0,1}* | x ends with 11 }
Cartesian Product Construction
⢠We want to construct a finite automaton M that recognizes
any strings belonging to L1
or L2.
⢠Idea: Build M such that it simulates both M1
and M2
simultaneously
and accept if either of the automatons accepts
Cartesian Product Construction
⢠We want to construct a finite automaton M that recognizes
any strings belonging to L1
or L2.
⢠Idea: Build M such that it simulates both M1
and M2
simultaneously
and accept if either of the automatons accepts
⢠Definition: The Cartesian product of two sets A and B,denoted by đ´ Ă B, is the set of all ordered pairs (a,b)
where aA and bB.
Formal Definition
⢠Given two automatađ1 = (đ1, ÎŁ, đż1, đ1, đš1) and đ2 = (đ2, ÎŁ, đż2, đ2, đš2)
⢠Define the unioner of M1 and M2 by:đ⪠= (đ1 Ă đ2, ÎŁ, đż1 Ă đż2, (đ1, đ2), đšâŞ)
- where the accept state đ1, đ2 is the combined start stateof both automata
- where đšâŞ is the set of ordered pairs in đ1 Ă đ2with at least onestate an accept state. That is: đšâŞ = đ1 Ă đš2 ⪠đš1 Ă đ2
- where the transition function d is defined asđż đ1, đ2 , đ = đż1 đ1, đ , đż2 đ2, đ = đż1 Ă đż2
Union Example: L1L
2
⢠When using the Cartesian Product Construction:
01
0 0
0
0
0
11
1
11
Other constructions: Intersector
⢠Other constructions are possible, for example an intersector:
⢠Accept only when both ending states are accept states. So the onlydifference is in the set of accept states. Formally the intersector ofM1 and M2 is given byđ⊠= đ1 Ă đ2, ÎŁ, đż1 Ă đż2, đ0,1, đ0,2 , đšâŠ , where đšâŠ = đš1 Ă đš2.
Other constructions: Intersector
⢠Other constructions are possible, for example an intersector:
⢠Accept only when both ending states are accept states. So the onlydifference is in the set of accept states. Formally the intersector ofM1 and M2 is given byđ⊠= đ1 Ă đ2, ÎŁ, đż1 Ă đż2, đ0,1, đ0,2 , đšâŠ , where đšâŠ = đš1 Ă đš2.
(b,y)(b,x)
(a,x) (a,y) (a,z)
(b,z)
01
0 0
0
0
0
11
1
11
Other constructions: Difference
⢠The difference of two sets is defined by A - B = {x A | x B}
⢠In other words, accept when first automaton accepts andsecond does not
đâ = (đ1 Ă đ2, ÎŁ, đż1 Ă đż2, đ0,1, đ0,2 , đšâ),where đšâ = đš1 Ă đ2 â đ1 Ă đš2
Other constructions: Difference
⢠The difference of two sets is defined by A - B = {x A | x B}
⢠In other words, accept when first automaton accepts andsecond does not
đâ = (đ1 Ă đ2, ÎŁ, đż1 Ă đż2, đ0,1, đ0,2 , đšâ),where đšâ = đš1 Ă đ2 â đ1 Ă đš2
(b,y)(b,x)
(a,x) (a,y) (a,z)
(b,z)
01
0 0
0
0
0
11
1
11
Other constructions: Symmetric difference
⢠The symmetric difference of two sets A, B is AB =AB - AB⢠Accept when exactly one automaton accepts:
đâ = (đ1 Ă đ2, ÎŁ, đż1 Ă đż2, đ0,1, đ0,2 , đšâ), where đšâ = đšâŞ â đšâŠ
Other constructions: Symmetric difference
⢠The symmetric difference of two sets A, B is AB =AB - AB⢠Accept when exactly one automaton accepts:
đâ = (đ1 Ă đ2, ÎŁ, đż1 Ă đż2, đ0,1, đ0,2 , đšâ), where đšâ = đšâŞ â đšâŠ
(b,y)(b,x)
(a,x) (a,y) (a,z)
(b,z)
01
0 0
0
0
0
11
1
11
Complement
⢠How about the complement? The complement is only definedwith respect to some universe.
⢠Given the alphabet , the default universe is just the set of allpossible strings *. Therefore, given a language L over , i.e.L * the complement of L is * - L
⢠Note: Since we know how to compute set difference, and weknow how to construct the automaton for * we can constructthe automaton for L .
Complement
⢠How about the complement? The complement is only definedwith respect to some universe.
⢠Given the alphabet , the default universe is just the set of allpossible strings *. Therefore, given a language L over , i.e.L * the complement of L is * - L
⢠Note: Since we know how to compute set difference, and weknow how to construct the automaton for * we can constructthe automaton for L .
⢠Question: Is there a simpler construction forL ?
Complement
⢠How about the complement? The complement is only definedwith respect to some universe.
⢠Given the alphabet , the default universe is just the set of allpossible strings *. Therefore, given a language L over , i.e.L * the complement of L is * - L
⢠Note: Since we know how to compute set difference, and weknow how to construct the automaton for * we can constructthe automaton for L .
⢠Question: Is there a simpler construction forL ?
⢠Answer: Just switch accept-states with non-accept states.
Complement Example
x y z1
0
0
1
1
Original:
x y z1
0
0
1
1
Complement: yx
Boolean-Closure Summary
⢠We have shown constructively that regular languages are closed under
boolean operations. I.e., given regular languages L1
and L2
we saw that:
1. L1 L
2 is regular,
2. L1 L
2 is regular,
3. L1-L
2 is regular,
4. L1 L
2 is regular,
5. L1
is regular.
⢠No. 2 to 4 also happen to be regular operations. We still need to show
that regular languages are closed under concatenation and Kleene-*.
⢠Tough question: Canât we do a similar trick for concatenation?
BacktoNondeterministicFA
⢠Question:DrawanFAwhichacceptsthelanguageL1={xà {0,1}*|4th bitfromleftofxis0}
⢠FAforL1:
⢠Question:Whataboutthe4th bitfromtheright?
⢠Looksascomplicated:L2={xà {0,1}*|4th bitfromrightofxis0}
0,1
0,1
0,1 0,10
0,11
WeirdIdea
⢠NoticethatL2isthereverseL1.
⢠I.e.sayingthat0shouldbethe4th fromtheleftisreverseofsayingthat0shouldbe4th fromtheright.Canwesimplyreverse thepicture(reversearrows,swapstartandaccept)?!?
⢠Hereâsthereversedversion:
0,1
0,1
0,1 0,10
0,11
0,1
0,1
0,1 0,10
0,11
Discussionofweirdidea
1. Sillyunreachablestate.Notpretty,butallowedinmodel.
2. Oldstartstatebecameacrashingacceptstate.Underdeterminism.Couldfixwithfailstate.
3. Oldacceptstatebecameastatefromwhichwedonâtknowwhattodowhenreading0.Overdeterminism.Trouble.
4. (Notinthisexample,but)Therecouldbemorethanonestartstate!Seeminglyoutsidestandarddeterministicmodel.
⢠Still,thereissomethingaboutourautomaton.ItturnsoutthatNFAâs(=NondeterministicFA)areactuallyquiteusefulandareembeddedinmanypracticalapplications.
⢠Idea,keepmorethan1activestate ifnecessary.
IntroductiontoNondeterministicFiniteAutomata
⢠ThestaticpictureofanNFAisasagraphwhoseedgesarelabeledbyS andbye (togethercalledSe)andwithstartvertexq0andacceptstatesF.
⢠Example:
⢠AnylabeledgraphyoucancomeupwithisanNFA,aslongasitonlyhasonestartstate.Later,eventhisrestrictionwillbedropped.
0,10
1e
1
6
NFA:FormalDefinition.
⢠Definition:Anondeterministicfiniteautomaton(NFA) isencapsulatedbyM=(Q,S,d,q0,F)inthesamewayasanFA,exceptthatd hasdifferentdomainandco- domain: !: #ĂÎŁ& â ( #
⢠Here,P(Q)isthepowersetofQsothatd(q,a) isthesetofallendpointsofedgesfromq whicharelabeledbya.
⢠Example,forNFAofthepreviousslide:
d(q0,0) = {q1,q3},d(q0,1) = {q2,q3}, d(q0,e) = Ă,...d(q3,e) = {q2}.
q1
q0q2
q30,10
1e
1
FormalDefinitionofanNFA:Dynamic
⢠JustaswithFAâs,thereisanimplicitauxiliarytapecontainingtheinputstringwhichisoperatedonbytheNFA.AsopposedtoFAâs,NFAâsareparallelmachines â abletobeinseveralstatesatanygiveninstant.TheNFAreadsthetapefromlefttorightwitheachnewcharactercausingtheNFAtogointoanothersetofstates.Whenthestringiscompletelyread,thestringisaccepteddependingonwhethertheNFAâsfinalconfigurationcontainsanacceptstate.
⢠Definition:Astringuisaccepted byanNFAM iff thereexistsapathstartingatq0whichislabeledbyuandendsinanacceptstate.ThelanguageacceptedbyM isthesetofallstringswhichareacceptedbyMandisdenotedbyL(M).⢠Followingalabele isforfree(withoutreadinganinputsymbol).In
computingthelabelofapath,youshoulddeletealleâs.⢠TheonlydifferenceinacceptanceforNFAâsvs.FAâsarethewordsâthere
existsâ.InFAâsthepathalwaysexistsandisunique.
Example
M4:
Question:Whichofthefollowingstringsisaccepted?1. e2. 03. 14. 0111
q1
q0q2
q30,10
1e
1
NFAâsvs.RegularOperations
⢠OnthefollowingfewslideswewillstudyhowNFAâsinteractwithregularoperations.
⢠WewillusethefollowingschematicdrawingforageneralNFA.
⢠Theredcirclestandsforthestartstateq0,thegreenportionrepresentstheacceptstatesF,theotherstatesaregray.
NFA:Union
⢠TheunionAĂB isformedbyputtingtheautomataAandBinparallel.Createanewstartstateandconnectittotheformerstartstatesusinge-edges:
UnionExample
⢠L ={xhasevenlength}à {xendswith11}
c
b
0,1 0,1
d e f
0
1
0
0
1
1
ae e
NFA:Concatenation
⢠TheconcatenationAâ˘B isformedbyputtingtheautomatainserial.ThestartstatecomesfromA whiletheacceptstatescomefromB.Aâsacceptstatesareturnedoffandconnectedviae-edgestoBâsstartstate:
ConcatenationExample
⢠L ={xhasevenlength}⢠{xendswith11}
⢠Remark:ThisexampleissomewhatquestionableâŚ
c
b
0,1 0,1
d e f
0
1
0
01
1e
NFAâs:Kleene-+.
⢠TheKleene-+A+ isformedbycreatingafeedbackloop.Theacceptstatesconnecttothestartstateviae-edges:
Kleene-+Example
L={ }+
={ }
x is a streak of one or more 1âs followedby a streak of two or more 0âs
00c d f
1
10
e
e
x starts with 1, ends with 0, and alternatesbetween one or more consecutive 1âs
and two or more consecutive 0âs
NFAâs:Kleene-*
⢠TheconstructionfollowsfromKleene-+constructionusingthefactthatA*istheunionofA+ withtheemptystring.JustcreateKleene-+andaddanewstartacceptstateconnectingtooldstartstatewithane-edge:
ClosureofNFAunderRegularOperations
⢠TheconstructionsaboveallshowthatNFAâsareconstructively closedundertheregularoperations.Moreformally,
⢠Theorem:IfL1and L2areacceptedbyNFAâs,thensoareL1Ă L2, L1â˘L2,L1+ andL1*.Infact,theacceptingNFAâscanbeconstructedinlineartime.
⢠Thisisalmostwhatwewant.IfwecanshowthatallNFAâscanbeconvertedintoFAâsthiswillshowthatFAâsâ andhenceregularlanguagesâ areclosedundertheregularoperations.
RegularExpressions(REX)
⢠Wearealreadyfamiliarwiththeregularoperations.Regularexpressionsgiveawayofsymbolizingasequenceofregularoperations,andthereforeawayofgeneratingnewlanguagesfromold.
⢠Forexample,togeneratetheregularlanguage{banana,nab}*fromtheatomiclanguages{a},{b}and{n}wecoulddothefollowing:
(({b}â˘{a}â˘{n}â˘{a}â˘{n}â˘{a})Ă({n}â˘{a}â˘{b}))*
Regularexpressionsspecifythesameinamorecompactform:
(bananaĂnab)*
RegularExpressions(REX)
⢠Definition:Thesetofregularexpressions overanalphabetS andthelanguagesinS*whichtheygeneratearedefinedrecursively:â BaseCases:EachsymbolaĂ S aswellasthesymbolse andĂ are
regularexpressions:⢠ageneratestheatomiclanguageL(a)={a}⢠e generatesthelanguageL(e)={e}⢠à generatestheemptylanguageL(Ă)={}=Ă
â InductiveCases:ifr1 andr2areregularexpressionssoarer1Ăr2,(r1)(r2),(r1)*and(r1)+:⢠L(r1Ăr2)=L(r1)ĂL(r2),sor1Ăr2generatestheunion⢠L((r1)(r2))=L(r1)â˘L(r2),so(r1)(r2) istheconcatenation⢠L((r1)*)=L(r1)*, so(r1)* representstheKleene-*⢠L((r1)+)=L(r1)+,so(r1)+ representstheKleene-+
RegularExpressions:TableofOperationsincludingUNIX
Operation Notation Language UNIX
Union r1Ăr2 L(r1)ĂL(r2) r1|r2
Concatenation (r1)(r2) L(r1)â˘L(r2) (r1)(r2)
Kleene-* (r )* L(r )* (r )*
Kleene-+ (r )+ L(r )+ (r )+
Exponentiation (r )n L(r )n (r ){n}
RegularExpressions:Simplifications
⢠Justasalgebraicformulascanbesimplifiedbyusinglessparentheseswhentheorderofoperationsisclear,regularexpressionscanbesimplified.Usingthepuredefinitionofregularexpressionstoexpressthelanguage{banana,nab}*wewouldbeforcedtowritesomethingnastylike
((((b)(a))(n))(((a)(n))(a))Ă(((n)(a))(b)))*
⢠Usingtheoperatorprecedenceordering*,⢠,à andtheassociativityof⢠allowsustoobtainthesimpler:
(bananaĂnab)*
⢠Thisisdoneinthesamewayasonewouldsimplifythealgebraicexpressionwithre-orderingdisallowed:
((((b)(a))(n))(((a)(n))(a))+(((n)(a))(b)))4=(banana+nab)4
RegularExpressions:Example
⢠Question:Findaregularexpressionthatgeneratesthelanguageconsistingofallbit-stringswhichcontainastreakofseven0âsorcontaintwodisjointstreaksofthree1âs.â Legal:010000000011010,01110111001,111111â Illegal:11011010101,10011111001010,00000100000
⢠Answer:(0Ă1)*(07Ă13(0Ă1)*13)(0Ă1)*â Anevenbriefervalidansweris:S*(07Ă13S*13)S*â Theofficial answerusingonlythestandardregularoperationsis:
(0Ă1)*(0000000Ă111(0Ă1)*111)(0Ă1)*â AbriefUNIXansweris:
(0|1)*(0{7}|1{3}(0|1)*1{3})(0|1)*
RegularExpressions:Examples
1) 0*10*
2) (SS)*
3) 1*Ă
4) S ={0,1},{w|whasatleastone1}
5) S ={0,1},{w|wstartsandendswiththesamesymbol}
6) {w|wisanumericalconstantwithsignand/orfractionalpart}⢠E.g.3.1415,-.001,+2000
RegularExpressions:AdifferentviewâŚ
⢠Regularexpressionsarejuststrings.Consequently,thesetofallregularexpressionsisasetofstrings,sobydefinitionisalanguage.
⢠Question:Supposingthatonlyunion,concatenationandKleene-*areconsidered.Whatisthealphabetforthelanguageofregularexpressionsoverthebasealphabet S ?
⢠Answer:S Ă {(,),Ă,*}
REXĂ NFA
⢠SinceNFAâsareclosedundertheregularoperationsweimmediatelyget
⢠Theorem:GivenanyregularexpressionrthereisanNFANwhichsimulatesr.Thatis,thelanguageacceptedbyN ispreciselythelanguagegeneratedbyr sothatL(N)=L(r).Furthermore,theNFAisconstructibleinlineartime.
REXĂ NFA
⢠Proof:Theproofworksbyinduction,usingtherecursivedefinitionofregularexpressions.FirstweneedtoshowhowtoacceptthebasecaseregularexpressionsaĂS, e andĂ.ThesearerespectivelyacceptedbytheNFAâs:
⢠Finally,weneedtoshowhowtoinductivelyacceptregularexpressionsformedbyusingtheregularoperations.Thesearejusttheconstructionsthatwesawbefore,encapsulatedby:
q0 q0q1q0a
REXĂ NFAexercise:FindNFAfor )* ⪠) â
REXĂ NFA:Example
⢠Question:FindanNFAfortheregularexpression(0Ă1)*(0000000Ă111(0Ă1)*111)(0Ă1)*
ofthepreviousexample.
REXĂ NFAĂ FA?!?
⢠ThefactthatregularexpressionscanbeconvertedintoNFAâsmeansthatitmakessensetocallthelanguagesacceptedbyNFAâsâregular.â
⢠However,theregularlanguagesweredefinedtobethelanguagesacceptedbyFAâs,whicharebydefault,deterministic.ItwouldbeniceifNFAâscouldbeâdeterminizedâandconvertedtoFAâs,forthenthedefinitionofâregularâlanguages,asbeingFA-acceptedwouldbejustified.
⢠Letâstrythisnext.
NFAâshave3typesofnon-determinism
Nondeterminismtype
MachineAnalog
d -function Easy to fix? Formally
Under-determined Crash No output yes, fail-state |d(q,a)|= 0
Over-determined Randomchoice
Multi-valued no |d(q,a)|> 1
ePause reading
Redefine alphabet no |d(q,e)|> 0
DeterminizingNFAâs:Example
⢠Idea:Wemightkeeptrackofallparallelactivestatesastheinputisbeingcalledout.Ifattheendoftheinput,oneoftheactivestateshappenedtobeanacceptstate,theinputwasaccepted.
⢠Example,considerthefollowingNFA,anditsdeterministicFA.
1
2 3
a
a
e
a,b
b
One-Slide-RecipetoDerandomize
⢠InsteadofthestatesintheNFA,weconsiderthepower-states intheFA.(IftheNFAhasnstates,theFAhas2n states.)
⢠Firstwefigureoutwhichpower-stateswillreachwhichpower-statesintheFA.(UsingtherulesoftheNFA.)
⢠Thenwemustaddallepsilon-edges:Weredirectpointersthatareinitiallypointingtopower-state{a,b,c}topower-state{a,b,c,d,e,f},ifandonlyifthereisanepsilon-edge-only-pathpointingfromanyofthestatesa,b,ctostatesd,e,f(a.k.a.transitiveclosure).Wedotheverysameforthestartingstate:startingstateofFA={startingstateofNFA,allNFAstatesthatcanrecursivelybereachedfromthere}
⢠Acceptingstates oftheFAareallstatesthatincludeaacceptingNFAstate.
Remarks
⢠Thepreviousrecipecanbemadetotallyformal.Moredetailscanbefoundinthereadingmaterial.
⢠JustfollowingtherecipewilloftenproduceatoocomplicatedFA.Sometimesobvioussimplificationscanbemade.Ingeneralhowever,thisisnotaneasytask.
AutomataSimplification
⢠TheFAcanbesimplified.States{1,2}and{1},forexample,cannotbereached.StilltheresultisnotassimpleastheNFA.
Derandomization Exercise
⢠Exercise:Letâsderandomize thesimplifed two-stateNFAfromslide1/70whichwederivedfromregularexpression )* ⪠) â
x na b
aa
REXĂ NFAĂ FA
⢠Summary:StartingfromanyNFA,wecanusesubsetconstructionandtheepsilon-transitive-closuretofindanequivalentFAacceptingthesamelanguage.Thus,
⢠Theorem:IfLisanylanguageacceptedbyanNFA,thenthereexistsaconstructible[deterministic]FAwhichalsoacceptsL.
⢠Corollary:Theclassofregularlanguagesisclosedundertheregularoperations.
⢠Proof:SinceNFAâsareclosedunderregularoperations,andFAâsarebydefaultalsoNFAâs,wecanapplytheregularoperationstoanyFAâsanddeterminizeattheendtoobtainanFAacceptingthelanguagedefinedbytheregularoperations.
REXĂ NFAĂ FAĂ REXâŚ
⢠WeareonestepawayfromshowingthatFAâsÂť NFAâsÂť REXâs;i.e.,allthreerepresentationareequivalent.Wewillbedonewhenwecancompletethecircleoftransformations:
FA
NFA
REX
NFAĂ REXissimple?!?
⢠ThenFAà REXevensimpler!
⢠Pleasesolvethissimpleexample:
1
0
0
1
1
11
0
00
REXĂ NFAĂ FAĂ REXâŚ
⢠InconvertingNFAâstoREXâsweâllintroducethemostgeneralizednotionofanautomaton,thesocalledâGeneralizedNFAâorâGNFAâ.InconvertingintoREXâs,weâllfirstgothroughaGNFA:
FA
NFA
REX
GNFA
GNFAâs
⢠Definition:Ageneralizednondeterministicfiniteautomaton(GNFA) isagraphwhoseedgesarelabeledbyregularexpressions,â withauniquestartstatewithin-degree0,butarrowstoevery
otherstateâ andauniqueacceptstatewithout-degree0,butarrowsfrom
everyotherstate(notethatacceptstateš startstate)â andanarrowfromanystatetoanyotherstate(includingself).
⢠AGNFAaccepts astringsifthereexistsapathp fromthestartstatetotheacceptstate suchthatw isanelementofthelanguagegeneratedbytheregularexpressionobtainedbyconcatenatingalllabelsoftheedgesinp.
⢠Thelanguageaccepted byaGNFAconsistsofalltheacceptedstringsoftheGNFA.
GNFAExample
⢠ThisisaGNFAbecauseedgesarelabeledbyREXâs,startstatehasnoin-edges,andtheunique acceptstatehasnoout-edges.
⢠Convinceyourselfthat000000100101100110isaccepted.
b
c
0Ăe
000
a
(0110Ă1001)*
NFAĂ REXconversionprocess
1. ConstructaGNFAfromtheNFA.A. Iftherearemorethanonearrowsfromonestatetoanother,unify
themusingâĂâB. Createauniquestartstatewithin-degree0C. Createauniqueacceptstateofout-degree0D. [Ifthereisnoarrowfromonestatetoanother,insertonewith
labelĂ]
2. Loop:AslongastheGNFAhasstrictlymorethan2states:Ripoutarbitraryinteriorstateandmodifyedgelabels.
3. Theansweristheuniquelabelr.
acceptstartr
NFAĂ REX:RippingOut.
⢠Rippingoutisdoneasfollows.Ifyouwanttoripthemiddlestatevout(forallpairsofneighborsu,w)âŚ
⢠âŚthenyouâllneedtorecreateallthelostpossibilitiesfromutow.I.e.,tothecurrentREXlabelr4oftheedge(u,w)youshouldaddtheconcatenationofthe(u,v )labelr1followedbythe(v,v )-looplabelr2repeatedarbitrarily,followedbythe(v,w )labelr3..Thenew(u,w)substitutewouldthereforebe:
v wur3
r2
r1
r4
wur4 Ă r1 (r2)*r3
FAĂ REX:Example
FAĂ REX:Exercise
Summary:FAâ NFAâ REX
⢠Thiscompletesthedemonstrationthatthethreemethodsofdescribingregularlanguagesare:
1. DeterministicFAâs2. NFAâs3. RegularExpressions
⢠Wehavelearntthatalltheseareequivalent.
RemarkaboutAutomatonSize
⢠Creatinganautomatonofsmallsizeisoftenadvantageous.â Allowsforsimpler/cheaperhardware,orbetterexamgrades.â Designing/Minimizingautomataisthereforeafunnysport.Example:
a
b
1
d
0,1
e
0,1
1
c
0,1
gf
0
0
0
01
1
Minimization
⢠Definition:Anautomatonisirreducible ifâ itcontainsnouselessstates,andâ notwodistinctstatesareequivalent.
⢠Byjustfollowingthesetworules,youcanarriveatanâirreducibleâFA.Generally,suchalocalminimumdoesnothavetobeaglobalminimum.
⢠Itcanbeshownhowever,thattheseminimizationrulesactuallyproducetheglobalminimumautomaton.
⢠Theideaisthattwoprefixesu,v areindistinguishableiff forallsuffixesx,ux à Liff vx à L.Ifuandvaredistinguishable,theycannotendupinthesamestate.Thereforethenumberofstatesmustbeatleastasmanyasthenumberofpairwisedistinguishableprefixes.
Pigeonholeprinciple
⢠ConsiderlanguageL,whichcontainswordw à L.⢠ConsideranFAwhichacceptsL,withn<|w|states.⢠Then,whenacceptingw,theFAmustvisitatleastonestatetwice.
⢠Thisisaccordingtothepigeonhole(a.k.a.Dirichlet)principle:â Ifm>npigeonsareputintonpigeonholes,there'saholewith
morethanonepigeon.â Thatâsaprettyfancynameforaboringobservation...
Languageswithunboundedstrings
⢠Consequently,regularlanguageswithunboundedstringscanonlyberecognizedbyFA(finite!bounded!)automataiftheselongstringsloop.
⢠TheFAcanenterthelooponce,twice,âŚ,andnotatall.⢠Thatis,languageLcontainsall {xz,xyz,xy2z,xy3z,âŚ}.
PumpingLemma
⢠Theorem:GivenaregularlanguageL,thereisanumberp (calledthepumpingnumber)suchthatanystringinLoflength³ pispumpablewithinitsfirstp letters.
⢠Inotherwords,forallu Ă Lwith|u |Âł pwecanwrite:â u=xyz (xisaprefix,zisasuffix)â |y|Âł 1 (mid-portionyisnon-empty)â |xy|ÂŁ p (pumpingoccursinfirstpletters)â xyizĂ Lforalli Âł 0(canpumpy-portion)
⢠If,ontheotherhand,thereisnosuchp,thenthelanguageisnotregular.
PumpingLemmaExample
⢠LetLbethelanguage{0n1n |n³ 0}
⢠Assume(forthesakeofcontradiction)thatLisregular⢠Letp bethepumpinglength.Letu bethestring0p1p.⢠Letâscheckstringuagainstthepumpinglemma:
⢠âInotherwords,forallu Ă Lwith|u |Âł pwecanwrite:â u=xyz (xisaprefix,zisasuffix)â |y|Âł 1 (mid-portionyisnon-empty)â |xy|ÂŁ p (pumpingoccursinfirstpletters)â xyiz Ă Lforalli Âł 0(canpumpy-portion)â
Ă y = 0+
Ă Then, xz or xyyz is not in L. Contradiction!
LetâsmaketheexampleabitharderâŚ
⢠LetLbethelanguage{w|whasanequalnumberof0sand1s}
⢠Assume(forthesakeofcontradiction)thatLisregular⢠Letp bethepumpinglength.Letu bethestring0p1p.⢠Letâscheckstringuagainstthepumpinglemma:
⢠âInotherwords,forallu Ă Lwith|u |Âł pwecanwrite:â u=xyz (xisaprefix,zisasuffix)â |y|Âł 1 (mid-portionyisnon-empty)â |xy|ÂŁ p (pumpingoccursinfirstpletters)â xyiz Ă Lforalli Âł 0(canpumpy-portion)â
Harderexamplecontinued
⢠Again,ymustconsistof0sonly!⢠Pumpitthere!Clearlyagain,ifxyzà L,thenxz orxyyz arenotinL.
⢠Thereâsanotheralternativeproofforthisexample:â 0*1*isregular.â Ă isaregularoperation.â IfLregular,thenLĂ 0*1*isalsoregular.â However,LĂ 0*1*isthelanguagewestudiedinthepreviousexample(0n1n).
Acontradiction.
NowyoutryâŚ
⢠Is-. = 00 0 â 0 ⪠1 â} regular?
⢠Is-6 = 17 8beingaprimenumber} regular?