Afj+ +Dodatni+ +Eng

8/8/2019 Afj+ +Dodatni+ +Eng

http://slidepdf.com/reader/full/afj-dodatni-eng 1/37

1

Definitions on Language

Subjects to be Learned

• alphabet• string (word)

• language

• operations on languages: concatenation of strings, union, intersection, Kleene star

Contents

Here we are going to learn the concept of language in very abstract and general sense,

operations on languages and some of their properties.

1. Basic concepts

First, an alphabet is a finite set of symbols. For example {0, 1} is an alphabet with two

symbols, {a, b} is another alphabet with two symbols and English alphabet is also an

alphabet. A string (also called a word) is a finite sequence of symbols of an alphabet. b, a

and aabab are examples of string over alphabet {a, b} and 0, 10 and 001 are examples of

string over alphabet {0, 1}. A language is a set of strings over an alphabet. Thus {a, ab,

baa} is a language (over alphabert {a,b}) and {0, 111} is a language (over alphabet {0,1}).

The number of symbols in a string is called the length of the string. For a string w its

length is represented by |w|. It can be defined more formally by recursive definition. The

empty string (also called null string) is the string with length 0. That is, it has no symbols.

The empty string is denoted by (capital lambda). Thus | | = 0.

Let u and v be strings. Then uv denotes the string obtained by concatenating u with v , that

is, uv is the string obtained by appending the sequence of symbols of v to that of u. For

example if u = aab and v = bbab, then uv = aabbbab. Note that vu = bbabaab uv . We

are going to use first few symbols of English alphabet such as a and b to denote symbols of

an alphabet and those toward the end such as u and v for strings.

A string x is called a substring of another string y if there are strings u and v such that y =

uxv . Note that u and v may be an empty string. So a string is a substring of itself. A string x

is a prefix of another string y if there is a string v such that y = xv . v is called a suffix of y .

2. Some special languages

The empty set is a language which has no strings. The set { } is a language which has

one string, namely . Though has no symbols, this set has an object in it. So it is not

empty. For any alphabet , the set of all strings over (including the empty string) is

http://www.cs.odu.edu/~toida/nerzic/390teched/language/gen_ind/gen_ind.html#length

http://www.cs.odu.edu/~toida/nerzic/390teched/language/gen_ind/gen_ind.html#length



1

denoted by . Thus a language over alphabet is a subset of .

3. Operations on languages

Since languages are sets, all the set operations can be applied to languages. Thus the union,

intersetion and difference of two languages over an alphabet are languages over . The

complement of a language L over an alphabet is - L and it is also a language.

Another operation onlanguages is concatenation. Let L1 and L2 be languages. Then the

concatenation of L1 with L2 is denoted as L1L2 and it is defined as L1L2 = { uv | u L1 and v

L2 }. That is L1L2 is the set of strings obtained by concatenating strings of L1 with those of

L2. For example {ab, b} {aaa, abb, aaba} = {abaaa, ababb, abaaba, baaa, babb, baaba}.

Powers : For a symbol a and a natural number k, ak represents the concatenation of k

a's. For a string u and a natural number k, uk denotes the concatenation of k u's.

Similarly for a language L, Lk means the concatenation of k L's. Hence Lk is the set of strings

that can be obtained by concatenating k strings of L. These powers can be formally defined

recursively. For example Lk can be defined recursively as follows.

Recursive definition of Lk :

Basis Clause: L0 = { }

Inductive Clause: L(k+1) = Lk L.

Since Lk is defined for natural numbers k, the extremal clause is not necessary.

ak and uk can be defined similarly. Here a0 = and u0 = .

The following two types of languages are generalizations of * and we are going to see

them quite often in this course.

Recursive definition of L*:

Basis Clause: L*

Inductive Clause: For any x L* and any w L, xw L*.

Extremal Clause: Nothing is in L* unless it is obtained from the above two clauses.

L* is the set of strings obtained by concatenating zero or more strings of L as we are

going to see in Theorem 1. This * is called Kleene star.



1

For example if L = { aba, bb }, then L* = { , aba, bb, ababb, abaaba, bbbb, bbaba, ... }

The * in * is also the same Kleene star defined above.

Recursive definition of L+

:

Basis Clause: L L+

Inductive Clause: For any x L+ and any w L, xw L+.

Extremal Clause: Nothing is in L+ unless it is obtained from the above two clauses.

Thus L+ is the set of strings obtained by concatenating one or more strings of L.

For example if L = { aba, bb }, then L+ = { aba, bb, ababb, abaaba, bbbb, bbaba, ... }

Let us also define (i.e. L0 L L

2 ... ) as = { x | x Lk for some natural

number k } .

Then the following relationships hold on L* and L+. Theorems 1 and 2 are proven in "General

Induction" which you study in the next unit. Other proofs are omitted.

Theorem 1: LnL

*

Theorem 2:

Theorem 3:

Theorem 4: L+ = L L* = L

*L

Note: According to Theorems 2 and 3, any nonempty string in L* or L

+ can be expresssed as

the concatenation of strings of L, i.e. w1w2...wk for some k , where wi's are strings of L.

L

*

and L

*

have a number of interesting properties. Let us list one of them as a theorem andprove it.

Theorem 5: L* = (L*)*.

Proof: Because by Theorem 2, by applying Theorem 2 to the language L* we

can see that

http://www.cs.odu.edu/~toida/nerzic/390teched/language/gen_ind/gen_ind.html






1

L* (L*)*.

Conversely ( L* )* L

* can be proven as follows:

Let x be an arbitrary nonempty string of ( L*

)*

. Then there are nonempty strings w1, w2, ...,wk in L* such that x = w1w2...wk . Since w1, w2, ..., wk are strings of L*, for each wi there are

strings wi1, wi2, ..., wimi in L such that wi = wi1wi2...wimi

Hence x = w11 ...w1m1w21.. .w2m2...wm1...wmmk . Hence x is in L* .

If x is an empty string, then it is obviously in L* .

Hence ( L* )* L

* .

Since L* (L*)* and ( L

* )* L* , L

* = ( L* )* .

Test Your Understanding of Language

Indicate which of the following statements are correct and which are not.

Click Yes or No , then Submit.

There are two sets of questions.

In the questions below the following notations are used:

L^* , L^+ and L^k for L*, L+, Lk , respectively

a^k for ak



1

Definitions of Regular Language and Regular Expression


• regular language

• regular expression

Contents

Here we are going to learn one type of language called regular language which is thesimplest of the four Homsky formal languages, and regular expression which is one of theways to describe regular languages.

1. Regular language

The set of regular languages over an alphabet is defined recursively as below. Any

language belonging to this set is a regular language over .

Definition of Set of Regular Languages :

Basis Clause: , { } and {a} for any symbol a are regular languages.

Inductive Clause: If Lr and Ls are regular languages, then Lr Ls , Lr Ls and Lr * are regular

languages.

Extremal Clause: Nothing is a regular language unless it is obtained from the above twoclauses.

For example, let = {a, b}. Then since {a} and {b} are regular languages, {a, b} ( = {a}

{b} ) and {ab} ( = {a}{b} ) are regular languages. Also since {a} is regular, {a}* is a

regular language which is the set of strings consisting of a's such as , a, aa, aaa, aaaa etc.

Note also that *, which is the set of strings consisting of a's and b's, is a regular languagebecause {a, b} is regular.

2. Regular expression

Regular expressions are used to denote regular languages. They can represent regularlanguages and operations on them succinctly.

The set of regular expressions over an alphabet is defined recursively as below. Anyelement of that set is a regular expression.



1

Basis Clause: , and a are regular expressions corresponding to languages , { }

and {a}, respectively, where a is an element of .Inductive Clause: If r and s are regular expressions corresponding to languages Lr and Ls ,

then ( r + s ) , ( rs ) and ( r*

) are regular expressions corresponding to languages Lr Ls ,Lr Ls and Lr

* , respectively.

Extremal Clause: Nothing is a regular expression unless it is obtained from the above twoclauses.

Conventions on regular expressions

(1) When there is no danger of confusion, bold face may not be used for regular expressions.

So for example, ( r + s ) is used in stead of ( r + s ).(2) The operation * has precedence over concatenation, which has precedence over union (

+ ). Thus the regular expression ( a + ( b( c*) ) ) is written as a + bc*,(3) The concatenation of k r's , where r is a regular expression, is written as rk . Thus forexample rr = r2 .

The language corresponding to rk is Lr k , where Lr is the language corresponding to the

regular expression r.For a recursive definition of Lr

k click here.(4) We use ( r+) as a regular expression to represent Lr

+ .

Examples of regular expression and regular languages corresponding to them

• ( a + b )2 corresponds to the language {aa, ab, ba, bb}, that is the set of strings of

length 2 over the alphabet {a, b}.In general ( a + b )k corresponds to the set of strings of length k over the alphabet{a, b}. ( a + b )* corresponds to the set of all strings over the alphabet {a, b}.

• a*b* corresponds to the set of strings consisting of zero or more a's followed by zeroor more b's.

• a*b+a* corresponds to the set of strings consisting of zero or more a's followed by oneor more b's followed by zero or more a's.

• ( ab )+ corresponds to the language {ab, abab, ababab, ... }, that is, the set of strings of repeated ab's.

Note:A regular expression is not unique for a language. That is, a regular language, in

general, corresponds to more than one regular expressions. For example ( a + b )* and( a*b* )* correspond to the set of all strings over the alphabet {a, b}.

Definition of Equality of Regular Expressions

Regular expressions are equal if and only if they correspond to the same language.

Thus for example ( a + b )* = ( a*b* )* , because they both represent the language of all

http://www.cs.odu.edu/~toida/nerzic/390teched/language/gen_ind/gen_ind.html#pwrlk

http://www.cs.odu.edu/~toida/nerzic/390teched/language/gen_ind/gen_ind.html#pwrlk



1

strings over the alphabet {a, b}.

In general, it is not easy to see by inspection whether or not two regularexpressions are equal.

Test Your Understanding of Regular Language and Regular Expression





\Lambda for

L^* for L*

a^* for a* , etc.

Exercise Questions on Regular Language and Regular Expression

Ex. 1: Find the shortest string that is not in the language represented by the regularexpression a*(ab)*b*.



1

Solution: It can easily be seen that , a, b, which are strings in the language with length 1or less. Of the strings wiht length 2 aa, bb and ab are in the language. However, ba is not init. Thus the answer is ba.

Ex. 2: For the two regular expressions given below,(a) find a string corresponding to r2 but not to r1 and(b) find a string corresponding to both r1 and r2.

r1 = a* + b* r2 = ab* + ba* + b*a + (a*b)*

Solution: (a) Any string consisting of only a's or only b's and the empty string are in r1. Sowe need to find strings of r2 which contain at least one a and at least one b. For example aband ba are such strings.(b) A string corresponding to r1 consists of only a's or only b's or the empty string. The onlystrings corresponding to r2 which consist of only a's or b's are a, b and the strings consitingof only b's (from (a*b)*).

Ex. 3: Let r1 and r2 be arbitrary regular expressions over some alphabet. Find a simple (theshortest and with the smallest nesting of * and +) regular expression which is equal to eachof the following regular expressions.

(a) (r1 + r2 + r1r2 + r2r1)* (b) (r1(r1 + r2)*)+

Solution: One general strategy to approach this type of question is to try to see whether ornot they are equal to simple regular expressions that are familiar to us such as a, a*, a+, (a+ b)*, (a + b)+ etc.(a) Since (r1 + r2)* represents all strings consisting of strings of r1 and/or r2 , r1r2 + r2r1 inthe given regular expression is redundant, that is, they do not produce any strings that are

not represented by (r1 + r2)*

. Thus (r1 + r2 + r1r2 + r2r1)*

is reduced to (r1 + r2)*

.(b) (r1(r1 + r2)*)+ means that all the strings represented by it must consist of one or morestrings of (r1(r1 + r2)*). However, the strings of (r1(r1 + r2)*) start with a string of r1 followedby any number of strings taken arbitrarily from r1 and/or r2. Thus anything that comes afterthe first r1 in (r1(r1 + r2)*)+ is represented by (r1 + r2)*. Hence (r1(r1 + r2)*) also representsthe strings of (r1(r1 + r2)*)+, and conversely (r1(r1 + r2)*)+ represents the stringsrepresented by (r1(r1 + r2)*). Hence (r1(r1 + r2)*)+ is reduced to (r1(r1 + r2)*).

Ex. 4: Find a regular expression corresponding to the language L over the alphabet { a , b }defined recursively as follows:

Basis Clause: L

Inductive Clause: If x L , then aabx L and xbb L .Extremal Clause: Nothing is in L unless it can be obtained from the above two clauses.

Solution: Let us see what kind of strings are in L. First of all L . Then starting with ,strings of L are generated one by one by prepending aab or appending bb to any of thealready generated strings. Hence a string of L consists of zero or more aab's in front and



1

zero or more bb's following them. Thus (aab)*(bb)* is a regular expression for L.

Ex. 5: Find a regular expression corresponding to the language L defined recursively asfollows:

Basis Clause: L and a L .

Inductive Clause: If x L , then aabx L and bbx L .Extremal Clause: Nothing is in L unless it can be obtained from the above two clauses.

Solution: Let us see what kind of strings are in L. First of all and a are in L . Then starting

with or a, strings of L are generated one by one by prepending aab or bb to any of thealready generated strings. Hence a string of L has zero or more of aab's and bb's in front

possibly followed by a at the end. Thus (aab + bb)*(a + ) is a regular expression for L.

Ex. 6: Find a regular expression corresponding to the language of all strings over thealphabet { a, b } that contain exactly two a's.

Solution: A string in this language must have at least two a's. Since any string of b's can beplaced in front of the first a, behind the second a and between the two a's, and since anarbitrasry string of b's can be represented by the regular expression b*, b*a b*a b* is aregular expression for this language.

Ex. 7: Find a regular expression corresponding to the language of all strings over thealphabet { a, b } that do not end with ab.

Solution: Any string in a language over { a , b } must end in a or b. Hence if a string does

not end with ab then it ends with a or if it ends with b the last b must be preceded by asymbol b. Since it can have any string in front of the last a or bb, ( a + b )*( a + bb ) is aregular expression for the language.

Ex. 8: Find a regular expression corresponding to the language of all strings over thealphabet { a, b } that contain no more than one occurence of the string aa.

Solution: If there is one substring aa in a string of the language, then that aa can befollowed by any number of b. If an a comes after that aa, then that a must be preceded by bbecause otherwise there are two occurences of aa. Hence any string that follows aa isrepresented by ( b + ba )*. On the other hand if an a precedes the aa, then it must befollowed by b. Hence a string preceding the aa can be represented by ( b + ab )*. Hence if a

string of the language contains aa then it corresponds to the regular expression ( b + ab )*aa( b + ba )* .If there is no aa but at least one a exists in a string of the language, then applying the sameargument as for aa to a, ( b + ab )*a( b + ba )* is obtained as a regular expressioncorresponding to such strings.If there may not be any a in a string of the language, then applying the same argument as

for aa to , ( b + ab )*( b + ba )* is obtained as a regular expression corresponding tosuch strings.

Altogether ( b + ab )*( + a + aa )( b + ba )* is a regular expression for the language.



1

Ex. 9: Find a regular expression corresponding to the language of strings of even lengthsover the alphabet of { a, b }.

Solution: Since any string of even length can be expressed as the concatenation of strings

of length 2 and since the strings of length 2 are aa, ab, ba, bb, a regular expressioncorresponding to the language is ( aa + ab + ba + bb )*. Note that 0 is an even number.

Hence the string is in this language.

Ex. 10: Describe as simply as possible in English the language corresponding to the regularexpression a*b(a*ba*b)*a* .

Solution: A string in the language can start and end with a or b, it has at least one b, andafter the first b all the b's in the string appear in pairs. Any numbe of a's can appear anyplace in the string. Thus simply put, it is the set of strings over the alphabet { a, b } thatcontain an odd number of b's

Ex. 11: Describe as simply as possible in English the language corresponding to the regular

expression (( a + b )3)*( + a + b ) .

Solution: (( a + b )3) represents the strings of length 3. Hence (( a + b )3)* represents thestrings of length a multiple of 3. Since (( a + b )3)*( a + b ) represents the strings of length3n + 1, where n is a natural number, the given regular expression represents the strings of length 3n and 3n + 1, where n is a natural number.

Ex. 12: Describe as simply as possible in English the language corresponding to the regularexpression ( b + ab )*( a + ab )*.

Solution: ( b + ab )*

represents strings which do not contain any substring aa and whichend in b, and ( a + ab )* represents strings which do not contain any substring bb. Hencealtogether it represents any string consisting of a substring with no aa followed by one bfollowed by a substring with no bb.

Test Your Understanding of Regular Language and Regular Expression





\Lambda for

a^* for a* , etc.



1



1

+ True 0 False

Properties of Regular Language


• Closure of the set of regular languages under union, concatenation and Kleene staroperations.

• Regularity of finite languages

Contents

Here we are going to learn two of the properties of regular languages. We will see moreproperties later.

To review definition of regular language click here

We say a set of languages is closed under an operation if the result of applying theoperation to any arbitrary language(s) of the set is a language in the set.For example a set of languages is closed under union if the union of any two languages of the set also belongs to the set.

The following theorem is immediate from the Inductive Clause of the definition of the set of regular languages.

Theorem 1: The set of regular languages over an alphabet is closed under operationsunion, concatenation and Kleene star.

Proof: Let Lr and Ls be regular languages over an alphabet . Then by the definition of the

set of regular languages , Lr Ls , Lr Ls and Lr * are regular languages and they are obviously

over the alphabet . Thus the set of regular languages is closed under those operations.

Note 1: Later we shall see that the complement of a regular language and the intersectionof regular laguages are also regular.Note 2: The union of infinitely many regular languages is not necessarily regular. Forexample while { akbk } is regular for any natural number k , { anbn | n is a natural number} which is the union of all the languages { ak bk } , is not regular as we shall seelater.

The following theorem shows that any finite language is regular. We say a language is finite

if it consists of a finite number of strings, that is, a finite language is a set of n strings forsome natural number n.

Theorem 2: A finite language is regular.

Proof: Let us first assume that a language consisting of a single string is regular and provethe theorem by induction. We then prove that a language consisting of a single string isregular.

http://www.cs.odu.edu/~toida/nerzic/390teched/regular/reg-lang/review-reg-lang.html

http://www.cs.odu.edu/~toida/nerzic/390teched/regular/reg-lang/review-reg-lang.html



1

Claim 1: A language consisting of n strings is regular for any natural number n (that is, afinite language is regular) if { w } is regular for any string w.

Proof of the Claim 1: Proof by induction on the number of strings.

Basis Step: (corresponding to n = 0) is a regular language by the Basis Clause of thedefinition of regular language.Inductive Step: Assume that a language L consisting of n strings is a regular language

(induction hypothesis). Then since { w } is a regular language as proven below, L { w } isa regular language by the definition of regular language.

End of proof of Claim 1

Thus if we can show that { w } is a regular language for any string w, then we have proventhe theorem.

Claim 2: Let w be a string over an alphabet . Then { w } is a regular language.

Proof of Claim 2: Proof by induction on strings.

Basis Step: By the Basis Clause of the definition of regular language, { } and { a } are

regular languages for any arbitrary symbol a of .

Inductive Step: Assume that { w } is a regular language for an arbitrary string w over .

Then for any symbol a of , { a } is a regular language from the Basis Step. Hence by theInductive Clause of the definition of regular language { a }{ w } is regular. Hence { aw } isregular.

End of proof for Claim 2

Note that Claim 2 can also be proven by induction on the length of string.

End of proof of Theorem 2.

Test Your Understanding of Properties of Regular Language




a^i for ai , etc.

\Sigma for

\cup for

<= for



1

FALSE

Definition of Deterministic Finite Automata


• Finite automata

• State transition diagram

• State transition table

Contents

Here we are going to formally define finite automata, in particular deterministic finiteautomata and see some examples. Finite automata recognize regular languages and,conversely, any language that is recognized by a finite automaton is regular. There are other

types of finite automata such as nondeterministic finite automata and nondeterministicautomata with and they are going to be studied later.

Let us now formally define deterministic finite automaton.

Definition of deterministic finite automaton

Let Q be a finite set and let be a finite set of symbols. Also let be a function from Q



1

to Q , let q0 be a state in Q and let A be a subset of Q. We call the elements of Q a state,

the transition function, q0 the initial state and A the set of accepting states.

Then a deterministic finite automaton is a 5-tuple < Q , , q0 , , A >

Notes on the definition

1. The set Q in the above definition is simply a set with a finite number of elements. Itselements can, however, be interpreted as a state that the system (automaton) is in.

Thus in the example of vending machine, for example, the states of the machinesuch as "waiting for a customer to put a coin in", "have received 5 cents" etc. are theelements of Q. "Waiting for a customer to put a coin in" can be considered the initialstate of this automaton and the state in which the machine gives out a soda can canbe considered the accepting state.

2. The transition function is also called a next state function meaning that the

automaton moves into the state (q, a) if it receives the input symbol a while instate q.

Thus in the example of vending machine, if q is the initial state and a nickel is put in,

then (q, a) is equal to "have received 5 cents".

3. Note that is a function. Thus for each state q of Q and for each symbol a of ,

(q, a) must be specified.4. The accepting states are used to distinguish sequences of inputs given to the finite

automaton. If the finite automaton is in an accepting state when the input ceases tocome, the sequence of input symbols given to the finite automaton is "accepted".Otherwise it is not accepted. For example, in the Example 1 below, the string a isaccepted by the finite automaton. But any other strings such as aa, aaa, etc. are notaccepted.

5. A deterministic finite automaton is also called simply a "finite automaton".Abbreviations such as FA and DFA are used to denote deterministic finiteautomaton.

DFAs are often represented by digraphs called (state) transition diagram. The vertices(denoted by single circles) of a transition diagram represent the states of the DFA and thearcs labeled with an input symbol correspond to the transitions. An arc ( p , q ) from vertex p

to vertex q with label represents the transition (p, ) = q . The accepting states areindicated by double circles.

Transition functions can also be represented by tables as seen below. They are calledtransition table.

Examples of finite automaton

Example 1: Q = { 0, 1, 2 }, = { a }, A = { 1 }, the initial state is 0 and is as shown inthe following table.

State(q)

Input(a)

Next State ( (q,a) )

0 a 1

1 a 2

2 a 2



1

A state transition diagram for this DFA is given below.

If the alphabet of the Example 1 is changed to { a, b } in stead of { a }, then we need aDFA such as shown in the following examle to accept the same string a. It is a little morecomplex DFA.

Example 2: Q = { 0, 1, 2 }, = { a, b }, A = { 1 }, the initial state is 0 and is as shownin the following table.

Note that for each state there are two rows in the table for corresponding to the symbols aand b, while in the Example 1 there is only one row for each state.


State(q)

Input(a)


0 a 10 b 2

1 a 2

1 b 2

2 a 2

2 b 2



1

A DFA that accepts all strings consisting of only symbol a over the alphabet { a, b } is thenext example.

Example 3: Q = { 0, 1 }, = { a, b }, A = { 0 }, the initial state is 0 and is as shown in

the following table.


Example 4: For the example of vending machine of the

previous section, Q = { 0, 5, 10, 15, 20 }, = { D, N }, A= { 15, 20 }, the initial state q0 = 0. If we make it a DFA,

its transition function is as shown in the following table.

State(q)

Input(a)


0 a 0

0 b 1

1 a 1

1 b 1

State(q)

Input(a)


0 N 5

0 D 10

5 N 10

5 D 15

10 N 15

10 D 20

15 N 5

15 D 10

20 N 5

20 D 10



1

A finite automaton as a machine

A finite automaton can also be thought of as the device shown below consisting of a tapeand a control circuit which satisfy the following conditions:

1. The tape has the left end and extends to the right without an end.2. The tape is divide into squares in each of which a symbol can be written prior to the

start of the operation of the automaton.3. The tape has a read only head.4. The head is always at the leftmost square at the beginning of the operation.



1

5. The head moves to the right one square every time it reads a symbol.It never moves to the left. When it sees no symbol, it stops and the automatonterminates its operation.

6. There is a finite control which determines the state of the automaton and alsocontrols the movement of the head.

Operation of finite automata

Let us see how an automaton operates when it is given some inputs. As an example let usconsider the DFA of Example 3 above.Initially it is in state 0. When zero or more a's are given as an input to it, it stays in state 0while it reads all the a's (without breaks) on the tape. Since the state 0 is also the acceptingstate, when all the a's on the tape are read, the DFA is in the accepting state. Thus thisautomaton accepts any string of a's. If b is read while it is in state 0 (initially or after readingsome a's), it moves to state 1. Once it gets to state 1, then no matter what symbol is read,this DFA never leaves state 1. Hence when b appears anywhere in the input, it goes intostate 1 and the input string is not accepted by the DFA. For example strings aaa, aaaaaaetc. are accepted but strings such as aaba, b etc. are not accepted by this automaton.

* of DFA and its Properties



1


•*

• Language accepted by DFA

Contents

Here we are going to formally describe what is meant by applying a transition repeatedly,

that is the concept of *

For a state q and string w, *( q , w ) is the state the DFA goes into when it reads the stringw starting at the state q. In general a DFA goes through a number of states from the state q

responding to the symbols in the string w. Thus for a DFA < Q , , q0 , , A > , thefunction

* : Q * -> Q is defined recursively as follows:

Definition of *:

Basis Clause: For any state q of Q , *( q , ) = q , where denotes the empty string.

Inducitve Clause: For any state q of Q, any string y * and any symbol a ,*( q , ya ) = ( *( q , y ) , a ) .

In the definition, the Basis Clause says that a DFA stays in state q when it reads an emptystring at state q and the Inductive Clause says that the state DFA reaches after readingstring ya starting at state q is the state it reaches by reading symbol a after reading string yfrom state q.

Example

For example suppose that a DFA contains the transitions shown below.

Then *( q , DNR ) can be calculated as follows:

*( q , DNR ) = ( *( q , DN ) , R ) by the Inductive Clause.

= ( ( *( q , D ) , N ) , R ) by applying the Inductive Clause to *( q , DN ).



1

= ( ( *( q , D ) , N ) , R ) since D = D .

= ( ( ( *( q , ) , D ) , N ) , R ) by applying the Inductive Clause to *( q ,D ).

= ( ( ( q , D ) , N ) , R ) , since ( q , ) = q .

= ( ( q1 , N ) , R ) , since ( q , D ) = q1 as seen from the diagram.

= ( q2 , R ) , since ( q1 , N ) = q2 as seen from the diagram.

= q3 since ( q2 , R ) = q3 as seen from the diagram.

Properties of *

We can see the following two properties of * .

Theorem 1: For any state q of Q and any symbol a of for a DFA < Q , , q0 , , A > ,

*( q , a ) = ( q , a )

Proof : Since a = a ,*( q , a ) = *( q , a ) .

By the definition of * ,*( q , a ) = ( *( q , ) , a )

But *( q , ) = q by the definition of * .

Hence ( *( q , ) , a ) = ( q , a ) .

The next theorem states that the state reached from any state, say q , by reading a string,say w , is the same as the state reached by first reading a prefix of w, call it x, and then byreading the rest of the w, call it y.

Theorem 2: For any state q of Q and any strings x and y over for a DFA < Q , , q0 ,, A > ,

*( q , xy ) = *( *( q , x ) , y ) .

Proof : This is going to be proven by induction on string y. That is the statement to beproven is the following:

For an arbitrary fixed string x, *( q , xy ) = *( *( q , x ) , y ) holds for any arbitrarystring y.

First let us review the recursive definition of *.

Recursive definition of *:

Basis Clause: *.

Inductive Clause: If x * and a , then xa * .

Extremal Clause: Nothing is in * unless it is obtained from the above two clauses.



1

Now the proof of the theorem.

Basis Step: If y = , then *( q , xy ) = *( q , x ) = *( q , x ) .

Also *( *( q , x ) , y ) = *( *( q , x ) , ) = *( q , x ) by the definition of * . Hence the

theorem holds for y = .

Inductive Step: Assume that *( q , xy ) = *( *( q , x ) , y ) holds for an arbitrary string

y. This is the induction hypothesis.We are going to prove that *( q , xya ) = *( *( q , x ) , ya ) for any arbitrary symbol a of

.

*( q , xya ) = ( *( q , xy ) , a ) by the definition of *

= ( * ( *( q , x ) , y ) , a ) by the induction hypothesis.

= *( *( q , x ) , ya ) by the definition of * .

Thus the theorem has been proven.

Test Your Understanding of * of DFA and its Properties

Indicate which of the following statements are correct and which are not.Click Yes or No , then Submit.

For the following DFA answer the questions given below.

Language Accepted by DFA



1


• Language accepted by DFA

Contents

Here we are going to formally define what is meant by a DFA (deterministic finiteautomaton) accepting a string or a language.

A string w is accepted by a DFA < Q , , q0 , , A > , if and only if *( q0 , w )A . That is a string is accepted by a DFA if and only if the DFA starting at the initial stateends in an accepting state after reading the string.

A language L is accepted by a DFA < Q , , q0 , , A > , if and only if L = { w | *(

q0 , w ) A } . That is, the language accepted by a DFA is the set of strings accepted by theDFA.

Example 1 :

This DFA accepts { } because it can go from the initial state to the accepting state (alsothe initial state) without reading any symbol of the alphabet i.e. by reading an empty string

. It accepts nothing else because any non-empty symbol would take it to state 1, which isnot an accepting state, and it stays there.

Example 2 :

This DFA does not accept any string because it has no accepting state. Thus the language it

accepts is the empty set .



1

Example 3 : DFA with one cycle

This DFA has a cycle: 1 - 2 - 1 and it can go through this cycle any number of times byreading substring ab repeatedly.

To find the language it accepts, first from the initial state go to state 1 by reading one a. Then from state 1 go through the cycle 1 - 2 - 1 any number of times by reading substringab any number of times to come back to state 1. This is represented by (ab)*. Then fromstate 1 go to state 2 and then to state 3 by reading aa. Thus a string that is accepted by thisDFA can be represented by a(ab)*aa .

Example 4 : DFA with two independent cycles

This DFA has two independent cycles: 0 - 1 - 0 and 0 - 2 - 0 and it can move through thesecycles any number of times in any order to reach the accepting state from the initial statesuch as 0 - 1 - 0 - 2 - 0 - 2 - 0. Thus a string that is accepted by this DFA can be representedby ( ab + bb )*.

Example 5 : DFA with two interleaved cycles



1

This DFA has two cycles: 1 - 2 - 0 - 1 and 1 - 2 - 3 - 1. To find the language accepted by this DFA, first from state 0 go to state 1 by reading a ( anyother state which is common to these cycles such as state 2 can also be used instead of state 1 ). Then from state 1 go through the two cycles 1 - 2 - 0 - 1 and 1 - 2 - 3 - 1 anynumber of times in any order by reading substrings baa and bba, respectively. At this pointa substring a( baa + bba )* will have been read. Then go from state 1 to state 2 and then tostate 3 by reading bb. Thus altogether a( baa + bba )*bb will have been read when state 3 isreached from state 0.

Example 6 :

This DFA has two accepting states: 0 and 1. Thus the language that is accepted by this DFAis the union of the language accepted at state 0 and the one accepted at state 1. The



1

language accepted at state 0 is b* . To find the language accepted at state 1, first at state 0read any number of b's. Then go to state 1 by reading one a. At this point (b*a) will havebeen read. At state 1 go through the cycle 1 - 2 - 1 any number of times by readingsubstring ba repeatedly. Thus the language accepted at state 1 is b*a(ba)* .

There is a systematic way of finding the language accepted by a DFA and we are going tolearn it later. So we are not going to go any further on this problem here.

Definition of Nondeterministic Finite Automata


• Nondeterministic finite automata

•

State transition diagram• State transition table

Contents

In the previous section we have seen DFAs that accept some simple languages such as , {

} , and { a }. As you might have noticed, those DFAs have states and transitions which donot contribute to accepting strings and languages. For example all we need about an FA thataccepts { a } is the following regardless of the alphabet (whether be it { a } , { a , b } orany other) .

This is so to say the essence of such an FA. But it is not DFA. A DFA that accepts { a } wouldneed more states and transitions as you can see below for example.



1

Without those extra state and transitions it is not a DFA if the alphabet is { a , b } . To avoid those redundant states and transitions and to make modeling easier we use finiteautomata called nondeterministic finite automata (abbreviated as NFA) . Below we are goingto formally define nondeterministic finite automata (abbreviated as NFS) and see someexamples. As we are going to see later, for any NFA there is a DFA which accepts the samelanguage and vice versa.

NFAs are quite similar to DFAs. The only difference is in the transition function. NFAs do notnecessarily go to a unique next state. An NFA may not go to any state from the current stateon reading an input symbol or it may select one of several states nondeterministically (e.g.by throwing a die) as its next state.

Definition of nondeterministic finite automaton

Let Q be a finite set and let be a finite set of symbols. Also let be a function from Q

to 2Q , let q0 be a state in Q and let A be a subset of Q. We call the elements of Q a

state, the transition function, q0 the initial state and A the set of accepting states.

Then a nondeterministic finite automaton is a 5-tuple < Q , , q0 , , A >

Notes on the definition

1. As in the case of DFA the set Q in the above definition is simply a set with a finitenumber of elements. Its elements can be interpreted as a state that the system(automaton) is in.

2. The transition function is also called a next state function . Unlike DFAs an NFA

moves into one of the states given by (q, a) if it receives the input symbol a while

in state q. Which one of the states in (q, a) to select is determinednondeterministically.

3. Note that is a function. Thus for each state q of Q and for each symbol a of

(q, a) must be specified. But it can be the empty set, in which case the NFA abortsits operation.

4. As in the case of DFA the accepting states are used to distinguish sequences of inputs given to the finite automaton. If the finite automaton is in an accepting state



1

when the input ends i.e. ceases to come, the sequence of input symbols given to thefinite automaton is "accepted". Otherwise it is not accepted.

5. Note that any DFA is also a NFA.

Examples of NFA

Example 1: Q = { 0, 1 }, = { a }, A = { 1 }, the initial state is 0 and is as shown in thefollowing table.

A state transition diagram for this finite automaton is given below.

If the alphabet is changed to { a, b } in stead of { a }, this is still an NFA that accepts

{ a } .

Example 2: Q = { 0, 1, 2 }, = { a, b }, A = { 2 }, the initial state is 0 and is as shownin the following table.

State(q)

Input(a)


0 a { 1 }

1 a

State(q)

Input(a)


0 a { 1 , 2 }

0 b

1 a

1 b { 2 }

2 a

2 b



1

Note that for each state there are two rows in the table for corresponding to the symbols aand b, while in the Example 1 there is only one row for each state.

A state transition diagram for this finite automaton is given below.

Operation of NFA

Let us see how an automaton operates when some inputs are applied to it. As an examplelet us consider the automaton of Example 2 above.

Initially it is in state 0. When it reads the symbol a, it moves to either state 1 or state 2.Since the state 2 is the accepting state, if it moves to state 2 and no more inputs are given,then it stays in the accepting state. We say that this automaton accepts the string a. If onthe other hand it moves to state 1 after reading a, if the next input is b and if no more inputsare given, then it goes to state 2 and remains there. Thus the string ab is also accepted bythis NFA. If any other strings are given to this NFA, it does not accept any of them.

Let us now define the function * and then formalize the concepts of acceptance of stringsand languages by NFA.



1

Language Accepted by NFA


•* for NFA

• Language accepted by NFA

• Properties of *

Contents

Here we are going to formally define what is meant by an NFA (nondeterministic finite

automaton) accepting a string or a language. We start with the concept of * .

Definition of *

For a state q and string w, *( q , w ) is the set of states that the NFA can reach when itreads the string w starting at the state q. In general an NFA nondeterministically goesthrough a number of states from the state q as it reads the symbols in the string w. Thus for

an NFA < Q , , q0 , , A > , the function

* : Q * -> 2Q is defined recursively as follows:

Definition of *:

Basis Clause: For any state q of Q,*

( q , ) = { q }, where denotes the empty string.

Inducitve Clause: For any state q of Q, any string y * and any symbol a ,

*( q , ya ) =

In the definition, the Basis Clause says that an NFA stays in state q when it reads an emptystring at state q and the Inductive Clause says that the set of states NFA can reach afterreading string ya starting at state q is the set of states it can reach by reading symbol aafter reading string y starting at state q.

Example

For example consider the NFA with the following transitiontable:

State

(q)

Input

(a)

Next State ( (q,

a) )0 a { 0 , 1 , 3 }

0 b { 2 }

1 a

1 b { 3 }

2 a { 3 }

2 b

3 a

3 b { 1 }



1

The transition diagram for this NFA is as given below.

Suppose that the state 3 is an accepting state of this NFA.

Then *( 0 , ab ) can be calculated as follows:

*( 0 , ab ) is the union of ( p, b ) for all p *( 0 , a ) by the Inductive Clause of the

definition of * .

Now *( 0 , a ) is the union of ( p, a ) for all p *( 0 , ) again by the Inductive Clause

of the definition of * .

By the Basis Clause of the definition of *

,*

( 0 , ) = { 0 } .Hence *( 0 , a ) = ( 0 , a ) = { 0 , 1 , 3 } .

Hence *( 0 , ab ) = ( 0 , b ) ( 1 , b ) ( 3 , b ) = { 2 } { 3 } { 1 } = { 1 , 2 ,3 } .

We say that a string x * is accepted by an NFA < Q, , q0, , A > if and only if



1

*( q0 , x ) A is not empty, that is, if and only if it can reach an accepting state by reading x

starting at the initial state. The language accepted by an NFA < Q, , q0, , A > is theset of strings that are accepted by the NFA.

Some of the strings accepted by the NFA given above are , a, ab, aaa, abbbb etc. and thelanguage it accepts is a*( ab + a + ba )(bb)* .

* for NFA has properties similar to that for DFA.

Theorem 1: For any state q of Q and any symbol a of for an NFA < Q , , q0 , , A > ,

*( q , a ) = ( q , a )

Theorem 2: For any state q of Q and any strings x and y over for an NFA < Q , , q0 ,

, A > ,

*( q , xy ) =

These theorems can be proven in a manner similar to those for Theorems 1 and 2 for DFA.

Regular Grammar


• Production and Grammar

• Regular Grammar

• Context-Free, Context-Sensitive and Phrase Structure Grammars

Contents

We have learned three ways of characterising regular languages: regular expressions, finite

automata and construction from simple languages using simple operations. There is yet

another way of characterizing them, that is by something called grammar. A grammar is a

set of rewrite rules which are used to generarte strings by successively rewriting symbols.

For example consider the language represented by a+

, which is { a, aa, aaa, . . . } . One cangenerate the strings of this language by the following procedure: Let S be a symbol to start

the process with. Rewrite S using one of the following two rules: S -> a , and S -> aS . These

rules mean that S is rewritten as a or as aS. To generate the string aa for example, start with

S and apply the second rule to replace S with the right hand side of the rule, i.e. aS, to

obtain aS. Then apply the first rule to aS to rewrite S as a. That gives us aa. We write S =>

aS to express that aS is obtained from S by applying a single production. Thus the process of

obtaining aa from S is written as S => aS => aa . If we are not interested in the



1

intermediate steps, the fact that aa is obtained from S is written as S =>* aa , In general if a

string is obtained from a string by applying productions of a grammar G, we write

=>*G and say that is derived from . If there is no ambiguity about the grammar G

that is referred to, then we simply write =>*

Formally a grammar consists of a set of nonterminals (or variables) V, a set of terminals

(the alphabet of the language), a start symbol S, which ia a nonterminal, and a set of rewrite

rules (productions) P. A production has in general the form -> , where is a string of

terminals and nonterminals with at least one nonterminal in it and is a string of terminals

and nonterminals. A grammar is regular if and only if is a single nonterminal and is a

single terminal or a single terminal followed by a single nonterminal, that is a production is

of the form X -> a or X -> aY, where X and Y are nonterminals and a is a terminal.

For example, = {a, b}, V = { S } and P = { S -> aS, S -> bS, S -> } is a regular

grammar and it generates all the strings consisting of a's and b's including the empty string.

The following theorem holds for regular grammars.

Theorem 3: A language L is accepted by an FA i.e. regular, if L - { } can be generated by

a regular grammar.

This can be proven by constructing an FA for the given grammar as follows: For each

nonterminal create a state. S corresponds to the initial state. Add another state as the

accepting state Z. Then for every production X -> aY, add the transition ( X, a ) = Y and for

every production X -> a add the transition ( X, a ) = Z.

For example = {a, b}, V = { S } and P = { S -> aS, S -> bS, S -> a, S -> b } form a

regular grammar which generates the language ( a + b )+. An NFA that recognizes this

language can be obtained by creating two states S and Z, and adding transitions ( S, a ) =

{ S, Z } and ( S, b ) = { S, Z } , where S is the initial state and Z is the accepting state of

the NFA.

The NFA thus obtained is shown below.



1

Thus L - { } is regular. If L contains as its member, then since { } is regular , L = ( L -

{ } ) { } is also regular.

Conversely from any NFA < Q, , , q0, A > a regular grammar < Q, , P, q0 > is obtained

as follows:

for any a in , and nonterminals X and Y, X -> aY is in P if and only if (X, a) = Y , and forany a in and any nonterminal X, X -> a is in P if and only if (X, a) = Y for some accepting

state Y.

Thus the following converse of Theorem 3 is obtained.

Theorem 4 : If L is regular i.e. accepted by an NFA, then L - { } is generated by a regular

grammar.

For example, a regular grammar corresponding to the NFA given below is < Q, { a, b }, P, S

> , where Q = { S, X, Y } , P = { S -> aS, S -> aX, X -> bS, X -> aY, Y -> bS, S -> a } .

In addition to regular languages there are three other types of languages in Chomsky

hierarchy : context-free languages, context-sensitive languages and phrase structure

languages. They are characterized by context-free grammars, context-sensitive grammars

and phrase structure grammars, respectively.

These grammars are distinguished by the kind of productions they have but they also form a

hierarchy, that is the set of regular languages is a subset of the set of context-free

languages which is in turn a subset of the set of context-sensitive languages and the set of

context-sensitive languages is a subset of the set of phrase structure languages.

A grammar is a context-free grammar if and only if its production is of the form X -> ,

where is a string of terminals and nonterminals, possibly the empty string.

For example P = { S -> aSb, S -> ab } with = { a, b } and V = { S } is a contex-free

grammar and it generates the language { anbn | n is a positive integer } . As we shall see

later this is an example of context-free language which is not regular.

A grammar is a context-sensitive grammar if and only if its production is of the form 1X

http://www.cs.odu.edu/~toida/nerzic/390teched/language/intro.html







1

2 -> 1 2, where X is a nonterminal and 1 , 2 and are strings of terminals and

nonterminals, possibly empty except .

Thus the nonterminal X can be rewritten as only in the context of 1X 2 .

For example P = { S -> XYZS1, S -> XYZ, S1 -> XYZS1, S1 -> XYZ, YX -> XY, ZX -> XZ, ZY ->

YZ, X -> a, aX -> aa, aY -> ab, BY -> bb, bZ -> bc, cZ -> cc } with = { a, b, c } and V =

{ X, Y, Z, S, S1 } is a context-sensitive grammar and it generates the language { anbncn | n is

a positive integer } . It is an example of context-sensitive language which is not context-

free.

Context-sensitive grammars are also characterized by productions whose left hand side is

not longer than the right hand side, that is, for every production -> , | | | | .

For a phrase structure grammar, there is no restriction on the form of production, that is

a production of a phrase structure grammar can take the form -> , where and can

be any string.



1



Documents

Afj+ +Dodatni+ +Eng