Chapter 2 Language & Syntax Description Section 1 Alphabet & String

Page 1

1. Alphabet: a non-empty set of symbols, usually denoted by Σ, V, or another upper-case Greek letter.

2. Symbol (character): an element of the alphabet; the smallest element of a language.

3. String: a finite sequence of symbols from the alphabet.

Note: The null string is the string without any symbol, written ε.

Page 2

4. Sentence: a string built from symbols of the alphabet according to certain construction rules.

5. Language: a set of sentences over the alphabet.

Note: By convention, a symbol is written a, b, c, …; a string is written α, β, γ, …; a set of strings is written A, B, C, ….

Page 3

6. Operations on sets of strings

1) Concatenation (product)

Let A = {α1, α2, …} and B = {β1, β2, …} be sets of strings. The (Cartesian) product AB is defined as AB = {αβ | α ∈ A and β ∈ B}.

Notes:
1) The product of a string set with itself is called a power of the string set.
2) A^0 = {ε}.
3) The n-th power of an alphabet A is the set of all strings of length n.
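The product and power of string sets can be sketched in a few lines of Python. This is only an illustration: the function names `concat` and `power` are not from the slides, and the null string ε is represented by the empty Python string.

```python
from itertools import product


def concat(A, B):
    """Product AB = {αβ | α ∈ A and β ∈ B}."""
    return {x + y for x, y in product(A, B)}


def power(A, n):
    """n-th power of a string set; A^0 = {ε}, with ε as the empty string."""
    result = {""}
    for _ in range(n):
        result = concat(result, A)
    return result


A = {"a", "b"}
print(sorted(concat(A, {"0", "1"})))  # ['a0', 'a1', 'b0', 'b1']
print(sorted(power(A, 2)))            # ['aa', 'ab', 'ba', 'bb']
```

For an alphabet of k single symbols, `power(A, n)` holds k^n strings, all of length n.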

Page 4

6. Operations on sets of strings (continued)

2) Closure and positive closure

a) Closure: A* = A^0 ∪ A^1 ∪ A^2 ∪ …, the set of all strings over the alphabet A, including the null string ε.

b) Positive closure: A+ = A^1 ∪ A^2 ∪ … = A* − {ε}.

Note: A language is a subset of the closure A* of its alphabet (of the positive closure A+ when the null string is not a sentence).
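The closure is an infinite set, so a program can only list a finite part of it. The sketch below enumerates the strings of the closure (or of the positive closure) up to a chosen length. The name `closure_up_to` is hypothetical, and the code assumes that every element of the alphabet is a single character.

```python
def closure_up_to(A, max_len, positive=False):
    """Strings of A* (or A+ if positive=True) of length <= max_len.

    Assumes A contains single-character symbols, so the k-th round
    produces exactly the strings of length k, i.e. the power A^k.
    """
    level = {""}                        # A^0 = {ε}
    result = set() if positive else {""}
    for _ in range(max_len):
        level = {s + a for s in level for a in A}
        result |= level
    return result


print(sorted(closure_up_to({"a", "b"}, 2), key=lambda s: (len(s), s)))
# ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']
```

The only difference between the two results is the null string, matching the relation between the closure and the positive closure stated on the slide.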

Page 5

Chapter 2 Language & Syntax Description Section 2 Grammar & Language

1. Basic concepts

a. Grammar: the set of formal production rules that describe how syntax elements are constructed.

Notes:
1) Syntax elements include sentences and the words in sentences; a language is composed of sentences.
2) A production rule has the form left-side → right-side, read as "left-side is defined as right-side", "left-side derives right-side", or "left-side produces right-side". It expresses the relation between the two sides.

Page 6

b. Non-terminal symbol
– A symbol that appears on the left side of a rule, is bracketed in <>, and expresses a syntax concept.
– The set of non-terminal symbols is written VN.

c. Terminal symbol
– A string in a language that cannot be decomposed further (including single-character strings). The set of terminal symbols is written VT.

Note: Terminal symbols are the basic elements of a sentence.

Page 7

d. Start symbol
– A special non-terminal symbol that is the core of the defined syntax.

Note: The start symbol is also called the "identified symbol".

e. Production
– A rule that defines a relation between strings.
Form: A → α (A produces α)
E.g. <Sentence> → <Subject><Predicate>

Page 8

f. Derivation
– The process that starts from the start symbol and derives a sentence by repeatedly replacing the left side of a production rule with its right side.
– Leftmost (rightmost) derivation: each step applies one production rule and replaces the leftmost (rightmost) non-terminal symbol with the rule's right side.

Note: The rightmost derivation is conventionally called the canonical derivation.
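A derivation can be traced mechanically: at each step, find the leftmost (or rightmost) non-terminal symbol in the sentential form and replace it with the right side of the chosen production. The sketch below assumes single-character symbols and uses the small grammar E → E+E | i purely for illustration; the function name `derive` and its parameters are hypothetical.

```python
def derive(start, choices, nonterminals, leftmost=True):
    """Apply one production per step to the leftmost (or rightmost)
    non-terminal and return every sentential form along the way.

    choices is a list of (left side, right side) pairs, one per step.
    """
    form = [start]
    steps = ["".join(form)]
    for lhs, rhs in choices:
        positions = [i for i, sym in enumerate(form) if sym in nonterminals]
        if not positions:
            raise ValueError("no non-terminal left to replace")
        i = positions[0] if leftmost else positions[-1]
        if form[i] != lhs:
            raise ValueError(f"expected {form[i]} on the left side, got {lhs}")
        form[i:i + 1] = list(rhs)      # replace the symbol by the right side
        steps.append("".join(form))
    return steps


rules = [("E", "E+E"), ("E", "i"), ("E", "i")]
print(" => ".join(derive("E", rules, {"E"})))
# E => E+E => i+E => i+i
print(" => ".join(derive("E", rules, {"E"}, leftmost=False)))
# E => E+E => E+i => i+i
```

Both derivations end in the same sentence; they differ only in the order in which the non-terminal symbols are replaced.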

Page 9

g. Reduction
– Reduction is the inverse of derivation: starting from a given sentence of a language, it replaces right sides of production rules with their left sides until it arrives at the start symbol.
– Leftmost (rightmost) reduction is the inverse of rightmost (leftmost) derivation.

Note: The leftmost reduction, the inverse of the rightmost derivation, is conventionally called the canonical reduction.

Page 10

h. Sentential form, sentence, and language

• Sentential form
– A string α produced from the start symbol S by zero or more derivation steps, written S ⇒* α, where α ∈ (VN ∪ VT)*.

• Sentence
– A sentential form that contains only terminal symbols.

• Language
– The set of sentences produced from S by one or more derivation steps, written L(G): L(G) = {α | S ⇒+ α and α ∈ VT*}.

Page 11

i. Recursive definition of grammar rules
– A non-terminal symbol appears in its own definition.

Note: Take care when defining a grammar recursively. The recursion must have an exit (a non-recursive special case); otherwise no derivation can ever end in a sentence.

Page 12

j. Extended notations of grammar rules

Extended BNF (Backus-Naur Form) uses the following notations:
– ( ): factoring out a common part. E.g. U → ax | ay | az can be rewritten as U → a(x | y | z).
– { }: repetition, optionally with bounds on the repeat count. E.g. <Identifier> → <Letter>{<Letter> | <Digit>}, with the braces annotated by the bounds 0 and 5 (at least 0 and at most 5 repetitions).
– [ ]: optional part. E.g. <Integer> → [+ | -]<Digit>{<Digit>}
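The two extended-BNF examples map directly onto regular expressions, which gives a quick way to test them. The sketch below makes two assumptions that the slides do not state: <Letter> means an ASCII letter and <Digit> means 0 to 9, and the bounds on the braces in the identifier rule are read as 0 to 5 repetitions. The helper names are hypothetical.

```python
import re

# <Identifier> → <Letter>{<Letter>|<Digit>}, braces repeated 0 to 5 times
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9]{0,5}")

# <Integer> → [+|-]<Digit>{<Digit>}
INTEGER = re.compile(r"[+-]?[0-9][0-9]*")


def is_identifier(text):
    """True if the whole text matches the identifier rule."""
    return IDENTIFIER.fullmatch(text) is not None


def is_integer(text):
    """True if the whole text matches the integer rule."""
    return INTEGER.fullmatch(text) is not None


print(is_identifier("x1"), is_identifier("1x"))  # True False
print(is_integer("-42"), is_integer("+"))        # True False
```

The brackets `[ ]` become the `?` quantifier, unbounded braces become `*`, and bounded braces become `{0,5}`.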

Page 13

k. Meta-language symbol
Symbols used to describe the relations between grammar symbols, e.g. "→" and "|", are called meta-language symbols.

Page 14

2. Formal definition

a. Grammar definition
A grammar G is defined as a quadruple (VN, VT, P, S), where VN is the set of non-terminal symbols, VT the set of terminal symbols, P the set of productions, and S the start symbol.

b. Catalog of grammars
According to the restrictions placed on the production rules, grammars are classified into four types: 0-type, 1-type, 2-type, and 3-type grammars.

Page 15

b. Catalog of grammars (continued)

(1) 0-type grammar (phrase-structure grammar, or unrestricted grammar)
– For every production α → β in P, α ∈ V+ and β ∈ V*, and α contains at least one non-terminal symbol. Here V = VN ∪ VT.

Notes:
– The automaton that recognizes a 0-type language is the Turing machine.
– A 0-type grammar places the fewest restrictions on its productions.
– The other types of grammar are obtained by restricting the form of the productions of a 0-type grammar.

Page 16

b. Catalog of grammars (continued)

(2) 1-type grammar (context-sensitive grammar, or length-increasing grammar)
– Every production α → β in P satisfies |β| >= |α|, except for S → ε. If S → ε is in P, S must not appear on the right side of any production.
– Equivalently, every production in P has the form αAβ → αγβ (where α, β ∈ V*, A ∈ VN, γ ∈ V+), except for S → ε.

Notes:
– The automaton that recognizes a 1-type language is the linear bounded automaton (LBA).
– In a 1-type grammar, a non-terminal symbol is replaced with regard to its context. A non-terminal symbol cannot be replaced by ε; the only exception is that the start symbol may produce ε.

Page 17

b. Catalog of grammars (continued)

(3) 2-type grammar (context-free grammar)
– Every production in P has the form A → β, where A ∈ VN and β ∈ V*.

Notes:
– The left side of each production is a single non-terminal symbol; the right side may contain symbols of VN and VT, or be ε.
– The automaton that recognizes a 2-type language is the pushdown automaton (PDA).

Page 18

b. Catalog of grammars (continued)

(4) 3-type grammar (regular grammar: right-linear or left-linear grammar)
– Every production in P has the form A → αB or A → α (right-linear), or the form A → Bα or A → α (left-linear), where A, B ∈ VN and α ∈ VT*.

Notes: The productions of a 3-type grammar are either all right-linear or all left-linear; the two kinds cannot be mixed in one grammar. If all productions are left-linear, the grammar is called a left-linear grammar. If all productions are right-linear, it is called a right-linear grammar.

Page 19

b. Catalog of grammars (continued)

(4) 3-type grammar (continued)

Notes:
– The automaton that recognizes a 3-type language is the finite-state automaton.
– 2-type grammar = self-embedding grammar (with productions of the form S → aSb) + regular grammar; that is, any 2-type grammar without the self-embedding property is equivalent to a regular grammar.

Page 20

b. Catalog of grammars (summary)

Hierarchy | Alias | Production form | Automaton name
0-type | Unrestricted grammar | α → β, α ∈ V+ | Turing machine
1-type | Context-sensitive grammar | αAβ → αγβ, A ∈ VN | Linear bounded automaton
2-type | Context-free grammar | A → β, A ∈ VN | Pushdown automaton
3-type | Regular grammar | A → αB, A → α; A, B ∈ VN, α ∈ VT* | Finite automaton
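The four types can be told apart by inspecting the shape of each production. The sketch below is one possible classifier under simplifying assumptions that are not part of the slides: every symbol is a single character, upper-case letters are non-terminal symbols, ε is the empty string, and the 1-type test uses the length condition |β| >= |α|. The function names are hypothetical.

```python
def is_nonterminal(sym):
    return sym.isupper()


def grammar_type(productions, start="S"):
    """Return the most restrictive type (3, 2, 1 or 0) that all
    productions, given as (left side, right side) strings, satisfy."""
    for lhs, _ in productions:
        if not any(is_nonterminal(s) for s in lhs):
            raise ValueError(f"left side {lhs!r} has no non-terminal")

    def right_linear(rhs):   # A → αB or A → α
        return not any(is_nonterminal(s) for s in rhs[:-1])

    def left_linear(rhs):    # A → Bα or A → α
        return not any(is_nonterminal(s) for s in rhs[1:])

    if all(len(lhs) == 1 for lhs, _ in productions):
        rights = [rhs for _, rhs in productions]
        if all(map(right_linear, rights)) or all(map(left_linear, rights)):
            return 3
        return 2

    start_on_right = any(start in rhs for _, rhs in productions)

    def non_contracting(lhs, rhs):
        if lhs == start and rhs == "":      # S → ε is the only exception
            return not start_on_right
        return len(rhs) >= len(lhs)

    if all(non_contracting(l, r) for l, r in productions):
        return 1
    return 0


print(grammar_type([("S", "aS"), ("S", "a"), ("S", "b")]))  # 3
print(grammar_type([("S", "aSb"), ("S", "ab")]))            # 2
print(grammar_type([("S", "aA"), ("A", "Sb")]))             # 2
```

The last call mixes a right-linear and a left-linear production, so the grammar is reported as 2-type rather than 3-type.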

Page 21

c. i-type language
– A language produced by an i-type grammar G, written L(G): L(G) = {α | α ∈ VT* and S ⇒+ α}.

Page 22

Example: Let G1 = ({S}, {a, b}, P, S), where P includes:
(0) S → aS
(1) S → a
(2) S → b
L(G1) = {a^i (a|b) | i >= 0}

Example: Let G2 = ({S}, {a, b}, P, S), where P includes:
(0) S → aSb
(1) S → ab
L(G2) = {a^n b^n | n >= 1}
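The languages of G1 and G2 can be checked by enumerating every sentence up to a given length. The sketch below performs a breadth-first search over sentential forms, always expanding the leftmost non-terminal symbol. It assumes single-character symbols, upper-case non-terminals, and a grammar without ε-productions, so a sentential form never gets shorter and forms longer than the limit can be discarded. The function name `sentences` is hypothetical.

```python
from collections import deque


def sentences(productions, start="S", max_len=6):
    """All sentences of length <= max_len derivable from start."""
    seen = {start}
    queue = deque([start])
    found = set()
    while queue:
        form = queue.popleft()
        idx = next((i for i, s in enumerate(form) if s.isupper()), None)
        if idx is None:              # only terminal symbols: a sentence
            found.add(form)
            continue
        for rhs in productions.get(form[idx], []):
            new = form[:idx] + rhs + form[idx + 1:]
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(found, key=lambda s: (len(s), s))


G1 = {"S": ["aS", "a", "b"]}
G2 = {"S": ["aSb", "ab"]}
print(sentences(G1, max_len=3))  # ['a', 'b', 'aa', 'ab', 'aaa', 'aab']
print(sentences(G2, max_len=6))  # ['ab', 'aabb', 'aaabbb']
```

The output agrees with the set descriptions: every sentence of G1 is a run of a's followed by one a or b, and every sentence of G2 has the shape a^n b^n.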

Page 23

Notes: Grammars used in lexical analysis and syntax analysis must satisfy the following restrictions on their productions:
– There is no production of the form P → P; such a production is useless and only introduces ambiguity.
– Every non-terminal symbol P is reachable and can derive a terminal string:
  • Starting from the start symbol S, there exists a derivation S ⇒* αPβ.
  • P can derive a terminal string, that is, P ⇒+ γ with γ ∈ VT*.

Page 24

Chapter 2 Language & Syntax Description Section 3 Grammar construction and simplification

1. Constructing a grammar from a language

Example 1: Let L1 = {a^(2n) b^n | n >= 1 and a, b ∈ VT}. Construct the grammar G1 for L1.

The first sentences of L1 are:
n = 1: aab
n = 2: aaaabb
n = 3: aaaaaabbb
……

So we have:
S → aaSb
S → aab
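A quick check of the constructed grammar: applying S → aaSb exactly n − 1 times and then S → aab should give the n-th sentence of L1. The sketch below does this with plain string replacement; the function name `derive_l1` is hypothetical.

```python
def derive_l1(n):
    """Derive the n-th sentence of L1 using S → aaSb and S → aab."""
    if n < 1:
        raise ValueError("n must be at least 1")
    form = "S"
    for _ in range(n - 1):
        form = form.replace("S", "aaSb")   # recursive production
    return form.replace("S", "aab")        # exit production


for n in range(1, 4):
    print(n, derive_l1(n))
# 1 aab
# 2 aaaabb
# 3 aaaaaabbb
```

Each use of the recursive production adds two a's on the left and one b on the right, which keeps the 2n : n ratio of the language.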

Page 25

Example 2: Let L2 = {a^i b^j c^k | i, j, k >= 1 and a, b, c ∈ VT}. Construct the grammar G2 for L2.

S → aS
S → aB
B → bB
B → bC
C → cC | c

Page 26

Example 3: Let L3 = {ω | ω ∈ (a, b)* and ω contains as many a's as b's}. Construct the grammar G3 for L3.

One grammar:
S → ε
S → bB, S → aA
A → bS | b, A → aAA
B → aS | a | bBB

An alternative grammar:
S → ε
S → aSbS
S → bSaS

Page 27

Example 4: Let L4 = {ω | ω ∈ (0, 1)* and the number of 1's in ω is even}. Construct the grammar G4 for L4.

S → ε
S → 0S, S → 1A
A → 0A, A → 1S
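Because G4 is right-linear, its non-terminal symbols can be read as the states of a finite automaton: each production X → cY becomes a transition from X to Y on input c, and S → ε makes S an accepting state. The sketch below follows that reading; the names `TRANSITIONS`, `ACCEPTING`, and `in_l4` are hypothetical.

```python
# One transition per production of the form X → cY.
TRANSITIONS = {
    ("S", "0"): "S",   # S → 0S
    ("S", "1"): "A",   # S → 1A
    ("A", "0"): "A",   # A → 0A
    ("A", "1"): "S",   # A → 1S
}
ACCEPTING = {"S"}      # S → ε


def in_l4(word):
    """True if word is over {0, 1} and contains an even number of 1's."""
    state = "S"
    for ch in word:
        if (state, ch) not in TRANSITIONS:
            return False               # symbol outside the alphabet
        state = TRANSITIONS[(state, ch)]
    return state in ACCEPTING


print(in_l4("0110"), in_l4("10"), in_l4(""))  # True False True
```

State S means "an even number of 1's read so far" and state A means "an odd number", which is exactly the role the two non-terminal symbols play in the grammar.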

Page 28

2. Grammar simplification

a. Because a language can be described by different grammars, we should select the grammar that has the fewest productions and best fits the properties of the language.

b. A grammar may contain redundant productions that are useless for derivation. These should be deleted:
– productions of the form P → P;
– productions that can never derive a terminal string;
– productions whose left-side non-terminal symbol is not the start symbol and does not appear on the right side of any production.

Page 29

c. Steps of simplification:
– Find the productions of the form P → P and delete them.
– If a production can never be used in a derivation from the start symbol, delete it.
– If a production cannot derive a terminal string, delete it.
– Arrange the remaining productions.

Page 30

Example: Simplify the following grammar.

(0) S → Be   (1) S → Ec   (2) A → Ae
(3) A → e    (4) A → A    (5) B → Ce
(6) B → Af   (7) C → Cf   (8) D → f

Result:

(0) S → Be   (1) A → Ae   (2) A → e   (3) B → Af
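The simplification can be automated. The sketch below is one way to do it, not necessarily the procedure the slides intend: it deletes P → P, then keeps only productions whose symbols can all derive a terminal string, then keeps only productions reachable from the start symbol. Running the terminal-string check before the reachability check is a deliberate choice, since deleting a non-terminating production can make other symbols unreachable. Symbols are assumed to be single characters with upper-case non-terminals, and the function name `simplify` is hypothetical.

```python
def simplify(productions, start="S"):
    """Remove useless productions from a list of (left, right) pairs."""
    # Step 1: drop productions of the form P → P.
    prods = [(l, r) for l, r in productions if r != l]

    # Step 2: find non-terminals that can derive a terminal string.
    productive = set()
    changed = True
    while changed:
        changed = False
        for l, r in prods:
            if l not in productive and all(
                not s.isupper() or s in productive for s in r
            ):
                productive.add(l)
                changed = True
    prods = [
        (l, r) for l, r in prods
        if l in productive
        and all(not s.isupper() or s in productive for s in r)
    ]

    # Step 3: keep only non-terminals reachable from the start symbol.
    reachable = {start}
    stack = [start]
    while stack:
        nt = stack.pop()
        for l, r in prods:
            if l == nt:
                for s in r:
                    if s.isupper() and s not in reachable:
                        reachable.add(s)
                        stack.append(s)
    return [(l, r) for l, r in prods if l in reachable]


grammar = [("S", "Be"), ("S", "Ec"), ("A", "Ae"), ("A", "e"), ("A", "A"),
           ("B", "Ce"), ("B", "Af"), ("C", "Cf"), ("D", "f")]
print(simplify(grammar))
# [('S', 'Be'), ('A', 'Ae'), ('A', 'e'), ('B', 'Af')]
```

On the grammar of the example this reproduces the four productions listed in the result.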

Page 31

3. Constructing a context-free grammar without ε-productions

a. A context-free grammar without ε-productions satisfies the following conditions:
– If P contains the production S → ε, where S is the start symbol of the grammar, then S does not appear on the right side of any production.
– P contains no other ε-productions.

b. Algorithm to construct a context-free grammar without ε-productions, transforming G = (VN, VT, P, S) into G' = (V'N, V'T, P', S'):

(1) Find all non-terminal symbols that can derive ε in some number of steps, and put them into the set V0.

Page 32

b. Algorithm to construct a context-free grammar without ε-productions (continued):

(2) Construct the production set P' of G' as follows:

(A) If a symbol of V0 appears on the right side of a production, split the production into two: one with the symbol replaced by ε and one with the symbol kept. Put the new productions into P'.

(B) Otherwise, put the production into P' unchanged, unless it is an ε-production.

(C) If P contains the production S → ε, replace it with S' → ε | S and put these into P'; let S' be the start symbol of G', and let V'N = VN ∪ {S'}.

Page 33

Example: Let G1 = ({S}, {a, b}, P, S), where P:
(0) S → ε
(1) S → aSbS
(2) S → bSaS

(1) V0 = {S}

(2) P':
(1) S → abS | aSbS | aSb | ab
(2) S → baS | bSaS | bSa | ba
(0) S' → ε | S

So G1' = ({S', S}, {a, b}, P', S'), where P':
(0) S' → ε | S
(1) S → abS | aSbS | aSb | ab
(2) S → baS | bSaS | bSa | ba
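The construction can be sketched in code as follows. The sketch computes V0, then expands every production by keeping or dropping each occurrence of a symbol from V0, discards empty right sides, and finally adds the new start symbol when S can derive ε. It assumes single-character symbols with ε written as the empty string; the function name `remove_epsilon` and the default name `S'` for the new start symbol are illustrative choices.

```python
from itertools import product


def remove_epsilon(productions, start="S", new_start="S'"):
    """Return (P', start symbol of G') for a list of (left, right) pairs."""
    # (1) V0: non-terminals that can derive ε.
    nullable = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in productions:
            if lhs not in nullable and all(s in nullable for s in rhs):
                nullable.add(lhs)
                changed = True

    # (2)(A)/(B): each symbol of V0 on a right side is either kept or
    # replaced by ε; right sides that become empty are dropped.
    new = set()
    for lhs, rhs in productions:
        options = [(s, "") if s in nullable else (s,) for s in rhs]
        for choice in product(*options):
            body = "".join(choice)
            if body:
                new.add((lhs, body))

    # (2)(C): a fresh start symbol restores ε if S could derive it.
    if start in nullable:
        new.add((new_start, ""))
        new.add((new_start, start))
        return new, new_start
    return new, start


P = [("S", ""), ("S", "aSbS"), ("S", "bSaS")]
new_p, s = remove_epsilon(P)
for lhs, rhs in sorted(new_p):
    print(lhs, "→", rhs or "ε")
```

For the grammar of the example, the result contains the eight S-productions of P' plus S' → ε and S' → S.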

Page 34

Chapter 2 Language & Syntax Description Section 4 Ambiguity of a grammar

a. Ambiguity of a sentence
If a sentence of a grammar has two or more different syntax trees, the sentence is ambiguous.

b. Ambiguity of a grammar
If the language of a grammar contains an ambiguous sentence, the grammar is ambiguous.

Page 35

Example: G = ({E}, {+, *, (, ), i}, P, E), where P: E → E+E | E*E | (E) | i

The sentence (i*i+i) has two leftmost derivations, and therefore two syntax trees:

(1) E ⇒ (E) ⇒ (E+E) ⇒ (E*E+E) ⇒ (i*E+E) ⇒ (i*i+E) ⇒ (i*i+i)

(2) E ⇒ (E) ⇒ (E*E) ⇒ (i*E) ⇒ (i*E+E) ⇒ (i*i+E) ⇒ (i*i+i)

Page 36

Syntax tree for derivation (1):

        E
      / | \
     (  E  )
      / | \
     E  +  E
   / | \   |
  E  *  E  i
  |     |
  i     i

Syntax tree for derivation (2):

        E
      / | \
     (  E  )
      / | \
     E  *  E
     |   / | \
     i  E  +  E
        |     |
        i     i
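For this particular grammar, E → E+E | E*E | (E) | i, the number of syntax trees of a sentence can be counted directly, which gives a mechanical check of the ambiguity. The sketch below tries every production on every substring and memoizes the counts. It is written for this grammar only, and the function name `count_trees` is hypothetical.

```python
from functools import lru_cache


def count_trees(sentence):
    """Number of syntax trees of sentence under E → E+E | E*E | (E) | i."""

    @lru_cache(maxsize=None)
    def count(i, j):
        span = sentence[i:j]
        total = 0
        if span == "i":                                   # E → i
            total += 1
        if len(span) >= 3 and span[0] == "(" and span[-1] == ")":
            total += count(i + 1, j - 1)                  # E → (E)
        for k in range(i + 1, j - 1):                     # E → E+E | E*E
            if sentence[k] in "+*":
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(sentence))


print(count_trees("(i*i+i)"))  # 2
print(count_trees("i"))        # 1
```

A count greater than 1 means the sentence is ambiguous; a count of 0 means the string is not a sentence of the grammar.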

Page 37

Notes:

(1) Ambiguity introduces uncertainty into syntax analysis.

(2) Ambiguity of a grammar is undecidable: there is no algorithm that determines in finitely many steps whether an arbitrary grammar is ambiguous.

(3) To prove that a grammar is ambiguous, it suffices to give one example: a sentence with two different syntax trees.

(4) If the ambiguity of a grammar can be controlled, for instance by additional conditions, its existence is not so bad.