Chapter 2 Language & Syntax Description
Section 1 Alphabet & String
1. Alphabet
A non-empty set of symbols, usually denoted Σ, V, or another upper-case Greek letter.
2. Symbol (character)
An element of the alphabet; the smallest element of a language.
3. String
A finite sequence of symbols from the alphabet.
Note: The null string is the string that contains no symbols, written ε.
4. Sentence
A string of symbols from the alphabet that is built according to certain construction rules.
5. Language
A set of sentences over the alphabet.
Note: By convention, symbols are written a, b, c, …; strings are written α, β, γ, …; sets of strings are written A, B, C, ….
6. Operations on sets of strings
1) Concatenation (product)
Let A = {α1, α2, …} and B = {β1, β2, …} be sets of strings. The product AB is defined as AB = {αβ | α ∈ A and β ∈ B}.
Notes:
1) The product of a string set with itself is called a power of the string set.
2) A^0 = {ε}
3) The n-th power of an alphabet A is the set of all strings of length n over A.
2) Closure and positive closure
a) Closure: A* = A^0 ∪ A^1 ∪ A^2 ∪ …
This is the set of all strings over the alphabet A, including the null string.
b) Positive closure: A+ = A^1 ∪ A^2 ∪ … = A* − {ε}
Note: A language over the alphabet A is a subset of the closure A* (and of A+ when it does not contain ε).
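The set operations above can be tried out on small examples. The following is a minimal sketch, assuming strings are modelled as Python `str` values and string sets as Python `set` objects; the function names `concat`, `power` and `closure_up_to` are illustrative only, and the closure is truncated at a chosen power because A* itself is infinite.

```python
from itertools import product as cartesian

EPSILON = ""  # assumption: the null string is modelled as the empty str


def concat(a, b):
    """Product AB = {xy | x in A and y in B}."""
    return {x + y for x, y in cartesian(a, b)}


def power(a, n):
    """n-th power of a string set, with A^0 = {epsilon}."""
    result = {EPSILON}
    for _ in range(n):
        result = concat(result, a)
    return result


def closure_up_to(a, max_power, positive=False):
    """Union of A^0..A^max_power (A^1..A^max_power if positive=True)."""
    first = 1 if positive else 0
    strings = set()
    for n in range(first, max_power + 1):
        strings |= power(a, n)
    return strings


alphabet = {"a", "b"}
print(sorted(power(alphabet, 2)))  # ['aa', 'ab', 'ba', 'bb']
```

For an alphabet, the truncated closure and the truncated positive closure differ only by the null string, which mirrors A+ = A* − {ε}.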
Section 2 Grammar & Language
1. Basic concepts
a. Grammar
A grammar is a set of formal production rules that describe how syntax elements are constructed.
Notes:
1) Syntax elements include sentences and the words within sentences; a language is composed of sentences.
2) A production rule has the form left-side → right-side, read as "left-side is defined as right-side", "left-side derives right-side", or "left-side produces right-side". It expresses the relation between the two sides.
b. Non-terminal symbol
– A symbol that appears on the left side of a rule, is enclosed in <>, and denotes a syntax concept.
– The set of non-terminal symbols is written VN.
c. Terminal symbol
– A string of the language that cannot be decomposed further (including single-character strings). The set of terminal symbols is written VT.
Note: Terminal symbols are the basic elements of a sentence.
d. Start symbol
– A special non-terminal symbol that is the core of the defined syntax.
Note: The start symbol is also called the "identified symbol".
e. Production
– A rule that defines a relation between strings.
The form: A → α (A produces α)
E.g. <Sentence> → <Subject><Predicate>
f. Derivation
– The process that starts from the start symbol and derives a sentence by repeatedly replacing the left side of a production rule with its right side.
– Leftmost (rightmost) derivation: each step applies exactly one production rule and replaces the leftmost (rightmost) non-terminal symbol with the right side of that rule.
Note: Leftmost and rightmost derivations are called canonical derivations (many texts reserve the term for the rightmost derivation).
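The difference between leftmost and rightmost derivation can be made concrete with a short sketch. It assumes a hypothetical toy grammar S → AB, A → a, B → b (not taken from the slides), single-character symbols, and upper-case letters as non-terminals; `derive` is an illustrative name.

```python
def derive(start, rules, choices, leftmost=True):
    """Return the sentential forms of a leftmost or rightmost derivation.

    rules maps a non-terminal to its list of right sides; choices[k] is the
    index of the alternative applied at step k. Assumes each step still has
    a non-terminal to replace.
    """
    form = start
    steps = [form]
    for choice in choices:
        # positions of all non-terminals in the current sentential form
        positions = [i for i, sym in enumerate(form) if sym.isupper()]
        pos = positions[0] if leftmost else positions[-1]
        rhs = rules[form[pos]][choice]
        form = form[:pos] + rhs + form[pos + 1:]
        steps.append(form)
    return steps


rules = {"S": ["AB"], "A": ["a"], "B": ["b"]}  # hypothetical toy grammar
print(" => ".join(derive("S", rules, [0, 0, 0])))                  # S => AB => aB => ab
print(" => ".join(derive("S", rules, [0, 0, 0], leftmost=False)))  # S => AB => Ab => ab
```

Both derivations end in the same sentence; they differ only in which non-terminal is replaced first.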
g. Reduction
– Reduction is the inverse process of derivation: starting from a given sentence of a language, the right sides of production rules are repeatedly replaced with their left sides until the start symbol is reached.
– Leftmost (rightmost) reduction is the inverse process of rightmost (leftmost) derivation.
Note: Leftmost and rightmost reductions are called canonical reductions.
h. Sentential form, sentence & language
• Sentential form
– A string derived from the start symbol in zero or more derivation steps. Written S ⇒* α, α ∈ (VN ∪ VT)*
• Sentence
– A sentential form that contains only terminal symbols.
• Language
– The set of sentences (strings) derived from S in one or more derivation steps. Written L(G): L(G) = {α | S ⇒+ α and α ∈ VT*}
i. Recursive definition of grammar rules
– A non-terminal symbol appears in its own definition.
Note: Take care when defining a grammar recursively. The recursion must have an exit case (a non-recursive alternative); otherwise no sentence can ever be derived.
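j. Extended notations of grammar rules
Use extended BNF (Backus-Naur Form) notations:
– (): factoring out a common factor
E.g. U → ax | ay | az is rewritten as U → a(x | y | z)
– {}: repetition, optionally with a repeat count
E.g. <Identifier> → <Letter>{<Letter> | <Digit>}, with subscript 0 and superscript 5 on the braces giving the minimum and maximum number of repetitions
– []: optional symbol
E.g. <Integer> → [+ | -]<Digit>{<Digit>}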
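One way to see what the extended notations mean is to translate the two example rules into regular expressions. This is only a sketch under stated assumptions: <Letter> is taken to be an ASCII letter, <Digit> a decimal digit, and the repeat count on the identifier rule is read as "0 to 5 repetitions"; the names `IDENTIFIER`, `INTEGER`, `is_identifier` and `is_integer` are illustrative.

```python
import re

# <Identifier> -> <Letter>{<Letter>|<Digit>}, braces repeated 0 to 5 times
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9]{0,5}")
# <Integer> -> [+|-]<Digit>{<Digit>}: optional sign, then one or more digits
INTEGER = re.compile(r"[+-]?[0-9][0-9]*")


def is_identifier(text):
    """True if the whole text matches the identifier rule."""
    return IDENTIFIER.fullmatch(text) is not None


def is_integer(text):
    """True if the whole text matches the integer rule."""
    return INTEGER.fullmatch(text) is not None


print(is_identifier("x1"), is_identifier("1x"))  # True False
print(is_integer("-42"), is_integer("+"))        # True False
```

The braces map to a bounded or unbounded repetition and the square brackets map to `?`, which is why these two rules stay within what a regular expression can describe.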
k. Meta-language symbol
The symbols used to describe the relations among grammar symbols, e.g. "→" and "|", are called meta-language symbols.
2. Formal definition
a. Grammar definition
A grammar G is defined as a quadruple (VN, VT, P, S), where VN is the set of non-terminal symbols, VT the set of terminal symbols, P the set of productions, and S the start symbol.
b. Classification of grammars
According to the restrictions placed on the production rules, grammars are classified into four types: 0-type, 1-type, 2-type and 3-type grammars.
(1) 0-type grammar (phrase-structure grammar, or unrestricted grammar)
– For every production α → β in P, where α ∈ V+ and β ∈ V* (V = VN ∪ VT), α contains at least one non-terminal symbol.
Notes:
The automaton that recognizes a 0-type language is the Turing machine.
A 0-type grammar places the fewest restrictions on its productions.
The other types of grammar are obtained by restricting the form of the productions in a 0-type grammar.
(2) 1-type grammar (context-sensitive grammar, or length-increasing grammar)
– For every production α → β in P, |β| >= |α|, except for S → ε. If S → ε is in P, S cannot appear on the right side of any production.
– Alternatively, every production in P has the form αAβ → αγβ (where α, β ∈ V*, A ∈ VN, γ ∈ V+), except for S → ε.
Notes:
The automaton that recognizes a 1-type language is the linear bounded automaton (LBA).
In a 1-type grammar, the context of a non-terminal symbol must be taken into account when the symbol is replaced. A non-terminal symbol cannot be replaced by ε, except that the start symbol may produce ε.
(3) 2-type grammar (context-free grammar)
– Every production in P has the form A → β, where A ∈ VN and β ∈ V*.
Notes:
The left side of each production must be a single non-terminal symbol; the right side may consist of symbols from VN and VT, or be ε.
The automaton that recognizes a 2-type language is the push-down automaton (PDA).
(4) 3-type grammar (regular grammar: right-linear grammar or left-linear grammar)
– Every production in P has the form A → αB, A → α (right-linear), or A → Bα, A → α (left-linear), where A, B ∈ VN and α ∈ VT*.
Notes:
The productions of a 3-type grammar are either all right-linear or all left-linear; left-linear and right-linear productions cannot be mixed in one grammar.
If all the productions of a 3-type grammar are left-linear, the grammar is called a left-linear grammar.
If all the productions of a 3-type grammar are right-linear, the grammar is called a right-linear grammar.
Notes on 3-type grammars (continued):
The automaton that recognizes a 3-type language is the finite-state automaton.
2-type grammar = self-embedding grammar (with productions of the form S → aSb) + regular grammar; that is, any 2-type grammar without the self-embedding property is equivalent to a regular grammar.
Summary of the grammar hierarchy:
Hierarchy | Alias | Production form | Automaton
0-type | Unrestricted grammar | α → β, α ∈ V+ | Turing machine
1-type | Context-sensitive grammar | αAβ → αγβ, A ∈ VN | Linear bounded automaton
2-type | Context-free grammar | A → β, A ∈ VN | Pushdown automaton
3-type | Regular grammar | A → αB, A → α; A, B ∈ VN, α ∈ VT* | Finite automaton
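The production-form restrictions in the hierarchy can be checked mechanically. The sketch below assumes productions are given as (left, right) pairs of strings, every symbol is one character, upper-case letters are non-terminals, and "" stands for ε; `grammar_type` and its helpers are illustrative names, and the function reports the most restrictive type whose form all productions satisfy.

```python
def has_nonterminal(text):
    return any(sym.isupper() for sym in text)


def is_right_linear(lhs, rhs):
    # A -> wB or A -> w, where w is a (possibly empty) terminal string
    body = rhs[:-1] if rhs[-1:].isupper() else rhs
    return len(lhs) == 1 and lhs.isupper() and not has_nonterminal(body)


def is_left_linear(lhs, rhs):
    # A -> Bw or A -> w, where w is a (possibly empty) terminal string
    body = rhs[1:] if rhs[:1].isupper() else rhs
    return len(lhs) == 1 and lhs.isupper() and not has_nonterminal(body)


def grammar_type(productions, start="S"):
    """Return 3, 2, 1 or 0 according to the form of the productions."""
    if not all(has_nonterminal(lhs) for lhs, _ in productions):
        raise ValueError("every left side must contain a non-terminal")
    # 3-type: all right-linear or all left-linear, never a mixture
    if (all(is_right_linear(l, r) for l, r in productions)
            or all(is_left_linear(l, r) for l, r in productions)):
        return 3
    # 2-type: every left side is a single non-terminal
    if all(len(l) == 1 and l.isupper() for l, _ in productions):
        return 2
    # 1-type: no production shrinks, except S -> epsilon with S never on a right side
    start_on_rhs = any(start in r for _, r in productions)
    if all((len(r) >= len(l)) if r else (l == start and not start_on_rhs)
           for l, r in productions):
        return 1
    return 0


print(grammar_type([("S", "aS"), ("S", "a"), ("S", "b")]))  # 3
print(grammar_type([("S", "aSb"), ("S", "ab")]))            # 2
```

A grammar that mixes a right-linear production such as S → aA with a left-linear one such as A → Sb is reported as 2-type, in line with the rule that the two forms cannot be mixed in a 3-type grammar.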
c. i-type language
– A language generated by an i-type grammar.
Written L(G): L(G) = {α | α ∈ VT* and S ⇒+ α}
Example: Let G1 = ({S}, {a, b}, P, S)
where P includes:
(0) S → aS
(1) S → a
(2) S → b
L(G1) = {a^i(a|b) | i >= 0}
Example: Let G2 = ({S}, {a, b}, P, S)
where P includes:
(0) S → aSb
(1) S → ab
L(G2) = {a^n b^n | n >= 1}
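The two example languages can be checked by enumerating short sentences. The following is a rough sketch, assuming single-character symbols, upper-case non-terminals, and a grammar without ε-productions (so that sentential forms never shrink and a length bound can be used to stop the search); `sentences` is an illustrative name.

```python
from collections import deque


def sentences(rules, start="S", max_len=4):
    """All sentences of length <= max_len, found by breadth-first leftmost derivation."""
    found, seen, queue = set(), {start}, deque([start])
    while queue:
        form = queue.popleft()
        # index of the leftmost non-terminal, or None if the form is a sentence
        pos = next((i for i, sym in enumerate(form) if sym.isupper()), None)
        if pos is None:
            found.add(form)
            continue
        for rhs in rules[form[pos]]:
            new_form = form[:pos] + rhs + form[pos + 1:]
            if len(new_form) <= max_len and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return found


G1 = {"S": ["aS", "a", "b"]}
G2 = {"S": ["aSb", "ab"]}
by_length = lambda s: (len(s), s)
print(sorted(sentences(G1, max_len=3), key=by_length))  # ['a', 'b', 'aa', 'ab', 'aaa', 'aab']
print(sorted(sentences(G2, max_len=6), key=by_length))  # ['ab', 'aabb', 'aaabbb']
```

The output agrees with L(G1) = {a^i(a|b) | i >= 0} and L(G2) = {a^n b^n | n >= 1} for the lengths enumerated; it is a spot check, not a proof.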
Notes: Grammars used for lexical analysis and syntax analysis are subject to the following restrictions on their productions:
– There is no production of the form P → P, since such a production is useless and only introduces ambiguity.
– Every non-terminal symbol P must be reachable and must be able to derive a terminal string:
• Starting from the start symbol S, there exists a derivation S ⇒* αPβ;
• P must be able to derive a terminal string, that is, P ⇒+ γ, γ ∈ VT*.
Section 3 Grammar construction and simplification
1. Constructing a grammar from a language
Example 1: Let L1 = {a^(2n) b^n | n >= 1 and a, b ∈ VT}
Try to construct the grammar G1 from L1
For n = 1 the sentence is aab; for n = 2, aaaabb; for n = 3, aaaaaabbb; ……
So we have:
S → aaSb
S → aab
Example 2: Let L2 = {a^i b^j c^k | i, j, k >= 1 and a, b, c ∈ VT}
Try to construct the grammar G2 from L2
S → aS, S → aB
B → bB, B → bC
C → cC | c
Example 3: Let L3 = {ω | ω ∈ {a, b}* and ω contains as many a's as b's}
Try to construct the grammar G3 from L3
S → ε
S → bB, S → aA
A → bS | b, A → aAA
B → aS | a | bBB
Alternatively:
S → ε, S → aSbS, S → bSaS
Example 4: Let L4 = {ω | ω ∈ {0, 1}* and the number of 1s in ω is even}
Try to construct the grammar G4 from L4
S → ε
S → 0S, S → 1A
A → 0A, A → 1S
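Because the grammar of Example 4 is right-linear, it can be read directly as a finite automaton: each non-terminal becomes a state, each production X → cY becomes a transition, and S → ε makes S an accepting state. The sketch below follows that reading; the names `TRANSITIONS`, `ACCEPTING` and `accepts` are illustrative, and inputs are assumed to contain only the characters 0 and 1.

```python
from itertools import product

# S -> 0S | 1A | epsilon ;  A -> 0A | 1S
TRANSITIONS = {
    ("S", "0"): "S", ("S", "1"): "A",
    ("A", "0"): "A", ("A", "1"): "S",
}
ACCEPTING = {"S"}  # S -> epsilon lets a derivation stop in S


def accepts(word):
    """Follow one transition per input symbol, starting from the start symbol S."""
    state = "S"
    for symbol in word:
        state = TRANSITIONS[(state, symbol)]
    return state in ACCEPTING


# Compare against the definition of L4 for every string of length < 6.
for n in range(6):
    for letters in product("01", repeat=n):
        word = "".join(letters)
        assert accepts(word) == (word.count("1") % 2 == 0)
print(accepts("0110"), accepts("0100"))  # True False
```

State S corresponds to "an even number of 1s seen so far" and state A to "an odd number", which is the idea behind the two non-terminals in the grammar.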
2. Grammar simplification
a. Because a language can be described by different grammars, we should select the grammar that has the fewest productions and best suits the properties of the language.
b. A grammar may contain redundant productions that are useless for derivation. These productions should be deleted:
– Productions of the form P → P
– Productions that can never derive a terminal string
– Productions whose left-side non-terminal symbol (other than the start symbol) does not appear on the right side of any production
c. Steps of simplification:
– Find the productions of the form P → P and delete them;
– If a production can never be used in any derivation, delete it;
– If a production cannot derive a terminal string, delete it;
– Renumber the remaining productions.
Example: Simplify the following grammar
(0) S → Be (1) S → Ec (2) A → Ae (3) A → e
(4) A → A (5) B → Ce (6) B → Af (7) C → Cf
(8) D → f
Result:
(0) S → Be (1) A → Ae (2) A → e (3) B → Af
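The simplification steps can be sketched as a small program and run on the example grammar. This is one possible arrangement rather than a definitive algorithm: it assumes single-character symbols with upper-case non-terminals, removes productions that cannot derive a terminal string before removing unreachable ones, and uses the illustrative name `simplify`.

```python
def simplify(productions, start="S"):
    """productions: list of (lhs, rhs) strings; returns the useful productions in order."""
    # Step 1: delete productions of the form P -> P.
    prods = [(l, r) for l, r in productions if l != r]

    def all_productive(rhs, productive):
        return all(not sym.isupper() or sym in productive for sym in rhs)

    # Step 2: find the non-terminals that can derive a terminal string.
    productive = set()
    changed = True
    while changed:
        changed = False
        for l, r in prods:
            if l not in productive and all_productive(r, productive):
                productive.add(l)
                changed = True
    prods = [(l, r) for l, r in prods
             if l in productive and all_productive(r, productive)]

    # Step 3: keep only productions whose left side is reachable from the start symbol.
    reachable, frontier = {start}, [start]
    while frontier:
        current = frontier.pop()
        for l, r in prods:
            if l == current:
                for sym in r:
                    if sym.isupper() and sym not in reachable:
                        reachable.add(sym)
                        frontier.append(sym)
    return [(l, r) for l, r in prods if l in reachable]


grammar = [("S", "Be"), ("S", "Ec"), ("A", "Ae"), ("A", "e"), ("A", "A"),
           ("B", "Ce"), ("B", "Af"), ("C", "Cf"), ("D", "f")]
print(simplify(grammar))
# [('S', 'Be'), ('A', 'Ae'), ('A', 'e'), ('B', 'Af')]
```

On the example, A → A is dropped in step 1; S → Ec, B → Ce and C → Cf are dropped in step 2 because E and C can never derive a terminal string; D → f is dropped in step 3 because D is unreachable.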
3. Constructing a context-free grammar without ε-productions
a. A context-free grammar without ε-productions satisfies the following conditions:
– If P contains the production S → ε, S does not appear on the right side of any production, where S is the start symbol of the grammar;
– There are no other ε-productions in P.
b. The algorithm to construct a context-free grammar without ε-productions:
– G = (VN, VT, P, S) is transformed into G' = (V'N, V'T, P', S')
(1) Find all non-terminal symbols that can derive ε after some number of steps, and put them into the set V0;
(2) Construct the production set P' of G' by the following steps:
(A) If a symbol in V0 appears on the right side of a production, turn the production into two productions, one with the symbol replaced by ε and one with the symbol kept; put the new productions into P'.
(B) Otherwise, put the productions for the symbol into P', except for its ε-production.
(C) If P contains the production S → ε, replace it with S' → ε | S and put these productions into P'; let S' be the start symbol of G', and let V'N = VN ∪ {S'}.
Example: Let G1 = ({S}, {a, b}, P, S), where
P: (0) S → ε (1) S → aSbS (2) S → bSaS
(1) V0 = {S}
(2) P':
(1) S → abS | aSbS | aSb | ab
(2) S → baS | bSaS | bSa | ba
(0) S' → ε | S
So: G1' = ({S', S}, {a, b}, P', S'), where
P': (0) S' → ε | S
(1) S → abS | aSbS | aSb | ab
(2) S → baS | bSaS | bSa | ba
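The construction can be sketched in a few lines and checked against the example. The code assumes a grammar stored as a dict from non-terminal to a set of right sides, single-character symbols, and "" for ε; `remove_epsilon` is an illustrative name. Two choices go slightly beyond the steps as stated and are assumptions of this sketch: every combination of keeping or dropping a V0 symbol is generated at once, and the new start symbol is added whenever S is in V0, not only when S → ε is literally in P.

```python
from itertools import product


def remove_epsilon(rules, start="S"):
    """Return (new_rules, new_start) for a grammar without epsilon-productions."""
    # (1) V0: the non-terminals that can derive epsilon.
    nullable = set()
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in rules.items():
            if lhs not in nullable and any(
                    all(sym in nullable for sym in rhs) for rhs in alternatives):
                nullable.add(lhs)
                changed = True

    # (2) For each production, keep or drop every occurrence of a V0 symbol.
    new_rules = {}
    for lhs, alternatives in rules.items():
        expanded = set()
        for rhs in alternatives:
            options = [(sym, "") if sym in nullable else (sym,) for sym in rhs]
            for combo in product(*options):
                candidate = "".join(combo)
                # skip epsilon-productions and useless P -> P productions
                if candidate and candidate != lhs:
                    expanded.add(candidate)
        new_rules[lhs] = expanded

    # (C) If the start symbol can derive epsilon, add a fresh start symbol S'.
    new_start = start
    if start in nullable:
        new_start = start + "'"
        new_rules[new_start] = {"", start}
    return new_rules, new_start


new_rules, new_start = remove_epsilon({"S": {"", "aSbS", "bSaS"}})
print(new_start, sorted(new_rules["S"]))
```

For the example grammar this yields the start symbol S' and the eight right sides listed in P' for S.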
Section 4 Ambiguity of a grammar
a. Ambiguity of a sentence
If a sentence of a grammar has two or more distinct syntax trees, the sentence is ambiguous.
b. Ambiguity of a grammar
If the language of a grammar contains an ambiguous sentence, the grammar is ambiguous.
Example: G = ({E}, {+, *, (, ), i}, P, E)
where P: E → E+E | E*E | (E) | i
For the sentence (i*i+i) there are two leftmost derivations, and therefore two syntax trees:
(1) E ⇒ (E) ⇒ (E+E) ⇒ (E*E+E) ⇒ (i*E+E) ⇒ (i*i+E) ⇒ (i*i+i)
(2) E ⇒ (E) ⇒ (E*E) ⇒ (i*E) ⇒ (i*E+E) ⇒ (i*i+E) ⇒ (i*i+i)
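The two syntax trees, written with the children of each node in square brackets:
Tree (1): E[ ( E[ E[ E[i] * E[i] ] + E[i] ] ) ]
Tree (2): E[ ( E[ E[i] * E[ E[i] + E[i] ] ] ) ]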
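The ambiguity of the sentence can be confirmed by counting its syntax trees. The sketch below is specific to the grammar E → E+E | E*E | (E) | i and assumes single-character tokens; `count_trees` is an illustrative name. It counts, for every substring, the number of ways it can be derived from E, which equals the number of distinct syntax trees (and of leftmost derivations).

```python
from functools import lru_cache


def count_trees(sentence):
    """Number of syntax trees of the sentence under E -> E+E | E*E | (E) | i."""
    tokens = tuple(sentence.replace(" ", ""))

    @lru_cache(maxsize=None)
    def count(i, j):
        # number of ways tokens[i:j] can be derived from E
        if i >= j:
            return 0
        total = 0
        if j - i == 1 and tokens[i] == "i":                           # E -> i
            total += 1
        if j - i >= 3 and tokens[i] == "(" and tokens[j - 1] == ")":  # E -> (E)
            total += count(i + 1, j - 1)
        for k in range(i + 1, j - 1):                                 # E -> E+E | E*E
            if tokens[k] in "+*":
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_trees("(i*i+i)"))  # 2
print(count_trees("i+i"))      # 1
```

A count greater than 1 for any sentence is enough to show that the grammar is ambiguous; a count of 1 for the sentences tried proves nothing about the grammar as a whole.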
Notes:
(1) Ambiguity introduces uncertainty into syntax analysis.
(2) Ambiguity of a grammar is undecidable; that is, there is no algorithm that can determine in a finite number of steps whether an arbitrary grammar is ambiguous.
(3) To prove that a grammar is ambiguous, it suffices to give one example: a sentence that has two distinct syntax trees.
(4) If the ambiguity of a grammar can be controlled, for example by imposing additional conditions, its existence is not so harmful.