Chapter 3
Language Translation Issues & Program Verification
Context-free grammars (Backus-Naur form grammars) were developed to specify the syntax of a language, but it was soon realized that syntax alone is insufficient. (What about the semantics?)
3.1 Programming language syntax
Syntax ::= “ the arrangement of words as elements in a sentence to show their relationship,” describes the sequence of symbols that make up valid programs.
X := Y + Z (valid for a Pascal program)
let X = Y + Z (Basic)
X = Y + Z (C language)
Add Y , Z to X (COBOL)
How much is 2 + 3 * 4 ? (14 or 20 ? It depends on syntax.)
In a statement like X = 2.45 + 3.67, syntax cannot tell us whether variable X was declared as type integer or as type real; results of X=5, X=6, and X=6.12 are all possible. We need more than just syntactic structures for the full description of a PL. Other attributes, grouped under the general term semantics, such as the use of declarations, operations, sequence control, and referencing environments, affect a variable and are not always determined by syntax rules.
3.1.1 General Syntactic Criteria
The details of syntax are chosen largely on the basis of secondary criteria, such as readability, writability, ease of verification, ease of translation, and lack of ambiguity, which are unrelated to the primary goal of communicating information to the language processor.
Readability: It is enhanced by such language features as natural statement formats, structured statements, liberal use of keywords and noise(optional) words, provision for embedded comments, unrestricted length identifiers, mnemonic operator symbols, free-field format, and complete data declarations.
Writability: The syntactic features that make a program easy to write are often in conflict with those that make it easy to read.
It is enhanced by concise and regular syntactic structures, whereas readability benefits from a variety of more verbose (tedious) constructs.
Implicit syntactic conventions that allow declarations and operations to be left unspecified make programs shorter and easier to write but harder to read.
A syntax is redundant if it communicates the same item of information in more than one way. The disadvantage is that redundancy makes programs more verbose and thus harder to write.
Ease of verifiability: (to be covered in Sec. 4-2-4)
Ease of translation: The LISP syntax provides an example of a program structure that is neither particularly readable nor particularly writable but that is extremely simple to translate.
Lack of ambiguity: for example, the "dangling else" problem in Pascal, or a reference to A(I, J) in FORTRAN (array element or function call?). How are these resolved?
Delimiters & brackets: Brackets are paired delimiters.
Free- & fixed-field formats: most high-level languages now use free-field format.
Expressions: from which the statements are built.
Statements: SNOBOL4 has only one basic statement syntax, while COBOL provides a different syntactic structure for each statement type. APL and SNOBOL4 do not allow "embedded statements".
3.1.2 Syntactic Elements of a Language
Character set: from 6-bit to 8-bit characters, and now up to 16-bit characters.
Identifiers: a string beginning with a letter, and …, length ?
Operator symbols: **, ^, sqr, sqrt, =, EQ, …
Keywords & reserved words: FORTRAN VS. COBOL
Noise words: GO [TO]
Comments: allow comments in several ways; …
Blanks (spaces): In SNOBOL4, a blank is used as the concatenation operator.
3.1.3 Overall Program-Subprogram Structure
Separate subprogram definitions: each subprogram is a separate syntactic unit. (C)
Separate data definitions: the class mechanism in Java, C++.
Nested subprogram definitions: Pascal supports this concept to any depth.
Separate interface definitions: if subprograms may be compiled separately, the interface must be handled carefully; otherwise, the cost of recompilation is considerable.
Data description separated from executable statements: COBOL's approach.
Unseparated subprogram definitions: SNOBOL4's lack of organization.
3.2 Stages in Translation
Source program → compiler → object program. For the statement

Position := initial + rate * 60

the stages proceed as follows:

Lexical analyzer:
  id1 := id2 + id3 * 60
Syntax analyzer:
  syntax tree (1)
Semantic analyzer:
  syntax tree (2)
Intermediate code generator:
  temp1 := inttoreal(60)
  temp2 := id3 * temp1
  temp3 := id2 + temp2
  id1 := temp3
Code optimizer:
  temp1 := id3 * 60.0
  id1 := id2 + temp1
Code generator:
  object code
3.2.1 Analysis of the Source Program
Lexical analysis (scanning): It reads successive lines of the input program and breaks them down into individual lexical items (tokens).
The formal model used to design lexical analyzers is the finite-state automata.
How about: DO 10 I = 1, 5 versus DO 10 I = 1.5? (Since FORTRAN ignores blanks, the first is a loop header while the second is an assignment to the variable DO10I; the scanner cannot distinguish them until it reaches the comma or the period.)
Syntactic analysis (parsing): Here the larger program structures are identified by the help of GRAMMARS.
Semantic analysis: The bridge between the analysis and synthesis parts of translation. It includes the following functions:
1. Symbol-table maintenance: The symbol-table entry contains more than just the identifier. It contains additional data concerning the attributes of that identifier: its type, type of values, referencing environment, and whatever other information is available from the input program through declarations and usage. The semantic analyzers enter this information into the symbol table as they process declarations, subprogram headers, and program statements.
2. Insertion of implicit information: take example of FORTRAN.
3. Error detection: The semantic analyzer must not only recognize those errors but determine the appropriate way to continue with syntactic analysis of the remainder of the program.
4. Macro processing and compile-time operation: Macro is a piece of program text that has been separately defined and that is to be inserted into the program during translation. A compile-time operation is an operation to be performed during translation to control the translation of the source program.
3.2.2 Synthesis of the Object Program
1. Optimization: Much research has been done on program optimization, and many sophisticated techniques are known.
2. Code generation: The output code may be directly executable, or there may be other translation steps to follow (e.g., assembly or linking and loading).
3. Linking and loading: In the optional final stage of translation, the pieces of code resulting from separate translation of subprograms are coalesced( merged) into the final executable program.
2004/3/15 10
Q1: Can a compiler for the C language be written in C?
Ans: well, if you know how to bootstrap it!
[Figure: T-diagrams showing how a C-to-M compiler written in C is run through an existing C compiler on machine M to obtain a C compiler that runs directly on M.]
3.3 Formal Translation Models
The syntactic recognition parts of compiler theory are fairly standard and generally based on the context-free theory of languages. We briefly summarize that theory in the next few pages.
The two classes of grammars useful in compiler technology include the BNF grammar (or context-free grammar) and the regular grammar.
3.3.1 BNF Grammars The BNF and context-free grammar forms are equivalent in power; the differences are only in notation. For this reason, the terms BNF grammar and context-free grammar are usually interchangeable in discussion of syntax.
Parse trees: Given a grammar, we can use a single-replacement rule to generate strings in our language. For example, the following grammar generates all sequences of balanced parentheses:
S → SS | (S) | ( )
Try to figure out the above parse tree .
Now consider figure 3.4, which shows a parse tree for the grammar depicted in fig. 3.5.
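As a quick check of the language this grammar generates, here is a minimal Python sketch. It uses a counter-based recognizer rather than a parser, which is equivalent for this particular language (nonempty balanced parenthesis strings); the function name is my own.

```python
def balanced(s):
    # Recognize the language of S -> SS | (S) | ( ):
    # nonempty strings of balanced parentheses.
    depth = 0
    for ch in s:
        if ch == '(':
            depth += 1
        elif ch == ')':
            depth -= 1
            if depth < 0:        # a ')' with no matching '('
                return False
        else:
            return False         # only parentheses are allowed
    return depth == 0 and len(s) > 0

print(balanced("(())()"))  # True
print(balanced("(()"))     # False
```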
Ambiguity: In natural language, we have
They are flying planes.
For grammars below, we have
S → SS | 0 | 1
T → 0T | 1T | 0 | 1
(The first grammar generates the nonempty binary strings ambiguously; the second generates the same language unambiguously.)
Extensions to BNF Notation:
Square brackets, parentheses, and an asterisk.
Try to process 100101 yourself.
Nondeterministic Finite Automata:
The NFA is a useful concept in proving theorems. Also, the concept of nondeterminism plays a central role in both the theory of languages and the theory of computation, and it is useful to understand this notion( i.e., concept) fully in a very simple context initially.
A finite automaton (FA) consists of a finite set of states and a set of transitions from state to state that occur on input symbols chosen from an alphabet Σ. For each input symbol there is exactly one transition out of each state (possibly back to the state itself). One state, usually denoted q0, is the initial state, in which the automaton starts. Some states are designated as final or accepting states.
[Figure: a two-state FA with states q0 (start, accepting) and q1; input a loops on each state, and input b moves between q0 and q1. It accepts the strings over {a, b} containing an even number of b's, described by the regular expression (a*ba*ba*)*.]
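The two-state machine just described can be simulated directly; a minimal Python sketch (state names follow the figure; the function name is my own):

```python
def dfa_accepts(s):
    # Two-state DFA: q0 = even number of b's seen (accepting), q1 = odd.
    # 'a' loops on the current state; 'b' toggles between the two states.
    state = "q0"
    for ch in s:
        if ch == "b":
            state = "q1" if state == "q0" else "q0"
        elif ch != "a":
            return False  # symbol outside the alphabet {a, b}
    return state == "q0"

print(dfa_accepts("abba"))  # True: two b's
print(dfa_accepts("ab"))    # False: one b
```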
Consider modifying the finite automaton model to allow zero, one, or more transitions from a state on the same input symbol. This new model is called a nondeterministic finite automaton(NFA). Note that the FA( i.e., DFA for emphasis) is a special case of the NFA in which for each state there is a unique transition on each symbol.
We may extend our model of the NFA to include transitions on the empty input ε. Therefore, our NFA may be depicted as:
[Figure: a three-state ε-NFA with states q0, q1, q2 and transitions on a, b, and ε, accepting the language (ab+ | ab+a)*.]
Regular Expression
→ Nondeterministic Finite State Automaton (NFA)
→ Deterministic Finite State Automaton (DFA)
→ Minimized DFA = Transition Diagram

A scanner is the minimized DFA plus a driver: Source Program → MDFA + Driver → token. This is the pipeline used by Lex (Lesk, 1975): RE → NFA → DFA.
(1) Regular Expression → NFA (Thompson's construction)
[1] If the RE is ø (the empty set), its NFA has no path from the start state to the accepting state.
[2] If the RE is ε (the empty string), its NFA is a single ε-transition from the start state to the accepting state.
[3] If the RE is a single symbol a, its NFA is a single a-transition from the start state to the accepting state.
[4] If NFAs MS and MT have already been built for REs S and T, then:
  {i} for S|T, a new start state branches by ε-transitions into MS and MT, and their accepting states lead by ε-transitions to a new accepting state;
  {ii} for S·T, the accepting state of MS is connected by an ε-transition to the start state of MT;
  {iii} for S*, new start and accepting states are added, with ε-transitions permitting MS to be traversed zero or more times.
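The construction steps above can be sketched compositionally in Python. The NFA representation (a start state, an accepting state, and a transition dictionary, with '' standing for ε) and all function names are my own, not from the text:

```python
import itertools

_ids = itertools.count()

def _new():
    return next(_ids)

def _merge(*deltas):
    # Union of transition dictionaries: (state, symbol) -> set of states.
    out = {}
    for d in deltas:
        for key, targets in d.items():
            out.setdefault(key, set()).update(targets)
    return out

def lit(a):
    # [3] RE is a single symbol a: start --a--> accept
    s, f = _new(), _new()
    return (s, f, {(s, a): {f}})

def union(m1, m2):
    # {i} S|T: a new start state branches into both machines by epsilon moves
    s, f = _new(), _new()
    s1, f1, d1 = m1
    s2, f2, d2 = m2
    d = _merge(d1, d2, {(s, ''): {s1, s2}, (f1, ''): {f}, (f2, ''): {f}})
    return (s, f, d)

def concat(m1, m2):
    # {ii} S.T: epsilon edge from S's accepting state to T's start state
    s1, f1, d1 = m1
    s2, f2, d2 = m2
    return (s1, f2, _merge(d1, d2, {(f1, ''): {s2}}))

def star(m):
    # {iii} S*: epsilon edges allow zero or more traversals of S
    s, f = _new(), _new()
    s1, f1, d1 = m
    return (s, f, _merge(d1, {(s, ''): {s1, f}, (f1, ''): {s1, f}}))

def _eps_closure(states, d):
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in d.get((q, ''), ()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen

def nfa_accepts(m, w):
    s, f, d = m
    current = _eps_closure({s}, d)
    for ch in w:
        moved = set()
        for q in current:
            moved |= d.get((q, ch), set())
        current = _eps_closure(moved, d)
    return f in current

# NFA for ab+ (an 'a' followed by one or more 'b's)
m = concat(lit('a'), concat(lit('b'), star(lit('b'))))
print(nfa_accepts(m, "abbb"))  # True
print(nfa_accepts(m, "a"))     # False
```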
Computational Power of an FSA: The set of strings it can recognize is limited (use the Pumping Lemma to check this).
A Push-Down Automaton (PDA) is a septuple P = (Q, Σ, Γ, δ, q0, Z, F), where
Q is a finite set of states,
Σ is a finite input alphabet,
Γ is a finite stack alphabet,
δ maps elements of Q × (Σ ∪ {ε}) × Γ into finite subsets of Q × Γ*,
q0 ∈ Q is the start state,
Z ∈ Γ is the start stack symbol,
F ⊆ Q is the set of final states.

Example: Let P = ({q0, q1, q2}, {0, 1}, {Z, 0}, δ, q0, Z, {q0}), where
δ(q0, 0, Z) = {(q1, 0Z)}
δ(q1, 0, 0) = {(q1, 00)}
δ(q1, 1, 0) = {(q2, ε)}
δ(q2, 1, 0) = {(q2, ε)}
δ(q2, ε, Z) = {(q0, ε)}
L(P) = {0^n 1^n | n ≥ 0}? Why?
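One way to see why L(P) = {0^n 1^n | n ≥ 0} is to simulate the PDA. A Python sketch that tracks all reachable configurations (the encoding of δ as a dictionary, with '' standing for ε, is my own):

```python
def pda_accepts(w):
    # delta from the example PDA:
    # (state, input symbol, stack top) -> list of (new state, string pushed)
    delta = {
        ('q0', '0', 'Z'): [('q1', '0Z')],
        ('q1', '0', '0'): [('q1', '00')],
        ('q1', '1', '0'): [('q2', '')],
        ('q2', '1', '0'): [('q2', '')],
        ('q2', '',  'Z'): [('q0', '')],
    }
    # A configuration is (state, input position, stack string with top first).
    configs = {('q0', 0, 'Z')}
    while configs:
        step = set()
        for q, i, stack in configs:
            if i == len(w) and q == 'q0':   # all input read, final state
                return True
            if not stack:
                continue
            top, rest = stack[0], stack[1:]
            for q2, push in delta.get((q, '', top), []):        # epsilon moves
                step.add((q2, i, push + rest))
            if i < len(w):
                for q2, push in delta.get((q, w[i], top), []):  # reading moves
                    step.add((q2, i + 1, push + rest))
        configs = step
    return False

print(pda_accepts("0011"))  # True
print(pda_accepts("0101"))  # False
```

Every move either consumes an input symbol or pops the stack, so the simulation always terminates.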
(1) The earliest parsing method:
1. written using recursive procedures
2. may require back-tracking over the token stream

Example: S ::= cAd, A ::= ab, A ::= a
If the input string is cad, its top-down parsing proceeds as follows:
[Figure: (1) the root S is expanded to c A d; (2) A is expanded with A ::= ab, which fails to match the remaining input; (3) the parser backs up and expands A with A ::= a, which succeeds.]

Procedure S( )
begin
  if input symbol = 'c' then
    ADVANCE( )
    if A( ) then
      if input symbol = 'd' then
        ADVANCE( )
        return true
      end if
    end if
  end if
  return false
end

Procedure A( )
begin
  isave = input-point
  if input symbol = 'a' then
    ADVANCE( )
    if input symbol = 'b' then
      ADVANCE( )
      return true
    end if
    input-point = isave    // could not find ab //
    if input symbol = 'a' then
      ADVANCE( )
      return true
    end if
  else
    return false
  end if
end
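The two procedures translate almost line for line into Python; a sketch of the backtracking parser (ADVANCE becomes pos += 1, input-point becomes pos, and the wrapper function is my own):

```python
def parse(s):
    # Recursive-descent with backtracking for S -> c A d, A -> ab | a.
    pos = 0

    def A():
        nonlocal pos
        save = pos                       # isave in the pseudocode
        if pos < len(s) and s[pos] == 'a':
            pos += 1
            if pos < len(s) and s[pos] == 'b':
                pos += 1
                return True              # matched A -> ab
            pos = save                   # back-track: could not find ab
            pos += 1                     # re-match the 'a' for A -> a
            return True
        return False

    def S():
        nonlocal pos
        if pos < len(s) and s[pos] == 'c':
            pos += 1
            if A():
                if pos < len(s) and s[pos] == 'd':
                    pos += 1
                    return True
        return False

    return S() and pos == len(s)

print(parse("cad"))   # True
print(parse("cabd"))  # True
```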
Copyright © 2009 Addison-Wesley. All rights reserved. 1-34
Chapter outline:
Introduction
The General Problem of Describing Syntax
Formal Methods of Describing Syntax
Attribute Grammars
Describing the Meanings of Programs: Dynamic Semantics
Introduction
Syntax: the form or structure of the expressions, statements, and program units.
Semantics: the meaning of the expressions, statements, and program units.
Syntax and semantics provide a language's definition.
Users of a language definition: language designers, implementers, and programmers (the users of the language).
The General Problem of Describing Syntax: Terminology
A sentence is a string of characters over some alphabet.
A language is a set of sentences.
A lexeme is the lowest level syntactic unit of a language (e.g., *, sum, begin).
A token is a category of lexemes (e.g., identifier). Please refer to the next page!
Example statement: Index = 2 * count + 17 ;

Lexemes    Tokens
Index      identifier
=          equal_sign
2          int_literal
*          mult_op
count      identifier
+          plus_op
17         int_literal
;          semicolon
Formal Definition of Languages
Recognizers: A recognition device reads input strings over the alphabet of the language and decides whether the input strings belong to the language. Example: the syntax analysis part of a compiler (detailed discussion of syntax analysis appears in Chapter 4).
Generators: A device that generates the sentences of a language. One can determine whether a particular sentence is syntactically correct by comparing it to the structure of the generator.
BNF and Context-Free Grammars
Context-Free Grammars: developed by Noam Chomsky in the mid-1950s; language generators, meant to describe the syntax of natural languages; define a class of languages called context-free languages (refer to the next page).
Backus-Naur Form (1959): invented by John Backus to describe Algol 58; BNF is equivalent to context-free grammars.
Type 0: Unrestricted Grammars: any production α → β is allowed.
Type 1: Context-Sensitive Grammars (CSG): for all α → β, |α| ≤ |β|.
Type 2: Context-Free Grammars (CFG): for all α → β, α ∈ N (i.e., A → β).
Type 3: Right (or Left)-Linear Grammars: all productions are of the form A → x or A → xB.

G2 = ({S, B, C}, {a, b, c}, P, S)
P: S → aSBC
   S → abC
   CB → BC
   bB → bb
   bC → bc
   cC → cc
Which type?

G3: S → S + S
    S → S * S
    S → (S)
    S → a
Which type?
BNF Fundamentals
In BNF, abstractions are used to represent classes of syntactic structures; they act like syntactic variables (also called nonterminal symbols, or just nonterminals).
Terminals are lexemes or tokens.
A rule has a left-hand side (LHS), which is a nonterminal, and a right-hand side (RHS), which is a string of terminals and/or nonterminals.
Nonterminals are often enclosed in angle brackets.
Examples of BNF rules:
<ident_list> → identifier | identifier, <ident_list>
<if_stmt> → if <logic_expr> then <stmt>
Grammar: a finite non-empty set of rules.
A start symbol is a special element of the nonterminals of a grammar.
BNF Rules
An abstraction (or nonterminal symbol) can have more than one RHS:
<stmt> → <single_stmt> | begin <stmt_list> end
Describing Lists
Syntactic lists are described using recursion:
<ident_list> → ident | ident, <ident_list>
A derivation is a repeated application of rules, starting with the start symbol and ending with a sentence (all terminal symbols).
An Example Grammar
<program> → <stmts>
<stmts> → <stmt> | <stmt> ; <stmts>
<stmt> → <var> = <expr>
<var> → a | b | c | d
<expr> → <term> + <term> | <term> - <term>
<term> → <var> | const
An Example Derivation
<program> => <stmts>
=> <stmt>
=> <var> = <expr>
=> a = <expr>
=> a = <term> + <term>
=> a = <var> + <term>
=> a = b + <term>
=> a = b + const
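The derivation above can be replayed mechanically. A small Python sketch (the grammar encoding and the rule-choice indices are my own):

```python
# The example grammar; each nonterminal maps to its list of alternatives.
grammar = {
    '<program>': [['<stmts>']],
    '<stmts>':   [['<stmt>'], ['<stmt>', ';', '<stmts>']],
    '<stmt>':    [['<var>', '=', '<expr>']],
    '<var>':     [['a'], ['b'], ['c'], ['d']],
    '<expr>':    [['<term>', '+', '<term>'], ['<term>', '-', '<term>']],
    '<term>':    [['<var>'], ['const']],
}

def leftmost_step(form, choice):
    # Replace the leftmost nonterminal with the chosen alternative's RHS.
    for i, sym in enumerate(form):
        if sym in grammar:
            return form[:i] + grammar[sym][choice] + form[i + 1:]
    return form  # no nonterminal left: `form` is already a sentence

# Alternative indices reproducing the derivation of a = b + const
form = ['<program>']
for choice in [0, 0, 0, 0, 0, 0, 1, 1]:
    form = leftmost_step(form, choice)
    print('=>', ' '.join(form))
```

The last line printed is the sentence a = b + const; every intermediate line is a sentential form.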
Derivations
Every string of symbols in a derivation is a sentential form.
A sentence is a sentential form that has only terminal symbols.
A leftmost derivation is one in which the leftmost nonterminal in each sentential form is the one that is expanded.
A derivation may be neither leftmost nor rightmost.
Parse Tree
A hierarchical representation of a derivation.
[Figure: the parse tree for a = b + const: <program> over <stmts> over <stmt>; <stmt> expands to <var> = <expr>, with <var> deriving a; <expr> expands to <term> + <term>, the first <term> deriving <var> → b and the second deriving const.]
Ambiguity in Grammars
A grammar is ambiguous if and only if it generates a sentential form that has two or more distinct parse trees.
An Ambiguous Expression Grammar
<expr> → <expr> <op> <expr> | const
<op> → / | -
[Figure: two distinct parse trees for const - const / const, one grouping the subtraction first and the other grouping the division first, demonstrating the ambiguity.]
An Unambiguous Expression Grammar
If we use the CFG to indicate the precedence levels of the operators, we cannot have ambiguity:
<expr> → <expr> - <term> | <term>
<term> → <term> / const | const
[Figure: the unique parse tree for const - const / const, in which the division is grouped below the subtraction.]
Associativity of Operators
Operator associativity can also be indicated by a grammar:
<expr> -> <expr> + <expr> | const (ambiguous)
<expr> -> <expr> + const | const (unambiguous)
[Figure: the left-recursive parse tree for const + const + const under the unambiguous grammar, showing left associativity.]
Extended BNF
Optional parts are placed in brackets [ ]:
<proc_call> -> ident [(<expr_list>)]
Alternative parts of RHSs are placed inside parentheses and separated by vertical bars:
<term> → <term> (+|-) const
Repetitions (0 or more) are placed inside braces { }:
<ident> → letter {letter|digit}
BNF and EBNF
BNF:
<expr> → <expr> + <term> | <expr> - <term> | <term>
<term> → <term> * <factor> | <term> / <factor> | <factor>
EBNF:
<expr> → <term> {(+ | -) <term>}
<term> → <factor> {(* | /) <factor>}
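The EBNF braces translate directly into loops in a recursive-descent routine. A Python sketch, with <factor> restricted to integer literals for simplicity (an assumption of this example, not part of the grammar above):

```python
def evaluate(tokens):
    # Evaluator following the EBNF
    #   <expr> -> <term> {(+ | -) <term>}
    #   <term> -> <factor> {(* | /) <factor>}
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def factor():
        nonlocal pos
        v = int(tokens[pos])             # <factor> is an integer literal here
        pos += 1
        return v

    def term():
        nonlocal pos
        v = factor()
        while peek() in ('*', '/'):      # the {(* | /) <factor>} repetition
            op = tokens[pos]; pos += 1
            v = v * factor() if op == '*' else v / factor()
        return v

    def expr():
        nonlocal pos
        v = term()
        while peek() in ('+', '-'):      # the {(+ | -) <term>} repetition
            op = tokens[pos]; pos += 1
            v = v + term() if op == '+' else v - term()
        return v

    return expr()

print(evaluate(['2', '+', '3', '*', '4']))  # 14
```

Note how the precedence question from Sec. 3.1 (is 2 + 3 * 4 equal to 14 or 20?) is settled by which nonterminal owns which operator.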
Recent Variations in EBNF
Alternative RHSs are put on separate lines.
Use of a colon instead of =>.
Use of opt for optional parts.
Use of oneof for choices.
Static Semantics
Nothing to do with (actual) meaning. Context-free grammars (CFGs) cannot describe all of the syntax of programming languages (CSGs might?!).
Categories of constructs that are trouble:
- Context-free, but cumbersome (e.g., types of operands in expressions)
- Non-context-free (e.g., variables must be declared before they are used)
The analysis required to check these specifications (e.g., type compatibility) can be done at compile time.
Copyright © 2004 Pearson Addison-Wesley. All rights reserved. 3-56
Attribute Grammars (AGs) (Knuth, 1968)
CFGs cannot describe all of the syntax of programming languages.
AGs add to CFGs the ability to carry some semantic information along through parse trees.
Primary value of AGs: static semantics specification; compiler design (static semantics checking).
Static semantics: including data types and forward-branching locations.
112/04/19 57
Attribute Grammars
Def: An attribute grammar is a CFG G = (S, N, T, P) with the following additions:
For each grammar symbol x there is a set A(x) of attribute values.
Each rule has a set of functions that define certain attributes of the nonterminals in the rule.
Each rule has a (possibly empty) set of predicates to check for attribute consistency.
Attribute grammars are thus grammars to which have been added attributes, attribute computation functions, and predicate functions.
Attribute Grammars
Let X0 → X1 ... Xn be a rule.
Functions of the form S(X0) = f(A(X1), ..., A(Xn)) define synthesized attributes.
Functions of the form I(Xj) = f(A(X0), ..., A(Xj-1)), for 1 <= j <= n, define inherited attributes.
Initially, there are intrinsic attributes on the leaves.
[Figure: a rule node X0 with children X1, X2, ..., Xn.]
Attribute Grammars
Example: expressions of the form id + id
- id's can be either int_type or real_type
- the types of the two id's must be the same
- the type of the expression must match its expected type (from the top down)
BNF: <expr> → <var> + <var>
     <var> → id
Attributes:
- actual_type: synthesized for <var> and <expr>
- expected_type: inherited for <expr>
1. <assign> → <var> = <expr>
2. <expr> → <var>1 + <var>2
3. <expr> → <var>
4. <var> → A | B

Rule 1
Syntax rule: <assign> → <var> = <expr>
Semantic rule: <expr>.expected_type ← <var>.actual_type

Rule 2
Syntax rule: <expr> → <var>1 + <var>2
Semantic rule: <expr>.actual_type ← if (<var>1.actual_type = int) and (<var>2.actual_type = int) then int else real end if
Predicate: <expr>.actual_type = <expr>.expected_type
Rule 3
Syntax rule: <expr> → <var>
Semantic rule: <expr>.actual_type ← <var>.actual_type
Predicate: <expr>.actual_type = <expr>.expected_type

Rule 4
Syntax rule: <var> → A | B
Semantic rule: <var>.actual_type ← look-up(<var>.string)

To compute the attribute values, the following is a possible sequence:
1. <var>.actual_type ← look-up(A) (Rule 4)
2. <expr>.expected_type ← <var>.actual_type (Rule 1)
3. <var>1.actual_type ← look-up(A) (Rule 4)
   <var>2.actual_type ← look-up(B) (Rule 4)
4. <expr>.actual_type ← either int or real (Rule 2)
5. <expr>.expected_type = <expr>.actual_type is either True or False (Rule 2)
[Figure: the fully attributed parse tree for A = A + B: <assign> with children <var>, =, and <expr>; <expr> has children <var>1 + <var>2. The actual_type attributes of <var>, <var>1, and <var>2 are computed first, <expr>'s expected_type is inherited from <var>, <expr>'s actual_type is then synthesized, and finally the predicate tests whether they match: True or False?]
Now you may take a close look at a "fully attributed" parse tree.
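The attribute computation for A = A + B can be mimicked in a few lines of Python. The symbol-table contents (A declared int, B declared real) are an assumption chosen so that the predicate fails, as the tree's "T or F?" question hints; the function name is mine:

```python
# Hypothetical symbol table standing in for look-up(<var>.string);
# the declared types are assumptions for this example.
symtab = {'A': 'int', 'B': 'real'}

def decorate_assign(lhs, v1, v2):
    # Rule 4: <var>.actual_type <- look-up(<var>.string)
    t1, t2 = symtab[v1], symtab[v2]
    # Rule 1 (inherited): <expr>.expected_type <- <var>.actual_type
    expected = symtab[lhs]
    # Rule 2 (synthesized): int only if both operands are int, else real
    actual = 'int' if t1 == 'int' and t2 == 'int' else 'real'
    # Predicate: <expr>.actual_type = <expr>.expected_type
    return expected, actual, actual == expected

print(decorate_assign('A', 'A', 'B'))  # ('int', 'real', False)
```

With these declarations the synthesized actual_type (real) disagrees with the inherited expected_type (int), so the predicate is False and static semantic checking rejects the assignment.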
Attribute Grammars
How are attribute values computed?
If all attributes were inherited, the tree could be decorated in top-down order.
If all attributes were synthesized, the tree could be decorated in bottom-up order.
In many cases, both kinds of attributes are used, and it is some combination of top-down and bottom-up that must be used.
Semantics
There is no single widely acceptable notation or formalism for describing semantics.
Several needs for a methodology and notation for semantics:
- Programmers need to know what statements mean
- Compiler writers must know exactly what language constructs do
- Correctness proofs would be possible
- Compiler generators would be possible
- Designers could detect ambiguities and inconsistencies
Although much is known about programming language syntax, we have less knowledge of how to correctly define the semantics of a language. The problem of semantic definition has been the object of theoretical study for as long as the problem of syntactic definition, but a satisfactory solution has been much more difficult to find. Many different methods (five, in fact) for the formal definition of semantics have been developed:
1. Grammatical models (attribute grammars; Knuth, 1968): by adding attributes to each rule in a grammar. (done already)
2. Operational models: The Vienna Definition Language (VDL) is an operational approach from the 1970s. Typically the definition of the virtual computer is described as an automaton. (state-machine type)
3. Denotational models: (functional model type)
4. Axiomatic models: This method extends the predicate calculus to include programs. (by Hoare)
5. Specification models: The algebraic data type is a form of formal specification. For example,
pop(push(S, x)) = S.
No single semantic definition method has been found useful for both the user and the implementor of a language.
We can use some of the previous discussion on language semantics to aid in correctness issues in three ways:
1. Given Program P, what does it mean? That is, what is its Specification S? (the semantic modeling issue)
2. Given Specification S, develop Program P that implements that specification. (the central problem in SE today)
3. Do Specification S and Program P perform the same function? (the central problem in program verification)
Assigning meanings to programs (R. W. Floyd, 1967)
[Figure: a flowchart computing the quotient Q and remainder R of X divided by Y by repeated subtraction, annotated with an assertion on each edge:

{X>=0 ^ Y>0}
Q := 0
{X>=0 ^ Y>0 ^ Q=0}
R := X
{X>=0 ^ Y>0 ^ Q=0 ^ R=X}
loop test R >= Y, entered with invariant
{X>=0 ^ Y>0 ^ Q>=0 ^ R>=0 ^ X=Q*Y+R}
if the test is true:
{X>=0 ^ Y>0 ^ R>=0 ^ Q>=0 ^ X=Q*Y+R ^ R>=Y}
R := R - Y
{X>=0 ^ Y>0 ^ R>=0 ^ Q>=0 ^ X=(Q+1)*Y+R}
Q := Q + 1
{X>=0 ^ Y>0 ^ R>=0 ^ Q>=1 ^ X=Q*Y+R}
if the test is false (exit):
{X>=0 ^ Y>0 ^ Q>=0 ^ X=Q*Y+R ^ 0<=R<Y}]
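The annotated flowchart corresponds to division by repeated subtraction. A Python sketch with the Floyd assertions written as runtime checks (the function name is mine; a proof would establish these statically, whereas asserts only check each run):

```python
def divide(X, Y):
    # Precondition: X >= 0 and Y > 0
    assert X >= 0 and Y > 0
    Q, R = 0, X
    while R >= Y:
        # Loop invariant: X = Q*Y + R and R >= 0
        assert X == Q * Y + R and R >= 0
        R, Q = R - Y, Q + 1
    # Postcondition: X = Q*Y + R and 0 <= R < Y
    assert X == Q * Y + R and 0 <= R < Y
    return Q, R

print(divide(17, 5))  # (3, 2)
```

Each assert restates one edge annotation of the flowchart; the exit assertion is exactly the specification of integer division with remainder.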
Modeling Language Properties
1. Formal Grammars
2. Language Semantics
3. Program Verification
Formal Grammars
Q: Is there a limit to what we can compute with a computer ?
The halting problem answers this question.
Turing machines can define the class of all computable functions.
1. A finite-state automaton consists of a finite state graph and a one-way tape. For each operation, the automaton reads the next symbol from the tape and enters a new state.
2. A pushdown automaton adds a stack to the finite automaton. For each operation, the automaton reads the next tape symbol and the stack symbol, writes a new stack symbol, and enters a new state.
3. A linear-bounded automaton is similar to the finite-state automaton with the additions that it can read and write to the tape for each input symbol and it can move the tape in either direction.
4. A Turing machine is similar to a linear-bounded automaton except that the tape is infinite in either direction.
Recall the five methods that have been developed for the formal definition of semantics:
1. Grammatical models (attribute grammars; Knuth, 1968): by adding attributes to each rule in a grammar. Please refer to Fig. 4.3 for more details.
Example: 2+4*(1+2)
E → E + T | T
T → T * P | P
P → ( E ) | digit
2. Imperative or operational models: The Vienna Definition Language (VDL) is an operational approach from the 1970s. Typically the definition of the virtual computer is described as an automaton. (state-machine type)
3. Applicative models: (functional model type; denotational semantics)
4. Axiomatic models: This method extends the predicate calculus to include programs. (by Hoare)
5. Specification models: The algebraic data type is a form of formal specification. For example,
pop(push(S, x)) = S.
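Under the grammatical model, the value of 2+4*(1+2) can be computed by attaching a value attribute to each rule of the E/T/P expression grammar above. A Python sketch (left recursion replaced by the usual iteration, digits only; the function name is mine):

```python
def parse_eval(s):
    # Grammar-directed evaluation for
    #   E -> E + T | T,  T -> T * P | P,  P -> ( E ) | digit
    pos = 0

    def P():
        nonlocal pos
        if s[pos] == '(':
            pos += 1          # consume '('
            v = E()
            pos += 1          # consume ')'
            return v
        v = int(s[pos])       # P -> digit
        pos += 1
        return v

    def T():
        nonlocal pos
        v = P()
        while pos < len(s) and s[pos] == '*':
            pos += 1
            v *= P()          # T -> T * P synthesizes the product
        return v

    def E():
        nonlocal pos
        v = T()
        while pos < len(s) and s[pos] == '+':
            pos += 1
            v += T()          # E -> E + T synthesizes the sum
        return v

    return E()

print(parse_eval("2+4*(1+2)"))  # 14
```

The value 14 arises because P, handling parentheses, binds tighter than T (*), which binds tighter than E (+), so the grammar itself fixes the precedence.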
Summary
BNF and context-free grammars are equivalent meta-languages, well-suited for describing the syntax of programming languages.
An attribute grammar is a descriptive formalism that can describe both the syntax and the semantics of a language.
Three primary methods of semantics description (by 2009): operational, axiomatic, denotational.