
D Goforth COSC 3127 1

Translating High Level Languages

Note error in assignment 1:

#4 - refer to Example grammar 3.4, p. 126


Stages of translation

Before execution:
Lexical analysis - the lexer or scanner
Syntactic analysis - the parser
Code generation
Linking


Lexical analysis

Translate stream of characters into lexemes

Lexemes belong to categories called tokens

Token identity of lexemes is used at the next stage of syntactic analysis


From characters to lexemes

Characters:
yVal = x + 450 - min ( 100, 4xVal ));

Lexemes:
yVal | = | x | + | 450 | - | min | ( | 100 | , | 4xVal | ) | ) | ;


Examples: tokens and lexemes

Some token categories contain only one lexeme:
semi-colon ;

Some tokens categorize many lexemes:
identifier count, maxCost,…
based on a rule for legal identifier strings


Tokens and Lexemes

yVal = x + 450 - min ( 100, 4xVal ));

Lexical analysis

•identifies lexemes and their token type

•recognizes illegal lexemes (4xVal)

•does NOT identify syntax error: ) )

Token labels on the example:
identifier: yVal
equal_sign: =
left_paren: (
illegal lexeme: 4xVal
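The lexical analysis described above can be sketched with regular expressions. This is a minimal illustration, not the course's lexer: the token names identifier, equal_sign and left_paren come from the slide, while the other names and all of the patterns are assumptions made for this example.

```python
import re

# Token categories, tried in order. Only identifier, equal_sign and
# left_paren are named on the slide; the rest are assumed for this sketch.
TOKEN_SPEC = [
    ("illegal_lexeme", r"\d+[A-Za-z_]\w*"),  # digits running into letters, e.g. 4xVal
    ("int_literal",    r"\d+"),
    ("identifier",     r"[A-Za-z_]\w*"),
    ("equal_sign",     r"="),
    ("plus_op",        r"\+"),
    ("minus_op",       r"-"),
    ("left_paren",     r"\("),
    ("right_paren",    r"\)"),
    ("comma",          r","),
    ("semi_colon",     r";"),
    ("whitespace",     r"\s+"),
    ("unknown",        r"."),                # any other single character
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return (token, lexeme) pairs for the source string, skipping whitespace."""
    return [
        (match.lastgroup, match.group())
        for match in MASTER.finditer(source)
        if match.lastgroup != "whitespace"
    ]


if __name__ == "__main__":
    for token, lexeme in tokenize("yVal = x + 450 - min ( 100, 4xVal ));"):
        print(f"{token:15} {lexeme}")
```

The lexer flags 4xVal as an illegal lexeme but reports the two closing parentheses as ordinary right_paren tokens; noticing that they are unbalanced is left to the parser.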


Syntax or Grammar of Language

Rules for generating (used by the programmer) or recognizing (used by the parser) a valid sequence of lexemes


Grammars

4 categories of grammars (Chomsky)

Two categories are important in computing:
Regular expressions (pattern matching)
Context-free grammars (programming languages)


Context-free grammar

Meta-language for describing languages
States rules or productions for what lexeme sequences are correct in the language
Written in:
Backus-Naur Form (BNF) or EBNF
Syntax graphs


Example of BNF rule

PROBLEM: how to recognize all these as correct?

y = x

f = rVec.length + 1

button[4].label = “Exit”

RULE for defining assignment statement:

<assign> -> <variable> = <expression>

Assumes other rules for <variable>, <expression>


BNF rules

Non-terminal and terminal symbols:
Non-terminals are defined by at least one rule
<assignment> -> <var> = <expression>
Terminals are tokens (or lexemes)


Simple sample grammar (p. 123)

<assign> -> <id> = <expr>
<id> -> A | B | C // lexical
<expr> -> <id> + <expr>
| <id> * <expr>
| ( <expr> )
| <id>

Symbols in angle brackets are <nonterminals>; all other symbols are terminals.


Simple sample production

Apply one rule at each step to the leftmost non-terminal:

<assign> => <id> = <expr>
=> B = <expr>
=> B = <id> * <expr>
=> B = A * <expr>
=> B = A * ( <expr> )
=> B = A * ( <id> + <expr> )
=> B = A * ( C + <expr> )
=> B = A * ( C + <id> )
=> B = A * ( C + C )

Grammar:

<assign> -> <id> = <expr>
<id> -> A | B | C
<expr> -> <id> + <expr>
| <id> * <expr>
| ( <expr> )
| <id>
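The derivation above can be replayed mechanically. The sketch below stores the sample grammar as a dictionary and, at each step, rewrites the leftmost non-terminal with one chosen alternative. The names GRAMMAR, CHOICES, leftmost_step and derive are made up for this illustration, and the list of choices is simply read off the slide's derivation.

```python
# The sample grammar: each non-terminal maps to its list of alternatives.
GRAMMAR = {
    "<assign>": [["<id>", "=", "<expr>"]],
    "<id>": [["A"], ["B"], ["C"]],
    "<expr>": [
        ["<id>", "+", "<expr>"],
        ["<id>", "*", "<expr>"],
        ["(", "<expr>", ")"],
        ["<id>"],
    ],
}


def leftmost_step(form, alternative_index):
    """Rewrite the leftmost non-terminal in `form` with the chosen alternative."""
    for i, symbol in enumerate(form):
        if symbol in GRAMMAR:
            return form[:i] + GRAMMAR[symbol][alternative_index] + form[i + 1:]
    raise ValueError("no non-terminal left to expand")


def derive(start, choices):
    """Return every sentential form of the derivation, as strings."""
    forms = [[start]]
    for choice in choices:
        forms.append(leftmost_step(forms[-1], choice))
    return [" ".join(form) for form in forms]


# 0-based index of the alternative applied at each step of the derivation.
CHOICES = [0, 1, 1, 0, 2, 0, 2, 3, 2]

if __name__ == "__main__":
    for form in derive("<assign>", CHOICES):
        print(form)
```

A form that contains no non-terminal is a sentence of the language; any further call to leftmost_step on it raises an error, since there is nothing left to rewrite.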


Sample parse tree

<assign>
  <id>
    B
  =
  <expr>
    <id>
      A
    *
    <expr>
      (
      <expr>
        <id>
          C
        +
        <expr>
          <id>
            C
      )

Rule application: from the root down to the leaves
Leaves represent the sentence of lexemes: B = A * ( C + C )

Grammar:

<assign> -> <id> = <expr>
<id> -> A | B | C
<expr> -> <id> + <expr>
| <id> * <expr>
| ( <expr> )
| <id>
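A parser can build this tree directly from the lexemes. The following is a small recursive-descent sketch for the sample grammar, offered as one possible approach rather than the method used in the course; the nested-tuple tree representation and the function names are assumptions of this example.

```python
def parse_assign(tokens):
    """Parse <assign> -> <id> = <expr> and return a nested-tuple parse tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(lexeme):
        nonlocal pos
        if peek() != lexeme:
            raise SyntaxError(f"expected {lexeme!r}, found {peek()!r}")
        pos += 1
        return lexeme

    def parse_id():
        nonlocal pos
        if peek() not in ("A", "B", "C"):
            raise SyntaxError(f"expected an identifier, found {peek()!r}")
        lexeme = tokens[pos]
        pos += 1
        return ("<id>", lexeme)

    def parse_expr():
        # <expr> -> ( <expr> )
        if peek() == "(":
            return ("<expr>", expect("("), parse_expr(), expect(")"))
        # <expr> -> <id> + <expr> | <id> * <expr> | <id>
        left = parse_id()
        if peek() in ("+", "*"):
            operator = expect(peek())
            return ("<expr>", left, operator, parse_expr())
        return ("<expr>", left)

    tree = ("<assign>", parse_id(), expect("="), parse_expr())
    if peek() is not None:
        raise SyntaxError(f"unexpected lexeme {peek()!r}")
    return tree


def leaves(tree):
    """Read the leaves left to right; they spell out the original sentence."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for child in tree[1:] for leaf in leaves(child)]


if __name__ == "__main__":
    tree = parse_assign("B = A * ( C + C )".split())
    print(tree)
    print(" ".join(leaves(tree)))
```

Each tuple starts with the non-terminal that was expanded, followed by its children in order, so the tuple nesting mirrors the tree drawn on the slide.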


Extended sample grammar

<stmt> -> <assign> | <ifstmt>
<ifstmt> -> if (<cond>) then <stmt>
| if (<cond>) then <stmt> else <stmt>
<cond> -> <expr> <compareop> <expr>
<compareop> -> < | > | <= | >= | == | ~=

How to add compound condition?


Ambiguous grammar

Different parse trees for same sentence

Different translations for same sentence

Different machine code for same source code!
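To make the consequence concrete, the sketch below enumerates every parse tree of a sentence under a deliberately ambiguous grammar and evaluates each one. The grammar <expr> -> <expr> + <expr> | <expr> * <expr> | <id> is an assumption chosen for illustration (it is not one of the grammars on these slides), as are the function names and the sample values.

```python
def parse_trees(tokens):
    """All parse trees of `tokens` under the (assumed) ambiguous grammar
    <expr> -> <expr> + <expr> | <expr> * <expr> | <id>."""
    if len(tokens) == 1:
        return [tokens[0]]                      # a single <id>
    trees = []
    for i in range(1, len(tokens) - 1, 2):      # every operator position
        operator = tokens[i]
        if operator not in ("+", "*"):
            continue
        # Any operator may be the root: that freedom is the ambiguity.
        for left in parse_trees(tokens[:i]):
            for right in parse_trees(tokens[i + 1:]):
                trees.append((left, operator, right))
    return trees


def evaluate(tree, values):
    """Evaluate a parse tree, looking identifiers up in `values`."""
    if isinstance(tree, str):
        return values[tree]
    left, operator, right = tree
    a, b = evaluate(left, values), evaluate(right, values)
    return a + b if operator == "+" else a * b


if __name__ == "__main__":
    values = {"A": 2, "B": 3, "C": 4}           # sample values, chosen arbitrarily
    for tree in parse_trees("A + B * C".split()):
        print(tree, "=", evaluate(tree, values))
```

The sentence A + B * C has two parse trees under this grammar, and they evaluate to different results, so a translator would have to pick one arbitrarily.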


Grammars for ‘human’ conventions without ambiguity

Putting features of languages into grammars:
expressions of any length - lists: p. 121
precedence - an extra non-terminal: p. 125
associativity - order in recursive rules: p. 128
nested if statements - “dangling else” problem: p. 130


Forms for grammars

Backus-Naur form (BNF)
Extended Backus-Naur form (EBNF) - shortens set of rules
Syntax graphs - easier to read for learning language


EBNF

optional zero or one occurrence [..]
<expr> -> [ <expr> + ] <term>

optional zero or more occurrences {..}
<expr> -> <term> { + <term> }

‘or’ choice of alternative symbols |
<term> -> <term> [ (*|/) <term> ]

Syntax Graph - basic structures

[Figure: syntax graphs for expr (terms joined by + or -) and term (factors joined by * or /)]

BNF (p. 121)

<expr> -> <expr> + <term>
| <expr> - <term>
| <term>
<term> -> <term> * <factor>
| <term> / <factor>
| <factor>

EBNF

<expr> -> [<expr> (+|-)] <term>
<term> -> [<term> (*|/)] <factor>

or, using repetition:

<expr> -> <term> {(+|-) <term>}
<term> -> <factor> {(*|/) <factor>}

Syntax Graph

[Figure: syntax graphs for expr and term]
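One reason the {..} form is convenient is that it maps directly onto a loop in a hand-written parser. The sketch below evaluates expressions using the two repetition rules; since the slides do not define <factor>, it assumes <factor> -> number | ( <expr> ), and the function names are invented for this example.

```python
def evaluate_expr(tokens):
    """Evaluate a token list using the EBNF repetition rules."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        pos += 1
        return token

    def factor():
        # Assumed rule (not on the slides): <factor> -> number | ( <expr> )
        token = advance()
        if token == "(":
            value = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        return float(token)

    def term():
        # <term> -> <factor> {(*|/) <factor>}: the braces become a while loop
        value = factor()
        while peek() in ("*", "/"):
            if advance() == "*":
                value *= factor()
            else:
                value /= factor()
        return value

    def expr():
        # <expr> -> <term> {(+|-) <term>}
        value = term()
        while peek() in ("+", "-"):
            if advance() == "+":
                value += term()
            else:
                value -= term()
        return value

    result = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return result


if __name__ == "__main__":
    print(evaluate_expr("2 + 3 * 4".split()))
    print(evaluate_expr("8 - 3 - 2".split()))
```

Because each loop folds its operands from left to right, the operators come out left associative, matching the left-recursive BNF rules, and * and / bind tighter than + and - because <term> sits below <expr>.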


Attribute grammars

Problem: context-free grammars cannot describe some features needed in programming - “static semantics”
e.g.: rules for using data types
* Can’t assign real to integer (clumsy in BNF)
* Can’t access variable before assigning (impossible in BNF)


Attributes

Symbols in the grammar can have attributes (properties)
Productions can have functions of some of the attributes of their symbols that compute the attributes of other symbols
Predicates (boolean functions) inspect the attributes of non-terminals to see if they are legitimate


Using attributes

1) Apply productions to create parse tree (symbols have some intrinsic attributes)

2) Apply functions to determine remaining attributes

3) Apply predicates to test correctness of parse tree


Sebesta’s example

<assign> -> <var> = <expr>
<expr> -> <var> + <var>
| <var>
<var> -> A | B | C

Add attributes for type checking:
expected_type
actual_type


Sebesta’s example

<assign> -> <var> = <expr>
<expr> -> <var> + <var> | <var>
<var> -> A | B | C

[Figure: expected_type and actual_type labels attached to symbols in the grammar]


Sebesta’s example

<assign> -> <var> = <expr>
<expr> -> <var> + <var> | <var>
<var> -> A | B | C

actual_type of <var>: determined from the string (A, B, C), which has been declared


Sebesta’s example

<assign> -> <var> = <expr>
<expr> -> <var> + <var> | <var>
<var> -> A | B | C

actual_type of <expr>: determined from the actual types of its <var> symbols


Sebesta’s example

<assign> -> <var> = <expr>
<expr> -> <var> + <var> | <var>
<var> -> A | B | C

expected_type of <expr>: determined from the actual type of the <var> being assigned to
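The three steps (intrinsic attributes, attribute functions, predicate) can be sketched for this grammar as follows. This is an illustration under stated assumptions, not a transcription of the textbook's rules: the declarations in DECLARED, the restriction to the types int and real, the rule that int + int gives int and any other sum gives real, and the strict equality predicate are all choices made for this example.

```python
# Assumed declarations: the slides only say that A, B, C have been declared.
DECLARED = {"A": "int", "B": "real", "C": "int"}


def var_actual_type(name):
    """Intrinsic attribute: <var>.actual_type comes from the declaration."""
    return DECLARED[name]


def expr_actual_type(operands):
    """Attribute function for <expr> -> <var> + <var> | <var>."""
    types = [var_actual_type(name) for name in operands]
    if len(types) == 1:
        return types[0]
    # Assumed rule: int + int is int; any other combination is real.
    return "int" if types == ["int", "int"] else "real"


def check_assign(target, operands):
    """Predicate for <assign> -> <var> = <expr>: do the types agree?"""
    expected_type = var_actual_type(target)    # passed from the left-hand <var>
    actual_type = expr_actual_type(operands)   # computed from the right-hand side
    return expected_type == actual_type


if __name__ == "__main__":
    print(check_assign("A", ["A", "C"]))   # int = int + int
    print(check_assign("A", ["A", "B"]))   # int = int + real
```

With a strict equality predicate, assigning a real expression to an int variable is rejected, and so is assigning an int expression to a real variable; a language that allows widening would relax the predicate.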


Sebesta’s type rules, p. 138


Sebesta’s example

[Figures: two slides continuing Sebesta’s example]


Axiomatic semantics

Assertions about statements
Preconditions
Postconditions
like JUnit testing

Purpose
Define meaning of statement
Test for validity of computation (does it do what it is supposed to do?)


Example for assignment

What the statement should do is expressed as a postcondition

Based on the syntax of the assignment, a precondition is inferred

When statement is executed, conditions can be verified before and after


Example assignment statement

y = 25 + x * 2
postcondition: y > 40

y > 40
25 + x * 2 > 40
x * 2 > 15
x > 7.5 precondition
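The inferred precondition can be checked by running the statement on sample inputs, in the spirit of the JUnit comparison. The sketch below uses exact fractions to avoid rounding at the boundary; the function names and the sample range are assumptions of this example, and testing samples illustrates the assertion rather than proving it.

```python
from fractions import Fraction


def assign_y(x):
    """The statement under study: y = 25 + x * 2."""
    return 25 + x * 2


def precondition(x):
    return x > Fraction(15, 2)      # x > 7.5


def postcondition(y):
    return y > 40


def precondition_matches(samples):
    """True if, on every sample, the precondition holds exactly when the
    postcondition holds after executing the assignment."""
    return all(precondition(x) == postcondition(assign_y(x)) for x in samples)


if __name__ == "__main__":
    samples = [Fraction(n, 4) for n in range(0, 81)]   # 0, 0.25, ..., 20
    print(precondition_matches(samples))
```

At the boundary x = 7.5 the statement gives y = 40, which fails y > 40, so the precondition correctly excludes it.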
