

INF 3110/4110 -2006

8/28/2006 1

Syntax/semantics - I

Program <> program execution
Compiling/interpretation
Syntax
– Classes of languages
– Regular languages
– Context-free languages

Meta models


Programming/Modeling

[Figure: two parallel diagrams. Programming: Real World/Problem, Program, Generator, Program execution. Modeling: Real World/Problem, Model (Description), Generator, Model.]


Syntax <> Semantics

A description of a programming language consists of two main components:
– Syntactic rules: what form a legal program has.
– Semantic rules: what the sentences in the language mean.

Static semantics: rules that may be checked (by the compiler) before the execution of the program, e.g.:
– All variables must be declared.
– Declaration and use of variables coincide (type check).

Dynamic semantics: rules saying what shall happen during (as part of) the execution of the program, e.g. in terms of an operational semantics, that is, a semantics that describes the behaviour of an (idealised) abstract generator (processor/machine) executing a program.
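As a rough illustration of a static semantic rule, the sketch below checks "all variables must be declared" without executing anything. The tuple-based program representation and the name `undeclared_uses` are assumptions made for this example only.

```python
def undeclared_uses(program):
    """Return the names that are used without a preceding declaration."""
    declared = set()
    errors = []
    for kind, name in program:
        if kind == "declare":
            declared.add(name)
        elif kind == "use" and name not in declared:
            errors.append(name)
    return errors


# Hypothetical program: x is declared and used, y is only used.
program = [("declare", "x"), ("use", "x"), ("use", "y")]
print(undeclared_uses(program))
```

The check only inspects the program text; what a use of `x` does when it is executed would be a matter of dynamic semantics.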


Compiling/interpretation

An interpreter reads a program and executes its operations.

A compiler/translator translates a program to another language, typically a machine language or a language for a virtual machine.

[Figure: an interpreter takes a program and data and produces data. A compiler/translator takes a program and produces a program, which an interpreter/machine then runs on data to produce data.]
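The difference can be sketched on a toy expression language. Everything here (the tuple representation, the stack-machine instructions, the function names) is made up for the illustration; it is not taken from the course.

```python
# Hypothetical toy language: an expression is a number or a tuple (op, left, right).

def interpret(expr):
    """Interpreter: walk the program and carry out its operations directly."""
    if isinstance(expr, (int, float)):
        return expr
    op, left, right = expr
    a, b = interpret(left), interpret(right)
    return a + b if op == "+" else a * b


def compile_expr(expr):
    """Compiler: translate the program to code for a made-up stack machine."""
    if isinstance(expr, (int, float)):
        return [("push", expr)]
    op, left, right = expr
    last = ("add",) if op == "+" else ("mul",)
    return compile_expr(left) + compile_expr(right) + [last]


def run(code):
    """The machine that executes the translated program."""
    stack = []
    for instr in code:
        if instr[0] == "push":
            stack.append(instr[1])
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if instr[0] == "add" else a * b)
    return stack.pop()


expr = ("+", 1, ("*", 2, 3))
print(interpret(expr))          # evaluated directly
print(run(compile_expr(expr)))  # translated first, then executed
```

Both routes give the same value; the compiled route produces a second program as an intermediate result.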


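How to describe syntax – I

BNF-grammar

<tall> → - <siffer>
<tall> → <siffer>
<siffer> → 0 <sifre>
<siffer> → 1 <sifre>
<sifre> → 0 <sifre>
<sifre> → 1 <sifre>
<sifre> → ε

(Norwegian: tall = number, siffer = digit, sifre = digits.)

Syntax diagram (‘railway diagram’)

[Figure: syntax diagram for tall, with an optional "-" followed by one or more of 0 or 1.]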


Extended BNF

Extended BNF has the following metasymbols (on the right-hand side):

|      alternatives
?      the symbols may occur 0 or 1 times
*      the symbols may occur 0 or more times
+      the symbols may occur 1 or more times
{...}  groups symbols

With alternatives, the grammar for <tall> becomes:

<tall> → - <siffer> | <siffer>
<siffer> → 0 <sifre> | 1 <sifre>
<sifre> → 0 <sifre> | 1 <sifre> | ε

With the other metasymbols it reduces to a single rule:

<tall> → {-}?{0|1}+
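The single Extended BNF rule maps almost directly onto a regular expression. A minimal sketch using Python's `re` module; the function name `is_tall` is an assumption for the example.

```python
import re

# {-}?{0|1}+ : an optional minus sign followed by one or more binary digits
TALL = re.compile(r"-?[01]+")


def is_tall(word):
    """True if the whole word is a sentence of <tall>."""
    return TALL.fullmatch(word) is not None


for word in ["101", "-0", "", "-", "12"]:
    print(repr(word), is_tall(word))
```

`fullmatch` is used so that the entire string must match, not just a prefix.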


How to describe syntax – II

Non-deterministic automaton

Deterministic automaton


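Example

BNF-grammar

<uttrykk> → <uttrykk> + <term> | <term>
<term> → <term> * navn | navn

(Norwegian: uttrykk = expression, navn = name.)

Annotations on the grammar:
– production, rule (Norwegian: produksjon)
– terminals (Norwegian: grunnsymbol)
– non-terminals (Norwegian: metasymbol)
– left-recursion

Syntax diagram

[Figure: syntax diagrams for uttrykk (one or more term separated by +) and term (one or more navn separated by *).]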


Derivation of sentences

The sentences in a language defined by a BNF-grammar are exactly those that may be produced by the following procedure:
1. Start with the start symbol.
2. Replace each meta symbol with one of the alternatives on the right-hand side of its production rule.
3. Repeat step 2 until only terminals remain.

This is called a derivation from the start symbol to a complete sentence, and it may be represented by a syntax tree.

Abstract syntax tree
– Removes terminals that are not needed in order to give the meaning

[Figure: syntax tree deriving the sentence "navn * navn + navn" from uttrykk.]
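The derivation procedure can be sketched as repeated replacement of the leftmost meta symbol. The grammar is the expression grammar from the Example slide; the list of choices (which alternative to pick at each step) and the function name `derive` are assumptions for this illustration.

```python
# <uttrykk> → <uttrykk> + <term> | <term>
# <term>    → <term> * navn | navn
GRAMMAR = {
    "uttrykk": [["uttrykk", "+", "term"], ["term"]],
    "term": [["term", "*", "navn"], ["navn"]],
}


def derive(start, choices):
    """Replace the leftmost meta symbol once per choice; return every step."""
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        # Position of the leftmost meta symbol in the current form.
        i = next(k for k, sym in enumerate(form) if sym in GRAMMAR)
        form[i:i + 1] = GRAMMAR[form[i]][choice]
        steps.append(" ".join(form))
    return steps


for step in derive("uttrykk", [0, 1, 0, 1, 1]):
    print(step)
```

The last line printed contains only terminals, so the derivation is complete.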


Unambiguous/ambiguous grammars

If every sentence in a language can be derived by one and only one syntax tree, then the grammar is unambiguous; otherwise it is ambiguous.

<uttrykk> → navn | <uttrykk> + <uttrykk> | <uttrykk> * <uttrykk>

[Figure: two different syntax trees for the sentence "navn + navn * navn", one with + at the root and one with * at the root.]
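Ambiguity can be made concrete by counting syntax trees. The sketch below counts the trees that the ambiguous grammar above allows for a list of tokens; the function name `count_trees` is an assumption for the example.

```python
from functools import lru_cache


def count_trees(tokens):
    """Number of syntax trees for tokens under
    <uttrykk> → navn | <uttrykk> + <uttrykk> | <uttrykk> * <uttrykk>."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(lo, hi):
        if hi - lo == 1:
            return 1 if tokens[lo] == "navn" else 0
        total = 0
        # Any operator may be the root; both sides must be expressions.
        for i in range(lo + 1, hi - 1):
            if tokens[i] in ("+", "*"):
                total += count(lo, i) * count(i + 1, hi)
        return total

    return count(0, len(tokens)) if tokens else 0


print(count_trees("navn + navn * navn".split()))
```

A count above 1 for any sentence shows that the grammar is ambiguous.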


Types of languages

Regular languages (type 3)
– A BNF-grammar with one meta symbol on the left-hand side and only terminals on the right-hand side, possibly with a meta symbol as the last symbol
– May be analyzed with non-deterministic and deterministic automata

Context-free languages (type 2)
– A BNF-grammar with just one meta symbol on the left-hand side
– Almost all programming languages have grammars of this type
– May be analyzed with parsers

Type 1 languages («context-sensitive») require that the right-hand side is at least as long as the left-hand side. This makes it possible to cover name bindings and type checks.

Type 0 languages have no restrictions
– Only of theoretical interest
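The type 3 condition can be checked mechanically. A small sketch, assuming a grammar is stored as a dictionary from meta symbol to a list of alternatives; it tests only the form of the grammar, not whether the language could be described by some other regular grammar.

```python
def is_regular_form(grammar):
    """True if every alternative is terminals only, possibly followed by
    one meta symbol as the last symbol."""
    for alternatives in grammar.values():
        for alt in alternatives:
            # A meta symbol anywhere except the last position breaks the rule.
            if any(sym in grammar for sym in alt[:-1]):
                return False
    return True


# Binary numbers with an optional fraction; [] stands for ε.
NUMBER = {
    "tall": [["0", "FP"], ["1", "IFP"]],
    "IFP": [["1", "IFP"], ["0", "IFP"], ["FP"]],
    "FP": [[], [".", "EP"]],
    "EP": [["0"], ["1"], ["0", "EP"], ["1", "EP"]],
}
# Left-recursive expression grammar.
EXPRESSION = {
    "uttrykk": [["uttrykk", "+", "term"], ["term"]],
    "term": [["term", "*", "navn"], ["navn"]],
}
print(is_regular_form(NUMBER), is_regular_form(EXPRESSION))
```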


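Meta models

Alternative to grammars and syntax trees
Object model representing the program (not the execution)

[Figure: class diagram with classes A, B, C and D and association multiplicities n..m, 1, 0..1, 0..* and *.]

<statement> → <assignment> | <if-then-else> | <while-do>

[Figure: class diagram with statement above assignment, if-then-else and while-do.]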
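One possible reading of the statement rule as an object model is a small class hierarchy. This is only a sketch: the attribute names (`target`, `condition`, `body` and so on) are assumptions, since the slide shows just the class names.

```python
from dataclasses import dataclass
from typing import List


class Statement:
    """Common superclass; corresponds to <statement>."""


@dataclass
class Assignment(Statement):
    target: str
    value: str


@dataclass
class IfThenElse(Statement):
    condition: str
    then_branch: List[Statement]
    else_branch: List[Statement]


@dataclass
class WhileDo(Statement):
    condition: str
    body: List[Statement]


# The objects represent the program text, not its execution.
program = [
    Assignment("i", "0"),
    WhileDo("i < 10", [Assignment("i", "i + 1")]),
]
print(program)
```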


Why meta models?

Inspired by abstract syntax trees in terms of object structures, interchange formats between tools
Not all modeling/programming tools are grammar (parser)-based (e.g. wizards)
Growing interest in domain specific languages, often with a mixture of text and graphics


Use of deterministic automata

To check if a given string is part of the regular language or not:


Use of deterministic automata

What if the string is not in the language?


How to make a deterministic automaton?

A deterministic automaton (D-automaton) is easy to use, but not necessarily intuitive and not so easy to make.

From a regular expression (or a syntax diagram) it is, however, easy to make a non-deterministic automaton (ID-automaton). A non-deterministic automaton may have:
• Empty transitions (so-called ε-transitions).
• Several transitions from the same state with the same symbol.

Then we can use an algorithm to make a deterministic automaton from the non-deterministic automaton.
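One standard algorithm for this step is the subset construction, where each deterministic state is a set of non-deterministic states. The sketch below assumes that technique; the example automaton (states s, a, b, c, d, e) is a hypothetical one for binary numbers with an optional fraction, not a copy of the automaton drawn in the slides.

```python
def epsilon_closure(states, eps):
    """All states reachable from `states` using only ε-transitions."""
    closure, stack = set(states), list(states)
    while stack:
        for nxt in eps.get(stack.pop(), ()):
            if nxt not in closure:
                closure.add(nxt)
                stack.append(nxt)
    return frozenset(closure)


def to_deterministic(start, delta, eps, accepting, alphabet):
    """Subset construction: returns start state, transitions, accepting states."""
    dfa_start = epsilon_closure({start}, eps)
    dfa, todo = {}, [dfa_start]
    while todo:
        current = todo.pop()
        if current in dfa:
            continue
        dfa[current] = {}
        for symbol in alphabet:
            moved = {t for s in current for t in delta.get((s, symbol), ())}
            if moved:
                target = epsilon_closure(moved, eps)
                dfa[current][symbol] = target
                todo.append(target)
    dfa_accepting = {state for state in dfa if state & accepting}
    return dfa_start, dfa, dfa_accepting


# Hypothetical non-deterministic automaton with ε-transitions.
delta = {
    ("s", "0"): {"a"}, ("s", "1"): {"b"},
    ("b", "0"): {"b"}, ("b", "1"): {"b"},
    ("c", "."): {"d"},
    ("d", "0"): {"e"}, ("d", "1"): {"e"},
}
eps = {"a": {"c"}, "b": {"c"}, "e": {"d"}}
dfa_start, dfa, dfa_accepting = to_deterministic("s", delta, eps, {"c", "e"}, "01.")


def accepts(word):
    """Run the deterministic automaton on word."""
    state = dfa_start
    for symbol in word:
        state = dfa[state].get(symbol)
        if state is None:  # no transition: the word is rejected
            return False
    return state in dfa_accepting


print(len(dfa), accepts("10.1"), accepts("10."))
```

Only the sets that are actually reachable from the start state are generated, which keeps the deterministic automaton small in this case.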


Example

<tall> → 0 <FP> | 1 <IFP>

<IFP> → 1 <IFP> | 0 <IFP> | <FP>

<FP> → ε | . <EP>

<EP> → 0 | 1 | 0 <EP> | 1 <EP>

<tall> → { 0 | 1 {0 | 1 }* } {.{ 0 | 1 }+ }?

Allowed words are

0 1 101 0.10 100.1010 10.1

However, a leading 0 in front of other digits is not allowed, nor is a "decimal point" without preceding or following digits, so the following are not allowed:

001 10. .01
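The Extended BNF rule for this example can be tried out as a regular expression. A minimal sketch with Python's `re` module; the name `is_tall_with_fraction` is an assumption for the example.

```python
import re

# { 0 | 1 {0 | 1}* } { . {0 | 1}+ }?
PATTERN = re.compile(r"(0|1[01]*)(\.[01]+)?")


def is_tall_with_fraction(word):
    """True if the whole word matches the rule for <tall>."""
    return PATTERN.fullmatch(word) is not None


allowed = ["0", "1", "101", "0.10", "100.1010", "10.1"]
not_allowed = ["001", "10.", ".01"]
print(all(is_tall_with_fraction(w) for w in allowed))
print(any(is_tall_with_fraction(w) for w in not_allowed))
```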


Parse/syntax tree

[Figure: parse tree for 10.1, using tall → 1 IFP, IFP → 0 IFP, IFP → FP, FP → . EP and EP → 1.]


From syntax diagram to non-deterministic automaton

1. Every "switch" becomes a node in the automaton.
2. The lines (with symbols) become transitions between the nodes. Some transitions may get an empty symbol (ε).
3. Mark the start node and the end node(s).


Example

[Figure: the syntax diagram for <tall> and the non-deterministic automaton built from it, with transitions labelled 0, 1, "." and ε, and a node marked X.]


From non-deterministic to deterministic automaton

[Figure: step-by-step construction of the deterministic automaton from the non-deterministic one. The nodes of the non-deterministic automaton are labelled a, b, c, d and X. The deterministic states are labelled with sets of these nodes (a,X; a,b,X; c; c,d,X), have transitions on 0, 1 and ".", and are finally numbered 1 to 5.]


As a table

        1      2      3      4      5      error
0       2      error  3      5      5
1       3      error  3      5      5
.       error  4      4      error  error
end            ok     ok            ok


Syntax checking algorithm

Given such a table t, the following algorithm will check a given string:

    state := 1;
    while <more symbols> do begin
        c := <next symbol>;
        state := t(state, c);
    end while;
    if ok(state) then <OK>
    else <Not OK>;

Summary - How to make a syntax checking program:
– Make a non-deterministic automaton for the regular expression
– Make a deterministic automaton from the non-deterministic automaton
– Make the table t
– Use the above algorithm
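The table and the checking algorithm can be written out as a runnable sketch. The dictionary layout, the names `T`, `ACCEPTING` and `check`, and the treatment of missing entries as the error state are assumptions; states 2, 3 and 5 are taken as the ones marked ok in the end row.

```python
ERROR = "error"

# Transition table t: state -> symbol -> next state.
# Entries not listed here lead to the error state.
T = {
    1: {"0": 2, "1": 3},
    2: {".": 4},
    3: {"0": 3, "1": 3, ".": 4},
    4: {"0": 5, "1": 5},
    5: {"0": 5, "1": 5},
}
ACCEPTING = {2, 3, 5}  # states marked ok at the end of the input


def check(word):
    """Table-driven syntax check, following the algorithm on the slide."""
    state = 1
    for c in word:
        # Once in the error state there is no way out.
        state = T.get(state, {}).get(c, ERROR)
    return state in ACCEPTING


for word in ["0", "101", "10.1", "001", "10.", ".01"]:
    print(word, "OK" if check(word) else "Not OK")
```

Unlike the pseudocode, the loop here keeps reading after an error; stopping at the first error state would give the same answer.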