33
Concepts of Programming Languages Lecture 5 - Syntax Patrick Donnelly Montana State University Spring 2014 Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014 1 / 23

Concepts of Programming Languages - KSU Facultyfac.ksu.edu.sa/sites/default/files/05-syntax.pdf · A language that is simple to parse for the compiler is also simple to parse for

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Concepts of Programming Languages - KSU Facultyfac.ksu.edu.sa/sites/default/files/05-syntax.pdf · A language that is simple to parse for the compiler is also simple to parse for

Concepts of Programming LanguagesLecture 5 - Syntax

Patrick Donnelly

Montana State University

Spring 2014

Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014 1 / 23

Page 2: Concepts of Programming Languages - KSU Facultyfac.ksu.edu.sa/sites/default/files/05-syntax.pdf · A language that is simple to parse for the compiler is also simple to parse for

Administrivia

Assignment:

Programming #1 : due 02.10Homework #2 : due 02.19

Reading:

Chapter 3

Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014 2 / 23

Page 3: Concepts of Programming Languages - KSU Facultyfac.ksu.edu.sa/sites/default/files/05-syntax.pdf · A language that is simple to parse for the compiler is also simple to parse for

A language that is simple to parse for the compiler is also simple toparse for the human programmer.

N. Wirth (1974)

Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014 3 / 23

Page 4: Concepts of Programming Languages - KSU Facultyfac.ksu.edu.sa/sites/default/files/05-syntax.pdf · A language that is simple to parse for the compiler is also simple to parse for

Thinking about Syntax

DefinitionThe syntax of a programming language is a precise description of allits grammatically correct programs.

Precise syntax was first used with Algol 60, and has been used eversince.

Three levels:Lexical syntaxConcrete syntaxAbstract syntax

Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014 4 / 23

Page 5: Concepts of Programming Languages - KSU Facultyfac.ksu.edu.sa/sites/default/files/05-syntax.pdf · A language that is simple to parse for the compiler is also simple to parse for

Levels of Syntax

Lexical syntax = all the basic symbols of the language (names, values,operators, etc.)

Concrete syntax = rules for writing expressions, statements andprograms.

Abstract syntax = internal representation of the program, favoringcontent over form.

C: if ( expr ) ...discard ( )

Ada: if ( expr ) then discard then

Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014 5 / 23

Page 6: Concepts of Programming Languages - KSU Facultyfac.ksu.edu.sa/sites/default/files/05-syntax.pdf · A language that is simple to parse for the compiler is also simple to parse for

Syntax of a Small Language: Clite

Why examine only a subset of C?

Grammar Size for Various Languages

Language Pages Reference

Pascal 5 [Jensen & Wirth, 1975]C 6 [Kernighan & Richie, 1988]C++ 22 [Stroustrup, 1997]Java 14 [Gosling et. al., 1996]

The Clite grammar fits on one page (next few slides),so it’s a far better tool for studying language design.

Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014 6 / 23

Page 7: Concepts of Programming Languages - KSU Facultyfac.ksu.edu.sa/sites/default/files/05-syntax.pdf · A language that is simple to parse for the compiler is also simple to parse for

Syntax of a Small Language: Clite

Why examine only a subset of C?

Grammar Size for Various Languages

Language Pages Reference

Pascal 5 [Jensen & Wirth, 1975]C 6 [Kernighan & Richie, 1988]C++ 22 [Stroustrup, 1997]Java 14 [Gosling et. al., 1996]

The Clite grammar fits on one page (next few slides),so it’s a far better tool for studying language design.

Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014 6 / 23

Page 8: Concepts of Programming Languages - KSU Facultyfac.ksu.edu.sa/sites/default/files/05-syntax.pdf · A language that is simple to parse for the compiler is also simple to parse for

Clite Grammar: Statements

Program → int main ( ) { Declarations Statements }

Declarations → { Declaration }

Declaration → Type Identifier [ [ Integer ] ]{ , Identifier [ [ Integer ] ] }

Type → int | bool | float | char

Statements → { Statement }

Statement → ; | Block | Assignment |IfStatement | WhileStatement

Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014 7 / 23

Page 9: Concepts of Programming Languages - KSU Facultyfac.ksu.edu.sa/sites/default/files/05-syntax.pdf · A language that is simple to parse for the compiler is also simple to parse for

Clite Grammar: Statements

Block → { Statements}

Assignment → Identifier [ [ Expression ] ]= Expression ;

IfStatement → if ( Expression ) Statement[ else Statement ]

WhileStatement → while ( Expression ) Statement

Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014 8 / 23

Page 10: Concepts of Programming Languages - KSU Facultyfac.ksu.edu.sa/sites/default/files/05-syntax.pdf · A language that is simple to parse for the compiler is also simple to parse for

Clite Grammar: Expressions

Expression → Conjunction { || Conjunction }

Conjunction → Equality { && Equality }

Equality → Relation [ EquOp Relation ]

EquOp → == | !=

Relation → Addition [ RelOp Addition ]

RelOp → < | <= | > | >=

Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014 9 / 23

Page 11: Concepts of Programming Languages - KSU Facultyfac.ksu.edu.sa/sites/default/files/05-syntax.pdf · A language that is simple to parse for the compiler is also simple to parse for

Clite Grammar: Expressions

Addition → Term { AddOp Term }

AddOp → + | -

Term → Factor { MulOp Factor }

MulOp → * | / | %

Factor → [ UnaryOp ] Primary

UnaryOp → - | !

Primary → Identifier [ [ Expression ] ] | Literal |( Expression ) | Type ( Expression )

Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014 10 / 23

Page 12: Concepts of Programming Languages - KSU Facultyfac.ksu.edu.sa/sites/default/files/05-syntax.pdf · A language that is simple to parse for the compiler is also simple to parse for

Clite Grammar: Lexical Level

Identifier → Letter { Letter | Digit }

Letter → a | b | ...| z | A | B | ...| Z

Digit → 0 | 1 | ...| 9

Literal → Integer | Boolean | Float | Char

Integer → Digit { Digit }

Boolean → true | false

Float → Integer . Integer

Char → ‘ASCII Char’

Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014 11 / 23

Page 13: Concepts of Programming Languages - KSU Facultyfac.ksu.edu.sa/sites/default/files/05-syntax.pdf · A language that is simple to parse for the compiler is also simple to parse for

Issues Not Addressed by this Grammar

1 Comments2 Whitespace3 Distinguishing one token <= from two tokens < =4 Distinguishing identifiers from keywords like if

These issues are addressed by identifying two levels:lexical levelsyntactic level

Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014 12 / 23

Page 14: Concepts of Programming Languages - KSU Facultyfac.ksu.edu.sa/sites/default/files/05-syntax.pdf · A language that is simple to parse for the compiler is also simple to parse for

Lexical SyntaxInput : a stream of characters from the ASCII set,keyed by a programmer.

Output : a stream of tokens or basic symbols,classified as follows:

Identifiers e.g., Stack, x, i, push

Literals e.g., 123, ’x’, 3.25, true

Keywords bool char else false float if int main true while

Operators = || && == != < <= > >= + - * / !

Punctuation ; , { } ( )

Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014 13 / 23

Page 15: Concepts of Programming Languages - KSU Facultyfac.ksu.edu.sa/sites/default/files/05-syntax.pdf · A language that is simple to parse for the compiler is also simple to parse for

Whitespace

Whitespace is any space, tab, end-of-line character (or characters),or character sequence inside a comment

No token may contain embedded whitespace(unless it is a character or string literal)

Example:>= one token> = two tokens

Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014 14 / 23

Page 16: Concepts of Programming Languages - KSU Facultyfac.ksu.edu.sa/sites/default/files/05-syntax.pdf · A language that is simple to parse for the compiler is also simple to parse for

Whitespace Examples in Pascal

Legal or Illegal?

while a < b do

legal spacing between tokens

whilea < bdo illegal cannot tell boundaries

while a<b do legal spacing not needed for <

whilea<bdo illegal cannot tell boundaries

Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014 15 / 23

Page 17: Concepts of Programming Languages - KSU Facultyfac.ksu.edu.sa/sites/default/files/05-syntax.pdf · A language that is simple to parse for the compiler is also simple to parse for

Whitespace Examples in Pascal

Legal or Illegal?

while a < b do legal spacing between tokens

whilea < bdo illegal cannot tell boundaries

while a<b do legal spacing not needed for <

whilea<bdo illegal cannot tell boundaries

Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014 15 / 23

Page 18: Concepts of Programming Languages - KSU Facultyfac.ksu.edu.sa/sites/default/files/05-syntax.pdf · A language that is simple to parse for the compiler is also simple to parse for

Whitespace Examples in Pascal

Legal or Illegal?

while a < b do legal spacing between tokens

whilea < bdo

illegal cannot tell boundaries

while a<b do legal spacing not needed for <

whilea<bdo illegal cannot tell boundaries

Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014 15 / 23

Page 19: Concepts of Programming Languages - KSU Facultyfac.ksu.edu.sa/sites/default/files/05-syntax.pdf · A language that is simple to parse for the compiler is also simple to parse for

Whitespace Examples in Pascal

Legal or Illegal?

while a < b do legal spacing between tokens

whilea < bdo illegal cannot tell boundaries

while a<b do legal spacing not needed for <

whilea<bdo illegal cannot tell boundaries

Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014 15 / 23

Page 20: Concepts of Programming Languages - KSU Facultyfac.ksu.edu.sa/sites/default/files/05-syntax.pdf · A language that is simple to parse for the compiler is also simple to parse for

Whitespace Examples in Pascal

Legal or Illegal?

while a < b do legal spacing between tokens

whilea < bdo illegal cannot tell boundaries

while a<b do

legal spacing not needed for <

whilea<bdo illegal cannot tell boundaries

Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014 15 / 23

Page 21: Concepts of Programming Languages - KSU Facultyfac.ksu.edu.sa/sites/default/files/05-syntax.pdf · A language that is simple to parse for the compiler is also simple to parse for

Whitespace Examples in Pascal

Legal or Illegal?

while a < b do legal spacing between tokens

whilea < bdo illegal cannot tell boundaries

while a<b do legal spacing not needed for <

whilea<bdo illegal cannot tell boundaries

Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014 15 / 23

Page 22: Concepts of Programming Languages - KSU Facultyfac.ksu.edu.sa/sites/default/files/05-syntax.pdf · A language that is simple to parse for the compiler is also simple to parse for

Whitespace Examples in Pascal

Legal or Illegal?

while a < b do legal spacing between tokens

whilea < bdo illegal cannot tell boundaries

while a<b do legal spacing not needed for <

whilea<bdo

illegal cannot tell boundaries

Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014 15 / 23

Page 23: Concepts of Programming Languages - KSU Facultyfac.ksu.edu.sa/sites/default/files/05-syntax.pdf · A language that is simple to parse for the compiler is also simple to parse for

Whitespace Examples in Pascal

Legal or Illegal?

while a < b do legal spacing between tokens

whilea < bdo illegal cannot tell boundaries

while a<b do legal spacing not needed for <

whilea<bdo illegal cannot tell boundaries

Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014 15 / 23

Page 24: Concepts of Programming Languages - KSU Facultyfac.ksu.edu.sa/sites/default/files/05-syntax.pdf · A language that is simple to parse for the compiler is also simple to parse for

Whitespace Examples in Pascal

Legal or Illegal?

while a < b do legal spacing between tokens

whilea < bdo illegal cannot tell boundaries

while a<b do legal spacing not needed for <

whilea<bdo illegal cannot tell boundaries

Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014 15 / 23

Page 25: Concepts of Programming Languages - KSU Facultyfac.ksu.edu.sa/sites/default/files/05-syntax.pdf · A language that is simple to parse for the compiler is also simple to parse for

Comments

Not defined in grammar

Clite uses // comment style of C++

Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014 16 / 23

Page 26: Concepts of Programming Languages - KSU Facultyfac.ksu.edu.sa/sites/default/files/05-syntax.pdf · A language that is simple to parse for the compiler is also simple to parse for

Identifier

Sequence of letters and digits, starting with a letter

if is both an identifier and a keyword

Most languages require identifiers to be distinct from keywords

In some languages, identifiers are merely predefined(and thus can be redefined by the programmer)

Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014 17 / 23

Page 27: Concepts of Programming Languages - KSU Facultyfac.ksu.edu.sa/sites/default/files/05-syntax.pdf · A language that is simple to parse for the compiler is also simple to parse for

Identifier

Redefining Identifiers can be dangerous!

program confusing;

const true = false;

begin

if (a<b) = true then

f(a)

else ...

Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014 18 / 23

Page 28: Concepts of Programming Languages - KSU Facultyfac.ksu.edu.sa/sites/default/files/05-syntax.pdf · A language that is simple to parse for the compiler is also simple to parse for

Identifier

Should Identifiers be case-sensitive?

Pascal no

Modula yes

C, C++ yes

Java yes

PHP partly yes, partly no

Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014 19 / 23

Page 29: Concepts of Programming Languages - KSU Facultyfac.ksu.edu.sa/sites/default/files/05-syntax.pdf · A language that is simple to parse for the compiler is also simple to parse for

Concrete Syntax

Based on a parse of its Tokens:

; is a statement terminator

(Algol-60, Pascal use ; as a separator)

Rule for IfStatement is ambiguous:

The else ambiguity is resolved by connecting an else with the lastencountered else-less if. Stroustrup, 1991

Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014 20 / 23

Page 30: Concepts of Programming Languages - KSU Facultyfac.ksu.edu.sa/sites/default/files/05-syntax.pdf · A language that is simple to parse for the compiler is also simple to parse for

Expressions in Clite

13 grammar rules

Use of meta braces - operators are left associative

C++ expressions require 4 pages of grammar rules[Stroustrup]

C uses an ambiguous expression grammar[Kernighan and Ritchie]

Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014 21 / 23

Page 31: Concepts of Programming Languages - KSU Facultyfac.ksu.edu.sa/sites/default/files/05-syntax.pdf · A language that is simple to parse for the compiler is also simple to parse for

Associativity and Precedence

Clite Operator Associativity

Unary - ! none

* / left

+ - left

< <= > >= none

== != none

&& left

|| left

Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014 22 / 23

Page 32: Concepts of Programming Languages - KSU Facultyfac.ksu.edu.sa/sites/default/files/05-syntax.pdf · A language that is simple to parse for the compiler is also simple to parse for

Clite Equality, Relational Operators

Clite Equality, Relational Operators are non-associative.

(an idea borrowed from Ada)

Why is this important?

In C++, the expression:

if (a < x < b)

is not equivalent to

if (a < x && x < b)

But it is error-free!

So, what does it mean?

Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014 23 / 23

Page 33: Concepts of Programming Languages - KSU Facultyfac.ksu.edu.sa/sites/default/files/05-syntax.pdf · A language that is simple to parse for the compiler is also simple to parse for

Clite Equality, Relational Operators

Clite Equality, Relational Operators are non-associative.

(an idea borrowed from Ada)

Why is this important?

In C++, the expression:

if (a < x < b)

is not equivalent to

if (a < x && x < b)

But it is error-free!

So, what does it mean?

Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014 23 / 23