Lexical and Syntax Analysis - University of York

Lexical and Syntax Analysis (of Programming Languages)

Abstract Syntax

What is Parsing?

String of characters  →  Parser  →  Data structure

The string of characters is easy for humans to write; the data structure is easy for programs to process.

A parser also checks that the input string is well-formed, and if not, rejects it.

Example 1

Input, in CSV (Comma Separated Value) format:

Charlton, 49
Lineker, 48
Beckham, 17

Output of the parser, an array of pairs:

("Charlton", 49)
("Lineker", 48)
("Beckham", 17)

Concrete and Abstract Syntax

The concrete syntax is a set of rules that describe valid inputs to the parser.

The abstract syntax is a set of rules that describe valid outputs from the parser.

The data structure produced by a parser is commonly termed the abstract syntax tree.

Concrete and Abstract Syntax

String of characters  →  Parser  →  Data structure

The string of characters conforms to the concrete syntax of the language; the data structure conforms to the abstract syntax of the language.

Abstract syntax

The abstract syntax is usually specified as a data type in the programming language being used, in our case C. Example:

typedef struct {
    char* name;
    int goals;
} Player;

typedef struct {
    Player* players;
    int size;
} Squad;

An abstract syntax tree is a value of this type.
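As a rough illustration, the value that a parser might produce for the CSV input of Example 1 can be built by hand. The helper names example_squad and total_goals are assumptions made for this sketch and do not come from the slides.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    char* name;
    int goals;
} Player;

typedef struct {
    Player* players;
    int size;
} Squad;

/* Build the squad of Example 1 by hand. The players array is
   heap-allocated; on allocation failure the squad is empty. */
Squad example_squad(void) {
    Squad s = { NULL, 0 };
    s.players = malloc(3 * sizeof(Player));
    if (s.players == NULL) return s;
    s.players[0] = (Player){ "Charlton", 49 };
    s.players[1] = (Player){ "Lineker", 48 };
    s.players[2] = (Player){ "Beckham", 17 };
    s.size = 3;
    return s;
}

/* Sum the goals of every player in the squad. */
int total_goals(const Squad* s) {
    assert(s != NULL);
    int total = 0;
    for (int i = 0; i < s->size; i++) total += s->players[i].goals;
    return total;
}
```

A caller would typically write Squad s = example_squad(); and later release the array with free(s.players);.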

This Chapter

How to define the abstract syntax, and how to construct abstract syntax trees, in the programming language C.

Also revisits some important C programming techniques.

If you need a C tutorial then the following books are recommended.

POINTERS

Pointer: a variable that holds the address of a core storage location. [The Free Dictionary]

Pointers

Declare a variable x of type int and initialise it to the value 10.

int x = 10;

x: 10

Declare a variable p of type int* (read: int pointer).

int* p;

Make p point to x (or assign the address of x to p).

p = &x;

p: → x: 10

Pointers

Print p (here, the address of x). A pointer is printed with %p and a cast to void*; passing a pointer to %i is not valid.

printf("%p\n", (void*) p);

Print the value pointed to by p (here, the value of x).

printf("%i\n", *p);

Assign 20 to the location pointed to by p.

*p = 20;

p: → x: 20
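The pointer steps above can be collected into one function, as a minimal sketch. The name pointer_demo is an assumption made for this example, and the address printed by the first printf will differ from run to run.

```c
#include <assert.h>
#include <stdio.h>

/* Walk through the pointer steps and return the final value of x. */
int pointer_demo(void) {
    int x = 10;
    int* p = &x;                /* p holds the address of x */
    printf("%p\n", (void*) p);  /* the address of x */
    printf("%i\n", *p);         /* the value of x, here 10 */
    *p = 20;                    /* write through p, so x changes */
    assert(*p == x);            /* *p and x name the same storage */
    return x;
}
```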

Exercise 1

What is printed by the following program?

#include <stdio.h>

void swap(int* x, int* y) {
    int tmp;
    tmp = *x;
    *x = *y;
    *y = tmp;
}

int main() {
    int a = 1;
    int b = 2;
    swap(&a, &b);
    printf("a=%i, b=%i\n", a, b);
    return 0;
}

DYNAMIC ALLOCATION

Dynamic Allocation: the allocation of memory storage for use in a computer program. [The Free Dictionary]

Array allocation

Declare a variable p of type int*.

int* p;

Allocate memory for an array of 4 int values and let p point to it.

p = malloc(4 * sizeof(int));

Array indexing

Assign 10 to the location pointed to by p.

*p = 10;

p: → | 10 |    |    |    |

Assign 20 to the first element of the array pointed to by p.

p[0] = 20;

p: → | 20 |    |    |    |

Copy the first element of the array to the third element.

p[2] = p[0];

p: → | 20 |    | 20 |    |

Array deallocation

free(p);

When finished with an array allocated by malloc, call free to release the space, otherwise your program may run out of memory.

The space is released, so it can be reused by future calls to malloc.
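A minimal sketch of the whole allocate, index, free cycle follows. The function names make_squares and sum_array are assumptions made for this example. The check of malloc's result against NULL is not on the slides but is usual practice, since malloc may fail.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate an array of n ints holding 0*0, 1*1, ..., (n-1)*(n-1).
   Returns NULL if malloc fails; the caller must free the result. */
int* make_squares(size_t n) {
    int* p = malloc(n * sizeof(int));
    if (p == NULL) return NULL;
    for (size_t i = 0; i < n; i++) p[i] = (int) (i * i);
    return p;
}

/* Sum the first n elements of the array pointed to by p. */
int sum_array(const int* p, size_t n) {
    assert(p != NULL);
    int total = 0;
    for (size_t i = 0; i < n; i++) total += p[i];
    return total;
}
```

A caller would write int* p = make_squares(4); check that p is not NULL, use the array, and finish with free(p);.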

STRINGS

String: a series of consecutive characters. [The Free Dictionary]

Strings

Declare a variable s, initialised to point to the string "hi".

char* s = "hi";

h i \0    (s points to 'h')

Let s point to the next character.

s = s + 1;

h i \0    (s points to 'i')

And let s point to the previous character again.

s = s - 1;

h i \0    (s points to 'h')
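Stepping a pointer along a string until it reaches the terminating \0 is a common idiom. The sketch below uses a hypothetical function count_char to show it. The parameter is declared const char* because the characters of a string literal must not be modified.

```c
#include <assert.h>
#include <stddef.h>

/* Count how many times c occurs in the \0-terminated string s. */
size_t count_char(const char* s, char c) {
    assert(s != NULL);
    size_t n = 0;
    while (*s != '\0') {   /* stop at the terminating \0 */
        if (*s == c) n++;
        s = s + 1;         /* let s point to the next character */
    }
    return n;
}
```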

Exercise 2

What is printed by the following program?

#include <stdio.h>

int f(char* s) {
    int i = 0;
    while (s[i] != '\0') i++;
    return i;
}

int main() {
    char* x = "Hello";
    printf("%i\n", f(x));
    return 0;
}

USER-DEFINED TYPES

Type: the general character or structure held in common by a number of things. [The Free Dictionary]

Type definitions

A typedef declaration allows a new name to be given to a type.

typedef int Integer;
typedef char* String;

In each declaration the existing type comes first and the new name second.

Example use:

String s;   /* Declare a string s */
Integer i;  /* and an integer i */
i = 0;
s = "hello";

Enumerations

An enum declaration introduces a new type whose values are members of a given set.

enum colour {RED, GREEN, BLUE};

Here enum colour is the new type, and RED, GREEN and BLUE are its possible values.

Example use:

enum colour c;
c = RED;
if (c == RED) printf("Red\n");

Give it a shorter name:

typedef enum colour Colour;
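Enumerations pair naturally with switch statements. The sketch below maps each value to a printable name; colour_name is a hypothetical helper, not something defined on the slides.

```c
#include <assert.h>

enum colour {RED, GREEN, BLUE};
typedef enum colour Colour;

/* Map each Colour to a printable name. */
const char* colour_name(Colour c) {
    switch (c) {
        case RED:   return "Red";
        case GREEN: return "Green";
        case BLUE:  return "Blue";
    }
    assert(0 && "unknown colour");  /* reached only for an invalid value */
    return "?";
}
```

Listing every member in the switch, with no default case, lets many compilers warn when a new member is added to the enum but not handled.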

Structures

A struct declaration introduces a new type that is a conjunction of one or more existing types.

struct rectangle {
    float width;
    float height;
};

The new type struct rectangle holds a width and a height.

Example use:

struct rectangle r;
r.width = 10;
r.height = 20;

A circle:

struct circle {
    float radius;
};

Unions

A union declaration introduces a new type that is a disjunction of one or more existing types.

union shape {
    struct circle circ;
    struct rectangle rect;
};

The new type union shape holds a circle or a rectangle.

Example use:

union shape s;
s.circ.radius = 10;

Tagged unions

Often a tag is used to denote the active disjunct of a union.

Another definition of shape:

struct shape {
    enum { CIRCLE, RECTANGLE } tag;
    union {
        struct circle circ;
        struct rectangle rect;
    };
};

Tagged unions

Example: s is a circle and t is a rectangle, and both are of type struct shape.

struct shape s, t;

s.tag = CIRCLE;
s.circ.radius = 10;

t.tag = RECTANGLE;
t.rect.width = 5;
t.rect.height = 15;

Tagged unions

Example: compute the area of any given shape s.

float area(struct shape s) {
    if (s.tag == CIRCLE) {
        float r = s.circ.radius;
        return (3.14 * r * r);
    }
    if (s.tag == RECTANGLE) {
        return (s.rect.width * s.rect.height);
    }
    return 0;  /* not reached for a valid tag */
}
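The same pattern extends to other operations on shapes. The sketch below is one possible arrangement: the helpers make_circle, make_rectangle and perimeter are assumptions made for this example. The unnamed union member inside struct shape relies on C11 anonymous unions.

```c
#include <assert.h>

struct circle { float radius; };
struct rectangle { float width; float height; };

/* Tagged union; the unnamed union member requires C11. */
struct shape {
    enum { CIRCLE, RECTANGLE } tag;
    union {
        struct circle circ;
        struct rectangle rect;
    };
};

/* Constructors set the tag and the matching disjunct together. */
struct shape make_circle(float radius) {
    struct shape s;
    s.tag = CIRCLE;
    s.circ.radius = radius;
    return s;
}

struct shape make_rectangle(float width, float height) {
    struct shape s;
    s.tag = RECTANGLE;
    s.rect.width = width;
    s.rect.height = height;
    return s;
}

/* Dispatch on the tag before touching either disjunct. */
float perimeter(struct shape s) {
    switch (s.tag) {
        case CIRCLE:    return 2 * 3.14f * s.circ.radius;
        case RECTANGLE: return 2 * (s.rect.width + s.rect.height);
    }
    assert(0 && "unknown tag");
    return 0;
}
```

Setting the tag inside a constructor keeps the tag and the active disjunct from drifting apart, which is easy to get wrong when both are assigned by hand.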

Recursive structures

A value of type struct t may contain a value of type struct t*.

struct list {
    int head;
    struct list* tail;
};

typedef struct list List;

Suppose xs is a value of type List*.

(*xs).head ≡ xs->head

(*(*xs).tail).head ≡ xs->tail->head

Recursive structures

Example: inserting an item onto the front of a linked list.

List* insert(List* xs, int x) {
    List* ys = malloc(sizeof(List));
    ys->head = x;
    ys->tail = xs;
    return ys;
}
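A list built with insert can then be traversed and released. The sketch below assumes that the empty list is represented by NULL, which the slides do not state. The helpers sum and free_list, and the malloc check inside insert, are additions made for this example.

```c
#include <stdlib.h>

struct list {
    int head;
    struct list* tail;
};
typedef struct list List;

/* As in the text, with an added check that malloc succeeded. */
List* insert(List* xs, int x) {
    List* ys = malloc(sizeof(List));
    if (ys == NULL) abort();
    ys->head = x;
    ys->tail = xs;
    return ys;
}

/* Sum every element; the empty list is assumed to be NULL. */
int sum(const List* xs) {
    int total = 0;
    for (; xs != NULL; xs = xs->tail) total += xs->head;
    return total;
}

/* Release every node of the list. */
void free_list(List* xs) {
    while (xs != NULL) {
        List* next = xs->tail;  /* read the tail before freeing the node */
        free(xs);
        xs = next;
    }
}
```

For instance, insert(insert(insert(NULL, 3), 2), 1) builds the list 1, 2, 3, because each call adds to the front.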

CASE STUDY

A simplifier for arithmetic expressions.

Concrete syntax

Consider the following concrete syntax for arithmetic expressions, where v ranges over variable names and n over integers.

e → v
  | n
  | e + e
  | e * e
  | ( e )

Example expression:

x * y + (z * 10)

Simplification

Consider the algebraic law:

∀x. x * 1 = x

This law can be used to simplify expressions by using it as a rewrite rule from left to right.

Example simplification:

x * (y * 1) → x * y

Problem

1. Define an abstract syntax, in C, for arithmetic expressions.

2. Show how to construct abstract syntax trees that represent arithmetic expressions.

3. Implement the simplification as a C function that takes and returns an abstract syntax tree.

Abstract syntax

typedef enum { ADD, MUL } Op;

struct expr {
    enum { VAR, NUM, APP } tag;
    union {
        char* var;
        int num;
        struct {
            struct expr* e1;
            Op op;
            struct expr* e2;
        } app;
    };
};

typedef struct expr Expr;

An expression is a variable, or a number, or an op and two sub-expressions.

Constructors

Expr* mkVar(char* v) {
    Expr* e = malloc(sizeof(Expr));
    e->tag = VAR; e->var = v;
    return e;
}

Expr* mkNum(int n) {
    Expr* e = malloc(sizeof(Expr));
    e->tag = NUM; e->num = n;
    return e;
}

Expr* mkApp(Expr* e1, Op op, Expr* e2) {
    Expr* e = malloc(sizeof(Expr));
    e->tag = APP; e->app.op = op;
    e->app.e1 = e1; e->app.e2 = e2;
    return e;
}

Abstract syntax trees

An abstract syntax tree that represents the expression

x + y * 2

can be constructed by the following C expression

mkApp(mkVar("x"), ADD, mkApp(mkVar("y"), MUL, mkNum(2)))

Simplification

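The law

∀x. x * 1 = x

is implemented by

void simplify(Expr* e) {
    /* Simplify the sub-expressions first, so that cases such as
       (x * 1) * 1 and x * (1 * 1) are fully reduced. */
    if (e->tag == APP) {
        simplify(e->app.e1);
        simplify(e->app.e2);
    }
    if (e->tag == APP
        && e->app.op == MUL
        && e->app.e2->tag == NUM
        && e->app.e2->num == 1) {
        *e = *(e->app.e1);
    }
}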
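The simplifier above overwrites a node in place and leaves the discarded nodes allocated. One possible alternative, sketched below under the assumption that every node was allocated by the constructors and is referenced only once, returns the simplified tree and frees the nodes it drops. The names simplifyTree and newExpr are hypothetical, and the abort on allocation failure is an addition.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { ADD, MUL } Op;

struct expr {
    enum { VAR, NUM, APP } tag;
    union {
        char* var;
        int num;
        struct { struct expr* e1; Op op; struct expr* e2; } app;
    };
};
typedef struct expr Expr;

/* Allocate a node, aborting if malloc fails. */
static Expr* newExpr(void) {
    Expr* e = malloc(sizeof(Expr));
    if (e == NULL) abort();
    return e;
}

Expr* mkVar(char* v) { Expr* e = newExpr(); e->tag = VAR; e->var = v; return e; }
Expr* mkNum(int n)   { Expr* e = newExpr(); e->tag = NUM; e->num = n; return e; }

Expr* mkApp(Expr* e1, Op op, Expr* e2) {
    Expr* e = newExpr();
    e->tag = APP; e->app.op = op;
    e->app.e1 = e1; e->app.e2 = e2;
    return e;
}

/* Simplify the children first, then apply x * 1 = x at this node,
   freeing the two nodes that are dropped. Returns the new root. */
Expr* simplifyTree(Expr* e) {
    assert(e != NULL);
    if (e->tag != APP) return e;
    e->app.e1 = simplifyTree(e->app.e1);
    e->app.e2 = simplifyTree(e->app.e2);
    if (e->app.op == MUL && e->app.e2->tag == NUM && e->app.e2->num == 1) {
        Expr* kept = e->app.e1;
        free(e->app.e2);  /* the literal 1 */
        free(e);          /* the multiplication node */
        return kept;
    }
    return e;
}
```

Because the root itself may be replaced, a caller would write e = simplifyTree(e); rather than ignoring the result.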

Homework exercises

Extend the simplifier to exploit the following algebraic law.

∀x. x * 0 = 0

Implement a pretty printer that prints an abstract syntax tree in a concrete form.

void print(Expr* e) {
    ...
}

Motivation for LSA

In LSA, we are interested in how to implement the following kind of function

Expr* parse(char* string) {
    ...
}

It takes a string conforming to the concrete syntax and returns an abstract syntax tree.
