Lexical and Syntax Analysis (of Programming Languages) Abstract Syntax

University of York

What is Parsing?

A parser takes a string of characters (easy for humans to write) and produces a data structure (easy for programs to process).

A parser also checks that the input string is well-formed, and if not, rejects it.

Example 1

Input, in CSV (Comma Separated Value) format:

    Charlton, 49
    Lineker, 48
    Beckham, 17

Output of the parser, an array of pairs:

    ("Charlton", 49)
    ("Lineker", 48)
    ("Beckham", 17)

Concrete and Abstract Syntax

The concrete syntax is a set of rules that describe valid inputs to the parser.

The abstract syntax is a set of rules that describe valid outputs from the parser.

The data structure produced by a parser is commonly termed the abstract syntax tree.

Concrete and Abstract Syntax

The string of characters given to the parser conforms to the concrete syntax of the language.

The data structure produced by the parser conforms to the abstract syntax of the language.

Abstract syntax

The abstract syntax is usually specified as a data type in the programming language being used, in our case C. Example:

    typedef struct {
        char* name;
        int goals;
    } Player;

    typedef struct {
        Player* players;
        int size;
    } Squad;

An abstract syntax tree is a value of this type.

This Chapter

How to define the abstract syntax, and how to construct abstract syntax trees, in the programming language C.

Also revisits some important C programming techniques.

If you need a C tutorial then the following books are recommended.

POINTERS

Pointer: a variable that holds the address of a core storage location. [The Free Dictionary]

Pointers

Declare a variable x of type int and initialise it to the value 10.

    int x = 10;

Declare a variable p of type int* (read: int pointer).

    int* p;

Make p point to x (or assign the address of x to p).

    p = &x;

Pointers

Print p (here, the address of x). An address is printed with %p, and the argument is cast to void*.

    printf("%p\n", (void*) p);

Print the value pointed to by p (here, the value of x, which is 10).

    printf("%i\n", *p);

Assign 20 to the location pointed to by p. Afterwards x holds 20.

    *p = 20;

Exercise 1

What is printed by the following program?

    #include <stdio.h>

    void swap(int* x, int* y)
    {
        int tmp;
        tmp = *x;
        *x = *y;
        *y = tmp;
    }

    int main(void)
    {
        int a = 1;
        int b = 2;
        swap(&a, &b);
        printf("a=%i, b=%i\n", a, b);
        return 0;
    }

DYNAMIC ALLOCATION

Dynamic Allocation: the allocation of memory storage for use in a computer program. [The Free Dictionary]

Array allocation

Declare a variable p of type int*.

    int* p;

Allocate memory for an array of 4 int values and let p point to it.

    p = malloc(4 * sizeof(int));

Array indexing

Assign 10 to the location pointed to by p.

    *p = 10;

Assign 20 to the first element of the array pointed to by p.

    p[0] = 20;

Copy the first element of the array to the third element.

    p[2] = p[0];

Array deallocation

When finished with an array allocated by malloc, call free to release the space, otherwise your program may run out of memory.

    free(p);

The space is released, so it can be reused by future calls to malloc.

STRINGS

String: a series of consecutive characters. [The Free Dictionary]

Strings

Declare a variable s, initialised to point to the string "hi", which is stored as the characters h, i and the terminator \0.

    char* s = "hi";

Let s point to the next character.

    s = s + 1;

And let s point to the previous character again.

    s = s - 1;

Exercise 2

What is printed by the following program?

    #include <stdio.h>

    int f(char* s)
    {
        int i = 0;
        while (s[i] != '\0') i++;
        return i;
    }

    int main(void)
    {
        char* x = "Hello";
        printf("%i\n", f(x));
        return 0;
    }

USER-DEFINED TYPES

Type: the general character or structure held in common by a number of things. [The Free Dictionary]

Type definitions

A typedef declaration allows a new name to be given to a type. The existing type is written first and the new name second.

    typedef int Integer;
    typedef char* String;

Example use:

    String s;     /* Declare a string s */
    Integer i;    /* and an integer i */
    i = 0;
    s = "hello";

Enumerations

An enum declaration introduces a new type whose values are members of a given set.

    enum colour {RED, GREEN, BLUE};

Here enum colour is the new type and RED, GREEN and BLUE are its possible values.

Example use:

    enum colour c;
    c = RED;
    if (c == RED) printf("Red\n");

Give it a shorter name:

    typedef enum colour Colour;

Structures

A struct declaration introduces a new type that is a conjunction of one or more existing types.

    struct rectangle {
        float width;
        float height;
    };

The new type struct rectangle holds a width and a height.

Example use:

    struct rectangle r;
    r.width = 10;
    r.height = 20;

A circle:

    struct circle {
        float radius;
    };

Unions

A union declaration introduces a new type that is a disjunction of one or more existing types.

    union shape {
        struct circle circ;
        struct rectangle rect;
    };

The new type union shape holds a circle or a rectangle.

Example use:

    union shape s;
    s.circ.radius = 10;

Tagged unions

Often a tag is used to denote the active disjunct of a union.

Another definition of shape:

    struct shape {
        enum { CIRCLE, RECTANGLE } tag;
        union {
            struct circle circ;
            struct rectangle rect;
        };
    };

Tagged unions

Example: s is a circle and t is a rectangle, and both are of type struct shape.

    struct shape s, t;

    s.tag = CIRCLE;
    s.circ.radius = 10;

    t.tag = RECTANGLE;
    t.rect.width = 5;
    t.rect.height = 15;

Tagged unions

Example: compute the area of any given shape s.

    float area(struct shape s)
    {
        if (s.tag == CIRCLE) {
            float r = s.circ.radius;
            return (3.14 * r * r);
        }
        if (s.tag == RECTANGLE) {
            return (s.rect.width * s.rect.height);
        }
        return 0;  /* not reached for a valid tag; avoids falling off the end */
    }

Recursive structures

A value of type struct t may contain a value of type struct t*.

    struct list {
        int head;
        struct list* tail;
    };

    typedef struct list List;

Suppose xs is a value of type List*. Then:

    (*xs).head ≡ xs->head

    (*(*xs).tail).head ≡ xs->tail->head

Recursive structures

Example: inserting an item onto the front of a linked list.

    List* insert(List* xs, int x)
    {
        List* ys = malloc(sizeof(List));
        ys->head = x;
        ys->tail = xs;
        return ys;
    }

CASE STUDY

A simplifier for arithmetic expressions.

Concrete syntax

Consider the following concrete syntax for arithmetic expressions, where v ranges over variable names and n over integers.

    e → v
      | n
      | e + e
      | e * e
      | ( e )

Example expression:

    x * y + (z * 10)

Simplification

Consider the algebraic law:

    ∀x. x * 1 = x

This law can be used to simplify expressions by using it as a rewrite rule from left to right.

Example simplification:

    x * (y * 1) → x * y

Problem

1. Define an abstract syntax, in C, for arithmetic expressions.

2. Show how to construct abstract syntax trees that represent arithmetic expressions.

3. Implement the simplification as a C function that takes and returns an abstract syntax tree.

Abstract syntax

    typedef enum { ADD, MUL } Op;

    struct expr {
        enum { VAR, NUM, APP } tag;
        union {
            char* var;
            int num;
            struct {
                struct expr* e1;
                Op op;
                struct expr* e2;
            } app;
        };
    };

    typedef struct expr Expr;

An expression is a variable, or a number, or an op and two sub-expressions.

Constructors

    Expr* mkVar(char* v) {
        Expr* e = malloc(sizeof(Expr));
        e->tag = VAR; e->var = v;
        return e;
    }

    Expr* mkNum(int n) {
        Expr* e = malloc(sizeof(Expr));
        e->tag = NUM; e->num = n;
        return e;
    }

    Expr* mkApp(Expr* e1, Op op, Expr* e2) {
        Expr* e = malloc(sizeof(Expr));
        e->tag = APP; e->app.op = op;
        e->app.e1 = e1; e->app.e2 = e2;
        return e;
    }

Abstract syntax trees

An abstract syntax tree that represents the expression

    x + y * 2

can be constructed by the following C expression:

    mkApp( mkVar("x")
         , ADD
         , mkApp( mkVar("y")
                , MUL
                , mkNum(2)))

Simplification

The law

    ∀x. x * 1 = x

is implemented by

    void simplify(Expr* e)
    {
        if (e->tag == APP
            && e->app.op == MUL
            && e->app.e2->tag == NUM
            && e->app.e2->num == 1) {
            *e = *(e->app.e1);
        }
        if (e->tag == APP) {
            simplify(e->app.e1);
            simplify(e->app.e2);
        }
    }

Homework exercises

Extend the simplifier to exploit the following algebraic law.

    ∀x. x * 0 = 0

Implement a pretty printer that prints an abstract syntax tree in a concrete form.

    void print(Expr* e)
    {
        ...
    }

Motivation for LSA

In LSA, we are interested in how to implement the following kind of function

    Expr* parse(char* string)
    {
        ...
    }

It takes a string conforming to the concrete syntax and returns an abstract syntax tree.