CMPUT 680 - Compiler Design and Optimization

CMPUT680 - Winter 2006

Topic 6: Optimal Code Generation

José Nelson Amaral
http://www.cs.ualberta.ca/~amaral/courses/680

Data Dependency Graph

(a) t1 := ld(x);
(b) t2 := t1 + 4;
(c) t3 := t1 * 8;
(d) t4 := t1 - 4;
(e) t5 := t1 / 2;
(f) t6 := t2 * t3;
(g) t7 := t4 - t5;
(h) t8 := t6 * t7;
(i) st(y,t8);

[Figure: data dependency graph for basic block B3. Node a is the root; b, c, d, and e each depend on a; f depends on b and c; g depends on d and e; h depends on f and g; i depends on h.]
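
The dependency graph for block B3 can be rebuilt mechanically from the three-address code. The following is a minimal sketch, not part of the original slides: the names `BLOCK`, `TEMP`, and `flow_dependences` are hypothetical, and the sketch assumes that every temporary is defined exactly once, so only flow (definition-to-use) dependences are tracked.

```python
import re

# Three-address code of the basic block, one (label, text) pair per statement.
BLOCK = [
    ("a", "t1 := ld(x)"),
    ("b", "t2 := t1 + 4"),
    ("c", "t3 := t1 * 8"),
    ("d", "t4 := t1 - 4"),
    ("e", "t5 := t1 / 2"),
    ("f", "t6 := t2 * t3"),
    ("g", "t7 := t4 - t5"),
    ("h", "t8 := t6 * t7"),
    ("i", "st(y,t8)"),
]

# A temporary is the letter t followed by digits (an assumption of this sketch).
TEMP = re.compile(r"\bt\d+\b")


def flow_dependences(block):
    """Map each statement label to the labels of the statements it reads from."""
    defined_by = {}  # temporary name -> label of the statement that defines it
    deps = {}
    for label, text in block:
        lhs, sep, rhs = text.partition(":=")
        # A store has no ":=", so the whole statement is scanned for uses.
        uses_text = rhs if sep else text
        deps[label] = sorted({defined_by[t] for t in TEMP.findall(uses_text)})
        if sep:
            defined_by[lhs.strip()] = label
    return deps


DEPS = flow_dependences(BLOCK)
for label, preds in DEPS.items():
    print(label, "<-", ", ".join(preds) or "(none)")
```

Each printed line lists the statements whose results a given statement reads, which corresponds to the incoming edges of that node in the graph.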

Code Generation

Problem: How to generate optimal code for a basic block specified by its DAG representation?

If the DAG is a tree, we can use the Sethi-Ullman algorithm to generate code that is optimal in terms of program length or number of registers used.

(Aho-Sethi-Ullman, p. 557)

Example

t1 := a + b
t2 := c + d
t3 := e - t2
t4 := t1 - t3

[Figure: expression tree. The root t4 (-) has children t1 (+) and t3 (-); t1 has leaves a and b; t3 has leaf e and child t2 (+); t2 has leaves c and d.]

Generate code for a machine with two registers.

Assume that only t4 is alive at the exit of the basic block.

(Aho-Sethi-Ullman, p. 558)

Example

LOAD a, R0
ADD b, R0
LOAD c, R1
ADD d, R1

Assume that SUB only works with registers.

Can't evaluate t3 because there are no available registers!

(Aho-Sethi-Ullman, p. 558)

Example

Evaluation Order: t1, t2, t3, t4

LOAD a, R0
ADD b, R0
LOAD c, R1
ADD d, R1
STORE R0, t1
LOAD e, R0
SUB R1, R0
LOAD t1, R1
SUB R0, R1
STORE R1, t4

10 instructions
1 spill

(Aho-Sethi-Ullman, p. 558)

Example (can we do better? Yes!)

Evaluation Order: t2, t3, t1, t4

LOAD c, R0
ADD d, R0
LOAD e, R1
SUB R0, R1
LOAD a, R0
ADD b, R0
SUB R1, R0
STORE R0, t4

8 instructions
no spills!

(Aho-Sethi-Ullman, p. 559)
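
Both instruction sequences can be checked with a tiny interpreter. This is a minimal sketch under assumed semantics that match the slides' usage: `LOAD x, R` copies memory into a register, `ADD x, R` adds a memory operand to a register, `SUB Ra, Rb` computes Rb := Rb - Ra on registers only, and `STORE R, x` writes a register to memory. The names `run`, `ORDER_1`, `ORDER_2`, and `MEMORY`, and the input values, are made up for illustration.

```python
def run(program, memory):
    """Execute 'OP src, dst' lines on a two-register machine; return final memory."""
    mem = dict(memory)
    regs = {"R0": 0, "R1": 0}
    for line in program:
        op, operands = line.split(None, 1)
        src, dst = [part.strip() for part in operands.split(",")]
        if op == "LOAD":
            regs[dst] = mem[src]
        elif op == "ADD":
            regs[dst] += mem[src]
        elif op == "SUB":
            # Register-only: a memory operand raises KeyError here.
            regs[dst] -= regs[src]
        elif op == "STORE":
            mem[dst] = regs[src]
        else:
            raise ValueError("unknown opcode: " + op)
    return mem


# Evaluation order t1, t2, t3, t4 (t1 is spilled to memory).
ORDER_1 = [
    "LOAD a, R0", "ADD b, R0", "LOAD c, R1", "ADD d, R1", "STORE R0, t1",
    "LOAD e, R0", "SUB R1, R0", "LOAD t1, R1", "SUB R0, R1", "STORE R1, t4",
]

# Evaluation order t2, t3, t1, t4 (no spill).
ORDER_2 = [
    "LOAD c, R0", "ADD d, R0", "LOAD e, R1", "SUB R0, R1",
    "LOAD a, R0", "ADD b, R0", "SUB R1, R0", "STORE R0, t4",
]

MEMORY = {"a": 7, "b": 5, "c": 3, "d": 2, "e": 11}  # arbitrary test inputs

result_1 = run(ORDER_1, MEMORY)["t4"]
result_2 = run(ORDER_2, MEMORY)["t4"]
print(len(ORDER_1), "instructions ->", result_1)
print(len(ORDER_2), "instructions ->", result_2)
```

With these inputs both sequences store the same value of t4, (a + b) - (e - (c + d)), while the second uses two fewer instructions.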

Example: Why the improvement?

We evaluated t4 immediately after t1 (its leftmost argument), so t1 was still in a register when t4 needed it and did not have to be spilled.

(Aho-Sethi-Ullman, p. 559)

Heuristic Node Listing Algorithm for a DAG

(1) while unlisted interior nodes remain do
(2)     select an unlisted node n, all of whose parents have been listed;
(3)     list n;
(4)     while the leftmost child m of n has no unlisted parents and is not a leaf node do
            /* since n was just listed, m is not yet listed */
(5)         list m;
(6)         n := m;
        endwhile
    endwhile

(Aho-Sethi-Ullman, p. 560)

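Node Listing Example

[Figure: DAG with nodes numbered 1 through 12. The interior nodes are 1, 2, 3, 4, 5, 6, and 8; the leaves are a, b, c, d, and e.]

List after each step of the algorithm: 1; 12; 123; 1234; 12345; 123456; 1234568

List: 1234568

The nodes are evaluated in the reverse of the listing order. Therefore the optimal evaluation order (regardless of the number of registers available) for the internal nodes is 8654321.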
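
The heuristic node listing can be sketched in a few lines. The DAG shape below is an assumption: the figure did not survive extraction, so the `CHILDREN` table is reconstructed from the corresponding example in Aho-Sethi-Ullman and should be checked against the original slide. The names `CHILDREN` and `node_listing` are hypothetical, and step (2) is resolved by always picking the first eligible node in table order, which is one of several valid choices.

```python
# Interior node -> (leftmost child, right child). Leaves are 7 (c), 9 (a),
# 10 (b), 11 (d), and 12 (e). ASSUMED shape, reconstructed from the textbook.
CHILDREN = {
    1: (2, 3),
    2: (6, 4),
    3: (4, 12),
    4: (5, 8),
    5: (6, 7),
    6: (9, 10),
    8: (11, 12),
}


def node_listing(children):
    """Return interior nodes in the order produced by the listing heuristic."""
    parents = {}
    for node, kids in children.items():
        for kid in kids:
            parents.setdefault(kid, set()).add(node)

    listed = []

    def ready(node):
        # Unlisted, and every parent has already been listed.
        return node not in listed and parents.get(node, set()) <= set(listed)

    while len(listed) < len(children):           # (1)
        n = next(x for x in children if ready(x))  # (2)
        listed.append(n)                          # (3)
        m = children[n][0]
        while m in children and ready(m):        # (4) m is interior and ready
            listed.append(m)                      # (5)
            n = m                                 # (6)
            m = children[n][0]
    return listed


LISTING = node_listing(CHILDREN)
EVALUATION_ORDER = list(reversed(LISTING))
print("list:", LISTING)
print("evaluation order:", EVALUATION_ORDER)
```

Under the assumed DAG, the listing follows the leftmost-child chain 3, 4, 5, 6 in a single pass of the inner loop, which is what lets each result be consumed right after it is computed once the list is reversed.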

Optimal Code Generation for Trees

If the DAG representing the data flow in a basic block is a tree, then for some machine models, there is a simple algorithm (the Sethi-Ullman algorithm) that gives the optimal order.

The order is optimal in the sense that it yields the shortest instruction sequence over all instruction sequences that evaluate the tree.

Sethi-Ullman Algorithm

Intuition:
1. Label each node according to the number of registers that are required to generate code for the node.
2. Generate code from the top down, always generating code first for the child that requires the most registers.

Sethi-Ullman Algorithm (Intuition)

[Figure: a tree with its left leaf and its right leaves marked.]

Bottom-Up Labeling: visit a node after all its children are labeled.

Labeling Algorithm

(1) if n is a leaf then
(2)     if n is the leftmost child of its parent then
(3)         label(n) := 1
(4)     else label(n) := 0
    else begin /* n is an interior node */
(5)     let $c_1, c_2, \ldots, c_k$ be the children of n ordered by label so that $\mathrm{label}(c_1) \ge \mathrm{label}(c_2) \ge \cdots \ge \mathrm{label}(c_k)$
(6)     $\mathrm{label}(n) := \max_{1 \le i \le k} \left(\mathrm{label}(c_i) + i - 1\right)$
    end

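Labeling Algorithm

If $k = 2$ (a node with two children), then the following relation:

$$\mathrm{label}(n) := \max_{1 \le i \le k} \left(\mathrm{label}(c_i) + i - 1\right), \qquad \mathrm{label}(c_1) \ge \mathrm{label}(c_2) \ge \cdots \ge \mathrm{label}(c_k)$$

becomes:

$$\mathrm{label}(n) = \begin{cases} \max\left[\mathrm{label}(c_1), \mathrm{label}(c_2)\right] & \text{if } \mathrm{label}(c_1) \ne \mathrm{label}(c_2) \\ \mathrm{label}(c_1) + 1 & \text{if } \mathrm{label}(c_1) = \mathrm{label}(c_2) \end{cases}$$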
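
The labeling rule can be tried out directly. The following is a minimal sketch, assuming a tree encoded as nested tuples `(name, child, child, ...)` with plain strings as leaves and with unique node names; `TREE`, `label_tree`, and `binary_label` are hypothetical names, and the tree is the t1 to t4 example used earlier in these slides.

```python
# t4 := t1 - t3, t1 := a + b, t3 := e - t2, t2 := c + d
TREE = ("t4", ("t1", "a", "b"), ("t3", "e", ("t2", "c", "d")))


def name_of(node):
    """Name of a leaf (a string) or of an interior node (first tuple element)."""
    return node if isinstance(node, str) else node[0]


def label_tree(node, is_leftmost=True, labels=None):
    """Fill and return a dict mapping node names to Sethi-Ullman labels."""
    if labels is None:
        labels = {}
    if isinstance(node, str):
        # Steps (1)-(4): a leaf gets 1 if it is a leftmost child, else 0.
        labels[node] = 1 if is_leftmost else 0
        return labels
    name, *kids = node
    for position, kid in enumerate(kids):
        label_tree(kid, position == 0, labels)
    # Step (5): order the children's labels from largest to smallest.
    ranked = sorted((labels[name_of(kid)] for kid in kids), reverse=True)
    # Step (6): with 0-based i, label(c_i) + i - 1 becomes ranked[i] + i.
    labels[name] = max(lab + i for i, lab in enumerate(ranked))
    return labels


def binary_label(l1, l2):
    """Closed form for a node with two children (k = 2)."""
    return max(l1, l2) if l1 != l2 else l1 + 1


LABELS = label_tree(TREE)
print(LABELS)
```

For two children the general maximum and the closed form agree, which can be confirmed by comparing `binary_label` against the labels computed for t1 through t4.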

Example

[Figure: expression tree. The root t4 has children t1 and t3; t1 has leaves a and b; t3 has leaf e and child t2; t2 has leaves c and d.]

Example

Labeling leaves: leftmost is 1, others are 0

label(a) = 1, label(b) = 0, label(c) = 1, label(d) = 0, label(e) = 1

Example

Labeling t2:

(5) let $c_1, c_2, \ldots, c_k$ be the children of n ordered by label so that $\mathrm{label}(c_1) \ge \mathrm{label}(c_2) \ge \cdots \ge \mathrm{label}(c_k)$
(6) $\mathrm{label}(n) := \max_{1 \le i \le k} \left(\mathrm{label}(c_i) + i - 1\right)$

label(c) > label(d)
label(c) + 1 - 1 = 1
label(d) + 2 - 1 = 1

Thus label(t2) = 1

Example

Labeling t3:

(5) let $c_1, c_2, \ldots, c_k$ be the children of n ordered by label so that $\mathrm{label}(c_1) \ge \mathrm{label}(c_2) \ge \cdots \ge \mathrm{label}(c_k)$
(6) $\mathrm{label}(n) := \max_{1 \le i \le k} \left(\mathrm{label}(c_i) + i - 1\right)$

label(e) = label(t2)
label(e) + 1 - 1 = 1
label(t2) + 2 - 1 = 2

Thus label(t3) = 2

Example

Likewise labeling t1: its leaves have label(a) = 1 and label(b) = 0, so label(t1) = 1.

Labels so far: a = 1, b = 0, c = 1, d = 0, e = 1, t1 = 1, t2 = 1, t3 = 2

Example

Labeling t4 (its children are t1 and t3):

label(t3) > label(t1)
label(t3) + 1 - 1 = 2
label(t1) + 2 - 1 = 2

Thus label(t4) = 2

Example

Final labels: a = 1, b = 0, c = 1, d = 0, e = 1, t1 = 1, t2 = 1, t3 = 2, t4 = 2

See Aho, Sethi, Ullman (p. 564) for the code-generating algorithm.
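
The labels already suggest an evaluation order. The sketch below is not the code-generating algorithm from the book; it only illustrates the "child that requires the most registers first" intuition. The tuple encoding of the tree and the names `TREE`, `LABELS`, and `evaluation_order` are assumptions made for this illustration, and ties between children are broken left to right.

```python
# t4 := t1 - t3, t1 := a + b, t3 := e - t2, t2 := c + d
TREE = ("t4", ("t1", "a", "b"), ("t3", "e", ("t2", "c", "d")))

# Labels taken from the worked example.
LABELS = {"a": 1, "b": 0, "c": 1, "d": 0, "e": 1,
          "t1": 1, "t2": 1, "t3": 2, "t4": 2}


def name_of(node):
    """Name of a leaf (a string) or of an interior node (first tuple element)."""
    return node if isinstance(node, str) else node[0]


def evaluation_order(node, labels):
    """List interior nodes so that the child with the larger label comes first."""
    if isinstance(node, str):
        return []  # leaves are operands, not computations
    name, *kids = node
    # sorted() is stable, so children with equal labels stay in left-to-right order.
    ranked = sorted(kids, key=lambda kid: -labels[name_of(kid)])
    order = []
    for kid in ranked:
        order.extend(evaluation_order(kid, labels))
    return order + [name]


ORDER = evaluation_order(TREE, LABELS)
print(ORDER)
```

For this tree the sketch yields the order t2, t3, t1, t4, the same order that produced the 8-instruction sequence with no spills.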