
Compilers and Language Processing Tools
Summer Term 2011

Prof. Dr. Arnd Poetzsch-Heffter

Software Technology Group, TU Kaiserslautern


Content of Lecture

1. Introduction
2. Syntax and Type Analysis
   2.1 Lexical Analysis
   2.2 Context-Free Syntax Analysis
   2.3 Context-Dependent Analysis
3. Translation to Target Language
   3.1 Translation of Imperative Language Constructs
   3.2 Translation of Object-Oriented Language Constructs
4. Selected Topics in Compiler Construction
   4.1 Intermediate Languages
   4.2 Optimization
   4.3 Register Allocation
   4.4 Just-in-time Compilation
   4.5 Further Aspects of Compilation
5. Garbage Collection
6. XML Processing (DOM, SAX, XSLT)


4. Selected Topics in Compiler Construction


Chapter Outline

4. Selected Topics in Compiler Construction
4.1 Intermediate Languages
    4.1.1 3-Address Code
    4.1.2 Other Intermediate Languages
4.2 Optimization
    4.2.1 Classical Optimization Techniques
    4.2.2 Potential of Optimizations
    4.2.3 Data Flow Analysis
    4.2.4 Non-local Optimization
4.3 Register Allocation
    4.3.1 Sethi-Ullman Algorithm
    4.3.2 Register Allocation by Graph Coloring
4.4 Just-in-time Compilation
4.5 Further Aspects of Compilation


Selected topics in compiler construction

Focus:
• Techniques that go beyond the direct translation of source languages to target languages
• Concentrate on concepts instead of language-dependent details
• Use program representations tailored for the considered tasks (instead of source language syntax): simplifies representation, but needs more work to integrate tasks


Selected topics in compiler construction (2)

Learning objectives:
• Intermediate languages for translation and optimization of imperative languages
• Different optimization techniques
• Different static analysis techniques for (intermediate) programs
• Register allocation
• Some aspects of code generation


4.1 Intermediate languages


Intermediate languages

Intermediate languages are used as
• appropriate program representation for certain language implementation tasks
• common representation of programs of different source languages

  Source Language 1   Source Language 2   ...   Source Language n
             \                |                  /
                     Intermediate Language
             /                |                  \
  Target Language 1   Target Language 2   ...   Target Language m


Intermediate languages (2)

• Intermediate languages for translation are comparable to data structures in algorithm design, i.e., for each task, an intermediate language is more or less suitable.
• Intermediate languages can conceptually be seen as abstract machines.


4.1.1 3-Address Code


3-address code

3-address code (3AC) is a common intermediate language with many variants.

Properties:

• only elementary data types (but often arrays)
• no nested expressions
• sequential execution, jumps and procedure calls as statements
• named variables as in a high-level language
• unbounded number of temporary variables


3-address code (2)

A program in 3AC consists of
• a list of global variables
• a list of procedures with parameters and local variables
• a main procedure

Each procedure has a sequence of 3AC commands as its body.


3AC commands

  Syntax               Explanation
  x := y bop z         x: variable (global, local, parameter, temporary)
  x := uop z           y, z: variable or constant
  x := y               bop: binary operator, uop: unary operator

  goto L               jump or conditional jump to label L
  if x cop y goto L    cop: comparison operator
                       only procedure-local jumps

  x := a[i]            a: one-dimensional array
  a[i] := y

  x := &a              a: global or local variable or parameter
  x := *y              &a: address of a
  *x := y              *: dereferencing operator


3AC commands (2)

  Syntax               Explanation
  param x              a call p(x1, ..., xn) is encoded as
  call p                   param x1
  return y                 ...
                           param xn
                           call p
                       (the block is considered as one command)
                       return y causes a jump to the return address
                       with (optional) result y

We assume that 3AC only contains labels for which jumps are used in the program.


Basic blocks

A sequence of 3AC commands can be uniquely partitioned into basic blocks.

A basic block B is a maximal sequence of commands such that
• at the end of B, exactly one jump, procedure call, or return command occurs
• labels only occur at the first command of a basic block


Basic blocks (2)

Remarks:
• The commands of a basic block are always executed sequentially; there are no jumps to the inside.
• Often, a designated exit block for a procedure, containing the return jump at its end, is required. This is handled by additional transformations.
• The transitions between basic blocks are often denoted by flow charts.


Example: 3AC and basic blocks

Consider the following C program:

  int a[2];
  int b[7];

  int skprod(int i1, int i2, int lng) { ... }

  int main() {
    a[0] = 1; a[1] = 2;
    b[0] = 4; b[1] = 5; b[2] = 6;
    skprod(0,1,2);
    return 0;
  }


Example: 3AC and basic blocks (2)

3AC with basic block partitioning for the procedure main:

  main:               (basic block 1)
    a[0] := 1
    a[1] := 2
    b[0] := 4
    b[1] := 5
    b[2] := 6
    param 0
    param 1
    param 2
    call skprod

                      (basic block 2)
    return 0


Example: 3AC and basic blocks (3)

Procedure skprod:

  int skprod(int i1, int i2, int lng) {
    int ix, res = 0;
    for( ix=0; ix <= lng-1; ix++ ) {
      res += a[i1+ix] * b[i2+ix];
    }
    return res;
  }


Example: 3AC and basic blocks (4)

Procedure skprod as 3AC with basic blocks:

  skprod:             (entry block)
    res := 0
    ix  := 0

                      (loop test)
    t0 := lng-1
    if ix <= t0       (true: loop body, false: return)

                      (loop body; jumps back to the loop test)
    t1 := i1+ix
    t2 := a[t1]
    t1 := i2+ix
    t3 := b[t1]
    t1 := t2*t3
    res := res+t1
    ix := ix+1

                      (exit block)
    return res


Intermediate Language Variations

3AC after elimination of array operations (for the above example):

  skprod:
    res := 0
    ix  := 0

    t0 := lng-1
    if ix <= t0       (true: loop body, false: return)

    t1 := i1+ix
    tx := t1*4
    ta := a+tx
    t2 := *ta
    t1 := i2+ix
    tx := t1*4
    tb := b+tx
    t3 := *tb
    t1 := t2*t3
    res := res+t1
    ix := ix+1

    return res


Characteristics of 3-Address Code

• Control flow is explicit.
• Only elementary operations
• Rearrangement and exchange of commands can be handled relatively easily.


4.1.2 Other Intermediate Languages


Further Intermediate Languages

We consider
• 3AC in Static Single Assignment (SSA) representation
• stack machine code


Static Single Assignment Form

If a variable a is read at a program position, this is a use of a.

If a variable a is written at a program position, this is a definition of a.

For optimizations, the relationship between use and definition of variables is important.

In SSA representation, each variable has exactly one definition. Thus, the relationship between use and definition is explicit in the intermediate language; additional def-use or use-def chaining becomes unnecessary.


Static Single Assignment Form (2)

SSA is essentially a refinement of 3AC.

The different definitions of one variable are distinguished by indexing the variable.

For sequential command lists, this means that
• at each definition position, the variable gets a new index
• at a use position, the variable carries the index of its last definition


Example: SSA

  Original code        SSA representation

  a := x + y           a1 := x0 + y0
  b := a - 1           b1 := a1 - 1
  a := y + b           a2 := y0 + b1
  b := x * 4           b2 := x0 * 4
  a := a + b           a3 := a2 + b2


SSA - Join Points of Control Flow

At join points of control flow, an additional mechanism is required:

  branch 1: a1 := x0 + y0      branch 2: a3 := a2 - b2

  after the join:  b := a?     (which index does a carry here?)


SSA - Join Points of Control Flow (2)

Introduce a fictitious "oracle" function Φ that selects the value of the variable from the branch that was actually taken:

  branch 1: a1 := x0 + y0      branch 2: a3 := a2 - b2

  after the join:  a4 := Φ(a1, a3)
                   b  := a4


SSA - Remarks

• The construction of an SSA representation with a minimal number of applications of the Φ oracle is a non-trivial task (cf. Appel, Sect. 19.1 and 19.2).
• The term static single assignment form reflects that for each variable in the program text, there is only one assignment. Dynamically, a variable in SSA representation can be assigned arbitrarily often (e.g., in loops).


Further intermediate languages

While 3AC and SSA representation are mostly used as intermediate languages inside compilers, intermediate languages and abstract machines are more and more often used as the connection between compilers and runtime environments.

Java bytecode and CIL (Common Intermediate Language, cf. .NET) are examples of stack machine code, i.e., intermediate results are stored on a runtime stack.

Further intermediate languages are, for instance, used for optimizations.


Stack machine code as intermediate language

Homogeneous scenario for Java (a single source language):

  C1.java, C2.java  --(jikes)-->  C1.class, C2.class
  C2.java, C3.java  --(javac)-->  C2.class, C3.class

  The .class files contain Java bytecode and are executed by the JVM.


Stack machine code as intermediate language (2)

Possibly inhomogeneous scenario for .NET (programs in different high-level languages):

  prog1.cs  --(C# compiler)------->  prog1.il
  prog2.cs  --(C# compiler)------->  prog2.il
  prog3.hs  --(Haskell compiler)-->  prog3.il

  The Intermediate Language files are executed by the CLR.


Example: Stack machine code

Java source:

  package beisp;

  class Weltklasse extends Superklasse implements BesteBohnen {
    Qualifikation studieren( Arbeit schweiss ) {
      return new Qualifikation();
    }
  }


Example: Stack machine code (2)

The compiled class (disassembled bytecode):

  Compiled from Weltklasse.java
  class beisp.Weltklasse extends beisp.Superklasse implements beisp.BesteBohnen {
      beisp.Weltklasse();
      beisp.Qualifikation studieren( beisp.Arbeit );
  }

  Method beisp.Weltklasse()
     0 aload_0
     1 invokespecial #6 <Method beisp.Superklasse()>
     4 return

  Method beisp.Qualifikation studieren( beisp.Arbeit )
     0 new #2 <Class beisp.Qualifikation>
     3 dup
     4 invokespecial #5 <Method beisp.Qualifikation()>
     7 areturn

Remark: Further intermediate languages are used in particular in connection with optimizations.


4.2 Optimization


Optimization

Optimization refers to improving the code with the following goals:

• Runtime behavior

• Memory consumption

• Size of code

• Energy consumption


Optimization (2)

We distinguish the following kinds of optimizations:
• machine-independent optimizations
• machine-dependent optimizations (exploit properties of a particular real machine)

and
• local optimizations
• intra-procedural optimizations
• inter-procedural/global optimizations


Remark on Optimization

Appel (Chap. 17, p. 350):

"In fact, there can never be a complete list [of optimizations]."

"Computability theory shows that it will always be possible to invent new optimizing transformations."


4.2.1 Classical Optimization Techniques


Constant Propagation

If the value of a variable is constant, the variable can be replaced with the constant.


Constant Folding

Evaluate all expressions with constants as operands at compile time.

Constant folding and constant propagation are iterated, since each can enable further applications of the other; a small sketch follows below.
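Since the slide's worked example is not reproduced here, the following minimal sketch (my own illustration; all names made up) shows one pass of propagation and folding over instructions of the form x := y op z:

  import java.util.*;

  /* Minimal sketch: iterated constant propagation and folding over
     instructions "def := l op r", where operands are variable names
     or integer literals. */
  public class ConstFold {
      record Instr(String def, String l, char op, String r) {}

      static Integer valueOf(Map<String,Integer> consts, String operand) {
          if (operand.matches("-?\\d+")) return Integer.parseInt(operand);
          return consts.get(operand);   // null if not known to be constant
      }

      public static void main(String[] args) {
          List<Instr> code = List.of(
              new Instr("a", "2", '*', "3"),    // a := 2*3
              new Instr("b", "a", '+', "1"),    // b := a+1
              new Instr("c", "b", '*', "x"));   // c := b*x, x unknown
          Map<String,Integer> consts = new HashMap<>();
          for (Instr i : code) {
              Integer l = valueOf(consts, i.l()), r = valueOf(consts, i.r());
              if (l != null && r != null) {     // fold: both operands constant
                  int v = switch (i.op()) {
                      case '+' -> l + r;
                      case '-' -> l - r;
                      default  -> l * r;
                  };
                  consts.put(i.def(), v);
                  System.out.println(i.def() + " := " + v);
              } else {                          // propagate constant operands
                  String ls = l != null ? l.toString() : i.l();
                  String rs = r != null ? r.toString() : i.r();
                  System.out.println(i.def() + " := " + ls + " " + i.op() + " " + rs);
              }
          }
          // prints: a := 6, b := 7, c := 7 * x
      }
  }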


Non-local Constant Optimization

For each program position, the possible values of each variable are required. If the set of possible values is infinite, it has to be abstracted appropriately.


Copy Propagation

Eliminate copies of variables: if several variables x, y, z at a program position are known to have the same value, all uses of y and z are replaced by x.
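As a small made-up illustration, the two methods below show a straight-line computation before and after copy propagation:

  /* Made-up illustration: y is a copy of x; after copy propagation,
     the use of y is replaced by x and the copy becomes dead. */
  class CopyPropExample {
      static int before(int x) { int y = x; return y + 1; }
      static int after (int x) { return x + 1; }
  }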


Copy Propagation (2)

This can also be done at join points of control flow or for loops.

For each program point, the information which variables have the same value is required.


Common Subexpression Elimination

If an expression or a statement contains the same subexpression several times, the goal is to evaluate this subexpression only once.
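A minimal made-up illustration: the subexpression a + b is computed twice before the transformation and once afterwards:

  /* Made-up illustration of common subexpression elimination. */
  class CseExample {
      static int before(int a, int b) { return (a + b) * (a + b); }
      static int after (int a, int b) { int t = a + b; return t * t; }
  }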


Common Subexpression Elimination (2)

Optimization of a basic block is done after transformation to SSA and construction of a DAG.


Common Subexpression Elimination (3)

Remarks:
• The elimination of repeated computations is often done before transformation to 3AC, but can also be reasonable following other transformations.
• The DAG representation of expressions is also used as an intermediate language by some authors.


Algebraic Optimizations

Algebraic laws can be applied in order to enable other optimizations, e.g., associativity and commutativity of addition.

Caution: For finite data types, common algebraic laws are not valid in general.


Strength Reduction

Replace expensive operations by more efficient operations (partially machine-dependent).

For example, y := 2*x can be replaced by

  y := x + x

or by

  y := x << 1
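As a sketch of how a code generator might apply this rule (hypothetical helper, not from the lecture), a multiplication by a power-of-two constant can be emitted as a shift:

  /* Sketch: emit 3AC-like text for y := c*x, replacing a multiplication
     by a power-of-two constant with a shift. */
  public class StrengthReduce {
      static String emitMul(String y, int c, String x) {
          if (c > 0 && (c & (c - 1)) == 0)   // c is a power of two
              return y + " := " + x + " << " + Integer.numberOfTrailingZeros(c);
          return y + " := " + c + " * " + x; // general case
      }
      public static void main(String[] args) {
          System.out.println(emitMul("y", 2, "x"));  // y := x << 1
          System.out.println(emitMul("y", 3, "x"));  // y := 3 * x
      }
  }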


Inline Expansion of Procedure Calls

Replace a call to a non-recursive procedure by its body with appropriate substitution of parameters.

Note: This reduces execution time, but increases code size.


Inline Expansion of Procedure Calls (2)

Remarks:
• Expansion is in general more than text replacement: parameters must be substituted correctly, and name clashes between caller and callee variables must be avoided.


Inline Expansion of Procedure Calls (3)

• In OO programs with relatively short methods, expansion is an important optimization technique. But precise information about the target object is required.
• A refinement of inline expansion is the specialization of procedures/functions if some of the actual parameters are known. This technique can also be applied to recursive procedures/functions.


Dead Code Elimination

Remove code that is not reached during execution or that has no influence on execution.

In one of the above examples, constant folding and propagation produced code in which the definitions of t3 and t4 can be removed, provided t3 and t4 are no longer used after the basic block (not live).


Dead Code Elimination (2)

A typical example of non-reachable and thus dead code that can be eliminated:
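The slide's figure is not reproduced; a typical made-up instance is a branch whose condition is constant-false:

  /* Made-up instance: the condition is known to be false at compile
     time, so the guarded statement is unreachable and the whole if
     statement can be removed. */
  class DeadCodeExample {
      static final boolean DEBUG = false;
      static int f(int x) {
          if (DEBUG) { System.out.println("f(" + x + ")"); }  // never executed
          return x * x;
      }
  }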


Dead Code Elimination (3)

Remarks:
• Dead code is often caused by optimizations.
• Another source of dead code are program modifications.
• In the first case, liveness information is the prerequisite for dead code elimination.


Code motion

Move commands over branching points in the control flow graph such that they end up in basic blocks that are executed less often.

We consider two cases:
• moving commands into succeeding or preceding branches
• moving code out of loops

Optimization of loops is very profitable, because code inside loops is executed more often than code not contained in a loop.


Move code over branching points

If a sequential computation branches, the branches are executed less often than the sequence.


Move code over branching points (2)

Prerequisite for this optimization is that a defined variable is only used in one branch.

Moving a command over a preceding join point can be advisable if the command can be eliminated by optimization from one of the branches.


Partial redundancy elimination

Definition (Partial Redundancy)
An assignment is redundant at a program position s if it has already been executed on all paths to s.

An expression e is redundant at s if the value of e has already been calculated on all paths to s.

An assignment/expression is partially redundant at s if it is redundant with respect to some execution paths leading to s.


Partial redundancy elimination (2)

Example:
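The slide's example figure is not reproduced; the following made-up sketch shows a partially redundant expression and its transformed version:

  /* Made-up illustration: a+b is computed on the then-path only, so it
     is partially redundant at the return; hoisting it makes the second
     computation fully redundant and eliminable. */
  class PreExample {
      static int before(int a, int b, boolean c) {
          int x = 0;
          if (c) { x = a + b; }      // a+b computed on this path only
          return x + (a + b);        // partially redundant here
      }
      static int after(int a, int b, boolean c) {
          int t = a + b;             // now computed exactly once on every path
          int x = c ? t : 0;
          return x + t;
      }
  }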


Partial redundancy elimination (3)

Elimination of partial redundancy: the computation is inserted on the paths where it is missing, so that it becomes fully redundant at s and can be eliminated there (cf. the sketch above).


Partial redundancy elimination (4)

Remarks:
• PRE can be seen as a combination and extension of common subexpression elimination and code motion.
• Extension: elimination of partial redundancy according to the estimated probability of execution of specific paths.


Code motion from loops

Idea: Computations in loops whose operands are not changed inside the loop should be done outside the loop.

(In the original slide's example, this is allowed provided t1 is not live at the end of the top-most block on the left side.)


Optimization of loop variables

Variables and expressions that are not changed during the execution of a loop are called loop invariant.

Loops often have variables that are increased/decreased systematically in each loop iteration, e.g., in for-loops.

Often, a loop variable depends on another loop variable, e.g., a relative address depends on the loop counter variable.


Optimization of loop variables (2)

Definition (Loop Variables)
A variable i is called explicit loop variable of a loop S if there is exactly one definition of i in S of the form i := i + c where c is loop invariant.

A variable k is called derived loop variable of a loop S if there is exactly one definition of k in S of the form k := j * c or k := j + d where j is a loop variable and c and d are loop invariant.
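For illustration (made-up example, names hypothetical): in the following loop, ix is an explicit loop variable, and the address computation yields derived loop variables in 3AC:

  /* ix: explicit loop variable (single definition ix := ix+1 with
     loop-invariant increment 1). In 3AC, adr = base + 4*ix splits into
     tx := ix*4 (derived from ix) and adr := base+tx (derived from tx). */
  class LoopVariableExample {
      static int lastAddress(int base, int n) {
          int adr = base;
          for (int ix = 0; ix < n; ix = ix + 1) {
              adr = base + 4 * ix;
          }
          return adr;
      }
  }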


Induction variable analysis

Compute derived loop variables inductively, i.e., instead of computing them from the value of the loop variable, compute them from their value in the previous loop iteration.

Note: For the optimization of derived loop variables, the dependencies between variable definitions have to be precisely understood.


Loop unrolling

If the number of loop executions is known statically, or properties about the number of loop executions (e.g., always an even number) can be inferred, the loop body can be copied several times to save comparisons and jumps.

(In the original slide's example, this is allowed provided ix is dead at the end of the fragment; note the static computation of ix's values in the unrolled loop.)
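The slide's figure is not reproduced; a minimal made-up instance with a statically known trip count:

  /* Made-up illustration: the trip count (4) is statically known, so the
     loop can be fully unrolled; ix's values are computed at compile time. */
  class UnrollExample {
      static void before(int[] a) { for (int ix = 0; ix < 4; ix++) a[ix] = 0; }
      static void after (int[] a) { a[0] = 0; a[1] = 0; a[2] = 0; a[3] = 0; }
  }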


Loop unrolling (2)

Remarks:
• Partial loop unrolling aims at obtaining larger basic blocks in loops to have more optimization options.
• Loop unrolling is in particular important for parallel processor architectures and pipelined processing (machine-dependent).


Optimization for other language classes

The discussed optimizations aim at imperative languages. For optimizing programs of other language classes, special techniques have been developed, for example:

• object-oriented languages: optimization of dynamic binding (type analysis)
• non-strict functional languages: optimization of lazy function calls (strictness analysis)
• logic programming languages: optimization of unification


4.2.2 Potential of Optimizations


Potential of optimizations - Example

We demonstrate some of the above techniques and the improvement potential of optimizations on the procedure skprod (3AC after elimination of array operations); we also sketch the evaluation of each step.

  skprod:
    res := 0
    ix  := 0

    t0 := lng-1
    if ix <= t0       (true: loop body, false: return)

    t1 := i1+ix
    tx := t1*4
    ta := a+tx
    t2 := *ta
    t1 := i2+ix
    tx := t1*4
    tb := b+tx
    t3 := *tb
    t1 := t2*t3
    res := res+t1
    ix := ix+1

    return res

Evaluation: number of execution steps depending on lng:
  2 + 2 + 13*lng + 1 = 13*lng + 5
  (lng = 100: 1305; lng = 1000: 13005)


Potential of optimizations - Example (2)

Move the computation of the loop invariant t0 out of the loop:

  skprod:
    res := 0
    ix  := 0
    t0  := lng-1

    if ix <= t0       (true: loop body, false: return)

    t1 := i1+ix
    tx := t1*4
    ta := a+tx
    t2 := *ta
    t1 := i2+ix
    tx := t1*4
    tb := b+tx
    t3 := *tb
    t1 := t2*t3
    res := res+t1
    ix := ix+1

    return res

Evaluation: 3 + 1 + 12*lng + 1 = 12*lng + 5
  (lng = 100: 1205; lng = 1000: 12005)

Potential of optimizations - Example (3)

Optimization of loop variables (1): At first there are no derived loop variables, because t1 and tx have several definitions. Introducing SSA names for t1 and tx makes t11, tx1, ta, t12, tx2, tb derived loop variables:

  skprod:
    res := 0
    ix  := 0
    t0  := lng-1

    if ix <= t0       (true: loop body, false: return)

    t11 := i1+ix
    tx1 := t11*4
    ta  := a+tx1
    t2  := *ta
    t12 := i2+ix
    tx2 := t12*4
    tb  := b+tx2
    t3  := *tb
    t13 := t2*t3
    res := res+t13
    ix  := ix+1

    return res

Potential of optimizations - Example (4)

Optimization of loop variables (2): Initialization and inductive definition of the loop variables:

  skprod:
    res := 0
    ix  := 0
    t0  := lng-1
    t11 := i1-1
    tx1 := t11*4
    ta  := a+tx1
    t12 := i2-1
    tx2 := t12*4
    tb  := b+tx2

    if ix <= t0       (true: loop body, false: return)

    t11 := t11+1
    tx1 := tx1+4
    ta  := ta+4
    t2  := *ta
    t12 := t12+1
    tx2 := tx2+4
    tb  := tb+4
    t3  := *tb
    t13 := t2*t3
    res := res+t13
    ix  := ix+1

    return res

Potential of optimizations - Example (5)

Dead code elimination: The assignments to t11, tx1, t12, tx2 inside the loop are dead code, since they do not influence the result:

  skprod:
    res := 0
    ix  := 0
    t0  := lng-1
    t11 := i1-1
    tx1 := t11*4
    ta  := a+tx1
    t12 := i2-1
    tx2 := t12*4
    tb  := b+tx2

    if ix <= t0       (true: loop body, false: return)

    ta  := ta+4
    t2  := *ta
    tb  := tb+4
    t3  := *tb
    t13 := t2*t3
    res := res+t13
    ix  := ix+1

    return res

Evaluation: 9 + 1 + 8*lng + 1 = 8*lng + 11
  (lng = 100: 811; lng = 1000: 8011)

Potential of optimizations - Example (6)

Algebraic optimization: Exploit the invariant ta = 4*(i1-1+ix) + a to replace the comparison ix <= t0 by ta <= 4*(i1-1+t0) + a:

  skprod:
    res := 0
    ix  := 0
    t0  := lng-1
    t11 := i1-1
    tx1 := t11*4
    ta  := a+tx1
    t12 := i2-1
    tx2 := t12*4
    tb  := b+tx2
    t4  := t11+t0
    t5  := 4*t4
    t6  := t5+a

    if ta <= t6       (true: loop body, false: return)

    ta  := ta+4
    t2  := *ta
    tb  := tb+4
    t3  := *tb
    t13 := t2*t3
    res := res+t13
    ix  := ix+1

    return res

Potential of optimizations - Example (7)

Dead code elimination: By the transformation of the loop condition, the assignment to ix has become dead code and can be eliminated:

  skprod:
    res := 0
    t0  := lng-1
    t11 := i1-1
    tx1 := t11*4
    ta  := a+tx1
    t12 := i2-1
    tx2 := t12*4
    tb  := b+tx2
    t4  := t11+t0
    t5  := 4*t4
    t6  := t5+a

    if ta <= t6       (true: loop body, false: return)

    ta  := ta+4
    t2  := *ta
    tb  := tb+4
    t3  := *tb
    t13 := t2*t3
    res := res+t13

    return res

Evaluation: 11 + 1 + 7*lng + 1 = 7*lng + 13
  (lng = 100: 713; lng = 1000: 7013)


Potential of optimizations - Example (8)

Remarks:
• The number of execution steps is reduced by almost half, where the most significant reductions are achieved by loop optimization.
• The combination of optimization techniques is important. Determining the ordering of optimizations is in general difficult.
• We have only considered optimizations by example. The difficulty is to find algorithms and heuristics for detecting optimization potential automatically and for executing the optimizing transformations.


4.2.3 Data flow analysis


Data flow analysis

For optimizations, data flow information is required that can be obtained by data flow analysis.

Goal: Explain the basic concepts of data flow analysis by example.

Outline:
• liveness analysis (a typical example of data flow analysis)
• data flow equations
• important classes of analyses

Each analysis has an exact specification of which information it provides.


Liveness analysis

Definition (Liveness Analysis)
Let P be a program. A variable v is live at a program position S in P if there is an execution path π from S to a use of v such that there is no definition of v on π.

The liveness analysis determines for all positions S in P which variables are live at S.


Liveness analysis (2)

Remarks:
• The definition of liveness of variables is static/syntactic. We have defined dead code dynamically/semantically.
• The result of the liveness analysis for a program P can be represented as a function live mapping positions in P to bit vectors, where a bit vector contains an entry for each variable in P. Let i be the index of variable v in P; then:

  live(S)[i] = 1  iff  v is live at position S


Liveness analysis (3)

Idea:
• In a procedure-local analysis, exactly the global variables are live at the end of the exit block of the procedure.
• If the live variables out(B) at the end of a basic block B are known, the live variables in(B) at the beginning of B are computed by:

  in(B) = gen(B) ∪ (out(B) \ kill(B))

  where
  - gen(B) is the set of variables v such that v is used in B without a prior definition of v
  - kill(B) is the set of variables that are defined in B


Liveness analysis (4)

As the set in(B) is computed from out(B), we have a backward analysis.

For B not the exit block of the procedure, out(B) is obtained by

  out(B) = ⋃ in(Bi)   for all successors Bi of B

Thus, for a program without loops, in(B) and out(B) are directly defined for all basic blocks B. Otherwise, we obtain a system of recursive equations.


Liveness analysis - Example

Question for the example flow graph (figure not reproduced here): how do we compute out(B2)?


Data flow equations

Theory:
• There is always a solution for equations of the considered form.
• There is always a smallest solution; it is obtained by an iteration starting from empty in and out sets.

Note: The equations may have several solutions.


Ambiguity of solutions - Example

Consider a block B0 (a := a) with successors B0 and B1 (b := 7):

  out(B0) = in(B0) ∪ in(B1)
  out(B1) = { }
  in(B0)  = gen(B0) ∪ (out(B0) \ kill(B0)) = {a} ∪ out(B0)
  in(B1)  = gen(B1) ∪ (out(B1) \ kill(B1)) = { }

Thus, out(B0) = in(B0), and hence in(B0) = {a} ∪ in(B0).

Possible solutions: in(B0) = {a} or in(B0) = {a, b}


Computation of smallest fixpoint

1. Compute gen(B) and kill(B) for all B.

2. Set out(B) = ∅ for all B except for the exit block. For the exit block, out(B) comes from the program context.

3. While out(B) or in(B) changes for any B:
   compute in(B) from the current out(B) for all B;
   compute out(B) from the in sets of B's successors.
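The following compact sketch (my own illustration, not lecture code) implements exactly this iteration with bit vectors; gen, kill, and the successor relation are assumed precomputed per basic block:

  import java.util.*;

  /* Sketch of the fixpoint iteration for liveness with bit vectors.
     The exit block's out set is left as initialized by the caller. */
  public class Liveness {
      static void solve(BitSet[] gen, BitSet[] kill, int[][] succ,
                        BitSet[] in, BitSet[] out) {
          boolean changed = true;
          while (changed) {                     // iterate until fixpoint
              changed = false;
              for (int b = 0; b < gen.length; b++) {
                  BitSet newOut = (BitSet) out[b].clone();
                  for (int s : succ[b]) newOut.or(in[s]);   // out(B) = U in(Bi)
                  BitSet newIn = (BitSet) newOut.clone();
                  newIn.andNot(kill[b]);                    // out(B) \ kill(B)
                  newIn.or(gen[b]);                         // ... U gen(B)
                  if (!newIn.equals(in[b]) || !newOut.equals(out[b])) changed = true;
                  in[b] = newIn;
                  out[b] = newOut;
              }
          }
      }

      public static void main(String[] args) {
          // The ambiguity example: B0 (a := a) with successors {B0, B1},
          // B1 (b := 7); variable index 0 is a, index 1 is b.
          BitSet[] gen  = { bits(0), bits() };
          BitSet[] kill = { bits(0), bits(1) };
          int[][]  succ = { {0, 1}, {} };
          BitSet[] in   = { new BitSet(), new BitSet() };
          BitSet[] out  = { new BitSet(), new BitSet() };
          solve(gen, kill, succ, in, out);
          System.out.println("in(B0) = " + in[0]);  // {0}: the smallest solution {a}
      }

      static BitSet bits(int... indices) {
          BitSet b = new BitSet();
          for (int i : indices) b.set(i);
          return b;
      }
  }

Starting from empty sets, the iteration yields in(B0) = {a}, i.e., the smallest of the solutions discussed on the previous slide.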


Further analyses and classes of analyses

Many data flow analyses can be described as bit vector problems:
• reaching definitions: which definitions reach a position S?
• available expressions: for elimination of repeated computations
• very busy expressions: which expressions are needed on all subsequent computation paths?

The corresponding analyses can be treated analogously to liveness analysis, but differ in
• the definition of the data flow information
• the definition of gen and kill
• the direction of the analysis and the equations


Further analyses and classes of analyses (2)

For backward analyses, the data flow information at the entry of a basic block B is obtained from the information at the exit of B:

  in(B) = gen(B) ∪ (out(B) \ kill(B))

Analyses can be distinguished by whether they take the union or the intersection of the successor information:

  out(B) = ⋃_{Bi ∈ succ(B)} in(Bi)   or   out(B) = ⋂_{Bi ∈ succ(B)} in(Bi)


Further analyses and classes of analyses (3)

For forward analyses, the dependency is the other way round:

  out(B) = gen(B) ∪ (in(B) \ kill(B))

with

  in(B) = ⋃_{Bi ∈ pred(B)} out(Bi)   or   in(B) = ⋂_{Bi ∈ pred(B)} out(Bi)


Further analyses and classes of analyses (4)

Overview of classes of analyses:

              union                   intersection
  forward     reaching definitions    available expressions
  backward    live variables          very busy expressions


Further analyses and classes of analyses (5)

For bit vector problems, the data flow information consists of subsets of finite sets.

For other analyses, the collected information is more complex; e.g., for constant propagation, we consider mappings from variables to values.

For interprocedural analyses, complexity increases because the flow graph is not static.

A formal basis for the development and correctness of optimizations is provided by the theory of abstract interpretation.


4.2.4 Non-Local Program Analysis


Non-local program analysis

We use a points-to analysis to demonstrate:
• interprocedural aspects: the analysis crosses the borders of single procedures
• constraints: program analysis very often involves solving or refining constraints
• complex analysis results: the analysis result cannot be represented locally for a statement
• analysis as abstraction: the result of the analysis is an abstraction of all possible program executions


Points-to analysis

Analysis for programs with pointers and for object-oriented programs.

Goal: Compute which references to which records/objects a variable can hold.

Applications of the analysis results: basis for optimizations
• alias information (e.g., important for code motion)
  - Can p.f = x cause changes to an object referenced by q?
  - Can z = p.f read information that is written by p.f = x?
• call graph construction
• resolution of virtual method calls
• escape analysis


Alias Information

Using alias information:

  (1) p.f = x;
  (2) y = q.f;
  (3) q.f = z;

If p == q, statement (2) can be simplified to y = x.
If p != q, the first statement can be exchanged with the other two.

Elimination of Dynamic Binding

Example:

  class A {
    void m( ... ) { ... }
  }
  class B extends A {
    void m( ... ) { ... }
  }
  ...
  A p;
  p = new B();
  p.m(...)   // call of B::m

If the points-to analysis shows that p can only reference B objects, the dynamically bound call p.m(...) can be resolved statically to B::m.

Escape Analysis

Example:

  R m( A p ) {
    B q;
    q = new B();   // stack allocation possible
    q.f = p;
    q.g = p.n();
    return q.g;
  }

The object created for q does not escape the method, so it can be allocated on the stack instead of the heap.

A Points-to Analysis for Java

Simplifications and assumptions about the underlying language:
• The complete program is known.
• Only assignments and method calls of the following forms are used:
  - direct assignment: l = r
  - write to instance variables: l.f = r
  - read of instance variables: l = r.f
  - object creation: l = new C()
  - simple method call: l = r0.m(r1, ...)
• expressions without side effects
• compound statements


A Points-to Analysis for Java (2)

Analysis type:
• flow-insensitive: The control flow of the program has no influence on the analysis result. The states of the variables at different program points are combined.
• context-insensitive: Method calls at different program points are not distinguished.


A Points-to Analysis for Java (3)

Points-to graph as abstraction

The result of the analysis is a so-called points-to graph having
• abstract variables and abstract objects as nodes
• edges representing that an abstract variable may hold a reference to an abstract object

Abstract variables V represent sets of concrete variables at runtime. Abstract objects O represent sets of concrete objects at runtime. An edge between V and O means that in a certain program state, a concrete variable in V may reference an object in O.


Points-to Graph - Example

  class Y { ... }
  class X {
    Y f;
    void set( Y r ) { this.f = r; }
    static void main() {
      X p = new X();   // s1 "creates" o1
      Y q = new Y();   // s2 "creates" o2
      p.set(q);
    }
  }

Resulting points-to graph: p → o1, this → o1, q → o2, r → o2, and a field edge o1 —f→ o2.


Definition of the Points-to Graph

For all method implementations,
• create a node o for each object creation
• create nodes for
  - each local variable v
  - each formal parameter p of any method (incl. this and results (ret))
  - each static variable s

(Instance variables are modeled by labeled edges.)


Definition of the Points-to Graph (2)

Edges: smallest fixpoint of f : PtGraph × Stmt → PtGraph with

  f(G, l = new C())         = G ∪ {(l, oi)}
  f(G, l = r)               = G ∪ {(l, oi) | oi ∈ Pt(G, r)}
  f(G, l.f = r)             = G ∪ {(<oi, f>, oj) | oi ∈ Pt(G, l), oj ∈ Pt(G, r)}
  f(G, l = r.f)             = G ∪ {(l, oi) | ∃ oj ∈ Pt(G, r). oi ∈ Pt(G, <oj, f>)}
  f(G, l = r0.m(r1,...,rn)) = G ∪ ⋃_{oi ∈ Pt(G, r0)} resolve(G, m, oi, r1,...,rn, l)

where Pt(G, x) is the points-to set of x in G,

  resolve(G, m, oi, r1,...,rn, l) =
    let mj(p0, p1,...,pn, retj) = dispatch(oi, m)
    in {(p0, oi)} ∪ f(G, p1 = r1) ∪ ... ∪ f(G, l = retj) end

and dispatch(oi, m) returns the actual implementation of m for oi with formal parameters p1,...,pn and result variable retj; p0 refers to this.
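A much-simplified sketch (my own illustration) of solving such constraints iteratively, restricted to allocations and direct assignments (field edges and dynamic dispatch omitted; the program and the abstract objects o1, o2 are made up):

  import java.util.*;

  /* Sketch: flow-insensitive points-to sets for "l = new C()" (alloc)
     and "l = r" (copy), iterated to a fixpoint. */
  public class PointsTo {
      public static void main(String[] args) {
          // program: p = new X(); q = new Y(); r = p; p = q;
          String[][] allocs = { {"p", "o1"}, {"q", "o2"} };
          String[][] copies = { {"r", "p"}, {"p", "q"} };
          Map<String, Set<String>> pt = new HashMap<>();
          for (String[] a : allocs)
              pt.computeIfAbsent(a[0], k -> new HashSet<>()).add(a[1]);
          boolean changed = true;
          while (changed) {                   // fixpoint iteration
              changed = false;
              for (String[] c : copies) {     // l = r enforces Pt(l) ⊇ Pt(r)
                  Set<String> l = pt.computeIfAbsent(c[0], k -> new HashSet<>());
                  changed |= l.addAll(pt.getOrDefault(c[1], Set.of()));
              }
          }
          System.out.println(pt);  // p={o1,o2}, q={o2}, r={o1,o2} (set order may vary)
      }
  }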


Definition of the Points-to Graph (3)

Remark: The main problem for practical use of the analysis is the efficient implementation of the computation of the points-to graph.

Literature: A. Rountev, A. Milanova, B. Ryder: Points-to Analysis for Java Using Annotated Constraints. OOPSLA 2001.


4.3 Register Allocation


Register allocation

Efficient code has to make good use of the available registers on the target machine: accessing registers is much faster than accessing memory (the same holds for the cache).

Register allocation has two aspects:
• Determine which variables are implemented by registers at which positions.
• Determine which register implements which variable at which positions (register assignment).

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 110

Register Allocation

Register allocation (2)

Goals of register allocation

1. Generate code that requires as few registers as possible.

2. Avoid unnecessary memory accesses, i.e., not only temporaries, but also program variables are implemented by registers.

3. Allocate registers preferably to variables that are used often (do not use them for variables that are only rarely accessed).

4. Obey the programmer's requirements.

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 111

Register Allocation

Register allocation (3)

Outline

• Algorithm interleaving code generation and register allocation for nested expressions (cf. Goal 1)

• Algorithm for procedure-local register allocation (cf. Goals 2 and 3)

• Combination and other aspects

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 112


Register Allocation Sethi-Ullmann Algorithm

4.3.1 Sethi-Ullman Algorithm

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 113

Register Allocation Sethi-Ullmann Algorithm

Evaluation ordering with minimal registers

The algorithm by Sethi and Ullman is an example of an integrated approach to register allocation and code generation (cf. Wilhelm, Maurer, Sect. 12.4.1, p. 584 ff).

Input:

An assignment with a nested expression on the right-hand side:

    Assign ( Var, Exp )
    Exp    = BinExp | Var
    BinExp ( Exp, Op, Exp )
    Var    ( Ident )

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 114

Register Allocation Sethi-Ullmann Algorithm

Evaluation ordering with minimal registers (2)

Output:

Machine or intermediate language code with assigned registers.

We consider two-address code, i.e., code with at most one memory access per instruction. The machine has r registers represented by R0, ..., Rr−1. The instructions have the forms:

    Ri := M[V]
    M[V] := Ri
    Ri := Ri op M[V]
    Ri := Ri op Rj

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 115

Register Allocation Sethi-Ullmann Algorithm

Example: Code generation w/ register allocation

Consider f := (a + b) − (c − (d + e))

Assume that there are only two registers R0 and R1 available for the translation.

Result of direct translation:

    R0 := M[a]
    R0 := R0 + M[b]
    R1 := M[d]
    R1 := R1 + M[e]
    M[t1] := R1
    R1 := M[c]
    R1 := R1 - M[t1]
    R0 := R0 - R1
    M[f] := R0

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 116

Register Allocation Sethi-Ullmann Algorithm

Example: Code generation w/ register allocation (2)

Result of the Sethi-Ullman algorithm:

    R0 := M[c]
    R1 := M[d]
    R1 := R1 + M[e]
    R0 := R0 - R1
    R1 := M[a]
    R1 := R1 + M[b]
    R1 := R1 - R0
    M[f] := R1

More efficient, because it uses one instruction less and does not need to store intermediate results.

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 117

Register Allocation Sethi-Ullmann Algorithm

Sethi-Ullman algorithm

Goal: Minimize the number of registers and the number of temporaries.

Idea: Generate code for the subexpression requiring more registers first.

Procedure:
• Define a function regbed that computes the number of registers needed for an expression
• Generate code for an expression E = BinExp(L, OP, R) by case distinction on the register needs of L and R (see below)

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 118

Register Allocation Sethi-Ullmann Algorithm

Sethi-Ullman algorithm (2)

We use the following notations:
• v_reg(E): the set of available registers for the translation of E
• v_tmp(E): the set of addresses where values can be stored temporarily when translating E
• cell(E): register/memory cell where the result of E is stored

Now, let
• E be an expression
• L the left subexpression of E
• R the right subexpression of E
• vr abbreviate |v_reg(E)|

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 119

Register Allocation Sethi-Ullmann Algorithm

Sethi-Ullman algorithm (3)

We distinguish the following cases:

1. regbed(L) < vr

2. regbed(L) ≥ vr and regbed(R) < vr

3. regbed(L) ≥ vr and regbed(R) ≥ vr

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 120


Register Allocation Sethi-Ullmann Algorithm

Sethi-Ullman algorithm (4)

Case 1: regbed(L) < vr

• Generate code for R using v_reg(E) and v_tmp(E) with result in cell(R)
• Generate code for L using v_reg(E) \ { cell(R) } and v_tmp(E) with result in cell(L)
• Generate code for the operation cell(L) := cell(L) OP cell(R)
• Set cell(E) = cell(L)

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 121

Register Allocation Sethi-Ullmann Algorithm

Sethi-Ullman algorithm (5)

Case 2: regbed(L) ≥ vr and regbed(R) < vr

• Generate code for L using v_reg(E) and v_tmp(E) with result in cell(L)
• Generate code for R using v_reg(E) \ { cell(L) } and v_tmp(E) with result in cell(R)
• Generate code for the operation cell(L) := cell(L) OP cell(R)
• Set cell(E) = cell(L)

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 122

Register Allocation Sethi-Ullmann Algorithm

Sethi-Ullman algorithm (6)

Case 3: regbed(L) ≥ vr and regbed(R) ≥ vr

• Generate code for R using v_reg(E) and v_tmp(E) with result in cell(R)
• Generate code M[first(v_tmp(E))] := cell(R)
• Generate code for L using v_reg(E) and rest(v_tmp(E)) with result in cell(L)
• Generate code for the operation cell(L) := cell(L) OP M[first(v_tmp(E))]
• Set cell(E) = cell(L)

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 123

Register Allocation Sethi-Ullmann Algorithm

Sethi-Ullman algorithm (7)

Function regbed in MAX notation (can be realized by S-Attribution):

    ATT regbed( Exp@ E ) Nat:
      IF Assign@<_, Var@ E>    : 0
       | BinExp@< Var@ E,_,_>  : 1
       | BinExp@<_,_, Var@ E > : 0
       | BinExp@< L,_, R > E   :
           IF regbed(L) = regbed(R)
           THEN regbed(L) + 1
           ELSE max( regbed(L), regbed(R) )
       ELSE nil // this case does not occur

(In ML, the definition of regbed would be somewhat more involved, since the context of Var expressions cannot be taken into account directly.)

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 124
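
The same attribute can be cross-checked in plain Java over a small expression AST; this is a sketch with our own class names (the assignment context of a Var, which MAX handles directly, is approximated here by an operand-position flag):

    abstract class Exp { }

    class Var extends Exp {
        final String id;
        Var(String id) { this.id = id; }
    }

    class BinExp extends Exp {
        final Exp l, r; final String op;
        BinExp(Exp l, String op, Exp r) { this.l = l; this.op = op; this.r = r; }
    }

    class RegNeed {
        // Register need of e: a Var costs one register as a left operand and
        // none as a right operand (2AC can use it directly from memory).
        static int regbed(Exp e, boolean leftOperand) {
            if (e instanceof Var) return leftOperand ? 1 : 0;
            BinExp b = (BinExp) e;
            int nl = regbed(b.l, true), nr = regbed(b.r, false);
            return nl == nr ? nl + 1 : Math.max(nl, nr);
        }

        public static void main(String[] args) {
            // ((a+b)-(c+d)) * (a-(d+e))  -- register need 3, as on the next slides
            Exp l = new BinExp(new BinExp(new Var("a"), "+", new Var("b")), "-",
                               new BinExp(new Var("c"), "+", new Var("d")));
            Exp r = new BinExp(new Var("a"), "-",
                               new BinExp(new Var("d"), "+", new Var("e")));
            System.out.println(regbed(new BinExp(l, "*", r), true)); // prints 3
        }
    }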

Register Allocation Sethi-Ullmann Algorithm

Example: Sethi-Ullman Algorithm

Consider f := ((a + b) − (c + d)) * (a − (d + e))

Attributes: v_reg | v_tmp, regbed, cell

[Figure: syntax tree of the assignment annotated with these attributes; regbed is 1 at the + nodes, 2 at the two − nodes, and 3 at the root ∗.]

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 125

Register Allocation Sethi-Ullmann Algorithm

Example: Sethi-Ullman Algorithm (2)

[Figure: the attributed syntax tree from the previous slide.]

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 126

Register Allocation Sethi-Ullmann Algorithm

Example: Sethi-Ullman Algorithm (3)

For formalizing the algorithm, we realize the set of available registers and the addresses for storing temporaries as lists, where
• the list RL of registers is non-empty
• the list AL of addresses is long enough
• the result cell is always a register, namely the first in RL, i.e., fst(RL)
• the function exchange switches the first two elements of a list, fst returns the first element of a list, and rst returns the tail of a list

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 127

Register Allocation Sethi-Ullmann Algorithm

Example: Sethi-Ullman Algorithm (4)

In the following, the function expcode for code generation is given in MAX notation (functional).

Note: The applications of the functions exchange, fst, and expcode satisfy their preconditions length(RL) > 1 and length(RL) > 0, respectively.

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 128


Register Allocation Sethi-Ullmann Algorithm

Example: Sethi-Ullman Algorithm (5)

    FCT expcode( Exp@ E, RegList RL, AdrList AL ) CodeList:  // pre: length(RL)>0
      IF Var@<ID> E:
          [ fst(RL) := M[adr(ID)] ]
       | BinExp@< L,OP,Var@<ID> > E:
          expcode(L,RL,AL)
          ++ [ fst(RL) := fst(RL) OP M[adr(ID)] ]
       | BinExp@< L,OP,R > E:
          LET vr == length( RL ) :
          IF regbed(L) < vr :
             expcode(R,exchange(RL),AL)
             ++ expcode(L,rst(exchange(RL)),AL)
             ++ [ fst(RL) := fst(RL) OP fst(rst(RL)) ]
           | regbed(L) >= vr AND regbed(R) < vr :
             expcode(L,RL,AL)
             ++ expcode(R,rst(RL),AL)
             ++ [ fst(RL) := fst(RL) OP fst(rst(RL)) ]
           | regbed(L) >= vr AND regbed(R) >= vr :
             expcode(R,RL,AL)
             ++ [ M[ fst(AL) ] := fst(RL) ]
             ++ expcode(L,RL,rst(AL))
             ++ [ fst(RL) := fst(RL) OP M[ fst(AL) ] ]
          ELSE nil
      ELSE []

Remarks:
• The algorithm generates 2AC that is optimal with respect to the number of instructions and the number of temporaries if the expression has no common subexpressions.
• The algorithm shows the mutual dependency between code generation and register allocation.
• In a procedural implementation, the register and address lists can be realized by a global stack.

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 129

Register Allocation Register Allocation by Graph Coloring

4.3.2 Register Allocation by Graph Coloring

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 130

Register Allocation Register Allocation by Graph Coloring

Register allocation by graph coloring

Register allocation by graph coloring is an algorithm (with many variants) for the allocation of registers in control flow graphs.

Register allocation for a CFG with 3AC in SSA form:
• Input: CFG using temporary variables
• Output: Structurally the same CFG with
  – registers instead of temporary variables
  – additional instructions for storing intermediate results on the stack, if applicable

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 131

Register Allocation Register Allocation by Graph Coloring

Register allocation by graph coloring (2)

Remarks:
• The SSA representation is not necessary, but simplifies the formulation of the algorithm (e.g., Wilhelm/Maurer do not use SSA in Sect. 12.5).
• It is no restriction that only temporary variables are implemented by registers. We assume that program variables are assigned to temporary variables in a preceding step.

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 132

Register Allocation Register Allocation by Graph Coloring

Life range and interference graph

Definition (Life range)
The life range of a temporary variable is the set of program positions at which it is alive.

Definition (Interference)
Two temporary variables interfere if their life ranges have a non-empty intersection.

Definition (Interference graph)
Let P be a program part/CFG in 3AC/SSA. The interference graph of P is an undirected graph G = (N, E), where
• N is the set of temporary variables
• an edge (n1, n2) is in E iff n1 and n2 interfere.
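
A minimal sketch of how the interference graph follows directly from these definitions, assuming the life ranges are already given as sets of program positions (names are ours):

    import java.util.*;

    class InterferenceSketch {
        // Build the interference graph: an undirected edge between two
        // temporaries iff their life ranges have a non-empty intersection.
        static Map<String, Set<String>> build(Map<String, Set<Integer>> lifeRange) {
            Map<String, Set<String>> g = new HashMap<>();
            for (String t : lifeRange.keySet()) g.put(t, new HashSet<>());
            for (String a : lifeRange.keySet())
                for (String b : lifeRange.keySet())
                    if (!a.equals(b)
                            && !Collections.disjoint(lifeRange.get(a), lifeRange.get(b))) {
                        g.get(a).add(b);
                        g.get(b).add(a);
                    }
            return g;
        }
    }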

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 133

Register Allocation Register Allocation by Graph Coloring

Register allocation by graph coloring

Goal: Implement the temporary variables with the available registers.

Idea: Translate the problem to graph coloring (NP-complete). Color the interference graph such that
• neighboring nodes have different colors
• no more colors are used than there are available registers

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 134

Register Allocation Register Allocation by Graph Coloring

Register allocation by graph coloring (2)

General procedure: Try to color the graph as described below. Then:
• If a coloring is found, terminate.
• If nodes could not be colored,
  – choose a non-colored node k
  – modify the 3AC program such that the value of k is stored to memory and only loaded when it is used
  – try to color the modified program

Termination: The procedure terminates, because storing values intermediately reduces the life ranges of temporaries and thus the interferences. In practice, two or three iterations are sufficient.

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 135

Register Allocation Register Allocation by Graph Coloring

Register allocation by graph coloring (3)

Coloring algorithm: Let rn be the number of available registers, i.e., at most rn colors may be used for coloring.

The coloring algorithm consists of two phases:

• (a) Simplify with marking

• (b) Coloring

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 136


Register Allocation Register Allocation by Graph Coloring

Simplify with marking

Iteratively remove nodes with fewer than rn neighbors from the graph and push them onto a stack.

Case 1: The simplification steps lead to an empty graph. Continue with the coloring phase.

Case 2: The graph contains only nodes with rn or more neighbors. Choose a suitable node as a candidate for temporary storage, mark it, push it onto the stack, and continue simplification.

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 137

Register Allocation Register Allocation by Graph Coloring

Coloring

The nodes are successively popped from the stack and, if possible, colored and put back into the graph.

Let k be the popped node.

Case 1: k is not marked. Thus, it has fewer than rn neighbors. Then, k can be colored with a free color.

Case 2: k is marked.
a) Its rn or more neighbors use fewer than rn different colors. Then, color k appropriately.
b) The neighbors use rn different colors. Leave k uncolored.
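
Both phases fit into a short Java sketch (assumed representation: integer nodes with symmetric adjacency sets; the spill-candidate heuristic is simplified to "maximal degree", and marking is implicit in the optimistic select phase):

    import java.util.*;

    class ColoringSketch {
        static Map<Integer, Integer> color(Map<Integer, Set<Integer>> g, int rn) {
            Map<Integer, Set<Integer>> work = new HashMap<>();
            g.forEach((n, adj) -> work.put(n, new HashSet<>(adj)));
            Deque<Integer> stack = new ArrayDeque<>();
            while (!work.isEmpty()) {
                // Simplify: prefer a node with degree < rn; otherwise push a
                // spill candidate (here simply the node of maximal degree).
                Integer pick = work.keySet().stream()
                    .filter(n -> work.get(n).size() < rn).findFirst()
                    .orElseGet(() -> Collections.max(work.keySet(),
                        Comparator.comparingInt(n -> work.get(n).size())));
                stack.push(pick);
                for (int m : work.remove(pick)) work.get(m).remove(pick);
            }
            Map<Integer, Integer> col = new HashMap<>();
            while (!stack.isEmpty()) {          // Select: pop and color if possible
                int n = stack.pop();
                BitSet used = new BitSet(rn);
                for (int m : g.get(n)) if (col.containsKey(m)) used.set(col.get(m));
                int c = used.nextClearBit(0);
                if (c < rn) col.put(n, c);      // else: left uncolored -> spill, retry
            }
            return col;                          // uncolored nodes are spill candidates
        }
    }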

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 138

Register Allocation Register Allocation by Graph Coloring

Example - Graph coloring

For simplicity, we only consider one basic block.

In the beginning, t0 and t2 are live.

    t1 := a + t0
    t3 := t2 - 1
    t4 := t1 * t3
    t5 := b + t0
    t6 := c + t0
    t7 := d + t4
    t8 := t5 + 8
    t9 := t8
    t2 := t6 + 4
    t0 := t7

In the end, t0, t2, t9 are alive.

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 139

Register Allocation Register Allocation by Graph Coloring

Example - Graph coloring (2)

Interference graph:

[Figure: interference graph over the temporaries t0, ..., t9]

Assumption: 4 available registers

Simplification: Remove (in order) t1, t3, t2, t9, t0, t5, t4, t7, t8, t6

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 140

Register Allocation Register Allocation by Graph Coloring

Example - Graph coloring (3)

Possible coloring:

[Figure: a possible coloring of the interference graph, processing the nodes in the order t1, t3, t2, t9, t0, t5, t4, t7, t8, t6]

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 141

Register Allocation Register Allocation by Graph Coloring

Example - Graph coloring (4)

Remarks:

There are several extensions of the algorithm:
• Elimination of move instructions
• Specific heuristics for simplification (What is a suitable node?)
• Consideration of pre-colored nodes

Recommended reading:
• Appel, Sec. 11.1–11.3

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 142

Register Allocation Register Allocation by Graph Coloring

Further aspects of register allocation

The introduced algorithms consider subproblems. In practice, further aspects have to be dealt with for register allocation:
• Interaction with other compiler phases (in particular optimization and code generation)
• Relation between temporaries and registers
• Source/intermediate/target language
• Number of uses (Is a variable used inside an inner loop?)

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 143

Register Allocation Register Allocation by Graph Coloring

Further aspects of register allocation (2)

Possible global procedure

• Allocate registers for standard tasks (registers for stack and argument pointers, base registers)
• Decide which variables and parameters should be stored in registers
• Evaluate the usage frequency of temporaries (occurrences in inner loops, distribution of accesses over the life range)
• Use this evaluation together with the heuristics of the register allocation algorithm
• If applicable, optimize again

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 144


Just-In-Time Compilation

4.4 Just-In-Time Compilation

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 145

Just-In-Time Compilation Language Execution Techniques

4.4.1 Language Execution Techniques

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 146

Just-In-Time Compilation Language Execution Techniques

Static (Ahead-of-Time) Compilation

[Diagram: Source Code → AOT Compiler (compile time) → Machine Code → Machine (runtime)]

Advantages

• Fast execution

Disadvantages

• Platform dependent
• Compilation step

Examples

• C/C++, Pascal

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 147

Just-In-Time Compilation Language Execution Techniques

Interpretation

[Diagram: Source Code → Interpreter (runtime)]

Advantages

• Platform independent
• No compilation step

Disadvantages

• Slow execution

Examples

• Bash, Javascript (old browsers)

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 148

Just-In-Time Compilation Language Execution Techniques

Use of Virtual Machine Code (Bytecode)

[Diagram: Source Code → AOT Compiler (compile time) → Bytecode → Virtual Machine (runtime)]

Advantages

• Faster execution
• Platform independent

Disadvantages

• Still slow due to interpretation
• Compilation step

Examples

• Java, C#

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 149

Just-In-Time Compilation Just-In-Time Compilation

4.4.2 Just-In-Time Compilation

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 150

Just-In-Time Compilation Just-In-Time Compilation

Dynamic (Just-In-Time) Compilation

[Diagram: Byte/Source Code → Virtual Machine/Interpreter → JIT Compiler → Machine Code → Machine (all at runtime)]

Advantages

• Fast execution
• Platform independent

Disadvantages

• JIT runtime overhead

Examples

• Java HotSpot VM, .NET CLR, Mozilla SpiderMonkey

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 151

Just-In-Time Compilation Just-In-Time Compilation

Just-in-time Compilation

• Just-in-time (dynamic) compilation compiles code at runtime
• The goal is to improve performance compared to pure interpretation
• Trade-off between compilation cost and execution time benefit

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 152


Just-In-Time Compilation Just-In-Time Compilation

The History of Just-In-Time1

1960 McCarthy: compile LISP functions at runtime

1968 Thompson: compile regular expressions at runtime

1968 Mitchell: get compiled code by storing interpreter actions

1970 Abrams: JIT-Compilers for APL

1974 Hansen: Detect hot-spots using frequency counters

1993 Jones: Use partial evaluation to create compilers from interpreters

1994 Hölzle: Adaptive optimization for Self

1997 Sun Hot-Spot JVM

2006 Gal and Franz: Tracing JITs

2011 Google V8, Mozilla TraceMonkey

1 See Aycock, 2003
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 153

Just-In-Time Compilation Just-In-Time Compilation

Advantages of JIT Compilation2

Many optimizations can be done at runtime that are not possible in static compilation, due to additional runtime information:
• Concrete operating system and execution platform
  – e.g., use of SSE2 instructions
• Concrete input values
  – Inline virtual method calls
  – Apply constant folding
  – ...
• The program can be monitored at runtime
  – Optimize hot code
• Global optimizations in the presence of
  – Library code
  – Dynamically loaded code

2 Source: http://en.wikipedia.org/wiki/Just-in-time_compilation
c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 154

Just-In-Time Compilation Just-In-Time Compilation

Kinds of JIT Compilation

Classic
• No interpretation
• Compile code with a fast (non-optimizing) traditional compiler

Mixed-Mode
• Start with interpretation
• Only compile hot code
• Examples: Sun Hot-Spot JVM, Mozilla SpiderMonkey

Adaptive Compilation
• No interpretation
• Start with fast compilation (nearly no optimizations)
• Recompile hot code with an optimizing compiler
• Example: Google V8

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 155

Just-In-Time Compilation Just-In-Time Compilation

Design Decisions for JIT Implementations

• JIT implementations have to decide:
  – What to compile? All code or only some code?
  – How to compile? Fast or optimal?
  – When to compile? At startup or when hot code is detected? Longer analysis ⇒ better generated code
• Decisions may depend on the target machine and the target application
  – Client applications require fast start-up
  – Server applications should be optimized more aggressively
• JIT implementations typically allow configuring these parameters
• Default values are based on empirical data (benchmarks)

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 156

Just-In-Time Compilation Just-In-Time Compilation

Different Compilers

Fast Compiler

• Only simple optimizations (e.g., constant folding)
• No intermediate representations
• Simple register allocation (linear time)
• Advantage: fast compilation
• Disadvantage: slow code

Optimizing Compiler

• Uses all techniques of traditional compilers
• Disadvantage: slow compilation
• Advantage: very fast code
  – Generated code can outperform C or C++ compiled code due to additional runtime information

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 157

Just-In-Time Compilation Hot-Spot Detection

4.4.3 Hot-Spot Detection

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 158

Just-In-Time Compilation Hot-Spot Detection

Hot-Spot Detection

Observation
Many programs spend the majority of their time executing a minority of their code (hot spots)3

Problem
It is often statically not clear which parts of the program are executed more often than others

Solution
Monitor the code during runtime (profiling)

3 D. E. Knuth. An empirical study of Fortran programs. Software—Practice and Experience 1, pp. 105–133, 1971.

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 159

Just-In-Time Compilation Hot-Spot Detection

Profiling

Profiling

1. monitor and trace events that occur during runtime,
2. assess the cost of these events,
3. attribute the cost of these events to specific parts of the program.

Profiling uses the past to predict the future.

Ways to profile
• Time-based profiling
• Counter-based profiling
• Sampling-based profiling

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 160


Just-In-Time Compilation Hot-Spot Detection

Time-based Profiling

Method

• Record the time spent in each method
• Profiling instructions are inserted in prolog and epilog
• Measure the time and add it to the total time of the method
• Methods are compiled when a certain amount of time has been spent in them

Properties

• All methods are profiled
• May be inaccurate for short methods
• Very large overhead

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 161

Just-In-Time Compilation Hot-Spot Detection

Counter-based Profiling

Method

• Invocation counter for each method (and loop back-branches)
• Increase the counter for each method call (and branch taken)
• Compile a method when its counter reaches a predefined threshold

Properties

• All methods are profiled
• Accurate
• Difficult to choose good thresholds
• Large overhead
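
A minimal sketch of the counter mechanism with hypothetical VM interfaces (the names and the threshold value are assumptions, not taken from a real VM):

    class CounterProfiledVM {
        static final int THRESHOLD = 10_000;    // assumed value; real VMs tune this

        static class Method {
            int invocations;
            Runnable compiled;                   // null while still interpreted
        }

        void call(Method m) {
            if (m.compiled != null) { m.compiled.run(); return; }
            if (++m.invocations >= THRESHOLD)    // hot: hand the method to the JIT
                m.compiled = compile(m);
            interpret(m);                        // this invocation is still interpreted
        }

        Runnable compile(Method m) { /* invoke the JIT compiler */ return () -> {}; }
        void interpret(Method m)   { /* bytecode dispatch loop */ }
    }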

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 162

Just-In-Time Compilation Hot-Spot Detection

Sampling-based Profiling

Method

• Counter for each method
• Sample the application periodically (e.g., every 10 ms)
• Increase the counter of the current method (and the caller method)
• Compile a method when its counter reaches a predefined threshold

Properties

• Low overhead
• May miss methods
• Non-deterministic (difficult to debug)

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 163

Just-In-Time Compilation Further Aspects of JIT Compilers

4.4.4 Further Aspects of JIT Compilers

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 164

Just-In-Time Compilation Further Aspects of JIT Compilers

Memory Management of Compiled Code

Problem

• Compiled (native) code is often 4-8 times larger than the original bytecode
• Compiled code must be held in memory

Solution

• To reduce memory consumption, only a fixed amount (cache) of compiled code is held in memory

Cache Replacement Strategies

• FIFO (First In First Out)
• LRU (Least Recently Used)
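
For the LRU strategy, Java's LinkedHashMap in access order already provides the eviction hook; a minimal sketch of a code cache built on it (our own class, not a real VM's implementation):

    import java.util.*;

    class CodeCache<K, V> extends LinkedHashMap<K, V> {
        private final int capacity;              // max number of compiled methods kept

        CodeCache(int capacity) {
            super(16, 0.75f, true);              // accessOrder = true -> LRU order
            this.capacity = capacity;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > capacity;            // evict the least recently used entry
        }
    }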

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 165

Just-In-Time Compilation Further Aspects of JIT Compilers

On-Stack Replacement (OSR)

Problem

• When a hot loop is detected, the compiled version of the executing method is only executed the next time the method is called (which may never be the case)

Solution

• Compile a special version of the method that starts in the middle of the method, where the loop is executing
• Stop interpreting the executing method and execute the special compiled version

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 166

Just-In-Time Compilation Further Aspects of JIT Compilers

De-Optimization

Problem

• In languages that allow dynamic code loading (e.g., Java), optimizations may become invalid
• For example: method inlining for virtual method calls can become invalid when new classes are added to the type hierarchy

De-Optimization

• Optimized code can be deoptimized at runtime
• Deoptimized code can be reoptimized again

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 167

Just-In-Time Compilation Further Aspects of JIT Compilers

Inline Caches (1/2)4

Problem
• Message lookup in prototype-based languages like Javascript or Smalltalk can be expensive due to complex lookup rules.

Observation
• Receiver objects at a certain call site are often of the same type

Idea
• After the first dynamic lookup, inline the lookup result at the call site
• Add a type check to fall back to dynamic lookup and update the cache

4 Good introduction: http://blog.cdleary.com/2010/09/picing-on-javascript-for-fun-and-profit/

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 168


Just-In-Time Compilation Further Aspects of JIT Compilers

Inline Caches (2/2)

Example (Javascript)

    function isPoint(obj) {
        return obj.isPoint;
    }

Generated code (pseudo code):

    type := gettype(obj)
    if type = CACHED_TYPE
        result = staticcall CACHED_METHOD
        jump L
    else
        result = dynamiccall obj, "isPoint"
        # ... update cached values (modify generated code)
    L: return result

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 169

Just-In-Time Compilation Further Aspects of JIT Compilers

Polymorphic Inline Caches (PICs)5

Problem

• Inline caches only work for a single type (monomorphic call site)

Solution

• Polymorphic Inline Caches (PICs)
• Like (monomorphic) inline caches, but handle multiple cases
• If the type check fails, add an additional case (linear search)
• If a certain number of cases is reached, treat the call site as megamorphic and only do dynamic lookup

5 Craig Chambers, David Ungar, and Elgin Lee. Optimizing dynamically-typed object-oriented languages with polymorphic inline caches. ECOOP 1991.
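
A minimal Java sketch of the PIC mechanics (all names and the case limit are assumptions; real PICs patch the generated machine code instead of searching a list):

    import java.util.*;
    import java.util.function.Function;

    class PicSketch {
        static final int MAX_CASES = 4;          // assumed limit before "megamorphic"

        static class Entry {
            final Class<?> type;
            final Function<Object, Object> target;
            Entry(Class<?> type, Function<Object, Object> target) {
                this.type = type; this.target = target;
            }
        }

        final List<Entry> cases = new ArrayList<>();
        boolean megamorphic = false;

        Object send(Object receiver) {
            Class<?> t = receiver.getClass();
            if (!megamorphic)
                for (Entry e : cases)            // linear search over the cached cases
                    if (e.type == t) return e.target.apply(receiver);
            Function<Object, Object> m = dynamicLookup(t);   // slow path
            if (!megamorphic && cases.size() < MAX_CASES)
                cases.add(new Entry(t, m));      // extend the PIC with the new case
            else
                megamorphic = true;              // give up caching, always do lookup
            return m.apply(receiver);
        }

        Function<Object, Object> dynamicLookup(Class<?> t) {
            return obj -> obj.toString();        // stand-in for the real lookup rules
        }
    }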

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 170

Just-In-Time Compilation Tracing JIT Compilers

4.4.5 Tracing JIT Compilers

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 171

Just-In-Time Compilation Tracing JIT Compilers

Tracing JIT Compilers

Observation

• Most time is spent in hot paths

Idea

• Concentrate on hot paths and not whole methods/code blocks

Approach

• Detect hot paths at runtime
• Record a trace when a hot path is detected
• Generate optimized code for individual traces
• Use trace trees instead of control flow graphs

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 172

Just-In-Time Compilation Tracing JIT Compilers

Example

    1: code;
    2: do {
           if (rare condition) {
    3:         code;
           } else {
    4:         code;
           }
    5: } while (frequent condition);
    6: code;

[Control flow graph: 1 → 2; 2 → 3, 2 → 4; 3 → 5; 4 → 5; 5 → 2, 5 → 6]

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 173

Just-In-Time Compilation Tracing JIT Compilers

Example

[Same code and control flow graph as on the previous slide.]

hot path = (2,4,5,2)

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 174

Just-In-Time Compilation Tracing JIT Compilers

Hot Path Detection

• Only loops are considered for hot path detection (hot loops)
• Add a counter to each destination of a backward branch (potential loop header)
• Interpret the program
• Increase the counter when the branch is taken
• When a threshold (e.g., 2 in TraceMonkey) is reached, a hot loop is detected
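
Sketched with a hypothetical interpreter hook (only the threshold value is taken from the slide):

    import java.util.*;

    class HotLoopDetector {
        static final int HOT_THRESHOLD = 2;      // e.g., TraceMonkey's value
        final Map<Integer, Integer> counters = new HashMap<>();

        // Called whenever a branch from pc `from` to pc `to` is taken.
        // Returns true if `to` is (now) a hot loop header.
        boolean branchTaken(int from, int to) {
            if (to >= from) return false;        // only backward branches are counted
            return counters.merge(to, 1, Integer::sum) >= HOT_THRESHOLD;
        }
    }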

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 175

Just-In-Time Compilation Tracing JIT Compilers

Tracing

1. When a hot loop is detected, start code tracing
2. Record all interpreter instructions
3. Stop recording when either
   – a cycle is found (tracing finished)
   – the trace becomes too long (tracing aborted)
   – an exception is thrown (tracing aborted)
4. The result is a code trace (loop trace)
5. Branches in a code trace are replaced by guards to handle side exits
   – Failed guards return control to the interpreter
6. Method calls are inlined into the trace with appropriate guards in case of dynamic dispatch
7. The trace is optimized and compiled to native code
8. In the next iteration, the native code is executed
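
A minimal sketch of the recording step with hypothetical interpreter hooks (trace entries are kept as strings for readability):

    import java.util.*;

    class TraceRecorder {
        final int loopHeaderPc;                  // pc of the detected hot loop header
        final List<String> trace = new ArrayList<>();
        static final int MAX_TRACE = 1024;       // assumed limit; abort if exceeded

        TraceRecorder(int loopHeaderPc) { this.loopHeaderPc = loopHeaderPc; }

        // Called by the interpreter for each executed instruction while recording.
        // Returns true as long as recording should continue.
        boolean record(int pc, String insn, boolean isBranch, boolean taken) {
            if (pc == loopHeaderPc && !trace.isEmpty()) {
                finish();                        // cycle found: optimize + compile trace
                return false;
            }
            if (isBranch)                        // branches become guards with side exits
                trace.add("guard(" + insn + " taken == " + taken + ") else sideExit(" + pc + ")");
            else
                trace.add(insn);
            if (trace.size() > MAX_TRACE) {      // trace too long: abort recording
                trace.clear();
                return false;
            }
            return true;
        }

        void finish() { /* optimize the trace and compile it to native code */ }
    }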

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 176


Just-In-Time Compilation Tracing JIT Compilers

Properties of Simple Tracing JITs

Advantages

• Optimizing single traces is much easier (faster) than optimizing a whole CFG
• Optimization happens across method boundaries, which is especially good for programs with many small methods
• The implementation is simpler and takes less code compared to a CFG-based JIT compiler

Disadvantages

• Only works well when there are hot dominant paths
• Trace recording is very expensive

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 177

Just-In-Time Compilation Tracing JIT Compilers

Trace Trees6

Problem

• Simple tracing only records a single path
• Does not work well for loops with non-dominant paths

Idea

• Instead of single traces, use trace trees

Approach

• When a guard fails during execution of a compiled trace, immediately start trace recording
• When the new trace reaches the loop header, incorporate the new trace into the trace tree
• The corresponding guard is turned into a conditional branch

6 See Gal and Franz, 2006

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 178

Just-In-Time Compilation Tracing JIT Compilers

Example

    1: code;
    2: do {
           if (condition) {
    3:         code;
           } else {
    4:         code;
           }
    5: } while (condition);
    6: code;

[Control flow graph: 1 → 2; 2 → 3, 2 → 4; 3 → 5; 4 → 5; 5 → 2, 5 → 6]

[Trace: 2 → 4 → 5 with side exits (sx) at the guards in 2 and 5]

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 179

Just-In-Time Compilation Tracing JIT Compilers

Example

[Same code and control flow graph as on the previous slide.]

[Trace tree: the traces 2 → 4 → 5 and 2 → 3 → 5 share the anchor node 2; side exits (sx) remain at the loop-exit guards]

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 180

Just-In-Time Compilation Tracing JIT Compilers

Properties of Trace Trees

• A trace tree is a directed rooted tree
• The root is called anchor node a and represents the loop header
• All leaf nodes have an implicit back-edge to a
• All nodes, except a, have exactly one predecessor
• Nodes may be duplicated if they lie on multiple traces
• Transformation to SSA form is fast, because there is only one join point (the anchor node)

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 181

Just-In-Time Compilation Tracing JIT Compilers

Nested Loops

• Traces are added to a trace tree when a side exit is taken
• For nested loops, the inner loop gets hot before the outer loop
• As a consequence, the loop is turned "inside out"

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 182

Just-In-Time Compilation Tracing JIT Compilers

Nested Loops Example

    1: code;
    2: do {
           code;
    3:     do {
               code;
    4:     } while (condition);
    5: } while (condition);
    6: code;

[Control flow graph: 1 → 2 → 3 → 4; 4 → 3, 4 → 5; 5 → 2, 5 → 6]

[Inner trace: 3 → 4 with a side exit (sx) at the guard in 4]

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 183

Just-In-Time Compilation Tracing JIT Compilers

Nested Loops Example

[Same code and control flow graph as on the previous slide.]

[Extended trace: 3 → 4 → 5 → 2 back to 3, i.e., once the side exit in 4 becomes hot, the outer loop is inlined into the inner loop's trace tree]

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 184


Just-In-Time Compilation Tracing JIT Compilers

Bounding Trace Trees

• Trace trees can grow indefinitely
• In order to limit the size of trace trees, extending the tree is stopped after a certain number of backward branches (e.g., 3)
• This effectively limits the possible number of inlined outer loops

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 185

Just-In-Time Compilation Tracing JIT Compilers

Method Calls

• Like outer loops, method calls are inlined
• Virtual calls result in a branch

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 186

Just-In-Time Compilation Literature

Literature

General JIT Compilation
• John Aycock. A Brief History of Just-In-Time. ACM Computing Surveys, Vol. 35, No. 2, June 2003, pp. 97-113. http://dx.doi.org/10.1145/857076.857077
• M. Arnold et al. A Survey of Adaptive Optimization in Virtual Machines. Proc. IEEE, 2005. http://dx.doi.org/10.1109/JPROC.2004.840305
• T. Kotzmann, C. Wimmer, H. Mössenböck. Design of the Java HotSpot Client Compiler for Java 6. ACM TACO, 2008. http://dx.doi.org/10.1145/1369396.1370017
• Sami Zhioua. A dynamic compiler in an embedded Java Virtual machine. Master's Thesis, 2003. http://www.cs.mcgill.ca/~zhioua/MscSami.pdf

Tracing JITs
• A. Gal and M. Franz. Incremental Dynamic Code Generation with Trace Trees. Technical Report, 2006. http://www.ics.uci.edu/~franz/Site/pubs-pdf/ICS-TR-06-16.pdf
• A. Gal, C. W. Probst, and M. Franz. HotpathVM: An Effective JIT Compiler for Resource-constrained Devices. VEE'06. http://www.usenix.org/events/vee06/full_papers/p144-gal.pdf
• Gal et al. Trace-based Just-in-Time Type Specialization for Dynamic Languages. PLDI 2009. http://people.mozilla.org/~gal/compressed.tracemonkey-pldi-09.pdf

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 187

Further Aspects of Compilation

4.5 Further Aspects of Compilation

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 188

Further Aspects of Compilation

Code generation

Code generation can be split into four independent, machine-dependent tasks:
• Memory allocation
• Instruction selection and addressing
• Instruction scheduling
• Code optimization

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 189

Further Aspects of Compilation

Memory allocation

Modern machines have the following memory hierarchy:
• Registers
• Primary cache (instruction cache, data cache)
• Secondary cache
• Main memory (page/segment addressing)

Unlike registers, the cache is controlled by the hardware. Efficient usage of the cache means in particular aligning data objects and instructions to the borders of cache blocks (cf. Appel, Chap. 21). The same holds for main memory.

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 190

Further Aspects of Compilation

Instruction selection

Instruction selection aims at the best possible translation of expressions and basic blocks using the instruction set of the machine, for instance,
• using complex addressing modes
• considering the sizes of constants or the locality of jumps

Instruction selection is often formulated as a tree pattern matching problem with costs (cf. Wilhelm/Maurer, Chap. 11).

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 191

Further Aspects of Compilation

Instruction scheduling

Modern machines allow processor-local parallel processing (pipelining, super-scalar execution, VLIW).

In order to exploit this parallelism, the code has to comply with additional requirements that have to be considered during code generation (see Appel, Chap. 20; Wilhelm/Maurer, Sect. 12.6).

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 192


Further Aspects of Compilation

Code optimization

Optimizations of the assembly or machine code may allow an additional increase in program efficiency (see Wilhelm/Maurer, Sect. 6.9).

c© Prof. Dr. Arnd Poetzsch-Heffter Selected Topics in Compiler Construction 193