CS 1622 Lecture 21
Code Optimization
Code Optimizations
• Code optimizations are code transformations
• "Optimization" is a misnomer: it can sometimes make things worse
• Can be applied at various levels of code: high, intermediate, and low
• Can be applied to various regions of a program:
  • local: a basic block
  • global: the control-flow graph of a method
  • inter-procedural: across methods
Optimization Phase
• Next-to-last phase of the compiler
• Goal is to improve the quality of the generated code:
  • Time
  • Memory
  • Power
• Typically performed on some (or several levels of) intermediate language
Why Intermediate Languages?
• When to perform optimizations:
• On the AST
  • Pro: Machine independent
  • Con: Too high level
• On assembly language
  • Pro: Exposes optimization opportunities
  • Con: Machine dependent
  • Con: Must re-implement optimizations when retargeting
• On an intermediate language
  • Pro: Machine independent
  • Pro: Exposes optimization opportunities
Intermediate Languages
• Each compiler uses its own intermediate language
  • IL design is still an active area of research
• Intermediate language = high-level assembly language
  • Uses register names, but has an unlimited number of them
  • Uses control structures like assembly language
  • Uses opcodes, but some are higher level
    • E.g., push translates to several assembly instructions
    • Most opcodes correspond directly to assembly opcodes
Three-Address Intermediate Code
• Each instruction is of the form x := y op z
  • y and z can only be registers or constants
  • Just like assembly
• Common form of intermediate code
• The AST expression x + y * z is translated as
    t1 := y * z
    t2 := x + t1
  • Each subexpression has a "home"
Generating Intermediate Code
• Similar to assembly code generation
• Major difference: use any number of IL registers to hold intermediate results
Generating Intermediate Code (Cont.)
• igen(e, t) generates code to compute the value of e in register t
• Example:
    igen(e1 + e2, t) =
      igen(e1, t1)    (t1 is a fresh register)
      igen(e2, t2)    (t2 is a fresh register)
      t := t1 + t2
• Unlimited number of registers ⇒ simple code generation
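The igen scheme above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's implementation: it assumes the AST is given as nested tuples like ('+', 'x', ('*', 'y', 'z')), that a bare variable is compiled by copying it into the target register, and that fresh registers are named t1, t2, and so on.

```python
counter = 0

def fresh():
    """Return a fresh intermediate-language register name."""
    global counter
    counter += 1
    return f"t{counter}"

def igen(e, t, code):
    """Emit three-address code that leaves the value of e in register t."""
    if isinstance(e, str):            # a variable: copy it into t
        code.append(f"{t} := {e}")
    else:                             # (op, e1, e2): compute both subtrees
        op, e1, e2 = e
        t1, t2 = fresh(), fresh()
        igen(e1, t1, code)
        igen(e2, t2, code)
        code.append(f"{t} := {t1} {op} {t2}")
    return code

code = igen(('+', 'x', ('*', 'y', 'z')), 't0', [])
for line in code:
    print(line)
```

Because registers are never reused, each recursive call simply grabs a fresh name, which is exactly why the unlimited register supply makes code generation simple.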
An Intermediate Language

P ⇒ S P | ε
S ⇒ id := id op id
  | id := op id
  | id := id
  | push id
  | id := pop
  | if id relop id goto L
  | L:
  | jump L

• id's are register names
• Constants can replace id's
• Typical operators: +, -, *
Structuring the Intermediate Program for Analysis
• Generate intermediate code
• We want to analyze the code to apply optimizations
  • Control flow is important
  • Data flow is important
• Use a control-flow graph: nodes are basic blocks, edges show flow of control
Definition: Basic Blocks
• A basic block is a maximal sequence of instructions with:
  • no labels (except at the first instruction), and
  • no jumps (except in the last instruction)
• Idea:
  • Cannot jump into a basic block (except at the beginning)
  • Cannot jump out of a basic block (except at the end)
  • Each instruction in a basic block is executed only after all the preceding instructions have been executed
  • All instructions are executed if the first one is
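This definition suggests the classic "leader" rule for partitioning a linear instruction sequence into basic blocks: a block starts at a label or right after a jump. Below is a sketch under the assumption that instructions are strings in the IL of the earlier slide, with labels written as "L:".

```python
def is_jump(instr):
    """Jumps in our IL start with 'jump' or 'if ... goto'."""
    return instr.startswith(("jump", "if"))

def basic_blocks(code):
    """Split a list of IL instructions into maximal basic blocks."""
    blocks, current = [], []
    for instr in code:
        # A label, or any instruction following a jump, starts a new block.
        starts_block = instr.endswith(":") or (current and is_jump(current[-1]))
        if starts_block and current:
            blocks.append(current)
            current = []
        current.append(instr)
    if current:
        blocks.append(current)
    return blocks

prog = ["x := 1", "i := 1", "L:", "x := x * x",
        "i := i + 1", "if i < 10 goto L", "a := x"]
for b in basic_blocks(prog):
    print(b)
```

On the loop from the control-flow-graph example a few slides later, this yields three blocks: the two initializing copies, the labeled loop body ending in the conditional jump, and the instruction after the loop.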
Basic Block Example
• Consider the basic block
    (1) L:
    (2) t := 2 * x
    (3) w := t + x
    (4) if w > 0 goto L'
• There is no way for (3) to be executed without (2) having been executed right before
  • We know we can change (3) to w := 3 * x
  • Can we eliminate (2) as well?
Definition: Control-Flow Graphs
• A control-flow graph is a directed graph with:
  • basic blocks as nodes
  • an edge from block A to block B if execution can flow from the last instruction in A to the first instruction in B
    • E.g., the last instruction in A is jump LB
    • E.g., execution can fall through from block A to block B
Control-Flow Graphs: Example
• The body of a method (or procedure) can be represented as a control-flow graph
• There is one initial node
• All "return" nodes are terminal

    x := 1
    i := 1
  L:
    x := x * x
    i := i + 1
    if i < 10 goto L
Optimization Overview
• Optimization seeks to improve a program's utilization of some resource:
  • Execution time (most often)
  • Code size
  • Network messages sent, etc.
• Optimization should not alter what the program computes
  • The answer must still be the same
A Classification of Optimizations
• For languages like C and Java there are three granularities of optimization:
  1. Local optimizations: apply to a basic block in isolation
  2. Global optimizations: apply to a control-flow graph (method body) in isolation
  3. Inter-procedural optimizations: apply across method boundaries
• Most compilers do (1), many do (2), and very few do (3)
Cost of Optimizations
• In practice, a conscious decision is made not to implement the fanciest optimization known
• Why?
  • Some optimizations are hard to implement
  • Some optimizations are costly in terms of compilation time
  • The fancy optimizations are both hard and costly
• The goal: maximum improvement with minimum cost
Local Optimizations
• The simplest form of optimization
• No need to analyze the whole procedure body, just the basic block in question
• Example: algebraic simplification
Algebraic Simplification
• Some statements can be deleted:
    x := x + 0
    x := x * 1
• Some statements can be simplified:
    x := x * 0   ⇒   x := 0
    y := y ** 2  ⇒   y := y * y
    x := x * 8   ⇒   x := x << 3
    x := x * 15  ⇒   t := x << 4; x := t - x
  (on some machines << is faster than *, but not on all!)
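These rewrites can be sketched as a small rule-driven pass. The (dst, op, lhs, rhs) tuple representation, and the restriction to integer constants on the right-hand side, are assumptions made for this illustration, not part of the lecture's IL.

```python
def simplify(instr):
    """Return a simplified instruction, or None if it can be deleted."""
    dst, op, lhs, rhs = instr
    if op == "+" and rhs == 0 and dst == lhs:
        return None                                    # x := x + 0 deleted
    if op == "*" and rhs == 1 and dst == lhs:
        return None                                    # x := x * 1 deleted
    if op == "*" and rhs == 0:
        return (dst, "const", 0, None)                 # x := x * 0  =>  x := 0
    if op == "**" and rhs == 2:
        return (dst, "*", lhs, lhs)                    # y := y ** 2  =>  y := y * y
    if op == "*" and isinstance(rhs, int) and rhs > 1 and rhs & (rhs - 1) == 0:
        # multiply by a power of two becomes a shift
        return (dst, "<<", lhs, rhs.bit_length() - 1)  # x := x * 8  =>  x := x << 3
    return instr

print(simplify(("x", "*", "x", 8)))    # ('x', '<<', 'x', 3)
print(simplify(("y", "**", "y", 2)))   # ('y', '*', 'y', 'y')
print(simplify(("x", "+", "x", 0)))    # None
```

The x * 15 case from the slide needs to emit two instructions, so a real pass would return a list rather than a single tuple; it is omitted here to keep the sketch short.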
Constant Folding
• Operations on constants can be computed at compile time
• In general, if there is a statement x := y op z where y and z are constants, then y op z can be computed at compile time
• Example: x := 2 + 2 ⇒ x := 4
• Example: if 2 < 0 goto L can be deleted
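A sketch of constant folding, assuming instructions are (dst, op, lhs, rhs) tuples (an illustrative representation, not from the slides): when both operands are integer constants, the operator is evaluated at compile time.

```python
import operator

# Map IL operators to their compile-time Python counterparts.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul,
       "<": operator.lt}

def fold(instr):
    """Fold y op z into a constant when both operands are constants."""
    dst, op, lhs, rhs = instr
    if op in OPS and isinstance(lhs, int) and isinstance(rhs, int):
        return (dst, "const", OPS[op](lhs, rhs), None)
    return instr

print(fold(("x", "+", 2, 2)))     # x := 2 + 2 becomes x := 4
print(fold(("t", "<", 2, 0)))     # folds to False: a branch testing t is dead
print(fold(("x", "+", "y", 2)))   # not all-constant, left unchanged
```

Folding a comparison like 2 < 0 to False is what lets the jump on the slide be deleted outright: the branch condition is known at compile time.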
Control Flow Optimizations
• Eliminating unreachable code:
  • Code that is unreachable in the control-flow graph
  • Basic blocks that are not the target of any jump or "fall-through" from a conditional
  • Such basic blocks can be eliminated
• Why would such basic blocks occur?
• Removing unreachable code makes the program smaller
  • And sometimes also faster, due to memory cache effects (increased spatial locality)
Static Single Assignment Form (SSA)
• Some optimizations are simplified if each name occurs only once on the left-hand side of an assignment
• Intermediate code can be rewritten to be in single assignment form:
    x := z + y          x1 := z + y
    a := x        ⇒     a := x1
    x := 2 * x          x2 := 2 * x1
• More complicated in general, due to loops
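For a single basic block, the rewrite is a straightforward renaming pass: each variable carries a version counter, every definition bumps it, and every use reads the current version. Loops, which require phi-functions, are deliberately out of scope here, and the (dst, op, lhs, rhs) tuple format is an assumption of the sketch.

```python
from collections import defaultdict

def to_ssa(block):
    """Rename a straight-line block into single assignment form."""
    version = defaultdict(int)     # variable -> current version number

    def use(v):
        # Constants pass through; variables read their current version.
        if not isinstance(v, str):
            return v
        return f"{v}{version[v]}" if version[v] else v

    out = []
    for dst, op, lhs, rhs in block:
        new_lhs = use(lhs)
        new_rhs = use(rhs) if rhs is not None else None
        version[dst] += 1          # this definition creates a new version
        out.append((f"{dst}{version[dst]}", op, new_lhs, new_rhs))
    return out

block = [("x", "+", "z", "y"),
         ("a", "copy", "x", None),
         ("x", "*", 2, "x")]
for instr in to_ssa(block):
    print(instr)
```

Run on the slide's example, this produces x1 := z + y, a1 := x1, and x2 := 2 * x1; the use of x in the last instruction correctly reads version 1, the value before the redefinition.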
Common Subexpression Elimination
• Assume:
  • the basic block is in single assignment form
  • a definition x := … is the first use of x in the block
• If two assignments have the same rhs, they compute the same value
• Example:
    x := y + z          x := y + z
    …             ⇒     …
    w := y + z          w := x
  (the values of x, y, and z do not change in the … code)
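Under the single-assignment assumption, a hash table keyed on right-hand sides is enough: the first instruction to compute an expression records its destination, and any later identical right-hand side becomes a copy. The tuple instruction format is an assumption of this sketch.

```python
def cse(block):
    """Replace repeated right-hand sides with a copy of the earlier result."""
    seen = {}          # (op, lhs, rhs) -> register already holding that value
    out = []
    for dst, op, lhs, rhs in block:
        key = (op, lhs, rhs)
        if op != "copy" and key in seen:
            out.append((dst, "copy", seen[key], None))   # w := y + z  =>  w := x
        else:
            seen[key] = dst
            out.append((dst, op, lhs, rhs))
    return out

print(cse([("x", "+", "y", "z"), ("w", "+", "y", "z")]))
```

Single assignment form is what makes the table safe: no later instruction can redefine x, y, or z, so a recorded value never goes stale within the block.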
Copy Propagation
• If w := x appears in a block, all subsequent uses of w can be replaced with uses of x
• Example:
    b := z + y          b := z + y
    a := b        ⇒     a := b
    x := 2 * a          x := 2 * b
• This does not make the program smaller or faster, but it might enable other optimizations:
  • Constant folding
  • Dead code elimination (see a := b above)
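A sketch of the pass, assuming the block is in single assignment form (so a recorded copy is never invalidated by a redefinition) and that instructions are (dst, op, lhs, rhs) tuples with copies marked by the op "copy"; both are illustrative assumptions.

```python
def copy_prop(block):
    """After w := x, rewrite later uses of w to x."""
    copies = {}                        # w -> x for each copy w := x seen so far
    out = []
    for dst, op, lhs, rhs in block:
        lhs = copies.get(lhs, lhs)     # substitute through earlier copies
        rhs = copies.get(rhs, rhs)
        if op == "copy":
            copies[dst] = lhs
        out.append((dst, op, lhs, rhs))
    return out

block = [("b", "+", "z", "y"),
         ("a", "copy", "b", None),
         ("x", "*", 2, "a")]
print(copy_prop(block))                # the last instruction becomes x := 2 * b
```

Note that the copy a := b itself is left in place; as the slide says, it is dead code elimination, not copy propagation, that removes it afterwards.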
Copy Propagation and Constant Folding
• Example:
    a := 5            a := 5
    x := 2 * a   ⇒    x := 10
    y := x + 6        y := 16
    t := x * y        t := x << 4
Copy Propagation and Dead Code Elimination
• If
  • w := rhs appears in a basic block, and
  • w does not appear anywhere else in the program,
• then the statement w := rhs is dead and can be eliminated
  • Dead = does not contribute to the program's result
• Example (a is not used anywhere else):
    x := z + y          x := z + y          x := z + y
    a := x        ⇒     a := x        ⇒     x := 2 * x
    x := 2 * a          x := 2 * x
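Dead code elimination can be sketched as a backward sweep over the block: a definition is kept only if its target is needed later. The sketch below assumes the block is in single assignment form and that a live_out set names the variables still needed after the block; both assumptions, and the tuple instruction format, are made for illustration.

```python
def dce(block, live_out):
    """Drop definitions whose target is never used afterwards."""
    used = set(live_out)            # variables needed after this point
    kept = []
    for dst, op, lhs, rhs in reversed(block):
        if dst in used:
            kept.append((dst, op, lhs, rhs))
            used.discard(dst)       # single assignment: no earlier def of dst
            used.update(v for v in (lhs, rhs) if isinstance(v, str))
        # otherwise the definition is dead and is dropped
    return list(reversed(kept))

block = [("x1", "+", "z", "y"),
         ("a", "copy", "x1", None),
         ("x2", "*", 2, "x1")]
print(dce(block, live_out={"x2"}))  # a := x1 is eliminated
```

Walking backward means each instruction is judged against an up-to-date picture of what is still needed, so a chain of dead copies disappears in a single pass.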
Applying Local Optimizations
• Each local optimization does very little by itself
• Typically optimizations interact
  • Performing one optimization enables others
• Typical optimizing compilers repeatedly perform optimizations until no improvement is possible
  • The optimizer can also be stopped at any time to limit compilation time
An Example: Initial Code
    a := x ** 2
    b := 3
    c := x
    d := c * c
    e := b * 2
    f := a + d
    g := e * f
An Example: Algebraic Optimization
    a := x * x
    b := 3
    c := x
    d := c * c
    e := b << 1
    f := a + d
    g := e * f
An Example: Copy Propagation
    a := x * x
    b := 3
    c := x
    d := x * x
    e := 3 << 1
    f := a + d
    g := e * f
An Example: Constant Folding
    a := x * x
    b := 3
    c := x
    d := x * x
    e := 6
    f := a + d
    g := e * f
An Example: Common Subexpression Elimination
    a := x * x
    b := 3
    c := x
    d := a
    e := 6
    f := a + d
    g := e * f
An Example: Copy Propagation (Again)
    a := x * x
    b := 3
    c := x
    d := a
    e := 6
    f := a + a
    g := 6 * f
An Example: Dead Code Elimination
• The definitions of b, c, d, and e no longer contribute to the result and can be eliminated
• Final code:
    a := x * x
    f := a + a
    g := 6 * f
• Compare with the initial code:
    a := x ** 2
    b := 3
    c := x
    d := c * c
    e := b * 2
    f := a + d
    g := e * f
• This is the final form
Peephole Optimizations on Assembly Code
• The optimizations presented so far work on intermediate code
  • They are target independent
  • But they can also be applied to assembly language
• Peephole optimization is an effective technique for improving assembly code
  • The "peephole" is a short sequence of (usually contiguous) instructions
  • The optimizer replaces the sequence with an equivalent (but faster) one
Peephole Optimizations (Cont.)
• Write peephole optimizations as replacement rules
    i1, …, in → j1, …, jm
  where the rhs is the improved version of the lhs
• Example:
    move $a $b, move $b $a → move $a $b
  Works if move $b $a is not the target of a jump
• Another example:
    addiu $a $a i, addiu $a $a j → addiu $a $a i+j
Peephole Optimizations (Cont.)
• Many (but not all) of the basic-block optimizations can be cast as peephole optimizations
  • Example: addiu $a $b 0 → move $a $b
  • Example: move $a $a → (nothing)
  • These two together eliminate addiu $a $a 0
• Just like local optimizations, peephole optimizations need to be applied repeatedly to get maximum effect
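The rules from these two slides can be sketched as a rewriter that slides a two-instruction window over the code and repeats until nothing changes. Representing instructions as whitespace-separated strings, and the fixpoint driver, are assumptions of this sketch, not the lecture's framework.

```python
def peephole(code):
    """Repeatedly slide a small window over the code, applying rules."""
    changed = True
    while changed:                  # apply repeatedly, as the slide says
        changed = False
        out, i = [], 0
        while i < len(code):
            cur = code[i]
            nxt = code[i + 1] if i + 1 < len(code) else None
            ops = cur.split()
            # move $a $b, move $b $a  ->  move $a $b
            if (ops[0] == "move" and nxt is not None
                    and nxt == f"move {ops[2]} {ops[1]}"):
                out.append(cur); i += 2; changed = True
            # addiu $a $b 0  ->  move $a $b
            elif ops[0] == "addiu" and ops[3] == "0":
                out.append(f"move {ops[1]} {ops[2]}"); i += 1; changed = True
            # move $a $a  ->  (deleted)
            elif ops[0] == "move" and ops[1] == ops[2]:
                i += 1; changed = True
            else:
                out.append(cur); i += 1
        code = out
    return code

print(peephole(["addiu $a $a 0", "move $t $s", "move $s $t"]))
```

On the example input, the first pass turns addiu $a $a 0 into move $a $a and drops the redundant second move; the second pass then deletes move $a $a, showing why the rules must be applied repeatedly.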
Local Optimizations: Notes
• Intermediate code is helpful for many optimizations
• Many simple optimizations can still be applied on assembly language
• "Program optimization" is grossly misnamed
  • Code produced by "optimizers" is not optimal in any reasonable sense
  • "Program improvement" is a more appropriate term
• Next: global optimizations
Local Optimization
• Recall the simple basic-block optimizations:
  • Constant propagation
  • Dead code elimination

    X := 3              X := 3
    Y := Z * W    ⇒     Y := Z * W    ⇒    Y := Z * W
    Q := X + Y          Q := 3 + Y         Q := 3 + Y
Global Optimization
• These optimizations can be extended to an entire control-flow graph:

          X := 3
            |
          B > 0
          /    \
   Y := Z + W   Y := 0
          \    /
        A := 2 * X

• X is 3 on every path to the last block, so global constant propagation can rewrite it as A := 2 * 3
Correctness
• How do we know it is OK to globally propagate constants?
• There are situations where it is incorrect:

          X := 3
            |
          B > 0
          /    \
   Y := Z + W   Y := 0
   X := 4        |
          \     /
         A := 2 * X
Correctness (Cont.)
• To replace a use of x by a constant k we must know that:

  On every path leading to the use of x, the last assignment to x is x := k
Example 1 Revisited

          X := 3
            |
          B > 0
          /    \
   Y := Z + W   Y := 0
          \    /
        A := 2 * X

• The condition holds: on every path leading to A := 2 * X, the last assignment to X is X := 3, so the propagation is correct
Example 2 Revisited

          X := 3
            |
          B > 0
          /    \
   Y := Z + W   Y := 0
   X := 4        |
          \     /
         A := 2 * X

• The condition fails: on the left path the last assignment to X is X := 4, so the use of X cannot be replaced by 3
Discussion
• The correctness condition is not trivial to check
• "All paths" includes paths around loops and through branches of conditionals
• Checking the condition requires global analysis
  • An analysis of the entire control-flow graph