
CS1622 - University of Pittsburgh · people.cs.pitt.edu/~mock/cs1622/lectures/lecture17.pdf · 2005-03-31


CS 1622 Lecture 17 1

CS1622

Lecture 15 Code Generation

CS 1622 Lecture 17 2

Status
We have covered the front-end phases
  Lexical analysis
  Parsing
  Semantic analysis
Next are the back-end phases
  Optimization
  Code generation
We’ll do code generation first . . .

CS 1622 Lecture 17 3

Run-time environments
Before discussing code generation, we need to understand what we are trying to generate
There are a number of standard techniques for structuring executable code that are widely used
Run-time environment - what environment exists when the code is executing?


CS 1622 Lecture 17 4

Issues in run-time environments
Management of run-time resources
Correspondence between static (compile-time) and dynamic (run-time) structures
Storage organization

CS 1622 Lecture 17 5

Run-time Resources
Execution of a program is initially under the control of the operating system
When a program is invoked:
  The OS allocates space for the program
  The code is loaded into part of the space
  The OS jumps to the entry point (i.e., “main”)

CS 1622 Lecture 17 6

Memory Layout
[Figure: memory drawn as a column from Low Address to High Address, divided into Code and Other Space]


CS 1622 Lecture 17 7

Notes
By tradition, pictures of machine organization have:
  Low address at the top
  High address at the bottom
  Lines delimiting areas for different kinds of data
These pictures are simplifications
  E.g., not all memory need be contiguous

CS 1622 Lecture 17 8

What is Other Space?
Holds all data for the program
Other Space = Data Space
Compiler is responsible for:
  Generating code
  Orchestrating use of the data area (that is, the data space)

CS 1622 Lecture 17 9

Code Generation Goals
Two goals:
  Correctness
  Speed
Most complications in code generation come from trying to be fast as well as correct


CS 1622 Lecture 17 10

Assumptions about Execution
(1) Execution is sequential; control moves from one point in a program to another in a well-defined order
(2) When a procedure is called, control eventually returns to the point immediately after the call
Do these assumptions always hold? When don’t they?

CS 1622 Lecture 17 11

Procedure Activations
Run-time environments are structured around the concept of a procedure
Data space is organized around the data needed for calls on procedures - activations
Call on P - create an activation of P

CS 1622 Lecture 17 12

Activations
An invocation of procedure P is an activation of P
The lifetime of an activation of P is
  All the steps to execute P
  Including all the steps in procedures P calls


CS 1622 Lecture 17 13

Lifetimes of Variables
The lifetime of a variable x is the portion of execution in which x is defined
Note that
  Lifetime is a dynamic (run-time) concept
  Scope is a static concept

CS 1622 Lecture 17 14

Activation Trees
Assumption (2) on slide 10 requires that when P calls Q, then Q returns before P does
Lifetimes of procedure activations are properly nested
Activation lifetimes can be depicted as a tree

CS 1622 Lecture 17 15

Example
Class Main {
  g() : Int { 1 };
  f(): Int { g() };
  main(): Int {{ g(); f(); }};
}
[Figure: activation tree with root Main; Main has children g and f; f has child g]


CS 1622 Lecture 17 16

Example 2
Class Main {
  g() : Int { 1 };
  f(x:Int): Int { if x = 0 then g() else f(x - 1) fi };
  main(): Int {{ f(3); }};
}
What is the activation tree for this example?

CS 1622 Lecture 17 17

Notes
The activation tree depends on run-time behavior
The activation tree may be different for every program input
Because of assumption 2, procedure invocations obey a “last in - first out” discipline: the last called is the first to exit
Therefore, can use a stack to hold procedure data
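
The last-in, first-out discipline can be made concrete with a small Python sketch of Example 2. This is only an illustration under stated assumptions: the Tracer class, its call method, and the render helper are hypothetical names introduced here, not part of the lecture. Each procedure reports its entry and exit, so the sketch records both the activation tree and the stack of live activations.

```python
class Tracer:
    """Hypothetical helper: records activations as they begin and end."""

    def __init__(self):
        self.stack = []      # names of the currently active procedures
        self.tree = []       # (depth, name) pairs in call order
        self.max_depth = 0   # deepest stack seen during the run

    def call(self, name, body):
        self.tree.append((len(self.stack), name))
        self.stack.append(name)              # activation begins: push
        self.max_depth = max(self.max_depth, len(self.stack))
        result = body()
        popped = self.stack.pop()            # activation ends: pop
        assert popped == name                # last called is first to exit
        return result


def run_example2(n):
    """Python rendering of Example 2: main calls f(n); f recurses down to g."""
    t = Tracer()

    def g():
        return t.call("g", lambda: 1)

    def f(x):
        return t.call(f"f({x})", lambda: g() if x == 0 else f(x - 1))

    result = t.call("main", lambda: f(n))
    return result, t


def render(tree):
    """Indent each activation by its depth to show the tree shape."""
    return "\n".join("  " * depth + name for depth, name in tree)


result, tracer = run_example2(3)
print(render(tracer.tree))
```

For input 3 the tree is a single chain (main, f(3), f(2), f(1), f(0), g), and the stack reaches six entries before it unwinds. A different argument to f gives a different tree, which is the point of the note that the tree depends on program input.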

CS 1622 Lecture 17 18

Example
Class Main {
  g() : Int { 1 };
  f(): Int { g() };
  main(): Int {{ g(); f(); }};
}
[Figure: activation tree so far: Main]
Stack (bottom to top): Main


CS 1622 Lecture 17 19

Example
Class Main {
  g() : Int { 1 };
  f(): Int { g() };
  main(): Int {{ g(); f(); }};
}
[Figure: activation tree so far: Main with child g]
Stack (bottom to top): Main, g

CS 1622 Lecture 17 20

Example
Class Main {
  g() : Int { 1 };
  f(): Int { g() };
  main(): Int {{ g(); f(); }};
}
[Figure: activation tree so far: Main with children g and f]
Stack (bottom to top): Main, f

CS 1622 Lecture 17 21

Example
Class Main {
  g() : Int { 1 };
  f(): Int { g() };
  main(): Int {{ g(); f(); }};
}
[Figure: complete activation tree: Main with children g and f; f has child g]
Stack (bottom to top): Main, f, g


CS 1622 Lecture 17 22

Revised Memory Layout
[Figure: memory drawn as a column from Low Address to High Address, divided into Code and Stack]

CS 1622 Lecture 17 23

Activation Records
The information needed to manage one procedure activation is called an activation record (AR) or frame
If procedure F calls G, then G’s activation record contains a mix of info about F and G.

CS 1622 Lecture 17 24

What is in G’s AR when F calls G?
F is “suspended” until G completes, at which point F resumes. G’s AR contains information needed to resume execution of F.
G’s AR may also contain:
  G’s return value (needed by F)
  Actual parameters to G (supplied by F)
  Space for G’s local variables


CS 1622 Lecture 17 25

The Contents of a Typical AR for G
Space for G’s return value
Actual parameters
Pointer to the previous activation record
  The control link; points to AR of caller of G
Machine status prior to calling G
  Contents of registers & program counter
Local variables
Other temporary values

CS 1622 Lecture 17 26

Example 2, Revisited
Class Main {
  g() : Int { 1 };
  f(x:Int): Int { if x = 0 then g() else f(x - 1) (**) fi };
  main(): Int {{ f(3); (*) }};
}
AR for f: result, argument, control link, return address

CS 1622 Lecture 17 27

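Stack After Two Calls to f
[Figure: the stack holds three ARs]
  Main
  f: (result), argument 3, control link, return address (*)
  f: (result), argument 2, control link, return address (**)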
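
The same stack can be written out as data. The sketch below is an illustration only: the ActivationRecord class and its field names are assumptions chosen to mirror the four fields listed for f (result, argument, control link, return address), and the return addresses are kept as the labels (*) and (**) rather than real addresses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ActivationRecord:
    """Hypothetical AR layout with the four fields named on the slide."""
    procedure: str
    result: Optional[int]                        # filled in when the call returns
    argument: Optional[int]
    control_link: Optional["ActivationRecord"]   # AR of the caller
    return_address: Optional[str]                # where the caller resumes


def stack_after_two_calls():
    """Build the stack as it looks once main has called f(3) and f(3) has called f(2)."""
    main_ar = ActivationRecord("main", None, None, None, None)
    f3 = ActivationRecord("f", None, 3, main_ar, "(*)")
    f2 = ActivationRecord("f", None, 2, f3, "(**)")
    return [main_ar, f3, f2]   # the last element is the top of the stack


stack = stack_after_two_calls()
for ar in stack:
    caller = ar.control_link.procedure if ar.control_link else None
    print(ar.procedure, ar.argument, ar.result, caller, ar.return_address)
```

Following control_link from the top record walks back through the callers, which is how the caller's frame is found again when a call returns.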


CS 1622 Lecture 17 28

Notes
Main has no argument or local variables and its result is never used; its AR is uninteresting
(*) and (**) are return addresses of the invocations of f
  The return address is where execution resumes after a procedure call finishes
This is only one of many possible AR designs
  Would also work for C, Pascal, FORTRAN, etc.

CS 1622 Lecture 17 29

The Main Point

The compiler must determine, at compile-time, the layout of activation records and generate code that correctly accesses locations in the activation record

Thus, the AR layout and the code generator must be designed together!

CS 1622 Lecture 17 30

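Example
The picture shows the state after the 2nd invocation of f returns
[Figure: the stack holds three ARs]
  Main
  f: (result), argument 3, control link, return address (*)
  f: result 1, argument 2, control link, return address (**)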


CS 1622 Lecture 17 31

Discussion
The advantage of placing the return value 1st in a frame is that the caller can find it at a fixed offset from its own frame
There is nothing magic about this organization
  Can rearrange order of frame elements
  Can divide caller/callee responsibilities differently
  An organization is better if it improves execution speed or simplifies code generation

CS 1622 Lecture 17 32

Discussion (Cont.)
Real compilers hold as much of the frame as possible in registers
  Especially the method result and arguments

CS 1622 Lecture 17 33

Globals
All references to a global variable point to the same object
  Can’t store a global in an activation record
Globals are assigned a fixed address once
  Variables with fixed address are “statically allocated”
Depending on the language, there may be other statically allocated values


CS 1622 Lecture 17 34

Memory Layout with Static Data
[Figure: memory drawn as a column from Low Address to High Address, with areas labeled Code, Static Data, and Stack]

CS 1622 Lecture 17 35

Heap Storage
A value that outlives the procedure that creates it cannot be kept in the AR
  method foo() { new Bar }
  The Bar value must survive deallocation of foo’s AR
Languages with dynamically allocated data use a heap to store dynamic data

CS 1622 Lecture 17 36

Notes
The code area contains object code
  For most languages, fixed size and read only
The static area contains data (not code) with fixed addresses (e.g., global data)
  Fixed size, may be readable or writable
The stack contains an AR for each currently active procedure
  Each AR usually fixed size, contains locals
Heap contains all other data
  In C, heap is managed by malloc and free


CS 1622 Lecture 17 37

Notes (Cont.)
Both the heap and the stack grow
Must take care that they don’t grow into each other
Solution: start heap and stack at opposite ends of memory and let them grow towards each other

CS 1622 Lecture 17 38

Memory Layout with Heap
[Figure: memory drawn as a column from Low Address to High Address, with areas labeled Code, Static Data, Stack, and Heap]

CS 1622 Lecture 17 39

Data Layout
Low-level details of machine architecture are important in laying out data for correct code and maximum performance
Chief among these concerns is alignment


CS 1622 Lecture 17 40

Alignment
Most modern machines are (still) 32 bit
  8 bits in a byte
  4 bytes in a word
  Machines are either byte or word addressable
Data is word aligned if it begins at a word boundary
Most machines have some alignment restrictions
  Or performance penalties for poor alignment

CS 1622 Lecture 17 41

Alignment (Cont.)
Example: A string “Hello”
  Takes 5 characters (without a terminating \0)
To word align next datum, add 3 “padding” characters to the string
The padding is not part of the string, it’s just unused memory
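
The padding arithmetic can be checked with a short Python sketch. It assumes the 4-byte word from the previous slide; the names WORD_SIZE, padding_for, and align_up are hypothetical helpers introduced for illustration.

```python
WORD_SIZE = 4  # bytes per word, as assumed on the Alignment slide


def padding_for(size, alignment=WORD_SIZE):
    """Bytes of padding needed after `size` bytes to reach the next boundary."""
    return (alignment - size % alignment) % alignment


def align_up(offset, alignment=WORD_SIZE):
    """Smallest multiple of `alignment` that is >= offset."""
    return offset + padding_for(offset, alignment)


hello_size = len("Hello")          # 5 characters, no terminating \0
print(padding_for(hello_size))     # padding bytes after the string
print(align_up(hello_size))        # offset where the next datum can start
```

For "Hello" this gives 3 padding bytes, so the next datum starts at offset 8. A size that is already a multiple of 4 needs no padding, which is why the outer modulo is there.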

CS 1622 Lecture 17 42

Code Generation
Stack machines
The MIPS assembly language


CS 1622 Lecture 17 43

Stack Machines
A simple evaluation model
No variables or registers
A stack of values for intermediate results
Each instruction:
  Takes its operands from the top of the stack
  Removes those operands from the stack
  Computes the required operation on them
  Pushes the result on the stack

CS 1622 Lecture 17 44

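Example of Stack Machine Operation
The addition operation on a stack machine
[Figure: starting from the stack 5, 7, 9, … (top first), pop 5 and 7, add them (5 ⊕ 7 = 12), and push the result, leaving 12, 9, …]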

CS 1622 Lecture 17 45

Example of a Stack Machine Program
Consider two instructions
  push i - place the integer i on top of the stack
  add - pop two elements, add them and put the result back on the stack
A program to compute 7 + 5:
  push 7
  push 5
  add


CS 1622 Lecture 17 46

Why Use a Stack Machine?
Each operation takes operands from the same place and puts results in the same place
This means a uniform compilation scheme
And therefore a simpler compiler

CS 1622 Lecture 17 47

Why Use a Stack Machine?
Location of the operands is implicit
  Always on the top of the stack
No need to specify operands explicitly
No need to specify the location of the result
Instruction “add” as opposed to “add r1, r2”
  ⇒ Smaller encoding of instructions
  ⇒ More compact programs
This is one reason why Java Bytecodes use a stack evaluation model

CS 1622 Lecture 17 48

Optimizing the Stack Machine
The add instruction does 3 memory operations
  Two reads and one write to the stack
  The top of the stack is frequently accessed
Idea: keep the top of the stack in a register (called accumulator)
  Register accesses are faster
The “add” instruction is now acc = acc + top_of_stack
  Only one memory operation!


CS 1622 Lecture 17 49

Stack Machine with Accumulator
Invariants
The result of computing an expression is always in the accumulator
For an operation op(e1,…,en) push the accumulator on the stack after computing each of e1,…,en-1
  After the operation pop n-1 values
After computing an expression the stack is as before

CS 1622 Lecture 17 50

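Stack Machine with Accumulator. Example
Compute 7 + 5 using an accumulator
[Figure: acc and stack after each step]
  acc = 7; push acc           acc: 7    stack: 7, …
  acc = 5                     acc: 5    stack: 7, …
  acc = acc + top_of_stack    acc: 12   stack: 7, …   (5 ⊕ 7)
  pop                         acc: 12   stack: …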

CS 1622 Lecture 17 51

A Bigger Example: 3 + (7 + 5)
Code                        Acc   Stack
acc = 3                     3     <init>
push acc                    3     3, <init>
acc = 7                     7     3, <init>
push acc                    7     7, 3, <init>
acc = 5                     5     7, 3, <init>
acc = acc + top_of_stack    12    7, 3, <init>
pop                         12    3, <init>
acc = acc + top_of_stack    15    3, <init>
pop                         15    <init>
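
The trace for 3 + (7 + 5) can be reproduced with a small Python simulation of the accumulator machine. The instruction names used here ("load" for acc = i, plus "push", "add", and "pop") and the function run_acc are assumptions for this sketch; the slides write the instructions as assignments.

```python
def run_acc(program, initial_stack=()):
    """Simulate a stack machine with an accumulator; return (acc, stack, trace)."""
    acc = None
    stack = list(initial_stack)          # the last element is the top
    trace = []
    for instr in program:
        op = instr[0]
        if op == "load":                 # acc = i
            acc = instr[1]
        elif op == "push":               # push acc
            stack.append(acc)
        elif op == "add":                # acc = acc + top_of_stack (one memory read)
            acc = acc + stack[-1]
        elif op == "pop":                # discard the top of the stack
            stack.pop()
        else:
            raise ValueError(f"unknown instruction: {op}")
        # Record the stack top-first, matching the table's Stack column.
        trace.append((op, acc, list(reversed(stack))))
    return acc, stack, trace


PROGRAM = [
    ("load", 3), ("push",),
    ("load", 7), ("push",),
    ("load", 5), ("add",), ("pop",),
    ("add",), ("pop",),
]

acc, stack, trace = run_acc(PROGRAM, ["<init>"])
for op, a, s in trace:
    print(f"{op:5} acc={a:<3} stack={s}")
```

The accumulator ends at 15 and the stack is back to its initial contents, so both invariants of the accumulator scheme hold for this program.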


CS 1622 Lecture 17 52

Notes
It is very important that the stack is preserved across the evaluation of a subexpression
  Stack before the evaluation of 7 + 5 is 3, <init>
  Stack after the evaluation of 7 + 5 is 3, <init>
  The first operand is on top of the stack

CS 1622 Lecture 17 53

From Stack Machines to MIPS
The compiler generates code for a stack machine with accumulator
We want to run the resulting code on the MIPS processor (or simulator)
We simulate stack machine instructions using MIPS instructions and registers

CS 1622 Lecture 17 54

Simulating a Stack Machine…
The accumulator is kept in MIPS register $a0
The stack is kept in memory
The stack grows towards lower addresses
  Standard convention on the MIPS architecture
The address of the next location on the stack is kept in MIPS register $sp
  The top of the stack is at address $sp + 4
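
These conventions can be modeled in a few lines of Python. This is a sketch, not MIPS: the class name MipsStack, the dictionary used as memory, and the starting address 1000 are all assumptions chosen for illustration. The comments name the MIPS instruction each step corresponds to.

```python
class MipsStack:
    """Hypothetical model of the stack convention: $sp is the next free word."""

    def __init__(self, initial_sp=1000):
        self.memory = {}         # address -> 32-bit word (modeled as a Python int)
        self.sp = initial_sp     # address of the next location on the stack

    def push(self, value):
        self.memory[self.sp] = value   # sw $a0 0($sp)
        self.sp -= 4                   # addiu $sp $sp -4 (grows towards lower addresses)

    def top(self):
        return self.memory[self.sp + 4]  # lw $t1 4($sp)

    def pop(self):
        self.sp += 4                   # addiu $sp $sp 4


s = MipsStack()
s.push(7)
s.push(5)
print(s.sp, s.top())   # $sp after two pushes, and the word on top
s.pop()
print(s.sp, s.top())   # after one pop, the earlier value is on top again
```

Because $sp names the next free word rather than the top element, a push stores first and then moves $sp, and the top element is always read at offset 4.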


CS 1622 Lecture 17 55

MIPS Assembly
MIPS architecture
  Prototypical Reduced Instruction Set Computer (RISC) architecture
  Arithmetic operations use registers for operands and results
  Must use load and store instructions to use operands and results in memory
  32 general purpose registers (32 bits each)
    We will use $sp, $a0 and $t1 (a temporary register)
Read the SPIM tutorial for more details

CS 1622 Lecture 17 56

A Sample of MIPS Instructions
lw reg1 offset(reg2)
  Load 32-bit word from address reg2 + offset into reg1
add reg1 reg2 reg3
  reg1 = reg2 + reg3
sw reg1 offset(reg2)
  Store 32-bit word in reg1 at address reg2 + offset
addiu reg1 reg2 imm
  reg1 = reg2 + imm
  “u” means overflow is not checked
li reg imm
  reg = imm

CS 1622 Lecture 17 57

MIPS Assembly. Example.
The stack-machine code for 7 + 5 in MIPS:
acc = 7                     li $a0 7
push acc                    sw $a0 0($sp)
                            addiu $sp $sp -4
acc = 5                     li $a0 5
acc = acc + top_of_stack    lw $t1 4($sp)
                            add $a0 $a0 $t1
pop                         addiu $sp $sp 4

• We now generalize this to a simple language…
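
To check that the seven MIPS instructions really compute 7 + 5, the following Python sketch interprets just the five instructions sampled earlier (lw, sw, add, addiu, li). It is a toy, not SPIM: the function run_mips, the dictionary-based memory, and the starting $sp of 1000 are assumptions, and 32-bit wraparound is not modeled.

```python
import re


def run_mips(source, sp=1000):
    """Interpret a tiny subset of MIPS written one instruction per line."""
    regs = {"$a0": 0, "$t1": 0, "$sp": sp}
    memory = {}
    for line in source.strip().splitlines():
        op, *args = line.split()
        if op == "li":                       # reg = imm
            regs[args[0]] = int(args[1])
        elif op == "add":                    # reg1 = reg2 + reg3
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "addiu":                  # reg1 = reg2 + imm
            regs[args[0]] = regs[args[1]] + int(args[2])
        elif op in ("lw", "sw"):             # operand has the form offset(reg)
            offset, base = re.fullmatch(r"(-?\d+)\((\$\w+)\)", args[1]).groups()
            address = regs[base] + int(offset)
            if op == "lw":
                regs[args[0]] = memory[address]
            else:
                memory[address] = regs[args[0]]
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return regs, memory


SEVEN_PLUS_FIVE = """
li $a0 7
sw $a0 0($sp)
addiu $sp $sp -4
li $a0 5
lw $t1 4($sp)
add $a0 $a0 $t1
addiu $sp $sp 4
"""

regs, memory = run_mips(SEVEN_PLUS_FIVE)
print(regs["$a0"], regs["$sp"])
```

The result 12 ends up in $a0 and $sp returns to its starting value, matching the two requirements placed on generated code: the value is in the accumulator and the stack is preserved.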


CS 1622 Lecture 17 58

A Small Language
A language with integers and integer operations
P ⇒ D; P | D
D ⇒ def id(ARGS) = E;
ARGS ⇒ id, ARGS | id
E ⇒ int | id | if E1 = E2 then E3 else E4
  | E1 + E2 | E1 - E2 | id(E1,…,En)

CS 1622 Lecture 17 59

A Small Language (Cont.)
The first function definition f is the “main” routine
Running the program on input i means computing f(i)
Program for computing the Fibonacci numbers:
  def fib(x) = if x = 1 then 0 else
               if x = 2 then 1 else
               fib(x - 1) + fib(x - 2)
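
For readers who want to see what this program computes, here is a direct transliteration into Python. It is only a reading aid, not part of the small language. As in the original definition, inputs below 1 are not handled, so the recursion does not terminate for them.

```python
def fib(x):
    """Transliteration of the small-language program: fib(1) = 0, fib(2) = 1."""
    if x == 1:
        return 0
    elif x == 2:
        return 1
    else:
        return fib(x - 1) + fib(x - 2)


# "Running the program on input i" corresponds to evaluating fib(i).
print([fib(i) for i in range(1, 9)])
```

Every construct used here (integer constants, a variable, if/then/else on an equality test, +, -, and a function call) appears in the grammar for E.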

CS 1622 Lecture 17 60

Code Generation Strategy
For each expression e we generate MIPS code that:
  Computes the value of e in $a0
  Preserves $sp and the contents of the stack
We define a code generation function cgen(e) whose result is the code generated for e


CS 1622 Lecture 17 61

Code Generation for Constants
The code to evaluate a constant simply copies it into the accumulator:
  cgen(i) = li $a0 i
Note that this also preserves the stack, as required

CS 1622 Lecture 17 62

Code Generation for Add
cgen(e1 + e2) =
  cgen(e1)
  sw $a0 0($sp)
  addiu $sp $sp -4
  cgen(e2)
  lw $t1 4($sp)
  add $a0 $t1 $a0
  addiu $sp $sp 4
Possible optimization: Put the result of e1 directly in register $t1?
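
The two cgen rules seen so far (constants and addition) can be sketched as a recursive Python function that returns the generated MIPS instructions as a list of strings. The expression encoding is an assumption made for this sketch: an int stands for a constant and a tuple ("+", e1, e2) stands for e1 + e2. The other expression forms of the small language are not covered.

```python
def cgen(e):
    """Return MIPS code (a list of lines) that leaves the value of e in $a0."""
    if isinstance(e, int):
        return [f"li $a0 {e}"]                     # cgen(i) = li $a0 i
    if isinstance(e, tuple) and len(e) == 3 and e[0] == "+":
        _, e1, e2 = e
        return (
            cgen(e1)
            + ["sw $a0 0($sp)", "addiu $sp $sp -4"]   # push the value of e1
            + cgen(e2)
            + ["lw $t1 4($sp)",                       # reload the value of e1
               "add $a0 $t1 $a0",                     # acc = e1 + e2
               "addiu $sp $sp 4"]                     # pop, restoring $sp
        )
    raise ValueError(f"unsupported expression: {e!r}")


code = cgen(("+", 3, ("+", 7, 5)))
print("\n".join(code))
```

Each addition contributes exactly one push and one pop around the code for its second operand, so the generated code restores $sp no matter how deeply the expression nests. That nesting is also why the value of e1 has to be saved on the stack before cgen(e2) runs: the code for e2 overwrites $a0, and nested additions inside e2 would overwrite $t1 as well.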