muhammad-ali
8/2/2019 Comp Const Week1(Lec1 & 2)
Compiler Construction: Introduction
Background You Should Have
1. Programming: good C++ programmer (essential)
2. Basics of computer organization: architecture, pipelining, registers, assembly code
3. Basic theory of computation: finite automata, regular expressions, context-free grammars
Textbook and Other Classroom Material
Class textbook
Compilers: Principles, Techniques, and Tools, by Aho, Sethi, and Ullman (the "red dragon book")
Other useful books
Lex & Yacc, by Levine, Mason, and Brown
Advanced Compiler Design & Implementation, by Steven Muchnick
Building an Optimizing Compiler, by Robert Morgan
Modern Compiler Implementation in Java, by Andrew Appel
The Five Segments
Lexical and Syntax Analysis
Semantic Analysis
Code Generation
Data-flow Optimizations
Instruction Optimizations
Compiler
A compiler is a program that reads a program written in one language, the source language, and translates it into an equivalent program in another language, the target language.
Source Program -> Compiler -> Target Program
The compiler also emits error messages.
Why Compilers?
Compiler
A program that translates from one language to another
It must preserve the semantics of the source
It should create an efficient version in the target language
It reports to its user the presence of errors in the source program
The Phases of a Compiler
Conceptually, a compiler operates in phases, each of which transforms the source program from one representation to another.
Generically speaking, there are two parts to compilation:
Analysis
Synthesis
Many software tools that manipulate source programs first perform some kind of analysis, for example:
Structure Editors
Pretty Printers
Static Checkers
Interpreters
Structure Editors
Takes as input a sequence of commands to build a source program.
Performs text creation and modification like an ordinary text editor.
Also analyzes the program text, putting an appropriate hierarchical structure on the source program.
Examples:
Can check that the input is correctly formed
Can supply keywords automatically
Pretty Printers
Analyzes a program and prints it in such a way that the structure of the program becomes clearly visible.
For example:
Comments may appear in a special font
Statements may appear with an amount of indentation proportional to the depth of their nesting in the hierarchical organization of the statements
Static Checkers
Reads a program, analyzes it, and attempts to discover potential bugs without running the program.
For example:
It may detect parts of the source program that can never be executed, or that a certain variable might not be used.
It can catch logical errors such as trying to use a real variable as a pointer.
Interpreters
Instead of producing a target program as a translation, an interpreter performs the operations implied by the source program.
For example, an interpreter might build a tree and then carry out the operations at the nodes as it walks the tree.
Interpreters are frequently used to execute command languages.
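The tree-walking idea can be sketched in a few lines of C. This is only an illustrative sketch: the `Node` layout, the `NodeKind` names, and `eval` are hypothetical and are not taken from the lecture.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical expression-tree node: a number leaf or a binary operator. */
typedef enum { NODE_NUM, NODE_ADD, NODE_MUL } NodeKind;

typedef struct Node {
    NodeKind kind;
    double value;              /* used only by NODE_NUM */
    const struct Node *left;   /* used only by operator nodes */
    const struct Node *right;  /* used only by operator nodes */
} Node;

/* Walk the tree and carry out the operation found at each node. */
static double eval(const Node *n) {
    assert(n != NULL);
    switch (n->kind) {
    case NODE_NUM: return n->value;
    case NODE_ADD: return eval(n->left) + eval(n->right);
    case NODE_MUL: return eval(n->left) * eval(n->right);
    }
    return 0.0; /* not reached for well-formed trees */
}
```

For the tree of initial + rate * 60 with initial = 10 and rate = 2, `eval` visits the multiplication node first and then the addition, so no target program is ever produced.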
Seemingly Unrelated Places Where Compiler Technology Is Regularly Used
Text formatter
Takes as input a stream of characters, which may include commands to indicate paragraphs, figures, or mathematical structures like superscripts and subscripts.
Silicon Compilers
Has a source language similar to a conventional programming language. However, the variables of the language represent not locations in memory but logical signals (0 or 1) or groups of signals in a switching circuit. The output is a circuit design in an appropriate language.
Query Interpreters
Translates a predicate containing relational and Boolean operators into commands to search a database for records satisfying the predicate.
The Context of a Compiler
In addition to a compiler, several other programs may be required to create an executable target program.
A source program may be divided into modules stored in separate files. The task of collecting the source program is entrusted to a distinct program called a preprocessor.
A Language Processing System
Skeletal source program
-> preprocessor -> source program
-> compiler -> target assembly program
-> assembler -> relocatable machine code
-> loader/link-editor (also takes library and relocatable object files) -> absolute machine code
Figure: the phases of a compiler.
Source Program (character stream) -> Lexical Analyzer -> Token Stream -> Syntax Analyzer -> Parse Tree -> Semantic Analyzer -> Intermediate Representation -> Intermediate Code Generator -> Code Optimizer -> Optimized Intermediate Representation -> Code Generator -> Assembly code (Target Program)
The Symbol Table Manager and the Error Handler interact with all of the phases.
The Phases of a Compiler
Program (character stream) -> Lexical Analyzer (Scanner) -> Token Stream
Lexical Analyzer (Scanner) / Linear Analyzer
The stream of characters making up the source program is read from left to right and grouped into tokens, which are sequences of characters having a collective meaning.
For example:
position := initial + rate * 60
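The grouping of characters into tokens can be sketched as a small hand-written scanner in C. This is a minimal sketch under assumptions: the `Token` struct, the `TokenKind` names, and `next_token` are hypothetical, and only the token classes needed for the assignment statement are handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef enum {
    TOK_ID, TOK_NUM, TOK_ASSIGN, TOK_PLUS, TOK_STAR, TOK_ERROR, TOK_END
} TokenKind;

typedef struct {
    TokenKind kind;
    const char *start; /* first character of the lexeme */
    size_t length;     /* number of characters in the lexeme */
} Token;

/* Scan one token starting at *cursor and advance *cursor past it. */
static Token next_token(const char **cursor) {
    assert(cursor != NULL && *cursor != NULL);
    const char *p = *cursor;
    while (isspace((unsigned char)*p)) p++;   /* blanks only separate tokens */

    Token t = { TOK_END, p, 0 };
    if (*p == '\0') {
        /* end of input: zero-length TOK_END */
    } else if (isalpha((unsigned char)*p)) {
        while (isalnum((unsigned char)*p)) p++;
        t.kind = TOK_ID;
    } else if (isdigit((unsigned char)*p)) {
        while (isdigit((unsigned char)*p)) p++;
        t.kind = TOK_NUM;
    } else if (p[0] == ':' && p[1] == '=') {
        p += 2;
        t.kind = TOK_ASSIGN;
    } else if (*p == '+') {
        p++;
        t.kind = TOK_PLUS;
    } else if (*p == '*') {
        p++;
        t.kind = TOK_STAR;
    } else {
        p++;                                  /* unknown character */
        t.kind = TOK_ERROR;
    }
    t.length = (size_t)(p - t.start);
    *cursor = p;
    return t;
}
```

Calling `next_token` repeatedly on the statement above yields an identifier, the assignment symbol, an identifier, a plus sign, an identifier, a multiplication sign, and a number, followed by the end marker. A character such as `#` comes back as `TOK_ERROR`, which is where a real scanner would report a lexical error.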
Lexical Analyzer (Scanner)
Input: 234 * (11 + -22)
Tokens: Num(234) mul_op lpar_op Num(11) add_op Num(-22) rpar_op
Input: 18..23 + val#ue
Lexical Analyzer (Scanner)
Input: 234 * (11 + -22)
Tokens: Num(234) mul_op lpar_op Num(11) add_op Num(-22) rpar_op
Input: 18..23 + val#ue
18..23: not a number
val#ue: variable names cannot have a # character
The Phases of a Compiler
Program (character stream) -> Lexical Analyzer (Scanner) -> Token Stream -> Syntax Analyzer (Parser) -> Parse Tree
Syntax Analyzer (Parser)
Parse tree for: position := initial + rate * 60
Assignment Statement
  identifier: position
  :=
  expression: +
    identifier: initial
    expression: *
      identifier: rate
      number: 60
Syntax Analyzer (Parser)
Parse tree for: num * ( num + num )
*
  num
  ( )
    +
      num
      num
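How a parser recognizes a string such as num * ( num + num ) can be sketched as a recursive-descent parser in C. This is a hedged illustration, not the method used in the lecture: the grammar in the comments, the `Parser` struct, and the function names are assumptions, and for brevity the sketch reads digits directly from the text and evaluates as it parses instead of building a tree.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    const char *p; /* next unread character */
    bool ok;       /* cleared on the first syntax error */
} Parser;

static long parse_expr(Parser *ps);

static void skip_blanks(Parser *ps) {
    while (isspace((unsigned char)*ps->p)) ps->p++;
}

/* factor -> num | '(' expr ')' */
static long parse_factor(Parser *ps) {
    skip_blanks(ps);
    if (isdigit((unsigned char)*ps->p)) {
        long value = 0;
        while (isdigit((unsigned char)*ps->p))
            value = value * 10 + (*ps->p++ - '0');
        return value;
    }
    if (*ps->p == '(') {
        ps->p++;
        long value = parse_expr(ps);
        skip_blanks(ps);
        if (*ps->p == ')') ps->p++;
        else ps->ok = false;        /* missing ')' */
        return value;
    }
    ps->ok = false;                 /* neither a number nor '(' */
    return 0;
}

/* term -> factor { '*' factor } */
static long parse_term(Parser *ps) {
    long value = parse_factor(ps);
    for (skip_blanks(ps); *ps->p == '*'; skip_blanks(ps)) {
        ps->p++;
        value *= parse_factor(ps);
    }
    return value;
}

/* expr -> term { '+' term } */
static long parse_expr(Parser *ps) {
    long value = parse_term(ps);
    for (skip_blanks(ps); *ps->p == '+'; skip_blanks(ps)) {
        ps->p++;
        value += parse_term(ps);
    }
    return value;
}

/* Parse a whole string; *ok reports whether it was syntactically valid. */
static long parse(const char *text, bool *ok) {
    assert(text != NULL && ok != NULL);
    Parser ps = { text, true };
    long value = parse_expr(&ps);
    skip_blanks(&ps);
    if (*ps.p != '\0') ps.ok = false;   /* trailing input, e.g. an extra ')' */
    *ok = ps.ok;
    return value;
}
```

Each grammar rule becomes one function, so the call structure mirrors the shape of the parse tree. Input with an extra closing parenthesis is rejected because unread characters remain after the top-level expression.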
Syntax Analyzer (Parser)
int * foo(i, j, k))      <- extra parenthesis
int i;
int j;
{
for(i=0; i j) {          <- "i j" is not an expression; missing increment
fi(i>j)                  <- "fi" is not a keyword
return j;
}
The Phases of a Compiler
Program (character stream) -> Lexical Analyzer (Scanner) -> Token Stream -> Syntax Analyzer (Parser) -> Parse Tree -> Semantic Analyzer -> Intermediate Representation
Semantic Analyzer
Tree for position := initial + rate * 60 after semantic analysis:
:=
  position
  +
    initial
    *
      rate
      inttoreal
        60
Semantic Analyzer
int * foo(i, j, k)       <- type of k not declared
int i;
int j;
{
int x;
x = x + j + N;           <- uninitialized variable x used; N is an undeclared variable
return j;                <- mismatched return type
}
The Phases of a Compiler
Program (character stream) -> Lexical Analyzer (Scanner) -> Token Stream -> Syntax Analyzer (Parser) -> Parse Tree -> Semantic Analyzer -> Intermediate Representation -> Intermediate Code Generator -> Intermediate Code Representation
Intermediate Code Generator
Some compilers generate an explicit intermediate representation of the source program.
The intermediate representation can have a variety of forms.
One form is three-address code, which is like the assembly language for a machine in which every memory location can act like a register. It consists of a sequence of instructions, each of which has at most three operands.
Intermediate Code Generator
temp1 := inttoreal(60)
temp2 := id3 * temp1
temp3 := id2 + temp2
id1 := temp3
Another Example (Input Program)
int expr(int n)
{
    int d;
    d = 4 * n * n * (n + 1) * (n + 1);
    return d;
}
Example (Output assembly code)
lda $30,-32($30)
stq $26,0($30)
stq $15,8($30)
bis $30,$30,$15
bis $16,$16,$1
stl $1,16($15)
lds $f1,16($15)
sts $f1,24($15)
ldl $5,24($15)
bis $5,$5,$2
s4addq $2,0,$3
ldl $4,16($15)
mull $4,$3,$2
ldl $3,16($15)
addq $3,1,$4
mull $2,$4,$2
ldl $3,16($15)
addq $3,1,$4
mull $2,$4,$2
stl $2,20($15)
ldl $0,20($15)
br $31,$33
$33:
bis $15,$15,$30
ldq $26,0($30)
ldq $15,8($30)
addq $30,32,$30
ret $31,($26),1
The Phases of a Compiler
Program (character stream) -> Lexical Analyzer (Scanner) -> Token Stream -> Syntax Analyzer (Parser) -> Parse Tree -> Semantic Analyzer -> Intermediate Representation -> Intermediate Code Generator -> Intermediate Code Representation -> Code Optimizer -> Optimized Intermediate Representation
Code Optimizer
Attempts to improve the intermediate code so that faster-running machine code results.
temp1 := id3 * 60.0
id1 := id2 + temp1
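The effect of this optimization can be checked with a small C sketch that writes out both three-address sequences for position := initial + rate * 60 and confirms they compute the same value. The function names are hypothetical, and mapping id2 to initial and id3 to rate is an assumption based on the running example.

```c
#include <assert.h>

/* Three-address sequence as produced by the intermediate code generator. */
static double unoptimized(double id2, double id3) {
    double temp1 = (double)60;   /* temp1 := inttoreal(60) */
    double temp2 = id3 * temp1;  /* temp2 := id3 * temp1   */
    double temp3 = id2 + temp2;  /* temp3 := id2 + temp2   */
    return temp3;                /* id1   := temp3         */
}

/* Sequence after the optimizer converts 60 at compile time and removes
   the copy through temp3. */
static double optimized(double id2, double id3) {
    double temp1 = id3 * 60.0;   /* temp1 := id3 * 60.0 */
    return id2 + temp1;          /* id1   := id2 + temp1 */
}

/* Self-check: both sequences must agree for the given inputs. */
static void check_same(double id2, double id3) {
    assert(unoptimized(id2, id3) == optimized(id2, id3));
}
```

The optimized form uses two instructions instead of four, and the int-to-real conversion no longer happens at run time.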
Optimization
How to make the code go faster
Classical optimizations
Dead code elimination: remove useless code
Common subexpression elimination: avoid recomputing the same thing multiple times
Machine independent (classical)
Useful for almost all architectures
Machine dependent
Depends on processor architecture
Memory system, branches, dependences
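The classical optimizations can be illustrated by applying them by hand to a small loop. The two functions below are hypothetical examples chosen for illustration, not code from the lecture; `demo_after` is what an optimizer might produce from `demo_before`, and both return the same result for any input where `b` is nonzero and no overflow occurs.

```c
#include <assert.h>

/* Hypothetical unoptimized loop. */
static int demo_before(int a, int b, int n) {
    assert(b != 0);
    int x = 0;
    int y = 0;                      /* y never changes */
    for (int i = 0; i <= n; i++) {
        x = x + (4 * a / b) * i + (i + 1) * (i + 1);
        x = x + b * y;              /* always adds 0 */
    }
    return x;
}

/* The same computation after hand-applied classical optimizations. */
static int demo_after(int a, int b, int n) {
    assert(b != 0);
    int x = 0;
    int u = 4 * a / b;  /* loop-invariant code moved out of the loop */
    int v = 0;          /* strength reduction: v tracks u * i by addition */
    for (int i = 0; i <= n; i++) {
        int t = i + 1;  /* common subexpression computed once */
        x = x + v + t * t;
        v = v + u;
    }
    /* Constant propagation (y = 0), algebraic simplification (b * 0 = 0,
       x + 0 = x) and dead code elimination removed y entirely. */
    return x;
}
```

The loop body drops from two multiplications by a loop-invariant value, a division, and a repeated `i + 1` to one multiplication and a few additions.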
Example
int sumcalc(int a, int b, int N){
int i;
int x, y;
x = 0;
y = 0;
for(i = 0; i
test:
subu $fp, 16
sw zero, 0($fp) # x = 0
sw zero, 4($fp) # y = 0
sw zero, 8($fp) # i = 0
lab1: # for(i=0;i
Let's Optimize...
int sumcalc(int a, int b, int N){
int i;
int x, y;
x = 0;
y = 0;
for(i = 0; i
Constant Propagation
int sumcalc(int a, int b, int N){
int i;
int x, y;
x = 0;
y = 0;
for(i = 0; i
Algebraic Simplification
int sumcalc(int a, int b, int N){
int i;
int x, y;
x = 0;
y = 0;
for(i = 0; i
Copy Propagation
int sumcalc(int a, int b, int N){
int i;
int x, y;
x = 0;
y = 0;
for(i = 0; i
Common Subexpression Elimination
int sumcalc(int a, int b, int N){
int i;
int x, y;
x = 0;
y = 0;
for(i = 0; i
int sumcalc(int a, int b, int N){
int i;
int x, y, t;
x = 0;
y = 0;
for(i = 0; i
Dead Code Elimination
int sumcalc(int a, int b, int N){
int i;
int x, y, t;
x = 0;
y = 0;
for(i = 0; i
Dead Code Elimination
int sumcalc(int a, int b, int N){
int i;
int x, y, t;
x = 0;
for(i = 0; i
Dead Code Elimination
int sumcalc(int a, int b, int N){
int i;
int x, t;
x = 0;
for(i = 0; i
Loop Invariant Removal
int sumcalc(int a, int b, int N){
int i;
int x, t;
x = 0;
for(i = 0; i
Loop Invariant Removal
int sumcalc(int a, int b, int N){
int i;
int x, t, u;
x = 0;
u = (4*a/b);
for(i = 0; i
Strength Reduction
int sumcalc(int a, int b, int N){
int i;
int x, t, u;
x = 0;
u = (4*a/b);
for(i = 0; i
Strength Reduction
int sumcalc(int a, int b, int N){
int i;
int x, t, u, v;
x = 0;
u = (4*a/b);
v = 0;
for(i = 0; i
Optimized Example
int sumcalc(int a, int b, int N){
int i;
int x, t, u, v;
x = 0;
u = (4*a/b);
v = 0;
for(i = 0; i
subu $fp, 16
add $t9, zero, zero # x = 0
sll $t0, $a0, 2 # a
Optimized code (execution time = 17 sec)
test:
subu $fp, 16
add $t9, zero, zero
sll $t0, $a0, 2
div $t7, $t0, $a1
add $t6, zero, zero
add $t5, zero, zero
lab1:
addui $t8, $t5, 1
mul $t0, $t8, $t8
addu $t1, $t0, $t6
addu $t9, $t9, $t1
addu $t6, $t6, $t7
addui $t5, $t5, 1
ble $t5, $a3, lab1
addu $v0, $t9, zero
addu $fp, 16
b $ra
Unoptimized code (execution time = 43 sec)
test:
subu $fp, 16
sw zero, 0($fp)
sw zero, 4($fp)
sw zero, 8($fp)
lab1:
mul $t0, $a0, 4
div $t1, $t0, $a1
lw $t2, 8($fp)
mul $t3, $t1, $t2
lw $t4, 8($fp)
addui $t4, $t4, 1
lw $t5, 8($fp)
addui $t5, $t5, 1
mul $t6, $t4, $t5
addu $t7, $t3, $t6
lw $t8, 0($fp)
add $t8, $t7, $t8
sw $t8, 0($fp)
lw $t0, 4($fp)
mul $t1, $t0, $a1
lw $t2, 0($fp)
add $t2, $t2, $t1
sw $t2, 0($fp)
lw $t0, 8($fp)
addui $t0, $t0, 1
sw $t0, 8($fp)
ble $t0, $a3, lab1
lw $v0, 0($fp)
addu $fp, 16
b $ra
Compilers Optimize Programs for
Performance/Speed
Code Size
Power Consumption
Fast/Efficient Compilation
Security/Reliability
Debugging
Code Generator
MOVF id3, R2
MULF #60.0,R2
MOVF id2, R1
ADDF R2, R1
MOVF R1, id1
Code Generator
test:
subu $fp, 16
add $t9, zero, zero
sll $t0, $a0, 2
div $t7, $t0, $a1
add $t6, zero, zero
add $t5, zero, zero
lab1:
addui $t8, $t5, 1
mul $t0, $t8, $t8
addu $t1, $t0, $t6
addu $t9, $t9, $t1
addu $t6, $t6, $t7
addui $t5, $t5, 1
ble $t5, $a3, lab1
addu $v0, $t9, zero
addu $fp, 16
b $ra
int sumcalc(int a, int b, int N)
{
int i;
int x, t, u, v;
x = 0;
u = ((a
Symbol Table Manager
Error Detection and Reporting
Dataflow and Control Flow Analysis
Dataflow analysis
Provides the necessary information about variable usage and execution behavior to determine when a transformation is legal or illegal
Control flow analysis
Examines the execution behavior caused by control statements
Ifs, for/while loops, gotos
Control flow graph
Cousins of the Compiler (Preprocessors)
Preprocessors produce input to compilers.
1. Macro processing
Lets the user define macros that are shorthand for longer constructs.
2. File inclusion
Includes header files into the program text.
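Both preprocessor functions can be seen in a few lines of C. This is a small sketch: the `SQUARE` macro and the `area_of_square` function are hypothetical names used only for illustration.

```c
#include <assert.h>   /* file inclusion: the header's text is pasted in here */

/* Macro processing: SQUARE(x) is shorthand that is expanded textually
   before the compiler proper sees the program. */
#define SQUARE(x) ((x) * (x))

static int area_of_square(int side) {
    assert(side >= 0);
    return SQUARE(side);   /* expands to ((side) * (side)) */
}
```

The parentheses in the macro body matter because expansion is textual: `SQUARE(1 + 2)` becomes `((1 + 2) * (1 + 2))`. Many C compilers can show the preprocessed text on request (for example with the `-E` option), which is a convenient way to inspect what the compiler actually receives.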
Cousins of the Compiler
3. Rational preprocessors
Augment older languages with modern flow-of-control and data-structuring facilities.
4. Language extensions
These processors attempt to add capabilities to the language by what amounts to built-in macros.
For example, Equel is a database query language embedded in C; statements beginning with ## are taken by the preprocessor to be database-access statements.
Cousins of the Compiler (Assemblers)
An assembler produces relocatable machine code that can be passed directly to the loader/link-editor.
Assembly code is a mnemonic version of machine code.
Example (computes b := a + 2):
MOV a, R1
ADD #2, R1
MOV R1, b
Cousins of the Compiler (Assemblers)
Two-Pass Assembly (two passes over the input, each pass reading the input file once)
First pass: identifiers are found and stored in a symbol table.
Identifier  Address
a           0
b           4
Assume that a word consisting of four bytes is set aside for each identifier, and that addresses are assigned starting from byte 0.
Cousins of the Compiler (Assemblers)
Two-Pass Assembly
Second pass:
The assembler scans the input again.
It translates each operation code into the sequence of bits representing that operation in machine language.
It translates each identifier into the address given for that identifier in the symbol table.
The output is relocatable machine code.
Cousins of the Compiler (Assemblers)
A hypothetical machine code
Instruction  R1  Tag  Address/value
0001         01  00   00000000
0011         01  10   00000010
0010         01  00   00000100
Operation codes: 0001 = LOAD, 0011 = ADD, 0010 = STORE
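The two passes can be sketched in C for the three-instruction example. This is a sketch under assumptions: instructions arrive already parsed into a hypothetical `Instr` struct, each instruction is packed into a 16-bit word (4-bit operation code, 2-bit register, 2-bit tag, 8-bit address or value), and all type and function names are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { OP_LOAD = 0x1, OP_STORE = 0x2, OP_ADD = 0x3 };  /* 0001, 0010, 0011 */
enum { TAG_ADDRESS = 0x0, TAG_IMMEDIATE = 0x2 };        /* 00, 10 */
#define MAX_SYMBOLS 16

typedef struct {
    unsigned op;         /* 4-bit operation code */
    unsigned reg;        /* 2-bit register number */
    const char *symbol;  /* identifier, or NULL for an immediate operand */
    unsigned immediate;  /* used only when symbol is NULL */
} Instr;

typedef struct { const char *name; unsigned address; } Symbol;

/* Pass 1: record each identifier once, one 4-byte word each from byte 0. */
static size_t pass1(const Instr *prog, size_t n, Symbol *table) {
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (prog[i].symbol == NULL) continue;
        size_t j = 0;
        while (j < count && strcmp(table[j].name, prog[i].symbol) != 0) j++;
        if (j == count) {                 /* first time this name is seen */
            assert(count < MAX_SYMBOLS);
            table[count].name = prog[i].symbol;
            table[count].address = (unsigned)(4 * count);
            count++;
        }
    }
    return count;
}

static unsigned lookup(const Symbol *table, size_t count, const char *name) {
    for (size_t j = 0; j < count; j++)
        if (strcmp(table[j].name, name) == 0) return table[j].address;
    assert(0 && "identifier missing from symbol table");
    return 0;
}

/* Pass 2: translate operation codes and identifiers into bit patterns. */
static void pass2(const Instr *prog, size_t n,
                  const Symbol *table, size_t count, uint16_t *out) {
    for (size_t i = 0; i < n; i++) {
        unsigned tag = prog[i].symbol ? TAG_ADDRESS : TAG_IMMEDIATE;
        unsigned operand = prog[i].symbol
            ? lookup(table, count, prog[i].symbol)
            : prog[i].immediate;
        out[i] = (uint16_t)((prog[i].op << 12) | (prog[i].reg << 10) |
                            (tag << 8) | (operand & 0xFF));
    }
}
```

For MOV a, R1; ADD #2, R1; MOV R1, b, the first pass assigns a to address 0 and b to address 4, and the second pass produces the words 0x1400, 0x3602, and 0x2404, which are the three bit patterns in the table above.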
Cousins of the Compiler (Loaders and Link-Editors)
Two functions:
Loading
Linking
Loading consists of:
Taking relocatable machine code
Altering the relocatable addresses
Placing the altered instructions and data in memory at the proper locations
Cousins of the Compiler (Loaders and Link-Editors)
A relocation bit is associated with each operand in relocatable machine code.
Suppose the address space containing the data is to be loaded starting at location L.
Then L must be added to the address of each instruction whose relocation bit is set.
Cousins of the Compiler (Loaders and Link-Editors)
If L = 00001111, i.e. 15, then a and b would be at 15 and 19.
0001 01 00 00000000
0011 01 10 00000010
0010 01 00 00000100
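The loader's address adjustment can be sketched in C. The sketch assumes the same 16-bit packing as the listing (4-bit operation code, 2-bit register, 2-bit tag, 8-bit address or value) and assumes that the operands a and b are relocatable while the immediate value 2 is not; the `ObjectWord` struct and `relocate` function are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t word;    /* packed instruction */
    int relocatable;  /* relocation bit: nonzero if the address must move */
} ObjectWord;

/* Add the load address L to the address field of every relocatable word. */
static void relocate(ObjectWord *code, size_t n, uint8_t load_address) {
    assert(code != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!code[i].relocatable) continue;       /* immediates stay as is */
        uint8_t address = (uint8_t)(code[i].word & 0xFF);
        address = (uint8_t)(address + load_address);
        code[i].word = (uint16_t)((code[i].word & 0xFF00) | address);
    }
}
```

With L = 15, the address fields 00000000 and 00000100 become 00001111 and 00010011 (15 and 19), while the ADD instruction keeps its immediate value 00000010.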
Cousins of the Compiler (Loaders and Link-Editors)
The link-editor makes a single program from several files of relocatable machine code.
The files may be the result of several different compilations.
One or more may be library files of routines.
General Structure of a Modern Compiler
Source Program
-> Lexical Analysis (scanner)
-> Syntax Analysis (parser)
-> Semantic Analysis (build high-level IR; context; symbol table)
-> High-level IR to low-level IR conversion
-> Controlflow/Dataflow (CFG)
-> Optimization
-> Code Generation (machine-independent asm to machine-dependent)
-> Assembly Code
The diagram groups these stages into a front end and a back end.
Front and Back Ends
Front end
Phases, or parts of phases, that depend primarily on the source language
Back end
Phases, or parts of phases, that depend primarily on the target machine
Generally does not depend on the source language