Introduction to Compilers Jianlin Feng School of Software SUN YAT-SEN UNIVERSITY


Computers run 0/1 strings (machine language program)

0010001000000100
0010010000000100
0001011001000010
0011011000000011
1111000000100101
0000000000000101
0000000000000110
0000000000000000

A machine language program that adds two numbers

• First 4 bits for opcode
• Last 12 bits for operands

Source: Louden and Lambert’s book, Programming Languages
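The word layout described above (4 opcode bits followed by 12 operand bits) can be illustrated with a small Python sketch. The function name `decode` and the list name `PROGRAM` are hypothetical; the eight 16-bit words are the bit string from the slide, split every 16 bits. The sketch only separates the two fields and makes no claim about what each opcode means.

```python
# The slide's machine language program, split into 16-bit words.
PROGRAM = [
    "0010001000000100",
    "0010010000000100",
    "0001011001000010",
    "0011011000000011",
    "1111000000100101",
    "0000000000000101",
    "0000000000000110",
    "0000000000000000",
]


def decode(word):
    """Split one 16-bit word into (opcode as int, 12 operand bits as str)."""
    if len(word) != 16 or set(word) - {"0", "1"}:
        raise ValueError(f"not a 16-bit binary word: {word!r}")
    opcode = int(word[:4], 2)   # first 4 bits
    operands = word[4:]         # last 12 bits
    return opcode, operands


if __name__ == "__main__":
    for word in PROGRAM:
        opcode, operands = decode(word)
        print(f"{word}  opcode={opcode:2d}  operands={operands}")
```

Even with the fields separated, the program stays hard to read, which is the motivation for assembly and high-level languages.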

Programmers write more readable character strings

An assembly language program that adds two numbers, from Louden and Lambert’s book.

Even more readable character strings: high-level languages

• Imperative languages specify HOW: Fortran, ALGOL, Pascal, C, C++, Java
• Declarative languages specify WHAT: SQL, ML, Prolog
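To make the HOW versus WHAT contrast concrete, here is a minimal Python sketch that answers the same question twice: once with an explicit loop, and once with a SQL query run through the standard `sqlite3` module. The table name `sailors`, the sample rows in `RATINGS`, and both function names are assumptions made for illustration only.

```python
import sqlite3

# Hypothetical sample data: (name, rating) pairs.
RATINGS = [("dustin", 7), ("lubber", 3), ("rusty", 10)]


def high_rated_imperative(rows, threshold):
    """Imperative style: spell out HOW to find the names, step by step."""
    names = []
    for name, rating in rows:
        if rating > threshold:
            names.append(name)
    return sorted(names)


def high_rated_declarative(rows, threshold):
    """Declarative style: state WHAT is wanted; the SQL engine decides how."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sailors (sname TEXT, rating INTEGER)")
        conn.executemany("INSERT INTO sailors VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT sname FROM sailors WHERE rating > ? ORDER BY sname",
            (threshold,),
        )
        return [row[0] for row in cursor]
    finally:
        conn.close()
```

Both functions return the same names for the same input; only the first one fixes the order of the individual steps.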

Models of Computation in Languages

Underlying most programming languages is a model of computation:

Procedural: Fortran (1957)

Functional: Lisp (1958)

Object oriented: Simula (1967)

Logic: Prolog (1972)

Relational algebra: SQL (1974)

Source: A. V. Aho. Lectures of Programming Languages and Translators

Programming Languages Evolve: Java as an Example

Java 1.0, 1996
• Object-oriented
• The language of choice for internet applet programs.

Java 8, 2014
• Changing computing background: multicore and processing big data.
• Java 8 streams support a database-query style of programming.
• Java 8 incorporates many ideas from functional programming.

What is a compiler?

A Compiler is a translator between computers and programmers

More generally speaking, a Compiler is a translator between source strings and target strings.
• between assembly language and Fortran
• between Java and Java Bytecode
• between Java and SQL

Assembly language vs Fortran

Source: Stephen A. Edwards. Lectures of Programming Languages and Translators

The Structure of a Compiler

1. Lexical Analysis
2. Syntax Analysis (or Parsing)
3. Semantic Analysis
4. Intermediate Code Generation
5. Code Optimization
6. Code Generation

Translation of an assignment statement (1)

Translation of an assignment statement (2)
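As a rough illustration of how an assignment statement moves through the first phases of a compiler, the following Python sketch covers lexical analysis, syntax analysis, and intermediate code generation for statements of the form `id = expression`. It is not the example from the slides: the statement `x = a + b * 60`, the token kinds, and the function names `tokenize`, `parse`, `generate`, and `compile_assignment` are all assumptions, and semantic analysis, optimization, and target code generation are left out.

```python
import re


def tokenize(src):
    """Lexical analysis: turn characters into (kind, text) tokens."""
    tokens = []
    for text in re.findall(r"\d+|[A-Za-z_]\w*|[-+*/=()]", src):
        if text[0].isdigit():
            kind = "num"
        elif text[0].isalpha() or text[0] == "_":
            kind = "id"
        else:
            kind = "op"
        tokens.append((kind, text))
    return tokens


def parse(tokens):
    """Syntax analysis: build a tree for `id = expr` by recursive descent."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("eof", "")

    def take(kind=None, text=None):
        nonlocal pos
        tok = peek()
        if (kind and tok[0] != kind) or (text and tok[1] != text):
            raise SyntaxError(f"unexpected token {tok}")
        pos += 1
        return tok

    def factor():
        tok = peek()
        if tok[0] in ("num", "id"):
            return take()          # leaf: a 2-tuple (kind, text)
        take("op", "(")
        node = expr()
        take("op", ")")
        return node

    def term():
        node = factor()
        while peek() in (("op", "*"), ("op", "/")):
            op = take()[1]
            node = (op, node, factor())   # interior node: a 3-tuple
        return node

    def expr():
        node = term()
        while peek() in (("op", "+"), ("op", "-")):
            op = take()[1]
            node = (op, node, term())
        return node

    target = take("id")
    take("op", "=")
    tree = ("=", target, expr())
    take("eof")
    return tree


def generate(tree):
    """Intermediate code generation: emit three-address instructions."""
    code = []
    counter = 0

    def walk(node):
        nonlocal counter
        if len(node) == 2:         # leaf: return its text
            return node[1]
        op, left, right = node
        left_name, right_name = walk(left), walk(right)
        counter += 1
        temp = f"t{counter}"
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    _, target, rhs = tree
    result = walk(rhs)
    code.append(f"{target[1]} = {result}")
    return code


def compile_assignment(src):
    return generate(parse(tokenize(src)))


if __name__ == "__main__":
    for line in compile_assignment("x = a + b * 60"):
        print(line)
```

For the hypothetical input `x = a + b * 60`, the sketch emits `t1 = b * 60`, `t2 = a + t1`, and `x = t2`, which shows how operator precedence decided by the parser fixes the order of the generated instructions.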

Translation of SQL query

SELECT S.sname
FROM Reserves R, Sailors S
WHERE R.sid=S.sid AND R.bid=100 AND S.rating>5

[Figure: query tree. Reserves and Sailors are joined on sid=sid; above the join come the selection bid=100, rating > 5 and the projection on sname.]

• Query can be converted to relational algebra
• Relational Algebra converts to tree, joins form branches
• Each operator has implementation choices
• Operators can also be applied in different order!

$\pi_{sname}(\sigma_{bid=100 \wedge rating>5}(Reserves \bowtie Sailors))$
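The relational algebra form of the query can be evaluated directly on small in-memory tables. The Python sketch below does this in two operator orders to show that both compute the same answer. The sample rows and the helper names `join`, `select`, and `project` are hypothetical; the slides give no table contents.

```python
# Hypothetical sample rows.
SAILORS = [
    {"sid": 1, "sname": "Ann", "rating": 8},
    {"sid": 2, "sname": "Bob", "rating": 4},
    {"sid": 3, "sname": "Cy", "rating": 9},
]
RESERVES = [
    {"sid": 1, "bid": 100},
    {"sid": 2, "bid": 100},
    {"sid": 3, "bid": 101},
]


def join(left, right, key):
    """Equi-join: pair up rows whose `key` columns match."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def select(rows, predicate):
    """Selection: keep the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def project(rows, column):
    """Projection: keep a single column."""
    return [row[column] for row in rows]


def plan_select_after_join():
    """Join first, then apply both selection conditions."""
    joined = join(RESERVES, SAILORS, "sid")
    kept = select(joined, lambda r: r["bid"] == 100 and r["rating"] > 5)
    return project(kept, "sname")


def plan_select_before_join():
    """Apply each selection to its own table, then join the smaller inputs."""
    reserves = select(RESERVES, lambda r: r["bid"] == 100)
    sailors = select(SAILORS, lambda s: s["rating"] > 5)
    return project(join(reserves, sailors, "sid"), "sname")
```

The second plan joins fewer rows than the first, which is the kind of difference a query optimizer looks for.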

Cost-based Query Sub-System

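[Figure: components of the cost-based query sub-system. Queries (for example, SELECT * FROM Blah B WHERE B.blah = blah) go to the Query Parser, the Query Optimizer (Plan Generator, Plan Cost Estimator), and the Query Executor; the Catalog Manager holds Schema and Statistics.]

Usually there is a heuristics-based rewriting step before the cost-based steps.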

Motivating Example

Cost: 500+500*1000 I/Os
• By no means the worst plan!
• Misses several opportunities: selections could be 'pushed' down; no use made of indexes.
• Goal of optimization: find faster plans that compute the same answer.

SELECT S.sname
FROM Reserves R, Sailors S
WHERE R.sid=S.sid AND R.bid=100 AND S.rating>5

[Figure: plan for the query. Sailors and Reserves are joined on sid=sid (page-oriented nested loops); the selection bid=100, rating > 5 and the projection on sname are applied on-the-fly.]

Plan cost: 500,500 IOs
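The 500,500 figure follows from the usual cost formula for a page-oriented nested loops join: read the outer relation once, and read the whole inner relation once per outer page. The Python sketch below reproduces the arithmetic. The page counts (500 pages for Sailors as the outer relation, 1000 pages for Reserves as the inner) are assumptions inferred from the expression 500+500*1000 on the slide, and the function name is hypothetical.

```python
def page_nested_loops_cost(outer_pages, inner_pages):
    """I/Os for a page-oriented nested loops join.

    The outer relation is scanned once; the inner relation is scanned
    once for every page of the outer relation.
    """
    return outer_pages + outer_pages * inner_pages


# Assumed page counts, inferred from the slide's 500+500*1000.
SAILORS_PAGES = 500
RESERVES_PAGES = 1000

if __name__ == "__main__":
    cost = page_nested_loops_cost(SAILORS_PAGES, RESERVES_PAGES)
    print(f"{cost:,} I/Os")
```

The selection and projection add no I/O in this plan because they are applied on-the-fly to the join output.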

Alternative Plans – Push Selects (No Indexes)

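[Figure: two plans side by side. First plan: Sailors and Reserves joined on sid=sid (page-oriented nested loops), with the selection bid=100, rating > 5 and the projection on sname applied on-the-fly. Second plan: the same page-oriented nested loops join on sid=sid, with rating > 5 and bid=100 applied as separate on-the-fly selections and sname projected on-the-fly.]

250,500 IOs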

Alternative Plans – Push Selects (No Indexes)

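[Figure: two plans side by side. First plan: Sailors and Reserves joined on sid=sid (page-oriented nested loops), with rating > 5 and bid=100 applied as separate on-the-fly selections and sname projected on-the-fly. Second plan: Sailors and Reserves joined on sid=sid (page-oriented nested loops), with the selections bid = 100 and rating > 5 and the projection on sname applied on-the-fly.]

250,500 IOs

500 + 1000 + 250 + 250*10 = 4250 IOs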
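Choosing among alternative plans amounts to estimating each plan's cost and keeping the cheapest. The Python sketch below compares the three I/O costs that appear on these slides. The labels `plan A`, `plan B`, and `plan C` and the function name `cheapest` are hypothetical, and the breakdown 500 + 250*1000 for the 250,500 figure is an assumption (it presumes the selection on rating keeps 250 of the 500 Sailors pages); the other two expressions are taken from the slides.

```python
# Estimated I/O cost of each plan.
PLAN_COSTS = {
    # Selections applied after the join (expression from the slide).
    "plan A": 500 + 500 * 1000,
    # Assumed breakdown of the slide's 250,500 IOs.
    "plan B": 500 + 250 * 1000,
    # Expression from the last slide.
    "plan C": 500 + 1000 + 250 + 250 * 10,
}


def cheapest(costs):
    """Return the name of the plan with the lowest estimated cost."""
    return min(costs, key=costs.get)


if __name__ == "__main__":
    for name, cost in PLAN_COSTS.items():
        print(f"{name}: {cost:,} I/Os")
    print("cheapest:", cheapest(PLAN_COSTS))
```

All three plans compute the same answer; only the estimated number of I/Os differs, which is what a cost-based optimizer compares.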