System Software & Memory Management


    UNIT 1

    1.1 Basic Compiler Functions

    1.2 Grammars

    1.3 Lexical Analysis

    1.4 Syntactic Analysis

    1.5 Code Generation

    1.6 Heap Management

    1.7 Parameter Passing Methods

    1.8 Semantics of Calls and Returns

    1.9 Implementing Subprograms

    1.10 Stack Dynamic Local Variables

    1.11 Dynamic binding of method calls to methods

1.12 Overview of Memory Management, Virtual Memory, Process Creation

    1.13 Overview of I/O Systems, Device Drivers, System Boot

    1.1 Basic Compiler Functions

Compiler

A compiler translates from one representation of a program to another, typically from high-level source code to low-level machine code or object code.

Source code is normally optimized for human readability:
- Expressive: matches our notion of languages (and of the application?!)


- Redundant: helps avoid programming errors

Machine code is optimized for hardware:
- Redundancy is reduced
- Information about the intent is lost

Machine code should execute faster and use fewer resources.

How to translate

Source code and machine code mismatch in level of abstraction; we have to take several steps to go from source code to machine code. Some languages are farther from machine code than others.

Goals of translation:
- High level of abstraction
- Good performance for the generated code
- Good compile-time performance
- Maintainable code

The big picture

The compiler is part of a program development environment.


The other typical components of this environment are the editor, assembler, linker, loader, debugger, profiler, etc.

The profiler identifies the parts of the program which are slow or which have never been executed.

The compiler (and all other tools) must support each other for easy program development.

    1.2 Context Free Grammars

    Introduction

Finite automata accept all regular languages and only regular languages. Many simple languages are non-regular:

{a^n b^n : n = 0, 1, 2, ...}

{w : w is a palindrome}

and there is no finite automaton that accepts them.


Context-free languages are a larger class of languages that encompasses all regular languages and many others, including the two above.

    Context-Free Grammars

Languages that are generated by context-free grammars are context-free languages.

Context-free grammars are more expressive than finite automata: if a language L is accepted by a finite automaton then L can be generated by a context-free grammar.

Beware: the converse is NOT true.

Definition.

A context-free grammar is a 4-tuple (Σ, NT, R, S), where:
- Σ is an alphabet (each character in Σ is called a terminal)
- NT is a set (each element in NT is called a nonterminal)
- R, the set of rules, is a subset of NT × (Σ ∪ NT)*
- if (A, α) ∈ R, we write the production A → α; α is called a sentential form
- S, the start symbol, is one of the symbols in NT

CFGs: Alternate Definition

Many textbooks use different symbols and terms to describe CFGs:

G = (V, Σ, P, S)

V = variables, a finite set
Σ = alphabet or terminals, a finite set
P = productions, a finite set
S = start variable, S ∈ V


Productions have the form A → α, where A ∈ V, α ∈ (V ∪ Σ)*.

    Derivations

Definition. v is one-step derivable from u, written u ⇒ v, if:

u = xαz, v = xβz, and α → β is in R

Definition. v is derivable from u, written u ⇒* v, if:

there is a chain of one-step derivations of the form u ⇒ u1 ⇒ u2 ⇒ ... ⇒ v

    Context-Free Languages

Definition. Given a context-free grammar G = (Σ, NT, R, S), the language generated or derived from G is the set:

L(G) = {w : S ⇒* w}

Definition. A language L is context-free if there is a context-free grammar G = (Σ, NT, R, S), such that L is generated from G.

    CFGs & CFLs: Example 1

{a^n b^n | n ≥ 0}

One of our canonical non-RLs.

S → ε | a S b

Formally: G = ({S}, {a, b},


{S → ε, S → a S b}, S)

    CFGs & CFLs: Example 2

    all strings of balanced parentheses

    A core idea of most programming languages.

    Another non-RL.

P → ε | ( P ) | P P
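To connect the grammar to running code, here is a minimal C sketch (the function name balanced is ours, not from the notes) that recognizes exactly the language generated by P → ε | ( P ) | P P, using a counter in place of the stack a pushdown machine would use:

#include <stdbool.h>

/* Returns true iff s consists of balanced parentheses. */
bool balanced(const char *s) {
    int depth = 0;                    /* number of currently open '(' */
    for (; *s; s++) {
        if (*s == '(') depth++;
        else if (*s == ')' && --depth < 0)
            return false;             /* ')' without a matching '(' */
    }
    return depth == 0;                /* every '(' must be closed */
}

For example, balanced("(()())") returns true and balanced(")(") returns false.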

    CFGs & CFLs: Lessons

Both examples used a common CFG technique: wrapping around a recursive variable.

S → a S b
P → ( P )

    CFGs & CFLs: Non-Example

{a^n b^n c^n | n ≥ 0}

Can't be done; see the CFL pumping lemma later.

Intuition: we can count up to n, then count down from n, but counting down forgets n, so a third block cannot be matched. I.e., a stack serves as a counter. We will see this when using the machine corresponding to CFGs.

    Parse Tree

A parse tree of a derivation is a tree in which:
- Each internal node is labeled with a nonterminal
- If a rule A → A1 A2 ... An occurs in the derivation then A is a parent node of nodes labeled A1, A2, ..., An


S → A | A B
A → ε | a | A b | A A
B → b | bc | B c | b B

    Sample derivations:

S ⇒ AB ⇒ AAB ⇒ aAB ⇒ aaB ⇒ aabB ⇒ aabb

S ⇒ AB ⇒ AbB ⇒ Abb ⇒ AAbb ⇒ Aabb ⇒ aabb

These two derivations use the same productions, but in different orders. This ordering difference is often uninteresting; derivation trees give a way to abstract away ordering differences.

    Leftmost, Rightmost Derivations


    Definition.

A leftmost derivation of a sentential form is one in which rules transforming the leftmost nonterminal are always applied.

Definition.

A rightmost derivation of a sentential form is one in which rules transforming the rightmost nonterminal are always applied.

    Leftmost & Rightmost Derivations

S → A | A B
A → ε | a | A b | A A
B → b | bc | B c | b B

    Sample derivations:

S ⇒ AB ⇒ AAB ⇒ aAB ⇒ aaB ⇒ aabB ⇒ aabb

S ⇒ AB ⇒ AbB ⇒ Abb ⇒ AAbb ⇒ Aabb ⇒ aabb

These two derivations are special: the 1st derivation is leftmost (it always picks the leftmost variable), and the 2nd is rightmost (it always picks the rightmost variable).


In proofs:
o Restrict attention to left- or rightmost derivations.

In parsing algorithms:
o Restrict attention to left- or rightmost derivations.
o E.g., recursive descent uses leftmost; yacc uses rightmost.

    Derivation Trees

    Ambiguous Grammar

Definition. A grammar G is ambiguous if there is a word w ∈ L(G) having at least two different parse trees.

S → A
S → B
S → AB
A → aA


B → bB
A → ε
B → ε

Notice that a has at least two leftmost derivations.

    Ambiguity

A CFG is ambiguous iff any of the following equivalent statements holds:
o there is a string w with multiple derivation trees.
o there is a string w with multiple leftmost derivations.
o there is a string w with multiple rightmost derivations.

This defines ambiguity of a grammar, not of a language.

Ambiguity & Disambiguation

Given an ambiguous grammar, we would like an equivalent unambiguous grammar.
o Allows you to know more about the structure of a given derivation.
o Simplifies inductive proofs on derivations.
o Can lead to more efficient parsing algorithms.
o In programming languages, we want to impose a canonical structure on derivations, e.g. for 1+2*3.

Strategy: force an ordering on all derivations.

    Disambiguation: Example 1

Exp → n
    | Exp + Exp
    | Exp * Exp

What is an equivalent unambiguous grammar?

Exp → Term
    | Term + Exp


Term → n
     | n * Term

This uses operator precedence and a fixed associativity to force one derivation order.

    Disambiguation

What is a general algorithm?

None exists! There are CFLs that are inherently ambiguous: every CFG for such a language is ambiguous.

E.g., {a^n b^n c^m d^m | n ≥ 1, m ≥ 1} ∪ {a^n b^m c^m d^n | n ≥ 1, m ≥ 1}.

So, we can't necessarily eliminate ambiguity!

CFG Simplification

We can't always eliminate ambiguity, but CFG simplification & restriction are still useful theoretically & pragmatically:
o Simpler grammars are easier to understand.
o Simpler grammars can lead to faster parsing.
o Restricted forms are useful for some parsing algorithms.
o Restricted forms can give you more knowledge about derivations.

    CFG Simplification: Example

    How can the following be simplified?

S → A B
S → A C D
A → A a
A → a
A → a A
A → a
C → ε
D → d D
D → E
E → e A e
F → f f


1) Delete B: useless because nothing is derivable from B.
2) Delete either A → A a or A → a A.
3) Delete one of the identical productions A → a.
4) Delete C → ε, and also replace S → A C D with S → A D.
5) Replace D → E with D → e A e.
6) Delete E: useless after change #5.
7) Delete F: useless because not derivable from S.

    CFG Simplification

Eliminate ambiguity.
Eliminate useless variables.
Eliminate ε-productions: A → ε.
Eliminate unit productions: A → B.
Eliminate redundant productions.
Trade left- & right-recursion.

    Trading Left- & Right-Recursion

Left recursion: A → A a
Right recursion: A → a A

Most algorithms have trouble with one or the other. In recursive descent, avoid left recursion.

    1.3 Lexical Analysis

Recognizing words is not completely trivial; therefore, we must know what the word separators are (blank, punctuation, etc.).

The language must define rules for breaking a sentence into a sequence of words.

Compilers are much less smart than humans, so programming languages must have more specific rules.

Normally white space and punctuation are word separators in languages.


In programming languages a character from a different class may also be treated as a word separator, e.g. in if (a==b).

The lexical analyzer breaks a sentence into a sequence of words or tokens. For example, If a == b then a = 1 ; else a = 2 ; is the sequence of 14 tokens

if  a  ==  b  then  a  =  1  ;  else  a  =  2  ;

(a toy scanner sketch follows below).
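As a concrete illustration, a toy C scanner (hypothetical, not the notes' lexer) that splits such a statement into tokens, treating white space as a separator and recognizing == as a single two-character token:

#include <ctype.h>
#include <stdio.h>

/* Prints one token per line for input like
   "If a == b then a = 1 ; else a = 2 ;" */
void tokenize(const char *s) {
    while (*s) {
        if (isspace((unsigned char)*s)) { s++; continue; }  /* skip separators */
        const char *start = s;
        if (isalnum((unsigned char)*s)) {                   /* word or number */
            while (isalnum((unsigned char)*s)) s++;
        } else if (s[0] == '=' && s[1] == '=') {            /* the == operator */
            s += 2;
        } else {                                            /* single-char token: = ; ( ) */
            s++;
        }
        printf("%.*s\n", (int)(s - start), start);
    }
}

Running tokenize on the statement above prints the 14 tokens listed.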

    The next step

Once the words are understood, the next step is to understand the structure of the sentence. The process is known as syntax checking or parsing.

Parsing

Parsing a program is exactly the same. Consider the statement if x == y then z = 1 else z = 2: it parses into an if-stmt node with three children, the predicate (x == y), the then-stmt (z = 1), and the else-stmt (z = 2).

So we have a lexical analyzer and then a parser.

    Semantic Analysis

Too hard for compilers. They do not have capabilities similar to human understanding.


However, compilers do perform analysis to understand the meaning and catch inconsistencies.

Programming languages define strict rules to avoid such ambiguities:

{ int Amit = 3;
  { int Amit = 4;
    cout << Amit;   // unambiguously refers to the inner Amit
  }
}


    Lexical Analysis

Recognize tokens and ignore white space and comments.

Generates a token stream.

Error reporting: model tokens using regular expressions; if a token is not valid (does not fall into any of the identifiable groups), then we have to tell the user about it.

Recognize tokens using finite state automata.

1.4 Syntax Analysis

Check syntax and construct the abstract syntax tree.

Error reporting and recovery: error recovery is a very important part of the syntax analyzer. The compiler should not stop when it sees an error, but keep processing the input. It should report each error only once and give only appropriate error messages. It should also


Static checking:
- Type checking
- Control flow checking
- Uniqueness checking
- Name checks

We will have to generate type information for every node. It is not important for nodes like if and ;

    Code Optimization

There is no strong counterpart in English, but it is similar to editing / précis writing: automatically modify programs so that they
- run faster
- use fewer resources (memory, registers, space, fewer fetches, etc.)

Some common optimizations (a combined before/after fragment follows below):
- Common sub-expression elimination: in A = X + Y; P = R + X + Y, save the value of X + Y to prevent recalculation.
- Copy propagation: whenever we are using a copy of a variable, we should be able to use the original variable itself instead of the copy.
- Dead code elimination: remove code whose result is never used.
- Code motion: move unnecessary (loop-invariant) code out of loops.
- Strength reduction: 2*x is replaced by x+x because addition is cheaper than multiplication; x/2 is replaced by a right shift.
- Constant folding. Example: x = 15 * 3 is transformed to x = 45.
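A small illustrative fragment (ours, in C-like notation) showing several of these transformations at once:

Before:
  a = x + y;
  p = r + x + y;
  b = 15 * 3;
  c = 2 * d;

After:
  t = x + y;      (common sub-expression computed once)
  a = t;
  p = r + t;      (saved value reused)
  b = 45;         (constant folding)
  c = d + d;      (strength reduction)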


Compilation is done only once, but execution happens many times, so optimization pays off.

20-80 and 10-90 rule: with 20% of the effort we can get an 80% speedup; for a further 10% speedup we have to put in 90% of the effort. But in some programs we may need to extract every bit of optimization.

    1.5 Code Generation

Code generation is usually a two-step process:
- Generate intermediate code from the semantic representation of the program.
- Generate machine code from the intermediate code.

The advantage is that each phase is simple. This requires the design of an intermediate language. Most compilers perform translation between successive intermediate representations:

token stream → abstract syntax tree → annotated abstract syntax tree → intermediate code

Intermediate languages are generally ordered in decreasing level of abstraction, from highest (source) to lowest (machine). However, typically the representation after intermediate code generation is the most important.

    Intermediate Code Generation

Abstraction at the source level: identifiers, operators, expressions, control flow, statements, conditionals, iteration, functions (user defined, system defined, or libraries).


Abstraction at the target level: memory locations, registers, stack, opcodes, addressing modes, system libraries, interface to the operating system.

Code generation is a mapping from source-level abstractions to target machine abstractions:
- Map identifiers to locations (memory/storage allocation).
- Explicate variable accesses (change an identifier reference to a relocatable/absolute address).
- Map source operators to opcodes or sequences of opcodes.
- Convert conditionals and iterations to test/jump or compare instructions.
- Lay out parameter passing protocols: locations for parameters, return values, layout of activation frames, etc. We should know where to pick up the parameters for a function and at which location to store the result.
- Interface calls to libraries, the runtime system, and the operating system: application library, runtime library, OS system calls.

Post-translation Optimizations

Algebraic transformations and re-ordering:
- Remove/simplify operations like multiplication by 1, multiplication by 0, addition with 0.
- Reorder instructions based on commutative properties of operators; for example x+y is the same as y+x (always?).

Instruction selection:
- Addressing mode selection
- Opcode selection
- Peephole optimization


Information required about program variables during compilation:
- Class of variable: keyword, identifier, etc.


- Type of variable: integer, float, array, function, etc.
- Amount of storage required
- Address in memory
- Scope information

Where to store this information:
- As attributes with the variable. This has obvious problems: we need a large amount of memory, and for a = a + b we would have to make changes in all the structures associated with a, so consistency is a problem.
- At a central repository, where every phase refers to the repository whenever information is required.

Normally the second approach is preferred: use a data structure called a symbol table.

    Final Compiler structure


    Advantages of the model

This is also known as the analysis-synthesis model of compilation: front end phases are known as analysis phases, back end phases as synthesis phases.

Each phase has well-defined work and handles a logical activity in the process of compilation.

The compiler is retargetable, and source- and machine-independent code optimization is possible: the optimization phase can be inserted after the front and back end phases have been developed and deployed.

    Specifications and Compiler Generator

How do we write specifications of the source language and the target machine? The language is broken into sub-components like lexemes, structure, semantics, etc. Each component can be specified separately. For example, an identifier may be specified as

a string of characters that contains at least one letter and starts with a letter, followed by letters or digits: letter(letter|digit)*

Similarly, syntax and semantics can be described (a checker for this specification is sketched below).
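A minimal C sketch of a checker for exactly this specification (is_identifier is our name for it):

#include <ctype.h>
#include <stdbool.h>

/* Accepts letter(letter|digit)*: a letter followed by letters or digits. */
bool is_identifier(const char *s) {
    if (!isalpha((unsigned char)*s))
        return false;                 /* must start with a letter */
    while (*++s)
        if (!isalnum((unsigned char)*s))
            return false;             /* the rest: letters or digits only */
    return true;
}

For example, is_identifier("x1") is true, while is_identifier("1x") and is_identifier("") are false.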


    Tool based Compiler Development

Retargeting Compilers

Changing the specifications of a phase can lead to a new compiler:
- If the machine specifications are changed, the compiler can generate code for a different machine without changing any other phase.
- If the front end specifications are changed, we get a compiler for a new language.

Tool based compiler development cuts down development/maintenance time by almost 30-40%. Tool development/testing is a one-time effort, and compiler performance can be improved by improving a tool and/or the specification for a particular phase.


    Bootstrapping

A compiler is a complex program and should not be written in assembly language.

How do we write a compiler for a language in the same language (the first time!)? The first time this experiment was done was for Lisp. Initially, Lisp was used as a notation for writing functions; the functions were then hand-translated into assembly language and executed. McCarthy wrote a function eval[e,a] in Lisp that took a Lisp expression e as an argument. The function was later hand-translated, and it became an interpreter for Lisp.

A compiler can be characterized by three languages: the source language (S), the target language (T), and the implementation language (I). The three languages S, I, and T can be quite different; such a compiler is called a cross-compiler. This is represented by a T-diagram, which in textual form can be written as SIT.

Write a cross compiler for a language L in implementation language S to generate code for machine N. An existing compiler for S runs on a different machine M and generates code for M. When the compiler LSN is run through SMM, we get the compiler LMN: a compiler for L that runs on M and generates code for N.

    Bootstrapping a Compiler


Bootstrapping a Compiler: the Complete Picture


    1.7 Parameter Passing

Some routines and calls in external Fortran classes are compiled using the Fortran parameter passing convention. This section describes how this is achieved.

Routines without bodies in external Fortran classes, and Fortran routines (routines whose return types and all arguments are Fortran types), are compiled as described below. The explanation is given in terms of mapping the original Sather signatures to C prototypes. All Fortran types are assumed to have corresponding C types defined; for example, the F_INTEGER class maps onto the F_INTEGER C type. See http://www.gnu.org/software/sather/docs-1.2/tutorial/fortran-portability.html for details on how this can be achieved in a portable fashion. The examples are used to illustrate parameter passing only; the actual binding of function names is irrelevant for this purpose.

    1.7.1. Return Types

Routines that return F_INTEGER, F_REAL, F_LOGICAL, and F_DOUBLE map to C functions that return the corresponding C types. A routine that returns F_COMPLEX or F_DOUBLE_COMPLEX is equivalent to a C routine with an extra initial argument preceding the other arguments in the argument list. This initial argument points to the storage for the return value.

    F_COMPLEX foo(i:F_INTEGER,a:F_REAL);

    -- this Sather signature is equivalent to

    void foo(F_COMPLEX* ret_val, F_INTEGER* i_address, F_REAL* a_address)

A routine that returns F_CHARACTER is mapped to a C routine with two additional arguments: a pointer to the data, and a string size, always set to 1 in the case of F_CHARACTER.

    F_CHARACTER foo(i:F_INTEGER, a:F_REAL);


    -- this Sather signature maps to

    void foo(F_CHARACTER* address, F_LENGTH size,

    F_INTEGER* i_address, F_REAL* a_address);

Similarly, a routine returning F_STRING is equivalent to a C routine with two additional initial arguments, a data pointer and a string length.[1]

    F_STRING foo(i:F_INTEGER, a:F_REAL);

    -- this Sather signature maps to

    void foo(F_CHARACTER* address, F_LENGTH size,

    F_INTEGER* i, F_REAL* a);

[1] The current Sather 1.1 implementation disallows returning Fortran strings of size greater than 32 bytes. This restriction may be lifted in future releases. (See http://www.gnu.org/software/sather/docs-1.2/tutorial/fortran-parameters.html#AEN2923.)

1.7.2. Argument Types

    All Fortran arguments are passed by reference. In addition, for each argument of

    type F_CHARACTER or F_STRING, an extra parameter whose value is the length

    of the string is appended to the end of the argument list.

    foo(i:F_INTEGER,c:F_CHARACTER,a:F_REAL):F_INTEGER

    -- this is mapped to

F_INTEGER foo(F_INTEGER* i_address, F_CHARACTER* c_address,


    F_REAL* a_address, F_LENGTH c_length);

    -- all calls have c_length set to 1

    foo(i:F_INTEGER,s:F_STRING,a:F_REAL):F_INTEGER

    -- this is mapped to

    F_INTEGER foo(F_INTEGER* i_address, F_CHARACTER* s_address,

    F_REAL* a_address, F_LENGTH s_length);

-- proper s_length is supplied by the caller

Additional string length arguments are passed by value. If there is more than one F_CHARACTER or F_STRING argument, the lengths are appended to the end of the list in the textual order of the string arguments:

    foo(s1:F_STRING,i:F_INTEGER,s2:F_STRING,a:F_REAL);

    -- this is mapped to

    void foo(F_CHARACTER* s1_address, F_INTEGER* i_address,

F_CHARACTER* s2_address, F_REAL* a_address,

    F_LENGTH s1_length, F_LENGTH s2_length);

    Sather signatures that have F_HANDLER arguments correspond to C integer

    functions whose return value represents the alternate return to take. The actual

    handlers are not passed to the Fortran code. Instead, code to do the branching

    based on the return value is emitted by the Sather compiler to conform to the

    alternate return semantics.


Arguments of type F_ROUT are passed as function pointers.

Thus, the entire C argument list, including additional arguments, consists of:
- one additional argument due to an F_COMPLEX or F_DOUBLE_COMPLEX return type, or two additional arguments due to an F_CHARACTER or F_STRING return type
- references to "normal" arguments corresponding to the Sather signature argument list
- additional length arguments for each F_CHARACTER or F_STRING argument in the Sather signature

    The following example combines all rules

    foo(s1:F_STRING, i:F_INTEGER, a:F_REAL,

    c:F_CHARACTER):F_COMPLEX

    -- is mapped to

    void foo(F_COMPLEX* ret_address, F_CHARACTER* s1_address,

    F_INTEGER* i_address, F_REAL* a_address,

    F_CHARACTER* c_address, F_LENGTH s1_length,

    F_LENGTH c_length);

    -- all Sather calls have c_length set to 1

1.7.3. OUT and INOUT Arguments

Sather 1.1 provides the extra flexibility of 'out' and 'inout' argument modes for Fortran calls. The Sather compiler ensures that the semantics of 'out' and 'inout' is preserved even when calls cross the Sather language boundary. In particular, the changes to such arguments are not observed until the call is complete; thus the interlanguage calls have the same semantics as regular Sather calls.


This additional mechanism makes the semantics of some arguments visually explicit and consequently helps catch some bugs caused by the modification of 'in' arguments (all Fortran arguments are passed by reference, and Fortran code can potentially modify all arguments without restriction). A special compiler option may enable checking the invariance of Fortran 'in' arguments.

The ICSI Sather 1.1 compiler currently does not implement this functionality.

In the case of calling Fortran code, the Sather compiler ensures that the value/result semantics is preserved by the caller, since the Sather compiler has no control over external Fortran code. This may involve copying 'inout' arguments to temporaries and passing references to these temporaries to Fortran. In the case of Sather routines that are called from Fortran, the Sather compiler emits a special prologue for such routines to ensure the value/result semantics for the Fortran caller. In summary, the value/result semantics for external calls to Fortran is ensured by the caller, and for Sather routines that are meant to be called by Fortran it is implemented by the callee.

This example shows how a signature for a routine that swaps two integers may look:

    SUBROUTINE SWAP(A,B)

    INTEGER A,B

    -- a Sather signature may look like

    swap(inout a:F_INTEGER, inout b:F_INTEGER);

    Note that using argument modes in this example makes the semantics of the

    routine more obvious.


    In the following example, compiling the program with all checks on may reveal a

    bug due to the incorrect modification of the vector sizes:

    SUBROUTINE ADD_VECTORS(A,B,RES,size)

    REAL A(*),B(*),RES(*)

    INTEGER SIZE

    -- Sather signature

    add_vectors(a,b,res:F_ARRAY{F_REAL}, size:F_INTEGER)

    -- size is an 'in' parameter and cannot be modified by Fortran code

    In addition to extra debugging capabilities, 'in' arguments are passed slightly more

    efficiently than 'out' and 'inout' arguments.

    Points to note

F_ROUT and F_HANDLER types cannot be "out" or "inout" arguments.

Parameter Passing, Calls, Symbol Tables & IRs

Parameter Passing

Three semantic classes (semantic models) of parameters:
- IN: pass value to subprogram
- OUT: pass value back to caller
- INOUT: pass value in and back

Implementation alternatives:
- Copy the value
- Pass an access path (e.g. a pointer)


    Parameter Passing Methods

    Pass-by-Value

    Pass-by-Reference

    Pass-by-Result

Pass-by-Value-Result

    Pass-by-value

Copy the actual into the formal.

The default in many imperative languages, and the only kind used in C and Java.

Used for IN parameter passing.

The actual can typically be an arbitrary expression, including constants & variables.

    Pass-by-value cont.

Advantage:
- Cannot modify actuals, so IN is automatically enforced.

Disadvantage:
- Copying of large objects is expensive; we don't want to copy a whole array on each call!

Implementation:
- The formal is allocated on the stack like a local variable.


- The value is initialized with the actual.
- Optimization is sometimes possible: keep the value only in a register.

    Pass-by-result

Used for OUT parameters.

No value is transmitted to the subprogram.

The actual MUST be a variable (more precisely, an l-value): foo(x) and foo(a[1]) are fine, but not foo(3) or foo(x * y).

    Pass-by-result gotchas

procedure foo(out int x, out int y) {
    g := 4;
    x := 42;
    y := 0;
}

main() {
    b: array[1..10] of integer;
    g: integer;
    g := 0;
    call to foo
}

The gotcha: for a call such as foo(b[g], g), when is the address of b[g] used for the copy-back computed, at call time (when g = 0) or at return time (after foo has set g to 4)? And in which order are x and y copied back? Different answers give different results.

    Pass-by-value-result

The implementation model for INOUT parameters.

Simply a combination of pass-by-value and pass-by-result, with the same advantages & disadvantages.

The actual must again be an l-value.


    Pass-by-reference

    Also implements IN-OUT

    Pass an access path, no copy is performed

    Advantages:

    Efficient, no copying, no extra space

Disadvantages:
- Parameter access is usually slower (via indirection)
- If only IN is required, the callee may change the value inadvertently
- Creates aliases

    Pass-by-reference aliases

    int g;

void foo(int& x) {
    x = 1;
}

    foo(g);

    g and x are aliased

    Pass-by-name

Textual substitution of the argument in the subprogram.

Used in Algol for in-out parameters; the C macro preprocessor works the same way.

The argument expression is re-evaluated at each reference to the formal parameter in the subprogram, and the subprogram can change the values of variables used in the argument expression. The programmer must rename variables in the subprogram in case of name clashes.


Evaluation uses the reference environment of the caller.

Jensen's device

    real procedure sigma(x, i, n);

    value n;

    real x; integer i, n;

    begin

    real s;

    s := 0;

    for i := 1 step 1 until n do

    s := s + x;

    sigma := s;

    end

What does sigma(a(i), i, 10) do? It computes a(1) + a(2) + ... + a(10): each reference to x re-evaluates a(i) with the current value of i. (A C-macro analogue is sketched below.)
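The C macro preprocessor's textual substitution can reproduce Jensen's device; a hedged sketch (SIGMA is our macro, not a standard one):

#include <stdio.h>

/* Each use of x re-expands the argument expression, so a[i] is
   re-evaluated with the current i, just as under call-by-name. */
#define SIGMA(x, i, n, s) \
    do { (s) = 0; for ((i) = 1; (i) <= (n); (i)++) (s) += (x); } while (0)

int main(void) {
    double a[11] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    double s;
    int i;
    SIGMA(a[i], i, 10, s);      /* computes a[1] + ... + a[10] = 55 */
    printf("%g\n", s);
    return 0;
}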


    Pass-by-name Safety Problem

    procedure swap(a, b);

    integer a,b,temp;

    begin


    temp := a;

    a := b;

    b := temp;

    end;

swap(x, y) works as expected, but consider swap(i, x(i)): after i := x(i) has changed i, the final assignment x(i) := temp updates the wrong array element.

    Call-by-name Implementation

Variables & constants are easy: reference & copy.

Expressions are harder: we have to use parameterless procedures, a.k.a. thunks.

    Procedures as Parameters

In some languages procedures are first-class citizens, i.e., they can be assigned to variables and passed as arguments like any other data type.

Even C, C++, and Pascal have some (limited) support for procedural parameters.

Major use: we can write more general procedures, e.g. in the C standard library (a usage example follows below):

void qsort(void* base, size_t nmemb, size_t size,
           int (*compar)(const void*, const void*));
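A short usage example of this procedural parameter (standard C; cmp_int is our helper):

#include <stdio.h>
#include <stdlib.h>

/* Comparison function passed to qsort as a procedural parameter. */
static int cmp_int(const void *a, const void *b) {
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);    /* safe even where x - y would overflow */
}

int main(void) {
    int v[] = {3, 1, 4, 1, 5};
    qsort(v, 5, sizeof v[0], cmp_int);
    for (int i = 0; i < 5; i++)
        printf("%d ", v[i]);     /* prints: 1 1 3 4 5 */
    return 0;
}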

    Design Issues

Typechecking:
- Are procedural parameters included?
- May not be possible with independent compilation.
- Type loophole in original Pascal: procedural parameters were not checked, but later a procedure type was


required in the formal declaration.

Procedures as parameters

How do we implement static scope rules, i.e., how do we set up the static link?

    program param(input, output);

    procedure b(function h(n : integer):integer);

    begin writeln(h(2)) end {b};

    procedure c;

    var m : integer;

    function f(n : integer) : integer;

    begin f := m + n end {f};

begin m := 0; b(f) end {c};

begin c end.

Solution: pass the static link together with the procedure value.

    Procedure Calls

    5 Steps during procedure invocation


1. Procedure call (caller)
2. Procedure prologue (callee)
3. Procedure execution (callee)
4. Procedure epilogue (callee)
5. Caller restores execution environment and receives return value (caller)

    The Call

Steps during procedure invocation:
- Each argument is evaluated and put in the corresponding register or stack location.
- The address of the called procedure is determined (in most cases already known at compile / link time).
- Caller-saved registers in use are saved in memory (on the stack).
- The static link is computed.
- The return address is saved in a register, and a branch to the callee's code is executed.

    The Prologue

Save fp; fp := sp; sp := sp - frame size.

Callee-saved registers used by the callee are saved in memory.

Construct the display (if used in lieu of a static link).

    The Epilogue

    Callee-saved registers that were saved are restored

    Restore old sp and fp

    Put return value in return register / stack location


    Branch to return address

    Post Return

The caller restores the caller-saved registers that were saved, and the return value is used.

The division into caller-saved vs callee-saved registers is important, because it reduces the number of register saves. There are 4 classes: caller-saved, callee-saved, temporary, and dedicated. The best division is program dependent, so the calling convention is a compromise.

    Argument Registers

An additional register class used when many GPRs are available, separate for integer and floating point arguments. Arguments beyond the register set are passed on the stack and accessed via fp+offset.

    Return values

The return value goes in the return register, or in memory if it is too large. The memory could be allocated in the caller's or the callee's space:
- Callee's space: not reentrant!
- Caller's space: pass a pointer to the caller's return value space; if the size is provided as well, the callee can check for fit.


Procedure Calls with Register Windows

    Symbol Tables

Maps symbol names to attributes.

Common attributes:
- Name: String
- Class: Enumeration (storage class)
- Volatile: Boolean
- Size: Integer
- Bitsize: Integer
- Boundary: Integer
- Bitbdry: Integer
- Type: Enumeration or Type referent
- Basetype: Enumeration or Type referent


- Machtype: Enumeration
- Nelts: Integer
- Register: Boolean
- Reg: String (register name)
- Basereg: String
- Disp: Integer (displacement)

    Symbol Table Operations

New_Sym_Tab: SymTab -> SymTab
Dest_Sym_Tab: SymTab -> SymTab (destroys the symtab and returns its parent)
Insert_Sym: SymTab x Symbol -> boolean (returns false if already present, otherwise inserts and returns true)
Locate_Sym: SymTab x Symbol -> boolean
Get_Sym_Attr: SymTab x Symbol x Attr -> Value
Set_Sym_Attr: SymTab x Symbol x Attr x Value -> boolean
Next_Sym: SymTab x Symbol -> Symbol
More_Syms: SymTab x Symbol -> boolean
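A minimal C sketch of such a table using hashing with buckets (the layout and names are illustrative, not a definitive interface; strdup is POSIX):

#include <stdlib.h>
#include <string.h>
#include <stdbool.h>

#define NBUCKETS 211

typedef struct Symbol {
    char *name;
    struct Symbol *next;          /* bucket chain */
    /* attribute fields (Class, Size, Type, Disp, ...) would go here */
} Symbol;

typedef struct SymTab {
    struct SymTab *parent;        /* enclosing scope */
    Symbol *buckets[NBUCKETS];
} SymTab;

static unsigned hash(const char *s) {
    unsigned h = 0;
    while (*s) h = h * 31 + (unsigned char)*s++;
    return h % NBUCKETS;
}

SymTab *new_sym_tab(SymTab *parent) {            /* New_Sym_Tab */
    SymTab *t = calloc(1, sizeof *t);
    t->parent = parent;
    return t;
}

bool insert_sym(SymTab *t, const char *name) {   /* Insert_Sym */
    unsigned h = hash(name);
    for (Symbol *s = t->buckets[h]; s; s = s->next)
        if (strcmp(s->name, name) == 0)
            return false;                        /* already present */
    Symbol *s = malloc(sizeof *s);
    s->name = strdup(name);
    s->next = t->buckets[h];
    t->buckets[h] = s;
    return true;
}

Symbol *locate_sym(SymTab *t, const char *name) { /* Locate_Sym, across scopes */
    for (; t; t = t->parent)
        for (Symbol *s = t->buckets[hash(name)]; s; s = s->next)
            if (strcmp(s->name, name) == 0)
                return s;
    return NULL;
}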

    Implementation Goals

    Fast insertion and lookup operations for symbols and attributes

Alternatives:
- Balanced binary tree
- Hash table (the usual choice), with open addressing or


buckets (commonly used).

    Scoping and Symbol Tables

    Nested scopes (e.g. Pascal) can be represented as a tree

    Implement by pushing / popping symbol tables on/off a symbol table stack

    More efficient implementation with two stacks

    Scoping with Two Stacks

    Visibility versus Scope

So far we have assumed scope ~ visibility. When visibility directly corresponds to scope, this is called open scope.

Closed scope = visibility explicitly specified. This arises in module systems (import) and inheritance mechanisms in OO languages.

Closed scope can be implemented by adding a list of the scope level numbers in which a symbol is visible. An optimized implementation needs just one scope number:


- The stack represents the declared scope or outermost exported scope.
- The hash table implements visibility by reordering the hash chain.

    Intermediate Representations

Make the optimizer independent of the source and target language.

Usually multiple levels:
- HIR = high-level IR: encodes source language semantics, and can express language-specific optimizations.
- MIR = medium-level representation for multiple source and target languages: can express source/target-independent optimizations.
- LIR = low-level representation with many specifics of the target: can express target-specific optimizations.

    IR Goals

Primary goals:
- Easy & effective analysis
- Few cases
- Support for things of interest
- Easy transformations
- General across source / target languages

Secondary goals:
- Compact in memory
- Easy to translate from / to
- Debugging support
- Extensible & displayable

    High-Level IRs

Abstract syntax tree + symbol table is the most common.

    LISP S-expressions


    Medium-level IRs

Represent source variables + temporaries and registers.

Reduce control flow to conditional + unconditional branches.

Explicit operations for procedure calls and block structure.

Most popular: three-address code (each instruction addresses at most 3 operands):

t1 := t2 op t3
if t goto L
t1 := t2 < t3

    Important MIRs

SSA = static single assignment form.

Like 3-address code, but every variable has exactly one reaching definition. This makes variables independent of the locations they are in, and makes many optimization algorithms more effective.

    SSA Example

Before SSA:

x := u
... x ...
x := v
... x ...

After SSA:

x1 := u
... x1 ...


x2 := v
... x2 ...

    Other Representations

Triples:

(1) i + 1
(2) i := (1)
(3) i + 1
(4) p + 4
(5) *(4)
(6) p := (4)

Trees: like an AST but at a lower level.

Directed Acyclic Graphs (DAGs): more compact than trees through node sharing.

    Three Address Code Example
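The example figure is missing here; a representative translation (ours, in the notation above) is:

Source:
  a = b * c + d;
  if (a < 10) x = 1; else x = 2;

Three-address code:
  t1 := b * c
  a  := t1 + d
  t2 := a < 10
  if t2 goto L1
  x := 2
  goto L2
L1: x := 1
L2: ...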


    Representation Components

- Operations
- Dependences between operations:
  - Control dependences: sequencing of operations
    - Evaluation of then & else depends on the result of the test
    - Side effects of statements occur in the right order
  - Data dependences: flow of values from definitions to uses
    - Operands are computed before the operation
    - Values are read from a variable before it is overwritten
- Want to represent only the relevant dependences: dependences constrain operations, so the fewer the better

    Representing Control Dependence

- Implicit in the AST
- Explicit as Control Flow Graphs (CFGs):
  - Nodes are basic blocks
  - Instructions within a block sequence side effects
  - Edges represent branches (control flow between blocks)
- Fancier:
  - Control Dependence Graph: part of the PDG (program dependence graph)
  - Value Dependence Graph (VDG): control dependence converted to data dependence


    Data Dependence Kinds

- True (flow) dependence (read after write, RAW): reflects real data flow, operands to operation
- Anti-dependence (write after read, WAR)
- Output dependence (write after write, WAW)

The latter two reflect overwriting of memory, not real data flow, and can often be eliminated.

    Data Dependence Example

Representing Data Dependences (within basic blocks)

- Sequence of instructions
  - Simple, easy analysis
  - But: may overconstrain operation order
- Expression tree / DAG
  - Directly captures dependences in the block


  - Supports local CSE (common subexpression elimination)
  - Can be compact
  - Harder to analyze & transform
  - Eventually has to be linearized

    Representing Data Dependences (across blocks)

- Implicit via def-use
  - Simple
  - Makes analysis slow (have to compute dependences each time)
- Explicit: def-use chains
  - Fast
  - Space-consuming
  - Has to be updated after transformations
- Advanced options:
  - SSA, VDGs
  - Dependence flow graphs (DFGs)

    1.8 Semantics of Calls and Returns

Def: The subprogram call and return operations of a language are together called its subprogram linkage.

    1.9 Implementing Subprograms

    The General Semantics of Calls and Returns

The subprogram call and return operations of a language are together called its subprogram linkage.


A subprogram call in a typical language has numerous actions associated with it:
- The call must include the mechanism for whatever parameter-passing method is used.
- If local vars are not static, the call must cause storage to be allocated for the locals declared in the called subprogram and bind those vars to that storage.
- It must save the execution status of the calling program unit.
- It must arrange to transfer control to the code of the subprogram and ensure that control can return to the caller when the subprogram's execution is completed.
- Finally, if the language allows nested subprograms, the call must cause some mechanism to be created to provide access to non-local vars that are visible to the called subprogram.

    Implementing Simple Subprograms

Simple means that subprograms cannot be nested and all local vars are static.

The semantics of a call to a simple subprogram requires the following actions:
1. Save the execution status of the caller.
2. Carry out the parameter-passing process.
3. Pass the return address to the callee.
4. Transfer control to the callee.

The semantics of a return from a simple subprogram requires the following actions:
1. If pass-by-value-result parameters are used, move the current values of those parameters to their corresponding actual parameters.
2. If it is a function, move the functional value to a place the caller can get it.
3. Restore the execution status of the caller.
4. Transfer control back to the caller.

The call and return actions require storage for the following: status information of the caller, parameters, return address, and the functional value (if it is a function). These, along with the local vars and the subprogram code, form the complete set of information a subprogram needs to execute and then return control to the caller.

A simple subprogram consists of two separate parts: the actual code of the subprogram, which is constant, and the local variables and data, which can change when the subprogram is executed. Both parts have fixed sizes.


The format, or layout, of the non-code part of an executing subprogram is called an activation record, b/c the data it describes are only relevant during the activation of the subprogram.

The form of an activation record is static. An activation record instance is a concrete example of an activation record (the collection of data for a particular subprogram activation).

B/c languages with simple subprograms do not support recursion, there can be only one active version of a given subprogram at a time. Therefore, there can be only a single instance of the activation record for a subprogram. One possible layout for activation records is shown below.

B/c an activation record instance for a simple subprogram has a fixed size, it can be statically allocated.

The following figure shows a program consisting of a main program and three subprograms: A, B, and C.


The construction of the complete program shown above is not done entirely by the compiler. In fact, b/c of independent compilation, MAIN, A, B, and C may have been compiled on different days, or even in different years. At the time each unit is compiled, the machine code for it, along with a list of references to external subprograms, is written to a file. The executable program shown above is put together by the linker, which is part of the O/S.

    1.10 Implementing Subprograms with Stack-Dynamic Local Variables

One of the most important advantages of stack-dynamic local vars is support for recursion.

More Complex Activation Records

Subprogram linkage in languages that use stack-dynamic local vars is more complex than the linkage of simple subprograms for the following reasons:
o The compiler must generate code to cause the implicit allocation and deallocation of local variables.
o Recursion must be supported (it adds the possibility of multiple simultaneous activations of a subprogram), which means there can be more than one instance of a subprogram at a given time, with one call from outside the subprogram and one or more recursive calls.
o Recursion, therefore, requires multiple instances of activation records, one for each subprogram activation that can exist at the same time.
o Each activation requires its own copy of the formal parameters and the dynamically allocated local vars, along with the return address.

The format of an activation record for a given subprogram in most languages is known at compile time. In many cases, the size is also known, b/c all local data is of fixed size. In languages with stack-dynamic local vars, activation record instances must be created dynamically. The following figure shows the activation record for such a language.


B/c the return address, dynamic link, and parameters are placed in the activation record instance by the caller, these entries must appear first.

The return address often consists of a ptr to the code segment of the caller and an offset address in that code segment of the instruction following the call.

The dynamic link points to the top of an instance of the activation record of the caller. In static-scoped languages, this link is used in the destruction of the current activation record instance when the procedure completes its execution: the stack top is set to the value of the old dynamic link.

The actual parameters in the activation record are the values or addresses provided by the caller.

Local scalar vars are bound to storage within an activation record instance. Local structure vars are sometimes allocated elsewhere, and only their descriptors and a ptr to that storage are part of the activation record. Local vars are allocated and possibly initialized in the called subprogram, so they appear last.

Consider the following skeletal C function:

void sub(float total, int part)

    {

    int list[4];

    float sum;

    }

    The activation record for sub is:
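The figure is lost here; from the layout rules just described (return address and dynamic link first, then parameters, then locals), the instance looks roughly like this, stack top first:

  local        sum
  local        list[3]
  local        list[2]
  local        list[1]
  local        list[0]
  parameter    part
  parameter    total
  dynamic link
  return (to caller)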


Activating a subprogram requires the dynamic creation of an instance of the activation record for the subprogram.

B/c the call and return semantics specify that the subprogram called last is the first to complete, it is reasonable to create instances of these activation records on a stack. This stack is part of the run-time system and is called the run-time stack.

Every subprogram activation, whether recursive or non-recursive, creates a new instance of an activation record on the stack. This provides the required separate copies of the parameters, local vars, and return address.

    An Example without Recursion

Consider the following skeletal C program:

void fun1(int x) {

    int y;

    ... 2

    fun3(y);

    ...

    }

    void fun2(float r) {

    int s, t;

    ... 1

    fun1(s);

    ...

    }

    void fun3(int q) {

    ... 3


    }

    void main() {

    float p;

    ...

    fun2(p);

    ...

    }

The sequence of procedure calls in this program is:

main calls fun2
fun2 calls fun1
fun1 calls fun3

The stack contents for the points labeled 1, 2, and 3 are shown in the figure below:

At point 1, only the ARIs for main and fun2 are on the stack. When fun2 calls fun1, an ARI for fun1 is created on the stack. When fun1 calls fun3, an ARI for fun3 is created on the stack.


When fun3's execution ends, its ARI is removed from the stack, and the dynamic link is used to reset the stack top pointer. A similar process takes place when fun1 and fun2 terminate. After the return from the call to fun2 from main, the stack has only the ARI of main. In this example, we assume that the stack grows from lower addresses to higher addresses.

The collection of dynamic links present in the stack at a given time is called the dynamic chain, or call chain. It represents the dynamic history of how execution got to its current position, which is always in the subprogram whose activation record instance is on top of the stack.

References to local vars can be represented in the code as offsets from the beginning of the activation record of the local scope. Such an offset is called a local_offset.

The local_offset of a local variable can be determined by the compiler at compile time, using the order, types, and sizes of the vars declared in the subprogram associated with the activation record. Assume that all vars take one position in the activation record. The first local variable declared would be allocated in the activation record two positions plus the number of parameters from the bottom (the first two positions are for the return address and the dynamic link). The second local var declared would be one position nearer the stack top, and so forth. E.g., in fun1, the local_offset of y is 3; likewise, in fun2, the local_offset of s is 3 and of t is 4.

Recursion

Consider the following C program, which uses recursion:

int factorial(int n) {
    /* <-- position 1 */
    if (n <= 1)
        return 1;
    else
        return n * factorial(n - 1);


    }

void main() {
    int value;
    value = factorial(3);
}


This returns the value 2 to the first activation of factorial, to be multiplied by its parameter value n, which is 3, yielding the final functional value of 6, which is then returned to the first call to factorial in main.

The stack contents at position 1 in factorial are shown below.

The figure below shows the stack contents during the execution of main and factorial.


    Nested Subprograms

Some of the non-C-based static-scoped languages (e.g., Fortran 95, Ada, JavaScript) use stack-dynamic local variables and allow subprograms to be nested.

    The Basics

All variables that can be non-locally accessed reside in some activation record instance in the stack.

The process of locating a non-local reference:
1. Find the correct activation record instance in the stack in which the var was allocated.
2. Determine the correct offset of the var within that activation record instance to access it.

Finding the correct activation record instance:
- Only vars that are declared in static ancestor scopes are visible and can be accessed.
- Static semantic rules guarantee that all non-local variables that can be referenced have been allocated in some activation record instance that is on the stack when the reference is made.
- A subprogram is callable only when all of its static ancestor subprograms are active.
- The semantics of non-local references dictates that the correct declaration is the first one found when looking through the enclosing scopes, most closely nested first.

    Static Chains

A static chain is a chain of static links that connects certain activation record instances in the stack.

The static link (static scope pointer) in an activation record instance for subprogram A points to one of the activation record instances of A's static parent. The static link appears in the activation record below the parameters.

The static chain from an activation record instance connects it to all of its static ancestors. During the execution of a procedure P, the static link of its activation record instance points to an activation record instance of P's static parent program unit. That instance's static link points, in turn, to P's static grandparent program unit's activation record instance, if there is one.


So the static chain links all the static ancestors of an executing subprogram, in order of closest static parent first. This chain can obviously be used to implement access to non-local vars in static-scoped languages.

When a reference is made to a non-local var, the ARI containing the var can be found by searching the static chain until a static ancestor ARI is found that contains the var. B/c the nesting of scopes is known at compile time, the compiler can determine not only that a reference is non-local, but also the length of the static chain that must be followed to reach the ARI that contains the non-local object.

A static_depth is an integer associated with a static scope whose value is the depth of nesting of that scope.

main ----- static_depth = 0
  A ----- static_depth = 1
    B ----- static_depth = 2
  C ----- static_depth = 1

The length of the static chain needed to reach the correct ARI for a non-local reference to a var X is exactly the difference between the static_depth of the procedure containing the reference to X and the static_depth of the procedure containing the declaration of X. This difference is called the nesting_depth, or chain_offset, of the reference.

The actual reference can be represented by an ordered pair of integers (chain_offset, local_offset), where chain_offset is the number of links to the correct ARI.

    procedure A is

    procedure B is

    procedure C is

    end; // C


    end; // B

    end; // A

The static_depths of A, B, and C are 0, 1, and 2, respectively.

If procedure C references a var in A, the chain_offset of that reference would be 2 (static_depth of C minus static_depth of A). If procedure C references a var in B, the chain_offset would be 1 (static_depth of C minus static_depth of B).

References to locals can be handled using the same mechanism, with a chain_offset of 0.

    procedure MAIN_2 is

    X : integer;

    procedure BIGSUB is

    A, B, C : integer;

    procedure SUB1 is

    A, D : integer;

    begin { SUB1 }

    A := B + C;


    begin { SUB3 }

    SUB1;

E := B + A;


The sequence of procedure calls is:

MAIN_2 calls BIGSUB
BIGSUB calls SUB2
SUB2 calls SUB3
SUB3 calls SUB1

The stack situation when execution first arrives at point 1 in this program is shown below:


    At position 1 in SUB1:

    A - (0, 3)

    B - (1, 4)

    C - (1, 5)

    At position 2 in SUB3:

    E - (0, 4)

    B - (1, 4)

    A - (2, 3)

    At position 3 in SUB2:

    A - (1, 3)

    D - an error

E - (0, 5)

1.12 Overview of Memory Management, Virtual Memory, Process Creation

    Memory Management

    Program data is stored in memory.

    Memory is a finite resource: programs may need to reuse some of it.

    Most programming languages provide two means of structuring

    data stored in memory:

    Stack:

    memory space (stack frames) for storing data local to a function body.


The programming language provides facilities for automatically managing stack-allocated data (i.e. the compiler emits code for allocating/freeing stack frames).

(Aside: unsafe languages like C/C++ don't enforce the stack invariant, which leads to bugs that can be exploited for code injection attacks.)

    Heap:

memory space for storing data that is created by a function but needed in a caller. (Its lifetime is unknown at compile time.)

Freeing/reusing this memory can be up to the programmer (C/C++).

(Aside: freeing memory twice or never freeing it also leads to many bugs in C/C++ programs.)

Garbage collection automates memory management for Java/ML/C#/etc.

    Explicit Memory Management

On Unix, libc provides a library that allows programmers to manage the heap:

void *malloc(size_t n)
Allocates n bytes of storage on the heap and returns its address.

void free(void *addr)
Releases the memory previously allocated by the malloc call that returned addr.

These are user-level library functions. Internally, malloc uses the brk (or sbrk) system call to have the kernel allocate space to the process.
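A small usage example of the two calls (standard C; nothing here is specific to one allocator):

    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        /* Allocate heap space for 100 ints. */
        int *a = malloc(100 * sizeof *a);
        if (a == NULL)
            return 1;            /* malloc returns NULL on failure */

        for (int i = 0; i < 100; i++)
            a[i] = i;
        printf("%d\n", a[99]);

        free(a);                 /* release exactly once; don't use a afterwards */
        return 0;
    }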

    Simple Implementation: Free Lists

    Arrange the blocks of unused memory in a free list.


Each block has a pointer to the next free block.

Each block keeps track of its size (stored before & after the data part).

Each block has a status flag, allocated or unallocated, kept as a bit in the first size field (assuming size is a multiple of 2, so the last bit is unused).

Malloc: walk down the free list and find a block big enough. First fit? Best fit?

Free: insert the freed block into the free list. Perhaps keep the list sorted so that adjacent blocks can be merged.
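A minimal sketch of such a block header and a first-fit search, with illustrative field names (a real allocator also has to split blocks, set the flag bit, and grow the heap when the search fails):

    #include <stddef.h>

    /* Each free block records its size and links to the next free block.
       Sizes are multiples of 2, so the low bit is free to hold the
       allocated/unallocated flag. */
    typedef struct Block {
        size_t size_and_flag;   /* size | allocated-bit */
        struct Block *next;     /* next block on the free list */
    } Block;

    static Block *free_list = NULL;

    /* First fit: return the first free block big enough for n bytes. */
    Block *find_first_fit(size_t n) {
        for (Block *b = free_list; b != NULL; b = b->next) {
            size_t size = b->size_and_flag & ~(size_t)1;  /* mask off flag */
            if (size >= n)
                return b;
        }
        return NULL;   /* nothing fits: ask the kernel for more space */
    }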

Problems:
o Fragmentation ruins the heap
o Malloc can be slow

    Exponential Scaling / Buddy System

Keep an array of free lists, where FreeList[i] points to a list of blocks of size 2^i.

Malloc: round the requested size up to the nearest power of 2. When FreeList[i] is empty, divide a block from FreeList[i+1] into two halves and put both chunks into FreeList[i]. (Alternatively, merge together two adjacent nodes from FreeList[i-1].)

Free: put the freed block back into the appropriate free list.

Malloc & free take O(1) time.

This approach trades external fragmentation (within the heap as a whole) for internal fragmentation (within each block). Wasted space: ~30%.
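The rounding step is simple; a sketch of mapping a request to its free-list index (and hence its internal fragmentation) might be:

    #include <stddef.h>

    /* Return the index i such that 2^i is the smallest power of two >= n;
       FreeList[i] holds blocks of size 2^i. */
    int size_class(size_t n) {
        int i = 0;
        size_t block = 1;
        while (block < n) {
            block <<= 1;    /* double the candidate block size */
            i++;
        }
        return i;
    }

    /* e.g., size_class(24) == 5: a 24-byte request is served from the
       32-byte list, wasting 8 bytes to internal fragmentation. */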

    Manual memory management is cumbersome & error prone:


o Freeing the same pointer twice is ill-defined (a seg fault or other bugs).

o Calling free on a pointer not created by malloc (e.g., a pointer to an element of an array) is also ill-defined.

o malloc and free aren't modular: to properly free all allocated memory, the programmer has to know what code owns each object, and the owner code must ensure free is called just once.

o Not calling free leads to space leaks: memory that is never reclaimed. There are many examples of space leaks in long-running programs.

Garbage collection: have the language runtime system determine when an allocated chunk of memory will no longer be used, and free it automatically.

But the garbage collector is usually the most complex part of a language's runtime system, and garbage collection does impose costs (performance, predictability).

    Memory Use & Reachability

When is a chunk of memory no longer needed? In general, this problem is undecidable.

We can approximate this information by freeing memory that can't be reached from any root references.

A root pointer is one that might be accessible directly from the program (i.e., it is not in the heap). Root pointers include pointer values stored in registers, in global variables, or on the stack.

If a memory cell is part of a record (or other data structure) that can be reached by traversing pointers from a root, it is live.


It is safe to reclaim all memory cells not reachable from a root (such cells are garbage).

    Reachability & Pointers

Starting from the stack, registers, & globals (the roots), determine which objects in the heap are reachable by following pointers. Reclaim any object that isn't reachable.

This requires being able to distinguish pointer values from other values (e.g., ints). In type-safe languages:

o OCaml and SML/NJ use the low bit of each word: 1 means it's a scalar, 0 means it's a pointer. (Hence 31-bit ints in OCaml.)

o Java puts the tag bits in the object metadata (which uses more space).

Type safety implies that casts can't introduce new pointers. Also, pointers are abstract (references), so objects can be moved without changing the meaning of the program.

    Mark and Sweep Garbage Collection

Classic algorithm with two phases:

Phase 1: Mark
o Start from the roots.
o Do a depth-first traversal, marking every object reached.

Phase 2: Sweep
o Walk over all allocated objects and check for marks.
o Unmarked objects are reclaimed.
o Marked objects have their marks cleared.

Optional: compact all live objects in the heap by moving them adjacent to one another. (Needs extra work & indirection to patch up pointers.)
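A compact C sketch of the two phases, assuming an illustrative object layout with at most two outgoing pointers and a global list of all allocations (real collectors discover pointers via type information or tag bits, as described above):

    #include <stdbool.h>
    #include <stdlib.h>

    typedef struct Obj {
        bool marked;
        struct Obj *fields[2];   /* outgoing pointers (NULL if unused) */
        struct Obj *all_next;    /* chains every allocated object, for sweep */
    } Obj;

    static Obj *all_objects = NULL;

    /* Phase 1: depth-first traversal from a root, marking what it reaches.
       (Note the recursion uses stack space; DSW below avoids that.) */
    void mark(Obj *o) {
        if (o == NULL || o->marked)
            return;               /* also terminates cycles in the graph */
        o->marked = true;
        mark(o->fields[0]);
        mark(o->fields[1]);
    }

    /* Phase 2: reclaim unmarked objects, clear marks on survivors. */
    void sweep(void) {
        Obj **link = &all_objects;
        while (*link != NULL) {
            Obj *o = *link;
            if (!o->marked) {
                *link = o->all_next;   /* unlink the garbage object */
                free(o);
            } else {
                o->marked = false;     /* reset for the next collection */
                link = &o->all_next;
            }
        }
    }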

Deutsch-Schorr-Waite (DSW) Algorithm

No need for a stack: it is possible to use the graph being traversed itself to store the data necessary for the traversal.

Idea: during depth-first search, each pointer is followed only once. The algorithm can reverse the pointers on the way down and restore them on the way back up.

Mark a bit on each object traversed on the way down.

Two pointers:
o curr: points to the current node
o prev: points to the previous node

On the way down, flip pointers as you traverse them:

tmp := curr
curr := curr.next
tmp.next := prev     (the old curr's pointer now points back)
prev := tmp          (prev becomes the old curr)
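On the way back up, the same trick runs in reverse (a sketch for the same single-pointer case): the reversed link is used to climb to the parent, and it is restored as it is traversed:

tmp := prev
prev := prev.next
tmp.next := curr     (the parent's pointer to its child is restored)
curr := tmp          (climb back up to the parent)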

    Costs & Implications

The traversal must be generalized to account for objects that have multiple outgoing pointers.

The depth-first traversal terminates when there are no child pointers or all children are already marked; this also accounts for cycles in the object graph.

The Deutsch-Schorr-Waite algorithm breaks objects during the traversal, and all computation must be halted during the mark phase. (Bad for interactive or real-time programs.)

    Demand paging

Bring a page into memory only when it is needed:
o Less I/O needed
o Less memory needed
o Faster response
o More users

A page is needed when there is a reference to it:
o invalid reference ⇒ abort
o not-in-memory ⇒ bring to memory

    Page Fault

If there is ever a reference to a page not in memory, the first reference will trap to the OS: a page fault.

The OS looks at another table to decide:
o invalid reference ⇒ abort
o just not in memory ⇒ continue:

Get an empty frame, swap the page into the frame, reset the tables, and set the validation bit to 1. Then restart the interrupted instruction, but beware of instructions that are hard to restart:
o block move
o auto increment/decrement location

    Steps in Handling a Page Fault

What happens if there is no free frame?

Page replacement: find some page in memory that is not really in use and swap it out. This requires:
o an algorithm
o performance: we want an algorithm which will result in the minimum number of page faults

The same page may be brought into memory several times.

Performance of Demand Paging

Page fault rate p, where 0 ≤ p ≤ 1.0:
o if p = 0, there are no page faults
o if p = 1, every reference is a fault

The effective access time is then EAT = (1 − p) × memory access time + p × page-fault service time.
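For instance (illustrative numbers, not from the source): with a 200 ns memory access time, an 8 ms page-fault service time, and p = 0.001,

EAT = 0.999 × 200 ns + 0.001 × 8,000,000 ns ≈ 8,200 ns

so even one fault per thousand references slows effective memory access by a factor of about 40.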


Memory Mapped Files

A file is initially read using demand paging: a page-sized portion of the file is read from the file system into a physical page. Subsequent reads and writes to the file are then treated as ordinary memory accesses.

This simplifies file access by treating file I/O through memory rather than read()/write() system calls.

It also allows several processes to map the same file, allowing the pages in memory to be shared.
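A minimal POSIX sketch of mapping a file and reading it through memory (the filename is a placeholder; error handling abbreviated):

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void) {
        int fd = open("data.txt", O_RDONLY);   /* placeholder filename */
        if (fd < 0) return 1;

        struct stat st;
        if (fstat(fd, &st) < 0 || st.st_size == 0) return 1;

        /* Map the whole file; pages are faulted in on first access. */
        char *p = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED) return 1;

        putchar(p[0]);            /* file contents are ordinary memory now */

        munmap(p, st.st_size);
        close(fd);
        return 0;
    }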

    Page Replacement

Prevent over-allocation of memory by modifying the page-fault service routine to include page replacement.

Use a modify (dirty) bit to reduce the overhead of page transfers: only modified pages are written back to disk.

Page replacement completes the separation between logical memory and physical memory: a large virtual memory can be provided on a smaller physical memory.

    Need For Page Replacement

    Page Replacement Algorithms

    Want lowest page-fault rate.


Evaluate an algorithm by running it on a particular string of memory references (a reference string) and computing the number of page faults on that string.

    In all our examples, the reference string is 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5.

    FIFO Page Replacement
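FIFO replaces the page that has been in memory the longest. A minimal sketch (illustrative code, not from the source) of counting FIFO faults on the reference string above:

    #include <stdio.h>

    /* Count page faults for FIFO replacement (assumes nframes <= 16). */
    int fifo_faults(const int *refs, int n, int nframes) {
        int frames[16];
        int used = 0, oldest = 0, faults = 0;
        for (int i = 0; i < n; i++) {
            int hit = 0;
            for (int j = 0; j < used; j++)
                if (frames[j] == refs[i]) { hit = 1; break; }
            if (hit) continue;
            faults++;
            if (used < nframes)
                frames[used++] = refs[i];       /* free frame available */
            else {
                frames[oldest] = refs[i];       /* evict the oldest page */
                oldest = (oldest + 1) % nframes;
            }
        }
        return faults;
    }

    int main(void) {
        int refs[] = {1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5};
        printf("3 frames: %d faults\n", fifo_faults(refs, 12, 3));  /* 9 */
        printf("4 frames: %d faults\n", fifo_faults(refs, 12, 4));  /* 10 */
        return 0;
    }

Note that going from 3 frames to 4 frames actually increases the fault count from 9 to 10: this is Belady's anomaly, which this reference string is the classic example of.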


    Optimal Algorithm

Replace the page that will not be used for the longest period of time.

4-frames example, reference string 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5:

The first four references fault while the frames fill with 1, 2, 3, 4. Reference 5 faults and replaces 4 (the page whose next use is farthest in the future), and the final reference to 4 faults once more: 6 page faults in total.

How do you know which page will not be used for the longest time? In general you can't, so the optimal algorithm is used for measuring how well a realizable algorithm performs.

    Optimal Page Replacement

  • 7/28/2019 System Software &memory management

    77/80

    LRU Page Replacement

    LRU Algorithm (Cont.)

Stack implementation: keep a stack of page numbers in doubly linked form.
o Page referenced: move it to the top (requires 6 pointers to be changed).
o No search for replacement.

    Use Of A Stack to Record The Most Recent Page References
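A sketch of the move-to-top operation on such a doubly linked list (illustrative names): the six pointer updates are visible, and the replacement victim is simply the node at the bottom, so no search is needed.

    typedef struct Page {
        int number;
        struct Page *prev, *next;
    } Page;

    static Page *top = NULL;   /* most recently used page */

    void move_to_top(Page *p) {
        if (p == top)
            return;
        /* Unlink p from its current position: */
        if (p->prev) p->prev->next = p->next;   /* 1 */
        if (p->next) p->next->prev = p->prev;   /* 2 */
        /* Relink p at the top of the stack: */
        p->prev = NULL;                         /* 3 */
        p->next = top;                          /* 4 */
        if (top) top->prev = p;                 /* 5 */
        top = p;                                /* 6 */
    }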

    LRU Approximation Algorithms

Reference bit:
o With each page, associate a bit, initially = 0.
o When the page is referenced, the bit is set to 1.
o Replace a page whose bit is 0 (if one exists). We do not know the order of use, however.


Second chance:
o Needs a reference bit.
o Clock replacement.
o If the page to be replaced (in clock order) has reference bit = 1, then: set the reference bit to 0, leave the page in memory, and consider the next page (in clock order), subject to the same rules.

    Second-Chance (clock) Page-Replacement Algorithm
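A minimal sketch of the clock scan (hypothetical frame table; in a real OS the reference bits are set by the memory-management hardware):

    #define NFRAMES 4

    static int ref_bit[NFRAMES];   /* set to 1 by hardware on each reference */
    static int hand = 0;           /* the clock hand */

    /* Advance the hand until a frame with reference bit 0 is found,
       clearing bits (granting second chances) along the way. */
    int choose_victim(void) {
        for (;;) {
            if (ref_bit[hand] == 0) {
                int victim = hand;
                hand = (hand + 1) % NFRAMES;
                return victim;
            }
            ref_bit[hand] = 0;             /* second chance */
            hand = (hand + 1) % NFRAMES;
        }
    }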

    Counting Algorithms

Keep a counter of the number of references that have been made to each page.

LFU Algorithm: replaces the page with the smallest count.

MFU Algorithm: based on the argument that the page with the smallest count was probably just brought in and has yet to be used.

    Allocation of Frames

Each process needs a minimum number of pages.


Example: IBM 370: 6 pages are needed to handle the SS MOVE instruction:
o the instruction is 6 bytes and might span 2 pages
o 2 pages to handle the from address
o 2 pages to handle the to address

Two major allocation schemes:
o fixed allocation
o priority allocation

    Fixed Allocation

Equal allocation: e.g., if there are 100 frames and 5 processes, give each process 20 frames.

Proportional allocation: allocate according to the size of the process. If s_i is the size of process p_i, S is the sum of all s_i, and m is the total number of frames, then the allocation for p_i is a_i = (s_i / S) × m.

Example: with m = 64 frames, s_1 = 10, and s_2 = 127, process 1 gets a_1 = (10/137) × 64 ≈ 5 frames and process 2 gets a_2 = (127/137) × 64 ≈ 59 frames.
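A sketch of that computation in C (illustrative code; rounds to the nearest frame):

    #include <stdio.h>

    /* a_i = (s_i / S) * m: frames allocated in proportion to process size. */
    void proportional_allocation(const int *s, int nprocs, int m) {
        int S = 0;
        for (int i = 0; i < nprocs; i++)
            S += s[i];
        for (int i = 0; i < nprocs; i++)
            printf("p%d: %d frames\n", i + 1, (s[i] * m + S / 2) / S);
    }

    int main(void) {
        int s[] = {10, 127};                 /* s1 = 10, s2 = 127 */
        proportional_allocation(s, 2, 64);   /* prints 5 and 59 */
        return 0;
    }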

    Priority Allocation

Use a proportional allocation scheme based on priorities rather than size.

If process P_i generates a page fault:
o select for replacement one of its frames, or



o select for replacement a frame from a process with a lower priority number.

    Global vs. Local Allocation

Global replacement: a process selects a replacement frame from the set of all frames; one process can take a frame from another.

Local replacement: each process selects from only its own set of allocated frames.

    Thrashing

If a process does not have enough pages, the page-fault rate is very high. This leads to:
o low CPU utilization
o the operating system thinking that it needs to increase the degree of multiprogramming
o another process being added to the system

Thrashing: a process is busy swapping pages in and out.