System Programming Set 1


  • 8/3/2019 System Programming Set 1


    System Programming MC0073

    Set -1

    1. Explain the following:
    a. Lexical Analysis
    b. Syntax Analysis

    Ans.

    a. Lexical Analysis

    The lexical analyzer is the interface between the source program
    and the compiler. The lexical analyzer reads the source program one
    character at a time, carving the source program into a sequence of
    atomic units called tokens. Each token represents a sequence of
    characters that can be treated as a single logical entity.
    Identifiers, keywords, constants, operators, and punctuation symbols
    such as commas and parentheses are typical tokens. There are two
    kinds of token: specific strings such as IF or a semicolon, and
    classes of strings such as identifiers, constants, or labels.


    The lexical analyzer and the following phase, the syntax analyzer,
    are often grouped together into the same pass. In that pass, the
    lexical analyzer operates either under the control of the parser or
    as a co-routine with the parser. The parser asks the lexical analyzer
    for the next token, and the lexical analyzer returns to the parser a
    code for the token that it found. In the case that the token is an
    identifier or another token with a value, the value is also passed to
    the parser. The usual method of providing this information is for the
    lexical analyzer to call a bookkeeping routine which installs the
    actual value in the symbol table if it is not already there. The
    lexical analyzer then passes the two components of the token to the
    parser. The first is a code for the token type (identifier), and the
    second is the value, a pointer to the place in the symbol table
    reserved for the specific value found.
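The hand-off described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the token codes (IF, ID, NUM, OP, PUNCT), the SymbolTable class, and the tiny character set recognized are all hypothetical and not taken from any particular compiler. The "pointer" into the symbol table is modelled as a list index.

```python
import re

# Hypothetical token codes; a real compiler defines its own set.
IF, ID, NUM, OP, PUNCT = "IF", "ID", "NUM", "OP", "PUNCT"

TOKEN_RE = re.compile(
    r"\s*(?:(?P<word>[A-Za-z_]\w*)|(?P<num>\d+)|(?P<op>[=+\-*/])|(?P<punct>[;,()]))"
)


class SymbolTable:
    """Stores each distinct value once and hands back its position."""

    def __init__(self):
        self.entries = []   # values in order of first appearance
        self.index = {}     # value -> position in entries

    def install(self, value):
        """Bookkeeping routine: add the value if absent, return its position."""
        if value not in self.index:
            self.index[value] = len(self.entries)
            self.entries.append(value)
        return self.index[value]


def tokens(source, table):
    """Yield (code, value) pairs; the parser pulls them one at a time."""
    source = source.rstrip()
    pos = 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        pos = m.end()
        if m.group("word"):
            word = m.group("word")
            if word == "IF":                      # a specific-string token
                yield (IF, None)
            else:                                 # a class-of-strings token
                yield (ID, table.install(word))
        elif m.group("num"):
            yield (NUM, table.install(m.group("num")))
        elif m.group("op"):
            yield (OP, m.group("op"))
        else:
            yield (PUNCT, m.group("punct"))


table = SymbolTable()
stream = list(tokens("IF x3 = y + x3 ;", table))
```

Both occurrences of x3 receive the same symbol-table position, because the bookkeeping routine installs a value only when it is not already present.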

    b. Syntax Analysis

    The parser has two functions. It checks that the tokens appearing
    in its input, which is the output of the lexical analyzer, occur in
    patterns that are permitted by the specification for the source
    language. It also imposes on the tokens a


    tree-like structure that is used by the subsequent phases of the
    compiler.

    The second aspect of syntax analysis is to make explicit the
    hierarchical structure of the incoming token stream by identifying
    which parts of the token stream should be grouped together.
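Both functions of the parser, checking the token pattern and imposing a tree, can be seen in a small recursive-descent sketch. The grammar used here (expressions with +, -, *, / and parentheses) and the nested-tuple tree shape are assumptions chosen for illustration; the text above does not prescribe either.

```python
def parse(tokens):
    """Check a token list against a small expression grammar and return
    a nested-tuple tree of the form (operator, left, right).

    Assumed grammar (hypothetical, for illustration only):
        expr   -> term   (('+' | '-') term)*
        term   -> factor (('*' | '/') factor)*
        factor -> NAME | NUMBER | '(' expr ')'
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def factor():
        tok = advance()
        if tok == "(":
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok is not None and tok.isalnum():
            return tok
        raise SyntaxError(f"unexpected token {tok!r}")

    def term():
        node = factor()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, factor())
        return node

    def expr():
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


tree = parse(["a", "+", "b", "*", "3"])
# tree == ("+", "a", ("*", "b", "3")): b * 3 is grouped before the addition
```

A token sequence that does not fit the grammar, such as ["a", "+"], raises SyntaxError instead of producing a tree.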

    2. What is RISC and how is it different from CISC?

    Ans. The Reduced Instruction Set Computer, or RISC, is a
    microprocessor CPU design philosophy that favors a simpler set of
    instructions that all take about the same amount of time to execute.
    The most common RISC microprocessors are AVR, PIC, ARM, DEC Alpha,
    PA-RISC, SPARC, MIPS, and IBM's PowerPC.

    RISC characteristics

    - Small number of machine instructions: fewer than 150
    - Small number of addressing modes: fewer than 4
    - Small number of instruction formats: fewer than 4
    - Instructions of the same length: 32 bits (or 64 bits)
    - Single-cycle execution
    - Load/store architecture
    - Large number of GPRs (General Purpose Registers): more than 32
    - Hardwired control
    - Support for HLL (High Level Language)

    RISC vs. CISC

    CISC                                   RISC
    -------------------------------------  -------------------------------------
    Emphasis on hardware                   Emphasis on software

    Includes multi-clock complex           Single-clock, reduced instructions
    instructions                           only

    Memory-to-memory: "LOAD" and "STORE"   Register-to-register: "LOAD" and
    incorporated in instructions           "STORE" are independent instructions

    Small code sizes, high cycles per      Low cycles per second, large code
    second                                 sizes

    Transistors used for storing complex   Spends more transistors on memory
    instructions                           registers
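The memory-to-memory versus register-to-register row can be made concrete with a toy machine. Everything here is hypothetical: the opcode names (MULT, LOAD, PROD, STORE), the register names, and the run function are invented for this sketch and model no real instruction set. The point is only that one complex instruction and a sequence of simple load/store instructions compute the same result.

```python
def run(program, memory):
    """Execute (opcode, x, y) tuples on a toy machine and return memory."""
    regs = {}
    for op, x, y in program:
        if op == "LOAD":        # register x <- memory[y]
            regs[x] = memory[y]
        elif op == "STORE":     # memory[x] <- register y
            memory[x] = regs[y]
        elif op == "PROD":      # register x <- register x * register y
            regs[x] = regs[x] * regs[y]
        elif op == "MULT":      # CISC style: memory[x] <- memory[x] * memory[y]
            memory[x] = memory[x] * memory[y]
        else:
            raise ValueError(f"unknown opcode {op}")
    return memory


# One complex instruction that loads, multiplies, and stores by itself.
cisc_program = [("MULT", "a", "b")]

# The same work as four simple instructions with explicit LOAD and STORE.
risc_program = [
    ("LOAD", "A", "a"),
    ("LOAD", "B", "b"),
    ("PROD", "A", "B"),
    ("STORE", "a", "A"),
]

cisc_result = run(cisc_program, {"a": 6, "b": 7})
risc_result = run(risc_program, {"a": 6, "b": 7})
```

The RISC-style program is longer (larger code size), but each of its instructions does one simple thing, which is what makes single-cycle execution feasible.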

    3. Explain the following with respect to the

    design specifications of an Assembler:

    A) Data Structures

    B) pass1 & pass2 Assembler flow chart

    Ans. A) Data Structures


    The second step in our design procedure is to establish

    the databases that we have to work with.

    Pass 1 Data Structures

    1. Input source program.
    2. A Location Counter (LC), used to keep track of each
       instruction's location.
    3. A table, the Machine-Operation Table (MOT), that indicates the
       symbolic mnemonic for each instruction and its length (two, four,
       or six bytes).
    4. A table, the Pseudo-Operation Table (POT), that indicates the
       symbolic mnemonic and action to be taken for each pseudo-op in
       pass 1.
    5. A table, the Symbol Table (ST), that is used to store each label
       and its corresponding value.
    6. A table, the Literal Table (LT), that is used to store each
       literal encountered and its corresponding assignment location.
    7. A copy of the input to be used by pass 2.

    Pass 2 Data Structures

    1. Copy of source program input to pass 1.


    2. Location Counter (LC).
    3. A table, the Machine-Operation Table (MOT), that indicates for
       each instruction the symbolic mnemonic, length (two, four, or six
       bytes), binary machine opcode, and format of instruction.
    4. A table, the Pseudo-Operation Table (POT), that indicates the
       symbolic mnemonic and action to be taken for each pseudo-op in
       pass 2.
    5. A table, the Symbol Table (ST), prepared by pass 1, containing
       each label and corresponding value.
    6. A table, the Base Table (BT), that indicates which registers are
       currently specified as base registers by USING pseudo-ops and
       what the specified contents of these registers are.
    7. A work space, INST, that is used to hold each instruction as its
       various parts are being assembled together.
    8. A work space, PRINT LINE, used to produce a printed listing.
    9. A work space, PUNCH CARD, used prior to actual outputting for
       converting the assembled instructions into the format needed by
       the loader.


    10. An output deck of assembled instructions in the

    format needed by the loader.

    Fig. 1.3: Data structures of the assembler

    B) pass1 & pass2 Assembler flow chart

    The third step in our design procedure is to specify the format and
    content of each of the data structures. Pass 2 requires a
    machine-operation table (MOT) containing the name, length, binary
    code, and format; pass 1 requires only the name and length. Instead
    of using two different tables, we construct a single MOT. The
    machine-operation table (MOT) and the pseudo-operation table (POT)
    are examples of fixed tables: their contents are not filled in or
    altered during the assembly process.


    The following figure depicts the format of the machine-op table
    (MOT), which uses 6 bytes per entry:

    Mnemonic opcode  Binary opcode  Instruction length  Instruction format  Not used here
    (4 bytes,        (1 byte,       (2 bits,            (3 bits,            (3 bits)
    characters)      hexadecimal)   binary)             binary)
    ---------------  -------------  ------------------  ------------------  -------------
    Abbb             5A             10                  001
    AHbb             4A             10                  001
    ALbb             5E             10                  001
    ALRb             1E             01                  000
    ...              ...            ...                 ...

    b represents a blank.

    Fig.: Format of the machine-op table (MOT)
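As a rough sketch of how such a 6-byte entry could be laid out, the following Python uses the standard struct module. The field widths come from the table above; the order of the bit fields inside the last byte (length in the top 2 bits, format in the next 3, low 3 bits unused) is an assumption made for this illustration, and the function names are hypothetical.

```python
import struct

# 4-byte mnemonic, 1-byte opcode, 1 byte of packed bit fields = 6 bytes.
ENTRY_FORMAT = "4sBB"


def pack_mot_entry(mnemonic, opcode, length_bits, format_bits):
    """Pack one MOT entry into 6 bytes."""
    if len(mnemonic) > 4 or not 0 <= length_bits < 4 or not 0 <= format_bits < 8:
        raise ValueError("field out of range")
    name = mnemonic.ljust(4).encode("ascii")          # blank-padded, e.g. "A   "
    flags = (length_bits << 6) | (format_bits << 3)   # low 3 bits left unused
    return struct.pack(ENTRY_FORMAT, name, opcode, flags)


def unpack_mot_entry(entry):
    """Return (mnemonic, opcode, length_bits, format_bits) from 6 bytes."""
    name, opcode, flags = struct.unpack(ENTRY_FORMAT, entry)
    return name.decode("ascii").rstrip(), opcode, flags >> 6, (flags >> 3) & 0b111


entry = pack_mot_entry("A", 0x5A, 0b10, 0b001)   # first row of the table
```

Because the MOT is a fixed table, entries like this would be prepared once and only read during assembly.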

    The flowchart for Pass 1:


    The primary function performed by the analysis phase is the
    building of the symbol table. For this purpose it must determine the
    addresses with which the symbol names used in a program are
    associated. It is possible to determine some addresses directly,
    e.g. the address of the first instruction in the program; others
    must be inferred.

    To implement memory allocation, a data structure called the location
    counter (LC) is introduced. The location counter is always made to
    contain the address of the next memory word in the target program.
    It is


    initialized to a constant (the address of the first instruction).
    Whenever the analysis phase sees a label in an assembly statement,
    it enters the label and the contents of LC in a new entry of the
    symbol table. It then finds the number of memory words required by
    the assembly statement and updates the LC contents. This ensures
    that LC points to the next memory word in the target program even
    when machine instructions have different lengths and DS/DC
    statements reserve different amounts of memory. To update the
    contents of LC, the analysis phase needs to know the lengths of
    different instructions. This information depends only on the
    assembly language, hence the mnemonics table can be extended to
    include this information in a new field called length. We refer to
    the processing involved in maintaining the location counter as LC
    processing.
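LC processing can be sketched as a single loop over the statements. The mnemonic lengths in LENGTHS, the (label, mnemonic, operand) statement shape, and the rule that a DC constant takes one word are assumptions for this illustration only; a real assembler would take them from its own mnemonics table and language definition.

```python
# Hypothetical mnemonics table: mnemonic -> length in memory words.
LENGTHS = {"LOAD": 1, "ADD": 1, "STORE": 1, "JUMP": 2}


def pass1(statements, start=0):
    """Build the symbol table. Each statement is (label or None, mnemonic, operand).

    Returns the symbol table and the final value of the location counter.
    """
    lc = start                      # address of the next memory word
    symtab = {}
    for label, mnemonic, operand in statements:
        if label is not None:
            if label in symtab:
                raise ValueError(f"duplicate label {label}")
            symtab[label] = lc      # enter the label with the current LC
        if mnemonic == "DS":        # reserve `operand` words of storage
            lc += int(operand)
        elif mnemonic == "DC":      # assumed: one constant occupies one word
            lc += 1
        else:                       # machine instruction: length from the table
            lc += LENGTHS[mnemonic]
    return symtab, lc


program = [
    ("LOOP", "LOAD", "X"),      # address 100
    (None, "ADD", "ONE"),       # address 101
    (None, "STORE", "X"),       # address 102
    (None, "JUMP", "LOOP"),     # address 103, occupies two words
    ("X", "DS", "3"),           # address 105, reserves three words
    ("ONE", "DC", "1"),         # address 108
]
symtab, end = pass1(program, start=100)
```

Note that X and ONE are used as operands before they are defined; pass 1 does not need their addresses yet, which is why the symbol table is completed before pass 2 generates code.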


    Flow chart for Pass 2

    Fig. : Pass2 flowchart


    4. Define the following:

    A) Parsing
    B) Scanning
    C) Token

    Ans.

    A) Parsing

    Parsing transforms input text or a string into a data structure,
    usually a tree, which is suitable for later processing and which
    captures the implied hierarchy of the input. Lexical analysis
    creates tokens from a sequence of input characters, and it is these
    tokens that are processed by a parser to build a data structure such
    as a parse tree or an abstract syntax tree.

    Conceptually, the parser accepts a sequence of tokens and produces a
    parse tree. In practice this might not occur.

    1. The source program might have errors. Shamefully,

    we will do very little error handling.


    2. Real compilers produce (abstract) syntax trees, not parse trees
    (concrete syntax trees). We don't do this for the pedagogical
    reasons given previously.

    There are three classes for grammar-based parsers.

    1. Universal

    2. Top-down

    3. Bottom-up

    The universal parsers are not used in practice as they

    are inefficient; we will not discuss them.

    As expected, top-down parsers start from the root of the tree and
    proceed downward, whereas bottom-up parsers start from the leaves
    and proceed upward. The commonly used top-down and bottom-up parsers
    are not universal. That is, there are (context-free) grammars that
    cannot be used with them.

    The LL and LR parsers are important in practice. Handwritten
    parsers are often LL. Specifically, the predictive parsers we looked
    at in chapter two are for LL grammars. The LR grammars form a larger
    class. Parsers for this class are usually constructed with the aid
    of automatic tools.


    B) Scanning

    A compiler is a program that converts a source program into
    machine-level language; it is a translator. It analyzes the
    sentences of the source program in order to interpret them and
    generate output. Conceptually, there are three phases of analysis,
    with the output of one phase serving as the input of the next. Each
    of these phases changes the representation of the program being
    compiled. The phases are lexical analysis, or scanning, which
    transforms the program from a string of characters to a string of
    tokens; syntax analysis, or parsing, which transforms the program
    into some kind of syntax tree; and semantic analysis, which
    decorates the tree with semantic information.

    C) Token

    The character stream input is grouped into meaningful units called
    lexemes, which are then mapped into tokens, the latter constituting
    the output of the lexical analyzer.

    For example, any one of the following C statements


    x3 = y + 3;

    x3 = y + 3 ;

    x3 = y+ 3 ;

    but not

    x 3 = y + 3;

    would be grouped into the lexemes x3, =, y, +, 3, and ;.

    A token is a <token-name, attribute-value> pair. The hierarchical
    decomposition of the above sentence is given in the figure.
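The grouping of characters into lexemes, and the effect of white space on it, can be checked with a short Python sketch. The regular expression below covers only the characters that appear in the example statements, and the token names ("id", "number") are assumptions made for illustration. Note that findall silently skips characters it does not recognize, so this is not a complete scanner.

```python
import re

# Identifiers, unsigned integers, and the three symbols used in the example.
LEXEME_RE = re.compile(r"[A-Za-z_]\w*|\d+|[=+;]")


def lexemes(statement):
    """Split a statement into lexemes, ignoring white space between them."""
    return LEXEME_RE.findall(statement)


def to_tokens(statement):
    """Map each lexeme to a (token-name, attribute-value) pair."""
    pairs = []
    for lx in lexemes(statement):
        if lx[0].isalpha() or lx[0] == "_":
            pairs.append(("id", lx))            # attribute: the identifier's name
        elif lx.isdigit():
            pairs.append(("number", int(lx)))   # attribute: the numeric value
        else:
            pairs.append((lx, None))            # the symbol is its own token name
    return pairs


same = lexemes("x3 = y + 3;") == lexemes("x3=y+ 3 ;")
different = lexemes("x 3 = y + 3;")
```

White space between lexemes does not change the result, but a blank inside x3 splits it into the two lexemes x and 3.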


    5. Describe the process of Bootstrapping in the

    context of Linkers

    Ans. The discussions of loading up to this point have all presumed
    that there's already an operating system or at least a program
    loader resident in the computer to load the program of interest. The
    chain of programs being loaded by other programs has to start
    somewhere, so the obvious question is: how is the first program
    loaded into the computer?

    In modern computers, the first program the computer runs after a
    hardware reset is invariably stored in a ROM known as the bootstrap
    ROM, as in "pulling oneself up by the bootstraps."

    When the CPU is powered on or reset, it sets its registers to a
    known state. On x86 systems, for example, the reset sequence jumps
    to the address 16 bytes below the top of the system's address space.
    The bootstrap ROM occupies the top 64K of the address space and ROM
    code then starts up the computer. On IBM-compatible x86 systems, the
    boot ROM code reads the first block of the floppy disk, or if that
    fails the first block of the first hard disk, into memory at address
    0x7C00 and jumps to it. The program in block zero in turn loads a


    slightly larger operating system boot program from a

    known place on the disk into memory, and jumps to that

    program which in turn loads in the operating system

    and starts it. (There can be even more steps, e.g., a

    boot manager that decides from which disk partition to

    read the operating system boot program, but the

    sequence of increasingly capable loaders remains.)

    Why not just load the operating system directly? Because you can't
    fit an operating system loader into 512 bytes. The first-level
    loader typically is only able to load a single-segment program from
    a file with a fixed name in the top-level directory of the boot
    disk. The operating system loader contains more sophisticated code
    that can read and interpret a configuration file, uncompress a
    compressed operating system executable, and address large amounts of
    memory. (On an x86 the loader usually runs in real mode, which means
    that it's tricky to address more than 1MB of memory.) The full
    operating system can turn on the virtual memory system, load the
    drivers it needs, and then proceed to run user-level programs.

    Many Unix systems use a similar bootstrap process to

    get user-mode programs running. The kernel creates a

    process, then stuffs a tiny little program, only a few


    dozen bytes long, into that process. The tiny program

    executes a system call that runs /etc/init, the user

    mode initialization program that in turn runs

    configuration files and starts the daemons and login

    programs that a running system needs.

    None of this matters much to the application-level programmer, but
    it becomes more interesting if you want to write programs that run
    on the bare hardware of the machine, since then you need to arrange
    to intercept the bootstrap sequence somewhere and run your program
    rather than the usual operating system. Some systems make this quite
    easy (just stick the name of your program in AUTOEXEC.BAT and reboot
    Windows 95, for example); others make it nearly impossible. It also
    presents opportunities for customized systems. For example, a
    single-application system could be built over a Unix kernel by
    naming the application /etc/init.

    6. Describe the procedure for design of a Linker.

    Ans. Relocation and linking requirements in

    segmented addressing


    The relocation requirements of a program are influenced by the
    addressing structure of the computer system on which it is to
    execute. Use of the segmented addressing structure reduces the
    relocation requirements of a program.

    -----end-----