UNIT II Control Unit Mrs.Rajani.P.K 03/16/22


  • UNIT II: Control Unit

    Mrs. Rajani P. K.

  • Control Unit
    Single bus organization: register transfer, performing an arithmetic or logic operation, fetching and storing a word from/to memory, execution of a complete instruction, branch instruction. Multi-bus organization. Hardwired control design methods: state table and delay element methods.

    A complete processor. Microprogrammed control: microinstructions, microprogram sequencing, wide branch addressing, microinstructions with next-address field, prefetching microinstructions, emulation.

  • Objectives
    To study single-bus and multi-bus organization.
    To study the execution of an instruction by generating control signals.
    To design a hardwired control unit using the state table method and the delay element method.

  • Reference
    Book: Computer Organization by Hamacher, Chapter 7, p. 411

  • Topics to Cover
    Chapter 7: Basic Processing Unit
    Single-bus organization: register transfer; performing an arithmetic or logic operation; fetching and storing a word from/to memory; execution of a complete instruction; branch instruction
    Multi-bus organization
    Hardwired control: design methods (state table and classical method)
    A complete processor

  • Topics to Cover (continued)
    Microprogrammed control: microinstructions; microprogram sequencing; wide branch addressing; microinstructions with next-address field; prefetching microinstructions
    Emulation

  • Recap: Organisation

  • Fundamental Concepts
    Processor (CPU): the active part of the computer, which does all the work (data manipulation and decision-making).
    Datapath: the portion of the processor that contains the hardware necessary to perform all operations required by the computer.
    Control: the portion of the processor (also in hardware) that tells the datapath what needs to be done (the brain).

  • Fundamental Concepts (2)
    Instruction execution cycle: fetch, decode, execute.
    Fetch: fetch the next instruction (using the PC) from memory into the IR.
    Decode: decode the instruction.
    Execute: execute the instruction.

  • Processor
    The processing unit (Instruction Set Processor, ISP), or processor, executes machine instructions and coordinates the activities of the other units.
    It used to be called the Central Processing Unit (CPU). "Central" is less appropriate today, since modern computers include several processing units.

  • Assume each word is 4 bytes, each instruction is stored in one word, and the memory is byte addressable.

    To execute an instruction, the processor has to perform the following 3 steps:

    1. Fetch the contents of the memory location pointed to by the PC. The contents of this location are interpreted as an instruction to be executed, so they are loaded into the Instruction Register (IR). Symbolically: IR ← [[PC]]
    2. Because the memory is byte addressable, increment the contents of the PC by 4: PC ← [PC] + 4
    3. Carry out the actions specified by the instruction in the IR.

    If an instruction occupies more than one word, steps 1 and 2 must be repeated as many times as necessary to fetch the complete instruction.
    Steps 1 and 2: fetch phase. Step 3: execution phase.
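
The fetch phase above can be sketched in a few lines of Python. The memory contents, the big-endian byte order, and the function name `fetch` are assumptions made for illustration only; they are not taken from the slides.

```python
# Hypothetical 16-byte, byte-addressable memory holding four 4-byte instruction words.
memory = bytes.fromhex("11223344" "55667788" "99aabbcc" "ddeeff00")

def fetch(pc):
    """Return (IR, new PC): IR <- [[PC]], then PC <- [PC] + 4."""
    # Big-endian byte order is an assumption of this sketch.
    ir = int.from_bytes(memory[pc:pc + 4], "big")
    return ir, pc + 4

ir, pc = fetch(0)
print(hex(ir), pc)
```

Calling `fetch` again with the returned PC reads the next word, which is how a multi-word instruction would repeat steps 1 and 2.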

  • Single-bus Organization (p. 413)

  • Single-bus Organization
    Instructions can be executed by performing one or more of the following operations:
    Register transfer: transfer a word of data from one processor register to another or to the ALU.
    Performing an ALU operation: perform an arithmetic or logic operation and store the result in a processor register.
    Fetching a word from memory: fetch the contents of a given memory location and load them into a processor register.
    Storing a word to memory: store a word of data from a processor register into a given memory location.

  • Multiple-Bus Organization

  • Instruction Execution
    An instruction can be executed by performing one or more of the following operations in some specified sequence:
    Transfer a word of data from one register to another or to the ALU (Arithmetic Logic Unit).
    Perform an arithmetic or a logic operation and store the result in a register.
    Fetch the contents of a given memory location and load them into a register.
    Store a word of data from a register into a given memory location.

  • Register Transfer (p. 415)
    Register-to-register transfer: each register Ri has two control signals.
    Riin loads the data on the bus into the register.
    Riout places the register's contents on the bus.

    Example: to transfer the contents of R1 to R4:
    1. Set R1out to 1. This places the contents of R1 on the bus.
    2. Set R4in to 1. This loads data from the processor bus into R4.
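
A minimal sketch of this transfer, modelling the registers as a dictionary. The function `bus_cycle` and its parameter names are hypothetical; they only mirror the Riout/Riin signals described above.

```python
def bus_cycle(regs, out_signal, in_signals):
    """One clock cycle: one register drives the bus, any number of registers load it."""
    bus = regs[out_signal]       # Riout = 1 places the contents of Ri on the bus
    for name in in_signals:      # Riin = 1 loads the bus value into Ri
        regs[name] = bus
    return bus

regs = {"R1": 7, "R4": 0}
bus_cycle(regs, out_signal="R1", in_signals=["R4"])   # R1out, R4in
print(regs)
```

Only one `out_signal` is allowed per cycle because a single bus can carry only one value at a time.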

  • Register Transfer (2)
    Input and output gating for registers

  • All operations and data transfers take place within time periods defined by the processor clock.
    When two or more clock signals are needed to guarantee proper transfer of data, the scheme is known as multiphase clocking.

  • Input and output gating of one register bit
    When Riin = 1, the MUX selects the data on the bus. This data is loaded into the flip-flop at the rising edge of the clock.
    When Riin = 0, the MUX feeds back the value currently stored in the flip-flop.
    The Q output is connected to the bus via a tri-state gate.
    When Riout = 0, the gate's output is in the high-impedance state.
    When Riout = 1, the gate drives the bus to the value 0 or 1.

  • Arithmetic/Logic Operation (p. 415)
    ALU: performs arithmetic and logic operations on its A and B inputs.
    A sequence of operations to add the contents of R1 to R2 and store the result in R3, i.e. R3 ← [R1] + [R2]:
    1. R1out, Yin
    2. R2out, SelectY, Add, Zin
    3. Zout, R3in

  • STEP 1: The output of R1 and the input of Y are enabled, so the contents of R1 are transferred to Y over the bus.

    STEP 2: The MUX select signal is set to SelectY, so the MUX gates the contents of Y to input A of the ALU. At the same time, the contents of R2 are gated onto the bus and hence to input B of the ALU. The Add line is set to 1, so the ALU output is A + B, which is loaded into Z.

    STEP 3: The contents of Z are transferred to the destination register R3.
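
The three control steps can be traced with a small sketch. The register values are made up, and the dictionary model of the datapath is an assumption for illustration.

```python
regs = {"R1": 5, "R2": 9, "R3": 0, "Y": 0, "Z": 0}

# Step 1: R1out, Yin -> the bus carries [R1], Y loads it
regs["Y"] = regs["R1"]

# Step 2: R2out, SelectY, Add, Zin -> ALU input A = [Y], input B = bus = [R2]
bus = regs["R2"]
regs["Z"] = regs["Y"] + bus

# Step 3: Zout, R3in -> the bus carries [Z], R3 loads it
regs["R3"] = regs["Z"]

print(regs["R3"])
```

Registers Y and Z are needed because the single bus can deliver only one operand per step.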

  • Arithmetic/Logic Operation (2)
    If there are n operations, do we need n ALU control lines? No.
    We could use encoding, which requires ⌈log2 n⌉ control lines for n operations. For n = 8, 3 control signals are sufficient.
    However, this increases complexity and hardware (an additional decoder is needed).
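
As a quick check of the encoding argument, the sketch below computes the number of encoded lines and models the extra decoder. The function names `lines_needed` and `decode` are hypothetical.

```python
def lines_needed(n_ops):
    """Number of encoded control lines for n_ops operations: ceil(log2(n_ops))."""
    return (n_ops - 1).bit_length()

def decode(code, n_ops):
    """Model of the additional decoder: turn a binary code into one-hot operation lines."""
    return [1 if i == code else 0 for i in range(n_ops)]

print(lines_needed(8))   # 3
print(decode(5, 8))
```

The integer `bit_length` form avoids floating-point rounding that a direct `log2` call could introduce for large n.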

  • Fetching a Word from Memory (p. 418)
    General operation:
    The processor has to specify the address of the memory location and request a Read operation.
    The data may be an instruction or an operand.

    The processor transfers the required address to the MAR, whose output is connected to the address lines of the memory bus.
    At the same time, the processor uses the control lines of the memory bus to indicate that a Read operation is needed.
    When the requested data are received from the memory, they are stored in the MDR.
    From the MDR, they can be transferred to other registers in the processor.

  • MDR Connections
    The MDR has four control signals:
    MDRin and MDRout control the connection to the internal bus.
    MDRinE and MDRoutE control the connection to the external bus.

  • Fetching a Word from Memory (p. 418)
    Move (R1), R2    /* R2 ← [[R1]] */

    1. MAR ← [R1]
    2. Start a Read operation on the memory bus
    3. Wait for the MFC (Memory Function Completed) response from the memory
    4. Load MDR from the memory bus
    5. R2 ← [MDR]

  • Reading a Word from Memory (2)
    Move (R1), R2    /* R2 ← [[R1]] */
    Sequence of control steps:
    1. R1out, MARin, Read
    2. MDRinE, WMFC
    3. MDRout, R2in
    WMFC: wait for the arrival of the MFC (Memory Function Completed) signal.
    MFC: to accommodate variability in response time, the processor waits until it receives an indication that the Read/Write operation has been completed. The addressed device sets MFC to 1 to indicate this.

  • Storing a Word in Memory
    Move R2, (R1)    /* [R1] ← [R2] */
    Sequence of control steps:
    1. R1out, MARin
    2. R2out, MDRin, Write
    3. MDRoutE, WMFC

    Steps:
    1. The desired address is loaded into the MAR.
    2. The data to be written are loaded into the MDR, and the Write command is issued.
    3. The processor remains in step 3 until the memory operation is completed and the MFC response is received.
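
The read and write sequences can be sketched side by side. Memory is modelled as a dictionary that responds immediately, so the WMFC wait is implicit; the addresses, values, and function names are assumptions for illustration.

```python
def move_load(regs, memory):
    """Move (R1), R2: R2 <- [[R1]]"""
    regs["MAR"] = regs["R1"]             # 1. R1out, MARin, Read
    regs["MDR"] = memory[regs["MAR"]]    # 2. MDRinE, WMFC (memory responds)
    regs["R2"] = regs["MDR"]             # 3. MDRout, R2in

def move_store(regs, memory):
    """Move R2, (R1): [R1] <- [R2]"""
    regs["MAR"] = regs["R1"]             # 1. R1out, MARin
    regs["MDR"] = regs["R2"]             # 2. R2out, MDRin, Write
    memory[regs["MAR"]] = regs["MDR"]    # 3. MDRoutE, WMFC

memory = {0x100: 42}
regs = {"R1": 0x100, "R2": 0, "MAR": 0, "MDR": 0}
move_load(regs, memory)
regs["R1"] = 0x104
move_store(regs, memory)
print(regs["R2"], memory[0x104])
```

In both directions the MAR holds the address and the MDR is the only register that touches the external bus.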

  • Executing a Complete Instruction
    Add (R3), R1    /* R1 ← [R1] + [[R3]] */
    Adds the contents of the memory location pointed to by R3 to register R1.

    General steps:
    1. Fetch the instruction.
    2. Fetch the first operand (the contents of the memory location pointed to by R3).
    3. Perform the addition.
    4. Load the result into R1.

  • Executing a Complete Instruction
    Sequence of control steps:
    1. PCout, MARin, Read, Select4, Add, Zin
    2. Zout, PCin, Yin, WMFC
    3. MDRout, IRin
    4. R3out, MARin, Read
    5. R1out, Yin, WMFC
    6. MDRout, SelectY, Add, Zin
    7. Zout, R1in, End
    Steps 1 to 3: instruction fetch.

  • STEP 1: PCout, MARin, Read, Select4, Add, Zin
    The instruction fetch begins by loading the contents of the PC into the MAR and sending a Read request to the memory. [PC] is now on the bus.
    Select4 causes the MUX to select the constant 4, so ALU operand A is the constant 4 and operand B is [PC] from the bus.
    The ALU computes [PC] + 4, and the result is loaded into Z.

  • STEP 2: Zout, PCin, Yin, WMFC
    The contents of Z are moved back into the PC (and into Y) while waiting for the memory to respond.

    STEP 3: MDRout, IRin
    The word fetched from memory is loaded into the IR.

    ----------------------------- Instruction fetch completed -----------------------------

    STEP 4: R3out, MARin, Read
    [R3] is loaded into the MAR and a memory Read operation is initiated.

    STEP 5: R1out, Yin, WMFC
    [R1] is loaded into Y, and the processor waits until the memory Read of the first operand is completed.

  • STEP 6: MDRout, SelectY, Add, Zin
    When MFC = 1 (Read operation completed), the memory operand is available in the MDR.
    [MDR] is gated onto the bus and transferred to input B of the ALU.
    SelectY gates Y (the contents of R1) to input A of the ALU; the sum is loaded into Z.

    STEP 7: Zout, R1in, End
    The contents of Z are transferred to R1, and End starts a new instruction fetch cycle.
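
The whole control sequence for Add (R3), R1 can be traced in a sketch like the one below. Memory is a dictionary that answers immediately, and the instruction word is stored as a plain string; both are simplifying assumptions, as are all addresses and values.

```python
def run_add_r3_r1(regs, memory):
    """Trace Add (R3), R1 through the seven control steps of the single-bus datapath."""
    # 1. PCout, MARin, Read, Select4, Add, Zin
    regs["MAR"] = regs["PC"]
    regs["Z"] = regs["PC"] + 4
    # 2. Zout, PCin, Yin, WMFC
    regs["PC"] = regs["Y"] = regs["Z"]
    regs["MDR"] = memory[regs["MAR"]]      # memory delivers the instruction word
    # 3. MDRout, IRin
    regs["IR"] = regs["MDR"]
    # 4. R3out, MARin, Read
    regs["MAR"] = regs["R3"]
    # 5. R1out, Yin, WMFC
    regs["Y"] = regs["R1"]
    regs["MDR"] = memory[regs["MAR"]]      # memory delivers the operand
    # 6. MDRout, SelectY, Add, Zin
    regs["Z"] = regs["Y"] + regs["MDR"]
    # 7. Zout, R1in, End
    regs["R1"] = regs["Z"]
    return regs

memory = {0: "Add (R3), R1", 0x200: 30}
regs = {"PC": 0, "R1": 12, "R3": 0x200, "MAR": 0, "MDR": 0, "IR": None, "Y": 0, "Z": 0}
run_add_r3_r1(regs, memory)
print(regs["R1"], regs["PC"])
```

Note how step 2 overlaps the PC update with the wait for memory, which is why the fetch takes only three steps.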

  • Branch Instruction (p. 423)

    Unconditional branch:
    1. PCout, MARin, Read, Select4, Add, Zin
    2. Zout, PCin, Yin, WMFC
    3. MDRout, IRin
    4. Offset-field-of-IRout, Add, Zin
    5. Zout, PCin, End

    Conditional branch (taken only when N = 1): step 4 becomes
    4. Offset-field-of-IRout, Add, Zin, If N = 0 then End

    The fetch cycle ends when the instruction is loaded into the IR.
    The offset value is extracted from the IR by the instruction decoding circuit and gated onto the bus.
    The updated PC value, held in Y since step 2, is ALU operand A; the offset on the bus is operand B.
    The ALU adds them, and the result in Z is the branch target address.
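
A minimal sketch of how the next PC value follows from steps 4 and 5. The function name, its parameters, and the example addresses are hypothetical.

```python
def next_pc(pc_updated, offset, conditional=False, n_flag=0):
    """Return the PC after a branch; pc_updated is the value already incremented by 4."""
    if conditional and n_flag == 0:
        return pc_updated            # step 4 ends early: continue in sequence
    return pc_updated + offset       # steps 4-5: Z <- [Y] + offset, PC <- [Z]

print(next_pc(104, -24))
print(next_pc(104, -24, conditional=True, n_flag=0))
print(next_pc(104, -24, conditional=True, n_flag=1))
```

The offset is added to the updated PC, not to the address of the branch instruction itself, because Y was loaded in step 2.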

  • Multiple-Bus Organization
    Single-bus structure: control sequences are long because only one data item can be transferred over the bus in a clock cycle.
    The figure on the next slide shows a three-bus structure.
    All registers are combined into a single block called the register file, which has three ports: two output ports, allowing two registers to be accessed simultaneously and their contents placed on buses A and B, and one input port, allowing data on bus C to be loaded into a third register.
    Buses A and B transfer the source operands to the A and B inputs of the ALU, and the result is transferred to the destination over bus C.

  • Multiple-Bus Organization (2)

  • Multiple-Bus Organization (3)
    For the ALU, R=A (or R=B) means that its A (or B) input is passed unmodified to bus C.
    Add R4, R5, R6    /* R6 ← [R4] + [R5] */
    Adds the contents of R4 and R5 and stores the sum in R6.
    Sequence of control steps:
    1. PCout, R=B, MARin, Read, IncPC
    2. WMFC
    3. MDRoutB, R=B, IRin
    4. R4outA, R5outB, SelectA, Add, R6in, End

  • STEP 1: The contents of the PC are passed through the ALU and stored in the MAR to start a memory Read operation. At the same time, the PC is incremented by 4. The MAR now holds the original PC value and the PC holds the updated value.
    STEP 2: The processor waits for MFC and loads the data received into the MDR.
    STEP 3: The data in the MDR are transferred to the IR through the ALU, using the R=B control signal.
    STEP 4: The execution phase of the instruction requires only 1 control step.

  • Control
    Control signals can be generated in two ways: hardwired control and microprogrammed control.
    Hardwired control:

  • The required control signals are determined by the following information:
    Contents of the control step counter
    Contents of the instruction register
    Contents of the condition code register (CCR)
    External input signals, such as MFC and interrupt requests

    The encoder/decoder circuit generates the required control outputs depending on the state of all its inputs.

  • Separation of the decoding and encoding functions (Fig. 7.11)

  • The step decoder provides a separate signal line for each step, or time slot, in the control sequence.
    The output of the instruction decoder consists of a separate line for each machine instruction.
    For any instruction loaded in the IR, one of the output lines INS1 through INSm is set to 1 and the rest are set to 0.
    The encoder block combines its inputs to generate the individual control signals Yin, PCout, Add, End, etc.

  • Example: Zin = T1 + T6·ADD + T4·BR + ...
    Zin is asserted during time slot T1 for all instructions, during T6 for an ADD instruction, during T4 for an unconditional branch instruction, and so on.

  • The End control signal is generated from the logic function
    End = T7·ADD + T5·BR + (T5·N + T4·N')·BRN + ...
    where N' denotes the complement of N.
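
These encoder functions translate directly into Boolean expressions. The sketch below covers only the terms listed on the slides (the "..." terms for other instructions are left out), and the function names and argument encoding are assumptions.

```python
def zin(t, instr):
    """Zin = T1 + T6.ADD + T4.BR (only the terms listed above)."""
    return t == 1 or (t == 6 and instr == "ADD") or (t == 4 and instr == "BR")

def end(t, instr, n):
    """End = T7.ADD + T5.BR + (T5.N + T4.N').BRN (only the terms listed above)."""
    return ((t == 7 and instr == "ADD")
            or (t == 5 and instr == "BR")
            or (instr == "BRN" and ((t == 5 and n) or (t == 4 and not n))))

print(zin(6, "ADD"), end(4, "BRN", False), end(4, "BRN", True))
```

Here `t` is the active time slot from the step decoder, `instr` is the active instruction-decoder line, and `n` is the N condition flag.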

  • Hardwired Control
    The sequence of operations carried out by the machine is determined by the wiring of the logic elements, hence the name "hardwired".
    Hardwired control provides the highest speed.
    RISCs are implemented with hardwired control.
    If the instruction set becomes very complex (CISCs), implementing hardwired control is very difficult. In this case microprogrammed control units are used.
    To allow execution of register-to-register operations in a single clock cycle, RISCs (and other modern processors) use three-bus CPU structures.

  • There are 4 techniques for the design of a hardwired control unit:
    1. State-table method: the classical method of sequential circuit design. It attempts to minimize the amount of hardware.
    2. Delay-element method: a heuristic method based on the use of clocked delay elements (D flip-flops) for control signal timing.
    3. Sequence-counter method: uses a counter for timing purposes.
    4. PLA method: uses a Programmable Logic Array (PLA).

  • State-table method
    Start with the construction of a state transition table.
    In every state the control unit generates a set of control signals.
    The control unit moves from one state to another depending on:
    1) its current state
    2) the inputs to the controller

  • State-table method
    Example: hardwired control unit for multiplication of 2 unsigned numbers. The flowchart is on the next slide.
    Multiplicand = M, multiplier = Q. Register C holds any carry produced during addition.
    Initially, registers C and AC are cleared, and the sequence count register is set to n, the number of bits in the multiplier.
    Next is a loop that keeps forming the partial products:
    If the least significant bit of Q is 1, M is added to AC, and any carry from the addition is transferred to C. Otherwise no addition is needed.
    The counter is decremented by 1.
    Registers C, AC and Q are then shifted once to the right to obtain the partial product.
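
The multiplication loop that this control unit sequences can be sketched as follows. The function assumes both operands fit in n bits; the name `multiply` and the variable layout are illustrative, not taken from the slides.

```python
def multiply(m, q, n):
    """Shift-and-add multiplication of two n-bit unsigned numbers (M times Q)."""
    mask = (1 << n) - 1
    c, ac = 0, 0                          # C and AC are cleared initially
    for _ in range(n):                    # sequence counter runs n times
        if q & 1:                         # least significant bit of Q is 1
            total = ac + m
            c, ac = total >> n, total & mask   # carry out of the addition goes to C
        # Shift C, AC, Q right by one position as a single combined register.
        q = ((ac & 1) << (n - 1)) | (q >> 1)
        ac = (c << (n - 1)) | (ac >> 1)
        c = 0
    return (ac << n) | q                  # the product sits in AC (high) and Q (low)

print(multiply(13, 11, 4))   # 143
```

Each loop iteration corresponds to one pass through the add and shift states of the flowchart.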

  • State-table method:


  • Delay-Element Method
    The control signals from the control unit are activated in a proper sequence.
    There is a specific time delay between the activation of 2 consecutive groups of control signals.
    A sequence of delay elements can be used to generate the control signals one after the other.
    To ensure synchronous operation, the delay elements are implemented by D flip-flops controlled by a common clock signal.

  • Delay-Element Method
    A control unit using delay elements can be constructed directly from the flowchart that specifies the required control signal sequence.
    Every state requires a delay element.
    The signals that activate the same control signal are ORed to get one common output signal.
    When n lines in the flowchart merge at a common point, these lines become the inputs of an n-input OR gate.
    A decision box can be implemented by two 2-input AND gates. The first input of each AND gate is driven by input A and the complement of A, respectively, while the second input of both gates is common: the output of the delay element.
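
A chain of clocked delay elements behaves like a one-hot shift register: a start pulse travels from one D flip-flop to the next, activating one group of control signals per clock. The sketch below assumes a three-state straight-line flowchart with no decision boxes; the names are illustrative.

```python
def clock_edge(state, start):
    """One clock edge: each D flip-flop takes the output of its predecessor."""
    return [start] + state[:-1]

state = [0, 0, 0]            # outputs of three delay elements, one per state
trace = []
for start in (1, 0, 0, 0):   # a single start pulse, then nothing
    state = clock_edge(state, start)
    trace.append(state)

for row in trace:
    print(row)
```

A decision box would split the pulse with two AND gates, one gated by the condition and one by its complement, so that only one successor element receives it.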

  • Delay-Element Method

  • A Complete Processor (p. 428)

  • The instruction unit fetches instructions from the instruction cache, or from main memory when the desired instructions are not already in the cache.
    The processor has separate processing units to deal with integer and floating-point data.
    A data cache is inserted between these units and main memory.
    Today many processors use separate caches for instructions and data.
    The processor is connected to the system bus, and hence to the rest of the computer, by means of a bus interface.
    A processor may include several integer and floating-point units to increase the potential for concurrent operations.

  • Microprogrammed control

    Control signals are generated by a program. A control word (CW) is a microinstruction whose individual bits represent the various control signals.

    Vertical organization: highly encoded schemes that use compact codes to specify only a small number of control functions in each microinstruction.

    Horizontal organization: minimally encoded schemes in which many resources can be controlled with a single microinstruction.

    Popular in Complex Instruction Set Computer (CISC) architectures, because complex instruction sets require complex controllers that can more easily be implemented as microprograms.

  • A sequence of CWs corresponding to the control sequence of a machine instruction constitutes the MICROROUTINE for that instruction.
    The individual control words in this microroutine are called MICROINSTRUCTIONS.
    The microroutines for all instructions in the instruction set of a computer are stored in a special memory called the CONTROL STORE.
    To read control words sequentially from the control store, a MICROPROGRAM COUNTER (µPC) is used.

  • Example of a horizontal organization scheme
    Control sequence:
    1. PCout, MARin, Read, Select4, Add, Zin
    2. Zout, PCin, Yin, WMFC
    3. MDRout, IRin
    4. R3out, MARin, Read
    5. R1out, Yin, WMFC
    6. MDRout, SelectY, Add, Zin
    7. Zout, R1in, End
    Select = 0: SelectY; Select = 1: Select4
    Control signals are generated by a program similar to machine language programs.
    Control word (CW); microroutine; microinstruction

    Step  PCin PCout MARin Read MDRout IRin Yin Select Add Zin Zout R1out R1in R3out WMFC End
    1     0    1     1     1    0      0    0   1      1   1   0    0     0    0     0    0
    2     1    0     0     0    0      0    1   0      0   0   1    0     0    0     1    0
    3     0    0     0     0    1      1    0   0      0   0   0    0     0    0     0    0
    4     0    0     1     1    0      0    0   0      0   0   0    0     0    1     0    0
    5     0    0     0     0    0      0    1   0      0   0   0    1     0    0     1    0
    6     0    0     0     0    1      0    0   0      1   1   0    0     0    0     0    0
    7     0    0     0     0    0      0    0   0      0   0   1    0     1    0     0    1

  • Fig. 7.16: Basic organization of a microprogrammed control unit

  • To read control words sequentially from the control store, a microprogram counter (µPC) is used.
    Every time a new instruction is loaded into the IR, the output of the starting address generator is loaded into the µPC.
    The µPC is then automatically incremented by the clock, causing successive microinstructions to be read from the control store.
    Hence the control signals are delivered to the various parts of the processor in the correct sequence.
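
The basic sequencing can be sketched with the control store as a list of control words and the µPC as an index. The control words are the seven steps of Add (R3), R1 from the earlier slides; storing them as sets of signal names rather than bit patterns, and placing the microroutine at address 0, are assumptions of this sketch.

```python
# Control store holding the microroutine for Add (R3), R1, one control word per address.
CONTROL_STORE = [
    {"PCout", "MARin", "Read", "Select4", "Add", "Zin"},
    {"Zout", "PCin", "Yin", "WMFC"},
    {"MDRout", "IRin"},
    {"R3out", "MARin", "Read"},
    {"R1out", "Yin", "WMFC"},
    {"MDRout", "SelectY", "Add", "Zin"},
    {"Zout", "R1in", "End"},
]

def run_microroutine(start_address):
    """Issue control words in sequence until one asserts End."""
    upc = start_address              # loaded from the starting address generator
    issued = []
    while True:
        control_word = CONTROL_STORE[upc]
        issued.append(control_word)
        if "End" in control_word:
            return issued
        upc += 1                     # the clock increments the microprogram counter

issued = run_microroutine(0)
print(len(issued))
```

This simple scheme has no way to branch inside a microroutine, which is the limitation the next slides address.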

  • Modification required in the basic organization
    For a branch instruction, the control unit must check the status of the condition codes or external inputs to choose between alternative courses of action.
    In hardwired control, this situation is handled by including an appropriate logic function in the encoder circuitry.
    In microprogrammed control, conditional branch microinstructions are used. In addition to the branch address, these microinstructions specify which external inputs, condition codes, or bits of the instruction register should be checked as a condition for the branch to take place.

  • Example: microroutine for a branch instruction (p. 431)

  • Steps
    After loading the instruction into the IR, a branch microinstruction transfers control to the corresponding microroutine, which is assumed to start at location 25 in the control store. This address (25) is the output of the starting address generator.
    The microinstruction at location 25 tests the N bit of the condition codes.
    If N = 0, branch to location 0 to fetch a new machine instruction.
    Otherwise, the microinstruction at location 26 is executed to put the branch target address into register Z, and the microinstruction at location 27 loads this address into the PC.

  • Figure 7.18. Organization of the control unit to allow conditional branching in the microprogram.
    (Blocks shown: IR, starting and branch address generator, external inputs, condition codes, clock, µPC, control store, CW.)

  • Microinstructions (p. 433)
    Format of individual microinstructions
    A straightforward way to structure microinstructions is to assign one bit position to each control signal (see the figure on the next slide).
    However, this is very inefficient, as it results in long microinstructions. Also, only a few bits are 1 (active gating) in any given microinstruction, so the available bit space is poorly used.
    The length can be reduced: most signals are not needed simultaneously, and many signals are mutually exclusive.
    All mutually exclusive signals are placed in the same group and are represented using a binary coding scheme.
    This reduces the number of bits needed to store the pattern for 42 signals from 42 to 20, so the control store is smaller.

  • Partial format for the microinstructions
    What is the price paid for this scheme?

  • Example: field F4 contains only 4 bits, which specify one of the 16 ALU operations.

    Problem: grouping control signals into fields requires a little more hardware, because decoding circuits must be used to decode the bit patterns.
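
The saving from grouping can be estimated with a short calculation. The group sizes below are hypothetical and are not the actual field layout from the textbook figure; the `idle` flag models the assumption that a field usually reserves one extra code meaning "no signal active".

```python
def field_bits(k, idle=True):
    """Bits needed to encode k mutually exclusive signals, optionally plus a 'none' code."""
    codes = k + (1 if idle else 0)
    return (codes - 1).bit_length()

# Hypothetical grouping of control signals into fields (sizes are made up).
groups = [8, 5, 7, 3, 1, 1]
unencoded = sum(groups)                        # one bit per signal
encoded = sum(field_bits(k) for k in groups)   # one binary field per group

print(field_bits(16, idle=False))   # 4, as for a field like F4 with 16 ALU operations
print(unencoded, encoded)
```

Every encoded field needs its own decoder in hardware, which is the price mentioned above.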

  • Further Improvement
    Enumerate the patterns of required signals in all possible microinstructions. Each meaningful combination of active control signals can then be assigned a distinct code that represents the microinstruction.
    Such full encoding further reduces the microword length but increases the complexity of the decoder circuits.

  • Vertical organization: highly encoded schemes that use compact codes to specify only a small number of control functions in each microinstruction.
    Horizontal organization: minimally encoded schemes in which many resources can be controlled with a single microinstruction.
    Horizontal organization is used when higher operating speed is desired and when the machine structure allows parallel use of resources.
    Vertical organization results in slower operating speeds because more microinstructions are needed to perform the desired control functions.

  • Microprogram Sequencing
    If all microprograms required only straightforward sequential execution of microinstructions, except for branches, letting a µPC govern the sequencing would be efficient.
    However, there are two disadvantages:
    1. Having a separate microroutine for each machine instruction results in a large total number of microinstructions and a large control store.
    2. Execution time is longer because it takes more time to carry out the required branches.
    Example: Add src, Rdst (adds the source operand to the contents of register Rdst). Assume the source operand can be specified in four addressing modes: register, autoincrement, autodecrement, and indexed (with indirect forms of all 4).

  • Flowchart of a microprogram for the given instruction (next slide)

    Each box corresponds to a microinstruction that controls the transfers and operations indicated in the box. Each microinstruction is located at the address indicated by the octal number.

  • Branch address modification using bit-ORing

    The figure shows that branches are not always made to a single branch address, because simple microroutines are combined to share common parts.
    At one point in the flowchart it is necessary to choose between the actions required by direct and indirect addressing modes.
    For indirect addressing, the microinstruction at location 170 is performed; otherwise execution continues at 171, bypassing 170.
    The most efficient way to bypass 170 is to have the preceding branch microinstruction specify address 170 and then use an OR gate to change the least significant bit of this address to 1 if the direct addressing mode is involved.
    This is called the bit-ORing technique.
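
The bit-ORing step amounts to a single OR on the least significant address bit. In the sketch below the function name and the Boolean `indirect` parameter are illustrative.

```python
def branch_with_bit_or(branch_address, indirect):
    """OR a 1 into the least significant bit of the address for direct addressing."""
    return branch_address | (0 if indirect else 1)

print(oct(branch_with_bit_or(0o170, indirect=True)))    # 0o170
print(oct(branch_with_bit_or(0o170, indirect=False)))   # 0o171
```

This works only because 170 (octal) has a 0 in its least significant bit, so the two targets differ in exactly that bit.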

  • Wide Branch Addressing
    Fig. 7.20 includes a wide branch at location 003.
    The instruction decoder generates the starting address of the microroutine that implements the instruction loaded into the IR.
    Here the IR contains an Add instruction, for which the decoder generates address 101. However, this address cannot be loaded directly into the µPC.
    There are 5 possible addressing modes, shown on different branches at addresses 161, 141, 121, 101 and 111.
    The bit-ORing technique is then used to modify the starting address generated by the decoder to reach the appropriate path.

  • Figure 7.21. Microinstruction for Add (Rsrc)+, Rdst.
    Contents of IR: OP code (bits 11 and up) | Mode = 010 (bits 10-8) | Rsrc (bits 7-4) | Rdst (bits 3-0)
    Note: the microinstruction at location 170 is not executed for this addressing mode.

    Address (octal)  Microinstruction
    000              PCout, MARin, Read, Select4, Add, Zin
    001              Zout, PCin, Yin, WMFC
    002              MDRout, IRin
    003              µBranch {µPC ← 101 (from instruction decoder); µPC5,4 ← [IR10,9]; µPC3 ← [IR10]'·[IR9]'·[IR8]}
    121              Rsrcout, MARin, Read, Select4, Add, Zin
    122              Zout, Rsrcin
    123              µBranch {µPC ← 170; µPC0 ← [IR8]'}, WMFC
    170              MDRout, MARin, Read, WMFC
    171              MDRout, Yin
    172              Rdstout, SelectY, Add, Zin
    173              Zout, Rdstin, End
    (A prime denotes the complement of the bit.)
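
The wide branch at location 003 can be checked against the five branch addresses quoted earlier (101, 111, 121, 141, 161). The sketch assumes that bits 5 and 4 of the µPC are ORed with IR bits 10 and 9, and that bit 3 is set only when IR10 = 0, IR9 = 0 and IR8 = 1; the function name is illustrative.

```python
def wide_branch_address(ir10, ir9, ir8, base=0o101):
    """Modify the decoder's starting address by bit-ORing in the addressing-mode bits."""
    address = base
    address |= (ir10 << 5) | (ir9 << 4)              # µPC bits 5,4 <- IR bits 10,9
    address |= ((1 - ir10) & (1 - ir9) & ir8) << 3   # µPC bit 3 <- IR10'.IR9'.IR8
    return address

for mode in [(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 0, 0), (1, 1, 0)]:
    print(mode, oct(wide_branch_address(*mode)))
```

For the mode field 010 of the figure, the result is 121, which is where the autoincrement microroutine starts.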

  • Microinstructions with Next-Address Field
    The microprogram discussed so far requires several branch microinstructions, which perform no useful operation in the datapath and are used only to determine the address of the next microinstruction.
    A powerful alternative approach is to include an address field as a part of every microinstruction to indicate the location of the next microinstruction to be fetched.
    Pros: separate branch microinstructions are virtually eliminated; few limitations in assigning addresses to microinstructions.
    Cons: additional bits for the address field (around 1/6 of the control store capacity would be devoted to addressing).

  • Since each microinstruction contains the address of the next microinstruction, there is no need for a µPC; it is replaced with a microinstruction address register (µAR).
    The µAR is loaded from the next-address field in each microinstruction.
    The next-address bits are fed through OR gates to the µAR.
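
Sequencing with a next-address field can be sketched as a lookup loop. The control store contents, the addresses, and the `or_bits` callback (standing in for the OR gates in front of the µAR) are all hypothetical.

```python
def run(store, start, or_bits):
    """Follow next-address fields from start until a microinstruction has no successor."""
    uar, trace = start, []
    while uar is not None:
        signals, next_address = store[uar]
        trace.append(uar)
        # Next-address bits pass through OR gates on their way to the address register.
        uar = None if next_address is None else next_address | or_bits(uar)
    return trace

# Each entry: address -> (control signals, next-address field). Contents are made up.
store = {
    0o000: ("hypothetical step A", 0o005),
    0o005: ("hypothetical step B", 0o170),
    0o170: ("indirect-only step", 0o171),
    0o171: ("final step", None),
}

direct = run(store, 0o000, lambda addr: 1 if addr == 0o005 else 0)
indirect = run(store, 0o000, lambda addr: 0)
print([oct(a) for a in direct], [oct(a) for a in indirect])
```

No separate branch microinstruction appears in the store: the microinstruction at 005 does useful work and also selects between 170 and 171.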

  • Microinstructions with Next-Address Field

  • Implementation of the Microroutine

  • bit-ORing

  • Prefetching

    Drawback of microprogrammed control: it leads to slower operating speed because of the time it takes to fetch microinstructions from the control store.
    Faster operation is achieved if the next microinstruction is prefetched while the current one is being executed. Thus execution time can be overlapped with fetch time.
    Some difficulties:
    The status flags and the results of the currently executed microinstruction are sometimes needed to determine the address of the next microinstruction.
    Thus straightforward prefetching occasionally prefetches a wrong microinstruction.
    In such cases, the fetch must be repeated with the correct address.

  • Emulation
    Suppose we add to the instruction set of computer M1 an entirely new set of instructions that belongs to computer M2. Programs written in the machine language of M2 can then be run on computer M1, i.e. M1 emulates M2.
    Emulation allows obsolete equipment to be replaced with more up-to-date machines.
    If the replacement computer fully emulates the original one, then no software changes have to be made to run existing programs.
    Thus emulation facilitates transitions to new computer systems with minimal disruption.
    Emulation is easiest when the machines have similar architectures, but it can also succeed with totally different architectures.

  • Emulation
    Emulation is best described as imitating a certain computer platform or program on another platform or program. In this manner, it is possible to view documents or run programs on a computer not designed to do so. An emulator is itself a program that creates an extra layer between an existing computer platform (host platform) and the platform to be reproduced (target platform).

  • Comparing Simulator and Emulator
    A simulator is software that duplicates some processor in almost all possible ways.
    An emulator is hardware that duplicates the features and functions of a real system, so that it can behave like the actual system.
    The difference between the model and the operation of the real system is called the credibility gap.

  • Hardwired control vs. microprogrammed control

    Sr. No.  Attribute                Hardwired Control                                       Microprogrammed Control
    1        Speed                    Fast                                                    Slow
    2        Cost                     More                                                    Cheaper
    3        Implementation approach  Sequential circuit                                      Programming
    4        Flexibility              Not flexible; difficult to modify for new instructions  Flexible, new instructions