07 Processor Basics(1)


  • 5/25/2018 07 Processor Basics(1)

    1/46

    Digital Design: An Embedded Systems Approach Using Verilog

    Chapter 7

    Processor Basics

    Portions of this work are from the book Digital Design: An Embedded Systems Approach Using Verilog, by Peter J. Ashenden, published by Morgan Kaufmann Publishers, Copyright 2007 Elsevier Inc. All rights reserved.


    Embedded Computers

    A computer as part of a digital system
      Performs processing to implement or control the system's function
    Components
      Processor core
      Instruction and data memory
      Input, output, and input/output controllers
        For interacting with the physical world
      Accelerators
        High-performance circuits for specialized functions
      Interconnecting buses


    Memory Organization

    Von Neumann architecture
      Single memory for instructions and data
    Harvard architecture
      Separate instruction and data memories
      Most common in embedded systems

    [Figure: block diagram with CPU, accelerator, instruction memory, data memory, and input, output, and I/O controllers]


    Bus Organization

    Single bus for low-cost, low-performance systems
    Multiple buses for higher performance

    [Figure: block diagram with CPU, accelerator, instruction memory, data memory, and input, output, and I/O controllers connected by buses]


    Microprocessors

    Single-chip processor in a package
    External connections to memory and I/O buses
    Most commonly seen in general-purpose computers
      E.g., Intel Pentium family, PowerPC


    Microcontrollers

    Single chip combining
      Processor
      A small amount of instruction/data memory
      I/O controllers
    Microcontroller families
      Same processor, varying memory and I/O
    8-bit microcontrollers
      Operate on 8-bit data
      Low cost, low performance
    16-bit and 32-bit microcontrollers
      Higher performance


    Processor Cores

    Processor as a component in an FPGA or ASIC
    In FPGA, can be a fixed-function block
      E.g., PowerPC cores in some Xilinx FPGAs
    Or can be a soft core
      Implemented using programmable resources
      E.g., Xilinx MicroBlaze, Altera Nios-II
    In ASIC, provided as an IP block
      E.g., ARM, PowerPC, MIPS, Tensilica cores
      Can be customized for an application


    Digital Signal Processors

    DSPs are processors optimized for signal processing operations
      E.g., audio, video, sensor data; wireless communication
    Often combined with a conventional core for processing other data
      Heterogeneous multiprocessor


    Instruction Sets

    A processor executes a program
      A sequence of instructions, each performing a small step of a computation
    Instruction set: the repertoire of available instructions
      Different processor types have different instruction sets
    High-level languages: more abstract
      E.g., C, C++, Ada, Java
      Translated to processor instructions by a compiler


    Instruction Execution

    Instructions are encoded in binary
      Stored in the instruction memory
    A processor executes a program by repeatedly
      Fetching the next instruction
      Decoding it to work out what to do
      Executing the operation
    Program counter (PC)
      Register in the processor holding the address of the next instruction
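
    The fetch, decode, execute cycle can be sketched in a few lines of Python. This is a toy model only: the opcodes `load`, `add`, and `halt` and the single accumulator register are invented for illustration and are not part of any real instruction set.

```python
# Toy processor: each instruction is an (opcode, operand) pair.
# The opcodes and the accumulator are hypothetical, for illustration only.
def run(program):
    pc = 0    # program counter: address of the next instruction
    acc = 0   # a single 8-bit accumulator register
    while pc < len(program):
        opcode, operand = program[pc]   # fetch
        pc += 1                         # PC now addresses the next instruction
        if opcode == "load":            # decode, then execute
            acc = operand & 0xFF
        elif opcode == "add":
            acc = (acc + operand) & 0xFF
        elif opcode == "halt":
            break
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    return acc, pc

result, final_pc = run([("load", 5), ("add", 7), ("halt", 0)])
print(result, final_pc)
```

    The PC is advanced during the fetch, so by the time an instruction executes the PC already holds the address of the instruction that follows it.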


    Data and Endian-ness

    Instructions operate on data from the data memory
    Byte: 8-bit data
      Data memory is usually byte addressed
    16-bit, 32-bit, 64-bit words of data

    [Figure: little-endian and big-endian layouts of 8-bit, 16-bit, and 32-bit data in byte-addressed memory. A 16-bit item occupies addresses m and m+1; a 32-bit item occupies addresses n to n+3. Little endian places the least significant byte at the lowest address; big endian places the most significant byte there.]
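
    As a small illustration of the two byte orders, the Python sketch below lays out one 32-bit value both ways using the standard `int.to_bytes` method. The value `0x12345678` is an arbitrary example; index `i` in each result plays the role of address n + i.

```python
value = 0x12345678

# Index i of each bytes object corresponds to address n + i.
little = value.to_bytes(4, byteorder="little")  # least significant byte at n
big = value.to_bytes(4, byteorder="big")        # most significant byte at n

print([hex(b) for b in little])
print([hex(b) for b in big])

# Reading the bytes back with the matching byte order recovers the value.
assert int.from_bytes(little, byteorder="little") == value
assert int.from_bytes(big, byteorder="big") == value
```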


    The Gumnut Core

    A small 8-bit soft core
      Can be used in FPGA designs
    Instruction set illustrates features typical of 8-bit cores and of processors in general
    Programs written in assembly language
      Each processor instruction written explicitly
      Translated to binary representation by an assembler
    Resources available on the companion web site


    Gumnut Storage


    Arithmetic Instructions

    Operate on register data and put result in a register
      add, addc, sub, subc
      Can have immediate value operand
    Condition codes
      Z: 1 if result is zero, 0 if result is non-zero
      C: carry out of add/addc, borrow out of sub/subc
    addc and subc include C bit in operation


    Arithmetic Instructions

    Examples
      add r3, r4, r1
      add r5, r1, 2
      sub r4, r4, 1
    Evaluate 2x + 1; x in r3, result in r4
      add r4, r3, r3 ; double x
      add r4, r4, 1 ; then add 1
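
    The effect of 8-bit addition on the result and the Z and C condition codes can be modelled in Python. This is a sketch under the flag definitions given on the previous slide; the helper names `add8` and `two_x_plus_one` are made up for this example. Doubling x is modelled as adding r3 to itself.

```python
def add8(a, b, carry_in=0):
    """8-bit add returning (result, Z, C); carry_in models addc."""
    total = a + b + carry_in
    result = total & 0xFF              # keep only the low 8 bits
    z = 1 if result == 0 else 0        # Z: 1 if result is zero
    c = 1 if total > 0xFF else 0       # C: carry out of the addition
    return result, z, c

def two_x_plus_one(x):
    r3 = x & 0xFF
    r4, _, _ = add8(r3, r3)            # double x
    r4, z, c = add8(r4, 1)             # then add 1
    return r4, z, c

print(two_x_plus_one(20))
print(add8(0xFF, 1))
```

    Because registers hold only 8 bits, results wrap around: adding 1 to 0xFF gives 0 with both Z and C set.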


    Logical Instructions

    Operate on register data and put result in a register
      and, or, xor, mask (and not)
      Operate bitwise on 8-bit operands
      Can have immediate value operand
    Condition codes
      Z: 1 if result is zero, 0 if result is non-zero
      C: always 0


    Logical Instructions

    Examples
      and r3, r4, r5
      or r1, r1, 0x80 ; set r1(7)
      xor r5, r5, 0xFF ; invert r5
    Set Z if least-significant 4 bits of r2 are 0101
      and r1, r2, 0x0F ; clear high bits
      sub r0, r1, 0x05 ; compare with 0101
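
    The mask-then-compare idiom can be mirrored in Python as follows. The function name is hypothetical, and the model assumes that a result written to r0 is discarded so that only the Z flag survives the subtraction.

```python
def low_nibble_is_0101(r2):
    """Return the Z flag produced by the and/sub pair."""
    r1 = r2 & 0x0F               # and r1, r2, 0x0F : clear high bits
    diff = (r1 - 0x05) & 0xFF    # sub r0, r1, 0x05 : only the flags are kept
    return 1 if diff == 0 else 0

print(low_nibble_is_0101(0xA5))
print(low_nibble_is_0101(0x56))
```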


    Shift Instructions

    Logical shift/rotate register data and put result in a register
      shl, shr, rol, ror
      Count specified as a literal operand
    Condition codes
      Z: 1 if result is zero, 0 if result is non-zero
      C: the value of the last bit shifted/rotated past the end of the byte


    Shift Instructions

    Examples
      shl r4, r1, 3
      ror r2, r2, 4
    Multiply r4 by 8, ignoring overflow
      shl r4, r4, 3
    Multiply r4 by 10, ignoring overflow
      shl r1, r4, 1 ; multiply by 2
      shl r4, r4, 3 ; multiply by 8
      add r4, r4, r1
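
    The multiply-by-10 sequence relies on 10x = 8x + 2x. A minimal Python model, with hypothetical helper names and results truncated to 8 bits to match "ignoring overflow":

```python
def shl8(value, count):
    """Logical shift left within an 8-bit register."""
    return (value << count) & 0xFF

def times_ten(r4):
    r1 = shl8(r4, 1)           # shl r1, r4, 1 : multiply by 2
    r4 = shl8(r4, 3)           # shl r4, r4, 3 : multiply by 8
    return (r4 + r1) & 0xFF    # add r4, r4, r1

print(times_ten(7))
print(times_ten(30))
```

    For inputs of 25 or less the product fits in 8 bits; larger inputs wrap, as the second call shows.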


    Memory Instructions

    Transfer data between registers and data memory
      Compute address by adding an offset to a base register value
    Load register from memory
      ldm r1, (r2)+5
    Store from register to memory
      stm r1, (r4)-2
    Use r0 if base address is 0
      ldm r3, 23 is the same as ldm r3, (r0)+23
    Condition codes not affected


    Memory Instructions

    Increment a 16-bit integer in memory
      Little-endian: address of lsb in r2, msb in next location

      ldm r1, (r2) ; increment lsb
      add r1, r1, 1
      stm r1, (r2)
      ldm r1, (r2)+1 ; increment msb
      addc r1, r1, 0 ; with carry
      stm r1, (r2)+1
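
    The sequence works because memory instructions leave the condition codes alone, so the carry from the first add is still available to addc. A Python sketch of the same steps, using a `bytearray` as a stand-in for data memory (the function name is made up for this example):

```python
def increment16(memory, addr):
    """Increment a little-endian 16-bit value stored at addr and addr + 1."""
    low = memory[addr] + 1             # ldm r1, (r2) / add r1, r1, 1
    carry = 1 if low > 0xFF else 0     # C: carry out of the add
    memory[addr] = low & 0xFF          # stm r1, (r2)
    high = memory[addr + 1] + carry    # ldm r1, (r2)+1 / addc r1, r1, 0
    memory[addr + 1] = high & 0xFF     # stm r1, (r2)+1

mem = bytearray([0xFF, 0x01])          # holds 0x01FF, least significant byte first
increment16(mem, 0)
print(mem.hex())
```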


    Input/Output Instructions

    I/O controllers have registers that govern their operation
      Each has an address, like data memory
      Gumnut has separate data and I/O address spaces
    Input from I/O register
      inp r3, 157 is the same as inp r3, (r0)+157
    Output to I/O register
      out r3, (r7) is the same as out r3, (r7)+0
    Condition codes not affected
    Further examples in Chapter 8


    Branch Instructions

    Programs can evaluate conditions and take alternate courses of action
      Condition codes (Z, C) represent outcomes of arithmetic/logical/shift instructions
    Branch instructions examine Z or C
      bz, bnz, bc, bnc
      Add a displacement to PC if condition is true
      Specifies how many instructions forward or backward to skip
      Counting from instruction after branch


    Branch Example

    Elapsed seconds in location 100
      Increment, wrapping to 0 after 59

      ldm r1, 100
      add r1, r1, 1
      sub r0, r1, 60 ; Z set if r1 = 60
      bnz +1 ; skip to store if
      add r1, r0, 0 ; Z is 0
      stm r1, 100
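
    To see how the branch displacement is counted from the instruction after the branch, the example can be traced with a small Python interpreter. This is a rough sketch, not the Gumnut core: it handles only the five instructions used here with immediate operands, represents each as a tuple, models data memory as a dictionary, and assumes r0 always reads as zero.

```python
def run(program, memory):
    regs = [0] * 8
    z = 0
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        pc += 1                        # PC now addresses the following instruction
        if op == "ldm":
            rd, addr = args
            regs[rd] = memory[addr]
        elif op == "stm":
            rd, addr = args
            memory[addr] = regs[rd]
        elif op in ("add", "sub"):
            rd, rs, imm = args
            value = regs[rs] + imm if op == "add" else regs[rs] - imm
            regs[rd] = value & 0xFF
            z = 1 if regs[rd] == 0 else 0
        elif op == "bnz":
            (disp,) = args
            if z == 0:
                pc += disp             # displacement counted from the next instruction
        regs[0] = 0                    # assumption: writes to r0 are discarded
    return memory

SECONDS = [
    ("ldm", 1, 100),
    ("add", 1, 1, 1),
    ("sub", 0, 1, 60),   # Z set if r1 = 60
    ("bnz", 1),          # skip the reset when Z is 0
    ("add", 1, 0, 0),    # r1 = 0
    ("stm", 1, 100),
]

print(run(SECONDS, {100: 59}))
print(run(SECONDS, {100: 30}))
```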


    Jump Instruction

    Unconditionally skips forward or backward to specified address
      Changes the PC to the address
    Example: if r1 = 0, clear data location 100 to 0; otherwise clear location 200 to 0
      Assume instructions start at address 10

      10: sub r0, r1, 0
      11: bnz +2
      12: stm r0, 100
      13: jmp 15
      14: stm r0, 200
      15: ...


    Subroutines

    A sequence of instructions that perform some operation
      Can call them from different parts of a program using a jsb instruction
      Subroutine returns with a ret instruction


    Subroutine Example

    Subroutine to increment second count
      Address of count in r2

      ldm r1, (r2)
      add r1, r1, 1
      sub r0, r1, 60
      bnz +1
      add r1, r0, 0
      stm r1, (r2)
      ret

    Call to increment locations 100 and 102

      add r2, r0, 100
      jsb 20
      add r2, r0, 102
      jsb 20


    Return Address Stack

    The jsb saves the return address for use by the ret
      But what if the subroutine includes a jsb?
    Gumnut core includes an 8-entry push-down stack of return addresses

    [Figure: return address stack holding the return addresses for the first and second calls, then for the first, second, and third calls]
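
    A push-down stack of return addresses behaves like the Python sketch below: each jsb pushes, each ret pops the most recent entry. The class name and the addresses are invented for illustration, and raising an error on a ninth nested call is an assumption of this model; the slides do not say what the core does in that case.

```python
class ReturnStack:
    """Fixed-depth stack of return addresses (8 entries, as in the Gumnut core)."""

    def __init__(self, depth=8):
        self.depth = depth
        self.entries = []

    def push(self, return_addr):       # performed by jsb
        if len(self.entries) == self.depth:
            # Assumption: this model treats overflow as an error.
            raise OverflowError("more nested calls than stack entries")
        self.entries.append(return_addr)

    def pop(self):                     # performed by ret
        return self.entries.pop()

stack = ReturnStack()
stack.push(11)    # first call: return to address 11
stack.push(25)    # nested second call: return to address 25
inner = stack.pop()
outer = stack.pop()
print(inner, outer)
```

    The nested call returns first, so the addresses come back in the reverse of the order in which they were saved.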


    Miscellaneous Instructions

    Instructions supporting interrupts
      See Chapter 8
      reti   Return from interrupt
      enai   Enable interrupts
      disi   Disable interrupts
    wait   Wait for an interrupt
    stby   Stand by in low power mode until an interrupt occurs


    The Gumnut Assembler

    Gasm: translates assembly programs
      Generates memory images for program text (binary-coded instructions) and data
      See documentation on web site
    Write a program as a text file
      Instructions
      Directives
      Comments
      Use symbolic labels


    Example Program

    ; Program to determine greater of value_1 and value_2

                text
                org 0x000 ; start here on reset
                jmp main

    ; Data memory layout

                data
    value_1:    byte 10
    value_2:    byte 20
    result:     bss 1

    ; Main program

                text
                org 0x010
    main:       ldm r1, value_1 ; load values
                ldm r2, value_2
                sub r0, r1, r2 ; compare values
                bc value_2_greater
                stm r1, result ; value_1 is greater
                jmp finish
    value_2_greater:
                stm r2, result ; value_2 is greater
    finish:     jmp finish ; idle loop


    Gumnut Instruction Encoding

    Instructions are a form of information
      Can be encoded in binary
    Gumnut encoding
      18 bits per instruction
      Divided into fields representing different aspects of the instruction
        Opcodes and function codes
        Register numbers
        Addresses


    Gumnut Instruction Encoding

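    Format                    Fields, most significant bit first (width in bits)
    Arith/Logical Register    1 1 1 0 (4) | rd (3) | rs (3) | rs2 (3) | - (2) | fn (3)
    Arith/Logical Immediate   0 (1) | fn (3) | rd (3) | rs (3) | immed (8)
    Shift                     1 1 0 (3) | - (1) | rd (3) | rs (3) | count (3) | - (3) | fn (2)
    Memory, I/O               1 0 (2) | fn (2) | rd (3) | rs (3) | offset (8)
    Branch                    1 1 1 1 1 0 (6) | fn (2) | - (2) | disp (8)
    Jump                      1 1 1 1 0 (5) | fn (1) | addr (12)
    Miscellaneous             1 1 1 1 1 1 0 (7) | fn (3) | - (8)

    A dash marks a field that carries no label in the diagram.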


    Encoding Examples

    Encoding for addc r3, r5, 24
      Arithmetic immediate, fn = 001
      Fields: 0 | fn | rd | rs | immed (widths 1, 3, 3, 3, 8)
      Bits:   0 | 001 | 011 | 101 | 00011000
      Hexadecimal: 05D18
    Instruction encoded by 3ECFC
      Fields: 1 1 1 1 1 0 | fn | - | disp (widths 6, 2, 2, 8)
      Bits:   111110 | 11 | 00 | 11111100
      Branch: bnc -4
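
    Packing fields into an 18-bit instruction word is a matter of shifting each field to its position and combining the results. The Python sketch below does this for the arithmetic/logical immediate and branch formats, using the field widths from the encoding diagrams. The function names are hypothetical, and the branch function code 11 for bnc is read from the bit pattern in the example rather than from a published table.

```python
def encode_alu_immed(fn, rd, rs, immed):
    """0 | fn (3) | rd (3) | rs (3) | immed (8)"""
    return (fn << 14) | (rd << 11) | (rs << 8) | (immed & 0xFF)

def encode_branch(fn, disp):
    """1 1 1 1 1 0 | fn (2) | unlabeled (2) | disp (8)"""
    return (0b111110 << 12) | (fn << 10) | (disp & 0xFF)

addc_word = encode_alu_immed(0b001, 3, 5, 24)   # addc r3, r5, 24
bnc_word = encode_branch(0b11, -4)              # bnc -4, displacement in two's complement

print(f"{addc_word:05X}")
print(f"{bnc_word:05X}")
```

    Masking the displacement with 0xFF turns the negative value -4 into its 8-bit two's-complement form, 11111100.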


    Other Instruction Sets

    8-bit cores and microcontrollers
      Xilinx PicoBlaze: like Gumnut
      8051, and numerous like it
        Originated as 8-bit microprocessors
        Instructions encoded as one or more bytes
        Instruction set is more complex and irregular
        Complex instruction set computer (CISC)
        Cf. reduced instruction set computer (RISC)
    16-, 32- and 64-bit cores
      Mostly RISC
      E.g., PowerPC, ARM, MIPS, Tensilica


    Instruction and Data Memory

    In embedded systems
      Instruction memory is usually ROM, flash, SRAM, or a combination
      Data memory is usually SRAM
        DRAM if large capacity needed
    Processor/memory interfacing
      Gluing the signals together


    Example: Gumnut Memory

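    [Figure: Gumnut core connected to an instruction ROM and a data SRAM. Instruction-side signals: inst_adr_o, inst_dat_i, inst_cyc_o, inst_stb_o, inst_ack_i. Data-side signals: data_adr_o, data_dat_i, data_dat_o, data_cyc_o, data_stb_o, data_ack_i, data_we_o. Memory ports: adr, dat_o, dat_i, en, we. Common signals: clk_i, rst_i. Two D flip-flops clocked by clk also appear.]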


    Example: Gumnut Memory

    always @(posedge clk) // Instruction memory
      if (inst_cyc_o && inst_stb_o) begin
        inst_dat_i


    Example: Gumnut Memory

    always @(posedge clk) // Data memory
      if (data_cyc_o && data_stb_o)
        if (data_we_o) begin
          data_RAM[data_adr_o]


    Example: Microcontroller Memory

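    [Figure: 8051 microcontroller connected to an SRAM through a latch. 8051 signals: P0, P2, ALE, PSEN, RD, WR. Latch ports: D, LE, Q. SRAM ports: A(16), A(15..8), A(7..0), D, CE, WE, OE.]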


    32-bit Memory

    Four bytes per memory word
      Little-endian: lsb at least address
      Big-endian: msb at least address

    [Figure: memory words shown as rows of byte addresses: 0 1 2 3, 4 5 6 7, 8 9 10 11]

    Partial-word read
      Read all bytes, processor selects those needed
    Partial-word write
      Use byte-enable signals
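
    A partial-word write with byte enables can be modelled as updating only the selected byte lanes of a 32-bit word. In this Python sketch the function name is made up, and numbering lane 0 as the least significant byte is an assumption chosen for simplicity; a particular processor may number its lanes differently.

```python
def write_word(memory_word, write_data, byte_enable):
    """Return the new 32-bit word after a write gated by byte enables.

    byte_enable[i] is True when byte lane i (bits 8*i to 8*i+7) is written.
    Lane 0 as the least significant byte is an assumption of this sketch.
    """
    result = memory_word
    for lane, enabled in enumerate(byte_enable):
        if enabled:
            mask = 0xFF << (8 * lane)
            result = (result & ~mask) | (write_data & mask)
    return result & 0xFFFFFFFF

one_byte = write_word(0x11223344, 0xAABBCCDD, [True, False, False, False])
two_bytes = write_word(0x11223344, 0xAABBCCDD, [False, True, True, False])
print(f"{one_byte:08X} {two_bytes:08X}")
```

    Lanes whose enable is false keep their old contents, which is what lets a byte or halfword store leave the rest of the word untouched.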


    Example: MicroBlaze Memory

    [Figure: MicroBlaze connected to four byte-wide SSRAMs, each with D_in, D_out, A, en, wr, and clk ports, one for each group of data bits 0:7, 8:15, 16:23, and 24:31. MicroBlaze signals: Addr (bits 2:16 labeled), Data_Write, Data_Read, AS, Read_Strobe, Write_Strobe, Ready, Clk, and Byte_Enable(0) to Byte_Enable(3). A +V connection also appears.]


    Cache Memory

    For high-performance processors
      Memory access time is several clock cycles
      Performance bottleneck
    Cache memory
      Small fast memory attached to a processor
      Stores most frequently accessed items, plus adjacent items
      Locality: those items are most likely to be accessed again soon


    Cache Memory

    Memory contents divided into fixed-sized blocks (lines)
      Cache copies whole lines from memory
    When processor accesses an item
      If item is in cache: hit (fast access)
        Occurs most of the time
      If item is not in cache: miss
        Line containing item is copied from memory
        Slower, but less frequent
        May need to replace a line already in cache
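
    Hits, misses, and line replacement can be illustrated with a small Python model. The slides do not specify how lines are placed in the cache; this sketch assumes a direct-mapped organization, in which each memory line can occupy only one cache slot, and the sizes and class name are arbitrary choices for the example.

```python
class DirectMappedCache:
    """Tracks which memory lines are cached; data contents are not modelled."""

    def __init__(self, num_lines=4, line_size=4):
        self.num_lines = num_lines
        self.line_size = line_size
        self.tags = [None] * num_lines   # memory line number held in each slot
        self.hits = 0
        self.misses = 0

    def access(self, address):
        line_number = address // self.line_size   # which memory line holds the item
        index = line_number % self.num_lines      # the only slot that line may use
        if self.tags[index] == line_number:
            self.hits += 1
            return True
        self.misses += 1
        self.tags[index] = line_number            # copy the line in, replacing the old one
        return False

cache = DirectMappedCache()
for address in [0, 1, 2, 3, 4, 0, 16, 0]:
    cache.access(address)
print(cache.hits, cache.misses)
```

    Addresses 1 to 3 hit because the miss on address 0 brought in the whole line. Address 16 maps to the same slot as address 0, so it replaces that line and the final access to 0 misses again.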


    Fast Main Memory Access

    Optimize memory for line access by cache
      Wide memory
        Read a line in one access
      Burst transfers
        Send starting address, then read successive locations
      Pipelining
        Overlapping stages of memory access
        E.g., address transfer, memory operation, data transfer
      Double data rate (DDR), Quad data rate (QDR)
        Transfer on both rising and falling clock edges


    Summary

    Embedded computer
      Processor, memory, I/O controllers, buses
    Microprocessors, microcontrollers, and processor cores
      Soft-core processors for ASIC/FPGA
    Processor instruction sets
      Binary encoding for instructions
      Assembly language programs
    Memory interfacing