arun-shetty
8/2/2019 ss assinment
1/12
POWER PC E500
DEPT OF CSE NOV-DEC 2011 Page 1
CONTENTS:
TOPIC
1. INTRODUCTION
2. CORE COMPLEX SUMMARY
3. MULTIPLE CYCLE UNIT
4. FEATURES
5. ARCHITECTURE
6. APPLICATIONS
Introduction
The e500 processor core is a low-power implementation of the family of reduced instruction
set computing (RISC) embedded processors that implement the Book E definition of the
PowerPC architecture.
Book E allows processors to provide auxiliary processing units (APUs), which are
extensions to the architecture that can perform computational or system management
functions.
Core Complex Summary
The core complex is a superscalar processor that can issue two instructions and complete two
instructions per clock cycle.
The processor core integrates two simple instruction units (SU1, SU2), a multiple-cycle
instruction unit (MU), a branch unit (BU), and a load/store unit (LSU).
The core complex supports a high-speed on-chip internal bus with data tagging, called the
core complex bus (CCB), which is the interface between the core and the integrating device.
Book E Debug Events
Debug events cause debug exceptions to be recorded in the Debug Status Register (DBSR).
Dual-issue superscalar
Two-instructions-per-clock peak issue rate
Precise exception handling
Decode unit
12-entry instruction queue (IQ)
Full hardware detection of interlocks
Decodes as many as two instructions per cycle
Decode serialization control
Register dependency resolution and renaming
Branch prediction unit (BPU)
Dynamic branch prediction using a 512-entry, 4-way set-associative branch target buffer (BTB)
Branch prediction is handled in the fetch stages.
Completion unit
As many as 14 instructions allowed in 14-entry completion queue (CQ)
In-order retirement of as many as two instructions per cycle
Completion and refetch serialization control
Synchronization for all instruction flow changes: interrupts, mispredicted branches, and
context-synchronizing instructions
Issue queues
Two-entry branch instruction issue queue (BIQ)
Four-entry general instruction issue queue (GIQ)
Branch unit
The branch unit (BU) is an execution unit and is distinct from the BPU. It executes (resolves)
all branch and CR logical instructions.
Two simple units (SU1 and SU2)
Add and subtract
Shift and rotate
Logical operations
Support for 64-bit SPE APU instructions in SU1
Features
Key features of the e500 are summarized as follows:
Implements Book E 32-bit architecture
1. Auxiliary processing units
Integer select. This APU consists of the Integer Select instruction, isel, which is a conditional
register move that helps eliminate conditional branches, decreases latency, and reduces the
code footprint.
Performance monitor. The performance monitor facility provides the ability to monitor and
count predefined events such as processor clocks and misses in the instruction cache or data
cache.
Single-precision embedded scalar and vector floating-point APUs.
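The effect of the Integer Select instruction described above can be sketched in Python. This is only a behavioural model under simplifying assumptions: the function names are hypothetical, the condition register bit is modelled as a plain boolean, and the special handling of register 0 in the real instruction is ignored.

```python
def isel(ra_value: int, rb_value: int, cr_bit: bool) -> int:
    """Simplified model of isel: pick ra_value if the CR bit is set, else rb_value."""
    return ra_value if cr_bit else rb_value


def max_with_branch(a: int, b: int) -> int:
    """Conventional form: the result depends on a conditional branch."""
    if a > b:
        return a
    return b


def max_with_isel(a: int, b: int) -> int:
    """Branch-free form: a compare sets a condition bit, then isel selects."""
    greater = a > b  # stands in for a compare instruction setting a CR bit
    return isel(a, b, greater)


print(max_with_branch(3, 7), max_with_isel(3, 7))
```

Both functions return the same value; the second form has no branch for the predictor to mispredict, which is the benefit the text attributes to isel.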
2. Power management
Low-power design
3. Testability
4. Reliability and serviceability
5. Instruction Set
The Book E instruction set for 32-bit implementations. This is composed primarily of the
user-level instructions.
6. Initial Instruction Fetch
The e500 core begins execution at fixed virtual address 0xFFFF_FFFC.
7. Instruction Flow
The e500 core is a pipelined, superscalar processor with parallel execution units that allow
instructions to execute out of order but record their results in order. Pipelining breaks
instruction processing into discrete stages, so multiple instructions in an instruction sequence
can occupy the successive stages: as an instruction completes one stage, it passes to the next,
leaving the previous stage available to a subsequent instruction. So, even though it may take
multiple cycles for an instruction to pass through all of the pipeline stages, once a pipeline is
full, the interval between completed instructions is much shorter than the latency of any one
instruction.
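The difference between latency and throughput can be made concrete with a small cycle count. The sketch below assumes an idealised single-issue pipeline with no stalls and uses seven stages, matching the stage count this document gives for the e500 pipeline; the function names are hypothetical.

```python
def cycles_unpipelined(instructions: int, stages: int) -> int:
    """Each instruction passes through every stage before the next one starts."""
    return instructions * stages


def cycles_pipelined(instructions: int, stages: int) -> int:
    """The first instruction takes `stages` cycles; each later one finishes a cycle after."""
    if instructions == 0:
        return 0
    return stages + (instructions - 1)


STAGES = 7  # stage count stated in this document for the e500 pipeline
N = 100     # illustrative instruction count

print(cycles_unpipelined(N, STAGES))  # 700
print(cycles_pipelined(N, STAGES))    # 106
```

Each instruction still has a latency of seven cycles, but once the pipeline is full one instruction completes per cycle, so 100 instructions need 106 cycles rather than 700.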
A superscalar processor is one that issues multiple independent instructions into separate
execution units, allowing parallel execution. The e500 core has five execution units, one each
for branch (BU), load/store (LSU), and multiple-cycle operations (MU), and two for simple
arithmetic operations (SU1 and SU2). The MU and SU1 arithmetic execution units also
execute 64-bit SPE vector instructions, using both the lower and upper halves of the 64-bit
GPRs. The parallel execution units allow multiple instructions to execute in parallel and out
of order. For example, a low-latency addition instruction that is issued to an SU after an
integer divide is issued to the MU should finish executing before the higher-latency divide
instruction. The add instruction can make its results available to a subsequent instruction, but
it cannot update the architected GPR specified as its target operand ahead of the
multiple-cycle divide instruction.
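The divide-then-add example can be sketched as a toy timing model. The latencies (11 cycles for the divide, 1 for the add), the one-instruction-per-cycle issue pattern, and the function name are illustrative assumptions, not figures from the e500 documentation.

```python
def simulate(program):
    """Return (finish, retire) cycle maps for (name, latency) pairs in program order.

    Assumptions: instruction i issues in cycle i to its own execution unit,
    and an instruction may retire in the same cycle as the one before it.
    """
    finish, retire = {}, {}
    oldest_done = 0
    for issue_cycle, (name, latency) in enumerate(program):
        finish[name] = issue_cycle + latency
        # In-order completion: no instruction retires before an older one.
        oldest_done = max(oldest_done, finish[name])
        retire[name] = oldest_done
    return finish, retire


program = [("divw", 11), ("add", 1)]  # hypothetical latencies
finish, retire = simulate(program)
print("finish:", finish)  # add finishes executing first
print("retire:", retire)  # but add cannot retire before divw
```

In this model the add finishes executing in cycle 2 while the divide finishes in cycle 11, yet the add's architected result is committed no earlier than the divide's.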
Initial Instruction Fetch
The e500 core begins execution at fixed virtual address 0xFFFF_FFFC. The MMU has a
default page translation which maps this to the identical physical address. So, the instruction
at physical address 0xFFFF_FFFC must be a branch to another address within the 4-Kbyte
boot page.
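A little address arithmetic shows why the first instruction has to be a branch. The sketch below assumes 4-byte instructions and a 32-bit address space; the constant and function names are hypothetical.

```python
RESET_VECTOR = 0xFFFF_FFFC  # fixed address of the first instruction fetch
PAGE_SIZE = 4 * 1024        # 4-Kbyte boot page
INSTRUCTION_BYTES = 4       # assumption: fixed 32-bit instruction encoding

page_base = RESET_VECTOR & ~(PAGE_SIZE - 1)       # 0xFFFFF000
page_last = page_base + PAGE_SIZE - 1             # 0xFFFFFFFF
room_after_reset = page_last - RESET_VECTOR + 1   # bytes from the vector to the top


def in_boot_page(address: int) -> bool:
    """True if `address` lies inside the 4-Kbyte boot page."""
    return page_base <= address <= page_last


print(hex(page_base), hex(page_last), room_after_reset)
```

Only four bytes remain between the reset vector and the top of the address space, which is room for exactly one instruction, so that instruction must branch backwards to code located elsewhere in the page.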
Branch Detection and Prediction
To improve branch performance, the e500 provides implementation-specific dynamic branch
prediction using the BTB to resolve branch instructions and improve the accuracy of branch
predictions. Each of the 512 entries in the 4-way set-associative address cache of branch
target addresses includes a 2-bit saturating branch history counter, whose value is
incremented or decremented depending on whether the branch was taken. These bits can take
on four values indicating strongly taken, weakly taken, weakly not taken, and strongly not
taken. The BTB is used not only to predict branches, but to detect branches during the fetch
stage, offering an efficient way to access instruction streams for branches predicted as taken.
In the e500, all branch instructions are assigned positions in the completion queue at
dispatch. Speculative instructions in branch target streams are allowed to execute and
proceed through the completion queue, although they can complete only after the branch
prediction is resolved as correct and after the branch instruction itself completes.
If a branch resolves as correct, instructions in the target stream are marked nonspeculative
and are allowed to complete. If the branch history bits in the BTB indicated weakly taken or
weakly not taken, the prediction is upgraded to strongly taken or strongly not taken. If a
branch resolves as incorrect, instructions in the target stream are flushed from the execution
pipeline, the branch history bits are updated in the BTB entry, and nonspeculative fetching
begins from the correct path.
e500 Execution Pipeline
The e500 execution pipeline has seven stages: fetch1, fetch2/predecode, decode/dispatch,
issue, execute, complete, and write back.
8. Register Model
Registers used for integer operations:
General-Purpose Registers (GPRs)
Book E implementations provide 32 GPRs (GPR0 through GPR31) for integer operations.
Integer Exception Register (XER)
Registers for Branch Operations
Condition Register (CR)
Link Register (LR)
The link register can be used to provide the branch target address for a Branch Conditional to
LR instruction, and it holds the return address after branch and link instructions.
Count Register (CTR)
The CTR can be used to hold a loop count that can be decremented and tested during
execution of branch instructions.
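The decrement-and-test use of the CTR can be sketched as a small model of a counted loop. It imitates the behaviour of the PowerPC `bdnz` (decrement CTR, branch if not zero) mnemonic; the Python function names are hypothetical, and the model assumes a 32-bit CTR.

```python
def bdnz(ctr: int):
    """Decrement the count register and report whether the branch is taken."""
    ctr = (ctr - 1) & 0xFFFF_FFFF  # assumption: 32-bit register, wraps on underflow
    return ctr, ctr != 0


def run_loop(count: int) -> int:
    """Run a loop body `count` times using CTR; `count` must be at least 1."""
    ctr = count  # models loading the loop count into CTR
    iterations = 0
    while True:
        iterations += 1           # loop body
        ctr, taken = bdnz(ctr)    # loop-closing branch
        if not taken:
            break
    return iterations


print(run_loop(5))  # 5
```

Because the decrement and the test happen inside the branch instruction itself, the loop needs no separate compare or counter update in a general-purpose register.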
Processor Control Registers
Machine State Register (MSR)
The machine state register (MSR), shown in Figure 2-2, defines the state of the processor,
that is, the enabling and disabling of interrupts and debugging exceptions.
Timer Register
TCR[WPEXT] and TCR[FPEXT], not specified in Book E, are concatenated with
TCR[WP] and TCR[FP] to select a bit that triggers the watchpoint timer and fixed-interval
timer events.
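The concatenation of an extension field with its base field can be sketched as a shift and an OR. The field widths (2 bits for WP, 4 bits for WPEXT) and the placement of the extension bits above the base bits are assumptions for illustration, as is the function name; how the resulting value maps to a specific timer bit is not modelled here.

```python
WP_WIDTH = 2     # assumed width of TCR[WP]
WPEXT_WIDTH = 4  # assumed width of TCR[WPEXT]


def period_select(ext: int, base: int) -> int:
    """Concatenate ext || base into a single selector value (ext is the high part)."""
    if not 0 <= ext < (1 << WPEXT_WIDTH):
        raise ValueError("extension field out of range")
    if not 0 <= base < (1 << WP_WIDTH):
        raise ValueError("base field out of range")
    return (ext << WP_WIDTH) | base


print(period_select(0b0000, 0b11))  # 3
print(period_select(0b1111, 0b11))  # 63
```

Under these assumed widths the combined selector has 6 bits and can address 64 positions, compared with 4 for the 2-bit base field alone.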
Interrupt Registers
Branch Target Buffer (BTB) Registers
Hardware Implementation-Dependent Registers
9. e500-Specific Instructions
Applications
PowerQUICC
All PowerQUICC 85xx devices are based on e500v1 or e500v2 cores, most of them on the
e500v2.
QorIQ
In June 2008, Freescale announced the QorIQ brand of microprocessors based on e500 cores.
The QorIQ P1 and P2 families use the e500v2 core, while the P3 and P4 families use the
e500mc core and the CoreNet communications fabric.
Desktop Computer
Apple Computer was the dominant player in the market for desktop computers based on
PowerPC.
Servers
Apple Xserve Rack server.
IBM Rack server.
Supercomputers
Personal digital assistants
Game consoles
All three major seventh-generation game consoles contain PowerPC-based processors.
Sony's PlayStation 3 console
TV Set Top Boxes/Digital Recorder
Printers/Graphics
Network/USB Devices
Automotive
Medical Equipment
Military and Aerospace