Precision Timed Embedded Systems Using TickPAD Memory
Matthew M Y Kuo*, Partha S Roop*, Sidharta Andalam†, Nitish Patel*
*University of Auckland, New Zealand
†TUM CREATE, Singapore

Introduction
- Hard real-time systems
  - Need to meet real-time deadlines
  - Catastrophic events may occur when deadlines are missed
- Synchronous execution approach
  - Good for hard real-time systems
    - Deterministic
    - Reactive
  - Aids static timing analysis
    - Well-bounded programs
    - No unbounded loops or recursion

Synchronous Languages
- Execute in logical time
  - Ticks
  - Sample input, computation, emit output
- Synchronous hypothesis
  - Ticks are instantaneous
  - Assumes the system executes infinitely fast
  - The system is faster than the environment's response
- Worst case reaction time (WCRT)
  - Time between two logical ticks
- Languages
  - Esterel
  - Scade
  - PRET-C
    - Extension to C

PRET-C
- Light-weight multithreading in C
- Provides thread-safe memory access
- C extension implemented as C macros

Statement               Meaning
ReactiveInput I         Declares I as a reactive environment input
ReactiveOutput O        Declares O as a reactive environment output
PAR(T1, ..., Tn)        Synchronously executes n threads in parallel, where thread Ti has a higher priority than Ti+1
EOT                     Marks the end of a tick
[weak] abort P when C   Preempts P when C is true
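
To make "C extension implemented as C macros" concrete, here is a minimal sketch of how an EOT-like marker could suspend a thread at a tick boundary using plain C macros. It uses the well-known protothread-style switch trick and is not taken from the PRET-C implementation. Every name in it (sketch_thread_t, SKETCH_BEGIN, SKETCH_EOT, SKETCH_END, sketch_t1) is hypothetical.

```c
#include <stdbool.h>

/* Hypothetical per-thread state: where to resume and whether the thread ended. */
typedef struct {
    int  resume_line;   /* 0 means "start from the top" */
    bool done;
} sketch_thread_t;

/* Protothread-style macros. These names are illustrative, not PRET-C's own. */
#define SKETCH_BEGIN(t) switch ((t)->resume_line) { case 0:
#define SKETCH_EOT(t)   do { (t)->resume_line = __LINE__; return; \
                             case __LINE__:; } while (0)
#define SKETCH_END(t)   } (t)->done = true

/* A thread body with two ticks of work, each closed by an end-of-tick marker. */
void sketch_t1(sketch_thread_t *t, int *acc)
{
    SKETCH_BEGIN(t)
    *acc += 1;          /* computation of the first tick */
    SKETCH_EOT(t);
    *acc += 10;         /* computation of the second tick */
    SKETCH_EOT(t);
    SKETCH_END(t);
}
```

Each call to sketch_t1 runs the thread up to its next end-of-tick marker and returns. The call after the last marker sets done. A scheduler would call the active threads once per tick in priority order.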

Introduction
- Practical systems require larger memory
  - Not all applications fit in on-chip memory
- Require a memory hierarchy
  - Processor-memory gap

[1] Hennessy, John L., and David A. Patterson. Computer Architecture: A Quantitative Approach. San Francisco, CA: Morgan Kaufmann, 2011.

Introduction
- Traditional approaches
  - Caches
  - Scratchpads
- However, there is scant research on memory architectures tailored for synchronous execution and concurrency.

Caches
[Diagram: CPU, cache, main memory]
- Traditionally, caches
  - Small, fast piece of memory
  - Temporal locality
  - Spatial locality
- Hardware controlled
  - Replacement policy

Caches
- Hard real-time systems
  - Need to model the architecture
  - Compute the WCRT
- Cache models
  - Trade-off between the length of computation time and tightness
  - A very tight worst-case estimate is not scalable

Scratchpad
[Diagram: CPU, SPM, main memory]
- Scratchpad memory (SPM)
  - Software controlled
  - Statically allocated
    - Statically or dynamically loaded
  - Requires an allocation algorithm
    - e.g. ILP, greedy

Scratchpad
- Hard real-time systems
  - Easy to compute a tight WCRT
  - Improves worst-case performance
  - Balance between the number of reload points and overheads
  - May perform worse than a cache in the worst case

TickPAD
[Diagram: CPU and main memory, with a cache and an SPM]
- Cache
  - Good at overall performance
  - Hardware controlled
- SPM
  - Good at worst-case performance
  - Easy for fast and tight static analysis

TickPAD Memory
[Diagram: CPU, TickPAD memory (TPM), main memory]
- TickPAD: Tick Precise Allocation Device
- Memory controller
- Hybrid between caches and scratchpads
  - Hardware-controlled features
  - Static software allocation
- Tailored for synchronous languages
- Instruction memory

TickPAD Design flow

PRET-C

    int main() {
        init();
        PAR(t1, t2, t3);
        ...
    }

    void thread t1() {
        compute;
        EOT;
        compute;
        EOT;
    }

[Diagram: thread graph with nodes main, t1, t2 and t3, annotated step by step]
- Computation
- Spawn children threads
- End of tick: synchronization boundaries
- Child thread terminates
- Main thread resumes

PRET-C Execution
[Diagram: threads main, t1, t2 and t3 along a time axis. Labels: sample inputs, emit outputs, 1 tick (reaction time), local tick]

Assumptions
- 1 cache line holds 4 instructions (addresses 0x00, 0x04, 0x08, 0x0C)
  - Takes 1 burst transfer from main memory
  - A cache miss takes 38 clock cycles [2]
- Each instruction takes 2 cycles to execute
- Buffers are 1 cache line in size

[2] J. Whitham and N. Audsley. The Scratchpad Memory Management Unit for Microblaze: Implementation, Testing, and Case Study. Technical Report YCS-2009-439, University of York, 2009.

TickPAD - Overview
- Spatial memory pipeline
  - Accelerates linear code
- Associative loop memory
  - For predictable temporal locality
  - Statically allocated and dynamically loaded
- Tick address queue
  - Stores the resumption addresses of active threads
- Tick instruction buffer
  - Stores the instructions at the resumption point of the next active thread
  - Reduces context switching overhead at state/tick boundaries
- Command table
  - Stores a set of commands to be executed by the TickPAD controller
- Command buffer
  - A buffer to store operands fetched from main memory
  - For commands requiring 2+ operands

Spatial Memory Pipeline
- Cache, on a miss
  - Fetches the line from main memory into the cache
  - The first instruction misses; subsequent instructions on that line hit
  - Timing analysis requires the cache history
- Scratchpad, unallocated code
  - Executes from main memory
  - Miss cost for all instructions
  - Simple timing analysis

Spatial Memory Pipeline
- Memory controller
  - Single line buffer
- Simple analysis
  - Analyse the previous instruction
  - The first instruction misses; subsequent instructions on that line hit
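
The single line buffer behaviour ("first instruction misses, subsequent instructions on that line hit") can be illustrated with a small software model. This is a sketch under the deck's stated assumptions (4 instructions per line at 4-byte spacing, 38 cycles per line fetch), not the hardware design. The type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* 4 instructions x 4 bytes per line; 38-cycle line fetch, as assumed in the deck. */
enum { LB_LINE_BYTES = 16, LB_MISS_CYCLES = 38 };

/* Hypothetical model of a buffer that holds exactly one instruction line. */
typedef struct {
    bool     valid;
    uint32_t tag;     /* address of the buffered line divided by LB_LINE_BYTES */
} line_buffer_t;

/* Returns the stall cycles for fetching the instruction at addr. */
int line_buffer_fetch(line_buffer_t *b, uint32_t addr)
{
    uint32_t tag = addr / LB_LINE_BYTES;
    if (b->valid && b->tag == tag)
        return 0;               /* same line as the previous fetch: hit */
    b->valid = true;            /* different line: refill from main memory */
    b->tag   = tag;
    return LB_MISS_CYCLES;
}
```

Because the outcome depends only on whether the previous fetch touched the same line, the analysis needs to look at one preceding instruction rather than a full cache history.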

Spatial Memory Pipeline
[Diagram: CPU, spatial memory pipeline, main memory]
- Computation requires many lines of instructions
  - Exploit spatial locality
  - Predictably prefetch the next line of instructions
  - Add another buffer
- To preserve determinism
  - Prefetch is only active if the line contains no branch

Spatial Memory Pipeline
- Timing analysis
  - Simple to analyse
  - Analyse the next instruction line
    - If it has a branch, the next target line will miss, e.g. 38 clock cycles
    - Else the next line will be prefetched, e.g. 38 - 8 = 30 clock cycles
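
The 38 and 30 cycle figures can be reproduced with a few lines of arithmetic. The sketch below assumes that the 8 in "38 - 8" is the time spent executing the current line (4 instructions at 2 cycles each), during which the prefetch of the next line is already under way. It also assumes that the first line of a path is a cold miss and that every instruction of a line executes. These are one reading of the slide, and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

enum {
    SMP_MISS_CYCLES      = 38,  /* line fetch from main memory */
    SMP_INSTR_PER_LINE   = 4,
    SMP_CYCLES_PER_INSTR = 2
};

/* Stall before the next line can start, given the line being executed now. */
int smp_next_line_stall(bool current_line_has_branch)
{
    int exec = SMP_INSTR_PER_LINE * SMP_CYCLES_PER_INSTR;      /* 8 cycles */
    return current_line_has_branch ? SMP_MISS_CYCLES           /* no prefetch */
                                   : SMP_MISS_CYCLES - exec;   /* 38 - 8 = 30 */
}

/* Total cycles for a path of n lines; has_branch[i] marks lines with a branch. */
int smp_path_cycles(const bool *has_branch, size_t n)
{
    int exec  = SMP_INSTR_PER_LINE * SMP_CYCLES_PER_INSTR;
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        int stall = (i == 0) ? SMP_MISS_CYCLES                  /* cold start */
                             : smp_next_line_stall(has_branch[i - 1]);
        total += stall + exec;
    }
    return total;
}
```

For a four-line path whose third line contains a branch, the model gives 46 + 38 + 38 + 46 = 168 cycles. The cost of each line depends only on the line before it.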

Tick Address Queue / Tick Instruction Buffer
- Reduce the cost of context switching
- Maintains a priority queue
  - Thread execution order
- Prefetches instructions from the next thread
  - Makes context switching points appear as linear code
  - Paired with the spatial memory pipeline
- Context switching memory cost is the same as for linear code
- Timing analysis
  - Same prefetch lines for allocated context switching points
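
As an illustration of the tick address queue, the sketch below models it in software as a fixed-size FIFO of resumption addresses. It assumes that addresses are enqueued in thread priority order, so the head of the queue is always the next thread to run. The depth of 8 and all names are hypothetical, not taken from the hardware.

```c
#include <stdbool.h>
#include <stdint.h>

enum { TAQ_CAPACITY = 8 };      /* hypothetical queue depth */

typedef struct {
    uint32_t addr[TAQ_CAPACITY];
    int      head;              /* index of the oldest entry */
    int      count;             /* number of stored addresses */
} tick_addr_queue_t;

/* Appends the resumption address of an active thread. */
bool taq_push(tick_addr_queue_t *q, uint32_t resume_addr)
{
    if (q->count == TAQ_CAPACITY)
        return false;
    q->addr[(q->head + q->count) % TAQ_CAPACITY] = resume_addr;
    q->count++;
    return true;
}

/* Reads the address of the next thread without removing it, i.e. the line
 * that a tick instruction buffer would want to prefetch. */
bool taq_peek(const tick_addr_queue_t *q, uint32_t *next_addr)
{
    if (q->count == 0)
        return false;
    *next_addr = q->addr[q->head];
    return true;
}

/* Removes the head entry when the context switch to that thread happens. */
bool taq_pop(tick_addr_queue_t *q, uint32_t *next_addr)
{
    if (!taq_peek(q, next_addr))
        return false;
    q->head = (q->head + 1) % TAQ_CAPACITY;
    q->count--;
    return true;
}
```

Peeking ahead of the context switch is what lets the switch look like linear code to the fetch logic: the instructions at the head address can be fetched while the current thread is still running.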

Associative Loop Memory
- Statically allocated
  - Greedy
    - Allocates the innermost loop first
- Fetches the loop before executing
  - Predictable: easy and tight to model
- Exploits temporal locality
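
The greedy, innermost-first allocation can be sketched as follows. The slide does not give the algorithm's details, so this version makes simplifying assumptions: each loop is described only by its nesting depth and its size in lines, loops are treated as independent regions, and a loop is allocated if it still fits in the remaining capacity. The loop_t type and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical description of one loop in the program. */
typedef struct {
    int  depth;       /* nesting depth; a larger value is more deeply nested */
    int  lines;       /* loop body size in instruction lines */
    bool allocated;   /* set by alm_greedy_allocate */
} loop_t;

/* qsort comparator: deepest (innermost) loops first. */
static int alm_by_depth_desc(const void *a, const void *b)
{
    const loop_t *x = a;
    const loop_t *y = b;
    return y->depth - x->depth;
}

/* Sorts loops innermost first and allocates each one that still fits.
 * Returns the number of lines used. */
int alm_greedy_allocate(loop_t *loops, size_t n, int capacity_lines)
{
    int used = 0;
    qsort(loops, n, sizeof loops[0], alm_by_depth_desc);
    for (size_t i = 0; i < n; i++) {
        loops[i].allocated = (used + loops[i].lines <= capacity_lines);
        if (loops[i].allocated)
            used += loops[i].lines;
    }
    return used;
}
```

Because the allocation is fixed before the program runs, a timing analysis can treat every allocated loop as costing one load before its first iteration and no further fetches afterwards.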

Command Table
- Statically allocated
- A look-up table used to dynamically load the
  - Tick instruction buffer
  - Tick queue
  - Associative loop memory
- Commands are executed when the PC matches the address stored in the command
- Allows the TickPAD to function without modification to source code
  - Libraries
  - Proprietary programs

Command Table
- Three fields
  - Address
    - The PC address at which to execute the command
  - Command
    - Discard Loop Associative Memory
    - Store Loop Associative Memory
    - Fill Tick Instruction Buffer
    - Load Tick Address Queue
  - Operand
    - Data used by the command
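
The three fields and the PC-match rule map naturally onto a small record type and a lookup. The sketch below is a software illustration only. The field widths, the identifiers and the linear search are assumptions, and the hardware presumably matches addresses associatively.

```c
#include <stddef.h>
#include <stdint.h>

/* The four commands named on the slide; the identifiers are illustrative. */
typedef enum {
    TP_DISCARD_LOOP_MEMORY,
    TP_STORE_LOOP_MEMORY,
    TP_FILL_TICK_INSTRUCTION_BUFFER,
    TP_LOAD_TICK_ADDRESS_QUEUE
} tp_command_t;

/* One command table entry: address, command, operand. */
typedef struct {
    uint32_t     address;   /* PC value at which the command is executed */
    tp_command_t command;
    uint32_t     operand;   /* data used by the command */
} tp_entry_t;

/* Collects up to max_out entries whose address equals pc, in table order.
 * Returns the number of entries written to out. */
size_t tp_match(const tp_entry_t *table, size_t n, uint32_t pc,
                const tp_entry_t **out, size_t max_out)
{
    size_t found = 0;
    for (size_t i = 0; i < n && found < max_out; i++) {
        if (table[i].address == pc)
            out[found++] = &table[i];
    }
    return found;
}
```

Several entries may share one address, which is why the lookup returns a list. Since the table is keyed by PC value alone, the program binary itself does not have to change.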

Command Table Allocation

Node    Command                              Address
FORK    Load Tick Address Queue x N,         Address of FORK
        Fill Tick Instruction Buffer
EOT     Load Tick Address Queue,             Address of EOT
        Fill Tick Instruction Buffer
KILL    Fill Tick Instruction Buffer         Address of KILL
Loops   Discard Loop Associative Memory,     Address at start of loop
        Store Loop Associative Memory
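
The allocation table can be read as a function from node kind to a list of commands. The sketch below encodes exactly the four rows of the table. It assumes that N in "Load Tick Address Queue x N" is the number of threads spawned by the FORK. All identifiers are hypothetical.

```c
#include <stddef.h>

typedef enum { CA_FORK, CA_EOT, CA_KILL, CA_LOOP } ca_node_t;

typedef enum {
    CA_DISCARD_LOOP_MEMORY,
    CA_STORE_LOOP_MEMORY,
    CA_FILL_TICK_INSTRUCTION_BUFFER,
    CA_LOAD_TICK_ADDRESS_QUEUE
} ca_command_t;

/* Stores c if there is room and always counts it, so callers can size out[]. */
static void ca_emit(ca_command_t c, ca_command_t *out, size_t max_out,
                    size_t *count)
{
    if (*count < max_out)
        out[*count] = c;
    (*count)++;
}

/* Writes the commands for one node into out[] and returns how many the node
 * needs. n_threads is only used for FORK nodes. */
size_t ca_commands_for_node(ca_node_t kind, size_t n_threads,
                            ca_command_t *out, size_t max_out)
{
    size_t count = 0;
    switch (kind) {
    case CA_FORK:
        for (size_t i = 0; i < n_threads; i++)
            ca_emit(CA_LOAD_TICK_ADDRESS_QUEUE, out, max_out, &count);
        ca_emit(CA_FILL_TICK_INSTRUCTION_BUFFER, out, max_out, &count);
        break;
    case CA_EOT:
        ca_emit(CA_LOAD_TICK_ADDRESS_QUEUE, out, max_out, &count);
        ca_emit(CA_FILL_TICK_INSTRUCTION_BUFFER, out, max_out, &count);
        break;
    case CA_KILL:
        ca_emit(CA_FILL_TICK_INSTRUCTION_BUFFER, out, max_out, &count);
        break;
    case CA_LOOP:
        ca_emit(CA_DISCARD_LOOP_MEMORY, out, max_out, &count);
        ca_emit(CA_STORE_LOOP_MEMORY, out, max_out, &count);
        break;
    }
    return count;
}
```

An allocator would walk the program's control-flow graph, call this once per FORK, EOT, KILL and loop-entry node, and attach the node's address to each emitted command.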

Results
- WCRT reduction
  - 8.5% versus locked SPMs
  - 12.3% versus thread multiplexed SPM
  - 13.4% versus direct mapped caches

Results - Synthesis

Conclusion
- Presented a new memory architecture
  - Tailored for synchronous programs
  - Has better worst-case performance
  - Analysis time is scalable
    - Between scratchpad and abstract cache analysis
- The presented architecture is also suitable for other synchronous languages
- Future work
  - Data TickPAD
  - TickPAD on multicores

Thank You

[Diagram: TickPAD design flow. Labels: PRET-C program, graph construction, TCCFG, TickPAD allocation analysis, TickPAD configuration file, updated TCCFG, TickPAD timing analysis, reachability analysis, worst case reaction time]
[Diagram: TickPAD hardware. Blocks: main memory, control logic, branch instruction check (hasBranch), tag comparison, Tick FIFO, associative loop memory, spatial memory pipeline (SMP buffer 1, SMP buffer 2), command buffer, demultiplexers. Signals: clk, Address[32], Instruction[32], ADDR[TAG], ADDR[Block Offset], WriteEn, Toggle]
[Diagram: spatial memory pipeline example over instruction lines 0x310, 0x320, 0x330 and 0x3B0, for linear code and for a branch. Rows: execute buffer, fetch buffer, processor execution. States: fetching, stall, disabled]
[Diagram: tick address queue and tick instruction buffer example, steps I to VII, over instruction lines between 0x2A0 and 0x9A0. Rows: execute buffer, fetch buffer, tick instruction buffer, tick address queue, processor execution. States: fetching, stall, disabled, empty, invalid]
[Diagram: control-flow graph fragments with instruction lines between 0x2F0 and 0x9B0, and addresses 0x4B4, 0x944 and 0xAE4]