8/12/2019 Tutorial06 Solution
Technische Universität München
Chip Multicore Processors
Tutorial 6
Institute for Integrated Systems
Theresienstr. 90
Building N1
www.lis.ei.tum.de
S. Wallentowitz
Task 6.1: Cache Misses
Explain the 3 Cs of cache misses. How do they relate?
3 Cs Model
Compulsory: Very first access to a data block. Cannot be avoided.
Capacity: The restricted cache size leads to a miss after replacement. Would not have happened if the cache were unlimited.
Conflict: Miss occurs due to a prior replacement. Would not have happened with a fully associative cache.
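The three definitions above can be turned into a small classifier. The following is a minimal sketch, not part of the original slides: it assumes a direct-mapped cache under test, an LRU fully associative cache of the same size as the reference for capacity misses, and a trace given as block addresses. The function name `classify_misses` and its parameters are hypothetical.

```python
from collections import OrderedDict

def classify_misses(block_addresses, num_blocks):
    """Classify accesses of a direct-mapped cache using the 3 Cs model."""
    seen = set()                 # blocks touched before (unlimited cache)
    fully_assoc = OrderedDict()  # LRU fully associative cache, same size
    direct_mapped = {}           # index -> block currently stored
    counts = {"hit": 0, "compulsory": 0, "capacity": 0, "conflict": 0}

    for block in block_addresses:
        # Reference model: fully associative cache with LRU replacement.
        fa_hit = block in fully_assoc
        if fa_hit:
            fully_assoc.move_to_end(block)
        else:
            if len(fully_assoc) >= num_blocks:
                fully_assoc.popitem(last=False)  # evict least recently used
            fully_assoc[block] = True

        # Cache under test: direct-mapped, index = block mod num_blocks.
        index = block % num_blocks
        dm_hit = direct_mapped.get(index) == block
        direct_mapped[index] = block

        if dm_hit:
            counts["hit"] += 1
        elif block not in seen:
            counts["compulsory"] += 1  # very first access to this block
        elif not fa_hit:
            counts["capacity"] += 1    # misses even when fully associative
        else:
            counts["conflict"] += 1    # fully associative would have hit
        seen.add(block)
    return counts

# Blocks 0 and 4 map to the same index in a 4-block direct-mapped cache.
print(classify_misses([0, 4, 0, 4], num_blocks=4))
# Five distinct blocks do not fit into four entries.
print(classify_misses([0, 1, 2, 3, 4, 0], num_blocks=4))
```

In the first trace the two repeated accesses are conflict misses, because a fully associative cache of the same size would hold both blocks. In the second trace the final access to block 0 is a capacity miss, because even the fully associative reference has already evicted it.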
Example of the Contribution of Each of the Cs
[Figure: contribution of each of the Cs to the miss rate, (c) Rochester Institute of Technology]
Task 6.2: Multilevel Caches and Separate Instruction and Data Caches
In this tutorial we focus on the impact of the cache hierarchy on the performance of the system.
a) You have a simple system with a processor core, no caches, and an external memory. The on-chip interconnect between the processor core and the memory requires 3 + x cycles for the transfer of x data words. The memory requires 46 cycles to access data. Using a benchmark, you find that on average every fifth instruction accesses data. Your processor has a CPI of 1.8.
What is the CPI of your whole system?
Simple System
[Diagram: Proc. Core connected directly to Memory]
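One possible way to work out a) is sketched here. It is not taken from the slides and rests on stated assumptions: every instruction is fetched from memory as a single word, every fifth instruction adds one single-word data access, and the memory cycles simply add to the base CPI of 1.8. The function names are hypothetical.

```python
def memory_access_cycles(words, interconnect_overhead=3, memory_latency=46):
    """Cycles for one memory access: (3 + x) interconnect cycles plus 46 memory cycles."""
    return interconnect_overhead + words + memory_latency

def cpi_without_cache(base_cpi=1.8, data_access_fraction=0.2):
    """System CPI when every instruction fetch and data access goes to memory."""
    # Assumption: one instruction fetch per instruction plus 0.2 data accesses.
    accesses_per_instruction = 1.0 + data_access_fraction
    return base_cpi + accesses_per_instruction * memory_access_cycles(words=1)

print(memory_access_cycles(words=1))    # 50
print(round(cpi_without_cache(), 3))    # 61.8
```

Under these assumptions the memory accesses dominate completely: 1.2 accesses per instruction at 50 cycles each add 60 cycles to the base CPI.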
b) You are using a direct-mapped cache of 32 kB with cache blocks of 4 words (each 32 bit). The cache is accessed in one clock cycle. For your application you measure a miss rate of 5%. How does the CPI change?
[Diagram: Proc. Core, Cache, Memory]
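A hedged sketch for b), again under assumptions that the slides do not state explicitly: each of the 1.2 accesses per instruction pays the one-cycle cache access, a miss fetches a whole 4-word block from memory (3 + 4 + 46 cycles), and the 5% miss rate applies equally to instruction and data accesses. The function name `cpi_with_cache` is hypothetical.

```python
def cpi_with_cache(miss_rate=0.05, hit_cycles=1, block_words=4,
                   base_cpi=1.8, data_access_fraction=0.2,
                   interconnect_overhead=3, memory_latency=46):
    """System CPI with a single cache between core and memory."""
    # A miss transfers one full cache block over the interconnect.
    miss_penalty = interconnect_overhead + block_words + memory_latency
    # Average memory access time in cycles.
    amat = hit_cycles + miss_rate * miss_penalty
    accesses_per_instruction = 1.0 + data_access_fraction
    return base_cpi + accesses_per_instruction * amat

print(round(cpi_with_cache(), 3))  # 6.18
```

With these assumptions the average access costs 1 + 0.05 * 53 = 3.65 cycles, which is roughly a tenfold CPI improvement over the system without a cache.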
c) Impressed by the improvement, you add a second instance of this cache. How does the CPI change?
[Diagram: Proc. Core, Cache, Cache, Memory]
Local vs. Global Miss Rate
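The local miss rate of a cache is its misses divided by the accesses that reach that cache. The global miss rate is its misses divided by all accesses issued by the processor. For a second-level cache, global = (level 1 miss rate) x (level 2 local miss rate). The sketch below applies this to c). It assumes that the second cache sits in series behind the first and, being an identical instance, holds the same blocks, so its local miss rate is taken as 100%. That figure is an illustrative assumption, not a value from the slides, and the function names are hypothetical.

```python
def global_miss_rate(l1_miss_rate, l2_local_miss_rate):
    """Fraction of all processor accesses that miss in both cache levels."""
    return l1_miss_rate * l2_local_miss_rate

def local_miss_rate(l1_miss_rate, l2_global_miss_rate):
    """Fraction of the accesses reaching level 2 that miss there."""
    return l2_global_miss_rate / l1_miss_rate

def cpi_two_level(l1_miss_rate, l2_local_miss_rate, l1_cycles=1, l2_cycles=1,
                  memory_penalty=3 + 4 + 46, base_cpi=1.8,
                  accesses_per_instruction=1.2):
    """System CPI with two caches in series, using the local level 2 miss rate."""
    amat = l1_cycles + l1_miss_rate * (
        l2_cycles + l2_local_miss_rate * memory_penalty)
    return base_cpi + accesses_per_instruction * amat

# Assumption for c): the identical second cache never hits (local miss rate 1.0).
print(round(global_miss_rate(0.05, 1.0), 3))  # 0.05
print(round(cpi_two_level(0.05, 1.0), 3))     # 6.24
```

Under this assumption the second cache only adds its access time to every level 1 miss, so the CPI gets slightly worse than with a single cache.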
d) Alternatively, you can use two other caches, each with a total size of 256 kB. One of the caches is 2-way associative and has an access time of 4 cycles. The other cache is 4-way associative and has an access time of 6 cycles. The global miss rate is 4% for the first cache and 3% for the second cache.
Which cache should be chosen?
[Diagram: Proc. Core, Cache, Cache, Memory]
First cache: 32 kB, direct-mapped, 5% global miss rate, 1 cycle
Second cache: 256 kB, 2-way, 4% global miss rate, 4 cycles
[Diagram: Proc. Core, Cache, Cache, Memory]
First cache: 32 kB, direct-mapped, 5% global miss rate, 1 cycle
Second cache: 256 kB, 4-way, 3% global miss rate, 6 cycles
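The two candidates from d) can be compared with a short calculation. This is a sketch under assumptions not stated on the slides: the second cache also uses 4-word blocks (so a miss to memory costs 3 + 4 + 46 cycles), a first-level miss pays only the access time of the second cache, and every instruction causes 1.2 memory accesses on average. The function name `cpi_with_l2` is hypothetical.

```python
def cpi_with_l2(l2_cycles, l2_global_miss_rate, l1_cycles=1, l1_miss_rate=0.05,
                memory_penalty=3 + 4 + 46, base_cpi=1.8,
                accesses_per_instruction=1.2):
    """System CPI for a two-level hierarchy given the GLOBAL level 2 miss rate."""
    # Every access pays the first cache, 5% also pay the second cache,
    # and the global level 2 miss rate gives the share that goes to memory.
    amat = (l1_cycles
            + l1_miss_rate * l2_cycles
            + l2_global_miss_rate * memory_penalty)
    return base_cpi + accesses_per_instruction * amat

two_way = cpi_with_l2(l2_cycles=4, l2_global_miss_rate=0.04)
four_way = cpi_with_l2(l2_cycles=6, l2_global_miss_rate=0.03)
print(round(two_way, 3), round(four_way, 3))  # 5.784 5.268
```

Under these assumptions the 4-way cache gives the lower CPI: its longer access time is paid only on the 5% of accesses that miss in the first cache, while its lower miss rate saves expensive memory accesses.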
e) Explain the meaning of spatial and temporal locality in the context of instructions and data.
Locality of data
Locality of instructions
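Spatial locality means that addresses close to a recently used address are likely to be used soon (array elements, consecutive instructions). Temporal locality means that a recently used address is likely to be used again soon (loop bodies, loop counters). The effect of both can be illustrated with a toy direct-mapped cache model. This is a sketch, not part of the slides: the geometry of 2048 blocks of 4 words corresponds to the 32 kB cache from b) if 1 kB is taken as 1024 bytes, and the three access patterns are made-up examples.

```python
def miss_rate(word_addresses, num_blocks=2048, block_words=4):
    """Miss rate of a direct-mapped cache for a trace of word addresses."""
    lines = [None] * num_blocks  # block currently stored in each cache line
    misses = 0
    for address in word_addresses:
        block = address // block_words
        index = block % num_blocks
        if lines[index] != block:
            misses += 1
            lines[index] = block
    return misses / len(word_addresses)

# Spatial locality: consecutive words share a block, one miss per 4 words.
sequential = list(range(4096))
# No spatial locality: a stride of one block touches each block only once.
strided = [i * 4 for i in range(4096)]
# Temporal locality: a 64-word loop body executed 100 times.
loop = list(range(64)) * 100

print(miss_rate(sequential))  # 0.25
print(miss_rate(strided))     # 1.0
print(miss_rate(loop))        # 0.0025
```

The sequential pattern resembles straight-line code or an array walk, and the loop pattern resembles repeatedly executed instructions or reused variables.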
f) Based on your previous findings, exchange the level one cache with separate caches for instructions and data. The miss rate for instructions is 3% and for data 8%. Use the level 2 cache from d) and calculate the CPI value.
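A possible calculation for f) is sketched here with several loudly stated assumptions that the slides do not confirm: the level 2 cache is the 4-way, 6-cycle cache from d), its global miss rate of 3% is assumed to stay the same for both instruction and data accesses, both level 1 caches keep the one-cycle access time, a miss to memory costs 3 + 4 + 46 cycles, and instruction and data access times simply add up. The function names are hypothetical.

```python
def amat(l1_miss_rate, l1_cycles=1, l2_cycles=6, l2_global_miss_rate=0.03,
         memory_penalty=3 + 4 + 46):
    """Average memory access time for one access stream, in cycles."""
    return (l1_cycles
            + l1_miss_rate * l2_cycles
            + l2_global_miss_rate * memory_penalty)

def cpi_split_l1(instr_miss_rate=0.03, data_miss_rate=0.08,
                 base_cpi=1.8, data_access_fraction=0.2):
    """System CPI with separate level 1 instruction and data caches."""
    instruction_part = 1.0 * amat(instr_miss_rate)           # one fetch per instruction
    data_part = data_access_fraction * amat(data_miss_rate)  # every fifth instruction
    return base_cpi + instruction_part + data_part

print(round(amat(0.03), 3), round(amat(0.08), 3))  # 2.77 3.07
print(round(cpi_split_l1(), 3))                    # 5.184
```

Under these assumptions the split pays off because the frequent instruction fetches see the lower miss rate, while the higher data miss rate affects only every fifth instruction.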