EECE476: Computer Architecture
Lecture 11: Understanding and Assessing Performance
Chapter 4.1, 4.2
The University of British Columbia EECE 476 © 2005 Guy Lemieux
Measurement and Evaluation
Architecture is an iterative process -- searching the space of possible designs -- at all levels of computer systems
[Figure: design/analysis loop. Labels: Creativity, Design, Analysis, Cost/Performance Analysis, Good Ideas, Mediocre Ideas, Bad Ideas]
We need a way to measure performance so we can find the good ideas!!!
Performance Trends
[Figure: log of performance vs. year, 1970 to 1995, with curves for microprocessors, minicomputers, mainframes, and supercomputers]
How to Obtain Performance? Through Transistors?
• Source: Intel
How to Obtain Performance? Through Clock Speed?
[Figure: clock speed of Intel CPUs (cycles per second, log scale from 100.0E+3 to 10.0E+9) vs. year, Jun-71 to Jun-04]
Performance
• How to obtain performance?
  – We can’t really answer this until we understand how to measure performance!
• How to measure performance?
  – This is a fundamental question!
  – Buying a Car:
    • Top Speed? Fuel Economy? Range? Turning radius?
  – Buying a Computer:
    • Clock Speed? Power? Battery Life? Boot-up time?
Airplanes! Which has Greater Performance?
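| Airplane        | Passenger Capacity (ppl) | Range (km) | Speed (km/h) | Throughput (ppl*km/h) |
|-----------------|--------------------------|------------|--------------|-----------------------|
| Boeing 777      | 375                      | 7,450      | 980          | 367,500               |
| Boeing 747      | 470                      | 6,680      | 980          | 460,600               |
| Concorde        | 132                      | 6,440      | 2,170        | 286,440               |
| Douglas DC-8-50 | 146                      | 14,030     | 875          | 127,750               |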
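The throughput column appears to be passenger capacity multiplied by speed. A minimal sketch that recomputes it from the table's figures and shows that "best" depends on the metric chosen; the dictionary layout and function names are illustrative, not part of the lecture:

```python
# Figures copied from the airplane table above.
airplanes = {
    "Boeing 777": {"capacity": 375, "range_km": 7450, "speed_kmh": 980},
    "Boeing 747": {"capacity": 470, "range_km": 6680, "speed_kmh": 980},
    "Concorde": {"capacity": 132, "range_km": 6440, "speed_kmh": 2170},
    "Douglas DC-8-50": {"capacity": 146, "range_km": 14030, "speed_kmh": 875},
}


def throughput(plane):
    """Passenger throughput in ppl*km/h: capacity times speed."""
    return plane["capacity"] * plane["speed_kmh"]


def best_by(metric):
    """Name of the airplane that maximizes the given metric function."""
    return max(airplanes, key=lambda name: metric(airplanes[name]))


best_throughput = best_by(throughput)
fastest = best_by(lambda plane: plane["speed_kmh"])
longest_range = best_by(lambda plane: plane["range_km"])

print("Highest throughput:", best_throughput)
print("Fastest:", fastest)
print("Longest range:", longest_range)
```

Each metric picks a different airplane, which is why a performance claim needs to name the metric it uses.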
Performance: Two Fundamental Concepts
1. Throughput (aka bandwidth)
  – Total amount of work done in a given time
    • Boeing 747
    • Laundromat with many washers & dryers
    • Important for computer data centres
2. Response time (aka latency)
  – Time from start to end of a given task
    • Concorde
    • One fast, modern laundry machine at home
    • Important for personal computers
Which is more important for this course?
  – Mostly response time!
  – Better response time usually implies higher throughput (but not )
Defining Performance (Response Time)
Given a computer architecture X, define:
  Performance_X = 1 / ExecutionTime_X
Suppose X is “faster” than Y:
  Performance_X > Performance_Y
Implies:
  1 / ExecutionTime_X > 1 / ExecutionTime_Y
or
  ExecutionTime_Y > ExecutionTime_X
Relative Performance
X is n times faster than Y means:
  n = Performance_X / Performance_Y
    = ExecutionTime_Y / ExecutionTime_X
Example: how much faster is A than B?
• Machine A: 10 seconds.
• Machine B: 15 seconds.
• ExecutionTime_B / ExecutionTime_A = 15/10 = 1.5
Hence, A is 1.5 times faster than B.
Try to be clear: IMPROVE performance, don’t increase it!!!
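The relative-performance definition can be checked with a few lines of code. This is a minimal sketch using the Machine A and Machine B times from the example; the function name is an illustrative choice:

```python
def relative_performance(time_x, time_y):
    """Return n such that X is n times faster than Y.

    n = Performance_X / Performance_Y = ExecutionTime_Y / ExecutionTime_X
    """
    if time_x <= 0 or time_y <= 0:
        raise ValueError("execution times must be positive")
    return time_y / time_x


# Machine A takes 10 seconds, Machine B takes 15 seconds.
n = relative_performance(time_x=10.0, time_y=15.0)
print(f"A is {n} times faster than B")
```

Note that the slower machine's time goes in the numerator, because performance is the reciprocal of execution time.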
Measuring Execution Time
Three possible ways of measuring response time
1. Wall-clock Time
  • Start to finish, includes everything (e.g., other programs, I/O)
  • Very non-deterministic!
2. CPU Time (System + User)
  • User Time = your program (directly)
  • System Time = in OS on behalf of your program (excludes I/O)
  • System Time difficult to ascertain, other programs may affect it
  • Can vary greatly, depending on quality of OS!
  • Non-deterministic!
3. CPU Time (User only)
  • User’s program, excluding I/O, excluding OS
  • Fairly deterministic
Which is better? Either 2 or 3 …
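The three measurements can be observed from inside a program. The following is a rough sketch using Python's standard `time.perf_counter` (wall-clock) and `os.times` (user and system CPU time of the current process); the workload `busy_work` and the sleep are made up for illustration, and the numbers will vary from run to run and from OS to OS:

```python
import os
import time


def busy_work(n):
    """A small CPU-bound loop; its time is charged mostly as user time."""
    total = 0
    for i in range(n):
        total += i * i
    return total


wall_start = time.perf_counter()
cpu_start = os.times()

busy_work(200_000)
time.sleep(0.05)  # waiting adds to wall-clock time but uses almost no CPU

cpu_end = os.times()
wall_end = time.perf_counter()

wall = wall_end - wall_start                 # 1. wall-clock time
user = cpu_end.user - cpu_start.user         # 3. CPU time, user only
system = cpu_end.system - cpu_start.system   # OS work on behalf of this process

print(f"wall-clock:          {wall:.4f} s")
print(f"CPU (user + system): {user + system:.4f} s")
print(f"CPU (user only):     {user:.4f} s")
```

The sleep stands in for I/O or other waiting: it shows up in the wall-clock figure but barely in either CPU time.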
Poor Choices for Performance Metrics
• Why is each of these metrics bad?
  – Number of instructions
    • Static instruction count
    • Dynamic instruction count
    • How much work is done in each instruction?
  – Number of instructions per second
    • MIPS: millions of instructions per second
    • MIPS: meaningless indicator of processor speed (!)
  – Number of clock cycles
  – Clock speed (clock rate)
• Taken together, we may have something here….
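One way to see why MIPS misleads is to compare two builds of the same program. The sketch below uses hypothetical instruction counts and execution times (not measurements from the lecture): the build with the higher MIPS rating is the one that takes longer to finish.

```python
def mips(instr_count, exec_time_s):
    """Millions of instructions executed per second."""
    return instr_count / (exec_time_s * 1e6)


# Hypothetical numbers: the same program compiled two ways for one machine.
# Version A uses many simple instructions; version B uses fewer, more complex ones.
version_a = {"instr_count": 10e9, "exec_time_s": 10.0}
version_b = {"instr_count": 5e9, "exec_time_s": 8.0}

mips_a = mips(**version_a)
mips_b = mips(**version_b)

print(f"Version A: {mips_a:.0f} MIPS, {version_a['exec_time_s']} s")
print(f"Version B: {mips_b:.0f} MIPS, {version_b['exec_time_s']} s")
```

Version A rates higher in MIPS, yet version B finishes sooner, so MIPS says nothing about how much work each instruction does.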
Performance Equation (1)
Simplified version:
  CPUTime = #ClockCycles * CycleTime
          = #ClockCycles / ClockRate
#ClockCycles
• Encapsulates two things:
  – Number of instructions in a program
  – Complexity of each instruction
CycleTime = 1 / ClockRate
• Clock Rate is the clock speed (in MHz or GHz) of the CPU
• Cycle Time is the clock period (in ns) of the CPU
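A small sketch of the simplified equation, showing that the two forms agree. The cycle count and clock rate are hypothetical values chosen for easy arithmetic:

```python
import math


def cpu_time_from_rate(clock_cycles, clock_rate_hz):
    """CPUTime = #ClockCycles / ClockRate, in seconds."""
    return clock_cycles / clock_rate_hz


def cpu_time_from_period(clock_cycles, cycle_time_s):
    """CPUTime = #ClockCycles * CycleTime, in seconds."""
    return clock_cycles * cycle_time_s


# Hypothetical program: 4 billion clock cycles on a 2 GHz CPU.
clock_cycles = 4e9
clock_rate_hz = 2e9
cycle_time_s = 1.0 / clock_rate_hz  # CycleTime = 1 / ClockRate

t_rate = cpu_time_from_rate(clock_cycles, clock_rate_hz)
t_period = cpu_time_from_period(clock_cycles, cycle_time_s)

print(f"CPU time via clock rate: {t_rate} s")
print(f"Both forms agree: {math.isclose(t_rate, t_period)}")
```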
CycleTime
CycleTime == clock period
• 3.6 GHz Pentium 4 processor is fast!
  – 0.2778ns cycle time
  – SPECint_base2000 benchmark: 1510 (15.1 times faster than UltraSPARC)
  – http://www.spec.org/cpu2000/results/res2004q3/cpu2000-20040621-03127.html
• 2.0 GHz Pentium M processor is faster!
  – 0.5ns cycle time
  – SPECint_base2000 benchmark: 1528 (15.28 times faster than UltraSPARC)
  – http://www.spec.org/cpu2000/results/res2004q2/cpu2000-20040614-03081.html
Huh????
• Clock speed alone is not a good indicator of processor speed
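The cycle times and the comparison above can be reproduced from the quoted clock rates and SPECint_base2000 scores. A minimal sketch; the dictionary layout and variable names are illustrative:

```python
def cycle_time_ns(clock_rate_ghz):
    """Clock period in nanoseconds for a clock rate given in GHz."""
    return 1.0 / clock_rate_ghz


# Clock rates and SPECint_base2000 scores as quoted on the slide.
pentium_4 = {"clock_ghz": 3.6, "specint_base2000": 1510}
pentium_m = {"clock_ghz": 2.0, "specint_base2000": 1528}

p4_cycle = cycle_time_ns(pentium_4["clock_ghz"])
pm_cycle = cycle_time_ns(pentium_m["clock_ghz"])

# How much faster the Pentium 4 clock is, versus how the benchmark scores compare.
clock_ratio = pentium_4["clock_ghz"] / pentium_m["clock_ghz"]
score_ratio = pentium_m["specint_base2000"] / pentium_4["specint_base2000"]

print(f"Pentium 4 cycle time: {p4_cycle:.4f} ns")
print(f"Pentium M cycle time: {pm_cycle:.4f} ns")
print(f"Pentium 4 clock is {clock_ratio:.2f}x faster")
print(f"Pentium M score is {score_ratio:.3f}x higher")
```

The Pentium 4 has the much faster clock, yet the Pentium M posts the slightly higher benchmark score, so it must be doing more work per cycle.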
#ClockCycles (part 1)
Could assume each instruction takes one cycle:
[Figure: timeline in which the 1st, 2nd, 3rd, 4th, 5th, 6th, ... instructions each occupy one equal slot along the time axis]
This assumption is incorrect: different instructions take different amounts of time on different machines.
Why? Hint: these are machine instructions, not lines of C code.
#ClockCycles (part 2)
Reality: each instruction can take a different number of cycles!
1. Multiplication is slower than addition
2. Floating point operations are slower than integer operations
3. Accessing memory is slower than accessing registers
Important point: changing the cycle time often changes the number of cycles required for various instructions (more later)
#ClockCycles (part 3)
• MIPS or InstrCount alone is meaningless
• #ClockCycles alone is meaningless
• CycleTime alone is meaningless
… need to tie all three together….
• InstrCount (instructions per program)
• CPI (cycles per instruction)
• CycleTime (time per cycle)
Performance Equation (2)
• Put the pieces together…
    CPUTime = InstrCount * CPI * CycleTime
• Dimensional analysis
  – Check the units…
    time/prog = (instr/prog) * (cycle/instr) * (time/cycle)
  – The instr and cycle units cancel, leaving time/prog
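A short sketch of the equation in use. The two machines and all of their numbers are hypothetical; they are chosen so that the machine with the slower clock still wins because of its lower CPI:

```python
def cpu_time(instr_count, cpi, cycle_time_s):
    """CPUTime = InstrCount * CPI * CycleTime, in seconds."""
    return instr_count * cpi * cycle_time_s


# Hypothetical: the same program (1 billion instructions) on two machines
# that share an instruction set, so InstrCount is identical.
instr_count = 1e9

machine_a = cpu_time(instr_count, cpi=2.0, cycle_time_s=0.25e-9)  # 4 GHz clock
machine_b = cpu_time(instr_count, cpi=0.9, cycle_time_s=0.5e-9)   # 2 GHz clock

print(f"Machine A: {machine_a:.3f} s")
print(f"Machine B: {machine_b:.3f} s")
print("Faster machine:", "A" if machine_a < machine_b else "B")
```

No single factor decides the outcome; only the product of all three gives the execution time.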
Performance Equation (3)
Full version:
CPUTime = [ Σ_i (InstrCount_i * CPI_i) ] * CycleTime
• InstrCount_i: count of instructions of type i
• CPI_i: cycles per instruction of type i
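The full version sums cycles over instruction types before multiplying by the cycle time. A minimal sketch with a hypothetical instruction mix; the type names, counts, CPI values, and the 1 ns cycle time are all assumptions made for illustration:

```python
def total_cycles(instr_mix):
    """Sum of InstrCount_i * CPI_i over all instruction types i."""
    return sum(count * cpi for count, cpi in instr_mix.values())


def cpu_time(instr_mix, cycle_time_s):
    """CPUTime = (sum over i of InstrCount_i * CPI_i) * CycleTime."""
    return total_cycles(instr_mix) * cycle_time_s


# Hypothetical mix: instruction type -> (InstrCount_i, CPI_i)
mix = {
    "alu": (500_000_000, 1),
    "load_store": (300_000_000, 2),
    "branch": (200_000_000, 3),
}

cycles = total_cycles(mix)
instr_count = sum(count for count, _ in mix.values())
average_cpi = cycles / instr_count      # overall CPI for this mix
seconds = cpu_time(mix, cycle_time_s=1e-9)  # assumed 1 ns cycle (1 GHz)

print(f"Total cycles: {cycles}")
print(f"Average CPI:  {average_cpi}")
print(f"CPU time:     {seconds:.2f} s")
```

Dividing total cycles by total instructions recovers a single average CPI, which links this form back to CPUTime = InstrCount * CPI * CycleTime.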
Quickie Quiz
• Give the 2 most important concepts of performance measurement
• Give 3 ways of measuring performance
• Explain what is wrong with the following performance metrics
  – Instructions per second
  – Clock speed
  – Cycles per instruction
  – Number of transistors
  – Power
• What performance metric is used in this course?
• What is the performance equation? What does it mean? Why is it used?