
Advanced Computer Architecture-II - CS704 Power Point Slides Lecture 02


MAC/VU - Advanced Computer Architecture

Lecture 2 - Performance

CS 704 Advanced Computer Architecture

Lecture 2: Quantitative Principles

Detailed discussion on computer performance, the key to quantitative design and analysis


Today’s Topics

Recap of Lecture 1

Growth in processor performance

Price-performance design

CPU performance metrics

CPU benchmark suites

Summary


Recap of Lecture 1: Computer Systems

Architecture refers to those attributes of a computer visible to a programmer or compiler writer; e.g., instruction set, addressing techniques, I/O mechanisms, etc.

Organization refers to how the features of a computer are implemented; e.g., control signals are generated using the principles of a finite state machine (FSM) or microprogramming.


Recap of Lecture 1: Computer Development

• Academically, modern computer development had its infancy in 1944-49

• Commercially, the first machine was built by Eckert-Mauchly Computer Corporation in 1949

• Technological developments, from vacuum tubes to VLSI circuits, dynamic memory and network technology, gave birth to four different generations of computers.

• The microprocessor was introduced in 1971, followed by PCs


Recap of Lecture 1

Design Perspectives:

Processor: ISA, ILP and cache

Memory hierarchy: multilevel cache and virtual memory

Input/output and storage

Multiprocessors and networks


Recap of Lecture 1: Computer Design Cycle

• Computer design and development have been influenced by

- technology

- performance and

- cost;

the decisive factors for rapid changes in computer development have been performance enhancements, price reductions and functional improvements.


Growth in Processor Performance

• Supercomputers and mainframes, costing millions of dollars and occupying very large spaces, were the prevailing form of computing in the 1960s; they were replaced by relatively low-cost and smaller minicomputers in the 1970s

• In the 1980s, very low-cost microprocessor-based desktop machines, in the form of personal computers (PCs) and workstations, were introduced.


Growth in Processor Performance

• The growth in processor performance since the mid-1980s has been substantially higher than in earlier years

• Prior to the mid-1980s, microprocessor performance growth averaged about 35% per year

• By 2001 the growth had risen to a factor of about 1.58 per year (about 58% per year)


Growth in Processor Performance

[Figure: performance relative to MIPS (vertical axis, 0 to 1600) versus year (horizontal axis, 1984 to 2000); labeled data points include MIPS R2000, IBM Power1, HP 9000, DEC Alpha and Intel P-III.]


Price-Performance Design

Technology improvements are used to lower the cost and increase performance.

The relationship between cost and price is a complex one.

The cost is the total amount spent to produce a product.

The price is the amount for which a finished good is sold.


Price-Performance Design

The cost passes through different stages before it becomes the price.

A small change in cost may have a big impact on price


Price vs. Cost

• Manufacturing costs: total amount spent to produce a component

- Component cost: cost at which the components are available to the designer; it ranges from 40% to 50% of the list price of the product.

- Recurring costs: labor, purchasing, scrap, warranty; 4% to 16% of list price

- Gross margin (non-recurring cost): R&D, marketing, sales, equipment, rental, maintenance, financing cost, pre-tax profits, taxes


Price vs. Cost (cont'd)

• List price:

• Amount for which the finished good is sold;

• it includes an average discount of 15% to 35%, given as volume discounts and/or retailer markup


Price vs. Cost: Price-Performance Design (cont'd)

[Figure: stacked bar chart showing the breakdown of list price (0% to 100%) for minicomputers (Mini), workstations (W/S) and PCs into average discount, gross margin, direct costs and component costs.]


Cost-effective IC Design: Price-Performance Design

Yield: percentage of manufactured components surviving testing

Volume: higher manufacturing volume decreases the list price and improves purchasing efficiency

Feature size: the minimum size of a transistor or wire in either the x or y direction


Cost-effective IC Design: Price-Performance Design

Reduction in feature size from 10 microns in 1971 to 0.18 microns in 2001 has resulted in:

- Quadratic rise in transistor count

- Linear increase in performance

- 4-bit to 64-bit microprocessor

- Desktops have replaced time-sharing machines


Cost of Integrated Circuits

Manufacturing Stages:

Integrated circuit manufacturing passes through several stages:

- Wafer growth and testing

- Chopping the wafer into dies

- Packaging the dies into chips

- Testing the chips


Cost of Integrated Circuits

Die: the square area of the wafer containing the integrated circuit

Note that when dies are fitted onto the wafer, a small wafer area around the periphery goes to waste.

Cost of a die: determined from the cost of a wafer, the number of dies that fit on a wafer, and the percentage of dies that work, i.e., the die yield.


Dies of Integrated Circuits


Cost of Integrated Circuits

• The cost of an integrated circuit is the ratio of the total cost (the sum of the die cost, the die testing cost, the packaging cost and the final chip testing cost) to the final test yield.


Calculating Integrated Circuit Costs

$$\text{Cost of IC} = \frac{\text{die cost} + \text{die testing cost} + \text{packaging cost} + \text{final testing cost}}{\text{final test yield}}$$
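The IC cost formula can be sketched in a few lines of Python. This is only an illustration: the function name `ic_cost` and all dollar figures are hypothetical and do not come from the lecture.

```python
def ic_cost(die_cost, die_test_cost, packaging_cost, final_test_cost,
            final_test_yield):
    """Cost of one good packaged chip: total cost divided by final test yield."""
    if not 0 < final_test_yield <= 1:
        raise ValueError("final_test_yield must be in the range (0, 1]")
    total = die_cost + die_test_cost + packaging_cost + final_test_cost
    return total / final_test_yield


# Hypothetical dollar figures, chosen only to exercise the formula.
example_cost = ic_cost(die_cost=10.0, die_test_cost=1.0, packaging_cost=3.0,
                       final_test_cost=1.0, final_test_yield=0.75)
print(f"Cost of IC: ${example_cost:.2f}")
```

A lower final test yield spreads the same spending over fewer good chips, which is why the yield appears in the denominator.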


Cost of Integrated Circuits

• The cost of a die is the ratio of the cost of the wafer to the product of the dies per wafer and the die yield


Calculating Integrated Circuit Costs

$$\text{Cost of die} = \frac{\text{cost of wafer}}{\text{dies per wafer} \times \text{die yield}}$$


Cost of Integrated Circuits

• The number of dies per wafer is determined by dividing the wafer area (minus the wasted wafer area near the round periphery) by the die area


Calculating Integrated Circuit Costs

$$\text{Dies per wafer} = \frac{\pi \times (\text{wafer diameter}/2)^2}{\text{die area}} - \frac{\pi \times \text{wafer diameter}}{\sqrt{2 \times \text{die area}}}$$


Example: Calculating Number of Dies

For a die 0.7 cm on a side (die area 0.49 cm²), find the number of dies per wafer for a wafer of 30 cm diameter.

Answer: [wafer area / die area] - wasted area term

$$= \frac{\pi \times (30/2)^2}{0.49} - \frac{\pi \times 30}{\sqrt{2 \times 0.49}}$$

= 1347 dies
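The same calculation can be checked with a short Python sketch. The function name `dies_per_wafer` is an assumption made for illustration; the formula is the wafer area over the die area, minus the edge-loss term.

```python
import math


def dies_per_wafer(wafer_diameter_cm, die_area_cm2):
    """Whole dies per wafer: wafer area / die area minus the edge-loss term."""
    wafer_area = math.pi * (wafer_diameter_cm / 2) ** 2
    edge_loss = math.pi * wafer_diameter_cm / math.sqrt(2 * die_area_cm2)
    return math.floor(wafer_area / die_area_cm2 - edge_loss)


die_side_cm = 0.7
n_dies = dies_per_wafer(30, die_side_cm ** 2)
print(n_dies)  # 1347
```

Rounding down reflects that only whole dies count.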



Calculating Die Yield

• Die yield is the fraction or percentage of good dies among the dies on a wafer

• Wafer yield accounts for wafers that are completely bad and so need not be tested

• Die yield depends on the defect density and on a parameter α, which depends on the number of masking levels

• A good estimate of α for CMOS is 4.0


Calculating Integrated Circuit Costs

$$\text{Die yield} = \text{Wafer yield} \times \left(1 + \frac{\text{defects per unit area} \times \text{die area}}{\alpha}\right)^{-\alpha}$$

Example:

The yield of a die 0.7 cm on a side (die area 0.49 cm²), with a defect density of 0.6/cm², α = 4.0 and a wafer yield of 100%:

$$= \left(1 + \frac{0.6 \times 0.49}{4.0}\right)^{-4} = 0.75$$
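As a rough check of the die yield example, the sketch below evaluates the formula for a die 0.7 cm on a side, i.e. an area of 0.7 x 0.7 = 0.49 cm². The function names, the 100% wafer yield default and the wafer cost of 5000 are assumptions made for illustration; only the defect density, α and the 1347 dies per wafer come from the slides.

```python
def die_yield(defects_per_cm2, die_area_cm2, alpha=4.0, wafer_yield=1.0):
    """Die yield = wafer yield x (1 + defect density x die area / alpha)^(-alpha)."""
    return wafer_yield * (1 + defects_per_cm2 * die_area_cm2 / alpha) ** (-alpha)


def die_cost(wafer_cost, dies_per_wafer, die_yield_value):
    """Cost of die = cost of wafer / (dies per wafer x die yield)."""
    return wafer_cost / (dies_per_wafer * die_yield_value)


y = die_yield(defects_per_cm2=0.6, die_area_cm2=0.49)
print(round(y, 2))  # 0.75

# Hypothetical wafer cost; 1347 dies per wafer is taken from the earlier example.
cost_per_die = die_cost(wafer_cost=5000.0, dies_per_wafer=1347, die_yield_value=y)
print(f"{cost_per_die:.2f}")
```

Because the yield falls as die area grows while the dies per wafer also fall, die cost rises much faster than linearly with die area.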


Price-Performance Design

• Time to run the task:

• Execution time, response time, latency

• Throughput or bandwidth:

• Tasks per day, hour, week, sec, ns …


Price-Performance Design

• Example:

• To carry 2400 passengers from Lahore to Islamabad, a train completes the task in 4.0 hours while an airplane completes the same task in 6.0 hours;

• i.e., the airplane completes 66.67% of the task in the same time; the throughput, and hence the performance, of the train is 50% more than that of the airplane


Price-Performance Design: Example

Vehicle   Cost/person   Time, Lahore to Islamabad   Passengers/trip   Execution time/person   Cost-performance                   Time to complete job
Train     300 Rs.       4.0 hours                   2400              6.0 sec                 300 x 6 = 1,800 Rs-sec/person      4.0 hours
Plane     3000 Rs.      45 min                      300               9.0 sec                 3000 x 9 = 27,000 Rs-sec/person    45 x 8 min = 6.0 hours

The plane is about 5 times faster per trip (45 min vs. 4.0 hours) but takes 50% more time to complete the job; i.e., it has lower throughput, so the performance of the train is 50% better than that of the plane.

The time per person and the cost per person of the train are less than those of the plane. Thus the cost-performance of the train relative to the plane is 1:15.
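The figures in this example can be reproduced with a small Python sketch. The helper `job_stats` and its dictionary keys are hypothetical; like the slide, it assumes trips run back to back with no return or turnaround time.

```python
import math


def job_stats(passengers, seats_per_trip, trip_minutes, fare_rs):
    """Per-person time, cost-performance and total job time for one vehicle."""
    trips = math.ceil(passengers / seats_per_trip)
    sec_per_person = trip_minutes * 60 / seats_per_trip
    return {
        "trips": trips,
        "job_hours": trips * trip_minutes / 60,
        "sec_per_person": sec_per_person,
        "cost_performance": fare_rs * sec_per_person,  # Rs-sec/person
    }


train = job_stats(passengers=2400, seats_per_trip=2400, trip_minutes=240, fare_rs=300)
plane = job_stats(passengers=2400, seats_per_trip=300, trip_minutes=45, fare_rs=3000)

print(train["job_hours"], plane["job_hours"])          # 4.0 6.0
print(plane["job_hours"] / train["job_hours"])         # 1.5
print(plane["cost_performance"] / train["cost_performance"])  # 15.0
```

The plane wins on latency for a single trip, while the train wins on throughput for the whole job, which is the distinction the slide is drawing.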


Metrics of Performance

System levels, from top to bottom: Application; Programming Language; Compiler; Instruction Set Architecture; Datapath and Control; Function Units; Transistors, Wires, Pins

Metrics shown alongside these levels:

Answers per month; operations per second

MIPS: millions of instructions per second

MFLOPS: millions of FP operations per second

Megabytes per second

Cycles per second (clock rate)


Aspects of CPU Performance

$$\text{CPU time} = \frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}}$$

                Inst Count    CPI    Clock Rate
Program             √
Compiler            √
Inst. Set           √          √
Organization                   √         √
Technology                               √
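The CPU time equation can be sketched as follows. The function name `cpu_time` and the sample numbers (one billion instructions, a CPI of 1.5, a 500 MHz clock) are hypothetical and serve only to show the units working out.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time (s) = instructions x cycles/instruction x seconds/cycle."""
    seconds_per_cycle = 1.0 / clock_rate_hz
    return instruction_count * cpi * seconds_per_cycle


# Hypothetical program and machine.
t = cpu_time(instruction_count=1_000_000_000, cpi=1.5, clock_rate_hz=500e6)
print(f"{t:.2f} s")  # 3.00 s
```

Each factor maps to a column of the table: the program and compiler mostly change the instruction count, the organization changes CPI and clock rate, and the technology changes the clock rate.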


Cycles Per Instruction

• Cycles per instruction (CPI):

$$\text{CPI} = \frac{\text{CPU clock cycles for program}}{\text{Instruction count}} = \frac{\text{CPU time} \times \text{Clock rate}}{\text{Instruction count}}$$

• Instruction frequency:

For an instruction mix, the relative frequency of occurrence of the different types of instructions is given as:

$$FIC_i = \frac{IC_i}{\text{Total instruction count}}$$

• Average cycles per instruction:

$$\text{CPI} = \frac{1}{\text{Instruction count}} \sum_{i=1}^{n} IC_i \times CPI_i = \sum_{i=1}^{n} FIC_i \times CPI_i$$


Example: Calculating average CPI

Base Machine (Reg / Reg)

Op        Freq    Cycles    CPI(i)    (% Time)
ALU       50%     1         0.5       (33%)
Load      20%     2         0.4       (27%)
Store     10%     2         0.2       (13%)
Branch    20%     2         0.4       (27%)
Average CPI                 1.5
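The table can be recomputed with a short Python sketch. The dictionary `mix` simply restates the frequencies and cycle counts from the table; the variable names are illustrative.

```python
# Instruction mix from the table: op -> (relative frequency, cycles).
mix = {
    "ALU": (0.50, 1),
    "Load": (0.20, 2),
    "Store": (0.10, 2),
    "Branch": (0.20, 2),
}

# Average CPI is the frequency-weighted sum of the per-class cycle counts.
avg_cpi = sum(freq * cycles for freq, cycles in mix.values())

# Share of execution time spent in each instruction class.
time_share = {op: freq * cycles / avg_cpi for op, (freq, cycles) in mix.items()}

print(round(avg_cpi, 2))  # 1.5
for op, share in time_share.items():
    print(f"{op}: {share:.0%}")
```

Note that ALU instructions are half of the instruction count but only a third of the execution time, because they take fewer cycles each.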


Cycles Per Instruction

Arithmetic mean time:

$$\frac{1}{n} \sum_{i=1}^{n} \text{Time}_i$$

Weighted arithmetic mean time:

$$\sum_{i=1}^{n} w_i \times \text{Time}_i$$

Geometric mean time:

$$\sqrt[n]{\prod_{i=1}^{n} \text{Execution time ratio}_i}$$
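The three means can be sketched in Python as below. The function names and the sample data (two programs taking 2 s and 8 s, with equal weights) are hypothetical; the geometric mean is computed through logarithms, which is one common way to avoid overflow in long products.

```python
import math


def arithmetic_mean(times):
    """(1/n) * sum of the execution times."""
    return sum(times) / len(times)


def weighted_arithmetic_mean(times, weights):
    """Sum of w_i * Time_i; the weights are assumed to add up to 1."""
    return sum(w * t for w, t in zip(weights, times))


def geometric_mean(ratios):
    """n-th root of the product of the execution time ratios."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


times = [2.0, 8.0]  # hypothetical execution times in seconds
am = arithmetic_mean(times)
wam = weighted_arithmetic_mean(times, [0.5, 0.5])
gm = geometric_mean(times)
print(am, wam, round(gm, 6))  # 5.0 5.0 4.0
```

The geometric mean is normally applied to execution times normalized to a reference machine, which is why the slide writes it over execution time ratios.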


Summary: Price-Performance Design

Computer cost: the total cost of manufacturing a computer is distributed among different parts of the system, such as the cabinet, the processor board and the I/O devices.

Performance: time is the key measurement of performance.

Comparing the performance of two designs: the ratio

η = Execution time Y / Execution time X

determines how many times faster machine X is than machine Y; since performance is the inverse of execution time,

η = Performance X / Performance Y
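A minimal sketch of this ratio, assuming hypothetical execution times of 15 s on machine Y and 10 s on machine X (the function name `performance_ratio` is also an assumption):

```python
def performance_ratio(exec_time_y, exec_time_x):
    """eta = ExTime(Y) / ExTime(X) = Performance(X) / Performance(Y)."""
    return exec_time_y / exec_time_x


# Hypothetical measurements in seconds.
eta = performance_ratio(exec_time_y=15.0, exec_time_x=10.0)
print(eta)  # 1.5

# The same ratio computed from performance, taken as 1 / execution time.
perf_x, perf_y = 1 / 10.0, 1 / 15.0
print(round(perf_x / perf_y, 6))  # 1.5
```

Here X is 1.5 times faster than Y, whichever of the two forms is used.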


Instruction Execution Rate - MIPS

MIPS specifies performance inversely to execution time. For a given program:

$$\text{MIPS} = \frac{\text{instruction count}}{\text{execution time} \times 10^6}$$

MIPS cannot be calculated from the instruction mix alone.

Relative MIPS for a machine M is defined with respect to a reference machine as:

$$\text{RMIPS} = \frac{\text{Performance}_M}{\text{Performance}_{\text{reference}}} \times \text{MIPS}_{\text{reference}} = \frac{\text{Time}_{\text{reference}}}{\text{Time}_M} \times \text{MIPS}_{\text{reference}}$$

MFLOPS is defined for floating-point-intensive programs as millions of floating-point operations per second.
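The MIPS and relative MIPS definitions can be sketched as follows. The function names and all sample values (200 million instructions in 4 s; a 1-MIPS reference machine taking 10 s where machine M takes 2 s) are hypothetical.

```python
def mips(instruction_count, execution_time_s):
    """MIPS = instruction count / (execution time x 10^6)."""
    return instruction_count / (execution_time_s * 1e6)


def relative_mips(time_reference_s, time_m_s, mips_reference):
    """RMIPS = (Time_reference / Time_M) x MIPS_reference."""
    return time_reference_s / time_m_s * mips_reference


# Hypothetical program: 200 million instructions executed in 4 seconds.
native = mips(instruction_count=200_000_000, execution_time_s=4.0)
print(native)  # 50.0

# Hypothetical reference machine rated at 1 MIPS.
rel = relative_mips(time_reference_s=10.0, time_m_s=2.0, mips_reference=1.0)
print(rel)  # 5.0
```

Relative MIPS depends only on measured times, so it can compare machines with different instruction sets, while native MIPS also depends on how many instructions each machine needs for the same program.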


CPU Benchmark Suites

Performance comparison: comparing the execution time of the same workload on two machines, without running the actual programs.

Benchmarks: programs specifically chosen to measure performance.

Five levels of programs, in decreasing order of accuracy:

- Real applications

- Modified applications

- Kernels

- Toy benchmarks

- Synthetic benchmarks


SPEC: System Performance Evaluation Cooperative

First Round 1989: 10 programs yielding a single number, SPECmarks

Second Round 1992: SPECint92 (6 integer programs) and SPECfp92 (14 floating point programs)

Third Round 1995:

- New set of programs: SPECint95 (8 integer programs) and SPECfp95 (10 floating point programs)

- "Benchmarks useful for 3 years"

- Single flag setting for all programs: SPECint_base95, SPECfp_base95


Summary: Designing and performance comparison

• Designing to Last through Trends

         Capacity         Speed
Logic    2x in 3 years    2x in 3 years
DRAM     4x in 3 years    2x in 10 years
Disk     4x in 3 years    2x in 10 years

• 6 years to graduate => 16X CPU speed, DRAM/Disk size

• Time to run the task: execution time, response time, latency

• Tasks per day, hour, week, sec, ns, ...: throughput, bandwidth

• "X is n times faster than Y" means

$$n = \frac{\text{ExTime}(Y)}{\text{ExTime}(X)} = \frac{\text{Performance}(X)}{\text{Performance}(Y)}$$


Summary (cont'd)

CPI Law:

$$\text{CPU time} = \frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}}$$

Execution time is the REAL measure of computer performance!

Good products are created when there are good benchmarks and good ways to summarize performance.

Die cost goes roughly with (die area)^4


Summary (cont'd)

"For better or worse, benchmarks shape a field"

Good products are created when there are:

- Good benchmarks

- Good ways to summarize performance

Given that sales are in part a function of performance relative to the competition, companies invest in improving the product as reported by the performance summary.

If the benchmarks or summary are inadequate, one must choose between improving the product for real programs and improving the product to get more sales; sales almost always wins!

Execution time is the measure of computer performance!