24
ECE 259 / CPS 221 Advanced Computer Architecture II (Parallel Computer Architecture) Evaluation – Metrics, Simulation, and Workloads Copyright 2010 Daniel J. Sorin Duke University

ECE 259 / CPS 221 Advanced Computer Architecture II

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: ECE 259 / CPS 221 Advanced Computer Architecture II

ECE 259 / CPS 221Advanced Computer Architecture II(Parallel Computer Architecture)

Evaluation – Metrics, Simulation, and Workloads

Copyright 2010 Daniel J. SorinDuke University

Page 2: ECE 259 / CPS 221 Advanced Computer Architecture II

Outline

• Metrics

• Methodologies– Modeling– Simulation

2(C) 2010 Daniel J. Sorin ECE 259 / CPS 221

• Workloads

Page 3: ECE 259 / CPS 221 Advanced Computer Architecture II

Performance Metrics

• How do we tell if our design is good?

• Performance metrics– Clock speed (gigahertz)? No! Why not?– Instructions per cycle? No! Why not? (tougher qu estion)– Database transactions per second?

3(C) 2010 Daniel J. Sorin ECE 259 / CPS 221

• What’s important? Depends on workload …– Latency ���� interactive computing– Throughput ���� batch jobs/queries– Availability ���� enterprise applications– Power ���� mobile computing … and everything else now, too– Cost-efficiency ���� everything but perhaps supercomputing

Page 4: ECE 259 / CPS 221 Advanced Computer Architecture II

Metrics and Units

• Latency (an aspect of performance)– Response time

• Throughput (another aspect of performance)– Transactions per cycle (e.g., TPM-C or TPM-H)

• Availability– How many “nines” (e.g., 5 nines = 99.999% available)

• Power: watts

4(C) 2010 Daniel J. Sorin ECE 259 / CPS 221

• Power: watts• Energy: joules• Hybrid metrics capture more than one aspect

– Cost-efficiency: dollars-seconds– Power-delay (energy-delay): watts-seconds (joules-s econds)– Performability: combines performance with availabil ity

Page 5: ECE 259 / CPS 221 Advanced Computer Architecture II

Secondary Metrics

• Metrics that we can use for insight, debugging, etc .– Quantify specific aspects of system, not holistic b ehavior

• Examples– Instructions per cycle (IPC)– Cache hit rates– Average memory request latency

5(C) 2010 Daniel J. Sorin ECE 259 / CPS 221

– Average memory request latency– Average network link utilization– Fraction of directory requests that require 3 hops– Etc.

• You can use these metrics to explain results– Otherwise, results are just inscrutable, unjustifie d numbers

Page 6: ECE 259 / CPS 221 Advanced Computer Architecture II

Comparing to Prior Work

• How does your idea compare to prior work?– This is how we show that our idea is worthy of publ ication– E.g., 50% better throughput on TPC-C, but with 20% more power

• Why is comparison difficult?– Impossible to exactly reproduce experimental setup

6(C) 2010 Daniel J. Sorin ECE 259 / CPS 221

• Example differences in experimental setup– Different system model

» Different ISA, microarchitecture, network, etc.– Different workloads (or same workloads compiled dif ferently)

» Different OS» Even for same exact application, can have different jobs

running in the background (e.g., kernel daemons)– Different simulator (or different configuration of simulator)

» Assumptions about latencies, bandwidths, etc.

Page 7: ECE 259 / CPS 221 Advanced Computer Architecture II

Fair Comparisons

• Ideally, we’d make perfectly fair comparisons– Compare “apples and apples”

• If impossible, then give benefit of doubt to prior work– Assumptions about prior work should be optimistic

7(C) 2010 Daniel J. Sorin ECE 259 / CPS 221

• Assumptions about our work should be pessimistic– Don’t assume that our 4MB cache can be accessed in 1 cycle– Find the worst-case scenario for our system– Assume that future trends will be less favorable th an is likely

• Show that, even in our worst case, we still do well– Otherwise, readers will be less convinced

Page 8: ECE 259 / CPS 221 Advanced Computer Architecture II

Cost Effective Computing (Wood & Hill)

• DISCUSSION

8(C) 2010 Daniel J. Sorin ECE 259 / CPS 221

Page 9: ECE 259 / CPS 221 Advanced Computer Architecture II

Outline

• Metrics

• Evaluation Methodologies– Modeling– Simulation

9(C) 2010 Daniel J. Sorin ECE 259 / CPS 221

• Workloads

Page 10: ECE 259 / CPS 221 Advanced Computer Architecture II

Ways to Evaluate New Architectures

Tradeoff between three desired features

10(C) 2010 Daniel J. Sorin ECE 259 / CPS 221

Building

Page 11: ECE 259 / CPS 221 Advanced Computer Architecture II

Building

• Construct a hardware prototype– ASIC vs. FPGA

• Advantages+ Way cool to show off hardware to friends+ Runs quickly

• Disadvantages– Takes long time (grad student time!) to build

11(C) 2010 Daniel J. Sorin ECE 259 / CPS 221

– Takes long time (grad student time!) to build– Expensive– Not flexible (esp. ASIC)

ASICs generally too labor intensive for research st udies, but FPGAs are viable options in many cases

Page 12: ECE 259 / CPS 221 Advanced Computer Architecture II

Modeling

• Mathematically model the system– Use probabilities and/or queuing models (see ECE 25 5/257)

• Advantages+ Very flexible+ Very quick to develop+ Runs quickly

12(C) 2010 Daniel J. Sorin ECE 259 / CPS 221

+ Runs quickly

• Disadvantages– Cannot capture effects of system details– Architects are skeptical of models

Generally OK for back of the envelope estimates

Page 13: ECE 259 / CPS 221 Advanced Computer Architecture II

Simulating

• Write a program that mimics system behavior

• Advantages+ Very flexible+ Relatively quick to develop

• Disadvantages

13(C) 2010 Daniel J. Sorin ECE 259 / CPS 221

• Disadvantages– Runs slowly (e.g., 30,000 times slower than hardwar e)

Method of choice for most architectural research

Page 14: ECE 259 / CPS 221 Advanced Computer Architecture II

Simulation Challenges

SimulatorWorkload

System

Performance metrics

14(C) 2010 Daniel J. Sorin ECE 259 / CPS 221

System description

Tough problems associated with each arrow!

Page 15: ECE 259 / CPS 221 Advanced Computer Architecture II

Applications to Simulate

• We care how system does on important applications

• We’ll talk about this in a few slides …

15(C) 2010 Daniel J. Sorin ECE 259 / CPS 221

Page 16: ECE 259 / CPS 221 Advanced Computer Architecture II

Describing Simulated System

• How detailed must our simulator be?• Model every transistor in the processor?

– Would take too long

• Abstract away details of processor organization?– Could miss important effects of processor features– Could achieve wrong conclusion

16(C) 2010 Daniel J. Sorin ECE 259 / CPS 221

• Need balance– Model in detail only where necessary– E.g., model memory system in detail, but abstract d isks

Page 17: ECE 259 / CPS 221 Advanced Computer Architecture II

Analytic Model of Shared Memory System

• Queuing model can capture behavior of system• Optional reading from ISCA 1998: "Analytic

Evaluation of Shared-Memory Parallel Systems with ILP Processors“

– Models processor cores as request generators– Models cache coherent memory system (directory prot ocol) as

queuing system where requests (customers) access

17(C) 2010 Daniel J. Sorin ECE 259 / CPS 221

queuing system where requests (customers) access – Outputs average utilizations, throughputs, waiting t imes, etc.

Page 18: ECE 259 / CPS 221 Advanced Computer Architecture II

Some Processor Simulators

uniproc

Not full-system Full-system

SimpleScalarSimics (Virtutech)

Simics+GEMS (Wisconsin)

Simics + Flexus (CMU)

18(C) 2010 Daniel J. Sorin ECE 259 / CPS 221

MP

Simics + Flexus (CMU)

M5 (Michigan)

PTLsim (Suny-Binghamton)

ASIM (Intel)

SimOS (Stanford)

Liberty (Princeton)

SESC (UCSC/Illinois)

RSIM (Rice)

Wisconsin Wind Tunnel

Page 19: ECE 259 / CPS 221 Advanced Computer Architecture II

Simics

• Simics is a full-system commercial simulator

• PRESENTATION

19(C) 2010 Daniel J. Sorin ECE 259 / CPS 221

Page 20: ECE 259 / CPS 221 Advanced Computer Architecture II

RAMP

• PRESENTATION

20(C) 2010 Daniel J. Sorin ECE 259 / CPS 221

Page 21: ECE 259 / CPS 221 Advanced Computer Architecture II

Outline

• Metrics

• Methodologies– Modeling– Simulation

21(C) 2010 Daniel J. Sorin ECE 259 / CPS 221

• Workloads

Page 22: ECE 259 / CPS 221 Advanced Computer Architecture II

Workloads

• We care how system does on important applications

• But who defines “important”? (I do!)

• Types of applications– Scientific (genomics, weather simulation, protein f olding)

22(C) 2010 Daniel J. Sorin ECE 259 / CPS 221

– Scientific (genomics, weather simulation, protein f olding)– Commercial (database, web serving, application serv ing)– Desktop (office productivity software, multimedia)– Portable (voice recognition)– ???

Page 23: ECE 259 / CPS 221 Advanced Computer Architecture II

DEC/Compaq/Intel (?) Workload Analysis

• Commercial workloads are different from scientific

• PRESENTATION

23(C) 2010 Daniel J. Sorin ECE 259 / CPS 221

Page 24: ECE 259 / CPS 221 Advanced Computer Architecture II

“Simulating $2M Server on $2K PC”

• Commercial workloads are different from scientific• Simulating them requires extra work

• PRESENTATION

24(C) 2010 Daniel J. Sorin ECE 259 / CPS 221