© Michel Dubois, Murali Annavaram, Per Stenström All rights reserved

CHAPTER 9 Simulation Methods

• SIMULATION METHODS

• PARALLEL SIMULATIONS

How to study a computer system

Methodologies
• Construct a hardware prototype
• Mathematical modeling
• Simulation

Construct a hardware prototype

Advantages
• Runs fast

Disadvantages
• Takes a long time to build
  - The RPM (Rapid Prototyping engine for Multiprocessors) project at USC took a few graduate students several years
• Expensive
• Not flexible

Mathematically model the system

Use analytical modeling
• Probabilistic
• Queuing
• Markov
• Petri net

Advantages
• Very flexible
• Very quick to develop
• Runs quickly

Disadvantages
• Cannot capture the effects of system details
• Computer architects are skeptical of models
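To make the "quick to develop, runs quickly" point concrete, here is a minimal sketch of a queuing model. It assumes an M/M/1 queue (Poisson arrivals, exponential service, one server), which is one standard textbook choice and not necessarily the model the authors have in mind. The function names and the example rates are hypothetical.

```python
def mm1_utilization(arrival_rate, service_rate):
    """Fraction of time the single server is busy (rho = lambda / mu)."""
    return arrival_rate / service_rate


def mm1_mean_response_time(arrival_rate, service_rate):
    """Mean time a request spends queued plus in service: 1 / (mu - lambda)."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival rate must be below service rate")
    return 1.0 / (service_rate - arrival_rate)


# Hypothetical memory controller: serves one request every 4 cycles on average
# (mu = 0.25 requests/cycle) and receives 0.125 requests/cycle.
rho = mm1_utilization(0.125, 0.25)
latency = mm1_mean_response_time(0.125, 0.25)
print(f"utilization = {rho:.2f}, mean response time = {latency:.1f} cycles")
```

The whole model is two formulas and evaluates instantly, but it knows nothing about banks, row buffers, or request ordering, which is exactly the kind of system detail listed under the disadvantages.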

Simulation

Write a program that mimics system behavior

Advantages
• Very flexible
• Relatively quick to develop

Disadvantages
• Runs slowly (e.g., 30,000 times slower than hardware)

Most popular research method

Simulation is chosen by MOST research projects. Why?
• Mathematical models are NOT accurate
• Building a prototype is too time-consuming and too expensive for academic researchers

Simulation Bottleneck

• 1 GHz = 1 billion cycles per second
• Simulating one second of execution on a future machine = simulating 1B cycles!
• Simulating 1 target cycle = 30,000 cycles on the host
• 1 second of target execution = 30,000 seconds on the host (assuming the host also runs at 1 GHz) = 8.3 hours
• CPU2K (SPEC CPU2000) benchmarks run for a few hours natively
• Speed is much worse when simulating CMP targets!
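The arithmetic on this slide can be checked with a short sketch. The 30,000x slowdown and the 1 GHz target clock come from the slide; the 1 GHz host clock is an assumption that the slide leaves implicit, and the function name is hypothetical.

```python
def host_seconds(target_seconds, target_hz=1e9, slowdown=30_000, host_hz=1e9):
    """Host time needed to simulate `target_seconds` of target execution.

    slowdown: host cycles spent per simulated target cycle (from the slide).
    host_hz:  host clock rate (assumed equal to the target's 1 GHz).
    """
    target_cycles = target_seconds * target_hz
    host_cycles = target_cycles * slowdown
    return host_cycles / host_hz


seconds = host_seconds(1.0)
print(f"{seconds:,.0f} s on the host = {seconds / 3600:.1f} hours")
# prints "30,000 s on the host = 8.3 hours"
```

By the same calculation, a benchmark that runs natively for one hour would need 30,000 host hours, or about 3.4 years, which is why whole-benchmark cycle-accurate simulation is rarely practical.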

How to overcome simulation bottleneck

Levels of simulation, from most detailed (slowest) to least detailed (fastest):
• Gate level (RTL)
• Cycle accurate
• Functional level (ISA)
• Model based approximation

Detail vs. simulation speed: trade accuracy for simulation speed

This trade-off has resulted in a plethora of simulators

Tool classification: OS code execution

System-level (complete system)
- Does simulate behavior of an entire computer system, including OS and user code
- Examples:
  – Simics
  – SimOS

User-level
- Does NOT simulate OS code
- Does emulate system calls
- Examples:
  – SimpleScalar

Tool classification: Simulation detail

Instruction set
- Does simulate the function of instructions
- Does NOT model detailed micro-architectural timing
- Examples:
  – Simics

Micro-architecture
- Does clock cycle level simulation
- Does speculative, out-of-order multiprocessor timing simulation
- May NOT implement functionality of full instruction set or any devices
- Examples:
  – SimpleScalar

RTL
- Does logic gate-level simulation
- Examples:
  – Synopsys
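As an illustration of the instruction-set level, the sketch below simulates only what instructions do, with no notion of cycles, pipelines, or caches. The three-instruction ISA (`li`, `add`, `bnez`) and the tuple encoding are invented for this example and do not correspond to any of the tools named above.

```python
def run_functional(program, num_regs=4, max_steps=10_000):
    """Functional (ISA-level) simulation of a toy, hypothetical ISA.

    Returns the final register file and the number of instructions executed.
    No timing is modeled: every instruction simply updates architectural state.
    """
    regs = [0] * num_regs
    pc = 0
    executed = 0
    while pc < len(program):
        if executed >= max_steps:
            raise RuntimeError("step limit reached (possible infinite loop)")
        op, *args = program[pc]
        if op == "li":            # li rd, imm: load immediate
            rd, imm = args
            regs[rd] = imm
            pc += 1
        elif op == "add":         # add rd, rs1, rs2
            rd, rs1, rs2 = args
            regs[rd] = regs[rs1] + regs[rs2]
            pc += 1
        elif op == "bnez":        # bnez rs, target: branch if rs != 0
            rs, target = args
            pc = target if regs[rs] != 0 else pc + 1
        else:
            raise ValueError(f"unknown opcode: {op}")
        executed += 1
    return regs, executed


# Sum 5 + 4 + 3 + 2 + 1 into r1 using r0 as a down-counter.
program = [
    ("li", 0, 5),
    ("li", 1, 0),
    ("li", 2, -1),
    ("add", 1, 1, 0),   # r1 += r0
    ("add", 0, 0, 2),   # r0 -= 1
    ("bnez", 0, 3),     # loop while r0 != 0
]
regs, executed = run_functional(program)
print(regs[1], executed)  # prints "15 18"
```

A micro-architecture simulator would have to add a timing model on top of this (fetch, issue, execute, and commit stages, branch prediction, memory latency), which is where most of the extra simulation cost comes from.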

Tool classification: Simulation input

Trace-driven
- Simulator reads a “trace” of instructions captured during a previous execution by software/hardware
- Easy to implement, no functional component needed
- Large trace size; no branch prediction

Execution-driven
- Simulator “runs” the program, generating a trace on-the-fly
- More difficult to implement, but has many advantages
- Interpreter, direct-execution
- Examples:
  – Simics, SimpleScalar…
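A minimal sketch of the trace-driven style: the simulator below never executes a program, it only replays a list of memory addresses recorded earlier. The trace format (a plain list of byte addresses) and the direct-mapped cache parameters are assumptions made for illustration.

```python
def simulate_cache(trace, num_lines=4, line_bytes=16):
    """Replay an address trace through a direct-mapped cache model.

    Returns (hits, misses). There is no functional component: the trace
    already encodes what the program did.
    """
    tags = [None] * num_lines          # one stored tag per cache line
    hits = misses = 0
    for addr in trace:
        block = addr // line_bytes     # memory block number
        index = block % num_lines      # cache line the block maps to
        tag = block // num_lines       # identifies which block occupies the line
        if tags[index] == tag:
            hits += 1
        else:
            misses += 1
            tags[index] = tag          # fill the line on a miss
    return hits, misses


# Hypothetical trace: 0x00 and 0x40 map to the same line and evict each other.
trace = [0x00, 0x04, 0x10, 0x40, 0x00, 0x44]
print(simulate_cache(trace))  # prints "(1, 5)"
```

An execution-driven simulator would instead produce these addresses itself while interpreting the program, so the address stream could react to the simulated timing rather than being fixed in advance.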

Interval Simulation

Multi-Core Simulation

Sequential simulation
– All target cores are simulated in one thread (on one host core)
– Unified memory hierarchy models simulate resource contention

Parallel simulation
– Each target core is simulated in a separate thread

There is no relation between the number of target cores and the number of cores on the host (except for simulation speed)!
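The two organizations can be sketched as follows. `TargetCore`, the quantum-based synchronization, and the toy per-cycle workload are all hypothetical, and the shared memory hierarchy model is left out to keep the example short. The parallel version uses a thread pool instead of one dedicated thread per target core, to show that the host thread count can be chosen independently of the target core count.

```python
from concurrent.futures import ThreadPoolExecutor


class TargetCore:
    """Toy model of one simulated (target) core."""

    def __init__(self, core_id):
        self.core_id = core_id
        self.cycles = 0
        self.work_done = 0

    def run_quantum(self, quantum):
        # Advance this core by `quantum` target cycles.
        for _ in range(quantum):
            self.cycles += 1
            self.work_done += self.core_id + 1   # stand-in for real work
        return self.cycles


def simulate_sequential(num_target_cores, quanta, quantum=100):
    """All target cores are stepped round-robin in a single host thread."""
    cores = [TargetCore(i) for i in range(num_target_cores)]
    for _ in range(quanta):
        for core in cores:
            core.run_quantum(quantum)
    return [c.work_done for c in cores]


def simulate_parallel(num_target_cores, quanta, quantum=100, host_threads=2):
    """Target cores are spread over `host_threads` host threads."""
    cores = [TargetCore(i) for i in range(num_target_cores)]
    with ThreadPoolExecutor(max_workers=host_threads) as pool:
        for _ in range(quanta):
            # list() waits for every core, acting as a barrier per quantum.
            list(pool.map(lambda c: c.run_quantum(quantum), cores))
    return [c.work_done for c in cores]


# 8 target cores simulated on 1 host thread and on 2 host threads.
print(simulate_sequential(8, 3) == simulate_parallel(8, 3, host_threads=2))
```

Both runs produce the same simulated result for 8 target cores; only the wall-clock time can differ. In CPython the global interpreter lock prevents a real speedup for this CPU-bound loop, so the sketch shows the structure of a parallel simulator rather than its performance.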