A Mathematical Model for Balancing Co-Phase Effects in Simulated Multithreaded Systems

Joshua L. Kihm, Tipp Moseley, and Dan Connors

University of Colorado at Boulder



Exploiting Phase Behavior for Efficient Architecture Simulation

• Program behavior patterns, or phases, can be exploited for efficient simulation [SimPoint: Sherwood et al., PACT ’01]
– Capture repeating phases to eliminate simulation time or to direct detailed simulation

• Industry trends towards multithreaded processors
– In a multithreaded system, execution is characterized by a combination of phases between co-resident threads, called a co-phase [Van Biesbrouck et al., ISPASS ’04]
– Phase exploitation is more difficult for simulation and design of multithreaded systems, since the individual phases interact in unique ways

Terminology

[Figure: program execution shown as a timeline of 21 periods, each labeled with its phase (A, B, or C) and its interval number (1 to 4)]

PERIOD – A segment of program execution of a given length (one OS scheduling period in this work)

PHASE – A set of periods with similar behavior

INTERVAL – A set of consecutive periods with the same phase (one occurrence of a phase)
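
The three definitions can be made concrete with a small sketch. It assumes each period has already been assigned a phase label; the function name and the example sequence are hypothetical and are not taken from the slide's figure.

```python
from itertools import groupby


def intervals_from_phases(phases):
    """Collapse a per-period phase sequence into (phase, start_period, length) tuples.

    Each tuple is one interval: a run of consecutive periods with the same phase.
    """
    intervals = []
    start = 1  # periods are numbered from 1
    for phase, run in groupby(phases):
        length = len(list(run))
        intervals.append((phase, start, length))
        start += length
    return intervals


# Hypothetical phase labels for nine periods (illustrative only).
example = list("AAABBAACC")
result = intervals_from_phases(example)
print(result)
```

Phase A occurs in two separate intervals here, which is why a phase and an interval are distinct notions.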

Effects of Co-phases

• 181.mcf and 186.crafty

• Relative progress of threads is determined by individual thread behavior and inter-thread interference

• As Co-Phase changes, so does interference and performance

• Data from Pentium-4 (Northwood) system illustrates co-phase effects and transitions

• [Graph format from Van Biesbrouck et al.]

[Figure callouts on the graph: “What if we start here?” “Or here?” “Or here?”]

Problem Statement

• Variation in offset between threads causes variation in which co-phases are encountered and in their relative importance
– >15% standard deviation in IPC for some combinations

• Offset is caused by:
– Start times
– OS scheduling
– Simulation error

• The average performance must be determined in order to reflect real system performance where the relative position of threads will be randomized

Example Analysis of Pentium-4 Data

[Figure annotations: total ST runtime; HT performs below ST!; best performance at high offset]

Motivation (Methodology)

• Tested on implemented hardware
– Intel Pentium-4 Northwood with Hyper-Threading

• Used 5 SPEC CPU 2000 benchmarks
– 188.ammp, 179.art, 186.crafty, 252.eon, 181.mcf
– Long-running benchmarks

• Offsets of -100s, -90s, -80s, … +80s, +90s, +100s (21 tests per pairing)

Performance Variance Due to Offset

• Percent standard deviation

• Variation is high for many metrics

• Self-pairings have high variation
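
As a rough sketch of the metric, the 21 offsets from the methodology and a percent standard deviation could be computed as follows. The slides do not say whether the population or sample standard deviation was used; the population form is an assumption here, and the IPC values are placeholders rather than measured data.

```python
from statistics import mean, pstdev

# Offsets in seconds: -100, -90, ..., +90, +100 (21 tests per pairing).
offsets = list(range(-100, 101, 10))


def percent_std_dev(values):
    """Standard deviation expressed as a percentage of the mean.

    Assumption: population standard deviation (pstdev).
    """
    return 100.0 * pstdev(values) / mean(values)


# Placeholder IPC values for three runs of one pairing (NOT measured data).
placeholder_ipc = [1.0, 1.2, 0.8]
print(len(offsets), round(percent_std_dev(placeholder_ipc), 2))
```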

Co-Phase Variance

• Difference in portion of time spent in each co-phase

• Co-phase mix changes with offset
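
A simplified sketch of why the mix changes with offset: it assumes both threads advance exactly one period per time step regardless of interference, which the real system does not do, so it only illustrates the alignment effect. The function name and phase sequences are hypothetical.

```python
from collections import Counter


def cophase_mix(phases_x, phases_y, offset):
    """Fraction of overlapping periods spent in each co-phase.

    `offset` is the number of periods by which thread Y starts after
    thread X (negative means Y starts first).
    Simplifying assumption: one period per step for both threads.
    """
    counts = Counter()
    for t, phase_x in enumerate(phases_x):
        k = t - offset  # index of the period thread Y is executing at step t
        if 0 <= k < len(phases_y):
            counts[(phase_x, phases_y[k])] += 1
    total = sum(counts.values())
    return {cophase: n / total for cophase, n in counts.items()} if total else {}


# Hypothetical per-period phase labels for two threads.
x = list("AABB")
y = list("CCDD")
print(cophase_mix(x, y, 0))  # only (A, C) and (B, D) occur
print(cophase_mix(x, y, 1))  # (B, C) now appears as well
```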

Conceptual Model

• The time spent in each co-phase interval will determine overall performance

• The amount of time in the co-phase interval depends on each thread’s:
– Performance in the co-phase
– Length of the interval
– Number of operations already completed in the current interval

Determining the Time in a Co-Phase Interval

• Interval length and co-phase performance are constant, but need to be determined ahead of time
* Assumption of phase-based simulation

• The number of operations already completed is a function of previous performance and the co-phase profile

Determining the Time in a Co-Phase Interval

[Figure: time in co-phase interval (i, j) plotted against offset]

• Cases for thread X: interval i runs in its entirety; part of the interval occurs (monotonic); the interval is not encountered
• Similar case for thread Y
• Overall case is the minimum
• Thread Y changes phase first: next interval is (i, j+1)
• Thread X changes phase first: next interval is (i+1, j)
• Area under the curve is proportional to the average length of the interval
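
The transition rule in these annotations can be sketched as a walk over co-phase intervals. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name, the data layout, and the rates are hypothetical, and performance is taken to be a constant rate in operations per unit time within each co-phase.

```python
def walk_cophase_intervals(ops_x, ops_y, rates, done_x=0.0, done_y=0.0):
    """Step through co-phase intervals (i, j) until one thread runs out.

    ops_x, ops_y: lists of (phase, operations_in_interval) per thread.
    rates: maps (phase_x, phase_y) -> (rate_x, rate_y), ops per unit time.
    done_x, done_y: operations already completed in the first intervals.
    Returns a list of ((i, j), time_spent_in_that_cophase_interval).
    """
    i = j = 0
    trace = []
    while i < len(ops_x) and j < len(ops_y):
        phase_x, n_x = ops_x[i]
        phase_y, n_y = ops_y[j]
        rate_x, rate_y = rates[(phase_x, phase_y)]
        t_x = (n_x - done_x) / rate_x  # time until X changes phase
        t_y = (n_y - done_y) / rate_y  # time until Y changes phase
        t = min(t_x, t_y)              # the overall case is the minimum
        trace.append(((i, j), t))
        if t_x <= t_y:                 # X changes phase: next is (i+1, j)
            i, done_x = i + 1, 0.0
        else:
            done_x += t * rate_x
        if t_y <= t_x:                 # Y changes phase: next is (i, j+1)
            j, done_y = j + 1, 0.0
        else:
            done_y += t * rate_y
    return trace


# Hypothetical interval sizes and co-phase rates (illustrative only).
ops_x = [("A", 100), ("B", 100)]
ops_y = [("C", 150)]
rates = {("A", "C"): (1.0, 1.0), ("B", "C"): (2.0, 0.5)}
trace = walk_cophase_intervals(ops_x, ops_y, rates)
print(trace)
```

Changing `done_x` or `done_y` plays the role of the offset: it shifts where each thread sits in its interval when the co-phase begins, which changes both the time spent in each co-phase interval and which interval follows.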

Mathematical Model

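[Equation: time in a co-phase interval, with one case for each thread finishing first]

• Cases: thread X finishes first; thread Y finishes first
• Terms: performance in co-phase; number of operations in interval; number of operations yet to complete in interval; number of operations already completed in interval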
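
One way to write the relationship these labels describe is given here as a hedged sketch. The symbols are introduced for illustration and are assumptions, not the slide's notation: $T_{i,j}$ is the time in co-phase interval $(i, j)$, $P^{X}_{i,j}$ and $P^{Y}_{i,j}$ are each thread's performance in that co-phase, $N^{X}_{i}$ and $N^{Y}_{j}$ are the numbers of operations in the intervals, and $C^{X}_{i}$ and $C^{Y}_{j}$ are the operations already completed.

```latex
T_{i,j} = \min\left(
  \underbrace{\frac{N^{X}_{i} - C^{X}_{i}}{P^{X}_{i,j}}}_{\text{thread X finishes first}},\;
  \underbrace{\frac{N^{Y}_{j} - C^{Y}_{j}}{P^{Y}_{i,j}}}_{\text{thread Y finishes first}}
\right)
```

Each numerator is the number of operations yet to complete in the interval, so each fraction is the time that thread would need to reach its next phase change at its co-phase rate.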

Start-up Intervals

• Interval lengths depend on previous intervals (the total number of retired operations), all the way back to the start of the thread’s execution
– Some model is needed to estimate the difference in the number of operations between threads

• Model based on single-threaded behavior
* Assume that single-phase behavior is indicative of average co-phase behavior

Deriving Start-up Intervals

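[Figure: time in start-up interval (i, 0) plotted against offset, shown with neighboring start-up intervals (i-1, 0) and (i+1, 0)]

• Interval is never entered, or interval doesn’t occur
• Interval length equals offset (slope M = 1)
• Length of phase interval i
• Following intervals: co-phase (i, 1); next start-up interval (i+1, 0)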

Mathematical Model(Start-up Interval)

[Equation: time in start-up interval i as a function of offset]

• Terms: length of intervals up to “i”; length of intervals up to and including “i”
• Cases: interval not encountered; partial completion; full completion
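
The three cases can be sketched as a clamp on the offset. This assumes interval lengths are expressed in time under single-threaded execution, in line with the start-up model's stated basis; the function name and the lengths are hypothetical.

```python
def startup_interval_time(lengths, i, offset):
    """Time spent in start-up interval i (0-based) before the other thread starts.

    lengths: single-threaded duration of each interval of the leading thread.
    offset: how long the leading thread runs alone (same time unit).
    """
    up_to = sum(lengths[:i])              # length of intervals up to i
    up_to_and_including = up_to + lengths[i]
    if offset <= up_to:
        return 0.0                        # interval not encountered
    if offset >= up_to_and_including:
        return float(lengths[i])          # full completion
    return float(offset - up_to)          # partial completion (slope of 1)


# Hypothetical interval lengths in seconds (illustrative only).
lengths = [10, 20, 30]
for off in (5, 15, 40):
    print(off, startup_interval_time(lengths, 1, off))
```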

Example Analysis of Pentium-4 Data

• First two intervals of 186.crafty and 179.art

Example Analysis of Pentium-4 Data

[Figure: run time of crafty and run time of art, with total ST runtime shown for comparison]

• Art phase 2 causes heavy interference
• HT performs below ST!
• Best performance at high offset

Extensions to More Threads

• One thread is the “reference”
– Arbitrarily chosen
– Number of variables grows linearly

• Concepts and equations easily extend to more threads

Conclusions

• Offset causes variations in co-phase mix and therefore performance
– Average 6.7% standard deviation in IPC

• A complete picture of performance can only be gained through looking at more than “0” offset

• Relative importance of co-phases can be determined mathematically