Transcript
Page 1: CAP: Criticality Analysis for ... iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_iccd07.pdf

CAP: Criticality Analysis for Power-Efficient Speculative Multithreading

James Tuck, Wei Liu, Josep Torrellas
University of Illinois at Urbana-Champaign

International Conference on Computer Design (ICCD), October 2007.

Page 2

Motivation

Speculative Multithreading (SM) is important for CMPs
− It can speed up hard-to-parallelize programs
Power inefficiency of SM is a serious concern

Page 3

Proposal

We can improve power efficiency using criticality analysis
− Some threads matter more for performance than others
Dynamically construct a graph of SM execution and calculate criticality
Schedule tasks on a CMP using criticality
− DVFS per-core for power-efficiency
− Schedule critical tasks to higher V-f cores, non-critical to lower V-f

Page 4

Contributions

Novel, widely-applicable task critical model for Speculative Multithreading
CAP architecture for a CMP that implements our proposed model
Evaluation of SPECint2000
− We reduce average power by Geo.Mean of 22%
− Average slowdown of 2.6%
− ED^2 reduced on average 15%
Characterize task criticality composition of different applications
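
As a rough illustration of how an ED^2 figure relates to power and slowdown figures like those above, the sketch below combines them under the assumption that energy is average power times execution time. The function name `ed2_ratio` is hypothetical, and feeding averaged figures into it is only a back-of-envelope check, not how the reported numbers were computed.

```python
def ed2_ratio(power_ratio: float, delay_ratio: float) -> float:
    """Return new ED^2 divided by baseline ED^2.

    Assumes E = P * D, so ED^2 = P * D^3. Both arguments are ratios
    relative to the baseline (1.0 means unchanged).
    """
    energy_ratio = power_ratio * delay_ratio
    return energy_ratio * delay_ratio ** 2


# 22% lower power and 2.6% longer execution time, taken from the slide.
ratio = ed2_ratio(power_ratio=0.78, delay_ratio=1.026)
print(f"ED^2 relative to baseline: {ratio:.3f}")
```

With these inputs the ratio comes out a little above 0.84, in the same range as the 15% average reduction on the slide. The two are not expected to match exactly, because means taken over per-application values do not compose this way.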

Page 5

Task-Level Criticality Model

Model execution at the level of task events
− Events of interest: spawn, commit, squash
− Keeps overhead low compared to instruction-level schemes
Seamlessly handle a variety of SM systems
− In-order vs. out-of-order spawns
− Scheduling mechanism
    Round robin
    First available core
Efficient hardware implementation

Page 6

Lifetime of SM Task

Page 7

Criticality Graph Summary

Nodes
− Stages of the task's execution
    Start, Execute, Finish/Spawn, Commit
Edges
− Transitioning between states in a single task
− Events between tasks
    Spawn, squash, commit, freeing a resource, wait to become safe
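
The node and edge categories above could be represented roughly as follows. This is a minimal sketch: the class names, enum members, and the `latency` field are illustrative assumptions, not the hardware encoding used by CAP, and the small example graph at the end is made up.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Stage(Enum):
    """Stages of a task's execution (the node kinds listed on the slide)."""
    START = auto()
    EXECUTE = auto()
    FINISH_SPAWN = auto()
    COMMIT = auto()


class EdgeKind(Enum):
    """One intra-task kind plus the inter-task events listed on the slide."""
    INTRA_TASK = auto()      # transition between stages of a single task
    SPAWN = auto()
    SQUASH = auto()
    COMMIT = auto()
    FREE_RESOURCE = auto()
    WAIT_SAFE = auto()       # wait to become safe


@dataclass(frozen=True)
class Node:
    task_id: int
    stage: Stage


@dataclass(frozen=True)
class Edge:
    src: Node
    dst: Node
    kind: EdgeKind
    latency: int = 0  # assumed edge weight, e.g. cycles between the two events

    def is_intra_task(self) -> bool:
        """True when both endpoints belong to the same task."""
        return self.src.task_id == self.dst.task_id


# Made-up fragment: task 1 moves through two stages, then spawns task 2.
t1_start = Node(1, Stage.START)
t1_exec = Node(1, Stage.EXECUTE)
t1_spawn = Node(1, Stage.FINISH_SPAWN)
t2_start = Node(2, Stage.START)
edges = [
    Edge(t1_start, t1_exec, EdgeKind.INTRA_TASK, latency=5),
    Edge(t1_exec, t1_spawn, EdgeKind.INTRA_TASK, latency=40),
    Edge(t1_spawn, t2_start, EdgeKind.SPAWN, latency=20),
]
```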

Page 8

CAP Architecture

Build criticality graph in hardware using our model
Dynamically analyze critical path of graph
Make predictions and schedule tasks

Page 9

CAP in a Multiprocessor System

Task Controller
− Tracks running tasks and their context
Novel components of CAP
− Critical path builder
    Builds path and analyzes graph
− Critical path predictor

Page 10

CAP Overview

To collect the graph
− TC sends summary of task execution to builder after task commits
− Summary contains summary of important edges
    Who spawned it, who squashed it, etc.
− The CPB creates a node in the graph and adds the edges
Analyzing the graph
− Store nodes such that critical path calculation is easy
− Walk graph in reverse to find critical path
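
One way to store nodes so that the reverse walk is easy is to record, at insertion time, each node's arrival time and its last-arriving predecessor. The sketch below does that in software. It assumes nodes are inserted after their predecessors (plausible if summaries arrive in commit order, but not stated on the slide), and every name in it is hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class GraphNode:
    name: str
    preds: List[Tuple[str, int]]          # (predecessor name, edge latency)
    time: int = 0                         # arrival time of the latest input
    critical_pred: Optional[str] = None   # predecessor that arrived last


def add_node(graph: Dict[str, GraphNode], name: str,
             preds: List[Tuple[str, int]]) -> None:
    """Insert a node whose predecessors are already in the graph."""
    node = GraphNode(name, preds)
    for pred_name, latency in preds:
        arrival = graph[pred_name].time + latency
        # Keep the predecessor with the latest arrival (first one wins ties).
        if node.critical_pred is None or arrival > node.time:
            node.time = arrival
            node.critical_pred = pred_name
    graph[name] = node


def critical_path(graph: Dict[str, GraphNode], last: str) -> List[str]:
    """Walk backwards from `last` along last-arriving edges."""
    path: List[str] = []
    current: Optional[str] = last
    while current is not None:
        path.append(current)
        current = graph[current].critical_pred
    path.reverse()
    return path


# Made-up example: task A spawns task B early; B runs longer than A.
graph: Dict[str, GraphNode] = {}
add_node(graph, "A.start", [])
add_node(graph, "A.commit", [("A.start", 10)])
add_node(graph, "B.start", [("A.start", 2)])                      # spawn edge
add_node(graph, "B.commit", [("B.start", 30), ("A.commit", 1)])   # commit order
print(critical_path(graph, "B.commit"))
```

In this example the path runs through B's execution rather than A's commit, because B's own work (arriving at time 32) finishes later than the commit hand-off from A (time 11).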

Page 11

Critical Path Predictor

Train using calculated critical path
Record edge-centric information
− Spawn edges
− Squash edges
Use strongly biased edges to control scheduling decisions. For example:
− When Task(A) spawns Task(B), B is likely critical
− When Task(A) squashes Task(C), C becomes critical
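
The idea of training on edges and acting only on strongly biased ones can be sketched with saturating counters. The slides do not describe the predictor's actual structure, so the counter width, the table key, and the class name below are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Hashable, Tuple

# Hypothetical key: edge kind plus some identifier of where the edge originates.
EdgeKey = Tuple[str, Hashable]


class EdgePredictor:
    """Per-edge saturating counters trained from observed critical paths."""

    def __init__(self, max_count: int = 3) -> None:
        self.max_count = max_count
        # New entries start at the midpoint (rounded down): not strongly biased.
        self.counters: Dict[EdgeKey, int] = defaultdict(lambda: max_count // 2)

    def train(self, key: EdgeKey, on_critical_path: bool) -> None:
        """Move the counter up if the edge was on the critical path, else down."""
        delta = 1 if on_critical_path else -1
        updated = self.counters[key] + delta
        self.counters[key] = min(self.max_count, max(0, updated))

    def predict_critical(self, key: EdgeKey) -> bool:
        """Predict critical only when the counter is saturated high."""
        return self.counters[key] == self.max_count


predictor = EdgePredictor()
spawn_edge = ("spawn", 0x40)  # made-up identifier for a spawn site
predictor.train(spawn_edge, on_critical_path=True)
predictor.train(spawn_edge, on_critical_path=True)
print(predictor.predict_critical(spawn_edge))
```

Requiring saturation before predicting "critical" is one simple reading of "strongly biased": an edge that only sometimes lies on the critical path never triggers a promotion.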

Page 12

Scheduling on a CMP

Assume DVFS per core on a CMP
CMP is statically configured among Voltage-frequency (V-f) pairs
Promote critical tasks to high V-f cores
Demote non-critical tasks to low V-f cores
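
The promote/demote policy can be sketched as a core-selection function. This assumes two V-f levels and a fallback to any free core when no matching core is available; the fallback, the four-core example, and all names are assumptions rather than details from the slides.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Core:
    core_id: int
    high_vf: bool       # True if statically configured at the high V-f pair
    busy: bool = False


def pick_core(cores: List[Core], task_is_critical: bool) -> Optional[Core]:
    """Prefer a free core whose V-f level matches the predicted criticality.

    Falls back to any free core (an assumed policy) and returns None
    if every core is busy. The chosen core is marked busy.
    """
    free = [c for c in cores if not c.busy]
    if not free:
        return None
    matching = [c for c in free if c.high_vf == task_is_critical]
    chosen = (matching or free)[0]
    chosen.busy = True
    return chosen


# Made-up configuration: two high V-f cores and two low V-f cores.
cores = [Core(0, True), Core(1, True), Core(2, False), Core(3, False)]
first = pick_core(cores, task_is_critical=False)   # demoted to a low V-f core
second = pick_core(cores, task_is_critical=True)   # promoted to a high V-f core
print(first.core_id, second.core_id)
```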

Page 13

Evaluation Setup

SPECint2000 applications
− Optimized for SM using POSH compiler [PPoPP'06]
Two V-f settings
3 Static CMP configurations
− 3-Crit, 2-Crit, 1-Crit

[Figure: the three static CMP configurations, labeled 3-Crit, 2-Crit, and 1-Crit]

Page 14

Normalized Execution Time

Moving to fewer fast cores has small impact on performance
− Only 2.2% for 2-Crit!

Page 15

Normalized E-D-Squared

Best ED^2 is obtained for 2-Crit
− Average reduction of 16.2%
− Max reduction of 57.5%

Page 16

Conclusions

SM can be power efficient
Efficiently modeled task-level criticality in hardware
Criticality analysis successfully schedules tasks for power efficiency
− Average performance loss of only 2.2%
− ED^2 reduction of 16.6% on average