Institut für Technik der Informationsverarbeitung (ITIV)
KIT – University of the State of Baden-Wuerttemberg and National Research Center of the Helmholtz Association
www.kit.edu

Improving Parallel MPSoC Simulation Performance by Exploiting Dynamic Routing Delay Prediction

Christoph Roth, Harald Bucher, Simon Reder, Oliver Sander, Jürgen Becker



01.08.2013

Outline

  • Motivation
  • Baseline Modeling and Simulation Methodology
  • Dynamic Routing Delay Prediction
      • Overview
      • Router Shortest Path Graph
      • Local Quantum Calculation
      • Execution Semantics of the Lightweight Scheduler
  • Case Study
  • Conclusion and Outlook


Motivation I

Evaluation of Network-on-Chip (NoC) based Multi-Processor System-on-Chip (MPSoC) platforms regarding throughput, latency or jitter demands for …

  • rendering of effects like internal network congestion and buffer contention
  • detailed cycle-timed modeling of mechanisms like routing algorithm or flow control in order to measure their impact
  • consideration of the whole NoC and not only a single router
  • utilization of realistic application-generated traffic

Full system simulation is a commonly used approach for performing such evaluations


Motivation II

  • The gap between MPSoC complexity and design tool productivity keeps widening
  • Simulation-based evaluation and verification consumes much of the MPSoC design effort

Source: http://www.imd.uni-rostock.de/noc/index.php?id=36

Source: http://www.intel.com/pressroom/archive/releases/2009/20091202comp_sm.htm


Possible Solution I: Raising Abstraction Level

Abstraction of communication and computation in transaction level modeling (TLM)

Source: L. Cai and D. Gajski, “Transaction level modeling: an overview,” in Hardware/Software Codesign and System Synthesis, 2003. First IEEE/ACM/IFIP International Conference on, 2003, pp. 19-24.


Possible Solution II: Parallel Simulation

Parallel SystemC framework: Runtime system for simulating multi-cores on multi-cores

Support for different execution strategies like logical process (LP) based parallel discrete event simulation (PDES)

C. Roth et al., “Asynchronous parallel MPSoC simulation on the Single-Chip Cloud Computer,” in System on Chip (SoC), 2012 International Symposium on, 2012.

Baseline TLM Method for NoC-based MPSoCs

[Figure: modules connected by socket links; recoverable labels: Lightweight Scheduler, Lightweight Processes, lclock, ibuf, vbuf, nbuf, DataTrans, CtrlTrans]

  • Inherently parallelizable transaction level modeling method
  • Intra-module behavior (e.g. router):
      • Lightweight scheduler and processes implement synchronous execution semantics
      • Maintenance of a local time in order to be able to run ahead of the global time
  • Inter-module behavior (inspired by *):
      • Transaction-based communication
      • Adjacent modules regularly synchronize via control transactions after a predefined global time quantum
      • Exploitation of buffer synchronicity for temporal independency
      • Better performance but loss of timing accuracy if the time quantum is chosen larger than one clock cycle

* A. Mello et al., “Parallel simulation of SystemC TLM 2.0 compliant MPSoC on SMP workstations,” in Design, Automation & Test in Europe Conference & Exhibition (DATE), 2010.

Goal: Dynamically adapt temporal decoupling locally to avoid loss of accuracy
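The trade-off of the baseline scheme can be illustrated with a small Python sketch. It is a toy model and not the authors' framework: it assumes that a receiver only observes remote buffer contents at quantum boundaries, and the function names `control_transactions` and `observed_arrival` are hypothetical.

```python
import math


def control_transactions(sim_cycles, quantum):
    """Synchronizations needed on one module connection (toy model)."""
    return math.ceil(sim_cycles / quantum)


def observed_arrival(send_cycle, quantum, buffer_delay=1):
    """Cycle at which the receiver first sees a flit sent at send_cycle.

    Assumption: the flit is exactly visible after the one-cycle buffer
    delay, but the receiver only looks at quantum boundaries.
    """
    exact = send_cycle + buffer_delay
    return math.ceil(exact / quantum) * quantum


def timing_error(send_cycle, quantum, buffer_delay=1):
    """Deviation from the cycle-accurate arrival time, in clock cycles."""
    exact = send_cycle + buffer_delay
    return observed_arrival(send_cycle, quantum, buffer_delay) - exact
```

In this model a quantum of one clock cycle gives zero timing error at the cost of one control transaction per cycle, while a larger fixed quantum needs fewer control transactions but can delay the observation of a flit.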


Proposed Extension within this Work

[Figure: overview of the proposed extension; recoverable labels: Delay Annotations / Prediction Rules, Fixed Quantum]


Dynamic Routing Delay Prediction

[Figure: TL model of the Hermes router; recoverable label: Crossbar]

  • Example: TL model of the Hermes* router
  • ISO/OSI reference model analogy:
      • Buffers = Physical Layer (Layer 1)
      • Input Controllers = Data Link Layer (Layer 2)
      • Switch Controller = Network Layer (Layer 3)
  • Idea: Exploit higher layer delays (beyond layer 1) as a guarantee for temporal independency

* F. Moraes et al., “HERMES: an infrastructure for low area overhead packet-switching networks on chip,” Integr. VLSI J., vol. 38, 2004.


Router Shortest Path Graph (RSPG)

[Figure: RSPG of router $r_{i,j}$ with vertices $ibuf^{east}$, $ibuf^{west}$, $ibuf^{local}$, $vbuf^{east}$, $vbuf^{west}$, $vbuf^{local}$. Edges inside the router carry weights such as $d(ibuf^{east}, vbuf^{west})$ and $d(ibuf^{east}, vbuf^{local})$; further edges with delay weights connect the vbuf and ibuf vertices of $r_{i,j}$ with those of the neighboring routers and the local module.]

A $RSPG = (V, E, D)$ is a directed graph with vertices $v \in V$, edges $e \in E$ and edge weights $d \in D$. Two vertices $v_x$ and $v_y$ are connected by a directed edge $e_{x,y} = (v_x, v_y)$ if there exists a causal dependency in terms of a minimum delay bound after which $v_x$ may affect $v_y$. The weight of the edge $e_{x,y}$ is the minimum delay bound $d(v_x, v_y)$.
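The definition above maps directly onto a weighted directed graph. The following Python sketch stores such a graph as a dictionary and uses Dijkstra's algorithm to find the smallest accumulated delay bound between two buffers. The vertex names, the edge weights and the helper `min_delay` are illustrative assumptions, not values or code from the slides.

```python
import heapq


def min_delay(rspg, source, target):
    """Smallest accumulated delay bound from source to target, or None."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, v = heapq.heappop(heap)
        if v == target:
            return d
        if d > dist.get(v, float("inf")):
            continue  # stale heap entry
        for w, weight in rspg.get(v, {}).items():
            nd = d + weight
            if nd < dist.get(w, float("inf")):
                dist[w] = nd
                heapq.heappush(heap, (nd, w))
    return None  # no causal dependency between the two vertices


# Hypothetical RSPG fragment: vertex -> {successor: minimum delay bound}.
rspg = {
    "ibuf_east": {"vbuf_west": 5, "vbuf_local": 3},
    "ibuf_local": {"vbuf_west": 4},
    "vbuf_west": {"neighbor_west.ibuf_east": 1},
    "vbuf_local": {"local_module.ibuf": 1},
}

print(min_delay(rspg, "ibuf_east", "neighbor_west.ibuf_east"))
```

With these made-up weights, data entering at `ibuf_east` cannot affect the western neighbor before the sum of the weights on the path through `vbuf_west` has elapsed.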


Calculation of Local Time Quanta and Next Synchronization Times between Modules

Step I: Calculate the dynamic lookahead from the RSPG (delays are annotated to the behavioral model)

Step II: Calculate the local time quantum and the next synchronization time of the module connection
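The two steps can be sketched in Python under simple assumptions: the local time quantum is taken to be the dynamic lookahead (but at least one clock cycle, the constant buffer delay), and the next synchronization time is the local time plus that quantum. These rules and the function names are illustrative assumptions, not the formulas used by the authors.

```python
def local_quantum(lookahead, min_quantum=1):
    """Assumed rule: quantum equals the lookahead, never below one cycle."""
    return max(min_quantum, lookahead)


def next_sync_time(local_time, lookahead):
    """Assumed rule: the connection is synchronized one quantum later."""
    return local_time + local_quantum(lookahead)


def sync_schedule(lookaheads, start=0):
    """Synchronization times for a sequence of per-sync lookahead values."""
    times = []
    t = start
    for lookahead in lookaheads:
        t = next_sync_time(t, lookahead)
        times.append(t)
    return times
```

For lookahead values of 1, 6 and 3 cycles, this schedule synchronizes three times within ten cycles, whereas a fixed quantum of one cycle would synchronize ten times over the same span.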


Appropriate Lightweight Scheduler


Case Study: Hermes Multiprocessor System

  • Layer 1: Buffer delay is still a constant of one clock cycle
  • Layer 2 & 3: Annotation of state-dependent delays to the Input Control (left) and Switch Control (right) state machines
  • From these, the dynamic lookahead can be calculated within the sync() action of the lightweight scheduler by calling calc_lookahead()

[Figure: two state machines whose states are annotated with delay values; one has the states S_INIT, S_IDLE, S_ARBITRATE, S_ROUTE, S_CONNECT, S_END, the other the states S_IDLE, S_WAIT, S_HEAD, S_PL, S_END, S_END2]
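How state-dependent delay annotations could feed a lookahead calculation is sketched below in Python. Only the names sync() and calc_lookahead() and the state names come from the slides. The delay values, the assignment of state names to the two controllers, the combination rule (a conservative minimum over all controllers plus the buffer delay) and the class `LightweightScheduler` are assumptions made for illustration.

```python
BUFFER_DELAY = 1  # Layer 1: constant one clock cycle

# Hypothetical delay annotations in clock cycles (not the slide values).
INPUT_CTRL_DELAY = {
    "S_IDLE": 3, "S_WAIT": 2, "S_HEAD": 2,
    "S_PL": 1, "S_END": 1, "S_END2": 1,
}
SWITCH_CTRL_DELAY = {
    "S_INIT": 6, "S_IDLE": 5, "S_ARBITRATE": 4,
    "S_ROUTE": 3, "S_CONNECT": 2, "S_END": 1,
}


def calc_lookahead(input_states, switch_state):
    """Assumed rule: the earliest controller bounds the whole router."""
    earliest_input = min(INPUT_CTRL_DELAY[s] for s in input_states)
    earliest = min(earliest_input, SWITCH_CTRL_DELAY[switch_state])
    return BUFFER_DELAY + earliest


class LightweightScheduler:
    """Toy scheduler holding a local clock and the controller states."""

    def __init__(self):
        self.lclock = 0
        self.input_states = ["S_IDLE", "S_IDLE", "S_IDLE"]
        self.switch_state = "S_IDLE"

    def sync(self):
        """Return the assumed next synchronization time."""
        lookahead = calc_lookahead(self.input_states, self.switch_state)
        return self.lclock + lookahead
```

An idle router yields a larger lookahead in this sketch than one whose input controller is already forwarding payload, which is the effect the dynamic prediction relies on.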


Implementation of Dynamic Lookahead Calculation


Evaluation: Host Independent Characteristics


Evaluation: Host Dependent Characteristics

[Figure panels: Core i7 930; Single-chip Cloud Computer]

Conclusion & Outlook

Presentation of an extension for a modeling strategy for NoC-based MPSoCs using SystemC/TLM

Maintenance of cycle accuracy despite temporal decoupling is possible by dynamically predicting and adapting the degree of temporal decoupling

The number of control transactions is significantly reduced compared to the baseline strategy. This improves performance on different types of hosts.

Future work comprises:

  • Application of the method to other types of modules like network interface or processor
  • Deeper investigation of scalability and application dependency
  • Automation of model partitioning and host mapping


Thank you very much for your attention!
