30
1 R. Ernst, TU Braunschweig 1 A compositional approach to embedded system performance analysis R. Ernst TU Braunschweig R. Ernst, TU Braunschweig 5 Overview heterogeneous embedded architectures formal performance analysis – holistic and flow based a compositional approach to performance analysis example context aware analysis conclusion

A compositional approach to embedded system performance ...€¦ · 4 R. Ernst, TU Braunschweig 10 Embedded architecture - challenges • design specialization – increased performance

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: A compositional approach to embedded system performance ...€¦ · 4 R. Ernst, TU Braunschweig 10 Embedded architecture - challenges • design specialization – increased performance

1

R. Ernst, TU Braunschweig 1

A compositional approach to embedded system performance analysis

R. Ernst

TU Braunschweig

R. Ernst, TU Braunschweig 5

Overview

• heterogeneous embedded architectures

• formal performance analysis – holistic and flow based

• a compositional approach to performance analysis

• example

• context aware analysis

• conclusion

Page 2: A compositional approach to embedded system performance ...€¦ · 4 R. Ernst, TU Braunschweig 10 Embedded architecture - challenges • design specialization – increased performance

2

R. Ernst, TU Braunschweig 6

Embedded systems - trends

• Trend 1: higher system integration– integration of complete programmable subsystems on a single

IC – Multiprocessor-Systems-on-Chip (MpSoC)

R. Ernst, TU Braunschweig 7

Nexperia example: Viper Setop BoxExternal SDRAM

Interrupt controller

Enhanced JTAG

Universal async.receiver/transmitter

(UART)

Universal serial bus

IC debug

Clocks

CPU debug

ISO UART

Reset

MIPS(PR3940)

CPU

TriMediaC-Bridge

Memory controller

Fast C-Bridge

TriMedia(TM32)CPU

MIPS bridge

IEEE 1394 link layer controller

High-performance2D-rendering engine

MIPS C-Bridge

I²C

Exp. bus interfaceunit PCI/XIO

CRCDMA

Adv. imagecompositionProcessor

MPEG-2video decoder

Video inputprocessor

Memory-basedscaler

MPEGsystem proc.

C-bridge

Interrupt ctrl.

Audio I/O

Sony PhilipsDigital I/O

Transport streamDMA

General-purposeI/O

Synchronousserial interface

Fast

PIbu

s

MIP

SPI

bus

Mem

.M

gmt.

IFbu

s

TriM

edi a

PI b

u s

D$

I$

D$

I$

Page 3: A compositional approach to embedded system performance ...€¦ · 4 R. Ernst, TU Braunschweig 10 Embedded architecture - challenges • design specialization – increased performance

3

R. Ernst, TU Braunschweig 8

Embedded systems - trends - 2

• Trend 2: networked systemsubiquitous computing, telecom, automotive, avionics, space, ...

subsystem integration

subsystem integration

Image source: Siemens

service integrationservice

integration

R. Ernst, TU Braunschweig 9

Embedded architectures - trends

• system function integration – reactive and transformative parts– function IP, legacy code, new functions

• component and subsystem reuse (IP)– increased design productivity and reduced development cost

• programmable platforms – improved design productivity – increased volume– examples: network processors, multi-media platforms,

automotive platforms, game platforms

Page 4: A compositional approach to embedded system performance ...€¦ · 4 R. Ernst, TU Braunschweig 10 Embedded architecture - challenges • design specialization – increased performance

4

R. Ernst, TU Braunschweig 10

Embedded architecture - challenges

• design specialization – increased performance– reduced power consumption– lower cost and size

• design flexibility– late changes, platforms, reuse

• HW and SW IP integration– result of reuse

⇒ embedded HW architectures are heterogeneous

R. Ernst, TU Braunschweig 11

ES architectures are heterogeneous

• different processing element types– processors, weakly programmable coprocessors, IP

components

• different interconnection networks and communication protocols

• different memory types

• different scheduling and synchronization strategies

M

CoP

M

M

PDSP

M

P

Page 5: A compositional approach to embedded system performance ...€¦ · 4 R. Ernst, TU Braunschweig 10 Embedded architecture - challenges • design specialization – increased performance

5

R. Ernst, TU Braunschweig 12

Heterogeneous architecture: Viper Setop Box

External SDRAM

Interrupt controller

Enhanced JTAG

Universal async.receiver/transmitter

(UART)

Universal serial bus

IC debug

Clocks

CPU debug

ISO UART

Reset

MIPS(PR3940)

CPU

TriMediaC-Bridge

Memory controller

Fast C-Bridge

TriMedia(TM32)CPU

MIPS bridge

IEEE 1394 link layer controller

High-performance2D-rendering engine

MIPS C-Bridge

I²C

Exp. bus interfaceunit PCI/XIO

CRCDMA

Adv. imagecompositionProcessor

MPEG-2video decoder

Video inputprocessor

Memory-basedscaler

MPEGsystem proc.

C-bridge

Interrupt ctrl.

Audio I/O

Sony PhilipsDigital I/O

Transport streamDMA

General-purposeI/O

Synchronousserial interface

Fast

PIbu

s

MIP

SPI

bus

Mem

.M

gmt.

IFbu

s

TriM

edi a

PI b

u s

D$

I$

D$

I$

R. Ernst, TU Braunschweig 13

Managing HW architecture complexity

• development of application programmer interfaces (API) to hide complexity from application programmer and improve portability

• specialized RTOS to control resource sharing and interfaces

⇒ complex multi-level HW/SW architecture

Page 6: A compositional approach to embedded system performance ...€¦ · 4 R. Ernst, TU Braunschweig 10 Embedded architecture - challenges • design specialization – increased performance

6

R. Ernst, TU Braunschweig 14

Software architecture example

Bus

core

RTOS

I/O Int Bus-CTRL

timertimer

drivers

RTOS-APIs

application

periphery

cache

memprivate

private

private

private

shar

ed

hardware

software

architecture

application

• layered software architecture with HW dependent SW and API

⇒ embedded SW is heterogeneous

ce1

pe1

API

R. Ernst, TU Braunschweig 15

Communication Centric Design• communication network as a backbone for systems integration

• state of the art: – off chip: busses w. different

protocols and performance levels– on chip:

Proprietary or standard buses with bridgesAMBA, Sonics, ...

– future: multi-stage networks on-chip as well as off-chip (PCI Express)

CoPro

VLIW MEMIPIP IPIPMEM

RISC MEM DSP

communication networkcommunication network

Application B

Application A

Page 7: A compositional approach to embedded system performance ...€¦ · 4 R. Ernst, TU Braunschweig 10 Embedded architecture - challenges • design specialization – increased performance

7

R. Ernst, TU Braunschweig 16

Subsystem integration - example

M2IP2M3

M1

Com Netw

DSPIP1

HWCPU

M2

IP2M3

M1DSP

IP1HW

CPU

Integration

subsystem 1

subsystem 2P1

P3

P2

Sens

Sens

subsystem 2

subsystem 1

R. Ernst, TU Braunschweig 17

Communication centric design - key challenges

• subsystem integration&verification

• design space exploration and optimization

Page 8: A compositional approach to embedded system performance ...€¦ · 4 R. Ernst, TU Braunschweig 10 Embedded architecture - challenges • design specialization – increased performance

8

R. Ernst, TU Braunschweig 18

Integration tasks

• resource sharing– several SW processes mapped to one processing element

⇒ task scheduling– mapping several communications mapped to one

communication path⇒ communication (bus) scheduling

– several process data mapped to one memory ⇒ memory assignment (space & time)

• interface synthesis– synthesizing process communication to target system

communication

• integration verification

R. Ernst, TU Braunschweig 19

Integration verification

• correct implementation of specified function• HW/SW co-simulation, verification

• correct target architecture parameters• processor and communication performance• adherence to timing requirements• no memory over/underflow• no run-time dependent dead-locks

generaldesign problem

challenge to heterogeneous system design

Page 9: A compositional approach to embedded system performance ...€¦ · 4 R. Ernst, TU Braunschweig 10 Embedded architecture - challenges • design specialization – increased performance

9

R. Ernst, TU Braunschweig 20

Actuatorsystem busSensor

RTOS

I/O int bus-CTRL

timertimercore

drivers

RTOS-APIs

application

cache

MEM

RTOS

core

drivers

RTOS-APIs

application

I/Ointbus-CTRL

timertimer

ReleaseAirbag

Complex performance objectives and constraints

Crash

PctrlCPsens. PdetcCC

Pact.CC

Reaction time of airbag after crash ?

tcom+ tdrv

=tAPI

+ tprocess

+ tAPI

=tdrv

+ tcom

+ tdrv

=tAPI

+ tprocess

+ tAPI

=tdrv

+ tcom

=tsenstcrash + tcsens + tdetc + tfbus + tcact + tairbag+ tact+ + tctrl tact+ tairbag+

physical delay

tsens +tcrash +

physical delay tcom+ tdrv

tAPI+ tprocess

+ tAPI

tdrv+ tcom

+ tdrv

tAPI+ tprocess

+ tAPI

tdrv+ tcom

cache

MEM

R. Ernst, TU Braunschweig 21

Complex run-time interdependencies

M2IP2M3

M1

Com Netw

DSPIP1

HWCPUSens

• run-time dependencies of independent components via communication

• influence on timing and power

Page 10: A compositional approach to embedded system performance ...€¦ · 4 R. Ernst, TU Braunschweig 10 Embedded architecture - challenges • design specialization – increased performance

10

R. Ernst, TU Braunschweig 22

CPU1M1

communication networkcommunication network

min execution time⇒ high bus load

max execution time⇒ low bus load

P1

tbc1 twc1

Interdependency example

• complex non-functional interdependencies

• complex system corner cases

R. Ernst, TU Braunschweig 23

CoPro

Heterogeneous resource sharing -example

VLIW MEMIPIP IPIPMEM

RISC MEM DSP

com. netw.com. netw.

static executionorder scheduling

static priorityschedulingFCFS scheduling

earliest deadlinefirst scheduling

TDMA scheduling

proprietary(abstract info)

Page 11: A compositional approach to embedded system performance ...€¦ · 4 R. Ernst, TU Braunschweig 10 Embedded architecture - challenges • design specialization – increased performance

11

R. Ernst, TU Braunschweig 24

Performance verification - state of the art

• current approach: target architecture co-simulation• combines functional and performance validation• reuse component validation pattern for system integration and

function test• reuse application benchmarks for target architecture function

validation• visualization of system execution• extensive simulation run times to include many test cases

R. Ernst, TU Braunschweig 25

Co-simulation limitations

• identification of system performance corner cases– different from component performance corner cases– target architecture behavior unknown to the application

function developer (cp. functional HW test)⇒ test case definition and selection ?

• analysis of target architecture – confusing variety of run-time interdependencies– data dependent “transient” run-time effects– mixed in co-simulation

⇒ limited support of design space exploration⇒ debugging challenge

• inclusion of incomplete application specifications ⇒ additional performance models required

Page 12: A compositional approach to embedded system performance ...€¦ · 4 R. Ernst, TU Braunschweig 10 Embedded architecture - challenges • design specialization – increased performance

12

R. Ernst, TU Braunschweig 26

Alternatives?

• conservative design– install independency in resource sharing– example: fixed execution time slots – TDMA

• formal performance analysis

R. Ernst, TU Braunschweig 29

Formal performance analysis

• formal techniques known for individual components and subsystems (RMS, static scheduling etc.)

• heterogeneity is problem

Page 13: A compositional approach to embedded system performance ...€¦ · 4 R. Ernst, TU Braunschweig 10 Embedded architecture - challenges • design specialization – increased performance

13

R. Ernst, TU Braunschweig 30

Performance model structure

processexecution model

• execution path – data dependent• path execution – arch. dependent• communication – data & arch. dep.

P1 P2

activation

P1

• resource sharing strategy• process activation• component state (caches)

M

IP

M P M P

M

component & communicationexecution model

systemmodel

• communication pattern• shared memory access• environment model

influenced bymodel type

R. Ernst, TU Braunschweig 31

System performance analysis approaches

• global approach– analysis scope extension to several subsystems

• flow based hierachical approach– global flow analysis combined with local scheduling analysis

Page 14: A compositional approach to embedded system performance ...€¦ · 4 R. Ernst, TU Braunschweig 10 Embedded architecture - challenges • design specialization – increased performance

14

R. Ernst, TU Braunschweig 32

Analysis scope extension• coherent analysis („holistic“ approach)

• example: Tindell 94, Pop/Eles (DATE 2000, DAC 2002, …):TDMA + static priority – automotive applications

• problem: scalability

P2 P1

T

TTP bus interface

P3 P4

D

TTP bus interfacequeue

RTOSRTOS

TTP bus (TDMA)

static priorityprocess scheduling

static priorityqueueingT: Transmitter

process

R. Ernst, TU Braunschweig 33

Hierarchical approach

• independently scheduled subsystems are coupled by dataflow

⇒ subsystems coupled by stream of data⇒ interpreted as activating events

⇒ coupling corresponds to event propagation

SB 1

scheduling SB 1

P2

P1

SB 2

scheduling SB 2

P4

P3

Page 15: A compositional approach to embedded system performance ...€¦ · 4 R. Ernst, TU Braunschweig 10 Embedded architecture - challenges • design specialization – increased performance

15

R. Ernst, TU Braunschweig 34

Event propagation and analysis principle

environment model

local analysis

derive output event model

map to input event model

until convergence or non-schedulability

R. Ernst, TU Braunschweig 35

Flow based hierarchical approaches

• event model generalization for a set of scheduling strategies– arrival and service curves Chakraborty/Thiele/Gries/Künzli …– new analysis approaches needed, e.g. Baruah– used for network processor design

• event stream model adaptation– use abstract interface stream properties to couple local analysis– used e.g. for automotive software

Page 16: A compositional approach to embedded system performance ...€¦ · 4 R. Ernst, TU Braunschweig 10 Embedded architecture - challenges • design specialization – increased performance

16

R. Ernst, TU Braunschweig 36

Generalized event model• incoming events stream is integrated over time intervals and

captured in arrival curves

• event consumption is captured in service curves

• 2 curves each for upper and lower bounds (interval)

• upper and lower bounds linearly approximated for analyis

R. Ernst, TU Braunschweig 37

Load analysis with interval event model

• Load analyis: Addition propagation of computation and load intervals - min/max algebra

• e.g. buffer size: difference between input load and processed events HW and SW resources

resource bounds

input stream bounds

remaining resources processed packet streamssource: L. Thiele, ETH Zurich

L. Thiele, ETH Zurich

Page 17: A compositional approach to embedded system performance ...€¦ · 4 R. Ernst, TU Braunschweig 10 Embedded architecture - challenges • design specialization – increased performance

17

R. Ernst, TU Braunschweig 38

Compositional Approach (Ernst et al.)

• observation: real-time analysis assumes similar event models at input– periodic event stream

– periodic event streams with jitter

– periodic event streams with burst

– sporadic events with minimum event separation

– sporadic events with bursts

• comparable event models appear at output

• volume of generated events is fixed or interval (data dependent)

tptpte1 te2 te3

te1 te2 te3tptp

tint

tmin

te1 te2 te3 ten ten+1

te1 te2

R. Ernst, TU Braunschweig 39

RTA event model examples

• static execution order– periodic → periodic w. jitter

• Jitter: fixed part (scheduling) + variable part (data dependency)

– sporadic → sporadic w. jitter

• static priority – periodic → periodic w. jitter and burst

– sporadic → sporadic w. jitter and burst

• time driven scheduling: TDMA, round robin– adds jitter to any model

Page 18: A compositional approach to embedded system performance ...€¦ · 4 R. Ernst, TU Braunschweig 10 Embedded architecture - challenges • design specialization – increased performance

18

R. Ernst, TU Braunschweig 40

Example: Static priority with arbitrary deadlines

• complex execution sequence - may create output bursts

• found in communication scheduling and multiprocessing

busy period

T1

T2

T2

T2 T2

P3

P1

P2

priority analysis solution e.g. by Lehoczky

R. Ernst, TU Braunschweig 41

Few stream models can be derived

• nact (t) no. of activating events in time t

• burst can be simplified and modeled as jitter greater period with min. event distance

Page 19: A compositional approach to embedded system performance ...€¦ · 4 R. Ernst, TU Braunschweig 10 Embedded architecture - challenges • design specialization – increased performance

19

R. Ernst, TU Braunschweig 42

Event stream model adaptation

• composition requires model adaptation

• adaptation for compatible input and output event models– simple mathematical transformation to adapt analysis

Event Model InterFace (EMIF)– no added function, no run-time effect

• adaptation for non-compatible input and output event models– insert Event Adaptation Function (EAF) to transform models– EAF describes interface buffer function

⇒ shows that buffer is required to couple subsystems

R. Ernst, TU Braunschweig 43

periodic with jitterJ J J

TTperiodic with burst

Tb

t

b

t

periodicTT

sporadicx≥≥≥≥t x≥≥≥≥t x≥≥≥≥t

Event Model Interface Classification

jitter = 0burst length (b) = 1

t = T - J

t = T

t = t

lossless EMIF EMIF to less expressive model

T=T, t=T, b=1 T=T, J=0

Page 20: A compositional approach to embedded system performance ...€¦ · 4 R. Ernst, TU Braunschweig 10 Embedded architecture - challenges • design specialization – increased performance

20

R. Ernst, TU Braunschweig 44

Using EMIFs and EAFs

Sporadic

Periodic with Burst Periodic with Jitter

PeriodicEAF bufferrequired

upper bound only

R. Ernst, TU Braunschweig 45

Example revisited

com.netw.

subsystem 2

subsystem 1

Distortion

non-functional dependency cycle

Page 21: A compositional approach to embedded system performance ...€¦ · 4 R. Ernst, TU Braunschweig 10 Embedded architecture - challenges • design specialization – increased performance

21

R. Ernst, TU Braunschweig 46

EMIF w/ EAF

M2DSPIP2M3IP1

HWM1CPUSens

sporadic

EMIF EMIF

sporadic

EMIF

EMIF

burst

EMIF

sporadic

C1

C3

periodic periodic

com.netw.

burst-basednetw

ork analysis

C2

R. Ernst, TU Braunschweig 47

CPU

sporadicw/ jitter

periodicw/ burst

P3preemptionP1

EMIFw/ EAF

NoCC1 interference C2

HWSens

EMIF

simpleperiodic

simpleperiodic

activation by RTOS

com.network

Dependency cycle

resynchronization reduces jitter

Page 22: A compositional approach to embedded system performance ...€¦ · 4 R. Ernst, TU Braunschweig 10 Embedded architecture - challenges • design specialization – increased performance

22

R. Ernst, TU Braunschweig 48

simpleperiodic

EMIFw/ EAF

EMIF w/ EAF

M2DSPIP2M3IP1

HWM1CPU

Sens

simplesporadic sporadic

w/ jitter EMIF

simplesporadic

C1

C3

periodic periodic

NoC C2

periodicw/ burst

periodicw/ burst

periodicw/ jitter

fmax=1,7kHz, s=8kB

f=140kHz, s=1kB

f=20kHz

s=3kB

8kB→→→→ 32×××× ( 256 + 6 ) byte

1kB→→→→ 1×××× ( 1024 + 6 ) byte

3kB→→→→ 24×××× ( 128 + 6 ) byte

• t (P1)= 250 µµµµs

• t (P3)= 10 µµµµs

com.network

R. Ernst, TU Braunschweig 49

Results

39

Page 23: A compositional approach to embedded system performance ...€¦ · 4 R. Ernst, TU Braunschweig 10 Embedded architecture - challenges • design specialization – increased performance

23

R. Ernst, TU Braunschweig 50

Application model compatibility

• stream model represents exact data volume – keeps causality for application model activation– required for „AND“ activation, e.g. data flow graphs

otherwise infinite buffers or starvation

– not needed for process systems with „OR“ activation or time driven activation

P1 P2

11 1

11

1

CPU1 CPU2

P1 P2

2

1 1 11

CPU1 CPU2

P321

R. Ernst, TU Braunschweig 51

System contexts

• another stream property – partial event order is kept

• application: tagged token (SPI model) to take system contexts into account

• example set top box

Page 24: A compositional approach to embedded system performance ...€¦ · 4 R. Ernst, TU Braunschweig 10 Embedded architecture - challenges • design specialization – increased performance

24

R. Ernst, TU Braunschweig 52

Context exploitation: Set top box

system bus

decryptionunit

RF

hard-disk

R. Ernst, TU Braunschweig 53

Context exploitation: Set top box

decryptionunit

RF

hard-disk

mux MPEG-2

IP-traffic

• scenario 1: record video + watch movie + download file via IP

Page 25: A compositional approach to embedded system performance ...€¦ · 4 R. Ernst, TU Braunschweig 10 Embedded architecture - challenges • design specialization – increased performance

25

R. Ernst, TU Braunschweig 54

MPEG stream modeling

• stream properties– I, B, P frame types of different data volume– encoding with the following constraint on frame types

• stream coding– set of constraints on the sequence of tagged tokens

Min 2 out of 12

IMin 2 out of 12PMin 6 out of 12BMax 4 out of 12IMax 4 out of 12PMax 8 out of 12B

R. Ernst, TU Braunschweig 55

Event stream contexts

• intra event stream context – data volume and execution time depend on frame types and

frame sequence

• inter event stream context– bus load and response times depend on relative phase of

streams

• inter + intra event stream contexts– merged streams on bus have variable order

• solutions available for cyclo static streams (Chen/Mok) and inter event stream contexts only (Tindell), extended to constrained sequences and combinations

Page 26: A compositional approach to embedded system performance ...€¦ · 4 R. Ernst, TU Braunschweig 10 Embedded architecture - challenges • design specialization – increased performance

26

R. Ernst, TU Braunschweig 56

Response time impact

• response times for static priority bus arbitration

III I I I I I

w/o context t respIP = 920

PI I I I Ptw context t respIP = 690

t

118

muxIP

video streams are multiplexed and have higher priority than IP

R. Ernst, TU Braunschweig 57

Analysis improvement due to intra event streamcontexts

context w/otcontext wt

max resp,

max resp,

Page 27: A compositional approach to embedded system performance ...€¦ · 4 R. Ernst, TU Braunschweig 10 Embedded architecture - challenges • design specialization – increased performance

27

R. Ernst, TU Braunschweig 58

Set top box scenario 2

decryptionunit

RF

hard-disk

• scenario 2: decript video + download file via IP

Encrypted MPEG-2

Decrypted MPEG-2

IP-traffic

R. Ernst, TU Braunschweig 59

Response time impact

140

I I

t

transaction period = 100

I I50

I I

170 t

I I

100

enc enc

enc enc

dec dec

dec dec

decIP

enc

• enc and dec streams are coupled by decryption unit execution time

Page 28: A compositional approach to embedded system performance ...€¦ · 4 R. Ernst, TU Braunschweig 10 Embedded architecture - challenges • design specialization – increased performance

28

R. Ernst, TU Braunschweig 60

Analysis improvement due to inter event streamcontexts

context w/otcontext wt

max resp,

max resp,

R. Ernst, TU Braunschweig 61

Analysis improvement due to both intra and inter event stream contexts

context w/otcontext wt

max resp,

max resp,

Page 29: A compositional approach to embedded system performance ...€¦ · 4 R. Ernst, TU Braunschweig 10 Embedded architecture - challenges • design specialization – increased performance

29

R. Ernst, TU Braunschweig 62

Acknowledgement

• The following persons made major contributions to this work

– Jörn Braam– J. Bhavani– Rafik Henia – Marek Jersak– Razvan Racu – Kai Richter

R. Ernst, TU Braunschweig 63

Conclusion

• systems integration is key ES design problem

• performance verification is key integration problem

• many hidden performance problems not reflected in system function

• performance verification currently primarily based on simulation – risky and time consuming

• presented compositional analysis based on abstract eventflow models

• event flow models enable analysis of complex situations with feedback and contexts

• work applied in several ongoing industry cooperations (SpeAC, FlexFilm, automotive SW, …)

Page 30: A compositional approach to embedded system performance ...€¦ · 4 R. Ernst, TU Braunschweig 10 Embedded architecture - challenges • design specialization – increased performance

30

R. Ernst, TU Braunschweig 64

Literature

• see: www.spi-project.org