High Performance Embedded Computing
Challenges in long lifetime applications

Chalmers Computing Lab Tech Talks, February 8 2015

Prepared: OELPW Anders Åhlander
Approved:
Checked:
Date: 2015-02-08
Confidentiality Class: COMPANY UNCLASSIFIED
Document Number: en
Revision: PA1



ELECTRONIC DEFENCE SYSTEMS

[Organisation chart: Senior Vice President; Staff functions; business units:]
• Surface Radar Systems
• Future Sensor Systems
• Airborne Surveillance Systems
• Sourcing & Production
• EW Systems


KEY CAPABILITIES AND FIGURES

World-leading centre of competence for microwave and antenna technology. Advanced airborne, ground-based and naval radar systems as well as radar upgrade expertise.

Full range of assets in the Electronic Warfare area, for signals intelligence, warning and self-protection.

MSEK                2013    2012    2011
Order intake       7,620   2,739   3,229
Sales              4,161   4,276   4,561
No. of employees   2,588   2,620   2,557


GLOBAL OPERATIONS

Competence centres in five countries.

Systems in operation in more than 30 countries.

Key markets: Sweden, Germany, UK, US, Brazil, Middle East, India, Thailand, South Korea.


Outline

Long-lifetime HPEC applications
• Particular challenges
• Aspects of cost-effective application development

Exploiting the processing technology development

Possibilities to obtain cost-efficiency
• Previous and ongoing “FoT” projects

An ongoing project: ESCHER

Summary


Example: AESA based sensor systems

Active Electronically Scanned Array (AESA)

[Figure: Performance over Time; GFLOPS before AESA, TFLOPS after AESA]

Benefit: Powerful system operation

Cost: High demands on signal processor; high complexity


Processing challenges

AESA SP performance in the same “box” as for conventional systems, considering
• Physical size
• Power dissipation
• Physical robustness

Cost-effective development of the processing
• Engineering efficiency; mastering complexity
• Flexibility; easy and efficient enabling of various options
• Sustainability; application support over many years of lifetime

Contradictory?


Engineering efficiency

Mastering complexity
• high computational load - high parallelism
• multi-functional - complex resource management and scheduling

The aggregated complexity scales differently with the system
• More channels ->
  - same complexity on algorithm level (only larger matrices)
  - more complexity in realization (more processors)
• More functions/new algorithms ->
  - more complex control and interactions

Technology improvement may make it possible to trade hardware performance for reduced development time
• use higher levels of abstraction
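The "more channels" point can be illustrated with a minimal sketch, assuming a simple weighted-sum beamformer. The function names, the data layout and the uniform weights are illustrative assumptions, not taken from the talk. The routine is identical for 1 and 32 channels; only the input grows.

```python
from typing import List, Sequence


def beamform(samples: Sequence[Sequence[complex]],
             weights: Sequence[complex]) -> List[complex]:
    """Weighted sum over channels for each time sample.

    samples[c][t] is the sample of channel c at time t (assumed layout).
    """
    if len(samples) != len(weights):
        raise ValueError("one weight per channel is required")
    n_times = len(samples[0])
    return [
        sum(w.conjugate() * channel[t] for w, channel in zip(weights, samples))
        for t in range(n_times)
    ]


def uniform_weights(n_channels: int) -> List[complex]:
    """Hypothetical weights: every channel contributes equally."""
    return [complex(1.0 / n_channels, 0.0)] * n_channels


# Same algorithm for both system sizes; only the matrix dimensions differ.
for n_channels in (1, 32):
    data = [[complex(1.0, 0.0)] * 4 for _ in range(n_channels)]
    out = beamform(data, uniform_weights(n_channels))
    print(n_channels, "channels ->", len(out), "output samples")
```

The realization cost (how the larger matrices are spread over more processors) is deliberately not shown; that is where the added complexity appears.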


Flexibility

Possibility to enable software options for different products or for different users
• different functions
• different types and number of sensors
• efficient testing and deployment of the systems

Possibility to configure the system for the mission at hand

Capability to meet changing demands on the fly


Sustainability

Technology insertion
• Possible to upgrade the SW and refresh the HW over lifetime

Scalability
• Easy to scale up or down functionality/performance in order to tolerate different hardware implementations of the system
• Easy to “ride on Moore's law”

Layered application development
• A modern, sustainable codebase
• A clear separation of hardware features from the application requirement is made

General-purpose vs. special-purpose hardware
• Certain functions can be implemented in acceleration hardware as long as there is a clear path (and a clean interface) to replace the hardware when it becomes obsolete
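A minimal sketch of the "clean interface" idea, assuming a hypothetical `SignalBackend` abstraction (all class and function names here are invented for illustration). The application layer is written once against the interface, so an accelerator implementation can be replaced when it becomes obsolete without touching application code.

```python
from abc import ABC, abstractmethod
from typing import List, Sequence


class SignalBackend(ABC):
    """Hypothetical hardware abstraction: all the application layer sees."""

    @abstractmethod
    def fir(self, x: Sequence[float], taps: Sequence[float]) -> List[float]:
        """Return y[n] = sum_k taps[k] * x[n - k], with x[n] = 0 for n < 0."""


class ReferenceBackend(SignalBackend):
    """Portable general-purpose implementation."""

    def fir(self, x, taps):
        return [
            sum(taps[k] * x[n - k] for k in range(len(taps)) if n - k >= 0)
            for n in range(len(x))
        ]


class AcceleratorBackend(SignalBackend):
    """Stand-in for special-purpose hardware.

    A real backend would call a device driver; here the same result is
    computed with a different loop order, purely for illustration.
    """

    def fir(self, x, taps):
        y = [0.0] * len(x)
        for k, tap in enumerate(taps):
            for n in range(k, len(x)):
                y[n] += tap * x[n - k]
        return y


def smooth(backend: SignalBackend, x: Sequence[float]) -> List[float]:
    """Application layer: a two-tap moving average, hardware-agnostic."""
    return backend.fir(x, [0.5, 0.5])


signal = [1.0, 2.0, 3.0]
print(smooth(ReferenceBackend(), signal))
print(smooth(AcceleratorBackend(), signal))
```

Swapping the backend object is the only change needed when the hardware is refreshed, which is the property the slide asks for.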


Desired platform properties

• Possible to take advantage of the rapid technology development to shorten the application development time
• Decouple system SW from the HW implementation
• Multiple implementation options for any given application
• A simple way to replace or add hardware modules
• Layered development of application SW and support for re-use of SW components
• Scalability in terms of problem size as well as technology development


How to combine
• high processing efficiency with
• high engineering efficiency?


Generality-performance trade-off

[Figure: performance and generality over time for COTS and custom solutions. Annotations: architecture advantage; performance gain; time lead; enough generality for application domain; scales with technology development]

Combine the best of two worlds
- performance, engineering efficiency


The application domain

[Figure labels: high performance; linear operations (MM, FIR, FFT) on a dataset; multi-functional; high complexity]
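Why these kernels count as linear operations can be shown with a small sketch, reading MM as matrix multiplication (an assumption; the slide does not expand the abbreviation). Both an FIR filter and the discrete Fourier transform (which an FFT computes with fewer operations) can be written as a matrix applied to the input vector. The helper names below are illustrative.

```python
import cmath
from typing import List, Sequence

Matrix = List[List[complex]]


def matvec(a: Matrix, x: Sequence[complex]) -> List[complex]:
    """Plain matrix-vector product y = A x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def fir_matrix(taps: Sequence[complex], n: int) -> Matrix:
    """Lower-triangular Toeplitz matrix: row i gives y[i] = sum_k taps[k] * x[i-k]."""
    return [
        [taps[i - j] if 0 <= i - j < len(taps) else 0 for j in range(n)]
        for i in range(n)
    ]


def dft_matrix(n: int) -> Matrix:
    """DFT matrix with entries exp(-2*pi*j * row * col / n)."""
    return [
        [cmath.exp(-2j * cmath.pi * i * j / n) for j in range(n)]
        for i in range(n)
    ]


x = [1, 2, 3, 4]
print(matvec(fir_matrix([0.5, 0.5], len(x)), x))  # two-tap moving average
print(matvec(dft_matrix(len(x)), x))              # spectrum; bin 0 is the sum of x
```

In practice dedicated FIR and FFT routines are used instead of dense matrices; the matrix form only makes the shared linear structure visible.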


Technology development

The International Technology Roadmap for Semiconductors (ITRS)

• Includes time-lines up to about 15 years into the future

• ITRS is sponsored by
  - the European Semiconductor Industry Association
  - the Japan Electronics and Information Technology Industries Association
  - the Korean Semiconductor Industry Association
  - the Taiwan Semiconductor Industry Association
  - the United States Semiconductor Industry Association


ITRS market drivers

ITRS identifies different market drivers for the technology development

Portable/consumer, Medical, Defense, Automotive, etc.

The market drivers drive the development of, e.g., microprocessors and System-on-a-chip (SoC) devices

Portable/consumer market driver is chosen here
• High power efficiency is paramount in HPEC


Trend according to ITRS

Number of processing cores on a chip

Fortunately, this trend typically suits our applications well


Performance trend

[Figure: SoC Consumer Portable processing performance trends (source: ITRS). x-axis: Year, 2009 to 2024; y-axis: log(normalized performance), 0.0 to 5.0; series: "Trend: Performance" and "Requirement: Performance"]


Power trend

[Figure: SoC Consumer Portable total power trend (source: ITRS). x-axis: Year, 2009 to 2024; y-axis: Power (Watt), 0 to 14; series: "Trend: Total chip power" and "Requirement: Total chip power"]

Total power = static power (gate leakage etc.) + dynamic power (switching)
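The decomposition above can be written out using the common first-order CMOS power model. This is a textbook approximation added for orientation, not a formula from the slides; the symbols are assumptions: activity factor $\alpha$, switched capacitance $C$, supply voltage $V_{dd}$, clock frequency $f$ and leakage current $I_{leak}$.

```latex
\begin{align}
  P_{\text{total}}   &= P_{\text{static}} + P_{\text{dynamic}} \\
  P_{\text{static}}  &\approx V_{dd} \, I_{leak} \\
  P_{\text{dynamic}} &\approx \alpha \, C \, V_{dd}^{2} \, f
\end{align}
```

Under this model, lowering $V_{dd}$ and $f$ while adding cores can raise throughput per watt, which is one reason many-core designs are attractive when power is the limit.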


Performance/power ratio trend

[Figure: Consumer Portable performance per power trend, normalized to 2009. x-axis: Year, 2009 to 2024; y-axis: normalized ratio, 0 to 50; series: "Trend: Performance/power ratio"]
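How such a normalized curve is derived can be sketched as follows. The function name and the input numbers are placeholders for illustration only; they are NOT ITRS data.

```python
from typing import Dict


def performance_per_power(performance: Dict[int, float],
                          power_watt: Dict[int, float],
                          base_year: int) -> Dict[int, float]:
    """Performance/power ratio per year, normalized to the base year."""
    ratio = {
        year: performance[year] / power_watt[year]
        for year in performance
        if year in power_watt
    }
    base = ratio[base_year]
    return {year: value / base for year, value in sorted(ratio.items())}


# Placeholder values, invented for this example (not taken from ITRS).
perf = {2009: 1.0, 2010: 2.0, 2011: 4.0}
power = {2009: 2.0, 2010: 2.0, 2011: 4.0}
print(performance_per_power(perf, power, base_year=2009))
```

With these placeholder inputs the ratio doubles from 2009 to 2010 and then stays flat, because performance and power both double in the last step.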


Sensor signal processing

[Figure: Illustration, requirements on SP power efficiency vs. HW trend. x-axis: Year, 2009 to 2024; y-axis: normalized ratio, 0 to 50; series: "Trend: Performance/power ratio"; annotations: 1 channel, 32 channels, short-term, long-term]


Energy efficiency

Different aspects of energy efficiency

Larger high-end systems
• Many watts of available power but very high computational performance
• E.g. multi-channel adaptive processing in AESA based systems

Small handheld systems
• Short range - low transmit power, thus more focus on SP power
• Often high frequency - many resolution cells, thus high SP performance
• Battery powered


How to exploit the technology development

Processor chips are moving towards many-core (100+ cores)

How shall we efficiently use all the cores?
• Application programming
• Mapping on the processor architecture

Joint industry/academy research projects address this


Research projects

[Timeline figure, 1995 to 2014: national/international joint industry/academic projects]

Projects shown: REMAP, HSSP, EEE, TELLUS, SPREWS, TELLUS 2, SMECY, ESCHER, EPC, JUMP, ERTCENS, … and others

EDA projects: 2009 to 2014


High Speed Signal Processing
Example of project result

• Mainly COTS application development environment
  - Commercial RTOS
  - ANSI C
  - In-house algorithmic library

• A TFLOPS system realization proposal
  - Five BYB601 cassettes
  - LSI Logic G13 ASICs with commercial RISC masters
  - LVDS ring network, 1.6 GB/s data channel + control channel
  - Realizable in year 2001


An ongoing project: ESCHER

Embedded Streaming Computations on Heterogeneous Energy-efficient aRchitectures

KK-foundation HÖG project, 2014-2016
Lead: CERES at Halmstad University


Processing eras

Single-Core Era
[Figure: single-thread performance over time, marked "we are here"]
Enabled by:
+ Moore's Law
+ Voltage Scaling
+ Micro Architecture/RISC
Constrained by:
- Power
- Complexity

Multi-Core Era
[Figure: throughput performance over time (# processors), marked "we are here"]
Enabled by:
+ Moore's Law
+ Desire for throughput
+ 20 years of SMP
Constrained by:
- Power
- Parallel SW available
- Scalability

Many-Core/Heterogeneous Systems Era
[Figure: targeted application performance over time, marked "we are here"]
Enabled by:
+ Power efficiency through high parallelism
+ Moore's Law
Currently constrained by:
- Power
- Programming models
- Communication overhead

Programming models across the eras: Assembler => C => C++/Java; pthreads => OpenMP/TBB ...; OpenCL/CUDA, StreamIt, CAL, Occam-Pi, Chapel, ZPL, ...

Inspired by "The Future Is Heterogeneous Computing", Mike Houston, Advanced Micro Devices, 2010


ESCHER

Two main parts:

• Application Development support and Languages for Heterogeneous Manycore Architectures

• Heterogeneous Many-core Architectures for Real-Time Embedded Streaming Systems


Embedded Real-Time Streaming Computations

The applications of the industrial partners are often in the form of Embedded Streaming Applications like:

Sensor systems

Autonomous Vehicles

Vision/Video

Communication
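What "streaming" means for such applications can be sketched as a chain of stages, each consuming and producing a stream of samples without storing the whole data set. The stage names and the ramp-shaped test signal below are invented for illustration and are not part of ESCHER.

```python
from typing import Iterable, Iterator, List


def sensor(n_samples: int) -> Iterator[float]:
    """Hypothetical source: emits a ramp signal, one sample at a time."""
    for i in range(n_samples):
        yield float(i)


def moving_average(stream: Iterable[float], width: int) -> Iterator[float]:
    """Emit the mean of the last `width` samples once the window is full."""
    window: List[float] = []
    for sample in stream:
        window.append(sample)
        if len(window) > width:
            window.pop(0)
        if len(window) == width:
            yield sum(window) / width


def threshold(stream: Iterable[float], limit: float) -> Iterator[bool]:
    """Detection stage: flag every value above the limit."""
    for value in stream:
        yield value > limit


# Stages are composed like a pipeline; samples flow through one by one.
detections = list(threshold(moving_average(sensor(6), width=2), limit=2.0))
print(detections)
```

Because each stage only depends on its input stream, stages like these are natural units to map onto different cores of a many-core or heterogeneous architecture.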


ESCHER

Maximize the “four Ps”
• Programmer Productivity
• Program Portability
• Performance
• Power efficiency


Applications
• Scientific Computing
• Streaming Real-time Processing
• Data Mining
• Machine Learning

Parallel Programming Language

Programmability

Architectures
• Many-core
• GPGPU
• Coarse-grained Reconfig. Arch.
• FPGA

Inspired by Kevin J. Brown, et al., “A Heterogeneous Parallel Framework for Domain-Specific Languages”, PACT 2011


Applications
• Scientific Computing
• Streaming Real-time Processing
• Data Mining
• Machine Learning

Programmability Chasm

Architectures
• Many-core
• GPGPU
• Coarse-grained Reconfig. Arch.
• FPGA

Inspired by Kevin J. Brown, et al., “A Heterogeneous Parallel Framework for Domain-Specific Languages”, PACT 2011


Applications
• Scientific Engineering
• Streaming Real-time Processing
• Data Mining
• Machine Learning

How to bridge the Programmability Chasm?

Architectures
• Many-Core
• GPGPU
• Coarse-grained Reconfig. Arch.
• FPGA


Saab EDS in ESCHER

Brings in expertise
• Signal and data processing
• HW and SW development for sensor processing

Provides application use cases

Evaluates the developed design environments

Goals: learn whether the studied development approaches will outperform traditional approaches in aspects such as engineering efficiency, performance, and power

A concrete goal: a proof-of-concept realization of an AESA signal processing chain using the design tool flows on real hardware


Summary

High performance efficiency and high engineering efficiency must typically be combined in HPEC

Typically a lifetime mismatch between SP technology and sensor applications

Possibilities
• Cost-effective sustainable solutions are possible
• Possible to ride on Moore’s law: scale in problem size and function

Risks
• Applications are not portable to new hardware platforms

Solutions
• Layered application development
• Domain specific languages with the right abstraction level
• Intermediate representations that support portability over hardware
• HW architecture that is “general enough”
