Marc Duranton
Koen De Bosschere, Christian Gamrat,
Jonas Maebe, Harm Munk, Olivier Zendra
High-Performance and Embedded Architecture and Compilation
HiPEAC Vision 2017
for Computing in 2025
The HiPEAC project has received funding from the European Union’s Horizon 2020
research and innovation programme under grant agreement number 687698.
The HiPEAC Vision Document is a deliverable of the coordination and support action on High Performance and Embedded Architecture and Compilation that gathers over 450 leading European academic and industrial computing system researchers from nearly 320 institutions in one virtual center of excellence of 1700 researchers.
The HiPEAC Vision
2008 2009 2011 2013 2015
SELF-SUFFICIENCY IN ICT and HiPEAC Vision 2015
About processor development: “No state-funded or EU-funded initiatives exist in Europe, yet. The opening up of the CPU-market is, however, an opportunity for Europe to jump in, as it clearly shows that information technology is not tightly bound to one computing platform anymore. Open architectures, where the code can be reviewed and the design audited, may play a major role in this climate.”
The HiPEAC Vision
2008 2009 2011 2013 2015 2017
The January 2017 version is available at: http://hipeac.net/vision
The document
If you only have 5 minutes…
…read the back page (1 page) and/or the Executive summary (2 pages)
If you have 15 minutes…
…read Part 1: Recommendations (4 pages)
If you have more time…
…read Part 2: Rationale (103 pages)
Structure of the HiPEAC Vision 2017
Recommendations
Society
Market
Technology
Position of Europe
This presentation
Evolution of society
ICT-worker shortage in Europe
Shortage: 442 000 in 2015; 913 000 in 2020
Market evolution
Technology evolution
[Figure: technology roadmap, 2017, 2018 and beyond. Nodes shown: 28nm, 22FD, 14nm, 12FD, 10nm, 7nm, 5nm, next generation. Device types: FDSOI, FinFET, non-planar / trigate / stacked nanowires. Device cross-section annotations: 25nm TBOX, 20nm LG ISPD SiCRSD, Si channel. Other labels: disruptive scaling, monolithic 3D for 3D VLSI, alternative to scaling and diversification, hybrid logic, mechanical switches, steep-slope devices, silicon quantum bits.]
The cost per transistor is no longer decreasing
Start of two design tracks?
• High end, high volume -> Latest technology
• Cost effective, mainstream -> Mature technology
Increased complexity and cost
The initial product designs will need to generate high revenues to provide a good payback on the design and yield ramp-up costs.
• Barrier to specialization in computing
• Barrier to advanced-feature monolithic dies
Source: IBS, Aug. 2014
IC design cost (NRE ++): 28nm $38M, 20nm $67M, 16nm $132M, 10nm $273M, 7nm $593M, 5nm $1348M
Wafer cost: 16nm $9885, 10nm $11881, 7nm $14707, 5nm $19620
IC design and yield ramp-up costs: 28nm $59M, 20nm $91M, 16nm $176M, 10nm $373M, 7nm $876M, 5nm $2243M
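To make the trend in the IC design cost figures above easier to read ($38M at 28nm up to $1348M at 5nm), the short sketch below computes the growth factor from one node to the next. The numbers are the ones quoted on the slide (IBS, Aug. 2014); the function and variable names are only illustrative.

```python
# IC design cost per node in millions of USD, as quoted on the slide (IBS, Aug. 2014).
design_cost_musd = {
    "28nm": 38, "20nm": 67, "16nm": 132,
    "10nm": 273, "7nm": 593, "5nm": 1348,
}

def growth_factors(costs):
    """Return (from_node, to_node, ratio) for each consecutive pair of nodes."""
    nodes = list(costs)
    return [(a, b, costs[b] / costs[a]) for a, b in zip(nodes, nodes[1:])]

def overall_factor(costs):
    """Return the cost ratio between the last and the first node listed."""
    nodes = list(costs)
    return costs[nodes[-1]] / costs[nodes[0]]

if __name__ == "__main__":
    for a, b, ratio in growth_factors(design_cost_musd):
        print(f"{a} -> {b}: x{ratio:.2f}")
    print(f"28nm -> 5nm overall: x{overall_factor(design_cost_musd):.1f}")
```

With these figures the design cost roughly doubles at each node, which is the barrier to specialization that the slide points to.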
Emerging technologies
Cyber physical entanglement
• The entanglement between the physical and virtual world
• Virtual reality, augmented reality and cyber-physical systems blending together
• Many computers with any shape or size and new interactions with surrounding people
“Uncharted 4”
Human and machine collaborating
• Entering the Centaur(1) era
• Intelligent personal assistants (Siri, Cortana, Google Now, Alexa…)
• Self-driving cars
• BIC (Brain-Inspired Computing)
• … mainly using Deep Learning techniques for natural signal processing
(1) In Advanced Chess, a "Centaur" is a man/machine team. Advanced Chess (sometimes called cyborg chess or centaur chess) was first introduced by grandmaster Garry Kasparov, with the objective of a human player and a computer chess program playing as a team against other such pairs. (From Wikipedia)
(Narrow) Artificial Intelligence everywhere
• Artificial Intelligence is changing the man-machine interaction
– Natural interfaces, “intelligent” behavior
– Voice recognition and synthesis
– Image and situation understanding
– …
Key elements of Artificial Intelligence
Traditional AI
Analysis of “big data”
ML-based AI: Deep Learning*
* Reinforcement Learning, One-shot Learning, etc.
From Greg S. Corrado, Google Brain team co-founder:
– “Traditional AI systems are programmed to be clever.
– Modern ML-based AI systems learn to be clever.”
(Deep) Learning is quite demanding
Example of hardware: Baidu’s Minwa
– For vision using deep learning
– 36 server nodes, each with Intel Xeon E5-2620, FDR InfiniBand (56 Gb/s) and 4 Nvidia Tesla K40m GPUs
– Total of 8.6 TB of fast memory
Example of hardware: NVIDIA DGX-1
(Narrow) Artificial Intelligence everywhere
• Artificial Intelligence is changing the man-machine interaction
– Natural interfaces, “intelligent” behavior
– Voice recognition and synthesis
– Image and situation understanding
– …
• The new systems should make intelligent and trustable decisions
Key ingredients for trustable systems
Mixed-criticality
Security
Privacy
Safety
Compatibility with “classical” ICT
IoT: the Internet of Threats
Today security / privacy issues make the newspaper headlines
Massive adoption of IoT by citizens relies on confidence in terms of security and privacy
Europe has little share in this market: it accounts for 25% of global cybersecurity spending but earns only 8.5% of the global cybersecurity market
Trust is key for critical applications
• Beyond predictability by design
– Because it is no longer possible (WCET, simulations of all use cases)
• Capability to build trustable systems from untrusted components
• Mastering trustability for complex distributed systems, composed of black or grey boxes
– Where transparency is not always possible
Cyber Physical Entanglement
[Figure labels: New services; Smart sensors; “Dumb” Internet of Things devices; Big Data; Data Analytics / Cognitive computing; Cloud / HPC; Physical Systems; Real-time; Embedded]
Intelligence at the edge: fog computing, edge computing, stream analytics, fast data…
Transforming data into information as early as possible
Computing distribution for “cognitive” systems
HPC in the loop
Processing, abstracting, understanding as early as possible
Embedded intelligence needs local high-end computing
Systems should be autonomous to make good decisions in all conditions
Safety will impose that basic autonomous functions should not rely on “always connected” or “always available”
Privacy will impose that some processing should be done locally and not be sent to the cloud.
Example: detecting elderly people falling in their home
Dumb sensors
Smart sensors: streaming and distributed data analytics
Bandwidth will require more local processing
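The privacy and bandwidth arguments above can be illustrated with a minimal sketch of a "smart" sensor that processes its stream locally and transmits only events, instead of forwarding every raw sample. Everything here is hypothetical: the sample values, the threshold and the function name are made up for illustration, and real fall detection is considerably more involved.

```python
def detect_events(samples, threshold):
    """Return the indices where the signal rises above the threshold (rising edges)."""
    events = []
    above = False
    for i, value in enumerate(samples):
        if value > threshold and not above:
            events.append(i)
        above = value > threshold
    return events

# Hypothetical acceleration magnitudes in g; the spike stands in for a fall.
stream = [1.0, 1.1, 0.9, 1.0, 3.2, 3.5, 1.0, 1.0, 0.9, 1.1]
FALL_THRESHOLD_G = 2.5  # illustrative value, not a validated setting

events = detect_events(stream, FALL_THRESHOLD_G)
raw_messages = len(stream)    # a dumb sensor would send every sample
event_messages = len(events)  # a smart sensor sends only detected events

if __name__ == "__main__":
    print(f"raw samples sent: {raw_messages}, events sent: {event_messages}")
```

The raw data never leaves the device in this sketch, which is the point made for both privacy (processing done locally) and bandwidth (fewer messages sent).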
"People who are really serious about software should make their own hardware" (Alan Kay)
• With doubling hardware performance, the value was in the software
• With stagnating hardware performance, the value is in the co-design of hardware and software
• In the embedded systems market, almost 90% of the market is hardware (source: Global Market Insights).
• We need to retain European capacity to design hardware
Yesterday: PC era (Intel)
Today: Mobile era (ARM)
(Today/) Tomorrow: CPS era (?)
Digital sovereignty of Europe is in danger if the capability to design and produce hardware is lost
Power = performance
• Computers should not waste energy on tasks that have no added value
• Trade-off energy/precision/response time
• Approximate systems because the world is not only 1 and 0
• Need new programming concepts for energy efficiency
• The myriad of IoT devices will have a large worldwide energy impact
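As a rough illustration of the energy/precision trade-off, the sketch below quantizes a signal to fewer bits and measures the resulting error. It assumes, purely for illustration, that the number of bits per value is a proxy for storage and data-movement cost; it does not model energy itself, and the function names and sample data are hypothetical.

```python
def quantize(values, bits, lo=-1.0, hi=1.0):
    """Uniformly quantize values in [lo, hi] to 2**bits levels and map them back."""
    levels = (1 << bits) - 1
    step = (hi - lo) / levels
    out = []
    for v in values:
        clipped = min(max(v, lo), hi)          # keep the value inside the range
        index = round((clipped - lo) / step)   # nearest quantization level
        out.append(lo + index * step)
    return out

def max_abs_error(values, bits):
    """Return the largest absolute error introduced by quantizing to `bits` bits."""
    approx = quantize(values, bits)
    return max(abs(a - b) for a, b in zip(values, approx))

# 101 evenly spaced sample points in [-1, 1], standing in for sensor data.
samples = [i / 50.0 - 1.0 for i in range(101)]

if __name__ == "__main__":
    for bits in (16, 8, 4):
        print(f"{bits:2d} bits -> max error {max_abs_error(samples, bits):.6f}")
```

Each halving of the bit width shrinks the data to move and store while increasing the error, so the acceptable precision depends on the use.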
Customized hardware…
… required to increase energy efficiency (e.g. for the inference phase of deep learning)
Computations (operations and precision) adapted to the use
Growing complexity of software and hardware
Features
• ARM® Dual Cortex™-A15 Microprocessor Subsystem
– Up to 1.5 GHz
– NEON™ SIMD coprocessor and VFPv4
– 2-MiB Unified L2 Cache Memory
– 6 Power Domains
• IVA-HD Hardware Accelerator Subsystem
• ARM Dual Cortex™-M4 Image Processing Unit (IPU)
– Dual-core, 200 MHz per Core
• On-Chip Debug with 14-Pin JTAG and CTools Technologies
• Display Subsystem
– Display Controller with DMA Engine
– Support for 3 LCD Outputs and 1 TV
– 3 Video, 1 GFX, and 1 Write-back Pipeline
– HDMI Encoder: HDMI 1.4a, HDCP 1.4, and DVI 1.0 Compliant
• Dual-Core PowerVR® SGX544™ 3D GPU
• 2D-Graphics Accelerator (BB2D) Subsystem
– Vivante™ GC320 Core
• Imaging Subsystem (ISS), Consisting of Image Signal Processor (ISP) and Still Image Coprocessor (SIMCOP) Block
• Face Detection Interface (FDIF)
• Power-Independent Audio Back-End (ABE) Subsystem
• Level 3 (L3) and Level 4 (L4) Interconnects
• DDR3/DDR3L Memory Interface (EMIF) Module
– Up to 4 GiB of SDRAM per EMIF (2 GiB per Chip Select)
• General-Purpose Memory Controller (GPMC)
• System Direct Memory Access (DMA) Controller
• Five High-Speed Inter-Integrated Circuit (I2C) Ports
• HDQ™/1-Wire® Interface
• 5 Configurable UART/IrDA/CIR Modules
• 4 Multichannel Serial Peripheral Interfaces (MCSPIs)
• Multichannel Buffered Serial Port (MCBSP)
• Multichannel Pulse Density Modulation (MCPDM)
• Multichannel Audio Serial Port (MCASP)
• 6-Path Digital Microphone (DMIC) Module
• MIPI® High-Speed Synchronous Serial Interface (HSI)
• High-Speed (HS) Multiport USB Host Subsystem
• SATA Host Controller and Physical Layer (PHY)
• MMC/SDIO Host Controller
• SuperSpeed (SS) USB OTG Subsystem and USB3 PHY
• Up to 256 General-Purpose I/O (GPIO) Pins
• 11 General-Purpose Timers
• 2 Watchdog Timers
• 32-kHz Synchronized Timer
• Power, Reset, and Clock Management
– Multiple Independent Core Power Domains
– Multiple Independent Core Voltage Domains
– Module-Level Clock Management for Dynamic Reduction of Consumption
– Available TI Clock Tree Tool (CTT) for Interactive Clock Tree Configuration
• Package
– 754 Device Pins
– Ball Grid Array (BGA)
– 0.5-mm Ball Pitch
Functional diagram: OMAP 5432 Multimedia Device
Processors
DSPs
Graphics accelerators (GPU)
Crypto accelerators
FPGAs
Deep learning accelerators
Quantum accelerators
There is a need for a holistic approach for systems development
Interoperability and composability
Interoperability and composability solutions are required
Multiple Control Apps
Managing complexity
Cognitive solutions for computing systems:
• Using AI techniques for computing systems
• Similar to generative design for mechanical engineering
AI for making computing systems: similar to “Generative design” approach
Motorcycle swingarm: the piece that hinges the rear wheel to the bike’s frame
The user only states desired goals and constraints
-> The complexity wall might prevent explaining the solution
-> Shall we trust “meta-rules”, or the process that is followed to build the AI?
“Autodesk”
Time to revisit the basic concepts
The US wants to “reboot computing”…
We propose to re-invent computing, typically by challenging basic assumptions...
- Interrupts, layers of memory, binary coding, ...
Highlights of HiPEAC Vision 2017…
Cyber physical entanglement
Human and machine collaboration
Artificial intelligence
Highlights of HiPEAC Vision 2017… for High Performance Computers
HPC in the loop
Human in the loop (visualization, interactive simulations, …)
Artificial intelligence
HPC at the edge
Data analytics
HPC at the edge: supercomputers from previous generations will become embedded systems in the next generations
Watson in 2011…
"Watson" tomorrow
“In 2011, the supercomputer WATSON was the size of a bedroom. Today, it's about the size of three pizza boxes stacked up. It's also 24 times faster and has seen a 2,400 percent improvement in performance.”
Note: it is not for the Jeopardy application; this slide is just to illustrate the title.
Holistic view
Guaranteeing trust
Mastering complexity
Improving performance and energy efficiency
Increasing ICT workforce
Reinventing computing
Security, safety, privacy
Mastering parallelism and heterogeneity
Beyond predictability by design
Download the new HiPEAC Vision at: http://hipeac.net/vision
Give us your comments at:
HiPEAC Vision 2017
Editorial board: Marc Duranton, Koen De Bosschere, Christian Gamrat, Jonas Maebe, Harm Munk, Olivier Zendra