Advanced Processor Technologies group overview

Page 1: Advanced Processor Technologies group overview

Advanced Processor Technologies group overview

Page 2: Advanced Processor Technologies group overview

APT mission

“To explore novel architectures and techniques that will enable the effective exploitation of the billion transistor chips of the near-future”

Page 3: Advanced Processor Technologies group overview

APT group

• Focus:
  – Moore’s Law will soon deliver billion transistor chips
  – how do we make best use of a billion transistors?
    • parallel processing
    • systems-on-chip
    • novel architectures
    • …?

Page 4: Advanced Processor Technologies group overview

Strategy/Vision

• Industry shift to multicore processors
  – directly addressed by our CMP work
• Power/heat is performance-limiting
  – asynchronous and low-power design have growing importance
• Timing closure is a critical problem
  – acceptance of mixed timing and GALS
• Design automation is vital
  – async automation must be competitive

Page 5: Advanced Processor Technologies group overview

Strategy/Vision

• Can university groups design state-of-the-art digital silicon?
  – probably not in conventional processors
  – few academic groups still fab digital chips
• Is trying to take designs through to fabrication still a good idea?
  – we believe so, because ‘reality’ matters!
  – but the game is very tough indeed

Page 6: Advanced Processor Technologies group overview

Many-core Architecture and Software

Mikel Lujan

Page 7: Advanced Processor Technologies group overview

Buying a single-core processor is difficult!

Multi-cores bring fundamental changes for Computer Science

[applications, programming languages, compilers, runtime systems (OS), computer architecture]

Page 8: Advanced Processor Technologies group overview

Active projects

• Managed Runtime Environments and Low-Power Many-core Architectures
  – DOME: Delaying and Overcoming Microprocessor Errors
• TeraFlux
  – On the search for a “good” parallel computational model
• AXLE
  – Accelerating Analytics of Big Data

Page 9: Advanced Processor Technologies group overview

Managed Runtime Environments

• Java and .NET are examples of managed runtime environments (JVM, CLR)
• Key elements: JIT compilation and control of memory allocation
• Research opportunities:
  – Scaling MREs for many-core architectures (GPUs)
  – Hardware acceleration of MREs
  – Use MREs for low-power computing
  – Use MREs for dealing with faults and transistor wearout -> DOME

Page 10: Advanced Processor Technologies group overview

TeraFlux Project

• Major focus of current ‘General Purpose’ Many-Core research.
• Three major goals
  – To define the hardware architecture of a highly extensible, general purpose multi-core system
  – To develop a simple to use parallel programming approach based on programming with
    • side-effect-free computations + transactions
  – How do we simulate/prototype many-core architectures?
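
The idea of combining side-effect-free computations with transactions can be pictured with a small sketch. The Python below is illustrative only and is not the TeraFlux programming model: `square` stands in for a side-effect-free computation, while `VersionedCell`, `add_transactionally` and `sum_of_squares` are hypothetical names for a minimal optimistic transaction that retries whenever another task has committed first.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class VersionedCell:
    """Shared value guarded by a version counter (hypothetical helper)."""

    def __init__(self, value):
        self._value = value
        self._version = 0
        self._lock = threading.Lock()  # only protects the tiny read/commit steps

    def read(self):
        with self._lock:
            return self._value, self._version

    def try_commit(self, new_value, expected_version):
        # Commit succeeds only if nobody else committed since our read.
        with self._lock:
            if self._version != expected_version:
                return False
            self._value = new_value
            self._version += 1
            return True


def square(x):
    """Side-effect-free computation: depends only on its argument."""
    return x * x


def add_transactionally(cell, amount):
    """Optimistic transaction: read, compute, commit, retry on conflict."""
    while True:
        value, version = cell.read()
        if cell.try_commit(value + amount, version):
            return


def sum_of_squares(numbers):
    total = VersionedCell(0)

    def task(x):
        add_transactionally(total, square(x))

    with ThreadPoolExecutor(max_workers=4) as pool:
        list(pool.map(task, numbers))  # force all tasks to finish
    return total.read()[0]
```

The pure part (`square`) can run in any order on any core; only the short commit touches shared state, which is the property the slide's assumptions about shared memory and locks are aiming at.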

Page 11: Advanced Processor Technologies group overview

Starting Assumptions

• Requiring strongly consistent shared memory is a major impediment to extensibility

• The efficient scheduling of control-flow based threads is hard

• The major complexity in parallel programming is the handling of shared state (locks etc.)

Page 12: Advanced Processor Technologies group overview

Simulate/Prototype many-core architectures

• Designing a chip is expensive and time consuming
• Computer architects build software models to simulate new architectures
• Simulation can be slow (months to run one application)
• How can we accelerate this process? Research opportunities
  – New modelling techniques
  – FPGA prototyping

Page 13: Advanced Processor Technologies group overview

AXLE & Big Data

• Collaboration with Dr. Gavin Brown (MLO group)
• Amount of data generated in scientific experiments or social web keeps growing!
• Graph-based data -> complex computation
• How can we make sense of this data deluge?
  – New Learning techniques capable of working at scale
  – Redesign architectures (clusters/data centres) and software for low power analytics
  – Accelerate software (JIT adaptation) for data processing
  – Hardware acceleration for low-power learning algorithms

Page 14: Advanced Processor Technologies group overview

For more background info

• "Future Multi-core Computing" (COMP6062b)
  – Learn by directed reading and group discussions of research papers
  – Practice parallel programming in the labs
• Watch out for the organised ARM & Intel school seminars in Nov and Dec

Page 15: Advanced Processor Technologies group overview

Communication Architectures

Javier Navaridas

Page 16: Advanced Processor Technologies group overview

Interconnection Networks

• On-chip networks
  – Tile-based systems
  – Heterogeneous systems
• High performance computing networks
  – Massively Parallel Processing systems
  – Compute Clusters
  – Datacentres

Page 17: Advanced Processor Technologies group overview

Topics

• Topologies
  – Routing
  – Wiring
  – Fault resilience
  – Deadlock avoidance
• Router microarchitecture
  – Congestion control
  – Quality of Service
  – Fault tolerance
• Scheduling and resource management
  – Task placement
• System and workload modelling
  – Analytical modelling
  – Simulation
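
As a concrete instance of routing with deadlock avoidance, here is a minimal sketch of dimension-order (XY) routing on a 2-D mesh. This is a standard textbook scheme chosen for illustration, not necessarily one the group uses; `xy_route` and the coordinate convention are assumptions. Because a packet finishes all of its X hops before taking any Y hop, it never turns from Y back to X, so cyclic channel dependencies cannot form on a mesh without wrap-around links.

```python
def xy_route(src, dst):
    """Return the list of (x, y) routers visited from src to dst on a 2-D mesh."""
    x, y = src
    path = [(x, y)]
    # Phase 1: move along the X dimension until the column matches.
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    # Phase 2: move along the Y dimension until the row matches.
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


print(xy_route((0, 0), (2, 1)))
```

The route is deterministic and minimal, which makes it easy to analyse but unable to steer around congestion or faults; adaptive schemes trade that simplicity for resilience.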

Page 18: Advanced Processor Technologies group overview

Virtualization

Alasdair Rawsthorne

Page 19: Advanced Processor Technologies group overview

Unifying System and Process Virtualization

• Potential benefits: performance, power, design time, security
• Impacts design of future compilers, OS, CPU and runtimes

[Figure: four software stacks compared]
• Unvirtualized: Application, Operating System, CPU
• System Virtualization (e.g. Xen, VMware, VirtualBox): Application, Operating System, Hypervisor/VMM, CPU
• Process Virtualization (e.g. JVM, Rosetta, DynamoRIO, Valgrind): Application, Dynamic Runtime, Operating System, CPU
• Unified Virtualization: Application, Operating System, Optimizing VMM, CPU

Page 20: Advanced Processor Technologies group overview

Neural Systems Engineering

Steve Furber, Jim Garside, Dave Lester

Page 21: Advanced Processor Technologies group overview

The SpiNNaker project

• Multi-core CPU node
  – 18 ARM968 processors
  – to model large-scale systems of spiking neurons
  – in biological real time
• Scalable up to systems with 10,000s of nodes
  – over a million processors
  – >10^8 MIPS total

Page 22: Advanced Processor Technologies group overview

Current status…

• Full 18-core chip: arrived 20 May 2011
• Test card: 4 chips, 72 processors
  – Cards can be linked together
• Neuron models: LIF, Izhikevich, MLP
• Synapse models: STDP, NMDA
• Networks: PyNN -> SpiNNaker, various small tools to build Router tables, etc
• 48-chip 10^3 machine

…and the next steps:

• 500-chip 10^4 machine (Q4 2012), 5,000-chip 10^5 machine (H1 2013), 50,000-chip 10^6 machine (H2 2013).
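
The machine sizes can be sanity-checked with a little arithmetic. The sketch below assumes that the power-of-ten labels (read here as 10^3 to 10^6) are round-number processor counts and that every chip contributes all 18 cores; `total_cores` and the `machines` table are illustrative names, not part of any SpiNNaker tool.

```python
CORES_PER_CHIP = 18  # from the slides: full 18-core chip

# chips -> power of ten quoted on the slide (assumed to count processors)
machines = {48: 3, 500: 4, 5_000: 5, 50_000: 6}


def total_cores(chips):
    """Processor count for a machine built from `chips` SpiNNaker chips."""
    return chips * CORES_PER_CHIP


for chips, exponent in machines.items():
    print(f"{chips:>6} chips -> {total_cores(chips):>7} cores (quoted as 10^{exponent})")
```

Under these assumptions the four machines come to 864, 9,000, 90,000 and 900,000 processors, each within rounding of its quoted power of ten, and the 4-chip test card gives the 72 processors stated above.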

Page 23: Advanced Processor Technologies group overview

PhD projects

• Recent:
  – SpiNNaker monitoring
  – PyNN -> SpiNNaker
  – Real-time neural learning algorithms
  – Modelling the rat barrel cortex
  – Technology scaling on SpiNNaker
  – Error correction with CRC

Page 24: Advanced Processor Technologies group overview

Technology Scaling

• 90nm SpiNNaker CPU node
• SP library is faster
• requires 128k DTCM
• LL library better overall?

(work by Eustace Painkras, UoM PhD)

Page 25: Advanced Processor Technologies group overview

PyNN -> SpiNN

• LIF

• Izhikevich
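
For readers unfamiliar with these neuron models, a leaky integrate-and-fire (LIF) neuron can be sketched in a few lines. This is a generic forward-Euler model for illustration, not SpiNNaker or PyNN code; the function name `simulate_lif` and all parameter values (time constant, rest, threshold and reset potentials) are assumptions.

```python
def simulate_lif(drive_mv, duration_ms=200.0, dt_ms=0.1,
                 tau_ms=20.0, v_rest=-65.0, v_thresh=-50.0, v_reset=-65.0):
    """Return the spike times (ms) of a leaky integrate-and-fire neuron.

    drive_mv is a constant input expressed as the voltage shift it would
    cause at steady state (input current times membrane resistance).
    """
    v = v_rest
    spikes = []
    steps = round(duration_ms / dt_ms)
    for step in range(steps):
        # Forward-Euler step of: tau * dv/dt = -(v - v_rest) + drive
        v += dt_ms * (-(v - v_rest) + drive_mv) / tau_ms
        if v >= v_thresh:
            spikes.append(step * dt_ms)  # record the spike time
            v = v_reset                  # and reset the membrane potential
    return spikes


print(len(simulate_lif(20.0)), "spikes with a 20 mV drive")
print(len(simulate_lif(10.0)), "spikes with a 10 mV drive")
```

With these values the membrane settles 10 mV above rest for the weaker input, which stays below the 15 mV gap to threshold, so only the stronger input produces spikes.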

Page 26: Advanced Processor Technologies group overview

PhD projects

• Future:
  – System software
    • run-time fault-tolerance, scaling, …
  – SpiNNaker2 architecture exploration
  – Neural network models
    • learning algorithms, rewiring
  – Robotics using SpiNNaker
  – Non-neural algorithms
    • graphics, physics modelling, …

Page 27: Advanced Processor Technologies group overview

Emerging Technologies for Integrated Circuits and Systems

Let’s do some hard(ware) work

Vasilis Pavlidis
www.cs.man.ac.uk/~pavlidiv

Page 28: Advanced Processor Technologies group overview

3-D Integration Opportunities

• The same total area for the two circuits
• R_TSV = 170 mΩ, C_TSV = 2 fF
• *RCs for 65 nm, delay improvement: 54%
• Integrate disparate technologies/components

[Figure: 2-D global wire of 20 mm compared with 3-D global wire of 12 mm]

* “ASU Predictive Technology Model.” [Online]. Available: http://www.eas.asu.edu/~ptm/

Page 29: Advanced Processor Technologies group overview

Three-Dimensional (3-D) Integrated Circuits and Systems

• Develop design methodologies for 3-D ICs
• New models are required to consider the third physical dimension
• Diverse technologies
  – SiP, interposer, TSVs
• Many challenges exist down the road!!!
  – Be the first to address them
• Opportunities to tape-out do exist!
  – CMP/Tezzaron - cmp.imag.fr
  – Cadence PDK - 3-D Encounter

[Figure label: Xilinx FPGA Virtex 7]

Page 30: Advanced Processor Technologies group overview

A New Circuit Design Paradigm (Safe Projects)

• (Re-)Design and assess SpiNNaker-based 3-D architectures
  – Power, area, performance, cost/yield
  – Interposer and TSV technologies
• Research methodology
  – Use available resources
  – Differentiate only where required
• Other topics
  – Can resonance improve energy efficiency of GALS based architectures?
  – Design for manufacturability for GALS systems 2-D/3-D
    • Considering process, voltage, and temperature (PVT) variations
    • PVT behavior is substantially different in 3-D systems
• Develop/extend CAD tools for the physical design of 3-D systems
  – Special focus on interposer technologies

Page 31: Advanced Processor Technologies group overview

3-D Integration as a System Integration Approach (High-Return Projects)

• Heterogeneous 3-D integration
  – Preached a lot but not explored (at all)!
• Memory on logic is a single application
• Develop techniques and methods for “Mix-and-Match” systems
  – How do you model…?
  – How do you evaluate…?
  – How do you integrate…?
  – How do you manufacture…?
• The physical proximity of diverse systems may not come for free!

Interdisciplinary research is a prerequisite for such systems

Rather application driven

Page 32: Advanced Processor Technologies group overview

PhD Guidelines

• Persistence, Persistence, Persistence!
• Manage rejection
• Be there early!
• Citations are worth more than publications
• Presentation and writing skills

PhD is NOT an end in itself but a means to an end!

Page 33: Advanced Processor Technologies group overview

Asynchronous Logic Design Tools

[Doug Edwards,] Jim Garside, Steve Furber, Alasdair Rawsthorne

Page 34: Advanced Processor Technologies group overview

Previous Projects

• Balsa
  – world-leading public asynchronous synthesis tool
  – used for complete microprocessors
• SEDATE
  – delay-insensitive datapath synthesis
• GALSA
  – framework for heterogeneous GALS
• ...

Page 35: Advanced Processor Technologies group overview

GAELS

• Globally Asynchronous Elastic Logic Synthesis
  – modern SoCs comprise numerous, semi-autonomous subsystems
  – shrinking transistors have hard-to-predict variations
• Address using Elastic Logic
  – new, delay tolerant paradigm
  – new project!

Page 36: Advanced Processor Technologies group overview

Reconfigurable Processing

Jim Garside

Page 37: Advanced Processor Technologies group overview

Current Computing

• Energy use is a problem
• Software
  – offers processing flexibility
  – highly inefficient – big overheads
• Hardware
  – limited programmability
  – greater efficiency
  – expensive to develop

Page 38: Advanced Processor Technologies group overview

A Solution?

• Compile an algorithm into a mixture of hardware and software
  – how to partition the 'code'?
  – dynamic adaptation
• Existing solutions tend towards static partitioning
  – require wide skills from developers
  – sacrifice potential flexibility
  – intolerant of differing hardware

Page 39: Advanced Processor Technologies group overview

Dynamic Reconfiguration

• Keep algorithm in common 'object' format

• Identify, 'compile' and run repeating sections in available hardware

• Adapt to facilities of any given chip – allow for future portability
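
One simple way to identify repeating sections at run time is to count backward branches in an execution trace, a heuristic widely used by trace-based JIT compilers. The sketch below illustrates that heuristic only and is not the group's tool flow; `find_hot_loops`, the trace format (a list of program-counter values) and the threshold are assumptions.

```python
from collections import Counter


def find_hot_loops(pc_trace, threshold=3):
    """Return loop-head addresses reached by a backward jump at least `threshold` times."""
    back_edges = Counter()
    for prev_pc, next_pc in zip(pc_trace, pc_trace[1:]):
        if next_pc <= prev_pc:  # control moved backwards: likely a loop back-edge
            back_edges[next_pc] += 1
    return sorted(head for head, count in back_edges.items() if count >= threshold)


# A loop body at addresses 1..3 that executes four times, then falls through.
trace = [0, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 4]
print(find_hot_loops(trace))
```

Addresses returned by such a profiler would be the candidates for 'compiling' into whatever reconfigurable hardware the chip happens to provide; a real system would also need to filter out backward jumps caused by function calls and returns.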

Page 40: Advanced Processor Technologies group overview

To date ...

• Can identify critical loops and recompile them to hardware
  – using pre-existing code
• Developing tool flow
• Have reasonable reconfigurable hardware architecture

Results

• Promising – not 'earth shattering'

Page 41: Advanced Processor Technologies group overview

Future

• Want:
• Means of expressing algorithms allowing easy compilation into software or hardware
• Extract/exploit sensible parallelism
  – 'fine grain' for hardware
  – 'coarse grain' (?) for software
• Get (some of) the available speed/power efficiency

Page 42: Advanced Processor Technologies group overview

Mobile Systems Architecture

Nick Filer with help from Barry Cheetham

Page 43: Advanced Processor Technologies group overview

Nick Filer

• Interests:
  – Wireless networks of all types. Mainly:
    • Ad-hoc,
    • Voice over IP,
    • Sensors (data collection),
    • Pocket networks (e.g. mobile phones, PDAs),
    • Information dissemination.
  – Supported by:
    • Simulation, analysis, software generation tools.
  – eLearning tools for science.

Page 44: Advanced Processor Technologies group overview

Current Interest - 1

• Pocket Networks
  – Based on clusters of mobile users.
  – Person to person transport.
  – What applications are useful, will work, when and how will applications work?
    • Voice?
    • Video?
    • Delay tolerant text messages?

Page 45: Advanced Processor Technologies group overview

Current Interest - 2

• Low power Wireless Sensor Networks
  – Algorithms for reduced power usage, mainly getting it low by design.
  – Intelligent transport/routing protocols driving low power packet routing.
  – Smart dust:
    • Current cost $100+, needs to be cheaper.
    • Ultra-low power (NEW): processor, memory, design.
    • Nano scale. E.g. for use down oil wells!

Page 46: Advanced Processor Technologies group overview

Current Interest - 3

• Hand-over in mobile wireless networks.
  – Pretty much solved problem (even if not always ideal) for mobile phones.
  – Close to solutions for WiFi, WiMAX, Bluetooth, Zigbee etc. Still lots to learn though.
  – Currently 3 layer hierarchy – infrastructure, Wide Area, Personal Area.
  – What happens with more layers?
    • Macro scale to nano scale?
    • Fixed infrastructure interacting with mobile autonomous agents?
    • Just how inefficient are these mechanisms currently?

Page 47: Advanced Processor Technologies group overview

Current Interest - 4

• Information dissemination in mobile ad-hoc networks.
  – P2P technologies.
  – P2P optimization for task, availability, handover, low energy, access latency…
  – P2P to aid DNS like queries (information retrieval) in mobile, changing topology networks.
  – Delay tolerant P2P. Opportunistic communications e.g. send 100,000 sensors down an oil well, get 1 back, what does it know? Own data, others' data?
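
The oil-well question can be made concrete with a toy model of opportunistic exchange. The sketch below assumes the simplest possible policy, in which two sensors that come into contact copy everything they hold to each other; `exchange_on_contact` and the contact list are hypothetical, and real delay-tolerant protocols must also budget energy, storage and bandwidth.

```python
def exchange_on_contact(num_sensors, contacts):
    """Each sensor starts with only its own reading; every contact merges both stores."""
    known = [{i} for i in range(num_sensors)]
    for a, b in contacts:
        merged = known[a] | known[b]
        known[a] = merged
        known[b] = set(merged)  # separate copy so later merges stay independent
    return known


# Five sensors and four chance meetings, in time order.
contacts = [(0, 1), (1, 2), (3, 4), (2, 3)]
known = exchange_on_contact(5, contacts)
print(sorted(known[3]), sorted(known[0]))
```

In this run sensor 3 ends up carrying all five readings, while sensor 0 holds only its own and sensor 1's, so what a single recovered sensor "knows" depends heavily on which one comes back and on the order of contacts.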

Page 48: Advanced Processor Technologies group overview

Current Interest - 5

• Real time distributed systems (sound and video)
  – Internet choir
    • Very tight audio constraints (max 50ms)
    • Demands of latency & bandwidth
  – Singing together
    • Less constrained internet choir but synchronization very difficult.
  – Broadcast simulcasts
    • Mixed video and sound from various locations.
    • Broadcast over multiple media types with different delay etc. characteristics.
  – Major Obstacles:
    • Media types and standards, protocols, congestion, error handling, signal processing, links to hand-over problems ....

Joint with Barry Cheetham

Page 49: Advanced Processor Technologies group overview

Current Interest - 6

• Support for adaptable network stacks
  – Writing or changing software is time consuming, error prone, …
  – Models can capture semantics of software: Purpose, usage, transformation knowledge ...
  – Hence: Use models to generate implementations.
    • Use in teaching/learning, simulation, network stack implementation.

Page 50: Advanced Processor Technologies group overview

Current Interest - 7

• eLearning for Complex Systems
  – Most eLearning tools you have seen are not much more than Content Management Systems.
  – There is currently little or no evidence they improve student grades!
  – We have on-going work looking at improving understanding of wireless systems.
  – Also, interested in science teaching for awkward adolescents.

Joint with Barry Cheetham

Page 51: Advanced Processor Technologies group overview

Arithmetic and Control Theory

Dave Lester

Page 52: Advanced Processor Technologies group overview

Arithmetic and Control Theory

• Exact Arithmetic
  – NASA/Boeing
• Correctness of Control Theory Applications
  – Airbus
• Formalisation and Mechanisation of Probabilistic Reasoning
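
To see why exact arithmetic matters, compare binary floating point with exact rationals. The sketch below uses Python's standard `fractions.Fraction` as a simple stand-in; it is not the group's exact real arithmetic, which would also have to handle irrational values that rationals cannot represent. The function names are illustrative.

```python
from fractions import Fraction


def float_sum():
    """0.1 + 0.2 in IEEE 754 double precision (inexact)."""
    return 0.1 + 0.2


def exact_sum():
    """The same sum carried out on exact rationals."""
    return Fraction(1, 10) + Fraction(2, 10)


print(float_sum() == 0.3)              # rounding error makes this comparison fail
print(exact_sum() == Fraction(3, 10))  # exact rationals compare as expected
```

In control software, small discrepancies like this can accumulate over many iterations or flip a comparison against a threshold, which is one reason correctness arguments tend to avoid unverified floating-point reasoning.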