Dr. Rajveer S Shekhawat GM, New Products Development, Secure Meters Ltd

Architectures for mobile computing dec12


Performance and Power Challenges to Mobile Computing
- HD video playback
- Streaming audio & video
- 3D gaming
- 3D interfaces
- Web browsing
- Multiple applications
- Location-based services (maps and satellite images)

Single-core processors are therefore being replaced by multicore processors to meet the above requirements.

04/07/15 CMCTAR2012 2

Predictions


Major Mobile Devices using Multicore Processors (MCPs)
- Smart phones
- PDAs
- Tablets
- Laptops
- Game stations
- Vehicle navigation systems


Performance Challenges
Multi-core architectures with high integration of peripherals are needed to deliver ever-increasing performance. The likely peripherals are:
- Graphics/image/video
- Voice/speech
- Intelligent keys/trackballs
- 3D motion
- GPS
- Communication (Bluetooth, WiFi, IR, GSM/UMTS)


Challenge Areas
Hardware
- Architecture
- Computation
- Speed
- Power
- Complexity
- Graphics
- Speech
Software
- System programming
- Application programming
- User interface


Cost Optimization


Power Optimization


Parallel Programming
Multi-core architectures can increase computational power while consuming less power than a single CPU pushed to a higher clock frequency. However, to make the best use of them, we need to write efficient parallel programs for both system and application programming. This area is still evolving and needs better programming tools to support fast, accurate and efficient programs.

Multi-core processors can have two configurations:
- Symmetric multiprocessing (SMP)
- Asymmetric multiprocessing (ASMP)
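
As an illustration of what writing a parallel program involves, the sketch below splits one computation into independent chunks, hands each chunk to a worker, and combines the partial results. The function names (chunked, parallel_sum_of_squares) and the sample data are hypothetical and chosen only for this example. Note that CPython threads do not run CPU-bound work on several cores at once, so this shows the decomposition pattern rather than a measured speed-up.

```python
from concurrent.futures import ThreadPoolExecutor
import os


def chunked(data, n_chunks):
    """Split data into n_chunks contiguous slices of near-equal size."""
    size, extra = divmod(len(data), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional element each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def parallel_sum_of_squares(data, workers=None):
    """Sum x*x over data: one chunk per worker, then combine the partial sums."""
    workers = workers or os.cpu_count() or 1
    parts = chunked(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(lambda part: sum(x * x for x in part), parts))
    return sum(partial_sums)


samples = list(range(1000))
result = parallel_sum_of_squares(samples, workers=4)
```

The same split, compute, combine structure carries over to native threads or processes on a multi-core device; only the worker mechanism changes.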


Symmetric Multiprocessing
- SMP architecture consists of two or more identical CPU cores.
- All cores share a common system memory and are controlled by a single operating system.
- Each CPU is capable of operating independently on different workloads and, whenever possible, is also capable of sharing workloads with the other CPU.
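
The workload sharing described above can be sketched as a toy simulation: a single operating system holds one shared queue of tasks and dispatches each task to whichever identical core becomes free first. The function smp_schedule and the task durations are assumptions made for illustration, not a model of any particular kernel scheduler.

```python
import heapq


def smp_schedule(task_times, n_cores=2):
    """Greedy dispatch from one shared queue to identical cores.

    Returns (makespan, assignment), where assignment[i] lists the task
    durations that ran on core i.
    """
    # Heap entries are (time at which the core becomes free, core id).
    cores = [(0.0, core_id) for core_id in range(n_cores)]
    heapq.heapify(cores)
    assignment = [[] for _ in range(n_cores)]
    for duration in task_times:
        free_at, core_id = heapq.heappop(cores)   # earliest-free core
        assignment[core_id].append(duration)
        heapq.heappush(cores, (free_at + duration, core_id))
    makespan = max(free_at for free_at, _ in cores)
    return makespan, assignment


tasks = [4, 3, 2, 2, 1]                      # hypothetical task durations
one_core, _ = smp_schedule(tasks, n_cores=1)
two_cores, plan = smp_schedule(tasks, n_cores=2)
```

With these durations a single core needs 12 time units, while two cores finish in 6 because the queue happens to split evenly; less balanced task sets give a smaller gain.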


Example

NVIDIA Tegra 2 and Tegra 3


Architectural Features of Tegra 2


Dynamic length 8-stage pipeline supporting speculative out-of-order execution. This allows the processor to dynamically reorder instructions to improve performance by avoiding stalls due to instruction latencies and resource conflicts. Older-generation Cortex-A8 processors use an in-order pipeline and are unable to avoid the penalties that arise from branching and cache misses.

Support for speculative branch prediction to avoid branching penalties.

A Dual-core Symmetrical Multiprocessing (SMP) configuration operating either independently, or in lockstep to deliver peak performance when needed, and consuming almost zero power when idle.

32KB Instruction cache and 32KB Data cache per core with both cores sharing a common 1MB L2 Cache. The 1MB L2 cache is large enough to load an entire browser memory footprint into cache to provide a faster Web browsing experience.

CPU cores that are optimized to operate at a frequency of one Gigahertz with the ability to scale up to even higher frequencies. The two cores are assisted by a common snoop control unit that enforces coherency between the cores and manages the common 1MB L2 cache shared by the two cores.
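
To make the role of a snoop control unit concrete, the following is a deliberately simplified sketch of coherency between two private caches: when one core writes an address, every other cache drops its copy, so a later read refetches the new value. The classes SnoopBus and PrivateCache are hypothetical, and the write-through, invalidate-on-write scheme is an assumption chosen for brevity; it is not the protocol the Tegra 2 hardware implements.

```python
class SnoopBus:
    """Toy stand-in for a snoop control unit shared by all cores."""

    def __init__(self):
        self.memory = {}   # shared backing store: address -> value
        self.caches = []   # every private cache attached to the bus

    def attach(self, cache):
        self.caches.append(cache)

    def broadcast_invalidate(self, writer, address):
        # Every other cache drops its copy so it cannot serve stale data.
        for cache in self.caches:
            if cache is not writer:
                cache.lines.pop(address, None)


class PrivateCache:
    """Per-core cache using write-through plus invalidate-on-write."""

    def __init__(self, bus):
        self.lines = {}
        self.bus = bus
        bus.attach(self)

    def read(self, address):
        if address not in self.lines:            # miss: fetch from shared memory
            self.lines[address] = self.bus.memory.get(address, 0)
        return self.lines[address]

    def write(self, address, value):
        self.bus.broadcast_invalidate(self, address)
        self.lines[address] = value
        self.bus.memory[address] = value         # write-through keeps memory current


bus = SnoopBus()
core0, core1 = PrivateCache(bus), PrivateCache(bus)
first_read = core1.read(0x40)    # core1 caches address 0x40 (initially 0)
core0.read(0x40)                 # core0 caches it too
core0.write(0x40, 7)             # core1's copy is invalidated here
second_read = core1.read(0x40)   # core1 misses and refetches the new value
```

Without the invalidation step, core1 would keep returning its stale cached 0 after core0's write, which is the inconsistency a coherency mechanism exists to prevent.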


Intelligent Power Management
Long battery life along with high computing power is only feasible if we can use multi-core architectures with low power consumption. A popular technique is Dynamic Voltage and Frequency Scaling (DVFS). Here the voltages (both supply and threshold) can be reduced for lower-power operation. Further, the frequency of operation can also be scaled down. However, to keep the execution timing of tasks intact, multitasking/multi-threading can be used. There are appropriate scheduling algorithms for multi-cores.
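
The benefit of scaling voltage and frequency can be estimated with the standard CMOS switching-power model, P = a * C * V^2 * f. The sketch below compares one core at full speed with two cores at half the frequency and a lower supply voltage, which together offer the same nominal throughput if the work parallelizes perfectly. The capacitance, voltages and frequencies are illustrative assumptions, not measurements of any device, and leakage power is ignored.

```python
def dynamic_power(capacitance, voltage, frequency, activity=1.0):
    """CMOS switching-power model: P = activity * C * V^2 * f (watts)."""
    return activity * capacitance * voltage ** 2 * frequency


C_EFF = 1e-9  # hypothetical effective switched capacitance per core, in farads

# One core at 1 GHz and 1.1 V (assumed operating point).
single_fast = dynamic_power(C_EFF, voltage=1.1, frequency=1.0e9)

# Two cores at 500 MHz each; the lower clock is assumed to permit 0.8 V.
dual_slow = 2 * dynamic_power(C_EFF, voltage=0.8, frequency=0.5e9)

saving = 1 - dual_slow / single_fast   # fraction of dynamic power saved
```

Under these assumed numbers the single core draws about 1.21 W and the dual-core configuration about 0.64 W. The saving comes entirely from the squared voltage term, since the total switching rate (cores times frequency) is the same in both cases.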


Tegra 3 from Nvidia (vSMP)


Renesas Dual Core (EMMA)


EMMA Features
- It is an application processor for smart mobiles.
- It has two ARM Cortex-A9 cores with two NEON extensions.
- It has an integrated audio/video engine.
- A 3D graphics block
- A number of communication interfaces
- It uses a hardware accelerator for HD-quality decoding.
- It consumes minimal power.


Expectations (PwC report)


Parallel Programming


Common Programming Environments


Thanks
