ARM 2007 [email protected]
Chapter 15
The Future of the Architecture
by John Rayfield
Optimization Technique in Embedded System (ARM)[email protected], 2008 April
Overview
• In 1999, ARM began planning the future of the architecture.
– What should the future direction of the architecture be?
– This planning resulted in ARMv6.
» First implemented in the ARM1136J-S.
• Challenges for the future
– DSP and video processing for consumer electronics (CE) devices
– Mixed little- and big-endian data, e.g. for TCP/IP
– Synchronization methods for multiprocessor systems
– Power efficiency (computation per mW)
• The future after ARMv6
– ARM TrustZone
15.1 Advanced DSP & SIMD support in ARMv6
• SIMD
– Advantages: code density and low power (fewer instructions, less time).
– The price of this efficiency: reduced flexibility.
• Lightweight SIMD
– Slices the existing 32-bit datapath into four 8-bit or two 16-bit lanes.
» So the peak speedup is 2x (16-bit) or 4x (8-bit).
• ARMv6 includes this “lightweight” SIMD.
– SADD8, UADD8, etc.
– SADD16, UADD16, etc.
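The lane slicing described above can be sketched in portable C. This is only a behavioral model, not how a compiler emits the instructions: the function names `uadd8_model` and `uadd16_model` are made up for illustration, and the GE flags that the real instructions also update are left out.

```c
#include <stdint.h>

/* Model of UADD8: four independent 8-bit additions in one 32-bit word.
   Each lane wraps modulo 256; a carry never crosses into the next lane.
   (Assumption: the GE flag side effects are ignored in this sketch.) */
uint32_t uadd8_model(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    for (unsigned lane = 0; lane < 4; lane++) {
        unsigned shift = 8u * lane;
        uint32_t sum = ((a >> shift) & 0xFFu) + ((b >> shift) & 0xFFu);
        result |= (sum & 0xFFu) << shift;   /* drop the carry out of the lane */
    }
    return result;
}

/* Model of UADD16: two independent 16-bit additions, each wrapping modulo 65536. */
uint32_t uadd16_model(uint32_t a, uint32_t b)
{
    uint32_t lo = ((a & 0xFFFFu) + (b & 0xFFFFu)) & 0xFFFFu;
    uint32_t hi = ((a >> 16) + (b >> 16)) & 0xFFFFu;
    return (hi << 16) | lo;
}
```

The loop in `uadd8_model` does in four steps what the hardware does in one instruction, which is where the 4x (8-bit) and 2x (16-bit) figures come from.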
ARMv6 Instructions
• SIMD arithmetic instructions
• Pack instructions
– PKHTB Rd, Rn, Rm // pack halves of Rn, Rm into Rd
– PKHBT
• Complex arithmetic instruction
– SMUSD Rt, Ra, Rb // Rt = Ra(R)*Rb(R) - Ra(i)*Rb(i)
• Cryptographic multiplication
– UMAAL Rl, Rh, Rm, Rs // Rh:Rl = Rm*Rs + Rh + Rl
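As a rough guide to what these instructions compute, here is a C model of each. The function names are hypothetical, the optional shift operand of the pack instructions is omitted, and the complex-number reading of SMUSD assumes the real part sits in the bottom halfword and the imaginary part in the top halfword.

```c
#include <stdint.h>

/* Sign-extend the low 16 bits of h without relying on
   implementation-defined narrowing conversions. */
static int32_t sext16(uint32_t h)
{
    return (int32_t)((h & 0xFFFFu) ^ 0x8000u) - 0x8000;
}

/* PKHBT Rd, Rn, Rm (no shift): bottom half from Rn, top half from Rm. */
uint32_t pkhbt_model(uint32_t rn, uint32_t rm)
{
    return (rn & 0x0000FFFFu) | (rm & 0xFFFF0000u);
}

/* PKHTB Rd, Rn, Rm (no shift): top half from Rn, bottom half from Rm. */
uint32_t pkhtb_model(uint32_t rn, uint32_t rm)
{
    return (rn & 0xFFFF0000u) | (rm & 0x0000FFFFu);
}

/* SMUSD: signed bottom*bottom minus signed top*top.
   With real parts in the bottom halves (an assumed layout), this is
   the real part of a complex multiplication. */
int32_t smusd_model(uint32_t ra, uint32_t rb)
{
    return sext16(ra) * sext16(rb) - sext16(ra >> 16) * sext16(rb >> 16);
}

/* UMAAL: Rh:Rl = Rm*Rs + Rh + Rl, all unsigned.
   The 64-bit result cannot overflow: (2^32-1)^2 + 2*(2^32-1) = 2^64-1. */
void umaal_model(uint32_t *rl, uint32_t *rh, uint32_t rm, uint32_t rs)
{
    uint64_t acc = (uint64_t)rm * rs + *rh + *rl;
    *rl = (uint32_t)acc;
    *rh = (uint32_t)(acc >> 32);
}
```

The no-overflow property of UMAAL is what makes it convenient as the inner step of multi-word multiplication, the kind of arithmetic used in public-key cryptography.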
15.2 System support additions to ARMv6
• Set the current data endianness
– SETEND <spec>
» // spec = BE or LE
• Reverse the byte order of a word
– REV Rd, Rm
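What REV does to a register can be modeled in a few lines of C. The name `rev_model` is made up for this sketch; on a little-endian core, this is the same transformation needed to read a big-endian (network order) 32-bit field.

```c
#include <stdint.h>

/* Model of REV Rd, Rm: reverse the four bytes of a 32-bit word,
   so byte 0 swaps with byte 3 and byte 1 swaps with byte 2. */
uint32_t rev_model(uint32_t x)
{
    return (x >> 24)
         | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u)
         | (x << 24);
}
```

Applying the function twice returns the original value, which is why the same instruction serves for both directions of an endianness conversion.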
15.2.2 Exception Processing
• ARMv6 adds instructions that let an OS save the return state of an interrupt or exception on a stack more efficiently.
15.2.3 Multiprocessing Synchronization Primitives
• System-on-Chip (SoC) architectures have become more sophisticated.
– ARM cores are now often found in devices with many processing units that compete for shared resources.
Atomic Sync
• Previously, the SWP instruction was used to keep semaphores coherent.
– But SWP is a bottleneck: it is a blocking instruction that locks the bus for the duration of the swap, so a processor spinning on a lock stalls the others.
• LDREX/STREX in ARMv6
– Rely on a system monitor in the memory system.
– LDREX loads a value from M[x] into Rn, on the assumption that it will not change while it is in use.
– STREX stores a value into M[x]; its return value indicates whether M[x] was modified between the preceding LDREX and this STREX (so STREX may fail).
– Multiple readers, exclusive writer.
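The load, modify, conditional-store, retry pattern above can be sketched with C11 atomics. This is an illustration under assumptions, not ARM's reference code: the function names are hypothetical, and the claim that a compiler lowers the weak compare-exchange to an LDREX/STREX loop holds for typical ARMv6 and later targets but is not guaranteed by the C standard.

```c
#include <stdatomic.h>

/* Atomically add 1 to *counter and return the new value.
   The initial load is conceptually the LDREX step; the weak
   compare-exchange is the STREX step and, like STREX, may fail. */
unsigned atomic_increment(atomic_uint *counter)
{
    unsigned old = atomic_load(counter);
    while (!atomic_compare_exchange_weak(counter, &old, old + 1u)) {
        /* The store failed; 'old' now holds the freshly observed
           value, so simply retry with it. */
    }
    return old + 1u;
}

/* Minimal spin lock: 0 = free, 1 = held. */
void spin_lock(atomic_uint *lock)
{
    unsigned expected = 0u;
    while (!atomic_compare_exchange_weak(lock, &expected, 1u)) {
        expected = 0u;   /* the failed exchange overwrote it; reset and retry */
    }
}

void spin_unlock(atomic_uint *lock)
{
    atomic_store(lock, 0u);
}
```

Unlike SWP, a failed attempt here does not hold the bus: other processors keep reading the location, and only the conditional store is arbitrated.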
Organization of ARMv6
• Most sophisticated ARM pipeline
– 8 stages, with separate pipelines for load/store and multiply/accumulate.
• Hit-under-N-miss
– Parallel load/store unit (LSU)
– Decouples pipeline execution from the completion of loads and stores.
• Physical cache (instead of a virtual cache)
– Reduces cache flushing on context switches.
– Furthermore, it saves the power consumed by memory accesses (up to ~20% improvement).
15.4 Future Technologies beyond ARMv6
• In 2003, ARM made further technology announcements including TrustZone and Thumb-2.
15.4.1 TrustZone
• TrustZone is an architecture extension
– First introduced in the ARM1176JZ-S.
• Reason
– Operating systems are now so complex that it is very hard to verify the security and correctness of the software.
– The ARM solution is to add new operating “states” in which only a small, verifiable software kernel runs; this kernel provides services to the larger OS.
– The microprocessor core then takes a role in controlling system peripherals that may be available only in the secure “state”, through new signals exported on the bus interface.
• TrustZone is most useful in devices that carry out content downloads, such as cell phones or other portable devices with network connections.
15.4.2 Thumb-2
• Thumb-2 is an architecture extension
– Designed to increase performance at high code density.
– It allows a blend of 32-bit ARM-like instructions with 16-bit Thumb instructions.
• Thumb-2 was announced in October 2003.
– It will be implemented in the ARM1156T2-S.
– Details were not public at the time of writing.
Summary
• The ARM architecture is not static.
– It is being developed and improved to suit the applications required by today’s consumer devices.
– Although ARMv5TE was very successful at adding some DSP support to ARM, ARMv6 extends that DSP support and adds support for large multiprocessor systems.
• ARM still concentrates on one of its key benefits, code density, and has recently announced Thumb-2.
• The new focus on security with TrustZone gives ARM a lead in this area.