damian-hunt
Chapter 3-1: ARM ISA
ARM Instruction Set Architecture
Next Lecture
  ARM program examples
ARM processors
Used in low-power and low-cost embedded applications
  Cell phones, PDAs, modems
Various simulation models available for embedded system design as well as low-power design
Support both big-endian and little-endian
All arithmetic and logic instructions operate only on data in processor registers
Pipelining: 3 or 5 stages
  Instruction Fetch (IF), Decode (ID), Execute (EX), Memory Access (Mem) and Write-back (WB)
http://www.heyrick.co.uk/assembler/
33 39v10 The ARM Architecture
Data Sizes and Instruction Sets
The ARM is a 32-bit RISC architecture.
When used in relation to the ARM:
  Byte means 8 bits
  Halfword means 16 bits (two bytes)
  Word means 32 bits (four bytes)
Most ARMs implement two instruction sets
  32-bit ARM Instruction Set
  16-bit Thumb Instruction Set (for tiny systems)
Jazelle cores can also execute Java bytecode
The ARM Register Set
[Figure: the current visible registers and the banked-out registers in each processor mode (User, FIQ, IRQ, SVC, Undef, Abort)]
User mode: r0-r12, r13 (sp), r14 (lr), r15 (pc), cpsr
FIQ mode: banked r8-r12, r13 (sp), r14 (lr) and spsr; r0-r7, r15 (pc) and cpsr are shared with User mode
IRQ, SVC, Undef and Abort modes: each has its own banked r13 (sp), r14 (lr) and spsr; r0-r12, r15 (pc) and cpsr are shared with User mode
ARM Instruction Set
Registers
  15 general purpose registers (R0-R14), 32 bits wide
  R15 is Program Counter (PC)
  R14 is used as Link Register (LR), R13 is Stack Pointer (SP)
  Status Register (CPSR) holds the condition flags (N, Z, C, and V), the interrupt disable bits and processor mode bits
  There are 15 additional general purpose registers called the banked registers, which are used when the processor switches into Supervisor or Interrupt modes
CPSR
  Bits 31-28: N Z C V (condition flags)
  Bits 7-6: Interrupt disable bits
  Bits 4-0: Processor mode bits
ARM Instructions
Each instruction is encoded into 32 bits
Access to memory is through load and store only
  In a load, the operand is transferred into the register named in the Rd field
  In a store, the operand is transferred from Rd into memory
  If the operand is a byte, it is always located in the low-order byte position of the register, and on a load the higher-order bytes are filled with zeros.
Field:  Condition | OP code | Rn     | Rd     | Other info | Rm
Bits:   31-28     | 27-20   | 19-16  | 15-12  | 11-4       | 3-0
Width:  4 bits    | 8 bits  | 4 bits | 4 bits | 8 bits     | 4 bits
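The field boundaries in this format can be checked with a little bit arithmetic. The following is a minimal Python sketch, not part of the original slides: the function name `decode_fields` is hypothetical, and the sample word 0xE0810002 is assumed to be the usual encoding of ADD R0, R1, R2 with the "always" condition.

```python
def decode_fields(word):
    """Split a 32-bit instruction word into the six fields of the generic format."""
    return {
        "cond":   (word >> 28) & 0xF,   # bits 31-28
        "opcode": (word >> 20) & 0xFF,  # bits 27-20
        "rn":     (word >> 16) & 0xF,   # bits 19-16
        "rd":     (word >> 12) & 0xF,   # bits 15-12
        "other":  (word >> 4) & 0xFF,   # bits 11-4
        "rm":     word & 0xF,           # bits 3-0
    }

# Assumed sample word: ADD R0, R1, R2 (condition = always).
fields = decode_fields(0xE0810002)
print(fields)
```

Here Rn decodes to 1, Rd to 0 and Rm to 2, matching the operands of the assumed instruction.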
Conditional Execution of Instructions
All instructions are conditionally executed
The instruction is executed only if the current state of the processor condition code flags satisfies the condition specified in bits b31-b28
One of the conditions is used to indicate that the instruction is always executed
Field:  Condition | OP code | Rn     | Rd     | Other info | Rm
Bits:   31-28     | 27-20   | 19-16  | 15-12  | 11-4       | 3-0
CPSR bits 31-28: N Z C V
Setting Condition Code
CMP Rn, Rm
  Performs the operation [Rn]-[Rm] and sets the condition codes based on the result of the operation
The arithmetic and logic instructions affect the condition code flags only if explicitly specified in the opcode field
Example
  ADDS R0, R1, R2 ; sets the condition code flags
  ADD R0, R1, R2 ; does not
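How the N, Z, C and V flags follow from a result can be sketched in Python. This is an illustrative model under the usual ARM conventions (for a subtraction, C = 1 means "no borrow"); the function names `add_flags` and `sub_flags` are hypothetical and not taken from the slides.

```python
MASK = 0xFFFFFFFF

def add_flags(a, b):
    """Model ADDS: return (result, N, Z, C, V) for a 32-bit addition."""
    a &= MASK
    b &= MASK
    full = a + b
    result = full & MASK
    n = result >> 31                          # sign bit of the result
    z = int(result == 0)                      # result is zero
    c = int(full > MASK)                      # carry out of bit 31
    v = ((a ^ result) & (b ^ result)) >> 31   # operands agree in sign, result differs
    return result, n, z, c, v

def sub_flags(a, b):
    """Model CMP/SUBS: return (result, N, Z, C, V) for a - b."""
    a &= MASK
    b &= MASK
    result = (a - b) & MASK
    n = result >> 31
    z = int(result == 0)
    c = int(a >= b)                           # C = 1 when no borrow occurs
    v = ((a ^ b) & (a ^ result)) >> 31        # signed overflow of the subtraction
    return result, n, z, c, v

print(add_flags(0x7FFFFFFF, 1))  # largest positive value plus one overflows
print(sub_flags(5, 5))           # equal operands set Z
```

In this model CMP corresponds to calling `sub_flags` and discarding the result, keeping only the flags.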
Conditional Execution and Flags
ARM instructions can be made to execute conditionally by postfixing them with the appropriate condition code field.
  This improves code density and performance by reducing the number of forward branch instructions.
  With a forward branch:
      CMP r3,#0
      BEQ skip
      ADD r0,r1,r2
    skip
  With conditional execution:
      CMP r3,#0
      ADDNE r0,r1,r2
By default, data processing instructions do not affect the condition code flags but the flags can be optionally set by using “S”. CMP does not need “S”.
    loop
      …
      SUBS r1,r1,#1 ; decrement r1 and set flags
      BNE loop ; if Z flag clear then branch
Condition Codes
The possible condition codes are listed below:
  Note: AL is the default and does not need to be specified
Suffix   Description               Flags tested
EQ       Equal                     Z=1
NE       Not equal                 Z=0
CS/HS    Unsigned higher or same   C=1
CC/LO    Unsigned lower            C=0
MI       Minus                     N=1
PL       Positive or Zero          N=0
VS       Overflow                  V=1
VC       No overflow               V=0
HI       Unsigned higher           C=1 & Z=0
LS       Unsigned lower or same    C=0 or Z=1
GE       Greater or equal          N=V
LT       Less than                 N!=V
GT       Greater than              Z=0 & N=V
LE       Less than or equal        Z=1 or N!=V
AL       Always
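The flag tests in the condition code table translate directly into boolean expressions. The sketch below is a hedged Python model; the names `CONDITIONS` and `condition_passed` are hypothetical, and the flag values in the example are what a CMP of 3 against 5 would be expected to leave behind (N=1, Z=0, C=0, V=0).

```python
# Each entry maps a condition suffix to a test on the (N, Z, C, V) flags.
CONDITIONS = {
    "EQ": lambda n, z, c, v: z == 1,
    "NE": lambda n, z, c, v: z == 0,
    "CS": lambda n, z, c, v: c == 1,
    "CC": lambda n, z, c, v: c == 0,
    "MI": lambda n, z, c, v: n == 1,
    "PL": lambda n, z, c, v: n == 0,
    "VS": lambda n, z, c, v: v == 1,
    "VC": lambda n, z, c, v: v == 0,
    "HI": lambda n, z, c, v: c == 1 and z == 0,
    "LS": lambda n, z, c, v: c == 0 or z == 1,
    "GE": lambda n, z, c, v: n == v,
    "LT": lambda n, z, c, v: n != v,
    "GT": lambda n, z, c, v: z == 0 and n == v,
    "LE": lambda n, z, c, v: z == 1 or n != v,
    "AL": lambda n, z, c, v: True,
}
CONDITIONS["HS"] = CONDITIONS["CS"]  # HS is an alias of CS
CONDITIONS["LO"] = CONDITIONS["CC"]  # LO is an alias of CC

def condition_passed(suffix, n, z, c, v):
    """Return True if an instruction with this suffix would execute."""
    return CONDITIONS[suffix](n, z, c, v)

# Assumed flags after comparing 3 with 5: negative result, borrow occurred.
flags = (1, 0, 0, 0)
print([s for s in sorted(CONDITIONS) if condition_passed(s, *flags)])
```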
Examples of conditional execution
Use a sequence of several conditional instructions
    if (a==0) func(1);
      CMP r0,#0
      MOVEQ r0,#1
      BLEQ func
Set the flags, then use various condition codes
    if (a==0) x=0;
    if (a>0) x=1;   (else if)
      CMP r0,#0
      MOVEQ r1,#0
      MOVGT r1,#1
Use conditional compare instructions
    if (a==4 || a==10) x=0;
  Pop Quiz? Bonus 1pt on test
Basic Addressing Modes
Basic load instruction: LDR Rd, [Rn, #offset]
  Offset: a signed number in the immediate mode
  EA = a signed offset + the contents of register Rn
  Operation: Rd ← [[Rn]+offset]
  The destination register is listed first
The magnitude of the offset is a 12-bit immediate value contained in the lower 12 bits of the instruction
LDR Rd, [Rn,Rm] performs Rd ← [[Rn]+[Rm]]
  The magnitude of the offset is the content of a third register Rm
LDR Rd, [Rn] performs Rd ← [[Rn]]
Field:  Condition | OP code | Rn     | Rd     | Other info | Rm
Bits:   31-28     | 27-20   | 19-16  | 15-12  | 11-4       | 3-0
The immediate offset occupies bits 11-0 (the Other info and Rm fields together)
Addressing Modes: Store
STR Rd, [Rn] performs [[Rn]] ← [Rd]
  i.e., transfers a word into the memory
The STRB instruction transfers the byte contained in the low-order end of Rd
Note the order of operands
Addressing Modes: Pre-indexed
[Rn, #offset] or [Rn, ±Rm, shift]
  EA = [Rn] + offset, or EA = [Rn] ± [Rm] shifted
Calculate operand address first, and then perform operation
Example: STR R3, [R5,R6]
  R5 = 1000 (base register)
  R6 = 100 (offset register)
  EA = 1000 + 100 = 1100; the operand, a word (4 bytes), is stored at address 1100
Addressing Modes: Pre-indexed example
STR R3, [R5, R10, LSL #2]
  EA = [R5] + [R10] * 4
  R5 = 1000 (base register, the address of C[0])
  R10 = 25 (offset register, the index i)
  EA = 1000 + 25 * 4 = 1100; the operand, a word (4 bytes), is stored at offset 100 from the base
int C[100];
…
for (i=0; i<N; i++) C[i] = A[i] + B[i];
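The effective-address arithmetic for the two pre-indexed stores can be reproduced with a short Python sketch. The function name `pre_indexed_ea` and the value placed in R3 are assumptions made for illustration; the base, offset and index values are the ones shown on the slides.

```python
MASK = 0xFFFFFFFF

def pre_indexed_ea(base, offset_reg, shift=0, subtract=False):
    """EA = [Rn] +/- ([Rm] shifted left by `shift`); the base register is not changed."""
    offset = (offset_reg << shift) & MASK
    ea = base - offset if subtract else base + offset
    return ea & MASK

# R3's value (42) is made up; R5, R6 and R10 follow the slides.
regs = {"R3": 42, "R5": 1000, "R6": 100, "R10": 25}
memory = {}

ea_plain = pre_indexed_ea(regs["R5"], regs["R6"])             # STR R3, [R5, R6]
ea_scaled = pre_indexed_ea(regs["R5"], regs["R10"], shift=2)  # STR R3, [R5, R10, LSL #2]
memory[ea_scaled] = regs["R3"]                                # store the word at the EA
print(ea_plain, ea_scaled)
```

Both forms reach address 1100: one with a byte offset of 100, the other with an element index of 25 scaled by the 4-byte word size.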
Pre-indexed with Write Back
[Rn, #offset]! or [Rn, ±Rm, shift]!
  EA = [Rn] + offset, or EA = [Rn] ± [Rm] shifted
  Then, EA is written back into Rn
Example: STR R0, [Rbase, Rindex]!
  Store R0 at Rbase + Rindex, and write back new address Rbase + Rindex to Rbase.
! in the pre-indexed mode means that a write back is to be performed
Q?
Pre-indexed Addressing with write-back
Push instruction: STR R0, [R5, #-4]!
  EA = [R5] - 4, i.e., R5 ← [R5] - 4
  Perform operation: store
R5 is used as the stack pointer
R5 initially contains the address 2012 of the current TOS
The immediate offset -4 is added to the content (2012) of R5 and written back into R5
This new TOS location is used as the EA (2008) to store the contents of R0, 27
Before: R5 = 2012 (base register, stack pointer), R0 = 27
After execution of the Push instruction: R5 = 2008, and address 2008 holds 27
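The push can be traced step by step in Python. This is a minimal sketch with a hypothetical helper name, `str_pre_indexed_writeback`; registers and memory are modeled as plain dictionaries, and the starting values are those of the slide.

```python
MASK = 0xFFFFFFFF

def str_pre_indexed_writeback(regs, memory, rd, rn, offset):
    """Model STR Rd, [Rn, #offset]!: compute the EA, store, then write the EA back to Rn."""
    ea = (regs[rn] + offset) & MASK
    memory[ea] = regs[rd]   # the store uses the already-adjusted address
    regs[rn] = ea           # write-back: Rn now points at the new top of stack

regs = {"R0": 27, "R5": 2012}   # R5 is the stack pointer, TOS at 2012
memory = {}
str_pre_indexed_writeback(regs, memory, "R0", "R5", -4)  # STR R0, [R5, #-4]!
print(regs["R5"], memory)
```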
Addressing Modes: Post-indexed
The EA of the operand is the contents of Rn
Perform operation first with the operand
Then add the offset to Rn (i.e., the result is written back into Rn)
The post-indexed mode always involves a write back
The pre-indexed and post-indexed modes are distinguished by the way the square brackets are used.
  [Rn, #offset] vs. [Rn], #offset
The offset may be given as an immediate value (range +/- 4095) or as the contents of the third register Rm
Post-indexed Example: used to access a column of elements of a 25x25 matrix
LDR R1, [R2], R10, LSL #2
  R2 = 1000 (base register)
  R10 = 25 (offset register)
  Word (4 bytes)/element, so consecutive elements of a column are 100 = 25x4 bytes apart
  Address 1000: element (1,1) = 6, followed by (1,2) … (1,25)
  Address 1100: element (2,1) = -17
  Address 1200: element (3,1) = 321
  …
for (i=1; i<=N; i++) sum += D[i][1];
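The column walk can be simulated in Python as a hedged sketch. The helper name `ldr_post_indexed` is hypothetical, and the three memory values (6, -17, 321) are read off the slide's figure as best it can be recovered; only the first three rows are modeled.

```python
MASK = 0xFFFFFFFF

def ldr_post_indexed(regs, memory, rd, rn, rm, shift):
    """Model LDR Rd, [Rn], Rm, LSL #shift: load from [Rn], then add the scaled offset to Rn."""
    ea = regs[rn]
    regs[rd] = memory[ea]                          # the access uses the unmodified base
    regs[rn] = (ea + (regs[rm] << shift)) & MASK   # post-index always writes back

# First column of the matrix: rows are 25 words (100 bytes) apart.
memory = {1000: 6, 1100: -17, 1200: 321}
regs = {"R1": 0, "R2": 1000, "R10": 25}

total = 0
for _ in range(3):
    ldr_post_indexed(regs, memory, "R1", "R2", "R10", 2)  # LDR R1, [R2], R10, LSL #2
    total += regs["R1"]
print(total, regs["R2"])
```

Each iteration leaves R2 pointing at the next row's first element, so no separate address update is needed in the loop.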
Pre or Post Indexed Addressing?
Pre-indexed: STR r0,[r1,#12]
  r0 = 0x5 (source register for STR), r1 = 0x200 (base register), offset 12
  0x5 is stored at 0x20c; r1 stays 0x200
  Write-back (auto-update) form: STR r0,[r1,#12]!
Post-indexed: STR r0,[r1],#12
  r0 = 0x5 (source register for STR), r1 = 0x200 (original base register), offset 12
  0x5 is stored at 0x200; r1 is then updated to 0x20c (updated base register)
  int *ptr; x = *ptr++;
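The difference between the three store forms comes down to which address is used and whether the base register changes. The Python sketch below models that under the slide's values (r0 = 0x5, r1 = 0x200, offset 12); the function names `store` and `run` and the mode labels are hypothetical.

```python
def store(regs, memory, mode, offset=12):
    """Model STR r0 with r1 as base in one of three addressing forms."""
    base = regs["r1"]
    if mode == "pre":                 # STR r0,[r1,#12]
        memory[base + offset] = regs["r0"]
    elif mode == "pre_writeback":     # STR r0,[r1,#12]!
        memory[base + offset] = regs["r0"]
        regs["r1"] = base + offset
    elif mode == "post":              # STR r0,[r1],#12
        memory[base] = regs["r0"]
        regs["r1"] = base + offset
    else:
        raise ValueError(mode)

def run(mode):
    """Start from the slide's register values and return (memory, final r1)."""
    regs = {"r0": 0x5, "r1": 0x200}
    memory = {}
    store(regs, memory, mode)
    return memory, regs["r1"]

for mode in ("pre", "pre_writeback", "post"):
    memory, r1 = run(mode)
    print(mode, {hex(a): v for a, v in memory.items()}, hex(r1))
```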
Recap: Pre, Post-indexed Modes
LDR R0, [R1, -R2]!
  R0 ← [[R1] - [R2]]; R1 ← [R1] - [R2]
When the offset is given in a register, it may be scaled by a power of 2 by shifting to the right or to the left.
  This is indicated with either LSL or LSR and the shift amount
  The shift amount is in the range 0 to 31
LDR R0, [R1, -R2, LSL #4]!
  R0 ← [[R1] - 16 x [R2]]; R1 ← [R1] - 16 x [R2]
The PC may be used as the base register Rn. The assembler determines the immediate offset as the signed distance between the address of the operand and the contents of the PC. (relative addressing mode)
Relative Addressing Mode
Memory (word = 4 bytes):
  Address 1000: LDR R1, ITEM
  Address 1004
  Address 1008: updated [PC] = 1008
  Address 1060: ITEM (operand); 52 = offset
•The offset calculated by the assembler is 52 because the updated PC = 1008
•EA = 1060 = 1008 + 52
When the effective address is calculated at instruction execution time, the contents of the PC will have been updated to the address two words (8 bytes) forward from the current instruction
  Why?
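The assembler's offset calculation for the relative addressing example can be written out in a few lines of Python. The names `PC_AHEAD`, `pc_relative_offset` and `effective_address` are hypothetical; the numbers are the ones from the slide.

```python
PC_AHEAD = 8  # the PC reads as the instruction's own address plus two words

def pc_relative_offset(instruction_address, operand_address):
    """Offset the assembler would place in the instruction."""
    return operand_address - (instruction_address + PC_AHEAD)

def effective_address(instruction_address, offset):
    """Address the processor forms at execution time from the updated PC."""
    return instruction_address + PC_AHEAD + offset

offset = pc_relative_offset(1000, 1060)   # LDR R1, ITEM at address 1000, ITEM at 1060
print(offset, effective_address(1000, offset))
```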
Multiple Load and Store
The ARM can also load multiple operands
  Called block transfer
  LDM: load multiple
  STM: store multiple
  The offset is always 4; thus it is not specified explicitly
Assume R10 is the base register and it contains 1000
  LDMIA R10!, {R0,R1,R6,R7}
  transfers the words from locations 1000, 1004, 1008, 1012 into registers R0, R1, R6 and R7
The suffix IA indicates increment after
  IB: Increment Before, DA: Decrement After, DB: Decrement Before
LDM / STM operation
Syntax: <LDM|STM>{<cond>}<addressing_mode> Rb{!}, <register list>
4 addressing modes:
  LDMIA / STMIA increment after
  LDMIB / STMIB increment before
  LDMDA / STMDA decrement after
  LDMDB / STMDB decrement before
[Figure: positions of r0, r1 and r4 in memory relative to the base register (Rb) r10 for LDMxx r10, {r0,r1,r4} / STMxx r10, {r0,r1,r4}, with xx = IA, IB, DA, DB]
Pop Quiz?
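The four block-transfer modes differ only in where the first word goes and in which direction the base moves. The sketch below is a Python model that relies on the ARM convention that the lowest-numbered register always uses the lowest address; the function name `block_transfer_addresses` is hypothetical, and the base value 1000 for the r10 example is an assumption borrowed from the LDMIA slide.

```python
def block_transfer_addresses(base, registers, mode):
    """Return ({register number: address}, base after write-back) for LDM/STM."""
    regs = sorted(registers)   # lowest register number gets the lowest address
    n = len(regs)
    if mode == "IA":           # increment after: start at the base
        start = base
    elif mode == "IB":         # increment before: start one word above
        start = base + 4
    elif mode == "DA":         # decrement after: block ends at the base
        start = base - 4 * n + 4
    elif mode == "DB":         # decrement before: block ends one word below
        start = base - 4 * n
    else:
        raise ValueError(mode)
    addresses = {r: start + 4 * i for i, r in enumerate(regs)}
    new_base = base + 4 * n if mode in ("IA", "IB") else base - 4 * n
    return addresses, new_base

# LDMIA R10!, {R0,R1,R6,R7} with R10 = 1000
print(block_transfer_addresses(1000, [0, 1, 6, 7], "IA"))
# The four modes for the register list {r0,r1,r4}
for mode in ("IA", "IB", "DA", "DB"):
    print(mode, block_transfer_addresses(1000, [0, 1, 4], mode))
```

The returned base only takes effect when the instruction is written with `!`; without it the base register keeps its original value.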
Move Instructions
MOV Rd, Rm
  Rd ← [Rm]
MOV R0, #76
  R0 ← 76
Branch instructions
Branch : B{<cond>} label
Branch with Link : BL{<cond>} subroutine_label
Encoding:
  Bits 31-28: Cond (condition field)
  Bits 27-25: 1 0 1
  Bit 24: L (link bit), 0 = Branch, 1 = Branch with link
  Bits 23-0: Offset
The processor core shifts the offset field left by 2 positions, sign-extends it and adds it to the PC
  ± 32 Mbyte range
  How to perform longer branches?
Conditional Branch Instructions
Conditional branch instructions contain a 2's complement 24-bit offset, which is first left-shifted by 2 and then added to the updated contents of the PC to generate the branch target.
Format: condition (bits 31-28), OPcode (bits 27-24), offset (bits 23-0)
Example:
  Address 1000: BEQ LOCATION
  Address 1004
  Address 1008: updated [PC] = 1008
  LOCATION = 1100 (branch target); Offset = 92
At the time the branch target address is calculated, the content of the PC has been updated to contain the address of the instruction that is two words beyond the branch instruction.
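The relationship between the byte offset (92 in the example) and the 24-bit field stored in the instruction can be sketched in Python. The helper names `encode_branch_offset` and `branch_target` are hypothetical; the model assumes word-aligned targets, as the left shift by 2 implies.

```python
MASK = 0xFFFFFFFF

def encode_branch_offset(branch_address, target_address):
    """Return the 24-bit field: the byte distance from the updated PC, divided by 4."""
    byte_offset = target_address - (branch_address + 8)   # updated PC is two words ahead
    if byte_offset % 4:
        raise ValueError("branch targets must be word aligned")
    if not -(1 << 25) <= byte_offset < (1 << 25):
        raise ValueError("target is outside the +/- 32 Mbyte range")
    return (byte_offset >> 2) & 0xFFFFFF

def branch_target(branch_address, field):
    """Sign-extend the 24-bit field, shift it left by 2 and add it to the updated PC."""
    if field & 0x800000:          # bit 23 set: the offset is negative
        field -= 1 << 24
    return (branch_address + 8 + (field << 2)) & MASK

field = encode_branch_offset(1000, 1100)   # BEQ LOCATION at 1000, LOCATION = 1100
print(field, field << 2, branch_target(1000, field))
```

For the slide's example the stored field is 23, which becomes the byte offset 92 after the shift.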
ARM Branches and Subroutines
B <label>
  PC relative. ±32 Mbyte range.
BL <subroutine>
  Stores return address in LR
  Returning implemented by restoring the PC from LR
  For non-leaf functions, LR will have to be stacked
Caller:
      :
      BL func1
      :
func1:
      STMFD sp!,{regs,lr}
      :
      BL func2
      :
      LDMFD sp!,{regs,pc}
func2:
      :
      MOV pc, lr
Data processing Instructions
Consist of :
  Arithmetic: ADD ADC SUB SBC RSB RSC
  Logical: AND ORR EOR BIC
  Comparisons: CMP CMN TST TEQ
  Data movement: MOV MVN
These instructions only work on registers, NOT memory.
Syntax: <Operation>{<cond>}{S} Rd, Rn, Operand2
  Comparisons set flags only - they do not change Rd
  Data movement does not change Rn
Second operand is sent to the ALU via barrel shifter.
Arithmetic Instructions
Opcode Rd, Rn, Rm
ADD R0, R2, R4
  R0 ← [R2] + [R4]
SUB R0, R6, R5
  R0 ← [R6] - [R5]
ADD R0, R3, #17
  R0 ← [R3] + 17
ADD R0, R1, R5, LSL #4
  R0 ← [R1] + 16 x [R5]
MUL R0, R1, R2
  R0 ← [R1] x [R2]
  Places the low-order 32 bits of the product in a third register
  High-order bits of the product are discarded
MLA R0, R1, R2, R3
  R0 ← [R1] x [R2] + [R3]; multiply accumulate
Logic Instructions
AND Rd, Rn, Rm
  Rd ← [Rn] AND [Rm] ; logical bitwise AND
  Example ; R0 = 02FA62CA and R1 = 0000FFFF
    AND R0, R0, R1 ; R0 ← 000062CA
BIC Rd, Rn, Rm
  Bit clear, complements each bit in Rm and then performs AND with the bits in Rn
  Example ; R0 = 02FA62CA and R1 = 0000FFFF
    BIC R0, R0, R1 ; R0 ← 02FA0000
MVN complements the bits of the source operand and places the result in Rd
  R3 = 0F0F0F0F
    MVN R0, R3 ; R0 ← F0F0F0F0
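The three worked examples for AND, BIC and MVN can be verified with 32-bit masking in Python. This is a small sketch; the function names `and_`, `bic` and `mvn` are hypothetical stand-ins for the instructions, not an actual API.

```python
MASK = 0xFFFFFFFF

def and_(rn, rm):
    """AND Rd, Rn, Rm: bitwise AND."""
    return rn & rm & MASK

def bic(rn, rm):
    """BIC Rd, Rn, Rm: clear in Rn every bit that is set in Rm."""
    return rn & ~rm & MASK

def mvn(rm):
    """MVN Rd, Rm: bitwise complement, truncated to 32 bits."""
    return ~rm & MASK

print(f"{and_(0x02FA62CA, 0x0000FFFF):08X}")
print(f"{bic(0x02FA62CA, 0x0000FFFF):08X}")
print(f"{mvn(0x0F0F0F0F):08X}")
```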
Shift Operations
LSL : Logical Shift Left
  Multiplication by a power of 2
  CF ← Destination ← 0
LSR : Logical Shift Right
  Division by a power of 2
  0 → Destination → CF
ASR: Arithmetic Shift Right
  Division by a power of 2, preserving the sign bit
  Destination → CF
ROR: Rotate Right
  Bit rotate with wrap around from LSB to MSB
  Destination → CF
RRX: Rotate Right Extended
  Single bit rotate with wrap around from CF to MSB
A barrel shifter is a hardware device that can shift a data word left or right by any number of bits in a single operation
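Each shift type can be modeled on a 32-bit value in Python. This is a hedged sketch with hypothetical function names; it models only the shifted result (plus the carry out for RRX) and assumes shift amounts between 0 and 31.

```python
MASK = 0xFFFFFFFF

def lsl(x, n):
    """Logical shift left: zeros enter from the right."""
    return (x << n) & MASK

def lsr(x, n):
    """Logical shift right: zeros enter from the left."""
    return (x & MASK) >> n

def asr(x, n):
    """Arithmetic shift right: copies of the sign bit enter from the left."""
    x &= MASK
    if x & 0x80000000:
        x -= 1 << 32            # reinterpret as a negative Python integer
    return (x >> n) & MASK

def ror(x, n):
    """Rotate right: bits leaving the LSB re-enter at the MSB."""
    n %= 32
    x &= MASK
    return ((x >> n) | (x << (32 - n))) & MASK

def rrx(x, carry):
    """Rotate right extended: one-bit rotate through the carry flag."""
    x &= MASK
    return (carry << 31) | (x >> 1), x & 1   # (result, new carry)

print(hex(lsl(3, 4)), hex(lsr(0x80000000, 4)), hex(asr(0x80000000, 4)))
print(hex(ror(0xFF, 8)), rrx(0x3, 1))
```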
A Barrel Shifter
Using the Barrel Shifter: The Second Operand
[Figure: Operand 1 feeds the ALU directly; Operand 2 passes through the barrel shifter before reaching the ALU; the ALU produces the result]
Register, optionally with shift operation
  Shift value can be either:
    5 bit unsigned integer
    Specified in bottom byte of another register.
  Used for multiplication by constant
Immediate value
  8 bit number, with a range of 0-255.
    Rotated right through even number of positions
  Allows increased range of 32-bit constants to be loaded directly into registers
Immediate Constants (1)
No ARM instruction can contain a 32 bit immediate constant
  All ARM instructions are fixed as 32 bits long
The data processing instruction format has 12 bits available for operand2
  Bits 11-8: rot
  Bits 7-0: immed_8
  immed_8 passes through the shifter, which applies ROR by rot x2
  4 bit rotate value (0-15) is multiplied by two to give range 0-30 in steps of 2
Rule to remember is “8-bits shifted by an even number of bit positions”.
Quick Quiz:
  MOV r0,#255,8
Immediate Constants (2)
Examples:
  ror #0: range 0-0x000000ff, step 0x00000001
  ror #8: range 0-0xff000000, step 0x01000000
  ror #30: range 0-0x000003fc, step 0x00000004
The assembler converts immediate values to the rotate form:
  MOV r0,#4096 ; uses 0x40 ror 26
  ADD r1,r2,#0xFF0000 ; uses 0xFF ror 16
The bitwise complements can also be formed using MVN:
  MOV r0, #0xFFFFFFFF ; assembles to MVN r0,#0
Values that cannot be generated in this way will cause an error.
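Whether a constant fits the "8 bits rotated right by an even amount" rule can be checked mechanically. The Python sketch below is an illustration with hypothetical names (`ror32`, `decode_immediate`, `encodings`); it lists every valid rotate form rather than guessing which one a particular assembler would pick, since some values such as 4096 have more than one.

```python
MASK = 0xFFFFFFFF

def ror32(value, amount):
    """Rotate a 32-bit value right by `amount` bits."""
    amount %= 32
    value &= MASK
    return ((value >> amount) | (value << (32 - amount))) & MASK

def decode_immediate(imm8, rot):
    """Value of an operand2 immediate: imm8 rotated right by 2 * rot."""
    return ror32(imm8 & 0xFF, 2 * (rot & 0xF))

def encodings(value):
    """All (imm8, rotation) pairs with imm8 ROR rotation == value; empty if none exist."""
    value &= MASK
    pairs = []
    for rot in range(16):
        imm8 = ror32(value, 32 - 2 * rot)   # rotating left by 2*rot undoes the ROR
        if imm8 <= 0xFF:
            pairs.append((imm8, 2 * rot))
    return pairs

print(encodings(4096))        # includes (0x40, 26), the form quoted on the slide
print(encodings(0xFF0000))    # a single form: 0xFF ror 16
print(encodings(0x55555555))  # no rotate form exists
```

A value with an empty list cannot be used directly as an immediate; 0xFFFFFFFF is such a value, which is why the slide's example falls back to MVN r0,#0.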
Loading 32 Bit Constants
To allow larger constants to be loaded, the assembler offers a pseudo-instruction:
  LDR rd, =const
This will either:
  Produce a MOV or MVN instruction to generate the value (if possible).
  or
  Generate a LDR instruction with a PC-relative address to read the constant from a literal pool (constant data area embedded in the code).
For example
  LDR r0,=0xFF => MOV r0,#0xFF
  LDR r0,=0x55555555 => LDR r0,[PC,#Imm12]
                        …
                        DCD 0x55555555
This is the recommended way of loading constants into a register
Multiply
Syntax:
  MUL{<cond>}{S} Rd, Rm, Rs                  Rd = Rm * Rs
  MLA{<cond>}{S} Rd,Rm,Rs,Rn                 Rd = (Rm * Rs) + Rn
  [U|S]MULL{<cond>}{S} RdLo, RdHi, Rm, Rs    RdHi,RdLo := Rm*Rs
  [U|S]MLAL{<cond>}{S} RdLo, RdHi, Rm, Rs    RdHi,RdLo := (Rm*Rs)+RdHi,RdLo
Cycle time
  Basic MUL instruction
    2-5 cycles on ARM7TDMI
    1-3 cycles on StrongARM/XScale
    2 cycles on ARM9E/ARM102xE
  +1 cycle for ARM9TDMI (over ARM7TDMI)
  +1 cycle for accumulate (not on 9E though result delay is one cycle longer)
  +1 cycle for “long”
Above are “general rules” - refer to the TRM for the core you are using for the exact details
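The difference between the 32-bit and the "long" multiplies is easiest to see on operands whose product does not fit in one register. The following Python sketch models the results only (not the cycle counts); the function names are hypothetical, and each long multiply returns the pair (RdLo, RdHi).

```python
MASK = 0xFFFFFFFF

def to_signed(x):
    """Reinterpret a 32-bit pattern as a signed integer."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x

def mul(rm, rs):
    """MUL: keep only the low-order 32 bits of the product."""
    return (rm * rs) & MASK

def mla(rm, rs, rn):
    """MLA: multiply, add Rn, keep the low-order 32 bits."""
    return (rm * rs + rn) & MASK

def umull(rm, rs):
    """UMULL: unsigned 64-bit product as (RdLo, RdHi)."""
    product = (rm & MASK) * (rs & MASK)
    return product & MASK, product >> 32

def smull(rm, rs):
    """SMULL: signed 64-bit product as (RdLo, RdHi)."""
    product = to_signed(rm) * to_signed(rs)
    return product & MASK, (product >> 32) & MASK

print(hex(mul(0x10000, 0x10000)), umull(0x10000, 0x10000))
print([hex(v) for v in umull(0xFFFFFFFF, 2)], [hex(v) for v in smull(0xFFFFFFFF, 2)])
```

With 0x10000 x 0x10000, MUL yields 0 because the only set bit of the product lies above bit 31, while UMULL keeps it in RdHi.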
Single register data transfer
  LDR   STR    Word
  LDRB  STRB   Byte
  LDRH  STRH   Halfword
  LDRSB        Signed byte load
  LDRSH        Signed halfword load
Memory system must support all access sizes
Syntax:
  LDR{<cond>}{<size>} Rd, <address>
  STR{<cond>}{<size>} Rd, <address>
e.g. LDREQB
Address Accessed
Address accessed by LDR/STR is specified by a base register plus an offset
For word and unsigned byte accesses, offset can be
  An unsigned 12-bit immediate value (i.e. 0 - 4095 bytes).
    LDR r0,[r1,#8]
  A register, optionally shifted by an immediate value
    LDR r0,[r1,r2]
    LDR r0,[r1,r2,LSL#2]
This can be either added or subtracted from the base register:
    LDR r0,[r1,#-8]
    LDR r0,[r1,-r2]
    LDR r0,[r1,-r2,LSL#2]
For halfword and signed halfword / byte, offset can be:
  An unsigned 8 bit immediate value (i.e. 0-255 bytes).
  A register (unshifted).
Choice of pre-indexed or post-indexed addressing