Comparison instructions

Branch and jump instructions

Simple Code Sequences

Where Are Branches Used?

In C control statements

If statement

if(n > 0) {

} else {

}

While loop

while (s != NULL) {

}

For loop

for (i = 0; i < N; i++) {

}

Do loop

do {

} while (s != NULL);

Others

e.g. max = (x > y) ? x : y;

Comparison Instructions

To set up conditions in CR or XER bits:

Set by arithmetic/logic/shift instructions with the . suffix
Set by comparison instructions

Compare signed word and unsigned word

cmpw r3, r4 ; set CR0 as for signed r3-r4
cmplw r3, r4 ; set CR0 as for unsigned r3-r4

cmplw: compare logical word

Compare using immediate values

cmpwi r3, 200 ; set CR0 as for signed r3-200
cmplwi r3, 200 ; set CR0 as for unsigned r3-200

Comparison Instructions

Compare and set specific condition registers

Comparison may specify which CR field to use

cmpw cr3, r3, r4 ; set CR3 instead of CR0

cmplwi cr2, r3, 200 ; logical compare using an immediate; sets CR2

cmpw cr0, r3, r4 ; equivalent to cmpw r3, r4

CR fields: CR0 CR1 CR2 CR3 CR4 CR5 CR6 CR7

Each CR field has four bits: LT GT EQ SO
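The difference between cmpw and cmplw can be sketched in C. This is only an illustrative model, not the hardware definition: the function names cmpw_model and cmplw_model and the mask values are assumptions made for this example, and the so argument stands in for the XER summary-overflow bit that is copied into the field.

```c
#include <stdint.h>

/* One 4-bit CR field, most significant bit first: LT GT EQ SO.
   The mask values are an assumption chosen for this sketch. */
enum { CR_LT = 8, CR_GT = 4, CR_EQ = 2, CR_SO = 1 };

/* Model of cmpw: compare two 32-bit words as signed values.
   The uint32_t -> int32_t conversion assumes two's complement wrap-around,
   which holds on common compilers. */
unsigned cmpw_model(uint32_t a, uint32_t b, unsigned so)
{
    int32_t sa = (int32_t)a;
    int32_t sb = (int32_t)b;
    unsigned field = (sa < sb) ? CR_LT : (sa > sb) ? CR_GT : CR_EQ;
    return field | (so ? CR_SO : 0);
}

/* Model of cmplw: compare the same bit patterns as unsigned values. */
unsigned cmplw_model(uint32_t a, uint32_t b, unsigned so)
{
    unsigned field = (a < b) ? CR_LT : (a > b) ? CR_GT : CR_EQ;
    return field | (so ? CR_SO : 0);
}
```

With a = 0xFFFFFFFF and b = 1, the signed model sets LT (since -1 < 1) while the unsigned model sets GT, which is why the choice between cmpw and cmplw matters.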

Branch Basic Terms

Branch condition, branch target

Conditional branches: take the branch only if some condition holds

Unconditional branches: always take the branch

Unconditional Branches

C

while (1) {
    X = X + 1;
}

Assembly

loop: addi r9, r9, 1
b loop ; offset (-4)

The target loop is specified as an offset from the current instruction (PC-relative).

Conditional Branches

Commonly used branches

Use condition register CR0 bits: LT, GT, EQ, SO

Common form: ble target_address

ble: branch if less than or equal (GT=0)
blt: branch if less than (LT=1)
beq: branch if equal (EQ=1)
bne: branch if not equal (EQ=0)
bge: branch if greater than or equal (LT=0)
bgt: branch if greater than (GT=1)

All encoded in the same instruction format (see next)
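The conditions listed above can be written as a small C predicate over one CR field. This is a sketch for illustration only: branch_taken and the mask values are hypothetical names introduced here, and the function simply tests the single bit each mnemonic depends on.

```c
#include <stdbool.h>
#include <string.h>

/* One 4-bit CR field, most significant bit first: LT GT EQ SO.
   The mask values are an assumption chosen for this sketch. */
enum { CR_LT = 8, CR_GT = 4, CR_EQ = 2, CR_SO = 1 };

/* Returns true if the named conditional branch would be taken
   for the given CR field contents. Unknown mnemonics return false. */
bool branch_taken(const char *mnemonic, unsigned cr_field)
{
    if (strcmp(mnemonic, "blt") == 0) return (cr_field & CR_LT) != 0; /* LT=1 */
    if (strcmp(mnemonic, "bge") == 0) return (cr_field & CR_LT) == 0; /* LT=0 */
    if (strcmp(mnemonic, "bgt") == 0) return (cr_field & CR_GT) != 0; /* GT=1 */
    if (strcmp(mnemonic, "ble") == 0) return (cr_field & CR_GT) == 0; /* GT=0 */
    if (strcmp(mnemonic, "beq") == 0) return (cr_field & CR_EQ) != 0; /* EQ=1 */
    if (strcmp(mnemonic, "bne") == 0) return (cr_field & CR_EQ) == 0; /* EQ=0 */
    return false;
}
```

Note that each branch tests exactly one bit: ble does not check LT or EQ directly, it only checks that GT is clear.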

Conditional Branches

Using CR fields

bne cr2, target ; branch if EQ of CR2 is zero

Example: using branch with comparison instructions

loop:
addi r3, r3, 1 ; increase r3
cmpw r3, r4 ; compare r3 with r4
bne target ; branch if r3 != r4

Example: using a different CR field

loop:
addi r3, r3, 1 ; increase r3
cmpw cr3, r3, r4 ; compare using cr3
bne cr3, target ; branch if r3 != r4

1. PC-relative: next PC = PC + EXTS(PC-Offset || 0b00)

2. Absolute: next PC = EXTS(PC-Offset || 0b00)

3. Register: next PC = value of register. Can use two special registers: LR or CTR

Why sign-extension of an address (for absolute)?

Sign extension lets a short field reach both the lowest and the highest addresses; e.g. 0xff00 gets sign-extended to 0xffffff00.
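The PC-relative and absolute rules can be sketched in C as follows. This is a minimal model under stated assumptions: exts and branch_target are hypothetical helper names, offset_field is the raw field from the instruction (24 bits for b, 14 bits for bc), and aa is the absolute-address bit.

```c
#include <stdint.h>

/* Sign-extend the low `bits` bits of `value` to 32 bits (bits < 32). */
static uint32_t exts(uint32_t value, unsigned bits)
{
    uint32_t sign = 1u << (bits - 1);
    value &= (sign << 1) - 1;        /* keep only the low `bits` bits */
    return (value ^ sign) - sign;    /* propagate the sign bit upward */
}

/* next PC = EXTS(offset || 0b00), plus the current PC unless aa is set. */
uint32_t branch_target(uint32_t pc, uint32_t offset_field,
                       unsigned field_bits, int aa)
{
    uint32_t disp = exts(offset_field << 2, field_bits + 2);
    return aa ? disp : pc + disp;
}
```

For example, a 24-bit field of all ones is a displacement of -4, so a branch at address 0x104 with that field goes to 0x100; a 14-bit absolute field of 0x3FC0 yields 0xff00 before extension and 0xffffff00 after.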

Update LR option (l suffix): if updating, save PC+4 into LR, e.g. bl target_addr

Do not update LR: b target_addr

When do we want to save PC+4?

Underlying Details

bx: encodes 24-bit address (26-bit effective)

bcx: encodes 14-bit address (16-bit effective)

bclrx: uses the LR register as target address

bcctrx: uses the CTR register as target address

x: represents the AA and LK bits, e.g. l, a, la


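Instruction format (fields listed with their bit positions, bit 0 first)

bx:      18 (0-5) | PC-Offset (6-29) | AA (30) | LK (31)
bcx:     16 (0-5) | BO (6-10) | BI (11-15) | BD (16-29) | AA (30) | LK (31)
bclrx:   19 (0-5) | BO (6-10) | BI (11-15) | 00000 (16-20) | 16 (21-30) | LK (31)
bcctrx:  19 (0-5) | BO (6-10) | BI (11-15) | 00000 (16-20) | 528 (21-30) | LK (31)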

Underlying Details

BO: Branch options
Encodes branching on TRUE or FALSE or on CTR values

BI: Index of the CR bit to use
Five-bit index into the 32 CR bits: 3 bits for the CR field index, 2 bits to select LT, GT, EQ, or SO

BD: Branch displacement
14-bit (16-bit effective), sign-extended

LK: Link bit
1: update LR with PC+4; 0: do not update

Instruction fields of bcx: 16 | BO | BI | BD | AA | LK

Underlying Details

Frequently used BO encoding in bc, bclr, and bcctr

BO=00100 (4): branch if the condition is false
BO=01100 (12): branch if the condition is true
BO=10100 (20): branch always
BO=10000 (16): decrement CTR, then branch if CTR != 0

blr is equivalent to bclr 20, 0: unconditional branch to the address in LR
bnelr is equivalent to bclr 4, 2: branch to LR if not equal

Explanation: bc 4, 14, target_addr branches if bit 14 in CR (CR3[EQ]) is false (because BO=4); it is equivalent to bne cr3, target_addr
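To make the BO and BI fields concrete, the following C sketch packs them into 32-bit instruction words. The helper names (cr_bit_index, encode_bc, encode_b, encode_bclr) are assumptions introduced for this example; the field positions follow the instruction format described in these slides, and byte_offset is the displacement in bytes (a multiple of 4).

```c
#include <stdint.h>

/* BI value for a bit inside a CR field: 0=LT, 1=GT, 2=EQ, 3=SO. */
unsigned cr_bit_index(unsigned cr_field, unsigned bit_in_field)
{
    return 4u * cr_field + bit_in_field;
}

/* bcx: opcode 16 | BO | BI | BD | AA | LK */
uint32_t encode_bc(unsigned bo, unsigned bi, int32_t byte_offset,
                   unsigned aa, unsigned lk)
{
    return (16u << 26) | ((bo & 31u) << 21) | ((bi & 31u) << 16)
         | ((uint32_t)byte_offset & 0xFFFCu)   /* BD occupies bits 16-29 */
         | ((aa & 1u) << 1) | (lk & 1u);
}

/* bx: opcode 18 | 24-bit offset | AA | LK */
uint32_t encode_b(int32_t byte_offset, unsigned aa, unsigned lk)
{
    return (18u << 26) | ((uint32_t)byte_offset & 0x03FFFFFCu)
         | ((aa & 1u) << 1) | (lk & 1u);
}

/* bclrx: opcode 19 | BO | BI | 00000 | extended opcode 16 | LK */
uint32_t encode_bclr(unsigned bo, unsigned bi, unsigned lk)
{
    return (19u << 26) | ((bo & 31u) << 21) | ((bi & 31u) << 16)
         | (16u << 1) | (lk & 1u);
}
```

Under these assumptions, ble with a displacement of +12 is bc 4, 1 (BO=4, BI=CR0[GT]) and encodes as 0x4081000C, and cr_bit_index(3, 2) gives 14, the CR3[EQ] bit used in the bne cr3 explanation.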


Underlying Details

Branch examples using AA and LK bits (zeros by default)

; and save PC+4 in LR

AA and LK fields: AA is bit 30 and LK is bit 31 in bx and bcx; bclrx and bcctrx have only the LK bit (bit 31).

Support Procedure Call/Return

Link Register

Supporting function calls

1. A parent function calls a child function: bl child_func ; saves PC+4 (the return address) in LR

Simple Code Sequences

How to translate:

C arithmetic expressions

C if statement

C for loops

Function calls (next week)

C Arithmetic Expressions

Basic operations

static int sum;

static int x1, x2;

static int y1, y2;

sum = (x1+x2)-(y1+y2)+100;

Assembly

lwz r3, 4(r13) ; load x1

lwz r0, 8(r13) ; load x2

add r4, r3, r0 ; x1+x2

lwz r3, 12(r13) ; load y1

lwz r0, 16(r13) ; load y2

add r0, r3, r0 ; y1+y2

subf r3, r0, r4 ; (x1+x2)-(y1+y2)

addi r0, r3, 100 ; add 100

stw r0, 0(r13) ; store sum

Q: What would happen if the variables were changed from signed to unsigned?

C Arithmetic Expressions

Sign extension

static short sum;

static short x1, x2;

static short y1, y2;

sum = (x1+x2)-(y1+y2) + 100;

Assembly

lha r3, 2(r13) ; load x1

lha r0, 4(r13) ; load x2

add r4, r3, r0 ; x1+x2

lha r3, 6(r13) ; load y1

lha r0, 8(r13) ; load y2

add r0, r3, r0 ; y1+y2

subf r3, r0, r4 ; (x1+x2)-(y1+y2)

addi r0, r3, 100 ; add 100

sth r0, 0(r13) ; store sum
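The role of lha in the sequence above can be modeled in C. This is a sketch only: lha_model, lhz_model and sth_model are hypothetical names, with lhz (load halfword and zero) included for contrast even though the slide does not use it. The narrowing casts assume two's complement wrap-around, as on common compilers.

```c
#include <stdint.h>

/* lha: load a 16-bit halfword and sign-extend it to 32 bits. */
int32_t lha_model(uint16_t halfword)
{
    return (int32_t)(int16_t)halfword;
}

/* lhz: load a 16-bit halfword and zero-extend it to 32 bits. */
uint32_t lhz_model(uint16_t halfword)
{
    return (uint32_t)halfword;
}

/* sth: store only the low 16 bits of a register. */
uint16_t sth_model(uint32_t reg)
{
    return (uint16_t)reg;
}
```

A halfword holding 0xFFFE loads as -2 with lha but as 65534 with lhz, so signed short variables need lha for the 32-bit add and subf to give the right result.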

If-then-else

C Program

if (x > y)
    z = 1;
else
    z = 0;

Assembly

cmpw r3, r4

ble skip1

li r31, 1

b skip2

skip1: li r31, 0

skip2:

Notes:

Code generated by CodeWarrior and then revised

x in r3; y in r4; z in r31

li r31, 1 is equivalent to addi r31, 0, 1; li is called a simplified mnemonic
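The branch structure of the assembly above can be mirrored in C with goto statements, which makes the inverted condition visible: the code branches when x > y is false. This is an illustrative sketch only; select_z is a hypothetical function name and the comments map each statement to the corresponding instruction.

```c
/* Same control flow as the assembly: the test is inverted (x <= y)
   so that the "then" part falls through and the "else" part is the target. */
int select_z(int x, int y)
{
    int z;
    if (x <= y) goto skip1;   /* cmpw r3, r4 ; ble skip1 */
    z = 1;                    /* li r31, 1 */
    goto skip2;               /* b skip2 */
skip1:
    z = 0;                    /* li r31, 0 */
skip2:
    return z;
}
```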

If-then-else

C Program

static int x, y;
static int max;
if (x - y > 0)
    max = x;
else
    max = y;

Assembly

lwz r4, 4(r13) ; load y
lwz r0, 0(r13) ; load x
subf r0, r4, r0 ; x-y
cmpwi r0, 0x0000 ; x-y>0?
ble skip1 ; no, skip max=x
lwz r0, 0(r13) ; load x
stw r0, 8(r13) ; max=x
b skip2 ; skip max=y
skip1: lwz r0, 4(r13) ; load y
stw r0, 8(r13) ; max=y
skip2:

Notes:

Generated by CodeWarrior and then revised

Can you optimize the code, i.e. reduce the number of instructions but produce the same output?

Assembly Source:

cmpw r0, r3
ble skip1
li r31, 1
b skip2
skip1: li r31, 0
skip2:

Binary code

00000048: 7C001800 cmpw r0,r3
0000004C: 4081000C ble *+12
00000050: 3BE00001 li r31,1
00000054: 48000008 b *+8
00000058: 3BE00000 li r31,0
0000005C:
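The two branch words in the binary code, 4081000C and 48000008, can be taken apart with a short C sketch. The Branch struct and decode_branch are hypothetical names introduced for illustration, and the decoder handles only the bc (opcode 16) and b (opcode 18) formats.

```c
#include <stdint.h>

/* Decoded fields of a branch word; bo and bi are only meaningful for bc. */
typedef struct {
    unsigned opcode;
    unsigned bo;
    unsigned bi;
    int32_t  disp;   /* sign-extended displacement in bytes */
} Branch;

Branch decode_branch(uint32_t word)
{
    Branch b = { word >> 26, 0, 0, 0 };
    if (b.opcode == 16) {                       /* bc: BO, BI, 14-bit BD */
        b.bo = (word >> 21) & 31u;
        b.bi = (word >> 16) & 31u;
        b.disp = (int32_t)(int16_t)(word & 0xFFFCu);
    } else if (b.opcode == 18) {                /* b: 24-bit offset */
        uint32_t d = word & 0x03FFFFFCu;
        b.disp = (int32_t)((d ^ 0x02000000u) - 0x02000000u);
    }
    return b;
}
```

Decoding 0x4081000C gives opcode 16, BO=4, BI=1 and a displacement of 12, so the ble at 0x4C targets 0x58; decoding 0x48000008 gives opcode 18 and a displacement of 8, so the b at 0x54 targets 0x5C.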

For loop

C code

static int sum;
static int X[100];
int i;

sum = 0;
for (i = 0; i < 100; i++)
    sum += X[i];

Assembly

li r0, 0 ; sum = 0
stw r0, 0(r13) ; sum = 0
li r31, 0 ; i in r31
b cmp_

loop: slwi r4, r31, 2 ; r4=i*4

addi r31, r31, 1 ; increase i
cmp_: cmpwi r31, 0x0064 ; 0x64 = 100
blt loop

(generated by CodeWarrior and then revised)

Exercise: (1) How many instructions will be executed? (2) Optimize the code to reduce the loop body to 4 instructions; (3) further reduce the loop body to 3 instructions. The loop body includes the branch instruction.

