High Performance RISC Engine Programming Manual

SuperH™ (SH) 32-Bit RISC MCU/MPU Series

SH7750

High-Performance RISC Engine

Programming Manual

ADE-602-156A

Rev. 2.0

03/04/99

Hitachi, Ltd.

Cautions

1. Hitachi neither warrants nor grants licenses of any rights of Hitachi’s or any third party’s patent, copyright, trademark, or other intellectual property rights for information contained in this document. Hitachi bears no responsibility for problems that may arise with third party’s rights, including intellectual property rights, in connection with use of the information contained in this document.

2. Products and product specifications may be subject to change without notice. Confirm that you have received the latest product standards or specifications before final design, purchase or use.

3. Hitachi makes every attempt to ensure that its products are of high quality and reliability. However, contact Hitachi’s sales office before using the product in an application that demands especially high quality and reliability or where its failure or malfunction may directly threaten human life or cause risk of bodily injury, such as aerospace, aeronautics, nuclear power, combustion control, transportation, traffic, safety equipment or medical equipment for life support.

4. Design your application so that the product is used within the ranges guaranteed by Hitachi particularly for maximum rating, operating supply voltage range, heat radiation characteristics, installation conditions and other characteristics. Hitachi bears no responsibility for failure or damage when used beyond the guaranteed ranges. Even within the guaranteed ranges, consider normally foreseeable failure rates or failure modes in semiconductor devices and employ systemic measures such as fail-safes, so that the equipment incorporating Hitachi product does not cause bodily injury, fire or other consequential damage due to operation of the Hitachi product.

5. This product is not designed to be radiation resistant.

6. No one is permitted to reproduce or duplicate, in any form, the whole or part of this document without written approval from Hitachi.

7. Contact Hitachi’s sales office for any questions regarding this document or Hitachi semiconductor products.

Rev. 2.0, 03/99, page v of 13

Preface

The SH-4 (SH7750) has been developed as the top-end model in the SuperH™ RISC engine family, featuring a 128-bit graphic engine for multimedia applications and 360 MIPS performance.

The SH7750 CPU has a RISC type instruction set, and features upward-compatibility at the object code level with SH-1, SH-2, SH-3, and SH-3E microcomputers.

In addition to single- and double-precision floating-point operation capability, the on-chip FPU has a 128-bit graphic engine that enables 32-bit floating-point data to be processed 128 bits at a time. It also supports 4 × 4 array operations and inner product operations, enabling a performance of 1.4 GFLOPS to be achieved.

A superscalar architecture is employed that enables simultaneous execution of two instructions (including FPU instructions), providing performance of up to twice that of conventional architectures at the same frequency.

SH7750 on-chip peripheral modules include oscillator circuits, an interrupt controller (INTC), direct memory access controller (DMAC), timer unit (TMU), real-time clock (RTC), serial communication interfaces (SCI, SCIF), and a user break controller (UBC), enabling a user system to be configured with a minimum of components.

An 8-kbyte instruction cache and 16-kbyte data cache are also provided, and the on-chip memory management unit (MMU) handles translation from the 4-Gbyte virtual address space to the physical address space. The bus state controller (BSC) supporting external memory access can handle a 64-bit synchronous DRAM 4-bank system and 64-bit data bus as well as ROM, SRAM, DRAM, synchronous DRAM, and PCMCIA.

This programming manual gives details of the SH7750 instructions. For hardware details, refer to the relevant hardware manual.

Related Manual: SH7750 Hardware Manual (Document No. ADE-602-124)

Please consult your Hitachi sales representative for information on development environment systems.

SuperH is a trademark of Hitachi, Ltd.
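As an illustration of the "4 × 4 array operations and inner product operations" mentioned in the preface, the following is a minimal sketch in portable C of the arithmetic involved. It is not SH7750 code and does not use the FPU's graphics instructions; the names VEC_LEN, inner_product4, and transform4 are hypothetical and chosen only for this example. A 4-element inner product needs 4 multiplications and 3 additions, 7 floating-point operations in total. If one such inner product were completed per clock cycle at 200 MHz (an assumption, since the preface does not state the clock frequency), that would correspond to 7 × 200 × 10^6 = 1.4 GFLOPS.

```c
#include <stddef.h>

/* Four 32-bit floats make up one 128-bit group (hypothetical name). */
#define VEC_LEN 4

/* Inner product of two 4-element vectors: 4 multiplies and 3 adds. */
float inner_product4(const float a[VEC_LEN], const float b[VEC_LEN])
{
    float sum = a[0] * b[0];
    for (size_t i = 1; i < VEC_LEN; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}

/* out = m * v for a 4 x 4 matrix: four inner products,
 * i.e. 16 multiplies and 12 adds in total. */
void transform4(const float m[VEC_LEN][VEC_LEN],
                const float v[VEC_LEN],
                float out[VEC_LEN])
{
    for (size_t row = 0; row < VEC_LEN; row++) {
        out[row] = inner_product4(m[row], v);
    }
}
```

The sketch writes the transform as four inner products to make the operation count visible. How the hardware actually schedules these operations is described in the instruction descriptions for the floating-point instructions, not here.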

Contents

Section 1 Overview
1.1 SH7750 Features
1.2 Block Diagram

Section 2 Programming Model
2.1 Data Formats
2.2 Register Configuration
2.2.1 Privileged Mode and Banks
2.2.2 General Registers
2.2.3 Floating-Point Registers
2.2.4 Control Registers
2.2.5 System Registers
2.3 Memory-Mapped Registers
2.4 Data Format in Registers
2.5 Data Formats in Memory
2.6 Processor States
2.7 Processor Modes

Section 3 Memory Management Unit (MMU)
3.1 Overview
3.1.1 Features
3.1.2 Role of the MMU
3.1.3 Register Configuration
3.1.4 Caution
3.2 Register Descriptions
3.3 Memory Space
3.3.1 Physical Memory Space
3.3.2 External Memory Space
3.3.3 Virtual Memory Space
3.3.4 On-Chip RAM Space
3.3.5 Address Translation
3.3.6 Single Virtual Memory Mode and Multiple Virtual Memory Mode
3.3.7 Address Space Identifier (ASID)
3.4 TLB Functions
3.4.1 Unified TLB (UTLB) Configuration
3.4.2 Instruction TLB (ITLB) Configuration
3.4.3 Address Translation Method
3.5 MMU Functions
3.5.1 MMU Hardware Management

3.5.2 MMU Software Management
3.5.3 MMU Instruction (LDTLB)
3.5.4 Hardware ITLB Miss Handling
3.5.5 Avoiding Synonym Problems
3.6 MMU Exceptions
3.6.1 Instruction TLB Multiple Hit Exception
3.6.2 Instruction TLB Miss Exception
3.6.3 Instruction TLB Protection Violation Exception
3.6.4 Data TLB Multiple Hit Exception
3.6.5 Data TLB Miss Exception
3.6.6 Data TLB Protection Violation Exception
3.6.7 Initial Page Write Exception
3.7 Memory-Mapped TLB Configuration
3.7.1 ITLB Address Array
3.7.2 ITLB Data Array 1
3.7.3 ITLB Data Array 2
3.7.4 UTLB Address Array
3.7.5 UTLB Data Array 1
3.7.6 UTLB Data Array 2

Section 4 Caches
4.1 Overview
4.1.1 Features
4.1.2 Register Configuration
4.2 Register Descriptions
4.3 Operand Cache (OC)
4.3.1 Configuration
4.3.2 Read Operation
4.3.3 Write Operation
4.3.4 Write-Back Buffer
4.3.5 Write-Through Buffer
4.3.6 RAM Mode
4.3.7 OC Index Mode
4.3.8 Coherency between Cache and External Memory
4.3.9 Prefetch Operation
4.4 Instruction Cache (IC)
4.4.1 Configuration
4.4.2 Read Operation
4.4.3 IC Index Mode
4.5 Memory-Mapped Cache Configuration
4.5.1 IC Address Array
4.5.2 IC Data Array
4.5.3 OC Address Array

4.5.4 OC Data Array
4.6 Store Queues
4.6.1 SQ Configuration
4.6.2 SQ Writes
4.6.3 Transfer to External Memory
4.6.4 SQ Protection

Section 5 Exceptions
5.1 Overview
5.1.1 Features
5.1.2 Register Configuration
5.2 Register Descriptions
5.3 Exception Handling Functions
5.3.1 Exception Handling Flow
5.3.2 Exception Handling Vector Addresses
5.4 Exception Types and Priorities
5.5 Exception Flow
5.5.1 Exception Flow
5.5.2 Exception Source Acceptance
5.5.3 Exception Requests and BL Bit
5.5.4 Return from Exception Handling
5.6 Description of Exceptions
5.6.1 Resets
5.6.2 General Exceptions
5.6.3 Interrupts
5.6.4 Priority Order with Multiple Exceptions
5.7 Usage Notes
5.8 Restrictions

Section 6 Floating-Point Unit
6.1 Overview
6.2 Data Formats
6.2.1 Floating-Point Format
6.2.2 Non-Numbers (NaN)
6.2.3 Denormalized Numbers
6.3 Registers
6.3.1 Floating-Point Registers
6.3.2 Floating-Point Status/Control Register (FPSCR)
6.3.3 Floating-Point Communication Register (FPUL)
6.4 Rounding
6.5 Floating-Point Exceptions
6.6 Graphics Support Functions
6.6.1 Geometric Operation Instructions

6.6.2 Pair Single-Precision Data Transfer

Section 7 Instruction Set
7.1 Execution Environment
7.2 Addressing Modes
7.3 Instruction Set

Section 8 Pipelining
8.1 Pipelines
8.2 Parallel-Executability
8.3 Execution Cycles and Pipeline Stalling

Section 9 Power-Down Modes
9.1 Overview
9.1.1 Types of Power-Down Modes
9.1.2 Register Configuration
9.2 Register Descriptions
9.2.1 Standby Control Register (STBCR)
9.2.2 Peripheral Module Pin High Impedance Control
9.2.3 Peripheral Module Pin Pull-Up Control
9.2.4 Standby Control Register 2 (STBCR2)
9.3 Sleep Mode
9.3.1 Transition to Sleep Mode
9.3.2 Exit from Sleep Mode
9.4 Deep Sleep Mode
9.4.1 Transition to Deep Sleep Mode
9.4.2 Exit from Deep Sleep Mode
9.5 Standby Mode
9.5.1 Transition to Standby Mode
9.5.2 Exit from Standby Mode
9.5.3 Clock Pause Function
9.6 Module Standby Function
9.6.1 Transition to Module Standby Function
9.6.2 Exit from Module Standby Function

Section 10 Instruction Descriptions
10.1 ADD: ADD binary (Arithmetic Instruction)
10.2 ADDC: ADD with Carry (Arithmetic Instruction)
10.3 ADDV: ADD with (V flag) overflow check (Arithmetic Instruction)
10.4 AND: AND logical (Logical Instruction)
10.5 BF: Branch if False (Branch Instruction)
10.6 BF/S: Branch if False with delay Slot (Branch Instruction)
10.7 BRA: BRAnch (Branch Instruction)

10.8 BRAF: BRAnch Far (Branch Instruction)
10.9 BSR: Branch to SubRoutine (Branch Instruction)
10.10 BSRF: Branch to SubRoutine Far (Branch Instruction)
10.11 BT: Branch if True (Branch Instruction)
10.12 BT/S: Branch if True with delay Slot (Branch Instruction)
10.13 CLRMAC: CleaR MAC register (System Control Instruction)
10.14 CLRS: CleaR S bit (System Control Instruction)
10.15 CLRT: CleaR T bit (System Control Instruction)
10.16 CMP/cond: CoMPare conditionally (Arithmetic Instruction)
10.17 DIV0S: DIVide (step 0) as Signed (Arithmetic Instruction)
10.18 DIV0U: DIVide (step 0) as Unsigned (Arithmetic Instruction)
10.19 DIV1: DIVide 1 step (Arithmetic Instruction)
10.20 DMULS.L: Double-length MULtiply as Signed (Arithmetic Instruction)
10.21 DMULU.L: Double-length MULtiply as Unsigned (Arithmetic Instruction)
10.22 DT: Decrement and Test (Arithmetic Instruction)
10.23 EXTS: EXTend as Signed (Arithmetic Instruction)
10.24 EXTU: EXTend as Unsigned (Arithmetic Instruction)
10.25 FABS: Floating-point ABSolute value (Floating-Point Instruction)
10.26 FADD: Floating-point ADD (Floating-Point Instruction)
10.27 FCMP: Floating-point CoMPare (Floating-Point Instruction)
10.28 FCNVDS: Floating-point CoNVert Double to Single precision (Floating-Point Instruction)
10.29 FCNVSD: Floating-point CoNVert Single to Double precision (Floating-Point Instruction)
10.30 FDIV: Floating-point DIVide (Floating-Point Instruction)
10.31 FIPR: Floating-point Inner PRoduct (Floating-Point Instruction)
10.32 FLDI0: Floating-point LoaD Immediate 0.0 (Floating-Point Instruction)
10.33 FLDI1: Floating-point LoaD Immediate 1.0 (Floating-Point Instruction)
10.34 FLDS: Floating-point LoaD to System register (Floating-Point Instruction)
10.35 FLOAT: Floating-point convert from integer (Floating-Point Instruction)
10.36 FMAC: Floating-point Multiply and ACcumulate (Floating-Point Instruction)
10.37 FMOV: Floating-point MOVe (Floating-Point Instruction)
10.38 FMOV: Floating-point MOVe extension (Floating-Point Instruction)
10.39 FMUL: Floating-point MULtiply (Floating-Point Instruction)
10.40 FNEG: Floating-point NEGate value (Floating-Point Instruction)
10.41 FRCHG: FR-bit CHanGe (Floating-Point Instruction)
10.42 FSCHG: Sz-bit CHanGe (Floating-Point Instruction)
10.43 FSQRT: Floating-point SQuare RooT (Floating-Point Instruction)
10.44 FSTS: Floating-point STore System register (Floating-Point Instruction)
10.45 FSUB: Floating-point SUBtract (Floating-Point Instruction)
10.46 FTRC: Floating-point TRuncate and Convert to integer (Floating-Point Instruction)

10.47 FTRV: Floating-point TRansform Vector (Floating-Point Instruction)
10.48 JMP: JuMP (Branch Instruction)
10.49 JSR: Jump to SubRoutine (Branch Instruction)
10.50 LDC: LoaD to Control register (System Control Instruction)
10.51 LDS: LoaD to FPU System register (System Control Instruction)
10.52 LDS: LoaD to System register (System Control Instruction)
10.53 LDTLB: LoaD PTEH/PTEL/PTEA to TLB (System Control Instruction)
10.54 MAC.L: Multiply and ACcumulate Long (Arithmetic Instruction)
10.55 MAC.W: Multiply and ACcumulate Word (Arithmetic Instruction)
10.56 MOV: MOVe data (Data Transfer Instruction)
10.57 MOV: MOVe constant value (Data Transfer Instruction)
10.58 MOV: MOVe global data (Data Transfer Instruction)
10.59 MOV: MOVe structure data (Data Transfer Instruction)
10.60 MOVA: MOVe effective Address (Data Transfer Instruction)
10.61 MOVCA.L: MOVe with Cache block Allocation (Data Transfer Instruction)
10.62 MOVT: MOVe T bit (Data Transfer Instruction)
10.63 MUL.L: MULtiply Long (Arithmetic Instruction)
10.64 MULS.W: MULtiply as Signed Word (Arithmetic Instruction)
10.65 MULU.W: MULtiply as Unsigned Word (Arithmetic Instruction)
10.66 NEG: NEGate (Arithmetic Instruction)
10.67 NEGC: NEGate with Carry (Arithmetic Instruction)
10.68 NOP: No OPeration (System Control Instruction)
10.69 NOT: NOT-logical complement (Logical Instruction)
10.70 OCBI: Operand Cache Block Invalidate (Data Transfer Instruction)
10.71 OCBP: Operand Cache Block Purge (Data Transfer Instruction)
10.72 OCBWB: Operand Cache Block Write Back (Data Transfer Instruction)
10.73 OR: OR logical (Logical Instruction)
10.74 PREF: PREFetch data to cache (Data Transfer Instruction)
10.75 ROTCL: ROTate with Carry Left (Shift Instruction)
10.76 ROTCR: ROTate with Carry Right (Shift Instruction)
10.77 ROTL: ROTate Left (Shift Instruction)
10.78 ROTR: ROTate Right (Shift Instruction)
10.79 RTE: ReTurn from Exception (System Control Instruction)
10.80 RTS: ReTurn from Subroutine (Branch Instruction)
10.81 SETS: SET S bit (System Control Instruction)
10.82 SETT: SET T bit (System Control Instruction)
10.83 SHAD: SHift Arithmetic Dynamically (Shift Instruction)
10.84 SHAL: SHift Arithmetic Left (Shift Instruction)
10.85 SHAR: SHift Arithmetic Right (Shift Instruction)
10.86 SHLD: SHift Logical Dynamically (Shift Instruction)
10.87 SHLL: SHift Logical Left (Shift Instruction)
10.88 SHLLn: n bits SHift Logical Left (Shift Instruction)
10.89 SHLR: SHift Logical Right (Shift Instruction)

Rev. 2.0, 03/99, page xiii of 13

10.90 SHLRn ........ n bits SHift Logical Right .......................Shift Instruction..................... 36510.91 SLEEP ......... SLEEP ....................................................System Control Instruction ... 36710.92 STC ............. STore Control register ............................System Control Instruction ... 36810.93 STS .............. STore System register .............................System Control Instruction ... 37310.94 STS .............. STore from FPU System register ............System Control Instruction ... 37510.95 SUB ............. SUBtract binary ......................................Arithmetic Instruction ........... 37710.96 SUBC .......... SUBtract with Carry ...............................Arithmetic Instruction ........... 37810.97 SUBV .......... SUBtract with (V flag) underflow check Arithmetic Instruction ........... 37910.98 SWAP ......... SWAP register halves .............................Data Transfer Instruction ...... 38110.99 TAS ............. Test And Set ...........................................Logical Instruction ................ 38310.100 TRAPA ........ TRAP Always .........................................System Control Instruction ... 38510.101 TST ............. TeST logical ...........................................Logical Instruction ................ 38610.102 XOR ............ eXclusive OR logical ..............................Logical Instruction ................ 38810.103 XTRCT ........ eXTRaCT ...............................................Data Transfer Instruction ...... 390

Appendix A Address List

Appendix B Instruction Prefetch Side Effects

Section 1 Overview

1.1 SH7750 Features

The SH7750 is a 32-bit RISC (reduced instruction set computer) microprocessor, featuring object code upward-compatibility with SH-1, SH-2, SH-3, and SH-3E microcomputers. It includes an 8-kbyte instruction cache, a 16-kbyte operand cache with a choice of copy-back or write-through mode, and an MMU (memory management unit) with a 64-entry fully-associative unified TLB (translation lookaside buffer).

The SH7750 has an on-chip bus state controller (BSC) that allows direct connection to DRAM and synchronous DRAM without external circuitry. Its 16-bit fixed-length instruction set enables program code size to be reduced by almost 50% compared with 32-bit instructions.

The features of the SH7750 are summarized in table 1.1.

Table 1.1 SH7750 Features

Item Features

LSI • Operating frequency: 200 MHz

• Performance:

360 MIPS (200 MHz)

1.4 GFLOPS (200 MHz)

• Superscalar architecture: Parallel execution of two instructions

• Voltage: 1.8 V (internal), 3.3 V (I/O)

• Packages: 256-pin BGA, 208-pin QFP

• External buses

Separate 26-bit address and 64-bit data buses

External bus frequency of 1/2, 1/3, 1/4, 1/6, or 1/8 times internal bus frequency

CPU • Original Hitachi SH architecture

• 32-bit internal data bus

• General register file:

Sixteen 32-bit general registers (and eight 32-bit shadow registers)

Seven 32-bit control registers

Four 32-bit system registers

• RISC-type instruction set (upward-compatible with SH Series)

Fixed 16-bit instruction length for improved code efficiency

Load-store architecture

Delayed branch instructions

Conditional execution

C-based instruction set

• Superscalar architecture (providing simultaneous execution of two instructions) including FPU

• Instruction execution time: Maximum 2 instructions/cycle

• Virtual address space: 4 Gbytes (448-Mbyte external memory space)

• Space identifier ASIDs: 8 bits, 256 virtual address spaces

• On-chip multiplier

• Five-stage pipeline

FPU • On-chip floating-point coprocessor

• Supports single-precision (32 bits) and double-precision (64 bits)

• Supports IEEE754-compliant data types and exceptions

• Two rounding modes: Round to Nearest and Round to Zero

• Handling of denormalized numbers: Truncation to zero or interrupt generation for compliance with IEEE754

• Floating-point registers: 32 bits × 16 words × 2 banks (single-precision × 16 words or double-precision × 8 words, in each of the 2 banks)

• 32-bit CPU-FPU floating-point communication register (FPUL)

• Supports FMAC (multiply-and-accumulate) instruction

• Supports FDIV (divide) and FSQRT (square root) instructions

• Supports FLDI0/FLDI1 (load constant 0/1) instructions

• Instruction execution times

Latency (FMAC/FADD/FSUB/FMUL): 3 cycles (single-precision), 8 cycles (double-precision)

Pitch (FMAC/FADD/FSUB/FMUL): 1 cycle (single-precision), 6 cycles (double-precision)

Note: FMAC is supported for single-precision only.

• 3-D graphics instructions (single-precision only):

4-dimensional vector conversion and matrix operations (FTRV): 4 cycles (pitch), 7 cycles (latency)

4-dimensional vector (FIPR) inner product: 1 cycle (pitch), 4 cycles (latency)

• Five-stage pipeline

Clock pulse generator (CPG)

• Choice of main clock: 1/2, 1, 3, or 6 times EXTAL

• Clock modes:

CPU frequency: 1, 1/2, 1/3, 1/4, 1/6, or 1/8 times main clock: maximum 200 MHz

Bus frequency: 1/2, 1/3, 1/4, 1/6, or 1/8 times main clock: maximum 100 MHz

Peripheral frequency: 1/2, 1/3, 1/4, 1/6, or 1/8 times main clock: maximum 50 MHz

• Power-down modes

Sleep mode

Standby mode

Module standby function

• Single-channel watchdog timer

Memory management unit (MMU)

• 4-Gbyte address space, 256 address space identifiers (8-bit ASIDs)

• Single virtual memory mode and multiple virtual memory mode

• Supports multiple page sizes: 1 kbyte, 4 kbytes, 64 kbytes, 1 Mbyte

• 4-entry fully-associative TLB for instructions

• 64-entry fully-associative TLB for instructions and operands

• Supports software-controlled replacement and random-counter replacement algorithm

• TLB contents can be accessed directly by address mapping

Cache memory • Instruction cache (IC)

8 kbytes, direct mapping

256 entries, 32-byte block length

Normal mode (8-kbyte cache)

Index mode

• Operand cache (OC)

16 kbytes, direct mapping

512 entries, 32-byte block length

Normal mode (16-kbyte cache)

Index mode

RAM mode (8-kbyte cache + 8-kbyte RAM)

Choice of write method (copy-back or write-through)

• Single-stage copy-back buffer, single-stage write-through buffer

• Cache memory contents can be accessed directly by address mapping (usable as on-chip memory)

• Store queue (32 bytes × 2 entries)

Interrupt controller (INTC)

• Five independent external interrupts (NMI, IRL3 to IRL0)

• 15-level signed external interrupts: IRL3 to IRL0

• On-chip peripheral module interrupts: Priority level can be set for each module

User break controller (UBC)

• Supports debugging by means of user break interrupts

• Two break channels

• Address, data value, access type, and data size can all be set as break conditions

• Supports sequential break function

Bus state controller (BSC)

• Supports external memory access

64/32/16/8-bit external data bus

• External memory space divided into seven areas, each of up to 64 Mbytes, with the following parameters settable for each area:

Bus size (8, 16, 32, or 64 bits)

Number of wait cycles (hardware wait function also supported)

Direct connection of DRAM, synchronous DRAM, and burst ROM possible by setting space type

Supports fast page mode and DRAM EDO

Supports PCMCIA interface

Chip select signals (CS0 to CS6) output for relevant areas

• DRAM/synchronous DRAM refresh functions

Programmable refresh interval

Supports CAS-before-RAS refresh mode and self-refresh mode

• DRAM/synchronous DRAM burst access function

• Big endian or little endian mode can be set

Direct memory access controller (DMAC)

• 4-channel physical address DMA controller

• Transfer data size: 8, 16, 32, or 64 bits, or 32 bytes

• Address modes:

1-bus-cycle single address mode

2-bus-cycle dual address mode

• Transfer requests: External, on-chip module, or auto-requests

• Bus modes: Cycle-steal or burst mode

• Supports on-demand data transfer

Timer unit (TMU) • 3-channel auto-reload 32-bit timer

• Input capture function

• Choice of seven counter input clocks

Realtime clock (RTC)

• On-chip clock and calendar functions

• Built-in 32 kHz crystal oscillator with maximum 1/256 second resolution (cycle interrupts)

Serial communication interface (SCI, SCIF)

• Two full-duplex communication channels (SCI, SCIF)

• Channel 1 (SCI):

Choice of asynchronous mode or synchronous mode

Supports smart card interface

• Channel 2 (SCIF):

Supports asynchronous mode

Separate 16-byte FIFOs provided for transmitter and receiver

Packages • 256-pin BGA, 208-pin QFP

1.2 Block Diagram

Figure 1.1 shows an internal block diagram of the SH7750.

[Figure 1.1: block diagram. Blocks shown: CPU, FPU, CCN, ITLB, UTLB, I cache (8 kB), O cache (16 kB), BSC, DMAC, UBC, CPG, INTC, SCI (SCIF), RTC, TMU, and the external bus interface. Bus labels shown: 32-bit address (instructions), 32-bit data (instructions), 32-bit address (data), 32-bit data (load), 32-bit data (store), 64-bit data (store), upper 32-bit data, lower 32-bit data, 29-bit address, 32-bit data, 64-bit data, peripheral address bus, 16-bit peripheral data bus, and the external 26-bit address and 64-bit data buses.]

CCN: Cache and TLB controller
BSC: Bus state controller
CPG: Clock pulse generator
DMAC: Direct memory access controller
FPU: Floating-point unit
INTC: Interrupt controller
ITLB: Instruction TLB (translation lookaside buffer)
UTLB: Unified TLB (translation lookaside buffer)
RTC: Realtime clock
SCI: Serial communication interface
SCIF: Serial communication interface with FIFO
TMU: Timer unit
UBC: User break controller

Figure 1.1 Block Diagram of SH7750 Functions

Section 2 Programming Model

2.1 Data Formats

The data formats handled by the SH7750 are shown in figure 2.1.

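Byte (8 bits): bits 7 to 0

Word (16 bits): bits 15 to 0

Longword (32 bits): bits 31 to 0

Single-precision floating-point (32 bits): s (sign) = bit 31, exp (exponent) = bits 30 to 23, fraction = bits 22 to 0

Double-precision floating-point (64 bits): s (sign) = bit 63, exp (exponent) = bits 62 to 52, fraction = bits 51 to 0

Figure 2.1 Data Formats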
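As an illustration of the single-precision layout in figure 2.1, the following C sketch splits a 32-bit pattern into its sign, exponent, and fraction fields. The type name sp_fields and the function name sp_unpack are hypothetical and are not part of any SH7750 software interface.

```c
#include <stdint.h>

/* Fields of a single-precision value, following figure 2.1.
 * sp_fields and sp_unpack are illustrative names only. */
typedef struct {
    uint32_t sign;      /* bit 31 */
    uint32_t exponent;  /* bits 30 to 23 */
    uint32_t fraction;  /* bits 22 to 0 */
} sp_fields;

static sp_fields sp_unpack(uint32_t bits)
{
    sp_fields f;
    f.sign     = (bits >> 31) & 0x1u;
    f.exponent = (bits >> 23) & 0xFFu;
    f.fraction = bits & 0x007FFFFFu;
    return f;
}
```

For example, the pattern H'3F800000 (1.0) unpacks to sign 0, exponent 127, and fraction 0.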

2.2 Register Configuration

2.2.1 Privileged Mode and Banks

Processor Modes: The SH7750 has two processor modes, user mode and privileged mode. The SH7750 normally operates in user mode, and switches to privileged mode when an exception occurs or an interrupt is accepted. There are four kinds of registers (general registers, system registers, control registers, and floating-point registers), and the registers that can be accessed differ in the two processor modes.

General Registers: There are 16 general registers, designated R0 to R15. General registers R0 to R7 are banked registers which are switched by a processor mode change.

In privileged mode, the register bank bit (RB) in the status register (SR) defines which banked register set is accessed as general registers, and which set is accessed only through the load control register (LDC) and store control register (STC) instructions.

When the RB bit is 1 (that is, when bank 1 is selected), the 16 registers comprising bank 1 general registers R0_BANK1 to R7_BANK1 and non-banked general registers R8 to R15 can be accessed as general registers R0 to R15. In this case, the eight registers comprising bank 0 general registers R0_BANK0 to R7_BANK0 are accessed by the LDC/STC instructions. When the RB bit is 0 (that is, when bank 0 is selected), the 16 registers comprising bank 0 general registers R0_BANK0 to R7_BANK0 and non-banked general registers R8 to R15 can be accessed as general registers R0 to R15. In this case, the eight registers comprising bank 1 general registers R0_BANK1 to R7_BANK1 are accessed by the LDC/STC instructions.

In user mode, the 16 registers comprising bank 0 general registers R0_BANK0 to R7_BANK0 and non-banked general registers R8 to R15 can be accessed as general registers R0 to R15. The eight registers comprising bank 1 general registers R0_BANK1 to R7_BANK1 cannot be accessed.

Control Registers: Control registers comprise the global base register (GBR) and status register (SR), which can be accessed in both processor modes, and the saved status register (SSR), saved program counter (SPC), vector base register (VBR), saved general register 15 (SGR), and debug base register (DBR), which can only be accessed in privileged mode. Some bits of the status register (such as the RB bit) can only be accessed in privileged mode.

System Registers: System registers comprise the multiply-and-accumulate registers (MACH/MACL), the procedure register (PR), the program counter (PC), the floating-point status/control register (FPSCR), and the floating-point communication register (FPUL). Access to these registers does not depend on the processor mode.

Floating-Point Registers: There are thirty-two floating-point registers, FR0–FR15 and XF0–XF15. FR0–FR15 and XF0–XF15 can be assigned to either of two banks (FPR0_BANK0–FPR15_BANK0 or FPR0_BANK1–FPR15_BANK1).

FR0–FR15 can be used as the eight registers DR0/2/4/6/8/10/12/14 (double-precision floating-point registers, or pair registers) or the four registers FV0/4/8/12 (register vectors), while XF0–XF15 can be used as the eight registers XD0/2/4/6/8/10/12/14 (register pairs) or register matrix XMTRX.

Register values after a reset are shown in table 2.1.

Table 2.1 Initial Register Values

Type / Registers: Initial Value*

General registers
R0_BANK0–R7_BANK0, R0_BANK1–R7_BANK1, R8–R15: Undefined

Control registers
SR: MD bit = 1, RB bit = 1, BL bit = 1, FD bit = 0, I3–I0 = 1111 (H'F), reserved bits = 0, others undefined
GBR, SSR, SPC, SGR, DBR: Undefined
VBR: H'00000000

System registers
MACH, MACL, PR, FPUL: Undefined
PC: H'A0000000
FPSCR: H'00040001

Floating-point registers
FR0–FR15, XF0–XF15: Undefined

Note: * Initialized by a power-on reset and manual reset.

The register configuration in each processor mode is shown in figure 2.2.

Switching between user mode and privileged mode is controlled by the processor mode bit (MD) in the status register.

[Figure 2.2: register diagrams for each processor mode. Every register shown is 32 bits wide (bits 31 to 0).]

(a) Register configuration in user mode: R0_BANK0–R7_BANK0 (*2; R0_BANK0 also *1), R8–R15, SR, GBR, MACH, MACL, PR, PC

(b) Register configuration in privileged mode (RB = 1): R0_BANK1–R7_BANK1 (*3; R0_BANK1 also *1), R8–R15, R0_BANK0–R7_BANK0 (*4; R0_BANK0 also *1), SR, SSR, GBR, MACH, MACL, PR, VBR, PC, SPC, SGR, DBR

(c) Register configuration in privileged mode (RB = 0): R0_BANK0–R7_BANK0 (*4; R0_BANK0 also *1), R8–R15, R0_BANK1–R7_BANK1 (*3; R0_BANK1 also *1), SR, SSR, GBR, MACH, MACL, PR, VBR, PC, SPC, SGR, DBR

Notes: 1. The R0 register is used as the index register in indexed register-indirect addressing mode and indexed GBR indirect addressing mode.

2. Banked registers

3. Banked registers. Accessed as general registers when the RB bit is set to 1 in the SR register. Accessed only by LDC/STC instructions when the RB bit is cleared to 0.

4. Banked registers. Accessed as general registers when the RB bit is cleared to 0 in the SR register. Accessed only by LDC/STC instructions when the RB bit is set to 1.

Figure 2.2 CPU Register Configuration in Each Processor Mode

2.2.2 General Registers

Figure 2.3 shows the relationship between the processor modes and general registers. The SH7750 has twenty-four 32-bit general registers (R0_BANK0–R7_BANK0, R0_BANK1–R7_BANK1, and R8–R15). However, only 16 of these can be accessed as general registers R0–R15 in one processor mode. The SH7750 has two processor modes, user mode and privileged mode, in which R0–R7 are assigned as shown below.

• R0_BANK0–R7_BANK0

In user mode (SR.MD = 0), R0–R7 are always assigned to R0_BANK0–R7_BANK0.

In privileged mode (SR.MD = 1), R0–R7 are assigned to R0_BANK0–R7_BANK0 only when SR.RB = 0.

• R0_BANK1–R7_BANK1

In user mode, R0_BANK1–R7_BANK1 cannot be accessed.

In privileged mode, R0–R7 are assigned to R0_BANK1–R7_BANK1 only when SR.RB = 1.
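The assignment rules above can be sketched in C as a small model of the bank selection. This is an illustration only: the function name active_gpr_bank and the macro names are hypothetical, and the bit positions (MD = bit 30, RB = bit 29) follow the SR layout given in section 2.2.4.

```c
#include <stdint.h>

#define SR_MD (1u << 30)  /* processor mode bit */
#define SR_RB (1u << 29)  /* register bank bit  */

/* Returns the bank (0 or 1) that the names R0-R7 refer to
 * for a given SR value. Illustrative model, not a hardware API. */
static int active_gpr_bank(uint32_t sr)
{
    if ((sr & SR_MD) == 0)
        return 0;                 /* user mode: always bank 0 */
    return (sr & SR_RB) ? 1 : 0;  /* privileged mode: selected by RB */
}
```

In this model the RB bit has no effect while SR.MD = 0, which matches the statement that bank 1 cannot be accessed in user mode.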

[Figure 2.3: mapping of general register names to banked registers. When SR.MD = 0 or (SR.MD = 1, SR.RB = 0), R0–R7 refer to R0_BANK0–R7_BANK0. When (SR.MD = 1, SR.RB = 1), R0–R7 refer to R0_BANK1–R7_BANK1. R8–R15 are common to both cases.]

Figure 2.3 General Registers

Programming Note: As the user's R0–R7 are assigned to R0_BANK0–R7_BANK0, and after an exception or interrupt R0–R7 are assigned to R0_BANK1–R7_BANK1, it is not necessary for the interrupt handler to save and restore the user's R0–R7 (R0_BANK0–R7_BANK0).

After a reset, the values of R0_BANK0–R7_BANK0, R0_BANK1–R7_BANK1, and R8–R15 are undefined.

2.2.3 Floating-Point Registers

Figure 2.4 shows the floating-point registers. There are thirty-two 32-bit floating-point registers, divided into two banks (FPR0_BANK0–FPR15_BANK0 and FPR0_BANK1–FPR15_BANK1). These 32 registers are referenced as FR0–FR15, DR0/2/4/6/8/10/12/14, FV0/4/8/12, XF0–XF15, XD0/2/4/6/8/10/12/14, or XMTRX. The correspondence between FPRn_BANKi and the reference name is determined by the FR bit in FPSCR (see figure 2.4).

• Floating-point registers, FPRn_BANKi (32 registers)

FPR0_BANK0, FPR1_BANK0, FPR2_BANK0, FPR3_BANK0, FPR4_BANK0, FPR5_BANK0, FPR6_BANK0, FPR7_BANK0, FPR8_BANK0, FPR9_BANK0, FPR10_BANK0, FPR11_BANK0, FPR12_BANK0, FPR13_BANK0, FPR14_BANK0, FPR15_BANK0

FPR0_BANK1, FPR1_BANK1, FPR2_BANK1, FPR3_BANK1, FPR4_BANK1, FPR5_BANK1, FPR6_BANK1, FPR7_BANK1, FPR8_BANK1, FPR9_BANK1, FPR10_BANK1, FPR11_BANK1, FPR12_BANK1, FPR13_BANK1, FPR14_BANK1, FPR15_BANK1

• Single-precision floating-point registers, FRi (16 registers)

When FPSCR.FR = 0, FR0–FR15 are assigned to FPR0_BANK0–FPR15_BANK0.

When FPSCR.FR = 1, FR0–FR15 are assigned to FPR0_BANK1–FPR15_BANK1.

• Double-precision floating-point registers or single-precision floating-point register pairs, DRi (8 registers): A DR register comprises two FR registers.

DR0 = {FR0, FR1}, DR2 = {FR2, FR3}, DR4 = {FR4, FR5}, DR6 = {FR6, FR7}, DR8 = {FR8, FR9}, DR10 = {FR10, FR11}, DR12 = {FR12, FR13}, DR14 = {FR14, FR15}

• Single-precision floating-point vector registers, FVi (4 registers): An FV register comprises four FR registers.

FV0 = {FR0, FR1, FR2, FR3}, FV4 = {FR4, FR5, FR6, FR7}, FV8 = {FR8, FR9, FR10, FR11}, FV12 = {FR12, FR13, FR14, FR15}

• Single-precision floating-point extended registers, XFi (16 registers)

When FPSCR.FR = 0, XF0–XF15 are assigned to FPR0_BANK1–FPR15_BANK1.

When FPSCR.FR = 1, XF0–XF15 are assigned to FPR0_BANK0–FPR15_BANK0.

• Single-precision floating-point extended register pairs, XDi (8 registers): An XD register comprises two XF registers.

XD0 = {XF0, XF1}, XD2 = {XF2, XF3}, XD4 = {XF4, XF5}, XD6 = {XF6, XF7}, XD8 = {XF8, XF9}, XD10 = {XF10, XF11}, XD12 = {XF12, XF13}, XD14 = {XF14, XF15}

• Single-precision floating-point extended register matrix, XMTRX: XMTRX comprises all 16 XF registers.

XMTRX = XF0 XF4 XF8 XF12

XF1 XF5 XF9 XF13

XF2 XF6 XF10 XF14

XF3 XF7 XF11 XF15
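The naming rules in the list above can be summarized with a short C sketch. It models only the name-to-bank correspondence and the XMTRX element order. The function and macro names are hypothetical, and the FR bit position (bit 21) is taken from the FPSCR layout in section 2.2.5.

```c
#include <stdint.h>

#define FPSCR_FR (1u << 21)  /* floating-point register bank bit */

/* Bank i of FPRn_BANKi that the name FRn refers to. */
static int fr_bank(uint32_t fpscr)
{
    return (fpscr & FPSCR_FR) ? 1 : 0;
}

/* XFn always refers to the bank not used by FRn. */
static int xf_bank(uint32_t fpscr)
{
    return 1 - fr_bank(fpscr);
}

/* Index n of the XFn register at (row, col) of XMTRX, both 0 to 3.
 * The matrix is laid out column by column: XF0-XF3 form column 0. */
static int xmtrx_index(int row, int col)
{
    return 4 * col + row;
}
```

For instance, the element in row 1, column 2 of XMTRX is XF9.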

[Figure 2.4: correspondence between FPRn_BANKi and the reference names. With FPSCR.FR = 0, FPR0_BANK0–FPR15_BANK0 are referenced as FR0–FR15 (DR0–DR14, FV0–FV12) and FPR0_BANK1–FPR15_BANK1 as XF0–XF15 (XD0–XD14, XMTRX). With FPSCR.FR = 1, the assignment of the two banks is reversed.]

Figure 2.4 Floating-Point Registers

Programming Note: After a reset, the values of FPR0_BANK0–FPR15_BANK0 and FPR0_BANK1–FPR15_BANK1 are undefined.

2.2.4 Control Registers

Status register, SR (32 bits, privilege protection, initial value = 0111 0000 0000 0000 0000 00XX 1111 00XX)

Bit 31: — (reserved)
Bit 30: MD
Bit 29: RB
Bit 28: BL
Bits 27 to 16: — (reserved)
Bit 15: FD
Bits 14 to 10: — (reserved)
Bit 9: M
Bit 8: Q
Bits 7 to 4: IMASK
Bits 3 to 2: — (reserved)
Bit 1: S
Bit 0: T

Note: —: Reserved. These bits are always read as 0, and should only be written with 0.
X: Undefined

• MD: Processor mode

MD = 0: User mode (some instructions cannot be executed, and some resources cannot be accessed)

MD = 1: Privileged mode

• RB: General register bank specifier in privileged mode (set to 1 by a reset, exception, or interrupt)

RB = 0: R0_BANK0–R7_BANK0 are accessed as general registers R0–R7. (R0_BANK1–R7_BANK1 can be accessed using LDC/STC R0_BANK–R7_BANK instructions.)

RB = 1: R0_BANK1–R7_BANK1 are accessed as general registers R0–R7. (R0_BANK0–R7_BANK0 can be accessed using LDC/STC R0_BANK–R7_BANK instructions.)

• BL: Exception/interrupt block bit (set to 1 by a reset, exception, or interrupt)

BL = 1: Interrupt requests are masked. If a general exception other than a user break occurs while BL = 1, the processor switches to the reset state.

• FD: FPU disable bit (cleared to 0 by a reset)

FD = 1: An FPU instruction causes a general FPU disable exception, and if the FPU instruction is in a delay slot, a slot FPU disable exception is generated. (FPU instructions: H'F*** instructions, LDS(.L)/STS(.L) instructions for FPUL/FPSCR)

• M, Q: Used by the DIV0S, DIV0U, and DIV1 instructions.

• IMASK: Interrupt mask level

External interrupts of the same level as or a lower level than IMASK are masked.

• S: Specifies a saturation operation for a MAC instruction.

• T: True/false condition or carry/borrow bit
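The SR field positions listed above can be expressed as a small C decoder. This is a sketch for illustration: the type name sr_fields and the function name sr_decode are hypothetical, and the code only extracts the fields named in this section.

```c
#include <stdint.h>

/* Decoded SR fields. sr_fields and sr_decode are illustrative names. */
typedef struct {
    unsigned md, rb, bl, fd, m, q, imask, s, t;
} sr_fields;

static sr_fields sr_decode(uint32_t sr)
{
    sr_fields f;
    f.md    = (sr >> 30) & 0x1u;  /* processor mode           */
    f.rb    = (sr >> 29) & 0x1u;  /* general register bank    */
    f.bl    = (sr >> 28) & 0x1u;  /* exception/interrupt block */
    f.fd    = (sr >> 15) & 0x1u;  /* FPU disable              */
    f.m     = (sr >> 9)  & 0x1u;  /* used by DIV0S/DIV0U/DIV1 */
    f.q     = (sr >> 8)  & 0x1u;  /* used by DIV0S/DIV0U/DIV1 */
    f.imask = (sr >> 4)  & 0xFu;  /* interrupt mask level     */
    f.s     = (sr >> 1)  & 0x1u;  /* MAC saturation           */
    f.t     = sr & 0x1u;          /* true/false, carry/borrow */
    return f;
}
```

Decoding H'700000F0, which is the initial value with the undefined bits taken as 0, gives MD = 1, RB = 1, BL = 1, FD = 0, and IMASK = H'F.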

Saved status register, SSR (32 bits, privilege protection, initial value undefined): The current contents of SR are saved to SSR in the event of an exception or interrupt.

Saved program counter, SPC (32 bits, privilege protection, initial value undefined): The address of an instruction at which an interrupt or exception occurs is saved to SPC.

Global base register, GBR (32 bits, initial value undefined): GBR is referenced as the base address in a GBR-referencing MOV instruction.

Vector base register, VBR (32 bits, privilege protection, initial value = H'0000 0000): VBR is referenced as the branch destination base address in the event of an exception or interrupt. For details, see section 5, Exceptions.

Saved general register 15, SGR (32 bits, privilege protection, initial value undefined): The contents of R15 are saved to SGR in the event of an exception or interrupt.

Debug base register, DBR (32 bits, privilege protection, initial value undefined): When the user break debug function is enabled (BRCR.UBDE = 1), DBR is referenced as the user break handler branch destination address instead of VBR.

2.2.5 System Registers

Multiply-and-accumulate register high, MACH (32 bits, initial value undefined)

Multiply-and-accumulate register low, MACL (32 bits, initial value undefined)

MACH/MACL is used for the added value in a MAC instruction, and to store a MAC instruction or MUL operation result.

Procedure register, PR (32 bits, initial value undefined): The return address is stored in PR in a subroutine call using a BSR, BSRF, or JSR instruction, and PR is referenced by the subroutine return instruction (RTS).

Program counter, PC (32 bits, initial value = H'A000 0000): PC indicates the instruction fetch address.

Floating-point status/control register, FPSCR (32 bits, initial value = H'0004 0001)

Bits 31 to 22: — (reserved)
Bit 21: FR
Bit 20: SZ
Bit 19: PR
Bit 18: DN
Bits 17 to 12: Cause
Bits 11 to 7: Enable
Bits 6 to 2: Flag
Bits 1 to 0: RM

Note: —: Reserved. These bits are always read as 0, and should only be written with 0.

• FR: Floating-point register bank

FR = 0: FPR0_BANK0–FPR15_BANK0 are assigned to FR0–FR15; FPR0_BANK1–FPR15_BANK1 are assigned to XF0–XF15.

FR = 1: FPR0_BANK0–FPR15_BANK0 are assigned to XF0–XF15; FPR0_BANK1–FPR15_BANK1 are assigned to FR0–FR15.

• SZ: Transfer size mode

SZ = 0: The data size of the FMOV instruction is 32 bits.

SZ = 1: The data size of the FMOV instruction is a 32-bit register pair (64 bits).

• PR: Precision mode

PR = 0: Floating-point instructions are executed as single-precision operations.

PR = 1: Floating-point instructions are executed as double-precision operations (the result of instructions for which double-precision is not supported is undefined).

Do not set SZ and PR to 1 simultaneously; this setting is reserved.

[SZ, PR = 11]: Reserved (FPU operation instruction is undefined.)

• DN: Denormalization mode

DN = 0: A denormalized number is treated as such.

DN = 1: A denormalized number is treated as zero.

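• Cause, Enable, Flag: FPU exception fields. The bit assigned to each exception source is as follows.

Cause (FPU exception cause field): FPU Error (E) = bit 17, Invalid Operation (V) = bit 16, Division by Zero (Z) = bit 15, Overflow (O) = bit 14, Underflow (U) = bit 13, Inexact (I) = bit 12

Enable (FPU exception enable field): FPU Error (E) = none, Invalid Operation (V) = bit 11, Division by Zero (Z) = bit 10, Overflow (O) = bit 9, Underflow (U) = bit 8, Inexact (I) = bit 7

Flag (FPU exception flag field): FPU Error (E) = none, Invalid Operation (V) = bit 6, Division by Zero (Z) = bit 5, Overflow (O) = bit 4, Underflow (U) = bit 3, Inexact (I) = bit 2

When an FPU operation instruction is executed, the cause field is cleared to zero first. When the next FPU exception is requested, the corresponding bits in the cause field and flag field are set to 1. The flag field holds the status of the exception generated after the field was last cleared.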

• RM: Rounding mode

RM = 00: Round to Nearest

RM = 01: Round to Zero

RM = 10: Reserved

RM = 11: Reserved

• Bits 22 to 31: Reserved
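The FPSCR field layout can be sketched in C in the same way as a decoder. The names fpscr_fields, fpscr_decode, and fpscr_sz_pr_reserved are hypothetical; the bit positions follow the layout given at the start of this register description.

```c
#include <stdint.h>

/* Decoded FPSCR fields. All names here are illustrative. */
typedef struct {
    unsigned fr, sz, pr, dn;      /* bits 21, 20, 19, 18 */
    unsigned cause;               /* bits 17 to 12 */
    unsigned enable;              /* bits 11 to 7  */
    unsigned flag;                /* bits 6 to 2   */
    unsigned rm;                  /* bits 1 to 0   */
} fpscr_fields;

static fpscr_fields fpscr_decode(uint32_t v)
{
    fpscr_fields f;
    f.fr     = (v >> 21) & 0x1u;
    f.sz     = (v >> 20) & 0x1u;
    f.pr     = (v >> 19) & 0x1u;
    f.dn     = (v >> 18) & 0x1u;
    f.cause  = (v >> 12) & 0x3Fu;
    f.enable = (v >> 7)  & 0x1Fu;
    f.flag   = (v >> 2)  & 0x1Fu;
    f.rm     = v & 0x3u;
    return f;
}

/* Nonzero when SZ and PR are both 1, the reserved combination. */
static int fpscr_sz_pr_reserved(uint32_t v)
{
    return ((v >> 20) & 0x1u) && ((v >> 19) & 0x1u);
}
```

Decoding the initial value H'00040001 gives DN = 1 (denormalized numbers treated as zero) and RM = 01 (Round to Zero), with all other fields 0.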

Floating-point communication register, FPUL (32 bits, initial value undefined): Data transfer between FPU registers and CPU registers is carried out via the FPUL register.

Programming Note: When SZ = 1 and big endian mode is selected, FMOV can be used for double-precision floating-point load or store operations. In little endian mode, two 32-bit data size moves must be executed, with SZ = 0, to load or store a double-precision floating-point number.

2.3 Memory-Mapped Registers

Appendix A shows the control registers mapped to memory. The control registers are double-mapped to the following two memory areas. All registers have two addresses.

H'1F00 0000–H'1FFF FFFF
H'FF00 0000–H'FFFF FFFF

These two areas are used as follows.

• H'1F00 0000–H'1FFF FFFF

This area must be accessed in address translation mode using the TLB. Since external memory is defined as a 29-bit address space in the SH7750 architecture, the TLB's physical page numbers do not cover a 32-bit address space. In address translation, the page numbers of this area can be set in the corresponding field of the TLB by accessing a memory-mapped register. The page numbers of this area should be used as the actual page numbers set in the TLB. When address translation is not performed, the operation of accesses to this area is undefined.

• H'FF00 0000–H'FFFF FFFF

This area must be accessed without address translation.

Do not access undefined locations in either area. The operation of an access to an undefined location is undefined. Also, memory-mapped registers must be accessed using a fixed data size. The operation of an access using an invalid data size is undefined.

Programming Note: Access to area H'FF00 0000–H'FFFF FFFF in user mode will cause an address error. Memory-mapped registers can be referenced in user mode by means of access that involves address translation.
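The relationship between the two areas can be sketched in C as follows. The sketch assumes that the two addresses of a register differ only in the upper address bits, so that masking an H'FFxx xxxx address to 29 bits yields its H'1Fxx xxxx counterpart; Appendix A remains the authoritative list of register addresses. All function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* True if addr lies in H'FF00 0000-H'FFFF FFFF
 * (accessed without address translation). */
static bool in_untranslated_area(uint32_t addr)
{
    return addr >= 0xFF000000u;
}

/* True if addr lies in H'1F00 0000-H'1FFF FFFF
 * (used as the physical page when accessing through the TLB). */
static bool in_translated_area(uint32_t addr)
{
    return addr >= 0x1F000000u && addr <= 0x1FFFFFFFu;
}

/* Assumed mapping: keep the low 29 bits of the untranslated address. */
static uint32_t translated_alias(uint32_t addr)
{
    return addr & 0x1FFFFFFFu;
}
```

Under this assumption, a register at H'FF00 0000 would have H'1F00 0000 as its second address.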

2.4 Data Format in Registers

Register operands are always longwords (32 bits). When a memory operand is only a byte (8 bits) or a word (16 bits), it is sign-extended into a longword when loaded into a register.

Longword: bits 31 to 0
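The sign extension described above can be illustrated with a short C sketch. The function names are hypothetical, and the code models only the resulting register value, not the load instructions themselves.

```c
#include <stdint.h>

/* Register value after loading a byte operand: bit 7 is copied
 * into bits 31 to 8. */
static uint32_t load_byte_sext(uint8_t b)
{
    return (b & 0x80u) ? (0xFFFFFF00u | b) : b;
}

/* Register value after loading a word operand: bit 15 is copied
 * into bits 31 to 16. */
static uint32_t load_word_sext(uint16_t w)
{
    return (w & 0x8000u) ? (0xFFFF0000u | w) : w;
}
```

For example, a byte operand of H'80 becomes H'FFFFFF80 in the register, while H'7F becomes H'0000007F.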

2.5 Data Formats in Memory

Memory data formats are classified into bytes, words, and longwords. Memory can be accessed in 8-bit byte, 16-bit word, or 32-bit longword form. A memory operand less than 32 bits in length is sign-extended before being loaded into a register.

A word operand must be accessed starting from a word boundary (even address of a 2-byte unit: address 2n), and a longword operand starting from a longword boundary (even address of a 4-byte unit: address 4n). An address error will result if this rule is not observed. A byte operand can be accessed from any address.
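The boundary rule can be written as a one-line check. The following C sketch uses the hypothetical function name access_is_aligned and assumes the operand size is 1, 2, or 4 bytes.

```c
#include <stdint.h>
#include <stdbool.h>

/* True if an operand of size_bytes (1, 2, or 4) starting at addr
 * satisfies the boundary rule (address 2n for words, 4n for longwords).
 * A false result corresponds to the address error case. */
static bool access_is_aligned(uint32_t addr, uint32_t size_bytes)
{
    return (addr & (size_bytes - 1u)) == 0u;
}
```

A word access to address H'1002 passes the check, while a longword access to the same address does not.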

Big endian or little endian byte order can be selected for the data format. The byte order is set with the MD5 external pin in a power-on reset. Big endian is selected when the MD5 pin is low, and little endian when high. The byte order cannot be changed dynamically. Bit positions are numbered left to right from most-significant to least-significant. Thus, in a 32-bit longword, the leftmost bit, bit 31, is the most significant bit and the rightmost bit, bit 0, is the least significant bit.

The data format in memory is shown in figure 2.5. In little endian mode, data written as byte-size (8 bits) should be read as byte size, and data written as word-size (16 bits) should be read as word size.

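[Figure 2.5: byte, word, and longword layout within 32-bit units (bits 31 to 0), drawn for addresses A, A + 4, and A + 8.]

Big endian: Byte 0, Byte 1, Byte 2, and Byte 3 are at addresses A, A + 1, A + 2, and A + 3, starting from the most significant end. Word 0 is at A and Word 1 at A + 2. The longword is at A.

Little endian: addresses A + 11, A + 10, A + 9, and A + 8 are drawn starting from the most significant end and hold Byte 3, Byte 2, Byte 1, and Byte 0. Word 1 is at A + 10 and Word 0 at A + 8. The longword is at A + 8.

Figure 2.5 Data Formats In Memory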

Note: The SH7750 does not support endian conversion for the 64-bit data format. Therefore, if double-precision floating-point format (64-bit) access is performed in little endian mode, the upper and lower 32 bits will be reversed.
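The effect described in the note amounts to exchanging the two 32-bit halves of the 64-bit value. The C sketch below shows that exchange. The function name swap_halves is hypothetical, and whether software needs to apply such a correction depends on how the data was written.

```c
#include <stdint.h>

/* Exchanges the upper and lower 32 bits of a 64-bit value. */
static uint64_t swap_halves(uint64_t v)
{
    return (v << 32) | (v >> 32);
}
```

Applying the function twice returns the original value.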

2.6 Processor States

The SH7750 has five processor states: the reset state, exception-handling state, bus-released state, program execution state, and power-down state.

Reset State: In this state the CPU is reset. The reset state is entered when the RESET pin goes low. The CPU enters the power-on reset state if the MRESET pin is high, and the manual reset state if the MRESET pin is low. For more information on resets, see section 5, Exceptions.

In the power-on reset state, the internal state of the CPU and the on-chip peripheral module registers are initialized. In the manual reset state, the internal state of the CPU and registers of on-chip peripheral modules other than the bus state controller (BSC) are initialized. Since the bus state controller (BSC) is not initialized in the manual reset state, refreshing operations continue. Refer to the register configurations in the relevant sections for further details.

Exception-Handling State: This is a transient state during which the CPU's processor state flow is altered by a reset, general exception, or interrupt exception handling source.

In the case of a reset, the CPU branches to address H'A000 0000 and starts executing the user-coded exception handling program.

In the case of a general exception or interrupt, the program counter (PC) contents are saved in the saved program counter (SPC), the status register (SR) contents are saved in the saved status register (SSR), and the R15 contents are saved in saved general register 15 (SGR). The CPU branches to the start address of the user-coded exception service routine found from the sum of the contents of the vector base address and the vector offset. See section 5, Exceptions, for more information on resets, general exceptions, and interrupts.
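The register saves performed on a general exception or interrupt can be modeled in C as shown below. This is a simplified sketch covering only the actions named in this manual section: saving PC, SR, and R15, setting the MD, RB, and BL bits, and branching to VBR plus a vector offset. The structure cpu_state and the function enter_exception are hypothetical, and the offset passed in is whatever section 5 specifies for the event.

```c
#include <stdint.h>

#define SR_MD (1u << 30)
#define SR_RB (1u << 29)
#define SR_BL (1u << 28)

/* Minimal register model; names are illustrative only. */
typedef struct {
    uint32_t pc, sr, r15;     /* current state   */
    uint32_t spc, ssr, sgr;   /* saved copies    */
    uint32_t vbr;             /* vector base     */
} cpu_state;

static void enter_exception(cpu_state *c, uint32_t vector_offset)
{
    c->spc = c->pc;                      /* PC  -> SPC */
    c->ssr = c->sr;                      /* SR  -> SSR */
    c->sgr = c->r15;                     /* R15 -> SGR */
    c->sr |= SR_MD | SR_RB | SR_BL;      /* privileged, bank 1, blocked */
    c->pc  = c->vbr + vector_offset;     /* branch to the handler */
}
```

Because SSR holds the pre-exception SR, a handler can inspect the mode and bank that were in effect when the event occurred.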

Program Execution State: In this state the CPU executes program instructions in sequence.

Power-Down State: In the power-down state, CPU operation halts and power consumption is reduced. The power-down state is entered by executing a SLEEP instruction. There are two modes in the power-down state: sleep mode and standby mode. For details, see section 9, Power-Down Modes.

Bus-Released State: In this state the CPU has released the bus to a device that requested it.

Transitions between the states are shown in figure 2.6.

[Figure 2.6 is a state transition diagram; only its labels survive in this copy. States: reset state (power-on reset state, manual reset state), exception-handling state, program execution state, bus-released state, and power-down state (sleep mode, standby mode). From any state, RESET = 0 and MRESET = 1 leads to the power-on reset state, and RESET = 0 and MRESET = 0 leads to the manual reset state. Other transition labels: exception/interrupt, end of exception transition processing, bus request, bus request clearance, interrupt, SLEEP instruction with STBY bit cleared (sleep mode), and SLEEP instruction with STBY bit set (standby mode).]

Figure 2.6 Processor State Transitions

2.7 Processor Modes

There are two processor modes: user mode and privileged mode. The processor mode is determined by the processor mode bit (MD) in the status register (SR). User mode is selected when the MD bit is cleared to 0, and privileged mode when the MD bit is set to 1. When the reset state or exception state is entered, the MD bit is set to 1. When exception handling ends, the MD bit is cleared to 0 and user mode is entered. There are certain registers and bits which can only be accessed in privileged mode.
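As an illustration, the MD bit test can be written as a few C helpers operating on a copy of SR. This is a sketch only: the helper names are hypothetical, and the bit position (bit 30) is an assumption taken from the SR description elsewhere in the manual, since this section names the bit without giving its position.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: MD occupies bit 30 of SR. Verify against the SR
 * register description before relying on this value. */
#define SR_MD_MASK (UINT32_C(1) << 30)

/* True when the SR value selects privileged mode (MD = 1). */
static bool sr_is_privileged(uint32_t sr)
{
    return (sr & SR_MD_MASK) != 0;
}

/* SR value as it would look after MD is set to 1. */
static uint32_t sr_with_privileged(uint32_t sr)
{
    return sr | SR_MD_MASK;
}

/* SR value as it would look after MD is cleared to 0. */
static uint32_t sr_with_user(uint32_t sr)
{
    return sr & ~SR_MD_MASK;
}
```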

Section 3 Memory Management Unit (MMU)

3.1 Overview

3.1.1 Features

The SH7750 can handle 29-bit external memory space from an 8-bit address space identifier and 32-bit logical (virtual) address space. Address translation from virtual address to physical address is performed using the memory management unit (MMU) built into the SH7750. The MMU performs high-speed address translation by caching user-created address translation table information in an address translation buffer (translation lookaside buffer: TLB). The SH7750 has four instruction TLB (ITLB) entries and 64 unified TLB (UTLB) entries. UTLB copies are stored in the ITLB by hardware. A paging system is used for address translation, with support for four page sizes (1, 4, and 64 kbytes, and 1 Mbyte). It is possible to set the virtual address space access right and implement storage protection independently for privileged mode and user mode.

3.1.2 Role of the MMU

The MMU was conceived as a means of making efficient use of physical memory. As shown in figure 3.1, when a process is smaller in size than the physical memory, the entire process can be mapped onto physical memory, but if the process increases in size to the point where it does not fit into physical memory, it becomes necessary to divide the process into smaller parts, and map the parts requiring execution onto physical memory on an ad hoc basis ((1)). Having this mapping onto physical memory executed consciously by the process itself imposes a heavy burden on the process. The virtual memory system was devised as a means of handling all physical memory mapping to reduce this burden ((2)). With a virtual memory system, the size of the available virtual memory is much larger than the actual physical memory, and processes are mapped onto this virtual memory. Thus processes only have to consider their operation in virtual memory, and mapping from virtual memory to physical memory is handled by the MMU. The MMU is normally managed by the OS, and physical memory switching is carried out so as to enable the virtual memory required by a task to be mapped smoothly onto physical memory. Physical memory switching is performed via secondary storage, etc.

The virtual memory system that came into being in this way works to best effect in a time sharing system (TSS) that allows a number of processes to run simultaneously ((3)). Running a number of processes in a TSS did not increase efficiency since each process had to take account of physical memory mapping. Efficiency is improved and the load on each process reduced by the use of a virtual memory system ((4)). In this system, virtual memory is allocated to each process. The task of the MMU is to map a number of virtual memory areas onto physical memory in an efficient manner. It is also provided with memory protection functions to prevent a process from inadvertently accessing another process’s physical memory.

When address translation from virtual memory to physical memory is performed using the MMU, it may happen that the translation information has not been recorded in the MMU, or the virtual memory of a different process is accessed by mistake. In such cases, the MMU will generate an exception, change the physical memory mapping, and record the new address translation information.

Although the functions of the MMU could be implemented by software alone, having address translation performed by software each time a process accessed physical memory would be very inefficient. For this reason, a buffer for address translation (the translation lookaside buffer: TLB) is provided in hardware, and frequently used address translation information is placed here. The TLB can be described as a cache for address translation information. However, unlike a cache, if address translation fails (that is, if an exception occurs), switching of the address translation information is normally performed by software. Thus memory management can be performed in a flexible manner by software.

There are two methods by which the MMU can perform mapping from virtual memory to physical memory: the paging method, using fixed-length address translation, and the segment method, using variable-length address translation. With the paging method, the unit of translation is a fixed-size address space called a page (usually from 1 to 64 kbytes in size).

In the following descriptions, the address space in virtual memory in the SH7750 is referred to as virtual address space, and the address space in physical memory as physical address space.

[Figure 3.1 has four panels; only their labels survive in this copy. (1) Process 1 mapped directly onto physical memory. (2) Process 1 mapped onto virtual memory, which the MMU maps onto physical memory. (3) Processes 1, 2, and 3 mapped directly onto physical memory. (4) Processes 1, 2, and 3 mapped onto virtual memory, which the MMU maps onto physical memory.]

Figure 3.1 Role of the MMU

3.1.3 Register Configuration

The MMU registers are shown in table 3.1.

Table 3.1 MMU Registers

Name                                  Abbreviation  R/W  Initial Value*1  P4 Address*2  Area 7 Address*2  Access Size
Page table entry high register        PTEH          R/W  Undefined        H'FF00 0000   H'1F00 0000       32
Page table entry low register         PTEL          R/W  Undefined        H'FF00 0004   H'1F00 0004       32
Page table entry assistance register  PTEA          R/W  Undefined        H'FF00 0034   H'1F00 0034       32
Translation table base register       TTB           R/W  Undefined        H'FF00 0008   H'1F00 0008       32
TLB exception address register        TEA           R/W  Undefined        H'FF00 000C   H'1F00 000C       32
MMU control register                  MMUCR         R/W  H'0000 0000      H'FF00 0010   H'1F00 0010       32

Notes: 1. The initial value is the value after a power-on reset or manual reset.
2. This is the address when using the virtual/physical address space P4 area. When making an access from physical address space area 7 using the TLB, the upper 3 bits of the address are ignored.
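For software that accesses these registers, the addresses in table 3.1 can be captured as constants. The following is a minimal sketch; the macro and function names are hypothetical, and the helper simply applies note 2 (the area 7 address is the P4 address with the upper 3 bits ignored).

```c
#include <stdint.h>

/* P4-area addresses of the MMU registers, copied from table 3.1. */
#define MMU_PTEH_P4  UINT32_C(0xFF000000)
#define MMU_PTEL_P4  UINT32_C(0xFF000004)
#define MMU_TTB_P4   UINT32_C(0xFF000008)
#define MMU_TEA_P4   UINT32_C(0xFF00000C)
#define MMU_MMUCR_P4 UINT32_C(0xFF000010)
#define MMU_PTEA_P4  UINT32_C(0xFF000034)

/* Area 7 address of a register: the P4 address with the upper
 * 3 address bits cleared, matching the second address column. */
static uint32_t mmu_area7_address(uint32_t p4_address)
{
    return p4_address & UINT32_C(0x1FFFFFFF);
}
```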

3.1.4 Caution

Operation is not guaranteed if an area designated as a reserved area in this manual is accessed.

3.2 Register Descriptions

There are six MMU-related registers.

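1. PTEH
Bits 31 to 10: VPN
Bits 9 and 8: reserved
Bits 7 to 0: ASID

2. PTEL
Bits 31 to 29: reserved
Bits 28 to 10: PPN
Bit 9: reserved
Bit 8: V
Bit 7: SZ
Bits 6 and 5: PR
Bit 4: SZ
Bit 3: C
Bit 2: D
Bit 1: SH
Bit 0: WT

3. PTEA
Bits 31 to 4: reserved
Bit 3: TC
Bits 2 to 0: SA

4. TTB
Bits 31 to 0: TTB

5. TEA
Bits 31 to 0: Virtual address at which MMU exception or address error occurred

6. MMUCR
Bits 31 to 26: LRUI
Bits 25 and 24: reserved
Bits 23 to 18: URB
Bits 17 and 16: reserved
Bits 15 to 10: URC
Bit 9: SQMD
Bit 8: SV
Bits 7 to 3: reserved
Bit 2: TI
Bit 1: reserved
Bit 0: AT

For a reserved bit, the write value must be 0, and a read will return an undefined value.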

Figure 3.2 MMU-Related Registers
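The field positions in figure 3.2 translate directly into shift-and-mask helpers. The sketch below builds PTEH, PTEL, and PTEA values from their fields. All names are hypothetical, and one detail is an assumption: the figure labels bits 7 and 4 of PTEL both as SZ, and the code treats bit 7 as the upper SZ bit and bit 4 as the lower one.

```c
#include <stdint.h>

/* PTEH: VPN in bits 31-10, ASID in bits 7-0. */
static uint32_t pteh_make(uint32_t virtual_address, uint8_t asid)
{
    return (virtual_address & UINT32_C(0xFFFFFC00)) | asid;
}

static uint32_t pteh_vpn(uint32_t pteh)  { return pteh >> 10; }
static uint8_t  pteh_asid(uint32_t pteh) { return (uint8_t)(pteh & 0xFFu); }

/* Fields of PTEL; ppn_addr is the physical address of the page start. */
typedef struct {
    uint32_t ppn_addr;
    unsigned v, sz, pr, c, d, sh, wt;
} ptel_fields_t;

static uint32_t ptel_make(const ptel_fields_t *f)
{
    return (f->ppn_addr & UINT32_C(0x1FFFFC00))  /* PPN, bits 28-10 */
         | ((f->v & 1u) << 8)
         | (((f->sz >> 1) & 1u) << 7)            /* assumed: upper SZ bit */
         | ((f->pr & 3u) << 5)
         | ((f->sz & 1u) << 4)                   /* assumed: lower SZ bit */
         | ((f->c & 1u) << 3)
         | ((f->d & 1u) << 2)
         | ((f->sh & 1u) << 1)
         | (f->wt & 1u);
}

/* PTEA: TC in bit 3, SA in bits 2-0. */
static uint32_t ptea_make(unsigned tc, unsigned sa)
{
    return ((tc & 1u) << 3) | (sa & 7u);
}
```

Reserved bits are left at 0 in every value these helpers produce, in line with the rule that the write value of a reserved bit must be 0.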

1. Page table entry high register (PTEH): Longword access to PTEH can be performed from H'FF00 0000 in the P4 area and H'1F00 0000 in area 7. PTEH consists of the virtual page number (VPN) and address space identifier (ASID). When an MMU exception or address error exception occurs, the VPN of the virtual address at which the exception occurred is set in the VPN field by hardware. VPN varies according to the page size, but the VPN set by hardware when an exception occurs consists of the upper 22 bits of the virtual address which caused the exception. VPN setting can also be carried out by software. The number of the currently executing process is set in the ASID field by software. ASID is not updated by hardware. VPN and ASID are recorded in the UTLB by means of the LDTLB instruction.

2. Page table entry low register (PTEL): Longword access to PTEL can be performed from H'FF00 0004 in the P4 area and H'1F00 0004 in area 7. PTEL is used to hold the physical page number and page management information to be recorded in the UTLB by means of the LDTLB instruction. The contents of this register are not changed unless a software directive is issued.

3. Page table entry assistance register (PTEA): Longword access to PTEA can be performed from H'FF00 0034 in the P4 area and H'1F00 0034 in area 7. PTEA is used to store assistance bits for PCMCIA access to be recorded in the UTLB by means of the LDTLB instruction. The contents of this register are not changed unless a software directive is issued.

4. Translation table base register (TTB): Longword access to TTB can be performed from H'FF00 0008 in the P4 area and H'1F00 0008 in area 7. TTB is used, for example, to hold the base address of the currently used page table. The contents of TTB are not changed unless a software directive is issued. This register can be freely used by software.

5. TLB exception address register (TEA): Longword access to TEA can be performed from H'FF00 000C in the P4 area and H'1F00 000C in area 7. After an MMU exception or address error exception occurs, the virtual address at which the exception occurred is set in TEA by hardware. The contents of this register can be changed by software.

6. MMU control register (MMUCR): MMUCR contains the following bits:
LRUI: Least recently used ITLB
URB: UTLB replace boundary
URC: UTLB replace counter
SQMD: Store queue mode bit
SV: Single virtual mode bit
TI: TLB invalidate
AT: Address translation bit

Longword access to MMUCR can be performed from H'FF00 0010 in the P4 area and H'1F00 0010 in area 7. The individual bits perform MMU settings as shown below. Because these settings affect address translation, MMUCR rewriting should be performed by a program in the P1 or P2 area. After MMUCR is updated, an instruction that performs data access to the P0, P3, U0, or store queue area should be located at least four instructions after the MMUCR update instruction. Also, a branch instruction to the P0, P3, or U0 area should be located at least eight instructions after the MMUCR update instruction. MMUCR contents can be changed by software. The LRUI bits and URC bits may also be updated by hardware.

• LRUI: The LRU (least recently used) method is used to decide the ITLB entry to be replaced in the event of an ITLB miss. The entry to be purged from the ITLB can be confirmed using the LRUI bits. LRUI is updated by means of the algorithm shown below. A dash in this table means that updating is not performed.

                            LRUI
                            [5]  [4]  [3]  [2]  [1]  [0]
When ITLB entry 0 is used    0    0    0    -    -    -
When ITLB entry 1 is used    1    -    -    0    0    -
When ITLB entry 2 is used    -    1    -    1    -    0
When ITLB entry 3 is used    -    -    1    -    1    1
Other than the above         -    -    -    -    -    -

When the LRUI bit settings are as shown below, the corresponding ITLB entry is updated by an ITLB miss. An asterisk in this table means “don’t care”.

                            LRUI
                            [5]  [4]  [3]  [2]  [1]  [0]
ITLB entry 0 is updated      1    1    1    *    *    *
ITLB entry 1 is updated      0    *    *    1    1    *
ITLB entry 2 is updated      *    0    *    0    *    1
ITLB entry 3 is updated      *    *    0    *    0    0
Other than the above        Setting prohibited

Ensure that values for which “Setting prohibited” is indicated in the above table are not set at the discretion of software. After a power-on or manual reset the LRUI bits are initialized to 0, and therefore a prohibited setting is never made by a hardware update.

• URB: Bits that indicate the UTLB entry boundary at which replacement is to be performed. Valid only when URB > 0.

• URC: Random counter for indicating the UTLB entry for which replacement is to be performed with an LDTLB instruction. URC is incremented each time the UTLB is accessed. When URB > 0, URC is reset to 0 when the condition URC = URB occurs. Also note that, if a value is written to URC by software which results in the condition URC > URB, incrementing is first performed in excess of URB until URC = H'3F. URC is not incremented by an LDTLB instruction.

• SQMD: Store queue mode bit. Specifies the right of access to the store queues.

0: User/privileged access possible

1: Privileged access possible (address error exception in case of user access)

• SV: Bit that switches between single virtual memory mode and multiple virtual memory mode.

0: Multiple virtual memory mode

1: Single virtual memory mode

When this bit is changed, ensure that 1 is also written to the TI bit.

• TI: Writing 1 to this bit invalidates (clears to 0) all valid UTLB/ITLB bits. This bit always returns 0 when read.

• AT: Specifies MMU enabling or disabling.

0: MMU disabled

1: MMU enabled

MMU exceptions are not generated when the AT bit is 0. In the case of software that does not use the MMU, therefore, the AT bit should be cleared to 0.
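The two LRUI tables given for the LRUI bits above can be modeled in software, which may help when checking a value before writing it to MMUCR. The following is a sketch with hypothetical function names; it works on the 6-bit LRUI field value only (LRUI[5] is the most significant bit of the field), not on the full MMUCR register.

```c
#include <stdint.h>

/* Apply the "entry is used" table: set or clear the listed LRUI
 * bits and leave the others (shown as dashes) unchanged. */
static uint8_t lrui_after_use(uint8_t lrui, unsigned entry)
{
    switch (entry) {
    case 0: lrui &= (uint8_t)~0x38u; break;                  /* [5][4][3] = 0 */
    case 1: lrui |= 0x20u; lrui &= (uint8_t)~0x06u; break;   /* [5]=1, [2][1]=0 */
    case 2: lrui |= 0x14u; lrui &= (uint8_t)~0x01u; break;   /* [4][2]=1, [0]=0 */
    case 3: lrui |= 0x0Bu; break;                            /* [3][1][0] = 1 */
    default: break;                                          /* no update */
    }
    return (uint8_t)(lrui & 0x3Fu);
}

/* Apply the "entry is updated" table: return the ITLB entry that
 * would be replaced, or -1 for a prohibited setting. */
static int lrui_victim(uint8_t lrui)
{
    if ((lrui & 0x38u) == 0x38u) return 0;  /* 1 1 1 * * * */
    if ((lrui & 0x26u) == 0x06u) return 1;  /* 0 * * 1 1 * */
    if ((lrui & 0x15u) == 0x01u) return 2;  /* * 0 * 0 * 1 */
    if ((lrui & 0x0Bu) == 0x00u) return 3;  /* * * 0 * 0 0 */
    return -1;
}
```

Starting from the reset value 0 and always marking the selected entry as used, this model selects entries 3, 2, 1, and 0 in turn and never reaches a prohibited value, which is consistent with the statement that hardware updates never produce one.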

3.3 Memory Space

3.3.1 Physical Memory Space

The SH7750 supports a 32-bit physical memory space, and can access a 4-Gbyte address space. When the MMUCR.AT bit is cleared to 0 and the MMU is disabled, the address space is this physical memory space. The physical memory space is divided into a number of areas, as shown in figure 3.3. The physical memory space is permanently mapped onto 29-bit external memory space; this correspondence can be implemented by ignoring the upper 3 bits of the physical memory space addresses. In privileged mode, the 4-Gbyte space from the P0 area to the P4 area can be accessed. In user mode, a 2-Gbyte space in the U0 area can be accessed. Accessing the P1 to P4 areas (except the store queue area) in user mode will cause an address error.

[Figure 3.3 is a memory map; only its labels survive in this copy.

Privileged mode:
H'0000 0000  P0 area (cacheable)
H'8000 0000  P1 area (cacheable)
H'A000 0000  P2 area (non-cacheable)
H'C000 0000  P3 area (cacheable)
H'E000 0000 to H'FFFF FFFF  P4 area

User mode:
H'0000 0000  U0 area (cacheable)
H'8000 0000  Address error
H'E000 0000  Store queue area
H'E400 0000 to H'FFFF FFFF  Address error

Each cacheable or non-cacheable area maps onto external memory space areas 0 to 7.]

Figure 3.3 Physical Memory Space (MMUCR.AT = 0)

P0, P1, P3, U0 Areas: The P0, P1, P3, and U0 areas can be accessed using the cache. Whether or not the cache is used is determined by the cache control register (CCR). When the cache is used, with the exception of the P1 area, switching between the copy-back method and the write-through method for write accesses is specified by the CCR.WT bit. For the P1 area, switching is specified by the CCR.CB bit. Clearing the upper 3 bits of an address in these areas to 0 gives the corresponding external memory space address. However, since area 7 in the external memory space is a reserved area, a reserved area also appears in these areas.

P2 Area: The P2 area cannot be accessed using the cache. In the P2 area, clearing the upper 3 bits of an address to 0 gives the corresponding external memory space address. However, since area 7 in the external memory space is a reserved area, a reserved area also appears in this area.

P4 Area: The P4 area is mapped onto SH7750 on-chip I/O channels. This area cannot be accessed using the cache. The P4 area is shown in detail in figure 3.4.
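The area boundaries in figure 3.3 and the rule of ignoring the upper 3 address bits can be expressed as two small C functions. This is a sketch for the MMU-disabled case (MMUCR.AT = 0); the type and function names are hypothetical.

```c
#include <stdint.h>

typedef enum { AREA_P0_U0, AREA_P1, AREA_P2, AREA_P3, AREA_P4 } sh_area_t;

/* Classify a 32-bit address by the area boundaries of figure 3.3. */
static sh_area_t sh_area_of(uint32_t address)
{
    if (address < UINT32_C(0x80000000)) return AREA_P0_U0;
    if (address < UINT32_C(0xA0000000)) return AREA_P1;
    if (address < UINT32_C(0xC0000000)) return AREA_P2;
    if (address < UINT32_C(0xE0000000)) return AREA_P3;
    return AREA_P4;
}

/* External memory space address for an address in P0/U0, P1, P2,
 * or P3: the upper 3 bits are cleared. Not meaningful for P4,
 * which maps onto on-chip I/O rather than external memory. */
static uint32_t sh_external_address(uint32_t address)
{
    return address & UINT32_C(0x1FFFFFFF);
}
```

For example, the P1 address H'8C00 1000 and the P2 address H'AC00 1000 both refer to external address H'0C00 1000, once through the cache and once without it.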

H'E000 0000 to H'E3FF FFFF  Store queue
H'E400 0000 to H'EFFF FFFF  Reserved area
H'F000 0000 to H'F0FF FFFF  Instruction cache address array
H'F100 0000 to H'F1FF FFFF  Instruction cache data array
H'F200 0000 to H'F2FF FFFF  Instruction TLB address array
H'F300 0000 to H'F3FF FFFF  Instruction TLB data arrays 1 and 2
H'F400 0000 to H'F4FF FFFF  Operand cache address array
H'F500 0000 to H'F5FF FFFF  Operand cache data array
H'F600 0000 to H'F6FF FFFF  Unified TLB address array
H'F700 0000 to H'F7FF FFFF  Unified TLB data arrays 1 and 2
H'F800 0000 to H'FEFF FFFF  Reserved area
H'FF00 0000 to H'FFFF FFFF  Control register area

Figure 3.4 P4 Area

The area from H'E000 0000 to H'E3FF FFFF comprises addresses for accessing the store queues (SQs). When the MMU is disabled (MMUCR.AT = 0), the SQ access right is specified by the MMUCR.SQMD bit. For details, see section 4.6, Store Queues.

The area from H'F000 0000 to H'F0FF FFFF is used for direct access to the instruction cache address array. For details, see section 4.5.1, IC Address Array.

The area from H'F100 0000 to H'F1FF FFFF is used for direct access to the instruction cache data array. For details, see section 4.5.2, IC Data Array.

The area from H'F200 0000 to H'F2FF FFFF is used for direct access to the instruction TLB address array. For details, see section 3.7.1, ITLB Address Array.

The area from H'F300 0000 to H'F3FF FFFF is used for direct access to instruction TLB data arrays 1 and 2. For details, see sections 3.7.2, ITLB Data Array 1, and 3.7.3, ITLB Data Array 2.

The area from H'F400 0000 to H'F4FF FFFF is used for direct access to the operand cache address array. For details, see section 4.5.3, OC Address Array.

The area from H'F500 0000 to H'F5FF FFFF is used for direct access to the operand cache data array. For details, see section 4.5.4, OC Data Array.

The area from H'F600 0000 to H'F6FF FFFF is used for direct access to the unified TLB address array. For details, see section 3.7.4, UTLB Address Array.

The area from H'F700 0000 to H'F7FF FFFF is used for direct access to unified TLB data arrays 1 and 2. For details, see sections 3.7.5, UTLB Data Array 1, and 3.7.6, UTLB Data Array 2.

The area from H'FF00 0000 to H'FFFF FFFF is the on-chip peripheral module control register area.

3.3.2 External Memory Space

The SH7750 supports a 29-bit external memory space. The external memory space is divided into eight areas as shown in figure 3.5. Areas 0 to 6 relate to memory, such as SRAM, synchronous DRAM, DRAM, and PCMCIA. Area 7 is a reserved area. For details, see section 13, Bus State Controller (BSC), in the Hardware Manual.

H'0000 0000  Area 0
H'0400 0000  Area 1
H'0800 0000  Area 2
H'0C00 0000  Area 3
H'1000 0000  Area 4
H'1400 0000  Area 5
H'1800 0000  Area 6
H'1C00 0000 to H'1FFF FFFF  Area 7 (reserved area)

Figure 3.5 External Memory Space
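Because each area in figure 3.5 starts on a H'0400 0000 (64-Mbyte) boundary, the area number is simply bits 28 to 26 of the external address. A minimal sketch, with a hypothetical function name:

```c
#include <stdint.h>

/* Area number (0 to 7) of a 29-bit external memory space address.
 * Bits above bit 28 are masked off first, so a P0 to P3 address
 * may also be passed in directly. */
static unsigned external_area(uint32_t address)
{
    return (unsigned)((address & UINT32_C(0x1FFFFFFF)) >> 26);
}
```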

3.3.3 Virtual Memory Space

Setting the MMUCR.AT bit to 1 enables the P0, P3, and U0 areas of the physical memory space in the SH7750 to be mapped onto any external memory space in 1-, 4-, or 64-kbyte, or 1-Mbyte, page units. By using an 8-bit address space identifier, the P0, U0, P3, and store queue areas can be increased to a maximum of 256. This is called the virtual memory space. Mapping from virtual memory space to 29-bit external memory space is carried out using the TLB. Only when area 7 in external memory space is accessed using virtual memory space, addresses H'1F00 0000 to H'1FFF FFFF of area 7 are not designated as a reserved area, but are equivalent to the P4 area control register area in the physical memory space. Virtual memory space is illustrated in figure 3.6.

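[Figure 3.6 is a memory map; only its labels survive in this copy. Privileged mode: P0 area (cacheable, address translation possible), P1 area (cacheable, address translation not possible), P3 area (cacheable), and further areas whose labels are lost. User mode: U0 area (cacheable, address translation possible), address error regions, and the store queue area. Up to 256 instances of the translatable areas map onto external memory space areas 0 to 7.]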
Figure 3.6 Virtual Memory Space (MMUCR.AT = 1)

P0, P3, U0 Areas: The P0 area (excluding addresses H'7C00 0000 to H'7FFF FFFF), P3 area, and U0 area allow access using the cache and address translation using the TLB. These areas can be mapped onto any external memory space in 1-, 4-, or 64-kbyte, or 1-Mbyte, page units. When CCR is in the cache-enabled state and the TLB cacheability bit (C bit) is 1, accesses can be performed using the cache. In write accesses to the cache, switching between the copy-back method and the write-through method is indicated by the TLB write-through bit (WT bit), and is specified in page units.

Only when the P0, P3, and U0 areas are mapped onto external memory space by means of the TLB, addresses H'1F00 0000 to H'1FFF FFFF of area 7 in external memory space are allocated to the control register area. This enables on-chip peripheral module control registers to be accessed from the U0 area in user mode. In this case, the C bit for the corresponding page must be cleared to 0.

In the cache enabled state, when areas P0, P3, and U0 are mapped onto the PCMCIA space by means of the TLB, it is necessary either to specify 1 for the WT bit or to specify 0 for the C bit; it is not possible to use copy-back mode cache for the PCMCIA space.

P1, P2, P4 Areas: Address translation using the TLB cannot be performed for the P1, P2, or P4 area (except for the store queue area). Accesses to these areas are the same as for physical memory space. The store queue area can be mapped onto any external memory space by the MMU. However, operation in the case of an exception differs from that for normal P0, U0, and P3 spaces. For details, see section 4.6, Store Queues.

3.3.4 On-Chip RAM Space

In the SH7750, half (8 kbytes) of the operand cache (16 kbytes) can be used as on-chip RAM. This can be done by changing the CCR settings.

When the operand cache is used as on-chip RAM (CCR.ORA = 1), P0 area addresses H'7C00 0000 to H'7FFF FFFF are an on-chip RAM area. Data accesses (byte/word/longword/quadword) can be used in this area. This area can only be used in RAM mode.

3.3.5 Address Translation

When the MMU is used, the virtual address space is divided into units called pages, and translation to physical addresses is carried out in these page units. The address translation table in external memory contains the physical addresses corresponding to virtual addresses and additional information such as memory protection codes. Fast address translation is achieved by caching the contents of the address translation table located in external memory into the TLB. In the SH7750, basically, the ITLB is used for instruction accesses and the UTLB for data accesses. In the event of an access to an area other than the P4 area, the accessed virtual address is translated to a physical address. If the virtual address belongs to the P1 or P2 area, the physical address is uniquely determined without accessing the TLB. If the virtual address belongs to the P0, U0, or P3 area, the TLB is searched using the virtual address, and if the virtual address is recorded in the TLB, a TLB hit is made and the corresponding physical address is read from the TLB. If the accessed virtual address is not recorded in the TLB, a TLB miss exception is generated and processing switches to the TLB miss exception routine. In the TLB miss exception routine, the address translation table in external memory is searched, and the corresponding physical address and page management information are recorded in the TLB. After the return from the exception handling routine, the instruction which caused the TLB miss exception is re-executed.

3.3.6 Single Virtual Memory Mode and Multiple Virtual Memory Mode

There are two virtual memory systems, single virtual memory and multiple virtual memory, either of which can be selected with the MMUCR.SV bit. In the single virtual memory system, a number of processes run simultaneously, using virtual address space on an exclusive basis, and the physical address corresponding to a particular virtual address is uniquely determined. In the multiple virtual memory system, a number of processes run while sharing the virtual address space, and a particular virtual address may be translated into different physical addresses depending on the process. The only difference between the single virtual memory and multiple virtual memory systems in terms of operation is in the TLB address comparison method (see section 3.4.3, Address Translation Method).

3.3.7 Address Space Identifier (ASID)

In multiple virtual memory mode, the 8-bit address space identifier (ASID) is used to distinguish between processes running simultaneously while sharing the virtual address space. Software can set the ASID of the currently executing process in PTEH in the MMU. The TLB does not have to be purged when processes are switched by means of ASID.

In single virtual memory mode, ASID is used to provide memory protection for processes running simultaneously while using the virtual memory space on an exclusive basis.

3.4 TLB Functions

3.4.1 Unified TLB (UTLB) Configuration

The unified TLB (UTLB) is so called because of its use for the following two purposes:

1. To translate a virtual address to a physical address in a data access

2. As a table of address translation information to be recorded in the instruction TLB in the event of an ITLB miss

Information in the address translation table located in external memory is cached into the UTLB. The address translation table contains virtual page numbers and address space identifiers, and corresponding physical page numbers and page management information. Figure 3.7 shows the overall configuration of the UTLB. The UTLB consists of 64 fully-associative type entries. Figure 3.8 shows the relationship between the address format and page size.

[Figure 3.7: each of entries 0 to 63 holds the fields ASID [7:0], VPN [31:10], V, PPN [28:10], SZ [1:0], SH, C, PR [1:0], D, WT, SA [2:0], and TC.]

Figure 3.7 UTLB Configuration

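Page size      Virtual address                     Physical address
1-kbyte page   VPN bits 31-10, offset bits 9-0     PPN bits 28-10, offset bits 9-0
4-kbyte page   VPN bits 31-12, offset bits 11-0    PPN bits 28-12, offset bits 11-0
64-kbyte page  VPN bits 31-16, offset bits 15-0    PPN bits 28-16, offset bits 15-0
1-Mbyte page   VPN bits 31-20, offset bits 19-0    PPN bits 28-20, offset bits 19-0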

Figure 3.8 Relationship between Page Size and Address Format
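The relationship between page size and address format amounts to choosing an offset width and then joining the physical page number with the offset of the virtual address. The sketch below assumes the SZ encodings listed later in this section (00 = 1 kbyte, 01 = 4 kbytes, 10 = 64 kbytes, 11 = 1 Mbyte); the function names are hypothetical.

```c
#include <stdint.h>

/* Offset width in bits for each SZ encoding:
 * 00 = 1 kbyte, 01 = 4 kbytes, 10 = 64 kbytes, 11 = 1 Mbyte. */
static unsigned page_offset_bits(unsigned sz)
{
    static const unsigned bits[4] = { 10u, 12u, 16u, 20u };
    return bits[sz & 3u];
}

/* Form a 29-bit physical address from the physical page start
 * address held in a TLB entry and the offset part of the virtual
 * address. PPN bits below the page size are ignored. */
static uint32_t translate(uint32_t virtual_address, uint32_t ppn_addr,
                          unsigned sz)
{
    uint32_t offset_mask = (UINT32_C(1) << page_offset_bits(sz)) - 1u;
    uint32_t page_base = ppn_addr & UINT32_C(0x1FFFFFFF) & ~offset_mask;
    return page_base | (virtual_address & offset_mask);
}
```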

• VPN: Virtual page number

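For 1-kbyte page: upper 22 bits of virtual address

For 4-kbyte page: upper 20 bits of virtual address

For 64-kbyte page: upper 16 bits of virtual address

For 1-Mbyte page: upper 12 bits of virtual address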

• ASID: Address space identifier

Indicates the process that can access a virtual page.

In single virtual memory mode and user mode, or in multiple virtual memory mode, if the SH bit is 0, this identifier is compared with the ASID in PTEH when address comparison is performed.

• SH: Share status bit

When 0, pages are not shared by processes.

When 1, pages are shared by processes.

• SZ: Page size bits

Specify the page size.

00: 1-kbyte page

01: 4-kbyte page

10: 64-kbyte page

11: 1-Mbyte page

• V: Validity bit

Indicates whether the entry is valid.

0: Invalid

1: Valid

Cleared to 0 by a power-on reset.

Not affected by a manual reset.

• PPN: Physical page number

Upper 22 bits of the physical address.

With a 1-kbyte page, PPN bits [28:10] are valid.

With a 4-kbyte page, PPN bits [28:12] are valid.

With a 64-kbyte page, PPN bits [28:16] are valid.

With a 1-Mbyte page, PPN bits [28:20] are valid.

The synonym problem must be taken into account when setting the PPN (see section 3.5.5, Avoiding Synonym Problems).

• PR: Protection key data

2-bit data expressing the page access right as a code.

00: Can be read only, in privileged mode

01: Can be read and written in privileged mode

10: Can be read only, in privileged or user mode

11: Can be read and written in privileged mode or user mode

• C: Cacheability bit

Indicates whether a page is cacheable.

0: Not cacheable

1: Cacheable

When control register space is mapped, this bit must be cleared to 0.

When performing PCMCIA space mapping in the cache enabled state, either clear this bit to 0 or set the WT bit to 1.

• D: Dirty bit

Indicates whether a write has been performed to a page.

0: Write has not been performed

1: Write has been performed

• WT: Write-through bit

Specifies the cache write mode.

0: Copy-back mode

1: Write-through mode

When performing PCMCIA space mapping in the cache enabled state, either set this bit to 1 or clear the C bit to 0.

• SA: Space attribute bits

Valid only when the page is mapped onto PCMCIA connected to area 5 or 6.

000: Undefined

001: Variable-size I/O space (base size according to IOIS16 signal)

010: 8-bit I/O space

011: 16-bit I/O space

100: 8-bit common memory space

101: 16-bit common memory space

110: 8-bit attribute memory space

111: 16-bit attribute memory space

• TC: Timing control bit

Used to select wait control register bits in the bus control unit for areas 5 and 6.

0: WCR2 (A5W2–A5W0) and PCR (A5PCW1–A5PCW0, A5TED2–A5TED0, A5TEH2–A5TEH0) are used

1: WCR2 (A6W2–A6W0) and PCR (A6PCW1–A6PCW0, A6TED2–A6TED0, A6TEH2–A6TEH0) are used
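As an example of how software might interpret one of the fields above, the PR encoding can be written as a permission check. This is a sketch with a hypothetical function name; it covers only the four PR codes listed in this section.

```c
#include <stdbool.h>

/* Decide whether an access is permitted by the 2-bit PR code.
 * privileged: true in privileged mode, false in user mode.
 * is_write:   true for a write access, false for a read. */
static bool pr_allows(unsigned pr, bool privileged, bool is_write)
{
    switch (pr & 3u) {
    case 0u: return privileged && !is_write;  /* read only, privileged */
    case 1u: return privileged;               /* read/write, privileged */
    case 2u: return !is_write;                /* read only, both modes */
    default: return true;                     /* read/write, both modes */
    }
}
```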

3.4.2 Instruction TLB (ITLB) Configuration

The ITLB is used to translate a virtual address to a physical address in an instruction access. Information in the address translation table located in the UTLB is cached into the ITLB. Figure 3.9 shows the overall configuration of the ITLB. The ITLB consists of 4 fully-associative type entries. The address translation information is almost the same as that in the UTLB, but with the following differences:

1. D and WT bits are not supported.

2. There is only one PR bit, corresponding to the upper of the PR bits in the UTLB.

[Figure 3.9: each of entries 0 to 3 holds the fields ASID [7:0], VPN [31:10], V, PPN [28:10], SZ [1:0], SH, C, PR, SA [2:0], and TC.]

Figure 3.9 ITLB Configuration

3.4.3 Address Translation Method

Figures 3.10 and 3.11 show flowcharts of memory accesses using the UTLB and ITLB.

[Figure 3.10 is a flowchart for a data access to a virtual address (VA); only its labels survive in this copy. Branches by area: VA is in P4 area (on-chip I/O access); VA is in P2 area (non-cacheable memory access); VA is in P1 area (CCR.OCE?, CCR.CB?); VA is in P0, U0, or P3 area (MMUCR.AT = 1?, CCR.OCE?, CCR.WT?). Decisions on the TLB path: SH = 0 and (MMUCR.SV = 0 or SR.MD = 0); VPNs match and ASIDs match and V = 1, or VPNs match and V = 1; only one entry matches; SR.MD? (0: user, 1: privileged); PR?; R/W?; D?; C = 1 and CCR.OCE = 1; WT?. Outcomes: data TLB miss exception, data TLB multiple hit exception, data TLB protection violation exception, initial page write exception, cache access in copy-back mode, cache access in write-through mode, and memory access (non-cacheable).]

Figure 3.10 Flowchart of Memory Access Using UTLB
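The TLB part of the UTLB flowchart can be approximated by a software model. The sketch below is not the hardware algorithm itself: the types and names are hypothetical, the area check and the cache decisions (C, WT, CCR bits) are left out, and the entry stores the virtual page start address rather than a packed VPN. It follows the order of the decisions in the flowchart: address comparison (with the ASID included only when SH = 0 and either MMUCR.SV = 0 or SR.MD = 0), the single-match check, the PR check, and the D bit check on writes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t vpn_addr;             /* virtual address of the page start */
    uint8_t  asid;
    unsigned v : 1, sh : 1, d : 1;
    unsigned sz : 2, pr : 2;
} utlb_entry_t;

typedef struct { bool sv; bool md; uint8_t asid; } mmu_context_t;

typedef enum { TLB_HIT, TLB_MISS, TLB_MULTIPLE_HIT,
               TLB_PROTECTION_VIOLATION, TLB_INITIAL_PAGE_WRITE } tlb_result_t;

static bool entry_matches(const utlb_entry_t *e, uint32_t va,
                          const mmu_context_t *ctx)
{
    static const unsigned offset_bits[4] = { 10u, 12u, 16u, 20u };
    uint32_t vpn_mask = ~((UINT32_C(1) << offset_bits[e->sz]) - 1u);
    if (!e->v || ((e->vpn_addr ^ va) & vpn_mask) != 0)
        return false;
    /* ASIDs are compared only for non-shared pages, and not in
     * single virtual memory mode while privileged. */
    bool check_asid = !e->sh && (!ctx->sv || !ctx->md);
    return !check_asid || e->asid == ctx->asid;
}

static tlb_result_t utlb_lookup(const utlb_entry_t *tlb, size_t n,
                                uint32_t va, bool is_write,
                                const mmu_context_t *ctx, size_t *hit_index)
{
    size_t matches = 0;
    for (size_t i = 0; i < n; i++) {
        if (entry_matches(&tlb[i], va, ctx)) { *hit_index = i; matches++; }
    }
    if (matches == 0) return TLB_MISS;
    if (matches > 1)  return TLB_MULTIPLE_HIT;

    const utlb_entry_t *e = &tlb[*hit_index];
    bool readable = ctx->md || (e->pr & 2u);     /* user needs PR = 10 or 11 */
    bool writable = readable && (e->pr & 1u);    /* PR = 01 or 11 */
    if (!readable || (is_write && !writable)) return TLB_PROTECTION_VIOLATION;
    if (is_write && !e->d) return TLB_INITIAL_PAGE_WRITE;
    return TLB_HIT;
}
```

On any result other than TLB_HIT, the value left in hit_index should not be used as a translation.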

[Figure 3.11 is a flowchart for an instruction access to a virtual address (VA); only its labels survive in this copy. Branches by area: VA is in P4 area (access prohibited); VA is in P2 area (non-cacheable memory access); VA is in P1 area (CCR.ICE?); VA is in P0, U0, or P3 area (MMUCR.AT = 1?, CCR.ICE?). Decisions on the TLB path: SH = 0 and (MMUCR.SV = 0 or SR.MD = 0); VPNs match and ASIDs match and V = 1, or VPNs match and V = 1; only one entry matches; SR.MD? (0: user, 1: privileged); PR?; C = 1 and CCR.ICE = 1. Hardware ITLB miss handling: search UTLB; on a match, record in ITLB. Outcomes: instruction TLB miss exception, instruction TLB multiple hit exception, instruction TLB protection violation exception, cache access, and memory access (non-cacheable).]

Figure 3.11 Flowchart of Memory Access Using ITLB

3.5 MMU Functions

3.5.1 MMU Hardware Management

The SH7750 supports the following MMU functions.

1. The MMU decodes the virtual address to be accessed by software, and performs address translation by controlling the UTLB/ITLB in accordance with the MMUCR settings.

2. The MMU determines the cache access status on the basis of the page management information read during address translation (C, WT, SA, and TC bits).

3. If address translation cannot be performed normally in a data access or instruction access, the MMU notifies software by means of an MMU exception.

4. If address translation information is not recorded in the ITLB in an instruction access, the MMU searches the UTLB, and if the necessary address translation information is recorded in the UTLB, the MMU copies this information into the ITLB in accordance with MMUCR.LRUI.

3.5.2 MMU Software Management

Software processing for the MMU consists of the following:

1. Setting of MMU-related registers. Some registers are also partially updated by hardware automatically.

2. Recording, deletion, and reading of TLB entries. There are two methods of recording UTLB entries: by using the LDTLB instruction, or by writing directly to the memory-mapped UTLB. ITLB entries can only be recorded by writing directly to the memory-mapped ITLB. For deleting or reading UTLB/ITLB entries, it is possible to access the memory-mapped UTLB/ITLB.

3. MMU exception handling. When an MMU exception occurs, processing is performed based on information set by hardware.

3.5.3 MMU Instruction (LDTLB)

A TLB load instruction (LDTLB) is provided for recording UTLB entries. When an LDTLB instruction is issued, the SH7750 copies the contents of PTEH, PTEL, and PTEA to the UTLB entry indicated by MMUCR.URC. ITLB entries are not updated by the LDTLB instruction, and therefore address translation information purged from the UTLB entry may still remain in the ITLB entry. As the LDTLB instruction changes address translation information, ensure that it is issued by a program in the P1 or P2 area. The operation of the LDTLB instruction is shown in figure 3.12.

[Figure 3.12: the LDTLB instruction writes the contents of PTEH, PTEL, and PTEA to the UTLB entry (entry 0 to entry 63) specified by MMUCR.URC. The register layouts shown in the figure are as follows.]

MMUCR   [31:26] LRUI, [25:24] reserved, [23:18] URB, [17:16] reserved, [15:10] URC, [9] SQMD, [8] SV, [7:3] reserved, [2] TI, [1] reserved, [0] AT
PTEH    [31:10] VPN, [9:8] reserved, [7:0] ASID
PTEL    [31:29] reserved, [28:10] PPN, [9] reserved, [8] V, [7] SZ, [6:5] PR, [4] SZ, [3] C, [2] D, [1] SH, [0] WT
PTEA    [31:4] reserved, [3] TC, [2:0] SA

Figure 3.12 Operation of LDTLB Instruction
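The register values that LDTLB copies can be assembled with plain bit operations. The following is a minimal sketch, not Hitachi code: the helper names pteh_pack and ptel_pack are hypothetical, the bit positions are those of PTEH and PTEL in figure 3.12, and the assignment of the high SZ bit to bit 7 and the low SZ bit to bit 4 is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers; bit positions follow PTEH and PTEL in figure 3.12. */

/* PTEH: VPN in bits [31:10], ASID in bits [7:0]. */
uint32_t pteh_pack(uint32_t vaddr, uint8_t asid)
{
    return (vaddr & 0xFFFFFC00u) | asid;
}

/* PTEL: PPN [28:10], V [8], SZ [7] and [4], PR [6:5], C [3], D [2], SH [1], WT [0]. */
uint32_t ptel_pack(uint32_t paddr, unsigned v, unsigned sz, unsigned pr,
                   unsigned c, unsigned d, unsigned sh, unsigned wt)
{
    assert(sz < 4 && pr < 4);
    return (paddr & 0x1FFFFC00u)
         | ((v & 1u) << 8)
         | (((sz >> 1) & 1u) << 7)   /* assumption: high SZ bit goes to bit 7 */
         | ((pr & 3u) << 5)
         | ((sz & 1u) << 4)          /* assumption: low SZ bit goes to bit 4 */
         | ((c & 1u) << 3)
         | ((d & 1u) << 2)
         | ((sh & 1u) << 1)
         | (wt & 1u);
}
```

Reserved bits are left at 0, which matches the write value required for reserved bits elsewhere in this section.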

3.5.4 Hardware ITLB Miss Handling

In an instruction access, the SH7750 searches the ITLB. If it cannot find the necessary address translation information (i.e. in the event of an ITLB miss), the UTLB is searched by hardware, and if the necessary address translation information is present, it is recorded in the ITLB. This procedure is known as hardware ITLB miss handling. If the necessary address translation information is not found in the UTLB search, an instruction TLB miss exception is generated and processing passes to software.

3.5.5 Avoiding Synonym Problems

When 1- or 4-kbyte pages are recorded in TLB entries, a synonym problem may arise. The problem is that, when a number of virtual addresses are mapped onto a single physical address, the same physical address data is recorded in a number of cache entries, and it becomes impossible to guarantee data integrity. This problem does not occur with the instruction TLB or instruction cache. In the SH7750, entry specification is performed using bits [13:5] of the virtual address in order to achieve fast operand cache operation. However, bits [13:10] of the virtual address in the case of a 1-kbyte page, and bits [13:12] of the virtual address in the case of a 4-kbyte page, are subject to address translation. As a result, bits [13:10] of the physical address after translation may differ from bits [13:10] of the virtual address.

Consequently, the following restrictions apply to the recording of address translation information in UTLB entries.

1. When address translation information whereby a number of 1-kbyte page UTLB entries are translated into the same physical address is recorded in the UTLB, ensure that the VPN [13:10] values are the same.

2. When address translation information whereby a number of 4-kbyte page UTLB entries are translated into the same physical address is recorded in the UTLB, ensure that the VPN [13:12] values are the same.

3. Do not use 1-kbyte page UTLB entry physical addresses with UTLB entries of a different page size.

4. Do not use 4-kbyte page UTLB entry physical addresses with UTLB entries of a different page size.

The above restrictions apply only when performing accesses using the cache. When cache index mode is used, VPN [25] is used for the entry address instead of VPN [13], and therefore the above restrictions apply to VPN [25].

Note: To provide for future SH Series expansion, when multiple items of address translation information use the same physical memory, ensure that the VPN [20:10] values are the same. Also, do not use the same physical address for address translation information of different page sizes.
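Restrictions 1 and 2 amount to comparing the translated index bits of two virtual addresses that map to the same physical address. The sketch below is illustrative only: the function names are hypothetical, it covers normal (non-index) mode, and it assumes that pages larger than 4 kbytes leave bits [13:5] untranslated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask of virtual address bits that both select the OC entry and are
 * subject to address translation, for a page of page_bytes bytes. */
uint32_t translated_index_bits(unsigned page_bytes)
{
    switch (page_bytes) {
    case 1024: return 0x3C00u;   /* bits [13:10] */
    case 4096: return 0x3000u;   /* bits [13:12] */
    default:   return 0u;        /* assumption: larger pages translate no index bits */
    }
}

/* True if two virtual addresses mapped to the same physical address with
 * the given page size satisfy restriction 1 or 2 (hypothetical helper). */
bool synonym_safe(uint32_t va_a, uint32_t va_b, unsigned page_bytes)
{
    assert(page_bytes != 0);
    return ((va_a ^ va_b) & translated_index_bits(page_bytes)) == 0;
}
```

A page table manager could call such a check before recording a second mapping of a physical page and reject or relocate the mapping when it returns false.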

3.6 MMU Exceptions

There are seven MMU exceptions: the instruction TLB multiple hit exception, instruction TLB miss exception, instruction TLB protection violation exception, data TLB multiple hit exception, data TLB miss exception, data TLB protection violation exception, and initial page write exception. Refer to figures 3.10 and 3.11 for the conditions under which each of these exceptions occurs.

3.6.1 Instruction TLB Multiple Hit Exception

An instruction TLB multiple hit exception occurs when more than one ITLB entry matches the virtual address to which an instruction access has been made. If multiple hits occur when the UTLB is searched by hardware in hardware ITLB miss handling, a data TLB multiple hit exception will result.

When an instruction TLB multiple hit exception occurs, a reset is executed, and cache coherency is not guaranteed.

Hardware Processing: In the event of an instruction TLB multiple hit exception, hardware carries out the following processing:

1. Sets the virtual address at which the exception occurred in TEA.

2. Sets exception code H’140 in EXPEVT.

3. Branches to the reset handling routine (H’A000 0000).

Software Processing (Reset Routine): The ITLB entries which caused the multiple hit exception are checked in the reset handling routine. This exception is intended for use in program debugging, and should not normally be generated.

3.6.2 Instruction TLB Miss Exception

An instruction TLB miss exception occurs when address translation information for the virtual address to which an instruction access is made is not found in the UTLB entries by the hardware ITLB miss handling procedure. The instruction TLB miss exception processing carried out by hardware and software is shown below. This is the same as the processing for a data TLB miss exception.

Hardware Processing: In the event of an instruction TLB miss exception, hardware carries out the following processing:

1. Sets the VPN of the virtual address at which the exception occurred in PTEH.



4. Sets the PC value indicating the address of the instruction at which the exception occurred in SPC. If the exception occurred at a delay slot, sets the PC value indicating the address of the delayed branch instruction in SPC.

5. Sets the SR contents at the time of the exception in SSR.

6. Sets the MD bit in SR to 1, and switches to privileged mode.

7. Sets the BL bit in SR to 1, and masks subsequent exception requests.

8. Sets the RB bit in SR to 1.

9. Branches to the address obtained by adding offset H’0000 0400 to the contents of VBR, and starts the instruction TLB miss exception handling routine.

Software Processing (Instruction TLB Miss Exception Handling Routine): Software is responsible for searching the external memory page table and assigning the necessary page table entry. Software should carry out the following processing in order to find and assign the necessary page table entry.

1. Write to PTEL the values of the PPN, PR, SZ, C, D, SH, V, and WT bits in the page table entry recorded in the external memory address translation table. If necessary, the values of the SA and TC bits should be written to PTEA.

2. When the entry to be replaced in entry replacement is specified by software, write that value to URC in the MMUCR register. If URC is greater than URB at this time, the value should be changed to an appropriate value after issuing an LDTLB instruction.

3. Execute the LDTLB instruction and write the contents of PTEH, PTEL, and PTEA to the TLB.

4. Finally, execute the exception handling return instruction (RTE), terminate the exception handling routine, and return control to the normal flow. The RTE instruction should be issued at least one instruction after the LDTLB instruction.

3.6.3 Instruction TLB Protection Violation Exception

An instruction TLB protection violation exception occurs when, even though an ITLB entry contains address translation information matching the virtual address to which an instruction access is made, the actual access type is not permitted by the access right specified by the PR bit. The instruction TLB protection violation exception processing carried out by hardware and software is shown below.

Hardware Processing: In the event of an instruction TLB protection violation exception, hardware carries out the following processing:



3. Sets exception code H’0A0 in EXPEVT.






9. Branches to the address obtained by adding offset H’0000 0100 to the contents of VBR, and starts the instruction TLB protection violation exception handling routine.

Software Processing (Instruction TLB Protection Violation Exception Handling Routine): Resolve the instruction TLB protection violation, execute the exception handling return instruction (RTE), terminate the exception handling routine, and return control to the normal flow. The RTE instruction should be issued at least one instruction after the LDTLB instruction.

3.6.4 Data TLB Multiple Hit Exception

A data TLB multiple hit exception occurs when more than one UTLB entry matches the virtual address to which a data access has been made. A data TLB multiple hit exception is also generated if multiple hits occur when the UTLB is searched in hardware ITLB miss handling.

When a data TLB multiple hit exception occurs, a reset is executed, and cache coherency is not guaranteed. The contents of PPN in the UTLB prior to the exception may also be corrupted.

Hardware Processing: In the event of a data TLB multiple hit exception, hardware carries out the following processing:



3. Branches to the reset handling routine (H’A000 0000).

Software Processing (Reset Routine): The UTLB entries which caused the multiple hit exception are checked in the reset handling routine. This exception is intended for use in program debugging, and should not normally be generated.

3.6.5 Data TLB Miss Exception

A data TLB miss exception occurs when address translation information for the virtual address to which a data access is made is not found in the UTLB entries. The data TLB miss exception processing carried out by hardware and software is shown below.

Hardware Processing: In the event of a data TLB miss exception, hardware carries out the following processing:



3. Sets exception code H’040 in the case of a read, or H’060 in the case of a write, in EXPEVT (OCBP, OCBWB: read; OCBI, MOVCA.L: write).






9. Branches to the address obtained by adding offset H’0000 0400 to the contents of VBR, and starts the data TLB miss exception handling routine.

Software Processing (Data TLB Miss Exception Handling Routine): Software is responsible for searching the external memory page table and assigning the necessary page table entry. Software should carry out the following processing in order to find and assign the necessary page table entry.

1. Write to PTEL the values of the PPN, PR, SZ, C, D, SH, V, and WT bits in the page table entry recorded in the external memory address translation table. If necessary, the values of the SA and TC bits should be written to PTEA.


3. Execute the LDTLB instruction and write the contents of PTEH, PTEL, and PTEA to the UTLB.


3.6.6 Data TLB Protection Violation Exception

A data TLB protection violation exception occurs when, even though a UTLB entry contains address translation information matching the virtual address to which a data access is made, the actual access type is not permitted by the access right specified by the PR bit. The data TLB protection violation exception processing carried out by hardware and software is shown below.

Hardware Processing: In the event of a data TLB protection violation exception, hardware carries out the following processing:



3. Sets exception code H’0A0 in the case of a read, or H’0C0 in the case of a write, in EXPEVT (OCBP, OCBWB: read; OCBI, MOVCA.L: write).






9. Branches to the address obtained by adding offset H’0000 0100 to the contents of VBR, and starts the data TLB protection violation exception handling routine.

Software Processing (Data TLB Protection Violation Exception Handling Routine): Resolve the data TLB protection violation, execute the exception handling return instruction (RTE), terminate the exception handling routine, and return control to the normal flow. The RTE instruction should be issued at least one instruction after the LDTLB instruction.

3.6.7 Initial Page Write Exception

An initial page write exception occurs when the D bit is 0 even though a UTLB entry contains address translation information matching the virtual address to which a data access (write) is made, and the access is permitted. The initial page write exception processing carried out by hardware and software is shown below.

Hardware Processing: In the event of an initial page write exception, hardware carries out the following processing:









9. Branches to the address obtained by adding offset H’0000 0100 to the contents of VBR, and starts the initial page write exception handling routine.

Software Processing (Initial Page Write Exception Handling Routine): The following processing should be carried out as the responsibility of software:

1. Retrieve the necessary page table entry from external memory.

2. Write 1 to the D bit in the external memory page table entry.

3. Write to PTEL the values of the PPN, PR, SZ, C, D, WT, SH, and V bits in the page table entry recorded in external memory. If necessary, the values of the SA and TC bits should be written to PTEA.


5. Execute the LDTLB instruction and write the contents of PTEH, PTEL, and PTEA to the UTLB.
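The branch destinations and the data-access exception codes stated in sections 3.6.1 to 3.6.7 can be collected in one place. The following sketch is only a summary aid: the enum and function names are hypothetical, and it encodes just the values given in the text above (reset vector H’A000 0000, VBR offsets H’0000 0400 and H’0000 0100, and the EXPEVT codes for data TLB miss and data TLB protection violation).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum mmu_exception {
    ITLB_MULTIPLE_HIT, ITLB_MISS, ITLB_PROTECTION_VIOLATION,
    DTLB_MULTIPLE_HIT, DTLB_MISS, DTLB_PROTECTION_VIOLATION,
    INITIAL_PAGE_WRITE
};

/* Address to which hardware branches for each MMU exception. */
uint32_t mmu_exception_target(uint32_t vbr, enum mmu_exception e)
{
    switch (e) {
    case ITLB_MULTIPLE_HIT:
    case DTLB_MULTIPLE_HIT:
        return 0xA0000000u;          /* reset handling routine */
    case ITLB_MISS:
    case DTLB_MISS:
        return vbr + 0x400u;         /* TLB miss handlers */
    case ITLB_PROTECTION_VIOLATION:
    case DTLB_PROTECTION_VIOLATION:
    case INITIAL_PAGE_WRITE:
        return vbr + 0x100u;         /* general exception handlers */
    }
    assert(!"unknown MMU exception");
    return 0;
}

/* EXPEVT code for a data TLB miss or data TLB protection violation. */
uint32_t dtlb_expevt_code(bool protection_violation, bool is_write)
{
    if (protection_violation)
        return is_write ? 0x0C0u : 0x0A0u;
    return is_write ? 0x060u : 0x040u;
}
```

Because several exceptions share one branch destination, a handler at VBR + H’0000 0100 has to read EXPEVT to tell them apart.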


3.7 Memory-Mapped TLB Configuration

To enable the ITLB and UTLB to be managed by software, their contents can be read and written by a P2 area program with a MOV instruction in privileged mode. Operation is not guaranteed if access is made from a program in another area. A branch to an area other than the P2 area should be made at least 8 instructions after this MOV instruction. The ITLB and UTLB are allocated to the P4 area in physical memory space. VPN, V, and ASID in the ITLB can be accessed as an address array, PPN, V, SZ, PR, C, and SH as data array 1, and SA and TC as data array 2. VPN, D, V, and ASID in the UTLB can be accessed as an address array, PPN, V, SZ, PR, C, D, WT, and SH as data array 1, and SA and TC as data array 2. V and D can be accessed from both the address array side and the data array side. Only longword access is possible. Instruction fetches cannot be performed in these areas. For reserved bits, a write value of 0 should be specified; their read value is undefined.

3.7.1 ITLB Address Array

The ITLB address array is allocated to addresses H’F200 0000 to H’F2FF FFFF in the P4 area. An address array access requires a 32-bit address field specification (when reading or writing) and a 32-bit data field specification (when writing). Information for selecting the entry to be accessed is specified in the address field, and VPN, V, and ASID to be written to the address array are specified in the data field.

In the address field, bits [31:24] have the value H’F2 indicating the ITLB address array, and the entry is selected by bits [9:8]. As longword access is used, 0 should be specified for address field bits [1:0].

In the data field, VPN is indicated by bits [31:10], V by bit [8], and ASID by bits [7:0].

The following two kinds of operation can be used on the ITLB address array:

1. ITLB address array read

VPN, V, and ASID are read into the data field from the ITLB entry corresponding to the entry set in the address field.

2. ITLB address array write

VPN, V, and ASID specified in the data field are written to the ITLB entry corresponding to the entry set in the address field.
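The address field and data field for an ITLB address array access can be computed as shown in the following sketch. The function names are hypothetical; the constants come from the description above (H’F2 in bits [31:24], entry in bits [9:8], VPN in bits [31:10], V in bit [8], ASID in bits [7:0]). The actual access would be a longword MOV from a P2 area program, which is not shown.

```c
#include <assert.h>
#include <stdint.h>

/* Address field for ITLB entry 0 to 3 (hypothetical helper). */
uint32_t itlb_addr_array_address(unsigned entry)
{
    assert(entry < 4);                    /* entry is selected by bits [9:8] */
    return 0xF2000000u | ((uint32_t)entry << 8);
}

/* Data field holding VPN [31:10], V [8], and ASID [7:0]. */
uint32_t itlb_addr_array_data(uint32_t vaddr, unsigned v, uint8_t asid)
{
    return (vaddr & 0xFFFFFC00u) | ((v & 1u) << 8) | asid;
}
```

Bits [1:0] of the computed address are always 0, as required for longword access.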

[Figure 3.13: field layouts for the memory-mapped ITLB address array.]

Address field   [31:24] 1111 0010, [9:8] E
Data field      [31:10] VPN, [9] reserved, [8] V, [7:0] ASID

VPN: Virtual page number
V: Validity bit
E: Entry
ASID: Address space identifier
Reserved bits: 0 write value, undefined read value

Figure 3.13 Memory-Mapped ITLB Address Array

3.7.2 ITLB Data Array 1

ITLB data array 1 is allocated to addresses H’F300 0000 to H’F37F FFFF in the P4 area. A data array access requires a 32-bit address field specification (when reading or writing) and a 32-bit data field specification (when writing). Information for selecting the entry to be accessed is specified in the address field, and PPN, V, SZ, PR, C, and SH to be written to the data array are specified in the data field.

In the address field, bits [31:23] have the value H’F30 indicating ITLB data array 1, and the entry is selected by bits [9:8].

In the data field, PPN is indicated by bits [28:10], V by bit [8], SZ by bits [7] and [4], PR by bit [6], C by bit [3], and SH by bit [1].

The following two kinds of operation can be used on ITLB data array 1:

1. ITLB data array 1 read

PPN, V, SZ, PR, C, and SH are read into the data field from the ITLB entry corresponding to the entry set in the address field.

2. ITLB data array 1 write

PPN, V, SZ, PR, C, and SH specified in the data field are written to the ITLB entry corresponding to the entry set in the address field.
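A value read from ITLB data array 1 can be split into its fields as sketched below. The struct and function names are hypothetical, and combining bit [7] as the high SZ bit with bit [4] as the low SZ bit is an assumption; the bit positions themselves are those listed above.

```c
#include <assert.h>
#include <stdint.h>

struct itlb_data1 {
    uint32_t ppn;        /* physical page number, bits [28:10] */
    unsigned v, sz, pr, c, sh;
};

/* Address field for ITLB data array 1, entry 0 to 3. */
uint32_t itlb_data1_address(unsigned entry)
{
    assert(entry < 4);
    return 0xF3000000u | ((uint32_t)entry << 8);
}

/* Split a data field longword into its fields. */
struct itlb_data1 itlb_data1_decode(uint32_t w)
{
    struct itlb_data1 f;
    f.ppn = (w >> 10) & 0x7FFFFu;                        /* 19 bits */
    f.v   = (w >> 8) & 1u;
    f.sz  = (((w >> 7) & 1u) << 1) | ((w >> 4) & 1u);    /* assumed bit order */
    f.pr  = (w >> 6) & 1u;                               /* single PR bit in the ITLB */
    f.c   = (w >> 3) & 1u;
    f.sh  = (w >> 1) & 1u;
    return f;
}
```

Reserved bits are ignored on decode because their read value is undefined.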


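[Figure 3.14: field layouts for memory-mapped ITLB data array 1.]

Address field   [31:23] 1111 0011 0, [9:8] E
Data field      [31:29] reserved, [28:10] PPN, [9] reserved, [8] V, [7] SZ, [6] PR, [5] reserved, [4] SZ, [3] C, [2] reserved, [1] SH, [0] reserved

PPN: Physical page number
V: Validity bit
E: Entry
SZ: Page size bits
PR: Protection key data
C: Cacheability bit
SH: Share status bit
Reserved bits: 0 write value, undefined read value

Figure 3.14 Memory-Mapped ITLB Data Array 1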

3.7.3 ITLB Data Array 2

ITLB data array 2 is allocated to addresses H’F380 0000 to H’F3FF FFFF in the P4 area. A data array access requires a 32-bit address field specification (when reading or writing) and a 32-bit data field specification (when writing). Information for selecting the entry to be accessed is specified in the address field, and SA and TC to be written to data array 2 are specified in the data field.

In the address field, bits [31:23] have the value H’F38 indicating ITLB data array 2, and the entry is selected by bits [9:8].

In the data field, SA is indicated by bits [2:0], and TC by bit [3].

The following two kinds of operation can be used on ITLB data array 2:

1. ITLB data array 2 read

SA and TC are read into the data field from the ITLB entry corresponding to the entry set in the address field.

2. ITLB data array 2 write

SA and TC specified in the data field are written to the ITLB entry corresponding to the entry set in the address field.


[Figure 3.15: field layouts for memory-mapped ITLB data array 2.]

Address field   [31:23] 1111 0011 1, [9:8] E
Data field      [31:4] reserved, [3] TC, [2:0] SA

TC: Timing control bit
E: Entry
SA: Space attribute bits
Reserved bits: 0 write value, undefined read value

Figure 3.15 Memory-Mapped ITLB Data Array 2

3.7.4 UTLB Address Array

The UTLB address array is allocated to addresses H’F600 0000 to H’F6FF FFFF in the P4 area. An address array access requires a 32-bit address field specification (when reading or writing) and a 32-bit data field specification (when writing). Information for selecting the entry to be accessed is specified in the address field, and VPN, D, V, and ASID to be written to the address array are specified in the data field.

In the address field, bits [31:24] have the value H’F6 indicating the UTLB address array, and the entry is selected by bits [13:8]. The address array bit [7] association bit (A bit) specifies whether or not address comparison is performed when writing to the UTLB address array.

In the data field, VPN is indicated by bits [31:10], D by bit [9], V by bit [8], and ASID by bits [7:0].

The following three kinds of operation can be used on the UTLB address array:

1. UTLB address array read

VPN, D, V, and ASID are read into the data field from the UTLB entry corresponding to the entry set in the address field. In a read, associative operation is not performed regardless of whether the association bit specified in the address field is 1 or 0.

2. UTLB address array write (non-associative)

VPN, D, V, and ASID specified in the data field are written to the UTLB entry corresponding to the entry set in the address field. The A bit in the address field should be cleared to 0.

3. UTLB address array write (associative)

When a write is performed with the A bit in the address field set to 1, comparison of all the UTLB entries is carried out using the VPN specified in the data field and PTEH.ASID. The usual address comparison rules are followed, but if a UTLB miss occurs, the result is no operation, and an exception is not generated. If the comparison identifies a UTLB entry corresponding to the VPN specified in the data field, D and V specified in the data field are written to that entry. If there is more than one matching entry, a data TLB multiple hit exception results. This associative operation is simultaneously carried out on the ITLB, and if a matching entry is found in the ITLB, V is written to that entry. Even if the UTLB comparison results in no operation, a write to the ITLB side only is performed as long as there is an ITLB match. If there is a match in both the UTLB and ITLB, the UTLB information is also written to the ITLB.
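The address and data fields for the three UTLB address array operations can be formed as in the sketch below. The function names are hypothetical; the bit positions (H’F6 in bits [31:24], entry in bits [13:8], A bit in bit [7], VPN in bits [31:10], D in bit [9], V in bit [8], ASID in bits [7:0]) are taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Address field for UTLB entry 0 to 63; associative sets the A bit. */
uint32_t utlb_addr_array_address(unsigned entry, bool associative)
{
    assert(entry < 64);                   /* entry is selected by bits [13:8] */
    return 0xF6000000u | ((uint32_t)entry << 8) | (associative ? 0x80u : 0u);
}

/* Data field holding VPN [31:10], D [9], V [8], and ASID [7:0].
 * In an associative write the comparison uses PTEH.ASID, not this ASID. */
uint32_t utlb_addr_array_data(uint32_t vaddr, unsigned d, unsigned v, uint8_t asid)
{
    return (vaddr & 0xFFFFFC00u) | ((d & 1u) << 9) | ((v & 1u) << 8) | asid;
}
```

One plausible use is invalidating the translation for a single page: an associative write with D = 0 and V = 0 clears V in any matching UTLB and ITLB entry without software having to know which entry holds it.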

[Figure 3.16: field layouts for the memory-mapped UTLB address array.]

Address field   [31:24] 1111 0110, [13:8] E, [7] A
Data field      [31:10] VPN, [9] D, [8] V, [7:0] ASID

VPN: Virtual page number
V: Validity bit
E: Entry
D: Dirty bit
ASID: Address space identifier
A: Association bit
Reserved bits: 0 write value, undefined read value

Figure 3.16 Memory-Mapped UTLB Address Array

3.7.5 UTLB Data Array 1

UTLB data array 1 is allocated to addresses H’F700 0000 to H’F77F FFFF in the P4 area. A data array access requires a 32-bit address field specification (when reading or writing) and a 32-bit data field specification (when writing). Information for selecting the entry to be accessed is specified in the address field, and PPN, V, SZ, PR, C, D, SH, and WT to be written to the data array are specified in the data field.

In the address field, bits [31:23] have the value H’F70 indicating UTLB data array 1, and the entry is selected by bits [13:8].

In the data field, PPN is indicated by bits [28:10], V by bit [8], SZ by bits [7] and [4], PR by bits [6:5], C by bit [3], D by bit [2], SH by bit [1], and WT by bit [0].

The following two kinds of operation can be used on UTLB data array 1:

1. UTLB data array 1 read

PPN, V, SZ, PR, C, D, SH, and WT are read into the data field from the UTLB entry corresponding to the entry set in the address field.

2. UTLB data array 1 write

PPN, V, SZ, PR, C, D, SH, and WT specified in the data field are written to the UTLB entry corresponding to the entry set in the address field.

[Figure 3.17: field layouts for memory-mapped UTLB data array 1.]

Address field   [31:23] 1111 0111 0, [13:8] E
Data field      [31:29] reserved, [28:10] PPN, [9] reserved, [8] V, [7] SZ, [6:5] PR, [4] SZ, [3] C, [2] D, [1] SH, [0] WT

PPN: Physical page number
V: Validity bit
E: Entry
SZ: Page size bits
D: Dirty bit
PR: Protection key data
C: Cacheability bit
SH: Share status bit
WT: Write-through bit
Reserved bits: 0 write value, undefined read value

Figure 3.17 Memory-Mapped UTLB Data Array 1

3.7.6 UTLB Data Array 2

UTLB data array 2 is allocated to addresses H’F780 0000 to H’F7FF FFFF in the P4 area. A data array access requires a 32-bit address field specification (when reading or writing) and a 32-bit data field specification (when writing). Information for selecting the entry to be accessed is specified in the address field, and SA and TC to be written to data array 2 are specified in the data field.

In the address field, bits [31:23] have the value H’F78 indicating UTLB data array 2, and the entry is selected by bits [13:8].

In the data field, TC is indicated by bit [3], and SA by bits [2:0].

The following two kinds of operation can be used on UTLB data array 2:

1. UTLB data array 2 read

SA and TC are read into the data field from the UTLB entry corresponding to the entry set in the address field.

2. UTLB data array 2 write

SA and TC specified in the data field are written to the UTLB entry corresponding to the entry set in the address field.
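For UTLB data array 2 the address and data fields are small enough to sketch in a few lines. The helper names are hypothetical; the positions (entry in address bits [13:8], TC in data bit [3], SA in data bits [2:0]) are those given above.

```c
#include <assert.h>
#include <stdint.h>

/* Address field for UTLB data array 2, entry 0 to 63. */
uint32_t utlb_data2_address(unsigned entry)
{
    assert(entry < 64);
    return 0xF7800000u | ((uint32_t)entry << 8);
}

/* Data field holding TC in bit [3] and SA in bits [2:0]. */
uint32_t utlb_data2_pack(unsigned sa, unsigned tc)
{
    assert(sa < 8);
    return ((tc & 1u) << 3) | sa;
}

/* Field extraction; reserved bits [31:4] are ignored. */
unsigned utlb_data2_sa(uint32_t w) { return w & 7u; }
unsigned utlb_data2_tc(uint32_t w) { return (w >> 3) & 1u; }
```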


[Figure 3.18: field layouts for memory-mapped UTLB data array 2.]

Address field   [31:23] 1111 0111 1, [13:8] E
Data field      [31:4] reserved, [3] TC, [2:0] SA

TC: Timing control bit
E: Entry
SA: Space attribute bits
Reserved bits: 0 write value, undefined read value

Figure 3.18 Memory-Mapped UTLB Data Array 2

Section 4 Caches

4.1 Overview

4.1.1 Features

The SH7750 has an on-chip 8-kbyte instruction cache (IC) for instructions and 16-kbyte operand cache (OC) for data. Half of the memory of the operand cache (8 kbytes) can also be used as on-chip RAM. The features of these caches are summarized in table 4.1.

Table 4.1 Cache Features

Item          Instruction Cache   Operand Cache
Capacity      8-kbyte cache       16-kbyte cache or 8-kbyte cache + 8-kbyte RAM
Type          Direct mapping      Direct mapping
Line size     32 bytes            32 bytes
Entries       256                 512
Write method                      Copy-back/write-through selectable

Item          Store Queues
Capacity      2 × 32 bytes
Addresses     H’E000 0000 to H’E3FF FFFF
Write         Store instruction (1-cycle write)
Write-back    Prefetch instruction
Access right  MMU off: according to MMUCR.SQMD
              MMU on: according to individual page PR


Table 4.2 shows the cache control registers.

Table 4.2 Cache Control Registers

Name                              Abbreviation  R/W  Initial Value*1  P4 Address*2  Area 7 Address*2  Access Size
Cache control register            CCR           R/W  H’0000 0000      H’FF00 001C   H’1F00 001C       32
Queue address control register 0  QACR0         R/W  Undefined        H’FF00 0038   H’1F00 0038       32
Queue address control register 1  QACR1         R/W  Undefined        H’FF00 003C   H’1F00 003C       32

Notes: 1. The initial value is the value after a power-on or manual reset.
2. This is the address when using the virtual/physical address space P4 area. When



There are three cache and store queue related control registers, as shown in figure 4.1.

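[Figure 4.1: bit layouts of the cache and store queue control registers.]

CCR     [31:16] reserved, [15] IIX, [14:12] reserved, [11] ICI, [10:9] reserved, [8] ICE, [7] OIX, [6] reserved, [5] ORA, [4] reserved, [3] OCI, [2] CB, [1] WT, [0] OCE
QACR0   [31:5] reserved, [4:2] AREA, [1:0] reserved
QACR1   [31:5] reserved, [4:2] AREA, [1:0] reserved

Reserved bits: 0 must be specified in a write; the read value is undefined.

Figure 4.1 Cache and Store Queue Control Registers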

(1) Cache Control Register (CCR): CCR contains the following bits:

IIX: IC index enable
ICI: IC invalidation
ICE: IC enable
OIX: OC index enable
ORA: OC RAM enable
OCI: OC invalidation
CB: Copy-back enable
WT: Write-through enable
OCE: OC enable

Longword access to CCR can be performed from H’FF00 001C in the P4 area and H’1F00 001C in area 7. The CCR bits are used for the cache settings described below. Consequently, CCR modifications must only be made by a program in the non-cached P2 area. After CCR is updated, an instruction that performs data access to the P0, P1, P3, or U0 area should be located at least four instructions after the CCR update instruction. Also, a branch instruction to the P0, P1, P3, or U0 area should be located at least eight instructions after the CCR update instruction.

• IIX: IC index enable bit

0: Address bits [12:5] used for IC entry selection

1: Address bits [25] and [11:5] used for IC entry selection

• ICI: IC invalidation bit

When 1 is written to this bit, the V bits of all IC entries are cleared to 0. This bit always returns 0 when read.

• ICE: IC enable bit

Indicates whether or not the IC is to be used. When address translation is performed, the IC cannot be used unless the C bit in the page management information is also 1.

0: IC not used

1: IC used

• OIX: OC index enable bit

0: Address bits [13:5] used for OC entry selection

1: Address bits [25] and [12:5] used for OC entry selection

• ORA: OC RAM enable bit

When the OC is enabled (OCE = 1), the ORA bit specifies whether the 8 kbytes from entry 128 to entry 255 and from entry 384 to entry 511 of the OC are to be used as RAM. When the OC is not enabled (OCE = 0), the ORA bit should be cleared to 0.

0: 16 kbytes used as cache

1: 8 kbytes used as cache, and 8 kbytes as RAM

• OCI: OC invalidation bit

When 1 is written to this bit, the V and U bits of all OC entries are cleared to 0. This bit always returns 0 when read.

• CB: Copy-back bit

Indicates the P1 area cache write mode.


1: Copy-back mode

• WT: Write-through bit

Indicates the P0, U0, and P3 area cache write mode. When address translation is performed, the value of the WT bit in the page management information has priority.

0: Copy-back mode


• OCE: OC enable bit

Indicates whether or not the OC is to be used. When address translation is performed, the OC cannot be used unless the C bit in the page management information is also 1.

0: OC not used

1: OC used
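The effect of the IIX, OIX, and ORA bits on entry selection can be expressed as simple bit arithmetic. This is a sketch under stated assumptions: the function names are hypothetical, and placing address bit [25] in the most significant position of the entry number in index mode is an assumption based on it replacing bit [12] (IC) or bit [13] (OC).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IC entry number (0 to 255) selected by an address. */
unsigned ic_entry(uint32_t addr, bool iix)
{
    if (iix)   /* bits [25] and [11:5] */
        return (unsigned)((((addr >> 25) & 1u) << 7) | ((addr >> 5) & 0x7Fu));
    return (unsigned)((addr >> 5) & 0xFFu);              /* bits [12:5] */
}

/* OC entry number (0 to 511) selected by an address. */
unsigned oc_entry(uint32_t addr, bool oix)
{
    if (oix)   /* bits [25] and [12:5] */
        return (unsigned)((((addr >> 25) & 1u) << 8) | ((addr >> 5) & 0xFFu));
    return (unsigned)((addr >> 5) & 0x1FFu);             /* bits [13:5] */
}

/* True for the OC entries used as RAM when ORA = 1:
 * entries 128 to 255 and 384 to 511. */
bool oc_entry_is_ram(unsigned entry)
{
    assert(entry < 512);
    return (entry & 0x80u) != 0;
}
```

The RAM entries are exactly those whose entry number has bit 7 set, which is why a single mask test covers both ranges.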

(2) Queue Address Control Register 0 (QACR0): Longword access to QACR0 can be performed from H’FF00 0038 in the P4 area and H’1F00 0038 in area 7. QACR0 specifies the area onto which store queue 0 (SQ0) is mapped when the MMU is off.

(3) Queue Address Control Register 1 (QACR1): Longword access to QACR1 can be performed from H’FF00 003C in the P4 area and H’1F00 003C in area 7. QACR1 specifies the area onto which store queue 1 (SQ1) is mapped when the MMU is off.

4.3 Operand Cache (OC)

4.3.1 Configuration

Figure 4.2 shows the configuration of the operand cache.

[Figure 4.2: configuration of the operand cache. The figure shows the effective address, entry selection using bits [13], [12], and [11:5] together with OIX and ORA (RAM area determination), the MMU, the address array (entries 0 to 511; 19-bit tag address, 1-bit U, 1-bit V), the data array (LW0 to LW7, 32 bits each), longword (LW) selection, the tag comparison that produces the hit signal, and the read data and write data paths.]

Figure 4.2 Configuration of Operand Cache

The operand cache consists of 512 cache lines, each composed of a 19-bit tag, V bit, U bit, and 32-byte data.

• Tag

Stores the upper 19 bits of the 29-bit external memory address of the data line to be cached. The tag is not initialized by a power-on or manual reset.

• V bit (validity bit)

Indicates that valid data is stored in the cache line. When this bit is 1, the cache line data is valid. The V bit is initialized to 0 by a power-on reset, but retains its value in a manual reset.

• U bit (dirty bit)

The U bit is set to 1 if data is written to the cache line while the cache is being used in copy-back mode. That is, the U bit indicates a mismatch between the data in the cache line and the data in external memory. The U bit is never set to 1 while the cache is being used in write-through mode, unless it is modified by accessing the memory-mapped cache (see section 4.5, Memory-Mapped Cache Configuration). The U bit is initialized to 0 by a power-on reset, but retains its value in a manual reset.

• Data field

The data field holds 32 bytes (256 bits) of data per cache line. The data array is not initialized by a power-on or manual reset.

4.3.2 Read Operation

When the OC is enabled (CCR.OCE = 1) and data is read by means of an effective address from a cacheable area, the cache operates as follows:

1. The tag, V bit, and U bit are read from the cache line indexed by effective address bits [13:5].

2. The tag is compared with bits [28:10] of the address resulting from effective address translation by the MMU:

• If the tag matches and the V bit is 1 → (3a)

• If the tag matches and the V bit is 0 → (3b)

• If the tag does not match and the V bit is 0 → (3b)

• If the tag does not match, the V bit is 1, and the U bit is 0 → (3b)

• If the tag does not match, the V bit is 1, and the U bit is 1 → (3c)


3a. Cache hit

The data indexed by effective address bits [4:0] is read from the data field of the cache line indexed by effective address bits [13:5] in accordance with the access size (quadword/longword/word/byte).

3b. Cache miss (no write-back)

Data is read into the cache line from the external memory space corresponding to the effective address. Data reading is performed, using the wraparound method, in order from the longword data corresponding to the effective address, and when the corresponding data arrives in the cache, the read data is returned to the CPU. While the remainder of the cache line is being read, the CPU can execute the next processing. When reading of one line of data is completed, the tag corresponding to the effective address is recorded in the cache, and 1 is written to the V bit.

3c. Cache miss (with write-back)

The tag and data field of the cache line indexed by effective address bits [13:5] are saved in the write-back buffer. Then data is read into the cache line from the external memory space corresponding to the effective address. Data reading is performed, using the wraparound method, in order from the longword data corresponding to the effective address, and when the corresponding data arrives in the cache, the read data is returned to the CPU. While the remainder of the cache line is being read, the CPU can execute the next processing. When reading of one line of data is completed, the tag corresponding to the effective address is recorded in the cache, 1 is written to the V bit, and 0 to the U bit. The data in the write-back buffer is then written back to external memory.
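The read lookup described in this section can be sketched in C as follows. This is a minimal behavioral model for illustration only, not a description of the hardware implementation. The names oc_line_t, oc_index, oc_tag, and oc_read_outcome are hypothetical, and normal indexing (CCR.OIX = 0) is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the address array part of one OC line. */
typedef struct {
    uint32_t tag;  /* 19-bit tag: physical address bits [28:10] */
    bool     v;    /* validity bit */
    bool     u;    /* dirty bit */
} oc_line_t;

typedef enum {
    OC_READ_HIT,            /* (3a) */
    OC_READ_MISS,           /* (3b) no write-back */
    OC_READ_MISS_WRITEBACK  /* (3c) dirty line goes to the write-back buffer */
} oc_read_outcome_t;

/* Entry selection: effective address bits [13:5] (512 entries). */
static uint32_t oc_index(uint32_t ea) { return (ea >> 5) & 0x1FFu; }

/* Tag value: bits [28:10] of the translated (physical) address. */
static uint32_t oc_tag(uint32_t pa) { return (pa >> 10) & 0x7FFFFu; }

/* Step 2 of the read operation: classify the access from tag, V, and U. */
static oc_read_outcome_t oc_read_outcome(const oc_line_t *line, uint32_t pa)
{
    assert(line != NULL);
    bool match = (line->tag == oc_tag(pa));
    if (match && line->v)
        return OC_READ_HIT;
    if (!match && line->v && line->u)
        return OC_READ_MISS_WRITEBACK;
    return OC_READ_MISS;
}
```

Only the combination of a tag mismatch with V = 1 and U = 1 requires the old line to be saved; every other non-hit case is a plain line fill.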

4.3.3 Write Operation

When the OC is enabled (CCR.OCE = 1) and data is written by means of an effective address to a cacheable area, the cache operates as follows:

1. The tag, V bit, and U bit are read from the cache line indexed by effective address bits [13:5].

2. The tag is compared with bits [28:10] of the address resulting from effective address translation by the MMU:

• If the tag matches and the V bit is 1 → copy-back: (3a), write-through: (3b)

• If the tag matches and the V bit is 0 → copy-back: (3c), write-through: (3d)

• If the tag does not match and the V bit is 0 → copy-back: (3c), write-through: (3d)

• If the tag does not match, the V bit is 1, and the U bit is 0 → copy-back: (3c), write-through: (3d)

• If the tag does not match, the V bit is 1, and the U bit is 1 → copy-back: (3e), write-through: (3d)


3a. Cache hit (copy-back)

A data write in accordance with the access size (quadword/longword/word/byte) is performed to the data indexed by effective address bits [4:0] in the data field of the cache line indexed by effective address bits [13:5]. Then 1 is set in the U bit.

3b. Cache hit (write-through)

A data write in accordance with the access size (quadword/longword/word/byte) is performed to the data indexed by effective address bits [4:0] in the data field of the cache line indexed by effective address bits [13:5]. A write is also performed to the corresponding external memory using the specified access size.

3c. Cache miss (no copy-back/write-back)

A data write in accordance with the access size (quadword/longword/word/byte) is performed to the data indexed by effective address bits [4:0] in the data field of the cache line indexed by effective address bits [13:5]. Then, data is read into the cache line from the external memory space corresponding to the effective address. Data reading is performed, using the wraparound method, in order from the longword data corresponding to the effective address, and one cache line of data is read excluding the written data. During this time, the CPU can execute the next processing. When reading of one line of data is completed, the tag corresponding to the effective address is recorded in the cache, and 1 is written to the V bit and U bit.

3d. Cache miss (write-through)

A write of the specified access size is performed to the external memory corresponding to the effective address. In this case, a write to cache is not performed.

3e. Cache miss (with copy-back/write-back)

The tag and data field of the cache line indexed by effective address bits [13:5] are first saved in the write-back buffer, and then a data write in accordance with the access size (quadword/longword/word/byte) is performed to the data indexed by effective address bits [4:0] in the data field of the cache line indexed by effective address bits [13:5]. Then, data is read into the cache line from the external memory space corresponding to the effective address. Data reading is performed, using the wraparound method, in order from the longword data corresponding to the effective address, and one cache line of data is read excluding the written data. During this time, the CPU can execute the next processing. When reading of one line of data is completed, the tag corresponding to the effective address is recorded in the cache, and 1 is written to the V bit and U bit. The data in the write-back buffer is then written back to external memory.
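The write cases (3a) to (3e) depend on the tag comparison, the V and U bits, and the write mode. The following C sketch restates that decision as a function. It is an illustrative model only; the enumerations and the function name oc_write_outcome are hypothetical and do not correspond to any register or hardware signal.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { OC_COPY_BACK, OC_WRITE_THROUGH } oc_mode_t;

typedef enum {
    OC_WRITE_HIT_COPY_BACK,        /* (3a) write to cache, set U */
    OC_WRITE_HIT_WRITE_THROUGH,    /* (3b) write to cache and external memory */
    OC_WRITE_MISS_FILL,            /* (3c) write, then fill the line */
    OC_WRITE_MISS_WRITE_THROUGH,   /* (3d) write to external memory only */
    OC_WRITE_MISS_FILL_WRITE_BACK  /* (3e) save dirty line, write, then fill */
} oc_write_outcome_t;

/* Classify a write from the tag comparison result, V bit, U bit, and mode. */
static oc_write_outcome_t oc_write_outcome(bool tag_match, bool v, bool u,
                                           oc_mode_t mode)
{
    assert(mode == OC_COPY_BACK || mode == OC_WRITE_THROUGH);
    if (tag_match && v)
        return mode == OC_WRITE_THROUGH ? OC_WRITE_HIT_WRITE_THROUGH
                                        : OC_WRITE_HIT_COPY_BACK;
    if (mode == OC_WRITE_THROUGH)
        return OC_WRITE_MISS_WRITE_THROUGH;  /* every write-through miss */
    if (!tag_match && v && u)
        return OC_WRITE_MISS_FILL_WRITE_BACK;
    return OC_WRITE_MISS_FILL;
}
```

In write-through mode a miss never allocates a line, so the U bit of the indexed entry has no influence on the outcome.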


4.3.4 Write-Back Buffer

In order to give priority to data reads to the cache and improve performance, the SH7750 has a write-back buffer which holds the relevant cache entry when it becomes necessary to purge a dirty cache entry into external memory as the result of a cache miss. The write-back buffer contains one cache line of data and the physical address of the purge destination.

Physical address bits [28:5] | LW0 | LW1 | LW2 | LW3 | LW4 | LW5 | LW6 | LW7

Figure 4.3 Configuration of Write-Back Buffer

4.3.5 Write-Through Buffer

The SH7750 has a 64-bit buffer for holding write data when writing data in write-through mode or writing to a non-cacheable area. This allows the CPU to proceed to the next operation as soon as the write to the write-through buffer is completed, without waiting for completion of the write to external memory.

Physical address bits [28:0] | LW0 | LW1

Figure 4.4 Configuration of Write-Through Buffer

4.3.6 RAM Mode

Setting CCR.ORA to 1 enables 8 kbytes of the operand cache to be used as RAM. The operand cache entries used as RAM are entries 128 to 255 and 384 to 511. Other entries can still be used as cache. RAM can be accessed using addresses H’7C00 0000 to H’7FFF FFFF. Byte-, word-, longword-, and quadword-size data reads and writes can be performed in the operand cache RAM area. Instruction fetches cannot be performed in this area.

An example of RAM use is shown below. Here, the 4 kbytes comprising OC entries 128 to 255 are designated as RAM area 1, and the 4 kbytes comprising OC entries 384 to 511 as RAM area 2.


• When OC index mode is off (CCR.OIX = 0)

H’7C00 0000 to H’7C00 0FFF (4 kB): Corresponds to RAM area 1





: : :

RAM areas 1 and 2 then repeat every 8 kbytes up to H’7FFF FFFF.

Thus, to secure a continuous 8-kbyte RAM area, the area from H’7C00 1000 to H’7C00 2FFF can be used, for example.

• When OC index mode is on (CCR.OIX = 1)




: : :

H’7DFF F000 to H’7DFF FFFF (4 kB): Corresponds to RAM area 1

H’7E00 0000 to H’7E00 0FFF (4 kB): Corresponds to RAM area 2

H’7E00 1000 to H’7E00 1FFF (4 kB): Corresponds to RAM area 2

: : :

H’7FFF F000 to H’7FFF FFFF (4 kB): Corresponds to RAM area 2

As the distinction between RAM areas 1 and 2 is indicated by address bit [25], the area from H’7DFF F000 to H’7E00 0FFF should be used to secure a continuous 8-kbyte RAM area.

4.3.7 OC Index Mode

Setting CCR.OIX to 1 enables OC indexing to be performed using bit [25] of the effective address. This is called OC index mode. In normal mode, with CCR.OIX cleared to 0, OC indexing is performed using bits [13:5] of the effective address; therefore, when 16 kbytes or more of consecutive data is handled, the OC is fully used by this data. This results in frequent cache misses. Using index mode allows the OC to be handled as two 8-kbyte areas by means of effective address bit [25], providing efficient use of the cache.
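The effect of CCR.OIX on entry selection can be illustrated with a short C sketch. It assumes that in index mode bit [25] supplies the most significant index bit in place of bit [13], while bits [12:5] are used in both modes; this is an interpretation of the description above and of figure 4.2, and the function name oc_entry is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns the OC entry number (0 to 511) selected by an effective address.
 * Assumption: with OIX = 1, address bit [25] replaces bit [13] as the top
 * index bit; bits [12:5] form the low eight index bits in both modes. */
static uint32_t oc_entry(uint32_t ea, bool oix)
{
    uint32_t low   = (ea >> 5) & 0xFFu;                        /* bits [12:5] */
    uint32_t high  = oix ? (ea >> 25) & 1u : (ea >> 13) & 1u;  /* [25] or [13] */
    uint32_t entry = (high << 8) | low;
    assert(entry < 512u);
    return entry;
}
```

Under this assumption, two data sets placed in regions that differ in address bit [25] occupy separate 8-kbyte halves of the OC in index mode, so streaming through one does not evict the other.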


4.3.8 Coherency between Cache and External Memory

Coherency between cache and external memory should be assured by software. In the SH7750, the following four new instructions are supported for cache operations. For details of these instructions, see section 10, Instruction Descriptions.

Invalidate instruction: OCBI @Rn Cache invalidation (no write-back)

Purge instruction: OCBP @Rn Cache invalidation (with write-back)

Write-back instruction: OCBWB @Rn Cache write-back

Allocate instruction: MOVCA.L R0,@Rn Cache allocation
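Since each of these instructions operates on the cache line addressed by Rn, software that maintains coherency for a buffer typically applies one of them to every 32-byte line the buffer overlaps. The following C sketch shows only that address iteration. The names for_each_cache_line, cache_line_op_t, and count_line are hypothetical; on the target, the callback would be a routine that executes a single OCBI, OCBP, or OCBWB on the given address, which is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 32u  /* bytes per cache line */

/* Operation applied to the start address of one cache line. On the target
 * this would wrap one OCBI, OCBP, or OCBWB instruction (assumption). */
typedef void (*cache_line_op_t)(uintptr_t line_addr, void *ctx);

/* Applies op to every 32-byte line overlapping [start, start + len).
 * Returns the number of lines visited. */
static size_t for_each_cache_line(uintptr_t start, size_t len,
                                  cache_line_op_t op, void *ctx)
{
    assert(op != NULL);
    if (len == 0)
        return 0;
    uintptr_t mask  = ~(uintptr_t)(CACHE_LINE_SIZE - 1u);
    uintptr_t first = start & mask;
    uintptr_t last  = (start + len - 1u) & mask;
    size_t count = 0;
    for (uintptr_t a = first; ; a += CACHE_LINE_SIZE) {
        op(a, ctx);
        count++;
        if (a == last)
            break;
    }
    return count;
}

/* Host-side stand-in for a cache instruction: counts the lines visited. */
static void count_line(uintptr_t line_addr, void *ctx)
{
    (void)line_addr;
    ++*(size_t *)ctx;
}
```

A buffer that is not aligned to a 32-byte boundary shares its first or last line with neighboring data, so an invalidate without write-back on such a line can discard unrelated modified data; a purge is the safer choice in that case.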

4.3.9 Prefetch Operation

The SH7750 supports a prefetch instruction to reduce the cache fill penalty incurred as the result of a cache miss. If it is known that a cache miss will result from a read or write operation, it is possible to fill the cache with data beforehand by means of the prefetch instruction to prevent a cache miss due to the read or write operation, and so improve software performance. If a prefetch instruction is executed for data already held in the cache, or if the prefetch address results in a UTLB miss or a protection violation, the result is no operation, and an exception is not generated. For details of the prefetch instruction, see section 10.74, PREF.

Prefetch instruction: PREF @Rn


4.4 Instruction Cache (IC)

4.4.1 Configuration

Figure 4.5 shows the configuration of the instruction cache.

[Figure 4.5 (block diagram): effective address, MMU, entry selection (IIX, address bits [12], [11:5]), longword (LW) selection, address array (entries 0 to 255: 19-bit tag address, 1-bit V), data array (LW0 to LW7, 32 bits each), compare, hit signal, read data]

Figure 4.5 Configuration of Instruction Cache


The instruction cache consists of 256 cache lines, each composed of a 19-bit tag, V bit, and 32-byte data (16 instructions).

• Tag

Stores the upper 19 bits of the 29-bit external memory address of the data line to be cached. The tag is not initialized by a power-on or manual reset.

• V bit (validity bit)

Indicates that valid data is stored in the cache line. When this bit is 1, the cache line data is valid. The V bit is initialized to 0 by a power-on reset, but retains its value in a manual reset.

• Data array

The data field holds 32 bytes (256 bits) of data per cache line. The data array is not initialized by a power-on or manual reset.

4.4.2 Read Operation

When the IC is enabled (CCR.ICE = 1) and instruction fetches are performed by means of an effective address from a cacheable area, the instruction cache operates as follows:

1. The tag and V bit are read from the cache line indexed by effective address bits [12:5].

2. The tag is compared with bits [28:10] of the address resulting from effective address translation by the MMU:

• If the tag matches and the V bit is 1 → (3a)

• If the tag matches and the V bit is 0 → (3b)



3a. Cache hit

The data indexed by effective address bits [4:2] is read as an instruction from the data field of the cache line indexed by effective address bits [12:5].

3b. Cache miss

Data is read into the cache line from the external memory space corresponding to the effective address. Data reading is performed, using the wraparound method, in order from the longword data corresponding to the effective address, and when the corresponding data arrives in the cache, the read data is returned to the CPU as an instruction. When reading of one line of data is completed, the tag corresponding to the effective address is recorded in the cache, and 1 is written to the V bit.


4.4.3 IC Index Mode

Setting CCR.IIX to 1 enables IC indexing to be performed using bit [25] of the effective address. This is called IC index mode. In normal mode, with CCR.IIX cleared to 0, IC indexing is performed using bits [12:5] of the effective address; therefore, when 8 kbytes or more of consecutive program instructions are handled, the IC is fully used by this program. This results in frequent cache misses. Using index mode allows the IC to be handled as two 4-kbyte areas by means of effective address bit [25], providing efficient use of the cache.

4.5 Memory-Mapped Cache Configuration

To enable the IC and OC to be managed by software, their contents can be read and written by a P2 area program with a MOV instruction in privileged mode. Operation is not guaranteed if access is made from a program in another area. Any branch to the P0, U0, P1, or P3 area following such an access should be made at least 8 instructions after the MOV instruction. The IC and OC are allocated to the P4 area in physical memory space. Only data accesses can be used on both the IC address array and data array and the OC address array and data array, and accesses are always longword-size. Instruction fetches cannot be performed in these areas. For reserved bits, a write value of 0 should be specified; their read value is undefined.

4.5.1 IC Address Array

The IC address array is allocated to addresses H’F000 0000 to H’F0FF FFFF in the P4 area. An address array access requires a 32-bit address field specification (when reading or writing) and a 32-bit data field specification. The entry to be accessed is specified in the address field, and the write tag and V bit are specified in the data field.

In the address field, bits [31:24] have the value H’F0 indicating the IC address array, and the entry is specified by bits [12:5]. CCR.IIX has no effect on this entry specification. Address field bit [3], the association bit (A bit), specifies whether or not association is performed when writing to the IC address array. As only longword access is used, 0 should be specified for address field bits [1:0].

In the data field, the tag is indicated by bits [31:10], and the V bit by bit [0]. As the IC address array tag is 19 bits in length, data field bits [31:29] are not used in the case of a write in which association is not performed. Data field bits [31:29] are used for the virtual address specification only in the case of a write in which association is performed.

The following three kinds of operation can be used on the IC address array:

1. IC address array read

The tag and V bit are read into the data field from the IC entry corresponding to the entry set in the address field. In a read, associative operation is not performed regardless of whether the association bit specified in the address field is 1 or 0.


2. IC address array write (non-associative)

The tag and V bit specified in the data field are written to the IC entry corresponding to the entry set in the address field. The A bit in the address field should be cleared to 0.

3. IC address array write (associative)

When a write is performed with the A bit in the address field set to 1, the tag stored in the entry specified in the address field is compared with the tag specified in the data field. If the MMU is enabled at this time, comparison is performed after the virtual address specified by data field bits [31:10] has been translated to a physical address using the ITLB. If the addresses match and the V bit is 1, the V bit specified in the data field is written into the IC entry. This operation is used to invalidate a specific IC entry. If an ITLB miss occurs during address translation, or the comparison shows a mismatch, no operation results and the write is not performed. If an instruction TLB multiple hit exception occurs during address translation, processing switches to the instruction TLB multiple hit exception handling routine.

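Address field: bits [31:24] = 1111 0000 | bits [23:13] = reserved | bits [12:5] = Entry | bit [4] = reserved | bit [3] = A | bits [2:0] = reserved

Data field: bits [31:10] = Tag address | bits [9:1] = reserved | bit [0] = V

V: Validity bit

A: Association bit

Reserved bits: 0 write value, undefined read value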

Figure 4.6 Memory-Mapped IC Address Array
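As an illustration of the field layout described in this section, the following C sketch composes the address field and data field values for an IC address array access. The macro and function names are hypothetical, and the code only computes the two 32-bit values; it does not perform the access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define IC_ADDR_ARRAY_BASE 0xF0000000u  /* address field bits [31:24] = H'F0 */

/* Address field: entry in bits [12:5], A bit in bit [3], other bits 0. */
static uint32_t ic_addr_array_address(uint32_t entry, bool assoc)
{
    assert(entry < 256u);
    return IC_ADDR_ARRAY_BASE | (entry << 5) | (assoc ? (1u << 3) : 0u);
}

/* Data field: tag in bits [31:10], V bit in bit [0]; reserved bits [9:1]
 * are written as 0. For an associative write, addr is the address whose
 * line is to be matched. */
static uint32_t ic_addr_array_data(uint32_t addr, bool v)
{
    return (addr & 0xFFFFFC00u) | (v ? 1u : 0u);
}
```

To invalidate the line holding a given address, a P2 area program in privileged mode would store ic_addr_array_data(addr, false) as a longword to ic_addr_array_address(entry, true), where entry is the IC entry that the address indexes.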

4.5.2 IC Data Array

The IC data array is allocated to addresses H’F100 0000 to H’F1FF FFFF in the P4 area. A data array access requires a 32-bit address field specification (when reading or writing) and a 32-bit data field specification. The entry to be accessed is specified in the address field, and the longword data to be written is specified in the data field.

In the address field, bits [31:24] have the value H’F1 indicating the IC data array, and the entry is specified by bits [12:5]. CCR.IIX has no effect on this entry specification. Address field bits [4:2] are used for the longword data specification in the entry. As only longword access is used, 0 should be specified for address field bits [1:0].

The data field is used for the longword data specification.

The following two kinds of operation can be used on the IC data array:


1. IC data array read

Longword data is read into the data field from the data specified by the longword specification bits in the address field in the IC entry corresponding to the entry set in the address field.

2. IC data array write

The longword data specified in the data field is written for the data specified by the longword specification bits in the address field in the IC entry corresponding to the entry set in the address field.

Address field: bits [31:24] = 1111 0001 | bits [23:13] = reserved | bits [12:5] = Entry | bits [4:2] = L | bits [1:0] = reserved

Data field: bits [31:0] = Longword data

L: Longword specification bits

Reserved bits: 0 write value, undefined read value

Figure 4.7 Memory-Mapped IC Data Array

4.5.3 OC Address Array

The OC address array is allocated to addresses H’F400 0000 to H’F4FF FFFF in the P4 area. An address array access requires a 32-bit address field specification (when reading or writing) and a 32-bit data field specification. The entry to be accessed is specified in the address field, and the write tag, U bit, and V bit are specified in the data field.

In the address field, bits [31:24] have the value H’F4 indicating the OC address array, and the entry is specified by bits [13:5]. CCR.OIX and CCR.ORA have no effect on this entry specification. Address field bit [3], the association bit (A bit), specifies whether or not association is performed when writing to the OC address array. As only longword access is used, 0 should be specified for address field bits [1:0].

In the data field, the tag is indicated by bits [31:10], the U bit by bit [1], and the V bit by bit [0]. As the OC address array tag is 19 bits in length, data field bits [31:29] are not used in the case of a write in which association is not performed. Data field bits [31:29] are used for the virtual address specification only in the case of a write in which association is performed.


The following three kinds of operation can be used on the OC address array:

1. OC address array read

The tag, U bit, and V bit are read into the data field from the OC entry corresponding to the entry set in the address field. In a read, associative operation is not performed regardless of whether the association bit specified in the address field is 1 or 0.

2. OC address array write (non-associative)

The tag, U bit, and V bit specified in the data field are written to the OC entry corresponding to the entry set in the address field. The A bit in the address field should be cleared to 0.

When a write is performed to a cache line for which the U bit and V bit are both 1, after write-back of that cache line, the tag, U bit, and V bit specified in the data field are written.

3. OC address array write (associative)

When a write is performed with the A bit in the address field set to 1, the tag stored in the entry specified in the address field is compared with the tag specified in the data field. If the MMU is enabled at this time, comparison is performed after the virtual address specified by data field bits [31:10] has been translated to a physical address using the UTLB. If the addresses match and the V bit is 1, the U bit and V bit specified in the data field are written into the OC entry. This operation is used to invalidate a specific OC entry. If the OC entry U bit is 1, and 0 is written to the V bit or to the U bit, write-back is performed. If a UTLB miss occurs during address translation, or the comparison shows a mismatch, no operation results and the write is not performed. If a data TLB multiple hit exception occurs during address translation, processing switches to the data TLB multiple hit exception handling routine.

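Address field: bits [31:24] = 1111 0100 | bits [23:14] = reserved | bits [13:5] = Entry | bit [4] = reserved | bit [3] = A | bits [2:0] = reserved

Data field: bits [31:10] = Tag address | bits [9:2] = reserved | bit [1] = U | bit [0] = V

V: Validity bit

U: Dirty bit

A: Association bit

Reserved bits: 0 write value, undefined read value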

Figure 4.8 Memory-Mapped OC Address Array
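The OC address array fields and the write-back condition for an associative write can be expressed in C as in the sketch below. It is illustrative only: the type and function names are hypothetical, the code computes and decodes values without accessing the array, and the decode keeps only the 19 tag bits [28:10] of the data field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define OC_ADDR_ARRAY_BASE 0xF4000000u  /* address field bits [31:24] = H'F4 */

typedef struct {
    uint32_t tag;  /* 19-bit tag, physical address bits [28:10] */
    bool     u;    /* dirty bit, data field bit [1] */
    bool     v;    /* validity bit, data field bit [0] */
} oc_tag_entry_t;

/* Address field: entry in bits [13:5], A bit in bit [3], other bits 0. */
static uint32_t oc_addr_array_address(uint32_t entry, bool assoc)
{
    assert(entry < 512u);
    return OC_ADDR_ARRAY_BASE | (entry << 5) | (assoc ? (1u << 3) : 0u);
}

/* Decodes a data field value read from the OC address array. */
static oc_tag_entry_t oc_addr_array_decode(uint32_t data)
{
    oc_tag_entry_t e;
    e.tag = (data >> 10) & 0x7FFFFu;
    e.u   = ((data >> 1) & 1u) != 0u;
    e.v   = (data & 1u) != 0u;
    return e;
}

/* For an associative write that matches a valid entry: write-back is
 * performed when the entry is dirty and 0 is written to V or to U. */
static bool oc_assoc_write_causes_writeback(bool entry_u, bool new_u, bool new_v)
{
    return entry_u && (!new_u || !new_v);
}
```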


4.5.4 OC Data Array

The OC data array is allocated to addresses H’F500 0000 to H’F5FF FFFF in the P4 area. A data array access requires a 32-bit address field specification (when reading or writing) and a 32-bit data field specification. The entry to be accessed is specified in the address field, and the longword data to be written is specified in the data field.

In the address field, bits [31:24] have the value H’F5 indicating the OC data array, and the entry is specified by bits [13:5]. CCR.OIX and CCR.ORA have no effect on this entry specification. Address field bits [4:2] are used for the longword data specification in the entry. As only longword access is used, 0 should be specified for address field bits [1:0].

The data field is used for the longword data specification.

The following two kinds of operation can be used on the OC data array:

1. OC data array read

Longword data is read into the data field from the data specified by the longword specification bits in the address field in the OC entry corresponding to the entry set in the address field.

2. OC data array write

The longword data specified in the data field is written for the data specified by the longword specification bits in the address field in the OC entry corresponding to the entry set in the address field. This write does not set the U bit to 1 on the address array side.

Address field: bits [31:24] = 1111 0101 | bits [23:14] = reserved | bits [13:5] = Entry | bits [4:2] = L | bits [1:0] = reserved

Data field: bits [31:0] = Longword data

L: Longword specification bits

Reserved bits: 0 write value, undefined read value

Figure 4.9 Memory-Mapped OC Data Array
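The OC data array address field can be composed as in the following C sketch. The function names are hypothetical. The second function assumes normal indexing (CCR.OIX = 0) with RAM mode off, in which case the entry and longword bits of the data array address coincide with effective address bits [13:2].

```c
#include <assert.h>
#include <stdint.h>

#define OC_DATA_ARRAY_BASE 0xF5000000u  /* address field bits [31:24] = H'F5 */

/* Address field: entry in bits [13:5], longword number in bits [4:2]. */
static uint32_t oc_data_array_address(uint32_t entry, uint32_t lw)
{
    assert(entry < 512u);
    assert(lw < 8u);
    return OC_DATA_ARRAY_BASE | (entry << 5) | (lw << 2);
}

/* Data array address of the longword that an effective address maps to,
 * assuming normal indexing by effective address bits [13:5]. */
static uint32_t oc_data_array_address_for(uint32_t ea)
{
    return OC_DATA_ARRAY_BASE | (ea & 0x3FFCu);
}
```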


4.6 Store Queues

Two 32-byte store queues (SQs) are supported to perform high-speed writes to external memory.

4.6.1 SQ Configuration

There are two 32-byte store queues, SQ0 and SQ1, as shown in figure 4.10. These two store queues can be set independently.

SQ0 SQ0[0] SQ0[1] SQ0[2] SQ0[3] SQ0[4] SQ0[5] SQ0[6] SQ0[7]

SQ1 SQ1[0] SQ1[1] SQ1[2] SQ1[3] SQ1[4] SQ1[5] SQ1[6] SQ1[7]

4B 4B 4B 4B 4B 4B 4B 4B

Figure 4.10 Store Queue Configuration

4.6.2 SQ Writes

A write to the SQs can be performed using a store instruction (MOV) on P4 area H’E000 0000 to H’E3FF FFFC. A longword or quadword access size can be used. The meaning of the address bits is as follows:

[31:26]: 111000 Store queue specification

[25:6]: Don’t care Used for external memory transfer/access right

[5]: 0/1 0: SQ0 specification 1: SQ1 specification

[4:2]: LW specification Specifies longword position in SQ0/SQ1

[1:0] 00 Fixed at 0

4.6.3 Transfer to External Memory

Transfer from the SQs to external memory can be performed with a prefetch instruction (PREF). Issuing a PREF instruction for P4 area H'E000 0000 to H'E3FF FFFC starts a burst transfer from the SQs to external memory. The burst transfer length is fixed at 32 bytes, and the start address is always at a 32-byte boundary. While the contents of one SQ are being transferred to external memory, the other SQ can be written to without a penalty cycle, but writing to the SQ involved in the transfer to external memory is deferred until the transfer is completed.

The SQ transfer destination external memory address bit [28:0] specification is as shown below, according to whether the MMU is on or off.


• When MMU is on

The SQ area (H’E000 0000 to H’E3FF FFFF) is set in VPN of the UTLB, and the transfer destination external memory address in PPN. The ASID, V, SZ, SH, PR, and D bits have the same meaning as for normal address translation, but the C and WT bits have no meaning with regard to this page. Since burst transfer is prohibited for PCMCIA areas, the SA and TC bits also have no meaning.

When a prefetch instruction is issued for the SQ area, address translation is performed and external memory address bits [28:10] are generated in accordance with the SZ bit specification. For external memory address bits [9:5], the address prior to address translation is generated in the same way as when the MMU is off. External memory address bits [4:0] are fixed at 0. Transfer from the SQs to external memory is performed to this address.

• When MMU is off

The SQ area (H’E000 0000 to H’E3FF FFFF) is specified as the address at which a prefetch is performed. The meaning of address bits [31:0] is as follows:

[31:26]: 111000 Store queue specification

[25:6]: Address External memory address bits [25:6]

[5]: 0/1 0: SQ0 specification 1: SQ1 specification and external memory address bit [5]

[4:2]: Don’t care No meaning in a prefetch

[1:0] 00 Fixed at 0

External memory address bits [28:26], which cannot be generated from the above address, are generated from the QACR0/1 registers.

QACR0 [4:2]: External memory address bits [28:26] corresponding to SQ0

QACR1 [4:2]: External memory address bits [28:26] corresponding to SQ1

External memory address bits [4:0] are always fixed at 0 since burst transfer starts at a 32-byte boundary.
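For the MMU-off case, the relation between a destination external memory address, the QACR setting, and the SQ area address used for the prefetch can be sketched in C as follows. The function names are hypothetical and the code performs address arithmetic only; it does not write to the SQs or issue PREF.

```c
#include <assert.h>
#include <stdint.h>

#define SQ_AREA_BASE 0xE0000000u  /* address bits [31:26] = 111000 */

/* QACR0/QACR1 value: external memory address bits [28:26] in bits [4:2]. */
static uint32_t sq_qacr_value(uint32_t ext_addr)
{
    return ((ext_addr >> 26) & 0x7u) << 2;
}

/* SQ area address to prefetch for a 32-byte-aligned destination.
 * Bits [25:5] come from the destination; bit [5] also selects SQ0 or SQ1. */
static uint32_t sq_prefetch_address(uint32_t ext_addr)
{
    assert((ext_addr & 0x1Fu) == 0u);  /* burst starts at a 32-byte boundary */
    assert(ext_addr <= 0x1FFFFFFFu);   /* 29-bit external memory address */
    return SQ_AREA_BASE | (ext_addr & 0x03FFFFE0u);
}

/* External memory address generated for a prefetch when the MMU is off. */
static uint32_t sq_external_address(uint32_t pref_addr, uint32_t qacr)
{
    return (((qacr >> 2) & 0x7u) << 26) | (pref_addr & 0x03FFFFE0u);
}
```

Because bit [5] of the destination selects the queue, consecutive 32-byte blocks alternate between SQ0 and SQ1, and the QACR register matching the selected queue supplies bits [28:26].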


4.6.4 SQ Protection

It is possible to set protection against SQ writes and transfers to external memory. If an SQ write violates the protection setting, an exception will be generated but the SQ contents will be corrupted. If a transfer from the SQs to external memory (prefetch instruction) violates the protection setting, the transfer to external memory will be inhibited and an exception will be generated.

• When MMU is on

Operation is in accordance with the address translation information recorded in the UTLB, and MMUCR.SQMD. Write type exception judgment is performed for writes to the SQs, and read type for transfer from the SQs to external memory (PREF instruction), and a TLB miss exception, protection violation exception, or initial page write exception is generated. However, if SQ access is enabled, in privileged mode only, by MMUCR.SQMD, an address error will be flagged in user mode even if address translation is successful.

• When MMU is off

Operation is in accordance with MMUCR.SQMD.

0: Privileged/user access possible

1: Privileged access possible

If the SQ area is accessed in user mode when MMUCR.SQMD is set to 1, an address error will be flagged.


Section 5 Exceptions

5.1 Overview

5.1.1 Features

Exception handling is processing handled by a special routine, separate from normal program processing, that is executed by the CPU in case of abnormal events. For example, if the executing instruction ends abnormally, appropriate action must be taken in order to return to the original program sequence, or report the abnormality before terminating the processing. The process of generating an exception handling request in response to abnormal termination, and passing control to a user-written exception handling routine, in order to support such functions, is given the generic name of exception handling.

SH7750 exception handling is of three kinds: for resets, general exceptions, and interrupts.


The registers used in exception handling are shown in table 5.1.

Table 5.1 Exception-Related Registers

Name | Abbreviation | R/W | Initial Value*1 | P4 Address*2 | Area 7 Address*2 | Access Size

TRAPA exception register | TRA | R/W | Undefined | H’FF00 0020 | H’1F00 0020 | 32

Exception event register | EXPEVT | R/W | H’0000 0000/H’0000 0020*1 | H’FF00 0024 | H’1F00 0024 | 32

Interrupt event register | INTEVT | R/W | Undefined | H’FF00 0028 | H’1F00 0028 | 32

Notes: 1. H’0000 0000 is set in a power-on reset, and H’0000 0020 in a manual reset.

2. This is the address when using the virtual/physical address space P4 area. When




There are three registers related to exception handling. These are allocated to memory, and can be accessed by specifying the P4 address or area 7 address.

1. The exception event register (EXPEVT) resides at P4 address H’FF00 0024, and contains a 12-bit exception code. The exception code set in EXPEVT is that for a reset or general exception event. The exception code is set automatically by hardware when an exception occurs. EXPEVT can also be modified by software.

2. The interrupt event register (INTEVT) resides at P4 address H’FF00 0028, and contains a 12-bit exception code. The exception code set in INTEVT is that for an interrupt request. The exception code is set automatically by hardware when an exception occurs. INTEVT can also be modified by software.

3. The TRAPA exception register (TRA) resides at P4 address H’FF00 0020, and contains 8-bit immediate data (imm) for the TRAPA instruction. TRA is set automatically by hardware when a TRAPA instruction is executed. TRA can also be modified by software.

The bit configurations of EXPEVT, INTEVT, and TRA are shown in figure 5.1.

EXPEVT and INTEVT: bits [31:12] = 0 | bits [11:0] = Exception code

TRA: bits [31:10] = 0 | bits [9:2] = imm | bits [1:0] = 0

0: Reserved bits. These bits are always read as 0, and should only be written with 0.

imm: 8-bit immediate data of the TRAPA instruction

Figure 5.1 Register Bit Configurations
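Extracting the fields of these registers in an exception handling routine can be sketched in C as below. The function names are hypothetical. The sketch takes the register value as an argument rather than reading the P4 address, and it assumes the layout of figure 5.1, with the exception code in bits [11:0] of EXPEVT and INTEVT and the TRAPA immediate in bits [9:2] of TRA.

```c
#include <assert.h>
#include <stdint.h>

/* EXPEVT and INTEVT: 12-bit exception code in bits [11:0]. */
static uint32_t exception_code(uint32_t evt)
{
    return evt & 0xFFFu;
}

/* TRA: 8-bit TRAPA immediate in bits [9:2] (assumed from figure 5.1). */
static uint32_t tra_imm(uint32_t tra)
{
    return (tra >> 2) & 0xFFu;
}

/* Value that TRA would hold for a given TRAPA immediate. */
static uint32_t tra_value(uint32_t imm)
{
    assert(imm <= 0xFFu);
    return imm << 2;
}
```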


5.3 Exception Handling Functions

5.3.1 Exception Handling Flow

In exception handling, the contents of the program counter (PC) and status register (SR) are saved in the saved program counter (SPC) and saved status register (SSR), and the CPU starts execution of the appropriate exception handling routine according to the vector address. An exception handling routine is a program written by the user to handle a specific exception. The exception handling routine is terminated and control returned to the original program by executing a return-from-exception instruction (RTE). This instruction restores the PC and SR contents and returns control to the normal processing routine at the point at which the exception occurred.

The basic processing flow is as follows. See section 2, Data Formats and Registers, for the meaning of the individual SR bits.

1. The PC and SR contents are saved in SPC and SSR.

2. The block bit (BL) in SR is set to 1.

3. The mode bit (MD) in SR is set to 1.

4. The register bank bit (RB) in SR is set to 1.

5. In a reset, the FPU disable bit (FD) in SR is cleared to 0.

6. The exception code is written to bits 11–0 of the exception event register (EXPEVT) or interrupt event register (INTEVT).

7. The CPU branches to the determined exception handling vector address, and the exception handling routine begins.

5.3.2 Exception Handling Vector Addresses

The reset vector address is fixed at H'A000 0000. Exception and interrupt vector addresses are determined by adding the offset for the specific event to the vector base address, which is set by software in the vector base register (VBR). In the case of the TLB miss exception, for example, the offset is H'0000 0400, so if H'9C08 0000 is set in VBR, the exception handling vector address will be H'9C08 0400. If a further exception occurs at the exception handling vector address, a duplicate exception will result, and recovery will be difficult; therefore, fixed physical addresses (P1, P2) should be specified for vector addresses.
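The vector address calculation can be written as a one-line C function, shown here as a small sketch. The macro and function names are hypothetical; the offsets H'100, H'400, and H'600 are those listed in table 5.2.

```c
#include <assert.h>
#include <stdint.h>

#define RESET_VECTOR        0xA0000000u  /* fixed reset vector address */
#define VEC_OFFSET_GENERAL  0x100u       /* general exceptions */
#define VEC_OFFSET_TLB_MISS 0x400u       /* TLB miss exceptions */
#define VEC_OFFSET_INT      0x600u       /* interrupts */

/* Vector address = vector base address (VBR) + offset for the event. */
static uint32_t exception_vector(uint32_t vbr, uint32_t offset)
{
    assert(offset == VEC_OFFSET_GENERAL ||
           offset == VEC_OFFSET_TLB_MISS ||
           offset == VEC_OFFSET_INT);
    return vbr + offset;
}
```

With VBR set to H'9C08 0000, exception_vector(0x9C080000u, VEC_OFFSET_TLB_MISS) gives H'9C08 0400, matching the example in the text.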

Rev. 2.0, 03/99, page 86 of 396

5.4 Exception Types and Priorities

Table 5.2 shows the types of exceptions, with their relative priorities, vector addresses, and exception/interrupt codes.

Table 5.2 Exceptions

Exception                                         Priority  Priority  Vector       Offset   Exception
                                                  Level     Order     Address               Code

Exception category: Reset. Execution mode: Abort type.

Power-on reset                                    1         1         H'A000 0000  -        H'000
Manual reset                                      1         2         H'A000 0000  -        H'020
Hitachi-UDI reset                                 1         1         H'A000 0000  -        H'000
Instruction TLB multiple-hit exception            1         3         H'A000 0000  -        H'140
Data TLB multiple-hit exception                   1         4         H'A000 0000  -        H'140

Exception category: General exception. Execution mode: Re-execution type.

User break before instruction execution*1         2         0         (VBR/DBR)    H'100/-  H'1E0
Instruction address error                         2         1         (VBR)        H'100    H'0E0
Instruction TLB miss exception                    2         2         (VBR)        H'400    H'040
Instruction TLB protection violation exception    2         3         (VBR)        H'100    H'0A0
General illegal instruction exception             2         4         (VBR)        H'100    H'180
Slot illegal instruction exception                2         4         (VBR)        H'100    H'1A0
General FPU disable exception                     2         4         (VBR)        H'100    H'800
Slot FPU disable exception                        2         4         (VBR)        H'100    H'820
Data address error (read)                         2         5         (VBR)        H'100    H'0E0
Data address error (write)                        2         5         (VBR)        H'100    H'100
Data TLB miss exception (read)                    2         6         (VBR)        H'400    H'040
Data TLB miss exception (write)                   2         6         (VBR)        H'400    H'060
Data TLB protection violation exception (read)    2         7         (VBR)        H'100    H'0A0
Data TLB protection violation exception (write)   2         7         (VBR)        H'100    H'0C0
FPU exception                                     2         8         (VBR)        H'100    H'120
Initial page write exception                      2         9         (VBR)        H'100    H'080

Exception category: General exception. Execution mode: Completion type.

Unconditional trap (TRAPA)                        2         4         (VBR)        H'100    H'160
User break after instruction execution*1          2         10        (VBR/DBR)    H'100/-  H'1E0


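Table 5.2 Exceptions (cont)

Exception category: Interrupt. Execution mode: Completion type.

Exception                    Module / Source           Priority  Priority  Vector   Offset  Exception
                                                       Level     Order     Address          Code

Nonmaskable interrupt        -                         3         -         (VBR)    H'600   H'1C0
External interrupts          IRL3–IRL0 = 0             4         *2        (VBR)    H'600   H'200
                             IRL3–IRL0 = 1             4         *2        (VBR)    H'600   H'220
                             IRL3–IRL0 = 2             4         *2        (VBR)    H'600   H'240
                             IRL3–IRL0 = 3             4         *2        (VBR)    H'600   H'260
                             IRL3–IRL0 = 4             4         *2        (VBR)    H'600   H'280
                             IRL3–IRL0 = 5             4         *2        (VBR)    H'600   H'2A0
                             IRL3–IRL0 = 6             4         *2        (VBR)    H'600   H'2C0
                             IRL3–IRL0 = 7             4         *2        (VBR)    H'600   H'2E0
                             IRL3–IRL0 = 8             4         *2        (VBR)    H'600   H'300
                             IRL3–IRL0 = 9             4         *2        (VBR)    H'600   H'320
                             IRL3–IRL0 = A             4         *2        (VBR)    H'600   H'340
                             IRL3–IRL0 = B             4         *2        (VBR)    H'600   H'360
                             IRL3–IRL0 = C             4         *2        (VBR)    H'600   H'380
                             IRL3–IRL0 = D             4         *2        (VBR)    H'600   H'3A0
                             IRL3–IRL0 = E             4         *2        (VBR)    H'600   H'3C0
Peripheral module interrupt  TMU0 / TUNI0              4         *2        (VBR)    H'600   H'400
(module/source)              TMU1 / TUNI1              4         *2        (VBR)    H'600   H'420
                             TMU2 / TUNI2              4         *2        (VBR)    H'600   H'440
                             TMU2 / TICPI2             4         *2        (VBR)    H'600   H'460
                             RTC / ATI                 4         *2        (VBR)    H'600   H'480
                             RTC / PRI                 4         *2        (VBR)    H'600   H'4A0
                             RTC / CUI                 4         *2        (VBR)    H'600   H'4C0
                             SCI / ERI                 4         *2        (VBR)    H'600   H'4E0
                             SCI / RXI                 4         *2        (VBR)    H'600   H'500
                             SCI / TXI                 4         *2        (VBR)    H'600   H'520
                             SCI / TEI                 4         *2        (VBR)    H'600   H'540
                             WDT / ITI                 4         *2        (VBR)    H'600   H'560
                             REF / RCMI                4         *2        (VBR)    H'600   H'580
                             REF / ROVI                4         *2        (VBR)    H'600   H'5A0
                             Hitachi-UDI / Hitachi-UDI 4         *2        (VBR)    H'600   H'600
                             GPIO / GPIOI              4         *2        (VBR)    H'600   H'620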


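Table 5.2 Exceptions (cont)

Exception category: Interrupt. Execution mode: Completion type.

Exception                    Module / Source           Priority  Priority  Vector   Offset  Exception
                                                       Level     Order     Address          Code

Peripheral module interrupt  DMAC / DMTE0              4         *2        (VBR)    H'600   H'640
(module/source)              DMAC / DMTE1              4         *2        (VBR)    H'600   H'660
                             DMAC / DMTE2              4         *2        (VBR)    H'600   H'680
                             DMAC / DMTE3              4         *2        (VBR)    H'600   H'6A0
                             DMAC / DMAE               4         *2        (VBR)    H'600   H'6C0
                             SCIF / ERI                4         *2        (VBR)    H'600   H'700
                             SCIF / RXI                4         *2        (VBR)    H'600   H'720
                             SCIF / BRI                4         *2        (VBR)    H'600   H'740
                             SCIF / TXI                4         *2        (VBR)    H'600   H'760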

Priority: Priority is first assigned by priority level, then by priority order within each level (the lowest number represents the highest priority).

Exception transition destination: Control passes to H'A000 0000 in a reset, and to [VBR + offset] in other cases.

Exception code: Stored in EXPEVT for a reset or general exception, and in INTEVT for an interrupt.

IRL: Interrupt request level (pins IRL3–IRL0).

Module/source: See the sections on the relevant peripheral modules.

Notes: 1. When BRCR.UBDE = 1, PC = DBR. In other cases, PC = VBR + H'100.

2. The priority order of external interrupts and peripheral module interrupts can be set by software.
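The IRL interrupt codes in Table 5.2 are evenly spaced: they start at H'200 for an IRL3–IRL0 value of 0 and increase by H'20 per step up to H'3C0 for a value of E. A minimal sketch of that mapping follows. The function name is hypothetical, and returning 0 for an out-of-range value is an assumption made for this example only.

```c
#include <stdint.h>

/*
 * INTEVT code for an IRL3-IRL0 value of 0x0..0xE, following the
 * H'200 + H'20 * value pattern listed in Table 5.2.
 * Returns 0 for values outside that range (illustrative convention).
 */
uint32_t irl_intevt_code(unsigned irl)
{
    if (irl > 0xEu)
        return 0;
    return 0x200u + 0x20u * irl;
}
```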

5.5 Exception Flow

5.5.1 Exception Flow

Figure 5.2 shows an outline flowchart of the basic operations in instruction execution and exception handling. For the sake of clarity, the following description assumes that instructions are executed sequentially, one by one. Figure 5.2 shows the relative priority order of the different kinds of exceptions (reset/general exception/interrupt). Register settings in the event of an exception are shown only for SSR, SPC, EXPEVT/INTEVT, SR, and PC, but other registers may be set automatically by hardware, depending on the exception. For details, see section 5.6, Description of Exceptions. Also, see section 5.6.4, Priority Order with Multiple Exceptions, for exception handling during execution of a delayed branch instruction and a delay slot instruction, and in the case of instructions in which two data accesses are performed.


[Flowchart. Decision boxes: "Reset requested?", "General exception requested?", "Is highest-priority exception re-execution type?", "Interrupt requested?". Process boxes: "Execute next instruction", "Cancel instruction execution result".

Register settings when a general exception or interrupt is accepted:

SSR ← SR
SPC ← PC
SGR ← R15
EXPEVT/INTEVT ← exception code
SR.{MD, RB, BL} ← 111
PC ← (BRCR.UBDE = 1 && User_Break ? DBR : (VBR + Offset))

Register settings when a reset is accepted:

EXPEVT ← exception code
SR.{MD, RB, BL, FD, IMASK} ← 11101111
PC ← H'A000 0000]

Figure 5.2 Instruction Execution and Exception Handling

5.5.2 Exception Source Acceptance

A priority ranking is provided for all exceptions for use in determining which of two or more simultaneously generated exceptions should be accepted. Five of the general exceptions (the general illegal instruction exception, slot illegal instruction exception, general FPU disable exception, slot FPU disable exception, and unconditional trap exception) are detected in the process of instruction decoding, and do not occur simultaneously in the instruction pipeline. These exceptions therefore all have the same priority. General exceptions are detected in the order of instruction execution. However, exception handling is performed in the order of instruction flow (program order). Thus, an exception for an earlier instruction is accepted before that for a later instruction. An example of the order of acceptance for general exceptions is shown in figure 5.3.


[Pipeline diagram. Stages: IF: Instruction fetch; ID: Instruction decode; EX: Instruction execution; MA: Memory access; WB: Write-back.

Pipeline flow (program order):

Instruction n: TLB miss (data access)
Instruction n+1: General illegal instruction exception
Instruction n+2: TLB miss (instruction access)
Instruction n+3

Order of detection:

General illegal instruction exception (instruction n+1) and TLB miss (instruction n+2) are detected simultaneously
TLB miss (instruction n)

Order of exception handling (program order):

TLB miss (instruction n)
Re-execution of instruction n
General illegal instruction exception (instruction n+1)
Re-execution of instruction n+1
TLB miss (instruction n+2)
Re-execution of instruction n+2
Execution of instruction n+3]

Figure 5.3 Example of General Exception Acceptance Order
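The acceptance rule for general exceptions described in section 5.5.2 can be sketched as a selection function: the exception belonging to the earliest instruction in program order is accepted first, and priority level and priority order from Table 5.2 decide between exceptions raised by the same instruction. The pending_exc structure and select_exception function are hypothetical, and the sketch deliberately ignores resets and interrupts.

```c
#include <stddef.h>

/* Hypothetical record of one detected general exception. */
typedef struct {
    unsigned instr;  /* position of the instruction in program order */
    unsigned level;  /* priority level from Table 5.2 */
    unsigned order;  /* priority order within the level */
} pending_exc;

/* Returns the index of the exception to accept first, or -1 if n == 0. */
int select_exception(const pending_exc *p, size_t n)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (best < 0
            || p[i].instr < p[best].instr
            || (p[i].instr == p[best].instr
                && (p[i].level < p[best].level
                    || (p[i].level == p[best].level
                        && p[i].order < p[best].order))))
            best = (int)i;
    }
    return best;
}
```

Applied to the situation in figure 5.3, the data TLB miss of instruction n is selected even though the exceptions of instructions n+1 and n+2 are detected earlier.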


5.5.3 Exception Requests and BL Bit

When the BL bit in SR is 0, exceptions and interrupts are accepted.

When the BL bit in SR is 1 and an exception other than a user break is generated, the CPU’s internal registers are set to their post-reset state, the registers of the other modules retain their contents prior to the exception, and the CPU branches to the same address as in a reset (H'A000 0000). For the operation in the event of a user break, see section 20, User Break Controller. If an ordinary interrupt occurs, the interrupt request is held pending and is accepted after the BL bit has been cleared to 0 by software. If a nonmaskable interrupt (NMI) occurs, it can be held pending or accepted according to the setting made by software.

Thus, normally, SPC and SSR are saved and then the BL bit in SR is cleared to 0, to enable multiple exception state acceptance.
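The effect of the BL bit on each kind of event can be summarized as a small decision function. This is an illustrative sketch only: the enumeration names and the on_event function are hypothetical, the NMI behavior with BL = 1 is passed in as a flag standing for the software setting mentioned above, and interrupt masking by the SR mask bits is not modeled.

```c
typedef enum {
    EV_GENERAL_EXCEPTION,  /* any general exception except a user break */
    EV_USER_BREAK,
    EV_INTERRUPT,          /* ordinary (maskable) interrupt */
    EV_NMI
} event_kind;

typedef enum {
    ACT_ACCEPT,        /* normal exception handling */
    ACT_HOLD_PENDING,  /* accepted later, once BL is cleared to 0 */
    ACT_RESET_BRANCH,  /* branch to H'A000 0000 as in a reset */
    ACT_SEE_UBC        /* user break: see section 20 */
} action;

/* bl: current SR.BL value; nmi_accept: software setting for NMI with BL = 1. */
action on_event(int bl, event_kind ev, int nmi_accept)
{
    if (!bl)
        return ACT_ACCEPT;
    switch (ev) {
    case EV_GENERAL_EXCEPTION: return ACT_RESET_BRANCH;
    case EV_USER_BREAK:        return ACT_SEE_UBC;
    case EV_INTERRUPT:         return ACT_HOLD_PENDING;
    case EV_NMI:               return nmi_accept ? ACT_ACCEPT : ACT_HOLD_PENDING;
    }
    return ACT_HOLD_PENDING;
}
```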

5.5.4 Return from Exception Handling

The RTE instruction is used to return from exception handling. When the RTE instruction is executed, the SPC contents are restored to PC and the SSR contents to SR, and the CPU returns from the exception handling routine by branching to the SPC address. If SPC and SSR were saved to external memory, set the BL bit in SR to 1 before restoring the SPC and SSR contents and issuing the RTE instruction.


5.6 Description of Exceptions

The various exception handling operations are described here, covering exception sources, transition addresses, and processor operation when a transition is made.

5.6.1 Resets

(1) Power-On Reset

• Sources:

SCK2 pin high level and RESET pin low level

When the watchdog timer overflows while the WT/IT bit is set to 1 and the RSTS bit is cleared to 0 in WTCSR. For details, see section 10, Clock Oscillation Circuits.

• Transition address: H’A000 0000

• Transition operations:

Exception code H’000 is set in EXPEVT, initialization of VBR and SR is performed, and a branch is made to PC = H’A000 0000.

In the initialization processing, the VBR register is set to H’0000 0000, and in SR, the MD, RB, and BL bits are set to 1, the FD bit is cleared to 0, and the interrupt mask bits (I3–I0) are set to B’1111.

CPU and on-chip peripheral module initialization is performed. For details, see the register descriptions in the relevant sections. For some CPU functions, the TRST pin and RESET pin must be driven low. It is therefore essential to execute a power-on reset and drive the TRST pin low when powering on.

Power_on_reset()

{

EXPEVT = H’00000000;

VBR = H’00000000;

SR.MD = 1;

SR.RB = 1;

SR.BL = 1;

SR.(I0-I3) = B’1111;

SR.FD=0;

Initialize_CPU();

Initialize_Module(PowerOn);

PC = H’A0000000;

}


(2) Manual Reset

• Sources:

SCK2 pin low level and RESET pin low level

When a general exception other than a user break occurs while the BL bit is set to 1 in SR

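When the watchdog timer overflows while the RSTS bit is set to 1 in WTCSR. For details, see section 10, Clock Oscillation Circuits.

• Transition address: H’A000 0000

• Transition operations:

Exception code H’020 is set in EXPEVT, initialization of VBR and SR is performed, and a branch is made to PC = H’A000 0000.

In the initialization processing, the VBR register is set to H’0000 0000, and in SR, the MD, RB, and BL bits are set to 1, the FD bit is cleared to 0, and the interrupt mask bits (I3–I0) are set to B'1111.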

CPU and on-chip peripheral module initialization is performed. For details, see the register descriptions in the relevant sections.

Manual_reset()

{

EXPEVT = H’00000020;

VBR = H’00000000;

SR.MD = 1;

SR.RB = 1;

SR.BL = 1;

SR.(I0-I3) = B’1111;

SR.FD = 0;

Initialize_CPU();

Initialize_Module(Manual);

PC = H’A0000000;

}

Table 5.3 Types of Reset

                 Reset State Transition Conditions   Internal States
Type             SCK2        RESET                   CPU           On-Chip Peripheral Modules

Power-on reset   High        Low                     Initialized   See Register Configuration in each section
Manual reset     Low         Low                     Initialized   See Register Configuration in each section


(3) Hitachi-UDI Reset

• Source: SDIR.TI3–TI0 = B'0110 (negation) or B'0111 (assertion)

• Transition address: H'A000 0000

• Transition operations:

Exception code H'000 is set in EXPEVT, initialization of VBR and SR is performed, and a branch is made to PC = H'A000 0000.

In the initialization processing, the VBR register is set to H'0000 0000, and in SR, the MD, RB, and BL bits are set to 1, the FD bit is cleared to 0, and the interrupt mask bits (I3–I0) are set to B'1111.

CPU and on-chip peripheral module initialization is performed. For details, see the register descriptions in the relevant sections.

Hitachi-UDI_reset()

{

EXPEVT = H’00000000;

VBR = H’00000000;

SR.MD = 1;

SR.RB = 1;

SR.BL = 1;

SR.(I0-I3) = B’1111;

SR.FD = 0;

Initialize_CPU();

Initialize_Module(PowerOn);

PC = H’A0000000;

}


(4) Instruction TLB Multiple-Hit Exception

• Source: Multiple ITLB address matches



The virtual address (32 bits) at which this exception occurred is set in TEA, and the corresponding virtual page number (22 bits) is set in PTEH [31:10]. ASID in PTEH indicates the ASID when this exception occurred.



CPU and on-chip peripheral module initialization is performed in the same way as in a manual reset. For details, see the register descriptions in the relevant sections.

TLB_multi_hit()

{

TEA = EXCEPTION_ADDRESS;

PTEH.VPN = PAGE_NUMBER;

EXPEVT = H’00000140;

VBR = H’00000000;

SR.MD = 1;

SR.RB = 1;

SR.BL = 1;

SR.(I0-I3) = B’1111;

SR.FD = 0;

Initialize_CPU();


PC = H’A0000000;

}


(5) Operand TLB Multiple-Hit Exception

• Source: Multiple UTLB address matches






CPU and on-chip peripheral module initialization is performed in the same way as in a manual reset. For details, see the register descriptions in the relevant sections.

TLB_multi_hit()

{



EXPEVT = H’00000140;

VBR = H’00000000;

SR.MD = 1;

SR.RB = 1;

SR.BL = 1;

SR.(I0-I3) = B’1111;

SR.FD = 0;

Initialize_CPU();


PC = H’A0000000;

}


5.6.2 General Exceptions

(1) Data TLB Miss Exception

• Source: Address mismatch in UTLB address comparison

• Transition address: VBR + H’0000 0400



The PC and SR contents for the instruction at which this exception occurred are saved in SPC and SSR.

Exception code H’040 (for a read access) or H’060 (for a write access) is set in EXPEVT. The BL, MD, and RB bits are set to 1 in SR, and a branch is made to PC = VBR + H’0400.

To speed up TLB miss processing, the offset is separate from that of other exceptions.

Data_TLB_miss_exception()

{



SPC = PC;

SSR = SR;

EXPEVT = read_access ? H’00000040 : H’00000060;

SR.MD = 1;

SR.RB = 1;

SR.BL = 1;

PC = VBR + H’00000400;

}


(2) Instruction TLB Miss Exception

• Source: Address mismatch in ITLB address comparison





Exception code H’040 is set in EXPEVT. The BL, MD, and RB bits are set to 1 in SR, and a branch is made to PC = VBR + H’0400.

To speed up TLB miss processing, the offset is separate from that of other exceptions.

ITLB_miss_exception()

{



SPC = PC;

SSR = SR;

EXPEVT = H’00000040;

SR.MD = 1;

SR.RB = 1;

SR.BL = 1;

PC = VBR + H’00000400;

}


(3) Initial Page Write Exception

• Source: TLB is hit in a store access, but dirty bit D = 0






Initial_write_exception()

{



SPC = PC;

SSR = SR;

EXPEVT = H’00000080;

SR.MD = 1;

SR.RB = 1;

SR.BL = 1;

PC = VBR + H’00000100;

}


(4) Data TLB Protection Violation Exception

• Source: The access does not accord with the UTLB protection information (PR bits) shown below.

PR Privileged Mode User Mode

00 Only read access possible Access not possible

01 Read/write access possible Access not possible

10 Only read access possible Only read access possible

11 Read/write access possible Read/write access possible





Exception code H’0A0 (for a read access) or H’0C0 (for a write access) is set in EXPEVT. The BL, MD, and RB bits are set to 1 in SR, and a branch is made to PC = VBR + H’0100.

Data_TLB_protection_violation_exception()

{



SPC = PC;

SSR = SR;

EXPEVT = read_access ? H’000000A0 : H’000000C0;

SR.MD = 1;

SR.RB = 1;

SR.BL = 1;

PC = VBR + H’00000100;

}
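The UTLB protection table above can be expressed as a predicate. In this sketch, which uses a hypothetical function name, the upper PR bit is read as "user access permitted" and the lower PR bit as "write permitted"; that reading is derived directly from the four rows of the table.

```c
#include <stdbool.h>

/*
 * pr:         2-bit UTLB protection field (0..3)
 * privileged: true in privileged mode, false in user mode
 * write:      true for a write access, false for a read access
 * Returns true if the access is permitted by the PR table, false if a
 * data TLB protection violation would result.
 */
bool utlb_access_permitted(unsigned pr, bool privileged, bool write)
{
    bool user_ok  = (pr & 2u) != 0;  /* PR = 10 or 11 */
    bool writable = (pr & 1u) != 0;  /* PR = 01 or 11 */

    if (!privileged && !user_ok)
        return false;
    return !write || writable;
}
```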


(5) Instruction TLB Protection Violation Exception

• Source: The access does not accord with the ITLB protection information (PR bits) shown below.

PR Privileged Mode User Mode

0 Access possible Access not possible

1 Access possible Access possible





Exception code H’0A0 is set in EXPEVT. The BL, MD, and RB bits are set to 1 in SR, and a branch is made to PC = VBR + H’0100.

ITLB_protection_violation_exception()

{



SPC = PC;

SSR = SR;

EXPEVT = H’000000A0;

SR.MD = 1;

SR.RB = 1;

SR.BL = 1;

PC = VBR + H’00000100;

}


(6) Data Address Error

• Sources:

Word data access from other than a word boundary (2n + 1)

Longword data access from other than a longword data boundary (4n + 1, 4n + 2, or 4n + 3)

Quadword data access from other than a quadword data boundary (8n + 1, 8n + 2, 8n + 3, 8n + 4, 8n + 5, 8n + 6, or 8n + 7)

Access to area H'8000 0000–H'FFFF FFFF in user mode

• Transition address: VBR + H'0000 0100




Exception code H'0E0 (for a read access) or H'100 (for a write access) is set in EXPEVT. The BL, MD, and RB bits are set to 1 in SR, and a branch is made to PC = VBR + H'0100. For details, see section 3, Memory Management Unit (MMU).

Data_address_error()

{


PTEH.VPN = PAGE_NUMBER;

SPC = PC;

SSR = SR;

EXPEVT = read_access ? H’000000E0 : H’00000100;

SR.MD = 1;

SR.RB = 1;

SR.BL = 1;

PC = VBR + H’00000100;

}
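The data address error sources listed above reduce to an alignment test plus a user-mode range test. The following sketch uses a hypothetical function name and covers only the four sources given in this section; any special cases described in other sections are not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * addr:      address of the data access
 * size:      access size in bytes (1, 2, 4, or 8)
 * user_mode: true when SR.MD = 0
 * Returns true if one of the listed address error sources applies.
 */
bool data_address_error(uint32_t addr, unsigned size, bool user_mode)
{
    /* Access to H'8000 0000-H'FFFF FFFF in user mode */
    if (user_mode && addr >= 0x80000000u)
        return true;
    /* Word, longword, or quadword access off its natural boundary */
    return (addr & (size - 1u)) != 0;
}
```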


(7) Instruction Address Error

• Sources:

Instruction fetch from other than a word boundary (2n + 1)

Instruction fetch from area H'8000 0000–H'FFFF FFFF in user mode




The PC and SR contents for the instruction at which this exception occurred are saved in the SPC and SSR.

Exception code H'0E0 is set in EXPEVT. The BL, MD, and RB bits are set to 1 in SR, and a branch is made to PC = VBR + H'0100. For details, see section 3, Memory Management Unit (MMU).

Instruction_address_error()

{


PTEH.VPN = PAGE_NUMBER;

SPC = PC;

SSR = SR;

EXPEVT = H’000000E0;

SR.MD = 1;

SR.RB = 1;

SR.BL = 1;

PC = VBR + H’00000100;

}


(8) Unconditional Trap

• Source: Execution of TRAPA instruction



As this is a processing-completion-type exception, the PC contents for the instruction following the TRAPA instruction are saved in SPC. The value of SR when the TRAPA instruction is executed is saved in SSR. The 8-bit immediate value in the TRAPA instruction is multiplied by 4, and the result is set in TRA [9:0]. Exception code H’160 is set in EXPEVT. The BL, MD, and RB bits are set to 1 in SR, and a branch is made to PC = VBR + H’0100.

TRAPA_exception()

{

SPC = PC + 2;

SSR = SR;

TRA = imm << 2;

EXPEVT = H’00000160;

SR.MD = 1;

SR.RB = 1;

SR.BL = 1;

PC = VBR + H’00000100;

}
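Because TRA holds the 8-bit immediate multiplied by 4, a handler can recover the trap number by shifting TRA right by two bits. A minimal sketch with hypothetical helper names:

```c
#include <stdint.h>

/* Value placed in TRA [9:0] for a given TRAPA immediate. */
uint32_t tra_from_imm(uint8_t imm)
{
    return (uint32_t)imm << 2;
}

/* Recovers the TRAPA immediate from a TRA value. */
uint8_t imm_from_tra(uint32_t tra)
{
    return (uint8_t)((tra >> 2) & 0xFFu);
}
```

Since the stored value is already a multiple of 4, it can also serve directly as a byte offset into a table of 32-bit entries; that use is a design suggestion, not something this section specifies.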


(9) General Illegal Instruction Exception

• Sources:

Decoding of an undefined instruction not in a delay slot

Delayed branch instructions: JMP, JSR, BRA, BRAF, BSR, BSRF, RTS, RTE, BT/S, BF/S

Undefined instruction: H’FFFD

Decoding in user mode of a privileged instruction not in a delay slot

Privileged instructions: LDC, STC, RTE, LDTLB, SLEEP, but excluding LDC/STC instructions that access GBR




Exception code H’180 is set in EXPEVT. The BL, MD, and RB bits are set to 1 in SR, and a branch is made to PC = VBR + H’0100. Operation is not guaranteed if an undefined code other than H’FFFD is decoded.

General_illegal_instruction_exception()

{

SPC = PC;

SSR = SR;

EXPEVT = H’00000180;

SR.MD = 1;

SR.RB = 1;

SR.BL = 1;

PC = VBR + H’00000100;

}


(10) Slot Illegal Instruction Exception

• Sources:

Decoding of an undefined instruction in a delay slot

Delayed branch instructions: JMP, JSR, BRA, BRAF, BSR, BSRF, RTS, RTE, BT/S, BF/S

Undefined instruction: H’FFFD

Decoding of an instruction that modifies PC in a delay slot

Instructions that modify PC: JMP, JSR, BRA, BRAF, BSR, BSRF, RTS, RTE, BT, BF, BT/S, BF/S, TRAPA, LDC Rm, SR, LDC.L @Rm+, SR

Decoding in user mode of a privileged instruction in a delay slot

Privileged instructions: LDC, STC, RTE, LDTLB, SLEEP, but excluding LDC/STC instructions that access GBR

Decoding of a PC-relative MOV instruction or MOVA instruction in a delay slot



The PC contents for the preceding delayed branch instruction are saved in SPC. The SR contents when this exception occurred are saved in SSR.

Exception code H’1A0 is set in EXPEVT. The BL, MD, and RB bits are set to 1 in SR, and a branch is made to PC = VBR + H’0100. Operation is not guaranteed if an undefined code other than H’FFFD is decoded.

Slot_illegal_instruction_exception()

{

SPC = PC - 2;

SSR = SR;

EXPEVT = H’000001A0;

SR.MD = 1;

SR.RB = 1;

SR.BL = 1;

PC = VBR + H’00000100;

}


(11) General FPU Disable Exception

• Source: Decoding of an FPU instruction* not in a delay slot with SR.FD = 1





Note: * FPU instructions are instructions in which the first 4 bits of the instruction code are F (but excluding undefined instruction H’FFFD), and the LDS, STS, LDS.L, and STS.L instructions corresponding to FPUL and FPSCR.

General_fpu_disable_exception()

{

SPC = PC;

SSR = SR;

EXPEVT = H’00000800;

SR.MD = 1;

SR.RB = 1;

SR.BL = 1;

PC = VBR + H’00000100;

}


(12) Slot FPU Disable Exception

• Source: Decoding of an FPU instruction in a delay slot with SR.FD = 1



The PC contents for the preceding delayed branch instruction are saved in SPC. The SR contents when this exception occurred are saved in SSR.


Slot_fpu_disable_exception()

{

SPC = PC - 2;

SSR = SR;

EXPEVT = H’00000820;

SR.MD = 1;

SR.RB = 1;

SR.BL = 1;

PC = VBR + H’00000100;

}


(13) User Breakpoint Trap

• Source: Fulfilling of a break condition set in the user break controller

• Transition address: VBR + H’0000 0100, or DBR


In the case of a post-execution break, the PC contents for the instruction following the instruction at which the breakpoint is set are set in SPC. In the case of a pre-execution break, the PC contents for the instruction at which the breakpoint is set are set in SPC.

The SR contents when the break occurred are saved in SSR. Exception code H’1E0 is set in EXPEVT.

The BL, MD, and RB bits are set to 1 in SR, and a branch is made to PC = VBR + H’0100. It is also possible to branch to PC = DBR.

For details of PC, etc., when a data break is set, see section 20, User Break Controller.

User_break_exception()

{

SPC = (pre_execution_break ? PC : PC + 2);

SSR = SR;

EXPEVT = H’000001E0;

SR.MD = 1;

SR.RB = 1;

SR.BL = 1;

PC = (BRCR.UBDE==1 ? DBR : VBR + H’00000100);

}


(14) FPU Exception

• Source: Exception due to execution of a floating-point operation



The PC and SR contents for the instruction at which this exception occurred are saved in SPC and SSR. Exception code H’120 is set in EXPEVT. The BL, MD, and RB bits are set to 1 in SR, and a branch is made to PC = VBR + H’0100.

FPU_exception()

{

SPC = PC;

SSR = SR;

EXPEVT = H’00000120;

SR.MD = 1;

SR.RB = 1;

SR.BL = 1;

PC = VBR + H’00000100;

}


5.6.3 Interrupts

(1) NMI

• Source: NMI pin edge detection



The PC and SR contents for the instruction at which this exception is accepted are saved in SPC and SSR.

Exception code H’1C0 is set in INTEVT. The BL, MD, and RB bits are set to 1 in SR, and a branch is made to PC = VBR + H’0600. When the BL bit in SR is 0, this interrupt is not masked by the interrupt mask bits in SR, and is accepted at the highest priority level. When the BL bit in SR is 1, a software setting can specify whether this interrupt is to be masked or accepted. For details, see section 19, Interrupt Controller.

NMI()

{

SPC = PC;

SSR = SR;

INTEVT = H’000001C0;

SR.MD = 1;

SR.RB = 1;

SR.BL = 1;

PC = VBR + H’00000600;

}


(2) IRL Interrupts

• Source: The interrupt mask bit setting in SR is smaller than the IRL (3–0) level, and the BL bit in SR is 0 (accepted at instruction boundary).



The PC contents immediately after the instruction at which the interrupt is accepted are set in SPC. The SR contents at the time of acceptance are set in SSR.

The code corresponding to the IRL (3–0) level is set in INTEVT. See table 19.5, Interrupt Exception Handling Sources and Priority Order, for the corresponding codes. The BL, MD, and RB bits are set to 1 in SR, and a branch is made to VBR + H'0600. The acceptance level is not set in the interrupt mask bits in SR. When the BL bit in SR is 1, the interrupt is masked. For details, see section 19, Interrupt Controller.

IRL()

{

SPC = PC;

SSR = SR;

INTEVT = H’00000200 ~ H’000003C0;

SR.MD = 1;

SR.RB = 1;

SR.BL = 1;

PC = VBR + H’00000600;

}
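The acceptance condition stated under Source can be written as a single predicate. In this sketch the function name is hypothetical, and the positions of the BL bit (bit 28) and the interrupt mask bits I3–I0 (bits 7 to 4) are assumed from the SH-4 SR layout described in section 2.

```c
#include <stdbool.h>
#include <stdint.h>

#define SR_BL          (1u << 28)  /* assumed bit position */
#define SR_IMASK_SHIFT 4           /* assumed position of I3-I0 */

/*
 * sr:    current status register value
 * level: requested interrupt level (0..15)
 * Returns true when the mask setting is smaller than the level
 * and SR.BL is 0.
 */
bool interrupt_accepted(uint32_t sr, unsigned level)
{
    unsigned imask = (sr >> SR_IMASK_SHIFT) & 0xFu;
    return (sr & SR_BL) == 0 && imask < level;
}
```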


(3) Peripheral Module Interrupts

• Source: The interrupt mask bit setting in SR is smaller than the peripheral module (Hitachi-UDI, GPIO, DMAC, TMU, RTC, SCI, SCIF, WDT, or REF) interrupt level, and the BL bit in SR is 0 (accepted at instruction boundary).



The PC contents immediately after the instruction at which the interrupt is accepted are set in SPC. The SR contents at the time of acceptance are set in SSR.

The code corresponding to the interrupt source is set in INTEVT. The BL, MD, and RB bits are set to 1 in SR, and a branch is made to VBR + H’0600. The module interrupt levels should be set as values between B’0000 and B’1111 in the interrupt priority registers (IPRA–IPRC) in the interrupt controller. For details, see section 19, Interrupt Controller.

Module_interruption()

{

SPC = PC;

SSR = SR;

INTEVT = H’00000400 ~ H’00000760;

SR.MD = 1;

SR.RB = 1;

SR.BL = 1;

PC = VBR + H’00000600;

}


5.6.4 Priority Order with Multiple Exceptions

With some instructions, such as instructions that make two accesses to memory, and the indivisible pair comprising a delayed branch instruction and delay slot instruction, multiple exceptions occur. Care is required in these cases, as the exception priority order differs from the normal order.

1. Instructions that make two accesses to memory

With MAC instructions, memory-to-memory arithmetic/logic instructions, and TAS instructions, two data transfers are performed by a single instruction, and an exception will be detected for each of these data transfers. In these cases, therefore, the following order is used to determine priority.

a. Data address error in first data transfer

b. TLB miss in first data transfer

c. TLB protection violation in first data transfer

d. Initial page write exception in first data transfer

e. Data address error in second data transfer

f. TLB miss in second data transfer

g. TLB protection violation in second data transfer

h. Initial page write exception in second data transfer

2. Indivisible delayed branch instruction and delay slot instruction

As a delayed branch instruction and its associated delay slot instruction are indivisible, they are treated as a single instruction. Consequently, the priority order for exceptions that occur in these instructions differs from the usual priority order. The priority order shown below is for the case where the delay slot instruction has only one data transfer.

a. The delayed branch instruction is checked for priority levels 1 and 2.

b. The delay slot instruction is checked for priority levels 1 and 2.

c. A check is performed for priority level 3 in the delayed branch instruction and priority level 3 in the delay slot instruction. (There is no priority ranking between these two.)

d. A check is performed for priority level 4 in the delayed branch instruction and priority level 4 in the delay slot instruction. (There is no priority ranking between these two.)

If the delay slot instruction has a second data transfer, two checks are performed in step b, as in 1 above.

If the accepted exception (the highest-priority exception) is a delay slot instruction re-execution type exception, the branch instruction PR register write operation (PC → PR operation performed in BSR, BSRF, JSR) is inhibited.


5.7 Usage Notes

1. Return from exception handling

a. Check the BL bit in SR with software. If SPC and SSR have been saved to external memory, set the BL bit in SR to 1 before restoring them.

b. Issue an RTE instruction. When RTE is executed, the SPC contents are set in PC, the SSR contents are set in SR, and a branch is made to the SPC address to return from the exception handling routine.

2. If an exception or interrupt occurs when SR.BL = 1

a. Exception

When an exception other than a user break occurs, the CPU’s internal registers are set to their post-reset state, the registers of the other modules retain their contents prior to the exception, and the CPU branches to the same address as in a reset (H'A000 0000). The value in EXPEVT at this time is H'0000 0020; the value of the SPC and SSR registers is undefined.

b. Interrupt

If an ordinary interrupt occurs, the interrupt request is held pending and is accepted after the BL bit in SR has been cleared to 0 by software. If a nonmaskable interrupt (NMI) occurs, it can be held pending or accepted according to the setting made by software. In the sleep or standby state, however, an interrupt is accepted even if the BL bit in SR is set to 1.

3. SPC when an exception occurs

a. Re-execution type exception

The PC value for the instruction in which the exception occurred is set in SPC, and the instruction is re-executed after returning from exception handling. If an exception occurs in a delay slot instruction, however, the PC value for the preceding delayed branch instruction is saved in SPC regardless of whether or not the delayed branch instruction condition is satisfied.

b. Completion type exception or interrupt

The PC value for the instruction following that in which the exception occurred is set in SPC. If an exception occurs in a branch instruction with delay slot, however, the PC value for the branch destination is saved in SPC.

4. An exception must not be generated in an RTE instruction delay slot, as the operation will be undefined in this case.

5.8 Restrictions

1. Restrictions on first instruction of exception handling routine

• Do not locate a BT, BF, BT/S, BF/S, BRA, or BSR instruction at address VBR + H'100, VBR + H'400, or VBR + H'600.


• When the UBDE bit in the BRCR register is set to 1 and the user break debug support function* is used, do not locate a BT, BF, BT/S, BF/S, BRA, or BSR instruction at the address indicated by the DBR register.

Note: * See section 20.4 in the SH-4 Hardware manual.


Section 6 Floating-Point Unit

6.1 Overview

The floating-point unit (FPU) has the following features:

• Conforms to IEEE754 standard

• 32 single-precision floating-point registers (can also be referenced as 16 double-precision registers)

• Two rounding modes: Round to Nearest and Round to Zero

• Two denormalization modes: Flush to Zero and Treat Denormalized Number

• Six exception sources: FPU Error, Invalid Operation, Divide By Zero, Overflow, Underflow, and Inexact

• Comprehensive instructions: Single-precision, double-precision, graphics support, system control

When the FD bit in SR is set to 1, the FPU cannot be used, and an attempt to execute an FPU instruction will cause an FPU disable exception.

6.2 Data Formats

6.2.1 Floating-Point Format

A floating-point number consists of the following three fields:

• Sign (s)

• Exponent (e)

• Fraction (f)

The SH7750 can handle single-precision and double-precision floating-point numbers, using the formats shown in figures 6.1 and 6.2.

[Bit layout: bit 31 = s; bits 30–23 = e; bits 22–0 = f]

Figure 6.1 Format of Single-Precision Floating-Point Number


[Bit layout: bit 63 = s; bits 62–52 = e; bits 51–0 = f]

Figure 6.2 Format of Double-Precision Floating-Point Number

The exponent is expressed in biased form, as follows:

e = E + bias

The range of unbiased exponent E is Emin – 1 to Emax + 1. The two values Emin – 1 and Emax + 1 are distinguished as follows. Emin – 1 indicates zero (both positive and negative sign) and a denormalized number, and Emax + 1 indicates positive or negative infinity or a non-number (NaN). Table 6.1 shows bias, Emin, and Emax values.

Table 6.1 Floating-Point Number Formats and Parameters

Parameter Single-Precision Double-Precision

Total bit width 32 bits 64 bits

Sign bit 1 bit 1 bit

Exponent field 8 bits 11 bits

Fraction field 23 bits 52 bits

Precision 24 bits 53 bits

Bias +127 +1023

Emax +127 +1023

Emin –126 –1022

Floating-point number value v is determined as follows:

If $E = E_{max} + 1$ and $f \neq 0$, v is a non-number (NaN) irrespective of sign s

If $E = E_{max} + 1$ and $f = 0$, $v = (-1)^s \cdot \infty$ [positive or negative infinity]

If $E_{min} \le E \le E_{max}$, $v = (-1)^s \, 2^{E} \, (1.f)$ [normalized number]

If $E = E_{min} - 1$ and $f \neq 0$, $v = (-1)^s \, 2^{E_{min}} \, (0.f)$ [denormalized number]

If $E = E_{min} - 1$ and $f = 0$, $v = (-1)^s \cdot 0$ [positive or negative zero]
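The five cases above can be checked on a raw single-precision bit pattern. The sketch below uses hypothetical names and works on the biased exponent field e, where e = 255 corresponds to E = Emax + 1 and e = 0 corresponds to E = Emin – 1 for the single-precision parameters in Table 6.1.

```c
#include <stdint.h>

typedef enum {
    FP_KIND_NAN,
    FP_KIND_INFINITY,
    FP_KIND_NORMALIZED,
    FP_KIND_DENORMALIZED,
    FP_KIND_ZERO
} fp_kind;

/* Classifies a 32-bit single-precision bit pattern. */
fp_kind classify_single(uint32_t bits)
{
    uint32_t e = (bits >> 23) & 0xFFu;   /* 8-bit biased exponent */
    uint32_t f = bits & 0x7FFFFFu;       /* 23-bit fraction */

    if (e == 0xFFu)                      /* E = Emax + 1 */
        return f ? FP_KIND_NAN : FP_KIND_INFINITY;
    if (e == 0u)                         /* E = Emin - 1 */
        return f ? FP_KIND_DENORMALIZED : FP_KIND_ZERO;
    return FP_KIND_NORMALIZED;           /* Emin <= E <= Emax */
}
```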

Table 6.2 shows the ranges of the various numbers in hexadecimal notation.


Table 6.2 Floating-Point Ranges

Type                            Single-Precision            Double-Precision

Signaling non-number            H’7FFFFFFF to H’7FC00000    H’7FFFFFFF H’FFFFFFFF to H’7FF80000 H’00000000
Quiet non-number                H’7FBFFFFF to H’7F800001    H’7FF7FFFF H’FFFFFFFF to H’7FF00000 H’00000001
Positive infinity               H’7F800000                  H’7FF00000 H’00000000
Positive normalized number      H’7F7FFFFF to H’00800000    H’7FEFFFFF H’FFFFFFFF to H’00100000 H’00000000
Positive denormalized number    H’007FFFFF to H’00000001    H’000FFFFF H’FFFFFFFF to H’00000000 H’00000001
Positive zero                   H’00000000                  H’00000000 H’00000000
Negative zero                   H’80000000                  H’80000000 H’00000000
Negative denormalized number    H’80000001 to H’807FFFFF    H’80000000 H’00000001 to H’800FFFFF H’FFFFFFFF
Negative normalized number      H’80800000 to H’FF7FFFFF    H’80100000 H’00000000 to H’FFEFFFFF H’FFFFFFFF
Negative infinity               H’FF800000                  H’FFF00000 H’00000000
Quiet non-number                H’FF800001 to H’FFBFFFFF    H’FFF00000 H’00000001 to H’FFF7FFFF H’FFFFFFFF
Signaling non-number            H’FFC00000 to H’FFFFFFFF    H’FFF80000 H’00000000 to H’FFFFFFFF H’FFFFFFFF

6.2.2 Non-Numbers (NaN)

Figure 6.3 shows the bit pattern of a non-number (NaN). A value is NaN in the following case:

• Sign bit: Don’t care

• Exponent field: All bits are 1

• Fraction field: At least one bit is 1

The NaN is a signaling NaN (sNaN) if the MSB of the fraction field is 1, and a quiet NaN (qNaN) if the MSB is 0.

Bit 31: x | Bits 30–23: 11111111 | Bits 22–0: Nxxxxxxxxxxxxxxxxxxxxxx

N = 1: sNaN

N = 0: qNaN

Figure 6.3 Single-Precision NaN Bit Pattern

When an sNaN is input to an operation that generates a floating-point value (any operation other than copy, FABS, and FNEG), the result depends on the EN.V bit in the FPSCR register:

• When the EN.V bit in the FPSCR register is 0, the operation result (output) is a qNaN.

• When the EN.V bit in the FPSCR register is 1, an invalid operation exception will be generated. In this case, the contents of the operation destination register are unchanged.

If a qNaN is input in an operation that generates a floating-point value, and an sNaN has not been input in that operation, the output will always be a qNaN irrespective of the setting of the EN.V bit in the FPSCR register. An exception will not be generated in this case.

The qNaN values generated by the SH7750 as operation results are as follows:

• Single-precision qNaN: H’7FBFFFFF

• Double-precision qNaN: H’7FF7FFFF FFFFFFFF

See section 10, Instruction Descriptions, for details of floating-point operations when a non-number (NaN) is input.
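The NaN rules in this section can be sketched as follows for single-precision bit patterns. The function and type names (sp_is_snan, nan_input_outcome, and so on) are hypothetical, and the model covers only two-operand operations that generate a floating-point value; it is an illustration of the rules, not a description of the hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SP_EXP_MASK  0x7F800000u  /* exponent field, bits 30-23 */
#define SP_FRAC_MASK 0x007FFFFFu  /* fraction field, bits 22-0 */
#define SP_FRAC_MSB  0x00400000u  /* bit N in figure 6.3 */

/* NaN: exponent all 1s and at least one fraction bit set. */
bool sp_is_nan(uint32_t bits)
{
    return (bits & SP_EXP_MASK) == SP_EXP_MASK && (bits & SP_FRAC_MASK) != 0;
}

/* On the SH7750, N = 1 marks a signaling NaN and N = 0 a quiet NaN. */
bool sp_is_snan(uint32_t bits) { return sp_is_nan(bits) && (bits & SP_FRAC_MSB) != 0; }
bool sp_is_qnan(uint32_t bits) { return sp_is_nan(bits) && (bits & SP_FRAC_MSB) == 0; }

typedef enum {
    NAN_NOT_INVOLVED,      /* neither operand is a NaN */
    NAN_RESULT_QNAN,       /* result is a qNaN, no exception */
    NAN_INVALID_EXCEPTION  /* invalid operation exception, destination unchanged */
} nan_outcome_t;

/* Outcome for operands a and b, given the FPSCR EN.V bit. */
nan_outcome_t nan_input_outcome(uint32_t a, uint32_t b, bool enable_v)
{
    if (sp_is_snan(a) || sp_is_snan(b))
        return enable_v ? NAN_INVALID_EXCEPTION : NAN_RESULT_QNAN;
    if (sp_is_qnan(a) || sp_is_qnan(b))
        return NAN_RESULT_QNAN;
    return NAN_NOT_INVOLVED;
}
```

Note that the sNaN/qNaN sense of bit N used here follows this manual and should not be assumed to match other floating-point implementations.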

6.2.3 Denormalized Numbers

For a denormalized number floating-point value, the exponent field is expressed as 0, and the fraction field as a non-zero value.

When the DN bit in the FPU’s status register FPSCR is 1, a denormalized number (source operand or operation result) is always flushed to 0 in a floating-point operation that generates a value (an operation other than copy, FNEG, or FABS).

When the DN bit in FPSCR is 0, a denormalized number (source operand or operation result) is processed as it is. See section 10, Description of Instructions, for details of floating-point operations when a denormalized number is input.
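The effect of the DN bit on a single-precision value can be sketched as below. The function name sp_apply_dn is hypothetical, and keeping the sign bit when a source operand is flushed is an assumption of this sketch; the text above only states that the value is flushed to 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return the value an operation would see or produce under FPSCR.DN.
 * A denormalized number has exponent field 0 and a non-zero fraction. */
uint32_t sp_apply_dn(uint32_t bits, bool dn)
{
    bool is_denormalized = (bits & 0x7F800000u) == 0 &&
                           (bits & 0x007FFFFFu) != 0;
    if (dn && is_denormalized)
        return bits & 0x80000000u;  /* zero; sign kept (assumption) */
    return bits;                    /* DN = 0, or not a denormalized number */
}
```

With DN = 1, sp_apply_dn(0x807FFFFF, true) yields H’80000000, while normalized inputs pass through unchanged.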


6.3 Registers

6.3.1 Floating-Point Registers

Figure 6.4 shows the floating-point register configuration. There are thirty-two 32-bit floating-point registers, referenced by specifying FR0–FR15, DR0/2/4/6/8/10/12/14, FV0/4/8/12, XF0–XF15, XD0/2/4/6/8/10/12/14, or XMTRX.

1. Floating-point registers, FPRi_BANKj (32 registers)

FPR0_BANK0–FPR15_BANK0

FPR0_BANK1–FPR15_BANK1

2. Single-precision floating-point registers, FRi (16 registers)

When FPSCR.FR = 0, FR0–FR15 indicate FPR0_BANK0–FPR15_BANK0;

when FPSCR.FR = 1, FR0–FR15 indicate FPR0_BANK1–FPR15_BANK1.

3. Double-precision floating-point registers, DRi (8 registers): A DR register comprises two FR registers

DR0 = {FR0, FR1}, DR2 = {FR2, FR3}, DR4 = {FR4, FR5}, DR6 = {FR6, FR7}, DR8 = {FR8, FR9}, DR10 = {FR10, FR11}, DR12 = {FR12, FR13}, DR14 = {FR14, FR15}

4. Single-precision floating-point vector registers, FVi (4 registers): An FV register comprises four FR registers

FV0 = {FR0, FR1, FR2, FR3}, FV4 = {FR4, FR5, FR6, FR7}, FV8 = {FR8, FR9, FR10, FR11}, FV12 = {FR12, FR13, FR14, FR15}

5. Single-precision floating-point extended registers, XFi (16 registers)

When FPSCR.FR = 0, XF0–XF15 indicate FPR0_BANK1–FPR15_BANK1;

when FPSCR.FR = 1, XF0–XF15 indicate FPR0_BANK0–FPR15_BANK0.

6. Double-precision floating-point extended registers, XDi (8 registers): An XD register comprises two XF registers

XD0 = {XF0, XF1}, XD2 = {XF2, XF3}, XD4 = {XF4, XF5}, XD6 = {XF6, XF7}, XD8 = {XF8, XF9}, XD10 = {XF10, XF11}, XD12 = {XF12, XF13}, XD14 = {XF14, XF15}

7. Single-precision floating-point extended register matrix, XMTRX: XMTRX comprises all 16 XF registers

XMTRX = XF0  XF4  XF8   XF12
        XF1  XF5  XF9   XF13
        XF2  XF6  XF10  XF14
        XF3  XF7  XF11  XF15

(Figure: the floating-point registers FPR0_BANK0–FPR15_BANK0 and FPR0_BANK1–FPR15_BANK1, with the names DR0–DR14, FV0–FV12, XD0–XD14, and XMTRX assigned to them, shown for FPSCR.FR = 0 and FPSCR.FR = 1.)

Figure 6.4 Floating-Point Registers
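The bank assignment described in items 1 to 7 can be modeled with a small C structure. This is a sketch for illustration only: the type fpu_regs_t and the accessor names are hypothetical, and frchg simply models toggling the FPSCR.FR bit. The sketch does not say which register of a DR or XD pair holds the upper half of a double-precision value.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t fpr[2][16];  /* fpr[j][i] models FPRi_BANKj */
    unsigned fr;          /* models FPSCR.FR (0 or 1) */
} fpu_regs_t;

/* FRi: bank selected by FPSCR.FR. */
uint32_t *fr_reg(fpu_regs_t *r, unsigned i)
{
    assert(i < 16 && r->fr < 2);
    return &r->fpr[r->fr][i];
}

/* XFi: the bank not selected by FPSCR.FR. */
uint32_t *xf_reg(fpu_regs_t *r, unsigned i)
{
    assert(i < 16 && r->fr < 2);
    return &r->fpr[r->fr ^ 1u][i];
}

/* DRi = {FRi, FRi+1}; returns a pointer to the first register of the pair. */
uint32_t *dr_reg(fpu_regs_t *r, unsigned i)
{
    assert(i % 2 == 0);
    return fr_reg(r, i);
}

/* FVi = {FRi, ..., FRi+3}; returns a pointer to the first element. */
uint32_t *fv_reg(fpu_regs_t *r, unsigned i)
{
    assert(i % 4 == 0);
    return fr_reg(r, i);
}

/* Swap the roles of the two banks by inverting the FR bit. */
void frchg(fpu_regs_t *r)
{
    r->fr ^= 1u;
}
```

After frchg, a value previously written through xf_reg(r, 3) is read back through fr_reg(r, 3), which mirrors the FR = 0 and FR = 1 halves of figure 6.4.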


6.3.2 Floating-Point Status/Control Register (FPSCR)

Floating-point status/control register, FPSCR (32 bits, initial value = H’0004 0001)

• FR: Floating-point register bank

FR = 0: FPR0_BANK0–FPR15_BANK0 are assigned to FR0–FR15; FPR0_BANK1–FPR15_BANK1 are assigned to XF0–XF15.

FR = 1: FPR0_BANK0–FPR15_BANK0 are assigned to XF0–XF15; FPR0_BANK1–FPR15_BANK1 are assigned to FR0–FR15.

• SZ: Transfer size mode

SZ = 0: The data size of the FMOV instruction is 32 bits.

SZ = 1: The data size of the FMOV instruction is a 32-bit register pair (64 bits).

• PR: Precision mode

PR = 0: Floating-point instructions are executed as single-precision operations.

PR = 1: Floating-point instructions are executed as double-precision operations (graphics support instructions are undefined).

Do not set SZ and PR to 1 simultaneously; this setting is reserved.

[SZ, PR = 11]: Reserved (FPU operation instruction is undefined.)

• DN: Denormalization mode

DN = 0: A denormalized number is treated as such.

DN = 1: A denormalized number is treated as zero.

• Cause, Enable, Flag: FPU exception cause, enable, and flag fields

Exception sources: FPU error (E), invalid operation (V), division by zero (Z), overflow (O), underflow (U), inexact (I)

Cause (FPU exception cause field): E = bit 17, V = bit 16, Z = bit 15, O = bit 14, U = bit 13, I = bit 12

Enable (FPU exception enable field): E = none, V = bit 11, Z = bit 10, O = bit 9, U = bit 8, I = bit 7

Flag (FPU exception flag field): E = none, V = bit 6, Z = bit 5, O = bit 4, U = bit 3, I = bit 2

When an FPU exception is requested, the corresponding bits in the cause and flag fields are set to 1. Each time an FPU operation instruction is executed, the cause field is cleared to 0 first. The flag field retains the value of 1 until cleared to 0 by software.

• RM: Rounding mode

RM = 00: Round to Nearest

RM = 01: Round to Zero

RM = 10: Reserved

RM = 11: Reserved

• Bits 22 to 31: Reserved

These bits are always read as 0, and should only be written with 0.

Notes: The following functions have been added to the FPU of the SH7750 (not provided in the FPU of the SH7718):

1. The FR, SZ, and PR bits have been added.

2. Exception O (overflow), U (underflow), and I (inexact) bits have been added to the cause, enable, and flag fields.

3. An exception E (FPU error) bit has been added to the cause field.
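The FPSCR fields described in this section can be handled in C with masks such as the following. This is a sketch under stated assumptions: the cause field position (bits 17 to 12) and the reserved bits (31 to 22) come from the text, while the positions used for FR, SZ, PR, DN, and RM (bits 21, 20, 19, 18, and 1 to 0) and for the enable and flag fields (bits 11 to 7 and 6 to 2) are assumed here and should be checked against the register diagram. All macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FPSCR_RM_MASK      0x00000003u  /* rounding mode (assumed bits 1-0) */
#define FPSCR_FLAG_SHIFT   2            /* flag field (assumed bits 6-2) */
#define FPSCR_ENABLE_SHIFT 7            /* enable field (assumed bits 11-7) */
#define FPSCR_CAUSE_SHIFT  12           /* cause field, bits 17-12 */
#define FPSCR_CAUSE_MASK   0x0003F000u
#define FPSCR_DN           (1u << 18)   /* assumed position */
#define FPSCR_PR           (1u << 19)   /* assumed position */
#define FPSCR_SZ           (1u << 20)   /* assumed position */
#define FPSCR_FR           (1u << 21)   /* assumed position */
#define FPSCR_INITIAL      0x00040001u  /* initial value H'0004 0001 */

/* Exception sources, numbered by their offset within each field. */
typedef enum {
    FPU_SRC_I = 0, FPU_SRC_U, FPU_SRC_O, FPU_SRC_Z, FPU_SRC_V, FPU_SRC_E
} fpu_source_t;

unsigned fpscr_rounding_mode(uint32_t fpscr)
{
    return fpscr & FPSCR_RM_MASK;
}

/* The cause field is cleared first each time an FPU operation executes. */
uint32_t fpscr_clear_cause(uint32_t fpscr)
{
    return fpscr & ~FPSCR_CAUSE_MASK;
}

/* Record an exception source: set its cause bit and, except for E,
 * its flag bit (the flag field has no E bit). */
uint32_t fpscr_raise(uint32_t fpscr, fpu_source_t src)
{
    fpscr |= 1u << (FPSCR_CAUSE_SHIFT + (unsigned)src);
    if (src != FPU_SRC_E)
        fpscr |= 1u << (FPSCR_FLAG_SHIFT + (unsigned)src);
    return fpscr;
}

/* E has no enable bit, so it is always treated as enabled here. */
bool fpscr_enabled(uint32_t fpscr, fpu_source_t src)
{
    if (src == FPU_SRC_E)
        return true;
    return ((fpscr >> (FPSCR_ENABLE_SHIFT + (unsigned)src)) & 1u) != 0;
}
```

Under these assumptions the initial value decodes as DN = 1 and RM = 01, with FR, SZ, and PR all 0.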

6.3.3 Floating-Point Communication Register (FPUL)

Information is transferred between the FPU and CPU via the FPUL register. The 32-bit FPUL register is a system register, and is accessed from the CPU side by means of LDS and STS instructions. For example, to convert the integer stored in general register R1 to a single-precision floating-point number, the processing flow is as follows:

R1 → (LDS instruction) → FPUL → (single-precision FLOAT instruction) → FR1

6.4 Rounding

In a floating-point instruction, rounding is performed when generating the final operation result from the intermediate result. Therefore, the result of a combination instruction such as FMAC, FTRV, or FIPR may differ from the result obtained with basic instructions such as FADD, FSUB, and FMUL. Rounding is performed once in FMAC, but twice when the same computation is carried out with FMUL followed by FADD or FSUB.

There are two rounding methods; the method used is determined by the RM field in FPSCR.

• RM = 00: Round to Nearest

• RM = 01: Round to Zero

Round to Nearest: The value is rounded to the nearest expressible value. If there are two nearest expressible values, the one with an LSB of 0 is selected.

If the unrounded value is 2^Emax × (2 – 2^–P) or more, the result will be infinity with the same sign as the unrounded value. The values of Emax and P, respectively, are 127 and 24 for single-precision, and 1023 and 53 for double-precision.

Round to Zero: The digits below the round bit of the unrounded value are discarded.

If the unrounded value is larger than the maximum expressible absolute value, the value will be the maximum expressible absolute value.
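The two rounding methods can be illustrated with a simplified integer model in which an unrounded magnitude is reduced by discarding its low-order bits. The function round_drop_bits and the enumeration names are hypothetical, and the model ignores exponents, signs, and overflow to infinity; it is not a description of the FPU data path.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { RM_NEAREST = 0, RM_ZERO = 1 } round_mode_t;

/* Reduce `value` by `drop` low-order bits using the given rounding method. */
uint64_t round_drop_bits(uint64_t value, unsigned drop, round_mode_t rm)
{
    assert(drop >= 1 && drop < 64);
    uint64_t kept = value >> drop;
    if (rm == RM_ZERO)
        return kept;                      /* discarded digits are ignored */

    uint64_t half = 1ull << (drop - 1);
    uint64_t rest = value & ((1ull << drop) - 1);
    /* Round up above the halfway point; on a tie, pick the result whose
     * LSB is 0. */
    if (rest > half || (rest == half && (kept & 1u)))
        kept += 1;
    return kept;
}
```

The same model shows why one rounding step and two rounding steps can disagree: round_drop_bits(23, 4, RM_NEAREST) gives 1, whereas dropping 2 bits twice gives 6 after the first step and 2 after the second.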

6.5 Floating-Point Exceptions

FPU-related exceptions are as follows:

• General illegal instruction/slot illegal instruction exception

The exception occurs if an FPU instruction is executed when SR.FD = 1.

• FPU exceptions

The exception sources are as follows:

FPU error (E): When FPSCR.DN = 0 and a denormalized number is input

Invalid operation (V): In case of an invalid operation, such as NaN input

Division by zero (Z): Division with a zero divisor

Overflow (O): When the operation result overflows

Underflow (U): When the operation result underflows

Inexact exception (I): When overflow, underflow, or rounding occurs

The FPSCR cause field contains bits corresponding to all of the above sources E, V, Z, O, U, and I, and the FPSCR flag and enable fields contain bits corresponding to sources V, Z, O, U, and I, but not E. Thus, FPU errors cannot be disabled.

When an exception source occurs, the corresponding bit in the cause field is set to 1, and the corresponding bit in the flag field is also set to 1 (the flag bits accumulate until cleared by software). When an exception source does not occur, the corresponding bit in the cause field is cleared to 0, but the corresponding bit in the flag field remains unchanged.

• Enable/disable exception handling

The SH7750 supports enable exception handling and disable exception handling.

Enable exception handling is initiated in the following cases:

FPU error (E): FPSCR.DN = 0 and a denormalized number is input

Invalid operation (V): FPSCR.EN.V = 1 and (instruction = FTRV or invalid operation)

Division by zero (Z): FPSCR.EN.Z = 1 and division with a zero divisor

Overflow (O): FPSCR.EN.O = 1 and instruction with possibility of operation result overflow

Underflow (U): FPSCR.EN.U = 1 and instruction with possibility of operation result underflow

Inexact exception (I): FPSCR.EN.I = 1 and instruction with possibility of inexact operation result

These possibilities are shown in the individual instruction descriptions. All exception events that originate in the FPU are assigned as the same exception event. The meaning of an exception is determined by software by reading system register FPSCR and interpreting the information it contains. If no bits are set in the cause field of FPSCR when one or more of bits O, U, I, and V (in case of FTRV only) are set in the enable field, this indicates that an actual exception source is not generated. Also, the destination register is not changed by any enable exception handling operation.

Except for the above, the FPU performs disable exception handling. In all such processing, the bit corresponding to source V, Z, O, U, or I is set to 1, and the following result is generated for each exception:

Invalid operation (V): qNaN is generated as the result.

Division by zero (Z): Infinity with the same sign as the unrounded value is generated.

Overflow (O):

When rounding mode = RZ, the maximum normalized number, with the same sign as the unrounded value, is generated.

When rounding mode = RN, infinity with the same sign as the unrounded value is generated.

Underflow (U):

When FPSCR.DN = 0, a denormalized number with the same sign as the unrounded value, or zero with the same sign as the unrounded value, is generated.

When FPSCR.DN = 1, zero with the same sign as the unrounded value is generated.

Inexact exception (I): An inexact result is generated.
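The single-precision results generated by disable exception handling can be written out as bit patterns. The sketch below uses hypothetical function names; the qNaN pattern is the single-precision value given in section 6.2.2, and the maximum normalized number and infinity patterns are taken from table 6.2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SP_SIGN_BIT     0x80000000u
#define SP_INFINITY     0x7F800000u  /* positive infinity */
#define SP_MAX_NORMAL   0x7F7FFFFFu  /* maximum positive normalized number */
#define SP_DEFAULT_QNAN 0x7FBFFFFFu  /* qNaN generated as an operation result */

/* Invalid operation (V): a qNaN is generated. */
uint32_t sp_invalid_result(void)
{
    return SP_DEFAULT_QNAN;
}

/* Division by zero (Z): infinity with the sign of the unrounded value. */
uint32_t sp_divzero_result(bool negative)
{
    return (negative ? SP_SIGN_BIT : 0u) | SP_INFINITY;
}

/* Overflow (O): rm is the FPSCR.RM value, 0 = Round to Nearest (RN),
 * 1 = Round to Zero (RZ). */
uint32_t sp_overflow_result(bool negative, unsigned rm)
{
    assert(rm == 0 || rm == 1);
    uint32_t magnitude = (rm == 1) ? SP_MAX_NORMAL : SP_INFINITY;
    return (negative ? SP_SIGN_BIT : 0u) | magnitude;
}
```

For instance, a negative overflow under Round to Zero produces H’FF7FFFFF, the most negative normalized number.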

6.6 Graphics Support Functions

The SH7750 supports two kinds of graphics functions: new instructions for geometric operations, and pair single-precision transfer instructions that enable high-speed data transfer.

6.6.1 Geometric Operation Instructions

Geometric operation instructions perform approximate-value computations. To enable high-speed computation with a minimum of hardware, the SH7750 ignores comparatively small values in the partial computation results of four multiplications. Consequently, the error shown below is produced in the result of the computation:

Maximum error = MAX (individual multiplication result × 2^(–MIN (number of multiplier significant digits – 1, number of multiplicand significant digits – 1))) + MAX (result value × 2^–23, 2^–149)

The number of significant digits is 24 for a normalized number and 23 for a denormalized number (number of leading zeros in the fractional part).

In future versions of the SH series, the above error bound is guaranteed, but the same result as on the SH7750 is not guaranteed.

FIPR FVm, FVn (m, n: 0, 4, 8, 12): This instruction is basically used for the following purposes:

• Inner product (m ≠ n):

This operation is generally used for surface/rear surface determination for polygon surfaces.

• Sum of square of elements (m = n):

This operation is generally used to find the length of a vector.

Since approximate-value computations are performed to enable high-speed computation, the inexact exception (I) bit in the cause field and flag field is always set to 1 when an FIPR instruction is executed. Therefore, if the corresponding bit is set in the enable field, enable exception handling will be executed.
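The arithmetic that FIPR approximates is an ordinary four-element inner product. The C function below (fipr_model is a hypothetical name) computes it with standard float arithmetic, so it does not reproduce the approximation error described above or the setting of the I bit; it only shows the two uses listed.

```c
#include <assert.h>

/* Inner product of two four-element vectors, as FIPR FVm, FVn computes
 * approximately. Passing the same vector twice gives the sum of squares. */
float fipr_model(const float fvm[4], const float fvn[4])
{
    float sum = 0.0f;
    for (int i = 0; i < 4; i++)
        sum += fvm[i] * fvn[i];
    return sum;
}
```

With m = n the result is the squared length of the vector, so a square root is still needed to obtain the length itself.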

FTRV XMTRX, FVn (n: 0, 4, 8, 12): This instruction is basically used for the following purposes:

• Matrix (4 × 4) ⋅ vector (4):

This operation is generally used for viewpoint changes, angle changes, or movements called vector transformations (4-dimensional). Since affine transformation processing for angle + parallel movement basically requires a 4 × 4 matrix, the SH7750 supports 4-dimensional operations.

• Matrix (4 × 4) × matrix (4 × 4):

This operation requires the execution of four FTRV instructions.

Since approximate-value computations are performed to enable high-speed computation, the inexact exception (I) bit in the cause field and flag field is always set to 1 when an FTRV instruction is executed. Therefore, if the corresponding bit is set in the enable field, enable exception handling will be executed. For the same reason, it is not possible to check all data types in the registers beforehand when executing an FTRV instruction. If the V bit is set in the enable field, enable exception handling will be executed.
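The matrix-vector product performed by FTRV can be sketched in C using the XMTRX layout given in section 6.3.1, in which XF0 to XF3 form the first column, XF4 to XF7 the second, and so on. The function ftrv_model is a hypothetical name, the array xmtrx[k] stands for XFk, and the code uses exact float arithmetic rather than the approximate computation of the hardware.

```c
#include <assert.h>

/* fv <- XMTRX x fv, with matrix element (row, col) stored in
 * xmtrx[4 * col + row], matching the XMTRX layout of XF0-XF15. */
void ftrv_model(const float xmtrx[16], float fv[4])
{
    float out[4];
    for (int row = 0; row < 4; row++) {
        out[row] = 0.0f;
        for (int col = 0; col < 4; col++)
            out[row] += xmtrx[4 * col + row] * fv[col];
    }
    for (int i = 0; i < 4; i++)
        fv[i] = out[i];
}
```

A 4 × 4 matrix product can be formed by applying this function to each of the four column vectors of the right-hand matrix, which corresponds to the four FTRV instructions mentioned above.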

FRCHG: This instruction modifies banked registers. For example, when the FTRV instruction is executed, matrix elements must be set in an array in the background bank. However, to create the actual elements of a translation matrix, it is easier to use registers in the foreground bank. When the LDS instruction is used on FPSCR, this instruction expends 4 to 5 cycles in order to maintain the FPU state. With the FRCHG instruction, an FPSCR.FR bit modification can be performed in one cycle.

6.6.2 Pair Single-Precision Data Transfer

In addition to the powerful new geometric operation instructions, the SH7750 also supports high-speed data transfer instructions.

When FPSCR.SZ = 1, the SH7750 can perform data transfer by means of pair single-precision data transfer instructions.

• FMOV DRm/XDm, DRn/XDn (m, n: 0, 2, 4, 6, 8, 10, 12, 14)

• FMOV DRm/XDm, @Rn (m: 0, 2, 4, 6, 8, 10, 12, 14; n: 0 to 15)

These instructions enable two single-precision (2 × 32-bit) data items to be transferred; that is, the transfer performance of these instructions is doubled.

• FSCHG

This instruction changes the value of the SZ bit in FPSCR, enabling fast switching between use and non-use of pair single-precision data transfer.

Section 7 Instruction Set

7.1 Execution Environment

PC: At the start of instruction execution, PC indicates the address of the instruction itself.

Data sizes and data types: The SH7750’s instruction set is implemented with 16-bit fixed-length instructions. The SH7750 can use byte (8-bit), word (16-bit), longword (32-bit), and quadword (64-bit) data sizes for memory access. Single-precision floating-point data (32 bits) can be moved to and from memory using longword or quadword size. Double-precision floating-point data (64 bits) can be moved to and from memory using longword size. When a double-precision floating-point operation is specified (FPSCR.PR = 1), the result of an operation using quadword access will be undefined. When the SH7750 moves byte-size or word-size data from memory to a register, the data is sign-extended.

Load-Store Architecture: The SH7750 features a load-store architecture in which operations are basically executed using registers. Except for bit-manipulation operations such as logical AND that are executed directly in memory, operands in an operation that requires memory access are loaded into registers and the operation is executed between the registers.

Delayed Branches: Except for the two branch instructions BF and BT, the SH7750’s branch instructions and RTE are delayed branches. In a delayed branch, the instruction following the branch is executed before the branch destination instruction. This execution slot following a delayed branch is called a delay slot. For example, the BRA execution sequence is as follows:

Static Sequence     Dynamic Sequence
BRA TARGET          BRA TARGET
ADD R1, R0          ADD R1, R0       ; ADD in delay slot is executed before branching to TARGET
next_2              target_instr

Delay Slot: An illegal instruction exception may occur when a specific instruction is executed in a delay slot. See section 5, Exceptions. The instruction following BF/S or BT/S for which the branch is not taken is also a delay slot instruction.

T Bit: The T bit in the status register (SR) is used to show the result of a compare operation, and is referenced by a conditional branch instruction. An example of the use of a conditional branch instruction is shown below.

ADD #1, R0 ; T bit is not changed by ADD operation

CMP/EQ R1, R0 ; If R0 = R1, T bit is set to 1

BT TARGET ; Branches to TARGET if T bit = 1 (R0 = R1)

In an RTE delay slot, status register (SR) bits are referenced as follows. In instruction access, the MD bit is used before modification, and in data access, the MD bit is accessed after modification. The other bits (S, T, M, Q, FD, BL, and RB) after modification are used for delay slot instruction execution. The STC and STC.L SR instructions access all SR bits after modification.

Constant Values: An 8-bit constant value can be specified by the instruction code and an immediate value. 16-bit and 32-bit constant values can be defined as literal constant values in memory, and can be referenced by a PC-relative load instruction.

MOV.W @(disp, PC), Rn

MOV.L @(disp, PC), Rn

There are no PC-relative load instructions for floating-point operations. However, it is possible to set 0.0 or 1.0 by using the FLDI0 or FLDI1 instruction on a single-precision floating-point register.

7.2 Addressing Modes

Addressing modes and effective address calculation methods are shown in table 7.1. When a location in virtual memory space is accessed (MMUCR.AT = 1), the effective address is translated into a physical memory address. If multiple virtual memory space systems are selected (MMUCR.SV = 0), the least significant 8 bits of PTEH are also referenced as the access ASID. See section 3, Memory Management Unit (MMU).

Table 7.1 Addressing Modes and Effective Addresses

Register direct
  Instruction format: Rn
  Effective address calculation method: Effective address is register Rn. (Operand is register Rn contents.)
  Calculation formula: —

Register indirect
  Instruction format: @Rn
  Effective address calculation method: Effective address is register Rn contents.
  Calculation formula: Rn → EA (EA: effective address)

Register indirect with post-increment
  Instruction format: @Rn+
  Effective address calculation method: Effective address is register Rn contents. A constant is added to Rn after instruction execution: 1 for a byte operand, 2 for a word operand, 4 for a longword operand, 8 for a quadword operand.
  Calculation formula: Rn → EA. After instruction execution, Byte: Rn + 1 → Rn; Word: Rn + 2 → Rn; Longword: Rn + 4 → Rn; Quadword: Rn + 8 → Rn

Register indirect with pre-decrement
  Instruction format: @–Rn
  Effective address calculation method: Effective address is register Rn contents, decremented by a constant beforehand: 1 for a byte operand, 2 for a word operand, 4 for a longword operand, 8 for a quadword operand.
  Calculation formula: Byte: Rn – 1 → Rn; Word: Rn – 2 → Rn; Longword: Rn – 4 → Rn; Quadword: Rn – 8 → Rn. Then Rn → EA (instruction executed with Rn after calculation)

Register indirect with displacement
  Instruction format: @(disp:4, Rn)
  Effective address calculation method: Effective address is register Rn contents with 4-bit displacement disp added. After disp is zero-extended, it is multiplied by 1 (byte), 2 (word), or 4 (longword), according to the operand size.
  Calculation formula: Byte: Rn + disp → EA; Word: Rn + disp × 2 → EA; Longword: Rn + disp × 4 → EA

Indexed register indirect
  Instruction format: @(R0, Rn)
  Effective address calculation method: Effective address is sum of register Rn and R0 contents.
  Calculation formula: Rn + R0 → EA

GBR indirect with displacement
  Instruction format: @(disp:8, GBR)
  Effective address calculation method: Effective address is register GBR contents with 8-bit displacement disp added. After disp is zero-extended, it is multiplied by 1 (byte), 2 (word), or 4 (longword), according to the operand size.
  Calculation formula: Byte: GBR + disp → EA; Word: GBR + disp × 2 → EA; Longword: GBR + disp × 4 → EA

Indexed GBR indirect
  Instruction format: @(R0, GBR)
  Effective address calculation method: Effective address is sum of register GBR and R0 contents.
  Calculation formula: GBR + R0 → EA

PC-relative with displacement
  Instruction format: @(disp:8, PC)
  Effective address calculation method: Effective address is PC+4 with 8-bit displacement disp added. After disp is zero-extended, it is multiplied by 2 (word), or 4 (longword), according to the operand size. With a longword operand, the lower 2 bits of PC are masked.
  Calculation formula: Word: PC + 4 + disp × 2 → EA; Longword: PC & H'FFFFFFFC + 4 + disp × 4 → EA

PC-relative
  Instruction format: disp:8
  Effective address calculation method: Effective address is PC+4 with 8-bit displacement disp added after being sign-extended and multiplied by 2.
  Calculation formula: PC + 4 + disp × 2 → Branch-Target

  Instruction format: disp:12
  Effective address calculation method: Effective address is PC+4 with 12-bit displacement disp added after being sign-extended and multiplied by 2.
  Calculation formula: PC + 4 + disp × 2 → Branch-Target

  Instruction format: Rn
  Effective address calculation method: Effective address is sum of PC+4 and Rn.
  Calculation formula: PC + 4 + Rn → Branch-Target

Immediate
  Instruction format: #imm:8
  Effective address calculation method: 8-bit immediate data imm of TST, AND, OR, or XOR instruction is zero-extended.
  Calculation formula: —

  Instruction format: #imm:8
  Effective address calculation method: 8-bit immediate data imm of MOV, ADD, or CMP/EQ instruction is sign-extended.
  Calculation formula: —

  Instruction format: #imm:8
  Effective address calculation method: 8-bit immediate data imm of TRAPA instruction is zero-extended and multiplied by 4.
  Calculation formula: —

Note: For the addressing modes below that use a displacement (disp), the assembler descriptions in this manual show the value before scaling (×1, ×2, or ×4) is performed according to the operand size. This is done to clarify the operation of the chip. Refer to the relevant assembler notation rules for the actual assembler descriptions.

@(disp:4, Rn) ; Register indirect with displacement

@(disp:8, GBR) ; GBR indirect with displacement

@(disp:8, PC) ; PC-relative with displacement

disp:8, disp:12 ; PC-relative
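Several of the calculation formulas in table 7.1 can be written as small C functions. This is a sketch for checking address arithmetic by hand; the function names are hypothetical, disp is the unscaled field value taken from the instruction code, and address translation by the MMU is not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* @(disp:4, Rn): zero-extended disp scaled by the operand size (1, 2, or 4). */
uint32_t ea_reg_disp(uint32_t rn, unsigned disp4, unsigned size)
{
    assert(disp4 < 16);
    assert(size == 1 || size == 2 || size == 4);
    return rn + disp4 * size;
}

/* @(disp:8, PC), word operand: PC + 4 + disp x 2. */
uint32_t ea_pc_disp_word(uint32_t pc, unsigned disp8)
{
    assert(disp8 < 256);
    return pc + 4 + disp8 * 2;
}

/* @(disp:8, PC), longword operand: lower 2 bits of PC are masked. */
uint32_t ea_pc_disp_long(uint32_t pc, unsigned disp8)
{
    assert(disp8 < 256);
    return (pc & 0xFFFFFFFCu) + 4 + disp8 * 4;
}

/* disp:8 branch: sign-extended disp x 2 added to PC + 4. */
uint32_t branch_target_disp8(uint32_t pc, unsigned disp8)
{
    assert(disp8 < 256);
    int32_t d = (int32_t)disp8;
    if (d & 0x80)
        d -= 0x100;                       /* sign-extend from 8 bits */
    return pc + 4 + (uint32_t)(d * 2);
}

/* disp:12 branch: sign-extended disp x 2 added to PC + 4. */
uint32_t branch_target_disp12(uint32_t pc, unsigned disp12)
{
    assert(disp12 < 4096);
    int32_t d = (int32_t)disp12;
    if (d & 0x800)
        d -= 0x1000;                      /* sign-extend from 12 bits */
    return pc + 4 + (uint32_t)(d * 2);
}
```

For a longword load located at an address that is not a multiple of 4, the masking step means that the word and longword forms can reach the same literal with the same disp value, as in the case PC = H'8C001002, disp = 1.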


7.3 Instruction Set

Table 7.2 shows the notation used in the following SH instruction list.

Table 7.2 Notation Used in Instruction List

Instruction mnemonic
  Format: OP.Sz SRC, DEST
  Description:
    OP: Operation code
    Sz: Size
    SRC: Source
    DEST: Source and/or destination operand

Summary of operation
  Description:
    →, ←: Transfer direction
    (xx): Memory operand
    M/Q/T: SR flag bits
    &: Logical AND of individual bits
    |: Logical OR of individual bits
    ∧: Logical exclusive-OR of individual bits
    ~: Logical NOT of individual bits
    <<n, >>n: n-bit shift

Instruction code
  Format: MSB ↔ LSB
  Description:
    mmmm: Register number (Rm, FRm)
    nnnn: Register number (Rn, FRn)
      0000: R0, FR0
      0001: R1, FR1
      :
      1111: R15, FR15
    mmm: Register number (DRm, XDm, Rm_BANK)
    nnn: Register number (DRn, XDn, Rn_BANK)
      000: DR0, XD0, R0_BANK
      001: DR2, XD2, R1_BANK
      :
      111: DR14, XD14, R7_BANK
    mm: Register number (FVm)
    nn: Register number (FVn)
      00: FV0
      01: FV4
      10: FV8
      11: FV12
    iiii: Immediate data
    dddd: Displacement

Privileged mode
  Description: “Privileged” means the instruction can only be executed in privileged mode.

T bit
  Format: Value of T bit after instruction execution
  Description: —: No change

Note: Scaling (×1, ×2, ×4, or ×8) is executed according to the size of the instruction operand(s).
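The instruction code notation can be read as bit fields of a 16-bit word, with the leftmost character of the code string as the MSB. The sketch below extracts the nnnn, mmmm, and 8-bit immediate fields and builds the code for MOV Rm,Rn (0110nnnnmmmm0011, as listed in table 7.3). The function names are hypothetical and are not part of any assembler or toolchain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* nnnn occupies bits 11-8 in codes of the form xxxxnnnn........ */
unsigned insn_n(uint16_t code) { return (code >> 8) & 0xFu; }

/* mmmm occupies bits 7-4 in codes of the form xxxxnnnnmmmmxxxx */
unsigned insn_m(uint16_t code) { return (code >> 4) & 0xFu; }

/* iiiiiiii occupies bits 7-0 in codes of the form xxxxnnnniiiiiiii */
unsigned insn_imm8(uint16_t code) { return code & 0xFFu; }

/* Build the code for MOV Rm,Rn: 0110nnnnmmmm0011. */
uint16_t encode_mov_rm_rn(unsigned m, unsigned n)
{
    assert(m < 16 && n < 16);
    return (uint16_t)(0x6003u | (n << 8) | (m << 4));
}

/* Match MOV Rm,Rn by comparing only the fixed bits of the pattern. */
bool is_mov_rm_rn(uint16_t code)
{
    return (code & 0xF00Fu) == 0x6003u;
}
```

For example, MOV R1,R2 encodes as H'6213, from which insn_n and insn_m recover 2 and 1.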


Table 7.3 Fixed-Point Transfer Instructions

Instruction Operation Instruction Code Privileged T Bit

MOV #imm,Rn imm → sign extension → Rn 1110nnnniiiiiiii — —

MOV.W @(disp,PC),Rn (disp × 2 + PC + 4) → sign extension → Rn 1001nnnndddddddd — —

MOV.L @(disp,PC),Rn (disp × 4 + PC & H'FFFFFFFC + 4) → Rn 1101nnnndddddddd — —

MOV Rm,Rn Rm → Rn 0110nnnnmmmm0011 — —

MOV.B Rm,@Rn Rm → (Rn) 0010nnnnmmmm0000 — —

MOV.W Rm,@Rn Rm → (Rn) 0010nnnnmmmm0001 — —

MOV.L Rm,@Rn Rm → (Rn) 0010nnnnmmmm0010 — —

MOV.B @Rm,Rn (Rm) → sign extension → Rn 0110nnnnmmmm0000 — —

MOV.W @Rm,Rn (Rm) → sign extension → Rn 0110nnnnmmmm0001 — —

MOV.L @Rm,Rn (Rm) → Rn 0110nnnnmmmm0010 — —

MOV.B Rm,@-Rn Rn-1 → Rn, Rm → (Rn) 0010nnnnmmmm0100 — —

MOV.W Rm,@-Rn Rn-2 → Rn, Rm → (Rn) 0010nnnnmmmm0101 — —

MOV.L Rm,@-Rn Rn-4 → Rn, Rm → (Rn) 0010nnnnmmmm0110 — —

MOV.B @Rm+,Rn (Rm) → sign extension → Rn, Rm + 1 → Rm 0110nnnnmmmm0100 — —

MOV.W @Rm+,Rn (Rm) → sign extension → Rn, Rm + 2 → Rm 0110nnnnmmmm0101 — —
MOV.L @Rm+,Rn (Rm) → Rn, Rm + 4 → Rm 0110nnnnmmmm0110 — —

MOV.B R0,@(disp,Rn) R0 → (disp + Rn) 10000000nnnndddd — —

MOV.W R0,@(disp,Rn) R0 → (disp × 2 + Rn) 10000001nnnndddd — —

MOV.L Rm,@(disp,Rn) Rm → (disp × 4 + Rn) 0001nnnnmmmmdddd — —

MOV.B @(disp,Rm),R0 (disp + Rm) → sign extension → R0 10000100mmmmdddd — —

MOV.W @(disp,Rm),R0 (disp × 2 + Rm) → sign extension → R0 10000101mmmmdddd — —

MOV.L @(disp,Rm),Rn (disp × 4 + Rm) → Rn 0101nnnnmmmmdddd — —

MOV.B Rm,@(R0,Rn) Rm → (R0 + Rn) 0000nnnnmmmm0100 — —

MOV.W Rm,@(R0,Rn) Rm → (R0 + Rn) 0000nnnnmmmm0101 — —

MOV.L Rm,@(R0,Rn) Rm → (R0 + Rn) 0000nnnnmmmm0110 — —

MOV.B @(R0,Rm),Rn (R0 + Rm) → sign extension → Rn 0000nnnnmmmm1100 — —

MOV.W @(R0,Rm),Rn (R0 + Rm) → sign extension → Rn 0000nnnnmmmm1101 — —

MOV.L @(R0,Rm),Rn (R0 + Rm) → Rn 0000nnnnmmmm1110 — —
MOV.B R0,@(disp,GBR) R0 → (disp + GBR) 11000000dddddddd — —

MOV.W R0,@(disp,GBR) R0 → (disp × 2 + GBR) 11000001dddddddd — —

MOV.L R0,@(disp,GBR) R0 → (disp × 4 + GBR) 11000010dddddddd — —

MOV.B @(disp,GBR),R0 (disp + GBR) → sign extension → R0 11000100dddddddd — —

MOV.W @(disp,GBR),R0 (disp × 2 + GBR) → sign extension → R0 11000101dddddddd — —

MOV.L @(disp,GBR),R0 (disp × 4 + GBR) → R0 11000110dddddddd — —

MOVA @(disp,PC),R0 disp × 4 + PC & H'FFFFFFFC + 4 → R0 11000111dddddddd — —

MOVT Rn T → Rn 0000nnnn00101001 — —

SWAP.B Rm,Rn Rm → swap lower 2 bytes → Rn 0110nnnnmmmm1000 — —

SWAP.W Rm,Rn Rm → swap upper/lower words → Rn 0110nnnnmmmm1001 — —

XTRCT Rm,Rn Rm:Rn middle 32 bits → Rn 0010nnnnmmmm1101 — —

Table 7.4 Arithmetic Operation Instructions

Instruction Operation Instruction Code Privileged T Bit

ADD Rm,Rn Rn + Rm → Rn 0011nnnnmmmm1100 — —

ADD #imm,Rn Rn + imm → Rn 0111nnnniiiiiiii — —

ADDC Rm,Rn Rn + Rm + T → Rn, carry → T 0011nnnnmmmm1110 — Carry

ADDV Rm,Rn Rn + Rm → Rn, overflow → T 0011nnnnmmmm1111 — Overflow

CMP/EQ #imm,R0 When R0 = imm, 1 → T; otherwise, 0 → T 10001000iiiiiiii — Comparison result

CMP/EQ Rm,Rn When Rn = Rm, 1 → T; otherwise, 0 → T 0011nnnnmmmm0000 — Comparison result

CMP/HS Rm,Rn When Rn ≥ Rm (unsigned), 1 → T; otherwise, 0 → T 0011nnnnmmmm0010 — Comparison result

CMP/GE Rm,Rn When Rn ≥ Rm (signed), 1 → T; otherwise, 0 → T 0011nnnnmmmm0011 — Comparison result

CMP/HI Rm,Rn When Rn > Rm (unsigned), 1 → T; otherwise, 0 → T 0011nnnnmmmm0110 — Comparison result

CMP/GT Rm,Rn When Rn > Rm (signed), 1 → T; otherwise, 0 → T 0011nnnnmmmm0111 — Comparison result

CMP/PZ Rn When Rn ≥ 0, 1 → T; otherwise, 0 → T 0100nnnn00010001 — Comparison result

CMP/PL Rn When Rn > 0, 1 → T; otherwise, 0 → T 0100nnnn00010101 — Comparison result

CMP/STR Rm,Rn When any bytes are equal, 1 → T; otherwise, 0 → T 0010nnnnmmmm1100 — Comparison result

DIV1 Rm,Rn 1-step division (Rn ÷ Rm) 0011nnnnmmmm0100 — Calculation result

DIV0S Rm,Rn MSB of Rn → Q, MSB of Rm → M, M^Q → T 0010nnnnmmmm0111 — Calculation result

DIV0U 0 → M/Q/T 0000000000011001 — 0

DMULS.L Rm,Rn Signed, Rn × Rm → MAC, 32 × 32 → 64 bits 0011nnnnmmmm1101 — —

DMULU.L Rm,Rn Unsigned, Rn × Rm → MAC, 32 × 32 → 64 bits 0011nnnnmmmm0101 — —

DT Rn Rn – 1 → Rn; when Rn = 0, 1 → T; when Rn ≠ 0, 0 → T 0100nnnn00010000 — Comparison result

EXTS.B Rm,Rn Rm sign-extended from byte → Rn 0110nnnnmmmm1110 — —

EXTS.W Rm,Rn Rm sign-extended from word → Rn 0110nnnnmmmm1111 — —

EXTU.B Rm,Rn Rm zero-extended from byte → Rn 0110nnnnmmmm1100 — —

EXTU.W Rm,Rn Rm zero-extended from word → Rn 0110nnnnmmmm1101 — —

MAC.L @Rm+,@Rn+ Signed, (Rn) × (Rm) + MAC → MAC, Rn + 4 → Rn, Rm + 4 → Rm, 32 × 32 + 64 → 64 bits 0000nnnnmmmm1111 — —

MAC.W @Rm+,@Rn+ Signed, (Rn) × (Rm) + MAC → MAC, Rn + 2 → Rn, Rm + 2 → Rm, 16 × 16 + 64 → 64 bits 0100nnnnmmmm1111 — —

MUL.L Rm,Rn Rn × Rm → MACL, 32 × 32 → 32 bits 0000nnnnmmmm0111 — —

MULS.W Rm,Rn Signed, Rn × Rm → MACL, 16 × 16 → 32 bits 0010nnnnmmmm1111 — —

MULU.W Rm,Rn Unsigned, Rn × Rm → MACL, 16 × 16 → 32 bits 0010nnnnmmmm1110 — —

NEG Rm,Rn 0 – Rm → Rn 0110nnnnmmmm1011 — —

NEGC Rm,Rn 0 – Rm – T → Rn, borrow → T 0110nnnnmmmm1010 — Borrow

SUB Rm,Rn Rn – Rm → Rn 0011nnnnmmmm1000 — —

SUBC Rm,Rn Rn – Rm – T → Rn, borrow → T 0011nnnnmmmm1010 — Borrow

SUBV Rm,Rn Rn – Rm → Rn, underflow → T 0011nnnnmmmm1011 — Underflow

Table 7.5 Logic Operation Instructions

Instruction Operation Instruction Code Privileged T Bit

AND Rm,Rn Rn & Rm → Rn 0010nnnnmmmm1001 — —

AND #imm,R0 R0 & imm → R0 11001001iiiiiiii — —

AND.B #imm,@(R0,GBR) (R0 + GBR) & imm → (R0 + GBR) 11001101iiiiiiii — —

NOT Rm,Rn ~Rm → Rn 0110nnnnmmmm0111 — —

OR Rm,Rn Rn | Rm → Rn 0010nnnnmmmm1011 — —

OR #imm,R0 R0 | imm → R0 11001011iiiiiiii — —

OR.B #imm,@(R0,GBR) (R0 + GBR) | imm → (R0 + GBR) 11001111iiiiiiii — —

TAS.B @Rn When (Rn) = 0, 1 → T; otherwise, 0 → T; in both cases, 1 → MSB of (Rn) 0100nnnn00011011 — Test result

TST Rm,Rn Rn & Rm; when result = 0, 1 → T; otherwise, 0 → T 0010nnnnmmmm1000 — Test result

TST #imm,R0 R0 & imm; when result = 0, 1 → T; otherwise, 0 → T 11001000iiiiiiii — Test result

TST.B #imm,@(R0,GBR) (R0 + GBR) & imm; when result = 0, 1 → T; otherwise, 0 → T 11001100iiiiiiii — Test result

XOR Rm,Rn Rn ∧ Rm → Rn 0010nnnnmmmm1010 — —

XOR #imm,R0 R0 ∧ imm → R0 11001010iiiiiiii — —

XOR.B #imm,@(R0,GBR) (R0 + GBR) ∧ imm → (R0 + GBR) 11001110iiiiiiii — —

Table 7.6 Shift Instructions

Instruction Operation Instruction Code Privileged T Bit

ROTL Rn T ← Rn ← MSB 0100nnnn00000100 — MSB

ROTR Rn LSB → Rn → T 0100nnnn00000101 — LSB

ROTCL Rn T ← Rn ← T 0100nnnn00100100 — MSB

ROTCR Rn T → Rn → T 0100nnnn00100101 — LSB

SHAD Rm,Rn When Rm ≥ 0, Rn << Rm → Rn; when Rm < 0, Rn >> Rm → [MSB → Rn] 0100nnnnmmmm1100 — —

SHAL Rn T ← Rn ← 0 0100nnnn00100000 — MSB

SHAR Rn MSB → Rn → T 0100nnnn00100001 — LSB

SHLD Rm,Rn When Rm ≥ 0, Rn << Rm → Rn; when Rm < 0, Rn >> Rm → [0 → Rn] 0100nnnnmmmm1101 — —

SHLL Rn T ← Rn ← 0 0100nnnn00000000 — MSB

SHLR Rn 0 → Rn → T 0100nnnn00000001 — LSB

SHLL2 Rn Rn << 2 → Rn 0100nnnn00001000 — —

SHLR2 Rn Rn >> 2 → Rn 0100nnnn00001001 — —

SHLL8 Rn Rn << 8 → Rn 0100nnnn00011000 — —

SHLR8 Rn Rn >> 8 → Rn 0100nnnn00011001 — —

SHLL16 Rn Rn << 16 → Rn 0100nnnn00101000 — —

SHLR16 Rn Rn >> 16 → Rn 0100nnnn00101001 — —

Table 7.7 Branch Instructions


BF label When T = 0, disp × 2 + PC + 4 → PC; when T = 1, nop

BF/S label Delayed branch; when T = 0, disp × 2 + PC + 4 → PC; when T = 1, nop

BT label When T = 1, disp × 2 + PC + 4 → PC; when T = 0, nop

BT/S label Delayed branch; when T = 1, disp × 2 + PC + 4 → PC; when T = 0, nop

BRA label Delayed branch, disp × 2 + PC + 4 → PC 1010dddddddddddd — —

BRAF Rn Rn + PC + 4 → PC 0000nnnn00100011 — —

BSR label Delayed branch, PC + 4 → PR, disp × 2 + PC + 4 → PC 1011dddddddddddd — —

BSRF Rn Delayed branch, PC + 4 → PR, Rn + PC + 4 → PC 0000nnnn00000011 — —

JMP @Rn Delayed branch, Rn → PC 0100nnnn00101011 — —

JSR @Rn Delayed branch, PC + 4 → PR, Rn → PC 0100nnnn00001011 — —

RTS Delayed branch, PR → PC 0000000000001011 — —
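The operation "disp × 2 + PC + 4 → PC" can be illustrated for BRA, whose code 1010dddddddddddd carries a 12-bit displacement. The sketch below assumes the displacement is a signed (two's complement) value and that PC is the address of the branch instruction itself; the function names are hypothetical.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def bra_target(pc: int, opcode: int) -> int:
    """Branch destination of a BRA instruction word located at address pc."""
    if (opcode >> 12) != 0b1010:
        raise ValueError("not a BRA instruction code")
    disp = sign_extend(opcode & 0xFFF, 12)
    return (pc + 4 + disp * 2) & 0xFFFFFFFF


# A displacement of -2 branches back to the BRA instruction itself.
print(hex(bra_target(0x1000, 0xAFFE)))
```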


Table 7.8 System Control Instructions


CLRMAC 0 → MACH, MACL 0000000000101000 — —

CLRS 0 → S 0000000001001000 — —

CLRT 0 → T 0000000000001000 — 0

LDC Rm,SR Rm → SR 0100mmmm00001110 Privileged LSB

LDC Rm,GBR Rm → GBR 0100mmmm00011110 — —

LDC Rm,VBR Rm → VBR 0100mmmm00101110 Privileged —

LDC Rm,SSR Rm → SSR 0100mmmm00111110 Privileged —

LDC Rm,SPC Rm → SPC 0100mmmm01001110 Privileged —

LDC Rm,DBR Rm → DBR 0100mmmm11111010 Privileged —

LDC Rm,Rn_BANK Rm → Rn_BANK (n = 0 to 7) 0100mmmm1nnn1110 Privileged —

LDC.L @Rm+,SR (Rm) → SR, Rm + 4 → Rm 0100mmmm00000111 Privileged LSB

LDC.L @Rm+,GBR (Rm) → GBR, Rm + 4 → Rm 0100mmmm00010111 — —

LDC.L @Rm+,VBR (Rm) → VBR, Rm + 4 → Rm 0100mmmm00100111 Privileged —

LDC.L @Rm+,SSR (Rm) → SSR, Rm + 4 → Rm 0100mmmm00110111 Privileged —

LDC.L @Rm+,SPC (Rm) → SPC, Rm + 4 → Rm 0100mmmm01000111 Privileged —

LDC.L @Rm+,DBR (Rm) → DBR, Rm + 4 → Rm 0100mmmm11110110 Privileged —

LDC.L @Rm+,Rn_BANK (Rm) → Rn_BANK, Rm + 4 → Rm 0100mmmm1nnn0111 Privileged —

LDS Rm,MACH Rm → MACH 0100mmmm00001010 — —

LDS Rm,MACL Rm → MACL 0100mmmm00011010 — —

LDS Rm,PR Rm → PR 0100mmmm00101010 — —

LDS.L @Rm+,MACH (Rm) → MACH, Rm + 4 → Rm 0100mmmm00000110 — —

LDS.L @Rm+,MACL (Rm) → MACL, Rm + 4 → Rm 0100mmmm00010110 — —

LDS.L @Rm+,PR (Rm) → PR, Rm + 4 → Rm 0100mmmm00100110 — —

LDTLB PTEH/PTEL → TLB 0000000000111000 Privileged —

MOVCA.L R0,@Rn R0 → (Rn) (without fetching cache block) 0000nnnn11000011 — —

NOP No operation 0000000000001001 — —

OCBI @Rn Invalidates operand cache block 0000nnnn10010011 — —

OCBP @Rn Writes back and invalidates operand cache block 0000nnnn10100011 — —

OCBWB @Rn Writes back operand cache block 0000nnnn10110011 — —

PREF @Rn (Rn) → operand cache 0000nnnn10000011 — —

RTE Delayed branch, SSR/SPC → SR/PC 0000000000101011 Privileged —



SETS 1 → S 0000000001011000 — —

SETT 1 → T 0000000000011000 — 1

SLEEP Sleep or standby 0000000000011011 Privileged —

STC SR,Rn SR → Rn 0000nnnn00000010 Privileged —

STC GBR,Rn GBR → Rn 0000nnnn00010010 — —

STC VBR,Rn VBR → Rn 0000nnnn00100010 Privileged —

STC SSR,Rn SSR → Rn 0000nnnn00110010 Privileged —

STC SPC,Rn SPC → Rn 0000nnnn01000010 Privileged —

STC SGR,Rn SGR → Rn 0000nnnn00111010 Privileged —

STC DBR,Rn DBR → Rn 0000nnnn11111010 Privileged —

STC Rm_BANK,Rn Rm_BANK → Rn (m = 0 to 7) 0000nnnn1mmm0010 Privileged —

STC.L SR,@-Rn Rn – 4 → Rn, SR → (Rn) 0100nnnn00000011 Privileged —

STC.L GBR,@-Rn Rn – 4 → Rn, GBR → (Rn) 0100nnnn00010011 — —

STC.L VBR,@-Rn Rn – 4 → Rn, VBR → (Rn) 0100nnnn00100011 Privileged —

STC.L SSR,@-Rn Rn – 4 → Rn, SSR → (Rn) 0100nnnn00110011 Privileged —

STC.L SPC,@-Rn Rn – 4 → Rn, SPC → (Rn) 0100nnnn01000011 Privileged —

STC.L SGR,@-Rn Rn – 4 → Rn, SGR → (Rn) 0100nnnn00110010 Privileged —

STC.L DBR,@-Rn Rn – 4 → Rn, DBR → (Rn) 0100nnnn11110010 Privileged —

STC.L Rm_BANK,@-Rn Rn – 4 → Rn, Rm_BANK → (Rn) (m = 0 to 7) 0100nnnn1mmm0011 Privileged —

STS MACH,Rn MACH → Rn 0000nnnn00001010 — —

STS MACL,Rn MACL → Rn 0000nnnn00011010 — —

STS PR,Rn PR → Rn 0000nnnn00101010 — —

STS.L MACH,@-Rn Rn – 4 → Rn, MACH → (Rn) 0100nnnn00000010 — —

STS.L MACL,@-Rn Rn – 4 → Rn, MACL → (Rn) 0100nnnn00010010 — —

STS.L PR,@-Rn Rn – 4 → Rn, PR → (Rn) 0100nnnn00100010 — —

TRAPA #imm PC + 2 → SPC, SR → SSR, #imm << 2 → TRA, H'160 → EXPEVT, VBR + H'0100 → PC



Table 7.9 Floating-Point Single-Precision Instructions


FLDI0 FRn H'00000000 → FRn 1111nnnn10001101 — —

FLDI1 FRn H'3F800000 → FRn 1111nnnn10011101 — —

FMOV FRm,FRn FRm → FRn 1111nnnnmmmm1100 — —

FMOV.S @Rm,FRn (Rm) → FRn 1111nnnnmmmm1000 — —

FMOV.S @(R0,Rm),FRn (R0 + Rm) → FRn 1111nnnnmmmm0110 — —

FMOV.S @Rm+,FRn (Rm) → FRn, Rm + 4 → Rm 1111nnnnmmmm1001 — —

FMOV.S FRm,@Rn FRm → (Rn) 1111nnnnmmmm1010 — —

FMOV.S FRm,@-Rn Rn-4 → Rn, FRm → (Rn) 1111nnnnmmmm1011 — —

FMOV.S FRm,@(R0,Rn) FRm → (R0 + Rn) 1111nnnnmmmm0111 — —

FMOV DRm,DRn DRm → DRn 1111nnn0mmm01100 — —

FMOV @Rm,DRn (Rm) → DRn 1111nnn0mmmm1000 — —

FMOV @(R0,Rm),DRn (R0 + Rm) → DRn 1111nnn0mmmm0110 — —

FMOV @Rm+,DRn (Rm) → DRn, Rm + 8 → Rm 1111nnn0mmmm1001 — —

FMOV DRm,@Rn DRm → (Rn) 1111nnnnmmm01010 — —

FMOV DRm,@-Rn Rn-8 → Rn, DRm → (Rn) 1111nnnnmmm01011 — —

FMOV DRm,@(R0,Rn) DRm → (R0 + Rn) 1111nnnnmmm00111 — —

FLDS FRm,FPUL FRm → FPUL 1111mmmm00011101 — —

FSTS FPUL,FRn FPUL → FRn 1111nnnn00001101 — —

FABS FRn FRn & H'7FFF FFFF → FRn 1111nnnn01011101 — —

FADD FRm,FRn FRn + FRm → FRn 1111nnnnmmmm0000 — —

FCMP/EQ FRm,FRn When FRn = FRm, 1 → T; otherwise, 0 → T

FCMP/GT FRm,FRn When FRn > FRm, 1 → T; otherwise, 0 → T


FDIV FRm,FRn FRn/FRm → FRn 1111nnnnmmmm0011 — —

FLOAT FPUL,FRn (float) FPUL → FRn 1111nnnn00101101 — —

FMAC FR0,FRm,FRn FR0*FRm + FRn → FRn 1111nnnnmmmm1110 — —

FMUL FRm,FRn FRn*FRm → FRn 1111nnnnmmmm0010 — —

FNEG FRn FRn ∧ H'80000000 → FRn 1111nnnn01001101 — —

FSQRT FRn √FRn → FRn 1111nnnn01101101 — —

FSUB FRm,FRn FRn – FRm → FRn 1111nnnnmmmm0001 — —

FTRC FRm,FPUL (long) FRm → FPUL 1111mmmm00111101 — —
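Several entries in this table are plain bit operations on the 32-bit register image: FLDI1 loads H'3F800000, FABS clears the MSB, and FNEG inverts it. The following sketch, which assumes the registers hold IEEE 754 single-precision values, checks those constants on a host machine; the helper names are illustrative only.

```python
import struct


def bits_to_float(bits: int) -> float:
    """Reinterpret a 32-bit pattern as a single-precision value."""
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFFFFFF))[0]


def float_to_bits(value: float) -> int:
    """Reinterpret a single-precision value as its 32-bit pattern."""
    return struct.unpack(">I", struct.pack(">f", value))[0]


def fabs_bits(frn: int) -> int:
    return frn & 0x7FFFFFFF      # FABS: FRn & H'7FFF FFFF → FRn


def fneg_bits(frn: int) -> int:
    return frn ^ 0x80000000      # FNEG: FRn ∧ H'80000000 → FRn


print(bits_to_float(0x3F800000))             # value loaded by FLDI1
print(bits_to_float(fneg_bits(0x3F800000)))  # FLDI1 followed by FNEG
```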


Table 7.10 Floating-Point Double-Precision Instructions


FABS DRn DRn & H'7FFF FFFF FFFF FFFF → DRn 1111nnn001011101 — —

FADD DRm,DRn DRn + DRm → DRn 1111nnn0mmm00000 — —

FCMP/EQ DRm,DRn When DRn = DRm, 1 → T; otherwise, 0 → T 1111nnn0mmm00100 — Comparison result

FCMP/GT DRm,DRn When DRn > DRm, 1 → T; otherwise, 0 → T 1111nnn0mmm00101 — Comparison result

FDIV DRm,DRn DRn / DRm → DRn 1111nnn0mmm00011 — —

FCNVDS DRm,FPUL double_to_float [DRm] → FPUL 1111mmm010111101 — —

FCNVSD FPUL,DRn float_to_double [FPUL] → DRn 1111nnn010101101 — —

FLOAT FPUL,DRn (float) FPUL → DRn 1111nnn000101101 — —

FMUL DRm,DRn DRn * DRm → DRn 1111nnn0mmm00010 — —

FNEG DRn DRn ∧ H'8000 0000 0000 0000 → DRn 1111nnn001001101 — —

FSQRT DRn √DRn → DRn 1111nnn001101101 — —

FSUB DRm,DRn DRn – DRm → DRn 1111nnn0mmm00001 — —

FTRC DRm,FPUL (long) DRm → FPUL 1111mmm000111101 — —

Table 7.11 Floating-Point Control Instructions


LDS Rm,FPSCR Rm → FPSCR 0100mmmm01101010 — —

LDS Rm,FPUL Rm → FPUL 0100mmmm01011010 — —

LDS.L @Rm+,FPSCR (Rm) → FPSCR, Rm+4 → Rm 0100mmmm01100110 — —

LDS.L @Rm+,FPUL (Rm) → FPUL, Rm+4 → Rm 0100mmmm01010110 — —

STS FPSCR,Rn FPSCR → Rn 0000nnnn01101010 — —

STS FPUL,Rn FPUL → Rn 0000nnnn01011010 — —

STS.L FPSCR,@-Rn Rn – 4 → Rn, FPSCR → (Rn) 0100nnnn01100010 — —

STS.L FPUL,@-Rn Rn – 4 → Rn, FPUL → (Rn) 0100nnnn01010010 — —


Table 7.12 Floating-Point Graphics Acceleration Instructions


FMOV DRm,XDn DRm → XDn 1111nnn1mmm01100 — —

FMOV XDm,DRn XDm → DRn 1111nnn0mmm11100 — —

FMOV XDm,XDn XDm → XDn 1111nnn1mmm11100 — —

FMOV @Rm,XDn (Rm) → XDn 1111nnn1mmmm1000 — —

FMOV @Rm+,XDn (Rm) → XDn, Rm + 8 → Rm 1111nnn1mmmm1001 — —

FMOV @(R0,Rm),XDn (R0 + Rm) → XDn 1111nnn1mmmm0110 — —

FMOV XDm,@Rn XDm → (Rn) 1111nnnnmmm11010 — —

FMOV XDm,@-Rn Rn – 8 → Rn, XDm → (Rn) 1111nnnnmmm11011 — —

FMOV XDm,@(R0,Rn) XDm → (R0+Rn) 1111nnnnmmm10111 — —

FIPR FVm,FVn inner_product [FVm, FVn] → FR[n+3] 1111nnmm11101101 — —

FTRV XMTRX,FVn transform_vector [XMTRX, FVn] → FVn 1111nn0111111101 — —

FRCHG ~FPSCR.FR → FPSCR.FR 1111101111111101 — —

FSCHG ~FPSCR.SZ → FPSCR.SZ 1111001111111101 — —


Section 8 Pipelining

The SH7750 is a 2-ILP (instruction-level-parallelism) superscalar pipelining microprocessor. Instruction execution is pipelined, and two instructions can be executed in parallel. The execution cycles depend on the implementation of a processor. Definitions in this section may not be applicable to SH-4 Series models other than the SH7750.

8.1 Pipelines

Figure 8.1 shows the basic pipelines. Normally, a pipeline consists of five or six stages: instruction fetch (I), decode and register read (D), execution (EX/SX/F0/F1/F2/F3), data access (NA/MA), and write-back (S/FS). An instruction is executed as a combination of basic pipelines. Figure 8.2 shows the instruction execution patterns.


1. General Pipeline: I D EX NA S
I: Instruction fetch
D: Instruction decode, issue, register read, destination address calculation for PC-relative branch
EX: Operation
NA: Non-memory data access
S: Write-back

2. General Load/Store Pipeline: I D EX MA S
D: Issue, register read
EX: Address calculation
MA: Memory data access
S: Write-back

3. Special Pipeline: I D SX NA S
SX: Operation
NA: Non-memory data access
S: Write-back

4. Special Load/Store Pipeline: I D SX MA S
SX: Address calculation
MA: Memory data access
S: Write-back

5. Floating-Point Pipeline: I D F1 F2 FS
F1: Computation 1
F2: Computation 2
FS: Computation 3, write-back

6. Floating-Point Extended Pipeline: I D F0 F1 F2 FS
F0: Computation 0
F1: Computation 1
F2: Computation 2
FS: Computation 3, write-back

7. FDIV/FSQRT Pipeline: F3
F3: Computation (takes several cycles)

Figure 8.1 Basic Pipelines


For each pattern, the stage rows are listed in order of appearance in the figure; the cycle alignment between rows is not reproduced here.

1. 1-step operation: 1 issue cycle
EXT[SU].[BW], MOV, MOV#, MOVA, MOVT, SWAP.[BW], XTRCT, ADD*, CMP*, DIV*, DT, NEG*, SUB*, AND, AND#, NOT, OR, OR#, TST, TST#, XOR, XOR#, ROT*, SHA*, SHL*, BF*, BT*, BRA, NOP, CLRS, CLRT, SETS, SETT, LDS to FPUL, STS from FPUL/FPSCR, FLDI0, FLDI1, FMOV, FLDS, FSTS, single-/double-precision FABS/FNEG
I D EX NA S

2. Load/store: 1 issue cycle
MOV.[BWL], FMOV*@, LDS.L to FPUL, LDTLB, PREF, STS.L from FPUL/FPSCR
I D EX MA S

3. GBR-based load/store: 1 issue cycle
MOV.[BWL]@(d,GBR)
I D SX MA S

4. JMP, RTS, BRAF: 2 issue cycles
I D EX NA S
D EX NA S

5. TST.B: 3 issue cycles
I D SX MA S
D SX NA S
D SX NA S

6. AND.B, OR.B, XOR.B: 4 issue cycles
I D SX MA S
D SX NA S
D SX NA S
D SX MA S

7. TAS.B: 5 issue cycles
I D EX MA S
D EX MA S
D EX NA S
D EX NA S
D EX MA S

8. RTE: 5 issue cycles
I D EX NA S
D EX NA S
D EX NA S
D EX NA S
D EX NA S

9. SLEEP: 4 issue cycles
I D EX NA S
D EX NA S
D EX NA S
D EX NA S

10. OCBI: 1 issue cycle
I D EX MA S
MA

11. OCBP, OCBWB: 1 issue cycle
I D EX MA S
MA MA MA MA

12. MOVCA.L: 1 issue cycle
I D EX MA S
MA MA MA MA MA MA

13. TRAPA: 7 issue cycles
I D EX NA S
D EX NA S
D EX NA S
D EX NA S
D EX NA S
D EX NA S
D EX NA S

14. CR definition: 1 issue cycle
LDC to DBR/Rp_BANK/SSR/SPC/VBR, BSR
I D EX NA S
SX SX

15. LDC to GBR: 3 issue cycles
I D EX NA S
D D
SX SX

16. LDC to SR: 4 issue cycles
I D EX NA S
D D D
SX SX SX

17. LDC.L to DBR/Rp_BANK/SSR/SPC/VBR: 1 issue cycle
I D EX MA S
SX SX

18. LDC.L to GBR: 3 issue cycles
I D EX MA S
D D
SX SX

19. LDC.L to SR: 4 issue cycles
I D EX MA S
D D D
SX SX SX

20. STC from DBR/GBR/Rp_BANK/SR/SSR/SPC/VBR: 2 issue cycles
I D SX NA S
D SX NA S

21. STC from SGR: 3 issue cycles
I D SX NA S
D SX NA S
D SX NA S

22. STC.L from DBR/GBR/Rp_BANK/SR/SSR/SPC/VBR: 2 issue cycles
I D SX NA S
D SX MA S

23. STC.L from SGR: 3 issue cycles
I D SX NA S
D SX NA S
D SX MA S

24. LDS to PR, JSR, BSRF: 2 issue cycles
I D EX NA S
D SX SX

25. LDS.L to PR: 2 issue cycles
I D EX MA S
D SX SX

26. STS from PR: 2 issue cycles
I D SX NA S
D SX NA S

27. STS.L from PR: 2 issue cycles
I D SX NA S
D SX MA S

28. MACH/L definition: 1 issue cycle
CLRMAC, LDS to MACH/L
I D EX NA S
F1
F1 F2 FS

29. LDS.L to MACH/L: 1 issue cycle
I D EX MA S
F1
F1 F2 FS

30. STS from MACH/L: 1 issue cycle
I D EX NA S

31. STS.L from MACH/L: 1 issue cycle
I D EX MA S

32. LDS to FPSCR: 1 issue cycle
I D EX NA S
F1 F1 F1

33. LDS.L to FPSCR: 1 issue cycle
I D EX MA S
F1 F1 F1

34. Fixed-point multiplication: 2 issue cycles
DMULS.L, DMULU.L, MUL.L, MULS.W, MULU.W
(CPU) I D EX NA S
(CPU) D EX NA S
(FPU) f1
(FPU) f1
(FPU) f1
(FPU) f1 F2 FS

35. MAC.W, MAC.L: 2 issue cycles
(CPU) I D EX MA S
(CPU) D EX MA S
(FPU) f1
(FPU) f1
(FPU) f1
(FPU) f1 F2 FS

36. Single-precision floating-point computation: 1 issue cycle
FCMP/EQ, FCMP/GT, FADD, FLOAT, FMAC, FMUL, FSUB, FTRC, FRCHG, FSCHG
I D F1 F2 FS

37. Single-precision FDIV/SQRT: 1 issue cycle
I D F1 F2 FS
F3
F1 F2 FS

38. Double-precision floating-point computation 1: 1 issue cycle
FCNVDS, FCNVSD, FLOAT, FTRC
I D F1 F2 FS
d F1 F2 FS

39. Double-precision floating-point computation 2: 1 issue cycle
FADD, FMUL, FSUB
(Stage rows only partly recoverable: d F1 F2 FS, d F1 F2 FS, d F1 F2 FS, F1 F2 FS)

40. Double-precision FCMP: 2 issue cycles
FCMP/EQ, FCMP/GT
I D F1 F2 FS
D F1 F2 FS

41. Double-precision FDIV/SQRT: 1 issue cycle
FDIV, FSQRT
(Stage rows only partly recoverable: I D F1 F2 FS, F3, F1 F2 FS, d F1 F2, F1 F2 FS, F1 F2 FS)

42. FIPR: 1 issue cycle
I D F0 F1 F2 FS

43. FTRV: 1 issue cycle
I D F0 F1 F2 FS
d F0 F1 F2 FS
d F0 F1 F2 FS
d F0 F1 F2 FS

Notes:
(Plain stage box): Cannot overlap a stage of the same kind, except when two instructions are executed in parallel.
D: Locks D-stage
d: Register read only
(Second stage-box symbol, not recoverable): Locks, but no operation is executed.
f1: Can overlap another f1, but not another F1.

Figure 8.2 Instruction Execution Patterns


8.2 Parallel-Executability

Instructions are categorized into six groups according to the internal function blocks used, as shown in table 8.1. Table 8.2 shows the parallel-executability of pairs of instructions in terms of groups. For example, ADD in the EX group and BRA in the BR group can be executed in parallel.

Table 8.1 Instruction Groups

1. MT Group

CLRT CMP/HI Rm,Rn MOV Rm,Rn

CMP/EQ #imm,R0 CMP/HS Rm,Rn NOP

CMP/EQ Rm,Rn CMP/PL Rn SETT

CMP/GE Rm,Rn CMP/PZ Rn TST #imm,R0

CMP/GT Rm,Rn CMP/STR Rm,Rn TST Rm,Rn

2. EX Group

ADD #imm,Rn MOVT Rn SHLL2 Rn

ADD Rm,Rn NEG Rm,Rn SHLL8 Rn

ADDC Rm,Rn NEGC Rm,Rn SHLR Rn

ADDV Rm,Rn NOT Rm,Rn SHLR16 Rn

AND #imm,R0 OR #imm,R0 SHLR2 Rn

AND Rm,Rn OR Rm,Rn SHLR8 Rn

DIV0S Rm,Rn ROTCL Rn SUB Rm,Rn

DIV0U ROTCR Rn SUBC Rm,Rn

DIV1 Rm,Rn ROTL Rn SUBV Rm,Rn

DT Rn ROTR Rn SWAP.B Rm,Rn

EXTS.B Rm,Rn SHAD Rm,Rn SWAP.W Rm,Rn

EXTS.W Rm,Rn SHAL Rn XOR #imm,R0

EXTU.B Rm,Rn SHAR Rn XOR Rm,Rn

EXTU.W Rm,Rn SHLD Rm,Rn XTRCT Rm,Rn

MOV #imm,Rn SHLL Rn

MOVA @(disp,PC),R0 SHLL16 Rn

3. BR Group

BF disp BRA disp BT disp

BF/S disp BSR disp BT/S disp


4. LS Group

FABS DRn FMOV.S @Rm+,FRn MOV.L R0,@(disp,GBR)

FABS FRn FMOV.S FRm,@(R0,Rn) MOV.L Rm,@(disp,Rn)

FLDI0 FRn FMOV.S FRm,@-Rn MOV.L Rm,@(R0,Rn)

FLDI1 FRn FMOV.S FRm,@Rn MOV.L Rm,@-Rn

FLDS FRm,FPUL FNEG DRn MOV.L Rm,@Rn

FMOV @(R0,Rm),DRn FNEG FRn MOV.W @(disp,GBR),R0

FMOV @(R0,Rm),XDn FSTS FPUL,FRn MOV.W @(disp,PC),Rn

FMOV @Rm,DRn LDS Rm,FPUL MOV.W @(disp,Rm),R0

FMOV @Rm,XDn MOV.B @(disp,GBR),R0 MOV.W @(R0,Rm),Rn

FMOV @Rm+,DRn MOV.B @(disp,Rm),R0 MOV.W @Rm,Rn

FMOV @Rm+,XDn MOV.B @(R0,Rm),Rn MOV.W @Rm+,Rn

FMOV DRm,@(R0,Rn) MOV.B @Rm,Rn MOV.W R0,@(disp,GBR)

FMOV DRm,@-Rn MOV.B @Rm+,Rn MOV.W R0,@(disp,Rn)

FMOV DRm,@Rn MOV.B R0,@(disp,GBR) MOV.W Rm,@(R0,Rn)

FMOV DRm,DRn MOV.B R0,@(disp,Rn) MOV.W Rm,@-Rn

FMOV DRm,XDn MOV.B Rm,@(R0,Rn) MOV.W Rm,@Rn

FMOV FRm,FRn MOV.B Rm,@-Rn MOVCA.L R0,@Rn

FMOV XDm,@(R0,Rn) MOV.B Rm,@Rn OCBI @Rn

FMOV XDm,@-Rn MOV.L @(disp,GBR),R0 OCBP @Rn

FMOV XDm,@Rn MOV.L @(disp,PC),Rn OCBWB @Rn

FMOV XDm,DRn MOV.L @(disp,Rm),Rn PREF @Rn

FMOV XDm,XDn MOV.L @(R0,Rm),Rn STS FPUL,Rn

FMOV.S @(R0,Rm),FRn MOV.L @Rm,Rn

FMOV.S @Rm,FRn MOV.L @Rm+,Rn



5. FE Group

FADD DRm,DRn FIPR FVm,FVn FSQRT DRn

FADD FRm,FRn FLOAT FPUL,DRn FSQRT FRn

FCMP/EQ FRm,FRn FLOAT FPUL,FRn FSUB DRm,DRn

FCMP/GT FRm,FRn FMAC FR0,FRm,FRn FSUB FRm,FRn

FCNVDS DRm,FPUL FMUL DRm,DRn FTRC DRm,FPUL

FCNVSD FPUL,DRn FMUL FRm,FRn FTRC FRm,FPUL

FDIV DRm,DRn FRCHG FTRV XMTRX,FVn

FDIV FRm,FRn FSCHG



6. CO Group

AND.B #imm,@(R0,GBR) LDS Rm,FPSCR STC SR,Rn

BRAF Rm LDS Rm,MACH STC SSR,Rn

BSRF Rm LDS Rm,MACL STC VBR,Rn

CLRMAC LDS Rm,PR STC.L DBR,@-Rn

CLRS LDS.L @Rm+,FPSCR STC.L GBR,@-Rn

DMULS.L Rm,Rn LDS.L @Rm+,FPUL STC.L Rp_BANK,@-Rn

DMULU.L Rm,Rn LDS.L @Rm+,MACH STC.L SGR,@-Rn

FCMP/EQ DRm,DRn LDS.L @Rm+,MACL STC.L SPC,@-Rn

FCMP/GT DRm,DRn LDS.L @Rm+,PR STC.L SR,@-Rn

JMP @Rn LDTLB STC.L SSR,@-Rn

JSR @Rn MAC.L @Rm+,@Rn+ STC.L VBR,@-Rn

LDC Rm,DBR MAC.W @Rm+,@Rn+ STS FPSCR,Rn

LDC Rm,GBR MUL.L Rm,Rn STS MACH,Rn

LDC Rm,Rp_BANK MULS.W Rm,Rn STS MACL,Rn

LDC Rm,SPC MULU.W Rm,Rn STS PR,Rn

LDC Rm,SR OR.B #imm,@(R0,GBR) STS.L FPSCR,@-Rn

LDC Rm,SSR RTE STS.L FPUL,@-Rn

LDC Rm,VBR RTS STS.L MACH,@-Rn

LDC.L @Rm+,DBR SETS STS.L MACL,@-Rn

LDC.L @Rm+,GBR SLEEP STS.L PR,@-Rn

LDC.L @Rm+,Rp_BANK STC DBR,Rn TAS.B @Rn

LDC.L @Rm+,SPC STC GBR,Rn TRAPA #imm

LDC.L @Rm+,SR STC Rp_BANK,Rn TST.B #imm,@(R0,GBR)

LDC.L @Rm+,SSR STC SGR,Rn XOR.B #imm,@(R0,GBR)

LDC.L @Rm+,VBR STC SPC,Rn


Table 8.2 Parallel-Executability

                   2nd Instruction
1st Instruction    MT  EX  BR  LS  FE  CO
MT                 O   O   O   O   O   X
EX                 O   X   O   O   O   X
BR                 O   O   X   O   O   X
LS                 O   O   O   X   O   X
FE                 O   O   O   O   X   X
CO                 X   X   X   X   X   X

O: Can be executed in parallel
X: Cannot be executed in parallel
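Table 8.2 can be transcribed as a small lookup, which may be convenient when checking instruction pairings by hand or in a scheduling script. This is a sketch only; the names `PARALLEL` and `can_pair` are hypothetical, and the lookup covers group compatibility alone, not dependencies or resource locks.

```python
GROUPS = ("MT", "EX", "BR", "LS", "FE", "CO")

# Rows: group of the 1st instruction; columns follow GROUPS (2nd instruction).
PARALLEL = {
    "MT": "OOOOOX",
    "EX": "OXOOOX",
    "BR": "OOXOOX",
    "LS": "OOOXOX",
    "FE": "OOOOXX",
    "CO": "XXXXXX",
}


def can_pair(first: str, second: str) -> bool:
    """True if the two instruction groups may be executed in parallel."""
    if first not in PARALLEL or second not in GROUPS:
        raise ValueError("unknown instruction group")
    return PARALLEL[first][GROUPS.index(second)] == "O"


print(can_pair("EX", "BR"))   # ADD (EX) with BRA (BR)
print(can_pair("EX", "EX"))   # SHAD (EX) with ADD (EX)
```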

8.3 Execution Cycles and Pipeline Stalling

There are three basic clocks in this processor: the I-clock, B-clock, and P-clock. Each hardware unit operates on one of these clocks, as follows:

• I-clock: CPU, FPU, MMU, caches

• B-clock: External bus controller

• P-clock: Peripheral units

The frequency ratios of the three clocks are determined with the frequency control register (FRQCR). In this section, machine cycles are based on the I-clock unless otherwise specified. For details of FRQCR, see section 10, Clock Oscillation Circuits.

Instruction execution cycles are summarized in table 8.3. Penalty cycles due to a pipeline stall or freeze are not considered in this table. The table gives the following items for each instruction:

• Issue rate: Interval between the issue of an instruction and that of the next instruction

• Latency: Interval between the issue of an instruction and the generation of its result (completion)

• Instruction execution pattern (see figure 8.2)

• Locked pipeline stages

• Interval between the issue of an instruction and the start of locking

• Lock time: Period of locking in machine cycle units


The instruction execution sequence is expressed as a combination of the execution patterns shown in figure 8.2. One instruction is separated from the next by the number of machine cycles for its issue rate. Normally, execution, data access, and write-back stages cannot be overlapped onto the same stages of another instruction; the only exception is when two instructions are executed in parallel under parallel-executability conditions. Refer to (a) through (d) in figure 8.3 for some simple examples.

Latency is the interval between issue and completion of an instruction, and is also the interval between the execution of two instructions with an interdependent relationship. When there is interdependency between two instructions fetched simultaneously, the latter of the two is stalled for the following number of cycles:

• (Latency) cycles when there is flow dependency (read-after-write)

• (Latency - 1) or (latency - 2) cycles when there is output dependency (write-after-write):

(Latency - 1) cycles when the preceding instruction is a single- or double-precision FDIV or FSQRT

(Latency - 2) cycles when the preceding instruction is any other FE group instruction

• 5 or 2 cycles when there is anti-flow dependency (write-after-read), as in the following cases:

5 cycles when the preceding instruction is FTRV

2 cycles when the preceding instruction is a double-precision FADD, FSUB, or FMUL
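The stall rules listed above can be written as a small function. This is a minimal sketch for two instructions fetched simultaneously; it ignores the exceptional latency adjustments described next, and the function name and argument conventions are assumptions made for illustration.

```python
def stall_cycles(dependency: str, latency: int = 0, preceding: str = "",
                 double_precision: bool = False) -> int:
    """Stall of the second instruction, per the basic dependency rules.

    dependency: "flow" (read-after-write), "output" (write-after-write),
                or "anti" (write-after-read).
    latency:    latency of the preceding instruction (longest one for
                output dependency).
    preceding:  mnemonic of the preceding instruction, e.g. "FDIV".
    """
    if dependency == "flow":
        return latency
    if dependency == "output":
        if preceding in ("FDIV", "FSQRT"):
            return latency - 1
        return latency - 2          # any other FE group instruction
    if dependency == "anti":
        if preceding == "FTRV":
            return 5
        if double_precision and preceding in ("FADD", "FSUB", "FMUL"):
            return 2
        return 0                    # no anti-flow stall in other cases
    raise ValueError("unknown dependency type")


# FSQRT FR4 followed by FMOV FR0,FR4, with an 11-cycle FSQRT latency.
print(stall_cycles("output", latency=11, preceding="FSQRT"))
```

With an 8-cycle longest latency, a double-precision FADD followed by a write to the same register gives 6 stall cycles under the same rule.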

In the case of flow dependency, latency may be exceptionally increased or decreased, depending on the combination of sequential instructions (figure 8.3 (e)).

• When a floating-point (FP) computation is followed by an FP register store, the latency of the FP computation may be decreased by 1 cycle.

• If there is a load of the shift amount immediately before an SHAD/SHLD instruction, the latency of the load is increased by 1 cycle.

• If an instruction with a latency of less than 2 cycles, including write-back to an FP register, is followed by a double-precision FP instruction, FIPR, or FTRV, the latency of the first instruction is increased to 2 cycles.

The number of cycles in a pipeline stall due to flow dependency will vary depending on the combination of interdependent instructions or the fetch timing (see figure 8.3 (e)).

Output dependency occurs when the destination operands are the same in a preceding FE group instruction and a following LS group instruction.

For the stall cycles of an instruction with output dependency, the longest latency to the last write-back among all the destination operands must be applied instead of “latency” (see figure 8.3 (f)). A stall due to output dependency with respect to FPSCR, which reflects the result of an FP operation, never occurs. For example, when FADD follows FDIV with no dependency between FP registers, FADD is not stalled even if both instructions update the cause field of FPSCR.


Anti-flow dependency can occur only between a preceding double-precision FADD, FMUL, FSUB, or FTRV and a following FMOV, FLDI0, FLDI1, FABS, FNEG, or FSTS. See figure 8.3 (g).

If an executing instruction locks any resource (i.e. a function block that performs a basic operation), a following instruction that attempts to use the locked resource must be stalled (figure 8.3 (h)). This kind of stall can be compensated by inserting one or more instructions independent of the locked resource to separate the interfering instructions. For example, when a load instruction and an ADD instruction that references the loaded value are consecutive, the 2-cycle stall of the ADD is eliminated by inserting three instructions without dependency. Software performance can be improved by such instruction scheduling.

Other penalties arise in the event of exceptions or external data accesses, as follows.

• Instruction TLB miss: a penalty of 7 CPU clocks

• Instruction access to external memory (instruction cache miss, etc.)

• Data access to external memory (operand cache miss, etc.): a penalty of 2 CPU clocks + 3 bus clocks

• Data access to a memory-mapped control register. The penalty differs from register to register, and depends on the kind of operation (read or write), the clock mode, and the bus use conditions when the access is made.

During the penalty cycles of an instruction TLB miss or external instruction access, no instruction is issued, but execution of instructions that have already been issued continues. The penalty for a data access is a pipeline freeze: that is, the execution of uncompleted instructions is interrupted until the arrival of the requested data. The number of penalty cycles for instruction and data accesses is largely dependent on the user's memory subsystems.
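Because machine cycles in this section are counted in I-clocks, the "2 CPU clocks + 3 bus clocks" penalty for an external data access can be converted once the I-clock to B-clock frequency ratio is known. The sketch below assumes an integer ratio chosen through FRQCR and ignores any additional wait cycles of the memory subsystem; the function name is hypothetical.

```python
def data_access_penalty_iclocks(i_to_b_ratio: int) -> int:
    """Minimum external data access penalty, expressed in I-clock cycles.

    i_to_b_ratio: I-clock frequency divided by B-clock frequency
                  (assumed to be a positive integer).
    """
    if i_to_b_ratio < 1:
        raise ValueError("ratio must be at least 1")
    cpu_clocks = 2
    bus_clocks = 3
    return cpu_clocks + bus_clocks * i_to_b_ratio


# With the I-clock running three times faster than the B-clock:
print(data_access_penalty_iclocks(3))
```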


The stage-timing diagrams of this figure are not reproduced; the instruction sequences, annotations, and explanatory notes are listed for each example.

(a) Serial execution: non-parallel-executable instructions

    SHAD R0,R1
    ADD R2,R3
    next

1 issue cycle. EX-group SHAD and EX-group ADD cannot be executed in parallel. Therefore, SHAD is issued first, and the following ADD is recombined with the next instruction.

(b) Parallel execution: parallel-executable and no dependency

    ADD R2,R1
    MOV.L @R4,R5

1 issue cycle. EX-group ADD and LS-group MOV.L can be executed in parallel. Overlapping of stages in the 2nd instruction is possible.

(c) Issue rate: multi-step instruction

    AND.B #1,@(R0,GBR)
    MOV R1,R2
    next

4 issue cycles. AND.B and MOV are fetched simultaneously, but MOV is stalled due to resource locking. After the lock is released, MOV is refetched together with the next instruction.

(d) Branch

    BT/S L_far
    ADD R0,R1
    SUB R2,R3

No stall occurs if the branch is not taken.

    BT/S L_far
    ADD R0,R1
    L_far:

2-cycle latency for I-stage of branch destination; 1 stall cycle. If the branch is taken, the I-stage of the branch destination is stalled for the period of latency. This stall can be covered with a delay slot instruction which is not parallel-executable with the branch instruction.

    BT L_skip
    ADD #1,R0
    L_skip:

No stall. Even if the BT/BF branch is taken, the I-stage of the branch destination is not stalled if the displacement is zero.

(e) Flow dependency

    MOV R0,R1
    ADD R2,R1

Zero-cycle latency. The following instruction, ADD, is not stalled when executed after an instruction with zero-cycle latency, even if there is dependency.

    ADD R2,R1
    MOV.L @R1,R1
    next

1-cycle latency; 1 stall cycle. ADD and MOV.L are not executed in parallel, since MOV.L references the result of ADD as its destination address.

    MOV.L @R1,R1
    ADD R0,R1
    next

2-cycle latency; 1 stall cycle. Because MOV.L and ADD are not fetched simultaneously in this example, ADD is stalled for only 1 cycle even though the latency of MOV.L is 2 cycles.

    MOV.L @R1,R1
    SHAD R1,R2
    next

2-cycle latency with 1-cycle increase; 2 stall cycles. Due to the flow dependency between the load and the SHAD/SHLD shift amount, the latency of the load is increased to 3 cycles.

    FADD FR1,FR2
    STS FPUL,R1
    STS FPSCR,R2

4-cycle latency for FPSCR; 2 stall cycles.

    FADD DR0,DR2
    FMOV FR3,FR5
    FMOV FR2,FR4

7-cycle latency for lower FR; 8-cycle latency for upper FR (FR3 write, FR2 write).

    FLOAT FPUL,DR0
    FMOV.S FR0,@-R15

3-cycle latency for upper/lower FR (FR1 write, FR0 write).

    FLDI1 FR3
    FIPR FV0,FV4

Zero-cycle latency with 3-cycle increase; 3 stall cycles.

    FMOV @R1,XD14
    FTRV XMTRX,FV0

2-cycle latency with 1-cycle increase; 3 stall cycles.

    LDS R0,FPUL
    FLOAT FPUL,FR0
    LDS R1,FPUL
    FLOAT FPUL,FR1

Effectively 1-cycle latency for consecutive LDS/FLOAT instructions.

    FTRC FR0,FPUL
    STS FPUL,R0
    FTRC FR1,FPUL
    STS FPUL,R1

Effectively 1-cycle latency for consecutive FTRC/STS instructions.

(f) Output dependency

    FSQRT FR4
    FMOV FR0,FR4

11-cycle latency; 10 stall cycles = latency (11) - 1. The registers are written back in program order.

    FADD DR0,DR2
    FMOV FR0,FR3

7-cycle latency for lower FR; 8-cycle latency for upper FR (FR2 write, FR3 write); 6 stall cycles = longest latency (8) - 2.

(g) Anti-flow dependency

    FTRV XMTRX,FV0
    FMOV @R1,XD0

5 stall cycles.

    FADD DR0,DR2
    FMOV FR4,FR1

2 stall cycles.

(h) Resource conflict

    FDIV FR6,FR7
    FMAC FR0,FR8,FR9
    FMAC FR0,FR10,FR11
    ...
    FMAC FR0,FR12,FR13

Cycles #1 to #12 are shown. The FMAC instructions are issued at 1 cycle/issue during the FDIV latency; the F1 stage is locked for 1 cycle, giving 1 stall cycle (F1 stage resource conflict).

    FIPR FV8,FV0
    FADD FR15,FR4

1 stall cycle.

    LDS.L @R15+,PR
    STC GBR,R2

3 stall cycles.

    FADD DR0,DR2
    MAC.W @R1+,@R2+

5 stall cycles.

    MAC.W @R1+,@R2+
    MAC.W @R1+,@R2+
    FADD DR4,DR6

Stalls of 3 cycles, 1 cycle, and 2 cycles are marked in the figure. The f1 stage can overlap a preceding f1, but F1 cannot overlap f1.

Figure 8.3 Examples of Pipelined Execution


Table 8.3 Execution Cycles

Columns: No., Instruction, Instruction Group, Issue Rate, Latency, Execution Pattern, Lock (Stage, Start, Cycles). Rows are grouped by functional category.

1 EXTS.B Rm,Rn EX 1 1 #1 — — —

2 EXTS.W Rm,Rn EX 1 1 #1 — — —

3 EXTU.B Rm,Rn EX 1 1 #1 — — —

4 EXTU.W Rm,Rn EX 1 1 #1 — — —

5 MOV Rm,Rn MT 1 0 #1 — — —

6 MOV #imm,Rn EX 1 1 #1 — — —

7 MOVA @(disp,PC),R0 EX 1 1 #1 — — —

8 MOV.W @(disp,PC),Rn LS 1 2 #2 — — —

9 MOV.L @(disp,PC),Rn LS 1 2 #2 — — —

10 MOV.B @Rm,Rn LS 1 2 #2 — — —

11 MOV.W @Rm,Rn LS 1 2 #2 — — —

12 MOV.L @Rm,Rn LS 1 2 #2 — — —

13 MOV.B @Rm+,Rn LS 1 1/2 #2 — — —

14 MOV.W @Rm+,Rn LS 1 1/2 #2 — — —

15 MOV.L @Rm+,Rn LS 1 1/2 #2 — — —

16 MOV.B @(disp,Rm),R0 LS 1 2 #2 — — —

17 MOV.W @(disp,Rm),R0 LS 1 2 #2 — — —

18 MOV.L @(disp,Rm),Rn LS 1 2 #2 — — —

19 MOV.B @(R0,Rm),Rn LS 1 2 #2 — — —

20 MOV.W @(R0,Rm),Rn LS 1 2 #2 — — —

21 MOV.L @(R0,Rm),Rn LS 1 2 #2 — — —

22 MOV.B @(disp,GBR),R0 LS 1 2 #3 — — —

23 MOV.W @(disp,GBR),R0 LS 1 2 #3 — — —

24 MOV.L @(disp,GBR),R0 LS 1 2 #3 — — —

25 MOV.B Rm,@Rn LS 1 1 #2 — — —

26 MOV.W Rm,@Rn LS 1 1 #2 — — —

27 MOV.L Rm,@Rn LS 1 1 #2 — — —

28 MOV.B Rm,@-Rn LS 1 1/1 #2 — — —

29 MOV.W Rm,@-Rn LS 1 1/1 #2 — — —

30 MOV.L Rm,@-Rn LS 1 1/1 #2 — — —

Functional category for Nos. 1–31: Data transfer instructions

31 MOV.B R0,@(disp,Rn) LS 1 1 #2 — — —



32 MOV.W R0,@(disp,Rn) LS 1 1 #2 — — —

33 MOV.L Rm,@(disp,Rn) LS 1 1 #2 — — —

34 MOV.B Rm,@(R0,Rn) LS 1 1 #2 — — —

35 MOV.W Rm,@(R0,Rn) LS 1 1 #2 — — —

36 MOV.L Rm,@(R0,Rn) LS 1 1 #2 — — —

37 MOV.B R0,@(disp,GBR) LS 1 1 #3 — — —

38 MOV.W R0,@(disp,GBR) LS 1 1 #3 — — —

39 MOV.L R0,@(disp,GBR) LS 1 1 #3 — — —

40 MOVCA.L R0,@Rn LS 1 3–7 #12 MA 4 3–7

41 MOVT Rn EX 1 1 #1 — — —

42 OCBI @Rn LS 1 1–2 #10 MA 4 1–2

43 OCBP @Rn LS 1 1–5 #11 MA 4 1–5

44 OCBWB @Rn LS 1 1–5 #11 MA 4 1–5

45 PREF @Rn LS 1 1 #2 — — —

46 SWAP.B Rm,Rn EX 1 1 #1 — — —

47 SWAP.W Rm,Rn EX 1 1 #1 — — —

Functional category for Nos. 32–48: Data transfer instructions

48 XTRCT Rm,Rn EX 1 1 #1 — — —

49 ADD Rm,Rn EX 1 1 #1 — — —

50 ADD #imm,Rn EX 1 1 #1 — — —

51 ADDC Rm,Rn EX 1 1 #1 — — —

52 ADDV Rm,Rn EX 1 1 #1 — — —

53 CMP/EQ #imm,R0 MT 1 1 #1 — — —

54 CMP/EQ Rm,Rn MT 1 1 #1 — — —

55 CMP/GE Rm,Rn MT 1 1 #1 — — —

56 CMP/GT Rm,Rn MT 1 1 #1 — — —

57 CMP/HI Rm,Rn MT 1 1 #1 — — —

58 CMP/HS Rm,Rn MT 1 1 #1 — — —

59 CMP/PL Rn MT 1 1 #1 — — —

60 CMP/PZ Rn MT 1 1 #1 — — —

61 CMP/STR Rm,Rn MT 1 1 #1 — — —

Functional category for Nos. 49–62: Fixed-point arithmetic instructions

62 DIV0S Rm,Rn EX 1 1 #1 — — —



63 DIV0U EX 1 1 #1 — — —

64 DIV1 Rm,Rn EX 1 1 #1 — — —

65 DMULS.L Rm,Rn CO 2 4/4 #34 F1 4 2

66 DMULU.L Rm,Rn CO 2 4/4 #34 F1 4 2

67 DT Rn EX 1 1 #1 — — —

68 MAC.L @Rm+,@Rn+ CO 2 2/2/4/4 #35 F1 4 2

69 MAC.W @Rm+,@Rn+ CO 2 2/2/4/4 #35 F1 4 2

70 MUL.L Rm,Rn CO 2 4/4 #34 F1 4 2

71 MULS.W Rm,Rn CO 2 4/4 #34 F1 4 2

72 MULU.W Rm,Rn CO 2 4/4 #34 F1 4 2

73 NEG Rm,Rn EX 1 1 #1 — — —

74 NEGC Rm,Rn EX 1 1 #1 — — —

75 SUB Rm,Rn EX 1 1 #1 — — —

76 SUBC Rm,Rn EX 1 1 #1 — — —

Functional category for Nos. 63–77: Fixed-point arithmetic instructions

77 SUBV Rm,Rn EX 1 1 #1 — — —

78 AND Rm,Rn EX 1 1 #1 — — —

79 AND #imm,R0 EX 1 1 #1 — — —

80 AND.B #imm,@(R0,GBR) CO 4 4 #6 — — —

81 NOT Rm,Rn EX 1 1 #1 — — —

82 OR Rm,Rn EX 1 1 #1 — — —

83 OR #imm,R0 EX 1 1 #1 — — —

84 OR.B #imm,@(R0,GBR) CO 4 4 #6 — — —

85 TAS.B @Rn CO 5 5 #7 — — —

86 TST Rm,Rn MT 1 1 #1 — — —

87 TST #imm,R0 MT 1 1 #1 — — —

88 TST.B #imm,@(R0,GBR) CO 3 3 #5 — — —

89 XOR Rm,Rn EX 1 1 #1 — — —

90 XOR #imm,R0 EX 1 1 #1 — — —

Functional category for Nos. 78–91: Logical instructions

91 XOR.B #imm,@(R0,GBR) CO 4 4 #6 — — —



92 ROTL Rn EX 1 1 #1 — — —

93 ROTR Rn EX 1 1 #1 — — —

94 ROTCL Rn EX 1 1 #1 — — —

95 ROTCR Rn EX 1 1 #1 — — —

96 SHAD Rm,Rn EX 1 1 #1 — — —

97 SHAL Rn EX 1 1 #1 — — —

98 SHAR Rn EX 1 1 #1 — — —

99 SHLD Rm,Rn EX 1 1 #1 — — —

100 SHLL Rn EX 1 1 #1 — — —

101 SHLL2 Rn EX 1 1 #1 — — —

102 SHLL8 Rn EX 1 1 #1 — — —

103 SHLL16 Rn EX 1 1 #1 — — —

104 SHLR Rn EX 1 1 #1 — — —

105 SHLR2 Rn EX 1 1 #1 — — —

106 SHLR8 Rn EX 1 1 #1 — — —

Shift instructions

107 SHLR16 Rn EX 1 1 #1 — — —

108 BF disp BR 1 2 (or 1) #1 — — —

109 BF/S disp BR 1 2 (or 1) #1 — — —

110 BT disp BR 1 2 (or 1) #1 — — —

111 BT/S disp BR 1 2 (or 1) #1 — — —

112 BRA disp BR 1 2 #1 — — —

113 BRAF Rn CO 2 3 #4 — — —

114 BSR disp BR 1 2 #14 SX 3 2

115 BSRF Rn CO 2 3 #24 SX 3 2

116 JMP @Rn CO 2 3 #4 — — —

117 JSR @Rn CO 2 3 #24 SX 3 2

Branch instructions

118 RTS CO 2 3 #4 — — —



119 NOP MT 1 0 #1 — — —

120 CLRMAC CO 1 3 #28 F1 3 2

121 CLRS CO 1 1 #1 — — —

122 CLRT MT 1 1 #1 — — —

123 SETS CO 1 1 #1 — — —

124 SETT MT 1 1 #1 — — —

125 TRAPA #imm CO 7 7 #13 — — —

126 RTE CO 5 5 #8 — — —

127 SLEEP CO 4 4 #9 — — —

128 LDTLB CO 1 1 #2 — — —

129 LDC Rm,DBR CO 1 3 #14 SX 3 2

130 LDC Rm,GBR CO 3 3 #15 SX 3 2

131 LDC Rm,Rp_BANK CO 1 3 #14 SX 3 2

132 LDC Rm,SR CO 4 4 #16 SX 3 2

133 LDC Rm,SSR CO 1 3 #14 SX 3 2

134 LDC Rm,SPC CO 1 3 #14 SX 3 2

135 LDC Rm,VBR CO 1 3 #14 SX 3 2

136 LDC.L @Rm+,DBR CO 1 1/3 #17 SX 3 2

137 LDC.L @Rm+,GBR CO 3 3/3 #18 SX 3 2

138 LDC.L @Rm+,Rp_BANK CO 1 1/3 #17 SX 3 2

139 LDC.L @Rm+,SR CO 4 4/4 #19 SX 3 2

140 LDC.L @Rm+,SSR CO 1 1/3 #17 SX 3 2

141 LDC.L @Rm+,SPC CO 1 1/3 #17 SX 3 2

142 LDC.L @Rm+,VBR CO 1 1/3 #17 SX 3 2

143 LDS Rm,MACH CO 1 3 #28 F1 3 2

144 LDS Rm,MACL CO 1 3 #28 F1 3 2

145 LDS Rm,PR CO 2 3 #24 SX 3 2

146 LDS.L @Rm+,MACH CO 1 1/3 #29 F1 3 2

147 LDS.L @Rm+,MACL CO 1 1/3 #29 F1 3 2

148 LDS.L @Rm+,PR CO 2 2/3 #25 SX 3 2

149 STC DBR,Rn CO 2 2 #20 — — —

System control instructions

150 STC SGR,Rn CO 3 3 #21 — — —



151 STC GBR,Rn CO 2 2 #20 — — —

152 STC Rp_BANK,Rn CO 2 2 #20 — — —

153 STC SR,Rn CO 2 2 #20 — — —

154 STC SSR,Rn CO 2 2 #20 — — —

155 STC SPC,Rn CO 2 2 #20 — — —

156 STC VBR,Rn CO 2 2 #20 — — —

157 STC.L DBR,@-Rn CO 2 2/2 #22 — — —

158 STC.L SGR,@-Rn CO 3 3/3 #23 — — —

159 STC.L GBR,@-Rn CO 2 2/2 #22 — — —

160 STC.L Rp_BANK,@-Rn CO 2 2/2 #22 — — —

161 STC.L SR,@-Rn CO 2 2/2 #22 — — —

162 STC.L SSR,@-Rn CO 2 2/2 #22 — — —

163 STC.L SPC,@-Rn CO 2 2/2 #22 — — —

164 STC.L VBR,@-Rn CO 2 2/2 #22 — — —

165 STS MACH,Rn CO 1 3 #30 — — —

166 STS MACL,Rn CO 1 3 #30 — — —

167 STS PR,Rn CO 2 2 #26 — — —

168 STS.L MACH,@-Rn CO 1 1/1 #31 — — —

169 STS.L MACL,@-Rn CO 1 1/1 #31 — — —

System control instructions

170 STS.L PR,@-Rn CO 2 2/2 #27 — — —

171 FLDI0 FRn LS 1 0 #1 — — —

172 FLDI1 FRn LS 1 0 #1 — — —

173 FMOV FRm,FRn LS 1 0 #1 — — —

174 FMOV.S @Rm,FRn LS 1 2 #2 — — —

175 FMOV.S @Rm+,FRn LS 1 1/2 #2 — — —

176 FMOV.S @(R0,Rm),FRn LS 1 2 #2 — — —

177 FMOV.S FRm,@Rn LS 1 1 #2 — — —

178 FMOV.S FRm,@-Rn LS 1 1/1 #2 — — —

179 FMOV.S FRm,@(R0,Rn) LS 1 1 #2 — — —

180 FLDS FRm,FPUL LS 1 0 #1 — — —

Single-precision floating-point instructions

181 FSTS FPUL,FRn LS 1 0 #1 — — —



182 FABS FRn LS 1 0 #1 — — —

183 FADD FRm,FRn FE 1 3/4 #36 — — —

184 FCMP/EQ FRm,FRn FE 1 2/4 #36 — — —

185 FCMP/GT FRm,FRn FE 1 2/4 #36 — — —

186 FDIV FRm,FRn FE 1 12/13 #37 F3 2 10

F1 11 1

187 FLOAT FPUL,FRn FE 1 3/4 #36 — — —

188 FMAC FR0,FRm,FRn FE 1 3/4 #36 — — —

189 FMUL FRm,FRn FE 1 3/4 #36 — — —

190 FNEG FRn LS 1 0 #1 — — —

191 FSQRT FRn FE 1 11/12 #37 F3 2 9

F1 10 1

192 FSUB FRm,FRn FE 1 3/4 #36 — — —

193 FTRC FRm,FPUL FE 1 3/4 #36 — — —

194 FMOV DRm,DRn LS 1 0 #1 — — —

195 FMOV @Rm,DRn LS 1 2 #2 — — —

196 FMOV @Rm+,DRn LS 1 1/2 #2 — — —

197 FMOV @(R0,Rm),DRn LS 1 2 #2 — — —

198 FMOV DRm,@Rn LS 1 1 #2 — — —

199 FMOV DRm,@-Rn LS 1 1/1 #2 — — —

Single-precision floating-point instructions

200 FMOV DRm,@(R0,Rn) LS 1 1 #2 — — —

201 FABS DRn LS 1 0 #1 — — —

202 FADD DRm,DRn FE 1 (7, 8)/9 #39 F1 2 6

203 FCMP/EQ DRm,DRn CO 2 3/5 #40 F1 2 2

204 FCMP/GT DRm,DRn CO 2 3/5 #40 F1 2 2

205 FCNVDS DRm,FPUL FE 1 4/5 #38 F1 2 2

206 FCNVSD FPUL,DRn FE 1 (3, 4)/5 #38 F1 2 2

207 FDIV DRm,DRn FE 1 (24, 25)/26 #41 F3 2 23

F1 22 3

F1 2 2

208 FLOAT FPUL,DRn FE 1 (3, 4)/5 #38 F1 2 2

Double-precision floating-point instructions

209 FMUL DRm,DRn FE 1 (7, 8)/9 #39 F1 2 6



210 FNEG DRn LS 1 0 #1 — — —

211 FSQRT DRn FE 1 (23, 24)/25 #41 F3 2 22

F1 21 3

F1 2 2

212 FSUB DRm,DRn FE 1 (7, 8)/9 #39 F1 2 6

Double-precision floating-point instructions

213 FTRC DRm,FPUL FE 1 4/5 #38 F1 2 2

214 LDS Rm,FPUL LS 1 1 #1 — — —

215 LDS Rm,FPSCR CO 1 4 #32 F1 3 3

216 LDS.L @Rm+,FPUL CO 1 1/2 #2 — — —

217 LDS.L @Rm+,FPSCR CO 1 1/4 #33 F1 3 3

218 STS FPUL,Rn LS 1 3 #1 — — —

219 STS FPSCR,Rn CO 1 3 #1 — — —

220 STS.L FPUL,@-Rn CO 1 1/1 #2 — — —

FPU system control instructions

221 STS.L FPSCR,@-Rn CO 1 1/1 #2 — — —

222 FMOV DRm,XDn LS 1 0 #1 — — —

223 FMOV XDm,DRn LS 1 0 #1 — — —

224 FMOV XDm,XDn LS 1 0 #1 — — —

225 FMOV @Rm,XDn LS 1 2 #2 — — —

226 FMOV @Rm+,XDn LS 1 1/2 #2 — — —

227 FMOV @(R0,Rm),XDn LS 1 2 #2 — — —

228 FMOV XDm,@Rn LS 1 1 #2 — — —

229 FMOV XDm,@-Rn LS 1 1/1 #2 — — —

230 FMOV XDm,@(R0,Rn) LS 1 1 #2 — — —

231 FIPR FVm,FVn FE 1 4/5 #42 F1 3 1

232 FRCHG FE 1 1/4 #36 — — —

233 FSCHG FE 1 1/4 #36 — — —

Graphics acceleration instructions

234 FTRV XMTRX,FVn FE 1 (5, 5, 6, 7)/8 #43 F0 2 4

F1 3 4

Notes: 1. See table 8.1 for the instruction groups.

2. Latency “L1/L2...”: Latency corresponding to a write to each register, including MACH/MACL/FPSCR.
Example: MOV.B @Rm+, Rn “1/2”: The latency for Rm is 1 cycle, and the latency for Rn is 2 cycles.

3. Branch latency: Interval until the branch destination instruction is fetched.

4. Conditional branch latency “2 (or 1)”: The latency is 2 for a nonzero displacement, and 1 for a zero displacement.

5. Double-precision floating-point instruction latency “(L1, L2)/L3”: L1 is the latency for FR[n+1], L2 that for FR[n], and L3 that for FPSCR.

6. FTRV latency “(L1, L2, L3, L4)/L5”: L1 is the latency for FR[n], L2 that for FR[n+1], L3 that for FR[n+2], L4 that for FR[n+3], and L5 that for FPSCR.

7. Latency “L1/L2/L3/L4” of MAC.L and MAC.W instructions: L1 is the latency for Rm, L2 that for Rn, L3 that for MACH, and L4 that for MACL.

8. Latency “L1/L2” of MUL.L, MULS.W, MULU.W, DMULS.L, and DMULU.L instructions: L1 is the latency for MACH, and L2 that for MACL.

9. Execution pattern: The instruction execution pattern number (see figure 8.2).

10. Lock/stage: Stage locked by the instruction.

11. Lock/start: Locking start cycle; 1 is the first D-stage of the instruction.

12. Lock/cycles: Number of cycles locked.

Exceptions: 1. When a floating-point computation instruction is followed by an FMOV store, an STS FPUL, Rn instruction, or an STS.L FPUL, @-Rn instruction, the latency of the floating-point computation is decreased by 1 cycle.

2. When the preceding instruction loads the shift amount of the following SHAD/SHLD, the latency of the load is increased by 1 cycle.

3. When an LS group instruction with a latency of less than 3 cycles is followed by a double-precision floating-point instruction, FIPR, or FTRV, the latency of the first instruction is increased to 3 cycles.
Example: In the case of FMOV FR4,FR0 and FIPR FV0,FV4, FIPR is stalled for 2 cycles.

4. When MAC*/MUL*/DMUL* is followed by an STS.L MAC*, @-Rn instruction, the latency of MAC*/MUL*/DMUL* is 5 cycles.

5. In the case of consecutive executions of MAC*/MUL*/DMUL*, the latency is decreased to 2 cycles.

6. When an LDS to MAC* is followed by an STS.L MAC*, @-Rn instruction, the latency of the LDS to MAC* is 4 cycles.

7. When an LDS to MAC* is followed by MAC*/MUL*/DMUL*, the latency of the LDS to MAC* is 1 cycle.

8. When an FSCHG or FRCHG instruction is followed by an LS group instruction that reads or writes to a floating-point register, the aforementioned LS group instruction[s] cannot be executed in parallel.

9. When a single-precision FTRC instruction is followed by an STS FPUL, Rn instruction, the latency of the single-precision FTRC instruction is 1 cycle.


Section 9 Power-Down Modes

9.1 Overview

In the power-down modes, some of the on-chip peripheral modules and the CPU functions are halted, enabling power consumption to be reduced.

9.1.1 Types of Power-Down Modes

The following power-down modes and functions are provided:

• Sleep mode

• Deep sleep mode

• Standby mode

• Module standby function (TMU, RTC, SCI/SCIF, and DMAC on-chip peripheral modules)

Table 9.1 shows the conditions for entering these modes from the program execution state, the status of the CPU and peripheral modules in each mode, and the method of exiting each mode.


Table 9.1 Status of CPU and Peripheral Modules in Power-Down Modes

Power-down mode: Sleep
  Entering conditions: SLEEP instruction executed while STBY bit is 0 in STBCR
  CPG: Operating
  CPU: Halted (registers held)
  On-chip memory: Held
  On-chip peripheral modules: Operating
  Pins: Held
  External memory: Refreshing
  Exiting method: Interrupt, Reset

Power-down mode: Deep sleep
  Entering conditions: SLEEP instruction executed while STBY bit is 0 in STBCR, and DSLP bit is 1 in STBCR2
  CPG: Operating
  CPU: Halted (registers held)
  On-chip memory: Held
  On-chip peripheral modules: Operating (DMA halted)
  Pins: Held
  External memory: Self-refreshing
  Exiting method: Interrupt, Reset

Power-down mode: Standby
  Entering conditions: SLEEP instruction executed while STBY bit is 1 in STBCR
  CPG: Halted
  CPU: Halted (registers held)
  On-chip memory: Held
  On-chip peripheral modules: Halted*
  Pins: Held
  External memory: Self-refreshing
  Exiting method: Interrupt, Reset

Power-down mode: Module standby
  Entering conditions: Setting MSTP bit to 1 in STBCR
  CPG: Operating
  CPU: Operating
  On-chip memory: Held
  On-chip peripheral modules: Specified modules halted*
  Pins: Held
  External memory: Refreshing
  Exiting method: Clearing MSTP bit to 0, Reset

Note: * The RTC operates when the START bit in RCR2 is 1 (see section 11, Realtime Clock (RTC), in the Hardware Manual).



Table 9.2 shows the registers used for power-down mode control.

Table 9.2 Power-Down Mode Registers

Name                        Abbreviation  R/W  Initial Value  P4 Address  Area 7 Address  Access Size

Standby control register    STBCR         R/W  H’00           H’FFC00004  H’1FC00004      8

Standby control register 2  STBCR2        R/W  H’00           H’FFC00010  H’1FC00010      8


9.2.1 Standby Control Register (STBCR)

The standby control register (STBCR) is an 8-bit readable/writable register that specifies the power-down mode status. It is initialized to H’00 by a power-on reset via the RESET pin or due to watchdog timer overflow.

Bit: 7 6 5 4 3 2 1 0

STBY PHZ PPU MSTP4 MSTP3 MSTP2 MSTP1 MSTP0

Initial value: 0 0 0 0 0 0 0 0

R/W: R/W R/W R/W R/W R/W R/W R/W R/W

Bit 7—Standby (STBY): Specifies a transition to standby mode.

Bit 7: STBY Description

0 Transition to sleep mode on execution of SLEEP instruction (Initial value)

1 Transition to standby mode on execution of SLEEP instruction

Bit 6—Peripheral Module Pin High Impedance Control (PHZ): Controls the state of peripheral module related pins in standby mode. When the PHZ bit is set to 1, peripheral module related pins go to the high-impedance state in standby mode.

For the relevant pins, see section 9.2.2, Peripheral Module Pin High Impedance Control.

Bit 6: PHZ Description

0 Peripheral module related pins are in normal state (Initial value)

1 Peripheral module related pins go to high-impedance state


Bit 5—Peripheral Module Pin Pull-Up Control (PPU): Controls the state of peripheral module related pins. When the PPU bit is cleared to 0, the pull-up resistor is turned on for peripheral module related pins in the input or high-impedance state.

For the relevant pins, see section 9.2.3, Peripheral Module Pin Pull-Up Control.

Bit 5: PPU Description

0 Peripheral module related pin pull-up resistors are on (Initial value)

1 Peripheral module related pin pull-up resistors are off

Bit 4—Module Stop 4 (MSTP4): Specifies stopping of the clock supply to the DMAC among the on-chip peripheral modules. The clock supply to the DMAC is stopped when the MSTP4 bit is set to 1. When DMA transfer is used, stop the transfer before setting the MSTP4 bit to 1. When DMA transfer is performed after clearing the MSTP4 bit to 0, DMAC settings must be made again.

Bit 4: MSTP4 Description

0 DMAC operates (Initial value)

1 DMAC clock supply is stopped

Bit 3—Module Stop 3 (MSTP3): Specifies stopping of the clock supply to serial communication interface channel 2 (SCIF) among the on-chip peripheral modules. The clock supply to the SCIF is stopped when the MSTP3 bit is set to 1.

Bit 3: MSTP3 Description

0 SCIF operates (Initial value)

1 SCIF clock supply is stopped

Bit 2—Module Stop 2 (MSTP2): Specifies stopping of the clock supply to the timer unit (TMU) among the on-chip peripheral modules. The clock supply to the TMU is stopped when the MSTP2 bit is set to 1.

Bit 2: MSTP2 Description

0 TMU operates (Initial value)

1 TMU clock supply is stopped


Bit 1—Module Stop 1 (MSTP1): Specifies stopping of the clock supply to the realtime clock (RTC) among the on-chip peripheral modules. The clock supply to the RTC is stopped when the MSTP1 bit is set to 1. When the clock supply is stopped, RTC registers cannot be accessed but the counters continue to operate.

Bit 1: MSTP1 Description

0 RTC operates (Initial value)

1 RTC clock supply is stopped

Bit 0—Module Stop 0 (MSTP0): Specifies stopping of the clock supply to serial communication interface channel 1 (SCI) among the on-chip peripheral modules. The clock supply to the SCI is stopped when the MSTP0 bit is set to 1.

Bit 0: MSTP0 Description

0 SCI operates (Initial value)

1 SCI clock supply is stopped
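
The STBCR bit assignments described above can be written down as C masks. The following is a minimal sketch only: the macro and function names are hypothetical, chosen for illustration, while the bit positions and the P4 address are taken from this section and table 9.2.

```c
#include <stdint.h>

/* Hypothetical names; values follow the STBCR layout in section 9.2.1. */
#define STBCR_P4_ADDR 0xFFC00004u /* P4 address from table 9.2 */
#define STBCR_STBY    0x80u       /* bit 7: 1 = SLEEP enters standby mode */
#define STBCR_PHZ     0x40u       /* bit 6: 1 = pins go high-impedance in standby */
#define STBCR_PPU     0x20u       /* bit 5: 0 = pull-up resistors on */
#define STBCR_MSTP4   0x10u       /* bit 4: DMAC clock stop */
#define STBCR_MSTP3   0x08u       /* bit 3: SCIF clock stop */
#define STBCR_MSTP2   0x04u       /* bit 2: TMU clock stop */
#define STBCR_MSTP1   0x02u       /* bit 1: RTC clock stop */
#define STBCR_MSTP0   0x01u       /* bit 0: SCI clock stop */

/* Nonzero when the pull-up resistors are on (PPU cleared to 0). */
static int stbcr_pullups_on(uint8_t stbcr)
{
    return (stbcr & STBCR_PPU) == 0;
}

/* Nonzero when the clock supply selected by mstp_mask is stopped. */
static int stbcr_module_stopped(uint8_t stbcr, uint8_t mstp_mask)
{
    return (stbcr & mstp_mask) != 0;
}
```

With the initial value H’00, the pull-ups are on and every module operates, which matches the "(Initial value)" rows in the bit tables.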

9.2.2 Peripheral Module Pin High Impedance Control

When bit 6 in the standby control register (STBCR) is set to 1, peripheral module related pins go to the high-impedance state in standby mode.

• Relevant Pins

SCI related pins: MD0/SCK, MD1/TXD2, MD7/TXD, MD8/RTS2, CTS2

DMA related pins: DACK0, DRAK0, DACK1, DRAK1

• Other Information

High impedance control is not performed when the above pins are used as port output pins.


9.2.3 Peripheral Module Pin Pull-Up Control

When bit 5 in the standby control register (STBCR) is cleared to 0, peripheral module related pins are pulled up when in the input or high-impedance state.

• Relevant Pins

SCI related pins: MD0/SCK, MD1/TXD2, MD2/RXD2, MD7/TXD, MD8/RTS2, SCK2/MRESET, RXD, CTS2

DMA related pins: DREQ0, DACK0, DRAK0, DREQ1, DACK1, DRAK1

TMU related pin: TCLK

9.2.4 Standby Control Register 2 (STBCR2)

Standby control register 2 (STBCR2) is an 8-bit readable/writable register that specifies the sleep mode and deep sleep mode transition conditions. It is initialized to H’00 by a power-on reset via the RESET pin or due to watchdog timer overflow.

Bit: 7 6 5 4 3 2 1 0

DSLP — — — — — — —

Initial value: 0 0 0 0 0 0 0 0

R/W: R/W R R R R R R R

Bit 7—Deep Sleep (DSLP): Specifies a transition to deep sleep mode

Bit 7: DSLP Description

0 Transition to sleep mode or standby mode on execution of SLEEP instruction, according to setting of STBY bit in STBCR register (Initial value)

1 Transition to deep sleep mode on execution of SLEEP instruction*

Note: * When the STBY bit in the STBCR register is 0

Bits 6 to 0—Reserved: Only 0 should be written to these bits; operation cannot be guaranteed if 1 is written. These bits are always read as 0.
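
The way the STBY bit in STBCR and the DSLP bit in STBCR2 together select the mode entered by a SLEEP instruction can be sketched as a small decision function. This is an illustrative sketch, not code from the manual: the enum and function names are hypothetical, and the STBY = 1, DSLP = 1 combination is treated as standby here, which is an interpretation of table 9.1 and of the note that DSLP applies when STBY is 0.

```c
#include <stdint.h>

/* Hypothetical names for illustration. */
enum power_down_mode { MODE_SLEEP, MODE_DEEP_SLEEP, MODE_STANDBY };

/* Mode entered on execution of SLEEP, given the STBCR and STBCR2 values. */
static enum power_down_mode mode_on_sleep(uint8_t stbcr, uint8_t stbcr2)
{
    if (stbcr & 0x80u)      /* STBY (STBCR bit 7) = 1 */
        return MODE_STANDBY;
    if (stbcr2 & 0x80u)     /* STBY = 0 and DSLP (STBCR2 bit 7) = 1 */
        return MODE_DEEP_SLEEP;
    return MODE_SLEEP;      /* STBY = 0 and DSLP = 0 (initial values) */
}
```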


9.3 Sleep Mode

9.3.1 Transition to Sleep Mode

If a SLEEP instruction is executed when the STBY bit in STBCR is cleared to 0, the chip switches from the program execution state to sleep mode. After execution of the SLEEP instruction, the CPU halts but its register contents are retained. The on-chip peripheral modules continue to operate, and the clock continues to be output from the CKIO pin.

In sleep mode, a high-level signal is output at the STATUS1 pin, and a low-level signal at the STATUS0 pin.

9.3.2 Exit from Sleep Mode

Sleep mode is exited by means of an interrupt (NMI, IRL, or on-chip peripheral module) or a reset. In sleep mode, interrupts are accepted even if the BL bit in the SR register is 1. If necessary, SPC and SSR should be saved to the stack before executing the SLEEP instruction.

Exit by Interrupt: When an NMI, IRL, or on-chip peripheral module interrupt is generated, sleep mode is exited and interrupt exception handling is executed. The code corresponding to the interrupt source is set in the INTEVT register.

Exit by Reset: Sleep mode is exited by means of a power-on or manual reset via the RESET pin, or a power-on or manual reset executed when the watchdog timer overflows.

9.4 Deep Sleep Mode

9.4.1 Transition to Deep Sleep Mode

If a SLEEP instruction is executed when the STBY bit in STBCR is cleared to 0 and the DSLP bit in STBCR2 is set to 1, the chip switches from the program execution state to deep sleep mode. After execution of the SLEEP instruction, the CPU halts but its register contents are retained. Except for the DMAC, on-chip peripheral modules continue to operate, and the clock continues to be output from the CKIO pin.

In deep sleep mode, a high-level signal is output at the STATUS1 pin, and a low-level signal at the STATUS0 pin.

9.4.2 Exit from Deep Sleep Mode

As with sleep mode, deep sleep mode is exited by means of an interrupt (NMI, IRL, or on-chip peripheral module) or a reset.


9.5 Standby Mode

9.5.1 Transition to Standby Mode

If a SLEEP instruction is executed when the STBY bit in STBCR is set to 1, the chip switches from the program execution state to standby mode. In standby mode, the on-chip peripheral modules halt as well as the CPU. Clock output from the CKIO pin is also stopped.

The CPU and cache register contents are retained. Some on-chip peripheral module registers are initialized. The state of the peripheral module registers in standby mode is shown in table 9.4.

Table 9.4 State of Registers in Standby Mode

Module  Initialized Registers  Registers That Retain Their Contents

Interrupt controller — All registers

User break controller — All registers

Bus state controller — All registers

On-chip oscillation circuits — All registers

Timer unit TSTR register* All registers except TSTR

Realtime clock — All registers

Direct memory access controller — All registers

Serial communication interface See Appendix A, Address List See Appendix A, Address List

Note: * Not initialized when the realtime clock (RTC) is in use (see section 12, Timer Unit (TMU), in the Hardware Manual).

Note: DMA transfer should be terminated before making a transition to standby mode. Transfer results are not guaranteed if standby mode is entered during transfer.

The procedure for a transition to standby mode is shown below.

1. Clear the TME bit in the WDT timer control register (WTCSR) to 0, and stop the WDT. Set the initial value for the up-count in the WDT timer counter (WTCNT), and set the clock to be used for the up-count in bits CKS2–CKS0 in the WTCSR register.

2. Set the STBY bit in the STBCR register to 1, then execute a SLEEP instruction.

3. When standby mode is entered and the chip’s internal clock stops, a low-level signal is output at the STATUS1 pin, and a high-level signal at the STATUS0 pin.
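
Steps 1 and 2 of the procedure above can be sketched in C against a simulated register set. Everything here is an assumption made for illustration: the `pd_regs` structure, the function name, and the bit positions assumed for TME (bit 7) and CKS2–CKS0 (bits 2 to 0) in WTCSR are not given in this section, and on real hardware the WDT registers must be accessed in the manner specified in the Hardware Manual. The SLEEP instruction is represented by an optional callback.

```c
#include <stdint.h>

/* Hypothetical register model; not a memory-mapped definition. */
struct pd_regs {
    uint8_t wtcsr; /* WDT timer control register */
    uint8_t wtcnt; /* WDT timer counter */
    uint8_t stbcr; /* standby control register */
};

#define WTCSR_TME      0x80u /* assumed position of the TME bit */
#define WTCSR_CKS_MASK 0x07u /* assumed position of bits CKS2-CKS0 */
#define STBCR_STBY     0x80u /* STBY is bit 7 of STBCR (section 9.2.1) */

static void enter_standby(struct pd_regs *r, uint8_t start_count,
                          uint8_t cks, void (*sleep_insn)(void))
{
    /* Step 1: clear TME to stop the WDT, then set the counter start
       value and the up-count clock select. */
    r->wtcsr = (uint8_t)(r->wtcsr & ~WTCSR_TME);
    r->wtcnt = start_count;
    r->wtcsr = (uint8_t)((r->wtcsr & ~WTCSR_CKS_MASK) |
                         (cks & WTCSR_CKS_MASK));

    /* Step 2: set STBY to 1, then execute SLEEP. */
    r->stbcr = (uint8_t)(r->stbcr | STBCR_STBY);
    if (sleep_insn)
        sleep_insn();
}
```

The WDT start value and clock select determine how long the chip waits after a wake-up interrupt before clocks are supplied again, so they are set before STBY.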


9.5.2 Exit from Standby Mode

Standby mode is exited by means of an interrupt (NMI, IRL, or on-chip peripheral module) or a reset via the RESET pin.

Exit by Interrupt: A hot start can be performed by means of the on-chip WDT. When an NMI, IRL*1, or on-chip peripheral module (except interval timer)*2 interrupt is detected, the WDT starts counting. After the count overflows, clocks are supplied to the entire chip, standby mode is exited, and the STATUS1 and STATUS0 pins both go low. Interrupt exception handling is then executed, and the code corresponding to the interrupt source is set in the INTEVT register. In standby mode, interrupts are accepted even if the BL bit in the SR register is 1, and so, if necessary, SPC and SSR should be saved to the stack before executing the SLEEP instruction.

The phase of the CKIO pin clock output may be unstable immediately after an interrupt is detected, until standby mode is exited.

Notes: 1. Standby mode can be exited by means of IRL3–IRL0 (when the IRL3–IRL0 level is higher than the SR register I3–I0 mask level) only when the RTC clock (32.768 kHz) is operating (see section 19.2.2, IRL Interrupts, in the Hardware Manual).

2. Standby mode can be exited by means of an RTC interrupt.

Exit by Reset: Standby mode is exited by means of a reset (power-on or manual) via the RESET pin. The RESET pin should be held low until clock oscillation stabilizes. The internal clock continues to be output at the CKIO pin.

9.5.3 Clock Pause Function

In standby mode, it is possible to stop or change the frequency of the clock input from the EXTAL pin. This function is used as follows.

1. Enter standby mode following the transition procedure described above.

2. When standby mode is entered and the chip’s internal clock stops, a low-level signal is output at the STATUS1 pin, and a high-level signal at the STATUS0 pin.

3. The input clock is stopped, or its frequency changed, after the STATUS1 pin goes low and the STATUS0 pin high.

4. When the frequency is changed, input an NMI or IRL interrupt after the change. When the clock is stopped, input an NMI or IRL interrupt after applying the clock.

5. After the time set in the WDT, clock supply begins inside the chip, the STATUS1 and STATUS0 pins both go low, and operation is resumed from interrupt exception handling.


9.6 Module Standby Function

9.6.1 Transition to Module Standby Function

Setting the MSTP4–MSTP0 bits in the standby control register to 1 enables the clock supply to the corresponding on-chip peripheral modules to be halted. Use of this function allows power consumption in sleep mode to be further reduced.

In the module standby state, the on-chip peripheral module external pins retain their states prior to halting of the modules, and most registers retain their states prior to halting of the modules.

Bit Description

MSTP4 0 DMAC operates

1 Clock supplied to DMAC is stopped

MSTP3 0 SCIF operates

1 Clock supplied to SCIF is stopped

MSTP2 0 TMU operates

1 Clock supplied to TMU is stopped, and register is initialized*1

MSTP1 0 RTC operates

1 Clock supplied to RTC is stopped*2

MSTP0 0 SCI operates

1 Clock supplied to SCI is stopped

Notes: 1. The register initialized is the same as in standby mode, but initialization is not performed if the RTC clock is not in use (see section 12, Timer Unit (TMU), in the Hardware Manual).

2. The counter operates when the START bit in RCR2 is 1 (see section 11, Realtime Clock (RTC), in the Hardware Manual).

9.6.2 Exit from Module Standby Function

The module standby function is exited by clearing the MSTP4–MSTP0 bits to 0, or by a power-on reset via the RESET pin or a power-on reset caused by watchdog timer overflow.
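
Entering and leaving the module standby state amounts to setting and clearing MSTP bits without disturbing bits 7 to 5 of STBCR. The helpers below are a sketch under that reading; their names are hypothetical, and they operate on a plain value rather than on the hardware register. As stated in section 9.2.1, a DMA transfer should be stopped before MSTP4 is set, and DMAC settings must be made again after MSTP4 is cleared; the sketch does not model that.

```c
#include <stdint.h>

#define STBCR_MSTP_MASK 0x1Fu /* MSTP4-MSTP0 occupy bits 4 to 0 of STBCR */

/* New STBCR value with the selected MSTP bits set (clock supply stopped). */
static uint8_t module_standby_enter(uint8_t stbcr, uint8_t modules)
{
    return (uint8_t)(stbcr | (modules & STBCR_MSTP_MASK));
}

/* New STBCR value with the selected MSTP bits cleared (module operates). */
static uint8_t module_standby_exit(uint8_t stbcr, uint8_t modules)
{
    return (uint8_t)(stbcr & ~(modules & STBCR_MSTP_MASK));
}
```

Masking the `modules` argument keeps a stray high bit from changing STBY, PHZ, or PPU by accident.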


Section 10 Instruction Descriptions

Instructions are listed in this section in alphabetical order. The following format is used for the instruction descriptions.

Instruction Name    Full Name    Instruction Type
Function            (Indication of delayed branch instruction or interrupt-disabling instruction)

Format: The assembler input format is shown. imm and disp are numeric values, expressions, or symbols.

Summary of Operation: Summarizes the operation of the instruction.

Instruction Code: Shown in MSB ←→ LSB order.

Execution States: The no-wait value is shown.

T Bit: Shows the T bit value after execution of the instruction.

Description

Describes the operation of the instruction.

Notes

Identifies points to be noted when using the instruction.

Operation

Shows the operation in C. This is given as reference material to help understand the operation of the instruction. Use of the following resources is assumed.

char 8-bit integer

short 16-bit integer

int 32-bit integer

long 64-bit integer

float single-precision floating point number (32 bits)

double double-precision floating point number (64 bits)

These are data types.


unsigned char Read_Byte(unsigned long Addr);

unsigned short Read_Word(unsigned long Addr);

unsigned long Read_Long(unsigned long Addr);

These read data of the respective sizes from address Addr. A word read from other than a 2n address, or a longword read from other than a 4n address, will be detected as an address error.

unsigned char Write_Byte(unsigned long Addr, unsigned long Data);

unsigned short Write_Word(unsigned long Addr, unsigned long Data);

unsigned long Write_Long(unsigned long Addr, unsigned long Data);

These write data Data to address Addr, using the respective sizes. A word write to other than a 2n address, or a longword write to other than a 4n address, will be detected as an address error.
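
The alignment conditions that apply to the word and longword access functions can be expressed as two small predicates. This is a sketch for illustration; the function names are hypothetical and do not appear in the manual.

```c
#include <stdint.h>

/* Nonzero if addr is a 2n address, as required for word access. */
static int word_address_ok(uint32_t addr)
{
    return (addr & 1u) == 0;
}

/* Nonzero if addr is a 4n address, as required for longword access. */
static int long_address_ok(uint32_t addr)
{
    return (addr & 3u) == 0;
}
```

An access for which the relevant predicate returns 0 corresponds to the address error case described above.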

Delay_Slot(unsigned long Addr);

Shifts to execution of the slot instruction at address (Addr).

unsigned long R[16];

unsigned long SR,GBR,VBR;

unsigned long MACH,MACL,PR;

unsigned long PC;

Registers

struct SR0 {

unsigned long dummy0:22;

unsigned long M0:1;

unsigned long Q0:1;

unsigned long I0:4;

unsigned long dummy1:2;

unsigned long S0:1;

unsigned long T0:1;

};

SR structure definitions

#define M ((*(struct SR0 *)(&SR)).M0)

#define Q ((*(struct SR0 *)(&SR)).Q0)

#define S ((*(struct SR0 *)(&SR)).S0)

#define T ((*(struct SR0 *)(&SR)).T0)

Definitions of bits in SR
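
The order in which a C compiler allocates bit-fields is implementation-defined, so the SR0 structure above gives the intended layout only on compilers that allocate from the most significant bit. A mask-based alternative is sketched below. The names are hypothetical, and the bit positions (T at bit 0, S at bit 1, I3–I0 at bits 7 to 4, Q at bit 8, M at bit 9) are derived by reading the field widths of SR0 from the MSB downward.

```c
#include <stdint.h>

/* Hypothetical names; positions derived from the SR0 field widths. */
#define SR_T_BIT 0
#define SR_S_BIT 1
#define SR_Q_BIT 8
#define SR_M_BIT 9

/* Read one flag bit of SR. */
static unsigned sr_get(uint32_t sr, int bit)
{
    return (unsigned)((sr >> bit) & 1u);
}

/* Return SR with one flag bit set to the low bit of v. */
static uint32_t sr_put(uint32_t sr, int bit, unsigned v)
{
    return (sr & ~(UINT32_C(1) << bit)) | ((uint32_t)(v & 1u) << bit);
}

/* Read the 4-bit interrupt mask field (bits 7 to 4). */
static unsigned sr_imask(uint32_t sr)
{
    return (unsigned)((sr >> 4) & 0xFu);
}
```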


Error( char *er );

Error display function

These are floating-point number definition statements.

#define PZERO 0

#define NZERO 1

#define DENORM 2

#define NORM 3

#define PINF 4

#define NINF 5

#define qNaN 6

#define sNaN 7

#define EQ 0

#define GT 1

#define LT 2

#define UO 3

#define INVALID 4

#define FADD 0

#define FSUB 1

#define CAUSE 0x0003f000 /* FPSCR(bit17-12) */

#define SET_E 0x00020000 /* FPSCR(bit17) */

#define SET_V 0x00010040 /* FPSCR(bit16,6) */

#define SET_Z 0x00008020 /* FPSCR(bit15,5) */

#define SET_O 0x00004010 /* FPSCR(bit14,4) */

#define SET_U 0x00002008 /* FPSCR(bit13,3) */

#define SET_I 0x00001004 /* FPSCR(bit12,2) */

#define ENABLE_VOUI 0x00000b80 /* FPSCR(bit11,9-7) */

#define ENABLE_V 0x00000800 /* FPSCR(bit11) */

#define ENABLE_Z 0x00000400 /* FPSCR(bit10) */

#define ENABLE_OUI 0x00000380 /* FPSCR(bit9-7) */

#define ENABLE_I 0x00000080 /* FPSCR(bit7) */

#define FLAG 0x0000007C /* FPSCR(bit6-2) */

/* Parenthesized so that comparisons such as FPSCR_PR == 0 evaluate as intended */

#define FPSCR_FR ((FPSCR>>21)&1)

#define FPSCR_PR ((FPSCR>>19)&1)

#define FPSCR_DN ((FPSCR>>18)&1)

#define FPSCR_I ((FPSCR>>12)&1)

#define FPSCR_RM (FPSCR&1)

#define FR_HEX frf.l[ FPSCR_FR]

#define FR frf.f[ FPSCR_FR]

#define DR frf.d[ FPSCR_FR]

#define XF_HEX frf.l[FPSCR_FR^1]

#define XF frf.f[FPSCR_FR^1]

#define XD frf.d[FPSCR_FR^1]

union {

int l[2][16];

float f[2][16];

double d[2][8];

}frf;

int FPSCR;

int sign_of(int n)

{

return(FR_HEX[n]>>31);

}

int data_type_of(int n) {

int abs;

abs = FR_HEX[n] & 0x7fffffff;

if(FPSCR_PR == 0) { /* Single-precision */

if(abs < 0x00800000){

if((FPSCR_DN == 1) || (abs == 0x00000000)){

if(sign_of(n) == 0) {zero(n, 0); return(PZERO);}

else {zero(n, 1); return(NZERO);}

}

else return(DENORM);

}

else if(abs < 0x7f800000) return(NORM);

else if(abs == 0x7f800000) {

if(sign_of(n) == 0) return(PINF);

else return(NINF);

}

else if(abs < 0x7fc00000) return(qNaN);


else return(sNaN);

}

else { /* Double-precision */

if(abs < 0x00100000){

if((FPSCR_DN == 1) ||

((abs == 0x00000000) && (FR_HEX[n+1] == 0x00000000))){

if(sign_of(n) == 0) {zero(n, 0); return(PZERO);}

else {zero(n, 1); return(NZERO);}

}

else return(DENORM);

}

else if(abs < 0x7ff00000) return(NORM);

else if((abs == 0x7ff00000) &&

(FR_HEX[n+1] == 0x00000000)) {

if(sign_of(n) == 0) return(PINF);

else return(NINF);

}

else if(abs < 0x7ff80000) return(qNaN);

else return(sNaN);

}

}
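
The single-precision branch of data_type_of() can be restated as a self-contained function that works on a raw 32-bit pattern. This is a sketch for illustration only: the names are hypothetical, the thresholds are copied from the code above, and the side effect of data_type_of() (the call to zero(), which rewrites the register) is deliberately left out.

```c
#include <stdint.h>

/* Hypothetical names, distinct from the manual's PZERO ... sNaN macros. */
enum fp_class { C_PZERO, C_NZERO, C_DENORM, C_NORM,
                C_PINF, C_NINF, C_QNAN, C_SNAN };

/* Classify a single-precision bit pattern; dn models FPSCR.DN. */
static enum fp_class classify_single(uint32_t bits, int dn)
{
    uint32_t mag = bits & 0x7fffffffu; /* magnitude without the sign bit */
    int neg = (int)(bits >> 31);       /* sign bit */

    if (mag < 0x00800000u) {           /* zero or denormalized */
        if (dn || mag == 0)
            return neg ? C_NZERO : C_PZERO;
        return C_DENORM;
    }
    if (mag < 0x7f800000u)
        return C_NORM;
    if (mag == 0x7f800000u)
        return neg ? C_NINF : C_PINF;
    if (mag < 0x7fc00000u)             /* same threshold as the code above */
        return C_QNAN;
    return C_SNAN;
}
```

With dn set to 1, a denormalized input is reported as a signed zero, mirroring the FPSCR_DN == 1 test in the reference code.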

void register_copy(int m,n)

{

FR[n] = FR[m];

if(FPSCR_PR == 1) FR[n+1] = FR[m+1];

}

void normal_faddsub(int m,n,type)

{

union {

float f;

int l;

} dstf,srcf;

union {

double d;

int l[2];

} dstd,srcd;

union { /* “long double” format: */


long double x; /* 1-bit sign */

int l[4]; /* 15-bit exponent */

} dstx; /* 112-bit mantissa */

if(FPSCR_PR == 0) {

if(type == FADD) srcf.f = FR[m];

else srcf.f = -FR[m];

dstd.d = FR[n]; /* Conversion from single-precision to double-precision */

dstd.d += srcf.f;

if(((dstd.d == FR[n]) && (srcf.f != 0.0)) ||

((dstd.d == srcf.f) && (FR[n] != 0.0))) {

set_I();

if(sign_of(m)^ sign_of(n)) {

dstd.l[1] -= 1;

if(dstd.l[1] == 0xffffffff) dstd.l[0] -= 1;

}

}

if(dstd.l[1] & 0x1fffffff) set_I();

dstf.f += srcf.f; /* Round to nearest */

if(FPSCR_RM == 1) {

dstd.l[1] &= 0xe0000000; /* Round to zero */

dstf.f = dstd.d;

}

check_single_exception(&FR[n],dstf.f);

}else {

if(type == FADD) srcd.d = DR[m>>1];

else srcd.d = -DR[m>>1];

dstx.x = DR[n>>1];

/* Conversion from double-precision to extended double-precision */

dstx.x += srcd.d;

if(((dstx.x == DR[n>>1]) && (srcd.d != 0.0)) ||

((dstx.x == srcd.d) && (DR[n>>1] != 0.0)) ) {

set_I();

if(sign_of(m)^ sign_of(n)) {

dstx.l[3] -= 1;

if(dstx.l[3] == 0xffffffff) {dstx.l[2] -= 1;

if(dstx.l[2] == 0xffffffff) {dstx.l[1] -= 1;

if(dstx.l[1] == 0xffffffff) {dstx.l[0] -= 1;}}}


}

}

if((dstx.l[2] & 0x0fffffff) || dstx.l[3]) set_I();

dstd.d += srcd.d; /* Round to nearest */

if(FPSCR_RM == 1) {

dstx.l[2] &= 0xf0000000; /* Round to zero */

dstx.l[3] = 0x00000000;

dstd.d = dstx.x;

}

check_double_exception(&DR[n>>1] ,dstd.d);

}

}

void normal_fmul(int m,n)

{

union {

float f;

int l;

} tmpf;

union {

double d;

int l[2];

} tmpd;

union {

long double x;

int l[4];

} tmpx;

if(FPSCR_PR == 0) {

tmpd.d = FR[n]; /* Single-precision to double-precision */

tmpd.d *= FR[m]; /* Precise creation */

tmpf.f *= FR[m]; /* Round to nearest */

if(tmpf.f != tmpd.d) set_I();

if((tmpf.f > tmpd.d) && (FPSCR_RM == 1)) {

tmpf.l -= 1; /* Round to zero */

}

check_single_exception(&FR[n],tmpf.f);

}else {

tmpx.x = DR[n>>1]; /* Double-precision to extended double-precision */

tmpx.x *= DR[m>>1]; /* Precise creation */

tmpd.d *= DR[m>>1]; /* Round to nearest */

if(tmpd.d != tmpx.x) set_I();

if((tmpd.d > tmpx.x) && (FPSCR_RM == 1)) {

tmpd.l[1] -= 1; /* Round to zero */

if(tmpd.l[1] == 0xffffffff) tmpd.l[0] -= 1;

}

check_double_exception(&DR[n>>1], tmpd.d);

}

}

void fipr(int m,n)

{

union {

double d;

int l[2];

} mlt[4];

float dstf;

if((data_type_of(m) == sNaN) || (data_type_of(n) == sNaN) ||

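(data_type_of(m+1) == sNaN) || (data_type_of(n+1) == sNaN) ||

(data_type_of(m+2) == sNaN) || (data_type_of(n+2) == sNaN) ||

(data_type_of(m+3) == sNaN) || (data_type_of(n+3) == sNaN) ||
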
(check_product_invalid(m,n)) ||

(check_product_invalid(m+1,n+1)) ||

(check_product_invalid(m+2,n+2)) ||

(check_product_invalid(m+3,n+3)) ) invalid(n+3);

else if((data_type_of(m) == qNaN)|| (data_type_of(n) == qNaN)||

(data_type_of(m+1) == qNaN) || (data_type_of(n+1) == qNaN) ||

(data_type_of(m+2) == qNaN) || (data_type_of(n+2) == qNaN) ||

(data_type_of(m+3) == qNaN) || (data_type_of(n+3) == qNaN))qnan(n+3);

else if (check_positive_infinity() &&

(check_negative_infinity())) invalid(n+3);

else if (check_positive_infinity()) inf(n+3,0);

else if (check_negative_infinity()) inf(n+3,1);

else {

for(i=0;i<4;i++) {

/* If FPSCR_DN == 1, zeroize */

if (data_type_of(m+i) == PZERO) FR[m+i] = +0.0;


else if(data_type_of(m+i) == NZERO) FR[m+i] = -0.0;

if (data_type_of(n+i) == PZERO) FR[n+i] = +0.0;

else if(data_type_of(n+i) == NZERO) FR[n+i] = -0.0;

mlt[i].d = FR[m+i];

mlt[i].d *= FR[n+i];

/* To be precise, with FIPR, the lower 18 bits are discarded; therefore, this description

is simplified, and differs from the hardware. */

mlt[i].l[1] &= 0xff000000;

mlt[i].l[1] |= 0x00800000;

}

mlt[0].d += mlt[1].d + mlt[2].d + mlt[3].d;

mlt[0].l[1] &= 0xff800000;

dstf = mlt[0].d;

set_I();

check_single_exception(&FR[n+3],dstf);

}

}

void check_single_exception(float *dst,result)

{

union {

float f;

int l;

} tmp;

float abs;

if(result < 0.0) tmp.l = 0xff800000; /* – infinity */

else tmp.l = 0x7f800000; /* + infinity */

if(result == tmp.f) {

set_O(); set_I();

if(FPSCR_RM == 1) {

tmp.l -= 1; /* Maximum value of normalized number */

result = tmp.f;

}

}

if(result < 0.0) abs = -result;

else abs = result;

tmp.l = 0x00800000; /* Minimum value of normalized number */

if(abs < tmp.f) {

if((FPSCR_DN == 1) && (abs != 0.0)) {

set_I();

if(result < 0.0) result = -0.0; /* Zeroize denormalized number */

else result = 0.0;

}

if(FPSCR_I == 1) set_U();

}

if(FPSCR & ENABLE_OUI) fpu_exception_trap();

else *dst = result;

}

void check_double_exception(double *dst,result)

{

union {

double d;

int l[2];

} tmp;

double abs;

if(result < 0.0) tmp.l[0] = 0xfff00000; /* – infinity */

else tmp.l[0] = 0x7ff00000; /* + infinity */

tmp.l[1] = 0x00000000;

if(result == tmp.d) {

set_O(); set_I();

if(FPSCR_RM == 1) {

tmp.l[0] -= 1;

tmp.l[1] = 0xffffffff;

result = tmp.d; /* Maximum value of normalized number */

}

}

if(result < 0.0) abs = -result;

else abs = result;

tmp.l[0] = 0x00100000; /* Minimum value of normalized number */

tmp.l[1] = 0x00000000;

if(abs < tmp.d) {

if((FPSCR_DN == 1) && (abs != 0.0)) {

set_I();

if(result < 0.0) result = -0.0; /* Zeroize denormalized number */

else result = 0.0;

}

if(FPSCR_I == 1) set_U();

}

if(FPSCR & ENABLE_OUI) fpu_exception_trap();

else *dst = result;

}

int check_product_invalid(int m,n)

{

return(check_product_infinity(m,n) &&

((data_type_of(m) == PZERO) || (data_type_of(n) == PZERO) ||

(data_type_of(m) == NZERO) || (data_type_of(n) == NZERO)));

}

int check_product_infinity(int m,n)

{

return((data_type_of(m) == PINF) || (data_type_of(n) == PINF) ||

(data_type_of(m) == NINF) || (data_type_of(n) == NINF));

}

int check_positive_infinity(int m,n)

{

/* A product is +infinity when one operand is infinite and the signs match */

return((check_product_infinity(m,n) && !(sign_of(m)^sign_of(n))) ||

(check_product_infinity(m+1,n+1) && !(sign_of(m+1)^sign_of(n+1))) ||

(check_product_infinity(m+2,n+2) && !(sign_of(m+2)^sign_of(n+2))) ||

(check_product_infinity(m+3,n+3) && !(sign_of(m+3)^sign_of(n+3))));

}

int check_negative_infinity(int m,n)

{

return((check_product_infinity(m,n) && (sign_of(m)^sign_of(n))) ||

(check_product_infinity(m+1,n+1) && (sign_of(m+1)^sign_of(n+1))) ||

(check_product_infinity(m+2,n+2) && (sign_of(m+2)^sign_of(n+2))) ||

(check_product_infinity(m+3,n+3) && (sign_of(m+3)^sign_of(n+3))));

}

void clear_cause () {FPSCR &= ~CAUSE;}

void set_E() {FPSCR |= SET_E; fpu_exception_trap();}

void set_V() {FPSCR |= SET_V;}

void set_Z() {FPSCR |= SET_Z;}

void set_O() {FPSCR |= SET_O;}

void set_U() {FPSCR |= SET_U;}

void set_I() {FPSCR |= SET_I;}

void invalid(int n)

{

set_V();

if((FPSCR & ENABLE_V) == 0) qnan(n);

else fpu_exception_trap();

}

void dz(int n,sign)

{

set_Z();

if((FPSCR & ENABLE_Z) == 0) inf(n,sign);

else fpu_exception_trap();

}

void zero(int n,sign)

{

if(sign == 0) FR_HEX [n] = 0x00000000;

else FR_HEX [n] = 0x80000000;

if (FPSCR_PR==1) FR_HEX [n+1] = 0x00000000;

}

void inf(int n,sign) {

if (FPSCR_PR==0) {

if(sign == 0) FR_HEX [n] = 0x7f800000;

else FR_HEX [n] = 0xff800000;

}else {

if(sign == 0) FR_HEX [n] = 0x7ff00000;

else FR_HEX [n] = 0xfff00000;

FR_HEX [n+1] = 0x00000000;

}

}

void qnan(int n)

{

if (FPSCR_PR==0) FR_HEX [n] = 0x7fbfffff;

else { FR_HEX [n] = 0x7ff7ffff;

FR_HEX [n+1] = 0xffffffff;

}

}

Example

An example is shown using assembler mnemonics, indicating the states before and after execution of the instruction.

Italics (e.g., .align) indicate an assembler control instruction. The meaning of the assembler control instructions is given below. For details, refer to the Cross-Assembler User’s Manual.

.org Location counter setting

.data.w Word integer data allocation

.data.l Longword integer data allocation

.sdata String data allocation

.align 2 2-byte boundary alignment



.arepeat 16 16-times repeat expansion

.arepeat 32 32-times repeat expansion

.aendr Count-specification repeat expansion end

Note: SH Series cross-assembler version 1.0 does not support conditional assembler functions.

10.1 ADD ADD binary Arithmetic Instruction

Binary Addition

Format  Summary of Operation  Instruction Code  Execution States  T Bit

ADD Rm,Rn  Rn+Rm → Rn  0011nnnnmmmm1100  1  —

ADD #imm,Rn  Rn+imm → Rn  0111nnnniiiiiiii  1  —

Description

This instruction adds together the contents of general registers Rn and Rm and stores the result in Rn.

8-bit immediate data can also be added to the contents of general register Rn.

8-bit immediate data is sign-extended to 32 bits, allowing use in decrement operations.

Operation

ADD(long m, long n) /* ADD Rm,Rn */

{

R[n]+=R[m];

PC+=2;

}

ADDI(long i, long n) /* ADD #imm,Rn */

{

if ((i&0x80)==0)

R[n]+=(0x000000FF & (long)i);

else R[n]+=(0xFFFFFF00 | (long)i);

PC+=2;

}

Example

ADD R0,R1 ;Before execution R0 = H’7FFFFFFF, R1 = H’00000001

;After execution R1 = H’80000000

ADD #H’01,R2 ;Before execution R2 = H’00000000

;After execution R2 = H’00000001

ADD #H’FE,R3 ;Before execution R3 = H’00000001

;After execution R3 = H’FFFFFFFF
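The last example works because the immediate H’FE is sign-extended to H’FFFFFFFE, so the addition behaves as a subtraction of 2. A minimal C sketch of this behavior follows; the function names sign_extend8 and add_imm are illustrative assumptions, not part of the manual's operation listing:

```c
#include <stdint.h>

/* Sign-extend the 8-bit immediate field of ADD #imm,Rn to 32 bits. */
uint32_t sign_extend8(uint32_t imm8)
{
    imm8 &= 0xFFu;
    return (imm8 & 0x80u) ? (imm8 | 0xFFFFFF00u) : imm8;
}

/* Model of ADD #imm,Rn on a register value; wraps modulo 2^32. */
uint32_t add_imm(uint32_t rn, uint32_t imm8)
{
    return rn + sign_extend8(imm8);
}
```

Here add_imm(1, 0xFE) reproduces the R3 example above, and add_imm(rn, 0xFF) decrements rn by 1.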

10.2 ADDC ADD with Carry Arithmetic Instruction

Binary Addition with Carry

Format  Summary of Operation  Instruction Code  Execution States  T Bit

ADDC Rm,Rn  Rn+Rm+T → Rn, carry → T  0011nnnnmmmm1110  1  Carry

Description

This instruction adds together the contents of general registers Rn and Rm and the T bit, and stores the result in Rn. A carry resulting from the operation is reflected in the T bit. This instruction is used for additions exceeding 32 bits.

Operation

ADDC(long m, long n) /* ADDC Rm,Rn */

{

unsigned long tmp0,tmp1;

tmp1=R[n]+R[m];

tmp0=R[n];

R[n]=tmp1+T;

if (tmp0>tmp1) T=1;

else T=0;

if (tmp1>R[n]) T=1;

PC+=2;

}

Example

CLRT ;R0:R1(64 bits) + R2:R3(64 bits) = R0:R1(64 bits)

ADDC R3,R1 ;Before execution T = 0, R1 = H’00000001, R3 = H’FFFFFFFF

;After execution T = 1, R1 = H’00000000

ADDC R2,R0 ;Before execution T = 1, R0 = H’00000000, R2 = H’00000000

;After execution T = 0, R0 = H’00000001
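The 64-bit addition in this example can be modeled in C as two chained ADDC steps. This is a sketch only: the function names addc and add64 and the way the T bit is passed by pointer are assumptions made for illustration.

```c
#include <stdint.h>

/* One ADDC step: returns Rn+Rm+T and updates *t with the carry out. */
uint32_t addc(uint32_t rn, uint32_t rm, unsigned *t)
{
    uint32_t sum = rn + rm;
    uint32_t res = sum + *t;
    *t = (sum < rn) || (res < sum);   /* carry from either addition */
    return res;
}

/* 64-bit addition built from CLRT followed by two ADDC steps.
   hi0:lo0 corresponds to R0:R1 and hi1:lo1 to R2:R3 in the example. */
uint64_t add64(uint32_t hi0, uint32_t lo0, uint32_t hi1, uint32_t lo1)
{
    unsigned t = 0;                      /* CLRT */
    uint32_t lo = addc(lo0, lo1, &t);    /* ADDC R3,R1 */
    uint32_t hi = addc(hi0, hi1, &t);    /* ADDC R2,R0 */
    return ((uint64_t)hi << 32) | lo;
}
```

With the register values above, add64(0, 1, 0, 0xFFFFFFFF) yields H’00000001 in the upper word and H’00000000 in the lower word, matching the states shown.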

10.3 ADDV ADD with (V flag) overflow check Arithmetic Instruction

Binary Addition with Overflow Check

Format  Summary of Operation  Instruction Code  Execution States  T Bit

ADDV Rm,Rn  Rn+Rm → Rn, overflow → T  0011nnnnmmmm1111  1  Overflow

Description

This instruction adds together the contents of general registers Rn and Rm and stores the result in Rn. If overflow occurs, the T bit is set.

Operation

ADDV(long m, long n) /* ADDV Rm,Rn */

{

long dest,src,ans;

if ((long)R[n]>=0) dest=0;

else dest=1;

if ((long)R[m]>=0) src=0;

else src=1;

src+=dest;

R[n]+=R[m];

if ((long)R[n]>=0) ans=0;

else ans=1;

ans+=dest;

if (src==0 || src==2) {

if (ans==1) T=1;

else T=0;

}

else T=0;

PC+=2;

}

Example

ADDV R0,R1 ;Before execution R0 = H’00000001, R1 = H’7FFFFFFE, T=0

;After execution R1 = H’7FFFFFFF, T=0

ADDV R0,R1 ;Before execution R0 = H’00000002, R1 = H’7FFFFFFE, T=0

;After execution R1 = H’80000000, T=1
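The overflow condition tested by ADDV can also be written as a single bitwise expression: overflow occurs when both operands have the same sign and the sign of the sum differs. The sketch below is an equivalent formulation, not the manual's own listing, and the name addv_overflow is an assumption.

```c
#include <stdint.h>

/* Returns the T bit that ADDV Rm,Rn would produce: 1 when the signed
   32-bit sum overflows, 0 otherwise. */
unsigned addv_overflow(uint32_t rn, uint32_t rm)
{
    uint32_t sum = rn + rm;
    /* bit 31 is set when rn and rm agree in sign but sum does not */
    return ((~(rn ^ rm) & (rn ^ sum)) >> 31) & 1u;
}
```

For the two cases above, addv_overflow(0x7FFFFFFE, 1) gives 0 and addv_overflow(0x7FFFFFFE, 2) gives 1.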

10.4 AND AND logical Logical Instruction

Logical AND

Format  Summary of Operation  Instruction Code  Execution States  T Bit

AND Rm,Rn  Rn & Rm → Rn  0010nnnnmmmm1001  1  —

AND #imm,R0  R0 & imm → R0  11001001iiiiiiii  1  —

AND.B #imm,@(R0,GBR)  (R0+GBR) & imm → (R0+GBR)  11001101iiiiiiii  4  —

Description

This instruction ANDs the contents of general registers Rn and Rm and stores the result in Rn.

This instruction can be used to AND general register R0 contents with zero-extended 8-bit immediate data, or, in indexed GBR indirect addressing mode, to AND 8-bit memory with 8-bit immediate data.

Notes

With AND #imm,R0, the upper 24 bits of R0 are always cleared as a result of the operation.

Operation

AND(long m, long n) /* AND Rm,Rn */

{

R[n]&=R[m];

PC+=2;

}

ANDI(long i) /* AND #imm,R0 */

{

R[0]&=(0x000000FF & (long)i);

PC+=2;

}

ANDM(long i) /* AND.B #imm,@(R0,GBR) */

{

long temp;

temp=(long)Read_Byte(GBR+R[0]);

temp&=(0x000000FF & (long)i);

Write_Byte(GBR+R[0],temp);

PC+=2;

}

Example

AND R0,R1 ;Before execution R0 = H’AAAAAAAA, R1=H’55555555

;After execution R1 = H’00000000

AND #H’0F,R0 ;Before execution R0 = H’FFFFFFFF

;After execution R0 = H’0000000F

AND.B #H’80,@(R0,GBR) ;Before execution (R0,GBR) = H’A5

;After execution (R0,GBR) = H’80

10.5 BF Branch if False Branch Instruction

Conditional Branch

Format  Summary of Operation  Instruction Code  Execution States  T Bit

BF label  If T = 0, PC + 4 + disp × 2 → PC; if T = 1, nop  10001011dddddddd  1  —

Description

This is a conditional branch instruction that references the T bit. The branch is taken if T = 0, and not taken if T = 1. The branch destination is address (PC + 4 + displacement × 2). The PC source value is the BF instruction address. As the 8-bit displacement is multiplied by two after sign-extension, the branch destination can be located in the range from –256 to +254 bytes from the BF instruction.

Notes

If the branch destination cannot be reached, the branch must be handled by using BF in combination with a BRA or JMP instruction, for example.

Operation

BF(int d) /* BF disp */

{

int disp;

if ((d&0x80)==0)

disp=(0x000000FF & d);

else disp=(0xFFFFFF00 | d);

if (T==0)

PC=PC+4+(disp<<1);

else PC+=2;

}

Example

CLRT ;Normally T = 0

BT TRGET_T ;T = 0, so branch is not taken.

BF TRGET_F ;T = 0, so branch to TRGET_F.

NOP ;

NOP ;

TRGET_F: ;← BF instruction branch destination
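The destination address calculation described for BF (and used identically by BT) can be sketched in C as follows. The function name bcond_target is hypothetical; the arithmetic follows the operation listing, with the displacement range of –256 to +254 bytes applied relative to PC + 4.

```c
#include <stdint.h>

/* Branch destination of a BF or BT instruction located at address pc,
   given the 8-bit displacement field d of the instruction code. */
uint32_t bcond_target(uint32_t pc, uint32_t d)
{
    int32_t disp = (int32_t)(d & 0xFFu);
    if (disp & 0x80)
        disp -= 256;                       /* sign-extend 8 bits */
    return pc + 4u + (uint32_t)(disp * 2); /* displacement in words */
}
```

For an instruction at H’1000, a displacement field of H’7F reaches H’1102 and a field of H’80 reaches H’0F04, the two extremes of the range.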

10.6 BF/S Branch if False with delay Slot Branch Instruction

Conditional Branch with Delay

Delayed Branch Instruction

Format  Summary of Operation  Instruction Code  Execution States  T Bit

BF/S label  If T = 0, PC + 4 + disp × 2 → PC; if T = 1, nop  10001111dddddddd  1  —

Description

This is a delayed conditional branch instruction that references the T bit. If T = 1, the next instruction is executed and the branch is not taken. If T = 0, the branch is taken after execution of the next instruction.

The branch destination is address (PC + 4 + displacement × 2). The PC source value is the BF/S instruction address. As the 8-bit displacement is multiplied by two after sign-extension, the branch destination can be located in the range from –256 to +254 bytes from the BF/S instruction.

Notes

As this is a delayed branch instruction, when the branch condition is satisfied, the instruction following this instruction is executed before the branch destination instruction.

Interrupts are not accepted between this instruction and the following instruction.

If the following instruction is a branch instruction, it is identified as a slot illegal instruction.

If this instruction is located in the delay slot immediately following a delayed branch instruction, it is identified as a slot illegal instruction.

If the branch destination cannot be reached, the branch must be handled by using BF/S in combination with a BF, BRA, or JMP instruction, for example.

Operation

BFS(int d) /* BFS disp */

{

int disp;

unsigned int temp;

temp=PC;

if ((d&0x80)==0)

disp=(0x000000FF & d);

else disp=(0xFFFFFF00 | d);

if (T==0)

PC=PC+4+(disp<<1);

else PC+=4;

Delay_Slot(temp+2);

}

Example

CLRT ;Normally T = 0

BT/S TRGET_T ;T = 0, so branch is not taken.

NOP ;

BF/S TRGET_F ;T = 0, so branch to TRGET_F.

ADD R0,R1 ;Executed before branch.

NOP ;

TRGET_F: ;← BF/S instruction branch destination

10.7 BRA BRAnch Branch Instruction

Unconditional Branch

Delayed Branch Instruction

Format  Summary of Operation  Instruction Code  Execution States  T Bit

BRA label  PC+4+disp×2 → PC  1010dddddddddddd  1  —

Description

This is an unconditional branch instruction. The branch destination is address (PC + 4 + displacement × 2). The PC source value is the BRA instruction address. As the 12-bit displacement is multiplied by two after sign-extension, the branch destination can be located in the range from –4096 to +4094 bytes from the BRA instruction. If the branch destination cannot be reached, this branch can be performed with a JMP instruction.

Notes

As this is a delayed branch instruction, the instruction following this instruction is executed before the branch destination instruction.

Interrupts are not accepted between this instruction and the following instruction. If the following instruction is a branch instruction, it is identified as a slot illegal instruction.

Operation

BRA(int d) /* BRA disp */

{

int disp;

unsigned int temp;

temp=PC;

if ((d&0x800)==0)

disp=(0x00000FFF & d);

else disp=(0xFFFFF000 | d);

PC=PC+4+(disp<<1);

Delay_Slot(temp+2);

}

Example

BRA TRGET ;Branch to TRGET.

ADD R0,R1 ;ADD executed before branch.

NOP ;

TRGET: ;← BRA instruction branch destination

10.8 BRAF BRAnch Far Branch Instruction

Unconditional Branch

Delayed Branch Instruction

Format  Summary of Operation  Instruction Code  Execution States  T Bit

BRAF Rn  PC+4+Rn → PC  0000nnnn00100011  2  —

Description

This is an unconditional branch instruction. The branch destination is address (PC + 4 + Rn). The branch destination address is the result of adding 4 plus the 32-bit contents of general register Rn to PC.

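Notes

As this is a delayed branch instruction, the instruction following this instruction is executed before the branch destination instruction.

Interrupts are not accepted between this instruction and the following instruction. If the following instruction is a branch instruction, it is identified as a slot illegal instruction.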

Operation

BRAF(int n) /* BRAF Rn */

{

unsigned int temp;

temp=PC;

PC=PC+4+R[n];

Delay_Slot(temp+2);

}

Example

MOV.L #(TRGET-BRAF_PC),R0 ;Set displacement.

BRAF R0 ;Branch to TRGET.

ADD R0,R1 ;ADD executed before branch.

BRAF_PC: ;

NOP

TRGET: ;← BRAF instruction branch destination
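In this example the label BRAF_PC sits 4 bytes after the BRAF instruction (past the BRAF itself and its delay slot), which is why TRGET – BRAF_PC is the correct value to load into R0. A small C sketch of the same relationship follows; the function names are assumptions chosen for illustration.

```c
#include <stdint.h>

/* Value to load into Rn so that a BRAF located at braf_addr reaches
   target. BRAF computes PC + 4 + Rn, with PC the BRAF address. */
uint32_t braf_displacement(uint32_t braf_addr, uint32_t target)
{
    return target - (braf_addr + 4u);
}

/* Destination actually reached for a given register value. */
uint32_t braf_target(uint32_t braf_addr, uint32_t rn)
{
    return braf_addr + 4u + rn;
}
```

Because the arithmetic wraps modulo 2^32, the same calculation covers backward branches: a target below the BRAF address simply produces a large unsigned value in Rn.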

10.9 BSR Branch to SubRoutine Branch Instruction

Branch to Subroutine Procedure

Delayed Branch Instruction

Format  Summary of Operation  Instruction Code  Execution States  T Bit

BSR label  PC+4 → PR, PC+4+disp×2 → PC  1011dddddddddddd  1  —

Description

This instruction branches to address (PC + 4 + displacement × 2), and stores address (PC + 4) in PR. The PC source value is the BSR instruction address. As the 12-bit displacement is multiplied by two after sign-extension, the branch destination can be located in the range from –4096 to +4094 bytes from the BSR instruction. If the branch destination cannot be reached, this branch can be performed with a JSR instruction.

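Notes

As this is a delayed branch instruction, the instruction following this instruction is executed before the branch destination instruction.

Interrupts are not accepted between this instruction and the following instruction. If the following instruction is a branch instruction, it is identified as a slot illegal instruction.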

Operation

BSR(int d) /* BSR disp */

{

int disp;

unsigned int temp;

temp=PC;

if ((d&0x800)==0)

disp=(0x00000FFF & d);

else disp=(0xFFFFF000 | d);

PR=PC+4;

PC=PC+4+(disp<<1);

Delay_Slot(temp+2);

}

Example

BSR TRGET ;Branch to TRGET.

MOV R3,R4 ;MOV executed before branch.

ADD R0,R1 ;Subroutine procedure return destination (contents of PR)

.....

.....

TRGET: ;← Entry to procedure

MOV R2,R3 ;

RTS ;Return to above ADD instruction.

MOV #1,R0 ;MOV executed before branch.

10.10 BSRF Branch to SubRoutine Far Branch Instruction

Branch to Subroutine Procedure

Delayed Branch Instruction

Format  Summary of Operation  Instruction Code  Execution States  T Bit

BSRF Rn  PC+4 → PR, PC+4+Rn → PC  0000nnnn00000011  2  —

Description

This instruction branches to address (PC + 4 + Rn), and stores address (PC + 4) in PR. The PC source value is the BSRF instruction address. The branch destination address is the result of adding the 32-bit contents of general register Rn to PC + 4.

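Notes

As this is a delayed branch instruction, the instruction following this instruction is executed before the branch destination instruction.

Interrupts are not accepted between this instruction and the following instruction. If the following instruction is a branch instruction, it is identified as a slot illegal instruction.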

Operation

BSRF(int n) /* BSRF Rn */

{

unsigned int temp;

temp=PC;

PR=PC+4;

PC=PC+4+R[n];

Delay_Slot(temp+2);

}

Example

MOV.L #(TRGET-BSRF_PC),R0 ;Set displacement.

BSRF R0 ;Branch to TRGET.

MOV R3,R4 ;MOV executed before branch.

BSRF_PC: ;

ADD R0,R1 ;

.....

TRGET: ;← Entry to procedure

MOV R2,R3 ;



10.11 BT Branch if True Branch Instruction

Conditional Branch

Format  Summary of Operation  Instruction Code  Execution States  T Bit

BT label  If T = 1, PC + 4 + disp × 2 → PC; if T = 0, nop  10001001dddddddd  1  —

Description

This is a conditional branch instruction that references the T bit. The branch is taken if T = 1, and not taken if T = 0.

The branch destination is address (PC + 4 + displacement × 2). The PC source value is the BT instruction address. As the 8-bit displacement is multiplied by two after sign-extension, the branch destination can be located in the range from –256 to +254 bytes from the BT instruction.

Notes

If the branch destination cannot be reached, the branch must be handled by using BT in combination with a BRA or JMP instruction, for example.

Operation

BT(int d) /* BT disp */

{

int disp;

if ((d&0x80)==0)

disp=(0x000000FF & d);

else disp=(0xFFFFFF00 | d);

if (T==1)

PC=PC+4+(disp<<1);

else PC+=2;

}

Example

SETT ;Normally T = 1

BF TRGET_F ;T = 1, so branch is not taken.

BT TRGET_T ;T = 1, so branch to TRGET_T.

NOP ;

NOP ;

TRGET_T: ;← BT instruction branch destination

10.12 BT/S Branch if True with delay Slot Branch Instruction

Conditional Branch with Delay

Delayed Branch Instruction

Format  Summary of Operation  Instruction Code  Execution States  T Bit

BT/S label  If T = 1, PC + 4 + disp × 2 → PC; if T = 0, nop  10001101dddddddd  1  —

Description

This is a conditional branch instruction that references the T bit. The branch is taken if T = 1, and not taken if T = 0.

The PC source value is the BT/S instruction address. As the 8-bit displacement is multiplied by two after sign-extension, the branch destination can be located in the range from –256 to +254 bytes from the BT/S instruction. If the branch destination cannot be reached, the branch must be handled by using BT/S in combination with a BRA or JMP instruction, for example.

Notes

As this is a delayed branch instruction, when the branch condition is satisfied, the instruction following this instruction is executed before the branch destination instruction.

Interrupts are not accepted between this instruction and the following instruction.

If the following instruction is a branch instruction, it is identified as a slot illegal instruction.

Operation

BTS(int d) /* BTS disp */

{

int disp;

unsigned temp;

temp=PC;

if ((d&0x80)==0)

disp=(0x000000FF & d);

else disp=(0xFFFFFF00 | d);

if (T==1)

PC=PC+4+(disp<<1);

else PC+=4;

Delay_Slot(temp+2);

}

Example

SETT ;Normally T = 1

BF/S TRGET_F ;T = 1, so branch is not taken.

NOP ;

BT/S TRGET_T ;T = 1, so branch to TRGET_T.

ADD R0,R1 ;Executed before branch.

NOP ;

TRGET_T: ;← BT/S instruction branch destination

10.13 CLRMAC CleaR MAC register System Control Instruction

MAC Register Clear

Format  Summary of Operation  Instruction Code  Execution States  T Bit

CLRMAC  0 → MACH, MACL  0000000000101000  1  —

Description

This instruction clears the MACH and MACL registers.

Operation

CLRMAC( ) /* CLRMAC */

{

MACH=0;

MACL=0;

PC+=2;

}

Example

CLRMAC ;Clear MAC register to initialize.

MAC.W @R0+,@R1+ ;Multiply-and-accumulate operation

MAC.W @R0+,@R1+ ;

10.14 CLRS CleaR S bit System Control Instruction

S Bit Clear

Format  Summary of Operation  Instruction Code  Execution States  T Bit

CLRS  0 → S  0000000001001000  1  —

Description

This instruction clears the S bit to 0.

Operation

CLRS( ) /* CLRS */

{

S=0;

PC+=2;

}

Example

CLRS ;Before execution S = 1

;After execution S = 0

10.15 CLRT CleaR T bit System Control Instruction

T Bit Clear

Format  Summary of Operation  Instruction Code  Execution States  T Bit

CLRT  0 → T  0000000000001000  1  0

Description

This instruction clears the T bit.

Operation

CLRT( ) /* CLRT */

{

T=0;

PC+=2;

}

Example

CLRT ;Before execution T = 1

;After execution T = 0

10.16 CMP/cond CoMPare conditionally Arithmetic Instruction

Compare

Format  Summary of Operation  Instruction Code  Execution States  T Bit

CMP/EQ Rm,Rn  If Rn = Rm, 1 → T  0011nnnnmmmm0000  1  Result of comparison

CMP/GE Rm,Rn  If Rn ≥ Rm, signed, 1 → T  0011nnnnmmmm0011  1  Result of comparison

CMP/GT Rm,Rn  If Rn > Rm, signed, 1 → T  0011nnnnmmmm0111  1  Result of comparison

CMP/HI Rm,Rn  If Rn > Rm, unsigned, 1 → T  0011nnnnmmmm0110  1  Result of comparison

CMP/HS Rm,Rn  If Rn ≥ Rm, unsigned, 1 → T  0011nnnnmmmm0010  1  Result of comparison

CMP/PL Rn  If Rn > 0, 1 → T  0100nnnn00010101  1  Result of comparison

CMP/PZ Rn  If Rn ≥ 0, 1 → T  0100nnnn00010001  1  Result of comparison

CMP/STR Rm,Rn  If any bytes are equal, 1 → T  0010nnnnmmmm1100  1  Result of comparison

CMP/EQ #imm,R0  If R0 = imm, 1 → T  10001000iiiiiiii  1  Result of comparison

Description

This instruction compares general registers Rn and Rm, and sets the T bit if the specified condition (cond) is true. If the condition is false, the T bit is cleared. The contents of Rn are not changed. Nine conditions can be specified. For the two conditions PZ and PL, Rn is compared with 0.

With the EQ condition, sign-extended 8-bit immediate data can be compared with R0. The contents of R0 are not changed.

Mnemonic Description

CMP/EQ Rm,Rn If Rn = Rm, T = 1

CMP/GE Rm,Rn If Rn ≥ Rm as signed values, T = 1

CMP/GT Rm,Rn If Rn > Rm as signed values, T = 1

CMP/HI Rm,Rn If Rn > Rm as unsigned values, T = 1

CMP/HS Rm,Rn If Rn ≥ Rm as unsigned values, T = 1

CMP/PL Rn If Rn > 0, T = 1

CMP/PZ Rn If Rn ≥ 0, T = 1

CMP/STR Rm,Rn If any bytes are equal, T = 1

CMP/EQ #imm,R0 If R0 = imm, T = 1

Operation

CMPEQ(long m, long n) /* CMP_EQ Rm,Rn */

{

if (R[n]==R[m]) T=1;

else T=0;

PC+=2;

}

CMPGE(long m, long n) /* CMP_GE Rm,Rn */

{

if ((long)R[n]>=(long)R[m]) T=1;

else T=0;

PC+=2;

}

CMPGT(long m, long n) /* CMP_GT Rm,Rn */

{

if ((long)R[n]>(long)R[m]) T=1;

else T=0;

PC+=2;

}

CMPHI(long m, long n) /* CMP_HI Rm,Rn */

{

if ((unsigned long)R[n]>(unsigned long)R[m]) T=1;

else T=0;

PC+=2;

}

CMPHS(long m, long n) /* CMP_HS Rm,Rn */

{

if ((unsigned long)R[n]>=(unsigned long)R[m]) T=1;

else T=0;

PC+=2;

}

CMPPL(long n) /* CMP_PL Rn */

{

if ((long)R[n]>0) T=1;

else T=0;

PC+=2;

}

CMPPZ(long n) /* CMP_PZ Rn */

{

if ((long)R[n]>=0) T=1;

else T=0;

PC+=2;

}

CMPSTR(long m, long n) /* CMP_STR Rm,Rn */

{

unsigned long temp;

long HH,HL,LH,LL;

temp=R[n]^R[m];

HH=(temp&0xFF000000)>>24;

HL=(temp&0x00FF0000)>>16;

LH=(temp&0x0000FF00)>>8;

LL=temp&0x000000FF;

HH=HH&&HL&&LH&&LL;

if (HH==0) T=1;

else T=0;

PC+=2;

}

CMPIM(long i) /* CMP_EQ #imm,R0 */

{

long imm;

if ((i&0x80)==0) imm=(0x000000FF & (long)i);

else imm=(0xFFFFFF00 | (long)i);

if (R[0]==imm) T=1;

else T=0;

PC+=2;

}

Example

CMP/GE R0,R1 ;R0 = H’7FFFFFFF, R1 = H’80000000

BT TRGET_T ;T = 0, so branch is not taken.

CMP/HS R0,R1 ;R0 = H’7FFFFFFF, R1 = H’80000000

BT TRGET_T ;T = 1, so branch is taken.

CMP/STR R2,R3 ;R2 = "ABCD", R3 = "XYCZ"

BT TRGET_T ;T = 1, so branch is taken.
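The CMP/STR result in the last example can be checked with a short C model. This is a sketch under the assumption that the strings are packed one ASCII character per byte ("ABCD" as H’41424344, "XYCZ" as H’5859435A); the function name cmp_str is illustrative.

```c
#include <stdint.h>

/* T bit produced by CMP/STR Rm,Rn: 1 if any of the four byte positions
   holds the same value in both registers, 0 otherwise. */
unsigned cmp_str(uint32_t rn, uint32_t rm)
{
    uint32_t x = rn ^ rm;          /* a matching byte becomes 0x00 */
    for (int i = 0; i < 4; i++) {
        if (((x >> (8 * i)) & 0xFFu) == 0)
            return 1;
    }
    return 0;
}
```

Only bytes in the same position are compared, so "ABCD" against "DCBA" gives T = 0 even though both registers contain the same four characters.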

10.17 DIV0S DIVide (step 0) as Signed Arithmetic Instruction

Initialization for Signed Division

Format  Summary of Operation  Instruction Code  Execution States  T Bit

DIV0S Rm,Rn  MSB of Rn → Q, MSB of Rm → M, M^Q → T  0010nnnnmmmm0111  1  Result of calculation

Description

This instruction performs initial settings for signed division. This instruction is followed by a DIV1 instruction that executes 1-digit division, for example, and repeated divisions are executed to find the quotient. See the description of the DIV1 instruction for details.

Operation

DIV0S(long m, long n) /* DIV0S Rm,Rn */

{

if ((R[n] & 0x80000000)==0) Q=0;

else Q=1;

if ((R[m] & 0x80000000)==0) M=0;

else M=1;

T=!(M==Q);

PC+=2;

}

Example

See the examples for the DIV1 instruction.

10.18 DIV0U DIVide (step 0) as Unsigned Arithmetic Instruction

Initialization for Unsigned Division

Format  Summary of Operation  Instruction Code  Execution States  T Bit

DIV0U  0 → M/Q/T  0000000000011001  1  0

Description

This instruction performs initial settings for unsigned division. This instruction is followed by a DIV1 instruction that executes 1-digit division, for example, and repeated divisions are executed to find the quotient. See the description of the DIV1 instruction for details.

Operation

DIV0U( ) /* DIV0U */

{

M=Q=T=0;

PC+=2;

}

Example

See the examples for the DIV1 instruction.

10.19 DIV1 DIVide 1 step Arithmetic Instruction

Division

Format  Summary of Operation  Instruction Code  Execution States  T Bit

DIV1 Rm,Rn  1-step division (Rn ÷ Rm)  0011nnnnmmmm0100  1  Result of calculation

Description

This instruction performs 1-digit division (1-step division) of the 32-bit contents of general register Rn (dividend) by the contents of Rm (divisor). The quotient is obtained by repeated execution of this instruction alone or in combination with other instructions. The specified registers and the M, Q, and T bits must not be modified during these repeated executions.

In 1-step division, the dividend is shifted 1 bit to the left, the divisor is subtracted from this, and the quotient bit is reflected in the Q bit according to whether the result is positive or negative.

The remainder can be found as follows after first finding the quotient using the DIV1 instruction:

(Remainder) = (dividend) – (divisor) × (quotient)

Detection of division by zero or overflow is not provided. Check for division by zero and overflow division before executing the division. A remainder operation is not provided. Find the remainder by finding the product of the divisor and the obtained quotient, and subtracting this value from the dividend.

Initial settings should first be made with the DIV0S or DIV0U instruction. DIV1 is executed once for each bit of the divisor. If a quotient of more than 17 bits is required, place an ROTCL instruction before the DIV1 instruction. See the examples for details of the division sequence.

Operation

DIV1(long m, long n) /* DIV1 Rm,Rn */

{

unsigned long tmp0, tmp2;

unsigned char old_q, tmp1;

old_q=Q;

Q=(unsigned char)((0x80000000 & R[n])!=0);

tmp2= R[m];

R[n]<<=1;

R[n]|=(unsigned long)T;

switch(old_q){

case 0:switch(M){

case 0:tmp0=R[n];

R[n]-=tmp2;

tmp1=(R[n]>tmp0);

switch(Q){

case 0:Q=tmp1;

break;

case 1:Q=(unsigned char)(tmp1==0);

break;

}

break;

case 1:tmp0=R[n];

R[n]+=tmp2;

tmp1=(R[n]<tmp0);

switch(Q){

case 0:Q=(unsigned char)(tmp1==0);

break;

case 1:Q=tmp1;

break;

}

break;

}

break;

case 1:switch(M){

case 0:tmp0=R[n];

R[n]+=tmp2;

tmp1=(R[n]<tmp0);

switch(Q){

case 0:Q=tmp1;

break;

case 1:Q=(unsigned char)(tmp1==0);

break;

}

break;

case 1:tmp0=R[n];

R[n]-=tmp2;

tmp1=(R[n]>tmp0);

switch(Q){

case 0:Q=(unsigned char)(tmp1==0);

break;

case 1:Q=tmp1;

break;

}

break;

}

break;

}

T=(Q==M);

PC+=2;

}

Example 1

;R1 (32 bits) ÷ R0 (16 bits) = R1 (16 bits); unsigned

SHLL16 R0 ;Set divisor in upper 16 bits, clear lower 16 bits to 0

TST R0,R0 ;Check for division by zero

BT ZERO_DIV ;

CMP/HS R0,R1 ;Check for overflow

BT OVER_DIV ;

DIV0U ;Flag initialization

.arepeat 16 ;

DIV1 R0,R1 ;Repeat 16 times

.aendr ;

ROTCL R1 ;

EXTU.W R1,R1 ;R1 = quotient
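The sequence in Example 1 can be followed step by step with a small C model. This is a sketch, not the manual's listing: the four (old Q, M) cases of the Operation section are condensed into one subtract-or-add rule, the globals M, Q and T stand in for the SR bits, and the function names div1 and divu_32_16 are assumptions. The zero-divisor and overflow checks of the example are left to the caller.

```c
#include <stdint.h>

/* Hypothetical model state: the M, Q and T bits of SR. */
unsigned M, Q, T;

/* Condensed DIV1 Rm,Rn step: subtract when old Q == M, otherwise add;
   the new Q is (shifted-out bit) ^ M ^ (carry or borrow). */
void div1(uint32_t rm, uint32_t *rn)
{
    unsigned old_q = Q;
    uint32_t before;
    unsigned carry;

    Q = (*rn >> 31) & 1u;            /* bit shifted out of Rn */
    *rn = (*rn << 1) | T;            /* shift in previous quotient bit */
    before = *rn;
    if (old_q == M) {
        *rn -= rm;
        carry = (*rn > before);      /* borrow occurred */
    } else {
        *rn += rm;
        carry = (*rn < before);      /* carry occurred */
    }
    Q = Q ^ M ^ carry;
    T = (Q == M);
}

/* Example 1: 32-bit dividend / 16-bit divisor, unsigned.
   Caller must ensure divisor != 0 and dividend < (divisor << 16). */
uint16_t divu_32_16(uint32_t dividend, uint16_t divisor)
{
    uint32_t r0 = (uint32_t)divisor << 16;   /* SHLL16 R0 */
    uint32_t r1 = dividend;

    M = Q = T = 0;                           /* DIV0U */
    for (int i = 0; i < 16; i++)             /* .arepeat 16 */
        div1(r0, &r1);
    r1 = (r1 << 1) | T;                      /* ROTCL R1 */
    return (uint16_t)(r1 & 0xFFFFu);         /* EXTU.W R1,R1 */
}
```

Each DIV1 leaves one quotient bit in T, which the next step shifts into the low end of R1; the final ROTCL collects the last bit before EXTU.W discards the remainder bits in the upper half.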

Example 2

; R1:R2 (64 bits) ÷ R0 (32 bits) = R2 (32 bits); unsigned

TST R0,R0 ;Check for division by zero

BT ZERO_DIV ;

CMP/HS R0,R1 ;Check for overflow

BT OVER_DIV ;

DIV0U ;Flag initialization

.arepeat 32 ;

ROTCL R2 ;Repeat 32 times

DIV1 R0,R1 ;

.aendr ;

ROTCL R2 ;R2 = quotient

Example 3

;R1 (16 bits) ÷ R0 (16 bits) = R1 (16 bits); signed

SHLL16 R0 ;Set divisor in upper 16 bits, clear lower 16 bits to 0

EXTS.W R1,R1 ;Dividend sign-extended to 32 bits

XOR R2,R2 ;R2 = 0

MOV R1,R3 ;

ROTCL R3 ;

SUBC R2,R1 ;If dividend is negative, subtract 1

DIV0S R0,R1 ;Flag initialization

.arepeat 16 ;

DIV1 R0,R1 ;Repeat 16 times

.aendr ;

EXTS.W R1,R1 ;

ROTCL R1 ;R1 = quotient (one’s complement notation)

ADDC R2,R1 ;If MSB of quotient is 1, add 1 to convert to two’s complement notation

EXTS.W R1,R1 ;R1 = quotient (two’s complement notation)

Example 4

;R2 (32 bits) ÷ R0 (32 bits) = R2 (32 bits); signed

MOV R2,R3 ;

ROTCL R3 ;

SUBC R1,R1 ;Dividend sign-extended to 64 bits (R1:R2)

XOR R3,R3 ;R3 = 0

SUBC R3,R2 ;If dividend is negative, subtract 1 to convert to one’s complement notation

DIV0S R0,R1 ;Flag initialization

.arepeat 32 ;

ROTCL R2 ;Repeat 32 times

DIV1 R0,R1 ;

.aendr ;

ROTCL R2 ;R2 = quotient (one’s complement notation)

ADDC R3,R2 ;If MSB of quotient is 1, add 1 to convert to two’s complement notation

;R2 = quotient (two’s complement notation)

10.20 DMULS.L Double-length MULtiply as Signed Arithmetic Instruction

Signed Double-Length Multiplication

Format  Summary of Operation  Instruction Code  Execution States  T Bit

DMULS.L Rm,Rn  Signed, Rn × Rm → MACH, MACL  0011nnnnmmmm1101  2–5  —

Description

This instruction performs 32-bit multiplication of the contents of general register Rn by the contents of Rm, and stores the 64-bit result in the MACH and MACL registers. The multiplication is performed as a signed arithmetic operation.

Operation

DMULS(long m, long n) /* DMULS.L Rm,Rn */

{

unsigned long RnL,RnH,RmL,RmH,Res0,Res1,Res2;

unsigned long temp0,temp1,temp2,temp3;

long tempm,tempn,fnLmL;

tempn=(long)R[n];

tempm=(long)R[m];

if (tempn<0) tempn=0-tempn;

if (tempm<0) tempm=0-tempm;

if ((long)(R[n]^R[m])<0) fnLmL=-1;

else fnLmL=0;

temp1=(unsigned long)tempn;

temp2=(unsigned long)tempm;

RnL=temp1&0x0000FFFF;

RnH=(temp1>>16)&0x0000FFFF;

RmL=temp2&0x0000FFFF;

RmH=(temp2>>16)&0x0000FFFF;

temp0=RmL*RnL;

temp1=RmH*RnL;

temp2=RmL*RnH;

temp3=RmH*RnH;

Res2=0;

Res1=temp1+temp2;

if (Res1<temp1) Res2+=0x00010000;

temp1=(Res1<<16)&0xFFFF0000;

Res0=temp0+temp1;

if (Res0<temp0) Res2++;

Res2=Res2+((Res1>>16)&0x0000FFFF)+temp3;

if (fnLmL<0) {

Res2=~Res2;

if (Res0==0)

Res2++;

else

Res0=(~Res0)+1;

}

MACH=Res2;

MACL=Res0;

PC+=2;

}

Example

DMULS.L R0,R1 ;Before execution R0 = H’FFFFFFFE, R1 = H’00005555

;After execution MACH = H’FFFFFFFF, MACL = H’FFFF5556

STS MACH,R0 ;Get operation result (upper)

STS MACL,R1 ;Get operation result (lower)

10.21 DMULU.L Double-length MULtiply as Unsigned Arithmetic Instruction

Unsigned Double-Length Multiplication

Format  Summary of Operation  Instruction Code  Execution States  T Bit

DMULU.L Rm,Rn  Unsigned, Rn × Rm → MACH, MACL  0011nnnnmmmm0101  2–5  —

Description

This instruction performs 32-bit multiplication of the contents of general register Rn by the contents of Rm, and stores the 64-bit result in the MACH and MACL registers. The multiplication is performed as an unsigned arithmetic operation.

Operation

DMULU(long m, long n) /* DMULU.L Rm,Rn */

{

unsigned long RnL,RnH,RmL,RmH,Res0,Res1,Res2;

unsigned long temp0,temp1,temp2,temp3;

RnL=R[n]&0x0000FFFF;

RnH=(R[n]>>16)&0x0000FFFF;

RmL=R[m]&0x0000FFFF;

RmH=(R[m]>>16)&0x0000FFFF;

temp0=RmL*RnL;

temp1=RmH*RnL;

temp2=RmL*RnH;

temp3=RmH*RnH;

Res2=0;

Res1=temp1+temp2;

if (Res1<temp1) Res2+=0x00010000;

temp1=(Res1<<16)&0xFFFF0000;

Res0=temp0+temp1;

if (Res0<temp0) Res2++;

Res2=Res2+((Res1>>16)&0x0000FFFF)+temp3;

MACH=Res2;

MACL=Res0;

PC+=2;

}

Example

DMULU.L R0,R1 ;Before execution R0 = H’FFFFFFFE, R1 = H’00005555

;After execution MACH = H’00005554, MACL = H’FFFF5556

STS MACH,R0 ;Get operation result (upper)

STS MACL,R1 ;Get operation result (lower)
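On a host with 64-bit integer types, the results of DMULU.L and DMULS.L can be cross-checked without the 16-bit partial products used in the operation listings. The following is a minimal sketch; the mac_t structure and the function names dmulu and dmuls are hypothetical, and the signed version assumes the usual two's complement conversion from uint32_t to int32_t.

```c
#include <stdint.h>

/* Hypothetical result pair standing in for the MACH and MACL registers. */
typedef struct { uint32_t mach, macl; } mac_t;

/* Unsigned 32 x 32 -> 64-bit product, as DMULU.L. */
mac_t dmulu(uint32_t rn, uint32_t rm)
{
    uint64_t p = (uint64_t)rn * rm;
    mac_t r = { (uint32_t)(p >> 32), (uint32_t)p };
    return r;
}

/* Signed 32 x 32 -> 64-bit product, as DMULS.L. */
mac_t dmuls(uint32_t rn, uint32_t rm)
{
    int64_t p = (int64_t)(int32_t)rn * (int32_t)rm;
    uint64_t u = (uint64_t)p;        /* reinterpret for bit extraction */
    mac_t r = { (uint32_t)(u >> 32), (uint32_t)u };
    return r;
}
```

With R0 = H’FFFFFFFE and R1 = H’00005555, both functions return the same MACL (H’FFFF5556); only MACH differs, H’00005554 for the unsigned product and H’FFFFFFFF for the signed one.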

10.22 DT Decrement and Test Arithmetic Instruction

Decrement and Test

Format  Summary of Operation  Instruction Code  Execution States  T Bit

DT Rn  Rn – 1 → Rn; if Rn = 0, 1 → T; if Rn ≠ 0, 0 → T  0100nnnn00010000  1  Test result

Description

This instruction decrements the contents of general register Rn by 1 and compares the result with zero. If the result is zero, the T bit is set to 1. If the result is nonzero, the T bit is cleared to 0.

Operation

DT(long n)/* DT Rn */

{

R[n]--;

if (R[n]==0) T=1;

else T=0;

PC+=2;

}

Example

MOV #4,R5 ;Set loop count

LOOP:

ADD R0,R1 ;

DT R5 ;Decrement R5 value and check for 0.

BF LOOP ;If T = 0, branch to LOOP (in this example, 4 loops are executed).
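The loop structure above corresponds to a do-while loop in C, since the body always runs once before DT tests the counter. A minimal sketch follows; the function name dt_loop_iterations is an assumption for illustration.

```c
#include <stdint.h>

/* Counts how many times the loop body runs for a DT/BF loop that
   starts with the given value in the counter register. */
uint64_t dt_loop_iterations(uint32_t count)
{
    uint32_t r5 = count;      /* MOV #count,R5 */
    uint64_t iterations = 0;
    unsigned t;

    do {
        iterations++;         /* loop body (ADD R0,R1 in the example) */
        r5--;                 /* DT R5: decrement ...              */
        t = (r5 == 0);        /* ... and set T if the result is 0  */
    } while (t == 0);         /* BF LOOP */
    return iterations;
}
```

Note that an initial count of 0 does not skip the loop: the first DT wraps the register to H’FFFFFFFF, so the body would run 2^32 times.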

10.23 EXTS EXTend as Signed Arithmetic Instruction

Sign Extension

Format  Summary of Operation  Instruction Code  Execution States  T Bit

EXTS.B Rm,Rn  Rm sign-extended from byte → Rn  0110nnnnmmmm1110  1  —

EXTS.W Rm,Rn  Rm sign-extended from word → Rn  0110nnnnmmmm1111  1  —

Description

This instruction sign-extends the contents of general register Rm and stores the result in Rn.

For a byte specification, the value of Rm bit 7 is transferred to Rn bits 8 to 31. For a word specification, the value of Rm bit 15 is transferred to Rn bits 16 to 31.

Operation

EXTSB(long m, long n) /* EXTS.B Rm,Rn */

{

R[n]=R[m];

if ((R[m]&0x00000080)==0) R[n]&=0x000000FF;

else R[n]|=0xFFFFFF00;

PC+=2;

}

EXTSW(long m, long n) /* EXTS.W Rm,Rn */

{

R[n]=R[m];

if ((R[m]&0x00008000)==0) R[n]&=0x0000FFFF;

else R[n]|=0xFFFF0000;

PC+=2;

}

Example

EXTS.B R0,R1 ;Before execution R0 = H’00000080

;After execution R1 = H’FFFFFF80

EXTS.W R0,R1 ;Before execution R0 = H’00008000

;After execution R1 = H’FFFF8000

10.24 EXTU EXTend as Unsigned Arithmetic Instruction

Zero Extension

Format  Summary of Operation  Instruction Code  Execution States  T Bit

EXTU.B Rm,Rn  Rm zero-extended from byte → Rn  0110nnnnmmmm1100  1  —

EXTU.W Rm,Rn  Rm zero-extended from word → Rn  0110nnnnmmmm1101  1  —

Description

This instruction zero-extends the contents of general register Rm and stores the result in Rn.

For a byte specification, 0 is transferred to Rn bits 8 to 31. For a word specification, 0 is transferred to Rn bits 16 to 31.

Operation

EXTUB(long m, long n) /* EXTU.B Rm,Rn */

{

R[n]=R[m];

R[n]&=0x000000FF;

PC+=2;

}

EXTUW(long m, long n) /* EXTU.W Rm,Rn */

{

R[n]=R[m];

R[n]&=0x0000FFFF;

PC+=2;

}

Example

EXTU.B R0,R1 ;Before execution R0 = H’FFFFFF80

;After execution R1 = H’00000080

EXTU.W R0,R1 ;Before execution R0 = H’FFFF8000

;After execution R1 = H’00008000



10.25 FABS Floating-point ABSolute value Floating-Point Instruction

Floating-Point Absolute Value

PR Format Summary of Operation Instruction Code Execution States T Bit

0 FABS FRn |FRn| → FRn 1111nnnn01011101 1 —

1 FABS DRn |DRn| → DRn 1111nnn001011101 1 —

Description

This instruction clears the most significant bit of the contents of floating-point register FRn/DRn to 0, and stores the result in FRn/DRn.

The cause and flag fields in FPSCR are not updated.

Operation

void FABS (int n){

FR[n] = FR[n] & 0x7fffffff;

pc += 2;

}

/* Same operation is performed regardless of precision. */

Possible Exceptions: None
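Because FABS only clears the sign bit, it can be modeled as a bit operation on the register image. The sketch below assumes that the C type float is a 32-bit IEEE 754 value; the function name fabs_bits is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Clears bit 31 of the single-precision image, as FABS FRn does. */
float fabs_bits(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);    /* reinterpret without aliasing issues */
    bits &= 0x7fffffffu;               /* clear the sign bit only */
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

No comparison or arithmetic is involved, which is consistent with the statement that the cause and flag fields in FPSCR are not updated.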


10.26 FADD Floating-point ADD Floating-Point Instruction

Floating-Point Addition

PR Format Summary of Operation Instruction Code Execution States T Bit

0 FADD FRm,FRn FRn+FRm → FRn 1111nnnnmmmm0000 1 —

1 FADD DRm,DRn DRn+DRm → DRn 1111nnn0mmm00000 6 —

Description

When FPSCR.PR = 0: Arithmetically adds the two single-precision floating-point numbers in FRn and FRm, and stores the result in FRn.

When FPSCR.PR = 1: Arithmetically adds the two double-precision floating-point numbers in DRn and DRm, and stores the result in DRn.

When FPSCR.enable.O/U/I is set, an FPU exception trap is generated regardless of whether or not an exception has occurred. When an exception occurs, correct exception information is reflected in FPSCR.cause and FPSCR.flag, and FRn or DRn is not updated. Appropriate processing should therefore be performed by software.

Operation

void FADD (int m,n)

{

pc += 2;

clear_cause();

if((data_type_of(m) == sNaN) ||

(data_type_of(n) == sNaN)) invalid(n);

else if((data_type_of(m) == qNaN) ||

(data_type_of(n) == qNaN)) qnan(n);

else if((data_type_of(m) == DENORM) ||

(data_type_of(n) == DENORM)) set_E();

else switch (data_type_of(m)){

case NORM: switch (data_type_of(n)){

case NORM: normal_faddsub(m,n,ADD); break;

case PZERO:

case NZERO:register_copy(m,n); break;

default: break;

} break;


case PZERO: switch (data_type_of(n)){

case NZERO: zero(n,0); break;

default: break;

} break;

case NZERO: break;

case PINF: switch (data_type_of(n)){

case NINF: invalid(n); break;

default: inf(n,0); break;

} break;

case NINF: switch (data_type_of(n)){

case PINF: invalid(n); break;

default: inf(n,1); break;

} break;

}

}

FADD Special Cases

FRm,DRm FRn,DRn

NORM +0 –0 +INF –INF DENORM qNaN sNaN

NORM ADD –INF

+0 +0

–0 –0

+INF +INF Invalid

–INF –INF Invalid –INF

DENORM Error

qNaN qNaN

sNaN Invalid

Note: When DN = 1, the value of a denormalized number is treated as 0.

Possible Exceptions:

• FPU error

• Invalid operation

• Overflow

• Underflow

• Inexact


10.27 FCMP Floating-point CoMPare Floating-Point Instruction

Floating-Point Comparison

PR Format Summary of Operation Instruction Code Execution States T Bit

0 1. FCMP/EQ FRm,FRn (FRn==FRm)?1:0 → T 1111nnnnmmmm0100 1 1/0

1 2. FCMP/EQ DRm,DRn (DRn==DRm)?1:0 → T 1111nnn0mmm00100 1 1/0

0 3. FCMP/GT FRm,FRn (FRn>FRm)?1:0 → T 1111nnnnmmmm0101 2 1/0

1 4. FCMP/GT DRm,DRn (DRn>DRm)?1:0 → T 1111nnn0mmm00101 2 1/0

Description

1. When FPSCR.PR = 0: Arithmetically compares the two single-precision floating-point numbers in FRn and FRm, and stores 1 in the T bit if they are equal, or 0 otherwise.

2. When FPSCR.PR = 1: Arithmetically compares the two double-precision floating-point numbers in DRn and DRm, and stores 1 in the T bit if they are equal, or 0 otherwise.

3. When FPSCR.PR = 0: Arithmetically compares the two single-precision floating-point numbers in FRn and FRm, and stores 1 in the T bit if FRn > FRm, or 0 otherwise.

4. When FPSCR.PR = 1: Arithmetically compares the two double-precision floating-point numbers in DRn and DRm, and stores 1 in the T bit if DRn > DRm, or 0 otherwise.

Operation

void FCMP_EQ(int m,n) /* FCMP/EQ FRm,FRn */

{

pc += 2;

clear_cause();

if(fcmp_chk (m,n) == INVALID) fcmp_invalid();

else if(fcmp_chk (m,n) == EQ) T = 1;

else T = 0;

}

void FCMP_GT(int m,n) /* FCMP/GT FRm,FRn */

{

pc += 2;

clear_cause();

if ((fcmp_chk (m,n) == INVALID) ||

(fcmp_chk (m,n) == UO)) fcmp_invalid();

else if(fcmp_chk (m,n) == GT) T = 1;


else T = 0;

}

int fcmp_chk (int m,n)

{

if((data_type_of(m) == sNaN) ||

(data_type_of(n) == sNaN)) return(INVALID);

else if((data_type_of(m) == qNaN) ||

(data_type_of(n) == qNaN)) return(UO);

else switch(data_type_of(m)){

case NORM: switch(data_type_of(n)){

case PINF :return(GT); break;

case NINF :return(LT); break;

default: break;

} break;

case PZERO:

case NZERO: switch(data_type_of(n)){

case PZERO :

case NZERO :return(EQ); break;

default: break;

} break;

case PINF : switch(data_type_of(n)){

case PINF :return(EQ); break;

default:return(LT); break;

} break;

case NINF : switch(data_type_of(n)){

case NINF :return(EQ); break;

default:return(GT); break;

} break;

}

if(FPSCR_PR == 0) {

if(FR[n] == FR[m]) return(EQ);

else if(FR[n] > FR[m]) return(GT);

else return(LT);

}else {

if(DR[n>>1] == DR[m>>1]) return(EQ);

else if(DR[n>>1] > DR[m>>1]) return(GT);


else return(LT);

}

}

void fcmp_invalid()

{

set_V(); if((FPSCR & ENABLE_V) == 0) T = 0;

else fpu_exception_trap();


}

FCMP Special Cases

FCMP/EQ FRn,DRn

FRm,DRm NORM DENORM +0 –0 +INF –INF qNaN sNaN

NORM CMP

DENORM

+0 EQ

–0

+INF EQ

–INF EQ

qNaN !EQ

sNaN Invalid


FCMP/GT FRn,DRn

FRm,DRm NORM DENORM +0 –0 +INF –INF qNaN sNaN

NORM CMP GT !GT

DENORM

+0 !GT

–0

+INF !GT !GT

–INF GT !GT

qNaN UO

sNaN Invalid

Note: When DN = 1, the value of a denormalized number is treated as 0.

UO means unordered. Unordered is treated as false (!GT).

Possible Exceptions: Invalid operation
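The operand order of the comparison (FRn on the left, FRm on the right) and the treatment of unordered operands can be illustrated in C. This is a sketch of the T bit result only: it does not model the invalid-operation exception, and the function names are hypothetical. It relies on the fact that C relational operators evaluate to 0 when either operand is a NaN.

```c
#include <math.h>

/* FCMP/EQ FRm,FRn: T = 1 when FRn equals FRm (+0 and -0 compare equal). */
int fcmp_eq_t(float frm, float frn)
{
    return frn == frm;
}

/* FCMP/GT FRm,FRn: T = 1 when FRn is greater than FRm. */
int fcmp_gt_t(float frm, float frn)
{
    return frn > frm;
}

/* Nonzero when the pair is unordered, that is, when either operand is a NaN. */
int fcmp_unordered(float frm, float frn)
{
    return isnan(frm) || isnan(frn);
}
```

For example, fcmp_gt_t(1.0f, 2.0f) is 1 because the second operand (FRn) is the larger one, while any call with a NaN operand yields 0, matching the note that unordered is treated as false.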


10.28 FCNVDS Floating-point CoNVert Double to Single precision Floating-Point Instruction

Double-Precision to Single-Precision Conversion

PR Format Summary of Operation Instruction Code Execution States T Bit

0 — — — — —

1 FCNVDS DRm,FPUL (float)DRm → FPUL 1111mmm010111101 2 —

Description

When FPSCR.PR = 1: This instruction converts the double-precision floating-point number in DRm to a single-precision floating-point number, and stores the result in FPUL.

When FPSCR.enable.O/U/I is set, an FPU exception trap is generated regardless of whether or not an exception has occurred. When an exception occurs, correct exception information is reflected in FPSCR.cause and FPSCR.flag, and FPUL is not updated. Appropriate processing should therefore be performed by software.

Operation

void FCNVDS(int m, float *FPUL){

case(FPSCR.PR){

0: undefined_operation(); /* reserved */

1: fcnvds(m, *FPUL); break; /* FCNVDS */

}

}

void fcnvds(int m, float *FPUL)

{

pc += 2;

clear_cause();

case(data_type_of(m, *FPUL)){

NORM :

PZERO :

NZERO : normal_fcnvds(m, *FPUL); break;

DENORM : set_E(); break;

PINF : *FPUL = 0x7f800000; break;

NINF : *FPUL = 0xff800000; break;


qNaN : *FPUL = 0x7fbfffff; break;

sNaN : set_V();

if((FPSCR & ENABLE_V) == 0) *FPUL = 0x7fbfffff;

else fpu_exception_trap(); break;

}

}

void normal_fcnvds(int m, float *FPUL)

{

int sign;

float abs;

union {

float f;

int l;

} dstf,tmpf;

union {

double d;

int l[2];

} dstd;

dstd.d = DR[m>>1];

if(dstd.l[1] & 0x1fffffff) set_I();

if(FPSCR_RM == 1) dstd.l[1] &= 0xe0000000; /* round toward zero*/

dstf.f = dstd.d;

check_single_exception(FPUL, dstf.f);

}
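The mantissa test in normal_fcnvds checks the low 29 bits of the double-precision value, which are the fraction bits that do not fit in a single-precision number. A minimal C sketch of that check follows; it assumes double is a 64-bit IEEE 754 value, the function name fcnvds_drops_bits is hypothetical, and overflow or underflow of the exponent is not covered.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 when narrowing d to single precision must discard nonzero
   fraction bits, i.e. when the low 29 bits of the image are not all 0. */
int fcnvds_drops_bits(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return (bits & UINT64_C(0x1fffffff)) != 0;
}
```

A value such as 1.5 passes through unchanged, whereas 0.1 has nonzero low-order fraction bits and would be reported as inexact.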

FCNVDS Special Cases

FRn +NORM –NORM +0 –0 +INF –INF qNaN sNaN

FCNVDS(FRn FPUL) FCNVDS FCNVDS +0 –0 +INF –INF qNaN Invalid




Possible Exceptions:

• FPU error

• Invalid operation

• Overflow

• Underflow

• Inexact


10.29 FCNVSD Floating-point CoNVert Single to Double precision Floating-Point Instruction

Single-Precision to Double-Precision Conversion

PR Format Summary of Operation Instruction Code Execution States T Bit

0 — — — — —

1 FCNVSD FPUL,DRn (double)FPUL → DRn 1111nnn010101101 2 —

Description

When FPSCR.PR = 1: This instruction converts the single-precision floating-point number in FPUL to a double-precision floating-point number, and stores the result in DRn.

Operation

void FCNVSD(int n, float *FPUL){

pc += 2;

clear_cause();

case(FPSCR_PR){

0: undefined_operation(); /* reserved */

1: fcnvsd (n, *FPUL); break; /* FCNVSD */

}

}

void fcnvsd(int n, float *FPUL)

{

case(fpul_type(FPUL)){

PZERO :

NZERO :

PINF :

NINF : DR[n>>1] = *FPUL; break;

DENORM : set_E(); break;

qNaN : qnan(n); break;

sNaN : invalid(n); break;

}

}

int fpul_type(int *FPUL)


{

int abs;

abs = *FPUL & 0x7fffffff;

if(abs < 0x00800000){

if((FPSCR_DN == 1) || (abs == 0x00000000)){

if(sign_of(src) == 0) return(PZERO);

else return(NZERO);

}

else return(DENORM);


}

else if(abs < 0x7f800000) return(NORM);

else if(abs == 0x7f800000) {

if(sign_of(src) == 0) return(PINF);

else return(NINF);

}

else if(abs < 0x7fc00000) return(qNaN);

else return(sNaN);

}

FCNVSD Special Cases


FCNVSD(FPUL FRn) +NORM –NORM +0 –0 +INF –INF qNaN Invalid





10.30 FDIV Floating-point DIVide Floating-Point Instruction

Floating-Point Division

PR Format Summary of Operation Instruction Code Execution States T Bit

0 FDIV FRm,FRn FRn/FRm → FRn 1111nnnnmmmm0011 10 —

1 FDIV DRm,DRn DRn/DRm → DRn 1111nnn0mmm00011 23 —

Description

When FPSCR.PR = 0: Arithmetically divides the single-precision floating-point number in FRn by the single-precision floating-point number in FRm, and stores the result in FRn.

When FPSCR.PR = 1: Arithmetically divides the double-precision floating-point number in DRn by the double-precision floating-point number in DRm, and stores the result in DRn.


Operation

void FDIV(int m,n) /* FDIV FRm,FRn */

{

pc += 2;

clear_cause();







case PINF:

case NINF: inf(n,sign_of(m)^sign_of(n));break;

case PZERO:

case NZERO: zero(n,sign_of(m)^sign_of(n));break;

case DENORM:set_E(); break;

default: normal_fdiv(m,n); break;

} break;


case PZERO: switch (data_type_of(n)){

case PZERO:

case NZERO: invalid(n);break;

case PINF:

case NINF: break;

default: dz(n,sign_of(m)^sign_of(n));break;

} break;

case NZERO: switch (data_type_of(n)){

case PZERO:

case NZERO: invalid(n); break;

case PINF: inf(n,1); break;

case NINF: inf(n,0); break;

default: dz(n,sign_of(m)^sign_of(n)); break;

} break;

case DENORM: set_E(); break;

case PINF :

case NINF : switch (data_type_of(n)){


case PINF:


default: zero(n,sign_of(m)^sign_of(n));break;

} break;

}

}

void normal_fdiv(int m,n)

{

union {

float f;

int l;

} dstf,tmpf;

union {

double d;

int l[2];

} dstd,tmpd;

union {

int double x;

int l[4];


} tmpx;

if(FPSCR_PR == 0) {

tmpf.f = FR[n]; /* save destination value */

dstf.f /= FR[m]; /* round toward nearest or even */

tmpd.d = dstf.f; /* convert single to double */

tmpd.d *= FR[m];


if((tmpf.f < tmpd.d) && (FPSCR_RM == 1))

dstf.l -= 1; /* round toward zero */

check_single_exception(&FR[n], dstf.f);

}else {

tmpd.d = DR[n>>1]; /* save destination value */

dstd.d /= DR[m>>1]; /* round toward nearest or even */

tmpx.x = dstd.d; /* convert double to int double */

tmpx.x *= DR[m>>1];


if((tmpd.d < tmpx.x) && (FPSCR_RM == 1)) {

dstd.l[1] -= 1; /* round toward zero */


}

check_double_exception(&DR[n>>1], dstd.d);

}

}

FDIV Special Cases

FRm,DRm FRn,DRn


NORM DIV 0 INF Error

+0 DZ Invalid +INF –INF DZ

–0 –INF +INF

+INF 0 +0 –0 Invalid

–INF –0 +0

DENORM Error

qNaN qNaN

sNaN Invalid





Possible Exceptions:

• FPU error

• Invalid operation

• Divide by zero

• Overflow

• Underflow

• Inexact


10.31 FIPR Floating-point Inner PRoduct Floating-Point Instruction

Floating-Point Inner Product

PR Format Summary of Operation Instruction Code Execution States T Bit

0 FIPR FVm,FVn FVn ⋅ FVm → FR[n+3] 1111nnmm11101101 1 —

1 — — — — —

Notes: FV0 = {FR0, FR1, FR2, FR3}

FV4 = {FR4, FR5, FR6, FR7}

FV8 = {FR8, FR9, FR10, FR11}

FV12 = {FR12, FR13, FR14, FR15}

Description

When FPSCR.PR = 0: This instruction calculates the inner product of the two 4-dimensional single-precision floating-point vectors indicated by FVn and FVm, and stores the result in FR[n + 3].

The FIPR instruction is intended for speed rather than accuracy, and therefore the results will differ from those obtained by using a combination of FADD and FMUL instructions. The FIPR execution sequence is as follows:

1. Multiplies all terms. The results are 28 bits long.

2. Aligns these results, rounding them to fit within 30 bits.

3. Adds the aligned values.

4. Performs normalization and rounding.

Special processing is performed in the following cases:

1. If an input value is an sNaN, an invalid exception is generated.

2. If the input values to be multiplied include a combination of 0 and infinity, an invalid exception is generated.

3. In cases other than the above, if the input values include a qNaN, the result will be a qNaN.

4. In cases other than the above, if the input values include infinity:

a. If multiplication results in two or more infinities and the signs are different, an invalid exception will be generated.

b. Otherwise, correct infinities will be stored.

5. If the input values do not include an sNaN, qNaN, or infinity, processing is performed in the normal way.
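For comparison, the inner product can be computed with separate single-precision multiplications and additions, which is the FMUL and FADD combination that the description contrasts with FIPR. The following C sketch is such a reference computation and not a model of the FIPR datapath; the function name dot4_reference is hypothetical.

```c
/* Reference inner product of two 4-element vectors, rounding after each
   multiply and add. FIPR may differ from this in the low-order bits. */
float dot4_reference(const float fvn[4], const float fvm[4])
{
    float sum = 0.0f;
    for (int i = 0; i < 4; i++) {
        sum += fvn[i] * fvm[i];
    }
    return sum;
}
```

A test suite that compares FIPR output against a reference of this kind should therefore allow a small tolerance rather than require bit-exact equality.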



Operation

void FIPR(int m,n) /* FIPR FVm,FVn */

{

if(FPSCR_PR == 0) {

pc += 2;

clear_cause();

fipr(m,n);

}

else undefined_operation();

}

Possible Exceptions:

• Invalid operation

• Overflow

• Underflow

• Inexact


10.32 FLDI0 Floating-point LoaD Immediate 0.0 Floating-Point Instruction

0.0 Load

PR Format Summary of Operation Instruction Code Execution States T Bit

0 FLDI0 FRn 0x00000000 → FRn 1111nnnn10001101 1 —

1 — — — — —

Description

When FPSCR.PR = 0, this instruction loads floating-point 0.0 (0x00000000) into FRn.

Operation

void FLDI0(int n)

{

FR[n] = 0x00000000;

pc += 2;

}



10.33 FLDI1 Floating-point LoaD Immediate 1.0 Floating-Point Instruction

1.0 Load

PR Format Summary of Operation Instruction Code Execution States T Bit

0 FLDI1 FRn 0x3F800000 → FRn 1111nnnn10011101 1 —

1 — — — — —

Description

When FPSCR.PR = 0, this instruction loads floating-point 1.0 (0x3F800000) into FRn.

Operation

void FLDI1(int n)

{

FR[n] = 0x3F800000;

pc += 2;

}
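The constants loaded by FLDI0 and FLDI1 are the single-precision bit images of 0.0 and 1.0. This can be confirmed on a host with a short C sketch, assuming float is a 32-bit IEEE 754 value; the helper name float_from_bits is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Reinterprets a 32-bit pattern as a single-precision value. */
float float_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Here float_from_bits(0x3F800000) compares equal to 1.0f and float_from_bits(0x00000000) compares equal to 0.0f.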



10.34 FLDS Floating-point LoaD to System register Floating-Point Instruction

Transfer to System Register

Format Summary of Operation Instruction Code Execution States T Bit

FLDS FRm,FPUL FRm → FPUL 1111mmmm00011101 1 —

Description

This instruction loads the contents of floating-point register FRm into system register FPUL.

Operation

void FLDS(int m, float *FPUL)

{

*FPUL = FR[m];

pc += 2;

}



10.35 FLOAT Floating-point convert from integer Floating-Point Instruction

Integer to Floating-Point Conversion

PR Format Summary of Operation Instruction Code Execution States T Bit

0 FLOAT FPUL,FRn (float)FPUL → FRn 1111nnnn00101101 1 —

1 FLOAT FPUL,DRn (double)FPUL → DRn 1111nnn000101101 2 —

Description

When FPSCR.PR = 0: Taking the contents of FPUL as a 32-bit integer, converts this integer to a single-precision floating-point number and stores the result in FRn.

When FPSCR.PR = 1: Taking the contents of FPUL as a 32-bit integer, converts this integer to a double-precision floating-point number and stores the result in DRn.

When FPSCR.enable.I = 1, an FPU exception trap is generated regardless of whether or not an exception has occurred. When an exception occurs, correct exception information is reflected in FPSCR.cause and FPSCR.flag, and FRn or DRn is not updated. Appropriate processing should therefore be performed by software.


Operation

void FLOAT(int n, float *FPUL)

{

union {

double d;

int l[2];

} tmp;

pc += 2;

clear_cause();

if(FPSCR.PR==0){

FR[n] = *FPUL; /* convert from integer to float */

tmp.d = *FPUL;

if(tmp.l[1] & 0x1fffffff) inexact();

}else {

DR[n>>1] = *FPUL; /* convert from integer to double */

}

}

Possible Exceptions: Inexact (not generated when FPSCR.PR = 1)
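The inexact case arises only for single precision because a 32-bit integer can need more significant bits than a single-precision number holds, while every 32-bit integer is exactly representable in double precision. The following C sketch tests this on a host, assuming IEEE 754 float and double; the function name float_is_inexact is hypothetical.

```c
#include <stdint.h>

/* Returns 1 when converting v to single precision changes its value. */
int float_is_inexact(int32_t v)
{
    float f = (float)v;                /* may round */
    return (double)f != (double)v;     /* the double conversion is exact */
}
```

Integers whose magnitude does not exceed 2^24 convert exactly; 16777217 (2^24 + 1) is the smallest positive integer that does not.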


10.36 FMAC Floating-point Multiply and ACcumulate Floating-Point Instruction

Floating-Point Multiply and Accumulate

PR Format Summary of Operation Instruction Code Execution States T Bit

0 FMAC FR0,FRm,FRn FR0*FRm+FRn → FRn 1111nnnnmmmm1110 1 —

1 — — — — —

Description

When FPSCR.PR = 0: This instruction arithmetically multiplies the two single-precision floating-point numbers in FR0 and FRm, arithmetically adds the contents of FRn, and stores the result in FRn.
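The multiply-and-accumulate can be approximated on a host by forming the product in double precision, where the product of two single-precision values is exact, and narrowing only at the end. This C sketch is an approximation under that assumption: the sum is rounded once in double precision and again on conversion to float, so it is not guaranteed to match the instruction bit for bit. The function name fmac_sketch is hypothetical.

```c
/* Approximates FMAC FR0,FRm,FRn: returns fr0 * frm + frn. */
float fmac_sketch(float fr0, float frm, float frn)
{
    /* 24-bit x 24-bit significands fit in the 53 bits of a double. */
    double product = (double)fr0 * (double)frm;
    return (float)(product + (double)frn);
}
```

For example, fmac_sketch(2.0f, 3.0f, 1.0f) returns 7.0f.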


Operation

void FMAC(int m,n)

{

pc += 2;

clear_cause();

if(FPSCR_PR == 1) undefined_operation();

else if((data_type_of(0) == sNaN) ||

(data_type_of(m) == sNaN) ||

(data_type_of(n) == sNaN)) invalid(n);


else if((data_type_of(0) == qNaN) ||

(data_type_of(m) == qNaN)) qnan(n);

else if((data_type_of(0) == DENORM) ||

(data_type_of(m) == DENORM)) set_E();

else switch (data_type_of(0)){

case NORM: switch (data_type_of(m)){

case PZERO:




case qNaN: qnan(n); break;

case PZERO:

case NZERO: zero(n,sign_of(0)^ sign_of(m)^sign_of(n));break;

default: break;

}

case PINF:




case PINF:

case NINF: if(sign_of(0)^ sign_of(m)^sign_of(n)) invalid(n);

else inf(n,sign_of(0)^ sign_of(m)); break;

default: inf(n,sign_of(0)^ sign_of(m)); break;

}




case PINF:

case NINF: inf(n,sign_of(n)); break;

case PZERO:

case NZERO:

case NORM: normal_fmac(m,n); break;

} break;

case PZERO:

case NZERO: switch (data_type_of(m)){

case PINF:


case PZERO:

case NZERO:




case PZERO:

case NZERO: zero(n,sign_of(0)^ sign_of(m)^sign_of(n)); break;

default: break;

} break;

} break;


case PINF :

case NINF : switch (data_type_of(m)){

case PZERO:

case NZERO:invalid(n); break;

default: switch (data_type_of(n)){



default: inf(n,sign_of(0)^sign_of(m)^sign_of(n));break;

} break;

} break;

}

}

void normal_fmac(int m,n)

{

union {

int double x;

int l[4];

} dstx,tmpx;

float dstf,srcf;

if((data_type_of(n) == PZERO)|| (data_type_of(n) == NZERO))

srcf = 0.0; /* flush denormalized value */

else srcf = FR[n];

tmpx.x = FR[0]; /* convert single to int double */

tmpx.x *= FR[m]; /* exact product */

dstx.x = tmpx.x + srcf;

if(((dstx.x == srcf) && (tmpx.x != 0.0)) ||

((dstx.x == tmpx.x) && (srcf != 0.0))) {

set_I();

if(sign_of(0)^ sign_of(m)^ sign_of(n)) {

dstx.l[3] -= 1; /* correct result */

if(dstx.l[3] == 0xffffffff) dstx.l[2] -= 1;



}

else dstx.l[3] |= 1;

}

if((dstx.l[1] & 0x01ffffff) || dstx.l[2] || dstx.l[3]) set_I();


if(FPSCR_RM == 1) {

dstx.l[1] &= 0xfe000000; /* round toward zero */

dstx.l[2] = 0x00000000;

dstx.l[3] = 0x00000000;

}

dstf = dstx.x;

check_single_exception(&FR[n],dstf);

}


FMAC Special Cases

FRn FR0 FRm

+Norm -Norm +0 –0 +INF –INF Denorm qNaN sNaN

Norm Norm MAC INF

0 Invalid

INF INF Invalid INF

+0 Norm MAC

0 +0 Invalid

INF INF Invalid INF

–0 +Norm MAC +0 –0 +INF –INF

–Norm –0 +0 –INF +INF

+0 +0 –0 +0 –0 Invalid

–0 –0 +0 –0 +0

INF INF Invalid INF

+INF +Norm +INF Invalid

–Norm +INF

0 Invalid

+INF Invalid +INF

–INF Invalid +INF +INF

–INF +Norm –INF –INF

–Norm

0

+INF Invalid Invalid –INF

–INF –INF –INF Invalid

Denorm Norm

0 Invalid

INF Invalid

!sNaN Denorm Error

qNaN 0 Invalid

INF Invalid

Norm

!sNaN qNaN qNaN

All types sNaN

sNaN all types Invalid





Possible Exceptions:

• FPU error

• Invalid operation

• Overflow

• Underflow

• Inexact


10.37 FMOV Floating-point MOVe Floating-Point Instruction

Floating-Point Transfer

SZ Format Summary of Operation Instruction Code Execution States T Bit

0 1. FMOV FRm,FRn FRm → FRn 1111nnnnmmmm1100 1 —

1 2. FMOV DRm,DRn DRm → DRn 1111nnn0mmm01100 1 —

0 3. FMOV.S FRm,@Rn FRm → (Rn) 1111nnnnmmmm1010 1 —

1 4. FMOV DRm,@Rn DRm → (Rn) 1111nnnnmmm01010 1 —

0 5. FMOV.S @Rm,FRn (Rm) → FRn 1111nnnnmmmm1000 1 —

1 6. FMOV @Rm,DRn (Rm) → DRn 1111nnn0mmmm1000 1 —

0 7. FMOV.S @Rm+,FRn (Rm) → FRn,Rm+=4 1111nnnnmmmm1001 1 —

1 8. FMOV @Rm+,DRn (Rm) → DRn,Rm+=8 1111nnn0mmmm1001 1 —

0 9. FMOV.S FRm,@-Rn Rn-=4,FRm → (Rn) 1111nnnnmmmm1011 1 —

1 10. FMOV DRm,@-Rn Rn-=8,DRm → (Rn) 1111nnnnmmm01011 1 —

0 11. FMOV.S @(R0,Rm),FRn (R0+Rm) → FRn 1111nnnnmmmm0110 1 —

1 12. FMOV @(R0,Rm),DRn (R0+Rm) → DRn 1111nnn0mmmm0110 1 —

0 13. FMOV.S FRm, @(R0,Rn) FRm → (R0+Rn) 1111nnnnmmmm0111 1 —

1 14. FMOV DRm, @(R0,Rn) DRm → (R0+Rn) 1111nnnnmmm00111 1 —

Description

1. This instruction transfers FRm contents to FRn.

2. This instruction transfers DRm contents to DRn.

3. This instruction transfers FRm contents to memory at address indicated by Rn.

4. This instruction transfers DRm contents to memory at address indicated by Rn.

5. This instruction transfers contents of memory at address indicated by Rm to FRn.

6. This instruction transfers contents of memory at address indicated by Rm to DRn.

7. This instruction transfers contents of memory at address indicated by Rm to FRn, and adds 4 to Rm.

8. This instruction transfers contents of memory at address indicated by Rm to DRn, and adds 8 to Rm.

9. This instruction subtracts 4 from Rn, and transfers FRm contents to memory at address indicated by resulting Rn value.

10. This instruction subtracts 8 from Rn, and transfers DRm contents to memory at address indicated by resulting Rn value.

11. This instruction transfers contents of memory at address indicated by (R0 + Rm) to FRn.


12. This instruction transfers contents of memory at address indicated by (R0 + Rm) to DRn.

13. This instruction transfers FRm contents to memory at address indicated by (R0 + Rn).

14. This instruction transfers DRm contents to memory at address indicated by (R0 + Rn).
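The pre-decrement store (item 9) and post-increment load (item 7) pair naturally as a push and a pop on a downward-growing stack. The C sketch below models this with a small byte array; the array mem, its size, and the function names are hypothetical, and no address or bounds checking is performed.

```c
#include <stdint.h>
#include <string.h>

static uint8_t mem[64];                /* hypothetical byte-addressed memory */

/* FMOV.S FRm,@-Rn: subtract 4 from Rn, then store FRm at the new address. */
void fmov_s_save(float frm, uint32_t *rn)
{
    *rn -= 4;
    memcpy(&mem[*rn], &frm, sizeof frm);
}

/* FMOV.S @Rm+,FRn: load from the address in Rm, then add 4 to Rm. */
float fmov_s_restore(uint32_t *rm)
{
    float frn;
    memcpy(&frn, &mem[*rm], sizeof frn);
    *rm += 4;
    return frn;
}
```

Values saved in sequence are restored in reverse order, and the address register returns to its starting value once every saved value has been restored.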

Operation

void FMOV(int m,n) /* FMOV FRm,FRn */

{

FR[n] = FR[m];

pc += 2;

}

void FMOV_DR(int m,n) /* FMOV DRm,DRn */

{

DR[n>>1] = DR[m>>1];

pc += 2;

}

void FMOV_STORE(int m,n) /* FMOV.S FRm,@Rn */

{

store_int(FR[m],R[n]);

pc += 2;

}

void FMOV_STORE_DR(int m,n) /* FMOV DRm,@Rn */

{

store_quad(DR[m>>1],R[n]);

pc += 2;

}

void FMOV_LOAD(int m,n) /* FMOV.S @Rm,FRn */

{

load_int(R[m],FR[n]);

pc += 2;

}

void FMOV_LOAD_DR(int m,n) /* FMOV @Rm,DRn */

{

load_quad(R[m],DR[n>>1]);

pc += 2;

}

void FMOV_RESTORE(int m,n) /* FMOV.S @Rm+,FRn */

{


load_int(R[m],FR[n]);

R[m] += 4;

pc += 2;

}

void FMOV_RESTORE_DR(int m,n) /* FMOV @Rm+,DRn */

{

load_quad(R[m],DR[n>>1]) ;

R[m] += 8;

pc += 2;

}

void FMOV_SAVE(int m,n) /* FMOV.S FRm,@-Rn */

{

store_int(FR[m],R[n]-4);

R[n] -= 4;

pc += 2;

}

void FMOV_SAVE_DR(int m,n) /* FMOV DRm,@-Rn */

{

store_quad(DR[m>>1],R[n]-8);

R[n] -= 8;

pc += 2;

}

void FMOV_INDEX_LOAD(int m,n) /* FMOV.S @(R0,Rm),FRn */

{

load_int(R[0] + R[m],FR[n]);

pc += 2;

}

void FMOV_INDEX_LOAD_DR(int m,n) /*FMOV @(R0,Rm),DRn */

{

load_quad(R[0] + R[m],DR[n>>1]);

pc += 2;

}

void FMOV_INDEX_STORE(int m,n) /*FMOV.S FRm,@(R0,Rn)*/

{

store_int(FR[m], R[0] + R[n]);

pc += 2;

}


void FMOV_INDEX_STORE_DR(int m,n)/*FMOV DRm,@(R0,Rn)*/

{

store_quad(DR[m>>1], R[0] + R[n]);

pc += 2;

}

Possible Exceptions:

• Data TLB miss exception

• Data protection violation exception

• Initial write exception

• Address error


10.38 FMOV Floating-point MOVe extension Floating-Point Instruction

Floating-Point Transfer

PR Format Summary of Operation Instruction Code Execution States T Bit

1 1. FMOV XDm,@Rn XDm → (Rn) 1111nnnnmmm11010 1 —

1 2. FMOV @Rm,XDn (Rm) → XDn 1111nnn1mmmm1000 1 —

1 3. FMOV @Rm+,XDn (Rm) → XDn,Rm+=8 1111nnn1mmmm1001 1 —

1 4. FMOV XDm,@-Rn Rn-=8,XDm → (Rn) 1111nnnnmmm11011 1 —

1 5. FMOV @(R0,Rm),XDn (R0+Rm) → XDn 1111nnn1mmmm0110 1 —

1 6. FMOV XDm,@(R0,Rn) XDm → (R0+Rn) 1111nnnnmmm10111 1 —

1 7. FMOV XDm,XDn XDm → XDn 1111nnn1mmm11100 1 —

1 8. FMOV XDm,DRn XDm → DRn 1111nnn0mmm11100 1 —

1 9. FMOV DRm,XDn DRm → XDn 1111nnn1mmm01100 1 —

Description

1. This instruction transfers XDm contents to memory at address indicated by Rn.

2. This instruction transfers contents of memory at address indicated by Rm to XDn.

3. This instruction transfers contents of memory at address indicated by Rm to XDn, and adds 8 to Rm.

4. This instruction subtracts 8 from Rn, and transfers XDm contents to memory at address indicated by resulting Rn value.

5. This instruction transfers contents of memory at address indicated by (R0 + Rm) to XDn.

6. This instruction transfers XDm contents to memory at address indicated by (R0 + Rn).

7. This instruction transfers XDm contents to XDn.

8. This instruction transfers XDm contents to DRn.

9. This instruction transfers DRm contents to XDn.


Operation

void FMOV_STORE_XD(int m,n) /* FMOV XDm,@Rn */

{

store_quad(XD[m>>1],R[n]);

pc += 2;

}

void FMOV_LOAD_XD(int m,n) /* FMOV @Rm,XDn */

{

load_quad(R[m],XD[n>>1]);

pc += 2;

}

void FMOV_RESTORE_XD(int m,n) /* FMOV @Rm+,XDn */

{

load_quad(R[m],XD[n>>1]);

R[m] += 8;

pc += 2;

}

void FMOV_SAVE_XD(int m,n) /* FMOV XDm,@-Rn */

{

store_quad(XD[m>>1],R[n]-8);

R[n] -= 8;

pc += 2;

}

void FMOV_INDEX_LOAD_XD(int m,n) /* FMOV @(R0,Rm),XDn */

{

load_quad(R[0] + R[m],XD[n>>1]);

pc += 2;

}

void FMOV_INDEX_STORE_XD(int m,n) /* FMOV XDm,@(R0,Rn) */

{

store_quad(XD[m>>1], R[0] + R[n]);

pc += 2;

}

void FMOV_XDXD(int m,n) /* FMOV XDm,XDn */

{

XD[n>>1] = XD[m>>1];

pc += 2;


}

void FMOV_XDDR(int m,n) /* FMOV XDm,DRn */

{

DR[n>>1] = XD[m>>1];

pc += 2;

}

void FMOV_DRXD(int m,n) /* FMOV DRm,XDn */

{

XD[n>>1] = DR[m>>1];

pc += 2;

}


Possible Exceptions:

• Data TLB miss exception

• Data protection violation exception

• Initial write exception

• Address error


10.39 FMUL Floating-point MULtiply Floating-Point Instruction

Floating-Point Multiplication

PR Format Summary of Operation Instruction Code Execution States T Bit

0 FMUL FRm,FRn FRn*FRm → FRn 1111nnnnmmmm0010 1 —

1 FMUL DRm,DRn DRn*DRm → DRn 1111nnn0mmm00010 6 —

Description

When FPSCR.PR = 0: Arithmetically multiplies the two single-precision floating-point numbers in FRn and FRm, and stores the result in FRn.

When FPSCR.PR = 1: Arithmetically multiplies the two double-precision floating-point numbers in DRn and DRm, and stores the result in DRn.


Operation

void FMUL(int m,n)

{

pc += 2;

clear_cause();







else switch (data_type_of(m)){


case PZERO:

case NZERO: zero(n,sign_of(m)^sign_of(n)); break;

case PINF:

case NINF: inf(n,sign_of(m)^sign_of(n)); break;

default: normal_fmul(m,n); break;


} break;

case PZERO:


case PINF:


default: zero(n,sign_of(m)^sign_of(n));break;

} break;

case PINF :

case NINF : switch (data_type_of(n)){

case PZERO:

case NZERO: invalid(n); break;

default: inf(n,sign_of(m)^sign_of(n));break;

} break;

}

}

FMUL Special Cases

FRm,DRm FRn,DRn


NORM MUL 0 INF

+0 0 +0 –0 Invalid

–0 –0 +0

+INF INF Invalid +INF –INF

–INF –INF +INF

DENORM Error

qNaN qNaN

sNaN Invalid




Possible Exceptions:

• FPU error

• Invalid operation

• Overflow

• Underflow

• Inexact


10.40 FNEG Floating-point NEGate value Floating-Point Instruction

Floating-Point Sign Inversion

PR Format Summary of Operation Instruction Code Execution States T Bit

0 FNEG FRn -FRn → FRn 1111nnnn01001101 1 —

1 FNEG DRn -DRn → DRn 1111nnn001001101 1 —

Description

This instruction inverts the most significant bit (sign bit) of the contents of floating-point register FRn/DRn, and stores the result in FRn/DRn.

The cause and flag fields in FPSCR are not updated.

Operation

void FNEG (int n){

FR[n] = -FR[n];

pc += 2;

}

/* Same operation is performed regardless of precision. */



10.41 FRCHG FR-bit CHanGe Floating-Point Instruction

FR Bit Inversion

PR Format Summary of Operation Instruction Code Execution States T Bit

0 FRCHG FPSCR.FR=~FPSCR.FR 1111101111111101 1 —

1 — — — — —

Description

This instruction inverts the FR bit in the floating-point status/control register FPSCR. When the FR bit in FPSCR is changed, FR0 to FR15 in FPPR0 to FPPR31 become XR0 to XR15, and XR0 to XR15 become FR0 to FR15. When FPSCR.FR = 0, FPPR0 to FPPR15 correspond to FR0 to FR15, and FPPR16 to FPPR31 correspond to XR0 to XR15. When FPSCR.FR = 1, FPPR16 to FPPR31 correspond to FR0 to FR15, and FPPR0 to FPPR15 correspond to XR0 to XR15.
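The bank mapping can be pictured as a 32-entry register file whose two halves swap names when the FR bit toggles; no data is moved. The following C sketch models only this naming. The type fpu_banks and the functions fr_reg, xr_reg, and frchg are hypothetical, and the PR = 0 precondition is not modeled.

```c
typedef struct {
    float fppr[32];                    /* FPPR0 to FPPR31 */
    unsigned fr;                       /* models FPSCR.FR (0 or 1) */
} fpu_banks;

/* FRn: FPPR0-15 when FR = 0, FPPR16-31 when FR = 1. */
float *fr_reg(fpu_banks *f, int n)
{
    return &f->fppr[(f->fr ? 16 : 0) + n];
}

/* XRn: the other bank. */
float *xr_reg(fpu_banks *f, int n)
{
    return &f->fppr[(f->fr ? 0 : 16) + n];
}

/* FRCHG: invert the FR bit; register contents stay where they are. */
void frchg(fpu_banks *f)
{
    f->fr ^= 1u;
}
```

A value written to FR3 is visible as XR3 after one frchg call and as FR3 again after a second call.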

Operation

void FRCHG() /* FRCHG */

{

if(FPSCR_PR == 0){

FPSCR ^= 0x00200000; /* bit 21 */

PC += 2;

}

else undefined_operation();


}



10.42 FSCHG Sz-bit CHanGe Floating-Point Instruction

SZ Bit Inversion

PR Format Summary of Operation Instruction Code Execution States T Bit

0 FSCHG FPSCR.SZ=~FPSCR.SZ 1111001111111101 1 —

1 — — — — —

Description

This instruction inverts the SZ bit in the floating-point status/control register FPSCR. Changing the SZ bit in FPSCR switches FMOV instruction data transfer between one single-precision data unit and a data pair. When FPSCR.SZ = 0, the FMOV instruction transfers one single-precision data unit. When FPSCR.SZ = 1, the FMOV instruction transfers two single-precision data units as a pair.

Operation

void FSCHG() /* FSCHG */

{

if(FPSCR_PR == 0){

FPSCR ^= 0x00100000; /* bit 20 */

PC += 2;

}

else undefined_operation();


}



10.43 FSQRT Floating-point SQuare RooT Floating-Point Instruction

Floating-Point Square Root

PR Format Summary of Operation Instruction Code Execution States T Bit

0 FSQRT FRn √FRn → FRn 1111nnnn01101101 9 —

1 FSQRT DRn √DRn → DRn 1111nnn001101101 22 —

Description

When FPSCR.PR = 0: Finds the arithmetical square root of the single-precision floating-point number in FRn, and stores the result in FRn.

When FPSCR.PR = 1: Finds the arithmetical square root of the double-precision floating-point number in DRn, and stores the result in DRn.

When FPSCR.enable.I is set, an FPU exception trap is generated regardless of whether or not an exception has occurred. When an exception occurs, correct exception information is reflected in FPSCR.cause and FPSCR.flag, and FRn or DRn is not updated. Appropriate processing should therefore be performed by software.

Operation

void FSQRT(int n){

pc += 2;

clear_cause();

switch(data_type_of(n)){

case NORM : if(sign_of(n) == 0) normal_fsqrt(n);

else invalid(n); break;

case DENORM: if(sign_of(n) == 0) set_E();

else invalid(n); break;

case PZERO :

case NZERO :

case PINF : break;

case NINF : invalid(n); break;

case qNaN : qnan(n); break;

case sNaN : invalid(n); break;

}

}

void normal_fsqrt(int n)


{

union {

float f;

int l;

} dstf,tmpf;

union {

double d;

int l[2];

} dstd,tmpd;

union {

int double x;

int l[4];

} tmpx;

if(FPSCR_PR == 0) {

tmpf.f = FR[n]; /* save destination value */

dstf.f = sqrt(FR[n]); /* round toward nearest or even */

tmpd.d = dstf.f; /* convert single to double */

tmpd.d *= dstf.f;


if((tmpf.f < tmpd.d) && (FPSCR_RM == 1))

dstf.l -= 1; /* round toward zero */

if(FPSCR & ENABLE_I) fpu_exception_trap();

else FR[n] = dstf.f;

}else {

tmpd.d = DR[n>>1]; /* save destination value */

dstd.d = sqrt(DR[n>>1]); /* round toward nearest or even */

tmpx.x = dstd.d; /* convert double to int double */

tmpx.x *= dstd.d;


if((tmpd.d < tmpx.x) && (FPSCR_RM == 1)) {

dstd.l[1] -= 1; /* round toward zero */


}

if(FPSCR & ENABLE_I) fpu_exception_trap();

else DR[n>>1] = dstd.d;

}

}


FSQRT Special Cases


FSQRT(FRn) SQRT Invalid +0 –0 +INF Invalid qNaN Invalid




Possible Exceptions:

• FPU error

• Invalid operation

• Inexact


10.44 FSTS Floating-point STore System register Floating-Point Instruction

Transfer from System Register

Format Summary of Operation Instruction Code Execution States T Bit

FSTS FPUL,FRn FPUL → FRn 1111nnnn00001101 1 —

Description

This instruction transfers the contents of system register FPUL to floating-point register FRn.

Operation

void FSTS(int n, float *FPUL)

{

FR[n] = *FPUL;

pc += 2;

}



10.45 FSUB Floating-point SUBtract Floating-Point Instruction

Floating-Point Subtraction

PR Format Summary of Operation Instruction Code Execution States T Bit

0 FSUB FRm,FRn FRn-FRm → FRn 1111nnnnmmmm0001 1 —

1 FSUB DRm,DRn DRn-DRm → DRn 1111nnn0mmm00001 6 —

Description

When FPSCR.PR = 0: Arithmetically subtracts the single-precision floating-point number in FRm from the single-precision floating-point number in FRn, and stores the result in FRn.

When FPSCR.PR = 1: Arithmetically subtracts the double-precision floating-point number in DRm from the double-precision floating-point number in DRn, and stores the result in DRn.


Operation

void FSUB (int m,n)

{

pc += 2;

clear_cause();









case NORM: normal_faddsub(m,n,SUB); break;

case PZERO:

case NZERO: register_copy(m,n); FR[n] = -FR[n];break;

default: break;


} break;

case PZERO: break;


case NZERO: zero(n,0); break;

default: break;

} break;

case PINF: switch (data_type_of(n)){

case PINF: invalid(n); break;


} break;




} break;

}

}

FSUB Special Cases

FRm,DRm FRn,DRn


NORM SUB +INF –INF

+0 –0

–0 +0

+INF –INF Invalid

–INF +INF Invalid

DENORM Error

qNaN qNaN

sNaN Invalid




• Overflow

• Underflow

• Inexact


10.46 FTRC Floating-point TRuncate and Convert to integer Floating-Point Instruction

Conversion to Integer


0 FTRC FRm,FPUL (long)FRm → FPUL 1111mmmm00111101 1 —

1 FTRC DRm,FPUL (long)DRm → FPUL 1111mmm000111101 2 —

Description

When FPSCR.PR = 0: Converts the single-precision floating-point number in FRm to a 32-bit integer, and stores the result in FPUL.

When FPSCR.PR = 1: Converts the double-precision floating-point number in DRm to a 32-bit integer, and stores the result in FPUL.

The rounding mode is always truncation.

Operation

#define N_INT_SINGLE_RANGE 0xcf000000 & 0x7fffffff /* -1.000000 * 2^31 */

#define P_INT_SINGLE_RANGE 0x4effffff /* 1.fffffe * 2^30 */

#define N_INT_DOUBLE_RANGE 0xc1e0000000200000 & 0x7fffffffffffffff

#define P_INT_DOUBLE_RANGE 0x41e0000000000000

void FTRC(int m, int *FPUL)

{

pc += 2;

clear_cause();

if(FPSCR.PR==0){

case(ftrc_single_type_of(m)){

NORM: *FPUL = FR[m]; break;

PINF: ftrc_invalid(0); break;

NINF: ftrc_invalid(1); break;

}

}

else{ /* case FPSCR.PR=1 */

case(ftrc_double_type_of(m)){


NORM: *FPUL = DR[m>>1]; break;

PINF: ftrc_invalid(0); break;

NINF: ftrc_invalid(1); break;

}

}

}

int ftrc_single_type_of(int m)

{

if(sign_of(m) == 0){

if(FR_HEX[m] > 0x7f800000) return(NINF); /* NaN */

else if(FR_HEX[m] > P_INT_SINGLE_RANGE)

return(PINF); /* out of range,+INF */

else return(NORM); /* +0,+NORM */

}else {

if((FR_HEX[m] & 0x7fffffff) > N_INT_SINGLE_RANGE)

return(NINF); /* out of range, -INF, NaN */

else return(NORM); /* -0,-NORM */

}

}

int ftrc_double_type_of(int m)

{

if(sign_of(m) == 0){

if((FR_HEX[m] > 0x7ff00000) ||

((FR_HEX[m] == 0x7ff00000) &&

(FR_HEX[m+1] != 0x00000000))) return(NINF); /* NaN */

else if(DR_HEX[m>>1] >= P_INT_DOUBLE_RANGE)

return(PINF); /* out of range,+INF */

else return(NORM); /* +0,+NORM */

}else {

if((DR_HEX[m>>1] & 0x7fffffffffffffff) >= N_INT_DOUBLE_RANGE)

return(NINF); /* out of range, -INF, NaN */

else return(NORM); /* -0,-NORM */

}

}

void ftrc_invalid(int sign, int *FPUL)

{

set_V();


if((FPSCR & ENABLE_V) == 0){

if(sign == 0) *FPUL = 0x7fffffff;

else *FPUL = 0x80000000;

}


}
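The conversion and its out-of-range handling can be sketched in portable C. This is only an illustrative model, assuming the invalid-operation exception is disabled (FPSCR.enable.V = 0); the function name `ftrc_model` and the `invalid` output parameter are hypothetical and do not appear in the manual.

```c
#include <stdint.h>
#include <math.h>

/* Hypothetical model of FTRC with the V exception disabled.
   Returns the value that would be stored in FPUL; *invalid is set to 1
   when the invalid-operation cause/flag would be raised. */
static int32_t ftrc_model(double x, int *invalid)
{
    *invalid = 0;
    if (isnan(x)) {                 /* qNaN and sNaN both yield -MAX */
        *invalid = 1;
        return INT32_MIN;
    }
    if (x >= 2147483648.0) {        /* positive out of range, +INF */
        *invalid = 1;
        return INT32_MAX;
    }
    if (x <= -2147483649.0) {       /* negative out of range, -INF */
        *invalid = 1;
        return INT32_MIN;
    }
    return (int32_t)x;              /* C conversion truncates toward zero */
}
```

For example, `ftrc_model(-1.9, &inv)` yields -1 because the fraction is discarded rather than rounded.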

FTRC Special Cases

FRn,DRn                  FTRC(FRn,DRn)

NORM                     TRC
+0                       0
–0                       0
Positive Out of Range    Invalid, +MAX
Negative Out of Range    Invalid, –MAX
+INF                     Invalid, +MAX
–INF                     Invalid, –MAX
qNaN                     Invalid, –MAX
sNaN                     Invalid, –MAX




10.47 FTRV Floating-point TRansform Vector Floating-Point Instruction

Vector Transformation


0 FTRV XMTRX,FVn XMTRX*FVn → FVn 1111nn0111111101 4 —

1 — — — — —

Description

When FPSCR.PR = 0: This instruction takes the contents of floating-point registers XF0 to XF15 indicated by XMTRX as a 4-row × 4-column matrix, takes the contents of floating-point registers FR[n] to FR[n + 3] indicated by FVn as a 4-dimensional vector, multiplies the array by the vector, and stores the results in FV[n].

            XMTRX                    FVn          FVn
XF[0]  XF[4]  XF[8]   XF[12]       FR[n]        FR[n]
XF[1]  XF[5]  XF[9]   XF[13]   ×   FR[n+1]  →   FR[n+1]
XF[2]  XF[6]  XF[10]  XF[14]       FR[n+2]      FR[n+2]
XF[3]  XF[7]  XF[11]  XF[15]       FR[n+3]      FR[n+3]

The FTRV instruction is intended for speed rather than accuracy, and therefore the results will differ from those obtained by using a combination of FADD and FMUL instructions. The FTRV execution sequence is as follows:

1. Multiplies all terms. The results are 30 bits long.

2. Aligns these results, rounding them to fit within 28 bits.

3. Adds the aligned values.

4. Performs normalization and rounding.

Special processing is performed in the following cases:

1. If an input value is an sNaN, an invalid exception is generated.

2. If the input values to be multiplied include a combination of 0 and infinity, an invalid operation exception is generated.

3. In cases other than the above, if the input values include a qNaN, the result will be a qNaN.

4. In cases other than the above, if the input values include infinity:

a. If multiplication results in two or more infinities and the signs are different, an invalid exception will be generated.

b. Otherwise, correct infinities will be stored.

5. If the input values do not include an sNaN, qNaN, or infinity, processing is performed in the normal way.

When FPSCR.enable.V/O/U/I is set, an FPU exception trap is generated regardless of whether or not an exception has occurred. When an exception occurs, correct exception information is reflected in FPSCR.cause and FPSCR.flag, and FRn or DRn is not updated. Appropriate processing should therefore be performed by software.

Operation

void FTRV (int n) /* FTRV FVn */

{

float saved_vec[4],result_vec[4];

int saved_fpscr;

int dst,i,j;

if(FPSCR_PR == 0) {

PC += 2;

clear_cause();

saved_fpscr = FPSCR;

FPSCR &= ~ENABLE_VOUI; /* mask VOUI enable */

dst = 12 - n; /* select other vector than FVn */

for(i=0;i<4;i++)saved_vec [i] = FR[dst+i];

for(i=0;i<4;i++){

for(j=0;j<4;j++) FR[dst+j] = XF[i+4*j];

fipr(n,dst);

saved_fpscr |= FPSCR & (CAUSE|FLAG) ;

result_vec [i] = FR[dst+3];

}

for(i=0;i<4;i++)FR[dst+i] = saved_vec [i];

FPSCR = saved_fpscr;

if(FPSCR & ENABLE_VOUI) fpu_exception_trap();

else for(i=0;i<4;i++) FR[n+i] = result_vec [i];

}


}
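The register-to-matrix mapping may be easier to see in a plain C reference routine. The sketch below is an assumption-laden illustration: `ftrv_reference` is a hypothetical name, it uses ordinary single-precision arithmetic, and it does not reproduce the hardware's reduced-precision intermediate steps, so low-order bits may differ from the real instruction.

```c
/* Hypothetical reference model of FTRV.
   xf[] holds XF0..XF15 in column-major order: xf[i + 4*j] is the element
   in row i, column j. fv[] holds FR[n]..FR[n+3] and receives the result. */
static void ftrv_reference(const float xf[16], float fv[4])
{
    float out[4];

    for (int i = 0; i < 4; i++) {          /* one inner product per row */
        float acc = 0.0f;
        for (int j = 0; j < 4; j++)
            acc += xf[i + 4 * j] * fv[j];
        out[i] = acc;
    }
    /* Write back only after all four rows are computed, because every
       row reads the complete original vector. */
    for (int i = 0; i < 4; i++)
        fv[i] = out[i];
}
```

Buffering the results in `out` mirrors the `result_vec` array in the Operation listing, which likewise delays the update of FVn until all four inner products are done.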



• Overflow

• Underflow

• Inexact


10.48 JMP JuMP Branch Instruction

Unconditional Branch Delayed Branch Instruction


JMP @Rn Rn → PC 0100nnnn00101011 2 —

Description

Unconditionally makes a delayed branch to the address specified by Rn.

Notes



Operation

JMP(int n)/* JMP @Rn */

{

unsigned int temp;

temp=PC;

PC=R[n];

Delay_Slot(temp+2);

}

Example

MOV.L JMP_TABLE,R0 ;R0 = TRGET address

JMP @R0 ;Branch to TRGET.


.align 4

JMP_TABLE: .data.l TRGET ;Jump table

...........

TRGET: ADD #1,R1 ;← Branch destination


10.49 JSR Jump to SubRoutine Branch Instruction

Branch to Subroutine Procedure Delayed Branch Instruction


JSR @Rn PC+4 → PR, Rn → PC 0100nnnn00001011 2 —

Description

This instruction makes a delayed branch to the subroutine procedure at the specified address after execution of the following instruction. Return address (PC + 4) is saved in PR, and a branch is made to the address indicated by general register Rn. JSR is used in combination with RTS for subroutine procedure calls.

Notes



Operation

JSR(int n)/* JSR @Rn */

{

unsigned int temp;

temp=PC;

PR=PC+4;

PC=R[n];

Delay_Slot(temp+2);

}
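The address arithmetic of the delayed call can be illustrated with a small state model. This is a sketch only; `struct cpu` and `jsr_model` are hypothetical names introduced for the illustration, and executing the delay-slot instruction itself is left to the caller.

```c
#include <stdint.h>

/* Hypothetical miniature CPU state: only the registers JSR touches. */
struct cpu {
    uint32_t pc;
    uint32_t pr;
    uint32_t r[16];
};

/* Models JSR @Rn. Returns the address of the delay-slot instruction,
   which must be executed before execution continues at c->pc. */
static uint32_t jsr_model(struct cpu *c, int n)
{
    uint32_t slot = c->pc + 2;   /* instruction immediately after JSR */

    c->pr = c->pc + 4;           /* return address skips the delay slot */
    c->pc = c->r[n];             /* branch destination */
    return slot;
}
```

With JSR at H'1000 and R0 = H'2000, the model reports the delay slot at H'1002, sets PR to H'1004, and sets PC to H'2000.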


Example

MOV.L JSR_TABLE,R0 ;R0 = TRGET address

JSR @R0 ;Branch to TRGET.

XOR R1,R1 ;XOR executed before branch.

ADD R0,R1 ;← Procedure return destination (PR contents)

.......

.align 4

JSR_TABLE: .data.l TRGET ;Jump table

TRGET: NOP ;← Entry to procedure

MOV R2,R3 ;


MOV #70,R1 ;MOV executed before RTS.


10.50 LDC LoaD to Control register System Control Instruction

Load to Control Register (Privileged Instruction)


LDC Rm, SR Rm → SR 0100mmmm00001110 4 LSB

LDC Rm, GBR Rm → GBR 0100mmmm00011110 3 —

LDC Rm, VBR Rm → VBR 0100mmmm00101110 1 —

LDC Rm, SSR Rm → SSR 0100mmmm00111110 1 —

LDC Rm, SPC Rm → SPC 0100mmmm01001110 1 —

LDC Rm, DBR Rm → DBR 0100mmmm11111010 1 —

LDC Rm, R0_BANK Rm → R0_BANK 0100mmmm10001110 1 —








LDC.L @Rm+, SR (Rm) → SR, Rm+4 → Rm 0100mmmm00000111 4 LSB

LDC.L @Rm+, GBR (Rm) → GBR, Rm+4 → Rm 0100mmmm00010111 3 —

LDC.L @Rm+, VBR (Rm) → VBR, Rm+4 → Rm 0100mmmm00100111 1 —

LDC.L @Rm+, SSR (Rm) → SSR, Rm+4 → Rm 0100mmmm00110111 1 —

LDC.L @Rm+, SPC (Rm) → SPC, Rm+4 → Rm 0100mmmm01000111 1 —

LDC.L @Rm+, DBR (Rm) → DBR, Rm+4 → Rm 0100mmmm11110110 1 —

LDC.L @Rm+, R0_BANK (Rm) → R0_BANK, Rm+4 → Rm 0100mmmm10000111 1 —








Description

These instructions store the source operand in the control register SR, GBR, VBR, SSR, SPC, DBR, or R0_BANK to R7_BANK.


Notes

With the exception of LDC Rm,GBR and LDC.L @Rm+,GBR, the LDC/LDC.L instructions are privileged instructions and can only be used in privileged mode. Use in user mode will cause an illegal instruction exception. However, LDC Rm,GBR and LDC.L @Rm+,GBR can also be used in user mode.

With the LDC Rm, Rn_BANK and LDC.L @Rm+, Rn_BANK instructions, Rn_BANK0 is accessed when the RB bit in the SR register is 1, and Rn_BANK1 is accessed when this bit is 0.

Operation

LDCSR(int m) /* LDC Rm,SR : Privileged */

{

SR=R[m]&0x700083F3;

PC+=2;

}

LDCGBR(int m) /* LDC Rm,GBR */

{

GBR=R[m];

PC+=2;

}

LDCVBR(int m) /* LDC Rm,VBR : Privileged */

{

VBR=R[m];

PC+=2;

}

LDCSSR(int m) /* LDC Rm,SSR : Privileged */

{

SSR=R[m];

PC+=2;

}

LDCSPC(int m) /* LDC Rm,SPC : Privileged */

{

SPC=R[m];


PC+=2;

}

LDCDBR(int m) /* LDC Rm,DBR : Privileged */

{

DBR=R[m];

PC+=2;

}

LDCRn_BANK(int m) /* LDC Rm,Rn_BANK : Privileged */

/* n=0–7 */

{

Rn_BANK=R[m];

PC+=2;

}

LDCMSR(int m) /* LDC.L @Rm+,SR : Privileged */

{

SR=Read_Long(R[m])&0x700083F3;

R[m]+=4;

PC+=2;

}

LDCMGBR(int m) /* LDC.L @Rm+,GBR */

{

GBR=Read_Long(R[m]);

R[m]+=4;

PC+=2;

}

LDCMVBR(int m) /* LDC.L @Rm+,VBR : Privileged */

{

VBR=Read_Long(R[m]);

R[m]+=4;

PC+=2;

}


LDCMSSR(int m) /* LDC.L @Rm+,SSR : Privileged */

{

SSR=Read_Long(R[m]);

R[m]+=4;

PC+=2;

}

LDCMSPC(int m) /* LDC.L @Rm+,SPC : Privileged */

{

SPC=Read_Long(R[m]);

R[m]+=4;

PC+=2;

}

LDCMDBR(int m) /* LDC.L @Rm+,DBR : Privileged */

{

DBR=Read_Long(R[m]);

R[m]+=4;

PC+=2;

}

LDCMRn_BANK(int m) /* LDC.L @Rm+,Rn_BANK : Privileged */

/* n=0–7 */

{

Rn_BANK=Read_Long(R[m]);

R[m]+=4;

PC+=2;

}

Possible Exceptions:

• General illegal instruction exception

• Illegal slot instruction exception

• Data TLB miss exception

• Data TLB protection violation exception

• Address error


10.51 LDS LoaD to FPU System register System Control Instruction

Load to FPU System Register


LDS Rm,FPUL Rm → FPUL 0100mmmm01011010 1 —

LDS.L @Rm+,FPUL (Rm) → FPUL, Rm+4 → Rm 0100mmmm01010110 1 —

LDS Rm,FPSCR Rm → FPSCR 0100mmmm01101010 1 —

LDS.L @Rm+,FPSCR (Rm) → FPSCR, Rm+4 → Rm 0100mmmm01100110 1 —

Description

This instruction loads the source operand into FPU system registers FPUL and FPSCR.

Operation

#define FPSCR_MASK 0x003FFFFF

LDSFPUL(int m, int *FPUL) /* LDS Rm,FPUL */

{

*FPUL=R[m];

PC+=2;

}

LDSMFPUL(int m, int *FPUL) /* LDS.L @Rm+,FPUL */

{

*FPUL=Read_Long(R[m]);

R[m]+=4;

PC+=2;

}

LDSFPSCR(int m) /* LDS Rm,FPSCR */

{

FPSCR=R[m] & FPSCR_MASK;

PC+=2;

}

LDSMFPSCR(int m) /* LDS.L @Rm+,FPSCR */

{

FPSCR=Read_Long(R[m]) & FPSCR_MASK;


R[m]+=4;

PC+=2;

}


• Data access protection exception

• Address error


10.52 LDS LoaD to System register System Control Instruction

Load to System Register


LDS Rm,MACH Rm → MACH 0100mmmm00001010 1 —

LDS Rm,MACL Rm → MACL 0100mmmm00011010 1 —

LDS Rm,PR Rm → PR 0100mmmm00101010 2 —

LDS.L @Rm+,MACH (Rm) → MACH, Rm + 4 → Rm 0100mmmm00000110 1 —

LDS.L @Rm+,MACL (Rm) → MACL, Rm + 4 → Rm 0100mmmm00010110 1 —

LDS.L @Rm+,PR (Rm) → PR, Rm + 4 → Rm 0100mmmm00100110 2 —

Description

Stores the source operand into the system registers MACH, MACL, or PR.

Operation

LDSMACH(int m) /* LDS Rm,MACH */

{

MACH=R[m];

PC+=2;

}

LDSMACL(int m) /* LDS Rm,MACL */

{

MACL=R[m];

PC+=2;

}

LDSPR(int m) /* LDS Rm,PR */

{

PR=R[m];

PC+=2;

}

LDSMMACH(int m) /* LDS.L @Rm+,MACH */

{


MACH=Read_Long(R[m]);

R[m]+=4;

PC+=2;

}

LDSMMACL(int m) /* LDS.L @Rm+,MACL */

{

MACL=Read_Long(R[m]);

R[m]+=4;

PC+=2;

}

LDSMPR(int m) /* LDS.L @Rm+,PR */

{

PR=Read_Long(R[m]);

R[m]+=4;

PC+=2;

}

Example

LDS R0,PR ; Before execution R0 = H’12345678, PR = H’00000000

; After execution PR = H’12345678

LDS.L @R15+,MACL ; Before execution R15 = H’10000000

; After execution R15 = H’10000004, MACL = (H’10000000)


10.53 LDTLB LoaD PTEH/PTEL/PTEA to TLB System Control Instruction

Load to TLB (Privileged Instruction)


LDTLB PTEH/PTEL/PTEA → TLB 0000000000111000 1 —

Description

This instruction loads the contents of the PTEH/PTEL/PTEA registers into the TLB (translation lookaside buffer) specified by MMUCR.URC (random counter field in the MMU control register).

LDTLB is a privileged instruction, and can only be used in privileged mode. Use of this instruction in user mode will cause an illegal instruction exception.

Notes

As this instruction loads the contents of the PTEH/PTEL/PTEA registers into a TLB, it should be used either with the MMU disabled, or in the P1 or P2 virtual space with the MMU enabled (see section 3, Memory Management Unit, for details). After this instruction is issued, there must be at least one instruction between the LDTLB instruction and issuance of an instruction relating to address to areas P0, U0, and P3 (i.e. BRAF, BSRF, JMP, JSR, RTS, or RTE).


Operation

LDTLB( ) /*LDTLB */

{

TLB[MMUCR.URC].ASID=PTEH & 0x000000FF;

TLB[MMUCR.URC].VPN=(PTEH & 0xFFFFFC00)>>10;

TLB[MMUCR.URC].PPN=(PTEL & 0x1FFFFC00)>>10;

TLB[MMUCR.URC].SZ=(PTEL & 0x00000080)>>6 |

(PTEL & 0x00000010)>>4;

TLB[MMUCR.URC].SH=(PTEL & 0x00000002)>>1;

TLB[MMUCR.URC].PR=(PTEL & 0x00000060)>>5;

TLB[MMUCR.URC].WT=(PTEL & 0x00000001);

TLB[MMUCR.URC].C=(PTEL & 0x00000008)>>3;

TLB[MMUCR.URC].D=(PTEL & 0x00000004)>>2;

TLB[MMUCR.URC].V=(PTEL & 0x00000100)>>8;

TLB[MMUCR.URC].SA=(PTEA & 0x00000007);

TLB[MMUCR.URC].TC=(PTEA & 0x00000008)>>3;

PC+=2;

}
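The field extraction can be checked on a host machine with a short C routine. This is a sketch under one stated assumption: VPN and ASID are taken from PTEH, while PPN, V, SZ, PR, C, D, SH, and WT are taken from PTEL, and SA and TC from PTEA. The names `struct tlb_entry` and `ldtlb_fields` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical container for the fields of one TLB entry. */
struct tlb_entry {
    uint32_t asid, vpn, ppn, sz, sh, pr, wt, c, d, v, sa, tc;
};

/* Splits the three page-table-entry registers into TLB fields. */
static struct tlb_entry ldtlb_fields(uint32_t pteh, uint32_t ptel,
                                     uint32_t ptea)
{
    struct tlb_entry e;

    e.asid = pteh & 0x000000FFu;
    e.vpn  = (pteh & 0xFFFFFC00u) >> 10;
    e.ppn  = (ptel & 0x1FFFFC00u) >> 10;
    /* SZ is a 2-bit field assembled from PTEL bit 7 and bit 4. */
    e.sz   = ((ptel & 0x00000080u) >> 6) | ((ptel & 0x00000010u) >> 4);
    e.sh   = (ptel & 0x00000002u) >> 1;
    e.pr   = (ptel & 0x00000060u) >> 5;
    e.wt   = ptel & 0x00000001u;
    e.c    = (ptel & 0x00000008u) >> 3;
    e.d    = (ptel & 0x00000004u) >> 2;
    e.v    = (ptel & 0x00000100u) >> 8;
    e.sa   = ptea & 0x00000007u;
    e.tc   = (ptea & 0x00000008u) >> 3;
    return e;
}
```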

Example

MOV @R0,R1 ;Load page table entry (upper) into R1

MOV R1,@R2 ;Load R1 into PTEH; R2 is PTEH address (H’FF000000)

LDTLB ;Load PTEH, PTEL, PTEA registers into TLB


10.54 MAC.L Multiply and ACcumulate Long Arithmetic Instruction

Double-Precision Multiply-and-Accumulate Operation


MAC.L @Rm+,@Rn+ Signed, (Rn) × (Rm) + MAC → MAC, Rn + 4 → Rn, Rm + 4 → Rm 0000nnnnmmmm1111 2–5 —

Description

This instruction performs signed multiplication of the 32-bit operands whose addresses are the contents of general registers Rm and Rn, adds the 64-bit result to the MAC register contents, and stores the result in the MAC register. Operands Rm and Rn are each incremented by 4 each time they are read.

If the S bit is 0, the 64-bit result is stored in the linked MACH and MACL registers.

If the S bit is 1, the addition to the MAC register contents is a saturation operation at the 48th bit from the LSB. In a saturation operation, only the lower 48 bits of the MAC register are valid, and the result range is limited to H’FFFF800000000000 (minimum value) to H’00007FFFFFFFFFFF (maximum value).

Operation

MACL(long m, long n) /* MAC.L @Rm+,@Rn+ */

{



unsigned long RnL,RnH,RmL,RmH,Res0,Res1,Res2;

unsigned long temp0,temp1,temp2,temp3;

long tempm,tempn,fnLmL;

tempn=(long)Read_Long(R[n]);

R[n]+=4;

tempm=(long)Read_Long(R[m]);

R[m]+=4;

if ((long)(tempn^tempm)<0) fnLmL=-1;

else fnLmL=0;


if (tempn<0) tempn=0-tempn;

if (tempm<0) tempm=0-tempm;

temp1=(unsigned long)tempn;

temp2=(unsigned long)tempm;

RnL=temp1&0x0000FFFF;

RnH=(temp1>>16)&0x0000FFFF;

RmL=temp2&0x0000FFFF;

RmH=(temp2>>16)&0x0000FFFF;

temp0=RmL*RnL;

temp1=RmH*RnL;

temp2=RmL*RnH;

temp3=RmH*RnH;

Res2=0;

Res1=temp1+temp2;

if (Res1<temp1) Res2+=0x00010000;

temp1=(Res1<<16)&0xFFFF0000;

Res0=temp0+temp1;

if (Res0<temp0) Res2++;

Res2=Res2+((Res1>>16)&0x0000FFFF)+temp3;



if(fnLmL<0){

Res2=~Res2;

if (Res0==0) Res2++;

else Res0=(~Res0)+1;

}

if(S==1){

Res0=MACL+Res0;

if (MACL>Res0) Res2++;

if (MACH&0x00008000) Res2+=MACH|0xFFFF0000;

else Res2+=MACH&0x00007FFF;

if(((long)Res2<0)&&(Res2<0xFFFF8000)){

Res2=0xFFFF8000;

Res0=0x00000000;

}

if(((long)Res2>0)&&(Res2>0x00007FFF)){

Res2=0x00007FFF;

Res0=0xFFFFFFFF;

};

MACH=(Res2&0x0000FFFF)|(MACH&0xFFFF0000);

MACL=Res0;

}

else {

Res0=MACL+Res0;

if (MACL>Res0) Res2++;

Res2+=MACH;

MACH=Res2;

MACL=Res0;

}

PC+=2;

}
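Where a 64-bit integer type is available, the same behavior can be stated much more compactly. The sketch below is an illustrative model, not the manual's algorithm: `macl_model` is a hypothetical name, the accumulator is passed as one signed 64-bit value (MACH:MACL), and in saturation mode the result is returned sign-extended from 48 bits, following the range given in the Description.

```c
#include <stdint.h>

/* Hypothetical model of MAC.L. mac is MACH:MACL as one 64-bit value;
   rn_val and rm_val are the longwords read from (Rn) and (Rm). */
static int64_t macl_model(int64_t mac, int32_t rn_val, int32_t rm_val, int s)
{
    int64_t prod = (int64_t)rn_val * (int64_t)rm_val;  /* |prod| <= 2^62 */

    if (!s)   /* S = 0: plain 64-bit accumulate, wrapping modulo 2^64 */
        return (int64_t)((uint64_t)mac + (uint64_t)prod);

    /* S = 1: only the lower 48 bits of the accumulator are valid, so
       sign-extend them before adding. */
    uint64_t low48 = (uint64_t)mac & 0x0000FFFFFFFFFFFFull;
    int64_t acc = (low48 & 0x0000800000000000ull)
                      ? (int64_t)low48 - ((int64_t)1 << 48)
                      : (int64_t)low48;
    int64_t sum = acc + prod;             /* cannot overflow int64_t */

    const int64_t max48 = ((int64_t)1 << 47) - 1;  /* H'00007FFFFFFFFFFF */
    const int64_t min48 = -((int64_t)1 << 47);     /* H'FFFF800000000000 */
    if (sum > max48) return max48;
    if (sum < min48) return min48;
    return sum;
}
```

The 16-bit partial-product scheme in the Operation listing exists because the reference pseudocode restricts itself to 32-bit arithmetic; the model above trades that restriction for readability.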


Example

MOVA TBLM,R0 ;Get table address

MOV R0,R1 ;

MOVA TBLN,R0 ;Get table address

CLRMAC ;MAC register initialization

MAC.L @R0+,@R1+ ;

MAC.L @R0+,@R1+ ;

STS MACL,R0 ;Get result in R0

.........

.align 2 ;

TBLM .data.l H’1234ABCD ;

.data.l H’5678EF01 ;

TBLN .data.l H’0123ABCD ;

.data.l H’4567DEF0 ;


10.55 MAC.W Multiply and ACcumulate Word Arithmetic Instruction

Single-Precision Multiply-and-Accumulate Operation


MAC.W @Rm+,@Rn+

MAC @Rm+,@Rn+

Signed, (Rn) × (Rm) + MAC → MAC, Rn + 2 → Rn, Rm + 2 → Rm 0100nnnnmmmm1111 2–5 —

Description

This instruction performs signed multiplication of the 16-bit operands whose addresses are the contents of general registers Rm and Rn, adds the 32-bit result to the MAC register contents, and stores the result in the MAC register. Operands Rm and Rn are each incremented by 2 each time they are read.

If the S bit is 0, a 16 × 16 + 64 → 64-bit multiply-and-accumulate operation is performed, and the 64-bit result is stored in the linked MACH and MACL registers.

If the S bit is 1, a 16 × 16 + 32 → 32-bit multiply-and-accumulate operation is performed, and the addition to the MAC register contents is a saturation operation. In a saturation operation, only the MACL register is valid, and the result range is limited to H’80000000 (minimum value) to H’7FFFFFFF (maximum value). If overflow occurs, the LSB of the MACH register is set to 1. H’80000000 (minimum value) is stored in the MACL register if the result overflows in the negative direction, and H’7FFFFFFF (maximum value) is stored if the result overflows in the positive direction.

Notes

If the S bit is 0, a 16 × 16 + 64 → 64-bit multiply-and-accumulate operation is performed.


Operation

MACW(long m, long n) /* MAC.W @Rm+,@Rn+ */

{

long tempm,tempn,dest,src,ans;

unsigned long templ;

tempn=(long)Read_Word(R[n]);

R[n]+=2;

tempm=(long)Read_Word(R[m]);

R[m]+=2;

templ=MACL;

tempm=((long)(short)tempn*(long)(short)tempm);

if ((long)MACL>=0) dest=0;

else dest=1;

if ((long)tempm>=0) {

src=0;

tempn=0;

}

else {

src=1;

tempn=0xFFFFFFFF;

}

src+=dest;

MACL+=tempm;

if ((long)MACL>=0) ans=0;

else ans=1;

ans+=dest;

if (S==1) {

if (ans==1) {

if (src==0) MACL=0x7FFFFFFF;

if (src==2) MACL=0x80000000;

}

}

else {

MACH+=tempn;

if (templ>MACL) MACH+=1;

}


PC+=2;

}
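The saturating (S = 1) case can be modeled in a few lines by widening the sum before clamping. This is a hedged sketch: `macw_sat` and its `overflow` output are hypothetical, and the flag only stands in for the overflow indication described in the Description (LSB of MACH).

```c
#include <stdint.h>

/* Hypothetical model of MAC.W with S = 1. macl is the current MACL value;
   a and b are the words read from (Rn) and (Rm). Returns the new MACL and
   sets *overflow to 1 when the result was clamped. */
static int32_t macw_sat(int32_t macl, int16_t a, int16_t b, int *overflow)
{
    /* The 16 x 16 product always fits in 32 bits; the sum may not, so it
       is formed in 64 bits before clamping. */
    int64_t sum = (int64_t)macl + (int32_t)a * (int32_t)b;

    *overflow = 0;
    if (sum > INT32_MAX) {          /* overflow in the positive direction */
        *overflow = 1;
        return INT32_MAX;           /* H'7FFFFFFF */
    }
    if (sum < INT32_MIN) {          /* overflow in the negative direction */
        *overflow = 1;
        return INT32_MIN;           /* H'80000000 */
    }
    return (int32_t)sum;
}
```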

Example

MOVA TBLM,R0 ;Get table address

MOV R0,R1 ;

MOVA TBLN,R0 ;Get table address

CLRMAC ;MAC register initialization

MAC.W @R0+,@R1+ ;

MAC.W @R0+,@R1+ ;

STS MACL,R0 ;Get result in R0

...........

.align 2 ;

TBLM .data.w H’1234 ;

.data.w H’5678 ;

TBLN .data.w H’0123 ;

.data.w H’4567 ;


10.56 MOV MOVe data Data Transfer Instruction

Data Transfer


MOV Rm,Rn Rm → Rn 0110nnnnmmmm0011 1 —

MOV.B Rm,@Rn Rm → (Rn) 0010nnnnmmmm0000 1 —

MOV.W Rm,@Rn Rm → (Rn) 0010nnnnmmmm0001 1 —

MOV.L Rm,@Rn Rm → (Rn) 0010nnnnmmmm0010 1 —

MOV.B @Rm,Rn (Rm) → sign extension → Rn 0110nnnnmmmm0000 1 —

MOV.W @Rm,Rn (Rm) → sign extension → Rn 0110nnnnmmmm0001 1 —

MOV.L @Rm,Rn (Rm) → Rn 0110nnnnmmmm0010 1 —

MOV.B Rm,@-Rn Rn-1 → Rn, Rm → (Rn) 0010nnnnmmmm0100 1 —

MOV.W Rm,@-Rn Rn-2 → Rn, Rm → (Rn) 0010nnnnmmmm0101 1 —

MOV.L Rm,@-Rn Rn-4 → Rn, Rm → (Rn) 0010nnnnmmmm0110 1 —

MOV.B @Rm+,Rn (Rm) → sign extension → Rn, Rm+1 → Rm 0110nnnnmmmm0100 1 —

MOV.W @Rm+,Rn (Rm) → sign extension → Rn, Rm+2 → Rm 0110nnnnmmmm0101 1 —

MOV.L @Rm+,Rn (Rm) → Rn, Rm+4 → Rm 0110nnnnmmmm0110 1 —

MOV.B Rm,@(R0,Rn) Rm → (R0+Rn) 0000nnnnmmmm0100 1 —

MOV.W Rm,@(R0,Rn) Rm → (R0+Rn) 0000nnnnmmmm0101 1 —

MOV.L Rm,@(R0,Rn) Rm → (R0+Rn) 0000nnnnmmmm0110 1 —

MOV.B @(R0,Rm),Rn (R0+Rm) → sign extension → Rn 0000nnnnmmmm1100 1 —

MOV.W @(R0,Rm),Rn (R0+Rm) → sign extension → Rn 0000nnnnmmmm1101 1 —

MOV.L @(R0,Rm),Rn (R0+Rm) → Rn 0000nnnnmmmm1110 1 —

Description

This instruction transfers the source operand to the destination. When an operand is memory, the data size can be specified as byte, word, or longword. When the source operand is memory, the loaded data is sign-extended to longword before being stored in the register.
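The sign extension applied to byte and word loads can be expressed as two small helpers. This is a minimal sketch; the names `sext8` and `sext16` are hypothetical and are not part of the manual's pseudocode.

```c
#include <stdint.h>

/* Sign-extends the low 8 bits of raw to 32 bits (byte load). */
static uint32_t sext8(uint32_t raw)
{
    raw &= 0x000000FFu;
    return (raw & 0x80u) ? (raw | 0xFFFFFF00u) : raw;
}

/* Sign-extends the low 16 bits of raw to 32 bits (word load). */
static uint32_t sext16(uint32_t raw)
{
    raw &= 0x0000FFFFu;
    return (raw & 0x8000u) ? (raw | 0xFFFF0000u) : raw;
}
```

A byte value of H'80 therefore loads as H'FFFFFF80, while H'7F loads as H'0000007F.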


Operation

MOV(long m, long n) /* MOV Rm,Rn */

{

R[n]=R[m];

PC+=2;

}

MOVBS(long m, long n) /* MOV.B Rm,@Rn */

{

Write_Byte(R[n],R[m]);

PC+=2;

}

MOVWS(long m, long n) /* MOV.W Rm,@Rn */

{

Write_Word(R[n],R[m]);

PC+=2;

}

MOVLS(long m, long n) /* MOV.L Rm,@Rn */

{

Write_Long(R[n],R[m]);

PC+=2;

}

MOVBL(long m, long n) /* MOV.B @Rm,Rn */

{

R[n]=(long)Read_Byte(R[m]);

if ((R[n]&0x80)==0) R[n]&=0x000000FF;

else R[n]|=0xFFFFFF00;

PC+=2;

}

MOVWL(long m, long n) /* MOV.W @Rm,Rn */

{

R[n]=(long)Read_Word(R[m]);

if ((R[n]&0x8000)==0) R[n]&=0x0000FFFF;

else R[n]|=0xFFFF0000;

PC+=2;

}

MOVLL(long m, long n) /* MOV.L @Rm,Rn */

{

R[n]=Read_Long(R[m]);

PC+=2;

}

MOVBM(long m, long n) /* MOV.B Rm,@-Rn */

{

Write_Byte(R[n]-1,R[m]);

R[n]-=1;

PC+=2;

}

MOVWM(long m, long n) /* MOV.W Rm,@-Rn */

{

Write_Word(R[n]-2,R[m]);

R[n]-=2;

PC+=2;

}

MOVLM(long m, long n) /* MOV.L Rm,@-Rn */

{

Write_Long(R[n]-4,R[m]);

R[n]-=4;

PC+=2;

}

MOVBP(long m, long n) /* MOV.B @Rm+,Rn */

{

R[n]=(long)Read_Byte(R[m]);

if ((R[n]&0x80)==0) R[n]&=0x000000FF;

else R[n]|=0xFFFFFF00;

if (n!=m) R[m]+=1;


PC+=2;

}

MOVWP(long m, long n) /* MOV.W @Rm+,Rn */

{

R[n]=(long)Read_Word(R[m]);

if ((R[n]&0x8000)==0) R[n]&=0x0000FFFF;

else R[n]|=0xFFFF0000;

if (n!=m) R[m]+=2;

PC+=2;

}

MOVLP(long m, long n) /* MOV.L @Rm+,Rn */

{

R[n]=Read_Long(R[m]);

if (n!=m) R[m]+=4;

PC+=2;

}

MOVBS0(long m, long n) /* MOV.B Rm,@(R0,Rn) */

{

Write_Byte(R[n]+R[0],R[m]);

PC+=2;

}

MOVWS0(long m, long n) /* MOV.W Rm,@(R0,Rn) */

{

Write_Word(R[n]+R[0],R[m]);

PC+=2;

}

MOVLS0(long m, long n) /* MOV.L Rm,@(R0,Rn) */

{

Write_Long(R[n]+R[0],R[m]);

PC+=2;

}

MOVBL0(long m, long n) /* MOV.B @(R0,Rm),Rn */


{

R[n]=(long)Read_Byte(R[m]+R[0]);

if ((R[n]&0x80)==0) R[n]&=0x000000FF;

else R[n]|=0xFFFFFF00;

PC+=2;

}

MOVWL0(long m, long n) /* MOV.W @(R0,Rm),Rn */

{

R[n]=(long)Read_Word(R[m]+R[0]);

if ((R[n]&0x8000)==0) R[n]&=0x0000FFFF;

else R[n]|=0xFFFF0000;

PC+=2;

}

MOVLL0(long m, long n) /* MOV.L @(R0,Rm),Rn */

{

R[n]=Read_Long(R[m]+R[0]);

PC+=2;

}

Example

MOV R0,R1 ;Before execution R0 = H’FFFFFFFF, R1 = H’00000000

;After execution R1 = H’FFFFFFFF

MOV.W R0,@R1 ;Before execution R0 = H’FFFF7F80

;After execution (R1) = H’7F80

MOV.B @R0,R1 ;Before execution (R0) = H’80, R1 = H’00000000

;After execution R1 = H’FFFFFF80

MOV.W R0,@-R1 ;Before execution R0 = H’AAAAAAAA, (R1) = H’FFFF7F80

;After execution R1 = H’FFFF7F7E, (R1) = H’AAAA

MOV.L @R0+,R1 ;Before execution R0 = H’12345670

;After execution R0 = H’12345674, R1 = (H’12345670)

MOV.B R1,@(R0,R2) ;Before execution R2 = H’00000004, R0 = H’10000000

;After execution R1 = (H’10000004)

MOV.W @(R0,R2),R1 ;Before execution R2 = H’00000004, R0 = H’10000000



10.57 MOV MOVe constant value Data Transfer Instruction

Immediate Data Transfer


MOV #imm,Rn imm → sign extension → Rn 1110nnnniiiiiiii 1 —

MOV.W @(disp,PC),Rn (disp×2+PC+4) → sign extension → Rn 1001nnnndddddddd 1 —

MOV.L @(disp,PC),Rn (disp×4+PC+4) → Rn 1101nnnndddddddd 1 —

Description

This instruction stores immediate data, sign-extended to longword, in general register Rn. In the case of word or longword data, the data is stored from memory address (PC + 4 + displacement × 2) or (PC + 4 + displacement × 4).

With word data, the 8-bit displacement is multiplied by two after zero-extension, and so the relative distance from the table is in the range up to PC + 4 + 510 bytes. The PC value is the address of this instruction.

With longword data, the 8-bit displacement is multiplied by four after zero-extension, and so the relative distance from the operand is in the range up to PC + 4 + 1020 bytes. The PC value is the address of this instruction. A value with the lower 2 bits adjusted to B’00 is used in address calculation.
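The two effective-address calculations can be written out as follows. This is a small illustrative sketch; the function names `movw_pcrel_addr` and `movl_pcrel_addr` are hypothetical, and `pc` is the address of the MOV instruction itself.

```c
#include <stdint.h>

/* Effective address for MOV.W @(disp,PC),Rn. */
static uint32_t movw_pcrel_addr(uint32_t pc, uint8_t disp)
{
    return pc + 4 + ((uint32_t)disp << 1);
}

/* Effective address for MOV.L @(disp,PC),Rn. The lower 2 bits of PC are
   cleared first, so the operand address is always longword-aligned. */
static uint32_t movl_pcrel_addr(uint32_t pc, uint8_t disp)
{
    return (pc & 0xFFFFFFFCu) + 4 + ((uint32_t)disp << 2);
}
```

Because of the masking step, a MOV.L at H'1008 and one at H'100A with the same displacement refer to the same operand address.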

Notes

If a PC-relative load instruction is executed in a delay slot, an illegal slot instruction exception will be generated.


Operation

MOVI(int i, int n) /* MOV #imm,Rn */

{

if ((i&0x80)==0) R[n]=(0x000000FF & i);

else R[n]=(0xFFFFFF00 | i);

PC+=2;

}

MOVWI(int d, int n) /* MOV.W @(disp,PC),Rn */

{

unsigned int disp;

disp=(unsigned int)(0x000000FF & d);

R[n]=(int)Read_Word(PC+4+(disp<<1));

if ((R[n]&0x8000)==0) R[n]&=0x0000FFFF;

else R[n]|=0xFFFF0000;

PC+=2;

}

MOVLI(int d, int n)/* MOV.L @(disp,PC),Rn */

{

unsigned int disp;

disp=(unsigned int)(0x000000FF & (int)d);

R[n]=Read_Long((PC&0xFFFFFFFC)+4+(disp<<2));

PC+=2;

}


Example

Address

1000 MOV #H’80,R1 ;R1 = H’FFFFFF80

1002 MOV.W IMM,R2 ;R2 = H’FFFF9ABC IMM means (PC + 4 + H’08)

1004 ADD #-1,R0 ;

1006 TST R0,R0 ;

1008 MOV.L @(3,PC),R3 ;R3 = H’12345678

100A BRA NEXT ;Delayed branch instruction

100C NOP

100E IMM .data.w H’9ABC ;

1010 .data.w H’1234 ;

1012 NEXT JMP @R3 ;BRA branch instruction

1014 CMP/EQ #0,R0 ;

.align 4 ;

1018 .data.l H’12345678 ;

101C .data.l H’9ABCDEF0 ;


10.58 MOV MOVe global data Data Transfer Instruction

Global Data Transfer


MOV.B @(disp,GBR),R0 (disp+GBR) → sign extension → R0 11000100dddddddd 1 —

MOV.W @(disp,GBR),R0 (disp×2+GBR) → sign extension → R0 11000101dddddddd 1 —


MOV.L @(disp,GBR),R0 (disp×4+GBR) → R0 11000110dddddddd 1 —

MOV.B R0,@(disp,GBR) R0 → (disp+GBR) 11000000dddddddd 1 —

MOV.W R0,@(disp,GBR) R0 → (disp×2+GBR) 11000001dddddddd 1 —

MOV.L R0,@(disp,GBR) R0 → (disp×4+GBR) 11000010dddddddd 1 —

Description

This instruction transfers the source operand to the destination. Byte, word, or longword can be specified as the data size, but the register is always R0. If the transfer data is byte-size, the 8-bit displacement is only zero-extended, so a range up to +255 bytes can be specified. If the transfer data is word-size, the 8-bit displacement is multiplied by two after zero-extension, enabling a range up to +510 bytes to be specified. With longword transfer data, the 8-bit displacement is multiplied by four after zero-extension, enabling a range up to +1020 bytes to be specified.

When the source operand is memory, the loaded data is sign-extended to longword before being stored in the register.

Notes

When loading, the destination register is always R0.


Operation

MOVBLG(int d) /* MOV.B @(disp,GBR),R0 */

{

unsigned int disp;

disp=(unsigned int)(0x000000FF & d);

R[0]=(int)Read_Byte(GBR+disp);

if ((R[0]&0x80)==0) R[0]&=0x000000FF;

else R[0]|=0xFFFFFF00;

PC+=2;

}

MOVWLG(int d) /* MOV.W @(disp,GBR),R0 */

{

unsigned int disp;

disp=(unsigned int)(0x000000FF & d);

R[0]=(int)Read_Word(GBR+(disp<<1));

if ((R[0]&0x8000)==0) R[0]&=0x0000FFFF;

else R[0]|=0xFFFF0000;

PC+=2;

}

MOVLLG(int d) /* MOV.L @(disp,GBR),R0 */

{

unsigned int disp;

disp=(unsigned int)(0x000000FF & d);

R[0]=Read_Long(GBR+(disp<<2));

PC+=2;

}

MOVBSG(int d) /* MOV.B R0,@(disp,GBR) */

{

unsigned int disp;

disp=(unsigned int)(0x000000FF & d);

Write_Byte(GBR+disp,R[0]);

PC+=2;

}

MOVWSG(int d) /* MOV.W R0,@(disp,GBR) */

{

unsigned int disp;

disp=(unsigned int)(0x000000FF & d);

Write_Word(GBR+(disp<<1),R[0]);

PC+=2;

}

MOVLSG(int d) /* MOV.L R0,@(disp,GBR) */

{

unsigned int disp;

disp=(unsigned int)(0x000000FF & (long)d);

Write_Long(GBR+(disp<<2),R[0]);

PC+=2;

}

Example

MOV.L @(2,GBR),R0 ;Before execution (GBR+8) = H’12345670

;After execution R0 = H’12345670

MOV.B R0,@(1,GBR) ;Before execution R0 = H’FFFF7F80

;After execution (GBR+1) = H’80


10.59 MOV MOVe structure data Data Transfer Instruction

Structure Data Transfer


MOV.B R0,@(disp,Rn) R0 → (disp+Rn) 10000000nnnndddd 1 —

MOV.W R0,@(disp,Rn) R0 → (disp×2+Rn) 10000001nnnndddd 1 —

MOV.L Rm,@(disp,Rn) Rm → (disp×4+Rn) 0001nnnnmmmmdddd 1 —

MOV.B @(disp,Rm),R0 (disp+Rm) → sign extension → R0 10000100mmmmdddd 1 —

MOV.W @(disp,Rm),R0 (disp×2+Rm) → sign extension → R0 10000101mmmmdddd 1 —

MOV.L @(disp,Rm),Rn (disp×4+Rm) → Rn 0101nnnnmmmmdddd 1 —

Description

This instruction transfers the source operand to the destination. It is ideal for accessing data inside a structure or stack. Byte, word, or longword can be specified as the data size, but with byte or word data the register is always R0.

If the data is byte-size, the 4-bit displacement is only zero-extended, so a range up to +15 bytes can be specified. If the data is word-size, the 4-bit displacement is multiplied by two after zero-extension, enabling a range up to +30 bytes to be specified. With longword data, the 4-bit displacement is multiplied by four after zero-extension, enabling a range up to +60 bytes to be specified. If a memory operand cannot be reached, the previously described @(R0,Rn) mode must be used.

When the source operand is memory, the loaded data is sign-extended to longword before being stored in the register.
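Whether a given structure member is reachable with this addressing mode can be decided with a short check, as an assembler or code generator might do. The sketch is illustrative only; `disp4_encode` is a hypothetical helper, and the alignment requirement it enforces follows from the displacement being scaled by the operand size.

```c
/* Returns the 4-bit disp field that reaches byte offset `offset` for an
   operand of `size` bytes (1, 2, or 4), or -1 if the offset cannot be
   encoded and the @(R0,Rn) mode is needed instead. */
static int disp4_encode(unsigned offset, unsigned size)
{
    if (size != 1 && size != 2 && size != 4)
        return -1;                    /* not a valid operand size */
    if (offset % size != 0)
        return -1;                    /* disp is scaled, so must align */

    unsigned field = offset / size;
    return (field <= 15) ? (int)field : -1;
}
```

For example, a longword at offset 60 encodes as disp = 15, while one at offset 64 is out of reach.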

Notes

When loading byte or word data, the destination register is always R0. Therefore, if the following instruction attempts to reference R0, it is kept waiting until completion of the load instruction. This allows optimization by changing the order of instructions.

MOV.B @(2,R1),R0          MOV.B @(2,R1),R0
AND   #80,R0              ADD   #20,R1
ADD   #20,R1              AND   #80,R0


Operation

MOVBS4(long d, long n) /* MOV.B R0,@(disp,Rn) */

{

long disp;

disp=(0x0000000F & (long)d);

Write_Byte(R[n]+disp,R[0]);

PC+=2;

}

MOVWS4(long d, long n) /* MOV.W R0,@(disp,Rn) */

{

long disp;

disp=(0x0000000F & (long)d);

Write_Word(R[n]+(disp<<1),R[0]);

PC+=2;

}

MOVLS4(long m, long d, long n) /* MOV.L Rm,@(disp,Rn) */

{

long disp;

disp=(0x0000000F & (long)d);

Write_Long(R[n]+(disp<<2),R[m]);

PC+=2;

}

MOVBL4(long m, long d) /* MOV.B @(disp,Rm),R0 */

{

long disp;

disp=(0x0000000F & (long)d);

R[0]=Read_Byte(R[m]+disp);

if ((R[0]&0x80)==0) R[0]&=0x000000FF;

else R[0]|=0xFFFFFF00;

PC+=2;

}


MOVWL4(long m, long d) /* MOV.W @(disp,Rm),R0 */

{

long disp;

disp=(0x0000000F & (long)d);

R[0]=Read_Word(R[m]+(disp<<1));

if ((R[0]&0x8000)==0) R[0]&=0x0000FFFF;

else R[0]|=0xFFFF0000;

PC+=2;

}

MOVLL4(long m, long d, long n) /* MOV.L @(disp,Rm),Rn */

{

long disp;

disp=(0x0000000F & (long)d);

R[n]=Read_Long(R[m]+(disp<<2));

PC+=2;

}

Example

MOV.L @(2,R0),R1 ;Before execution (R0+8) = H’12345670

;After execution R1 = H’12345670

MOV.L R0,@(H’F,R1) ;Before execution R0 = H’FFFF7F80

;After execution (R1+60) = H’FFFF7F80


10.60 MOVA MOVe effective Address Data Transfer Instruction

Effective Address Transfer


MOVA @(disp,PC),R0 disp×4+PC+4 → R0 11000111dddddddd 1 —

Description

This instruction stores the source operand effective address in general register R0. The 8-bit displacement is multiplied by four after zero-extension. The PC value is the address of this instruction, but a value with the lower 2 bits adjusted to B’00 is used in address calculation.

Notes

If this instruction is executed in a delay slot, an illegal slot instruction exception will be generated.

Operation

MOVA(int d) /* MOVA @(disp,PC),R0 */

{

unsigned int disp;

disp=(unsigned int)(0x000000FF & d);

R[0]=(PC&0xFFFFFFFC)+4+(disp<<2);

PC+=2;

}

Example

Address .org H’1006

1006 MOVA STR,R0 ;STR address → R0

1008 MOV.B @R0,R1 ;R1 = “X” ← Position after adjustment of lower 2 bits of PC

100A ADD R4,R5 ;← Original PC position in MOVA instruction address calculation

.align 4

100C STR:.sdata "XYZP12"


10.61 MOVCA.L MOVe with Cache block Allocation Data Transfer Instruction

Cache Block Allocation


MOVCA.L R0,@Rn R0 → (Rn) 0000nnnn11000011 1 —

Description

This instruction stores the contents of general register R0 in the memory location indicated by effective address Rn. This instruction differs from other store instructions as follows.

If write-back is selected for the accessed memory, and a cache miss occurs, the cache block will be allocated but an R0 data write will be performed to that cache block without performing a block read. Other cache block contents are undefined.

Operation

MOVCAL(int n) /*MOVCA.L R0,@Rn */

{

if ((is_write_back_memory(R[n]))

&& (look_up_in_operand_cache(R[n]) == MISS))

allocate_operand_cache_block(R[n]);

Write_Long(R[n], R[0]);

PC+=2;

}



Possible Exceptions:

• Initial page write exception

• Address error


10.62 MOVT MOVe T bit Data Transfer Instruction

T Bit Transfer


MOVT Rn T → Rn 0000nnnn00101001 1 —

Description

This instruction stores the T bit in general register Rn. When T = 1, Rn = 1; when T = 0, Rn = 0.

Operation

MOVT(long n) /* MOVT Rn */

{

R[n]=(0x00000001 & SR);

PC+=2;

}

Example

XOR R2,R2 ;R2 = 0

CMP/PZ R2 ;T = 1

MOVT R0 ;R0 = 1

CLRT ;T = 0

MOVT R1 ;R1 = 0


10.63 MUL.L MULtiply Long Arithmetic Instruction

Double-Precision Multiplication


MUL.L Rm,Rn Rn×Rm → MACL 0000nnnnmmmm0111 2–5 —

Description

This instruction performs 32-bit multiplication of the contents of general registers Rn and Rm, and stores the lower 32 bits of the result in the MACL register. The contents of MACH are not changed.

Operation

MULL(long m, long n) /* MUL.L Rm,Rn */

{

MACL=R[n]*R[m];

PC+=2;

}

Example

MUL.L R0,R1 ;Before execution R0 = H’FFFFFFFE, R1 = H’00005555

;After execution MACL = H’FFFF5556

STS MACL,R0 ;Get operation result


10.64 MULS.W MULtiply as Signed Word Arithmetic Instruction

Signed Multiplication


MULS.W Rm,Rn

MULS Rm,Rn

Signed, Rn × Rm → MACL 0010nnnnmmmm1111 2–5 —

Description

This instruction performs 16-bit multiplication of the contents of general registers Rn and Rm, and stores the 32-bit result in the MACL register. The multiplication is performed as a signed arithmetic operation. The contents of MACH are not changed.

Operation

MULS(long m, long n) /* MULS Rm,Rn */

{

MACL=((long)(short)R[n]*(long)(short)R[m]);

PC+=2;

}

Example

MULS.W R0,R1 ;Before execution R0 = H’FFFFFFFE, R1 = H’00005555

;After execution MACL = H’FFFF5556



10.65 MULU.W MULtiply as Unsigned Word Arithmetic Instruction

Unsigned Multiplication


MULU.W Rm,Rn

MULU Rm,Rn

Unsigned, Rn × Rm → MACL 0010nnnnmmmm1110 2–5 —

Description

This instruction performs 16-bit multiplication of the contents of general registers Rn and Rm, and stores the 32-bit result in the MACL register. The multiplication is performed as an unsigned arithmetic operation. The contents of MACH are not changed.

Operation

MULU(long m, long n) /* MULU Rm,Rn */

{

MACL=((unsigned long)(unsigned short)R[n]*

(unsigned long)(unsigned short)R[m]);

PC+=2;

}

Example

MULU.W R0,R1 ;Before execution R0 = H’00000002, R1 = H’FFFFAAAA

;After execution MACL = H’00015554
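The difference between MULS.W and MULU.W lies only in how the lower 16 bits of each operand are extended before multiplying. The following C sketch models both; the function names are hypothetical, and the narrowing cast to int16_t assumes the usual two's complement behavior.

```c
#include <stdint.h>

/* Signed 16 x 16 -> 32 multiply: the lower 16 bits of each operand are
   sign-extended first. Assumes two's complement narrowing conversion. */
uint32_t muls_w(uint32_t rn, uint32_t rm)
{
    int32_t a = (int16_t)(uint16_t)rn;
    int32_t b = (int16_t)(uint16_t)rm;
    return (uint32_t)(a * b);          /* magnitude is at most 2^30 */
}

/* Unsigned 16 x 16 -> 32 multiply: the lower 16 bits are zero-extended. */
uint32_t mulu_w(uint32_t rn, uint32_t rm)
{
    return (uint32_t)(uint16_t)rn * (uint32_t)(uint16_t)rm;
}
```

With the operand values from the MULS.W and MULU.W examples, muls_w gives H’FFFF5556 and mulu_w gives H’00015554; in both cases the upper 16 bits of the source registers are ignored.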



10.66 NEG NEGate Arithmetic Instruction

Sign Inversion


NEG Rm,Rn 0-Rm → Rn 0110nnnnmmmm1011 1 —

Description

This instruction finds the two’s complement of the contents of general register Rm and stores the result in Rn. That is, it subtracts Rm from 0 and stores the result in Rn.

Operation

NEG(long m, long n) /* NEG Rm,Rn */

{

R[n]=0-R[m];

PC+=2;

}

Example

NEG R0,R1 ;Before execution R0 = H’00000001

;After execution R1 = H’FFFFFFFF



10.67 NEGC NEGate with Carry Arithmetic Instruction

Sign Inversion with Borrow


NEGC Rm,Rn 0 – Rm – T → Rn, borrow → T 0110nnnnmmmm1010 1 Borrow

Description

This instruction subtracts the contents of general register Rm and the T bit from 0 and stores the result in Rn. A borrow resulting from the operation is reflected in the T bit. The NEGC instruction is used for sign inversion of a value exceeding 32 bits.

Operation

NEGC(long m, long n) /* NEGC Rm,Rn */

{

unsigned long temp;

temp=0-R[m];

R[n]=temp-T;

if (0<temp) T=1;

else T=0;

if (temp<R[n]) T=1;

PC+=2;

}

Example

CLRT ;Sign inversion of R0:R1 (64 bits)

NEGC R1,R1 ;Before execution R1 = H’00000001, T = 0

;After execution R1 = H’FFFFFFFF, T = 1

NEGC R0,R0 ;Before execution R0 = H’00000000, T = 1

;After execution R0 = H’FFFFFFFF, T = 1
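The 64-bit sign inversion in this example can be modeled in C as two chained NEGC steps. This is a sketch only: the helper names negc and negate64 are hypothetical, and the T bit is passed explicitly instead of living in SR.

```c
#include <stdint.h>

/* One NEGC step: returns 0 - rm - T and replaces *t with the borrow,
   using the same two comparisons as the Operation code above. */
uint32_t negc(uint32_t rm, unsigned *t)
{
    uint32_t temp = 0u - rm;
    uint32_t result = temp - *t;
    *t = (0u < temp) || (temp < result);
    return result;
}

/* Negate a 64-bit value held as two 32-bit halves:
   CLRT, then NEGC on the lower half, then NEGC on the upper half. */
uint64_t negate64(uint64_t value)
{
    uint32_t hi = (uint32_t)(value >> 32);
    uint32_t lo = (uint32_t)value;
    unsigned t = 0;                    /* CLRT */
    lo = negc(lo, &t);
    hi = negc(hi, &t);
    return ((uint64_t)hi << 32) | lo;
}
```

The borrow from the lower half propagates through T into the upper half, which is why the lower register must be negated first.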


10.68 NOP No OPeration System Control Instruction

No Operation


NOP No operation 0000000000001001 1 —

Description

This instruction simply increments the program counter (PC), advancing the processing flow to execution of the next instruction.

Operation

NOP( ) /* NOP */

{

PC+=2;

}

Example

NOP ;Time equivalent to one execution state elapses.


10.69 NOT NOT-logical complement Logical Instruction

Bit Inversion


NOT Rm,Rn ~Rm → Rn 0110nnnnmmmm0111 1 —

Description

This instruction finds the one’s complement of the contents of general register Rm and stores the result in Rn. That is, it inverts the Rm bits and stores the result in Rn.

Operation

NOT(long m, long n) /* NOT Rm,Rn */

{

R[n]=~R[m];

PC+=2;

}

Example

NOT R0,R1 ;Before execution R0 = H’AAAAAAAA

;After execution R1 = H’55555555



10.70 OCBI Operand Cache Block Invalidate Data Transfer Instruction

Cache Block Invalidation


OCBI @Rn Operand cache block invalidation 0000nnnn10010011 1 —

Description

This instruction accesses data using the contents indicated by effective address Rn. In the case of a hit in the cache, the corresponding cache block is invalidated (the V bit is cleared to 0). If there is unwritten information (U bit = 1), write-back is not performed even if write-back mode is selected. No operation is performed in the case of a cache miss or an access to a non-cache area.

Operation

OCBI(int n) /* OCBI @Rn */

{

invalidate_operand_cache_block(R[n]);

PC+=2;

}




Possible Exceptions:

• Address error

Note that the above exceptions are generated even if OCBI does not operate.


10.71 OCBP Operand Cache Block Purge Data Transfer Instruction

Cache Block Purge


OCBP @Rn Operand cache block purge 0000nnnn10100011 1 —

Description

This instruction accesses data using the contents indicated by effective address Rn. If the cache is hit and there is unwritten information (U bit = 1), the corresponding cache block is written back to external memory and that block is invalidated (the V bit is cleared to 0). If there is no unwritten information (U bit = 0), the block is simply invalidated. No operation is performed in the case of a cache miss or an access to a non-cache area.

Operation

OCBP(int n) /* OCBP @Rn */

{

if(is_dirty_block(R[n])) write_back(R[n]);

invalidate_operand_cache_block(R[n]);

PC+=2;

}



Possible Exceptions:

• Address error

Note that the above exceptions are generated even if OCBP does not operate.


10.72 OCBWB Operand Cache Block Write Back Data Transfer Instruction

Cache Block Write-Back


OCBWB @Rn Operand cache block write-back 0000nnnn10110011 1 —

Description

This instruction accesses data using the contents indicated by effective address Rn. If the cache is hit and there is unwritten information (U bit = 1), the corresponding cache block is written back to external memory and that block is cleaned (the U bit is cleared to 0). In other cases (i.e. in the case of a cache miss or an access to a non-cache area, or if the block is already clean), no operation is performed.

Operation

OCBWB(int n) /* OCBWB @Rn */

{

if(is_dirty_block(R[n])) write_back(R[n]);

PC+=2;

}



Possible Exceptions:

• Address error

Note that the above exceptions are generated even if OCBWB does not operate.
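The three operand cache block instructions (OCBI, OCBP, OCBWB) differ only in how they treat the V and U bits of a block that hits. The following C sketch contrasts them on a single block; the cache_block type and the function names are hypothetical, a miss is modeled simply as valid == false, and the write-back itself is reduced to a boolean return value.

```c
#include <stdbool.h>

/* Hypothetical model of one operand cache block. */
typedef struct {
    bool valid;   /* V bit */
    bool dirty;   /* U bit */
} cache_block;

/* Each function returns true if a write-back to external memory occurs. */

/* OCBI: invalidate on a hit; unwritten data is discarded, never written back. */
bool ocbi(cache_block *b)
{
    if (b->valid)
        b->valid = false;
    return false;
}

/* OCBP: write back if dirty, then invalidate. */
bool ocbp(cache_block *b)
{
    bool written_back = b->valid && b->dirty;
    if (b->valid)
        b->valid = false;
    return written_back;
}

/* OCBWB: write back if dirty and mark the block clean; it stays valid. */
bool ocbwb(cache_block *b)
{
    bool written_back = b->valid && b->dirty;
    if (written_back)
        b->dirty = false;
    return written_back;
}
```

In this model only OCBI can lose data, which is why it is typically reserved for buffers whose cached contents are known to be stale.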


10.73 OR OR logical Logical Instruction

Logical OR


OR Rm,Rn Rn | Rm → Rn 0010nnnnmmmm1011 1 —

OR #imm,R0 R0 | imm → R0 11001011iiiiiiii 1 —

OR.B #imm,@(R0,GBR) (R0+GBR) | imm → (R0+GBR)


Description

This instruction ORs the contents of general registers Rn and Rm and stores the result in Rn.

This instruction can be used to OR general register R0 contents with zero-extended 8-bit immediate data, or, in indexed GBR indirect addressing mode, to OR 8-bit memory with 8-bit immediate data.


Operation

OR(long m, long n) /* OR Rm,Rn */

{

R[n]|=R[m];

PC+=2;

}

ORI(long i) /* OR #imm,R0 */

{

R[0]|=(0x000000FF & (long)i);

PC+=2;

}

ORM(long i) /* OR.B #imm,@(R0,GBR) */

{

long temp;

temp=(long)Read_Byte(GBR+R[0]);

temp|=(0x000000FF & (long)i);

Write_Byte(GBR+R[0],temp);

PC+=2;

}

Example

OR R0,R1 ;Before execution R0 = H’AAAA5555, R1 = H’55550000

;After execution R1 = H’FFFF5555

OR #H’F0,R0 ;Before execution R0 = H’00000008

;After execution R0 = H’000000F8

OR.B #H’50,@(R0,GBR) ;Before execution (R0,GBR) = H’A5

;After execution (R0,GBR) = H’F5


10.74 PREF PREFetch data to cache Data Transfer Instruction

Prefetch to Data Cache


PREF @Rn Prefetch cache block 0000nnnn10000011 —

Description

This instruction reads a 32-byte data block starting at a 32-byte boundary into the operand cache. The lower 5 bits of the address specified by Rn are masked to zero.

This instruction does not generate address-related errors. In the event of an error, the PREF instruction is treated as a NOP (no operation) instruction.
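The address masking described above can be expressed in one line of C. The function name pref_block_base is hypothetical and serves only to illustrate which 32-byte block a given Rn value selects.

```c
#include <stdint.h>

/* Start address of the 32-byte block that PREF would read for a given
   Rn value: the lower 5 bits are masked to zero.
   The function name is an assumption made for this sketch. */
uint32_t pref_block_base(uint32_t address)
{
    return address & 0xFFFFFFE0u;
}
```

Any address within a block therefore prefetches the same 32 bytes, so software stepping through a buffer only needs one PREF per 32-byte stride.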

Operation

PREF(int n) /* PREF */

{

prefetch_operand_cache_block(R[n]);

PC+=2;

}

Example

MOV.L #SOFT_PF,R1 ;R1 address is SOFT_PF

PREF @R1 ;Load SOFT_PF data into on-chip cache

.align 32

SOFT_PF: .data.l H’12345678

.data.l H’9ABCDEF0

.data.l H’AAAA5555

.data.l H’5555AAAA

.data.l H’11111111

.data.l H’22222222

.data.l H’33333333

.data.l H’44444444


10.75 ROTCL ROTate with Carry Left Shift Instruction

One-Bit Left Rotation through T Bit


ROTCL Rn T ← Rn ← T 0100nnnn00100100 1 MSB

Description

This instruction rotates the contents of general register Rn one bit to the left through the T bit, and stores the result in Rn. The bit rotated out of the operand is transferred to the T bit.

[Figure: ROTCL. The MSB moves into T, and the previous T value enters at the LSB.]

Operation

ROTCL(long n) /* ROTCL Rn */

{

long temp;

if ((R[n]&0x80000000)==0) temp=0;

else temp=1;

R[n]<<=1;

if (T==1) R[n]|=0x00000001;

else R[n]&=0xFFFFFFFE;

if (temp==1) T=1;

else T=0;

PC+=2;

}

Example

ROTCL R0 ;Before execution R0 = H’80000000, T = 0

;After execution R0 = H’00000000, T = 1
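A compact C model of the rotation through T is shown below. It is a sketch under the assumption that T is passed by pointer rather than held in SR; the function name rotcl is hypothetical.

```c
#include <stdint.h>

/* Rotate rn left by one bit through T: the old MSB becomes the new T,
   and the old T becomes the new LSB. */
uint32_t rotcl(uint32_t rn, unsigned *t)
{
    unsigned msb = (rn >> 31) & 1u;
    uint32_t result = (rn << 1) | (*t & 1u);
    *t = msb;
    return result;
}
```

Because the bit shifted out is preserved in T and fed back in on the next call, two chained calls shift a 64-bit value held in two registers left by one bit (with T cleared beforehand).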


10.76 ROTCR ROTate with Carry Right Shift Instruction

One-Bit Right Rotation through T Bit


ROTCR Rn T → Rn → T 0100nnnn00100101 1 LSB

Description

This instruction rotates the contents of general register Rn one bit to the right through the T bit, and stores the result in Rn. The bit rotated out of the operand is transferred to the T bit.

[Figure: ROTCR. The LSB moves into T, and the previous T value enters at the MSB.]

Operation

ROTCR(long n) /* ROTCR Rn */

{

long temp;

if ((R[n]&0x00000001)==0) temp=0;

else temp=1;

R[n]>>=1;

if (T==1) R[n]|=0x80000000;

else R[n]&=0x7FFFFFFF;

if (temp==1) T=1;

else T=0;

PC+=2;

}

Example

ROTCR R0 ;Before execution R0 = H’00000001, T = 1

;After execution R0 = H’80000000, T = 1



10.77 ROTL ROTate Left Shift Instruction

One-Bit Left Rotation


ROTL Rn T ← Rn ← MSB 0100nnnn00000100 1 MSB

Description

This instruction rotates the contents of general register Rn one bit to the left, and stores the result in Rn. The bit rotated out of the operand is transferred to the T bit.

[Figure: ROTL. The MSB moves into both T and the LSB.]

Operation

ROTL(long n) /* ROTL Rn */

{

if ((R[n]&0x80000000)==0) T=0;

else T=1;

R[n]<<=1;

if (T==1) R[n]|=0x00000001;

else R[n]&=0xFFFFFFFE;

PC+=2;

}

Example

ROTL R0 ;Before execution R0 = H’80000000, T = 0

;After execution R0 = H’00000001, T = 1



10.78 ROTR ROTate Right Shift Instruction

One-Bit Right Rotation


ROTR Rn LSB → Rn → T 0100nnnn00000101 1 LSB

Description

This instruction rotates the contents of general register Rn one bit to the right, and stores the result in Rn. The bit rotated out of the operand is transferred to the T bit.

[Figure: ROTR. The LSB moves into both T and the MSB.]

Operation

ROTR(long n) /* ROTR Rn */

{

if ((R[n]&0x00000001)==0) T=0;

else T=1;

R[n]>>=1;

if (T==1) R[n]|=0x80000000;

else R[n]&=0x7FFFFFFF;


PC+=2;

}

Example

ROTR R0 ;Before execution R0 = H’00000001, T = 0

;After execution R0 = H’80000000, T = 1



10.79 RTE ReTurn from Exception System Control Instruction

Return from Exception Handling (Privileged Instruction)

Delayed Branch Instruction


RTE SSR → SR, SPC → PC 0000000000101011 5 —

Description

This instruction returns from an exception or interrupt handling routine by restoring the PC and SR values from SPC and SSR. Program execution continues from the address specified by the restored PC value.

RTE is a privileged instruction, and can only be used in privileged mode. Use of this instruction in user mode will cause an illegal instruction exception.

Notes

As this is a delayed branch instruction, the instruction following the RTE instruction is executed before the branch destination instruction.

Interrupts are not accepted between this instruction and the following instruction. An exception must not be generated by the instruction in this instruction’s delay slot. If the following instruction is a branch instruction, it is identified as a slot illegal instruction.

If this instruction is located in the delay slot immediately following a delayed branch instruction, it is identified as a slot illegal instruction.

The SR value accessed by the instruction in the RTE delay slot is the value restored from SSR by the RTE instruction. The SR and MD values defined prior to RTE execution are used to fetch the instruction in the RTE delay slot.


Operation

RTE( ) /* RTE */

{

unsigned int temp;

temp=PC;

SR=SSR;

PC=SPC;

Delay_Slot(temp+2);

}

Example

RTE ;Return to original routine.

ADD #8,R14 ;Executed before branch.

Note: In a delayed branch, the actual branch operation occurs after execution of the slot instruction, but instruction execution (register updating, etc.) is in fact performed in delayed branch instruction → delay slot instruction order. For example, even if the register holding the branch destination address is modified in the delay slot, the branch destination address will still be the register contents prior to the modification.


10.80 RTS ReTurn from Subroutine Branch Instruction

Return from Subroutine Procedure

Delayed Branch Instruction


RTS PR → PC 0000000000001011 2 —

Description

This instruction returns from a subroutine procedure by restoring the PC from PR. Processing continues from the address indicated by the restored PC value. This instruction can be used to return from a subroutine procedure called by a BSR or JSR instruction to the source of the call.

Notes



The instruction that restores PR must be executed before the RTS instruction. This restore instruction cannot be in the RTS delay slot.

Operation

RTS( ) /* RTS */

{

unsigned int temp;

temp=PC;

PC=PR;

Delay_Slot(temp+2);

}


Example

MOV.L TABLE,R3 ;R3 = TRGET address

JSR @R3 ; Branch to TRGET.

NOP ;NOP executed before branch.

ADD R0,R1 ;← Subroutine procedure return destination (PR contents)

........

TABLE: .data.l TRGET ;Jump table

........

TRGET: MOV R1,R0 ;← Entry to procedure

RTS ;PR contents → PC



10.81 SETS SET S bit System Control Instruction

S Bit Setting


SETS 1 → S 0000000001011000 1 —

Description

This instruction sets the S bit to 1.

Operation

SETS( ) /* SETS */

{

S=1;

PC+=2;

}

Example

SETS ;Before execution S = 0

;After execution S = 1


10.82 SETT SET T bit System Control Instruction

T Bit Setting


SETT 1 → T 0000000000011000 1 1

Description

This instruction sets the T bit to 1.

Operation

SETT( ) /* SETT */

{

T=1;

PC+=2;

}

Example

SETT ;Before execution T = 0

;After execution T = 1



10.83 SHAD SHift Arithmetic Dynamically Shift Instruction

Dynamic Arithmetic Shift


SHAD Rm, Rn When Rm ≥ 0, Rn << Rm → Rn

When Rm < 0, Rn >> Rm → [MSB → Rn]


Description

This instruction arithmetically shifts the contents of general register Rn. General register Rm specifies the shift direction and the number of bits to be shifted.

Rn register contents are shifted to the left if the Rm register value is positive, and to the right if negative. In a shift to the right, the MSB is added at the upper end.

The number of bits to be shifted is specified by the lower 5 bits (bits 4 to 0) of the Rm register. If the value is negative (MSB = 1), the Rm register is represented as a two’s complement. The left shift range is 0 to 31, and the right shift range, 1 to 32.

[Figure: SHAD. When Rm ≥ 0, Rn is shifted left and 0s enter at the LSB; when Rm < 0, Rn is shifted right and the MSB is replicated at the upper end.]


Operation

SHAD(int m,n) /*SHAD Rm,Rn */

{

int sgn=R[m] & 0x80000000;

if (sgn==0)

R[n] <<= (R[m] & 0x1F);

else if ((R[m] & 0x1F) == 0) {

if ((R[n] & 0x80000000) == 0)

R[n] = 0;

else

R[n] = 0xFFFFFFFF;

}

else

R[n]=(long)R[n] >> ((~R[m] & 0x1F)+1);

PC+=2;

}

Example

SHAD R1,R2 ;Before execution R1 = H’FFFFFFEC, R2 = H’80180000

;After execution R1 = H’FFFFFFEC, R2 = H’FFFFF801

SHAD R3,R4 ;Before execution R3 = H’00000014, R4 = H’FFFFF801

;After execution R3 = H’00000014, R4 = H’80100000
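The shift-count decoding can be reproduced in portable C as follows. This is a sketch, not the manual's code: the function name shad is hypothetical, and the right shift is built from an unsigned shift plus an explicit sign fill so that it does not depend on how a given compiler shifts negative values.

```c
#include <stdint.h>

/* Dynamic arithmetic shift. rm >= 0 (MSB clear): shift left by rm[4:0].
   rm < 0: shift right by 32 - rm[4:0], i.e. 1 to 32 bits, copying the MSB. */
uint32_t shad(uint32_t rn, uint32_t rm)
{
    unsigned amount = rm & 0x1Fu;
    uint32_t sign_fill = (rn & 0x80000000u) ? 0xFFFFFFFFu : 0u;
    unsigned count;

    if ((rm & 0x80000000u) == 0)
        return rn << amount;               /* left shift, 0 to 31 bits */
    if (amount == 0)
        return sign_fill;                  /* right shift by 32 bits */
    count = ((~rm) & 0x1Fu) + 1u;          /* right shift, 1 to 31 bits */
    return (rn >> count) | (sign_fill << (32u - count));
}
```

For R1 = H’FFFFFFEC (that is, -20) the lower 5 bits are H’0C, so the right-shift count is 32 - 12 = 20, which reproduces the first example.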


10.84 SHAL SHift Arithmetic Left Shift Instruction

One-Bit Left Arithmetic Shift


SHAL Rn T ← Rn ← 0 0100nnnn00100000 1 MSB

Description

This instruction arithmetically shifts the contents of general register Rn one bit to the left, and stores the result in Rn. The bit shifted out of the operand is transferred to the T bit.

[Figure: SHAL. The MSB moves into T, and 0 enters at the LSB.]

Operation

SHAL(long n) /* SHAL Rn (Same as SHLL) */

{

if ((R[n]&0x80000000)==0) T=0;

else T=1;

R[n]<<=1;

PC+=2;

}

Example

SHAL R0 ;Before execution R0 = H’80000001, T = 0

;After execution R0 = H’00000002, T = 1



10.85 SHAR SHift Arithmetic Right Shift Instruction

One-Bit Right Arithmetic Shift


SHAR Rn MSB → Rn → T 0100nnnn00100001 1 LSB

Description

This instruction arithmetically shifts the contents of general register Rn one bit to the right, and stores the result in Rn. The bit shifted out of the operand is transferred to the T bit.

[Figure: SHAR. The LSB moves into T, and the MSB is retained.]

Operation

SHAR(long n) /* SHAR Rn */

{

long temp;

if ((R[n]&0x00000001)==0) T=0;

else T=1;

if ((R[n]&0x80000000)==0) temp=0;

else temp=1;

R[n]>>=1;

if (temp==1) R[n]|=0x80000000;

else R[n]&=0x7FFFFFFF;


PC+=2;

}

Example

SHAR R0 ;Before execution R0 = H’80000001, T = 0

;After execution R0 = H’C0000000, T = 1


10.86 SHLD SHift Logical Dynamically Shift Instruction

Dynamic Logical Shift


SHLD Rm, Rn When Rm ≥ 0, Rn << Rm → Rn

When Rm < 0, Rn >> Rm → [0 → Rn]


Description

This instruction logically shifts the contents of general register Rn. General register Rm specifies the shift direction and the number of bits to be shifted.

Rn register contents are shifted to the left if the Rm register value is positive, and to the right if negative. In a shift to the right, 0s are added at the upper end.

The number of bits to be shifted is specified by the lower 5 bits (bits 4 to 0) of the Rm register. If the value is negative (MSB = 1), the Rm register is represented as a two’s complement. The left shift range is 0 to 31, and the right shift range, 1 to 32.

[Figure: SHLD. When Rm ≥ 0, Rn is shifted left and 0s enter at the LSB; when Rm < 0, Rn is shifted right and 0s enter at the MSB.]


Operation

SHLD(int m,n)/*SHLD Rm,Rn */

{

int sgn = R[m] & 0x80000000;

if (sgn == 0)

R[n] <<= (R[m] & 0x1F);

else if ((R[m] & 0x1F) == 0)

R[n] = 0;

else

R[n]=(unsigned)R[n] >> ((~R[m] & 0x1F)+1);

PC+=2;

}

Example

SHLD R1, R2 ;Before execution R1 = H’FFFFFFEC, R2 = H’80180000

;After execution R1 = H’FFFFFFEC, R2 = H’00000801

SHLD R3, R4 ;Before execution R3 = H’00000014, R4 = H’FFFFF801

;After execution R3 = H’00000014, R4 = H’80100000
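The same decoding applies to the logical variant, with zeros shifted in on the right-shift path. The function name shld is hypothetical; the body is a minimal sketch that mirrors the Operation code using fixed-width unsigned types.

```c
#include <stdint.h>

/* Dynamic logical shift. rm >= 0 (MSB clear): shift left by rm[4:0].
   rm < 0: shift right by 32 - rm[4:0], i.e. 1 to 32 bits, filling with 0s. */
uint32_t shld(uint32_t rn, uint32_t rm)
{
    unsigned amount = rm & 0x1Fu;

    if ((rm & 0x80000000u) == 0)
        return rn << amount;               /* left shift, 0 to 31 bits */
    if (amount == 0)
        return 0u;                         /* right shift by 32 bits */
    return rn >> (((~rm) & 0x1Fu) + 1u);   /* right shift, 1 to 31 bits */
}
```

Compared with SHAD, only the right-shift results differ: the first example yields H’00000801 here instead of H’FFFFF801.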


10.87 SHLL SHift Logical Left Shift Instruction

One-Bit Left Logical Shift


SHLL Rn T ← Rn ← 0 0100nnnn00000000 1 MSB

Description

This instruction logically shifts the contents of general register Rn one bit to the left, and stores the result in Rn. The bit shifted out of the operand is transferred to the T bit.

[Figure: SHLL. The MSB moves into T, and 0 enters at the LSB.]

Operation

SHLL(long n) /* SHLL Rn (Same as SHAL) */

{

if ((R[n]&0x80000000)==0) T=0;

else T=1;

R[n]<<=1;

PC+=2;

}

Example

SHLL R0 ;Before execution R0 = H’80000001, T = 0

;After execution R0 = H’00000002, T = 1



10.88 SHLLn n bits SHift Logical Left Shift Instruction

n-Bit Left Logical Shift


SHLL2 Rn Rn<<2 → Rn 0100nnnn00001000 1 —

SHLL8 Rn Rn<<8 → Rn 0100nnnn00011000 1 —

SHLL16 Rn Rn<<16 → Rn 0100nnnn00101000 1 —

Description

This instruction logically shifts the contents of general register Rn 2, 8, or 16 bits to the left, and stores the result in Rn. The bits shifted out of the operand are discarded.

[Figure: SHLL2, SHLL8, SHLL16. Rn is shifted left by 2, 8, or 16 bits, and 0s enter at the LSB.]


Operation

SHLL2(long n) /* SHLL2 Rn */

{

R[n]<<=2;

PC+=2;

}


SHLL8(long n) /* SHLL8 Rn */

{

R[n]<<=8;

PC+=2;

}


SHLL16(long n) /* SHLL16 Rn */

{

R[n]<<=16;

PC+=2;

}

Example

SHLL2 R0 ;Before execution R0 = H’12345678

;After execution R0 = H’48D159E0






10.89 SHLR SHift Logical Right Shift Instruction

One-Bit Right Logical Shift


SHLR Rn 0 → Rn → T 0100nnnn00000001 1 LSB

Description

This instruction logically shifts the contents of general register Rn one bit to the right, and stores the result in Rn. The bit shifted out of the operand is transferred to the T bit.

[Figure: SHLR. The LSB moves into T, and 0 enters at the MSB.]

Operation

SHLR(long n) /* SHLR Rn */

{

if ((R[n]&0x00000001)==0) T=0;

else T=1;

R[n]>>=1;

R[n]&=0x7FFFFFFF;

PC+=2;

}

Example

SHLR R0 ;Before execution R0 = H’80000001, T = 0

;After execution R0 = H’40000000, T = 1



10.90 SHLRn n bits SHift Logical Right Shift Instruction

n-Bit Right Logical Shift


SHLR2 Rn Rn>>2 → Rn 0100nnnn00001001 1 —

SHLR8 Rn Rn>>8 → Rn 0100nnnn00011001 1 —

SHLR16 Rn Rn>>16 → Rn 0100nnnn00101001 1 —

Description

This instruction logically shifts the contents of general register Rn 2, 8, or 16 bits to the right, and stores the result in Rn. The bits shifted out of the operand are discarded.

[Figure: SHLR2, SHLR8, SHLR16. Rn is shifted right by 2, 8, or 16 bits, and 0s enter at the MSB.]


Operation

SHLR2(long n) /* SHLR2 Rn */

{

R[n]>>=2;

R[n]&=0x3FFFFFFF;

PC+=2;

}


SHLR8(long n) /* SHLR8 Rn */

{

R[n]>>=8;

R[n]&=0x00FFFFFF;

PC+=2;

}


SHLR16(long n) /* SHLR16 Rn */

{

R[n]>>=16;

R[n]&=0x0000FFFF;

PC+=2;

}

Example

SHLR2 R0 ;Before execution R0 = H’12345678

;After execution R0 = H’048D159E






10.91 SLEEP SLEEP System Control Instruction

Transition to Power-Down Mode (Privileged Instruction)


SLEEP Sleep 0000000000011011 4 —

Description

This instruction places the CPU in the power-down state.

In power-down mode, the CPU retains its internal state, but immediately stops executing instructions and waits for an interrupt request. When it receives an interrupt request, the CPU exits the power-down state.

SLEEP is a privileged instruction, and can only be used in privileged mode. Use of this instruction in user mode will cause an illegal instruction exception.

Notes

SLEEP performance depends on the standby control register (STBCR). See section 9, Power-Down Modes, for details.

Operation

SLEEP( ) /* SLEEP */

{

Sleep_standby();

}

Example

SLEEP ;Transition to power-down mode


10.92 STC STore Control register System Control Instruction

Store from Control Register (Privileged Instruction)


STC SR, Rn SR → Rn 0000nnnn00000010 2 —

STC GBR, Rn GBR → Rn 0000nnnn00010010 2 —

STC VBR, Rn VBR → Rn 0000nnnn00100010 2 —

STC SSR, Rn SSR → Rn 0000nnnn00110010 2 —

STC SPC, Rn SPC → Rn 0000nnnn01000010 2 —

STC SGR, Rn SGR → Rn 0000nnnn00111010 3 —

STC DBR, Rn DBR → Rn 0000nnnn11111010 2 —

STC R0_BANK, Rn R0_BANK → Rn 0000nnnn10000010 2 —








STC.L SR, @-Rn Rn-4 → Rn, SR → (Rn) 0100nnnn00000011 2 —

STC.L GBR, @-Rn Rn-4 → Rn, GBR → (Rn) 0100nnnn00010011 2 —

STC.L VBR, @-Rn Rn-4 → Rn, VBR → (Rn) 0100nnnn00100011 2 —

STC.L SSR, @-Rn Rn-4 → Rn, SSR → (Rn) 0100nnnn00110011 2 —

STC.L SPC, @-Rn Rn-4 → Rn, SPC → (Rn) 0100nnnn01000011 2 —

STC.L SGR, @-Rn Rn-4 → Rn, SGR → (Rn) 0100nnnn00110010 3 —

STC.L DBR, @-Rn Rn-4 → Rn, DBR → (Rn) 0100nnnn11110010 2 —

STC.L R0_BANK, @-Rn Rn-4 → Rn, R0_BANK → (Rn) 0100nnnn10000011 2 —









Description

This instruction stores control register SR, GBR, VBR, SSR, SPC, SGR, DBR or Rm_BANK (m = 0–7) in the destination.

Rm_BANK operands are specified by the RB bit of the SR register: when the RB bit is 1, Rm_BANK0 is accessed; when the RB bit is 0, Rm_BANK1 is accessed.

Notes

STC and STC.L can only be used in privileged mode, except for STC GBR, Rn and STC.L GBR, @-Rn. Use of the privileged forms in user mode will cause an illegal instruction exception.

Operation

STCSR(int n) /* STC SR,Rn : Privileged */

{

R[n]=SR;

PC+=2;

}

STCGBR(int n) /* STC GBR,Rn */

{

R[n]=GBR;

PC+=2;

}

STCVBR(int n) /* STC VBR,Rn : Privileged */

{

R[n]=VBR;

PC+=2;

}

STCSSR(int n) /* STC SSR,Rn : Privileged */

{

R[n]=SSR;

PC+=2;

}


STCSPC(int n) /* STC SPC,Rn : Privileged */

{

R[n]=SPC;

PC+=2;

}

STCSGR(int n) /* STC SGR,Rn : Privileged */

{

R[n]=SGR;

PC+=2;

}

STCDBR(int n) /* STC DBR,Rn : Privileged */

{

R[n]=DBR;

PC+=2;

}

STCRm_BANK(int n) /* STC Rm_BANK,Rn : Privileged */

/* m=0–7 */

{

R[n]=Rm_BANK;

PC+=2;

}

STCMSR(int n) /* STC.L SR,@-Rn : Privileged */

{

R[n]-=4;

Write_Long(R[n],SR);

PC+=2;

}

STCMGBR(int n) /* STC.L GBR,@-Rn */

{

R[n]-=4;

Write_Long(R[n],GBR);

PC+=2;


}

STCMVBR(int n) /* STC.L VBR,@-Rn : Privileged */

{

R[n]-=4;

Write_Long(R[n],VBR);

PC+=2;

}

STCMSSR(int n) /* STC.L SSR,@-Rn : Privileged */

{

R[n]-=4;

Write_Long(R[n],SSR);

PC+=2;

}

STCMSPC(int n) /* STC.L SPC,@-Rn : Privileged */

{

R[n]-=4;

Write_Long(R[n],SPC);

PC+=2;

}

STCMSGR(int n) /* STC.L SGR,@-Rn : Privileged */

{

R[n]-=4;

Write_Long(R[n],SGR);

PC+=2;

}

STCMDBR(int n) /* STC.L DBR,@-Rn : Privileged */

{

R[n]-=4;

Write_Long(R[n],DBR);

PC+=2;

}


STCMRm_BANK(int n) /* STC.L Rm_BANK,@-Rn : Privileged */

/* m=0–7 */

{

R[n]-=4;

Write_Long(R[n],Rm_BANK);

PC+=2;

}

Possible Exceptions:

• General illegal instruction exception

• Slot illegal instruction exception

• Data TLB miss exception


• Address error


10.93 STS STore System register System Control Instruction

Store from System Register


STS MACH,Rn MACH → Rn 0000nnnn00001010 1 —

STS MACL,Rn MACL → Rn 0000nnnn00011010 1 —

STS PR,Rn PR → Rn 0000nnnn00101010 1 —

STS.L MACH,@-Rn Rn-4 → Rn, MACH → (Rn) 0100nnnn00000010 1 —

STS.L MACL,@-Rn Rn-4 → Rn, MACL → (Rn) 0100nnnn00010010 1 —

STS.L PR,@-Rn Rn-4 → Rn, PR → (Rn) 0100nnnn00100010 1 —

Description

This instruction stores system register MACH, MACL, or PR in the destination.

Operation

STSMACH(int n) /* STS MACH,Rn */

{

R[n]=MACH;

PC+=2;

}

STSMACL(int n) /* STS MACL,Rn */

{

R[n]=MACL;

PC+=2;

}

STSPR(int n) /* STS PR,Rn */

{

R[n]=PR;

PC+=2;

}

STSMMACH(int n) /* STS.L MACH,@-Rn */

{


R[n]-=4;

Write_Long(R[n],MACH);

PC+=2;

}

STSMMACL(int n) /* STS.L MACL,@-Rn */

{

R[n]-=4;

Write_Long(R[n],MACL);

PC+=2;

}

STSMPR(int n) /* STS.L PR,@-Rn */

{

R[n]-=4;

Write_Long(R[n],PR);

PC+=2;

}



Possible Exceptions:

• Address error

Example

STS MACH,R0 ; Before execution R0 = H’FFFFFFFF, MACH = H’00000000

; After execution R0 = H’00000000

STS.L PR,@-R15 ; Before execution R15 = H’10000004

; After execution R15 = H’10000000, (R15) = PR


10.94 STS STore from FPU System register System Control Instruction

Store from FPU System Register


STS FPUL,Rn FPUL → Rn 0000nnnn01011010 1 —

STS FPSCR,Rn FPSCR → Rn 0000nnnn01101010 1 —

STS.L FPUL,@-Rn Rn-4 → Rn, FPUL → (Rn) 0100nnnn01010010 1 —

STS.L FPSCR,@-Rn Rn-4 → Rn, FPSCR → (Rn) 0100nnnn01100010 1 —

Description

This instruction stores FPU system register FPUL or FPSCR in the destination.

Operation

STS(int n, int *FPUL) /* STS FPUL,Rn */

{

R[n]= *FPUL;

PC+=2;

}

STS_SAVE(int n, int *FPUL) /* STS.L FPUL,@-Rn */

{

R[n]-=4;

Write_Long(R[n],*FPUL) ;

PC+=2;

}

STS(int n) /* STS FPSCR,Rn */

{

R[n]=FPSCR&0x003FFFFF;

PC+=2;

}

STS_RESTORE(int n) /* STS.L FPSCR,@-Rn */

{

R[n]-=4;

Write_Long(R[n],FPSCR&0x003FFFFF);


PC+=2;

}



Possible Exceptions:

• Address error

Examples

• STS

Example 1:

MOV.L #H’12ABCDEF, R12

LDS R12, FPUL

STS FPUL, R13

; After executing the STS instruction:

; R13 = 12ABCDEF

Example 2:

STS FPSCR, R2

; After executing the STS instruction:

; The current content of FPSCR is stored in register R2

• STS.L

Example 1:

MOV.L #H’0C700148, R7

STS.L FPUL, @-R7

; Before executing the STS.L instruction:

; R7 = 0C700148

; After executing the STS.L instruction:

; R7 = 0C700144, and the content of FPUL is saved at memory

; location 0C700144.

Example 2:

MOV.L #H’0C700154, R8

STS.L FPSCR, @-R8

; After executing the STS.L instruction:

; The content of FPSCR is saved at memory location 0C700150.


10.95 SUB SUBtract binary Arithmetic Instruction

Binary Subtraction


SUB Rm,Rn Rn-Rm → Rn 0011nnnnmmmm1000 1 —

Description

This instruction subtracts the contents of general register Rm from the contents of general register Rn and stores the result in Rn. For immediate data subtraction, ADD #imm,Rn should be used.

Operation

SUB(long m, long n) /* SUB Rm,Rn */

{

R[n]-=R[m];

PC+=2;

}

Example

SUB R0,R1 ;Before execution R0 = H’00000001, R1 = H’80000000

;After execution R1 = H’7FFFFFFF


10.96 SUBC SUBtract with Carry Arithmetic Instruction

Binary Subtraction with Borrow


SUBC Rm,Rn Rn-Rm-T → Rn, borrow → T 0011nnnnmmmm1010 1 Borrow

Description

This instruction subtracts the contents of general register Rm and the T bit from the contents of general register Rn, and stores the result in Rn. A borrow resulting from the operation is reflected in the T bit. This instruction is used for subtractions exceeding 32 bits.

Operation

SUBC(long m, long n) /* SUBC Rm,Rn */

{

unsigned long tmp0,tmp1;

tmp1=R[n]-R[m];

tmp0=R[n];

R[n]=tmp1-T;

if (tmp0<tmp1) T=1;

else T=0;

if (tmp1<R[n]) T=1;

PC+=2;

}

Example

CLRT ;R0:R1(64 bits) – R2:R3(64 bits) = R0:R1(64 bits)

SUBC R3,R1 ;Before execution T = 0, R1 = H'00000000, R3 = H'00000001

;After execution T = 1, R1 = H'FFFFFFFF

SUBC R2,R0 ;Before execution T = 1, R0 = H'00000000, R2 = H'00000000

;After execution T = 1, R0 = H'FFFFFFFF
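The 64-bit subtraction in this example can be modeled in C by chaining two SUBC steps through an explicit T variable. The helper names subc and sub64 are hypothetical; the borrow logic follows the Operation code above.

```c
#include <stdint.h>

/* One SUBC step: returns rn - rm - T and replaces *t with the borrow. */
uint32_t subc(uint32_t rn, uint32_t rm, unsigned *t)
{
    uint32_t tmp1 = rn - rm;
    uint32_t result = tmp1 - *t;
    *t = (rn < tmp1) || (tmp1 < result);
    return result;
}

/* 64-bit subtraction a - b using register pairs:
   CLRT, SUBC on the lower halves, then SUBC on the upper halves. */
uint64_t sub64(uint64_t a, uint64_t b)
{
    unsigned t = 0;                                        /* CLRT */
    uint32_t lo = subc((uint32_t)a, (uint32_t)b, &t);
    uint32_t hi = subc((uint32_t)(a >> 32), (uint32_t)(b >> 32), &t);
    return ((uint64_t)hi << 32) | lo;
}
```

Subtracting 1 from 0 leaves both halves at H’FFFFFFFF with T = 1, as in the example; the final T value indicates a borrow out of the full 64-bit result.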


10.97 SUBV SUBtract with (V flag) underflow check Arithmetic Instruction

Binary Subtraction with Underflow Check


SUBV Rm,Rn Rn-Rm → Rn, underflow → T 0011nnnnmmmm1011 1 Underflow

Description

This instruction subtracts the contents of general register Rm from the contents of general register Rn, and stores the result in Rn. If underflow occurs, the T bit is set.

Operation

SUBV(long m, long n) /* SUBV Rm,Rn */

{

long dest,src,ans;

if ((long)R[n]>=0) dest=0;

else dest=1;

if ((long)R[m]>=0) src=0;

else src=1;

src+=dest;

R[n]-=R[m];

if ((long)R[n]>=0) ans=0;

else ans=1;

ans+=dest;

if (src==1) {

if (ans==1) T=1;

else T=0;

}

else T=0;

PC+=2;

}


Example

SUBV R0,R1 ;Before execution R0 = H’00000002, R1 = H’80000001

;After execution R1 = H’7FFFFFFF, T = 1

SUBV R2,R3 ;Before execution R2 = H’FFFFFFFE, R3 = H’7FFFFFFE

;After execution R3 = H’80000000, T = 1
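The sign comparisons in the Operation code amount to a standard signed-overflow test: T is set when the two operands have different signs and the sign of the result differs from the sign of Rn. The C sketch below uses an equivalent sign-bit formulation rather than the manual's dest/src/ans variables; the function name subv is hypothetical.

```c
#include <stdint.h>

/* Subtract rm from rn and report signed overflow/underflow in *t.
   (rn ^ rm) has its MSB set when the operand signs differ;
   (rn ^ result) has its MSB set when the result sign differs from rn. */
uint32_t subv(uint32_t rn, uint32_t rm, unsigned *t)
{
    uint32_t result = rn - rm;
    *t = (((rn ^ rm) & (rn ^ result)) >> 31) & 1u;
    return result;
}
```

Both examples set T: the first wraps from the most negative range into a positive value, and the second wraps from the most positive range into a negative value.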



10.98 SWAP SWAP register halves Data Transfer Instruction

Upper-/Lower-Half Swap


SWAP.B Rm,Rn Rm → lower-2-byte upper-/lower-byte swap → Rn


SWAP.W Rm,Rn Rm → upper-/lower-word swap → Rn 0110nnnnmmmm1001 1 —

Description

This instruction swaps the upper and lower parts of the contents of general register Rm, and stores the result in Rn.

In the case of a byte specification, the 8 bits from bit 15 to bit 8 of Rm are swapped with the 8 bits from bit 7 to bit 0. The upper 16 bits of Rm are transferred directly to the upper 16 bits of Rn.

In the case of a word specification, the 16 bits from bit 31 to bit 16 of Rm are swapped with the 16 bits from bit 15 to bit 0.

Operation

SWAPB(long m, long n) /* SWAP.B Rm,Rn */

{

unsigned long temp0,temp1;

temp0=R[m]&0xFFFF0000;

temp1=(R[m]&0x000000FF)<<8;

R[n]=(R[m]&0x0000FF00)>>8;

R[n]=R[n]|temp1|temp0;

PC+=2;

}

SWAPW(long m, long n) /* SWAP.W Rm,Rn */

{

unsigned long temp;

temp=(R[m]>>16)&0x0000FFFF;

R[n]=R[m]<<16;


R[n]|=temp;

PC+=2;

}

Example

SWAP.B R0,R1 ;Before execution R0 = H’12345678

;After execution R1 = H’12347856

SWAP.W R0,R1 ;Before execution R0 = H’12345678

;After execution R1 = H’56781234
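A common use of the two swaps is a full 32-bit byte reversal (SWAP.B, SWAP.W, SWAP.B). The sketch below models both forms as pure C functions and composes them; the function names are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* SWAP.B: exchange bits 15-8 with bits 7-0; bits 31-16 pass through. */
uint32_t swap_b(uint32_t rm)
{
    return (rm & 0xFFFF0000u) | ((rm & 0x000000FFu) << 8)
                              | ((rm & 0x0000FF00u) >> 8);
}

/* SWAP.W: exchange the upper and lower 16-bit halves. */
uint32_t swap_w(uint32_t rm)
{
    return (rm << 16) | (rm >> 16);
}

/* Illustrative composition: reverse all four bytes of a longword. */
uint32_t byte_reverse32(uint32_t x)
{
    return swap_b(swap_w(swap_b(x)));
}

/* Checks the values that follow from the example input H'12345678. */
void swap_selfcheck(void)
{
    assert(swap_b(0x12345678u) == 0x12347856u);
    assert(swap_w(0x12345678u) == 0x56781234u);
    assert(byte_reverse32(0x12345678u) == 0x78563412u);
}
```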


10.99 TAS Test And Set Logical Instruction

Memory Test and Bit Setting


TAS.B @Rn If (Rn) = 0, 1 → T, else 0 → T; 1 → MSB of (Rn) 0100nnnn00011011 5 Test result

Description

This instruction purges the cache block corresponding to the memory area specified by the contents of general register Rn, reads the byte data indicated by that address, and sets the T bit to 1 if that data is zero, or clears the T bit to 0 if the data is nonzero. The instruction then sets bit 7 to 1 and writes to the same address. The bus is not released during this period.

The purge operation is executed as follows.

In a purge operation, data is accessed using the contents of general register Rn as the effective address. If there is a cache hit and the corresponding cache block is dirty (U bit = 1), the contents of that cache block are written back to external memory, and the cache block is then invalidated (by clearing the V bit to 0). If there is a cache hit and the corresponding cache block is clean (U bit = 0), the cache block is simply invalidated (by clearing the V bit to 0). A purge is not executed in the event of a cache miss, or if the accessed memory location is non-cacheable.

The two TAS.B memory accesses are executed atomically. No other memory access is executed between the two TAS.B accesses.

Operation

TAS(int n) /* TAS.B @Rn */

{

int temp;

temp=(int)Read_Byte(R[n]); /* Bus Lock */

if (temp==0) T=1;

else T=0;

temp|=0x00000080;

Write_Byte(R[n],temp); /* Bus unlock */

PC+=2;

}

• Address error

Exceptions are checked by treating the data access of this instruction as a byte store.
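The read-test-set sequence described for TAS.B is the usual building block for a simple lock. The sketch below models it in portable C11 with `atomic_fetch_or` from `<stdatomic.h>`. It does not model the cache purge or the bus lock, and `tas_lock`, `tas_b`, `lock_acquire`, `lock_release`, and `tas_selfcheck` are hypothetical names used only here.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical lock byte: 0 = free, nonzero = held. */
typedef atomic_uchar tas_lock;

/* Models TAS.B: atomically set bit 7, return T (1 if the byte was zero). */
int tas_b(tas_lock *p)
{
    unsigned char old = atomic_fetch_or(p, 0x80);
    return old == 0;
}

/* Spin until the test-and-set observes a free lock. */
void lock_acquire(tas_lock *p)
{
    while (!tas_b(p)) {
        /* busy-wait; a real system might yield or sleep here */
    }
}

/* Release by storing zero, so the next TAS.B returns T = 1. */
void lock_release(tas_lock *p)
{
    atomic_store(p, 0);
}

/* Walks through one acquire/contend/release cycle. */
void tas_selfcheck(void)
{
    tas_lock lock;
    atomic_init(&lock, 0);
    assert(tas_b(&lock) == 1); /* byte was zero: T = 1, bit 7 now set */
    assert(tas_b(&lock) == 0); /* byte is H'80: T = 0 */
    lock_release(&lock);
}
```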

10.100 TRAPA TRAP Always System Control Instruction

Trap Exception Handling


TRAPA #imm imm → TRA, PC+2 → SPC, SR → SSR, 1 → SR.MD/BL/RB, 0x160 → EXPEVT, VBR+H’00000100 → PC


Description

This instruction starts trap exception handling. The values of (PC + 2) and SR are saved to SPC and SSR, and 8-bit immediate data is stored in the TRA register (bits 9 to 2). The processor mode is switched to privileged mode (the MD bit in SR is set to 1), and the BL bit and RB bit in SR are set to 1. As a result, exception and interrupt requests are masked (not accepted), and the BANK1 registers (R0_BANK1 to R7_BANK1) are selected. Exception code 0x160 is written to the EXPEVT register (bits 11 to 0). The program branches to address (VBR + H’00000100), indicated by the sum of the VBR register contents and offset H’00000100.

Operation

TRAPA(int i) /* TRAPA #imm */

{

int imm;

imm=(0x000000FF & i);

TRA=imm<<2;

SSR=SR;

SPC=PC+2;

SR.MD=1;

SR.BL=1;

SR.RB=1;

EXPEVT=0x00000160;

PC=VBR+0x00000100;

}
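Because the immediate is stored in bits 9 to 2 of TRA, a handler that wants the trap number has to shift it back down. The following sketch shows the encoding and one possible decoding step. `tra_encode`, `tra_decode`, `trap_number`, and `EXPEVT_TRAPA` are hypothetical names, and how the handler obtains the EXPEVT and TRA values is left out.

```c
#include <assert.h>
#include <stdint.h>

#define EXPEVT_TRAPA 0x160u /* exception code written by TRAPA */

/* TRA value produced by TRAPA #imm: the 8-bit immediate in bits 9 to 2. */
uint32_t tra_encode(uint32_t imm)
{
    return (imm & 0xFFu) << 2;
}

/* Recover the TRAPA immediate from a TRA value. */
uint32_t tra_decode(uint32_t tra)
{
    return (tra >> 2) & 0xFFu;
}

/* Hypothetical dispatcher step: returns the trap number, or -1 if the
   exception code in EXPEVT bits 11 to 0 is not the TRAPA code. */
int trap_number(uint32_t expevt, uint32_t tra)
{
    if ((expevt & 0xFFFu) != EXPEVT_TRAPA)
        return -1;
    return (int)tra_decode(tra);
}

/* Round-trips one immediate through the TRA layout. */
void trapa_selfcheck(void)
{
    assert(tra_encode(0x20u) == 0x80u);
    assert(trap_number(EXPEVT_TRAPA, tra_encode(0x20u)) == 0x20);
}
```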

10.101 TST TeST logical Logical Instruction

AND Operation, T Bit Setting


TST Rm,Rn Rn & Rm; if result is 0, 1 → T, else 0 → T 0010nnnnmmmm1000 1 Test result

TST #imm,R0 R0 & imm; if result is 0, 1 → T, else 0 → T 11001000iiiiiiii 1 Test result

TST.B #imm,@(R0,GBR) (R0 + GBR) & imm; if result is 0, 1 → T, else 0 → T 11001100iiiiiiii 3 Test result

Description

This instruction ANDs the contents of general registers Rn and Rm, and sets the T bit if the result is zero. If the result is nonzero, the T bit is cleared. The contents of Rn are not changed.

This instruction can be used to AND general register R0 contents with zero-extended 8-bit immediate data, or, in indexed GBR indirect addressing mode, to AND 8-bit memory with 8-bit immediate data. The contents of R0 or the memory are not changed.

Operation

TST(long m, long n) /* TST Rm,Rn */

{

if ((R[n]&R[m])==0) T=1;

else T=0;

PC+=2;

}

TSTI(long i) /* TST #imm,R0 */

{

long temp;

temp=R[0]&(0x000000FF & (long)i);

if (temp==0) T=1;

else T=0;

PC+=2;

}

TSTM(long i) /* TST.B #imm,@(R0,GBR) */

{

long temp;

temp=(long)Read_Byte(GBR+R[0]);

temp&=(0x000000FF & (long)i);

if (temp==0) T=1;

else T=0;

PC+=2;

}

Example

TST R0,R0 ;Before execution R0 = H’00000000

;After execution T = 1

TST #H’80,R0 ;Before execution R0 = H’FFFFFF7F

;After execution T = 1

TST.B #H’A5,@(R0,GBR) ;Before execution (R0,GBR) = H’A5

;After execution T = 0
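The three forms differ only in where the operands come from and how wide the comparison is. The sketch below models each as a pure C predicate that returns the T bit. The function names are hypothetical, and the memory form takes the byte value directly rather than modeling the (R0 + GBR) access.

```c
#include <assert.h>
#include <stdint.h>

/* TST Rm,Rn: T = 1 when the operands share no set bits. */
int tst(uint32_t rm, uint32_t rn)
{
    return (rn & rm) == 0;
}

/* TST #imm,R0: imm is zero-extended, so only bits 7-0 of R0 are tested. */
int tst_imm(uint8_t imm, uint32_t r0)
{
    return (r0 & (uint32_t)imm) == 0;
}

/* TST.B #imm,@(R0,GBR): tests one memory byte against the immediate. */
int tst_byte(uint8_t imm, uint8_t mem)
{
    return (mem & imm) == 0;
}

/* Evaluates the three example inputs from this section. */
void tst_selfcheck(void)
{
    assert(tst(0x00000000u, 0x00000000u) == 1);
    assert(tst_imm(0x80, 0xFFFFFF7Fu) == 1);
    assert(tst_byte(0xA5, 0xA5) == 0);
}
```

Testing a register against itself, as in the first example, is a compact way to check for zero without modifying the register.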


10.102 XOR eXclusive OR logical Logical Instruction

Exclusive Logical OR


XOR Rm,Rn Rn ^ Rm → Rn 0010nnnnmmmm1010 1 —

XOR #imm,R0 R0 ^ imm → R0 11001010iiiiiiii 1 —

XOR.B #imm,@(R0,GBR) (R0+GBR)^imm → (R0+GBR)


Description

This instruction exclusively ORs the contents of general registers Rn and Rm, and stores the result in Rn.

This instruction can be used to exclusively OR register R0 contents with zero-extended 8-bit immediate data, or, in indexed GBR indirect addressing mode, to exclusively OR 8-bit memory with 8-bit immediate data.

Operation

XOR(long m, long n) /* XOR Rm,Rn */

{

R[n]^=R[m];

PC+=2;

}

XORI(long i) /* XOR #imm,R0 */

{

R[0]^=(0x000000FF & (long)i);

PC+=2;

}

XORM(long i) /* XOR.B #imm,@(R0,GBR) */

{

int temp;

temp=(int)Read_Byte(GBR+R[0]);

temp^=(0x000000FF &(long)i);

Write_Byte(GBR+R[0],temp);

PC+=2;

}

Example

XOR R0,R1 ;Before execution R0 = H’AAAAAAAA, R1 = H’55555555

;After execution R1 = H’FFFFFFFF


XOR #H’F0,R0 ;Before execution R0 = H’FFFFFFFF

;After execution R0 = H’FFFFFF0F

XOR.B #H’A5,@(R0,GBR) ;Before execution (R0,GBR) = H’A5

;After execution (R0,GBR) = H’00
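Since the immediate is zero-extended, the immediate form can only toggle bits 7 to 0 of R0, and applying the same mask twice restores the original value. A minimal sketch with hypothetical helper names follows; the memory form is modeled on a plain byte pointer rather than a (R0 + GBR) access.

```c
#include <assert.h>
#include <stdint.h>

/* XOR #imm,R0: toggles only the low 8 bits selected by imm. */
uint32_t xor_imm(uint8_t imm, uint32_t r0)
{
    return r0 ^ (uint32_t)imm;
}

/* XOR.B #imm,@(R0,GBR): read-modify-write of one memory byte. */
void xor_byte(uint8_t imm, uint8_t *mem)
{
    *mem = (uint8_t)(*mem ^ imm);
}

/* Replays the immediate and memory examples from this section. */
void xor_selfcheck(void)
{
    uint8_t cell = 0xA5;
    assert(xor_imm(0xF0, 0xFFFFFFFFu) == 0xFFFFFF0Fu);
    xor_byte(0xA5, &cell);
    assert(cell == 0x00);
}
```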

10.103 XTRCT eXTRaCT Data Transfer Instruction

Middle Extraction from Linked Registers


XTRCT Rm,Rn Middle 32 bits of Rm:Rn → Rn 0010nnnnmmmm1101 1 —

Description

This instruction extracts the middle 32 bits from the 64-bit contents of linked general registers Rm and Rn, and stores the result in Rn.

[Figure: Rm (MSB to LSB) and Rn (MSB to LSB) linked as the 64-bit value Rm:Rn; the middle 32 bits are stored in Rn]

Operation

XTRCT(long m, long n) /* XTRCT Rm,Rn */

{

unsigned long temp;

temp=(R[m]<<16)&0xFFFF0000;

R[n]=(R[n]>>16)&0x0000FFFF;

R[n]|=temp;

PC+=2;

}

Example

XTRCT R0,R1 ;Before execution R0 = H’01234567, R1 = H’89ABCDEF

;After execution R1 = H’456789AB
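The "middle 32 bits" can be made explicit by building the 64-bit value Rm:Rn and shifting it right by 16. The sketch below does that and checks it against the mask-and-shift form in the Operation listing; `xtrct`, `xtrct_masks`, and `xtrct_selfcheck` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* XTRCT via a 64-bit intermediate: bits 47-16 of Rm:Rn. */
uint32_t xtrct(uint32_t rm, uint32_t rn)
{
    uint64_t linked = ((uint64_t)rm << 32) | rn; /* Rm is the upper half */
    return (uint32_t)(linked >> 16);
}

/* The same result using only 32-bit shifts, as in the Operation listing. */
uint32_t xtrct_masks(uint32_t rm, uint32_t rn)
{
    return ((rm << 16) & 0xFFFF0000u) | ((rn >> 16) & 0x0000FFFFu);
}

/* Replays the example: R0 = H'01234567, R1 = H'89ABCDEF. */
void xtrct_selfcheck(void)
{
    assert(xtrct(0x01234567u, 0x89ABCDEFu) == 0x456789ABu);
    assert(xtrct_masks(0x01234567u, 0x89ABCDEFu) == 0x456789ABu);
}
```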

Appendix A Address List

Table A.1 Address List

Module Register P4 Address Area 7 Address*1 Size Power-On Reset Manual Reset Sleep Standby Synchronization Clock

CCN PTEH H’FF00 0000 H’1F00 0000 32 Undefined Undefined Held Held Iclk

CCN PTEL H’FF00 0004 H’1F00 0004 32 Undefined Undefined Held Held Iclk

CCN TTB H’FF00 0008 H’1F00 0008 32 Undefined Undefined Held Held Iclk

CCN TEA H’FF00 000C H’1F00 000C 32 Undefined Held Held Held Iclk

CCN MMUCR H’FF00 0010 H’1F00 0010 32 H’0000 0000 H’0000 0000 Held Held Iclk

CCN BASRA H’FF00 0014 H’1F00 0014 8 Undefined Held Held Held Iclk

CCN BASRB H’FF00 0018 H’1F00 0018 8 Undefined Held Held Held Iclk

CCN CCR H’FF00 001C H’1F00 001C 32 H’0000 0000 H’0000 0000 Held Held Iclk

CCN TRA H’FF00 0020 H’1F00 0020 32 Undefined Undefined Held Held Iclk

CCN EXPEVT H’FF00 0024 H’1F00 0024 32 H’0000 0000 H’0000 0020 Held Held Iclk

CCN INTEVT H’FF00 0028 H’1F00 0028 32 Undefined Undefined Held Held Iclk

CCN PTEA H’FF00 0034 H’1F00 0034 32 Undefined Undefined Held Held Iclk

CCN QACR0 H’FF00 0038 H’1F00 0038 32 Undefined Undefined Held Held Iclk

CCN QACR1 H’FF00 003C H’1F00 003C 32 Undefined Undefined Held Held Iclk

UBC BARA H’FF20 0000 H’1F20 0000 32 Undefined Held Held Held Iclk

UBC BAMRA H’FF20 0004 H’1F20 0004 8 Undefined Held Held Held Iclk

UBC BBRA H’FF20 0008 H’1F20 0008 16 H’0000 Held Held Held Iclk

UBC BARB H’FF20 000C H’1F20 000C 32 Undefined Held Held Held Iclk

UBC BAMRB H’FF20 0010 H’1F20 0010 8 Undefined Held Held Held Iclk

UBC BBRB H’FF20 0014 H’1F20 0014 16 H’0000 Held Held Held Iclk

UBC BDRB H’FF20 0018 H’1F20 0018 32 Undefined Held Held Held Iclk

UBC BDMRB H’FF20 001C H’1F20 001C 32 Undefined Held Held Held Iclk

UBC BRCR H’FF20 0020 H’1F20 0020 16 H’0000*2 Held Held Held Iclk

BSC BCR1 H’FF80 0000 H’1F80 0000 32 H’0000 0000*2 Held Held Held Bclk

BSC BCR2 H’FF80 0004 H’1F80 0004 16 H’3FFC*2 Held Held Held Bclk

BSC WCR1 H’FF80 0008 H’1F80 0008 32 H’7777 7777 Held Held Held Bclk

BSC WCR2 H’FF80 000C H’1F80 000C 32 H’FFFE EFFF Held Held Held Bclk

BSC WCR3 H’FF80 0010 H’1F80 0010 32 H’0777 7777 Held Held Held Bclk

BSC MCR H’FF80 0014 H’1F80 0014 32 H’0000 0000 Held Held Held Bclk

BSC PCR H’FF80 0018 H’1F80 0018 16 H’0000 Held Held Held Bclk

BSC RTCSR H’FF80 001C H’1F80 001C 16 H’0000 Held Held Held Bclk

BSC RTCNT H’FF80 0020 H’1F80 0020 16 H’0000 Held Held Held Bclk

BSC RTCOR H’FF80 0024 H’1F80 0024 16 H’0000 Held Held Held Bclk

BSC RFCR H’FF80 0028 H’1F80 0028 16 H’0000 Held Held Held Bclk

BSC PCTRA H’FF80 002C H’1F80 002C 32 H’0000 0000 Held Held Held Bclk

BSC PDTRA H’FF80 0030 H’1F80 0030 16 Undefined Held Held Held Bclk

BSC PCTRB H’FF80 0040 H’1F80 0040 32 H’0000 0000 Held Held Held Bclk

BSC PDTRB H’FF80 0044 H’1F80 0044 16 Undefined Held Held Held Bclk

BSC GPIOIC H’FF80 0048 H’1F80 0048 16 H’0000 0000 Held Held Held Bclk

BSC SDMR2 H’FF90 xxxx H’1F90 xxxx 8 Write-only Bclk

BSC SDMR3 H’FF94 xxxx H’1F94 xxxx 8 Write-only Bclk

DMAC SAR0 H’FFA0 0000 H’1FA0 0000 32 Undefined Undefined Held Held Bclk

DMAC DAR0 H’FFA0 0004 H’1FA0 0004 32 Undefined Undefined Held Held Bclk

DMAC DMATCR0 H’FFA0 0008 H’1FA0 0008 32 Undefined Undefined Held Held Bclk

DMAC CHCR0 H’FFA0 000C H’1FA0 000C 32 H’0000 0000 H’0000 0000 Held Held Bclk













DMAC DMAOR H’FFA0 0040 H’1FA0 0040 32 H’0000 0000 H’0000 0000 Held Held Bclk

CPG FRQCR H’FFC0 0000 H’1FC0 0000 16 *2 Held Held Held Pclk

CPG STBCR H’FFC0 0004 H’1FC0 0004 8 H’00 Held Held Held Pclk

CPG WTCNT H’FFC0 0008 H’1FC0 0008 8/16*3 H’00 Held Held Held Pclk

CPG WTCSR H’FFC0 000C H’1FC0 000C 8/16*3 H’00 Held Held Held Pclk

CPG STBCR2 H’FFC0 0010 H’1FC0 0010 8 H’00 Held Held Held Pclk

RTC R64CNT H’FFC8 0000 H’1FC8 0000 8 Held Held Held Held Pclk

RTC RSECCNT H’FFC8 0004 H’1FC8 0004 8 Held Held Held Held Pclk

RTC RMINCNT H’FFC8 0008 H’1FC8 0008 8 Held Held Held Held Pclk

RTC RHRCNT H’FFC8 000C H’1FC8 000C 8 Held Held Held Held Pclk

RTC RWKCNT H’FFC8 0010 H’1FC8 0010 8 Held Held Held Held Pclk

RTC RDAYCNT H’FFC8 0014 H’1FC8 0014 8 Held Held Held Held Pclk

RTC RMONCNT H’FFC8 0018 H’1FC8 0018 8 Held Held Held Held Pclk

RTC RYRCNT H’FFC8 001C H’1FC8 001C 16 Held Held Held Held Pclk

RTC RSECAR H’FFC8 0020 H’1FC8 0020 8 Held*2 Held Held Held Pclk

RTC RMINAR H’FFC8 0024 H’1FC8 0024 8 Held*2 Held Held Held Pclk

RTC RHRAR H’FFC8 0028 H’1FC8 0028 8 Held*2 Held Held Held Pclk

RTC RWKAR H’FFC8 002C H’1FC8 002C 8 Held*2 Held Held Held Pclk

RTC RDAYAR H’FFC8 0030 H’1FC8 0030 8 Held*2 Held Held Held Pclk

RTC RMONAR H’FFC8 0034 H’1FC8 0034 8 Held*2 Held Held Held Pclk

RTC RCR1 H’FFC8 0038 H’1FC8 0038 8 H’00*2 H’00*2 Held Held Pclk

RTC RCR2 H’FFC8 003C H’1FC8 003C 8 H’09*2 H’00*2 Held Held Pclk

INTC ICR H’FFD0 0000 H’1FD0 0000 16 H’0000*2 H’0000*2 Held Held Pclk

INTC IPRA H’FFD0 0004 H’1FD0 0004 16 H’0000 H’0000 Held Held Pclk

INTC IPRB H’FFD0 0008 H’1FD0 0008 16 H’0000 H’0000 Held Held Pclk

INTC IPRC H’FFD0 000C H’1FD0 000C 16 H’0000 H’0000 Held Held Pclk

TMU TOCR H’FFD8 0000 H’1FD8 0000 8 H’00 H’00 Held Held Pclk

TMU TSTR H’FFD8 0004 H’1FD8 0004 8 H’00 H’00 Held H’00*2 Pclk

TMU TCOR0 H’FFD8 0008 H’1FD8 0008 32 H’FFFF FFFF H’FFFF FFFF Held Held Pclk

TMU TCNT0 H’FFD8 000C H’1FD8 000C 32 H’FFFF FFFF H’FFFF FFFF Held Held Pclk

TMU TCR0 H’FFD8 0010 H’1FD8 0010 16 H’0000 H’0000 Held Held Pclk

TMU TCNT1 H’FFD8 0018 H’1FD8 0018 32 H’FFFF FFFF H’FFFF FFFF Held Held Pclk

TMU TCR1 H’FFD8 001C H’1FD8 001C 16 H’0000 H’0000 Held Held Pclk


TMU TCNT2 H’FFD8 0024 H’1FD8 0024 32 H’FFFF FFFF H’FFFF FFFF Held Held Pclk

TMU TCR2 H’FFD8 0028 H’1FD8 0028 16 H’0000 H’0000 Held Held Pclk

TMU TCPR2 H’FFD8 002C H’1FD8 002C 32 Held Held Held Held Pclk

SCI SCSMR1 H’FFE0 0000 H’1FE0 0000 8 H’00 H’00 Held H’00 Pclk

SCI SCBRR1 H’FFE0 0004 H’1FE0 0004 8 H’FF H’FF Held H’FF Pclk

SCI SCSCR1 H’FFE0 0008 H’1FE0 0008 8 H’00 H’00 Held H’00 Pclk

SCI SCTDR1 H’FFE0 000C H’1FE0 000C 8 H’FF H’FF Held H’FF Pclk

SCI SCSSR1 H’FFE0 0010 H’1FE0 0010 8 H’84 H’84 Held H’84 Pclk

SCI SCRDR1 H’FFE0 0014 H’1FE0 0014 8 H’00 H’00 Held H’00 Pclk

SCI SCSCMR1 H’FFE0 0018 H’1FE0 0018 8 H’00 H’00 Held H’00 Pclk

SCI SCSPTR1 H’FFE0 001C H’1FE0 001C 8 H’00*2 H’00*2 Held H’00*2 Pclk

SCIF SCSMR2 H’FFE8 0000 H’1FE8 0000 16 H’0000 H’0000 Held Held Pclk

SCIF SCBRR2 H’FFE8 0004 H’1FE8 0004 8 H’FF H’FF Held Held Pclk

SCIF SCSCR2 H’FFE8 0008 H’1FE8 0008 16 H’0000 H’0000 Held Held Pclk

SCIF SCFTDR2 H’FFE8 000C H’1FE8 000C 8 Undefined Undefined Held Held Pclk

SCIF SCFSR2 H’FFE8 0010 H’1FE8 0010 16 H’0060 H’0060 Held Held Pclk

SCIF SCFRDR2 H’FFE8 0014 H’1FE8 0014 8 Undefined Undefined Held Held Pclk

SCIF SCFCR2 H’FFE8 0018 H’1FE8 0018 16 H’0000 H’0000 Held Held Pclk

SCIF SCFDR2 H’FFE8 001C H’1FE8 001C 16 H’0000 H’0000 Held Held Pclk

SCIF SCSPTR2 H’FFE8 0020 H’1FE8 0020 16 H’0000*2 H’0000*2 Held Held Pclk

SCIF SCLSR2 H’FFE8 0024 H’1FE8 0024 16 H’0000 H’0000 Held Held Pclk

Hitachi-UDI SDIR H’FFF0 0000 H’1FF0 0000 16 H’FFFF*2 Held Held Held Pclk

Hitachi-UDI SDDR H’FFF0 0008 H’1FF0 0008 32 Held Held Held Held Pclk

Notes: 1. With control registers, the above addresses in the physical page number field can be accessed by means of a TLB setting. When these addresses are referenced directly without using the TLB, operations are limited.

2. Includes undefined bits. See the descriptions of the individual modules.

3. Use word-size access when writing. Perform the write with the upper byte set to H’5A or H’A5, respectively. Byte- and longword-size writes cannot be used. Use byte-size access when reading.
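Two details of the table lend themselves to small helpers. In every row listed, the area 7 address equals the P4 address with the upper three bits cleared; this is an observation from the rows, not a rule stated in the table. Note 3 describes how WTCNT and WTCSR writes carry a key in the upper byte. The sketch below only computes the values; the helper names are hypothetical and no register access is performed.

```c
#include <assert.h>
#include <stdint.h>

/* Area 7 address for a P4 address, as observed in the rows of Table A.1
   (H'FFxx xxxx pairs with H'1Fxx xxxx). */
uint32_t p4_to_area7(uint32_t p4)
{
    return p4 & 0x1FFFFFFFu;
}

/* Word to write to WTCNT: upper byte H'5A, lower byte the new count. */
uint16_t wtcnt_write_word(uint8_t count)
{
    return (uint16_t)(0x5A00u | count);
}

/* Word to write to WTCSR: upper byte H'A5, lower byte the new setting. */
uint16_t wtcsr_write_word(uint8_t value)
{
    return (uint16_t)(0xA500u | value);
}

/* Spot-checks one table row and both key bytes. */
void address_list_selfcheck(void)
{
    assert(p4_to_area7(0xFFC00008u) == 0x1FC00008u); /* WTCNT row */
    assert(wtcnt_write_word(0x00) == 0x5A00u);
    assert(wtcsr_write_word(0x80) == 0xA580u);
}
```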

Appendix B Instruction Prefetch Side Effects

The SH4 is provided with an internal buffer for holding pre-read instructions, and always performs pre-reading. Therefore, program code must not be located in the last 20-byte area of any memory space. If program code is located in these areas, the memory area will be exceeded and a bus access for instruction pre-reading may be initiated. A case in which this is a problem is shown below.

Address      Instruction   Area
H'03FFFFF8   ADD R1,R4     Area 0   ← PC (program counter)
H'03FFFFFA   JMP @R2       Area 0
H'03FFFFFC   NOP           Area 0
H'03FFFFFE   NOP           Area 0
H'04000000   .             Area 1
H'04000002   .             Area 1   ← Instruction prefetch address

Figure B.1 Instruction Prefetch

Figure B.1 presupposes a case in which the instruction (ADD) indicated by the program counter (PC) and the instruction prefetch from address H’04000002 are executed simultaneously. It is also assumed that the program branches to an area outside area 1 after executing the following JMP instruction and delay slot instruction.

In this case, the program flow is unpredictable, and a bus access (instruction prefetch) to area 1may be initiated.

Instruction Prefetch Side Effects

1. It is possible that an external bus access caused by an instruction prefetch may result in misoperation of an external device, such as a FIFO, connected to the area concerned.

2. If there is no device to reply to an external bus request caused by an instruction prefetch, hangup will occur.

Remedies

1. These illegal instruction fetches can be avoided by using the MMU.

2. The problem can be avoided by not locating program code in the last 20 bytes of any area.
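The second remedy can be checked mechanically, for example in a build script or link-time assertion. The sketch below tests whether an instruction address falls in the last 20 bytes of its area. It assumes 64-Mbyte areas (an assumption taken from the H'04000000 boundary between area 0 and area 1 in Figure B.1), and `AREA_SIZE`, `PREFETCH_GUARD`, and `code_addr_is_safe` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define AREA_SIZE      0x04000000u /* assumption: 64-Mbyte areas */
#define PREFETCH_GUARD 20u         /* bytes to keep free of code */

/* Returns 1 if addr lies outside the last 20 bytes of its area. */
int code_addr_is_safe(uint32_t addr)
{
    uint32_t offset = addr % AREA_SIZE;
    return offset < AREA_SIZE - PREFETCH_GUARD;
}

/* The ADD at H'03FFFFF8 in Figure B.1 sits inside the guarded region. */
void prefetch_selfcheck(void)
{
    assert(code_addr_is_safe(0x03FFFFF8u) == 0);
    assert(code_addr_is_safe(0x03FFFFEAu) == 1);
}
```

If a memory device ends before the area boundary, the same check would need the device's end address in place of `AREA_SIZE`.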

SH7750 Programming Manual

Publication Date: 1st Edition, August 1998
2nd Edition, March 1999

Published by: Electronic Devices Sales & Marketing Group, Hitachi, Ltd.

Edited by: Technical Documentation Group, UL Media Co., Ltd.

Copyright © Hitachi, Ltd., 1998. All rights reserved. Printed in Japan.
