Real instruction set architectures
Part 2: internal CPU storage, overview of Intel architectures
Big-Endian vs. Little-Endian: quick recap
• In a big-endian machine, the bytes of a data item are stored left to right, so the most significant byte (the “big end”) sits at the lowest address
• Little-endian is the opposite: the least significant byte (the “little end”) is stored at the lowest address, and the most significant byte comes last
• Note that, in either case, the bits within each byte are arranged left to right, so a little-endian integer isn’t simply a big-endian integer written backwards bit by bit
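A minimal Python sketch of the recap above. The value 0x12345678 is an arbitrary example chosen for illustration; it shows that the two layouts differ by byte order only, not by bit order.

```python
value = 0x12345678  # arbitrary 32-bit example value (an assumption for illustration)

big = value.to_bytes(4, byteorder="big")
little = value.to_bytes(4, byteorder="little")

print(big.hex(" "))     # 12 34 56 78
print(little.hex(" "))  # 78 56 34 12

# Reversing the BYTES converts one layout into the other...
assert little == big[::-1]

# ...but reversing every BIT does not, because the bits inside each byte keep their order.
bits_reversed = int(format(value, "032b")[::-1], 2)
assert bits_reversed.to_bytes(4, "big") != little
```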
Byte ordering & data movement
• Computer networks are big endian:
– Little endian machines must convert integers (e.g. network device addresses) before they can be passed over the network
– Little endian machines must also convert integers retrieved from the network to the native mode for the machine
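The conversion described above can be sketched with Python's standard struct module, where the "!" prefix means network (big-endian) order. The address value is a made-up example, not taken from the slides.

```python
import struct

address = 0xC0A80001  # 192.168.0.1 as a 32-bit integer (example value only)

# Sending side: pack the integer in network (big-endian) order.
wire = struct.pack("!I", address)
assert wire == bytes([192, 168, 0, 1])

# What a little-endian host holds in memory for the same integer.
native_le = struct.pack("<I", address)
assert native_le == bytes([1, 0, 168, 192])

# Receiving side: convert from network order back to an integer.
(received,) = struct.unpack("!I", wire)
assert received == address
```

On a big-endian host the same calls still work; the conversion simply leaves the bytes unchanged.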
Byte ordering & data movement
• Any program that reads/writes file data must be aware of byte ordering
– For example, Windows BMPs were developed on a little endian machine; an application on a big endian machine that reads a BMP must reverse byte order
– PhotoShop, JPEG, MacPaint, Sun raster files: big endian
– GIF, PC Paintbrush, RTF: little endian
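As a rough illustration of the BMP case, the sketch below decodes a hand-built 14-byte BMP file header (signature, file size, two reserved fields, pixel-data offset). The header bytes are synthetic, made up for the example, not read from a real file.

```python
import struct

# Synthetic 14-byte BMP file header built by hand for this example:
# signature "BM", file size 70, two reserved 16-bit fields, pixel-data offset 54.
header = b"BM" + bytes([70, 0, 0, 0]) + bytes(4) + bytes([54, 0, 0, 0])

# "<" forces little-endian decoding regardless of the host CPU.
signature, file_size, _res1, _res2, data_offset = struct.unpack("<2sIHHI", header)
assert (signature, file_size, data_offset) == (b"BM", 70, 54)

# Decoding the same size field as big-endian gives a wrong value.
(wrong_size,) = struct.unpack(">I", header[2:6])
print(file_size, wrong_size)  # 70 1174405120
```

Stating the byte order explicitly in the format string is what lets the same reader run correctly on either kind of machine.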
Internal CPU storage
• 3 choices for data storage in CPU:
– Stack architecture:
• Use stack to execute instructions; operands stored at top of stack
• No random access
– Accumulator architecture:
• Minimum of internal complexity; short instructions
• One (implicit) operand stored in accumulator
• Involves high volume of memory traffic
– General Purpose register: see next slide
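A toy sketch of the first two choices, computing Z = X + Y on each. The mnemonics (PUSH, POP, LOAD, ADD, STORE) and the function names are hypothetical, chosen for illustration rather than taken from any real instruction set.

```python
memory = {"X": 3, "Y": 4, "Z": 0}  # toy main memory; names are illustrative

def run_stack(program, mem):
    """Stack architecture: ADD takes no operands, it uses the top of the stack."""
    stack = []
    for op, *arg in program:
        if op == "PUSH":
            stack.append(mem[arg[0]])
        elif op == "ADD":
            stack.append(stack.pop() + stack.pop())
        elif op == "POP":
            mem[arg[0]] = stack.pop()

def run_accumulator(program, mem):
    """Accumulator architecture: one operand is always the implicit accumulator."""
    acc = 0
    for op, name in program:
        if op == "LOAD":
            acc = mem[name]
        elif op == "ADD":
            acc += mem[name]   # every ADD touches memory
        elif op == "STORE":
            mem[name] = acc

stack_mem = dict(memory)
run_stack([("PUSH", "X"), ("PUSH", "Y"), ("ADD",), ("POP", "Z")], stack_mem)

acc_mem = dict(memory)
run_accumulator([("LOAD", "X"), ("ADD", "Y"), ("STORE", "Z")], acc_mem)

assert stack_mem["Z"] == acc_mem["Z"] == 7
```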
General Purpose Register (GPR)
• Set (>1) of GPRs
• Most common architecture in use today
• Registers are faster than memory; easier for other parts of the CPU to handle register data (than data from memory)
• Cheaper hardware tends to mean an increased number of registers in the CPU
• GPRs mean longer instructions, because register(s) must be specified; takes more time to fetch/decode longer instructions
Classification of GPR architectures
• Memory to memory (VAX):
– Instruction uses 2-3 operands, stored in memory
– Instructions can perform operations without involving registers
• Register to memory (Intel, Motorola): at least one operand must be in a register
• Load-store (SPARC, MIPS, Alpha, PowerPC): Requires movement of data to registers before any operations performed
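The three classes can be contrasted by writing Z = X + Y in each style. This is only a sketch: the mnemonics in the comments are generic, not real VAX, Intel, or MIPS syntax, and the returned instruction counts apply to this toy example only.

```python
def memory_to_memory(mem):
    mem["Z"] = mem["X"] + mem["Y"]         # ADD   Z, X, Y   (all operands in memory)
    return 1                               # instructions executed

def register_to_memory(mem):
    regs = {}
    regs["R1"] = mem["X"]                  # LOAD  R1, X
    regs["R1"] = regs["R1"] + mem["Y"]     # ADD   R1, Y     (one operand in a register)
    mem["Z"] = regs["R1"]                  # STORE R1, Z
    return 3

def load_store(mem):
    regs = {}
    regs["R1"] = mem["X"]                  # LOAD  R1, X
    regs["R2"] = mem["Y"]                  # LOAD  R2, Y
    regs["R3"] = regs["R1"] + regs["R2"]   # ADD   R3, R1, R2 (registers only)
    mem["Z"] = regs["R3"]                  # STORE R3, Z
    return 4

counts = {}
for style in (memory_to_memory, register_to_memory, load_store):
    mem = {"X": 3, "Y": 4, "Z": 0}
    counts[style.__name__] = style(mem)
    assert mem["Z"] == 7

print(counts)
```

Fewer instructions does not imply faster execution; the load-store form keeps each instruction simple and confines memory access to loads and stores.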
Operand number / instruction length
• Instructions can be formatted 2 ways:
– Fixed-length: fast, but wastes space
– Variable-length: more complex to decode, but saves space
• Real-life compromise often involves 2-3 instruction lengths (a small set of fixed sizes rather than fully variable)
Some historical architectures
• VAX: Digital’s line of midsize computers, dominant in academia in the 70s and 80s
• Characteristics:
– Variable-length instructions; anywhere from 2 to 5 operands
– Full set of addressing modes: operands can be anywhere; single instruction could take up to 31 bytes
– “High level” instructions: complexity built into instruction set to make programmers’ task easier
– Extensive set of data types at machine level
Some historical architectures
• Motorola’s 68000 series
– Initial Apple Macintosh, early Sun workstations
– Variable-length instructions: 0-2 operands
– Wide variety of addressing modes (but not as many as VAX)
– Could not start an instruction until previous one was completed
Intel architectures
• 8086 chip: first produced in 1978
– Handled 16-bit data, 20-bit addresses
– Could address 1 million bytes of memory (2^20 bytes = 1 MB)
– CPU split into 2 parts:
• Execution unit: contained GPRs & ALU
• Bus interface unit: included instruction queue, segment registers, instruction pointer (SR & IP are special-purpose registers)
8086 GPRs
• AX: accumulator
• BX: base register: could be used to extend addressing
• CX: count register
• DX: data register
• Some 8086 instructions require use of specific GPR, but in general, could use any of these to hold data
Byte-level addressing
• Each GPR addressable at word or byte level
• For example, AX divided into:
– AH (contains the most significant byte)
– AL (contains the least significant byte)
• Same for BX, CX, DX
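The word/byte split can be mimicked with shifts and masks. A small sketch, using an arbitrary example value for AX:

```python
ax = 0x12AB  # arbitrary 16-bit example value

ah = (ax >> 8) & 0xFF  # high byte of AX
al = ax & 0xFF         # low byte of AX
assert (ah, al) == (0x12, 0xAB)

# Writing AL changes only the low byte of AX; AH is untouched.
al = 0xFF
ax = (ah << 8) | al
assert ax == 0x12FF
```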
Other registers in 8086
• Pointer registers:
– SP: stack pointer: used as offset into stack
– BP: base pointer: used to reference parameters pushed on stack; indicates lowest value SP can reach
– IP: holds address of next instruction (like Pep/8’s PC)
• Index registers:
– SI: source index; used as source pointer for string operations
– DI: destination index; used as destination pointer for string operations
– Both SI & DI sometimes used to supplement GPRs
Other registers in 8086
• Status flags register: bits indicate CPU status & results (overflow, carry, negative, etc.)
• Segment registers
– 8086 assembly language programs divided into specialized blocks of code called segments
– Each segment holds specific types of information
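To make the status flags on this slide concrete, here is a sketch of how carry, zero, negative, and overflow bits could be derived for a 16-bit addition. The helper add16 and the dictionary layout are hypothetical; the real flags register packs these bits into a single word.

```python
def add16(a, b):
    """Add two 16-bit values; return (result, flags). Hypothetical helper."""
    total = a + b
    result = total & 0xFFFF            # keep only the low 16 bits
    flags = {
        "carry": total > 0xFFFF,       # unsigned result did not fit in 16 bits
        "zero": result == 0,
        "negative": bool(result & 0x8000),  # top bit set
        # signed overflow: both operands differ in sign from the result
        "overflow": bool((a ^ result) & (b ^ result) & 0x8000),
    }
    return result, flags

# 0x7FFF + 1 overflows the signed range but produces no carry.
r1, f1 = add16(0x7FFF, 0x0001)
assert r1 == 0x8000 and f1["overflow"] and f1["negative"] and not f1["carry"]

# 0xFFFF + 1 wraps to zero with a carry but no signed overflow.
r2, f2 = add16(0xFFFF, 0x0001)
assert r2 == 0 and f2["carry"] and f2["zero"] and not f2["overflow"]
```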
8086 Segments
• Code segment: program itself (instructions)
• Data segment: program data
• Stack segment: program’s runtime stack (for procedure calls)
8086 segments
• To access information in a segment, had to specify item’s offset from segment start
• CPU needed to store segment addresses; these were stored in segment registers:
– CS: code segment
– DS: data segment
– SS: stack segment
– ES: extra segment (used by some string operations to handle memory addressing)
• Addresses specified in segment/offset form: XXX:YYY, where XXX is the value stored in a segment register, and YYY is the offset from the start of the segment
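The slides do not spell out how a segment/offset pair becomes a 20-bit address. On the 8086 the 16-bit segment value is shifted left by 4 bits (multiplied by 16) and added to the 16-bit offset. A minimal sketch, with physical_address as a hypothetical helper name:

```python
def physical_address(segment, offset):
    """8086-style address: segment * 16 + offset, truncated to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF

assert physical_address(0x1234, 0x0010) == 0x12350

# Different segment:offset pairs can name the same byte.
assert physical_address(0x1235, 0x0000) == 0x12350

# With only 20 address lines, sums past 1 MB wrap around to 0.
assert physical_address(0xFFFF, 0x0010) == 0x00000
```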
Evolution of Intel platform
• Basic 8086 ISA used in many successor chips:
– 8087
• Introduced in 1980
• Added floating-point instructions, stack of 80-bit registers
– 80286
• Introduced 1982
• Could address up to 16 MB of memory
Evolution of Intel platform
• 80386
– Could address 4 GB of RAM
– 32-bit chip, with 32-bit bus, 32-bit word
– To achieve backward compatibility, Intel kept same basic architecture, register sets
– Used new naming convention in registers: EAX, EBX, etc. were 32-bit (extended) versions of AX, BX, etc.; could still access original 16-bit registers (and their byte components) using original names
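The memory limits quoted for the 8086 (1 MB), 80286 (16 MB), and 80386 (4 GB) follow from address width: n address bits reach 2^n bytes. A quick check, where address_bits is a hypothetical helper and the sizes are treated as binary megabytes and gigabytes:

```python
def address_bits(n_bytes):
    """Smallest number of address bits that can reach n_bytes distinct bytes."""
    return (n_bytes - 1).bit_length()

MB = 2**20
GB = 2**30

assert address_bits(1 * MB) == 20    # 8086
assert address_bits(16 * MB) == 24   # 80286
assert address_bits(4 * GB) == 32    # 80386
```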
Evolution of Intel platform
• 80486
– Added high-speed cache memory for performance improvement
– Integrated math co-processor
• Pentium™ series
– Intel quit using numbers: couldn’t trademark them
– 32-bit registers, 64-bit bus
– Employed superscalar design, with multiple ALUs; could run instructions in parallel, handling more than one instruction per clock cycle
Pentium™ series
• Pro added branch prediction
• II added MMX
• III added increased support for 3D graphics using floating-point instructions
• P4: 1.4 GHz and higher clock rates; 42 million transistors per CPU; 400MHz (and faster) system bus, refinements to cache & floating-point operations
Pentium™ series
• Itanium: Intel’s first 64-bit chip
– Employs hardware emulator to maintain backward compatibility with x86
– 4 integer ALUs, 4 floating-point ALUs, 4 cache levels, 128 registers each for integers and floating-point numbers
– Multiple miscellaneous registers for dealing with efficient instruction loading for branching
– Addresses up to 16 GB of RAM
CISC vs. RISC
• CISC: complex instruction set computing
– Employed by Intel up through Pentium Pro
– Pentium II and III used combined CISC/RISC: CISC architecture with RISC core that could translate CISC instructions to RISC
• RISC: reduced instruction set computing
• CISC emphasizes complexity in hardware, simplicity in software; RISC is opposite
• RISC is generally considered superior in performance