Computer Architecture
Part II-D: Survey of Processor Architecture
Microprocessors in the Market
What’s the difference?
Areas of Development
Below are technologies which can be improved in CPU design:
System bus speed
Internal and external clock frequency
Casing
Cooling system
Instruction set
Material used for the die
End result: enhance speed of the CPU and the system in general
The System Bus
Conduit for moving data between the processor and other system components
[Figure: the system bus links the CPU and its caches to memory and, through a bus, to I/O devices (controllers, adapters, disks, displays, keyboards, networks)]
System Bus Speeds
Intel Core 2 Quad/Duo: CPU clocks of 2.66/3 GHz with system bus speeds of 1066/1333 MHz
AMD: 2nd Generation Opteron (dual core) processor has clock speed of 1.8 GHz with a 1000 MHz system bus
Split Clock Frequency
Internal clock frequency: speed of data processing inside the CPU
External clock frequency: speed of data transfer to and from the CPU via the system bus
Intel 486DX2 25/50 was first to use clock doubling to implement split clock system
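The relationship between the two clocks can be sketched in a few lines of Python. This is an illustrative sketch, not something taken from the slides: the function names are hypothetical, and the only figures used are those of the 486DX2 25/50 mentioned above (25 MHz external, 50 MHz internal).

```python
def clock_multiplier(internal_mhz, external_mhz):
    """Ratio of the internal (core) clock to the external (bus) clock."""
    if external_mhz <= 0:
        raise ValueError("external clock must be positive")
    return internal_mhz / external_mhz


def internal_clock(external_mhz, multiplier):
    """Internal clock obtained by multiplying the external clock."""
    return external_mhz * multiplier


# 486DX2 25/50: the core runs at twice the bus clock ("clock doubling").
dx2_multiplier = clock_multiplier(50, 25)
dx2_core_mhz = internal_clock(25, dx2_multiplier)
```

A multiplier of 1 would describe a CPU without a split clock, where the core and the bus run at the same frequency.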
The GHz Race in CPU Frequency
June 1999: API (Alpha Processor Inc.) demonstrated a 1 GHz chip
March 2000: AMD released Athlon 1 GHz; within days Intel released 1 GHz Pentium III
2002: AMD and Intel use 0.13 micron technology
Athlon XP 2200+ (June)
Pentium 4 2.53 GHz (May), mobile Pentium 4 2 GHz (June)
2004:
Pentium 4: 3.6 GHz, 800 MHz system bus
AMD: 3200+, 2.2 GHz, 400 MHz (same as 2003)
Focus now shifting from 32-bit to 64-bit CPUs
2005:
Pentium 4: 3.73 – 3.8 GHz, 800/1066 MHz system bus
AMD: same as 2004
Is Moore’s Law Dead?
Intel’s vision of a 10 GHz CPU cannot be realized due to heat problems
Some have pushed speed limits through high-end cooling systems
Both Intel and AMD no longer concentrating on speed as performance driver
SIA says “Moore’s Law is still going strong after 40 years”
Micron Technology
A micron is 1 millionth of a meter
Human hair strand is about 100 microns
Objective: thinner wires
Allow CPU to operate at lower voltage
Results in CPU generating less heat and operating at higher speeds
Currently, processors are in the range of 0.065 microns (65 nm)
Intel’s Roadmap: 45 15 nm
Micron Technology Through the Years
Processor                        Year       Micron
4004 (first microprocessor)      1971       10
8080                             1974       6
8086                             1978       3
486 Intel                        1989       1
486 AMD                          1990       0.8
Pentium classic                  1993       0.8
IDT Winchip                      1997       0.35
Pentium MMX                      1997       0.25
AMD K6-II                        1997       0.25
PIII/Athlon/Itanium              2001       0.18
P4/Athlon XP                     2002       0.13
-                                2003       0.13/0.09
-                                2004 - 05  0.09
-                                2006 - 07  0.065
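The feature sizes listed above are easier to compare once converted to nanometers. The following is a minimal sketch with hypothetical function names. The area figure assumes ideal scaling, where area shrinks with the square of the linear feature size, which is a simplification and not a claim made by the slides.

```python
def microns_to_nm(microns):
    """Convert a feature size from microns to nanometers (1 micron = 1000 nm)."""
    return microns * 1000.0


def linear_shrink(old_microns, new_microns):
    """How many times smaller the new feature size is than the old one."""
    return old_microns / new_microns


def ideal_area_shrink(old_microns, new_microns):
    """Assumption: area scales with the square of the linear feature size."""
    return linear_shrink(old_microns, new_microns) ** 2


# 4004 (1971, 10 microns) versus the 2006 - 07 generation (0.065 microns).
shrink_1971_to_2006 = linear_shrink(10, 0.065)
```

Under these assumptions the linear shrink from the 4004 to the 0.065 micron generation is a factor of roughly 150.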
Thinner Wires = Increased Transistors
[Chart: number of transistors (in thousands, axis from 0 to 45,000) by year, 1984 to 2001]
8086/8088: 22,000
286: 128,000
386DX/386SX: 250,000
486SX/486DX, 486DX2/486DX4: 1,200,000
Pentium, Cyrix, AMD K5, MMX: 3,100,000
AMD K6: 8,800,000
Athlon 1.4 GHz: 37,000,000
Pentium 4: 42,000,000
The Switch to Copper
Aluminum limits making chips smaller
Copper is a good choice because it is a better conductor, consumes less energy, and takes up less space than aluminum
Copper allowed processors to boost speeds to the GHz range
IBM pioneered the use of copper on September 1, 1998 (IBM PowerPC 740/750)
PC on a Chip
Integrates a number of key components into one chip
Result: The chip replaces a dozen or so separate chips (memory, FPU, graphics, video, etc.)
Applications: PDAs, cellphones, set-top boxes, embedded processors, etc.
Impact of PC-on-a-Chip
Smaller and quieter desktops
Battery of devices lasts longer because of the low power drain
Proliferation of information appliances
CPU Receptacle
ZIF (Zero Insertion Force) socket: type of socket designed for easy insertion of chips that have a high density of pins
Socket 7: popular implementation of ZIF
CPU Receptacle
Slot 1:
Consists of receptacle on the motherboard that holds an Intel Single Edge Contact (SEC) cartridge
Cartridge may contain up to two CPUs and an L2 cache (runs at half the speed of CPU) and plugs into 242-pin receptacle
Started with Pentium II
CPU Receptacle
A Pentium II mounted on Slot 1
CPU Receptacle
Slot 2:
An enhanced Slot 1
Uses 330-pin SEC
Holds up to four CPUs
L2 cache runs at full processor speed
First used in Intel's Pentium II Xeon
CPU Receptacle
AMD’s Slot A:
Receptacle on motherboard for K7 CPU
Physically similar to Slot 1, but has different electrical requirements
Casing: FC-PGA (Flip-Chip)
[Figure: traditional wiring compared with flip-chip (IBM)]
Advantages of FC-PGA
Greater number of I/O pins available
Shorter electrical connections
Better manufacturing efficiency
Casing: FC-LGA
[Figure: bottom view of LGA/BGA-based CPU; LGA Socket 775]
Advantages of FC-LGA
Lower voltage used (less distance traveled, reduced signal loss)
Less heat dissipation
Cache
Works as buffer between CPU and memory
Two types: internal and external
Levels of Cache
Level 1 (L1)
Level 2 (L2)
Level 3 (L3)
Cache Placement
Intel used to have external L2 cache
Pentium Pro: L2 cache is internal, but CPU and L2 cache are separate
Result: larger chip that requires a larger socket
Overclocking
Going beyond recommended clock frequency settings
3 methods of overclocking:
System bus frequency
CPU frequency multiplier
Change both of the above
Some CPUs have locked frequencies
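The three methods can be illustrated with a small Python model in which the CPU clock is the system bus frequency times the multiplier. This is only a sketch: `ClockConfig`, the helper functions, and all the numbers (a 100 MHz bus with a 5x multiplier) are hypothetical values chosen for illustration and do not describe any particular CPU.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ClockConfig:
    bus_mhz: float      # system bus (external) frequency
    multiplier: float   # CPU frequency multiplier

    @property
    def cpu_mhz(self):
        # CPU clock = system bus frequency x multiplier
        return self.bus_mhz * self.multiplier


def raise_bus(cfg, new_bus_mhz):
    """Method 1: change only the system bus frequency."""
    return replace(cfg, bus_mhz=new_bus_mhz)


def raise_multiplier(cfg, new_multiplier):
    """Method 2: change only the CPU frequency multiplier."""
    return replace(cfg, multiplier=new_multiplier)


def raise_both(cfg, new_bus_mhz, new_multiplier):
    """Method 3: change both of the above."""
    return replace(cfg, bus_mhz=new_bus_mhz, multiplier=new_multiplier)


stock = ClockConfig(bus_mhz=100.0, multiplier=5.0)   # hypothetical stock settings
by_bus = raise_bus(stock, 112.0)
by_multiplier = raise_multiplier(stock, 5.5)
by_both = raise_both(stock, 112.0, 5.5)
```

Raising the bus frequency also speeds up everything else attached to the bus, which is one reason memory may struggle to keep up. On a CPU with a locked multiplier, only the first method would be available.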
Overclocking: How to...
Done through BIOS program
Older systems require motherboard jumpers
Some motherboards (e.g. ASUS TX97) contain jumper codes
Overclocking Issues
Heat!
Can main memory cope?
Will the software still work?
Cooling Systems
CPUs get hotter as they get faster
Developed to keep the CPU from overheating
Sophisticated cooling systems allow more reliable CPU operation
Liquid Nitrogen: Extremely Cool!
CPU: Pentium 4 (Northwood)
Date: Christmas 2003
The CPU Gets Watered Down
Multimedia Processing
Multimedia applications require geometric transformation
Re-computation of location and size of an image to determine new position
Deals with FP (floating-point) numbers
FPU handles all real number computations
Drawing landscapes (e.g. games) involves lots of computations and CPU may not handle it as fast as the player could react
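A geometric transformation of the kind described above can be sketched as a scale followed by a translation, all in floating point. The function name, the parameters, and the rectangle representation are assumptions made for this illustration only.

```python
def transform_rect(x, y, width, height, scale, dx, dy):
    """Scale an image rectangle about the origin, then move it by (dx, dy).

    Returns the new (x, y, width, height). Every value is a floating-point
    number, which is the kind of work the FPU handles.
    """
    new_x = x * scale + dx
    new_y = y * scale + dy
    return (new_x, new_y, width * scale, height * scale)


# Hypothetical image at (10, 20), 100 x 50, enlarged 1.5x and shifted.
moved = transform_rect(10.0, 20.0, 100.0, 50.0, scale=1.5, dx=5.0, dy=-5.0)
```

A game redrawing a landscape repeats computations like this for a very large number of points per frame, which is why floating-point throughput matters.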
Ways of Handling Multimedia
Speed up the CPU
Improve the CPU’s FPU by adding more pipelines
Use high-end 3D graphics cards
Add new multimedia instructions
Multimedia Innovations in CPUs
MMX
3DNow!
SSE
MMX
Introduced in 1997 in the Pentium MMX processor
Had 57 new instructions for 3D graphics
Introduced SIMD (Single Instruction Multiple Data) instructions: technique that processes more than one integer simultaneously
Problems:
Only works with integers
CPU can only work with either MMX or FPU, not both simultaneously, because they share registers
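The SIMD idea (one instruction operating on several integers at once) can be imitated in plain Python by packing four 16-bit integers into one 64-bit word, the width of an MMX register. This is a sketch of the concept only: Python runs the lanes one after another, and the names `pack`, `unpack`, and `packed_add` are hypothetical, not MMX instruction names.

```python
LANES = 4                       # four integers per register
LANE_BITS = 16                  # each integer is 16 bits wide
MASK = (1 << LANE_BITS) - 1


def pack(values):
    """Pack four 16-bit integers into a single 64-bit word."""
    if len(values) != LANES:
        raise ValueError("expected exactly four values")
    word = 0
    for i, value in enumerate(values):
        word |= (value & MASK) << (i * LANE_BITS)
    return word


def unpack(word):
    """Split a 64-bit word back into its four 16-bit lanes."""
    return [(word >> (i * LANE_BITS)) & MASK for i in range(LANES)]


def packed_add(a, b):
    """Add two packed words lane by lane; each lane wraps around on overflow."""
    result = 0
    for i in range(LANES):
        shift = i * LANE_BITS
        lane_sum = ((a >> shift) & MASK) + ((b >> shift) & MASK)
        result |= (lane_sum & MASK) << shift   # drop the carry so lanes stay independent
    return result


total = unpack(packed_add(pack([1, 2, 3, 4]), pack([10, 20, 30, 40])))
```

In hardware the four lane additions happen in a single instruction, which is where the speedup comes from.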
3DNow!
Introduced summer of 1998 in the AMD K6-2
Characteristics:
Supports SIMD instructions
Improved handling of numbers
Successful! Integrated in Windows, games, and drivers
Does not use the same registers
SSE
Introduced in Pentium III (Katmai) 500 MHz as Intel’s response to 3DNow!
Characteristics:
8 new 128-bit registers (each can hold four 32-bit numbers)
Has Streaming SIMD Extensions
50 new instructions enabling simultaneous advanced calculations of more FP numbers with a single instruction
New Media Instructions designed for coding and decoding MPEGs
Problems with SSE
Pipelines can only handle two 32-bit numbers at a time
To take advantage of 128-bit registers, FPU pipeline should have been doubled (would have pushed back release date of Katmai)
Potentially, it could have enhanced 3D graphics since registers can handle four 32-bit numbers at a time
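The mismatch between register width and pipeline width can be shown with a small Python model. A 128-bit register is represented as 16 bytes holding four 32-bit floats, and the addition is carried out in groups of `lanes_per_pass` numbers. The function names and the pass-counting model are assumptions for illustration, not a description of the actual Katmai hardware.

```python
import struct

REGISTER_FORMAT = "<4f"   # four 32-bit floats = 128 bits


def pack_register(values):
    """Store four numbers as 32-bit floats in a 16-byte (128-bit) register."""
    if len(values) != 4:
        raise ValueError("a 128-bit register holds exactly four 32-bit numbers")
    return struct.pack(REGISTER_FORMAT, *values)


def add_registers(reg_a, reg_b, lanes_per_pass=2):
    """Add two registers, handling only `lanes_per_pass` numbers per pass.

    Returns the result register and the number of passes that were needed.
    """
    a = struct.unpack(REGISTER_FORMAT, reg_a)
    b = struct.unpack(REGISTER_FORMAT, reg_b)
    sums = []
    passes = 0
    for start in range(0, 4, lanes_per_pass):
        end = start + lanes_per_pass
        sums.extend(x + y for x, y in zip(a[start:end], b[start:end]))
        passes += 1
    return struct.pack(REGISTER_FORMAT, *sums), passes


reg_a = pack_register([1.0, 2.0, 3.0, 4.0])
reg_b = pack_register([0.5, 0.5, 0.5, 0.5])
result, passes_narrow = add_registers(reg_a, reg_b, lanes_per_pass=2)
_, passes_wide = add_registers(reg_a, reg_b, lanes_per_pass=4)
```

With two numbers per pass the full register takes two passes. A pipeline wide enough for all four would need only one, which is the doubling the slide refers to.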
SSE Enhancements
SSE2:
Started in Pentium 4
Has 144 new instructions (since SSE)
Data width now 64 bits
SSE3:
13 additional SIMD instructions (since SSE2)
New instructions primarily designed to improve thread synchronization and specific application areas such as media and gaming
Supplemental SSE3 (Core 2)
SSE4
Other CPU Innovations
Data width
Internal: How many bits can the CPU process simultaneously?
External: How many bits can the CPU receive simultaneously for processing?
Superscalar architecture
Superpipelined architecture
Superscalar processing
Intel Corporation
Produced biggest impact on microprocessor technology
Main line of business is CPU but also has other hardware products (e.g. motherboards)
Short History of Intel
1968: Birth of Intel
Started in memory business
First product was 64-bit memory
1970s: Increase in market share
Early 1980s: Japanese manufacturers eat up memory market with 16 - 256 KB chips
1984: Business slowing down; “Get us out of memory!”
1986: Exited from memory due to success of 80386
Intel Processor Time Line
1971: 4004, Intel’s first microprocessor (108 kHz, 4-bit bus width)
1978: 8086, first 16-bit CPU from Intel
1979: 8088, reengineered CPU to fit existing 8-bit hardware
1982: 286, 16-bit processor with optimized instruction handling
1985: 386, first 32-bit CPU (32-bit system bus)
1988: 386SX, cheaper version of the 386DX
1989: 486, built-in math co-processor and L1 cache on-chip
Intel Processor Time Line
486SX: discount chip, no math co-processor
486DX4: triple the clock speed, from 25 MHz to 75 MHz and from 33 MHz to 100 MHz
1993: Pentium Classic, superscalar (5x 486DX-33 MHz); width of system bus: 64 bit; speed of system bus: 60 to 66 MHz; initially produced a lot of heat
Nov 1, 1995: Pentium Pro, RISC processor, 32-bit processing; L2 cache is built in
Jan 8, 1997: Pentium MMX, new set of instructions for multimedia; 32 KB L1 cache
May 7, 1997: Pentium II (Klamath), 512 KB L2; L1 cache of 32 KB
Intel Processor Time Line
Jan 26, 1998: Deschutes, 333 MHz; 0.25 micron technology
1Q 1998: Celeron (Covington), Pentium II without the L2 cache
1998: Celeron (Mendocino), 333 MHz; 128 KB L2 internal cache
July 26, 1998: Pentium II Xeon, 450 MHz; custom SRAM; different L2 caches: 512 KB, 1 MB, or 2 MB; can have 4 - 8 Xeons in one server
1999: Pentium III (Katmai), enhanced MMX2 graphics instructions
1999: Pentium III Xeon (Tanner)
2000: Pentium 4, 7th generation; 0.18 micron technology
2001: Itanium (formerly Merced), 64-bit CPU; 0.18 micron technology; > 25 million transistors
Core (2005)
Current Intel CPU Innovations
Hyperthreading
Multi-core
Core
Core 2 (64-bit architecture)
Intel’s First 64-Bit Chip (Server): Itanium
Was known as IA-64 (but IA-32 compatible)
EPIC (Explicitly Parallel Instruction Computing) processor
Enables up to 20 operations/clock cycle
Employs branch prediction and speculation
Three levels of cache: 2 MB / 4 MB L3 cache, 96K L2 cache, and 32K L1 cache
128 integer registers, 128 FP registers
Itanium 2
Available from 1 - 1.66 GHz
Internal L3 cache (1.5 MB, 3 MB, 4 MB, 6 MB, or 9 MB)
System bus: 400/533/667 MHz, 128 bits wide
0.13 microns, 592 million transistors
Next version (“Montecito”) has 1.72 billion transistors, 26 MB on-die cache, 90 nm
Current Intel CPU Lineup
Mobile: Centrino (Core and Core 2)
Desktop: Core 2 Extreme, Core 2 (now used in Apple Mac Mini)
Servers and workstations: Xeon (now used in Apple Mac Pro), Itanium 2
AMD (Advanced Micro Devices)
Incorporated in May 1969
Challenging Intel even before Pentium-class processors
Offered their own technology and cannot be considered as producing clones
Achieved increased market sales starting with K6 and K6-II
AMD Series (From Pentium Class)
K5:
Similar to the classic Pentiums
16 KB L1 cache and no MMX
Not very impressive but much cheaper than similar Pentium models
K6:
Technology brought in from NexGen; put AMD back in business
32 KB L1 cache & MMX
Pentium compatible but performed better than Pentium MMX
AMD Series (From Pentium Class)
K6-II: Chomper
0.25 micron, system bus speed of 100 MHz
Introduced 3DNow!
Also MMX-compatible; really challenged the Pentium II and led to low-cost Celeron
K6-III: Sharptooth
Three levels of cache: L1 and L2 are in CPU; L3 is on motherboard up to 1 MB; 133 MHz system bus
Was not as successful as the K6-II
K7: Athlon
Last of AMD’s 32-Bit Processors
Athlon XP:
Intel played catch-up to the Athlon XP on many occasions, but now stagnant in 32-bit computing
Model 3200+ has a 2.2 GHz CPU, 3 FP pipelines, 128 KB of L1 cache, 512 KB L2 cache, system bus speed of 400 MHz, and 0.13 micron technology
Sempron:
Counterpart of Intel Celeron
Model 3300+ has 2 GHz CPU, 754 pins, 90 nm technology, 128 KB L2 cache
AMD’s 64-Bit Chips
Varieties:
Athlon 64 (desktop)
Turion 64 (mobile)
Opteron (servers or workstations)
Provides seamless transition to 64-bit
System bus runs at processor speed through on-chip memory controller
Leads the Itanium 2 on many benchmarks
AMD formed a partnership with Sun
Current AMD 64-Bit CPU Innovations
HyperTransport
Dual core
Direct Connect Architecture
Transmeta’s Crusoe Processor
Transmeta, whose notable figures include David Ditzel (co-founder), Linus Torvalds (employee), and Paul Allen (investor), released “Crusoe” in January 2000
Architectural achievements:
Only 25% the number of transistors compared to current Pentiums
Needs only 1 or 2 watts of power for 400 MHz or 700 MHz chips running at full speed
Much less heat dissipated but can compete with same category Intel and AMD chips
How Crusoe Pulled It Off
Efficient instruction set bears no resemblance to x86
Takes advantage of latest and best in hardware design
Software layer (code morphing software) in flash ROM translates x86 commands
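The idea of a software layer that translates x86 commands into a different native instruction set can be sketched as a lookup-based translator. Everything in this example is hypothetical: the native operation names, the translation table, and the `CodeMorpher` class are invented for illustration and do not describe Transmeta's actual software. Keeping already-translated blocks in a cache is an assumption of the sketch as well.

```python
# Hypothetical mapping from x86 mnemonics to made-up native operations.
TRANSLATION_TABLE = {
    "mov": ["native_copy"],
    "add": ["native_add"],
    "push": ["native_dec_sp", "native_store"],
}


class CodeMorpher:
    """Toy translator: x86 instruction blocks in, native operation lists out."""

    def __init__(self):
        self.cache = {}          # translated blocks, keyed by the x86 block
        self.translations = 0    # how many blocks were actually translated

    def translate(self, x86_block):
        key = tuple(x86_block)
        if key not in self.cache:
            native_ops = []
            for instruction in x86_block:
                mnemonic = instruction.split()[0]
                native_ops.extend(TRANSLATION_TABLE[mnemonic])
            self.cache[key] = native_ops
            self.translations += 1
        return self.cache[key]


morpher = CodeMorpher()
block = ["mov eax, 1", "add eax, 2", "push eax"]
first = morpher.translate(block)
second = morpher.translate(block)   # served from the cache, not re-translated
```

Because the hardware only ever sees the native operations, the underlying instruction set is free to bear no resemblance to x86, as the slide notes.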
Current Transmeta Processors
Crusoe TM5900:
667 MHz – 1 GHz CPU speed
128 KB L1, 512 KB L2
133 MHz system bus
0.13 microns
Efficeon TM8800:
Up to 1.7 GHz
128 KB L1 instruction cache
64 KB L1 data cache, 1 MB L2
400 MHz system bus
The PowerPC Microprocessor
Originally designed by Apple, IBM, and Motorola
Based on IBM POWER architecture used in IBM RS/6000 (RISC based)
Provides seamless transition to 64-bit
The PowerPC G5 is used in Apple iMac G5
2.7 GHz CPU speed, 1.35 GHz system bus, 512 KB on-chip L2 cache
Sun UltraSparc IV+
2nd generation dual core processor design (1368 pins FC-LGA)
64-bit CPU, 90 nm, 295 million transistors
CPU speeds of 1.95 / 2.1 GHz
2 MB L2 cache, 32 MB off-chip
On-chip memory controller
CMT (Chip Multi-Threading) with 2 threads per processor
14-stage non-stalling pipeline
4-way superscalar
Runs Solaris, Linux, FreeBSD, and other UNIX versions
Sun UltraSparc T1
Available in 4-, 6- or 8-core
64 bits, 90 nm
4-way multithreaded core
14-stage non-stalling pipeline
4 integrated memory controllers
16 KB instruction, 8 KB data L1 cache per core, 3 MB unified L2 cache
Available in 1 and 1.2 GHz
Low power (72 – 79 watts)
Multiprocessor Systems
Combines two or more CPUs of the same brand and model
Allows systems to scale up
Forms an N-way system
Future Trends
In Dec. 1997, the Semiconductor Industry Association (SIA) provided details about future requirements of microprocessors.
Attempts to continue the pace predicted by Moore’s Law
1999 SIA Roadmap for Microprocessors
                                     1999    2000    2001    2002    2005    2008
MPU gate length (microns)            0.14    0.12    0.10    0.09    0.065   0.045
Transistors per sq. cm (millions)    6.6     9.4     13      18      44      109
Die size (sq. mm)                    340     340     340     340     408     468
MHz                                  1250    1486    1767    2100    3500    6000
Packaging (pins/balls)               740     821     912     1012    1384    1893
Wafer size (mm)                      200     200     300     300     300     300
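The roadmap figures can be checked against the pace of Moore's Law with a short calculation. This is a rough sketch: the function names are hypothetical, the doubling time assumes steady exponential growth between the two endpoints, and the per-die estimate assumes the whole die is populated at the stated density.

```python
import math

# Values copied from the roadmap table above.
YEARS = [1999, 2000, 2001, 2002, 2005, 2008]
DENSITY_MILLIONS_PER_CM2 = [6.6, 9.4, 13, 18, 44, 109]
DIE_SIZE_MM2 = [340, 340, 340, 340, 408, 468]


def doubling_time_years(year0, value0, year1, value1):
    """Years needed to double, assuming steady exponential growth."""
    return (year1 - year0) * math.log(2) / math.log(value1 / value0)


def transistors_per_die_millions(density_millions_per_cm2, die_mm2):
    """Density times die area; 1 sq. cm = 100 sq. mm."""
    return density_millions_per_cm2 * die_mm2 / 100.0


density_doubling = doubling_time_years(
    YEARS[0], DENSITY_MILLIONS_PER_CM2[0], YEARS[-1], DENSITY_MILLIONS_PER_CM2[-1]
)
die_1999 = transistors_per_die_millions(DENSITY_MILLIONS_PER_CM2[0], DIE_SIZE_MM2[0])
```

Under these assumptions the projected transistor density doubles a little more often than every two and a half years, and the 1999 column works out to roughly 22 million transistors per die.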
International Technology Roadmap for Semiconductors