Computer Architecture Part II-D: Survey of Processor Architecture

Page 1: Computer Architecture Part II-D: Survey of Processor Architecture

Computer Architecture

Part II-D: Survey of Processor Architecture

Page 2: Computer Architecture Part II-D: Survey of Processor Architecture

Microprocessors in the Market

What’s the difference?

Page 3: Computer Architecture Part II-D: Survey of Processor Architecture

Areas of Development

Below are technologies which can be improved in CPU design:
  System bus speed
  Internal and external clock frequency
  Casing
  Cooling system
  Instruction set
  Material used for the die

End result: enhance speed of the CPU and the system in general

Page 4: Computer Architecture Part II-D: Survey of Processor Architecture

The System Bus

Conduit for moving data between the processor and other system components

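[Figure: block diagram of the CPU and caches connected through the system bus to memory, with a separate bus linking I/O devices (controllers and adapters for disks, displays, keyboards, and networks)]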

Page 5: Computer Architecture Part II-D: Survey of Processor Architecture

System Bus Speeds

Intel Core 2 Quad/Duo processors have CPU clocks of 2.66/3 GHz with system bus speeds of 1066/1333 MHz

AMD: the 2nd Generation Opteron (dual core) processor has a clock speed of 1.8 GHz with a 1000 MHz system bus

Page 6: Computer Architecture Part II-D: Survey of Processor Architecture

Split Clock Frequency

Internal clock frequency
  Speed of data processing inside the CPU
External clock frequency
  Speed of data transfer to and from the CPU via the system bus
Intel 486DX2 25/50 was first to use clock doubling to implement a split clock system
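The relationship between the two clocks can be sketched in a few lines of Python. This is only an illustration: the function name `internal_clock_mhz` is made up here, and the figures are the 486DX2 25/50 and the 486DX4 25/75 mentioned in these slides.

```python
def internal_clock_mhz(external_mhz: float, multiplier: float) -> float:
    """Internal (core) clock = external (bus) clock x clock multiplier."""
    return external_mhz * multiplier


# 486DX2 25/50: the bus runs at 25 MHz, the core is clock-doubled to 50 MHz.
dx2 = internal_clock_mhz(25, 2)

# 486DX4: clock tripling, 25 MHz bus to 75 MHz core.
dx4 = internal_clock_mhz(25, 3)

print(f"486DX2: {dx2:.0f} MHz internal, 486DX4: {dx4:.0f} MHz internal")
```

The point of the split is that the core can run faster while the external bus keeps the slower speed that the rest of the system is built for.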

Page 7: Computer Architecture Part II-D: Survey of Processor Architecture

The GHz Race in CPU Frequency

June 1999: API (Alpha Processor Inc.) demonstrated a 1 GHz chip
March 2000: AMD released Athlon 1 GHz; within days Intel released 1 GHz Pentium III
2002: AMD and Intel use 0.13 micron technology
  Athlon XP 2200+ (June)
  Pentium 4 2.53 GHz (May), mobile Pentium 4 2 GHz (June)
2004
  Pentium 4: 3.6 GHz, 800 MHz system bus
  AMD: 3200+, 2.2 GHz, 400 MHz (same as 2003 for 32-bit CPUs; now concentrating on 64-bit)
2005
  Pentium 4: 3.73 – 3.8 GHz, 800/1066 MHz system bus
  AMD: same as 2004

Page 8: Computer Architecture Part II-D: Survey of Processor Architecture

Is Moore’s Law Dead?

Intel’s vision of a 10 GHz CPU cannot be realized due to heat problems

Some have pushed speed limits through high-end cooling systems

Both Intel and AMD are no longer concentrating on clock speed as the performance driver

SIA says “Moore’s Law is still going strong after 40 years”

Page 9: Computer Architecture Part II-D: Survey of Processor Architecture

Micron Technology

A micron is 1 millionth of a meter
  A human hair strand is about 100 microns
Objective: thinner wires
  Allow the CPU to operate at lower voltage
  Results in the CPU generating less heat and operating at higher speeds
Currently, processors are in the range of 0.065 microns (65 nm)
Intel’s Roadmap: 45 → 15 nm

Page 10: Computer Architecture Part II-D: Survey of Processor Architecture

Micron Technology Through the Years

Processor                      Year       Micron
4004 (first microprocessor)    1971       10
8080                           1974       6
8086                           1978       3
486 Intel                      1989       1
486 AMD                        1990       0.8
Pentium classic                1993       0.8
IDT Winchip                    1997       0.35
Pentium MMX                    1997       0.25
AMD K6-II                      1997       0.25
PIII/Athlon/Itanium            2001       0.18
P4/Athlon XP                   2002       0.13
-                              2003       0.13/0.09
-                              2004 - 05  0.09
-                              2006 - 07  0.065
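As a rough aid for reading the table, the sketch below converts a few of its entries from microns to nanometers and computes how far the feature size shrank between the 4004 and the 2006 - 07 generation. The names `microns_to_nm` and `FEATURE_SIZE_MICRONS` are illustrative only; the values are copied from the table.

```python
def microns_to_nm(microns: float) -> float:
    """1 micron = 1,000 nanometers."""
    return microns * 1000.0


# A few rows from the table above (process size in microns).
FEATURE_SIZE_MICRONS = {
    "4004 (1971)": 10,
    "8086 (1978)": 3,
    "Pentium classic (1993)": 0.8,
    "P4/Athlon XP (2002)": 0.13,
    "2006 - 07": 0.065,
}

for name, microns in FEATURE_SIZE_MICRONS.items():
    print(f"{name}: {microns} microns = {microns_to_nm(microns):.0f} nm")

# Linear shrink from the first microprocessor to the 2006 - 07 generation.
shrink = FEATURE_SIZE_MICRONS["4004 (1971)"] / FEATURE_SIZE_MICRONS["2006 - 07"]
print(f"Linear shrink factor: about {shrink:.0f}x")
```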

Page 11: Computer Architecture Part II-D: Survey of Processor Architecture

Thinner Wires = Increased Transistors

[Chart: transistor count in thousands (axis 0 to 45,000) by year (1984, 1987, 1990, 1993, 1997, 1999, 2001)]

8086/8088: 22,000
286: 128,000
386DX/386SX: 250,000
486SX/486DX, 486DX2/486DX4: 1,200,000
Pentium, Cyrix, AMD K5, MMX: 3,100,000
AMD K6: 8,800,000
Athlon 1.4 GHz: 37,000,000
Pentium 4: 42,000,000
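To put the chart's growth in perspective, the following sketch computes the overall growth factor and the number of doublings between the first and last data labels. It uses only the transistor counts shown on the chart; the helper name `doublings` is an assumption for illustration.

```python
import math


def doublings(start_count: float, end_count: float) -> float:
    """Number of times the count must double to grow from start to end."""
    return math.log2(end_count / start_count)


first = 22_000        # 8086/8088, as labelled on the chart
last = 42_000_000     # Pentium 4, as labelled on the chart

growth = last / first
print(f"Growth factor: about {growth:.0f}x")
print(f"Doublings: about {doublings(first, last):.1f}")
```

Roughly eleven doublings separate the two ends of the chart, which is the kind of compounding that Moore's Law describes.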

Page 12: Computer Architecture Part II-D: Survey of Processor Architecture

The Switch to Copper

Aluminum limits how small chips can be made
Copper is a good choice because it is a better conductor, consumes less energy, and takes up less space than aluminum
Copper allowed processors to boost speeds to the GHz range
IBM pioneered the use of copper on September 1, 1998 (IBM PowerPC 740/750)

Page 13: Computer Architecture Part II-D: Survey of Processor Architecture

PC on a Chip

Integrates a number of key components into one chip

Result: The chip replaces a dozen or so separate chips (memory, FPU, graphics, video, etc.)

Applications: PDAs, cellphones, set-top boxes, embedded processors, etc.

Page 14: Computer Architecture Part II-D: Survey of Processor Architecture

Impact of PC-on-a-Chip

Smaller and quieter desktops
Batteries of devices last longer because of the low power drain
Proliferation of information appliances

Page 15: Computer Architecture Part II-D: Survey of Processor Architecture

CPU Receptacle

ZIF (Zero Insertion Force) socket: type of socket designed for easy insertion of chips that have a high density of pins

Socket 7: popular implementation of ZIF

Page 16: Computer Architecture Part II-D: Survey of Processor Architecture

CPU Receptacle

Slot 1
  Consists of a receptacle on the motherboard that holds an Intel Single Edge Contact (SEC) cartridge
  Cartridge may contain up to two CPUs and an L2 cache (runs at half the speed of the CPU) and plugs into a 242-pin receptacle
  Started with Pentium II

Page 17: Computer Architecture Part II-D: Survey of Processor Architecture

CPU Receptacle

A Pentium II mounted on Slot 1

Page 18: Computer Architecture Part II-D: Survey of Processor Architecture

CPU Receptacle

Slot 2
  An enhanced Slot 1
  Uses 330-pin SEC
  Holds up to four CPUs
  L2 cache runs at full processor speed
  First used in Intel's Pentium II Xeon

Page 19: Computer Architecture Part II-D: Survey of Processor Architecture

CPU Receptacle

AMD’s Slot A
  Receptacle on motherboard for K7 CPU
  Physically similar to Slot 1, but has different electrical requirements

Page 20: Computer Architecture Part II-D: Survey of Processor Architecture

Casing: FC-PGA (Flip-Chip)

[Figure: traditional wiring compared with flip-chip (IBM)]

Page 21: Computer Architecture Part II-D: Survey of Processor Architecture

Advantages of FC-PGA

Greater number of I/O pins available
Shorter electrical connections
Better manufacturing efficiency

Page 22: Computer Architecture Part II-D: Survey of Processor Architecture

Casing: FC-LGA

[Figures: bottom view of an LGA/BGA-based CPU; LGA Socket 775]

Page 23: Computer Architecture Part II-D: Survey of Processor Architecture

Advantages of FC-LGA

Lower voltage used (less distance traveled, reduced signal loss)

Less heat dissipation

Page 24: Computer Architecture Part II-D: Survey of Processor Architecture

Cache

Works as a buffer between CPU and memory
Two types:
  Internal
  External
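The buffering idea can be illustrated with a toy model. The sketch below is not how any particular CPU implements its cache; it assumes a tiny direct-mapped cache (the class name `TinyCache` and its parameters are invented here) sitting in front of a list that stands in for main memory, and simply counts hits and misses.

```python
class TinyCache:
    """Toy direct-mapped cache in front of a slower 'main memory' list."""

    def __init__(self, memory, num_lines=4):
        self.memory = memory          # stands in for main memory
        self.num_lines = num_lines    # number of cache lines (assumed value)
        self.lines = {}               # line index -> (address, value)
        self.hits = 0
        self.misses = 0

    def read(self, address):
        index = address % self.num_lines
        entry = self.lines.get(index)
        if entry is not None and entry[0] == address:
            self.hits += 1            # served from the cache
            return entry[1]
        self.misses += 1              # must go out to main memory
        value = self.memory[address]
        self.lines[index] = (address, value)  # keep a copy for next time
        return value


memory = list(range(100, 116))
cache = TinyCache(memory)
for address in [0, 1, 0, 1, 4, 0]:
    cache.read(address)
print(f"hits={cache.hits}, misses={cache.misses}")
```

Repeated reads of addresses 0 and 1 are served from the cache; address 4 maps to the same line as address 0 and evicts it, so the final read of 0 misses again.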

Page 25: Computer Architecture Part II-D: Survey of Processor Architecture

Levels of Cache

Level 1 (L1)
Level 2 (L2)
Level 3 (L3)

Page 26: Computer Architecture Part II-D: Survey of Processor Architecture

Cache Placement

Intel used to have external L2 cache
Pentium Pro
  Internal, but CPU and L2 cache are separate
  Result: larger chip that requires a larger socket

Page 27: Computer Architecture Part II-D: Survey of Processor Architecture

Overclocking

Going beyond recommended clock frequency settings

3 methods of overclocking
  System bus frequency
  CPU frequency multiplier
  Change both of the above

Some CPUs have locked frequencies
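A small sketch of how the three methods affect the resulting CPU clock, assuming the usual relation CPU clock = system bus frequency x multiplier. The 100 MHz bus, 5.0 multiplier, and the overclocked settings are hypothetical values chosen for illustration, not figures from these slides.

```python
def cpu_clock_mhz(bus_mhz: float, multiplier: float) -> float:
    """CPU clock as system bus frequency times the CPU frequency multiplier."""
    return bus_mhz * multiplier


# Hypothetical stock settings: 100 MHz bus, 5.0x multiplier.
baseline = cpu_clock_mhz(100, 5.0)

# The three overclocking methods, with hypothetical new settings.
methods = {
    "raise system bus frequency": cpu_clock_mhz(112, 5.0),
    "raise CPU frequency multiplier": cpu_clock_mhz(100, 5.5),
    "change both": cpu_clock_mhz(112, 5.5),
}

for name, mhz in methods.items():
    gain = (mhz / baseline - 1) * 100
    print(f"{name}: {mhz:.0f} MHz ({gain:+.1f}% over {baseline:.0f} MHz)")
```

Raising the bus frequency also speeds up everything attached to the bus, which is one reason memory may not cope; a locked CPU rules out the multiplier method.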

Page 28: Computer Architecture Part II-D: Survey of Processor Architecture

Overclocking: How to...

Done through BIOS program

Older systems require motherboard jumpers

Some motherboards (e.g. ASUS TX97) contain jumper codes

Page 29: Computer Architecture Part II-D: Survey of Processor Architecture

Overclocking Issues

Heat!
Can main memory cope?
Will the software still work?

Page 30: Computer Architecture Part II-D: Survey of Processor Architecture

Cooling Systems

CPUs get hotter as they get faster

Developed to keep the CPU from overheating

Sophisticated cooling systems allow more reliable CPU operation

Page 31: Computer Architecture Part II-D: Survey of Processor Architecture

Liquid Nitrogen: Extremely Cool!

CPU: Pentium 4 (Northwood)
Date: Christmas 2003

Page 32: Computer Architecture Part II-D: Survey of Processor Architecture

The CPU Gets Watered Down

Page 33: Computer Architecture Part II-D: Survey of Processor Architecture

Multimedia Processing

Multimedia applications require geometric transformation
  Re-computation of location and size of an image to determine its new position
  Deals with floating-point (FP) numbers
FPU handles all real number computations
Drawing landscapes (e.g. games) involves lots of computations, and the CPU may not handle them as fast as the player could react
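A minimal sketch of the kind of geometric transformation described above: rescaling and moving the corner points of an image, which is all floating-point arithmetic. The function name `transform` and the sample coordinates are illustrative assumptions.

```python
def transform(points, scale, dx, dy):
    """Scale each (x, y) point about the origin, then translate by (dx, dy)."""
    return [(x * scale + dx, y * scale + dy) for x, y in points]


# Corners of a hypothetical 4 x 2 image, doubled in size and shifted.
corners = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]
moved = transform(corners, scale=2.0, dx=0.5, dy=-1.0)
print(moved)
```

A 3D game repeats this sort of calculation for a very large number of points in every frame, which is why FP throughput matters so much for multimedia.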

Page 34: Computer Architecture Part II-D: Survey of Processor Architecture

Ways of Handling Multimedia

Speed up the CPU
Improve the CPU’s FPU by adding more pipelines
Use high-end 3D graphics cards
Add new multimedia instructions

Page 35: Computer Architecture Part II-D: Survey of Processor Architecture

Multimedia Innovations in CPUs

MMX
3DNow!
SSE

Page 36: Computer Architecture Part II-D: Survey of Processor Architecture

MMX

Introduced in 1997 in the Pentium MMX processor
Had 57 new instructions for multimedia
Introduced SIMD (Single Instruction Multiple Data) instructions: technique that processes more than one integer simultaneously
Problems:
  Only works with integers
  CPU can only work with either MMX or FPU, not both simultaneously, because they share registers
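The SIMD idea can be mimicked in plain Python by packing several small integers into one wide word and adding them lane by lane. The sketch assumes four 16-bit lanes in a 64-bit word with wraparound on overflow, which is one of the packed formats MMX supports; the function names are invented for illustration and nothing here is actual MMX code.

```python
LANES = 4                      # four integers per 64-bit word
LANE_BITS = 16                 # each lane is 16 bits wide
MASK = (1 << LANE_BITS) - 1


def pack(values):
    """Pack four 16-bit integers into one 64-bit word (lane 0 is lowest)."""
    word = 0
    for i, value in enumerate(values):
        word |= (value & MASK) << (i * LANE_BITS)
    return word


def unpack(word):
    """Split a 64-bit word back into its four 16-bit lanes."""
    return [(word >> (i * LANE_BITS)) & MASK for i in range(LANES)]


def packed_add(a, b):
    """Add corresponding lanes independently, wrapping on overflow."""
    result = 0
    for i in range(LANES):
        shift = i * LANE_BITS
        lane_sum = (((a >> shift) & MASK) + ((b >> shift) & MASK)) & MASK
        result |= lane_sum << shift
    return result


total = packed_add(pack([1, 2, 3, 65535]), pack([10, 20, 30, 1]))
print(unpack(total))
```

In hardware the four lane additions happen in a single instruction; the Python loop only imitates the result, including the last lane wrapping from 65535 + 1 to 0 without disturbing its neighbours.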

Page 37: Computer Architecture Part II-D: Survey of Processor Architecture

3DNow!

Introduced summer of 1998 in the AMD K6-2
Characteristics
  Supports SIMD instructions
  Improved handling of numbers
Successful!
  Integrated in Windows, games, and drivers
  Does not use the same registers

Page 38: Computer Architecture Part II-D: Survey of Processor Architecture

SSE

Introduced in Pentium III (Katmai) 500 MHz as Intel’s response to 3DNow!
Characteristics
  8 new 128-bit registers (can hold four 32-bit numbers)
  Has Streaming SIMD Extensions
  50 new instructions enabling simultaneous advanced calculations on multiple FP values with a single instruction
  New Media Instructions designed for coding and decoding MPEGs

Page 39: Computer Architecture Part II-D: Survey of Processor Architecture

Problems with SSE

Pipelines can only handle two 32-bit numbers at a time

To take advantage of 128-bit registers, FPU pipeline should have been doubled (would have pushed back release date of Katmai)

Potentially, it could have enhanced 3D graphics since registers can handle four 32-bit numbers at a time
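The arithmetic behind this limitation can be checked with a short sketch. It packs four 32-bit floats into 16 bytes (128 bits) with Python's `struct` module, then counts how many passes a pipeline needs when it can only take two of those numbers at a time. The helper names `pack_register` and `passes_needed` are assumptions made for illustration.

```python
import math
import struct


def pack_register(values):
    """Pack four 32-bit floats into 16 bytes, mimicking a 128-bit register."""
    return struct.pack("<4f", *values)


def passes_needed(lanes: int, pipeline_width: int) -> int:
    """Passes required to process all lanes, pipeline_width lanes at a time."""
    return math.ceil(lanes / pipeline_width)


register = pack_register([1.0, 2.0, 3.0, 4.0])
print(f"Register width: {len(register) * 8} bits")
print(f"Passes with a 2-wide pipeline: {passes_needed(4, 2)}")
print(f"Passes with a 4-wide pipeline: {passes_needed(4, 4)}")
```

With a two-wide pipeline, every four-number operation takes two passes, so the registers are only half used per pass; doubling the pipeline would have brought this down to one.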

Page 40: Computer Architecture Part II-D: Survey of Processor Architecture

SSE Enhancements

SSE2
  Started in Pentium 4
  Has 144 new instructions (since SSE)
  Data width now 64 bits
SSE3
  13 additional SIMD instructions (since SSE2)
  New instructions primarily designed to improve thread synchronization and specific application areas such as media and gaming
Supplemental SSE3 (Core 2)
SSE4

Page 41: Computer Architecture Part II-D: Survey of Processor Architecture

Other CPU Innovations

Data width
  Internal: How many bits can the CPU process simultaneously?
  External: How many bits can the CPU receive simultaneously for processing?
Superscalar architecture
Superpipelined architecture
Superscalar processing

Page 42: Computer Architecture Part II-D: Survey of Processor Architecture

Intel Corporation

Produced biggest impact on microprocessor technology

Main line of business is CPU but also has other hardware products (e.g. motherboards)

Page 43: Computer Architecture Part II-D: Survey of Processor Architecture

Short History of Intel

1968: Birth of Intel
  Started in the memory business
  First product was 64-bit memory
1970s: Increase in market share
Early 1980s: Japanese manufacturers eat up the memory market with 16 - 256 KB chips
1984: Business slowing down; “Get us out of memory!”
1986: Exited from memory due to the success of the 80386

Page 44: Computer Architecture Part II-D: Survey of Processor Architecture

Intel Processor Time Line

1971: 4004
  Intel’s first microprocessor (108 kHz, 4-bit bus width)
1978: 8086
  First 16-bit CPU from Intel
1979: 8088
  Reengineered CPU to fit existing 8-bit hardware
1982: 286
  16-bit processor
  Optimized instruction handling
1985: 386
  First 32-bit CPU (32-bit system bus)
1988: 386SX
  Cheaper version of the 386DX
1989: 486
  Built-in math co-processor
  L1 cache on-chip

Page 45: Computer Architecture Part II-D: Survey of Processor Architecture

Intel Processor Time Line

486SX
  Discount chip
  No math co-processor
486DX4
  Triple the clock speed
  From 25 MHz to 75 MHz, 33 MHz to 100 MHz
1993: Pentium Classic
  Superscalar (5x 486DX-33 MHz)
  Width of system bus: 64 bits
  Speed of system bus: 60 to 66 MHz
  Initially produced a lot of heat
Jan 8, 1997: Pentium MMX
  New set of instructions for multimedia
  32 KB L1 cache
Nov 1, 1995: Pentium Pro
  RISC processor
  32-bit processing
  L2 cache is built in
May 7, 1997: Pentium II (Klamath)
  512 KB L2
  L1 cache of 32 KB

Page 46: Computer Architecture Part II-D: Survey of Processor Architecture

Intel Processor Time Line

Jan 26, 1998: Deschutes
  333 MHz
  0.25 micron technology
1Q 1998: Celeron (Covington)
  Pentium II without the L2 cache
1998: Celeron (Mendocino)
  333 MHz
  128 KB L2 internal cache
July 26, 1998: Pentium II Xeon
  450 MHz
  Custom SRAM
  Different L2 caches: 512 KB, 1 MB, or 2 MB
  Can have 4 - 8 Xeons in one server
1999: Pentium III (Katmai)
  Enhanced MMX2 graphics instructions
1999: Pentium III Xeon (Tanner)
2001: Itanium (formerly Merced)
  64-bit CPU
  0.18 micron technology
  > 25 million transistors
2000: Pentium 4
  7th generation
  0.18 micron technology
Core (2005)

Page 47: Computer Architecture Part II-D: Survey of Processor Architecture

Current Intel CPU Innovations

Hyperthreading
Multi-core
  Core
  Core 2 (64-bit architecture)

Page 48: Computer Architecture Part II-D: Survey of Processor Architecture

Intel’s First 64-Bit Chip (Server): Itanium

Was known as IA-64 (but IA-32 compatible)
EPIC (Explicitly Parallel Instruction Computing) processor
  Enables up to 20 operations/clock cycle
  Employs branch prediction and speculation
Three levels of cache: 2 MB / 4 MB L3 cache, 96 KB L2 cache, and 32 KB L1 cache
128 integer registers, 128 FP registers

Page 49: Computer Architecture Part II-D: Survey of Processor Architecture

Itanium 2

Available from 1 - 1.66 GHz
Internal L3 cache (1.5 MB, 3 MB, 4 MB, 6 MB, or 9 MB)
System bus: 400/533/667 MHz, 128 bits wide
0.13 microns, 592 million transistors
Next version (“Montecito”) has 1.72 billion transistors, 26 MB on-die cache, 90 nm

Page 50: Computer Architecture Part II-D: Survey of Processor Architecture

Current Intel CPU Lineup

Mobile
  Centrino (Core and Core 2)
Desktop
  Core 2 Extreme
  Core 2 (now used in Apple Mac Mini)
Servers and workstations
  Xeon (now used in Apple Mac Pro)
  Itanium 2

Page 51: Computer Architecture Part II-D: Survey of Processor Architecture

AMD (Advanced Micro Devices)

Incorporated in May 1969
Challenging Intel even before Pentium-class processors
Offered its own technology and cannot be considered a producer of clones
Achieved increased sales starting with the K6 and K6-II

Page 52: Computer Architecture Part II-D: Survey of Processor Architecture

AMD Series (From Pentium Class)

K5
  Similar to the classic Pentiums
  16 KB L1 cache and no MMX
  Not very impressive but much cheaper than similar Pentium models
K6
  Technology brought in from NexGen; put AMD back in business
  32 KB L1 cache and MMX
  Pentium compatible but performed better than the Pentium MMX

Page 53: Computer Architecture Part II-D: Survey of Processor Architecture

AMD Series (From Pentium Class)

K6-II: Chomper
  0.25 micron, system bus speed of 100 MHz
  Introduced 3DNow!
  Also MMX-compatible; really challenged the Pentium II and led to the low-cost Celeron
K6-III: Sharptooth
  Three levels of cache: L1 and L2 are in the CPU; L3 is on the motherboard, up to 1 MB; 133 MHz system bus
  Was not as successful as the K6-II
K7: Athlon

Page 54: Computer Architecture Part II-D: Survey of Processor Architecture

Last of AMD’s 32-Bit Processors

Athlon XP
  Intel played catch-up to the Athlon XP on many occasions, but now stagnant in 32-bit computing
  Model 3200+ has a 2.2 GHz CPU, 3 FP pipelines, 128 KB of L1 cache, 512 KB L2 cache, system bus speed of 400 MHz, and 0.13 micron technology
Sempron
  Counterpart of Intel Celeron
  Model 3300+ has a 2 GHz CPU, 754 pins, 90 nm technology, 128 KB L2 cache

Page 55: Computer Architecture Part II-D: Survey of Processor Architecture

AMD’s 64-Bit Chips

Varieties:
  Athlon 64 (desktop)
  Turion 64 (mobile)
  Opteron (servers or workstations)
Provides seamless transition to 64-bit
System bus runs at processor speed through on-chip memory controller
Leads the Itanium 2 on many benchmarks
AMD formed a partnership with Sun

Page 56: Computer Architecture Part II-D: Survey of Processor Architecture

Current AMD 64-Bit CPU Innovations

HyperTransport
Dual core
Direct Connect Architecture

Page 57: Computer Architecture Part II-D: Survey of Processor Architecture

Transmeta’s Crusoe Processor

Transmeta, whose people and backers include David Ditzel (co-founder), Linus Torvalds (employee), and Paul Allen (investor), released “Crusoe” in January 2000
Architectural achievements
  Only 25% the number of transistors compared to current Pentiums
  Needs only 1 or 2 watts of power for 400 MHz or 700 MHz chips running at full speed
  Much less heat dissipated but can compete with same-category Intel and AMD chips

Page 58: Computer Architecture Part II-D: Survey of Processor Architecture

How Crusoe Pulled It Off

Efficient instruction set bears no resemblance to x86

Takes advantage of latest and best in hardware design

Software layer (code morphing software) in flash ROM translates x86 commands
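The translation idea can be sketched as a toy. Everything in the block below is hypothetical: the `NATIVE_OPS` mapping, the "native" operation names, and the `CodeMorpher` class are invented for illustration and do not reflect Crusoe's real instruction set or Transmeta's software. The sketch also assumes that a translated block is kept and reused rather than translated again each time it runs.

```python
# Hypothetical mapping from x86-style mnemonics to made-up native operations.
NATIVE_OPS = {"mov": "native_copy", "add": "native_add", "jmp": "native_branch"}


class CodeMorpher:
    """Toy translation layer: x86-style text in, made-up native ops out."""

    def __init__(self):
        self.translation_cache = {}   # block of x86 text -> translated block
        self.translations_done = 0    # how many blocks were really translated

    def translate_block(self, x86_block):
        key = tuple(x86_block)
        if key not in self.translation_cache:
            self.translations_done += 1
            translated = []
            for instruction in x86_block:
                mnemonic, _, operands = instruction.partition(" ")
                translated.append(f"{NATIVE_OPS[mnemonic]} {operands}")
            self.translation_cache[key] = translated
        return self.translation_cache[key]


morpher = CodeMorpher()
block = ["mov eax, 1", "add eax, 2"]
first_run = morpher.translate_block(block)
second_run = morpher.translate_block(block)   # reuses the earlier translation
print(first_run, morpher.translations_done)
```

The hardware only ever executes the native operations; x86 compatibility lives entirely in the software layer, which is what lets the chip itself stay small and low-power.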

Page 59: Computer Architecture Part II-D: Survey of Processor Architecture

Current Transmeta Processors

Crusoe TM5900
  667 MHz – 1 GHz CPU speed
  128 KB L1, 512 KB L2
  133 MHz system bus
  0.13 microns
Efficeon TM8800
  Up to 1.7 GHz
  128 KB L1 instruction cache, 64 KB L1 data cache, 1 MB L2
  400 MHz system bus

Page 60: Computer Architecture Part II-D: Survey of Processor Architecture

The PowerPC Microprocessor

Originally designed by Apple, IBM, and Motorola
Based on IBM POWER architecture used in IBM RS/6000 (RISC based)
Provides seamless transition to 64-bit
The PowerPC G5 is used in Apple iMac G5
  2.7 GHz CPU speed, 1.35 GHz system bus, 512 KB on-chip L2 cache

Page 61: Computer Architecture Part II-D: Survey of Processor Architecture

Sun UltraSparc IV+

2nd generation dual core processor design (1368 pins FC-LGA)
64-bit CPU, 90 nm, 295 million transistors
CPU speeds of 1.95 / 2.1 GHz
2 MB L2 cache, 32 MB off-chip
On-chip memory controller
CMT (Chip Multi-Threading) with 2 threads per processor
14-stage non-stalling pipeline
4-way superscalar
Runs Solaris, Linux, FreeBSD, and other UNIX versions

Page 62: Computer Architecture Part II-D: Survey of Processor Architecture

Sun UltraSparc T1

Available in 4-, 6- or 8-core
64 bits, 90 nm
4-way multithreaded core
14-stage non-stalling pipeline
4 integrated memory controllers
16 KB instruction, 8 KB data L1 cache per core, 3 MB unified L2 cache
Available in 1 and 1.2 GHz
Low power (72 – 79 watts)

Page 63: Computer Architecture Part II-D: Survey of Processor Architecture

Multiprocessor Systems

Combines two or more CPUs of the same brand and model

Allows systems to scale up
Forms an N-way system

Page 64: Computer Architecture Part II-D: Survey of Processor Architecture

Future Trends

In Dec. 1997, the Semiconductor Industry Association (SIA) provided details about future requirements of microprocessors.

Attempts to continue the pace predicted by Moore’s Law

Page 65: Computer Architecture Part II-D: Survey of Processor Architecture

1999 SIA Roadmap for Microprocessors

                             1999     2000     2001     2002     2005     2008
MPU gate length (microns)    0.14     0.12     0.10     0.09     0.065    0.045
Transistors per sq. cm       6.6 M    9.4 M    13 M     18 M     44 M     109 M
Die size (sq. mm)            340      340      340      340      408      468
Clock (MHz)                  1250     1486     1767     2100     3500     6000
Packaging (pins/balls)       740      821      912      1012     1384     1893
Wafer size (mm)              200      200      300      300      300      300

(M = million)
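One way to read the roadmap is to combine two of its rows: multiplying transistor density by die size gives an approximate transistor count per chip. The sketch below does this for each roadmap year; the function name and dictionary are illustrative, and the result is a back-of-the-envelope estimate rather than a figure stated by the SIA.

```python
# (transistors per sq. cm in millions, die size in sq. mm), from the roadmap.
ROADMAP = {
    1999: (6.6, 340),
    2000: (9.4, 340),
    2001: (13, 340),
    2002: (18, 340),
    2005: (44, 408),
    2008: (109, 468),
}


def transistors_per_die_millions(density_per_sq_cm: float, die_sq_mm: float) -> float:
    """Approximate transistors per die: density x area (100 sq. mm = 1 sq. cm)."""
    return density_per_sq_cm * die_sq_mm / 100.0


for year, (density, die_size) in ROADMAP.items():
    estimate = transistors_per_die_millions(density, die_size)
    print(f"{year}: about {estimate:.0f} million transistors per die")
```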

Page 66: Computer Architecture Part II-D: Survey of Processor Architecture

International Technology Roadmap for Semiconductors