The CRAY-1 Computer System Richard Russell Communications of the ACM January 1978

Page 1: The CRAY-1 Computer System Richard Russell Communications of the ACM January 1978

The CRAY-1 Computer System

Richard Russell

Communications of the ACM, January 1978

Page 2

“The world’s most expensive love-seat”

Page 3

A “reasonably trim individual” can gain access to the interior of the machine.

- 12.5 ns clock
- 8 MB internal semiconductor memory
- 4 KB of register storage
- Uses ECL throughout
- 115 kW input power
- Simple gates

Page 4

Memory

- 16 banks = 16-way interleaved access
- No bank conflicts except on stride lengths of 8 or 16
- 4 clock cycles per access
- Can pull down 16 instructions per cycle
- 1 data word if being placed in registers
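The stride claim can be checked with a toy model. The sketch below assumes that bank = address mod 16, that one access is issued per clock, and that a bank stays busy for 4 clocks after an access. The function name `stall_cycles` and the model itself are illustrative assumptions, not the actual memory controller logic.

```python
NUM_BANKS = 16          # from the slide: 16-way interleaving
BANK_BUSY_CYCLES = 4    # from the slide: 4 clock cycles per access

def stall_cycles(stride, n_accesses=64):
    """Count clocks lost waiting on busy banks for a strided access stream.

    Hypothetical model: one access issues per clock unless its bank is busy.
    """
    free_at = [0] * NUM_BANKS   # clock at which each bank is free again
    cycle = 0
    stalls = 0
    for i in range(n_accesses):
        bank = (i * stride) % NUM_BANKS
        if free_at[bank] > cycle:
            # Bank still busy: wait until it is free.
            stalls += free_at[bank] - cycle
            cycle = free_at[bank]
        free_at[bank] = cycle + BANK_BUSY_CYCLES
        cycle += 1
    return stalls

# Strides 1..16 that cause at least one stall in this model.
conflicting = [s for s in range(1, 17) if stall_cycles(s) > 0]
print(conflicting)  # [8, 16]
```

In this model a stride revisits the same bank every 16 / gcd(stride, 16) accesses, so only strides that return to a bank in fewer than 4 clocks stall.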

Page 5

Cooling

- Big power + many modules = heat
- Aluminum/steel cooling rods with Freon flow
- Copper connectors pipe heat from chip out to cooling rods
- Freon/oil leak problem on rod construction
- Designed to keep module temperatures under 54 degrees Celsius

Page 6

Floating Point

- IEEE? No.
- Why?
  - Not written yet!
  - Wouldn’t arrive until 7 years later.
- 49 bit signed magnitude “mantissa”
- 15 bit biased exponent
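To make the field widths concrete, here is a minimal decoding sketch. The slide only gives the widths (1 sign bit + 48 magnitude bits = 49-bit signed magnitude, plus a 15-bit exponent). The bit positions, the bias of 40000 octal, and the binary point sitting to the left of the coefficient are assumptions based on the usual description of the CRAY-1 format, and `decode` is a hypothetical helper name.

```python
from fractions import Fraction

EXP_BIAS = 0o40000            # assumption: exponent bias of 40000 octal
COEFF_BITS = 48               # magnitude part of the 49-bit signed mantissa

def decode(word):
    """Decode a 64-bit word laid out as [sign:1][exponent:15][coefficient:48]."""
    sign = (word >> 63) & 1
    exponent = (word >> COEFF_BITS) & 0x7FFF
    coefficient = word & ((1 << COEFF_BITS) - 1)
    # Assumption: the coefficient is a pure fraction (binary point on its left).
    value = Fraction(coefficient, 1 << COEFF_BITS) * Fraction(2) ** (exponent - EXP_BIAS)
    return -value if sign else value

# 1.0 is 0.5 * 2**1 under these assumptions.
word_one = ((EXP_BIAS + 1) << COEFF_BITS) | (1 << (COEFF_BITS - 1))
print(decode(word_one))               # 1
print(decode(word_one | (1 << 63)))   # -1
```

Exact `Fraction` arithmetic is used so the example does not depend on how the host's own floating point rounds.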

Page 7

Production plans anticipate shipping one CRAY-1 per quarter.

Page 8

Topic: Vector Computers

- 8 64x64 vector registers (8 registers, each holding 64 elements of 64 bits)
- Process vector elements identically
- Vector Mask register can protect an element
- “Chaining”
  - Can use output of one vector operation as input to next before it is done
  - Win = don’t have to store to memory then fetch from memory
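A rough way to see why chaining pays off is to count clocks for two dependent vector operations. The sketch below assumes a simple pipeline model where an operation on n elements takes (unit latency + n) clocks. The latencies 6 and 7 are illustrative placeholders, and both function names are hypothetical.

```python
def unchained_cycles(n, latency_a, latency_b):
    """Second operation starts only after the first has produced all n results."""
    return (latency_a + n) + (latency_b + n)

def chained_cycles(n, latency_a, latency_b):
    """Second operation consumes each result as soon as the first produces it."""
    return latency_a + latency_b + n

n = 64                       # one full vector register
add_latency, mul_latency = 6, 7   # assumed functional unit times, for illustration

print(unchained_cycles(n, add_latency, mul_latency))  # 141
print(chained_cycles(n, add_latency, mul_latency))    # 77
```

Under this model the two pipelines overlap almost completely, so the pair costs little more than a single vector operation.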

Page 9

Benefits of Vector Computing

- Previously needed 100+ elements for vector to be useful over scalar
  - CRAY-1 cuts that to 2-4
- Don’t need to store vector elements next to each other in memory
- Max wait time is previous vector length + 4
- Common wait time is functional unit time + 2
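The point about non-adjacent elements amounts to loading a vector with a constant stride. A minimal sketch, assuming memory is a flat list of words and using a hypothetical `load_vector` helper, pulls one column out of a row-major matrix:

```python
def load_vector(memory, base, stride, length):
    """Gather `length` words starting at `base`, stepping by `stride` words."""
    return [memory[base + i * stride] for i in range(length)]

rows, cols = 4, 5
# Row-major 4x5 matrix; each word holds its own address to make the result easy to read.
memory = list(range(rows * cols))

# Column 2: consecutive elements are `cols` words apart in memory.
column_2 = load_vector(memory, base=2, stride=cols, length=rows)
print(column_2)  # [2, 7, 12, 17]
```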

Page 10

Vector Benefits Continued

Page 11

Compiler

- CFT
- Automatically vectorizes inner loop if possible
  - No need to rewrite code!
- Can’t vectorize loops with control statements.
- Often slower than hand coded assembly.
- Improve instruction scheduling “in the future”

Page 12

Questions

- The CRAY-1 automatically vectorizes code loops. Current microprocessors usually use smaller vector registers with extensions such as SSE to support SIMD operations. Do modern compilers do these vector optimizations automatically as the CRAY did, or is it the explicit use of vector instructions that has dominated, and why? Trade-offs?
- They say they can eventually make loops with control flow in them vectorizable. Can you come up with a simple method to do so and/or some reasons that make this case difficult?
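One possible answer to the second question is to turn the branch into a mask: evaluate the condition for every element, compute both branches for the whole vector, then merge under the mask. This is a sketch of that general technique (often called if-conversion), not something the slides or the paper prescribe; the function names are hypothetical.

```python
def scalar_loop(a, b):
    """Original loop with a control statement in the body."""
    out = []
    for x, y in zip(a, b):
        if x > 0:
            out.append(x * y)
        else:
            out.append(y)
    return out

def masked_loop(a, b):
    """Same computation written as whole-vector steps plus a masked merge."""
    mask = [x > 0 for x in a]                    # vector compare produces a mask
    then_vals = [x * y for x, y in zip(a, b)]    # "then" branch for every element
    else_vals = list(b)                          # "else" branch for every element
    return [t if m else e for m, t, e in zip(mask, then_vals, else_vals)]

a = [3, -1, 0, 5]
b = [2, 4, 6, 8]
print(masked_loop(a, b) == scalar_loop(a, b))  # True
```

The difficulty shows up when a branch cannot safely run for every element, for example when it divides by a value the condition was guarding against, or when one iteration depends on the result of an earlier one.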

Page 13

Table 3

Page 14

Registers

- A = 8 address registers
- B = 64 address-save registers
- S = 8 scalar registers
- T = 64 scalar-save registers
- V = 8 64x64 vector registers
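The register counts can be tallied against the "4 KB of register storage" figure quoted earlier in the deck. The counts come from the slide. The widths (24-bit A and B registers, 64-bit S, T, and V elements) are not stated in the slides and are assumed here from the usual description of the CRAY-1.

```python
# name: (number of registers or elements, assumed width in bits)
REGISTER_FILES = {
    "A": (8, 24),        # address registers (24-bit width is an assumption)
    "B": (64, 24),       # address-save registers (24-bit width is an assumption)
    "S": (8, 64),        # scalar registers
    "T": (64, 64),       # scalar-save registers
    "V": (8 * 64, 64),   # 8 vector registers x 64 elements each
}

total_bits = sum(count * width for count, width in REGISTER_FILES.values())
total_bytes = total_bits // 8
print(total_bytes)  # 4888
```

Under these assumptions the vector registers alone account for 4096 bytes, which is where the "4 KB" round figure comes from.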

Page 15

Special Registers

- VM = mask off vector elements to not operate on
- VL = length of vector being processed
- P = parcel address count
- BA = absolute address used as base for indexed memory accesses (helps with dynamic user space migration)
- LA = limits the accessible address space
- XA = supports exchange operation
- F = flag register that holds various “condition codes”
- M = mode register (3 bits)
  - Bit 1 = Floating Point Error/Interrupt Enable
  - Bit 2 = Uncorrectable memory corruption Interrupt Enable
  - Bit 3 = All interrupts disabled.
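The BA and LA registers together describe a base-and-limit scheme. A minimal sketch follows, assuming LA is an exclusive upper bound compared against the absolute address; the slides do not spell out the exact comparison, and `translate` and `AddressRangeError` are hypothetical names.

```python
class AddressRangeError(Exception):
    """Raised when a program address falls outside its allowed region."""

def translate(relative_address, ba, la):
    """Map a program-relative address to an absolute one using base BA and limit LA."""
    absolute = ba + relative_address
    # Assumption: LA is an absolute, exclusive upper bound.
    if relative_address < 0 or absolute >= la:
        raise AddressRangeError(f"address {relative_address} is out of range")
    return absolute

print(translate(10, ba=1000, la=2000))  # 1010
# Moving the job elsewhere in memory only requires new BA/LA values:
print(translate(10, ba=5000, la=6000))  # 5010
```

Because the program only ever sees relative addresses, relocating it is a matter of copying its memory and updating the two registers, which is the "dynamic user space migration" the slide mentions.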

Page 16

Front End

- Needs an access terminal minicomputer
- Connects to a “CRAY access channel” to control the computer