L03 Principles


  • 8/9/2019 L03 Principles


    Roman

    Japanese

    Chinese (compute in hex?)


COMP 206: Computer Architecture and Implementation

Montek Singh
Thu, Jan 22, 2009

Lecture 3: Quantitative Principles


Quantitative Principles of Computer Design

This is an introduction to design and analysis:

Take Advantage of Parallelism
Principle of Locality
Focus on the Common Case
Amdahl's Law
The Processor Performance Equation


1) Taking Advantage of Parallelism (examples)

Increase throughput of a server computer via multiple processors or multiple disks.

Detailed HW design:
Carry-lookahead adders use parallelism to speed up computing sums from linear to logarithmic in the number of bits per operand.
Multiple memory banks are searched in parallel in set-associative caches.

Pipelining (next slides).


Pipelining

Overlap instruction execution to reduce the total time to complete an instruction sequence.

Not every instruction depends on its immediate predecessor, so executing instructions completely or partially in parallel is possible.

Classic 5-stage pipeline:
1) Instruction Fetch (Ifetch)
2) Register Read (Reg)
3) Execute (ALU)
4) Data Memory Access (Dmem)
5) Register Write (Reg)


Pipelined Instruction Execution

Instruction order (down) vs. time in clock cycles (across):

             Cycle 1  Cycle 2  Cycle 3  Cycle 4  Cycle 5  Cycle 6  Cycle 7  Cycle 8
Instr. 1     Ifetch   Reg      ALU      DMem     Reg
Instr. 2              Ifetch   Reg      ALU      DMem     Reg
Instr. 3                       Ifetch   Reg      ALU      DMem     Reg
Instr. 4                                Ifetch   Reg      ALU      DMem     Reg


Limits to Pipelining

Hazards prevent the next instruction from executing during its designated clock cycle:

Structural hazards: attempt to use the same hardware to do two different things at once.
Data hazards: an instruction depends on the result of a prior instruction still in the pipeline.
Control hazards: caused by the delay between the fetching of instructions and decisions about changes in control flow (branches and jumps).


Increasing Clock Rate

Pipelining is also used for this. Clock rate is determined by gate delays.

[Figure: blocks of combinational logic separated by latches or registers at the stage boundaries]


2) The Principle of Locality

The Principle of Locality: programs access a relatively small portion of the address space. Also, they reuse data.

Two different types of locality:
Temporal locality (locality in time): if an item is referenced, it will tend to be referenced again soon (e.g., loops, reuse).
Spatial locality (locality in space): if an item is referenced, items whose addresses are close by tend to be referenced soon (e.g., straight-line code, array access).

For the last 30 years, HW has relied on locality for memory performance.
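Spatial locality can be illustrated by traversing the same matrix in layout order versus against it. This is a hypothetical sketch (the matrix and function names are ours, not from the slides); in CPython the timing difference is muted because interpreter overhead dominates, but the access-pattern principle is the one the slide describes.

```python
import time

# Hypothetical demo: a 500x500 matrix stored row by row (row-major layout).
N = 500
matrix = [[i * N + j for j in range(N)] for i in range(N)]

def sum_row_major(m):
    # Visits elements in layout order: good spatial locality.
    return sum(v for row in m for v in row)

def sum_col_major(m):
    # Strides across rows between consecutive accesses: poor spatial locality.
    return sum(m[i][j] for j in range(N) for i in range(N))

t0 = time.perf_counter(); s1 = sum_row_major(matrix); t1 = time.perf_counter()
s2 = sum_col_major(matrix);                           t2 = time.perf_counter()
print(s1 == s2, f"row-major {t1 - t0:.4f}s vs col-major {t2 - t1:.4f}s")
```

In C the column-major traversal of a large array can be several times slower because each access misses the cache line fetched by the previous one.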


Levels of the Memory Hierarchy

Level         Capacity       Access time               Cost        Staging/transfer unit              Managed by
Registers     100s bytes     300-500 ps (0.3-0.5 ns)   -           instr. operands, 1-8 bytes         program/compiler
L1/L2 cache   10s-100s KB    ~1 ns - ~10 ns            $1000s/GB   blocks, 32-64 bytes (L1),          cache controller
                                                                   64-128 bytes (L2)
Main memory   GB             80 ns - 200 ns            ~$100/GB    pages, 4K-8K bytes                 OS
Disk          10s TB         10 ms (10,000,000 ns)     ~$1/GB      files, MB                          user/operator
Tape          infinite       sec-min                   ~$1/GB      -                                  -

Upper levels are faster; lower levels are larger.


3) Focus on the Common Case

In making a design trade-off, favor the frequent case over the infrequent case.
e.g., the instruction fetch and decode unit is used more frequently than the multiplier, so optimize it first.
e.g., if a database server has 50 disks per processor, storage dependability dominates system dependability, so optimize it first.

The frequent case is often simpler and can be done faster than the infrequent case.
e.g., overflow is rare when adding two numbers, so improve performance by optimizing the more common case of no overflow.
This may slow down the overflow case, but overall performance is improved by optimizing for the normal case.

What is the frequent case, and how much is performance improved by making that case faster? => Amdahl's Law.


4) Amdahl's Law (History, 1967)

"Validity of the single processor approach to achieving large scale computing capabilities," G. M. Amdahl, AFIPS Conference Proceedings, pp. 483-485, April 1967.
http://www-inst.eecs.berkeley.edu/~n252/paper/Amdahl.pdf

Historical context:
Amdahl was demonstrating the continued validity of the single-processor approach and the weaknesses of the multiple-processor approach.
The paper contains no mathematical formulation, just arguments and simulation:
"The nature of this overhead appears to be sequential so that it is unlikely to be amenable to parallel processing techniques."
"A fairly obvious conclusion which can be drawn at this point is that the effort expended on achieving high parallel performance rates is wasted unless it is accompanied by achievements in sequential processing rates of very nearly the same magnitude."

Nevertheless, it is of widespread applicability in all kinds of situations.

Speedup

The book shows two forms of the speedup equation:

    Speedup_overall = ExTime_new / ExTime_old

    Speedup_overall = ExTime_old / ExTime_new

We will use the second because it gives speedup factors like 2X.


4) Amdahl's Law

    ExTime_new = ExTime_old × [(1 − Fraction_enhanced) + Fraction_enhanced / Speedup_enhanced]

    Speedup_overall = ExTime_old / ExTime_new
                    = 1 / [(1 − Fraction_enhanced) + Fraction_enhanced / Speedup_enhanced]

Best you could ever hope to do:

    Speedup_maximum = 1 / (1 − Fraction_enhanced)


Amdahl's Law Example

The new CPU is 10X faster. The server is I/O bound, so 60% of the time is spent waiting for I/O:

    Speedup_overall = 1 / [(1 − Fraction_enhanced) + Fraction_enhanced / Speedup_enhanced]
                    = 1 / [(1 − 0.4) + 0.4/10]
                    = 1 / 0.64
                    = 1.56

It's human nature to be attracted by "10X faster," vs. keeping in perspective that it's just 1.6X faster.
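The slide's arithmetic can be checked directly with Amdahl's Law as given above (a minimal sketch; the function name is ours):

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when a fraction of execution time is sped up."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)

# Slide example: CPU is 10X faster, but only 40% of the time is CPU-bound.
print(round(amdahl_speedup(0.4, 10), 2))  # → 1.56
```

Note that even an infinitely fast CPU (speedup_enhanced → ∞) would cap the overall speedup at 1/0.6 ≈ 1.67X here.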


Amdahl's Law for Multiple Tasks

    R_avg = 1 / Σ_i (F_i / R_i),   where Σ_i F_i = 1

R_avg: average execution rate (performance), in [results/second].
F_i: fraction of results generated at rate R_i [results/second].

Note: F_i is NOT the fraction of time spent working at this rate.

"Bottleneckology: Evaluating Supercomputers," Jack Worlton, COMPCON 85, pp. 405-406.


Example

30% of results are generated at the rate of 1 MFLOPS, 20% at 10 MFLOPS, 50% at 100 MFLOPS. What is the average performance in MFLOPS? What is the bottleneck?

    R_avg = 1 / (0.3/1 + 0.2/10 + 0.5/100)
          = 1 / [(30 + 2 + 0.5) / 100]
          = 100 / 32.5
          = 3.08 MFLOPS

Fractions of time spent at each rate: 30/32.5 = 92.3%, 2/32.5 = 6.2%, 0.5/32.5 = 1.5%.

Bottleneck: the rate that consumes most of the time — here the 1 MFLOPS rate, which accounts for 92.3% of the time.
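The average-rate formula above is a weighted harmonic mean, and the time fractions fall out of the same sum. A small sketch (names are ours):

```python
def average_rate(fractions, rates):
    # R_avg = 1 / sum(F_i / R_i), with sum(F_i) == 1 (fractions of *results*).
    assert abs(sum(fractions) - 1.0) < 1e-9
    return 1.0 / sum(f / r for f, r in zip(fractions, rates))

fractions = [0.30, 0.20, 0.50]
rates = [1, 10, 100]  # MFLOPS
r_avg = average_rate(fractions, rates)

# Fraction of *time* spent at each rate (not fraction of results):
time_shares = [(f / r) * r_avg for f, r in zip(fractions, rates)]
print(round(r_avg, 2), [round(t, 3) for t in time_shares])  # → 3.08 [0.923, 0.062, 0.015]
```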


Another Example

Which change is more effective on a certain machine: speeding up 10-fold the floating-point square root operation only, which takes up 20% of execution time, or speeding up 2-fold all floating-point operations, which take up 50% of total execution time? (Assume that the cost of accomplishing either change is the same, and the two changes are mutually exclusive.)

Notation:
F_sqrt = fraction of FP sqrt results; R_sqrt = rate of producing FP sqrt results
F_non-sqrt = fraction of non-sqrt results; R_non-sqrt = rate of producing non-sqrt results
F_fp = fraction of FP results; R_fp = rate of producing FP results
F_non-fp = fraction of non-FP results; R_non-fp = rate of producing non-FP results
R_before = average rate of producing results before enhancement
R_after = average rate of producing results after enhancement

From the given time fractions:

    F_non-sqrt / R_non-sqrt = 4 × (F_sqrt / R_sqrt)    (sqrt takes 20% of the time)
    F_non-fp / R_non-fp = F_fp / R_fp                  (FP takes 50% of the time)


Solution Using Amdahl's Law

Improve FP sqrt only. Let x = F_sqrt / R_sqrt:

    1/R_before = F_sqrt/R_sqrt + F_non-sqrt/R_non-sqrt = x + 4x = 5x
    1/R_after  = F_sqrt/(10 R_sqrt) + F_non-sqrt/R_non-sqrt = 0.1x + 4x = 4.1x
    R_after/R_before = 5x / 4.1x = 1.22

Improve all FP ops. Let y = F_fp / R_fp:

    1/R_before = F_fp/R_fp + F_non-fp/R_non-fp = y + y = 2y
    1/R_after  = F_fp/(2 R_fp) + F_non-fp/R_non-fp = 0.5y + y = 1.5y
    R_after/R_before = 2y / 1.5y = 1.33

So speeding up all FP operations (1.33X) is more effective than speeding up sqrt alone (1.22X).
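The same comparison drops straight out of the standard time-fraction form of Amdahl's Law, since sqrt is 20% of execution time and all FP ops are 50% (a sketch; the helper name is ours):

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)

sqrt_only = amdahl_speedup(0.20, 10)  # FP sqrt: 20% of time, made 10x faster
all_fp = amdahl_speedup(0.50, 2)      # all FP ops: 50% of time, made 2x faster
print(round(sqrt_only, 2), round(all_fp, 2))  # → 1.22 1.33
```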


Implications of Amdahl's Law

Improvements provided by a feature are limited by how often the feature is used.

As stated, Amdahl's Law is valid only if the system always works with exactly one of the rates.
Overlap between CPU and I/O operations? Then Amdahl's Law as given here is not applicable.

The bottleneck is the most promising target for improvements:
Make the common case fast.
Infrequent events, even if they consume a lot of time, will make little difference to performance.

Typical use: change only one parameter of the system, and compute the effect of this change.
The same program, with the same input data, should run on the machine in both cases.


5) Processor Performance

    CPU Time [sec/program] = CPU cycles for program [clock cycles/program] × clock cycle time [sec/clock cycle]

or

    CPU Time [sec/program] = CPU cycles for program [clock cycles/program] / clock rate [clock cycles/sec]


CPI — Clocks per Instruction

    CPI [clock cycles/instruction] = CPU cycles for program [clock cycles/program] / instruction count [instructions/program]

    CPU Time [sec/program] = instruction count × CPI × clock cycle time
                           = instruction count × CPI / clock rate


Details of CPI

We can break performance down into individual types of instructions (instructions of type i), for a simplistic CPU:

    CPU cycles = Σ_i (CPI_i × IC_i)

    CPI = Σ_i (CPI_i × IC_i) / instruction count = Σ_i CPI_i × (IC_i / instruction count)

    CPU performance = clock rate / (instruction count × CPI)
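The per-type breakdown above amounts to a frequency-weighted average, which is how the examples on the next slides compute CPI (a sketch; names are ours):

```python
def cpi_from_mix(mix):
    # mix: list of (frequency, CPI_i) pairs; frequencies must sum to 1.
    assert abs(sum(f for f, _ in mix) - 1.0) < 1e-9
    return sum(f * c for f, c in mix)

def cpu_time(instruction_count, mix, clock_rate_hz):
    return instruction_count * cpi_from_mix(mix) / clock_rate_hz

# Instruction mix used in the examples that follow:
mix = [(0.43, 1), (0.21, 2), (0.12, 2), (0.24, 2)]
print(round(cpi_from_mix(mix), 2))  # → 1.57
```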


Processor Performance Equation

How can we improve performance?

Factor                                   Clock rate   CPI   Instruction count
Hardware technology (realization)        x
Hardware organization (implementation)   x            x
Instruction set (architecture)                        x     x
Compiler technology                                   x     x
Program                                               x     x


Example 1

A LOAD/STORE machine has the characteristics shown below. We also observe that 25% of the ALU operations directly use a loaded value that is not used again. Thus we hope to improve things by adding new ALU instructions that have one source operand in memory. The CPI of the new instructions is 2. The only unpleasant consequence of this change is that the CPI of branch instructions will increase from 2 to 3. Overall, will CPU performance increase?

Instruction type   Frequency   CPI
ALU ops            0.43        1
Loads              0.21        2
Stores             0.12        2
Branches           0.24        2


Example 1 (Solution)

Before the change:

Instruction type   Frequency   CPI
ALU ops            0.43        1
Loads              0.21        2
Stores             0.12        2
Branches           0.24        2

    CPI = 0.43×1 + (0.21 + 0.12 + 0.24)×2 = 1.57
    CPU time = IC × CPI × clock cycle time = IC × 1.57 × T = 1.57 IC T

After the change, let x = the fraction of all original instructions replaced by reg-mem ops:

    x = 0.25 × 0.43 = 0.1075

Instruction type   Frequency            CPI
ALU ops            (0.43 − x)/(1 − x)   1
Loads              (0.21 − x)/(1 − x)   2
Stores             0.12/(1 − x)         2
Branches           0.24/(1 − x)         3
Reg-mem ops        x/(1 − x)            2

    CPI = [(0.43 − x)×1 + (0.21 − x)×2 + 0.12×2 + 0.24×3 + x×2] / (1 − x)
        = 1.7025 / 0.8925 = 1.908
    CPU time = (1 − x) × IC × CPI × T = 0.8925 × IC × 1.908 × T = 1.703 IC T

Since CPU time increases, the change will not improve performance.
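The before/after arithmetic can be verified with a short script (a sketch; variable names are ours — relative CPU time is instruction count × CPI with the cycle time held fixed):

```python
def cpi_from_mix(mix):
    return sum(f * c for f, c in mix)

x = 0.25 * 0.43  # ALU ops fused with the load feeding them

before = [(0.43, 1), (0.21, 2), (0.12, 2), (0.24, 2)]
after = [((0.43 - x) / (1 - x), 1), ((0.21 - x) / (1 - x), 2),
         (0.12 / (1 - x), 2), (0.24 / (1 - x), 3), (x / (1 - x), 2)]

# Relative CPU time, in units of (original IC) x (cycle time T):
time_before = 1.0 * cpi_from_mix(before)
time_after = (1 - x) * cpi_from_mix(after)
print(f"before: {time_before:.4f} IC*T, after: {time_after:.4f} IC*T")
```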


Example 2

A load-store machine has the characteristics shown below. An optimizing compiler for the machine discards 50% of the ALU operations, although it cannot reduce loads, stores, or branches. Assuming a 500 MHz (2 ns) clock, what is the MIPS rating for optimized code versus unoptimized code? Does the ranking of MIPS agree with the ranking of execution time?

Instruction type   Frequency   CPI
ALU ops            43%         1
Loads              21%         2
Stores             12%         2
Branches           24%         2


Example 2 (Solution)

Without optimization:

    CPI = 0.43×1 + (0.21 + 0.12 + 0.24)×2 = 1.57
    CPU time = IC × CPI × clock cycle time = IC × 1.57 × 2×10^-9 = 3.14×10^-9 × IC
    MIPS = 500 MHz / 1.57 = 318.5

With optimization, let x = 0.43/2 = 0.215 (the discarded ALU ops):

Instruction type   Frequency            CPI
ALU ops            (0.43 − x)/(1 − x)   1
Loads              0.21/(1 − x)         2
Stores             0.12/(1 − x)         2
Branches           0.24/(1 − x)         2

    CPI = [(0.43 − x)×1 + (0.21 + 0.12 + 0.24)×2] / (1 − x) = 1.355/0.785 = 1.73
    CPU time = (1 − x) × IC × 1.73 × 2×10^-9 = 2.72×10^-9 × IC
    MIPS = 500 MHz / 1.73 = 289.0

Performance increases, but MIPS decreases!
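The MIPS-versus-time mismatch can be reproduced numerically (a sketch; names are ours — MIPS = clock rate / (CPI × 10^6), and CPU time is scaled by the surviving fraction of instructions):

```python
CLOCK_RATE = 500e6  # 500 MHz

def stats(mix, ic_scale):
    cpi = sum(f * c for f, c in mix)
    cpu_time = ic_scale * cpi / CLOCK_RATE  # per original instruction
    mips = CLOCK_RATE / (cpi * 1e6)
    return cpu_time, mips

unopt = [(0.43, 1), (0.21, 2), (0.12, 2), (0.24, 2)]
x = 0.43 / 2  # discarded ALU ops
opt = [((0.43 - x) / (1 - x), 1), (0.21 / (1 - x), 2),
       (0.12 / (1 - x), 2), (0.24 / (1 - x), 2)]

t1, m1 = stats(unopt, 1.0)
t2, m2 = stats(opt, 1 - x)
print(f"unopt: {m1:.1f} MIPS, opt: {m2:.1f} MIPS")
assert t2 < t1 and m2 < m1  # the optimized code is faster yet rates fewer MIPS
```

The discrepancy arises because the compiler removed mostly CPI-1 instructions, raising the average CPI even as total work shrinks.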


Performance of (Blocking) Caches

With no cache misses:

    CPU time = CPU cycles × clock cycle time
    CPU cycles = IC × CPI          (IC = instruction count)

With cache misses:

    CPU time = (CPU cycles + Memory stall cycles) × clock cycle time

    Memory stall cycles = Number of misses × Miss penalty
                        = IC × (Misses / Instruction) × Miss penalty
                        = IC × (Memory references / Instruction) × (Misses / Memory reference) × Miss penalty


Example

Assume we have a machine where the CPI is 2.0 when all memory accesses hit in the cache. The only data accesses are loads and stores, and these total 40% of the instructions. If the miss penalty is 25 clock cycles and the miss rate is 2%, how much faster would the machine be if all memory accesses were cache hits?

    CPI_misses = CPI + (Memory refs / Instruction) × Miss rate × Miss penalty
               = 2 + (1 + 0.4) × 0.02 × 25
               = 2 + 0.7 = 2.7

    CPU time_misses / CPU time_no misses = 2.7 / 2 = 1.35

So the machine would be 1.35X faster if all memory accesses were cache hits. Why (1 + 0.4)? Every instruction makes one instruction fetch, and 40% of instructions also make a data access.
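The stall-cycle formula from the previous slide plugs in directly (a sketch; the function name is ours):

```python
def cpi_with_misses(base_cpi, mem_refs_per_instr, miss_rate, miss_penalty):
    # Adds average memory-stall cycles per instruction to the hit-case CPI.
    return base_cpi + mem_refs_per_instr * miss_rate * miss_penalty

# 1 instruction fetch + 0.4 data references per instruction (from the slide).
cpi = cpi_with_misses(2.0, 1.0 + 0.4, 0.02, 25)
print(round(cpi, 2), round(cpi / 2.0, 2))  # → 2.7 1.35
```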


Fallacies and Pitfalls

Fallacies: commonly held misconceptions. When discussing a fallacy, we try to give a counterexample.

Pitfalls: easily made mistakes, often generalizations of principles that are true in limited contexts.

We show fallacies and pitfalls to help you avoid these errors.


Fallacies and Pitfalls (1/3)

Fallacy: Benchmarks remain valid indefinitely.
Once a benchmark becomes popular, there is tremendous pressure to improve performance by targeted optimizations or by aggressive interpretation of the rules for running the benchmark: "benchmarksmanship."
Of 70 benchmarks from the 5 SPEC releases, 70% were dropped from the next release because they were no longer useful.

Pitfall: A single point of failure.
Rule of thumb for fault-tolerant systems: make sure that every component is redundant, so that no single component failure can bring down the whole system (e.g., power supply).


Fallacies and Pitfalls (2/3)

Fallacy: The rated MTTF of disks is 1,200,000 hours, or 140 years, so disks practically never fail.
Disk lifetime is ~5 years, so replace each disk every 5 years; on average, 28 replacement cycles wouldn't fail (140 years is a long time!).
Is that meaningful? A better unit: the percentage that fail in 5 years (next slide).


Fallacies and Pitfalls (3/3)

    MTTF = (Number of disks × Time period) / Failed disks

    Failed disks = (Number of disks × Time period) / MTTF
                 = (1000 disks × 5×365×24 hours) / 1,200,000 hours
                 = 37 disks

So 3.7% will fail over 5 years. But this is under pristine conditions:
little vibration, narrow temperature range, no power failures.

Real world: 3% to 6% of SCSI drives fail per year.
3400-6800 FIT, or 150,000-300,000 hour MTTF [Gray & van Ingen 05]
3% to 7% of ATA drives fail per year.
3400-8000 FIT, or 125,000-300,000 hour MTTF [Gray & van Ingen 05]
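The slide's failure estimate follows from rearranging the MTTF definition (a sketch; names are ours — the slide rounds the result up to 37 disks, i.e. ~3.7%):

```python
MTTF_HOURS = 1_200_000  # rated disk MTTF
disks = 1000
hours = 5 * 365 * 24    # 5 years of operation

# Expected failures among a population, assuming the rated MTTF holds.
failed = disks * hours / MTTF_HOURS
print(failed)  # → 36.5 (about 37 of 1000 disks, ~3.7%, over 5 years)
```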


Next Time

Instruction Set Architecture (Appendix B)


References

G. M. Amdahl, "Validity of the single processor approach to achieving large scale computing capabilities," AFIPS Conference Proceedings, pp. 483-485, April 1967.
http://www-inst.eecs.berkeley.edu/~n252/paper/Amdahl.pdf