CS 152 Computer Architecture and Engineering CS252 Graduate Computer Architecture Lecture 1 - Introduction Krste Asanovic Electrical Engineering and Computer Sciences University of California at Berkeley http://people.eecs.berkeley.edu/~krste http://inst.eecs.berkeley.edu/~cs152

...cs152/sp18/lectures/L01... · 2018. 1. 21.


  • CS 152 Computer Architecture and Engineering CS252 Graduate Computer Architecture

    Lecture 1 - Introduction

    Krste Asanovic

    Electrical Engineering and Computer Sciences, University of California at Berkeley

    http://people.eecs.berkeley.edu/~krste
    http://inst.eecs.berkeley.edu/~cs152

  • What is Computer Architecture?

    Applica

  • Abstraction Layers in Modern Systems

    Algorithm

    Gates/Register-Transfer Level (RTL)

    Applica

  • Computing Devices Then…

    EDSAC, University of Cambridge, UK, 1949

  • Computing Devices Now

    Robots

    Supercomputers Automobiles

    Laptops

    Set-top boxes

    Smart phones

    Servers Media Players

    Sensor Nets

    Routers

    Cameras Games

  • Cost of software development makes compa

  • Major Technology Generations

    [Figure, from Kurzweil. Labelled generations: Electromechanical, Relays, Vacuum Tubes, Bipolar, pMOS, nMOS, CMOS, ?]

  • Single-Thread Processor Performance

    [Figure: Hennessy & Patterson, 2017]

  • Upheaval in Computer Design

    •  Most of last 50 years, Moore’s Law ruled
       –  Technology scaling allowed con

  • Today’s Dominant Target Systems

    •  Mobile (smartphone/tablet)
       –  >1 billion sold/year
       –  Market dominated by ARM-ISA-compa

  • This Year: Combined CS152/CS252

    •  CS152/CS252 share lectures in 306 Soda, MW 2:30-4pm
       –  but some slides marked as CS252 material only

    •  CS152/CS252 share two midterms (in class, 80 minutes each)
       –  but some ques

  • CS152/CS252 Administrivia

    Instructor: Prof. Krste Asanovic, [email protected]

    Office: 579 Soda Hall (inside ADEPT Lab)
    Office Hours: Wed. 9:30-10:30AM (email to confirm), 579 Soda

    T.A.s: Donggyu Kim, dgkim@berkeley, OH: 2-3 Thu, 611 Soda
           Howie Mao, zhemao@berkeley, OH: 10-11 Tue, 611 Soda

    Lectures: MW, 2:30-4PM, 306 Soda
    252 Readings: M 4-5pm, 405 Soda (start 1/29)
    152 Sec

  • CS152 Course Grading

    •  15% Problem Sets
       –  Intended to help you learn the material. Feel free to discuss with other students and instructors, but must turn in your own solu

  • CS252 Course Grading

    •  15% Problem Sets
       –  Intended to help you learn the material. Feel free to discuss with other students and instructors, but must turn in your own solu

  • CS152/CS252 Crossovers

    •  Berkeley undergrads cannot take CS252 before CS152

    •  CS152 students can par

  • CS152 Labs

    •  Each lab has directed plus open-ended assignments
    •  Directed por

  • Class ISA is RISC-V

    •  RISC-V is a new free, simple, clean, extensible ISA we developed at Berkeley for educa

  • Foundation: 100+ Members

  • Chisel simulators

    •  Chisel is a new hardware descrip

  • Chisel Design Flow

    Chisel Design Descrip

  • Questions?

  • Computer Architecture: A Little History

    Throughout the course we’ll use a historical narra

  • Analog Computers

    §  Analog computer represents problem variables as some physical quan

  • Digital Computers

    §  Represent problem variables as numbers encoded using discrete steps
       - Discrete steps provide noise immunity

    §  Enables accurate and determinis

  • Charles Babbage (1791-1871)

    §  Lucasian Professor of Mathema

  • Difference Engine 1822

    §  Con

  • Realizing the Difference Engine

    §  Mechanical calculator, hand-cranked, using decimal digits
    §  Babbage did not complete the DE, moving on to the Analy

  • Analytical Engine 1837

    §  Recognized as first general-purpose digital computer
       - Many itera

  • Analytical Engine Design Choices

    §  Decimal, because storage on mechanical gears
       - Babbage considered binary and other bases, but no clear advantage over human-friendly decimal

    §  40-digit precision (equivalent to >133 bits)
       - To reduce impact of scaling given lack of floa

  • Ada Lovelace (1815-1852)

    §  Translated lectures of Luigi Menabrea who published notes of Babbage’s lectures in Italy

    §  Lovelace considerably embellished notes and described Analy

  • Early Programmable Calculators

    §  Analog compu

  • Atanasoff-Berry Linear Equation Solver (1939)

    §  Fixed-func

  • Zuse Z3 (1941)

    §  Built by Konrad Zuse in war

  • Harvard Mark I (1944)

    §  Proposed by Howard Aiken at Harvard, and funded and built by IBM

    §  Mostly mechanical with some electrically controlled relays and gears

    §  Weighed 5 tons and had 750,000 components
    §  Stored 72 numbers each of 23 decimal digits
    §  Speed: adds 0.3s, mul[…] 1 minute
    §  Instruc

  • ENIAC (1946)

    §  First electronic general-purpose computer
    §  Construc

  • ENIAC

    [Public Domain, US Army Photo]

    Changing the program could take days!

  • EDVAC

    §  ENIAC team started discussing stored-program concept to speed up programming and simplify machine design

    §  John von Neumann was consul

  • Manchester SSEM “Baby” (1948)

    [Figure: Williams-Kilburn Tube Store. Piero71, Creative Commons BY-SA 3.0]

    §  Manchester University group built small-scale experimental machine to demonstrate idea of using cathode-ray tubes (CRTs) for computer memory instead of mercury delay lines

    §  Williams-Kilburn Tubes were first random access electronic storage devices

    §  32 words of 32-bits, accumulator, and program counter
    §  Machine ran world’s first stored-program in June 1948
    §  Led to later Manchester Mark-1 full-scale machine
       - Mark-1 introduced index registers
       - Mark-1 commercialized by Ferranti

  • Cambridge EDSAC (1949)

    §  Maurice Wilkes came back from workshop in US and set about building a stored-program computer in Cambridge

    §  EDSAC used mercury-delay line storage to hold up to 1024 words (512 ini

  • Commercial computers: BINAC (1949) and UNIVAC (1951)

    §  Eckert and Mauchly left U. Penn after patent rights disputes and formed the Eckert-Mauchly Computer Corpora

  • IBM 701 (1952)

    §  IBM’s first commercial scien

  • IBM 650 (1953)

    §  The first mass-produced computer
    §  Low-end system with drum-based storage and digit serial ALU

    §  Almost 2,000 produced

    [Cushing Memorial Library and Archives, Texas A&M, Creative Commons Attribution 2.0 Generic]

  • IBM 650 Architecture

    [Figure, from 650 Manual, © IBM. Labels: Magnetic Drum (1,000 or 2,000 10-digit decimal words); 20-digit accumulator; Active instruction (including next program counter); Digit-serial ALU]

  • IBM 650 Instruction Set

    §  Address and data in 10-digit decimal words
    §  Instruc

  • Early Instruction Sets

    §  Very simple ISAs, mostly single-address accumulator-style machines, as high-speed circuitry was expensive
       - Based on earlier “calculator” model

    §  Over

  • Burroughs B5000 Stack Architecture: Robert Barton, 1960

    §  Hide instruc

  • Evaluation of Expressions

    (a + b * c) / (a + d * c - e)

    [Figure: expression tree and evaluation stack]

    Reverse Polish: a b c * + a d c * + e - /

    push a
    push b
    push c
    mul

  • Evaluation of Expressions

    (a + b * c) / (a + d * c - e)

    [Figure: expression tree and evaluation stack]

    Reverse Polish: a b c * + a d c * + e - /

    add

    Evalua
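
The Reverse Polish sequence on these two slides can be traced with a small stack evaluator. The Python sketch below is only an illustration: the function name `eval_rpn`, the token format, and the sample operand values are assumptions made here and do not come from the lecture. It loosely mirrors the push/mul/add steps of a stack machine, where each operator pops its two operands and pushes the result.

```python
def eval_rpn(tokens, env):
    """Evaluate a Reverse Polish token list using an explicit operand stack.

    tokens: list of variable names and operators (hypothetical format).
    env:    dict mapping variable names to numeric values (sample data).
    """
    ops = {
        "+": lambda x, y: x + y,
        "-": lambda x, y: x - y,
        "*": lambda x, y: x * y,
        "/": lambda x, y: x / y,
    }
    stack = []
    for tok in tokens:
        if tok in ops:
            # Binary operator: the right operand is on top of the stack.
            right = stack.pop()
            left = stack.pop()
            stack.append(ops[tok](left, right))
        else:
            # Operand: equivalent to a "push <name>" instruction.
            stack.append(env[tok])
    if len(stack) != 1:
        raise ValueError("malformed Reverse Polish expression")
    return stack[0]


# Reverse Polish form of (a + b * c) / (a + d * c - e), as on the slide.
RPN = "a b c * + a d c * + e - /".split()

# Arbitrary sample values, chosen only for this illustration.
values = {"a": 2, "b": 3, "c": 4, "d": 5, "e": 6}

result = eval_rpn(RPN, values)
a, b, c, d, e = (values[k] for k in "abcde")
print(result == (a + b * c) / (a + d * c - e))
```

No operand addresses appear in the operator steps: the stack supplies them implicitly, which is the property the stack-architecture slides build on.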

  • IBM’s Big Bet: 360 Architecture

    §  By early 1960s, IBM had several incompa

  • IBM 360: Design Premises
    Amdahl, Blaauw and Brooks, 1964

    §  The design must lend itself to growth and successor machines

    §  General method for connec

  • Stack versus GPR Organization
    Amdahl, Blaauw and Brooks, 1964

    1. The performance advantage of push-down stack organiza

  • IBM 360: A General-Purpose Register (GPR) Machine

    §  Processor State
       - 16 General-Purpose 32-bit Registers
       - may be used as index and base register
       - Register 0 has some special proper

  • IBM 360: Initial Implementations

                     Model 30            ...   Model 70
    Storage          8K - 64 KB                256K - 512 KB
    Datapath         8-bit                     64-bit
    Circuit Delay    30 nsec/level             5 nsec/level
    Local Store      Main Store                Transistor Registers
    Control Store    Read only 1 µsec          Conven

  • IBM Mainframes survive until today

    [z14, 2017, 14nm technology, 17 layers of metal, 696 sq mm]

    z14 processor design summary (© 2017 IBM Corporation)

    Micro-Architecture

    • 10 cores per CP-chip
    • 5.2GHz

    • Cache Improvements:
      • 128KB I$ + 128KB D$
      • 2x larger L2 D$ (4MB)
      • 2x larger L3 Cache
      • symbol ECC

    • New translation & TLB design
      • Logical-tagged L1 directory
      • Pipelined 2nd level TLB
      • Multiple translation engines

    • Pipeline Optimizations
      • Improved instruction delivery
      • Faster branch wakeup
      • Improved store hazard avoidance
      • 2x double-precision FPU bandwidth
      • Optimized 2nd generation SMT2

    • Better Branch Prediction
      • 33% Larger BTB1 & BTB2
      • New Perceptron & Simple Call/Return Predictor

    Architecture

    • PauseLess Garbage Collection
    • Vector Single & Quad precision
    • Long-multiply support (RSA, ECC)
    • Register-to-register BCD arithmetic

    Accelerators

    • Redesigned in-core crypto-accelerator
      • Improved performance
      • New functions (GCM, TRNG, SHA3)

    • Optimized in-core compression accelerator
      • Improved start/stop latency
      • Huffman encoding for better compression ratio
      • Order-preserving compression

  • And in conclusion…

    •  Computer Architecture >> ISAs and RTL
    •  CSx52 is about interac

  • Acknowledgements

    •  These slides contain material developed and copyright by:
       –  Arvind (MIT)
       –  Krste Asanovic (MIT/UCB)
       –  Joel Emer (Intel/MIT)
       –  James Hoe (CMU)
       –  John Kubiatowicz (UCB)
       –  David Patterson (UCB)

    •  MIT material derived from course 6.823
    •  UCB material derived from course CS252