Advanced Multimedia Architecture Cristina Silvano home.deib. ?· Advanced Multimedia Architecture ...…

  • Published on
    08-Sep-2018

  • View
    212

  • Download
    0

Embed Size (px)

Transcript

  • AdvancedMultimediaArchitectureProf.CristinaSilvano

    June2011AmirHosseinASHOURI

    764722

  • IBMenergyapproachpolicy:OneSizeFitsAll EncompassSoftware/Firmware/Hardware

    P d f t Power7predecessorsfeatures Clockgatingcircuits(whentheyarenotneeded) Runtimescalingoffrequencyandvoltagetovaryinguse Sensorstomeasuretheenvironmentworkloadsunderwhichthechipisoperatingwhichthechipisoperating

  • f NewInnovativeFeaturesofIBMPower7Chips

    Per corefrequencyscalingwithavailablePer corefrequencyscalingwithavailableautonomicfrequencycontrol PerchipautomatedvoltageslewingPer chipautomatedvoltageslewing Powerconsumptionestimation HardwareinstrumentationassistHardwareinstrumentationassist

  • F Features: 8 processorcorechaplets EachconsistingofprocessorcorewithL2 L3cache EachconsistingofprocessorcorewithL2,L3cache eachcorechipletrunsasynchronouslytothesymmetricmultiprocessor(SMP)interconnectfabric.

    ForOptimalPower,ThermalandYieldconcern: Thesystemmustmanagetwovoltagesindependentlyofthevoltagesuppliedtotherestofthechipofthevoltagesuppliedtotherestofthechip TheypowerthestaticRAM(SRAM)andembeddedDRAM(eDRAM)arraycircuitswithaseparate,slightlyhighervoltage(Varray)thanthevoltage(Vlogic)usedbytherestofthelogiccircuitsinthecorechiplet.

  • f f ThreeStepsofEnergyScalefirmware:1. Sensethestateofthehardware,including

    workloadandenvironmentalconditions.2. Decidewhichtradeoffstomakeonthebasisof

    li di i f h policydirectionfromthecustomer.3. Controlthehardwaresbehaviortodirectthese

    hchanges.

  • l l TemperatureCalculation EachPower7chipcontains44digitalthermalsensors(DTSs)

    fivepereachprocessorcorechipletfivepereachprocessorcorechiplet

    CriticalpathMonitoring colocatedwitheachDTStoproviderealtimefeedbackonthe

    chipscurrenttimingmargins

    SensorConsolidationE S l fi i f ti d EnergyScale firmware accesses sensor information andsets controls on the processor via an onchip serialcommunication infrastructure that provides 64bitregisters in a memorymapped, 32bit address space.

  • B i P t Because previous Power systems cores ransynchronously to the SMP interconnect, EnergyScalefirmware had to change the reference clocks frequencyfor the entire system in order to change the processorcores frequency, Disadvantages: Relatively slowRelatively slow Affected every core and cache in the system Makes suboptimal for adaptive power management algorithms

    Oft t i d th E S l fi Often constrained the EnergyScale firmware Multiple processor cores on a chip traditionally run off a shared analog

    phaselocked loop (APLL), a design thats not amenable to frequencyscalingscaling.

  • SolutionDPLL(fractionalNdigitalphaselockedloop) Power7leveragestheasynchronouscorechipletdesignby

    supportinga50percentto110percentchangefromnominalsupportinga50percentto110percentchangefromnominalfrequency,whileminimizingSMPcoherenceandmemorylatencybyusingasynchronouswrite,asynchronousreadarrayqueuingstructureacrossthechipletinterfacequeuingstructureacrossthechipletinterface.

    TheDPLLcontainsadigitallycontrolledoscillator(DCO),wheretheoutputfrequencyisselectedasamultiplieroffareferenceclockbase.

    AninternalDPLLfilterensuresthatnoovershootorundershootincycle AninternalDPLLfilterensuresthatnoovershootorundershootincycletimeispresentinthetransientduringrequestedfrequencychanges.

    Consequently,thePower7seesdramaticimprovementinbothslewrateandfrequencyrangeoverthePower6 inadditiontoprovidinganrateandfrequencyrangeoverthePower6,inadditiontoprovidinganindividualfrequencyonapercorebasis.

  • ThePower7processorsupportstwodistinctidlestates ThePower7processorsupportstwodistinctidlestates:

    Nap TheNapidlestateQuiesce instructionfetchandexecutionandturns

    ff ll l k t th ti i i th N P k offallclockstotheexecutionenginesinthecore.NaponPower7keepsthecaches(L1,L2,andL3)coherentanddoesntpurgethem.

    Sleep MoreaggressiveidlemodeMoreaggressiveidlemode Clocksofftheentirecorechiplet,cachesandDPLL UponwakingfromSleep,thehardwarelogicrestartstheDPLL,powersuptheL3

    cacheseDRAM,andsequencesthecorechipletbacktofunctionaloperation. Power7allowsadditionalpowerreductionforNapandSleep Power7allowsadditionalpowerreductionforNapandSleep

    bysupportingautomatedcouplingoftheonchipfrequencyandvoltageslewingcapabilitieswiththeprocessoridlestatesstates

  • t d i th lt t t ti f ll supportreducingthevoltagetoretentionforanallcoreSleepstateintherequiredsub10mstimescale

    Power7approach:1. ArrayVoltage(Varray)2. LogicVoltage(Vlogic)

    TheonchipcontrollerautomaticallyselectsTheon chipcontrollerautomaticallyselectsbetweentheidleandoperationaltargetvoltage.

    VRMscansafelyprocessonlysmallinstantaneousvoltageh t Power7 hi h d t t thi changerequests,Power7 chiphardwareautomatesthis

    functionbyprovidingavoltagesequencingengine,freeingupcomputepoweronthemicrocontrollerformoreadvancedpower

    t l ithmanagementalgorithms.

  • Eff ti ti i i ti Effectivepowermanagementinamicroprocessorrequiresaruntimemeasurementofpower.However,themeasurementofreal,calibratedpowerconsumptioninhardwareisdifficult.

    AMD hasreferredtoanon chippowermonitoringandcontrolfacility AMD hasreferredtoanonchippowermonitoringandcontrolfacility Intel hasalsoreportedamechanismtochoosethechipsglobalsettings

    (suchasvoltageorfrequency)bymonitoringactivitylevelsacrossvariousregionswithinthechipregionswithinthechip

    Power7 implementsahardwaremechanismintheformofapowerproxybyusingaspeciallyarchitected,programmably weightedcounterbasedarchitecturethatmonitorsactivitiesandformsanaggregatebasedarchitecturethatmonitorsactivitiesandformsanaggregatevalue.

    Pact= (Wi*Ai)+C Pproxy= Pact+ (K*Favg) Pproxy= Pact+ (K Favg)

  • Si ifi l i ffi i l Significantlyimproveserverenergyefficiently TheSPECpower_ssj2008benchmarkisrepresentativeofserversideJava

    businessapplications: Itmeasuressystemenergyefficiencywhileinjectingtransactionsintothe

    serveratloadlevelsfrom10percentto100percentofcalibratedmaximumthroughput

    Thefinalbenchmarkscoreistheratioofthesumofsystem Thefinalbenchmarkscoreistheratioofthesumofsystemthroughputtothesumofsystempowerconsumption,eachsumbeingacrossallloadlevels.

  • A f l hi l Autonomousfrequencycontrolsmoveprocessorcorechipletfrequencieswithinadefinedrangeinresponsetooperatingconditions,exploitingthefinegrained,lowlatency, p g g , yfrequencycontrolprovidedbythepercoreDPLLs

    LowActivityDetection (LAD)mechanismautonomouslymanagescorechipletfrequenciesin (LAD)mechanismautonomouslymanagescorechipletfrequenciesin

    responsetoworkloadchangesobservedonintervalsasshortasafewmicroseconds

    ReducingwastefulguardbandReducingwastefulguardband Typically, includesomeadditionalguardbandvoltagetocoverfor

    unknownvariablesandtestinaccuracy usingCPM

  • E ffi i b tb hi db ith d i Energyefficiencycanbestbeachievedbyeitherreducingpowerconsumptionwhilemaintainingthesameamountofperformanceorbyincreasingperformanceatthesamepowerlevel.

    ThedirectionforfutureEnergyScalehardwareandfirmware ThedirectionforfutureEnergyScalehardwareandfirmwaredevelopmentwillbetobuilduponthecapabilitiesandresultsachievedonPower7todojustthat.

    Toreachthisgoalwillrequire:Toreachthisgoalwillrequire: betterinstrumentation moreintelligentalgorithms finergrainedvoltagecontrols Increasedguardbandreductiontechniques Smalleridlelatencies fasterEnergyScalefirmwareresponsetime

  • INTRODUCINGTHEADAPTIVEENERGYMANAGEMENTFEATURESOFTHE1. INTRODUCINGTHEADAPTIVEENERGYMANAGEMENTFEATURESOFTHEPOWER7CHIP02721732/11/$26.00c2011IEEE

    2. http://www.greenlaunches.com/gadgetsandtech/ibmintroducestheecofriendlypower7chip.phpy p 7 p p p

    3. http://news.cnet.com/i/bto/20090831/power7die_610x483.jpg4. http://en.wikipedia.org/wiki/High_Productivity_Computing_Systems5. http://en.wikipedia.org/wiki/Clock_gating6 http://en wikipedia org/wiki/Quiesce6. http://en.wikipedia.org/wiki/Quiesce7. http://www.phys.uu.nl/~euroben/reports/web10/pwr7layout.jpg8. http://en.wikipedia.org/wiki/POWER79. http://news.cnet.com/830113924_31044880464.html

Recommended

View more >