COMP 412, Fall 2019
Lab 3: Instruction Scheduling
Version 1.0

Code
  Due date: 12/03/2019
  Submit to: comp412code@rice.edu
  Tutorial #1: Date: tba, Time: tba, Location: tba

Report & Test Block
  Due date: 12/05/2019
  Submit to: comp412report@rice.edu
  Tutorial #2: Date: tba, Time: tba, Location: tba

Table of Contents

Code
  C-1 Instruction Scheduling for a Single Basic Block
  C-2 Code Specifications
  C-3 Code Submission Requirements
  C-4 Code Grading Rubric & Honor Code Policy
Report & Test Block
  R-1 Report Contents
    R-1.1 Questionnaire
    R-1.2 Results
  R-2 Test Block
  R-3 Report & Test Block Submission Requirements
  R-4 Report & Test Block Grading Rubric & Honor Code Policy
Checklist
Appendix
  A-1 ILOC Virtual Machine
  A-2 ILOC Simulator & Subset
  A-3 ILOC Input Blocks
  A-4 Tools
    A-4.1 Reference Instruction Scheduler
    A-4.2 Timer
    A-4.3 Graphviz
  A-5 Instruction Scheduler Timing Results

Please report suspected typographical errors to the class Piazza site.


Code

C-1. Instruction Scheduling for a Single Basic Block

This project will expose you to the complications that arise in scheduling instructions for a modern microprocessor. You will design, test, and implement an instruction scheduler that operates on single basic blocks. The goal of your scheduler is to rearrange the instructions in the input block to reduce the number of cycles required to execute the output block, as measured with the Lab 3 ILOC simulator. (See § A-2 for details related to the Lab 3 ILOC simulator.)

Your scheduler will take as input a file that contains a sequence of operations, expressed in a subset of the ILOC Intermediate Representation (IR) presented in EaC2e. The ILOC subset used in Lab 3 is described in § A-2. As output, your scheduler will produce an “equivalent” sequence of ILOC operations, albeit reordered to improve (shorten) its execution time on the ILOC virtual machine, which is described in § A-1.

For the purposes of this lab, an input program and an output program are considered equivalent if and only if they print the same values in the same order to stdout and each data-memory location defined by an execution of the input program receives the same value when the output program executes.
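The equivalence rule above can be written down as a small predicate. The following is only a sketch: the function name and the representation of a run (a list of printed values plus a dictionary of memory addresses written) are assumptions made for illustration, not part of the Lab 3 tools.

```python
def equivalent(in_printed, in_memory, out_printed, out_memory):
    """Check the Lab 3 notion of equivalence for two program runs.

    in_printed / out_printed -- values printed to stdout, in order (hypothetical capture)
    in_memory / out_memory   -- dict: address -> final value for locations each run defined
    """
    # Same values, in the same order, on stdout.
    if in_printed != out_printed:
        return False
    # Every location defined by the input program must hold the same value
    # after the output program runs.
    return all(out_memory.get(addr) == value for addr, value in in_memory.items())
```

For example, `equivalent([55], {1024: 55}, [55], {1024: 55})` holds, while a run that prints the same values in a different order does not.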

To simplify matters, the test blocks do not read any input files. All data used by a given test block is contained in the test block or entered on the command line using the ILOC simulator’s -i option.

You have wide latitude in your choice of scheduling algorithms, ranging from a careful implementation of list scheduling through novel techniques of your own invention. See the lecture notes and Chapter 12 (including the chapter notes) of Engineering a Compiler (EaC2e) for descriptions of scheduling algorithms to consider.
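As a point of reference for the list-scheduling option mentioned above, here is a deliberately simplified sketch. It assumes a single issue slot per cycle (the real ILOC machine has two functional units), takes the dependence graph as given, and uses the latency-weighted longest path to a leaf as the priority. The function name and data layout are hypothetical.

```python
def list_schedule(latency, succs):
    """Forward list scheduling for a single-issue machine (simplified sketch).

    latency -- list: latency[i] is the latency of operation i (each >= 1)
    succs   -- dict: succs[i] lists the operations that must wait for i
    Returns a list of 1-based issue cycles, one per operation.
    """
    n = len(latency)
    preds_left = [0] * n
    for i in range(n):
        for j in succs.get(i, []):
            preds_left[j] += 1

    # Priority: latency-weighted longest path from each node to a leaf.
    prio = [0] * n

    def longest(i):
        if prio[i] == 0:
            prio[i] = latency[i] + max((longest(j) for j in succs.get(i, [])), default=0)
        return prio[i]

    for i in range(n):
        longest(i)

    ready = [i for i in range(n) if preds_left[i] == 0]
    earliest = [1] * n      # earliest cycle in which each operation may issue
    issue = [0] * n
    cycle, done = 1, 0
    while done < n:
        candidates = [i for i in ready if earliest[i] <= cycle]
        if candidates:
            op = max(candidates, key=lambda i: prio[i])
            ready.remove(op)
            issue[op] = cycle
            done += 1
            for j in succs.get(op, []):
                earliest[j] = max(earliest[j], cycle + latency[op])
                preds_left[j] -= 1
                if preds_left[j] == 0:
                    ready.append(j)
        cycle += 1          # an empty cycle corresponds to a nop
    return issue
```

With operations `loadI`, `loadI`, `load`, `add` (latencies 1, 1, 5, 1) where the `load` depends on the second `loadI` and the `add` depends on the first `loadI` and the `load`, the sketch issues the second `loadI` first so that the long-latency `load` starts as early as possible.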

Your instruction scheduler will undoubtedly build a dependence graph. Many students have found it useful to look at dependence graphs using a visualization tool. We recommend developing your scheduler with an option to output a file that can be visualized using Graphviz (§ A-4.3).
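One possible way to produce a Graphviz file is to print the graph in the DOT language. The sketch below assumes a hypothetical in-memory layout (a list of operation strings and a list of edge tuples) and an edge direction chosen for illustration: an edge from `src` to `dst` means that `src` must wait for `dst`.

```python
def to_dot(ops, edges):
    """Return Graphviz DOT text for a dependence graph.

    ops   -- list of operation strings, indexed by node number (hypothetical layout)
    edges -- iterable of (src, dst, latency) tuples: op src depends on op dst
    """
    lines = ["digraph DG {"]
    for i, text in enumerate(ops):
        label = text.replace('"', '\\"')    # keep the label a valid DOT string
        lines.append(f'  {i} [label="{i}: {label}"];')
    for src, dst, latency in edges:
        lines.append(f'  {src} -> {dst} [label="{latency}"];')
    lines.append("}")
    return "\n".join(lines)
```

Writing the returned text to a file such as `block.dot` and running Graphviz's `dot -Tpdf block.dot -o block.pdf` renders the graph.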

Code produced by your scheduler must run correctly without hardware interlocks. If the scheduler needs to delay an operation beyond its position in the output block, it must insert the appropriate number of nops, lengthen the schedule, and force the operation into the appropriate cycle.
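The nop-insertion requirement can be illustrated with a small sketch. It assumes a single issue slot per cycle, tracks only register (true) dependences, and uses the five-cycle load latency described in § A-1; the default latency of one cycle for every other opcode is an assumption, so consult § A-2 for the actual values. The function name and tuple layout are hypothetical.

```python
LATENCY = {"load": 5}   # from § A-1; other opcodes default to 1 here (an assumption)

def pad_with_nops(ops):
    """Insert nops so that no operation issues before its operands are ready.

    ops -- list of (opcode, source_registers, dest_register_or_None) in issue order
    Returns the list of opcodes with "nop" entries inserted.
    """
    ready = {}      # register -> first cycle in which its value is available
    out = []
    cycle = 1
    for opcode, srcs, dest in ops:
        start = max([cycle] + [ready.get(r, 1) for r in srcs])
        out.extend(["nop"] * (start - cycle))   # fill the idle cycles
        out.append(opcode)
        if dest is not None:
            ready[dest] = start + LATENCY.get(opcode, 1)
        cycle = start + 1
    return out
```

For the sequence `loadI`, `load`, `add`, where the `add` uses the result of the `load`, the load issues in cycle 2 and its result is available in cycle 7, so four nops separate the two operations.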

Scheduling is profitable on the ILOC virtual machine because it has two sources of delay that can slow program execution.

1. Some operations have latencies of more than one cycle. If an operation, say in issue slot 3 of the basic block, tries to use the result of an operation in some earlier slot, say slot 2, before the earlier operation completes, then the slot 3 operation may receive an incorrect input value. (Typically, an incorrect input value produces an unexpected and incorrect result.) To ensure correct execution, your scheduler must insert enough nops to ensure that no operation executes before its operands are available. The Lab 3 simulator has options to help you test whether or not your scheduler inserts the right number of nops (§ A-2).

2. The ILOC virtual machine has multiple functional units (§ A-1). Each unit can execute an idiosyncratic set of operations. If no operation is available to execute on one of the units, then that unit sits idle until the next cycle. For example, if an operation of type fee can execute on either functional unit 0 or 1, and an operation of type foe can only execute on functional unit 1, then the string of operations:

   fee1, foe2, foe3, foe4, fee5, fee6

   will execute as [fee1; foe2] [nop; foe3] [fee5; foe4] [fee6; nop], where the square brackets indicate a pair of operations that are issued in the same cycle at runtime. A different interleaving of the operations would have avoided the two nops. (For example, swapping the order of foe4 and fee5 in the input, if allowed, would eliminate the nops in the output.) Poor operation order can reduce functional unit utilization.

Your scheduler can decrease the number of cycles required to execute the output block by reordering operations to reduce the number of nops and increase functional unit utilization.
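The fee/foe example can be reproduced with a short sketch that pairs consecutive operations without reordering them. This is an illustration only: it ignores latencies and dependences, and the function name, the `(name, kind)` tuples, and the `units_for` table are assumptions introduced here.

```python
def pack_in_order(ops, units_for):
    """Greedily pair consecutive operations, never reordering them.

    ops       -- list of (name, kind) tuples in program order
    units_for -- dict mapping kind -> set of functional units (0 or 1) it may use
    Returns a list of [unit0_op, unit1_op] pairs, with "nop" for idle slots.
    """
    cycles = []
    i = 0
    while i < len(ops):
        slots = ["nop", "nop"]
        first_name, first_kind = ops[i]
        second = ops[i + 1] if i + 1 < len(ops) else None
        placed_second = False
        # Prefer a unit for the first op that leaves a legal unit for the next op.
        for u in sorted(units_for[first_kind]):
            other = 1 - u
            if second is not None and other in units_for[second[1]]:
                slots[u], slots[other] = first_name, second[0]
                placed_second = True
                break
        if not placed_second:
            slots[min(units_for[first_kind])] = first_name
        cycles.append(slots)
        i += 2 if placed_second else 1
    return cycles
```

With `units_for = {"fee": {0, 1}, "foe": {1}}`, the original order takes four cycles with two nops, while the order with foe4 and fee5 swapped takes three cycles with none.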

Design and Implementation Timeline: To complete Lab 3 on time, your code should be building a complete and correct dependence graph, including dependences for store, load, and output operations, by Monday, November 11, 2019.

C-2. Code Specifications

The COMP 412 teaching assistants (TAs) will use a script to grade Lab 3 code submissions. All grading will be performed on CLEAR, so test your instruction scheduler as well as your makefile and/or script on CLEAR. You must adhere to the following interface specifications to ensure that your instruction scheduler works correctly with the script that the TAs will use to grade your instruction scheduler.

• Name: The executable version of your instruction scheduler must be named schedule.

• Behavior: Your scheduler must work in two distinct modes, as controlled by parameters and flags on the command line. Your scheduler must check the correctness of the command-line arguments and produce reasonable and informative error messages when errors are detected. The output described in the table below must be printed to the standard output stream stdout; error messages must be printed to the standard error output stream stderr. The TAs will test both of these modes, using command lines that fit the syntax described below:

  schedule -h
    When a -h flag is detected, schedule must produce a list of valid command-line arguments that includes a description of all command-line arguments required for Lab 3 as well as any additional command-line arguments supported by your schedule implementation.

    schedule is not required to process command-line arguments that appear after the -h flag.

  schedule <file name>
    This command will be used to invoke schedule on the input block contained in <file name>.

    <file name> specifies the name of the input file. <file name> is a valid Linux pathname relative to the current working directory.

    schedule will produce, as output, an ILOC program that is equivalent to the input program (§ C-1), albeit reordered to improve (shorten) its execution time on the ILOC virtual machine (§ A-1). The output ILOC code must use the square bracket notation [op1; op2] described in § A-1 to designate operations that should issue in the same cycle.

    schedule is not required to process command-line arguments that appear after <file name>.


• Input: Since input files for Lab 1 and Lab 3 are restricted to the same subset of ILOC instructions, albeit with different latencies assigned to the ILOC instructions (§ A-2), you should reuse in your Lab 3 instruction scheduler both your Lab 1 routines for reading ILOC files and your Lab 1 routines for register renaming. If you elect to write a new front end, the following input specifications from Lab 1 apply:

  o The input file specified in the command-line argument will contain a single basic block that consists of a sequence of operations written in the ILOC subset described in § A-2. If your instruction scheduler cannot read the specified file, or the code in the file is not valid ILOC, your instruction scheduler should produce a reasonable and informative message. If the ILOC code in the input file uses a value from a register that has no prior definition, your instruction scheduler should handle the situation gracefully.

  o Scanning the Input: You must write your own code to scan the input. You may not use a regular expression library or other pattern-matching software, unless you write it. Your code may read an entire line of input and scan that line character-by-character.

Note: The timer described in § A-4.2 uses input files containing up to 128,000 lines of ILOC. Your scheduler is expected to handle such large files correctly, efficiently, and gracefully. The use of efficient algorithms and data structures in your scheduler is key to handling large input files.

• Restrictions on Optimizations: Your scheduler must not perform optimizations other than instruction scheduling and register renaming. Assume that the optimizer had a firm basis for not performing other optimizations. Examples of optimizations that will result in loss of credit include:

  o Removing Dead Code: Your scheduler may remove nops found in the input file; they have no effect on the equivalence of the input and output programs. Any other operation left in the code by the optimizer is presumed to have a purpose. For example, you may not delete operations that define values that are not used in the test block.

  o Adding ILOC Operations: Your scheduler may add nops. Your scheduler may not add other ILOC operations. For example, your scheduler may not add loadI operations.

  o Folding Constants: You may use constant propagation to disambiguate memory references. You may not use constant propagation to eliminate computations.

  o Optimizing Based on Header Constants: Your scheduler may not rely on input constants specified in test block comments. The COMP 412 TAs are not required to use the input specified in those comments while grading your scheduler. Your scheduler is expected to generate correct code regardless of whether or not the TAs use the input specified in the test block comments.

• Makefile & Shell Script: Lab 3 submissions written in languages that require a compilation step, such as C, C++, or Java, must include a makefile. [1] (Lab 3 submissions written in languages that do not require compilation, such as Python, do not need a makefile.)

  The makefile should produce an executable named schedule. Lab 3 submissions written in languages in which it is not possible to create an executable and rename it schedule, such as Python and Java, must include an executable shell script named schedule that correctly accepts the required command-line arguments and invokes the program. For example, a project written in Python named lab3.py could provide an executable shell script named schedule that includes the following instructions:

    #!/bin/bash
    python lab3.py "$@"

  If your script uses a version of python that prints an extra initial line to the output file, you can pipe the output through the “stream editor” sed to remove that line, as noted on Piazza.

  A project written in Java with a jar file named lab3.jar or a class named lab3.class that contains the main function could provide, respectively, one of the following two executable shell scripts named schedule:

    #!/bin/bash
    java -jar lab3.jar "$@"

    #!/bin/bash
    java lab3 "$@"

  To ensure that your schedule shell script is executable on a Linux system, execute the following command in the CLEAR directory where your schedule shell script resides:

    chmod a+x schedule

  To avoid problems related to the translation of carriage return and line feed between Windows and Linux, we recommend that you write your shell script on CLEAR rather than writing it on a Windows laptop and transferring the file.

  [1] If you prefer to use a build manager that is available on CLEAR to create your executable, you may invoke that build manager in your makefile.

• Grading Script: The TA’s grading script will first execute a make if a makefile is present. The grading script will then execute commands similar to the following command:

    ./schedule input_block.i > output_block.i

  This command should invoke your instruction scheduler on a file input_block.i. The output of your instruction scheduler, which has been redirected to output_block.i in the command, will be tested on CLEAR using the Lab 3 simulator.

  To test your adherence to the interface used by the grading script, run the timer described in § A-4.2 on CLEAR. The timer will invoke your schedule executable or script on the timing blocks mentioned in § A-3.

• CLEAR: Your code and makefile/shell script must compile, run, and produce correct output on Rice’s CLEAR systems. CLEAR provides a fairly standard Linux environment.

• Programming Language: You may use any programming language available on Rice’s CLEAR facility, except for Perl. To allow you to easily reuse part of your Lab 1 code in Lab 3, you should use the same programming language that you used in Lab 1.

• README: You are required to submit a README file that provides directions for building and invoking your program. Include a description of all command-line arguments required for Lab 3 as well as any additional command-line arguments that your scheduler supports. To work with the grading scripts, the first two lines in your README file must be in the following form. (Do not include white space after “//”.)

    //NAME: <your name>
    //NETID: <your NetID>
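The “Scanning the Input” requirement in the Input specification above calls for a hand-written scanner. The following minimal sketch scans one line character by character without any pattern-matching library. The token set it recognizes (alphanumeric words, commas, the `=>` arrow, and `//` comments) is an assumption for illustration; § A-2 defines the actual ILOC subset.

```python
def scan_line(line):
    """Split one ILOC line into tokens, one character at a time (no regex)."""
    tokens = []
    i, n = 0, len(line)
    while i < n:
        c = line[i]
        if c in " \t\r\n":
            i += 1                                  # skip white space
        elif c == "/" and i + 1 < n and line[i + 1] == "/":
            break                                   # comment runs to end of line
        elif c == ",":
            tokens.append(",")
            i += 1
        elif c == "=" and i + 1 < n and line[i + 1] == ">":
            tokens.append("=>")
            i += 2
        elif c.isalnum():
            start = i
            while i < n and line[i].isalnum():
                i += 1
            tokens.append(line[start:i])            # opcode, register, or constant
        else:
            raise ValueError(f"unexpected character {c!r} at column {i + 1}")
    return tokens
```

A later pass would classify each word as an opcode, register, or constant and report malformed operations with an informative message.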


Assistance with Lab 3: The COMP 412 staff is available to answer questions:
  Piazza: Post questions to Piazza, where COMP 412 students, TAs, and professors respond to questions.
  Tutorial #1: Attend Lab 3 Tutorial #1. COMP 412 staff will be available to answer Lab 3 questions.
  Tutorial #2: Attend Lab 3 Tutorial #2, which will address instruction scheduler performance issues.
  Office Hours: Visit the TAs during the hours posted on the Piazza COMP 412 course page under “Staff”.

C-3. Code Submission Requirements

Due Date: Lab 3 code is due at 11:59 PM on the code due date (Tuesday, December 3, 2019). Individual extensions to this deadline will not be granted.

Early-Submission Bonus: Code received before the code due date will be awarded an additional three points per day up to a maximum of nine points. For example, to receive nine points, you must submit your code by 11:59 PM on Saturday, November 30, 2019.

Late Penalty: Lab 3 code received after the code due date will lose three points per day. Late code will be accepted until 5:00 PM on Friday, December 6, 2019. Permission from the instructors is needed to submit code after this deadline. Contact both instructors by e-mail to request permission.

Late Penalty Waivers: To cover potential illness, travel, and other conflicts, the instructors will waive up to six days of late penalties per semester when computing your final COMP 412 grade.

Submission Details: Create a tar file that contains your submission, including (1) the source code for your instruction scheduler, (2) your makefile and/or shell script, (3) your README file, and (4) any other files that are needed for the TAs to build and test your code. If your scheduler does not work, also include a file named STATUS that describes the current state of the passes in your scheduler (complete/incomplete/not implemented, working/broken, etc.).

Name the tar file with your Rice NetID (e.g., jed12.tar for a student with the Rice NetID jed12). Email the tar file, as an attachment, to comp412code@rice.edu by 11:59 PM on the code due date. Use “Lab 3 Code” as the subject line of your e-mail submission.

Note: comp412code@rice.edu should only be used for submitting code since it is only monitored during periods when code is due. Questions should be posted to the COMP 412 Piazza site.

C-4. Code Grading Rubric & Honor Code Policy

The Lab 3 code grade accounts for 20% of your final COMP 412 grade. The Lab 3 code rubric is based on 100 points, which will be allocated as indicated below. The TAs will award more points to instruction schedulers that produce scheduled ILOC blocks that are correct and efficient, based on the output and cycle count reported by the ILOC simulator for test runs that use as input the Lab 3 report blocks and a set of TA test blocks. At their discretion, the instructors may award partial credit for schedulers with limited functionality.

• 5 points for adherence to Lab 3 specifications (§ C-2) and submission requirements (§ C-3).

• 45 points for the correctness of the code produced by your scheduler.

• 50 points for the number of cycles required to run the scheduled ILOC code produced by your scheduler on the Lab 3 simulator. Instruction schedulers with aggregate performance scores that are within 5% of the reference scheduler’s aggregate performance score will receive 50 points for performance; the remaining instruction schedulers will receive partial credit. The speed of your scheduler, as measured on the timing blocks (see § A-4.2), factors into the lab report grade, not the code grade.


Your submitted instruction scheduler source code and README file must consist of code and/or text that you wrote, not edited or copied versions of code and/or text written by others or in collaboration with others. You may not look at COMP 412 code from past semesters. You may not invoke the COMP 412 reference instruction scheduler (§ A-4.1), or any other instruction scheduler that you did not write, from your submitted code.

You may collaborate with current COMP 412 students when preparing your makefile and/or shell script and submit the results of your collaborative makefile and/or shell script efforts. All other Lab 3 code and text submitted must be your own work, not the result of collaborative efforts.

You are welcome to discuss Lab 3 with the COMP 412 staff and with students currently taking COMP 412. You are also encouraged to use the archive of test blocks produced by students in previous semesters. However, you may not make your COMP 412 labs available to students (other than COMP 412 TAs) in any form during or after this semester. In particular, you may not place your code anywhere on the Internet that is viewable by others.

“Meets Specs” Points: The rubric awards 5 points to labs that meet the specifications detailed in § C-2 and § C-3. Labs that fail to adhere to the specifications will lose points for each infraction; labs with multiple infractions may receive negative “meets specs” scores. Examples of issues that may result in a loss of points include:
• Incorrect user interface. (§ C-2 under “Name” and “Behavior”)
• Use of prohibited optimizations. (§ C-2 under “Restrictions on Optimizations”)
• Broken/missing required makefile and/or shell script. (§ C-2 under “Makefile & Shell Script”)
• Missing and/or incorrectly formatted README file. (§ C-2 under “README”)
• Broken and/or incorrectly named tar file. (§ C-3)
• Incorrect e-mail address or header for submission. (§ C-3)

Report & Test Block

R-1 Report Contents

Your Lab 3 report will contain: your replies to a brief questionnaire, two tables of data, and a graph. At the same time, you will submit an original test block that you created to test your scheduler. The specifications for the tables and graph are given in Section R-1.2. The questionnaire should be in a twelve-point font. It can be either single-spaced or double-spaced. The tables and graph should each be on a separate page.

R-1.1. Questionnaire

The Lab 3 Report Questionnaire will be available on the course web site. It will be provided in both a Microsoft Word format and a plain ASCII file. Run a spelling checker on your questionnaire.


R-1.2. Results

The Lab 3 Spreadsheet will be available on the course web site. It will be provided in Excel format. If you need it in another format, please talk to the instructors after lecture. The spreadsheet documents your experimental results, including both the effectiveness and the efficiency of your scheduler. It consists of two tables and a chart. The Questionnaire will ask specific questions, based on the data provided in the two tables and the graph.

Table 1: Instruction Scheduler Effectiveness:

• Include a table that lists, for each of the following scenarios, the number of cycles reported by the ILOC simulator for each Lab 3 report block and your ILOC test block:

  o Perform instruction scheduling on the blocks using your instruction scheduler, schedule. Use the resulting scheduled blocks as input to the ILOC simulator. (Use the ILOC simulator’s -s 1 option.)

  o Perform instruction scheduling on the blocks using the reference instruction scheduler, lab3_ref. (See § A-4.1 for a description of the reference instruction scheduler.) Use the resulting scheduled blocks as input to the ILOC simulator. (Use the ILOC simulator’s -s 1 option.)

• Test each report block to verify that the ILOC simulator produces the same output when it is run (1) with the -s 1 option on the scheduled block produced by your instruction scheduler and (2) with the -s 3 option on the original, unscheduled block.

• Indicate in either your table or in the text if (1) your instruction scheduler failed to produce code for an input block, (2) the ILOC simulator generated an error message when run with your scheduled code for a particular input block, or (3) a block’s output when the ILOC simulator is run with the -s 1 option on your scheduled block is different than the block’s output when the ILOC simulator is run with the -s 3 option on the original, unscheduled block.

• Table 1 shows the type of information that must be included in your table. The results columns are labeled with the names of the two schedulers being compared: schedule and lab3_ref. Include the results for your instruction scheduler in the two “schedule” columns.

• Check the numbers in the lab3_ref column. The implementation may have changed.

Table 1: Total Cycles Required for Lab 3 Report Blocks & Submitted Block*

Block        schedule  lab3_ref  Difference    Block        schedule  lab3_ref  Difference
             (cycles)  (cycles)  (percent)                  (cycles)  (cycles)  (percent)
report1.i       26        25         4%        report13.i      25        25         0%
report2.i       24        23         4%        report14.i      19        19         0%
report3.i       26        26         0%        report15.i      23        23         0%
report4.i       33        33         0%        report16.i      53        57        -7%
report5.i       25        25         0%        report17.i      21        21         0%
report6.i       41        43        -5%        report18.i      22        22         0%
report7.i       32        36       -11%        report19.i      42        42         0%
report8.i       15        15         0%        report20.i      25        22        14%
report9.i       26        26         0%        report21.i      72        72         0%
report10.i      43        40         8%        report22.i      52        52         0%
report11.i      32        30         7%        report23.i      69        69         0%
report12.i      21        21         0%        jed12.i         23        35       -34%

* The simulator runs used to generate the results in this table ran without errors and produced correct output.


• Compare the number of cycles reported by the simulator for scheduled blocks produced by schedule versus the number of cycles reported for blocks produced by lab3_ref. Report the results in the Table 1 “Difference (percent)” columns. Use the following formula:

    100 * (schedule’s cycles - lab3_ref’s cycles) / (lab3_ref’s cycles)
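The formula above can be checked against rows of the sample Table 1. The function name in this sketch is hypothetical; the cycle counts are taken from the sample table.

```python
def percent_difference(schedule_cycles, ref_cycles):
    """Percent difference between your scheduler and lab3_ref (negative means fewer cycles)."""
    return 100 * (schedule_cycles - ref_cycles) / ref_cycles


if __name__ == "__main__":
    # (block, schedule cycles, lab3_ref cycles) from the sample Table 1
    rows = [("report1.i", 26, 25), ("report16.i", 53, 57), ("jed12.i", 23, 35)]
    for block, mine, ref in rows:
        print(f"{block}: {percent_difference(mine, ref):.0f}%")
```

A negative result means that your scheduler produced a shorter schedule than the reference scheduler for that block.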

Table 2: Instruction Scheduler Efficiency:

• In a second table, list the results of running the timer described in § A-4.2 on CLEAR with your scheduler, schedule. Note instances where the timer did not produce results because your instruction scheduler either required five minutes or more to schedule a smaller file or failed to correctly schedule one or more blocks.

Table 2: Scheduler Timing Results *

Input      Scheduling Time (seconds)
(lines)    lab3_ref     schedule
   1000    0.001999     0.432934
   2000    0.002999     0.542918
   4000    0.005000     0.553916
   8000    0.007998     1.155824
  16000    0.014998     1.533767
  32000    0.029996     1.781729
  64000    0.059990     2.234660
 128000    0.129980     2.810573

* None of the timing runs generated error messages.

• Table 2 shows the type of information that must be included in your table. “lab3_ref” refers to the reference instruction scheduler implementation. “schedule” refers to the scheduler that you submitted for grading. (The data shown for schedule is from a scheduler written in Java.)

Java Programmers: The results presented in Table 2 and its associated graph must be obtained using CLEAR’s default JVM maximum heap size. If garbage collection is slowing down your scheduler, you may include supplementary timing data that shows your scheduler’s efficiency results for larger maximum heap sizes. Document the maximum heap sizes that you use when measuring the supplementary timings.

• Graph your scheduler timing results using a format similar to the following graph. To receive full credit, you must use a linear scale (not a logarithmic scale) for both axes of your graph.

  [Figure: Scheduler Timing Results. Scheduling Time (seconds) versus Number of Lines in Input File (thousands), with one series each for lab3_ref and schedule.]

Note: The base points for the “Instruction Scheduler Efficiency” section of the Lab 3 report are awarded for the inclusion of all requested information and the quality of your responses. The base points are not determined by the speed of your instruction scheduler, as shown in Table 2 and the associated graph. Extra credit points may be awarded for schedule implementations that demonstrate exceptional, reproducible efficiency results when compared to schedule implementations written in the same language (C, C++, Java, Python, etc.).

R-2 Test Block

You are required to submit an original, commented ILOC test block that either accurately tests one or more aspects of your scheduler, or implements an interesting algorithm or computation. Submitted test blocks may be added to the archive of contributed input blocks (§ A-3). Points awarded for your test block will be based on an instructor’s evaluation of the quality, correctness, and originality of your ILOC code, and adherence to the following specifications:

• Your test block must be submitted in a file named with your Rice NetID (e.g., jed12.i for a student with the Rice NetID jed12). Use the .i suffix to indicate that the file contains ILOC.

• Your test block must produce output (i.e., contain at least one ILOC output statement) that can be used to convincingly determine whether or not the ILOC code executed correctly.

• The instructors will assess the correctness of your test block with the Lab 1 RunAll script available on CLEAR in /clear/courses/comp412/students/lab1/scripts.

• To work with the scripts, the first four lines in your test block must be in the following form:

    //NAME: <your name>
    //NETID: <your NetID>
    //SIM INPUT: <required simulator input, for example: -i 1024 1 2 3 4 5 6 7 8 9 10>
    //OUTPUT: <expected simulator output, for example: 55 0 13>

  Do not include white space after “//”. Since the SIM INPUT string specifies input flags and constants (not variable names) that will be used when invoking the simulator, it must be in a form accepted by the simulator. The OUTPUT string must specify the string of integers (not variable names) that the simulator is expected to produce when invoked with your test block and the string specified by //SIM INPUT:. See the blocks described in § A-3 for examples of blocks that have comments that meet this specification.

• Your test block must only use ILOC operations described in § A-2. Each ILOC operation must begin on a new line. All memory accesses that occur when your test block is invoked with the input specified in //SIM INPUT: must be word aligned. Additionally, your test block must not use labels, square bracket notation, ILOC pseudo operations, or registers with undefined values.

• The comments that follow the first four lines of comments must include a brief description of:

  o The block’s input requirements (via the ILOC simulator’s -i parameter) and expected output. You can use variables in these comments to describe your block’s required input and expected output, as appropriate.

  o Either the algorithm implemented or the aspect of your scheduler tested by your test block.
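The four-line header format required above can be checked mechanically before you submit. The sketch below is a hypothetical helper, not one of the course scripts; it verifies only the required prefixes and their order.

```python
REQUIRED_PREFIXES = ["//NAME:", "//NETID:", "//SIM INPUT:", "//OUTPUT:"]

def check_header(lines):
    """Return a list of problems found in the first four lines of a test block."""
    problems = []
    for i, prefix in enumerate(REQUIRED_PREFIXES):
        # A space after "//" (for example "// NAME:") fails the prefix test.
        if i >= len(lines) or not lines[i].startswith(prefix):
            problems.append(f"line {i + 1} must start with {prefix!r}")
    return problems
```

An empty result means that the header lines begin with the expected prefixes; it does not confirm that the SIM INPUT string is accepted by the simulator or that the OUTPUT values are correct.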

R-3 Report & Test Block Submission Requirements

Your Lab 3 report and test block are due at 11:59 PM on Tuesday, December 3, 2019. Individual extensions to this deadline will not be granted. Send your report and test block as attachments to comp412report@rice.edu. Use “Lab 3 Report and Test Block” as the e-mail subject line.

Late Penalty: Late Lab 3 reports and test blocks will lose three points per day. Late reports and test blocks will automatically be accepted until 5:00 PM on Friday, December 5, 2019. Permission from the instructors is required to submit late Lab 3 reports and test blocks after this deadline. Contact both instructors by e-mail to request permission.

Late Penalty Waivers: To cover potential illness and conflicts, the instructors will waive up to six days of late penalties per semester when computing your final COMP 412 grade. Priority will be given to waiving penalties for late code over penalties for late lab reports and test blocks.

Note: comp412report@rice.edu should only be used for submitting reports, test blocks, and test-block-related bug reports since it is not monitored regularly.

R-4 Report and Test Block Grading Rubric & Honor Code Policy

The Lab 3 report and test block grade accounts for 4% of your final COMP 412 grade. The grading rubric for the Lab 3 report and test block is based on 100 points:

• 25 points for adherence to the test block specifications in § R-2 and the quality, correctness, and originality of the submitted test block.

• 75 points for adherence to the specifications in § R-1, § R-2, & § R-4 and the grammatical correctness, quality, and content of the resulting lab report.

Your submitted ILOC test block and Lab 3 report must consist of code and text that you wrote, not edited or copied versions of test blocks or text written by others. Additionally, when submitting your Lab 3 test block, you may not submit either your Lab 1 test block or a slightly altered version of your Lab 1 test block. The data used to generate the tables and graph required for the report must be produced using the tools and methodology described in § R-1.2.

You may not submit a test block or report that you jointly wrote with another person. You may not look at COMP 412 reports from past semesters. You may not make your COMP 412 reports available to students in any form during or after this semester. In particular, you may not place your lab report anywhere on the Internet where it is viewable by others.


Checklist

The following high-level checklist is provided to help you track progress on your Lab 3 code, report, and test block.

☐ Implement an instruction scheduler for a single basic block using a scheduling algorithm that you choose or devise (§ C-1).

☐ Use the Lab 3 ILOC simulator (§ A-2) to ensure that the scheduled code produced by your program is correct (§ C-1 & C-2) and to measure the number of cycles that the ILOC simulator requires to execute each scheduled block. A variety of ILOC input blocks (§ A-3) are available for you to use when testing your instruction scheduler.

☐ Test your instruction scheduler on CLEAR. It will be graded on CLEAR.

☐ Create an original ILOC test block, as described in § R-2. Report the number of cycles required by the ILOC simulator to run both your original test block and the scheduled version of your test block, as described in § R-1.2.

☐ Report the number of cycles required by the Lab 3 ILOC simulator to run on CLEAR after instruction scheduling each Lab 3 report block, as described in § R-1.2.

☐ Invoke the timer described in § A-4.2 on CLEAR to generate the efficiency data that you are required to include as part of your lab report. Note that you should run the timing tests long before you submit your final code. These blocks provide a useful test of your algorithms and implementation at scale.

☐ Submit a lab report and test block following the specifications in sections R-1 through R-4.


Appendix

A-1. ILOC Virtual Machine

The ILOC machine has disjoint address spaces for instructions and data. The Lab 3 simulator (§A-2) provides 200,000 registers, numbered r0 through r199999, and four megabytes of available data memory; accesses to memory must be word aligned. By convention, ILOC input blocks do not access memory addresses above 32764.

ILOC Microarchitecture. To make the task of scheduling more interesting, the target machine’s behavior is more complicated than it was in Lab 1. The ILOC virtual machine has two functional units, f0 and f1. These two units are identical, except:

• Only f0 can execute the load and store operations. Either f0 or f1 can execute loadI.
• Only f1 can execute the mult operation.
• Only one output operation can execute per cycle. It can execute on either f0 or f1.

Thus, most instructions can execute on either functional unit. Only the long latency instructions are constrained to specific functional units. The restriction on output is needed to ensure determinism in the print stream, which is necessary given that your scheduler is required to preserve the ordering among the output instructions.

The ILOC machine executes instructions in parallel according to the syntax from Appendix A of EaC2e. Thus, [op1; op2] is a single instruction containing two operations, op1 and op2, that issue concurrently. The order of the operations is not important as long as the machine constraints are not violated. If the second operation of a pair must execute on f0 and the first can execute on either f0 or f1, the dispatch unit will send them to the appropriate units.
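The functional-unit rules above can be sketched as a small pairing check. This Python is only an illustration, not part of the lab materials: the names F0_ONLY, F1_ONLY, allowed_units, and assign_units are hypothetical, and a real scheduler would likely fold this test into its ready-list logic.

```python
# Hypothetical helper names; the unit restrictions are the ones listed above.
F0_ONLY = {"load", "store"}   # only f0 executes load and store
F1_ONLY = {"mult"}            # only f1 executes mult


def allowed_units(opcode):
    """Return the set of functional units that may execute opcode."""
    if opcode in F0_ONLY:
        return {"f0"}
    if opcode in F1_ONLY:
        return {"f1"}
    return {"f0", "f1"}       # loadI, add, sub, shifts, output, nop


def assign_units(op_a, op_b):
    """Try to place two opcodes in the same cycle.

    Returns a dict mapping unit name to opcode, or None when the pair
    cannot issue together under the constraints described above.
    """
    if op_a == "output" and op_b == "output":
        return None           # at most one output per cycle
    for unit_a, unit_b in (("f0", "f1"), ("f1", "f0")):
        if unit_a in allowed_units(op_a) and unit_b in allowed_units(op_b):
            return {unit_a: op_a, unit_b: op_b}
    return None
```

For example, assign_units("add", "load") places the add on f1 and the load on f0, mirroring the dispatch behavior described above, while assign_units("load", "store") returns None because both operations need f0.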

The machine has several other important quirks. load and store are non-blocking; that is, if a load is issued at cycle i, execution continues normally unless the code attempts to execute a reference to the register defined by the load prematurely. Any reference to the result of a load that occurs before the end of its latency (that is, before cycle i + 5 for Lab 3) will result in the previous value being used. The store instruction is presumed to execute immediately, although its results do not reach memory until the end of its latency period. Thus, if a store to location p is scheduled in cycle i, and a load from p is scheduled before i + 5, the results are unpredictable and depend on the setting of the stall command line flag (-s). (See the Lab 3 simulator’s documentation (§A-2) for details related to the simulator’s -s command line flag.) Use the Lab 3 simulator to run some trial ILOC blocks to test the simulator’s behavior.

The result of a mult instruction is not available before the end of the mult’s latency period; attempting to reference it prematurely will result in an incorrect value. In this sense, mult operates in a fashion similar to load.
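The timing rule above (a result produced by an operation issued at cycle i is safe to use from cycle i + latency onward) can be checked mechanically. The sketch below is a hedged illustration: the latencies are copied from the operation table in §A-2, while the function names and the shapes of the schedule and deps arguments are assumptions made for this example, not a required data structure.

```python
# Latencies in cycles, copied from the operation table in §A-2.
LATENCY = {"load": 5, "loadI": 1, "store": 5, "add": 1, "sub": 1,
           "mult": 3, "lshift": 1, "rshift": 1, "output": 1, "nop": 1}


def earliest_use_cycle(issue_cycle, opcode):
    """First cycle in which a dependent operation can safely issue."""
    return issue_cycle + LATENCY[opcode]


def premature_uses(schedule, deps):
    """List the (producer, consumer) pairs whose consumer issues too early.

    schedule maps an operation id to (issue_cycle, opcode); deps is a
    list of (producer_id, consumer_id) pairs. Both shapes are
    illustrative assumptions.
    """
    bad = []
    for producer, consumer in deps:
        p_cycle, p_opcode = schedule[producer]
        c_cycle, _ = schedule[consumer]
        if c_cycle < earliest_use_cycle(p_cycle, p_opcode):
            bad.append((producer, consumer))
    return bad
```

With a load issued in cycle 1, a dependent add issued in cycle 3 is flagged, whereas an add that depends on a mult issued in cycle 1 is accepted from cycle 4 onward.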

Support For Testing With Interlocks. When you run the original code for a block to determine the correct results, use the simulator’s -s 3 option, which is the simulator’s default option. This option will ensure that the simulator uses all of the interlocks needed to produce the correct answer.

When you test the code produced by your scheduler, use the -s 1 option, which is the option that the TAs will use when they grade your code. (If your code is correctly scheduled, the computed values will be the same as those found with the original code using the -s 3 option.)


A-2. ILOC Simulator & Subset

ILOC Simulator: An ILOC simulator, its source, and documentation are available on CLEAR. The executable simulator is /clear/courses/comp412/students/lab3/sim. Its source and documentation are in the /clear/courses/comp412/students/lab3/simulator directory. See §7.2 of the simulator documentation for Lab 3 configuration details.

The Lab 3 simulator, which is different from the Lab 1 simulator, builds and executes on CLEAR. You can either run the simulator as /clear/courses/comp412/students/lab3/sim, or copy it into your local directory. The simulator appears to work on other OS implementations, but it is not guaranteed to do so. Your instruction scheduler will be tested and graded on CLEAR, so you should test it on CLEAR.

ILOC Subset: Lab 3 input and output files consist of a single basic block² of code written in a subset of ILOC. Your scheduler must support and restrict itself to the following case-sensitive operations:

Syntax                   Meaning                    Latency
load    r1 => r2         r2 ← MEM(r1)               5
loadI   x => r2          r2 ← x                     1
store   r1 => r2         MEM(r2) ← r1               5
add     r1, r2 => r3     r3 ← r1 + r2               1
sub     r1, r2 => r3     r3 ← r1 - r2               1
mult    r1, r2 => r3     r3 ← r1 * r2               3
lshift  r1, r2 => r3     r3 ← r1 << r2              1
rshift  r1, r2 => r3     r3 ← r1 >> r2              1
output  x                prints MEM(x) to stdout    1
nop                      idle for one cycle         1

All register names have an initial lowercase r followed immediately by a non-negative integer. Leading zeros in the register name are not significant; thus, r017 and r17 refer to the same register. Arguments that do not begin with r, which appear as x in the table above, are assumed to be non-negative integer constants in the range 0 to 2^31 - 1. Assume a register-to-register memory model (see page 250 in EaC2e) with one class of registers.

ILOC test blocks contain commas and assignment arrows, which are composed of an equal sign followed by a greater than symbol, as shown (=>).

Each ILOC operation in an input block must begin on a new line.³ Blanks and tabs are treated as whitespace. ILOC opcodes must be followed by whitespace (any combination of blanks or tabs). Whitespace preceding and following all other symbols is optional. Whitespace is not allowed within operation names, register names, or the assignment arrow. A double slash (//) indicates that the rest of the line is a comment and can be discarded. Empty lines and nops in input files may also be discarded.

² A basic block is a maximal length sequence of straight-line (i.e., branch-free) code. We use the terms block and basic block interchangeably when the meaning is clear.
³ Carriage returns (CR, \r, 0x0D) and line feeds (LF, \n, 0x0A) may appear as valid characters in end-of-line sequences.
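The lexical rules above are compact enough to sketch as a line parser. The Python below is a minimal illustration under stated assumptions, not the required front end (the lab expects you to reuse your Lab 1 reader): parse_line and _SHAPES are hypothetical names, the sketch does not check the 0 to 2^31 - 1 constant range, and it does not handle the square bracket notation.

```python
import re

_REG = r"r(\d+)"
# Operand shape for each opcode in the Lab 3 subset (see the table above).
_SHAPES = {
    "load": rf"{_REG}\s*=>\s*{_REG}",
    "store": rf"{_REG}\s*=>\s*{_REG}",
    "loadI": rf"(\d+)\s*=>\s*{_REG}",
    "output": r"(\d+)",
    "nop": r"",
}
for _name in ("add", "sub", "mult", "lshift", "rshift"):
    _SHAPES[_name] = rf"{_REG}\s*,\s*{_REG}\s*=>\s*{_REG}"


def parse_line(line):
    """Parse one input line into (opcode, [ints]), or None.

    Returns None for blank lines, comment-only lines, and nops, all of
    which may be discarded. Raises ValueError for any other line that
    does not match the subset.
    """
    text = line.split("//", 1)[0].strip()   # drop comment, blanks, CR/LF
    if not text:
        return None
    # The opcode must be followed by blanks or tabs unless nothing follows.
    head = re.match(r"([A-Za-z]+)(?:[ \t]+(.*))?$", text)
    if head is None or head.group(1) not in _SHAPES:
        raise ValueError(f"unrecognized operation: {line!r}")
    opcode, rest = head.group(1), head.group(2) or ""
    operands = re.fullmatch(_SHAPES[opcode], rest)
    if operands is None:
        raise ValueError(f"bad operands: {line!r}")
    if opcode == "nop":
        return None
    # int() drops leading zeros, so r017 and r17 both become 17.
    return opcode, [int(g) for g in operands.groups()]
```

Registers and constants are both returned as plain integers here; which positions hold registers follows from the opcode. A real front end would probably keep that distinction explicit.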


The syntax of ILOC is described in further detail in Section 3 of the ILOC simulator document, and, at a higher level, in Appendix A of EaC2e. (See the grammar for Operation that starts on p. 726.) Note that your scheduler is not required to support labels on operations. Your scheduler is not required to support square bracket notation in input files, but it must be able to produce output files that use the square bracket notation.

Simulator Usage Example: If an input file named test1.i is in your present working directory, you can invoke the simulator on CLEAR on test1.i in the following manner:

/clear/courses/comp412/students/lab3/sim -s 3 -i 2048 1 2 3 < test1.i

This command will cause the simulator to execute the instructions in test1.i, print the values corresponding to ILOC output instructions, and display the total number of cycles, operations, and instructions executed.

The -s NUM parameter is optional. It controls the conditions under which the simulator stalls. The default used by the Lab 3 ILOC simulator when it is run without a -s flag is -s 3. When you run the original code for an ILOC test block to determine the correct output results, you should use the default (-s 3 option); when you test the code produced by your Lab 3 scheduler, you should use the -s 1 option. The TAs will use these two flags when they grade the output of your scheduler, so you should use these two flags when you test your scheduler.

The -i parameter is used to fill memory, starting at the memory location indicated by the first argument that appears after -i, with the initial values listed after the memory location. The Lab 3 ILOC simulator has byte addressable memory, but the ILOC subset for Lab 1 and Lab 3 only allows word-aligned accesses. So, in the above example, 1 will be written to memory location 2048, 2 to location 2052, and 3 to location 2056. (This means that, before the memory locations are overwritten during program execution, "output 2048" will cause 1 to be printed by the simulator, "output 2052" will cause 2 to be printed, etc.) See the ILOC simulator document for additional information about supported command-line options. (Note that the command-line options -d, -r, -x, and -c are not relevant for Lab 3.)
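The address arithmetic behind -i can be made concrete with a short sketch. The Python below is illustrative only: initial_memory and sim_command are hypothetical helper names, and the four-byte word size is inferred from the example addresses (2048, 2052, 2056) rather than stated elsewhere in this section.

```python
WORD_BYTES = 4  # inferred from the example addresses 2048, 2052, 2056


def initial_memory(start_address, values):
    """Map each value listed after -i to the address it is written to."""
    return {start_address + WORD_BYTES * i: v for i, v in enumerate(values)}


def sim_command(sim_path, stall_mode, start_address, values):
    """Build the argument list for one simulator run.

    Only the -s and -i flags described above are used. The input block
    itself is supplied on standard input, as the "< test1.i" redirection
    in the usage example shows.
    """
    args = [sim_path, "-s", str(stall_mode), "-i", str(start_address)]
    return args + [str(v) for v in values]
```

Calling initial_memory(2048, [1, 2, 3]) reproduces the layout described above, and building one command with stall mode 3 for the original block and one with stall mode 1 for the scheduled block matches the two configurations used for grading.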

A-3. ILOC Input Blocks

A collection of ILOC input blocks is available on CLEAR. The comments at the beginning of each input block specify whether or not the input block expects command-line input data via the ILOC simulator’s -i parameter. If you experience problems with the input blocks, please submit bug reports to comp412report@rice.edu so that the COMP 412 staff can either fix or delete the problematic blocks.

Input Blocks for Lab 3 Report: The Lab 3 report blocks must be used to produce Table 1 for your Lab 3 report, as described in §R-1.3. The Lab 3 report blocks are available in:

/clear/courses/comp412/students/lab3/report

The Lab 1 timing blocks, which will be used by the timer described in §A-4.2 to produce timing information for the Lab 3 report, as described in §R-1.3, are available in:

/clear/courses/comp412/students/lab1/timing

Other Available ILOC Input Blocks: Since the ILOC subset used in Lab 1 is the same as the ILOC subset used in Lab 3, you can further test your instruction scheduler by using Lab 1 and Lab 3 ILOC test blocks available on CLEAR in subdirectories of:

/clear/courses/comp412/students/lab1
/clear/courses/comp412/students/ILOC


A-4. Tools

A-4.1. Reference Instruction Scheduler

To help you understand the functioning of an instruction scheduler for a single basic block and to provide an exemplar for your implementation and debugging efforts, we provide the COMP 412 reference instruction scheduler. The reference instruction scheduler is a C implementation of an instruction scheduler for a single basic block. You can improve your understanding of instruction scheduling by examining its output on small blocks. You will also use the reference instruction scheduler to determine how well your instruction scheduler performs in terms of effectiveness (quality of the schedules that your instruction scheduler generates) and efficiency (runtime of your instruction scheduler).

The COMP 412 reference instruction scheduler can be invoked on CLEAR as follows:

/clear/courses/comp412/students/lab3/lab3_ref <file name>

where <file name> both specifies the name of the input file and is a valid Linux pathname relative to the current working directory. For a description of the flags supported by the COMP 412 reference instruction scheduler, enter the following command on CLEAR:

/clear/courses/comp412/students/lab3/lab3_ref -h

A script for invoking lab3_ref on a directory of ILOC test blocks is available on CLEAR:

/clear/courses/comp412/students/lab3/scripts

Note that lab3_ref can only be run on CLEAR and that it does not produce scheduled code for test blocks that use undefined registers. The script will print an error message when lab3_ref detects errors that prevent it from producing scheduled code. The log file generated by the script will contain additional information regarding the nature of the errors detected by lab3_ref.

A-4.2. Timer

To produce efficiency data for your Lab 3 report, you will use the Lab 3 timer to determine how the runtime performance of your instruction scheduler compares to the runtime performance of the reference instruction scheduler, lab3_ref, over eight timing blocks (T1k.i, T2k.i, T4k.i, T8k.i, T16k.i, T32k.i, T64k.i, and T128k.i).

The Lab 3 timer can only be run on CLEAR. To use the timer, copy the tar file on CLEAR available at

/clear/courses/comp412/students/lab3/timer.tar

into a directory in your own file space on CLEAR. Unpack the tar file. Read the README file. Invoke the timer as follows:

./timer <f1>

where <f1> specifies the pathname of your instruction scheduler (or the shell script that invokes it). The timer assumes that your scheduler accepts the command-line arguments and input described in §C-2. <f1> must be a valid Linux pathname relative to the current working directory. The eight timing blocks included in the tar file must appear in the same directory as the timer.

Note: The timer will run <f1> on the input files in order from the smallest to the largest file and print timing results when it completes each input file. The timer will quit if your instruction scheduler requires five minutes or more to process a single input file.


A-4.3. Graphviz

The Graphviz tool is available for various forms of Unix, for Windows, and for the Mac OS from:

http://www.graphviz.org

It is not currently installed on any of the Rice-supported platforms. You can download a copy for your own machine. If this poses a problem, please see the teaching assistants.

The point of using Graphviz on small input blocks early in your lab is that it allows you to see the dependence graph and understand it in a way that is hard to achieve with strictly textual output. As secondary benefits, it will provide you with both figures that you can incorporate into your lab report and experience using the kinds of tools that compiler writers use in their work.

Graphviz takes as input a dot file: a text file that represents, for our purposes, the nodes and edges in a directed graph. Nodes are named with arbitrary integers; edges are named with a pair of numbers joined by an “->” operator that specifies the direction of the edge. The full specification of the dot language can be found on the Graphviz website. For this lab, you will work with a small subset of the dot language. Here is a simple dot file and its output (redrawn in Microsoft Word):

digraph testcase1 {
    1 [label = "B1"];
    2 [label = "B2"];
    3 [label = "B3"];
    4 [label = "B4"];
    5 [label = "B5"];
    1 -> 2;
    2 -> 3;
    2 -> 4;
    3 -> 5;
    4 -> 5;
}

Input dot File Output Drawing

Note that the “label” field can be arbitrarily long and can include newlines, so that a dot file with the following node specification:

1 [label="add r1 r2 => r3\nr1 ready at cycle 2\nr2 ready at cycle 7\n"];

should produce a node that looks like:

Using this feature of Graphviz, you can display as much information as you need to understand the graphs that your lab is building. Additional annotations on the nodes can greatly simplify your debugging effort.
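Emitting a dot file of the kind described above takes only a few lines. The following Python is a hedged sketch: to_dot is a hypothetical helper, and the way it receives node labels and edges is an assumption about how a scheduler might store its dependence graph.

```python
def to_dot(name, labels, edges):
    """Render a dependence graph as dot text.

    labels maps an integer node id to a list of label lines; edges is a
    list of (source, sink) node-id pairs. Label lines are joined with a
    literal backslash-n so that Graphviz draws them on separate lines.
    """
    lines = [f"digraph {name} {{"]
    for node in sorted(labels):
        text = "\\n".join(labels[node]).replace('"', '\\"')
        lines.append(f'  {node} [label = "{text}"];')
    for src, dst in edges:
        lines.append(f"  {src} -> {dst};")
    lines.append("}")
    return "\n".join(lines) + "\n"
```

Writing the returned string to a file such as graph.dot and running `dot -Tpdf graph.dot -o graph.pdf` on a machine with Graphviz installed should produce a drawing; the file names here are placeholders.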

Note: The use of Graphviz is optional, but strongly recommended for debugging small input blocks. Avoid using Graphviz for large input blocks because it takes significant computational time to produce graphs for large input blocks and, due to the large numbers of nodes involved, the large graphs are difficult to display and decipher.

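[Output drawing for testcase1: nodes B1 through B5, with B1 at the top, B2 below it, B3 and B4 side by side beneath B2, and B5 at the bottom.]

[Example node drawing: a single node whose three label lines read "add r1 r2 => r3", "r1 ready at cycle 2", and "r2 ready at cycle 7".]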


A-5. Instruction Scheduler Timing Results

The timing results shown in Figure 1 were produced using instruction schedulers submitted by COMP 412 students in Fall 2014 and Fall 2015. The featured instruction schedulers all produced correct code and received high scores for both effectiveness and efficiency. These graphs document the most efficient C, C++, Java, Haskell, OCaml, Python, Racket, Ruby, and R implementations submitted and show the expected relative speed differences and the curve shapes for these languages. When discussing the impact of your programming language choice on the efficiency of your instruction scheduler implementation, if your instruction scheduler is written in C, C++, Java, Haskell, OCaml, Python, Racket, Ruby, or R, you are expected to compare your instruction scheduler timing results to the Figure 1 results for instruction scheduler(s) written in your chosen programming language. (See §R-1.3.) While direct comparisons with results from past years do not account for software and hardware changes on CLEAR that may have occurred since the results were generated, the observed trends are consistent enough to justify rough comparisons.

When viewing the timing results in Figure 1, note the difference in the y-axis scales across the graphs. If your instruction scheduler is written in Java, recall that to judge the asymptotic behavior of a Java program, you need to look at the relationship between runtime and data set size on the large inputs, not the small inputs.
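One rough way to read asymptotic behavior from timer output is to compare runtimes on successive inputs. The sketch below assumes that each timing block is about twice the size of the previous one, as the names T1k.i through T128k.i suggest; doubling_ratios is a hypothetical helper, and it expects measurements you have collected yourself.

```python
def doubling_ratios(times):
    """Return the ratio of each runtime to the one before it.

    times lists runtimes, in seconds, for inputs that double in size at
    each step. Ratios near 2 suggest roughly linear growth; ratios near
    4 suggest roughly quadratic growth.
    """
    return [later / earlier for earlier, later in zip(times, times[1:])]
```

In keeping with the note about Java above, the ratios for the largest inputs are the ones to trust; fixed startup costs can dominate the runtimes on small inputs and distort the early ratios.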


Figure 1: 2014 & 2015 Instruction Scheduler Timing Results

[Figure 1 comprises eight plots, one per implementation language. Each plots Scheduling Time (seconds) against Number of Lines in Input File (thousands), with the x-axis running from 0 to 140. Panel titles, y-axis ranges, and series shown:
• C Scheduler Timing Results: 0 to 0.1 seconds; 2014, 2015
• C++ Scheduler Timing Results: 0 to 0.45 seconds; 2014, 2015
• Java Scheduler Timing Results: 0 to 3.5 seconds; 2014 (two series), 2015
• Ruby Scheduler Timing Results: 0 to 90 seconds; 2014
• R Scheduler Timing Results: 0 to 300 seconds; 2014
• OCaml Scheduler Timing Results: 0 to 18 seconds; 2015
• Python Scheduler Timing Results: 0 to 12 seconds; 2014, 2015
• Haskell Scheduler Timing Results: 0 to 8 seconds; 2015]