www.chameleoncloud.org
THE MANY COLORS OF CHAMELEON
Kate Keahey
Mathematics and CS Division, Argonne National Laboratory
CASE, University of Chicago
[email protected]
February 6, 2019
Chameleon User Meeting

THE MANY COLORS OF CHAMELEON · 2019-02-06 · Powered by OpenStack with bare metal reconfiguration (Ironic) Chameleon team contribution recognized as official OpenStack component





CHAMELEON IN A NUTSHELL

• We like to change: testbed that adapts itself to your experimental needs
    • Deep reconfigurability (bare metal) and isolation (CHI) – but also ease of use (KVM)
    • CHI: power on/off, reboot, custom kernel, serial console access, etc.
• We want to be all things to all people: balancing large-scale and diverse
    • Large-scale: large homogeneous partition (~15,000 cores), 5 PB of storage distributed over 2 sites (now +1!) connected with 100G network…
    • …and diverse: ARMs, Atoms, FPGAs, GPUs, Corsa switches, etc.
• We want to last: cost-effective to deploy, operate, and enhance
    • Powered by OpenStack with bare metal reconfiguration (Ironic)
    • Chameleon team contribution recognized as official OpenStack component
• We live to serve: open, production testbed for Computer Science Research
    • Started in 10/2014, testbed available since 07/2015, renewed in 10/2017
    • Currently ~3,000 users, ~500 projects, ~100 institutions


CHAMELEON HARDWARE

[Figure: Chameleon hardware diagram. Labels: Chicago; Austin; Chameleon Associate Site, Northwestern; Chameleon Core Network, 100 Gbps uplink to public network (each site); Core Services, 3.5 PB storage system; Core Services, 0.5 PB storage system; Heterogeneous Cloud Units with GPUs (K80, M40, P100), FPGAs, NVMe, SSDs, IB, ARM, Atom, low-power Xeon; Haswell Standard Cloud Unit, 42 compute and 4 storage (x2 and x10); SkyLake Standard Cloud Unit, 32 compute and Corsa switch (x2 and x1); GENI and other partners.]


EXPERIMENTAL WORKFLOW

• Discover resources
    • Fine-grained
    • Complete
    • Up-to-date
    • Versioned
    • Verifiable
• Allocate resources
    • Advance reservations
    • On-demand
    • Isolation
    • Across resource types
• Configure and interact
    • Deeply reconfigurable
    • Appliance catalog
    • Snapshotting
    • Complex Appliances
    • Network Isolation
• Monitor
    • Hardware metrics
    • Fine-grained data
    • Aggregate
    • Archive

CHI = 65%*OpenStack + 10%*G5K + 25%*“special sauce”
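
The "advance reservations" item in the allocation step can be illustrated with a small capacity check. This is only a sketch under stated assumptions, not Chameleon's reservation service: the `Lease` type, the `can_reserve` function, the hour-based time units, and the capacity value are all hypothetical names and numbers chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lease:
    """Hypothetical lease: `nodes` nodes held during [start, end) in hours."""
    start: int
    end: int
    nodes: int


def can_reserve(existing, request, capacity):
    """Return True if `request` fits within `capacity` for its whole window.

    Node usage only increases at the start of a lease, so it is enough to
    check the start of the request and the start of every existing lease
    that begins inside the requested window.
    """
    checkpoints = {request.start} | {
        lease.start
        for lease in existing
        if request.start < lease.start < request.end
    }
    for t in checkpoints:
        in_use = sum(l.nodes for l in existing if l.start <= t < l.end)
        if in_use + request.nodes > capacity:
            return False
    return True


# Illustrative numbers only: a pool of 42 nodes with two leases already booked.
booked = [Lease(0, 10, 30), Lease(8, 20, 10)]
print(can_reserve(booked, Lease(5, 9, 5), capacity=42))     # overlaps both leases
print(can_reserve(booked, Lease(10, 15, 30), capacity=42))  # starts after the first lease ends
```

A request is rejected as soon as one checkpoint exceeds capacity, which is the property that lets a reservation be promised ahead of time rather than granted on a first-come basis.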


IMPROVING THE PLATFORM: NETWORKING

• Multi-tenant networking allows users to provision isolated L2 VLANs and manage their own IP address space (since Fall 2017)
• Stitching dynamic VLANs from Chameleon to external partners (ExoGENI, ScienceDMZs) (since Fall 2017)
• VLANs + AL2S connection between UC and TACC for 100G experiments (since Spring 2018)
• BYOC – Bring Your Own Controller: isolated user controlled virtual OpenFlow switches (since Summer 2018)
• Managing multiple stitches (since Fall 2018)
• VLAN reservations (since Winter 2019), floating IP reservations coming soon!


BRING-YOUR-OWN-CONTROLLER (BYOC)

• Software Defined Networking (SDN)
• Corsa Virtual Forwarding Context (VFC)
• OpenFlow 1.3
• User defined controller
    • Within Chameleon or anywhere on the Internet
• Available on Skylake nodes
• Supported capabilities
    • SDN experiments
    • Experiments requiring non-standard networking capabilities

[Figure: BYOC diagram. Labels: Standard Cloud Unit; Corsa Switch; VFC (Tenant A); VFC (Tenant B); Compute Nodes (Tenant A); Compute Nodes (Tenant B); OpenFlow Controller (Tenant A), Ryu; OpenFlow Controller (Tenant B).]
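
To make concrete what a user-defined controller programs into its virtual switch, the following is a toy model of OpenFlow-style flow-table lookup: the highest-priority matching entry wins, and a priority-0 entry with an empty match acts as the table-miss entry. It is a plain-Python sketch, not Ryu or Corsa code; `FlowEntry`, `lookup`, the field names, and the action strings are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FlowEntry:
    """Hypothetical flow entry: an empty `match` matches every packet."""
    priority: int
    match: dict   # header field -> required value
    action: str


def lookup(table, packet):
    """Return the action of the highest-priority entry matching `packet`."""
    candidates = [
        entry
        for entry in table
        if all(packet.get(name) == value for name, value in entry.match.items())
    ]
    if not candidates:
        # With no table-miss entry installed, unmatched packets are dropped.
        return "drop"
    return max(candidates, key=lambda entry: entry.priority).action


# Illustrative table a tenant's controller might install.
table = [
    FlowEntry(0, {}, "send_to_controller"),  # table-miss entry
    FlowEntry(10, {"in_port": 1}, "output:2"),
    FlowEntry(20, {"in_port": 1, "eth_dst": "aa:bb:cc:dd:ee:ff"}, "output:3"),
]

print(lookup(table, {"in_port": 1, "eth_dst": "aa:bb:cc:dd:ee:ff"}))
print(lookup(table, {"in_port": 1, "eth_dst": "11:22:33:44:55:66"}))
print(lookup(table, {"in_port": 5}))
```

Because each tenant gets its own VFC, one tenant's table like this is independent of any other tenant's.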


EXTERNAL STITCHING

• Layer 2 VLANs from Chameleon to external partners
    • ExoGENI, ScienceDMZs, ESnet, and AL2S
• VFCs with multiple L2 stitched links
    • Named VFCs

[Figure: external stitching diagram. Labels: Chicago; Austin; Standard Cloud Unit; Chameleon Core Network, 100 Gbps uplink to public network; Internet2 AL2S, GENI, Future Partners; VFC (Tenant A); VFC (Tenant B); Compute Nodes (Tenant A); Compute Nodes (Tenant B); OpenFlow Controller (Tenant A), Ryu; OpenFlow Controller (Tenant B).]


NETWORKING PATTERNS MADE EASY

• Sharednet1
    • Pre-configured local shared network
• Sharedwan1
    • Stitched shared network
    • Pre-configured
    • Connects UC and TACC
    • Up to 100 Gbps
    • Ask how to add it to your project!

[Figure: shared networks diagram. Labels: Chicago; Austin; Chameleon Core Network, 100 Gbps uplink to public network; Standard Cloud Unit with Compute Nodes at each site; sharednet1 at each site; sharedwan1 between the sites.]


IMPROVING THE PLATFORM: OTHER FEATURES

• Lease management: adding/removing nodes to/from a lease, notifications of lease start and impending termination
• Advance reservation orchestration
• Power and temperature metrics
• Whole disk image boot for ARM nodes
• New appliances (Hadoop, ExoGENI, BYOC examples) and a richer set of appliance features: FUSE module and networking support
• Usability features: multi-region configuration, single login to all web interfaces, better access to information, better error handling, software self-updates, better appliance publishing, documentation overhaul, etc.
• Chameleon traces are now available at www.scienceclouds.org


BEYOND THE PLATFORM: BUILDING AN ECOSYSTEM

• Helping hardware providers interact
    • Bring Your Own Hardware (BYOH)
    • CHI-in-a-Box: deploy your own Chameleon site
• Helping our users interact – with us but primarily with each other
    • Facilitating contributions of appliances, tools, and other artifacts: appliance catalog, blog as a publishing platform, and eventually notebooks
    • Integrating tools for experiment management
    • Making reproducibility easier
    • Improving communication – not just with us but with our users as well


CHI-IN-A-BOX

• CHI-in-a-box: packaging a commodity-based testbed
    • First released in summer 2018, continuously improving
• CHI-in-a-box scenarios
    • Independent testbed: package assumes independent account/project management, portal, and support
    • Chameleon extension: join the Chameleon testbed (currently serving only selected users), and includes both user and operations support
    • Part-time extension: define and implement contribution models
    • Part-time Chameleon extension: like Chameleon extension but with the option to take the testbed offline for certain time periods (support is limited)
• Adoption
    • New Chameleon Associate Site at Northwestern since fall 2018 – new networking!
    • Two organizations working on independent testbed configuration


REPRODUCIBILITY DILEMMA

• Reproducibility as side-effect: lowering the cost of repeatable research
    • Example: Linux “history” command
    • From a meandering scientific process to a recipe
• Reproducibility by default: documenting the process via interactive papers

Should I invest in more new research instead?

Should I invest in making my experiments repeatable?
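
The Linux “history” example on this slide, turning a meandering session into a recipe, can be sketched as a small filter. This is an illustration under assumptions, not a Chameleon tool: the `EXPLORATORY` command set, the `history_to_recipe` function, and the sample session are hypothetical, and treating an immediately repeated command as redundant is a simplification.

```python
import re

# Hypothetical set of read-only commands that do not change experiment state.
EXPLORATORY = {"ls", "pwd", "cat", "less", "man", "history", "top"}


def history_to_recipe(history_lines):
    """Reduce `history` output to the commands worth replaying, in order."""
    recipe = []
    for line in history_lines:
        # Strip the entry number that `history` prints, e.g. "  101  make".
        command = re.sub(r"^\s*\d+\s+", "", line).strip()
        if not command:
            continue
        if command.split()[0] in EXPLORATORY:
            continue
        # Skip an immediate repeat of the previous command.
        if recipe and recipe[-1] == command:
            continue
        recipe.append(command)
    return recipe


# Made-up session for illustration.
session = [
    "  101  ls",
    "  102  ./configure --prefix=/opt/exp",
    "  103  make",
    "  104  make",
    "  105  cat results.txt",
    "  106  ./run_experiment.sh",
]
print("\n".join(history_to_recipe(session)))
```

The point of the sketch is that the record already exists as a side effect of working; the only added cost is pruning it.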


REPEATABILITY MECHANISMS IN CHAMELEON

• Testbed versioning (collaboration with Grid’5000)
    • Based on representations and tools developed by G5K
    • >50 versions since public availability – and counting
    • Still working on: better firmware version management
• Appliance management
    • Configuration, versioning, publication
    • Appliance meta-data via the appliance catalog
    • Orchestration via OpenStack Heat
• Monitoring and logging
• However… the user still has to keep track of this information


KEEPING TRACK OF EXPERIMENTS

• Everything in a testbed is a recorded event
    • The resources you used
    • The appliance/image you deployed
    • The monitoring information your experiment generated
    • Plus any information you choose to share with us: e.g., “start power_exp_23” and “stop power_exp_23”
• Experiment précis: information about your experiment made available in a “consumable” form
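
The idea of bracketing an experiment with user markers such as “start power_exp_23” and “stop power_exp_23” can be sketched as follows. This is an illustration only, not the actual précis implementation: the `Event` type, the `precis` function, the source names, and the sample log entries are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """Hypothetical recorded event."""
    time: float    # seconds since the start of the log
    source: str    # e.g. "openstack", "monitoring", "user"
    message: str


def precis(events, name):
    """Return the events between the user's start and stop markers for `name`."""
    start = stop = None
    for event in events:
        if event.source == "user" and event.message == f"start {name}":
            start = event.time
        elif event.source == "user" and event.message == f"stop {name}":
            stop = event.time
    if start is None or stop is None:
        raise ValueError(f"missing start/stop marker for {name}")
    selected = [e for e in events if start <= e.time <= stop]
    return sorted(selected, key=lambda e: e.time)


# Made-up log entries for illustration.
log = [
    Event(0.0, "openstack", "lease started"),
    Event(5.0, "user", "start power_exp_23"),
    Event(6.0, "openstack", "instance deployed"),
    Event(7.5, "monitoring", "power sample recorded"),
    Event(9.0, "user", "stop power_exp_23"),
    Event(12.0, "openstack", "lease ended"),
]
for event in precis(log, "power_exp_23"):
    print(event.time, event.source, event.message)
```

The markers are the only thing the user supplies; everything else in the window is already recorded by the testbed.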


REPEATABILITY: EXPERIMENT PRÉCIS

[Figure: experiment précis diagram. Labels: Experiment précis; OpenStack services; Instance monitoring; Infrastructure monitoring; User events; Orchestrator (Heat); Store and share.]


EXPERIMENT PRÉCIS: A CASE STUDY

Based on Wang et al., Understanding and Auto-Adjusting Performance-Sensitive Configurations. ASPLOS, 2018



INTERACTIVE PAPERS

• What does it mean to document a process?
• Some requirements
    • Easy to work with: human readable/modifiable format
    • Integrates well with ALL aspects of experiment management
    • Bit by bit replay – allows for bit by bit modification (and introspection) as well – element of interactivity
    • Support storytelling: allows you to explain your experiment design and methodology choices
    • Has a direct relationship to the actual paper that gets written
    • Can be version controlled
    • Sustainable, a popular open source choice
• Implementation options
    • Orchestrators: Heat, the dashboard, and OpenStack Flame
    • Notebooks: Jupyter, NextJournal


CHAMELEON JUPYTER INTEGRATION

• Combining the ease of notebooks and the power of a shared platform
    • Storytelling with Jupyter: ideas/text, process/code, results
    • Chameleon shared experimental platform
• JupyterLab server for our users
    • Just go to jupyter.chameleoncloud.org and log in with your Chameleon credentials
• Chameleon/Jupyter integration
    • Alternative interface
    • All the main testbed functions
    • “Hello World” template

Screencast of a complex experiment: https://vimeo.com/297210055


SHARING, EXPERIMENTING, LEVERAGING

• Sharing Jupyter notebooks in Chameleon
    • Today: from home directory to sharing via our Swift storage with your project members
    • Challenges ahead: more flexible sharing policy implementation, integrating with GitHub for better versioning and sharing support
• Automating experiments with Jupyter


PARTING THOUGHTS

• Physical environment: Chameleon is a rapidly evolving experimental platform
    • Originally: “Adapts to the needs of your experiment”
    • Now also: “Adapts to the needs of its community and the changing research frontier”
• Towards an Ecosystem: a meeting place of users and providers sharing resources and research
    • Testbeds are more than just experimental platforms
    • Common/shared platform is a “common denominator” that can eliminate much complexity that goes into systematic experimentation, sharing, and reproducibility
• Be part of the change: tell us what capabilities we should provide to help you share and leverage the contributions of others!