31
Introduction Slurm Energy and Power Measurement Upcoming Summary Literature Resource management with Slurm Tobias Weßeler Arbeitsbereich Wissenschaftliches Rechnen Fachbereich Informatik Fakultät für Mathematik, Informatik und Naturwissenschaften Universität Hamburg 2016-01-18 Tobias Weßeler Resource management with Slurm 1 / 31

Resource management with Slurm

  • Upload
    others

  • View
    36

  • Download
    1

Embed Size (px)

Citation preview

Page 1: Resource management with Slurm

Introduction Slurm Energy and Power Measurement Upcoming Summary Literature

Resource management with Slurm

Tobias Weßeler

Arbeitsbereich Wissenschaftliches RechnenFachbereich Informatik

Fakultät für Mathematik, Informatik und NaturwissenschaftenUniversität Hamburg

2016-01-18

Tobias Weßeler Resource management with Slurm 1 / 31

Page 2: Resource management with Slurm

Introduction Slurm Energy and Power Measurement Upcoming Summary Literature

Agenda

1 Introduction

2 Slurm

3 Energy and Power Measurement

4 Upcoming

5 Summary

6 Literature

Tobias Weßeler Resource management with Slurm 2 / 31

Page 3: Resource management with Slurm

Introduction Slurm Energy and Power Measurement Upcoming Summary Literature

What is job scheduling / resource management?

Introduction

Resource managerMonitors resource utilization (CPU, RAM, etc.)and allocates them to the users’ jobsMonitors power consumptionSwitches off unused resourcesCommunicates with job scheduler

Job schedulerUses information from resource manager to prioritize jobsSchedules jobs to efficiently use resourcesInforms resource manager about hardware needs

Tobias Weßeler Resource management with Slurm 3 / 31

Page 4: Resource management with Slurm

Introduction Slurm Energy and Power Measurement Upcoming Summary Literature

Job scheduling examples

Introduction

Figure: Naive queue, figure based on: [Ada14]

Tobias Weßeler Resource management with Slurm 4 / 31

Page 5: Resource management with Slurm

Introduction Slurm Energy and Power Measurement Upcoming Summary Literature

Job scheduling examples

Introduction

Figure: Backfill example, figure based on: [Ada14]

Tobias Weßeler Resource management with Slurm 5 / 31

Page 6: Resource management with Slurm

Introduction Slurm Energy and Power Measurement Upcoming Summary Literature

Job scheduling examples

Introduction

Figure: No backfill possible, figure based on: [Ada14]

Tobias Weßeler Resource management with Slurm 6 / 31

Page 7: Resource management with Slurm

Introduction Slurm Energy and Power Measurement Upcoming Summary Literature

Software

SLURM

Simple Linux Utility for Resource ManagementCombined resource manager and job schedulerOpen source, fault-tolerant, highly scalableSupports plugins (dynamically linked objects at runtime)Active development by community as well as SchedMDwide-spread use in HPC

”As of the June 2015 Top 500 computer list, Slurm wasperforming workload management on six of the ten mostpowerful computers in the world including the number 1system, Tianhe-2 with 3,120,000 computing cores.”-SchedMD website

Tobias Weßeler Resource management with Slurm 7 / 31

Page 8: Resource management with Slurm

Introduction Slurm Energy and Power Measurement Upcoming Summary Literature

Architecture

Daemons of slurm

Slurmctld – controller daemonMonitors and allocates resourcesManages job queuesHas optional backup with automatic fail-over

Slurmdbd – database daemonStores accounting and configuration informationAlso has an optional automatic fail-overAttached database can be mysql, postgresql or text format

Slurmd – compute node daemonLaunches and manages tasksVery light-weightQuiet (except for optional accounting)

SlurmstepdManages job steps and I/OSpawned for each jobstep and terminated after

Tobias Weßeler Resource management with Slurm 8 / 31

Page 9: Resource management with Slurm

Introduction Slurm Energy and Power Measurement Upcoming Summary Literature

Architecture

Daemons of slurm

Figure: Multi-cluster environment

Figure based on Introduction to Slurm: [Sch16]Tobias Weßeler Resource management with Slurm 9 / 31

Page 10: Resource management with Slurm

Introduction Slurm Energy and Power Measurement Upcoming Summary Literature

Architecture

What is a plugin?

Also called addins, addons or extensionsA plugin is an optional software module that can extend orchange the functionality of an existing programUsually plugins are very specific and only work with a certainprogram - just like a puzzle piece only fits into its own puzzleSoftware gets more customizable and becomes extensibleOften represented as a puzzle pieceDescribed via interface or API

Tobias Weßeler Resource management with Slurm 10 / 31

Page 11: Resource management with Slurm

Introduction Slurm Energy and Power Measurement Upcoming Summary Literature

Architecture

Plugins

Over 80 plugins (as of dec 2012)Objects that are dynamically linked duringruntimeCurrently 26 well-defined APIs /programmer’s guides

Tobias Weßeler Resource management with Slurm 11 / 31

Page 12: Resource management with Slurm

Introduction Slurm Energy and Power Measurement Upcoming Summary Literature

The How and What...

Energy accounting

Measure the power and energy consumed by nodes or jobsPower profiling:analyse power demands of cluster and utilization of resourcesImprove energy efficiency

Power:P = I · V (Product of Current and Voltage)SI: watt (1 joule over 1 second)Energy consumption:P · t (Product of Power and Time)SI: watt-hours (3600 joule)

Tobias Weßeler Resource management with Slurm 12 / 31

Page 13: Resource management with Slurm

Introduction Slurm Energy and Power Measurement Upcoming Summary Literature

The How and What...

Motivation

MoneyDKRZ consumes over 17 GWh per yearIt costs over 1,850,000 eThat is roughly 11 cents per KWhFor comparison:Average energy consumption per Person - 2,000 KWhFactor: 8,500,000Environmental awareness

Tobias Weßeler Resource management with Slurm 13 / 31

Page 14: Resource management with Slurm

Introduction Slurm Energy and Power Measurement Upcoming Summary Literature

The How and What...

How to measure energy

PM = power meterdifferent possible locationsLibrary / APIProgram (or plugin) to uselibrary or APISamples are collected byclient and then processed

Figure: Concept drawing [Wes]Tobias Weßeler Resource management with Slurm 14 / 31

Page 15: Resource management with Slurm

Introduction Slurm Energy and Power Measurement Upcoming Summary Literature

Example Energy Plugins

Slurm Energy Plugin - RAPL

Running Average Power LimitSamples are estimated from apower consumption model based onhardware countersEstimates seem to be very accurateOnly 2 readings necessary -> lowoverhead

Figure: Concept drawing [Wes]

Tobias Weßeler Resource management with Slurm 15 / 31

Page 16: Resource management with Slurm

Introduction Slurm Energy and Power Measurement Upcoming Summary Literature

Example Energy Plugins

Slurm Energy Plugin - IPMI

Intelligent Platform Management InterfaceProtocol to read from sensorsLights Out Management:enables remote control and management of machineBaseboard Management ControllerSpecial microcontroller connected to sensors on hardwarePhyiscal Interfaces:SM Buses, Serial Port, IMPBBMC communicates with BMU(Baseboard Management Controller Management Utility)

Tobias Weßeler Resource management with Slurm 16 / 31

Page 17: Resource management with Slurm

Introduction Slurm Energy and Power Measurement Upcoming Summary Literature

Example Energy Plugins

Slurm Energy Plugin - IPMI

Figure: Concept drawing [Wes]

Tobias Weßeler Resource management with Slurm 17 / 31

Page 18: Resource management with Slurm

Introduction Slurm Energy and Power Measurement Upcoming Summary Literature

Example Energy Plugins

Slurm Energy Plugin - Config

Slurm.conf (main config file)AcctGatherEnergyTypeSpecifies which plugin should be used.AcctGatherNodeFreqTime interval between pollings in seconds.

Acct_gather.conf (same dir as slurm.conf)Contains configuration for acct_gather related pluginsE.g. EnergyIPMIFrequency:number of seconds between BMC access samples

Tobias Weßeler Resource management with Slurm 18 / 31

Page 19: Resource management with Slurm

Introduction Slurm Energy and Power Measurement Upcoming Summary Literature

Example Energy Plugins

Slurm Energy Plugin - ext_sensors

New infrastructure -> HDEEMHigh Definition Energy Efficiency MonitoringHigh resolution: ~1000 samples per secondPlugin needs to be written to utilize functionalityExt_sensors plugin works independently from acct_gatherpluginsStandard config files only allows up to 1 call per second

Tobias Weßeler Resource management with Slurm 19 / 31

Page 20: Resource management with Slurm

Introduction Slurm Energy and Power Measurement Upcoming Summary Literature

Example Energy Plugins

HDEEM

Figure: Concept drawing [unk]

Tobias Weßeler Resource management with Slurm 20 / 31

Page 21: Resource management with Slurm

Introduction Slurm Energy and Power Measurement Upcoming Summary Literature

Example Energy Plugins

Challenges

CPU current and voltage are highly dynamicFrequency is much higher than sampling ratePower meter returns instant values - need to be convertedMany conversion steps required - sources of inaccuracy:

Voltage and current sensorsAnalog-digital-converterLowpass filtersdata formatsaverage calculations

Calculation of energy correct avg power values over time period

Tobias Weßeler Resource management with Slurm 21 / 31

Page 22: Resource management with Slurm

Introduction Slurm Energy and Power Measurement Upcoming Summary Literature

Example Energy Plugins

Configuration cases

node energy monitoringAcctGatherEnergyType=acct_gather_energy/ipmi or raplAcctGatherNodeFreq=<seconds>orExtSensorsType=ext_sensors/rrd

ExtSensorsFreq=<seconds>

job/step energy accountingJobAcctGatherType=jobacct_gather/linux or cgroupAcctGatherEnergyType=acct_gather_energy/ipmi or raplJobAcctGatherFrequency=task=<seconds>orJobAcctGatherType=jobacct_gather/linux or cgroup

ExtSensorsType=ext_sensors/rrd

job/step power profilingAcctGatherEnergyType=acct_gather_energy/ipmi or raplAcctGatherProfileType=acct_gather_profile/hdf5

JobAcctGatherFrequency=energy=<seconds>

Tobias Weßeler Resource management with Slurm 22 / 31

Page 23: Resource management with Slurm

Introduction Slurm Energy and Power Measurement Upcoming Summary Literature

Storing Accounting and Profiling Data

Slurm Energy Plugin - File Format

Efficient file format needed to store collected dataHDF5 (Hierarchical Data Format version 5)Represents a wide variety of data structures within a single fileSupports very complex dataHigh-level interfaces for C, C++, Fortran 90 and Java

Tobias Weßeler Resource management with Slurm 23 / 31

Page 24: Resource management with Slurm

Introduction Slurm Energy and Power Measurement Upcoming Summary Literature

Storing Accounting and Profiling Data

HDF5

Figure: Features [The16]

Tobias Weßeler Resource management with Slurm 24 / 31

Page 25: Resource management with Slurm

Introduction Slurm Energy and Power Measurement Upcoming Summary Literature

Storing Accounting and Profiling Data

HDF5 - Requirements

Shared filesystem for compute nodesSlurm.conf:Uses HDF5 Profile PluginAcct_gather.conf:Root directory of profiling data (must be in shared filesystem)Each slurmstepd keeps his own fileFiles are merged after job completion

Tobias Weßeler Resource management with Slurm 25 / 31

Page 26: Resource management with Slurm

Introduction Slurm Energy and Power Measurement Upcoming Summary Literature

Storing Accounting and Profiling Data

HDF5 Slurm integration

Figure: Workflow [unk]

Tobias Weßeler Resource management with Slurm 26 / 31

Page 27: Resource management with Slurm

Introduction Slurm Energy and Power Measurement Upcoming Summary Literature

New features in slurm 16.05

What’s new?Supports asymmetric resource allocation

Different amount of resources for each process / rankEnables MPMD approach

Classical approach: SPMDExample call: mpirun -np 2 a.out : -np 2 b.out

Figure: Concept drawing [Wes]

Tobias Weßeler Resource management with Slurm 27 / 31

Page 28: Resource management with Slurm

Introduction Slurm Energy and Power Measurement Upcoming Summary Literature

Summary

HPC software stackSlurm is a good choiceMore possibilities for resource management in near future

Energy AccountingRange of available plugins is growingEnergy consumption and power profiles become increasinglyimportant due to high costs in HPCAccurate power profiling is difficult

Tobias Weßeler Resource management with Slurm 28 / 31

Page 29: Resource management with Slurm

Introduction Slurm Energy and Power Measurement Upcoming Summary Literature

Literature

[Ada14] Adaptive Computing Enterprises. Maui SchedulerAdministrator’s Guide, 1999-2014.http://docs.adaptivecomputing.com/maui/8.2backfill.php.

[ea14] Daniel Hackenberg et al. HDEEM: High Definition EnergyEfficiency Monitoring. 2014.

[Mar13] Martin Perry (Bull). Energy Accounting and ExternalSensors Plugins, 2013. http://www.schedmd.com/.

[Sch16] SchedMD LLC. Slurm Commercial Support andDevelopment, 2011-2016. http://www.schedmd.com/.

[The16] The HDF Group. Hierarchical Data Format, version 5,1997-2016. http://www.hdfgroup.org/HDF5/.

[unk] unknown. materials provided by R. Heidari.

[Wes] Tobias Wesseler. based on understanding of the topic.

[Yia12] Yiannis Georgiou (Bull). Enhancing SLURM with EnergyConsumption Control and Monitoring Features, 2012.http://www.schedmd.com/.

Tobias Weßeler Resource management with Slurm 29 / 31

Page 30: Resource management with Slurm

Introduction Slurm Energy and Power Measurement Upcoming Summary Literature

Thank you!

Questions?

Tobias Weßeler Resource management with Slurm 30 / 31

Page 31: Resource management with Slurm

Introduction Slurm Energy and Power Measurement Upcoming Summary Literature

Also read for information:

[ea14][Mar13][Yia12]

Tobias Weßeler Resource management with Slurm 31 / 31