Intel® RDT Hands-on Lab

Priya Autee, Harpreet Sindhu
June '17


2 TRANSFORMING NETWORKING INFRASTRUCTURE

Intel® Resource Director Technology (Intel® RDT)
Building on a rich and growing portfolio of technologies embedded in Intel silicon

Cache Monitoring Tech (CMT)
§ Per-thread L3 occupancy monitoring
§ 4 Resource Monitoring IDs per logical thread

Memory BW Monitoring (MBM)
§ Per-thread memory bandwidth monitoring
§ Leverages RMID infrastructure

Cache Allocation Tech (CAT)
§ Per-thread L3 occupancy control
§ Code and Data Prioritization (CDP)
§ Intel® Xeon® processor E5 v4 introduces 16 Classes of Service

Supplement existing telemetry:
•  Counters; Perfmon
•  Intel® Node Manager
•  Snap (open source project)
•  Utilities in kernel & VMM; etc.

[Figure labels: LP, HP, Cache, IMC]


Caching on IA

•  3 levels of cache (SNB, IVB, HSW, BDW processors)
•  L1 cache – 32KB data and 32KB instruction caches
•  L2 cache – 256KB – unified (holds code & data)
•  L3 cache (LLC) – 25MB (IVB), 30MB (HSW) – common cache for all cores in a CPU socket

•  L1 cache is the smallest and fastest.
•  CPU tries to access data – not in L1 cache?
•  Try L2 cache – not in L2 cache?
•  Try L3 cache – not in L3 cache?
•  Cache miss – need to access system memory (DRAM).

•  L1 & L2 caches are per physical core (shared by that core's logical cores)
•  L3 cache is shared (per CPU socket)


Intel® Resource Director Technology (Intel® RDT)

Cache Monitoring Technology (CMT)
•  Identify misbehaving applications and reschedule according to priority
•  Cache occupancy reported on a per Resource Monitoring ID (RMID) basis – advanced telemetry

Cache Allocation Technology (CAT)
•  Last Level Cache partitioning mechanism enabling separation and prioritization of apps or VMs
•  Misbehaving threads can be isolated to increase determinism

Memory Bandwidth Monitoring (MBM)
•  Monitors memory bandwidth consumption on a per thread/core/app basis
•  Shares common RMID architecture – telemetry
•  Provides insight into second-order shared resource contention

[Figures, one per technology: apps running on cores that share the Last Level Cache and DRAM]


Key Concepts: Resource Monitoring IDs (RMIDs)

§  Threads/Apps/VMs grouped into Resource Monitoring IDs (RMIDs)

§  Any thread, app, VM or a combination can be monitored with any RMID

§  Specify the RMID for a thread via the per-core IA32_PQR_ASSOC (“PQR”) MSR

Associate threads into RMIDs. Hardware tracks resource utilization per RMID.

SW retrieves monitoring data periodically via event IDs for CMT, MBM, future features.


Key Concepts: Classes of Service (CLOS)

§  Threads/Apps/VMs grouped into Classes of Service (CLOS) for resource allocation

§  Resource usage of any thread, app, VM or a combination controlled with a CLOS

§  Specify the CLOS for a thread via the per-core IA32_PQR_ASSOC (“PQR”) MSR

§  Configure resource guidelines per CLOS.

§  Associate threads into CLOS.

§  Hardware manages resource allocation.

§  Extensible to other shared resources
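The association of a thread with an RMID and a CLOS through the PQR MSR can be illustrated with a little shell arithmetic. This is a minimal sketch, assuming the IA32_PQR_ASSOC layout described in the Intel SDM (RMID in bits 9:0, CLOS in bits 63:32). The helper names pqr_pack, pqr_rmid and pqr_clos are hypothetical and are not part of the pqos package.

```sh
# Sketch only: helper names are hypothetical, not part of intel-cmt-cat.
# Assumed layout of IA32_PQR_ASSOC: RMID in bits 9:0, CLOS in bits 63:32.

# pqr_pack RMID CLOS -> prints the 64-bit register value in hex
pqr_pack() (
  rmid=$(( $1 & 0x3ff ))
  clos=$(( $2 ))
  printf '0x%x\n' $(( (clos << 32) | rmid ))
)

# pqr_rmid VALUE -> prints the RMID field of a register value
pqr_rmid() (
  echo $(( $1 & 0x3ff ))
)

# pqr_clos VALUE -> prints the CLOS field of a register value
pqr_clos() (
  echo $(( ($1 >> 32) & 0xffffffff ))
)

pqr_pack 5 2   # RMID 5, CLOS 2
```

In normal use the kernel or the PQoS library performs this write; the sketch only shows how the two IDs share one register.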


PQoS Kernel Implementation

[Diagram: user space, kernel space and hardware layers]
User Space: threads; user interface through resctrl fs (/sys/fs/resctrl) and perf; standalone PQoS library
Kernel Space: kernel QoS support for cache allocation and for cache and memory bandwidth monitoring; MSR driver
Hardware: Intel Xeon with Intel® RDT support; shared L3 cache
Flow labels: allocation configuration; configure bitmask per CLOS; set CLOS/RMID for thread during context switch; read event counter; read monitored data
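For the resctrl path in the diagram, allocation is configured by writing a "schemata" line into a resource group directory. The sketch below only formats such a line; schemata_l3 is a hypothetical helper, and the mount and write steps are shown as comments because they need root and a kernel with resctrl support.

```sh
# Sketch only: schemata_l3 is a hypothetical helper, not part of the kernel or pqos.
# It formats an L3 schemata line of the form the resctrl filesystem accepts,
# e.g. "L3:0=f;1=ff0" (cache id = mask in hex, without the 0x prefix).

# schemata_l3 MASK_FOR_ID0 [MASK_FOR_ID1 ...]
schemata_l3() (
  line="L3:"
  id=0
  for m in "$@"; do
    if [ "$id" -gt 0 ]; then line="$line;"; fi
    line="$line$id=$(printf '%x' $(( $m )))"
    id=$(( id + 1 ))
  done
  echo "$line"
)

schemata_l3 0xf 0xff0

# Typical use on a kernel with resctrl (as root; "lab_group" is a made-up
# group name and <pid> is a placeholder):
#   mount -t resctrl resctrl /sys/fs/resctrl
#   mkdir /sys/fs/resctrl/lab_group
#   schemata_l3 0xf 0xff0 > /sys/fs/resctrl/lab_group/schemata
#   echo <pid> > /sys/fs/resctrl/lab_group/tasks
```

Keeping the formatting in a small function makes it easy to check the line before it is written to the filesystem.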


PQoS Library and Utility

PQoS static library (Intel IP, BSD license; 01.org, GitHub)
-  Provides applications a simple C API
-  Requires C & pthreads libraries (GNU C library on Linux implements both)
-  Uses MSR and CPUID Linux kernel drivers through standard file I/O API

PQoS utility:
-  Links to PQoS static library
-  Simple and easy to use command line interface
-  Enables customers to evaluate CMT, MBM, CAT and CDP

[Diagram: the software package (PQoS utility on top of the PQoS library) runs in Linux user space and uses file I/O to reach the standard Linux kernel modules (MSR driver, CPUID driver) in Linux kernel space]


Options

#1 PQoS integration through PQoS library linking

#2 PQoS library development in an Application

#3 Using perf sys call and resctrl fs for Scheduler based RDT support


Download & Installation

•  Download v0.1.5 & unpack

wget https://github.com/01org/intel-cmt-cat/archive/v0.1.5.tar.gz

tar xzf v0.1.5.tar.gz

Note: the tip of the master branch is available as a zip at https://github.com/01org/intel-cmt-cat/archive/master.zip

•  Compile & Install

cd intel-cmt-cat-0.1.5

make

sudo make install

•  Uninstall with “sudo make uninstall”

6/20/17


Other Download & Installation Options

•  Ubuntu / Debian based

sudo apt-get install intel-cmt-cat

•  Fedora / RedHat based

sudo yum install intel-cmt-cat

sudo dnf install intel-cmt-cat

Note:

OS packages are typically a bit behind the GitHub code.

For the latest features, refer to github.com/01org/intel-cmt-cat



Common Installation Problems

•  "/usr/local/bin" not in the PATH

- Update profile or sudoers file with "/usr/local/bin" path

•  "/usr/local/lib" not in the LD path

- Update LD configuration to include "/usr/local/lib" path (run as root)

echo "/usr/local/lib" > /etc/ld.so.conf.d/libpqos.conf

ldconfig



Package Details

•  'libpqos' shared library
   •  Provides APIs to:
      •  detect & enumerate Intel® RDT features on the platform
      •  monitor resources on a hardware thread basis
      •  manage resources on a hardware thread basis

•  'pqos' tool
   •  Detects & shows Intel® RDT configuration
   •  Monitors resources
   •  Manages resources

•  'rdtset' tool
   •  Aims to simplify Intel® RDT resource management
   •  Like 'taskset', pins the application to cores
   •  Then configures classes to satisfy command line requirements



Package Details

•  Other bits and pieces in the package
   •  Perl shim for the library
   •  C example code
   •  Perl example code
   •  Net-SNMP agent providing SNMP access to CAT, CDP and CMT

•  For FAQ and other usage examples, see the project wiki at https://github.com/01org/intel-cmt-cat/wiki



Bitmask/Capacity Calculation

•  Total cache size: 55 MB
•  Number of ways: 20

Calculations:
Capacity of one cache way (one bit in the bitmask) = total cache size / number of ways
For our lab systems: 55 MB / 20 = 2.75 MB

Example:
Mask: 0x00001 means 2.75 MB (one cache way)
Mask: 0x00003 means 5.5 MB (two cache ways)
Mask: 0x00007 means 8.25 MB (three cache ways)
Mask: 0x0000F means 11 MB (four cache ways)
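The calculation above can be reproduced with a few lines of shell. This is a minimal sketch using the lab system values (55 MB reported by pqos as 57671680 bytes, 20 ways); count_ways and mask_kib are hypothetical helper names, and results are printed in KiB to stay within integer arithmetic.

```sh
# Sketch only: helper names are hypothetical.
# Lab system values: 55 MB of L3 (57671680 bytes) split into 20 ways.
CACHE_BYTES=57671680
NUM_WAYS=20

# count_ways MASK -> number of bits set in the mask (one bit per cache way)
count_ways() (
  m=$(( $1 ))
  n=0
  while [ "$m" -ne 0 ]; do
    n=$(( n + (m & 1) ))
    m=$(( m >> 1 ))
  done
  echo "$n"
)

# mask_kib MASK -> cache capacity selected by the mask, in KiB
mask_kib() (
  ways=$(count_ways "$1")
  echo $(( ways * (CACHE_BYTES / NUM_WAYS) / 1024 ))
)

mask_kib 0x00001   # one way:   2816 KiB  (2.75 MB)
mask_kib 0x0000F   # four ways: 11264 KiB (11 MB)
```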


Display Configuration

-bash-4.3$ sudo pqos -s
NOTE: Mixed use of MSR and kernel interfaces to manage
      CAT or CMT & MBM may lead to unexpected behavior.
L3CA COS definitions for Socket 0:
    L3CA COS0 => MASK 0xfffff
    L3CA COS1 => MASK 0xfffff
    L3CA COS2 => MASK 0xfffff
    L3CA COS3 => MASK 0xfffff
    L3CA COS4 => MASK 0xfffff
    L3CA COS5 => MASK 0xfffff
    L3CA COS6 => MASK 0xfffff
    L3CA COS7 => MASK 0xfffff
    L3CA COS8 => MASK 0xfffff
    L3CA COS9 => MASK 0xfffff
    L3CA COS10 => MASK 0xfffff
    L3CA COS11 => MASK 0xfffff
    L3CA COS12 => MASK 0xfffff
    L3CA COS13 => MASK 0xfffff
    L3CA COS14 => MASK 0xfffff
    L3CA COS15 => MASK 0xfffff
L3CA COS definitions for Socket 1:
    L3CA COS0 => MASK 0xfffff
    ...
    L3CA COS15 => MASK 0xfffff

< COS definitions >

Core information for socket 0:
    Core 0, L2ID 0, L3ID 0 => COS0, RMID0
    Core 2, L2ID 1, L3ID 0 => COS0, RMID0
    Core 4, L2ID 2, L3ID 0 => COS0, RMID0
    Core 6, L2ID 3, L3ID 0 => COS0, RMID0
    Core 8, L2ID 4, L3ID 0 => COS0, RMID0
    Core 10, L2ID 5, L3ID 0 => COS0, RMID0
    Core 12, L2ID 8, L3ID 0 => COS0, RMID0
    Core 14, L2ID 9, L3ID 0 => COS0, RMID0
    Core 16, L2ID 10, L3ID 0 => COS0, RMID0
    Core 18, L2ID 11, L3ID 0 => COS0, RMID0
    Core 20, L2ID 12, L3ID 0 => COS0, RMID0
    ….
    Core 86, L2ID 28, L3ID 0 => COS0, RMID0
Core information for socket 1:
    Core 1, L2ID 32, L3ID 1 => COS0, RMID0
    ...
    Core 87, L2ID 60, L3ID 1 => COS0, RMID0

< Core to COS/RMID associations >
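When scripting around this output, it can help to pull the COS of a single core out of the text. The following is a minimal sketch, not a pqos feature: core_cos is a hypothetical helper, and it assumes the "Core N, L2ID x, L3ID y => COSn, RMIDm" line format shown in this slide, which may differ between pqos versions.

```sh
# Sketch only: core_cos is a hypothetical helper that assumes the line format
# "Core N, L2ID x, L3ID y => COSn, RMIDm" from the slide output.

# core_cos CORE -> reads `pqos -s` text on stdin, prints the COS of that core
core_cos() {
  sed -n "s/^ *Core $1, .*=> COS\([0-9][0-9]*\), RMID[0-9][0-9]*.*/\1/p"
}

# Example with two lines copied from the slide output:
printf '%s\n' \
  'Core 2, L2ID 1, L3ID 0 => COS0, RMID0' \
  'Core 20, L2ID 12, L3ID 0 => COS0, RMID0' | core_cos 20
```

On a lab system the same function could be fed directly, for example with `sudo pqos -s | core_cos 20`.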


RDT enumeration

-bash-4.3$ sudo pqos -s -v
NOTE: Mixed use of MSR and kernel interfaces to manage
      CAT or CMT & MBM may lead to unexpected behavior.
INFO: CACHE: type 1, level 1, max id sharing this cache 2 (1 bits)
INFO: CACHE: type 2, level 1, max id sharing this cache 2 (1 bits)
INFO: CACHE: type 3, level 2, max id sharing this cache 2 (1 bits)
INFO: CACHE: type 3, level 3, max id sharing this cache 64 (6 bits)
INFO: Monitoring capability detected
INFO: CPUID.0x7.0: L3 CAT supported
INFO: CDP is disabled
INFO: L3 CAT details: CDP support=1, CDP on=0, #COS=16, #ways=20, ways contention bit-mask 0xc0000
INFO: L3 CAT details: cache size 57671680 bytes, way size 2883584 bytes
INFO: L3CA capability detected
INFO: CPUID 0x10.0: L2 CAT not supported!
INFO: L2CA capability not detected
INFO: CPUID 0x10.0: MBA not supported!
INFO: MBA capability not detected
INFO: resctrl not detected. Kernel version 4.10 or higher required
INFO: OS support for CMT detected
INFO: OS support for L3 CAT not detected
...
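The "L3 CAT details" lines carry the numbers needed for capacity planning. As a rough sketch, the hypothetical helper below (l3_way_bytes, not part of pqos) extracts the cache size and way count from the verbose text and derives the way size; it assumes the exact wording of the INFO lines shown here, which may change between pqos versions.

```sh
# Sketch only: l3_way_bytes is a hypothetical helper that assumes the wording
# of the "L3 CAT details" lines printed by `pqos -s -v` in this slide.

# l3_way_bytes -> reads verbose pqos text on stdin, prints cache size / #ways
l3_way_bytes() {
  awk '
    /#ways=/     { s = $0; sub(/.*#ways=/, "", s); sub(/,.*/, "", s); ways = s + 0 }
    /cache size/ { s = $0; sub(/.*cache size /, "", s); sub(/ bytes.*/, "", s); size = s + 0 }
    END { if (ways > 0) printf "%d\n", size / ways }
  '
}

# Example with the two lines from the slide output:
printf '%s\n' \
  'INFO: L3 CAT details: CDP support=1, CDP on=0, #COS=16, #ways=20, ways contention bit-mask 0xc0000' \
  'INFO: L3 CAT details: cache size 57671680 bytes, way size 2883584 bytes' | l3_way_bytes
```

The result should agree with the "way size" that pqos itself reports, which makes it a quick consistency check.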


Monitoring Data (events)

•  LLC
   •  Represents LLC occupancy (CMT)

•  MBL
   •  Represents local memory bandwidth (MBM)

•  MBR
   •  Represents remote memory bandwidth (MBM)

•  IPC
   •  Represents instructions per cycle (PMU architectural event)

•  LLC Misses
   •  Represents LLC misses (PMU architectural event)


Monitoring (CMT, MBM)

# monitor all cores and all events

-bash-4.3$ sudo pqos

# monitor cores 0 to 11 and all events

-bash-4.3$ sudo pqos -m all:0-11

# monitor LLC occupancy on cores 0, 1, 4 and 6, local memory bandwidth on cores 8 to 11 and remote memory bandwidth on cores 12-14

-bash-4.3$ sudo pqos -m "llc:0,1,4,6" -m "mbl:8-11" -m "mbr:12-14"

# reset monitoring infrastructure; reclaims in-use RMIDs

-bash-4.3$ sudo pqos -r

Note: Use ctrl-c to stop monitoring


Monitoring

# monitor groups of cores together (aggregate statistics):
# cores 0 to 7 – group 1
# cores 8 to 11 – group 2
# cores 12 to 15 – group 3
# groups can represent applications or VMs

-bash-4.3$ sudo pqos -m "all:[0-7][8-11][12-15]"

# cores 0 to 11 – group 1 [all events]
# cores 12-14 – group 2 [LLC occupancy]
# cores 15 to 17 and 20 – group 3 [local memory BW]
# groups can represent applications or VMs

-bash-4.3$ sudo pqos -m "all:[0-11];llc:[12,13,14];mbl:[15-17,20]"

Note: Use ctrl-c to stop monitoring



Allocation LLC (Define COS)

# All sockets:
# - set COS1 to 4 ways
# - set COS2 to 8 ways

-bash-4.3$ sudo pqos -e "llc:1=0xf;llc:2=0xff0;"

# Set COS1 to 4 ways on socket 0
# Set COS1 to 8 ways on socket 1

-bash-4.3$ sudo pqos -e "llc@0:1=0xf;llc@1:1=0xff0;"
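The masks in these commands are runs of consecutive set bits, one bit per cache way. The sketch below shows one way to derive such a mask from a way count and a starting way, and to check that a mask is one unbroken run, which is what L3 CAT expects as far as the Intel SDM describes it. Both helper names (ways_mask, is_contiguous) are hypothetical.

```sh
# Sketch only: helper names are hypothetical, not part of pqos.

# ways_mask WAYS [FIRST_WAY] -> contiguous mask of WAYS bits starting at FIRST_WAY
ways_mask() (
  ways=$1
  first=${2:-0}
  printf '0x%x\n' $(( ((1 << ways) - 1) << first ))
)

# is_contiguous MASK -> succeeds if the set bits form one unbroken run
is_contiguous() (
  m=$(( $1 ))
  [ "$m" -gt 0 ] || exit 1
  # drop trailing zero bits, then a run of ones satisfies m & (m + 1) == 0
  while [ $(( m & 1 )) -eq 0 ]; do m=$(( m >> 1 )); done
  [ $(( m & (m + 1) )) -eq 0 ]
)

ways_mask 4      # 0xf,   the COS1 mask in the slide
ways_mask 8 4    # 0xff0, the COS2 mask in the slide
```

Starting COS2 at way 4 keeps its eight ways clear of the four ways given to COS1, so the two classes do not share cache capacity.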


Allocation LLC (Associate Core with COS)

# associate core 1 with COS1

-bash-4.3$ sudo pqos -a "llc:1=1"

# associate:
# - cores 0 to 2 with COS1
# - cores 3 to 5 with COS2
# - cores 6 to 8 with COS3

-bash-4.3$ sudo pqos -a "llc:1=0-2;llc:2=3,4,5;llc:3=8-6"

# run sleep on core 2 with access to 2 LLC ways, using rdtset

-bash-4.3$ sudo rdtset -t "l3=0x3;cpu=2" -c 2 sleep 60


Allocation LLC (CDP & RESET)

# reset & keep current CDP config. Sets all COS to default (fill into all ways) and associates all cores with COS 0.

-bash-4.3$ sudo pqos -R

# reset & turn on CDP

-bash-4.3$ sudo pqos -R l3cdp-on

# use current L3 CDP settings and set COS 1 code and data bitmasks

-bash-4.3$ sudo pqos -e "llc:1d=0xfff;llc:1c=0xfff00;"

# reset & turn off CDP

-bash-4.3$ sudo pqos -R l3cdp-off
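With CDP, the code and data masks of a class may overlap. In the example above the data mask 0xfff covers ways 0 to 11 and the code mask 0xfff00 covers ways 8 to 19, so some ways are shared. A minimal sketch for counting the shared ways follows; shared_ways is a hypothetical helper.

```sh
# Sketch only: shared_ways is a hypothetical helper, not part of pqos.

# shared_ways MASK_A MASK_B -> number of cache ways present in both masks
shared_ways() (
  both=$(( $1 & $2 ))
  n=0
  while [ "$both" -ne 0 ]; do
    n=$(( n + (both & 1) ))
    both=$(( both >> 1 ))
  done
  echo "$n"
)

shared_ways 0xfff 0xfff00   # data and code masks of COS 1 in the slide
```

The same check works for two ordinary COS masks, for instance to confirm that a high-priority class has ways that no other class can fill.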


Orchestration Proposal for RDT in the Datacenter: OpenStack Integration (CMT Example)

Vision: Per-node resource controls directed by a datacenter-level orchestration framework

[Diagram: Host-A (CMT), Host-B (w/o CMT) and Host-C (CMT) run VM1, VM2, VM3 and VM9. Ceilometer agents on the hosts report Cache-Usage, CPU-Usage and X-Usage (CPU-Usage and X-Usage only where CMT is absent) to the Ceilometer data store. AODH (alarm engine), Congress (policy engine) and Nova (scheduler, orchestrator; cache allocations) sit above the hosts. Collection labels: collectd, PQoS Lib, perf, "MSR if Linux sched integrated".]


OpenStack RDT Integration (Contention Detection)

[Diagram: Ceilometer agents on Host-A (CMT), Host-B (w/o CMT) and Host-C (CMT) report Cache-Usage, CPU-Usage and X-Usage for VM1, VM2, VM3 and VM9 to the Ceilometer data store; AODH (alarm engine), Congress (policy engine) and Nova (scheduler, orchestrator; cache allocations) consume the data.]


OpenStack RDT Integration (Determine Action)

[Diagram: Host-A (CMT), Host-B (w/o CMT) and Host-C (CMT) with VM1, VM2, VM3 and VM 9 and their Cache-Usage, CPU-Usage and X-Usage data in the Ceilometer data store; AODH (alarm engine), Congress (policy engine) and Nova (scheduler, orchestrator; cache allocations) determine a "move action".]


OpenStack RDT Integration (Execute Action)

[Diagram: Host-A (CMT), Host-B and Host-C (CMT) with VM1, VM2, VM3 and VM-9 and their Cache-Usage, CPU-Usage and X-Usage data in the Ceilometer database; AODH (alarm engine), Congress (policy engine) and Nova (scheduler, orchestrator; cache allocations) execute the action. VM1 is drawn twice.]


BACKUP



Intel® Resource Director Technology (Intel® RDT) Collateral

•  Intel Resource Director Technology landing page
   •  http://www.intel.com/content/www/us/en/architecture-and-technology/resource-director-technology.html
   •  Includes links to blogs and many other resources

•  Intel Software Developer's Manual
   •  http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html (Vol 3b, Chapter 17.15 and 17.16, covers CMT, CAT, MBM and CDP)

•  NPG Product Literature
   •  http://www.intel.com/content/www/us/en/communications/nfv-packet-processing-brief.html

•  Academic Research Papers
   •  Numerous prior works are available from multiple researchers and organizations


Intel® Resource Director Technology Collateral – Software Enabling

•  Software Enabling: Non Operating System integrated options
   •  Standalone tool to monitor and control allocation functionality: https://01.org/packet-processing/cache-monitoring-allocation-technology

•  Software Enabling: Operating System Scheduler enabled options
   •  Linux* perf patches for monitoring (mainline since kernel v4.1, v4.6): https://lkml.kernel.org/r/1422038748-21397-1-git-send-email-matt@codeblueprint.co.uk
   •  Linux resctrl for allocation support, v4.10

•  Introduction to CMT: https://software.intel.com/en-us/blogs/2014/06/18/benefit-of-cache-monitoring

•  Discussion of RMIDs and CMT Software Interfaces: https://software.intel.com/en-us/blogs/2014/12/11/intel-s-cache-monitoring-technology-software-visible-interfaces

•  Use Models and Example Data: https://software.intel.com/en-us/blogs/2014/12/11/intels-cache-monitoring-technology-use-models-and-data

•  Software Support and Tools: Intel's Cache Monitoring Technology: Software Support and Tools: https://software.intel.com/en-us/blogs/2014/12/11/intels-cache-monitoring-technology-software-support-and-tools


Perf CMT and MBM Implementation

•  RMID recycling

•  Cache monitoring and memory bandwidth monitoring per pid/tid

# tools/perf/perf list | grep intel_cqm

intel_cqm/llc_occupancy/ [Kernel PMU event]

intel_cqm/local_bytes/ [Kernel PMU event]

intel_cqm/total_bytes/ [Kernel PMU event]

Command:

# tools/perf/perf stat -e intel_cqm/llc_occupancy/ -e intel_cqm/local_bytes/ -e intel_cqm/total_bytes/ -p <#pid>

•  CMT and MBM support in applications to track pid/tid.

•  Libvirt enables the CMT event for monitoring VMs using the perf_event_open system call.

int perf_event_open(struct perf_event_attr *attr, pid_t pid, int cpu, int group_fd, unsigned long flags);

(Please refer to lib/perf.c for the detailed implementation)


Libvirt CMT support

Libvirt enabling commands:

• To enable/disable the CMT perf event for a domain:

$ virsh perf <domain> --enable <event_name>

For example: $ virsh perf guest01 --enable cmt

• To get the perf events list:

$ virsh perf <domain>

• To print statistics for the perf events:

$ virsh domstats <domain>

Patches available: https://www.redhat.com/archives/libvir-list/2016-January/msg01264.html