Introduction On-line Monitoring Framework Evaluation Summary References
Online Monitoring of I/O
Eugen Betke and Julian Kunkel
Research Group, German Climate Computing Center
23-03-2017
Eugen Betke and Julian Kunkel Online Monitoring of I/O
Table of Contents
1 Introduction
2 On-line Monitoring Framework
  Components
    FUSE File System
    SIOX + SIOX On-line Monitoring Plug-in
    Elasticsearch
    Grafana
  Architecture
3 Evaluation
  Scalability
  Overhead
4 Summary
Introduction
Why monitoring?
  Monitoring is important for finding inefficient applications
What I/O levels to monitor?
  node I/O
    Overview of the total I/O traffic on a node
    Available in user space
  file I/O
    Filtered I/O traffic for a specific file
    Available in user space
  mmap I/O
    I/O traffic done by virtual memory in the background
    Hidden in kernel space
How do monitoring tools get data?
  Capturing of proc-file statistics
  Instrumentation by code injection
    Static approach
    Injection of newly compiled C code into a binary executable or dynamic library file
    Re-compilation necessary
  Interception with LD_PRELOAD
    Dynamic approach
    Works with dynamic libraries only
    Statically linked functions cannot be manipulated
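The first data source listed above, capturing proc-file statistics, can be sketched in a few lines. This is a minimal illustration and not code from the framework: it assumes a Linux system that exposes per-process counters in /proc/<pid>/io, and the function names and the sample values are made up for the example.

```python
from pathlib import Path


def parse_proc_io(text: str) -> dict[str, int]:
    """Parse the 'key: value' lines of a /proc/<pid>/io file into a dict."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            counters[key.strip()] = int(value.strip())
    return counters


def read_own_io_counters() -> dict[str, int]:
    """Read the I/O counters of the current process (Linux only)."""
    return parse_proc_io(Path("/proc/self/io").read_text())


# Sample text in the layout of /proc/<pid>/io; the numbers are invented.
SAMPLE = (
    "rchar: 4096\nwchar: 1024\nsyscr: 8\nsyscw: 2\n"
    "read_bytes: 0\nwrite_bytes: 1024\ncancelled_write_bytes: 0\n"
)
sample_counters = parse_proc_io(SAMPLE)
```

Counters of this kind describe a whole process, which is why they give an overview of I/O traffic but cannot attribute it to a single file.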
Virtual Memory
mmap()
mmap() is a system call that maps the contents of a file into memory.
void *mmap(void *addr, size_t len, int prot, int flags, int fildes, off_t off);
  fildes: file descriptor
  off: offset within the file to be mapped
  len: length of data from the offset to be mapped
Problem
  Virtual memory runs in kernel space. Non-privileged applications have no access.
Figure: Virtual memory [4]
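To illustrate why mmap I/O is hard to observe, the following sketch changes a file through a memory mapping using Python's standard mmap module. The helper name and the scratch file are assumptions for the example. The point is that the modification is a plain memory store: no write() call is issued that an interception layer could count.

```python
import mmap
import os
import tempfile


def modify_via_mmap(path: str, offset: int, data: bytes) -> None:
    """Overwrite bytes of a file through a memory mapping instead of write()."""
    with open(path, "r+b") as f:
        # Length 0 maps the whole file; the file must not be empty.
        with mmap.mmap(f.fileno(), 0) as mapped:
            mapped[offset:offset + len(data)] = data  # plain memory store
            mapped.flush()  # ask the kernel to write dirty pages back


# Create a small scratch file, change it through the mapping, read it back.
fd, scratch_path = tempfile.mkstemp()
os.write(fd, b"hello world")
os.close(fd)
modify_via_mmap(scratch_path, 0, b"HELLO")
with open(scratch_path, "rb") as f:
    result = f.read()
os.remove(scratch_path)
```

The transfer between the mapped pages and storage is carried out by the kernel's virtual memory subsystem, out of sight of user-space tools.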
FUSE File System - Overview
File System in User Space
  Software interface for Unix-like computer operating systems
  Non-privileged file systems run without editing kernel code
  File system code runs in user space
  FUSE module provides only a "bridge" to the actual kernel interfaces
Figure: FUSE architecture. Components shown across user space and kernel space: Application, Virtual File System, FUSE Kernel Module, User Level File System (linked against libfuse), Built-In File System (e.g. BTRFS, EXT4), Storage, and Virtual Memory (mmap()). Numbered steps 1 to 6 and 1' mark the path of a request.
FUSE File System - IOFS
Auxiliary tool
  Mounts a directory to a mount point
Features
  No cache
  No mmap() operations
  No root privileges required
Figure: FUSE architecture
SIOX - Scalable I/O for Extreme Performance [3]
Performance Analysis Framework
  Open-source framework published under the LGPL (footnote 1)
  Supports POSIX, MPI, HDF5, and NetCDF4 layers
  Modular design
Online Analysis
  Analyse activities during program execution
Offline Analysis
  Analyse activities after program termination
Footnote 1: https://github.com/JulianKunkel/siox.git
SIOX - On-line Monitoring Plug-in
Plug-in
  Aggregates I/O traffic to statistics
  Uses I/O categories
    write: pwrite(), write(), ...
    read: get(), read(), ...
  Sends I/O statistics in specified time intervals
I/O Statistics
  Typed for visualization
  Types
    metrics: y-axis
    timestamp: x-axis
    tags: filtering

I/O Statistics
Name                  Type              Value
write_duration        metric (basic)    time spent for writing
write_bytes           metric (basic)    bytes written
write_calls           metric (basic)    number of I/O operations
write_bytes_per_call  metric (derived)  write_bytes, write_calls
read_duration         metric (basic)    time spent for reading
read_bytes            metric (basic)    bytes read
read_calls            metric (basic)    number of I/O operations
read_bytes_per_call   metric (derived)  read_bytes, read_calls
filename              tag               I/O operations
access                tag               I/O operations
username              tag               SLURM_USER
hostname              tag               HOSTNAME
procid                tag               SLURM_PROCID
jobid                 tag               SLURM_JOBID
layer                 tag               user defined
timestamp             date              system clock
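The aggregation step can be sketched as follows. This is an illustration under stated assumptions, not the plug-in's actual code: the IOEvent record and the aggregate function are hypothetical, and the derived metric is assumed to be the ratio of bytes to calls, which the table suggests but does not spell out.

```python
from dataclasses import dataclass


@dataclass
class IOEvent:
    """One intercepted I/O call (hypothetical record layout)."""
    category: str    # "write" or "read"
    nbytes: int      # bytes transferred by the call
    duration: float  # seconds spent in the call


def aggregate(events: list[IOEvent]) -> dict[str, float]:
    """Fold the events of one time interval into basic and derived metrics."""
    stats: dict[str, float] = {}
    for cat in ("write", "read"):
        selected = [e for e in events if e.category == cat]
        calls = len(selected)
        total = sum(e.nbytes for e in selected)
        stats[f"{cat}_duration"] = sum(e.duration for e in selected)
        stats[f"{cat}_bytes"] = total
        stats[f"{cat}_calls"] = calls
        # Derived metric; set to 0 when no call happened in the interval.
        stats[f"{cat}_bytes_per_call"] = total / calls if calls else 0.0
    return stats


# Invented events for one interval.
interval = [
    IOEvent("write", 4096, 0.002),
    IOEvent("write", 8192, 0.003),
    IOEvent("read", 1024, 0.001),
]
stats = aggregate(interval)
```

Sending only such per-interval sums instead of individual calls keeps the amount of monitoring data independent of how many small I/O calls an application issues.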
Elasticsearch
Scalable, real-time search and analytics engine
Apache 2 license
Indexing of all fields allows fast look-ups
Highly scalable
  Runs on laptops as well as on large-scale supercomputers
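As a rough sketch of how I/O statistics could be handed to Elasticsearch, the snippet below builds a request body in the newline-delimited JSON layout of the bulk API: one action line followed by one document line per record. The index name, host names, and values are hypothetical, the snippet does not contact a server, and it is not claimed to be how the SIOX plug-in transmits its data. Older Elasticsearch releases additionally expect a document type.

```python
import json


def build_bulk_body(index: str, documents: list[dict]) -> str:
    """Serialize documents as newline-delimited JSON for a bulk request."""
    lines = []
    for doc in documents:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(doc))                            # document line
    # The bulk format requires a newline after the last line as well.
    return "\n".join(lines) + "\n"


# Invented records using metric and tag names from the I/O statistics.
docs = [
    {"write_bytes": 12288, "write_calls": 2, "hostname": "node01",
     "timestamp": "2017-03-23T10:00:00Z"},
    {"write_bytes": 4096, "write_calls": 1, "hostname": "node02",
     "timestamp": "2017-03-23T10:00:00Z"},
]
body = build_bulk_body("io-statistics", docs)
```

Batching many records into one request is one common way to reduce per-request overhead when a large number of processes report at once.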
Grafana
Rich graphing
  Interactive, editable graphs
  Multiple Y-axes, logarithmic scales and options
Mixed styling
  Mix lines, points and bars
  Mix stacked with isolated series
Template variables
  Variables are automatically filled with values from the DB
Repeating panels
  Automatically repeat rows or panels for each selected variable value
Annotations
  Show events from data sources in the graphs
Figure: Grafana [1]
On-line Monitoring Framework
High scalability
Almost real-time on-line monitoring
Non-intrusive framework
  No changes to applications are needed
Components
  Interception of mmap I/O: FUSE
  I/O statistics: SIOX + OnlineMonitoringPlugin
  DB back-end: Elasticsearch
  Visualization: Grafana
On-line Monitoring Architecture 1/4
Figure: On-line monitoring architecture, step 1/4. Components shown across user space and kernel space: SIOX (Application) + Online-Monitoring-Plugin (optional), Virtual File System, Virtual Memory, FUSE Kernel Module, SIOX (IOFS) + Online-Monitoring-Plugin (optional), Built-in File System, Storage, Elasticsearch, Grafana. Arrow labels: I/O statistics, mmap(), file I/O.
On-line Monitoring Architecture 2/4
Figure: On-line monitoring architecture, step 2/4. Same components as in step 1/4. Arrow labels: I/O statistics, mmap(), mmap I/O. Annotation: here still hidden.
On-line Monitoring Architecture 3/4
Figure: On-line monitoring architecture, step 3/4. Same components as in step 1/4. Arrow labels: I/O statistics, mmap(), mmap I/O.
On-line Monitoring Architecture 4/4
Figure: On-line monitoring architecture, step 4/4. Same components as in step 1/4. Arrow labels: I/O statistics, mmap(), file I/O.
Grafana Web-Interface (User Perspective)
Interactive web interface
  Zoom, time shift, filtering, ...
Elaborate filtering
  Based on templates
  Auto update of templates
Flaws
  No auto range finder
  Template update functionality is not user friendly
On-line monitoring: live demo
Elasticsearch performance
Elasticsearch was deployed on an office PC
Test setup
  Nodes: 10
  Processes per node: 20
  Metrics were
    generated on our HPC system "Mistral" [2] with a Python script
    sent in packages of 100 metrics
Result
100 x 7500 metrics per second
Package
{
  'metric1': '1',
  'metric2': '2',
  'metric3': '3',
  ...
  'metric100': '100'
}
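A package of this shape is easy to generate. The sketch below is not the script used for the test, which is not shown in the slides; the function name is an assumption, and the throughput figure is computed under the reading that 7500 packages of 100 metrics each were processed per second.

```python
import json


def make_package(n_metrics: int = 100) -> dict[str, str]:
    """Build a package with keys metric1..metricN and string values '1'..'N'."""
    return {f"metric{i}": str(i) for i in range(1, n_metrics + 1)}


package = make_package()
payload = json.dumps(package)  # serialized form that would be sent

# Reading "100 x 7500 metrics per second" as 7500 packages of 100 metrics
# per second (an interpretation, not stated explicitly in the slides).
metrics_per_second = 100 * 7500
```

Under that reading, the office PC handled 750,000 individual metric values per second.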
Overhead - Test Setup
Computer 1: Test system
  IOR, IOZone, SIOX, IOFS
  Intel Core i5-660, 4M cache, 3.33 GHz
  12 GB DDR3 RAM
  2 TB HDD (test)
  500 GB HDD (system)
Computer 2: DB and visualization
  Elasticsearch, Grafana
I/O statistics are sent from Computer 1 to Computer 2 over a 1 GB/s network
Experiment configuration
Block sizes: 1 KiB, 100 KiB, 128 KiB, 1000 KiB, 1024 KiB, 16384 KiB
1 node and 1 process per node (in SLURM)
4 GiB test file
10 test runs for each block size
IOR for file I/O
IOZone for mmap I/O
Scenarios: without monitoring and with monitoring (application, mount point, both)
Overhead [1/4] - Write
Figure: Relative write performance per scenario (0 to 3) for block sizes 1 KiB, 100 KiB, 128 KiB, 1000 KiB, 1024 KiB, and 16384 KiB. Panels: FILE I/O (y-axis 1.00 to 1.12) and MMAP I/O (y-axis 1.00 to 1.12).

$$P_{rel} = \frac{\mathrm{mean}(P_{no\_monitoring})}{P_{\langle scenario \rangle}}$$

Scenarios
  0 no monitoring
  1 monitoring of application
  2 monitoring of mount point
  3 both, (1) and (2)
Exp. configuration
  nodes/processes per node: 1/1
  test file: 4 GiB
  test runs: 10
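The relative performance plotted in the overhead figures can be computed as in the following sketch. The numbers are invented for illustration and are not measurements from the talk, and the function name is an assumption. The slides do not say whether P is a throughput or a runtime; the comments assume throughput.

```python
from statistics import mean


def relative_performance(baseline_runs: list[float], scenario_run: float) -> float:
    """P_rel = mean(P_no_monitoring) / P_scenario for a single test run."""
    return mean(baseline_runs) / scenario_run


# Hypothetical throughput values in MiB/s, not results from the evaluation.
no_monitoring = [100.0, 102.0, 98.0]
with_monitoring = [96.0, 100.0, 99.0]

# One relative value per monitored run; if P is throughput, values above 1
# mean that the monitored run was slower than the unmonitored average.
rel = [relative_performance(no_monitoring, p) for p in with_monitoring]
```

Normalizing against the mean of the unmonitored runs makes results for different block sizes comparable on one axis.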
Overhead [2/4] - Write (zoomed)
Figure: Relative write performance (zoomed) per scenario (0 to 3) for block sizes 1 KiB, 100 KiB, 128 KiB, 1000 KiB, 1024 KiB, and 16384 KiB. Panels: FILE I/O (y-axis 1 to 5) and MMAP I/O (y-axis 1.00 to 1.12).
Overhead [3/4] - Read
Figure: Relative read performance per scenario (0 to 3) for block sizes 1 KiB, 100 KiB, 128 KiB, 1000 KiB, 1024 KiB, and 16384 KiB. Panels: FILE I/O (y-axis 0.98 to 1.02) and MMAP I/O (y-axis 0.98 to 1.02).
Overhead [4/4] - Read (zoomed)
Figure: Relative read performance (zoomed) per scenario (0 to 3) for block sizes 1 KiB, 100 KiB, 128 KiB, 1000 KiB, 1024 KiB, and 16384 KiB. Panels: FILE I/O (y-axis 0.98 to 1.02) and MMAP I/O (y-axis 1.0 to 1.3).
Summary
Non-intrusive On-line Monitoring Framework
  Built on top of open source software: FUSE, SIOX, Elasticsearch, Grafana
  Provides near real-time on-line monitoring
  Collects I/O statistics from applications and mount points
  Provides support for file I/O and mmap I/O
    file I/O: detailed information about file accesses
    mmap I/O: non-intrusive way of instrumenting virtual memory (novelty)
Scalability (office PC)
100 x 7500 metrics/second
Overhead (office PC)
  Write: file I/O < 1%/12% (+outlier) and mmap I/O < 1%/6%
  Read: file I/O < 1%/1% and mmap I/O < 1%/30% (+outlier)
Results for our HPC “Mistral” [2] are coming soon
References
[1] Grafana. https://grafana.com/. Accessed: 2017-03-22.
[2] HLRE-3 "Mistral". https://www.dkrz.de/Klimarechner/hpc. Accessed: 2017-03-22.
[3] SIOX. https://wr.informatik.uni-hamburg.de/research/projects/siox. Accessed: 2017-03-22.
[4] Virtual Memory. https://en.wikipedia.org/wiki/Virtual_memory. Accessed: 2017-03-22.