View
9
Download
0
Category
Preview:
Citation preview
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Score-P and ScalascaPortable open-source tools for scalableperformance analysis
February 1, 2014 Alexandre Otto Strube
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Outline
Going Exascale
Scalasca
We’re not alone
Things got messy
Unification
Who uses/develops Score-P
What is ours
Extreme scalability
The future
February 1, 2014 Alexandre Otto Strube Slide 2
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Going Exascale
February 1, 2014 Alexandre Otto Strube Slide 3
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
TL;DR
Single core perfomance peaking
# of cores increasingHybrid environmentsThat affects YOU - TODAY - RIGHT NOWHPC is just the spearheadWe only find the problems before the othersSupercomputers of today → notebooks of tomorrow
February 1, 2014 Alexandre Otto Strube Slide 5
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
TL;DR
Single core perfomance peaking# of cores increasing
Hybrid environmentsThat affects YOU - TODAY - RIGHT NOWHPC is just the spearheadWe only find the problems before the othersSupercomputers of today → notebooks of tomorrow
February 1, 2014 Alexandre Otto Strube Slide 5
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
TL;DR
Single core perfomance peaking# of cores increasingHybrid environments
That affects YOU - TODAY - RIGHT NOWHPC is just the spearheadWe only find the problems before the othersSupercomputers of today → notebooks of tomorrow
February 1, 2014 Alexandre Otto Strube Slide 5
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
TL;DR
Single core perfomance peaking# of cores increasingHybrid environmentsThat affects YOU - TODAY - RIGHT NOW
HPC is just the spearheadWe only find the problems before the othersSupercomputers of today → notebooks of tomorrow
February 1, 2014 Alexandre Otto Strube Slide 5
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
TL;DR
Single core perfomance peaking# of cores increasingHybrid environmentsThat affects YOU - TODAY - RIGHT NOWHPC is just the spearhead
We only find the problems before the othersSupercomputers of today → notebooks of tomorrow
February 1, 2014 Alexandre Otto Strube Slide 5
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
TL;DR
Single core perfomance peaking# of cores increasingHybrid environmentsThat affects YOU - TODAY - RIGHT NOWHPC is just the spearheadWe only find the problems before the others
Supercomputers of today → notebooks of tomorrow
February 1, 2014 Alexandre Otto Strube Slide 5
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
TL;DR
Single core perfomance peaking# of cores increasingHybrid environmentsThat affects YOU - TODAY - RIGHT NOWHPC is just the spearheadWe only find the problems before the othersSupercomputers of today → notebooks of tomorrow
February 1, 2014 Alexandre Otto Strube Slide 5
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
It doesn’t get easier
Increasing machine complexity (gpu, accelerators, etc)
Every doubling of scale reveals a new bottleneckPerturbation and data volumeDrawing insight from measurements
February 1, 2014 Alexandre Otto Strube Slide 6
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
It doesn’t get easier
Increasing machine complexity (gpu, accelerators, etc)Every doubling of scale reveals a new bottleneck
Perturbation and data volumeDrawing insight from measurements
February 1, 2014 Alexandre Otto Strube Slide 6
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
It doesn’t get easier
Increasing machine complexity (gpu, accelerators, etc)Every doubling of scale reveals a new bottleneckPerturbation and data volume
Drawing insight from measurements
February 1, 2014 Alexandre Otto Strube Slide 6
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
It doesn’t get easier
Increasing machine complexity (gpu, accelerators, etc)Every doubling of scale reveals a new bottleneckPerturbation and data volumeDrawing insight from measurements
February 1, 2014 Alexandre Otto Strube Slide 6
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Example: Sweep3d Wait States on BG/P (2010)
February 1, 2014 Alexandre Otto Strube Slide 7
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
This is an old song
Several performance tools exist, for many years
Most cease to work in huge processor/core countsKOJAK performance tool was created 16 years ago.
February 1, 2014 Alexandre Otto Strube Slide 8
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
This is an old song
Several performance tools exist, for many yearsMost cease to work in huge processor/core counts
KOJAK performance tool was created 16 years ago.
February 1, 2014 Alexandre Otto Strube Slide 8
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
This is an old song
Several performance tools exist, for many yearsMost cease to work in huge processor/core countsKOJAK performance tool was created 16 years ago.
February 1, 2014 Alexandre Otto Strube Slide 8
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Scalasca
Started in 2006 (following KOJAK from ’98)
Goals:
Scalable performance analysis toolsetSpecifically targeting large-scale parallel applications such asthose running on IBM Blue Gene or Cray XT with 10,000s or100,000s of processes
February 1, 2014 Alexandre Otto Strube Slide 9
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Scalasca
Started in 2006 (following KOJAK from ’98)Goals:
Scalable performance analysis toolsetSpecifically targeting large-scale parallel applications such asthose running on IBM Blue Gene or Cray XT with 10,000s or100,000s of processes
February 1, 2014 Alexandre Otto Strube Slide 9
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Scalasca
Started in 2006 (following KOJAK from ’98)Goals:
Scalable performance analysis toolset
Specifically targeting large-scale parallel applications such asthose running on IBM Blue Gene or Cray XT with 10,000s or100,000s of processes
February 1, 2014 Alexandre Otto Strube Slide 9
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Scalasca
Started in 2006 (following KOJAK from ’98)Goals:
Scalable performance analysis toolsetSpecifically targeting large-scale parallel applications such asthose running on IBM Blue Gene or Cray XT with 10,000s or100,000s of processes
February 1, 2014 Alexandre Otto Strube Slide 9
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Scalasca: Features
Open source (New BSD license)
PortableIBM Blue Gene, Cray XT, SGI Altix, IBM SP, blade clusters,Solaris, Linux clusters, NEC SX, K Computer, Fujitsu FX10Supports common parallel programming paradigms &languages
Fortran, C, C++MPI 2.2, basic OpenMP & hybrid MPI+OpenMP
Unique:
scalable trace analysisAutomatic wait-state searchParallel replay exploits memory & processors to deliverscalabilityINSIGHTFUL
February 1, 2014 Alexandre Otto Strube Slide 10
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Scalasca: Features
Open source (New BSD license)Portable
IBM Blue Gene, Cray XT, SGI Altix, IBM SP, blade clusters,Solaris, Linux clusters, NEC SX, K Computer, Fujitsu FX10Supports common parallel programming paradigms &languages
Fortran, C, C++MPI 2.2, basic OpenMP & hybrid MPI+OpenMP
Unique:
scalable trace analysisAutomatic wait-state searchParallel replay exploits memory & processors to deliverscalabilityINSIGHTFUL
February 1, 2014 Alexandre Otto Strube Slide 10
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Scalasca: Features
Open source (New BSD license)PortableIBM Blue Gene, Cray XT, SGI Altix, IBM SP, blade clusters,Solaris, Linux clusters, NEC SX, K Computer, Fujitsu FX10
Supports common parallel programming paradigms &languages
Fortran, C, C++MPI 2.2, basic OpenMP & hybrid MPI+OpenMP
Unique:
scalable trace analysisAutomatic wait-state searchParallel replay exploits memory & processors to deliverscalabilityINSIGHTFUL
February 1, 2014 Alexandre Otto Strube Slide 10
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Scalasca: Features
Open source (New BSD license)PortableIBM Blue Gene, Cray XT, SGI Altix, IBM SP, blade clusters,Solaris, Linux clusters, NEC SX, K Computer, Fujitsu FX10Supports common parallel programming paradigms &languages
Fortran, C, C++MPI 2.2, basic OpenMP & hybrid MPI+OpenMP
Unique:
scalable trace analysisAutomatic wait-state searchParallel replay exploits memory & processors to deliverscalabilityINSIGHTFUL
February 1, 2014 Alexandre Otto Strube Slide 10
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Scalasca: Features
Open source (New BSD license)PortableIBM Blue Gene, Cray XT, SGI Altix, IBM SP, blade clusters,Solaris, Linux clusters, NEC SX, K Computer, Fujitsu FX10Supports common parallel programming paradigms &languages
Fortran, C, C++
MPI 2.2, basic OpenMP & hybrid MPI+OpenMPUnique:
scalable trace analysisAutomatic wait-state searchParallel replay exploits memory & processors to deliverscalabilityINSIGHTFUL
February 1, 2014 Alexandre Otto Strube Slide 10
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Scalasca: Features
Open source (New BSD license)PortableIBM Blue Gene, Cray XT, SGI Altix, IBM SP, blade clusters,Solaris, Linux clusters, NEC SX, K Computer, Fujitsu FX10Supports common parallel programming paradigms &languages
Fortran, C, C++MPI 2.2, basic OpenMP & hybrid MPI+OpenMP
Unique:
scalable trace analysisAutomatic wait-state searchParallel replay exploits memory & processors to deliverscalabilityINSIGHTFUL
February 1, 2014 Alexandre Otto Strube Slide 10
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Scalasca: Features
Open source (New BSD license)PortableIBM Blue Gene, Cray XT, SGI Altix, IBM SP, blade clusters,Solaris, Linux clusters, NEC SX, K Computer, Fujitsu FX10Supports common parallel programming paradigms &languages
Fortran, C, C++MPI 2.2, basic OpenMP & hybrid MPI+OpenMP
Unique:
scalable trace analysisAutomatic wait-state searchParallel replay exploits memory & processors to deliverscalabilityINSIGHTFUL
February 1, 2014 Alexandre Otto Strube Slide 10
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Scalasca: Features
Open source (New BSD license)PortableIBM Blue Gene, Cray XT, SGI Altix, IBM SP, blade clusters,Solaris, Linux clusters, NEC SX, K Computer, Fujitsu FX10Supports common parallel programming paradigms &languages
Fortran, C, C++MPI 2.2, basic OpenMP & hybrid MPI+OpenMP
Unique:scalable trace analysis
Automatic wait-state searchParallel replay exploits memory & processors to deliverscalabilityINSIGHTFUL
February 1, 2014 Alexandre Otto Strube Slide 10
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Scalasca: Features
Open source (New BSD license)PortableIBM Blue Gene, Cray XT, SGI Altix, IBM SP, blade clusters,Solaris, Linux clusters, NEC SX, K Computer, Fujitsu FX10Supports common parallel programming paradigms &languages
Fortran, C, C++MPI 2.2, basic OpenMP & hybrid MPI+OpenMP
Unique:scalable trace analysisAutomatic wait-state search
Parallel replay exploits memory & processors to deliverscalabilityINSIGHTFUL
February 1, 2014 Alexandre Otto Strube Slide 10
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Scalasca: Features
Open source (New BSD license)PortableIBM Blue Gene, Cray XT, SGI Altix, IBM SP, blade clusters,Solaris, Linux clusters, NEC SX, K Computer, Fujitsu FX10Supports common parallel programming paradigms &languages
Fortran, C, C++MPI 2.2, basic OpenMP & hybrid MPI+OpenMP
Unique:scalable trace analysisAutomatic wait-state searchParallel replay exploits memory & processors to deliverscalability
INSIGHTFUL
February 1, 2014 Alexandre Otto Strube Slide 10
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Scalasca: Features
Open source (New BSD license)PortableIBM Blue Gene, Cray XT, SGI Altix, IBM SP, blade clusters,Solaris, Linux clusters, NEC SX, K Computer, Fujitsu FX10Supports common parallel programming paradigms &languages
Fortran, C, C++MPI 2.2, basic OpenMP & hybrid MPI+OpenMP
Unique:scalable trace analysisAutomatic wait-state searchParallel replay exploits memory & processors to deliverscalabilityINSIGHTFUL
February 1, 2014 Alexandre Otto Strube Slide 10
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
This looks understandable...
February 1, 2014 Alexandre Otto Strube Slide 11
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
... but this is a real code.
February 1, 2014 Alexandre Otto Strube Slide 12
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
and this.
February 1, 2014 Alexandre Otto Strube Slide 13
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
... it can get really confusing.
February 1, 2014 Alexandre Otto Strube Slide 14
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Scalasca
February 1, 2014 Alexandre Otto Strube Slide 15
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Scalasca
February 1, 2014 Alexandre Otto Strube Slide 16
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Scalasca
February 1, 2014 Alexandre Otto Strube Slide 17
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Scalasca
February 1, 2014 Alexandre Otto Strube Slide 18
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Scalasca
February 1, 2014 Alexandre Otto Strube Slide 19
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Scalasca
February 1, 2014 Alexandre Otto Strube Slide 20
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
We’re not alone
Several tools exist
Different goals, similar needsSeparate measurement systems and output formatsComplementary features and overlapping functionalityRedundant effort for development and maintenanceLimited or expensive interoperabilityComplications for user experience, support, training
February 1, 2014 Alexandre Otto Strube Slide 21
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
We’re not alone
Several tools existDifferent goals, similar needs
Separate measurement systems and output formatsComplementary features and overlapping functionalityRedundant effort for development and maintenanceLimited or expensive interoperabilityComplications for user experience, support, training
February 1, 2014 Alexandre Otto Strube Slide 21
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
We’re not alone
Several tools existDifferent goals, similar needsSeparate measurement systems and output formats
Complementary features and overlapping functionalityRedundant effort for development and maintenanceLimited or expensive interoperabilityComplications for user experience, support, training
February 1, 2014 Alexandre Otto Strube Slide 21
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
We’re not alone
Several tools existDifferent goals, similar needsSeparate measurement systems and output formatsComplementary features and overlapping functionality
Redundant effort for development and maintenanceLimited or expensive interoperabilityComplications for user experience, support, training
February 1, 2014 Alexandre Otto Strube Slide 21
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
We’re not alone
Several tools existDifferent goals, similar needsSeparate measurement systems and output formatsComplementary features and overlapping functionalityRedundant effort for development and maintenance
Limited or expensive interoperabilityComplications for user experience, support, training
February 1, 2014 Alexandre Otto Strube Slide 21
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
We’re not alone
Several tools existDifferent goals, similar needsSeparate measurement systems and output formatsComplementary features and overlapping functionalityRedundant effort for development and maintenanceLimited or expensive interoperability
Complications for user experience, support, training
February 1, 2014 Alexandre Otto Strube Slide 21
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
We’re not alone
Several tools existDifferent goals, similar needsSeparate measurement systems and output formatsComplementary features and overlapping functionalityRedundant effort for development and maintenanceLimited or expensive interoperabilityComplications for user experience, support, training
February 1, 2014 Alexandre Otto Strube Slide 21
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Things got messy
February 1, 2014 Alexandre Otto Strube Slide 22
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Unification
February 1, 2014 Alexandre Otto Strube Slide 23
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Score-P project idea
Communnity project with common infrastructure
What we share:
Single instrumentation and measurement systemCommon data formats: Open Trace Format 2 (OTF2) fortracesPerformance report: Cube4
Single development effort, testing, supportSingle installation, interoperability, etc
So, Score-P is the base instrumentation/measurement forseveral projects
February 1, 2014 Alexandre Otto Strube Slide 24
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Score-P project idea
Communnity project with common infrastructureWhat we share:
Single instrumentation and measurement systemCommon data formats: Open Trace Format 2 (OTF2) fortracesPerformance report: Cube4
Single development effort, testing, supportSingle installation, interoperability, etc
So, Score-P is the base instrumentation/measurement forseveral projects
February 1, 2014 Alexandre Otto Strube Slide 24
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Score-P project idea
Communnity project with common infrastructureWhat we share:
Single instrumentation and measurement system
Common data formats: Open Trace Format 2 (OTF2) fortracesPerformance report: Cube4
Single development effort, testing, supportSingle installation, interoperability, etc
So, Score-P is the base instrumentation/measurement forseveral projects
February 1, 2014 Alexandre Otto Strube Slide 24
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Score-P project idea
Communnity project with common infrastructureWhat we share:
Single instrumentation and measurement systemCommon data formats: Open Trace Format 2 (OTF2) fortraces
Performance report: Cube4
Single development effort, testing, supportSingle installation, interoperability, etc
So, Score-P is the base instrumentation/measurement forseveral projects
February 1, 2014 Alexandre Otto Strube Slide 24
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Score-P project idea
Communnity project with common infrastructureWhat we share:
Single instrumentation and measurement systemCommon data formats: Open Trace Format 2 (OTF2) fortracesPerformance report: Cube4
Single development effort, testing, supportSingle installation, interoperability, etc
So, Score-P is the base instrumentation/measurement forseveral projects
February 1, 2014 Alexandre Otto Strube Slide 24
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Score-P project idea
Communnity project with common infrastructureWhat we share:
Single instrumentation and measurement systemCommon data formats: Open Trace Format 2 (OTF2) fortracesPerformance report: Cube4
Single development effort, testing, support
Single installation, interoperability, etc
So, Score-P is the base instrumentation/measurement forseveral projects
February 1, 2014 Alexandre Otto Strube Slide 24
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Score-P project idea
Communnity project with common infrastructureWhat we share:
Single instrumentation and measurement systemCommon data formats: Open Trace Format 2 (OTF2) fortracesPerformance report: Cube4
Single development effort, testing, supportSingle installation, interoperability, etc
So, Score-P is the base instrumentation/measurement forseveral projects
February 1, 2014 Alexandre Otto Strube Slide 24
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Who uses/develops Score-P?
Scalasca (Fz-Juelich, RTWH Aachen)
Vampir (TU Dresden)Periscope (Tu Munich)Tau (U. Oregon)
February 1, 2014 Alexandre Otto Strube Slide 25
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Who uses/develops Score-P?
Scalasca (Fz-Juelich, RTWH Aachen)Vampir (TU Dresden)
Periscope (Tu Munich)Tau (U. Oregon)
February 1, 2014 Alexandre Otto Strube Slide 25
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Who uses/develops Score-P?
Scalasca (Fz-Juelich, RTWH Aachen)Vampir (TU Dresden)Periscope (Tu Munich)
Tau (U. Oregon)
February 1, 2014 Alexandre Otto Strube Slide 25
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Who uses/develops Score-P?
Scalasca (Fz-Juelich, RTWH Aachen)Vampir (TU Dresden)Periscope (Tu Munich)Tau (U. Oregon)
February 1, 2014 Alexandre Otto Strube Slide 25
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
And why we did it?
February 1, 2014 Alexandre Otto Strube Slide 26
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Cleaning the house
February 1, 2014 Alexandre Otto Strube Slide 27
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
What do we measure?
Measurement of MPI, OpenMP, User-level functions
Generation of flat MPI profiles
Only relinkingMinimum overheadTimes each function was calledTime spent in each functionAmount of data transferred
Call-path profiles
Needs recompilationSome overhead - might need filtering
Trace analysis
Identifies inneficiency patterns in communication andsynchronizationTraces can quickly get huge - better filter that
February 1, 2014 Alexandre Otto Strube Slide 28
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
What do we measure?
Measurement of MPI, OpenMP, User-level functionsGeneration of flat MPI profiles
Only relinkingMinimum overheadTimes each function was calledTime spent in each functionAmount of data transferred
Call-path profiles
Needs recompilationSome overhead - might need filtering
Trace analysis
Identifies inneficiency patterns in communication andsynchronizationTraces can quickly get huge - better filter that
February 1, 2014 Alexandre Otto Strube Slide 28
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
What do we measure?
Measurement of MPI, OpenMP, User-level functionsGeneration of flat MPI profiles
Only relinking
Minimum overheadTimes each function was calledTime spent in each functionAmount of data transferred
Call-path profiles
Needs recompilationSome overhead - might need filtering
Trace analysis
Identifies inneficiency patterns in communication andsynchronizationTraces can quickly get huge - better filter that
February 1, 2014 Alexandre Otto Strube Slide 28
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
What do we measure?
Measurement of MPI, OpenMP, User-level functionsGeneration of flat MPI profiles
Only relinkingMinimum overhead
Times each function was calledTime spent in each functionAmount of data transferred
Call-path profiles
Needs recompilationSome overhead - might need filtering
Trace analysis
Identifies inneficiency patterns in communication andsynchronizationTraces can quickly get huge - better filter that
February 1, 2014 Alexandre Otto Strube Slide 28
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
What do we measure?
Measurement of MPI, OpenMP, User-level functionsGeneration of flat MPI profiles
Only relinkingMinimum overheadTimes each function was called
Time spent in each functionAmount of data transferred
Call-path profiles
Needs recompilationSome overhead - might need filtering
Trace analysis
Identifies inneficiency patterns in communication andsynchronizationTraces can quickly get huge - better filter that
February 1, 2014 Alexandre Otto Strube Slide 28
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
What do we measure?
Measurement of MPI, OpenMP, User-level functionsGeneration of flat MPI profiles
Only relinkingMinimum overheadTimes each function was calledTime spent in each function
Amount of data transferredCall-path profiles
Needs recompilationSome overhead - might need filtering
Trace analysis
Identifies inneficiency patterns in communication andsynchronizationTraces can quickly get huge - better filter that
February 1, 2014 Alexandre Otto Strube Slide 28
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
What do we measure?
Measurement of MPI, OpenMP, User-level functionsGeneration of flat MPI profiles
Only relinkingMinimum overheadTimes each function was calledTime spent in each functionAmount of data transferred
Call-path profiles
Needs recompilationSome overhead - might need filtering
Trace analysis
Identifies inneficiency patterns in communication andsynchronizationTraces can quickly get huge - better filter that
February 1, 2014 Alexandre Otto Strube Slide 28
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
What do we measure?
Measurement of MPI, OpenMP, User-level functionsGeneration of flat MPI profiles
Only relinkingMinimum overheadTimes each function was calledTime spent in each functionAmount of data transferred
Call-path profiles
Needs recompilationSome overhead - might need filtering
Trace analysis
Identifies inneficiency patterns in communication andsynchronizationTraces can quickly get huge - better filter that
February 1, 2014 Alexandre Otto Strube Slide 28
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
What do we measure?
Measurement of MPI, OpenMP, User-level functionsGeneration of flat MPI profiles
Only relinkingMinimum overheadTimes each function was calledTime spent in each functionAmount of data transferred
Call-path profilesNeeds recompilation
Some overhead - might need filteringTrace analysis
Identifies inneficiency patterns in communication andsynchronizationTraces can quickly get huge - better filter that
February 1, 2014 Alexandre Otto Strube Slide 28
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
What do we measure?
Measurement of MPI, OpenMP, User-level functionsGeneration of flat MPI profiles
Only relinkingMinimum overheadTimes each function was calledTime spent in each functionAmount of data transferred
Call-path profilesNeeds recompilationSome overhead - might need filtering
Trace analysis
Identifies inneficiency patterns in communication andsynchronizationTraces can quickly get huge - better filter that
February 1, 2014 Alexandre Otto Strube Slide 28
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
What do we measure?
Measurement of MPI, OpenMP, User-level functionsGeneration of flat MPI profiles
Only relinkingMinimum overheadTimes each function was calledTime spent in each functionAmount of data transferred
Call-path profilesNeeds recompilationSome overhead - might need filtering
Trace analysis
Identifies inneficiency patterns in communication andsynchronizationTraces can quickly get huge - better filter that
February 1, 2014 Alexandre Otto Strube Slide 28
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
What do we measure?
Measurement of MPI, OpenMP, User-level functionsGeneration of flat MPI profiles
Only relinkingMinimum overheadTimes each function was calledTime spent in each functionAmount of data transferred
Call-path profilesNeeds recompilationSome overhead - might need filtering
Trace analysisIdentifies inneficiency patterns in communication andsynchronization
Traces can quickly get huge - better filter that
February 1, 2014 Alexandre Otto Strube Slide 28
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
What do we measure?
Measurement of MPI, OpenMP, User-level functionsGeneration of flat MPI profiles
Only relinkingMinimum overheadTimes each function was calledTime spent in each functionAmount of data transferred
Call-path profilesNeeds recompilationSome overhead - might need filtering
Trace analysisIdentifies inneficiency patterns in communication andsynchronizationTraces can quickly get huge - better filter that
February 1, 2014 Alexandre Otto Strube Slide 28
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Extreme scalability
All parallel:
Data collection/reduction
Analysis:
Pattern searchDelay analysisCritical-path analysis
Visualization
February 1, 2014 Alexandre Otto Strube Slide 29
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Extreme scalability
All parallel:
Data collection/reductionAnalysis:
Pattern searchDelay analysisCritical-path analysis
Visualization
February 1, 2014 Alexandre Otto Strube Slide 29
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Extreme scalability
All parallel:
Data collection/reductionAnalysis:
Pattern search
Delay analysisCritical-path analysis
Visualization
February 1, 2014 Alexandre Otto Strube Slide 29
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Extreme scalability
All parallel:
Data collection/reductionAnalysis:
Pattern searchDelay analysis
Critical-path analysis
Visualization
February 1, 2014 Alexandre Otto Strube Slide 29
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Extreme scalability
All parallel:
Data collection/reductionAnalysis:
Pattern searchDelay analysisCritical-path analysis
Visualization
February 1, 2014 Alexandre Otto Strube Slide 29
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Extreme scalability
All parallel:
Data collection/reductionAnalysis:
Pattern searchDelay analysisCritical-path analysis
Visualization
February 1, 2014 Alexandre Otto Strube Slide 29
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Some MPI patterns
time
pro
cess
ENTER EXIT SEND RECV COLLEXIT
(a) Late Sendertime
pro
cess
(b) Late Receiver
time
pro
cess
(d) Wait at N x Ntime
pro
cess
(c) Late Sender / Wrong Order
February 1, 2014 Alexandre Otto Strube Slide 30
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Late sender
February 1, 2014 Alexandre Otto Strube Slide 31
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Late sender and application topology
February 1, 2014 Alexandre Otto Strube Slide 32
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Direct wait time analysis
A
B
C Recv
Send
Send
foo
foo
foo
bar
bar Recv
bar
time
Recv
Recv
Direct wait
February 1, 2014 Alexandre Otto Strube Slide 33
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Indirect wait time analysis
Recv
Send
Send
foo
foo
foo
bar
bar Recv
A
B
C
bar
time
Recv
Direct waitIndirect wait
Recv
February 1, 2014 Alexandre Otto Strube Slide 34
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Direct wait time
February 1, 2014 Alexandre Otto Strube Slide 35
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Indirect wait time analysis
February 1, 2014 Alexandre Otto Strube Slide 36
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
Root cause analysis
A
B
C Recv
Send
Send
foo
foo
foo
bar
bar Recv
barDELAY
time
Recv
Recv
Direct waitIndirect wait
Recv
cause
February 1, 2014 Alexandre Otto Strube Slide 37
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
6D Hardware topology
February 1, 2014 Alexandre Otto Strube Slide 38
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
The Future
February 1, 2014 Alexandre Otto Strube Slide 39
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
The Future
Energy awareness
Bring performance analysis to YOU!There’s a bunch of experts craving for users and parallelapplication developers!support@score-p.orghttp://www.scalasca.org
February 1, 2014 Alexandre Otto Strube Slide 40
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
The Future
Energy awarenessBring performance analysis to YOU!
There’s a bunch of experts craving for users and parallelapplication developers!support@score-p.orghttp://www.scalasca.org
February 1, 2014 Alexandre Otto Strube Slide 40
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
The Future
Energy awarenessBring performance analysis to YOU!There’s a bunch of experts craving for users and parallelapplication developers!
support@score-p.orghttp://www.scalasca.org
February 1, 2014 Alexandre Otto Strube Slide 40
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
The Future
Energy awarenessBring performance analysis to YOU!There’s a bunch of experts craving for users and parallelapplication developers!support@score-p.org
http://www.scalasca.org
February 1, 2014 Alexandre Otto Strube Slide 40
Mem
bero
fthe
Hel
mho
ltz-A
ssoc
iatio
n
The Future
Energy awarenessBring performance analysis to YOU!There’s a bunch of experts craving for users and parallelapplication developers!support@score-p.orghttp://www.scalasca.org
February 1, 2014 Alexandre Otto Strube Slide 40
Recommended