Intel® Performance Libraries: the latest updates, upcoming features
HPC Code Modernization Workshop for Intel® Xeon® processors & Xeon Phi™ coprocessors
Julia Sukharina
Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice
Intel® Performance Libraries
Intel® Parallel Studio XE
Deliver top C++ and Fortran application performance with less effort
• Faster code: Boost application performance that scales on today’s and next-gen processors
• Create code faster: Utilize a toolset that simplifies creating fast, reliable parallel code
Intel® Math Kernel Library (Intel® MKL): Fastest and most used math library for Intel and compatible processors*
Intel® Integrated Performance Primitives (Intel® IPP): Extensive software library for media and data processing
Intel® Data Analytics Acceleration Library (Intel® DAAL)* (NEW!): Library of optimized building blocks for data analytics
Intel® System Studio (ISS)
Deep system-level insight into power, performance, and reliability
• Accelerate time to market of Intel® architecture-based systems and embedded applications
• Cross-development tools for Intel® architecture and multiple target operating systems
Intel® Math Kernel Library (Intel® MKL)
Intel® Integrated Performance Primitives (Intel® IPP)
Intel® Integrated Native Developer Experience (Intel® INDE)
Cross platform meets native performance
• Cross-OS, Cross-Architecture, Cross-IDE: C++/Java* tools and libraries for Windows* and OS X* on Intel® architecture and Android* on ARM* and Intel® architecture
• More Performance, Less Time: Develop native applications faster through code reuse and streamlined access to platform capabilities
Intel® Integrated Performance Primitives (Intel® IPP)
* Intel® DAAL is also available as standalone
Intel® Math Kernel Library (Intel® MKL)
“I’m a C++ and Fortran developer and have high praise for the Intel® Math Kernel Library. One nice feature I’d like to stress is the bitwise reproducibility of MKL which helps me get the assurance I need that I’m getting the same floating point results from run to run.”
Franz Bernice
CEO and Senior Developer
MSTC Modern Software Technology
“Intel MKL is indispensable for any high-performance computer user on x86 platforms.”
Prof. Jack Dongarra
Innovative Computing Lab
University of Tennessee, Knoxville
“The best new feature of Intel® MKL is the Cluster Sparse Solver. The solution of the sparse linear systems of 10^6 or even 10^7 equations has never been faster.”
Jaroslav Šindler, Researcher
Czech Technical University
Faculty of Mechanical Engineering
Powered by the Intel® Math Kernel Library (Intel® MKL)
‒ Speeds math processing in scientific, engineering and financial applications
‒ Functionality for dense and sparse linear algebra (BLAS, LAPACK, PARDISO), FFTs, vector math, summary statistics and more
‒ Provides scientific programmers and domain scientists with:
‒ Interfaces to de-facto standard APIs from C++, Fortran, C#, Python and more
‒ Support for Linux*, Windows* and OS X* operating systems
‒ Extract great performance with minimal effort
‒ Unleash the performance of Intel® Core, Intel® Xeon and Intel® Xeon Phi™ product families
‒ Optimized for single core vectorization and cache utilization
‒ Coupled with automatic OpenMP*-based parallelism for multi-core, manycore and coprocessors
‒ TBB compatibility
‒ Scales to PetaFlop (10^15 floating-point operations/second) clusters and beyond
‒ Included in Intel® Parallel Studio XE and Intel® System Studio Suites
Used on the World’s Fastest Supercomputers**
**http://www.top500.org
Application areas: Energy, Financial Analytics, Engineering Design, Digital Content Creation, Science & Research, Signal Processing
Mathematical problems arise in many scientific disciplines
These application areas typically involve mathematics
‒ Differential equations
‒ Linear algebra
‒ Fourier transforms
‒ Statistics
Intel® MKL is a Computational Math Library
Energy, Financial Analytics, Science & Research, Engineering Design, Signal Processing, Digital Content Creation
Intel® MKL helps solve your computational challenges
$$-\frac{\partial^2 u}{\partial x^2} - \frac{\partial^2 u}{\partial y^2} + q\,u = f(x, y)$$
Linear Algebra
• BLAS
• LAPACK
• ScaLAPACK
• Sparse BLAS
• Intel® MKL PARDISO
• Cluster Sparse Solver
Fast Fourier Transforms
• Multidimensional
• FFTW interfaces
• Cluster FFT
Vector Math
• Trigonometric
• Hyperbolic
• Exponential
• Log
• Power
• Root
Vector RNGs
• Congruential
• Wichmann-Hill
• Mersenne Twister
• Sobol
• Niederreiter
• Non-deterministic
Summary Statistics
• Kurtosis
• Variation coefficient
• Order statistics
• Min/max
• Variance-covariance
And More
• Splines
• Interpolation
• Trust Region
• Fast Poisson Solver
Optimized Mathematical Building Blocks
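To make the Summary Statistics entries above concrete, here is a minimal pure-Python sketch of a few of them (min/max, an order statistic, variation coefficient, kurtosis). It is not the Intel® MKL API; the function name `summary_statistics` and the choice of population (biased) moments are assumptions made for illustration.

```python
import math
import statistics


def summary_statistics(sample):
    """Return a few summary statistics for a 1-D sample (illustrative only)."""
    n = len(sample)
    mean = statistics.fmean(sample)
    # Population (biased) central moments; other conventions exist.
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    ordered = sorted(sample)  # order statistics are read off the sorted sample
    return {
        "min": ordered[0],
        "max": ordered[-1],
        "median": statistics.median(ordered),
        "variation_coefficient": math.sqrt(m2) / mean,
        "kurtosis": m4 / (m2 ** 2),  # non-excess kurtosis
    }


stats = summary_statistics([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```

A tuned library would compute such estimates in a single vectorized pass over large, multi-dimensional data; the sketch only shows the definitions.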
Automatic Performance Scaling from the Core, to Multicore, to Many Core and Beyond
Extracting performance from the computing resources
‒ Core: vectorization, prefetching, cache utilization
‒ Multicore/many-core (processor/socket) level parallelization
‒ Multi-socket (node) level parallelization
‒ Cluster scaling
Optimizations: current and future hardware support
‒ Always highly tuned for the latest Intel® Xeon® processor
‒ Tuned for Intel® Xeon Phi™ coprocessor x100 (KNC)
‒ Early optimizations for Intel® AVX-512
Features
‒ Conditional Numerical Reproducibility
‒ Extended Eigensolvers based on and compatible with FEAST [1]
‒ Parallel Direct Sparse Solver for Clusters
‒ Small Matrix Multiply enhancements
Notable Enhancements in Intel® MKL 11.0-11.2
Notes: [1] http://www.ecs.umass.edu/~polizzi/feast/
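One commonly cited reason a feature such as Conditional Numerical Reproducibility matters is that floating-point addition is not associative, so a change in how work is split across threads can change the last bits of a result. The sketch below only illustrates that effect in plain Python; it does not use or model Intel® MKL, and the helper name `chunked_sum` is hypothetical.

```python
import math

values = [0.1, 0.2, 0.3]

# The same three numbers, grouped two different ways.
left_grouped = (values[0] + values[1]) + values[2]
right_grouped = values[0] + (values[1] + values[2])
grouping_changes_result = left_grouped != right_grouped


def chunked_sum(data, n_chunks):
    """Sum `data` as n_chunks partial sums, mimicking a split across workers."""
    size = -(-len(data) // n_chunks)  # ceiling division
    partials = [sum(data[i:i + size]) for i in range(0, len(data), size)]
    return sum(partials)


# A correctly rounded sum (math.fsum) does not depend on the ordering.
stable_forward = math.fsum(values)
stable_reversed = math.fsum(reversed(values))
```

Fixing the chunking and the reduction order is one way to get run-to-run identical results; a correctly rounded summation, as with `math.fsum`, is another.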
Optimized for the latest Intel® Xeon® processors and for Intel® Xeon Phi™ x200 coprocessor (KNL)
Batch GEMM functions
⁻ Improve the performance of multiple, simultaneous matrix multiply operations
⁻ Provides grouping (the same sizes and leading dimensions) and batching across groups
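The grouping idea can be sketched in a few lines of plain Python: every matrix pair inside a group shares the same sizes, and a single call processes all groups. This is a conceptual illustration only, not the Intel® MKL batch GEMM interface; `matmul` and `batch_matmul` are hypothetical helpers.

```python
def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "inner dimensions must match"
    cols = len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def batch_matmul(groups):
    """Multiply every (A, B) pair; pairs within one group must share their sizes."""
    results = []
    for group in groups:
        shapes = {(len(a), len(a[0]), len(b[0])) for a, b in group}
        assert len(shapes) == 1, "all pairs in a group must have the same sizes"
        results.append([matmul(a, b) for a, b in group])
    return results


group_2x2 = [([[1, 2], [3, 4]], [[5, 6], [7, 8]]),
             ([[1, 0], [0, 1]], [[9, 8], [7, 6]])]
group_1x3 = [([[1, 2, 3]], [[1], [1], [1]])]
products = batch_matmul([group_2x2, group_1x3])
```

Knowing that all members of a group have identical shapes is what lets an optimized implementation choose one kernel per group and spread many small multiplications across cores.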
GEMMT functions calculate C = A * S * A^T, where S is symmetric and/or diagonal
Sparse BLAS inspector-executor API
⁻ Matrix structure analysis brings performance benefit for relevant applications (e.g., iterative solvers)
⁻ Parallel triangular solver
⁻ Both 0-based and 1-based indexing, row-major and column-major ordering
⁻ Extended BSR support
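The two-stage idea behind an inspector-executor interface can be illustrated as follows: analyze and reorganize the matrix once, then reuse that work across many multiplications, as an iterative solver would. The class below is a hypothetical pure-Python sketch using 0-based indexing and CSR storage; it is not the Intel® MKL Sparse BLAS API.

```python
class CsrMatrix:
    """Sparse matrix built once from (row, col, value) triplets, 0-based."""

    def __init__(self, n_rows, triplets):
        # "Inspector" stage: sort the entries and build CSR arrays once.
        ordered = sorted(triplets)
        self.row_ptr = [0] * (n_rows + 1)
        self.col_idx = []
        self.values = []
        for row, col, value in ordered:
            self.row_ptr[row + 1] += 1
            self.col_idx.append(col)
            self.values.append(value)
        for i in range(n_rows):
            self.row_ptr[i + 1] += self.row_ptr[i]
        # A structural fact an executor could exploit, e.g. for triangular solves.
        self.is_lower_triangular = all(col <= row for row, col, _ in ordered)

    def matvec(self, x):
        # "Executor" stage: cheap to repeat because the layout is already prepared.
        y = []
        for i in range(len(self.row_ptr) - 1):
            start, end = self.row_ptr[i], self.row_ptr[i + 1]
            y.append(sum(self.values[k] * x[self.col_idx[k]]
                         for k in range(start, end)))
        return y


matrix = CsrMatrix(3, [(1, 0, 1.0), (0, 0, 2.0), (2, 2, 4.0), (1, 1, 3.0)])
result = matrix.matvec([1.0, 2.0, 3.0])
```

The analysis cost is paid once in the constructor, which is why the approach pays off mainly when the same matrix is applied many times.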
Counter-based pseudorandom number generators
⁻ ARS-5 based on the Intel AES-NI instruction set
⁻ Philox4x32-10
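What makes a generator counter-based is that each output is a pure function of a key and a counter, so any position in the stream can be computed directly and independent workers can take disjoint counter ranges. The sketch below shows only that property. As a loudly labeled assumption it uses SHA-256 from Python's standard library as a stand-in mixing function; it is neither ARS-5 nor Philox4x32-10, and `counter_rng` is a hypothetical name.

```python
import hashlib
import struct


def counter_rng(key: int, counter: int) -> float:
    """Return a float in [0, 1) that depends only on (key, counter)."""
    block = struct.pack("<QQ", key, counter)   # two unsigned 64-bit integers
    digest = hashlib.sha256(block).digest()    # stand-in mixing function
    (word,) = struct.unpack("<Q", digest[:8])  # first 64 bits of the digest
    return (word >> 11) / float(1 << 53)       # keep 53 bits, scale to [0, 1)


# One stream of ten numbers...
whole_stream = [counter_rng(42, i) for i in range(10)]
# ...and the same stream produced as two independent halves.
first_half = [counter_rng(42, i) for i in range(0, 5)]
second_half = [counter_rng(42, i) for i in range(5, 10)]
```

Because no state is carried from one draw to the next, splitting the counter range across threads or nodes reproduces exactly the sequence a single worker would generate.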
Intel® MKL PARDISO scalability
⁻ Improved Intel® MKL PARDISO and Cluster Sparse Solver scalability on Intel Xeon Phi coprocessors
Cluster components extension
⁻ MPI wrappers provide compatibility with most MPI implementations including custom ones
⁻ Cluster components support on OS X*
New Enhancements in Intel® MKL 11.3
Intel® Integrated Performance Primitives (Intel® IPP)
"I have extensively used Intel IPP functions in my code to
accelerate development of multi format encoding and
decoding, and realized a significant performance boost for
our video transcoder which resulted in improved user
experiences for media playback."
Jagadish Kamath
Co-founder and Software Architect
River Silica Technologies
Intel® IPP is a Performance Primitives Library
Intel® IPP includes algorithms for various scientific disciplines
Image Processing/Color Conversion
• Healthcare
• Special effects for photo/video processing
• Object compression/decompression
• Image scaling, image combination
• Noise reduction
• Optical correction
Computer Vision
• Digital Surveillance
• Industrial/Machine Control
• Image Recognition
• Bio-metric identification
• Remote operation of equipment and gesture interpretation
• Automated sorting of materials or objects
Data Compression
• Internet portal data center
• Data storage centers
• Databases
• Enterprise data management
Signal Processing
• Telecom
• Energy
• Recording, enhancement and playback of audio and non-audio signals
• Echo cancellation: filtering, equalization and emphasis
• Simulation of environment or acoustics
• Games involving sophisticated audio content or effects
Cryptography
• Internet portal data center
• Information Security
• Telecom
• Enterprise data management
• Transaction security
• Smart card interfaces
• ID verification
• Copy protection
• Electronic signature
Optimized Performance Building Blocks
Signal Processing
•Essential Functions:
•Logical and shift; Arithmetic functions; Conversion; Viterbi Decoder; Windowing; Statistical functions; Sampling
•Filtering Functions:
•Finite impulse response (FIR) filter; Adaptive FIR using least mean squares (LMS) filter; Infinite impulse response (IIR) filter; Median filter
•Transform Functions:
•Fourier transform; Hartley transform; Walsh-Hadamard; Discrete cosine transform; Hilbert transform; Wavelet transform
•String Functions
•Fixed-Accuracy Arithmetic Functions:
•Arithmetic functions; Power and root functions; Exponential and logarithmic functions; Trigonometric functions; Hyperbolic functions; Special functions; Rounding functions
•Data Compression Functions:
•VLC and Huffman coding; Dictionary-based compression; BWT-based compression
•Long Term Evolution Wireless Support Functions
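As an illustration of one primitive from the filtering list above, here is a minimal direct-form FIR filter in plain Python. It is a reference-style sketch of the definition, not the Intel® IPP API; the function name `fir_filter` and the zero-history boundary handling are assumptions.

```python
def fir_filter(signal, taps):
    """Direct-form FIR: y[n] = sum_k taps[k] * x[n - k], with x[n] = 0 for n < 0."""
    output = []
    for n in range(len(signal)):
        acc = 0.0
        for k, tap in enumerate(taps):
            if n - k >= 0:  # samples before the start are treated as zero
                acc += tap * signal[n - k]
        output.append(acc)
    return output


# Two-tap moving average applied to a short ramp.
smoothed = fir_filter([1.0, 2.0, 3.0, 4.0], [0.5, 0.5])
# Feeding a unit impulse returns the taps themselves.
impulse_response = fir_filter([1.0, 0.0, 0.0], [0.25, 0.5, 0.25])
```

An optimized primitive computes the same sums with vector instructions and a carefully managed delay line; the arithmetic is unchanged.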
Image Processing
•Image Color Conversion:
•Color model conversions; Color-gray scale conversions; Format Conversion; Color twist; Color keying; Gamma correction; Intensity transformation
•Threshold and Compare Operations
•Morphological Operations
•Filtering Functions:
•Filters with borders; Median filters; General linear filters; Separable filters; Wiener filters; Convolution; Deconvolution; Fixed filters
•Image Linear Transforms
•Image Statistics Functions
•Image Geometry Transforms
•Miscellaneous Image Transforms
•Wavelet Transforms
•Computer Vision:
•Feature detection; Distance transform; Image gradients; Flood fill; Motion analysis and object tracking; Pyramids functions; Universal pyramids functions; Image inpainting; Image segmentation; Pattern recognition; Camera calibration and 3D reconstruction; Image enhancement
Cryptography
•Symmetric Cryptography Primitive Functions:
•Block ciphers: AES, TDES, and SM4
•ARCFour stream cipher
•One-Way Hash Primitives:
•Hash functions for streaming and non-streaming messages
•Hash-based mask generation functions
•Message Authentication Functions:
• Keyed Hash Functions
•Public Key Cryptography Functions:
•RSA Algorithm Functions
•Elliptic Curve Cryptography Functions
Optimized for the latest Intel® Xeon® processors and for Intel® Xeon Phi™ x200 coprocessor (KNL)
New APIs to support external threading
⁻ Intel® IPP library primitives are “thread-safe”
⁻ Intel® IPP internally threaded libraries are available as an optional installation
⁻ External threading is recommended; it is more effective than internal threading
New APIs to support external memory allocation
⁻ All memory allocations are done at the application level
⁻ Reduce memory allocation across different IPP function calls by using a shared memory buffer
Improved CPU dispatcher
⁻ Auto-initialization: no need for the CPU initialization call in static libraries
⁻ Code dispatching based on CPU features
Optimized cryptography functions to support the SM2/SM3/SM4 algorithms
Custom dynamic library building tool
New Enhancements in Intel® IPP 9.0
Intel® Data Analytics Acceleration Library (Intel® DAAL)
New library for data analytics
⁻ Customers: analytics solution providers, system integrators, and application developers (FSI, Telco, Retail, Grid, etc.)
⁻ Key benefits: improved time-to-value, forward-scaling performance and parallelism on IA, advanced analytics building blocks
Key features
⁻ Building blocks highly optimized for IA to support all data analysis stages.
⁻ Support batch, streaming, and distributed processing with easy connectors to popular platforms (Hadoop, Spark)
⁻ Flexible interfaces for handling different data sources (CSV, MySQL, HDFS, RDD (Spark)).
⁻ Rich set of operations to handle sparse and noisy data
⁻ C++ and Java APIs
Intel® DAAL is a Data Analytics Library
Optimized Analytics Building Blocks
Data analysis stages and their building blocks:
Pre-processing: Decompression, Filtering, Normalization
Transformation: Aggregation, Dimension Reduction
Analysis: Summary Statistics, Clustering, etc.
Modeling: Training, Parameter Estimation, Simulation
Validation: Hypothesis testing, Model errors
Decision Making: Forecasting, Decision Trees, etc.
Domains: Scientific/Engineering, Web/Social, Business
Platforms: Compute (Server, Desktop, …), Client Edge, Data Source Edge
Same data analysis process and analytics building blocks despite variety of data formats, usages and domains
Analytics Targets:
⁻ Perform analysis close to data source (sensor/client/server) to optimize response latency, decrease network bandwidth utilization, and maximize security.
⁻ Offload data to server/cluster for complex and large-scale analytics only.
Big Data Analytics Usage Models
[Figure: usage models across the Scientific/Engineering, Web/Social, and Business domains]
Usage contexts: Interactive Modeling, Visualization & Reporting, Production Use
Workflow stages shown: 1. Pre-processing; 2. Analysis; 3. Modeling; 4. Model Validation; 5. Visualization
Workflow stages shown: 1. Pre-processing; 2. Model Calibration; 3. Decision Making; 4. Model Validation; 5. Reporting
Components shown: DAAL software (C++, Java), Visual IDE, Collaboration Services, Data Store, Infrastructure (Comms, Security), Management (Monitoring, Rule Management)
Deployment targets: Workstation, Data Center, Client Edge, Data Source Edge
Data sources: SQL, NoSQL, files, sensor data
C++ and Java programming language APIs
Data mining and analysis algorithms for
⁻ Computing correlation distance and Cosine distance
⁻ PCA (Correlation, SVD)
⁻ Matrix decomposition (SVD, QR, Cholesky)
⁻ Computing statistical moments
⁻ Computing variance-covariance matrices
⁻ Univariate and multivariate outlier detection
⁻ Association rule mining
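Two of the measures listed above have short textbook definitions: cosine distance is one minus the cosine similarity, and correlation distance is one minus the Pearson correlation, which equals the cosine distance of the mean-centered vectors. The plain-Python sketch below follows those definitions; it is not the Intel® DAAL API, and the function names are hypothetical.

```python
import math


def cosine_distance(x, y):
    """1 - cos(angle between x and y); both vectors must be non-zero."""
    dot = sum(a * b for a, b in zip(x, y))
    norms = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return 1.0 - dot / norms


def correlation_distance(x, y):
    """1 - Pearson correlation, computed as cosine distance of centered vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine_distance([a - mean_x for a in x], [b - mean_y for b in y])


orthogonal = cosine_distance([1.0, 0.0], [0.0, 1.0])
parallel = cosine_distance([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
anti_correlated = correlation_distance([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

Both distances range from 0 (same direction or perfectly correlated) to 2 (opposite direction or perfectly anti-correlated).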
Algorithms for supervised and unsupervised machine learning
⁻ Linear regression
⁻ Naïve Bayes classifier
⁻ AdaBoost, LogitBoost, and BrownBoost classifiers
⁻ SVM
⁻ K-Means clustering
⁻ Expectation Maximization (EM) for Gaussian Mixture Models (GMM)
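As a small example of one algorithm from this list, the following sketch implements K-Means clustering with Lloyd's iterations in plain Python. It is not the Intel® DAAL implementation; the function name `kmeans`, the caller-supplied starting centroids, and the fixed iteration count are assumptions chosen to keep the example deterministic.

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm on tuples of floats, starting from the given centroids."""
    centroids = [tuple(c) for c in centroids]
    assignment = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid (squared Euclidean distance).
        assignment = [
            min(range(len(centroids)),
                key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centroids[j])))
            for point in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, assignment


data = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centers, labels = kmeans(data, [(0.0, 0.0), (10.0, 10.0)])
```

Production implementations typically add a convergence test and a smarter initialization; the two alternating steps are the core of the method.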
Support for serialization/deserialization
Support for multiple processing modes
⁻ Batch
⁻ Distributed
⁻ Streaming
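The difference between these modes can be shown with a running mean and variance: batch sees all data at once, streaming updates a partial result block by block, and distributed merges partial results computed on separate nodes. The sketch uses the well-known Welford update and the pairwise merge formula of Chan et al.; the class `RunningMoments` is hypothetical and is not an Intel® DAAL interface.

```python
class RunningMoments:
    """Mean and population variance accumulated incrementally."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, block):
        """Streaming mode: fold in one block of observations (Welford update)."""
        for x in block:
            self.count += 1
            delta = x - self.mean
            self.mean += delta / self.count
            self.m2 += delta * (x - self.mean)

    def merge(self, other):
        """Distributed mode: combine with a partial result from another node."""
        total = self.count + other.count
        if total == 0:
            return
        delta = other.mean - self.mean
        self.m2 += other.m2 + delta * delta * self.count * other.count / total
        self.mean += delta * other.count / total
        self.count = total

    @property
    def variance(self):
        return self.m2 / self.count


# Streaming: three blocks arrive one after another.
streamed = RunningMoments()
for block in ([2.0, 4.0, 4.0], [4.0, 5.0], [5.0, 7.0, 9.0]):
    streamed.update(block)

# Distributed: two nodes each summarize their share, then the results are merged.
node_a, node_b = RunningMoments(), RunningMoments()
node_a.update([2.0, 4.0, 4.0, 4.0])
node_b.update([5.0, 5.0, 7.0, 9.0])
node_a.merge(node_b)
```

Only the small partial results travel between steps or nodes, never the raw data, which is what makes the streaming and distributed modes practical for large inputs.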
Support for local and distributed data sources
⁻ In-file and in-memory CSV
⁻ MySQL
⁻ HDFS
⁻ RDD
Data compression and decompression
⁻ ZLIB
⁻ LZO
⁻ RLE
⁻ BZIP2
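For a feel of what these codecs do, the sketch below round-trips a payload through Python's standard `zlib` and `bz2` modules and through a minimal hand-written run-length encoder. It does not call Intel® DAAL; LZO is omitted because it has no standard-library binding, and `rle_encode`/`rle_decode` are hypothetical helpers using a simple (byte, run length) representation.

```python
import bz2
import zlib


def rle_encode(data: bytes):
    """Run-length encode bytes as a list of (byte value, run length) pairs."""
    runs = []
    for byte in data:
        if runs and runs[-1][0] == byte:
            runs[-1] = (byte, runs[-1][1] + 1)
        else:
            runs.append((byte, 1))
    return runs


def rle_decode(runs) -> bytes:
    """Invert rle_encode."""
    return b"".join(bytes([byte]) * length for byte, length in runs)


payload = b"AAAABBBCCD" * 100

zlib_packed = zlib.compress(payload)
bz2_packed = bz2.compress(payload)
rle_packed = rle_encode(b"AAAABBBCCD")
```

All three are lossless: decompressing returns the original bytes exactly, and only the achieved size and speed differ.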
Intel® DAAL 2016 Beta Features
Optimized for the latest Intel® Xeon® processors and for Intel® Xeon Phi™ x200 coprocessor (KNL)
Intel® MKL and MKL Forum pages
• http://software.intel.com/en-us/articles/intel-mkl/
• http://software.intel.com/en-us/articles/intel-math-kernel-library-documentation/
• http://software.intel.com/en-us/forums/intel-math-kernel-library/
Intel® IPP and IPP Forum pages:
• https://software.intel.com/en-us/intel-ipp
• https://software.intel.com/en-us/forums/intel-integrated-performance-primitives/
Intel® DAAL Forum page:
• https://software.intel.com/en-us/forums/intel-data-analytics-acceleration-library
References
Legal Disclaimer & Optimization Notice
INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS”. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO THIS INFORMATION INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
Copyright © 2015, Intel Corporation. All rights reserved. Intel, Pentium, Xeon, Xeon Phi, Core, VTune, Cilk, and the Intel logo are trademarks of Intel Corporation in the U.S. and other countries.
Optimization Notice
Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804