Intel® Integrated Performance Primitives (IPP)

Agenda
• Overview of Intel® IPP
• Gaining Performance with Intel® IPP
• Programming Conventions
• Lab Exercises
• The Future of Intel® IPP
• Summary

Overview of Intel® IPP

The Intel® Integrated Performance Primitives are…
• A function library for signal processing, image processing, multimedia, vector processing, and more
• A common API across platforms and operating systems
• High-performance code

Easy development across multiple platforms

[Figure: layered diagram with the labels Applications; Sample code; Intel® Math Kernel Library; Intel® Integrated Performance Primitives (IPP) Primitives Interface; Processor-specific Functions; IA-32 (Intel Pentium® and Xeon processors); Intel Itanium® Architecture; Intel® PCA application processors]

• Fast coding
– Automatic selection of the processor-specific DLL
– Architecture-specific instruction sets

[Figure: Integrated Performance Primitives (IPP) targets: Itanium® Architecture; Intel PCA application processors based on XScale™ technology; full IA-32 family (Pentium® II, Pentium® III, Pentium® 4, and Xeon™ processors) with MMX™ technology, Streaming SIMD Extensions, and Streaming SIMD Extensions 2]
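
The idea of selecting processor-specific code behind one common interface can be sketched in plain C with a function pointer. This is an illustration only, not how IPP implements its dispatch: the names cpu_kind, add_const_generic, add_const_tuned, and select_add_const are hypothetical, the "tuned" variant is a stand-in that computes the same result four elements at a time, and the saturating behavior is a choice made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical processor classes for this sketch. */
typedef enum { CPU_GENERIC, CPU_TUNED } cpu_kind;

/* Common signature shared by every variant of the operation. */
typedef void (*add_const_fn)(unsigned char *data, size_t len, unsigned char value);

/* Portable variant: saturating add of a constant, one element per iteration. */
void add_const_generic(unsigned char *data, size_t len, unsigned char value)
{
    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        unsigned int sum = (unsigned int)data[i] + value;
        data[i] = (unsigned char)(sum > 255u ? 255u : sum);
    }
}

/* Stand-in for a processor-specific variant: same result, four elements per iteration. */
void add_const_tuned(unsigned char *data, size_t len, unsigned char value)
{
    assert(data != NULL || len == 0);
    size_t i = 0;
    for (; i + 4 <= len; i += 4) {
        for (size_t j = 0; j < 4; j++) {
            unsigned int sum = (unsigned int)data[i + j] + value;
            data[i + j] = (unsigned char)(sum > 255u ? 255u : sum);
        }
    }
    add_const_generic(data + i, len - i, value); /* remaining 0 to 3 elements */
}

/* Pick the variant once; callers only ever see the common signature. */
add_const_fn select_add_const(cpu_kind kind)
{
    return kind == CPU_TUNED ? add_const_tuned : add_const_generic;
}
```

A caller would obtain the pointer once at start-up, for example with select_add_const(CPU_TUNED), and use it everywhere afterwards, so application code stays the same across processors.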

Cross platform & OS

• Support for multiple platforms
– MMX™, Streaming SIMD Extensions (SSE) and Streaming SIMD Extensions 2 (SSE-2) Technologies
– IA-32 (including Intel® Xeon™ processor)
– Itanium® architecture
– Intel® XScale™ micro-architecture
• Support for multiple operating systems
– Windows NT* 4.0 / Windows 2000* / Windows XP*
– Windows XP* 64-bit
– Linux* & Linux-64
– Windows CE*, Linux in embedded devices

Optimized applications without writing assembly code

Gaining Performance with Intel® IPP

Comments on IPP performance

Leo Volfson, President and Chief Technology Officer, Inetcam, Inc.

"The Intel® Integrated Performance Primitives (IPP) has enhanced the iVISTA* application to be more in line with customers' expectations. For example, it enables us to dynamically rescale, in real-time, a video stream without loss of performance. This capability would not be possible without IPP."

"The Intel IPP provided a 300% improvement in the number of users who can simultaneously participate in a webcast. In addition, the migration from the Intel Pentium III to the Intel Pentium 4 took only a day."

Bryan Cook, Software Architect, AuSIM Inc, Los Altos, California, October 2001

"AuSIM Inc. delivers the most advanced audio simulation technology for mission-critical aural displays and simulations. With Intel's Integrated Performance Primitives (IPP), AuSIM has leveraged 4X performance gains within its AuSIM3D* audio simulation technology. …directly enhances AuSIM's ability to provide the ultimate audio solutions for simulations, team communications, audio production, tele-conferences, and aural information displays."
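
Minor effort, major gains

• With Intel® IPP, small code changes can yield large performance gains
• Replacing run-time library function calls with IPP calls can greatly improve the performance of an application module

Original loop (relative performance 1x):

    /* Take the sine of every element, in place, one call per element. */
    pFlTmp=flArgSin[thrd].algPtr;
    for(t=0; t<4*iWvMeshSize; t++) {
        pFlTmp[t]=(float)sin(pFlTmp[t]);
    }

IPP replacement (relative performance 4x):

    /* One vector call; source and destination are the same buffer. */
    ippsSin_32f_A11(flArgSin[thrd].algPtr,
                    flArgSin[thrd].algPtr, 4*iWvMeshSize);

Demo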

Intel® IPP Programming Conventions

An IPP Function Name

ippsAddC_8u_I();

• Prefix: ipps, ippi, ippm
• Base name: e.g. Add, DCT, etc.
• Data array type: indicates bit depth and integer / floating point, e.g. 8u, 16s, 32f
• Descriptors: indicate data layout variants
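
To make the naming convention concrete, the sketch below splits a name such as ippsAddC_8u_I into the four parts listed above. It is an illustration written for this deck, not part of IPP: the struct ipp_name_parts and the function split_ipp_name are hypothetical, and the rule "prefix is ipp plus one letter, then fields separated by underscores" is an assumption read off the example name. Names that carry two data types (such as ippsConvert_8u_32f) end up with the second type in the descriptors field.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Parts of a function name, following the convention on the slide. */
typedef struct {
    char prefix[8];       /* e.g. "ipps" */
    char base[32];        /* e.g. "AddC" */
    char data_type[16];   /* e.g. "8u" */
    char descriptors[32]; /* e.g. "I" or "C3P3R"; empty if none */
} ipp_name_parts;

/* Split a name into its parts.
   Returns 1 on success, 0 if the name does not fit the assumed pattern. */
int split_ipp_name(const char *name, ipp_name_parts *out)
{
    assert(name != NULL && out != NULL);
    memset(out, 0, sizeof *out);

    /* Prefix: "ipp" plus one domain letter (s, i, or m on the slide). */
    if (strncmp(name, "ipp", 3) != 0 || name[3] == '\0') return 0;
    memcpy(out->prefix, name, 4);

    /* Base name: everything up to the first underscore. */
    const char *base = name + 4;
    const char *us1 = strchr(base, '_');
    if (us1 == NULL || us1 == base) return 0;
    size_t base_len = (size_t)(us1 - base);
    if (base_len >= sizeof out->base) return 0;
    memcpy(out->base, base, base_len);

    /* Data type: the token after the first underscore. */
    const char *type = us1 + 1;
    const char *us2 = strchr(type, '_');
    size_t type_len = us2 ? (size_t)(us2 - type) : strlen(type);
    if (type_len == 0 || type_len >= sizeof out->data_type) return 0;
    memcpy(out->data_type, type, type_len);

    /* Descriptors: whatever follows the second underscore, if anything. */
    if (us2 != NULL) {
        if (strlen(us2 + 1) >= sizeof out->descriptors) return 0;
        strcpy(out->descriptors, us2 + 1);
    }
    return 1;
}
```

For ippsAddC_8u_I this yields prefix ipps, base name AddC, data type 8u, and descriptor I.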

Data-types and layouts

• Optimizations specific to data types (8u, 16s, 32f)
• Optimizations specific to data layouts (pixel, planar…)
• Optimized functions for converting between data types and layouts

ippsConvert_8u_32f()
ippsInterleave_16s()
ippiRGBToYUV_8u_C3P3R()

Color Models:
• YCbCr (4:4:4, 4:2:2)
• YUV (4:2:2, 4:2:0)
• YCC, HLS, RGB, RGBA …

Interleaving:
• L/R/L/R/L/R… stereo
• RGBRGBRGB… color image
• RGBARGBA… color + alpha

Data layout

• In pixel format, all channels of each pixel are stored consecutively
• In planar format, the first channel of every pixel is stored, then the second channel of every pixel, and so on
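
The difference between the two layouts is easiest to see in a conversion routine. The following plain C sketch converts a three-channel 8-bit image between pixel order (RGBRGB…) and three separate planes. The function names rgb_pixel_to_planar and rgb_planar_to_pixel are hypothetical and written for this illustration; they are not IPP functions, and they assume tightly packed data with no row padding.

```c
#include <assert.h>
#include <stddef.h>

/* Pixel order (RGBRGB...) to three planes (RRR..., GGG..., BBB...). */
void rgb_pixel_to_planar(const unsigned char *pixels, size_t pixel_count,
                         unsigned char *r, unsigned char *g, unsigned char *b)
{
    assert(pixels != NULL && r != NULL && g != NULL && b != NULL);
    for (size_t i = 0; i < pixel_count; i++) {
        r[i] = pixels[3 * i + 0];
        g[i] = pixels[3 * i + 1];
        b[i] = pixels[3 * i + 2];
    }
}

/* Inverse: interleave three planes back into pixel order. */
void rgb_planar_to_pixel(const unsigned char *r, const unsigned char *g,
                         const unsigned char *b, size_t pixel_count,
                         unsigned char *pixels)
{
    assert(pixels != NULL && r != NULL && g != NULL && b != NULL);
    for (size_t i = 0; i < pixel_count; i++) {
        pixels[3 * i + 0] = r[i];
        pixels[3 * i + 1] = g[i];
        pixels[3 * i + 2] = b[i];
    }
}
```

The same interleaving pattern applies to L/R stereo audio, with two channels instead of three.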
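
Intel® IPP Lab Exercise: Signal Processing

IPP optimizations for signal processing

IPP provides optimized performance for signal processing applications:
• Signal generation
• Signal conversion
• Time-domain / frequency-domain transforms (FFTs)
• Filters (FFT, FIR)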
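
As a reference point for what an FIR filter primitive computes, here is a minimal direct-form implementation in plain C. It is an unoptimized sketch, not IPP code: the name fir_filter is hypothetical, samples before the start of the input are assumed to be zero, and the output buffer must not overlap the input.

```c
#include <assert.h>
#include <stddef.h>

/* Direct-form FIR: y[n] = sum over k of h[k] * x[n - k],
   treating x[n] as 0 for n < 0. */
void fir_filter(const float *x, size_t len,
                const float *h, size_t taps, float *y)
{
    assert(x != NULL && h != NULL && y != NULL && taps > 0);
    for (size_t n = 0; n < len; n++) {
        float acc = 0.0f;
        /* Stop at k == n so that x[n - k] never reads before the buffer. */
        for (size_t k = 0; k < taps && k <= n; k++) {
            acc += h[k] * x[n - k];
        }
        y[n] = acc;
    }
}
```

With the two taps {0.5, 0.5} this is a two-point moving average. An optimized library routine computes the same sums with processor-specific instructions.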

• Support
– conj, copy, imag, real, zero, set
• Convert
– polar/cart, complex/real
– integer/float
– up/down sample
• Windowing
– Bartlett, Blackman, Hamming, Hann, Kaiser
• Signal Generation
– random, wave patterns
• Filters
– FIR, IIR
– median
• Transforms
– FFT, DFT, Goertzel, DCT, wavelet
• Statistics
– norms, threshold, min / max / std.dev., mean, powerspectr
• Audio
– A-law / mu-law, preemphasize

Lab
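
For the lab, it can help to have a plain C reference to compare against. The sketch below builds a Bartlett (triangular) window and applies it to a signal. The names bartlett_window and apply_window are hypothetical, and the endpoint convention (zero at both ends) is an assumption of this sketch; a library's window function may use a slightly different definition.

```c
#include <assert.h>
#include <stddef.h>

/* Fill w[0..n-1] with a triangular window that is 0 at both ends
   and 1 at the center: w[i] = 1 - |i - (n-1)/2| / ((n-1)/2). */
void bartlett_window(float *w, size_t n)
{
    assert(w != NULL && n >= 2);
    float half = (float)(n - 1) / 2.0f;
    for (size_t i = 0; i < n; i++) {
        float d = (float)i - half;
        if (d < 0.0f) d = -d;
        w[i] = 1.0f - d / half;
    }
}

/* Multiply a signal by a window, element by element, in place. */
void apply_window(float *signal, const float *w, size_t n)
{
    assert(signal != NULL && w != NULL);
    for (size_t i = 0; i < n; i++) {
        signal[i] *= w[i];
    }
}
```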
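
Intel® IPP Lab Exercise: Image Processing

IPP optimizations for image processing

IPP provides optimized performance for image processing applications:
• Image generation
• Color conversion
• Transforms and filtering
• Arithmetic and logical operations
• Geometric transforms
• Codec support (MPEG-1, -2, -4, H.263)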

• Arithmetic & Logical
– Abs, Add, convolve, cross-correlation, div, exp, ln, LShift, normalize, mul, RShift, sqr, sqrt, sub, threshold
– And, not, or
– Compare
– Phase, magnitude
• Convert
– Pixel/planar
– Color conversions
• Filters
– User-defined and built-in
• Transforms
– FFT, DFT, DCT, wavelet
• Statistics
– norms, threshold, min / max / std.dev., mean, powerspectr, moments
• Geometric
– Mirror, rotate, resize, remap
• Alpha Composite
• Gamma correction
• Image Generation

Lab
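
As a plain C reference for one of the geometric operations, the sketch below mirrors an 8-bit single-channel image left to right, in place. The name mirror_horizontal_8u is hypothetical and this is not IPP code. The step parameter (bytes between the starts of consecutive rows) is an assumption chosen so that rows may carry padding.

```c
#include <assert.h>
#include <stddef.h>

/* Mirror an 8-bit, single-channel image about its vertical axis, in place.
   `step` is the distance in bytes between the starts of consecutive rows,
   which may be larger than `width` when rows are padded. */
void mirror_horizontal_8u(unsigned char *image, size_t step,
                          size_t width, size_t height)
{
    assert(image != NULL && step >= width);
    for (size_t y = 0; y < height; y++) {
        unsigned char *row = image + y * step;
        /* Swap pixels from both ends of the row toward the middle. */
        for (size_t x = 0; x < width / 2; x++) {
            unsigned char tmp = row[x];
            row[x] = row[width - 1 - x];
            row[width - 1 - x] = tmp;
        }
    }
}
```

Padding bytes between rows are left untouched, so the routine also works on a rectangular region inside a larger image.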

The Future of Intel® IPP

Future IPP: version 3.0 or later
…supporting migration to new platforms

• Migration support:
– SPL (Signal Processing Library)
– IPL (Image Processing Library)
– IJL (Intel® JPEG Library)
– RPL (Recognition Primitives Library)

Future IPP: version 3.0 or later
…supporting video, image and speech encoding/decoding

• Codec primitives:
– MPEG-1, MPEG-2, MPEG-4, JPEG2000
– G.729, G.723, GSM AMR
• Codec samples:
– MPEG-1, MPEG-2, MPEG-4
– G.729, G.723, GSM AMR

Future IPP: version 3.0 or later
…tell us what you want!

• Support for more color models and conversions
• …and much, much more!

Intel® IPP Summary

• A high-performance function library for signal processing, image processing, multimedia, and vector math
• Large performance gains from small code changes
• A common API across platforms and operating systems