Page 1: Improve Application Performance on Windows*

Copyright © 2002, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners. www.intel.com/software/products

Page 2: What is the world’s biggest semiconductor company doing building software products?

Page 3: Intel® Software Development Products

Intel® Compilers: Best way to get application performance on Intel processors

Intel® VTune™ Analyzers: Quickly identify “hot spots” and how to fix them

Intel® Performance Libraries: Highly optimized, ready to use building-block functions

Intel® Threading Tools: Speeds, simplifies development & maintenance of threaded apps

Intel® Cluster Tools: Create, analyze, optimize and deploy cluster-based applications

Intel Software Development Products for Intel® Personal Internet Client Architecture processors, Pentium® M, Pentium® 4, Intel® Xeon™ and Itanium® 2 Processors

Page 4: Intel® Software Development Products

Performance
– Enable developers to deliver higher performance software

Compatibility
– Compatible with the leading tools and development environments already used by many software developers
– Easy to incorporate into the development process

Support
– Premier Customer Support
– Technical training offered through Intel Software College

Page 5: Intel Compilers

http://www.intel.com/software/products

Page 6: Compilers for Intel PCA, Intel® 32-bit, EM64T & Itanium® 2 Processors

Intel compilers for the Intel PCA processor line support Intel® Wireless MMX™ technology

Intel 32-bit processor support: SSE3, Intel NetBurst® microarchitecture, Hyper-Threading Technology

Itanium® 2 processor support: software pipelining, improved branch prediction, branch reduction through predication

Advanced optimization features of Intel compilers
– Profile Guided Optimization, Inter-Procedural Optimization
– Parallelism: Auto-parallelization, vectorization, OpenMP* support
– Data prefetching
– Processor dispatch on IA-32 processors

Intel® Premier Support: Compiler updates, support, expertise, customer interaction via compiler forums, architectural information, white papers and more

Page 7: Intel Compilers: Optimize for Specific Processors

Instruction Scheduling
– Schedule instructions to be optimal for a specific processor
– How? On Windows*: /G1, /G2, /G5, /G7…

Build target for specific processor
– Uses opcodes and features specific to the target processor, such as SSE, SSE2 and vectorization
– Runs only on the target processor
– How? On Windows*: /QxK, /QxW, /QxB…

Automatic Processor Dispatch
– Runs on all x86 processors
– How? On Windows*: /QaxK, /QaxW, /QaxB…

Page 8: Intel Compilers: High-Level Optimizations

High-Level Optimizer
– Performs loop-level optimizations, aids optimal memory access
– How? On Windows*: /O3

Inter-Procedural Optimization
– Enables inter-procedural optimizations for single or multiple files
– How? On Windows*: /Qip, /Qipo

Profile Guided Optimization
– Uses execution-time feedback to guide optimization
– Aids paging, branch prediction, basic block reordering
– How? On Windows*: /Qprof_gen, /Qprof_use

Page 9: Intel Compilers: Using Parallel Programming Directives

Auto-Parallelization
– Automatically converts loops to use multiple processors
– How? On Windows*: /Qparallel

OpenMP Support
– Intel Compilers support multi-platform shared-memory parallel programming in C/C++ and Fortran on all platforms and operating systems
– How? On Windows*: /Qopenmp

OpenMP usage example

    #pragma omp parallel for
    for (i = 0; i < n; i++) {
        dy[i] = dy[i] + da*dx[i];
    }
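The loop in the usage example can be wrapped in a complete function, shown here as a minimal sketch. The function name `daxpy` and its signature are assumptions made for illustration; the slide shows only the loop body. If OpenMP is not enabled at compile time, the pragma is ignored and the loop simply runs serially.

```c
/* Computes dy[i] = dy[i] + da * dx[i] for i in [0, n).
 * Each iteration touches a distinct element of dy, so the iterations
 * are independent and the loop can be divided among threads.
 * The name and signature are illustrative, not taken from the slide. */
void daxpy(int n, double da, const double *dx, double *dy)
{
    int i;

    /* The loop variable of a "parallel for" is private to each thread. */
    #pragma omp parallel for
    for (i = 0; i < n; i++) {
        dy[i] = dy[i] + da * dx[i];
    }
}
```

A caller passes two arrays of equal length, for example `daxpy(n, 2.0, x, y)`.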

Page 10: Intel® Code Coverage Tool

Example of code coverage summary for a project. The workload applied in this test exercised 34 of 143 blocks, representing 5 of 19 functions in 2 of 3 modules. In the file SAMPLE.C, 4 of 5 functions were exercised.

Clicking on SAMPLE.C produces a listing that highlights the code that was exercised. In this example, the pink-highlighted code was never exercised, the yellow was run but not exercised by any of the tests set up by the developer and the beige was partially covered.

Page 11: Intel® Test Prioritization Tool

Helps guide and speed software testing
– Helps produce better code more quickly
– Helps improve programmer productivity

Example:
– These 3 tests achieve 52.17% block and 50.00% function coverage
– Test 3 alone covers 45.65% of basic blocks, or 87.50% of the total block coverage from all tests
– By adding Test 2, cumulative block coverage goes to 52.17%, or 100% of the total block coverage of Test 1, Test 2, and Test 3
– Eliminating Test 1 has no negative impact on block coverage and saves time

Number of Tests   %Rat Cvrg   %Blk Cvrg   %Func Cvrg   Test Names @ Options
1                 87.50       45.65       37.50        Test3.dpi
2                 100.00      52.17       50.00        Test2.dpi

Total Number of Tests = 3
Total Block Coverage ~ 52.17%
Total Function Coverage ~ 50.00%
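One plausible way to arrive at an ordering like the one in this example is a greedy selection over per-test block coverage. The sketch below is an illustration only, not a description of how the Intel tool works. The function names, the bitmask representation (one bit per basic block) and the limit of 32 tests are all assumptions.

```c
/* Counts the set bits in a coverage mask (one bit per basic block). */
static int count_blocks(unsigned mask)
{
    int n = 0;
    while (mask) {
        n += (int)(mask & 1u);
        mask >>= 1;
    }
    return n;
}

/* Greedy selection: repeatedly pick the test that adds the most
 * not-yet-covered blocks, and stop when no remaining test adds any.
 * coverage[t] is the block mask of test t; num_tests must be <= 32.
 * Writes the chosen test indices to order[] and returns how many
 * tests were chosen. Tests left out add no block coverage. */
int prioritize_tests(const unsigned *coverage, int num_tests, int *order)
{
    unsigned covered = 0u;   /* blocks covered by the tests chosen so far */
    unsigned used = 0u;      /* bit t is set once test t has been chosen */
    int chosen = 0;

    for (;;) {
        int best = -1;
        int best_gain = 0;
        int t;

        for (t = 0; t < num_tests; t++) {
            int gain;
            if (used & (1u << t)) {
                continue;
            }
            gain = count_blocks(coverage[t] & ~covered);
            if (gain > best_gain) {
                best_gain = gain;
                best = t;
            }
        }
        if (best < 0) {
            break;           /* no remaining test adds coverage */
        }
        covered |= coverage[best];
        used |= 1u << best;
        order[chosen++] = best;
    }
    return chosen;
}
```

With three hypothetical tests where the first covers only blocks that the third also covers, the function selects the third test, then the second, and drops the first, which mirrors the shape of the result in the table.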

Page 12: Intel® Compilers 8.1

C++ and Fortran

IA-32, Intel® Itanium® 2, EM64T & Intel® PCA processor-based systems

Intel® Code-Coverage & Intel® Test-Prioritization tools

Threaded application support (Hyper-Threading Technology)
– OpenMP* 2.0 standard support
– Auto-Parallel feature that automatically generates threaded code

Windows specific:
– Integrates into MS Visual Studio .NET* IDE
– Support for MSVC.NET* language features (no support for C# or managed code)
– Compaq Visual Fortran* language features with Intel code generation and optimization technology

Page 13: Intel VTune Performance Analyzer

http://www.intel.com/software/products

Page 14: Performance Tuning

Detecting common issues
– Where to add threads, what to optimize?
– Load imbalance?
– Wait, blocked, or idle time?
– Excessive overhead?
– Processor architecture issues?
– Application issues?

No particular order: address issues as needed

Page 15: Intel® VTune™ Performance Analyzer

VTune analyzer’s intimate knowledge of the processor enables it to provide extensive insights into how software utilizes CPU resources

Allows you to identify and locate performance bottlenecks in your code
– Collects and displays software performance data
– Features that help you identify and address performance issues:
    Sampling that uses non-intrusive technologies
    Call Graph that displays graphically the program’s flow of control
    Analyzer that has detailed knowledge of the processor’s microarchitecture
    Intel Tuning Assistant that suggests optimization techniques for your Windows code

“The Intel VTune Performance Analyzer took a multi-day task and turned it into a sub-day task.”
– Randy Camp, V.P. Software Research and Development, MUSICMATCH, Inc.

Page 16: Sampling – Identifying Performance Bottlenecks

“Sample” the CPU’s execution context

As the program runs, gather occasional CPU context snapshots triggered by the CPU’s performance monitoring registers
– Interrupt-based sampling using CPU registers
– Low intrusion: little effect on the performance of the software being measured
– No special builds required

Sample rate set to provide statistically meaningful data
– Based on CPU clock speed or can be auto-calibrated

Can measure performance-sensitive CPU events
– Cache misses, branch mispredictions, etc.
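The idea behind sampling can be sketched as a histogram: count how many snapshots landed in each function and report the one with the most. This is a simplified illustration, not the analyzer's implementation. It assumes each sample has already been resolved to a function index, whereas a real profiler records instruction addresses and maps them to modules and functions using symbols. The name `find_hotspot` is hypothetical.

```c
/* Tallies samples per function and returns the index of the function
 * with the most samples, or -1 if no valid sample was seen.
 * samples[i] is the (already resolved) function index of sample i;
 * counts[] must have room for num_functions entries and receives
 * the per-function tallies. On a tie the lowest index wins. */
int find_hotspot(const int *samples, int num_samples,
                 int *counts, int num_functions)
{
    int i;
    int hot = -1;

    for (i = 0; i < num_functions; i++) {
        counts[i] = 0;
    }
    for (i = 0; i < num_samples; i++) {
        int f = samples[i];
        if (f >= 0 && f < num_functions) {
            counts[f]++;     /* ignore samples outside the known functions */
        }
    }
    for (i = 0; i < num_functions; i++) {
        if (counts[i] > 0 && (hot < 0 || counts[i] > counts[hot])) {
            hot = i;
        }
    }
    return hot;
}
```

Because the result is statistical, more samples give a more reliable ranking, which is why the sample rate matters.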

Page 17: How to use Intel VTune Performance Analyzer

Build the application
– Build the application in Release mode with compiler optimizations

Find “Hotspots” using VTune
– A “Hotspot” in an application or a system is a section of code where there is a significant amount of activity.
– Finding “hotspots” helps you determine which compiler and code optimizations are needed to improve performance.

Symbols required for VTune Analyzer
– Required Intel compiler switch (on Windows*): /Zi

Page 18: Start New Project using Sampling Wizard

Intel VTune Performance Analyzer

Select Application Type to Profile

Select Application to Launch

Page 19: Understanding VTune Interface

System-wide performance data

Choose Project/Activity/Run

Different Views

Most Instructions Retired

Statistics Summary

Events Measured

Sampling Analysis

Per CPU Analysis

Status Output

Page 20: Hotspot Drill Down

LINPACK performance data

Function Statistics

Symbols required for Hotspot Drill-down

Events Measured

Is this the Hotspot?

More analysis needed. Use VTune Call Graph feature to obtain flow info!

Page 21: Source Level View

“Hotspot” source

Efficiency (CPI)

View Assembly

Page 22: Using Sampling & Call Graph Together

Why? Use sampling to find which functions have hotspots. Use call graph to find out who is calling these functions.

Page 23: What Are Users Saying

“SGI develops applications for its computers that employ many levels of parallelism, demanding the highest level of performance. The VTune Performance Analyzer for Windows provided invaluable insights to the correction of performance bottlenecks in these applications at the process, thread, and basic block levels.”
– Arthur Raefsky, Technical Lead, SGI, Mountain View, CA

Page 24: Intel Threading Tools

http://www.intel.com/software/products

Page 25: Threads Defined

Threading Overview

OS creates process for each program loaded
– Each process executes as a separate thread

Additional threads can be created within the process

All threads share code and data
– Each thread has its own Stack and Instruction Pointer

[Diagram: a Process containing shared Code and Data, with threads thread1(), thread2() … threadN(), each having its own Stack and IP]
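The split between shared data and per-thread stacks can be illustrated with a short Pthreads sketch. The names `worker`, `run_workers` and `shared_results` are hypothetical, and Pthreads is used only because the deck mentions it alongside Win32* threads.

```c
#include <pthread.h>

#define NUM_THREADS 4

/* Data segment: one copy, visible to every thread in the process. */
static int shared_results[NUM_THREADS];

/* Every thread runs this same code, each on its own stack,
 * so the variable `local` is private to the calling thread. */
static void *worker(void *arg)
{
    int id = *(int *)arg;
    int local = id * id;          /* lives on this thread's stack */

    shared_results[id] = local;   /* distinct slot per thread, so no race */
    return NULL;
}

/* Creates NUM_THREADS threads, waits for all of them, and returns the
 * sum of the values they wrote to the shared array (-1 on failure). */
int run_workers(void)
{
    pthread_t threads[NUM_THREADS];
    int ids[NUM_THREADS];
    int created = 0;
    int failed = 0;
    int sum = 0;
    int i;

    for (i = 0; i < NUM_THREADS; i++) {
        ids[i] = i;
        if (pthread_create(&threads[i], NULL, worker, &ids[i]) != 0) {
            failed = 1;
            break;
        }
        created++;
    }
    for (i = 0; i < created; i++) {
        pthread_join(threads[i], NULL);
    }
    if (failed) {
        return -1;
    }
    for (i = 0; i < NUM_THREADS; i++) {
        sum += shared_results[i];
    }
    return sum;
}
```

Each thread writes to its own array element, which is why no lock is needed here; threads that update the same location need synchronization.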

Page 26: Amdahl’s Law

Threading Overview

If only 1/2 of the code is parallel, 2X speedup is unlikely

$T_{Parallel} = \{(1 - P) + P/N\}\,T_{Total} + O$

P = parallel portion of process
N = number of processors (cores)
O = parallel overhead

[Diagram: a time bar of length T_Total divided into a serial portion (1-P) and a parallel portion P]
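The run-time model on this slide, T_Parallel = ((1 - P) + P/N) × T_Total + O, is easy to evaluate directly. The sketch below uses hypothetical function names and treats the overhead O as a constant supplied by the caller.

```c
/* Estimated parallel run time:
 *   T_parallel = ((1 - P) + P / N) * T_total + O
 * t_total  : serial run time
 * p        : parallel portion of the process, between 0 and 1
 * n        : number of processors (cores), at least 1
 * overhead : parallel overhead, in the same unit as t_total */
double parallel_time(double t_total, double p, int n, double overhead)
{
    return ((1.0 - p) + p / (double)n) * t_total + overhead;
}

/* Speedup of the parallel run relative to the serial run. */
double speedup(double t_total, double p, int n, double overhead)
{
    return t_total / parallel_time(t_total, p, n, overhead);
}
```

With P = 0.5, two processors and no overhead, a 100-second job takes 75 seconds, a speedup of about 1.33X. Even with a very large N the speedup stays below 2X, and any overhead lowers it further.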

Page 27: Correctness Bugs: Data Races

Threading Overview: Challenges Unique to Threading

Suppose: a=1, b=2

Thread1: x = a + b
Thread2: b = 42

What is the value of x if:
– Thread1 runs before Thread2? x = 3
– Thread2 runs before Thread1? x = 43

Data race: concurrent read, modify, write of same address

Outcome depends on thread execution order
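The two outcomes can be reproduced deterministically by running the two statements in each order on a single thread. This is a simulation of the interleavings, not a threaded program; the struct and function names are made up for the illustration.

```c
/* The shared variables from the example. */
struct shared {
    int a;
    int b;
    int x;
};

/* The statement executed by Thread1. */
static void thread1_step(struct shared *s)
{
    s->x = s->a + s->b;
}

/* The statement executed by Thread2. */
static void thread2_step(struct shared *s)
{
    s->b = 42;
}

/* Value of x when Thread1's statement executes before Thread2's. */
int x_if_thread1_first(void)
{
    struct shared s = {1, 2, 0};

    thread1_step(&s);
    thread2_step(&s);
    return s.x;
}

/* Value of x when Thread2's statement executes before Thread1's. */
int x_if_thread2_first(void)
{
    struct shared s = {1, 2, 0};

    thread2_step(&s);
    thread1_step(&s);
    return s.x;
}
```

In a real threaded run the scheduler picks the order, so the same binary can produce either value from one run to the next.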

Page 28: Solving Data Races: Synchronization

Threading Overview: Challenges Unique to Threading

Thread1:
    Acquire(L)
    a = 1
    b = 2
    x = a + b
    Release(L)

Thread2:
    Acquire(L)
    b = 42
    Release(L)

Acquisition of mutex L ensures atomic access
– Only one thread can hold lock at a time

Example APIs:
– EnterCriticalSection(), LeaveCriticalSection()
– pthread_mutex_lock(), pthread_mutex_unlock()
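The Acquire(L) and Release(L) pseudocode maps onto the Pthreads calls named on this slide roughly as follows. This is a minimal sketch; the function names `thread1`, `thread2` and `run_synchronized`, and the variable `lock_l`, are chosen for the illustration.

```c
#include <pthread.h>

static pthread_mutex_t lock_l = PTHREAD_MUTEX_INITIALIZER;  /* mutex L */
static int a, b, x;                                         /* shared data */

static void *thread1(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&lock_l);      /* Acquire(L) */
    a = 1;
    b = 2;
    x = a + b;
    pthread_mutex_unlock(&lock_l);    /* Release(L) */
    return NULL;
}

static void *thread2(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&lock_l);      /* Acquire(L) */
    b = 42;
    pthread_mutex_unlock(&lock_l);    /* Release(L) */
    return NULL;
}

/* Runs both threads to completion and returns the final value of x,
 * or -1 if a thread could not be created. */
int run_synchronized(void)
{
    pthread_t t1, t2;

    if (pthread_create(&t1, NULL, thread1, NULL) != 0) {
        return -1;
    }
    if (pthread_create(&t2, NULL, thread2, NULL) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return x;
}
```

Because Thread1 sets a and b and computes x while holding L, Thread2 cannot change b in between, so x is 3 whichever thread gets the lock first. The final value of b still depends on the order in which the threads acquire L.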

Page 29: Performance Penalty: Synchronization

Threading Overview: Challenges Unique to Threading

Thread blocked waiting for Mutex
– Thread not running, so no parallelism

Mutex Release, Acquire takes time
– Release marks mutex free
– Acquire must check for free
    If free, mark as in use
    If not free, thread put to sleep
– Costs context switch out and in of processor

Page 30: Problem Statement

Threading Overview

Developing threaded applications is hard

A new class of problems is caused by the interaction between concurrent threads
– Correctness problems (data races, deadlocks, etc.)
– Performance problems (contention, imbalance, etc.)

Page 31: Software Development Cycle

Scope of the Tools

Introduce Threads
– Intel® Performance libraries: IPP and MKL
– OpenMP* (supports incremental threading)
– Explicit threading (Win32*, Pthreads*)

Debug for correctness
– Intel® Thread Checker
– Intel Debugger

Tune for performance
– Thread Profiler
– VTune™ Performance Analyzer

Page 32: Intel® Software Development Products

Intel® Thread Checker and Thread Profiler

VTune™ Performance Analyzer
– Prerequisite for Intel® Threading Tools
– VTune analyzer has thread support

Intel® Compilers support OpenMP* and the Threading tools
– More detailed results are generated with the Intel compilers

Intel Performance Libraries are thread safe
– Many functions are threaded

Page 33: Common Threading Errors/Bugs

Race conditions
– Unprotected concurrent access to shared variables by multiple threads
– Most common error

Deadlocks
– Multiple threads waiting on resources that are held by other threads

Thread stalls
– Threads waiting on resources indefinitely

Page 34: Intel® Thread Checker Intro

Intel® Thread Checker

Identifies threading bugs in applications threaded with:
– Windows* threads on Windows* systems
– OpenMP* on Windows* systems

Plugs into VTune™ environment
– Windows* for IA-32 systems

Page 35: Intel® Thread Checker Analysis

Intel® Thread Checker

Dynamic monitoring as software runs
– Data (workload)-driven execution

Includes monitoring of:
– Thread and Sync APIs used
– Thread execution order
    Scheduler impacts results
– Memory accesses between threads

Only executed code path is analyzed

Page 36: Thread Checker Usage

Intel® Thread Checker

Dynamic Correctness tool
– Dataset selection is important
    Must touch all code paths
– Multiple runs exercising different data paths yield best results
– Use small data set for each path
    Monitoring of all memory references is time consuming

Page 37: Starting Thread Checker

Intel® Thread Checker

Start VTune™ Performance Analyzer

Page 38: Diagnostics List

Intel® Thread Checker

Page 39: Location in Source Code

Intel® Thread Checker

Each entry in the diagnostics list links to its source code line(s)

Page 40: Common Performance Issues

Parallel Overhead
– Due to thread creation, scheduling, etc.

Synchronization
– Excessive use of global data, contention for the same synchronization object
– Implicit synchronization

Load balance
– Improper distribution of parallel work

Granularity
– Insufficient parallel work
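Two of these issues, load balance and contention on shared data, have common OpenMP remedies, sketched below under stated assumptions. The workload function `work` is hypothetical and deliberately gets more expensive as `i` grows, so equal-sized static chunks would leave some threads idle. The chunk size of 16 is an arbitrary choice that would need tuning for a real workload.

```c
/* Hypothetical workload whose cost grows with i. */
static long work(int i)
{
    long s = 0;
    int k;

    for (k = 0; k <= i; k++) {
        s += k;
    }
    return s;
}

/* Sums work(i) for i in [0, n).
 * schedule(dynamic, 16): threads take chunks of 16 iterations as they
 *   become free, which evens out iterations of unequal cost.
 * reduction(+:total): each thread accumulates a private partial sum,
 *   so threads do not contend for a lock around the shared total. */
long sum_work(int n)
{
    long total = 0;
    int i;

    #pragma omp parallel for schedule(dynamic, 16) reduction(+:total)
    for (i = 0; i < n; i++) {
        total += work(i);
    }
    return total;
}
```

Smaller chunks balance the load better but raise scheduling overhead, which is the granularity trade-off listed on the slide. Without OpenMP enabled the pragma is ignored and the loop runs serially with the same result.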

Page 41: Thread Profiler

Intel® Threading Tools: Thread Profiler

Plugs in to the VTune™ performance environment

Identifies performance issues in OpenMP* or unstructured threaded applications using the Win32* API

Pinpoints performance bottlenecks that directly affect execution time

Uses binary instrumentation technology

Page 42: Thread Profiler

Intel® Threading Tools: Thread Profiler

Uses critical path analysis

Provides a breakdown of execution time along the critical path
– Provides insight into system utilization
    Under-subscribed vs. over-subscribed
– Thread state transitions
    Blocked->Running, call stack information

Allows comparison of multiple runs

Page 43: Execution Flows and Critical Path

Intel® Threading Tools: Thread Profiler

Multiple execution flows in applications

Flow splits when a thread creates new threads or signals another thread to continue

Flow ends when a thread stalls or terminates

Longest flow is the critical path

[Timeline diagram: Thread 1, Thread 2 and Thread 3 over time T0 to T15, with the events Acquire lock L, Wait for Threads 2 & 3, Wait for L, Release L, Wait for L, Release L]

Page 44: Why use Critical Path?

Intel® Threading Tools: Thread Profiler

Goal is to shorten the execution time

Shorten the critical path and you shorten the total execution time

Events recorded are events that impact the critical path
– Lock/Unlock
– Thread creation, suspension, resume, termination
– Blocking calls, external events

Page 45: Critical Path Analysis

Intel® Threading Tools: Thread Profiler

System Utilization
– Idle, serial, parallel and oversubscribed
– This is relative to the system the application is running on

Time categories along critical path (CP)
– Cruise, overhead, blocking and impact time

Resulting view is a combination of utilization and execution time along CP

Page 46: System Utilization

Thread Profiler: Critical Path Analysis

Examines processor utilization to determine parallel activity of the application

Concurrency is the number of threads that are active

Categorization shown for a system configuration with 2 processors

Categories: Idle, Serial, Parallel, Under-subscribed, Over-subscribed

[Timeline diagram: Thread 1, Thread 2 and Thread 3 over time T0 to T15, with the events Acquire lock L, Wait for Threads 2 & 3, Wait for L, Release L, Wait for L, Release L]
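The slide names the utilization categories without defining them. The sketch below shows one assumed reading, in which the category follows from comparing the number of active threads with the number of processors. The thresholds, the enum and the function name are assumptions, not the tool's documented rules.

```c
enum utilization {
    IDLE,              /* no thread is active */
    SERIAL,            /* exactly one thread is active */
    UNDER_SUBSCRIBED,  /* more than one thread, fewer than processors */
    PARALLEL,          /* as many active threads as processors */
    OVER_SUBSCRIBED    /* more active threads than processors */
};

/* Classifies a point in time by its concurrency (the number of active
 * threads) relative to the number of processors in the system. */
enum utilization classify_concurrency(int active_threads, int processors)
{
    if (active_threads <= 0) {
        return IDLE;
    }
    if (active_threads == 1) {
        return SERIAL;
    }
    if (active_threads < processors) {
        return UNDER_SUBSCRIBED;
    }
    if (active_threads == processors) {
        return PARALLEL;
    }
    return OVER_SUBSCRIBED;
}
```

Under this reading, a 2-processor system never reports the under-subscribed category, because no thread count lies strictly between 1 and 2. The same three-thread program would be classified differently on a 4-processor machine, which is why the categorization is relative to the system the application runs on.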

Page 47: Execution Time Categories

Thread Profiler: Critical Path Analysis

Analyze critical path by “colorizing” the time spent along it.

Associate spans of time with the objects that caused the critical path transitions

Categories: Cruise time, Overhead, Blocking time, Impact time

[Timeline diagram: Thread 1, Thread 2 and Thread 3 over time T0 to T15, with the events Acquire lock L, Wait for Threads 2 & 3, Wait for L, Release L, Wait for L, Release L]

Critical Path View

[Figure: timeline of Thread 1, Thread 2 and Thread 3 over T0 to T15, annotated with the lock L events (acquire lock L, wait for Threads 2 & 3, wait for L, release L), next to a critical path view drawn over a time axis from 0 to 15.]

Thread Profiler: Critical Path Analysis

The critical path view is built up in steps:
– Start with the critical path
– Break down by system utilization
– Add overhead
– Further categorize by behavior

Legend (system utilization): Idle, Serial, Parallel, Under-subscribed, Over-subscribed. Categorization shown for a system configuration with 2 processors.

Legend (behavior): Cruise time, Overhead, Blocking time, Impact time.

Thread Profiler Views

Critical Path View
– Shows breakdown of the critical path

Profile View
– Shows the breakdown of selected critical paths
– User can select other views of the selected profile
– Concurrency level, threads, objects

Timeline View
– Shows thread activity and critical path transitions for the entire application

Source View
– Transition source view, creation source view

Intel® Threading Tools: Thread Profiler

Intel® Thread Checker

Locates threading bugs:
– Data races (storage conflicts)
– Deadlocks (potential and actual)

Isolates bugs to source code line

Describes possible causes of errors and suggests resolutions

Categorizes errors by severity level

Identifies threading bugs in applications threaded with:
– Windows* threads on Windows* systems
– OpenMP* on Windows* systems

Plugs into VTune™ environment
– Windows* for IA-32 systems
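To make "potential deadlock" concrete: two threads that take the same two locks in opposite order may run fine many times and still deadlock on some other run. The sketch below illustrates one well-known way to flag that pattern from recorded lock events, namely looking for lock-order inversions. It is not a description of how Intel® Thread Checker works internally. The function name and the trace format are hypothetical, and the check is deliberately simplified: it only finds inversions between pairs of locks and does not check that the two orders come from different threads.

```python
from collections import defaultdict


def find_lock_order_inversions(traces):
    """Return pairs of locks that were acquired in both orders.

    traces maps a thread name to a list of (action, lock_name) tuples,
    where action is "acquire" or "release".
    """
    # acquired_after[a] holds every lock taken while lock a was held.
    acquired_after = defaultdict(set)
    for events in traces.values():
        held = []
        for action, lock in events:
            if action == "acquire":
                for outer in held:
                    acquired_after[outer].add(lock)
                held.append(lock)
            elif action == "release":
                held.remove(lock)
            else:
                raise ValueError(f"unknown action: {action}")

    inversions = set()
    for outer, inner_locks in acquired_after.items():
        for inner in inner_locks:
            # An inversion: inner was taken under outer, and outer under inner.
            if inner != outer and outer in acquired_after.get(inner, ()):
                inversions.add(frozenset((outer, inner)))
    return inversions


if __name__ == "__main__":
    # Hypothetical trace: T1 takes A then B, T2 takes B then A.
    risky = {
        "T1": [("acquire", "A"), ("acquire", "B"), ("release", "B"), ("release", "A")],
        "T2": [("acquire", "B"), ("acquire", "A"), ("release", "A"), ("release", "B")],
    }
    print(find_lock_order_inversions(risky))
```

The usual fix for a reported inversion is to make every thread acquire the locks in one agreed order.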

Thread Profiler 2.1

Plugs into the VTune™ performance environment

Identifies performance issues in OpenMP* or unstructured threaded applications using the Win32*

Pinpoints performance bottlenecks that directly affect execution time

Uses binary instrumentation technology

Intel Software College

http://www.intel.com/software/college

Expert Training @ Intel® Software College

High-quality training by expert trainers worldwide
– Take advantage of the latest Intel processors, platforms, tools and technologies

Flexible training offerings
– On-line, on-site, or at Intel facility

Classroom-based or online, self-paced or custom course offerings

Visit the Intel Software College website:
www.intel.com/software/college

"I attended the VTune and Compiler courses at the ISC … I am able to apply what I learned at the ISC to optimizing applications that matter to my company's business. The ISC courses were probably the best that I have had as a professional in terms of delivering on what they said they would teach."

—— Keith Fish - ISV Technical Consultant, Hewlett-Packard Company

“Registering for support was easy, and we value the security of knowing that Intel is there to help, even though we haven’t needed it so far.”

—— Rob Hoffmann - Director of Marketing, NewTek, Inc.

Intel Premier Support

Every purchase of an Intel software development product includes a year of support services

Provides access to Intel® Premier Support and all product updates during that time

Premier Support includes online access to Intel’s Premier Support Website

– Primary support for all Intel Software products

– Issue submission & tracking

– Product updates & related downloads

– FAQs & other proactive notices

– 128-bit encrypted communication protects confidentiality

– Dedicated expert staff review submissions and respond within 4 Intel business hours

https://premier.intel.com

Intel® Software Development Products

From Supercomputers to Cell Phones, Intel Software Development Products Enable Application Development Across Intel Processors

[Table: availability matrix of Intel Software Development Products by operating system. Products listed: Compilers (C++, Fortran), Performance Analyzers (VTune™ Performance Analyzer), Libraries (Math Kernel Library, Integrated Performance Primitives), Threading Tools (Thread Checker), Cluster Tools (Trace Analyzer / Collector), and Debuggers. Operating systems listed include MS Windows*, Win CE, Palm*, Symbian* and Nucleus*. Cells are marked Shipping, Future or NA; the per-cell values did not survive extraction.]

Next Steps

Evaluate the Products
– Download at: www.intel.com/software/products

Contact Vivek Venkatesh with questions
– 98456 79348
– [email protected]