118
Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments Ashish Gupta March 2008 PhD Final Talk Committee: Prof. Peter Dinda Prof. Fabian Bustamante Prof. Yan Chen Prof. Dongyan Xu (Purdue University)

Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

Embed Size (px)

DESCRIPTION

My PhD defense talk. Please see www.ashishgupta.org for more details.

Citation preview

Page 1: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

Black Box Methods for Inferring Parallel Applications' Properties in Virtual

Environments

Ashish GuptaMarch 2008

PhD Final Talk

Committee:

Prof. Peter Dinda

Prof. Fabian Bustamante

Prof. Yan Chen

Prof. Dongyan Xu (Purdue University)

Page 2: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

Introduction

Page 3: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

3 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Background

• Virtual Machine Distributed Computing

• Virtuoso– Middleware for autonomic Virtual Machine distributed

computing– Presents a simple abstraction for distributed computing,

insulating from underlying computational, networking and middleware complexities

– VMM abstracts computational resources– VNET abstracts different networking domains into one –

also ideal point for monitoring– Autonomic Resource Management

Page 4: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

4 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Problem Introduction

• Adaptation – Resources can be heterogeneous (CPU, memory etc)– If shared, then resources availability can also highly dynamic– Application demands also change !

• Autonomic computing:– What is available ?– What is required ?– How can we effectively match the two ?– One of the major components What does the

distributed application want ?– Should work with existing unmodified applications and OS

Page 5: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

6 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Thesis Statement

My thesis is that it is feasible to infer various useful demands and behavior of a parallel application running inside a collection of VMs to a significant degree using a black box model. To evaluate this thesis, I enumerate and define various demands and types of behavior that can be inferred, and also design, implement and evaluate ideas and approaches towards inferring these.

One of the demands I infer is the communication behavior and the runtime topology of a parallel application. I also show how to infer some very useful runtime properties of a parallel application like its runtime performance, its slowdown under external load and its global bottlenecks.

Significantly all of this is done using black-box assumptions and without specific assumptions about the application or the operating system. I also give evidence of how automatic black box inference can assist in adapting the application and its resource usage resulting in improved performance.

Chapter 2

Chapter 3 Chapter 4

Appendices A, B

Page 6: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

7 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Black box assumption and impact

Black Box – Can make no assumptions about the implementation, behavior or internal state of the Guest OS/module beyond its external interface

• Lowers barrier to adoption of the new inference techniques

• helps deploy my work to legacy applications

• Mainly accomplished by looking at external signals:

• traffic, host load etc.

• Not tied to Virtuoso

• Other systems like softUDC [91], XenoServer [52], SODA [87], Violin [88], In-VIGO [13], VioCluster [136] can also benefit from black box application inference

• Potentially any virtual distributed system

Page 7: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

8 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Components of my dissertation

Virtual Topology and Traffic Inference Framework

Black box metrics for Absolute Performance

Ball in The Court principles to compute application

slowdown

Global Bottlenecks using time decomposition

Increasing Application Performance In Virtual

Environments through Run-time Inference and Adaptation

Free Network Measurement For Adaptive Virtualized Distributed Computing

Inference

Adaptation

1 2

3 4

5 6

Page 8: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

9 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Topics I cover

Virtual Topology and Traffic Inference Framework

Black box metrics for Absolute Performance

Ball in The Court principles to compute application

slowdown

Global Bottlenecks using time decomposition

Increasing Application Performance In Virtual

Environments through Run-time Inference and Adaptation

Free Network Measurement For Adaptive Virtualized Distributed Computing

Inference

Adaptation

2 3

4 5

A B

Page 9: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

10 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

BSP application model

• Multiple processes executing a common kernel

• Execution alternates between one or more computing phases and one or more communication phases

• Combination of a schedule of computation and communication phases Super-step

• A very popular model for implementing a large variety of scientific applications and parallel algorithms

• Original paper: Valiant [167]

Page 10: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

11 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Patterns

• Synthetic workload generator developed before. Models a BSP application

• Can execute many different types of topologies common in BSP programs

Some parameters

Topology

Number of processors

Message size

# of iterations

Flops per element

Memory Reads/Writes per element

Topologies

N-dimensional mesh

N-dimensional torus

N-dimensional hypercube

Binary reduction tree

All to All

Page 11: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

12 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Patterns application’s capabilities

2-D Mesh 3-D Toroid 3-D Hypercube

Reduction Tree All-to-All

Page 12: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

13 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

NAS parallel benchmarks

• Developed by NASA [17, 172]

• Set of programs to evaluate performance of parallel supercomputers

• Representative of CFD applications

• To generate realistic parallel application workloads

• 5 kernel benchmarks: EP, MG, FT, IS, CG– Five very frequently used numeric methods

Page 13: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

14 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Different contributions of my dissertation

Virtual Topology and Traffic Inference Framework

Black box metrics for Absolute Performance

Ball in The Court principles to compute application

slowdown

Global Bottlenecks using time decomposition

Increasing Application Performance In Virtual

Environments through Run-time Inference and Adaptation

Free Network Measurement For Adaptive Virtualized Distributed Computing

Inference

Adaptation

2 3

4 5

A B

Page 14: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

Collective Inference for Topology:VTTIF

Page 15: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

16 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Goal of VTTIF

Low Level Traffic Monitoring

?

Application Topology

An online topology inference framework for a VM environment

Page 16: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

17 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Traffic Analyzer

Rate based Change detection

Traffic MatrixQuery Agent

VM Network Scheduling Agent

VNET daemon

VM

VNET overlay network

To other VNET daemons

Physical Host

VTTIF Architecture

Page 17: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

18 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Inferred topology

Parallel Integer Sort

Page 18: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

Black Box Measures of performance

Page 19: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

20 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

The problem

• Performance of BSP applications – an important goal

• Lot of work dedicated to improving performance of parallel applications, e.g.

– Virtuoso– VioCluster

• How do we measure performance in a black box fashion ?– The current way in Virtuoso is manual (e.g. Lin et.al. [106])

• Impact– Would enable superior adaptation algorithms– Automated evaluation and adaptation cycle– Generate reports on effectiveness of different

adaptation/scheduling methods

Page 20: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

21 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Cost model for BSP applications

• Popular strategy: break super-step into its components:– computational cost

– Communication cost of the global exchange of the data

– Synchronization cost

Computation cost

Communication cost

Number of super steps

Sync latency cost

Speed of computation in

FLOPS

Static model of performance

+

requires detailed application profiling and access to source

code

Page 21: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

22 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Super-step approach

• Super-step structure an invariant– No. of steps depend on parameters and data

Another possible measure of performance: number of super-steps

executed per second, or the

iteration rate dynamic metric

Multiple super-steps for dynamic applications iteration rate is not

constant

Page 22: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

23 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

A new black box metric: Round Trip Iteration Rate (RIR)

• Based solely on communication behavior of the application

• Correlated to the iteration rate

• Indirectly measures number of process interactions this indicates progress as synchronization happens at end of a super-step

• Approach: Examined various properties of the traffic trace

• Inter send-packet delay exhibits interesting properties

Page 23: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

24 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Inter send-packet delay

176 Receive from 175

176 Send to 175

176 Send to 175

176 Send to 175

176 Receive from 175

176 Send to 175

Page 24: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

25 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Inter send-packet delay

Traffic trace for Patterns

Message size = 4000 bytes

Computation per iteration: 100 MFlops

Clustering based on inter-send

delays

Count in cluster matches actual iteration rate

output by application (325)

Page 25: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

26 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Patterns: clusters without load

Actual: 568

Reported: 569

Page 26: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

27 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Patterns: clusters with external CPU load

Actual: 142

Reported: 153

Actual execution time ratio: 3.922

Ratio from reported iteration

rate: 3.72

Within 5%

Page 27: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

28 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Why Does Circled Bin Correspond To Iteration Rate?

• Each iteration consists of send, receive and compute phases

• Number of items in large inter send-packet delay cluster represent the group that represent an inter-process interaction

• Each inter-process interaction represents progress in the super-step

time

Send packets

Page 28: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

29 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Plotting inter send-packet delay for MG

1. No clean clusters for a more complex application

2. Delays shift towards right on greater load

Page 29: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

30 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Computing the RIR metric

• Previous examples were for Patterns –

• static performance case, easy clustering

• Applications like MultiGrid from NAS benchmarks changing iteration rate

For a given packet time series,

1. Count send pairs whoseinter-packet delay exceeds byc * RTT

2. Send pair must be interleaved by one receive

Based on BSP principles

Page 30: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

31 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

RIR time series

For dynamic applications, RIR changes with time

Need a time series for RIR over the trace

Page 31: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

32 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Outputting RIR time series - Workflow

Sniff Packets

Send packets that obey the conditions

Slide a 1 second window over these send packets

Sliding interval = t

Get a new time series denoting RIR for each

1 second sliding window

Sampling duration = T

Derivative Metrics from the above time series

Average RIR

CDF

Power Spectrum

Super-phase period

Using tcpdump/libpcap for the VM traffic

Send packets satisfying the two conditions

1 sec Slide by t

i1, i2, i3, i4, i5, …. in

Each number represents number of iterations for a particular 1 sec window instance

Page 32: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

33 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Representing Dynamic Performance

• Define a spectrum of metrics for dynamic performance

RIRavg

RIR-CDF

RIR-PS

RIR-PSE

Long term stationary average of RIR time series

CDF of RIR time series indicating spread of iteration rates

Phase structure, periodicities, application fingerprinting, statistical scheduling

Summary of the periodic behavior for multiple supersteps

Page 33: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

34 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Computing the stationary Average – sampling issues

• Sliding window resolution (sampling rate)– Needs to be high enough to capture the any important high frequency

behavior

• Capture duration– Enough to capture the stationarity of the signal (repetition of all super-

steps)

• Assumption:– iteration dynamics of the application are indeed empirically stationary for the long

run.– For a dynamic application that consist of repetitive phases, it means capturing

enough of its performance behavior to capture this repetitive element.

• Capturing stationarity

• Power Spectrum based techniques help us determine the right sampling rate and duration

• Details in dissertation

Page 34: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

35 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Effectiveness of RIRavg

For MG application, running under different load conditions (100% and 60% CPU load), predicted execution

time error rates were 13% and 7% respectively

(completely black box)

Value of c = 1.1 here (c*RTT factor)

Page 35: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

36 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

RIR time series graph

A super-step

Page 36: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

Can we predict slowdown of application if we put it under load ?

For the IS and MG applications, which application may be hurt more if one of the processes from each application shares the

physical host with an external computational load?

The impact: we can now determine in advance, the impact of external load if we must choose one of these applications to be influenced by the load.

• Scheduling fact: Depending the scheduling algorithm, affect of load on different RIR regions can be different (Govindan et.al. [60])

• Reason: Scheduling handled differently for CPU bound processes vs I/O bound processes

• “Providing enough CPU is not enough, an equally important consideration is to provide the CPU at the right time”

Page 37: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

38 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Role of CDF

• RIR-CDF can be used to predict which application will be more affected by external load (for dynamic applications)

• Very useful when extra load needs to be introduced over existing applications due to demand

• Scheduling using the CDF:– From the normal CDF, we predict a slowdown CDF based on a slowdown

mapping– Slowdown CDF How will the RIR-CDF of the application look like after

load ?– Slowdown mapping What is the slowdown for a particular RIR under a

particular load ?

Slowdown mapping

Page 38: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

39 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Other metrics

• Power spectrum of RIR– Gives idea about the super-step structure of the application– Length of Super-step– A summary of the power spectrum and the significant frequencies serves

as a fingerprint/super-step snapshot

Page 39: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

40 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

A sample power spectrum for 4 processes

Consistency across processes

Page 40: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

41 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Example for MG

Significant frequency separation

Exec time = 19.44 seconds

Super-phase period = 4.1

seconds (1/0.244)

Number of super-phases

~4

Page 41: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

42 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Recap

• We can deduce performance for a BSP application using black box means.

• We can predict performance for an application, when imposed with load

• Entirely based on Based on packet analysis

• New metric called RIR : Round trip Iteration Rate

• More complex metrics for dynamic applications– RIRavg

– RIR-CDF– RIR Power Spectrum

Page 42: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

Ball In the Court Methods

Page 43: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

44 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

The problem

• Last chapter: Predicting performance under load (Slowdown CDF)

• This chapter: Can we predict performance of application if an existing external load was removed ?

• I.e. What is the slowdown of the application under the current load conditions ?

Page 44: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

45 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Ball in the Court

• BSP super-step computation, communication, sync

• Communication according to a certain schedule and then computation in between

• Each process acts (computes) on a message before sending out the next one

• This acting on a message is called “Ball in the Court” (BIC)

• Ball is the responsibility of the process to do some local processing and then interact with other processes , court is the local host.

BIC delay

Page 45: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

46 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Developing a strategy

• Focus on one process

• If just one process is loaded, entire application slowed down because of one process

• All processes operate in sync, and iterations can only proceed if the loaded process does its duties

stretched

Under load

Page 46: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

47 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Why BIC delay?

• The traffic trace captures the behavior of the process for the entire duration

• If the process slows down, the trace time length will increase.

• There will be corresponding changes in the trace as well. What are those changes ? Seed of the idea

Page 47: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

48 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Approach to computing the BIC delay

Let’s compute the time differential for event pairs, for *.176

Some computation

(6 ms)

Some computation

(6ms)

Page 48: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

49 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

With Load, Some BIC Delays Get Larger

Loaded process

Unloaded process

1. Sending an ack process’s responsibility

2. This responsibility was hugely inflated in the loaded case ( 64 us to 23822 us)

3. BIC delays for other receives are similar (receive BIC for other processes)

Page 49: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

50 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

An Algorithm for BIC Delay?

• Can we estimate the total BIC delay from the traffic trace alone ?

• Investigated this question with different approaches.

• Each packet can be of the following types:– 1. Send Packet (SP)– 2. Send Ack (SA)– 3. Receive Packet (SP)– 4. Receive Ack (RA)

• We can pair up consecutive packets to form event pairs SA followed by SP SA-SP

• For 4 event possibilities we have 16 event pairs

Page 50: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

51 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

BIC events

• * - S* event pairs can be classified as BIC events

• Intuitively, – process has either received a packet and is responsible to send the next

one– Just sent a packet and is responsible to send the next one as well

Page 51: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

52 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Computing the BIC delay

• 1. Count all BIC event pairs and sum up their time differentials total BIC delay

• 2. BIC delays partitioned by the recipient IP address– E.g.: RP (175) – SP (173) delay goes into the bucket of 173

• Some special cases and concerns for counting BIC delays are explained in the dissertation (Page 140)

Page 52: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

53 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Using BIC Delays to Measure Application Imbalance

• Get a measure of the local computing/processing time for which its responsible

• Comparing this local time with another unloaded process gives an idea of imbalance for the loaded process

• Gives an idea of how much extra time the loaded process is taking

•The way we compare produces different strategies for computing slowdown

Page 53: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

54 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Using BIC Delays to Measure Application Imbalance

Compute BIC delay

Imbalance Algorithm

Slowdown

Page 54: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

55 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Computing the no-load runtime using BIC delays

•Global BIC imbalance algorithm: – Assumes all processes are executing the same BSP kernel

– 1. Compute the BIC delay for the loaded process

– 2. Select a partner process for comparison

• What if the loaded process was put in the shoes of the partner process ?• Choice of partner process can affect results. Select least loaded for most optimistic

results. Or select one that reflects possible migration conditions

– 3. Compute the “imbalance” by finding the difference in the BIC delays of these two processes

Page 55: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

56 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Global BIC algorithm

Page 56: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

57 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Evaluation with the IS application

Balanced case

Total runtime = 14.36 seconds

Total runtime = 152.67 seconds

Loaded case

Imbalance between 176 and 175 = 124.49 seconds

Actual difference = 138.41 seconds

Using completely black box means, we could determine, how slow is the application running with load,

without knowing about the unloaded case

Page 57: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

58 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

More sophisticated Imbalance Algorithms

• Process-level BIC imbalance algorithm

• A better fit for more dynamic applications like the NAS benchmarks

• Sometimes work done by all processes is not the same

• Algorithm takers into account inter-process level interactions and BIC delays

• Multi-iteration BIC-delay Bias-Aware Imbalance Algorithm

• BIC algorithm for multi-load situations

• Covered in detail in the dissertation

• Gives a range of slowdown (optimistic and pessimistic values)

• Evaluated with MG and IS benchmarks

Page 58: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

59 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Other contributions in the Chapter

• Metrics for computing imbalance amongst processes – Gives idea of heterogeneity experienced by different processes– Draws attention to applications that can use lot of help– Three imbalance metrics

• Two global: Standard Deviation from average, Squared distance• Process Level imbalance metric

Page 59: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

Global bottlenecks and Time Decomposition

Page 60: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

62 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Global Bottleneck

• In the BIC discussion, we could identify the slow processes using their BIC delays

• This chapter extends the argument to network and I/O resources

• Looks at the problem of Global Time Decomposition– Where does the application spend its time ?– At network resource and process level

• Time decomposition can point out potential sources of imbalance

• Some metrics output are:– Cumulative Message Latency– Average Latencies observed– Cumulative message transfer time – Average Bandwidth observed

Page 61: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

63 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

BIC analysis for local imbalance

Global Time Decomposition

Measure currentperformance

• Indicates if application is highly imbalanced, • Indicates amount of slowdown, and• Which host to focus on?

• Shows time allocation for network components, and• Which non-host resources to focus on ?

• Is application running at expected speed compared to previous known instances ?• Multiple application case: which application will be slowed down more from load ?

Eliminate a bottleneck using adaptation mechanisms

NO

• Traffic based iteration rate metrics• RIR time series• Dynamic Performance Metrics

• Global BIC imbalance algorithm• Process-level Bias Aware BIC imbalance algorithm• Multi-load BIC imbalance algorithm• Imbalance metrics

• Global Time decomposition techniques for various network metrics• Unix tools to find root case

• Adaptation Mechanisms• Migration • Network overlay links• Routing rules • Network reservation

Flowchart for diagnosis

Chap 3

Chap 4

Chap 5

Appendices A,B

Tools Questions/problems

Page 62: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

64 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Different contributions of dissertation

Virtual Topology and Traffic Inference Framework

Black box metrics for Absolute Performance

Ball in The Court principles to compute application

slowdown

Global Bottlenecks using time decomposition

Increasing Application Performance In Virtual

Environments through Run-time Inference and Adaptation

Free Network Measurement For Adaptive Virtualized Distributed Computing

Inference

Adaptation

2 3

4 5

A B

* An API also defined in the dissertation

Page 63: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

65 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Monitoring and inference

Application performance measure

Adaptation algorithm

Adaptation mechanisms

AdaptationApplications

Optimization metric

1. Overlay topology

2. Forwarding rules

3. VM migration

1. Application Throughput

1. BSP

2. Transactional ecommerce

1. Application throughput

1. VTTIF

2. Network monitoring

1. Single metric

2. Combined metric

Page 64: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

66 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Effect on BSP Application Throughput

For high Compute/Communicate Ratio, Migration + Topology dramatically improves performance

Adapting to External Load Imbalance

External load removed, but I/O still

dominates, so topology helps

External load removed, can drive

higher I/O

Page 65: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

67 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Free Network Measurement For Adaptive Virtualized Distributed Computing

Page 66: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

Closest Related Work

• Wood et. al. [178] - Black box and gray box strategies for Virtual Machine Migration, NSDI 2007

• Goal: To improve performance of Stand-alone applications running inside VMs

• Black box and gray box strategies

• Black box externally visible parameters like CPU load, disk swapping activity and network usage

• Focuses on stand-alone applications

• I focus on distributed parallel applications collective properties

Page 67: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

Closest Related Work• Aguilera et. al. [14] – Performance debugging for distributed systems of blackboxes, OSP 2003

• Goal: To find components that contribute to high latency amongst critical message paths in a distributed application

• Somewhat analagous to BIC chapter

• Assume constant delay per component

• Unclear how their techniques translate to parallel applications (cyclic computation)

•No estimate of actual slowdown

• Reynolds et. al. [133] – WAP5: black-box performance debugging for wide-area systems, WWW 2006

• Focus on wide area systems now, more focus on the network delay aspects

• Major work focuses on identifying causal relationships between messages

• Assume a one to one message casualty not so in Parallel applications

• Acknowledge that will not work for barrier type communications

Page 68: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

70 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Related Publications

• Gupta, A., Dinda, P. A. Inferring the topology and traffic load of parallel programs running in a virtual machine environment. In Proceedings of the 10th Workshop on Job Scheduling Strategies for Parallel Processing (JSPPS 2004 (June 2004).

• A. Sundararaj, A. Gupta, P. Dinda, Dynamic Topology Adaptation of Virtual Networks of Virtual Machines, Proceedings of the Seventh Workshop on Languages, Compilers and Run-time Support for Scalable Systems (LCR 2004)

• Sundararaj, A. Gupta, P. Dinda, Increasing Distributed Application Performance in Virtual Environments through Run-time Inference and Adaptation, In Proceedings of the 14th IEEE International Symposium on High Performance Distributed Computing (HPDC 2005)

• Ashish Gupta, Ananth Sundararaj, Marcia Zangrilli, Peter Dinda, Bruce B. Lowekamp, Free Network Measurement For Adaptive Virtualized Distributed Computing, In Proceedings of 20th IEEE International Parallel & Distributed Processing Symposium, 2006.

Page 69: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

71 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Other Work

• Robert Schweller et al., Reversible Sketches: Enabling Monitoring and Analysis over High-speed Data Streams, Journal paper in IEEE/ACM Transactions on Networking, 2006.

• Robert Schweller et al., Monitoring Flow-level High-speed Data Streams with Reversible Sketches, In Proceedings of IEEE INFOCOM 2006.

• Robert Schweller, Ashish Gupta, Elliot Parsons, Yan Chen, Reverse Hashing Algorithms for Sketch-based Change Detection on Highspeed Networks:, In Proceedings of ACM SIGCOMM Internet Measurement Conference, October 2004, Taormina, Sicily

• P. Dinda, G. Memik, R. Dick, B. Lin, A. Mallik, A. Gupta, S. Rossoff, The User In Experimental Computer Systems Research, Proceedings of the Workshop on Experimental Computer Science (ExpCS 2007)

• A. Gupta, B. Lin, P. Dinda, Measuring And Understanding User Comfort With Resource Borrowing, In Proceedings of the 13th IEEE International Symposium on High Performance Distributed Computing (HPDC 2004), Honolulu, Hawaii

• Bin Lin, A. Gupta, Peter Dinda, Measuring, Understanding, and Exploiting Direct User Input In Resource Scheduling,  Journal paper in submission

Accepted Posters

• Ashish Gupta, Peter Dinda, Fabian Bustamante, Distributed Popularity Indices, Poster Presentation at ACM SIGCOMM 2005, Philadelphia

• Ashish Gupta, Manan Sanghi, Peter Dinda, Fabian Bustamante, Magnolia: a novel DHT architecture for Keyword based search, Poster Presentation at Network System Design and Implementation (NSDI 2005), Boston

• Ashish Gupta et al., Free Network Measurement For Adaptive Virtualized Distributed Computing, Poster Presentation at Supercomputing 2005, Seattle

Sketch-based

reverse hashing

work

User feedback

work

Page 70: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

72 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Components of my dissertation

Virtual Topology and Traffic Inference Framework

Black box metrics for Absolute Performance

Ball in The Court principles to compute application

slowdown

Global Bottlenecks using time decomposition

Increasing Application Performance In Virtual

Environments through Run-time Inference and Adaptation

Free Network Measurement For Adaptive Virtualized Distributed Computing

Inference

Adaptation

1 2

3 4

5 6

Page 71: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

Thank you !

Page 72: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

Backup slides

Page 73: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

75 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Infrastructure

• Virtuoso cluster, which is an IBM e1350 with 32 compute nodes, each of which is a dual 2.2 GHz Intel HT Xeon Processors (except the management node which is faster), 1.5 GB RAM, and 40 GB of disk.

• The machines also have 1 Gbit interfaces and in some evaluation scenarios in later chapters(Chapters 3, 4, 5), I use 4 of these machines which form a mini-cluster of their own. These machines are then connected via a 100 Mbit switch.

• Virtual Machine Monitor Used:

• I use both VMWare and the popular open source Xen VMM in my evaluations.

• VMWare GSX Server 2.5 (VTTIF evaluation)

• Xen version 3.0.3-rc3-1.2798.f in my evaluations in Linux Kernel release 2.6.18-1.2798.fc6xen. (other chapters)

Page 74: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

76 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Increasing Application Performance In Virtual Environments Through Run-time Inference and Adaptation

Dynamically adapt unmodified applications on unmodified operating systems in virtual environments to available resources

The adaptation mechanisms are application independent and controlled automatically without user or developer help

Demonstrate the feasibility of adaptation at the level of collection of VMs connected by Virtual Networks

Show that its benefits can be significant for two classes of applications

Published in LCR 2004, HPDC 2005

Page 75: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

77 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Informally stated:

Input

• Network traffic load matrix of application

• Topology of the network

Output

• Mapping of VMs to hosts

• Overlay topology connecting hosts

• Forwarding rules on the topology

Such that the application throughput is maximized

Optimization Problem (2/2) Topology + Migration

The algorithm is described in detail in the paper

Page 76: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

78 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Evaluation

Applications

• Patterns: A synthetic BSP benchmark

• TPC-W: Transactional web ecommerce benchmark

Benefits of adaptation (performance speedup)

• Adapting to compute/communicate ratio

• Adapting to external load imbalance

Page 77: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

79 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Effect on BSP Application Throughput

Adapting to Compute/Communicate Ratio

For high Compute/Communicate Ratio, Migration + Topology dramatically improves performance

Less time spent in I/O, so

migration alone is enough

Since I/O dominates, drop in latency

improves performance

Even for small amount of I/O,

it takes up significant time

Page 78: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

80 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

TPCW

No Topology Topology

No Migration 1.216 1.76

Migration 1.4 2.52

Throughput (WIPS) With Image Server Facing External Load

Page 79: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

81 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Free Network Measurement for Adaptive Virtualized Distributed Computing

ADAPTATION : A FOUR STEP PROCESS

1. Automatically infer application demands (network/CPU)

2. Monitor resource availability (bw/latency/CPU)

3. Adapt distributed application for better performance/cost effectiveness

4. Reserve Resources when possible

WREN PAPER, Published in IPDPS 2006

Page 80: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

82 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

What is WREN ?

Developed by my colleague Marcia Zangrilli from WM College

1. Observes incoming/outgoing packets

2. Online analysis to derive latency/bandwidth information for all host pair connections

3. Answers network queries for any pair of hosts

What does it do ?

Page 81: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

83 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

An important Contribution: Problem Formalization

Page 82: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

84 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Two approaches to adaptation

Page 83: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

85 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

VTTIF in a dynamic topology environment

Parameters: Smoothing Window sliding window

duration over which updates are aggregated

Update Rate

Detection Threshold

Page 84: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

86 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Questions

• Are these techniques applicable to other applications ?

• Why do you think they are generic enough even for BSP applications ? How did you test their generality ?

– Because the reasoning for their derivation was based on the BSP model not on any particular application

– Tested on Patterns and then tried on other more complex applications.– VTTIF was used by other components like adaptation

• What modifications would you suggest to the – Network hardware– VMM– Router level ?

• How could you improve time synchronization issues ?

• What is the closed work to you and how is it different from whats out there– Black box inference NSDI 2007– Black box debugging from HP labs

Page 85: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

87 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

• How about applying the techniques beyond BSP applications ?– Distributed apps– P2P– Web apps

• What are the limitations of these techniques ? When will they not work ? What are the assumptions ?

• What is the main research contribution to your work..?

Page 86: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

Bin

Page 87: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

Evidence of Adaptation Driven by Inference

Page 88: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

90 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Evidence of Adaptation

How can automated inference assist automated adaptation without user

intervention and application knowledge ?

Significant work already done in this area by me

• Using multiple mechanisms for adaptation

• Combining resource availability with application demands to match needs

• Non-scientific apps also investigated (multi-tier web sites)

• More in following slides…

Page 89: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

91 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Informally stated:

Input

• Network traffic load matrix of application

• Topology of the network

Output

• Mapping of VMs to hosts

• Overlay topology connecting hosts

• Forwarding rules on the topology

Such that the application throughput is maximized

Optimization Problem (2/2) Topology + Migration

The algorithm is described in detail in the paper

Page 90: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

92 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Effect on BSP Application Throughput

For high Compute/Communicate Ratio, Migration + Topology dramatically improves performance

Adapting to External Load Imbalance

External load removed, but I/O still

dominates, so topology helps

External load removed, can drive

higher I/O

Page 91: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

93 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Evaluation

Scenario 2 : Large 256 host topology. 32 potential hosts, 8 Virtual Machines

Results for Multi Constraint Cost Function : Bandwidth and Latency

Annealing easy to adapt and finds good mappings compared to heuristic

Page 92: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

94 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

The New Adaptation Process

Page 93: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

95 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Two approaches to adaptation

Page 94: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

96 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Evaluation

Scenario 2 : Large 256 host topology. 32 potential hosts, 8 Virtual Machines

Results for Multi Constraint Cost Function : Bandwidth and Latency

Annealing easy to adapt and finds good mappings compared to heuristic

Page 95: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

97 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Summary of contributions

Virtual Topology and Traffic Inference Framework

Black box metrics for Absolute Performance

Ball in The Court principles to compute application slowdown

Global Bottlenecks using time decomposition

Increasing Application Performance In Virtual

Environments through Run-time Inference and Adaptation

Free Network Measurement For Adaptive Virtualized Distributed

Computing

1. Infer the traffic matrix for a BSP application

2. Spatial Communication Topology of the application

3. Online implementation and low performance overhead

4. Handling dynamic situations and avoiding oscillations

5. Integration with adaptation system

Page 96: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

98 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Summary of contributions

Virtual Topology and Traffic Inference Framework

Black box metrics for Absolute Performance

Ball in The Court principles to compute application slowdown

Global Bottlenecks using time decomposition

Increasing Application Performance In Virtual

Environments through Run-time Inference and Adaptation

Free Network Measurement For Adaptive Virtualized Distributed

Computing

1. How to evaluate BSP application performance under black box conditions ?

2. New set of metrics based on traffic

3. RIR (Round trip Iteration Rate) correlated to the iteration rate

4. Complex metrics for dynamic applications

1. RIRavg

2. RIR-CDF

3. RIR-PS

5. Scheduling using these advanced metrics (Slowdown CDF)

6. Other applications beyond performance

1. Fingerprinting

2. Statistical Scheduling

Page 97: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

99 Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Summary of contributions

Virtual Topology and Traffic Inference Framework

Black box metrics for Absolute Performance

Ball in The Court principles to compute application slowdown

Global Bottlenecks using time decomposition

Increasing Application Performance In Virtual

Environments through Run-time Inference and Adaptation

Free Network Measurement For Adaptive Virtualized Distributed

Computing

1. Quantitative estimate of the application slowdown under external load

2. Amount of speedup achievable

3. Can greatly help in adaptation/scheduling decisions predicts the benefit of migration in advance

4. Concept of Ball In the Court delays

5. Using BIC delays to compute imbalance in application

6. Algorithms to compute slowdown

1. Global Imbalance Algorithm

2. Process Level BIC imbalance algo

3. Multi-load BIC imbalance algo

7. Metrics to indicate amount of imbalance in application as a whole

Page 98: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

100

Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Summary of contributions

Virtual Topology and Traffic Inference Framework

Black box metrics for Absolute Performance

Ball in The Court principles to compute application slowdown

Global Bottlenecks using time decomposition

Increasing Application Performance In Virtual

Environments through Run-time Inference and Adaptation

Free Network Measurement For Adaptive Virtualized Distributed

Computing

1. How can we find the global bottleneck for the application ?

2. Time Decomposition of execution time Helps identify regions of great imbalance

3. Includes the network aspects

4. Many new metrics for latency and bandwidth time components

5. Evaluation with various load scenarios

Page 99: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

101

Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Summary of contributions

Virtual Topology and Traffic Inference Framework

Black box metrics for Absolute Performance

Ball in The Court principles to compute application slowdown

Global Bottlenecks using time decomposition

Increasing Application Performance In Virtual

Environments through Run-time Inference and Adaptation

Free Network Measurement For Adaptive Virtualized Distributed

Computing

Page 100: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

102

Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Summary of contributions

Virtual Topology and Traffic Inference Framework

Black box metrics for Absolute Performance

Ball in The Court principles to compute application slowdown

Global Bottlenecks using time decomposition

Increasing Application Performance In Virtual

Environments through Run-time Inference and Adaptation

Free Network Measurement For Adaptive Virtualized Distributed

Computing

Page 101: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

106

Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Impact

• Further the goal of autonomic computing

• Black box approach impacts huge set of applications

• Help in lower the entry barriers for distributed and parallel autonomic computing

• Shared resources no new tweaks needed to existing applications

Page 102: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

108

Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Application Model

• BSP Parallel applications

• Two benchmarks being considered– Patterns– NAS Benchmarks

Page 103: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

Citations

[167] VALIANT, L. A bridging model for parallel computation. Communications of the ACM 33, 8 (Aug. 1990), 103–111.

Page 104: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

117

Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Contributions

• Understanding the correlation between network traffic and inter-process interactions

• Defining a black box measure of application performance (RIR)– Techniques to derive it

• Advanced metrics for Dynamic Applications and their performance

• Frequency Domain based metrics for greater insight about application

• Applications to advanced scheduling and application fingerprinting

Page 105: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

118

Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Solving the issues

• Sampling rate:– Compute at a “high enough” sample rate to capture enough high

frequency dynamics

– Compute power spectrum to calculate total energy in the signal (power spectrum is magnitude of Fourier Transform) Integrate the power spectrum

– Sample at a slightly lower rate and see reduction in energy.

– Repeat until we start losing energy beyond a certain threshold (5%) this indicates at this rate we start losing significant part of signal

– If actual sampling rate is much higher (5 to 10 times) than this cutoff point, we are doing alright

Gives us 95% cutoff sampling rate

Page 106: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

119

Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Solving the issues

Page 107: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

120

Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Capturing the right time duration

• Again a Power Spectrum based approach

• Helps us determine if our sample size captures most of the energy in the signal

• Ideally we will capture as much as resources allow us

•The process:– After capturing, we test the amount of energy loss, if we shorten the

sample size.– Find the cut-off point as last time– Compare the cut-off sample size and our measured sample size

• After sampling rate and sampling duration is satisfactory,

RIRavg computed by averaging the time series

Page 108: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

121

Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Effect of load on the RIR time series

1. Expansion effect

2. Basic structure of time series is same

3. Each phase more irregular (more spikes)

4. Captures the time dynamics of the application

Page 109: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

122

Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

The RIR-CDF metric

• More exhaustive than the RIRavg

• Statistical scheduling

• Application fingerprint

• Performance comparison of runs at more detailed level than RIRavg

Global metric

Page 110: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

123

Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

The RIR-CDF metric

CDF indicates the RIR region application spends most of its time

in

RIR-CDF can be used to predict which application will be more affected by external load (for

dynamic applications)

Slowdown mapping

I show how we can predict slowdown for MG vs IS

Page 111: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

124

Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Taking guidance from RIR-CDF : a proof of concept

• CDF indicates the RIR region application spends most of its time in

• Scheduling fact: Depending the scheduling algorithm, affect of load on different RIR regions is different (Govindan et.al.)

• Reason: Scheduling handled differently for CPU bound processes vs I/O bound processes

• “Providing enough CPU is not enough, an equally important consideration is to provide the CPU at the right time”

• Higher RIR more inter-process interaction can use more efficient communication

• Lower RIR more compute intensive or large messages

• How do applications with different RIRs suffer from external load ?

Page 112: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

125

Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Effect of load on different RIR areas

Difference of performance over 12.46 times at the

extremes under full load

Applications with higher iteration

rates always seem to be more

drastically affected by

external load then those with lower iteration

rates.

Page 113: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

126

Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Scheduling using the CDF

The slowdown mapping can be complicated and built over a long history. We use results from Patterns benchmark to provide a simpler mapping

Input: RIR value

Output: Fractional slowdown

Assumed load = 100%

More realistic mappings may take bandwidth

and CPU utilization as

input

Page 114: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

127

Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Scheduling using the CDF

For the IS and MG applications we have seen till now, which application may be hurt more if one of the processes from each application shares the physical host with an external

computational load?

The impact: we can now determine in advance, the impact of external load if we must choose one of these applications to be influenced by the load.

Map each RIR value to its slowdown RIR value

from the mapping

Slowdown CDF

Page 115: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

128

Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Slowdown CDF

For MG

Slowdown CDF

For IS

More time spent in low

regions

Avg = 0.103

Avg = 0.065

Page 116: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

129

Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Actual slowdown

• MG 2.4 times

• IS 15.52 times

• IS is affected much more from the load as predicted.

• A better mapping function can yield more powerful slowdown estimates for complex applications.

• Shows how we can make complex decisions for dynamic applications using RIR – CDF.

• Other applications Statistical scheduling ideas– examples

Page 117: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

130

Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Power Spectrum

• Other applications of the Power Spectrum– Statistical Scheduling decisions– Application categorization– Component separation

• More discussion in dissertation

• Steps in computing the Power Spectrum – numerous measures taken to get a representative power spectrum (Windowing functions, zero padding, eliminating the DC bias)

• Evaluation with another application: IS (Integer Sort) from the NAS benchmarks similar results

• Implementation details and making it work online

Page 118: Black Box Methods for Inferring Parallel Applications' Properties in Virtual Environments

133

Black Box Methods for Inferring Parallel Applications' properties in Virtual Environments

Process Level BIC imbalance Algorithm

• A better fit for more dynamic applications like the NAS benchmarks

• Sometimes work done by all processes is not the same

• Algorithm takers into account inter-process level interactions and BIC delays

• Phenomenon of Load Bias also observed A heavily loaded process affects the BIC delay of other unloaded processes too (shifting of work)

• Covered in detail in the dissertation

• Gives a range of slowdown (optimistic and pessimistic values)

• Evaluation with MG– Range of slowdown = [28.67s, 50.78s ]– Actual slowdown = 42.67s