Datacenter application interference
Chip multiprocessors (CMPs), popular in datacenters, offer increased throughput and reduced power consumption.
They also increase resource sharing between applications, which can result in negative interference.
Resource contention is well studied
… at least on single machines.
Three main methods:
(1) Gladiator-style match-ups
(2) Static analysis to predict application resource usage
(3) Measure benchmark resource usage; apply to live applications
A new methodology for understanding datacenter interference is needed.
One that can handle the complexities of a datacenter:
(10s of) thousands of applications
real user inputs
production hardware
financially feasible, low overhead
Hardware counter measurements of live applications.
Our contributions
1. Identify the complexities of understanding interference in datacenters
2. A new measurement methodology
3. The first large-scale study of measured interference on live datacenter applications
Complexities of understanding application interference in a datacenter
Large chips and high core utilizations
Profiling 1000 12-core, 24-hyperthread Google servers running production workloads revealed the average machine had >14/24 HW threads in use.
Heterogeneous application mixes
Often applications have more than one co-runner on a machine.
Observed max of 19 unique co-runner threads (out of 24 HW threads).
[Chart: fraction of machines with 0-1, 2-3, and 4+ unique co-runners.]
Application complexities
Fuzzy definitions
Varying and sometimes unpredictable inputs
Unknown optimal performance
Hardware & Economic Complexities
Varying micro-architectural platforms
Necessity for low overhead means limited measurement capabilities
Corporate policies
Measurement methodology
The goal: a generic methodology to collect application interference data on live production datacenter servers.
1. Use sample-based monitoring to collect per-machine, per-core event (HW counter) sample data.
[Figure: timelines of App. A and App. B running over time.]
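The samples from step 1 can be modeled as small records. A minimal Python sketch; the field names are illustrative assumptions, not Google's actual sample schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Sample:
    """One hypothetical hardware-counter sample: per machine, per core."""
    machine: str       # which server the sample came from
    cpu: int           # hardware thread the sampled binary ran on
    binary: str        # application that was executing
    start_ns: int      # wall-clock window covered by the sample
    end_ns: int
    instructions: int  # instructions retired in the window
    cycles: int        # unhalted cycles in the window

    @property
    def ipc(self) -> float:
        # Instructions per cycle: the indicator used later when
        # comparing co-schedules.
        return self.instructions / self.cycles

s = Sample("m0", cpu=3, binary="websearch", start_ns=0, end_ns=1_000,
           instructions=2_500_000, cycles=2_000_000)
print(s.ipc)  # 2.5M instrs / 2.0M cycles = 1.25
```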
[Figure: the App. A and App. B timelines divided into samples of 2M instructions each, numbered A:1-A:6 and B:1-B:4.]
2. Identify sample-sized co-runner relationships…
Samples A:1-A:6 are co-runners with App. B.
Samples B:1-B:4 are co-runners with App. A.
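Step 2's co-runner identification reduces to interval overlap: two samples are co-runners when they ran on the same machine, on different hardware threads, during overlapping windows. A hedged sketch, with invented sample fields:

```python
def overlaps(a, b) -> bool:
    # Half-open windows [start, end) overlap when each starts
    # before the other ends.
    return a["start"] < b["end"] and b["start"] < a["end"]

def corunner_pairs(samples):
    """Return id pairs of samples that co-ran on a machine."""
    pairs = []
    for i, a in enumerate(samples):
        for b in samples[i + 1:]:
            if (a["machine"] == b["machine"]
                    and a["cpu"] != b["cpu"]
                    and overlaps(a, b)):
                pairs.append((a["id"], b["id"]))
    return pairs

samples = [
    {"id": "A:1", "machine": "m0", "cpu": 0, "start": 0,  "end": 10},
    {"id": "B:1", "machine": "m0", "cpu": 1, "start": 5,  "end": 15},
    {"id": "B:2", "machine": "m0", "cpu": 1, "start": 20, "end": 30},
]
print(corunner_pairs(samples))  # only A:1 and B:1 overlap
```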
Say that a new App. C starts running on CPU 1…
… B:4 no longer has a co-runner.
3. Filter relationships by architecture-independent interference classes…
For example, co-runners may be on opposite sockets, or share only I/O.
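Interference classes can be derived from CPU topology. A sketch for a hypothetical dual-socket machine with 6 cores per socket and 2 hyperthreads per core (matching the study's 12-core, 24-thread Westmere servers); the CPU-numbering scheme is an assumption, not the real machines' enumeration:

```python
CORES_PER_SOCKET = 6
THREADS_PER_CORE = 2

def topology(cpu: int):
    # Assumed numbering: consecutive CPU ids are hyperthread siblings.
    core = cpu // THREADS_PER_CORE
    socket = core // CORES_PER_SOCKET
    return socket, core

def interference_class(cpu_a: int, cpu_b: int) -> str:
    sock_a, core_a = topology(cpu_a)
    sock_b, core_b = topology(cpu_b)
    if core_a == core_b:
        return "shared core"      # hyperthread siblings
    if sock_a == sock_b:
        return "shared socket"    # same chip, different core
    return "opposite sockets"     # share only I/O and memory system

print(interference_class(0, 1))   # siblings on core 0
print(interference_class(0, 2))   # cores 0 and 1, socket 0
print(interference_class(0, 13))  # core 0 vs. core 6, socket 1
```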
4. Aggregate equivalent co-schedules.
For example:
• Aggregate all the samples of App. A that have App. B as a shared-core co-runner.
• Aggregate all samples of App. A that have App. B as a shared-core co-runner and App. C as a shared-socket co-runner.
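The aggregation in step 4 amounts to grouping sample IPCs by a co-schedule key. A sketch; the key layout and the sample values are made up for illustration:

```python
from collections import defaultdict
from statistics import median

def aggregate(samples):
    """Group sample IPCs by (base app, shared-core co-runner,
    shared-socket co-runner), so equivalent schedules compare."""
    groups = defaultdict(list)
    for s in samples:
        key = (s["base"], s["shared_core"], s["shared_socket"])
        groups[key].append(s["ipc"])
    return dict(groups)

samples = [
    {"base": "A", "shared_core": "B", "shared_socket": None, "ipc": 2.0},
    {"base": "A", "shared_core": "B", "shared_socket": None, "ipc": 1.8},
    {"base": "A", "shared_core": "B", "shared_socket": "C",  "ipc": 1.5},
]
groups = aggregate(samples)
print(median(groups[("A", "B", None)]))  # midpoint of 2.0 and 1.8
```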
5. Finally, calculate statistical indicators (means, medians) to get a midpoint performance for application interference comparisons.
[Figure: App. A samples with Avg. IPC = 2.0 alongside App. B samples with Avg. IPC = 1.5.]
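Step 5's indicators can then be computed per aggregated co-schedule. A sketch with invented numbers, illustrating why a median can be a safer midpoint than a mean when live samples include outliers:

```python
from statistics import mean, median

# Hypothetical per-co-schedule IPC lists; one sample of A is an outlier.
ipc = {
    "App. A | shared-core co-runner B": [2.1, 2.0, 1.9, 0.2],
    "App. B | shared-core co-runner A": [1.5, 1.6, 1.4, 1.5],
}

for schedule, xs in ipc.items():
    # The mean is dragged down by the outlier; the median is not.
    print(schedule, "mean:", round(mean(xs), 2),
          "median:", round(median(xs), 2))
```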
Applying the measurement methodology at Google.
Applying the Methodology @ Google
Experiment details:
Events: Instrs, IPC
Sampling period: 2.5 million instructions
Number of machines*: 1000
* All had Intel Westmere chips (24 hyperthreads, 12 cores), matching clock speed, RAM, O/S
Method:
1. Collect samples
2. Identify sample-sized relationships
3. Filter by interference classes
4. Aggregate equivalent schedules
5. Calculate statistical indicators
Collection results:
Unique binary apps: 1102
Co-runner relationships (top 8 apps):
Avg. shared-core relationships: 1M (min 2K)
Avg. shared-socket: 9.5M (min 12K)
Avg. opposite-socket: 11M (min 14K)
Analyze Interference
[Figures: streetview's IPC changes with top co-runners; overall median IPC across 1102 applications.]
Beyond noisy interferers (shared core)
[Figure: grid of base applications (rows) vs. co-running applications (columns), each cell marked as less or positive interference, negative interference, or noisy data.]
* Recall the minimum pair has 2K samples; medians taken across the full grid of 1102 apps.
Performance Strategies
Restrict negative beyond-noisy interferers (or encourage positive interferers as co-runners).
Isolate sensitive or antagonistic applications.
Takeaways
1. New datacenter application interference studies can use our identified complexities as a checklist.
2. Our measurement methodology, verified at Google in the first large-scale measurements of live datacenter interference, is generally applicable and shows promising initial performance opportunities.