University of Massachusetts, Amherst – Department of Computer Science
Dynamic Resource Management in Internet Hosting Platforms
Ph.D. Thesis Defense
Bhuvan Urgaonkar
Advisor: Prashant Shenoy
Internet Applications
Proliferation of Internet applications
  E.g., auction site, online game, online retail store
Growing significance in personal, business affairs
Focus: Internet server applications
Hosting Platforms
Data centers
  Clusters of servers
  Storage devices
  High-speed interconnect
Hosting platforms
  Rent resources to third-party applications
  Performance guarantees in return for revenue
Benefits
  Applications: don’t need to maintain their own infrastructure
    o Rent server resources, possibly on demand
  Platform provider: generates revenue by renting resources
Goals of a Hosting Platform
Meet service-level agreements
  Satisfy application performance guarantees
    o E.g., average response time, throughput
Maximize revenue
  E.g., maximize the number of hosted applications
Question: How should a hosting platform manage its resources to meet these goals?
Challenge #1: Dynamic Workloads
Multi-time-scale variations
  Time-of-day, hour-of-day
Overloads
  E.g., flash crowds
User threshold for response time: 8-10 s
Key issue: How to provide good response time under varying workloads?
[Figure: workload traces. Request rate (req/min) vs. time (hours); arrivals per min vs. time (days).]
Challenge #2: Complexity of Applications
Complex software architecture
  Diverse software components
    o Web servers, Java application servers, databases
Multiple classes of clients
  How to provide differentiated service?
Replicable components
  How many replicas to have?
Tunable configuration parameters
  E.g., MaxClients in Apache
  How to set these parameters?
Key issue: How to capture all this complexity?
Talk Outline
Motivation
Thesis contributions
Application modeling
Dynamic provisioning
Scalable request policing
Conclusions
Hosting Platform Models
Small applications
  Require only a fraction of a server
  Shared Web hosting, $20/month to run own Web site
Shared hosting: multiple applications on a server
  Co-located applications compete for server resources
Hosting Platform Models
Large applications
  May span multiple servers
  eBay site uses thousands of servers!
Dedicated hosting: at most one application per server
  Allocation at the granularity of a single server
Thesis Contributions
Dynamic resource management in hosting platforms
Shared Hosting
  Statistical multiplexing and under-provisioning [OSDI 2002]
  Application placement [PDCS 2004]
Dedicated Hosting
  Analytical model for an Internet application [SIGMETRICS 2005]
  Dynamic provisioning [Autonomic Computing 2005]
  Scalable request policing [PODC 2004, WWW 2005]
Talk Outline
Motivation
Thesis contributions
Application modeling
Dynamic provisioning
Scalable request policing
Conclusions
Internet Application Architecture
Multi-tier architecture
  Each tier uses services provided by its successor
Session-based workloads
[Figure: request processing in an online bookstore across HTTP, J2EE, and database tiers. A search for “moby” issues queries; the response lists Melville’s ‘Moby Dick’ and music CDs by Moby.]
Baseline Application Model
Model consists of two components
  Sub-system to capture behavior of clients
  Sub-system to capture request processing inside the application
[SIGMETRICS’05]
[Figure: clients sub-system connected to application sub-system]
Modeling Clients
Clients think between successive requests
  Infinite server system to capture think time Z
  Captures independence of Z from processing in application
[Figure: clients 1 to N, each with think time Z, form the infinite-server queue Q0, which feeds the application]
Modeling Request Processing
[Figure: N requests circulate through queues Q1, Q2, ..., QM (tier 1, tier 2, ..., tier M) with service times S1, S2, ..., SM and transition probabilities p1, p2, p3, ..., pM = 1]
Transitions defined to capture circulation of requests
  Request may move to next queue or previous queue
Multiple requests are processed concurrently at tiers
  Processor sharing scheduling discipline
Caching effects get captured implicitly!
Putting It All Together
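[Figure: client queue Q0 (think time Z) connected in a loop with tier queues Q1, Q2, ..., QM (tier 1, tier 2, ..., tier M; service times S1, S2, ..., SM; transition probabilities p1, p2, p3, ..., pM = 1); N sessions circulate]
A closed-queuing model that captures a given number of simultaneous sessions being served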
Mean-value Analysis
Product-form closed queuing network
  $L_m$: average length of $Q_m$
  $A_m$: average number of clients in $Q_m$ seen by arriving client
$A_m(n+1) = L_m(n)$ for $m = 1, \dots, M$
Iterative algorithm to compute mean queue lengths, sojourn times
[Figure: with n clients in the network (Q0; Q1, Q2, ..., QM), client n+1 sees $A_1(n+1) = L_1(n)$, $A_2(n+1) = L_2(n)$, ..., $A_M(n+1) = L_M(n)$]
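The iteration on this slide can be sketched as the textbook exact MVA recursion for a closed network with M queueing tiers and one think-time (delay) station. This is a minimal sketch, not the thesis implementation: the function name `mva` and its argument names are hypothetical, and it assumes each tier is described only by its visit ratio and mean service time.

```python
from typing import List, Tuple


def mva(service_times: List[float], visit_ratios: List[float],
        think_time: float, num_sessions: int) -> Tuple[float, float, List[float]]:
    """Textbook exact MVA: returns (mean response time, throughput, mean queue lengths)."""
    queue_len = [0.0] * len(service_times)  # L_m(0) = 0 for every tier
    response = 0.0
    throughput = 0.0
    for n in range(1, num_sessions + 1):
        # Arrival theorem: with n clients, an arrival sees A_m(n) = L_m(n - 1).
        residence = [v * s * (1.0 + seen)
                     for v, s, seen in zip(visit_ratios, service_times, queue_len)]
        response = sum(residence)
        # Little's law over the full cycle (think time plus response time).
        throughput = n / (think_time + response)
        # Little's law per tier gives the new mean queue lengths L_m(n).
        queue_len = [throughput * r for r in residence]
    return response, throughput, queue_len
```

With a single tier, unit service time, no think time, and two sessions, the recursion gives a mean response time of 2 and a throughput of 1, which matches a saturated single server shared by two clients.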
Parameter Estimation
Visit ratios
  Equivalent to transition probabilities for MVA
  $V_i \approx \lambda_i / \lambda_{req}$; $\lambda_{req}$ measured at sentry, $\lambda_i$ from logs
Service times
  Use residence time $X_i$ logged at tier i
  For last tier, $S_M \approx X_M$
  $S_i = X_i - (V_{i+1} / V_i) \cdot X_{i+1}$
Think time
  Measured at the application sentry
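The two estimation formulas above translate directly into code. The sketch below assumes per-tier request rates and residence times have already been extracted from logs; the function and argument names are hypothetical, and at least one tier is assumed.

```python
from typing import List


def visit_ratios(tier_rates: List[float], request_rate: float) -> List[float]:
    """V_i ~= lambda_i / lambda_req, with lambda_req measured at the sentry."""
    return [rate / request_rate for rate in tier_rates]


def service_times(residence: List[float], visits: List[float]) -> List[float]:
    """S_M ~= X_M for the last tier; S_i = X_i - (V_{i+1} / V_i) * X_{i+1} otherwise."""
    num_tiers = len(residence)
    result = [0.0] * num_tiers
    result[-1] = residence[-1]
    for i in range(num_tiers - 2, -1, -1):
        result[i] = residence[i] - (visits[i + 1] / visits[i]) * residence[i + 1]
    return result
```

For example, residence times of 100, 30, and 10 msec with visit ratios 1, 2, and 4 give service times of 40, 10, and 10 msec: each tier's residence time is reduced by the time its requests spend in the next tier.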
Evaluation of Baseline Model
Auction site RUBiS
  One server per tier: Apache, JBoss, MySQL
[Figure: avg resp time (msec) vs. num sessions, observed vs. basic model]
Concurrency limits not captured
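Handling Concurrency Limits
[Figure: closed queuing model (Q0 with think time Z; Q1, Q2, ..., QM with service times S1, S2, ..., SM; N sessions) with dropped requests leaving the tier queues]
Requests may be dropped due to concurrency limits
Need to model the finiteness of queues!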
Handling Concurrency Limits
[Figure: closed queuing model (Q0; Q1, Q2, ..., QM) extended with a drop subsystem at each tier queue]
Approach: Subsystems to capture dropped requests
Distinguish the processing of dropped requests
Estimating Drop Probabilities and Delay Values
Drop probability
  Step 1: Estimate throughput using MVA assuming no concurrency limits
  Step 2: Estimate $p_i^{drop}$ as the drop probability of an M/M/1/$K_i$ queue
Delay value for tier i
  Subject the application to offline workload that causes limit to be exceeded only at tier i; record response time of failed requests
[Figure: a tier with a low concurrency limit $K_i$ between tiers with high limits; of throughput $t$, $t \cdot (1 - p_i^{drop})$ passes through and $t \cdot p_i^{drop}$ is dropped]
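Step 2 relies on the standard blocking probability of an M/M/1/K queue, $p = (1 - \rho)\rho^{K} / (1 - \rho^{K+1})$ with $\rho = t \cdot S_i$, and $p = 1/(K+1)$ when $\rho = 1$. A minimal sketch follows; the function name and arguments are hypothetical, and treating the throughput from Step 1 as the arrival rate at the tier is an assumption made here for illustration.

```python
def mm1k_drop_probability(throughput: float, service_time: float, limit: int) -> float:
    """Blocking probability of an M/M/1/K queue that holds at most `limit` requests."""
    rho = throughput * service_time  # offered load at this tier
    if abs(rho - 1.0) < 1e-12:
        return 1.0 / (limit + 1)
    if rho > 1.0:
        # Algebraically the same formula, divided through by rho**(limit + 1)
        # so that large loads do not overflow.
        inv = 1.0 / rho
        return (1.0 - inv) / (1.0 - inv ** (limit + 1))
    return (1.0 - rho) * rho ** limit / (1.0 - rho ** (limit + 1))
```

The drop probability grows with the offered load and shrinks as the concurrency limit is raised, which is the behavior the enhanced model needs to reproduce.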
Response Time Prediction
[Figure: avg resp time (msec) vs. num sessions: observed, basic model, enhanced model]
Enhanced model can capture concurrency limits
Replication and Load Imbalances
Causes of imbalance
  “Sticky” sessions
  Variation in session durations and resource requirements
Imbalance factor for jth most-loaded replica of tier i
  imbalance(i, j) = num_arrivals(i, j) / num_arrivals(i)
Scale visit ratio
  $V_{i,j} = V_i \cdot \mathrm{imbalance}(i, j)$
[Figure: Apache, two JBoss replicas, MySQL]
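The imbalance factor and the scaled visit ratio can be computed from per-replica arrival counts. This is a sketch under the assumption that counts for one tier are available over a common measurement window and that at least one request arrived; the function names are hypothetical.

```python
from typing import List


def imbalance_factors(arrivals_per_replica: List[int]) -> List[float]:
    """imbalance(i, j) for one tier, ordered from most-loaded to least-loaded replica."""
    total = sum(arrivals_per_replica)  # num_arrivals(i); assumed to be positive
    ranked = sorted(arrivals_per_replica, reverse=True)
    return [count / total for count in ranked]


def scaled_visit_ratios(visit_ratio: float, arrivals_per_replica: List[int]) -> List[float]:
    """V_{i,j} = V_i * imbalance(i, j) for each replica j of the tier."""
    return [visit_ratio * factor for factor in imbalance_factors(arrivals_per_replica)]
```

Because replicas are ranked by load rather than by identity, the factors stay meaningful even when the imbalance shifts from one replica to another over time.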
Capturing Load Imbalance
[Figure: number of requests per replica (replicas 1 to 3) vs. time (sec)]
[Figure: avg. resp. time (msec) by load (least loaded, medium loaded, most loaded, average): observed, perfect load balancing, enhanced model]
Session affinity causes load imbalance
  Imbalance shifts among replicas
Our enhancement helps improve response time prediction
Talk Outline
Motivation
Thesis contributions
Application modeling
Dynamic provisioning
Scalable request policing
Conclusions
Dynamic Provisioning
Key idea: increase or decrease allocated servers to handle workload fluctuations
  Monitor incoming workload
  Compute current or future demand
  Match number of allocated servers to demand
[Figure: control loop: monitor workload, compute current/future demand, adjust allocation]
[Autonomic Computing’05]
Dynamic Provisioning at Multiple Time-scales
Predictive provisioning
  Certain Internet workload patterns can be predicted
    o E.g., time-of-day effects, increased workload during Thanksgiving
  Provision using model at time-scale of hours or days
Reactive provisioning
  Applications may see unpredictable fluctuations
    o E.g., increased workload to news sites after an earthquake
  Detect such anomalies and react fast (minutes)
Request Policing
Key idea: If incoming request rate > current capacity
  Turn away excess requests
Why police when you can provision?
  Provisioning is not instantaneous
    o Residual sessions on reallocated server
    o Application and OS installation and configuration overheads
  Overhead of several (5-30) minutes
[Figure: sentry policing incoming requests and dropping the excess]
Existing Work
Lots of existing work on request policing
  [Kanodia00, Li00, Verma03, Welsh03, Abdelzaher99, …]
Shortcomings of existing work:
  Does not attempt to integrate policing and provisioning
  Does not address scalability of the policer!
    o The policer itself may become the bottleneck during overloads
Policer: Design Goals
Each class should sustain its guaranteed admission rate
Class-based differentiation and revenue maximization
  Challenging due to online nature of the problem
    o An admitted request may cause a more important request arriving later to be dropped
  Approach: Preferential admission to higher class requests
Scalability
  The policer should remain operational even under extremely high arrival rates
Overview of Policer Design
Our policer has three components
  Request classifier and per-class leaky buckets
  Class-specific queues
  Admission control
[Figure: classifier feeds per-class leaky buckets (gold, silver, bronze), then class-specific queues with delays d_gold, d_silver, d_bronze, then admission control, which marks each request admitted or dropped]
[PODC’04 / WWW’05]
Class-based Differentiation
[Figure: policer pipeline: classifier, leaky buckets (gold, silver, bronze), class-specific queues (delays d_gold, d_silver, d_bronze), admission control (admitted or dropped)]
Each incoming request undergoes classification
Per-class leaky buckets used to ensure that rates guaranteed in SLA are admitted
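A per-class leaky bucket can be sketched as a meter that drains at the class's guaranteed rate and marks a request as conforming when it still fits in the bucket. The slides do not give the bucket parameters or say how non-conforming requests are treated, so the class name `LeakyBucket`, the `depth` parameter, and the example rates are all hypothetical.

```python
class LeakyBucket:
    """Leaky bucket used as a meter: drains at `rate` requests/sec, holds `depth` requests."""

    def __init__(self, rate: float, depth: float) -> None:
        self.rate = rate
        self.depth = depth
        self.level = 0.0  # current fill level, in requests
        self.last = 0.0   # timestamp of the previous arrival, in seconds

    def conforms(self, now: float) -> bool:
        """Return True if a request arriving at time `now` is within the guaranteed rate."""
        leaked = (now - self.last) * self.rate
        self.level = max(0.0, self.level - leaked)
        self.last = now
        if self.level + 1.0 <= self.depth:
            self.level += 1.0
            return True
        return False


# One bucket per class; the rates and depths below are made-up illustrative values.
buckets = {
    "gold": LeakyBucket(rate=100.0, depth=20.0),
    "silver": LeakyBucket(rate=50.0, depth=10.0),
    "bronze": LeakyBucket(rate=10.0, depth=5.0),
}
```

Keeping one small state record per class means the check costs a few arithmetic operations per request, which matters when the policer itself must not become the bottleneck.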
Revenue Maximization
[Figure: policer pipeline: classifier, leaky buckets (gold, silver, bronze), class-specific queues (delays d_gold, d_silver, d_bronze), admission control (admitted or dropped)]
Idea: Different delays in processing requests of different classes
  More important requests processed more frequently
Methodology to compute delay values in online manner
  Bounds probability of a request denying admission to a more important request [Appendix B of thesis]
Admission Control
[Figure: policer pipeline: classifier, leaky buckets (gold, silver, bronze), class-specific queues (delays d_gold, d_silver, d_bronze), admission control (admitted or dropped)]
Goal: Ensure that an admitted request meets its response time target
Measurement-based admission control algorithm
  Use information about current load on servers and estimated size of new request to make decision
Scalability of Admission Control
Idea #1: Reduce the per-request admission control cost
  Admission control on every request may be expensive
  Bursty arrivals during overloads => batches get formed
  Delays for class-based differentiation => batches get formed
  Admission control that operates on batches instead of requests
Idea #2: Sacrifice accuracy to reduce computational overhead
  When batch-based processing becomes prohibitive
  Threshold-based scheme
    o E.g., admit all Gold requests, drop all Silver and Bronze requests
    o Thresholds chosen based on observed arrival rates and service times
    + Extremely efficient
    - Wrong threshold => bad response times or fewer requests admitted
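The slide says only that thresholds are chosen from observed arrival rates and service times. One plausible rule, shown here purely as an assumption and not as the thesis's method, is to admit classes in priority order for as long as their combined offered load fits within the available capacity. The function name, the tuple layout, and the utilization-based capacity are hypothetical.

```python
from typing import List, Tuple


def choose_admitted_classes(classes: List[Tuple[str, float, float]],
                            capacity: float = 1.0) -> List[str]:
    """Pick the classes to admit wholesale.

    `classes` lists (name, arrival rate in req/s, mean service time in s),
    most important class first. `capacity` is the utilization budget.
    """
    admitted: List[str] = []
    utilization = 0.0
    for name, arrival_rate, service_time in classes:
        demand = arrival_rate * service_time  # offered load of this class
        if utilization + demand > capacity:
            break  # this class and every less important one is dropped
        utilization += demand
        admitted.append(name)
    return admitted
```

Once the admitted set is fixed, the per-request decision is a single class lookup. That is where the efficiency comes from, and also why a stale or wrong threshold either overloads the servers or turns away requests that could have been served.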
Scaling Even Further …
Protocol processing overheads will saturate sentry resources at extremely high arrival rates
  Indiscriminate dropping of requests will occur
    o Important requests may be turned away without even undergoing the admission control test
    o Loss in revenue!
  Sentry should still be able to process each arriving request!
Idea: Dynamic capacity provisioning for sentry
  Pull in an additional sentry if CPU utilization of existing sentries exceeds a threshold (e.g., 90%)
  Round-robin DNS to load balance among sentries
Class-based Differentiation
[Figure: arrival rate (req/s) vs. time (sec) for Gold, Silver, Bronze]
[Figure: fraction admitted vs. time (sec) for Gold, Silver, Bronze]
Three classes of requests: Gold, Silver, Bronze
Policer successful in providing preferential admission to important requests
Threshold-based: Higher Scalability
[Figure: scalability: CPU util (%) vs. arrival rate (req/s), batch vs. threshold]
Threshold-based processing allows the policer to handle up to 4 times higher arrival rate
Single sentry can handle about 19000 req/s
Threshold-based: Loss of Accuracy
[Figure: admission rate (req/s) vs. time (sec) for Gold, Silver, Bronze]
[Figure: 95th resp time (msec) vs. time (sec) for Gold, Silver, Bronze]
Higher scalability comes at a loss in accuracy of admission control
  More violations of response time targets
Talk Outline
Motivation
Thesis contributions
Application modeling
Dynamic provisioning
Scalable request policing
Summary and Future Research
Thesis Contributions
Dynamic resource management in hosting platforms
Shared Hosting
  Statistical multiplexing and under-provisioning [OSDI 2002]
  Application placement [PDCS 2004]
Dedicated Hosting
  Analytical model for Internet applications [SIGMETRICS 2005]
  Dynamic provisioning [Autonomic Computing 2005]
  Scalable request policing [PODC 2004, WWW 2005]
Future Research Directions
Virtual machine based hosting
  Recent research has shown feasibility of migrating VMs across nodes
  Adds a new dimension to the capacity provisioning problem
Characterizing multi-tier workloads
  Workloads for standalone Web servers are well-characterized
  E.g., typical service times at Java tier or query processing times?
  Offshoot of this study: workload generators for multi-tier applications
Automated determination of provisioning parameters
  Predictor and reactor invoked based on manually chosen frequencies
  System administrators use rules-of-thumb => error-prone
Thanks to …
Advisor
  Prashant Shenoy
Thesis committee
  Emery Berger, Jim Kurose, Don Towsley, Tilman Wolf
Collaborators
  Abhishek Chandra, Pawan Goyal, Giovanni Pacifici, Timothy Roscoe, Arnold Rosenberg, Mike Spreitzer, Asser Tantawi
All my teachers
  Paul Cohen, Mani Krishna, Don Towsley
Friends and family
Questions or comments?
Query Caching at the Database
[Figure: avg resp time (msec) vs. % queries cached, observed vs. model]
Caching effects
  Captured by tuning $V_i$ and/or $S_i$
Bulletin-board site RUBBoS
  50 sessions
SELECT SQL_NO_CACHE causes MySQL to not cache the response to a query
Agile Switching Using Virtual Machine Monitors
Use VMMs to enable fast switching of servers
  Switching time only limited by residual sessions
[Figure: two servers, each running a VMM that hosts VM1, VM2, and VM3; VMs switch between active and dormant]
VMMs allow multiple “virtual” machines on a server
  E.g., Xen, VMware, …
Prototype Data Center
40+ Linux servers
Gigabit switches
Multi-tier applications
  Auction (RUBiS)
  Bulletin-board (RUBBoS)
  Apache, JBoss (replicable)
  MySQL database
[Figure: control plane (application placement, dynamic provisioning) managing server nodes; each node runs a nucleus (resource monitoring, parameter estimation) alongside apps (application capsules, sentries) on top of the OS]
Sentry Provisioning (XXX)
[Figure: CPU util (%) vs. time (sec)]
[Figure: arrival rate (req/s) vs. time (sec): total arrival, arrival at sentry 1]
System Overview
Control Plane
  Centralized resource manager
Nucleus
  Per-server measurements and resource management
Sentry
  Per-application admission control
Capsule
  Component of an application running on a server
[Figure: control plane (application placement, dynamic provisioning) managing server nodes; each node runs a nucleus (resource monitoring, parameter estimation) alongside apps (application capsules, sentries) on top of the OS]
Existing Application Models
Models for Web servers [Chandra03, Doyle03]
  Do not model Java server, database etc.
Black-box models [Kamra04, Ranjan02]
  Unaware of bottleneck tier
Extensions of single-tier models [Welsh03]
  Fail to capture interactions between tiers
Existing models inadequate for multi-tier Internet applications
Existing Work
Predictable resource management within a single server
  Proportional-share schedulers for CPU, network [Duda, Goyal, Waldspurger]
    o Multi-processors [Chandra]
  Memory management [Berger, Waldspurger]
  Disk scheduling [Shenoy]
Hosting platforms and Internet applications
  Rice, Duke, Penn State: shared platforms for Web servers
  IBM, HP Labs: shared platforms, workload prediction
  Berkeley: novel architecture for Internet applications
Main shortcomings
  Possible statistical multiplexing gains in shared platforms unexplored
  Most work assumes simplistic applications (e.g., only Web servers)
  Provisioning either purely reactive or purely predictive
  Handling of extreme overloads not addressed satisfactorily
Predictive Provisioning
[Figure: predictive provisioning components: monitor, predictors, application models, allocator, servers; labeled flows: workload measurements, observed workload, predicted workload, resource requirements, server allocations]
Reactive Provisioning
Idea: react to current conditions
  Useful for capturing significant short-term fluctuations
  Can correct errors in predictions
Track error between long-term predictions and actual
  Allocate additional servers if error exceeds a threshold
Can be invoked if request drop rate exceeds a threshold
Operates over time scale of a few minutes
Pure reactive provisioning: lags workload
  Reactive + predictive more effective!
[Figure: time series of predicted vs. actual workload; when the prediction error exceeds a threshold, the reactor is invoked to allocate servers]
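The two triggers listed above can be combined into a single check that runs every few minutes. The sketch below is illustrative only: the function name, the use of relative (rather than absolute) prediction error, and the threshold values in the usage note are assumptions that the slides do not specify.

```python
def should_invoke_reactor(predicted_rate: float, observed_rate: float,
                          drop_rate: float, error_threshold: float,
                          drop_threshold: float) -> bool:
    """Decide whether reactive provisioning should allocate additional servers."""
    if predicted_rate <= 0.0:
        # No usable prediction: react to any observed load or excessive drops.
        return observed_rate > 0.0 or drop_rate > drop_threshold
    # Only under-prediction matters here, since the reactor adds servers.
    error = (observed_rate - predicted_rate) / predicted_rate
    return error > error_threshold or drop_rate > drop_threshold
```

For instance, with an error threshold of 0.2 and a drop threshold of 0.05, a workload that exceeds its prediction by 50% triggers the reactor, while a 10% overshoot triggers it only if more than 5% of requests are being dropped.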
Dynamic Capacity Provisioning
Auction application RUBiS
  Factor of 4 increase in 30 min
[Figure: workload: arrivals per min vs. time (min)]
[Figure: server allocations: number of servers (Web servers, app servers) vs. time (min)]
[Figure: response time: resp time (msec) vs. time (min)]
Server allocations increased to match increased workload
Response time kept below 2 seconds