Slides for our full paper presented at IEEE Cloud 2013. Abstract: In this paper, we propose a concept for improving the energy efficiency and resource utilization of cloud infrastructures by combining the benefits of heterogeneous machine instances. The basic idea is to integrate low-power system on a chip (SoC) machines and high-power virtual machine instances into so-called Elastic Tandem Machine Instances (ETMI). The low-power machine serves low load and is always running to ensure the availability of the ETMI. When load rises, the ETMI scales up automatically by starting the high-power instance and handing over traffic to it. For the non-disruptive transition from low-power to high-power machines and vice versa, we present a handover mechanism based on software-defined networking technologies. Our evaluations show the applicability of low-power SoC machines to serve low load efficiently as well as the desired scalability properties of ETMIs.
Universität Stuttgart
Institute of Parallel and
Distributed Systems (IPVS)
Universitaetsstr. 38
70569 Stuttgart
Germany
Improving the Efficiency of Cloud Infrastructures
with Elastic Tandem Machines
Sixth IEEE International Conference on Cloud Computing
Santa Clara, CA, USA
June 29th, 2013
Frank Dürr
Universität Stuttgart
IPVS
Research Group
“Distributed Systems”
Overview
• Motivation
• System Model
• Elastic Tandem Machines
• Evaluation
• Summary
Motivation
• Data centers contain up to tens of thousands of hosts
• Energy efficiency is one of the major challenges
• The ideal host is energy proportional [Barroso, Hölzle]
◦ Energy consumption should be proportional to utilization/load
[Figure: power consumption vs. utilization (0% idle to 100%) for the ideal system and a real system, with efficiency (0% to 100%) and the efficient area of operation marked.]
Goal
Building the ideal energy-proportional machine
• (Almost) no power consumption while being idle
• Elasticity: Scaling up to nominal (maximum) requested resources
[Figure: power consumption vs. utilization (idle to 100%), annotated “Fill this area of inefficient operation!”]
Contribution: Elastic Tandem Machines
System on a Chip (SoC) Machine
• Low performance
• Low power consumption: ~ 2 Watt
• 700 MHz ARM, 512 MB RAM, 16 GB SD card, 100 Mbps NIC, ~ 35$
+
Classic high power VM on commodity PC hardware
• High performance
• High power consumption
[Image source: www.dell.com]
Elastic Tandem Machine: Best of both worlds
• Low power consumption in idle/weak load
• Scale up to maximum nominal resources
• Transparency: Clients see only one ideal machine
Transparent integration of heterogeneous hardware
Contributions in Detail
Show that SoCs can serve low load in realistic settings
• Web server in 3-tier system architecture
Concept for implementing Elastic Tandem Machines
• Handover concept to switch between SoC and VM
◦ Adaptive: based on dynamic load
◦ Transparent, seamless, non-disruptive
▪ Client just sees one “ideal” machine
▪ Existing (TCP) connections don’t break during handover
◦ “In network” based on Software-defined Networking (SDN)
• Proof of concept implementation and evaluation
System Model (1)
Target environment: Data center of IaaS provider
• SoC machines (Low-power Micro Instances; LPMI)
• Classic VMs on PC hosts (High-power Instances; HPI)
• One LPMI + one HPI = one Elastic Tandem Machine (ETMI)
• Network:
◦ Core switches SDN-enabled
◦ Programmable forwarding tables
[Figure: data center network with client, Internet, SDN controller, core switches, top-of-rack switches, ETMI (HPI + LPMI), and DB.]
System Model (2)
3-Tier web service
• ETMI runs web server (middle tier)
◦ One public IP address for the ETMI → transparency
◦ One web server instance on LPMI and HPI
• File/DB servers in backend
◦ Store all persistent data and state
◦ Not part of optimization!
[Figure: data center network with client, Internet, SDN controller, core switches, top-of-rack switches, DB, and the ETMI (HPI + LPMI) behind one public service IP.]
Overview
• Motivation
• System Model
• Elastic Tandem Machines
◦ Overview
◦ System Components
◦ Seamless handover concept
• Evaluation
• Summary
Basic Concept: Overview
• HTTP requests are forwarded either to …
◦ … LPMI during low load
◦ … HPI during high load
→ SDN-based programming of network (forwarding tables)
• LPMI always running
◦ Service always available
• HPI booted on demand on LPMI overload
• HPI shut down if current load would not overload LPMI
[Figure: SDN controller, core switches, Internet, and ETMI, with the low-load path to the LPMI highlighted.]
System Components
Handover Controller:
• Switches between LPMI and HPI based on their load
• Programs core switches using OpenFlow
• Boots or shuts down HPI via Virtual Machine Manager
• Hysteresis and “ignore period” to prevent oscillation
MAC address re-writing:
• If destination IP matches public IP, write MAC of LPMI (or HPI) into frame
IP aliasing:
• NICs configured with (same) public IP address of service
• Private IPs used for communication with controller
[Figure: handover controller programming the core switches via OpenFlow; top-of-rack switches below.]
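The hysteresis and “ignore period” logic could look roughly like the following Python sketch. It is not the authors' controller: the class name `HandoverPolicy`, its fields, and the 30 s ignore period are assumptions made for illustration. The 80 KB/s and 53 KB/s thresholds are the values reported in the evaluation.

```python
from dataclasses import dataclass


@dataclass
class HandoverPolicy:
    """Hypothetical decision logic: which instance receives new connections."""
    overload_kb_s: float     # LPMI overload threshold
    underload_kb_s: float    # HPI under-load threshold (below overload_kb_s)
    ignore_period_s: float   # assumed minimum time between two handovers
    active: str = "LPMI"     # the LPMI is always running, so start there
    last_switch_s: float = float("-inf")

    def update(self, now_s: float, rate_kb_s: float) -> str:
        """Feed one load sample and return the instance that should be active."""
        if now_s - self.last_switch_s < self.ignore_period_s:
            return self.active  # inside the ignore period: no decision
        if self.active == "LPMI" and rate_kb_s > self.overload_kb_s:
            self.active, self.last_switch_s = "HPI", now_s
        elif self.active == "HPI" and rate_kb_s < self.underload_kb_s:
            self.active, self.last_switch_s = "LPMI", now_s
        return self.active


policy = HandoverPolicy(overload_kb_s=80.0, underload_kb_s=53.0, ignore_period_s=30.0)
for t, rate in [(0, 60), (10, 90), (20, 40), (45, 60), (50, 40)]:
    print(t, rate, policy.update(t, rate))
```

The gap between the two thresholds keeps a load that hovers around one value from triggering repeated handovers, and the ignore period suppresses decisions right after a switch.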
System Components
Load Monitors:
• Notify controller of LPMI overload and HPI under-load
• Load metric: Incoming data rate
• Threshold scheme (overload, under-load thresholds)
• Offline benchmarking to define LPMI overload threshold
[Figure: one load monitor per instance; a monitor reports “Overload!”.]
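A load monitor of this kind can be sketched in Python as follows. The function names, the use of cumulative received-byte counters as input, and the convention 1 KB = 1000 bytes are assumptions for illustration; the slides only state that the metric is the incoming data rate compared against two thresholds.

```python
from typing import Optional, Tuple

Sample = Tuple[float, int]  # (timestamp in seconds, cumulative received bytes)


def incoming_rate_kb_s(prev: Sample, cur: Sample) -> float:
    """Average incoming data rate between two counter samples (1 KB = 1000 bytes, assumed)."""
    (t0, b0), (t1, b1) = prev, cur
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    return (b1 - b0) / (t1 - t0) / 1000.0


def notification(role: str, rate_kb_s: float,
                 t_overload: float, t_underload: float) -> Optional[str]:
    """Return the event a monitor would send to the controller, or None."""
    if role == "LPMI" and rate_kb_s > t_overload:
        return "overload"
    if role == "HPI" and rate_kb_s < t_underload:
        return "underload"
    return None


rate = incoming_rate_kb_s((0.0, 0), (2.0, 200_000))
print(rate, notification("LPMI", rate, t_overload=80.0, t_underload=53.0))
```

Only the LPMI monitor reports overload and only the HPI monitor reports under-load, which mirrors the two notifications named on the slide.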
Seamless Handover
• Problem: Simple switching breaks existing TCP connections
◦ HTTP 1.1: Multiple requests are sent over the same TCP connection!
• Solution: “Pinning” of existing connections to old instance
◦ Controller queries instances for accepted or established connections
▪ Connection monitor (ss or netstat)
◦ Inserts high priority entry into core switch forwarding table:
▪ (client IP, client port, public IP, public port) → MAC_rewriting(instance MAC)
[Figure: the controller asks the connection monitors of both instances “Connections?” and installs the connection pinning in the core switch.]
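As an illustration of connection pinning, the Python sketch below turns connection-monitor output into high-priority forwarding entries. It assumes the usual column layout of `ss -tn` (State, Recv-Q, Send-Q, Local Address:Port, Peer Address:Port) and IPv4 addresses. The entry format is a schematic dictionary, not real OpenFlow or Floodlight syntax, and all names, addresses, and the priority value are hypothetical.

```python
from typing import Dict, List, NamedTuple


class Conn(NamedTuple):
    client_ip: str
    client_port: int
    public_ip: str
    public_port: int


# States treated as "accepted or established" (an assumption about the mapping).
PINNED_STATES = {"ESTAB", "SYN-RECV"}


def parse_ss(output: str) -> List[Conn]:
    """Extract client/public 4-tuples from `ss -tn`-style text."""
    conns = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[0] not in PINNED_STATES:
            continue  # header line or a state that is not pinned
        local_ip, local_port = fields[3].rsplit(":", 1)
        peer_ip, peer_port = fields[4].rsplit(":", 1)
        conns.append(Conn(peer_ip, int(peer_port), local_ip, int(local_port)))
    return conns


def pinning_entries(conns: List[Conn], instance_mac: str,
                    priority: int = 200) -> List[Dict]:
    """Build one schematic high-priority entry per existing connection."""
    return [{"priority": priority,
             "match": c._asdict(),
             "rewrite_dst_mac": instance_mac} for c in conns]


SAMPLE = """State  Recv-Q Send-Q Local Address:Port  Peer Address:Port
ESTAB  0      0      203.0.113.10:80     198.51.100.7:53422
TIME-WAIT 0   0      203.0.113.10:80     198.51.100.9:40000
"""
for entry in pinning_entries(parse_ss(SAMPLE), "02:00:00:00:00:01"):
    print(entry)
```

Because the pinned entries have a higher priority than the default rule for the public IP, packets of existing connections keep reaching the old instance after the default rule is switched.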
Seamless Handover
• It’s not that simple!
◦ There’s a race condition
• Solution: Block connection requests before querying
◦ Controller programs firewalls on LPMI/HPI
◦ Unblock after flow re-direction
[Figure: sequence diagram with Client, Controller, Connection Monitor (LPMI), and Web Server (LPMI); messages SYN, “query open connections”, ACK / SYN, and “pin open connections” at times t1, t2, …; annotation: “Connection accepted after query (t1)!”]
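The ordering that closes the race window can be sketched as below. This is a simulation under stated assumptions, not the authors' code: the `Instance` class and the step strings are hypothetical stand-ins for firewall, connection-monitor, and OpenFlow calls, and blocking only on the old instance is one possible reading of “firewalls on LPMI/HPI”.

```python
from typing import List, Tuple

Conn = Tuple[str, int]  # (client IP, client port)


class Instance:
    """Hypothetical stand-in for an LPMI or HPI with its open connections."""

    def __init__(self, name: str, mac: str, conns: List[Conn]) -> None:
        self.name = name
        self.mac = mac
        self.conns = list(conns)
        self.blocked = False


def handover(old: Instance, new: Instance) -> List[str]:
    """Return the controller's actions in the order they would be issued."""
    steps: List[str] = []
    old.blocked = True  # 1. no new connection can be accepted from now on
    steps.append(f"block new connections on {old.name}")
    snapshot = list(old.conns)  # 2. the queried set can no longer grow
    steps.append(f"query {old.name}: {len(snapshot)} open")
    for ip, port in snapshot:  # 3. pin every existing connection
        steps.append(f"pin {ip}:{port} -> {old.mac}")
    steps.append(f"redirect new flows -> {new.mac}")  # 4. switch default path
    old.blocked = False  # 5. unblock after flow re-direction
    steps.append(f"unblock {old.name}")
    return steps


lpmi = Instance("LPMI", "02:00:00:00:00:01", [("198.51.100.7", 53422)])
hpi = Instance("HPI", "02:00:00:00:00:02", [])
print("\n".join(handover(lpmi, hpi)))
```

Blocking before the query ensures that no connection can be accepted between the query and the installation of the pinning entries, which is the case the sequence diagram flags.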
Evaluation Setup
• Elastic Tandem Machine with:
◦ Low-power Instance (SoC): Raspberry Pi
▪ 700 MHz ARM CPU, 512 MB RAM, 100 Mbps Ethernet NIC
◦ High-power Instance (PC):
▪ AMD Athlon 64 X2 Dual Core 4.2 GHz, 2 GB RAM, 1 Gbps Ethernet NIC
◦ Running Apache Web server, PHP, Tomcat servlet engine
• Backend: NFS file server, MySQL
• Core switch: PC with Open vSwitch and multiport NIC
◦ Line rate forwarding (no bottleneck)
• SDN handover controller based on Floodlight
[Figure: testbed with HTTP client, handover controller (OpenFlow), ETMI (LPMI and HPI, each running Apache and Tomcat), and the NFS/MySQL backend.]
LPMI (SoC) Performance for Static Web Pages
Scenario:
• Real static web pages from: http://www.netsys2013.de/
• Increase avg. request rate by 1 request/s every 50 s (Poisson distr.)
Performance:
• Throughput:
◦ Max. 26 pages/s
• Response time:
◦ Significant increase at 20 requests/s (> 150 ms)
◦ Performance limit
Low-power SoC can serve realistic low load
(too slow for processing-intensive jobs; see paper)
[Figure: LPMI performance plot, annotated at 20 requests/s.]
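A load pattern like the one in this scenario, Poisson arrivals whose average rate grows by 1 request/s every 50 s, can be generated with exponentially distributed inter-arrival times. The Python sketch below is only an illustration. The function name, the starting rate of 1 request/s, and the seed are assumptions, and the rate is updated only when a new gap is drawn, which is a slight approximation at the step boundaries.

```python
import random
from typing import List


def arrival_times(duration_s: float, step_s: float = 50.0,
                  start_rate: float = 1.0, rate_step: float = 1.0,
                  seed: int = 0) -> List[float]:
    """Request timestamps for a Poisson process with a stepwise increasing rate."""
    rng = random.Random(seed)
    t, times = 0.0, []
    while True:
        # Average rate (requests/s) of the 50 s step that contains time t.
        rate = start_rate + rate_step * int(t // step_s)
        t += rng.expovariate(rate)  # exponential gap with mean 1/rate
        if t >= duration_s:
            return times
        times.append(t)


times = arrival_times(duration_s=500.0)
first = sum(1 for x in times if x < 50.0)
last = sum(1 for x in times if x >= 450.0)
print(len(times), first, last)
```

Each timestamp would trigger one HTTP request in a load generator; the counts printed for the first and last 50 s window show the rising rate.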
ETMI Performance for Static Web Pages
Increase request rate by 1 request/s every 50 s until 2500 s, then decrease rate by 1 request/s every 50 s
Configuration:
• Switch between LPMI and HPI at data rate
◦ T_overload = 80 KB/s
◦ T_underload = 53 KB/s
Performance:
• Scales to maximum HPI performance
• Seamless handover
◦ No broken HTTP connections
ETMI scales up transparently
[Figure: ETMI performance over time, with the handovers LPMI → HPI and HPI → LPMI marked.]
Energy Efficiency (Static Web Page Scenario)
Idle mode power consumption:
• SoC: 1.85 W
• PC host: 141.22 W
[Figure: energy-efficiency plots for PC host and SoC; annotation: “The SoC area (left figure)”.]
Energy Efficiency – Comparison with Virtualization (Static Web Page Scenario)
Idle mode power consumption:
• SoC: 1.85 W
• PC host: 141.22 W
→ 76 (idle) VMs per host for same energy efficiency
Fair comparison: PC host must serve same load as 76 SoCs
• At 76 x 4 requests/s = 304 requests/s:
◦ 76 SoCs: 76 x 1.89 W = 143.65 W
◦ PC host: 1 x 184.46 W
◦ 22% energy savings
• At 76 x 8 requests/s = 608 requests/s:
◦ 76 SoCs: 76 x 1.92 W = 145.92 W
◦ PC hosts: 2 x 184.46 W = 368.92 W
◦ 60% energy savings
Our PC host could only serve max. 300 requests/s!!!
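The figures on this slide can be checked with a few lines of Python. The variable names are illustrative, and all wattages are taken from the slide. Note that 76 x 1.89 W evaluates to 143.64 W; the slide's 143.65 W presumably stems from an unrounded per-SoC measurement, so the sketch uses the slide's total for the first case.

```python
import math

SOC_IDLE_W = 1.85     # idle SoC
PC_IDLE_W = 141.22    # idle PC host
PC_LOADED_W = 184.46  # PC host under load, as used on the slide


def savings(soc_total_w: float, pc_total_w: float) -> float:
    """Fraction of power saved by the SoCs relative to the PC host(s)."""
    return 1.0 - soc_total_w / pc_total_w


# Number of idle SoCs that draw as much power as one idle PC host.
socs_per_host = math.floor(PC_IDLE_W / SOC_IDLE_W)

low = savings(143.65, 1 * PC_LOADED_W)      # 304 requests/s, one PC host
high = savings(76 * 1.92, 2 * PC_LOADED_W)  # 608 requests/s, two PC hosts

print(socs_per_host, round(100 * low), round(100 * high))
```

The second case needs two PC hosts because a single host served at most 300 requests/s in the evaluation, which is what raises the savings from about 22% to about 60%.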
(Closest) Related Work (for more, see paper)
SoC & host integration
• B.-G. Chun, G. Iannaccone, R. Katz, G. Lee, and L. Niccolini, ACM SIGOPS Operating Systems Review, 44(1), 2010
◦ Integration of discrete server systems as one design option
◦ Our handover mechanism is one (network-centric) technical solution for a transparent integration
Load balancing mechanisms
• R. Wang, D. Butnariu, and J. Rexford, Hot-ICE 2011
◦ SDN-based approach for keeping TCP connections alive
▪ Approach 1: Re-directs packets to controller (possibly high load on controller)
▪ Approach 2: Timeout heuristic (problem of setting timeout)
◦ We utilize readily available end-system information about connections
◦ We handle dynamic state consistently through firewall “locks”
Summary and Future Work
Elastic Tandem Machine
• Concept for transparent integration of SoCs and classic VMs
• Low power consumption at weak load
• Elasticity: Scale up to nominal resources
• SDN-based seamless handover concept
Future work
• Integrating more than two machine types
◦ Micro instance, small instance, large instance, …
• Predictive load/performance models to plan handover in advance
Discussion
Full paper:
http://goo.gl/Vkdmfc
Contact:
Frank Dürr
email: frank.duerr@ipvs.uni-stuttgart.de
WWW: http://goo.gl/o6u2A