
Eucalyptus: A Technical Report on an Elastic Utility Computing Architecture Linking Your Programs to Useful Systems

UCSB Computer Science Technical Report Number 2008-10

Daniel Nurmi, Rich Wolski, Chris Grzegorczyk, Graziano Obertelli, Sunil Soman, Lamia Youseff, Dmitrii Zagorodnov

Computer Science Department, University of California, Santa Barbara

Santa Barbara, California 93106

Abstract

Utility computing, elastic computing, and cloud computing are all terms that refer to the concept of dynamically provisioning processing time and storage space from a ubiquitous “cloud” of computational resources. Such systems allow users to acquire and release resources on demand and provide ready access to data from processing elements, while abstracting away the physical location and exact parameters of the resources. Over the past few years, such systems have become increasingly popular, but nearly all current cloud computing offerings are either proprietary or depend upon software infrastructure that is invisible to the research community.

In this work, we present Eucalyptus, an open-source software implementation of cloud computing that utilizes compute resources that are typically available to researchers, such as clusters and workstation farms. In order to foster community research exploration of cloud computing systems, the design of Eucalyptus emphasizes modularity, allowing researchers to experiment with their own security, scalability, scheduling, and interface implementations. In this paper, we outline the design of Eucalyptus, describe our own implementations of the modular system components, and provide results from experiments that measure the performance and scalability of a Eucalyptus installation currently deployed for public use.

The main contribution of our work is the presentation of the first research-oriented open-source cloud computing system focused on enabling methodical investigations into the programming, administration, and deployment of systems exploring this novel distributed computing model.

1 Introduction

Scalable Internet services [1, 4, 24, 44] deliver massive amounts of computing power (in aggregate) on demand to large, internationally distributed user communities through well-defined software interfaces. Until recently, however, access to these services has been restricted to human-oriented and simple query-style application programming interfaces (APIs). With few exceptions, an application programmer wishing to incorporate such a service as a software component had little ability to direct and control computation inside the service explicitly.

Cloud computing [11, 46] has emerged as a new paradigm for providing programmatic access to scalable Internet service venues.¹ While significant debate continues with regard to the “optimal” level of abstraction that such programmatic interfaces should support (cf. software-as-a-service versus platform-as-a-service versus infrastructure-as-a-service [13, 25, 26, 34]), the general goal is to provide users with the ability to program resources within a very-large-scale resource “cloud” so that they can take advantage of the potential performance, cost, and reliability benefits that access to scale makes possible.

¹ The term “cloud computing” is considered by some to be synonymous with the terms “elastic computing,” “utility computing,” and occasionally “grid computing.” For the purposes of this paper, we will use the term “cloud computing” to refer to cloud, elastic, or utility computing but not to grid computing. The difference is explained in Section 4.

In short, the model is to provide a large user base with the ability to program some specified fraction of the resources hosted by a scalable service provider (e.g., Google [24], Amazon [4], SalesForce [44], or 3Tera [1]) through one or more well-defined service interfaces. However, while the interfaces are public, the infrastructure maintained by the various service providers is almost exclusively proprietary. Thus, it is not possible (or at least not easy) for researchers to build, deploy, modify, instrument, or experiment with a cloud infrastructure under their own control.

In this paper, we describe the design and implementation of Eucalyptus, an open-source software infrastructure architected specifically to support cloud computing research and infrastructure development. The design of Eucalyptus is distinctive in that it

• must be able to deploy and execute in hardware and software environments not under the control of its designers, and

• must be modularized to allow component-wise modification or replacement,

while achieving the greatest degree of scalability possible. This work describes the system architectural trade-offs imposed upon the design by these two requirements, the way in which they have been addressed by the version of Eucalyptus that is currently available and in use, and the degree to which these trade-offs impact the functionality and performance of the overall system.

The motivation for Eucalyptus is an exploratory one. Cloud computing as an emerging concept has great potential, but the speed of commercial engineering leaves fundamental questions either not fully defined or unanswered. Thus, while cloud systems are providing users a valuable service, the closed nature of the software has created a situation where researchers interested in cloud computing topics are finding it difficult to formulate experiments due to the lack of a common, flexible framework in which they can work.

1.1 Open-Source Infrastructure as a Service

Although most existing cloud computing implementations share the common high-level notion of flexible, scalable, and dynamic computational “provisioning,” there is significant variation in exactly how that power is presented to the end user. Some systems, such as Amazon’s Elastic Compute Cloud (EC2) [17] and Enomalism [18], allow users to allocate entire virtual machines (VMs) on demand, thus providing what is commonly referred to as Infrastructure as a Service (IaaS). Here, the user is responsible for providing the operating system kernel, base OS software, and any user-level software and applications they wish to run, and the IaaS system provisions physical resources and instantiates the user’s VMs.

Eucalyptus implements IaaS, with the key differentiations being that it is specifically designed to be easy to install and maintain in a research setting, and that it is easy to modify, instrument, and extend. Specifically, commercial cloud infrastructures take advantage of the ability to control the local resource configuration (hardware versioning, O.S. versioning, network and storage policies, etc.) and access to large collections of potentially expensive resources (e.g., publicly visible and routable Internet addresses). In a research setting, it is unlikely that the cloud infrastructure can mandate a specific configuration for all hardware and software it manages, nor is it possible to predicate functionality on the availability of very large resource sets.

Further, because IaaS systems each typically target a specific installation, they are not engineered with extensibility or portability as a primary concern, nor is the need for ease of system administration given primacy in the design. The difficulties are compounded by the need to be able to incorporate multiple compute clusters into a single resource pool from which cloud allocations are to be drawn. Few open-source software packages of any kind are designed to install and deploy on multiple compute clusters that then operate together as an ensemble. Thus, Eucalyptus is a relatively rare example of IaaS and also a harbinger of future multi-cluster open-source design experiences. The way in which it frames and then addresses the challenges that arise as a result forms the basis of the contribution this paper makes.

Specifically, we describe


• a simple open architecture for implementing cloud functionality at the IaaS level,

• experiences with implementing this architecture using open-source Web-service software as the intrinsic technology, and

• performance results demonstrating the viability of the resulting cloud computing system.

IaaS, however, is not the only approach to implementing cloud computing that the commercial sector is currently pursuing. Amazon and Google also both provide Data as a Service (DaaS) capabilities, through the Simple Storage Service (S3) [43] and parts of App Engine [7] respectively, where users can both store and access massive amounts of data from the provided computational resources. In addition, Google’s App Engine [7] also provides a language-level abstraction, making it generically categorizable as Platform as a Service (PaaS), where access to computational power and storage is gained through language-specific APIs and libraries. Finally, companies such as Salesforce.com [44] provide a number of high-level software service packages (e.g., Web-accessible Customer Relationship Management, Enterprise Resource Planning, Inventory Control, and Payroll). This higher-level approach is often described as Software as a Service (SaaS).

We have chosen to focus Eucalyptus at the IaaS level for two reasons. First, Amazon.com’s EC2 is perhaps the most commercially successful cloud computing endeavor to date and it implements IaaS. Eucalyptus is interface-compatible with EC2, making it possible to test its functionality against one of the most mature commercial examples of cloud computing. This availability of a “gold standard” greatly influenced the design since it is possible to gauge immediately how closely our open-source rendition of the functionality matches its exemplar. Second, higher-level cloud computing abstractions all seem to depend on similar IaaS functionality, at least conceptually. We do not claim that all cloud computing infrastructures include an IaaS layer in their software architecture. However, for the purposes of further research and open-source development, we speculate that self-contained IaaS functionality that can be layered upon will prove both foundational and beneficial.

Further, we believe that the results garnered from our experiences with Eucalyptus may prove to be seminal. The software infrastructure, in various package forms, has been publicly available since approximately June 1, 2008, and since its public release, uptake has been surprisingly rapid (so rapid, in fact, that the project has been the subject of increasingly visible discussion in the popular press, hence our decision to obfuscate its name in this paper). We believe the initial success of the project stems from our choice of EC2 as an interface to support, but also from a significant effort to make the software as easy to download and install as possible. Indeed, using the Rocks [42] cluster configuration system, installation and launch is essentially a “one-button” operation (installation from RedHat Package Management format or from source is more complicated but still streamlined and documented). No other cloud system of which we are aware combines support for open development with ease of installation and maintenance as basic design goals while, at the same time, attempting to emulate commercially available functionality as a way of stimulating community research and development.

2 Eucalyptus Design

The Eucalyptus design is primarily motivated by two engineering goals: extensibility and non-intrusiveness. Eucalyptus is extensible as a result of its simple organization and modular design. Further, we have implemented Eucalyptus using open-source Web-service technologies, which serve to illuminate its internals. As a collection of Web services, Eucalyptus components have well-defined interfaces (described by WSDL documents), support secure communication (using WS-Security policies), and rely upon industry-standard Web-services software packages (Axis2, Apache, and Rampart). This choice of implementation technology also supports the second design goal, that of non-intrusive or “overlay” deployment. We do not assume that researchers interested in Eucalyptus are necessarily willing to dedicate entire collections of machines to Eucalyptus alone (although this model of operation is also supported), nor do we assume that they are willing to allow Eucalyptus to modify the local software configuration in potentially disruptive ways. Intrusiveness is admittedly a subjective metric. For the purposes of our work, we assume that a site wishing to use Eucalyptus is willing to support virtualized execution through Xen [8] and to host Web services. With these two requirements fulfilled, Eucalyptus can be deployed and executed without modification to the underlying infrastructure.

2.1 Architectural Overview

Academic research groups have access to a number of resources; for instance, small clusters, pools of workstations, and various server/desktop machines. Since public IP addresses are usually scarce, and the security ramifications of allowing complete access from the public Internet can be daunting, system administrators commonly deploy clusters as pools of “worker” machines on private, unroutable networks with a single “head node” responsible for routing traffic between the worker pool and a public network. Although this configuration provides security while using a minimum of publicly routable addresses, it usually means that, while most machines can initiate connections to external hosts, external hosts cannot typically connect to machines running within each cluster.

For example, an administrator might configure two small Linux clusters, a small server pool, and a collection of computer lab workstations. The clusters each have a single front-end machine with a publicly accessible IP address, while the nodes are connected via a private network such that they can only contact each other and their respective front-ends. The server and workstation machines have public IP addresses, but the workstations are behind a firewall and cannot be contacted from the outside world. In this scenario, it is clear that it is not possible to install a fully connected system, since many of the machines can only initiate connections to external hosts or are entirely isolated from external networks. In addition, the two sets of cluster nodes may even have overlapping IP addresses since their networks are fully private and unroutable. In order to make all of these types of resources part of a single cloud, we reflect the hierarchical nature of this typical configuration in the architecture of Eucalyptus, as depicted in Figure 1, where the three hierarchical levels are shown. These hierarchical components are sufficiently general to accommodate installation on common network hierarchies found within many institutions, an example of which is depicted in Figure 2.

Node Controller

The Node Controller (NC) is the component that executes on the physical resources that host VM instances and is responsible for instance start up, inspection, shutdown, and cleanup. There are typically many NCs in a Eucalyptus installation, but only one NC needs to execute per physical machine, since a single NC can manage multiple virtual machine instances on a single machine. The NC interface is described via a WSDL document that defines the instance data structure and instance control operations that the NC supports (runInstance, describeInstance, terminateInstance, describeResource, and startNetwork). The run, describe, and terminate operations on an instance perform minimal system setup, followed by calls to the underlying hypervisor (Xen in the current implementation) to control and inspect running instances. The describeResource operation reports current physical resource characteristics (compute cores, memory, and disk capacity) to the caller, and the startNetwork operation sets up and configures the virtual Ethernet overlay described in more detail in Section 2.2.
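The shape of the NC interface can be illustrated with a small in-memory sketch. This is not the Eucalyptus implementation: the class and field names are hypothetical, a dictionary stands in for the hypervisor calls, startNetwork is omitted, and describe_resource is assumed here to report the capacity that remains free.

```python
from dataclasses import dataclass, field


@dataclass
class Resources:
    cores: int
    memory_mb: int
    disk_gb: int


@dataclass
class NodeController:
    """Toy stand-in for an NC; a dict replaces the real hypervisor calls."""
    total: Resources
    instances: dict = field(default_factory=dict)

    def describe_resource(self) -> Resources:
        # Assumption: report what is still free after running instances.
        used = list(self.instances.values())
        return Resources(
            self.total.cores - sum(r.cores for r in used),
            self.total.memory_mb - sum(r.memory_mb for r in used),
            self.total.disk_gb - sum(r.disk_gb for r in used),
        )

    def run_instance(self, instance_id: str, req: Resources) -> bool:
        free = self.describe_resource()
        if (instance_id in self.instances
                or req.cores > free.cores
                or req.memory_mb > free.memory_mb
                or req.disk_gb > free.disk_gb):
            return False
        # A real NC would perform setup and call the hypervisor here.
        self.instances[instance_id] = req
        return True

    def describe_instance(self, instance_id: str):
        # Returns the recorded request, or None for an unknown instance.
        return self.instances.get(instance_id)

    def terminate_instance(self, instance_id: str) -> bool:
        return self.instances.pop(instance_id, None) is not None
```

A caller such as a cluster-level scheduler would only need describe_resource and run_instance to place work on a node.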

Cluster Controller

A collection of NCs that logically belong together report to a single Cluster Controller (CC) that typically executes on a cluster head node or server that has access to both private and public networks. The CC is responsible for gathering state information from its collection of NCs, scheduling incoming VM instance execution requests to individual NCs, and managing the configuration of public and private instance networks. The WSDL that describes the CC interface is similar to the NC interface, except that each operation is plural instead of singular (runInstances, describeInstances, terminateInstances, describeResources). The describe and terminate instance control operations are merely pass-through operations to the relevant NC module. When a CC receives a runInstances request, it performs a simple scheduling task of determining which NCs can support the incoming instance by querying each NC through describeResource and choosing the first NC that has enough free resources. The CC also implements a describeResources operation; however, instead of reporting the actual physical resources available, this operation takes as input a description of the resources that a single instance could occupy and returns the number of instances of that type that can be simultaneously executed on the NCs.

Figure 1. Eucalyptus employs a hierarchical design to reflect underlying resource topologies.

Figure 2. Example location of CLC, CC and NC components running within a typical resource environment.
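The first-fit choice and the instance-counting behavior of describeResources can be sketched as follows. This is a minimal illustration under assumptions: node state is modeled as plain dictionaries of free quantities, and the function and key names are hypothetical rather than taken from the Eucalyptus code.

```python
def fits(free: dict, req: dict) -> bool:
    """True if every requested quantity is available on the node."""
    return all(free.get(k, 0) >= v for k, v in req.items())


def schedule_first_fit(nodes: dict, req: dict):
    """Return the name of the first node with enough free resources, or None.

    `nodes` maps a node name to its free resources; the chosen node's
    free resources are reduced in place to reflect the placement.
    """
    for name, free in nodes.items():
        if fits(free, req):
            for k, v in req.items():
                free[k] -= v
            return name
    return None


def describe_resources(nodes: dict, req: dict) -> int:
    """Count how many instances shaped like `req` fit on the nodes at once."""
    total = 0
    for free in nodes.values():
        # The scarcest resource on a node bounds its instance count.
        total += min(
            (free.get(k, 0) // v for k, v in req.items() if v > 0),
            default=0,
        )
    return total
```

For example, with one node holding 1 free core and another holding 4, a request shape of 2 cores and 1024 MB is placed on the second node, and the reported count drops from 2 to 1 after the placement.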

Cloud Controller

Each Eucalyptus installation includes a single Cloud Controller (CLC) that is the user-visible entry point and global decision-making component of a Eucalyptus installation. The CLC is responsible for processing incoming user-initiated or administrative requests, making high-level VM instance scheduling decisions, processing service-level agreements (SLAs), and maintaining persistent system and user metadata.

The CLC itself is composed of a collection of services (Figure 3) that handle user requests and authentication, persistent system and user metadata (e.g., VM images and ssh key pairs), and the management and monitoring of VM instances. The services are configured and managed by an enterprise service bus (ESB) [45] that publishes services and mediates handling of user requests while decoupling the service implementation from message routing and transport details. Our design emphasizes transparency and simplicity in order to foster experimentation and extension of Eucalyptus, particularly with respect to cloud behavior. To achieve extensibility at this level of granularity, the architectural components of the CLC (including, but not limited to, the VM scheduler, SLA engine, and user/administrative interfaces) are mutually isolated behind well-defined internal interfaces where ESB configuration controls their orchestration. With this as a foundation, our CLC implementation can function as an Amazon EC2 work-alike by inter-operating with the EC2 client tools using both Web-services and Query interfaces (Amazon publishes specification documents describing these interfaces). We chose EC2 because it is relatively mature, has a large existing user community, and because it implements a well-defined IaaS functionality. However, the interface parsing is modularized so that Eucalyptus can support different interfaces, either as a way of emulating other infrastructures or to allow interface customization.

Client Interface

The CLC’s client interface service essentially acts as a translator between the internal Eucalyptus system interfaces (i.e., the NC and CC instance control interfaces) and some defined external client interface. For example, Amazon provides a WSDL document that describes a Web-service SOAP-based client interface to their service as well as a document describing an HTTP Query-based interface, both of which can be translated by the CLC user interface service into Eucalyptus internal objects. We use the JiBX [32] binding tool to specify a mapping of XML elements onto instances of Java objects, which we have used to create bindings that map the body of EC2 SOAP messages onto internal Eucalyptus objects.

Figure 3. Overview of services that comprise the Cloud Controller. Lines indicate the flow of messages where the dashed lines correspond to internal service messages.

The Query interface does not lend itself to this model, however. First, there is no XML document to consume. Second, the authentication mechanism is different and in conflict with the WS-Security policy enforced. Third, conflicts exist between the structure of SOAP requests and Query requests for the same field of the same kind of request.

The solution stems from the observation that the Query interface for EC2 is a strict subset of the SOAP interface. As a result, we have developed a simple binding framework that maps HTTP parameter names onto object fields guided by annotations. We then rely on annotations of the target object to aid in de-obfuscating inconsistencies such as elided lists and unwrapped complex types (i.e., field names of a child class). Ultimately, JiBX is used to marshal the bound object using the namespace for the EC2 SOAP interface. The result is two-fold. First, JiBX validates that the object is actually a legal SOAP interface request and, hence, a legal EC2 client request. Second, the marshalled XML document can be supplied as the SOAP body to allow further processing to continue along the exact same path it would have taken if the message had been SOAP to begin with.
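The annotation-guided binding of Query parameters onto object fields can be sketched in a few lines. The sketch below is an illustration only: Eucalyptus implements this in Java, whereas here Python dataclass field metadata plays the role of the annotations, the class and metadata key names are hypothetical, and the final marshalling step through JiBX is not shown. The indexed form `InstanceId.1`, `InstanceId.2` is how EC2 Query requests flatten a list.

```python
from dataclasses import dataclass, field, fields
from urllib.parse import parse_qsl


@dataclass
class TerminateInstancesRequest:
    # Metadata acts as the annotation: which HTTP parameter feeds the field,
    # and whether the Query form elides a list wrapper.
    instance_ids: list = field(
        default_factory=list, metadata={"param": "InstanceId", "many": True}
    )


@dataclass
class RunInstancesRequest:
    image_id: str = field(default="", metadata={"param": "ImageId"})
    min_count: int = field(default=1, metadata={"param": "MinCount", "cast": int})


def bind(cls, query: str):
    """Populate an instance of `cls` from an HTTP query string."""
    params = dict(parse_qsl(query))
    obj = cls()
    for f in fields(cls):
        name = f.metadata["param"]
        cast = f.metadata.get("cast", str)
        if f.metadata.get("many"):
            # Collect Name.1, Name.2, ... and restore the list in index order.
            prefix = name + "."
            indexed = [
                (int(k[len(prefix):]), v)
                for k, v in params.items()
                if k.startswith(prefix) and k[len(prefix):].isdigit()
            ]
            setattr(obj, f.name, [cast(v) for _, v in sorted(indexed)])
        elif name in params:
            setattr(obj, f.name, cast(params[name]))
    return obj
```

Once bound, such an object could be serialized to the SOAP body form so that both request styles share one downstream path, which is the design point the text makes.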

Administrative Interface

In addition to supporting primary tasks, such as starting and stopping instances, a cloud infrastructure must support administrative tasks, such as adding and removing users and disk images. Eucalyptus supports such tasks through a Web-based interface, implemented by the cloud controller, and command-line tools. Unlike the client interface, however, the administrative interface is unique to Eucalyptus. That is, while cloud purveyors do publish their client interfaces, they do not generally publish administrators’ interfaces. Thus, we have defined one for the system that is independent of any specific client interface or intrinsic IaaS functionality.

Users are added to a Eucalyptus installation either through the action of an administrator or by filling out an on-line form that is sent to the administrator for approval. Control over account creation thus rests in the hands of a human being, which we found necessary given the absence of automated approval methods, such as the credit card verification used by Amazon. It is up to the cloud administrator to try to ensure that a new account will not be misused. By forcing new users to confirm their interest in the account by clicking on a link received in an email message, Eucalyptus maps the identity of a user to their email address. If that is not sufficient, the administrator may choose to verify the identity of the applicant with the help of other information on the sign-up form. Once added, a user account can be temporarily disabled or permanently removed by an administrator. At any point, the administrator can find out which instances a user is executing and terminate them.

Currently, disk images in Eucalyptus can be added to the system only by an administrator. An image consists of a Xen-compatible guest OS kernel, a root file system image, and, optionally, a RAM disk image. Adding an image consists of uploading these three components into the system and naming the image. After an image is added, any user can run instances of that image. Administrators may temporarily disable or permanently remove the image. Finally, the administrator is in charge of adding and removing nodes from a cluster controller’s configuration.

Instance Control

Creation of virtual machine instance metadata in Eucalyptus is managed by a component of the CLC named the VmControl service. VmControl continuously maintains a simple local representation of the state of underlying resources (i.e., the number of instances each CC could potentially create). When instance creation events are initiated, it coordinates with the other services in the CLC to resolve user request references to images, key pairs, networks, and security groups. Allocation then consists of validating references to metadata and applying an allocation strategy that produces a “pre-allocation,” meaning that, as far as the VmControl component is concerned, the resources have been locally reserved. Messages are then disseminated to the CCs involved in the allocation. Each such CC schedules the instance request to its locally controlled NCs, which, finally, create the virtual machine instances themselves and respond accordingly.
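The validate, plan, and reserve sequence described above can be sketched as a small class. This is an assumption-laden illustration: the class layout and names are hypothetical, only image and key pair references are validated, the allocation strategy shown is a simple greedy fill over clusters in order, and the dispatch of messages to the CCs is left out.

```python
class VmControl:
    """Toy model of pre-allocation against a local view of cluster capacity."""

    def __init__(self, capacity, images, keypairs):
        # cluster name -> number of instances that cluster could still create
        self.capacity = dict(capacity)
        self.images = set(images)
        self.keypairs = set(keypairs)

    def pre_allocate(self, image, keypair, count):
        # Step 1: validate the metadata references carried by the request.
        if image not in self.images or keypair not in self.keypairs:
            raise ValueError("unknown image or key pair")
        # Step 2: apply an allocation strategy (assumed: greedy, in order).
        plan, remaining = {}, count
        for cc, free in self.capacity.items():
            take = min(free, remaining)
            if take:
                plan[cc] = take
                remaining -= take
        if remaining:
            raise RuntimeError("insufficient capacity")
        # Step 3: reserve locally; only now is the local model changed.
        for cc, n in plan.items():
            self.capacity[cc] -= n
        return plan
```

Because the local model is updated only after a complete plan exists, a request that cannot be satisfied leaves the recorded capacities untouched.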

SLA Implementation and Management

Service-level agreements (SLAs) are implemented as extensions to the message handling service, which can inspect, modify, and reject the message, as well as the state stored by VmControl. Ultimately, VmControl rationally arbitrates access to resources and enforces system-wide or user-specific service-level agreements. These decisions require data about the state of resources that is captured in a system model and the result of update events (i.e., either a change to the model or information about a failure). We have implemented an extensible SLA scheme, which couples the state model with event handling to support further work in the quantitative study of service-level agreements.

The VmControl relies on a local model for decision-making purposes. To keep the model up to date, each CC is passively polled to obtain the state of its instance availability, allocations, virtual network, and registered images. Information gathered via polling is treated as ground truth and user requests are handled in transactions that commit only when they are reflected on the resources.

Nonetheless, the model may become inconsistent, causing the system to agree to an SLA with a user that is unsatisfiable. This can happen when messages are lost (e.g., due to network partition) and the state of resources changes (the period between polling events can be thought of as a network partition). However, loss of messages can be identified (polling is semi-synchronous) and times when the model is in an invalid state can, ultimately, also be detected (after the system recovers and ground truth can be inspected). Consequently, the likelihood that the model will be incorrect at a given moment can be computed.

We have implemented a simple yet powerful initial SLA that allows users to control the high-level network topology of their instances. While resource providers typically think of collections of machines in terms of “clusters” or “pools”, we have adopted the more general concept of “zones” that is currently used by Amazon EC2. Within EC2, a “zone” is correlated to a vague geographic location such as “east coast U.S.” or “west coast U.S.”, while we use the term to refer to a logical collection of machines that has several NC components and a single CC component. Eucalyptus allows users to specify a zone configuration upon instance execution, which allows an instance set to reside within a single cluster or potentially across clusters. Each configuration offers different administrative and network performance characteristics, which we explore in more detail in Section 2.2. In addition, Eucalyptus further co-opts the notion of zone, extending it to support different SLAs with respect to trade-offs between the number of resources acquired and their relative topology. In the current implementation, the default set of zones supplied allows users to request a specific cluster, the emptiest cluster, any single cluster unless no cluster provides the minimum requested, and multiple clusters.
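The four default zone choices can be read as placement policies over per-cluster free capacity. The following is a minimal sketch under that reading. The policy names, the `place` function, and the interpretation of “emptiest” as “most free slots” are assumptions for illustration and do not reflect the actual Eucalyptus scheduler.

```python
def place(request, free, policy, cluster=None):
    """Return {cluster_name: instance_count}, or None if the request cannot be met.

    request: number of instances wanted
    free:    {cluster_name: free instance slots} as known to the local model
    policy:  "specific", "emptiest", "any-single", or "multiple" (hypothetical names)
    """
    if policy == "specific":
        # The user names one cluster; the whole request must fit there.
        return {cluster: request} if free.get(cluster, 0) >= request else None
    if policy == "emptiest":
        # Pick the cluster with the most free slots.
        name = max(free, key=free.get, default=None)
        if name is not None and free[name] >= request:
            return {name: request}
        return None
    if policy == "any-single":
        # Any one cluster will do, as long as it can hold the whole request.
        for name, slots in free.items():
            if slots >= request:
                return {name: request}
        return None
    if policy == "multiple":
        # The allocation may span clusters.
        if sum(free.values()) < request:
            return None
        placement, remaining = {}, request
        for name in sorted(free, key=free.get, reverse=True):
            take = min(free[name], remaining)
            if take > 0:
                placement[name] = take
                remaining -= take
            if remaining == 0:
                break
        return placement
    raise ValueError(f"unknown policy: {policy}")
```

Only the "multiple" policy can return a placement that spans clusters, which is the case in which inter-VM traffic must cross the private overlay network.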

2.2 Virtual Networking

Perhaps one of the most interesting challenges in the design of a cloud computing infrastructure is that of VM instance interconnectivity. One of the most attractive characteristics of cloud systems stems from the fact that although the underlying physical machines may have complex and restrictive networking topologies, a simpler, more configurable VM interconnection topology can be presented to the user through virtualization. When designing Eucalyptus, we recognized that the VM instance network solution must address connectivity, isolation, and performance.

First and foremost, all virtual machines that Eucalyptus controls must have network connectivity to each other, and at least partially to the public Internet (we use the word “partially” to denote that at least one VM instance in a “set” of instances must be exposed externally so that the instance set owner can log in and interact with their instances). Because users are granted super-user access to their provisioned VMs, they may have super-user access to the underlying network interfaces. This ability can cause security concerns, in that, without care, a VM instance user may have the ability to acquire system IP or MAC addresses and cause interference on the system network. In addition, if two instances are running on one physical machine, a user of one VM may have the ability to snoop and influence network packets belonging to another. Thus, in a cloud shared by different users, VMs belonging to a single cloud allocation must be able to communicate, but VMs belonging to separate allocations must be isolated. Note that current hypervisor offerings do not support this notion of grouping directly. Finally, one of the primary reasons that virtualization technologies are just now gaining such popularity is that the performance overhead of virtualization has diminished significantly over the past few years, including the cost of virtualized network interfaces. Our design attempts to maintain inter-VM network performance as close to native as possible.

Each instance controlled by Eucalyptus is given two virtual network interfaces; one is referred to as “public” while the other is termed “private”. The public interface is assigned the role of handling communication outside of a given set of VM instances, or between instances within the same availability zone as defined by the SLA. For example, in an environment that has available public IP addresses, these addresses may be assigned to VM instances at instance boot time, allowing communication both to and from the instance. In environments where instances are connected to a private network with a router that supports external communication through network address translation (NAT), the public interface may be assigned a valid private address giving it access to systems outside the local network through the NAT-enabled router. The instance’s private interface, however, is used only for inter-VM communication across zones, handling the situation where two VM instances are running inside separate private networks (zones) but need to communicate with one another. The basic instance networking configuration is shown in Figure 4, which depicts the instance’s public interface as connected to the public network via a bridge connected to the resource’s real interface.

Figure 4. Each Eucalyptus VM instance is assigned a public interface for external network connections, and a private network interface connected to a fully virtual Ethernet network for inter-VM communication. (Diagram labels: VM instance, physical resource, physical interface, public bridge, private bridge, public interface, private interface, VDE switch, VDE cable, to remote VDE switch, to physical Ethernet, from remote VDE switch.)

Figure 5. If two different user instances (A and B) are running on the same resource, we employ VLAN tagging as a means of isolating network traffic between VMs. (Diagram labels: physical resource; VM instances A and B, each with a public and a private interface; private bridges; VLAN A and VLAN B; VDE switch with VLAN(A, B, …); VDE cable.)

Within Eucalyptus, the cluster controller currently handles the set up and tear down of instance virtual network interfaces. The CC can be configured to set up the public interface network in three ways corresponding to three common environments we currently support. The first configuration instructs Eucalyptus to attach the VM’s public interface directly to a software Ethernet bridge connected to the real physical machine’s network, allowing the administrator to handle VM network DHCP requests the same way they handle regular DHCP requests. The second configuration allows the administrator to define a dynamic pool of IP addresses that will be assigned via a DHCP server that is executed by the CC. In this configuration, the administrator defines a network, an interface on the CC that is connected to that network, and a range of IP addresses that are dynamically assigned as instances are started. Finally, we support a configuration that allows an administrator to define static Media Access Control (MAC) and IP address tuples. In this mode, each new instance created by the system is assigned a free MAC/IP tuple, which is released when the instance is terminated.
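The static MAC/IP mode amounts to a small allocate-and-release pool. The sketch below shows one way such bookkeeping could look. The class name `AddressPool`, its methods, and the sample addresses are hypothetical and are not taken from the Eucalyptus code base.

```python
class AddressPool:
    """Pool of administrator-defined (MAC, IP) tuples for the static mode."""

    def __init__(self, tuples):
        self._free = list(tuples)   # tuples not currently assigned
        self._in_use = {}           # instance id -> (mac, ip)

    def assign(self, instance_id):
        """Give a new instance the next free tuple."""
        if instance_id in self._in_use:
            raise ValueError(f"{instance_id} already has an address")
        if not self._free:
            raise RuntimeError("no free MAC/IP tuple available")
        pair = self._free.pop(0)
        self._in_use[instance_id] = pair
        return pair

    def release(self, instance_id):
        """Return the tuple to the pool when the instance terminates."""
        self._free.append(self._in_use.pop(instance_id))


pool = AddressPool([
    ("aa:00:00:00:00:01", "192.168.1.10"),  # illustrative values only
    ("aa:00:00:00:00:02", "192.168.1.11"),
])
first = pool.assign("i-1")
pool.release("i-1")
```

Because the tuples are fixed in advance, the number of instances that can run concurrently in this mode is bounded by the size of the pool.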

The instance’s private interface is connected via a bridge to a fully virtual software Ethernet system called Virtual Distributed Ethernet (VDE) [49]. VDE is a process-level implementation of the Ethernet protocol, where users can specify and control virtual Ethernet switch and cable abstractions that are implemented as programs running in user-space. Once a VDE network has been created, connections to real Ethernet networks can be established through the Universal TUN/TAP interface, which, in essence, provides Ethernet packet communication from the Linux kernel to user-space processes. When an Eucalyptus system is initiated, it sets up a VDE network overlay that consists of one VDE switch per CC and NC component and as many VDE wire processes as can be established between switches. If there are no firewalls existing on the physical network, the VDE network will be fully connected, where each VDE switch is connected to every other VDE switch. The VDE switches support a spanning tree protocol, which allows redundant links to exist while preventing loops in the network, thus giving the VDE network a level of redundancy when the switches are fully connected. However, since NC components may be behind a firewall, the only requirement is that each VDE switch has at least one wire to some other VDE switch in the system, which is typically satisfied by a single connection to the CC.
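To make the overlay topology concrete, the sketch below models switches and wires as an undirected graph and derives a loop-free set of links from it. It is a centralized breadth-first stand-in, not the distributed spanning tree protocol that the VDE switches run, and the function name and switch names are assumptions for illustration.

```python
from collections import deque


def spanning_tree(switches, wires):
    """Breadth-first spanning tree over an undirected switch graph.

    switches: list of switch names; the first one is used as the root.
    wires:    iterable of (a, b) pairs, one per established VDE wire.
    Returns the list of tree edges, or None if some switch is unreachable.
    """
    neighbours = {s: set() for s in switches}
    for a, b in wires:
        neighbours[a].add(b)
        neighbours[b].add(a)
    root = switches[0]
    seen, tree, queue = {root}, [], deque([root])
    while queue:
        node = queue.popleft()
        for nxt in sorted(neighbours[node]):
            if nxt not in seen:
                seen.add(nxt)
                tree.append((node, nxt))  # this link stays active
                queue.append(nxt)
    return tree if len(seen) == len(switches) else None


# Fully connected overlay: redundant links exist, the tree keeps a loop-free subset.
mesh = spanning_tree(["cc", "nc1", "nc2"],
                     [("cc", "nc1"), ("cc", "nc2"), ("nc1", "nc2")])
# Firewalled NCs: each NC has a single wire to the CC.
star = spanning_tree(["cc", "nc1", "nc2"], [("nc1", "cc"), ("nc2", "cc")])
```

In this model a wire between two NCs alone would not connect them to the rest of the overlay, which is why a connection to the CC is the usual way of meeting the one-wire requirement.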

At instance run time, the NC responsible for controlling the VM creates a new Ethernet bridge that is connected to the local VDE switch and configures the instance to attach its private interface to the new bridge. At this point, our original requirement of instance connectivity is satisfied, since any VM started on any VDE-connected NC will be able to contact any other VM over the virtual Ethernet, regardless of the underlying physical network configuration. Currently, we allow the administrator to define a class-B IP subnet that is to be used by instances connected to the private network, and each new instance is assigned a dynamic IP address from within the specified subnet.
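Dynamic assignment from an administrator-defined class-B sized subnet can be sketched with Python's standard `ipaddress` module. The class name `PrivateSubnet`, the reuse of released addresses, and the example subnet `10.128.0.0/16` are assumptions for illustration; the report does not describe how Eucalyptus tracks leases.

```python
import ipaddress


class PrivateSubnet:
    """Hands out addresses from a /16 subnet to instances on the private network."""

    def __init__(self, cidr):
        self._network = ipaddress.ip_network(cidr)
        if self._network.prefixlen != 16:
            raise ValueError("expected a /16 (class-B sized) subnet")
        # hosts() skips the network and broadcast addresses.
        self._unused = iter(self._network.hosts())
        self._released = []   # addresses returned by terminated instances
        self._leases = {}     # instance id -> address

    def assign(self, instance_id):
        if self._released:
            addr = self._released.pop()
        else:
            addr = next(self._unused)  # StopIteration once the subnet is exhausted
        self._leases[instance_id] = addr
        return addr

    def release(self, instance_id):
        self._released.append(self._leases.pop(instance_id))


subnet = PrivateSubnet("10.128.0.0/16")
ip_one = subnet.assign("i-1")
ip_two = subnet.assign("i-2")
```

A /16 offers 65,534 usable host addresses, so the subnet itself is unlikely to be the limiting resource on a small cluster.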

The second requirement of the virtual network is that it supports instance network traffic isolation. We require that if two instances, owned by separate users, are running on the same host or on different hosts connected to the same physical Ethernet, they do not have the ability to inspect or modify each other’s network traffic. To meet this requirement, each set of instances owned by a particular user is assigned a tag that is then used as a virtual local area network (VLAN) identifier assigned to that user’s instances. Once a VLAN identifier has been assigned, all VDE switch ports that are connected to the instances’ private interfaces are configured to tag all incoming traffic with the VLAN tag and to only forward packets that have the same VLAN tag. Hence, a set of instances will only be forwarded traffic on VDE ports that other instances in the set are attached to, and all traffic they generate will be tagged with a VLAN identifier at the virtual switch level, thus isolating instance network traffic even when two instances are running on the same physical resource. Figure 5 shows how two instances owned by user A and user B running on the same physical resource are connected to the VDE network through ports configured to only forward traffic based on a particular VM’s assigned VLAN.
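The isolation rule can be illustrated with a toy switch model in which each port carries the VLAN identifier of the instance set attached to it. This is a simplified sketch of the forwarding rule only (no MAC learning, no multi-switch paths); the class `VlanSwitch` and the port names are hypothetical and unrelated to the VDE implementation.

```python
class VlanSwitch:
    """Toy switch: a frame is delivered only to ports with the ingress port's VLAN."""

    def __init__(self):
        self._port_vlan = {}  # port name -> VLAN identifier

    def attach(self, port, vlan_id):
        """Connect an instance's private interface to a port tagged with vlan_id."""
        self._port_vlan[port] = vlan_id

    def forward(self, ingress_port, payload):
        """Tag the frame on ingress and return {egress_port: payload}."""
        tag = self._port_vlan[ingress_port]
        return {
            port: payload
            for port, vlan in self._port_vlan.items()
            if vlan == tag and port != ingress_port
        }


switch = VlanSwitch()
switch.attach("a1", 10)   # two instances owned by user A
switch.attach("a2", 10)
switch.attach("b1", 11)   # one instance owned by user B on the same resource
delivered = switch.forward("a1", "hello")
```

Traffic entering on a port of user A never reaches the port of user B, even though both are attached to the same switch.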

3 Experiment

To illustrate the performance characteristics of Eucalyptus as well as to observe its functionality under user load not generated by the development team, we have installed Eucalyptus on a small research Linux cluster at our home institution and made it available for general use to the wider community as a “public cloud.” The hardware configuration comprises 7 compute nodes and one head-node. The compute nodes are on an isolated network, while the front-end is publicly accessible. Each system has two Intel Xeon 3.2GHz processors, 3GB of RAM and approximately 40GB of available disk (single SCSI drive). We are running a single CLC on the front-end, a single CC on the front-end, and one NC per compute node.

Users request access to the Eucalyptus Public Cloud (EPC) by requesting credentials from the CLC through the user signup web page. Subsequent cloud allocation requests are limited to 4 instances, which will be terminated automatically after 6 hours. A reverse firewall prevents EPC-hosted instances from making network connections to external network addresses (public Linux distribution sites are excepted to allow instance configuration) to avoid inadvertent “spam-bot” hosting. Only a local EPC zone is available as an externally accessible SLA.

All experiments detailed in this section have been conducted using the EPC in the presence of ambient induced load. That is, unless otherwise indicated, we measured the performance of the EPC in the presence of load being generated by its users (i.e., in a non-dedicated mode).

3.1 Instance Throughput

The first experiment we perform is designed to measure the performance of VM instance control operations. Because Eucalyptus is interface compatible with Amazon’s EC2, we are able to perform the same experiments on both Eucalyptus and EC2 without customization. The primary purpose in doing so is to verify that the EC2 functionality is, indeed, fully replicated by Eucalyptus. Less rigorously, the quantitative comparison serves as a high-level test for whether our implementation is pathologically inefficient. In fact, during early phases of Eucalyptus design we discovered a number of performance “bugs” through comparisons with EC2.

Because one of the primary functions of Eucalyptus is to control the execution of VM instances on a collection of resources, we perform an “instance throughput” experiment where we measure the time from when a user wishes to execute a collection of instances to the time the instances are booted and available for use on the network. For this experiment, we measure the total time from an instance execution request to the point when we can first detect that the instance is running. In order to measure the instance state, we rely on the Amazon EC2 command-line tool “ec2-describe-instances”, which simply queries the cloud server for information about a user’s instances and prints the information to the user’s terminal. To gather a single data point, we first take a timestamp followed immediately by a launch of an instance or set of instances using the client tool “ec2-run-instances”. Then, we repeatedly poll the server using “ec2-describe-instances” until our initiated instance enters a “running” state, at which point we take another timestamp. The difference between the two timestamps constitutes a single data point that represents the number of seconds between a user instance creation request and the user becoming aware that the instance(s) are available for use.

Each trial is characterized by four variables and timings are reported in seconds. The first variable is the VM type requested, where the VM type is defined as the number of cores, amount of RAM, and allocated disk space. The second variable is the instance image itself, which we control by loading identical copies inside both EC2 and the EPC. For this experiment we run trials for a “small” VM type with a corresponding image called “ttylinux” [48], a compact Linux image that boots very quickly and offers a minimal networked Linux installation when fully booted. The third variable of interest is the number of instances simultaneously requested, which we vary from one to eight instances. The final variable is of course the system used: either Eucalyptus or Amazon EC2.
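The procedure for gathering a single data point (timestamp, launch, poll until running, timestamp) can be sketched as a small timing harness. The `launch` and `describe` callables below are hypothetical stand-ins for invoking “ec2-run-instances” and “ec2-describe-instances”. Their return formats, the polling interval, the timeout, and the choice to wait until every requested instance is running are assumptions made for this sketch and are not specified in the text.

```python
import time


def time_until_running(launch, describe, poll_interval=1.0, timeout=600.0,
                       clock=time.monotonic, sleep=time.sleep):
    """Seconds from the launch request until every instance reports "running".

    launch():      starts the instances and returns a list of instance ids
    describe(ids): returns {instance_id: state} for the given ids
    """
    start = clock()                 # first timestamp, taken just before the launch
    ids = launch()
    while True:
        states = describe(ids)
        if all(states.get(i) == "running" for i in ids):
            return clock() - start  # second timestamp minus the first
        if clock() - start > timeout:
            raise TimeoutError("instances did not reach the running state")
        sleep(poll_interval)
```

The measured value includes up to one polling interval of detection delay, so the interval should be small relative to the start-up times being compared.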

In Figure 6 we show the results of the instance throughput experiment. The figure shows two empirical cumulative distribution functions that allow us to examine both the magnitude and variance of time taken to create instances in EC2 and the EPC. Each data point represents the percentage (Y axis) of instance creation trials that took at most the number of seconds denoted at the point’s corresponding position on the X axis.

Notice that for both cases (one and eight concurrent instance creation trials), though the ranges of creation times overlap, all empirical quantiles in the EPC case are lower than those of EC2. For example, 98 percent of the eight concurrent instance creation trials completed in less than 24 seconds within EPC, while only 75 percent completed in less than 24 seconds within EC2. For the one instance case, the difference is even more striking, with 98 percent of the trials completing in less than 17 seconds within the EPC and only 32 percent completing in less than 17 seconds within EC2.
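The quantile comparison can be expressed directly in terms of empirical CDFs: one system has quantiles that are never higher than the other's exactly when its empirical CDF is at least as large at every observed time. The sketch below shows this check. The function names are assumptions, and the sample timings are made up for illustration and are not measurements from the experiment.

```python
def ecdf(samples, t):
    """Fraction of trials that completed within t seconds."""
    if not samples:
        raise ValueError("need at least one sample")
    return sum(1 for s in samples if s <= t) / len(samples)


def dominates(fast, slow):
    """True if, at every observed time, at least as large a fraction of
    `fast` trials has completed as of `slow` trials."""
    times = sorted(set(fast) | set(slow))
    return all(ecdf(fast, t) >= ecdf(slow, t) for t in times)


# Illustrative timings in seconds (not data from the report).
system_a = [10, 12, 14, 20]
system_b = [15, 18, 22, 28]
a_is_faster = dominates(system_a, system_b)
```

Note that the two ranges above overlap (10 to 20 versus 15 to 28) while the dominance still holds, which mirrors the situation described in the text.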

This result, we believe, indicates that the Eucalyptus implementation is relatively efficient given its target environment. However, it does not indicate that the EPC is outperforming EC2. The actual EC2 harnesses a vast resource pool and, as such, should almost certainly incur a measurable performance overhead over an Eucalyptus implementation running on a small cluster. At the same time, the implementation of the EPC does seem to compare well with that of the system it emulates, indicating that its implementation is, at least, relatively high performance. This supposition is further supported by the (somewhat surprising) similarity in the shapes of the two distribution plots. Both are unimodal with relatively similar tail weights. At present we are unable to go beyond this observation and to make a direct inference (say from confidence bounds on the variance) about the similarity of the two performance profiles; however, doing so is something we hope to achieve in the near future.

3.2 Network Performance

Our second performance experiment is designed to study the characteristics of our network solution and to compare it with EC2’s networking approach. Since we do not know the network configuration and the hardware employed by EC2, it is not possible to compare the two systems in terms of functional detail. Rather, large discrepancies in performance should be interpreted as indicating a significant difference in approach or a “bug” in the Eucalyptus implementation.

The network experiment was conducted between two simultaneously launched instances, one acting as a server and the other as a client. We launched the instances from a disk image of Debian’s “etch” Linux distribution and installed the network performance measuring tool “iperf” in it. In addition to “iperf,” which was used for TCP and UDP experiments, we used “ping” to measure the round-trip latency of an ICMP echo. Finally, the TCP buffer conditioning and experiment duration were chosen to saturate a dedicated gigabit interconnection network.

To study how physical distance between the machines hosting VMs affects network performance, we conducted this experiment both between instances within one availability zone and between instances located in two different zones. For EC2 this meant that the network traffic traveled from one Amazon site to another, albeit both located on the east coast of the US (only three availability zones are currently offered by Amazon and all three are located on the east coast). For EPC this meant that the network traffic traversed the “private” network interface implemented by VDE, as described in Section 2.2. Although instances in two different Eucalyptus zones might be able to communicate over their “public” interfaces (if all addresses are publicly routable) and thus achieve better performance, the “private” interface experiments show how Eucalyptus performs when offering the same privacy guarantees as EC2.

Figure 6. Empirical CDFs comparing number of seconds taken to start one and eight VM instances within EC2 and Eucalyptus. (Two panels, with legends “EC2 1 Instance” / “OPC 1 Instance” and “EC2 8 Instances” / “OPC 8 Instances”; x-axis: time in seconds, 5 to 35; y-axis: 0 to 1.)

The results of our network experiment are shown in Figure 7. Here, we show TCP throughput and round-trip latency between two instances, inside EC2 and EPC, within a single availability zone and between zones. In addition to the individual measurements from 32 independent trials, we show their arithmetic means (square features) and 95% confidence intervals, when the interval is wide enough to be of interest.
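For readers who want to reproduce this kind of summary, the sketch below computes an arithmetic mean and a 95% confidence interval from a list of measurements. The report does not state how its intervals were computed, so the normal approximation used here is an assumption; with 32 trials a Student t interval would be slightly wider.

```python
from statistics import NormalDist, mean, stdev


def mean_with_ci(samples, confidence=0.95):
    """Return (mean, half_width) of a normal-approximation confidence interval."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    # Two-sided critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev(samples) / n ** 0.5
    return mean(samples), half_width


# Illustrative values only (not data from the report).
centre, half = mean_with_ci([10.0, 12.0, 14.0, 16.0])
```

The interval is then reported as the mean plus or minus the half width.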

Within EC2, we observe that bandwidth within a single availability zone outperforms bandwidth between zones by approximately a factor of 2, whereas the factor is closer to 10 within the EPC. It is clear that our chosen private networking solution (VDE) imposes a significant performance penalty that is not apparent within EC2. We believe the reason for this difference lies in the fact that with VDE, the private networking overlay is running almost entirely in user space, resulting in more memory copies per packet. Another reason for the poor performance of our private networking solution can be inferred from the latency results, showing a significantly greater variance in the RTT for ICMP packets traveling over the VDE network. Similar to the bandwidth result, we found the latency in some cases was over 10 times greater between EPC zones than within a single zone, again indicating that the VDE network imposes significant network performance degradation.

In both bandwidth and latency experiments, however, Eucalyptus delivers native network performance when VDE is not selected. Thus by choosing an SLA that specifies an allocation should not span clusters, a user can ensure that her cloud allocation will realize native interconnect speed at the possible expense of scalability (since an allocation will be limited to, at most, the size of one cluster).

This experiment also demonstrates, in rather stark terms, the performance impact associated with the overlay approach implemented by Eucalyptus. Specifically, VDE is necessary to implement a secure, user-space layer-2 overlay for each cloud allocation that can span separate private cluster networks.

4 Related Work

Cloud computing stems from recent innovations in operating system virtualization and scalable Internet services. It also shares intellectual underpinning with grid computing, although the precise nature of this sharing is a matter of some debate.

Figure 7. TCP throughput (left) and round-trip latency (right) measurements between instances started within EC2 and Eucalyptus. Individual measurements from 32 independent runs are shown together with their arithmetic means and 95% confidence intervals. (Both panels group the measurements as EC2 1 Zone, EC2 2 Zones, EPC 1 Zone, and EPC 2 Zones.)

Machine virtualization projects producing VM hypervisor software [8, 9, 30, 50] have enabled new mechanisms for providing resources to users. In particular, these efforts have influenced hardware design [3, 27, 31] to support transparent operating system hosting. The “right” virtualization architecture remains an open field of study [2]: analyzing, optimizing, and understanding the performance of virtualized systems [28, 29, 36, 37, 51] is an active area of research. Eucalyptus implements a cloud computing “operating system” using Xen-based virtualization as its initial target hypervisor and this work, particularly with respect to performance benchmarking, serves as a starting point for studying the overheads introduced by Eucalyptus.

Thanks in part to the new facilities provided by virtualization platforms, a large number of systems have been built using these technologies for providing scalable Internet services [1, 5, 12, 15, 16, 24, 44] that share many system characteristics: they must be able to rapidly scale up and down as workload fluctuates, support a large number of users requiring resources “on-demand”, and provide stable access to provided resources over the public Internet. While the details of the underlying resource architectures on which these systems operate are not commonly published, Eucalyptus almost certainly shares some architectural features with these systems due to shared objectives and design goals.

In addition to the commercial cloud computing offerings mentioned above (Amazon EC2/S3, Google AppEngine, Salesforce.com, etc.), which maintain a proprietary infrastructure with open interfaces, there are open-source projects aimed at resource provisioning with the help of virtualization. Usher [35] is a modular open-source virtual machine management framework from academia. Enomalism [18] is an open-source cloud software infrastructure from a start-up company. Virtual Workspaces [33] is a Globus-based [19] system for provisioning workspaces (i.e., VMs), which leverages several pre-existing solutions developed in the grid computing arena. The Cluster-on-demand [14] project focuses on the provisioning of virtual machines for scientific computing applications. oVirt [40] is a Web-based virtual machine management console.

While these projects produced software artifacts that are similar to Eucalyptus, there are several differences. First, Eucalyptus was designed from the ground up to be as easy to install and as non-intrusive as possible, without requiring sites to dedicate resources to it exclusively (one can even install it on a laptop for experimentation). Second, the Eucalyptus software framework is highly modular, with industry-standard, language-agnostic communication mechanisms, which we hope will encourage third-party extensions to the system and community development. Third, the external interface to Eucalyptus is based on an already popular API developed by Amazon. Finally, Eucalyptus is unique among the open-source offerings in providing a virtual network overlay that both isolates network traffic of different users and allows two or more clusters to appear to belong to the same Local Area Network (LAN).

Grid computing must also be acknowledged as an intellectual sibling of, if not ancestor to, cloud computing [10, 20, 38, 47]. The original metaphor for a computational utility, in fact, gives grid computing its name. While grid computing and cloud computing share a service-oriented approach [21, 22] and may appeal to some of the same users (e.g., researchers and analysts performing loosely-coupled parallel computations), they differ in two key ways. First, grid systems are architected so that individual user requests can (and should) consume large fractions of the total resource pool [39]. Cloud systems often limit the size of an individual request to be a tiny fraction of the total available capacity [6] and, instead, focus on scaling to support large numbers of users.

A second key difference concerns federation. From its inception, grid computing took a middleware-based approach as a way of promoting resource federation among cooperating, but separate, administrative domains [19]. Cloud service venues, to date, are unfederated. That is, a cloud system is typically operated by a single (potentially large) entity with the administrative authority to mandate uniform configuration, scheduling policies, etc. Eucalyptus conforms to the design constraints governing cloud systems.

Several research projects and white papers in the last few years have studied the performance ramifications of deploying specific workloads (often scientific ones) in today’s commercial clouds. For example, Palankar et al. [41] benchmarked Amazon’s S3 cloud storage solution for scientific applications, pointing out several current characteristics of the system that need to be addressed before it is appropriate for access to scientific data. Garfinkel [23] analyzed Amazon’s EC2 management, performance and security facilities and reported on the experience of moving a large-scale research application to the cloud. This work, while valuable by itself, could be significantly augmented through experimentation with Eucalyptus, both in terms of experimental verification and by allowing the researchers of these works to more precisely understand the measured resource performance response through system instrumentation. In addition, the performance results presented in this paper are directly relevant to these other benchmarking efforts.

Overall, we find that there are a great number of cloud computing systems in design and operation today that expose interfaces to proprietary and closed software and resources, a smaller number of open-source cloud computing offerings that typically require substantial effort and/or dedication of resources in order to use, and no system antecedent to Eucalyptus that has been designed specifically with support for academic exploration and community involvement as fundamental design goals.

5 Conclusion and Future Work

In this work, we have presented the Eucalyptus open-source cloud computing software framework. We have shown that Eucalyptus is distinctive among other cloud computing IaaS systems in that it supports an industry standard interface (Amazon EC2), deploys as an overlay atop existing commonly encountered resource configurations (small clusters, workstation pools, etc.), and has been designed as a modular system where components may be replaced or enhanced in order to foster future cloud computing research efforts. The entire Eucalyptus system is available for download and has been successfully installed both on clusters and numerous personal computing environments.

Benchmarking Eucalyptus against EC2 reveals that it is relatively efficient. While it outperforms EC2 in absolute terms, it does so in an environment with significantly fewer resources. Only when a process-level virtual network overlay is employed is performance substantially degraded. However, by adapting the concept of availability zone from EC2, Eucalyptus allows users to trade network performance for scalability explicitly through a default set of SLAs supplied with the system. This adaptation supports the claim that Eucalyptus allows new cloud computing techniques and policies to be developed. Thus we conclude that Eucalyptus is not inherently inefficient and provides facilities for cloud computing research that are otherwise unavailable.

In addition to constantly supporting new features, we are particularly interested in using Eucalyptus as a platform for experimenting with novel cloud computing concepts such as dynamic SLA generation, new virtual networking topologies for floating static IP addresses across clouds, approaches to implementing a truly secure cloud infrastructure, and novel user and administrative cloud interfaces.


References

[1] 3Tera home page. http://www.3tera.com/.

[2] K. Adams and O. Agesen. A comparison of software and hardware techniques for x86 virtualization. In ASPLOS-XII: Proceedings of the 12th international conference on Architectural support for programming languages and operating systems, pages 2–13, New York, NY, USA, 2006. ACM.

[3] Advanced Micro Devices, AMD Inc. AMD Virtualization Codenamed “Pacifica” Technology, Secure Virtual Machine Architecture Reference Manual. May 2005.

[4] Amazon.com home page. http://www.amazon.com/.

[5] Amazon Web Services home page. http://aws.amazon.com/.

[6] Amazon Elastic Compute Cloud (Amazon EC2). http://aws.amazon.com/ec2/.

[7] Google appengine – http://code.google.com/appengine/.

[8] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris,A. Ho, R. Neugebauer, I. Pratt, and A. Warfield. Xenand the art of virtualization. In SOSP ’03: Proceed-ings of the nineteenth ACM symposium on Operatingsystems principles, pages 164–177, New York, NY,USA, 2003. ACM.

[9] F. Bellard. QEMU, a Fast and Portable DynamicTranslator. Proceedings of the USENIX Annual Tech-nical Conference, FREENIX Track, pages 41–46,2005.

[10] F. Berman, G. Fox, and T. Hey. Grid Computing:Making the Global Infrastructure a Reality. Wileyand Sons, 2003.

[11] R. Buyya, C. S. Yeo, and S. Venugopa. Market-oriented cloud computing: Vision, hype, and realityfor delivering it services as computing utilities. InProceedings of the 10th IEEE International Confer-ence on High Performance Computing and Commu-nications (HPCC-08, IEEE CS Press, Los Alamitos,CA, USA) 2008.

[12] F. Chang, J. Dean, S. Ghemawat, W. Hsieh, D. Wal-lach, M. Burrows, T. Chandra, A. Fikes, and R. Gru-ber. Bigtable: A Distributed Storage Systemfor Structured Data. Proceedings of 7th Sympo-sium on Operating System Design and Implementa-tion(OSDI), page 205218, 2006.

[13] M. Chang, J. He, and E. Castro-Leon. Service-orientation in the computing infrastructure. In SOSE’06: Proceedings of the Second IEEE InternationalSymposium on Service-Oriented System Engineering,pages 27–33, Washington, DC, USA, 2006. IEEEComputer Society.

[14] J. Chase, D. Irwin, L. Grit, J. Moore, and S. Sprenkle.Dynamic virtual clusters in a grid site manager. HighPerformance Distributed Computing, 2003. Proceed-ings. 12th IEEE International Symposium on, pages90–100, 2003.

[15] J. Dean and S. Ghemawat. MapReduce: Simplified Data Processing on Large Clusters. Proceedings of 6th Symposium on Operating System Design and Implementation (OSDI), pages 137–150, 2004.

[16] G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels. Dynamo: Amazon’s highly available key-value store. Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles, pages 205–220, 2007.

[17] Amazon elastic compute cloud – http://aws.amazon.com/ec2/.

[18] Enomalism elastic computing infrastructure. http://www.enomaly.com.

[19] I. Foster and C. Kesselman. Globus: A metacomputing infrastructure toolkit. International Journal of Supercomputer Applications, 1997.

[20] I. Foster and C. Kesselman, editors. The Grid – Blueprint for a New Computing Infrastructure. Morgan Kaufmann, 1998.

[21] I. Foster, C. Kesselman, J. Nick, and S. Tuecke. The physiology of the grid: An open grid services architecture for distributed systems integration, 2002.

[22] I. Foster, C. Kesselman, and S. Tuecke. The anatomy of the grid: Enabling scalable virtual organizations. Int. J. High Perform. Comput. Appl., 15(3):200–222, 2001.

[23] S. L. Garfinkel. An evaluation of Amazon’s grid computing services: EC2, S3 and SQS. Technical Report TR-08-07, Center for Research on Computation and Society, School for Engineering and Applied Sciences, Harvard University, August 2007.

[24] Google – http://www.google.com/.

[25] D. Greschler and T. Mangan. Networking lessons in delivering ‘software as a service’: part i. Int. J. Netw. Manag., 12(5):317–321, 2002.

[26] D. Greschler and T. Mangan. Networking lessons in delivering ‘software as a service’: part ii. Int. J. Netw. Manag., 12(6):339–345, 2002.

[27] R. Hiremane. Intel Virtualization Technology for Directed I/O (Intel VT-d). Technology@Intel Magazine, 4(10), May 2007.

[28] W. Huang, M. Koop, Q. Gao, and D. Panda. Virtual machine aware communication libraries for high performance computing. In Proceedings of Supercomputing 2007.

[29] W. Huang, J. Liu, B. Abali, and D. K. Panda. A case for high performance computing with virtual machines. In ICS ’06: Proceedings of the 20th annual international conference on Supercomputing, pages 125–134, New York, NY, USA, 2006. ACM.

[30] Hyper-v home page – http://www.microsoft.com/hyperv.

[31] Intel. Enhanced Virtualization on Intel Architecture-based Servers. Intel Solutions White Paper, March 2005.

[32] JiBX home page. http://jibx.sourceforge.net/.

[33] K. Keahey, I. Foster, T. Freeman, and X. Zhang. Virtual workspaces: Achieving quality of service and quality of life in the grid. Sci. Program., 13(4):265–275, 2005.

[34] P. Laplante, J. Zhang, and J. Voas. What’s in a name? Distinguishing between SaaS and SOA. IT Professional, 10(3):46–50, May-June 2008.

[35] M. McNett, D. Gupta, A. Vahdat, and G. M. Voelker. Usher: An Extensible Framework for Managing Clusters of Virtual Machines. In Proceedings of the 21st Large Installation System Administration Conference (LISA), November 2007.

[36] A. Menon, A. Cox, and W. Zwaenepoel. Optimizing Network Virtualization in Xen. Proc. USENIX Annual Technical Conference (USENIX 2006), pages 15–28, 2006.

[37] M. F. Mergen, V. Uhlig, O. Krieger, and J. Xenidis. Virtualization for high-performance computing. SIGOPS Oper. Syst. Rev., 40(2):8–11, 2006.

[38] NSF TeraGrid Project. http://www.teragrid.org/.

[39] J. P. Ostriker and M. L. Norman. Cosmology of the early universe viewed through the new infrastructure. Commun. ACM, 40(11):84–94, 1997.

[40] oVirt home page. http://ovirt.org/.

[41] M. R. Palankar, A. Iamnitchi, M. Ripeanu, and S. Garfinkel. Amazon S3 for science grids: a viable solution? In DADC ’08: Proceedings of the 2008 international workshop on Data-aware distributed computing, pages 55–64, New York, NY, USA, 2008. ACM.

[42] P. Papadopoulos, M. Katz, and G. Bruno. NPACI Rocks: tools and techniques for easily deploying manageable Linux clusters. Concurrency and Computation: Practice & Experience, 15(7):707–725, 2003.

[43] Amazon simple storage service – http://aws.amazon.com/s3/.

[44] Salesforce Customer Relationships Management (CRM) system. http://www.salesforce.com/.

[45] M. Schmidt, B. Hutchison, P. Lambros, and R. Phippen. The Enterprise Service Bus: Making service-oriented architecture real. IBM Systems Journal, 44(4):781–797, 2005.

[46] D. Skillicorn. The case for datacentric grids. Parallel and Distributed Processing Symposium, Proceedings International, IPDPS 2002, Abstracts and CD-ROM, pages 247–251, 2002.

[47] T. Tannenbaum and M. Litzkow. The Condor distributed processing system. Dr. Dobbs Journal, February 1995.

[48] Ttylinux home page – http://www.minimalinux.org/ttylinux/.

[49] Virtual distributed ethernet (vde) home page – http://vde.sourceforge.net/.

[50] Vmware home page – http://www.vmware.com.

[51] L. Youseff, K. Seymour, H. You, J. Dongarra, and R. Wolski. The impact of paravirtualized memory hierarchy on linear algebra computational kernels and software. In HPDC, pages 141–152. ACM, 2008.
