    Open Source Cloud Management Stacks Comparison: Eucalyptus vs. OpenStack

    August 10, 2011

    1 Introduction

    As cloud computing technology matures, many commercial companies and online communities are developing new and interesting concepts for managing the underlying cloud stacks. This survey compares two widely used open-source management stacks and assesses the current state of the art in terms of resource management, storage, scalability and extendability. Cloud providers as well as medium and large businesses use these Cloud Management Stacks (CMS) to govern database infrastructures and to offer not only private cloud services, but also public or hybrid cloud services. Due to the nature of open-source projects, survival is achieved by popularity. Compatibility with various VMM platforms (e.g. Xen, KVM, VMware) is key, as these platforms dictate not only the build and maintenance cost factors but also the features that enable cost-saving services and techniques (e.g. smart resource provisioning, high availability, SLA monitoring, etc.). Table 1.1 lists the hypervisors each platform currently supports. While private cloud IaaS features are the basics for every cloud operator, public APIs set a new and sometimes contradicting set of requirements, as Service Level Agreements (SLAs) usually penalise cloud providers for unsatisfactory performance or low availability. Thus, the ability to extend the CMS with a business-oriented layer is often a necessity.

    Next, we briefly describe each of the open-source IaaS management stacks in terms of high-level design, focusing on the key differences between them. This is followed by a more in-depth look at the strategies each platform adopts with regard to resource management, storage, scalability and extendability.

    2 Architectures Overview

    2.1 Eucalyptus

    Being a relatively mature player in the market (4 years), Eucalyptus [3], backed by Eucalyptus Systems, Inc., offers the benefit of a relatively open-source infrastructure for managing private/public clouds. One benefit of Eucalyptus is that its open-source software components are used without modification, meaning that they can run on unmodified GNU/Linux kernels with relative ease. Ubuntu's baked-in cloud computing edition (Ubuntu Cloud) was Eucalyptus-based (until it changed to OpenStack in recent versions) and ready to be installed right after download, making it very convenient. Another major benefit is its API compatibility with one of the largest cloud providers, Amazon, and its EC2 platform (as well as partial support for S3). That means that a company evaluating EC2 can use freely available software on freely available operating systems to build a compatible test lab. That same company, once it is an Amazon customer, can then use Eucalyptus for development work before pushing to the ever-live world of the real cloud.

    Hypervisor         Eucalyptus                OpenStack
    Xen                yes                       yes
    KVM                yes                       yes
    VMware ESX/ESXi    Enterprise Edition only   yes
    Hyper-V            no                        yes

    Table 1.1: Hypervisors support matrix

    A Eucalyptus cloud setup [7, 5] consists of five types of components. The cloud controller (CLC) and "Walrus" are top-level components, with one of each in a cloud installation. The cloud controller is a Java program that offers EC2-compatible WS (SOAP-based) interfaces, as well as a Web interface to the outside world. In addition to handling incoming requests, the cloud controller performs high-level resource scheduling and system accounting. Walrus, also written in Java, implements bucket-based storage, which is available outside and inside a cloud through S3-compatible WS (SOAP) and HTTP (REST) interfaces.

    Top-level components can aggregate resources from multiple clusters (i.e., collections of nodes sharing a LAN segment, possibly residing behind a firewall). Each cluster needs a cluster controller (CC) for cluster-level scheduling and network control and a "storage controller" (SC) for EBS-style storage [4]. The two cluster-level components would typically be deployed on the head node of a cluster. Finally, every node with a hypervisor (Xen, KVM, etc.) will need a node controller (NC) for controlling the hypervisor. The CC and NC are written in C and deployed as Web services inside Apache; the SC is written in Java. Communication among these components takes place over SOAP with WS-Security.

    Figure 2.1: Eucalyptus High Level Design

    It is important to note that while Eucalyptus has an open-source version, it is not 100% open. A few code modules remain closed, and variations of them are in use in both the open-source and the enterprise versions of Eucalyptus. Some of them (e.g. resource management) are the reason why Eucalyptus is considered limited in scalability. In fact, NASA, which at the time publicly supported Eucalyptus, pulled out of the project (in favor of OpenStack) because of these very issues.

    2.2 OpenStack

    Backed by Rackspace and NASA, OpenStack is a collection of open-source technologies written in Python, delivering a massively scalable cloud operating system. It currently encompasses three main projects:

    Nova, which provides virtual servers on demand. This is similar to Rackspace's Cloud Servers or Amazon EC2.

    Swift, which provides object/blob storage. This is roughly analogous to Rackspace Cloud Files (from which it is derived) or Amazon S3.

    Glance, which provides discovery, storage and retrieval of virtual machine images for OpenStack Nova.

    While these three projects provide the core of the cloud infrastructure, OpenStack is open and constantly evolving, and more projects are filling in missing features as time goes by. In this survey we focus mainly on the architecture of Nova [6], which is the heart of the OpenStack platform (and can be extended by Swift, Glance and others). Nova consists of seven main components, with the Cloud Controller component representing the global state and interacting with all other components. API Server acts as the Web Services front end for the cloud controller. Compute Controller provides compute server resources, and the Object Store component provides storage services. Auth Manager provides authentication and authorization services. Volume Controller provides fast and permanent block-level storage for the compute servers. Network Controller provides virtual networks to enable compute servers to interact with each other and with the public network. Scheduler selects the most suitable compute controller to host an instance.

    Nova is built on a shared-nothing, messaging-based architecture. All of the major components, that is Compute Controller, Volume Controller, Network Controller, and Object Store, can be run on multiple servers. Cloud Controller communicates with Object Store via HTTP, but it communicates with Scheduler, Network Controller, and Volume Controller via AMQP (Advanced Message Queuing Protocol). To avoid blocking each component while waiting for a response, Nova uses asynchronous calls, with a callback that gets triggered when a response is received. To achieve the shared-nothing property with multiple copies of the same component, Nova keeps all the cloud system state in a distributed data store. Updates to system state are written into this store, using atomic transactions when required. Requests for system state are read out of this store. In limited cases, the read results are cached within controllers for short periods of time (for example, the current list of system users).
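    The asynchronous call-with-callback pattern described above can be sketched as follows. This is only an illustration under stated assumptions: an in-process queue stands in for the AMQP broker, and MessageBus, cast_with_callback, serve_one and pick_host are hypothetical names, not part of Nova.

```python
import queue
import threading
import uuid


class MessageBus:
    """In-process stand-in for an AMQP broker (an assumption for illustration)."""

    def __init__(self):
        self._requests = queue.Queue()
        self._callbacks = {}
        self._lock = threading.Lock()

    def cast_with_callback(self, method, args, callback):
        """Enqueue a request and return at once; the callback fires on reply."""
        msg_id = uuid.uuid4().hex
        with self._lock:
            self._callbacks[msg_id] = callback
        self._requests.put((msg_id, method, args))
        return msg_id

    def serve_one(self, handlers):
        """Worker side: take one request, run its handler, deliver the reply."""
        msg_id, method, args = self._requests.get()
        result = handlers[method](**args)
        with self._lock:
            callback = self._callbacks.pop(msg_id)
        callback(result)


def pick_host(hosts):
    """Hypothetical scheduler handler: choose the host with the most free RAM."""
    return max(hosts, key=hosts.get)


bus = MessageBus()
replies = []
# The caller is not blocked here; it only registers what to do with the answer.
bus.cast_with_callback("pick_host", {"hosts": {"n1": 2048, "n2": 8192}}, replies.append)

# A separate worker thread plays the role of the remote component.
worker = threading.Thread(target=bus.serve_one, args=({"pick_host": pick_host},))
worker.start()
worker.join()
print(replies)  # the reply arrives through the callback
```

    The caller never waits on the worker directly, which is the property that lets several copies of a component serve the same queue.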

    Figure 2.2: OpenStack Nova High Level Design

    3 Resources

    An important aspect of any cloud infrastructure is the resources granted to a VM. Each VM deployed in the cloud must be isolated from other VMs on the system in terms of its CPU access, memory access, network access, and access to persistent storage. Defined Service Level Agreements (SLAs) or Quality-of-Service (QoS) specifications categorize the quality and guarantee of resources. Furthermore, cloud resource allocations are "on-demand" (resources are returned to the user when they are requested, without queuing delay) and atomic (an entire resource allocation request is typically satisfied without the possibility of partial allocation failure). Many design and implementation issues are relevant and have an effect on infrastructure utilization, user customization and flexibility:

    Guarantees - Reservation and limit are fixed resource guarantees that specify the minimum and maximum amount a VM may consume. Relative resources such as shares set VM priority and, in case of resource contention, specify how to divide the host's resources between the VMs.

    Overcommit - Assuming that VMs do not always use all the resources assigned to them, the overcommit approach exploits this to better utilize hardware by hosting more resources than are physically available, but it requires consistent monitoring and prediction models to keep the guarantees valid.

    Flexibility - VM resources may be offered to users as fixed, i.e. predefined VM sizes the user can choose from in terms of CPU, memory, etc., or allow a higher degree of freedom and customization in which users can specify their specific resource requirements. With predefined instance types, the user is not free to allocate the combination of resources that best fits the application, but is forced to choose a specific resource setup. For cloud providers, this simplifies scheduling and the enforcement of resource allocation guarantees. Another flexibility issue is the ability to modify resources dynamically, e.g. increasing memory on the fly for a specific VM (i.e. vertical elasticity).

    EC2 parlance exposes predefined instance types (called flavors in Rackspace) that determine the size of the instance in terms of resources; these are used by both OpenStack and Eucalyptus (see Table 3.1).
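    To make the interplay of reservation, limit and shares concrete, here is a minimal sketch of one way a host resource could be divided under contention. It is not taken from either platform: the function name divide_capacity, the dictionary layout and the sample numbers are assumptions, and shares are assumed to be positive.

```python
def divide_capacity(capacity, vms):
    """Split `capacity` among VMs given reservation, limit and shares per VM.

    Each VM first receives its reservation; the remainder is divided in
    proportion to shares, never pushing a VM above its limit. Capacity that a
    capped VM cannot take is offered again to the VMs that still have room.
    """
    alloc = {name: spec["reservation"] for name, spec in vms.items()}
    if sum(alloc.values()) > capacity:
        raise ValueError("reservations exceed host capacity")

    remaining = capacity - sum(alloc.values())
    active = {n for n, s in vms.items() if alloc[n] < s["limit"]}
    while remaining > 1e-9 and active:
        total_shares = sum(vms[n]["shares"] for n in active)
        capped = set()
        handed_out = 0.0
        for n in active:
            offer = remaining * vms[n]["shares"] / total_shares
            room = vms[n]["limit"] - alloc[n]
            grant = min(offer, room)
            alloc[n] += grant
            handed_out += grant
            if grant >= room:
                capped.add(n)
        remaining -= handed_out
        active -= capped
        if not capped:
            break  # everything was distributed without hitting a limit
    return alloc


# Hypothetical host with 8000 MHz of CPU and three contending VMs.
vms = {
    "a": {"reservation": 1000, "limit": 2000, "shares": 1},
    "b": {"reservation": 1000, "limit": 8000, "shares": 1},
    "c": {"reservation": 0, "limit": 8000, "shares": 2},
}
alloc = divide_capacity(8000, vms)
print(alloc)
```

    In this example VM "a" stops at its limit, and the capacity it could not take is re-divided between "b" and "c" in a 1:2 ratio.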

    Type      Memory [MB]   CPU [cores]   Disk [GB]
    tiny      512           1             0
    small     2048          1             20
    medium    4096          2             40
    large     8192          4             80
    xlarge    16384         8             160

    Table 3.1: Supported instance types in Eucalyptus/OpenStack

    Neither of the compared implementations supports EC2's ModifyInstanceAttribute, which enables flexible resources (dynamically changing the instance type after VM launch).
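    The cost of fixed instance types can be illustrated with a short sketch that picks the smallest type from Table 3.1 satisfying a request. The values are copied from the table; the names InstanceType and smallest_fit are hypothetical and do not correspond to any API in Eucalyptus or OpenStack.

```python
from typing import NamedTuple, Optional


class InstanceType(NamedTuple):
    name: str
    memory_mb: int
    cores: int
    disk_gb: int


# Values copied from Table 3.1, ordered from smallest to largest.
INSTANCE_TYPES = [
    InstanceType("tiny", 512, 1, 0),
    InstanceType("small", 2048, 1, 20),
    InstanceType("medium", 4096, 2, 40),
    InstanceType("large", 8192, 4, 80),
    InstanceType("xlarge", 16384, 8, 160),
]


def smallest_fit(memory_mb: int, cores: int, disk_gb: int) -> Optional[InstanceType]:
    """Return the first (smallest) type covering every requested resource."""
    for itype in INSTANCE_TYPES:
        if (itype.memory_mb >= memory_mb
                and itype.cores >= cores
                and itype.disk_gb >= disk_gb):
            return itype
    return None  # no predefined type is large enough


# An application needing 3000 MB and one core is pushed to a larger type.
chosen = smallest_fit(memory_mb=3000, cores=1, disk_gb=10)
print(chosen)
```

    The request above lands on "medium", so the user pays for a second core and extra memory that were never asked for, which is the inflexibility the Flexibility item describes.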

    4 Storage & Image support

    An image repository and an image query service are important aspects of a cloud implementation. Efficient and scalable image management can save a lot of disk space, increase reliability, allow quick instance deployment and customized images, and let images be shared securely among users and groups. Both Eucalyptus and OpenStack have their own image store service and can use any S3-compatible implementation. An image is a triplet of virtual disk image(s), kernel and ramdisk images, along with an XML file containing metadata about the image. Eucalyptus includes Walrus, a data storage controller service whose interface is compatible with Amazon's Simple Storage Service (S3) and which provides a mechanism for storing and accessing virtual machine images and user data. Walrus provides two types of functionality:

    Streaming data into and out of the cloud as well as from instances.

    A storage service for VM images.

    The Walrus controller compresses the images, encrypts them using user credentials, and splits them into multiple parts that are described in an image description file (called the manifest in EC2 parlance). Walrus is entrusted with the task of verifying and decrypting images that have been uploaded by users. When a node controller (NC) requests an image from Walrus before instantiating it on a node, it sends an image download request that is authenticated using an internal set of credentials. Then, images are verified and decrypted, and finally transferred. As a performance optimization, and because VM images are often quite large, Walrus maintains a cache of images that have already been decrypted.
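    The compress, split and verify steps of image bundling can be sketched with the Python standard library. This is a simplified illustration, not the Walrus or euca2ools format: the manifest layout and the names bundle_image and unbundle_image are assumptions, and the encryption step is deliberately left out because the standard library provides no suitable cipher.

```python
import hashlib
import zlib


def bundle_image(data: bytes, part_size: int):
    """Compress an image, split it into parts, and describe them in a manifest."""
    compressed = zlib.compress(data)
    parts = [compressed[i:i + part_size]
             for i in range(0, len(compressed), part_size)]
    manifest = {
        "image_sha256": hashlib.sha256(data).hexdigest(),
        "parts": [hashlib.sha256(p).hexdigest() for p in parts],
    }
    return parts, manifest


def unbundle_image(parts, manifest) -> bytes:
    """Verify every part against the manifest, then reassemble the image."""
    digests = [hashlib.sha256(p).hexdigest() for p in parts]
    if digests != manifest["parts"]:
        raise ValueError("a part is missing or corrupted")
    data = zlib.decompress(b"".join(parts))
    if hashlib.sha256(data).hexdigest() != manifest["image_sha256"]:
        raise ValueError("reassembled image does not match the manifest")
    return data


# A small fake "image" and a tiny part size, chosen to force several parts.
image = bytes(range(256)) * 64
parts, manifest = bundle_image(image, part_size=64)
restored = unbundle_image(parts, manifest)
print(len(parts), restored == image)
```

    Verifying each part separately means a corrupted upload is rejected before the expensive reassembly, and a cache of already-verified images avoids repeating the work.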

    In OpenStack, there are two methods for managing images. Images can be served through the OpenStack Image Service, a project named Glance that provides discovery, storage and retrieval of image metadata, or directly from a nova-objectstore service. The image store can be any of a number of different object stores, including OpenStack Swift (which provides object/blob storage as a long-term storage tool offering reliability and availability through redundancy, analogous to Rackspace Cloud Files or Amazon S3). With an OpenStack Image Service server in place, the Image Service fetches the image onto the host machine and then OpenStack Compute boots the image from the host machine. OpenStack Compute relies on the euca2ools command-line tools distributed by the Eucalyptus team for adding, bundling, and deleting images.

    Clients can register new virtual disk images with the Image Service, query for information on publicly available disk images, and use the Image Service's client library for streaming virtual disk images. Being a multi-format image registry, the OpenStack Image Service allows uploads of private and public images in a variety of formats, including:

    Raw
    Machine (kernel/ramdisk outside of image, a.k.a. AMI)
    VHD (Hyper-V), also VHD images that include customer data and kernel in one unified image
    VDI (VirtualBox)
    qcow2 (Qemu/KVM)
    VMDK (VMware)
    OVF (VMware, others)

    While Eucalyptus's Walrus acts both as VM image storage and as the image management service, in OpenStack those two roles were divided into two separate, extendable building-block components (Glance and Swift), which provides a higher degree of implementation freedom and scalability.

    OpenStack marks images as private or public only; there is no way to share an image within a defined group of users, while Eucalyptus imposes its group security policy on all its objects, including images. Images are not used by several virtual machines simultaneously (ramdisk and kernel images may be).

    The two open-source clouds have many future plans related to the image store, such as adding image replicas, the ability to share images between specific users/groups, etc., which would enhance the image store and image services.

    5 Scheduling

    Resource scheduling is the process of finding the physical machine on which a new virtual machine will be hosted. The physical machine (a.k.a. host) must have enough free resource capacity (i.e. vCPU, memory, local storage, network bandwidth, etc.) to allow the new VM (and others already hosted on it) to perform within the performance boundaries specified by their Service Level Agreements (SLAs). Cloud providers usually also have some kind of global utilization function which tries to place the hosted VMs in locations that benefit a certain cost aspect (e.g. physical machine consolidation, load balancing, etc.). These kinds of problems are usually NP-hard, which means that finding an optimal solution is an extremely lengthy procedure (impractical at most cloud scales), and so a good cloud management stack usually balances the quality of the solution against the time it takes to find it. This is one of the areas where the design difference between Eucalyptus and OpenStack may very well be a decisive factor when one comes to choose a CMS.

    As the Eucalyptus Cloud Controller (CLC) is stateful, it can choose one of the Cluster Controllers (CC), either by round-robin or by picking the one with the largest total free capacity, to which it delegates the new VM assignment. The CC can choose whether to comply with the request. If the CC decides the request is legitimate, it assigns the VM to the node with the largest free capacity using a two-phase commit mechanism: first the CC reserves the resources, then the resources are committed after the NC completes provisioning the VM.

    As one can see, this heuristic is quite simplistic and may very well result in under-utilized hardware. While it is possible to replace the scheduling code, this must be done both at the cluster level (CC) and at the system-wide level (CLC). Also, it is currently impossible for clusters to communicate with each other, so VM migration between clusters is complicated and demands the CLC's involvement. Thus, any solution which tries to use VM migration (i.e. optimization of existing placements) must be centralized, as it must be implemented completely in the CLC.
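    The two-level placement described above can be sketched as follows. This is an illustration only, not Eucalyptus code: the classes Node and Cluster, the function greedy_cluster and the provision callback (standing in for the NC starting the VM) are hypothetical, and capacity is reduced to a single core count.

```python
import itertools


class Node:
    def __init__(self, name, free_cores):
        self.name = name
        self.free_cores = free_cores
        self.reserved = 0

    def available(self):
        return self.free_cores - self.reserved

    def reserve(self, cores):
        """Phase 1: set capacity aside without starting anything."""
        if self.available() < cores:
            return False
        self.reserved += cores
        return True

    def commit(self, cores):
        """Phase 2: the VM is provisioned, so the reservation becomes usage."""
        self.reserved -= cores
        self.free_cores -= cores

    def abort(self, cores):
        """Provisioning failed: release the reservation."""
        self.reserved -= cores


class Cluster:
    def __init__(self, name, nodes):
        self.name = name
        self.nodes = nodes

    def free_cores(self):
        return sum(n.available() for n in self.nodes)

    def place(self, cores, provision):
        """Cluster-level step: try the node with the largest free capacity."""
        node = max(self.nodes, key=Node.available)
        if not node.reserve(cores):
            return None        # the cluster declines the request
        if provision(node):
            node.commit(cores)
            return node.name
        node.abort(cores)
        return None


def greedy_cluster(clusters):
    """Cloud-level alternative to round-robin: largest total free capacity."""
    return max(clusters, key=Cluster.free_cores)


clusters = [Cluster("c1", [Node("a", 2), Node("b", 4)]),
            Cluster("c2", [Node("c", 8)])]
round_robin = itertools.cycle(clusters)

first = next(round_robin).place(2, lambda node: True)          # round-robin pick
second = greedy_cluster(clusters).place(2, lambda node: True)  # greedy pick
print(first, second)
```

    Because both levels decide independently, replacing the heuristic means touching the cluster-level and the cloud-level code alike.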

    While OpenStack does offer a similar set of schedulers to choose from (chance scheduler - randomly selects a host; zone scheduler - picks a random host which is up in a specific availability zone; simple scheduler - picks the least loaded host), it also adds two more for large-scale deployments, which are almost completely distributed. When a request for VM instantiation arrives, the central scheduler module asks all availability zones (recursively) to assess the cost of assigning the VM to each node which resides within them. This cost is currently affected by the amount of free space left after the assignment and by an additional, optional dynamic weight. The results are then sorted and the most appropriate candidate is chosen.

    As this is a completely distributed model, there's no central bottleneck on the performance of the scheduler. Also, other heuristics can easily be deployed by changing the VM-node cost model, and by dynamically changing the weights to prefer certain hosts as time goes by (e.g. preferring the most reliable machines, or those with the lowest application loads, etc.).
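    A rough sketch of such cost-based, recursive zone scheduling is given below. It is not the Nova scheduler: Zone, host_cost and schedule are hypothetical names, the only resource considered is RAM, and the sign of the cost (more free memory left means lower cost) is an assumption, since the text does not state whether the scheduler spreads or packs VMs.

```python
def host_cost(free_ram_mb, requested_mb, weight=0.0):
    """Lower is better; returns None when the host cannot fit the request.

    Assumption: hosts with more RAM left after the assignment are preferred,
    and `weight` is a dynamic penalty (e.g. for a less reliable machine).
    """
    left = free_ram_mb - requested_mb
    if left < 0:
        return None
    return -left + weight


class Zone:
    def __init__(self, name, hosts=None, children=None, weights=None):
        self.name = name
        self.hosts = hosts or {}        # host name -> free RAM in MB
        self.children = children or []  # nested zones
        self.weights = weights or {}    # host name -> dynamic weight

    def candidates(self, requested_mb):
        """Return (cost, host) pairs for this zone and, recursively, its children."""
        found = []
        for host, free in self.hosts.items():
            cost = host_cost(free, requested_mb, self.weights.get(host, 0.0))
            if cost is not None:
                found.append((cost, host))
        for child in self.children:
            found.extend(child.candidates(requested_mb))
        return found


def schedule(root, requested_mb):
    """Sort all candidate costs and pick the cheapest host, if any."""
    ranked = sorted(root.candidates(requested_mb))
    return ranked[0][1] if ranked else None


east = Zone("east", hosts={"e1": 4096, "e2": 16384})
west = Zone("west", hosts={"w1": 8192})
root = Zone("root", children=[east, west])

plain = schedule(root, 2048)     # no weights: the emptiest host wins
east.weights["e2"] = 20000.0     # penalise e2, e.g. after repeated failures
weighted = schedule(root, 2048)
print(plain, weighted)
```

    Changing the heuristic only requires a different host_cost function or different weights; the zone traversal itself stays untouched.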

    6 Extendability

    While Eucalyptus is built as a hierarchical architecture, aiming to make it easy for researchers and developers to replace certain modules with their own versions [1, 2], the current releases don't support plug-in extensions. This means that module re-use and distribution is difficult and complex, and sometimes even impossible (due to the fact that Eucalyptus is not 100% open-source). Also, since Eucalyptus can be considered an EC2 extension to the private cloud (implementing the EC2 and S3 APIs), the development team is not inclined to add support for new hypervisor features (e.g. live migration). Adding manual support for such features is considered hard, since the state is distributed among various components (e.g. network, EBS, public IPs, security groups).

    Its hierarchical design also has practical implications for the feasibility of features and/or their performance. For example, since the Cloud Controller acts as a single aggregation point for resource allocation and instance provisioning, it is in fact a bottleneck for any smart-placement algorithm (being of a centralized design). Even with that in mind, the actual instance allocation is ultimately in the hands of the Cluster Controller, which operates a very simplistic allocation algorithm (first-fit) that can override the Cloud Controller's placement algorithm. The core problem is Eucalyptus's design as component-centric rather than service-oriented. Instead of helping the creation of new services, a predetermined hierarchical design constrains the behavioral flow and thus burdens the extendability of this model.

    To tackle extendability issues, OpenStack implements several APIs which form a two-dimensional communication multi-path: on the one hand, user/developer-oriented APIs (EC2, Rackspace APIs), and on the other hand, extendability APIs to bridge between OpenStack extensions. The main design motto in OpenStack is "share-nothing". This approach is realized in the execution flow of requests (i.e. a "star" design) and in the communication between the different modules. The assumption is that a component simply advises the central "execution unit" (i.e. the Cloud Controller), so it is relatively easy to replace a module since it has virtually no dependent modules. For example, it is easy to replace the default Scheduler module, since it does not change the state of other modules. Its high cohesion and the fact that it is decoupled from the complexity of other modules make the job of writing and debugging such a component much easier.

    7 Discussion

    We've surveyed topics concerning system design and their implications in each of the systems. While there is certainly some technological advantage to OpenStack, it is important to note that while OpenStack clearly claims to be designed for clouds of every purpose, Eucalyptus started out as, and in a sense still is, a research or "sandbox"-oriented cloud, designed to enable quick development and testing in "lab" conditions before deploying to EC2. While the open-source version of Eucalyptus has remained unchanged in that regard, the Enterprise Edition includes several advanced features (e.g. quota and accounting management, scalable DB support, Windows guests, etc.).

    While setup is easy right out of the box in both systems, the amount of work one needs to put in to customize the different modules depends on the type of research, the algorithms and the scale of the work.

    References

    [1] Rick Bradshaw and Piotr T. Zbiegiel. Experiences with Eucalyptus: deploying an open source cloud. In Proceedings of the 24th International Conference on Large Installation System Administration, pages 1-16. USENIX Association, 2010.

    [2] E. Caron, F. Desprez, D. Loureiro, and A. Muresan. Cloud computing resource management through a grid middleware: a case study with DIET and Eucalyptus. In IEEE International Conference on Cloud Computing (CLOUD), pages 151-154. IEEE, 2009.

    [3] Eucalyptus open source edition. http://open.eucalyptus.com.

    [4] Amazon Inc. Elastic Block Storage. http://aws.amazon.com/ebs/.

    [5] Eucalyptus Inc. Eucalyptus online guide. http://open.eucalyptus.com/book/export/html/1398.

    [6] OpenStack Nova. http://nova.openstack.org.

    [7] D. Nurmi, R. Wolski, C. Grzegorczyk, G. Obertelli, S. Soman, L. Youseff, and D. Zagorodnov. The Eucalyptus open-source cloud-computing system. In IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid), pages 124-131. IEEE, 2009.
