Implementation of a Simulation Environment for Cloud Object Storage Infrastructures

Bachelor Thesis of

Tobias Sturm

At the Department of Informatics
Steinbuch Centre for Computing (SCC)

First reviewer: Prof. Dr. Achim Streit
Second reviewer: Prof. Dr. Bernhard Neumair
Advisor: Foued Jrad

Duration: 23.5.2013 – 22.8.2013

KIT – University of the State of Baden-Wuerttemberg and National Research Center of the Helmholtz Association www.kit.edu


I declare that I have developed and written the enclosed thesis completely by myself, and have not used sources or means without declaration in the text.

Karlsruhe, 22.8.2013

(Tobias Sturm)


Zusammenfassung

Cloud computing enables companies to meet their demand for IT resources such as computing power or storage capacity by using external services. Large investment costs are avoided because Cloud services are billed according to the pay-as-you-go principle. For users as well as operators it is important to be able to estimate which costs will arise or which hardware constellation should be used in the Cloud. Questions such as multi-Cloud usage or which allocation policies should be used also play a very important role. For this reason CloudSim [3] was developed, an event-based simulation framework with which IaaS (Infrastructure-as-a-Service) Clouds can be modeled and their usage simulated. Besides costs and resource utilization, metrics such as energy consumption are also of primary interest. However, this popular simulation environment provides no mechanisms whatsoever for simulating today's object-storage-based Cloud services (STaaS, Storage-Service-as-a-Service).

In this work, CloudSim was extended with a STaaS component. Well-known Cloud standards such as CDMI (Cloud Data Management Interface) served as the model for the user-Cloud interfaces. Cloud-internal processes such as storage accesses and network communication within a STaaS infrastructure, as well as the use of multiple Clouds, were modeled. The resource utilization and the costs that arise from using the STaaS component were evaluated on the basis of different simulation scenarios.


Abstract

Cloud computing empowers organizations to satisfy their need for IT resources such as computing power and storage capacity by using external services. Since Cloud services are billed on a pay-as-you-go basis, organizations can avoid large up-front investments. Users want to know what costs will arise from using those services, while Cloud providers want to offer the best-matching hardware configurations. Allocation policies and multi-Cloud usage are two further topics of interest. Therefore CloudSim [3], a popular event-based framework, was developed to model and simulate the usage of IaaS (Infrastructure-as-a-Service) Clouds. Metrics such as costs, resource utilization and energy consumption can also be investigated using CloudSim. However, this popular simulation framework does not provide any mechanisms to simulate today's object-storage-based Cloud services (STaaS, Storage-as-a-Service).

This work extends CloudSim with a STaaS component. The well-known standard CDMI (Cloud Data Management Interface) served as the model for the user-Cloud interface. Cloud-internal processes such as storage accesses and network communication inside the STaaS infrastructure were modeled, as well as the usage of multiple Clouds. Resource utilization and the costs that arise from using the STaaS component were evaluated on the basis of multiple simulation scenarios.


Table of Contents

Zusammenfassung

Abstract

1. Introduction
  1.1. Motivation
  1.2. Goal of this Thesis
  1.3. Structure of this Thesis

2. Fundamentals
  2.1. Cloud Computing
  2.2. Characteristics of a Cloud
  2.3. Cloud Deployment Models
  2.4. Cloud Service Models
  2.5. Data Storage as a Service
    2.5.1. STaaS Storage Types
      2.5.1.1. Structured Storage
      2.5.1.2. Block Storage
      2.5.1.3. Object Storage
    2.5.2. Example STaaS Provider - Amazon S3
    2.5.3. Example STaaS Middleware - OpenStack Swift
    2.5.4. Example STaaS API - CDMI
  2.6. CloudSim

3. Modeling of a STaaS
  3.1. Architecture
    3.1.1. User Code and User Interface Structures
    3.1.2. Provided Storage Services
    3.1.3. Resources, Resource Usage and Network
  3.2. Sequence Diagram

4. Implementation
  4.1. Development Environment
  4.2. STaaS Provider Models
    4.2.1. CDMI Implementation
      4.2.1.1. CDMI Entity
      4.2.1.2. CDMI Object
      4.2.1.3. CDMI Container
      4.2.1.4. CDMI Metadata and Cloud Characteristics
    4.2.2. Internal Storage Models
      4.2.2.1. Blob and Bloblocators
      4.2.2.2. Servers and Hard Drives
    4.2.3. Object Storage Cloud Model
  4.3. User Models
    4.3.1. StorageBroker Class
    4.3.2. UsageSequence Class
    4.3.3. Service Level Agreements
    4.3.4. MetaStorageBroker Class
    4.3.5. Latency Models
    4.3.6. Request Layers
    4.3.7. CDMI Requests
      4.3.7.1. Cloud Discovery Request
      4.3.7.2. GET Container Request
      4.3.7.3. GET Object Request
      4.3.7.4. PUT Container Request
      4.3.7.5. PUT Object Request - creation
      4.3.7.6. PUT Object Request - update
    4.3.8. UserRequest and UserMetaRequest Class
  4.4. Monitoring
    4.4.1. TrackableResource Class
    4.4.2. EventTracker and ResourceUsageHistory Class
    4.4.3. Accounting
    4.4.4. Report Generation
  4.5. Scenario Generation

5. Evaluation
  5.1. Simulation Setup
  5.2. Simulation Scenarios
    5.2.1. Single Cloud Scenario
    5.2.2. Multi-Cloud Scenario
  5.3. Simulation Results
    5.3.1. Used Storage and SLA fulfillments
    5.3.2. Cost evaluation
    5.3.3. Effect of the used Sequence Type

6. Conclusion and future Work

Bibliography

Appendix
  A. Additional figures
  B. Monitoring Outputs
  C. Input Files


List of Figures

2.1. CDMI root container with multiple containers
2.2. CloudSim Overview

3.1. Architecture Overview
3.2. Interaction Overview

4.1. System Class Diagram
4.2. Relationship between Objects and Containers
4.3. CDMI Object, Blob and Blob Locators
4.4. Server and Disk IO limitations
4.5. Timeaware Resource Utilization
4.6. Request Layers
4.7. Get Container State Diagram
4.8. Get Object State Diagram
4.9. Put Container State Diagram
4.10. Put Object State Diagram (creation)

5.1. Distribution of the Size of single Objects in all scientific Access Patterns
5.2. Distribution of the Size of single Objects in all normal Access Patterns
5.3. Number of Request Types in all normal Access Patterns
5.4. Distribution of Traffic per Sequence (both Types)
5.5. Succeeded Requests and SLA Violations in single Cloud Experiment with 5000 Sequences
5.6. Used Storage in multi-Cloud Experiment with 50 Sequences
5.7. Used Storage in multi-Cloud Experiment, with 5000 sequences
5.8. Total Cost per Experiment
5.9. Costs per Sequence Type with 250 Input sequences

A.1. Average Response Time for mixed input Sequences
A.2. Distribution of Request Types in scientific Sequences
B.3. Plot of Monitored Number of Requests in Cloud
B.4. Plot of Total physical used Storage Capacity in Cloud


List of Tables

2.1. Comparison between Block Storage and Object Storage
2.2. Storage Prices for Amazon S3 standard Storage in the US
2.3. Costs per Request for Amazon S3 in the US
2.4. Traffic Cost for Amazon S3 in the US

4.1. Implemented CDMI Metadata

5.1. Cloud Configurations compared
5.2. Calculated StorageMetaBroker Score per Provider and Access Pattern
5.3. Average Price per accepted Sequence in Cents
5.4. Matrix of total Costs for 250 Input Sequences in Dollar


1. Introduction

1.1. Motivation

Cloud computing is one of the most prominent emerging technologies of the past few years. Start-ups, big companies and the scientific community are exploring the benefits of this kind of resource utilization. Cloud users and providers enjoy less configuration overhead, lower investments, lower operating costs and highly flexible systems that scale out to large systems or scale in to minimal resource utilization, depending on current needs.

The main motivation is to reduce the cost of computation and storage by outsourcing these services to a third-party provider. The crucial questions concern the costs and the performance of this usage. Users want to know how much money they are going to spend if they switch to a Cloud solution. Providers, on the other hand, have to know which hardware constellation and configuration is best for them. These questions can be answered by simulating Cloud environments.

CloudSim is a popular simulation environment that offers capabilities to simulate computing Clouds that provision computing infrastructures as a service. The focus of CloudSim is on the modeling of so-called Cloudlets, which are jobs that can be scheduled on VMs (Virtual Machines). Models for hard disks, SAN and files already exist [3]. However, suitable interfaces to simulate object STaaS are missing, as is a fine-grained model for the size of files (see 2.6).

There are also few research works that address the above-mentioned functionalities. The only work on storage modeling in CloudSim was performed by Zhao and Long. They propose an extension of the existing Data Center model [11] that introduces a Replica Catalog¹ and a Block Catalog² to benchmark different configurations. All operations are started from within a Cloudlet, which means that the VM is the requesting entity. Their work therefore presents a mixture of data centers that offer both computing power and storage capacity. This work, in contrast, focuses on the strict separation of those two capabilities, because real-world infrastructure-as-a-service providers do the same.

¹ One piece of information is stored multiple times at different physical locations. The locations are managed by the catalog.

² One piece of information is striped and distributed across different physical locations to gain parallelized and therefore faster access. Catalogs can provide a query interface to retrieve those locations.
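
The difference between the two catalog ideas described in the footnotes can be illustrated with a minimal Python sketch. The function names `replicate`, `stripe` and `reassemble` as well as the location labels are hypothetical; they are not taken from the work of Zhao and Long or from CloudSim.

```python
def replicate(data, locations):
    """Replica catalog idea: every location holds a full copy of the data."""
    return {location: data for location in locations}


def stripe(data, locations, chunk_size):
    """Block catalog idea: chunks are spread round-robin over the locations.

    The returned catalog maps each location to a list of (index, chunk)
    pairs, so the original order can be restored later.
    """
    catalog = {location: [] for location in locations}
    for index, offset in enumerate(range(0, len(data), chunk_size)):
        location = locations[index % len(locations)]
        catalog[location].append((index, data[offset:offset + chunk_size]))
    return catalog


def reassemble(catalog):
    """Query all locations and join the chunks in their original order."""
    chunks = sorted(pair for pairs in catalog.values() for pair in pairs)
    return b"".join(chunk for _, chunk in chunks)
```

With three locations and a chunk size of 4, the ten bytes `b"abcdefghij"` are split into three chunks that could be read in parallel, whereas replication stores all ten bytes three times.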


1.2. Goal of this Thesis

Considering that there is no known simulation environment for object STaaS that includes a proper modeling of the interfaces, this work extends the existing framework CloudSim by adding components for an accurate modeling of object storage.

New models are introduced to describe the behavior of hard disks, storage servers and storage data centers. Interactions between the user and the Cloud are based on the CDMI standard, which enables users to model Cloud usage scenarios with little effort. Concurrent accesses to one physical resource, caused by two or more requests, are modeled, as are network delays between different simulation entities. A so-called meta broker aims to provide transparent access to multiple Clouds by scheduling sequences of requests to different Cloud providers, depending on the SLA³ requirements for those sequences. The usage of a single Cloud is compared with the usage of multiple Clouds in terms of billed costs, required storage space and SLA violations.

1.3. Structure of this Thesis

The thesis is structured as follows: The fundamentals of Cloud computing, its deployment models and its service models are described in chapter two, followed by an introduction to different storage types and one example each of a STaaS provider, middleware and API. The architecture of the existing simulation framework CloudSim is also described in chapter two. The software architecture of the extended CloudSim with the STaaS models, the focus of this work, is presented in chapter three. A more detailed description of the implementation follows in chapter four, which focuses on the subset of implemented CDMI elements and features as well as the internal storage model (such as servers and hard drives). Next, the user models such as broker, SLA and request sequences are explained. After a detailed explanation of the proposed latency model, the thesis addresses the communication between different entities and describes the possible user requests that can be sent to a simulated STaaS. Some monitoring techniques are then described, before the generation process for scenarios is explained. In chapter five the developed STaaS simulator is evaluated and the results are presented and discussed, comparing a single-Cloud environment with a multi-Cloud environment. The work closes with a conclusion and an outlook on future work.

³ Service level agreements


2. Fundamentals

This section summarizes the fundamentals of Cloud-related technologies, providers, middleware and standards. The simulation framework CloudSim [3], which is extended by this work, is described as well.

2.1. Cloud Computing

“The Cloud represents any-to-any network connectivity in an abstract way. In this abstraction, the network connectivity in the Cloud is represented without concern for how it is made to happen.” [4]

The term “Cloud” stands for the highly elastic provisioning of services (computing power, networking, storage or applications). Cloud providers acquire hardware (hard drives, CPUs, network infrastructure, etc.) and set up a Cloud platform on top of that physical infrastructure by using virtualization technologies [9]. Users are given access to that platform via the network in a transparent way, so that they do not need any knowledge of the underlying hardware. Billing is typically based on a pay-per-use principle. Users gain a simpler, more flexible and cheaper way of accessing the infrastructure [14].

“Cloud computing is the IT foundation for Cloud services and it consists of technologies that enable Cloud services” [6].

Providers offer physical machines that are pooled and can be allocated to virtual machines (VMs) created by the users (customers). Users pay per computing instance and unit of time. A computing instance defines a specific amount of guaranteed CPU power, available RAM and hard disk storage. This server template and other metrics (such as maximum network latency, minimum network bandwidth or the maximum time to turn on additional VMs for scale-out) are defined in the service level agreements (SLA).

Service level management is done via dynamic orchestration of physical resources [6]:

• Multiple VMs can run on the same physical host simultaneously (using virtualization technologies).

• Virtual machines are migrated between physical hosts.

• Physical machines are powered off or on, depending on the current requirements.
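
The consolidation idea behind this orchestration can be illustrated with a small Python sketch. It uses a simple first-fit placement by RAM, which is an assumption chosen for illustration and not a policy named in [6]; the names `first_fit` and `idle_hosts` and all capacities are hypothetical.

```python
def first_fit(vm_ram, host_ram):
    """Place each VM on the first host that still has enough free RAM.

    vm_ram:   dict mapping a VM name to its RAM demand
    host_ram: list with the RAM capacity of each physical host
    Returns a dict mapping each VM name to a host index.
    """
    free = list(host_ram)
    placement = {}
    for vm, demand in vm_ram.items():
        for host, available in enumerate(free):
            if demand <= available:
                free[host] -= demand
                placement[vm] = host
                break
        else:
            # No host had enough free RAM for this VM.
            raise ValueError(f"no host can fit {vm}")
    return placement


def idle_hosts(placement, host_count):
    """Hosts without any VM are candidates for being powered off."""
    used = set(placement.values())
    return [host for host in range(host_count) if host not in used]
```

For three hosts with 8 GB each and VMs that need 4, 8 and 2 GB, the first and third VM share host 0, the second VM occupies host 1, and host 2 stays idle and could be powered off.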


Customers can upload their own VM images, which can then be dispatched to one or multiple instances. The complete software stack, beginning at the level of the operating system, is customizable by the user. All interactions between the user and the Cloud provider are typically carried out via a web console. Security can be enforced via storage isolation, VM isolation, VLANs and SSL/SSH [6].

2.2. Characteristics of a Cloud

According to NIST, there are five essential characteristics of a Cloud [13]:

• On-demand self-service: Users can allocate resources whenever they need them.

• Broad network access: Fast data transfers between the Cloud and all users.

• Resource pooling: Resources are pooled via virtualization technologies and then allocated to user requests by the Cloud operator. Resources can be re-allocated. Users have no control over which physical device is allocated.

• Rapid elasticity: The amount of allocated resources depends on current need. Rapid elasticity ensures that these allocations can be changed within very short time intervals.

• Measured service: Different dashboards provide metrics for providers as well as for users.

2.3. Cloud Deployment Models

Four different deployment models can be defined [13] as follows:

• Private Cloud. The whole infrastructure serves only one customer. The Cloud provider is either the user itself (same company) or another company, but the Cloud is separated at the hardware level, which leads to more security and privacy.

• Community Cloud. Similar to the private Cloud, the physical resources are only available to a limited group of users. Multiple organizations share a hardware pool and agree on policies among themselves. The community Cloud is useful if the organizations follow a common mission and have to share data with each other.

• Public Cloud. A generally accessible platform, owned by a company that sells the service to users.

• Hybrid Cloud. A combination of either a private or a community Cloud with a public Cloud. High peaks of resource utilization can be absorbed by the public Cloud. This requires a common interface across the involved Cloud platforms.

2.4. Cloud Service Models

Three different service models describe the different levels of abstraction offered by the provider. The highest level of abstraction is Software as a Service (SaaS) [13]. Whole applications such as email or groupware are provided via web interfaces. Users cannot use applications other than the offered ones, but they do not have to worry about installation, updates or configuration. Popular examples are iCloud¹ and Evernote² for end users and Amazon Simple Workflow Service³ for companies [14].

¹ see https://www.icloud.com/
² see https://evernote.com
³ see http://aws.amazon.com/en/swf/


The second level of abstraction is Platform as a Service (PaaS). The Cloud provider offers an API and a runtime to users, who can develop their own applications. The runtime and the API offer libraries and mechanisms that speed up software development but tie the developer to a specific provider (vendor lock-in). Well-known examples are Google App Engine⁴, Microsoft Windows Azure⁵ and Amazon Elastic Beanstalk⁶.

The lowest level of abstraction is called Infrastructure as a Service (IaaS) [13], which simply provides virtual machines, virtual switches and virtual network connections that can be used like physical devices. Users can determine the configuration of all these devices and can choose their own operating systems.

2.5. Data Storage as a Service

Like Cloud computing, data storage as a service (STaaS) is a specialization of IaaS. In the context of STaaS, the term storage means non-volatile (permanent) memory that can be read and written via the network and that a Cloud provider can offer in different forms. Most STaaS solutions offer on-line secondary storage, but tertiary off-line or near-line solutions do exist (for example Amazon Glacier⁷). This work focuses on on-line secondary storage.

Storage can be analyzed by the following characteristics [15]:

• Random vs. sequential access: Jumping between specific positions in a file or accessing it in a consecutive manner

• Minimum, maximum and average read/write latency: Delay introduced by the storage devices before data transfer between user and storage medium can begin

• Read/write throughput: Maximum transfer rate

• Granularity: Size of accessible chunks

• Reliability: Probability of a spontaneous, erroneous change of a bit value

• Energy use: Power consumption during standby and operation

• Storage density: Required space per megabyte
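
As a rough illustration of how latency and throughput interact, the time to serve one request can be approximated as the latency plus the object size divided by the throughput. This linear model and the function name `transfer_time` are simplifying assumptions made here for illustration; they do not reproduce the latency model proposed later in this work.

```python
def transfer_time(size_bytes, latency_s, throughput_bytes_per_s):
    """Approximate the time in seconds to read or write one object.

    Assumes a fixed start-up latency followed by a transfer at constant
    throughput (a deliberate simplification).
    """
    if throughput_bytes_per_s <= 0:
        raise ValueError("throughput must be positive")
    return latency_s + size_bytes / throughput_bytes_per_s


# Hypothetical device: 10 ms latency, 50 MB/s throughput.
large = transfer_time(100_000_000, 0.010, 50_000_000)  # dominated by throughput
small = transfer_time(4_096, 0.010, 50_000_000)        # dominated by latency
```

Under these assumed numbers the 100 MB object takes roughly two seconds, almost all of it transfer time, while the 4 KB object takes barely more than the 10 ms latency. This is why latency matters most for workloads with many small objects.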

The cost of storage depends on the energy use and the storage density. Many different factors have to be considered before building and configuring a storage system, for example:

• Geographic backups: Protect against the loss of whole data centers or accidental deletion

• Replication systems: Protect against disk failure and serve many consecutive read requests for the same content

• Anti-bit-rot mechanisms: Detect data inconsistency caused by storage devices or write/read operations

• Total costs: Based on energy usage and storage density

• Scalability: Prevent bottlenecks and single points of failure

• Encryption: Secure stored data and/or the transfer between the client and the Cloud

⁴ see https://developers.google.com/appengine/?hl=en
⁵ see http://www.windowsazure.com/en-us/solutions/
⁶ see http://aws.amazon.com/de/elasticbeanstalk/
⁷ see http://aws.amazon.com/en/glacier/


Companies that do not have the required knowledge or money to invest in such a storage facility become STaaS customers. Providers guarantee certain SLAs (Service Level Agreements, see 4.3.3), such as the costs per gigabyte, the number of replicas or the geographic availability. Like Cloud computing solutions, STaaS solutions scale out, which saves customers high investment costs, for example when less storage capacity is required than was bought.

Providers, on the other hand, do not know what kind of data is stored by their customers or how the data will be accessed. One possible scenario is a popular website: very few write operations and a high burst of requests for specific content. Another scenario is a document management system: the ratio between read and write operations is close to 1, no prediction is available of which content will be requested in the future, and therefore no good caching possibilities are known. Providers can reduce the amount of actually used space by using compression and deduplication [4]. One infrastructure has to serve these and other possible scenarios.
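
How deduplication reduces the physically used space can be sketched as follows. The class `DedupStore` is hypothetical and only illustrates the general content-hashing idea; it does not describe how any particular provider or the simulator developed in this work implements it.

```python
import hashlib


class DedupStore:
    """Stores identical content only once, keyed by its SHA-256 digest."""

    def __init__(self):
        self._blobs = {}  # digest -> content (physically stored once)
        self._names = {}  # object name -> digest (logical view)

    def put(self, name, content):
        digest = hashlib.sha256(content).hexdigest()
        self._blobs.setdefault(digest, content)  # keep the first copy only
        self._names[name] = digest

    def get(self, name):
        return self._blobs[self._names[name]]

    def logical_size(self):
        """Bytes the customers believe they have stored."""
        return sum(len(self._blobs[d]) for d in self._names.values())

    def physical_size(self):
        """Bytes that actually occupy storage after deduplication."""
        return sum(len(content) for content in self._blobs.values())
```

If two customers upload the same 1000-byte file and a third uploads a different 500-byte file, the logical size is 2500 bytes while only 1500 bytes are stored physically.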

2.5.1. STaaS Storage Types

There are basically three types of storage: structured, block and object storage. Each type has different characteristics and is therefore preferred in different use cases.

2.5.1.1. Structured Storage

Structured storage systems are also known as databases. Content (entries) follows a schema and has a defined structure (field A of type a, followed by field B of type b, ...). Database systems follow the client-server pattern (both can be on the same machine), which means that the server stores and manages the content. The client requests or writes content via a specific interface (for example SQL). The biggest advantage of this kind of storage over other storage systems is that the server can use the content schema to filter, sort or compute outputs. DBMSs (database management systems) hide the physical organization of the data, are responsible for avoiding mutual overwrites and perform optimizations in order to retrieve data as fast as possible⁸.
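
The advantage of a known schema can be shown with a short sketch that uses Python's built-in `sqlite3` module. The table `objects`, its columns and the sample rows are made up for illustration. Note that SQLite is an embedded database, so the client-server separation described above is only implied here.

```python
import sqlite3


def build_catalog():
    """Create an in-memory database with a fixed schema and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE objects (name TEXT, size_bytes INTEGER)")
    conn.executemany(
        "INSERT INTO objects VALUES (?, ?)",
        [("report.pdf", 120000), ("photo.jpg", 3500000), ("notes.txt", 800)],
    )
    return conn


def names_larger_than(conn, min_size):
    """Let the database filter and sort, using the declared schema."""
    rows = conn.execute(
        "SELECT name FROM objects WHERE size_bytes > ? ORDER BY size_bytes DESC",
        (min_size,),
    ).fetchall()
    return [name for (name,) in rows]
```

Calling `names_larger_than(build_catalog(), 1000)` returns the two larger entries, largest first. The filtering and sorting happen inside the database engine, not in the client code.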

2.5.1.2. Block Storage

Block storage is the kind of storage that is typically used on every personal computer:hard disks, optical disks or magnetic tape. Devices can be only read or written on blocks(also known as chunks of data). Except for the magnetic tape, block devices are accessedvia a file system in order to achieve random access to content. Optionally a DBMS canprovide a convenient way to organize data on the storage device 9.

File systems define the logical unit file that combines one or multiple blocks on a storagedevice to one entity. Files can be organized hierarchically in directories. While the mappingfrom file to blocks on the storage device is done by the file system, the organization andretrieval of files from different directories, replication, backups, etc. has to be done by theuser or programs that run on top of the operating system and use the file system. Securityis enforced by the operation system in cooperation with the file system via flags, accesscontrol lists[? ] or similar mechanisms 10.

⁸ see http://en.wikipedia.org/w/index.php?title=Database&oldid=564466297

⁹ see http://en.wikipedia.org/w/index.php?title=Block_(data_storage)&oldid=540858882

¹⁰ see http://en.wikipedia.org/w/index.php?title=File_system&oldid=563716718


Virtual file systems allow access to remote storage devices via a network, for example NAS (network attached storage), or pool multiple devices into one logical device, like SAN (storage area network). SAN offers only block-based storage, which leaves the file system concerns to the client¹¹.

2.5.1.3. Object Storage

The concept of object storage was introduced in the early 1990s and is gaining increasing interest in the Cloud computing community.

Object storage pools multiple physical devices together and provides one logical medium to store and retrieve many different pieces of information (called objects). According to [1], SAN is lacking in three important aspects: “security and protection, end-to-end management at a meaningful semantic level, and scalability (in particular for allocation)”.

In contrast to conventional file systems, the physical location of an object is determined by the storage controller. Like structured storage, object storage follows the client-server pattern. Every operation has an attached credential in order to enforce security [1]. Object storage systems can usually handle multiple users: their stored objects are separated from each other on the logical representation layer [4].

Besides user data, an object contains so-called metadata [1], like timestamps, information about the content (for example via MIME type) or the number of replicas. Objects can be accessed by their server-wide unique ID and can be created, updated and read (completely or partially) by all authorized clients [1]. Objects may have a name, like a filename in conventional file systems, to give the user a more convenient way to identify them. These names must be unique in a specified scope: this scope is either the set of all objects of one user or all objects within the same container. Containers are virtual organization units for objects and may be hierarchical like folders on conventional file systems. Metadata can also be attached to containers (like the number of replicas of every stored object in that container) [4].

Object storage systems provide a set of capabilities (like versioning, replication, user groups, ...), which can usually be queried by customers [4].

Table 2.1 compares block storage to object storage, according to [7].

Table 2.1. Comparison between Block Storage and Object Storage

| | Block Storage | Object Storage |
|---|---|---|
| Operations | read block, write block | read object offset, write object offset, create object, delete object |
| Security | weak, full disk | strong, per object |
| Allocation | external | internal |

2.5.2. Example STaaS Provider - Amazon S3

One of the most popular object storage providers is Amazon S3.

“Amazon S3 provides a simple web services interface that can be used to store and retrieve any amount of data, at any time, from anywhere on the web. It gives any developer access to the same highly scalable, reliable, secure, fast, inexpensive infrastructure that Amazon uses to run its own global network of web sites.” [10]

¹¹ see http://en.wikipedia.org/w/index.php?title=Network-attached_storage&oldid=560730128

Table 2.2. Storage Prices for Amazon S3 standard Storage in the US

| Used storage / month | Price in $/GB |
|---|---|
| First 1 TB | $0.095 |
| Next 49 TB | $0.080 |
| Next 450 TB | $0.070 |
| Next 500 TB | $0.065 |
| Next 4000 TB | $0.060 |
| Over 5000 TB | $0.055 |

Table 2.3. Costs per Request for Amazon S3 in the US

| Request type | Pricing |
|---|---|
| PUT, COPY, POST, LIST | $0.005 per 1k requests |
| GET and all others | $0.004 per 10k requests |
| DELETE | free |

Developers (the users) create so-called buckets (which are equivalent to the object containers of CDMI), which isolate the stored objects of different users. Objects are then stored in those buckets without any additional hierarchy (nested buckets are not possible). The number of objects is not limited, but the size of one object cannot exceed 5 terabytes. Objects are stored in three different facilities (replication), and the backup mechanisms are designed for 99.999999999% durability and 99.99% availability of objects over a given year.

All operations are passed via REST / SOAP interfaces. Object downloads can be done via HTTP or BitTorrent. Objects can be made public so that end users can access them via HTTP without any authentication, which means that the object storage can in fact serve as a CDN (content distribution network). Amazon calls this feature CloudFront. Pricing depends on the region. Amazon currently offers two locations in the US, one in the EU, three in Asia Pacific, and one in South America. Data is never transferred between regions unless the developer transfers it explicitly [10]. The total costs for a bucket depend on the region, the used space per month, the amount of transferred data and the number of operations of each kind. The prices according to [10] are shown in tables 2.2, 2.3 and 2.4.
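To illustrate how the tiered storage prices of table 2.2 add up, the following minimal sketch computes the monthly storage cost for a given amount of data. The class and method names are hypothetical, and the conversion 1 TB = 1024 GB is an assumption that is not stated in the price list; request and traffic costs are ignored.

```java
// Sketch: monthly storage cost for the tiered prices of table 2.2.
// Assumption (not from the price list): 1 TB = 1024 GB.
final class S3StorageCost {
    // Tier widths in TB and price in $ per GB; the last tier is open-ended.
    private static final double[] TIER_TB = {1, 49, 450, 500, 4000, Double.POSITIVE_INFINITY};
    private static final double[] PRICE_PER_GB = {0.095, 0.080, 0.070, 0.065, 0.060, 0.055};
    static final double GB_PER_TB = 1024.0;

    /** Returns the storage cost in $ for one month and the given amount of stored GB. */
    static double monthlyCost(double usedGb) {
        double remaining = usedGb;
        double cost = 0.0;
        for (int i = 0; i < TIER_TB.length && remaining > 0; i++) {
            double tierGb = TIER_TB[i] * GB_PER_TB;
            double billed = Math.min(remaining, tierGb); // part of the data that falls into this tier
            cost += billed * PRICE_PER_GB[i];
            remaining -= billed;
        }
        return cost;
    }

    public static void main(String[] args) {
        System.out.printf("500 GB: $%.2f%n", monthlyCost(500));
        System.out.printf("2 TB:   $%.2f%n", monthlyCost(2 * GB_PER_TB));
    }
}
```

Only the part of the data that exceeds a tier boundary is billed at the cheaper rate of the next tier, which is why the loop subtracts the billed amount before moving on.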

2.5.3. Example STaaS Middleware - OpenStack Swift

OpenStack [8] is an open source initiative, founded in 2010 by NASA (project Nebula) and Rackspace Hosting (Cloud Files platform). OpenStack is very popular for developing private or community Clouds. Organizations like eBay, CERN and Deutsche Telekom use its projects¹². One of the OpenStack projects is called Swift, which is a STaaS system that offers basic features (storage, retrieval, deletion and update of objects) as well as replication, integrity audits and statistics.

Swift was designed to have no single point of failure and scale horizontally (see 2.2).

2.5.4. Example STaaS API - CDMI

The Cloud Data Management Interface (CDMI) is a standard defined by the SNIA (Storage Networking Industry Association). CDMI defines a RESTful HTTP interface to access an object storage system. Export capabilities to CIFS, NFS, iSCSI, WebDAV and OCCI are generally possible, but optional. Besides objects and containers, more advanced features like Domains and Queues are provided. The offered features can be expressed via capabilities.

¹² see http://en.wikipedia.org/wiki/OpenStack

Table 2.4. Traffic Cost for Amazon S3 in the US

| Traffic | Pricing |
|---|---|
| Uploads | free |
| Transfer out from S3 to same region | free |
| Transfer out from S3 to different region | $0.02 per GB |
| Transfer out from S3 to CloudFront | $0.02 per GB |
| Transfer out from S3 to the internet | $0.00 up to $0.12 per GB |

Figure 2.1.: CDMI root container with multiple containers (diagram labels: CDMI Cloud; Root Container User A, Root Container User B, Root Container User C; Container foo; Nested Container; Container bar; User A)

Containers are used as a simple grouping of objects for convenience and may be hierarchical [4], as depicted in figure 2.1. The provider creates exactly one root container for every customer (user). Users can only access their own root containers, but can create multiple credentials with different access levels for objects inside their root container via Domains.

Metadata keeps the storage system simple, but enables the provider to build quality services (like automatic, selective backups) on top of an object storage system. The schema of metadata can be defined by the user. Containers as well as objects have metadata. If a new object is created, it inherits some metadata from the container it is located in (and containers inherit metadata from their parent containers). The metadata of an instance (container or object) may be changed at any later time in order to overwrite the inherited metadata. There are different types of metadata, like HTTP (content length, content type, ...), user and storage system metadata. These items are key-value pairs that are encoded as JSON strings [4].
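The inherit-then-overwrite behaviour of metadata can be sketched as follows. This is a simplified illustration and not part of the CDMI standard: the class, the method and the example keys are hypothetical, and which keys are excluded from inheritance is passed in as an assumption.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical names; only the inherit-then-overwrite behaviour is taken from the text.
final class MetadataInheritance {
    /** Creates the metadata of a new child as a copy of the parent's inheritable entries. */
    static Map<String, String> inherit(Map<String, String> parent, Set<String> notInherited) {
        Map<String, String> child = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : parent.entrySet()) {
            if (!notInherited.contains(entry.getKey())) {
                child.put(entry.getKey(), entry.getValue());
            }
        }
        return child;
    }

    public static void main(String[] args) {
        Map<String, String> container = new LinkedHashMap<>();
        container.put("NUM_REPLICA", "3");
        container.put("LOCATION", "EU");

        Map<String, String> object = inherit(container, Collections.<String>emptySet());
        object.put("NUM_REPLICA", "1"); // a later change overwrites the inherited value

        System.out.println("container: " + container);
        System.out.println("object:    " + object);
    }
}
```

Because the child receives a copy, overwriting a value on the object leaves the metadata of the container untouched.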

Queue objects store values like containers, but offer access in a first-in-first-out manner. Domain objects can be used for administrative groupings and accounting [4]. Neither kind of object is discussed or used further in this thesis.

Every object and every container must have a URI, which is unique in the scope of the Cloud and is generated by the Cloud itself. Users can then change the names of objects or containers to assign a more expressive identifier.

Figure 2.2.: CloudSim Overview (layers from top to bottom: User Code with Cloud Scenario, User Requirements, Data Center Broker and User Broker; CloudSim with Cloudlet, Virtual Machine, VM Provisioning, CPU Allocation, RAM Allocation, Storage Allocation, Bandwidth Allocation, Cloud Coordinator, Sensor, Data Center, Network Topology and Message Delay Calculation; CloudSim Core Simulation Engine)

In contrast to Amazon S3's own API, CDMI offers the four HTTP request verbs (GET, PUT, POST, DELETE).

CDMI was chosen for this work because it provides all core features of a Cloud service interface, but is not limited to a single provider (like Amazon S3). In addition, the fact that CDMI is an open standard leads to an open environment for interoperability between different Cloud providers. Customers can use one interface definition to access many different Clouds. There is also a CDMI implementation for OpenStack¹³.

2.6. CloudSim

CloudSim is a time-discrete simulation framework for Cloud computing. The framework consists of three layers, as shown in figure 2.2 (from bottom to top):

1. Core Simulation Engine: Queuing and processing of events, management of Cloud system entities such as hosts, VMs, brokers, etc.

2. CloudSim: Representation of network topology, delay of messages, VM provisioning, CPU, storage and memory allocation, etc.

3. User code: General configuration such as Cloud scenarios and user requirements, User Broker

¹³ see https://github.com/osaddon/cdmi


Users of the framework can either modify the top layer to change the scenarios to simulate, or extend the second layer to test different allocation policies in a Cloud system. The user code layer defines so-called cloudlets, each of which describes a specific amount of required computation (like a Cloud job). These jobs are then dispatched to available VMs by the CloudSim layer. Communication between the entities is done via messages that are represented as events and sent to the core simulation engine, which handles all events in the correct order and manages the simulated time. Events between two remote entities are automatically delayed if the network topology is represented [2].
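The role of the core simulation engine can be illustrated with a minimal event queue. The sketch below is not CloudSim's API; all names are hypothetical. It only shows the principle that events are processed in the order of their simulated time and that the clock jumps from one event to the next.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Minimal event-queue sketch; this is not CloudSim's API, all names are hypothetical.
final class MiniEventEngine {
    static final class Event implements Comparable<Event> {
        final double time;    // simulated delivery time
        final long seq;       // insertion order, breaks ties between equal times
        final String message;

        Event(double time, long seq, String message) {
            this.time = time;
            this.seq = seq;
            this.message = message;
        }

        @Override
        public int compareTo(Event other) {
            int byTime = Double.compare(time, other.time);
            return byTime != 0 ? byTime : Long.compare(seq, other.seq);
        }
    }

    private final PriorityQueue<Event> queue = new PriorityQueue<>();
    private double clock = 0.0;
    private long nextSeq = 0;

    /** Schedules a message; the delay stands in for the network delay between two entities. */
    void send(String message, double delay) {
        queue.add(new Event(clock + delay, nextSeq++, message));
    }

    /** Processes all events in time order and returns the delivered messages. */
    List<String> run() {
        List<String> delivered = new ArrayList<>();
        while (!queue.isEmpty()) {
            Event event = queue.poll();
            clock = event.time; // simulated time jumps to the next event
            delivered.add(event.message);
        }
        return delivered;
    }

    double clock() {
        return clock;
    }

    public static void main(String[] args) {
        MiniEventEngine engine = new MiniEventEngine();
        engine.send("PUT object", 5.0);
        engine.send("PUT container", 1.0);
        System.out.println(engine.run() + " at t=" + engine.clock());
    }
}
```

The sequence number is a design choice of this sketch: it keeps the processing order deterministic when two events are scheduled for the same simulated time.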

CloudSim can simulate SAN storage, hard drives and files that are stored on hard drives directly or via SAN storage. However, these models are not sufficient for object storage:

• File size magnitude: CloudSim models the file size in megabytes, but 90% of all web objects fit within 16 KB [5]¹⁴.

• Hard drive models: The hard drive models do not provide all metrics that are required to model the read and write durations accurately.

• No storage controller: CloudSim does not offer a controller that determines the storage location of objects.

• No appropriate object storage interface: There is no model for any STaaS interface, for example CDMI.

¹⁴ This means that a CDN scenario cannot be modeled accurately. File sizes are stored in Java Integer variables: only files up to 2.147 PB can be modeled.


3. Modeling of a STaaS

The previous chapter described the fundamentals in detail, including the existing simulation framework CloudSim. This chapter describes the extension of this framework to enable the simulation of STaaS.

Class names are formatted as follows: SomeClass. Detailed descriptions of these classes can be found in chapter 4.

3.1. Architecture

The following section outlines how the existing architecture of CloudSim is extended to provide a simulation environment for STaaS Clouds. Some classes are shared with CloudSim, some are completely independent. Figure 3.1 represents the overall architecture of the modeled STaaS Cloud; its contents are discussed in the following sections. Blue boxes represent components of CloudSim, green boxes are components that are described in this work and purple boxes are components that have to be provided by the user of the simulation framework.

3.1.1. User Code and User Interface Structures

The user code describes the general Cloud scenario: what kind of requests shall be simulated, and in which order? For classical usage of CloudSim, the user creates different parameters, which are then converted into cloudlets and sent to the Cloud. One cloudlet represents a single job that cannot be divided into two jobs and is independent of other jobs.

The corresponding concept for STaaS is the UsageSequence. Instances of this class define the requirements that are demanded of the Cloud (e.g. pricing, capabilities, ...), followed by a series of User-Cloud interactions (see 4.3.8). Possible interactions are the creation or deletion of a container, the upload, modification, deletion or download of an object, and idle operations (see section 4.3.7). All operations within one UsageSequence depend on each other in their given order (a download of an object can only succeed if it was previously uploaded to the very same Cloud).

UsageSequences are handed to a MetaStorageBroker (see 4.3.4), which chooses the Cloud that matches the SLA requirements best. For this process the MetaStorageBroker starts multiple Cloud discovery requests (see 4.3.7.1) that retrieve the current capacities and capabilities of the Clouds. After all Clouds have been discovered, the best one is chosen. The UsageSequence is then forwarded to the StorageBroker (see 4.3.1), which then creates further CDMI requests and interacts with the Cloud. A more detailed description of the different interactions can be found in section 4.3.6.

Figure 3.1.: Architecture Overview

3.1.2. Provided Storage Services

The services that are provided by the modeled STaaS Cloud are:

• Object Storage: Storage, organization and retrieval of objects as described in 2.5.4.

• Replicas: Objects are stored multiple times at different locations. The number of required replicas can be adjusted per container. Store operations only succeed if there is sufficient storage for all replicas of the object.

• Storage Accounting: Every operation in the Cloud is logged. One purpose is billing, the other is general monitoring of the delay and duration of operations.

• Storage Policy Enforcement: Object replicas are stored as far apart from each other as possible to reduce the probability of failure. Limits like the maximum number of children or the maximum object size are enforced as well.

3.1.3. Resources, Resource Usage and Network

Two resources are limited in the STaaS Cloud: the number of bytes that can be stored at a given time and the available bandwidth (user to Cloud, Cloud interface to storage server, server interface to hard disk). The total used storage capacity changes only when an object is uploaded, deleted or modified. In contrast, the available bandwidth changes very often during the simulation. Whenever an object is uploaded, downloaded, modified or moved, the used bandwidth increases when the operation starts and decreases when the operation is finished. Multiple operations can be executed at the same time, so the bandwidths available to the different operations depend on each other. This is modeled with the TimeawareResourceUtilization (see section 4.3.5).

Storage servers and hard disks model the hardware that is used by the Cloud provider. They model technical details like the maximum read/write throughput or the total available capacity.

Network links are modeled via a BRITE topology. Messages are delayed by a fixed delay that is defined in this topology [12]. Another crucial factor is the size of a network transmission (upload and download of objects). This delay is calculated from the currently available bandwidth (time-aware resource utilization) and the size of the message.

3.2. Sequence Diagram

The sequence diagram depicted in figure 3.2 shows an example of an interaction from the User Code down to the hard disks and is discussed in the following paragraphs.

As described in 3.1.1, all commands have to be encapsulated in a UsageSequence. In this case there is only one available Cloud provider, so there is only one broker and no need for a Cloud discovery process. This UsageSequence consists of only two commands: the creation of a container and the upload of one object into that container.

The broker acts on behalf of the user and is identified via the ID that is defined by the CloudSim core simulation framework. On every request, the Cloud instance checks whether the user is already known; if not, it either creates a new user account (with a new root container) or rejects the request. In the case of a PUT container request, a new user is created. Every other request from an unknown user fails (PUT object requests require an existing container, and GET and DELETE requests do not make sense at all, because a newly created user has no containers or objects). Every container that is created by the user is a direct child of the user's root container. Policy enforcement mechanisms check that the user is allowed to create the container. The creation of a child container requires virtually zero time, so the Cloud instance can send a success response immediately after the container was created.

The diagram shows that the PUT container request is blocking: the broker waits until the operation succeeds before proceeding with the next operation. This is required because the object is to be put into the newly created container. The Cloud instance checks all prerequisites (does the user exist, does the target container exist, ...).

Every container in the Cloud is virtually attached to several storage servers. Containers control where objects are stored by choosing one of the attached servers. In the simplest case, every container is attached to all servers. Another possibility would be regional limitations (e.g. one container can only access servers in one geographical region). As soon as all prerequisites are met (sufficient storage and no policy violation), the Cloud sends an acknowledgment to the broker, which signals that the operation will succeed but is not finished yet.

The PUT operation may be delayed because some resources are fully occupied at that time. In addition, the duration of the operation is calculated, which depends on the maximum bandwidth and workload of the hard disks, the server and the Cloud network interface.

Figure 3.2.: Interaction Overview (sequence diagram with the participants User Code, Broker, StorageCloud, root Container, object Container, Server and Harddrive; messages: UsageSequence, PUT Container, check User, create & set Meta, create child, CDMI Response, PUT Object, check User, getChild, store Object, probe, sufficient storage, reserve space, succeed, CDMI ACK, delay response, expected delay and duration, CDMI SUCC, poll Report & Stats)

The lowest bandwidth and the longest delay determine the total delay before the SUCC response is sent back to the broker.
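One possible reading of this bottleneck rule is sketched below: the transfer can only start after the longest waiting time of all involved resources, and it then proceeds at the speed of the slowest one. The formula and all names are an interpretation for illustration, not the code of this work.

```java
// Sketch of the bottleneck rule; the formula is an interpretation of the text,
// not the implementation of this work. All names are hypothetical.
final class BottleneckEstimate {
    /**
     * @param sizeBytes  size of the transferred object in bytes
     * @param bandwidths currently available bandwidth of each resource in bytes/ms
     * @param delaysMs   waiting time of each resource before the transfer can start, in ms
     * @return estimated time in ms until the SUCC response can be sent
     */
    static double timeUntilSuccess(long sizeBytes, double[] bandwidths, double[] delaysMs) {
        double minBandwidth = Double.POSITIVE_INFINITY;
        for (double bandwidth : bandwidths) {
            minBandwidth = Math.min(minBandwidth, bandwidth); // slowest resource limits the transfer
        }
        double maxDelay = 0.0;
        for (double delay : delaysMs) {
            maxDelay = Math.max(maxDelay, delay); // the transfer waits for the busiest resource
        }
        return maxDelay + sizeBytes / minBandwidth;
    }

    public static void main(String[] args) {
        // Hard disk, server and Cloud network interface, in that order.
        double[] bandwidths = {100.0, 50.0, 200.0};
        double[] delays = {2.0, 0.0, 7.0};
        System.out.println(timeUntilSuccess(1000, bandwidths, delays) + " ms");
    }
}
```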

Depending on the scenario and user code, it might be useful to retrieve statistics and reports from the brokers and the Cloud providers. The broker can provide information on the user level, like the total duration of operations (where delay and duration cannot be distinguished) or the number of succeeded operations for the user. The information that can be pulled from the Cloud is more detailed: logs describe which resource was used for which purpose, and how long and for which reason operations were delayed. In addition, the Cloud instance provides this information for all users. The total costs are available per user and aggregated over all users as well.


4. Implementation

The previous chapter gave a brief overview of the architecture and the interactions of the different components inside the storage Cloud simulation framework, which extends CloudSim. This chapter describes in detail the implementation of the individual classes and their interaction with each other.

Figure 4.1 gives a broad overview of all important classes in this work. Green classes represent the CDMI implementation, yellow ones the internal storage model, blue classes represent user models and purple classes are for monitoring purposes. A more detailed description of the individual components follows.

Names of methods, constants and so on are formatted as follows: void foo(String bar), SOME_CONST or SomeClassName.

4.1. Development Environment

This work extends CloudSim version 3.0.3¹. Besides the Apache Commons Mathematics Library², which is required by CloudSim, the Simple framework version 2.7³ is used for the serialization and deserialization of XML files. Both libraries are published under the Apache License⁴. The complete extension of CloudSim is written in Java⁵.

NetBeans 7.3.1⁶ was used for profiling⁷ and IntelliJ IDEA was used as the primary IDE⁸.

4.2. STaaS Provider Models

This section covers the models that are required to simulate all states, processes and policies that are ’inside’ the Cloud and invisible to the Cloud user.

¹ https://code.google.com/p/cloudsim/downloads/detail?name=cloudsim-3.0.3.tar.gz, released on May 02, 2013

² version 3.2, see http://commons.apache.org/proper/commons-math/

³ see http://simple.sourceforge.net/, published on 18 February 2013

⁴ version 2.0, see http://www.apache.org/licenses/

⁵ version 1.7.0 update 25

⁶ see https://netbeans.org/

⁷ see https://profiler.netbeans.org/

⁸ version: 12.1.4 Build: 129.713 Released: June 10, 2013, see http://www.jetbrains.com/idea

Figure 4.1.: System Class Diagram (classes shown: StorageCloud, CdmiRootContainer, CdmiObjectContainer, CdmiObject, CdmiMetadata, CdmiEntity, CdmiCloudCharacteristics, StorageBlob, StorageBlobLocation, IObjectStorageDevice, ObjectStorageServer, CdmiRequest, CdmiResponse, ScheduleEntry, StorageBroker, StorageMetaBroker, UserRequest, UsageSequence, SLARequirement, SLARequest*, GETContainerRequest, GETObjectRequest, PUTObjectRequest, DELETEObjectRequest, ..., TimeawareResource, EventTracker, TrackableResource, UsageHistory, ReportGenerator, UsageSequenceFileGenerator)

STaaS Clouds differ in their capabilities and characteristics. Some providers might be cheaper than others, but offer fewer services (for example a lower number of replicas). This issue is highlighted in section 4.2.1.4.

As described in section 2.5.4, the Cloud is accessed via the CDMI interface, a RESTful interface that is based on HTTP. The following section covers the implementation of this interface.

4.2.1. CDMI Implementation

This section takes a closer look at the implementation of a CDMI-like interface to a STaaS provider. This implementation covers only a subset of the CDMI interface described in section 2.5.4.

4.2.1.1. CDMI Entity

Every object that can be stored and every storage container is modeled as an instance of CdmiEntity. Entities must have a provider-wide unique ID (represented by CdmiID) and may have a name for convenient access. The class CdmiID provides unique IDs within Clouds (distinguished by their rootUrl) for the creation of new objects and containers. The user has no control over the choice of IDs.

Containers and objects can be accessed via their CDMI ID or via their name (unless the name was never set or is empty). Some entities, like CdmiRootContainer or CdmiObjectContainer, can contain children, which are again CDMI entities.

4.2.1.2. CDMI Object

Instances of CdmiDataObject represent the smallest addressable storage units and extend the class CdmiEntity. Every object has its own metadata, which is inherited from its parents. The size of an object is defined as the number of bytes that are stored by the user as the content of the object. The physical size is defined as the size plus the space that is required to store the metadata (see 4.2.1.4).

4.2.1.3. CDMI Container

Containers are entities that contain children. They provide a namespace separation for object names: no two objects with the same name can exist in the same container. Every container stores the mapping from names to the IDs of all its children. Containers have metadata that is inherited from the parent¹⁰. Containers can enforce policies that are parameterized through their metadata: for example, NUM_REPLICA defines the number of replicas of one object that are required to fulfill the Quality of Service (QoS). Every container can have a different QoS regarding replication.

The size and the physical size of a container are defined as the sum of the sizes or physical sizes of all children plus the space that is required to store the metadata of the container (see 4.2.1.4).

The CdmiRootContainer is a specialized container, because exactly one root container exists for every Cloud user. Root containers have an additional mapping from object ID to the container that stores the object, in order to retrieve objects through their ID without providing the name of the container.
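The two mappings described above can be sketched with plain maps. The classes below are hypothetical, heavily simplified stand-ins for CdmiObjectContainer and CdmiRootContainer; IDs are plain strings here instead of CdmiID instances.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified classes; they only mirror the two mappings described in the text.
final class ContainerSketch {
    static final class ObjectContainer {
        final String name;
        private final Map<String, String> nameToId = new HashMap<>();

        ObjectContainer(String name) {
            this.name = name;
        }

        /** Registers a child; returns false if the name is already taken in this container. */
        boolean put(String objectName, String objectId) {
            if (nameToId.containsKey(objectName)) {
                return false;
            }
            nameToId.put(objectName, objectId);
            return true;
        }
    }

    static final class RootContainer {
        private final Map<String, ObjectContainer> idToContainer = new HashMap<>();

        /** Stores the object in the target container and remembers where it went. */
        boolean putObject(ObjectContainer target, String objectName, String objectId) {
            if (!target.put(objectName, objectId)) {
                return false;
            }
            idToContainer.put(objectId, target);
            return true;
        }

        /** Finds the container of an object by ID alone, or null if the ID is unknown. */
        ObjectContainer containerOf(String objectId) {
            return idToContainer.get(objectId);
        }
    }

    public static void main(String[] args) {
        RootContainer root = new RootContainer();
        ObjectContainer foo = new ObjectContainer("foo");
        root.putObject(foo, "report.pdf", "id-1");
        System.out.println("id-1 is stored in container " + root.containerOf("id-1").name);
    }
}
```

The same object name may be used in two different containers, because each container keeps its own name mapping; only the ID is unique across the whole root container.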

⁹ The left-hand side of the figure shows the constellation of the classes as they are implemented in Java. The right-hand side is a simplified version that shows the constellation on the level of the models.

¹⁰ CdmiObjectContainers inherit from the CdmiRootContainer of the user, and the root container inherits from the Cloud default metadata setting when it is created.

Figure 4.2.: Relationship between Objects and Containers⁹ (class diagram with two views. Generics: Entity with Metadata; Container, which has children <T extends Entity>; Root Container, Object Container and Object. Simplified associations: Root Container, Object Container, Object and Metadata, connected via CdmiID, ObjectID-Container-Mapping and Object-Container-Mapping)

4.2.1.4. CDMI Metadata and Cloud Characteristics

The Cloud, containers and objects store metadata, which is a simple key-value storage. Metadata can be inherited from parents by newly created children, but some metadata is excluded from inheritance (e.g. the capabilities of the Cloud are represented as metadata values, but are not inherited by root containers). The Cloud user can read and modify the metadata of objects and containers, depending on the configuration of the capabilities of the Cloud provider. Typically the number of key-value pairs per object and the length of the value are limited. All values are stored as UTF-8 characters.

Table 4.1. Implemented CDMI Metadata

| Key | Value | Cloud, Root Container, Object |
|---|---|---|
| SIZE | size of object in bytes | X |
| TYPE | e.g. MIME type | X |
| CREATED_AT | creation time-stamp | X X |
| LAST_WRITE_ACCESS | last modification time-stamp | X X X |
| LOCATION | geographical position | X X X |
| NUM_REPLICA | number of replications of object | X X X |
| NUM_VERSIONS | number of old versions to keep | X X X |
| MAX_OBJECT_SIZE | max. allowed size of container in bytes | X X X |
| MAX_CHILD_COUNT | max number of children per container | X X X |

The required space to store the metadata is modeled as the number of bytes needed to store the string concatenation of all keys, all values and two separator characters for every pair.
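The size rule for metadata can be sketched in a few lines. The class and method names are hypothetical, and the sketch assumes that keys are UTF-8 encoded like the values and that each of the two separator characters occupies exactly one byte.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of the metadata size rule; names are hypothetical.
final class MetadataSize {
    // Assumption: each of the two separator characters per pair occupies one byte.
    private static final int SEPARATOR_BYTES_PER_PAIR = 2;

    /** Returns the number of bytes needed for all keys, values and separators. */
    static long sizeInBytes(Map<String, String> metadata) {
        long bytes = 0;
        for (Map.Entry<String, String> entry : metadata.entrySet()) {
            bytes += entry.getKey().getBytes(StandardCharsets.UTF_8).length;
            bytes += entry.getValue().getBytes(StandardCharsets.UTF_8).length;
            bytes += SEPARATOR_BYTES_PER_PAIR;
        }
        return bytes;
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("SIZE", "1024");
        metadata.put("TYPE", "text/plain");
        System.out.println(sizeInBytes(metadata) + " bytes of metadata");
    }
}
```

Counting encoded bytes instead of characters matters for values outside the ASCII range, which need more than one byte in UTF-8.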

All these metadata items are created by the Cloud provider, but users can introduce new kinds of metadata simply by putting a new key-value pair into the metadata, if the capabilities allow users to modify metadata.


The Cloud characteristics extend a simple key-value storage with some convenience methods to access the data. The characteristics extend the set of possible keys by the following: MIN_BANDWIDTH, MAX_LATENCY, PROVIDER_NAME, AVAILABLE_CAPACITY, STORAGE_COSTS, DOWNLOAD_COSTS, UPLOAD_COSTS, CAPABILITY_CREATE_CONTAINER, CAPABILITY_DOMAINS, CAPABILITY_EXPORT_ISCSI, CAPABILITY_EXPORT_NFS, CAPABILITY_EXPORT_WEBDAV, CAPABILITY_LIST_CHILDREN, CAPABILITY_MOD_METADATA, CAPABILITY_NOTIFICATIONS, CAPABILITY_QUERY, CAPABILITY_QUEUES, MAX_METADATA_ITEMS, MAX_MEDATADA_ITEM_SIZE, CAPABILITY_DELETE_CONTAINER and CAPABILITY_READ_METADATA.

These characteristics are then used by the MetaBroker to find a provider that fulfills all the SLA requirements (see 4.3.3).

4.2.2. Internal Storage Models

The last section describes containers, objects and metadata, which are classes that canbe used to model requests and responses between the user and the Cloud. Models thatare required to simulate the processes inside the Cloud are described in the followedsubsections.

4.2.2.1. Blob and Bloblocators

One blob can be seen as a file, that is, information that is written on a physical medium. One object has one or multiple blobs (depending on the number of replicas). Two blobs that belong to the same object cannot be stored on the same disk. Instances of BlobLocator map one location (server and disk ID) to one object ID. Object containers manage the locations of their objects: one list of BlobLocator instances is stored for each object in a container.

4.2.2.2. Servers and Hard Drives

Figure 4.3.: CDMI Object, Blob and BlobLocators (class diagram with the classes ObjectStorageServer, ObjectStorageBlobLocator, ObjectStorageBlob, ObjectStorageDrive and CdmiDataObject)

Every hard drive (disk) has to be attached to exactly one server. Hard drives have to implement the interface IObjectStorageDrive, which models storage drives more accurately than CloudSim. Capacity, used storage and blob sizes are modeled as Long, which allows file sizes from 1 byte to 8 exabytes to be modeled. Every disk has a device name that has to be unique within a server system (e.g. /dev/sda1). Write latency and read latency (in ms) as well as the maximal write and read throughput (in bytes per ms) can be modeled independently.

Servers manage disks and can either decide where to store a blob or store a blob on a given disk. For this purpose, ObjectStorageServer provides a method to probe disks. This operation returns the names of all disks that have enough capacity left to store a blob. The method optionally takes a list of drives that will be excluded from the disk probe, in order to enforce the policy that no two blobs of a single object can be stored on the same disk.
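The disk probe can be sketched as follows. The classes DriveSketch and ServerSketch and the method name probeDisks are hypothetical simplifications (only name, capacity and used space are modeled); the real ObjectStorageServer and IObjectStorageDrive are richer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical, reduced drive model: device name, capacity and used space in bytes. */
final class DriveSketch {
    final String name;
    final long capacity;
    final long used;

    DriveSketch(String name, long capacity, long used) {
        this.name = name;
        this.capacity = capacity;
        this.used = used;
    }

    long free() {
        return capacity - used;
    }
}

/** Hypothetical, reduced server model that only knows its attached drives. */
final class ServerSketch {
    private final List<DriveSketch> drives = new ArrayList<>();

    void attach(DriveSketch drive) {
        drives.add(drive);
    }

    /** Names of all drives that can hold blobSize bytes and are not excluded. */
    List<String> probeDisks(long blobSize, Set<String> excluded) {
        List<String> result = new ArrayList<>();
        for (DriveSketch d : drives) {
            if (!excluded.contains(d.name) && d.free() >= blobSize) {
                result.add(d.name);
            }
        }
        return result;
    }
}
```

Passing the names of the disks that already hold a blob of the same object as the excluded set is what keeps two replicas from ending up on the same disk.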

Hard drives and servers are time-aware resources, as described in 4.3.5. The connection from the hard drive to the system bus is modeled as an IO limitation that is independent of the one between server and network controller, because internal copy operations from one disk to another within the same server do not use any network bandwidth.


Figure 4.4.: Server and Disk IO limitations (diagram labels: Server; sda1, sda2, sda3; Disk Interface; System Bus; Cloud internal network interface; Cloud internal network; Cloud-internet connection)

4.2.3. Object Storage Cloud Model

This model is the central coordination entity of the STaaS simulation. Incoming requests from brokers are checked and executed. Every server, container and object is coordinated by this class.

The StorageCloud enforces all policies, such as the number of required replicas per object or the available capabilities. Every operation can be delayed for a certain time if any involved resource (server, hard drive, Cloud bandwidth) is not available at the moment. Operations like GET and PUT have a certain duration, depending on the current workload of all involved hardware.

Operations are billed according to a pricing model, which is updated every time a user performs an action (start request, upload object, download object, delete object). The price model can be the one of Amazon S3 (see section 2.5.2), but more complex price functions are possible as well. This work focuses on linear price models.

Every broker (see 4.3.1) represents one user, and the ID of the SimEntity is used for user identification. The Cloud checks for each incoming request whether the requesting user is already known. If this is not the case, a new root container is created. Every other type of request requires an existing root container. Root containers are strictly separated from each other and do not interfere with each other. Users can only access their own root container.

4.3. User Models

After describing the most important classes that are required to model the states and processes “inside” the Cloud, the next section gives an overview of the classes that represent the behavior of the user of the Cloud.

4.3.1. StorageBroker Class

The class StorageBroker represents a single user that is connected to exactly one Cloud. The main purpose is to generate CDMI requests that are then sent as simulation events via the CloudSim core simulation framework. The UsageSequence (4.3.2) defines the order and type of the requests to be generated. Lists of all created requests and their states (acknowledged, failed, succeeded) as well as the responses are stored in the broker for detailed analysis after the simulation. Furthermore a reference to all stored objects and all


containers is held in the broker. The StorageBroker class implements the TrackableResource interface by providing the metrics total number of events, events per second and events per minute. In addition, all metrics that are provided by the UsageHistory of the associated cloud are provided.

All UserRequest instances are stored in a queue and are enqueued at the end by default. Another method allows requests to be enqueued at the beginning of the queue as well. New UserRequest instances can be generated during runtime. The broker is able to send CDMI requests asynchronously or synchronously (it waits for the response to a request before sending the next one). Another synchronization mechanism is the barrier UserRequest. This request forces the broker to wait until all running operations have either failed or succeeded.

4.3.2. UsageSequence Class

One UsageSequence represents a series of UserRequest instances in a defined order plus an instance of StorageCloudSLARequest. Each request may depend on previous requests¹¹. Thus the order of the UserRequest instances inside the UsageSequence is critical. No dependencies between two instances of UsageSequence are allowed, therefore the order of execution between UsageSequence instances is irrelevant.

4.3.3. Service Level Agreements

The StorageCloudSLARequest class defines requirements that have to be considered when the MetaStorageBroker chooses a Cloud provider. Service level agreements (SLA) are therefore modeled as a set of predicates (SLARequirement) that can be combined via and and or operations. Each SLARequirement provides the method match, which takes an instance of StorageCloudCharacteristics and returns true if the requirements are fulfilled and false otherwise. There are some predefined SLARequirement subclasses like

• SupportsCapability

• DoesNotSupportCapability

• MaximumCharacteristicsValue : Checks if a numeric characteristics property (e.g. upload price per GB) does not exceed a given threshold.

• MinimumCharacteristicsValue

• CharacteristicMatchesString : Checks if a characteristics property matches a given string.
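How such predicates compose can be sketched in Java. The types CharacteristicsSketch and RequirementSketch are hypothetical stand-ins for StorageCloudCharacteristics and SLARequirement, and the factory method names are invented for illustration. It is also an assumption of this sketch that a missing characteristic does not fulfill a requirement.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for StorageCloudCharacteristics: a plain key-value map. */
final class CharacteristicsSketch {
    private final Map<String, Object> values = new HashMap<>();

    CharacteristicsSketch set(String key, Object value) {
        values.put(key, value);
        return this;
    }

    Object get(String key) {
        return values.get(key);
    }
}

/** Hypothetical stand-in for SLARequirement: a predicate that can be combined. */
interface RequirementSketch {
    boolean match(CharacteristicsSketch c);

    default RequirementSketch and(RequirementSketch other) {
        return c -> match(c) && other.match(c);
    }

    default RequirementSketch or(RequirementSketch other) {
        return c -> match(c) || other.match(c);
    }

    /** True only if the capability is present and set to true. */
    static RequirementSketch supportsCapability(String key) {
        return c -> Boolean.TRUE.equals(c.get(key));
    }

    /** True only if the numeric property exists and does not exceed the threshold. */
    static RequirementSketch maximumValue(String key, double threshold) {
        return c -> c.get(key) instanceof Number
                && ((Number) c.get(key)).doubleValue() <= threshold;
    }
}
```

Because and and or return a new predicate, an arbitrarily nested requirement still exposes a single match call.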

The class StorageCloudSLARequest provides the following methods in order to create a complex SLA request via the builder pattern: minBandwidth(double), maxLatency(double), maxStorageCost(double), maxUploadCost(double), maxDownloadCost(double), minCapacity(long), maxContainerSizeAtLeast(long), maxObjectSizeAtLeast(long), locationIs(String), locationIsIn(List<String>), hasNoObjectSizeLimit(), hasNoContainerSizeLimit(), canCreateContainers(), canDeleteContainers(), canModifyMetadata()

If more than one Cloud provider fulfills all SLA requirements, a ranking algorithm is needed to compare the different providers and to choose the one with the best match. The StorageMetaBroker chooses the best Cloud according to policies that are modeled with instances of SLACloudRater, which provide a score method that takes an instance of StorageCloudCharacteristics and returns a double which can be negative, zero or

¹¹ e.g. the GET operation can only succeed if the corresponding element has been uploaded via a PUT beforehand.


positive. The higher the score, the better the match of the characteristics. Multiple instances of SLACloudRater can be combined (summation of all scores). Some tuning may be required in order to achieve a reasonable rating with multiple SLACloudRater instances, because the ranges of the scores of the different raters have to be similar. These fundamental rating classes are implemented:

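1. RateBoolCharacteristics

\[
\text{score} =
\begin{cases}
\text{positiveScore} & \text{if characteristic} = \text{true} \\
\text{negativeScore} & \text{if characteristic} = \text{false} \\
\text{neutralScore} & \text{if characteristic non-existent}
\end{cases}
\]

2. RateCharacteristicsWithScale

\[
\text{score} =
\begin{cases}
\text{neutralScore} & \text{if characteristic non-existent} \\
\text{scale} \cdot \text{value} & \text{otherwise}
\end{cases}
\]

3. RateCharacteristicsWithInvers

\[
\text{score} =
\begin{cases}
\text{neutralScore} & \text{if characteristic non-existent} \\
\frac{1}{\text{value}} & \text{otherwise}
\end{cases}
\]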
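The three score functions and their summation can be sketched as follows. RaterSketch and RaterSketches are hypothetical names, and the characteristics are reduced to a plain map. It is an assumption of this sketch that a value of the wrong type is treated like a non-existent characteristic.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for SLACloudRater, scoring a key-value view of the characteristics. */
interface RaterSketch {
    double score(Map<String, Object> characteristics);
}

final class RaterSketches {

    /** positive for true, negative for false, neutral if the key is missing. */
    static RaterSketch bool(String key, double positive, double negative, double neutral) {
        return c -> {
            Object v = c.get(key);
            if (!(v instanceof Boolean)) {
                return neutral;
            }
            return ((Boolean) v) ? positive : negative;
        };
    }

    /** scale * value, or neutral if the key is missing. */
    static RaterSketch withScale(String key, double scale, double neutral) {
        return c -> {
            Object v = c.get(key);
            return (v instanceof Number) ? scale * ((Number) v).doubleValue() : neutral;
        };
    }

    /** 1 / value, or neutral if the key is missing. */
    static RaterSketch withInverse(String key, double neutral) {
        return c -> {
            Object v = c.get(key);
            return (v instanceof Number) ? 1.0 / ((Number) v).doubleValue() : neutral;
        };
    }

    /** Combined rating: the sum of all individual scores. */
    static double total(List<RaterSketch> raters, Map<String, Object> c) {
        double sum = 0;
        for (RaterSketch r : raters) {
            sum += r.score(c);
        }
        return sum;
    }
}
```

The inverse rater rewards small values such as prices, while the scaled rater rewards large values such as bandwidth, which is why the ranges of both usually need tuning before they are summed.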

The two described methods (requirement predicates and ratings) allow the modeling of SLA requirements as seen in the following example:

    StorageCloudSLARequest SLA = new StorageCloudSLARequest();
    SLA.canCreateContainers().
        canDeleteContainers().
        hasNoContainerSizeLimit().
        minCapacity(FileSizeHelper.toBytes(10, GIGA_BYTE)).
        orderByPrice().
        addRating(new RateCharacteristicsWithScale(MIN_BANDWIDTH, 0.1,
            "bandwidth * 0.1", 0));

4.3.4. MetaStorageBroker Class

The meta broker can be used to run multiple UsageSequence instances on different Clouds. To do so, the MetaStorageBroker chooses the best matching Cloud for every UsageSequence by using the SLARequirement which is attached to every UsageSequence. Before rolling out all UserRequest instances, the meta broker starts one new instance of StorageCloudBroker for every known Cloud provider and enqueues a UserRequest with the operation code DISCOVER_CLOUD, which prompts brokers to retrieve and store the latest available Cloud characteristics from their associated clouds. After all discovery requests have returned successfully, the StorageMetaBroker can choose the best matching Cloud.

For this purpose, the meta broker calls the already described match function of the StorageCloudSLARequest instance for every received set of Cloud characteristics (see 4.3.3). Usually the SLA requirements are composed with and and/or or statements, so that only a single method call of the meta broker is necessary. All Clouds that matched the SLA requirement predicates are then scored, using the previously described SLACloudRater. The scores of the different rating policies are summed up for each set of Cloud characteristics, and the Clouds are then sorted by the overall score. The Cloud with the highest score is the best match and is therefore chosen for the sequence. All brokers that are not connected to the chosen Cloud are shut down. The UserRequest instances in the UsageSequence are then forwarded to the remaining StorageBroker instance. The meta broker stores mappings from the ID of each UsageSequence to the chosen Cloud ID and the associated broker ID.
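The selection step (filter by the SLA predicate, sum the scores, take the maximum) can be sketched as below. CloudSelectionSketch and its method choose are hypothetical; the standard interfaces Predicate and ToDoubleFunction stand in for the SLA request and the raters, and returning -1 for "no Cloud matches" is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.function.ToDoubleFunction;

final class CloudSelectionSketch {

    /**
     * Returns the ID of the matching Cloud with the highest summed score,
     * or -1 if no Cloud fulfills the requirement.
     */
    static int choose(Map<Integer, Map<String, Object>> characteristicsByCloudId,
                      Predicate<Map<String, Object>> requirement,
                      List<ToDoubleFunction<Map<String, Object>>> raters) {
        int bestId = -1;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<Integer, Map<String, Object>> e : characteristicsByCloudId.entrySet()) {
            Map<String, Object> c = e.getValue();
            if (!requirement.test(c)) {
                continue; // the Cloud does not fulfill the SLA predicates
            }
            double score = 0;
            for (ToDoubleFunction<Map<String, Object>> r : raters) {
                score += r.applyAsDouble(c);
            }
            if (score > bestScore) {
                bestScore = score;
                bestId = e.getKey();
            }
        }
        return bestId;
    }
}
```

With equal scores the first Cloud encountered wins in this sketch; the thesis does not state how ties are broken.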


Figure 4.5.: Timeaware Resource Utilization

4.3.5. Latency Models

CloudSim provides a static model for latency calculations between two SimEntity instances. Data is provided in BRITE notation¹². Delay and bandwidth are stored in a matrix, which enables the modeling of links that provide different bandwidths in different directions. This model is, however, not sufficient to imitate realistic behavior of the storage Cloud.

The interface TimeawareResourceLimitation is a more accurate model to simulate the use of bandwidth by multiple operations that run concurrently. Consider figure 4.5. At first there are three operations that begin at t1, t3 and t5 (all these events are in the future). Operations one and three use only 60% of the available capacity of the resource, whereas operation two uses 100%. Let operation four be an operation that is to be scheduled for the same resource: it has no restrictions and could use up to 100% of the resource, but only 40% are left from t1 to t2. Hence operation four is scheduled with different execution speeds (40% from t1 to t2, 100% from t2 to t3, a pause from t3 to t4 and so on). Now let operation five be scheduled for t1 as well. The operation uses 30% of the available capacity. Because no capacity is left, the operation is delayed until t6, when some resources are available again.

TimeAwareResourceLimitation provides two methods: use(double amount, [double maxRate], [long startAt]) and getFirstFreeTimeSlot(long time). The first one returns a sequence of intervals, each of which denotes a time slot together with the rate at which the requested operation can be performed during that slot. The second method calculates the minimal delay before an operation can be started.
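The scheduling of operation four can be reproduced with a strongly simplified sketch. TimeAwareLimitSketch and TimeSlotSketch are hypothetical classes, not the thesis implementation: the sketch assumes integer millisecond boundaries, a constant rate within each slot, and it lets an operation run at whatever rate is free (the variable-speed case of operation four, not the fixed-rate case of operation five). In this sketch getFirstFreeTimeSlot returns an absolute point in time rather than a delay. The helper reserve is added only to set up already scheduled operations.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** One interval [start, end) in ms during which a fixed rate is used. */
final class TimeSlotSketch {
    final long start;
    final long end;
    final double rate;

    TimeSlotSketch(long start, long end, double rate) {
        this.start = start;
        this.end = end;
        this.rate = rate;
    }
}

/** Hypothetical, simplified stand-in for a time-aware resource limitation. */
final class TimeAwareLimitSketch {
    private static final double EPS = 1e-9;
    private final double capacity; // maximum rate of the resource
    private final List<TimeSlotSketch> reserved = new ArrayList<>();

    TimeAwareLimitSketch(double capacity) {
        if (capacity <= 0) throw new IllegalArgumentException("capacity must be positive");
        this.capacity = capacity;
    }

    /** Registers an operation that has already been scheduled. */
    void reserve(long start, long end, double rate) {
        reserved.add(new TimeSlotSketch(start, end, rate));
    }

    private double freeAt(long t) {
        double used = 0;
        for (TimeSlotSketch s : reserved) {
            if (s.start <= t && t < s.end) used += s.rate;
        }
        return capacity - used;
    }

    /** Next point in time after t at which the utilization may change, or null. */
    private Long nextBoundary(long t) {
        TreeSet<Long> bounds = new TreeSet<>();
        for (TimeSlotSketch s : reserved) {
            bounds.add(s.start);
            bounds.add(s.end);
        }
        return bounds.higher(t);
    }

    /** First point in time >= time at which some capacity is free. */
    long getFirstFreeTimeSlot(long time) {
        long t = time;
        while (freeAt(t) <= EPS) {
            t = nextBoundary(t); // non-null: a reservation covering t ends later
        }
        return t;
    }

    /** Plans and reserves the slots needed to process amount, starting at startAt. */
    List<TimeSlotSketch> use(double amount, double maxRate, long startAt) {
        if (maxRate <= 0) throw new IllegalArgumentException("maxRate must be positive");
        List<TimeSlotSketch> plan = new ArrayList<>();
        double remaining = amount;
        long t = startAt;
        while (remaining > EPS) {
            t = getFirstFreeTimeSlot(t);
            double rate = Math.min(maxRate, freeAt(t));
            long needed = (long) Math.ceil(remaining / rate); // last slot may be rounded up
            Long next = nextBoundary(t);
            long end = (next == null) ? t + needed : Math.min(next, t + needed);
            plan.add(new TimeSlotSketch(t, end, rate));
            remaining -= rate * (end - t);
            t = end;
        }
        reserved.addAll(plan);
        return plan;
    }
}
```

With a capacity of 1.0 and reservations of 0.6, 1.0 and 0.6 on the intervals [0, 10), [20, 30) and [40, 50), a request for 20 units starting at 0 is planned at rate 0.4 until 10, at rate 1.0 until 20, paused until 30 and finished at rate 1.0, which mirrors the behavior described for operation four.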

4.3.6. Request Layers

Messages between different types of entities are modeled with different classes as shown in figure 4.6. The user of the simulation creates a set of UserRequest instances and one instance of StorageCloudSLARequest, which are wrapped in a UsageSequence and sent to the StorageCloudMetaBroker. This entity will create an instance of UserMetaRequest to retrieve the latest Cloud characteristics such as price and available capacity. As soon as the broker has received more UserRequest instances out of the UsageSequence, it will start to generate CDMIRequest instances that are sent towards the Cloud. The Cloud itself

¹² http://www.cs.bu.edu/brite/


Figure 4.6.: Request Layers (entities: StorageCloudMetaBroker, StorageCloudBroker, StorageCloud; messages: UsageSequence, UserRequest, UserMetaRequest, CDMIRequest (e.g. GetContainerRequest, DeleteObjectRequest), CDMIResponse<T> (e.g. GetContainerResponse, GetObjectResponse), ScheduleEntry<T> (e.g. PutObjectScheduleEntry, GetContainerScheduleEntry), where T extends CDMIRequest)

will create multiple ScheduleEntry instances in order to store the state for each request. The cloud-internal messaging is done via method invocations only. The ScheduleEntry will generate the corresponding CDMIResponse, which is then sent back to the broker.

4.3.7. CDMI Requests

Section 4.3.6 describes the different kinds of messages that are sent between instances during simulation runtime. This section deals with the messages that are transmitted between StorageCloudBroker and StorageCloud. For every kind of request there exists one class that inherits from CdmiRequest. Responses are modeled with the generic class CdmiResponse that takes <T extends CdmiRequest> as generic parameter. This makes it possible to create specialized responses that contain detailed information like an object, while it is not necessary to create a new class for every response type (for example DeleteObjectResponse and DeleteContainerResponse), because such responses can be modeled by a parameterized version of the generic response class. Another generic class, CloudScheduleEntry, allows the modeling of the state inside the Cloud. For example, a PutObjectRequest has to perform multiple store operations on different disks to succeed. The state of such processes is stored in the schedule entries. As with the generic response class, this class may have specialized subclasses.
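The relation between requests, the generic response and a specialized response can be sketched as follows. All class names carry the suffix Sketch to mark them as hypothetical reductions; the fields shown are assumptions and not the fields of the real classes.

```java
/** Hypothetical, reduced base class of all CDMI requests. */
abstract class CdmiRequestSketch {
    final int userId;

    CdmiRequestSketch(int userId) {
        this.userId = userId;
    }
}

final class DeleteObjectRequestSketch extends CdmiRequestSketch {
    final String objectName;

    DeleteObjectRequestSketch(int userId, String objectName) {
        super(userId);
        this.objectName = objectName;
    }
}

final class GetObjectRequestSketch extends CdmiRequestSketch {
    final String containerName;
    final String objectName;

    GetObjectRequestSketch(int userId, String containerName, String objectName) {
        super(userId);
        this.containerName = containerName;
        this.objectName = objectName;
    }
}

/** Generic response: sufficient for requests whose response carries no payload. */
class CdmiResponseSketch<T extends CdmiRequestSketch> {
    final T request;
    final boolean succeeded;

    CdmiResponseSketch(T request, boolean succeeded) {
        this.request = request;
        this.succeeded = succeeded;
    }
}

/** Specialized response that carries additional information, here only the object size. */
final class GetObjectResponseSketch extends CdmiResponseSketch<GetObjectRequestSketch> {
    final long objectSizeInBytes;

    GetObjectResponseSketch(GetObjectRequestSketch request, boolean succeeded, long objectSizeInBytes) {
        super(request, succeeded);
        this.objectSizeInBytes = objectSizeInBytes;
    }
}
```

A delete response needs no class of its own: CdmiResponseSketch<DeleteObjectRequestSketch> already gives typed access to the originating request.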

4.3.7.1. Cloud Discovery Request

The CloudDiscoveryRequest is used to request the latest instance of StorageCloudCharacteristics in order to choose the best matching Cloud among multiple Cloud providers. The request has no parameters. The response contains a deep copy of the StorageCloudCharacteristics instance of the Cloud, which is completed with the maximum available bandwidth and latency between the requesting entity and the Cloud. The currently remaining capacity is calculated and included in the response. This operation never fails and returns immediately. It does not trigger any accounting mechanisms.

4.3.7.2. GET Container Request

The GetContainerRequest takes the name of the requested container as its only parameter. The response contains the metadata of the container and the CdmiID values of all objects


Figure 4.7.: Get Container State Diagram (states and decisions: user exists?; Accounting; retrieve user's root; requested container exists?; allow list children?; retrieve children; send container; send fail)

that are inside the container¹³.

The state diagram in figure 4.7 explains the retrieval of a container in detail. If the user does not exist, the operation will fail, because unknown users do not have any stored objects or containers in the Cloud that could be retrieved. If the user is known, the accounting method is triggered, which logs the different operational verbs. After the root container of the user has been determined, a check is performed whether the container with the given name is a direct child of the root container of the user. If this is not the case, the operation will fail. If the container exists, the response is prepared. The children are either included or not, depending on the Cloud characteristics (see 4.2.1.4).

This operation returns without delay and triggers accounting mechanisms.

4.3.7.3. GET Object Request

This request takes either the CdmiId of the object that is to be retrieved or the name of a container plus the name of the requested object. This offers a convenient way to access the object via its logical location and at the same time provides a way to retrieve objects without using any container names. The request object is equivalent to one of the following CDMI request strings: GET <root url>/object by id/<CdmiID of object> or GET <root url>/<container>/<name of object>. The corresponding response contains a deep copy of the instance of the CDMI object that is stored in the Cloud and thus provides access to the metadata. Internal information, like the location of the blobs, is not included in the response. Since StorageCloudSim is a simulation environment, no real data is stored in objects. The content is reduced to the information about the number of bytes that are required to store the object on disk.

The process of retrieving an object is described in the diagram in figure 4.8. The user has to exist in order to perform the operation. After the accounting mechanism has been invoked to log the GET operation, the object is retrieved either by its ID or via the name of the container. If the CdmiID is given, the mapping from CdmiID to ObjectContainer from

¹³ depending on the Cloud capabilities, see 4.2.1.4


Figure 4.8.: Get Object State Diagram (states and decisions: user exists?; Accounting; retrieve user's root; ID provided?; ID-Container mapping exists?; resolve container; Container exists; Name-ID mapping exists?; resolve object via ID; send ACK; choose Blob on Server with lowest utilization; calc & wait spec. delay; send object; send fail)

the rootContainer is used. The retrieved container can then retrieve the requested object from its child list. If the object can be retrieved successfully, an acknowledgement is sent back to the user. The Cloud will choose the blob (see 4.2.2.1) that is attached to the server with the currently lowest utilization. After that, the delay and duration for the operation on the hard disk, the server and the Cloud are calculated via the latency models described in 4.3.5. After the delay and duration have elapsed, the response is generated and sent back to the user. At the very end, the amount of downloaded data is updated via the invocation of the accounting mechanisms.

4.3.7.4. PUT Container Request

The PutContainerRequest takes a name and an instance of CdmiMetadata as parameters. The metadata may be ignored, depending on the capabilities of the Cloud (see 4.2.1.4).

As seen in figure 4.9, the Cloud will create a new instance of rootContainer if the requesting user has never sent a request before. After checking whether the user is allowed to create containers and whether the maximum number of children or the maximum amount of stored data of the root container has been reached, the new container is created with a new CdmiId, provided that there is no other container with the same name in the rootContainer. Some or all servers are assigned to the new container (see 4.2.1.3). After that, the newly created container is returned in the response.

The response is sent immediately. There exists a delete method as well, which is not described in detail.

4.3.7.5. PUT Object Request - creation

Figure 4.10 shows the different states during the creation of a new object.

The PUT operation can either perform the creation of a new object or start an update process of an existing object. In both cases, the request carries data that is uploaded from the user to the Cloud and then stored in blobs (see 4.2.2.1). As for every other operation, checks are performed before the operation starts, for example:


Figure 4.9.: Put Container State Diagram (states and decisions: user exists?; create root for user; initialize Accounting; Accounting; retrieve user's root; name taken by other container?; #children limit reached?; create Container; associate Servers; inherit Metadata; send container; send fail)

Figure 4.10.: Put Object State Diagram (creation) (states and decisions: user exists?; Accounting; retrieve user's root; container exists?; retrieve container; name provided?; name already taken?; #children limit exceeded?; storage limit exceeded?; obj. size exceeds limits?; create Object; choose storage locations; send ACK; store on next location; succeeded?; required #blobs reached?; calculate slowest delay & duration; wait; send Object)


• Existence of user

• Compliance with limits (number of objects, max. size of objects), if any

• Existence of container to put the object into

If either the object name that is included in the request is empty, or the name is not given to any existing object in the target container, the request is considered a creation request for a new object. Otherwise it is an update of an existing object. Both cases require sufficient storage capacity on n different disks in order to create n replicas (n is determined by the metadata of the container, see 4.2.1.3). If enough storage targets can be identified, the Cloud instance sends an acknowledgement to the user and schedules the transfer process. The target devices are chosen by sorting them according to a policy that can be defined. The default policy sorts the StorageServer instances by the lowest number of stored blobs of the object¹⁴.

The operation can fail while choosing those targets even if enough storage space is left in total, namely when that space is not distributed across enough disks to store n different replicas.
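The default placement policy and its failure case can be sketched as follows. PlacementSketch is a hypothetical helper: disks are identified by keys of the assumed form "server:disk", only free space is considered, and an empty result stands for a failed target search. Ties between equally ranked servers are broken by iteration order, which is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PlacementSketch {

    /**
     * Chooses n distinct disks for the blobs of one object, always preferring
     * a disk on the server that holds the fewest blobs of this object so far.
     * Returns an empty list if fewer than n suitable disks exist.
     */
    static List<String> choose(Map<String, Long> freeBytesByDisk, long blobSize, int n) {
        Map<String, Integer> blobsPerServer = new HashMap<>();
        List<String> chosen = new ArrayList<>();
        while (chosen.size() < n) {
            String best = null;
            int bestCount = Integer.MAX_VALUE;
            for (Map.Entry<String, Long> e : freeBytesByDisk.entrySet()) {
                String disk = e.getKey();
                if (chosen.contains(disk) || e.getValue() < blobSize) {
                    continue; // disk already used for this object, or too small
                }
                int count = blobsPerServer.getOrDefault(serverOf(disk), 0);
                if (count < bestCount) {
                    bestCount = count;
                    best = disk;
                }
            }
            if (best == null) {
                return new ArrayList<>(); // space is not distributed across enough disks
            }
            chosen.add(best);
            blobsPerServer.merge(serverOf(best), 1, Integer::sum);
        }
        return chosen;
    }

    private static String serverOf(String diskKey) {
        return diskKey.substring(0, diskKey.indexOf(':'));
    }
}
```

With two servers and three replicas, the second blob goes to the other server and the third to a still unused disk of either server; a single large disk cannot satisfy a request for two replicas no matter how much space it has.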

After all storage targets have been found, the delay and duration of every single transfer operation from Cloud to server and from server to hard drive are calculated. The slowest one determines the maximum possible speed of the upload from the user to the Cloud. The operation is checked against the IO limitations of the Cloud (see 4.3.5). The response that indicates the success of the operation is then sent with the corresponding delay.

A delete operation for objects exists, but is not described in further detail here.

4.3.7.6. PUT Object Request - update

The update of an existing object is exactly the same as the creation of an object, except that the old storage blobs are removed as soon as the new storage blobs have been stored successfully. Depending on the capabilities of the Cloud, the metadata (see 4.2.1.4) that is included in the PUT operation is merged into the metadata of the object.

4.3.8. UserRequest and UserMetaRequest Class

The previous section covered the different types of messages that are used to communicate between a StorageBroker and a Cloud instance. The following section is about the requests that are generated by the user code and forwarded to the StorageMetaBroker and to the StorageBroker in order to model the sequence of requests independently of any Cloud interface.

For every CloudRequest (as already described) there exists a value of the UserRequest operation field, which distinguishes between the different requests (PUT_OBJECT, PUT_CONTAINER, GET_OBJECT, GET_CONTAINER, DELETE_CONTAINER, DELETE_OBJECT, PAUSE, WAIT).

Aside from the operation field, the UserRequest class provides fields for objectName, containerName, objectID, rootURL, metadata, delay (in ms) and a size field. Apart from that, UserRequest instances can be blocking calls (wait until the operation has finished before proceeding to the next operation) or not (modeled with the WAIT operation code).

Static methods allow the convenient creation of UserRequest instances, for example:

¹⁴ Consider a Cloud with only two servers. The number of required replicas is 2. For a new object both servers are equally suitable, but as soon as one server has been chosen for the storage of one blob, the other server is ranked higher in order to achieve more physical distribution and thus a more fault-tolerant system. After the second blob has been scheduled to the other server, both servers are ranked equally again. For any further blob, the policy then enforces that a hard drive of one of the servers is taken which does not store one of the first two blobs.


    List<UserRequest> r = new ArrayList<>();
    r.add(UserRequest.blocking(UserRequest.putContainer("someContainer")));
    r.add(UserRequest.blocking(UserRequest.putObject("someContainer", "objectName", 1024)));
    r.add(UserRequest.downloadObject("someContainer", "objectName"));

As described in 4.3.2, multiple instances of UserRequest are enqueued in a UsageSequence in addition to a SLARequest, which is then forwarded to the StorageMetaBroker.

The MetaRequest class inherits from the UserRequest class and introduces a new operation field that prompts StorageBroker instances to retrieve the latest characteristics of their associated Cloud. In this way, the StorageMetaBroker can choose the best matching Cloud regarding specific SLA requests. MetaRequest instances are only created by the StorageMetaBroker and are then inserted at the very beginning of the UsageSequence as blocking requests.

4.4. Monitoring

Due to the purpose of this simulation, logging and measuring of metrics are crucial. There are three different methods to observe the behavior of different entities after the completion of the simulation.

• Logging as textual output of processes and decisions (see 6.4).

• TrackableResources that log a metric over time.

• Dump methods to print the state of a Cloud at the end of the simulation (see 6.1).

4.4.1. TrackableResource Class

A trackable resource has one or multiple streams of so-called samples. One sample is defined as s = (timestamp, value), where the time stamp is the simulation time in ms and the value is some metric, in most cases a double. These streams are identified by keys that are returned by the method getAvailableTrackingKeys(). Typical keys are for example:

• used storage (physical)

• used storage (virtual, in %)

• user’s debts

• total earnings

• total number of GET requests

• number of requests per minute

• ...

Sequences can then be processed with the SampleCombinator class, which provides fold methods such as:

• flatten(List<SampleStream<T>>) -> SampleStream<T> to preserve the order of samples without changing values

• sum(List<SampleStream<Double>>) -> SampleStream<Double>

• min(List<SampleStream<Double>>) -> SampleStream<Double>

• max(List<SampleStream<Double>>) -> SampleStream<Double>


• avg(List<SampleStream<Double>>) -> SampleStream<Double>

• divide(SampleStream<Double>, SampleStream<Double>) -> SampleStream<Double>

• samplesPerTime(duration, SampleStream<T>) -> SampleStream<Double> where duration is in ms

• filter(predicate, SampleStream<T>) -> SampleStream<T>
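Two of these fold methods can be sketched over plain lists. SampleSketch and SampleCombinatorSketch are hypothetical names, and a stream is reduced to a list of samples sorted by time stamp. The semantics of sum are an assumption of this sketch: each stream is treated as a step function that keeps its last value until its next sample and counts as 0 before its first sample.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

/** One sample s = (timestamp, value); the time stamp is simulation time in ms. */
final class SampleSketch {
    final long timestamp;
    final double value;

    SampleSketch(long timestamp, double value) {
        this.timestamp = timestamp;
        this.value = value;
    }
}

final class SampleCombinatorSketch {

    /** Merges all streams into one, ordered by time stamp; values stay unchanged. */
    static List<SampleSketch> flatten(List<List<SampleSketch>> streams) {
        List<SampleSketch> out = new ArrayList<>();
        for (List<SampleSketch> s : streams) {
            out.addAll(s);
        }
        out.sort(Comparator.comparingLong((SampleSketch x) -> x.timestamp)); // stable sort
        return out;
    }

    /** Sum of all streams at every time stamp at which any stream has a sample. */
    static List<SampleSketch> sum(List<List<SampleSketch>> streams) {
        TreeSet<Long> times = new TreeSet<>();
        for (List<SampleSketch> s : streams) {
            for (SampleSketch x : s) {
                times.add(x.timestamp);
            }
        }
        List<SampleSketch> out = new ArrayList<>();
        for (long t : times) {
            double total = 0;
            for (List<SampleSketch> s : streams) {
                double last = 0; // assumed value before the first sample
                for (SampleSketch x : s) {
                    if (x.timestamp <= t) {
                        last = x.value;
                    }
                }
                total += last;
            }
            out.add(new SampleSketch(t, total));
        }
        return out;
    }
}
```

Summing the used-storage streams of all disks of a server in this way would yield the used storage of the server over time.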

4.4.2. EventTracker and ResourceUsageHistory Class

The EventTracker class implements the TrackableResource interface. SampleStream entries are of the type Tuple<Long, T>, where T can be for example a CdmiRequest or a double value. The offered keys (see 4.4.1) are NUM_EVENTS_TOTAL, NUM_EVENTS_PER_MINUTE and NUM_EVENTS_PER_SECOND. The EventTracker can be used to monitor events that are not measured in discrete values.

In contrast, ResourceUsageHistory implements the TrackableResource interface by defining a sample as Tuple<Long, Double>, so that time-discrete changes of numeric values, like the used bandwidth of an ObjectStorageServer, can be monitored over time. ResourceUsageHistory instances only provide one key, which is defined as a parameter of the constructor.

4.4.3. Accounting

The interface IUsageHistory defines methods that are called every time a Cloud user sends a request. The Cloud instance creates an IUsageHistory instance for every user. The following methods are provided by the interface:

    void DownloadTraffic(long size);
    void UploadTraffic(long size);
    void query(CdmiOperationVerbs verb);
    void updateCurrentlyUsedSpace(long size);
    void endAccountingPeriod();

The abstract class UsageHistory implements these methods without calculating the costs, which is done by other classes. Instead, all user actions are logged by using multiple instances of ResourceUsageHistory (debt history, upload and download traffic) and EventTracker (one for every query type). Two subclasses are SimplePricing, which multiplies traffic and used storage by a cost factor, and AmazonUS, which provides a cost model that is equivalent to the one described in 2.5.2.
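A linear price model in the spirit of SimplePricing can be sketched as follows. SimplePricingSketch is a hypothetical class; in particular, it is an assumption of this sketch that costs are billed when an accounting period ends, that stored bytes are charged once per period, and that the traffic counters are reset afterwards.

```java
/** Hypothetical linear price model: cost factors multiplied by traffic and used storage. */
final class SimplePricingSketch {
    private final double costPerUploadedByte;
    private final double costPerDownloadedByte;
    private final double costPerStoredBytePerPeriod;

    private long uploaded;   // bytes uploaded in the current period
    private long downloaded; // bytes downloaded in the current period
    private long usedSpace;  // bytes currently stored
    private double debt;     // accumulated costs of all finished periods

    SimplePricingSketch(double costPerUploadedByte,
                        double costPerDownloadedByte,
                        double costPerStoredBytePerPeriod) {
        this.costPerUploadedByte = costPerUploadedByte;
        this.costPerDownloadedByte = costPerDownloadedByte;
        this.costPerStoredBytePerPeriod = costPerStoredBytePerPeriod;
    }

    void uploadTraffic(long size) {
        uploaded += size;
    }

    void downloadTraffic(long size) {
        downloaded += size;
    }

    void updateCurrentlyUsedSpace(long size) {
        usedSpace = size;
    }

    /** Bills the traffic of the period plus the currently used space, then resets the traffic. */
    void endAccountingPeriod() {
        debt += uploaded * costPerUploadedByte
                + downloaded * costPerDownloadedByte
                + usedSpace * costPerStoredBytePerPeriod;
        uploaded = 0;
        downloaded = 0;
    }

    double debt() {
        return debt;
    }
}
```

A user who stops all traffic still accumulates storage costs in every following period, as long as the used space is not reduced.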

4.4.4. Report Generation

ReportGenerator instances take references to objects that provide the methods of the TrackableResource interface and extract all sample streams for all available keys. These streams are processed (values are rounded, magnitudes are changed, ...) and then written to a file, either as CSV (CVSGenerator, see listing 6.2 and 6.3 in appendix) or as LaTeX source to generate graphs (GraphGenerator, see appendix B).

4.5. Scenario Generation

In order to make simulations meaningful, scenarios need to include many different requests that benchmark the performance of the Cloud under heavy load. To fulfill this requirement, scenarios can be generated automatically and stored as XML files (simulations


are repeatable on these input data). The class UsageSequenceGenerator creates onevalid15 sequence of UserRequest instances.

Three statistical distributions are used to make the UserRequest instances realistic:

• fileSizeDistribution determines the size of objects that are created (1 KB .. 1 GB, uniform)

• intervalDistribution determines the idle time between two requests (5 ms .. 5 min, uniform)

• downloadProbability determines whether to download or upload an object. If the sampled value exceeds 0.5, a download is started; otherwise, an upload is initiated (0 .. 0.6, uniform)
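The three distributions listed above can be sketched as follows. This is a minimal illustration and not the UsageSequenceGenerator itself: the class RequestSamplingSketch and its method names are hypothetical, and java.util.Random replaces the CloudSim distribution classes. The sketch also makes one consequence of the listed parameters visible: a value drawn uniformly from 0 .. 0.6 exceeds 0.5 in one out of six cases, so roughly 17% of the requests are downloads.

```java
import java.util.Random;

/** Hypothetical sampling sketch for the three request distributions. */
class RequestSamplingSketch {
    static final long KB = 1024L;
    static final long GB = 1024L * 1024L * 1024L;

    /** Draws a value uniformly from [min, max). */
    static double uniform(Random rng, double min, double max) {
        return min + rng.nextDouble() * (max - min);
    }

    /** Object size in bytes, uniform between 1 KB and 1 GB. */
    static long sampleFileSize(Random rng) {
        return (long) uniform(rng, KB, GB);
    }

    /** Idle time in milliseconds, uniform between 5 ms and 5 min. */
    static long sampleIdleMillis(Random rng) {
        return (long) uniform(rng, 5, 5 * 60 * 1000);
    }

    /** A download is started if the sample from [0, 0.6) exceeds 0.5, otherwise an upload. */
    static boolean sampleIsDownload(Random rng) {
        return uniform(rng, 0.0, 0.6) > 0.5;
    }

    /** Empirical share of downloads over the given number of samples. */
    static double downloadShare(Random rng, int samples) {
        int downloads = 0;
        for (int i = 0; i < samples; i++) {
            if (sampleIsDownload(rng)) {
                downloads++;
            }
        }
        return (double) downloads / samples;
    }

    public static void main(String[] args) {
        Random rng = new Random(42); // fixed seed keeps the run repeatable
        System.out.println("file size [B]:  " + sampleFileSize(rng));
        System.out.println("idle time [ms]: " + sampleIdleMillis(rng));
        System.out.printf("download share: %.3f%n", downloadShare(rng, 100_000));
    }
}
```

A fixed seed plays the same role as the stored XML files: the generated sequence is random, but it can be reproduced.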

All distributions can be replaced by other distributions¹⁶. In the current version, only one container is created at the beginning of the scenario, but the generator can be modified to produce a more complex UsageSequence.

SequenceFileGenerator provides a command line interface (CLI) to create XML files of UsageSequences (see listings 6.8 and 6.9 in the appendix). Generated sequences are stored together with matching SLAs¹⁷. The exact configuration of this class is explained in the section Simulation Scenarios. An example configuration can be found in listing 6.7 in the appendix.

¹⁵ valid: objects are only uploaded into containers that have been created before, and files are only downloaded if they have been created before.

¹⁶ CloudSim provides classes for exponential, gamma, lognormal, Pareto, Weibull and Zipf distributions.

¹⁷ The SLAs contain requirements for the Cloud, so that enough storage capacity is provided to fulfill the sequence of requests. This value can, however, be disturbed by a random value, because users normally do not know how much data they will store in the future.


5. Evaluation

This chapter describes the experiments that were carried out using the previously presented simulation environment and explains the observed results.

5.1. Simulation Setup

All experiments ran on the very same input data, which was created only once and stored in files to make the experiments repeatable. The Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode) was used as the virtual machine.

Experiments were done with three different numbers of request sequences (UsageSequence, see 4.3.2): 50, 500 and 5000 sequences. To create varied requests, two different sequence models were used. On the one hand, a Cloud access pattern was used that models the storage of scientific backups¹: large amounts of data are only written, and no reads follow. These UsageSequences produce 1 to 100 GB of traffic, and objects can be up to 100 GB in size. The sizes of uploaded objects are uniformly distributed, as seen in figure 5.1. Accesses come in bursts of five PUT requests, followed by a pause of 5 to 10 minutes. Every scientific UsageSequence requires a Cloud with no restrictions on the size of a container. The minimal accepted limit for the object size is determined individually for every sequence. In real-life applications, users may not know exactly how much data they want to store, but estimations are possible. Thus, the SLA requirements (required storage space and minimal accepted object size limit) are disturbed by a random value that can be up to 25% of the actual value (at most $\mathit{requested} = \mathit{actual} \pm \mathit{actual} \cdot \frac{25}{100}$). Furthermore, the capabilities for creating and deleting containers are requested as well. The ranking of multiple SLA matches for the scientific access pattern is as follows:

$$\mathit{score} = \frac{1}{\mathit{costsPerStoredGB}} + \frac{1}{\mathit{costsPerUploadedGB}}$$
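The disturbance of the SLA values described before the ranking formula can be sketched as follows. This is only an illustration under stated assumptions: the class SlaDisturbanceSketch and the method disturb are hypothetical names, and the relative error is assumed to be uniformly distributed, which the text does not specify (it only gives the bound of 25%). The input value is the required space of sequence 0 from listing 6.6.

```java
import java.util.Random;

/** Hypothetical sketch of disturbing an SLA value by at most a given relative error. */
class SlaDisturbanceSketch {
    /**
     * Returns actual * (1 + e), where e is drawn from [-maxRelativeError, +maxRelativeError).
     * The uniform distribution of e is an assumption.
     */
    static long disturb(long actual, double maxRelativeError, Random rng) {
        double relativeError = (2.0 * rng.nextDouble() - 1.0) * maxRelativeError;
        return Math.round(actual * (1.0 + relativeError));
    }

    public static void main(String[] args) {
        Random rng = new Random(1);
        long actualBytes = 9_887_070_719L; // required space of sequence 0 in listing 6.6
        long requestedBytes = disturb(actualBytes, 0.25, rng);
        System.out.println("actual:    " + actualBytes);
        System.out.println("requested: " + requestedBytes);
    }
}
```

Because the requested value can be lower than the actual one, a sequence may later consume more storage than it announced in its SLA.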

Different Clouds are modeled with a simple pricing policy: billing is modeled by a single, constant pricing factor per category that is multiplied by the amount of uploaded, downloaded or stored data. This is required because the SLA modeling currently allows only simple pricing models (see 4.3.3).

¹ Called the scientific access pattern or scientific sequence from now on.

² Called the normal access pattern or normal sequence from now on.

Figure 5.1.: Distribution of the Size of single Objects in all scientific Access Patterns

On the other hand, there are access patterns that model a more general use of the Cloud². Objects are not only uploaded, but downloaded as well. The probability of a PUT operation is higher than that of a GET operation, as shown in figure 5.3. Objects are up to 1 GB in size (uniformly distributed, see figure 5.2), and the total traffic of the sequences is determined by a Gamma distribution (α = 3, β = 2). As for the scientific access pattern, this access pattern model requires the capabilities for deleting and creating containers and requests a minimal available capacity, which is disturbed as described above. The ranking of multiple SLA matches is as follows:

$$\mathit{score} = \frac{1}{\mathit{costsPerStoredGB}} + \frac{1}{\mathit{costsPerUploadedGB}} + \frac{1}{\mathit{costsPerDownloadedGB}}$$

Both access patterns combined produce a spectrum of sequences whose traffic varies from 1 KB to 100 GB. The distribution of the sizes is shown in figure 5.4.

5.2. Simulation Scenarios

5.2.1. Single Cloud Scenario

This experiment includes one Cloud provider, which was named Amazon AWS³. The offered service provides a total storage capacity of 72 TB. Because of the three-replica policy, only 24 TB can be offered to clients.

5.2.2. Multi-Cloud Scenario

To check whether the StorageMetaBroker works correctly, a multi-Cloud scenario was investigated. Three different Cloud providers were created (see table 5.1). In addition to the Amazon AWS Cloud, a second third-party provider is available that offers cheaper storage than Amazon AWS, but without support for object replicas. Therefore, its costs per downloaded gigabyte are much higher (ten times higher than for Amazon AWS). The third Cloud provider is an in-house setup, which may be based on hardware that has already been acquired and is currently unused. For legacy reasons, the limit for the size of each object is set to 16 GB, but the storage prices are very low, because existing LAN infrastructure can be used and there is no interest in generating financial profit by providing the hardware.

³ The specifications of this Cloud are not based on the actual characteristics of the service offered by Amazon.


Figure 5.2.: Distribution of the Size of single Objects in all normal Access Patterns

Figure 5.3.: Number of Request Types in all normal Access Patterns


Figure 5.4.: Distribution of Traffic per Sequence (both Types)

Table 5.1. Cloud Configurations compared

| | Amazon AWS | SCC intra | Swift Cloud |
|---|---|---|---|
| max. allowed object size | unlimited | 16 GB | unlimited |
| number of servers | 6 | 1 | 4 |
| disks per server | 6 | 3 | 4 |
| write rate | 156 MB/s | 64 MB/s | 156 MB/s |
| read rate | 156 MB/s | 156 MB/s | 156 MB/s |
| capacity | 2 TB | 1 TB | 2 TB |
| read latency | 8.5 ms | 9 ms | 8.5 ms |
| write latency | 9.5 ms | 11 ms | 9.5 ms |
| total physical capacity | 72 TB | 3 TB | 32 TB |
| number of replicas | 3 | 3 | 1 |
| Dollar per stored GB/billing period | 0.05$ | 0.04$ | 0.01$ |
| Dollar per uploaded GB | 0.0002$ | 0.0002$ | 0.0002$ |
| Dollar per downloaded GB | 0.01$ | 0.0002$ | 0.1$ |
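The capacity figures in table 5.1 can be checked with a short calculation. The sketch below is an illustration with hypothetical names (CloudCapacitySketch, physicalTb, usableTb). It assumes that the capacity row refers to a single disk, which is consistent with the total physical capacity given in the table, and that the capacity offered to clients is the physical capacity divided by the number of replicas, as stated for the single Cloud scenario (72 TB physical, 24 TB usable).

```java
/** Hypothetical capacity check for the Cloud configurations of table 5.1. */
class CloudCapacitySketch {
    /** Physical capacity in TB: servers x disks per server x capacity per disk. */
    static long physicalTb(int servers, int disksPerServer, int diskCapacityTb) {
        return (long) servers * disksPerServer * diskCapacityTb;
    }

    /** Capacity that can be offered to clients when every object is stored 'replicas' times. */
    static double usableTb(long physicalTb, int replicas) {
        return (double) physicalTb / replicas;
    }

    public static void main(String[] args) {
        String[] names = {"Amazon AWS", "SCC intra", "Swift Cloud"};
        int[][] config = { // servers, disks per server, capacity per disk in TB, replicas
            {6, 6, 2, 3},
            {1, 3, 1, 3},
            {4, 4, 2, 1},
        };
        for (int i = 0; i < names.length; i++) {
            long physical = physicalTb(config[i][0], config[i][1], config[i][2]);
            System.out.printf("%-12s physical %3d TB, usable %5.1f TB%n",
                    names[i], physical, usableTb(physical, config[i][3]));
        }
    }
}
```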

5.3. Simulation Results

5.3.1. Used Storage and SLA fulfillments

We conducted the experiments described above many times and computed the mean values each time. A mixture of both sequence types was used (out of 5000 sequences, 2486 are scientific and 2514 are normal sequences). First, we observed that both experiments (single and multi-Cloud) with 50 and 500 sequences succeeded⁴, but the experiment with 5000 mixed sequences cannot be executed completely in the single Cloud environment: a total of 4150 sequences are declined, because the storage capacity is exhausted.

The only purpose of the StorageMetaBroker in the single Cloud experiment is to check whether the Cloud can provide the required storage space. The broker cannot choose between different Cloud providers, because only one is available. However, the broker detects 4150 sequences that cannot be fulfilled, because no more storage is available. These sequences are not dispatched to any StorageCloudBroker instance, because no Cloud offers SLAs that fulfill the requested conditions. These sequences are declared as declined.

⁴ succeeded: no sequence is declined.


Figure 5.5.: Succeeded Requests and SLA Violations in single Cloud Experiment with 5000 Sequences

Figure 5.5 visualizes a subset of the measured number of succeeded requests on the single Cloud (blue), the number of failed requests (grey) and the number of declined sequences (orange), which starts to rise shortly before 28000 simulated minutes⁵. At the very same time, the number of succeeded requests stops rising.

In total, 15 requests fail in the single Cloud experiment with 5000 mixed sequences. Three reasons (or any combination of them) explain this behavior:

• Inaccurate SLA: A UsageSequence requests less capacity than it will consume. The Cloud storage may be 99.99% full, but the Cloud still approves the request, because the requested size fits into the remaining storage.

• Unsatisfiable policies: One reason could be the policy that forces the storage controller not to place the StorageBlobs of one object on the same disk. There may be enough physical storage available, but it is not distributed as required.

• Concurrent access: Clouds report the currently available storage when responding to a cloud discovery request, but discovery requests do not reserve any storage. Two storage capacity requests can both be approved, but if one of them fills all of the remaining storage, the other request cannot be fulfilled.
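The third reason in the list above can be made concrete with a small sketch. The class DiscoveryRaceSketch and its methods are hypothetical and far simpler than the simulated Cloud; the only property taken from the text is that a discovery request is answered from the currently available storage without reserving anything.

```java
/** Hypothetical sketch: discovery answers without reserving storage, so later uploads can fail. */
class DiscoveryRaceSketch {
    private long availableBytes;

    DiscoveryRaceSketch(long availableBytes) {
        this.availableBytes = availableBytes;
    }

    /** Discovery only compares against the currently free storage; nothing is reserved. */
    boolean discover(long requestedBytes) {
        return requestedBytes <= availableBytes;
    }

    /** The upload itself consumes storage and fails if not enough is left. */
    boolean store(long sizeBytes) {
        if (sizeBytes > availableBytes) {
            return false;
        }
        availableBytes -= sizeBytes;
        return true;
    }

    public static void main(String[] args) {
        DiscoveryRaceSketch cloud = new DiscoveryRaceSketch(100);

        // Both sequences ask before either of them has uploaded anything.
        boolean acceptedA = cloud.discover(60);
        boolean acceptedB = cloud.discover(60);

        // Sequence A fills most of the storage; sequence B no longer fits.
        boolean storedA = cloud.store(60);
        boolean storedB = cloud.store(60);

        System.out.println("A accepted: " + acceptedA + ", stored: " + storedA);
        System.out.println("B accepted: " + acceptedB + ", stored: " + storedB);
    }
}
```

In this sketch, sequence B is accepted during discovery but its upload fails, which corresponds to a request that fails although the SLA of its sequence was accepted. Reserving the requested capacity during discovery would avoid this at the price of holding storage for sequences that may never use it.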

To investigate the decisions of the StorageMetaBroker in a multi-Cloud environment, one can observe the amount of used storage in each Cloud over time. For this purpose, the simulation can output streams of samples to CSV files (see 4.4.4), which are then plotted as seen in figure 5.6. Each dot in the plot represents one sample; only samples that are at least one gigabyte apart from each other are displayed.

For the multi-Cloud experiment with 50 mixed input sequences, figure 5.6 shows that the Amazon AWS Cloud was never chosen by the StorageMetaBroker, because all requests could be fulfilled by the other two providers and the meta broker chooses the cheapest available Cloud (see table 5.2). The used storage capacity on the Swift provider is much higher than on the SCC-intra Cloud because of the restrictions stated in the SLA of the requests: the minimum accepted limit for the maximum object size rules out the SCC-intra Cloud for some scientific UsageSequences.

⁵ The number of declined sequences rises further, which is not displayed in the plot.


Figure 5.6.: Used Storage in multi-Cloud Experiment with 50 Sequences

Figure 5.7.: Used Storage in multi-Cloud Experiment with 5000 Sequences

The same experiment with 5000 mixed input sequences shows that the total number of accepted UsageSequences is higher than in the single Cloud scenario: 2986 sequences were declined because no SLA matched, compared to 4150 in the single Cloud experiment with the same number of input sequences (28.0% fewer declined sequences). In the multi-Cloud experiment, more total storage capacity is available (increase of 32.7%). The number of SLA violations⁶ increases by a factor of about 7.7 (from 15 to 115), which can be explained by the fact that more sequences are accepted in the multi-Cloud experiment, which gives more opportunities for SLA violations.

Figure 5.7 (which is a plot of the first 2000 sequences of the multi-Cloud experiment with 5000 sequences) visualizes the fact that multiple SLA violations occur when “switching” from one Cloud provider to another (the capacity of one Cloud provider is nearly exhausted, so the broker schedules future sequences to a provider that offers more storage space).

⁶ Number of requests that fail even though the SLA of the sequence was accepted.

Figure 5.8.: Total Cost per Experiment

As shown in figures 5.6 and 5.7 and table 5.2, the StorageMetaBroker chooses the best-suited Cloud provider wherever possible: if a sequence contains no object bigger than the limit of the SCC-intra Cloud, the broker chooses this provider, because it is the cheapest one. Sequences that contain bigger objects are scheduled on the Swift provider, because the price per uploaded and stored gigabyte results in the best score. As soon as the SCC-intra Cloud has no capacity left, sequences that are not scientific are scheduled onto the Amazon AWS Cloud, as the broker calculates a higher score for it due to the price per downloaded gigabyte. Scientific sequences are still scheduled onto the Swift Cloud, because its calculated score is the best for scientific sequences.

Table 5.2. Calculated StorageMetaBroker Score per Provider and Access Pattern

| Access Pattern | Amazon AWS | SCC intra | Swift Cloud |
|---|---|---|---|
| default | 5120 | 10025 | 5110 |
| scientific | 5020 | 5025 | 5100 |
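The scores in table 5.2 follow directly from the two ranking formulas in section 5.1 and the prices in table 5.1. The sketch below recomputes them; the class and method names (BrokerScoreSketch, scientificScore, normalScore) are hypothetical and not taken from the StorageMetaBroker. The row labeled default is computed with the formula of the normal access pattern, which includes the download term.

```java
/** Hypothetical recomputation of the broker scores from the prices in table 5.1. */
class BrokerScoreSketch {
    /** Ranking for the scientific access pattern: stored and uploaded data only. */
    static double scientificScore(double costsPerStoredGb, double costsPerUploadedGb) {
        return 1.0 / costsPerStoredGb + 1.0 / costsPerUploadedGb;
    }

    /** Ranking for the normal access pattern: adds the term for downloaded data. */
    static double normalScore(double costsPerStoredGb, double costsPerUploadedGb,
                              double costsPerDownloadedGb) {
        return scientificScore(costsPerStoredGb, costsPerUploadedGb) + 1.0 / costsPerDownloadedGb;
    }

    public static void main(String[] args) {
        String[] names = {"Amazon AWS", "SCC intra", "Swift Cloud"};
        double[][] prices = { // stored, uploaded, downloaded (Dollar per GB)
            {0.05, 0.0002, 0.01},
            {0.04, 0.0002, 0.0002},
            {0.01, 0.0002, 0.1},
        };
        for (int i = 0; i < names.length; i++) {
            double[] p = prices[i];
            System.out.printf("%-12s default %6d  scientific %6d%n", names[i],
                    Math.round(normalScore(p[0], p[1], p[2])),
                    Math.round(scientificScore(p[0], p[1])));
        }
    }
}
```

Since each term is the reciprocal of a price, a higher score means a cheaper provider, and the very low upload price (0.0002$ per GB) contributes 5000 to every score.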

5.3.2. Cost evaluation

One of the most interesting aspects is the cost that is billed when the sequences are run on the different constellations. Figure 5.8 shows that the multi-Cloud experiments generate lower costs than the single Cloud experiments for the runs with 50 and 500 sequences⁷ (60% lower costs in the 500 sequence experiment). This can be explained by the capacity of the SCC-intra Cloud, which provides resources for a fraction of the costs of the Amazon AWS Cloud. The total costs for the 5000 sequence experiment, however, show that the multi-Cloud constellation can generate higher costs, which is still in the interest of the user, because far more sequences are processed than in the single Cloud experiment. Moreover, the multi-Cloud setup provides a lower price per accepted sequence in each experiment, as seen in table 5.3.

⁷ Both types of sequences combined.


Table 5.3. Average Price per accepted Sequence in Cents

| Number of input Sequences | Single Cloud | Multi Cloud |
|---|---|---|
| 50 | 1.50 | 0.70 |
| 500 | 1.49 | 0.58 |
| 5000 | 1.49 | 1.28 |
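The relative saving per accepted sequence can be derived from table 5.3 with a one-line formula. The sketch below is only a worked calculation; the class name SequencePriceSavingSketch and the method savingPercent are hypothetical.

```java
/** Hypothetical worked calculation of the relative saving per accepted sequence (table 5.3). */
class SequencePriceSavingSketch {
    /** Saving of the multi-Cloud price relative to the single Cloud price, in percent. */
    static double savingPercent(double singleCloudCents, double multiCloudCents) {
        return 100.0 * (singleCloudCents - multiCloudCents) / singleCloudCents;
    }

    public static void main(String[] args) {
        int[] inputSequences = {50, 500, 5000};
        double[][] centsPerSequence = { // single Cloud, multi-Cloud
            {1.50, 0.70},
            {1.49, 0.58},
            {1.49, 1.28},
        };
        for (int i = 0; i < inputSequences.length; i++) {
            System.out.printf("%4d sequences: %.1f%% cheaper per accepted sequence%n",
                    inputSequences[i],
                    savingPercent(centsPerSequence[i][0], centsPerSequence[i][1]));
        }
    }
}
```

For 500 input sequences this gives a saving of roughly 61%, which is in line with the reduction of about 60% reported for the total costs of that experiment.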

5.3.3. Effect of the used Sequence Type

All previously described experiments were done with a mixture of both types of sequences (see 5.1). This section describes the impact of the chosen sequence type. For this purpose, four experiments were conducted: only scientific or only normal input sequences, each on a single Cloud and in a multi-Cloud environment. As we submitted 250 sequences, no sequence was rejected in any experiment, because enough storage capacity could be provided by either the single Cloud or all Clouds together. Figure 5.9 and table 5.4 show that the multi-Cloud variant generates lower or the same costs as the single Cloud variant. The costs for the scientific sequences are the same on single and multi-Cloud (rounded to cents), whereas the difference for normal sequences is 4.52$ (a saving by a factor of 16). This can be explained by the fact that the scientific sequences have a tighter restriction in the SLA: the minimal accepted value for the maximum object size depends on the largest object of the sequence, which can be between 1 and 100 GB. Thus, the majority of scientific sequences cannot be scheduled on the SCC-intra Cloud, which is the decisive factor for the low costs of the normal sequences.

Figure 5.9.: Costs per Sequence Type with 250 Input sequences

Table 5.4. Matrix of total Costs for 250 Input Sequences in Dollar

| | Single Cloud | Multi-Cloud |
|---|---|---|
| Scientific | 2.62 | 2.62 |
| Normal | 4.82 | 0.3 |


6. Conclusion and future Work

Cloud computing and STaaS technologies are being developed and used more than ever. This work extends the existing Cloud simulation framework CloudSim by adding features to simulate STaaS providers and users. Policies and different hardware models as well as the network topology can be exchanged with little effort. The concurrent use of resources is modeled accurately in order to obtain realistic simulation results. The interface between broker and Cloud is modeled after the CDMI standard. Detailed models for storage disks, servers and storage controllers are provided and can be easily extended. Detailed monitoring allows users to investigate a broad spectrum of metrics from the inside of each simulated STaaS Cloud as well as from the user’s perspective.

Different access patterns can be modeled and used to generate sequences of requests. These sequences can be written to and read from files to provide random, but repeatable experiments.

In addition, different SLAs can be modeled (statically or via generator tools) and checked during the execution of experiments, such as certain capabilities of a Cloud or some restrictions (for example the physical location of the STaaS). A StorageMetaBroker enables simulations of multiple Clouds, where different UsageSequences can be dispatched to different Clouds, depending on the matched SLA prerequisites and a rank, which can be calculated based on the characteristics of each Cloud. With this feature, users are able to use multiple Clouds in a more differentiated manner to reduce the total costs (by up to 60%) and to fulfill more requests than is possible with only a single Cloud. Users can only save costs, however, if the SLAs of the request sequences are not very strict.

In the future, different blob location strategies will be implemented, for example the ring mechanism in OpenStack’s Swift, as well as more realistic pricing models for STaaS that are not only linear functions as proposed in this work.




Appendix

A. Additional figures

Figure A.1.: Average Response Time for mixed input Sequences

Figure A.2.: Distribution of Request Types in scientific Sequences


B. Monitoring Outputs

Listing 6.1: Beginning of Cloud dump file

cloud cloudfront.net/ located in us capabilities: {cdmi_export_iscsi: false, location: us, SLA cloud id: 2, cdmi_delete_container: true, cdmi_export_webdav: false, cdmi_modify_metadata: true, cdmi_list_children: true, max size: 9223372036854775807, cdmi_read_metadata: true, cdmi_export_nfs: false, cdmi_metadata_maxitems: 1024, cdmi_query: false, cdmi_create_container: true, cdmi_metadata_maxsize: 4096, SLA provider name: amazon-us, cdmi_notification: false, number replicas: 3, cdmi_queues: false, SLA download costs: 0.01, cdmi_domains: false, SLA storage costs: 0.05, SLA upload costs: 0.0002}
--User 4 with root container IDIBAFFIBE (74.856GB/74.856GB
----Container 'files' (HJIEBCBHJH), size 74.856GB (80376328271B), 1 children, metadata {cdmi_metadata_maxitems: 1024, cdmi_metadata_maxsize: 4096, number replicas: 3, created at: Wed Aug 14 15:17:38 CEST 2013, max size: 9223372036854775807, last write: Wed Aug 14 15:17:38 CEST 2013}
------Object '3je4omvxcoimmc7v' (DJIGIFJIJJ), size 74.856GB (80376328119B), metadata: {cdmi_metadata_maxitems: 1024, cdmi_metadata_maxsize: 4096, cdmi_size: 80376327963, created at: Wed Aug 14 15:17:38 CEST 2013, max size: 9223372036854775807, last write: Wed Aug 14 15:17:38 CEST 2013}
--User 5 with root container DCFECIGFBF (9.208GB/9.208GB
----Container 'files' (DACEHCCDJH), size 9.208GB (9887071026B), 1 children, metadata {cdmi_metadata_maxitems: 1024, cdmi_metadata_maxsize: 4096, number replicas: 3, created at: Wed Aug 14 15:17:38 CEST 2013, max size: 9223372036854775807, last write: Wed Aug 14 15:17:38 CEST 2013}
------Object 't17tetzlxv0jcmgr1' (IGGCGFIFHB), size 9.208GB (9887070874B), metadata: {cdmi_metadata_maxitems: 1024, cdmi_metadata_maxsize: 4096, cdmi_size: 9887070719, created at: Wed Aug 14 15:17:38 CEST 2013, max size: 9223372036854775807, last write: Wed Aug 14 15:17:38 CEST 2013}

Listing 6.2: Request Stats

started    verb    size    duration    delay
1376486810204    /files/k7-f4q3998lz79nl-mok (PUT object by user 58)    204194753    14    0
1376486811743    cloud characteristics discovery request by user 61    0    0    0
1376486816417    /files/ilngu7awiw (PUT object by user 58)    402434514    16    0
1376486820986    /files/ki59mvpczeppr (PUT object by user 58)    295521742    15    0
1376486834713    /files/tr87ulnlruuw2 (PUT object by user 58)    944811994    25    0
1376486849599    null files ilngu7awiw (GET Object by user 58)    402434660    11    0
1376486855395    /files/n71r7w8d4tbvhodqx (PUT object by user 58)    503134743    18    0
1376486875537    null files ki59mvpczeppr (GET Object by user 58)    295521888    10    0
1376486883705    /files/zlh8zw00ez7b5cz61cev (PUT object by user 58)    411649824    17    0
1376486887089    null files r2m8492etyinl367l07 (GET Object by user 58)    37739824    9    0
1376486902788    null files r2m8492etyinl367l07 (GET Object by user 58)    37739824    9    0
1376486916596    cloud characteristics discovery request by user 64    0    0    0
1376486916596    /files/ (PUT Container, requested by 64)    0    0    0

Listing 6.3: SLA Stats

time (acked requests in ms);time (acked requests);acked requests;time (failed requests in ms);time (failed requests);failed requests;time (succ requests in ms);time (succ requests);succ requests;time (declined requests in ms);time (declined requests);declined requests
1376485966631;08/14/2013 03:12:46:631;0.0;1377067292060;08/21/2013 08:41:32:60;0.0;1376485966631;08/14/2013 03:12:46:631;1.0;;;
1376485966631;08/14/2013 03:12:46:631;5.0;;;;1376485967131;08/14/2013 03:12:47:131;6.0;;;
1376485966631;08/14/2013 03:12:46:631;10.0;;;;1376485967131;08/14/2013 03:12:47:131;11.0;;;
1376485966631;08/14/2013 03:12:46:631;15.0;;;;1376485967131;08/14/2013 03:12:47:131;16.0;;;
1376485966631;08/14/2013 03:12:46:631;20.0;;;;1376485967131;08/14/2013 03:12:47:131;21.0;;;
1376485966631;08/14/2013 03:12:46:631;25.0;;;;1376485967131;08/14/2013 03:12:47:131;26.0;;;
1376485966631;08/14/2013 03:12:46:631;30.0;;;;1376485967131;08/14/2013 03:12:47:131;31.0;;;
1376485966631;08/14/2013 03:12:46:631;35.0;;;;1376485967131;08/14/2013 03:12:47:131;36.0;;;
1376485966631;08/14/2013 03:12:46:631;40.0;;;;1376485967131;08/14/2013 03:12:47:131;41.0;;;
1376485966631;08/14/2013 03:12:46:631;45.0;;;;1376485967131;08/14/2013 03:12:47:131;46.0;;;


Figure B.3.: Plot of Monitored Number of Requests in Cloud

Figure B.4.: Plot of Total physical used Storage Capacity in Cloud


Listing 6.4: Sequence Log

INFO DiscoveryProcess seqUsageSequence5 meta5 08/26/2013 05:25:59:642 broker 26 responded, 0 to go. Received characteristics: {cdmi_export_iscsi: false, SLA minimum bandwidth: 0.0, location: us, SLA cloud id: 4, cdmi_delete_container: true, cdmi_export_webdav: false, SLA available capacity: 34896044298891, cdmi_modify_metadata: true, cdmi_list_children: true, max size: 9223372036854775807, cdmi_read_metadata: true, cdmi_export_nfs: false, cdmi_metadata_maxitems: 1024, cdmi_metadata_maxsize: 4096, cdmi_create_container: true, cdmi_query: false, number replicas: 1, cdmi_notification: false, SLA provider name: swift-cloud, cdmi_queues: false, SLA download costs: 0.1, SLA storage costs: 0.01, cdmi_domains: false, SLA maximum latency: 0.0, SLA upload costs: 0.0002}
INFO DiscoveryProcess seqUsageSequence5 meta5 08/26/2013 05:25:59:642 delegated to cloud swift-cloud
FINE StorageCloudBroker 5-26-4 08/26/2013 05:25:59:642 enqueue user request edu.kit.cloudSimStorage.cloudBroker.UserRequest@7686652a
FINE StorageCloudBroker 5-26-4 08/26/2013 05:25:59:642 enqueue user request edu.kit.cloudSimStorage.cloudBroker.UserRequest@715be530
FINE StorageCloudBroker 5-26-4 08/26/2013 05:25:59:642 enqueue user request edu.kit.cloudSimStorage.cloudBroker.UserRequest@4823ec74
INFO StorageCloudBroker 5-26-4 08/26/2013 05:25:59:642 send user request to cloud: /files/ (PUT Container, requested by 26)
INFO StorageCloudBroker 5-26-4 08/26/2013 05:25:59:642 Response of successful operation: 'PutContainerResponse: created 'Container 'files' (IBBEABIDIF), size 0B (152B), 0 children, metadata {cdmi_metadata_maxitems: 1024, cdmi_metadata_maxsize: 4096, number replicas: 1, created at: Mon Aug 26 17:24:21 CEST 2013, max size: 9223372036854775807, last write: Mon Aug 26 17:24:21 CEST 2013}''
INFO StorageCloudBroker 5-26-4 08/26/2013 05:25:59:642 operation eadb5479-d978-444e-b5ad-7c55a08f6014 finished
INFO StorageCloudBroker 5-26-4 08/26/2013 05:25:59:642 waited for this operation to finish - continue with execution
INFO StorageCloudBroker 5-26-4 08/26/2013 05:25:59:642 send user request to cloud: /files/g87asy5fh9pwc3ravuv1 (PUT object by user 26)
INFO StorageCloudBroker 5-26-4 08/26/2013 05:26:00:46 Response of successful operation: 'PutObjectResponse: created Object 'Object 'g87asy5fh9pwc3ravuv1' (IBAJHIDABC), size 60.213GB (64653152148B), metadata: {cdmi_metadata_maxitems: 1024, cdmi_metadata_maxsize: 4096, cdmi_size: 64653151992, created at: Mon Aug 26 17:24:21 CEST 2013, max size: 9223372036854775807, last write: Mon Aug 26 17:24:21 CEST 2013}''
INFO StorageCloudBroker 5-26-4 08/26/2013 05:26:00:46 operation ecb9db80-0931-4b37-b40a-0463db17aba6 finished
INFO StorageCloudBroker 5-26-4 08/26/2013 05:26:00:46 pause for 360000ms
FINE StorageCloudBroker 5-26-4 08/26/2013 05:32:00:46 user request queue empty - quit broker

Listing 6.5: Cloud Log

INFO cloudLoggger swift-cloud 08/26/2013 05:24:30:392 cloud characteristics discovery request by user 8 -> operation '5901ab1d-58f8-436f-b01c-a0272daf3f40'
INFO cloudLoggger swift-cloud 08/26/2013 05:24:30:392 operation 5901ab1d-58f8-436f-b01c-a0272daf3f40 succeeds
FINE cloudLoggger swift-cloud 08/26/2013 05:24:30:392 Operation '5901ab1d-58f8-436f-b01c-a0272daf3f40' succeeded (callback)
INFO cloudLoggger swift-cloud 08/26/2013 05:24:30:392 /files/ (PUT Container, requested by 8) -> operation '160ecf1b-b22e-428b-bf9a-2883a36e61fa'
INFO cloudLoggger swift-cloud 08/26/2013 05:24:30:392 Create user with ID 8
FINE cloudLoggger swift-cloud 08/26/2013 05:24:30:392 assigned the metadata {cdmi_metadata_maxitems: 1024, cdmi_metadata_maxsize: 4096, number replicas: 1, created at: Mon Aug 26 17:24:21 CEST 2013, max size: 9223372036854775807, last write: Mon Aug 26 17:24:21 CEST 2013}
FINE cloudLoggger swift-cloud 08/26/2013 05:24:30:392 assigned the CDMI ID 'CJDEBAJEBC'
INFO cloudLoggger swift-cloud 08/26/2013 05:24:30:392 operation 160ecf1b-b22e-428b-bf9a-2883a36e61fa succeeds
FINE cloudLoggger swift-cloud 08/26/2013 05:24:30:392 Operation '160ecf1b-b22e-428b-bf9a-2883a36e61fa' succeeded (callback)
INFO cloudLoggger swift-cloud 08/26/2013 05:24:30:392 /files/3je4omvxcoimmc7v (PUT object by user 8) -> operation 'e20f46f9-1067-49f1-ab98-1c022f54829b'

C. Input Files

Listing 6.6: Input Sequences Stats

id  model  maxObjSize    maxObjSize in SLA  requiredSpace  requiredSpace in SLA  initial delay
0   1      9887070719    9832871767         9887070719     10143590083           2444
1   1      44621724313   46077870305        44621724313    47936921811           6357
2   1      46264314663   44973950815        46264314663    43487424632           9546
3   1      107178351503  117873997799       107178351503   117327511722          16005
4   0      688779669     -1                 1806019203     1685012187            24500
5   1      64653151992   69605959288        64653151992    65579218673           30398
6   0      1009614518    -1                 7512922954     6863084366            39050
7   0      1039393485    -1                 5665286077     5927253093            44303

Listing 6.7: Cloud Definition XML

<cloudModel name="scc-intra" location="us" rootUrl="intra.scc.kit.edu/">
  <characteristics>
    <metadata>
      <entry>
        <string>cdmi_export_iscsi</string>


        <string>false</string>
      </entry>
      <entry><string>location</string><string>us</string></entry>
      <entry><string>number_replicas</string><string>3</string></entry>
      <entry><string>cdmi_delete_container</string><string>true</string></entry>
      <entry><string>cdmi_export_webdav</string><string>false</string></entry>
      <entry><string>cdmi_modify_metadata</string><string>true</string></entry>
      <entry><string>cdmi_list_children</string><string>true</string></entry>
      <entry><string>max_size</string><string>17179869184</string></entry>
      <entry><string>cdmi_read_metadata</string><string>true</string></entry>
      <entry><string>cdmi_export_nfs</string><string>false</string></entry>
      <entry><string>cdmi_metadata_maxitems</string><string>1024</string></entry>
      <entry><string>cdmi_metadata_maxsize</string><string>4096</string></entry>
      <entry><string>cdmi_create_container</string><string>true</string></entry>
      <entry><string>cdmi_query</string><string>false</string></entry>
      <entry><string>cdmi_notification</string><string>false</string></entry>
      <entry>


        <string>cdmi_queues</string><string>false</string>
      </entry>
      <entry><string>SLA download costs</string><string>0.0002</string></entry>
      <entry><string>SLA storage costs</string><string>0.04</string></entry>
      <entry><string>cdmi_domains</string><string>false</string></entry>
      <entry><string>SLA upload costs</string><string>0.0002</string></entry>
    </metadata>
  </characteristics>
  <pricingPolicy class="edu.kit.cloudSimStorage.pricing.SimplePricing">
    <centsPerUploadedGB>0.0002</centsPerUploadedGB>
    <centsPerDownloadedGB>0.0002</centsPerDownloadedGB>
    <centsPerStoredGBperPeriod>0.04</centsPerStoredGBperPeriod>
  </pricingPolicy>
  <servers class="java.util.ArrayList">
    <objectStorageServerModel>
      <name>server0</name>
      <ioLimitations class="edu.kit.cloudSimStorage.storageModel.resourceUtilization.FirstFitAllocation" maxRate="1.34217728E11"/>
      <disks class="java.util.ArrayList">
        <objectStorageDiskModel>
          <drive class="edu.kit.cloudSimStorage.cloudFactory.harddrives.GenericDrive" name="/dev/sda0">
            <capacity>1099511627776</capacity>
            <reserverdSpace>0</reserverdSpace>
            <usedSpace>0</usedSpace>
            <readRate>1.63577856E8</readRate>
            <writeRate>6.7108864E7</writeRate>
            <readLatency>9.0</readLatency>
            <writeLatency>11.5</writeLatency>
            <ioLimits class="edu.kit.cloudSimStorage.storageModel.resourceUtilization.FirstFitAllocation" maxRate="1.63577856E11"/>
          </drive>
          <name>/dev/sda0</name>
        </objectStorageDiskModel>
        <objectStorageDiskModel>
          <drive class="edu.kit.cloudSimStorage.cloudFactory.harddrives.GenericDrive" name="/dev/sda1">
            <capacity>1099511627776</capacity>
            <reserverdSpace>0</reserverdSpace>
            <usedSpace>0</usedSpace>
            <readRate>1.63577856E8</readRate>
            <writeRate>6.7108864E7</writeRate>
            <readLatency>9.0</readLatency>


            <writeLatency>11.5</writeLatency>
            <ioLimits class="edu.kit.cloudSimStorage.storageModel.resourceUtilization.FirstFitAllocation" maxRate="1.63577856E11"/>
          </drive>
          <name>/dev/sda1</name>
        </objectStorageDiskModel>
        <objectStorageDiskModel>
          <drive class="edu.kit.cloudSimStorage.cloudFactory.harddrives.GenericDrive" name="/dev/sda2">
            <capacity>1099511627776</capacity>
            <reserverdSpace>0</reserverdSpace>
            <usedSpace>0</usedSpace>
            <readRate>1.63577856E8</readRate>
            <writeRate>6.7108864E7</writeRate>
            <readLatency>9.0</readLatency>
            <writeLatency>11.5</writeLatency>
            <ioLimits class="edu.kit.cloudSimStorage.storageModel.resourceUtilization.FirstFitAllocation" maxRate="1.63577856E11"/>
          </drive>
          <name>/dev/sda2</name>
        </objectStorageDiskModel>
      </disks>
    </objectStorageServerModel>
  </servers>
  <cloudIOLimits class="edu.kit.cloudSimStorage.storageModel.resourceUtilization.UnlimitedResource"/>
</cloudModel>
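To make the three rates of the pricingPolicy element in the cloud definition concrete, the following minimal Java sketch computes a bill from them. It is only an illustration under stated assumptions: the class PricingSketch and its methods are hypothetical and are not the simulator's SimplePricing class, the cost model is assumed to be linear (traffic billed once, storage billed once per period), and 1 GB is assumed to be 2^30 bytes, which is consistent with the broker log printing 64653152148 B as 60.213 GB.

```java
// Hypothetical sketch; this is not the simulator's SimplePricing class.
class PricingSketch {
    // Rates copied from the <pricingPolicy> element of the cloud definition.
    static final double CENTS_PER_UPLOADED_GB = 0.0002;
    static final double CENTS_PER_DOWNLOADED_GB = 0.0002;
    static final double CENTS_PER_STORED_GB_PER_PERIOD = 0.04;

    // Assumption: 1 GB = 2^30 bytes (the broker log prints 64653152148 B as 60.213 GB).
    static final double BYTES_PER_GB = 1024.0 * 1024.0 * 1024.0;

    static double toGB(long bytes) {
        return bytes / BYTES_PER_GB;
    }

    /** Assumed linear model: traffic is billed once, storage once per period. */
    static double costInCents(long uploadedBytes, long downloadedBytes,
                              long storedBytes, int periods) {
        return toGB(uploadedBytes) * CENTS_PER_UPLOADED_GB
             + toGB(downloadedBytes) * CENTS_PER_DOWNLOADED_GB
             + toGB(storedBytes) * CENTS_PER_STORED_GB_PER_PERIOD * periods;
    }

    public static void main(String[] args) {
        // Object size of sequence 0 from the input sequence statistics.
        long objectSize = 9887070719L;
        System.out.printf("%.6f cents%n", costInCents(objectSize, 0L, objectSize, 1));
    }
}
```

With these rates, storing an object for one period costs far more than uploading it, so in this sketch the storage rate dominates the bill.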

Listing 6.8: XML Representation of SLA of normal sequence

<SLA>
  <requirements class="edu.kit.cloudSimStorage.ObjectStorageSLAs.SLARequirementAND">
    <a class="edu.kit.cloudSimStorage.ObjectStorageSLAs.SLARequirementAND">
      <a class="edu.kit.cloudSimStorage.ObjectStorageSLAs.SupportsCapability" description="cdmi_create_container!" capability_key="cdmi_create_container"/>
      <b class="edu.kit.cloudSimStorage.ObjectStorageSLAs.SupportsCapability" description="cdmi_delete_container!" capability_key="cdmi_delete_container"/>
    </a>
    <b class="edu.kit.cloudSimStorage.ObjectStorageSLAs.MinimumCharactersisticValue" description="SLA available capacity&gt;=1.685012187E9" key="SLA available capacity" min="1.685012187E9"/>
  </requirements>
  <ratings class="edu.kit.cloudSimStorage.ObjectStorageSLAs.RateByPrice">
    <description>rate 1/price for up and download and storage costs</description>
  </ratings>
</SLA>
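The requirements element of the SLA above is a small boolean tree: an AND of two capability checks, combined by another AND with a minimum value for the available capacity. The Java sketch below rebuilds that tree by hand and evaluates it against a map of cloud characteristics. It is a hedged illustration, not the simulator's code: the class SlaSketch, the Requirement interface, and the helper methods are hypothetical, and the semantics given to SupportsCapability, MinimumCharactersisticValue and SLARequirementAND are assumptions inferred from their names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: these types only mirror the element names of the SLA
// listing; they are not the classes from edu.kit.cloudSimStorage.
class SlaSketch {

    /** A requirement is checked against a cloud's characteristics map. */
    interface Requirement {
        boolean matches(Map<String, String> characteristics);
    }

    /** Assumed semantics of SupportsCapability: the key maps to "true". */
    static Requirement supports(String key) {
        return c -> "true".equals(c.get(key));
    }

    /** Assumed semantics of MinimumCharactersisticValue: numeric value >= min. */
    static Requirement minimum(String key, double min) {
        return c -> c.containsKey(key) && Double.parseDouble(c.get(key)) >= min;
    }

    /** Assumed semantics of SLARequirementAND: both children must match. */
    static Requirement and(Requirement a, Requirement b) {
        return c -> a.matches(c) && b.matches(c);
    }

    /** The requirement tree of the normal-sequence SLA, rebuilt by hand. */
    static Requirement normalSequenceSla() {
        return and(
            and(supports("cdmi_create_container"), supports("cdmi_delete_container")),
            minimum("SLA available capacity", 1.685012187E9));
    }

    public static void main(String[] args) {
        Map<String, String> cloud = new HashMap<>();
        cloud.put("cdmi_create_container", "true");
        cloud.put("cdmi_delete_container", "true");
        // Assumed value for illustration: three empty disks of 1099511627776 B each.
        cloud.put("SLA available capacity", "3298534883328");
        System.out.println(normalSequenceSla().matches(cloud));
    }
}
```

Keeping each requirement as a small predicate makes the nesting of the XML explicit: every a/b pair in the listing corresponds to one call of and in the sketch.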

Listing 6.9: XML of a scientific sequence

<usageSequence sequenceID="0" blocking="false" idle="2444">
  <SLA>


    <requirements class="edu.kit.cloudSimStorage.ObjectStorageSLAs.SLARequirementAND">
      <a class="edu.kit.cloudSimStorage.ObjectStorageSLAs.SLARequirementAND">
        <a class="edu.kit.cloudSimStorage.ObjectStorageSLAs.SLARequirementAND">
          <a class="edu.kit.cloudSimStorage.ObjectStorageSLAs.SLARequirementAND">
            <a class="edu.kit.cloudSimStorage.ObjectStorageSLAs.SupportsCapability" description="cdmi_create_container!" capability_key="cdmi_create_container"/>
            <b class="edu.kit.cloudSimStorage.ObjectStorageSLAs.SupportsCapability" description="cdmi_delete_container!" capability_key="cdmi_delete_container"/>
          </a>
          <b class="edu.kit.cloudSimStorage.ObjectStorageSLAs.SLARequirementOR">
            <a class="edu.kit.cloudSimStorage.ObjectStorageSLAs.DoesNotSupportCapability" description="does not support max_container_size" capability_key="max_container_size"/>
            <b class="edu.kit.cloudSimStorage.ObjectStorageSLAs.MinimumCharactersisticValue" description="max_container_size&gt;=9.223372036854776E18" key="max_container_size" min="9.223372036854776E18"/>
          </b>
        </a>
        <b class="edu.kit.cloudSimStorage.ObjectStorageSLAs.MinimumCharactersisticValue" description="SLA available capacity&gt;=1.0143590083E10" key="SLA available capacity" min="1.0143590083E10"/>
      </a>
      <b class="edu.kit.cloudSimStorage.ObjectStorageSLAs.MinimumCharactersisticValue" description="max_size&gt;=9.832871767E9" key="max_size" min="9.832871767E9"/>
    </requirements>
    <ratings class="edu.kit.cloudSimStorage.ObjectStorageSLAs.RakingSum">
      <a class="edu.kit.cloudSimStorage.ObjectStorageSLAs.RateCharacteristicsWithInverse" key="SLA storage costs" defaultScore="-Infinity" scale="1.0">
        <description>lowest storage costs</description>
      </a>
      <b class="edu.kit.cloudSimStorage.ObjectStorageSLAs.RateCharacteristicsWithInverse" key="SLA upload costs" defaultScore="-Infinity" scale="1.0">
        <description>lowest upload costs</description>
      </b>
    </ratings>
  </SLA>
  <requests class="java.util.ArrayList">
    <userRequest delay="0" blockingCall="true" opCode="1" size="0">
      <containerName>files</containerName>
      <objectID>UNKNOWN</objectID>
      <metadata>
        <metadata/>
      </metadata>


    </userRequest>
    <userRequest delay="0" blockingCall="false" opCode="0" size="9887070719">
      <objectName>t17tetzlxv0jcmgr1</objectName>
      <containerName>files</containerName>
      <objectID>UNKNOWN</objectID>
      <metadata>
        <metadata>
          <entry><string>cdmi_size</string><string>9887070719</string></entry>
        </metadata>
      </metadata>
    </userRequest>
    <userRequest delay="0" blockingCall="false" opCode="6" size="0">
      <objectID>UNKNOWN</objectID>
    </userRequest>
  </requests>
</usageSequence>
