Final Report Facebookmarks

  • INTO THE CLOUD

    20.07.2010 An evaluation of the Google App Engine

    Chornyi, Dmitry

    Riediger, Julian

    Wolfenstetter, Thomas

    Into the Cloud: An Evaluation of the Google App Engine

    Abstract

    Cloud Computing is often glorified as one of the most important IT paradigms to shape the next decade of IT. In this paper we illustrate the concepts as well as the technology behind Cloud Computing in general and analyze one of the currently most popular platforms, the Google App Engine. For this purpose we developed a sample application utilizing many of the platform's features. Based on the experience gained during implementation and scalability testing, we evaluate the Google App Engine and discuss whether it satisfies the requirements of state-of-the-art software engineering. Finally, we present an outlook on the way ahead in Cloud Computing.

    Keywords:

    Cloud Computing, Google App Engine, Evaluation, Scalability testing, PaaS

    Content

    OUTLINE
    CLOUD COMPUTING IN A NUTSHELL
        Motivation
        Definition of Cloud Computing
        Service delivery models
    GOOGLE APP ENGINE
        Architecture
        Costs
        Features
            Runtime Environment
            Persistence and the datastore
            Services
        App Engine for Business
    APPLICATION DEVELOPMENT USING GOOGLE APP ENGINE
        General Idea
        Requirements and functionality
        Implementation
            Development Environment
            Application Environment
            Application Architecture
            Platform Limitations
    SCALABILITY TESTING
        Testing approach
        Application scalability
    DISCUSSION
        Software Engineering Aspects
        Outlook: the way ahead in Cloud Computing
            New application opportunities and use cases
            Challenges of Cloud Computing
    CONCLUSION
    ABBREVIATIONS
    TABLE OF FIGURES
    LITERATURE


    OUTLINE

    In this paper we discuss the omnipresent topic of Cloud Computing with a special focus on application development using Google's platform-as-a-service (PaaS) environment, the Google App Engine. First, we give a general introduction to Cloud Computing, in which we outline the motivation behind the hype, define the term and explain the three service delivery models of Cloud Computing. Next, we draw a detailed picture of the Google App Engine, illustrating its architecture, services and API. This section is followed by a report on an exemplary application development project on the platform. We then sum up our experience from scalability testing, before we finally discuss whether the Google App Engine and Cloud Computing in general meet the requirements of modern software engineering and whether there is a paradigm shift in the mindset of software developers.

    CLOUD COMPUTING IN A NUTSHELL

    Motivation

    In October 2008, the British weekly The Economist praised Cloud Computing as the defining technology for the IT world:

    "The rise of the cloud is more than just another platform shift that gets geeks excited. It will undoubtedly transform the IT industry, but it will also profoundly change the way people work and companies operate. It will allow digital technology to penetrate every nook and cranny of the economy and of society, creating some tricky political problems along the way" (1).

    Indeed, the Google Trends graph of worldwide search volume (see Figure 1) shows that the term Cloud Computing has become increasingly popular since 2007, at least in terms of online search behavior.

    FIGURE 1: GLOBAL SEARCH VOLUME INDEX FOR CLOUD COMPUTING (2)

    The dream behind Cloud Computing is that, as long as users can connect to the Internet, they have the entire Web as their computing center. Compared to the seemingly infinitely powerful Internet Cloud, personal computers appear to be lightweight terminals that merely give users access to the cloud. From this perspective, Cloud Computing may seem like a return to the original mainframe paradigm of the 1960s and 1970s (3). Critics therefore accuse Cloud Computing of being old wine in new bottles and argue that it is just a temporary buzzword within the IT community. Regardless of whether The Economist's optimistic scenario comes true or the critics turn out to be right after all, the technology behind Cloud Computing is more than interesting enough to be studied in detail.


    Definition of Cloud Computing

    A recent literature review (4) has shown that Cloud Computing is still a fuzzy term and that the existing definitions have little in common. Moreover, distinguishing Clouds from Grids is not trivial, as both approaches are closely related. The most noticeable difference is that, in contrast to Grid Computing, which focuses on sharing distributed resources dynamically at runtime, Cloud Computing aims at virtualization (5). In this context there are several approaches that aim at combining the advantages of clouds and grids, which can also be seen as a combination of advanced networking with sophisticated virtualization (4).

    When looking for a universal definition of Cloud Computing, one has to consider multiple aspects. The essential characteristics of Cloud Computing can be described as on-demand self-service, rapid elasticity, resource pooling, ubiquitous network access and measured service (6). Furthermore, there are different deployment models: Cloud Computing can be operated as a private, public, community or hybrid Cloud. Public Clouds are available to everybody in a pay-as-you-go manner. Current examples of public Clouds include Amazon Web Services (7), Google App Engine (8) and Microsoft Azure (9). Private Clouds, on the other hand, are internal datacenters of a business or other organization that are not made available to the public. Community Clouds are shared by several organizations and are usually set up for their specific requirements. Hybrid Clouds are a mixture of the above three deployment models. Each Cloud in the hybrid model can be managed independently, but applications and data are allowed to move across the hybrid Cloud. Hybrid Clouds therefore allow cloud bursting, in which a private Cloud bursts out to a public Cloud when it requires more resources (10) (11). Based on these observations we adopt the Cloud Computing definition by Vaquero et al. (4):

    "Clouds are a large pool of easily usable and accessible virtualized resources (such as hardware, development platforms and/or services). These resources can be dynamically re-configured to adjust to a variable load (scale), allowing also for an optimum resource utilization. This pool of resources is typically exploited by a pay-per-use model in which guarantees are offered by the infrastructure provider by means of customized SLAs."

    Service delivery models

    In general, Cloud Computing services can be divided into three service delivery models: Infrastructure-as-a-Service (IaaS), Platform-as-a-Service (PaaS) and Software-as-a-Service (SaaS). As shown in Figure 2, these delivery models form a hierarchy. Only SaaS is visible to the end user, while developers use PaaS and IaaS to deploy their applications. The three models are introduced individually below.


    FIGURE 2: SERVICE DELIVERY MODELS OF CLOUD COMPUTING (12)

    Infrastructure-as-a-Service

    IaaS products deliver a complete computer infrastructure remotely via the Internet. They provide machine

    instances to developers, which essentially behave like dedicated servers, controlled by the developers.

    This means that the developer has full responsibility for server operation and once a machine reaches its

    performance limits, the developer has to manually instantiate another machine and to scale the

    application out to it. To sum up, IaaS is intended for developers who want to write arbitrary software on

    top of the infrastructure with only small compromises in their development methodology (12).

    Platform-as-a-Service

    PaaS offerings are situated one level higher within the Cloud Computing hierarchy. They provide a full or partial application development environment that abstracts machine instances and other technical details from the developer. The applications are executed within data centers, so developers do not have to concern themselves with matters of allocation. In exchange, the developers have to accept some constraints that the environment imposes on their application design, for example the use of special data stores instead of relational databases (12).

    Software-as-a-Service

    At the consumer-facing level there are the most popular examples of Cloud Computing, with well-defined

    applications offering users online resources and storage. This differentiates SaaS from traditional

    websites or web applications which do not interface with user information or do so in a limited manner

    (12). SaaS offers complex applications such as CRM or ERM online (5).

    Figure 3 illustrates the different service delivery models of Cloud Computing and lists major vendors of

    each domain as an example.


    FIGURE 3: MAJOR TYPES OF CLOUD SERVICES (ADAPTED FROM (5))

    GOOGLE APP ENGINE

    Architecture

    The Google App Engine (GAE) is Google's answer to the ongoing trend of Cloud Computing offerings within the industry. In the traditional sense, GAE is a web application hosting service that allows web-based applications to be developed and deployed within a pre-defined runtime environment. Unlike other cloud-based hosting offerings such as Amazon Web Services, which operate on the IaaS level, the GAE already provides an application infrastructure on the PaaS level. This means that the GAE abstracts from the underlying hardware and operating system layers by providing the hosted application with a set of application-oriented services. While this approach is very convenient for developers of such applications, the rationale behind the GAE is its focus on scalability and on usage-based infrastructure and payment.

    Costs

    Developing and deploying applications for the GAE is generally free of charge, but restricted to a certain amount of traffic generated by the deployed application. Once this limit is reached within a certain time period, the application stops working. However, the limit can be lifted by switching to a billable quota, for which the developer enters a maximum budget that may be spent on an application per day. Once the free quota is used up, the application then continues to work until the maximum budget for the day is reached. Table 1 summarizes the quotas we consider most important, together with the amount per unit that is charged once the free resources are depleted and additional, billable quota is desired.

    Vendors listed in Figure 3:
    IaaS: Amazon EC2, Joyent, Sun Microsystems Network.com, HP Flexible Computing Services, IBM Blue Cloud, 3tera, OpSource, Jamcracker
    PaaS: Bungee Labs Bungee Connect, Etelos, Coghead, Google App Engine, HP Adaptive Infrastructure as a Service, Salesforce.com, LongJump
    SaaS: Oracle SaaS platform, Salesforce Sales Force Automation, NetSuite, Google Apps, Workday Human Capital Management


    Resource            Free Default Quota               Billing Enabled Default Quota    Cost
                        Daily Limit    Maximum Rate      Daily Limit    Maximum Rate

    General Limits
    Requests            1.3 million    7,400 req/min     43 million     30,000 req/min    n/a
    Bandwidth In        1 GB           56 MB/min         1,046 GB       10 GB/min         $0.10/GB
    Bandwidth Out       1 GB           56 MB/min         1,046 GB       10 GB/min         $0.12/GB
    CPU Time            6.5 CPU-h      15 CPU-min/min    1,729 CPU-h    72 CPU-min/min    $0.10/CPU-h

    Data Store
    Stored Data         1 GB           -                 no maximum     -                 $0.15/GB/month
    # of Indexes        100            -                 200            -                 n/a
    Queries             10 million     57,000/min        200 million    129,000/min       n/a
    CPU Time            60 CPU-h       20 CPU-min/min    1,200 CPU-h    50 CPU-min/min    n/a

    Mail Service
    Recipients          2,000          8/min             7.4 million    5,100/min         $0.0001/recipient

    URL Fetch Service
    API Calls           657,000        3,000/min         46 million     32,000/min        n/a

    Memcache Service
    API Calls           8,600,000      48,000/min        96,000,000     108,000/min       n/a

    Task Queue Service
    API Calls           100,000        n/a               1 million      n/a               n/a
    Stored Tasks        1 million max  -                 10 million max -                 n/a

    TABLE 1: GAE QUOTA AND BILLING (ADAPTED FROM (8))
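
    To make the billing arithmetic concrete, the following minimal sketch estimates the charge for one day from three of the rates in Table 1. It assumes that only consumption above the free daily quota is billed; the function name, the dictionary layout and the example usage figures are hypothetical and not part of any GAE API.

```python
# Free daily quota and price per additional unit, copied from Table 1.
# Format: resource -> (free quota, USD per unit above the free quota)
RATES = {
    "bandwidth_in_gb": (1.0, 0.10),
    "bandwidth_out_gb": (1.0, 0.12),
    "cpu_hours": (6.5, 0.10),
}


def estimate_daily_cost(usage):
    """Return the estimated charge in USD for one day of usage.

    Assumption: only consumption above the free quota is billed.
    """
    total = 0.0
    for resource, amount in usage.items():
        free_quota, price_per_unit = RATES[resource]
        billable = max(0.0, amount - free_quota)
        total += billable * price_per_unit
    return round(total, 2)


# Hypothetical usage: 3 GB in, 5 GB out, 10 CPU hours.
example_usage = {"bandwidth_in_gb": 3.0, "bandwidth_out_gb": 5.0, "cpu_hours": 10.0}
daily_cost = estimate_daily_cost(example_usage)
print(daily_cost)  # 2 * 0.10 + 4 * 0.12 + 3.5 * 0.10 = 1.03
```

    A developer could compare such an estimate with the maximum daily budget to judge whether the application is likely to stop before the end of the day.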

    Features

    The GAE can be divided into three parts: the runtime environment, the datastore and the App Engine services.

    Runtime Environment

    The GAE runtime environment is the place where the actual application is executed. However, the application is only invoked once an HTTP request is sent to the GAE via a web browser or some other interface, meaning that the application is not constantly running when there is nothing to process. When such an HTTP request arrives, the request handler forwards it and the GAE selects one out of many possible Google servers, where the application is then instantly deployed and executed for a certain amount of time (8). The application may then do some computing and return the result to the GAE request handler, which forwards an HTTP response to the client. It is important to understand that the application runs completely embedded in this sandbox environment, but only as long as requests are still coming in or some processing is done within the application. The reason for this is simple: applications should only run when they are actually computing, otherwise they would occupy precious computing power and memory without need. This paradigm already shows the GAE's potential in terms of scalability: being able to run multiple instances of one application independently on different servers provides a decent level of scalability.

    However, this highly flexible and stateless execution paradigm has its limitations. A request may be processed for no longer than 30 seconds, after which the response has to be returned to the client and the application is removed from the runtime environment again (8). Obviously, deploying and starting an application each time a request is processed means that additional lead time is needed until the application is finally up and running. The GAE tries to counter this problem by caching the application in server memory for as long as possible, optimizing for several subsequent requests to the same application. Furthermore, the stateless execution creates the need for a sophisticated persistence solution, which is presented in detail in the following section.

    The type of runtime environment on the Google servers depends on the programming language used. For Java and other languages that can be compiled to or interpreted on the Java platform (such as JRuby, Rhino and Groovy), a Java Virtual Machine (JVM) is provided. GAE also fully supports the Google Web Toolkit (GWT), a framework for rich web applications. For Python and related frameworks, a Python-based environment is used.

    FIGURE 4: STRUCTURE OF GOOGLE APP ENGINE (13)

    Persistence and the datastore

    As previously discussed, the stateless execution of applications creates the need for a datastore that provides a proper way of persisting data. Traditionally, the most popular way of persisting data in web applications has been the use of relational databases. However, with its focus on high flexibility and scalability, the GAE uses a different approach to data persistence, which builds on Google's Bigtable (14). Instead of the rows found in a relational database, the datastore holds entities. Every entity is associated with a certain kind and has properties, which resemble columns in a relational database schema. In contrast to relational databases, however, entities are schemaless: two entities of the same kind do not necessarily have the same properties, or even the same type of value for a given property.
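
    The schemaless nature of entities can be illustrated with a few lines of plain Python. This is only an in-memory sketch: the `Entity` class, the `put` helper and the example bookmarks are hypothetical and do not use the actual GAE datastore API.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A stored object: a kind, a key and a free-form set of properties."""
    kind: str
    key: str
    properties: dict = field(default_factory=dict)


# In-memory stand-in for the datastore, addressed by (kind, key).
store = {}


def put(entity):
    store[(entity.kind, entity.key)] = entity


# Three entities of the same kind with different property sets.
put(Entity("Bookmark", "b1", {"url": "http://example.com", "tags": ["cloud"]}))
put(Entity("Bookmark", "b2", {"url": "http://example.org", "rating": 5}))
# Same property name as in b2, but a different value type.
put(Entity("Bookmark", "b3", {"url": "http://example.net", "rating": "high"}))

rating_types = {
    key: type(entity.properties["rating"]).__name__
    for (kind, key), entity in store.items()
    if "rating" in entity.properties
}
print(rating_types)  # {'b2': 'int', 'b3': 'str'}
```

    A relational table would reject the third record or require a schema change, whereas the entity model accepts all three.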

    The most important difference from relational databases, however, is the way entities are queried. In a relational database, queries are processed and executed against the database at application runtime. The GAE takes a different approach: queries are pre-processed at compilation time, when a corresponding index is created. This index is later used at application runtime when the actual query is executed. Thanks to the index, each query is only a simple table scan in which the exact filter value is looked up. This makes queries very fast compared to relational databases, while updating entities becomes considerably more expensive because the indexes have to be updated as well.
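
    The trade-off described above, cheap reads paid for by more expensive writes, can be sketched with a small in-memory store that maintains one index. The class `IndexedStore` and its counter are hypothetical illustrations, not the way the GAE datastore is implemented.

```python
from collections import defaultdict


class IndexedStore:
    """Toy store that keeps an index on a single property."""

    def __init__(self, indexed_property):
        self.indexed_property = indexed_property
        self.entities = {}             # key -> properties
        self.index = defaultdict(set)  # property value -> set of keys
        self.index_writes = 0          # counts the extra work done on writes

    def put(self, key, properties):
        old = self.entities.get(key)
        if old is not None and self.indexed_property in old:
            # Remove the stale index entry before adding the new one.
            self.index[old[self.indexed_property]].discard(key)
            self.index_writes += 1
        self.entities[key] = dict(properties)
        if self.indexed_property in properties:
            self.index[properties[self.indexed_property]].add(key)
            self.index_writes += 1

    def query(self, value):
        # A query is a direct index lookup; no entity is inspected.
        return sorted(self.index.get(value, ()))


bookmarks = IndexedStore("tag")
bookmarks.put("b1", {"tag": "cloud"})
bookmarks.put("b2", {"tag": "java"})
bookmarks.put("b3", {"tag": "cloud"})
print(bookmarks.query("cloud"))  # ['b1', 'b3']

# Updating b1 costs two index operations: one removal, one insertion.
bookmarks.put("b1", {"tag": "java"})
print(bookmarks.query("cloud"))  # ['b3']
print(bookmarks.index_writes)    # 5
```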


    Transactions are similar to those in relational databases. Each transaction is atomic, meaning that it either fully succeeds or fails. As described above, one of the advantages of the GAE is its scalability through concurrent instances of the same application. But what happens when two instances start transactions that try to alter the same entity? The answer is quite simple: only the first instance gets access to the entity and keeps it until its transaction has completed or failed. The second instance receives a concurrency failure exception. This way of handling parallel transactions is called optimistic concurrency control. It simply denies more than one altering transaction on an entity at a time, which implies that an application running within the GAE should have a mechanism that tries to get write access to an entity multiple times before finally giving up.
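
    Such a retry mechanism could look roughly like the sketch below. It is a simplified stand-in that detects conflicting writes through a version number; this detection scheme, the class `VersionedStore`, the exception name and the limit of three attempts are all assumptions made for illustration and do not reproduce the GAE transaction API.

```python
class ConcurrentModificationError(Exception):
    """Raised when an entity changed between read and write."""


class VersionedStore:
    """Toy store in which every successful write bumps a version number."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def read(self, key):
        return self._data.get(key, (0, None))

    def write(self, key, expected_version, value):
        current_version, _ = self._data.get(key, (0, None))
        if current_version != expected_version:
            raise ConcurrentModificationError(key)
        self._data[key] = (current_version + 1, value)


def update_with_retry(store, key, update, max_attempts=3):
    """Apply update(value), retrying on conflicts; return the attempts used."""
    for attempt in range(max_attempts):
        version, value = store.read(key)
        try:
            store.write(key, version, update(value))
            return attempt + 1
        except ConcurrentModificationError:
            continue  # another writer was faster: re-read and try again
    raise ConcurrentModificationError(
        f"gave up on {key!r} after {max_attempts} attempts"
    )


store = VersionedStore()
store.write("counter", 0, 0)
calls = {"count": 0}


def increment(value):
    # On the first call, simulate a second instance writing in between.
    if calls["count"] == 0:
        version, current = store.read("counter")
        store.write("counter", version, current + 10)
    calls["count"] += 1
    return value + 1


attempts = update_with_retry(store, "counter", increment)
print(attempts, store.read("counter"))  # 2 (3, 11)
```

    The first attempt fails because the simulated second instance has already modified the entity; the second attempt re-reads the new value and succeeds.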

    By relying heavily on indexes and optimistic concurrency control, the GAE can execute queries very quickly even at large scale while ensuring data consistency.

    FIGURE 5: BIGTABLE STRUCTURE (14)

    Services

    As mentioned earlier, the GAE serves as an abstraction of the underlying hardware and operating system layers. These abstractions are implemented as services that can be called directly from the actual application. In fact, the datastore itself is also a service that is controlled by the runtime environment of the application.

    MEMCACHE

    The platform's built-in memory cache service serves as short-term storage. As its name suggests, it stores data in a server's memory, allowing for faster access than the datastore. Memcache is a non-persistent data store that should only be used for temporary data within a series of computations. Probably the most common use case for Memcache is storing session-specific data (15). Persisting session information in the datastore and executing queries on every page interaction is highly inefficient over the application lifetime, since session-owner instances are unique per session (16). Moreover, Memcache is well suited to speeding up common datastore queries (8). To interact with Memcache, the GAE supports JCache, a proposed interface standard for memory caches (17).
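
    The use of a memory cache to speed up common datastore queries follows a simple pattern: look in the cache first and fall back to the datastore only on a miss. The sketch below illustrates this with a tiny in-memory cache; `SimpleCache`, `slow_datastore_query` and the 60-second expiry are hypothetical stand-ins and not the Memcache or JCache API.

```python
import time


class SimpleCache:
    """In-memory cache with per-entry expiry; a stand-in for Memcache."""

    def __init__(self):
        self._entries = {}  # key -> (value, expiry timestamp)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._entries[key]  # expired entries behave like misses
            return None
        return value

    def set(self, key, value, ttl_seconds=60.0):
        self._entries[key] = (value, time.monotonic() + ttl_seconds)


datastore_reads = {"count": 0}


def slow_datastore_query(user_id):
    """Placeholder for an expensive datastore query."""
    datastore_reads["count"] += 1
    return {"user_id": user_id, "bookmarks": 42}


cache = SimpleCache()


def get_profile(user_id):
    key = f"profile:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    result = slow_datastore_query(user_id)
    cache.set(key, result)
    return result


first = get_profile("u1")
second = get_profile("u1")  # served from the cache
print(datastore_reads["count"])  # 1
```

    Because the cache is not persistent, the code must always be able to fall back to the datastore, which is exactly what the miss branch does.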

    URL FETCH

    Because the GAE restrictions do not allow opening sockets (18), the URL Fetch service can be used to send HTTP or HTTPS requests to other servers on the Internet. The service works asynchronously, giving the remote server some time to respond while the request handler can do other things in the meantime. Once the server has answered, the URL Fetch service returns the response code as well as the header and body. Using the Google Secure Data Connector, an application can even access servers behind a company's firewall (8).

    MAIL

    The GAE also offers a mail service that allows sending and receiving email messages. Mails can be sent directly from the application, either on behalf of the application's administrator or on behalf of users with Google Accounts. Moreover, an application can receive emails in the form of HTTP requests that are initiated by the App Engine and posted to the app at multiple addresses. In contrast to incoming emails, outgoing messages may also have an attachment of up to 1 MB (8).

    XMPP

    Analogous to the mail service, a similar service exists for instant messaging, allowing an application deployed to the GAE to send and receive instant messages. The service allows communication to and from any instant messaging service compatible with XMPP (8), a set of open technologies for instant messaging and related tasks (19).

    IMAGES

    Google also integrated a dedicated image manipulation service into the App Engine. Using this service

    images can be resized, rotated, flipped or cropped (18). Additionally it is able to combine several

    images into a single one, convert between several image formats and enhance photographs. Of course

    the API also provides information about format, dimensions and a histogram of color values (8).

    USERS

    User authentication with GAE comes in two flavors. Developers can roll their own authentication service using custom classes, tables and Memcache, or simply plug into Google's Accounts service. Since for most applications the time and effort of creating a sign-up page and storing user passwords is not worth the trouble (18), the User service is a very convenient functionality that offers an easy method for authenticating users within applications. As a by-product, thousands of existing Google Accounts are leveraged. The User service detects whether a user has signed in and otherwise redirects the user to a sign-in page. Furthermore, it can detect whether the current user is an administrator, which facilitates implementing admin-only areas within the application (8).

    OAUTH

    The general idea behind OAuth is to allow a user to grant a third party limited permission to access protected data without sharing username and password with that third party. The OAuth specification distinguishes between a consumer, which is the application that seeks permission to access protected data, and the service provider, which stores protected data on its users' behalf (20). Using Google Accounts and the GAE API, applications can act as an OAuth service provider (8).

    SCHEDULED TASKS AND TASK QUEUES

    Because background processing is restricted on the GAE platform, Google introduced task queues as

    another built-in functionality (18). When a client requests an application to do certain steps, the

    application might not be able to process them right away. This is where the task queues come into play.

    Requests that cannot be executed right away are saved in a task queue that controls the correct

    sequence of execution. This way, the client gets a response to its request right away, possibly with the

    indication that the request will be executed later (13).

    Similar to the concept of task queues are cron jobs. Borrowed from the UNIX world, a GAE cron job is a

    scheduled job that can invoke a request handler at a pre-specified time (8).
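
    The task queue behaviour described above, in which the client receives an immediate answer while the work is carried out later in the correct order, can be sketched as follows. The functions `handle_request` and `run_worker` and the upper-casing "work" are hypothetical; the sketch uses a plain in-memory queue rather than the GAE Task Queue API.

```python
from collections import deque

task_queue = deque()  # first in, first out
results = []


def handle_request(payload):
    """Accept the request immediately and defer the actual work."""
    task_queue.append(payload)
    return {"status": "accepted", "queued": len(task_queue)}


def run_worker():
    """Process all queued tasks in the order in which they arrived."""
    while task_queue:
        payload = task_queue.popleft()
        results.append(payload.upper())  # placeholder for the real work


responses = [handle_request(name) for name in ("alice", "bob", "carol")]
print(responses[0])  # {'status': 'accepted', 'queued': 1}

run_worker()
print(results)  # ['ALICE', 'BOB', 'CAROL']
```

    A cron job would differ only in the trigger: instead of a client request, a schedule decides when the handler is invoked.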

    BLOBSTORE

    The general idea behind the blobstore is to allow applications to handle objects that are much larger than the size allowed for objects in the datastore service. Blob is short for binary large object, and the blobstore is designed to serve large files such as videos or high-quality images. Although blobs can be up to 2 GB in size, they have to be processed in portions of one MB at a time. This restriction was introduced to smooth the curve of datastore traffic. To enable queries for blobs, e.g. for creating an image database, each blob has a corresponding blob info record which is persisted in the datastore (8).
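
    Processing a large object in portions of one MB amounts to a simple chunked read loop. The following sketch uses an in-memory byte stream as a stand-in for a blob; the helper `iter_chunks` and the 2.5 MB example payload are hypothetical and unrelated to the actual Blobstore API.

```python
import io

CHUNK_SIZE = 1024 * 1024  # one MB, the portion size mentioned in the text


def iter_chunks(stream, chunk_size=CHUNK_SIZE):
    """Yield successive chunks of at most chunk_size bytes from stream."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


# A 2.5 MB payload stands in for a large blob.
blob = io.BytesIO(b"x" * (2 * CHUNK_SIZE + 512 * 1024))
chunk_sizes = [len(chunk) for chunk in iter_chunks(blob)]
print(chunk_sizes)  # [1048576, 1048576, 524288]
```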


    ADMINISTRATION CONSOLE

    The administration console acts as a management cockpit for GAE applications. It gives the developer real-time data and information about the current performance of the deployed application and is used to upload new versions of the source code. It is also possible to test new versions of the application and to switch the version presented to the user. Furthermore, access data and log files can be viewed, and traffic can be analyzed so that quotas can be adapted when needed. Finally, the status of scheduled tasks can be checked, and the administrator is able to browse the application's datastore and manage indexes (8).

    App Engine for Business

    While the GAE is targeted more towards independent developers in need of a hosting platform for their medium-sized applications, Google's recently launched App Engine for Business targets the corporate market. Although it technically relies mostly on the GAE as described, Google added some enterprise features and a new pricing scheme to make its cloud computing platform more attractive to enterprise customers (21).

    Regarding the features, App Engine for Business includes a central development manager that allows a

    central administration of all applications deployed within one company including access control lists. In

    addition to that Google now offers a 99.9% service level agreement as well as premium developer

    support.

    Google also adjusted the pricing scheme for its corporate customers by offering a fixed price of $8 per user per application per month, up to a maximum of $1000 per month. Interestingly, unlike the pricing scheme for the GAE, this fixed price includes unlimited processing power. From a technical point of view, Google tries to accommodate established industry standards by now offering SQL database support in addition to the existing Bigtable datastore described above (8).
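
    The pricing rule can be written as a one-line formula: the monthly price is the number of users times $8, capped at $1000. The sketch below assumes that the cap applies per application; the function name and the example user counts are hypothetical.

```python
PRICE_PER_USER = 8   # USD per user per application per month
MONTHLY_CAP = 1000   # USD, assumed to apply per application


def monthly_app_cost(users):
    """Return the monthly price in USD for one application."""
    if users < 0:
        raise ValueError("users must not be negative")
    return min(users * PRICE_PER_USER, MONTHLY_CAP)


print(monthly_app_cost(50))   # 400
print(monthly_app_cost(125))  # 1000, the cap is reached at 125 users
print(monthly_app_cost(500))  # 1000
```

    Under this reading, every user beyond the 125th no longer increases the price of an application.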

    APPLICATION DEVELOPMENT USING GOOGLE APP ENGINE

    General Idea

    In order to evaluate the flexibility and scalability of the GAE we tried to come up with an application

    that relies heavily on scalability, i.e. collects large amounts of data from external sources. That way we

    hoped to be able to test both persistency and the gathering of data from external sources at large

    scale.

    Our idea was therefore to develop an application that connects people's delicious bookmarks with their respective Facebook accounts. People using our application should be able to see their Facebook friends' delicious bookmarks, provided those friends have a delicious account. This way a user gets a visualization of his friends' latest topics by looking at a generated tag cloud, which gives him a clue about the most common and shared interests.
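
    A tag cloud of this kind boils down to counting how often each tag occurs across all friends' bookmarks and mapping the counts to display sizes. The following sketch shows one possible way to do this; the data layout, the function `build_tag_cloud` and the font size range are assumptions for illustration and do not describe the application's actual implementation.

```python
from collections import Counter


def build_tag_cloud(bookmarks_by_friend, min_size=10, max_size=32):
    """Map each tag to a font size proportional to its frequency."""
    counts = Counter(
        tag
        for bookmarks in bookmarks_by_friend.values()
        for bookmark in bookmarks
        for tag in bookmark["tags"]
    )
    if not counts:
        return {}
    highest = max(counts.values())
    return {
        tag: min_size + (max_size - min_size) * count // highest
        for tag, count in counts.items()
    }


# Hypothetical bookmarks of two friends.
bookmarks_by_friend = {
    "alice": [
        {"url": "http://example.com/a", "tags": ["cloud", "java"]},
        {"url": "http://example.com/b", "tags": ["cloud"]},
    ],
    "bob": [
        {"url": "http://example.com/c", "tags": ["cloud", "python"]},
    ],
}

tag_cloud = build_tag_cloud(bookmarks_by_friend)
print(tag_cloud)  # {'cloud': 32, 'java': 17, 'python': 17}
```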

    In order to provide such a service within our application we had to integrate both Facebook and delicious and persist the fetched data in the GAE datastore. Although all data could also be fetched in real time, there are two reasons to persist the bookmarks from delicious as well as the personal details from the respective Facebook accounts. First, from a user perspective, reading the information from the GAE datastore is much faster than re-fetching everything at runtime. Second, and more important for this project, we needed to test the scalability of the datastore and of the GAE in general. In order to draw substantial conclusions about the scalability of the GAE, testing persistency is essential.

    Requirements and functionality

    Before we started implementing the application, we studied the relevant APIs of the service providers involved as well as similar applications. Furthermore, we asked a number of potential users how the user interface should look and aligned their opinions with our own design ideas. The result was a collection of requirements, which is summarized in Table 2.

    Functional Requirements: User Interface | Functional Requirements: Backend | Non-functional Requirements

    Login to a delicious account | Fetch bookmarks from delicious and persist them | Application has to be scalable up to 1000 parallel users

    Filter friends and their bookmarks | Fetch personal details and friends from Facebook and persist them | Secure & reliable authentication with Facebook & delicious

    Filter bookmarks by tags | Update and persist new bookmarks from delicious | Application should run within Facebook using an iFrame

    Visualize bookmarks in a tag cloud | Update and persist new Facebook details |

    Show all bookmarks in a list | |

    TABLE 2: APPLICATION REQUIREMENTS

    Implementation

    Development Environment

    Because of our expertise in Java and familiarity with the Eclipse IDE, we decided to use both for developing our application. Furthermore, Eclipse allowed us to use the GAE plugin, which significantly improves the development and debugging of a local GAE application. We also agreed on using the Google Web Toolkit (GWT), a Java framework that helps to develop rich web applications with AJAX-based user interfaces.

    For source code versioning, we used a free SVN server provided by Assembla.com.

    Application Environment

    As stated in the requirements, the application should run embedded within the Facebook website. As depicted in Figure 6, a user can find and select the application via Facebook and send a request to start it within Facebook. This triggers a request for an iFrame that is forwarded to the GAE, where the actual application is started. The GAE then calls the Facebook API and waits for a response. As soon as the GAE receives the response, it returns the iFrame with the application to the user. The application is then visible within an iFrame in the Facebook UI and is ready to be used. If the user is not yet logged in to delicious, he can now do so. The login to delicious is done via Yahoo authentication: the GAE sends an authentication request to Yahoo's authentication servers and receives an access token. With this access token, our application running within the GAE can then access the user's bookmarks by requesting them from the delicious servers.

    FIGURE 6: APPLICATION ARCHITECTURE (OWN ILLUSTRATION ADAPTED FROM (21) (22) (23) (8))

    Application Architecture

    LAYER & COMPONENT OVERVIEW

    Our application is based on a three-tier client/server architecture (24) incorporating a presentation tier, an application/business logic tier and a data tier. Figure 7 shows an overview of the components and layers involved. The presentation layer is represented by the web browser component. Using the GWT, Java code is automatically compiled into AJAX-enabled JavaScript code that runs within the client's web browser.

    The business logic component includes various services which interact with the APIs of external service providers such as Facebook, Yahoo and Delicious as well as with the data and presentation layers. The business logic layer interacts with the presentation layer via both HTTP requests and RPC calls. Additionally, the business logic layer uses an XML parser to process data received from the delicious API.

    The data layer is fully represented by the GAE. Although the GAE offers more services than data persistence, it serves as our main data layer. The business logic layer uses the Memcache API to store data temporarily and the datastore API to persist data.

    In addition, the GAE offers a Logging API and a Task API, both of which we use from the business logic layer. These two features do not belong to the data layer, but have been added to the diagram to show the whole functionality leveraged from the GAE.

    FIGURE 7: DIAGRAM OF LAYERS AND COMPONENTS (OWN ILLUSTRATION)

    CLASS OVERVIEW

    In the following we present the application's architecture by showing a class overview for each of the three layers, namely the business logic layer, the presentation layer and the data layer.

    Business Logic Layer

    Essentially, five classes make up the business logic layer. The class FacebookServiceImpl implements the access to the Facebook API by providing methods for retrieving the friends of a given Facebook user. DeliciousServiceImpl and DeliciousFeedConnector both implement the access to the delicious backend. Access to the data layer via the datastore API is realized by the class PMF, which returns a handle to the GAE's PersistenceManagerFactory. In order to communicate with the presentation layer, the class UserServiceImpl implements the server side of the GAE's RPC service.

    FIGURE 8: CLASS DIAGRAM OF BUSINESS LAYER (OWN ILLUSTRATION)

    Presentation Layer

    The presentation layer is distributed over several sub-packages in order to separate the parts that are fully server-based from those that are solely client-based. The classes BookmarkTable, BookmarkWidget, FilterWidget and TagCloudWidget provide the main functionality within the web-based UI: displaying bookmarks by user, filtering bookmarks and users, and creating a tag cloud navigation for accessing bookmarks. Although these classes are written in pure Java, the GWT automatically compiles them into AJAX-enabled JavaScript code. The DeliciousLoginWidget and FacebookLoginWidget classes allow the user to log in to Delicious and Facebook, respectively. The classes Gwtapp, Global and Modality represent GWT abstractions of the actual presentation layer runtime environment in the client's web browser.

    FIGURE 9: CLASS DIAGRAM OF PRESENTATION LAYER (OWN ILLUSTRATION)

    Data Layer

    In order to persist data within the GAE, we use JDO classes. Although Google's Bigtable is by definition a schema-less database, the JDO classes serve as a means to define the schema for kinds of entities (see GAE introduction). This way, BookmarkJDO and UserJDO define the schema for the bookmarks and the user data that are persisted in the datastore. Data access classes such as BookmarkDO, which are not depicted in the class diagram, are used for data representation within the application.

    FIGURE 10: CLASS DIAGRAM OF DATA LAYER (OWN ILLUSTRATION)

    Platform Limitations

    At its core, the GAE restricts access to the physical infrastructure. This includes preventing the application from opening sockets, running background processes (except cron jobs) and using other common back-end routines that application developers normally take for granted (18). This section covers the limitations of the GAE that we directly encountered during our application development. There are several more limitations, especially with regard to enterprise-centric applications, but these will be discussed later.

    BACKGROUND PROCESSING

    Due to the character of the GAE, intense background processing is not possible within applications. As

    client requests are subject to a certain time limit, the ability to process large chunks of data is quite

    limited.

    TRANSACTIONS

    Another limitation is the inability to use the Memcache and the data store within one transaction. This limitation is quite a problem when processing large amounts of data. In our scenario we wanted to fetch all user bookmarks from delicious and persist them in the data store. As the number of entities that can be handled per transaction is limited as well, we tried to buffer all data in the Memcache temporarily. From there we wanted to persist small portions of the buffered data in the data store. However, after a portion of the buffered data had been stored successfully, there was no way to access the Memcache again within the same transaction in order to delete the stored data from the buffer. As a result, when the subsequent transaction reads the next chunk of buffered data from the Memcache, it is unknown whether certain entities have already been stored. The only way to circumvent this problem is to delete the data used by the current transaction from the Memcache before the transaction is actually triggered. However, this method risks losing already fetched data in case the transaction fails.
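
    The trade-off of this workaround can be illustrated with a small simulation. In the following sketch, plain Java collections stand in for the Memcache and the data store, and a boolean flag simulates a failing transaction; no GAE API is used, and all names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Illustrative simulation only: plain collections stand in for the Memcache
// and the data store; no GAE API is used.
class BufferDrainSketch {
    final Deque<String> buffer = new ArrayDeque<>();  // stands in for the Memcache
    final List<String> store = new ArrayList<>();     // stands in for the data store
    int lost = 0;                                     // entities removed but never stored

    // Removes up to chunkSize entities from the buffer BEFORE the simulated
    // transaction runs, as in the workaround described above.
    void drainChunk(int chunkSize, boolean transactionFails) {
        List<String> chunk = new ArrayList<>();
        while (chunk.size() < chunkSize && !buffer.isEmpty()) {
            chunk.add(buffer.poll());
        }
        if (transactionFails) {
            lost += chunk.size();  // gone from the buffer, never stored
        } else {
            store.addAll(chunk);   // simulated successful commit
        }
    }

    // Buffers five entities and drains them in chunks of two;
    // the second transaction is made to fail.
    static BufferDrainSketch runDemo() {
        BufferDrainSketch sketch = new BufferDrainSketch();
        for (int i = 0; i < 5; i++) {
            sketch.buffer.add("bookmark-" + i);
        }
        sketch.drainChunk(2, false);
        sketch.drainChunk(2, true);
        sketch.drainChunk(2, false);
        return sketch;
    }

    public static void main(String[] args) {
        BufferDrainSketch sketch = runDemo();
        System.out.println("stored=" + sketch.store.size() + ", lost=" + sketch.lost);
    }
}
```

    In this run, three entities end up in the simulated store and two are lost, which mirrors the risk described above: whatever is deleted from the buffer ahead of a failing transaction has to be fetched again.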

    DATA STORE QUERYING LIMITATIONS

    Although the underlying Bigtable data storage approach is quite different from traditional SQL approaches, the Java Data Objects Query Language (JDOQL) that we used tries to provide an SQL-like query language (25). However, certain features available in traditional SQL approaches are not available in the GAE environment. By design, table joins are not possible within the GAE because it features a non-relational database. Also, certain combinations of filtering operators cannot be used at the same time within a single query.

    SCALABILITY TESTING

    Good on-demand scalability is one of the key features that the GAE offers. A scalable system should be able to increase the amount of work it performs in proportion to the available resources. In order to check whether the GAE lives up to its promise of effortless scalability, we modified the application described in the last chapter in order to gather and evaluate performance data.

    Testing approach

    With GAE application instances running on standardized, relatively low-power virtual hardware and the response time for any request limited to 30 seconds, scalability translates into parallelization. By using a concurrent algorithm and spreading work over multiple VM instances simultaneously, an application should ideally execute linearly faster than with the sequential algorithm as the number of VM instances grows. This, of course, presumes that the algorithm it executes can be parallelized in the first place.
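
    The notion of executing "linearly faster" can be expressed with the usual speedup and efficiency measures: speedup is the sequential run time divided by the parallel run time, and efficiency is the speedup divided by the number of instances. The following sketch uses invented timings for illustration; they are not measurements from the tests described here, and the names are hypothetical.

```java
// Illustrative sketch only; the timings in main are invented, not measured.
class SpeedupSketch {
    // Speedup of a parallel run relative to the sequential run.
    static double speedup(double sequentialSeconds, double parallelSeconds) {
        return sequentialSeconds / parallelSeconds;
    }

    // Efficiency: 1.0 means ideal linear scaling over the given instances.
    static double efficiency(double sequentialSeconds, double parallelSeconds, int instances) {
        return speedup(sequentialSeconds, parallelSeconds) / instances;
    }

    public static void main(String[] args) {
        System.out.println(speedup(400.0, 100.0));        // 4.0
        System.out.println(efficiency(400.0, 100.0, 8));  // 0.5
    }
}
```

    An efficiency close to 1.0 would indicate near-linear scaling; values well below 1.0 indicate that additional instances are not fully exploited.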

    One easily parallelizable task is crawling data sources for various data, for instance downloading existing bookmarks from delicious.com in bulk. Based on the application described in the last chapter, we developed the Delicious Crawler, an application that starts with a random user and then downloads and saves his bookmarks and friends. In turn, for each friend it downloads his bookmarks and friends list, and so forth, effectively performing a breadth-first search. The Delicious Crawler saves a total of 100 users and 1600 bookmarks per run.

    The operation of the Delicious Crawler is visualized in Figure 11. A request triggered by task queue

    instructs the crawler to:

    1. Take a UserJDO object (representing a delicious user) from the FIFO queue

    2. For this user, read bookmarks and friends using the delicious.com JSON feed API

    3. Convert received data to classes that can be persisted

    4. Save the user and bookmarks to the Bigtable datastore

    5. Enqueue the friends to the FIFO queue

    6. Enqueue a task in the task queue
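
    The steps above amount to a breadth-first traversal of the friend graph. The following is a minimal single-threaded sketch of that traversal, in which an in-memory map replaces the delicious feed and a list replaces the datastore. All names and the sample data are hypothetical, and the set that prevents a user from being enqueued twice is an assumption that is not part of the description above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Illustrative sketch only: an in-memory friend graph replaces the delicious
// feed, and a list replaces the datastore. All names are hypothetical.
class CrawlerSketch {
    // Visits users breadth-first, starting at startUser, and returns them in
    // the order in which they were "saved". Stops after maxUsers users.
    static List<String> crawl(Map<String, List<String>> friendsOf, String startUser, int maxUsers) {
        Queue<String> queue = new ArrayDeque<>();  // FIFO queue of users to process
        Set<String> seen = new HashSet<>();        // assumption: skip users already enqueued
        List<String> saved = new ArrayList<>();    // stands in for the datastore

        queue.add(startUser);
        seen.add(startUser);
        while (!queue.isEmpty() && saved.size() < maxUsers) {
            String user = queue.poll();                                   // step 1
            List<String> friends =
                friendsOf.getOrDefault(user, Collections.emptyList());    // step 2 (simulated fetch)
            saved.add(user);                                              // steps 3 and 4
            for (String friend : friends) {                               // step 5
                if (seen.add(friend)) {
                    queue.add(friend);
                }
            }
            // Step 6 (enqueueing a follow-up task) is represented by the loop.
        }
        return saved;
    }

    // Invented sample graph for illustration.
    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> graph = new HashMap<>();
        graph.put("alice", Arrays.asList("bob", "carol"));
        graph.put("bob", Arrays.asList("dave", "alice"));
        graph.put("carol", Arrays.asList("dave"));
        return graph;
    }

    public static void main(String[] args) {
        System.out.println(crawl(sampleGraph(), "alice", 100));  // [alice, bob, carol, dave]
    }
}
```

    In the sketch the loop processes one user after another; in a task-queue-driven setup, each iteration would correspond to one task, so several users could be processed at the same time.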

    As these operations are executed, performance data is logged. This includes the time to execute

    URLFetch and Persistence API requests, as well as the overall request processing duration.

    Since the described workflow can be executed concurrently, thus processing multiple users simultaneously,

    the developed Delicious Crawler is a suitable candidate for our scalability testing. By setting the rate

    attribute of the task queue to 50/second (maximum), and varying the bucket-size between 1 (minimum)

    and 50 (maximum), we were able to test the performance of the Delicious Crawler with 1 to 50

    concurrent threads. The results are discussed in the next section.
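
    The interplay of the rate and bucket-size settings can be pictured as a token bucket: tokens are added at the configured rate, at most bucket-size tokens are stored, and each task start consumes one token. The following is a simplified model under that assumption, not App Engine's actual implementation; the class and method names are hypothetical.

```java
// Illustrative token bucket; a simplified model, not the App Engine implementation.
class TokenBucketSketch {
    private final double ratePerSecond;  // tokens added per second
    private final int bucketSize;        // maximum number of stored tokens
    private double tokens;               // currently available tokens
    private double lastSeconds;          // time of the last refill

    TokenBucketSketch(double ratePerSecond, int bucketSize) {
        this.ratePerSecond = ratePerSecond;
        this.bucketSize = bucketSize;
        this.tokens = bucketSize;        // the bucket starts full
        this.lastSeconds = 0.0;
    }

    // Tries to start one task at the given time (seconds since start).
    boolean tryStart(double nowSeconds) {
        double refill = (nowSeconds - lastSeconds) * ratePerSecond;
        tokens = Math.min(bucketSize, tokens + refill);
        lastSeconds = nowSeconds;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }

    // Counts how many of the requested tasks can start at the same instant
    // when the rate is 50 tokens per second.
    static int burstAtTimeZero(int bucketSize, int requested) {
        TokenBucketSketch bucket = new TokenBucketSketch(50.0, bucketSize);
        int started = 0;
        for (int i = 0; i < requested; i++) {
            if (bucket.tryStart(0.0)) {
                started++;
            }
        }
        return started;
    }

    public static void main(String[] args) {
        System.out.println(burstAtTimeZero(1, 100));   // 1
        System.out.println(burstAtTimeZero(16, 100));  // 16
        System.out.println(burstAtTimeZero(50, 100));  // 50
    }
}
```

    In this model the bucket size bounds how many tasks can start in a single burst, which is why varying it between 1 and 50 serves as a proxy for the degree of concurrency.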

    FIGURE 11. CRAWLER (OWN ILLUSTRATION)

    Application scalability

    Figure 12 shows the total time it took the Delicious Crawler to process 100 users and 1600 bookmarks for various bucket sizes. As we see, performance increases up to eight parallel threads and then levels off.

    FIGURE 12. TOTAL DURATION (OWN ILLUSTRATION)

    One explanation for the disappointing results when using 16 or more threads could be temporary technical problems at the App Engine at the time of testing, or automatic throttling by Google, as the error message presented in Figure 13 may suggest. Such errors occurred a total of 53 times out of 700 task queue executions (about 7.6%). These errors possibly explain the extreme outliers seen in Figure 14, where detailed data for bucket size 16 is shown.

    FIGURE 13 ERROR MESSAGE IN ADMINISTRATION CONSOLE (8)

    FIGURE 14 TIME PER REQUEST AND SERVICE (OWN ILLUSTRATION)

    Since we only did brief scalability testing, the results may be affected by idiosyncrasies ranging from local network connection latencies to temporary technical problems at delicious or the App Engine itself. Thus, the results should not be generalized to usage scenarios beyond what we tested.

    DISCUSSION

    The following section discusses whether Cloud Computing in general and the GAE in particular are able to serve the needs and requirements of modern software engineering. Each aspect is first discussed in general terms, followed by a GAE-specific reflection.

    Software Engineering Aspects

    Functional Aspects

    As discussed in the introduction, PaaS environments impose certain restrictions on the developer in terms of programming techniques, languages and other functionality that is available elsewhere. In contrast, IaaS environments such as Amazon Web Services give the developer more freedom and flexibility, however at the cost of having fewer pre-built features. This shows the trade-off between developer flexibility and pre-built functionality. The more the application requirements match the GAE's pre-built functionality, the easier and faster it will be to develop applications for the GAE compared to an IaaS platform such as Amazon Web Services. However, if the match is low, the GAE's restrictions outweigh the benefits of the pre-built functionality, and an IaaS provider might be the better choice.

    Usability

    Cloud Computing adopts the concept of Utility Computing, i.e. the idea that users obtain and employ computing platforms in Clouds as easily as they access a traditional public utility infrastructure (such as electricity, water or the telephone network) (22). The same expectations apply to the GAE from a developer's perspective. Usability-wise, the GAE offers easy access to the domain of Cloud Computing by providing abstractions for important services such as persistence as well as an easy-to-use development environment.

    Scalability

    The central design goal of the GAE is to address concerns about scalability. The platform is built on the concept of horizontal scaling. In essence, this means that instead of running an application on more powerful hardware, the application is executed on more instances of less powerful hardware (16).

    Being a PaaS solution, the GAE offers a wide portfolio of built-in services that can easily be integrated into a GAE-deployed application. This includes its built-in scalability feature as well as its persistence abstraction. At least in theory, built-in scalability and the persistence abstraction are among the most interesting USPs that the GAE has to offer. With its distributed application deployment approach and the extremely scalable Bigtable database, the GAE already delivers all the tools needed to build highly scalable applications. However, in reality certain purposely set restrictions limit the scalability of the GAE at the moment, as our tests have shown. Nevertheless, numerous Google services such as Gmail show the GAE's technical potential in terms of scalability, as the underlying layers for scalability and especially for persistence are the same. The emergence of Google App Engine for Business shows that Google is currently trying to loosen these restrictions and make the GAE attractive for enterprise applications that may then fully utilize its scalability features.

    Integration

    The need will arise to migrate and integrate applications and data from different clouds. This will bring about a new form of cloud service, namely cloud integration services (23). At the moment the GAE does not offer any direct support for this, although data from external clouds can be integrated using the tools and services offered within the GAE.

    Availability

    It is impossible to provide 100% availability, unless a high availability architecture is adopted and both

    the platform and applications are fully tested. Enterprise users should seek service level agreements

    (SLAs) that will motivate the vendors to ensure desired levels of availability. Besides the SLA, users who

    require 100% availability may take a combination of precautionary measures. With data, they may

    maintain a backup on on-premises storage, or use a backup cloud, or simply not store mission-critical

    data on the cloud. With applications, the users may keep an on-premises version of the application, so

    that they may work offline while the cloud is down (23).

    In November 2007, RackSpace, a competitor of Amazon, interrupted its service for 3 hours because of a power outage at its data center; in June 2008, the Google App Engine service was down for 6 hours due to bugs in the storage system; in March 2009, Microsoft Azure was out of service for 22 hours because of an operating system update. Currently, public cloud providers based on virtualization define the reliability of their service as 99.9% in their SLAs (24).

    Google itself does not offer any service level agreement for the basic version of the GAE. However, the recently launched GAE for Business specifically includes a 99.9% SLA. It remains to be seen whether and how this promise will be met in the future.

    Support

    In fact, cloud services should be designed for easier usability than on-premises computing in the first place (23). PaaS platforms such as the GAE in particular have by definition a higher need for developer support, as the features and services provided are mostly non-standardized. In contrast to IaaS solutions, where the basic hardware and operating system layers are mostly the same as in traditional deployment approaches, PaaS platforms need a detailed description of the supported features and services. The GAE offers some documentation on how to use the platform, but in the basic version no professional support is available. The upcoming business edition of the GAE will, however, offer a real support option.

    Privacy

    Customers may be able to sue enterprises if their privacy rights are violated, and in any case the enterprises may face damage to their reputation. Current privacy concepts such as the Fair Information Principles are applicable to cloud computing scenarios and mitigate the risks. Tips for software engineering:

    1. Minimize personal information sent to and stored in the cloud

    2. Protect personal information in the cloud

    3. Maximize user control

    4. Allow user choice

    Privacy should be built into every stage of the product development process: it is not adequate to try to bolt on privacy at a late stage in the design process (25).

    Cloud computing vendors must adopt the most sophisticated and up-to-date tools and procedures, and strive to provide better security and privacy than is available for on-premises computing (8).

    As far as the GAE is concerned, Google's general privacy notes are applicable. To evaluate the real level of privacy, especially of the data stored within the GAE, one would have to perform further privacy-related tests that can deliver meaningful insights.

    User Authentication

    For user authentication, the GAE comes with the Google authentication service. This enables developers to easily integrate logins for Google accounts into their applications. Given the wide acceptance of Google accounts, this feature is really useful and a great advantage compared to IaaS solutions, where developers have to handle authentication on their own.

    Legal Issues and Compliance

    Enterprise users must maintain legal business documents and assure their integrity in order to comply with various laws. Cloud computing vendors have to adopt technologies to ensure that their enterprise users' data satisfy their compliance requirements. Again, this does not seem to have received much attention yet (23). At this stage the GAE does not make any specifications about legal issues and compliance. For applications that depend heavily on such requirements, the GAE might not be the right choice at this point in time. But once again, Google will have to look into these issues if it wants to become a serious PaaS provider at the enterprise level, as the launch of GAE for Business suggests.

    Cost

    The third-party provider owns and manages all the computing resources (servers, software, storage and networking) and the electricity needed for the services. The users only need to plug into the cloud. They do not need to make a large upfront investment in computing resources, the space needed to house them, the electricity needed to run them, or the staff for administering the system, network, and database (23).

    In terms of costs, the GAE offers a usage-dependent pricing scheme that starts with a basic version which is free of charge but subject to certain limitations. The paid version of the GAE removes some of these limitations; however, other limitations caused by the GAE's design still exist. As no upfront investments are necessary, the GAE is a good way to test new applications (even for free) and to pay as the acceptance and spread of the application grows.

    Interestingly, the recently launched GAE for Business goes in a different direction, as Google now offers a flat-rate pricing scheme for enterprise users.

    Advantages for the software developer

    In the next two decades, service-oriented distributed computing will emerge as a dominant factor in

    shaping the industry, changing the way business is conducted and how services are delivered and

    managed (32).

    This example is not intended to discredit the paradigm, just the exaggerated and premature claims of end-user empowerment. On the contrary, even if lay end users won't be able to whip up a serious enterprise application in a matter of days, cloud computing opens up exciting new possibilities based on a mix of old and new technologies for the next generation of software developers (33).

    Outlook: the way ahead in Cloud Computing

    New application opportunities and use cases

    It is foreseeable that Cloud Computing will affect the world of IT in two ways. On the one hand it will fundamentally change the way existing applications are designed, and on the other hand it will create entirely new use cases. Chun and Maniatis (31) describe one such use case, where Cloud Computing enables a technology which otherwise would not be possible: to overcome hardware limitations and enable more powerful mobile interactive applications, external resources are used by partially shifting computations from a smartphone into the Cloud. Although Cloud Computing will play a major role as an enabler of new use cases, its impact on current applications is believed to be even bigger. Cloud Computing presents a unique opportunity for batch processing and business analytics. The rise of business analytics has manifested itself in a growing share of computing resources being spent on understanding customers, supply chains, buying habits and so on. Analyzing terabytes of data can take hours on a single computer. If computations can be parallelized, using hundreds of computers for a short time costs the same as using a few computers for a long time. Another group of ideal candidates for the Cloud are compute-intensive desktop applications. Especially for new product development, moving simulations into the Cloud can mean enormous cost savings compared with the traditional approach of buying computation time from a data processing center. The latest versions of mathematics software packages such as Matlab or Mathematica are already capable of using Cloud Computing to perform expensive evaluations (10).
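
    The cost argument for batch processing made above can be checked with simple arithmetic. The following sketch assumes that the work is perfectly parallelizable and that billing is purely per machine hour; the hourly price is an invented example value, and the names are hypothetical.

```java
// Illustrative sketch only; the hourly price is an invented example value.
class BatchCostSketch {
    // Cost in cents of running `machines` machines for `hours` hours each.
    static long costCents(int machines, int hours, int centsPerMachineHour) {
        return (long) machines * hours * centsPerMachineHour;
    }

    public static void main(String[] args) {
        // 1000 machine hours of work at an assumed 10 cents per machine hour:
        System.out.println(costCents(10, 100, 10));  // 10000
        System.out.println(costCents(500, 2, 10));   // 10000
    }
}
```

    Both configurations cost the same, but the second one finishes in 2 hours instead of 100, which is what makes the Cloud attractive for batch jobs of this kind.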

    Challenges of Cloud Computing

    Of course, where there is light, there is also shadow. Like every emerging technology, Cloud Computing still has obstacles on its path that have to be overcome. This is especially true for enterprise applications. Below we point out some major obstacles of current Cloud Computing from a business perspective and present corresponding opportunities for solving them (see Table 3).

    Challenge | Opportunity

    Availability of Service | Especially for enterprises, the availability of certain applications is business critical. Therefore they still shrink from trusting Cloud providers with hosting critical software. A possible solution would be to use multiple Cloud providers to provide business continuity and to utilize Cloud elasticity to defend against DDOS attacks.

    Data Lock-In | Many businesses fear that choosing a certain Cloud provider also means losing a certain degree of freedom as their data gets locked in. If the APIs of different providers were built on an industry standard, the entry threshold would definitely be lower.

    Data Confidentiality and Auditability | Another important issue is confidentiality in the Cloud. Even if the provider is trustworthy, it is still unclear on which server data is located and which legislation applies. A possible work-around would be to deploy encryption, VLANs and firewalls. A really clean solution would store data geographically according to legal requirements.

    Data Transfer Bottlenecks | Just because high-speed broadband Internet is available in some regions does not necessarily mean that each branch can fall back on the same infrastructure quality. FedExing disks is still a common activity throughout the world. Even in the era of Cloud Computing, enterprises have to weigh which data to move entirely into the Cloud and where other solutions might be better suited.

    Performance Unpredictability | Especially for HPC applications it is essential that computation performance is stable and predictable. To solve this issue it is necessary to improve virtual machines, for example by implementing gang scheduling. In some cases of data storage, flash memory instead of hard drives might greatly improve speed.

    Scalable Storage | Although current relational database systems support multi-user access, Cloud Computing dimensions are in another league. Although CC providers (e.g. Google) have implemented special datastores, management of persistent data is still a major bottleneck. Fundamental research on database technology is therefore indispensable.

    Bugs in Large-Scale Distributed Systems | Current debugging technology is designed for traditional software. The laws that apply in distributed virtual Cloud systems are different from those in conventional systems. An opportunity to approach this issue is to invent special debuggers for distributed VMs.

    Dynamic Scaling | State-of-the-art scaling is mainly manual or at most semi-automatic. Although additional hardware is switched on in case of need, this does not happen until server load hits a certain level. Using machine learning algorithms to predict workload and dynamically allocate needed resources would improve Cloud efficiency substantially.

    Reputation Fate | For CC providers reputation might become a problem, because one customer's bad behavior can affect the reputation of the cloud as a whole. As soon as the Cloud's IP addresses become blacklisted, e.g. because of spam, all applications on the Cloud that send emails are negatively affected. This issue could be resolved by offering reputation-guarding services similar to trusted email services. Legal issues also have to be addressed, since Cloud Computing providers surely do not want to be held liable for the actions of their customers.

    Software Licensing | Another major issue is current licensing models, as they restrict the computers on which the software can run. In this context software vendors have to adapt their business models and offer pay-for-use licenses. Many vendors have already reacted and now offer SaaS themselves.

    TABLE 3: OBSTACLES AND OPPORTUNITIES OF CLOUD COMPUTING (ADAPTED FROM (10))

    CONCLUSION

    Cloud Computing remains the number one hype topic within the IT industry at present. Our evaluation of the Google App Engine has shown both the functionality and the limitations of the platform. Developing and deploying an application within the GAE is in fact quite easy and in a way shows the progress that software development and deployment have made. Within our application we were able to use the abstractions provided by the GAE without problems, although the concept of Bigtable requires a big change in mindset when developing. Our scalability testing showed the limitations of the GAE at this point in time. Although it is an extremely helpful feature and a great USP for the GAE, the built-in scalability currently suffers from both purposely set and technical restrictions. Coming back to our motivation of evaluating whether the GAE is sufficient for serious large-scale applications in a professional environment, we have to conclude that the GAE does not (yet) fulfill business needs for enterprise applications. As the discussion showed, some of these needs are yet to be satisfied by Cloud Computing platforms in general, while others are GAE-specific issues. However, given the benefits and potential of PaaS-based approaches such as the GAE, the question remains whether rather inflexible and non-standardized PaaS platforms can establish themselves in the market for serious large-scale applications or will remain platforms for small and simple applications as seen today.

    ABBREVIATIONS

    AJAX Asynchronous JavaScript and XML

    API Application Programming Interface

    Blob Binary Large Object

    CC Cloud Computing

    DDOS Distributed Denial of Service

    GAE Google App Engine

    GWT Google Web Toolkit

    HPC High Performance Computing

    HTTP Hypertext Transfer Protocol

    HTTPS Hypertext Transfer Protocol Secure

    IaaS Infrastructure as a Service

    JDO Java Data Objects

    JDOQL Java Data Objects Query Language

    JS JavaScript

    JSON JavaScript Object Notation

    JVM Java Virtual Machine

    PaaS Platform as a Service

    RPC Remote Procedure Call

    SaaS Software as a Service

    SLA Service Level Agreement

    SQL Structured Query Language

    SVN Subversion (a revision control system)

    VLAN Virtual Local Area Network

    VM Virtual Machine

    XML Extensible Markup Language

    XMPP Extensible Messaging and Presence Protocol

    TABLE OF FIGURES

    Figure 1: Global search volume index for Cloud Computing (2)

    Figure 2: Service delivery models of Cloud Computing (12)

    Figure 3: Major types of cloud services (adapted from (5))

    Figure 4: Structure of Google App Engine (13)

    Figure 5: Bigtable structure (14)

    Figure 6: Application architecture (own illustration adapted from (21) (22) (23) (8))

    Figure 7: Diagram of layers and components (own illustration)

    Figure 8: Class diagram of business layer (own illustration)

    Figure 9: Class diagram of presentation layer (own illustration)

    Figure 10: Class diagram of data layer (own illustration)

    Figure 11: Crawler (own illustration)

    Figure 12: Total duration (own illustration)

    Figure 13: Error message in administration console (8)

    Figure 14: Time per request and service (own illustration)

    LITERATURE

    1. Let it Rise. The Economist. October 23, 2008.

    2. Google Inc. Google Trends. [Online] 07 17, 2010. [Cited: 07 17, 2010.]

    http://www.google.com/trends?q=Cloud+Computing&ctab=0&geo=all&geor=all&date=all&sort=0.

    3. Voas, J. and Zhang, J. Cloud Computing: New Wine or Just a New Bottle? IT Professional. 2009.

    4. Vaquero, L, et al. A Break in the Clouds: Towards a Cloud Definition. ACM SIGCOMM Computer

    Communication Review. 01 2009.

    5. Leavitt, N. Is Cloud Computing Really Ready for Prime Time? IEEE Technology News. 2009.

    6. NIST. The NIST Cloud Computing Project. [Online] 2009. [Cited: 07 17, 2010.]

    http://csrc.nist.gov/cyber-md-summit/documents/posters/cloud-computing.pdf.

    7. Amazon.com, Inc. amazon web services. [Online] 2010. [Cited: 07 17, 2010.]

    http://aws.amazon.com/.

    8. Google Inc. Google App Engine. [Online] 2010. [Cited: 07 17, 2010.]

    http://code.google.com/intl/de-DE/appengine/.

    9. Microsoft, Inc. Windows Azure Platform. [Online] 2010. [Cited: 07 17, 2010.]

    http://www.microsoft.com/windowsazure/.

    10. Armbrust, M, et al. Above the Clouds: A Berkeley View of Cloud Computing. s.l. : UC Berkeley

    Reliable Adaptive Distributed Systems Laboratory, 2009.

    11. Sriram, I and Khajeh-Hosseini, A. Research Agenda in Cloud Technologies. [Online] 10 2009.

    [Cited: 07 18, 2010.] http://arxiv.org/ftp/arxiv/papers/1001/1001.3259.pdf.

    12. Marinos, A and Briscoe, G. Community Cloud Computing. 2009.

    13. Sanderson, D. Programming Google App Engine. Sebastopol : O'Reilly Media, 2009.

    14. Chang, F, et al. Bigtable: A Distributed Storage System for Structured Data. ACM Transactions on

    Computer Systems. s.l. : ACM, 2008. Vol. 26, 2.

    15. Severance, C. Using Google App Engine. Sebastopol : O'Reilly Media, 2009.

    16. Ciurana, E. Developing with Google App Engine. s.l. : firstPress, 2008.

    17. Java Community Process. JCACHE - Java Temporary Caching API. [Online] 2001. [Cited: 07 18,

    2010.] http://jcp.org/en/jsr/detail?id=107. JSR 107.

    18. Roche, K. and Douglas, J. Beginning Java Google App Engine. s.l. : Apress, 2009.

    19. XMPP Standards Foundation. The Extensible Messaging and Presence Protocol. [Online] 2010.

    [Cited: 07 17, 2010.] http://xmpp.org/.

    20. Internet Engineering Task Force. Request for Comments: 5849. The OAuth 1.0 Protocol. [Online] 04

    2010. [Cited: 07 19, 2010.] http://tools.ietf.org/html/rfc5849.

    21. Yahoo! Inc. Yahoo! [Online] 2010. [Cited: 07 18, 2010.] http://www.yahoo.com/.

    22. Facebook, Inc. facebook. [Online] 2010. [Cited: 07 18, 2010.] http://www.facebook.com/.

    23. Yahoo! Inc. delicious.com. [Online] 2010. [Cited: 07 18, 2010.] http://www.delicious.com.

    24. Bruegge, B. and Dutoit, A. Object-oriented software engineering: using UML, patterns, and Java. s.l. :

    Prentice Hall, 2009.

    25. Tyagi, S., Vorburger, M. and McCammon, K. Core Java Data Objects. s.l. : Prentice Hall PTR, 2003.

    26. Wang, L. and von Laszewski, G. Scientific Cloud Computing: Early Definition and Experience. 2008.

    27. Kim, W. Cloud Computing: Today and Tomorrow. Journal of Object Technology. 2009, Vol. 08, 01.

    28. Qian, L., et al. Cloud Computing: An Overview. CloudCom 2009. Heidelberg : Springer, 2009.

    29. Pearson, S. Taking Account of Privacy when Designing Cloud Computing Services. Proceedings of the

    2009 ICSE Workshop on Software Engineering Challenges of Cloud Computing. s.l. : IEEE Computer

    Society, 2009.

    30. Deelman, E, et al. The Cost of Doing Science on the Cloud: The Montage Example. Proceedings of the

    2008 ACM/IEEE conference on Supercomputing. 2008.

    31. Chun, B. and Maniatis, P. Augmented Smart Phone Applications Through Clone Cloud Execution.

    Proceedings of the 12th Workshop on Hot Topics in Operating Systems. 2009.

    32. Buyya, R, Pandey, S and Vecchiola, C. Cloudbus Toolkit for Market-Oriented Cloud Computing.

    Proceedings of the 1st International Conference on Cloud Computing. Beijing : Springer, 2009.

    33. Erdogmus, H. Cloud Computing: Does Nirvana Hide behind the Nebula? IEEE Software. 2009,

    March/April.