49
University of Twente Literature Study Business Process Management in the cloud: Business Process as a Service (BPaaS) Author: Evert Duipmans Supervisor: Dr. Lu´ ıs Ferreira Pires April 1, 2012

Business Process Management in the cloud: Business Process as a

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

University of Twente

Literature Study

Business Process Management inthe cloud: Business Process as a

Service (BPaaS)

Author:Evert Duipmans

Supervisor:Dr. Luıs Ferreira Pires

April 1, 2012

Contents

1 Introduction 31.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31.2 Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51.3 Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51.4 Structure of the report . . . . . . . . . . . . . . . . . . . . . . 5

2 Business Process Management 62.1 BPM lifecycle . . . . . . . . . . . . . . . . . . . . . . . . . . . 62.2 Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

2.2.1 Business Process Management and Notation (BPMN) 82.3 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . 10

2.3.1 Orchestration vs. Choreography . . . . . . . . . . . . 112.4 Enactment . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112.5 Business Process Management System . . . . . . . . . . . . . 122.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

3 Cloud Computing 143.1 General benefits and drawbacks . . . . . . . . . . . . . . . . . 143.2 Service models . . . . . . . . . . . . . . . . . . . . . . . . . . 16

3.2.1 Infrastructure as a Service (IaaS) . . . . . . . . . . . . 163.2.2 Platform as a Service (PaaS) . . . . . . . . . . . . . . 183.2.3 Software as a Service (SaaS) . . . . . . . . . . . . . . 19

3.3 Cloud types . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213.3.1 Public Cloud . . . . . . . . . . . . . . . . . . . . . . . 213.3.2 Private Cloud . . . . . . . . . . . . . . . . . . . . . . . 213.3.3 Hybrid Cloud . . . . . . . . . . . . . . . . . . . . . . . 213.3.4 Community Cloud . . . . . . . . . . . . . . . . . . . . 21

3.4 Cloud providers . . . . . . . . . . . . . . . . . . . . . . . . . . 223.4.1 Amazon . . . . . . . . . . . . . . . . . . . . . . . . . . 223.4.2 Microsoft Windows Azure . . . . . . . . . . . . . . . . 233.4.3 Google App Engine . . . . . . . . . . . . . . . . . . . 26

3.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

1

4 BPM and cloud computing 294.1 General benefits and drawbacks . . . . . . . . . . . . . . . . . 294.2 Service Models . . . . . . . . . . . . . . . . . . . . . . . . . . 29

4.2.1 IaaS . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304.2.2 PaaS . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304.2.3 SaaS . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

4.3 Combining on-premise and cloud . . . . . . . . . . . . . . . . 324.3.1 Architecture . . . . . . . . . . . . . . . . . . . . . . . 334.3.2 Optimal distribution . . . . . . . . . . . . . . . . . . . 34

4.4 Social BPM . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354.5 Environments . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

4.5.1 Product overview . . . . . . . . . . . . . . . . . . . . . 364.5.2 Oracle Fusion . . . . . . . . . . . . . . . . . . . . . . . 364.5.3 Appian . . . . . . . . . . . . . . . . . . . . . . . . . . 374.5.4 Fujitsu InterstageBPM.com . . . . . . . . . . . . . . . 384.5.5 Barium Live . . . . . . . . . . . . . . . . . . . . . . . 38

4.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

5 Research directions 405.1 BPEL engine deployment in the cloud . . . . . . . . . . . . . 405.2 Social BPM . . . . . . . . . . . . . . . . . . . . . . . . . . . . 405.3 Process decomposition . . . . . . . . . . . . . . . . . . . . . . 41

6 Conclusion 44

2

Chapter 1

Introduction

This work investigates the possibility of integrating Business Process Man-agement (BPM) with cloud computing. Both subjects are described in moredetail and already existing research about the subject and already existingtools are investigated. This chapter gives the motivation behind the subject,the objectives and approach of the research, and explains the structure ofthe report.

1.1 Motivation

BPM has gained a lot in popularity the last two decades. By identifying andmanaging business processes, companies get insight in what they are doing,which parts of the process can be automated or can be optimized. This maylead to lower costs, better customer satisfaction or optimized processes forcreating new products at lower cost [30].

Business Process Management Systems (BPMS) are able to capture businessprocesses and keep track of running instances of these processes. A businessprocess can be described using workflows. Workflows consist of activitieswhich are either human-based, system-based or a combination of the two incase of human-system interaction. A BPMS contains a workflow executionengine which coordinates the execution of a business process step by step.Each process instance is monitored by the BPMS and business users are ableto look into these process instances. The process instances give insight intothe progress of the processes and show if processes are completed success-fully, or have failed. In case of failure, the process instance shows in whichpart of the process the failure occurred. By monitoring, evaluating andchanging business processes, companies are able to optimize their processes.

To be able to control and monitor a business process, a BPMS needs to

3

communicate with the participants in the process. Introducing BPM intoa company might lead to integration issues, especially since legacy softwaremay not have standardized interfaces. Not all software that participates inthe business process offers clear interfaces for controlling or monitoring thesoftware.

Nowadays a lot of software vendors do offer their products as a service. Thesoftware is hosted on a server and customers are able to login to the softwareusing the Internet, mostly by using a browser. One of the big advantageshere is that companies can change their product relatively easily withouthaving to distribute the updates to all their customers. Instead, they onlyhave to adjust the software that is running on their own servers. Manyof these services are built using the Service-Oriented Architectures (SOA)principles. SOA[27] allows designing and developing software services in auniform way, so that reuse is enabled. This gives developers support forintegrating and composing services.

Offering software as a service gives providers also new challenges. In theprevious situation, where software was shipped to the user, the hardware onwhich the software would run was provided and managed by the user. In thenew situation the software runs on the machines of the provider. This meansthat the provider is responsible for the hardware and that maintenance hasto be performed by the provider. The provider has a choice to invest inservers and personnel to perform these tasks, or the choice can be madeto outsource the hosting of the software to a third-party provider. Cloudcomputing [9] is an example of a model where computing resources areoffered to the user as a service.

The NIST describes cloud computing as a model for shared configurablecomputing resources that can be rapidly provisioned and released with min-imal management effort or service provider interaction [25]. Cloud comput-ing provides users with 3 service models:

Software as a Service (SaaS) SaaS is a model in which software is of-fered as a service to the user. The software is hosted on a server andusers access the software by using a web browser.

Platform as a Service (PaaS) PaaS is the offering of a computing plat-form as a service. Users are able to deploy their applications on sucha platform. The platform offers auxiliary functionality such as a webserver, databases, load balancing and more.

Infrastructure as a Service (IaaS) With IaaS, the cloud offers platformvirtualization to the customer. The user is offered a virtual machinewith some storage. Instead of buying servers and other network equip-ment, users just rent these resources.

4

1.2 Objectives

The goal of this work is to investigate the possibility of combining BPM andcloud computing. By moving BPMS software to the cloud, companies donot have to buy and maintain expensive servers to manage and coordinatetheir own business process. Instead, the business process can be uploadedto a service in the cloud that performs and monitors the process.

This report gives an overview of problems and solutions of business processmanagement in the cloud, by looking at existing solutions and scientificpapers that discuss the topic. The ultimate goal of this work is to identifyresearch issues for a master assignment.

1.3 Approach

The following steps are taken in order to pursue the defined goals:

1. Study the literature about BPM and identify the phases and toolswithin BPM that can benefit from cloud computing.

2. Study the literature about cloud computing, in order to identify thebenefits and drawbacks of cloud computing in general and also thespecific benefits and drawbacks of the 3 service models.

3. Study literature about the combination of both cloud computing andBPM.

4. Discuss existing products where BPM is already combined with cloudcomputing.

5. Devise possible opportunities for further research.

1.4 Structure of the report

Chapter 2 explains business process management by considering the BPMlifecycle and investigating several tools and languages that are useful inthe management process. Chapter 3 elaborates on cloud computing. Thefundamentals of cloud computing are described and an overview of cloudplatforms is given. Chapter 4 investigates the combination of BPM andcloud computing by discussing scientific literature about the subject andby giving an overview of existing software. Chapter 5 identifies researchdirections for a master assignment. Chapter 6 gives our conclusions.

5

Chapter 2

Business ProcessManagement

The goal of BPM is to identify the internal business processes of an orga-nization, capture these processes in process models, manage and optimizethese processes by monitoring and reviewing them.

Business process management is based on the observation that each prod-uct that a company provides to the market is the outcome of a numberof activities performed [30]. These activities can be performed by humans,systems or a combination of both. By identifying and structuring these ac-tivities in workflows, companies get insight into their business processes. Bymonitoring and reviewing their processes, companies are able to identify theproblems within these processes and can come up with improvements.

This chapter discusses business process management by considering theBPM lifecycle. Each of the phases of the lifecycle is investigated. The chap-ter concludes by giving an overview of the BPM phases that are relevant forthis work.

2.1 BPM lifecycle

The BPM lifecycle is an iterative process in which all of the BPM aspectsare covered. The BPM lifecyle is shown in Figure 2.1 and consists of thefollowing phases:

• DesignThe design phase consists of identifying existing processes and captur-ing the business processes in process models.

6

Design

Implementation

Enactment

Evaluation

Figure 2.1: Schematic representation of the business process managementlifecycle [30]

• ImplementationIn the implementation phase, the designed process is implemented inan executable process language, which can be deployed in a BPMS.

• EnactmentThe enactment phase is the runtime phase of the lifecycle. The busi-ness process is deployed and monitored by a BPMS.

• EvaluationIn the evaluation phase the monitored information that is collectedby the BPMS is used to review the business process. The conclusionsdrawn in the evaluation phase are input for the next iteration of thelifecycle.

In the remainder of this chapter, the first 3 phases of the lifecycle are ex-plained in more detail. More information about the 4th phase is omitted inthis report, since the activities involved in this phase are not relevant forthis work.

2.2 Design

In the design phase the business processes within a company are identified.The goal of the design phase is to capture the processes in business processmodels. These models are often defined using a graphical notation. In thisway, stakeholders are able to understand the process and refine the models

7

easily. The activities within a process are identified by surveying the alreadyexisting business process, by considering the structure of the organizationand by identifying the technical resources within the company. BPMN[21] isthe most popular graphical language for capturing business process modelsin the design phase.

When the business processes are captured within models, these models canbe simulated and validated. By validating and simulating the process, thestakeholders get insight into the correctness and suitability of the businessprocess models.

2.2.1 Business Process Management and Notation (BPMN)

BPMN is a notation for defining business process models, developed bythe Business Process Management Initiative [31]. The first version of thestandard was released in 2004. Recently, in March 2011 BPMN 2.0 wasreleased. BPMN has been designed to be understandable for all the businessusers that have to deal with process diagrams. This means that businessanalysts need to be able to create the first sketches of the models, butthe diagrams also have to be understandable for technical developers whohave to implement the business processes represented in the diagrams andemployees who have to monitor the processes.

Language constructs

BPMN diagrams are modelled as flowcharts, consisting of activities that areconnected to each other to determine their relations. The most importantconstructs available in the language are:

• EventEvents effect the flow of a process. Three types of events are definedin BPMN, namely Start, Intermediate and End events.

• ActivityActivities can be described as work performed by a company. Thereare two types of activities: tasks and sub-processes.

• GatewayGateways are controls that are available for changing sequence flow.Condition/decision paths are examples of situations that can be mod-elled by gateways.

• Sequence FlowA sequence flow is used for ordering flow elements, and is used to showthe sequence of activities that are performed in the flowchart.

8

• Message FlowMessage flows are flow elements that indicate the flow of messagesfrom one participant to another participant. Message flows are usedto model the communication between two separate flow charts.

• AssociationAssociations are available for associating data, text and other artefactswith flow objects.

• PoolPools can be used to represent the participants of a process. Activitiesthat are related can be grouped into a pool.

• LaneLanes are partitions within a pool. They can be used for organizingand categorizing activities.

Example

The following example is adapted from OMG’s BPMN examples document[20]. The BPMN diagram of the example is shown in Figure 2.2. The ex-ample shows a Business-To-Business-Collaboration between a pizza vendorand a customer. Both participants are modelled using a pool. Inside thesepools, the internal business process of the participants is shown. The pizzavendor has 3 employees: a clerk, the pizza chef and the delivery boy. Eachof these employees is represented within the pizza vendor pool by a lane.

The business process of the customer starts when a customer is hungry andwants to eat a pizza. The customer has to perform two activities, namelya pizza needs to be selected and ordered by sending a message to the pizzavendor. The next step for the customer is to wait for the pizza. This has beenmodelled by an event based gateway (diamond shaped), which indicates thattwo possible events can occur. The first possibility is that a customer waitsfor 60 minutes and still has no pizza. In this case, the customer contactsthe pizza vendor and asks when the pizza will be delivered. After the call,the customer starts waiting again. In the second case, the pizza is deliveredwithin 60 minutes. In this situation, the customer pays for the pizza andeats the pizza. After these activities the customers hunger is satisfied andthe business process terminates.

The business process of the vendor starts when an “order received” eventis received by the clerk. The gateway after the event introduces parallelismto the process. A “Bake the Pizza” order is given to the pizza chef and theclerk self is waiting for an incoming “Where is my pizza?” event. As soonas the pizza chef finishes his pizza baking activity, he orders the pizza boy

9

to deliver the pizza. The pizza boy delivers the pizza to the customer andwaits for the payment. When the pizza money is received, the pizza boygives the customer a receipt and the process terminates.

��:��������.�������������������8����������������������������.��8����/�..�������..���./����

��������)�������������������0����.��������0������.�/�.��������0�������0��������������.0���

�8���0��.������������.��/���5���������������:���������)����8���������������������..�.���/���

���������������/�������..�.����0����������������;������.��������0������������O)�..���"��.���.P���

O&�8����5�����������P��������������������.��������0��0�..�.0���0��/��O)�..���"��.���.P���

����.����������������.0����������/���5�����������0����:�����������.��������0��0�..�.��0��/��

O&�8����5�����������P���/��������)����������0��.�����������������;�������..�.���0����/�����.��

��8O��������0��8����������8�������8����P��������0�0�����8�����������������������/�./�..��

��/�����.����8���5������

:�� .%��+;;�0���������+��

&����5���.���������������*&�*��������*��..����������������0�0�������.�������������0������;;�

�������������������5�.���.��0������.����/��������O����������P�����/���������������0����������

���.��".�������������������/��.�����������������/����.����0�������������������.��..��������

������������0�������������0����������������������.��;���������������������.��������������

��0�����//����������������������������.�0��8��������/0�������������..����������������������..�

�������������/������.�������/������������������.��������8��0��������..���������������0��

��//�������.������/�.���0������������.����8�������.0����//����.����������0��������������������

�/0��������������������0�����.����0������;;���������0�����������������������0.����&��

�����������/�����.������;;������������$/��������������0���/������;;������.�������&������

��������0���/������8O��������;;�P�����������������������..�0���/��0���//��������������.�

��������57(��������;;�����.�����������������0����/�..�0���������������������������.�����/��Q

8����������������������������������������������������������9*#������.4�()�� ��:�;�����������

)+'���:��������+�'������+1��+�'�+;;�

*�--����������

+��!�4

���� �--�

�������� �--� "������� �--�

�--�

��������

0���������

��%������ ��

�--�

*�4�� �� �--� (���� �� �--�

+��!��

���������

*�--��������

�--��� ��

�������4�.�4

"����

��������

9�%��� �� �--�

3�������� ��

�--�

��������

�4����

�--�������

����� �

����4

�--�

����%

A& ��������4�

�--�BC

����

��������

Figure 2.2: BPMN diagram of a pizza vendor and a customer [20]

2.3 Implementation

After the business process models are validated and simulated, they haveto be implemented. The implementation of these models can be done intwo ways. One can choose to create work lists, with well defined tasks,which can then be assigned to workers within the company. This is oftenthe case when there is no automation within the business process execution.The disadvantage of working with work lists is that the process execution ishard to monitor. There is no central system in which process instances aremonitored, this has to be done by each employee within the company thatis involved in the process.

In a lot of situations information systems participate in a business process,in which case a business process management system (BPMS) can be used.A BPMS is able to use business process models and create instances ofthese models for each process initiation. The advantage of using a BPMS

10

is that the system gives insight into the whole process. The system is ableto monitor each instance of a business process and gives an overview of theactivities that are performed, the time the process takes and its completionor failure.

Business Process Management Systems need executable business models.The models defined in the design phase are often too abstract to be directlyexecuted. Therefore, they need to be implemented in an executable busi-ness process language, such as BPEL[1]. In addition, collaborations betweenbusiness processes can be implemented by using a choreography language,such as CDL[24]. Below, the difference between orchestrations and chore-ographies is explained in more detail.

2.3.1 Orchestration vs. Choreography

An Orchestration describes how services can interact with each other at themessage level, including the business logic and execution order of the in-teractions from the perspective and under control of single endpoint [27].BPEL[1] is an XML-based language that can be used to describe orchestra-tions. The language gives its users the freedom to describe business processesin two ways: executable or abstract. Abstract processes serve a descriptiverole, whereas executable processes are meant to be executed by an executionengine. An example of orchestration can be found in the pizza collaborationexample. The example has two pools, that represent separate business pro-cesses. The activities performed within a pool can be described as a processin an orchestration language.

Choreography is typically associated with the public message exchanges,rules of interaction, and agreements that occur between multiple businessprocess endpoints, rather than a specific business process that is executed bya single party [27]. CDL[24] can be used for describing choreographies usingan XML-based format. CDL allows its users to describe how peer-to-peerparticipants communicate within the choreography. The communicationbetween the two pools in the pizza collaboration example is an example of achoreography, where two participants with two different business processescollaborate with each other.

2.4 Enactment

When the business process models are implemented in the implementationphase, the enactment phase can be started. In this phase the system is usedat runtime, so that each initiation of the process is monitored and coordi-nated by the BPMS. For each initiation of a process, a process instance is

11

created. The BPMS keeps track of the progress within each of the processinstances. The most important tool within the enactment phase is the mon-itoring tool, since it gives an overview of the running and finished processinstances. By keeping track of these instances, problems that occur in aprocess instance can be easily detected.

2.5 Business Process Management System

Several vendors of Business Process Management software solutions offercomplete suites for modelling, managing and monitoring business processes.Inside these systems there is a process execution environment, which is re-sponsible for the enactment phase of the BPM lifecycle [30]. An abstractschema of a typical BPMS is shown in Figure 2.3.

120 3 Business Process Modelling Foundation

This technological problem is also addressed by enterprise application inte-gration systems, where adapter technology is in place to cope with this issue,as discussed in Chapter 2.

In addition, the granularity with which legacy systems provide functional-ity often does not match the granularity required by the business process. Inparticular, legacy systems often realize complex subprocesses rather than in-dividual activities in a business process. Sometimes, the processes realized bylegacy systems and the modelled business processes are not immediately com-parable. These issues have to be taken into account when software interfacesto existing information systems are developed.

One option to solving this problem is developing software interfaces thatmake available the functionality provided by legacy systems with a granularitythat allows reuse of functionality at a finer level of granularity. The granularityshould match the granularity required at the business process level.

Depending on the legacy system, its complexity, software architecture,and documentation, as well as the availability of knowledgeable personnel,the required effort can be very high. If the need for finer-grained granularityand efficient reuse of functionality is sufficiently high, then partial or completereimplementation can be an option.

3.11 Architecture of Process Execution Environments

So far, this chapter has discussed the modelling of different aspects of a busi-ness process. This section looks into the representation of a business processmanagement system capable of controlling the enactment of business processesbased on business process models.

Business Process Environment

Process Engine

Service Provider 1 Service Provider n

Business Process Model

Repository

Business Process

Modeling

. . .

Fig. 3.39. Business process management systems architecture modelFigure 2.3: Schematic representation of a business process management sys-tem [30]

The tools shown in Figure 2.3 provide the following functionality:

• The Business Process Modelling component consists of tools for cre-ating business process models. It often consists of graphical tools fordeveloping the models.

• Business Process Environment is the main component that triggersthe instantiation of process models

• The Business Process Model Repository is a storage facility for storingprocess models as created by the modelling component

• The Process Engine keeps track of the running instances of processmodels. It communicates with service providers in order to executeactivities or receive status updates.

12

• Service Providers are the information systems or humans that com-municate with the process engine. These entities perform the actualactivities and report to the process engine.

2.6 Conclusion

In this chapter we have introduced BPM by considering the phases of theBPM lifecycle. In this section we identify which of the introduced phases,tools and languages are relevant for this work.

The use of standardized languages such as BPMN and BPEL is interestingfor this work, since both design and enactment tools can be placed in thecloud and work with these languages.

In addition, the structure of a BPMS is relevant. The different componentswithin a BPMS and the relation between these components might changewhen a BPMS is moved to the cloud.

13

Chapter 3

Cloud Computing

Cloud computing is one of the trending topics in Computer Science. Manymarket influencing players as Microsoft, Google and Amazon offer cloudcomputing solutions. The goal of this chapter is to introduce cloud com-puting from both a conceptual level and a more concrete level. At first thegeneral benefits and drawbacks of cloud computing are explained briefly.The three common service models are introduced next and for each of theseservice models the specific benefits and drawbacks are identified. After that,four different cloud types are discussed. In the end of the chapter, three pop-ular cloud platforms are introduced in terms of their purposes and structure.

3.1 General benefits and drawbacks

Cloud computing is a model for enabling ubiquitous, convenient, on-demandnetwork access to a shared pool of configurable computing resources thatcan be rapidly provisioned and released with minimal management effort orservice provider interaction [25].

The idea of cloud computing is that users are offered computing resourcesin a pay-per-use manner that are perceived as being unlimited. The cloudprovider does not have any expectations or up-front commitments with theuser and it offers the user elasticity to scale up or down quickly accordingto the user’s needs.

Cloud computing gives organizations several benefits:

• ElasticityInstead of having to buy additional machines, computing resourcescan be reserved and released as needed. This means that there is nounder- or over-provisioning of hardware by the cloud user.

14

• Pay-per-useCloud users are only billed for the resources they use. If a cloud userneeds 20 computers once a week for some computation of one hour,it is only billed for these computing hours. After that the computerscan be released and can be used by other cloud users.

• No hardware maintenanceThe computing resources are maintained by the cloud provider. Thismeans that operational issues such as data redundancy and hardwaremaintenance are attended by the cloud provider instead of the clouduser.

• AvailabilityClouds are accessible over the Internet. This gives cloud users the flex-ibility to access their resources over the Internet. Cloud users are ableto use software or data that is stored in the cloud not only inside theirorganization but everywhere they are provided with Internet access.

There are also drawbacks and threats in using cloud computing:

• SecurityData is stored inside the cloud and accessible through the Internet. Inseveral situations cloud users deal with confidential information thatshould be kept inside the cloud user’s organisation. In these situationscloud computing might not be a good solution, although there aresolutions with cloud computing in which data is stored inside the clouduser’s organisation but applications are hosted in the cloud. There arealso technical solutions for making data unintelligible for unauthorizedpeople, for example, by using encryption algorithms.

• AvailabilityClouds are accessible through the Internet. This gives cloud users thefreedom to work with the services everywhere they have an Internetconnection. The downside is that when the Internet connection fails,for example, on the side of the cloud provider, cloud users are not ableto access their services any more. This might lead to business failures,especially when the services are part of a business process.

• Data transfer bottlenecksUsers that use software systems might need to transfer large amountsof data in order to use the system. Data should be transported notonly from the user to the system, but also to multiple systems in orderto cooperate inside a company. Cloud computing providers do not onlybill the computation and storage services, but also data transportationis measured and billed. For companies that deal with a lot of data,cloud computing may be expensive because of the data transportationcosts. Another problem can be the time it takes to transfer data to

15

the cloud. For example, a company needs to upload a huge amount ofdata in order to perform a complex computation. The data transfermay take more time than the computation itself. In these situationsit might be faster and cheaper to perform the computation inside thepremisses of the cloud user.

3.2 Service models

The National Institute of Standards and Technology (NIST) defines cloudcomputing by means of three service models: Software-as-a-Service (SaaS),Platform-as-a-Service (PaaS) and Infrastructure-as-a-Service (IaaS) [25]. Thethree service models are closely related and can be seen as a layered architec-ture, as shown in Figure 3.1. Each service model is explained in the sequel.For each of the models, the specific benefits and drawbacks are given forboth the user and the provider of the service models.

Software as a Service

Examples: Facebook, Youtube, Gmail

Platform as a Service

Examples: Windows Azure, Google AppEngine, Force.com

Infrastructure as a Service

Examples: Amazon EC2, GoGrid

Application (Applications, Webservices)

Platform (Software Frameworks, Storage providers)

Infrastructure (Computation, Storage)

Hardware (CPU, Memory, Disk, Bandwidth)

Figure 3.1: An overview of the layers in cloud computing based on [32]

3.2.1 Infrastructure as a Service (IaaS)

Infrastructure as a Service is the lowest layer in the cloud computing stack.As shown in Figure 3.1, IaaS combines two layers: the hardware layer andthe infrastructure layer. IaaS users are interested in using hardware re-sources such as CPU power or disk storage. Instead of directly offeringthese services to the user, IaaS providers provide users with a virtualizationplatform. Customers need to install and configure a virtual machine, whichruns on the hardware of the cloud provider. In this model, cloud users areresponsible for their virtual machine and cloud providers are responsible forthe actual hardware. Issues such as data replication and hardware main-tenance are addressed by the cloud provider, while the management of thevirtual machine is performed by the cloud user.

16

Benefits of IaaS for cloud users are:

• Scalable infrastructureThe biggest advantage of IaaS is the elasticity of the service. Insteadof having to buy servers, software and data centre capabilities, usersrent these resources on a pay-per-use manner. In situations where theworkload of computer resources fluctuates, IaaS might be a good so-lution. For example, consider a movie company that uses servers forrendering 3D effects. The company has a small data centre on-premisewhich is used for the rendering, once a week. The rendering of onescene takes 50 hours when calculated on 1 machine. By scaling up to50 machines, the rendering of the scene would take 1 hour. Scalingup the internal network of the company might be an expensive oper-ation considering the installation and maintenance of the machines,especially when the servers are only used for rendering once a week.Instead of buying these machines, one might consider to rent the ma-chines and only pay for the rendering operation once a week.

• PortabilitySince IaaS works with virtual machine images, porting an on-premisesystem to the cloud or porting a virtual machine from one cloud toanother can be easy. This, however, depends on the virtual machineimage format that is used by the cloud provider.

Drawbacks of IaaS for cloud users are:

• Virtual machine managementAlthough cloud users do not have to manage the rented computerhardware, cloud users are still responsible for the installation and con-figuration of their virtual machine. A cloud user still needs expertsinside its company for the management of these virtual servers.

• Manual scalabilityIaaS does no offer automated scalability to applications. Users are ableto run virtual machines and might boot several instances of virtualmachines in order to scale up to their needs. Collaboration betweenthe virtual machines has to be coordinated and programmed by thecloud user.

Benefits for IaaS cloud providers are:

• Focus on hardwareCloud providers are mainly focused on hardware related issues. Every-thing that is software related, such as database management, threadingand caching needs to be performed by the cloud user.

• Exploiting internal structureSeveral providers are able to offer cloud computing services as an ex-

17

tension to their core business. For example, the Amazon infrastructurestack was built for hosting Amazon’s services. By offering this infras-tructure as a service, Amazon is able to exploit infrastructure and offera new service to its customers at low cost.

Drawbacks for IaaS cloud providers are:

• Under- and overprovisioningCloud providers have to offer their resources as if they are unlimited tothe cloud user. This means that a cloud provider needs to own enoughresources in order to fulfil the needs of a cloud user. These needs,however, may vary every time. Underprovisioning of a data centrecauses that a cloud user might not be able to obtain the resourcesit asks for, since the cloud provider does not have enough machinesavailable. Overprovisioning is extremely expensive, since servers arebought and maintained, but are not used.

3.2.2 Platform as a Service (PaaS)

Platform as a Service is a service model in which users are offered a platformon which they can develop and deploy their applications. The platform offerssupport for using resources from the underlying infrastructure. Platformsare mostly built for a certain domain, e.g., development of web applications,and are programming language-dependent.

There are several cloud platforms available nowadays. Microsoft offers theWindows Azure platform, which can be used for developing (web) applica-tions and services based on the .NET framework. Google’s App Engine isa platform for the development and deployment of Python or Java-based(web) applications.

Benefits of PaaS for cloud users are:

• Development platformPaaS offers cloud users a platform on which they can manage and de-ploy their applications. Instead of having to manage important issuessuch as scalability, load balancing and data management, cloud usershandle these issues by using services offered by the platform.

• No hardware and server management neededCustomers can deploy applications relatively easily on the platform,since no network administrators are necessary for installing and main-taining servers or virtual machines.

Drawbacks of PaaS for cloud users are:

18

• Forced to solutions in the cloudPaaS offers combinations of services. For example, Windows Azureprovides users with a .NET environment. The platform offers supportfor databases in the form of SQL Azure. Application developers canchoose to use a different database, but have to perform difficult op-erations to install these services on the platform, or have to host thedatabase by a third party. PaaS users are more or less forced to usethe solutions that are offered by the cloud provider in order to get thefull benefits from the platform.

Benefits for PaaS cloud providers are:

• Focus on infrastructure and platformThe software that runs on the platform is managed by the cloud userand the cloud provider is responsible for the infrastructure and theplatform.

Drawbacks for PaaS cloud providers are:

• Platform developmentThe platform that is offered by the cloud provider is a piece of soft-ware. Complex software is needed to offer services such as automaticscalability and data replication. Faults in the platform can lead to fail-ure of customer applications, so the platform has to be fault tolerantand stable.

3.2.3 Software as a Service (SaaS)

With Software as a Service, cloud providers offer an application that isdeployed on a cloud platform. Users of the application access the applicationthrough the Internet, often using a browser. One of the benefits of SaaS isthat cloud providers are able to manage their software from inside theircompany. Software is not installed on the computers of the cloud users, butinstead runs on the servers of the cloud provider. When a fault is detectedin the software, this can be easily fixed by repairing the software on theserver, instead of having to distribute an update to all the users.

There are several examples of Software as a Service. For example, Googleoffers several web applications, such as Gmail and Google Docs. Both appli-cations are offered as an online service. Another example is SalesForce.com,which offers CRM online solutions as a service.

Benefits of SaaS for cloud users are:

• Pay-per-useInstead of having to purchase a license for each user of an application,

19

organizations are billed based on the usage of the software. A coupleof years ago software was often purchased on a pay-per-license base.Network administrators had to install applications on the workstationsof a cloud user’s company and for each application instance the clouduser had to pay for a license, even if the user of a particular workstationdid not use the application. With pay-per-use, cloud users pay onlyfor the users and the usage time of the application.

• UpdatesApplications in the cloud are managed by a cloud provider. The cloudprovider is able to perform updates to the software directly in thecloud. Instead of having to distribute updates to the cloud user, theusers always work with the most actual version since they access theapplication in the cloud.

Drawbacks of SaaS for cloud users are:

• Data lock-inData lock-in is one of the typical problems of SaaS. In case cloudusers decide to work with another application, offered by a differentprovider, it might be hard to move the data to this other application.Not every application provider stores data in a standardized way andinterfaces for retrieving all the data from an application may not beavailable.

Benefits for SaaS cloud providers are:

• MaintenanceMaintenance can be directly performed in the cloud application itself.Updates do not have to be distributed to the cloud users but aredirectly applied upon the software in the cloud.

Drawbacks for SaaS cloud providers are:

• Infrastructure neededIn traditional software deployment, software is shipped to the user.The hardware on which the application is installed is managed by theuser. With cloud computing, the software runs on servers of the cloudprovider. This means that cloud providers have to perform infrastruc-ture maintenance, or they have to rent infrastructure or a platform forhosting their applications.

• ResponsibilityApplications that run in the cloud are managed by the SaaS provider.When the application in the cloud is not accessible or not working anymore because of erroneous updates or changes in the software, cloudusers are not able to work with the software any more. It is a big

20

responsibility for cloud providers to make sure the software is kept upand running.

3.3 Cloud types

The cloud types identified in [23][25] are discussed below.

3.3.1 Public Cloud

A public cloud is provisioned for exclusive use by the general public. Cloudusers access the cloud through the Internet. Public clouds are widely avail-able nowadays, for example companies as Microsoft, Google and Amazonoffer public cloud computing services. The biggest benefit of public clouds isthat the management of the servers is provided by the third-party provider.Users just pay for the usage of the cloud and issues as scalability and repli-cation are handled by the cloud provider.

3.3.2 Private Cloud

Private clouds are for exclusive use of a single organization. Private cloudscan be hosted inside or outside the cloud user’s organisation and managedby the cloud user’s organisation itself or by a third-party provider. Thisform of cloud computing can be used when cloud users have to deal withstrict security concerns, where data has to be hosted inside the cloud user’sorganization itself.

3.3.3 Hybrid Cloud

Hybrid clouds are created by combining a private and a public cloud. Withhybrid clouds, organizations can choose to store their critical data insidethe company using a private cloud, while the less critical data and servicescan be stored in the public cloud. The hybrid cloud approach benefits fromthe advantages of both public and private clouds. Scalability is maintained,since the public cloud is used for offering the services, while data security ismaintained by storing critical data in the private cloud.

3.3.4 Community Cloud

A community cloud is available for a specific community. Several companiesthat deal with the same concerns may decide to host their services together,

21

in order to collaborate. Community clouds can be managed by one or moreorganizations within the community, but the cloud may alternatively behosted by a third-party provider.

3.4 Cloud providers

Several companies offer cloud services to customers. Below we give anoverview of the three most important players in the cloud market and theirofferings.

3.4.1 Amazon

Amazon was one of the first companies to offer public cloud solutions to orga-nizations. At first, Amazon focused on offering IaaS solutions, but through-out the years several new services were introduced, which made the AmazonWeb Services platform richer and more than just an IaaS solution. Belowwe briefly discuss some of the these services.

Amazon EC2

Amazon Elastic Compute Cloud (EC2) [2] is the computational service ofAmazon. EC2 offers computational resources to its users, such as CPUpower and memory. In order to use EC2, a customer has to install a customvirtual machine in the form of an Amazon Machine Image (AMI) in anAmazon Virtual Server. Amazon offers pre-installed images, with severaladditional software such as LAMP and Hadoop [29][13]. It is also possiblefor a customer to upload their own virtual machine image. Virtual machinescan be instantiated by the user and the user is free to bootup multipleinstances of a virtual machine to enlarge the computational power. Newinstances can be launched within 2 minutes. The billing of EC2 services isbased upon the number of instantiated virtual machines per hour.

Amazon S3

Amazon S3 stands for Simple Storage Service [4]. S3 offers a storage serviceto customers in which they can store objects up to 5 Tb. To organize thestorage, data needs to be stored in buckets with a unique name. Each usergets 100 buckets. The number of objects inside a bucket is unlimited, theonly restriction within a bucket is that each object needs to have a uniquename. Each object that is uploaded to the service is directly replicated

22

multiple times. Objects can either be private or public. In this way userscan share files over the Internet by making the corresponding object public.Amazon offers several interfaces for accessing objects. Currently supportedinterfaces are REST, SOAP and BitTorrent.

Amazon SQS

Amazon SQS stands for Amazon Simple Queue Service[3]. The goal ofthe queue service is to provide a mechanism for exchanging messages be-tween computers. Developers are able to move data from one componentto another by adding data to the queue of a component, instead of directlycommunicating with the component.

Amazon SimpleDB

SimpleDB [5] is a flexible and scalable non-relational data store. SimpleDBoffers to developers a datastore in which they can store and query structureddata by using webservice requests. Data is stored in domains. A domain isa collection of similar data items, comparable to a relational table, althoughSimpleDB does not offer relational storage. Inside a domain, data is storedin collections of key-value pairs. A key-value pair can contain multiple valuesand has a maximum of 1024 bytes per value. An entity in a domain canhave at maximum 256 key-value pairs. Data stored inside the datastore isautomatically indexed. This leads to fast query performance when queryingthe datastore.

3.4.2 Microsoft Windows Azure

Microsoft’s solution for cloud computing is called Windows Azure [11][12].Windows Azure is a public cloud platform that consists of a group of cloudtechnologies, each providing services to application developers. The plat-form consists of four parts: Windows Azure, SQL Azure, Windows AzureAppFabric and Windows Azure Marketplace. These parts are briefly ex-plained below. An overview of the platform is shown in Figure 3.2.

Windows Azure

The main goal of the Windows Azure part of the Windows Azure platformis to run applications and store data in the cloud. As shown in Figure 3.2,The Windows Azure block consists of 5 parts.

23

Figure 3.2: Overview of Microsoft Windows Azure platform

The Compute part presents the computing service of the cloud platform.The service is able to run .NET applications using a Windows Server foun-dation. Applications that are built on the compute service are structuredby using one or more roles. There are 3 possible roles inside the computeservice: web roles, worker roles and VM roles. A web role can be usedfor running web applications. Each web role instance is provided with apreconfigured IIS installation on which the applications can be deployed.Worker roles are intended for processing operations in the background. Of-ten web roles delegate tasks to worker roles in order to execute complexoperations. VM roles are the most custom roles. Users need to upload aWindows Server 2008 image as a custom VM role. This solution is oftenuseful when an on-premise service is moved to the cloud.

The Storage part contains services for storing data. Three storage servicesare provided: blobs, tables and queues. Blobs are binary large objects andare stored in containers. Blobs are useful for storing unstructured data,such as images. In case of structured data, tables can be used. Tablesare not to be confused with relational tables, as used in relational databasesystems, such as SQL Azure. Tables are represented as a set of entitieswith properties. There is no defined schema for a table. Developers can

24

query data from tables by using a simple query language defined by OData.Queues are not used for the storage of data, but for passing data from onerole to another.

The Fabric controller manages the machines that are available within a datacentre. The goal of the controller is to divide the jobs and roles over theavailable machines and to manage data storage and data replication on thesemachines.

Connect is an integration service for the integration of on-premise serviceswith the Azure platform. For example, a company wants to move an ex-isting application to the cloud but decides to keep their database systemon-premise. Connect helps organizations to deal with these situations.

The Content Delivery Network service is used for improving performanceon data access. The service keeps track of often used data and caches it atsites closer to the clients that use it.

SQL Azure

SQL Azure is Microsoft’s cloud-based database solution and is based on theMicrosoft SQL server. It does not only consist of a database solution, butit also offers services for reporting and data synchronization. SQL AzureReporting provides users with data reports, based on the stored data in thedatabase. Created reports can be published in a reporting portal, in whichusers can access the reports. The Data synchronization service provides amechanism for synchronization of multiple databases, for both on- and off-premise databases. The synchronization service is based on the MicrosoftSync Framework.

Windows Azure AppFabric

The AppFabric can be seen as a layer above the Windows Azure part. App-Fabric adds support for application access, access control and caching.

The first component inside the AppFabric is a service bus. The service busoffers clients a registry in which they can publish their services. Other clientsare able to find and use services, by querying the registry.

Often, applications make use of digital identities. Examples of these identi-ties are Active Directory, Facebook, Google accounts or Windows Live ID.AppFabric offers support for several different digital identities in such a waythat users can use them relatively easily within their applications.

25

AppFabric also offers a caching service for caching frequently accessed in-formation. The service is used for reducing the number of database queries,which leads to better performance of the overall platform.

Windows Azure Marketplace

In the marketplace, Microsoft offers existing cloud applications and clouddata. The marketplace has been split up into two parts: DataMarket andAppMarket. The DataMarket offers organizations with datasets. Organiza-tions can browse through the offered sets and purchase the data they needfor their applications. The AppMarket can be used for companies to sharetheir cloud applications, so that other organizations can use and integratethem into their own applications.

3.4.3 Google App Engine

Google App Engine [19] is Google’s cloud platform. The platform lets cus-tomers run web applications on Google’s infrastructure. The platform cur-rently supports three programming languages: Python, Java and Go.

The App Engine was developed with two goals in mind [16]: (1) to reducethe time for deployment for web application developers. Traditionally, de-velopers do not only spend a lot of time programming their application,but deployment and maintenance of applications also takes its share. Byproviding users with an already configured infrastructure and several pro-gramming API’s, developers can reduce their deployment and maintenancecosts. (2) to provide developers with a system that scales automatically tothe needs of the application. Users are able to let their application scalefrom a simple experimental project towards a fully mature software appli-cation. Scaling is performed automatically by the App Engine, but withindefinable boundaries.

Sandboxed applications

Web applications run in a sandboxed environment. Users are provided withlimited access to the underlying operating system. In this way Google is ableto distribute the web requests across multiple servers. This limited accessleads to several restrictions for the developer:

• Applications are only able to communicate with other computers usingthe provided URL fetch or email services.

26

• External applications and computers can only access the web applica-tion via HTTP/HTTPS requests on the standard ports.

• Web applications can not write to the file system. Instead Googleoffers a datastore and a memory cache service.

• An application can only execute code during a request. As soon as arequest comes in, the application is able to start processing data untilthe response is sent back to the user. The maximum request-responsetime is 60 seconds.

Datastore

Google offers developers a distributed data storage service called BigTable[10]. BigTable is not a traditional relational database, since it is column-based instead of row-based. The table itself is schemaless and consists ofentities of a certain kind and a set of properties.

The datastore can be queried by using Google’s query language GQL [18],which is an advanced query language optimized for retrieving entities fromthe datastore.

Services

Google App Engine offers several services that can be used by cloud user’s’applications, like a mail service for sending mails to customers, an URL Fetchservice for accessing resources on the internet or communicating with otherservices, a memory cache that provides the developer with high performancekey-value caching.

Applications in the Google App Engine run only during the request-responsecycle. However, this can be circumvented by scheduling tasks using the Cronservice that is available on the App Engine platform.

Administration console

In order to manage the web application, Google offers an administrationconsole to developers [17]. The console is an online dashboard in which theuser can logon to monitor and configure the applications that are runningin the cloud. The console gives the user the opportunity to monitor theperformance of the application and also to view and edit the data that isstored in the datastore.

27

3.5 Conclusion

Three popular cloud platforms have been introduced in this chapter. Allthree the platforms are public clouds, but the strategies of the platformsdiffer. The services offered by Amazon are mainly on the IaaS level.

Microsoft’s Windows Azure and Google App Engine are both based on thePaaS service model. Google App Engine focuses on web applications andoffers a SDK for creating cloud based web applications. Applications canonly execute code during a web request, which has a maximum executiontime of 60 seconds. Compute-intensive operations that need more time toexecute can only be executed by scheduling them as a task in the Cronservice.

Windows Azure is more than a web application platform. In addition toGoogle App Engine, Azure can also host non-web applications and purecomputational applications. Windows Azure has support for custom VMroles, in which cloud users can install and manage their own Windows basedVM. By offering this service, Microsoft does offer partial IaaS functionality.

From a database point of view, both Google and Amazon offer schemalesscolumn-based storage to the cloud user, whereas Microsoft offers a relationalrow-based database solution.

Choosing between the presented cloud providers is difficult and depends onthe needs of a user. When a cloud user needs full control over the serversettings, an IaaS solution is necessary. When a cloud user, wants to deploysoftware on top of a platform, a platform can be chosen based upon theprogramming language and database needs of a cloud user.

28

Chapter 4

BPM and cloud computing

In this chapter we investigate the state-of-the-art of the combination of BPMsoftware and cloud computing, by discussing relevant scientific literature andsome existing commercial products.

4.1 General benefits and drawbacks

Cloud-based BPM gives cloud users the opportunity to use cloud softwarein a pay-per-use manner, instead of having to make upfront investments onBPM software, hardware and maintenance [22]. Systems scale up and downaccording to the cloud users needs, which means that the user does not haveto worry about over-provisioning or under-provisioning.

BPM in the cloud has also several downsides. By putting a BPMS in thecloud, cloud users might lose control over their sensitive data. The efficiencyand effectiveness of activities that are not computation-intensive may notincrease by placing these activities in the cloud, but on the contrary, theseactivities may become more expensive. For example, an activity that is notcomputation-intensive might need to process a certain amount of data. Thetransfer of the data to the cloud might take longer than when the activityis executed on-premise, besides, the costs of the activity may increase sincedata transfer is one of the billing factors of cloud computing.

4.2 Service Models

In [6], moving an existing application from on-premise to each of the servicemodels is considered. In the analysis, the application is supposed to beorchestrated by a BPEL specification and executed by a BPEL execution

29

engine. The requirements and challenges that have to be faced when movingan application to each of the service models are briefly discussed below, basedon [6].

4.2.1 IaaS

When an application is moved to the IaaS service model, the cloud user isresponsible for the operating system, the middleware and the applicationsrunning in the virtual machine, as shown in Figure 4.1. Installing BPMsoftware in an IaaS cloud solution is therefore comparable to installing BPMsoftware on-premise, since everything except the hardware is managed bythe cloud user. In addition, the cloud user has to take certain securitymeasures to secure the system from intruders. Possible security measuresare blocking ports, enforcing access control policies and keeping the softwareand operating systems up to date.

SaaS (Salesforce)

PaaS (Google's App Engine)Flexibility for

Customer

Complexity for

Provider

IaaS (Amazon EC2)

Figure 1. Overview of delivery models

selection of the middleware and application components.By contrast the complexity for the provider decreasesfrom SaaS (highest) to IaaS (lowest). The comparison ofthe different delivery models will be discussed in moredetail in Section 4.

Right now there is no approach enabling executionof business processes through cloud computing, becausethe delivery models are not geared specifically at offeringBPM in the cloud and that is where our approach kicksin.

3. Running Example

We introduce the example of a fictional travel agencynamed Paradise Travel, which offers travel bookingthrough a Web interface to potential customers. Theoffering of Paradise Travel comprises complete travelbooking including transport to holiday resort, hotel book-ing as well as optional car rental service. The wholetravel booking process is modeled as business process inBPEL. External partners involved in the business process,e.g. several car rental agencies located at the differentavailable travel destinations as well as the credit inves-tigation company for checking solvency of customerbefore credit card payment, are contacted through Webservices.

Currently the infrastructure and the platform requiredfor Paradise Travel, e.g. an application server includingdeployed Web services as well as the business processengine including deployed travel booking process, arehosted in an enterprise datacenter owned and maintainedby the Paradise Travel company (on-premise).

4. BPEL in the cloud

In this section we will take a look on how to outsourceseveral parts of Paradise Travel’s business by applyingthe delivery models IaaS, PaaS and SaaS. Along theway we will consistently refer to the example describedbefore.

4.1. IaaS

Although outsourcing the hosting and maintenanceof the infrastructure required for the business of ParadiseTravel to an IaaS provider reduces the costs for the travelbooking company, the setup of the application serverand business process engine as well as thedeployment ofWeb services and travel booking business process is stillthe task of Paradise Travel. So applying IaaS is the firststep to concentrate on the core business.

4.1.1. Providing BPEL through IaaS. In terms of therequirements for a BPEL engine, IaaS (Figure 2) is veryclose to the traditional on-premise model. Customershave to make the same decisions regarding the instal-lation of software such as operating system, platformmiddleware and application. Of course, this decisionsmust comprise security considerations such as blockingout attackers by locking ports, patching the operatingsystem, running an anti-virus software, etc., as well asconfiguration and enforcement of access control policies.

CustomerApplications BPEL Processes Process Models

Process

Instances

Middleware BPEL Engine DBMS

Provider

OS

ProviderHardware

Figure 2. IaaS

In essence, IaaS realizes on-premise in the cloud bymoving the responsibilities for hosting the infrastruc-ture (e.g. hardware) from their own datacenters to anoutsourcing provider. Thus, there are no special require-ments or challenges that must be solved to be able toprovide BPEL through IaaS, except the installation of aBPEL engine.

4.2. PaaS

Applying PaaS for outsourcing most of the tasks notpart of the core business of Paradise Travel, e.g. hostingand maintenance of infrastructure as well as platformmiddleware, reduces the complexity for Paradise Travelcompared to IaaS. On the one hand it is still the task ofthe travel booking company to provide the Web servicesto be deployed on the application server as well as thetravel booking process to be deployed on the processengine. On the other hand this is not only a disadvantagebut also the possibility to keep a part of the flexibility ofthe travel booking company by enabling the adaption ofthe business process, in case the needs of the customers

672

Figure 4.1: Responsibilities for cloud users and cloud providers when aworkflow based application is moved to the IaaS service model [6]

4.2.2 PaaS

By placing a workflow-based application on the PaaS service model, theresponsibilities for both the cloud provider and cloud user change, as shownin Figure 4.2. The execution engine is assumed to be part of the platform inthis case and is offered by the provider. Users need to upload their processesin order to run them in the cloud. The engine can be used by multiple users,since the platform is shared. The responsibility of data storage and datamanagement is no longer in hands of the cloud user, which leads to severalsecurity issues.

The following three requirements need to be realized in order to offer asecure BPEL engine on PaaS:

30

concerning travel booking changed or in case the agencyproviding hotel information and hotel reservation is nolonger available and another provider has to be found.

4.2.1. Providing BPEL through PaaS. In contradic-tion to the IaaS deployment model, PaaS providers hosthardware, operating system and platform middleware(Figure 3) such as a BPEL engine and a database man-agement system (DBMS).

CustomerApplications BPEL Processes Process Models

Process

Instances

ProviderMiddleware BPEL Engine DBMS

OS

Hardware

Figure 3. PaaS

Because everything except the business processmodel is hosted, the customer must trust his provideron the basis of external audits or security certificates,having expertise in protecting the system at hardwareand operating system level.

Although a customer may trust the provider at thehardware or operating system level, one of the mainshow-stopper is the perceived lack of data confidentialityby means of data loss or even sellout of customer data.For example the on-demand cloud computing serviceFlexiScale6 has been offline for several days because anemployee accidentally deleted one of the main storagevolumes [23].

Two main distinctions between the forms of datahosted at an outsourcing provider must be made: Thefirst form is data, which describes the business process it-self such as process models. The second form is the datathat is processed by the business processes, such as cus-tomer information and order processing data. Becauseadministrators can simply gain access to the businessprocess models or the business process instance data inthe underlying databases, it becomes easier to “steal” theassets of the enterprise.

To increase customers’ confidence and trust in out-sourcing and the provider itself, unauthorized disclosuremust be impossible by design. Therefore we propose anarchitecture of a secure BPEL engine which ensures thatthe information about an asset of the enterprise such asprocess models and customer data is not inexpedientlyused. In the following section the security requirementsto prevent unauthorized disclosure of data are analyzedand a possible realization in terms of requirements to a

6http://www.flexiscale.com/

secure BPEL engine architecture is presented.

4.2.2. Requirements and Challenges. Business pro-cess management (BPM) is a evolutionary process thatinvolves modeling, implementation, execution, moni-toring, assessment and re-design of business processes.When outsourcing processes to a PaaS provider, theprovider is responsible for the execution and partiallymonitoring of these processes. Before a business processcan be executed, the process model has to be deployed tothe provider’s engine. The deployment process includesuploading of the process model and its final installationto the process engine. To ensure confidentiality andintegrity of the enterprise assets that are implicitly re-flected by the process model and process instance it isnecessary to realize the following security requirements:

• It must not be possible to read the process modelfor somebody who gets hold of a process modeldescription file such as a BPEL file.

• It must not be possible to alter the process model(and if yes, find out that it has been done)

• It must not be possible to deploy the process modelon another (maybe corrupted) engine

Therefore a means to encrypt and sign parts of pro-cesses or complete processes is needed. These kinds ofprocesses are further referred as obfuscated processes.Execution of an obfuscated process requires a modifiedprocess engine that implements additional features suchas a public key infrastructure (PKI). Thus, each enginehas a unique private key and only accepts obfuscatedprocesses that are encrypted or signed using one of theprovided public keys. Others are rejected. The privatekey of the engine is unknown even to the administrator.

Employing a PKI, the engine must provide the neces-sary interface to retrieve the public key that can then beused to encrypt the process, or publish it to a certifica-tion authority such as VeriSign7, which can furthermorecertify that the provider uses a secure BPEL engine. Al-though the PKI implicitly protects processes of beingdeployed to an unintended engine, it does not yet detectpossible tampering with the engine through the provider.It must not be possible to corrupt the process or engineto log information in an undesired way. Therefore thecompliant engine must be self-signed with its private keyto detect modifications to its source.

Since BPEL is based on XML we propose to usethe two W3C Recommendations XML-Encryption [30]and XML-Signature [31] to ensure secrecy, integrityand authenticity in BPEL processes. To apply these

7http://www.verisign.com/

673

Figure 4.2: Responsibilities for cloud users and cloud providers when anworkflow based application is moved to the PaaS service model [6]

• Process models should not be readable for intruders who get in pos-session of a process model description file;

• Process models cannot be altered by intruders;

• Process models cannot be deployed on other engines by intruders.

In order to achieve these requirements, process model descriptions have tobe encrypted and signed. Encryption ensures that the process models arenot readable for intruders. By signing, one can ensure that a file is only validfor one particular execution engine and that porting to another executionengine leads to failure.

Database storage may also be a problem. Often, BPEL engines deploy pro-cesses by reading process model description files and creating relational datain tables. Not only the process model itself, but also the process instanceinformation is stored in the database. The data in these databases has alsoto be encrypted in order to be unreadable for intruders. The problem ofdata encryption in databases is that it leads to a restriction of query expres-siveness with respect to relational operators. For example, performing joinoperations might be difficult when values in the database are encrypted [6].

4.2.3 SaaS

By moving the application to the SaaS service model, the cloud provider isnow also responsible for the application itself. The application is no longeran asset of the cloud user’s enterprise, but is offered by the cloud provider,as shown in Figure 4.3. The application can be offered to multiple cloudusers using a single-tenant or multi-tenant architecture. In a single-tenantarchitecture a new BPEL engine is installed for each process model. Ina multi-tenant architecture, multiple cloud users and process models are

31

served by one BPEL engine. The data that is stored by the cloud providershould be secured to prevent unintended access by the SaaS provider orby other cloud users. The security measures, as explained in the previoussection, can be applied to resolve this issue.

tem, platform middleware, and even the application itself(Figure 4). At this point a paradigm shift can be observed.The process no longer represents an asset of the tenant’senterprise and is even not visible at all to the tenant.

Customer

ProviderApplications BPEL Processes Process Models

Process

Instances

Middleware BPEL Engine DBMS

OS

Hardware

Figure 4. SaaS

By serving an application to multiple customers ithas to be distinguished between single-tenant and multi-tenant architectures. When providing BPEL throughSaaS, a single-tenant architecture implies the installationof one BPEL engine (and DBMS) not only for eachtenant but also for each process model. By relying on amulti-tenant architecture, it is possible to serve multipletenants with a single BPEL engine (and DBMS) hostingmultiple business processes. In both scenarios a tenantrequest triggers the creation and execution of an instanceof the requested business process. Thus the accumulatingdata of the respective tenants must be protected againstunintended access by the SaaS provider as well as othertenants if using a multi-tenant architecture.

While the first security requirement can be easilysolved by reusing the proposed modifications to BPELengines as described in Section 4.2.2, the protectionof information against other tenants sharing the sameresources needs to be further analyzed.

4.3.2. Requirements and Challenges. Multi-tenant ar-chitectures [7] distinguish between three approaches torealize multi-tenancy at data level. These approachescan be further distinguished by means of the degree ofdata isolation reaching from separate databases to sep-arate or shared schema in a common database. Whiledata isolation is very important it can be observed thatit correlates with the required system resources [12] interms of memory-footprint, used disk space, CPU usageand socket allocation. Utilizing the separate databasesapproach the data of each tenant is physical isolated.The separate schemas approach ensures logical isolationof each tenant’s data. Both approaches are applicablefor using existing database access control mechanisms.Therefore the BPEL engine must be extended to estab-lish a tenant context when a process instance is beingcreated. A tenant context is defined analogous to an au-

thentication context [25], encapsulating the identity of atenant as well as a reference to the schema location. Theengine must be able to map the tenant’s context to theappropriate authenticators for database access. The navi-gator component of the BPEL engine has to be modifiedto use this context for any database operation.

However, this solution is not directly applicable if thetenants share the same schema. As the same tables areused for the data of multiple tenants, there is neither alogical nor physical separation. Assumed that a BPELprocess includes a weak correlation mechanism it ispossible that requests to process instances sharing thesame correlation values get mixed up and responsescontaining confidential data are exposed to unintendedtenants. A straight forward solution would be to add acolumn containing a tenant identifier [7] to each table. Inthe ideal case a DBMS is able to define access control atthe level of rows by means of tenant identifiers. Becausethis kind of functionality would restrict the set of suitableDBMS too much, the concept of a tenant context asdiscussed above is needed. Furthermore the navigatorcomponent has to be extended to restrict the possiblequery results to the current tenant.

Another solution to the problem of serving multiplecustomers is to deviate from the multi-tenancy approachand dynamically redeploy a business process for everytenant. Thus a process instance and its correspondingprocess model are bound to one single tenant, still shar-ing the same resources at the DBMS level. Then thereis no longer a risk to expose data to other tenants bymistake, because now each process has its own endpointwhich serves as a unique identifier to distinguish theprocess instance information.

5. Analysis and First Evaluations with Ex-isting Engines

Given the detailed analysis above of the different re-quirements for BPEL engines regarding the differentdelivery models we give an overview of the findingsand a deeper analysis of what is required from future en-gines to support all kinds of delivery model. Today, opensource and commercial BPEL engines such as ApacheODE, Active BPEL9, IBM WebSphere Process Serveror Oracle’s BPEL engine and others are not explicitlydeveloped with multi-tenancy in mind. They are moregeared towards the on-premise market. We have madefirst experiments in combining the Apache ODE and Ac-tiveBPEL engines and IaaS. Therefore we installed thoseengines on Amazon EC210. We bundled respective Ama-zon EC2 images that allow us to quickly start up servers

9http://www.activebpel.org10http://aws.amazon.com/ec2/

675

Figure 4.3: Responsibilities for cloud users and cloud providers when anworkflow based application is moved to the SaaS service model [6]

In a multi-tenant architecture, multiple cloud users use the same BPELengine. The data used by one cloud user should not be accessible by othercloud users. As a solution, one can choose to create databases for each clouduser, or to add a column to each database table where the identifier thatuniquely identifies the user is stored.

4.3 Combining on-premise and cloud

Privacy protection is one of the barriers for performing BPM in the cloudenvironment. Not all users want to put their sensitive data in the cloud.Another issue is efficiency. Compute-intensive activities can benefit fromthe cloud because of the scalability of the cloud. Non-compute-intensiveactivities, however, do not always benefit from cloud computing. The per-formance of an activity that is running on-premise might be higher than inthe cloud because of data that needs to be transferred to the cloud first inorder to perform the activity. These activities can also make cloud com-puting expensive, since data transfer is one of the billing factors of cloudcomputing.

32

4.3.1 Architecture

In most BPM solutions nowadays, the process engine, the activities andprocess data are placed on the same side, this is either on-premise or thecloud. The authors of [22] investigate the distribution possibilities of BPMin the cloud by introducing a PAD model, in which the process engine,the activities involved in a process and the data involved in a process areseparately distributed, as shown in Figure 4.4. In this figure, P stands for theprocess enactment engine, which is responsible for activating and monitoringall the activities, A stands for activities that need to be performed in abusiness process, and D stands for the storage of data that is involved inthe business process. By making the distinction between the process engine,the activities and the data, cloud users gain the flexibility to place activitiesthat are not computation-intensive and sensitive data at the user-end sideand all the other activities and non-sensitive data in the cloud.

The PAD model, introduced in [22], defines four possible distribution pat-terns. The first pattern is the traditional BPM solution where everything isdistributed at the user-end. The second pattern is useful when a user alreadyhas a BPM system on the user-end, but the computation-intensive activitiesare placed in the cloud to increase their performance. The third pattern isuseful for users who do not have a BPM system yet, so that a cloud-basedBPM system can be utilised in a pay-per-use manner and non-computation-intensive activities and sensitive data can be placed at the user-end. Thefourth pattern is the cloud-based BPM pattern in which all elements areplaced in the cloud.

Yan-Bo Han et al.: A Cloud-Based BPM Architecture 1159

BPM, some problems should be made clear, for exam-ple, where to enact processes, where to execute activ-ities, as well as where to store the data produced andconsumed by activities. In this regard, we need to in-vestigate architecture patterns and identify an optimalpattern to utilize resources on both sides syntheticallyand sufficiently.

3.1 Design Tradeoff

Every coin has two sides. There is a contradictionbetween cloud computing and user-end distribution. Ifuser-end distribution is allowed, then some advantagesof cloud computing such as zero installation may be sac-rificed. Thus, design tradeoff is inevitable when build-ing a cloud-based BPM system supporting user-end dis-tribution.

To study the problem of user-end distribution ana-lytically and synthetically, we propose a PAD (Process-enactment, Activity execution and Data storage) modelwhich describes BPM architectures from the distri-bution of three independent functions. Fig.1 demon-strates the candidate architectures according to thePAD model.

These architectures can be classified into four princi-pal types: traditional standalone BPM, user-end BPMwith cloud-side distribution, cloud-based BPM withuser-end distribution and exiting cloud-based BPM.

Fig.1. Different patterns of BPM constellation.

Pattern 1 is the traditional architecture, which dis-tributes everything at user-end.

Pattern 4 is adopted by existing cloud-based BPMs,which distributes everything on cloud. The advantageis that users do not have to install anything at theuser-end. However, users of this kind of system will

lose control of their data and cannot utilize user-endresources.

For users who already have full-fledged BPM enginesat their-end, Pattern 2 can be a good choice. They justneed to distribute some compute-intensive activities tocloud-side for acquiring stronger capabilities and betterperformance.

For users who do not have their own full-fledgedBPM engines at user-end, the ideal style is Pattern 3.Process engine is on cloud-side, but process designerscan specify their distribution requirements of activityexecution and data storage. For example, sensitive dataand non-compute-intensive activities can be distributedat user-end, and compute-intensive activities and non-sensitive data can be distributed on cloud-side.

3.2 Separation of Control Data and BusinessData

In traditional centralized BPM systems, data shouldbe stored in a place, where the process engine can haveeasy access. Therefore, if we deploy them into clouddirectly, all business data related to certain processshould be stored on cloud-side, which violates our inten-tion of protecting data privacy. To protect the sensitivedata, a novel BPM architecture supporting Pattern 2 orPattern 3 should be in place to decouple process engineand business data.

Typically, a business process is composed of activi-ties or tasks that involve people, services, and data. Anactivity may be a service invocation or a human task,and activities are orchestrated to make a process thatis often represented as flow chart. There are two pri-mary perspectives of a business process: control-flowperspective and data-flow perspective[17]. The control-flow perspective regulates which activity is being per-formed, and which is the next activity to be executed.The data-flow perspective regulates how data is for-warded from one activity to another and data mappingissues.

Process engines usually have to handle both control-flow and data-flow. They use control data, such asactivity status, switching condition value, and processstatus to determine the control-flow, and move businessdata from one activity to another to ensure data-flow.In fact, during process enactment, it is possible for aprocess engine not to access some business data. Inorder to protect users’ data privacy, we must relievecloud-side engine from dealing with data-flow. As aresult, the cloud-side process engine only focuses onhandling control-flow according to the control data. Sothe business data, which only need to be exchanged be-tween user-end activities, does not have to be accessiblefor the cloud-side engine.

Figure 4.4: Patterns for BPM placement [22]

33

Business processes consist of two types of flows, namely a control-flow and adata-flow. The control-flow regulates the activities that are performed andthe sequence of these activities, while the data-flow determines how datais transferred from one activity to another. BPM workflow engines haveto deal with both control-flows and data-flows. A data-flow might containsensitive data, therefore, when a BPM workflow engine is deployed on thecloud, data-flows should be protected.

In the architecture proposed in [22], the cloud side engine only deals withdata-flow by using reference IDs instead of the actual data. When an activityneeds sensitive data, the transfer of the data to the activity is handled underuser surveillance through an encrypted tunnel. Sensitive data is stored atthe user-end and non-sensitive data is stored in the cloud. An overview ofthe architecture proposed in [22] is shown in Figure 4.5.1160 J. Comput. Sci. & Technol., Nov. 2010, Vol.25, No.6

Fig.2. Architecture of cloud-based BPM with user-end distribution.

Dealing separately with control data and businessdata can also enhance the stability of our system. Onthe one hand, when user-end crashes, the process in-stance on the cloud-side can be suspended for the pro-cess instance status is maintained on cloud-side. Afteruser-end resumes, the process instance can continue.On the other hand, users’ sensitive data will not beinfluenced when the process engine on cloud-side is un-available, because the data can be stored in a localrepository that is under their own control.

3.3 Architectural Rationales

Our design objectives of cloud-based BPMwith user-end distribution are as follows. Firstly, the cloud-side engine handles process enactment by collaboratingwith the user-end engine. These two engines can han-dle activity execution and data storage on their ownside. Secondly, between cloud-side and user-end, theremainly exists control data such as activity status or ser-vice request, which does not contain business data butreference ID. Lastly, business data exchanged betweencloud-side and user-end is also allowed but should beunder users’ surveillance through encrypted tunnel andcould be charged based on the amount of data trans-ferred.

Fig.2 illustrates the novel cloud-based BPM, whichhas an event-driven architecture supporting user-enddependency and autonomy while maintaining logic in-tegrity of an overall business process. As shown inFig.2, the solid arrow represents control-flow, and thewide arrow represents data-flow. The non-sensitivedata is stored in the cloud repository, and users’sensitive data, such as some business documents orconfidential financial reports, is stored in local repo-

sitory under their own control. There are mainly threecomponents (portal, user-end engine, and local repo-sitory) installed at user-end, which could be a normalPC. The cloud-side engine with activity scheduler isbuilt on large server clusters, which feature high per-formance and scalability.

With this architecture, it is no longer necessary forthe sensitive data to be accessed by the cloud-side en-gine especially when those data only need to be ex-changed between user-end activities. When activitiesdistributed on cloud-side want to use the data in user-end repository, the cloud-side engine must get autho-rized by the user-end engine to obtain them.

Users that need the cloud-based BPM just haveto deploy these user-end components on their privateserver, and then get the benefits of full-fledged BPMsystem without losing control of their sensitive data.Moreover, they can also make some further develop-ment on the basis of these components to satisfy theirspecific needs.

4 Key Issues with Scientific Exploration

In this section, we describe how we can ensure thatthe cloud-side engine collaborates seamlessly with theuser-end components to maintain logic integrity of anoverall business process, and also discuss our optimaldistribution mechanism and privacy protection issuesin more detail.

4.1 Communication Between Cloud-Side andUser-End

In the communications between cloud-side anduser-end, six types of event are defined as carriers of

Figure 4.5: Architecture of a cloud-based BPM system combined with user-end distribution [22]

4.3.2 Optimal distribution

The costs for using cloud computing are investigated in several articles [9,14]. In [22], formulae are given for calculating the optimal distribution ofactivities, when activities can be placed in the cloud or on-premise. Thecalculation takes into account the time costs, monetary costs and privacyrisk costs. By using these formulae, cloud users can make an estimation ofthe costs of deploying parts of their application on-premise and in the cloud.

34

4.4 Social BPM

Social software supports the interaction of human beings and the productionof artefacts by combining the input from independent contributors with-out predetermining how to do this [28]. An example of social softwareis wikipedia, where independent contributors are able to adjust the doc-uments in the system. Business processes can benefit from social software,since users involved in the business processes can easily share their knowl-edge and information by using social software. Not only business processes,but also BPM itself can benefit from social software. By offering BPM soft-ware as social software, multiple users can work on the design, operationand improvement of a business process simultaneously, and collaboration issupported since knowledge is shared through the software.

One of the issues in BPM is the Model-Reality divide issue. Abstract processmodels and executable process models are often separated. In the designphase, processes are modelled in by business designers. When the models arepassed onto the business implementers, the implementers might decide toimplement the processes differently then defined by the business designer,simply because not all the details are provided in the design or certaindesign decisions have not been made. This leads to inconsistency betweenthe designed process models and the executed process models. By involvingprocess implementers and process users in the design process, designers canobtain more information from the implementers and process users, and bothparties can react on design decisions that are made by the designers.

Process improvement can also benefit from social software. During the pro-cess evaluation phase, not all information inside the process might be takeninto account because of the information pass-on threshold [28]. The informa-tion pass-on threshold means that users might not share all the informationwith the designer. This might be the case when a user thinks it is toomuch effort to document all the information, or when the user thinks thatthe information is not useful for the designer. Social software can providethe users with interfaces for sharing the information and possibilities fordiscussing the information with a process designer.

Below we give an overview of possibilities that are provided by social softwarefor each of the BPM lifecycle phases, as introduced in Figure 2.1:

• DesignIn the design phase, not only process designers but also process usersare able to define requirements and comment on process ideas. Bycollaborating with process users, process designers get better insightinto the process, and users agree with the end result, since they haveparticipated in the design of the process, instead of being forced to

35

commit to an imposed business process.

• ImplementationIn the implementation phase, social software may be used for sharingissues with multiple implementers and process users. Time schedulingof deployment can also be broadcasted using social software.

• EnactmentIn the enactment phase, social software can be used for optimizingthe communication between process users. Tasks can be spread byusing an online dashboard. Information can be passed on through thedashboard, and tasks and comments can be exchanged using mobiledevices.

• EvaluationThe evaluation phase can be improved by giving process users accessto this phase. Process users can share what they want to change inthe process and other users can comment and propose solutions forthese issues.

Tools that can be used for enabling BPM as social software are sharedmodelling tools, blogs, wikis, fora, user rating systems and data exchangeservices, as proposed in [28].

4.5 Environments

On the internet, about 15 BPM cloud providers can be found. The productsthat are offered by the providers differ from complete BPM suites, to simplemodelling tools for business processes. In this section we give an overviewof available systems and we briefly discuss some of these products, to givean idea of the variety of products available.

4.5.1 Product overview

Table 4.1 gives an overview of some of the BPM cloud products currentlyavailable on the market. For each of the cloud products the cloud servicemodels and the cloud types were identified.

4.5.2 Oracle Fusion

Fusion is a middleware solution offered by Oracle [26]. With Fusion, en-terprises can create their own private cloud, based on the PaaS and SaaS

36

Product Service Model Cloud Type

Oracle Fusion PaaS, SaaS Private

Appian BPM Suite PaaS, SaaS Private or Public

Fujitsu InterstageBPM.com PaaS, SaaS Private or Public

Barium Live SaaS Public

Pega Business Process Cloud SaaS, limited PaaS Public

PNM Soft Sequence Cloud BPM SaaS Public

Elite BPM Cloud IaaS, PaaS, SaaS Public

Billfish BPM Paas, SaaS Private or Public

Cloud Harbor BOP Paas, SaaS Private or Public

Cordys PaaS Private

Intalio BPMS IaaS, PaaS, SaaS Private

Adeptia On Demand BPM SaaS Public

Table 4.1: Overview of BPM cloud products

service model. The middleware has support for running SOA-based applica-tions. In addition, the middleware offers the Oracle BPEL Process Managerin which business processes can be composed and deployed.

4.5.3 Appian

Appian BPM Suite [7] is a complete BPMS running in the cloud. The suiteconsists of cloud, mobile and social BPM solutions [8]. Appian offers bothon-premise and cloud-based BPM, and gives users the freedom to port theirsolutions between cloud and on-premise. The suite offers, amongst others,a process engine, content sharing, discussion boards, SOA integration toolsand sharepoint.

Appian BPM Suite is considered to be a social BPM, since all process userscan participate in the design, enactment and optimization phase of the BPMprocess. Users can discuss their findings in discussion boards and can sharecontent with each other by using the cloud. In addition, Appian offersmobile applications to users. These applications notify users when a newtask comes in, or gives users a reminder when a certain activity failed orneeds attention.

Appian offers several security guarantees to their customers: 99.5% uptime,SAS-70 Type II infrastructure audit reports, SAML and LDAP/AD integra-tion for secure authentication, SSL encryption for communication betweensystems and compliance with national data privacy laws through local host-ing.

37

4.5.4 Fujitsu InterstageBPM.com

InterstageBPM.com is the BPM platform of Fujitsu [15]. The BPM platformis available as a cloud service or as a standalone product which can bedeployed on-premise. Cloud users are offered a PaaS and SaaS based cloudsolution where all of the phases in the BPM lifecycle are fully supportedby the platform. The public variant of the cloud platform is hosted in thedatacenter of Fujitsu.

4.5.5 Barium Live

Barium Live is an public SaaS based BPM solution with two tools:

1. An online design tool in which multiple users can collaborate. Multipleusers can work simultaneously on the same business process. Thedesign tools works with a versioning system. When a user changes aprocess, the process can be checked in again and a new version of theprocess is stored in the cloud. Users can browse through the differentversions of the process and can merge different versions of the process.

2. A process enactment tool. Process users can logon to the BariumLive software and get an overview of the tasks they have to perform.Barium does not integrate with information systems, instead it canonly be used as a workflow engine for managing human-based tasks.

4.6 Conclusion

In this chapter, we reported on the current state of the art with respect toBPM in the cloud. Reduction of upfront investments and scalability havebeen identified as the benefits of BPM solutions in the cloud. Security ofdata is the main issue for not placing sensitive data in the cloud. Placementof activities that are non-computation-intensive in the cloud might also leadto high costs when BPM is placed in the cloud. As a solution, an approachhas been presented in which data is securely stored on-premise and onlycomputation-intensive activities are placed in the cloud.

Offering BPM as social software has been identified as a promising approach,in which improvement of communication and collection of knowledge havebeen identified as its main benefits.

We have introduced four available commercial cloud-based BPM products.In addition, we introduced a table with an overview of existing BPM cloudsolutions. Unfortunately, there is not a lot of information about how these

38

products handle security issues to offer a secure environment to the clouduser.

There are currently no open-source public cloud-based BPM solutions avail-able. There are some companies that offer open-source BPM solutions, suchas Intalio BPMS, but these products can only be used as a private cloudsolution. Since private clouds are only available within the borders of anorganization, less strict security measures are necessary when compared todeployment in a public cloud.

39

Chapter 5

Research directions

We identify research directions by further discussing the results of the liter-ature study.

5.1 BPEL engine deployment in the cloud

The deployment of a BPEL engine in the cloud according to one of the threeservice models is discussed in [6]. Several problems can occur when such anengine is moved to the cloud. Their research is mainly based on literatureand small experiments. Based on [6], it would be interesting to take an open-source BPEL engine and deploy it onto one of the service models and extendit in a way that each of the identified security issues is solved, presentinga solid solution to cloud users who want to benefit from secure workflowengines in the cloud.

A possible approach might be to investigate the structure of an open-sourceBPEL engine and identify the modules that need to be updated in orderto make it suitable for the cloud. Extensions need to be developed to solvesecurity issues. The next step would be to deploy the system to the cloud andperform advanced security tests to demonstrate that the system is secure.

5.2 Social BPM

Social BPM is one of the trends in BPM development, as mentioned in [28].By offering BPM as a social solution, not only process designers but alsoprocess users can be involved in the whole BPM lifecycle. The benefits ofsocial BPM have been investigated in [28]. Social BPM might influencethe BPM lifecycle. For example, collection of data for each of the phases

40

might be optimized since social BPM offers tools for exchanging information.These influences can be researched in more detail. In addition, concernssuch as authorisation, stability, versioning, auditing and security need to beinvestigated, in order to guarantee consistency.

5.3 Process decomposition

A distribution model in which business process activities can be placed on-premise or in the cloud has been proposed in [22]. Four patterns wereidentified for the distribution of a process engine, activities and data. How-ever, we can generalise this distribution and identify a fifth pattern, in whichprocess engines, activities and data are deployed on both the cloud and theend-user side. This solution has two potential benefits:

1. A process engine regulates both control flow and data flow. An activ-ity receives data from the process engine and after the execution of theactivity, the data that is produced by the activity is passed back to theprocess engine. Now consider that a sequence of activities is placed inthe cloud, while the process engine is deployed at the end-user side.Each activity uses the output data of the previous activity as inputdata. The data is not directly passed from activity to activity, but issent back to the process engine first. An example of this situation isshown in Figure 5.1. Since data transfer is one of the billing factorsof cloud computing, these situations can become expensive when largeamounts of data are transferred from activity to activity. To avoid thisproblem, a process engine can be added to the cloud, which regulatesthe control flow and data flow of the activities placed in the cloud.When a sequence of activities is placed in the cloud, data is now regu-lated by the process engine in the cloud, which reduces the amount ofdata that needs to be transferred between on-premise and the cloud.

2. When the cloud is unavailable, users can run the business processcompletely on-premise until the cloud is available again.

In order to run a business process on two process engines, the process has tobe split up into two individual processes. It would be convenient for BPMusers to take a business process and an activity distribution list, which canbe transformed automatically into two individual business processes, onefor the cloud and one for the end-user process engine. The communicationbetween both systems can be described by using a choreography descriptionlanguage.

The transformation described above is schematically represented in Figure5.2. In addition, the distribution list can be created automatically according

41

a2

a1

a3

a4

Process Engine

inputOn-PremiseCloud

output

input

output

output

input

input

output

Figure 5.1: Example of data intensive process problems with distributedactivities

to optimal distribution formulae, presented in [22].

Monitoring of the business process is now more complicated, since the busi-ness process has been split up in two new business processes. As a solution, abusiness processes monitor tool can be developed for monitoring the originalbusiness process, by combining the monitored details of both the individualprocesses.

A possible approach to handle process decomposition is to identify the struc-ture and the semantics of the processes. When the control dependencies anddata dependencies are known, the consequences of moving certain activitiesto either the end-user side or the cloud side can be investigated.

When the consequences of distributing activities are known, a model trans-formation can be created in which a business process and a list with markingsis used for creating two individual business processes, one for the cloud andone for the end-user side. In addition, a choreography description can begenerated to describe the communication between both the business pro-cesses.

42

Orchestration

Choreography

Orchestration Orchestration

End-User Processa1

a3a5

a6

Cloud Process

a2 a4

communication

Processa1

a2

a3

a4

a5

a6

Distribution listActivity On-premise Clouda1 Xa2 Xa3 Xa4 X a5 Xa6 X

Tran

sfo

rmat

ion

Figure 5.2: Schematic representation of separating business processes

43

Chapter 6

Conclusion

In this report, we investigated combination of BPM and cloud computing.We discussed both cloud computing and BPM and gave an overview ofliterature that discusses their combination.

BPM has been introduced by identifying the four phases of the BPM lifecy-cle. In addition, three standard languages for designing and implementingbusiness processes have been briefly discussed. We explained the structureof a BPMS and identified concepts within BPM that are relevant for thiswork.

We explained cloud computing by giving an overview of the three servicemodels, and the specific benefits and drawbacks of each of these service mod-els. Four cloud types were identified, and products of three cloud providers,namely Amazon, Google and Microsoft, have been introduced.

We discussed the most relevant combinations of BPM and cloud computingfound in the literature. The deployment of an orchestrated application ontoeach of the cloud service models was discussed. The distribution of data andactivities within a process were also discussed by looking into four deploy-ment patterns. We also investigated the impacts of offering BPM softwareas social software and gave an overview of existing cloud products.

We identified three possible research directions. The first direction is the de-ployment of a workflow engine in the cloud, where several security measuresneed to be taken to offer a business process platform as a service. The sec-ond direction is to investigate the impact of offering BPM as social service.In the third direction, we introduced a new distribution pattern in whichtwo process engines were applied, one in the cloud and one on-premise. Abusiness process can be annotated with a distribution schema and the busi-ness process can be transformed automatically to a business process for thecloud-based process engine and the on-premise process engine.

44

It was quite difficult to find scientific papers about the combination of cloudcomputing and BPM. We looked into several journals and browsed throughproceedings of conferences about cloud and computing and BPM, such asCLOUD, BPM and CLOSER. Papers about the subject are scarce, butseveral conference websites ask for submission of papers about this subject,which indicates that the subject is still interesting for further research.

45

Bibliography

[1] A. Alves, A. Arkin, S. Askary, B. Bloch, F. Curbera, Y. Goland,N. Kartha, Sterling, D. Konig, V. Mehta, S. Thatte, D. van der Rijn,P. Yendluri, and A. Yiu. Web Services Business Process ExecutionLanguage Version 2.0. OASIS Committee, 2007.

[2] Amazon. Amazon Elastic Compute Cloud (Amazon EC2). http://

aws.amazon.com/ec2/, Dec. 2011.

[3] Amazon. Amazon Simple Queue Service (Amazon SQS). http://aws.amazon.com/sqs/, Dec. 2011.

[4] Amazon. Amazon Simple Storage Service (Amazon S3). http://aws.

amazon.com/s3/, Dec. 2011.

[5] Amazon. Amazon SimpleDB. http://aws.amazon.com/simpledb/,Dec. 2011.

[6] T. Anstett, F. Leymann, R. Mietzner, and S. Strauch. Towards bpelin the cloud: Exploiting different delivery models for the execution ofbusiness processes. In Proceedings of the 2009 Congress on Services - I,pages 670–677, Washington, DC, USA, 2009. IEEE Computer Society.

[7] Appian. BPM in the Cloud Its Time, Its Safe... Its Smart, Oct. 2011.

[8] Appian. Cloud, Mobile and Social BPM, Mar. 2011.

[9] M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, R. H. Katz, A. Kon-winski, G. Lee, D. A. Patterson, A. Rabkin, I. Stoica, and M. Zaharia.Above the clouds: A berkeley view of cloud computing. Technical Re-port UCB/EECS-2009-28, EECS Department, University of California,Berkeley, Feb 2009.

[10] F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Bur-rows, T. Chandra, A. Fikes, and R. E. Gruber. Bigtable: A distributedstorage system for structured data. In In proceedings of the 7th confer-ence on usenix symposium on operating systems design and implemen-tation - volume 7, pages 205–218, 2006.

46

[11] D. Chappell. Introducing the Windows Azure Platform. http:

//www.davidchappell.com/writing/white_papers/Introducing_

the_Windows_Azure_Platform,_v1.4--Chappell.pdf, Oct. 2010.

[12] D. Chappell. Introducing Windows Azure. http://www.

davidchappell.com/writing/white_papers/Introducing_

Windows_Azure,_v1.3--Chappell.pdf, Oct. 2010.

[13] J. Dean, S. Ghemawat, and G. Inc. Mapreduce: simplified data pro-cessing on large clusters. In In OSDI04: Proceedings of the 6th confer-ence on Symposium on Operating Systems Design and Implementation.USENIX Association, 2004.

[14] E. Deelman, G. Singh, M. Livny, B. Berriman, and J. Good. The costof doing science on the cloud: the montage example. In Proceedingsof the 2008 ACM/IEEE conference on Supercomputing, SC ’08, pages50:1–50:12, Piscataway, NJ, USA, 2008. IEEE Press.

[15] fujitsu. Interstage Cloud BPM, 2009.

[16] K. Gibbs. Google app engine campfire one transcript. http:

//code.google.com/intl/nl/appengine/articles/cf1-text.html,Apr. 2008.

[17] Google. The administration console. http://code.google.com/intl/nl/appengine/docs/adminconsole/index.html, Dec. 2011.

[18] Google. Gql reference. http://code.google.com/intl/nl/

appengine/docs/python/datastore/gqlreference.html, Dec. 2011.

[19] Google. What is google app engine? http://code.google.com/intl/

nl/appengine/docs/whatisgoogleappengine.html, Dec. 2011.

[20] O. M. Group. BPMN 2.0 by Example Version 1.0 (non-normative).http://www.omg.org/spec/BPMN/2.0/examples/PDF, Jan. 2002.

[21] O. M. Group. Business Process Model and Notation (BPMN) Version2.0. http://www.omg.org/spec/BPMN/2.0/PDF, Jan. 2011.

[22] Y.-B. Han, J.-Y. Sun, G.-L. Wang, and H.-F. Li. A cloud-based bpmarchitecture with user-end distribution of non-compute-intensive activ-ities and sensitive data. J. Comput. Sci. Technol., 25(6):1157–1167,2010.

[23] H. Jin, S. Ibrahim, T. Bell, W. Gao, D. Huang, and S. Wu. Cloud typesand services. In B. Furht and A. Escalante, editors, Handbook of CloudComputing, pages 335–355. Springer US, 2010.

[24] N. Kavantzas, D. Burdett, G. Ritzinger, T. Fletcher, Y. Lafon, andC. Barreto. Web Services Choreography Description Language Version

47

1.0. World Wide Web Consortium, Candidate Recommendation CR-ws-cdl-10-20051109, 2005.

[25] P. Mell and T. Grance. The NIST Definition of Cloud Computing.National Institute of Standards and Technology, 53(6):50, 2009.

[26] Oracle. Platform-as-a-Service Private Cloud with Oracle Fusion Mid-dleware, Oct. 2009.

[27] M. P. Papazoglou. Web Services - Principles and Technology. PrenticeHall, 2008.

[28] R. Schmidt and S. Nurcan. Bpm and social software. In BusinessProcess Management Workshops, pages 649–658, 2008.

[29] K. Shvachko, H. Kuang, S. Radia, and R. Chansler. The Hadoop Dis-tributed File System, 2010.

[30] M. Weske. Business Process Management: Concepts, Languages, Ar-chitectures. Springer, 2007.

[31] S. White. Introduction to BPMN. http://www.bpmn.org/Documents/Introduction%20to%20BPMN.pdf, 2004.

[32] Q. Zhang, L. Cheng, and R. Boutaba. Cloud computing: state-of-the-art and research challenges. Journal of Internet Services and Applica-tions, 1:7–18, 2010. 10.1007/s13174-010-0007-6.

48