16
SOCA (2012) 6:151–166 DOI 10.1007/s11761-011-0089-4 SPECIAL ISSUE PAPER Virtualised e-Learning on the IRMOS real-time Cloud Tommaso Cucinotta · Fabio Checconi · George Kousiouris · Kleopatra Konstanteli · Spyridon Gogouvitis · Dimosthenis Kyriazis · Theodora Varvarigou · Alessandro Mazzetti · Zlatko Zlatev · Juri Papay · Michael Boniface · Sören Berger · Dominik Lamp · Thomas Voith · Manuel Stein Received: 15 April 2011 / Revised: 25 July 2011 / Accepted: 17 September 2011 / Published online: 8 October 2011 © Springer-Verlag London Limited 2011 Abstract This paper presents the real-time virtualised Cloud infrastructure that was developed in the context of the IRMOS European Project. The paper shows how different concepts, such as real-time scheduling, QoS-aware network protocols, and methodologies for stochastic modelling and run-time provisioning were practically combined to provide strong performance guarantees to soft real-time interactive applications in a virtualised environment. The efficiency of the IRMOS Cloud is demonstrated by two real interactive The research leading to these results has received funding from the European Community’s Seventh Framework Programme FP7 under grant agreement no. 214777 IRMOS-Interactive Realtime Multimedia Applications on Service Oriented Infrastructures. T. Cucinotta · F. Checconi Real-Time Systems Laboratory, Scuola Superiore Sant’Anna, Pisa, Italy e-mail: [email protected] F. Checconi e-mail: [email protected] G. Kousiouris · K. Konstanteli (B ) · S. Gogouvitis · D. Kyriazis · T. Varvarigou National Technical University of Athens, Athens, Greece e-mail: [email protected] G. Kousiouris e-mail: [email protected] S. Gogouvitis e-mail: [email protected] D. Kyriazis e-mail: [email protected] T. Varvarigou e-mail: [email protected] A. Mazzetti eXact learning solutions SpA, Sestri Levante, Italy e-mail: [email protected] e-Learning applications, an e-Learning mobile content deliv- ery application and a Virtual World e-Learning application. Keywords Real-time scheduling · Virtualised infrastructures · Stochastic modelling · e-Learning 1 Introduction A current trend in the software engineering industry is to rely increasingly on the distributed computing paradigm, which is becoming mainstream and ubiquitous, especially due to the high market penetration of low-cost high-speed inter- net connectivity. To this direction, more and more applica- tions are developed in a distributed fashion to be hostedon Z. Zlatev · J. Papay · M. Boniface IT Innovation Centre, University of Southampton, Southampton, UK e-mail: [email protected] J. Papay e-mail: [email protected] M. Boniface e-mail: [email protected] S. Berger · D. Lamp University of Stuttgart, Stuttgart, Germany e-mail: [email protected] D. Lamp e-mail: [email protected] T. Voith · M. Stein Alcatel Lucent, Stuttgart, Germany e-mail: [email protected] M. Stein e-mail: [email protected] 123

Virtualised e-Learning on the IRMOS real-time Cloud

Embed Size (px)

Citation preview

Page 1: Virtualised e-Learning on the IRMOS real-time Cloud

SOCA (2012) 6:151–166DOI 10.1007/s11761-011-0089-4

SPECIAL ISSUE PAPER

Virtualised e-Learning on the IRMOS real-time Cloud

Tommaso Cucinotta · Fabio Checconi · George Kousiouris · Kleopatra Konstanteli · Spyridon Gogouvitis ·Dimosthenis Kyriazis · Theodora Varvarigou · Alessandro Mazzetti · Zlatko Zlatev · Juri Papay ·Michael Boniface · Sören Berger · Dominik Lamp · Thomas Voith · Manuel Stein

Received: 15 April 2011 / Revised: 25 July 2011 / Accepted: 17 September 2011 / Published online: 8 October 2011© Springer-Verlag London Limited 2011

Abstract This paper presents the real-time virtualisedCloud infrastructure that was developed in the context of theIRMOS European Project. The paper shows how differentconcepts, such as real-time scheduling, QoS-aware networkprotocols, and methodologies for stochastic modelling andrun-time provisioning were practically combined to providestrong performance guarantees to soft real-time interactiveapplications in a virtualised environment. The efficiency ofthe IRMOS Cloud is demonstrated by two real interactive

The research leading to these results has received funding from theEuropean Community’s Seventh Framework Programme FP7 undergrant agreement no. 214777 IRMOS-Interactive Realtime MultimediaApplications on Service Oriented Infrastructures.

T. Cucinotta · F. ChecconiReal-Time Systems Laboratory, Scuola SuperioreSant’Anna, Pisa, Italye-mail: [email protected]

F. Checconie-mail: [email protected]

G. Kousiouris · K. Konstanteli (B) · S. Gogouvitis ·D. Kyriazis · T. VarvarigouNational Technical University of Athens, Athens, Greecee-mail: [email protected]

G. Kousiourise-mail: [email protected]

S. Gogouvitise-mail: [email protected]

D. Kyriazise-mail: [email protected]

T. Varvarigoue-mail: [email protected]

A. MazzettieXact learning solutions SpA, Sestri Levante, Italye-mail: [email protected]

e-Learning applications, an e-Learning mobile content deliv-ery application and a Virtual World e-Learning application.

Keywords Real-time scheduling · Virtualisedinfrastructures · Stochastic modelling · e-Learning

1 Introduction

A current trend in the software engineering industry is to relyincreasingly on the distributed computing paradigm, whichis becoming mainstream and ubiquitous, especially due tothe high market penetration of low-cost high-speed inter-net connectivity. To this direction, more and more applica-tions are developed in a distributed fashion to be hostedon

Z. Zlatev · J. Papay · M. BonifaceIT Innovation Centre, University of Southampton,Southampton, UKe-mail: [email protected]

J. Papaye-mail: [email protected]

M. Bonifacee-mail: [email protected]

S. Berger · D. LampUniversity of Stuttgart, Stuttgart, Germanye-mail: [email protected]

D. Lampe-mail: [email protected]

T. Voith · M. SteinAlcatel Lucent, Stuttgart, Germanye-mail: [email protected]

M. Steine-mail: [email protected]

123

Page 2: Virtualised e-Learning on the IRMOS real-time Cloud

152 SOCA (2012) 6:151–166

dedicated infrastructures, such as the Clouds, that can beeasily accessed by remote users, whether they are using localworkstations, laptops, palmtop devices or mobile phones.

According to the Cloud computing model, distributedapplications are developed by Software-as-a-Service (SaaS)providers to be hosted in Cloud infrastructures. During thisprocess, the SaaS provider may use tools offered by Plat-form-as-a-Service (PaaS) providers that facilitate the design,development and testing of their applications in a virtualisedenvironment. The Infrastructure-as-a-Service (IaaS) provid-ers use virtualisation techniques to deploy these applicationsinside their Cloud infrastructure. By using such techniques,the IaaS providers are able to achieve a better server consol-idation level by deploying multiple virtual machines (VMs)with different Operating Systems (and services therein) overthe same physical resources. Furthermore, the IaaS providersutilise virtualisation techniques at network level, such as livemigration, to seamlessly migrate the VMs from one physi-cal host to another one, for maintenance, fault-tolerance orload-balancing reasons.

However, as the number of VMs deployed over the samephysical resources (e.g. links and CPUs) increases, the levelof performance experienced by each VM becomes unstable.Indeed it depends heavily on the overall workload imposedby the other VMs competing for the shared resources, as hasbeen observed, for example, in Amazon EC2 [10]. There-fore for a virtualised Cloud environment to provide properaccommodation to this increasing number of soft real-timedistributed applications, proper CPU and network schedul-ing technologies coupled with proper performance modellingand run-time provisioning techniques are needed.

In this paper we present the way these techniques havebeen combined into one virtualised Cloud Computing ser-vice-oriented infrastructure that has been developed in thecontext of the IRMOS European Project1, and we showhow these concepts have been practically applied to pro-vide strong performance guarantees to each individual real-time interactive application that is hosted on the virtualisedresources of the IRMOS Cloud. This is realised by allowingapplications to be deployed in the form of a Virtual ServiceNetwork (VSN), a graph whose vertices represent VirtualMachine Units (VMUs) that encapsulate the different appli-cation components, and whose edges represent the networkconnections among them. In order for the VSN representationof the application to comply with real-time constraints as awhole, each VSN element is associated with precise require-ments in terms of the computing resources and networkingresources for each node and link respectably. These require-ments are fulfilled thanks to the IRMOS real-time scheduler,providing temporal isolation across VMUs on the computingside, and the use of QoS-aware network protocols for guar-

1 More information is available at:http://www.irmosproject.eu.

anteed networking. However, the accuracy of the estimationsfor the required computing and networking resources for eachVMU (which are used to decide the amount of resources toreserve for the VSN within the infrastructure) is a critical fac-tor for the provisioning of precise guarantees to the hostedapplications.

To this end, the values of the performance parametersare obtained after a proper benchmarking and monitoringprocess of the service components on the IRMOS virtua-lised resources that allows for an accurate estimation of theirbehaviour during the performance modelling of the appli-cation. However, the lack of information (e.g. source code)across the entities makes performance analysis a difficulttask. To overcome this problem, Artificial Neural Networks(ANNs) were used to effectively model the applications per-formance when their internal structure is unknown.

The IRMOS PaaS layer consists of tools that facilitate thedescribed above benchmarking, monitoring and performancemodelling phases. Furthermore, in order to address casesof bad performance modelling, the IRMOS PaaS continu-ously collects and evaluates monitoring data at run-time toidentify possible deviations between the agreed and achievedQoS levels, the source of the problem and possible correctiveactions, such as adjusting the allocated resources share, andreconfiguration of the running application. Apart from thesecorrective actions, the IRMOS PaaS also enable the SaaSprovider to define a range of application-driven events thatmay automatically trigger certain actions during the run-timeof the application, such as automatic renegotiation/reconfig-uration when certain usage terms are met.

This paper is organised as follows. Relevant related workis presented in Sect. 2. Section 3 gives an overview of the wayIRMOS achieves performance isolation by combining tem-poral isolation techniques at computing and networking levelwith proper performance modelling. A detailed description ofthe IRMOS PaaS is given in Sect. 4, whereas Sect. 5 presentsin detail the two real e-Learning applications, a Mobilee-Learning application and a Virtual World e-Learning appli-cation, that were integrated in the IRMOS platform and wereused to evaluate its performance. Section 6 presents the meth-ods that are used for estimating the performance behaviour ofthe applications to be hosted in IRMOS. In Sect. 7, the resultsgathered from a large set of experiments using the Mobilee-Learning application on the IRMOS platform under realis-tic settings are presented. A use case based on the VirtualWorld e-Learning application that highlights the runtimeQoS provisioning capabilities of the platform is presentedin Sect. 8. Finally, conclusions are drawn in Sect. 9.

2 Related work

In this section, related works that have appeared in theresearch literature are briefly summarised.

123

Page 3: Virtualised e-Learning on the IRMOS real-time Cloud

SOCA (2012) 6:151–166 153

The work by Lin and Dinda in [22] presents varioussimilarities with the one presented in this paper. First, anEDF-based scheduling algorithm [24] for Linux is usedon the host to schedule VMs. Furthermore, an analysisis conducted on the application performance, investigatingthe effects of scheduling decisions and concurrent virtualmachines execution. The analysis is very thorough and inter-esting; however, the major limitation of the work residesin the way low-level scheduling is achieved. In fact, theauthors make use of a scheduler built into a proper user-space process (VSched), which exploits POSIX real-timepriorities to achieve an EDF-based scheduling of VMs, andSIGSTOP/SIGCONT signals for realising optionally hardresource reservations. Such an approach presents high sched-uling overheads due to the forcibly increased number of con-text switches, whilst our scheduler [5] is directly built into thekernel and does not introduce any additional context switch.Furthermore, VSched cannot properly react to those situa-tions in which a VM blocks or unblocks, e.g. as due to I/Ooperations, something that is needed to guarantee a properlevel of temporal isolation. In order to address this issue, ourscheduler uses the CBS algorithm [1].

Another very interesting work by the same authors is [23],where the users of a virtual machine are given the opportu-nity to adapt the allocated CPU through a simple interface,based on their experience with the application. The cost ofthe increase is shown, so that the user may decide on the fly.Whilst it is a very promising approach and would eliminate avast number of issues with regard to application QoS levels,its main drawback is in the case of workflow applications.The degrading performance of a workflow application maybe due to a bottleneck on various nodes executing a part ofit. The users will most likely be unaware of the location ofthe bottleneck, especially if they are non-experts. Instead,the work by Nathuji et al. [27] focuses on automatic on-lineadaptation of the CPU allocation in order to maintain a sta-ble performance of VMs. However, the framework does nottreat a VM as a “black-box”, and needs application-specificmetrics to run the necessary QoS control loop, going beyondthe common IaaS business model.

Gupta et al. [13], investigated the performance isolationof virtual machines focusing on the exploitation of variousscheduling policies available in the Xen hypervisor [6]. Fur-thermore, Dunlap proposed [9] various enhancements to theXen credit scheduler in order to address various issues relatedto the temporal isolation and fairness among the CPU sharededicated to each VM. Instead, the IRMOS real-time sched-uler [5] is based on the KVM2 hypervisor. Having alreadydemonstrated the way temporal isolation of compute andnetwork-intensive VMs can be achieved using the KVMhypervisor in our previous work [7,8], the scheduling method

2 More information at: http://www.linux-kvm.org.

proposed in this paper also addresses the modelling issuesthat arise during the deployment of an e-Learning applica-tion under proper QoS guarantees.

Shirazi et al. [29] proposed DynBench, a method forbenchmarking infrastructures that support distributed real-time applications under dynamic conditions. Whilst promis-ing, this framework is mainly oriented towards investigatingthe limits of the infrastructure and not towards understand-ing the application’s behaviour in relationship to differentscheduler configurations.

In Germain et al. [11], present DIANE for grid-based user-level scheduling. However, the focus is on controlling theexecution end time of long processing applications, and noton real-time interactive ones as done in this paper. The prob-lem of optimum allocation of workflows of virtualised ser-vices on physical resources under a stochastic approach hasbeen investigated in [16], in the context of soft real-timeinteractive applications.

In terms of application performance modelling in dis-tributed infrastructures a number of interesting works exist.A code analysing process that allows for the simulation ofsystem performance is described in [14]. It models the appli-cation by a parameter-driven Conditional Data Flow Graph(CDFG) and the hardware (HW) architecture by a configura-ble HW graph. The execution cost of each task block in theapplication’s CDFG is modelled by user-configurable param-eters, which allows for highly flexible performance estima-tion. The CDFG of the application and the HW graph are usedto perform a low-level simulation of the HW activities. Whilstvery promising, it needs the source code in order to providethe CDFG. In our work, we deal with VMs as black boxes,which allows for the deployment of applications whosesource code is not available for confidentiality purposes.

Another interesting work is presented by Lee et al. in [21].The application, whose performance must be measured, isrun under a strict reservation of resources to determine ifthe given set of reservation parameters satisfies the time con-straints for execution. If this is not the case, then these param-eters are altered accordingly. If there is a positive surplus,the resources are decreased, and if it is negative, they areincreased until a satisfying security margin is reached. Whilstassuring high utilisation rates, the main disadvantage of thismethodology is that it must be performed for every individ-ual execution with the specific SLA parameters before theactual deployment.

Bekner et al. [3] introduce the Vienna Grid Environment(VGE), a framework for incorporating QoS, in Grid applica-tions. It uses a performance model to estimate the responsetime and a pricing model for determining the price of a jobexecution. In order to determine whether the client’s QoSconstraints can be fulfilled, for each QoS parameter a corre-sponding model has to be in place. However, VGE does notprescribe the actual nature of performance models. Instead,

123

Page 4: Virtualised e-Learning on the IRMOS real-time Cloud

154 SOCA (2012) 6:151–166

Fig. 1 Deployment of ServiceComponents (SCs) withinVirtual Machine Units (VMUs)over IRMOS/ISONI

VGE uses an abstract interface for the performance models,and it is assumed that these models become available throughanalytical modelling or historical data analysis. Althoughanalytical performance modelling in general requires a thor-ough knowledge of the application, the method proposed inthis paper allows for those parts of the application that are notvisible to the external world to be handled as “black boxes”.

Other works exist that address QoS assurance via perfor-mance prediction [18] and control via service selection [20].Whilst numerous promising solutions exist to the problemof performance analysis of VMs in presence of real-timescheduling, these are either not focused on critical parame-ters that are necessary for running real-time applications onSOIs, or they lack of a proper low-level real-time schedul-ing infrastructure, which is needed for supporting temporalisolation among concurrently running VMs. A particularlyrelevant investigation made by some of the authors of thispaper appeared in [19], where also the IRMOS infrastructure,comprising ANNs and the real-time scheduler, was used forperformance prediction, but the target application was com-posed of a set of synthetic Matlab-based benchmarks, whilstin this paper we target a real e-Learning application scenario.

3 Performance isolation—the IRMOS/ISONI way

One of the core components of the IRMOS platform is theIntelligent (virtualised) Service-Oriented Networking Infra-structure (ISONI) [30]. It acts as a Cloud Computing3 IaaSprovider for the IRMOS framework and manages a set ofphysical computing, networking and storage resources avail-able in form of multiple nodes/sites within a provider domain(see Fig. 1).

3 More information at: http://www.cloudcomputing.org/.

ISONI provides those virtualised resources over whichIRMOS applications are deployed. One of the key innova-tions introduced by ISONI is its capability to ensure guaran-teed levels of resource allocation for each individual appli-cation instance hosted within the ISONI domain.

This is realised by allowing applications to be deployedin form of a VSN. This is a graph whose vertices repre-sent individual Service Components (SCs) of an applicationwhich may be deployed in form of Virtual Machine Units(VMUs), and whose edges represent communications—thevirtual links (VLs)—among them.

In order for the application represented by a VSN to com-ply with real-time constraints as a whole, QoS needs to besupported for all the involved resources, particularly for net-work links, computing hosts and storage resources. To thispurpose, VSN elements are associated with precise resourcerequirements, e.g., in terms of the required computing per-formance (e.g. working memory, speed, scheduling) for eachnode and the required networking performance (e.g. band-width, latency, jitter) for each link. These requirements arefulfilled thanks to the allocation and admission control logicpursued by ISONI for instantiating VMs within the managedset of available physical resources, and to the low-level mech-anisms shortly described in what follows (a comprehensiveISONI overview is out of the scope of this paper and can befound in [30]).

3.1 Isolation of computing

In order to provide scheduling guarantees to individualVMs scheduled on the same system, processor and core,IRMOS incorporates a hybrid deadline/priority (HDP) real-time scheduler [5] developed within the IRMOS consortiumfor the Linux kernel. This scheduler provides temporal

123

Page 5: Virtualised e-Learning on the IRMOS real-time Cloud

SOCA (2012) 6:151–166 155

isolation among multiple possibly complex software compo-nents, such as entire VMs (with the KVM hypervisor4, a VMis seen as a process). It uses a variation of the Constant Band-width Server (CBS) algorithm [1], based on Earliest DeadlineFirst (EDF), for ensuring that each group of processes/threadsis scheduled for Q time units (the budget) every intervalof P time units (the period). The CBS algorithm has beenextended for supporting multi-core (and multi-processor)platforms, achieving a partitioned scheduler where the set oftasks belonging to each group may migrate across the asso-ciated CBS scheduler instances running on different CPUs,according to the usual multi-processor real-time priority-based scheduling in Linux.

The scheduler exhibits an interface towards user-spaceapplications based on the cgroups [25] framework, whichallows for configuration of kernel-level parameters by meansof a file system-based interface. This interface has beenwrapped within a Python API to make the real-time sched-uling services accessible from within the IRMOS platform.The parameters that are exposed by the scheduler are thebudget Q and the period P , as explained above.

3.2 Isolation of networking

Traffic isolation of independent VSNs within ISONI isachieved by provisioning each VSN deployment with anindividual virtual address space and by policing the networktraffic of each deployed virtual link. The automated deploy-ment of policed virtual link overlays prevents unwantedcrosstalk among services sharing physical network links andprohibits intrusion attempts from unnamed endpoints. Thetraffic policing in place ensures that the network traffic tra-versing the same network elements causes no overload whichwould lead to an unduly, uncontrolled growth of loss rate,delay and jitter for the network connections of other VSNs.A gap-less policing ensures that the network multiplex stagesalways get a controlled load of traffic. Therefore, bandwidthpolicing is an essential building block of QoS assurance forthe individual virtual links. It is important to highlight thatISONI allows for the specification of the networking require-ments in terms of common and technology-neutral trafficcharacterisation parameters, such as the needed guaranteedaverage and peak bandwidth, latency and jitter.

Depending on the specified networking requirements, anadequate transport network is chosen in order to meet theapplication requirements (see Fig. 1). By default, non-criticaltraffic without estimable performance requirements can bedeployed on Internet transit that does not provide guaranteeson the delivery and performance of traffic. On the other hand,soft real-time distributed applications that rely on capac-ity guarantees can be mapped onto network resources for

4 More information at: http://www.linux-kvm.org.

which ISONI controls the network utilisation. The strongerthe soft real-time application requirements on delay and jitterbecome for a virtual link, the shorter is the allowable networkdistance of the transport resource, which leaves either thesite-local network or the use of leased lines as suitable trans-port resource. Since there exist transport network resourceswith classification, reservation and other technology-individ-ual mechanisms to enforce QoS of traffic, an ISONI transportnetwork adaptation layer abstracts from transport networktechnology-individual QoS mechanism like Diffserv [4], Int-serv [31,32] and MPLS [28] (see Fig. 1).

3.3 Performance modelling

One of the key steps in deploying applications with pre-cise real-time or generally QoS guarantees within IRMOS isthe one of building a performance model of the applicationbehaviour. This means that, given application-specific con-figurable parameters (e.g. number of users, resolution of mul-timedia contents, etc.), and given possible performance levelsthat one may want to achieve, it should be possible to deter-mine what allocation is needed on the physical resources inorder to accomplish that. This is the core information neededby the SaaS provider to establish an accurate pricing policyfor the customers.

Application performance in the Cloud depends on manycomplex factors such as application workload, conditions ofthe network paths among the users and the servers and thecomputing workload of the physical hosts. Computing work-load factors are especially significant in multi-tenant Cloudswhere single hosts are used to service multiple applications.However, as already described, the IRMOS infrastructureembeds temporal isolation on the computing side, by meansof the real-time scheduler, and on the networking side, bymeans of QoS-aware protocols, realising temporal isolation.Thus, the level of interference among different VMs sharingthe same resources becomes negligible. The applications cantherefore be easily benchmarked using the facilities of the IR-MOS PaaS (see Sect. 4) and their performance can be mod-elled as a pure function of application-specific parametersand allocated resources, independently of other applicationsthat may be hosted on the same physical resources.

4 The IRMOS PaaS

The IRMOS PaaS, namely the IRMOS Framework Services(FS), acts as a mediator between the SaaS and the ISONIprovider. It consists of a family of services and tools that canbe used by the SaaS provider for benchmarking, modellingand managing applications on the service-oriented virtua-lised environment that is offered by the ISONI. We can makethe following categorisation of the FS:

123

Page 6: Virtualised e-Learning on the IRMOS real-time Cloud

156 SOCA (2012) 6:151–166

Fig. 2 Overview of the IRMOSplatform

– Engineering services: These include services for applica-tion modelling, benchmarking and performance model-ling, which are needed by the SaaS provider to make anapplication ready to be hosted in the underlying ISONICloud.

– Management services: These include services thatsupport the management of the full life-cycle of a ser-vice-oriented application, such as discovery, negotiation,reservation, enactment, monitoring and event handlingduring the execution of an application inside ISONI.

Using the services above, a description of the VSN can becreated and used by the FS for negotiating resources withthe ISONI provider. This description includes the requiredresources for the application service components (ASCs)and application client components (ACCs) that make up theapplication and their interconnections, including instances ofsome key FS responsible for real-time critical tasks, such asenactment and monitoring. Upon agreement, the ISONI pro-vider is responsible for delivering a “just-in-time” deploy-ment of the application on the reserved physical resourcesaccording to the specified QoS requirements. From then on,the end-user (consumer) can use the application without anyknowledge of the underlying Cloud. Figure 2 shows an over-view of the IRMOS platform and some of the basic conceptsbehind it.

4.1 Monitoring and benchmarking

By using the FS tools, the SaaS provider is able to modelbehavioural aspects of its application, which aid in the per-formance modelling analysis for deriving critical information

about the fluctuation of resource utilisation. To this direction,a number of important parameters when benchmarking theapplication needs to be defined. These include the criticalinputs of the services that influence the produced workload,as well as parameters whose values can be chosen by thecustomer depending on its needs, e.g. the maximum numberof end-users that are allowed to connect to an e-Learningapplication.

Another factor is the QoS outputs of the services that com-prise the application. These are collected and stored duringbenchmark runs by the FS monitoring system. The latter isa suite of components for monitoring, evaluating possibleperformance anomalies and visualising the performance ofthe application inside the VSN, along with the levels of theresources that are offered to the application by the ISON-I. These components acquire this information through anextensible interface that is used to execute application-spe-cific external programs for collecting high-level data aboutthe performance of the applications. Furthermore, the FSmonitoring system establishes communication with ISON-I’s monitoring system for retrieving low-level informationabout computing and networking resources of the VSN (seeFig. 2). More information about the design and implementa-tion of the FS monitoring system can be found in [15].

The monitoring information is aggregated and stored inthe FS and is used both for benchmarking and performancemodelling of an application to be deployed in the ISONI.Through this exchange of information, the FS are able tobenchmark the application and generate the rules needed forthe mapping of high-level parameters (used by the SaaS pro-vider for describing the QoS levels), to low-level parameters(used by the ISONI provider for the deployment of the VSN).

123

Page 7: Virtualised e-Learning on the IRMOS real-time Cloud

SOCA (2012) 6:151–166 157

These rules correlate the configurable workload inputs, theQoS outputs and the hardware assignments of the benchmarkruns and are used in conjunction with analytical approachesto generate an overall end-to-end performance model for theapplication. This model enables various trade-offs and con-figurations for selecting the most suitable combination ofresources at the lowest cost, whilst guaranteeing the QoSlevels needed by the SaaS (see Sect. 3.3).

4.2 Run-time QoS provisioning

In a dynamic environment such as the Cloud, the number ofthe resources and their availability can change frequently. Forexample, new resources (compute servers, file servers, etc)may be added, old ones may be removed or become temporar-ily unavailable for maintenance purposes, etc. Furthermore,there is always the chance of bad application performancemodelling that may result in deviations between the agreedand achieved QoS levels at run-time.

For these reasons, there is a need of continuous observa-tion of the resources and the application’s performance forpurposes such as fixing problems and tracking down theirorigin. Due to the fact that the business entities behind theSaaS, PaaS and IaaS often have conflicting interests and dif-ferent mechanisms and levels of fault tolerance, assessmenton the source of the problem can be a significant undertaking.To this end, monitoring data are collected during run-time andare being evaluated by the FS monitoring system to identifypossible deviations, the source of the problem and correctiveactions that could be undertaken. Typical corrective actionsinclude adjustments of the resource allocation to the applica-tion resource utilisation, reconfiguration of the steering ser-vice of the application, and notifying the end-user and theSaaS provider when certain events that affect the availabilityof the application occur, among others [12].

Apart from these corrective actions, the FS also enable theSaaS provider to define a range of application-driven eventsthat may automatically trigger certain actions during the run-time of the application. For example, it may be desirable forthe SaaS provider that the FS automatically launch a rene-gotiation/reconfiguration process when certain usage termsare reached. Such scenarios can be found in a variety of softreal-time applications. In the specific case of a virtual worldapplication, such as the one described in Sect. 5.2, the numberof users could be modelled into a trigger for launching auto-matic renegotiation and reconfiguration of the application bythe FS (see Sect. 8).

5 e-Learning applications

The IRMOS platform was especially designed to providestrong performance guarantees to distributed, interactive,soft real-time applications. In order to demonstrate the

Fig. 3 Components of the Mobile Outdoor e-Learning application

IRMOS platform functionality, the two interactive e-Learn-ing applications listed below were selected because of thesoft real-time requirements that they impose on the Cloudinfrastructure hosting them:

– Mobile Outdoor e-Learning, which is a mobile applica-tion, applied to outdoor usage, and

– Virtual World e-Learning, an ubiquitous application forenabling cooperative e-Learning.

Both applications were offered as SaaS and were executed onthe IRMOS IaaS layer (ISONI) after being modelled usingthe tools of the IRMOS PaaS layer (IRMOS FS).

5.1 Mobile Outdoor e-Learning

We focus on an e-Learning mobile instant content deliveryapplication, developed to take advantage of a service-ori-ented architecture paradigm, in which real-time requirementsplay an important role. In this scenario a user can receiveon a mobile phone e-Learning content relevant to the cur-rent geographical position (e.g. when approaching histori-cal monuments). It consists of a Tomcat5-based e-Learningserver that exploits a MySQL database for content manage-ment (see Fig. 4). The application is able to receive que-ries with GPS data from multiple clients, search the data-base and respond with e-Learning contents corresponding tothe provided GPS coordinates (see Fig. 3). The applicationserver is provided as a Web Application Archive (war) file,installed on Tomcat, and made available as a Virtual Machineimage within the IRMOS infrastructure. Using ISONI, eachinstance of the application can be assigned precise comput-ing and networking resources to ensure that the high-levelrequirements defined within the application provider’s Ser-vice-Level Agreement (SLA) can be met with an agreed levelof reliability.

The timing requirements of the application are mainlyrelated to the response times of individual requests submittedby the multitude of users. The response times are gatheredby ‘ascmon.jar’ and communicated to the FS by the ‘moni-tor’ script (see Fig. 4). As discussed in Sect. 3.3, thanks to

5 More information at: http://tomcat.apache.org.

123

Page 8: Virtualised e-Learning on the IRMOS real-time Cloud

158 SOCA (2012) 6:151–166

Fig. 4 Software architecture ofthe Mobile Outdoor e-Learningapplication

Fig. 5 The Virtual World e-Learning application’s GUI

the deployment within IRMOS, these response times dependmerely on high-level application-specific parameters, i.e. onthe number of concurrent users querying the same e-Learninginstance and the size of the downloaded content. In order toinvestigate application performance, we developed a multi-user client simulator that is capable of simulating the ran-dom movements of a certain number of users walking aroundgiven GPS coordinates. Furthermore, the simulator mimicsthe behaviour of the real mobile client associated with theapplication: whenever the monitored GPS coordinates movesufficiently away from the position of the last queried con-tent, a new request is submitted to the server with the newuser position. The number of users and a few parameters

governing how each emulated user exactly moves (e.g. theuser speed) determine the exact pattern of requests submittedto the server, and thus have a strong impact on the imposedserver load.

5.2 Virtual World e-Learning

In the Virtual World e-Learning application we considerremote users interacting through avatars in a virtual worldsystem. The Virtual World reproduces a museum, whereworks of art are associated to e-Learning lessons (see Fig. 5).The avatars can move independently from each other and cancommunicate through a chat or a voice line. When the avatars

123

Page 9: Virtualised e-Learning on the IRMOS real-time Cloud

SOCA (2012) 6:151–166 159

Fig. 6 Software architecture ofthe Virtual World e-Learningapplication

come close to a work of art, they can invite other avatars,through chat, to download the corresponding lesson. Eachuser can play the lesson by themselves, whilst communicat-ing via chat.

This collaborative virtual world application is composedby two main elements: the application service component(ASC), dealing with user’s communication, and the applica-tion client component (ACC or World Player), dealing withthe graphic rendering. The ASC builds on top of an OpenWonderland server with modified modules for IRMOS adap-tation and a MySQL database for managing performancedata. A detailed architecture of the virtual world applicationis shown in Fig. 6, in which the parts that have been createdor adapted to the IRMOS specifications, such as modules forconfiguration and monitoring, are marked with yellow colour.It should also be noted that the virtual world application pro-vides a multi-avatar simulator for benchmarking purposes.This tool allows the simulation of a large number of users,which are moving continuously to generate the maximumamount of traffic between the clients and the server.

Compared to the Mobile Outdoor e-Learning application,the virtual world application is more real-time intensive andrequires a greater number of high-level performance param-eters such as the avatar speed, the chatting quality, etc. TheVirtual World ASC must satisfy real-time constraints (e.g.response time and processing time) to support realisticallyfluid movement of the avatars. The performance of this com-ponent strongly depends on the number of connected users,because the quantity of dispatched messages increases poly-nomially with the number of avatars.

6 Performance estimation

In order to estimate what QoS level can be achievedusing different resource configurations, we use performance

modelling techniques. Many times, the application internalsoftware structure may be too complex to be modelled. Or,it may be unknown because developers are reluctant to sharedetailed information about their application internals, forconfidentiality purposes. In other cases, the use of externallibraries or components whose internals are unknown makesit impossible to build an exact model. So, from a modellingpoint of view, it is critical to be able to identify the expectedQoS output using a black-box approach.

Therefore, we use a combination of a stochastic model forpredicting statistics over the expected run-time networkingperformance [2], and an ANN for identifying the dependencyof a component’s QoS from factors like application-levelparameters (e.g. number of clients) or scheduling parame-ters (e.g. allocated budget and period). These two models,put together, allow for a precise estimate of the overall end-to-end QoS experienced by end-users.

6.1 Stochastic performance model

We built a Matlab model for simulating, by means of Monte-Carlo type discrete event simulation, a system composed of(see Fig. 7): a request generator (modelling the end-users), aPublic Wide Area Network (WAN), a Private Network inter-nal to an ISONI domain and the VMU hosting the actualASC. In order to account for interactivity, we modelled bothpaths from the user to the ASC and the other way round.The model uses a mix of exponential and Erlang probabil-ity distributions (see Fig. 8) for modelling the latencies ofapplication requests whilst traversing the involved networks,and it may also simulate packet loss due to buffer saturationin the various networks (particularly useful for UDP-basedcommunications).

The individual parameters of the model need to be tunedby resorting to proper benchmarking techniques. The behav-iour of the latencies inside the ISONI internal network may

123

Page 10: Virtualised e-Learning on the IRMOS real-time Cloud

160 SOCA (2012) 6:151–166

Fig. 7 Modelled elements ofthe e-Learning application

Fig. 8 Stochastic performance model: t_wan_in and t_wan_out aremodelled as exponential distributions, the other delays as Erlang ones

be accurately estimated thanks to the ISONI networking iso-lation, and they depend merely on the requested network-level QoS parameters specified in the VSN, and the expectedapplication request pattern. On the other hand, parametersrelative to the QoS-unaware WAN must be estimated basedon available statistics on the overall network workload fore-seen at the time of usage of the application. However, a wide-spread usage of ISONI would reduce the need for traversingQoS-unaware networks. The behaviour of the ASC was alsoestimated as an Erlang distribution. However, due to the non-trivial dependency of the performance from application-levelparameters, in addition to the resource allocation ones, theErlang parameters were tuned by resorting to an ANN model(see below).

The described simulator is capable of providing, for eachconfiguration, the full probability distribution of the end-to-end response times, as well as simpler statistics that maybe easily leveraged at design-time, such as the average or agiven percentile of the distribution. For example, this allowsfor finding the configuration parameters granting a given end-to-end response time with a given probability.

6.2 Artificial neural networks in the model

As already described in Sect. 6, the ASC part of Fig. 8 is mod-elled through Erlang distributions. However, the characteris-tics (mean value and standard deviation) of these distributionsare affected by application-level parameters (such as thenumber of users) and hardware-level parameters (such as thechoice of scheduling periods or CPU percentages). Thus, away of identifying and predicting these variations in the dis-tribution characteristics must be inserted, for each differentexecution of the service instance on the Cloud infrastruc-ture. We have chosen ANNs due to the fact that they canmodel this relationship (both linearly and non-linearly) and

interpolate it for cases that have not been met before. Fur-thermore, they represent a black-box approach, thus they donot need information regarding the internal structure of thesoftware component. This structure (e.g. source code) is notavailable as seen in the introduction.

An ANN model is used for modelling the time the serverneeds in order to retrieve the results from the internal data-base. The factors that are taken under consideration aremainly the number of connected clients and the schedulingdecisions (Q and P parameters). Following the black-boxapproach, the use of ANNs allows for an easy addition offurther inputs (or outputs) as needed (e.g. hardware-specificparameters like the reserved memory or processor speed),once the necessary training data sets are collected.

The investigation of the effect of parameters like the allo-cated CPU time Q over a period P is critical due to its influ-ence in the QoS output. The choice of P is mainly driven bythe time granularity for the allocation needed by the appli-cation. For example, for interactive applications, with fastresponse times and relatively light computations, the gran-ularity must be kept small (in the order of 10–100 ms). Forscientific applications performing long and heavy computa-tions, large periods will result in lower overheads (500 msand beyond).

The outputs of the ANN have been chosen to represent theaverage and the standard deviation of the expected responsetimes, as due to the configuration represented at the ANNinputs. These outputs are easily mapped to the Erlang param-eters needed for modelling the ASC temporal behaviour inthe general model in Fig. 8. The ANN model structure isdescribed in V.C., whilst the data gathered as training set arepresented in VI.B.

6.3 ANN structure and design

In order to implement the ANN, a standard form of networkwas selected. The type of operation that was desired wasfunction approximation, in order to determine the effect ofthe input parameters (number of clients, Q, P) on the pre-dicted output (mean value of inner server response time andstandard deviation). The collection of the data set was per-formed with the process described in [17]. Two more inputswere included, CPU speed and VM memory size, but themain focus was on the initial 3 parameters. The resulting

123

Page 11: Virtualised e-Learning on the IRMOS real-time Cloud

SOCA (2012) 6:151–166 161

Table 1 Structure of mean response time and standard deviation pre-diction ANN

Layer Transfer function Size (Neurons)

Mean response time Standard deviation

Input Tansig Logsig 5

Hidden Tansig Logsig 2

Output Linear (Purelin) Linear (Purelin) 1

Mean absolute errorMean response time 2.51%Standard deviation 2.75%

network for the mean time prediction was a 3-layer, feed-for-ward, back propagation network, created through the GNUOctave tool6. It was trained with the Octave ‘trainlm’ func-tion, using the Levenberg–Marquardt algorithm [26]. Thestructure of the network is shown in Table 1. All the inputsand outputs are normalised in the (−1,1) interval. A standardform of function approximation network was used, with onehidden layer and Tansig transfer functions for the input andhidden layer and linear transfer function for the output layer.

For the standard deviation, a similar process was followed(but with the Logsig transfer function) and the resulting net-work also appears in Table 1.

7 Experimental results

This section presents experimental results that validate thepresented approach, in terms of achieving temporal isolationand performance estimation accuracy of the ANN model.

First, the assumptions of temporal isolation over whichthe modelling technique relies are validated. Indeed, as men-tioned above, the IRMOS approach to performance esti-mation relies on the temporal isolation across co-locatedVMUs, so that the performance exhibited by each VMU isonly affected by the resource allocation performed for theVMU and its resource requirements (which in turn dependon parameters specific to the application), and not on thebehaviour and parameters of other co-located VMUs.

Then, some experimental data used for training the ANNmodels are described, and finally the accuracy of the ANN-based estimations is discussed. For the ANN section, themain motivation was to validate the accuracy of the predictedexpected outputs in terms of mean response time and standarddeviation. Furthermore, in order to validate why ANNs wereused, graphs depicting the variation of the metrics with regardto scheduling parameters are used to demonstrate the effect ofthe scheduling parameters. Different scheduling periods mayhave an effect on some of the metrics like standard deviation.

6 More information is available at: http://www.gnu.org/software/octave/.

0

50

100

150

200

250

0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5

VM

U1

Ave

rage

Res

pons

e-T

ime

VMU1 CPU Share

VM2 IdleVM2 CPU Share: 10%VM2 CPU Share: 20%VM2 CPU Share: 30%VM2 CPU Share: 40%VM2 CPU Share: 50%

Fig. 9 Average response time of the first VM as a function of the CPUshare (on the x-axis) assigned to it, at varying CPU shares assigned tothe second competing VM (corresponding to the various curves)

Furthermore, the results highlight the difference in the dis-tributions for varying number of users, which is a high-levelapplication parameter. This difference is predicted by theANNs for cases that may or may not been included in thedata set, through interpolation.

All experiments were conducted using instances of theMobile Outdoor e-Learning application that has been pre-sented in detail in Sect. 5.1.

7.1 Temporal isolation by real-time scheduling

We ran an experiment for the purpose of validating ourapproach to the temporal isolation of VMs concurrently run-ning on the same CPU based on real-time scheduling. Tothis purpose, we considered two instances of the e-Learn-ing application deployed on the same host and physical core.Two instances of the Mobile Outdoor e-Learning multi-usersimulator, deployed on a second machine in an isolated net-working context, were used to simulate the requests from 10different users. We collected the response times experiencedby the two multi-user simulator instances under various con-ditions in terms of the scheduling parameters configured forthe two VMs on the server host.

In Fig. 9 we report the average response time of the firstVM as a function of the CPU share (on the x-axis) assigned toit, at varying CPU shares assigned to the second competingVM (corresponding to the various curves). Under the idealconditions of perfect temporal isolation, we would like thesecond competing VM workload, ranging from completelyidle (continuous curve) to having a 50% of load on the sys-tem (dashed curve tagged with little triangles), to have noimpact at all on the performance of the first VM. This wouldcorrespond to having all the curves perfectly superimposed.

123

Page 12: Virtualised e-Learning on the IRMOS real-time Cloud

162 SOCA (2012) 6:151–166

Fig. 10 Standard deviation of response time with regard to changingP for 90 users

As it can be seen from the experimental results, the softreal-time scheduler achieves a nearly good approximationof such a condition, realising a set of curves which arequite close to each other, where the increase of computingresources granted to the second VM corresponds to a slightdecrease of the performance of the first VM. This may bemainly attributed to an increased contention on the cache,and constitutes a minimum of interference which cannot beremoved. Other factors of interference which are not trivialto keep under control are due to shared resources on the hostOS, like the networking stack and interrupts. For example,see [8] for a discussion of the interference due to network-intensive VMs, and the extent to which it can be controlledby real-time scheduling of the CPU.

7.2 Experimental performance of the e-Learningapplication

In this section, experimental performance data gathered forvarious configurations of both application-level and resourceallocation parameters are presented. The range of values thatwere altered for the configuration parameters are:

– Number of users: 30–150– Q/P (CPU share) : 20–100% with a step of 20– P: 10,000–5,60,000 (µs) with a step of 50,000

For each configuration, about 800 response times were col-lected, and the corresponding average and standard deviationfigures computed. An indicative set of these measurementsis discussed below.

The effect of changing granularity on the deviation ofthe response time values can be observed in Fig. 10. Thisis expected since with high values of P , the service has long

Fig. 11 Mean response time with regard to changing P for 70 usersand 40% CPU share

active and inactive periods. If the requests fall in the activeinterval, they will be satisfied quickly but if they fall in theinactive one then they will have to wait until the next activeperiod begins. This effect decreases at increasing allocatedCPU shares, since then the CPU is almost dedicated to theapplication and whenever a request arrives, it is served. Themean response time, as shown in Fig. 11, seems not to beaffected greatly given that the percentage of CPU assignedis the same.

In Fig. 12, the comparison of the normalised probabilitydistribution of delay times is shown for two different numbersof users (30 and 50 users). The difference especially in themaximum values of the distributions depicts the effect of theapplication workload in the response times. Figure 13 showsthe cumulative distribution function and the confidence inter-vals for the same combination of number of users, given thatthe CPU share allocated to the VMU remains the same.

In Fig. 14, all the different configurations are shown fortwo different numbers of users. In this case, each group ofcolumns (the first high one followed by 4 lower ones) repre-sents one period configuration (P) for different CPU sharepercentages. The upper line is for low utilisation. Whilst theutilisation increases, the response time decreases. In the hor-izontal axis, the different P configurations represent increas-ing period values.

From these measurements it seems interesting that the bestgranularity (P) should depend also on the percentage of theCPU assigned to the application. In this occasion, for lowpercentages of utilisation it is best to assign values near themiddle of the investigated interval (10,000–5,60,000µs), asis depicted in Fig. 14. For higher percentages of utilisation,lower values of P are more beneficial for the response timesof the application. Furthermore, Fig. 14 highlights the effectof the increased CPU share allocation to the response time.

7.3 Prediction accuracy of the ANN model

For the estimation of the ANN accuracy, about 30% (87 testexecutions) of the data set was used only for validation. After

123

Page 13: Virtualised e-Learning on the IRMOS real-time Cloud

SOCA (2012) 6:151–166 163

Fig. 12 Comparison ofnormalised probabilitydistribution of delay times fordifferent number of users

Fig. 13 Cumulative distribution function and confidence interval fordifferent number of users

Fig. 14 Different P and CPU shares for 110 users

Fig. 15 Accuracy of the ANN for mean response time prediction

Fig. 16 Accuracy of the ANN for standard deviation prediction

the training of the model with the 70% of the test cases,we applied the according inputs of the validation cases andcompared the estimated output with the observed one. Theoverall accuracy was around 2.5% and the error of the net-work for each individual test case appears in Fig. 15. Foreach validation case, the network error appears in Fig. 16.

The accuracy of the ANN models is evident from thesemeasurements, giving sufficient reliability for this part of theoverall model. The maximum deviation from the validationcases is very satisfying and so is the mean error for all theexperiments. Furthermore, the predictions are not biased, afactor that is critical for the merging of different modellingapproaches like in this paper.

8 The virtual world use case

Within the IRMOS context, the Virtual World e-Learningapplication, as presented in Sect. 5.2, was used to pro-mote collaborative education through an immersive expe-rience. In more detail, several people (learners, teachers,tutors, organisers, etc.) were allowed to meet in an interactivethree-dimensional virtual world environment and interactwith 3D-objects that were linked to e-Learning contents(Fig. 5).

123

Page 14: Virtualised e-Learning on the IRMOS real-time Cloud

164 SOCA (2012) 6:151–166

Fig. 17 Negotiation ofautomatic resource extension

This particular use case was used to demonstrate the run-time QoS adaption capabilities behind the IRMOS platformas described in Sect. 4.2. The negotiation of the virtual worldapplication is based on a “pay-as-you-reserve” model, i.e. thecost fluctuates according to the amount of resources that arereserved to achieve the requested performance which heav-ily depends on the maximum number of avatars. Thereforein order to maintain the given QoS level, any extra requestsfor entrance should be rejected when the number of avatarsalready in the Virtual World is maximum.

However, the IRMOS ISONI is able to support the seam-less extension of the resources, by either extending thereserved CPU, memory and network bandwidth share of thereserved resources or by performing other actions such aslive-migrating the application onto other physical resources.Furthermore, the IRMOS FS Monitoring system, as alreadydescribed in Sect. 4, is continuously collecting high-levelmonitoring data during run-time, such as the number oflogged avatars and requests for entrance, and can combinethem into rules that may trigger certain actions. To this end,the IRMOS customer is given the ability to satisfy a demandfor extra resources at run-time, in order to maintain the overallperformance without interruption. This has been concretisedby a structured negotiation in which the customer can enablethe IRMOS FS to automatically re-negotiate an extensionof the resources by pre-agreeing to pay an additional cost.

Figure 17 shows the application negotiation in which 5 addi-tional avatars are foreseen.

At the end of the negotiation, the resources for the basicnumber of avatars (i.e. 10 avatars) are reserved and theapplication is deployed. Thus, up to 10 avatars can enter theVirtual World with the guarantee that the requested QoS levelis respected. With 10 already logged avatars, any request afterthat is being blocked, whilst the IRMOS FS are re-negotiatingwith the ISONI provider for resources that are able to sustainthe 15 avatars under the same QoS level. In the background,the allocated resources (CPU share, memory, network band-width) are being increased according to the output of theperformance model for the new maximum number of ava-tars. At the end of the re-negotiation, a VSN able to supportadditional users (15 avatars in total) is instantiated and thevirtual world application is reconfigured to accept 15 ava-tars. The blocked avatar requesting access is now allowed toenter, and the customer is charged with the agreed additionalfee. The entire process of renegotiation and reconfiguration,including the addition of the blocked avatar in the VirtualWorld is complete in less than a minute (≈ 50 s) with themonitoring polling rate set to 10 s for keeping the monitoringoverhead absolutely negligible. However, if appropriate, onecan increase the monitoring rate, to achieve a more respon-sive infrastructure at the cost of higher monitoring overheadsimposed on computing and networking resources.

123

Page 15: Virtualised e-Learning on the IRMOS real-time Cloud

SOCA (2012) 6:151–166 165

Fig. 18 Illustrative example ofthe QoS adaption process usingsnapshots (enclosed in a box).Moving left to right: at the firstsnapshot the renegotiation istriggered, at the second snapshotthe reserved CPU share isincreased and at the lastsnapshot the virtual worldapplication is reconfigured toaccept 15 avatars. The blockedavatars are allowed to log-inafterwards

Figure 18 is an illustrative example of the renegotiationprocess showing also the XML-based syntax of the gatheredmonitoring data (see yellow box labelled ‘monitor script’ inFig. 6). The monitoring parameters have 10 values, with eachof them collected 10 s after its previous one.

9 Conclusions

In this work, we discussed how two real e-Learning distrib-uted applications have been deployed with predictable andstable QoS levels within the IRMOS platform. We showedhow we flanked the temporal isolation mechanisms avail-able within the platform with proper performance analysis,modelling and benchmarking techniques, in order to inves-tigate the performance levels achievable by the applicationunder the various possible configurations. Furthermore, wedemonstrated the run-time QoS adaption capabilities behindthe IRMOS platform and the way they build proper moni-toring and evaluation of application performance on top. Inthe future, we plan to leverage the black-box approach forperformance estimation, so as to apply the described tech-nique to other applications that are already being adaptedfor deployment within IRMOS, such as distributed editingof professional-quality video and a virtual reality applica-tion. Also, we plan to extend the used performance modelsby accounting for possibly heterogeneous hardware withinan ISONI domain. Finally, we plan to extensively comparethe predicted performance and the actually realised one, inpresence of a variety of other deployed workload types, fromcompute-intensive to network-intensive ones.

References

1. Abeni L, Buttazzo G (1998) Integrating multimedia applicationsin hard real-time systems. In: Proceedings of the IEEE real-timesystems symposium, Madrid

2. Addis M, Zlatev Z, Mitchell W, Boniface M (2009) Mod-elling interactive real-time applications on service orientedinfrastructures. In: Proceedings of NEM summit—towards futuremedia internet, September 2009

3. Benkner S, Engelbrecht G (2006) A generic QoS infrastructurefor grid web services. In: Proceedings of the international confer-ence on internet and web applications and services, Guadeloupe,February 2006

4. Blake S, Black D, Carlson M, Davies E, Wang Z, Weiss W (1998)RFC2475. An architecture for differentiated service

5. Checconi F, Cucinotta T, Faggioli D, Lipari G (2009) Hierarchicalmultiprocessor CPU reservations for the Linux Kernel. In: Pro-ceedings of the 5th international workshop on operating systemsplatforms for embedded real-time applications (OSPERT), Dublin,June 2009

6. Cherkasova L, Gupta D, Vahdat A (2007) Comparison of the threeCPU schedulers in xen. SIGMETRICS Perform Eval Rev 35:42–51

7. Cucinotta T, Anastasi G, Abeni L (2009) Respecting temporal con-straints in virtualised services. In: Proceedings of the 2nd IEEEinternational workshop on real-time service-oriented architectureand applications (RTSOAA), Seattle, July 2009

8. Cucinotta T, Giani D, Faggioli D, Checconi F (2010) Providingperformance guarantees to virtual machines using real-time sched-uling. In: Proceedings of the 5th workshop on virtualization andhigh-performance Cloud computing (VHPC), Ischia (Naples),August 2010

9. Dunlap G (2009) Scheduler development update. Xen SummitAsia, Shanghai

10. Durkee D (2010) Why cloud computing will never be free. Com-mun. ACM 53(5):62–69. doi:10.1145/1735223.1735242

11. Germain-Renaud C, Loomis C, Moscicki J, Texier R (2008) Sched-uling for responsive grids. J Grid Comput 6:15–27. doi:10.1007/s10723-007-9086-4

12. Gogouvitis S, Konstanteli K, Waldschmidt S, Kousiouris G,Katsaros G, Menychtas A, Kyriazis D, Varvarigou T (2011) Work-flow management for soft real-time interactive applications invirtualized environments. Future generation computer systems (inPress)

13. Gupta D, Cherkasova L, Gardner R, Vahdat A (2006) Enforcingperformance isolation across virtual machines in Xen. In: Proceed-ings of the ACM/IFIP/USENIX international conference on mid-dleware, Springer, New York, pp 342–362

14. He Z, Peng C, Mok A (2006) A performance estimation toolfor video applications. In: Proceedings of the 12th IEEE real-

123

Page 16: Virtualised e-Learning on the IRMOS real-time Cloud

166 SOCA (2012) 6:151–166

time and embedded technology and applications symposium. IEEEcomputer society, Washington, pp 267–276

15. Katsaros G, Kousiouris G, Gogouvitis SV, Kyriazis D, Varvari-gou TA (2010) A service oriented monitoring framework for softreal-time applications. In: SOCA’10, pp 1–4

16. Konstanteli K, Cucinotta T, Varvarigou TA (2010) Optimum allo-cation of distributed service workflows with probabilistic real-timeguarantees. Serv Oriented Comput Appl 4:229–243

17. Kousiouris G, Checconi F, Mazzetti A, Zlatev Z, Papay J, VoithT, Kyriazis D (2010) Distributed interactive real-time multimediaapplications: a sampling and analysis framework. In: Proceedingsof the 1st international workshop on analysis tools and methodol-ogies for embedded and real-time systems (WATERS), Brussels,July 2010

18. Kousiouris G, Kyriazis D, Konstanteli K, Gogouvitis S, KatsarosG, Varvarigou T (2010) A service-oriented framework for GNUOctave-based performance prediction. In: Proceedings of the IEEEinternational conference on services computing (SCC), Miami,August 2010

19. Kousiouris George, Cucinotta Tommaso, Varvarigou Theodora(2011) The effects of scheduling, workload type and consolida-tion scenarios on virtual machine performance and their predic-tion through optimized artificial neural networks. J Syst Softw84(8):1270–1291

20. Kyriazis D, Tserpes K, Menychtas A, Sarantidis I, Varvari-gou T (2009) Service selection and workflow mapping for Grids:an approach exploiting quality-of-service information. ConcurrComput Pract Exp 21:739–766

21. Lee JW, Asanovic K (2006) METERG: measurement-based end-to-end performance estimation technique in QoS-capable multipro-cessors. In: Proceedings of the 12th IEEE real-time and embeddedtechnology and applications symposium (RTAS)

22. Lin B, Dinda P (2005) Vsched: mixing batch and interactive virtualmachines using periodic real-time scheduling. In: Proceedings ofthe IEEE/ACM conference on supercomputing, November 2005

23. Lin B, Dinda P (2006) Towards scheduling virtual machines basedon direct user input. In: Proceedings of the 2nd international work-shop on virtualization technology in distributed computing, Wash-ington, November 2006

24. Liu CL, Layland JW (1973) Scheduling algorithms for multipro-gramming in a hard real-time environment. J ACM 20:46–61

25. Menage P (2008) CGROUPS, Available on-line at: http://www.mjmwired.net/kernel/Documentation/cgroups.txt.

26. More J (1978) The Levenberg–Marquardt algorithm: implemen-tation and theory. In: Numerical analysis, volume 630 of lecturenotes in mathematics. Springer, Berlin, pp 105–116. doi:10.1007/BFb0067700

27. Nathuji R, Kansal A, Ghaffarkhah A (2010) Q-Clouds: manag-ing performance interference effects for QoS-aware Clouds. In:Proceedings of the 5th european conference on computer systems(EuroSys), Paris, April 2010

28. Rosen E, Viswanathan A, Callon R (2001) RFC3031: Multiproto-col label switching architecture. IETF, January 2001

29. Shirazi B, Welch L, Ravindran B, Cavanaugh C, Yanamula B,Brucks R, Huh E (1999) Dynbench: a dynamic benchmark suite fordistributed real-time systems. In: Proceedings of IPDPS workshopon embedded HPC systems and applications, San Juan, Puerto Rico

30. Voith T, Kessler M, Oberle K, Lamp D, Cuevas A, Mandic P, ReifertA (2008) ISONI whitepaper, September 2008

31. Wroclawski J (1997) RFC2210, The use of RSVP with IETF inte-grated services. IETF, September 1997

32. Wroclawski J (1997) RFC2211, specification of the controlled loadquality of service. IETF, September 1997

123