IEEE TRANSACTION ON CLOUD COMPUTING 1 Energy-efﬁcient ... · Transactions on Cloud Computing IEEE TRANSACTION ON CLOUD COMPUTING 1 Energy-efﬁcient Adaptive Resource Management

2168-7161 (c) 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TCC.2016.2551747, IEEETransactions on Cloud Computing

IEEE TRANSACTION ON CLOUD COMPUTING 1

Energy-efficient Adaptive Resource Management forReal-time Vehicular Cloud Services

Mohammad Shojafar, Member, IEEE, Nicola Cordeschi and Enzo Baccarelli

Abstract—Providing real-time cloud services to Vehicular Clients (VCs) must cope with delay and delay-jitter issues. Fog computing isan emerging paradigm that aims at distributing small-size self-powered data centers (e.g., Fog nodes) between remote Clouds andVCs, in order to deliver data-dissemination real-time services to the connected VCs. Motivated by these considerations, in this paper,we propose and test an energy-efficient adaptive resource scheduler for Networked Fog Centers (NetFCs). They operate at the edge ofthe vehicular network and are connected to the served VCs through Infrastructure-to-Vehicular (I2V) TCP/IP-based single-hop mobilelinks. The goal is to exploit the locally measured states of the TCP/IP connections, in order to maximize the overallcommunication-plus-computing energy efficiency, while meeting the application-induced hard QoS requirements on the minimumtransmission rates, maximum delays and delay-jitters. The resulting energy-efficient scheduler jointly performs: (i) admission control ofthe input traffic to be processed by the NetFCs; (ii) minimum-energy dispatching of the admitted traffic; (iii) adaptive reconfiguration andconsolidation of the Virtual Machines (VMs) hosted by the NetFCs; and, (iv) adaptive control of the traffic injected into the TCP/IPmobile connections. The salient features of the proposed scheduler are that: (i) it is adaptive and admits distributed and scalableimplementation; and, (ii) it is capable to provide hard QoS guarantees, in terms of minimum/maximum instantaneous rates of the trafficdelivered to the vehicular clients, instantaneous rate-jitters and total processing delays. Actual performance of the proposed schedulerin the presence of: (i) client mobility; (ii) wireless fading; and, (iii) reconfiguration and consolidation costs of the underlying NetFCs, isnumerically tested and compared against the corresponding ones of some state-of-the-art schedulers, under both syntheticallygenerated and measured real-world workload traces.

Index Terms—TCP/IP-based Vehicular Cloud Computing, Cognitive Computing, Virtualized Fog Centers, Adaptive ResourceManagement, Energy-efficiency.

F1 INTRODUCTION AND REFERENCE VEHICULARSCENARIO

V EHICULAR Cloud Computing (VCC) is a promisingparadigm that aims at merging Mobile Cloud Comput-

ing (MCC) and Vehicular Networking (VN), in order to givearise to integrated communication-computing platforms [1].While safety applications have been the prime motivationbehind VN, the final target of VCC is to shrink the spectrumof the supported services, in order to include the emergingFuture Internet Applications [2]. These are infotainmentapplications (such as, for example, Netflix and VTube), thatexploit Infrastructure-to-Vehicle (I2V) links for providingdata dissemination and content delivery services to con-nected Vehicular Clients (VCs) [3]. Being of streaming type,these applications are delay and delay-jitter sensitive [4], sothat the large delay and delay-jitter typically induced byWide Area Networks (WANs) preclude the direct utilizationof remote clouds as servers (see, for example, Chapter 8 of[5]).

Hence, a new paradigm is emerging to meet the QoSrequirements of vehicular Future Internet Applications, gen-erally referred to as Fog Computing (shortly, Fog) [6], [7]. Invehicular scenarios, Fog nodes are virtualized networkeddata centers (e.g., NetFCs), which are hosted by Road SideUnits (RSUs) at the edge of the vehicular network, so to givearise to a three-tier Cloud-Fog-VC hierarchical architecture(see Fig.1a). Similar to Cloud centers, Fog nodes provide

• M. Shojafar, N. Cordeschi and E. Baccarelli are with the Dept. ofInformation Engineering, and Telecommunication, Sapienza Universityof Rome, via Eudossiana 18, 00184 Rome, ItalyE-mail:mohammad.shojafar, nicola.cordeschi,[email protected]

Preprint version April 06, 2016.

bandwidth, compute support, storage and application ser-vices to the connected VCs on a per-VC basis. However,the distinguishing features of the Fog paradigm are thefollowing four ones [6], [7]: (i) Edge location and locationawareness: being deployed in proximity of the VCs, Fognodes may efficiently exploit the awareness of the connec-tion states for the support of delay and delay-jitter sensitiveapplications, such as VTube-like interactive applications [3];(ii) Geographical deployment: Fog nodes support serviceswhich demand for widely distributed geographical deploy-ments, as, for example, vehicular video streaming services;(iii) Predominance of wireless access and support for themobility: Fog nodes may exploit Infrastructure-to-Vehicular(I2V) single-hop IEEE 802.11-like wireless links for datadissemination; and, (iv) Low energy consumption: since, invehicular applications, Fog nodes are hosted by RSUs, theyare equipped with capacity-limited batteries, that should be(hopefully) re-charged through solar cells. Hence, attainingenergy efficiency is a central target in the Fog paradigm [7].

1.1 Technical contribution overview and main idea ofthe paperMotivated by the aforementioned considerations, we fo-cus on the vehicular scenario of Fig.1a. In this scenario,Fog-equipped RSUs broadcast locally processed data tosmartphone-equipped VCs through point-to-point TCP/IP-based I2V connections, which rely, in turn, on one-hopIEEE802.11-type wireless links [8]. Regarding the technicalcontribution of the work and the novelty of the pursuedapproach, we point out that the resource management prob-




lem afforded in this paper jointly embraces (see Fig.1b):(i) the adaptive control of the input and output trafficflows by coping with the random (possibly, unpredictable)fluctuations of the input traffic to be processed and thestates of the utilized TCP/IP connections; (ii) the adaptivereconfiguration of the per-VM task sizes and processingrates; (iii) the adaptive reconfiguration of the intra-Fog per-link communication rates; and, (iv) the adaptive consolida-tion of the virtualized physical servers hosted by the Fognodes. The pursued objective is the minimization of theaverage communication-plus-computing energy wasted bythe overall TCP/IP-based Fog platform of Fig.1b under hardQoS requirements (e.g., hard bounds) on: (i) the averageenergy wasted by each TCP/IP-mobile connection and itsminimum transmission rate; and, (ii) the per-connectionmaximum computing-plus-communication delay and jit-ter induced by the NetFC platform of Fig.1b. Due to thepossible high number of involved VCs, RSUs and hostedVMs, we also require that the resulting adaptive schedulerallows distributed and scalable implementation. Overall, dueto its self-sensing and self-configuring capabilities and dis-tributed nature, we anticipate that the proposed resourcescheduler is an instance of the emerging paradigm of the(networked) Cognitive Computing [3].

1.2 Related work and outline of the paper

An examination of the (recently published) related worksupports the conclusion that the adaptive optimization ofthe aforementioned scheduling actions has been till nowpursued by following three main parallel (e.g., disjoint) re-search directions, so that the proposed cognitive computing-inspired joint approach seems to be somehow new.

Specifically, a first research direction focuses on task-scheduling algorithms for the minimization of the energyin reconfigurable data centers that serve static clients [9],[10], [11]. For this purpose, in [9], an energy-efficient greedy-type scheduling algorithm is presented. It maps jobs to VMsand, then, maps VMs to Dynamic Voltage Frequency Scaling(DVFS)-enabled servers under upper bounds on the allowedper-job execution time. Interestingly, the resulting schedulerrelies on a suitable version of the First-Fit-Decreasing (FFD)heuristic, in order to perform the most energy efficient VM-to-server mapping and assignment of the processing ratesto the servers. Its energy performance is, indeed, noticeable.However, we note that: (i) the application scenario of [9]does not consider, by design, mobile clients; and, (ii) theadmission control of the input workload is not performedand the bandwidths of the intra-datacenter links are as-sumed fixed. Energy-efficient joint reconfiguration of theoverall communication-plus-computing resources in virtu-alized data centers is the topic of [10], where an approachbased on the stochastic nonlinear optimization is pursued.Although the resulting scheduler is adaptive and guaranteesbounded per-job execution times, we point out that: (i) theapplication scenario of [10] does not consider mobile clients;and, (ii) resource consolidation and admission control of theinput traffic are not performed. Energy-saving dynamic pro-visioning of the computing resources in virtualized greendata centers is the topic of [11]. Specifically, in order toavoid the prediction of future input traffic, the authors of

[11] resort to a Lyapunov-based technique that dynami-cally scales up/down the sizes of the VM’s tasks and thecorresponding processing rates by exploiting the availablequeue information. Although the pursued approach doesnot suffer from prediction errors, it relies on an inherentexecution delay-vs.-utility tradeoff, that does not allow usto account for hard deadline on the execution times.

A second research direction deals with the energy-efficient traffic offloading from VCs to serving RSUs byexploiting the underlying Vehicular-to-Infrastructure (V2I)wireless connections [12], [13]. In this framework, the focusof [12] is on the optimized task and processing rate mappingof an assigned Task Interaction Graph (TIG) to a computinggraph composed by networked reconfigurable computingnodes. As in our approach, hard constraints on the overallexecution times are considered by [12]. However, unlikeour approach, we point out that: (i) the focus of [12] is onthe traffic offloading from mobile devices to remote clouds,so that the resulting task scheduler does not support, bydesign, data dissemination and content delivery services;(ii) the joint task and computing rate mapping afforded in[12] is, by design, static; and, (iii) the scheduler in [12] doesnot perform real-time reconfiguration and/or consolidationof the computing resources hosted by the serving cloud.The authors of [13] develop a distributed scheduler forperforming mobile-to-access point traffic offloading. Thescheduler enforces cooperation among the mobile devices,in order to minimize the average energy consumption of themobile devices under per-device hard upper bounds on thevolumes of the traffic to be offloaded to the remote cloud.For this purpose, in [13], it is assumed that mobile devicescan execute tasks locally, offload tasks to other cooperativemobile devices or to the remote cloud, in accordance withthe decision taken by the proposed scheduler. Interestingly,it exploits a Lyapunov-based optimization approach, inorder to attain a good tradeoff between the average energywasted by the mobile devices for local/remote traffic of-floading and the volume of the Internet traffic generatedtowards the remote cloud. Thus, similarly to our work, theenergy-vs.-traffic tradeoff is the focus of [13]. However, wepoint out that: (i) the work in [13] does not consider amobile scenario; (ii) the scheduler in [13] does not enforceper-client QoS guarantees on the maximum execution-plus-communication delay and/or the minimum processing rate;and, (iii) the energy wasted by the remote cloud for pro-cessing the offloaded traffic is not included in the objectivefunction of [13].

A third research direction focuses on the energy con-sumption of hand-held wireless devices when traffic of-floading over 3G/4G/WiFi connections is performed [14],[15], [16]. Specifically, [14] focuses on the CPU energy con-sumption of wireless devices by profiling the volume ofcomputation that can be performed under a given energybudget. Interestingly, this paper supports the conclusionthat DVFS does affect the computing energy efficiency ofwireless devices, but not radically. Interestingly, it is showedin [14] that the number of performed CPU cycles mainlydepends on the sizes of the tasks and the burst factorsof the task streams to be processed. According to thisobservation, a main result of [14] is the characterizationof the relationship between task sizes, average number of




needed CPU cycles and burst factors of the processed taskstreams. Similar results are presented in [15] and [16], whereanalytical models for the average energy consumptionsof 3G/WiFi and 4G connections are presented and testedthrough field trials. Overall, the contributions in [14], [15]and [16] do not afford, by design, resource managementand/or scheduling optimization problems and the carriedout energy analysis does not consider the effect of QoSconstraints on the allowed computing-plus-communicationdelay.

The remainder of this paper is organized as follows.After presenting the architecture of the considered TCP/IP-based vehicular Fog platform in Section 2, the proposed jointscheduler is developed in Sections 3 and 4. Implementationaspects and implementation complexity of the proposedscheduler are the topics of Section 5. Numerical resultsattained through performance tests are presented in Section6, while some conclusions are drawn in Section 7. Dueto space limitations, detailed proofs of the most analyticalresults are not reported here. The interested reader may referto [17].

Regarding the main adopted notation, E. indicatesexpectation, , means ”equal by definition”, [x]ba and [x]+indicate: minb; maxa;x, and: max0;x, respectively.Finally, Eσϕσ, s, q denotes the expectation of the multi-argument function ϕσ, s, q carried out over the probabil-ity density function (pdf) of the random variable (r.v.) σ.

2 THE CONSIDERED VCC INFRASTRUCTURE

Several recent research efforts target I2V networked in-frastructures for the support of data dissemination andcontent delivery services [18]. The main building blocks ofthese architectures are a number of static RSUs, which aredeployed along the road and are spaced apart by Di (m).Each RSU serves a spatial cluster of radius Ra (m). It actsas a serving point for the VCs currently traveling on theserved cluster and it is equipped with computing capability,in order to play the role of a Fog node [18]. The RSUsmay be also inter-connected by a wireless backbone (see thehorizontal orange rays in Fig.1a), in order to allow inter-Fogdata exchange when the served VCs move over adjacentspatial clusters [18], [19]. According to the Dedicated Short-Range Communication (DSRC) standard for the IntelligentTransportation System (ITS) [2], an RSU provides (possibly,multiple) I2V point-to-point connections to the served VCs.At this regard, we shortly note that the current radio bandof vehicular networks is located at 5.9 GHz and partitionedinto seven channels. One control channel is designed toestablish I2V transport connections, while the remainingservice channels are employed to download data throughpoint-to-point single-hop IEEE 802.11p radio links (see theRSU-to-vehicle green ray at the bottom of Fig.1a) [8]. For thispurpose, the transmission time over each service channel ispartitioned into slots, TS (s) is the slot duration, t is thediscrete-time slot index, and the t-th slot spans the semi-open time interval [tTS , (t+ 1)TS ], t ≥ 0.

2.1 Input traffic and queue model in virtualized FognodesIn a virtualized Fog node, each VM processes the currentlyassigned task by self-managing own local virtualized stor-

age/computing resources. When a request for a new jobis submitted to the Fog node, the corresponding resourcescheduler adaptively performs both admission control andallocation of the available virtual resources. Hence, accord-ing to the typical architecture presented, for example, in[7], Fig.1b reports the main blocks which operate at theMiddleware layer of the considered Fog node. Roughlyspeaking, a Fog node is composed by: (i) an AdmissionControl Router (ACR); (ii) an input buffer of size NI ; (iii) areconfigurable computing platform and the related switchedVirtual LAN (VLAN); (iv) an output queue of size NO ; and,(v) an adaptive scheduler, that reconfigures and consolidatesthe available computing-communication resources and alsoperforms the control of the input/output traffic flows.

Specifically, at the end of slot t, input requests ofprocessing new jobs arrive at the input of the ACR ofFig.1b. This happens according to a random real-valuedarrival process a(t) ∈ R+

0 , t ≥ 0, that is limited up toamax ∈ R+

0 Information Units (IUs) per slot1. The arrivalprocess a(t), t ≥ 0 is assumed to be independent fromthe current backlogs of the input/output queues of Fig.1b.Furthermore, since, in our framework, it may be the ag-gregation of multiple (possibly, heterogeneous) traffic flowsgenerated by the remote cloud, nearby Fog nodes and VCs[19], we do not introduce any a priori assumption about thestatistical behavior of a(t). According to this position, letλ(t) ∈ R+

0 (IU/slot) be the number of IUs out of a(t) thatare admitted into the input queue of Fig.1b at the end ofslot t, respectively. We assume that any new arrival that isnot admitted by the ACR of Fig.1b is declined. Thus, wehave: 0 ≤ λ(t) ≤ a(t), so that: 1− (λ(t)/a(t)), is the fractionof the input traffic that is rejected at slot t. Furthermore,we consider (time-slotted)G/G/1/NI andG/G/1/NO fluidsystems for modeling the input and output queues of Fig.1b,respectively. Due to the admission control, both queues areloss-free and they implement the FIFO service discipline.Let s(t) ∈ R+

0 and q(t) ∈ R+0 be the numbers of IUs stored

by the input and output queues of Fig.1b at the beginningof slot t. Furthermore, let Ltot (IU/slot) be the size of theincoming workload that: (i) is drained from the input queueat the beginning of slot t; and, (ii) is injected into the outputqueue at the end of slot t. Finally, let r(t) (IU/slot) bethe workload that is drained from the output queue andtransmitted over the vehicular TCP/IP connection of Fig.1bduring slot t. Hence, the time evolutions of the backlogss(t) ∈ R+

0 , t ≥ 0 and q(t) ∈ R+0 , t ≥ 0 of the input

and output queues are dictated by the following Lindley’sequations, respectively:

s(t+ 1) = [s(t)− Ltot(t)]+ + λ(t), t ≥ 0, (1)

q(t+ 1) = [q(t)− r(t)]+ + Ltot(t), t ≥ 0. (2)

2.2 The considered virtualized Fog architecture

Let NS be the number of physical servers which equipthe Fog node of Fig.1b, and let Mmax(k), k = 1, . . . , NS ,be the maximum number of VMs that can be hosted by

1. The meaning of an IU is application dependent. It may representa bit, byte, segment or even an overall large-size application task (forexample, a large image). We anticipate that, in the carried out tests ofSection 6, IUs are understood as Mbit.




(a)

(b)

Fig. 1: (a): The considered Cloud-Fog-Vehicular system architecture; (b): A snapshot of the assumed intra-Fog node architecture. Itoperates at the Middleware layer and Software as a Service (SaaS) is the provided service mode. Black boxes are Virtual NetworkInterface Cards (VNICs). RSU:= Road Side Unit; WAN:= Wide Area Network; LAN:= Local Area Network.

the k-th physical server, so that: M ,∑NS

k=1Mmax(k), isresulting maximum number of VMs hosted by the Fog node.In principle, each VM operates as a virtual server, which iscapable to process fi IUs per second (e.g., i is the VM index).Depending on the size L(i) (IU) of the task to be currentlyprocessed by VM(i), the corresponding processing rate fimay be adaptively scaled at run-time through DVFS tech-niques. In our framework, fi may assume values over theinterval [0, fmax

i ], where fmaxi (IU/slot) is the maximum

processing rate of VM(i).

Being Ltot the overall size of the current workload to beprocessed, let L(i) ≥ 0, i = 1, . . . ,M , be the size of thetask that the Adaptive Load Dispatcher of Fig.1b assigns toVM(i), so that, the following constraint:

∑Mi=1 L(i) = Ltot,

guarantees that the overall workload is partitioned into (atthe most)M parallel tasks. Hence, the set of attributes whichcharacterize each VM is:

Φ(ηi),∆, Emaxc (i), Eidlec (i), fmaxi , i = 1, 2, . . . ,M, (3)

where ∆ is the maximum per-slot and per-VM allowedprocessing time (in seconds); E idlec (i)(Joule) is the staticenergy consumed by VM(i) in the idle state; Emaxc (i)(Joule) is the maximum energy consumed by VM(i); fmaxi

is the maximum processing rate of VM(i); and Φ(ηi) is theutilization function of VM(i), which is typically evaluatedas in [20]: Φ(ηi), (fi/f

maxi )2, with ηi , fi/fmaxi .




2.3 Computing energy and reconfiguration cost in vir-tualized Fog nodes

It is shown in [20] that there is a linear relationship betweenthe CPU utilization Φ(ηi) by VM(i) and the correspondingcomputing energy consumption Ec(i), so that we can write:

Ec(i) = Eidlec (i) + (fi/fmaxi )2(Emaxc (i)− Eidlec (i)). (4)

Furthermore, we note that switching from the processingrate: fi(t−1) (e.g., the processing rate of VM(i) at the (t−1)-th slot) to: fi(t) (e.g., the processing rate of VM(i) at the t-thslot) entails an energy overhead of Edyn(i, t) [20]. Although,the actual behavior of the function: Edyn(i, t) depends onthe adopted DVFS technique, a quite common model is thefollowing quadratic one [20]:

Edyn(i, t) = ke (fi(t)− fi(t− 1))2 . (5)

In Eq. (5), ke (Joule/(Hz)2) denotes the reconfigurationcost induced by an unit-size rate switching and it is typicallylimited up to few hundreds of mJ ′s per (MHz)2 [20].

Remark 1 - Discrete DVFSBefore proceeding, a remark on the allowed spectrum

of the per-VM processing rates is in order. Specifically,we note that actual VMs are generally instantiated atopphysical computing cores which offer, indeed, only a finiteset: O, f (0) , 0, f (1), . . . , f (Q−1) , fmaxi of Q dis-crete processing rates. Hence, in order to deal with bothcontinuous and discrete reconfigurable physical computinginfrastructures, we borrow the approach formerly devel-oped, for example, in [21], [22]. Specifically, after indicatingby: B ,

η(0) , 0, η(1), . . . , η(Q−1) , 1

the discrete values

of η which correspond to the set O, we build up a VirtualEnergy Consumption curve Φ(η) by resorting to a piece-wise linear interpolation of the allowed Q operating points:(η(j),Φ(η(j))

), j = 0, . . . , (Q− 1)

. Obviously, we may

use it as the true energy consumption curve for the re-source provisioning. Unfortunately, being the virtual curveof continuous type, it is no longer guaranteed that theresulting optimally scheduled processing rates are still dis-crete valued. However, as also explicitly noted in [21], [22],any point: (η∗,Φ(η∗)), with η(j) < η∗ < η(j+1), on thevirtual curve may be actually attained by time-averagingover ∆ secs the corresponding surrounding vertex points:(η(j),Φ(η(j))

)and

(η(j+1),Φ(η(j+1))

). Due to the piece-

wise linear behavior of the virtual curve, as in [21], [22], it isguaranteed that the average energy cost of the discrete DVFSsystem equates that of the corresponding actual one overeach time interval of duration ∆.

2.4 Intra-Fog communication

We assume that the i-th virtual end-to-end connection (e.g.,the i-th virtual link) of the Fog platform of Fig.1b is bidirec-tional and symmetric. Furthermore, in order to guaranteereliable (e.g., loss and error-free) transport of data, as in[23], [24], we assume that the TCP New Reno protocol isimplemented for managing the intra-Fog end-to-end trans-port connections. Hence, under steady-state operating con-ditions, the power Pneti (t) (measured in (Watt)) drained bythe i-th end-to-end connection may be evaluated as in [25]:

Pneti (t) = Ωi(RTTi Ri(t)

)2, i = 1, . . . ,M, (6)

where RTTi is the average round-trip-time of the i-th intra-Fog connection, Ri(t) (IU/slot) is the corresponding com-munication rate at slot t, and Ωi (Watt) is the power con-sumption of the connection when the product: (round-trip-time) by (communication-rate) is unit-valued. Hence, afterindicating by Li(t) (IU/slot) the size of the task assigned toVM(i) at slot t, the corresponding communication energyELAN (i, t) (Joule) wasted by the i-th virtual link is:

ELAN (i, t) = Pneti (t)(Li(t)/Ri(t)) ≡ (2Ωi/(TS−∆))(RTT iLi(t))2

(7)In Eq. (7), the last relationship follows from the fact that, bydesign, the allowed per-slot transmission time is: (TS −∆),so that the corresponding communication rate equates:Ri(t) = (2Li(t)/(TS − ∆)). At this regard, we also notethat, in practical application scenarios, the maximum per-slot aggregate traffic conveyed by the intra-Fog LAN ofFig.1b is generally limited up to an assigned value: Rmax(IU/slot), which depends, in turn, on the employed (typi-cally, Ethernet-type) switch technology [23], [24]. As a con-sequence, the following constraint must hold:

M∑i=1

Ri(t) =

M∑i=1

(2Li(t)/(TS −∆)) ≤ Rmax. (8)

2.5 Rate-vs.-energy tradeoff in TCP/IP vehicular con-nectionsGoal of the output queue of Fig.1b is to effectively copewith the fading and mobility-induced random (possibly,unpredictable) fluctuations of the bandwidth of the corre-sponding single-hop downlink connection (see the RSU-to-vehicle green ray at the bottom of Fig.1b). At this regard,we note that, from an application point of view, there arethree main reasons for resorting to the TCP/IP protocol, inorder to implement this connection. First, it guarantees, bydesign, reliable (e.g., loss and error-free) transport of data[4]. Second, it allows the vehicular client to perform flowcontrol, in order to adaptively match the transmission rateof the serving RSU to the actual (typically, time-varying)operating conditions of the used mobile device. Third, beingconnection-oriented, the TCP protocol allows the exploita-tion of the (aforementioned) location awareness of the Fognode, in order to measure the instantaneous state of eachsustained RSU-to-vehicle connection. Hence, without loss ofgenerality, in the sequel we focus on the case of a single RSU-to-vehicle connection (see Fig.1a). The goal is to save energyby timely adapting the transmission rate r(t) (IU/slot) ofthe supported TCP connection. In fact, r(t) depends on boththe transmit energy EW (t) and the connection state σ(t) asin [17], [25]:

r(t) = σ(t)(EW (t))1/2, (IU/slot) (9)

where the state of the supported vehicular connection at slott is defined as in [17], [25]:

σ(t) , (K0(z(t))1/2)/RTTW (t), t ≥ 1, (10)

so that:EW (t) ≡ EW

(r(t), σ(t)

)= (r(t)/σ(t))2. (11)




In Eq. (10), z(t) is the mobility function of the vehicularclient served by the considered TCP/IP mobile connection.It may be modeled as a time-correlated log-distributedsequence [26]: z(t) , a0 100.1x(t), where a0 ≡ 0.9738,and x(t), t ≥ 1 is a time-correlated stationary zero-meanunit-variance Markov random sequence, whose probabilitydensity function is evenly distributed over the interval[−√

3,√

3] [26]. The (positive) constant K0 in (10) capturesthe performance of the FEC-based error-recovery systemimplemented at the Physical layer of Fig.1b [25], whileRTTW (t) is the round-trip-time of the connection. It isiteratively calculated through the Jacobson’s formula:

RTTW (t) = 0.75RTTW (t− 1) + 0.25 ∆IP (t), (12)

where ∆IP (t) is the instantaneous packet-delay (in multipleof the slot period) measured at the Transport layer of theconsidered connection. Furthermore, in micro-cellular land-mobile applications, the corresponding correlation coeffi-cient: h,E z(t)z(t− 1) that describes the time-correlationof the fading-induced fluctuations of the state of the connec-tion may be evaluated as in [26]:

h = (0.82)vTS/100, (13)

where v (m/s) is the average speed of the served vehicularclient of Fig.1a. Hence, after indicating by ∆max

IP the maxi-mum instantaneous delay of the connection, from Eq. (12), itfollows that: RTTW (t) ≤ ∆max

IP , so that the correspondingstate in (10) is lower bounded as in:

σ(t) ≥ σmin , K0

((a010−0.15)1/2/∆max

IP

). (14)

Overall, from the outset, it follows that the resulting totalcommunication-plus-computing energy: Etot(t) consumedat slot t by the Fog node of Fig.1b is given by the followingfinal expression:

Etot(t) =

( M∑i=1

Ec(i, t) + Edyn(i, t) + ELAN (i, t)

)+ EW (t). (15)

3 RESOURCE RECONFIGURATION

Let S(t) (resp., S(t)) be the set of turned-ON (resp., turned-OFF) VMs at slot t (see Fig.1b). The proposed schedulerinterleaves reconfiguration and consolidation slots. Specif-ically, at each reconfiguration slot, the scheduler does notchange the sets S(t) and S(t) of the turned ON/OFF VMs(and, then, it does not change the corresponding sets ofturned ON/OFF physical servers). However, it performsthe following reconfiguration actions (see Fig.1b): (i) controlof the input and output flows; and, (ii) assignment of thetasks and processing rates to the turned-ON VMs. In theremaining part of this section, we deal with the resourcereconfiguration problem, while resource consolidation isafforded in Section 4.

Hence, after indicating by:

Ltot(t) =∑i∈S(t)

Li(t), (16)

the size of the workload which is drained by the inputqueue of Fig.1b, the resource reconfiguration problem to beafforded at slot t is formulated as follows:

minA

θEσEtot(t) − (1− θ)Eσr(t), (17.1)

subjected to:

EσEW (σ, r)

≤ Eave, (17.2)

0 ≤ λ(t) ≤ a(t), (17.3)0 ≤ s(t+ 1) ≤ NI , (17.4)0 ≤ q(t+ 1) ≤ NO, (17.5)rmin ≤ r(t) ≤ minq(t); rmax, (17.6)0 ≤ fi(t) ≤ fmaxi , i ∈ S(t), (17.7)

0 ≤ Li(t) ≤ Lmaxi , ∆fmaxi , i ∈ S(t), (17.8)(Li(t)/fi(t)) ≤ ∆, i ∈ S(t), (17.9)Ltot(t) ≤ mins(t); (TS −∆)(Rmax/2).

(17.10)

In Eq. (17.1), A , Li(t), fi(t), i ∈ S(t);λ(t), r(t) is the setof variables to be reconfigured at slot t and θ ∈ [0, 1] is an(assigned) weight factor, that tunes the desired tradeoff be-tween average consumed energy and wireless transmissionrate.

Regarding the constraints in Eqs. (17.2)-(17.10), someremarks are in order. The constraint in (17.2) is typicallydictated by spectral compatibility and energy budget issues[2]. It limits up to: Eave (Joule) the per-slot average en-ergy available for transmitting over the single-hop mobileconnection of Fig.1a (see the RSU-to-vehicle green ray atthe bottom of Fig.1a). The constraint in (17.3) accounts forthe (aforementioned) admission control performed by theaccess router of Fig.1b, while Eqs. (17.4) and (17.5) accountfor the finite sizes of the input and output buffers, respec-tively. Eq. (17.6) guarantees that the size r(t) of the workloaddrained from the output queue of Fig.1b does not exceed thecorresponding queue backlog q(t), and also forces r(t) to fallinto a desired range: [rmin, rmax] of transmission rates. Fur-thermore, Eq. (17.7) (resp., Eq. (17.8)) bounds the maximumprocessing rate (resp., workload) of VM(i), while Eq. (17.9)is a hard limit on the corresponding per-slot and per-VMprocessing time. Finally, Eq. (17.10) guarantees that the sizeLtot(t) of the workload drained by the input queue does notexceed the corresponding queue backlog s(t), while beingalso compliant with the constraint in (8) on the maximumaggregate rate of the intra-Fog LAN.

Before proceeding, we explicitly point out that rmax(IU/slot) in (17.6) is the maximum instantaneous rate con-veyed by the mobile connection of Fig.1b, and its value istypically dictated by the WiFi technology actually employedfor implementing the single-hop wireless link of Fig.1a [8].Furthermore, the corresponding rmin (IU/slot) in (17.6)fixes the minimum instantaneous rate to be guaranteed,and, then, it plays the role of a hard QoS requirement, thatis enforced by the supported application. From Eqs. (9), (14)and (17.2), it follows that feasible values of rmin are limitedup to: σmin

√Eave (IU/slot).




3.1 Feasibility and QoS guarantees

Regarding the feasibility of the resource reconfigurationproblem in (17.1)-(17.10), the following formal result holds.

Proposition 1. Feasibility conditions.Under the reported framework, the following four inequalities:

EW (σmin, rmin) ≤ Eave, (18.1)M∑i=1

Lmaxi ≡M∑i=1

fmaxi ∆ ≥ rmin, (18.2)

(Rmax/2)(TS −∆) ≥ NI , (18.3)a(t) ≥ rmin, ∀t ≥ 0, (18.4)

guarantee that the resource reconfiguration problem in (17.1)-(17.10) is feasible.

Hence, since the reported conditions assure that thereconfiguration problem admits the solution, we pass toconsider the corresponding QoS properties. At this regard,we point out that a combined exploitation of the constraintsin (17.4)-(17.6) leads to the following hard bounds on theresulting communication-plus-computing delay and rate-jitter of the solution of the afforded resource reconfigurationproblem.

Proposition 2. Hard QoS guarantees.

Let the feasibility conditions of Proposition 1 be met. Let TQI ,TSI , TQO, and TSO be the random variables that measure therandom queue delay of the input queue, the service time of theinput queue, the queue delay of the output queue and the servicetime of the output queue, respectively. Thus, the following QoSguarantees hold:

a) the random total delay: Ttot , TQI + TSI + TQO + TSO (inmultiple of the slot period) induced by the Fog platform of Fig.1bis limited (in a hard way) up to:

Ttot ≤ ((NI +NO)/rmin) + 2 ; (19.1)

b) the instantaneous jitter affecting the wireless transmit rate isupper bounded as in:

|r(t1)− r(t2)| ≤ (rmax − rmin) , for any t1 and t2 ; (19.2)

c) the corresponding average jitter: σr ,√E

(r(t)− r)2

is

limited as in:σr ≤

√(rmax)2 − (rmin)2 . (19.3)

The reported QoS guarantees lead to the conclusion thatthe NetFC platform of Fig.1b is capable, indeed, to supportreal-time (i.e., delay and jitter-sensitive) applications, whilemeeting the bound in (17.2) on the consumed wirelessenergy.

3.2 Solution of the resource reconfiguration problem

In order to present the solution: f∗i (t), L∗i (t), i ∈ S(t);L∗tot(t), r∗(t) , of the stated resource reconfiguration prob-lem, let us introduce the following dummy position:

ψ(t) , mins(t); maxrmin; r∗(t− 1), (20)

where: r∗(t−1),(1/τ)(∑t−1τ=0 r

∗(τ)), is the average wirelesstransmit rate up to (t−1)-th slot. Hence, the following resultholds for the optimal resource reconfiguration.

Proposition 3. Solution of the resource reconfiguration problem.

Under the feasibility conditions in (18.1)-(18.4), for the solu-tion of the resource reconfiguration problem in (17.1)-(17.10), wehave that:

a) the optimal total workload L∗tot(t) to be processed isscheduled as in:

L∗tot(t) =

0, for q(t) > NO + rmin − rmax,ψ(t), for q(t) ≤ NO + rmin − rmax;

(21.1)

b) the optimal admitted input traffic λ∗(t) is scheduled as in:

λ∗(t) = mina(t); NI − s(t) + L∗tot(t); (21.2)

c) the optimal transmission rate r∗(t) over the mobile connec-tion is scheduled as in:

r∗(t) =

[(1− θ)σ(t)2

2ζ∗(t)

]minrmax; q(t)

rmin

, (21.3)

where ζ∗(t) is the nonnegative Lagrange multiplier of the con-straint in (17.2). It may be adaptively updated according to thefollowing gradient-based projection iteration:

ζ∗(t) = [ζ∗(t− 1)− α (Eave − EW (σ(t), r∗(t− 1)))]+ , (21.4)

where α is a suitable nonnegative step-size;d) for i ∈ S(t), the optimal processing rate f∗i (t) and task

size L∗i (t) of the i-th VM are scheduled as in:

f∗i (t) =

kef∗i (t− 1) + (∆ν∗i (t)/2θ)

ke +(Emaxc (i)−Eidlec (i)

(fmaxi )2

)fmax

i

0

, (21.5)

and

L∗i (t) = min

[(TS −∆)µ∗(t)

4Ωi(RTT i)2 θ

]Lmaxi

0

; [∆f∗i (t)]Lmax

i0

, (21.6)

where the (nonnegative) Lagrange multiplier ν∗i (t) of the i-thconstraint in (17.9) is given by:

ν∗i (t) =

[µ∗(t)− 4 θ Ωi (RTT i)

2L∗i (t)

(TS −∆)

]+

. (21.7)

Finally, the Lagrange multiplier µ∗(t) of the constraint in (16) isthe (unique nonnegative) root of the following algebraic equation:∑

i∈S(t)

L∗i (µ(t)) = L∗tot(t), (21.8)

where L∗i (.) is given by (21.6), with µ∗(t) replaced by the dummyvariable µ(t).

Regarding the tuning of the step-size α in (21.4), weanticipate that, for values of the average speed v in (13)ranging from: 30 (Km/h), to: 150 (Km/h), the followingsetting: α = 10−2, allows the iteration in (21.4) to convergeto its steady-state value within 20-25 slots, with a finalaccuracy within 1%.




4 RESOURCE CONSOLIDATION

Let TCN , t0, t1, . . . indicate the set of the consolidationslot indexes. By design, at each consolidation slot t ∈ TCN ,the scheduler updates the sets S(t) and S(t) of the turnedON/OFF VMs and, then, turns ON/OFF the correspondingphysical servers. Afterwards, on the basis of these updatedsets, the scheduler performs again the resource reconfig-uration actions previously described in Section 3. In ourframework, resource consolidation is performed accordingto the following three considerations.

First, as in [27], the set TCN of the consolidation slots isadaptively built up on the basis of the average utilizationof the currently turned-ON VMs. For this purpose, afterindicating by |S(τ)| the number of VMs that are turned ONat slot τ , the following average utilization coefficient:

U(τ) ,1

|S(τ)|

∑i∈S(τ)

L∗i (τ)

Lmaxi

, (22)

is evaluated at every slot τ ≥ 0. Afterwards, as in [27],the following `-out-m decision rule is applied: the currentslot t is flagged as a consolidation slot if at least ` outof the m most recently measured values of U(.) as thepredicted utilization for slot t fall out of a target intervalI , [ThL, ThU ]. The interval I is the set of the desiredutilization values and its end-points: ThL and ThU are thelower and upper thresholds on the desired average VMutilization.

Second, in our framework, a turned-OFF physical serveris turned ON if at least one of the hosted VMs is turned ON.Furthermore, a turned-OFF VM is turned ON if its process-ing rate is no longer vanishing. Hence, at each consolidationslot t ∈ TCN , the energy consumption of the i-th VM maybe modeled as in [28]:

Ec(i, t) = Eidlec (i)u−1(fi(t))+(fi(t)/fmaxi )2(Emaxc (i)−Eidlec (i)),

(23)where u−1(x) is the unit-size step-function. The first termin (23) accounts for the decrement (resp., increment) of thestatic computing energy that is experienced by turning OFF(resp., ON) the i-th VM.

Third, turning ON a VM induces a latency of TONseconds [27]. Hence, at each consolidation slot t ∈ TCN ,the following constraints must be also enforced:

TON fi(t) +

(Li(t)

fi(t)

)≤ ∆, ∀i ∈ S(t− 1), (24.1)(

Li(t)

fi(t)

)≤ ∆, ∀i ∈ S(t− 1), (24.2)

M∑i=1

Li(t) ≤ min s(t); (TS −∆)(Rmax/2) . (24.3)

The constraint in (24.1) accounts for the (aforementioned)latency, while the constraint in (24.2) enforces vanishingworkload at vanishing processing rate, and, then, it guar-antees that no workload is processed by VM(i) when itis turning OFF. Finally, Eq. (24.3) assures that the overallworkload to be processed by all available VMs does notexceed the maximum feasible one.

On the basis of the above three remarks, the resourceconsolidation problem is formulated as follows: minimize

the objective function in (17.1) (with Ec(i, t) in (15) given by(23)), under the sets of constraints in Eqs. (17.2)-(17.8) andEqs. (24.1)-(24.3). The set A of the variables to be optimizedis the same one of Section 3.

5 ADAPTIVE IMPLEMENTATION AND COMPLEXITYOF THE PROPOSED SCHEDULER

In order to effectively cope with the random (possibly,unpredictable) time fluctuations of the input traffic and stateof the mobile connection of Fig.1a, we proceed to developan adaptive implementation of the scheduler, which com-putes on-the-fly the solutions of both the resource reconfig-uration and consolidation problems. For this purpose, weresort to the primal-dual algorithm recently revised in [29].Generally speaking, it applies gradient-based iterations, inorder to adaptively update both the primal variables (i.e.,the variables to be optimized) and the dual variables (i.e.,the Lagrange multipliers associated to the constraints) ofthe considered constrained optimization problem. In ourframework, the proposed resource scheduler runs primal-dual iterations: (i) at the beginning of each reconfigurationslot, in order to compute the solution µ∗(t) of the (nonlinear)algebraic equation in (21.8); and, (ii) at the beginning of eachconsolidation slot, in order to numerically compute the setA∗ of the optimal consolidated resources2.

For this purpose, after indicating by x(t) ∈ A a (scalar)resource variable to be updated at slot t and by L(t) the La-grangian function of the considered optimization problem,the n-th primal-dual iteration assumes the following generalform [29]:

x(n)(t) =[x(n−1)(t)− g(n−1)

x (t)∇xL(n−1)(t)]

+, n ≥ 1, (25)

where n = 1, 2, . . ., is a discrete iteration index whichruns at the beginning of the considered slot t. Furthermore,according to [29], in Eq. (25), we have that: (i)∇xL(n−1)(t) isthe scalar derivative of the considered Lagrangian functionL(t), which is done with respect to the variable x and isevaluated at the resource configuration setting attained atthe previous iteration (n − 1); (ii) g(n−1)x (t) is a suitable n-varying (e.g., adaptive) step-size. According to [29], it is setas in: g(n−1)x (t) = 0.5(x(n−1)(t))2; and, (iii) the projectionin (25) accounts for the nonnegative values of all involvedvariables.

Just as an illustrative example, at the beginning of eachreconfiguration slot t, the root µ∗(t) of the (nonlinear) al-gebraic equation in (21.8) is computed by the scheduler byrunning the following iterations in the n-index:

µ(n)(t) =

[µ(n−1)(t)− 0.5

(µ(n−1)(t)

)2×

((∑i∈S(t)

L(n−1)i (t))− L∗tot(t)

)]+

, n ≥ 1,

(26)

2. Consider that, due to the presence of the step-size function in (23),the (previously stated) resource consolidation problem resists, indeed,closed-form solution.




where L∗tot(t) is given by Eq. (21.1), and L(n−1)i (t) is given

by Eq. (21.6), with µ∗(t) and f∗i (t) replaced by µ(n−1)(t) andf(n−1)i (t), respectively.

Algorithm 1 reports a pseudo-code for the overall adap-tive implementation of the proposed resource scheduler.

Algorithm 1 A pseudo-code of the proposed adaptive re-source scheduler

1: for t ≥ 1 do2: Apply the `-out-m decision rule and flag t as reconfigu-

ration or consolidation slot;3: if t is a consolidation slot, then4: Perform resource consolidation by running the itera-

tions in Eq. (25);5: Update the sets S(t) and S(t);6: end if7: if t is a reconfiguration slot, then8: Evaluate Eqs. (21.1)-(21.4);9: Compute µ∗(t) through Eq. (26);

10: Evaluate Eqs. (21.5)-(21.7);11: end if12: Update s(t + 1) and q(t + 1) through Eqs. (1) and (2),

respectively;13: Compute U(t) through Eq. (22), with τ replaced by t;14: end for

Regarding the resulting implementation complexity, twomain remarks are in order.

First, since the iterations in (25) are carried out at thebeginning of each slot, the iteration index n must run fasterthan the slot time TS . Hence, the actual time duration: TI(s)of each n-indexed iteration must be so small to allow theiterations in (25) to converge within a limited fraction of theslot time. At this regard, the formal results of Theorem 3.3 of[29] assure the asymptotic convergence of the iterations in(25). Furthermore, we have numerically ascertained that, atleast in the tests of Section 6, about 10-15 n-indexed stepssuffice, in order to attain the convergence of (25) with a finalaccuracy within 1%. Hence, in the carried out tests, we pose:TI = TS/250.

Second, since each VM updates the size of the task to beprocessed and the corresponding processing rate, the totalnumber of variables to be updated at each slot scales uplinearly with the number M of the available VMs. Therefore,the total per-slot implementation complexity of the schedulerof Algorithm 1 is of the order of O(M). Furthermore, sincethe pursued primal-dual approach allows the distributedimplementation of the solutions of the afforded reconfigura-tion and consolidation problems [29], each VM may locallyupdate its task size and processing rate through local mea-surements. As a consequence, the per-VM implementationcomplexity of the proposed scheduler does not depend onthe number M of the available VMs, that is, it is of the orderof O(1).

Overall, the above remarks lead to the conclusion thatthe implementation of the proposed scheduler is adaptiveand distributed, while the resulting per-slot execution timeis limited up to about 15 iteration periods, regardless of the(possibly, large) size the considered Fog node.

5.1 Implementation aspects of the intra-Fog Virtualiza-tion layerMain tasks of the intra-Fog Virtualization layer of Fig.1b aretwofold. First, it must guarantee that the demands for theper-VM processing rates f∗i and per-link communicationrates R∗i done by the Middleware layer are mappedonto adequate CPU cycles and channel bandwidths at theunderlying Physical layer. Second, it must profile at runtimethe maximum and idle energies actually consumed by therunning VMs.

Regarding the first task, we note that QoS mappingof the demands for processing and communication ratesmay be actually performed by equipping the Virtualizationlayer of Fig.1b with a per-VM queue system that imple-ments (in software) the so-called mClock and SecondNetschedulers [30], [31]. Interestingly, Table 1 of [30] pointsout that the mClock scheduler is capable to guarantee hard(e.g., absolute) reservation of CPU cycles on a per-VMbasis by adaptively managing the computing power of theunderlying DVFS-enabled physical cores (see Algorithm 1of [30] for the code of the mClock scheduler). Likewise, theSecondNet network scheduler in [31] provides bandwidth-guaranteed virtualized Ethernet-type contention-free linksatop any set of TCP-based (possibly, multi-hop) end-to-end connections. For this purpose, SecondNet implementsa suitable Port-Switching based Source Routing algorithm,that may directly work at the Middleware layer of Fig.1b(see Section 3 of [31]).

Regarding the (aforementioned) second task, the max-imum and idle energies consumed by each VM may beprofiled at runtime by resorting to the so-called Joulemetertool [32]. It is a software tool that operates at the Virtual-ization layer of Fig.1b and is capable to provide the sameenergy metering functionality for VMs as currently exists inhardware for physical servers. For this purpose, Joulemeteruses hypervisor-observable hardware power states to trackthe VM energy usage on each hardware component (seeSection 5 of [32] for a detailed description of the Joulemetertool). Interestingly enough, the field trials reported in [32]support the general conclusion that the maximum and idleenergies wasted by a running VM are proportional to thecorresponding maximum and idle energies wasted by thehosting physical server, with the scaling factor being givenby the fraction of the physical cores actually used by theVM.

6 TEST RESULTS AND PERFORMANCE COMPAR-ISONS

This section presents the tested energy performance ofthe proposed scheduler under a set of synthetic and real-world input arrival traces, and, then, compares it with thecorresponding ones of the DVFS-based scheduler in [25],Lyapunov-based scheduler in [11], static scheduler in [22]and NetDC scheduler in [10].

6.1 Simulated Fog setupThe simulations are carried out by exploiting the numericalsoftware of the MATLAB platform. They emulate the energyconsumptions of 5 homogeneous quad-core Dell PowerEdge




TABLE 1: Default values of the main simulated parameters.

Parameters

∆ = 0.8 (s) TS = 1 (s) NS = 5NI = NO = 50 (Mbit) Emaxc = 7 (Joule) Mmax = 4ke ∈ 0.5, 0.05 (J/(MHz)2) Eidlec = 4 (Joule) βvc = 100/130fmax = 10 (Mbit/slot) α = 10−2 Nvc = 40/3rmax = amax = 8 (Mbit/slot) Ω = 0.5 (Watt) h = 0.95rmin = 1.5 (Mbit/slot) TON = 10−3 (s) σmin = 375× 104

θ = 0.5 v = 100 (Km/h) Eave = 0.75 (Joule)Rmax = 1000 (Mbit/slot) vmax = 130 (Km/h) Ra = 250 (m)

servers, which are equipped with 3.06 GHz Intel Xeon CPUand 4GB of RAM. All the emulated servers are connectedthrough commodity Giga Ethernet Network Interface Cards(NICs) and host up to 4 VMs (see Table 1).

In order to measure the actual delay performance of theproposed adaptive scheduler, we resort to a per-IU averagetotal delay performance index (measured in multiple of theslot period), formally defined as in:

T∗tot ,

limt→∞

1t

(∑tτ=1 s

∗(τ))

limt→∞

1t

(∑tτ=1 λ

∗(τ)) +

limt→∞

1t

(∑tτ=1 q

∗(τ))

limt→∞

1t

(∑tτ=1 L

∗tot(τ)

) + 2.

(27)

In Eq. (27), the first ratio is the average delay: T∗QI = s∗/λ

∗

induced by the input buffer of Fig.1b, the second ratio isthe average delay: T

∗QO = q∗/L

∗tot of the output buffer,

and: TSI + TSO = 2, are the overall corresponding averageservice times.

6.2 Simulated vehicular setupA three-line highway vehicular environment is simulated,where Nvc vehicles proceed without performing abruptU-turns. They move along the highway of Fig.1a withspeeds uniformly ranging from: vmin to: vmax, so that:v = (vmax − vmin)/2, is the resulting per-vehicle averagespeed. Uniformly spaced RSUs serve the VCs and the simu-lated RSU-to-vehicle radio propagation accounts for free-space loss with exponent 4 and log-normal fading withstandard deviation 8 [8], [19].

Since no any assumption has been introduced about thestatistical behavior of the state in (10) of the consideredRSU-to-vehicle mobile connection, the optimality of theproposed scheduler of Algorithm 1 is retained, regardlessof the mobility patterns actually followed by the vehicles.However, the performed simulations refer to a highwayenvironment, and, then, it is reasonable to assume that: (i)at each slot time, each vehicle moves to the next spatialcluster or remains in the currently occupied cluster; and,(ii) being the vehicle speeds evenly distributed and subjectto random spatial fluctuations, the sojourn times of thevehicles in each cluster are geometrically distributed r.v.’s.These considerations lead, in turn, to adopt the so-calledMarkovian random walk model with random positioning,in order to simulate the vehicle mobility. According to thismodel, per-vehicle transitions among adjacent spatial clus-ters are represented by a Markovian spatial chain, in whicheach state corresponds to one spatial cluster with radiusRa (see Fig.1a). Furthermore, the per-vehicle inter-cluster

transition probability βvc and per-cluster average numberof served vehicles Nvc may be numerically evaluated as in[8], [19]:

βvc = v/vmax, and, Nvc = Ajam (1− βvc) (π/2)(Ra)2,(28)

where Ajam (vehicle/m2) is the maximum spatial densityof vehicles under traffic congestion phenomena.

6.3 Performance resultsIn the first test scenario, we run the proposed schedulerand evaluate the resulting average total consumed energyE∗tot under a synthetic (e.g., independent and identically dis-tributed) input arrival random sequence. Fig.2 reports E∗totfor various values of M , and its examination points out that:(i) E∗tot decreases for increasing M ; and, (ii) E∗tot increasesfor increasing values of the average wireless transmissionrate r∗. The second set of simulations in Fig.3 presents

5.6 5.8 6 6.2 6.4 6.6 6.8 7 7.2 7.4 7.610

12

14

16

18

20

22

24

r∗ (Mbit/slot)

E∗ tot(J

oule)

M = 20M = 15M = 10M = 5M = 2

Fig. 2: E∗tot-vs.-r∗ under synthetic input traffic.

the total average consumed energies E∗tot at M = 10, 15and 20 for various values of Eave. Fig.3 points out that, byincreasing the number of VMs, E∗tot still decreases. However,the interesting point is that, even for increasing values ofthe available Eave, the admission control performed by theproposed scheduler attempts to reduce the resulting E∗totto the minimum (see the quasi-flat behavior of the plots ofFig.3 at medium/large values of Eave).

6.3.1 Fraction of accepted requests-vs.-induced delayThe plots of Figs.4 and 5 report the (numerically evaluated)average fraction: F

∗A , lim

t→∞(1/t)

(∑tτ=1 λ

∗(τ)/a(τ))

, of




1 1.5 2 2.5 3 3.5 40.5

1

1.5

2

2.5

3

Eave (Joule)

E∗ tot(J

oule)

M = 20M = 15M = 10

Fig. 3: E∗tot-vs.-Eave under synthetic input traffic.

the input traffic that is actually admitted by the proposedscheduler and the corresponding average total delay T

∗tot in

(27). These plots refer to the case of ke = 0.05 (J/(MHz)2),M = 10 and h = 0.95. They allow us to gain insight aboutthe tradeoff between admission capability and induced de-lay of the proposed scheduler. Specifically, two main con-clusions may be drawn from the plots of Figs.4 and 5. First,the average fraction F

∗A of the admitted traffic increases

for increasing values of the storage capacity NI = NO ofthe input/output buffers, and/or increasing values of Eave(see Figs.4 and 5). Second, according to the Little’ law, the

5 10 20 30 40 500.97

0.98

0.99

1

NI = NO (Mbit)

F∗ A

4

6

8

10

12

T∗ tot

F∗

A

T∗

tot

Fig. 4: F ∗A and T∗tot versus the capacity NI = NO of the

input/output queues of Fig.1b for the application scenario ofSection 6.3.1 at M = 10 and Eave = 0.5 (Joule).

corresponding average total delay suffered by the admittedtraffic increases for increasing values of NI = NO (seeFig.4), while it (quickly) decreases for increasing values ofEave (see Fig.5).

6.4 Mobility effects

Goal of this set of numerical tests is to acquire insight aboutthe effects induced by the vehicle mobility on the resultingaverage total delay T

∗tot in (27), when the target performance

is assigned in terms of average wireless transmission rate r∗

0 0.1 0.2 0.3 0.4 0.5 0.60.7

0.8

0.9

1

Eave (Joule)

F∗ A

0 0.1 0.2 0.3 0.4 0.5 0.6

5

10

15

20

T∗ tot

F∗

A

T∗

tot

Fig. 5: F ∗A and T∗tot versus Eave for the application scenario of

Section 6.3.1 at M = 10 and NI = NO = 20 (Mbit).

and average total consumed energy E∗tot. For this purpose,the application scenario of Table 1 has been still consideredat M = 10 and ke = 0.05 (J/(MHz)2). The correspondingnumerically evaluated performance is reported in Table 2.

An examination of these results leads to two mainconclusions. First, the minimum size NI = NO of theinput/output buffers required to meet the target valuesr∗ and E∗tot (quickly) increases for increasing values of h(e.g., for decreasing values of the average vehicle speed v).This is due to the fact that larger values of h increase thecoherence time of the sequence σ(t) of the connectionstate (see Eqs. (10) and (13)), so that σ(t) tends to be”less ergodic”. Thus, in order to offset this effect withoutpenalizing the energy performance, the scheduler requiresmore buffering capacity, that, in turn, penalizes the resultingdelay performance (see the last column of Table 2). Second,at least in the simulated scenarios, the requested buffer sizeNI = NO tends to scale as: O (−1/log(h)) for h ≥ 0.87.

6.5 Performance tests and comparisons under real-world time-correlated input traffic

In order to test the consolidation capability of the proposedscheduler when the arrival sequence a(t) exhibits time-correlation, we have considered the arrival trace of Fig.6. Itreports the real-world trace in Fig.2 of [28] and refers tothe I/O workload taken from four RAID volumes of anenterprise storage cluster in Microsoft (see Section IV.A of[28]). In order to maintain the peak workload fixed at 8(Mbit/slot), each arrival of Fig.6 carries out a traffic of 0.266(Mbit).

6.5.1 Adaptive consolidation performanceThe numerical trials of this sub-section aim at testing boththe actual convergence to the steady-state and correspond-ing convergence speed of the proposed iterative consolida-tion algorithm of Section 5. For this purpose, we have eval-uated and compared its performance against the optimalone. By design, at each consolidation instant t, the optimalconsolidation scheduler computes the optimal sub-set of theconsolidated VMs by performing an exhaustive search over




TABLE 2: Buffer size–vs.–delay tradeoff at target r∗ and E∗tot for various values of the mobility-induced correlationcoefficient h. PMR:= Peak-to-Mean Ratio of the input traffic.

Workload Types h r∗ (Mbit/slot) E∗tot (Joule) NI = NO (Mbit) T∗tot

Synthetic workload at a = 7.5 (Mbit/s) and PMR= 1.250.870 6 36 4 4.30.967 6 36 7 6.20.997 6 36 24 7.3

Real-world workload of [28]0.870 7 42 10 40.20.967 7 42 11 41.90.997 7 42 15 47.6

0 20 40 60 80 100 120

5

10

15

20

t

Numberofper−

slotarrivals

Fig. 6: Sampled trace of an I/O arrival sequence from anenterprise cluster in Microsoft [28]. The corresponding PMRand time-correlation coefficient are 2.49 and 0.85, respectively.

the set of all ON/OFF VM configurations. The performancecomparisons have been carried out by implementing threedifferent event-driven policies, in which consolidation atslot t is triggered when at least ` = 3 out of the lastm = 5 values of the utilization factor in (22) fall out ofthe following intervals: (i) I(1) = [0.01, 0.99] (Case 1); (ii)I(2) = [0.2, 0.8] (Case 2); and, (iii) I(3) = [0.3, 0.7] (Case3). The corresponding measured energy loss (in percent)suffered by the proposed iterative consolidation algorithmof Section 5 against the optimal one is reported in Table 3 forthe cases of ke = 5 (mJ/(MHz)2) (e.g., case of low scalingcost) and ke = 500 (mJ/(MHz)2) (e.g., case of high scalingcost).

TABLE 3: Average energy loss (in percent) of the proposedconsolidation algorithm against the optimal one. The appli-cation scenario of Section 6.5.1 is considered at M = 10 andEave = 0.5 (Joule).

Case 1 Case 2 Case 3

ke = 500 (mJ/(MHz)2) 1.2% 0.8% 0.5%ke = 5 (mJ/(MHz)2) 0.87% 0.7% 0.3%

An examination of the results of Table 3 leads to threemain conclusions. First, the actual rate of occurrence ofconsolidation events increases by passing from Case 1 toCase 3. This induces, in turn, a reduction in the performancepenalty suffered by the proposed consolidation algorithm,whose chances of convergence to the optimal consolidatedconfigurations increase, indeed, for increasing rate of the

occurrence of the consolidation events. Second, the energypenalty suffered by the proposed consolidation algorithmis nearly negligible and, at least in the carried out tests, itremains, indeed, limited up to 1.2%, even at high values ofke. Third, since the rate of the occurrence of VM underuti-lization phenomena increases for increasing values of ke andtoo frequent occurrence of VM underutilization phenomenatends to increase the resource reconfiguration actions to becarried out at the consolidation instants, the energy penaltysuffered by the proposed consolidation algorithm tends tosomewhat increases for growing values of ke.

The plots of Fig.7 report the corresponding (numericallyevaluated) time-behaviors of the average utilization U(.) in(22) at ke = 5 (mJ/(MHz)2). A comparison of the plots of

10 20 30 40 50 60 70 80 90 1000

0.2

0.4

0.6

0.8

1

t

U(t)

No cons.Case 1Case 2Case 3

Fig. 7: Time-behaviors of the numerically evaluated averageutilization U(.) under the application scenario of Section 6.5.1at M = 10 and Eave = 0.5 (Joule). The horizontal dotted linesmark the upper/lower utilization thresholds of Cases 1, 2 and3.

Fig.7 with the arrival trace of Fig.6 confirms that the pro-posed consolidation algorithm is capable to track the abruptchanges of the input arrival sequence by scaling up/downthe number of turned ON VMs, while guaranteeing thatthe average utilization in (22) falls into the desired targetintervals.

6.5.2 Energy performance comparisonsIn this sub-section, we compare the energy performance ofthe proposed adaptive scheduler against the correspondingones of some state-of-the-art schedulers, e.g., the GRADient-based Iterative Scheduler (GRADIS) in [25], the Lyapunov-based scheduler (Lyap) in [11], the static scheduler in [22],and the NetDC scheduler in [10]. Goal of this last set of




numerical tests is to (numerically) evaluate and comparethe reductions in the overall average energy consumption:Ec+Edyn+ELAN , induced by the adaptive resource scalingand VM consolidation performed by the proposed sched-uler. This is motivated by the fact that current data cen-ters usually rely on static resource provisioning, so that,by design, a fixed number of VMs constantly runs at themaximum processing rate fmax, in order to provide thecomputing capacity needed to satisfy the peak input trafficamax [22]. Although the resulting static scheduler does notsuffer by the energy costs arising from the adaptive resourcescaling and consolidation, it induces overbooking of theused resources. Hence, the average energy consumption:E(STS) , E(STS)c + E(STS)LAN , of the STatic Scheduler (STS)provides a basic performance benchmark [22].

In order to carry out fair energy comparisons, in thetests of this sub-section, we have implemented the proposedscheduler by directly setting: a(t) = λ(t) = r(t), togetherwith θ = 1 and σ(t) =∞. In so doing, the objective functionin (17.1) to be minimized by the proposed scheduler reducesto the summation of the computing and communication en-ergies consumed by the Fog infrastructure of Fig.1b. Hence,on the basis of the carried out numerical tests, we haveexperienced that: (i) the average energy reduction of the pro-posed scheduler over the STS is of about 47% at ke = 500(mJ/(MHz)2), while it increases up to about 56% at ke = 5(mJ/(MHz)2); and, (ii) at least in the carried numericaltests, a fraction of about 35%− 40% of these energy savingsis provided by the performed consolidation actions. Theseresults confirm that the proposed consolidation algorithmprovides an effective means for leveraging the sudden time-variations exhibited by the input traffic.

Finally, in order to evaluate the energy reduction in-duced by scaling up/down the intra-Fog processing andcommunication rates, we have also tested the schedulersproposed in [10], [11], [25]. Table 4 reports the obtainedenergy consumptions averaged over the corresponding av-erage numbers of actually turned-ON VMs. According toTable 4, the average energy savings of the proposed sched-uler over the NetDC scheduler in [10], the Lyapunov-basedscheduler in [11] and the GRADIS scheduler in [25] ap-proach about 60%, 20% and 33% at medium/large valuesof M . This confirms that the proposed scheduler is capableto effectively adapt to the time-varying fluctuations of theinput traffic.

7 CONCLUSIONS AND FUTURE WORK

In this paper, we developed a cognitive computing-inspiredscheduler for the joint adaptive tuning of the: (i) inputtraffic; (ii) output traffic; and, (iii) resource reconfigura-tion and consolidation, of virtualized Fog platforms thatsupport vehicular TCP/IP connections. The overall goal isthe energy-saving support of QoS-demanding computing-intensive delay-sensitive I2V services. Remarkable featuresof the developed joint scheduler are that: (i) its implementa-tion is distributed and adaptive; (ii) it minimizes the energyconsumed by the overall NetFC platform for computing,intra-Fog communication and wireless transmission overthe vehicular TCP/IP connection; and, (iii) despite the un-predictable time-varying nature of the states of underlying

TCP/IP vehicular connections, it is capable to provide hardQoS guarantees, in terms of minimum per-connection in-stantaneous wireless transmission rate, maximum instan-taneous rate-jitter and maximum queuing-plus-computingdelay. Actual performance of the proposed scheduler hasbeen numerically tested under both synthetic and real-world input traffic, various mobility conditions and settingsof the networked Fog platform. This work can be extendedin some directions of potential interest. Just as an example,closed networked multi-tier computing infrastructures maybe considered for the support of delay-tolerant session-based services [33]. Since, in this application scenario, intra-slot traffic arrivals could be allowed, live migration of VMscould be also forecast for attaining additional energy sav-ings. Optimizing live migration of VMs without resortingto exhaustive NP-hard numerical approaches could be aninteresting topic for future work.

REFERENCES

[1] M. Whaiduzzaman, M. Sookhak, A. Gani, and R. Buyya, “Asurvey on vehicular cloud computing,” Journal of Network andComputer Applications, vol. 40, pp. 325–344, 2014.

[2] G. Karagiannis, O. Altintas, E. Ekici, G. Heijenk, B. Jarupan, K. Lin,and T. Weil, “Vehicular networking: A survey and tutorial onrequirements, architectures, challenges, standards and solutions,”IEEE Communications Surveys & Tutorials, vol. 13, no. 4, pp. 584–616, 2011.

[3] S. Abolfazli, Z. Sanaei, and I. Bojanova, “Internet of cores,” ITProfessional, vol. 17, no. 3, pp. 5–9, 2015.

[4] E. Baccarelli, N. Cordeschi, A. Mei, M. Panella, M. Shojafar, andJ. Stefa, “Energy-efficient dynamic traffic offloading and recon-figuration of networked data centers for big data stream mobilecomputing: review, challenges, and a case study,” IEEE Network,vol. 30, no. 2, pp. 54–61, 2016.

[5] H. T. Mouftah and B. Kantarci, Communication Infrastructures forCloud Computing. IGI Global, 2014.

[6] M. Hajibaba and S. Gorgin, “A review on modern distributedcomputing paradigms: Cloud computing, jungle computing andfog computing,” Journal of Computing and Information Technology,vol. 22, no. 2, pp. 69–84, 2014.

[7] R. Suryawanshi and G. Mandlik, “Focusing on mobile users atthe edge of internet of things using fog computing,” InternationalJournal of Scientific Engineering and Technology Research, vol. 4,no. 17, pp. 3225–3231, 2015.

[8] N. Cordeschi, T. Patriarca, and E. Baccarelli, “Stochastic trafficengineering for real-time applications over wireless networks,”Journal of Network and Computer Applications, vol. 35, no. 2, pp.681–694, 2012.

[9] C.-M. Wu, R.-S. Chang, and H.-Y. Chan, “A green energy-efficientscheduling algorithm using the dvfs technique for cloud data-centers,” Future Generation Computer Systems, vol. 37, pp. 141–147,2014.

[10] N. Cordeschi, M. Shojafar, and E. Baccarelli, “Energy-saving self-configuring networked data centers,” Computer Networks, vol. 57,no. 17, pp. 3479–3491, 2013.

[11] R. Urgaonkar, U. C. Kozat, K. Igarashi, and M. J. Neely, “Dynamicresource allocation and power management in virtualized datacenters,” in Proc. of IEEE NOMS, 2010, pp. 479–486.

[12] P. Balakrishnan and C.-K. Tham, “Energy-efficient mapping andscheduling of task interaction graphs for code offloading in mobilecloud computing,” in Proc. of 6th IEEE/ACM Int. Conf. on Utility andCloud Computing, 2013, pp. 34–41.

[13] J. Song, Y. Cui, M. Li, J. Qiu, and R. Buyya, “Energy-traffic tradeoffcooperative offloading for mobile cloud computing,” in Proc. of22nd IEEE Int. Symp. of Quality of Service, 2014, pp. 284–289.

[14] A. P. Miettinen and J. K. Nurminen, “Energy efficiency of mobileclients in cloud computing,” in Proc. of 2nd USENIX conference onHot topics in cloud computing, 2010, pp. 4–11.

[15] N. Balasubramanian, A. Balasubramanian, and A. Venkataramani,“Energy consumption in mobile phones: a measurement studyand implications for network applications,” in Proc. ACM IMC’09,2009, pp. 280–293.




TABLE 4: Average computing-plus-communication energy consumptions under the application scenario of Section 6.5.2.

Per-VM average energy consumption (Joule)M Proposed Scheduler NetDC [10] Lyap [11] GRADIS [25]

6 11.5 16.7 15.7 12.78 8.2 15.2 14.1 9.810 6.8 13 13.8 7.412 5.5 11.5 13.7 6.514 5.1 9.7 13.5 6.016 4.9 8.2 13.3 5.618 4.2 7.1 13.2 5.420 3.8 6.8 13.0 5.2

[16] J. Huang, F. Qian, A. Gerber, Z. M. Mao, S. Sen, and O. Spatscheck,“A close examination of performance and power characteristics of4g lte networks,” in Proc. of ACM MobiSys’12, 2012, pp. 225–238.

[17] M. Shojafar, “Saving energy in qos networked data centers,” Ph.D.dissertation, http://mshojafar.com/PhDThesis.pdf, 2015.

[18] A. Festag, G. Noecker, M. Strassberger, A. Lubke, B. Bochow,M. Torrent-Moreno, S. Schnaufer, R. Eigner, C. Catrinescu, andJ. Kunisch, “NoW–Network on Wheels: Project objectives, tech-nology and achievements,” Proc. of WIT’08, pp. 211–216, 2008.

[19] N. Cordeschi, D. Amendola, M. Shojafar, and E. Baccarelli, “Dis-tributed and adaptive resource management in cloud-assisted cog-nitive radio vehicular networks with hard reliability guarantees,”Vehicular Communications, vol. 2, no. 1, pp. 1–12, 2015.

[20] M. Portnoy, Virtualization essentials. John Wiley & Sons, 2012.[21] M. J. Neely, E. Modiano, and C. E. Rohrs, “Power allocation

and routing in multibeam satellites with time-varying channels,”IEEE/ACM Trans. on Networking, vol. 11, no. 1, pp. 138–152, 2003.

[22] K. Li, “Performance analysis of power-aware task schedulingalgorithms on multiprocessor computers with dynamic voltageand speed,” IEEE Trans. on Parallel and Distributed Systems, vol. 19,no. 11, pp. 1484–1497, 2008.

[23] A. Mishra, R. Jain, and A. Durresi, “Cloud computing: networkingand communication challenges,” IEEE Communications Magazine,vol. 50, no. 9, pp. 24–25, 2012.

[24] S. Azodolmolky, P. Wieder, and R. Yahyapour, “Cloud computingnetworking: challenges and opportunities for innovations,” IEEECommunications Magazine, vol. 51, no. 7, pp. 54–62, 2013.

[25] M. Shojafar, N. Cordeschi, D. Amendola, and E. Baccarelli,“Energy-saving adaptive computing and traffic engineering forreal-time-service data centers,” in Proc. of IEEE ICCW’15, 2015, pp.1800–1806.

[26] M. Gudmundson, “Correlation model for shadow fading in mo-bile radio systems,” IET Electronics Letters, vol. 27, no. 23, pp. 2145–2146, 1991.

[27] T. Wood, P. J. Shenoy, A. Venkataramani, and M. S. Yousif, “Black-box and gray-box strategies for virtual machine migration.” inProc. of USENIX NSDI’07, 2007, pp. 229–242.

[28] Z. Zhou, F. Liu, Y. Xu, R. Zou, H. Xu, J. Lui, and H. Jin, “Carbon-aware load balancing for geo-distributed cloud services,” in Proc.of IEEE 21st MASCOTS’13, 2013, pp. 232–241.

[29] Q. Peng, A. Walid, J.-S. Hwang, and S. H. Low, “Multipathtcp: Analysis, design, and implementation,” IEEE/ACM Trans. onNetworking, vol. 24, no. 1, pp. 596–609, 2016.

[30] A. Gulati, A. Merchant, and P. J. Varman, “mclock: handlingthroughput variability for hypervisor io scheduling,” in Proc. of9th USENIX Conf. on Operating Systems Design and Implementation,2010, pp. 1–7.

[31] C. Guo, G. Lu, H. J. Wang, S. Yang, C. Kong, P. Sun, W. Wu,and Y. Zhang, “Secondnet: a data center network virtualizationarchitecture with bandwidth guarantees,” in Proc. of ACM Co-Next,2010, pp. 15–26.

[32] A. Kansal, F. Zhao, J. Liu, N. Kothari, and A. A. Bhattacharya,“Virtual machine power metering and provisioning,” in Proc. ofACM Symp. on Cloud Computing, 2010, pp. 39–50.

[33] S. Abolfazli, Z. Sanaei, M. Alizadeh, A. Gani, and F. Xia, “Anexperimental analysis on cloud-based mobile augmentation inmobile cloud computing,” IEEE Trans. on Consumer Electronics,vol. 60, no. 1, pp. 146–154, 2014.

Mohammad Shojafar is currently last yearPh.D. student in information and communicationengineering at DIET Dept. of the Sapienza Uni-versity of Rome. He Received his Msc in soft-ware engineering in Qazvin Islamic Azad Univer-sity, Qazvin, Iran in 2010. Also, he received hisBsc. in computer engineering-software major inIran university science and technology, Tehran,Iran in 2006. His current research focuses ondistributed computing, wireless communications,and mathematical/AI optimization. Mohammad

was a programmer and analyzer in exploration directorate section atNational Iranian Oil Company (NIOC) and Tidewater ltd. Co. in Iran from2008-2013, respectively.

Nicola Cordeschi is a Researcher in Infor-mation and Communication Engineering at theDIET Dept. of Sapienza University of Rome. Hereceived the five-year Laurea degree (summacum laude) in Communication Engineering andthe Ph.D. degree in Information and Communi-cation Engineering from the Sapienza Universityof Rome in 2004 and 2008, respectively. His re-search activity focuses on data communication,distributed computing and design/optimization ofQoS transmission systems for wireless/ wired

multimedia applications.

Enzo Baccarelli is Full Professor in Informationand Communication Engineering at the DIETDept. of Sapienza University of Rome. He re-ceived the Laurea degree in Electronic Engi-neering and the Ph.D. degree in CommunicationTheory and Systems, both from Sapienza Uni-versity. In 1995, he received the Post-Doctoratedegree in Information Theory and Applicationsfrom the INFOCOM Dept. of Sapienza Universityof Rome. His current research focuses on datanetworks, distributed computing and adaptive

optimization.

Documents

IEEE TRANSACTION ON CLOUD COMPUTING 1 Energy-efﬁcient ... · Transactions on Cloud Computing IEEE TRANSACTION ON CLOUD COMPUTING 1 Energy-efﬁcient Adaptive Resource Management