
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2019.2930242, IEEE Access


Throughput Maximization for Multicore Energy-Harvesting Systems Suffering Both Transient and Permanent Faults

JUNLONG ZHOU1, (Member, IEEE), PEIJIN CONG2, JIN SUN1, XIUMIN ZHOU1, TONGQUAN WEI2, (Senior Member, IEEE), MINGSONG CHEN3, (Senior Member, IEEE)

1School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China
2School of Computer Science and Technology, East China Normal University, Shanghai, 200062, China
3Shanghai Key Lab of Trustworthy Computing, East China Normal University, Shanghai, 200062, China

Corresponding author: Junlong Zhou (e-mail: [email protected]).

This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 61802185, 61872185, and 61872147), the Natural Science Foundation of Jiangsu Province (Grant No. BK20180470), the National Key R&D Program of China (Grant No. 2018YFB2101301), the Fundamental Research Funds for the Central Universities (Grant No. 30919011233), and the Shanghai Municipal Natural Science Foundation (Grant No. 16ZR1409000).

ABSTRACT Harvesting renewable energy (e.g., solar energy) from the ambient environment to achieve near-perpetual operation of embedded systems is receiving increasing attention from academia and industry. However, an immediate problem that comes with the use of renewable energy is the degraded system throughput caused by the intermittent nature of renewable generation. In addition, energy-harvesting systems (EHSs) deployed in harsh environments are more vulnerable to transient and permanent faults. This paper aims at scheduling dependent tasks on a multicore platform for throughput maximization under energy and reliability constraints. The target of this work is to design algorithms that optimize system throughput under energy, reliability, and task precedence constraints. To achieve this goal, we propose a mixed-integer linear programming (MILP) approach for allocating and scheduling precedence-constrained tasks on the multicore to maximize the throughput of the EHS. However, the MILP approach may need exponential time to find the optimal solution. To overcome this difficulty, we propose a polynomial-time heuristic algorithm to solve the MILP-based throughput maximization problem. In this heuristic algorithm, the uncertainty in energy sources is considered, and the allocation and scheduling of tasks are determined based on the system energy state. Extensive simulation experiments are carried out to validate our MILP approach and throughput-aware heuristic algorithm. Simulation results show that the MILP approach improves system throughput by up to 92.9% when compared to a baseline method, and the proposed heuristic improves system throughput by up to 32.1% on average when compared to four representative existing approaches.

INDEX TERMS Energy-harvesting systems, lifetime reliability, soft-error reliability, throughput.

I. INTRODUCTION

POWER and energy are both critical design concerns of embedded applications, especially for battery-powered systems that are deployed in harsh environments [1]–[3]. Humans either have no access to these systems or have difficulty replacing their batteries, since the systems are generally deployed in extreme environments. Therefore, it is desirable for embedded systems deployed in such environments to harvest energy from the ambient environment to sustain perpetual operation. For this reason, energy-harvesting systems (EHSs), in which energy is provided by external sources, e.g., ambient vibration, heat, or light, have been widely used as alternatives to traditional battery-powered systems. EHSs are expected to address both energy shortage and environmental challenges to a great extent. However, because renewable generation is intermittent, an EHS may fail to complete all arriving tasks, which degrades system throughput, defined as the number of tasks completed during a scheduling horizon [5]. In this paper, we are interested in maximizing the throughput of EHSs running on multicore platforms.

As CMOS feature size continues to shrink, multicore processors are seriously threatened by transient faults and permanent faults, which in turn result in soft errors and hard errors, respectively. The situation becomes even more severe for EHSs deployed in harsh environments, where integrated circuits are more prone to transient faults induced by high-energy neutron and alpha particle strikes, and the cost of repairing or replacing disabled hardware is generally prohibitive. Thus, handling both transient and permanent faults to improve soft-error reliability (SER) as well as lifetime reliability (LTR) is of great importance for low-power, energy-harvesting embedded systems.

With that in mind, this paper focuses on solving the problem motivated by a common application of EHSs, namely in-situ data processing systems (InS systems). These systems are deployed in harsh environments for purposes such as oil/gas exploration [6], astronomical observation in remote areas [7], rural geographical surveying [8], and video surveillance for behavioral studies of wildlife [9]. InS systems are powered by renewable energy [10], which may be insufficient to support the processing of all in-situ workloads. Under this circumstance, a high system throughput is preferred so that as much of the in-situ workload as possible is served. Besides, SER and LTR optimization is imperative for InS systems, since these systems are very vulnerable to transient and permanent faults due to their harsh operating environment. Taking all the above into consideration, we aim to address the problem of maximizing throughput for multicore EHSs suffering both transient and permanent faults. We make the following major contributions:

• We formulate the concerned problem as an MILP model that determines an optimal schedule of dependent tasks with energy as well as reliability constraints on a multicore EHS to maximize system throughput.

• Since MILP may take exponential time in solution space exploration, we design a polynomial-time algorithm to maximize the throughput of the EHS, which determines task allocation and scheduling strategies based on system energy states.

• We develop an earliest-finish-time based list scheduling algorithm for EHSs with plentiful energy supply and a cross-entropy based task scheduling algorithm for EHSs with insufficient energy supply.

• We carry out a series of simulations to validate the efficacy of our proposed MILP and heuristic algorithms by comparing their performance with that of a baseline method as well as four peer approaches.

The remainder of the paper is organized as follows. Section II reviews existing methods relevant to this work. Section III presents system models and formulates our throughput maximization problem. Section IV presents an MILP approach to solve the throughput maximization problem. Section V describes the proposed throughput-aware task scheduling algorithm. Section VI shows our experimental setup and results, and Section VII concludes the paper.

II. RELATED WORK

A substantial number of research efforts have been devoted to solving the problem of resource management and task scheduling for embedded systems with energy harvesting. Most of these efforts approach the problem from two perspectives: improving the harvesting efficiency and improving the utilization of renewable energy. One perspective is concerned with how to maximize the power output harvested from a renewable energy source [11], [12]. The other perspective concentrates on how to exploit the fluctuating energy generated by the source efficiently [13], [14]. Different from [11]–[14], which do not consider energy savings, Chen et al. [4] proposed a dynamic frequency selection scheme to improve energy efficiency under a deadline miss rate constraint for real-time applications in EHSs, and Liu et al. [15] presented an adaptive dynamic programming based algorithm to improve the electricity efficiency of the residential grid. However, none of the above works explores how to maximize the throughput of an EHS under intermittent renewable energy.

Numerous studies have discussed approaches for addressing throughput maximization problems. However, the existing techniques are either designed for energy-harvesting networked systems [16]–[19] and energy-harvesting power systems [20], or for embedded systems [21] and data centers [22] that ignore the uncertainty in energy sources. For example, Yuan et al. [22] proposed a workload-aware task scheduling scheme that decides the optimal combination of virtual machine and routing path for tasks. Unlike [22], the approaches proposed in [23] and [24] both utilize renewable power and are designed for green data centers. Specifically, the scheme in [23] considers the temporal variation in grid price and renewable energy source, and schedules user requests under their delay bounds. The method in [24] schedules all arriving tasks cost-efficiently to meet the delay-bound constraints of user requests by exploiting the spatial diversity in distributed green cloud data centers. These approaches are effective but do not consider SER and LTR.

Several recent works [25]–[29] have addressed the SER and LTR co-optimization problem. Zhou et al. [25] proposed a SER and LTR-balanced task frequency and replication selection strategy for maximizing system availability. Ma et al. [26] presented a reliability improvement framework which exploits the power features of Big-Little type cores to maximize SER under LTR, power, and real-time constraints. Kim et al. [27] introduced DVFS and Q-learning techniques to optimize lifetime as well as energy for many-core microprocessors in the presence of both transient and permanent faults. Based on the impacts of hardware- as well as application-level variations on SER, a variation-aware task scheduling scheme [28] was developed to maximize SER while meeting a constraint on LTR. Unlike the works [26]–[28], which optimize either SER or LTR, an evolutionary-based algorithm was designed to optimize SER and LTR simultaneously [29]. However, the above-mentioned approaches are not developed for systems with uncertain energy supply, and fail to take throughput into consideration.

TABLE 1. A comparison summary of related works from multiple aspects, such as optimization goal, constraint, utilization of renewable energy, target system, and application. * indicates that the corresponding reference does not provide specific discussions on design constraint, renewable energy type, or application.

| References | Optimization Goal | Constraint | Renewable Energy | Target System | Application |
|---|---|---|---|---|---|
| Chen et al. [4] | Energy efficiency | Deadline miss rate | Yes (solar power) | Embedded systems | Real-time tasks |
| Esram et al. [11] | Output power | * | Yes (solar power) | Photovoltaic systems | * |
| Kobayashi et al. [12] | Output power | * | Yes (solar power) | Photovoltaic systems | * |
| Sharma et al. [13] | Throughput and delay | Energy | Yes (*) | Sensor networks | Network packets |
| Abdeddaim et al. [14] | Feasibility | Energy | Yes (*) | Embedded systems | Real-time tasks |
| Liu et al. [15] | Electricity efficiency | * | Yes (solar power) | Smart grid systems | Housing units |
| Tutuncuoglu et al. [16] | Throughput | Battery storage | Yes (*) | Relay channels | Transmit powers |
| Gupta et al. [17] | Throughput | Causality and overflow of energy and data | Yes (*) | Successive relaying networks | Energy arrival events |
| Mehrabi et al. [18] | Throughput | Energy | Yes (*) | Wireless sensor networks | Mobile sinks |
| Wu et al. [19] | Throughput | Energy and capacity | Yes (*) | Communication systems | Communication channels |
| Wang et al. [20] | Spinning reserve cost | Transmission capability | Yes (wind power) | Power systems | Spinning reserves |
| Huang et al. [21] | Throughput | Temperature | No | Embedded systems | Real-time tasks |
| Yuan et al. [22] | Revenue | Round-trip time | No | Cloud data centers | User requests |
| Yuan et al. [23] | Profit | Delay | Yes (hybrid power) | Green data centers | User requests |
| Yuan et al. [24] | Cost | Delay | Yes (hybrid power) | Green cloud data centers | User requests |
| Zhou et al. [25] | Availability | Throughput | No | Embedded systems | Frame-based tasks |
| Ma et al. [26] | SER | LTR | No | Embedded systems | Real-time tasks |
| Tang et al. [27] | LTR+energy | Temperature | No | Embedded systems | Real-time tasks |
| Zhou et al. [28] | SER | Deadline | No | Embedded systems | Real-time tasks |
| Zhou et al. [29] | SER+LTR | Deadline | No | Embedded systems | Real-time tasks |

For a better understanding, we summarize related works in TABLE 1 from multiple aspects, such as optimization goal, constraint, utilization of renewable energy, target system, and application. The table shows that none of these existing works considers throughput and reliability (SER and LTR) simultaneously for energy-harvesting embedded systems. Unlike the existing works, this paper concentrates on optimizing the throughput of EHSs running on multicore embedded systems suffering both transient and permanent faults. In this paper, we present an allocation and scheduling scheme that uses an MILP solver to maximize system throughput under energy, reliability, and task precedence constraints. To efficiently solve this NP-hard allocation and scheduling problem, we also propose an energy uncertainty-aware task scheduling heuristic algorithm in which the allocation, frequency, and execution order of tasks are determined to maximize system throughput.

III. SYSTEM MODEL AND PROBLEM DEFINITION

Below we introduce the system models and define the problem that we are trying to solve.

A. ARCHITECTURE AND APPLICATION MODEL

The EHS is mainly composed of three modules (as shown in Fig. 1): an energy source module, an energy storage module, and an energy dissipation module. The energy source module scavenges solar energy automatically at the rate of $P_{harv}(t)$. The scavenged energy is then transformed into electrical energy. The energy storage module, usually implemented by a battery or super-capacitor, works as a buffer against the uncertainty of harvested energy. An embedded system running on the multicore $\mathcal{C}$ is used as the energy dissipation module. The multicore $\mathcal{C}$ consists of $M$ homogeneous cores $C_1, C_2, \cdots, C_M$. All the cores are dynamic voltage and frequency scaling (DVFS)-enabled and support multiple discrete supply voltage and frequency levels. Let $V_{min}/F_{min}$ and $V_{max}/F_{max}$ be the minimum and maximum voltage/frequency supported by the multicore, respectively. The $k$th ($1 \le k \le K$) voltage/frequency level supported by the multicore then satisfies $V_{min}/F_{min} \le V_k/F_k \le V_{max}/F_{max}$, where $K$ is the number of voltage/frequency levels.

FIGURE 1. The diagram of the system architecture (energy source, energy storage, and energy dissipation modules).

Suppose the multicore $\mathcal{C}$ hosts multiple applications with precedence constraints, each of which can be modeled as a directed acyclic graph (DAG) $G = (\mathcal{V}, \mathcal{E})$ [29]–[31]. Each graph contains a set $\mathcal{V}$ of vertices representing the task nodes and a set $\mathcal{E}$ of edges representing the partial order of tasks. The edge $(\tau_i, \tau_j) \in \mathcal{E}$ ($1 \le i, j \le |\mathcal{V}|$) imposes the precedence constraint that task $\tau_j$ cannot execute until its predecessor $\tau_i$ has finished execution. The communication time between tasks $\tau_i$ and $\tau_j$ is denoted by $CMT(\tau_i, \tau_j)$. Fig. 2 illustrates an example of DAG applications. Considering that a task may have multiple predecessors and successors, $Pre(\tau_i)$ and $Succ(\tau_i)$ are used to represent the sets of task $\tau_i$'s immediate predecessors and successors, respectively. Let $wc_i$ represent the worst-case execution cycles of task $\tau_i$; the execution time of $\tau_i$ running at frequency $F_k$ is then calculated as

$$ET(\tau_i, F_k) = \frac{wc_i}{F_k}. \tag{1}$$

FIGURE 2. Example of applications modeled as DAG.

B. ENERGY MODEL

We discuss the energy model for the concerned EHS from the supply and demand perspectives separately.

1) Energy Supply

We use $P_{harv}(t)$ to denote the harvesting power and $E_{harv}(t_1, t_2)$ to denote the energy harvested from the external environment during the time interval $[t_1, t_2]$. $E_{harv}(t_1, t_2)$ is then formulated as

$$E_{harv}(t_1, t_2) = \int_{t_1}^{t_2} P_{harv}(t)\,dt. \tag{2}$$

As illustrated in Fig. 1, the multicore system consumes a portion of the harvested energy and the energy storage module stores the remainder. Both the energy source module and the energy storage module are able to supply energy to the energy dissipation module. Let $E_{sup}(t_1, t_2)$ represent the energy available in the time interval $[t_1, t_2]$, and $E(t_1)$ represent the energy stored in the storage module at time point $t_1$. The supply energy can then be derived as

$$E_{sup}(t_1, t_2) = E_{harv}(t_1, t_2) + E(t_1). \tag{3}$$
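Eqs. (2) and (3) can be sketched numerically as follows. This is a minimal illustration, assuming the harvesting power is only known as discrete samples and that trapezoidal integration is accurate enough; the function names `harvested_energy` and `supply_energy` are hypothetical and do not come from the paper.

```python
def harvested_energy(times, powers):
    """Trapezoidal approximation of Eq. (2) over [times[0], times[-1]].

    times and powers are equally long sequences of sample instants and
    sampled P_harv values (an assumed input format).
    """
    if len(times) != len(powers) or len(times) < 2:
        raise ValueError("need at least two matching samples")
    energy = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        energy += 0.5 * (powers[i] + powers[i - 1]) * dt
    return energy


def supply_energy(times, powers, stored_energy):
    """Eq. (3): harvested energy plus the energy stored at the interval start."""
    return harvested_energy(times, powers) + stored_energy
```

For a constant harvesting power the sketch reduces to power multiplied by interval length, which is a quick sanity check on the integration.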

2) Energy demand

The power consumed by a CMOS device can be decomposed into dynamic and static portions, i.e.,

$$P_{cons} = P_{dyn} + P_{sta}. \tag{4}$$

The dynamic power $P_{dyn}$ is associated with the core's switching activity. A common approach is to model dynamic power as a convex function of frequency. The static power $P_{sta}$ is the power dissipated by the CMOS circuit itself and is independent of switching activity. As the dynamic power is only consumed for executing tasks and the static power is dissipated to maintain circuit state, the total energy demanded by executing the tasks in set $\mathcal{V}$ on the multicore $\mathcal{C}$ during a scheduling horizon $\mathcal{H}$ is

$$E_{cons}(\mathcal{V}, \mathcal{C}, \mathcal{H}) = \sum_{m=1}^{M} P_{sta}(C_m) \times \mathcal{H} + \sum_{m=1}^{M} \sum_{\tau_i \in \mathcal{V}_m} P_{dyn}(\tau_i, C_m) \times ET(\tau_i, F_k), \tag{5}$$

where $P_{sta}(C_m)$ is the static power of core $C_m$ and $P_{dyn}(\tau_i, C_m)$ is the dynamic power of executing task $\tau_i$ on core $C_m$. $\mathcal{V}_m$ is the set of tasks allocated to core $C_m$, and $ET(\tau_i, F_k)$ is $\tau_i$'s execution time when running at frequency $F_k$.
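A minimal sketch of Eqs. (1) and (5) is given below. It assumes the dynamic power of each task is supplied directly as a number, since the excerpt does not fix a concrete dynamic power model; the data layout and the names `execution_time` and `energy_demand` are illustrative assumptions.

```python
def execution_time(wc_cycles, freq):
    """Eq. (1): worst-case execution cycles divided by operating frequency."""
    return wc_cycles / freq


def energy_demand(static_power, allocation, horizon):
    """Eq. (5): static energy of all cores plus dynamic energy of all tasks.

    static_power: list of P_sta(C_m), one entry per core.
    allocation:   one list per core of (p_dyn, wc_cycles, freq) tuples,
                  one tuple per task allocated to that core (assumed layout).
    horizon:      length of the scheduling horizon H.
    """
    static_part = sum(p * horizon for p in static_power)
    dynamic_part = sum(
        p_dyn * execution_time(wc, freq)
        for core_tasks in allocation
        for (p_dyn, wc, freq) in core_tasks
    )
    return static_part + dynamic_part
```

Evaluating `energy_demand` with every task set to the maximum frequency gives the quantity that the scheduling scheme compares against the supply energy.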

C. RELIABILITY MODEL

We discuss the reliability model for the concerned EHS from the SER and LTR perspectives separately.

1) Soft-error reliability

SER is determined by the average fault arrival rate, which is calculated as the expected number of failures occurring per second. Using the exponential model proposed by Zhu et al. [32], the raw fault rate of a core at frequency $F_k$ is

$$\lambda(F_k) = \lambda_{F_{max}} \times 10^{\frac{F_{max} - F_k}{\Omega}}, \tag{6}$$

where $\lambda_{F_{max}}$ is the core's fault rate when it operates at the maximum frequency, and the parameter $\Omega$ indicates how quickly the fault rate increases as the frequency is reduced.

The SER of a task is defined as the probability of successfully executing the task without suffering any transient fault, and can be modeled using the exponential failure law [32]. Given task $\tau_i$ running on the multicore at frequency $F_k$, the SER of the task is then calculated as

$$SER(\tau_i, F_k) = e^{-\lambda(F_k) \times VF_i \times \frac{wc_i}{F_k}}, \tag{7}$$

where $\lambda(F_k) \times VF_i$ is the ultimate fault rate considering the task error probability and $wc_i/F_k$ is task $\tau_i$'s execution time when running at frequency $F_k$. Since a system's correct operation relies on the successful execution of all tasks, we formulate the system SER as

$$SER_{sys} = \prod_{\tau_i \in \mathcal{V}} \prod_{F_k \in \mathcal{F}} SER(\tau_i, F_k), \tag{8}$$

where $\mathcal{F}$ is the frequency set supported by the multicore.
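The SER model of Eqs. (6) and (7) can be sketched as below. For the system-level value, the sketch assumes that each task contributes one factor, evaluated at the single frequency assigned to it; that reading of Eq. (8), along with all function and parameter names, is an assumption made for illustration.

```python
import math


def fault_rate(freq, f_max, lambda_fmax, omega):
    """Eq. (6): the raw fault rate grows as the frequency drops below f_max."""
    return lambda_fmax * 10 ** ((f_max - freq) / omega)


def task_ser(wc_cycles, freq, vf, f_max, lambda_fmax, omega):
    """Eq. (7): probability that the task finishes without a transient fault."""
    rate = fault_rate(freq, f_max, lambda_fmax, omega)
    return math.exp(-rate * vf * wc_cycles / freq)


def system_ser(per_task_sers):
    """Product of task SERs, one factor per task at its assigned frequency
    (an assumed simplification of Eq. (8))."""
    return math.prod(per_task_sers)
```

Lowering the frequency raises the fault rate and also lengthens the execution time, so both effects reduce the SER of a task in this model.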

2) Lifetime reliability

LTR is determined by multiple wear-out effects such as electromigration, time-dependent dielectric breakdown, stress migration, and thermal cycling [25]. For simplicity, this work considers electromigration (EM) as the primary source of permanent faults. EM is the dislocation of metal atoms due to momentum imparted by electrical current in wires as well as vias [33]. It is worth emphasizing that this reliability model can easily be extended to incorporate other wear-out effects by using the sum-of-failure-rate model [26].





LTR is generally evaluated using the mean time to failure (MTTF) metric. According to the MTTF model for EM [33], core $C_m$'s MTTF is calculated as

$$MTTF(C_m) = \int_0^{\infty} R_{LTR,m}(t)\,dt = \int_0^{\infty} e^{-(A_m t)^{\alpha}}\,dt, \tag{9}$$

where $R_{LTR,m}$ is the LTR of core $C_m$ following the Weibull distribution, $A_m$ is the aging rate of core $C_m$, and $\alpha$ is the slope coefficient of the Weibull distribution. $A_m$ is determined by the hardware as well as the thermal profile of core $C_m$. The calculation method of $A_m$ is provided in [34]. As in [34], [35], the system is deemed to have failed as soon as any core in the system fails. The system MTTF is thereby obtained as

$$MTTF_{sys} = \min_{m} MTTF(C_m), \quad \forall m = 1, 2, \cdots, M. \tag{10}$$
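The integral in Eq. (9) has the standard closed form $\Gamma(1 + 1/\alpha)/A_m$, obtained by substituting $u = (A_m t)^{\alpha}$. A minimal sketch of Eqs. (9) and (10) using this closed form follows; the aging rates are taken as given inputs, and the function names are hypothetical.

```python
import math


def core_mttf(aging_rate, alpha):
    """Closed form of Eq. (9): integral of exp(-(A t)^alpha) over [0, inf)."""
    return math.gamma(1.0 + 1.0 / alpha) / aging_rate


def system_mttf(aging_rates, alpha):
    """Eq. (10): the system fails as soon as its weakest core fails."""
    return min(core_mttf(a, alpha) for a in aging_rates)
```

With $\alpha = 1$ the Weibull distribution reduces to the exponential distribution and the MTTF of a core is simply $1/A_m$.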

D. PROBLEM DEFINITION

We focus on solving the problem motivated by in-situ data processing applications deployed in special operating environments. These systems are powered by renewable energy and have throughput and reliability requirements. The problem that we aim to solve is described as follows. Given an application represented by a DAG $G = (\mathcal{V}, \mathcal{E})$ to be scheduled on the multicore system $\mathcal{C} = \{C_1, C_2, \cdots, C_M\}$, and an estimated harvested energy budget $E_{bgt}$ during a scheduling horizon $\mathcal{H}$, design a task scheduling scheme that maximizes system throughput under the limited supply energy while satisfying the constraints on SER, LTR, and task precedence.

To solve the problem, we provide an MILP approach and two heuristic algorithms for allocating and scheduling dependent tasks with energy as well as reliability constraints on the multicore system. The details of our MILP approach and heuristic algorithms are introduced in the following sections.

IV. MILP-BASED APPROACH

This section presents our MILP approach to solving the throughput maximization problem described in Section III-D and discusses the shortcomings of the MILP approach. The approach maximizes the system throughput under the constraints of reliability, energy, and task dependency by determining i) on which cores the tasks should be executed, ii) what voltages/frequencies should be used for the tasks, and iii) when the tasks should start. Before showing the MILP formulation, we first define the binary variables used in the formulation.

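$$\eta(\tau_i, C_m) = \begin{cases} 1 & \text{if } \tau_i \text{ is allocated to } C_m, \\ 0 & \text{otherwise.} \end{cases} \tag{11}$$

$$\varphi(\tau_i, \tau_j) = \begin{cases} 1 & \text{if } \tau_i \text{ starts before } \tau_j, \\ 0 & \text{otherwise.} \end{cases} \tag{12}$$

$$\delta(\tau_i, F_k) = \begin{cases} 1 & \text{if } \tau_i \text{ is executed at frequency } F_k, \\ 0 & \text{otherwise.} \end{cases} \tag{13}$$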

A. OBJECTIVE

The goal of the MILP formulation is to maximize the system throughput in a scheduling horizon using the harvested energy. The throughput is calculated as the number of task instances executed by the multicore EHS. Given the input task set $\mathcal{V}$ and the multicore $\mathcal{C}$, the system throughput during the scheduling horizon $\mathcal{H}$ is then expressed as

$$Tru_{sys} = sizeof(\mathcal{V}_{exe}), \tag{14}$$

where $\mathcal{V}_{exe}$ is the set of tasks executed on the multicore and $\mathcal{V}_{exe} \subset \mathcal{V}$ holds.

B. CONSTRAINTS

Let $\mathcal{N} \triangleq \{1, 2, \cdots, |\mathcal{V}_{exe}|\}$, $\mathcal{M} \triangleq \{1, 2, \cdots, M\}$, and $\mathcal{K} \triangleq \{1, 2, \cdots, K\}$. The constraints that must be satisfied are then formulated as below.

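1) Constraint on Task-to-Core Allocation: every task is allocated to exactly one core.

$$\sum_{m=1}^{M} \eta(\tau_i, C_m) = 1, \quad \forall i \in \mathcal{N}. \tag{15}$$

2) Constraint on Frequency-to-Task Assignment: every task is executed at exactly one frequency level.

$$\sum_{k=1}^{K} \delta(\tau_i, F_k) = 1, \quad \forall i \in \mathcal{N}. \tag{16}$$

3) Constraint on Energy Consumption: the total energy consumed by executing tasks cannot exceed the energy supply.

$$E_{cons}(\mathcal{V}_{exe}, \mathcal{C}, \mathcal{H}) \le E_{sup}(\mathcal{H}). \tag{17}$$

4) Constraint on SER and LTR: the system SER and LTR should be no less than the thresholds.

$$SER_{sys} \ge SER_{th}, \tag{18}$$

$$MTTF_{sys} \ge MTTF_{th}. \tag{19}$$

5) Constraint on Task Dependency and Non-Preemption: the order of task executions cannot violate task dependency, and tasks executed on the same core cannot overlap with each other.

$\forall i, j \in \mathcal{N}, i \ne j$,

$$\varphi(\tau_i, \tau_j) + \varphi(\tau_j, \tau_i) \ge 0, \tag{20}$$

$$\varphi(\tau_i, \tau_j) + \varphi(\tau_j, \tau_i) \le 1, \tag{21}$$

$$t_{start}(\tau_i) \le t_{start}(\tau_j) + (1 - \varphi(\tau_i, \tau_j)) \times \Delta \times \mathcal{H}, \tag{22}$$

$$t_{start}(\tau_j) \le t_{start}(\tau_i) + \varphi(\tau_i, \tau_j) \times \Delta \times \mathcal{H}, \tag{23}$$

$\forall i, j \in \mathcal{N}, i \ne j, \forall m \in \mathcal{M}, \forall k \in \mathcal{K}$,

$$t_{finish}(\tau_i) \le t_{start}(\tau_j) + (3 - \eta(\tau_i, C_m) - \eta(\tau_j, C_m) - \varphi(\tau_i, \tau_j)) \times \Delta \times \mathcal{H}, \tag{24}$$

$$t_{finish}(\tau_j) \le t_{start}(\tau_i) + (2 - \eta(\tau_i, C_m) - \eta(\tau_j, C_m) + \varphi(\tau_i, \tau_j)) \times \Delta \times \mathcal{H}, \tag{25}$$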

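where $t_{start}$/$t_{finish}$ is the start/finish time of a task and $\Delta$ ($\Delta \ge 1$) is a constant. The product $\Delta \times \mathcal{H}$ acts as a sufficiently large constant that deactivates a constraint whenever its binary condition does not hold. Eq. (22) indicates that task $\tau_i$ must start before $\tau_j$ if $\varphi(\tau_i, \tau_j) = 1$, and Eq. (24) ensures that task $\tau_i$ completes its execution before task $\tau_j$ starts if $\tau_i$ and $\tau_j$ run on the same core and $\tau_i$ starts before $\tau_j$. Similar conditions hold for Eqs. (23) and (25).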
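The effect of Eqs. (22)-(25) can be illustrated with a small checker that evaluates the four inequalities for one pair of tasks in a given schedule. This is only a sketch for building intuition, not part of the MILP itself; it assumes the ordering variable is derived from the start times, and the dictionary-based schedule layout and the name `pair_constraints_hold` are hypothetical.

```python
def pair_constraints_hold(start, finish, core, i, j, cores, horizon, delta=1.0):
    """Evaluate Eqs. (22)-(25) for tasks i and j of a concrete schedule.

    start, finish: dicts mapping task -> start/finish time.
    core:          dict mapping task -> core it is allocated to.
    cores:         iterable of all core identifiers.
    """
    big_m = delta * horizon
    # Assumed derivation of the ordering variable phi(i, j) from start times.
    phi = 1 if start[i] <= start[j] else 0
    checks = [
        start[i] <= start[j] + (1 - phi) * big_m,  # Eq. (22)
        start[j] <= start[i] + phi * big_m,        # Eq. (23)
    ]
    for m in cores:
        eta_i = 1 if core[i] == m else 0
        eta_j = 1 if core[j] == m else 0
        # Eq. (24): active only when both tasks share core m and i goes first.
        checks.append(finish[i] <= start[j] + (3 - eta_i - eta_j - phi) * big_m)
        # Eq. (25): active only when both tasks share core m and j goes first.
        checks.append(finish[j] <= start[i] + (2 - eta_i - eta_j + phi) * big_m)
    return all(checks)
```

Two tasks that overlap in time pass the check when they sit on different cores and fail it when they share a core, which is the non-preemption behavior the constraints are meant to encode.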

C. LIMITATION OF MILP APPROACH

Our target is to find the optimal solution to our problem from the solution space. The problem can be optimally solved for small-scale systems using an MILP solver. However, since the MILP problem is NP-hard, the complexity may increase exponentially when solving large-scale systems. In this case, MILP solvers cannot serve the purpose even for design-time exploration. Therefore, developing a time-efficient algorithm to find a suboptimal solution becomes a necessity. In the subsequent section, we present a polynomial-time scheduling strategy to maximize system throughput under the energy, reliability, and task precedence constraints.

V. THROUGHPUT-AWARE TASK SCHEDULING SCHEME

This section describes the proposed throughput-aware task scheduling scheme in detail. The scheme takes the uncertainty in renewable energy sources into account and handles it by dividing system operation into a high energy state and a low energy state. Two heuristic algorithms are provided in the scheme to maximize the throughput of systems in the two energy states, respectively.

A. SOLVE THE ENERGY UNCERTAINTY

The available energy used to support system operation varies in the scheduling horizon (i.e., $\mathcal{H}$) because of the intermittent nature of renewable energy sources. To handle this uncertainty, we first divide system operation into two states with respect to energy: a high energy state and a low energy state. If there is sufficient energy available for the EHS to complete all the tasks in the scheduling horizon at high speed, the system is in the high energy state; otherwise, it is in the low energy state. We then propose an energy state-aware approach (ESA) to solve our studied problem.

Alg. 1 summarizes the algorithmic flow of ESA. The algorithm first estimates the energy $E_{sup}(\mathcal{H})$ supplied to the system in the scheduling horizon $\mathcal{H}$ using Eqs. (2)-(3) and the energy $E^{max}_{cons}(\mathcal{V}, \mathcal{C}, \mathcal{H})$ demanded by executing all tasks at the maximum frequency using Eq. (5) (lines 1-2). If there is sufficient energy to meet the system's energy demand, i.e., $E^{max}_{cons}(\mathcal{V}, \mathcal{C}, \mathcal{H}) \le E_{sup}(\mathcal{H})$, the system is in the high energy state and the earliest-finish-time based list scheduling (EFT-LS) heuristic is called to maximize system throughput (lines 3-5). Otherwise, the system is in the low energy state and the cross-entropy based task scheduling (CE-TS) heuristic is called to maximize system throughput (lines 6-8). The details of EFT-LS and CE-TS are presented in Algs. 2 and 3, respectively.

Algorithm 1: Energy State-Aware Approach

1 calculate the supply energy $E_{sup}(\mathcal{H})$ using Eqs. (2)-(3);
2 compute the energy $E^{max}_{cons}(\mathcal{V}, \mathcal{C}, \mathcal{H})$ consumed by all tasks running at the maximum frequency using Eq. (5);
3 if $E^{max}_{cons}(\mathcal{V}, \mathcal{C}, \mathcal{H}) \le E_{sup}(\mathcal{H})$ then
      /* high energy state */
4     call Alg. 2 (i.e., earliest-finish-time based list scheduling algorithm);
5 end
6 else
      /* low energy state */
7     call Alg. 3 (i.e., cross-entropy based task scheduling algorithm);
8 end
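The dispatch logic of Alg. 1 can be sketched as follows. The two schedulers are passed in as placeholder callables, because their details belong to Algs. 2 and 3; the names `energy_state` and `esa` are hypothetical.

```python
def energy_state(e_max_cons, e_sup):
    """Return 'high' if running every task at F_max fits the supply, else 'low'."""
    return "high" if e_max_cons <= e_sup else "low"


def esa(e_max_cons, e_sup, eft_ls, ce_ts):
    """Alg. 1 dispatch; eft_ls and ce_ts are caller-supplied scheduler callables."""
    if energy_state(e_max_cons, e_sup) == "high":
        return eft_ls()  # high energy state: list scheduling (Alg. 2)
    return ce_ts()       # low energy state: cross-entropy scheduling (Alg. 3)
```

Keeping the state test separate from the dispatch makes it easy to re-evaluate the energy state at the start of each scheduling horizon.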

B. EARLIEST-FINISH-TIME BASED LIST SCHEDULING FOR HIGH ENERGY STATE

We apply list scheduling (LS) [36] to solve our throughput maximization problem for systems operating in the high energy state. Following the design of LS, EFT-LS is composed of a task prioritization phase (TPP) that determines the scheduling order (priority) of all tasks, and a core selection phase (CSP) that selects tasks in the order of their priorities and allocates them to the most appropriate cores. It has been observed in [21] that, when the energy supply is sufficient, system throughput is maximized if the latency of executing all tasks in the application is minimized. In addition, the execution latency of a DAG application is in fact the finish time of the exit task, which is minimized if the finish times of its predecessors are minimized. Motivated by these observations, EFT-LS maximizes system throughput by minimizing the finish time of tasks.

TPP of EFT-LS: EFT-LS sets the priorities of tasks using the rank value $rank_{EFT}$, which is calculated based on the task execution time and communication time. Given task $\tau_i$ and its successors $Succ(\tau_i)$, the rank of $\tau_i$ is recursively defined as

$$rank_{EFT}(\tau_i) = \frac{\sum_{k=1}^{K} ET(\tau_i, F_k)}{K} + \max_{\tau_j \in Succ(\tau_i)} \big( CMT(\tau_i, \tau_j) + rank_{EFT}(\tau_j) \big). \tag{26}$$

As can be seen from Eq. (26), the rank is calculated recursively by traversing the task graph upward from the exit task $\tau_{exit}$, whose rank value is derived as

$$rank_{EFT}(\tau_{exit}) = \frac{\sum_{k=1}^{K} ET(\tau_{exit}, F_k)}{K}. \tag{27}$$

After obtaining the rank values of all tasks, the task scheduling list is generated by sorting the tasks in the non-increasing order of $rank_{EFT}$. According to the definition of $rank_{EFT}$, the rank of a task is always larger than the rank of any of its successors, so the non-increasing order of $rank_{EFT}$ ensures a topological task order that preserves the precedence constraints among tasks.
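The rank computation of Eqs. (26) and (27) can be sketched with a memoized recursion over the task graph. The dictionary-based graph layout and the names `upward_ranks` and `scheduling_list` are assumptions for illustration; a missing communication entry is treated as zero.

```python
def upward_ranks(exec_times, succ, comm):
    """Compute rank_EFT for every task following Eqs. (26)-(27).

    exec_times: task -> list of ET(task, F_k), one value per frequency level.
    succ:       task -> list of immediate successors (exit tasks may be absent).
    comm:       (task_i, task_j) -> CMT(task_i, task_j).
    """
    ranks = {}

    def rank(task):
        if task not in ranks:
            avg_et = sum(exec_times[task]) / len(exec_times[task])
            # Exit tasks have no successors, so the max term defaults to 0.
            tail = max(
                (comm.get((task, s), 0.0) + rank(s) for s in succ.get(task, [])),
                default=0.0,
            )
            ranks[task] = avg_et + tail
        return ranks[task]

    for task in exec_times:
        rank(task)
    return ranks


def scheduling_list(ranks):
    """Tasks sorted in non-increasing order of rank_EFT."""
    return sorted(ranks, key=lambda task: -ranks[task])
```

For a three-task chain with a shortcut edge, the entry task receives the largest rank and the exit task the smallest, so the sorted list respects every precedence edge.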

CSP of EFT-LS: Given the task scheduling list, EFT-LS tentatively places the task being scheduled on each of the cores and selects the core that delivers the shortest EFT. The key of EFT-LS is to minimize the EFT of tasks. To achieve this goal, EFT-LS assumes that all tasks run at the maximum frequency $F_{max}$, which is a safe choice since the energy supply is plentiful. In addition, EFT-LS utilizes a task insertion policy that inserts a task into the idle time slot between two consecutively scheduled tasks on the same core without violating the task precedence constraint. The idle time slot, i.e., the time span between the finish time of the earlier task and the start time of the later task, must be long enough to accommodate the execution time of the task to be inserted. By exploiting the idle time, the execution latency of the task set can be further reduced.

Algorithm 2: EFT-based list scheduling

1 compute $rank_{EFT}$ for all tasks using Eq. (26);
2 sort the tasks in a scheduling list in the non-ascending order of $rank_{EFT}$ values by $Rank = UpRank(\mathcal{V})$;
3 while the list $Rank$ is not empty do
4     select and remove the first task $\tau_i$ from the list for scheduling;
5     for each core $C_m$ in set $\mathcal{C}$ do
6         derive $EFT(\tau_i)$ of task $\tau_i$ that runs on core $C_m$ at frequency $F_{max}$ and uses the insertion-based scheduling;
7     end
8     denote the core with the minimum EFT of task $\tau_i$ by $C_r$;
9     while true do
10        if $Temperature(\tau_i, C_r) \le T_{th}$ then
11            allocate task $\tau_i$ to core $C_r$;
12            break;
13        end
14        else
15            denote the core with the next minimum EFT of task $\tau_i$ by $C_r$;
16        end
17    end
18 end
19 calculate the system SER using Eq. (8);
20 if $SER_{sys} < SER_{th}$ then
21     output "Infeasible schedule";
22 end

Alg. 2 presents the pseudo code for EFT-LS. The algorithm first derives $rank_{EFT}$ for all tasks using Eq. (26) (line 1) and sorts these tasks in the non-ascending order of $rank_{EFT}$ values using the function $Rank = UpRank(\mathcal{V})$ (line 2), where $Rank$ is the scheduling list of the sorted tasks and $UpRank()$ computes the task rank values and sorts the tasks based on them. It then iteratively decides the allocation of each task to a core (lines 3-18). During each iteration, the algorithm calculates the EFTs of the task currently being scheduled when it is executed at frequency $F_{max}$ on each of the $M$ cores (lines 4-8), and allocates the task to the core with the minimum EFT if doing so does not violate the MTTF constraint (lines 10-13). It has been shown in [37] that the system LTR constraint can be ensured by checking whether the operating temperature of the multicore system exceeds a corresponding threshold. Thus, if the operating temperature of executing task $\tau_i$ on core $C_r$, represented by $Temperature(\tau_i, C_r)$, does not exceed a threshold $T_{th}$, task $\tau_i$ is allocated to core $C_r$. Otherwise, the algorithm attempts to allocate the task to the core with the next minimum EFT and checks the temperature constraint again. The outer while-loop terminates once the allocation of all tasks in the list $Rank$ has been determined. The algorithm finally checks the system SER constraint. If the system SER $SER_{sys}$ estimated by Eq. (8) is lower than the SER requirement $SER_{th}$, the derived task schedule is deemed infeasible (lines 19-22).
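The core selection loop with insertion-based scheduling can be sketched as below. Several simplifications are assumptions rather than details from the paper: communication time is charged only when a predecessor ran on a different core, the temperature check is an opaque caller-supplied predicate, and the SER check of lines 19-22 is omitted. The names `earliest_start` and `eft_ls` are hypothetical.

```python
def earliest_start(busy, ready, duration):
    """First time >= ready at which `duration` fits into a core's idle gaps.

    busy is a list of (start, finish) slots sorted by start time.
    """
    t = ready
    for slot_start, slot_finish in busy:
        if t + duration <= slot_start:
            break  # the task fits in the gap before this slot
        t = max(t, slot_finish)
    return t


def eft_ls(order, exec_time, pred, comm, n_cores, temp_ok=lambda task, core: True):
    """Allocate tasks (given in rank order) to the core with the earliest finish.

    exec_time: task -> execution time at F_max.
    pred:      task -> list of immediate predecessors.
    comm:      (task_i, task_j) -> communication time.
    temp_ok:   assumed stand-in for the check Temperature(task, core) <= T_th.
    Returns task -> (core, start, finish).
    """
    busy = [[] for _ in range(n_cores)]
    placed = {}
    for task in order:
        candidates = []
        for core in range(n_cores):
            ready = 0.0
            for p in pred.get(task, []):
                p_core, _, p_finish = placed[p]
                # Assumption: no communication delay on the same core.
                delay = 0.0 if p_core == core else comm.get((p, task), 0.0)
                ready = max(ready, p_finish + delay)
            start = earliest_start(busy[core], ready, exec_time[task])
            candidates.append((start + exec_time[task], core, start))
        # Try cores from the smallest EFT upward until the check passes.
        for finish, core, start in sorted(candidates):
            if temp_ok(task, core):
                busy[core].append((start, finish))
                busy[core].sort()
                placed[task] = (core, start, finish)
                break
        else:
            raise RuntimeError("no core satisfies the temperature check")
    return placed
```

Unlike the `while true` loop of Alg. 2, the sketch raises an error when no core passes the check, which avoids looping forever on an infeasible instance.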

C. CROSS ENTROPY BASED TASK SCHEDULING FOR LOW ENERGY STATE

As introduced above, the proposed EFT-LS algorithm can maximize system throughput by minimizing the latency of executing tasks in the application. However, EFT-LS is only effective for systems with plentiful energy supply and thus cannot tackle the studied throughput maximization problem for systems with insufficient energy supply. Therefore, we develop a cross-entropy (CE) based approach to maximize the throughput of systems with insufficient energy supply. The CE approach is a versatile strategy used for solving NP-hard optimization problems [38].

For a deterministic optimization problem to be solved, the CE approach converts it into an associated stochastic optimization problem and considers the optimum solution to the stochastic problem as a rare event. The approach finds the optimum solution by continuously increasing the probability of the rare event using an iterative sampling scheme that gradually changes the sampling distribution. When the probability of the rare event approaches 1, the optimum solution to the original deterministic problem is found. During the iterative process, each solution is viewed as a sample. Solution samples are produced according to the probability density function (PDF) and then converge in a probabilistic manner to better solution samples. For more details of the CE approach, readers are referred to [38].

For a better understanding, we briefly illustrate the philosophy behind the CE approach in Fig. 3. During each iteration of the CE approach, solution samples are produced following the PDF and the quality of these solutions is evaluated. We identify the high-quality samples as elite samples, and rely on them to update the PDF's characterizing parameters. The updated PDF is utilized during the next iteration for producing a new generation of samples.

[Figure 3: cyclic workflow with four steps (generate samples according to the PDF, evaluate samples, select elite samples, update the PDF according to the elite samples); each panel plots the objective function over the solution space.]

FIGURE 3. Workflow of cross entropy optimization approach [39].

Alg. 3 shows the pseudo code of our CE-based heuristic. In the initialization step, we initialize the mean value as well as the standard deviation of the Gaussian distribution that will be adopted to produce samples. In this step, the iteration count is also initialized (lines 1-2). The algorithm iteratively derives the best solution, i.e., the one leading to the shortest task schedule, in lines 3 to 9. Specifically, in each iteration the algorithm generates W voltage samples following the Gaussian distribution N(ζg, ξg) (line 4) and chooses J feasible samples that meet all the design constraints in Eqs. (15)-(25) by using the acceptance-rejection scheme (line 5). The algorithm then evaluates the selected samples in terms of system throughput Trusys derived by Eq. (14), and chooses the top Q elite samples to update the PDF (lines 6-7). At the end of each iteration, g, ζg, and ξg are updated (line 8). The iteration terminates if the predefined convergence criterion is met or the iteration count reaches a pre-specified limit (line 9). Since the solution samples are evaluated independently of each other, the heuristic can be accelerated substantially by evaluating them in a parallel programming environment.
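The iteration described above can be sketched in a few lines of Python. This is a generic cross-entropy loop under stated assumptions, not the authors' code: the callables score and feasible are hypothetical stand-ins for the throughput of Eq. (14) and the constraints of Eqs. (15)-(25), samples are drawn with plain independent Gaussian sampling rather than Latin hypercube sampling, and convergence is declared when every standard deviation falls below a tolerance tol chosen here for illustration.

```python
import random
import statistics
from typing import Callable, List, Sequence


def cross_entropy_search(
    score: Callable[[Sequence[float]], float],    # stand-in for Trusys, Eq. (14)
    feasible: Callable[[Sequence[float]], bool],  # stand-in for Eqs. (15)-(25)
    mean: List[float],                            # initial mean of the Gaussian
    std: List[float],                             # initial standard deviation
    num_samples: int = 200,                       # W
    num_elite: int = 20,                          # Q
    max_iters: int = 50,                          # iteration limit
    tol: float = 1e-3,                            # assumed convergence tolerance
    seed: int = 0,
) -> List[float]:
    """Return the mean of the sampling distribution after the CE iterations."""
    rng = random.Random(seed)
    mean, std = list(mean), list(std)
    for _ in range(max_iters):
        # Draw W samples, one independent Gaussian per dimension.
        samples = [[rng.gauss(m, s) for m, s in zip(mean, std)]
                   for _ in range(num_samples)]
        # Acceptance-rejection: keep only the feasible samples (J of them).
        accepted = [x for x in samples if feasible(x)]
        if len(accepted) < 2:
            continue  # too few feasible samples, draw a fresh batch
        # Rank the accepted samples by score and keep the top Q as elites.
        elite = sorted(accepted, key=score, reverse=True)[:num_elite]
        # Refit the Gaussian parameters to the elite samples.
        mean = [statistics.fmean(col) for col in zip(*elite)]
        std = [statistics.pstdev(col) for col in zip(*elite)]
        if max(std) < tol:
            break
    return mean


if __name__ == "__main__":
    # Toy problem: maximize -(x - 3)^2 subject to 0 <= x <= 5.
    best = cross_entropy_search(
        score=lambda x: -(x[0] - 3.0) ** 2,
        feasible=lambda x: 0.0 <= x[0] <= 5.0,
        mean=[1.0],
        std=[2.0],
    )
    print(best)
```

Because each call to score touches only its own sample, the evaluation step is the natural place to introduce parallelism.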

Algorithm 3: CE-based task scheduling
1 initialize the mean (ζ1) and standard deviation (ξ1) of the Gaussian distribution; /* the Gaussian distribution is adopted as the PDF */
2 g = 1; /* initialize the iteration counter */
3 repeat
4     produce W solution samples [S1, S2, ..., SW] using the Latin hypercube sampling method according to distribution N(ζg, ξg);
5     select J (J < W) feasible solution samples meeting the constraints in Eqs. (15)-(25) using the acceptance-rejection scheme;
6     calculate the Trusys value for each selected sample by Eq. (14);
7     select the top Q elite samples with respect to Trusys;
8     g++ and update ζg and ξg;
9 until the iteration converges or the iteration count reaches its maximum value;

VI. EVALUATION

This section first describes the simulation setups used for validating the proposed MILP and ESA approaches, then presents and analyzes the simulation results.

A. SIMULATION SETUPS

Four real-world DAG benchmarks, namely CyberShake, Inspiral, Montage, and Sipht [40], are utilized in the simulations to verify the effectiveness of the proposed scheme. These DAG benchmarks are widely used in evaluating the performance of task scheduling algorithms. TABLE 2 summarizes the key characteristics of these benchmarks. The simulations are performed on 2 × 3 (M = 6) and 2 × 4 (M = 8) multicore systems. The multicore model is built upon an ARM Cortex platform. Each core supports three levels of frequency and voltage, i.e., 300 MHz/1.06 V, 600 MHz/1.1 V, and 900 MHz/1.2 V. We use solar energy as the renewable generation since it is the most accessible energy source, and it can be derived based on the harvesting power trace [4]

$$P_{harv}(t) = \left| \Lambda \times \Psi(t) \times \cos\left(\frac{t}{70\pi}\right) \times \cos\left(\frac{t}{100\pi}\right) \right|, \tag{28}$$

where Λ is a constant coefficient and Ψ(t) is a zero-mean, unit-variance random variable. The raw failure rate at the maximum frequency, λFmax, is set to 1.0 × 10⁻⁴ [28]. The task vulnerability factor, VFi, is randomly selected within the range (0, 1]. We use the same setups as in [34] (e.g., α = 2, a current density of 1.5 × 10⁶ A/cm², an activation energy of 0.48 eV, and temperatures derived using HotSpot [41]) to predict each core's aging rate Am and, in turn, its MTTF.
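A harvesting power trace following Eq. (28) can be generated with the short Python sketch below. Two choices are assumptions made for illustration only: Ψ(t) is drawn from a standard normal distribution (the text only requires zero mean and unit variance), and the coefficient Λ is set to 10.0 because its value is not stated here.

```python
import math
import random


def harvested_power(t: float, coeff: float, rng: random.Random) -> float:
    """Eq. (28): |coeff * Psi(t) * cos(t / (70*pi)) * cos(t / (100*pi))|."""
    psi = rng.gauss(0.0, 1.0)  # assumption: Psi(t) is standard normal
    envelope = math.cos(t / (70 * math.pi)) * math.cos(t / (100 * math.pi))
    return abs(coeff * psi * envelope)


if __name__ == "__main__":
    rng = random.Random(42)                   # fixed seed for a repeatable trace
    trace = [harvested_power(t, coeff=10.0, rng=rng) for t in range(1000)]
    print(min(trace) >= 0.0, len(trace))      # the trace is non-negative by construction
```

The two cosine factors act as a slowly varying envelope, so the trace contains periods of near-zero supply, which is the intermittency that the energy-state-aware scheduling has to cope with.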

TABLE 2. Characteristics of real-world DAGs [40].

DAG Benchmark     Number of Nodes   Number of Edges
CyberShake_30     30                112
CyberShake_50     50                188
CyberShake_100    100               380
Inspiral_30       30                95
Inspiral_50       50                160
Inspiral_100      100               319
Montage_25        25                95
Montage_50        50                206
Montage_100       100               433
Sipht_30          30                91
Sipht_60          60                198
Sipht_100         100               335

TABLE 3. System throughput of 12 benchmarks achieved by the MILP approach, the proposed scheme ESA, and baseline method Rand.

                  6-cores                                 8-cores
DAG Benchmark     Rand    MILP    MILP    ESA     ESA     Rand    MILP    MILP    ESA     ESA
                  Trusys  Trusys  IMP     Trusys  IMP     Trusys  Trusys  IMP     Trusys  IMP
CyberShake_30     18      26      44.4%   24      33.3%   20      28      40.0%   26      30.0%
CyberShake_50     28      43      53.6%   40      42.9%   31      46      48.4%   43      38.7%
CyberShake_100    65      88      35.4%   85      30.8%   70      91      30.0%   88      25.7%
Inspiral_30       16      27      68.8%   25      56.3%   19      28      47.4%   26      36.8%
Inspiral_50       29      45      55.2%   41      41.4%   33      47      42.4%   44      33.3%
Inspiral_100      58      89      53.4%   83      43.1%   65      92      41.5%   87      33.8%
Montage_25        12      22      83.3%   20      66.7%   15      24      60.0%   22      46.7%
Montage_50        29      46      58.6%   41      41.4%   32      48      50.0%   45      40.6%
Montage_100       63      91      44.4%   88      39.7%   69      93      34.8%   90      30.4%
Sipht_30          14      27      92.9%   23      64.3%   18      28      55.6%   25      38.9%
Sipht_60          32      55      71.9%   52      62.5%   38      57      50.0%   54      42.1%
Sipht_100         49      90      83.7%   85      73.5%   57      93      63.2%   89      56.1%
Average           34.4    54.1    62.1%   50.6    49.7%   38.9    56.3    46.9%   53.3    37.8%

We perform two separate sets of comparative experiments to fully verify the proposed MILP approach and ESA scheme. In the first set of experiments, we compare the proposed MILP approach with the baseline method Rand and our scheme ESA in terms of improving system throughput. Rand is a method that randomly determines the allocation and frequency of tasks under the energy supply constraint. In the second set of experiments, we compare the proposed scheme ESA with the benchmarking methods TATS [23], WARM [22], TMTC [21], and AFTS [42] in terms of increasing system throughput and ensuring the schedule feasibility. The comparative algorithms are described below.

• TATS [23] is a particle swarm optimization and simulated annealing based algorithm that exploits the temporal variation to schedule tasks of green data center for maximizing profit under task delay constraints.

• WARM [22] is a throughput-aware approach that maximizes the revenue of green data center providers by reducing the scheduling cost of all arrival tasks.

• TMTC [21] is a temperature-constrained optimization method that maximizes system throughput by minimizing the execution latency of tasks.

• AFTS [42] is a method that achieves energy efficiency and fault-tolerance simultaneously for real-time EHS using the techniques of DVFS and primary backup.

B. VALIDATE THE PROPOSED MILP APPROACH

We use a common MILP solver, CPLEX with AMPL, to solve the instances of the proposed MILP formulation for maximizing system throughput under the energy and reliability constraints. As discussed in Section IV-C, the MILP solver may not solve the optimization problem efficiently for larger systems. Generally, MILP can generate the optimum throughput for small systems. However, for most large systems, MILP may fail to find the optimum solutions within several hours. Therefore, we terminate the MILP solver after six hours and adopt the best results generated by the solver. Since the baseline method Rand does not consider the reliability and task precedence constraints, its solutions may violate these constraints; we remove such invalid results to conduct a fair comparison with the proposed MILP and ESA.

TABLE 3 compares the system throughput obtained by our MILP, ESA, and the baseline method Rand. In the comparison, twelve benchmarks and two multicore systems are used. In the table, the Trusys columns indicate the system throughput achieved by the three approaches, while the IMP columns indicate the improvement of system throughput realized by MILP and ESA over Rand. For the 6-core system, the average throughput improvements achieved by MILP and ESA over Rand are 62.1% and 49.7%, respectively. The highest improvements achieved by MILP and ESA over Rand are 92.9% and 73.5%, respectively. For the 8-core system, the average throughput improvements achieved by MILP and ESA over Rand are 46.9% and 37.8%, respectively. The highest improvements achieved by MILP and ESA over Rand are 63.2% and 56.1%, respectively. We also observe that MILP achieves the highest system throughput among the three approaches regardless of benchmarks and multicores. However, the MILP solver always takes hours to derive the solutions, while ESA and Rand take minutes.
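The IMP values can be reproduced from the Trusys columns. The sketch below assumes that IMP is the relative gain over Rand, (Trusys of the method minus Trusys of Rand) divided by Trusys of Rand; this formula is not stated explicitly in the text but matches the table entries, as the 6-core Sipht_30 row shows.

```python
def improvement(tru_method: float, tru_baseline: float) -> float:
    """Relative throughput gain over the baseline, in percent (assumed IMP definition)."""
    return (tru_method - tru_baseline) / tru_baseline * 100.0


if __name__ == "__main__":
    # 6-core Sipht_30 row of TABLE 3: Rand = 14, MILP = 27, ESA = 23.
    print(f"{improvement(27, 14):.1f}%")  # MILP over Rand, prints 92.9%
    print(f"{improvement(23, 14):.1f}%")  # ESA over Rand, prints 64.3%
```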

C. VALIDATE THE PROPOSED ESA APPROACH

Two simulation experiments are performed to verify the efficacy of the proposed ESA approach in terms of increasing system throughput and ensuring schedule feasibility. In the first experiment, we compare the system throughput achieved by the proposed scheme ESA and four peer approaches TATS [23], WARM [22], TMTC [21], and AFTS [42]. In the second experiment, we compare the schedule feasibility realized by ESA and the same four peer approaches. The schedule feasibility is defined as the ratio of the number of applications that can be successfully scheduled to the total number of applications adopted in the test.

[Figure 4: bar charts of system throughput (vertical axis from 0 to 100) for TATS, WARM, AFTS, TMTC, and ESA; panels (a) 6-cores and (b) 8-cores.]

FIGURE 4. The system throughput of executing 12 benchmarks using the proposed scheme ESA and peer approaches TATS [23], WARM [22], TMTC [21], and AFTS [42].

Fig. 4 presents the throughput of executing 12 benchmarks on 6-core and 8-core systems achieved by the proposed scheme ESA and peer approaches TATS [23], WARM [22], TMTC [21], and AFTS [42]. The results clearly show that ESA achieves the highest averaged system throughput among the five methods no matter which multicore system is adopted. Specifically, the averaged throughputs of the 6-core system achieved by ESA, TATS [23], WARM [22], TMTC [21], and AFTS [42] over the 12 benchmarks are 50.6, 40.3, 45.2, 40.8, and 38.3, respectively. The averaged throughputs of the 8-core system using ESA, TATS [23], WARM [22], TMTC [21], and AFTS [42] are 53.3, 40.4, 46.1, 43.3, and 37.6, respectively. The averaged system throughput using ESA can be up to 32.1% higher than those of TATS [23], WARM [22], TMTC [21], and AFTS [42]. ESA outperforms the four peer approaches because it considers the uncertainty in energy sources and determines the allocation and scheduling of tasks based on the system energy state. From the figure, we can also see that the throughput of the 8-core system is almost always higher than that of the 6-core system. This is because adding more cores benefits the scheduling of tasks in the benchmark.

[Figure 5: horizontal bar chart of feasibility (horizontal axis from 0.5 to 1) for AFTS, TMTC, ESA, WARM, and TATS, with one bar each for the 8-core and 6-core systems.]

FIGURE 5. The schedule feasibility of executing 12 benchmarks using the proposed scheme ESA and peer approaches TATS [23], WARM [22], TMTC [21], and AFTS [42].

Fig. 5 shows the schedule feasibility of executing 12 benchmarks on 6-core and 8-core systems using the proposed scheme ESA and peer approaches TATS [23], WARM [22], TMTC [21], and AFTS [42]. As can be seen in the figure, ESA ensures a much higher feasibility than TATS [23], WARM [22], TMTC [21], and AFTS [42]. For example, the feasibility values achieved by ESA, TATS [23], WARM [22], TMTC [21], and AFTS [42] for the 6-core system are 83.3%, 66.7%, 75.0%, 58.3%, and 66.7%, respectively. This is because none of TATS [23], WARM [22], TMTC [21], and AFTS [42] considers the energy and reliability constraints simultaneously. In addition, the feasibility of executing benchmarks on the 8-core system using the five methods is almost always higher than that on the 6-core system, which benefits from the larger scheduling space brought by the increased core number.
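As a small worked example of the feasibility metric defined above, the following Python sketch computes the ratio from per-application outcomes. The outcome list is hypothetical: 10 successes out of 12 applications is merely one combination consistent with the 83.3% reported for ESA on the 6-core system, since the per-benchmark outcomes are not listed.

```python
from typing import Iterable


def schedule_feasibility(outcomes: Iterable[bool]) -> float:
    """Ratio of successfully scheduled applications to all tested applications."""
    flags = list(outcomes)
    return sum(flags) / len(flags)


if __name__ == "__main__":
    # Hypothetical outcome: 10 of 12 applications scheduled without violating a constraint.
    outcomes = [True] * 10 + [False] * 2
    print(f"{schedule_feasibility(outcomes):.1%}")  # prints 83.3%
```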

VII. CONCLUSION

In this paper, we have solved the task allocation and scheduling problem for maximizing the throughput of multicore energy-harvesting systems. To optimize the throughput of systems powered by intermittent renewable energy, as well as to satisfy the reliability and task precedence constraints, we designed an MILP approach and an energy state-aware approach. The MILP approach is able to find the optimal solution but may take exponential time to finish, while the energy state-aware approach is a polynomial-time heuristic that derives sub-optimal solutions efficiently. We performed extensive simulations to validate the proposed MILP and energy state-aware approaches. Simulation results show that the proposed MILP approach has the best performance in increasing system throughput. The system throughput can be increased by up to 92.9% using the MILP approach as compared to a baseline method. The evaluation results also demonstrate that the proposed heuristic algorithm outperforms four benchmarking methods from the perspectives of system throughput and schedule feasibility.

ACKNOWLEDGMENT

The authors would like to thank the anonymous reviewers for their helpful suggestions.

REFERENCES

[1] K. Lampka, K. Huang, and J. Chen, "Dynamic counters and the efficient and effective online power management of embedded real-time systems," The Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis, pp. 267-276, 2011.

[2] G. Xie, et al., "Reliability enhancement toward functional safety goal assurance in energy-aware automotive cyber-physical systems," IEEE Trans. Industrial Informatics, vol. 14, no. 12, pp. 5447-5462, 2018.

[3] G. Xie, J. Huang, Y. Liu, R. Li, and K. Li, "System-level energy-aware design methodology towards end-to-end response time optimization," IEEE Trans. Computer-Aided Design of Integrated Circuits and Systems, in press, 2019. DOI: 10.1109/TCAD.2019.2921350

[4] J. Chen, T. Wei, and J. Liang, "State-aware dynamic frequency selection scheme for energy harvesting real-time systems," IEEE Trans. Very Large Scale Integration (VLSI) Systems, vol. 22, no. 8, pp. 1679-1692, 2014.

[5] J. Zhou, J. Chen, K. Cao, T. Wei, and M. Chen, "Game theoretic energy allocation for renewable powered in-situ server systems," The Proceedings of the International Conference on Parallel and Distributed Systems, pp. 721-728, 2016.

[6] "Offshore oil and gas supply," Working Document of the National Petroleum Council, 2011.

[7] "How big data is changing astronomy (again)," The Atlantic, 2012.

[8] "The changing geospatial landscape," A Report of the National Geospatial Advisory Committee, 2009.

[9] W. Cox, M. Pruett, T. Benson, S. Chiavacci, and F. Thompson III, "Development of camera technology for monitoring nests," USGS Northern Prairie Wildlife Research Center, 2012.

[10] C. Li, Y. Hu, L. Liu, J. Gu, M. Song, X. Liang, J. Yuan, and T. Li, "Towards sustainable in-situ server systems in the big data era," The Proceedings of the International Symposium on Computer Architecture, pp. 14-26, 2015.

[11] T. Esram and P. Chapman, "Comparison of photovoltaic array maximum power point tracking techniques," IEEE Trans. Energy Conversion, vol. 22, no. 2, pp. 439-449, 2007.

[12] K. Kobayashi, I. Takano, and Y. Sawada, "A study on a two stage maximum power point tracking control of a photovoltaic system under partially shaded insolation conditions," Solar Energy Materials and Solar Cells, vol. 90, no. 18-19, pp. 2975-2988, 2006.

[13] V. Sharma, U. Mukherji, V. Joseph, and S. Gupta, "Optimal energy management policies for energy harvesting sensor nodes," IEEE Trans. Wireless Communication, vol. 9, no. 4, pp. 1326-1336, 2010.

[14] Y. Abdeddaim and D. Masson, "Real-time scheduling of energy harvesting embedded systems with timed automata," The Proceedings of the International Conference on Embedded and Real-Time Computing Systems and Applications, pp. 31-40, 2012.

[15] D. R. Liu, Y. C. Xu, Q. L. Wei, and X. L. Liu, "Residential energy scheduling for variable weather solar energy based on adaptive dynamic programming," IEEE/CAA J. of Autom. Sinica, vol. 5, no. 1, pp. 36-46, Jan. 2018.

[16] K. Tutuncuoglu, B. Varan, and A. Yener, "Throughput maximization for two-way relay channels with energy harvesting nodes: the impact of relaying strategies," IEEE Trans. Communications, vol. 63, no. 6, pp. 2081-2093, 2015.

[17] S. Gupta, R. Zhang, and L. Hanzo, "Throughput maximization for a buffer-aided successive relaying network employing energy harvesting," IEEE Trans. Vehicular Technology, vol. 65, no. 8, pp. 6758-6765, 2016.

[18] A. Mehrabi and K. Kim, "General framework for network throughput maximization in sink-based energy harvesting wireless sensor networks," IEEE Trans. Mobile Computing, vol. 16, no. 7, pp. 1881-1896, 2017.

[19] W. Wu, J. Wang, X. Wang, F. Shan, and J. Luo, "Online throughput maximization for energy harvesting communication systems with battery overflow," IEEE Trans. Mobile Computing, vol. 16, no. 1, pp. 185-197, 2017.

[20] G. Z. Wang, Q. Y. Bian, H. H. Xin, and Z. Wang, "A robust reserve scheduling method considering asymmetrical wind power distribution," IEEE/CAA J. of Autom. Sinica, vol. 5, no. 5, pp. 961-967, Sep. 2018.

[21] H. Huang, V. Chaturbedi, G. Quan, J. Fan, and M. Qiu, "Throughput maximization for periodic real-time systems under the maximal temperature constraint," ACM Trans. Embedded Computing Systems, vol. 13, no. 2s, 2014.

[22] H. Yuan, J. Bi, M. Zhou, and K. Sedraoui, "WARM: workload-aware multi-application task scheduling for revenue maximization in SDN-based cloud data center," IEEE Access, vol. 6, pp. 645-657, 2018.

[23] H. Yuan, J. Bi, M. Zhou, and A. C. Ammari, "Time-aware multi-application task scheduling with guaranteed delay constraints in green data center," IEEE Trans. Automation Science and Engineering, vol. 15, no. 3, pp. 1138-1151, July 2018.

[24] H. Yuan, J. Bi, and M. Zhou, "Spatial task scheduling for cost minimization in distributed green cloud data centers," IEEE Trans. Automation Science and Engineering, vol. 16, no. 2, pp. 729-740, April 2019.

[25] J. Zhou, X. S. Hu, Y. Ma, and T. Wei, "Balancing lifetime and soft-error reliability to improve system availability," The Proceedings of the Asia and South Pacific Design Automation Conference, pp. 685-690, 2016.

[26] Y. Ma, T. Chantem, R. P. Dick, S. Wang, and X. S. Hu, "An on-line framework for improving reliability of real-time systems on Big-Little type MPSoCs," The Proceedings of the Design, Automation and Test in Europe Conference, pp. 446-451, 2017.

[27] T. Kim, Z. Sun, H. Chen, H. Wang, and S. X. D. Tan, "Energy and lifetime optimization for dark silicon manycore microprocessor considering both hard and soft errors," IEEE Trans. Very Large Scale Integration (VLSI) Systems, vol. 25, no. 9, pp. 2561-2574, 2017.

[28] J. Zhou, T. Wei, M. Chen, X. S. Hu, Y. Ma, G. Zhang, and J. Yan, "Variation-aware task allocation and scheduling for improving reliability of real-time MPSoCs," The Proceedings of the Design, Automation and Test in Europe Conference, pp. 171-176, 2018.

[29] J. Zhou, X. Zhou, J. Sun, T. Wei, M. Chen, S. Hu, and X. S. Hu, "Resource management for improving soft-error and lifetime reliability of real-time MPSoCs," IEEE Trans. Computer-Aided Design of Integrated Circuits and Systems, in press, 2019. DOI: 10.1109/TCAD.2018.2883993

[30] G. Xie, Y. Chen, R. Li, and K. Li, "Hardware cost design optimization for functional safety-critical parallel applications on heterogeneous distributed embedded systems," IEEE Trans. Industrial Informatics, vol. 14, no. 6, pp. 2418-2431, 2018.

[31] G. Xie, et al., "Minimizing redundancy to satisfy reliability requirement for a parallel application on heterogeneous service-oriented systems," IEEE Trans. Service Computing, in press, 2019. DOI: 10.1109/TSC.2017.2665552

[32] D. Zhu, R. Melhem, and D. Mosse, "The effects of energy management on reliability in real-time embedded systems," The Proceedings of the International Conference on Control, Automation and Diagnosis, pp. 35-40, 2004.

[33] J. Srinivasan, S. Adve, P. Bose, and J. Rivers, "The impact of technology scaling on lifetime reliability," The Proceedings of the International Conference on Dependable Systems and Networks, pp. 177-186, 2004.

[34] L. Huang, F. Yuan, and Q. Xu, "Lifetime reliability-aware task allocation and scheduling on MPSoC platform," The Proceedings of the Design, Automation and Test in Europe Conference, pp. 51-56, 2009.

[35] T. Chantem, Y. Xiang, X. S. Hu, and R. P. Dick, "Enhancing multicore reliability through wear compensation in online assignment and scheduling," The Proceedings of the Design, Automation and Test in Europe Conference, pp. 1373-1378, 2013.

[36] H. Topcuoglu, S. Hariri, and M. Wu, "Performance-effective and low-complexity task scheduling for heterogeneous computing," IEEE Trans. Parallel and Distributed Systems, vol. 13, no. 3, pp. 260-274, 2002.

[37] Y. Ma, J. Zhou, T. Chantem, R. P. Dick, S. Wang, and X. S. Hu, "On-line resource management for improving reliability of real-time systems on Big-Little type MPSoCs," IEEE Trans. Computer-Aided Design of Integrated Circuits and Systems, in press, 2019. DOI: 10.1109/TCAD.2018.2883990

[38] R. Rubinstein and D. Kroese, "The cross entropy method: a unified approach to combinatorial optimization, monte-carlo simulation, and machine learning," Annals of Operations Research, 2004.

[39] X. Chen, Y. Chen, A. Y. Zomaya, R. Ranjan, and S. Hu, "CEVP: cross entropy based virtual machine placement for energy optimization in clouds," Journal of Supercomputing, vol. 72, no. 8, pp. 3194-3209, 2016.

[40] G. Juve, A. Chervenak, E. Deelman, S. Bharathi, G. Mehta, and K. Vahi, "Characterizing and profiling scientific workflows," Future Generation Computing Systems, vol. 29, no. 3, pp. 682-692, 2013.

[41] K. Skadron et al., "Temperature-aware microarchitecture: modeling and implementation," ACM Trans. Architecture and Code Optimization, vol. 1, no. 1, pp. 94-125, 2004.

[42] L. Zhu, T. Wei, X. Chen, Y. Guo, and S. Hu, "Adaptive fault-tolerant task scheduling for real-time energy harvesting systems," Journal of Circuits, Systems, and Computers, vol. 21, no. 1, 2012.

JUNLONG ZHOU (S’15-M’17) received the Ph.D. degree in Computer Science from East China Normal University, Shanghai, China, in 2017. He was a Visiting Scholar with the University of Notre Dame, Notre Dame, IN, USA, during 2014-2015. He is currently an Assistant Professor with the School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China. His research interests include real-time embedded systems, cloud computing, and cyber physical systems. Dr. Zhou has published more than 40 refereed papers in his research areas, most of which are published in premium conferences and journals including IEEE/ACM DATE, IEEE TPDS, IEEE TCAD, IEEE TCAS, IEEE TR, IEEE TAES, Elsevier JSS, Elsevier JSA, Elsevier FGCS, etc. He received the Reviewer Award from Journal of Circuits, Systems, and Computers in 2016. He has served as publication chair, publicity chair, section chair, and TPC member for numerous conferences. He has been an Associate Editor for the Journal of Circuits, Systems, and Computers, and serves as a Guest Editor for several special issues of the ACM Transactions on Cyber-Physical Systems, IET Cyber-Physical Systems: Theory & Applications, and Journal of Systems Architecture: Embedded Software Design.

PEIJIN CONG received the B.S. degree from the Department of Computer Science and Technology, East China Normal University, Shanghai, China, in 2016. She is currently pursuing the Ph.D. degree with the Department of Computer Science and Technology, East China Normal University, Shanghai, China. Her current research interests are in the areas of cloud computing and Internet of Things. She has published 10 papers in premium journals such as the IEEE Transactions on Parallel and Distributed Systems, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on Sustainable Computing, and IEEE Communications Surveys and Tutorials.

JIN SUN (M’17) received the B.S. and M.S. degrees in computer science from Nanjing University of Science and Technology, Nanjing, China, in 2004 and 2006, respectively, and the Ph.D. degree in electrical and computer engineering from the University of Arizona in 2011. Between January 2012 and September 2014, he was with Orora Design Technologies, Inc. as a member of technical staff. He is currently an Associate Professor with the School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China. His research interests include integrated circuit modeling and analysis and computer-aided design. Dr. Sun has been an Associate Editor for the Journal of Circuits, Systems, and Computers since 2018. He is a member of the IEEE.

XIUMIN ZHOU received the B.S. degree in computer science and engineering from Nanjing University of Science and Technology, Nanjing, China, in 2012. He is currently pursuing the Ph.D. degree with Nanjing University of Science and Technology. His current research interests are in the areas of embedded systems, cloud computing, and distributed systems.

TONGQUAN WEI (M’11-SM’19) received the Ph.D. degree in electrical engineering from Michigan Technological University, Houghton, MI, USA, in 2009. He is currently an Associate Professor with the Department of Computer Science and Technology, East China Normal University, Shanghai, China. His current research interests include real-time embedded systems, green and reliable computing, parallel and distributed systems, and cloud computing. Dr. Wei has published over 60 papers in these areas, most of which are published in premium conferences and journals including IEEE/ACM ICCAD, IEEE/ACM DATE, IEEE TPDS, IEEE TC, IEEE TCAD, IEEE TVLSI, IEEE TCAS, IEEE TSG, IEEE TDSC, IEEE TAES, IEEE TRL, IEEE TSUSC, IEEE COMST, etc. He has been a Regional Editor for the Journal of Circuits, Systems, and Computers since 2012. He served as a Guest Editor for several special sections of IEEE Transactions on Industrial Informatics, ACM Transactions on Embedded Computing Systems, and ACM Transactions on Cyber-Physical Systems. He is a senior member of the IEEE.





MINGSONG CHEN (S’08-M’11-SM’17) received the B.S. and M.E. degrees from the Department of Computer Science and Technology, Nanjing University, Nanjing, China, in 2003 and 2006, respectively, and the Ph.D. degree in Computer Engineering from the University of Florida, Gainesville, FL, USA, in 2010. He is currently a full Professor with the Department of Embedded Software and Systems, East China Normal University, Shanghai, China. His current research interests include cyber-physical systems, formal verification, and mobile cloud computing. Dr. Chen has published over 70 papers in these areas, most of which are published in premium conferences and journals including ICCAD, DAC, DATE, ICSE, CODES+ISSS, IEEE TPDS, IEEE TC, IEEE TCAD, IEEE TCC, IEEE TSC, IEEE TRL, ACM TODAES, etc. He has served as TPC chair, publication chair, publicity chair, section chair, and TPC member for numerous conferences. He is currently an Associate Editor of IET Computers and Digital Techniques and the Journal of Circuits, Systems and Computers. He serves as a Guest Editor for a special section of the IEEE Transactions on Reliability. He is a senior member of the IEEE.
