
MANAGEMENT OF ELECTRICAL GRIDS WITH STORAGE AND FLEXIBLE LOADS UNDER

HIGH-PENETRATION RENEWABLES

by

Sun Sun

A thesis submitted in conformity with the requirements

for the degree of Doctor of Philosophy

Graduate Department of Electrical and Computer Engineering

University of Toronto

© Copyright 2016 by Sun Sun

Abstract

Management of Electrical Grids with Storage and Flexible Loads under High-Penetration Renewables

Sun Sun

Doctor of Philosophy

Graduate Department of Electrical and Computer Engineering

University of Toronto

2016

With growing concerns about the environment and energy independence, more and more renewable energy

resources such as wind and solar are expected to be integrated into the future power grid. Due to the intermittence

and limited dispatchability of renewable generation, its large-scale integration could upset the balance between

supply and demand, and affect grid reliability. To maintain grid reliability, traditional approaches include adding

more operating reserves such as fast-responsive generators, which in turn increases cost and

discounts the environmental benefits of renewable generation. To combat the intermittence of renewable generation,

this thesis considers an alternative solution that leverages the flexibility of energy storage and loads

in grid-wide services.

With the assistance of advanced “smart grid” technologies (e.g., information technology, control, and economics),

the general objective of this thesis is to facilitate large-scale renewable integration so as to improve

the long-term performance of power grids (e.g., reliability, social welfare, or cost effectiveness). In achieving

this goal, several challenges are encountered, such as system uncertainty, coupling of system operational constraints,

and the large scale of power grids. Compared with previous works, this work builds on more complete

system models that can accommodate a wide spectrum of vital characteristics of a power system. We explicitly

incorporate system uncertainty (e.g., uncertainty of renewable generation, electricity price, and loads) into the

problem formulation. For the control of energy storage and loads, we provide centralized algorithms that are easy

to implement in practice and at the same time ensure strong analytical performance. Furthermore, we propose

distributed implementations of the centralized algorithms, which converge fast and require only limited information

exchange.


Acknowledgements

I would like to thank my supervisors Prof. Ben Liang and Prof. Min Dong, who led me to this exciting field.

This thesis would not have been possible without their encouragement and guidance. I would like to thank my

husband, Dr. Yaoliang Yu, for his love and support. His great interest in science inspires me all the time. I am

also indebted to my parents, Shumin Sun and Juan Wang, for their unconditional love.


Contents

1 Introduction
1.1 Energy Storage in Renewable Integration
1.2 Flexible Loads in Renewable Integration
1.3 Literature Review
1.3.1 Real-Time Power Balancing
1.3.2 Real-Time Phase Balancing
1.3.3 Real-Time Energy Management
1.4 Thesis Contributions
1.5 Thesis Outline

2 Real-Time Power Balancing with Static Storage
2.1 System Model and Problem Statement
2.1.1 Power Balancing and Aggregator-DS System
2.1.2 Problem Statement
2.2 Centralized Real-Time Algorithm
2.2.1 Problem Relaxation
2.2.2 Virtual Queue Design
2.2.3 Centralized Algorithm
2.2.4 Performance Analysis
2.3 Distributed Implementation
2.3.1 Lagrange Dual Decomposition
2.3.2 Dual Maximization with FISTA and Convergence Analysis
2.3.3 Price Signaling $p_{c,t}$
2.3.4 Main Algorithm of Distributed Implementation
2.4 Simulation Results
2.4.1 Simulation Setup
2.4.2 Convergence of Distributed Algorithm
2.4.3 Comparison with Greedy Algorithm
2.5 Summary
2.6 Appendices
2.6.1 Proof of Proposition 2.1
2.6.2 Proof of Lemma 2.2
2.6.3 Proof of Lemma 2.3
2.6.4 Proof of Theorem 2.1
2.6.5 Proof of Lemma 2.5
2.6.6 Proof of Theorem 2.3

3 Real-Time Power Balancing with Dynamic Storage
3.1 System Model and Problem Formulation
3.1.1 Regulation Service and Aggregator-EV System
3.1.2 Fair Regulation Allocation through Welfare Maximization
3.2 Welfare-Maximizing Regulation Allocation
3.2.1 Problem Transformation
3.2.2 Problem Relaxation
3.2.3 WMRA Algorithm
3.3 Performance Analysis
3.3.1 Properties of WMRA Algorithm
3.3.2 Optimality of WMRA Algorithm
3.5 Summary
3.6 Appendices
3.6.1 Proof of Lemma 3.1
3.6.2 Proof of Lemma 3.2
3.6.4 Proof of Lemma 3.4
3.6.5 Proof of Lemma 3.5
3.6.6 Proof of Theorem 3.1

4 Real-Time Phase Balancing with Energy Storage
4.1.1 System Model of Each Phase
4.2 Real-Time Algorithm for Ideal Energy Storage
4.2.1 Centralized Real-Time Algorithm and Analysis
4.2.2 Distributed Implementation
4.3 Extension to Non-ideal Energy Storage
4.4.1 Effect of Correlations of Uncontrollable Power Flows
4.4.2 Effect of Energy Storage Capacity
4.4.3 Effect of Charging and Discharging Circuit Parameters
4.4.4 Effect of Other System Parameters
4.5 Summary
4.6 Appendices
4.6.1 Proof of Relaxation from P1 to P2
4.6.2 An Upper Bound of the Drift-Plus-Cost Function
4.6.4 Proof of Theorem 4.1
4.6.6 Proof of Theorem 4.3

5 Real-Time Energy Management with Storage and Flexible Loads
5.1.1 System Model
5.2 Real-Time Algorithm for Power Balancing
5.2.1 Description of Real-Time Algorithm
5.2.2 Performance Analysis
5.2.3 Discussion on Multiple CGs
5.3 Distributed Implementation of Real-Time Algorithm
5.3.1 Distributed Algorithm Design
5.3.2 Distributed Implementation
5.4.1 Simulation Setup
5.4.2 Benchmark Algorithms
5.4.3 Comparison under Parameters $V$ and $\alpha$
5.4.4 Effect of Ramping Constraint
5.4.5 Convergence of Distributed Implementation
5.5 Summary
5.6 Appendices
5.6.1 Proof of Relaxation from P1 to P2
5.6.2 Upper Bound on Drift-Plus-Cost Function
5.6.4 Proof of Theorem 5.1
5.6.7 Simplification of (5.17)-(5.19)

6 Conclusion and Future Work

Bibliography


List of Tables

1.1 Comparison with existing works
2.1 Number of iterations for $|g_t - \sum_{i=1}^{N} x_i^k - q^k| < 0.01$.
3.1 Parameters of Type I and Type II EVs
4.1 Default setup of parameters and functions


List of Figures

2.1 Schematic representation of a local power grid.
2.2 Time-averaged system cost vs. number of DS units under various $s_{i,\max}$.
2.3 Normalized time-averaged system cost vs. $g_{\max}/\sum_{i=1}^{150} r_{i,\max}$ under various $\Delta t$.
3.1 Transition probabilities of $\mathbf{1}_{i,t}$, $\forall i$.
3.2 Time-averaged social welfare with $V = V_{\max}$.
3.3 Time-averaged social welfare with various $s_{i,\max}$ and $V = V_{\max}$.
3.4 Time-averaged social welfare with various values of $V$.
3.5 Sample path of a Type I EV’s energy state with $V = [1, 2, 5]V_{\max}$.
4.1 System model with $N$ phases. The details of the $i$-th phase are shown.
4.2 Distributed implementation for solving P3.
4.3 System cost vs. phase correlation coefficient: Case 1.
4.4 System cost vs. phase correlation coefficient: Case 2.
4.5 System cost vs. time correlation coefficient.
4.6 System cost vs. energy capacity $s_{i,\max}$.
4.7 System cost vs. energy capacity $s_{1,\max}$.
4.8 System cost vs. round-trip efficiency.
4.9 System cost vs. maximum charging and discharging rate $u_{i,\max}$.
4.10 Power flows vs. time slots.
4.11 System cost vs. number of phases.
5.1 Schematic representation of the considered power grid.
5.2 Information flow of distributed implementation.
5.3 System cost vs. control parameter $V$.
5.4 System cost vs. portion of unsatisfied flexible loads $\alpha$.
5.5 System cost vs. ramping coefficient $r$ (small loads).
5.6 System cost vs. ramping coefficient $r$ (large loads).
5.7 Performance gap vs. number of iterations for distributed algorithm.


List of Abbreviations

ADMM alternating direction method of multipliers

CG conventional generator

DC direct current

DLC direct load control

DP dynamic pricing

DS distributed storage

EV electric vehicle

FISTA fast iterative shrinkage-thresholding algorithm

MDP Markov decision process

RG renewable generator

TCL thermostatically controlled load


Chapter 1

Introduction

With growing concerns about the environment and energy independence from fossil fuels, more and more renewable

energy resources such as wind and solar are expected to be integrated into the future power grid. For

example, the European Commission aims to include 20% renewable energy resources in the European Union

energy profile by 2020 [1]; California plans to achieve 33% of retail sales from renewable energy by 2020 [2];

and the Danish government is even more ambitious, setting a goal of 100% renewable energy in 2050 and thus

independence from fossil fuels [3]. As an important example, microgrids have been envisioned to contain a

large amount of renewable generation and are treated as a part of the basic structure of the future power grid.

Unlike fuel combustion, renewable generation is intermittent and has limited dispatchability. According to [4], the

variability caused by small-scale integration of renewable generation can be absorbed in a current power system

by improving operational practices. However, a direct integration of large-scale renewable generation may create

a noticeable imbalance in a power system, and therefore severely jeopardize grid reliability. To maintain the power

system stability, a traditional approach is to add more operating reserves [4]. For example, to fulfill California’s

33% renewable goal in 2020, studies show that the required maximum regulation up (resp. down) capacity

would be more than 4 times (resp. twice) that in 2006 [5]. If these reserves were provided by fast-responsive

conventional generators (CGs) such as natural gas and hydroelectric generators, not only would the CG efficiency

be significantly reduced, but also the net environmental contribution of renewable generation would be largely

discounted. Therefore, to deepen the integration of renewable generation, alternative solutions that are both

economically and environmentally efficient are highly desirable.

Among all alternative approaches for facilitating large-scale renewable integration, control of energy storage

and flexible loads is promising because both can shift energy across time or location. Through intelligent

control, they can be employed in various grid-wide services (e.g., energy arbitrage, power balancing, and

load following) to combat the variability of renewable generation. Below, the background on energy storage and

flexible loads, along with their current uses in power systems, is reviewed.

1.1 Energy Storage in Renewable Integration

Energy storage has been employed widely in power systems for various applications. Examples of these applications

include short-term power balancing services such as frequency regulation [6], and long-term services such

as maintaining a certain level of reserve capacity [7]. There are many types of storage with a wide range of

storage characteristics: pumped hydro storage, compressed air energy storage, thermal energy storage, batteries,


flywheels, capacitors, and superconducting magnetic energy storage [4]. The charging and discharging capabilities

enable a storage unit to shift energy across time. Specifically, a storage unit can absorb extra energy in the case

of energy surplus (e.g., sudden extra supply of renewable generation due to unexpected weather), or contribute

energy in the case of energy deficit (e.g., sudden increase of energy demand). Moreover, for a power network with

multiple nodes of storage installation, it is also possible for storage to shift energy across location through power

transmission lines connecting these nodes. Energy storage can be deployed at the supply side or the demand side.

In particular, with more and more renewable generation, it is suggested that energy storage can be co-located with

renewable generators so as to mitigate the uncertainty of renewable generation [8]. On the other hand, storage can

also be co-located with the demand side to minimize the cost of users [9].
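
The time-shifting behaviour described above can be sketched in a few lines of Python. This is only an illustrative greedy rule under assumed names (`greedy_storage_step`, `simulate`, `net_surplus`, `capacity`, `max_rate` are all hypothetical), with an ideal lossless storage unit; it is not one of the control algorithms developed in this thesis.

```python
def greedy_storage_step(energy, net_surplus, capacity, max_rate):
    """Return (new_energy, flow) for one time slot.

    flow > 0 means charging, flow < 0 means discharging.
    net_surplus = supply - demand in the slot (assumed energy units).
    The storage unit is assumed ideal (no conversion losses).
    """
    if net_surplus >= 0:
        # Energy surplus: charge, limited by rate and remaining headroom.
        flow = min(net_surplus, max_rate, capacity - energy)
    else:
        # Energy deficit: discharge, limited by rate and stored energy.
        flow = -min(-net_surplus, max_rate, energy)
    return energy + flow, flow


def simulate(net_surplus_profile, capacity, max_rate, initial_energy=0.0):
    """Apply the greedy rule slot by slot and report the residual imbalance."""
    energy = initial_energy
    residual = []
    for surplus in net_surplus_profile:
        energy, flow = greedy_storage_step(energy, surplus, capacity, max_rate)
        residual.append(surplus - flow)  # what storage could not absorb or cover
    return energy, residual
```

For instance, `simulate([3, 2, -4, -2], capacity=4, max_rate=3)` charges during the two surplus slots and discharges during the two deficit slots, so part of the early surplus is moved to the later deficit.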

Today, the majority of energy storage is large pumped hydro storage located far away from load centers

[10]. With a high penetration of solar arrays installed on users’ rooftops, and a growing residential electricity

consumption from computers, air conditioners, and electric vehicles (EVs), it is expected that more and more

small-size distributed storage (DS) units will be deployed near communities. Examples of such DS units are

batteries in EVs and renewable generators (RGs). Based on the data published in [11], cumulative U.S. sales of plug-in vehicles

since December 2010 had reached 180,000 by February 2014, and they keep rising. Moreover, with a significant

growth of distributed photovoltaics, the number of battery-backed solar systems will increase accordingly [12].

Hence, it is believed that there will be a large number of such DS units in the near future, and they will play

important roles in the future grid operation. However, an individual DS unit is usually too small for

grid-wide services. For example, the typical power capacity of an EV is 5-20 kW, whereas frequency

regulation service often requires capacity on the order of megawatts. Therefore, it is often necessary to coordinate a

large number of DS units for grid-wide services. For control purposes, these DS units can be managed by an

electric utility or a “third-party” aggregator who serves as an intermediary between DS units and a utility.
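
To make the scale gap concrete, the short sketch below counts how many identical units an aggregator would need for a given service size. The 1 MW figure is an assumed round number standing in for "on the order of megawatts", and the function name `units_needed` is hypothetical.

```python
import math


def units_needed(service_kw, unit_kw):
    """Smallest number of identical units whose combined power covers the service."""
    return math.ceil(service_kw / unit_kw)


# Assumed 1 MW regulation requirement, with the 5-20 kW EV range quoted above.
low_power_fleet = units_needed(1000, 5)    # EVs rated at 5 kW each
high_power_fleet = units_needed(1000, 20)  # EVs rated at 20 kW each
```

Under these assumptions, a single megawatt of regulation capacity already calls for tens to hundreds of simultaneously available EVs, which is why coordination through an aggregator is considered.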

The optimal control of energy storage in power systems is generally a challenging problem due to storage

characteristics as well as system uncertainty. There are many existing works on storage control in the context

of renewable generation, using different mathematical approaches. To the best of our knowledge, the common

approaches used in the literature for storage control are as follows:

• dynamic programming [13],

• model predictive control [14],

• the linear-quadratic regulator [15], and

• Lyapunov optimization [16].

For example, using stochastic dynamic programming, the authors of [6] propose a stationary optimal policy for

power balancing, and the authors of [17] investigate both optimal and suboptimal policies for energy balancing.

Nevertheless, the derivation of an optimal policy under dynamic programming generally relies on system statistics

and some specific form of the problem structure, and therefore cannot be easily extended. In [18], the authors

employ model predictive control. However, the algorithm performance can only be evaluated through numerical

examples. In [19], the authors propose a control strategy based on the linear-quadratic regulator approach. This

approach applies when the system dynamics are described by a set of linear differential equations and the objective

function is quadratic. Under this approach, obtaining the optimal control action analytically is generally hard and

requires system statistics.

Besides the above three approaches, other works employ Lyapunov optimization for storage

control in power systems. For example, the authors of [20] consider power balancing, the authors of [21]


study demand side management, and the authors of [22] investigate the management of networked storage with a

direct current (DC) power flow model. Lyapunov optimization has been used widely in wireless communications

for solving stochastic optimization problems. Under the standard framework of Lyapunov optimization, time-averaged

constraints can be transformed into queue stability constraints, and efficient real-time algorithms can

be developed for complex dynamic systems without the need for system statistics. However, the standard framework

of Lyapunov optimization cannot address general stochastic optimization problems. Therefore, in applying

Lyapunov optimization to these problems, the main technical challenge is to model the design problem in a form

that is amenable to Lyapunov optimization and to derive an analytical relationship between control parameters and

system performance.
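
The idea of turning a time-averaged constraint into a queue stability constraint can be illustrated with a minimal sketch of the generic drift-plus-penalty rule. The toy problem is an assumption made for illustration only: in each slot an amount `x` between 0 and `x_max` is purchased at the current price, and the time-averaged purchase must be at least `demand_rate`. All names are hypothetical, and this is the textbook form of the technique rather than any algorithm developed in this thesis.

```python
def drift_plus_penalty(prices, demand_rate, x_max, V):
    """Generic drift-plus-penalty rule for a toy purchasing problem.

    Constraint (assumed): time average of x(t) >= demand_rate.
    A virtual queue accumulates the shortfall demand_rate - x(t);
    keeping the queue stable enforces the time-averaged constraint.
    V trades off cost (large V) against constraint backlog (small V).
    """
    queue = 0.0
    decisions = []
    for price in prices:
        # Per-slot problem: minimize V*price*x + queue*(demand_rate - x)
        # over 0 <= x <= x_max. The objective is linear in x, so the
        # minimizer is x_max when queue > V*price and 0 otherwise.
        x = x_max if queue > V * price else 0.0
        decisions.append(x)
        # Virtual queue update: backlog grows with the unmet average demand.
        queue = max(queue + demand_rate - x, 0.0)
    return decisions, queue
```

Note that the rule uses only the current price and the current queue length, so no price statistics are needed. With alternating prices such as `[1.0, 3.0, 1.0, 3.0]`, purchases are deferred until the backlog outweighs the scaled price, which here happens in a low-price slot.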

When there are a large number of energy storage units in the system, storage control can be performed in either a

centralized or a distributed way. Both of these control strategies have been studied in the literature. For example,

with the objective of maximizing the profit of the grid operator or DS units, the authors consider centralized

control in [23–27]. In [28, 29], the authors study distributed control in a static system.

The current practice for storage control mainly focuses on centralized control. With centralized control,

charging and discharging commands of all storage units are sent by the grid operator, which can be efficient

in terms of achieving grid-wide objectives. The communication between the grid and each storage unit can

be accomplished by various communication platforms, such as broadband Internet connections and advanced

metering infrastructure which have been deployed widely [30]. However, it should be noted that, in practice,

storage units may be owned by different and self-interested users rather than the grid operator. This is the case

especially when more and more DS units are employed in grid-wide services. Therefore, implementation of

centralized control may incur the following issues. First, to enable centralized control, the operator requires

all information of storage, including a wide range of storage characteristics, the degradation cost function, and

the energy state at each time slot. This may violate the privacy of storage owners. Second, the large amount

of required information demands a lot of communication bandwidth, which may be constrained under certain

circumstances. Furthermore, the computational complexity of centralized control may increase drastically as the

number of the enrolled storage units grows.

In contrast, with distributed control, the grid operator does not directly control the charging or discharging operation

of each storage unit. Instead, the owner of each storage unit can determine its charging and discharging

amounts with possibly limited disclosure of private information to the grid operator. Also, the computational burden

of storage control is distributed over all storage units, in that the grid operator and each storage unit solve

a small-scale optimization problem. However, a potential drawback of distributed control is that the distributed

algorithm may converge slowly if it is not well designed. This implies that, by distributed control, it may take

a while for each storage unit to obtain the resultant charging and discharging amounts at each time slot, which

is certainly undesirable especially for short-term grid services. Another design challenge regarding distributed

control is to align the goal of individual storage owners with that of the system.
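
One common way to realize such distributed control is a price-based iteration, sketched below under simplifying assumptions: each unit has a private quadratic cost `cost_coeff * x**2`, the operator wants the units' amounts to sum to `target`, and the operator broadcasts a price that it adjusts by plain gradient ascent. The names and the quadratic cost are hypothetical, and this is not the distributed algorithm developed in later chapters.

```python
def unit_response(price, cost_coeff, x_max):
    """Owner-side step: minimize cost_coeff*x**2 - price*x using local data only."""
    x = price / (2.0 * cost_coeff)
    return max(-x_max, min(x_max, x))  # respect the unit's own rate limit


def coordinate(target, cost_coeffs, x_max, step, iterations):
    """Operator-side step: raise the price when supply falls short, lower it otherwise.

    The operator only sees each unit's response, never its cost coefficient.
    """
    price = 0.0
    for _ in range(iterations):
        responses = [unit_response(price, c, x_max) for c in cost_coeffs]
        price += step * (target - sum(responses))
    return price, [unit_response(price, c, x_max) for c in cost_coeffs]
```

The step size governs how fast the price settles: in this sketch, too small a step makes the iteration slow and too large a step makes it oscillate, which mirrors the convergence concern raised above.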

1.2 Flexible Loads in Renewable Integration

Unlike loads such as lighting, which must be satisfied once requested, other loads may have certain flexibility

such that their energy consumption can be controlled through either curtailment or time shift. In fact, it

is pointed out in [31] that a considerable amount of power generation goes to these flexible loads, e.g., heating,

ventilation, air conditioning, and EVs, whose energy consumption can be deferred for a few minutes or hours at

little or no cost. Moreover, control of flexible loads is assessed to be the least expensive alternative solution

for renewable-driven applications such as ancillary services, peak shaving, and contingency management [4].

As a concrete example, consider an EV that is parked after 8pm and needs to be fully charged before 8am the next

morning. In this case, the vehicle provides certain time flexibility since the vehicle owner usually cares little about

the particular time slots during which the vehicle is charged, as long as the battery is full by 8am. Therefore, it

is possible to harness the flexibility of this vehicle by intelligently scheduling its charging rate. Like individual

EVs, small loads in residential and commercial buildings are now attracting more and more attention as they are

ubiquitous and can potentially be controlled easily through currently available communication platforms [32].
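
The EV example can be sketched as a simple scheduling routine. The assumptions here are illustrative only: the 8pm to 8am window is split into hourly slots, a price is known for each slot, and the charger has a per-slot energy limit. The function and parameter names are hypothetical.

```python
def schedule_ev_charging(prices, energy_needed, max_per_slot):
    """Fill the cheapest slots first until the required energy is scheduled.

    prices: assumed known price for each slot in the parking window.
    energy_needed: energy required by the deadline (e.g., in kWh).
    max_per_slot: most energy the charger can deliver in one slot.
    """
    if energy_needed > max_per_slot * len(prices):
        raise ValueError("deadline cannot be met at the given charging rate")
    schedule = [0.0] * len(prices)
    remaining = energy_needed
    # Visit slots from cheapest to most expensive.
    for slot in sorted(range(len(prices)), key=lambda s: prices[s]):
        if remaining <= 0:
            break
        schedule[slot] = min(max_per_slot, remaining)
        remaining -= schedule[slot]
    return schedule
```

Because the owner only cares that the battery is full by the deadline, any schedule with the right total is acceptable to the owner, and the freedom to choose among them is exactly the flexibility that can be harnessed.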

Flexible loads are controlled through demand side management [32, 33]. The key idea of demand side management

is to exploit the flexibility of loads to match supply. In contrast, in conventional generation-based approaches,

the power outputs of generators are adjusted to match demand, with the demand itself left unchanged.

For example, as the electricity demand rises, to resolve the energy deficit, more expensive and normally less used

fast-responsive generators are brought online, which in turn can drive up the generation cost significantly. The

comparison between load-based and generation-based approaches is made in [32]. Some potential advantages of

load-based approaches include instantaneous response and environmental friendliness.

The design of demand side management can be from either the user’s perspective (e.g., for minimizing the

consumption cost of the user) or from the system’s perspective (e.g., for minimizing the system operational cost).

One key challenge in designing a demand side program is to combine the user-level objective with the

system-level objective. So far, there are mainly two design methods proposed in the literature for demand side management,

each focusing on one perspective as mentioned:

• direct load control (DLC), and

• dynamic pricing (DP).

DLC has been adopted widely in industry since the 1970s [33]. By DLC, the utility can directly control the

energy consumption of the loads participating in the program. The advantage of DLC lies in its reliability and

efficiency for fulfilling system-level objectives. However, DLC can cause frequent interruptions and

thus discomfort to the users. As a result, some frustrated users may withdraw from the program, which in turn may

incur significant economic losses to the power system. Therefore, for DLC to work well in practice, it is crucial to

first incentivize users to be enrolled in the program, and later maintain a certain level of quality-of-service for the

engaged users. There are many works on DLC. For example, in [34], the authors propose an incentive mechanism

for participating loads. In [35], the authors study an EV charging problem aiming at maximizing the amount of

energy provided to a user with the minimum cost. In [36], the authors develop a simulation-based framework for

minimizing the amount of the controlled loads as well as the level of user discomfort.

In contrast to DLC, under DP the utility does not control load consumption directly, but instead influences it through

some pricing strategy (e.g., real-time pricing [37,38], time-of-use pricing [39,40], and critical peak pricing [41]).

Unlike the commonly applied fixed-rate retail pricing, which shields users from the wholesale price,

the idea of DP is to adjust the retail price to reflect the wholesale price and thus the instantaneous generation

cost in the wholesale market. Provided that the participating users are price-responsive, the load consumption

profile of each user is supposed to be adjusted in response to the retail price under DP. DP is attractive in that

it provides instantaneous incentives instead of direct commands for modifying the energy consumption of users,

which preserves the freedom of participating users. However, it should be noted that, in practice, the efficacy of

DP could be discounted for two main reasons: first, the users usually have limited knowledge to appropriately

respond to the time-varying price, and second, most current residences lack automation systems for such

responses [42]. Consequently, the system-level performance may not be guaranteed.


To study demand side management (and probably many other problems such as transformer and storage

sizing, and distribution network simulation [43]), it is essential to adopt appropriate load models. Individual

load models can provide detailed load information, including load properties and various power consumption

constraints. For example, in the literature, a model of individual home loads (e.g., washer, dryer, and refrigerator) is proposed in [44]; moreover, the temperature state evolution of an individual thermostatically controlled load (TCL) is commonly represented by a discrete-time difference equation [45, 46]. However, when

there are thousands or millions of loads enrolled in the demand side program, which can be the case for residential

loads, using such detailed individual models may lead to intractable load management. Therefore, it is crucial to

adopt an aggregate load model which can well balance the trade-off between the model accuracy and the control

tractability. For aggregate load models, for example, a stochastic model of aggregate EV charging is provided

in [47]; three aggregation models of TCLs are discussed in [48]; and a reduced-order load model for deferrable

and TCLs is provided in [49]. Interestingly, in [50–52], the authors show that the aggregation of some deferrable

loads can be represented as equivalent storage.

1.3 Literature Review

In this thesis, for energy storage, we focus on the applications of real-time power balancing and real-time phase

balancing in power systems. When additionally incorporating flexible loads into the system model, we consider

the application of real-time energy management in which the joint management of the supply side, the demand

side, and the storage units is investigated. Below we review state-of-the-art works in these studies.

1.3.1 Real-Time Power Balancing

Balancing power supply and demand, i.e., matching power generation and demand load continuously, is crucial

for grid reliability. To achieve power balance, the operator of a power grid needs to schedule generation and load both on a large time scale (e.g., day-ahead or hour-ahead) based on the prediction of future supply and demand, and on a real-time scale (e.g., minutes or seconds) due to, for example, unavoidable prediction errors [53]. For

real-time scale power balancing, one of the most prevalent examples is frequency regulation, which operates every

few seconds to maintain the frequency of a power grid at its nominal value (50 Hz in Europe and 60 Hz in the

U.S.), and is the most expensive ancillary service [54]. With more and more renewable integration in power grids,

the need for real-time power balancing could increase drastically.

To address the problem of real-time power balancing, several intelligent algorithms have been proposed aiming

at optimally scheduling either dispatchable generators on the supply side or flexible loads on the demand side.

For example, for the supply side management, in [55] and [56], by assuming that all demand loads are critical

and must be met, the authors provide real-time algorithms for optimally scheduling the output of dispatchable

generators, so as to minimize the system cost. In particular, the authors of [55] focus on the average system

performance, while the authors of [56] emphasize the worst-case system performance. For the demand side

management, in [57], [58], and [21], real-time power balance is achieved by scheduling the loads of users, with the

objective of minimizing the average system cost. Specifically, the authors of [57] propose to optimally schedule

the non-interruptible and deferrable loads of individual users within their deadlines. The problem is formulated

as a Markov decision process (MDP) problem and is solved distributively. In addition, both the authors of [58]

and [21] develop their solutions under the framework of Lyapunov optimization.

Complementary to the direct supply and demand approaches, DS units, such as batteries inside EVs and batteries deployed at renewable generators (for regulating the rate of power supply), are potentially effective alternatives

for real-time power balancing [59]. Experiments have revealed that an EV’s power electronics and battery can

well respond to frequent charging and discharging signals [60]. Thus, it is possible to exploit plugged-in EVs to

eliminate real-time power discrepancy. In addition, compared with supply side management using traditional generators, such as natural gas generators, which burn fossil fuels, DS units may be more environmentally friendly.

Compared with scheduling the loads of users, intelligent charging and discharging control of DS units may cause

less inconvenience to users. However, to serve real-time power balancing, we may need a large number of DS

units, as the power imbalance amount in a power grid is in general much greater than the power capacity of an

individual DS unit. To coordinate the participating DS units, we consider an aggregator-DS system in which the

aggregator serves as an intermediary between DS units and the grid operator.

There is a growing body of recent works on power balancing using DS units. Specific to the aggregator-DS

system, which focuses on the interaction between the aggregator and DS units, most works adopt centralized

control, with the objective of maximizing the profit of the aggregator or DS units (e.g., [23–27]) or the social

welfare of the system (e.g., [61, 62]). To the best of our knowledge, the only previous works that address distributed control specific to the aggregator-DS system are presented in [28, 29, 63], all studying a deterministic system. In addition, most of the earlier works omit some essential characteristics of the aggregator-DS system. For example, a deterministic model is used in [23] and [26], which ignores the uncertainty of the electricity price, and the dynamics of the power imbalance amount are not incorporated in [28, 29, 63]. For the aggregator, the potential cost of using external energy sources to clear the imbalance amount is ignored in [23–27]. For DS units, the finite battery size constraints are not considered in [28]; the battery degradation costs due to frequent charging and discharging in real-time operation are omitted in [25–29]; and the energy gain and loss in storage operation are ignored in [25, 28, 29, 61].

1.3.2 Real-Time Phase Balancing

In North America, many residential customers are connected to distribution systems through single-phase transmission lines. Phase balancing, i.e., maintaining the balance of loads among phases, is crucial for power grid operation [53]. This is because phase imbalance can increase energy losses and the risk of failures, and can also degrade system power quality. With the spread of single-phase renewable generators, such as wind and solar generators, and large loads, such as EVs, phase imbalance could be aggravated. Thus, how to maintain phase balance in future power grids deserves more careful study.

Previous works on phase balancing considered methods such as phase swapping (e.g., [64]) and feeder reconfiguration (e.g., [65]). However, these approaches can be ineffective or can incur extra costs on human resources,

maintenance expenses, and planned outage duration [64]. An alternative method is to employ energy storage to

mitigate the imbalance among phases. Examples of single-phase storage include:

• Traditional standalone storage such as batteries, flywheels, etc. [66].

• Batteries in single-phase connected buildings such as plugged-in EVs [67].

• Aggregations of small single-phase deferrable loads, e.g., residential TCLs or EV garages, which have been

shown to be representable as equivalent storage [50–52].


Table 1.1: Comparison with existing works

[56] [55] [57] [20] [62] [6] [85] [86] [58] [87] [21] [88] [89]

Proposed

Supply management Y Y Y Y Y Y Y

Demand management Y Y Y Y Y Y Y Y

Storage management Y Y Y Y Y Y Y Y Y Y

Uncertainty/dynamics Y Y Y Y Y Y Y Y Y Y Y Y Y Y

Ramping constraint Y Y Y Y Y

Real-time algorithm Y Y Y Y Y Y Y Y Y Y Y Y Y

Distributed algorithm Y Y Y Y Y

1.3.3 Real-Time Energy Management

So far, there have been many works on energy management in the context of renewable integration. In Table 1.1, we select some representative papers that are closely related to our work. These works emphasize various aspects of the system in energy management. For example, the authors of [56] and [55] consider supply side management by assuming that all loads are uncontrollable, the authors of [57] study demand side management by optimally scheduling non-interruptible and deferrable loads of individual users, and the authors of [6], [20], and [62] propose to employ energy storage to clear power imbalance. Some other works combine supply side and demand side management [85], supply side and storage management [86], or demand side and storage management [21, 58, 87].

Among existing works, [88] and [89] are most closely related to our work, as all three types of energy

management (i.e., supply, demand, and storage) are jointly considered for power balancing. However, in [88],

although the uncertainty of the renewable generation is considered and characterized by a polyhedral set, the

uncertainty of the loads and energy prices is ignored. Moreover, the algorithm is designed for off-line use such

as in day-ahead scheduling, and therefore cannot be implemented in real time. In [89], a real-time algorithm is

proposed to minimize the cost of a conventional generator (CG) only. Furthermore, the ramping constraint of the

CG is not considered in the algorithm design.

1.4 Thesis Contributions

The study we propose draws on expertise in diverse disciplines, such as power systems, information technology, control theory, and economics. This thesis has two main goals: 1) to develop rigorous methodologies and tools to intelligently control energy storage and flexible loads in grid-wide services; and 2) to quantify the effects of enrolling energy storage and flexible loads on grid operations. The ultimate goal is to improve the long-term performance, such as reliability, robustness, and cost effectiveness, of power grids under a high penetration of renewables.

To achieve the goal proposed above, we may encounter many challenges. Below, we state three main challenges: system uncertainty, coupling of system operational constraints, and large scale of power grids.

• System uncertainty

System states such as renewable generation, electricity prices, and loads are in general random over time. Incorporating the uncertainty of these system states into analysis may require accurate information of system modeling and statistics, which is usually difficult to obtain in reality. Moreover, under system uncertainty, the underlying problems are formulated as stochastic optimization problems, which are much harder to solve than their deterministic counterparts.


• Coupling of system operational constraints

A power grid is operated under various constraints. For example, for the supply side, CGs may be subject to

the capacity and ramping constraints; for the demand side, loads may be subject to the power consumption

and deadline constraints; and energy storage may be subject to the maximum charging and discharging rates

and storage capacity constraints. Furthermore, transmission lines connecting these components in a power

grid are limited to transmission capacity constraints. As a result, for achieving a grid-wide objective, these

constraints could couple the control decisions of different grid components, and also couple the control

decisions of each component over multiple time instances, which complicates the algorithm design.

• Large scale of power grids

In a power grid, there can be a large number of components owned by different agents rather than the grid

operator. For example, at the supply side, there could be hundreds of small-size RGs operated by different

owners, and at the demand side, there could be thousands of individual loads owned by different residential

consumers. In such a large-scale power network, how to incentivize different agents to participate in grid-wide services, how to quantify the net benefits of recruiting these agents, and how to efficiently coordinate the control actions of individual agents to fulfill system-wide objectives are difficult questions that remain open.

So far, there has been a large body of literature on energy storage and flexible loads in the context of renewable

integration. Compared with these works, our study contributes to the literature in the following ways:

1. We construct more complete system models that can accommodate a wide spectrum of vital characteristics

of a power system.

2. We explicitly take system uncertainty into account in the problem formulation.

3. For control purposes, we provide centralized real-time algorithms that are easy to implement and come with strong analytical performance guarantees. Moreover, to address privacy issues and possibly restricted communication between the grid operator and individual system components, we propose efficient distributed algorithms that only require limited information exchange.

For the design of real-time algorithms, we employ Lyapunov optimization, which has been discussed in Section 1.1. For the design of distributed algorithms, we apply the fast iterative shrinkage-thresholding algorithm (FISTA) [68] or the alternating direction method of multipliers (ADMM) [69] to distributively solve the per-slot sub-problems.

1.5 Thesis Outline

The following four chapters are organized as follows. Chapters 2, 3, and 4 are mainly about storage control in

grid-wide services. Chapters 2 and 3 focus on the application of real-time power balancing, in which the amount

of the power imbalance needs to be cleared for grid reliability. In Chapter 2, we consider general static storage

units that are always connected to the system. In Chapter 3, we further extend static storage units to dynamic

ones that can leave and rejoin the system, such as batteries inside EVs. Chapter 4 focuses on the application of

real-time phase balancing, in which the objective is to maintain the balance of loads among phases. In Chapter 5,

we additionally incorporate flexible loads into the power system operation, and study the joint management of the

supply side, the demand side, and storage for maintaining the balance of a power grid.

Chapter 2

Real-Time Power Balancing with Static Storage

In this chapter, we consider a general problem of employing an aggregator-DS system to provide real-time power

balancing service for a power grid. Compared to previous works, we consider a more complete aggregator-DS

system model by incorporating all the missing factors discussed in Section 1.3.1. In particular, we aim at

offering both an optimal real-time schedule of charging and discharging for each DS unit, and a fast distributed

algorithm for its implementation. This leads to a large-scale stochastic optimization problem. The problem is

particularly challenging in two ways. First, in terms of real-time design, the dynamic system state and the finite

battery size constraints complicate the joint decision-making over multiple time instances. Second, in terms of

distributed implementation of scheduling the DS units’ charging and discharging amounts, the decision of each

DS unit is intrinsically coupled with those of the others due to the system-wide objective, which hinders the development of a decentralized solution. To tackle these two difficulties, we first use a modified Lyapunov optimization

technique [16] to transform the original long-term objective into per-slot sub-problems that respect the battery size

constraints. Then, we employ Lagrange dual decomposition [70] and adapt a fast iterative shrinkage-thresholding

algorithm (FISTA) [68] to distributively solve the per-slot sub-problems.

For the DS units, we focus on the static storage units in this chapter. By “static,” we mean that the participating

storage units are always connected to the system. Examples of such DS units are batteries deployed at renewable

generators. The case of dynamic DS units that can leave and rejoin the system will be discussed in the next

chapter.

2.1 System Model and Problem Statement

2.1.1 Power Balancing and Aggregator-DS System

In a power grid as shown in Fig. 2.1, due to intrinsic prediction error of generation and load as well as the randomness of renewable sources, the generation amount cannot match the load amount continuously. The discrepancy

between these two at any time can be represented by a power imbalance signal. Consider a time-slotted system

with equal time intervals, which in practical systems may range from a few seconds to a few minutes. For ease

of notation, we incorporate time into the power imbalance signal and use energy units below. At time slot t,

we denote by $g_t$, with $|g_t| \le g_{\max}$, the energy imbalance amount, which is random. If $g_t > 0$, then the generation amount is greater than the load amount by $g_t$ units, which results in energy surplus. If $g_t < 0$, then the generation amount is less than the load amount by $|g_t|$ units, which results in energy deficit. Define $\mathbf{1}_{s,t} \triangleq \mathbf{1}(g_t > 0)$ and $\mathbf{1}_{d,t} \triangleq \mathbf{1}(g_t < 0)$ as the indicators of energy surplus and energy deficit at time slot $t$, respectively, where $\mathbf{1}(\cdot)$ is the indicator function. Since energy surplus and energy deficit cannot happen simultaneously, we have $\mathbf{1}_{s,t} \cdot \mathbf{1}_{d,t} = 0$.

Figure 2.1: Schematic representation of a local power grid. (Figure labels: utility, local grid with generation and load, external sources, aggregator, DS 1 through DS N, power imbalance signal, information flow, energy flow.)

Assume that an aggregator serves the power grid and employs energy storage, capable of charging and discharging, to clear the energy imbalance in every time slot. Since the magnitude of the energy imbalance signal, $|g_t|$, is in general large and building a massive energy storage unit could be costly, the aggregator instead coordinates $N$ (smaller) DS units, possibly owned by different users, to provide power balancing service.

At the beginning of time slot $t$, the aggregator receives the energy imbalance signal $g_t$ from the utility. If $g_t > 0$, the aggregator is required to absorb $g_t$ units of energy during time slot $t$. If $g_t < 0$, the aggregator is required to contribute $|g_t|$ units of energy during time slot $t$. Upon receiving the energy imbalance signal, the

aggregator communicates with each DS unit bidirectionally so as to negotiate the individual energy absorption or

contribution amount. The information and energy flows of the system are depicted in Fig. 2.1.

For the $i$-th DS unit, denote $x_{i,t} \ge 0$ as its charging amount during time slot $t$ in the case of energy surplus, and $y_{i,t} \ge 0$ as its discharging amount during time slot $t$ in the case of energy deficit. Because of limitations imposed by the charging and discharging circuits, the values of $x_{i,t}$ and $y_{i,t}$ are upper bounded. For simplicity, assume that the maximum allowed charging and discharging amounts are of the same quantity, denoted by $r_{i,\max}$, i.e.,
\[ 0 \le x_{i,t} \le r_{i,\max}, \quad 0 \le y_{i,t} \le r_{i,\max}. \tag{2.1} \]
Define the $N$-dimensional charging and discharging amount vectors at time slot $t$ as
\[ \mathbf{x}_t \triangleq [x_{1,t}, \cdots, x_{N,t}] \quad \text{and} \quad \mathbf{y}_t \triangleq [y_{1,t}, \cdots, y_{N,t}], \]
respectively.

Let $\eta_{i,c} \in (0, 1]$ be the charging efficiency coefficient of the $i$-th DS unit, and $\eta_{i,d} \in [1, +\infty)$ be the discharging efficiency coefficient. Because of the battery inefficiency, generally, the actual stored energy through charging is less than $x_{i,t}$, and the actual contributed energy through discharging is larger than $y_{i,t}$. Denote $s_{i,t}$ as the energy state of the $i$-th DS unit at the beginning of time slot $t$. Due to charging and discharging, the energy state $s_{i,t}$ fluctuates over time and evolves as follows$^{1}$:
\[ s_{i,t+1} = s_{i,t} + \mathbf{1}_{s,t}\eta_{i,c}x_{i,t} - \mathbf{1}_{d,t}\eta_{i,d}y_{i,t} \triangleq s_{i,t} + b_{i,t} \tag{2.2} \]
where
\[ b_{i,t} \triangleq \mathbf{1}_{s,t}\eta_{i,c}x_{i,t} - \mathbf{1}_{d,t}\eta_{i,d}y_{i,t} \tag{2.3} \]
is defined as the effective charging and discharging amount of the $i$-th DS unit during time slot $t$.
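As an illustration of the state recursion in (2.2) and (2.3), the following Python sketch updates the energy state of a single DS unit over one time slot. The function and argument names are illustrative assumptions and do not appear in the thesis; the arithmetic follows the two equations directly.

```python
def effective_amount(g_t, x_it, y_it, eta_c, eta_d):
    """Effective charging/discharging amount b_{i,t}, as in (2.3)."""
    surplus = 1.0 if g_t > 0 else 0.0   # indicator of energy surplus
    deficit = 1.0 if g_t < 0 else 0.0   # indicator of energy deficit
    return surplus * eta_c * x_it - deficit * eta_d * y_it


def next_energy_state(s_it, g_t, x_it, y_it, eta_c, eta_d):
    """Energy state update s_{i,t+1} = s_{i,t} + b_{i,t}, as in (2.2)."""
    return s_it + effective_amount(g_t, x_it, y_it, eta_c, eta_d)
```

For instance, starting from an energy state of 10 units, a charging amount of 2 under surplus with a charging efficiency of 0.9 raises the state by 1.8 units, whereas a discharging amount of 2 under deficit with a discharging coefficient of 1.1 lowers it by 2.2 units.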

Charging a battery near its capacity or discharging it close to the zero energy state can significantly reduce

battery lifetime [71]. Therefore, lower and upper bounds on the battery energy state are usually imposed by

its manufacturer or owner. Denote $s_{i,\mathrm{cap}}$ as the energy capacity of the $i$-th DS unit, and $[s_{i,\min}, s_{i,\max}]$ as its preferred energy range with $0 \le s_{i,\min} < s_{i,\max} \le s_{i,\mathrm{cap}}$. We assume that the energy state at each time slot should be limited within the preferred range, i.e.,
\[ s_{i,\min} \le s_{i,t} \le s_{i,\max}. \tag{2.4} \]
Combining the constraints (2.1) and (2.4), we can compactly represent the constraints of $x_{i,t}$ and $y_{i,t}$ as
\[ 0 \le x_{i,t} \le \min\left\{ r_{i,\max}, \frac{s_{i,\max} - s_{i,t}}{\eta_{i,c}} \right\} \]
and
\[ 0 \le y_{i,t} \le \min\left\{ r_{i,\max}, \frac{s_{i,t} - s_{i,\min}}{\eta_{i,d}} \right\}, \]
respectively.
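The two compact bounds can be evaluated per slot from the current energy state. The sketch below is a minimal illustration under the assumption that the state already lies in the preferred range; the function names are hypothetical.

```python
def charging_limit(s_it, r_max, s_max, eta_c):
    """Upper bound on the charging amount: min{r_max, (s_max - s_it) / eta_c}."""
    return min(r_max, (s_max - s_it) / eta_c)


def discharging_limit(s_it, r_max, s_min, eta_d):
    """Upper bound on the discharging amount: min{r_max, (s_it - s_min) / eta_d}."""
    return min(r_max, (s_it - s_min) / eta_d)
```

When the state is far from both ends of the preferred range, the rate limit is the binding term; close to either end, the second term takes over so that the next state computed from the energy state equation stays within the preferred range.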

Since a DS unit absorbs and contributes energy in charging and discharging, respectively, it has either energy

gain or energy loss when providing real-time power balancing service. Denote the unit market electricity price

at time slot $t$ as $p_{m,t} \in [p_{m,\min}, p_{m,\max}]$. Then, the revenue of the $i$-th DS unit for absorbing energy in the case of energy surplus is $p_{m,t}x_{i,t}$, and the loss for contributing energy in the case of energy deficit is $p_{m,t}\eta_{i,d}y_{i,t}$. Additionally, by providing power balancing service, each DS unit can receive payment from the aggregator for its controllable and flexible charging and discharging capability. Denote the unit prices for charging and discharging services at time slot $t$ as $p_{c,t}$ and $p_{d,t}$, respectively. Assume that the aggregator pays for the charging and discharging services based on the actual service amounts $x_{i,t}$ and $y_{i,t}$. In other words, the $i$-th DS unit receives payment $p_{c,t}x_{i,t}$ in the case of energy surplus for charging, and payment $p_{d,t}y_{i,t}$ in the case of energy deficit for discharging. As a result, the effective cost of the $i$-th DS unit for providing power balancing service at time slot $t$ is
\[ \varphi_{i,t} \triangleq \mathbf{1}_{s,t}(-p_{m,t}x_{i,t} - p_{c,t}x_{i,t}) + \mathbf{1}_{d,t}(p_{m,t}\eta_{i,d}y_{i,t} - p_{d,t}y_{i,t}). \]

$^{1}$ We assume that the role of the DS units is to exclusively provide real-time power balancing service when connected and thus do not explicitly consider their own charging needs.

For each DS unit, offering power balancing service is accompanied by battery degradation due to frequent charging and extra cycling of the battery [72]. Denote $D_{i,c}(\cdot)$ and $D_{i,d}(\cdot)$ as the degradation cost functions with respect to the charging amount and the discharging amount, respectively, with $D_{i,c}(0) = D_{i,d}(0) = 0$. Since the actual discharging amount is $\eta_{i,d}y_{i,t}$, for notational simplicity, we merge $\eta_{i,d}$ into the function $D_{i,d}(\cdot)$. Furthermore, since faster charging or discharging (i.e., a larger value of $x_{i,t}$ or $y_{i,t}$) generally has a more detrimental effect on battery lifetime, $D_{i,c}(\cdot)$ and $D_{i,d}(\cdot)$ can be approximated by increasing convex functions in general. To facilitate later analysis, we slightly strengthen this condition and make the following assumptions:

C1:

• $D_{i,c}(\cdot)$ and $D_{i,d}(\cdot)$ are increasing, strictly convex, and twice continuously differentiable on $[0, r_{i,\max}]$.

• The second derivatives of $D_{i,c}(\cdot)$ and $D_{i,d}(\cdot)$ are lower bounded by a constant $d_{i,l} > 0$ on $[0, r_{i,\max}]$.

To limit battery degradation, the $i$-th DS unit sets a pre-designed upper bound $l_{i,u} \ge 0$ to restrict the long-term degradation cost, which can be formally expressed by
\[ \lim_{T\to\infty} \frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}[\mathbf{1}_{s,t}D_{i,c}(x_{i,t}) + \mathbf{1}_{d,t}D_{i,d}(y_{i,t})] \le l_{i,u}. \]

Due to a lack of participating DS units or high battery degradation cost, sometimes the sum contribution of all DS units may be insufficient to clear the total power imbalance amount. Specifically, for energy surplus, this insufficiency means that $\sum_{i=1}^{N} x_{i,t} < g_t$, and for energy deficit, it means that $\sum_{i=1}^{N} y_{i,t} < |g_t|$. Hence, from time to time, to fill the gap, the aggregator needs to exploit external energy sources, such as the external real-time electricity market$^{2}$. Denote the cost functions of the external sources for clearing energy surplus and energy deficit as $C_s(\cdot)$ and $C_d(\cdot)$, respectively, with $C_s(0) = C_d(0) = 0$. Then, the cost of the aggregator for exploiting the external sources at time slot $t$ can be represented as $\mathbf{1}_{s,t}C_s(g_t - \sum_{i=1}^{N} x_{i,t}) + \mathbf{1}_{d,t}C_d(|g_t| - \sum_{i=1}^{N} y_{i,t})$. We assume the following conditions on the external cost functions:

C2:

• $C_s(\cdot)$ and $C_d(\cdot)$ are increasing, strictly convex, and twice continuously differentiable on $[0, g_{\max}]$.

• The second derivatives of $C_s(\cdot)$ and $C_d(\cdot)$ are lower bounded by a constant $c_l > 0$ on $[0, g_{\max}]$.

Finally, the total cost of the aggregator, including that for using the external sources and the payment to all DS units, is given by
\[ \phi_t \triangleq \mathbf{1}_{s,t}\left[ C_s\left(g_t - \sum_{i=1}^{N} x_{i,t}\right) + p_{c,t}\sum_{i=1}^{N} x_{i,t} \right] + \mathbf{1}_{d,t}\left[ C_d\left(|g_t| - \sum_{i=1}^{N} y_{i,t}\right) + p_{d,t}\sum_{i=1}^{N} y_{i,t} \right]. \]
Combining the costs of all DS units with the cost of the aggregator, we have the total cost of the aggregator-DS system at time slot $t$ given by
\[ w_t \triangleq \phi_t + \sum_{i=1}^{N} \varphi_{i,t}. \]

Notice that the payment for the charging and discharging services does not appear in the final expression of $w_t$. This is because such payment is transferred from the aggregator to the DS units, hence not affecting the system-wide cost. We will revisit the service payment in Section 2.3.3.
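Substituting the DS unit costs and the aggregator cost into the system cost, the payment terms cancel, which leaves $w_t = \mathbf{1}_{s,t}[C_s(g_t - \sum_{i} x_{i,t}) - p_{m,t}\sum_{i} x_{i,t}] + \mathbf{1}_{d,t}[C_d(|g_t| - \sum_{i} y_{i,t}) + p_{m,t}\sum_{i} \eta_{i,d}y_{i,t}]$. The Python sketch below checks this cancellation numerically. All function names are illustrative, and the quadratic external cost is an assumed example, not a function specified in the thesis.

```python
def ds_unit_cost(g_t, x, y, eta_d, p_m, p_c, p_d):
    """Effective cost of one DS unit in a slot."""
    if g_t > 0:   # energy surplus: the unit charges x
        return -p_m * x - p_c * x
    if g_t < 0:   # energy deficit: the unit discharges y
        return p_m * eta_d * y - p_d * y
    return 0.0


def aggregator_cost(g_t, xs, ys, p_c, p_d, ext_cost_s, ext_cost_d):
    """Aggregator cost: external sources plus service payments."""
    if g_t > 0:
        return ext_cost_s(g_t - sum(xs)) + p_c * sum(xs)
    if g_t < 0:
        return ext_cost_d(abs(g_t) - sum(ys)) + p_d * sum(ys)
    return 0.0


def system_cost(g_t, xs, ys, eta_ds, p_m, p_c, p_d, ext_cost_s, ext_cost_d):
    """System cost: aggregator cost plus the sum of DS unit costs."""
    units = sum(
        ds_unit_cost(g_t, x, y, eta_d, p_m, p_c, p_d)
        for x, y, eta_d in zip(xs, ys, eta_ds)
    )
    return aggregator_cost(g_t, xs, ys, p_c, p_d, ext_cost_s, ext_cost_d) + units


def quadratic_cost(u):
    """Assumed external cost: zero at 0, increasing and strictly convex for u >= 0."""
    return u + 0.5 * u * u
```

Evaluating `system_cost` twice with the same imbalance, decisions, and market price but different service prices `p_c` and `p_d` should give the same value up to rounding, since the payments only move money between the aggregator and the DS units.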

2.1.2 Problem Statement

The aggregator is assumed to be regulated and not profit-driven. For example, it can represent a government-funded party that encourages the integration of DS units into a power grid. The aggregator coordinates the DS units to provide real-time power balancing service, and aims to minimize the long-term system cost while respecting the battery capacity and degradation cost constraints of each DS unit. We assume that each DS unit is willing to provide real-time power balancing service and is under contract with the aggregator. In return, the DS units will be paid for such a service as described in Section 2.1.1$^{3}$.

$^{2}$ In practice, the imbalance signal $g_t$ may relate to the capacity of the service provider. In this paper, we focus on the aggregator-DS system, and assume that $g_t$ is determined externally and the aggregator guarantees to clear the imbalance in every time slot.

We formulate the real-time power balancing problem as the following stochastic optimization problem.

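\begin{align}
\textbf{P1:} \quad \min_{\{\mathbf{x}_t, \mathbf{y}_t\}} \quad & \lim_{T\to\infty} \frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}[w_t] \notag \\
\text{s.t.} \quad & 0 \le x_{i,t} \le \min\left\{ r_{i,\max}, \frac{s_{i,\max} - s_{i,t}}{\eta_{i,c}} \right\}, \quad \forall i, t \tag{2.5} \\
& 0 \le y_{i,t} \le \min\left\{ r_{i,\max}, \frac{s_{i,t} - s_{i,\min}}{\eta_{i,d}} \right\}, \quad \forall i, t \tag{2.6} \\
& \sum_{i=1}^{N} x_{i,t} \le \mathbf{1}_{s,t}\, g_t, \quad \sum_{i=1}^{N} y_{i,t} \le \mathbf{1}_{d,t}\, |g_t|, \quad \forall t \tag{2.7} \\
& \lim_{T\to\infty} \frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}[\mathbf{1}_{s,t}D_{i,c}(x_{i,t}) + \mathbf{1}_{d,t}D_{i,d}(y_{i,t})] \le l_{i,u}, \quad \forall i \tag{2.8}
\end{align}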

where the expectations above are taken over the random system state, defined as $\mathbf{A}_t \triangleq (g_t, p_{m,t})$, and the possibly random decisions $(\mathbf{x}_t, \mathbf{y}_t)$. The rationale for constraints (2.5)-(2.8) is given in Section 2.1.1. By (2.7), we mean that, first, the sum contribution of all DS units should not exceed the required amount, and second, in the case of energy deficit (resp. energy surplus) the charging (resp. discharging) amount of each DS unit should be zero.

The above optimization problem can be solved centrally by traditional approaches such as dynamic programming [13], provided that the aggregator has perfect knowledge of the system statistics and can fully control the charging and discharging of all DS units. However, for one, dynamic programming is known to suffer from the “curse of dimensionality,” and accurate statistics cannot be easily obtained in practice. For another, direct charging

and discharging control not only overrides a DS owner’s individual choice but also leads to high computational

complexity as the number of participating DS units becomes large.

Motivated by these concerns, our goal is to develop a real-time distributed algorithm that does not require the statistics of the system state and allows each DS unit to make its own decision. This is a challenging problem

due to the presence of the dynamic system state, the finite battery size constraints, and the coupling of decisions

among all DS units. To address this problem, we first decompose the long-term optimization problem P1 into

per-slot sub-problems.

2.2 Centralized Real-Time Algorithm

To solve P1, we now propose a centralized algorithm using the general framework of Lyapunov optimization [16],

with modifications to handle finite battery size constraints and to facilitate the distributed algorithm introduced

later.

2.2.1 Problem Relaxation

Recall that for each DS unit, the hard constraints of the charging and discharging amounts, i.e., (2.5) and (2.6), are equivalent to the constraints (2.1) and (2.4). Due to the battery size constraint (2.4), for each DS unit, the current charging and discharging decisions are coupled with all previous charging and discharging decisions through the current energy state, which complicates the optimization. To avoid such coupling, we replace (2.4) with a new time average constraint and introduce the following relaxed problem:
\begin{align}
\textbf{P2:} \quad \min_{\{\mathbf{x}_t, \mathbf{y}_t\}} \quad & \lim_{T\to\infty} \frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}[w_t] \notag \\
\text{s.t.} \quad & (2.1), (2.7), (2.8), \notag \\
& \lim_{T\to\infty} \frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}[b_{i,t}] = 0, \quad \forall i \tag{2.9}
\end{align}
where $b_{i,t}$ is defined in (2.3). As opposed to (2.4), by which the energy state is always bounded, (2.9) requires that the effective charging and discharging amount is zero on average.

$^{3}$ We emphasize that the market aspects, such as the contract design investigated in [73], are not the focus of this paper.

We now demonstrate that (2.4) implies (2.9), so that P2 is indeed a relaxation of P1. Summing both sides of the energy state equation (2.2) over $t \in \{0, 1, \cdots, T-1\}$ and dividing them by $T$ yields
\[ \frac{s_{i,T}}{T} - \frac{s_{i,0}}{T} = \frac{1}{T} \sum_{t=0}^{T-1} b_{i,t}. \tag{2.10} \]
Taking expectations on both sides of (2.10) and taking limits over $T$ gives
\[ \lim_{T\to\infty} \frac{\mathbb{E}[s_{i,T}]}{T} - \lim_{T\to\infty} \frac{\mathbb{E}[s_{i,0}]}{T} = \lim_{T\to\infty} \frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}[b_{i,t}]. \tag{2.11} \]
Since $s_{i,T}$ and $s_{i,0}$ are bounded by (2.4), the left hand side of (2.11) is equal to zero and the constraint (2.9) holds.
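The telescoping argument can be checked numerically. The sketch below draws arbitrary effective amounts, clips them so that the energy state never leaves the preferred range, and compares the time average of the amounts with the scaled difference between the final and initial states. The random policy and all names are assumptions made for illustration only; this is not the control algorithm developed in this chapter.

```python
import random


def simulate_energy_states(s0, s_min, s_max, r_max, horizon, seed=0):
    """Draw effective amounts b_t, clipped so the state stays in [s_min, s_max]."""
    rng = random.Random(seed)
    s, amounts = s0, []
    for _ in range(horizon):
        b = rng.uniform(-r_max, r_max)
        b = min(max(b, s_min - s), s_max - s)  # keep the next state in range
        amounts.append(b)
        s += b
    return s, amounts


def time_average(amounts):
    """Time average of the effective amounts over the horizon."""
    return sum(amounts) / len(amounts)
```

Because the state is confined to the preferred range, the magnitude of the time average cannot exceed the width of that range divided by the horizon, so it shrinks toward zero as the horizon grows.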

By removing the coupling in charging and discharging decisions due to the battery size constraints, the relaxed problem P2 allows us to apply Lyapunov optimization to decompose the original problem into real-time sub-problems. We will show later in Section 2.2.4 that our developed solution in fact also satisfies (2.4), so it is feasible for P1. This relaxation technique to accommodate time-coupled decision constraints such as (2.5) and (2.6) was first introduced in [74] for energy management in a data center equipped with an ideal battery, and was later applied in [58] and [21]. Compared with [74], besides addressing a different problem, we consider multiple DS units. Compared with [58] and [21], the structure of our problem is more complicated, with a nonlinear objective which allows for bidirectional energy flow between the aggregator and DS units. Thus, the relaxation treatment needed to ensure that the battery size constraints are satisfied is more involved.

2.2.2 Virtual Queue Design

To solve P2, we introduce virtual queues and transform the time-averaged constraints (2.8) and (2.9) to queue

stability constraints, as explained below.

Consider constraint (2.8). To facilitate distributed implementation, which will be explained later, we add a constant cost cushion $a_i > 0$ to both sides of (2.8), and obtain the following equivalent constraint for each DS unit:
\[ \lim_{T\to\infty} \frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}[\mathbf{1}_{s,t}D_{i,c}(x_{i,t}) + \mathbf{1}_{d,t}D_{i,d}(y_{i,t}) + a_i] \le \tilde{l}_{i,u} \tag{2.12} \]
where $\tilde{l}_{i,u} \triangleq l_{i,u} + a_i$. Define a virtual queue $J_{i,t}$, which updates as
\[ J_{i,t+1} = \max\{J_{i,t} - \tilde{l}_{i,u}, 0\} + \mathbf{1}_{s,t}D_{i,c}(x_{i,t}) + \mathbf{1}_{d,t}D_{i,d}(y_{i,t}) + a_i. \tag{2.13} \]
Initialize $J_{i,0} = a_i$ and define $\mathbf{J}_t \triangleq [J_{1,t}, \cdots, J_{N,t}]$. Based on (2.13), queue backlog $J_{i,t}$ accumulates the total amount of degradation cost in excess of $l_{i,u}$. The function of $a_i$ is to guarantee that $J_{i,t} \ge a_i$. The introduction of $a_i$ is important, and we will discuss the design of $a_i$ in Section 2.3.2.
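A single step of the degradation queue update can be sketched as follows. The function name and its arguments are hypothetical; `degradation_cost` stands for the realized degradation cost of the slot (the charging or discharging term, whichever applies), which is nonnegative under the assumptions on the degradation cost functions.

```python
def update_degradation_queue(J_it, degradation_cost, l_u, a_i):
    """One step of the virtual queue for the degradation cost constraint."""
    l_shifted = l_u + a_i  # upper bound after adding the cushion a_i
    return max(J_it - l_shifted, 0.0) + degradation_cost + a_i


def run_queue(costs, l_u, a_i):
    """Iterate the update from the initial backlog J_{i,0} = a_i."""
    J = a_i
    history = [J]
    for cost in costs:
        J = update_degradation_queue(J, cost, l_u, a_i)
        history.append(J)
    return history
```

Since the maximum term and the degradation cost are both nonnegative, every backlog produced by this update is at least the cushion `a_i`, in line with the role of the cushion described above.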

For constraint (2.9), we associate it with a virtual queue Ki,t, which evolves as

Ki,t+1 = Ki,t + bi,t. (2.14)

Define Kt ≜ [K1,t, · · · , KN,t]. By (2.14), Ki,t accumulates the total effective charging and discharging amount.

Comparing (2.14) with (2.2), we can see that Ki,t and the energy state si,t evolve in the same manner. We relate

them by initializing Ki,0 = si,0 − βi, where the perturbation parameter βi is set to be

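$$\beta_i \triangleq s_{i,\min} + \eta_{i,d}\, r_{i,\max} - V\left(p_{m,\min} - \frac{c_{\max}}{\eta_{i,d}}\right) \tag{2.15}$$

where $c_{\max} \triangleq \max\{C_s'(g_{\max}),\, C_d'(g_{\max})\}$ and the weight $V \in (0, V_{\max}]$ with

$$V_{\max} \triangleq \min_{1\le i\le N}\left\{ \frac{s_{i,\max} - s_{i,\min} - (\eta_{i,c}+\eta_{i,d})\, r_{i,\max}}{\dfrac{c_{\max}+p_{m,\max}}{\eta_{i,c}} + \dfrac{c_{\max}}{\eta_{i,d}} - p_{m,\min}} \right\}. \tag{2.16}$$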

Thus, the virtual queue Ki,t is a shifted version of the energy state si,t. Ki,t is introduced to track si,t. More

importantly, as we will see later, the boundedness of si,t can be guaranteed through the control of Ki,t. The

design of βi and Vmax in (2.15) and (2.16) is crucial. We will show in Section 2.2.4 how the constraint (2.4) can

be guaranteed by such design.

Note that under the real-time operation, the value of ri,max in (2.16) is generally much smaller than the

energy capacity. For example, for the 2012 Ford Focus Electric, the energy capacity is 23 kWh and the maximum

charging and discharging rate is 6.6 kW. Assuming that the duration of each time slot is 5 minutes, we then have

ri,max = 0.55 kWh ≪ 23 kWh. By this observation, from (2.16), we have Vmax > 0 in general.
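The design in (2.15) and (2.16) can be checked numerically. The sketch below evaluates both formulas for a single unit with the Ford Focus Electric figures quoted above and a 5-minute slot. The preferred range and efficiencies follow the defaults of Section 2.4.1; the values of `C_MAX`, `P_MIN`, and `P_MAX` are hypothetical placeholders, since the text does not fix them here.

```python
def v_max(units, c_max, p_min, p_max):
    """Upper limit on V from (2.16); `units` is a list of per-unit parameter dicts."""
    ratios = []
    for u in units:
        span = u["s_max"] - u["s_min"] - (u["eta_c"] + u["eta_d"]) * u["r_max"]
        slope = (c_max + p_max) / u["eta_c"] + c_max / u["eta_d"] - p_min
        ratios.append(span / slope)
    return min(ratios)


def beta(u, V, c_max, p_min):
    """Perturbation parameter from (2.15)."""
    return u["s_min"] + u["eta_d"] * u["r_max"] - V * (p_min - c_max / u["eta_d"])


r_max = 6.6 * 5.0 / 60.0  # 6.6 kW over a 5-minute slot, in kWh
unit = {"s_min": 0.1 * 23.0, "s_max": 0.9 * 23.0, "r_max": r_max,
        "eta_c": 0.8, "eta_d": 1.2}
C_MAX, P_MIN, P_MAX = 10.0, 7.0, 7.0  # hypothetical values, for illustration only

V = v_max([unit], C_MAX, P_MIN, P_MAX)
b = beta(unit, V, C_MAX, P_MIN)
```

With these inputs `V` is positive, and the two thresholds of Lemma 2.2, offset by one slot of charging or discharging, fall inside the shifted range $[s_{i,\min} - \beta_i,\, s_{i,\max} - \beta_i]$, which is what the choice of $\beta_i$ and $V_{\max}$ is designed to achieve.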

Finally, we show that the time-averaged constraints (2.8) and (2.9) can be transformed into the mean rate

stability constraints of virtual queues, which is a direct result from [16]. Below we first give the definition of

mean rate stability of a queue.

Definition: A queue $Q_t$ is mean rate stable if $\lim_{t\to\infty} \frac{\mathbb{E}[|Q_t|]}{t} = 0$.

Lemma 2.1 Constraints (2.8) and (2.9) hold if the virtual queues Ji,t and Ki,t are mean rate stable, respectively.

2.2.3 Centralized Algorithm

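At time slot t, define a vector $\Theta_t \triangleq [\mathbf{J}_t, \mathbf{K}_t]$, the Lyapunov function $L(\Theta_t) \triangleq \frac{1}{2}\sum_{i=1}^{N}\left(J_{i,t}^2 + K_{i,t}^2\right)$, and the associated one-slot Lyapunov drift

$$\Delta(\Theta_t) \triangleq \mathbb{E}\left[L(\Theta_{t+1}) - L(\Theta_t) \mid \Theta_t\right].$$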

The drift-plus-cost function is defined as ∆(Θt)+V E[wt|Θt] [16], in which the time-averaged constraints and the

objective function are jointly considered, with the weight V (the same V as in (2.15)) controlling their trade-off.

In the following proposition, we provide an upper bound on the drift-plus-cost function.


Proposition 2.1 For all possible policies of the charging/discharging decisions of all DS units, and all possible

values of Θt, the drift-plus-cost function is upper bounded as follows:

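$$\Delta(\Theta_t) + V\mathbb{E}[w_t \mid \Theta_t] \le B + V\mathbb{E}[w_t \mid \Theta_t] + \sum_{i=1}^{N} K_{i,t}\,\mathbb{E}[b_{i,t} \mid \Theta_t] + \sum_{i=1}^{N} J_{i,t}\,\mathbb{E}\left[\mathbf{1}_{s,t}D_{i,c}(x_{i,t}) + \mathbf{1}_{d,t}D_{i,d}(y_{i,t}) - l_{i,u} \mid \Theta_t\right] \tag{2.17}$$

where

$$B \triangleq \frac{1}{2}\sum_{i=1}^{N}\left[\tilde{l}_{i,u}^{\,2} + \left(\max\{D_{i,c}(r_{i,\max}),\, D_{i,d}(r_{i,\max})\} + a_i\right)^2 + r_{i,\max}^2\right], \tag{2.18}$$

with $\tilde{l}_{i,u} \triangleq l_{i,u} + a_i$, and $V \in (0, V_{\max}]$.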

Proof: See Appendix 2.6.1.

Adopting the general framework of Lyapunov optimization [16], we design a real-time algorithm to minimize

the upper bound of the drift-plus-cost function on the right hand side of (2.17). The algorithm can lead to a

guaranteed performance as shown in Section 2.2.4. Consequently, we consider the per-slot sub-problems for

energy surplus and energy deficit at each time slot t as follows. For notation simplicity, we will omit the subscript

t of the optimization variables whenever it is clear from the context.

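P2(a) (energy surplus):

$$\min_{\mathbf{x}} \ \left[\sum_{i=1}^{N} J_{i,t}D_{i,c}(x_i) - V p_{m,t}\, x_i + K_{i,t}\eta_{i,c}\, x_i\right] + V C_s\!\left(g_t - \sum_{i=1}^{N} x_i\right)$$

$$\text{s.t.} \quad 0 \le x_i \le r_{i,\max}, \quad \sum_{i=1}^{N} x_i \le g_t.$$

P2(b) (energy deficit):

$$\min_{\mathbf{y}} \ \left[\sum_{i=1}^{N} J_{i,t}D_{i,d}(y_i) + V p_{m,t}\eta_{i,d}\, y_i - K_{i,t}\eta_{i,d}\, y_i\right] + V C_d\!\left(|g_t| - \sum_{i=1}^{N} y_i\right)$$

$$\text{s.t.} \quad 0 \le y_i \le r_{i,\max}, \quad \sum_{i=1}^{N} y_i \le |g_t|.$$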

The optimization variables x and y are N -dimensional vectors with the i-th element being xi and yi, respectively.

The centralized algorithm implemented by the aggregator is summarized in Algorithm 2.1. It does not require the system statistics and needs only instantaneous observations.

Algorithm 2.1: Centralized algorithm for real-time power balancing.

Initialize Ji,0 = ai and Ki,0 = si,0 − βi, ∀i. At each time, the aggregator executes the following steps sequentially.

1. Observe gt, pm,t, Ji,t, and Ki,t.

2. Solve P2(a) if gt > 0, and solve P2(b) if gt < 0.

3. Update Ji,t and Ki,t based on (2.13) and (2.14), respectively.

2.2.4 Performance Analysis

Algorithm 2.1 is designed for P2, in which the battery size constraint (2.4) in P1 is replaced by the relaxed time-averaged constraint (2.9). Thus, with $\{\mathbf{x}_t^*, \mathbf{y}_t^*\}$, it is not yet certain whether the resultant si,t satisfies the battery size constraint (2.4), i.e., whether the solution is feasible for P1. We now demonstrate that under the proposed algorithm, si,t in fact always satisfies (2.4).

Since the virtual queue Ki,t is designed to be a shifted version of si,t, to prove the boundedness of si,t, it suffices to show that Ki,t is restricted within a shifted preferred range. We first show through the following lemma that Ki,t is bounded for any initial value Ki,0.

Lemma 2.2 For any initial value $K_{i,0}$,

1. if $g_t > 0$ and $K_{i,t} > \frac{V(p_{m,\max}+c_{\max})}{\eta_{i,c}}$, then $x_{i,t}^* = 0$;

2. if $g_t < 0$ and $K_{i,t} < V\left(p_{m,\min} - \frac{c_{\max}}{\eta_{i,d}}\right)$, then $y_{i,t}^* = 0$.


Lemma 2.2 says that, given any Ki,0, for energy surplus, if Ki,t is greater than the above threshold, the

resultant charging amount is zero, and thus Ki,t+1 cannot be increased at the next time slot. Similarly, for energy

deficit, if Ki,t is less than the above threshold, the resultant discharging amount is zero, and thus Ki,t+1 cannot

be decreased at the next time slot. Therefore, Ki,t is bounded.
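The first claim of Lemma 2.2 can be illustrated on a single DS unit. The sketch below evaluates the objective of P2(a) for $N = 1$ on a grid of charging amounts, using the cost functions of Section 2.4.1 ($D_{i,c}(x) = x^{1.5}$, $C_s(x) = 7x^{1.2}$). All numerical values and function names are illustrative assumptions, and the grid search is only a stand-in for solving P2(a) exactly.

```python
def marginal_external_cost(z):
    """C_s'(z) for the assumed external cost C_s(z) = 7 * z**1.2."""
    return 7.0 * 1.2 * z ** 0.2


def charging_threshold(V, p_max, c_max, eta_c):
    """Threshold on K_{i,t} from case 1 of Lemma 2.2."""
    return V * (p_max + c_max) / eta_c


def surplus_objective(x, g, V, p_m, J, K, eta_c):
    """Single-unit objective of P2(a) with D(x) = x**1.5 and C_s(z) = 7 * z**1.2."""
    return J * x ** 1.5 - V * p_m * x + K * eta_c * x + V * 7.0 * (g - x) ** 1.2


def best_charge_on_grid(g, V, p_m, J, K, eta_c, r_max, points=101):
    """Grid approximation of the minimizer over 0 <= x <= min(r_max, g)."""
    upper = min(r_max, g)
    grid = [upper * n / (points - 1) for n in range(points)]
    return min(grid, key=lambda x: surplus_objective(x, g, V, p_m, J, K, eta_c))
```

When `K` is set above `charging_threshold(...)`, the grid minimizer is exactly zero, in line with the lemma; when `K` is far below the threshold, the unit charges a positive amount.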

Using Lemma 2.2, we next show that by our designed initialization, Ki,t is bounded within a shifted preferred

range.

Lemma 2.3 Given Ki,0 = si,0 − βi, where βi is defined in (2.15), the queue backlog Ki,t is bounded within

[si,min − βi, si,max − βi] for all time slots t.


Remarks on Choices of βi and Vmax: To track the energy state si,t, in principle, the shift βi could be any value. However, as required in Case 2′ of the proof of Lemma 2.3, βi should be lower bounded, i.e., $\beta_i = s_{i,\min} + \eta_{i,d} r_{i,\max} - V\left(p_{m,\min} - \frac{c_{\max}}{\eta_{i,d}}\right) + \epsilon_1$ for any $\epsilon_1 \ge 0$. Furthermore, as required in Case 1 of the proof, it is sufficient to set

$$V_{\max} = \min_{1\le i\le N}\left\{ \frac{s_{i,\max} - s_{i,\min} - \eta_{i,d}\, r_{i,\max} - \epsilon_1 - \epsilon_2}{\dfrac{c_{\max}+p_{m,\max}}{\eta_{i,c}} + \dfrac{c_{\max}}{\eta_{i,d}} - p_{m,\min}} \right\}$$

with any $\epsilon_2 > 0$. Finally, to facilitate Case 2 of the proof, we set $\epsilon_1$ and $\epsilon_2$ to be 0 and $\eta_{i,c} r_{i,\max}$, respectively, to make Vmax as large as possible. (As shown in Theorems 2.1 and 2.2 below, a larger Vmax implies better performance by the proposed algorithm.) This leads to the specific designs as shown in (2.15) and (2.16).

By Lemma 2.3, the boundedness of the energy state si,t is straightforward, and is given in the following

lemma.

Lemma 2.4 Under the proposed algorithm, the energy state si,t is bounded within [si,min, si,max] for all time slots t.

We next show the analytical performance of Algorithm 2.1. Denote the long-term system cost under the

proposed algorithm as f∗ and that under the optimal solution for P1 as f opt. Note that the optimal solution

may require statistical information of the system, and can be difficult to derive. The optimality of the proposed

algorithm is described in Theorems 2.1 and 2.2.


Theorem 2.1 Suppose that the system state At is i.i.d. over time.

1. The virtual queues Ji,t and Ki,t are mean rate stable, and $\{\mathbf{x}_t^*, \mathbf{y}_t^*\}$ is feasible for P1;

2. $f^* \le f^{\mathrm{opt}} + \frac{B}{V}$, where B is defined in (2.18) and $V \in (0, V_{\max}]$.


From Theorem 2.1, the system cost under the proposed algorithm is away from the optimum by O(1/V ).

Thus, the larger V , the better the performance of the proposed algorithm. However, in practice, due to the

boundedness of the preferred energy range, V cannot be arbitrarily large and is upper bounded by Vmax, whose

design rationale is given in the remarks after Lemma 2.3.

Based on the definition of Vmax in (2.16), Vmax increases with the smallest span of the DS units’ preferred ranges, i.e., $\min_{1\le i\le N}\{s_{i,\max} - s_{i,\min}\}$. Therefore, roughly speaking, the performance gap between the proposed

algorithm and the optimum decreases as the smallest battery capacity increases. Asymptotically, as each DS unit’s

battery capacity goes to infinity, the proposed algorithm achieves the optimum. We also note that the cost cushion

ai increases the performance bound through the constant B. Hence, a smaller ai is desirable.
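To make the roles of $V$ and $a_i$ in the bound $B/V$ concrete, the following sketch evaluates the constant $B$ of (2.18) and the resulting gap bound from Theorem 2.1. The helper `make_unit` and its default numbers are hypothetical; they borrow the cost shape $x^{1.5}$ and the bound $l_{i,u} = (r_{i,\max}/2)^{1.5}$ from Section 2.4.1 purely for illustration.

```python
def drift_constant(units):
    """Constant B from (2.18), summed over all DS units."""
    total = 0.0
    for u in units:
        l_tilde = u["l_u"] + u["a"]  # shifted degradation bound l_{i,u} + a_i
        worst_degradation = max(u["D_c"](u["r_max"]), u["D_d"](u["r_max"]))
        total += l_tilde ** 2 + (worst_degradation + u["a"]) ** 2 + u["r_max"] ** 2
    return 0.5 * total


def optimality_gap_bound(units, V):
    """Upper bound B / V on f* - f_opt from Theorem 2.1."""
    return drift_constant(units) / V


def make_unit(a, r_max=0.055):
    """Hypothetical DS unit with D(x) = x**1.5 for both charging and discharging."""
    cost = lambda x: x ** 1.5
    return {"l_u": (r_max / 2) ** 1.5, "a": a, "r_max": r_max,
            "D_c": cost, "D_d": cost}
```

Doubling `V` halves the bound, while a larger cushion `a` raises `B`; this is the trade-off described above.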

In Theorem 2.1, the i.i.d. condition of At can be relaxed to Markovian, and a similar performance bound can

be obtained. This allows us to design aggregator-DS systems in the case when the power imbalance amount gt

and the electricity price pm,t are Markovian over time.

Theorem 2.2 Suppose that the system state At evolves based on a finite state irreducible and aperiodic Markov chain.

1. The virtual queues Ji,t and Ki,t are mean rate stable, and $\{\mathbf{x}_t^*, \mathbf{y}_t^*\}$ is feasible for P1;

2. $f^* \le f^{\mathrm{opt}} + O(1/V)$, where $V \in (0, V_{\max}]$.

Proof: The above results can be proved by expanding the proof of Theorem 2.1 using a multi-slot drift

technique [16]. We omit the proof here for brevity.

2.3 Distributed Implementation

In the last section, we provided a centralized algorithm for the aggregator to coordinate all DS units to provide the

power balancing service. In particular, the real-time sub-problems P2(a) and P2(b) are solved by the aggregator

in a centralized way. However, since the DS units may belong to different users, they may not be willing to

relinquish direct control of charging and discharging to the aggregator. In addition, the computational complexity

of centralized control would grow too quickly as the number of DS units increases. In this section, we employ

Lagrange dual decomposition and adapt a fast iterative algorithm to solve P2(a) and P2(b) distributively. Since

energy surplus and energy deficit cannot happen simultaneously and their analyses are similar, in the following,

we focus on the energy surplus problem P2(a).

2.3.1 Lagrange Dual Decomposition

In P2(a), since Ji,t ≥ ai > 0 and Di,c(·) is strictly convex, the objective function is strictly convex, which

means that there is at most one global minimizer. Additionally, since the objective function is continuous and the

constraint set of x is compact, there is at least one minimizer. Therefore, P2(a) has a unique solution.


However, we note that the term $C_s\left(g_t - \sum_{i=1}^{N} x_i\right)$ in the objective function and the term $\sum_{i=1}^{N} x_i \le g_t$ in the constraint are functions of the charging amounts of all DS units, which hinders a distributed algorithm. To avoid such coupling, we first introduce an auxiliary variable $q \in [0, g_t]$ to represent the difference between the energy imbalance amount and the sum contribution of all DS units, i.e., $g_t - \sum_{i=1}^{N} x_i$, and consider the following problem.

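P2(a’):

$$\min_{\mathbf{x},\,q} \ \left[\sum_{i=1}^{N} J_{i,t}D_{i,c}(x_i) - V p_{m,t}\, x_i + K_{i,t}\eta_{i,c}\, x_i\right] + V C_s(q)$$

$$\text{s.t.} \quad 0 \le x_i \le r_{i,\max}, \quad 0 \le q \le g_t, \tag{2.19}$$

$$\sum_{i=1}^{N} x_i + q = g_t. \tag{2.20}$$

It is clear that P2(a’) and P2(a) are equivalent and have the same unique solution in x.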

Next we associate the equality constraint (2.20) with a Lagrange multiplier λ. The partial Lagrangian of

P2(a’) is

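$$F_t(\mathbf{x}, q, \lambda) = \left[\sum_{i=1}^{N} J_{i,t}D_{i,c}(x_i) - V p_{m,t}\, x_i + K_{i,t}\eta_{i,c}\, x_i\right] + V C_s(q) + \lambda\left(g_t - \sum_{i=1}^{N} x_i - q\right).$$

The dual function Gt(λ) is defined as the partial minimum of Ft(x, q, λ) with respect to the primal variables x and q:

$$G_t(\lambda) = \min_{\mathbf{x},\,q} \ F_t(\mathbf{x}, q, \lambda) \quad \text{s.t. (2.19)}.$$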

Note that Gt(λ) can be naturally decomposed into sub-problems for each DS unit and the aggregator. Specifically, with Gt(λ) divided by V, the sub-problem for each DS unit is

$$\min_{x_i} \ -p_{m,t}\, x_i - \frac{\lambda}{V}\, x_i + \frac{J_{i,t}}{V} D_{i,c}(x_i) + \frac{K_{i,t}\eta_{i,c}}{V}\, x_i \tag{2.21}$$

$$\text{s.t.} \quad 0 \le x_i \le r_{i,\max},$$

while the sub-problem for the aggregator is

$$\min_{q} \ C_s(q) + \frac{\lambda}{V}(g_t - q) \quad \text{s.t.} \quad 0 \le q \le g_t. \tag{2.22}$$

In (2.21), by interpreting $\frac{\lambda}{V}$ as pc,t, the unit price for charging service as defined in Section 2.1.1, we can view the objective of the i-th DS unit as minimizing the weighted sum of its different costs. By the Karush-Kuhn-Tucker (KKT) conditions, given λ, we obtain the unique solution of (2.21) in closed form: $\left[(D_{i,c}')^{-1}\left(\frac{V p_{m,t} + \lambda - K_{i,t}\eta_{i,c}}{J_{i,t}}\right)\right]_0^{r_{i,\max}}$, where $[x]_a^b$ denotes the projection of x onto the interval [a, b]. In the optimization problem (2.22), the aggregator minimizes its cost, including the external energy cost and the payment to all DS units. Again, the unique solution of (2.22) is found in closed form: $\left[(C_s')^{-1}\left(\frac{\lambda}{V}\right)\right]_0^{g_t}$. Thus, for any given λ, there is a unique solution for both (2.21) and (2.22). Consequently, the dual function Gt(λ) is continuously differentiable in R [75].

Algorithm 2.2: Distributed algorithm to solve the dual of P2(a’).

begin Aggregator’s algorithm:
  Initialize: $k = 1$; $\gamma^1 = \lambda^0 \in \mathbb{R}$; $\nu^1 = 1$; $\mu, \epsilon > 0$.
  repeat
    Broadcast $\gamma^k$; receive $x_i^k$, $\forall i$.
    $q^k \leftarrow \left[(C_s')^{-1}\left(\frac{\gamma^k}{V}\right)\right]_0^{g_t}$;
    $\lambda^k \leftarrow \gamma^k + \mu\left(g_t - \sum_{i=1}^{N} x_i^k - q^k\right)$;
    $\nu^{k+1} \leftarrow \frac{1+\sqrt{1+4(\nu^k)^2}}{2}$;
    $\gamma^{k+1} \leftarrow \lambda^k + \frac{\nu^k - 1}{\nu^{k+1}}\left(\lambda^k - \lambda^{k-1}\right)$; $k \leftarrow k + 1$.
  until $\left|g_t - \sum_{i=1}^{N} x_i^k - q^k\right| < \epsilon$;
  Output: $q_t^* = q^k$.

begin DS’s algorithm:
  repeat
    Receive $\gamma^k$;
    $x_i^k \leftarrow \left[(D_{i,c}')^{-1}\left(\frac{V p_{m,t} + \gamma^k - K_{i,t}\eta_{i,c}}{J_{i,t}}\right)\right]_0^{r_{i,\max}}$;
    send $x_i^k$.
  until;
  Output: $x_{i,t}^* = x_i^k$.

The Lagrange dual problem is defined as the maximization of the dual function:

$$\max_{\lambda} \ G_t(\lambda). \tag{2.23}$$

Denote the optimal solution of the dual problem at time slot t as $\lambda_t^*$, and the unique optimal solution of P2(a’) at time slot t as $(\mathbf{x}_t^*, q_t^*)$. Since Slater’s condition holds for P2(a’), strong duality holds between the primal P2(a’) and its dual (2.23) [70]. Thus, at time slot t, using $\lambda_t^*$, we can recover the optimal solution $(\mathbf{x}_t^*, q_t^*)$ by solving the sub-problems (2.21) and (2.22) [75].

To solve (2.23), we propose a fast iterative algorithm presented in the next subsection.

2.3.2 Dual Maximization with FISTA and Convergence Analysis

Since we consider the real-time power balancing problem with a short time interval, it is highly desirable that the

algorithm can converge quickly in each time slot. To this end, we adapt a fast iterative shrinkage-thresholding

algorithm (FISTA) [68] to solve the dual problem (2.23). The proposed algorithm is summarized in Algorithm

2.2. Compared with the standard gradient algorithm in which the Lagrange multiplier λk is updated from the

previous iterate λk−1, in Algorithm 2.2, λk is updated from γk, which is designed as a linear combination of the

previous two iterates λk−1 and λk−2. Nonetheless, the extra computation is marginal.

Below we show that the gradient of the dual function is Lipschitz continuous, and determine its Lipschitz

constant. The result is crucial for the convergence analysis of Algorithm 2.2.

Lemma 2.5 Under the conditions C1 and C2, the gradient of the dual function is Lipschitz continuous, i.e., we have $|G_t'(\lambda_1) - G_t'(\lambda_2)| \le \rho|\lambda_1 - \lambda_2|$ for all $\lambda_1, \lambda_2 \in \mathbb{R}$, where the Lipschitz constant ρ is given by

$$\rho \triangleq (N+1)\max\left\{\frac{1}{a_1 d_{1,l}}, \cdots, \frac{1}{a_N d_{N,l}}, \frac{1}{V c_l}\right\} \tag{2.24}$$

where di,l and cl are given in C1 and C2, respectively, and ai is the cost cushion parameter in (2.12).


In (2.24), since ai > 0, we have 0 < ρ <∞. Using Lemma 2.5, we now prove the convergence of Algorithm

2.2.

Theorem 2.3 Under the conditions C1 and C2, in Algorithm 2.2, with step size $\mu \in (0, \mu_0]$ where $\mu_0 \triangleq 1/\rho$, the sequence $\{\lambda^k\}$ converges to the optimum $\lambda_t^*$ of the dual problem (2.23). Furthermore, for any k ≥ 1,

$$G_t(\lambda_t^*) - G_t(\lambda^k) \le \frac{2|\lambda^0 - \lambda_t^*|^2}{\mu (k+1)^2},$$

where $\lambda^0$ is the initial value of λ.

Proof: Given Lemma 2.5, the proof is similar to that in [68] with minor modification. See Appendix 2.6.6.
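The iteration in Algorithm 2.2 can be sketched in a few lines of Python. The sketch below is a single-process simulation, not a distributed implementation: the aggregator and DS updates run in one loop, and the stopping test is applied to the residual at $\gamma^k$ before the multiplier update. It assumes the cost functions of Section 2.4.1 ($D_{i,c}(x) = x^{1.5}$, $C_s(q) = 7q^{1.2}$) so that the closed-form responses are explicit. The instance in `run_example`, the curvature bounds `d_l` and `c_l` (taken here as lower bounds on the second derivatives over the feasible ranges, which is an assumption about conditions C1 and C2), and the use of $J_{i,t}$ in place of $a_i$ when forming ρ are all illustrative choices.

```python
import math


def clip(value, lo, hi):
    """Projection of a scalar onto [lo, hi]."""
    return min(max(value, lo), hi)


def ds_response(gamma, V, p_m, J, K, eta_c, r_max):
    """Closed-form minimizer of (2.21) for D(x) = x**1.5, i.e. D'(x) = 1.5*sqrt(x)."""
    z = (V * p_m + gamma - K * eta_c) / J
    x = (z / 1.5) ** 2 if z > 0.0 else 0.0
    return clip(x, 0.0, r_max)


def aggregator_response(gamma, V, g):
    """Closed-form minimizer of (2.22) for C_s(q) = 7*q**1.2, i.e. C_s'(q) = 8.4*q**0.2."""
    z = gamma / V
    q = (z / 8.4) ** 5 if z > 0.0 else 0.0
    return clip(q, 0.0, g)


def fista_dual(g, V, p_m, J, K, eta_c, r_max, mu, lam0=0.0, eps=1e-2,
               max_iter=200_000):
    """Accelerated dual ascent on (2.23); returns (x, q, multiplier, iterations)."""
    gamma, lam_prev, nu = lam0, lam0, 1.0
    for k in range(1, max_iter + 1):
        x = [ds_response(gamma, V, p_m, J[i], K[i], eta_c[i], r_max[i])
             for i in range(len(J))]
        q = aggregator_response(gamma, V, g)
        residual = g - sum(x) - q  # gradient of the dual function at gamma
        if abs(residual) < eps:
            return x, q, gamma, k
        lam = gamma + mu * residual                        # gradient step from gamma
        nu_next = (1.0 + math.sqrt(1.0 + 4.0 * nu * nu)) / 2.0
        gamma = lam + (nu - 1.0) / nu_next * (lam - lam_prev)  # momentum step
        lam_prev, nu = lam, nu_next
    raise RuntimeError("dual iteration did not reach the tolerance")


def run_example():
    """Hypothetical 3-unit surplus instance; all numbers are illustrative."""
    g, V, p_m = 2.0, 1.0, 7.0
    J, K = [1.0, 1.0, 1.0], [0.0, 0.5, 1.0]
    eta_c, r_max = [0.8] * 3, [1.0] * 3
    d_l = 0.75                 # min of D''(x) = 0.75 / sqrt(x) on (0, 1]
    c_l = 1.68 * g ** -0.8     # min of C_s''(q) = 1.68 * q**-0.8 on (0, g]
    rho = (len(J) + 1) * max(1.0 / (min(J) * d_l), 1.0 / (V * c_l))
    return fista_dual(g, V, p_m, J, K, eta_c, r_max, mu=1.0 / rho)
```

In this instance the first unit is saturated at its maximum charging amount, the other two charge at interior levels, and the multiplier settles at a negative value, which matches the remark in Section 2.3.3 that the price signal can be negative.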

Theorem 2.3 suggests that Algorithm 2.2 has a worst-case convergence rate of $O(1/k^2)$. In comparison, the standard gradient algorithm, which is used in [21]⁴ among many others, has a worst-case convergence rate of $O(1/k)$. Also from Theorem 2.3, the step size µ is upper bounded by µ0, the inverse of the Lipschitz constant ρ. Based on the definition of ρ, roughly speaking, the larger the number of DS units, the smaller µ0 and hence the slower the algorithm, which conforms to our intuition.

Furthermore, from (2.24), µ0 is a strictly increasing function of the cost cushion ai in the interval $\left(0, \frac{V c_l}{d_{i,l}}\right]$. Therefore, for the sole purpose of faster convergence, a larger ai should be chosen. However, recall from Section 2.2.4 that a smaller ai is desirable for minimizing the performance gap. Therefore, ai in fact acts as a tuning parameter for balancing the trade-off between system performance and convergence speed.
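The dependence of the step-size limit on the cost cushion can be seen directly from (2.24). The sketch below computes ρ and $\mu_0 = 1/\rho$; the function names and the sample values of `d_l`, `V`, and `c_l` used afterwards are hypothetical.

```python
def lipschitz_constant(a, d_l, V, c_l):
    """rho from (2.24); `a` and `d_l` are per-unit lists, V and c_l are scalars."""
    worst_unit = max(1.0 / (a_i * d_i) for a_i, d_i in zip(a, d_l))
    return (len(a) + 1) * max(worst_unit, 1.0 / (V * c_l))


def max_step_size(a, d_l, V, c_l):
    """Largest step size mu_0 = 1 / rho allowed by Theorem 2.3."""
    return 1.0 / lipschitz_constant(a, d_l, V, c_l)


# Hypothetical setting: 150 identical units with d_l = 0.5, V = 1, c_l = 2,
# so the cushion threshold V * c_l / d_l equals 4.
N, D_L, V, C_L = 150, 0.5, 1.0, 2.0
mu_default = max_step_size([4.0] * N, [D_L] * N, V, C_L)
mu_quarter = max_step_size([1.0] * N, [D_L] * N, V, C_L)
mu_beyond = max_step_size([8.0] * N, [D_L] * N, V, C_L)
```

Here `mu_quarter` is one quarter of `mu_default`, while raising the cushion beyond the threshold (`mu_beyond`) brings no further increase, since the term $1/(V c_l)$ then dominates the maximum.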

2.3.3 Price Signaling pc,t

We now look at the property of the optimal charging price signal $p_{c,t}^* = \frac{\lambda_t^*}{V}$. Since DS units have energy gain by charging, $\frac{\lambda_t^*}{V}$ can be negative. Below, we give a condition under which $\frac{\lambda_t^*}{V}$ is lower bounded and there exists a DS unit willing to provide power balancing service.

Proposition 2.2 At time slot t, if there exists a DS unit j such that $-p_{m,t} + \frac{J_{j,t}}{V}D_{j,c}'(x) + \frac{K_{j,t}\eta_{j,c}}{V} < C_s'(x)$, $\forall x \in (0, \epsilon)$, where $\epsilon$ is an arbitrarily small positive number, then the price signal $\frac{\lambda_t^*}{V}$ is lower bounded as

$$\frac{\lambda_t^*}{V} > \min_{1\le i\le N}\left\{-p_{m,t} + \frac{J_{i,t}}{V}D_{i,c}'(0) + \frac{K_{i,t}\eta_{i,c}}{V}\right\}$$

and the charging amount $x_{j,t}^* > 0$.


Proposition 2.2 essentially states that, as long as there is a DS unit whose effective marginal cost, considering

both the energy gain and the charging and discharging costs, is strictly less than the marginal cost of the external

energy source, it is beneficial for the aggregator to incentivize the DS units to provide power balancing service

(even though the price signal can be negative).


2.3.4 Main Algorithm of Distributed Implementation

In Algorithm 2.3, we formally state the real-time distributed algorithm for the aggregator-DS system to provide power balancing service. For presentation simplicity, we focus on the energy surplus case only.

Algorithm 2.3: Distributed implementation for real-time power balancing.

begin Aggregator’s algorithm:
  Repeat at each time:
  1. Observe gt.
  2. Broadcast 1s,t.
  3. Execute aggregator’s algorithm in Algorithm 2.2.

begin DS’s algorithm:
  Initialize: Ji,0 = ai; Ki,0 = si,0 − βi.
  Repeat at each time:
  1. Observe pm,t, 1s,t, Ji,t, and Ki,t.
  2. Execute DS’s algorithm in Algorithm 2.2.
  3. Update Ji,t and Ki,t based on (2.13) and (2.14), respectively.

We now discuss the information required in Algorithm 2.3 and show that Algorithm 2.3 can be easily implemented in practice. As in Algorithm 2.1, in Algorithm 2.3, the statistical information of the system is not

required, and only instantaneous observations are needed, which can be obtained either locally or through simple

communication. Specifically, at each time slot t, the aggregator observes the energy imbalance signal gt, and

each DS unit observes the electricity price pm,t, the indicator of the energy imbalance signal 1s,t, and the queue

backlogs Ji,t and Ki,t. To initialize Ji,0 and Ki,0 at each DS unit, the aggregator broadcasts V , cl, and cmax

to all DS units at the initial time. For the aggregator to determine Vmax in (2.16), the values of ηi,c, ηi,d, and

si,max − si,min − (ηi,c + ηi,d)ri,max at each DS unit are required. In practice, however, it may be unnecessary

to acquire all such information for determining Vmax. Note that as argued in Section 2.2.2 the maximum allowed

charging and discharging amount ri,max is much smaller compared with the energy capacity. Thus, when the

battery has high charging and discharging efficiencies, i.e., ηi,c and ηi,d are close to 1, approximately only the

minimum battery capacity among all DS units is required for the design of Vmax.

2.4 Simulation Results

In the previous sections, we have proven the analytical performance of the proposed algorithm. In this section,

we present numerical evaluation of the algorithm. Previous works on power balancing with a distributed aggregator-DS system, e.g., [28,29,63], study system models that are different from and simpler than ours, so the algorithms proposed there are not applicable for numerical comparison. Instead, we consider a greedy algorithm as a benchmark.

⁴ The subgradient algorithm used in [21] reduces to the gradient algorithm when the dual function is differentiable.


Table 2.1: Number of iterations for $\left|g_t - \sum_{i=1}^{N} x_i^k - q^k\right| < 0.01$.

µ/µ0                       1      10     20     50     100
ai = Vmax cl/di,l          279    105    85     45     26
ai = Vmax cl/(4 di,l)      964    411    183    131    44

2.4.1 Simulation Setup

We have developed an aggregator-DS model in Matlab. Unless otherwise specified, the following parameters

are set as default. The aggregator is connected with N = 150 DS units, each with energy capacity si,cap = 23

kWh and maximum charging/discharging rate 6.6 kW (based on the 2012 Ford Focus Electric). The duration of

each time slot is ∆t = 30 seconds. Then, the maximum allowed charging/discharging amount of each DS unit is $r_{i,\max} = \frac{6.6\,\Delta t}{3600}$ kWh. The charging and discharging efficiency coefficients are ηi,c = 0.8 and ηi,d = 1.2, respectively. The preferred energy range of each DS unit is [0.1si,cap, 0.9si,cap], from which the initial energy state si,0 is uniformly drawn. The degradation cost functions of the charging/discharging amount are $D_{i,c}(x) = D_{i,d}(x) = x^{1.5}$, and the upper bound is $l_{i,u} = \left(\frac{r_{i,\max}}{2}\right)^{1.5}$. The energy imbalance signal gt is i.i.d. over time and is sampled uniformly from [−gmax, gmax], where $g_{\max} = \sum_{i=1}^{N} r_{i,\max}$. The unit market electricity price is 7 cents/kWh, which is the current off-peak electricity price in Ontario [76]. The external cost functions for clearing energy surplus and energy deficit are $C_s(x) = C_d(x) = 7x^{1.2}$. To determine the charging/discharging amounts of DS units at each time slot t, we apply the proposed algorithm in Algorithm 2.3 with V = Vmax. The cost cushion parameter is set to $a_i = \frac{V_{\max} c_l}{d_{i,l}}$ by default for fast convergence. For all figures, we omit drawing confidence intervals since they are small.
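For reference, the derived quantities implied by these defaults can be computed as follows. This is only a restatement of the setup in Python (the original model was developed in Matlab); the constant and function names are illustrative.

```python
N = 150              # number of DS units
S_CAP_KWH = 23.0     # energy capacity per unit
RATE_KW = 6.6        # maximum charging/discharging rate
DELTA_T_S = 30.0     # slot duration in seconds

r_max = RATE_KW * DELTA_T_S / 3600.0   # max charging/discharging amount per slot, kWh
g_max = N * r_max                      # largest imbalance magnitude, kWh
l_u = (r_max / 2.0) ** 1.5             # upper bound on time-averaged degradation cost
s_min, s_max = 0.1 * S_CAP_KWH, 0.9 * S_CAP_KWH  # preferred energy range, kWh


def degradation_cost(x):
    """D_{i,c}(x) = D_{i,d}(x) = x**1.5."""
    return x ** 1.5


def external_cost(x):
    """C_s(x) = C_d(x) = 7 * x**1.2."""
    return 7.0 * x ** 1.2
```

With a 30-second slot, each unit can move at most 0.055 kWh per slot, so the 150 units together can absorb or supply up to 8.25 kWh, which is the value of gmax used in the simulations.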

2.4.2 Convergence of Distributed Algorithm

In Table 2.1, we show the convergence speed of Algorithm 2.2 by listing the number of iterations required for $\left|g_t - \sum_{i=1}^{N} x_i^k - q^k\right| < 0.01$, when the current energy imbalance signal gt = gmax. From Table 2.1, when the

step size µ = µ0, with the default ai, the algorithm converges within 279 steps; in contrast, when ai is a quarter

of the default value, the algorithm takes 964 steps to converge as the applied µ0 now is much smaller. We also

observe that the convergence speed can be significantly improved by increasing µ, indicating the robustness of the

algorithm to the step size design.

2.4.3 Comparison with Greedy Algorithm

As a benchmark, we consider a greedy algorithm that is applied to the same system model as ours but aims to

independently minimize the system cost at each time slot. Specifically, the charging and discharging amounts of the DS units under the greedy algorithm are determined by the following optimization problem at each time slot t.

$$\min_{\mathbf{x}_t,\,\mathbf{y}_t} \ w_t \quad \text{s.t.} \quad \text{(2.5), (2.6), (2.7)}, \quad \mathbf{1}_{s,t}D_{i,c}(x_{i,t}) + \mathbf{1}_{d,t}D_{i,d}(y_{i,t}) \le l_{i,u}, \ \forall i.$$

The above problem produces a feasible solution for P1 at each time t and can be implemented distributively

following the technique in Section 2.3.

In Figs. 2.2 and 2.3, we compare the proposed algorithm with the greedy algorithm over a wide range of parameter values. In particular, in Fig. 2.2, we exhibit the time-averaged system cost vs. the number of participating DS units under various values of si,max. For all cases, the proposed algorithm achieves much lower system cost, with cost reduction ranging from 11% to 80%. When the number of DS units increases, the system cost of both algorithms decreases. For the proposed algorithm, Fig. 2.2 indicates that 150 DS units are enough for the considered power balancing service, while for the greedy algorithm, more DS units are needed to further cut down the system cost. Furthermore, when si,max increases, as opposed to the greedy algorithm which cannot benefit from the increased energy range, the proposed algorithm exhibits performance improvement in general.

Figure 2.2: Time-averaged system cost vs. number of DS units under various si,max. (Curves: proposed and greedy algorithms for si,max = 0.3si,cap, 0.5si,cap, 0.7si,cap, and 0.9si,cap.)

Figure 2.3: Normalized time-averaged system cost vs. $g_{\max}/\sum_{i=1}^{150} r_{i,\max}$ under various ∆t. (Curves: proposed and greedy algorithms for ∆t = 5 s, 10 s, 30 s, and 60 s.)

In Fig. 2.3, we consider four different values of ∆t and display the normalized (over ∆t) time-averaged system cost vs. the ratio $g_{\max}/\sum_{i=1}^{150} r_{i,\max}$ for different gmax values. For the proposed algorithm, the system cost grows

with ∆t and gmax. We observe that when the energy imbalance amount is low, the system cost achieved by the

proposed algorithm increases at a much lower rate than that of the greedy algorithm. When the energy imbalance

amount is high, even though the system cost achieved by both algorithms increases at nearly the same rate due to

saturation of the DS capacity, the proposed algorithm still substantially outperforms the greedy algorithm.


2.5 Summary

We have considered a comprehensive aggregator-DS system model to provide real-time power balancing service

to a power grid. To minimize the long-term system cost, we have developed a real-time distributed algorithm that does not require the statistics of the system and lets each DS unit determine its own charging and discharging amounts. The algorithm provably converges quickly and asymptotically achieves the optimal performance as the

DS capacity increases. Also, a novel cost cushion parameter has been introduced that tunes the trade-off between

system performance and convergence speed. In simulations, we have compared the proposed algorithm with a

greedy algorithm over a wide range of parameter values, and demonstrated that the algorithm can offer substantial

performance gains.

In the system model, the DS units are assumed to exclusively provide real-time power balancing service

when they participate in the aggregator-DS system. In particular, their own charging needs (e.g., charging the

battery to a certain level before a deadline) are not considered, except that the energy state is ensured to be within

a preferred range. In a more general scenario, the DS units, e.g., batteries in EVs, may need to conduct self

charging while providing power balancing service. The challenging problem of jointly optimizing self charging

and power balancing remains open and is left for future research.

2.6 Appendices

2.6.1 Proof of Proposition 2.1

Based on the definition of L(Θt), we have

$$L(\Theta_{t+1}) - L(\Theta_t) = \frac{1}{2}\sum_{i=1}^{N}\left(J_{i,t+1}^2 - J_{i,t}^2 + K_{i,t+1}^2 - K_{i,t}^2\right). \tag{2.25}$$

Using the fact that for any q ≥ 0, b ≥ 0, and a ≥ 0, we have $(\max\{q - b, 0\} + a)^2 \le q^2 + a^2 + b^2 + 2q(a - b)$, we can upper bound $J_{i,t+1}^2 - J_{i,t}^2$ as follows:

$$J_{i,t+1}^2 - J_{i,t}^2 \le \tilde{l}_{i,u}^{\,2} + \left(\max\{D_{i,c}(r_{i,\max}),\, D_{i,d}(r_{i,\max})\} + a_i\right)^2 + 2J_{i,t}\left[\mathbf{1}_{s,t}D_{i,c}(x_{i,t}) + \mathbf{1}_{d,t}D_{i,d}(y_{i,t}) - l_{i,u}\right], \tag{2.26}$$

where $\tilde{l}_{i,u} = l_{i,u} + a_i$. By the update equation of Ki,t in (2.14), $K_{i,t+1}^2 - K_{i,t}^2$ can be upper bounded by

$$K_{i,t+1}^2 - K_{i,t}^2 \le 2K_{i,t}\, b_{i,t} + r_{i,\max}^2. \tag{2.27}$$

Imposing the upper bounds (2.26) and (2.27) on the right hand side of (2.25), taking conditional expectation on both sides, and then adding the term $V\mathbb{E}[w_t \mid \Theta_t]$ gives the upper bound of the drift-plus-cost function in Proposition 2.1.
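The elementary inequality used to obtain (2.26) can be sanity-checked numerically. The sketch below samples random nonnegative triples and reports the smallest slack between the two sides; the function names, the sampling range, and the seed are arbitrary choices for illustration.

```python
import random


def drift_bound_slack(q, b, a):
    """Right side minus left side of (max{q-b,0}+a)^2 <= q^2 + a^2 + b^2 + 2q(a-b)."""
    lhs = (max(q - b, 0.0) + a) ** 2
    rhs = q * q + a * a + b * b + 2.0 * q * (a - b)
    return rhs - lhs


def min_slack_over_samples(samples=10_000, seed=0):
    """Smallest slack over random nonnegative (q, b, a) drawn from [0, 10]^3."""
    rng = random.Random(seed)
    return min(
        drift_bound_slack(rng.uniform(0, 10), rng.uniform(0, 10), rng.uniform(0, 10))
        for _ in range(samples)
    )
```

The slack equals $2ab$ when $q \ge b$ and $(q - b)^2 + 2qa$ otherwise, so it is never negative; the sampled minimum agrees with this up to floating-point rounding.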

2.6.2 Proof of Lemma 2.2

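1) Consider gt > 0. Suppose that when $K_{i,t} > \frac{V(p_{m,\max}+c_{\max})}{\eta_{i,c}}$, the optimal solution under the proposed algorithm is $\tilde{\mathbf{x}}_t$ with $\tilde{x}_{i,t} > 0$. We show that we can find another solution $\hat{\mathbf{x}}_t$ with $\hat{x}_{j,t} = \tilde{x}_{j,t}$, $\forall j \ne i$, and $\hat{x}_{i,t} = 0$, resulting in a strictly smaller objective value, which is a contradiction.

Using the objective function of P2(a), this is equivalent to showing that

$$\left[\sum_{j=1}^{N} J_{j,t}D_{j,c}(\tilde{x}_{j,t}) - V p_{m,t}\tilde{x}_{j,t} + K_{j,t}\eta_{j,c}\tilde{x}_{j,t}\right] + V C_s\!\left(g_t - \sum_{j=1}^{N}\tilde{x}_{j,t}\right) > \left[\sum_{j \ne i} J_{j,t}D_{j,c}(\tilde{x}_{j,t}) - V p_{m,t}\tilde{x}_{j,t} + K_{j,t}\eta_{j,c}\tilde{x}_{j,t}\right] + V C_s\!\left(g_t - \sum_{j=1}^{N}\tilde{x}_{j,t} + \tilde{x}_{i,t}\right)$$

which is equivalent to

$$\begin{aligned} J_{i,t}D_{i,c}(\tilde{x}_{i,t}) - V p_{m,t}\tilde{x}_{i,t} + K_{i,t}\eta_{i,c}\tilde{x}_{i,t} &> V\left[C_s\!\left(g_t - \sum_{j=1}^{N}\tilde{x}_{j,t} + \tilde{x}_{i,t}\right) - C_s\!\left(g_t - \sum_{j=1}^{N}\tilde{x}_{j,t}\right)\right] \\ &= V\tilde{x}_{i,t}\, C_s'(\epsilon) \end{aligned} \tag{2.28}$$

where (2.28) is derived by the mean value theorem with $\epsilon \in \left(g_t - \sum_{j=1}^{N}\tilde{x}_{j,t},\ g_t - \sum_{j=1}^{N}\tilde{x}_{j,t} + \tilde{x}_{i,t}\right)$. Since $J_{i,t}D_{i,c}(\tilde{x}_{i,t}) \ge 0$, from (2.28), it suffices to show that

$$\left[K_{i,t}\eta_{i,c} - V p_{m,t} - V C_s'(\epsilon)\right]\tilde{x}_{i,t} > 0. \tag{2.29}$$

Since $\tilde{x}_{i,t} > 0$, $p_{m,t} \le p_{m,\max}$, and $C_s'(\epsilon) \le c_{\max}$, (2.29) is true by using the condition that $K_{i,t} > \frac{V(c_{\max}+p_{m,\max})}{\eta_{i,c}}$.

2) Consider gt < 0. Suppose that when $K_{i,t} < V\left(p_{m,\min} - \frac{c_{\max}}{\eta_{i,d}}\right)$, the optimal solution under the proposed algorithm is $\tilde{\mathbf{y}}_t$ with $\tilde{y}_{i,t} > 0$. Then there is a contradiction since we can construct another solution $\hat{\mathbf{y}}_t$ with $\hat{y}_{j,t} = \tilde{y}_{j,t}$, $\forall j \ne i$, and $\hat{y}_{i,t} = 0$, which results in a strictly smaller objective value. The proof is similar to that in 1) and is omitted here.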


2.6.3 Proof of Lemma 2.3

The proof proceeds by induction over time t. The base case trivially holds. For the inductive step, first consider the upper bound. Assume that Ki,t ≤ si,max − βi holds at time slot t. Consider the following two cases.

Case 1: $\frac{V(p_{m,\max}+c_{\max})}{\eta_{i,c}} < K_{i,t} \le s_{i,\max} - \beta_i$. (It is easy to check that $\frac{V(p_{m,\max}+c_{\max})}{\eta_{i,c}} < s_{i,\max} - \beta_i$ since $V \le V_{\max}$.) For gt > 0, from Lemma 2.2, $x_{i,t}^* = 0$; therefore, based on the update equation (2.14), we have $K_{i,t+1} = K_{i,t} \le s_{i,\max} - \beta_i$. For gt < 0, we have $K_{i,t+1} = K_{i,t} - \eta_{i,d}\, y_{i,t} \le K_{i,t} \le s_{i,\max} - \beta_i$.

Case 2: $K_{i,t} \le \frac{V(p_{m,\max}+c_{\max})}{\eta_{i,c}}$. From (2.14), $K_{i,t+1} \le \frac{V(p_{m,\max}+c_{\max})}{\eta_{i,c}} + \eta_{i,c}\, r_{i,\max} \le s_{i,\max} - \beta_i$, where the last inequality holds since $V \le V_{\max}$.

We now consider the lower bound. Assume that Ki,t ≥ si,min − βi holds at time slot t. Consider the following two cases.

Case 1′: $s_{i,\min} - \beta_i \le K_{i,t} < V\left(p_{m,\min} - \frac{c_{\max}}{\eta_{i,d}}\right)$. (It is easy to check that $s_{i,\min} - \beta_i < V\left(p_{m,\min} - \frac{c_{\max}}{\eta_{i,d}}\right)$ since $r_{i,\max} > 0$.) For gt < 0, from Lemma 2.2, $y_{i,t}^* = 0$; therefore, $K_{i,t+1} = K_{i,t} \ge s_{i,\min} - \beta_i$. For gt > 0, from (2.14), $K_{i,t+1} = K_{i,t} + \eta_{i,c}\, x_{i,t} \ge K_{i,t} \ge s_{i,\min} - \beta_i$.

Case 2′: $K_{i,t} \ge V\left(p_{m,\min} - \frac{c_{\max}}{\eta_{i,d}}\right)$. From (2.14), $K_{i,t+1} \ge V\left(p_{m,\min} - \frac{c_{\max}}{\eta_{i,d}}\right) - \eta_{i,d}\, r_{i,\max} \ge s_{i,\min} - \beta_i$, where the last inequality holds based on the definition of βi.


2.6.4 Proof of Theorem 2.1

Consider the problem P2, and denote the optimal long-term system cost for P2 as $\tilde{f}$. We first prove the following lemma, which will be used later.

Lemma 2.6 For P2, there exists a stationary randomized regulation allocation solution $(\mathbf{x}_t^s, \mathbf{y}_t^s)$ that only depends on the system state At, and at the same time satisfies the following conditions:

$$\mathbb{E}[w_t^s] \le \tilde{f}, \tag{2.30}$$

$$\mathbb{E}\left[\mathbf{1}_{s,t}D_{i,c}(x_{i,t}^s) + \mathbf{1}_{d,t}D_{i,d}(y_{i,t}^s) - l_{i,u}\right] \le 0, \ \forall i, \tag{2.31}$$

$$\mathbb{E}[b_{i,t}^s] = 0, \ \forall i, \tag{2.32}$$

where the expectations are taken over the randomness of the system and the randomness of $(\mathbf{x}_t^s, \mathbf{y}_t^s)$.

Proof: The claims above can be derived from Theorem 4.5 in [16]. In particular, that theorem implies that

the sufficient conditions for the existence of a stationary and randomized algorithm as described in Lemma 2.6

are as follows: first, the system state At is stationary; second, the system satisfies the boundedness assumptions

and the law of large numbers; and third, P2 is feasible. It is easy to check that P2 is feasible. In addition, since

we have assumed that At is i.i.d. and the variables gt, xi,t, yi,t, and pm,t are bounded, these sufficient conditions

are all met in our problem. Therefore, the conclusion in Lemma 2.6 holds.

Since the proposed algorithm minimizes the upper bound of the drift-plus-cost function at each time, plugging $(\mathbf{x}^s_t, \mathbf{y}^s_t)$ into the right-hand side of (2.17) and using (2.30), (2.31), and (2.32) yields

$$\Delta(\Theta_t) + V E[w_t | \Theta_t] \le B + V f \le B + V f^{opt}, \tag{2.33}$$

where the last inequality holds because P2 is a relaxation of P1, so its optimal cost satisfies $f \le f^{opt}$.

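We first prove the result in 2). Taking expectations over $\Theta_t$ on both sides of (2.33) and summing over $t \in \{0, \cdots, T-1\}$ gives

$$E[L(\Theta_T)] - E[L(\Theta_0)] + V \sum_{t=0}^{T-1} E[w_t] \le (B + V f^{opt}) T. \tag{2.34}$$

Since $L(\Theta_T) \ge 0$, dropping $E[L(\Theta_T)]$ from (2.34), rearranging, and dividing by $VT$ gives

$$\frac{1}{T} \sum_{t=0}^{T-1} E[w_t] \le \frac{B + V f^{opt}}{V} + \frac{E[L(\Theta_0)]}{TV}. \tag{2.35}$$

Taking $T \to \infty$ gives $\lim_{T\to\infty} \frac{1}{T} \sum_{t=0}^{T-1} E[w_t] \le \frac{B}{V} + f^{opt}$, $V \in (0, V_{\max}]$, which is exactly the conclusion in 2).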

We now prove the result in 1). From (2.34), we have

$$E[L(\Theta_T)] \le E[L(\Theta_0)] + \left[B + V(f^{opt} - f_{\min})\right] T, \tag{2.36}$$

where $f_{\min} \triangleq -p_{m,\max} \sum_{i=1}^{N} r_{i,\max}$. Using the fact that $E[J_{i,T}] \le \sqrt{E[J^2_{i,T}]} \le \sqrt{2E[L(\Theta_T)]}$, from (2.36) we get

$$E[J_{i,T}] \le \sqrt{2\left(E[L(\Theta_0)] + \left[B + V(f^{opt} - f_{\min})\right] T\right)}. \tag{2.37}$$

Dividing both sides of (2.37) by $T$ and taking limits gives $\lim_{T\to\infty} \frac{E[J_{i,T}]}{T} = 0$. Hence, by Lemma 2.1, the virtual queue $J_{i,t}$ is mean rate stable and constraint (2.8) holds. Using a similar argument, we can show that the virtual queue $K_{i,t}$ is mean rate stable and constraint (2.9) holds. Also, since we have proven in Lemma 2.4 that the energy state is bounded within the preferred range, $\{\mathbf{x}^*_t, \mathbf{y}^*_t\}$ is feasible for P1.


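The gradient of $G_t(\lambda)$ is $G'_t(\lambda) = g_t - \sum_{i=1}^{N} x_{i,t}(\lambda) - q_t(\lambda)$, with

$$x_{i,t}(\lambda) \triangleq \left[(D'_{i,c})^{-1}\left(\frac{V p_{m,t} + \lambda - K_{i,t}\eta_{i,c}}{J_{i,t}}\right)\right]_0^{r_{i,\max}} \quad \text{and} \quad q_t(\lambda) \triangleq \left[(C'_s)^{-1}\left(\frac{\lambda}{V}\right)\right]_0^{g_t}.$$

The second derivative of $G_t(\lambda)$, when it exists, is

$$G''_t(\lambda) = -\sum_{i=1}^{N} x'_{i,t}(\lambda) - q'_t(\lambda), \tag{2.38}$$

where $x'_{i,t}(\lambda)$, when it exists, is

$$x'_{i,t}(\lambda) = \begin{cases} \dfrac{1}{J_{i,t} D''_{i,c}(x_{i,t}(\lambda))}, & D'_{i,c}(0) \le \dfrac{V p_{m,t} + \lambda - K_{i,t}\eta_{i,c}}{J_{i,t}} \le D'_{i,c}(r_{i,\max}) \\ 0, & \text{otherwise,} \end{cases}$$

and $q'_t(\lambda)$, when it exists, is

$$q'_t(\lambda) = \begin{cases} \dfrac{1}{V C''_s(q_t(\lambda))}, & C'_s(0) \le \dfrac{\lambda}{V} \le C'_s(g_t) \\ 0, & \text{otherwise.} \end{cases}$$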

Assume that $\lambda_1 < \lambda_2$. Applying the mean value theorem to $G'_t(\cdot)$, we have

$$G'_t(\lambda_1) - G'_t(\lambda_2) = G''_t(\epsilon)(\lambda_1 - \lambda_2), \tag{2.39}$$

where $\epsilon \in (\lambda_1, \lambda_2)$. Using (2.38) in (2.39), we obtain

$$|G'_t(\lambda_1) - G'_t(\lambda_2)| = |G''_t(\epsilon)| \, |\lambda_1 - \lambda_2| = \left(\sum_{i=1}^{N} x'_{i,t}(\epsilon) + q'_t(\epsilon)\right) |\lambda_1 - \lambda_2| \le (N+1) \max\left\{\frac{1}{a_1 d_{1,l}}, \cdots, \frac{1}{a_N d_{N,l}}, \frac{1}{V c_l}\right\} |\lambda_1 - \lambda_2|,$$

where the conditions C1 and C2 as well as the fact that $J_{i,t} \ge a_i$ are used to derive the last inequality. When the second derivative of $G_t(\lambda)$ does not exist, we can replace the gradient in (2.39) by a subgradient and the result still holds [77].
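As a small numerical aid, the Lipschitz bound above can be evaluated directly. The sketch below assumes that conditions C1 and C2 (stated elsewhere in the thesis) supply lower bounds $d_{i,l}$ on $D''_{i,c}$ and $c_l$ on $C''_s$; the function name and all numerical values are hypothetical.

```python
def lipschitz_bound(a, d_lower, V, c_lower):
    """Evaluate rho = (N+1) * max{1/(a_i d_{i,l}), ..., 1/(V c_l)}.

    a[i]       : lower bound a_i on the virtual queue J_{i,t}
    d_lower[i] : assumed lower bound d_{i,l} on D''_{i,c} (condition C1)
    V          : Lyapunov weight
    c_lower    : assumed lower bound c_l on C''_s (condition C2)
    """
    if len(a) != len(d_lower):
        raise ValueError("a and d_lower must have the same length")
    n = len(a)
    terms = [1.0 / (ai * di) for ai, di in zip(a, d_lower)]
    terms.append(1.0 / (V * c_lower))
    return (n + 1) * max(terms)


# Hypothetical two-unit example.
rho = lipschitz_bound(a=[0.5, 1.0], d_lower=[2.0, 4.0], V=10.0, c_lower=0.05)
step = 1.0 / rho  # candidate step size mu_0 = 1/rho
```

Any constant at least as large as this value is also a valid Lipschitz constant, which is why a smaller step size remains admissible.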



From Theorem 4.4 of [68], we have the following conclusion: if in Algorithm 2.2 the step size is $\mu = \mu_0 = 1/\rho$, then the generated sequence $\{\lambda^k\}$ converges to the optimum $\lambda^*_t$, and for any $k \ge 1$,

$$G_t(\lambda^*_t) - G_t(\lambda^k) \le \frac{2\rho |\lambda^0 - \lambda^*_t|^2}{(k+1)^2},$$

where $\lambda^0$ is the initial value of $\lambda$.

Since a function that is Lipschitz continuous with constant $\rho$ is also Lipschitz continuous with any finite constant $\rho' \ge \rho$, Theorem 2.3 follows from Theorem 4.4 of [68] for all $\mu \in (0, \mu_0]$.


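By the Karush-Kuhn-Tucker (KKT) conditions, at the optimal point of P2(a’), the following optimality conditions hold:

$$\begin{cases} \dfrac{J_{i,t}}{V} D'_{i,c}(0) - p_{m,t} + \dfrac{K_{i,t}\eta_{i,c}}{V} - \dfrac{\lambda^*_t}{V} \ge 0, & \text{if } x^*_{i,t} = 0 \\[2mm] \dfrac{J_{i,t}}{V} D'_{i,c}(x^*_{i,t}) - p_{m,t} + \dfrac{K_{i,t}\eta_{i,c}}{V} - \dfrac{\lambda^*_t}{V} = 0, & \text{if } 0 < x^*_{i,t} < r_{i,\max} \\[2mm] \dfrac{J_{i,t}}{V} D'_{i,c}(r_{i,\max}) - p_{m,t} + \dfrac{K_{i,t}\eta_{i,c}}{V} - \dfrac{\lambda^*_t}{V} \le 0, & \text{if } x^*_{i,t} = r_{i,\max}. \end{cases} \tag{2.40}$$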

Suppose that, under the condition of Proposition 2.2, the contrary holds, i.e., $\frac{\lambda^*_t}{V} \le \min_{1 \le i \le N}\left\{\frac{J_{i,t}}{V} D'_{i,c}(0) - p_{m,t} + \frac{K_{i,t}\eta_{i,c}}{V}\right\}$, and thus $\mathbf{x}^*_t = \mathbf{0}$ from (2.40). We now show that another solution, with all elements zero except the $j$-th element equal to $\epsilon$, yields a strictly smaller objective value, which is a contradiction. Using the objective function of P2(a’), this is equivalent to showing

$$\frac{J_{j,t}}{V} D_{j,c}(\epsilon) - p_{m,t}\epsilon + \frac{K_{j,t}\eta_{j,c}}{V}\epsilon < C_s(g_t) - C_s(g_t - \epsilon). \tag{2.41}$$

By the mean value theorem, the left-hand side of (2.41) can be written as

$$\frac{J_{j,t}}{V} D_{j,c}(\epsilon) - p_{m,t}\epsilon + \frac{K_{j,t}\eta_{j,c}}{V}\epsilon = \epsilon\left[\frac{J_{j,t}}{V} D'_{j,c}(\delta_1) - p_{m,t} + \frac{K_{j,t}\eta_{j,c}}{V}\right], \tag{2.42}$$

where $0 < \delta_1 < \epsilon$; similarly, for the right-hand side of (2.41), we have

$$C_s(g_t) - C_s(g_t - \epsilon) = \epsilon C'_s(\delta_2), \tag{2.43}$$

where $g_t - \epsilon < \delta_2 < g_t$. Using (2.42) and (2.43), (2.41) is equivalent to

$$\frac{J_{j,t}}{V} D'_{j,c}(\delta_1) - p_{m,t} + \frac{K_{j,t}\eta_{j,c}}{V} < C'_s(\delta_2). \tag{2.44}$$

Inequality (2.44) holds since $C'_s(\delta_2) > C'_s(\delta_1) > \frac{J_{j,t}}{V} D'_{j,c}(\delta_1) - p_{m,t} + \frac{K_{j,t}\eta_{j,c}}{V}$, where the first inequality is due to $g_t \gg \epsilon$ and $C''_s(\cdot) > 0$, and the second inequality is based on the condition of Proposition 2.2.

Chapter 3

Real-Time Power Balancing with Dynamic

Storage

In Chapter 2, we studied the problem of real-time power balancing with static energy storage units that are always connected to the system. In this chapter, we generalize static storage units to dynamic ones that can leave and rejoin the system. Examples of such dynamic storage units are the batteries inside EVs: from time to time, a participating EV may need to stop providing the power balancing service and leave the system for its own needs. This generalization from static to dynamic storage units is challenging, since a returning EV may have an energy state different from the one it had when it last left, and this difference makes it much harder for the aggregator to handle the battery size constraint in real time. Note that none of the previous works (e.g., [23–28]) considers such storage dynamics. To tackle this difficulty, we work under the framework of Lyapunov optimization and design a novel virtual queue to track the energy state of each EV. Through a careful design of the dynamics of the virtual queue, we can ensure that the battery size constraint of each EV is always satisfied.

Moreover, different from the objective of minimizing the long-term system cost in Chapter 2, in this chapter

we adopt a new objective of maximizing the long-term social welfare of the system. This is designed for the

aggregator to fairly allocate the power imbalance amount among the EVs.

In the rest of the chapter, we assume that the dynamic storage (DS) units are batteries in EVs, although in principle they can be any storage units that are dynamic. Also, for the grid-wide service, we assume that these EVs are coordinated by an aggregator for regulation service, which can be treated as an example of the real-time power balancing service.

3.1 System Model and Problem Formulation

3.1.1 Regulation Service and Aggregator-EV System

Consider a long-term time-slotted system, in which the regulation service is provided over equal time intervals of length $\Delta t$. At the beginning of each time slot $t \in \mathcal{T} \triangleq \{0, 1, \cdots\}$, the aggregator receives a random regulation signal $G_t$ from a power grid. If $G_t > 0$, the aggregator is required to provide regulation down service by absorbing $G_t$ units of energy from the power grid during time slot $t$; if $G_t < 0$, the aggregator is required to provide regulation up service by contributing $|G_t|$ units of energy to the power grid during time slot $t$.¹ To represent the type of the regulation service at time slot $t$, we define indicator random variables

$$\mathbf{1}_{d,t} \triangleq \begin{cases} 1, & \text{if } G_t > 0 \\ 0, & \text{otherwise} \end{cases} \quad \text{and} \quad \mathbf{1}_{u,t} \triangleq \begin{cases} 1, & \text{if } G_t < 0 \\ 0, & \text{otherwise.} \end{cases}$$

Note that the product $\mathbf{1}_{d,t} \cdot \mathbf{1}_{u,t} = 0$, since regulation down and up services cannot happen simultaneously.

To provide regulation service, the aggregator coordinates N registered EVs and can communicate with each

EV bi-directionally when the EV is plugged-in. Each EV can leave the system for personal reason or for self-

charging or discharging purpose and re-join the system later. Assume that each EV provides regulation service

only if it is in the system.

For the $i$-th EV, denote $t_{ir,k} \in \mathcal{T}$ as its $k$-th returning time slot and $t_{il,k} \in \mathcal{T}$ as its $k$-th leaving time slot, with $t_{ir,k} < t_{il,k}$, $\forall k \in \{1, 2, \cdots\}$. For simplicity of analysis, assume that all EVs are in the system at the initial time and thus $t_{ir,1} = 0$, $\forall i$. Define the set of the returning time slots of the $i$-th EV as $\mathcal{T}_{i,r} \triangleq \{t_{ir,1}, t_{ir,2}, \cdots\}$ and the set of its leaving time slots as $\mathcal{T}_{i,l} \triangleq \{t_{il,1}, t_{il,2}, \cdots\}$, with $t_{ir,k} < t_{ir,k+1}$ and $t_{il,k} < t_{il,k+1}$. Define

$$\mathcal{T}_{i,p} \triangleq \bigcup_{k=1}^{\infty} \{t_{ir,k}, t_{ir,k} + 1, \cdots, t_{il,k} - 1\}$$

as the set containing all participating time slots of the $i$-th EV for regulation service. Hence, the $i$-th EV is in the system for any $t \in \mathcal{T}_{i,p}$. Define an indicator random variable

$$\mathbf{1}_{i,t} \triangleq \begin{cases} 1, & \text{if } t \in \mathcal{T}_{i,p} \\ 0, & \text{otherwise} \end{cases}$$

to represent the dynamics of the $i$-th EV at time slot $t$ (i.e., whether the $i$-th EV is in the system at time slot $t$). Define a vector $\mathbf{1}_t \triangleq [\mathbf{1}_{1,t}, \cdots, \mathbf{1}_{N,t}]$ to represent the dynamics of all EVs at time slot $t$.
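The construction of the participating set and its indicator can be sketched in a few lines. This is only an illustration over a finite horizon (the model above is infinite-horizon); the function names and the sample return and leave slots are hypothetical.

```python
def participating_slots(returns, leaves):
    """Build the participating set T_{i,p} from paired return/leave slots.

    returns[k], leaves[k] play the roles of t_{ir,k+1} and t_{il,k+1}
    (Python lists are 0-indexed), with returns[k] < leaves[k].
    """
    if len(returns) != len(leaves):
        raise ValueError("each returning slot needs a matching leaving slot")
    slots = set()
    for t_r, t_l in zip(returns, leaves):
        if not t_r < t_l:
            raise ValueError("a returning slot must precede its leaving slot")
        slots.update(range(t_r, t_l))  # t_r, t_r + 1, ..., t_l - 1
    return slots


def indicator(t, slots):
    """Indicator 1_{i,t}: 1 if the EV is in the system at slot t, else 0."""
    return 1 if t in slots else 0


# Hypothetical EV: present in slots 0-2, away in 3-5, present again in 6-7.
T_p = participating_slots(returns=[0, 6], leaves=[3, 8])
ind = [indicator(t, T_p) for t in range(9)]
```

Note that the leaving slot itself is excluded from the participating set, matching the upper limit $t_{il,k} - 1$ in the definition.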

At the beginning of each time slot, the aggregator allocates regulation amount among all participating EVs.

Denote xid,t ≥ 0 as the amount of regulation down energy allocated to the i-th EV through charging, and xiu,t ≥ 0

as the amount of regulation up energy contributed by the i-th EV through discharging. Due to the limitation of

charging and discharging circuits in battery, assume that xid,t and xiu,t are upper bounded by xi,max > 0. Note

that if the i-th EV is out of the system at time slot t, i.e., 1i,t = 0, then it cannot provide regulation service and

we have xid,t = xiu,t = 0. Define vectors xd,t,[x1d,t, · · · , xNd,t] and xu,t,[x1u,t, · · · , xNu,t] to represent the

regulation amounts of all EVs at time slot t.

For the $i$-th EV, assume that it is in the system at time slot $t$ (i.e., $\mathbf{1}_{i,t} = 1$), and thus can provide regulation service. Denote $s_{i,t} \in [0, s_{i,cap}]$ as its energy state at the beginning of time slot $t$, with $s_{i,cap}$ being its battery capacity. Due to the regulation service, the energy state of the $i$-th EV at the beginning of time slot $t+1$ is given by

$$s_{i,t+1} = s_{i,t} + \mathbf{1}_{d,t} x_{id,t} - \mathbf{1}_{u,t} x_{iu,t} = s_{i,t} + b_{i,t}, \tag{3.1}$$

where

$$b_{i,t} \triangleq \mathbf{1}_{d,t} x_{id,t} - \mathbf{1}_{u,t} x_{iu,t} \tag{3.2}$$

is defined to be the effective charging or discharging amount of the $i$-th EV at time slot $t$.

¹ Compared with the terminologies of the power balancing service described in Chapter 2, $G_t$ here is equivalent to the power imbalance signal there, and regulation down (resp. up) is equivalent to the case of energy surplus (resp. deficit).

Charging a battery near its capacity or discharging it close to the zero energy state can significantly reduce the battery’s lifetime [71]. Therefore, lower and upper bounds on the battery energy state are usually imposed by its manufacturer or user. Denote the interval $[s_{i,\min}, s_{i,\max}]$ as the preferred energy range of the $i$-th EV with $0 \le s_{i,\min} < s_{i,\max} \le s_{i,cap}$. Then, the resultant energy state $s_{i,t+1}$ in (3.1) should lie in $[s_{i,\min}, s_{i,\max}]$, which indicates that the regulation amounts $x_{id,t}$ and $x_{iu,t}$ must satisfy $0 \le x_{id,t} \le \mathbf{1}_{i,t} h_{id,t}$ and $0 \le x_{iu,t} \le \mathbf{1}_{i,t} h_{iu,t}$, respectively, where $h_{id,t}$ and $h_{iu,t}$ are effective upper bounds on the regulation amounts and are defined as

$$h_{id,t} \triangleq [x_{i,\max},\, s_{i,\max} - s_{i,t}]^{-}$$

and

$$h_{iu,t} \triangleq [x_{i,\max},\, s_{i,t} - s_{i,\min}]^{-},$$

respectively, where $[a, b]^{-}$ denotes $\min\{a, b\}$.
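The energy update (3.1) and the effective bounds can be sketched as follows, reading $[a,b]^-$ as the minimum of its two arguments. The function names and the numerical values (in kWh) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def effective_bounds(s, s_min, s_max, x_max):
    """Return (h_d, h_u): the largest charge / discharge amounts that keep
    the next energy state inside [s_min, s_max] and within the rate limit."""
    h_d = min(x_max, s_max - s)  # bound on regulation-down (charging)
    h_u = min(x_max, s - s_min)  # bound on regulation-up (discharging)
    return h_d, h_u


def next_energy_state(s, G, x_d, x_u):
    """Apply (3.1): returns (s_next, b) with b the effective amount (3.2)."""
    ind_d = 1 if G > 0 else 0  # regulation down
    ind_u = 1 if G < 0 else 0  # regulation up
    b = ind_d * x_d - ind_u * x_u
    return s + b, b


# Hypothetical battery close to its upper preferred level.
h_d, h_u = effective_bounds(s=3.75, s_min=1.0, s_max=4.0, x_max=0.5)
s_next, b = next_energy_state(s=3.75, G=2.0, x_d=h_d, x_u=0.0)
```

Here the headroom $s_{i,\max} - s_{i,t} = 0.25$ is tighter than the rate limit, so charging by $h_{id,t}$ brings the battery exactly to $s_{i,\max}$.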

From time to time, the i-th EV may need to stop its regulation service and leave the system. When the EV

is out of the system (i.e., 1i,t = 0), it cannot offer regulation service and the aggregator has no information of

the EV’s energy state. Moreover, the dynamics of the energy state may not follow (3.1) when 1i,t = 0. When

returning, the EV may have a different energy state compared with its last leaving energy state. Assume that all

returning energy states of the i-th EV are confined in the preferred energy range by the EV’s self-control, i.e.,

si,t ∈ [si,min, si,max], ∀t ∈ Ti,r . Define

$$\Delta_{i,k} \triangleq s_{i,t_{ir,k+1}} - s_{i,t_{il,k}}, \quad \forall k \in \{1, 2, \cdots\} \tag{3.3}$$

as the difference between the i-th EV’s (k + 1)-th returning energy state and its last leaving energy state. We

assume that

A1) ∆i,k is bounded, i.e., |∆i,k| ≤ ∆i,max, where the constant ∆i,max ≥ 0.

A2) ∆i,k has mean zero, i.e., E[∆i,k] = 0, ∀k.

Note that A2 is a mild assumption, based on the random behavior of each EV when it is out of the system.

For each EV, providing regulation service incurs battery degradation due to frequent charging and discharging activities. Denote $C_i(x)$ as the degradation cost function of the regulation amount of the $i$-th EV, with $0 \le C_i(x) \le c_{i,\max}$ and $C_i(0) = 0$. Since faster charging or discharging, i.e., a larger value of $x_{id,t}$ or $x_{iu,t}$, has a more detrimental effect on the battery’s lifetime, we assume $C_i(x)$ to be convex, continuous, and non-decreasing on the interval $[0, x_{i,\max}]$. We further assume that each EV imposes an upper bound $c_{i,up} \in [0, c_{i,\max}]$ on the time-averaged battery degradation, expressed by

$$\lim_{T\to\infty} \frac{1}{T} \sum_{t=0}^{T-1} E\left[\mathbf{1}_{d,t} C_i(x_{id,t}) + \mathbf{1}_{u,t} C_i(x_{iu,t})\right] \le c_{i,up}.$$

The total regulation amount provided by the EVs may be insufficient to meet the requested regulation amount due to, for example, a lack of participating EVs or a high battery degradation cost. For brevity, define

$$x_{i,t} \triangleq \mathbf{1}_{d,t} x_{id,t} + \mathbf{1}_{u,t} x_{iu,t}, \quad 0 \le x_{i,t} \le x_{i,\max}$$

as the regulation amount allocated to the $i$-th EV at time slot $t$, which equals either $x_{id,t}$ or $x_{iu,t}$. Then, the insufficiency of the regulation amount is indicated by $\sum_{i=1}^{N} x_{i,t} < |G_t|$, with the gap $|G_t| - \sum_{i=1}^{N} x_{i,t}$ representing an energy surplus in the case of regulation down or an energy deficit in the case of regulation up. Assume that the energy surplus or energy deficit must be cleared, or the regulation service fails. Therefore, from time to time, the aggregator has to exploit more expensive external energy sources, such as the traditional regulation market, so as to fill the energy gap. Denote the unit costs for clearing energy surplus and energy deficit at time slot $t$ as $e_{s,t}$ and $e_{d,t}$, respectively, which are both random but are restricted to the interval $[e_{\min}, e_{\max}]$. Then, the cost of the aggregator for using the external energy sources at time slot $t$ is given by

$$e_t \triangleq \mathbf{1}_{d,t} e_{s,t}\left(G_t - \sum_{i=1}^{N} x_{id,t}\right) + \mathbf{1}_{u,t} e_{d,t}\left(|G_t| - \sum_{i=1}^{N} x_{iu,t}\right),$$

where we have implicitly assumed that the total regulation amount provided by all EVs cannot exceed the requested amount.
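A direct transcription of the external-energy cost $e_t$ may help fix the sign conventions. The function name, the tolerance, and the sample numbers are hypothetical; the check that the EVs never over-deliver mirrors the implicit assumption stated above.

```python
def external_cost(G, x_d, x_u, e_s, e_d, tol=1e-12):
    """Cost e_t of clearing the residual imbalance with external sources.

    G   : regulation signal (G > 0: regulation down, G < 0: regulation up)
    x_d : per-EV regulation-down amounts, x_u : per-EV regulation-up amounts
    e_s : unit cost of clearing surplus, e_d : unit cost of clearing deficit
    """
    if G > 0:
        total = sum(x_d)
        if total > G + tol:
            raise ValueError("EVs absorb more than the requested amount")
        return e_s * (G - total)
    if G < 0:
        total = sum(x_u)
        if total > abs(G) + tol:
            raise ValueError("EVs contribute more than the requested amount")
        return e_d * (abs(G) - total)
    return 0.0  # no regulation requested in this slot


# Regulation down: 1.0 requested, EVs absorb 0.5, the rest costs 2.0 per unit.
cost_down = external_cost(1.0, [0.25, 0.25], [0.0, 0.0], e_s=2.0, e_d=3.0)
# Regulation up: fully covered by the EVs, so no external cost.
cost_up = external_cost(-1.0, [0.0, 0.0], [0.5, 0.5], e_s=2.0, e_d=3.0)
```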

3.1.2 Fair Regulation Allocation through Welfare Maximization

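The objective of the aggregator is to maximize the long-term social welfare of the aggregator-EV system. Specifically, the aggregator aims to fairly allocate the regulation amount among EVs and to reduce the cost for the expensive external energy sources, with the constraints on each EV’s regulation amount and degradation cost. To this end, we formulate the regulation allocation problem as the following stochastic optimization problem:²

P1:

$$\max_{\mathbf{x}_{d,t},\, \mathbf{x}_{u,t}} \quad \sum_{i=1}^{N} \omega_i U\left(\lim_{T\to\infty} \frac{1}{T} \sum_{t=0}^{T-1} E[x_{i,t}]\right) - \lim_{T\to\infty} \frac{1}{T} \sum_{t=0}^{T-1} E[e_t]$$

$$\text{s.t.} \quad 0 \le x_{id,t} \le \mathbf{1}_{i,t} h_{id,t}, \quad \forall i, t \tag{3.4}$$

$$0 \le x_{iu,t} \le \mathbf{1}_{i,t} h_{iu,t}, \quad \forall i, t \tag{3.5}$$

$$\sum_{i=1}^{N} x_{id,t} \le \mathbf{1}_{d,t} G_t, \quad \forall t \tag{3.6}$$

$$\sum_{i=1}^{N} x_{iu,t} \le \mathbf{1}_{u,t} |G_t|, \quad \forall t \tag{3.7}$$

$$\lim_{T\to\infty} \frac{1}{T} \sum_{t=0}^{T-1} E\left[\mathbf{1}_{d,t} C_i(x_{id,t}) + \mathbf{1}_{u,t} C_i(x_{iu,t})\right] \le c_{i,up}, \quad \forall i \tag{3.8}$$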

where $\omega_i > 0$ is the normalized weight associated with the $i$-th EV, and $U(\cdot)$ is a utility function assumed to be concave, continuous, and non-decreasing, with $U(0) = 0$. Furthermore, to facilitate later analysis, we make a mild assumption that the utility function $U(\cdot)$ satisfies

$$U(x) \le U(0) + \mu x, \quad \forall x \in \left[0, \max_{1 \le i \le N}\{x_{i,\max}\}\right], \tag{3.9}$$

where the constant $\mu > 0$. One sufficient condition for (3.9) to hold is that $U(\cdot)$ has a finite positive derivative at zero, as is the case for $U(x) = \log(1 + x)$. The expectations in the above optimization problem are taken over the randomness of the system and the possible randomness of the regulation allocation.

² EVs that visit the system only finitely many times affect only the system’s transient behavior, not its long-term behavior. We can therefore ignore them and consider only the remaining EVs, which leave and re-join the system infinitely many times.


In the objective function of P1, the first term includes each EV’s welfare under the utility function U(·) and

the weight ωi, and the second term reflects the aggregator’s cost for exploiting external energy sources. Note that

the fairness of the regulation allocation among EVs is ensured by the utility function U(·), and various types of

fairness can be achieved by using different utility functions [78]. For each EV, in (3.4) and (3.5), hard constraints

on the regulation amounts are set at each time slot, while in (3.8), a long-term time-averaged constraint on the

regulation amount is set due to the battery degradation. The constraints (3.6) and (3.7) ensure that xid,t = 0 for

regulation up and xiu,t = 0 for regulation down.

3.2 Welfare-Maximizing Regulation Allocation

In this section, we first apply a sequence of two reformulations to P1, then propose a real-time welfare-maximizing

regulation allocation (WMRA) algorithm to solve the resultant optimization problem. The performance analysis

of the proposed WMRA will be shown in Section 3.3.

3.2.1 Problem Transformation

The objective of P1 contains a function of a long-term time average, which complicates the problem. Fortunately,

in general, such a problem can be transformed to a problem of maximizing the long-term time average of the

function [16]. Specifically, we transform P1 as follows.

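We first introduce an auxiliary vector $\mathbf{z}_t \triangleq [z_{1,t}, \cdots, z_{N,t}]$ with the constraints

$$0 \le z_{i,t} \le x_{i,\max}, \quad \forall i, t, \text{ and} \tag{3.10}$$

$$\lim_{T\to\infty} \frac{1}{T} \sum_{t=0}^{T-1} E[z_{i,t}] = \lim_{T\to\infty} \frac{1}{T} \sum_{t=0}^{T-1} E[x_{i,t}], \quad \forall i. \tag{3.11}$$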

From the above constraints, the auxiliary variable zi,t and the regulation allocation amount xi,t lie in the same

range and have the same long-term time average behavior. We next consider the following problem.

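P2:

$$\max_{\mathbf{x}_{d,t},\, \mathbf{x}_{u,t},\, \mathbf{z}_t} \quad \lim_{T\to\infty} \frac{1}{T} \sum_{t=0}^{T-1} E\left[\left(\sum_{i=1}^{N} \omega_i U(z_{i,t})\right) - e_t\right]$$

$$\text{s.t.} \quad (3.4), (3.5), (3.6), (3.7), (3.8), (3.10), \text{ and } (3.11).$$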

Compared with P1, P2 is over xd,t, xu,t and zt with two more constraints (3.10) and (3.11). Nevertheless,

P2 contains no function of time average; instead, it maximizes the long-term time average of the expected social

welfare.

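Denote $(\mathbf{x}^{opt}_{d,t}, \mathbf{x}^{opt}_{u,t})$ as an optimal solution to P1, and $(\mathbf{x}^*_{d,t}, \mathbf{x}^*_{u,t}, \mathbf{z}^*_t)$ as an optimal solution to P2. Define $\mathbf{z}^{opt}_t \triangleq [z^{opt}_{1,t}, \cdots, z^{opt}_{N,t}]$ with the $i$-th element

$$z^{opt}_{i,t} \triangleq \lim_{T\to\infty} \frac{1}{T} \sum_{\tau=0}^{T-1} E[x^{opt}_{i,\tau}], \quad \forall i, t,$$

where $x^{opt}_{i,\tau} \triangleq \mathbf{1}_{d,\tau} x^{opt}_{id,\tau} + \mathbf{1}_{u,\tau} x^{opt}_{iu,\tau}$. Denote the objective functions of P1 and P2 as $f_1(\cdot)$ and $f_2(\cdot)$, respectively. The equivalence of P1 and P2 is stated below.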


Lemma 3.1 P1 and P2 have the same optimal objective, i.e., $f_1(\mathbf{x}^{opt}_{d,t}, \mathbf{x}^{opt}_{u,t}) = f_2(\mathbf{x}^*_{d,t}, \mathbf{x}^*_{u,t}, \mathbf{z}^*_t)$. Furthermore, $(\mathbf{x}^{opt}_{d,t}, \mathbf{x}^{opt}_{u,t}, \mathbf{z}^{opt}_t)$ is an optimal solution to P2, and $(\mathbf{x}^*_{d,t}, \mathbf{x}^*_{u,t})$ is an optimal solution to P1.

Proof: The proof follows the general framework given in [16]. Details specific to our system are given in Appendix 3.6.1.

Lemma 3.1 indicates that the transformation from P1 to P2 results in no loss of optimality. Thus, in the

following, we will focus on solving P2 instead.
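Some intuition for why the transformation loses nothing comes from Jensen's inequality: for a concave $U(\cdot)$, the time average of $U(z_t)$ never exceeds $U$ of the time average of $z_t$, with equality when $z_t$ is held constant at that average. The sketch below checks this numerically for $U(x) = \log(1+x)$ on a made-up allocation sequence. It is an illustration only, not the proof in Appendix 3.6.1, and all names and numbers are hypothetical.

```python
import math


def U(x):
    """Example utility from the text: concave, non-decreasing, U(0) = 0."""
    return math.log(1.0 + x)


def time_avg(values):
    return sum(values) / len(values)


# Hypothetical regulation amounts allocated to one EV over four slots.
x = [0.0, 0.5, 0.0, 0.5]

# P1-style value: utility of the time-averaged allocation.
utility_of_avg = U(time_avg(x))

# P2-style value with z_t tracking x_t slot by slot.
avg_utility_tracking = time_avg([U(z) for z in x])

# P2-style value with z_t held constant at the time average of x_t,
# which keeps the two time averages equal as required by (3.11).
z_const = [time_avg(x)] * len(x)
avg_utility_const = time_avg([U(z) for z in z_const])
```

The constant choice mirrors the definition of $z^{opt}_{i,t}$ above, which does not depend on $t$.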

3.2.2 Problem Relaxation

P2 is still a challenging problem since in the constraints (3.4) and (3.5), the regulation allocation amount of each

EV depends on its current energy state si,t, hence coupling with all previous regulation allocation amounts. To

avoid such coupling, we relax the constraints of xid,t and xiu,t, and introduce an optimization problem P3 below.

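P3:

$$\max_{\mathbf{x}_{d,t},\, \mathbf{x}_{u,t},\, \mathbf{z}_t} \quad \lim_{T\to\infty} \frac{1}{T} \sum_{t=0}^{T-1} E\left[\left(\sum_{i=1}^{N} \omega_i U(z_{i,t})\right) - e_t\right]$$

$$\text{s.t.} \quad 0 \le x_{id,t} \le \mathbf{1}_{i,t} x_{i,\max}, \quad \forall i, t, \tag{3.12}$$

$$0 \le x_{iu,t} \le \mathbf{1}_{i,t} x_{i,\max}, \quad \forall i, t, \tag{3.13}$$

$$\lim_{T\to\infty} \frac{1}{T} \sum_{t=0}^{T-1} E[b_{i,t}] = 0, \quad \forall i, \tag{3.14}$$

$$(3.6), (3.7), (3.8), (3.10), \text{ and } (3.11),$$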

where in (3.14) bi,t is the effective charging or discharging amount defined in (3.2). In P3, we have replaced

the constraints (3.4) and (3.5) in P2 with (3.12)–(3.14), thus have removed the dependence of the regulation

amount on si,t. We next demonstrate that, any (xd,t,xu,t) that satisfies (3.4) and (3.5) also satisfies (3.12)–(3.14).

Therefore, P3 is a relaxed problem of P2.

Consider the i-th EV. The constraints (3.4) and (3.5) in P2 are equivalent to the following two sub-constraints:

if 1i,t = 1, then

0 ≤ xid,t ≤ xi,max (3.15)

0 ≤ xiu,t ≤ xi,max (3.16)

si,min ≤ si,t+1 ≤ si,max; (3.17)

if 1i,t = 0, then

xid,t = xiu,t = 0. (3.18)

Since (3.15), (3.16), and (3.18) are equivalent to (3.12) and (3.13), we are left to justify that (3.17) (i.e., the

boundedness of si,t) implies (3.14). Recall that si,t is bounded for any returning time slot t ∈ Ti,r by the EV’s

self-control. Together, we need to justify that if si,t ∈ [si,min, si,max], ∀t ∈ Ti,p ∪ Ti,l, then the constraint (3.14)

holds. This result is shown in the following lemma.

Lemma 3.2 For the $i$-th EV, under the assumption A2, if $s_{i,t} \in [s_{i,\min}, s_{i,\max}]$, $\forall t \in \mathcal{T}_{i,p} \cup \mathcal{T}_{i,l}$, then the constraint (3.14) holds, i.e., $\lim_{T\to\infty} \frac{1}{T} \sum_{t=0}^{T-1} E[b_{i,t}] = 0$.



From Lemma 3.2, the boundedness of $s_{i,t}$ indeed implies (3.14), which completes our demonstration that P3 is a relaxed version of P2 with a larger feasible solution set. Later, we will show in Section 3.3.1 that our proposed algorithm for P3 in fact ensures the boundedness of $s_{i,t}$, and thus provides a feasible solution to P2 and to the original problem P1.

The relaxed problem P3 allows us to apply Lyapunov optimization to design a real-time algorithm for welfare maximization.

3.2.3 WMRA Algorithm

In this subsection, we propose a WMRA algorithm to solve P3 by employing Lyapunov optimization technique.

We first define three virtual queues for each EV with the associated queue backlogs $J_{i,t}$, $H_{i,t}$, and $K_{i,t}$, which evolve as follows:

$$J_{i,t+1} = \left[J_{i,t} + \mathbf{1}_{d,t} C_i(x_{id,t}) + \mathbf{1}_{u,t} C_i(x_{iu,t}) - c_{i,up}\right]^{+}; \tag{3.19}$$

$$H_{i,t+1} = H_{i,t} + z_{i,t} - x_{i,t}; \tag{3.20}$$

$$K_{i,t} = \begin{cases} s_{i,t} - c_i, & \text{if } t \in \mathcal{T}_{i,r} \quad \text{(3.21a)} \\ K_{i,t-1} + b_{i,t-1}, & \text{otherwise} \quad \text{(3.21b)} \end{cases}$$

where $[a]^{+} \triangleq \max\{a, 0\}$, and in (3.21a) we have designed a perturbation parameter $c_i \triangleq s_{i,\min} + 2x_{i,\max} + V(\omega_i\mu + e_{\max})$ with $V \in (0, V_{\max}]$ and

$$V_{\max} \triangleq \min_{1 \le i \le N}\left\{\frac{s_{i,\max} - s_{i,\min} - 4x_{i,\max}}{2(\omega_i\mu + e_{\max})}\right\}. \tag{3.22}$$

The role of $V$ will be explained later. It will also be clear in Section 3.3.1 that the specific expressions of $c_i$ and $V_{\max}$ are designed to ensure the boundedness of $s_{i,t}$. Note that $x_{i,\max}$ is generally much smaller than the energy capacity. For example, for the Tesla Model S base model [79], the energy capacity is 40 kWh, and $x_{i,\max} \approx 0.167$ kWh if the maximum charging rate of 10 kW is applied and the regulation duration is 1 minute. Therefore, $V_{\max} > 0$ holds in general.
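To make (3.22) and the perturbation parameter concrete, the sketch below evaluates them for a single EV loosely based on the example above (40 kWh capacity, 10 kW maximum rate, 1-minute slots). The preferred range of 4 to 36 kWh and the values of $\omega_i$, $\mu$, and $e_{\max}$ are assumptions made purely for illustration, as are the function and field names.

```python
def v_max(evs, mu, e_max):
    """Evaluate V_max from (3.22) over a list of EV parameter dicts."""
    return min(
        (ev["s_max"] - ev["s_min"] - 4.0 * ev["x_max"])
        / (2.0 * (ev["omega"] * mu + e_max))
        for ev in evs
    )


def perturbation(ev, V, mu, e_max):
    """Perturbation parameter c_i = s_min + 2 x_max + V (omega mu + e_max)."""
    return ev["s_min"] + 2.0 * ev["x_max"] + V * (ev["omega"] * mu + e_max)


# Hypothetical EV: 10 kW for 1/60 h gives x_max = 1/6 kWh.
ev = {"s_min": 4.0, "s_max": 36.0, "x_max": 10.0 * (1.0 / 60.0), "omega": 1.0}
mu, e_max = 1.0, 1.0  # assumed utility slope bound and maximum unit cost

V_cap = v_max([ev], mu, e_max)
c_at_cap = perturbation(ev, V_cap, mu, e_max)
```

With these numbers $V_{\max} = 47/6 \approx 7.83$, and at $V = V_{\max}$ the perturbation $c_i$ equals the midpoint $(s_{i,\min} + s_{i,\max})/2 = 20$ of the preferred range, which follows algebraically from the two expressions for the EV that attains the minimum in (3.22).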

From (3.21a), $K_{i,t}$ is re-initialized as a shifted version of $s_{i,t}$ every time the $i$-th EV returns to the aggregator-EV system; also, from (3.21b), $K_{i,t}$ evolves in the same way as $s_{i,t}$ for $t \in \mathcal{T}_{i,p}$ (recall that the dynamics of $s_{i,t}$ may not follow (3.1) when $\mathbf{1}_{i,t} = 0$). Therefore, $K_{i,t}$ is essentially a shifted version of $s_{i,t}$, $\forall t \in \mathcal{T}_{i,p} \cup \mathcal{T}_{i,l}$, i.e.,

$$K_{i,t} = s_{i,t} - c_i, \quad \forall t \in \mathcal{T}_{i,p} \cup \mathcal{T}_{i,l}. \tag{3.23}$$

Additionally, since the effective charging or discharging amount $b_{i,t} = 0$ when $\mathbf{1}_{i,t} = 0$, once the $i$-th EV leaves the system, the value of $K_{i,t}$ is locked until the next returning time slot of the EV, i.e.,

$$K_{i,t} = K_{i,t_{il,k}}, \quad \forall t \in \{t_{il,k}, \cdots, t_{ir,k+1} - 1\}, \text{ and } \forall k \in \{1, 2, \cdots\}.$$

By introducing the virtual queues, the constraints (3.8) and (3.11) hold if the queues Ji,t and Hi,t are mean

rate stable, respectively [16].
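The three queue recursions (3.19)-(3.21) can be sketched as plain update functions. The quadratic degradation cost used in the example is an assumption for illustration (the text only requires $C_i$ to be convex, continuous, and non-decreasing), and the function names are hypothetical.

```python
def update_J(J, G, x_d, x_u, C, c_up):
    """(3.19): degradation queue, truncated at zero."""
    if G > 0:
        cost = C(x_d)
    elif G < 0:
        cost = C(x_u)
    else:
        cost = 0.0
    return max(J + cost - c_up, 0.0)


def update_H(H, z, x):
    """(3.20): queue enforcing equal time averages of z and x."""
    return H + z - x


def update_K(K_prev, b_prev, returning, s=None, c=None):
    """(3.21): re-initialize on return, otherwise follow the energy change."""
    if returning:
        return s - c          # (3.21a): shifted copy of the energy state
    return K_prev + b_prev    # (3.21b): b_prev = 0 while the EV is away


def C_quad(x):
    """Assumed degradation cost C_i(x) = x^2 (illustration only)."""
    return x * x


J1 = update_J(0.0, 1.0, 0.5, 0.0, C_quad, c_up=0.125)    # 0.25 - 0.125
J2 = update_J(0.0, -1.0, 0.0, 0.25, C_quad, c_up=0.125)  # truncated at 0
H1 = update_H(1.0, z=0.25, x=0.5)
K1 = update_K(2.0, -0.5, returning=False)
K2 = update_K(K1, 0.0, returning=True, s=10.0, c=8.0)
```

Because `update_K` adds a zero increment while the EV is away, the backlog stays locked at its leaving value until the re-initialization in (3.21a).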

Unlike $J_{i,t}$ and $H_{i,t}$, $K_{i,t}$ is re-initialized when $t \in \mathcal{T}_{i,r}$, and therefore a new virtual queue is essentially created every time the $i$-th EV re-joins the system. As a result, the mean rate stability of $K_{i,t}$ is insufficient for the constraint (3.14) to hold, and a stronger condition is required. Fortunately, since $K_{i,t}$ is just a shifted version of $s_{i,t}$ from (3.23), the following result follows directly from Lemma 3.2.

Lemma 3.3 For the $i$-th EV, under the assumption A2, if $K_{i,t} \in [s_{i,\min} - c_i, s_{i,\max} - c_i]$, $\forall t \in \mathcal{T}_{i,p} \cup \mathcal{T}_{i,l}$, then the constraint (3.14) holds, i.e., $\lim_{T\to\infty} \frac{1}{T} \sum_{t=0}^{T-1} E[b_{i,t}] = 0$.

Later in Section 3.3.1, we will show that by our proposed algorithm the boundedness assumption of Ki,t in

Lemma 3.3 can be guaranteed.

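Define $\mathbf{J}_t \triangleq [J_{1,t}, \cdots, J_{N,t}]$, $\mathbf{H}_t \triangleq [H_{1,t}, \cdots, H_{N,t}]$, $\mathbf{K}_t \triangleq [K_{1,t}, \cdots, K_{N,t}]$, and $\Theta_t \triangleq [\mathbf{J}_t, \mathbf{H}_t, \mathbf{K}_t]$. Initialize $J_{i,0} = H_{i,0} = 0$ and $K_{i,0} = s_{i,0} - c_i$, $\forall i$. Define the Lyapunov function $L(\Theta_t) \triangleq \frac{1}{2} \sum_{i=1}^{N} (J^2_{i,t} + H^2_{i,t} + K^2_{i,t})$, and the associated one-slot Lyapunov drift as $\Delta(\Theta_t) \triangleq E[L(\Theta_{t+1}) - L(\Theta_t) | \Theta_t]$. The drift-minus-welfare function is given by $\Delta(\Theta_t) - V E\left[\sum_{i=1}^{N} \omega_i U(z_{i,t}) - e_t \,\middle|\, \Theta_t\right]$, where $V \in (0, V_{\max}]$ is the weight associated with the welfare objective. Hence, the larger $V$ is, the more weight is put on the welfare objective.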

Furthermore, we assume that for the i-th EV, the conditional expectation of the energy state difference ∆i,k,

given the queue backlogs before the EV returns, is zero, i.e.,

A3) E[∆i,k|Θt] = 0, for t = tir,k+1 − 1, ∀k ∈ {1, 2, · · · }, ∀i.

Note that A3 is mild, considering the random behavior of each EV due to other activities.

Now we provide an upper bound on the drift-minus-welfare function in the following proposition.

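Proposition 3.1 Under the assumptions A1 and A3, the drift-minus-welfare function is upper-bounded as

$$\Delta(\Theta_t) - V E\left[\sum_{i=1}^{N} \omega_i U(z_{i,t}) - e_t \,\middle|\, \Theta_t\right] \le B + \sum_{i=1}^{N} K_{i,t} E[b_{i,t} | \Theta_t] + \sum_{i=1}^{N} H_{i,t} E[z_{i,t} - x_{i,t} | \Theta_t] + \sum_{i=1}^{N} J_{i,t} E\left[\mathbf{1}_{d,t} C_i(x_{id,t}) + \mathbf{1}_{u,t} C_i(x_{iu,t}) - c_{i,up} \,\middle|\, \Theta_t\right] - V E\left[\sum_{i=1}^{N} \omega_i U(z_{i,t}) - e_t \,\middle|\, \Theta_t\right], \tag{3.24}$$

where

$$B \triangleq \frac{1}{2} \sum_{i=1}^{N} \left[2x^2_{i,\max} + \Delta^2_{i,\max} + \left[c^2_{i,up},\, (c_{i,\max} - c_{i,up})^2\right]^{+}\right], \tag{3.25}$$

and $[a, b]^{+}$ denotes $\max\{a, b\}$.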


Adopting the general framework of Lyapunov optimization [16], we now propose the WMRA algorithm by

minimizing the upper bound on the drift-minus-welfare function in (3.24) at each time slot. We will show in

Section 3.3 that the proposed algorithm can lead to a guaranteed performance.

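The minimization problem is equivalent to the following decoupled sub-problems with respect to $\mathbf{z}_t$, $\mathbf{x}_{d,t}$, and $\mathbf{x}_{u,t}$, separately. Denote the solutions produced by WMRA as $\mathbf{z}_t \triangleq [z_{1,t}, \cdots, z_{N,t}]$, $\mathbf{x}_{d,t} \triangleq [x_{1d,t}, \cdots, x_{Nd,t}]$, and $\mathbf{x}_{u,t} \triangleq [x_{1u,t}, \cdots, x_{Nu,t}]$, respectively. Specifically, we obtain $z_{i,t}$, $\forall i$, by solving (a):

$$\text{(a):} \quad \min_{z_{i,t}} \; H_{i,t} z_{i,t} - \omega_i V U(z_{i,t}) \quad \text{s.t.} \quad 0 \le z_{i,t} \le x_{i,\max}.$$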
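For a specific utility, sub-problem (a) admits a closed form. The sketch below assumes $U(x) = \log(1+x)$, the example utility mentioned after (3.9); for other utilities a one-dimensional convex solver would be needed instead. The objective is convex in $z_{i,t}$, so the unconstrained stationary point $z = \omega_i V / H_{i,t} - 1$ clipped to $[0, x_{i,\max}]$ is a minimizer when $H_{i,t} > 0$, and the objective is strictly decreasing when $H_{i,t} \le 0$. Function names and numbers are hypothetical.

```python
import math


def objective_a(z, H, omega, V):
    """Objective of sub-problem (a) under the assumed utility log(1 + z)."""
    return H * z - omega * V * math.log(1.0 + z)


def solve_a_log(H, omega, V, x_max):
    """Minimizer of H*z - omega*V*log(1+z) over 0 <= z <= x_max.

    Assumes omega > 0 and V > 0, as in the problem formulation.
    """
    if H <= 0:
        # Both terms decrease in z, so the upper bound is optimal.
        return x_max
    z_stationary = omega * V / H - 1.0  # root of H - omega*V/(1+z) = 0
    return min(max(z_stationary, 0.0), x_max)


z_interior = solve_a_log(H=2.0, omega=1.0, V=3.0, x_max=1.0)   # 3/2 - 1
z_lower = solve_a_log(H=10.0, omega=1.0, V=3.0, x_max=1.0)     # clipped to 0
z_upper = solve_a_log(H=-1.0, omega=1.0, V=3.0, x_max=1.0)     # x_max
```

A large backlog $H_{i,t}$ (meaning $z$ has been running ahead of $x$) pushes $z_{i,t}$ toward zero, while a non-positive backlog pushes it to the upper bound.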


Algorithm 3.1: Welfare-maximizing regulation allocation (WMRA) algorithm.

1: The aggregator initializes the virtual queue vector $\Theta_0$, and re-initializes $K_{i,t} = s_{i,t} - c_i$ for $t \in \mathcal{T}_{i,r}$, $\forall i$.

2: At the beginning of each time slot $t$, the aggregator performs the following steps sequentially.

(2a) Observe $G_t$, $e_{s,t}$, $e_{d,t}$, $\mathbf{1}_t$, $\mathbf{J}_t$, $\mathbf{H}_t$, and $\mathbf{K}_t$.

(2b) Solve (a) and record an optimal solution $\mathbf{z}_t$. If $G_t > 0$, solve (b1) and record an optimal solution $\mathbf{x}_{d,t}$. If $G_t < 0$, solve (b2) and record an optimal solution $\mathbf{x}_{u,t}$. Allocate the regulation amounts among EVs based on $\mathbf{x}_{d,t}$ and $\mathbf{x}_{u,t}$. If $\sum_{i=1}^{N} x_{id,t} < G_t$ or $\sum_{i=1}^{N} x_{iu,t} < |G_t|$, clear the imbalance using external energy sources.

(2c) Update the virtual queues $J_{i,t}$, $H_{i,t}$, and $K_{i,t}$, $\forall i$, based on (3.19), (3.20), and (3.21b), respectively.

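For $G_t > 0$, we obtain $\mathbf{x}_{d,t}$ by solving (b1):

$$\text{(b1):} \quad \min_{\mathbf{x}_{d,t}} \; V e_{s,t}\left(G_t - \sum_{i=1}^{N} x_{id,t}\right) - \sum_{i=1}^{N} H_{i,t} x_{id,t} + \sum_{i=1}^{N} J_{i,t} C_i(x_{id,t}) + \sum_{i=1}^{N} K_{i,t} x_{id,t}$$

$$\text{s.t.} \quad 0 \le x_{id,t} \le \mathbf{1}_{i,t} x_{i,\max}, \quad \sum_{i=1}^{N} x_{id,t} \le G_t.$$

For $G_t < 0$, we obtain $\mathbf{x}_{u,t}$ by solving (b2):

$$\text{(b2):} \quad \min_{\mathbf{x}_{u,t}} \; V e_{d,t}\left(|G_t| - \sum_{i=1}^{N} x_{iu,t}\right) - \sum_{i=1}^{N} H_{i,t} x_{iu,t} + \sum_{i=1}^{N} J_{i,t} C_i(x_{iu,t}) - \sum_{i=1}^{N} K_{i,t} x_{iu,t}$$

$$\text{s.t.} \quad 0 \le x_{iu,t} \le \mathbf{1}_{i,t} x_{i,\max}, \quad \sum_{i=1}^{N} x_{iu,t} \le |G_t|.$$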
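In the special case of a linear degradation cost, (b1) reduces to a small linear program that a greedy rule solves exactly. The sketch below assumes $C_i(x) = a_i x$ with $a_i \ge 0$, which is one admissible convex cost but not the general case treated in the text; with a general convex $C_i$ a convex solver is needed. Dropping the constant $V e_{s,t} G_t$, each unit of $x_{id,t}$ contributes $K_{i,t} - H_{i,t} + J_{i,t} a_i - V e_{s,t}$ to the objective, so the requested amount is filled in increasing order of that weight, and only while the weight is negative. All names and numbers are hypothetical.

```python
def solve_b1_linear(G, V, e_s, H, J, K, a, present, x_max):
    """Greedy solution of (b1) under the assumed linear cost C_i(x) = a[i]*x.

    G > 0 is the regulation-down amount; present[i] is the indicator 1_{i,t}.
    Returns the list of per-EV charging amounts x_{id,t}.
    """
    n = len(H)
    # Per-unit objective weight of x_{id,t}; the constant V*e_s*G is dropped.
    w = [K[i] - H[i] + J[i] * a[i] - V * e_s for i in range(n)]
    x = [0.0] * n
    remaining = G
    for i in sorted(range(n), key=lambda idx: w[idx]):
        if w[i] >= 0 or remaining <= 0:
            break  # no further EV can lower the objective
        cap = x_max[i] if present[i] else 0.0
        x[i] = min(cap, remaining)
        remaining -= x[i]
    return x


x_all = solve_b1_linear(
    G=1.0, V=1.0, e_s=2.0,
    H=[0.0, 0.0, 0.0], J=[0.0, 0.0, 0.0], K=[-1.0, 1.0, 5.0],
    a=[1.0, 1.0, 1.0], present=[True, True, True], x_max=[0.75, 0.75, 0.75],
)
x_one_away = solve_b1_linear(
    G=1.0, V=1.0, e_s=2.0,
    H=[0.0, 0.0, 0.0], J=[0.0, 0.0, 0.0], K=[-1.0, 1.0, 5.0],
    a=[1.0, 1.0, 1.0], present=[False, True, True], x_max=[0.75, 0.75, 0.75],
)
```

In the example the third EV has a large backlog $K_{i,t}$ (a nearly full battery relative to the shift $c_i$) and receives no charging. Sub-problem (b2) can be handled the same way with the sign of $K_{i,t}$ flipped and $e_{d,t}$, $|G_t|$ in place of $e_{s,t}$, $G_t$.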

Note that (a), (b1), and (b2) are all convex problems, so they can be efficiently solved using standard methods such as the interior point method [70]. We summarize WMRA in Algorithm 3.1. Note from Steps (2b) and (2c) that the solutions of (a) and (b1) (or (b2)) affect each other over multiple time slots through the update of $H_{i,t}$, $\forall i$. To perform WMRA, no statistical information of the system is needed, which makes the algorithm easy to implement.

Remarks: In this work, we focus on centralized control, and the proposed WMRA algorithm is implemented by the aggregator in a centralized way. If each EV is aware of $\omega_i$, $U(\cdot)$, $V$, $G_t$, $e_{s,t}$, and $e_{d,t}$ besides its own information, the optimization problems (a), (b1), and (b2) can be solved by all EVs in a distributed way. With currently available communication platforms, it is possible for each EV to obtain the imbalance signal $G_t$ and the unit costs $e_{s,t}$ and $e_{d,t}$ in real time. However, since the weights $\omega_i$ and the utility function $U(\cdot)$ are designed by the aggregator for fair allocation, the aggregator may not wish to share this design with all EVs.

3.3 Performance Analysis

In this section, we characterize the performance of WMRA with respect to our original problem P1.


3.3.1 Properties of WMRA Algorithm

We now show that WMRA can ensure the boundedness of each EV’s energy state. The following lemma charac-

terizes sufficient conditions under which the solution of xid,t and xiu,t under WMRA is zero.

Lemma 3.4 Under the WMRA algorithm, for any $t \in \mathcal{T}_{i,p}$,

1. for $G_t > 0$, if $K_{i,t} > x_{i,\max} + V(\omega_i\mu + e_{\max})$, then $x_{id,t} = 0$, which means that $K_{i,t+1}$ cannot be increased at the next time slot; and

2. for $G_t < 0$, if $K_{i,t} < -x_{i,\max} - V(\omega_i\mu + e_{\max})$, then $x_{iu,t} = 0$, which means that $K_{i,t+1}$ cannot be decreased at the next time slot.


Lemma 3.4 thus provides conditions under which the queue backlog $K_{i,t}$ can no longer increase or decrease. Using Lemma 3.4, we can prove the boundedness of $K_{i,t}$ below.

Lemma 3.5 Under the WMRA algorithm, queue backlog Ki,t associated with the i-th EV is bounded within

[si,min − ci, si,max − ci], ∀t ∈ Ti,p ∪ Ti,l.


In the proof of Lemma 3.5, we remark on the specific designs of $c_i$ and $V_{\max}$, which ensure the boundedness of $K_{i,t}$ within a shifted preferred energy range.

From Lemma 3.5, the boundedness condition on $K_{i,t}$ in Lemma 3.3 is satisfied, and therefore the conclusion of Lemma 3.3 holds under WMRA. Since $K_{i,t} = s_{i,t} - c_i$, $\forall t \in \mathcal{T}_{i,p} \cup \mathcal{T}_{i,l}$, the following lemma follows directly from Lemma 3.5.

Lemma 3.6 Under the WMRA algorithm, the energy state of the $i$-th EV is bounded within $[s_{i,\min}, s_{i,\max}]$, $\forall t \in \mathcal{T}_{i,p} \cup \mathcal{T}_{i,l}$.

From Lemma 3.6, constraints (3.4) and (3.5) in P2 are met under WMRA.

3.3.2 Optimality of WMRA Algorithm

In this subsection, we investigate the optimality of WMRA by considering EVs with both predictable and random

dynamics, which are described below.

1. EVs with predictable dynamics: Predictable dynamics could happen when each EV joins and leaves the

aggregator-EV system regularly (e.g. from 9am to 12pm in the morning, then from 2pm to 6pm in the

afternoon). Therefore, the leaving and returning time slots of each EV can be predicted by the aggregator. In

other words, the aggregator is aware of the realization of 1t, ∀t, in advance. In this case, the random system

state at time slot t is defined as At,(Gt, es,t, ed,t). A specific case of EVs with predictable dynamics are

static EVs, i.e., 1i,t = 1, ∀i, t.

2. EVs with random dynamics: If the EVs do not participate in the aggregator-EV system regularly, then the

aggregator cannot predict their dynamics beforehand, and therefore has to observe 1t every time slot. In

this case, the random system state at time slot t is defined as At,(Gt, es,t, ed,t,1t).


Note that the WMRA algorithm is the same under both of the above cases. The only difference between them

is that, in the optimization problem P3, the expectations are taken over different randomness of the system state.

The performance under WMRA as compared to the optimal solution of P1 is given in the following theorem,

which applies to both predictable and random dynamics.

Theorem 3.1 Under the assumptions A1, A2, and A3, given that the system state $A_t$ is i.i.d. over time,

1. $(\mathbf{x}_{d,t}, \mathbf{x}_{u,t})$ is feasible for P1, i.e., it satisfies (3.4)–(3.8).

2. $f_1(\mathbf{x}_{d,t}, \mathbf{x}_{u,t}) \ge f_1(\mathbf{x}^{opt}_{d,t}, \mathbf{x}^{opt}_{u,t}) - \frac{B}{V}$, where $B$ is defined in (3.25) and $V \in (0, V_{\max}]$.


Remarks: From Theorem 3.1, the welfare performance of WMRA is within $O(1/V)$ of the optimum. Hence, the larger $V$ is, the better the performance of WMRA. However, in practice, due to the boundedness condition of the EV’s battery capacity, $V$ cannot be arbitrarily large and is upper bounded by $V_{\max}$, which is defined in (3.22). Note that $V_{\max}$ increases with the smallest span of the EVs’ preferred battery capacity ranges, i.e., $\min_{1 \le i \le N}\{s_{i,\max} - s_{i,\min}\}$. Therefore, roughly speaking, the performance gap between WMRA and the optimum decreases as the smallest battery capacity increases. Asymptotically, as the EVs’ battery capacities go to infinity, WMRA would achieve exactly the optimum.

In Theorem 3.1, the i.i.d. condition of At can be relaxed to Markovian, and a similar performance bound can

be obtained. In particular, this relaxed condition can accommodate the case where Gt is Markovian and has a

ramp rate constraint (|Gt − Gt−1| ≤ ramp rate ×∆t), by properly designing the transition probability matrix of

Gt.
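One way to encode a ramp rate constraint in the transition probability matrix is to assign zero probability to jumps that exceed the allowed ramp. The following is a minimal sketch under stated assumptions: the regulation levels lie on a uniform grid, the next level is uniform over all reachable levels, and the function name `ramp_limited_transition_matrix` is hypothetical. It is not the construction used in the thesis.

```python
def ramp_limited_transition_matrix(levels, max_ramp):
    """Build a row-stochastic matrix P with P[a][b] > 0 only if
    |levels[b] - levels[a]| <= max_ramp.

    Assumption (illustrative only): the next level is uniformly
    distributed over the reachable levels.
    """
    n = len(levels)
    matrix = []
    for a in range(n):
        # Small tolerance guards against floating-point rounding on the grid.
        reachable = [b for b in range(n)
                     if abs(levels[b] - levels[a]) <= max_ramp + 1e-12]
        prob = 1.0 / len(reachable)
        matrix.append([prob if b in reachable else 0.0 for b in range(n)])
    return matrix


# Hypothetical grid: five levels of G_t (kWh), ramp limit of one grid step per slot.
levels = [-1.0, -0.5, 0.0, 0.5, 1.0]
P = ramp_limited_transition_matrix(levels, max_ramp=0.5)
```

In this example every level can stay put or move to a neighbor, so the resulting chain is irreducible and aperiodic, which is the condition required by Theorem 3.2.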

Theorem 3.2 Under the assumptions A1, A2, and A3, given that the system state At evolves based on a finite

state irreducible and aperiodic Markov chain,

1. (xd,t, xu,t) is feasible for P1, i.e., it satisfies (3.4)–(3.8).

2. $f_1(\mathbf{x}_{d,t}, \mathbf{x}_{u,t}) \ge f_1(\mathbf{x}^{opt}_{d,t}, \mathbf{x}^{opt}_{u,t}) - O(1/V)$, where V ∈ (0, Vmax].

Proof: The above results can be proved by expanding the proof of Theorem 3.1 using a multi-slot drift

technique [16]. We omit the proof here for brevity.


Besides the analytical performance bound derived above, we are further interested in evaluating WMRA in exam-

ple numerical settings. Towards this goal, we have developed an aggregator-EV model in Matlab and compared

WMRA with a greedy algorithm.

Suppose that the aggregator is connected with N = 100 EVs, evenly split into Type I (based on the 2012 Ford

Focus Electric) and Type II (based on the Tesla Model S base model). The parameters of Type I and Type II EVs

are summarized in Table 3.1 [79, 80]. The maximum regulation amount xi,max can be derived by multiplying the

maximum charging and discharging rate with the regulation interval ∆t. In current practice, ∆t is of the order of

seconds. For example, for PJM, ∆t = 2 seconds [81], and for NYISO, ∆t = 6 seconds [82]. In simulations, we

set ∆t = 5 seconds as an example.
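The conversion from a power rating to a per-slot regulation amount can be checked with a few lines of arithmetic. The sketch below uses the rates from Table 3.1 and ∆t = 5 seconds; the helper name `max_regulation_kwh` is an illustrative assumption.

```python
SECONDS_PER_HOUR = 3600.0


def max_regulation_kwh(rate_kw, delta_t_seconds):
    """Energy (kWh) exchanged in one regulation interval at the maximum rate."""
    return rate_kw * delta_t_seconds / SECONDS_PER_HOUR


delta_t = 5.0  # seconds, as in the simulations
x_max_type1 = max_regulation_kwh(6.6, delta_t)   # Type I EV
x_max_type2 = max_regulation_kwh(10.0, delta_t)  # Type II EV

# N = 100 EVs, evenly split between the two types.
total_kwh = 50 * x_max_type1 + 50 * x_max_type2
print(x_max_type1, x_max_type2, total_kwh)
```

With all 100 EVs present, the total comes to roughly 1.15 kWh per slot, which agrees with the bound on the regulation signal Gt used in the simulation setup.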

Consider that the system state At = (Gt, es,t, ed,t,1t) follows a finite state irreducible and aperiodic Markov

chain. For the regulation signal Gt, we ignore the ramp rate constraint in our simulations. At each time slot, we


Table 3.1: Parameters of Type I and Type II EVs

Parameter                                   Type I EV    Type II EV
Energy capacity si,cap (kWh)                23           40
Maximum charging/discharging rate (kW)      6.6          10

Figure 3.1: Transition probabilities of 1i,t, ∀i. [State diagram: two-state Markov chain over the states 1i,t = 1 and 1i,t = 0, with edge labels 0.95, 0.05, p = 0.95, and 1 − p.]

draw a sample of Gt from a uniformly distributed set {−1.15,−1.15+∆1,−1.15+2∆1, · · · , 1.15} (kWh) with

the cardinality 200, where 1.15 kWh is the maximum allowed regulation amount at each time slot if all N EVs

are in the system. The unit costs of the external sources, es,t and ed,t, are drawn uniformly from a discrete set

{0.1, 0.1 + ∆2, 0.1 + 2∆2, · · · , 0.12} (dollars/kWh) with the cardinality 200. The lower bound 0.1 dollars/kWh

and the upper bound 0.12 dollars/kWh correspond to the mid-peak and the on-peak electricity prices in Ontario,

respectively [76]. The dynamics of each EV is described by the indicator random variable 1i,t, which represents

whether the i-th EV is in the system at time slot t. In particular, we assume that 1i,t follows a two-state Markov

chain as shown in Fig. 3.1. The state transition probability p ≜ P(0 → 1) is set to 0.95 by default.
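The EV dynamics described above can be illustrated with a short simulation of the two-state chain. This is a sketch under stated assumptions: `p_return` is the probability p = P(0 → 1) from the text, while `p_stay` = P(1 → 1) = 0.95 is an assumed value read off the edge labels of Fig. 3.1; the function name and the seed are hypothetical.

```python
import random


def simulate_presence(num_slots, p_return, p_stay, rng, start_present=True):
    """Simulate the indicator 1_{i,t} of one EV over num_slots time slots.

    p_return: P(0 -> 1), probability that an absent EV is present next slot.
    p_stay:   P(1 -> 1), probability that a present EV stays (assumed value).
    """
    present = start_present
    path = []
    for _ in range(num_slots):
        path.append(1 if present else 0)
        threshold = p_stay if present else p_return
        present = rng.random() < threshold
    return path


rng = random.Random(0)  # fixed seed for reproducibility
path = simulate_presence(10_000, p_return=0.95, p_stay=0.95, rng=rng)
fraction_present = sum(path) / len(path)
```

The stationary probability of being present is p_return / (p_return + 1 − p_stay), so lowering p reduces the average number of EVs available for regulation.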

For the i-th EV, the (k + 1)-th returning energy state $s_{i,t_{ir,k+1}}$ is drawn uniformly from the interval
$[s_{i,t_{il,k}} - \Delta_3,\ s_{i,t_{il,k}} + \Delta_3]$, where $s_{i,t_{il,k}}$ is the k-th leaving energy state of the i-th EV and
$\Delta_3 = 5\%\, s_{i,cap}$ (see footnote 3). We set the minimum preferred energy state si,min = 0.1si,cap and the
maximum preferred energy state si,max = 0.9si,cap unless otherwise mentioned. In the objective function of P1,
we set U(x) = log(1 + x) and ωi = 1, ∀i. Since the degradation cost function Ci(·) is proprietary and unavailable,
in simulations, we set $C_i(x) = x^2$ as an example. The upper bound ci,up is set to be $x_{i,max}^2/4$.

To allocate the requested regulation amount, we apply WMRA in Algorithm 3.1 at each time slot. For com-

parison, we consider a greedy algorithm which only optimizes the system performance at the current time slot.

The regulation allocation at each time slot is determined by the following optimization problem.

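\[
\max_{\mathbf{x}_{d,t},\,\mathbf{x}_{u,t}} \ \left(\sum_{i=1}^{N} \omega_i U(x_{i,t})\right) - e_t
\]
\[
\text{s.t. } (3.4), (3.5), (3.6), (3.7), \text{ and } \ \mathbf{1}_{d,t} C_i(x_{id,t}) + \mathbf{1}_{u,t} C_i(x_{iu,t}) \le c_{i,up}, \ \forall i.
\]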

The above problem is a convex optimization problem, and we use the standard solver in MATLAB to obtain its

solution. For all figures, we omit drawing confidence intervals since they are small.

In Figs. 3.2 and 3.3, we compare the performance of WMRA with V = Vmax and the performance of the

greedy algorithm. From Fig. 3.2, with si,max = 0.9si,cap, WMRA is uniformly superior to the greedy algorithm
over all time slots, with an advantage of about 40%. In Fig. 3.3, we set the transition probability p to be 0.95 and

0.05, and vary si,max from 0.3si,cap to 0.9si,cap. For p = 0.95, the observations are as follows. First, WMRA

Footnote 3: We ensure that all returning energy states are within the preferred range [si,min, si,max] by ignoring unqualified samples.


Figure 3.2: Time-averaged social welfare with V = Vmax. [Plot: social welfare versus time slot (0 to 1000) for the WMRA algorithm with V = Vmax and for the greedy algorithm.]

Figure 3.3: Time-averaged social welfare with various si,max and V = Vmax. [Plot: social welfare versus si,max/si,cap (0.3 to 0.9) for the WMRA algorithm with V = Vmax and for the greedy algorithm, each at p = 0.95 and p = 0.05.]

uniformly outperforms the greedy algorithm over different values of si,max. Second, as si,max increases, the

social welfare under WMRA slightly rises. This is because increasing si,max effectively increases Vmax, which

improves the performance of WMRA. This observation is also consistent with the remarks after Theorem 3.1. In

contrast, the greedy algorithm cannot benefit from the expanded energy range. For p = 0.05, the trends of the

curves resemble those for p = 0.95, but the social welfare of both algorithms drops. This is because when p is

decreased, roughly speaking, there are fewer EVs in the system for the regulation service. Hence, to provide the
requested regulation amount, the aggregator relies more on the expensive external energy sources, which leads to
decreased social welfare.

In Fig. 3.4, we show the performance of WMRA with the value of V ranging from 0.2Vmax to 5Vmax, and

compare it with the performance of the greedy algorithm. For WMRA, as expected, the social welfare grows
with the value of V; also, the growth rate slows as V gets larger. Moreover, we observe that WMRA
outperforms the greedy algorithm even with V = 0.2Vmax.


Figure 3.4: Time-averaged social welfare with various values of V. [Plot: social welfare versus V/Vmax (up to 5) for the WMRA algorithm and the greedy algorithm.]

In Lemma 3.6, the energy state of each EV is shown to be restricted within [si,min, si,max] when V ∈ (0, Vmax].
In Fig. 3.5, for V being Vmax, 2Vmax, and 5Vmax, we show the evolution of a Type I EV's energy
state under WMRA. We see that, when V = Vmax, the energy state is always within the preferred range. In contrast,
when V = 2Vmax or 5Vmax, the associated energy state can exceed the preferred range from time to time.
Furthermore, the larger V is, the more frequently such violations happen. Therefore, the observations in Figs. 3.4
and 3.5 demonstrate the significance of Vmax in achieving the maximum social welfare under WMRA while respecting
the constraint of each EV's preferred energy range.

Figure 3.5: Sample path of a Type I EV's energy state with V = [1, 2, 5]Vmax. [Plot: energy state versus time slot (up to 1000) for V = Vmax, 2Vmax, and 5Vmax, with the level si,max marked.]


3.5 Summary

We have studied a practical model of a dynamic aggregator-EV system providing regulation service to a power

grid. We have formulated the regulation allocation optimization as a long-term time-averaged social welfare

maximization problem. Our formulation accounts for random system dynamics, battery constraints, the costs

for battery degradation and external energy sources, and especially, the dynamics of EVs. Adopting a general

Lyapunov optimization framework, we have developed a real-time WMRA algorithm for the aggregator to fairly

allocate the regulation amount among EVs. The algorithm does not require any knowledge of the statistics of the

system state. We have bounded the performance gap between WMRA and the optimal solution, and
shown that WMRA is asymptotically optimal as the EVs' battery capacities go to infinity. Simulations
have demonstrated that WMRA offers substantial performance gains over a greedy algorithm that maximizes
the per-slot social welfare.

3.6 Appendices


It is easy to see that $(\mathbf{x}^*_{d,t}, \mathbf{x}^*_{u,t})$ is feasible for P1. To show that $(\mathbf{x}^{opt}_{d,t}, \mathbf{x}^{opt}_{u,t}, \mathbf{z}^{opt}_t)$ is feasible for P2, it suffices to
show that $\mathbf{z}^{opt}_t$ satisfies (3.10) and (3.11). Using the definition of $z^{opt}_{i,t}$, (3.11) naturally holds. Also, since $x^{opt}_{i,t}$ lies
in [0, xi,max], which is a closed interval, (3.10) holds.

We claim that

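\begin{align}
f_1(\mathbf{x}^{opt}_{d,t}, \mathbf{x}^{opt}_{u,t}) &= f_2(\mathbf{x}^{opt}_{d,t}, \mathbf{x}^{opt}_{u,t}, \mathbf{z}^{opt}_t) \notag\\
&\le f_2(\mathbf{x}^*_{d,t}, \mathbf{x}^*_{u,t}, \mathbf{z}^*_t) \notag\\
&\le f_1(\mathbf{x}^*_{d,t}, \mathbf{x}^*_{u,t}) \notag\\
&\le f_1(\mathbf{x}^{opt}_{d,t}, \mathbf{x}^{opt}_{u,t}). \tag{3.26}
\end{align}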

Using the definition of $z^{opt}_{i,t}$ in f2(·), the first equality holds. The first and the third inequalities hold since
$(\mathbf{x}^*_{d,t}, \mathbf{x}^*_{u,t}, \mathbf{z}^*_t)$ and $(\mathbf{x}^{opt}_{d,t}, \mathbf{x}^{opt}_{u,t})$ are optimal for f2(·) and f1(·), respectively. The second inequality is derived
using Jensen's inequality for concave functions. Since the first and the last expressions in (3.26) are identical, all inequalities in (3.26)
turn into equalities, which indicates the equivalence of P1 and P2.


Let T be large enough. For the i-th EV, decompose the total effective charging and discharging amount within T

time slots as

\[
\sum_{t=0}^{T-1} b_{i,t} = \sum_{t=0}^{t_{il,k^*}-1} b_{i,t} + \sum_{t=t_{il,k^*}}^{T-1} b_{i,t}, \tag{3.27}
\]

where $k^* \triangleq \max\{k : t_{il,k} \le T-1,\ k \in \{1, 2, \cdots\}\}$ is defined to be the total number of the leaving times of
the i-th EV up to time slot T − 1. On the right hand side of (3.27), the first term corresponds to the total effective
charging and discharging amount before the last leaving time, and the second term corresponds to the rest of the
total effective charging and discharging amount. Using the decomposition in (3.27), to show (3.14), it suffices to
show that the two limits $\lim_{T\to\infty}\frac{1}{T}E\big[\sum_{t=0}^{t_{il,k^*}-1} b_{i,t}\big]$ and $\lim_{T\to\infty}\frac{1}{T}E\big[\sum_{t=t_{il,k^*}}^{T-1} b_{i,t}\big]$ are both equal to zero.

First consider the second limit. For the i-th EV, if there is no return between $t_{il,k^*}$ and T − 1, then $\sum_{t=t_{il,k^*}}^{T-1} b_{i,t} = 0$
and thus $\lim_{T\to\infty}\frac{1}{T}E\big[\sum_{t=t_{il,k^*}}^{T-1} b_{i,t}\big] = 0$. Or, if there is one return, then
\[
\sum_{t=t_{il,k^*}}^{T-1} b_{i,t} = s_{i,T} - s_{i,t_{ir,k^*+1}}.
\]
Using the boundedness condition of si,t, we have $\lim_{T\to\infty}\frac{1}{T}E\big[\sum_{t=t_{il,k^*}}^{T-1} b_{i,t}\big] = 0$. Together, the second limit is
zero.

Next we show that the first limit is also zero. Based on the energy state evolution in (3.1), there is

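\begin{align}
\sum_{t=0}^{t_{il,k^*}-1} b_{i,t} &= \sum_{k=1}^{k^*} s_{i,t_{il,k}} - \sum_{k=1}^{k^*} s_{i,t_{ir,k}} \notag\\
&= s_{i,t_{il,k^*}} - s_{i,t_{ir,1}} - \sum_{k=1}^{k^*-1} \Delta_{i,k}. \tag{3.28}
\end{align}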

Taking expectations of both sides of (3.28), dividing them by T , then taking limits gives

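\begin{align*}
\lim_{T\to\infty}\frac{1}{T}E\Big[\sum_{t=0}^{t_{il,k^*}-1} b_{i,t}\Big]
&= \lim_{T\to\infty}\frac{1}{T}E\big[s_{i,t_{il,k^*}} - s_{i,t_{ir,1}}\big] - \lim_{T\to\infty}\frac{1}{T}E\Big[\sum_{k=1}^{k^*-1}\Delta_{i,k}\Big] \\
&= 0,
\end{align*}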

where the last equality is derived by the boundedness of si,t and the assumption A2. This completes the proof.


Based on the definition of L(Θt), the difference
\[
L(\Theta_{t+1}) - L(\Theta_t) = \frac{1}{2}\sum_{i=1}^{N}\left(H_{i,t+1}^2 + J_{i,t+1}^2 + K_{i,t+1}^2 - H_{i,t}^2 - J_{i,t}^2 - K_{i,t}^2\right). \tag{3.29}
\]

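In (3.29), $H_{i,t+1}^2 - H_{i,t}^2$ and $J_{i,t+1}^2 - J_{i,t}^2$ can be upper bounded as follows.
\begin{align}
H_{i,t+1}^2 - H_{i,t}^2 &\le 2H_{i,t}(z_{i,t} - x_{i,t}) + x_{i,max}^2 \tag{3.30}\\
J_{i,t+1}^2 - J_{i,t}^2 &\le 2J_{i,t}\big[\mathbf{1}_{d,t}C_i(x_{id,t}) + \mathbf{1}_{u,t}C_i(x_{iu,t}) - c_{i,up}\big] + \big[c_{i,up}^2,\ (c_{i,max} - c_{i,up})^2\big]^+. \tag{3.31}
\end{align}
Taking conditional expectations for both sides in (3.30) and (3.31), we have
\begin{align}
E[H_{i,t+1}^2 - H_{i,t}^2 \,|\, \Theta_t] &\le 2H_{i,t}E[z_{i,t} - x_{i,t} \,|\, \Theta_t] + x_{i,max}^2 \tag{3.32}\\
E[J_{i,t+1}^2 - J_{i,t}^2 \,|\, \Theta_t] &\le 2J_{i,t}E\big[\mathbf{1}_{d,t}C_i(x_{id,t}) + \mathbf{1}_{u,t}C_i(x_{iu,t}) - c_{i,up} \,|\, \Theta_t\big] + \big[c_{i,up}^2,\ (c_{i,max} - c_{i,up})^2\big]^+. \tag{3.33}
\end{align}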

Now consider $K_{i,t+1}^2 - K_{i,t}^2$. When 1i,t = 1, we have Ki,t+1 = Ki,t + bi,t and thus
\[
K_{i,t+1}^2 - K_{i,t}^2 \le 2K_{i,t}b_{i,t} + x_{i,max}^2. \tag{3.34}
\]
When 1i,t = 0, we have bi,t = 0 and there are two cases. First, for t ∈ {til,k, til,k + 1, · · · , tir,k+1 − 2}, ∀k ∈ {1, 2, · · · },
there is Ki,t+1 = Ki,t. So, we can express
\[
K_{i,t+1}^2 - K_{i,t}^2 = 2K_{i,t}b_{i,t}. \tag{3.35}
\]


Second, for t = tir,k+1 − 1, ∀k ∈ {1, 2, · · · }, we have $K_{i,t} = s_{i,t_{il,k}} - c_i$ and $K_{i,t+1} = K_{i,t} + \Delta_{i,k}$. Hence, by
the assumption A1,
\[
K_{i,t+1}^2 - K_{i,t}^2 \le 2K_{i,t}\Delta_{i,k} + \Delta_{i,max}^2. \tag{3.36}
\]
Using the assumption A3, from (3.34), (3.35), and (3.36), we have
\[
E[K_{i,t+1}^2 - K_{i,t}^2 \,|\, \Theta_t] \le 2K_{i,t}E[b_{i,t} \,|\, \Theta_t] + x_{i,max}^2 + \Delta_{i,max}^2. \tag{3.37}
\]

Using the definition of ∆(Θt) and the upper bounds in (3.32), (3.33), and (3.37), we can derive the upper bound

on the drift-minus-welfare function in Proposition 3.1.


We need the following lemma.

Lemma 3.7 Under the WMRA algorithm, queue backlog Hi,t associated with the i-th EV is upper bounded as

follows:

\[
H_{i,t} \le V\omega_i\mu + x_{i,max}.
\]

Proof: This can be shown using a method similar to that in [16], and the technical condition (3.9) is needed.
1) Consider Gt > 0. Suppose that when Ki,t > xi,max + V(ωiµ + emax), one optimal solution under WMRA
is xd,t with xid,t > 0. Then we show that we can find another solution with xjd,t, ∀j ≠ i and xid,t = 0 resulting
in a strictly smaller objective value, which is a contradiction.

Using the objective function of (b1), this is equivalent to showing that

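\begin{align*}
& V e_{s,t}\Big(G_t - \sum_{j=1}^{N} x_{jd,t}\Big) - \sum_{j=1}^{N} H_{j,t}x_{jd,t} + \sum_{j=1}^{N} J_{j,t}C_j(x_{jd,t}) + \sum_{j=1}^{N} K_{j,t}x_{jd,t} \\
> {} & V e_{s,t}\Big(G_t - \sum_{j=1}^{N} x_{jd,t} + x_{id,t}\Big) - \sum_{j \neq i} H_{j,t}x_{jd,t} + \sum_{j \neq i} J_{j,t}C_j(x_{jd,t}) + \sum_{j \neq i} K_{j,t}x_{jd,t},
\end{align*}
which is equivalent to
\[
-H_{i,t}x_{id,t} + J_{i,t}C_i(x_{id,t}) + K_{i,t}x_{id,t} > V e_{s,t}x_{id,t}. \tag{3.38}
\]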

Since $J_{i,t}C_i(x_{id,t}) \ge 0$, from (3.38), it suffices to show that
\[
(K_{i,t} - H_{i,t} - V e_{s,t})\,x_{id,t} > 0. \tag{3.39}
\]
Since xid,t > 0, (3.39) is true by using the assumption that Ki,t > xi,max + V(ωiµ + emax) and Lemma 3.7, in
which Hi,t is upper bounded.

2) Consider Gt < 0. Suppose that when Ki,t < −xi,max − V(ωiµ + emax), one optimal solution under WMRA
is xu,t with xiu,t > 0. Then there is a contradiction since we can construct another solution with xju,t, ∀j ≠ i
and xiu,t = 0 which results in a strictly smaller objective value. The proof is similar to that in 1) and is omitted
here.


Consider the set {tir,k, tir,k + 1, · · · , til,k} for any k ∈ {1, 2, · · · }. We show below by induction that Ki,t is bounded for any
t in this set.

First consider the upper bound. For the time slot tir,k, based on (3.21) and si,tir,k ≤ si,max, we have Ki,tir,k ≤ si,max − ci.
Assume that the upper bound holds for time slot t and consider the following two cases of Ki,t.

Case 1: xi,max + V(ωiµ + emax) < Ki,t ≤ si,max − ci (we can check that xi,max + V(ωiµ + emax) < si,max − ci
since V ≤ Vmax). For Gt > 0, from Lemma 3.4 1), we have xid,t = 0. Therefore, Ki,t+1 = Ki,t ≤ si,max − ci.
For Gt < 0, we have Ki,t+1 = Ki,t − xiu,t ≤ Ki,t ≤ si,max − ci.

Case 2: Ki,t ≤ xi,max + V(ωiµ + emax). From (3.21), Ki,t+1 ≤ 2xi,max + V(ωiµ + emax) ≤ si,max − ci,
where the last inequality holds since V ≤ Vmax.

Now look at the lower bound. For the time slot tir,k, based on (3.21) and si,tir,k ≥ si,min, we have Ki,tir,k ≥ si,min − ci.
Assume that the lower bound holds for time slot t and consider the following two cases of Ki,t.

Case 1′: si,min − ci ≤ Ki,t < −xi,max − V(ωiµ + emax) (we can check that si,min − ci < −xi,max − V(ωiµ + emax)
since xi,max > 0). For Gt < 0, from Lemma 3.4 2), we have xiu,t = 0. Therefore, Ki,t+1 = Ki,t ≥ si,min − ci.
For Gt > 0, we have Ki,t+1 = Ki,t + xid,t ≥ Ki,t ≥ si,min − ci.

Case 2′: Ki,t ≥ −xi,max − V(ωiµ + emax). From (3.21), Ki,t+1 ≥ −2xi,max − V(ωiµ + emax), which is
exactly si,min − ci.

Remarks: To track the energy state si,t, in principle, the shift ci can be any number. However, to make
the proof in Case 2′ work, ci is lower bounded, i.e., it should satisfy $c_i = s_{i,min} + 2x_{i,max} + V(\omega_i\mu + e_{max}) + \epsilon_1$
where $\epsilon_1 \ge 0$. For the design of Vmax, to make the proof in Case 1 work, it is sufficient to let
\[
V_{max} = \min_{1 \le i \le N}\left\{\frac{s_{i,max} - s_{i,min} - 3x_{i,max} - \epsilon_1 - \epsilon_2}{2(\omega_i\mu + e_{max})}\right\}
\]
where $\epsilon_2 > 0$. Based on the proof in Case 2, $\epsilon_1$ and $\epsilon_2$ are further
determined as 0 and xi,max, respectively, to make Vmax as large as possible.
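The expression for Vmax in the remarks can be evaluated directly once ε1 = 0 and ε2 = xi,max are substituted, which gives a numerator of si,max − si,min − 4xi,max. The sketch below uses the simulation parameters of this chapter (capacities of 23 and 40 kWh, a preferred range of 10% to 90% of capacity, ∆t = 5 seconds, ωi = 1, emax = 0.12). The value µ = 1 is an assumption made only for illustration, and the helper names are hypothetical.

```python
def make_ev(capacity_kwh, rate_kw, delta_t_seconds=5.0, omega=1.0):
    """Collect the per-EV quantities that enter the V_max formula."""
    return {
        "s_min": 0.1 * capacity_kwh,
        "s_max": 0.9 * capacity_kwh,
        "x_max": rate_kw * delta_t_seconds / 3600.0,  # kWh per slot
        "omega": omega,
    }


def v_max(evs, mu, e_max):
    """V_max = min_i (s_max - s_min - 4 x_max) / (2 (omega mu + e_max)).

    This is the formula from the remarks with eps1 = 0 and eps2 = x_max.
    """
    candidates = []
    for ev in evs:
        numerator = ev["s_max"] - ev["s_min"] - 4.0 * ev["x_max"]
        denominator = 2.0 * (ev["omega"] * mu + e_max)
        candidates.append(numerator / denominator)
    return min(candidates)


evs = [make_ev(23.0, 6.6), make_ev(40.0, 10.0)]  # Type I and Type II
bound = v_max(evs, mu=1.0, e_max=0.12)           # mu = 1.0 is an assumed value
```

Because the minimum is taken over all EVs, the EV with the narrowest preferred range (here Type I) determines the bound.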


We first give the following fact, which is a direct consequence of the results in [16].

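Lemma 3.8 There exists a stationary randomized regulation allocation solution $(\mathbf{x}^s_{d,t}, \mathbf{x}^s_{u,t})$ that only depends on
the system state At, such that
\begin{align}
E[x^s_{i,t}] &= z^s_i, \ \forall i, \text{ for some } z^s_i \in [0, x_{i,max}], \tag{3.40}\\
E[e^s_t] - \sum_{i=1}^{N} \omega_i U(z^s_i) &\le -f_2(\mathbf{x}_{d,t}, \mathbf{x}_{u,t}, \mathbf{z}_t), \tag{3.41}\\
E\big[\mathbf{1}_{d,t}C_i(x^s_{id,t}) + \mathbf{1}_{u,t}C_i(x^s_{iu,t})\big] &\le c_{i,up}, \ \forall i, \text{ and} \tag{3.42}\\
E[b^s_{i,t}] &= 0, \ \forall i, \tag{3.43}
\end{align}
where the expectations are taken over the randomness of the system and the randomness of $(\mathbf{x}^s_{d,t}, \mathbf{x}^s_{u,t})$, and
(xd,t, xu,t, zt) is an optimal solution for P3.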


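1) For brevity, define $W_t \triangleq \big(\sum_{i=1}^{N} \omega_i U(z_{i,t})\big) - e_t$. Since WMRA minimizes the upper bound in (3.24), plugging
$(\mathbf{x}^s_{d,t}, \mathbf{x}^s_{u,t})$ into the right hand side of (3.24) together with $z_{i,t} = z^s_i$, ∀t, we have
\[
\Delta(\Theta_t) - V E[W_t \,|\, \Theta_t] \le B - V f_2(\mathbf{x}_{d,t}, \mathbf{x}_{u,t}, \mathbf{z}_t), \tag{3.44}
\]
where (3.40), (3.41), (3.42), and (3.43) are used. Since $W_t \le \sum_{i=1}^{N} \omega_i U(x_{i,max})$, from (3.44),
\[
\Delta(\Theta_t) \le D \triangleq B + V\left(\sum_{i=1}^{N} \omega_i U(x_{i,max}) - f_2(\mathbf{x}_{d,t}, \mathbf{x}_{u,t}, \mathbf{z}_t)\right).
\]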

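Using Theorem 4.1 in [16], E[|Hi,t|] and E[|Ji,t|] are upper bounded by $\sqrt{2tD + 2E[L(\Theta_0)]}$, ∀t. Hence, the
virtual queues Hi,t and Ji,t are mean rate stable and the following limit constraints hold.
\begin{align}
\lim_{T\to\infty}\frac{1}{T}\sum_{t=0}^{T-1} E[z_{i,t}] &= \lim_{T\to\infty}\frac{1}{T}\sum_{t=0}^{T-1} E[x_{i,t}], \ \forall i, \tag{3.45}\\
\lim_{T\to\infty}\frac{1}{T}\sum_{t=0}^{T-1} E\big[\mathbf{1}_{d,t}C_i(x_{id,t}) + \mathbf{1}_{u,t}C_i(x_{iu,t})\big] &\le c_{i,up}, \ \forall i. \notag
\end{align}
Since si,t is bounded under WMRA by Lemma 3.6, using Lemma 3.2, we have $\lim_{T\to\infty}\frac{1}{T}\sum_{t=0}^{T-1} E[b_{i,t}] = 0$, ∀i.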

In addition, note that (xd,t, xu,t) is derived under the constraints of the optimization problems (a), (b1), and (b2).

Therefore, we have that (xd,t, xu,t) is feasible for P3, P2, and P1.

2) Taking expectations of both sides of (3.44) and summing over t ∈ {0, 1, · · · , T − 1} for some T > 1, we

have

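\begin{align}
\frac{1}{T}\sum_{t=0}^{T-1} E[W_t] &\ge \frac{E[L(\Theta_T) - L(\Theta_0)]}{VT} + f_2(\mathbf{x}_{d,t}, \mathbf{x}_{u,t}, \mathbf{z}_t) - B/V \notag\\
&\ge f_2(\mathbf{x}_{d,t}, \mathbf{x}_{u,t}, \mathbf{z}_t) - B/V - E[L(\Theta_0)]/(VT), \tag{3.46}
\end{align}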

where (3.46) holds since L(ΘT ) is non-negative. Also,

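\begin{align}
\frac{1}{T}\sum_{t=0}^{T-1} E[W_t] &= \frac{1}{T}\sum_{t=0}^{T-1} E\left[\Big(\sum_{i=1}^{N} \omega_i U(z_{i,t})\Big) - e_t\right] \notag\\
&\le \sum_{i=1}^{N} \omega_i U\left(\frac{1}{T}\sum_{t=0}^{T-1} E[z_{i,t}]\right) - \frac{1}{T}\sum_{t=0}^{T-1} E[e_t], \tag{3.47}
\end{align}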

where the inequality in (3.47) is derived using Jensen’s inequality for concave functions. Combining (3.46) and

(3.47) and taking limits on both sides, there is

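\begin{align}
& \sum_{i=1}^{N} \omega_i U\left(\lim_{T\to\infty}\frac{1}{T}\sum_{t=0}^{T-1} E[z_{i,t}]\right) - \lim_{T\to\infty}\frac{1}{T}\sum_{t=0}^{T-1} E[e_t] \notag\\
& \ge f_2(\mathbf{x}_{d,t}, \mathbf{x}_{u,t}, \mathbf{z}_t) - B/V \tag{3.48}\\
& \ge f_2(\mathbf{x}^*_{d,t}, \mathbf{x}^*_{u,t}, \mathbf{z}^*_t) - B/V \tag{3.49}\\
& = f_1(\mathbf{x}^{opt}_{d,t}, \mathbf{x}^{opt}_{u,t}) - B/V, \tag{3.50}
\end{align}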

where $(\mathbf{x}^*_{d,t}, \mathbf{x}^*_{u,t}, \mathbf{z}^*_t)$ and $(\mathbf{x}^{opt}_{d,t}, \mathbf{x}^{opt}_{u,t})$ are defined in Section 3.2.1, (3.48) holds since E[L(Θ0)] is bounded, (3.49)
holds since the feasible set of the optimization variables is enlarged from P2 to P3, and (3.50) is true due to Lemma
3.1.

Rewrite the objective function of P1 under WMRA, i.e., f1(xd,t, xu,t), as

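\begin{align*}
& \sum_{i=1}^{N} \omega_i U\left(\lim_{T\to\infty}\frac{1}{T}\sum_{t=0}^{T-1} E[z_{i,t}]\right) - \lim_{T\to\infty}\frac{1}{T}\sum_{t=0}^{T-1} E[e_t] \\
& + \sum_{i=1}^{N} \omega_i U\left(\lim_{T\to\infty}\frac{1}{T}\sum_{t=0}^{T-1} E[x_{i,t}]\right) - \sum_{i=1}^{N} \omega_i U\left(\lim_{T\to\infty}\frac{1}{T}\sum_{t=0}^{T-1} E[z_{i,t}]\right).
\end{align*}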

Due to (3.45), the last two terms cancel each other. Hence, by (3.50), we have $f_1(\mathbf{x}_{d,t}, \mathbf{x}_{u,t}) \ge f_1(\mathbf{x}^{opt}_{d,t}, \mathbf{x}^{opt}_{u,t}) - B/V$,
which completes the proof.

Chapter 4

Real-Time Phase Balancing with Energy Storage

In this chapter, we consider using energy storage to provide real-time phase balancing service in power grids. We

consider a substation connected to multiple phases, each with single-phase uncontrollable flow, controllable flow,

and an energy storage unit. In particular, we consider phase balancing on a time scale of seconds to minutes. As

such, we do not model power system physics such as frequency and voltage magnitude. Aiming at minimizing

the cost of all phases and mitigating phase imbalance, we propose a real-time algorithm that can be easily imple-

mented by the substation. Moreover, for the scenario of limited communication between the substation and each

phase, we provide distributed implementation of the real-time algorithm where only limited information exchange

is required.

The main contributions of this work are summarized as follows. First, we formulate a stochastic optimization

problem for phase balancing incorporating system uncertainty, storage characteristics, and power network con-

straints. Second, for ideal energy storage with lossless charging and discharging, we provide a real-time algorithm

building on the Lyapunov optimization framework and prove its analytical performance guarantee. Moreover, we

offer distributed implementation of the algorithm with fast convergence. Third, we extend the algorithm to ac-

commodate non-ideal energy storage with imperfect charging and discharging efficiency and show its analytical

performance. Finally, to numerically evaluate the performance of the proposed algorithm, we compare it with a

benchmark greedy algorithm under various settings and parameters. Simulation reveals that our proposed algo-

rithm is competitive in general. In particular, the proposed algorithm has noticeable advantage when applied to

storage with a large energy capacity, a high value of the energy-power ratio (e.g., compressed air energy storage

and batteries), and moderate-to-high charging and discharging efficiency (e.g., the round-trip efficiency of stor-

age is greater than 65%). In addition, a practical outcome of our analysis shows the following design guideline:

optimal power balancing favors even allocation of storage capacity over the phases.


Consider a discrete-time model with time t ∈ {0, 1, 2, . . .}. To simplify notation, we normalize the duration of

each time period ∆t to one and thus eliminate ∆t in presentation. The system model is depicted in Fig. 4.1. A

substation is connected with N ≥ 2 phases, each with single-phase loads and generation. We consider a general

case where it is optional for each phase to deploy energy storage. Denote the set of phases that deploy storage by


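Figure 4.1: System model with N phases. The details of the i-th phase are shown. [Diagram: a substation connected to Phase 1, Phase i, and Phase N through the flows f1,t, fi,t, and fN,t; phase i comprises an energy storage unit (charging u+i,t and discharging u−i,t, seen on the phase as (1/η+i)u+i,t and η−i u−i,t), a controllable flow li,t, and an uncontrollable flow ri,t.]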

E ⊆ {1, 2, . . . , N}. Below we first describe the components of each phase.

4.1.1 System Model of Each Phase

At the i-th phase, denote the amount of uncontrollable power at time slot t by ri,t. The uncontrollable flow can

represent renewable generation such as wind and solar, base loads, or the difference between renewable generation

and base loads. Since the uncontrollable flow is generally governed by nature or uncertain human behavior, we

assume that ri,t is random, but it is confined within an interval [ri,min, ri,max]. Throughout the paper we use a

bold letter to denote a vector that contains elements of N phases. Here, we define rt ≜ [r1,t, . . . , rN,t] to represent

the uncontrollable flow vector at time slot t. The other vectors in the rest of this paper are defined similarly.

Denote the amount of the controllable power flow at the i-th phase at time t by li,t. The controllable flow can

represent the output of conventional generators, or the consumption of flexible loads. We associate a cost function

with the controllable flow and denote the function by Ci(li,t), which can represent the cost of local generators

(e.g., an on-site diesel generator), or the cost of a utility for consuming power.

Denote the power flow between the substation and the i-th phase at time slot t by fi,t. Due to the capacity

constraints of power lines, the value of fi,t is generally confined. We assume that at each time slot the power

flow vector ft ∈ F, where the set F is non-empty, compact, and convex. For example, F may be defined as
F ≜ {ft | fi,t ∈ [fi,min, fi,max], ∀i}.

Remark: The values of ri,t, li,t, and fi,t can be positive or negative. We use the positive sign to indicate
power injection into the i-th phase, and the negative sign to indicate power extraction from the i-th phase.

Assume that the i-th phase is equipped with an energy storage unit, i.e., i ∈ E. Denote the charging and
discharging rates of the storage at time slot t by $u^+_{i,t} \in [0, u_{i,max}]$ and $u^-_{i,t} \in [0, u_{i,max}]$, respectively, where
ui,max is the maximum charging and discharging rate. Denote the energy state of the i-th storage at the beginning
of time slot t by si,t, which evolves as $s_{i,t+1} = s_{i,t} + u^+_{i,t} - u^-_{i,t}$. The energy state si,t is required to be within the
storage's capacity limits [si,min, si,max].

Due to conversion and storage losses, charging and discharging may not be perfectly efficient. For the i-th
storage, we denote the charging efficiency by $\eta^+_i \in (0, 1]$ and the discharging efficiency by $\eta^-_i \in (0, 1]$. Then, the
associated charging and discharging quantities seen on each phase are $\frac{1}{\eta^+_i}u^+_{i,t}$ and $\eta^-_i u^-_{i,t}$, respectively (see Fig.
4.1). Owing to the round-trip efficiency or other operating constraints, simultaneous charging and discharging
may be forbidden in practice, which can be reflected by the constraint $u^+_{i,t} \cdot u^-_{i,t} = 0$, i ∈ E. Moreover, if the i-th
phase is not equipped with storage, i.e., i ∉ E, we simply set the values of si,t, $u^+_{i,t}$, and $u^-_{i,t}$ to zero.

The energy storage can additionally be used for arbitrage. Denote the electricity price at time slot t by pt ∈ [pmin, pmax],
which is random over time. Then the cost of the i-th phase for energy arbitrage during time slot t is
$p_t\big(\frac{1}{\eta^+_i}u^+_{i,t} - \eta^-_i u^-_{i,t}\big)$. Finally, frequent charging and discharging can shorten the lifetime of storage [72]. To model
this effect, we introduce a degradation cost function Di(·), with negative input indicating discharging and positive
input indicating charging. Therefore, the degradation cost incurred at time slot t is given by $D_i(u^+_{i,t}) + D_i(-u^-_{i,t})$.
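The storage model above can be summarized as a one-slot update. The following is a minimal sketch, assuming a single storage unit and scalar inputs; the function name `storage_step` and the choice to raise an error on an infeasible action are illustrative assumptions. The degradation cost is left out here because the function Di(·) is not specified in the text.

```python
def storage_step(s, u_plus, u_minus, eta_plus, eta_minus, price,
                 s_min, s_max, u_max):
    """One-slot update for a single storage unit.

    Returns (next energy state, net quantity seen on the phase, arbitrage cost).
    Raises ValueError if a constraint of the model is violated.
    """
    if not (0.0 <= u_plus <= u_max and 0.0 <= u_minus <= u_max):
        raise ValueError("rates must lie in [0, u_max]")
    if u_plus * u_minus != 0.0:
        raise ValueError("simultaneous charging and discharging is not allowed")
    s_next = s + u_plus - u_minus
    if not (s_min <= s_next <= s_max):
        raise ValueError("energy state would leave [s_min, s_max]")
    # Charging draws u_plus / eta_plus from the phase; discharging returns
    # eta_minus * u_minus to it.
    phase_quantity = u_plus / eta_plus - eta_minus * u_minus
    arbitrage_cost = price * phase_quantity
    return s_next, phase_quantity, arbitrage_cost


# Hypothetical numbers: charge 1 unit at 90% efficiency, price 0.1 per unit.
s_next, quantity, cost = storage_step(5.0, 1.0, 0.0, 0.9, 0.9, 0.1,
                                      s_min=0.0, s_max=10.0, u_max=2.0)
```

With imperfect efficiency, storing one unit costs more than one unit from the phase, while discharging one unit delivers less than one unit, which is why the round-trip efficiency matters for arbitrage.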


Since phase imbalance is harmful for power system operation, it is critical to balance the power flows fi,t among

phases. To this end, we introduce a loss function F(·) to characterize the deviation of fi,t from the average
power flow. In particular, for the i-th phase, F(·) is a function of $f_{i,t} - \bar{f}_t$, where $\bar{f}_t$ is the average defined as
$\bar{f}_t \triangleq \frac{1}{N}\sum_{j=1}^{N} f_{j,t}$.

We assume that the system is operated by a representative of the substation, who aims to minimize the long-

term system cost, which includes the costs of all phases. Specifically, based on the model described in Section

4.1.1, the system cost at time slot t is given by

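\[
w_t = \sum_{i \in \mathcal{E}}\left[p_t\Big(\frac{1}{\eta^+_i}u^+_{i,t} - \eta^-_i u^-_{i,t}\Big) + D_i(u^+_{i,t}) + D_i(-u^-_{i,t})\right] + \sum_{i=1}^{N}\left[C_i(l_{i,t}) + F(f_{i,t} - \bar{f}_t)\right].
\]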
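The per-slot system cost can be evaluated term by term. The sketch below is illustrative only: it assumes that all phases share the same cost functions C, D, and F (the model allows phase-specific Ci and Di), and the quadratic forms in the example are assumed, not taken from the thesis.

```python
def system_cost(price, u_plus, u_minus, l, f, eta_plus, eta_minus,
                storage_phases, C, D, F):
    """Evaluate the system cost w_t for one time slot.

    Lists are indexed by phase; storage_phases is the set E of phases
    that deploy storage. C, D, and F are callables.
    """
    n = len(f)
    f_bar = sum(f) / n  # average power flow over the N phases
    cost = 0.0
    for i in storage_phases:
        cost += price * (u_plus[i] / eta_plus[i] - eta_minus[i] * u_minus[i])
        cost += D(u_plus[i]) + D(-u_minus[i])
    for i in range(n):
        cost += C(l[i]) + F(f[i] - f_bar)
    return cost


# Hypothetical example: three phases, storage on phase 0 only.
w = system_cost(
    price=0.1,
    u_plus=[1.0, 0.0, 0.0], u_minus=[0.0, 0.0, 0.0],
    l=[0.0, 0.0, 0.0], f=[1.0, 2.0, 3.0],
    eta_plus=[1.0, 1.0, 1.0], eta_minus=[1.0, 1.0, 1.0],
    storage_phases={0},
    C=lambda x: x * x, D=lambda x: 0.5 * x * x, F=lambda x: x * x,
)
```

In this example the imbalance term contributes 2 (the flows deviate from their mean by −1, 0, and 1), and it vanishes when all phases carry the same flow.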

Denote the random system state at time slot t by qt ≜ [rt, pt], which includes the uncontrollable power flow of
N phases and the electricity price. Denote the control action at time slot t by $\mathbf{a}_t \triangleq [\mathbf{l}_t, \mathbf{u}^+_t, \mathbf{u}^-_t, \mathbf{f}_t]$, which contains
the controllable power flow, the charging and discharging amounts, and the power flow between each phase and
the substation. We formulate the problem for phase balancing as the following stochastic optimization problem.

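\begin{align}
\textbf{P1:} \quad \min_{\{\mathbf{a}_t\}} \quad & \limsup_{T\to\infty}\frac{1}{T}\sum_{t=0}^{T-1} E[w_t] \notag\\
\text{s.t.} \quad & 0 \le u^+_{i,t},\, u^-_{i,t} \le u_{i,max}, \ \forall i \in \mathcal{E}, t, \tag{4.1}\\
& u^+_{i,t} \cdot u^-_{i,t} = 0, \ \forall i \in \mathcal{E}, t, \tag{4.2}\\
& s_{i,t+1} = s_{i,t} + u^+_{i,t} - u^-_{i,t}, \ \forall i \in \mathcal{E}, t, \tag{4.3}\\
& s_{i,min} \le s_{i,t} \le s_{i,max}, \ \forall i \in \mathcal{E}, t, \tag{4.4}\\
& u^-_{i,t} = u^+_{i,t} = 0, \ \forall i \notin \mathcal{E}, t, \tag{4.5}\\
& \mathbf{f}_t \in \mathcal{F}, \ \forall t, \tag{4.6}\\
& f_{i,t} + r_{i,t} + l_{i,t} + \eta^-_i u^-_{i,t} - \frac{1}{\eta^+_i}u^+_{i,t} = 0, \ \forall i, t. \tag{4.7}
\end{align}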

The expectation on the objective is taken over the randomness of qt and the possibly random control action that

depends on qt. Constraint (4.7) enforces power balance at each phase at each time slot.

To keep mathematical exposition simple, we assume that the cost functions Ci(·) and Di(·) are continuously
differentiable and convex. This assumption is realistic because many practical cost functions can be well approximated
by such functions [83]. Denote the derivatives of Ci(·) and Di(·) by C′i(·) and D′i(·), respectively. Since
the variables $u^+_{i,t}$, $u^-_{i,t}$, and li,t are bounded based on the constraints of P1, the cost functions and their derivatives
are bounded in the feasible set. For the cost function Ci(·), we denote its range by [Ci,min, Ci,max] and the range of
its derivative by [C′i,min, C′i,max] in the feasible set. The range of the cost function Di(·) and that of its derivative
are defined similarly. In addition, we assume that the loss function F(·) is convex and continuously differentiable.

We are interested in designing both centralized and distributed real-time algorithms for solving P1. Distributed

implementation is motivated by the limited capability of real-time communication between the substation and each

phase, and also the potential privacy concerns of each phase. This is a challenging task due to system uncertainty,


the coupling of all phases through the objective and constraints, and the energy state constraint (4.4) which couples

the charging and discharging actions over time.

4.2 Real-Time Algorithm for Ideal Energy Storage

For tractability, in this section we first consider ideal energy storage that has perfectly efficient charging and

discharging, i.e., η+i = η−i = 1. The case of non-ideal energy storage is studied in Section 4.3. We first propose

a centralized real-time algorithm that can be implemented by the substation and show its analytical performance.

Then we provide distributed implementation for the proposed algorithm, for which only limited information

exchange is needed.

4.2.1 Centralized Real-Time Algorithm and Analysis

Under perfectly efficient charging and discharging, without loss of optimality, we can combine the charging and
discharging variables $u^+_{i,t}$ and $u^-_{i,t}$ into one by introducing a new variable $u_{i,t} \triangleq u^+_{i,t} - u^-_{i,t}$, which can represent the
net charging and discharging amount. In particular, if ui,t > 0 it indicates charging, and if ui,t < 0 it indicates
discharging.

With the new variable ui,t, the non-simultaneous charging and discharging constraint (4.2) can be eliminated,
and the evolution of the energy state amounts to si,t+1 = si,t + ui,t. In addition, with ui,t, the control action
at time slot t is now $\mathbf{a}_t \triangleq [\mathbf{l}_t, \mathbf{u}_t, \mathbf{f}_t]$, and the system cost can be rewritten as
\[
w_t = \sum_{i \in \mathcal{E}}\left[p_t u_{i,t} + D_i(u_{i,t})\right] + \sum_{i=1}^{N}\left[C_i(l_{i,t}) + F(f_{i,t} - \bar{f}_t)\right].
\]

For the design of real-time implementation, we employ Lyapunov optimization [16], which has been used

widely in wireless networks for dealing with time-averaged constraints and providing simple yet efficient algo-

rithms for complex dynamic systems. However, the energy state constraint (4.4) is not a time-averaged constraint

but a hard constraint, and it couples the control action ui,t over multiple time instances. As a result, P1 is not

amenable to the standard framework of Lyapunov optimization. To overcome this difficulty, we replace the energy

state constraints (4.3) and (4.4) with a new time-averaged constraint, which only requires the net charging and

discharging amount to be zero on average, i.e.,

\[
\lim_{T\to\infty}\frac{1}{T}\sum_{t=0}^{T-1} E[u_{i,t}] = 0, \ \forall i \in \mathcal{E}. \tag{4.8}
\]

With the new constraint (4.8), we form a new stochastic optimization problem as follows:

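\begin{align}
\textbf{P2:} \quad \min_{\{\mathbf{a}_t\}} \quad & \limsup_{T\to\infty}\frac{1}{T}\sum_{t=0}^{T-1} E[w_t] \notag\\
\text{s.t.} \quad & (4.6), (4.8), \notag\\
& f_{i,t} + r_{i,t} + l_{i,t} - u_{i,t} = 0, \ \forall i, t, \tag{4.9}\\
& u_{i,t} = 0, \ \forall i \notin \mathcal{E}, t, \tag{4.10}\\
& -u_{i,max} \le u_{i,t} \le u_{i,max}, \ \forall i \in \mathcal{E}, t. \tag{4.11}
\end{align}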

It can be shown that constraints (4.3) and (4.4) imply (4.8) (i.e., any ui,t that satisfies (4.3) and (4.4) also satisfies

(4.8)), and therefore P2 is a relaxed problem of P1 (see Appendix 4.6.1).


The above relaxation step is crucial for the application of Lyapunov optimization. However, we emphasize that solving P2 is not our purpose. Instead, the significance of proposing P2 is to facilitate the development of a real-time algorithm for P1 and the associated performance analysis. Due to the relaxation, the solution to P2 may be infeasible for P1. Later we will prove in Proposition 4.1 that our proposed algorithm ensures that constraints (4.3) and (4.4) are satisfied, and therefore produces a feasible solution to P1.

We now propose a real-time algorithm leveraging Lyapunov optimization. At time slot $t$, for phase $i \in \mathcal{E}$, define a Lyapunov function $L(s_{i,t}) \triangleq \frac{1}{2}(s_{i,t} - \beta_i)^2$, which measures the deviation of the energy state $s_{i,t}$ from a perturbation parameter $\beta_i$. The parameter $\beta_i$ is introduced to ensure the boundedness of the energy state, i.e., constraint (4.4), and it needs to be carefully designed. In addition, we define a one-slot conditional Lyapunov drift as

$$\Delta(\mathbf{s}_t) \triangleq \mathbb{E}\Big[ \sum_{i\in\mathcal{E}} \frac{L(s_{i,t+1}) - L(s_{i,t})}{V_i} \,\Big|\, \mathbf{s}_t \Big],$$

which collects the weighted sum of the one-slot conditional drifts of the Lyapunov functions for all phases with storage.

In our design of the real-time algorithm, instead of directly minimizing the system cost at time slot t, we

consider a drift-plus-cost function ∆(st) + E[wt|st]. In particular, we first derive an upper bound on the drift-

plus-cost function (see Appendix 4.6.2 for the upper bound), and then formulate a per-slot optimization problem

to minimize this upper bound. Consequently, at each time slot t, we solve the following optimization problem:

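\begin{align*}
\textbf{P3:}\quad \min_{\mathbf{a}_t}\quad & w_t + \sum_{i\in\mathcal{E}} \frac{(s_{i,t} - \beta_i)\, u_{i,t}}{V_i} \\
\text{s.t.}\quad & (4.5)-(4.7),\ (4.11).
\end{align*}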

Denote an optimal solution of P3 at time slot $t$ by $\mathbf{a}^*_t \triangleq [\mathbf{l}^*_t, \mathbf{u}^*_t, \mathbf{f}^*_t]$. At each time slot, after obtaining the solution $\mathbf{a}^*_t$, we update $s_{i,t}$ using $u^*_{i,t}$. It can be easily verified that the optimization problem P3 is convex, and thus may be efficiently solved by standard convex optimization software packages such as those in MATLAB. We will later show in Theorem 4.1 that this design of the per-slot optimization problem leads to guaranteed performance.

In the following proposition, we show that, despite the relaxation to P2, by appropriately designing the perturbation parameter $\beta_i$, we can ensure that constraint (4.4) is satisfied, and therefore the control actions $\{\mathbf{a}^*_t\}$ are feasible for P1.

Proposition 4.1 For phase $i \in \mathcal{E}$, set the perturbation parameter $\beta_i$ as

$$\beta_i \triangleq s_{i,\min} + u_{i,\max} + V_i \big( p_{\max} + D'_{i,\max} + C'_{i,\max} \big) \tag{4.12}$$

where $V_i \in (0, V_{i,\max}]$ with

$$V_{i,\max} \triangleq \frac{s_{i,\max} - s_{i,\min} - 2u_{i,\max}}{p_{\max} - p_{\min} + D'_{i,\max} - D'_{i,\min} + C'_{i,\max} - C'_{i,\min}}. \tag{4.13}$$

Then the control actions $\{\mathbf{a}^*_t\}$ obtained by solving P3 at each time $t$ are feasible for P1.


To ensure the positivity of $V_{i,\max}$ in (4.13), we need the numerator $s_{i,\max} - s_{i,\min} - 2u_{i,\max} > 0$. This is generally true for real-time applications, in which each time interval is short, ranging from a few seconds to minutes.
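As a rough numerical illustration of (4.12) and (4.13), the following sketch computes $V_{i,\max}$ and $\beta_i$ for one phase and the two energy-state thresholds that appear later in Lemma 4.2. The function names are hypothetical. The storage and price ranges are taken from Table 4.1, and the bounds on $D'_i$ follow from $D_i(x) = 0.2x^2$ on $[-1, 1]$; the bounds on $C'_i$ are an assumption, since they depend on the range of the controllable flow, which is not stated in this section.

```python
def perturbation_params(s_min, s_max, u_max, p_min, p_max,
                        dD_min, dD_max, dC_min, dC_max, v_fraction=1.0):
    """Return (V_i, beta_i) according to (4.12)-(4.13).

    v_fraction in (0, 1] selects V_i = v_fraction * V_{i,max}.
    dD_* and dC_* are lower/upper bounds on the derivatives D_i' and C_i'.
    """
    numerator = s_max - s_min - 2.0 * u_max
    if numerator <= 0:
        raise ValueError("need s_max - s_min - 2*u_max > 0 so that V_max > 0")
    denominator = (p_max - p_min) + (dD_max - dD_min) + (dC_max - dC_min)
    if denominator <= 0:
        raise ValueError("price/derivative ranges must have positive total width")
    if not 0.0 < v_fraction <= 1.0:
        raise ValueError("v_fraction must lie in (0, 1]")
    v = v_fraction * numerator / denominator
    beta = s_min + u_max + v * (p_max + dD_max + dC_max)
    return v, beta


def thresholds(v, beta, p_min, p_max, dD_min, dD_max, dC_min, dC_max):
    """Energy-state thresholds of Lemma 4.2: below `low` the storage charges
    at full rate, above `high` it discharges at full rate."""
    low = beta - v * (p_max + dD_max + dC_max)
    high = beta - v * (p_min + dD_min + dC_min)
    return low, high


if __name__ == "__main__":
    # Table 4.1 values; the C' bounds of +/-6 are an assumed illustration
    # (they would correspond to C_i(x) = 1.5 x^2 with |x| <= 2).
    args = dict(p_min=7.0, p_max=12.0, dD_min=-0.4, dD_max=0.4,
                dC_min=-6.0, dC_max=6.0)
    v, beta = perturbation_params(s_min=2.0, s_max=10.0, u_max=1.0, **args)
    print(v, beta, thresholds(v, beta, **args))
```

With $V_i = V_{i,\max}$, the lower threshold equals $s_{i,\min} + u_{i,\max}$ and the upper threshold equals $s_{i,\max} - u_{i,\max}$, which is what keeps the energy state inside $[s_{i,\min}, s_{i,\max}]$.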

The overall centralized real-time algorithm is summarized in Algorithm 4.1, which can be implemented by the substation. It is worth mentioning that the proposed algorithm does not require any system statistics, which may be desirable when accurate system statistics are difficult to obtain.

Algorithm 4.1: Centralized algorithm for ideal storage.

At time slot $t$, the substation executes the following steps sequentially:

1. observe the system state $\mathbf{q}_t$ and the energy state $s_{i,t}$;

2. solve P3 and obtain a solution $\mathbf{a}^*_t \triangleq [\mathbf{l}^*_t, \mathbf{u}^*_t, \mathbf{f}^*_t]$; and

3. update $s_{i,t+1}$ by $s_{i,t} + u^*_{i,t}$.

Denote the optimal objective value of P1 by wopt. Under Algorithm 4.1, denote the objective value of P1 by

w∗ and the system cost at time slot t by w∗t . The performance of Algorithm 4.1 is shown in the following theorem.

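Theorem 4.1 Assume that the system state $\mathbf{q}_t$ is i.i.d. over time and the equipped storage at the phases is perfectly efficient. Under Algorithm 4.1 the following statements hold.

1. $w^* - w^{\mathrm{opt}} \le \sum_{i\in\mathcal{E}} \frac{u_{i,\max}^2}{2V_i}$.

2. $\frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}[w^*_t] - w^{\mathrm{opt}} \le \sum_{i\in\mathcal{E}} \Big[ \frac{u_{i,\max}^2}{2V_i} + \frac{\mathbb{E}[L(s_{i,0})]}{TV_i} \Big]$.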


Remarks:

• For Theorem 4.1.1, first, if E is empty, i.e., no phase deploys storage, then Algorithm 4.1 achieves the opti-

mal objective value. In fact, for this case, Algorithm 4.1 reduces to a greedy algorithm that only minimizes

the current system cost at each time. Second, if E is non-empty, to minimize the gap to the optimal objective

value, we should set Vi = Vi,max. Asymptotically, if the energy capacity si,max is large and thus Vi,max is

large, Algorithm 4.1 achieves the optimal objective value.

• In Theorem 4.1.2, we characterize the performance of Algorithm 4.1 under a finite time horizon. An extra gap $\sum_{i\in\mathcal{E}} \frac{\mathbb{E}[L(s_{i,0})]}{TV_i}$ is incurred due to the initialization of the energy states. However, if the time horizon $T$ is large, this gap is negligible.

• The i.i.d. assumption of the system state qt can be relaxed to accommodate qt that follows a finite state

irreducible and aperiodic Markov chain. Using a multi-slot drift technique [16], we can show similar

conclusions which are omitted here. In simulation, we will evaluate the algorithm performance when the

uncontrollable power flows are temporally correlated.

An interesting additional consequence of Theorem 4.1 is that we obtain a general rule of thumb for the alloca-

tion of energy storage capacity among the phases. In particular, in the following proposition, we demonstrate that,

under some mild assumptions, equal allocation of a given energy storage capacity can result in a lower overall

system cost.

Proposition 4.2 Assume that $s_{i,\min}$, $u_{i,\max}$, and $D'_{i,\max} - D'_{i,\min}$ are identical for all $i \in \mathcal{E}$. Assume further that $C'_{i,\max} - C'_{i,\min}$ is the same for all phases. Then, under Algorithm 4.1, if the total energy storage capacity $\sum_{i\in\mathcal{E}} s_{i,\max}$ is fixed and the control parameter $V_i = V_{i,\max}$ is as in (4.13), the upper bound of the performance gap in Theorem 4.1.1, i.e., $\sum_{i\in\mathcal{E}} \frac{u_{i,\max}^2}{2V_i}$, is minimized when the energy storage capacity is equally allocated among phases.



The above result states that energy storage is best allocated equally over the phases. Note that this result is

robust because it does not depend on any system statistics or specific values of system parameters. We will revisit

this in simulation.

4.2.2 Distributed Implementation

To accomplish the implementation of Algorithm 4.1 in a centralized way, each phase has to provide all informa-

tion that is required to solve the real-time problem P3. Specifically, for each phase, the cost functions and the

associated optimization constraints need to be communicated to the substation in advance. In addition, at each

time slot, the information of the uncontrollable power flow as well as the storage energy state has to be sent to the

substation. However, in practice, due to the limited capability of real-time communication along with potential

privacy concerns of each phase, some of the aforementioned information may be unavailable at the substation.

Therefore, the centralized implementation may be infeasible. In this subsection, we provide a distributed algo-

rithm for solving P3 in which only limited information exchange is required. For ease of notation, we suppress

the time index t in the following presentation.

The distributed algorithm is based on ADMM [69]. To facilitate algorithm development, we rewrite P3 as

follows:

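\begin{align}
\min_{\mathbf{a}}\quad & \mathbf{1}(\mathbf{f} \in \mathcal{F}) + \sum_{i=1}^{N} \big[ H_i(l_i, u_i) + F(f_i - \bar{f}) \big] \notag\\
\text{s.t.}\quad & f_i + r_i + l_i - u_i = 0, \quad \forall i \tag{4.14}
\end{align}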

where 1(·) is the indicator function that equals 0 (resp. +∞) when the enclosed event is true (resp. false), and for

each phase the function Hi(li, ui) is defined as follows:

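$$H_i(l_i, u_i) \triangleq \begin{cases} \dfrac{(s_i - \beta_i)u_i}{V_i} + p u_i + D_i(u_i) + C_i(l_i) + \mathbf{1}(-u_{i,\max} \le u_i \le u_{i,\max}), & \text{if } i \in \mathcal{E}, \\[2mm] C_i(l_i) + \mathbf{1}(u_i = 0), & \text{if } i \notin \mathcal{E}. \end{cases}$$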

We associate a Lagrange multiplier λi with equality (4.14).

By treating the variables (l,u) as one block and the variable f as the other, we express the updates at the

(k + 1)-th iteration below according to the ADMM algorithm.

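\begin{align*}
(l_i, u_i)^{k+1} &\leftarrow \arg\min_{l_i, u_i} \Big[ H_i(l_i, u_i) + \frac{\rho}{2} \Big( f_i^k + r_i + l_i - u_i + \frac{\lambda_i^k}{\rho} \Big)^2 \Big] \\
\mathbf{f}^{k+1} &\leftarrow \arg\min_{\mathbf{f} \in \mathcal{F}} \sum_{i=1}^{N} \Big[ F(f_i - \bar{f}) + \frac{\rho}{2} \Big( f_i + r_i + l_i^k - u_i^k + \frac{\lambda_i^k}{\rho} \Big)^2 \Big] \\
\lambda_i^{k+1} &\leftarrow \lambda_i^k + \rho \big( f_i^{k+1} + r_i + l_i^{k+1} - u_i^{k+1} \big)
\end{align*}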

where ρ > 0 is a pre-determined parameter.

To implement the above iteration, each phase updates the controllable power flow $l_i$, the net charging and discharging amount $u_i$, and the Lagrange multiplier $\lambda_i$, while the substation updates the power flow vector $\mathbf{f}$. For information exchange, at the $(k+1)$-th iteration, the substation sends $f_i^k$ to each phase, and each phase provides $m_i^k \triangleq r_i + l_i^k - u_i^k + \frac{\lambda_i^k}{\rho}$ to the substation. The schematic representation of the distributed implementation is given in Fig. 4.2.

Figure 4.2: Distributed implementation for solving P3. (The substation updates $\mathbf{f}^{k+1}$ and sends $f_i^k$ to Phase $i$; each of Phases $1, \ldots, N$ updates $l_i^{k+1}$, $u_i^{k+1}$, and $\lambda_i^{k+1}$ and returns $m_i^k$.)

Remark: With the proposed distributed algorithm, each phase only needs to provide the update of $m_i^k$ to the substation without revealing the cost functions or the other parameters. Therefore, the communication requirement and the information revelation of each phase are limited.

The convergence behavior of the distributed algorithm is summarized in the following theorem. The proof

follows Theorem 2 in [84] and thus is omitted.

Theorem 4.2 Assume that the functions Di(·), Ci(·), and F (·) are closed, proper, and convex. The sequence

{lk,uk, fk, λk} converges to an optimal primal-dual solution of P3 with the worst case convergence rate O(1/k).

4.3 Extension to Non-ideal Energy Storage

In this section, we discuss the algorithm design for non-ideal energy storage with inefficient charging and discharg-

ing. This is significant because common storage technologies such as batteries can have round-trip efficiency, i.e.,

η+i · η−i , ranging from 70% to 95% [66].

The mathematical framework of the algorithm design follows that of ideal storage. However, due to imperfect charging and discharging, the charging and discharging variables $u^+_{i,t}$ and $u^-_{i,t}$ cannot be combined into one as we did in Section 4.2, and therefore, the (non-convex) non-simultaneous charging and discharging constraint (4.2) cannot be eliminated. To overcome this difficulty, we first ignore constraint (4.2) and then adjust the resultant solution to satisfy the constraint.

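Specifically, we first modify the per-slot optimization problem P3 to the following:

\begin{align*}
\textbf{P3':}\quad \min_{\mathbf{a}_t}\quad & w_t + \sum_{i\in\mathcal{E}} \frac{s_{i,t} - \beta_i}{V_i} \big( u^+_{i,t} - u^-_{i,t} \big) \\
\text{s.t.}\quad & (4.1),\ (4.5)-(4.7)
\end{align*}

where we have defined the perturbation parameter

$$\beta_i \triangleq s_{i,\min} + u_{i,\max} + V_i \Big( \frac{p_{\max}}{\eta^+_i} + \frac{1}{\eta^+_i} C'_{i,\max} + D'_{i,\max} \Big). \tag{4.15}$$

The parameter $V_i$ in (4.15) lies in the interval $(0, V_{i,\max}]$, where

$$V_{i,\max} \triangleq \frac{s_{i,\max} - s_{i,\min} - 2u_{i,\max}}{\frac{p_{\max}}{\eta^+_i} - p_{\min}\eta^-_i + D'_{i,\max} - D'_{i,\min} + \frac{1}{\eta^+_i} C'_{i,\max} - \eta^-_i C'_{i,\min}}.$$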


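Algorithm 4.2: Centralized algorithm for non-ideal storage.

At time slot $t$, the substation executes the following steps sequentially:

1. observe the system state $\mathbf{q}_t$ and the energy state $s_{i,t}$;

2. solve P3' and obtain an intermediate solution $\hat{\mathbf{a}}_t \triangleq [\hat{\mathbf{l}}_t, \hat{\mathbf{u}}^+_t, \hat{\mathbf{u}}^-_t, \hat{\mathbf{f}}_t]$;

3. generate the final solution $\mathbf{a}^*_t$ where $u^{+*}_{i,t} = \max\{\hat{u}^+_{i,t} - \hat{u}^-_{i,t}, 0\}$, $u^{-*}_{i,t} = \max\{\hat{u}^-_{i,t} - \hat{u}^+_{i,t}, 0\}$, $l^*_{i,t} = \hat{l}_{i,t} + \eta^-_i \hat{u}^-_{i,t} - \frac{1}{\eta^+_i} \hat{u}^+_{i,t} - \eta^-_i u^{-*}_{i,t} + \frac{1}{\eta^+_i} u^{+*}_{i,t}$, and $\mathbf{f}^*_t = \hat{\mathbf{f}}_t$; and

4. update $s_{i,t}$ by (4.3) using $u^{+*}_{i,t}$ and $u^{-*}_{i,t}$.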

Note that the definition of βi in (4.15) is similar to that in (4.12) for ideal storage, except the inclusion of the

charging and discharging efficiencies. Moreover, if η+i = η−i = 1, (4.15) reduces to (4.12).

Then, the overall centralized algorithm is summarized in Algorithm 4.2, where we use the superscript notations $\hat{\ }$ and $*$ to indicate the intermediate solution derived from P3' and the final solution, respectively. To ensure that the final solution satisfies constraint (4.2), in Step 3, we adjust the intermediate charging and discharging solutions $\hat{u}^+_{i,t}$ and $\hat{u}^-_{i,t}$ along with the controllable power flow $\hat{l}_{i,t}$, so that simultaneous charging and discharging cannot happen and the power balance constraint (4.7) still holds.
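The adjustment in Step 3 of Algorithm 4.2 can be sketched for a single phase as follows. This is only an illustration under stated assumptions: the function name and the numerical values are hypothetical, and the quantity checked at the end, $l + \eta^- u^- - u^+/\eta^+$, is simply the combination that the formula for $l^*_{i,t}$ keeps unchanged, which is presumably how these variables enter the power balance constraint (4.7).

```python
def adjust_step3(u_plus_hat, u_minus_hat, l_hat, eta_plus, eta_minus):
    """Step 3 of Algorithm 4.2 for one phase.

    Takes the intermediate solution (possibly charging and discharging at the
    same time) and returns (u_plus, u_minus, l_star) with at most one of
    u_plus, u_minus non-zero.
    """
    # Keep only the net charging or the net discharging amount.
    u_plus = max(u_plus_hat - u_minus_hat, 0.0)
    u_minus = max(u_minus_hat - u_plus_hat, 0.0)
    # Shift the controllable flow so that the grid-side power is unchanged.
    l_star = (l_hat
              + eta_minus * u_minus_hat - u_plus_hat / eta_plus
              - eta_minus * u_minus + u_plus / eta_plus)
    return u_plus, u_minus, l_star


if __name__ == "__main__":
    # Hypothetical intermediate solution with simultaneous charge/discharge.
    eta_p, eta_m = 0.9, 0.9
    up_hat, um_hat, l_hat = 0.8, 0.3, 1.0
    up, um, l_star = adjust_step3(up_hat, um_hat, l_hat, eta_p, eta_m)
    before = l_hat + eta_m * um_hat - up_hat / eta_p
    after = l_star + eta_m * um - up / eta_p
    print(up, um, l_star, before, after)
```

The net amount $u^{+*}_{i,t} - u^{-*}_{i,t}$ equals $\hat{u}^+_{i,t} - \hat{u}^-_{i,t}$, so the energy state trajectory is the same as under the intermediate solution; only the controllable flow absorbs the difference caused by the efficiencies.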

Remarks: Under some conditions, constraint (4.2) may automatically hold by solving P3’, e.g., when the

electricity price pt is positive and the cost function of the controllable flow Ci(·) is increasing. However, if pt can

be negative or consuming controllable flow costs money, the solution of P3’ may not meet constraint (4.2) and

thus Step 3 in Algorithm 4.2 may be necessary. In addition, if simultaneous charging and discharging is allowed

in practice, we can simply eliminate Step 3 in Algorithm 4.2.

The performance of Algorithm 4.2 is summarized in the following theorem.

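Theorem 4.3 Assume that the system state $\mathbf{q}_t$ is i.i.d. over time and the equipped storage at the phases is not perfectly efficient. Under Algorithm 4.2 the following statements hold.

1. $\{\mathbf{a}^*_t\}$ is feasible for P1;

2. $w^* - w^{\mathrm{opt}} \le \sum_{i\in\mathcal{E}} \frac{u_{i,\max}^2}{2V_i} + \epsilon$; and

3. $\frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}[w^*_t] - w^{\mathrm{opt}} \le \sum_{i\in\mathcal{E}} \Big[ \frac{u_{i,\max}^2}{2V_i} + \frac{\mathbb{E}[L(s_{i,0})]}{TV_i} \Big] + \epsilon$,

where $\epsilon \triangleq \sum_{i\in\mathcal{E}} \Big[ p_{\max} u_{i,\max} \Big( \frac{1}{\eta^+_i} + \eta^-_i \Big) + 2D_{i,\max} + C_{i,\max} \Big]$.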


The results in Theorem 4.3 parallel those in Theorem 4.1 for ideal storage, with an extra gap $\epsilon$ incurred due to the adjustment of the intermediate solutions. Furthermore, since constraint (4.2) is ignored in P3', the problem is convex and therefore Algorithm 4.2 can be implemented distributively using a similar ADMM-based algorithm as that in Section 4.2.2.


In this section, we numerically evaluate the performance of the proposed algorithm. In each example, all phases are equipped with energy storage. The specific values of the system parameters and functions are shown in Table 4.1. The other default setup is as follows: the system state $[\mathbf{r}_t, p_t]$ is i.i.d. over time; at each time slot, the uncontrollable power flows are modeled as independent among phases, and they follow the Gaussian distribution $\mathcal{N}(0, 4^2)$ truncated within $[r_{i,\min}, r_{i,\max}]$; and the electricity price $p_t$ is approximated to follow the uniform distribution. For ideal storage, at each time slot, the control action $\mathbf{a}_t$ is generated by Algorithm 4.1, and for non-ideal storage, $\mathbf{a}_t$ is generated by Algorithm 4.2. Both algorithms are run for $T = 500$ time slots. The control parameter $V_i$ is set to $V_{i,\max}$.

Table 4.1: Default setup of parameters and functions

| Parameter | Setup | Parameter (function) | Setup |
|---|---|---|---|
| $[r_{i,\min}, r_{i,\max}]$ | $[-8, 8]$ | $\eta^+_i, \eta^-_i$ | $1$ |
| $[f_{i,\min}, f_{i,\max}]$ | $[-5, 5]$ | $C_i(x)$ | $1.5x^2$ |
| $[s_{i,\min}, s_{i,\max}]$ | $[2, 10]$ | $D_i(x)$ | $0.2x^2$ |
| $[p_{\min}, p_{\max}]$ | $[7, 12]$ | $F(x)$ | $10x^2$ |
| $u_{i,\max}$ | $1$ | $N$ | $3$ |
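For readers who wish to reproduce a comparable setup, the default system state could be sampled as in the sketch below. Several choices here are assumptions rather than details given in the text: the truncation is implemented by rejection sampling, the price is drawn uniformly over $[p_{\min}, p_{\max}]$, and the function names are hypothetical.

```python
import random


def sample_truncated_gaussian(rng, mean, std, low, high):
    """Draw from N(mean, std^2) conditioned on [low, high] by rejection."""
    while True:
        x = rng.gauss(mean, std)
        if low <= x <= high:
            return x


def sample_system_state(rng, n_phases=3, r_std=4.0, r_min=-8.0, r_max=8.0,
                        p_min=7.0, p_max=12.0):
    """Return (r, p): uncontrollable flows per phase and the electricity price.

    Defaults follow Table 4.1; flows are independent across phases and
    the state is drawn independently at every call (i.i.d. over time).
    """
    r = [sample_truncated_gaussian(rng, 0.0, r_std, r_min, r_max)
         for _ in range(n_phases)]
    p = rng.uniform(p_min, p_max)
    return r, p


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so that runs are repeatable
    for t in range(3):
        print(t, sample_system_state(rng))
```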

For comparison, we use a greedy algorithm as the benchmark, which does not account for the future per-

formance. In particular, at each time slot, the greedy algorithm minimizes the current system cost subject to all

constraints of P1. For ideal storage, the greedy algorithm solves the following optimization problem in each time

slot:

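\begin{align*}
\min_{\mathbf{l}_t, \mathbf{u}_t, \mathbf{f}_t}\quad & w_t \\
\text{s.t.}\quad & (4.6),\ (4.9),\ (4.10), \\
& \max\{-u_{i,\max},\, s_{i,\min} - s_{i,t}\} \le u_{i,t} \le \min\{u_{i,\max},\, s_{i,\max} - s_{i,t}\}.
\end{align*}

For non-ideal storage, at time slot $t$, an intermediate solution is first found by solving the optimization problem

$$\min_{\mathbf{l}_t, \mathbf{u}^+_t, \mathbf{u}^-_t, \mathbf{f}_t} \; w_t \quad \text{s.t.}\quad (4.1),\ (4.3)-(4.7)$$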

without the non-simultaneous charging and discharging constraint (4.2). Then, the final solution of the greedy

algorithm is determined by adjusting the intermediate solution using Step 3 in Algorithm 4.2. For all figures, we

omit drawing confidence intervals since they are small.

4.4.1 Effect of Correlations of Uncontrollable Power Flows

In this subsection, we examine the effect of both the phase and time correlation of the uncontrollable power flows

on the system cost. In Fig. 4.3, we assume that at each time slot, the uncontrollable flows of Phases 1 and

2 are correlated with the correlation coefficient ρ1, while the uncontrollable flow of Phase 3 is independent of

those of Phases 1 and 2. We see that, for both algorithms, the system cost decreases with ρ1. This is easy to

understand, since with a larger ρ1 the uncontrollable flows of Phases 1 and 2 are more positively related, which

makes phase balancing less challenging. In Fig. 4.4, we additionally assume that the uncontrollable flow of Phase

3 is correlated with that of Phase 1 with the same correlation coefficient ρ1. With the additional correlation among

phases, the performance gap between the proposed algorithm and the greedy algorithm becomes smaller.

In Fig. 4.5, we assume that the uncontrollable flows are independent among phases at each time slot, but they are temporally correlated with the time correlation coefficient $\rho_2$. We observe that, for both algorithms, the system cost increases with $\rho_2$. This is because at each phase, when the uncontrollable flow is more positively correlated, the more expensive controllable flow is used for phase balancing since the energy state of the storage is close to its range limit. Consequently, the proposed algorithm achieves a lower system cost when the uncontrollable flow is more negatively correlated.

Figure 4.3: System cost vs. phase correlation coefficient: Case 1. (Time-averaged system cost against $\rho_1$; curves: Proposed, Greedy.)

Figure 4.4: System cost vs. phase correlation coefficient: Case 2. (Time-averaged system cost against $\rho_1$; curves: Proposed, Greedy.)

4.4.2 Effect of Energy Storage Capacity

In this subsection, we consider the effect of energy capacity allocation on the system cost. In Fig. 4.6, we increase

the values of the energy capacity of all storage units from 6 to 50. Note that for the proposed algorithm the role

of si,max is played through the design of the control parameter Vi,max in (4.13), and for the greedy algorithm

the effect of si,max is reflected through the upper bound of the net charging and discharging variable ui,t in the

optimization problem. We see that, as $s_{i,\max}$ increases, the system cost of the greedy algorithm does not change, while that of the proposed algorithm drops with a decreasing slope. The former phenomenon could happen when the maximum charging and discharging rate $u_{i,\max}$ is relatively small and thus $s_{i,\max}$ has limited effect on $u_{i,t}$. The latter observation is consistent with the remark below Theorem 4.1 that the proposed algorithm is asymptotically optimal when $s_{i,\max}$ is large.

In Fig. 4.7, we fix the total energy capacity of all storage units to 30 (i.e., $s_{1,\max} + s_{2,\max} + s_{3,\max} = 30$) and vary the capacity allocation among phases. In particular, we fix $s_{2,\max}$ at 10 and change $s_{1,\max}$ from 5 to 15. Two cases are considered: Case 1, the variance of the uncontrollable flow of each phase is 16 (default setup); Case 2, the variances of the uncontrollable flow of phases 1, 2, and 3 are 9, 16, and 25, respectively. For both algorithms, Case 2 leads to a smaller system cost in general. Moreover, for the greedy algorithm, the system cost barely changes with $s_{1,\max}$. In comparison, for the proposed algorithm, the system cost achieves the lowest value when the energy capacity is approximately equally allocated. This observation is consistent with our conclusion in Proposition 4.2.

Figure 4.5: System cost vs. time correlation coefficient. (Time-averaged system cost against $\rho_2$; curves: Proposed, Greedy.)

Figure 4.6: System cost vs. energy capacity $s_{i,\max}$. (Time-averaged system cost against $s_{i,\max}$; curves: Proposed, Greedy.)

4.4.3 Effect of Charging and Discharging Circuit Parameters

In Fig. 4.8, we consider that each phase is equipped with non-ideal energy storage. The charging and discharging

efficiencies η+i and η−i of each storage are assumed to be the same. We see that for both algorithms, the system

cost decreases almost linearly with the round-trip efficiency. The decreasing trend is expected since the storage

becomes more efficient with a larger value of the round-trip efficiency. In particular, the proposed algorithm leads

to a lower system cost when the storage is reasonably efficient. From the figure, this corresponds to the case when

the round-trip efficiency is greater than 0.65, which includes the range of the round-trip efficiency for most energy

storage in practice [66]. On the other hand, when the storage is highly inefficient, the greedy algorithm is shown

to produce a better performance.

In Fig. 4.9, we vary the value of the maximum charging and discharging rate $u_{i,\max}$ of all phases from 0.1 to 3. Note that for the greedy algorithm, $u_{i,\max}$ only affects the constraints of the net charging and discharging amount, and for the proposed algorithm, $u_{i,\max}$ additionally affects the design of $V_{i,\max}$. We see that the system cost of the greedy algorithm decreases with $u_{i,\max}$, while the system cost of the proposed algorithm first decreases and then increases. For the proposed algorithm, the increasing trend of the system cost could be explained using Theorem 4.1.1, in which the gap to the optimal objective value increases with $u_{i,\max}$. Moreover, from the figure, when $u_{i,\max}$ is less than 1.5, or equivalently when the charging duration of the storage is larger than 6.6 time units, the proposed algorithm outperforms the greedy one. Since the time scale we consider is seconds to minutes, this is the case for most batteries, as the time scale of their charging duration is hours [66]. The improvement of the algorithm for large $u_{i,\max}$ is left for future work.

Figure 4.7: System cost vs. energy capacity $s_{1,\max}$. (Time-averaged system cost against $s_{1,\max}$; curves: Proposed and Greedy, each for Case 1 and Case 2.)

Figure 4.8: System cost vs. round-trip efficiency. (Time-averaged system cost against round-trip efficiency in %; curves: Proposed, Greedy.)

4.4.4 Effect of Other System Parameters

In Fig. 4.10, we exhibit the power flows between the substation and each phase as well as their average. Recall

that the purpose of phase balancing is to make fi,t of all phases as close as possible. The figure shows that the

curves of the power flows coincide most of the time. To further narrow the gap of these curves, we can increase

the coefficient of the loss function F (x) so as to impose more penalty for the flow deviation. In return, the system

cost would be higher.

Although the three-phase transmission is dominant in practice, we are interested in finding how the number of

phases affects the algorithm performance. In Fig. 4.11, we increase the number of phases N from 2 to 8. For both

algorithms, the system cost grows linearly with N , which is expected since the system cost sums up the costs of

all phases. Moreover, as N increases, the performance gain of the proposed algorithm over the greedy algorithm

increases.


Figure 4.9: System cost vs. maximum charging and discharging rate $u_{i,\max}$. (Time-averaged system cost against $u_{i,\max}$; curves: Proposed, Greedy.)

Figure 4.10: Power flows vs. time slots. (Power flow against time slot $t$; curves: $f_{1,t}$, $f_{2,t}$, $f_{3,t}$, and their average.)

4.5 Summary

We have investigated the problem of phase balancing with energy storage. We have proposed both centralized

and distributed real-time algorithms for ideal energy storage and further extended the algorithms to accommodate

non-ideal energy storage. Moreover, we have conducted extensive simulation to evaluate the algorithm perfor-

mance, showing that it can substantially outperform a greedy alternative. Our key conclusions are that correlations

between the phases make phase balancing easier, and that evenly allocating storage over the phases results in the

best performance.

For future work, we are interested in incorporating system statistics into the algorithm design to further improve performance, and also in combining energy storage with traditional methods such as feeder reconfiguration for phase balancing.

4.6 Appendices

4.6.1 Proof of Relaxation from P1 to P2

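Using the energy state update $s_{i,t+1} = s_{i,t} + u_{i,t}$, we can derive that the left hand side of constraint (4.8) equals the following:

$$\lim_{T\to\infty} \frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}[u_{i,t}] = \lim_{T\to\infty} \frac{\mathbb{E}[s_{i,T}]}{T} - \lim_{T\to\infty} \frac{\mathbb{E}[s_{i,0}]}{T}. \tag{4.16}$$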


Figure 4.11: System cost vs. number of phases. (Time-averaged system cost against the number of phases $N$; curves: Proposed, Greedy.)

In (4.16), if si,t is always bounded, i.e., constraint (4.4) holds, then the right hand side of (4.16) equals zero and

thus constraint (4.8) is satisfied. Therefore, P2 is a relaxed problem of P1.
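The step behind (4.16) is a telescoping sum. Written out, using only the update rule $s_{i,t+1} = s_{i,t} + u_{i,t}$ and the bounds in constraint (4.4), it reads as follows:

```latex
\begin{align*}
\frac{1}{T}\sum_{t=0}^{T-1} \mathbb{E}[u_{i,t}]
  &= \frac{1}{T}\sum_{t=0}^{T-1} \mathbb{E}[s_{i,t+1} - s_{i,t}]
   = \frac{\mathbb{E}[s_{i,T}] - \mathbb{E}[s_{i,0}]}{T}, \\
\left| \frac{\mathbb{E}[s_{i,T}] - \mathbb{E}[s_{i,0}]}{T} \right|
  &\le \frac{s_{i,\max} - s_{i,\min}}{T} \longrightarrow 0
  \quad \text{as } T \to \infty .
\end{align*}
```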

4.6.2 An Upper Bound of the Drift-Plus-Cost Function

In the following lemma, we show that the drift-plus-cost function is upper bounded.

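Lemma 4.1 For all possible decisions and all possible values of $s_{i,t}$, $i \in \mathcal{E}$, at each time slot $t$, the drift-plus-cost function is upper bounded as follows:

$$\Delta(\mathbf{s}_t) + \mathbb{E}[w_t | \mathbf{s}_t] \le \mathbb{E}[w_t | \mathbf{s}_t] + \sum_{i\in\mathcal{E}} \Big[ \frac{u_{i,\max}^2}{2V_i} + \frac{s_{i,t} - \beta_i}{V_i} \mathbb{E}[u_{i,t} | \mathbf{s}_t] \Big].$$

Proof: Based on the definition of $L(s_{i,t})$ and the update of $s_{i,t}$,

\begin{align*}
L(s_{i,t+1}) - L(s_{i,t}) &= \frac{1}{2} \big[ (s_{i,t+1} - \beta_i)^2 - (s_{i,t} - \beta_i)^2 \big] \\
&\le (s_{i,t} - \beta_i) u_{i,t} + \frac{1}{2} u_{i,\max}^2.
\end{align*}

Using the upper bound above for all phases $i \in \mathcal{E}$, taking the conditional expectation, and then adding the term $\mathbb{E}[w_t | \mathbf{s}_t]$ gives the desired upper bound.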


Since the per-slot problem P3 includes all constraints of P1 except the energy state constraint, the key of the

feasibility proof is to show that the energy state si,t is bounded within the interval [si,min, si,max]. To this end, we

first prove the following lemma which gives a sufficient condition for charging or discharging.

Lemma 4.2 Under Algorithm 4.1, for $i \in \mathcal{E}$,

1. if $s_{i,t} < \beta_i - V_i (p_{\max} + D'_{i,\max} + C'_{i,\max})$, then $u^*_{i,t} = u_{i,\max}$;

2. if $s_{i,t} > \beta_i - V_i (p_{\min} + D'_{i,\min} + C'_{i,\min})$, then $u^*_{i,t} = -u_{i,\max}$.


Proof: For simplicity of notation, we drop the time index $t$ in P3. Using constraint (4.7) we replace $l_j$ with $u_j - f_j - r_j$ in the objective of P3. Next we solve P3 through the partitioning method by first fixing the optimization variables $\mathbf{f}$ and $u_j$, $j \ne i$, and then minimizing over $u_i$. The optimization problem with respect to $u_i$ is as follows.

$$\min_{u_i} \; p u_i + D_i(u_i) + C_i(u_i - f_i - r_i) + \frac{(s_i - \beta_i) u_i}{V_i} \quad \text{s.t.}\quad (4.11).$$

The derivative of the objective above with respect to $u_i$ is

$$\frac{\partial(\cdot)}{\partial u_i} = p + D'_i(u_i) + C'_i(u_i - f_i - r_i) + \frac{s_i - \beta_i}{V_i}.$$

Therefore, if $s_i$ is upper bounded as shown in Lemma 4.2.1, we have $\frac{\partial(\cdot)}{\partial u_i} < 0$ and thus $u^*_{i,t} = u_{i,\max}$. Or, if $s_i$ is lower bounded as shown in Lemma 4.2.2, we have $\frac{\partial(\cdot)}{\partial u_i} > 0$ and thus $u^*_{i,t} = -u_{i,\max}$.

Using Lemma 4.2 above and the definition of βi, we can easily show the boundedness of the energy state by

mathematical induction, which is omitted here.
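The induction argument can be checked numerically with a small sketch. It assumes $V_i = V_{i,\max}$, so that by (4.12) and (4.13) the two thresholds of Lemma 4.2 are $s_{i,\min} + u_{i,\max}$ and $s_{i,\max} - u_{i,\max}$. Between the thresholds, the true action is the solution of P3; here it is replaced by an arbitrary random value in $[-u_{i,\max}, u_{i,\max}]$, which is a stand-in chosen only to stress the bound. The function name is hypothetical.

```python
import random


def simulate_energy_state(s0, s_min, s_max, u_max, steps, rng):
    """Simulate s_{t+1} = s_t + u_t under the threshold rule of Lemma 4.2.

    Below the lower threshold the storage charges at full rate, above the
    upper threshold it discharges at full rate, and in between any
    admissible action may occur (drawn at random here).
    """
    theta_low = s_min + u_max    # beta - V (p_max + D'_max + C'_max)
    theta_high = s_max - u_max   # beta - V (p_min + D'_min + C'_min), V = V_max
    s = s0
    trajectory = [s]
    for _ in range(steps):
        if s < theta_low:
            u = u_max
        elif s > theta_high:
            u = -u_max
        else:
            u = rng.uniform(-u_max, u_max)
        s = s + u
        trajectory.append(s)
    return trajectory


if __name__ == "__main__":
    traj = simulate_energy_state(s0=2.0, s_min=2.0, s_max=10.0, u_max=1.0,
                                 steps=10000, rng=random.Random(0))
    print(min(traj), max(traj))
```

Starting from any $s_{i,0} \in [s_{i,\min}, s_{i,\max}]$, each of the three cases maps the state back into the same interval, which is the inductive step.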


We prove Theorem 4.1.1 and Theorem 4.1.2 together. Denote w as the optimal objective value of P2. In the

following lemma, we show the existence of a special algorithm for P2.

Lemma 4.3 For P2, there exists a stationary and randomized solution $\mathbf{a}^s_t$ that only depends on the system state $\mathbf{q}_t$, and at the same time satisfies the following conditions:

$$\mathbb{E}[w^s_t] \le w, \ \forall t, \qquad \mathbb{E}[u^s_{i,t}] = 0, \ \forall i \in \mathcal{E},\ t,$$

where the expectations are taken over the randomness of the system state and the possible randomness of the actions.

The proof of Lemma 4.3 follows from Theorem 4.5 in [16] and is omitted for brevity. Using Lemmas 4.1 and

4.3, the drift-plus-cost function under Algorithm 4.1 can be upper bounded as follows:

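\begin{align}
& \Delta(\mathbf{s}_t) + \mathbb{E}[w^*_t | \mathbf{s}_t] \notag\\
& \le \mathbb{E}[w^s_t | \mathbf{s}_t] + \sum_{i\in\mathcal{E}} \Big[ \frac{u_{i,\max}^2}{2V_i} + \frac{s_{i,t} - \beta_i}{V_i} \mathbb{E}\big[ u^s_{i,t} | \mathbf{s}_t \big] \Big] \tag{4.17}\\
& \le w + \sum_{i\in\mathcal{E}} \frac{u_{i,\max}^2}{2V_i} \tag{4.18}\\
& \le w^{\mathrm{opt}} + \sum_{i\in\mathcal{E}} \frac{u_{i,\max}^2}{2V_i} \tag{4.19}
\end{align}

where (4.17) is derived based on Lemma 4.1 and the fact that P3 minimizes the upper bound of the drift-plus-cost function, (4.18) is derived based on Lemma 4.3 and the fact that the action $\mathbf{a}^s_t$ is independent of $\mathbf{s}_t$, and the inequality in (4.19) holds since P2 is a relaxed problem of P1.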

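Taking expectations over $\mathbf{s}_t$ on both sides of (4.19) and summing over $t \in \{0, \ldots, T-1\}$ yields

$$\sum_{i\in\mathcal{E}} \mathbb{E}\Big[ \frac{L(s_{i,T}) - L(s_{i,0})}{V_i} \Big] + \sum_{t=0}^{T-1} \mathbb{E}[w^*_t] \le \Big( w^{\mathrm{opt}} + \sum_{i\in\mathcal{E}} \frac{u_{i,\max}^2}{2V_i} \Big) T.$$

Note that $L(s_{i,T})$ is non-negative. Dividing both sides of the above inequality by $T$ and rearranging, we obtain

$$\frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}[w^*_t] - w^{\mathrm{opt}} \le \sum_{i\in\mathcal{E}} \Big[ \frac{u_{i,\max}^2}{2V_i} + \frac{\mathbb{E}[L(s_{i,0})]}{TV_i} \Big], \tag{4.20}$$

which is the conclusion in Theorem 4.1.2. Taking lim sup on both sides of (4.20) gives Theorem 4.1.1.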


Denote S as the fixed total energy capacity of storage. For simplicity of notation, we drop the index i when

the parameters are the same over all phases or storage units. Given the assumptions in Proposition 4.2, the

optimization problem can be formulated as follows.

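\begin{align*}
\min_{\{s_{i,\max}\}}\quad & \sum_{i\in\mathcal{E}} \frac{u_{\max}^2 \big( p_{\max} - p_{\min} + D'_{\max} - D'_{\min} + C'_{\max} - C'_{\min} \big)}{2 (s_{i,\max} - s_{\min} - 2u_{\max})} \\
\text{s.t.}\quad & \sum_{i\in\mathcal{E}} s_{i,\max} = S
\end{align*}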

where we have replaced Vi,max with its definition in (4.13). It can be easily checked that the above problem is

a convex optimization problem. Using the Karush-Kuhn-Tucker (KKT) conditions [70], the optimal solutions of

si,max must be equal over i.
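The conclusion can be checked numerically. Each term of the objective is a convex function of $s_{i,\max}$ of the form $K / (2(s_{i,\max} - s_{\min} - 2u_{\max}))$, so by symmetry the sum under a fixed total is smallest at equal capacities. The sketch below evaluates the bound for a few allocations; the function name is hypothetical and the value 17.8 used for $p_{\max} - p_{\min} + D'_{\max} - D'_{\min} + C'_{\max} - C'_{\min}$ is an assumed illustration, not a value from the text.

```python
def gap_bound(capacities, s_min, u_max, slope_range_sum):
    """Upper bound of Theorem 4.1.1 with V_i = V_{i,max} from (4.13).

    capacities: iterable of s_{i,max}, one entry per phase with storage.
    slope_range_sum: p_max - p_min + D'_max - D'_min + C'_max - C'_min.
    """
    total = 0.0
    for s_max in capacities:
        margin = s_max - s_min - 2.0 * u_max
        if margin <= 0:
            raise ValueError("each phase needs s_max - s_min - 2*u_max > 0")
        v_max = margin / slope_range_sum
        total += u_max ** 2 / (2.0 * v_max)
    return total


if __name__ == "__main__":
    # Total capacity 30 over three phases, s_{2,max} fixed at 10,
    # mirroring the allocation experiment described for Fig. 4.7.
    for s1 in range(5, 16):
        caps = (float(s1), 10.0, 20.0 - s1)
        print(caps, gap_bound(caps, s_min=2.0, u_max=1.0, slope_range_sum=17.8))
```

The printed bound is smallest at the allocation (10, 10, 10) and grows as the allocation becomes more uneven.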


1) To show the feasibility of $\{\mathbf{a}^*_t\}$, it suffices to show that the resultant energy state $s_{i,t}$, $i \in \mathcal{E}$, is bounded. First we give sufficient conditions of charging and discharging, which can be shown similarly to Lemma 4.2.

Lemma 4.4 For $i \in \mathcal{E}$,

1. if $s_{i,t} < \beta_i - V_i \big( \frac{p_{\max}}{\eta^+_i} + D'_{i,\max} + \frac{1}{\eta^+_i} C'_{i,\max} \big)$, then $\hat{u}^+_{i,t} = u_{i,\max}$;

2. if $s_{i,t} > \beta_i - V_i \big( p_{\min}\eta^-_i + D'_{i,\min} + \eta^-_i C'_{i,\min} \big)$, then $\hat{u}^-_{i,t} = u_{i,\max}$.

Using Lemma 4.4 and mathematical induction, we can show that $\hat{s}_{i,t} \in [s_{i,\min}, s_{i,\max}]$, $\forall i \in \mathcal{E}$. Note that the adjustment from $(\hat{u}^+_i, \hat{u}^-_i)$ to $(u^{+*}_i, u^{-*}_i)$ does not change the difference $\hat{u}^+_i - \hat{u}^-_i$. Therefore, the resultant energy state $s^*_{i,t}$ equals $\hat{s}_{i,t}$ and thus is bounded within $[s_{i,\min}, s_{i,\max}]$.

2) Similar to the ideal case, the relaxed problem of P1 can be formed as follows.

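\begin{align*}
\textbf{P2':}\quad \min_{\{\mathbf{a}_t\}}\quad & \limsup_{T\to\infty} \frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}[w_t] \\
\text{s.t.}\quad & (4.1),\ (4.5),\ (4.6),\ (4.7), \\
& \lim_{T\to\infty} \frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}[u^+_{i,t} - u^-_{i,t}] = 0, \quad \forall i \in \mathcal{E}.
\end{align*}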

Denote the optimal value of P2’ by w′. We first give the following two lemmas, which can be shown similarly to

Lemmas 4.1 and 4.3.


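Lemma 4.5 For all possible decisions and all possible values of $s_{i,t}$, $i \in \mathcal{E}$, in each time slot $t$, the drift-plus-cost function is upper bounded as follows:

$$\Delta(\mathbf{s}_t) + \mathbb{E}[w_t | \mathbf{s}_t] \le \sum_{i\in\mathcal{E}} \frac{u_{i,\max}^2}{2V_i} + \mathbb{E}[w_t | \mathbf{s}_t] + \sum_{i\in\mathcal{E}} \frac{s_{i,t} - \beta_i}{V_i} \mathbb{E}\big[ u^+_{i,t} - u^-_{i,t} | \mathbf{s}_t \big]. \tag{4.21}$$

Lemma 4.6 For P2', there exists a stationary and randomized solution $\mathbf{a}^s_t$ that only depends on the system state $\mathbf{q}_t$, and at the same time satisfies the following conditions:

\begin{align}
& \mathbb{E}[w^s_t] \le w', \quad \forall t, \tag{4.22}\\
& \mathbb{E}[u^{+s}_{i,t} - u^{-s}_{i,t}] = 0, \quad \forall i \in \mathcal{E},\ t. \tag{4.23}
\end{align}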

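Denote the objective values of P3' under $\hat{\mathbf{a}}_t$ and the adjusted solution $\mathbf{a}^*_t$ by $\hat{g}_t$ and $g^*_t$, respectively. In the following lemma, we characterize the gap between $\hat{g}_t$ and $g^*_t$.

Lemma 4.7 Under the proposed algorithm, at each time $t$ we have $g^*_t - \hat{g}_t \le \epsilon$, where $\epsilon \triangleq \sum_{i\in\mathcal{E}} \Big[ p_{\max} u_{i,\max} \Big( \frac{1}{\eta^+_i} + \eta^-_i \Big) + 2D_{i,\max} + C_{i,\max} \Big]$.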

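Proof: Using the objective of P3', we have

\begin{align*}
g^*_t - \hat{g}_t
&\le \sum_{i\in\mathcal{E}} \Big[ p_t \Big( \frac{1}{\eta^+_i} u^{+*}_{i,t} - \eta^-_i u^{-*}_{i,t} \Big) + D_i(u^{+*}_{i,t}) + D_i(-u^{-*}_{i,t}) + C_i(l^*_{i,t}) \\
&\qquad\quad - p_t \Big( \frac{1}{\eta^+_i} \hat{u}^+_{i,t} - \eta^-_i \hat{u}^-_{i,t} \Big) - D_i(\hat{u}^+_{i,t}) - D_i(-\hat{u}^-_{i,t}) - C_i(\hat{l}_{i,t}) \Big] \\
&\le \sum_{i\in\mathcal{E}} \Big[ p_t \frac{1}{\eta^+_i} u^{+*}_{i,t} + p_t \eta^-_i \hat{u}^-_{i,t} + D_i(u^{+*}_{i,t}) + D_i(-u^{-*}_{i,t}) + C_i(l^*_{i,t}) \Big] \\
&\le \epsilon.
\end{align*}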

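Using Lemmas 4.5, 4.6, and 4.7, the drift-plus-cost function can be further upper bounded as follows.

\begin{align*}
& \Delta(\mathbf{s}^*_t) + \mathbb{E}[w^*_t | \mathbf{s}^*_t] \\
& \le \mathbb{E}\big[ \hat{w}_t | \mathbf{s}^*_t \big] + \sum_{i\in\mathcal{E}} \Big[ \frac{u_{i,\max}^2}{2V_i} + \frac{s^*_{i,t} - \beta_i}{V_i} \mathbb{E}\big[ \hat{u}^+_{i,t} - \hat{u}^-_{i,t} | \mathbf{s}^*_t \big] \Big] + \epsilon \\
& \le \mathbb{E}\big[ w^s_t | \mathbf{s}^*_t \big] + \sum_{i\in\mathcal{E}} \Big[ \frac{u_{i,\max}^2}{2V_i} + \frac{s^*_{i,t} - \beta_i}{V_i} \mathbb{E}\big[ u^{+s}_{i,t} - u^{-s}_{i,t} | \mathbf{s}^*_t \big] \Big] + \epsilon \\
& \le \epsilon + \sum_{i\in\mathcal{E}} \frac{u_{i,\max}^2}{2V_i} + w' \\
& \le \epsilon + \sum_{i\in\mathcal{E}} \frac{u_{i,\max}^2}{2V_i} + w^{\mathrm{opt}}.
\end{align*}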

The remaining proof is similar to that for Theorem 4.1 and is omitted for brevity.

Chapter 5

Real-Time Energy Management with

Storage and Flexible Loads

In the previous three chapters, we have focused on the flexibility of energy storage, and studied the application of

energy storage in practical power system operations. In this chapter, we additionally employ the flexibility of

loads, and investigate joint management of the supply side, the demand side, and storage for maintaining power

balancing in a power grid. Recall that in Chapters 2 and 3, the power imbalance of a power grid at each time slot

is characterized by a power imbalance signal, with the positive sign indicating power surplus and the negative sign

indicating power deficit. To provide power balancing service, the participating storage units are expected to clear

the imbalance amount at each time. In this chapter, as we further include the supply side and the demand side into

the energy management, the requirement of power balancing now translates to power input equal to power output

at each time slot.

We consider a general power grid supplied by a conventional generator (CG) and multiple renewable generators (RGs), and each RG is co-located with

an energy storage unit. An aggregator operates the grid by coordinating supply, demand, and storage units to

maintain the power balancing. Our goal is to minimize the long-term system cost, including all RGs’ cost,

the CG’s cost, and the cost for selling and purchasing energy from external energy markets. Meanwhile, the

aggregator has to respect system operational constraints and the quality-of-service requirement of flexible loads.

Our formulated optimization problem is stochastic in nature, and is technically challenging especially for real-time

control. First, owing to the practical operational constraints, such as the finite storage capacity and the CG ramping

constraint, the control actions are coupled over time, which complicates the real-time decision making. Second,

at the aggregator, centralized control of a potentially large number of RGs may lead to large communication

overhead and heavy computation. To overcome the first difficulty, we leverage Lyapunov optimization [16] and

develop special techniques to tackle our problem. To address the second challenge, we exploit the structure of

the optimization problem and employ the alternating direction method of multipliers (ADMM) [69] to offer a

distributed algorithm.



5.1.1 System Model

As shown in Fig. 5.1, we consider a power grid supplied by one CG (e.g., nuclear, coal-fired, or gas-fired

generator) and N RGs (e.g., wind or solar generators), and each RG is co-located with one on-site energy storage

unit. The grid is connected to external energy markets and is operated by an aggregator, who is responsible for

satisfying the loads by managing energy from various sources. The information flow and the energy flow are

also depicted in Fig. 5.1. Assume that the system operates in discrete time with time slot t ∈ {0, 1, 2, · · · }. For

notational simplicity, throughout the paper we work with energy units instead of power units. The details of each

component in the power grid are described below.

Figure 5.1: Schematic representation of the considered power grid. (Diagram: the aggregator, the CG, RGs 1 to N with their on-site storage units, the external energy markets, and the base and flexible loads, connected by information flows and energy flows.)

Loads

The loads include base loads and flexible loads. The base loads represent critical energy demands such as lighting,

which must be satisfied once requested. The flexible loads here represent some controllable energy requests that

can be partly curtailed if the energy provision cost is high. At time slot t, denote the amount of the total requested

base loads by lb,t ∈ [lb,min, lb,max], and the amount of the total requested flexible loads by lf,t ∈ [lf,min, lf,max].

The amounts lb,t and lf,t are generated by users based on their own needs and are considered random. Let the

amount of the total satisfied loads be lm,t, which should satisfy

lb,t ≤ lm,t ≤ lb,t + lf,t. (5.1)

The control of flexible loads needs to meet certain quality-of-service requirement. In this work, we impose an

upper bound on the portion of unsatisfied flexible loads. Formally, we introduce a long-term constraint

$$
\limsup_{T\to\infty} \frac{1}{T}\sum_{t=0}^{T-1} E\left[\frac{l_{b,t} + l_{f,t} - l_{m,t}}{l_{f,t}}\right] \le \alpha \qquad (5.2)
$$

where α ∈ [0, 1] is a pre-designed threshold with a small value indicating a tight quality-of-service requirement.


RG and On-Site Storage

At the i-th RG, denote the amount of the renewable generation during time slot t by ai,t ∈ [0, ai,max], where

ai,max is the maximum generated energy amount. Due to the stochastic nature of the renewable sources, ai,t is

random.

We assume that each RG is co-located with one on-site energy storage unit capable of charging and discharg-

ing. Denote the charging or discharging energy amount of the i-th storage unit during time slot t by xi,t, with

xi,t > 0 (resp. xi,t < 0) indicating charging (resp. discharging). Because of the battery design and hardware

constraints, the value of xi,t is bounded as follows:

xi,min ≤ xi,t ≤ xi,max, (xi,min < 0 < xi,max) (5.3)

where |xi,min| and xi,max represent the maximum discharging and charging amounts, respectively. For the i-th

storage unit, denote its energy state at the beginning of time slot t by si,t. Due to charging and discharging

operations, the evolution of si,t is given by

si,t+1 = si,t + xi,t. (5.4)

Furthermore, the battery capacity and operational constraints require the energy state si,t be bounded as

si,min ≤ si,t ≤ si,max (5.5)

where si,min is the minimum allowed energy state, and si,max is the maximum allowed energy state and can be

interpreted as the storage capacity. It is known that fast charging or discharging can cause battery degradation,

which shortens battery lifetime [72]. To model this cost on storage, we use Di(·) to represent the degradation cost

function associated with the charging or discharging amount xi,t.

During every time slot, the RG supplies energy to the aggregator. Denote the amount of the contributed energy

by the i-th RG during time slot t by bi,t. Since the energy flows of the RG should be balanced, we have

bi,t = ai,t − xi,t, bi,t ≥ 0. (5.6)

In particular, if xi,t > 0 (charging), the contributed energy bi,t directly comes from the renewable generation; if

xi,t < 0 (discharging), bi,t comes from both the renewable generation and the storage unit.
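The storage dynamics (5.3)-(5.5) and the RG balance (5.6) can be checked mechanically for a candidate action. The following is a minimal Python sketch under stated assumptions: the function name `storage_step` and its argument names are hypothetical and chosen only for illustration.

```python
def storage_step(s, x, a, x_min, x_max, s_min, s_max):
    """Apply one charging (x > 0) or discharging (x < 0) action.

    Returns (next_energy_state, contributed_energy) and raises ValueError
    if the action violates (5.3), (5.5), or (5.6).
    """
    if not (x_min <= x <= x_max):
        raise ValueError("x violates the rate bound (5.3)")
    b = a - x  # RG energy balance (5.6)
    if b < 0:
        raise ValueError("b < 0: cannot charge more than is generated (5.6)")
    s_next = s + x  # energy-state evolution (5.4)
    if not (s_min <= s_next <= s_max):
        raise ValueError("next energy state violates (5.5)")
    return s_next, b


# Discharging 0.5 while generating 0.8: the RG contributes both amounts.
s_next, b = storage_step(s=10.0, x=-0.5, a=0.8,
                         x_min=-1.1, x_max=1.1, s_min=0.0, s_max=54.2)
```

In this example the storage unit discharges, so the contributed energy exceeds the renewable generation of the slot.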

CG

Different from the RGs, the energy output of the CG is controllable. Denote gt as the energy output of the CG

during time slot t, satisfying

0 ≤ gt ≤ gmax (5.7)

where gmax is the maximum amount of the energy output. Due to the operational limitations of the CG, the change

of the outputs in two consecutive time slots is bounded instead of arbitrarily large. This is typically reflected by a

ramping constraint on the CG outputs [90]. Assuming that the ramp-up and ramp-down constraints are identical,


we express the overall ramping constraint as

|gt − gt−1| ≤ rgmax (5.8)

where the coefficient r ∈ [0, 1] indicates the tightness of the ramping requirement. In particular, for r = 0, the CG

produces a fixed output over time, while for r = 1, the ramping requirement becomes non-effective. Furthermore,

we denote the generation cost function of the CG by C(·).
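Constraints (5.7) and (5.8) together restrict the CG output at time slot t to an interval determined by the previous output. A small Python sketch of that interval follows; the function name `cg_feasible_interval` is an assumption made for illustration.

```python
def cg_feasible_interval(g_prev, g_max, r):
    """Feasible range of g_t implied by 0 <= g_t <= g_max (5.7)
    and |g_t - g_prev| <= r * g_max (5.8)."""
    lo = max(g_prev - r * g_max, 0.0)
    hi = min(g_prev + r * g_max, g_max)
    return lo, hi


# r = 0 pins the output to its previous value; r = 1 leaves only (5.7).
fixed = cg_feasible_interval(20.0, 50.0, 0.0)
free = cg_feasible_interval(20.0, 50.0, 1.0)
```

The two calls illustrate the limiting cases described above: a fixed output for r = 0 and a non-effective ramping requirement for r = 1.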

External Energy Markets

In addition to the internal energy resources, the aggregator can resort to the external energy markets if needed.

For example, the aggregator can buy energy from the external energy markets in the case of energy deficit, or sell

energy to the markets in the case of energy surplus. At time slot t, denote the unit prices of the external energy

markets for buying and selling energy by pb,t ∈ [pb,min, pb,max] and ps,t ∈ [ps,min, ps,max], respectively. To avoid

energy arbitrage, the buying price is assumed to be strictly greater than the selling price, i.e., pb,t > ps,t. The

prices pb,t and ps,t are typically random due to unexpected market behaviors. Denote

eb,t ≥ 0, es,t ≥ 0 (5.9)

as the amounts of the energy bought from and sold to the external energy markets during time slot t, respectively.

The overall system balance requirement is

$$g_t + e_{b,t} + \sum_{i=1}^{N} b_{i,t} = e_{s,t} + l_{m,t}. \qquad (5.10)$$


The aggregator operates the power grid and aims to minimize the long-term time-averaged system cost by jointly

managing supply, demand, and storage units. With an increasing integration of renewable generation and energy

storage into power grids, the business models of electric utilities are evolving. From the study in [91], one

suggested model of future electric utilities is termed as “energy services utility.” Such utilities are expected to

provide similar services as those described in Section 5.1.1. Precisely, besides serving loads, these utilities would

actively provide a platform for demand response, manage generation assets, and coordinate energy sales with

external energy markets.

We define the control actions at time slot t by

$$u_t \triangleq [b_t, x_t, l_{m,t}, g_t, e_{b,t}, e_{s,t}]$$

where $b_t \triangleq [b_{1,t}, \cdots, b_{N,t}]$ and $x_t \triangleq [x_{1,t}, \cdots, x_{N,t}]$. The system cost at time slot t includes the costs of all RGs and the CG, and the cost for exploiting the energy markets, given by¹:

$$w_t \triangleq C(g_t) + p_{b,t} e_{b,t} - p_{s,t} e_{s,t} + \sum_{i=1}^{N} D_i(x_{i,t}).$$

¹ For the RGs and the CG, the payment for supplying energy could be settled by additional contracts offered by the aggregator, or be calculated based on the actual provided energy. For these cases, the payment is transferred inside the system, hence not affecting the system-wide cost.
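The per-slot system cost and the balance requirement (5.10) can be evaluated directly for a given set of control actions. The Python sketch below is illustrative only: the function names are hypothetical, and the example cost functions C(g) = 8g and D_i(x) = 10x² are borrowed from the simulation setup of Section 5.4.1 rather than being part of the model.

```python
def system_cost(g, e_b, e_s, x, p_b, p_s, C, D):
    """Per-slot system cost: CG cost + purchase cost - sales revenue
    + storage degradation cost summed over all units."""
    degradation = sum(D_i(x_i) for D_i, x_i in zip(D, x))
    return C(g) + p_b * e_b - p_s * e_s + degradation


def balance_residual(g, e_b, e_s, l_m, b):
    """Left side minus right side of (5.10); zero when supply meets demand."""
    return g + e_b + sum(b) - e_s - l_m


# Two storage units, one charging and one discharging by 0.5.
C = lambda g: 8.0 * g
D = [lambda x: 10.0 * x ** 2] * 2
w = system_cost(g=10.0, e_b=2.0, e_s=0.0, x=[0.5, -0.5],
                p_b=11.0, p_s=5.0, C=C, D=D)
res = balance_residual(g=10.0, e_b=2.0, e_s=0.0, l_m=13.0, b=[0.3, 0.7])
```

Here `w` collects the four cost terms, and `res` is zero because the supplied energy 10 + 2 + 0.3 + 0.7 equals the satisfied load of 13.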


Based on the system model described in Section 5.1.1, we formulate the problem of power balancing as a stochastic optimization problem below.

$$
\textbf{P1}:\quad \min_{\{u_t\}}\ \limsup_{T\to\infty} \frac{1}{T}\sum_{t=0}^{T-1} E[w_t] \quad \text{s.t.}\quad (5.1)\text{-}(5.10)
$$

where the expectations in the objective and (5.2) are taken over the randomness of the system states

$$q_t \triangleq [a_t, l_{b,t}, l_{f,t}, p_{b,t}, p_{s,t}]$$

where $a_t \triangleq [a_{1,t}, \cdots, a_{N,t}]$, and the possible randomness of the control actions.

To keep the mathematical exposition simple, we assume that the cost functions $C(\cdot)$ and $D_i(\cdot)$ are continuously differentiable and convex. This assumption is mild since many practical costs can be well approximated by such functions. Denote the derivatives of $C(\cdot)$ and $D_i(\cdot)$ by $C'(\cdot)$ and $D'_i(\cdot)$, respectively. Based on the assumption, we have the derivative $C'(g_t) \in [C'_{\min}, C'_{\max}]$, $\forall g_t \in [0, g_{\max}]$, and $D'_i(x_{i,t}) \in [D'_{i,\min}, D'_{i,\max}]$, $\forall x_{i,t} \in [x_{i,\min}, x_{i,\max}]$.

Remarks: Compared to a practical power system, the model considered in Section 5.1.1 is simplified, in

which power losses, network constraints, and some other practical operational constraints are ignored. Despite

the simplifications, we will show that the proposed formulation leads to an implementable control algorithm with

a provable performance bound on suboptimality. For future work, we will consider incorporating more practical

power system constraints into the problem formulation.

5.2 Real-Time Algorithm for Power Balancing

In this section, we propose a real-time algorithm for power balancing and analyze its performance theoretically.

5.2.1 Description of Real-Time Algorithm

To propose a real-time algorithm, we employ an analytical approach, Lyapunov optimization [16]. Lyapunov optimization can be used to transform some time-averaged constraints such as (5.2) into queue stability constraints,

and to provide efficient real-time algorithms for complex dynamic systems. Unfortunately, the time-coupled

constraints (5.5) and (5.8) are not time-averaged constraints, but are hard constraints required at each time slot.

Therefore, the Lyapunov optimization framework cannot be directly applied. To overcome this difficulty, we take

a relaxation step and propose the following relaxed problem:

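$$
\begin{aligned}
\textbf{P2}:\quad \min_{\{u_t\}}\quad & \limsup_{T\to\infty} \frac{1}{T}\sum_{t=0}^{T-1} E[w_t] \\
\text{s.t.}\quad & (5.1)\text{-}(5.3),\ (5.6),\ (5.7),\ (5.9),\ (5.10), \\
& \lim_{T\to\infty} \frac{1}{T}\sum_{t=0}^{T-1} E[x_{i,t}] = 0,\quad \forall i. \qquad (5.11)
\end{aligned}
$$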

Compared with P1, in P2 the energy state constraints (5.4) and (5.5) are replaced with a new time-averaged

constraint (5.11), and the ramping constraint (5.8) is removed. It can be shown that P2 is indeed a relaxation of

P1 (see Appendix 5.6.1).


The above relaxation step is crucial and enables us to work under the standard Lyapunov optimization framework. However, we emphasize that solving P2 is not our goal. Instead, the significance of proposing P2 is to facilitate the design of a real-time algorithm for P1 and its performance analysis. Note that due to this relaxation, a solution to P2 may be infeasible for P1. Motivated by this concern, we next provide a real-time algorithm which can guarantee that all constraints of P1 are satisfied.

To meet constraint (5.2), we introduce a virtual queue backlog Jt evolving as follows:

$$J_{t+1} = \max\{J_t - \alpha, 0\} + \frac{l_{b,t} + l_{f,t} - l_{m,t}}{l_{f,t}}. \qquad (5.12)$$

From (5.12), the virtual queue Jt accumulates the portion of unsatisfied flexible loads. It can be shown that

maintaining the stability of Jt is equivalent to satisfying constraint (5.2) [16]. We initialize Jt as J0 = 0.
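The queue recursion (5.12) is a one-line update. The Python sketch below illustrates it under the assumption that the requested flexible load is strictly positive in every slot; the function name `update_queue` is hypothetical.

```python
def update_queue(J, alpha, l_b, l_f, l_m):
    """One step of the virtual queue (5.12).

    The arrival is the unsatisfied portion of the flexible loads and the
    service per slot is alpha. Assumes l_f > 0.
    """
    unsatisfied_portion = (l_b + l_f - l_m) / l_f
    return max(J - alpha, 0.0) + unsatisfied_portion


# Starting from J_0 = 0: curtail a quarter of the flexible load,
# then all of it, then none of it.
J = 0.0
for l_m in (25.0, 10.0, 30.0):
    J = update_queue(J, alpha=0.5, l_b=10.0, l_f=20.0, l_m=l_m)
```

The backlog grows whenever the curtailed portion exceeds α and drains by at most α per slot otherwise, which is why a stable queue corresponds to a time-averaged curtailed portion of at most α.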

At time slot t, define a vector $\Theta_t \triangleq [s_{1,t}, \ldots, s_{N,t}, J_t]$, which consists of the energy states of all storage units and the virtual queue backlog $J_t$. Using $\Theta_t$, we define a Lyapunov function

$$L(\Theta_t) \triangleq \frac{1}{2} J_t^2 + \frac{1}{2}\sum_{i=1}^{N} (s_{i,t} - \beta_i)^2,$$

where $\beta_i$ is a perturbation parameter designed for ensuring the boundedness of the energy state, i.e., constraint (5.5). In addition, we define the one-slot conditional Lyapunov drift as $\Delta(\Theta_t) \triangleq E[L(\Theta_{t+1}) - L(\Theta_t) \mid \Theta_t]$. Instead of directly minimizing the system cost objective, we consider the drift-plus-cost function given by $\Delta(\Theta_t) + V E[w_t \mid \Theta_t]$. It is a weighted sum of $\Delta(\Theta_t)$ and the system cost at time slot t with V serving as the weight.

In our algorithm design, we first consider an upper bound on the drift-plus-cost function (see Appendix 5.6.2

for the upper bound), and then formulate a real-time optimization problem to minimize this upper bound at every

time slot t. As a result, at each time slot t, we have the following optimization problem:

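$$
\begin{aligned}
\textbf{P3}:\quad \min_{u_t}\quad & \sum_{i=1}^{N} \big[ V D_i(x_{i,t}) + (s_{i,t} - \beta_i) x_{i,t} \big] + V C(g_t) + V p_{b,t} e_{b,t} - V p_{s,t} e_{s,t} - \frac{J_t}{l_{f,t}}\, l_{m,t} \\
\text{s.t.}\quad & (5.1),\ (5.3),\ (5.6)\text{-}(5.10).
\end{aligned}
$$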

We will show in Section 5.2.2 that the design of the real-time problem P3 can lead to some analytical performance

guarantee. Moreover, to ensure the feasibility of gt, we take a natural step and move the ramping constraint (5.8)

back into P3.

Since $D_i(\cdot)$ and $C(\cdot)$ are convex, P3 is a convex optimization problem and can be efficiently solved by standard convex optimization software packages such as those in MATLAB. Denote an optimal solution of P3 at time slot t by $u^*_t \triangleq [b^*_t, x^*_t, l^*_{m,t}, g^*_t, e^*_{b,t}, e^*_{s,t}]$. At each time slot, after obtaining $u^*_t$, we update $s_{i,t}$, $\forall i$, and $J_t$ based on their evolution equations.

In the following proposition we prove that, despite the relaxation to P2, by appropriately designing the perturbation parameters $\beta_i$ we can ensure the boundedness of the energy states, and hence the feasibility of the control actions $\{u^*_t\}$ to P1.

Proposition 5.1 For the i-th storage unit, set the perturbation parameter $\beta_i$ as

$$\beta_i \triangleq V(p_{b,\max} + D'_{i,\max}) - x_{i,\min} + s_{i,\min} \qquad (5.13)$$

where $V \in (0, V_{\max}]$ with

$$V_{\max} \triangleq \min_{1\le i\le N} \left\{ \frac{s_{i,\max} - s_{i,\min} + x_{i,\min} - x_{i,\max}}{p_{b,\max} - p_{s,\min} + D'_{i,\max} - D'_{i,\min}} \right\}. \qquad (5.14)$$

Then the control actions $\{u^*_t\}$ derived by solving P3 at each time t are feasible to P1.

Algorithm 5.1: Centralized real-time algorithm for power balancing.

Initialize $J_0 = 0$. At time slot t, the aggregator executes the following steps sequentially.

1. Observe the system states $q_t$, energy states $s_{i,t}$, $\forall i$, and queue backlog $J_t$.

2. Solve P3 and obtain an optimal solution $u^*_t$.

3. Use $u^*_t$ to update $s_{i,t}$, $\forall i$, and $J_t$ based on (5.4) and (5.12), respectively.


Remarks: For Vmax in (5.14) to be positive, the range of the energy state should be larger than the sum of the

maximum charging and discharging amounts. This is generally true if the length of each time interval is not too

long, for example, up to several minutes.
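To make (5.13) and (5.14) concrete, the following Python sketch evaluates them for a single storage unit. The function names and the dictionary keys are hypothetical; the numbers are taken from the default simulation setup of Section 5.4.1, with D_i(x) = 10x² so that D'_i(x) = 20x ranges over [−22, 22] on [−1.1, 1.1], and with the storage capacity set to 54.2 as an assumed example value.

```python
def v_max(units, p_b_max, p_s_min):
    """Upper limit on V from (5.14); the denominator is assumed positive."""
    return min(
        (u["s_max"] - u["s_min"] + u["x_min"] - u["x_max"])
        / (p_b_max - p_s_min + u["dD_max"] - u["dD_min"])
        for u in units
    )


def beta(V, u, p_b_max):
    """Perturbation parameter from (5.13) for one storage unit."""
    return V * (p_b_max + u["dD_max"]) - u["x_min"] + u["s_min"]


unit = {"s_min": 0.0, "s_max": 54.2, "x_min": -1.1, "x_max": 1.1,
        "dD_min": -22.0, "dD_max": 22.0}
V_limit = v_max([unit], p_b_max=12.0, p_s_min=4.0)
beta_1 = beta(1.0, unit, p_b_max=12.0)
```

With these values the numerator of (5.14) is 54.2 − 2.2 = 52 and the denominator is 8 + 44 = 52, so the largest admissible V is 1, and the corresponding perturbation parameter is 34 + 1.1 = 35.1.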

We summarize the proposed real-time algorithm in Algorithm 5.1. We can see that Algorithm 5.1 is simple

and does not require any statistics of the system states. The latter feature is especially desirable in practice, where

accurate statistics of the system states are difficult to obtain but instantaneous observations are readily available.
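The three steps of Algorithm 5.1 can be sketched as a simple loop. This is a skeleton under stated assumptions, not the thesis implementation: `observe` and `solve_p3` are hypothetical callbacks (the convex solver for P3 is not implemented here), and the dictionary keys `l_b`, `l_f`, `x`, and `l_m` are illustrative names.

```python
def run_algorithm(observe, solve_p3, s0, alpha, T):
    """Run T slots of the observe / solve / update loop.

    observe(t) returns the system state as a dict with keys "l_b", "l_f".
    solve_p3(q, s, J) returns the control action as a dict with keys
    "x" (list of storage actions) and "l_m" (satisfied load).
    """
    s = list(s0)
    J = 0.0  # initialize J_0 = 0
    history = []
    for t in range(T):
        q = observe(t)                      # step 1: observe q_t, s_t, J_t
        u = solve_p3(q, s, J)               # step 2: solve the per-slot problem
        s = [s_i + x_i for s_i, x_i in zip(s, u["x"])]            # (5.4)
        J = max(J - alpha, 0.0) + (q["l_b"] + q["l_f"] - u["l_m"]) / q["l_f"]  # (5.12)
        history.append((list(s), J))        # step 3: updated states
    return history


# Stand-in policy for illustration only: idle storage, serve every load.
obs = lambda t: {"l_b": 10.0, "l_f": 20.0}
serve_all = lambda q, s, J: {"x": [0.0, 0.0], "l_m": q["l_b"] + q["l_f"]}
trace = run_algorithm(obs, serve_all, s0=[5.0, 5.0], alpha=0.5, T=3)
```

The loop only consumes instantaneous observations, which reflects the point made above that no statistics of the system states are needed.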

5.2.2 Performance Analysis

We now analyze the solution provided by Algorithm 5.1 with respect to P1. Under Algorithm 5.1, to emphasize

the dependency of the cost objective value on the ramping coefficient r and the control parameter V , we denote

the achieved cost objective value by w∗(r, V ). Denote the minimum cost objective value of P1 by wopt(r), which

only depends on r. The main results are summarized in the following theorem.

Theorem 5.1 Assume that the random system states $q_t$ of the grid are i.i.d. over time. Then under Algorithm 5.1 we have

1. $w^*(r, V) - w^{\mathrm{opt}}(r) \le (1 - r)\, g_{\max} \max\{p_{b,\max}, C'_{\max}\} + B/V$, where B is a constant defined by $B \triangleq \frac{1}{2}(1 + \alpha^2) + \frac{1}{2}\sum_{i=1}^{N} \max\{x_{i,\min}^2, x_{i,\max}^2\}$; and

2. $w^{\mathrm{opt}}(r) \ge w^*(1, V) - B/V$.


Remarks:

• Theorem 5.1.1 characterizes an upper bound on the performance gap away from wopt(r). The upper bound

has two terms reflecting the ramping constraint and storage capacity limitation. It indicates that Algorithm

5.1 provides an asymptotically optimal solution as the ramping constraint becomes loose (i.e., r → 1) and

the control parameter V increases (or the storage capacity si,max increases based on the Vmax expression

in (5.14)). This is consistent with our intuition. Using this insight, in order to minimize the gap to the

minimum system cost, we should set V = Vmax in Algorithm 5.1.

• Theorem 5.1.2 provides a lower bound on wopt(r) in terms of the cost under Algorithm 5.1, in which the

ramping constraint is loose, i.e., r = 1. Since solving P1 to obtain the minimum objective value wopt(r)

is difficult, we will use this lower bound as a benchmark for performance comparison in simulation. The

gap between the performance under Algorithm 5.1 and this lower bound serves as an upper bound on the

performance gap between Algorithm 5.1 and an optimal control algorithm.


• The i.i.d. assumption of the system states qt can be relaxed to accommodate qt evolving based on a finite

state irreducible and aperiodic Markov chain. Similar conclusions can be shown, which are omitted for

brevity.
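The bound in Theorem 5.1.1 can be evaluated numerically once the system parameters are fixed. The Python sketch below does so under the default values of Section 5.4.1 (α = 0.5, N = 30, maximum charging and discharging amounts of 1.1, g_max = 50, r = 0.1, p_{b,max} = 12, C(x) = 8x so that C'_max = 8, and V = 1); the function names are hypothetical.

```python
def constant_B(alpha, x_min, x_max):
    """Constant B of Theorem 5.1 for per-unit rate bounds x_min[i], x_max[i]."""
    storage_term = sum(max(lo ** 2, hi ** 2) for lo, hi in zip(x_min, x_max))
    return 0.5 * (1 + alpha ** 2) + 0.5 * storage_term


def gap_bound(r, V, g_max, p_b_max, dC_max, B):
    """Right-hand side of Theorem 5.1.1: ramping term plus B / V."""
    return (1 - r) * g_max * max(p_b_max, dC_max) + B / V


N = 30
B = constant_B(0.5, [-1.1] * N, [1.1] * N)
bound = gap_bound(r=0.1, V=1.0, g_max=50.0, p_b_max=12.0, dC_max=8.0, B=B)
```

For these values B is about 18.775 and the bound is about 558.8, dominated by the ramping term (1 − r) g_max max{p_{b,max}, C'_max} = 540. The ramping term vanishes as r → 1, and the second term shrinks as V grows, in line with the first remark above.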

In the above analysis, the storage capacity si,max is assumed to be fixed, so that the control parameter V

should be upper bounded by Vmax in (5.14) for ensuring the feasibility of the solution. Alternatively, if the

storage capacity can be designed, the question is what its value should be in order to achieve certain required

performance. In the following proposition, we provide an answer to this question by giving an upper bound on

the energy state si,t (hence an upper bound on the minimum required energy capacity) for an arbitrary positive V

that can be greater than Vmax.

Proposition 5.2 Under Algorithm 5.1 with an arbitrary positive V value, for the i-th storage unit, the energy state $s_{i,t}$ satisfies $s_{i,t} \in [s_{i,\min}, s_{i,\mathrm{up}}]$ where

$$s_{i,\mathrm{up}} \triangleq V(p_{b,\max} - p_{s,\min} + D'_{i,\max} - D'_{i,\min}) + x_{i,\max} - x_{i,\min} + s_{i,\min}. \qquad (5.15)$$


The expression of si,up in (5.15) is informative and reveals some insights into the dependency of the design

of the storage capacity on some system parameters. First, si,up increases linearly with the control parameter V .

Second, $s_{i,\mathrm{up}}$ is larger if the energy prices are more volatile or the marginal degradation cost increases fast. Third, the minimum $s_{i,\mathrm{up}}$ is given by $-x_{i,\min} + x_{i,\max} + s_{i,\min}$ if we have $p_{b,\max} = p_{s,\min}$ and $D'_{i,\max} = D'_{i,\min}$.
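As a worked instance of (5.15), take the default values of Section 5.4.1: $V = 1$, $p_{b,\max} = 12$ and $p_{s,\min} = 4$ cents/kWh, $x_{i,\max} = -x_{i,\min} = 1.1$ kWh, $s_{i,\min} = 0$, and $D_i(x) = 10x^2$, so that $D'_i(x) = 20x$ ranges over $[-22, 22]$ on $[x_{i,\min}, x_{i,\max}]$. Under these assumptions,

$$
s_{i,\mathrm{up}} = 1 \cdot \big(12 - 4 + 22 - (-22)\big) + 1.1 - (-1.1) + 0 = 52 + 2.2 = 54.2 .
$$

Doubling V to 2 would raise the first term to 104 and the bound to 106.2, which illustrates the linear dependence on V noted above.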

Other properties regarding flexible loads and external transactions are summarized in the following proposition.

Proposition 5.3 Under Algorithm 5.1 the following results hold.

1. The queue backlog $J_t$ is uniformly bounded from above as $J_t \le V p_{b,\max}\, l_{f,\max} + 1$.

2. The amounts of the external transactions $e^*_{b,t}$ and $e^*_{s,t}$ satisfy $e^*_{b,t}\, e^*_{s,t} = 0$.


Remarks:

• In Proposition 5.3.1, the upper bound of Jt is deterministic and does not change over time. Moreover, the

fact that Jt is upper bounded implies that the accumulated portion of unsatisfied flexible loads is upper

bounded.

• Proposition 5.3.2 implies that the aggregator never buys energy from and sells energy to the external energy markets in the same time slot.

5.2.3 Discussion on Multiple CGs

In the current system model, apart from multiple renewable generators, we incorporate one conventional generator

(CG) into the supply side. If there are multiple CGs with the same characteristics, i.e., the same maximum

output gmax, ramping coefficient r, and cost function C(·), for mathematical analysis, we can combine them

into one generator. In this case, the current mathematical framework and the performance analysis apply directly

with the combined generator. The output of each individual CG can then be obtained by dividing the output of

the combined generator equally over all individual ones. On the other hand, if these CGs have heterogeneous


characteristics and therefore cannot be combined into one, the proposed algorithm can still be used. In particular, in the original problem P1, we would have constraints (5.7) and (5.8) for each individual generator; the total output of the generators in (5.10) is $\sum_{j=1}^{M} g_{j,t}$, with M denoting the number of CGs; and the total cost of the generators is $\sum_{j=1}^{M} C_j(g_{j,t})$. The resultant

relaxed problem P2 would be similar to the current one, in which the ramping constraint (5.8) is removed for

each individual CG. For the real-time algorithm, the formulation of the per-slot optimization problem follows the

current mathematical framework. Moreover, distributed implementation of the algorithm (shown later in Section

5.3) can be developed using the same approach we propose.

5.3 Distributed Implementation of Real-Time Algorithm

At each time slot, our proposed algorithm (Algorithm 5.1) can be implemented by the aggregator centrally. However, the RGs may not be willing to relinquish direct control of storage or to offer private information to the

aggregator. In addition, the computational complexity of centralized control would grow quickly as the number

of RGs increases. In this section, we provide a distributed algorithm for solving P3, by which each RG and the

aggregator can make their own control decisions.

5.3.1 Distributed Algorithm Design

To facilitate the algorithm development, we first transform P3 into an equivalent problem. For notational simplicity we drop the time index t. We define a new optimization vector $y \triangleq [y_1, \cdots, y_{N+4}]$, which relates to the optimization variables of P3 by $y_i = x_i$ for $1 \le i \le N$, $y_{N+1} = l_m$, $y_{N+2} = -g$, $y_{N+3} = -e_b$, and $y_{N+4} = e_s$. Then, the objective of P3 can be rewritten as a sum of functions of the individual $y_i$, denoted by $F_i(y_i)$. In addition, we replace $b_i$ in the constraints of P3 by $a_i - y_i$ for $1 \le i \le N$ based on constraint (5.6). Consequently, P3 can be rewritten in a generic form P4 below.

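$$
\textbf{P4}:\quad \min_{y}\ \sum_{i=1}^{N+4} F_i(y_i) \quad \text{s.t.}\quad y_i \in \mathcal{Y}_i,\ \forall i, \quad \sum_{i=1}^{N+4} y_i = \sum_{i=1}^{N} a_i
$$

where the constraint sets $\{\mathcal{Y}_i\}$ are derived from constraints (5.1), (5.3), and (5.6)-(5.9), given by $\mathcal{Y}_i \triangleq [x_{i,\min}, \min\{a_i, x_{i,\max}\}]$, $i \in \{1, \cdots, N\}$, $\mathcal{Y}_{N+1} \triangleq [l_b, l_b + l_f]$, $\mathcal{Y}_{N+2} \triangleq [-\min\{g_{\max}, g_{t-1} + r g_{\max}\}, -\max\{g_{t-1} - r g_{\max}, 0\}]$, $\mathcal{Y}_{N+3} \triangleq (-\infty, 0]$, and $\mathcal{Y}_{N+4} \triangleq [0, +\infty)$.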
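The constraint sets of P4 are all intervals, so they can be assembled directly from the per-slot observations. The Python sketch below builds them as (lower, upper) pairs in the order y_1, ..., y_{N+4}; the function name `constraint_sets` and its argument names are hypothetical.

```python
import math


def constraint_sets(a, x_min, x_max, l_b, l_f, g_prev, g_max, r):
    """Return the N + 4 intervals of P4 as (lower, upper) tuples."""
    # Storage actions: rate bound (5.3) and b_i = a_i - x_i >= 0 from (5.6).
    Y = [(lo, min(a_i, hi)) for a_i, lo, hi in zip(a, x_min, x_max)]
    # Satisfied load (5.1).
    Y.append((l_b, l_b + l_f))
    # Negated CG output: capacity (5.7) combined with ramping (5.8).
    Y.append((-min(g_max, g_prev + r * g_max), -max(g_prev - r * g_max, 0.0)))
    # Negated purchase and sale amounts (5.9).
    Y.append((-math.inf, 0.0))
    Y.append((0.0, math.inf))
    return Y


Y = constraint_sets(a=[0.6, 1.5], x_min=[-1.1, -1.1], x_max=[1.1, 1.1],
                    l_b=10.0, l_f=20.0, g_prev=20.0, g_max=50.0, r=0.1)
```

In this example the first storage unit can charge at most 0.6, because it cannot absorb more than its RG generates in the slot, while the second is limited by its rate bound of 1.1.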

Next, we introduce an auxiliary vector z as a copy of y and further transform P4 into the following equivalent

problem.

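$$
\begin{aligned}
\textbf{P5}:\quad \min_{y,z}\quad & \sum_{i=1}^{N+4} \big[ F_i(y_i) + \mathbb{1}(y_i \in \mathcal{Y}_i) \big] + \mathbb{1}\Big( \sum_{i=1}^{N+4} z_i = \sum_{i=1}^{N} a_i \Big) \\
\text{s.t.}\quad & y - z = 0 \qquad (5.16)
\end{aligned}
$$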

where 1(·) is an indicator function that equals 0 if the enclosed event is true and infinity otherwise. Through the

above transformations, the optimization problem P5 now fits the two-block form of ADMM [69], enabling us to

develop the distributed optimization algorithm.

Figure 5.2: Information flow of distributed implementation. (Diagram: at each iteration the aggregator sends $v^k_i$ to RG i; each RG i updates $y^{k+1}_i$ and returns it; the aggregator updates $y^{k+1}_i$ for $N+1 \le i \le N+4$ and $d^{k+1}$.)

Following a general ADMM approach [69], we associate the equality constraint (5.16) in P5 with dual variables $d \triangleq [d_1, \cdots, d_{N+4}]$. Denote $y^k_i$, $z^k_i$, and $d^k_i$ as the respective variable values at the k-th iteration. Then, based on ADMM, these values are updated as follows.

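$$
y^{k+1}_i = \arg\min_{y_i} \Big\{ F_i(y_i) + \frac{\rho}{2}\Big( y_i - z^k_i + \frac{d^k_i}{\rho} \Big)^2 \,\Big|\, y_i \in \mathcal{Y}_i \Big\}, \quad \forall i, \qquad (5.17)
$$

$$
z^{k+1} = \arg\min_{z} \Big\{ \sum_{i=1}^{N+4} \Big( z_i - \frac{d^k_i}{\rho} - y^{k+1}_i \Big)^2 \,\Big|\, \sum_{i=1}^{N+4} z_i = \sum_{i=1}^{N} a_i \Big\}, \qquad (5.18)
$$

$$
d^{k+1}_i = d^k_i + \rho\,(y^{k+1}_i - z^{k+1}_i), \quad \forall i \qquad (5.19)
$$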

where ρ > 0 is a penalty parameter, which needs to be carefully adjusted for good convergence performance [69].

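After further algebraic manipulation (see Appendix 5.6.7), we can eliminate the vectors z and d and simplify the updates (5.17)-(5.19) as follows:

$$
y^{k+1}_i = \arg\min_{y_i} \Big\{ F_i(y_i) + \frac{\rho}{2}\big( y_i - v^k_i \big)^2 \,\Big|\, y_i \in \mathcal{Y}_i \Big\}, \quad \forall i, \qquad (5.20)
$$

$$
d^{k+1} = d^k + \rho \Big( \bar{y}^{k+1} - \frac{1}{N+4} \sum_{i=1}^{N} a_i \Big). \qquad (5.21)
$$

In (5.20), we have $v^k_i \triangleq y^k_i - \bar{y}^k - \frac{d^k}{\rho} + \frac{1}{N+4}\sum_{i=1}^{N} a_i$, where $\bar{y}^k \triangleq \frac{1}{N+4}\sum_{i=1}^{N+4} y^k_i$ and $d^k$ is a scalar updated as in (5.21).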
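The simplified updates (5.20) and (5.21) can be sketched in a few lines for a toy instance. The Python code below is an illustration under stated assumptions, not the thesis implementation: each F_i is taken to be a convex quadratic `quad[i] * y**2 + lin[i] * y` so that the per-variable minimization has a closed form (the unconstrained minimizer clipped to the interval), and all names are hypothetical.

```python
def clip(v, lo, hi):
    return max(lo, min(hi, v))


def exchange_admm(quad, lin, boxes, total, rho=1.0, iters=500):
    """Iterate (5.20)-(5.21) for F_i(y) = quad[i]*y**2 + lin[i]*y,
    y_i in boxes[i], and sum(y) == total. Requires quad[i] >= 0."""
    n = len(quad)
    target = total / n                   # (1/(N+4)) * sum of a_i in the text
    y = [clip(0.0, lo, hi) for lo, hi in boxes]
    d = 0.0                              # single scalar dual variable
    for _ in range(iters):
        y_bar = sum(y) / n
        # Signal v_i^k computed by the aggregator and sent to agent i.
        v = [y_i - y_bar - d / rho + target for y_i in y]
        # Local update (5.20): minimize a*y^2 + c*y + (rho/2)*(y - v)^2 on the box.
        y = [clip((rho * v_i - c) / (2 * a + rho), lo, hi)
             for v_i, a, c, (lo, hi) in zip(v, quad, lin, boxes)]
        # Dual update (5.21) with the new average.
        d += rho * (sum(y) / n - target)
    return y, d


y, d = exchange_admm(quad=[1.0, 1.0, 1.0], lin=[1.0, 2.0, 3.0],
                     boxes=[(-10.0, 10.0)] * 3, total=3.0)
```

For this toy instance the optimality conditions give y = (1.5, 1.0, 0.5) with multiplier −4, and the iterates approach these values. Only the scalar signal `v[i]` and the reply `y[i]` would need to be exchanged with agent i in a distributed deployment, mirroring the information flow described in Section 5.3.2.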

Remarks: Following the proof of Theorem 2 in [84], we can show that the above updates lead to a worst-case convergence rate O(1/k). Compared with the subgradient-based algorithm, which presents a worst-case

convergence rate O(1/√k), the proposed distributed algorithm is much faster and thus is well suited for real-time

implementation.

5.3.2 Distributed Implementation

Now we discuss the implementation of the proposed distributed algorithm in terms of both computation and

communication. In Fig. 5.2, we depict the information flow between the aggregator and the RGs for the updates

in (5.20) and (5.21) at the (k + 1)-th iteration.

Note that the minimization problems in (5.20) can be solved individually at each RG i for 1 ≤ i ≤ N , and

at the aggregator for N + 1 ≤ i ≤ N + 4, while the update in (5.21) can be computed by the aggregator. At the

initial iteration k = 0, each RG i needs to send its renewable generation amount ai to the aggregator. At each

iteration, the aggregator sends a signal vki to each RG i. Then RG i obtains the update yk+1i and sends it back to

the aggregator. We see that the RGs do not have to release any other private information to the aggregator, and

the required information exchange is limited to one variable in each direction per RG.

Note that the minimization problems in (5.20) are all strictly convex and admit a unique (and sometimes


closed-form) solution. Furthermore, effectively, only one dual variable is required to be updated in (5.21). This

is because the transformation from P3 to P4 by introducing the new optimization vector y permits all dual variables to share the same updating structure, hence reducing the number of the effective dual updates as well as

simplifying the calculation.


In this section, we evaluate the proposed real-time algorithm and compare it with alternatives using an idealized

but representative power grid setup.

5.4.1 Simulation Setup

Unless otherwise specified, the following parameters are set as default. The length of each time slot is 10 min.

The amounts of the base loads lb,t and the flexible loads lf,t are uniformly distributed between 5 and 25 kWh,

and the portion of unsatisfied flexible loads α is 0.5. The aggregator is connected with N = 30 RGs. For each

on-site storage unit, we set the maximum discharging and charging amounts to be 1.1 kWh by assuming the discharging and charging rates to be 6.6 kW (three-phase, level II) [92]. Since the model of the degradation cost function of storage is usually proprietary and unavailable, in simulation, we set $D_i(x) = 10x^2$ as an example.

The renewable generation ai,t is uniformly distributed between 0 and 1.1 kWh. For the CG, we set the generation

cost function to be C(x) = 8x, the maximum output gmax = 50 kWh, and the ramping coefficient r = 0.1. The

unit buying energy price pb,t is uniformly distributed between 10 and 12 cents/kWh, which is around the current

mid-peak energy price in Ontario [76]. The unit selling energy price ps,t is uniformly distributed between 4 and

6 cents/kWh, which is slightly below the current off-peak energy price in Ontario [76]. The control parameter V

is set to 1, si,min = 0, and si,max is given by (5.15).

5.4.2 Benchmark Algorithms

As compared with previous works (e.g., [6,20,21,55–58,62,85–89]), this paper is built on a more general system

model in which all issues listed in Table 1.1 are incorporated into the problem formulation. Therefore, mathematically, the problem we study is new and different from all previous ones. As a result, the proposed algorithm

cannot be directly compared with the algorithms presented in [6, 20, 21, 55–58, 62, 85–89]. To overcome this

difficulty, we employ two alternative algorithms as well as the lower bound on the minimum system cost derived

in Theorem 5.1.2 for comparison.

The first alternative is a greedy algorithm, which only minimizes the current system cost. The optimization

problem of the greedy algorithm at time slot t is formulated as follows.

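$$
\begin{aligned}
\min_{u_t}\quad & w_t \\
\text{s.t.}\quad & (5.3),\ (5.6)\text{-}(5.10), \\
& l_{b,t} + (1 - \alpha)\, l_{f,t} \le l_{m,t} \le l_{b,t} + l_{f,t}, \\
& -s_{i,t} \le x_{i,t} \le s_{i,\max} - s_{i,t}.
\end{aligned}
$$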

The second alternative is suggested mainly to show the effect of the ramping constraint. In particular, at each

time slot t, we solve an optimization problem that is the same as P3 except without the ramping constraint (5.8).


Therefore, the resultant CG output may be infeasible to P1. To maintain feasibility, whenever the CG output

violates the ramping constraint, the aggregator only uses the external energy markets to augment the CG output.

We call it “naive algorithm” below. For all figures, we omit drawing confidence intervals since they are small.

5.4.3 Comparison under Parameters V and α

In Fig. 5.3, we depict the time-averaged system cost under various values of the control parameter V . For the

proposed algorithm, the system cost drops quickly and then remains stable as it drops close to the lower bound.

This observation demonstrates the efficiency of the algorithm and implies that using small storage may be enough

to achieve near-optimal performance. In contrast, the performance of the greedy algorithm barely changes with

V . In particular, the system cost under the greedy algorithm is about 1.7 times that under the proposed algorithm

when V ≥ 0.1.

Figure 5.3: System cost vs. control parameter V. (Axes: V (control parameter) vs. time-averaged system cost; curves: Proposed, Greedy, Lower bound.)

Figure 5.4: System cost vs. portion of unsatisfied flexible loads α. (Axes: α (portion of unsatisfied flexible loads) vs. time-averaged system cost; curves: Proposed, Greedy, Lower bound.)

In Fig. 5.4, we illustrate the effect of α, the portion of unsatisfied flexible loads. As expected, the system

cost goes down as α rises, since less load is to be satisfied. For the proposed algorithm, the marginal system cost

decreases with α, which indicates that the benefit of curtailing loads keeps on falling. We also notice that the

greedy algorithm is comparable with the proposed algorithm for α = 1. But for general cases of α, the proposed

algorithm is observed to have a noticeable advantage. In addition, the proposed algorithm is close to the minimum

system cost for all cases.


Figure 5.5: System cost vs. ramping coefficient r (small loads). (Axes: r (ramping coefficient) vs. time-averaged system cost; curves: Proposed, Greedy, Naive, Lower bound.)

Figure 5.6: System cost vs. ramping coefficient r (large loads). (Axes: r (ramping coefficient) vs. time-averaged system cost; curves: Proposed, Greedy, Naive, Lower bound.)

5.4.4 Effect of Ramping Constraint

In Fig. 5.5 we first consider a scenario with small loads. The system cost is shown to be non-increasing with

respect to the ramping coefficient r. This is easy to understand since a looser ramping constraint implies less usage

of the expensive external energy markets. Furthermore, for all algorithms, the system cost cannot be decreased

any further for r ≥ 0.3. This indicates that the CG supply is already sufficient at this point, and therefore a further

relaxation of the ramping constraint is unnecessary. We observe that the proposed algorithm performs at least as well as both alternatives in all cases, and that the proposed and naive algorithms coincide when r ≥ 0.3. This happens because, with sufficient supply and a relaxed ramping constraint, the need to augment the CG output in the naive algorithm is small. That is, the control actions under the naive algorithm are consistent with those under the proposed algorithm in most cases.

In Fig. 5.6, we study a more stressed power grid by increasing the loads. We assume that lb,t and lf,t are

distributed between 20 and 40 kWh. For the proposed and naive algorithms, the ramping constraint now has

a more noticeable impact. First, the system cost under these two algorithms keeps on dropping for larger r,

and second, the proposed algorithm always outperforms the naive algorithm. In addition, for small r, the naive

algorithm is unsatisfactory as its performance is close to that of the greedy algorithm. This observation shows the

importance of jointly exploiting the system resources, especially under a stressful system environment.


Figure 5.7: Performance gap vs. number of iterations for distributed algorithm. (Plot omitted; horizontal axis: k (number of iterations); vertical axis: performance gap $f_t^k - f_t^*$ on a logarithmic scale; curves: proposed, subgradient.)

5.4.5 Convergence of Distributed Implementation

In Fig. 5.7, we exhibit the convergence of the proposed distributed algorithm for a particular system realization.

The value of the penalty parameter ρ needs to be adjusted for good convergence performance and is set to 5 in our

case. For comparison, we also show the convergence of a subgradient algorithm [93]. The vertical axis denotes

the gap between the value of the objective function and the minimum value of the objective function of P5. We see that the proposed algorithm converges quickly and exhibits a linear convergence rate, while the subgradient algorithm is slow and exhibits a sublinear convergence rate. Moreover, the fast convergence of the proposed algorithm is

observed in general, and we omit the curves of the other system realizations for brevity.
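To make the distinction between linear and sublinear convergence concrete, the following sketch compares two synthetic gap sequences. The contraction factor 0.7, the initial gap 100, and the 1/√k decay are illustrative assumptions only; they are not fitted to Fig. 5.7 or to the algorithms in this chapter.

```python
import math

def gap_ratios(gaps):
    """Return the successive ratios gap[k+1] / gap[k] of a gap sequence."""
    return [gaps[k + 1] / gaps[k] for k in range(len(gaps) - 1)]

# Hypothetical gap sequences over 60 iterations (not taken from the thesis).
linear_gaps = [100.0 * 0.7 ** k for k in range(1, 61)]           # linear rate
sublinear_gaps = [100.0 / math.sqrt(k) for k in range(1, 61)]    # sublinear rate

# A linear rate keeps the ratio bounded away from 1, so the gap is a straight
# line on a logarithmic axis; a sublinear rate has ratios that approach 1.
print(gap_ratios(linear_gaps)[-1])
print(gap_ratios(sublinear_gaps)[-1])
```

On a logarithmic vertical axis such as the one used in Fig. 5.7, the first sequence appears as a straight line while the second flattens out.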

5.5 Summary

We have investigated the problem of power balancing in a renewable-integrated power grid with storage and

flexible loads. With the objective of minimizing the system cost, we have proposed a distributed real-time algo-

rithm, which enjoys a fast convergence rate and is asymptotically optimal as the storage capacity increases and

the ramping constraint of the CG becomes loose.

There are several possible directions for future work. First, in the proposed real-time algorithm, only the current observations of the system states are employed in the algorithm design. In reality, forecasts of the system states (e.g., wind generation, loads, and electricity prices) are usually available within a certain time interval. Therefore, it would be interesting to study how to incorporate these forecasts into the algorithm design and how these forecasts could improve the algorithm performance. Second, the specific implementation of curtailing the flexible loads is not considered in this chapter. How to incentivize individual customers to participate in such a power balancing service or other demand response programs is currently open and worth further investigation.

5.6 Appendices

5.6.1 Proof of Relaxation from P1 to P2

Using the energy state update in (5.4) we can derive that the left hand side of constraint (5.11) equals the following:

$$\lim_{T\to\infty}\frac{1}{T}\sum_{t=0}^{T-1}E[x_{i,t}] = \lim_{T\to\infty}\frac{E[s_{i,T}]}{T} - \lim_{T\to\infty}\frac{E[s_{i,0}]}{T}. \tag{5.22}$$


In (5.22), if si,t is always bounded, i.e., constraint (5.5) holds, then the right hand side of (5.22) equals zero and

thus constraint (5.11) is satisfied. Therefore, P2 is a relaxed problem of P1.
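The telescoping argument can be illustrated numerically. The sketch below assumes the energy state update takes the form s_{t+1} = s_t + x_t (which is what the identity (5.22) implies) and uses arbitrary capacity limits and a random clipped action as a stand-in for any policy that keeps the state bounded; none of these numbers come from the thesis.

```python
import random

def average_net_charge(T, s_min=0.0, s_max=10.0, x_max=1.0, seed=0):
    """Simulate s_{t+1} = s_t + x_t with s_t kept inside [s_min, s_max].

    Returns the time average of x_t and the telescoped value (s_T - s_0) / T.
    """
    rng = random.Random(seed)
    s0 = s = 0.5 * (s_min + s_max)
    total = 0.0
    for _ in range(T):
        x = rng.uniform(-x_max, x_max)
        # Clip the action so that the next state respects the capacity limits.
        x = max(s_min - s, min(s_max - s, x))
        s += x
        total += x
    return total / T, (s - s0) / T

for T in (10, 1000, 100000):
    avg_x, telescoped = average_net_charge(T)
    print(T, avg_x, telescoped)
```

Because the state is bounded, the magnitude of the average is at most (s_max − s_min)/T, which vanishes as T grows; this is the content of the relaxed constraint (5.11).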

5.6.2 Upper Bound on Drift-Plus-Cost Function

In the following lemma, we show that the drift-plus-cost function is upper bounded.

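Lemma 5.1 For all possible decisions and all possible values of $\Theta_t$, in each time slot $t$, the drift-plus-cost function is upper bounded as follows:

$$\Delta(\Theta_t) + V E[w_t|\Theta_t] \le B + J_t E\left[\frac{l_{b,t}+l_{f,t}-l_{m,t}}{l_{f,t}} - \alpha \,\Big|\, \Theta_t\right] + \sum_{i=1}^{N}(s_{i,t}-\beta_i)E[x_{i,t}|\Theta_t] + V E[w_t|\Theta_t] \tag{5.23}$$

where $B$ is a constant given by $B \triangleq \frac{1}{2}(1+\alpha^2) + \frac{1}{2}\sum_{i=1}^{N}\max\{x_{i,\min}^2, x_{i,\max}^2\}$.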

Proof: Based on the definition of $L(\Theta_t)$, the difference is

$$L(\Theta_{t+1}) - L(\Theta_t) = \frac{1}{2}\sum_{i=1}^{N}\left[(s_{i,t+1}-\beta_i)^2 - (s_{i,t}-\beta_i)^2\right] + \frac{1}{2}\left(J_{t+1}^2 - J_t^2\right). \tag{5.24}$$

From the iteration of $J_t$ in (5.12), $(J_{t+1}^2 - J_t^2)$ in (5.24) can be upper bounded as

$$J_{t+1}^2 - J_t^2 \le 2J_t\left(\frac{l_{b,t}+l_{f,t}-l_{m,t}}{l_{f,t}} - \alpha\right) + 1 + \alpha^2. \tag{5.25}$$

From the iteration of $s_{i,t}$ in (5.4), $[(s_{i,t+1}-\beta_i)^2 - (s_{i,t}-\beta_i)^2]$ in (5.24) can be upper bounded as

$$(s_{i,t+1}-\beta_i)^2 - (s_{i,t}-\beta_i)^2 \le 2x_{i,t}(s_{i,t}-\beta_i) + \max\{x_{i,\min}^2, x_{i,\max}^2\}. \tag{5.26}$$

Applying inequalities (5.25) and (5.26) to (5.24), taking the conditional expectation given Θt, and adding the

term V E[wt|Θt] yields the upper bound in (5.23).
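The two inequalities (5.25) and (5.26) can be spot-checked by random sampling. The sketch below is only a sanity check under stated assumptions: the virtual queue update is taken to be J_{t+1} = max{J_t − α, 0} + q_t with q_t in [0, 1] denoting the unsatisfied portion of the flexible load, the energy state update is taken to be s_{i,t+1} = s_{i,t} + x_{i,t}, and all numerical ranges are arbitrary.

```python
import random

def check_drift_bounds(trials=10000, alpha=0.3, x_min=-2.0, x_max=3.0, seed=1):
    """Return True if (5.25) and (5.26) hold on every random sample."""
    rng = random.Random(seed)
    for _ in range(trials):
        # Virtual queue step (assumed form of the update (5.12)).
        J = rng.uniform(0.0, 50.0)
        q = rng.uniform(0.0, 1.0)  # unsatisfied portion of the flexible load
        J_next = max(J - alpha, 0.0) + q
        if J_next ** 2 - J ** 2 > 2 * J * (q - alpha) + 1 + alpha ** 2 + 1e-9:
            return False
        # Energy state step (assumed form of the update (5.4)).
        s = rng.uniform(-20.0, 20.0)
        beta = rng.uniform(-20.0, 20.0)
        x = rng.uniform(x_min, x_max)
        lhs = (s + x - beta) ** 2 - (s - beta) ** 2
        rhs = 2 * x * (s - beta) + max(x_min ** 2, x_max ** 2)
        if lhs > rhs + 1e-9:
            return False
    return True

print(check_drift_bounds())
```

Summing the two bounds with the factor 1/2 from (5.24) gives exactly the constant B of Lemma 5.1.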


To prove the feasibility under Algorithm 5.1, we are left to show that the long-term constraint (5.2) and the energy

state constraint (5.5) are satisfied.

For constraint (5.2), under the Lyapunov optimization framework, it suffices to show that the virtual queue $J_t$ is mean rate stable, i.e., $\lim_{T\to\infty} E[J_T]/T = 0$ (see Section 4.4 in [16]). Since $J_t$ is deterministically bounded by Proposition 5.3.1, this identity follows immediately.

To prove that constraint (5.5) is satisfied, we first show the following lemma which gives a sufficient condition

for charging or discharging.

Lemma 5.2

1. If $s_{i,t} < -x_{i,\min} + s_{i,\min}$, then $x_{i,t}^* = \min\{a_{i,t}, x_{i,\max}\}$.

2. If $s_{i,t} > \beta_i - V(p_{s,\min} + D'_{i,\min})$, then $x_{i,t}^* = x_{i,\min}$.


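Proof: To show Lemma 5.2.1, we first transform P3 to an equivalent problem P3a) by eliminating the variables $e_{b,t}$ and $b_{i,t}, \forall i$, and the constant terms.

$$\textbf{P3a)}:\quad \min\ \sum_{i=1}^{N}\left[V D_i(x_{i,t}) + (s_{i,t}-\beta_i)x_{i,t}\right] + V C(g_t) + V p_{b,t}\Big(e_{s,t} + l_{m,t} - g_t + \sum_{i=1}^{N}x_{i,t}\Big) - V p_{s,t}e_{s,t} - \frac{J_t}{l_{f,t}}\,l_{m,t}$$

$$\text{s.t.}\quad (5.1),\ (5.7),\ (5.8),\ e_{s,t}\ge 0,$$

$$x_{i,\min} \le x_{i,t} \le \min\{a_{i,t}, x_{i,\max}\}, \tag{5.27}$$

$$x_{i,t} \ge \sum_{i=1}^{N}a_{i,t} - \sum_{j\neq i}x_{j,t} - l_{m,t} + g_t - e_{s,t}. \tag{5.28}$$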

We solve P3a) by the partitioning method. Specifically, we first fix the variables $\big((x_{j,t})_{j\neq i}, l_{m,t}, g_t, e_{s,t}\big)$ and minimize P3a) over $x_{i,t}$. Since the objective function of P3a) is separable over all variables, an optimal solution of $x_{i,t}$ can be derived by the following problem:

$$\min_{x_{i,t}}\ V D_i(x_{i,t}) + (s_{i,t}-\beta_i)x_{i,t} + V p_{b,t}x_{i,t} \quad \text{s.t.}\ (5.27),\ (5.28).$$

Under the assumption that $s_{i,t} < \beta_i - V(p_{b,\max} + D'_{i,\max}) = -x_{i,\min} + s_{i,\min}$, the derivative $V D'_i(x_{i,t}) + (s_{i,t}-\beta_i) + V p_{b,t}$ is negative, so the objective function above is strictly decreasing with respect to $x_{i,t}$. Therefore, the optimal solution of $x_{i,t}$ is $\min\{a_{i,t}, x_{i,\max}\}$.

The demonstration of Lemma 5.2.2 is similar to that of Lemma 5.2.1. We first transform P3 to an equivalent problem P3b) by eliminating the variables $e_{s,t}$ and $b_{i,t}, \forall i$, and the constant terms. To solve the problem, we first fix the variables $\big((x_{j,t})_{j\neq i}, l_{m,t}, g_t, e_{b,t}\big)$ and minimize P3b) over $x_{i,t}$. After some rearrangement, an optimal solution of $x_{i,t}$ can be derived by the following problem:

$$\min_{x_{i,t}}\ V D_i(x_{i,t}) + (s_{i,t}-\beta_i)x_{i,t} + V p_{s,t}x_{i,t}$$

$$\text{s.t.}\quad (5.27),\quad x_{i,t} \le \sum_{i=1}^{N}a_{i,t} - \sum_{j\neq i}x_{j,t} - l_{m,t} + g_t + e_{b,t}.$$

When $s_{i,t} > \beta_i - V(p_{s,\min} + D'_{i,\min})$, the derivative $V D'_i(x_{i,t}) + (s_{i,t}-\beta_i) + V p_{s,t}$ is positive, so the objective function above is strictly increasing with respect to $x_{i,t}$. Therefore, the optimal solution of $x_{i,t}$ is $x_{i,\min}$.
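The threshold structure in Lemma 5.2 can be checked on a small numerical instance. The sketch below assumes a quadratic degradation cost D_i(x) = c·x² (a hypothetical choice; the lemma only needs bounds on the derivative) and treats the feasible set of the per-unit subproblem as a plain interval [lo, hi]. All parameter values are illustrative.

```python
def storage_subproblem_argmin(s, beta, V, price, c, lo, hi, n=2001):
    """Grid-search minimizer of V*D(x) + (s - beta)*x + V*price*x on [lo, hi],
    with the assumed degradation cost D(x) = c * x**2."""
    def objective(x):
        return V * c * x * x + (s - beta) * x + V * price * x
    grid = [lo + (hi - lo) * k / (n - 1) for k in range(n)]
    return min(grid, key=objective)

# Hypothetical parameters: x_min = -2, x_max = 3, a_{i,t} = 2.5, V = 1, beta = 20.
V, beta, c, x_min, x_max, a = 1.0, 20.0, 0.1, -2.0, 3.0, 2.5
d_prime_max = 2 * c * x_max   # largest derivative of D on [x_min, x_max]
d_prime_min = 2 * c * x_min   # smallest derivative of D on [x_min, x_max]
p_b_max, p_s_min = 5.0, 1.0

# Low energy state: below beta - V*(p_b_max + D'_max), so charge as much as possible.
low_state = beta - V * (p_b_max + d_prime_max) - 0.4
x_low = storage_subproblem_argmin(low_state, beta, V, 4.0, c, x_min, min(a, x_max))

# High energy state: above beta - V*(p_s_min + D'_min), so discharge as much as possible.
high_state = beta - V * (p_s_min + d_prime_min) + 0.1
x_high = storage_subproblem_argmin(high_state, beta, V, 1.5, c, x_min, min(a, x_max))

print(x_low, x_high)
```

In both regimes the minimizer sits at an endpoint of the interval, which is what drives the induction in the boundedness proof of the energy state.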

Using Lemma 5.2, we can show that constraint (5.5) holds by mathematical induction.

Lemma 5.3 For the $i$-th storage unit, the energy state $s_{i,t}$ is bounded within the interval $[s_{i,\min}, s_{i,\max}]$.

Proof: The basis: For $t = 0$, we have $s_{i,0} \in [s_{i,\min}, s_{i,\max}]$ by the initial setup.

The inductive step: Assume that $s_{i,t} \in [s_{i,\min}, s_{i,\max}]$. Then we need to show that $s_{i,t+1} \in [s_{i,\min}, s_{i,\max}]$. In the following, we discuss three cases of $s_{i,t}$.

a) $s_{i,t} \in [s_{i,\min}, -x_{i,\min} + s_{i,\min})$. Using Lemma 5.2.1 and the iteration of $s_{i,t}$ in (5.4), we have $s_{i,t+1} = s_{i,t} + \min\{a_{i,t}, x_{i,\max}\} \ge s_{i,t} \ge s_{i,\min}$. Also, we have $s_{i,t+1} \le s_{i,t} + x_{i,\max} < s_{i,\max}$, where the last inequality is derived based on the assumption of $s_{i,t}$ and $V_{\max} > 0$.

b) $s_{i,t} \in [-x_{i,\min} + s_{i,\min}, \beta_i - V(p_{s,\min} + D'_{i,\min})]$. Based on the iteration in (5.4), we have $s_{i,t+1} \in [s_{i,t} + x_{i,\min}, s_{i,t} + x_{i,\max}]$. By the definitions of $\beta_i$ and $V_{\max}$ we can derive that $s_{i,t+1} \in [s_{i,\min}, s_{i,\max}]$.

c) $s_{i,t} \in (\beta_i - V(p_{s,\min} + D'_{i,\min}), s_{i,\max}]$. Using Lemma 5.2.2 and the iteration in (5.4), we have $s_{i,t+1} = s_{i,t} + x_{i,\min} < s_{i,t} \le s_{i,\max}$. Also, we have $s_{i,t+1} > s_{i,\min}$ according to the assumption of $s_{i,t}$ and the definition of $\beta_i$.


1) Note that P2 fits the standard Lyapunov optimization format (see Section 4.3 in [16] for details of the standard

format). The idea of showing performance of Algorithm 5.1 is to connect Algorithm 5.1 with the algorithm for

P2 that is designed under the Lyapunov optimization framework. Before showing performance of Algorithm 5.1,

we give two lemmas, which will be used later.

In the following lemma, we show the existence of a special algorithm for P2. Denote w as the optimal system

cost of P2.

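Lemma 5.4 For P2, there exists a stationary and randomized solution $u_t^s$ that only depends on the system states and satisfies

$$E[w_t^s] \le w,\ \forall t, \tag{5.29}$$

$$E[x_{i,t}^s] = 0,\ \forall i, t, \tag{5.30}$$

$$E\left[\frac{l_{b,t}+l_{f,t}-l_{m,t}^s}{l_{f,t}}\right] \le \alpha,\ \forall t \tag{5.31}$$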

where all expectations are taken over the randomness of the system state and the possible randomness of the

decisions.

Proof: The claims above can be derived from Theorem 4.5 in [16]. In particular, that theorem provides

sufficient conditions for the existence of a stationary and randomized algorithm as described above. It can be

checked that these sufficient conditions are all met in our problem. Therefore, the conclusion in Lemma 5.4

holds.

By minimizing the upper bound of the drift-plus-cost function (i.e., the right hand side of (5.23)), the real-time

sub-problem for P2 at time slot t is given by

$$\textbf{P3'}:\quad \min_{u_t}\ \sum_{i=1}^{N}\left[V D_i(x_{i,t}) + (s_{i,t}-\beta_i)x_{i,t}\right] + V C(g_t) + V p_{b,t}e_{b,t} - V p_{s,t}e_{s,t} - \frac{J_t}{l_{f,t}}\,l_{m,t}$$

$$\text{s.t.}\quad (5.1),\ (5.3),\ (5.6)\text{-}(5.7),\ (5.9),\ (5.10).$$

Note that P3' is the same as P3 except without the ramping constraint (5.8). Denote the optimal objective values of P3' and P3 as $\hat{f}_t$ and $f_t^*$, respectively, and denote an optimal solution of P3' and P3 as $\hat{u}_t$ and $u_t^*$, respectively. In the following lemma, we characterize $f_t^*$ in terms of $\hat{f}_t$.

Lemma 5.5 At each time slot, $f_t^*$ is bounded as $\hat{f}_t \le f_t^* \le \hat{f}_t + \epsilon$, where $\epsilon \triangleq V(1-r)g_{\max}\max\{p_{b,\max}, C'_{\max}\}$.


Proof: Throughout, hatted quantities refer to an optimal solution $\hat{u}_t$ of P3' with objective value $\hat{f}_t$, and starred quantities refer to an optimal solution $u_t^*$ of P3 with objective value $f_t^*$. First, since P3 has more restrictive constraints than P3', we have $f_t^* \ge \hat{f}_t$.

Next, we upper bound $f_t^* - \hat{f}_t$. Comparing the solution $g_t^*$ of P3 with the solution $\hat{g}_t$ of P3', there are three possibilities:

1. $g_t^* = \hat{g}_t$,

2. $g_t^* < \hat{g}_t$ (less output due to constraint (5.8)), and

3. $g_t^* > \hat{g}_t$ (more output due to constraint (5.8)).

For Case 1), it is easy to show that $f_t^* = \hat{f}_t$. Thus, we focus on the latter two cases.

Denote a feasible solution of P3 as $\tilde{u}_t$ and its corresponding objective value as $\tilde{f}_t$. Since characterizing the gap $f_t^* - \hat{f}_t$ directly is challenging, we instead consider the gap $\tilde{f}_t - \hat{f}_t$.

For Case 2), when $g_t^* < \hat{g}_t$, the effective constraint of $g_t$ in P3 should be $\max\{g_{t-1} - r g_{\max}, 0\} \le g_t \le g_{t-1} + r g_{\max}$. Set a feasible solution of P3 as $\tilde{u}_t = [\hat{b}_t, \hat{x}_t, \hat{l}_{m,t}, g_{t-1} + r g_{\max}, \hat{e}_{b,t} + \hat{g}_t - g_{t-1} - r g_{\max}, \hat{e}_{s,t}]$. That is, $\tilde{u}_t$ is the same as $\hat{u}_t$ except for the solutions of $g_t$ and $e_{b,t}$. Intuitively, $\tilde{u}_t$ can be interpreted as follows: due to the ramping constraint, the CG is forced to generate less energy, and the aggregator chooses to buy more from the external energy markets to balance power. The gap $\tilde{f}_t - \hat{f}_t$ is given by

$$\begin{aligned}
\tilde{f}_t - \hat{f}_t &= V\left[C(g_{t-1} + r g_{\max}) - C(\hat{g}_t) + p_{b,t}(\hat{g}_t - g_{t-1} - r g_{\max})\right] \\
&\le V p_{b,t}(\hat{g}_t - g_{t-1} - r g_{\max}) && (5.32) \\
&\le V(1-r)g_{\max}\,p_{b,\max} && (5.33)
\end{aligned}$$

where the inequality in (5.32) holds since $\hat{g}_t > g_{t-1} + r g_{\max}$ and the function $C(\cdot)$ is non-decreasing. From (5.33), the gap $f_t^* - \hat{f}_t$ is upper bounded by

$$f_t^* - \hat{f}_t \le \tilde{f}_t - \hat{f}_t \le V(1-r)g_{\max}\,p_{b,\max}. \tag{5.34}$$

The proof for Case 3) is similar to that for Case 2). In particular, when $g_t^* > \hat{g}_t$, the effective constraint of $g_t$ in P3 should be $g_{t-1} - r g_{\max} \le g_t \le \min\{g_{\max}, g_{t-1} + r g_{\max}\}$. Set a feasible solution of P3 as $\tilde{u}_t = [\hat{b}_t, \hat{x}_t, \hat{l}_{m,t}, g_{t-1} - r g_{\max}, \hat{e}_{b,t}, \hat{e}_{s,t} - \hat{g}_t + g_{t-1} - r g_{\max}]$. That is, $\tilde{u}_t$ is the same as $\hat{u}_t$ except for the solutions of $g_t$ and $e_{s,t}$. Intuitively, $\tilde{u}_t$ can be interpreted as follows: due to the ramping constraint, the CG is forced to generate more energy, and the aggregator chooses to sell more to the external energy markets to balance power. The gap $\tilde{f}_t - \hat{f}_t$ is given by

$$\begin{aligned}
\tilde{f}_t - \hat{f}_t &= V\left[C(g_{t-1} - r g_{\max}) - C(\hat{g}_t) + p_{s,t}(\hat{g}_t - g_{t-1} + r g_{\max})\right] \\
&\le V\left[C(g_{t-1} - r g_{\max}) - C(\hat{g}_t)\right] && (5.35) \\
&\le V(g_{t-1} - r g_{\max} - \hat{g}_t)\,C'_{\max} && (5.36) \\
&\le V(1-r)g_{\max}\,C'_{\max} && (5.37)
\end{aligned}$$

where the inequality in (5.35) holds since $\hat{g}_t < g_{t-1} - r g_{\max}$, and the inequality (5.36) is derived by the mean value theorem. From (5.37), we have

$$f_t^* - \hat{f}_t \le \tilde{f}_t - \hat{f}_t \le V(1-r)g_{\max}\,C'_{\max}. \tag{5.38}$$

Combining (5.34) and (5.38) yields $f_t^* \le \hat{f}_t + V(1-r)g_{\max}\max\{p_{b,\max}, C'_{\max}\}$, which completes the proof.
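The two gap expressions in the proof can be spot-checked numerically. The sketch below assumes a quadratic generation cost C(g) = a·g² + b·g with non-negative coefficients (so that C is non-decreasing and C'_max = 2·a·g_max + b), draws the unconstrained output, the previous output, the ramping coefficient, and the prices at random, and compares each gap against its bound. The cost form and all numerical values are hypothetical.

```python
import random

def ramping_gap_within_bound(trials=20000, V=2.0, g_max=10.0, p_b_max=4.0,
                             a=0.05, b=1.0, seed=2):
    """Return True if the Case 2 and Case 3 gaps never exceed their bounds."""
    rng = random.Random(seed)

    def cost(g):
        # Assumed quadratic CG cost; non-decreasing on [0, g_max].
        return a * g * g + b * g

    c_prime_max = 2 * a * g_max + b
    for _ in range(trials):
        r = rng.uniform(0.0, 1.0)
        g_prev = rng.uniform(0.0, g_max)   # CG output in the previous slot
        g_hat = rng.uniform(0.0, g_max)    # CG output without the ramping constraint
        p_b = rng.uniform(0.0, p_b_max)    # buying price
        p_s = rng.uniform(0.0, p_b)        # selling price
        if g_hat > g_prev + r * g_max:     # Case 2: output capped from above
            g_new = g_prev + r * g_max
            gap = V * (cost(g_new) - cost(g_hat) + p_b * (g_hat - g_new))
            bound = V * (1 - r) * g_max * p_b_max
        elif g_hat < g_prev - r * g_max:   # Case 3: output capped from below
            g_new = g_prev - r * g_max
            gap = V * (cost(g_new) - cost(g_hat) + p_s * (g_hat - g_new))
            bound = V * (1 - r) * g_max * c_prime_max
        else:
            continue                       # Case 1: ramping constraint inactive
        if gap > bound + 1e-9:
            return False
    return True

print(ramping_gap_within_bound())
```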

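Using Lemmas 5.1, 5.4, and 5.5, the drift-plus-cost function under Algorithm 5.1 can be upper bounded as follows:

$$\begin{aligned}
&\Delta(\Theta_t) + V E[w_t^*|\Theta_t] \\
&\le B + \epsilon + J_t E\left[\frac{l_{b,t}+l_{f,t}-\hat{l}_{m,t}}{l_{f,t}} - \alpha \,\Big|\, \Theta_t\right] + \sum_{i=1}^{N}(s_{i,t}-\beta_i)E[\hat{x}_{i,t}|\Theta_t] + V E[\hat{w}_t|\Theta_t] && (5.39) \\
&\le B + \epsilon + J_t E\left[\frac{l_{b,t}+l_{f,t}-l_{m,t}^s}{l_{f,t}} - \alpha \,\Big|\, \Theta_t\right] + \sum_{i=1}^{N}(s_{i,t}-\beta_i)E[x_{i,t}^s|\Theta_t] + V E[w_t^s|\Theta_t] && (5.40) \\
&\le B + \epsilon + V w && (5.41) \\
&\le B + \epsilon + V w^{\mathrm{opt}} && (5.42)
\end{aligned}$$

where hatted quantities are evaluated at an optimal solution of P3', (5.39) is derived by Lemmas 5.1 and 5.5, (5.40) holds since P3' minimizes the right hand side of (5.39), (5.41) is derived based on (5.29), (5.30), and (5.31) in Lemma 5.4 and the fact that $u_t^s$ is independent of $\Theta_t$, and (5.42) holds since P2 is a relaxed problem of P1.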

Taking expectations over $\Theta_t$ on both sides of (5.42) and summing over $t \in \{0, \cdots, T-1\}$ yields

$$E[L(\Theta_T)] - E[L(\Theta_0)] + V\sum_{t=0}^{T-1}E[w_t^*] \le (B + \epsilon + V w^{\mathrm{opt}})\,T. \tag{5.43}$$

Since $L(\Theta_T)$ is non-negative, after some rearrangement, (5.43) gives

$$\frac{1}{T}\sum_{t=0}^{T-1}E[w_t^*] \le \frac{B + \epsilon + V w^{\mathrm{opt}}}{V} + \frac{E[L(\Theta_0)]}{TV}. \tag{5.44}$$

Taking lim sup on both sides of (5.44) and rearranging the terms gives

$$w^* - w^{\mathrm{opt}} \le B/V + (1-r)g_{\max}\max\{p_{b,\max}, C'_{\max}\}.$$
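As a quick worked instance of the final bound, the helper below evaluates B/V + (1 − r)·g_max·max{p_b,max, C'_max} for a few parameter choices. The function name and all numerical values are hypothetical and are used only to show how the two terms respond to V and r.

```python
def performance_gap_bound(B, V, r, g_max, p_b_max, c_prime_max):
    """Evaluate the bound B/V + (1 - r) * g_max * max(p_b_max, c_prime_max)."""
    return B / V + (1.0 - r) * g_max * max(p_b_max, c_prime_max)

# Hypothetical numbers, chosen only for illustration.
print(performance_gap_bound(B=10.0, V=5.0, r=0.5, g_max=20.0, p_b_max=3.0, c_prime_max=2.0))   # 32.0
print(performance_gap_bound(B=10.0, V=5.0, r=1.0, g_max=20.0, p_b_max=3.0, c_prime_max=2.0))   # 2.0
print(performance_gap_bound(B=10.0, V=50.0, r=1.0, g_max=20.0, p_b_max=3.0, c_prime_max=2.0))  # 0.2
```

The first term shrinks as V grows and the second vanishes as r approaches 1, which matches the asymptotic optimality statement: a larger V (enabled by larger storage capacity) and a looser ramping constraint both tighten the gap.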

To emphasize the dependence of performance on r and V , we express w∗ as w∗(r, V ). Similarly, we express wopt

as wopt(r).

2) The lower bound on wopt(r) can be derived by setting r = 1 in Theorem 5.1.1 and recognizing that

wopt(1) ≤ wopt(r).


Proposition 5.2 can be shown by mathematical induction. The proof resembles that of Lemma 5.3 where the

energy capacity si,max is replaced by si,up. We omit the proof for brevity.


1) We prove the conclusion by mathematical induction.

The basis: For $t = 0$, we have $J_0 = 0$, which is obviously upper bounded.

The inductive step: Assume that $J_t \le V p_{b,\max}l_{f,\max} + 1$. Then we need to show that $J_{t+1} \le V p_{b,\max}l_{f,\max} + 1$. Consider the following two cases of $J_t$.

a) $J_t \le V p_{b,\max}l_{f,\max}$. Based on the update of $J_t$ in (5.12), we have $J_{t+1} \le \max\{J_t - \alpha, 0\} + 1 \le J_t + 1 \le V p_{b,\max}l_{f,\max} + 1$.

b) $J_t \in (V p_{b,\max}l_{f,\max}, V p_{b,\max}l_{f,\max} + 1]$. For this case, we will show that the unique solution of $l_{m,t}$ to P3 is $l_{b,t} + l_{f,t}$. Hence, $J_{t+1} = \max\{J_t - \alpha, 0\} \le J_t \le V p_{b,\max}l_{f,\max} + 1$.

To this end, we consider the equivalent problem P3a). First fix the variables $(x_t, g_t, e_{s,t})$ and minimize P3a) over $l_{m,t}$. After some rearrangement, an optimal solution of $l_{m,t}$ can be derived by the following problem:

$$\min_{l_{m,t}}\ \left(V p_{b,t} - \frac{J_t}{l_{f,t}}\right)l_{m,t}$$

$$\text{s.t.}\quad l_{b,t} \le l_{m,t} \le l_{b,t} + l_{f,t},\qquad l_{m,t} \ge \sum_{i=1}^{N}(a_{i,t} - x_{i,t}) + g_t - e_{s,t}.$$

When $J_t > V p_{b,\max}l_{f,\max}$, the objective function above is strictly decreasing. Therefore, the optimal solution of $l_{m,t}$ is $l_{b,t} + l_{f,t}$.
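The induction can be visualized with a short simulation of the virtual queue. The sketch assumes the update J_{t+1} = max{J_t − α, 0} + q_t, where q_t in [0, 1] is the unsatisfied portion of the flexible load; q_t is forced to zero whenever J_t exceeds V·p_b,max·l_f,max (case b) and is otherwise drawn at random as a stand-in for whatever P3 would return. Parameter values are arbitrary.

```python
import random

def simulate_virtual_queue(T, V, p_b_max, l_f_max, alpha, seed=0):
    """Return (largest J_t observed, claimed bound V*p_b_max*l_f_max + 1)."""
    rng = random.Random(seed)
    threshold = V * p_b_max * l_f_max
    J, peak = 0.0, 0.0
    for _ in range(T):
        if J > threshold:
            q = 0.0                    # case b): all flexible load is served
        else:
            q = rng.uniform(0.0, 1.0)  # case a): any portion in [0, 1] may be unsatisfied
        J = max(J - alpha, 0.0) + q    # assumed form of the update (5.12)
        peak = max(peak, J)
    return peak, threshold + 1.0

peak, bound = simulate_virtual_queue(T=10000, V=1.0, p_b_max=2.0, l_f_max=3.0, alpha=0.3)
print(peak, bound)
```

Since the queue never exceeds the bound, J_T/T tends to zero, which is the mean rate stability used to establish the long-term constraint (5.2).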

2) We prove the conclusion by contradiction. Suppose that under our algorithm the optimal solutions of $e_{b,t}$ and $e_{s,t}$ satisfy $e_{b,t}^* > e_{s,t}^* > 0$. Then, we can show that there is another feasible solution

$$\tilde{u}_t = \left[b_t^*, x_t^*, l_{m,t}^*, g_t^*, e_{b,t}^* - e_{s,t}^*, 0\right]$$

achieving a strictly smaller objective value, hence contradicting the fact that $u_t^*$ is optimal. The proofs of the other two possible cases, i.e., $e_{b,t}^* = e_{s,t}^* > 0$ and $e_{s,t}^* > e_{b,t}^* > 0$, are similar, and are omitted for brevity.

5.6.7 Simplification of (5.17)-(5.19)

Define $\bar{y}^k \triangleq \frac{1}{N+4}\sum_{i=1}^{N+4}y_i^k$ and $\bar{d}^k \triangleq \frac{1}{N+4}\sum_{i=1}^{N+4}d_i^k$ as the averages of $y_i^k$ and $d_i^k$ over $i$ at the $k$-th iteration, respectively. By solving the minimization problem in (5.18), we can get a closed-form solution of $z_i^{k+1}$ below:

$$z_i^{k+1} = \frac{d_i^k}{\rho} + y_i^{k+1} - \frac{\bar{d}^k}{\rho} - \bar{y}^{k+1} + \frac{\sum_{i=1}^{N}a_i}{N+4}. \tag{5.45}$$

Substituting the right hand side of (5.45) for $z_i^{k+1}$ in the d-update (5.19) yields $d_i^{k+1} = \bar{d}^k + \rho\big(\bar{y}^{k+1} - \frac{\sum_{i=1}^{N}a_i}{N+4}\big)$, which indicates that the dual variables $d_i^{k+1}$ are identical for all $i$ at each iteration. Therefore, we can safely drop the subscript $i$ in $d_i^{k+1}$ and obtain the d-update in (5.21). Meanwhile, substituting the right hand side of (5.45) for $z_i^k$ in the y-update (5.17) and using the fact that $d_i^{k-1}$ are identical for all $i$ yields (5.20). Since the vector $z$ is not employed in either the y-update or the d-update, it can be eliminated.
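The claim that all dual variables coincide after one update can be verified numerically. The sketch below assumes that the d-update (5.19) has the standard ADMM form d_i^{k+1} = d_i^k + ρ(y_i^{k+1} − z_i^{k+1}); this form is not shown in this appendix and is therefore an assumption. The function name and the random test data are illustrative.

```python
import random

def d_update_via_z(y_next, d, a, rho):
    """Apply (5.45) followed by the assumed dual update; return (d_next, closed_form)."""
    m = len(y_next)                      # m = N + 4 local variables
    y_bar = sum(y_next) / m
    d_bar = sum(d) / m
    a_avg = sum(a) / m                   # sum over the N storage units, divided by N + 4
    # Closed-form z-update (5.45).
    z_next = [d[i] / rho + y_next[i] - d_bar / rho - y_bar + a_avg for i in range(m)]
    # Assumed standard ADMM dual update: d_i + rho * (y_i - z_i).
    d_next = [d[i] + rho * (y_next[i] - z_next[i]) for i in range(m)]
    closed_form = d_bar + rho * (y_bar - a_avg)
    return d_next, closed_form

rng = random.Random(0)
N, rho = 6, 5.0
y_next = [rng.uniform(-10, 10) for _ in range(N + 4)]
d = [rng.uniform(-10, 10) for _ in range(N + 4)]
a = [rng.uniform(0, 5) for _ in range(N)]
d_next, closed_form = d_update_via_z(y_next, d, a, rho)
print(max(abs(v - closed_form) for v in d_next))
```

The printed deviation is at the level of floating-point rounding, consistent with replacing the N + 4 dual variables by a single scalar.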

Chapter 6

Conclusion and Future Work

More and more renewable energy resources such as wind and solar are expected to be integrated into the future

power grid. To overcome the intermittence of renewable generation, in this thesis, control of storage and flex-

ible loads has been considered in grid-wide services. The ultimate goal of the study is to facilitate large-scale

renewable integration so that the long-term performance of power systems can be improved. To this end, more

complete system models have been built, and analytical approaches have been employed to provide algorithms

with strong performance guarantees. For storage control, the problems of real-time power balancing and real-time

phase balancing have been studied. In addition to storage control, control of flexible loads has been incorporated

into the energy management of power systems for maintaining real-time power balance. Both centralized and

distributed algorithms have been proposed for control purposes.

There are many open problems left for the control of storage and flexible loads in the context of renewable

integration. Two directions are listed below as examples.

• In this thesis, a linear model of the storage state has been adopted. To better model energy storage resources,

more detailed and sophisticated models are desired (to this end, experimental data of storage charging and

discharging operations are required). With the help of such detailed storage models, for example, we may

be able to select an appropriate storage profile for a particular grid service.

• In this thesis, only aggregate flexible loads have been controlled. In practice, aggregate flexible loads are

composed of a large number of individual loads, and the aggregate control command needs to be decom-

posed into individual ones. Considering heterogeneity and large scale of individual loads, how to design

efficient algorithms for control purposes and how to incentivize individual loads to participate are critical

and worth studying.


Bibliography

[1] European Commission, “Package of implementation measures for the EU’s objectives on climate change

and renewable energy for 2020,” 2008. [Online]. Available: http://ec.europa.eu/clima/policies/package/

docs/sec 2008 85 ia en.pdf

[2] Senator energy, utilities and communications committee, “Senate Bill X 1-2,” 2011. [Online]. Available:

http://www.leginfo.ca.gov/pub/11-12/bill/sen/sb 0001-0050/sbx1 2 cfa 20110214 141136 sen comm.html

[3] P. Meibom, K. Hilger, H. Madsen, and D. Vinther, “Energy comes together in Denmark,” IEEE Power and

Energ. Mag., vol. 11, no. 5, pp. 46–55, Sep.-Oct. 2013.

[4] P. Denholm, E. Ela, B. Kirby, and M. Milligan, “The role of energy storage with renewable electricity

generation,” National Renewable Energy Laboratory, Tech. Rep., Jan. 2010.

[5] U. Helman, “Resource and transmission planning to achieve a 33% RPS in California-ISO modeling tools

and planning framework,” in Proc. FERC Technical Conf. Planning Models and Software, Jun. 2010.

[6] H. Su and A. Gamal, “Modeling and analysis of the role of energy storage for renewable integration: Power

balancing,” IEEE Trans. Power Syst., vol. 28, no. 4, pp. 4109–4117, Nov. 2013.

[7] F. Katiraei, R. Iravani, N. Hatziargyriou, and A. Dimeas, “Microgrids management: controls and operation

aspects of microgrids,” IEEE Power and Energ. Mag., vol. 6, no. 3, pp. 54–65, May-Jun. 2008.

[8] J. Kim and W. Powell, “Optimal energy commitments with storage and intermittent supply,” Oper. Res.,

vol. 59, no. 6, pp. 1347–1360, Nov.-Dec. 2011.

[9] L. Huang, J. Walrand, and K. Ramchandran, “Optimal demand response with energy storage management,”

in Proc. IEEE SmartGridComm, Nov. 2012.

[10] B. Roberts and C. Sandberg, “The role of energy storage in development of smart grids,” Proc. IEEE, vol. 99,

no. 6, pp. 1139–1144, Jun. 2011.

[11] “Electric drive sales dashboard.” [Online]. Available: http://electricdrive.org/index.php?ht=d/sp/i/20952/

pid/20952

[12] J. John, “SolarCity and Tesla: a utility’s worst nightmare?” Mar. 2014. [Online]. Available:

http://www.greentechmedia.com/articles/read/SolarCitys-Networked-Grid-Ready-Energy-Storage-Fleet

[13] M. Puterman, Markov Decision Processes: discrete stochastic dynamic programming. Wiley-Interscience,

2005.


[14] J. Rawlings, “Tutorial overview of model predictive control,” IEEE Control Syst. Mag., vol. 20, no. 3, Jun.

[15] H. Kwakernaak and R. Sivan, Linear optimal control systems. Wiley-Interscience New York, 1972, vol. 1.

[16] M. Neely, Stochastic Network Optimization with Application to Communication and Queueing Systems.

Morgan & Claypool, 2010.

[17] J. Taylor, D. Callaway, and K. Poolla, “Competitive energy storage in the presence of renewables,” IEEE

Trans. Power Syst., vol. 28, no. 2, pp. 985–996, May 2013.

[18] D. Zhu and G. Hug, “Decomposed stochastic model predictive control for optimal dispatch of storage and

generation,” IEEE Trans. Smart Grid, vol. 5, no. 4, pp. 2044–2053, Jul. 2014.

[19] Y. Kanoria, A. Montanari, D. Tse, and B. Zhang, “Distributed storage for intermittent energy sources: Con-

trol design and performance limits,” in Communication, Control, and Computing (Allerton), Annual Allerton

Conference on, Sep. 2011.

[20] S. Sun, M. Dong, and B. Liang, “Real-time power balancing in electric grids with distributed storage,” IEEE

J. Sel. Topics Signal Process., vol. 8, no. 6, pp. 1167–1181, Dec. 2014.

[21] Y. Guo, M. Pan, Y. Fang, and P. Khargonekar, “Decentralized coordination of energy utilization for residen-

tial households in the smart grid,” IEEE Trans. Smart Grid, vol. 4, no. 3, pp. 1341–1350, Sep. 2013.

[22] J. Qin, Y. Chow, J. Yang, and R. Rajagopal, “Distributed online modified greedy algorithm for networked

storage operation under uncertainty,” Jun. 2014. [Online]. Available: http://arxiv.org/pdf/1406.4615v2.pdf

[23] J. Garzas, A. Armada, and G. Granados, “Fair design of plug-in electric vehicles aggregator for V2G regu-

lation,” vol. 61, no. 8, pp. 3406–3419, Oct. 2012.

[24] E. Sortomme and M. Sharkawi, “Optimal scheduling of vehicle-to-grid energy and ancillary services,” IEEE

Trans. Smart Grid, vol. 3, no. 1, pp. 351–359, Mar. 2012.

[25] S. Han, S. Han, and K. Sezaki, “Optimal control of the plug-in electric vehicles for V2G frequency regulation

using quadratic programming,” in Proc. IEEE ISGT, Jan. 2011.

[26] ——, “Development of an optimal vehicle-to-grid aggregator for frequency regulation,” IEEE Trans. Smart

Grid, vol. 1, no. 1, pp. 65–72, Jun. 2010.

[27] W. Shi and V. Wong, “Real-time vehicle-to-grid control algorithm under price uncertainty,” in Proc. IEEE

SmartGridComm, Oct. 2011.

[28] C. Wu, H. Rad, and J. Huang, “Vehicle-to-aggregator interaction game,” IEEE Trans. Smart Grid, vol. 3,

no. 1, pp. 434–441, Mar. 2012.

[29] B. Gharesifard, T. Basar, and A. Domınguez-Garcıa, “Price-based distributed control for networked plug-in

electric vehicles,” in Proc. ACC, Jun. 2013.

[30] C. Woo, E. Kollman, R. Orans, S. Price, and B. Horii, “Now that California has AMI, what can the state do

with it?” Energy Policy, vol. 36, no. 4, pp. 1366–1374, 2008.

[31] A. Papavasiliou and S. S. Oren, “Supplying renewable energy to deferrable loads: algorithms and economic

analysis,” in Proc. IEEE PES General Meeting, Jul. 2010.


[32] D. Callaway and I. Hiskens, “Achieving controllability of electric loads,” Proc. IEEE, vol. 99, no. 1, pp.

184–199, Jan. 2011.

[33] M. Alizadeh, X. Li, Z. Wang, A. Scaglione, and R. Melton, “Demand-side management in the smart grid,”

IEEE Signal Process. Mag., vol. 29, no. 5, pp. 55–67, Sep. 2012.

[34] M. Alizadeh, Y. Xiao, A. Scaglione, and M. Schaar, “Incentive design for direct load control programs,”

in Communication, Control, and Computing (Allerton), Annual Allerton Conference on, Oct. 2013.

[35] P. Martín, G. Sanchez, and G. Espana, “Direct load control decision model for aggregated EV charging

points,” IEEE Trans. Power Syst., vol. 27, no. 3, pp. 1577–1584, Aug. 2012.

[36] B. Ramanathan and V. Vittal, “A framework for evaluation of advanced direct load control with minimum

disruption,” IEEE Trans. Power Syst., vol. 23, no. 4, pp. 1681–1688, Nov. 2008.

[37] S. Borenstein, “The long-run efficiency of real-time electricity pricing,” The Energ. J., vol. 26, no. 3, pp.

93–116, 2005.

[38] A. Mohsenian-Rad and A. Leon-Garcia, “Optimal residential load control with price prediction in real-time

electricity pricing environments,” IEEE Trans. Smart Grid, vol. 1, no. 2, pp. 120–133, Sep. 2010.

[39] E. Celebi and J. Fuller, “Time-of-Use Pricing in Electricity Markets Under Different Market Structures,”

IEEE Trans. Power Syst., vol. 27, no. 3, pp. 1170–1181, Aug. 2012.

[40] A. Adepetu, E. Rezaei, D. Lizotte, and S. Keshav, “Critiquing time-of-use pricing in Ontario,” in Proc. IEEE

SmartGridComm., Oct. 2013.

[41] M. Kii, K. Sakamoto, Y. Hangai, and K. Doi, “The effects of critical peak pricing for electricity demand

management on home-based trip generation,” IATSS Res., vol. 37, no. 2, pp. 89–97, Mar. 2014.

[42] M. Piette, G. Ghatikar, S. Kiliccote, D. Watson, E. Koch, and D. Hennage, “Design and operation of an

open, interoperable automated demand response infrastructure for commercial buildings,” J. Comput. Inf.

Sci. Eng., vol. 9, no. 2, pp. 1–17, Jun. 2009.

[43] O. Ardakanian, S. Keshav, and C. Rosenberg, “Markovian models for home electricity consumption,” in

Proc. ACM SIGCOMM workshop on Green networking, Aug. 2011.

[44] A. Mohsenian-Rad, V. Wong, J. Jatskevich, R. Schober, and A. Leon-Garcia, “Autonomous demand-side

management based on game theoretic energy consumption scheduling for the future smart grid,” IEEE Trans.

Smart Grid, vol. 1, no. 3, pp. 320–331, Dec. 2010.

[45] D. Callaway, “Tapping the energy storage potential in electric loads to deliver load following and regulation,

with application to wind energy,” Energy Convers. Manage., vol. 50, pp. 1389–1400, Mar. 2009.

[46] P. Du and N. Lu, “Appliance commitment for household load scheduling,” IEEE Trans. Smart Grid, vol. 2,

no. 2, pp. 411–419, Jun. 2011.

[47] M. Alizadeh, A. Scaglione, J. Davies, and K. Kurani, “A scalable stochastic model for the electricity demand

of electric and plug-in hybrid vehicles,” IEEE Trans. Smart Grid, vol. 5, no. 2, pp. 848–860, Mar. 2014.

[48] J. Mathieu, M. Kamgarpour, J. Lygeros, and D. Callaway, “Energy arbitrage with thermostatically controlled

loads,” in Proc. ECC, Jul. 2013.


[49] M. Alizadeh, A. Scaglione, A. Applebaum, G. Kesidis, and K. Levitt, “Reduced-order load models for large populations of flexible appliances,” to appear in IEEE Trans. Power Syst.

[50] H. Hao, B. Sanandaji, K. Poolla, and T. Vincent, “A generalized battery model of a collection of ther-

mostatically controlled loads for providing ancillary service,” in Communication, Control, and Computing

(Allerton), Annual Allerton Conference on, Oct. 2013.

[51] ——, “Aggregate flexibility of thermostatically controlled loads,” IEEE Trans. Power Syst., 2014.

[52] A. Nayyar, J. A. Taylor, A. Subramanian, D. S. Callaway, and K. Poolla, “Aggregate flexibility of collections

of loads,” in Proc. IEEE CDC, Dec. 2013.

[53] A. Meier, Electric Power Systems: A Conceptual Introduction. Wiley-IEEE Press, 2006.

[54] B. Kirby, “Frequency regulation basics and trends,” U.S. Dept. Energy, Tech. Rep., 2005.

[55] B. Narayanaswamy, V. Garg and T. Jayram, “Online optimization for the smart (micro) grid,” in ACM e-

Energy, May 2012.

[56] L. Lu, J. Tu, C. Chau, M. Chen, and X. Lin, “Online energy generation scheduling for microgrids with

intermittent energy sources and co-generation,” in Proc. ACM Sigmetrics, Jun. 2013.

[57] T. Chang, M. Alizadeh, and A. Scaglione, “Real-time power balancing via decentralized coordinated home

energy scheduling,” IEEE Trans. Smart Grid, vol. 4, no. 3, pp. 1490–1504, Sep. 2013.

[58] Y. Huang, S. Mao, and R. Nelms, “Adaptive electricity scheduling in microgrids,” in Proc. IEEE INFOCOM,

Apr. 2013.

[59] G. Joos, B. Ooi, D. McGillis, F. Galiana, and R. Marceau, “The potential of distributed generation to provide

ancillary services,” in Proc. IEEE Power Eng. Soc. Summer Meeting, Jan. 2000.

[60] W. Kempton, V. Udo, K. Huber, K. Komara, S. Letendre, S. Baker, D. Brunner, and N. Pearre, “A test of

vehicle-to-grid (V2G) for energy storage and frequency regulation in the PJM system,” Tech. Rep., Nov.

2008. [Online]. Available: http://www.udel.edu/V2G/resources/test-v2g-in-pjm-jan09.pdf

[61] S. Sun, M. Dong, and B. Liang, “Real-time welfare-maximizing regulation allocation in aggregator-EVs

systems,” in Proc. IEEE INFOCOM Workshop on CCSES, Apr. 2013.

[62] ——, “Real-time welfare-maximizing regulation allocation in dynamic aggregator-EVs system,” IEEE

Trans. Smart Grid, vol. 5, no. 3, pp. 1397–1409, May 2014.

[63] ——, “Distributed regulation allocation with aggregator coordinated electric vehicles,” in Proc. IEEE Smart-

GridComm., Oct. 2013.

[64] J. Zhu, M. Chow, and F. Zhang, “Phase balancing using mixed-integer programming,” IEEE Trans. Power

Syst., vol. 13, no. 4, pp. 1487–1492, Nov 1998.

[65] Y. Hsu, Y. Hwu, S. Liu, Y. Chen, H. Feng, and Y. Lee, “Transformer and feeder load balancing using a

heuristic search approach,” IEEE Trans. Power Syst., vol. 8, no. 1, pp. 184–190, Feb. 1993.


[66] A. Castillo and D. Gayme, “Grid-scale energy storage applications in renewable energy integration: A sur-

vey,” Energy Convers. Manage., vol. 87, pp. 885–894, Aug. 2014.

[67] R. Green II, L. Wang, and M. Alam, “The impact of plug-in hybrid electric vehicles on distribution networks:

A review and outlook,” Renew. Sust. Energ. Rev., vol. 15, no. 1, pp. 544–553, Jan. 2011.

[68] A. Beck and M. Teboulle, “A fast iterative shrinkage-thresholding algorithm for linear inverse problems,”

SIAM J. Imaging Sci., vol. 2, no. 3, pp. 183–202, 2009.

[69] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, Distributed Optimization and Statistical Learning

via the Alternating Direction Method of Multipliers. Found. Trends Mach. Learning, 2011.

[70] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004.

[71] S. Han, S. Han, and K. Sezaki, “Economic assessment on V2G frequency regulation regarding the battery

degradation,” in Proc. IEEE ISGT, Jan. 2012.

[72] P. Ramadass, B. Haran, R. White, and B. Popov, “Performance study of commercial LiCoO2 and spinel-

based Li-ion cells,” J. Power Sources, vol. 111, no. 2, pp. 210–220, Apr. 2002.

[73] E. Bitar, A. Giani, R. Rajagopal, D. Varagnolo, P. Khargonekar, K. Poolla, and P. Varaiya, “Optimal contracts

for wind power producers in electricity markets,” in Proc. IEEE CDC, Dec. 2010.

[74] R. Urgaonkar, B. Urgaonkar, M. Neely, and A. Sivasubramaniam, “Optimal power cost management using

stored energy in data centers,” in Proc. ACM SIGMETRICS, 2011.

[75] D. Bertsekas, Nonlinear Programming. Athena Scientific, 1999.

[76] “Electricity prices in Ontario.” [Online]. Available: http://www.ontarioenergyboard.ca/OEB/Consumers/Electricity

[77] S. Low and D. Lapsley, “Optimization flow control – I: basic algorithm and convergence,” IEEE/ACM Trans.

Netw., vol. 7, no. 6, pp. 861–874, Dec. 1999.

[78] S. Shakkottai and R. Srikant, Network Optimization and Control. Now Publishers Inc, 2007.

[79] Tesla Model S. [Online]. Available: http://en.wikipedia.org/wiki/Tesla_Model_S

[80] Ford Focus Electric. [Online]. Available: http://en.wikipedia.org/wiki/Ford_Focus_Electric

[81] “Fast response regulation signal.” [Online]. Available: http://www.pjm.com/markets-and-operations/ancillary-services/mkt-based-regulation/fast-response-regulation-signal.aspx

[82] “Ancillary services manual.” [Online]. Available: http://www.nyiso.com/public/webdocs/markets_operations/documents/Manuals_and_Guides/Manuals/Operations/ancserv.pdf

[83] A. Wood, B. Wollenberg, and G. Sheble, Power Generation, Operation, and Control, 3rd ed. Wiley-Interscience, 2013.

[84] H. Wang and A. Banerjee, “Online alternating direction method.” [Online]. Available: http://arxiv.org/abs/1306.3721

[85] L. Huang, J. Walrand, and K. Ramchandran, “Optimal power procurement and demand response with

quality-of-usage guarantees,” in Proc. IEEE PESGM, Jul. 2012.

[86] L. Xiang, D. Ng, W. Lee, and R. Schober, “Optimal storage-aided wind generation integration considering

ramping requirements,” in Proc. IEEE SmartGridComm., Oct. 2013.

[87] S. Chen, N. Shroff, and P. Sinha, “Heterogeneous delay tolerant task scheduling and energy management

in the smart grid with renewable energy,” IEEE J. Sel. Areas Commun., vol. 31, no. 7, pp. 1258–1267, Jul.

2013.

[88] Y. Zhang, N. Gatsis, and G. Giannakis, “Robust energy management for microgrids with high-penetration renewables,” IEEE Trans. Sustain. Energy, vol. 4, no. 4, pp. 944–953, Oct. 2013.

[89] S. Salinas, M. Li, P. Li, and Y. Fu, “Dynamic energy management for the smart grid with distributed energy

resources,” IEEE Trans. Smart Grid, vol. 4, no. 4, pp. 2139–2150, Dec. 2013.

[90] M. Shahidehpour, H. Yamin, and Z. Li, Market Operations in Electric Power Systems: Forecasting, Scheduling, and Risk Management. Wiley-IEEE Press, 2002.

[91] S. Nadel and G. Herndon, The Future of the Utility Industry and the Role of Energy Efficiency. Washington,

DC, 2014.

[92] A. Ipakchi and F. Albuyeh, “Grid of the future,” IEEE Power Energy Mag., vol. 7, no. 2, pp. 52–62, 2009.

[93] T. Larsson, M. Patriksson, and A. Stromberg, “Ergodic, primal convergence in dual subgradient schemes for

convex programming,” Math. Program., vol. 86, pp. 283–312, 1999.
