Comprehensive Robustness via Moment-Based Optimization: Theory and Applications
by
Jonathan Yu-Meng Li
A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy
Department of Mechanical and Industrial Engineering, University of Toronto
Copyright © 2012 by Jonathan Yu-Meng Li
Abstract
Comprehensive Robustness via Moment-Based Optimization: Theory and Applications
Jonathan Yu-Meng Li
Doctor of Philosophy
Department of Mechanical and Industrial Engineering
University of Toronto
2012
The use of a stochastic model to predict the likelihood of future outcomes forms an
integral part of decision optimization under uncertainty. In classical stochastic modeling,
uncertain parameters are often assumed to be driven by a particular form of probability
distribution. In practice, however, the distributional form is often difficult to infer from
the observed data, and the incorrect choice of distribution can lead to significant quality
deterioration of resultant decisions and unexpected losses. In this thesis, we present
new approaches for evaluating expected future performance that do not rely on an exact
distributional specification and can be robust against the errors related to committing to
a particular specification. The notion of comprehensive robustness is promoted, where
various degrees of model misspecification are studied. These include fundamental ones,
such as an unknown distributional form, and more involved ones, such as stochastic moments
and moment outliers. The approaches are developed based on the techniques of moment-based
optimization, where bounds on the expected performance are sought based solely
on partial moment information. They can be integrated into decision optimization and
generate decisions that are robust against model misspecification in a comprehensive
manner. In the first part of the thesis, we extend the applicability of moment-based
optimization to incorporate new objective functions such as convex risk measures and
richer moment information such as higher-order multivariate moments. In the second
part, new tractable optimization frameworks are developed that account for various forms
of moment uncertainty in the context of decision analysis and optimization. Financial
applications such as portfolio selection and option pricing are studied.
To my love, Lily,
and to my parents, John and Jean Li.
Acknowledgements
Along the course of my PhD, I have been fortunate to have the guidance and support of
a number of professors. My supervisor, Professor Roy H. Kwon, has been a role model
for implementing the principle of “being passionate about what you do”. His energy and
creativity have been a key ingredient in initiating many of our research discussions. The
past year, which was filled with the stress of looking for an academic job placement,
would have been much more difficult without his support and advice.
I would like to thank my committee members, Timothy Chan, Samir Elhedhli, Sebas-
tian Jaimungal, and Yuri Lawryshyn for their valuable time, and insightful comments and
suggestions. Special thanks go to Sebastian, who brought up the idea of penalty-based
optimization in my second year seminar, which later led to the fruitful developments
in Chapter 4. Special thanks also go to Samir who, although already occupied by
many leadership duties, still kindly agreed to be my external examiner, to read my thesis,
and to adjust his schedule to attend my defense.
I am indebted to Professor Michael J. Best and Professor Tamas Terlaky for their
continuous support for many years since my Masters studies. Tamas introduced me to
the field of mathematical programming, whereas Michael opened my eyes in the area of
financial optimization. In particular, I would like to thank Michael for going out of his
way to help me on many occasions.
I would also like to thank my Masters supervisor, Antoine Deza, who helped me
transition from someone with a pure physics background to someone pursuing a research
career in Operations Research.
I have had the pleasure of sharing my PhD years with numerous good friends. I met
many of them during the development of the University of Toronto Operations Research
Group (UTORG), in particular, Mike and Kimia from the very first day. Special thanks
to Mike, Kimia, Velibor, and Jenya, who have made UTORG a family and not just a
research group to me. I am also thankful to Steve who made my start at U of T easier,
and to Tim who gave me much advice during my job search.
Last, yet most important, I owe my family all the love and gratitude. For many
years, I have been away from my parents and have always been “too busy” to go back
to visit them. They, however, never ceased supporting me in any form they could.
They and my sister, Ann, continuously prayed for me, and my Lord, Jesus Christ, has
always provided me with more than I could ever expect. Beyond any doubt, without their
unconditional love I would not have been able to make it this far. I reserve this very last
part of the acknowledgement for a special person whom I met during the journey of graduate
studies; ever since, she has been not only a part of my family but a kindred spirit.
Without her companionship, I would have lost the true taste of life. Thank you, Lily. Thank
you for your love, compassion, and support.
Contents
1 Introduction and Thesis Outline
  1.1 Introduction
  1.2 Thesis Outline and Contribution
2 Moment Problems, Tractable Counterparts, and Application
  2.1 Moment Problems
  2.2 Application in Model-Risk Management
    2.2.1 Market Price-Based Convex Risk Measures
    2.2.2 A Moment-Based Distribution-Free Optimization Approach
    2.2.3 Numerical Examples
  2.3 Tractability of Accounting for Multivariate Moment Information
  2.4 Conclusion
3 Accounting for Stochastic Moments
  3.1 Deterministic Semidefinite Optimization Models
  3.2 A Stochastic Semidefinite Optimization Approach
  3.3 Solution Features
  3.4 Application in Bounding Option Prices
    3.4.1 A Moment-Based Lattice under Regime Switching
    3.4.2 Implementation and Experiments
  3.5 Conclusion
4 Distributionally Robust Optimization under Extreme Moment Uncertainty
  4.1 Moment Outliers
  4.2 Comprehensive Distributionally Robust Optimization
  4.3 General Complexity Results
  4.4 Connection with Classical Minimax Approaches
  4.5 Semidefinite Optimization Reformulations
    4.5.1 Variations of Moment Uncertainty Structures
    4.5.2 Extensions to Factor Models
  4.6 Application in Portfolio Selection
    4.6.1 Portfolio Selection under Model Uncertainty
    4.6.2 Implementation and Experiments
  4.7 Conclusion
5 Conclusion and Future Research
A Additional Tables
  A.1 Tables of Section 2.2.3
  A.2 Tables of Section 3.4.2
B Additional Figures
  B.1 Section 3.4.2
Bibliography
List of Tables
2.1 ϑ(V∗) of Qfin for various values of parameters s′, K, τ
2.2 ϑ(V∗) of Qmom for various values of parameters s′, K, τ, where w′ = 1
3.1 Pseudo code for scenario generation
4.1 Comparison of different approaches in the period: 1997/01-2003/12
4.2 Comparison of different approaches in the period: 2004/01-2010/12
4.3 Comparison of different approaches in the period: 2004/01-2007/06
4.4 Comparison of different approaches in the period: 2007/06-2010/12
A.1 CB (resp. CM) denotes the call option price of the diffusion (resp. jump-diffusion) model with Lo’s specification. Cb denotes the call option prices of the benchmark model, i.e. the jump-diffusion model with k = 1, φ2 = 0.15, λ = 0.25
A.2 ϑ(V∗) of Qmom for various values of parameters s′, K, τ, where w′ = 2
A.3 ϑ(V∗) of Qmom for various values of parameters s′, K, τ, where w′ = 5
A.4 Upper/lower bounds and prices for different strike prices K, b+(b−)-values and time to maturity under 2 regimes
List of Figures
3.1 Regime switching lattices
3.2 The case of 2 regimes and K = 1200
3.3 The case of 2 regimes and K = 1325
3.4 The case of 2 regimes and K = 1400
4.1 Cumulative wealth
4.2 Cumulative wealth
B.1 The case of 3 regimes and K = 1325
B.2 The case of 4 regimes and K = 1325
B.3 The case of 5 regimes and K = 1325
Chapter 1
Introduction and Thesis Outline
1.1 Introduction
Consider a portfolio manager who follows a forecast of return distributions to determine
the optimal investment policy. What if the forecast model poorly represents the realized
returns? Is the resultant policy still reliable, or could it actually lead to unexpected
losses?
Central to the advance of the decision sciences has been the development of probability and
optimization theories that enable decision makers to model the randomness of decision
environments and make decisions that best utilize available data. Stochastic Optimization
(SO), for example, has been a popular decision optimization tool that allows parameters
in decision optimization problems to be modeled as random variables and be driven by a
probabilistic model. These quantitative approaches, however, have lately been found
unreliable due to deficiencies in the probabilistic models used to capture today’s extremely
volatile environments. A classic example is the 2008 financial crisis, where the failure
to model extreme correlations led to devastating global losses.
The challenge today of applying mathematical modeling in decision making can perhaps
be best summarized by a quote from the famous statistician George Box, “Essentially,
all models are wrong, but some are useful”. In particular, decision makers often face
challenges in deciding which probability distribution to employ in their decision analysis.
The information that they can acquire about the underlying probability distribution is
often fairly limited. For example, in financial decision making, one rarely has full in-
formation about the joint distribution of asset returns but only partial information such
as first and second order moments [Popescu, 2007]. In such cases, it is often tempting
to assume a particular distributional form such as a multivariate normal distribution in
evaluating expected future performance. This however can be potentially misleading and
often underestimates the true level of downside performance. In addition to the difficulty
of specifying an exact distributional form, in practice even moments of the underlying
distribution can be hard to estimate accurately. It has been found in time series studies
that moments are in many cases stochastic and change over time. Overlooking this level
of uncertainty can also lead to a false sense of risk exposure, since the estimated
volatility level (second order moment) can be completely different from the realized one.
The theme of this thesis is to develop more robust decision analysis by taking into
account all levels of uncertainty associated with distributional form and moments in
evaluating expected future performance. An evaluation approach and associated decision
analysis are considered robust if the evaluation relies only on partial information that
decision makers could acquire about the distribution, and is not sensitive to a particular
realization of distributional form or moments. In particular, throughout this thesis we
consider the following three layers of uncertainty associated with distributional forms and
moments:
1. Distribution uncertainty with fixed moments
2. Distribution uncertainty with stochastic moments
3. Distribution uncertainty with extreme moments (moment outliers)
The above three forms of uncertainty share a common feature: none of them assumes
any particular form of the distribution. Additional complexities are introduced in the
latter layers as richer forms of moment uncertainty are considered in addition to the
uncertainty of distributional forms. In the second layer, moments are considered random
and governed by a finite-state stochastic model, where each state corresponds to a
possible realization of the moments. This layer of uncertainty can be useful, for example,
to model the random switching of moments exhibited in many time series. The third
layer of uncertainty is motivated by radical behavior in modern decision environments,
where the changes of moments can be fully unpredictable based on available historical
data. For example, many crises from a statistical point of view are outliers, which are
highly improbable but often have devastating impact. In these crises, volatility often
soars to an unprecedented level. Such a layer of uncertainty can also be applied in
cases where only a limited amount of data is available to estimate moments. In these cases,
the estimated range of moments may fail to capture the true moments, and the need
arises to model the true moments as outliers. The idea of accounting for the above three
layers of uncertainties, which include all plausible realizations of distributional forms and
moments, in decision analysis and optimization constitutes the notion of comprehensive
robustness promoted in this thesis. The resultant analysis of performance evaluation
and optimal decision making is expected to be minimally impacted by all conceivable
misspecifications of the underlying probability/stochastic model.
The research challenges here are multifold. First and foremost, since no distributional
form is assumed in any of these layers of uncertainty, there are infinitely many probability
measures involved in evaluating expected performance. In the first layer of uncertainty
for example, an uncountable set of probability distributions that are consistent with given
moments is considered. This raises the following questions: What are the best-possible
estimates on the expected performance inferred from such a set, and how efficiently can
we generate the estimates? The questions become more challenging when the moment
information used to characterize the distribution set can only be provided in a stochastic
manner or even incomplete as outlined in the second and third layer of uncertainty.
Second, there is no clear rule for how the extreme moments described in the third layer
should be handled. Clearly, the values of the extreme moments involved in evaluating
expected performance cannot be arbitrary, as this would lead to meaningless evaluations.
This raises the question: how should we decide which extreme moments matter most in
the evaluation, and discard the rest? Lastly, in the context of decision optimization, it
is essential to investigate if decisions based on the new performance-evaluation approaches
can be optimized in a tractable manner.
The methodology developed in this thesis can be viewed, from a modeling perspective,
as an application of moment problems that arise in probability theory. Classical
moment problems are concerned with deriving conditions for the existence of a probability
measure that matches a given sequence of moments. In a more generalized setting,
the evaluation of a certain expected-value functional is sought based on a sequence of moments.
Such evaluation typically involves infinitely many probability measures that satisfy the
same set of moments, and naturally leads to the problems of deriving upper and lower
bounds on the expected-values over the set of measures. While such a generalized setting
is exactly the framework we consider to find the best-possible estimates on the expected
performance inferred from moment information, the bounds derived based on probability
theory can however be too loose to be informative in decision analysis. Our approach to
derive tight bounds instead hinges on the connection between modern conic optimization
theory and moment problems. In particular, the exploitation of various optimization tools
such as duality and semi-definite optimization theories enables us not only to generate
sharp bounds in moment problems, but also to provide theoretical evidence that the best-
possible estimate can be generated efficiently. The main contribution of this thesis lies in
extending the applicability of these moment-based optimization approaches to resolve the
aforementioned research challenges in developing the notion of comprehensive robustness.
1.2 Thesis Outline and Contribution
The thesis is organized as follows.
Chapter 2. Moment Problems, Tractable Counterparts, and Ap-
plication
In this chapter we begin with a brief introduction to moment problems and their connection
with modern optimization theory. Two main streams of conic optimization approaches
that tackle the problems from distinct perspectives, one from the primal and another
from the dual perspective, are reviewed. We first present the application of the dual
approach in developing a special form of risk measures used for measuring the impact
of model uncertainty in derivative pricing. The new application is accompanied with
numerical studies that highlight the benefit of using a moment-based setting. We then
present new tractability results in generating tight (the tightest) bounds for moment
problems involving higher order marginal moments.
Chapter 3. Accounting for Stochastic Moments
We consider a new setting of moment problems, where moments are stochastic and driven
by a finite-state stochastic model that captures the dynamics of a non-stationary decision
environment. To account for distribution information about the states and associated
moments, we present two-stage stochastic semidefinite optimization models as robust
counterparts of semidefinite optimization models arising from moment problems with
fixed moments. The framework is comprehensive in the sense that it includes as special
limiting cases the deterministic and robust optimization counterparts. The central result
is a closed-form solution for the optimal value of the proposed optimization model, which
is equivalent to a Value at Risk quantity. The framework is applied in the area of
option pricing to derive upper and lower bounds for the price of a European-style call
option under regime switching, where only conditional moments of regime switching
distributions are assumed. Computational experiments using the S&P 500 index as the
underlying asset are performed that illustrate the advantages of the two-stage stochastic
programming approach over the deterministic strategy.
Chapter 4. Distributionally Robust Optimization under Ex-
treme Moment Uncertainty
The focus of this chapter is to develop a new robust formulation of stochastic program-
ming problems in the presence of rare but high-impact realizations of moment uncertainty.
Such an extreme form of moment uncertainty can be treated as moment outliers, which
are difficult to infer from historical data. Prior robust formulations hedge moment un-
certainty by assuming a fixed range of values that moments can possibly fall into; this,
however, cannot effectively account for moment outliers. Our robust model can be seen as
a moment-based extension from classical penalized minimax frameworks, where a penalty
function is re-designed to account for extreme moment uncertainty. We prove that under
very mild conditions, the decision optimization model is guaranteed to be solvable in
a tractable manner, and show that for a wide range of specifications the model can be
recast as a semidefinite program (SDP) and solved very efficiently. The framework is then
applied to portfolio selection problems. Computational experiments based on real-life
market data are presented, which highlight the utility of our approach during financial
market turmoil.
Chapter 2
Moment Problems, Tractable
Counterparts, and Application
In this chapter, we review and exploit the theory of moment problems to generate
best-possible estimates on expected performance when no distributional information except
a finite number of moments is available for the underlying distribution. Moment
problems have their roots in probability theory. For example, fundamental probability
inequalities such as Markov’s and Chebyshev’s attempt to derive bounds on
the probability of certain events based only on the mean and/or variance of an underlying
random variable. In decision analysis, risk-averse decision makers are often keen to esti-
mate the worst-case expected performance based on available distributional information
such as moments. As the performance measures can be of all sorts, the estimation of the
worst-case performance naturally leads to a more generalized setting of moment problems,
where bounds based on moments are sought on a wide range of expected functionals.
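To make such a bound concrete, recall that Chebyshev's inequality, $P(|\xi - \mu| \ge k\sigma) \le 1/k^2$, is the tightest bound obtainable from the mean and variance alone: a three-point distribution attains it with equality. A minimal sketch in Python (illustrative values, not taken from the thesis):

```python
# Chebyshev's inequality P(|X - mu| >= k*sigma) <= 1/k**2 is tight:
# the three-point distribution below matches the given mean and variance
# and attains the bound with equality. (Illustrative values only.)
def extremal_three_point(mu, sigma, k):
    """Distribution on {mu - k*sigma, mu, mu + k*sigma} attaining Chebyshev's bound."""
    p = 1.0 / k**2                    # total tail mass, split evenly over the two extremes
    return [(mu - k * sigma, p / 2), (mu, 1 - p), (mu + k * sigma, p / 2)]

dist = extremal_three_point(mu=0.0, sigma=1.0, k=2.0)
mean = sum(x * p for x, p in dist)
var = sum((x - mean) ** 2 * p for x, p in dist)
tail = sum(p for x, p in dist if abs(x - mean) >= 2.0)
print(mean, var, tail)  # mean 0, variance 1, tail probability exactly 1/4
```

Since a distribution consistent with the moment information attains the bound, no tighter bound is possible without further moment information.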
The development of these bounds involves two fundamental questions: How tight are
the bounds? Is it tractable (analytically or computationally) to generate tight bounds?
In the cases where the tightest bounds can be generated, i.e. a probability distribution
exists that attains the bound, the bounds can be informative as they represent
the best-possible estimates of the expected performance that can be inferred from just
moment information. Developing such bounds can also be viewed as a robust way to
estimate certain expected-values. It does not rely on a particular form of distribution
and thus the estimates are free from the errors related to committing to a particular
distributional form. From here on, for simplicity, we may call such bounds moment-based
bounds.
The focus of this chapter is to study the tractability of generating tight or the tightest
moment-based bounds via modern optimization theory. The idea of using optimization
theory such as duality theory to formulate a dual optimization problem for which the opti-
mal value attains the tightest moment-based bound can be traced back to the earlier work
of Isii (1960). Smith (1995) then shed new light on the synthesis of the dual optimization
problem, computational strategies, and applications in decision analysis. A major break-
through was by Bertsimas et al. (2000) and Bertsimas and Popescu (2002) who exploited
modern conic optimization theory to show that a large class of moment-based bounds can
be efficiently computed by reformulating the corresponding dual optimization problems
as semidefinite programming problems (SDP). Another related, but more generalized
approach using semidefinite programming in generating moment-based bounds was pro-
posed by Lasserre (2001). Instead of relying on the use of duality theory to seek the
tightest bound, which may not always be feasible, Lasserre resorted to the theory related
to the characterization of moment sequences and developed SDP relaxation techniques
that tighten the bounds by a hierarchy of SDP relaxation problems.
In Section 2.1, we present the problem of moments and review briefly the modern
solution approaches proposed by Bertsimas et al. (2000), Bertsimas and Popescu (2002),
and Lasserre (2001). In the later sections, we will extend the applicability of these
approaches and present new tractability results. In Section 2.2, the application in
model-risk management is presented, where we develop new moment-based convex risk measures.
In Section 2.3, new SDP reformulations are presented for computing tight and the tightest
moment-based bounds that account for higher-order multivariate moments.
2.1 Moment Problems
Let $(\Re^n, \mathcal{B}, Q)$ denote a probability space, where $\mathcal{B}$ is the Borel $\sigma$-algebra on $\Re^n$. Suppose
that the expected value of $h(\xi)$ needs to be estimated, where $\xi$ denotes an $n$-dimensional
random vector and $h : \Re^n \to \Re$. If complete knowledge of the probability
measure $Q$ is available, this leads to the evaluation of the integral
$$E_Q[h(\xi)] := \int_C h(\xi)\, dQ(\xi), \qquad (2.1)$$
where $\xi \in \Re^n$ and $C$ denotes the support of $Q$. However, in most cases only partial
moment information is available for the measure $Q$: $E_Q[\phi_j(\xi)] = b_j$, $j = 1, \dots, J$, where each $\phi_j$
is a polynomial function of $\xi$. Since the measure in these cases cannot be uniquely
determined, the best we can do to evaluate the integral is to find the tightest possible
bounds on it. We thus arrive at the following optimization problem, also known as the
generalized moment problem (c.f. [Lasserre, 2010]):
$$\max_Q \left(\min_Q\right) \quad E_Q[h(\xi)] \quad \text{subject to} \quad E_Q[\phi_j(\xi)] = b_j, \quad j = 1, \dots, J. \qquad (2.2)$$
Isii (1960, 1963) was the first to apply duality theory to study the above problem. Following
the spirit of linear programming duality, we can derive the dual problem of (2.2):
$$\min_z \left(\max_z\right) \quad z^T b \quad \text{subject to} \quad z^T \phi(\xi) \ge h(\xi), \ \forall \xi \in C, \qquad (2.3)$$
where $b$ (resp. $\phi(\xi)$) denotes the vector form of $b_j$ (resp. $\phi_j(\xi)$). The most powerful
connection between solving the dual problem (2.3) and generating the tightest
moment-based bound (2.2) is established via the following strong duality result by Isii (1963). The
theorem implies that under a mild condition, if the problem (2.3) can be solved exactly,
its optimal value is the tightest moment-based bound.
Theorem 2.1.1. [Isii, 1963] If the vector of moments $b$ is interior to the feasible moment
set $\mathcal{M} = \{E[\phi(\xi)] \mid \xi \text{ follows an arbitrary multivariate distribution}\}$, then strong duality holds.
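To illustrate the dual constraint $z^T\phi(\xi) \ge h(\xi)$, take $h(\xi) = 1\{\xi \ge k\}$, $\phi = (1, \xi, \xi^2)$, and moments $b = (1, 0, 1)$ (mean 0, variance 1). The quadratic $(k\xi + 1)^2/(k^2+1)^2$ majorizes the indicator, so the dual objective $z^T b = 1/(1+k^2)$ is a valid upper bound on $P(\xi \ge k)$; this is the classical one-sided Chebyshev (Cantelli) bound, which is in fact tight. A small numeric sanity check (illustrative values, not from the thesis):

```python
# Dual feasibility check for h(x) = 1{x >= k} with phi = (1, x, x^2)
# and moments b = (1, 0, 1) (mean 0, variance 1). The quadratic
# g(x) = (k*x + 1)^2 / (k^2 + 1)^2 majorizes h, so z^T b = 1/(k^2 + 1)
# is a valid upper bound (Cantelli's inequality); it is in fact tight.
k = 2.0
z = (1 / (k**2 + 1) ** 2, 2 * k / (k**2 + 1) ** 2, k**2 / (k**2 + 1) ** 2)

def g(x):                      # the dual polynomial z0 + z1*x + z2*x^2
    return z[0] + z[1] * x + z[2] * x * x

def h(x):                      # indicator of the event {x >= k}
    return 1.0 if x >= k else 0.0

# g majorizes h on a grid of the support, so z is dual feasible
assert all(g(x) >= h(x) - 1e-12 for x in [i / 100.0 for i in range(-500, 501)])

b = (1.0, 0.0, 1.0)            # moments E[1], E[x], E[x^2]
bound = sum(zj * bj for zj, bj in zip(z, b))
print(bound)                   # the bound 1/(k^2 + 1)
```

By weak duality, the value of any feasible dual solution upper-bounds the primal optimum; Theorem 2.1.1 guarantees that the best such solution closes the gap entirely.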
While solving the dual problem (2.3) is known to be NP-hard in its generic form
[Bertsimas and Popescu, 2005], Bertsimas and Popescu (2000, 2002) showed that there is
a large class of instances motivated from real-life applications that can be solved in poly-
nomial time and in a practically efficient manner. Their results hinge on the findings that
a wide range of constraint forms in (2.3) can be reformulated as semidefinite constraints,
also known as semidefinite representable [Ben-Tal and Nemirovski, 2001]. Recall that
a convex set $K \subset \Re^{n'}$ is called semidefinite representable (SDr) if it can be expressed
as $K = \{k^* \mid \exists t^* : A(k^*, t^*) - B \succeq 0\}$, where $A$ denotes a linear operator, $B$ denotes a
constant matrix, and the notation $\succeq 0$ indicates that the left-hand side of the expression is
a positive semidefinite matrix. In addition, a convex function $f^* : \Re^{n'} \to \Re \cup \{\infty\}$ is called SDr if
its epigraph $\{(k^*, t^*) \mid f^*(k^*) \le t^*\}$ is an SDr set. Here we present as an example one of
the key results related to semidefinite reformulations of (2.3) in [Bertsimas and Popescu,
2002] when both functions h and φj in (2.2) are univariate polynomial functions. The
result will also be used in the latter sections.
Proposition 2.1.1. [Bertsimas and Popescu, 2002], [Gotoh and Konno, 2002]

1. The polynomial $g(x) = \sum_{r=0}^{n} y_r x^r$ satisfies $g(x) \ge 0$ for all $x \in [0, a)$ if and only if
there exists a positive semidefinite matrix $X = [x_{ij}]_{i,j=0,\dots,n}$ such that
$$0 = \sum_{i,j:\, i+j=2l-1} x_{ij}, \quad l = 1, \dots, n,$$
$$\sum_{r=0}^{l} y_r \binom{n-r}{l-r} a^r = \sum_{i,j:\, i+j=2l} x_{ij}, \quad l = 0, \dots, n.$$

2. The polynomial $g(x) = \sum_{r=0}^{n} y_r x^r$ satisfies $g(x) \ge 0$ for all $x \in [a, b]$ if and only if
there exists a positive semidefinite matrix $X = [x_{ij}]_{i,j=0,\dots,n}$ such that
$$0 = \sum_{i,j:\, i+j=2l-1} x_{ij}, \quad l = 1, \dots, n,$$
$$\sum_{m=0}^{l} \sum_{r=m}^{n+m-l} y_r \binom{r}{m} \binom{n-r}{l-m} a^{r-m} b^{m} = \sum_{i,j:\, i+j=2l} x_{ij}, \quad l = 0, \dots, n.$$

3. The polynomial $g(x) = \sum_{r=0}^{n} y_r x^r$ satisfies $g(x) \ge 0$ for all $x \in [a, \infty)$ if and only if
there exists a positive semidefinite matrix $X = [x_{ij}]_{i,j=0,\dots,n}$ such that
$$0 = \sum_{i,j:\, i+j=2l-1} x_{ij}, \quad l = 1, \dots, n,$$
$$\sum_{r=l}^{n} y_r \binom{r}{l} a^{r-l} = \sum_{i,j:\, i+j=2l} x_{ij}, \quad l = 0, \dots, n.$$
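The "if" direction of case 3 (specialized to $a = 0$) admits a quick numeric sanity check: for a positive semidefinite $X$ whose odd anti-diagonals sum to zero, the even anti-diagonal sums $y_l$ define a polynomial with $g(x^2) = z^T X z$ for $z = (1, x, \dots, x^n)$, hence $g \ge 0$ on $[0, \infty)$. A sketch with an arbitrarily chosen PSD matrix (illustrative, not from the thesis):

```python
# Sketch of Proposition 2.1.1 (case 3, with a = 0): a PSD matrix X whose
# odd anti-diagonals sum to zero certifies g(x) = sum_r y_r x^r >= 0 on
# [0, inf), where y_l is the sum of the 2l-th anti-diagonal of X.
# Here X = v v^T + w w^T is PSD by construction (illustrative choice).
def antidiag_sums(X):
    n = len(X)
    return [sum(X[i][j] for i in range(n) for j in range(n) if i + j == s)
            for s in range(2 * n - 1)]

v, w = [1.0, 0.0, 2.0], [0.0, 1.0, 0.0]
X = [[v[i] * v[j] + w[i] * w[j] for j in range(3)] for i in range(3)]

s = antidiag_sums(X)                 # s[2l] = y_l; s[2l-1] must vanish
assert all(abs(s[2 * l - 1]) < 1e-12 for l in (1, 2))
y = [s[0], s[2], s[4]]               # coefficients of the certified g

g = lambda x: sum(c * x**r for r, c in enumerate(y))
assert all(g(i / 10.0) >= -1e-12 for i in range(0, 200))
print(y)
```

The certificate works because $z^T X z = \sum_s \big(\sum_{i+j=s} x_{ij}\big) x^s = g(x^2) \ge 0$ whenever $X \succeq 0$.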
Since the work of [Bertsimas et al., 2000], [Bertsimas and Popescu, 2002], and [Bert-
simas and Popescu, 2005], the idea of applying conic optimization theory to efficiently
generate the tightest moment-based bounds has been considerably generalized (see
[Zuluaga and Pena, 2005]). In Section 2.3, we will present new tractability results for a
special class of moment problems, where random variables are multivariate and marginal
moment information is incorporated.
Generating the tightest moment-based bounds, while most desirable, is not always
computationally tractable. Lasserre (2001) proposed an alternative conic optimization
approach, based on moment relaxation techniques, to solve moment problems approximately.
Instead of taking a dual perspective on the moment problem, Lasserre's approach directly tackles
the primal form of the problem, i.e. (2.2), using a “change of variables”-type method. The
idea is to replace each monomial moment $E_Q[\prod_{i=1}^{n} \xi_i^{p_i}]$ by a new scalar $u_p \in \Re$,
where $p = (p_1, \dots, p_n) \in \mathbb{Z}_+^n$ is an index variable and $|p| = p_1 + \cdots + p_n$. In the cases
that functions $h$ and $\phi_j$ in (2.2) are polynomials, the problem (2.2) can be reformulated
as follows:
$$\max_u \left(\min_u\right) \quad c^T u \quad \text{subject to} \quad \kappa_j^T u = b_j, \quad j = 1, \dots, J, \qquad (2.4)$$
where $u$ is a vector form of $u_p$, and $c$ (resp. $\kappa_j$) denotes the coefficients of the polynomial
$h$ (resp. $\phi_j$). Clearly, the above problem is a relaxation of the problem (2.2). To ensure
the bounds generated from the above relaxation can be reasonably tight, Lasserre (2001)
further employed the notion of moment matrices to strengthen the relaxation.
Definition 2.1.1. The moment matrix $M_r(u)$ is defined by
$$M_r(u)(1, i^*) = M_r(u)(i^*, 1) = u^*_{i^*-1}, \quad \text{for } i^* = 1, \dots, 2r+1,$$
$$M_r(u)(1, j^*) = u^*_{\alpha^*} \ \text{and} \ M_r(u)(i^*, 1) = u^*_{\beta^*} \;\Rightarrow\; M_r(u)(i^*, j^*) = u^*_{\alpha^* + \beta^*},$$
where $\{u^*_{i^*} \in \mathbb{R} : i^* \in \mathbb{Z}_+\}$ is the sequence obtained by ordering $u$ so that it conforms with
the indexing implied by the usual basis
$$1, \ \xi_1, \dots, \xi_n, \ \xi_1^2, \ \xi_1\xi_2, \dots, \ \xi_1^{2r}, \ \xi_1^{2r-1}\xi_2, \dots, \ \xi_n^{2r} \qquad (2.5)$$
of the vector space of $\mathbb{R}$-valued polynomials in $n$ variables of degree at most $2r$.
It is easy to verify that if the sequence u is a feasible moment sequence, i.e. there
exists a probability measure having u as its moments, the corresponding moment matrix
Mr(u) must be positive semidefinite. The fact that such a semidefinite condition is also
sufficient for an infinite sequence $u = \{u_p : p \in \mathbb{Z}_+^n\}$ to be a feasible moment sequence
is established by Curto and Fialkow (1996).
Theorem 2.1.2. [Curto and Fialkow, 1996] For an infinite sequence u = up : |p| =∞,
if Mr(u) 0 and Mr(u) has finite rank r, then u has a unique r-atomic representing
measure.
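To make Definition 2.1.1 and Theorem 2.1.2 concrete, the sketch below (Python with NumPy; the support points and probabilities are hypothetical) builds the order-1 moment matrix M_1(u) of a small bivariate discrete distribution in the basis (1, ξ_1, ξ_2) and confirms it is positive semidefinite:

```python
import numpy as np

# A discrete bivariate distribution: support points and probabilities (hypothetical).
points = np.array([[0.0, 1.0], [2.0, 0.5], [1.0, 2.0]])
probs = np.array([0.2, 0.5, 0.3])

def moment(p1, p2):
    """u_p = E_Q[xi_1^p1 * xi_2^p2] under the discrete measure."""
    return float(np.sum(probs * points[:, 0]**p1 * points[:, 1]**p2))

# Monomial basis for r = 1: (1, xi_1, xi_2); entry (a, b) of M_1(u) pairs exponents.
basis = [(0, 0), (1, 0), (0, 1)]
M1 = np.array([[moment(a1 + b1, a2 + b2) for (b1, b2) in basis]
               for (a1, a2) in basis])

print(np.linalg.eigvalsh(M1).min() >= -1e-9)  # True: feasible moments give PSD M_r(u)
```

Since M_1(u) = E_Q[v v^T] with v = (1, ξ_1, ξ_2) for any genuine measure, positive semidefiniteness holds by construction here; an arbitrary scalar sequence u would generally fail this test.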
The semidefinite condition given in Theorem 2.1.2 is intractable from a computational
perspective due to the infinite size of the sequence. Lasserre (2001) proposed the use of a
truncated sequence and the associated moment matrix, i.e. fixing the value r in (2.5), to
develop a hierarchy of relaxation counterparts. By increasing the value r, the relaxation
problem (2.4) can be strengthened in a systematic manner. Such relaxation techniques are
powerful because each relaxation can be solved efficiently as a semidefinite programming
problem. Lasserre (2001) also proved several asymptotic convergence results as r → ∞.
Later, we will revisit Lasserre's type of approach to develop further tractability
results.
In the cases where necessary and sufficient conditions are both available for a finite
sequence u, the relaxation becomes exact. This is the case for univariate random
variables. The following theorem is due to Hamburger (1920, 1921).
Theorem 2.1.3. [Hamburger, 1920], [Hamburger, 1921] For univariate random variables
supported on the whole real line, a necessary and sufficient condition for a vector
u := [u_0, u_1, ..., u_{2r}] to be a feasible moment sequence is that it belongs to the
following set Ω, which is a positive semidefinite cone:

           ⎧     ⎡ u_0    u_1    · · ·  u_r     ⎤        ⎫
           ⎪     ⎢ u_1    u_2    · · ·  u_{r+1} ⎥        ⎪
    Ω :=   ⎨ u | ⎢  ⋮      ⋮     ⋱      ⋮       ⎥ ⪰ 0   ⎬.
           ⎪     ⎣ u_r    u_{r+1} · · · u_{2r}  ⎦        ⎪
           ⎩                                             ⎭
Other related positive semidefinite conditions for univariate random variables with
different ranges of support can be found in the early work of Stieltjes (1894, 1895).
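As a quick illustration of Theorem 2.1.3, the sketch below (Python with NumPy; the moment vectors are hypothetical) assembles the Hankel matrix of a candidate moment vector [u_0, ..., u_{2r}] and tests the positive semidefiniteness condition:

```python
import numpy as np

def hankel_moment_matrix(m):
    """Build the (r+1)x(r+1) Hankel matrix from raw moments m = [u_0, ..., u_{2r}]."""
    r = (len(m) - 1) // 2
    return np.array([[m[i + j] for j in range(r + 1)] for i in range(r + 1)])

def is_feasible_moment_sequence(m, tol=1e-9):
    """Hamburger condition: m is realizable on the real line iff its Hankel matrix is PSD."""
    return bool(np.linalg.eigvalsh(hankel_moment_matrix(m)).min() >= -tol)

# Raw moments of the standard normal: E[xi^k] = 0, 1, 0, 3 for k = 1, ..., 4.
print(is_feasible_moment_sequence([1.0, 0.0, 1.0, 0.0, 3.0]))  # True

# Infeasible: E[xi^4] = 0.5 < (E[xi^2])^2 = 1 would violate Jensen's inequality.
print(is_feasible_moment_sequence([1.0, 0.0, 1.0, 0.0, 0.5]))  # False
```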
2.2 Application in Model-Risk Management
In this section, we present an application of moment-based optimization in developing
a special form of risk measures used for measuring the impact of model uncertainty
in derivative pricing. The pricing of derivatives, such as options or futures, remains
challenging as modern markets more than ever are exhibiting complex and non-stationary
behaviors. In the classical Black-Scholes pricing formula [Black and Scholes, 1973], it is
assumed that the price dynamics of an underlying security follows a geometric Brownian
motion (GBM) with constant volatility. This assumption, however, is at odds with most
empirical findings, which indicate that the trend and the volatility of markets are in
general non-stationary.
Despite the rapid development of derivative pricing models, no single model
suits all cases. Practitioners often find it difficult to choose the right model
and are concerned about the possible losses associated with model misspecification.
Devastating financial losses due to derivative mispricing resulting from model
misspecification have long been reported. To help manage this type of risk, known as model
risk, Cont (2006) was the first to provide a comprehensive treatment of the design of
new risk measures for quantifying the impact of model risk. Such new measures can be
viewed as special instances of the popular convex risk measures, where the optimization
representation of convex risk measures is specialized with its solution space refined to a
set of derivative-pricing models (distributions) that are ambiguous to traders. In
particular, a market price-based penalty function was introduced in the measure that gives
higher preference to the pricing model that can better reproduce the market prices of
existing derivative instruments. In the rest of this section, we also call such measures
market price-based convex risk measures.
Cont gave examples that illustrate the evaluation of the risk measure based on finite
families of probability (pricing) measures. However, these families of measures often
require additional assumptions on the functional forms of the pricing distributions, which
can be very difficult to verify in practice. This can also lead to underestimation of the
impact of model misspecification, since the true pricing distribution may not even be
considered when evaluating the risk measures. In this section, we consider the case
of infinite families of measures that share common moments, e.g. mean and variance
for European-style options, and present a new approach to evaluate Cont’s convex risk
measures. Examples are given that illustrate the benefits of evaluating the risk measure
with infinite families of measures and shed light on the limitations of considering only
finite families of measures.
2.2.1 Market Price-Based Convex Risk Measures
We first briefly review the properties of convex risk measures and then provide some
background on Cont's market price-based risk measures. Given a sample space {ω : ω ∈ Ω},
let 𝒳 denote a linear space of bounded functions V : Ω → ℝ. Note that for any
V_1, V_2 ∈ 𝒳 the notation V_1 ≥ V_2 stands for the relation V_1(ω) ≥ V_2(ω) ∀ ω ∈ Ω. A
function ρ : 𝒳 → ℝ is called a convex risk measure if it satisfies the following axioms, for
all V_1, V_2 ∈ 𝒳:

1. if V_1 ≥ V_2, then ρ(V_1) ≤ ρ(V_2);

2. if c′ ∈ ℝ, then ρ(V_1 + c′) = ρ(V_1) − c′;

3. ρ(λ′V_1 + (1 − λ′)V_2) ≤ λ′ρ(V_1) + (1 − λ′)ρ(V_2) ∀ λ′ ∈ [0, 1].
The convexity property 3 plays a significant role as it supports the notion that
diversification typically helps to reduce risk. Under some mild conditions, Föllmer and
Schied (2002) show a particularly useful representation: any convex risk measure
ρ can be represented as

    ρ(V) = sup_{Q∈D} { E_Q[−V] − α(Q) },
where α : D → ℝ is a convex function.
Assume now that there exists a financial market within which derivative instruments
are traded. Consider a set of L financial instruments whose future payoffs are H_l : Ω →
ℝ, l = 1, ..., L, and whose current market prices are h_l ∈ ℝ, l = 1, ..., L. Then there is no
arbitrage opportunity if and only if there exists a probability measure Q such that
    E_Q[H_l] = h_l,   l = 1, ..., L,
holds. Such a Q may or may not be unique, depending on the assumptions of the mar-
ket. Detailed discussions of these assumptions and explanations are referred to relevant
literature. In the cases that Q is not unique or cannot be uniquely determined, traders
then face model uncertainty. In practice, based on their knowledge of the market, traders
usually specify a family of possible measures D for pricing a target payoff V* : Ω → ℝ; as
a consequence, the prices generated from different measures in D may not be the same.
One important source of information that helps in specifying a measure Q is the
market prices of derivative instruments traded in the market. In [Cont, 2006], these
options are called benchmark options, as their market prices can serve as a useful reference.
Thus, based on a set of benchmark options with payoffs (Hl)l=1,...,L and market prices
(h_l)_{l=1,...,L}, Cont suggests the following metric ϑ to quantify the uncertainty (risk) with
respect to a given target payoff V* and a set of pricing models D, where

    ϑ(V*) = π*(V*) − π_*(V*),

and

    π*(V*) = sup_{Q∈D} { E_Q[V*] − ||h* − E_Q[H*]|| },
    π_*(V*) = inf_{Q∈D} { E_Q[V*] + ||h* − E_Q[H*]|| },

and h* (resp. H*) denotes the aggregated vector form of (h_l)_{l=1,...,L} (resp. (H_l)_{l=1,...,L}).
The operator || · || is a norm over the aggregated vector space. More generally, the norm
function can be represented as

    ||h* − E_Q[H*]|| = Σ_{l=1}^{L} w′_l · |h_l − E_Q[H_l]|,
where w′_l denotes a penalty parameter. The upper (resp. lower) bound measure is closely
related to the convex risk measure ρ(V), where

    ρ(V*) = π*(−V*)   (resp. ρ(V*) = −π_*(V*)),

and the penalty function α(Q) is defined as α(Q) := ||h* − E_Q[H*]||. The upper/lower
bound measure without the penalty term simply evaluates the most extreme value of
derivative prices under each measure in the set D. With the addition of the
penalty term, each measure further takes into account the "calibration error" of each
possible Q, i.e. the capability of each measure Q to reproduce the market prices of the given
benchmark instruments. One of the most useful features of the penalty construction is
that, given a set of ambiguous pricing measures D, the metric ϑ requires only one measure
Q ∈ D, not all of D, to replicate the market prices of the benchmark options in order for
ϑ to be considered a "good" measure [Cont, 2006]. In other words, the metric ϑ can be
applied with almost no difficulty to additional measures that are difficult to calibrate,
alongside the initially specified measures (at least one of which is assumed to calibrate
sufficiently well to the benchmark prices). This is beneficial since it provides
the flexibility to incorporate a wider class of pricing measures into D, especially those
that may be more compatible with a trader's view of future market scenarios despite the
difficulty of calibration.
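To make the construction concrete, the sketch below (Python with NumPy; the price grid, measures, and prices are all hypothetical) evaluates ϑ(V*) = π*(V*) − π_*(V*) over a small finite family D of discrete pricing measures with the weighted L1 penalty:

```python
import numpy as np

# Hypothetical price grid for the underlying at maturity.
prices = np.array([30.0, 40.0, 50.0, 60.0])

# A finite family D of discrete pricing measures (each row sums to one).
D = np.array([
    [0.10, 0.50, 0.30, 0.10],
    [0.00, 0.50, 0.50, 0.00],
    [0.25, 0.25, 0.25, 0.25],
])

# One benchmark call (strike 40) with an assumed market price h and weight w'.
H = np.maximum(prices - 40.0, 0.0)
h, w = 5.0, 1.0

# Target payoff: call with strike 45.
V = np.maximum(prices - 45.0, 0.0)

def penalty(q):
    """Weighted L1 calibration error ||h - E_Q[H]||."""
    return w * abs(h - q @ H)

upper = max(q @ V - penalty(q) for q in D)   # pi^*(V*)
lower = min(q @ V + penalty(q) for q in D)   # pi_*(V*)
theta = upper - lower                        # model-uncertainty metric
print(upper, lower, theta)                   # 3.0 2.5 0.5
```

Here the first two measures both reproduce the benchmark price exactly (zero penalty) yet price the target differently, so ϑ = 0.5 > 0 reflects genuine model uncertainty, while the poorly calibrated uniform measure is discounted by its penalty.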
2.2.2 A Moment-Based Distribution-Free Optimization Approach
Here, we characterize the set of infinite families of probability measures that we will use
for the market price-based risk measures. Let ξ denote the random price of a single
underlying asset. The set D is specified through a set of moment conditions, i.e.
    D := { Q : E_Q[φ_j(ξ)] = b_j,  j = 1, ..., J },     (2.6)
where φ_j : ℝ → ℝ is continuous and b_j ∈ ℝ. The focus of this section is to provide a
method to reformulate the evaluation problems π*(V*) and π_*(V*) under (2.6) as convex
optimization problems so that they can be solved efficiently. Our approach is based on the
theories of semi-infinite and semi-definite programming. From here on, we assume that
the norm function || · || within the penalty function is semidefinite representable (SDr).
Many norm functions, including those discussed in [Cont, 2006], are SDr (cf. [Ben-Tal
and Nemirovski, 2001]). The following lemma is essential for our development.
Lemma 2.2.1. Consider the problems

    p_sup = sup_Q { ∫_C ψ(ζ) dQ(ζ) : ∫_C E(ζ) dQ(ζ) = E_0,  ∫_C dQ(ζ) = 1 },

and

    p_inf = inf_Q { ∫_C ψ(ζ) dQ(ζ) : ∫_C E(ζ) dQ(ζ) = E_0,  ∫_C dQ(ζ) = 1 },

where Q is a non-negative measure on the measurable space (ℝ^n, B), ψ : ℝ^n → ℝ and
E : ℝ^n → ℝ^m are continuous, and E_0 ∈ ℝ^m. The dual problems can be respectively
written as

    d_sup = inf_{λ_0, Λ_e} { λ_0 + Λ_e^T E_0 : λ_0 + Λ_e^T E(ζ) ≥ ψ(ζ)  ∀ ζ ∈ C }     (2.7)
          = inf_{Λ_e} sup_{ζ∈C} { ψ(ζ) − Λ_e^T E(ζ) } + Λ_e^T E_0,                     (2.8)

and

    d_inf = sup_{λ_0, Λ_e} { λ_0 + Λ_e^T E_0 : λ_0 + Λ_e^T E(ζ) ≤ ψ(ζ)  ∀ ζ ∈ C }     (2.9)
          = sup_{Λ_e} inf_{ζ∈C} { ψ(ζ) − Λ_e^T E(ζ) } + Λ_e^T E_0,                     (2.10)

where λ_0 ∈ ℝ and Λ_e ∈ ℝ^m. Then strong duality holds, i.e. p_sup = d_sup (p_inf = d_inf), if

    E_0 ∈ int( { ∫_C E(ζ) dQ(ζ) } ),     (2.11)

where the set is taken over all such measures Q.
Proof. One can derive the first dual reformulation (2.7) (resp. (2.9)) by following duality
theory for semi-infinite linear problems (cf. [Shapiro, 2001]). The second dual formulation
(2.8) (resp. (2.10)) can be derived by first converting the constraint into

    λ_0 ≥ sup_{ζ∈C} { ψ(ζ) − Λ_e^T E(ζ) }   (resp. λ_0 ≤ inf_{ζ∈C} { ψ(ζ) − Λ_e^T E(ζ) }).

Since the right-hand side of the above inequality provides a lower (resp. upper) bound
on λ_0, we can replace λ_0 in the objective function by sup_{ζ∈C} { ψ(ζ) − Λ_e^T E(ζ) } (resp.
inf_{ζ∈C} { ψ(ζ) − Λ_e^T E(ζ) }).
We present here a general framework to generate tractable reformulations for the
problems of evaluating π*(V*) and π_*(V*) under (2.6). Note that the target payoff V*
and the future payoffs H_l are functions of the random price ξ.
Theorem 2.2.1. Given that the interior condition (2.11) holds, the problems of evaluating
π*(V*) and π_*(V*) under (2.6) are equivalent to solving the following two problems:

    π* := inf  s + t
    s.t.  V*(ξ) − Σ_{j=1}^{J} λ_{mj}(φ_j(ξ) − b_j) − Σ_{l=1}^{L} λ_{hl} H_l(ξ) ≤ s   ∀ ξ ≥ 0,
          sup_{(q_l)_{l=1,...,L}} { Σ_{l=1}^{L} λ_{hl} q_l − ||h* − vec((q_l)_{l=1,...,L})|| } ≤ t,     (2.12)

    π_* := sup  s + t
    s.t.  V*(ξ) − Σ_{j=1}^{J} λ_{mj}(φ_j(ξ) − b_j) − Σ_{l=1}^{L} λ_{hl} H_l(ξ) ≥ s   ∀ ξ ≥ 0,
          inf_{(q_l)_{l=1,...,L}} { Σ_{l=1}^{L} λ_{hl} q_l + ||h* − vec((q_l)_{l=1,...,L})|| } ≥ t,     (2.13)

where for each problem (λ_{mj})_{j=1,...,J}, (λ_{hl})_{l=1,...,L}, s, t are variables, with λ_{mj}, λ_{hl}, s, t ∈ ℝ.
Furthermore, the constraints (2.12) and (2.13) are SDr. Note that vec(q_γ) denotes the
aggregated vector form of the set of scalar variables q_γ.
Proof. We present here only the reformulation of π*, since π_* can be reformulated in an
identical manner. We first introduce slack variables (q_l)_{l=1,...,L} ∈ ℝ and reformulate the
problem as follows:

    sup_{Q, (q_l)_{l=1,...,L}}  E_Q[V*(ξ)] − ||h* − vec((q_l)_{l=1,...,L})||
    subject to  E_Q[φ_j(ξ)] = b_j,  j = 1, ..., J,
                E_Q[H_l(ξ)] = q_l,  l = 1, ..., L.

Consider maximizing the above problem first with respect to the measure Q; based on
Lemma 2.2.1, the problem can be reformulated as

    sup_q inf_{λ_m, λ_h} sup_{ξ≥0}  V*(ξ) − Σ_{j=1}^{J} λ_{mj}(φ_j(ξ) − b_j)
                                    − Σ_{l=1}^{L} λ_{hl}(H_l(ξ) − q_l) − ||h* − q||,

where q := vec((q_l)_{l=1,...,L}), λ_m := vec((λ_{mj})_{j=1,...,J}) and λ_h := vec((λ_{hl})_{l=1,...,L}). Since the
operator sup_{ξ≥0} preserves convexity, the problem is concave with respect to (q_l)_{l=1,...,L} and
convex with respect to λ_m, λ_h. Therefore, using Sion's minimax theorem we can switch
(sup_q) and (inf_{λ_m,λ_h}) and arrive at

    inf_{λ_m, λ_h} sup_q sup_{ξ≥0}  V*(ξ) − Σ_{j=1}^{J} λ_{mj}(φ_j(ξ) − b_j)
                                    − Σ_{l=1}^{L} λ_{hl}(H_l(ξ) − q_l) − ||h* − q||.

By introducing slack variables s and t, the problem can be equivalently written as

    inf_{λ_m, λ_h, s, t}  s + t
    s.t.  sup_{ξ≥0} { V*(ξ) − Σ_{j=1}^{J} λ_{mj}(φ_j(ξ) − b_j) − Σ_{l=1}^{L} λ_{hl} H_l(ξ) } ≤ s,     (2.14)
          sup_q { Σ_{l=1}^{L} λ_{hl} q_l − ||h* − q|| } ≤ t.

Then the first constraint is equivalent to a feasibility constraint which requires that the
inequality in (2.14) holds for all ξ ≥ 0. Now consider the second constraint in the above
problem. By introducing a slack variable z′, the constraint can be reformulated as follows:

    sup_{q, z′}  Σ_{l=1}^{L} λ_{hl} · q_l − z′ ≤ t
    subject to  ||h* − q|| ≤ z′.

Given that the norm || · || is SDr, the problem can be reformulated as an SDP maximization
problem. By applying SDP duality theory, we can derive an equivalent SDP minimization
dual problem. Notice that the Slater condition holds for the above problem, and therefore
strong duality holds. The resulting constraint is then of the form min_{y∈S} c^T y ≤ t, where c
is a coefficient vector and S is an SDr set, which is equivalent to the condition

    ∃ y ∈ S : c^T y ≤ t.

This condition is a set of SDP constraints.
We now consider the problem of evaluating Cont’s convex risk measures for European
call/put options when only a finite number of moments are available for the underlying
security. Based on Proposition 2.1.1 and Theorem 2.2.1, we show in Corollary 2.2.1
that the problem can be reformulated as a semidefinite programming problem. Similar
settings of moment conditions can also be found in [Grundy, 1991], [Boyle and Lin, 1997],
[Bertsimas and Popescu, 2002], [Gotoh and Konno, 2002].
Corollary 2.2.1. Consider the evaluations of π* and π_* for a European call (put) option
with strike price K_0, given as benchmark options a set of European call options with
H_l(ξ) = max(0, ξ − K_l), l = 1, ..., o, and put options with H_l(ξ) = max(0, K_l − ξ),
l = o + 1, ..., L, where K_l ∈ ℝ_+. In the case that a vector of raw moments b_j, j = 1, ..., J is given,
the evaluation problems π* and π_* can be solved efficiently as semidefinite optimization
problems.
Proof. For brevity, we consider only reformulating the problem of evaluating π*. The
evaluation problem of π_* can be reformulated using the same approach. Based on Theorem
2.2.1, the only constraint that needs to be further reformulated is

    V*(ξ) − Σ_{j=1}^{J} λ_{mj}(φ_j(ξ) − b_j) − Σ_{l=1}^{L} λ_{hl} H_l(ξ) ≤ s   ∀ ξ ≥ 0.

Thus, for the case that the target option is a European call option, i.e. V*(ξ) =
max(0, ξ − K_0), given that φ_j(ξ) = ξ^j, j = 1, ..., J, the constraint can be equivalently
written as

    max(0, ξ − K_0) − Σ_{j=1}^{J} λ_{mj}(ξ^j − b_j) − Σ_{l=1}^{o} λ_{hl} max(0, ξ − K_l)
        − Σ_{l=o+1}^{L} λ_{hl} max(0, K_l − ξ) ≤ s   ∀ ξ ≥ 0.

Now, let k_1, ..., k_I denote the ordered sequence of the breakpoints K_l, l = 0, ..., L, where
k_{s′} ≤ k_{s′+1}. We can partition the space of ξ ∈ ℝ_+ according to the sequence k_1, ..., k_I,
and the above constraint can thus be decomposed and written generally as the following
set of constraints:

                        ⎧ (a′_0^T λ_q) ξ + b′_0,        ξ ∈ [0, k_1]
                        ⎪ (a′_1^T λ_q) ξ + b′_1,        ξ ∈ [k_1, k_2]
    Σ_{j=1}^{J} λ_{mj} ξ^j  ≥  ⎨   ⋮                            ⋮                    (2.15)
                        ⎪ (a′_{I−1}^T λ_q) ξ + b′_{I−1},  ξ ∈ [k_{I−1}, k_I]
                        ⎩ (a′_I^T λ_q) ξ + b′_I,        ξ ∈ [k_I, ∞)

where λ_q = vec((λ_{hl})_{l=1,...,L}). Proposition 2.1.1 can then be applied to convert each
constraint in (2.15) to its SDP counterpart based on the end points of the respective partition.
This completes the proof that the overall problem can be reformulated as a semidefinite
optimization problem. It is trivial to see that the same approach also applies to the case
of a European put option, i.e. V*(ξ) = max(0, K_0 − ξ).
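The exact reformulation above requires SDP machinery, but a crude numerical check is sometimes handy. The sketch below (Python with NumPy/SciPy; the grid, moments, and the benchmark price are all hypothetical) approximates the primal problem behind π* by restricting Q to a finite price grid and solving the resulting linear program:

```python
import numpy as np
from scipy.optimize import linprog

# Discretized support of the underlying price xi.
xi = np.linspace(0.0, 100.0, 401)
n = len(xi)

# Moment constraints: E_Q[xi] = b1, E_Q[xi^2] = b2 (mean 40, std 8; hypothetical).
b1, b2 = 40.0, 40.0**2 + 8.0**2

# One benchmark call (strike 40) with assumed market price h and penalty weight w'.
H = np.maximum(xi - 40.0, 0.0)
h, w = 4.0, 1.0

# Target payoff: call with strike 45.
V = np.maximum(xi - 45.0, 0.0)

# Variables: grid probabilities q plus a slack t >= |h - E_Q[H]|.
# Maximize E_Q[V] - w*t  <=>  minimize -V^T q + w*t.
c = np.concatenate([-V, [w]])
A_eq = np.vstack([np.ones(n), xi, xi**2])        # sum(q)=1 and the two moments
A_eq = np.hstack([A_eq, np.zeros((3, 1))])
b_eq = [1.0, b1, b2]
A_ub = np.vstack([np.concatenate([-H, [-1.0]]),  #  h - q@H <= t
                  np.concatenate([H, [-1.0]])])  #  q@H - h <= t
b_ub = [-h, h]
bounds = [(0, None)] * (n + 1)

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
pi_upper = -res.fun  # discretized approximation of pi^*(V*)
print(res.status, round(pi_upper, 4))
```

The two-point measure placing half its mass at 32 and half at 48 is feasible here and yields objective 1.5 with zero penalty, so the LP optimum is at least that large; refining the grid tightens the approximation toward the true bound.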
2.2.3 Numerical Examples
In this section, we compare the values of the metric ϑ(V*) calculated with a set of finite
families of pricing measures against those calculated with a set of infinite families of
measures. The infinite families
of pricing measures are in particular defined via the first two raw moments. We follow the
numerical example presented in [Lo, 1987], which illustrates the practical relevance of his
semi-parametric bound when there are several competing specifications for the stochastic
process of an underlying security.
In Lo's experiment, he considered two leading classes of price processes, lognormal
diffusions and mixed diffusion-jump processes, as two candidates driving the price
dynamics of the underlying security. A remarkable fact is that for any given dataset,
risk-neutral variances of these two processes are numerically identical, which implies that
the semi-parametric bound derived based on any specification within these two classes of
processes immediately applies to all other specifications. In our experiment, Lo’s setup
for the two processes will be the choice for the set of finite families of pricing measures,
and the associated moments will be the condition for an alternative set of infinite families
of pricing measures to satisfy. We compare the values of ϑ(V∗) evaluated between these
two sets. The lognormal diffusion and mixed diffusion-jump processes are defined as
follows. Note that the notation below is the same as that used in [Lo, 1987], and may
overlap with the notation used in other parts of this thesis.
    dS_1 = α_1 S_1 dt + σ_1 S_1 dW,
    dS_2 = [α_2 − λ(k − 1)] S_2 dt + σ_2 S_2 dW + (γ − 1) S_2 dN_λ,

where ln γ ~ N(β, δ²) and k = E[γ]. We omit the details of the above popular processes,
but provide the analytical forms of their risk-neutral variances:

    V_1 = e^{2rτ} · [e^{σ_1² τ} − 1],
    V_2 = e^{2rτ} · [e^{(λ(k−1)² + σ_2² + λσ_γ²) τ} − 1],

where σ_γ² = var[γ] = e^{2β+δ²}(e^{δ²} − 1). Besides the parameters of the stochastic processes,
parameter r (resp. τ) denotes the risk-free rate (resp. expiration time). Having identical
risk-neutral variances (V_1 = V_2) implies that

    σ_1² = λ(k − 1)² + σ_2² + λσ_γ².     (2.16)
In [Lo, 1987], a diffusion model is selected by setting σ_1 = s′/√52, where s′ is the annual
compound standard deviation. A mixed diffusion-jump model is selected by setting k = 1,
λ = 0.25, σ_γ² = φ_1 · σ_1², and σ_2² = φ_2 · σ_1², where φ_1 = 3.6, φ_2 = 0.1. This setting ensures
that the condition (2.16) holds. From here on, Q_B (resp. Q_M) denotes the diffusion
model (resp. mixed diffusion-jump model) with Lo's parameter setting. The European
call option prices of these two models for various τ, s′, K are presented in Table A.1 under
the columns indexed by C_B and C_M.
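As a quick sanity check on this parameterization (Python with NumPy; the value of s′ is an arbitrary choice), the sketch below verifies that Lo's setting satisfies condition (2.16):

```python
import numpy as np

s_prime = 0.4                                 # annual compound standard deviation (arbitrary)
sigma1_sq = (s_prime / np.sqrt(52.0)) ** 2    # weekly diffusion variance

# Lo's mixed diffusion-jump setting: k = 1, lambda = 0.25, phi1 = 3.6, phi2 = 0.1.
k, lam, phi1, phi2 = 1.0, 0.25, 3.6, 0.1
sigma_gamma_sq = phi1 * sigma1_sq
sigma2_sq = phi2 * sigma1_sq

# Condition (2.16): sigma1^2 = lambda*(k-1)^2 + sigma2^2 + lambda*sigma_gamma^2.
rhs = lam * (k - 1.0) ** 2 + sigma2_sq + lam * sigma_gamma_sq
print(np.isclose(sigma1_sq, rhs))  # True, since phi2 + lam*phi1 = 0.1 + 0.9 = 1
```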
We now consider a set of European call options as benchmark options. We generate
their prices based on an alternative mixed diffusion-jump model Q_b parameterized by
k = 1, φ_2 = 0.15, λ = 0.25, which is different from the one selected by Lo but has the
same risk-neutral variance. Note that φ_1 is uniquely determined after setting k, φ_2, λ.
The prices for various τ, s′, K are also listed in Table A.1 under the columns indexed by C_b.
In particular, we choose only the portion of the generated option prices with strike
prices K = 30, 35, 40 as the prices of the benchmark options (h_l) and leave the rest
for comparison with the target payoff (V*). We then conduct the experiment as follows.
First, we consider the set of finite families of measures

    Q_fin := {Q_B, Q_M, Q_b},

and the set of infinite families of measures based on the first two raw moments

    Q_mom := { Q | E_Q[S_T] = S_0 e^{rτ},  E_Q[S_T²] = V_1 },

where S_0 (resp. S_T) here denotes the initial (resp. terminal) price of the underlying asset.
Based on each set, we will evaluate the metric ϑ(V∗) for the target payoff V∗(ξ) = (ξ−K)+
for K = 45, 50. Then we add to the set Q_fin a sequence of mixed diffusion-jump models
with parameters φ_2 = 0, 0.1, ..., 0.9, 0.99 and λ = 2.5e−7, and examine the values of ϑ. Note
that the maximum of ϑ(V*) among all possible specifications of the mixed diffusion-jump
models is attained by adding such a sequence, which is verified via optimization.
              φ_2 = 0.1, λ = 0.25      φ_2 = 0.5~0.99, λ = 2.5e−7   φ_2 = 0~0.99, λ = 2.5e−7
 s′   K     τ = 1  τ = 12  τ = 24     τ = 1  τ = 12  τ = 24        τ = 1  τ = 12  τ = 24
 0.2  45    0.000  0.000   0.000      0.000  0.014   0.002         0.000  0.014   0.002
 0.2  50    0.000  0.000   0.000      0.000  0.000   0.000         0.000  0.000   0.000
 0.4  45    0.000  0.000   0.000      0.027  0.013   0.009         0.027  0.013   0.009
 0.4  50    0.000  0.000   0.000      0.000  0.024   0.012         0.000  0.024   0.012
 0.6  45    0.000  0.000   0.001      0.044  0.022   0.053         0.051  0.022   0.057
 0.6  50    0.000  0.000   0.000      0.000  0.030   0.024         0.000  0.030   0.024
 0.8  45    0.000  0.000   0.007      0.039  0.033   0.104         0.056  0.033   0.148
 0.8  50    0.000  0.000   0.000      0.000  0.052   0.067         0.000  0.052   0.067

Table 2.1: ϑ(V*) of Q_fin for various values of parameters s′, K, τ.
In Table 2.1, the first three columns present the values of ϑ(V*) evaluated based on
Q_fin, and columns 4–6 (resp. 7–9) present the values when additional diffusion-jump
models with parameters φ_2 = 0.5, ..., 0.9, 0.99 (resp. φ_2 = 0, ..., 0.9, 0.99) and λ = 2.5e−7
are added. First, notice in Table 2.1 that in several cases the values of ϑ(V*) are simply
zero; that is, except for the benchmark model Q_b, which achieves optimality for both the
upper and lower bound problems (π* and π_*), all other models are discarded. Some of these
discarded models have a significant impact on the price of the target payoff; however, their
impact is mostly discounted by the respective calibration errors. This sheds some light
on one potential limitation of evaluating the measure ϑ based solely on finite families of
pricing measures: if only restrictive functional forms of distributions are available for
traders to represent their views of price dynamics, this can lead to a trivial conclusion
such as zero model uncertainty, which forgoes all the information that traders have
provided. On the other hand, as shown in Table 2.2, the evaluation
 s′   K     τ = 1   τ = 12   τ = 24
 0.2  45    0.044   0.552    1.215
 0.2  50    0.021   0.182    0.439
 0.4  45    0.167   1.944    2.792
 0.4  50    0.070   0.910    2.269
 0.6  45    0.370   2.842    3.402
 0.6  50    0.151   2.431    4.351
 0.8  45    0.657   3.454    3.566
 0.8  50    0.285   3.908    5.801

Table 2.2: ϑ(V*) of Q_mom for various values of parameters s′, K, τ, where w′ = 1.
based on the first two moments of infinite families of pricing measures always provides
nontrivial values of ϑ(V*), which embody traders' concern about other possible
specifications through their moment information. Meanwhile, the approach retains the feature
of re-weighting the price impact of each model with respect to the associated calibration
error. Also, observe in Table 2.2 that with one week to maturity the values of ϑ(V*)
are fairly tight with respect to the values evaluated based on finite families of pricing
measures, which follows closely the behavior of Lo's bound. Thus, in cases where zero
model uncertainty is reported by the approach that takes into account only finite
families of pricing measures but is not trusted by the traders, these tight bounds can be
particularly useful as a second reference for traders. In general, the values of
ϑ(V*) presented in Table 2.2 increase with the variance and the time to maturity. This
behavior is plausible: the larger the variance, the larger the weight that can possibly
be placed on the tails of distributions, and therefore the larger the impact on the price. In
addition, this impact can only be magnified as the time to maturity increases.
One additional aspect worth noting is that the evaluation of ϑ based on finite families
of pricing measures can in some cases be very sensitive to adjustments of the penalty
parameter w′. By slightly increasing w′ to 1.6, the values of ϑ(V*) in the setting of Table
A.2 all turn to zero. This may make it difficult for traders to adjust their
aversion towards calibration errors, as the trivial conclusion of zero model uncertainty
can easily result from a slight adjustment of the penalty parameter. On the other hand, as
shown in Tables A.2 and A.3, the evaluation of ϑ(V*) based on the moments of infinite
families of pricing measures continues to report meaningful values when we increase w′ to
2 and to 5. In fact, the result in Table A.3 gives the "minimum" possible values of ϑ(V*), which
are invariant to any further increase of w′. These minimum values of ϑ(V*) are attained
by a pricing measure that perfectly replicates the benchmark prices, so that the penalty
is always zero.
2.3 Tractability of Accounting for Multivariate Moment Information
Real-life applications often involve multiple random quantities of interest. For example,
in option pricing the value of an option may depend on multiple assets, as in popular
basket options. The incorporation of multivariate moment information, while practically
useful, is however much more challenging than the univariate case. Several instances,
for example, are known to be computationally intractable (NP-hard). The focus of this
section is to shed light on the tractability of a special class of multivariate moment
problems that not only allow for the incorporation of high-order moments but are also
amenable to SDP reformulations.
To unify the presentation of all relevant results, in the rest of this section the notation
D denotes a set of distributions that captures available information about Q. The objective
here is to solve the following optimization problem efficiently:

    sup_{Q∈D} E_Q[h(ξ)].     (2.17)

In the cases that the objective function is "piecewise concave" in ξ, the problem (2.17)
is known to be tractable for incorporating the following forms of distribution sets: D
characterized by fixed support and mean [Dupacova, 1987], by fixed mean and covariance
[Bertsimas and Popescu, 2002], and by fixed ranges of support, mean, and an upper bound
on covariance [Delage and Ye, 2010]. The problem (2.17), however, has also been proven
intractable when incorporating a set D that fixes the support, mean, and covariance of
distributions, or that fixes the first d moments with d ≥ 4 [Bertsimas and Popescu, 2005].
We consider here two special forms of distribution sets that incorporate the information
of marginal higher moments. In the following descriptions, f_i denotes a univariate
distribution, and Q(f_1, ..., f_n) represents a multivariate probability measure Q whose
marginal distributions are f_i, i = 1, ..., n. From here on, for simplicity, the notation Q is
also used as shorthand for Q(f_1, ..., f_n).
• Marginal higher moments:

    D_m := { Q(f_1, ..., f_n) | E_{f_i}[φ_(i)(ξ)] = b_(i) },

where φ_(i)(ξ) = [1 ξ_i ξ_i² · · · ξ_i^d]^T, and b_(i) denotes a vector of associated univariate
moments.

• Marginal higher moments and a covariance matrix:

    D_mc := { Q(f_1, ..., f_n) | E_{f_i}[φ_(i)(ξ)] = b_(i),  E_Q[ξ] = µ,  E_Q[(ξ − µ)(ξ − µ)^T] = Σ },

where φ_(i)(ξ) = [1 ξ_i³ ξ_i⁴ · · · ξ_i^d]^T, and b_(i) (resp. µ, Σ) denotes a vector of
associated univariate moments (resp. a mean vector, a covariance matrix).
In addition, we focus on functions h(ξ) of the following form:

    h(ξ) := max_{k∈{1,...,K}} h_k(ξ),    h_k(ξ) := Σ_{i=1}^{n} c_{k,i}^T h_i(ξ),

where h_i(ξ) := [1 ξ_i ξ_i² · · · ξ_i^{d′}]^T, ξ_i denotes the i-th univariate component of the
random vector ξ, and d′ denotes the order of ξ_i. We shall call such a class of functions
piecewise separable functions. For example, in portfolio selection problems, one would
consider the special instance h_k(ξ) = Σ_{i=1}^{n} a_k · x_i · ξ_i + b_k, where x_i (resp. ξ_i) denotes the
money invested in (resp. the random return of) a single asset, and Σ_{i=1}^{n} x_i · ξ_i represents a
portfolio. The piecewise structure of h(ξ) in this case allows for modeling a wide range
of utility and risk measure functions (see the more detailed discussion in Section 4.2).
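As a minimal illustration (Python with NumPy; the investments, returns, and piece coefficients are all hypothetical), the sketch below evaluates a two-piece separable function of a portfolio return:

```python
import numpy as np

# Hypothetical portfolio: money invested per asset and a realized return vector.
x = np.array([100.0, 50.0, 50.0])     # x_i: investment in asset i
xi = np.array([0.02, -0.01, 0.03])    # xi_i: random return of asset i

# Pieces h_k(xi) = a_k * sum_i x_i*xi_i + b_k (a two-piece function, for illustration).
a = np.array([1.0, 2.0])              # slopes a_k
b = np.array([0.0, 0.5])              # intercepts b_k

portfolio = x @ xi                    # sum_i x_i * xi_i = 3.0
h = np.max(a * portfolio + b)         # piecewise separable h(xi) = max_k h_k(xi)
print(portfolio, h)                   # 3.0 6.5
```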
It is often more practical in real-life settings to consider marginal higher moments
rather than joint higher moments. In the latter case, the number of parameters to be
estimated can be extremely large, as it grows exponentially in the dimension of the
random quantities of interest, which could potentially lead to unstable estimates of
joint moments. In the case that only marginal higher moments are available, we show in
the following proposition that problem (2.17) based on the distribution set D_m can be
solved efficiently.
Proposition 2.3.1. The problem

    sup_{Q(f_1,...,f_n)}  E_Q[ max_{k∈{1,...,K}} Σ_{i=1}^{n} c_{k,i}^T h_i(ξ) ]
    subject to  E_{f_i}[φ_(i)(ξ)] = b_(i),  i = 1, ..., n′,

can be solved efficiently via a semidefinite programming problem, where h_i(ξ) = [1 ξ_i · · ·
ξ_i^{d′}]^T and φ_(i)(ξ) = [1 ξ_i · · · ξ_i^d]^T.

Proof. Using duality theory of semi-infinite programming, we can derive the following
dual problem:

    minimize_{z_(i)}  Σ_{i=1}^{n′} b_(i)^T z_(i) :
        Σ_{i=1}^{n′} z_(i)^T φ_(i)(ξ) ≥ max_{k∈{1,...,K}} Σ_{i=1}^{n} c_{k,i}^T h_i(ξ),  ∀ ξ.     (2.18)

The constraint in the above problem can be equivalently reformulated as a system of K
constraints:

    Σ_{i=1}^{n′} z_(i)^T φ_(i)(ξ) ≥ Σ_{i=1}^{n} c_{k,i}^T h_i(ξ),  ∀ ξ,  k = 1, ..., K.     (2.19)
Based on (2.19), the problem (2.18) can be equivalently expressed in the following general
form:

    minimize_{z_(i)}  Σ_{i=1}^{n′} b_(i)^T z_(i) :
        Σ_{i=1}^{max(n′,n)} θ_k(z_(i))^T g_(i)(ξ) ≥ 0,  ∀ ξ,  k = 1, ..., K,

where g_(i)(ξ) = [1 ξ_i · · · ξ_i^{d″(i)}]^T, and θ_k(z_(i)) denotes the coefficient vector after shifting
the terms on the right-hand side of the inequality in (2.19) to the left-hand side and
re-grouping the coefficients with respect to each variable ξ_i. The above optimization
problem can be equivalently expressed as

    minimize_{z_(i)}  Σ_{i=1}^{n′} b_(i)^T z_(i) :
        inf_ξ Σ_{i=1}^{max(n′,n)} θ_k(z_(i))^T g_(i)(ξ) ≥ 0,  k = 1, ..., K.

By introducing free variables t_{k,i}, each k-th constraint can be equivalently reformulated
into the following constraints:

    Σ_{i=1}^{max(n′,n)} t_{k,i} ≥ 0,    inf_{ξ_i} θ_k(z_(i))^T g_(i)(ξ) ≥ t_{k,i},  i = 1, ..., max(n′, n).

The constraints on the right-hand side are equivalent to

    θ_k^*(t_{k,i}, z_(i))^T g_(i)(ξ) ≥ 0,  ∀ ξ_i,  i = 1, ..., max(n′, n),     (2.20)

where θ_k^*(t_{k,i}, z_(i)) denotes the coefficient vector after the same kind of shifting operation
done for θ_k(z_(i)). Based on Proposition 2.1.1, each i-th constraint in (2.20) is known to
be SDP-representable. This completes the proof.
Our motivation to study the distribution set Dmc, which additionally accounts for
a covariance structure, comes from the application of a linear factor model in moment
problems. Linear factor models are a popular approach used to reduce the dimensionality
of the random quantities. In these models, the random vector ξ is assumed to be driven
by a lower-dimension factor vector ζ so that ξ = Vζ+ε holds, where V is a factor loading
matrix, and ε is a vector of residual returns with zero mean and zero correlation. The
application of factor models in portfolio selection problem can be found in Section 4.5.2.
It is often assumed in factor models that the components of $\zeta$ are mutually uncorrelated and that $\zeta$ is uncorrelated with $\varepsilon$. Thus, if we reformulate moment problems based on a factor model, not only the higher marginal moments of the random factors $\zeta$ but also the zero-correlation structure among $\zeta$ and $\varepsilon$ needs to be incorporated into a distribution set, which is exactly the distribution set $\mathcal{D}_{mc}$ with a matrix $\Sigma$ whose off-diagonal elements are zero.
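As a small illustration of the covariance structure such a factor model induces, the sketch below builds $\mathrm{Cov}(\xi) = V\,\mathrm{Cov}(\zeta)\,V^T + \mathrm{Cov}(\varepsilon)$ for a three-asset, two-factor example with uncorrelated factors and residuals. The loadings and variances are hypothetical illustrative numbers, not taken from the thesis.

```python
# Sketch: covariance implied by a linear factor model xi = V zeta + eps,
# with uncorrelated factors zeta and uncorrelated residuals eps.

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

# Factor loadings: 3 assets driven by 2 factors (hypothetical values).
V = [[1.0, 0.2],
     [0.5, 0.8],
     [0.3, 0.3]]

# Zero correlation among factors and among residuals -> diagonal covariances.
Sigma_zeta = [[0.04, 0.0],
              [0.0, 0.09]]
Sigma_eps_diag = [0.01, 0.02, 0.015]

# Cov(xi) = V Cov(zeta) V^T + Cov(eps).
Sigma_xi = matmul(matmul(V, Sigma_zeta), transpose(V))
for i in range(3):
    Sigma_xi[i][i] += Sigma_eps_diag[i]
```

The resulting matrix is symmetric by construction, with only the diagonal entries carrying the residual variances, which is the structure exploited by the set $\mathcal{D}_{mc}$ above.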
The problem (2.17) based on the set $\mathcal{D}_{mc}$ is known to be tractable for generating the tightest bound when the order of the polynomials in the function $h(\xi)$ is less than or equal to two, i.e. $d' \le 2$, and only up to second-order moment information is specified in the distribution set $\mathcal{D}_{mc}$. In Theorem 2.3.1, we consider the same maximum order of polynomials in $h(\xi)$, i.e. $d' = 2$, but further address higher marginal moments in the distribution set $\mathcal{D}_{mc}$, i.e. $d > 2$. An SDP relaxation formulation is provided that generates a tight upper bound. The bound is usefully tight in the sense that it is guaranteed to be tighter than the bound generated based only on the first two moments (mean and covariance) or only on marginal higher moments.
Theorem 2.3.1. Consider the following optimization problem:
\[
\begin{aligned}
\sup_{Q(f_1,\dots,f_n)} \quad & \mathbb{E}_Q\Big[\max_{k\in\{1,\dots,K\}} \sum_{i=1}^{n} c_{k,i}^T h_i(\xi)\Big] \\
\text{subject to} \quad & \mathbb{E}_{f_i}[\phi_{(i)}(\xi)] = b_{(i)}, \;\; i=1,\dots,n, \quad \mathbb{E}_Q[\xi] = \mu, \quad \mathbb{E}_Q[(\xi-\mu)(\xi-\mu)^T] = \Sigma,
\end{aligned}
\]
where $h_i(\xi) = [\,1\;\; \xi_i\;\; \xi_i^2\,]^T$ and $\phi_{(i)}(\xi) = [\,1\;\; \xi_i\;\; \cdots\;\; \xi_i^{d}\,]^T$. The following SDP problem provides an upper bound on the optimal value of the above problem, which is at least as tight as the one generated based on the set $\mathcal{D}_{mc}$ with only mean and covariance or with only marginal moments:
\[
\begin{aligned}
\underset{\nu_k,\,\varrho_k,\,\Gamma_k,\,\eta_{k,i}}{\text{maximize}} \quad & \sum_{k=1}^{K}\sum_{i=1}^{n} c_{k,i}^T\, [\,\nu_k \;\; \varrho_k(i) \;\; \Gamma_k(i,i)\,]^T \\
\text{subject to} \quad & \sum_{k=1}^{K} \varrho_k = \mu, \quad \sum_{k=1}^{K} \Gamma_k = \Sigma + \mu\mu^T, \qquad (2.21)\\
& \sum_{k=1}^{K} \eta_{k,i} = b_{(i)}, \quad \sum_{k=1}^{K} \nu_k = 1, \\
& \begin{bmatrix} \Gamma_k & \varrho_k \\ \varrho_k^T & \nu_k \end{bmatrix} \succeq 0, \quad (\nu_k,\, \varrho_k(i),\, \Gamma_k(i,i),\, \eta_{k,i}) \in \mathcal{K}_i, \;\; i = 1,\dots,n,
\end{aligned}
\]
where each $\mathcal{K}_i$ denotes a positive semidefinite cone $\Omega$ (Theorem 2.1.3), and $\varrho_k(i)$ (resp. $\Gamma_k(i,i)$) denotes the $i$th (resp. $(i,i)$th) entry of the vector $\varrho_k$ (resp. the matrix $\Gamma_k$).
Proof. Our first step is to partition the whole domain $\Re^n$ according to the piecewise structure of the objective function $h(\xi) := \max_{k\in\{1,\dots,K\}} h_k(\xi)$. Let $P_k$ denote the partition on which the function $h_k(\xi)$ attains the maximum of $\max_{k\in\{1,\dots,K\}} h_k(\xi)$, i.e. $\Re^n = P_1 \cup P_2 \cup \cdots \cup P_K$ and $P_{k'} \cap P_{k''} = \emptyset$ for $k' \ne k''$. We accordingly define new measures $F_k$, $k = 1,\dots,K$, whose domains are respectively $P_k$, $k = 1,\dots,K$. Without loss of generality, $Q = \sum_{k=1}^{K} F_k$ follows. Based on the new measures $F_k$, $k = 1,\dots,K$, we can reformulate the problem as follows:
\[
\begin{aligned}
\underset{F_k}{\text{maximize}} \quad & \sum_{k=1}^{K} \mathbb{E}_{F_k}[h_k(\xi)] \\
\text{subject to} \quad & \sum_{k=1}^{K} \mathbb{E}_{F_k}[\xi] = \mu, \quad \sum_{k=1}^{K} \mathbb{E}_{F_k}[\xi\xi^T] = \Sigma + \mu\mu^T, \qquad (2.22)\\
& \sum_{k=1}^{K} \mathbb{E}_{F_k}[\phi_{(i)}(\xi)] = b_{(i)}, \quad \sum_{k=1}^{K} \mathbb{E}_{F_k}[I_{\Re^n}] = 1. \qquad (2.23)
\end{aligned}
\]
Note that in the above formulation we relax the condition that the measure $F_k$ is supported on the partition $P_k$. Next, we replace $\mathbb{E}_{F_k}[\xi]$ by $\varrho_k$, $\mathbb{E}_{F_k}[\xi\xi^T]$ by $\Gamma_k$, $\mathbb{E}_{F_k}[\phi_{(i)}(\xi)]$ by $\eta_{k,i}$, and $\mathbb{E}_{F_k}[I_{\Re^n}]$ by $\nu_k$, where $\varrho_k$ and $\eta_{k,i}$ are in vector form and $\Gamma_k$ is in matrix
form. Let $\varrho_k(i)$ (resp. $\Gamma_k(i,i)$) denote the $i$th (resp. $(i,i)$th) entry of $\varrho_k$ (resp. $\Gamma_k$). We thus arrive at the following relaxed problem:
\[
\begin{aligned}
\underset{\nu_k,\,\varrho_k,\,\Gamma_k,\,\eta_{k,i}}{\text{maximize}} \quad & \sum_{k=1}^{K}\sum_{i=1}^{n} c_{k,i}^T\, [\,\nu_k \;\; \varrho_k(i) \;\; \Gamma_k(i,i)\,]^T \\
\text{subject to} \quad & \sum_{k=1}^{K} \varrho_k = \mu, \quad \sum_{k=1}^{K} \Gamma_k = \Sigma + \mu\mu^T, \qquad (2.24)\\
& \sum_{k=1}^{K} \eta_{k,i} = b_{(i)}, \quad \sum_{k=1}^{K} \nu_k = 1.
\end{aligned}
\]
Now, we add the following constraint:
\[
(\nu_k,\, \varrho_k(i),\, \Gamma_k(i,i),\, \eta_{k,i}) \in \mathcal{K}_i,
\]
where each $\mathcal{K}_i$ denotes a positive semidefinite cone $\Omega$ (Theorem 2.1.3). We now show that the relaxation (2.21) provides a bound that is tighter than the one generated based on $\mathcal{D}_{mc}$ with only marginal moments. Let $(\nu_k^*, \varrho_k^*, \Gamma_k^*, \eta_{k,i}^*)$ denote the optimal solution of (2.24). Without loss of generality, we assume first that all $\nu_k^* > 0$. Due to the cone property of $\mathcal{K}_i$, there must exist marginal distributions $f_{ik}$ that satisfy the marginal moments $\varrho_k^*(i)/\nu_k^*$, $\Gamma_k^*(i,i)/\nu_k^*$, and $\eta_{k,i}^*/\nu_k^*$. We can always construct a product measure $F_k := f_{1k} \times \cdots \times f_{nk}$ based on the marginal distributions $f_{ik}$. We can finally define a new measure $F = \sum_{k=1}^{K} F_k \cdot \nu_k^*$ that straightforwardly satisfies $\mathbb{E}_F[\xi] = \mu$ and $\mathbb{E}_F[\phi_{(i)}(\xi)] = b_{(i)}$. Using such a measure, we can derive the following inequalities:
\[
\begin{aligned}
\sup_{Q \in \mathcal{D}_m} \mathbb{E}_Q[h(\xi)] &\ge \mathbb{E}_F\big[\max_{k=1,\dots,K} h_k(\xi)\big] \\
&= \sum_{k=1}^{K} \mathbb{E}_{F_k}\big[\max_{k=1,\dots,K} h_k(\xi)\big] \cdot \nu_k^* \\
&\ge \sum_{k=1}^{K} \mathbb{E}_{F_k}[h_k(\xi)] \cdot \nu_k^* \qquad (2.25)\\
&= \sum_{k=1}^{K}\sum_{i=1}^{n} c_{k,i}^T\, [\,\nu_k^* \;\; \varrho_k^*(i) \;\; \Gamma_k^*(i,i)\,]^T.
\end{aligned}
\]
This implies that the relaxation problem (2.21) is at least as tight as the bound generated based only on marginal moments. Next, we add the constraint
\[
\begin{bmatrix} \Gamma_k & \varrho_k \\ \varrho_k^T & \nu_k \end{bmatrix} \succeq 0, \qquad (2.26)
\]
which is a necessary and sufficient condition for the existence of a multivariate probability measure $F_k$ that satisfies
\[
\int_{\Re^n} I_{\Re^n}\, dF_k(\xi) = 1, \quad \int_{\Re^n} \xi\, dF_k(\xi) = \varrho_k/\nu_k, \quad \int_{\Re^n} \xi\xi^T\, dF_k(\xi) = \Gamma_k/\nu_k,
\]
given that $\nu_k > 0$. This is because constraint (2.26) is equivalent to $\frac{\Gamma_k}{\nu_k} - \frac{\varrho_k}{\nu_k}\frac{\varrho_k^T}{\nu_k} \succeq 0$, obtained by dividing (2.26) by $\nu_k$ and using the Schur complement. Thus, we can always construct a distribution, e.g. a multivariate normal distribution, with mean $\frac{\varrho_k}{\nu_k}$ and covariance $\frac{\Gamma_k}{\nu_k} - \frac{\varrho_k}{\nu_k}\frac{\varrho_k^T}{\nu_k}$. Following similar arguments, we can construct, based on the optimal solution $(\nu_k^*, \varrho_k^*, \Gamma_k^*, \eta_{k,i}^*)$ of (2.24), a new measure $F = \sum_{k=1}^{K} F_k \cdot \nu_k^*$, where $\int_{\Re^n} I_{\Re^n}\, dF_k(\xi) = 1$, $\int_{\Re^n} \xi\, dF_k(\xi) = \varrho_k^*/\nu_k^*$, and $\int_{\Re^n} \xi\xi^T\, dF_k(\xi) = \Gamma_k^*/\nu_k^*$, and such a measure satisfies the given mean $\mu$ and second-order moment $\Sigma + \mu\mu^T$ conditions. Following inequality arguments similar to (2.25), we conclude that the relaxation problem (2.21) is at least as tight as the bound generated based only on mean and covariance.
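The Schur-complement step used above can be checked numerically in the scalar case: the block matrix with entries $\Gamma_k$, $\varrho_k$, $\nu_k$ is positive semidefinite exactly when the implied covariance $\Gamma_k/\nu_k - (\varrho_k/\nu_k)^2$ is nonnegative (for $\nu_k > 0$). A minimal sketch with hypothetical numbers:

```python
# Scalar illustration of the Schur-complement equivalence:
# [[gamma, rho], [rho, nu]] PSD  <=>  gamma/nu - (rho/nu)^2 >= 0 when nu > 0.

def block_psd(gamma, rho, nu):
    # A symmetric 2x2 matrix is PSD iff its diagonal entries and determinant are >= 0.
    return gamma >= 0 and nu >= 0 and gamma * nu - rho * rho >= 0

def schur_cov_psd(gamma, rho, nu):
    # Nonnegativity of the covariance implied by the normalized moments of F_k.
    return gamma / nu - (rho / nu) ** 2 >= 0

# The two conditions agree on PSD and non-PSD instances alike.
for gamma, rho, nu in [(2.0, 1.0, 1.0), (0.5, 1.0, 1.0), (1.0, 0.5, 0.5)]:
    assert block_psd(gamma, rho, nu) == schur_cov_psd(gamma, rho, nu)
```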
Remark 2.3.1. Proposition 2.3.1 and Theorem 2.3.1 rely on the piecewise-separable
structure of objective functions, which is a useful structure in generalizing the tractability
of univariate moment problems to multivariate settings.
2.4 Conclusion
In this chapter, we first exploit the connection between the theory of moment problems
and modern conic optimization to develop a tractable moment-based setting in evaluating
market-price based convex risk measures. The moment-based setting is useful as it allows
for incorporation of a much wider class of distributions that results in more revealing
measures than would otherwise be possible when quantifying model misspecification in
derivative pricing. New tractability results of solving multivariate moment problems
are also presented, where several SDP reformulations are provided for moment problems
incorporating higher marginal moments. This class of multivariate moment problems can
serve as powerful modeling tools in the applications for which only marginal moments
are available. This is for example the case for many financial applications, where joint
moments of random returns such as correlations are often hard to estimate. In our future work, we will apply this class of moment problems to the problem of portfolio selection to verify its practical value.
Chapter 3
Accounting for Stochastic Moments
In this chapter, we deal with the second layer of uncertainty addressed in the introduction in developing the notion of comprehensive robustness, namely stochastic moments. It has been assumed in the formulation of classical moment problems that moments are static, either having fixed values or falling into a fixed range of values. This is opposed to the notion in time-series studies that moments are dynamic and often driven by an underlying stochastic process. Such a discrepancy makes it difficult to apply moment-based bounds in practice, since they may not conform well to richer distributional information governing the true dynamics of moments. In particular, many decision environments such as financial markets are known to undergo phase transitions in a repetitive manner, changing from one state to another. Such a transition is often accompanied by abrupt changes of moments, such as the soaring of volatility. In such instances, the use of moment-based bounds that account for only a single state can be misleading and give a false sense of robustness.
While the presence of multiple states in many environments, each having for example a distinct trend and volatility, is rarely disputed, the exact form of distribution that characterizes each state is often hard to specify. This leads to the following question: is it possible to derive distribution-free bounds that account for the existence of multiple states, together with their associated likelihoods and moment characterizations? This motivates the development of a new framework presented in this chapter: a stochastic semidefinite optimization model that incorporates the settings of classical moment problems as building blocks and uses the notion of recourse borrowed from Stochastic Programming to further take into account the stochastic nature of moments. As a result, more robust bounds are generated with respect to possible state realizations.
The remainder of the chapter is structured as follows. In Section 3.1, we first review the deterministic semidefinite models proposed by Bertsimas and Popescu (2002). In Section 3.2, we present the development of two-stage stochastic semidefinite models. We show in Section 3.3 that the framework is comprehensive and includes the deterministic and robust optimization counterparts as special limiting cases. We also show that the optimal bound value is equivalent to a Value-at-Risk quantity, and that the optimal solution can be obtained via simple sorting. Finally, in Section 3.4, the framework is applied to bounding the price of a European-style call option under regime switching. A moment-based lattice is additionally constructed for generating scenario-based moments. Computational experiments using the S&P 500 index as the underlying asset are performed to illustrate the advantages of the stochastic programming approach over the deterministic strategy.
3.1 Deterministic Semidefinite Optimization Models
In this section, we briefly review a class of deterministic SDP models introduced in [Bertsimas and Popescu, 2002]. This class of models, which applies to generating moment-based bounds for univariate random variables, can be solved efficiently for a wide range of specifications. To be specific, consider now the following formulation of the moment problems:
\[
\max_{Q} \;\; \mathbb{E}_Q[h(\xi)] \quad \text{subject to} \quad \mathbb{E}_Q[\xi^p] = b_p, \;\; p = 0, 1, \dots, d, \qquad (3.1)
\]
and
\[
\min_{Q} \;\; \mathbb{E}_Q[h(\xi)] \quad \text{subject to} \quad \mathbb{E}_Q[\xi^p] = b_p, \;\; p = 0, 1, \dots, d, \qquad (3.2)
\]
where $\xi$ here is a univariate random variable. The general tractability results for the above problems are given in the following theorem.

Theorem 3.1.1. [Bertsimas and Popescu, 2002] The tightest bounds for the problems (3.1) and (3.2) with a piecewise polynomial function $h : \mathbb{R} \to \mathbb{R}$, given moments of a univariate random variable $\xi$, can be computed efficiently as a semidefinite optimization problem.
From here on, for simplicity we will use the following generic SDP formulations to represent the SDP problems used for bounding the expected value of a pre-specified piecewise polynomial function $h(\xi)$, given moments $b_p$, $p = 0,\dots,d$:
\[
\begin{aligned}
\text{UB}_{\text{SDP}}(\mathbf{b}) := \underset{\mathbf{y},X,Z}{\text{minimize}} \quad & \sum_{p=0}^{d} b_p y_p \qquad (3.3)\\
\text{subject to} \quad & (X, Z, \mathbf{y}) \in G \subset (\mathcal{J}^{m_1}, \mathcal{J}^{m_2}, \mathbb{R}^{d+1}), \\
& X, Z \succeq 0,
\end{aligned}
\]
and
\[
\begin{aligned}
\text{LB}_{\text{SDP}}(\mathbf{b}) := \underset{\mathbf{y},X,Z}{\text{maximize}} \quad & \sum_{p=0}^{d} b_p y_p \qquad (3.4)\\
\text{subject to} \quad & (X, Z, \mathbf{y}) \in H \subset (\mathcal{J}^{m_1}, \mathcal{J}^{m_2}, \mathbb{R}^{d+1}), \\
& X, Z \succeq 0,
\end{aligned}
\]
where $\mathbf{b} := (b_0,\dots,b_d)$. $G$ and $H$ represent polyhedral sets corresponding to linear constraints introduced along the problem reformulation, and $\mathcal{J}^m$ denotes the set of real symmetric matrices of order $m$. In general, the matrices $X$ and $Z$ can be further reduced to a single positive semidefinite matrix; the use of two matrices here is for consistency with later results.
It has been assumed in the above model formulations that the moments $b_p$, $p = 0,\dots,d$, are deterministic parameters. As discussed earlier, such an assumption can be problematic in many practical applications. In the next section, we present a new optimization approach that further accounts for the stochastic nature of moments and generates more robust bounds with respect to possible state realizations.
3.2 A Stochastic Semidefinite Optimization Approach
In this section, we formulate models from which upper and lower bounds can be computed
in the presence of stochastic moments. In particular, we consider in our model that there are $S$ possible realizations of states, and each realization $s$ ($s = 1,\dots,S$) corresponds to a distribution characterized by its vector of moments $\mathbf{b}(w_s)$. Each realization of a state corresponds to a scenario $s$, and $P(w_s)$ is the probability that scenario $s$ will realize. $h(w_s)$ denotes the optimal bound value obtained from the deterministic model (3.1) or (3.2) using just the vector of moments $\mathbf{b}(w_s)$. The models are two-stage stochastic
versions of the semidefinite programs defined in (3.3) and (3.4). In particular, the model
is a two-stage stochastic semidefinite program with recourse [Ariyawansa and Zhu, 2006]
that is analogous to the two-stage stochastic linear programming with recourse framework
[Kall and Wallace, 1994], [Birge and Louveaux, 1997].
The stochastic model seeks to find a semidefinite matrix in the first-stage that results
in a bound such that the expected penalized difference between the first-stage bound and
the bound for a given state realization in the second stage is minimized. The first stage
objective and constraints are as in the problems (3.3) and (3.4). Thus, the first stage
bound can be seen to be robust with respect to possible state realizations. The bound
values for each state realization in the second stage are computed offline using (3.3) for
upper bounds and (3.4) for lower bounds before the formulation of the model. Thus,
the recourse decision for each scenario is determined upon realization of the state in the
second stage and so the model is a stochastic program with simple recourse [Everitt and
Ziemba, 1979].
Robust Upper Bound with Stochastic Moments
We present the two-stage stochastic semidefinite programming model (SSDP) for the
upper bound, where the optimal value is denoted as UBSSDP(b) given that b is a moment
vector for the first stage. The two-stage stochastic semidefinite programming model is as
follows.
minimizey,X,Z
d∑p=0
bpyp +R(y), (3.5)
subject to (X,Z,y) ∈ G ⊂ (Jm1 ,Jm2 ,Rd+1),
X,Z 0.
The recourse function R(y) is defined as R(y) := Ew[Q(y, w)], where the function
Q(y, w) is defined as follows
Q(y, w) := minimizey+,y− b+y+ + b−y−, (3.6)
subject to y+ − y− = h(w)−d∑p=0
bpyp,
y+, y− ≥ 0,
40
CHAPTER 3. ACCOUNTING FOR STOCHASTIC MOMENTS
where y = (y0, ..., yd), b+, b− ≥ 0, w denotes the random outcome (scenario), and
h(w) := minimizey,X,Z
d∑p=0
bp(w)yp, (3.7)
subject to (X, Z, y) ∈ G ⊂ (Jm1 ,Jm2 ,Rd+1),
X, Z 0.
In (3.7), bp(w) denotes the moments with respect to the random outcome (scenario)
w. Thus, we define a new upper bound based on the optimal solution of (3.5):
UBSSDP(b) :=d∑p=0
bpyoptp ,
where yoptp denotes the optimal solution of (3.5).
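Since (3.6) is a simple-recourse problem, its optimal value is available in closed form: $Q(\mathbf{y}, w) = b^+ (h(w) - x)^+ + b^- (h(w) - x)^-$ with $x = \sum_p b_p y_p$, so $R(\mathbf{y})$ reduces to a finite probability-weighted sum over scenarios. A minimal sketch (the scenario values are hypothetical, not from the thesis):

```python
# Evaluate the simple-recourse function R(y) = E_w[Q(y, w)] in closed form:
# Q(y, w) = b_plus * (h(w) - x)^+ + b_minus * (h(w) - x)^-,
# where x = sum_p b_p * y_p is the first-stage bound value.

def recourse(x, scenarios, b_plus, b_minus):
    """scenarios: list of (probability, h_w) pairs."""
    total = 0.0
    for prob, h_w in scenarios:
        diff = h_w - x
        total += prob * (b_plus * max(diff, 0.0) + b_minus * max(-diff, 0.0))
    return total

# Hypothetical first-stage bound value and scenario bounds.
x = 5.0
scenarios = [(0.5, 4.0), (0.5, 6.0)]
print(recourse(x, scenarios, b_plus=2.0, b_minus=1.0))  # 0.5*1*1 + 0.5*2*1 = 1.5
```

Underestimating a scenario bound is charged at rate $b^+$ and overestimating at rate $b^-$, which is exactly the penalty structure the first-stage problem (3.5) trades off against the bound itself.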
Robust Lower Bound with Stochastic Moments
To formulate a stochastic semidefinite programming model for the robust lower bound, we change the sign of $R(\mathbf{y})$ from positive to negative, since the problem (3.4) is a maximization problem:
\[
\begin{aligned}
\underset{\mathbf{y},X,Z}{\text{maximize}} \quad & \sum_{p=0}^{d} b_p y_p - R(\mathbf{y}) \qquad (3.8)\\
\text{subject to} \quad & (X, Z, \mathbf{y}) \in H \subset (\mathcal{J}^{m_1}, \mathcal{J}^{m_2}, \mathbb{R}^{d+1}), \\
& X, Z \succeq 0.
\end{aligned}
\]
To formulate the recourse function $R(\mathbf{y})$ in (3.8), we only need to modify $h(w)$ in (3.6) as follows:
\[
\begin{aligned}
h(w) := \underset{\mathbf{y},X,Z}{\text{maximize}} \quad & \sum_{p=0}^{d} b_p(w)\, y_p \qquad (3.9)\\
\text{subject to} \quad & (X, Z, \mathbf{y}) \in H \subset (\mathcal{J}^{m_1}, \mathcal{J}^{m_2}, \mathbb{R}^{d+1}), \\
& X, Z \succeq 0.
\end{aligned}
\]
Thus, we define a new lower bound based on the optimal solution of (3.8):
\[
\text{LB}_{\text{SSDP}}(\mathbf{b}) := \sum_{p=0}^{d} b_p y_p^{\text{opt}},
\]
where $y_p^{\text{opt}}$ denotes the optimal solution of (3.8).
3.3 Solution Features
In this section, we discuss several features of the robust bounds $\text{UB}_{\text{SSDP}}(\mathbf{b})$ and $\text{LB}_{\text{SSDP}}(\mathbf{b})$. In particular, we highlight their relation to the deterministic bounds $\text{UB}_{\text{SDP}}(\mathbf{b})$ and $\text{LB}_{\text{SDP}}(\mathbf{b})$.
We first consider the case that an identical moment vector b is used for both the first
stage in UBSSDP(b) (LBSSDP(b)) and in UBSDP(b) (LBSDP(b)). Such an identical moment
vector can for example be the average of the moment vectors associated with all possible
regime realizations. We show that under certain conditions on the penalty parameters
b+ and b− the bounds generated by the stochastic programming model are equivalent to
the ones generated by the deterministic model.
We then consider the case that the deterministic bounds UBSDP(b) and LBSDP(b) are
computed based on moment vectors that give the worst possible values of the bounds.
We show that the bounds UBSSDP(b) and LBSSDP(b) are always less conservative than the
worst-case bounds and therefore can be useful alternatives when additional information
of the underlying regime dynamics is revealed. Finally, we show that the robust bound
UBSSDP(b) (LBSSDP(b)) is actually equivalent to a Value at Risk (VaR) quantity where
the confidence level is a function of the penalty parameters of the stochastic model, and
thus can be computed easily via a sorting algorithm over a finite number of deterministic
bounds.
For simplicity, we present only the proofs of the upper bound results, since the derivations of the corresponding lower bound results are analogous. In the following theorem, we show that the robust bounds $\text{UB}_{\text{SSDP}}(\mathbf{b})$ and $\text{LB}_{\text{SSDP}}(\mathbf{b})$ are always more conservative than the deterministic bounds $\text{UB}_{\text{SDP}}(\mathbf{b})$ and $\text{LB}_{\text{SDP}}(\mathbf{b})$ given the same moment vector $\mathbf{b}$. The difference between a robust bound and a deterministic bound, i.e. $(\text{UB}_{\text{SSDP}}(\mathbf{b}) - \text{UB}_{\text{SDP}}(\mathbf{b}))$, can be seen as the extra premium that relates to the cost of hedging against over- or under-estimation of the bound.

Theorem 3.3.1. The optimal value $\text{UB}_{\text{SSDP}}(\mathbf{b})$ (resp. $\text{LB}_{\text{SSDP}}(\mathbf{b})$) satisfies
\[
\text{UB}_{\text{SSDP}}(\mathbf{b}) \ge \text{UB}_{\text{SDP}}(\mathbf{b}) \quad \big(\text{resp. } \text{LB}_{\text{SSDP}}(\mathbf{b}) \le \text{LB}_{\text{SDP}}(\mathbf{b})\big)
\]
for any $\mathbf{b}$.
Proof. Suppose that $\text{UB}_{\text{SSDP}}(\mathbf{b}) < \text{UB}_{\text{SDP}}(\mathbf{b})$ for some $\mathbf{b}$. Since the optimal solution of (3.5) is feasible for the problem (3.3), and the form of the function $\text{UB}_{\text{SSDP}}(\mathbf{b}) := \sum_{p=0}^{d} b_p y_p^{\text{opt}}$ is identical to the objective function of (3.3), this contradicts the fact that $\text{UB}_{\text{SDP}}(\mathbf{b})$ is the optimal value of (3.3).
The penalty parameters b+ and b− determine the risk aversion attitude of users to-
wards the difference between the first-stage bound and the bound on the option price for
a given regime realization in the second stage. Intuitively, the higher (lower) the b+ and
b− are, the more (less) sensitive the users are towards the difference. In the following
theorem, we show that when b+ ≤ 1 (resp. b− ≤ 1) the upper (resp. lower) bound
generated by the stochastic programming model is equivalent to the upper (resp. lower)
bound generated by the corresponding deterministic model.
Theorem 3.3.2. If b+ ≤ 1 (resp. b− ≤ 1), the optimal value UBSSDP(b) = UBSDP(b)
(resp. LBSSDP(b) = LBSDP(b)) for any b.
Proof. Based on Theorem 3.3.1, $\text{UB}_{\text{SSDP}}(\mathbf{b}) \ge \text{UB}_{\text{SDP}}(\mathbf{b})$ for any $\mathbf{b}$. Here, we further show that $\text{UB}_{\text{SSDP}}(\mathbf{b}) \le \text{UB}_{\text{SDP}}(\mathbf{b})$ for any $\mathbf{b}$ given that $b^+ \le 1$. First, let $\mathbf{y}' := (y'_0,\dots,y'_d)$ be the optimal solution of the problem (3.5) and let $\mathbf{y}'' := (y''_0,\dots,y''_d)$ be the optimal solution of the problem (3.3). Suppose now that $\text{UB}_{\text{SSDP}}(\mathbf{b}) > \text{UB}_{\text{SDP}}(\mathbf{b})$ given that $b^+ \le 1$, which can be equivalently written as
\[
\sum_{p=0}^{d} b_p y'_p = \sum_{p=0}^{d} b_p y''_p + \delta, \quad \delta > 0. \qquad (3.10)
\]
Then, by substituting the right-hand side of (3.10) for the optimal-value quantity associated with the solution $\mathbf{y}'$ in the objective function of (3.5), and further rearranging the objective function based on the following partitions of the scenarios $w_s$,
\[
\begin{aligned}
I_1 &:= \Big\{ w_s \;\Big|\; h(w_s) \ge \sum_{p=0}^{d} b_p y'_p \Big\}, \\
I_2 &:= \Big\{ w_s \;\Big|\; \sum_{p=0}^{d} b_p y''_p \le h(w_s) < \sum_{p=0}^{d} b_p y'_p \Big\}, \\
I_3 &:= \Big\{ w_s \;\Big|\; h(w_s) < \sum_{p=0}^{d} b_p y''_p \Big\},
\end{aligned}
\]
we obtain the following quantity:
\[
\sum_{p=0}^{d} b_p y''_p + \delta + \sum_{w_s \in I_1} P(w_s) \cdot s(w_s) + \sum_{w_s \in I_2} P(w_s) \cdot s(w_s) + \sum_{w_s \in I_3} P(w_s) \cdot s(w_s), \qquad (3.11)
\]
where
\[
s(w_s) =
\begin{cases}
b^+\big(h(w_s) - \sum_{p=0}^{d} b_p y''_p - \delta\big) & w_s \in I_1, \\[2pt]
b^-\big(\sum_{p=0}^{d} b_p y''_p + \delta - h(w_s)\big) & w_s \in I_2, \\[2pt]
b^-\big(\sum_{p=0}^{d} b_p y''_p + \delta - h(w_s)\big) & w_s \in I_3.
\end{cases}
\]
The quantity (3.11) can be rewritten as
\[
\sum_{p=0}^{d} b_p y''_p + R(\mathbf{y}'') + \Delta + \delta\Big(1 - \sum_{w_s \in I_1} P(w_s)\, b^+ + \sum_{w_s \in I_2} P(w_s)\, b^- + \sum_{w_s \in I_3} P(w_s)\, b^-\Big),
\]
where $\Delta := b^- \sum_{w_s \in I_2} P(w_s) \cdot \big(\sum_{p=0}^{d} b_p y''_p - h(w_s)\big) - b^+ \sum_{w_s \in I_2} P(w_s) \cdot \big(h(w_s) - \sum_{p=0}^{d} b_p y''_p\big)$. Due to (3.10), $h(w_s) - \sum_{p=0}^{d} b_p y''_p \le \delta$ holds for $w_s \in I_2$, and thus $\Delta \ge \delta\big(-b^+ \sum_{w_s \in I_2} P(w_s) - b^- \sum_{w_s \in I_2} P(w_s)\big)$. Based on this and some algebraic manipulation, it is easy to see that $\Delta + \delta\big(1 - \sum_{w_s \in I_1} P(w_s)\, b^+ + \sum_{w_s \in I_2} P(w_s)\, b^- + \sum_{w_s \in I_3} P(w_s)\, b^-\big) > 0$ if $b^+ \le 1$, which leads to the conclusion that $\mathbf{y}''$ is better than $\mathbf{y}'$ for the problem (3.5). This is a contradiction, and thus if $b^+ \le 1$, $\text{UB}_{\text{SSDP}}(\mathbf{b}) \le \text{UB}_{\text{SDP}}(\mathbf{b})$ must hold.
Consider now the worst-case formulations of the problems (3.3) and (3.4) in the spirit of modern robust optimization [Ben-Tal et al., 2009]. From here on, let $C^*$ denote the union of the first-stage moments $\mathbf{b}$ and the moments $\mathbf{b}(w_s)$, $s = 1,\dots,S$, associated with each scenario realization $s$, i.e. $C^* := \{\mathbf{b}(w_s), s = 1,\dots,S\} \cup \{\mathbf{b}\}$. The worst-case formulation for the upper bound problem is given in (3.12), whereas the lower bound formulation is given in (3.13):
\[
\begin{aligned}
\text{WUB}_{\text{SDP}} = \min_{\mathbf{y},X,Z} \; \max_{\mathbf{b}(w) \in C^*} \quad & \sum_{p=0}^{d} b_p(w)\, y_p \qquad (3.12)\\
\text{subject to} \quad & (X, Z, \mathbf{y}) \in G \subset (\mathcal{J}^{m_1}, \mathcal{J}^{m_2}, \mathbb{R}^{d+1}), \\
& X, Z \succeq 0,
\end{aligned}
\]
\[
\begin{aligned}
\text{WLB}_{\text{SDP}} = \max_{\mathbf{y},X,Z} \; \min_{\mathbf{b}(w) \in C^*} \quad & \sum_{p=0}^{d} b_p(w)\, y_p \qquad (3.13)\\
\text{subject to} \quad & (X, Z, \mathbf{y}) \in H \subset (\mathcal{J}^{m_1}, \mathcal{J}^{m_2}, \mathbb{R}^{d+1}), \\
& X, Z \succeq 0,
\end{aligned}
\]
where $\mathbf{b}(w) := (b_0(w),\dots,b_d(w))$. In the worst-case formulations (3.12) and (3.13), the
moment vector b is determined by the inner optimization problem, which seeks the worst
possible value of the objective function. In Theorem 3.3.3, we show that the robust
bound UBSSDP(b) (resp. LBSSDP(b)) is always less conservative than the bound WUBSDP
(resp. WLBSDP), which is based on the moment vector that results in the most extreme
objective value. Before showing the main theorem, we first present the following lemma,
which is useful for proving Theorem 3.3.3 and Theorem 3.3.4.
Lemma 3.3.1. The function $h : \mathbf{y} \mapsto \sum_{p=0}^{d} b_p y_p$, given any $\mathbf{b} := (b_0,\dots,b_d)$, is continuous and unbounded above over the feasible set $\{(X, Z, \mathbf{y}) \in G \subset (\mathcal{J}^{m_1}, \mathcal{J}^{m_2}, \mathbb{R}^{d+1}) \mid X, Z \succeq 0\}$, where $\mathbf{y} := (y_0,\dots,y_d)$.

Proof. The continuity of $h$ is obvious since the feasible set is a convex set. To see the unboundedness, consider maximizing instead of minimizing the objective function in (3.3). The dual problem of the maximization form of (3.3) is as follows:
\[
\begin{aligned}
\underset{Q}{\text{minimize}} \quad & \int -h(\xi)\, dQ(\xi) \qquad (3.14)\\
\text{subject to} \quad & \int \xi^p\, dQ(\xi) = -b_p, \quad p = 0, 1, \dots, d, \\
& Q(\xi) \ge 0,
\end{aligned}
\]
where $b_0 = 1$. Clearly, no feasible solution exists for the above problem. Based on duality theory, this implies that the primal problem, the maximization form of (3.3), is unbounded. This completes the proof.
Now, we are ready to prove Theorem 3.3.3.
Theorem 3.3.3. The optimal value $\text{UB}_{\text{SSDP}}(\mathbf{b})$ (resp. $\text{LB}_{\text{SSDP}}(\mathbf{b})$) satisfies
\[
\text{UB}_{\text{SSDP}}(\mathbf{b}) \le \text{WUB}_{\text{SDP}} \quad \big(\text{resp. } \text{LB}_{\text{SSDP}}(\mathbf{b}) \ge \text{WLB}_{\text{SDP}}\big)
\]
for any $\mathbf{b} \in C^*$, where $C^* := \{\mathbf{b}(w_s), s = 1,\dots,S\} \cup \{\mathbf{b}\}$.

Proof. Consider the problem (3.5) with the first-stage parameter $\mathbf{b} := (b_0,\dots,b_d) \in C^*$. Let $\mathbf{y}' := (y'_0,\dots,y'_d)$ be the optimal solution of the problem (3.5). Suppose now that $\text{UB}_{\text{SSDP}}(\mathbf{b}) > \text{WUB}_{\text{SDP}}$. This implies that
\[
\sum_{p=0}^{d} b_p y'_p > \text{WUB}_{\text{SDP}} \ge \max_{\mathbf{b}(w) \in C^*} \; \min_{\mathbf{y},X,Z :\, (X,Z,\mathbf{y}) \in G,\; X,Z \succeq 0} \; \sum_{p=0}^{d} b_p(w)\, y_p. \qquad (3.15)
\]
To see why the last inequality in (3.15) is true, let $\mathbf{b}^{\text{opt}}$ and $\mathbf{y}^{\text{opt}}$ denote the optimal $\mathbf{b}(w)$ and $\mathbf{y}$ in the optimization problem in (3.15). Then, in the worst-case upper bound problem (3.12), if the optimal $\mathbf{y}^*$ in (3.12) equals $\mathbf{y}^{\text{opt}}$, the inequality in (3.15) must hold. Consider now the case where the optimal $\mathbf{y}^*$ in (3.12) does not equal $\mathbf{y}^{\text{opt}}$; then the following must hold by the definition of the optimization problem in (3.15):
\[
\sum_{p=0}^{d} b^{\text{opt}}_p y^*_p \ge \sum_{p=0}^{d} b^{\text{opt}}_p y^{\text{opt}}_p.
\]
Thus, the inequalities below follow immediately:
\[
\min_{\mathbf{y},X,Z :\, (X,Z,\mathbf{y}) \in G,\; X,Z \succeq 0} \; \max_{\mathbf{b}(w) \in C^*} \sum_{p=0}^{d} b_p(w)\, y_p = \max_{\mathbf{b}(w) \in C^*} \sum_{p=0}^{d} b_p(w)\, y^*_p \ge \sum_{p=0}^{d} b^{\text{opt}}_p y^*_p \ge \sum_{p=0}^{d} b^{\text{opt}}_p y^{\text{opt}}_p.
\]
This completes the verification of the last inequality in (3.15). The inequality in (3.15) implies the following two sets of inequalities, (3.16) and (3.17):
\[
\sum_{p=0}^{d} b_p y'_p > \max_{\mathbf{b}(w) \in C^*} \; \min_{\mathbf{y},X,Z :\, (X,Z,\mathbf{y}) \in G,\; X,Z \succeq 0} \; \sum_{p=0}^{d} b_p(w)\, y_p \ge \min_{\mathbf{y},X,Z :\, (X,Z,\mathbf{y}) \in G,\; X,Z \succeq 0} \; \sum_{p=0}^{d} b_p y_p = \sum_{p=0}^{d} b_p y''_p, \qquad (3.16)
\]
\[
\sum_{p=0}^{d} b_p y'_p > \max_{\mathbf{b}(w) \in C^*} \; \min_{\mathbf{y},X,Z :\, (X,Z,\mathbf{y}) \in G,\; X,Z \succeq 0} \; \sum_{p=0}^{d} b_p(w)\, y_p \ge \text{UB}_{\text{SDP}}(\mathbf{b}(w)), \quad \forall \mathbf{b}(w) \in C^*, \qquad (3.17)
\]
where $(y''_0,\dots,y''_d)$ denotes the optimal solution of the last optimization problem in (3.16). The inequalities (3.16) imply that there exists a feasible $\mathbf{y}''$ such that
\[
\sum_{p=0}^{d} b_p y'_p > \sum_{p=0}^{d} b_p y''_p. \qquad (3.18)
\]
Thus, based on Lemma 3.3.1 there must exist $\mathbf{y}''' := (y'''_0,\dots,y'''_d)$ such that
\[
\sum_{p=0}^{d} b_p y'_p > \sum_{p=0}^{d} b_p y'''_p = \text{WUB}_{\text{SDP}}, \quad \text{and thus} \quad \sum_{p=0}^{d} b_p y'''_p \ge \text{UB}_{\text{SDP}}(\mathbf{b}(w)), \;\; \forall \mathbf{b}(w) \in C^*. \qquad (3.19)
\]
Finally, based on (3.17) and (3.19), it is easy to verify that $\mathbf{y}'''$ is better than $\mathbf{y}'$ for the problem (3.5), which is a contradiction. Thus, $\text{UB}_{\text{SSDP}}(\mathbf{b}) \le \text{WUB}_{\text{SDP}}$ must hold for any $\mathbf{b} \in C^*$.
Interestingly, the two-stage stochastic semidefinite programming model for a robust upper (resp. lower) bound can in fact be recast as a newsvendor problem with a simple lower (resp. upper) bound constraint (cf. [Shapiro et al., 2009]). For simplicity, we present and discuss only the reformulation of the upper bound problem as a newsvendor problem.

Theorem 3.3.4. Given that $\text{UB}_{\text{SDP}}(\mathbf{b}) < \infty$, the two-stage stochastic semidefinite programming model (3.5) is equivalent to the following newsvendor problem:
\[
\underset{x' \ge l}{\text{minimize}} \quad x' + \mathbb{E}_w\big[\, b^+ (h(w) - x')^+ + b^- (h(w) - x')^- \,\big],
\]
where $b^+, b^- \ge 0$ and $l := \text{UB}_{\text{SDP}}(\mathbf{b})$.

Proof. To prove the equivalence, it suffices to show that for any $x'$ satisfying $x' \ge \text{UB}_{\text{SDP}}(\mathbf{b})$, there exists a solution $\mathbf{y} := (y_0,\dots,y_d)$ that is feasible with respect to both the equality $\sum_{p=0}^{d} b_p y_p = x'$ and the constraints in (3.5). Given that $\text{UB}_{\text{SDP}}(\mathbf{b}) < \infty$, i.e. there exists $(y_0,\dots,y_d)$ such that $\sum_{p=0}^{d} b_p y_p \le x'$ for any $x' \ge l$, the existence of a feasible $\mathbf{y}$ satisfying $\sum_{p=0}^{d} b_p y_p = x'$ can be proven by showing that the function $\sum_{p=0}^{d} b_p y_p$ is continuous over the feasible set $\{(X, Z, \mathbf{y}) \in G \subset (\mathcal{J}^{m_1}, \mathcal{J}^{m_2}, \mathbb{R}^{d+1}) \mid X, Z \succeq 0\}$ and unbounded above, which is the result of Lemma 3.3.1.
Based on the above theorem, a closed-form solution of the two-stage stochastic semidefinite programming model for the upper bound (3.5) can be derived using the approach for deriving the optimal solution of the newsvendor problem.

Theorem 3.3.5. (cf. [Shapiro et al., 2009])
\[
\text{UB}_{\text{SSDP}}(\mathbf{b}) = \max\big\{\text{UB}_{\text{SDP}}(\mathbf{b}),\; F^{-1}(\kappa^*)\big\},
\]
where $\kappa^* = (b^+ - 1)/(b^+ + b^-)$, $F^{-1}(\kappa^*) = \inf\{h^* : F(h^*) \ge \kappa^*\}$, and $F(\cdot)$ denotes the cumulative distribution function of the random outcome $h(w)$.

It is important to note that the quantity $F^{-1}(\kappa^*)$ is also the popular risk measure Value-at-Risk (VaR). This result reveals that the flexibility we provide users to control their risk-aversion attitude towards over- or under-estimation of the bounds has a direct interpretation as the quantile selected for the Value-at-Risk measure. Later, in the computational experiments, we will report both the values of $b^+$ and $b^-$ and the quantile to highlight the usefulness of this connection.

Finally, we highlight in the following corollary that, as a result of Theorem 3.3.5, to compute the robust bounds $\text{UB}_{\text{SSDP}}(\mathbf{b})$ and $\text{LB}_{\text{SSDP}}(\mathbf{b})$ it suffices to consider only a finite number of deterministic bounds, e.g. $h(w_s)$, and find the bounds that are optimal with respect to the problems (3.5) and (3.8). This procedure can be easily carried out using a sorting algorithm.

Corollary 3.3.1. (cf. [Shapiro et al., 2009]) The optimal value
\[
\text{UB}_{\text{SSDP}}(\mathbf{b}) \in \{\text{UB}_{\text{SDP}}(\bar{\mathbf{b}}) \mid \bar{\mathbf{b}} \in C^*\} \quad \big(\text{resp. } \text{LB}_{\text{SSDP}}(\mathbf{b}) \in \{\text{LB}_{\text{SDP}}(\bar{\mathbf{b}}) \mid \bar{\mathbf{b}} \in C^*\}\big),
\]
where $C^* := \{\mathbf{b}(w_s), s = 1,\dots,S\} \cup \{\mathbf{b}\}$.
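The sorting procedure implied by Theorem 3.3.5 can be sketched as follows. The scenario values below are hypothetical; in practice the deterministic bounds $h(w_s)$ and $\text{UB}_{\text{SDP}}(\mathbf{b})$ would be computed offline by solving the SDPs.

```python
# Compute UB_SSDP(b) = max{UB_SDP(b), F^{-1}(kappa*)} by sorting the finitely
# many scenario bounds h(w_s), following Theorem 3.3.5.

def ub_ssdp(ub_sdp, scenarios, b_plus, b_minus):
    """scenarios: list of (probability, h_ws) pairs; assumes b_plus > 1 so that
    the quantile level kappa* lies in (0, 1)."""
    kappa = (b_plus - 1.0) / (b_plus + b_minus)  # quantile level kappa*
    # Empirical quantile F^{-1}(kappa): smallest h with cumulative prob >= kappa.
    cum = 0.0
    quantile = None
    for prob, h in sorted(scenarios, key=lambda s: s[1]):
        cum += prob
        if cum >= kappa:
            quantile = h
            break
    return max(ub_sdp, quantile)

# Hypothetical scenario bounds and first-stage deterministic bound.
scenarios = [(0.2, 1.0), (0.5, 2.0), (0.3, 3.0)]
print(ub_ssdp(1.5, scenarios, b_plus=3.0, b_minus=1.0))  # kappa* = 0.5 -> 2.0
```

With $b^+ = 3$, $b^- = 1$ we get $\kappa^* = 0.5$; the empirical CDF reaches $0.5$ at the scenario bound $2.0$, which dominates the deterministic bound $1.5$.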
3.4 Application in Bounding Option Prices
The problem of computing bounds on option prices has been of recent (e.g. [Gotoh and
Konno, 2002], [Bertsimas and Popescu, 2002], [Dalakouras et al., 2006], [Popescu, 2007])
and past interest (e.g. [Ritchken, 1985], [Lo, 1987], [Grundy, 1991], [Boyle and Lin,
1997]). This problem is important because it arises from the consideration of alternative
models to the standard geometric Brownian motion that is assumed in the Black-Scholes
framework for modeling the price of an asset, since geometric Brownian motion often
results in pricing biases. A variant of this problem [Bertsimas and Popescu, 2002], [Gotoh
and Konno, 2002] considers computing the tightest possible upper and lower bounds of
the price of an option given that there is no arbitrage in the market and that only the
first several moments of a risk-neutral distribution are known. This approach can be seen as a relaxation of the Black-Scholes approach in that a particular model is not assumed, i.e. the approach is considered model- or distribution-free.
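As a concrete instance of such a moment-based bound, the classical mean-variance upper bound on a call payoff, in the spirit of [Lo, 1987], has the closed form $\mathbb{E}[(S_T - K)^+] \le \tfrac{1}{2}\big((\mu - K) + \sqrt{(\mu - K)^2 + \sigma^2}\big)$, where $\mu$ and $\sigma^2$ are the mean and variance of $S_T$ under the risk-neutral distribution. The numbers in the sketch below are hypothetical:

```python
import math

# Distribution-free upper bound on E[(S_T - K)^+] given only the mean mu and
# variance sigma^2 of S_T (mean-variance bound in the spirit of Lo 1987).
def call_upper_bound(mu, sigma, K):
    return 0.5 * ((mu - K) + math.sqrt((mu - K) ** 2 + sigma ** 2))

# Hypothetical risk-neutral moments of the terminal price.
mu, sigma, K = 100.0, 20.0, 105.0
bound = call_upper_bound(mu, sigma, K)
# The bound always dominates the value max(mu - K, 0) implied by Jensen's inequality,
# and collapses to it as sigma -> 0.
assert bound >= max(mu - K, 0.0)
```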
In this section, we consider bounds on the price of a European-style call option under regime switching. The two-stage stochastic semidefinite programming model is applied, incorporating a lattice generated by a finite-state Markov chain regime-switching model as a representation of the scenarios (uncertainty) used to compute bounds. Our objective here is to have a distribution-free approach for computing bounds on a European-style call option, but with a regime-switching process that does not necessarily assume a lognormal distribution for each regime. We incorporate a finite-state Markov chain regime-switching process as in [Hamilton, 1989] to generate a discrete lattice for computing option bounds. The strategy is to use the lattice as a discrete set of scenarios representing uncertainty in the regimes that will realize in the second stage of a stochastic programming with recourse framework. The use of our stochastic semidefinite programming model generates a first-stage (here-and-now) bound that accounts for the regime-switching dynamics of the underlying asset. We demonstrate the value of the stochastic solution (bound), and computational experiments using the S&P 500 index illustrate the advantages of the stochastic programming approach over the deterministic strategy.
3.4.1 A Moment-Based Lattice under Regime Switching
We present here the construction of a lattice used for generating input parameters within
the recourse of the stochastic program in (3.5) and (3.8): scenarios s (s = 1, ..., S),
associated probability P (ws) and moments b(ws). The lattice is constructed based on
the information of conditional risk-neutral moments of a discrete-time regime switching
process, which captures the switching dynamics of the moments.
We assume that the continuous compound return of an underlying security follows a Markovian regime-switching model. The model assumes that there exist multiple regimes $\Phi_t \in \{1,\dots,m'\}$ for the value of the security at time $t$, and for different times $t$ the regimes can "switch" according to a transition probability matrix $P_\Phi$. The switching follows a Markov process, $p_{k'l'} = P_\Phi(\Phi_t = l' \mid \Phi_{t-1} = k') = P_\Phi(\Phi_t = l' \mid I'_{t-1})$, where $I'_{t-1}$ refers to the information set of prices available up to $t-1$. The model is defined as follows:
\[
\begin{aligned}
R_t &= \mu_{\Phi_t} + \varepsilon_t, \qquad (3.20)\\
\varepsilon_t \mid \Phi_t &\sim \mathcal{N}(0, \sigma^2_{\Phi_t}),
\end{aligned}
\]
where $R_t$ is the variable of interest, e.g. the continuous compound return, $\mu_{\Phi_t}$ is the regime-dependent mean, and $\varepsilon_t$ is a normally distributed random variable with regime-dependent variance $\sigma^2_{\Phi_t}$.
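The Markov switching among regimes governed by the transition matrix $P_\Phi$ can be simulated directly; a minimal sketch with hypothetical two-regime transition probabilities:

```python
import random

# Simulate a regime path Phi_1, ..., Phi_n from a finite-state Markov chain
# with transition matrix P (rows sum to one; values are hypothetical).
def simulate_regimes(P, phi0, n, rng):
    path = [phi0]
    for _ in range(n - 1):
        u = rng.random()
        row = P[path[-1]]
        # Pick the next regime by inverting the cumulative transition probabilities.
        cum = 0.0
        for regime, p in enumerate(row):
            cum += p
            if u < cum:
                path.append(regime)
                break
        else:
            path.append(len(row) - 1)  # guard against floating-point rounding

    return path

P = [[0.95, 0.05],   # calm regime tends to persist
     [0.10, 0.90]]   # volatile regime is also persistent
path = simulate_regimes(P, phi0=0, n=12, rng=random.Random(7))
```

Each simulated path is one realization of the regime sequence that the lattice in this section enumerates exhaustively rather than by sampling.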
Obtaining risk-neutral moments for a regime-switching process is in general not possible; however, by conditioning on a sequence of realized regimes $\{\Phi_T = \bar\Phi_T, \dots, \Phi_{t+1} = \bar\Phi_{t+1}\}$, where $\bar\Phi_T, \dots, \bar\Phi_{t+1} \in \{1,\dots,m'\}$, the risk-neutral moments can be obtained as follows. From here on, the notation $S_t$ denotes the stock price at time $t$, and $T$ denotes a terminal time.

Lemma 3.4.1. Suppose the continuous compound return of the security price $S_t$ follows the mean-variance regime-switching model (3.20). Then the conditional risk-neutral moment is
\[
\mathbb{E}\big[S_T^d \mid \Phi_T = \bar\Phi_T, \dots, \Phi_{t+1} = \bar\Phi_{t+1}, S_t = S_0\big] = S_0^d \exp\Big(d\Big(nr - \tfrac{1}{2}\sum_{\kappa'=t+1}^{T} \sigma^2_{\Phi_{\kappa'}}\Big)h + \tfrac{1}{2}\, d^2 \sum_{\kappa'=t+1}^{T} \sigma^2_{\Phi_{\kappa'}}\, h\Big),
\]
where $\sigma^2_{\Phi_{\kappa'}}$ is the variance with respect to regime $\Phi_{\kappa'}$, $n$ is the number of time steps, $h$ is the length of a time step, and $r$ is the risk-free rate.
Proof. For each time step of length $h$ at time $t$, the process can be written as
$$\log\Big(\frac{S_{t+1}}{S_t}\Big) = \mu_{\Phi_{t+1}} h + \sigma_{\Phi_{t+1}} \sqrt{h}\,\varepsilon_{t+1}, \qquad \varepsilon_{t+1} \sim N(0, 1),$$
where $\mu_{\Phi_{t+1}}$ and $\sigma_{\Phi_{t+1}}$ are independent of $\varepsilon_{t+1}$. Using the transformation of measures (Maruyama-Girsanov theorem, cf. [Kariya and Liu, 2003]), we can derive an equivalent martingale measure by the change of variable $\varepsilon_{t+1} = \varepsilon^*_{t+1} + \delta'_{t+1}\sqrt{h}$, $\varepsilon^*_{t+1} \sim N(0, 1)$, where $\delta'_{t+1} = -(\mu_{\Phi_{t+1}} - r + \frac{1}{2}\sigma^2_{\Phi_{t+1}})/\sigma_{\Phi_{t+1}}$, and the process for each time step $h$ at time $t$ under the martingale measure becomes
$$S_{t+1} = S_t \exp\Big(\big(r - \tfrac{1}{2}\sigma^2_{\Phi_{t+1}}\big)h + \sigma_{\Phi_{t+1}}\sqrt{h}\,\varepsilon^*_{t+1}\Big), \qquad \varepsilon^*_{t+1} \sim N(0, 1).$$
By summing the sequence of normally distributed increments with mean $(r - \tfrac{1}{2}\sigma^2_{\Phi_t})h$ and variance $\sigma^2_{\Phi_t} h$, one obtains
$$S_T = S_0 \exp\Big(\Big(nr - \frac{1}{2}\sum_{\kappa'=t+1}^{T}\sigma^2_{\Phi_{\kappa'}}\Big)h + \sqrt{h}\,\sqrt{\sum_{\kappa'=t+1}^{T}\sigma^2_{\Phi_{\kappa'}}}\;\varepsilon^*\Big), \qquad \varepsilon^* \sim N(0, 1),$$
which is clearly a martingale conditional on a given sequence of regimes. Finally, the risk-neutral moments can be obtained using the moment function formula.
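The formula in Lemma 3.4.1 admits a quick numerical sanity check: for $d = 1$ the variance terms cancel, leaving $E[S_T] = S_0 e^{nrh}$ regardless of the regime path, as the martingale property under the risk-neutral measure requires. A minimal sketch (argument names are ours):

```python
from math import exp

def cond_rn_moment(S0, d, cum_var, n, r, h):
    """Conditional risk-neutral moment E[S_T^d | regime path] from
    Lemma 3.4.1; cum_var is the path's cumulative variance
    sum_{kappa'=t+1}^T sigma^2_{Phi_kappa'}."""
    return S0 ** d * exp(d * (n * r - 0.5 * cum_var) * h
                         + 0.5 * d ** 2 * cum_var * h)
```

For $d = 2$ the formula multiplies the squared forward by the convexity correction $e^{\sum \sigma^2 h}$, matching the second moment of a lognormal variable.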
A conditioning approach may seem computationally expensive or even intractable due to the exponential number of regime switching paths, where a single path is a realization of regimes for the time periods from the start to the time to maturity. Each time period can be in one of $m'$ regimes, so the total number of regime switching paths over $n'$ periods is $m'^{n'}$. Running $m'^{n'}$ SDP subroutines can be computationally intractable for large $m'$ or $n'$. However, a careful look at the quantity of interest, $E[S_T^d \mid \Phi_T = \bar\Phi_T, \ldots, \Phi_{t+1} = \bar\Phi_{t+1}, S_t = S_0] = S_0^d \exp\big(d(nr - \tfrac{1}{2}\sum_{\kappa'=t+1}^{T}\sigma^2_{\Phi_{\kappa'}})h + \tfrac{1}{2}d^2\sum_{\kappa'=t+1}^{T}\sigma^2_{\Phi_{\kappa'}} h\big)$, shows that different regime-switching paths can result in the same quantity $\sum_{\kappa'=t+1}^{T}\sigma^2_{\Phi_{\kappa'}}$. This observation is visualized in Figures 3.1(a) and 3.1(b), which illustrate the lattice construction for regime switching under 2 and 3 regimes.
Each axis represents the state of one regime. As the process proceeds, the switching of regimes can be viewed as taking incremental steps along the edges of the grid/mesh. For 2 regimes, the construction coincides with the structure of a binomial lattice, and all paths that traverse to the same node at time $T$ in the lattice are associated with the same quantity $\sum_{\kappa'=t+1}^{T}\sigma^2_{\Phi_{\kappa'}}$. For 3 regimes, the lattice construction requires an additional axis (dimension) so that the paths resulting in the same quantity $\sum_{\kappa'=t+1}^{T}\sigma^2_{\Phi_{\kappa'}}$ can merge. Similar ideas apply to larger numbers of regimes.

[Figure 3.1: Regime switching lattices. (a) Lattice for 2 regimes (axes $w_1$, $w_2$); (b) lattice for 3 regimes (axes $w_1$, $w_2$, $w_3$).]
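The saving from merging can be quantified: a path's cumulative variance depends only on how many of the $T$ periods are spent in each regime, i.e., on the multiset of visited regimes, so the $m'^T$ paths collapse to $\binom{m'+T-1}{T}$ lattice nodes. A small sketch (function name ours):

```python
from math import comb

def lattice_sizes(m, T):
    """Paths vs. merged lattice nodes for m regimes over T periods: the
    cumulative variance is permutation-invariant, so m**T paths merge
    into C(m + T - 1, T) multisets of regimes."""
    return m ** T, comb(m + T - 1, T)
```

For instance, 2 regimes over 8 periods give 256 paths but only 9 lattice nodes, exactly the terminal nodes of a binomial lattice.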
Thus, to generate scenarios for $m'$ regimes with the set of variances $(\sigma^2)^s := \{\sigma^2_{\Phi_1}, \ldots, \sigma^2_{\Phi_{m'}}\}$ given $T$ time periods, we first generate all combinations with replacement of $T$ selections from the set $(\sigma^2)^s$; that is, we generate a sequence such as $\{1, 1, 2, 3\}$ if $m' = 3$ and $T = 4$, but do not allow a repeated combination such as $\{2, 1, 1, 3\}$. Each combination corresponds to a scenario, and the respective risk-neutral moments can be derived using the formula in Lemma 3.4.1. Thereafter, to derive the probability of each scenario, we generate all possible permutations (i.e., the exact regime switching paths) of each combination (scenario), and for each path $i$ calculate its probability by $P(i) = \sum_{j=1}^{m'} P(i \mid \Phi_0 = j)P(\Phi_0 = j)$, where $\Phi_0$ is the initial state at time 0. With the estimated filtered probability $P(\Phi_0 = j)$ and the switching probabilities $p_{k'l'}$, $P(i)$ can then be computed. Finally, by summing the probabilities of all paths within each scenario, we obtain a complete set of risk-neutral moments and respective probabilities for all scenarios. The overall procedure is summarized by the pseudo-code algorithm in Table 3.1.
1. Initialize
   $S_0$: initial price, $r$: risk-free rate, $(\sigma^2)^s$: set of variances from the $m'$ regimes,
   $T$: number of time periods until maturity,
   $P(\Phi_0 = j)$: filtered probability of the initial state,
   $p_{k'l'}$: transition probability from regime $k'$ to regime $l'$.

2. Generate scenarios $w_s$, $s = 1, \ldots, S$
   (2.a) generate the list of combinations with replacement $w_s$, $s = 1, \ldots, S$, from
         the sequence $\{1, \ldots, m'\}$, where the number of elements in each combination is $T$.
   for $s = 1$ to $S$
      (2.b) compute the cumulative variance $\sum_{\kappa'=t+1}^{T} \sigma^2_{\Phi_{\kappa'}}$
      (2.c) compute the risk-neutral moments using the formula
            $E[S_T^d \mid \Phi_T = \bar\Phi_T, \ldots, \Phi_{t+1} = \bar\Phi_{t+1}, S_t = S_0]
             = S_0^d \exp\big(d(nr - \tfrac{1}{2}\sum_{\kappa'=t+1}^{T}\sigma^2_{\Phi_{\kappa'}})h
             + \tfrac{1}{2}d^2 \sum_{\kappa'=t+1}^{T}\sigma^2_{\Phi_{\kappa'}}\,h\big)$
   end

3. Compute the probability of each scenario
   for $s = 1$ to $S$
      (3.a) generate all permutations $c'' = 1, \ldots, C''$ of the combination representing $w_s$
      for $c'' = 1$ to $C''$
         (3.b) compute $P(c'') = \sum_{j=1}^{m'} P(c'' \mid \Phi_0 = j) P(\Phi_0 = j)$
      end
      (3.c) compute $P(w_s) = \sum_{c''=1}^{C''} P(c'')$
   end

Table 3.1: Pseudo-code for scenario generation
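The procedure of Table 3.1 can be sketched in Python with the standard library. The helper names are ours, as is the assumption that the first regime of a path is reached from $\Phi_0$ by one transition; this is an illustrative sketch, not the thesis's implementation:

```python
from itertools import combinations_with_replacement, permutations
from math import exp

def generate_scenarios(S0, r, sigma2, T, p0, P_trans, h, d_max=4):
    """Moment-based lattice scenarios for an m'-regime switching model.
    sigma2[j] is regime j's variance, p0[j] the filtered probability of
    starting in regime j, P_trans the transition matrix. Returns a list
    of (moments, probability) pairs, one per scenario."""
    m = len(sigma2)
    scenarios = []
    # (2.a) each sorted T-tuple of regimes is one scenario: paths merge on
    # the cumulative variance, which does not depend on the regime order
    for combo in combinations_with_replacement(range(m), T):
        # (2.b) cumulative variance along the path
        cum_var = sum(sigma2[k] for k in combo)
        # (2.c) conditional risk-neutral moments E[S_T^d | path]
        # (Lemma 3.4.1), with n = T time steps of length h
        moments = [S0 ** d * exp(d * (T * r - 0.5 * cum_var) * h
                                 + 0.5 * d ** 2 * cum_var * h)
                   for d in range(d_max + 1)]
        # (3.a)-(3.c) scenario probability: sum over the distinct
        # orderings (regime switching paths) of this combination
        prob = 0.0
        for path in set(permutations(combo)):
            for j in range(m):  # condition on the initial state Phi_0 = j
                p = p0[j] * P_trans[j][path[0]]
                for a, b in zip(path, path[1:]):
                    p *= P_trans[a][b]
                prob += p
        scenarios.append((moments, prob))
    return scenarios
```

Because every regime path belongs to exactly one scenario, the scenario probabilities sum to one, and the first moment of every scenario equals the forward price $S_0 e^{Trh}$.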
3.4.2 Implementation and Experiments
Let $T$ and $K$ denote, respectively, the time to maturity (exercise time) and the strike price (exercise price) of a European call option, and let $r$ be the risk-free rate. Then the call option price is given by
$$e^{-r\tau} E_Q[\max(0, S_T - K)], \qquad (3.21)$$
where $\tau = T - t$. The deterministic moment problems that bound the price of the call option are as follows:
$$\mathrm{UB}_{\mathrm{SDP}}(b) := e^{-r\tau} \max_{Q}\; E_Q[\max(0, S_T - K)] \quad \text{subject to } E_Q[S_T^p] = b_p,\; p = 0, 1, \ldots, d, \qquad (3.22)$$
and
$$\mathrm{LB}_{\mathrm{SDP}}(b) := e^{-r\tau} \min_{Q}\; E_Q[\max(0, S_T - K)] \quad \text{subject to } E_Q[S_T^p] = b_p,\; p = 0, 1, \ldots, d, \qquad (3.23)$$
where $S_T$ is a non-negative random variable. The above problems can be considered special cases of the general moment problems addressed in Section 3.1.
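For intuition about what (3.22) computes, the special case $d = 2$ (only mean and second moment constrained) has a classical closed-form solution, the Scarf/Lo-type two-moment bound. This is our illustrative addition, not the thesis's SDP machinery, and the form below assumes the strike is large enough that the non-negativity constraint on $S_T$ is inactive:

```python
from math import exp, sqrt

def two_moment_upper_bound(mu, m2, K, r, tau):
    """Closed-form d = 2 instance of UB_SDP: the maximum of
    e^{-r tau} E_Q[max(0, S_T - K)] over all Q with E[S_T] = mu and
    E[S_T^2] = m2 (Scarf/Lo-type two-moment bound)."""
    var = m2 - mu ** 2
    assert var >= 0, "inconsistent moments: need E[S^2] >= (E[S])^2"
    return exp(-r * tau) * 0.5 * ((mu - K) + sqrt((mu - K) ** 2 + var))
```

When the variance is zero the bound collapses to the discounted intrinsic value, and it grows with the variance, reflecting the extra optionality the adversarial distribution can exploit.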
In this section we investigate the empirical performance of the robust bounds for
European-style call option prices under regime switching. In particular, we are interested
in S&P 500 stock index options, as empirical studies show strong evidence of regime
switching behavior [Turner et al., 1989], [So et al., 1998], [Hardy, 2001], [Freeland et al.,
2009]. Our results show that the deterministic SDP bound UBSDP(•) can be inadequate
for bounding the option price when the volatility of the underlying asset is non-deterministic. In
general, our estimation methodology follows [Christoffersen and Jacobs, 2004] and [Hsieh
and Ritchken, 2005]; that is, we minimize the sum of squared option-valuation errors and
combine cross-sectional information from option prices and asset prices. In the
computational experiments, we focus on options with maturities ranging from months to a
year rather than shorter durations, since regime switching in underlying assets such as
the S&P 500 most often occurs over horizons longer than a few months.
The main intent of the experiments is to demonstrate the computational feasibility
of computing quality bounds on option prices for assets with regime switching dynamics
using a stochastic semidefinite programming approach, rather than to emphasize the
calibration of the model or to test in- and out-of-sample performance. We collected option
data with maturities in multiples of five weeks. The data covers approximately the four-year
period from October 2004 to March 2008. The data is collected on the third Friday
of each month and is obtained from the OptionMetrics database through the Rotman
Financial and Trading Lab at the University of Toronto. We adjust the index level
according to the dividends paid out over the time to maturity. The actual cash dividend
payments made during the life of the option are used as a proxy for the expected dividend
payments, as suggested in [Jackwerth and Rubinstein, 1996] and [Bakshi et al., 1997].
We then subtract the present value of all the dividends from the index levels to obtain
contemporaneous adjusted index levels. We also normalize the option and strike prices
by the adjusted index price so that the adjusted index price is $1. We use the T-bill
term structure to deduce the discount rates and estimate the regime switching model by
minimizing the sum of squared errors between theoretical and actual prices.
However, in order to combine cross-sectional information from option prices with the
time-series behavior of the underlying asset, the transition probability matrix $P_\Phi$ and the
filtered probabilities of the regimes are estimated using Maximum Likelihood Estimation
(MLE). Then, we minimize the following quantity:
$$\$\mathrm{RMSE} = \sqrt{\frac{1}{N_t}\sum_{i^*}\big(C_{i^*,t} - C_{i^*,t}(h^*_t)\big)^2},$$
where $C_{i^*,t}$ is the market price of contract $i^*$ at time $t$, $C_{i^*,t}(h^*_t)$ is the respective model price, and $N_t$ is the total number of contracts available at time $t$. At each time $t$,
the MLE-estimated filtered probability of each regime is taken as the probability of the
initial state, and the MLE-estimated transition probability matrix $P_\Phi$ is assumed to
hold for all contracts. Hence, only the volatility of each regime needs to be estimated.
This choice of estimation approach ensures that the volatility estimation is consistent
with traditional implied-volatility estimation approaches. We generate 10,000 simulated
paths and employ the antithetic variate technique for variance reduction. To verify that
10,000 draws are adequate, we generated 25,000 paths for several cases and obtained
identical results.
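The antithetic variate technique pairs each standard normal draw $z$ with $-z$; for a monotone payoff the two estimates are negatively correlated, so their average has lower variance than two independent draws. A self-contained sketch under a plain lognormal model (parameter choices illustrative, not the thesis's calibration):

```python
import random
from math import exp, sqrt

def call_price_antithetic(S0, K, r, sigma, tau, n_pairs, seed=0):
    """Monte Carlo estimate of e^{-r tau} E[max(0, S_T - K)] under a
    lognormal model, using antithetic variates: each normal draw z is
    paired with -z, which reduces variance for monotone payoffs."""
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma ** 2) * tau
    vol = sigma * sqrt(tau)
    total = 0.0
    for _ in range(n_pairs):
        z = rng.gauss(0.0, 1.0)
        for zz in (z, -z):  # antithetic pair
            ST = S0 * exp(drift + vol * zz)
            total += max(0.0, ST - K)
    return exp(-r * tau) * total / (2 * n_pairs)
```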
The following option bound computations are based on estimation using data up to
March 20, 2008; details are provided in [Kwon and Li, 2011]. The initial price
is $S_0 = 1329.51$ and the yearly risk-free rate is $r = 0.01$. From the estimation results, in which
up to five regimes are estimated, the initial state of the price process is consistently
found to be in one of two regimes, as determined by the estimated filtered probability of
each regime. The volatilities of the two regimes are distinctly different. Let $b_{low}$ (resp.
$b_{high}$) denote the first four moments of the risk-neutral distribution for the regime with
low (resp. high) volatility. The first four moments are usually considered to capture most
of the essential distributional information, such as skewness and kurtosis. In our experiments
we consider that a trader may either assume the price process follows the regime with
low volatility and compute the deterministic SDP upper bound UBSDP($b_{low}$), or assume
the process follows the regime with high volatility and compute the deterministic SDP
lower bound LBSDP($b_{high}$).
To verify the quality of the bounds, we compute the European call option prices
under a lognormal regime switching model and use them as references for true option
prices. We also compute the respective robust bounds UBSSDP(blow) and LBSSDP(bhigh)
that further account for the dynamics of regime switching. For simplicity, we set b− = 1
(resp. b+ = 1) when computing upper (resp. lower) bounds for different values of b+
(resp. b−); we present the respective quantile (%) based on Theorem 3.3.5, which helps
to highlight its connection with the VaR risk measure. Finally, the worst-case bounds
WUBSDP and WLBSDP are generated as well. The bounds and prices are computed over
various maturities in multiples of 5-week periods for regime switching processes with 2, 3,
4, and 5 regimes. From here on, RS denotes the value of the European call option prices
under a lognormal regime switching model, and BS (Low) (resp. BS (High)) refers to
the Black-Scholes call option price computed based on the regime with low (resp. high)
volatility.

[Figure 3.2: The case of 2 regimes and K = 1200. (a) Upper bounds and prices; (b) lower bounds and prices. Each panel plots BS (Low)/BS (High), RS, the deterministic bound UBSDP/LBSDP, the robust bounds UBSSDP/LBSSDP with b+ (resp. b−) equal to 10 and 10², and the worst-case bound, against time to maturity (5 weeks/unit).]
In Table A.4, we provide prices/bounds for various strike prices K that correspond
to the cases of in-the-money (K=1200), at-the-money (K=1325), and out-of-the-money
(K=1400).
(a) Quality of Bounds. As seen in Figures 3.2-3.4 and Figures B.1-B.3, except for the
lower-bound case of 2 regimes and strike price 1200, the deterministic SDP bounds are
inadequate for bounding the European call option prices in the presence of regime switching.
Despite relying the least on the form of the distribution, the deterministic SDP bounds
still significantly under- or overestimate how extreme the option price can be. More
importantly, if the distribution for each regime deviates from the lognormal distribution
assumed in RS pricing, e.g., a fat-tailed return distribution in a bear market, the actual under- or overestimation resulting from the use of the deterministic SDP bounds can be even worse.

[Figure 3.3: The case of 2 regimes and K = 1325. (a) Upper bounds and prices; (b) lower bounds and prices; same quantities plotted as in Figure 3.2.]
On the other hand, the robust bounds UBSSDP($b_{low}$) (LBSSDP($b_{high}$)) consistently bound
the RS prices for fixed penalty parameters and thus can serve as useful alternatives
for evaluating the extremeness of option prices. In addition, the robust bounds are
much more reasonable than the worst-case upper bounds WUBSDP, which are
too conservative to have any practical value. The most extreme example is found
in Figure B.3(a), where the worst-case upper bound is meaningless. Interestingly, even
when we increase the value of b+ to an extent that corresponds to covering 99.999% of
the possible bounds, the robust upper bounds UBSSDP($b_{low}$) remain significantly tighter
than the worst-case bounds WUBSDP. This sheds light on the practical value of robust
bounds that allow control over the degree to which the regime switching dynamics
are incorporated.
[Figure 3.4: The case of 2 regimes and K = 1400. (a) Upper bounds and prices; (b) lower bounds and prices; same quantities plotted as in Figure 3.2.]
(b) The Impact of the Structure of the Regime Switching Lattice on the Option
Bounds. It can be observed in Figures 3.2-3.4 that for different K values, the features of
the robust bounds are mostly identical apart from the actual values. This is plausible since
a change in K is unrelated to the structure of the lattice and therefore affects
only the exact values of the bounds. In addition, the robust bounds exhibit two trends.
First, all bounds increase as a function of time to maturity (reflecting the fact that
option prices are higher for longer maturities). Second, as the penalty parameter b+
(b−) increases, the bounds become increasingly extreme and reflect the regime switching
process, as indicated by the undulating nature of the bounds corresponding to different
regime realizations that may occur in the future.
Consider the upper bound results as an example. In the case b+ = 1, i.e., when
UBSSDP(•) is equivalent to the deterministic bound UBSDP(•), the curve is smooth and
the bound increases steadily as the time to maturity increases. The difference between
the features of the curves highlights the effectiveness of the robust bounds in further
taking into account the structure of a regime switching lattice. The parameter b+ corresponds
to a certain quantile of the distribution of paths constructed via a regime switching
lattice, where a path is a particular sequence of regime realizations from the
start of the second stage up to time t. Thus, if an increment of b+ leads to only a
marginal increase in the bound UBSSDP(•) at a time to maturity t, this implies that,
among the regime switching paths ending at time t, those that would lead to
higher bounds remain much less likely under the increased quantile and do not
influence the bound. If the bounds at a time to maturity t are sensitive to incremental
changes in b+, it is because the paths leading to higher bounds are also more
likely at time t under the increased quantile.
It can also be observed (see Table A.4) that the bounds become more responsive to
increases in the parameter b+ (b−) as the number of regimes increases. With a larger
number of regimes, the bound at each time to maturity changes more frequently
as b+ (b−) increases. This can be explained by observing that as the number of
regimes increases, the regime-switching lattice becomes finer; thus, at each time to
maturity the number of possible second-stage option bounds increases, since the
number of realizations of regime paths increases. As a result, it becomes more likely
to switch from a bound based on one realized regime path to another as the
parameter b+ (b−) increases. Similar reasoning explains why the robust bounds
generally become more responsive to increases in the parameter b+ as the time to
maturity increases. Overall, this feature can be useful in practice, since it implies
that the sensitivity of the bounds to a user's risk-aversion attitude (represented
by the parameter b+ (b−)) toward over- or underestimation of the first-stage bound
can be controlled through the complexity of the underlying regime switching lattice.
3.5 Conclusion
In this chapter, stochastic semidefinite programming models were developed that
incorporate as scenarios (uncertainty) a moment-based lattice generated by a finite-state
stochastic model to compute bounds on expected future performance. The stochastic
programming approach provides an effective means of mitigating the risk associated with
stochastic moments, in that the models are tractable and controllable through penalty
parameters expressing risk aversion. The use of a general finite-state lattice in the
stochastic programming framework is not an ad hoc approach for computing bounds:
the deterministic and robust optimization counterparts are limiting cases, and all bounds
are equivalent to Value-at-Risk quantities whose confidence level is a function of the
penalty parameters. Extensive computational experiments generating bounds on the
price of European-style call options under regime switching illustrate the flexibility
and advantages of the bounds over deterministic approaches.
Chapter 4
Distributionally Robust
Optimization under Extreme
Moment Uncertainty
In this chapter, our focus is to tackle the third layer of uncertainty, i.e., extreme moment
uncertainty, which completes the notion of comprehensive robustness proposed at the
beginning of this thesis. This extreme form of moment uncertainty, moment outliers, is
addressed in the context of decision optimization. Recently, a growing body of research
known as Distributionally Robust Optimization (DRO) has focused on stochastic
optimization problems for which only partial moment information about the underlying
probability measure is available. DRO stems from the minimax approaches pioneered
by Scarf (1958), in which decisions are sought that minimize the worst-case (maximum)
expected cost over the set of distributions sharing a common mean and variance. The inner
maximization problem, in its general form, is precisely the moment problem addressed in
Chapter 2. The complexity of DRO, which involves infinitely many moment problems
as sub-problems, has been studied in various contexts. For example, El Ghaoui et al.
(2003) considered a portfolio selection problem that minimizes the worst-case value-at-risk
of portfolios when only mean and covariance information is available. Variants and
extensions of El Ghaoui et al.'s work can be found in [Natarajan et al., 2008] and [Zhu and
Fukushima, 2009], among others. Popescu (2007) considered a wide range of utility
functions in decision analysis and studied the problem of maximizing expected utility given
only mean and covariance values. To account for moment uncertainty, Goh and Sim
(2010) developed a general optimization framework with recourse that takes into account
uncertainty in the mean. Delage and Ye (2010) modeled moment uncertainty via an
ellipsoidal set of mean vectors and a conic set of covariance matrices, and proved the
tractability of solving a general class of stochastic optimization problems with piecewise-concave
objective functions.
The aforementioned DRO approaches, however, assume that the range of moments can
be specified completely from historical data, which overlooks the limitations of data in
capturing extreme events. We present in this chapter a new DRO-type framework, which
we call comprehensive distributionally robust optimization, that enables decision makers
to seek a reasonably robust policy in the presence of rare but high-impact realizations
of moment uncertainty. Our framework can be viewed as a moment-based extension
of the penalized maxmin framework studied in [Anderson et al., 2000], [Uppal and
Wang, 2003], and [Maenhout, 2004], where a penalty function is used to account for
the ambiguity of a prior reference measure. In our framework, the reference measure
is replaced by a confidence region of reference moments, and alternative measures are
replaced by alternative moments.
Besides the possible modeling benefit of being distribution-free, another advantage of
our moment-based approach is its tractability: a penalized distribution-based approach
typically results in a computationally overwhelming optimization problem unless strong
assumptions are made, e.g., normality or discrete random returns (e.g., [Calafiore, 2007]).
Without these assumptions, a sampling-based approximation such as Monte Carlo
simulation is typically required, which can lead to an extremely computationally
intensive problem. In contrast, our problem is moment-based and is thus expected to
be free of this challenge. We provide two computationally tractable methods for
solving the problem. The first is developed using the classical ellipsoid method,
well suited to a general convex formulation; the second is based on semidefinite
programming reformulations and state-of-the-art semidefinite programming algorithms.
The structure of this chapter is as follows. We begin in Section 4.1 by highlighting
moment outliers. In Section 4.2, we present a new comprehensive distributionally robust
optimization approach that does not rely on full distributional information and
requires only the first two moments. In Section 4.3, we show that under very mild
conditions the newly developed optimization model is guaranteed to be solvable in
polynomial time, which provides a firm basis for the future development of efficient
algorithms. We also highlight in Section 4.4 the relation between our comprehensive
robust optimization framework and the classical worst-case (minimax) approach. In
Section 4.5, we further specialize the problem to a particular class of convex problems
and show that this class can be reformulated as semidefinite programming problems
that can be solved in a practically efficient manner. Variations and extensions of the
problem are addressed, such as the incorporation of alternative moment structures and
the extension to factor models. Finally, in Section 4.6, we apply the distributionally
robust approach to a portfolio selection problem, presenting an extensive numerical
study based on real-life data.
4.1 Moment Outliers
Prior DRO approaches account for moment uncertainty by constructing a region in which
realized moments can possibly fall. For example, in Delage and Ye (2010) a high-percentile
confidence region revolving around a pair of sampled mean and covariance
is constructed and incorporated into decision optimization. They showed via portfolio-selection
experiments that the resulting performance is superior to that of a portfolio
obtained using only a fixed pair of mean and covariance. What remains to be
investigated, however, is the effect on overall portfolio performance of extreme moment
values falling outside the region. Owing to their extremeness, moments at tail
percentiles may significantly change the portfolio selection. Moreover, such outliers have
become increasingly non-negligible in modern portfolio risk management, as several
severe losses in recent financial markets are due to rare events. Unfortunately, a
fixed-bound DRO approach, like Delage and Ye's, may not provide a satisfactory
solution, since there is no clear rule for deciding a bound within this tail percentile.
Including all physically possible realizations of moments in the uncertainty set yields
an overly pessimistic solution. Alternatively, if one specifies the uncertainty set based
on his/her confidence region of mean and covariance, investors may be left fully
unguarded when the realized mean and covariance fall outside the uncertainty set. In
short, any fixed bound can turn out to give an overly conservative solution or a
solution vulnerable to worst-case scenarios.
In the next section, we provide a new optimization framework offering a mechanism
that can be seen as "endogenously" achieving bounds for extreme moment uncertainty.
The degree to which the bounds are enlarged depends on the performance deterioration
that the enlargement can cause. Such a mechanism is made possible by a novel
penalty-type construction.
4.2 Comprehensive Distributionally Robust Optimization
We begin this section by considering the following scenario. A decision maker intends to optimize his/her resource allocation $\xi^T x$ according to a certain convex measure function $G_c$, where $x \in \Re^n$ is a resource allocation vector assigned over $n$ resources associated with the vector of random payoffs $\xi$. Let $Q$ denote the probability measure of the random payoffs $\xi$. The allocation vector $x$ is subject to a convex feasible set $X_c \subseteq \Re^n$, which is typically specified by real-life constraints. He/she is uncertain about the exact distributional form of the probability measure $Q$, and the information he/she can acquire about $Q$ is that it belongs to a distribution set characterized via a set of first two moments $(\mu_c, \Sigma_c) = \{(\mu_i, \Sigma_i) \mid i \in C\}$. From here on, the notation $Q(\cdot\,; \mu, \Sigma)$ denotes a probability measure $Q$ with mean $\mu$ and covariance $\Sigma$. The set $(\mu_s, \Sigma_s) = \{(\mu_i, \Sigma_i) \mid i \in S\}$ comprises all pairs of ambiguous moments $(\mu, \Sigma)$ of $Q$. Thus, if $\mu \in \Re^n$ and $\Sigma \in \Re^{n \times n}$, then both sets $(\mu_c, \Sigma_c)$ and $(\mu_s, \Sigma_s)$ are subsets of the space $\Re^n \times \Re^{n \times n}$. Note that the confidence region of moments $(\mu_c, \Sigma_c)$ can be either a singleton or an uncountable set. Now, we formulate the penalized moment-based framework in its generic form:
$$\inf_{x \in X_c}\; \sup_{(\mu,\Sigma) \in (\mu_s,\Sigma_s),\; Q(\cdot\,;\mu,\Sigma)} E_{Q(\cdot\,;\mu,\Sigma)}[G_c(\xi^T x)] - T_w(\mu, \Sigma \mid \mu_c, \Sigma_c). \qquad (4.1)$$
For technical reasons, we consider the probability measure $Q$ associated with the measurable space $(\Re^n, \mathcal{B})$, where $\mathcal{B}$ is the Borel $\sigma$-algebra on $\Re^n$.

In the above formulation, $T_w$ is a newly introduced function that measures the discrepancy between a pair of moments $(\mu, \Sigma)$ and the set of moments specified by the confidence region $(\mu_c, \Sigma_c)$. The subscript $w$ is a user-defined penalty parameter. We refer to this discrepancy as the "moment discrepancy" throughout the chapter. The function $T_w$ is assumed to satisfy the following property:
$$(\mu,\Sigma) \in (\mu_c,\Sigma_c) \;\Leftrightarrow\; T_w(\mu,\Sigma \mid \mu_c,\Sigma_c) = 0, \qquad (\mu,\Sigma) \notin (\mu_c,\Sigma_c) \;\Leftrightarrow\; T_w(\mu,\Sigma \mid \mu_c,\Sigma_c) > 0.$$
The magnitude of $T_w(\cdot)$ is assumed to be positively correlated with the moment discrepancy. Thus, the larger the moment discrepancy between $(\mu, \Sigma)$ and $(\mu_c, \Sigma_c)$, the less likely the measure $Q(\cdot\,; \mu, \Sigma)$ is to be chosen for evaluating the expectation. From a modeling perspective, the penalized moment-based problem provides a comprehensive treatment for decision makers holding different conservative attitudes towards the following three ranges within which $(\mu, \Sigma)$ can possibly take values:

– When the candidate mean and covariance $(\mu, \Sigma)$ stay within the confidence region $(\mu_c, \Sigma_c)$, the problem recovers the standard minmax setting. In other words, when the decision maker is certain about some realizations of moments, he/she naturally holds a strictly conservative attitude and pursues only robust performance of the portfolio selection.

– When $(\mu, \Sigma) \notin (\mu_c, \Sigma_c)$ but lies in the set $(\mu_s, \Sigma_s)$, which contains all "physically possible" realizations of moments, we are in an "ambiguity" region in which the decision maker seeks a balance between relying on his/her prior knowledge and properly hedging the risk of model uncertainty. In this region, using a standard minmax setting can lead to an impractical solution. Instead, the moment-based problem helps the decision maker to decide the appropriate degree of conservativeness based on the possible performance deterioration resulting from each $(\mu, \Sigma) \notin (\mu_c, \Sigma_c)$. This leads to a less conservative setting.

– When $(\mu, \Sigma) \notin (\mu_s, \Sigma_s)$, the moments are in a region with no "physically possible" realizations. Therefore, the decision is optimized without taking this scenario into account when evaluating worst-case performance. The decision maker holds no conservative attitude for this region.
Comprehensive Distributionally Robust Optimization
Now, we further specialize the formulation (4.1) by refining the structure of the penalty function $T_w$. We first consider two separate distance functions $d_\mu : \Re^{n_1} \times \Re^{n_2} \to \Re_+$ and $d_\Sigma : \Re^{n_1 \times n_1} \times \Re^{n_2 \times n_2} \to \Re_+$ that are used to measure the deviation of $(\mu, \Sigma)$ from the confidence region $(\mu_c, \Sigma_c)$. Specifically, we define the distance functions $d_\mu(\mu, \mu_c) := \inf_{\nu' \in \mu_c} \|\mu - \nu'\|$ and $d_\Sigma(\Sigma, \Sigma_c) := \inf_{\sigma' \in \Sigma_c} \|\Sigma - \sigma'\|$, where the notation $\|\cdot\|$ denotes a norm satisfying positive homogeneity and subadditivity. From here on, for tractability we assume that the sets $\mu_c$ and $\Sigma_c$ are closed, bounded, and convex. In some cases it can be useful to let the penalty function $T_w$ depend non-linearly on the moment discrepancy, and we assume only that $T_w$ is jointly convex in $d_\mu(\mu, \mu_c)$ and $d_\Sigma(\Sigma, \Sigma_c)$.
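For a concrete instance of the set distance $d_\mu(\mu, \mu_c) = \inf_{\nu' \in \mu_c}\|\mu - \nu'\|$: when $\mu_c$ is a box (one admissible closed, bounded, convex set; our illustrative choice, not the thesis's) and the norm is Euclidean, the infimum is attained by componentwise projection:

```python
def dist_to_box(mu, lo, hi):
    """d_mu(mu, mu_c) for a box confidence region mu_c = [lo, hi]
    (componentwise) under the Euclidean norm: project mu onto the box,
    then measure the distance to the projection."""
    proj = [min(max(m, l), h) for m, l, h in zip(mu, lo, hi)]
    return sum((m - p) ** 2 for m, p in zip(mu, proj)) ** 0.5
```

The distance is zero exactly on the confidence region, mirroring the defining property of $T_w$.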
To implement the overall problem in a tractable manner, we now propose the following comprehensive distributionally robust optimization model:
$$(P_p)\qquad \min_{x \in X_c}\; \max_{\gamma,\, \mu,\, \Sigma,\, Q(\cdot\,;\mu,\Sigma)}\; \int G_c(\xi^T x)\, dQ(\xi) - r_w(\gamma)$$
subject to
$$\inf_{\nu' \in \mu_c} \|\mu - \nu'\| \le \gamma_1, \qquad (4.2)$$
$$\inf_{\sigma' \in \Sigma_c} \|\Sigma - \sigma'\| \le \gamma_2, \qquad (4.3)$$
$$0 \le \gamma \le a. \qquad (4.4)$$
In the above model, the penalty function $T_w$ is implemented via an alternative convex penalty function $r_w$ together with the constraints (4.2) and (4.3). The variable $\gamma$ denotes the vector $(\gamma_1, \gamma_2)$; the variables $\gamma_1$ and $\gamma_2$ are introduced to bound the mean and covariance discrepancies. The function $r_w$ is assumed to satisfy the properties of a norm and is used to measure the magnitude of the vector $\gamma$, thereby translating the moment discrepancy into a penalty. The constraint (4.4) places a hard bound on $\gamma$ and models the "physically possible" region $(\mu_s, \Sigma_s)$.
Our last two refinements of the model (P_p) are as follows. First, the objective function
G_c is assumed to be a piecewise-linear convex function
$$
G_c(z) := \max_{k=1,\dots,K}\ a_k \cdot z + b_k.
$$
This general piecewise-linear structure gives decision makers the flexibility to maximize
a piecewise-linear utility function U(z) := min_{k=1,...,K} {c_k · z + d_k} by setting
a_k = −c_k and b_k = −d_k. The structure also extends easily to the popular CVaR risk
measure and to the more general optimized certainty equivalent (OCE) risk measure (see
[Natarajan et al., 2010]).
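To make the duality between G_c and the utility U concrete, here is a minimal Python sketch with made-up coefficients (the values of a, b are illustrative only): minimizing G_c with a_k = −c_k, b_k = −d_k is the same as maximizing U, since G_c(z) = −U(z) pointwise.

```python
import numpy as np

# Illustrative piecewise-linear pair: utility pieces c = (1, 2), d = (0, -1),
# so a_k = -c_k and b_k = -d_k as in the text.
a = np.array([-1.0, -2.0])
b = np.array([0.0, 1.0])

def G_c(z):
    return np.max(a * z + b)       # convex objective G_c(z) = max_k a_k z + b_k

def U(z):
    return np.min(-a * z - b)      # the concave utility U(z) = min_k c_k z + d_k

z = 0.7
val = G_c(z)                       # equals -U(z) for every z
```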
Furthermore, the penalty function rw(γ) is assumed to admit the form
$$
r_w(\gamma) := \sum_{l=1}^{L} w_l\, r_l(\gamma), \qquad w_l \ge 0, \qquad (4.5)
$$
where each r_l(γ) is a convex norm function. In this form, the penalty parameter w is
expanded from a scalar to a vector. This expansion allows a more flexible way to adjust
decision makers’ aversion towards model ambiguity based on particular structures of
(γ₁, γ₂). Thus, the index w in r_w(·) corresponds to the vector (w₁, ..., w_L) on the
right-hand side of (4.5), and w² > w¹ means that w²_l > w¹_l for l = 1, ..., L. The
following example considers how an investor may adjust his or her ambiguity-aversion
attitude using (4.5).
Example 4.2.1. Consider the penalized problem (P_p) with r_w(γ) = w₁ · γ₁ + w₂ · γ₂ +
w₃ · ||γ||₂. When w₃ = 0, the ambiguity of the mean and of the covariance can only be
adjusted independently. For instance, a risk-management-oriented investor may be less
sensitive to ambiguity in the mean and thus tend to increase the value of w₁, while
hesitating to increase w₂ out of concern for unexpected volatility; a return-driven
investor may do the opposite. When w₃ ≠ 0, the ambiguity of the mean and covariance can
be adjusted both independently and jointly. Thus, an investor who believes there is only
a small chance that both the mean and the covariance fall outside the confidence region
can serve this need by increasing w₃.
Remark 4.2.1. Classical penalized approaches based on a relative entropy penalty
function can in fact be viewed as a special instance of our moment-based approach when
the standard normality assumption is made. The relevant discussion is provided in [Li
and Kwon, 2011].
4.3 General Complexity Results
The goal of this section is to obtain a globally optimal solution of the problem (P_p)
in a computationally tractable manner. To aid the discussion, we first define the two
functions
$$
\mathcal{F}(x, \gamma) := \max_{\mu,\, \Sigma,\, Q(\cdot\,;\,\mu,\Sigma)} \left\{ \int G_c(\xi^T x)\, dQ(\xi) \;\middle|\; (4.2) \sim (4.4) \right\}, \qquad (4.6)
$$
$$
S_w(x, \gamma) := \mathcal{F}(x, \gamma) - r_w(\gamma).
$$
Thus, S_w(x, γ) denotes the optimal value of the inner problem of (P_p) for fixed w, x
and γ. The solution method is developed from two observations. First, for fixed w*, x*,
the functional S_{w*}(x*, γ) is concave with respect to γ. This concavity, together with
the convexity of the feasible region X_c (of the allocation vector x), allows us to
reformulate the problem by exchanging min_{x∈X_c} and max_γ; thus, the problem (P_p) can
be reformulated as
$$
(P_\nu)\quad \max_{0 \le \gamma \le a}\ \nu(\gamma) - r_w(\gamma),
$$
where
$$
\nu(\gamma) := \min_{x \in X_c}\ \max_{\mu,\, \Sigma,\, Q(\cdot\,;\,\mu,\Sigma)} \left\{ \int G_c(\xi^T x)\, dQ(\xi) \;\middle|\; (4.2) \sim (4.4) \right\}. \qquad (4.7)
$$
In addition, the concavity observation certifies that a local search method suffices to
find a globally optimal γ*, provided ν(γ) can be evaluated for any γ. Our second
observation is that, for a fixed γ*, there exists a computationally tractable approach to
solve the dual of the inner optimization problem in (4.7), and strong duality holds for
that problem. That is, for fixed γ* the functional ν(γ*) can be evaluated efficiently.
Combining these two observations, a direct search method (see [Kolda et al., 2003]) can
be applied to solve (P_ν). For such a two-dimensional problem with box constraints, a
straightforward approach with global convergence is to examine steps along the coordinate
directions: if a feasible improving direction exists, the iterate is updated; otherwise,
the four possible steps are bisected and examined again. Although a direct search method
may not be as efficient as a derivative-based optimization method, the problem (P_ν) is
small and simple enough to be tractable by such a method. Furthermore, if one is
interested in penalizing only the bound γ** = max(γ₁, γ₂), the problem (P_ν) with the
bounds unified as γ₁ = γ₂ can be solved in polynomial time using a binary search
algorithm.
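The coordinate direct search just described can be sketched as follows. Here `phi` is a stand-in concave test objective: in the actual method each evaluation of ν(γ) − r_w(γ) requires solving an inner optimization problem, so this is only an illustration of the search logic, not of the thesis model itself.

```python
import numpy as np

# Compass/direct-search sketch for (P_nu): maximize a concave phi over the box
# 0 <= gamma <= a by probing the four coordinate directions and halving the
# step when no probe improves the objective.
def direct_search(phi, a, step=0.25, tol=1e-6):
    g = 0.5 * a                                  # start at the box center
    while step > tol:
        improved = False
        for d in (np.array([1.0, 0.0]), np.array([-1.0, 0.0]),
                  np.array([0.0, 1.0]), np.array([0.0, -1.0])):
            trial = np.clip(g + step * d, 0.0, a)    # stay inside the box
            if phi(trial) > phi(g):
                g, improved = trial, True
        if not improved:
            step *= 0.5                          # bisect the step, probe again
    return g

a = np.array([1.0, 1.0])
phi = lambda g: -(g[0] - 0.3) ** 2 - (g[1] - 0.6) ** 2   # concave test objective
g_star = direct_search(phi, a)
```

Because the objective is concave and the feasible set is a box, stalling at a given step size bounds the distance to the maximizer, so halving the step to a small tolerance yields global convergence, as the text asserts.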
Up to now, we have stated the two observations as facts without justifying their
validity. The first observation is proven in Theorem 4.3.1, which hinges on the
following lemma.
Lemma 4.3.1. Given that the distance functions d_µ(·, µ_c), d_Σ(·, Σ_c) are convex, let
Q_α(· ; µ_α, Σ_α) (resp. Q_β(· ; µ_β, Σ_β)) denote a probability measure that satisfies
d_µ(µ_α, µ_c) ≤ α_µ (resp. d_µ(µ_β, µ_c) ≤ β_µ) for some α_µ (resp. β_µ) and
d_Σ(Σ_α, Σ_c) ≤ α_Σ (resp. d_Σ(Σ_β, Σ_c) ≤ β_Σ) for some α_Σ (resp. β_Σ). Then, there
exists a probability measure Q_η(· ; µ_η, Σ_η) = λ′Q_α + (1 − λ′)Q_β that satisfies
d_µ(µ_η, µ_c) ≤ η_µ and d_Σ(Σ_η, Σ_c) ≤ η_Σ, where
$$
\begin{pmatrix} \eta_\mu \\ \eta_\Sigma \end{pmatrix}
= \lambda' \begin{pmatrix} \alpha_\mu \\ \alpha_\Sigma \end{pmatrix}
+ (1 - \lambda') \begin{pmatrix} \beta_\mu \\ \beta_\Sigma \end{pmatrix}
\quad \text{and} \quad 0 \le \lambda' \le 1.
$$
Proof. Given that d_µ (resp. d_Σ) is a convex function, by definition the epigraph
S_µ := {(µ, t) | d_µ(µ, µ_c) ≤ t} (resp. S_Σ := {(Σ, s) | d_Σ(Σ, Σ_c) ≤ s}) is a convex
set. Since (µ_α, α_µ), (µ_β, β_µ) ∈ S_µ and (Σ_α, α_Σ), (Σ_β, β_Σ) ∈ S_Σ, convexity of
these sets implies that for any 0 ≤ λ′₁ ≤ 1, 0 ≤ λ′₂ ≤ 1,
$$
\lambda'_1(\mu_\alpha, \alpha_\mu) + (1 - \lambda'_1)(\mu_\beta, \beta_\mu) \in S_\mu,
$$
$$
\lambda'_2(\Sigma_\alpha, \alpha_\Sigma) + (1 - \lambda'_2)(\Sigma_\beta, \beta_\Sigma) \in S_\Sigma.
$$
Thus, given that
$$
\begin{pmatrix} \eta_\mu \\ \eta_\Sigma \end{pmatrix}
= \lambda' \begin{pmatrix} \alpha_\mu \\ \alpha_\Sigma \end{pmatrix}
+ (1 - \lambda') \begin{pmatrix} \beta_\mu \\ \beta_\Sigma \end{pmatrix}
$$
and 0 ≤ λ′ ≤ 1, setting λ′₁ = λ′₂ = λ′ shows that µ_η := λ′µ_α + (1 − λ′)µ_β and
Σ_η := λ′Σ_α + (1 − λ′)Σ_β satisfy
$$
d_\mu(\mu_\eta, \mu_c) \le \eta_\mu, \qquad d_\Sigma(\Sigma_\eta, \Sigma_c) \le \eta_\Sigma.
$$
Finally, it is straightforward to see that the probability measure λ′Q_α + (1 − λ′)Q_β
indeed satisfies
$$
\mathbb{E}_{\lambda' Q_\alpha + (1-\lambda')Q_\beta}[X] = \lambda'\, \mathbb{E}_{Q_\alpha}[X] + (1 - \lambda')\, \mathbb{E}_{Q_\beta}[X],
$$
where X is a random variable. This completes the proof.
Theorem 4.3.1. Given that the penalty function r_w(γ) is convex in γ and that w*, x* are
fixed, the functional S_{w*}(x*, γ) is concave with respect to γ.
Proof. Since −r_w(γ) is concave, it suffices to show that for fixed x* the function
F(x*, γ) in (4.6) is concave with respect to γ. Consider the functional
λ′F(x*, γ_{α′}) + (1 − λ′)F(x*, γ_{β′}), and let
$$
Q'_{\alpha'(\beta')} := \arg\max_{Q \in \{Q(\cdot\,;\,\mu,\Sigma)\,:\, [d_\mu\ d_\Sigma]^T \le \gamma_{\alpha'(\beta')}\}} \int G_c(\xi^T x^*)\, dQ(\xi),
$$
where we abbreviate d_µ(µ, µ_c) and d_Σ(Σ, Σ_c) as d_µ and d_Σ. Then,
$$
\lambda'\, \mathcal{F}(x^*, \gamma_{\alpha'}) + (1 - \lambda')\, \mathcal{F}(x^*, \gamma_{\beta'})
= \int G_c(\xi^T x^*)\, d\big(\lambda' Q'_{\alpha'} + (1 - \lambda') Q'_{\beta'}\big)(\xi). \qquad (4.8)
$$
Lemma 4.3.1 gives that there exists Q′_{η′} ∈ {Q(· ; µ, Σ) : [d_µ d_Σ]ᵀ ≤ λ′γ_{α′} + (1 − λ′)γ_{β′}}
such that
$$
Q'_{\eta'} = \lambda' Q'_{\alpha'} + (1 - \lambda') Q'_{\beta'}.
$$
Suppose that
$$
Q''_{\eta'} = \arg\max_{Q \in \{Q(\cdot\,;\,\mu,\Sigma)\,:\, [d_\mu\ d_\Sigma]^T \le \lambda'\gamma_{\alpha'} + (1-\lambda')\gamma_{\beta'}\}} \int G_c(\xi^T x^*)\, dQ(\xi).
$$
It follows that
$$
(4.8) = \int G_c(\xi^T x^*)\, dQ'_{\eta'}(\xi) \le \int G_c(\xi^T x^*)\, dQ''_{\eta'}(\xi)
= \mathcal{F}\big(x^*,\ \lambda'\gamma_{\alpha'} + (1 - \lambda')\gamma_{\beta'}\big).
$$
This shows the concavity of F(x*, γ) with respect to γ.
Next, we validate the second observation, namely that there exists a computationally
tractable method to evaluate ν(γ*) for each given γ*. We resort to an ellipsoid method,
which is applicable to a general class of convex optimization problems based on the
equivalence of convex set separation and convex optimization. Specifically, Grotschel
et al. (1981) showed that for a convex optimization problem with a linear objective
function and a convex feasible region C, given that the set of optimal solutions is
nonempty, the problem can be solved by an ellipsoid method in polynomial time if and
only if the following procedure can be implemented in polynomial time: for an arbitrary
point c, check whether c ∈ C and, if not, generate a hyperplane that separates c from C.
It should be noted that the application of the ellipsoid method must be handled
with care. This is because additional complexity associated with distance functions is
introduced, and careful analysis is needed to verify the existence of an optimal solution
and the applicability of the ellipsoid method to each embedded optimization problem.
Theorem 4.3.2 below shows the tractability of evaluating ν(γ∗). The theorem requires
only the following mild assumptions:
• The set Xc (resp. µc, Σc) is nonempty, convex and compact (closed and bounded).
• Let N(·) := || · || denote the chosen norm in the distance functions d_µ, d_Σ. The
evaluation of N(·) and of a subgradient ∇N(·) can be provided in polynomial time.
• There exists an oracle that, for any x (resp. ν, σ), either verifies feasibility with
respect to the set X_c (resp. µ_c, Σ_c) or provides a hyperplane separating x (resp.
ν, σ) from the feasible set in polynomial time.
Theorem 4.3.2. For any given γ∗, under the above assumptions, the optimal value of
ν(γ∗) is finite and the evaluation of ν(γ∗) can be done in polynomial time.
Proof. Given that G_c(z) := max_{k=1,...,K} {a_k · z + b_k}, duality theory for infinite
linear programming allows the optimization problem associated with ν(γ*) in (4.7) to be
reformulated as follows (cf. Theorem 2.1 in [Natarajan et al., 2010]):
$$
\nu(\gamma^*) := \inf_{x \in X_c,\, r,\, q,\, y,\, s,\, t \ge 0} \quad r + q \qquad (4.9)
$$
$$
\text{subject to}\quad r \ge a_k(\mu^T x) + b_k + a_k^2 y + a_k s \quad \forall\, \mu \in S_\mu,\ \forall\, k = 1, \dots, K,
$$
$$
4yq \ge t^2 + s^2, \qquad y \ge 0,
$$
$$
t^2 \ge x^T \Sigma x \quad \forall\, \Sigma \in S_\Sigma,
$$
where S_µ := {µ | d_µ(µ, µ_c) ≤ γ*₁} and S_Σ := {Σ ⪰ 0 | d_Σ(Σ, Σ_c) ≤ γ*₂}. We now show
that a separation approach can be applied to the above problem in polynomial time.
First, the hyperplanes t ≥ 0, y ≥ 0 can be generated. Then, reformulating the second and
third constraints as
$$
g_2(t, s, y, q) := \sqrt{t^2 + s^2 + (y - q)^2} - (y + q) \le 0, \qquad \sqrt{x^T \Sigma x} - t \le 0,
$$
shows that the feasible set of (x, r, q, y, s, t) is convex for any µ ∈ S_µ and Σ ∈ S_Σ.
For the second constraint, it is straightforward to verify whether an assignment
v* := (x*, r*, q*, y*, s*, t*) is feasible, i.e., g₂(t*, s*, y*, q*) ≤ 0, or else to
generate a valid separating hyperplane based on the convexity of the feasible set:
$$
\nabla_t g_2(v^*)(t - t^*) + \nabla_s g_2(v^*)(s - s^*) + \nabla_y g_2(v^*)(y - y^*) + \nabla_q g_2(v^*)(q - q^*) + g_2(v^*) \le 0.
$$
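The reformulation of the rotated-cone constraint through g₂, and the gradient used in the cutting plane, can be checked numerically. The sketch below (arbitrary test points, chosen only for illustration) relies on the identity (y + q)² − (y − q)² = 4yq:

```python
import numpy as np

# g2(t,s,y,q) <= 0 is equivalent to 4yq >= t^2 + s^2 with y, q >= 0, because
# (y + q)^2 - (y - q)^2 = 4yq. g2 is convex, so an infeasible point v* is cut
# off by the linearization g2(v*) + grad g2(v*)^T (v - v*) <= 0.
def g2(t, s, y, q):
    return np.sqrt(t**2 + s**2 + (y - q)**2) - (y + q)

def grad_g2(t, s, y, q):
    r = np.sqrt(t**2 + s**2 + (y - q)**2)
    return np.array([t / r, s / r, (y - q) / r - 1.0, (q - y) / r - 1.0])

feasible = g2(1.0, 1.0, 1.0, 1.0) <= 0      # 4*1*1 >= 1 + 1: feasible point
cut_needed = g2(2.0, 2.0, 0.5, 0.5) > 0     # 4*0.25 < 8: infeasible point
coeffs = grad_g2(2.0, 2.0, 0.5, 0.5)        # coefficients of the cutting plane
```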
For the first constraint, feasibility can be checked for each k-th constraint by solving
the optimization problem
$$
\phi_k := \sup_{\mu \in S_\mu}\ a_k(\mu^T x^*) + b_k + a_k^2 y^* + a_k s^* - r^*.
$$
The above problem can be equivalently reformulated as
$$
\sup_{\mu,\, \nu}\ \left\{ a_k(\mu^T x^*) + b_k + a_k^2 y^* + a_k s^* - r^* \ :\ \|\mu - \nu\| \le \gamma_1^*,\ \nu \in \mu_c \right\} \qquad (4.10)
$$
by dropping the inf_ν in the original distance function. Under the assumptions that the
chosen norm || · || and its subgradient can be evaluated in polynomial time and that an
oracle exists for µ_c, we can apply the oracle to an infeasible ν* ∉ µ_c, and/or generate
for an infeasible (µ*, ν*) the hyperplane
$$
\nabla_\mu \mathcal{N}(\mu^*, \nu^*)(\mu - \mu^*) + \nabla_\nu \mathcal{N}(\mu^*, \nu^*)(\nu - \nu^*) + \mathcal{N}(\mu^*, \nu^*) \le \gamma_1^*
$$
in polynomial time. Verification of feasibility is straightforward. In addition, since
the set µ_c is compact and γ*₁ is finite, the set of optimal solutions of (4.10) is
nonempty, given that at least one feasible solution {µ, ν | µ = ν, ν ∈ µ_c} exists.
Thus, φ_k can be evaluated in polynomial time. Then, if φ_k ≤ 0, feasibility of
(r*, x*, y*, s*) is verified; if φ_k > 0 for some optimal µ*, we generate the hyperplane
$$
a_k(\mu^{*T} x) + a_k^2 y + a_k s - r \le -b_k.
$$
Similarly, for the third constraint, feasibility can be checked by solving the
optimization problem
$$
\rho := \sup_{\Sigma \in S_\Sigma}\ (x^*)^T \Sigma x^*. \qquad (4.11)
$$
The polynomial solvability of (4.11) and the nonemptiness of its set of optimal
solutions can be justified as for the first constraint, except for the constraint Σ ⪰ 0.
To verify feasibility of Σ ⪰ 0, a polynomial QR algorithm can be applied; if any
eigenvalue is negative, one may use the lowest eigenvalue to construct a separating
hyperplane. As a result, if ρ ≤ (t*)², feasibility of (t*, x*) is verified; if ρ > (t*)²
for some optimal Σ*, the hyperplane
$$
(x^*)^T \Sigma^* x - \sqrt{(x^*)^T \Sigma^* x^*}\; t \le 0
$$
can be generated.
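The eigenvalue-based separation step for Σ ⪰ 0 can be sketched as follows; numpy's symmetric eigendecomposition stands in for the QR algorithm mentioned above, and the cut uses the eigenvector u of the most negative eigenvalue, since Σ′ ↦ uᵀΣ′u is nonnegative on the PSD cone but negative at the infeasible Σ:

```python
import numpy as np

# Separation oracle sketch for the constraint Sigma >= 0 (PSD).
def psd_separation(Sigma):
    vals, vecs = np.linalg.eigh(Sigma)       # eigenvalues in ascending order
    if vals[0] >= 0:
        return None                          # feasible: Sigma is PSD, no cut
    u = vecs[:, 0]                           # eigenvector of lowest eigenvalue
    return np.outer(u, u)                    # cut: <uu^T, Sigma'> >= 0 separates

Sigma_bad = np.array([[1.0, 2.0], [2.0, 1.0]])   # eigenvalues are -1 and 3
H = psd_separation(Sigma_bad)
violation = np.sum(H * Sigma_bad)                # Frobenius inner product < 0
```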
Finally, to see that the optimal value of ν(γ*) is finite, it suffices to show that for
any x ∈ X_c the optimal value of problem (4.9) is finite. Consider the original
formulation of ν(γ) in (4.7). Given a feasible pair µ, Σ, one can always construct a
probability measure Q, e.g., a normal distribution, having µ and Σ as its mean and
covariance. This implies that ν(γ*) is bounded below and thus its optimal value is
finite. Since the sets X_c, S_µ, S_Σ are nonempty and compact, the feasible set of (4.9)
is easily shown to be nonempty, and we conclude that the set of optimal solutions of
(4.9) is nonempty. Hence, given that the separation problem can be solved in polynomial
time, for any fixed γ* the evaluation of ν(γ*) can be done in polynomial time.
4.4 Connection with Classical Minimax Approaches

We start this section with the following observation about the problem (P_p).

Theorem 4.4.1. Suppose that (x_{w_i}, γ_{w_i}) denotes the optimal solution of the
problem (P_p) associated with a penalty vector w_i. Given an increasing sequence of
penalty vectors {w_i}_{i=1}^∞, γ_{w_i} is monotonically decreasing for a fixed x*_{w_i}.
Furthermore, the sequence F(x_{w_i}, γ_{w_i}) is also monotonically decreasing, where
$$
\mathcal{F}(x, \gamma) := \max_{\mu,\, \Sigma,\, Q(\cdot\,;\,\mu,\Sigma)} \left\{ \int G_c(\xi^T x)\, dQ(\xi) \;\middle|\; (4.2) \sim (4.4) \right\}.
$$
Proof. To show that γ_{w_i} is monotonically decreasing as w_i increases, it suffices to
consider increasing a single component w_l^i. Suppose that w_l^1 and w_l^2 are fixed
with w_l^1 < w_l^2. Let γ_{w^1} be the optimal γ with respect to w_l^1 and γ_{w^2} the
optimal γ with respect to w_l^2. By definition, the following inequalities hold:
$$
\mathcal{F}(x^*, \gamma_{w^1}) - w_l^1 \cdot r_l(\gamma_{w^1}) \ \ge\ \mathcal{F}(x^*, \gamma_{w^2}) - w_l^1 \cdot r_l(\gamma_{w^2}),
$$
$$
\mathcal{F}(x^*, \gamma_{w^2}) - w_l^2 \cdot r_l(\gamma_{w^2}) \ \ge\ \mathcal{F}(x^*, \gamma_{w^1}) - w_l^2 \cdot r_l(\gamma_{w^1}).
$$
Adding the first inequality to the second yields
(w_l^1 − w_l^2)(r_l(γ_{w^2}) − r_l(γ_{w^1})) ≥ 0, and w_l^1 < w_l^2 then implies
r_l(γ_{w^1}) ≥ r_l(γ_{w^2}). Since r_l(γ) ≥ 0 and r_l(γ) is non-decreasing, we obtain
γ_{w^2} ≤ γ_{w^1}. For two vectors w^1 < w^2, one can increase one entry of w^1 at a
time until w^2 is reached; since γ is monotonically decreasing at each step, γ_{w^2} ≤
γ_{w^1} still holds for w^1 < w^2. This also implies that F(x*, γ_{w^2}) ≤ F(x*, γ_{w^1})
for fixed x*.

Now consider the relation between F(x_{w^1}, γ_{w^1}) and F(x_{w^2}, γ_{w^2}), where
x_{w^1} and x_{w^2} are the respective optimal solutions for w^1 and w^2. By the above
result, the inequality F(x_{w^1}, γ_{w^2}) ≤ F(x_{w^1}, γ_{w^1}) holds. In addition,
since x_{w^2} is the minimizer with respect to w^2, the inequality F(x_{w^2}, γ_{w^2}) ≤
F(x_{w^1}, γ_{w^2}) must hold as well. These two inequalities imply that
F(x_{w^2}, γ_{w^2}) ≤ F(x_{w^1}, γ_{w^1}).
The above theorem indicates that, as decision makers gain more confidence in their prior
reference models, by increasing the penalty parameter w they can always improve the
worst-case performance of the resulting optimal solutions. This improvement results
directly from the decrease of the optimal bounds γ on the mean-covariance discrepancy.
It also reveals a close relation between our comprehensive robust optimization framework
and the classical worst-case (minimax) approach. In Theorem 4.4.2, we formalize this
relation by proving that the optimal solution generated from our framework implicitly
corresponds to the optimal decision generated using the following minimax formulation:
$$
(P_c)\quad \min_{x \in X_c}\ \max_{\gamma,\, \mu,\, \Sigma,\, Q(\cdot\,;\,\mu,\Sigma)} \left\{ \int G_c(\xi^T x)\, dQ(\xi) \;\middle|\; (4.2) \sim (4.4) \right\}
$$
$$
\text{subject to}\quad r_l(\gamma) \le b_l, \qquad l = 1, \dots, L,
$$
where b_l parameterizes the constraint.
Theorem 4.4.2. The following two problems provide an identical set of optimal solutions.
That is, if (x*, γ*) is an optimal solution of the first problem for some w*_l,
l = 1, ..., L, then there exist b*_l, l = 1, ..., L, such that (x*, γ*) is also optimal
for the second problem, and vice versa:
$$
\min_{x \in X_c}\ \max_{\gamma \le a}\ \mathcal{F}(x, \gamma) - \sum_{l=1}^{L} w_l\, r_l(\gamma), \qquad w_l \ge 0,
$$
$$
\min_{x \in X_c}\ \max_{\gamma \le a}\ \left\{ \mathcal{F}(x, \gamma) \;\middle|\; r_l(\gamma) \le b_l,\ l = 1, \dots, L \right\},
$$
where F(x, γ) := max_{µ,Σ,Q(· ; µ,Σ)} { ∫ G_c(ξᵀx) dQ(ξ) | (4.2) ∼ (4.4) }.
Proof. It suffices to prove that, for fixed x*, if γ* is optimal for the inner
optimization problem of the first problem with parameter w*, then there exists a b* for
the second problem such that γ* is also optimal for its inner optimization problem given
x*. By the optimality conditions of convex optimization problems, for γ* to be optimal
for the first problem it is required that
$$
\mathcal{F}(x^*, \gamma^*) - \sum_{l=1}^{L} w_l\, r_l(\gamma^*) - \sum_j \lambda_j(\gamma_j^* - a_j)
\ \ge\ \mathcal{F}(x^*, \gamma) - \sum_{l=1}^{L} w_l\, r_l(\gamma) - \sum_j \lambda_j(\gamma_j - a_j), \quad \forall \gamma,
$$
with λ_j(γ*_j − a_j) = 0 and λ_j ≥ 0. Similarly, for γ* to be optimal for the second
problem, it is required that
$$
\mathcal{F}(x^*, \gamma^*) - \sum_{l=1}^{L} \rho_l\big(r_l(\gamma^*) - b_l\big) - \sum_j v_j(\gamma_j^* - a_j)
\ \ge\ \mathcal{F}(x^*, \gamma) - \sum_{l=1}^{L} \rho_l\big(r_l(\gamma) - b_l\big) - \sum_j v_j(\gamma_j - a_j), \quad \forall \gamma,
$$
with ρ_l(r_l(γ*) − b_l) = 0, v_j(γ*_j − a_j) = 0, and ρ_l, v_j ≥ 0. This optimality
condition is equivalent to
$$
\mathcal{F}(x^*, \gamma^*) - \sum_{l=1}^{L} \rho_l\, r_l(\gamma^*) - \sum_j v_j(\gamma_j^* - a_j)
\ \ge\ \mathcal{F}(x^*, \gamma) - \sum_{l=1}^{L} \rho_l\, r_l(\gamma) - \sum_j v_j(\gamma_j - a_j), \quad \forall \gamma,
$$
with ρ_l(r_l(γ*) − b_l) = 0, v_j(γ*_j − a_j) = 0, ρ_l, v_j ≥ 0. Then, if (γ*, λ_j) is a
solution of the first system, (γ*, λ_j) is also a solution of the second system with
b_l = r_l(γ*), v_j = λ_j, and ρ_l = w_l. Conversely, if (γ*, ρ_l, v_j) is a solution of
the second system, then (γ*, ρ_l, v_j) is also a solution of the first system with
w_l = ρ_l and λ_j = v_j.
Intuitively, the constraint form of the penalty function can be interpreted as an
additional constraint on the mean-covariance discrepancy, which provides greater
flexibility in modeling ambiguity. The practical value of this additional flexibility is
illustrated in Section 4.5.1. Finally, the above result also supports the view that the
penalty construction within our framework “endogenously” achieves bounds for hedging
against extreme moment uncertainty: the bounds are determined according to the
performance deterioration that changing them would cause.
4.5 Semidefinite Optimization Reformulations

In this section, we reformulate the problem (P_p) as a semidefinite programming problem
(SDP) by further assuming that the confidence region (µ_c, Σ_c) is semidefinite
representable (SDr). In addition, we assume that both the norm used in the discrepancy
measurement and the penalty functions are SDr; that is, the epigraph of each function is
an SDr set. Based on the SDP reformulations, further efficiency in solving (P_p) can be
gained using polynomial-time interior-point methods. A wide class of SDr functions can
be found in [Ben-Tal and Nemirovski, 2001].

Throughout the rest of this section, the binary operator • denotes the Frobenius inner
product. We first consider the general case in which the confidence region µ_c (resp.
Σ_c) is an uncountable but bounded set, parameterized by a sampled mean vector µ₀ (resp.
a sampled covariance Σ₀). We further consider the matrix Σ as the centered second-moment
matrix, i.e., Σ := E[(ξ − µ₀)(ξ − µ₀)ᵀ], and assume that Σ ≻ 0. This overall setting
allows one to exploit the sampled mean and covariance information. In Theorem 4.5.1
below, we provide a fairly general method for generating SDP reformulations of the
problem (P_p): we first maximize with respect to Q(· ; µ, Σ) and then with respect to
(µ, Σ) within the feasible region. This strategy yields a flexible SDP reformulation,
which will be extended to other practical settings later.
Theorem 4.5.1. Assume that the confidence region (µ_c, Σ_c), the norm measurement
|| · ||, and the penalty functions r_l(γ) are SDr, and suppose that the confidence
region is uncountable. Then, the SDP reformulation of the problem (P_p) can be generated
from the following problem, which is equivalent to (P_p):
$$
\min_{x \in X_c,\, \lambda,\, \Lambda,\, r,\, s} \quad r + s - \Lambda \bullet \mu_0\mu_0^T
$$
$$
\text{subject to}\quad (P_s) \le r, \qquad
\begin{pmatrix} \Lambda & \tfrac{1}{2}(\lambda - 2\Lambda\mu_0 + a_k x) \\[2pt] \tfrac{1}{2}(\lambda - 2\Lambda\mu_0 + a_k x)^T & s + b_k \end{pmatrix} \succeq 0, \quad k = 1, \dots, K,
$$
where (P_s) denotes the optimal value of the following problem:
$$
\max_{0 \le \gamma \le a,\, t,\, \mu,\, \Sigma,\, \nu,\, \sigma} \quad \lambda^T\mu + \Lambda \bullet \Sigma - w^T t
$$
$$
\text{subject to}\quad \|\mu - \nu\| \le \gamma_1,\ \ \|\Sigma - \sigma\| \le \gamma_2,\ \ \nu \in \mu_c,\ \ \sigma \in \Sigma_c,\ \ r_l(\gamma) \le t_l,\ \ l = 1, \dots, L.
$$
Proof. To ease the exposition of the proof, we first define two sets S₁(γ₁) and S₂(γ₂):
$$
S_1(\gamma_1) := \Big\{\mu' \ \Big|\ \inf_{\nu \in \mu_c} \|\mu' - \nu\| \le \gamma_1\Big\}, \qquad
S_2(\gamma_2) := \Big\{\Sigma' \ \Big|\ \inf_{\sigma \in \Sigma_c} \|\Sigma' - \sigma\| \le \gamma_2\Big\}.
$$
Given that µ := E[ξ] and Σ := E[(ξ − µ₀)(ξ − µ₀)ᵀ], the problem (P_p) can be reformulated
as the following semi-infinite linear problem:
$$
\min_{x \in X_c}\ \max_{0 \le \gamma \le a,\, \mu \in S_1(\gamma_1),\, \Sigma \in S_2(\gamma_2)}\ \max_{Q}\ \int G_c(\xi^T x)\, dQ(\xi) - r_w(\gamma)
$$
$$
\text{s.t.}\quad \int dQ(\xi) = 1, \qquad \int \xi\, dQ(\xi) = \mu,
$$
$$
\int \big(\xi\xi^T - \xi\mu_0^T - \mu_0\xi^T\big)\, dQ(\xi) = \Sigma - \mu_0\mu_0^T.
$$
Using Lemma 2.2.1, we thus have
$$
\min_{x \in X_c}\ \max_{0 \le \gamma \le a,\, \mu \in S_1(\gamma_1),\, \Sigma \in S_2(\gamma_2)}\ \min_{\lambda,\, \Lambda}\ \max_{\xi}\
\Big\{ -r_w(\gamma) + G_c(\xi^T x) + \lambda^T(\mu - \xi) + \Lambda \bullet \big(\Sigma - \mu_0\mu_0^T - \xi\xi^T + \xi\mu_0^T + \mu_0\xi^T\big) \Big\}.
$$
Since Σ ≻ 0, the interior condition holds and thus strong duality holds for the above
dual problem. Note that the inner maximization with respect to ξ can be written in the
form max_{k=1,...,K} max_ξ {−ξᵀΛξ + p_kᵀξ + q_k} for some p_k and q_k; hence, for the
problem to have a finite optimal value, Λ ⪰ 0 must hold. Given that the operator max_ξ
preserves convexity, the overall problem is convex in (λ, Λ) and concave in (γ, µ, Σ).
Applying Sion's minimax theorem, we can exchange
max_{0≤γ≤a, µ∈S₁(γ₁), Σ∈S₂(γ₂)} and min_{λ, Λ⪰0} to obtain an equivalent problem. After
some algebraic manipulation and the addition of variables r, s, the problem can be
reformulated as
$$
\min_{x \in X_c,\, \lambda,\, \Lambda,\, r,\, s} \quad r + s - \Lambda \bullet \mu_0\mu_0^T
$$
$$
\text{subject to}\quad \max_{0 \le \gamma \le a,\, \mu \in S_1(\gamma_1),\, \Sigma \in S_2(\gamma_2)}\ \lambda^T\mu + \Lambda \bullet \Sigma - r_w(\gamma) \ \le\ r,
$$
$$
G_c(\xi^T x) + \xi^T(-\lambda + 2\Lambda\mu_0) - \Lambda \bullet \xi\xi^T \le s \quad \forall \xi \in \mathbb{R}^n, \qquad \Lambda \succeq 0.
$$
The second constraint, expanded via G_c(ξᵀx) into
$$
\Lambda \bullet \xi\xi^T + \xi^T(\lambda - 2\Lambda\mu_0 + a_k x) + s + b_k \ge 0 \quad \forall \xi \in \mathbb{R}^n,\ k = 1, \dots, K,
$$
can be reformulated as
$$
\begin{pmatrix} \Lambda & \tfrac{1}{2}(\lambda - 2\Lambda\mu_0 + a_k x) \\[2pt] \tfrac{1}{2}(\lambda - 2\Lambda\mu_0 + a_k x)^T & s + b_k \end{pmatrix} \succeq 0, \quad k = 1, \dots, K,
$$
using the Schur complement. For the first constraint, the left-hand side can be
re-expressed as
$$
\max_{0 \le \gamma \le a,\, t,\, \mu,\, \Sigma}\ \Big\{ \lambda^T\mu + \Lambda \bullet \Sigma - w^T t \ :\ \inf_{\nu \in \mu_c}\|\mu - \nu\| \le \gamma_1,\ \inf_{\sigma \in \Sigma_c}\|\Sigma - \sigma\| \le \gamma_2,\ r_l(\gamma) \le t_l,\ l = 1, \dots, L \Big\},
$$
which is equivalent to
$$
\max_{0 \le \gamma \le a,\, t,\, \mu,\, \Sigma,\, \nu,\, \sigma} \quad \lambda^T\mu + \Lambda \bullet \Sigma - w^T t
$$
$$
\text{subject to}\quad \|\mu - \nu\| \le \gamma_1,\ \|\Sigma - \sigma\| \le \gamma_2,\ \nu \in \mu_c,\ \sigma \in \Sigma_c,\ r_l(\gamma) \le t_l,\ l = 1, \dots, L.
$$
Given that there exists (γ*, t*, µ*, Σ*, ν*, σ*) satisfying the Slater condition, which
is easily verified, applying strong duality theory for SDP and dropping the minimization
operator of the dual shows that the constraint is SDr. Thus, the overall problem can be
reformulated as a semidefinite programming problem.
Remark 4.5.1. The dual problem of (P_s) has a particularly useful structure: the penalty
parameter w appears in constraints of the form (· · ·) ≥ −w, where (· · ·) denotes terms
that depend linearly on the dual variables. This dependency allows the parameter w to be
treated as an additional variable. Thus, if τ is the upper bound of the original
objective function, one may replace the objective function by τ + κ(w), where κ is a
user-defined function. A similar discussion can be found in [Ben-Tal et al., 2006].
Remark 4.5.2. When the confidence region µ_c (resp. Σ_c) is a singleton, the
reformulation simplifies. In that case, the distance measurement (inf || · ||) reduces
to the norm measurement (|| · ||), and the constraints (4.2) and (4.3) can be formulated
directly as semi-infinite conic constraints. Lemma 2.2.1 can be extended to handle
problems with semi-infinite conic constraints (cf. Shapiro [2001]), and the rest of the
reformulation follows closely the proof of Theorem 4.5.1.
The focus so far has been on deriving a general class of efficiently solvable SDP
formulations for the problem. Except for the SDr property, no additional structure has
been imposed on the norm measurement || · ||, the confidence region (µ_c, Σ_c), or the
penalty function r_w. One natural choice of || · || for the discrepancies d_µ(µ, µ_c)
and d_Σ(Σ, Σ_c) is suggested by the connection between moment discrepancy and
KL-divergence, namely
$$
(\mu - \nu)^T \sigma^{-1} (\mu - \nu) \le \gamma_1, \qquad (4.12)
$$
$$
-\gamma_2 \bar{\Sigma} \preceq \Sigma - \sigma \preceq \gamma_2 \bar{\Sigma}, \qquad (4.13)
$$
i.e., the ellipsoidal norm || · ||_{σ⁻¹} (in (4.12)) and the spectral norm of a matrix
(in (4.13)), where the fixed matrix Σ̄ ≻ 0. For defining a confidence region, Delage and
Ye (2010) consider the mean and covariance to be bounded as follows:
$$
(\nu - \mu_0)^T \Sigma_0^{-1} (\nu - \mu_0) \le \rho_1, \qquad (4.14)
$$
$$
\theta_3 \Sigma_0 \preceq \sigma \preceq \theta_2 \Sigma_0, \qquad (4.15)
$$
where σ = E[(ξ − µ₀)(ξ − µ₀)ᵀ]. This structure coincides with our choice of discrepancy
measurements by setting θ₂ := (1 + ρ₂), θ₃ := (1 − ρ₂). Thus, combining (4.12), (4.13),
(4.14), and (4.15) provides a coherent way to specialize the result of Theorem 4.5.1.
We provide an SDP reformulation of (P_p) for the penalty function
r_w(γ) := w₁γ₁ + w₂γ₂ + w₃||γ||₂ in the following Corollary 4.5.1.
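A small numerical sketch of these discrepancy measurements, with assumed sample data (µ₀, Σ₀, ρ₁, ρ₂ are made-up values, chosen only to exercise the inequalities):

```python
import numpy as np

# Assumed data: sampled moments and a candidate (mu, sigma) pair.
mu0 = np.array([0.0, 0.0])
Sigma0 = np.eye(2)
mu = np.array([0.3, -0.1])
sigma = 1.2 * np.eye(2)          # a candidate covariance from the region (4.15)

# Ellipsoidal-norm mean discrepancy as in (4.12).
gamma1 = (mu - mu0) @ np.linalg.inv(sigma) @ (mu - mu0)

# Delage-Ye style covariance band (4.15) with theta2 = 1 + rho2, theta3 = 1 - rho2:
# both theta3*Sigma0 <= sigma and sigma <= theta2*Sigma0 hold iff the differences
# have nonnegative eigenvalues.
rho2 = 0.5
theta2, theta3 = 1 + rho2, 1 - rho2
in_band = (np.all(np.linalg.eigvalsh(sigma - theta3 * Sigma0) >= 0) and
           np.all(np.linalg.eigvalsh(theta2 * Sigma0 - sigma) >= 0))
```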
Corollary 4.5.1. Given that the penalty function is defined as r_w(γ) := w₁γ₁ + w₂γ₂ +
w₃||γ||₂, and that the constraints associated with the variables µ, Σ, ν, σ in (P_s)
(Theorem 4.5.1) are replaced by (4.12), (4.13), (4.14) and (4.15), the problem (P_p) can
be reformulated as
$$
(P_J)\quad \min_{x \in X_c,\; \lambda,\; \Lambda,\; r,\; s,\; y_1, y_2,\; \zeta_1, \zeta_2,\; S^\flat_1, \dots, S^\flat_4,\; l_1, l_2} \quad r + s - \Lambda \bullet \mu_0\mu_0^T
$$
subject to
$$
a_1 y_1 + a_2 y_2 + \rho_1 \zeta_2 + \mu_0^T\lambda + \Sigma_0 \bullet S^\flat_2 - \theta_3\,(\Sigma_0 \bullet S^\flat_3) + \theta_2\,(\Sigma_0 \bullet S^\flat_4) \le r,
$$
$$
l_1 + \zeta_1 \le y_1 + w_1, \qquad (4.16)
$$
$$
l_2 + \bar{\Sigma} \bullet \Lambda \le y_2 + w_2, \qquad (4.17)
$$
$$
\sqrt{l_1^2 + l_2^2} \le w_3,
$$
$$
S^\flat_4 - S^\flat_1 - S^\flat_3 - \Lambda \succeq 0, \qquad
\begin{pmatrix} S^\flat_1 & -\tfrac{\lambda}{2} \\[2pt] -\tfrac{\lambda^T}{2} & \zeta_1 \end{pmatrix} \succeq 0, \qquad
\begin{pmatrix} S^\flat_2 & -\tfrac{\lambda}{2} \\[2pt] -\tfrac{\lambda^T}{2} & \zeta_2 \end{pmatrix} \succeq 0,
$$
$$
S^\flat_3 \succeq 0, \qquad S^\flat_4 \succeq 0, \qquad y_1, y_2 \ge 0,
$$
and
$$
\begin{pmatrix} \Lambda & \tfrac{1}{2}(\lambda - 2\Lambda\mu_0 + a_k x) \\[2pt] \tfrac{1}{2}(\lambda - 2\Lambda\mu_0 + a_k x)^T & s + b_k \end{pmatrix} \succeq 0, \qquad k = 1, \dots, K,
$$
where a = (a₁, a₂), and the constraint √(l₁² + l₂²) ≤ w₃ is also SDr.
Proof. The objective function and the first constraint follow from Theorem 4.5.1. Only
the sub-problem (P_s) of Theorem 4.5.1 needs to be further reformulated with respect to
the penalty function r_w(γ) := w₁γ₁ + w₂γ₂ + w₃||γ||₂ and the constraints (4.12), (4.13),
(4.14) and (4.15); that is, we need to reformulate the problem
$$
\max_{\gamma,\, t,\, \mu,\, \Sigma,\, \nu,\, \sigma} \quad \lambda^T\mu + \Lambda \bullet \Sigma - w_1\gamma_1 - w_2\gamma_2 - w_3 t \ \le\ r \qquad (4.18)
$$
$$
\text{subject to}\quad (\mu - \nu)^T \sigma^{-1} (\mu - \nu) \le \gamma_1, \qquad -\gamma_2\bar{\Sigma} \preceq \Sigma - \sigma \preceq \gamma_2\bar{\Sigma},
$$
$$
(\nu - \mu_0)^T \Sigma_0^{-1} (\nu - \mu_0) \le \rho_1, \qquad \theta_3\Sigma_0 \preceq \sigma \preceq \theta_2\Sigma_0,
$$
$$
\|\gamma\|_2 \le t, \qquad 0 \le \gamma \le a.
$$
We can first replace the variable Σ by (σ + γ₂Σ̄), since the optimal value can always be
attained by such a replacement. To see why, assume instead that the optimal solution
(γ*, t*, µ*, Σ*, ν*, σ*) satisfies Σ* ≺ σ* + γ*₂Σ̄, and let c denote the corresponding
optimal value. Then Λ ⪰ 0 together with the constraint Σ − σ ⪯ γ₂Σ̄ implies that
$$
c \le \lambda^T\mu^* + \Lambda \bullet (\sigma^* + \gamma_2^*\bar{\Sigma}) - w_1\gamma_1^* - w_2\gamma_2^* - w_3 t^*.
$$
Hence the alternative solution (γ*, t*, µ*, Σ**, ν*, σ*), where Σ** = σ* + γ*₂Σ̄, must
also be optimal.
Then, reformulating the constraints involving µ, ν as SDP constraints via the Schur
complement lemma and the constraint ||γ||₂ ≤ t as an SOCP constraint (see [Ben-Tal and
Nemirovski, 2001]), the problem can be rewritten as
$$
\max_{\gamma,\, t,\, \mu,\, \nu,\, \sigma} \quad \lambda^T\mu + \Lambda \bullet (\sigma + \gamma_2\bar{\Sigma}) - w_1\gamma_1 - w_2\gamma_2 - w_3 t
$$
$$
\text{subject to}\quad
\begin{pmatrix} \sigma & \mu - \nu \\ (\mu - \nu)^T & \gamma_1 \end{pmatrix} \succeq 0, \qquad
\begin{pmatrix} \Sigma_0 & \nu - \mu_0 \\ (\nu - \mu_0)^T & \rho_1 \end{pmatrix} \succeq 0,
$$
$$
\theta_3\Sigma_0 \preceq \sigma \preceq \theta_2\Sigma_0, \qquad
\left\| \begin{pmatrix} \gamma_1 \\ \gamma_2 \end{pmatrix} \right\|_2 \le t, \qquad
0 \le \gamma_1 \le a_1, \quad 0 \le \gamma_2 \le a_2.
$$
As a result, the dual problem can be derived using conic duality theory, which yields
the problem (P_J).

Numerical examples of (P_J) are provided in a later section, and its practical value is
verified in a real-world application.
Remark 4.5.3. It is worth noting that, when solving the reformulated problem (P_J), the
dual optimal solutions associated with the constraints (4.16) and (4.17) are exactly the
optimal γ₁ and γ₂ of the original problem, as is clear from a careful reading of the
above derivation. This allows one to apply SDP sensitivity analysis to study the impact
of perturbing the penalty parameter w on γ₁ and γ₂, which could be difficult to study
using a penalized distribution-based approach. In addition, by setting w₁ = w₂ = w₃ = 0
in (P_J), the optimal y₁ and y₂ give the values of the penalty parameters that lead to
γ₁ = a₁ and γ₂ = a₂ in the original problem. This fact will be used later in our
computational experiments.
In the following sections, we present variations and extensions of the problem (P_p).
Most of this work is based on, or closely related to, Theorem 4.5.1. In particular, we
show that the problem (P_p) can easily be extended to more flexible moment structures
and to a factor model by modifying the sub-problem (P_s) of Theorem 4.5.1:
$$
\max_{0 \le \gamma \le a,\, t,\, \mu,\, \Sigma,\, \nu,\, \sigma} \quad \lambda^T\mu + \Lambda \bullet \Sigma - w^T t
$$
$$
\text{subject to}\quad \|\mu - \nu\| \le \gamma_1,\ \|\Sigma - \sigma\| \le \gamma_2,\ \nu \in \mu_c,\ \sigma \in \Sigma_c,\ r_l(\gamma) \le t_l,\ l = 1, \dots, L.
$$
As a result, these models can also be solved efficiently via a semidefinite programming
approach.
4.5.1 Variations of Moment Uncertainty Structures
The sub-problem (Ps) can accommodate a wide class of moment uncertainty structures,
including those considered in [Tutuncu and Koenig, 2004], [Goldfarb and Iyengar, 2003],
[Natarajan et al., 2010], and [Delage and Ye, 2010]. In this section, we highlight some
useful variations that provide additional flexibility in the structure of moment uncertainty.
Affine Parametric Uncertainty In (P_s) the mean vector µ (resp. second-moment matrix
Σ) is assumed to be perturbed directly, subject to its respective SDr constraint.
Alternatively, a more flexible setting assumes µ and Σ to depend affinely on a set of
perturbation vectors ζ_i required to lie in SDr sets. This follows closely the
affine-parametric-uncertainty structure widely adopted in the robust optimization
literature. Specifically, µ and Σ can be expressed in terms of ν, σ as
$$
\mu = \nu + \sum_i \zeta'_i\, \mu_i, \quad \zeta'_i \in U_\mu, \qquad
\Sigma = \sigma + \sum_j \zeta''_j\, \Sigma_j, \quad \zeta''_j \in U_\Sigma,
$$
where µ_i, Σ_j are user-specified parameters and U_µ, U_Σ are SDr sets. Clearly, the
original moment structure is a special instance of this expression. To incorporate this
moment structure, we can modify the problem (P_s) as follows while retaining its SDr
property:
$$
\max_{0 \le \gamma \le a,\, t,\, \nu,\, \sigma,\, \zeta'_i,\, \zeta''_j} \quad \lambda^T\Big(\nu + \sum_i \zeta'_i \mu_i\Big) + \Lambda \bullet \Big(\sigma + \sum_j \zeta''_j \Sigma_j\Big) - w^T t
$$
$$
\text{subject to}\quad \|\zeta'\| \le \gamma_1,\ \|\zeta''\| \le \gamma_2,\ \nu \in \mu_c,\ \sigma \in \Sigma_c,\ r_l(\gamma) \le t_l,\ l = 1, \dots, L.
$$
Applying the above formulation, one can for example further consider the case in which
the perturbation vector ζ′ is subject to a “cardinality constrained uncertainty set”
(see [Bertsimas and Sim, 2004]), e.g.,
$$
-1 \le \zeta'_i \le 1, \qquad \sum_i |\zeta'_i| \le \gamma_1.
$$
This perturbation structure in particular allows the moment discrepancy to be defined as
the maximum number of parameters that may deviate from ν, σ.
Partitioned Moments The framework we have considered so far relies only on mean and covariance information. While using only mean/covariance information helps to remove possible bias from a particular choice of distribution, the framework may be criticized for overlooking possible distributional skewness. In [Natarajan et al., 2010], partitioned statistics information of the random return is exploited to capture skewness behavior. In summary, the random return ξ is partitioned into its positive and negative parts (ξ+, ξ−), where ξ+_i = max{ξi, 0} and ξ−_i = max{−ξi, 0}, so that ξ = ξ+ − ξ−. Then, the triple (µ+, µ−, Σp) is called the partitioned statistics information of ξ if it satisfies
µ+ = EQ[ξ+],   µ− = EQ[ξ−],   Σp = EQ[ [ξ+ − µ+0; ξ− − µ−0] [ξ+ − µ+0; ξ− − µ−0]T ],

where [·; ·] denotes vertical stacking and µ+0, µ−0 are the partitioned sampled means. By modifying the objective function accordingly, i.e. ξTx = (ξ+)Tx − (ξ−)Tx, incorporating such a partitioned moment structure into (Ps) is straightforward, as shown in the following theorem. We note, however, that the reformulated problem provides only an upper bound on the optimal value, since the support condition associated with (ξ+, ξ−) must be relaxed in order to apply Theorem 4.5.1 and obtain a tractable problem.
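The partitioned statistics are easy to form from sample data. The following numpy sketch (a hypothetical helper, not the thesis's code) takes the negative part as ξ− = max{−ξ, 0}, so that ξ = ξ+ − ξ−, and measures deviations about the in-sample partitioned means:

```python
import numpy as np

def partitioned_stats(xi):
    """Sample partitioned statistics (mu+, mu-, Sigma_p) in the spirit of
    Natarajan et al. (2010): split returns into positive/negative parts and
    form the second moment matrix of the stacked deviation vector."""
    xi = np.asarray(xi, dtype=float)                 # shape (T, n): T samples
    xi_pos = np.maximum(xi, 0.0)                     # xi+ = max{xi, 0}
    xi_neg = np.maximum(-xi, 0.0)                    # xi- = max{-xi, 0}
    mu_pos, mu_neg = xi_pos.mean(axis=0), xi_neg.mean(axis=0)
    dev = np.hstack([xi_pos - mu_pos, xi_neg - mu_neg])   # shape (T, 2n)
    return mu_pos, mu_neg, dev.T @ dev / xi.shape[0]
```

Note that ξ+ − ξ− recovers the original return, matching the objective rewriting ξTx = (ξ+)Tx − (ξ−)Tx.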
Theorem 4.5.2. Given that the confidence regions of the partitioned mean and second moment matrix (µ+c, µ−c, Σpc) are uncountable convex sets, consider the problem (Pp) in which candidate measures are associated with the partitioned moments µ+ := E[ξ+], µ− := E[ξ−], and

Σp = [ Σ11  Σ12
       Σ12  Σ22 ].

Then, the SDP reformulation of the problem that provides
the upper bound of (Pp) can be generated using the following problem
min_{x∈Xc, r, s, λ+, λ−, Λ11, Λ12, Λ22}   r + s − Λ11 • µ+0(µ+0)T − 2Λ12 • µ+0(µ−0)T − Λ22 • µ−0(µ−0)T

subject to (∗), (∗∗),

where (∗) denotes the following constraint

max   (λ+)Tµ+ + (λ−)Tµ− + Λ11 • Σ11 + 2 Λ12 • Σ12 + Λ22 • Σ22 − wTt ≤ r

subject to   || [µ+; µ−] − [ν+; ν−] || ≤ γ1,   || [Σ11 Σ12; Σ12 Σ22] − [σ11 σ12; σ12 σ22] || ≤ γ2,

[ν+; ν−] ∈ µ+c × µ−c,   [σ11 σ12; σ12 σ22] ∈ Σpc,   0 ≤ γ ≤ a,   rl(γ) ≤ tl ,   l = 1, ..., L,

where γ1, γ2, t, µ+, µ−, Σ11, Σ12, Σ22, ν+, ν−, σ11, σ12, σ22 are decision variables, and (∗∗) denotes the following positive semidefinite constraint

[ [Λ11 Λ12; Λ12 Λ22]   ½(· · · )
  ½(· · · )T           s + bk ]  ⪰ 0,   k = 1, ..., K,

where (· · · ) is replaced by the vector

(λ+ − 2Λ11µ+0 − 2Λ12µ−0 + akx,   λ− − 2Λ22µ−0 − 2Λ12µ+0 − akx),

given that the penalty function rl(·) and the norm measuring the moment discrepancy are SDr.
4.5.2 Extensions to Factor Models
Up to now, we have assumed that either a pair of reference mean and covariance or a confidence region of possible mean and covariance values among assets is readily available. In some cases, this assumption may pose difficulty when the number of underlying assets becomes
large. Fortunately, the behavior of the random returns can often be captured by a smaller number of major sources of randomness (see [Luenberger, 1998]). In these cases, a factor model that corresponds directly to those major sources (factors) is commonly used.
In a similar vein, we show that our penalized problem can be further extended to the
case of a factor model. Consider a factor model of the return vector ξ defined as follows
ξ = Vζ + ε,
where ζ is a vector of m factors (m ≤ n), V is a factor loading matrix, and ε is a vector
of residual returns with zero mean and covariance Σε. Let µζ denote the mean vector of
ζ. The mean µ of the random return ξ is thus expressed as µ = Vµζ. For re-expressing
the second moment matrix Σ, one has to decide whether or not to keep the information
of a sampled mean µ0. Since the estimation of a sampled mean is not a difficult task,
and including such information does not add much complexity to the problem, we keep
the information and find a vector µ′0 that approximately satisfies µ0 ≈ Vµ′0. Thus, by
further defining a second moment matrix of ζ as Σζ := E[(ζ−µ′0)(ζ−µ′0)T], the matrix
Σ can alternatively be expressed as Σ ≈ VΣζVT + Σε.
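As a quick sanity check of this reconstruction, the asset-level moments can be assembled from the factor-level ones in a couple of lines. The sketch below uses illustrative names; V and Σε are taken as given, as in the text:

```python
import numpy as np

def factor_moments(V, mu_f, Sigma_f, Sigma_eps):
    """Asset-level moments implied by the factor model xi = V zeta + eps:
    mu = V mu_f and Sigma ~= V Sigma_f V' + Sigma_eps."""
    V = np.asarray(V, dtype=float)
    mu = V @ np.asarray(mu_f, dtype=float)
    Sigma = V @ np.asarray(Sigma_f, dtype=float) @ V.T + np.asarray(Sigma_eps, dtype=float)
    return mu, Sigma
```

For n assets and m factors only an m-vector and an m × m matrix need to be specified, which is the point of the factor model when n is large.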
Given fixed V and Σε, one straightforward way to extend our model is to modify the
problem (Ps) as follows
max_{γ, t, µ, Σ, ν, σ, µζ, Σζ}   λTµ + Λ • Σ − wTt

subject to   ||µ − ν|| ≤ γ1,   ||Σ − σ|| ≤ γ2,
             ν = Vµζ,   σ = VΣζVT + Σε,
             µζ ∈ ζ1,   Σζ ∈ ζ2,   0 ≤ γ ≤ a,   rl(γ) ≤ tl ,   l = 1, ..., L,
where ζ1, ζ2 are SDr sets that correspond to the confidence regions of the factor moments. The model can also be viewed as a penalty-based extension of the factor model considered in [El Ghaoui et al., 2003]. We should note that in the above model the deviation of the factor moments from the respective confidence regions ζ1, ζ2 may not be effectively taken into
account. Alternatively, one may consider replacing the first two constraints in the above
model by
µ = Vµ′ζ,   Σ = VΣ′ζVT + Σε,
||µ′ζ − µζ|| ≤ γ1,   ||Σ′ζ − Σζ|| ≤ γ2,

where µ′ζ, Σ′ζ are new variables that correspond directly to the ambiguous factor moments. Thus, the formulation directly penalizes the discrepancy of the factor moments.
4.6 Application in Portfolio Selection
4.6.1 Portfolio Selection under Model Uncertainty
Modern portfolio theory sheds light on the relationship between risk and return over
available assets, guiding investors to evaluate and achieve more efficient asset allocations.
The theory requires specification of a model, e.g. a distribution of returns or the moments of a distribution. To avoid any ambiguity, from here on model refers to the probability measure or moments that characterize the stochastic nature of a financial market. In practice, practitioners cannot ensure the correct choice of model due to the complex nature of model determination and validation. Ellsberg (1961) also found that investors in fact hold averse attitudes toward model ambiguity. As a classical example, even with a lower expected return, investors show a higher preference for investments that are geographically closer, due to their better understanding of the return distribution. This finding implies that investors tend to pay an additional ambiguity premium, if possible, when investing. Therefore, portfolio selection models that do not take this ambiguity-aversion attitude into account may be unacceptable to such investors.
The maxmin (worst-case) approaches pioneered by Gilboa and Schmeidler (1989) account for investors' ambiguity-aversion attitude by allowing investors to maximize the
expected utility of terminal wealth, while minimizing over a set of ambiguity measures. Unlike classical approaches to decision making such as expected utility theory, which neglect an agent's preference among multiple probability models, Gilboa and Schmeidler provided a system of axioms under which an agent's preference over the choice of models can be characterized by the worst-case approach. In this regard, Distributionally Robust Optimization can in fact be seen as a special class of Gilboa and Schmeidler's approach, where the set of ambiguity measures is defined via moment information. Several facets of constructing a robust portfolio based on limited statistical information can be found in [Goldfarb and Iyengar, 2003], [Tutuncu and Koenig, 2004], and [Zhu and Fukushima, 2009]. The most recent DRO applications in portfolio selection are the works of [Natarajan et al., 2010] and [Delage and Ye, 2010].
To examine the strength of our comprehensive distributionally robust optimization
approach, we specialize the framework based on the portfolio selection model employed
in [Delage and Ye, 2010], and compare it with the approaches in [Delage and Ye, 2010]
and [Popescu, 2007], and with a sample-based approach. The details of implementation
and experiments based on real market data are presented in the following section.
4.6.2 Implementation and Experiments
In this section, we provide numerical examples to illustrate the performance of our penalized approach. In particular, we consider the problem (PJ) and examine its performance by comparing it to the approaches of [Popescu, 2007] and [Delage and Ye, 2010], and to a sample-based approach. Except for the sample-based approach, which evaluates the expectation using the empirical distribution constructed from sample data, the other two approaches are both DRO approaches that evaluate the expectation based on the worst-possible distribution subject to certain constraints on the first two moments. In [Popescu, 2007], the mean µ and the covariance Σ are assumed to be equal to the sampled mean and covariance, while
in [Delage and Ye, 2010] µ, Σ are assumed to be bounded within a confidence region around a pair of sampled mean and covariance. The objective of these computational experiments is to contrast the performance of “fixed-bound” DRO approaches with that of the penalized problem (PJ), which “endogenously” determines the bound on the moments according to the level of deterioration in worst-case performance.
We compare the performance of the four approaches on real market data. In particular, we consider in this experiment the popular CVaR risk measure as the performance measure to be minimized for each portfolio. Recall that the CVaR risk measure is defined as

CVaRδ(z) := min_λ { λ + (1/δ) E[(z − λ)+] },

where (t)+ = max{0, t}, z denotes the loss distribution, λ is an auxiliary variable to be minimized over, and δ denotes a certain probability level. CVaR is thus the conditional expectation of the loss above its (1 − δ)-quantile. Although in general a wide range of performance measures can be modeled using (PJ), our intent here is to avoid those associated with specific investors' preferences, e.g. a specific functional form of a utility function, and rather to select one that is widely accepted by practitioners. We believe that the tradeoff between downside risk and associated return gives the most direct comparison among all approaches.
We also specialize further the moment structure in the penalized model (PJ) by setting
σ = Σ0 in (4.12) and Σ = Σ0 in (4.13), which is more consistent with the one used in
[Delage and Ye, 2010] and helps to compare the two models.
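For reference, the minimization form of CVaR above can be evaluated directly on an empirical loss sample: the minimizing λ is a (1 − δ)-quantile of the losses, and CVaRδ reduces to the average of the worst δ-fraction of outcomes. A small sketch (one standard empirical estimator; the function name is illustrative):

```python
import numpy as np

def cvar(losses, delta):
    """Empirical CVaR_delta(z) = min_lam { lam + E[(z - lam)^+] / delta }.
    The minimum is attained at a (1 - delta)-quantile of the losses."""
    z = np.asarray(losses, dtype=float)
    lam = np.quantile(z, 1.0 - delta)            # empirical VaR level
    return lam + np.maximum(z - lam, 0.0).mean() / delta
```

For instance, with δ = 0.5 this returns the mean of the worst half of the losses.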
Our list of stocks consists of 46 major stocks of the S&P500 index across 10 industry categories. We collected from Yahoo! Finance the historical daily prices of the 46 stocks from January 1st, 1992 to December 31st, 2010, 19 years in total. Our experiment setting follows closely the one considered in [Delage and Ye, 2010]. Among the 46 stocks, for each experiment we randomly choose 4 stocks as the default portfolio and then rebalance the portfolio every 15 days. At each time of constructing/rebalancing a portfolio, the prior 30 days of daily data are used to estimate the sampled mean and covariance. As Delage and
Ye have shown that their approach outperforms other approaches under such a setting, our hope is to carry their high-quality result over to this experiment and compare it with our penalized approach. Our choice of time period for examining the performance of each approach is inspired by the choices in [Goldfarb and Iyengar, 2003], where the time period January 1997 – December 2000 is chosen, and in [Delage and Ye, 2010], where the time period 2001 – 2007 is chosen. To further cover the most recent financial crisis, the entire time period that we consider for evaluating the performance is from January 1997 to December 2010. The dataset for the time period January 1992 – December 1996 was used for initial parameter estimation.
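The estimation loop described above (a 30-day trailing window, re-run at each 15-day rebalancing date) can be sketched as follows; the helper and its defaults are illustrative, not the thesis's actual code:

```python
import numpy as np

def rolling_estimates(prices, window=30, step=15):
    """At each rebalancing date, estimate the sample mean and covariance of
    daily log-returns from the prior `window` trading days."""
    rets = np.diff(np.log(np.asarray(prices, dtype=float)), axis=0)  # (T-1, n)
    estimates = []
    for t in range(window, rets.shape[0] + 1, step):
        chunk = rets[t - window:t]
        estimates.append((chunk.mean(axis=0), np.cov(chunk, rowvar=False)))
    return estimates
```

Each pair in the returned list feeds one solve of the chosen portfolio model at the corresponding rebalancing date.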
We assume in this experiment that investors hold strictly conservative attitudes and pursue only robust performance when the moments are realized within the 90% confidence region. To estimate the parameters ρ1 and ρ2 that correspond to the 90% confidence region, we apply a statistical analysis similar to the one used in [Delage and Ye, 2010]. It is, however, difficult to determine the “right” amount of data that gives the “best” estimation of ρ1 and ρ2. To mitigate possible bias due to the choice of the amount of data, in addition to the initial estimation based on the data from January 1992 to December 1996, another re-estimation based on the data from January 1992 to December 2003 is performed in the middle of the rebalancing period, i.e. January 2004. Thus, in our later analysis the portfolio performance of the first 7-year period (1997-2003) will be presented separately from that of the latter 7-year period (2004-2010). The estimates of ρ1 and ρ2 with respect to the 90% confidence region are given as follows:
ρ1−90% = 0.1816, ρ2−90% = 3.7356, (1992− 1996)
ρ1−90% = 0.1860, ρ2−90% = 4.3827, (1992− 2003).
In addition to the parameters ρ1 and ρ2, the penalty parameters w1, w2, w3 are also required to be estimated for our model (PJ). Various approaches may be considered for estimating the penalty parameters. For example, one may attempt to find those values which generally
lead to superior portfolio performance by solving (PJ) repeatedly on some historical data. However, this additional calibration procedure, which may (or may not) give unfair advantages over classical DRO approaches, may hinder us from providing a consistent comparison and weaken the illustration of the benefit accrued solely from the bounds that are endogenously generated by our penalized approach. As an alternative, in this experiment we generate the penalty parameters by the following procedure. At the time that we estimate ρ1−90% and ρ2−90%, we additionally estimate another set of parameters ρ1−99% and ρ2−99% that corresponds to a 99% confidence region:
ρ1−99% = 0.3779, ρ2−99% = 9.3773 (1992− 1996)
ρ1−99% = 0.4161, ρ2−99% = 12.1698 (1992− 2003).
We assume that the penalty parameters are calibrated in such a way that the optimal portfolio generated by model (PJ) with a 90% confidence region is identical to the one generated by Delage and Ye's model with a 99% confidence region at the time of parameter estimation. Following Remark 4.5.3, we can compute the values of the penalty parameters by solving (PJ), where the differences a1 = ρ1−99% − ρ1−90% and a2 = ρ2−99% − ρ2−90% are set as the upper bounds of γ1, γ2, and w1 = w2 = w3 = 0. This overall estimation procedure helps to fairly compare the following three models: Delage and Ye's model with parameters ρ = (ρ1−90%, ρ2−90%) (denoted by DY-90), the same model with parameters ρ = (ρ1−99%, ρ2−99%) (denoted by DY-99), and our penalized model (PJ) with parameters ρ = (ρ1−90%, ρ2−90%) and penalty parameters estimated via a1, a2 (denoted by LK-90). Note that as the sampled mean and covariance are re-estimated at each rebalancing point, DY-90 and DY-99 keep ρ1, ρ2 unchanged; that is, their fixed bounds remain the same, while LK-90 instead keeps its penalty parameters unchanged.
In addition to the above three models, the performance of Popescu's model (denoted by P) and a sample-based approach (denoted by SP) will also be compared. The comparison in terms of average (avg.), geometric mean (geo.) and CVaR measures at various quantiles
δ among all models for the time periods 1997-2003 and 2004-2010 is given in Table 4.1 and Table 4.2.
avg. geo. δ = 0.01 δ = 0.1 δ = 1 δ = 5 δ = 10 yr. ret.
P 1.0043 1.0014 0.4375 0.6631 0.7553 0.8321 0.8662 1.0685
DY-90 1.0062 1.0046 0.6931 0.733 0.7986 0.8721 0.9000 1.0931
DY-99 1.007 1.0053 0.6908 0.7328 0.8002 0.8752 0.9027 1.1042
LK-90 1.0073 1.0056 0.6911 0.7328 0.8005 0.8762 0.9036 1.1087
SP 1.0043 1.0008 0.4375 0.5577 0.7301 0.8168 0.8535 1.0703
Table 4.1: Comparison of different approaches in the period: 1997/01-2003/12
avg. geo. δ = 0.01 δ = 0.1 δ = 1 δ = 5 δ = 10 yr. ret.
P 1.0042 1.0018 0.5634 0.5799 0.7233 0.8297 0.8723 1.0597
DY-90 1.004 1.0027 0.6219 0.6835 0.7717 0.8676 0.9046 1.0642
DY-99 1.0044 1.0032 0.6314 0.6878 0.7772 0.8739 0.9098 1.0718
LK-90 1.0047 1.0036 0.6417 0.6918 0.7803 0.8763 0.9115 1.0772
SP 1.0043 1.0013 0.5634 0.5786 0.6992 0.8158 0.8605 1.0599
Table 4.2: Comparison of different approaches in the period: 2004/01-2010/12
Various CVaR measures are provided to ensure the consistency of the performance in terms of downside risk. As the economy experienced a dramatic change before and after the 2008 financial crisis, we further provide the comparison for the time periods 2004-2007 and 2007-2010, given separately in Table 4.3 and Table 4.4. As shown in the tables, among 300 experiments LK-90 exhibits overall superior performance among all the models, except for having a lower mean and geometric mean than the P and SP models during 2004-2007. For that time period, it appears that even though the P and SP models are still exposed to higher downside risk than the other approaches, they take the most advantage of the upward trend of the market and achieve better average returns. One possible reason for this is that the market in the time period 2004-2007 was less volatile (compared with the other time periods), in which case a sample-based approach can possibly
avg. geo. δ = 0.01 δ = 0.1 δ = 1 δ = 5 δ = 10 yr. ret.
P 1.0091 1.0081 0.8142 0.8411 0.8686 0.9101 0.9292 1.0597
DY-90 1.0074 1.0069 0.8784 0.8955 0.92 0.9421 0.9529 1.0642
DY-99 1.0073 1.0069 0.8743 0.8955 0.9246 0.9459 0.956 1.0718
LK-90 1.0073 1.0069 0.8737 0.8963 0.9251 0.9461 0.956 1.0772
SP 1.0095 1.0083 0.782 0.8245 0.861 0.9019 0.9218 1.0599
Table 4.3: Comparison of different approaches in the period: 2004/01-2007/06
avg. geo. δ = 0.01 δ = 0.1 δ = 1 δ = 5 δ = 10 yr. ret.
P 0.9994 0.9957 0.5634 0.5634 0.6776 0.7847 0.8334 0.901
DY-90 1.0008 0.9987 0.6142 0.6675 0.7366 0.8265 0.8689 0.9545
DY-99 1.0016 0.9996 0.6253 0.674 0.7429 0.8325 0.8752 0.9716
LK-90 1.0022 1.0003 0.6199 0.677 0.7479 0.8357 0.8776 0.9826
SP 0.9991 0.9946 0.5634 0.5634 0.6563 0.7671 0.819 0.887
Table 4.4: Comparison of different approaches in the period: 2007/06-2010/12
benefit the most from using only sample data. On the other hand, in all other time periods Delage and Ye's approach and our penalized approach not only perform better than the P and SP approaches in terms of CVaR values, where the improvement reaches 5∼10% for δ = 1, but also achieve superior average performance, where the improvement reaches around 0.3%. This overall superior performance also carries over to the comparison of long-term performance; for example, the average yearly return is improved by up to 3∼10% by using Delage and Ye's model or our penalized model. This verifies the importance of taking moment uncertainty into account in real-life portfolio selection, which helps to achieve more efficient portfolios.
By comparing the performance of DY-90, DY-99 and LK-90, we can first see that LK-90 has a clear advantage over DY-90. Since DY-99 also outperforms DY-90, this verifies the intuition that if there is any additional gain from increasing the fixed bound of the confidence region, our penalized approach can effectively benefit from that gain as well.
Explaining why DY-99 outperforms DY-90 is not easy since, as discussed earlier, deciding appropriate bounds is highly non-trivial. What is intriguing, however, is that in most cases LK-90 outperforms DY-99 in terms of both average return and downside risk. Although the improvement is not as substantial as it is over the other models, which is plausible since we enforce consistency of the initial setting between DY-99 and LK-90, we believe that this overall superior performance does reflect the benefit of using a penalized approach, which endogenously determines the bound at each rebalancing point according to the level of deterioration in worst-case performance. Furthermore, as shown in Table 4.2, the improvement of the CVaR value can still reach 1.5% while the improvement of the average return is 0.03%. Another important observation is that in the time period 2007-2010, where the market is most volatile, the improvement of LK-90 over DY-99 is most substantial in terms of average return, and the improvement of average yearly return is as large as the improvement of DY-99 over DY-90. By contrasting the improvement of LK-90 over DY-99 between the time periods 2004-2007 and 2007-2010, we find that the more volatile the market is, the more one can possibly benefit from using our penalized approach.
In Figures 4.1 - 4.2 we also provide the average evolution of cumulative wealth for each model for the time periods 1997-2003, 2004-2010, 2004-2007, and 2007-2010. Note that in all figures the evolution of a unit price of the S&P500 index is also provided for reference. As seen, for the time period 1997-2003, the P and SP models show their vulnerability in a constantly volatile market, and their associated cumulative wealth dropped greatly as the market crashed around 2001-2002, whereas DY-90, DY-99 and LK-90 have much better downside risk performance. One can also observe the strength of the penalized model LK-90 compared with DY-90 and DY-99: its greater wealth is accumulated by consistently providing more stable performance in a volatile market. A similar observation can be made for the time period 2004-2010. This comparison
[Figure 4.1 plots, for each model (P, DY-90, DY-99, LK-90, SP) and for the S&P500 index, the evolution of cumulative wealth ($) against time (15 days/unit): (a) 1997/01-2003/12; (b) 2004/01-2010/12.]

Figure 4.1: Cumulative wealth
[Figure 4.2 plots the same cumulative-wealth comparison as Figure 4.1: (a) 2004/01-2007/06; (b) 2007/06-2010/12.]

Figure 4.2: Cumulative wealth
contrasts further a “fixed-bound” approach with our “endogenous-bound” approach. The overall computational results support well the idea that the penalized problem (PJ), which endogenously decides the bound on the moments based on the level of deterioration in worst-case performance, improves the overall performance.
4.7 Conclusion
In this chapter, we address the difficulty of providing a “reasonably” robust policy in the presence of rare but high-impact realizations of moment uncertainty. A penalized moment-based framework is proposed that extends the classical penalized maxmin framework to incorporate richer forms of moment uncertainty. While classical DRO approaches focus on ensuring the solution is robust against a bounded set of moment vectors, our approach provides an additional level of robustness when the realized moments fall outside the set. Under some mild conditions, the penalized moment-based problem turns out to be computationally tractable for a wide range of specifications. Computational experiments were conducted, in which we specialized the penalized problem to a portfolio selection model and found promising performance of our approach on historical data. The improvement in performance was found to be more substantial the more volatile the market is. This highlights the potential benefit of endogenously obtaining bounds for moment uncertainty using our penalized approach. We have also provided a few practical extensions of the problem. The practical performance of those extensions remains to be examined, and we leave those examinations for future work.
Chapter 5
Conclusion and Future Research
This thesis has focused on developing a comprehensive set of moment-based optimization models that account for various forms of uncertainty associated with distributional specifications in decision evaluation and optimization. Various financial applications that can benefit from the development of these models were presented. We began this thesis by presenting a novel application in model-risk management, where resorting to moment-based optimization was shown to be extremely useful in providing meaningful risk evaluations. Prior to our work, moment-based optimization was only known to be applicable in fairly restrictive settings, where the moments considered were, in general, low-order and assumed to be deterministic. In the first part of the thesis, we presented new tractability results for incorporating high-order marginal moments. These results advance the existing knowledge of tractable foundations that can be used for modeling richer moment information, and lay the groundwork for studying other possible tractable instances.

In the second part of the thesis, two new moment-based optimization models were proposed that address the uncertain nature of moments. In the first model, a special form of recourse function was constructed to account for the stochastic information of moments, whereas in the second model, a convex penalty function was designed to
capture the extreme moments falling outside a pre-specified confidence region. Although these two models were developed from completely different perspectives, some light can be shed on the common features that they share from a high-level perspective. From a modeling point of view, both models present effective approaches to mitigate the risk associated with the uncertainty of moments, in that both models are controllable through penalty parameters that express risk aversion. From a theoretical point of view, both models are consistent with their deterministic robust counterparts in that the deterministic models can be shown to be limiting cases. Finally, from a computational point of view, the complexity of the solution methods for both models can be shown to be equivalent to that of solving a finite number of their deterministic robust counterparts. That is, our new models, while accounting for additional levels of uncertainty, do not add much computational burden compared with their deterministic counterparts. We believe that, because of these prominent features, the models developed in this thesis can add significant value to the stream of research related to moment-based optimization.
There are a number of research directions that are important to pursue. We briefly describe them here. First, the SDP formulation provided in Theorem 2.3.1 is not guaranteed to generate the tightest upper bounds. This leads to the question: is it possible to find a polynomial-time algorithm that generates the tightest bounds? We suspect this question can be quite challenging to answer. There is reason to believe that the problem of generating the tightest bounds in Theorem 2.3.1 may be NP-hard, given that incorporating joint multivariate moments up to the fourth order is NP-hard. Even so, it is not clear how to prove such a result. An alternative approach to improving the tightness of the bounds is to employ Lasserre's SDP relaxation techniques (Lasserre (2001)). However, to achieve reasonably tight bounds, the size of the associated SDP relaxation problems can become extremely large for high-dimensional problems. This imposes significant computational challenges in seeking tighter bounds even with
the use of modern SDP solvers. To resolve this, specialized algorithms that exploit the structure of the resulting SDP problems need to be developed.

Another research direction that may generate a wealth of applications is to investigate the tractability and applicability of two-stage stochastic semidefinite programming models more general than the ones considered in Chapter 3. It should be clear that, except for special instances such as the ones in Chapter 3, solving a general stochastic semidefinite programming instance naturally gives rise to the problem of solving a large-scale SDP. One of the challenges is that decomposition-type algorithms that work well for large-scale stochastic linear programming problems may not be immediately applicable to stochastic semidefinite programming problems. For example, many decomposition algorithms employ a cutting-plane type of strategy that generates cuts from sub-problems at each iteration and re-solves the master problem repeatedly. This re-solving procedure can become cumbersome when applied to SDP problems, because existing SDP algorithms are in general not as compatible with “warmstarting” as linear programming solvers are, warmstarting being an essential technique that determines the efficiency of re-solving. The detailed study of these computational challenges and the exploration of relevant applications will be part of our future work.
Appendix A
Additional Tables
A.1 Tables of Section 2.2.3
τ = 1 τ = 12 τ = 24
s′ K [CB CM Cb] [CB CM Cb] [CB CM Cb]
0.2 30 10.034 10.034 10.035 10.401 10.405 10.416 10.812 10.821 10.843
0.2 35 5.039 5.041 5.042 5.565 5.584 5.595 6.221 6.228 6.251
0.2 40 0.465 0.330 0.351 1.804 1.747 1.764 2.710 2.673 2.695
0.2 45 0.000 0.005 (0.004) 0.285 0.295 (0.295) 0.855 0.838 (0.848)
0.2 50 0.000 0.000 (0.000) 0.022 0.039 (0.038) 0.199 0.214 (0.215)
0.4 30 10.034 10.036 10.037 10.566 10.600 10.608 11.361 11.369 11.388
0.4 35 5.044 5.093 5.090 6.378 6.353 6.364 7.639 7.595 7.618
0.4 40 0.907 0.635 0.677 3.316 3.193 3.217 4.817 4.731 4.756
0.4 45 0.015 0.083 (0.077) 1.493 1.426 (1.436) 2.875 2.798 (2.818)
0.4 50 0.000 0.012 (0.011) 0.594 0.611 (0.610) 1.641 1.605 (1.616)
0.6 30 10.034 10.059 10.057 11.154 11.160 11.168 12.535 12.487 12.509
0.6 35 5.108 5.206 5.199 7.543 7.434 7.453 9.395 9.283 9.312
0.6 40 1.349 0.939 1.001 4.825 4.626 4.660 6.919 6.769 6.800
0.6 45 0.129 0.226 (0.217) 2.947 2.784 (2.808) 5.029 4.881 (4.909)
0.6 50 0.004 0.075 (0.069) 1.735 1.668 (1.676) 3.623 3.505 (3.527)
0.8 30 10.039 10.114 10.109 12.019 11.950 11.962 13.976 13.847 13.875
0.8 35 5.265 5.346 5.335 8.817 8.610 8.641 11.231 11.034 11.070
0.8 40 1.791 1.240 1.323 6.325 6.034 6.079 8.994 8.759 8.799
0.8 45 0.355 0.396 (0.383) 4.463 4.194 (4.232) 7.194 6.953 (6.992)
0.8 50 0.042 0.186 (0.175) 3.112 2.928 (2.950) 5.755 5.533 (5.567)
Table A.1: CB (resp. CM) denotes the call option price of the diffusion (resp. jump-
diffusion) model with Lo’s specification. Cb denotes the call option prices of the bench-
mark model, i.e. the jump-diffusion model with k = 1, φ2 = 0.15, λ = 0.25.
s′ K τ = 1 τ = 12 τ = 24
0.2 45 0.044 0.496 1.201
0.2 50 0.021 0.182 0.436
0.4 45 0.157 1.921 2.072
0.4 50 0.070 0.908 2.269
0.6 45 0.350 2.140 1.812
0.6 50 0.151 2.431 3.987
0.8 45 0.655 1.951 1.660
0.8 50 0.285 3.716 4.560
Table A.2: ϑ(V∗) of Qmom for various values of parameters s′, K, τ , where w′ = 2.
s′ K τ = 1 τ = 12 τ = 24
0.2 45 0.044 0.496 1.201
0.2 50 0.021 0.182 0.436
0.4 45 0.157 1.897 1.618
0.4 50 0.070 0.908 2.269
0.6 45 0.350 1.679 1.469
0.6 50 0.151 2.431 2.938
0.8 45 0.655 1.571 1.393
0.8 50 0.285 3.142 2.785
Table A.3: ϑ(V∗) of Qmom for various values of parameters s′, K, τ , where w′ = 5.
A.2 Tables of Section 3.4.2
Strike Price (K) Time to Maturity
1 2 3 4 5 6 7 8
K=1200
BS Low 130.7 132.4 134.6 137.2 139.9 142.7 145.5 148.4
High 143.1 160.3 175.3 188.8 201.0 212.2 222.7 232.6
RS 136.2 141.0 148.2 154.6 160.8 166.8 172.7 178.0
UB SSDP
b+=1(0%) 131.2 133.6 136.6 139.8 143.1 146.5 150.0 153.4
b+=10(81%) 149.0 152.5 155.9 177.0 179.9 182.7 200.2 202.6
b+=102(98%) 149.0 152.5 174.1 177.0 195.1 197.7 200.2 215.7
WUB SDP b+=∞ 149.0 171.1 190.0 206.4 221.2 234.6 247.0 258.7
LB SSDP
b−=1(0%) 130.7 133.5 139.2 145.0 150.4 155.3 159.8 163.8
b−=10(81%) 130.7 131.9 133.1 134.3 135.7 137.1 138.4 139.7
b−=102(98%) 130.7 131.9 133.1 134.3 135.5 136.7 137.9 139.1
WLB SDP b−=∞ 130.7 131.9 133.1 134.3 135.5 136.7 137.9 139.1
K=1325
BS Low 22.3 30.9 37.6 43.5 48.6 53.4 57.8 62.0
High 60.2 84.5 103.2 119.1 133.1 145.8 157.5 168.4
RS 40.6 53.7 65.0 74.2 82.7 90.8 97.6 105.2
UB SSDP
b+=1(0%) 27.0 37.6 45.8 52.8 59.0 64.7 70.0 74.9
b+=10(81%) 74.2 78.6 82.9 110.3 113.4 116.5 137.1 139.7
b+=102(98%) 74.2 78.6 107.1 110.3 131.7 134.4 137.1 154.6
WUB SDP b+=∞ 74.2 103.8 126.3 145.1 161.5 176.2 189.7 202.1
LB SSDP
b−=1(0%) 43.5 60.0 72.1 81.9 90.2 97.3 103.6 109.1
b−=10(81%) 16.8 23.1 28.1 32.3 53.8 56.1 58.3 60.5
b−=102(98%) 16.8 23.1 28.1 32.3 36.1 39.6 42.8 45.8
WLB SDP b−=∞ 16.8 23.1 28.1 32.3 36.1 39.6 42.8 45.8
K=1400
BSLow 1.8 6.2 10.8 15.2 19.4 23.4 27.2 30.9
High 30.6 53.6 72.0 87.7 101.7 114.5 126.3 137.3
RS 17.1 26.4 37.7 46.5 54.1 62.2 69.3 76.1
UB SSDP
b+=1(0%) 3.1 9.4 16.0 22.2 28.0 33.5 38.6 43.5
b+=10(81%) 43.5 47.9 52.1 80.2 83.4 86.5 107.7 110.3
b+=102(98%) 43.5 47.9 76.9 80.2 102.3 105.0 107.7 125.8
WUB SDP b+=∞ 43.5 73.5 96.8 116.2 133.1 148.2 162.0 174.7
LB SSDP
b−=1(0%) 15.3 32.1 45.0 55.6 64.6 72.4 79.3 85.4
b−=10(81%) 0.0 0.1 1.8 4.6 25.1 27.4 29.6 31.8
b−=102(98%) 0.0 0.1 1.8 4.6 7.6 10.6 13.5 16.3
WLB SDP b−=∞ 0.0 0.1 1.8 4.6 7.6 10.6 13.5 16.3
Table A.4: Upper/lower bounds and prices for different strike prices K, b+(b−)-values
and time to maturity under 2 regimes.
Appendix B
Additional Figures
B.1 Section 3.4.2
[Figure: two panels plotting bounds and prices against time to maturity (5 weeks/unit).
(a) Upper bounds and prices: BSlow, RS, UBSDP, UBSSDP (b+=10), UBSSDP (b+=10^2), WUBSSDP.
(b) Lower bounds and prices: BShigh, RS, LBSDP, LBSSDP (b−=10), LBSSDP (b−=10^2), WLBSDP.]
Figure B.1: The case of 3 regimes and K = 1325
[Figure: two panels plotting bounds and prices against time to maturity (5 weeks/unit).
(a) Upper bounds and prices: BSlow, RS, UBSDP, UBSSDP (b+=10), UBSSDP (b+=10^2), WUBSSDP.
(b) Lower bounds and prices: BShigh, RS, LBSDP, LBSSDP (b−=10), LBSSDP (b−=10^2), WLBSDP.]
Figure B.2: The case of 4 regimes and K = 1325
[Figure: two panels plotting bounds and prices against time to maturity (5 weeks/unit).
(a) Upper bounds and prices: BSlow, RS, UBSDP, UBSSDP (b+=10), UBSSDP (b+=10^2), WUBSSDP.
(b) Lower bounds and prices: BShigh, RS, LBSDP, LBSSDP (b−=10), LBSSDP (b−=10^2), WLBSDP.]
Figure B.3: The case of 5 regimes and K = 1325
Bibliography
Anderson, E. W., Hansen, L. P., and Sargent, T. J. (2000): Robustness, detection and the price of risk.
Ariyawansa, K. A. and Zhu, Y. (2006): Stochastic semidefinite programming: a new
paradigm for stochastic optimization, 4OR-A Quarterly Journal of Operations Re-
search 4, 239–253.
Bakshi, G., Cao, C., and Chen, Z. (1997): Empirical performance of alternative option
pricing models, Journal of Finance 53, 499–547.
Ben-Tal, A., Boyd, S., and Nemirovski, A. (2006): Extending scope of robust optimiza-
tion: comprehensive robust counterparts of uncertain problems, Mathematical Pro-
gramming Series B 107(1), 63–89.
Ben-Tal, A. and Nemirovski, A. (2001): Lectures on Modern Convex Optimization: Anal-
ysis, Algorithms, and Engineering Applications, MPS/SIAM Series on Optimization,
SIAM, Philadelphia, PA, USA.
Ben-Tal, A. and Nemirovski, A. (2002): Robust optimization - methodology and applications, Mathematical Programming Series B 92(3), 453–480.
Ben-Tal, A., El Ghaoui, L., and Nemirovski, A. (2009): Robust Optimization, Princeton
University Press, Princeton, NJ, USA.
Bertsimas, D., Popescu, I., and Sethuraman, J. (2000): Moment problems and semidefi-
nite programming, Handbook on Semidefinite Programming: Theory, Algorithms, and
Applications, Kluwer Academic Publishers, Dordrecht, Netherlands.
Bertsimas, D. and Popescu, I. (2002): On the relation between option and stock prices:
a convex optimization approach, Operations Research 50, 358–374.
Bertsimas, D. and Popescu, I. (2005): Optimal inequalities in probability theory: a
convex optimization approach, SIAM Journal on Optimization 15(3), 780–804.
Bertsimas, D. and Sim, M. (2004): The price of robustness, Operations Research 52(1),
35–53.
Birge, J. R. and Louveaux, F. (1997): Introduction to Stochastic Programming, Springer-
Verlag, New York, NY, USA.
Black, F. and Scholes, M. (1973): The pricing of options and corporate liabilities, Journal
of Political Economy 81, 637–654.
Boyle, P. B. and Lin, X. S. (1997): Bounds on contingent claims based on several assets,
Journal of Financial Economics 46, 383–400.
Calafiore, G. (2007): Ambiguous risk measures and optimal robust portfolios, SIAM Journal on Optimization 18(3), 853–877.
Christoffersen, P. and Jacobs, K. (2004): Which GARCH model for option valuation?,
Management Science 50, 1204–1221.
Cont, R. (2006): Model uncertainty and its impact on the pricing of derivative instru-
ments, Mathematical Finance 16, 519–547.
Curto, R. E. and Fialkow, L. A. (1996): Solution of the truncated complex moment problem for flat data, Memoirs of the American Mathematical Society 119(568).
Dalakouras, G. V., Kwon, R. H., and Pardalos, P. M. (2008): Semidefinite programming approaches for bounding Asian option prices, Computational Methods in Financial Engineering, Springer-Verlag, Berlin, Germany.
Delage, E. and Ye, Y. (2010): Distributionally robust optimization under moment uncertainty with application to data-driven problems, Operations Research 58(3), 595–612.
Dupacova, J. (1987): The minimax approach to stochastic programming and an illustrative application, Stochastics 20, 73–88.
El Ghaoui, L., Oks, M., and Oustry, F. (2003): Worst-case value-at-risk and robust
portfolio optimization: a conic programming approach, Operations Research 51(3),
543–556.
Ellsberg, D. (1961): Risk, ambiguity, and the savage axioms, Quarterly Journal of Eco-
nomics 75(4), 643–669.
Everitt, R. and Ziemba, W. T. (1979): Stochastic programs with simple recourse, Oper-
ations Research 27, 485–502.
Follmer, H. and Schied, A. (2002): Convex measures of risk and trading constraints,
Finance and Stochastics 6(4), 429–447.
Freeland, R. K., Hardy, M. R., and Till, M. (2009): Assessing regime switching equity
return models, Technical report, University of Waterloo, Ontario, Canada.
Gilboa, I. and Schmeidler, D. (1989): Maxmin expected utility with a non-unique prior,
Journal of Mathematical Economics 18(2), 141–153.
Goh, J. and Sim, M. (2010): Distributionally robust optimization and its tractable approximations, Operations Research 58(4), 902–917.
Goldfarb, D. and Iyengar, G. (2003): Robust portfolio selection problems, Mathematics
of Operations Research 28(1), 1–38.
Gotoh, J. and Konno, H. (2002): Bounding option prices by semidefinite programming:
a cutting plane algorithm, Management Science 48(5), 665–678.
Grotschel, M., Lovasz, L., and Schrijver, A. (1981): The ellipsoid method and its consequences in combinatorial optimization, Combinatorica 1(2), 169–197.
Grundy, B. (1991): Option prices and the underlying asset’s return distribution, Journal
of Finance 46(3), 1045–1070.
Hamburger, H. (1920): Uber eine Erweiterung des Stieltjesschen Momentenproblems,
Mathematische Annalen 81(2), 235–319.
Hamburger, H. (1921): Uber eine Erweiterung des Stieltjesschen Momentenproblems,
Mathematische Annalen 82(3), 168–187.
Hamilton, J. D. (1989): A new approach to the economic analysis of nonstationary time
series and the business cycle, Econometrica 57, 357–384.
Hardy, M. R. (2001): A regime switching model of long-term stock returns, North Amer-
ican Actuarial Journal 5, 41–53.
Hsieh, K. and Ritchken, P. (2005): An empirical comparison of GARCH option pricing
models, Review of Derivatives Research 8, 129–150.
Isii, K. (1960): The extrema of probability determined by generalized moments (i)
bounded random variables, Annals of the Institute of Statistical Mathematics 12, 119–
133.
Isii, K. (1963): On sharpness of Tchebycheff-type inequalities, Annals of the Institute of
Statistical Mathematics 14, 185–197.
Jackwerth, J. and Rubinstein, M. (1996): Recovering probability distributions from op-
tion prices, Journal of Finance 51, 1611–1631.
Kall, P. and Wallace, S. W. (1994): Stochastic Programming, John Wiley and Sons,
Chichester, West Sussex, England.
Kariya, T. and Liu, R. Y. (2003): Asset Pricing: Discrete-Time Approach, Kluwer Academic Publishers, Boston, MA, USA.
Kolda, T. G., Lewis, R. M., and Torczon, V. (2003): Optimization by direct search: new perspectives on some classical and modern methods, SIAM Review 45(3), 385–482.
Kwon, R. H. and Li, J. Y. (2011): A stochastic semidefinite programming approach for
bounds on option pricing under regime switching, Working Paper.
Lasserre, J. B. (2001): Global optimization with polynomials and the problems of mo-
ments, SIAM Journal on Optimization 11, 796–817.
Lasserre, J. B. (2010): Moments, positive polynomials and their applications, Imperial
College Press, London, UK.
Li, J. Y. and Kwon, R. H. (2011): Portfolio selection under model uncertainty: a penal-
ized moment-based optimization approach, Working Paper.
Li, J. Y. and Kwon, R. H. (2012): Market price-based convex risk measures: a
distribution-free optimization approach, Operations Research Letters 40(2), 128–133.
Lo, A. W. (1987): Semi-parametric upper bounds for option prices and expected payoffs,
Journal of Financial Economics 19(2), 373–387.
Luenberger, D. G. (1998): Investment Science, Oxford University Press, New York, NY,
USA.
Maenhout, P. J. (2004): Robust portfolio rules and asset pricing, Review of Financial
Studies 17(4), 951–983.
Natarajan, K., Pachamanova, D., and Sim, M. (2008): Incorporating asymmetric distri-
butional information in robust value-at-risk optimization, Management Science 54(3),
573–585.
Natarajan, K., Sim, M., and Uichanco, J. (2010): Tractable robust expected utility and
risk models for portfolio optimization, Mathematical Finance 20(4), 695–731.
Popescu, I. (2007): Robust mean-covariance solutions for stochastic optimization, Oper-
ations Research 55(1), 98–112.
Primbs, J. A. (2010): SDP relaxation of arbitrage pricing bounds based on option prices
and moments, Journal of Optimization Theory and Applications 144, 137–155.
Ritchken, P. (1985): On option pricing bounds, Journal of Finance 40, 1219–1233.
Scarf, H. (1958): A min-max solution of an inventory problem, Studies in the Mathematical Theory of Inventory and Production, 201–209.
Shapiro, A. (2001): On duality theory of conic linear problems, Semi-Infinite Program-
ming: Recent Advances, Kluwer Academic Publishers, Netherlands.
Shapiro, A., Dentcheva, D., and Ruszczynski, A. (2009): Lectures on Stochastic Program-
ming: Modeling and Theory, MPS/SIAM Series on Optimization, SIAM, Philadelphia,
PA, USA.
Smith, J. (1995): Generalized Chebyshev inequalities: theory and applications in decision
analysis, Operations Research 43(5), 807–825.
Stieltjes, T. J. (1894): Recherches sur les fractions continues, Annales de la Faculte des
Sciences de Toulouse 8, 1–122.
Stieltjes, T. J. (1895): Recherches sur les fractions continues, Annales de la Faculte des
Sciences de Toulouse 9, 1–47.
So, M. K. P., Lam, K., and Li, W. K. (1998): A stochastic volatility model with Markov switching, Journal of Business and Economic Statistics 16, 244–253.
Turner, C. M., Startz, R., and Nelson, C. R. (1989): A Markov model of heteroscedasticity, risk and learning in the stock market, Journal of Financial Economics 25, 3–22.
Tutuncu, R. H. and Koenig, M. (2004): Robust asset allocation, Annals of Operations Research 132, 157–187.
Uppal, R. and Wang, T. (2003): Model misspecification and under-diversification, Jour-
nal of Finance 58(6), 2465–2486.
Zhu, S. S. and Fukushima, M. (2009): Worst-case conditional value-at-risk with applica-
tion to robust portfolio management, Operations Research 57(5), 1155–1168.
Zuluaga, L. F. and Pena, J. F. (2005): A conic programming approach to generalized Tchebycheff inequalities, Mathematics of Operations Research 30(2), 369–388.