DOCTORAL THESIS
FAST SCENARIO-BASED OPTIMAL CONTROL
FOR STOCHASTIC PORTFOLIO OPTIMIZATION
with Application to a Large-Scale Portfolio
Author: Marc WEIBEL
Supervisor: Associate Professor Juri HINZ
Co-supervisor: Professor Marc WILDI
A thesis submitted to the Finance Discipline Group of the University of Technology Sydney, in fulfilment of the requirements for the degree of Doctor of Philosophy.
June 2019
Finance Discipline Group University of Technology Sydney PO Box 123, Broadway, NSW 2007, Australia
Certificate of Original Authorship
I, Marc Weibel, declare that this thesis is submitted in fulfilment of the requirements for the
award of the Degree of Doctor of Philosophy in the Finance Discipline Group of the Faculty of
Sciences at the University of Technology Sydney.
This thesis is wholly my own work unless otherwise referenced or acknowledged. In addition, I
certify that all information sources and literature used are indicated in the thesis.
This document has not been submitted for qualifications at any other academic institution.
This research is supported by the Australian Government Research Training Program.
Signature of Student: Date: 10/26/2018
Production Note: Signature removed prior to publication.
Abstract
This thesis contributes towards the development of a fast optimal control algorithm, relying
on the Alternating Direction Method of Multipliers (ADMM), for solving large-scale linear convex
multi-period optimization problems, as well as the design of investment strategies aiming at
stabilizing portfolio performance over time.
The first part of the thesis focuses on a statistical risk-budgeting method to improve naive
diversification strategies. We extend the so-called minimum-torsion approach and use advanced
modern techniques for covariance estimation and shrinkage. We propose a novel factor
investing approach, which dynamically identifies statistical risk factors over time. We devise
dynamic investment strategies aiming at diversifying the idiosyncratic risk left unexplained
by the factors.
In the second part of this thesis, we develop a fast algorithm for efficiently solving the
scenario-based model predictive control (MPC) problems arising in multi-period portfolio
optimization. We derive an alteration of the termination criterion, using the probabilities
assigned to the scenarios, and provide a convergence analysis. We show that the proposed
criterion outperforms the standard approach and highlight our results with a numerical
comparison against a state-of-the-art algorithm. We also enhance the standard two-set splitting
algorithm of the ADMM method by including inequality constraints through a so-called
embedded splitting, without recourse to an additional (costly) splitting set.
We present a real-world large-scale multi-period portfolio application, in which we combine
the different concepts derived in this thesis. We propose an approach to generate scenarios
relying on a Hidden Markov Model (HMM) and solve the constrained multi-period MPC problem
with the ADMM algorithm developed here. We also suggest an innovative concept to steer the
risk aversion used in the objective function dynamically, building on the probability assigned
to each scenario. We back-test the strategy and show that the results provide the expected
risk-adjusted outperformance and stability, without deviating significantly from the strategic
asset allocation.
Key words: Risk-Budgeting, Diversification, Convex Optimization, Model Predictive Control,
Alternating Direction Method of Multipliers, Optimal Control.
Acknowledgements
During my career in industry and at the Zurich University of Applied Sciences, I have
had the privilege and pleasure of working and interacting with many talented people. They
have contributed to my education and my development, and it would be impossible to
name each of them.
First of all, I would like to thank my principal advisor, Juri Hinz. He gave me the opportunity to
pursue a PhD; his dedication and support have helped me navigate through my thesis
and conduct proper research. He undertook the necessary steps for enrolling me in the
program at the University of Technology Sydney and helped me in every aspect of the PhD
thesis. I would like to express my deepest gratitude and respect to Juri for his support and
encouragement.
I would like to thank my co-adviser, Marc Wildi, who played an important role in the process
of my PhD study. Our close interaction brought me further in my research, and his
profound knowledge of econometrics helped me take the right direction when I felt
uncertain.
I would also like to express my gratitude to my management at the Zurich University of
Applied Sciences, in particular Wolfgang Breymann and Jürg Hosang, who encouraged me to
accomplish this PhD.
My first contact with financial theory was during my Master's studies in Economics at the
University of Neuchâtel, Switzerland. I took my first finance class with Prof. Michel Dubois and
resolved to take every class Michel offered thereafter. Michel was an incredible lecturer and
teacher and sparked my interest in financial topics.
I would like to thank the academics at UTS, who offered me the opportunity to pursue my
studies outside Switzerland, and the reviewers who took the time to read my progress
reports and provided me with thoughtful advice.
On the private side, I would like to thank my family, and especially my lovely wife Coralie, for
her unconditional and constant support throughout the years. She let me work, sometimes very
late into the night, to finish a chapter, and put a smile back on my face when I felt demoralized.
This thesis is dedicated to you, Coralie.
Contents
Declaration i
Abstract ii
Acknowledgements iii
List of Figures ix
List of Tables x
Abbreviations xii
Introduction 1
1 Alternative Diversification Strategies 3
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Naive Diversification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2.1 Mean-Variance and Naive Diversification . . . . . . . . . . . . . . . . . . . 5
1.3 Random Matrix Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.3.2 Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.3.3 Random correlation matrices . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.4 Minimum-Torsion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.4.1 Corrected-Benchmark Portfolio . . . . . . . . . . . . . . . . . . . . . . . . 14
1.5 Case Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2 Statistical Risk Budgeting 19
2.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.2 Factor Investing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.2.1 Smart Beta . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.2.2 Risk Drivers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.3 Statistical Factors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.3.1 Minimum-Torsion Approach . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.3.2 Uncorrelated Factors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.3.3 Effective Rank . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.3.4 Factor Risk Budgeting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.4 Diversification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.4.1 Idiosyncratic Risk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.4.2 Measuring Diversification . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.5 Investment Strategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.6 Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.6.1 Goal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.6.2 Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.6.3 Benchmark Strategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.6.4 Risk Parity Strategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.6.5 Risk Budgeting Strategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.6.6 Diversification Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3 Portfolio Optimization 37
3.1 Single-Period Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.1.1 Modern Portfolio Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.1.2 Mean-Variance Framework . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.2 Multi-Period Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.3 Dynamic Programming Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4 Model Predictive Control 45
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.2 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4.3 Scenario-Based MPC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4.4 Portfolio, Benchmark and Trading . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
4.5 Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.5.1 Minimum and Maximum Weights . . . . . . . . . . . . . . . . . . . . . . . 48
4.5.2 Brokerage Costs and Bid-Ask Spread . . . . . . . . . . . . . . . . . . . . . . 49
4.5.3 Price impact . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.6 Problem Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.6.1 Portfolio Restrictions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.6.2 Objective function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.7 Decomposition Quadratic / Non-Quadratic . . . . . . . . . . . . . . . . . . . . . 51
4.7.1 Quadratic Component . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.7.2 Non-Quadratic Component . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.8 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
5 Fast Scenario-Based Optimal Control 55
5.1 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
5.1.1 Convex Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
5.1.2 Proximal Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
5.1.3 Proximal minimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
5.2 ADMM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
5.2.1 Accelerated ADMM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.3 Splitting the MPC Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.3.2 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.3.3 ADMM Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.3.4 Splitting Operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
5.4 Solving the Convex Quadratic Control Problem . . . . . . . . . . . . . . . . . . . 65
5.5 Solving the Convex Non-Quadratic Problem . . . . . . . . . . . . . . . . . . . . . 68
5.6 Extending the State-of-the-Art . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
5.6.1 Improving Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
5.6.2 Extended Two-Set Splitting . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
5.6.3 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
6 Application 77
6.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
6.2 Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
6.3 Scenario Generator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
6.3.1 Hidden Markov Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
6.3.2 Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
6.4 Large-Scale Dynamic Portfolio Strategy . . . . . . . . . . . . . . . . . . . . . . . . 84
6.4.1 Benchmark . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
6.4.2 Scenarios and Rebalancing . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
6.4.3 Risk Aversion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
6.4.4 Costs and Restrictions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
6.4.5 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
6.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
7 Conclusion and Outlook 88
7.1 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
7.2 Further Research . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
7.2.1 Risk-Budgeting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
7.2.2 Multi-Period Optimization via ADMM . . . . . . . . . . . . . . . . . . . . 89
7.2.3 Investment Strategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
7.2.4 Scenario Generator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
A Appendix 92
A.1 The Jacobi and Gauss-Seidel Iterative Methods . . . . . . . . . . . . . . . . . . . 92
A.1.1 Jacobi Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
A.1.2 Gauss-Seidel Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
Bibliography 99
List of Figures
1.1 Eigenvalue distribution of 200 random assets . . . . . . . . . . . . . . . . . . . 9
1.2 Eigenvalue spectrum of 75 stocks in the S&P500 . . . . . . . . . . . . . . . . . 9
1.3 S&P500: Marchenko-Pastur density (best fit) . . . . . . . . . . . . . . . . . . . . 10
1.4 S&P500: Reshuffled assets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.5 Eigenvalue spectrum of the S&P500 stocks . . . . . . . . . . . . . . . . . . . . . 15
1.6 Cumulative performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.7 Outperformance vs. buy-and-hold portfolio . . . . . . . . . . . . . . . . . . . . 17
2.1 Asset Classes – Correlations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.2 Risk Parity Strategies – Performance . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.3 Risk Budgeting Strategies – Performance . . . . . . . . . . . . . . . . . . . . . . 33
2.4 Risk Parity vs. Risk Budgeting – Performance . . . . . . . . . . . . . . . . . . . . 34
2.5 Diversification along assets and factors . . . . . . . . . . . . . . . . . . . . . . . 35
5.1 Convergence properties: weighted vs. unweighted scheme . . . . . . . . . . . 71
6.1 Asset Classes – Correlations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
6.2 Dynamic Portfolio – Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
List of Tables
1.1 Minimum-Torsion algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.1 Asset Classes – Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.2 Risk Parity Strategies – Key figures . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.3 Risk Budgeting Strategies – Key figures . . . . . . . . . . . . . . . . . . . . . . . . 32
2.4 Risk Parity vs. Risk Budgeting – Key figures . . . . . . . . . . . . . . . . . . . . . 34
5.1 Computational Time Results for Stochastic MPC Problems . . . . . . . . . . . 75
6.1 Dow Jones Stocks – Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
6.2 Dynamic Portfolio vs. Benchmark – Key figures . . . . . . . . . . . . . . . . . . 86
It’s not about finding your limits. It’s about finding what lies just beyond them.
— Unknown
To my wife Coralie . . .
Abbreviations
AADMM . . . . . . . . . . Accelerated Alternating Direction Method of Multipliers
ADP . . . . . . . . . . . . . . Approximate Dynamic Programming
ADMM . . . . . . . . . . . Alternating Direction Method of Multipliers
CVaR . . . . . . . . . . . . . Conditional Value-at-Risk
DP . . . . . . . . . . . . . . . . Dynamic Programming
HMM . . . . . . . . . . . . . Hidden Markov Model
MPC . . . . . . . . . . . . . . Model Predictive Control
PCA . . . . . . . . . . . . . . Principal Component Analysis
RMT . . . . . . . . . . . . . . Random Matrix Theory
VaR . . . . . . . . . . . . . . . Value-at-Risk
Introduction
Motivation
This project has been sponsored by the Swiss Government and led by the University of Applied
Sciences in Switzerland together with a Swiss industry partner. The partner is a company
based in Zurich, which provides advice, project management services and innovative software
products to financial institutions and trading firms.
We aim at developing a discrete-time multi-period portfolio optimization framework that will
be used in a real business environment for commercial purposes. In a first step, the industry
partner wants to obtain a so-called reference portfolio that does not require the estimation
of asset returns and does not assume any market efficiency, as in the standard framework
formulated in the seminal work of Markowitz [54]. We thus focus on risk diversification
methodology and do not face the statistical issues of predicting returns.
Relying on so-called risk-budgeting techniques, we extend the minimum-torsion approach
presented in Meucci et al. [63] and use advanced modern techniques for covariance estimation
and shrinkage, based on Random Matrix Theory (RMT). We enhance the parity approach with
a factor-based risk-budgeting methodology relying on a consistent estimate of the covariance
matrix. We present a sound and consistent approach, extending the state-of-the-art, and
examine real-world market data within a portfolio construction context.
Optimizing a multi-asset portfolio over a given investment horizon within an industrial
framework usually leads to a problem of sequential decision-making under uncertainty.
Closed-form solutions to real-world problems are the exception; an exact solution is usually
out of reach and is not of primary importance in real-world applications. Our final objective in
this thesis is to develop a fast optimal control algorithm, which can efficiently handle constantly
changing trading signals in the presence of portfolio constraints and transaction costs. The
final part of the project, which is not handled by this thesis, aims at delivering high-quality
software to the industrial partner relying on these novel scientific insights, providing a sound
and efficient portfolio construction framework delivering stable performance.
Outline and Contribution
In Chapter 1, we review existing portfolio diversification methods and propose a statistical
risk-budgeting method to improve naive diversification strategies. We extend the minimum-
torsion approach presented in Meucci et al. [63] and use advanced modern techniques for
covariance estimation and shrinkage purposes, based on random matrix theory. In Chapter 2,
we extend the state-of-the-art and develop a novel factor investing approach, in which we
dynamically identify statistical risk factors. We combine the minimum-torsion approach with
the concept of modified effective rank developed by Kakushadze and Yu [44] and devise dynamic
investment strategies aiming at diversifying the idiosyncratic risk left unexplained by the factors.
We review the single- and multi-period optimization frameworks in Chapter 3, as well as the
solutions proposed in the literature. After a short overview of Dynamic Programming (DP)
techniques and the issues that arise when considering a large-scale portfolio, we introduce in
Chapter 4 Model Predictive Control (MPC), a widespread technique for solving linear convex
optimization problems, often used in engineering. We derive a scenario-based formulation
of the portfolio optimization problem over a given investment horizon, in the presence of
transaction costs and portfolio restrictions. We show how the resulting problem can be
decomposed into a quadratic and a non-quadratic component, whose particular structure can
be exploited.
In Chapter 5, we present the Alternating Direction Method of Multipliers (ADMM), which
solves the scenario-based MPC portfolio optimization problem efficiently. We develop a fast
algorithm for the efficient solution of the scenario-based model predictive control problems
arising in multi-period portfolio optimization. We enhance the standard procedure, derive an
alteration of the termination criterion using the probabilities assigned to the scenarios, and
provide a convergence analysis. We show that the proposed criterion outperforms the standard
approach and highlight our results with a numerical comparison against state-of-the-art
algorithms. We also enhance the two-set splitting algorithm of the ADMM method by including
inequality constraints through a so-called embedded splitting, without recourse to a third
splitting of the variables.
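The generic two-block ADMM iteration underlying this approach can be sketched on a toy problem. The sketch below (an illustrative assumption, not the thesis's scenario-based MPC splitting, which is developed in Chapter 5) applies ADMM to a lasso problem and includes the standard primal/dual residual termination criterion that the thesis later modifies:

```python
import numpy as np

def admm_lasso(A, b, lam=0.1, rho=1.0, max_iter=500, eps=1e-6):
    """Two-block ADMM for: minimize 0.5*||Ax - b||^2 + lam*||z||_1  s.t.  x = z.
    Illustrates the x-update / z-update / dual-update pattern and the
    standard termination criterion based on primal and dual residuals."""
    n = A.shape[1]
    x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
    # The quadratic x-subproblem is solved via a Cholesky factor, computed once.
    L = np.linalg.cholesky(A.T @ A + rho * np.eye(n))
    Atb = A.T @ b
    for _ in range(max_iter):
        x = np.linalg.solve(L.T, np.linalg.solve(L, Atb + rho * (z - u)))
        z_old = z
        z = np.sign(x + u) * np.maximum(np.abs(x + u) - lam / rho, 0.0)  # soft threshold
        u = u + x - z                        # scaled dual variable update
        r = np.linalg.norm(x - z)            # primal residual
        s = rho * np.linalg.norm(z - z_old)  # dual residual
        if r < eps and s < eps:              # standard (unweighted) stopping rule
            break
    return z
```

The stopping rule above weights all components of the residuals equally; the criterion proposed in Chapter 5 instead incorporates the scenario probabilities.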
We propose a real-world large-scale multi-period portfolio application in Chapter 6, in which
we combine the different concepts derived in this thesis. We suggest a novel approach to
generate scenarios relying on a Hidden Markov Model (HMM) and solve the constrained multi-
period MPC problem with the new ADMM algorithm developed in this thesis. We suggest an
innovative concept to steer the risk aversion dynamically, building on the probability assigned
to each scenario. We back-test the strategy and show that the results provide the expected
risk-adjusted outperformance and stability, without deviating significantly from the strategic
asset allocation.
We present our conclusions in Chapter 7 and identify areas for future research.
1 Alternative Diversification Strategies
After a short review of existing portfolio diversification methods in Section 1.1, we start our
analysis in Section 1.2 with the naive diversification approach and consider the traditional
one-period portfolio optimization as proposed by Markowitz [54]. The computation of the
minimum-variance portfolio should rely on a consistent estimate of the covariance matrix
and, in particular, remove the noise inherent to financial data. To this end, we rely in
Section 1.3 on Random Matrix Theory (RMT); using results on the eigenvalue distribution of
random matrices, we distinguish significant dependencies in the data and neglect natural
fluctuations in the eigenvalues.
Using concepts of risk-budgeting methodology, we aim at holding an equal fraction of the
entire portfolio wealth in each of the assets. However, due to asset correlations, some
adaptations have to be carried out. We follow a procedure, the so-called minimum-torsion
approach described in Section 1.4, which allows us to retrieve uncorrelated factors. We devise
an investment strategy relying on well-known naive diversification: we allocate wealth by
applying a correction to the original equal-weight distribution. The correction is carried out
with the help of the minimum-torsion matrix and results in a portfolio in which highly
correlated assets are under-represented. These findings are illustrated by a case study on 375
stocks in the S&P500 index in Section 1.5. We conclude this chapter in Section 1.6.
1.1 Introduction
In portfolio management practice, major difficulties originate from problems associated with
reliable estimation of statistical model parameters and the sensitivity of the optimal asset
allocation with respect to these quantities. Since there are several aspects of these difficulties
and an entire range of methods which address them, let us explain our concept focusing on
classical dynamic portfolio optimization. In this context, the major quantitative ingredients
are the (conditional) means and the covariances of the so-called asset log-returns. While a
reliable and statistically significant estimation of log-return means is virtually impossible, the
estimation of covariances may also be extremely difficult in practice, since for a large asset
number, we must consider the asymptotic behavior of the spectrum of random empirical
covariance matrices. Given permanent time changes of the price fluctuation intensity and
an extreme sensitivity of the optimal portfolio weights, a naive construction of the optimal
portfolio has no value for practical applications.
In view of these problems, some practically relevant approaches to portfolio optimization have
been suggested in order to overcome, or at least diminish, the dependence on statistical
model-identification procedures.
Let us emphasize the benchmark approach in this setting. This theory addresses optimal
portfolio selection under minimal theoretical assumptions and presents considerations
justifying the asymptotic optimality of an equally-weighted portfolio. This investment strategy
attempts to hold an approximately equal fraction of the entire portfolio wealth in each of the
assets selected for the investment. Since this strategy requires regular position re-balancing,
it is not, strictly speaking, a static investment. However, empirical investigations show that,
with appropriate diversification, even infrequent re-balancing can achieve reasonable
performance.
A similar circle of ideas is related to the so-called risk-parity approach. In this framework,
the investor builds a portfolio by choosing portfolio weights such that the marginal
contribution of each asset position to an appropriately defined total portfolio risk is the
same. Such a risk-parity approach is used to build diversified portfolios that do not rely on
return expectations, with the focus on risk management rather than performance. However,
the risk-parity approach has also been criticized, and some stylized dependence on expected
returns has been reintroduced, with extensions in terms of the so-called minimum-torsion
approach.
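The equal-marginal-contribution condition can be made concrete. The sketch below (an illustrative fixed-point iteration, not the method developed in this thesis) computes the risk contributions $RC_i = w_i (\Sigma w)_i / (w^T \Sigma w)$ and iterates towards weights that equalize them:

```python
import numpy as np

def risk_contributions(w, Sigma):
    """Fraction of total portfolio variance contributed by each asset:
    RC_i = w_i * (Sigma w)_i / (w' Sigma w)."""
    mrc = Sigma @ w
    return w * mrc / (w @ mrc)

def risk_parity_weights(Sigma, n_iter=2000):
    """Damped fixed-point iteration towards equal risk contributions.
    At the fixed point, w_i is proportional to 1 / (Sigma w)_i, so every
    asset contributes the same share of variance. A simple sketch only;
    a dedicated solver would be preferable for difficult covariance matrices."""
    n = Sigma.shape[0]
    w = np.full(n, 1.0 / n)
    for _ in range(n_iter):
        w_new = 1.0 / (Sigma @ w)   # shrink weights of high-marginal-risk assets
        w_new /= w_new.sum()
        w = 0.5 * w + 0.5 * w_new   # damping stabilizes the iteration
    return w
```

For a diagonal covariance matrix, this recovers the classic inverse-volatility weighting $w_i \propto 1/\sigma_i$.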
A general framework is suggested in Platen [74], the so-called benchmark approach, which
assumes the existence of a numeraire portfolio. This numeraire portfolio displays positive
weights and, when used as a benchmark, turns all benchmarked portfolios into
supermartingales. Platen shows that this portfolio is equivalent to the Kelly portfolio, which
maximizes a logarithmic utility function. The numeraire portfolio cannot be systematically
outperformed by any other long-only portfolio and can be approximated by a worldwide
diversified portfolio.
1.2 Naive Diversification
Explicit multi-period optimization and related processes have had very limited adoption
among practitioners because of several constraints explained in more detail in Chapter 3. The
most significant impediment is parameter estimation error.
Naive diversification approaches such as equally-weighted or market-weighted portfolios
are thus still used in the absence of valid alternatives, displaying reasonable performance due
to diversification. These methods do not rely on any statistical estimation and are justified
by asymptotic properties. Alternatively, risk-parity strategies have been developed to
circumvent the inherent flaws of the traditional mean-variance approach, i.e. they avoid using
expected returns and focus on risk diversification over the assets in the portfolio. However,
these strategies have their limits, as shown in 2013, when most risk-parity products
displayed very disappointing performance. Covariance estimates are the key component of
such strategies, and the inherent noise should be removed before inferring the asset weightings.
Moreover, risk-parity products are heavily invested in fixed-income instruments, which
historically display low volatility but could massively impact future performance if interest
rates start to rise again globally.
A combination of these methods could deliver the desired performance stability for investors'
portfolios. We propose a novel approach for risk-budgeting purposes by extending the
minimum-torsion methodology, applied with stable covariance estimates. We identify
uncorrelated sources of risk within the portfolio. Using a risk-budgeting methodology instead
of plain risk parity allows us to avoid this extreme overweighting of low-volatility instruments.
1.2.1 Mean-Variance and Naive Diversification
The traditional mean-variance framework presented in Markowitz [54] is still the most widely
used approach in the financial industry. However, estimation theory has shown that this
framework produces optimal portfolio weights that change quite dramatically over time, owing
to the absence of large datasets. Several approaches have been proposed to stabilize or
shrink the covariance matrix. These minimum-risk approaches do not rely on the estimation
of expected returns and focus only on risk.
In several studies, Platen (see Platen and Heath [75], Platen [74], Platen and Rendek [76])
proposes an alternative to Markowitz-based strategies relying on naive diversification. This
diversification is deemed to approximate the "ideal" numeraire portfolio, which maximizes
the logarithmic utility function of an investor and thus dominates any other strictly positive
portfolio over a long period of time. Platen and Rendek [76] show that, as the number
of assets tends to infinity, the equally-weighted portfolio converges to the numeraire
portfolio. This robust approach is consistent and does not imply any asset-return model.
Moreover, even after deduction of transaction costs, such portfolios significantly dominate
the corresponding market-weighted portfolios, thus empirically confirming the asymptotic
behavior of the numeraire "proxies".
1.3 Random Matrix Theory
This section starts with an overview of Random Matrix Theory (RMT) and presents specific
concepts related to financial market returns.
1.3.1 Overview
The study of statistical factors inherent to asset returns has a long history in finance. Since
the seminal paper of Markowitz [54], using volatility (or variance) as a risk measure has
become standard. The additive property of variance in uncorrelated markets enables an easy
identification of the sources of risk within a portfolio:

$$\mathrm{Var}(R_p) = \sum_{i=1}^{N} \mathrm{Var}(w_i R_i)$$

where $R_p$ is the portfolio return and $R_i$ the return of asset $i$, with weight $w_i$ in the portfolio.
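This additivity can be checked numerically on simulated data (a small sketch with arbitrary, assumed per-asset volatilities and independent draws):

```python
import numpy as np

rng = np.random.default_rng(0)
T, N = 100_000, 4
w = np.full(N, 1.0 / N)                    # equal weights, purely for illustration
sig = np.array([0.01, 0.02, 0.03, 0.04])   # assumed per-asset volatilities
R = rng.normal(0.0, sig, size=(T, N))      # independent, hence uncorrelated, returns

port_var = np.var(R @ w)                   # Var(R_p)
sum_var = np.var(w * R, axis=0).sum()      # sum_i Var(w_i R_i)
# The two estimates agree up to sampling error, since all cross-covariances vanish.
```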
However, financial assets do display correlation and one often recourse to the principal
component analysis in order to extract uncorrelated sources of risk. To this purpose we
decompose the covariance matrix Σ of asset returns:
E^{\top} \Sigma E \equiv \Lambda,

where \Lambda \equiv \mathrm{diag}(\lambda_1, \dots, \lambda_N) is a diagonal matrix containing the eigenvalues of \Sigma and E \equiv (e_1, \dots, e_N) holds the corresponding eigenvectors (column-wise). These eigenvectors define N uncorrelated factors, whose returns are given by R_{\mathrm{PCA}} \equiv E^{-1} R. The eigenvalues \Lambda correspond to the variances of these uncorrelated factors.
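To make the decomposition concrete, here is a minimal Python sketch (synthetic data; all variable names are ours): rotating the returns onto the eigenvectors yields factors whose sample covariance is diagonal, with the eigenvalues as variances.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic correlated returns: N assets, T observations
N, T = 5, 10_000
A = rng.standard_normal((N, N))
Sigma_true = A @ A.T                              # a valid covariance matrix
R = rng.multivariate_normal(np.zeros(N), Sigma_true, size=T).T  # N x T

Sigma = np.cov(R)                                 # empirical covariance
lam, E = np.linalg.eigh(Sigma)                    # Sigma = E diag(lam) E^T

# Rotate the returns onto the eigenvectors: since E is orthogonal,
# E^{-1} R = E^T R
R_pca = E.T @ R

# The sample covariance of the rotated returns is diagonal,
# with the eigenvalues of Sigma as the factor variances
Cov_pca = np.cov(R_pca)
off_diag = Cov_pca - np.diag(np.diag(Cov_pca))
print(np.allclose(np.diag(Cov_pca), lam))         # True
print(np.max(np.abs(off_diag)) < 1e-8)            # True
```

The check is exact up to floating-point error because the sample covariance transforms linearly under the rotation.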
In Bouchaud and Potters [12], the authors show that the minimum variance portfolio, as proposed by Markowitz [54], places the largest weights on the eigenvectors of the correlation matrix with the smallest eigenvalues. An effective empirical estimation of the correlation matrix thus turns out to be a complicated task, but it plays a major role in portfolio construction. If we consider N assets with a number of observations T not very large compared to N, one can expect the estimation of the covariances to be “noisy”, meaning that the empirical correlation matrix is to a large extent composed of random entries. We thus have to be careful when using empirical correlations in portfolio construction, above all in minimum-risk strategies. It is of utmost importance to design a procedure that retains real information and removes noise from the eigenvalues and eigenvectors.
1.3.2 Theory
Random Matrix Theory (RMT) emerged in the 1950s and is also of interest in a portfolio construction context. Algorithms used for optimal portfolio liquidation, trading off risk against impact cost, rely on the inversion of the covariance matrix. Small or zero eigenvalues correspond to portfolios of assets that have nonzero returns but vanishing or low risk. Small samples or insufficient data lead to estimation errors that impact such portfolios. Random matrix techniques aim at solving this issue of small eigenvalues in the sample covariance matrix.
In their research paper, Laloux et al. [49] propose to compare the properties of the empirical
correlation matrix to a purely random matrix, retrieved from simulated independent returns.
The identification of deviations from the random matrix helps detect the presence of true
information.
1.3.3 Random correlation matrices
Let us consider the time series of N assets, with T observations each. The elements of the empirical correlation matrix C, of size N \times N, are given by

C_{ij} = \frac{1}{T} \sum_{t=1}^{T} r_{it} r_{jt},

where r_{it} denotes the return of asset i at time t, normalized by volatility such that \mathrm{Var}[r_{it}] = 1. In matrix form, the correlation matrix can be written as

C = \frac{1}{T} R R^{\top},

where R denotes the N \times T matrix whose rows correspond to the return observations of each asset.
Theorem 1 (Marchenko-Pastur theorem). In random matrix theory, the asymptotic behavior of the eigenvalues of large rectangular random matrices is described by the Marchenko-Pastur distribution. Let X denote an M \times N random matrix whose entries are independent, identically distributed random variables with mean 0 and finite variance \sigma^2. Set Y_N = N^{-1} X X^{\top} and let \lambda_1, \lambda_2, \dots, \lambda_M be the eigenvalues of Y_N. Consider the random spectral measure

\mu_M(A) = \frac{1}{M} \sum_{j=1}^{M} \delta_{\lambda_j \in A}, \qquad A \subset \mathbb{R}.

Assume that M, N \to \infty so that the ratio M/N \to \lambda \in (0, +\infty). Then \mu_M \xrightarrow{d} \mu, where

\mu(A) = \begin{cases} \bigl(1 - \tfrac{1}{\lambda}\bigr)\,\mathbf{1}_{0 \in A} + \nu(A), & \text{if } \lambda > 1, \\ \nu(A), & \text{if } 0 \le \lambda \le 1, \end{cases}

and

d\nu(x) = \frac{1}{2\pi\sigma^2} \frac{\sqrt{(\lambda_{\max} - x)(x - \lambda_{\min})}}{\lambda x}\, \mathbf{1}_{[\lambda_{\min}, \lambda_{\max}]}\, dx, \qquad \lambda_{\max/\min} = \sigma^2 (1 \pm \sqrt{\lambda})^2.
We can now apply Theorem 1 in a portfolio context. Let R_t \sim \mathcal{N}(m, \mathbf{1}_N) denote the independently and normally distributed asset returns¹ and C the empirical correlation matrix.
We denote by \rho_C(\lambda) the density of the eigenvalues of the correlation matrix C, defined as

\rho_C(\lambda) = \frac{1}{N} \frac{dn(\lambda)}{d\lambda},

where n(\lambda) corresponds to the number of eigenvalues of the correlation matrix C that are less than \lambda.
Edelman [28] showed that, as N \to \infty and T \to \infty with Q = T/N \ge 1 fixed, \rho_C(\lambda) is exactly known:

\rho_C(\lambda) = \frac{Q}{2\pi\sigma^2} \frac{\sqrt{(\lambda_{\max} - \lambda)(\lambda - \lambda_{\min})}}{\lambda}, \qquad (1.1)

\lambda_{\max/\min} = \sigma^2 \bigl(1 \pm \sqrt{1/Q}\bigr)^2, \qquad (1.2)

with \lambda \in [\lambda_{\min}, \lambda_{\max}]. As the asset returns have been scaled, the variance \sigma^2 is equal to 1.
Laloux et al. [49] show that the distribution of the normalized eigenvalues is given by (1.1) and that important features emerge in the limit N \to \infty:

• The lower boundary of the spectrum is strictly positive, except for Q = 1; no eigenvalue lies between 0 and \lambda_{\min}. In the neighborhood of this boundary, the density of eigenvalues exhibits a sharp maximum, except in the limit Q = 1, corresponding to \lambda_{\min} = 0, where it diverges as \sim 1/\sqrt{\lambda}.

• The density of eigenvalues vanishes above \lambda_{\max}.
Let us consider a simple example of T = 1000 random return observations for N = 200 assets, with constant variance \sigma^2 = 1. When N is finite, the particular features displayed in the neighborhood of the boundaries are not sharp: there is still a small probability of finding eigenvalues below \lambda_{\min} and above \lambda_{\max}. This probability vanishes when the number of observations becomes very large.
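This example can be reproduced with a short Python sketch (illustrative only; variable names are ours): we draw iid unit-variance returns, form the empirical correlation matrix, and compare its spectrum to the Marchenko-Pastur bounds (1.2).

```python
import numpy as np

rng = np.random.default_rng(1)

# T = 1000 observations of N = 200 independent assets with unit variance
N, T = 200, 1000
R = rng.standard_normal((N, T))

C = (R @ R.T) / T                     # empirical correlation matrix
eigvals = np.linalg.eigvalsh(C)

# Marchenko-Pastur bounds (1.2) with Q = T/N and sigma^2 = 1
Q = T / N
lam_min = (1 - np.sqrt(1 / Q)) ** 2   # ~0.306
lam_max = (1 + np.sqrt(1 / Q)) ** 2   # ~2.094

# At finite N a few eigenvalues may still fall slightly outside the band
inside = np.mean((eigvals >= lam_min) & (eigvals <= lam_max))
print(inside)
```

The fraction `inside` is close to, but not exactly, one: the edge fluctuations only vanish in the limit of large N and T.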
We now consider a sample of N = 75 representative stocks in the S&P500 index for which we have T = 200 observations, so that Q = T/N = 2.66. We display the spectrum of eigenvalues and superimpose the Marchenko-Pastur density, with Q = 2.66 and \sigma^2 = 1. We observe that the highest eigenvalues lie well above the upper bound \lambda_{\max} and that the overall fit is not really satisfactory.
¹Asset returns have been scaled to have \sigma = 1.
Figure 1.1 – Eigenvalue distribution of 200 random assets
Figure 1.2 – Eigenvalue spectrum of 75 stocks in the S&P500
When we look at the corresponding eigenvectors, we notice, as expected, that the components of the first eigenvector are roughly equal across all stocks, indicating that the first component corresponds to a proxy for the market itself. We can reject the hypothesis of “pure noise” for the first principal component. Another conjecture would be that the other principal components, which are de facto orthogonal to the market proxy, are pure noise.
To this purpose, we can subtract the contribution of \lambda_{\max} from the nominal value \sigma^2 = 1, leading to \sigma^2 = 1 - \lambda_{\max}/N = 0.94. Figure 1.3 displays the empirical distribution with this better fit for \sigma^2 (cyan line). We see that some eigenvalues still lie above \lambda_{\max}; these can be considered as information and reduce the random part of the correlation matrix. \sigma^2 can be treated as a parameter to adjust in order to optimize the fit. The best fit is obtained for \sigma^2 = 0.44 and corresponds to
Figure 1.3 – S&P500: Marchenko-Pastur density (best fit)
the red line in Figure 1.3. It accounts for roughly 95% of the spectrum, while the remaining
highest eigenvalues are still well above the upper threshold λmax.
If we now randomize the asset returns by shuffling the return series of each asset, we obtain an eigenvalue spectrum that, in the limit, follows a Marchenko-Pastur density, as the following example shows. We consider the same stocks but shuffle the time series of each asset and compute the corresponding eigenvalue spectrum. We repeat this procedure 1,000 times and average the results. We observe that the randomized returns are very well explained by the theoretical density (Figure 1.4).
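The reshuffling procedure can be sketched as follows in Python (with synthetic factor-driven returns standing in for the actual stock data; all names and parameter values are ours): shuffling each series independently in time destroys the cross-correlations, and the top eigenvalue falls back towards the Marchenko-Pastur band.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic "market": N correlated assets driven by one common factor
N, T = 75, 200
market = rng.standard_normal(T)
R = 0.5 * market + np.sqrt(0.75) * rng.standard_normal((N, T))
R = (R - R.mean(axis=1, keepdims=True)) / R.std(axis=1, keepdims=True)

def top_eigenvalue(X):
    C = (X @ X.T) / X.shape[1]
    return np.linalg.eigvalsh(C)[-1]

Q = T / N
lam_max = (1 + np.sqrt(1 / Q)) ** 2            # MP upper bound, ~2.60

# Shuffle each series independently in time: cross-correlations are destroyed
R_shuffled = np.array([rng.permutation(row) for row in R])

lam_top = top_eigenvalue(R)
lam_top_shuffled = top_eigenvalue(R_shuffled)
print(lam_top > lam_max)                       # True: genuine market factor
print(lam_top_shuffled < 1.5 * lam_max)        # True: back near the MP band
```

Averaging the shuffled spectrum over many repetitions, as in the text, would trace out the full Marchenko-Pastur density rather than only its edge.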
Figure 1.4 – S&P500: reshuffled assets
1.4 Minimum-Torsion
Let us assume the price evolution (S(t) = (S_1(t), \dots, S_N(t)))_{t \in \mathbb{N}} of N \in \mathbb{N} financial assets follows an adapted stochastic process taking values in \mathbb{R}^N, realized on a filtered probability space (\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \in \mathbb{N}}, \mathbb{P}). If we denote by \pi(t) = (\pi_1(t), \dots, \pi_N(t)) the vector of fractions of the wealth invested in the assets i = 1, \dots, N at time t = 0, 1, 2, \dots, then, following the self-financed strategy determined by \pi = (\pi(t))_{t \in \mathbb{N}}, the wealth (S^{\pi}(t))_{t \in \mathbb{N}} evolves as

S^{\pi}(t+1) = S^{\pi}(t) \Bigl(1 + \sum_{i=1}^{N} \pi_i(t) R_i(t+1)\Bigr), \qquad t = 0, 1, 2, \dots,

with the so-called returns

R_i(t+1) = \frac{S_i(t+1) - S_i(t)}{S_i(t)}, \qquad t \in \mathbb{N}, \; i = 1, \dots, N,

of the assets i = 1, \dots, N. For instance, the so-called equally-weighted portfolio suggests holding the same fraction of the wealth in each asset at any time:

\pi_i(t) = \frac{1}{N}, \qquad t \in \mathbb{N}, \; i = 1, \dots, N.
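The wealth recursion above can be sketched in a few lines of Python (simulated prices; all parameter values are arbitrary illustrations):

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated prices for N assets over T+1 dates (geometric random walk)
N, T = 4, 250
S = np.exp(np.cumsum(0.0002 + 0.01 * rng.standard_normal((T + 1, N)), axis=0))

# Simple returns R_i(t+1) = (S_i(t+1) - S_i(t)) / S_i(t)
R = S[1:] / S[:-1] - 1.0

# Equally-weighted strategy: pi_i(t) = 1/N at every date
pi = np.full(N, 1.0 / N)
wealth = np.empty(T + 1)
wealth[0] = 1.0
for t in range(T):
    # S^pi(t+1) = S^pi(t) * (1 + sum_i pi_i(t) * R_i(t+1))
    wealth[t + 1] = wealth[t] * (1.0 + pi @ R[t])

# With pi_i = 1/N, each period's growth factor is 1 + mean return
check = np.prod(1.0 + R.mean(axis=1))
print(np.allclose(wealth[-1], check))          # True
```

With the weights fixed at 1/N, the recursion reduces to compounding the cross-sectional mean return each period, which the final check verifies.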
The traditional mean-variance considerations on portfolio optimization assume that the mean vector and covariance matrix of the returns (R(t) = (R_i(t))_{i=1}^{N})_{t \in \mathbb{N}} do not change with time t \in \mathbb{N}. Let us agree that R := R(1) represents the distribution of the returns. The main ingredients for the calculation of an optimal portfolio in the spirit of Markowitz are the return covariances

\Sigma = \mathrm{Cov}(R), \qquad \sigma^2 = \mathrm{Var}(R) = \mathrm{diag}^{-1}(\Sigma), \qquad (1.3)

whose reliable estimation has attracted persistent attention in the literature. Given the covariance matrix \Sigma, the so-called principal component analysis (PCA) is based on the diagonalization D = T \Sigma T^{\top}, with the entries of the diagonal matrix D given by the eigenvalues of \Sigma, whose orthonormal eigenvectors are the rows of the orthogonal matrix T. The principal components are given by (T R(t))_{t \in \mathbb{N}}, which can be considered as an approximation of the returns of the synthetic assets (T S(t))_{t \in \mathbb{N}}. Such a linear transformation of the original price process (S(t))_{t \in \mathbb{N}} to (T S(t))_{t \in \mathbb{N}}, whose components have uncorrelated returns, can be utilized in portfolio optimization. However, the process (T S(t))_{t \in \mathbb{N}} may appear artificial. That is, to reach return uncorrelation, other linear transformations than T may be of interest; thus we address the following question:

determine a linear transformation T^* : \mathbb{R}^N \to \mathbb{R}^N such that T^* R are uncorrelated and the components of T^* R are close to those of R. \qquad (1.4)

In what follows, we present an approach to this problem in terms of an algorithm which yields a matrix T^* solving (1.4) in a certain sense.
Given the returns' covariance matrix \Sigma and the vector \sigma^2 of variances as in (1.3), the solution to (1.4) is proposed in terms of minimizing the function

f(T) = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \frac{\mathrm{Var}((T R)_i - R_i)}{\mathrm{Var}(R_i)}}, \qquad (1.5)

which is defined for each N \times N matrix T and is minimized subject to the uncorrelation condition

\mathrm{Cov}(T R) \text{ is a diagonal matrix.}
Having introduced the random variable Z = (Z_1, \dots, Z_N) with

Z_i = \frac{R_i}{\mathrm{Var}(R_i)^{1/2}}, \qquad i = 1, \dots, N,

representing the distribution of the normalized returns, the function (1.5) satisfies

f^2\bigl(\mathrm{diag}(\sigma)\, V\, \mathrm{diag}(\sigma)^{-1}\bigr) := g(V) := \mathrm{tr}(\mathrm{Cov}(V Z - Z)),

where the symbol \mathrm{tr} denotes the normalized trace. Let us interpret the problem (1.4) as that of determining

V^* = \mathrm{argmin}_{V}\, g(V) \quad \text{subject to} \quad \mathrm{Cov}(V Z) \in \mathcal{D},

where \mathcal{D} denotes the set of all N \times N diagonal matrices, followed by the transformation

T^* = \mathrm{diag}(\sigma)\, V^*\, \mathrm{diag}(\sigma)^{-1}. \qquad (1.6)
Based on Meucci et al. [63], we present an algorithm which recursively generates a sequence (V^{(k)})_{k \in \mathbb{N}} of matrices such that (g(V^{(k)}))_{k \in \mathbb{N}} is decreasing and \mathrm{Cov}(V^{(k)} Z) \in \mathcal{D} for all k \in \mathbb{N}. Interrupting this sequence at an appropriate step k^* \in \mathbb{N}, we obtain a matrix V^* := V^{(k^*)} which provides a solution to the problem (1.4).
Let us rewrite the target function as

g(V) = \mathrm{tr}(\mathrm{Cov}(V Z - Z)) = \mathrm{tr}(V C^2 V^{\top} - V C^2 - C^2 V^{\top} + C^2) = \mathrm{tr}(V C^2 V^{\top} - 2 V C^2) + \mathrm{tr}(C^2),
where C = \mathrm{Cov}(Z)^{1/2} represents the root of the correlation matrix of the returns R. Since \mathrm{tr}(C^2) does not depend on V, we aim at the

minimization of V \mapsto \mathrm{tr}(V C^2 V^{\top} - 2 V C^2) subject to \mathrm{Cov}(V Z) = V C^2 V^{\top} \in \mathcal{D}.

Using the change of variables \Pi = V C, we equivalently address the

minimization of \Pi \mapsto \mathrm{tr}(\Pi \Pi^{\top} - 2 C \Pi) subject to \Pi \Pi^{\top} \in \mathcal{D}.

To satisfy the restriction \Pi \Pi^{\top} \in \mathcal{D}, the matrix is represented using the polar decomposition

\Pi = D U,

meaning that

U is orthogonal, U U^{\top} = 1, \qquad (1.7)

and

D \in \mathcal{D} is positive definite and diagonal. \qquad (1.8)

With this decomposition, we address a separate minimization in U:

given D as in (1.8), determine a minimizer of U \mapsto \mathrm{tr}(D^2 - 2 C D U) subject to (1.7),

which is solved in terms of the minimizer

U = (D C^2 D)^{-\frac{1}{2}} D C,

and a separate minimization in D:

given U as in (1.7), determine a minimizer of D \mapsto \mathrm{tr}(D^2 - 2 C D U) subject to (1.8),

which is also solved explicitly, with the minimizer

D = \mathrm{diag}\bigl((\mathrm{diag}^{-1}(U C))_+\bigr).
The successive alternation of both minimizations yields the algorithm in Table 1.1, whose details are given in Meucci et al. [63]. Given the matrix V^* returned by this algorithm, the so-called minimum-torsion matrix
Given the square root C of the return correlation matrix:
0. Initialization: D^{(0)} ← 1, k ← 0
1. Root and rotation: U^{(k)} ← (D^{(k)} C^2 D^{(k)})^{-1/2} D^{(k)} C
2. Stretching: D^{(k)} ← diag((diag^{-1}(U^{(k)} C))_+)
3. Perturbation: Π^{(k)} ← D^{(k)} U^{(k)}
4. Interruption? Result: V^* ← Π^{(k)} C^{-1}
5. Continuation: set k ← k + 1 and go to 1.

Table 1.1 – Minimum-Torsion algorithm
T^* is calculated from (1.6) and is considered as a solution to the problem (1.4). With the matrix T^*, the portfolio optimization can be addressed.
1.4.1 Corrected-Benchmark Portfolio
In the spirit of risk-parity strategies, which take correlations among assets into account, we use the torsion matrix in order to correct the equally-weighted portfolio (naive diversification). We calculate the corrected weights by multiplying the portfolio weights (w_i = 1/N)_{i=1}^{N} by the torsion matrix T^* computed in (1.6). This transformation does not necessarily result in a fully invested portfolio: the difference can either be invested in cash, or the weights can be scaled to add up to one. This methodology results in a portfolio where highly correlated assets are under-represented. We characterize the corrected-benchmark portfolio in terms of the wealth fractions \pi(t) = (\pi_i(t))_{i=1}^{N}, invested in each asset i = 1, \dots, N at time t \in \mathbb{N}, as

\pi^{\top}(t) = \frac{w^{*\top} T^*}{w^{*\top} T^* \mathbf{1}}, \qquad t \in \mathbb{N}. \qquad (1.9)
In this formula, w^{*\top} T^* stands for the wealth fractions invested in the risky assets, given the decorrelation from the minimum-torsion matrix T^*. According to this approach, only a fraction w^{*\top} T^* \mathbf{1} \in (0, 1) of the wealth would be invested (here \mathbf{1} stands for the vector whose entries are all equal to one). In order to achieve a full investment of all available funds, we scale this portfolio appropriately, obtaining (1.9). A similar strategy can be obtained if short positions are not feasible:

\pi^{\top}(t) = \frac{(w^{*\top} T^*)_+}{(w^{*\top} T^*)_+ \mathbf{1}}, \qquad t \in \mathbb{N}. \qquad (1.10)

Here (\cdot)_+ denotes the component-wise application of the positive-part function. We examine the behavior of both portfolio strategies (1.9) and (1.10) in an empirical study in Section 1.5.
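A minimal Python sketch of the scalings (1.9) and (1.10) (the torsion matrix below is a small placeholder with made-up entries, not a calibrated one):

```python
import numpy as np

# A small illustrative torsion matrix T* (placeholder values, not calibrated)
T_star = np.array([
    [ 1.10, -0.08, -0.05],
    [-0.08,  1.05, -0.10],
    [-0.05, -0.10,  1.12],
])
N = T_star.shape[0]
w = np.full(N, 1.0 / N)                 # equally-weighted benchmark w*
ones = np.ones(N)

# (1.9): scale w*^T T* so that the weights sum to one (full investment)
raw = w @ T_star
pi = raw / (raw @ ones)
print(np.isclose(pi.sum(), 1.0))        # True

# (1.10): long-only variant, keep the positive parts before scaling
raw_pos = np.maximum(raw, 0.0)
pi_long = raw_pos / (raw_pos @ ones)
print(bool(np.all(pi_long >= 0)))       # True
```

Note how the correction shifts weight away from assets whose rows of T* carry large negative off-diagonal entries, i.e. the more correlated ones.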
1.5 Case Study
Due to data availability, we consider a representative sample of N = 375 stocks in the S&P500 index for which we have T = 1095 observations, covering the period from 2012/01 to 2016/05.
A common study design is to split the sample into a training and an independent testing set,
where the former is used to develop the model and the latter to evaluate its performance.
Accordingly, we start in a first step by analyzing the empirical histogram of the eigenvalues
of the considered stocks over the first half of the period (2012/01-2013/12) and superimpose
the theoretical Marchenko-Pastur density provided by the random matrix theory framework
detailed in Section 1.3.
Figure 1.5 – Eigenvalue spectrum of the S&P500 stocks
Based on this analysis (Figure 1.5), we retain the eigenvalues above \lambda_{\max} = 2.262, which are assumed to contain “information”, and shrink the remaining ones that correspond to “noise”. For the shrinkage procedure we follow Laloux et al. [49] and replace the noisy eigenvalues with their average value, such that the trace of the covariance matrix remains unchanged. This results in a “denoised” covariance matrix that will be used to compute the correction matrix, given by the minimum-torsion methodology explained in Section 1.4.
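The clipping step can be sketched as follows in Python (a toy one-factor market; the helper name `denoise_correlation` is ours): eigenvalues below λmax are replaced by their average, preserving the trace before a final rescaling to unit diagonal.

```python
import numpy as np

def denoise_correlation(C, lam_max):
    """Replace the eigenvalues below lam_max by their average, which keeps
    the trace unchanged, then rescale back to a unit diagonal."""
    lam, E = np.linalg.eigh(C)
    noisy = lam < lam_max
    lam_clipped = lam.copy()
    if noisy.any():
        lam_clipped[noisy] = lam[noisy].mean()    # preserves sum(lam)
    C_clean = E @ np.diag(lam_clipped) @ E.T
    d = np.sqrt(np.diag(C_clean))
    return C_clean / np.outer(d, d)

# Toy one-factor market: a single eigenvalue well above the MP bound
rng = np.random.default_rng(4)
N, T = 50, 150
market = rng.standard_normal(T)
R = 0.4 * market + rng.standard_normal((N, T))
R = (R - R.mean(1, keepdims=True)) / R.std(1, keepdims=True)
C = (R @ R.T) / T

lam_max = (1 + np.sqrt(N / T)) ** 2
C_clean = denoise_correlation(C, lam_max)
print(np.allclose(np.diag(C_clean), 1.0))        # True: correlation matrix
print(np.linalg.eigvalsh(C_clean).min() > 0)     # True: positive definite
```

The denoised matrix keeps the informative (market) eigenvalue intact while the bulk of the spectrum is flattened to its average.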
The following R code details the minimum-torsion algorithm:

# Compute the minimum-torsion matrix.
# Returns a matrix T such that sum(Var((T R)_i - R_i) / Var(R_i)) is minimal,
# where R is a random vector with Cov(R) = cov.matrix,
# subject to the entries of T R being uncorrelated.
minimum_torsion <- function(cov.matrix) {
  sigmas <- diag(cov.matrix)^0.5
  # Correlation matrix
  C2 <- diag(1 / sigmas) %*% cov.matrix %*% diag(1 / sigmas)
  E <- eigen(C2)
  # Square root of C2
  C <- E$vectors %*% diag(E$values^0.5) %*% t(E$vectors)
  # Inverse of C
  Cinv <- E$vectors %*% diag(E$values^(-0.5)) %*% t(E$vectors)
  # Requirement for the break condition
  PIold <- C
  # Initialization
  D <- diag(x = 1, nrow = nrow(C), ncol = ncol(C))
  repeat {
    DC <- D %*% C
    E <- eigen(DC %*% t(DC))
    U <- E$vectors %*% diag(E$values^(-0.5)) %*% t(E$vectors) %*% DC
    diagonal <- pmax(0, diag(U %*% C))
    D <- diag(x = diagonal)
    PI <- D %*% U
    tolerance <- max(abs(PI - PIold))
    PIold <- PI
    V <- PI %*% Cinv
    # Convergence check: the target function must be decreasing
    print(sum(diag(V %*% C2 %*% t(V) - V %*% C2 - C2 %*% V + C2)))
    if (tolerance < 1e-5) break
  }
  result <- diag(sigmas) %*% V %*% diag(1 / sigmas)
  return(result)
}
In a second step, we use the torsion matrix calibrated on the training sample and apply the corrected-benchmark strategy defined in Section 1.4.1 to the test sample covering the period 2014/01-2016/05. Figure 1.6 compares the cumulative performance of the naive (equally-weighted) strategy to the corrected-benchmark approach. We also consider an alternative portfolio, where we impose a long-only constraint in order to deliver a fair comparison to the equally-weighted strategy.
We observe similar dynamics for the three strategies; the unconstrained corrected-benchmark portfolio, where short positions are allowed, displays a better performance together with a higher volatility. The constrained portfolio also slightly outperforms the naive diversification approach, with comparable volatility. In order to validate the multi-period approach, in which the portfolio is rebalanced dynamically over time, we compare in Figure 1.7 the three strategies against their buy-and-hold equivalents.
Figure 1.6 – Cumulative performance of the naive, corrected-benchmark, and long-only corrected-benchmark strategies (2013-12-31 to 2016-05-12)
Figure 1.7 – Outperformance vs. buy-and-hold portfolio (2013-12-31 to 2016-05-12)
A striking pattern emerges from this figure: the dynamic naive diversification approach and the long-only corrected-benchmark strategy both underperform their buy-and-hold equivalents, whereas the unconstrained corrected-benchmark strategy moves from a cumulative outperformance of more than 6% in mid-June 2015 to an underperformance of 2% at the end of the testing period. We also note that all strategies suffered a drawdown in the second half of 2015, with a dramatic collapse in the performance of the unconstrained strategy.
1.6 Conclusion
Although the performance of our alternative diversification strategies, relying on a (constrained) corrected-benchmark approach, does not deliver outperformance after costs, we observed interesting features that deserve further study. This motivates us to pursue our research in the direction of multi-period optimization, taking various scenarios into account in order to stabilize the portfolio dynamics and deliver a consistent performance over time. Several other techniques for “denoising” and estimating the covariance matrix remain to be tested and could be further developed. These are deferred to future work.
In the next chapter, we build on the results presented in this chapter to devise new investment strategies. In particular, we focus on retrieving uncorrelated risk drivers (factors) and investigate various strategies based on the diversification of the idiosyncratic risk left unexplained by the factors.
2 Statistical Risk Budgeting
In this chapter, we start with a short introduction to factor investing and the selection of risk
factors in Sections 2.1 and 2.2. In Section 2.3, we focus on statistical factors and propose a
novel dynamic approach, focusing on statistical analysis of the data and on risk budgeting
techniques. We extend the work of Meucci et al. [63] and propose a shrunk version of the
minimum-torsion matrix, using the effective rank approach of Roy and Vetterli [84] to extract
the number of risk factors driving asset returns.
Our purely statistical approach enables a risk decomposition of a given portfolio into a systematic and a specific component. We detail in Section 2.4 the methodology used to decompose total risk and to assess the level of diversification in our statistical framework.
We devise various dynamic investment strategies in Section 2.5, especially an innovative
implementation of a risk budgeting technique, where the budget of a given asset is inversely
proportional to its idiosyncratic risk, left unexplained by the statistical factors. We illustrate
our approach through an empirical application in Section 2.6 and conclude in Section 2.7.
2.1 Background
The last financial crisis and, more recently, the dramatic events surrounding Greece have triggered a re-design of portfolio strategies among practitioners. Uncertainty about future asset returns in a portfolio optimization framework has led the financial industry to look for new solutions to propose to its clients.

Notably, the widespread 60-40 equity/fixed-income allocation has outlived its usefulness. During the financial crisis this allocation scheme fulfilled its task: bonds provided a welcome downside protection when stock markets tumbled. However, in the current environment, where interest rates are hovering just above their all-time lows and stock markets are relatively expensive¹, this protection is not effective anymore. Risk parity is meanwhile
¹The Shiller adjusted price-to-earnings ratio stands at nearly two standard deviations above its long-term average.
a well-established concept which does not rely on return expectations and focuses on risk diversification within a portfolio (see Roncalli and Weisang [83]).

Alternatively, Smart Beta investment strategies have been proposed, allowing diversification along identified risk drivers (factors) that are assumed to deliver a risk premium in a rule-based and transparent way. The increased popularity of these strategies is linked to a desire for portfolio risk management and diversification, as well as to the aim of enhancing risk-adjusted returns above cap-weighted indices (see Amenc et al. [1], Amenc et al. [2]).
2.2 Factor Investing
Factor models have a long history in finance and have experienced a surge in popularity through the emergence of Smart Beta products. This growing acceptance of factor-based investment strategies is mainly due to their ease of implementation in terms of infrastructure and costs. Factor-based investing is not a new topic, and numerous academic studies have been conducted in this area. Arbitrage pricing theory (APT) postulates that efficient diversification is decisive to eliminate unrewarded risks. By definition, these risks are unattractive to risk-averse investors, who are only willing to accept risk if a decent reward is associated with it. Accordingly, in the current volatile markets and low-rate environment, market participants strive to select risk factors with a proven ability to deliver positive risk premia over the long run and to reduce idiosyncratic risks.
2.2.1 Smart Beta
Smart Beta investing (Amenc et al. [1]) aims above all at circumventing the shortcomings of cap-weighted indices, which are mainly concentrated in a few large-capitalization, high-growth stocks and display sub-optimal factor exposures.
These highly concentrated cap-weighted products, characterized by a large-cap and growth bias, display a strong presence of idiosyncratic risk, while empirical studies have shown that small-cap and value investing provide positive rewards. Amenc et al. [2] construct factor indices exposed to rewarded risk factors only and suggest that these so-called smart factor indices lead to a better risk-adjusted performance than a broad cap-weighted index, even after considering transaction costs. Smart Beta strategies thus fill the gap between traditional (cap-weighted) passive investment and traditional active management.
2.2.2 Risk Drivers
Fixing the number of factors to include is a problem that has no conclusive answer and mainly
depends on managers’ views on which factors are expected to deliver the required risk premia.
In practice, one often resorts to subjective, empirical decisions based on experience, and the selected factors may differ depending on the portfolio considered and the current economic environment. Connor and Korajczyk [23] developed a test statistic that does not require a strict factor structure² to determine the number of factors. Their test is based on the observation that, if L is the appropriate number of factors, then there should be no significant decrease in the cross-sectional mean-square of idiosyncratic returns in moving from L to L+1 factors.
Whatever the method used to fix the number of factors, the relationship among risk factors is not static, as could be observed during the financial crisis. Indeed, traditional asset classes tend to display a high correlation in downturn periods, thus decreasing potential diversification possibilities. Applying a traditional risk parity or, alternatively, a Smart Beta approach to the retained factors thus implies that the volatility and the correlations of the factors should be closely monitored, in order to maintain the desired risk contribution of each asset in the portfolio. Changing correlations, as well as shifts in the factors retained in the investment decision, require a dynamic investment management process. We suggest a statistical approach to identify the risk factors driving asset returns and use a dynamic risk budgeting framework to manage a diversified portfolio.
2.3 Statistical Factors
There is a large body of literature covering the two most widely used types of factor models, namely macroeconomic and fundamental factors (see Fama and French [31], Fama and French [32], Chen et al. [21], Carhart [19]). A third type aims at identifying and estimating risk drivers using statistical techniques such as principal component analysis (PCA). Miller [66] shows that statistical factors work best with high-frequency data and are more useful when combined with fundamental factors. Statistical factor models generally do not specify the number of factors in advance and extract them directly from asset returns (see Connor [22] for an overview of these three types of factor models).

Principal component analysis allows portfolio returns to be expressed as a combination of uncorrelated factors. However, this method raises several issues, notably the instability of the components related to the lowest eigenvalues. Another issue is that the principal components are not unique and are sensitive to the units of measurement³. Moreover, the factors extracted by the PCA are often difficult to identify and interpret and can lead to counter-intuitive results when used in portfolio allocation decisions (see Meucci et al. [63] for more details). This technique has thus been rejected by most practitioners.
²i.e., where the specific risks have zero correlation across assets.
³When applying PCA to the covariance matrix of asset prices, one would obtain different results in different currencies.
2.3.1 Minimum-Torsion Approach
Meucci et al. [63] propose a new approach relying on uncorrelated statistical factors ex-
tracted from the asset returns, which remain as close as possible to the original data set. This
minimum-torsion concept allows to clearly identify the contribution of each risk factor within
a portfolio of assets and “generalizes the marginal contributions to risk used in traditional risk
parity”.
This methodology circumvents the identification problem encountered with the standard PCA decomposition and relies on a tracking-error minimization between the uncorrelated factors and the original assets. As the factors are forced to remain as close as possible to the data, we do not face this interpretation issue, and the framework can thus be more easily used in a risk budgeting context.
Definition 1. Torsion Matrix

If we consider K assets in a portfolio, Meucci et al. [63] show that we can find a so-called torsion matrix that decorrelates the original assets into K uncorrelated factors. The linear transformation used to retrieve this torsion matrix is the one that least disrupts the original factors (assets), denoted R. Meucci et al. [63] suggest selecting the torsion matrix⁴ that minimizes the tracking error between the uncorrelated factors and the data:

t_{\mathrm{MT}} \equiv \mathrm{argmin}_{\mathrm{Cor}(tR) = I_{K \times K}}\, \mathrm{NTE}(tR, R), \qquad (2.1)

where R expresses the asset returns, t a valid torsion matrix, and NTE denotes the normalized tracking error defined as

\mathrm{NTE}(F \,\|\, R) \equiv \sqrt{\frac{1}{K} \sum_{k=1}^{K} \mathrm{Var}\!\left(\frac{F_k - R_k}{\mathrm{Sd}(R_k)}\right)}.
Meucci et al. [63] show that this normalization, unlike the PCA approach, is not sensitive to
factors expressed in different units.
2.3.2 Uncorrelated Factors
In a manner similar to PCA, after having solved the minimum-torsion optimization (2.1) above, we can easily retrieve the uncorrelated factors F_{\mathrm{MT}}:

F_{\mathrm{MT}} = t_{\mathrm{MT}} R. \qquad (2.2)

The factor exposures, denoted w_{\mathrm{MT}}, are obtained by applying the inverse of the transposed torsion matrix t_{\mathrm{MT}} to the asset weights w. Eventually, the return of a given portfolio R_p with an allocation w = (w_1, \dots, w_K) can either be expressed as a linear combination of the asset returns
⁴We refer to Chapter 1, Section 1.4, for the full derivation of the torsion matrix.
R or, alternatively, with the help of the uncorrelated factors F_{\mathrm{MT}}:

w_{\mathrm{MT}} = (t_{\mathrm{MT}}^{\top})^{-1} w, \qquad R_p = w^{\top} R = w_{\mathrm{MT}}^{\top} F_{\mathrm{MT}}.

This methodology allows the K asset returns to be fully characterized with the help of K uncorrelated factors. However, in the presence of a large number of assets, it could be preferable to consider only a subset of these risk factors and thus to proceed to a shrinkage of the torsion matrix.
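The identity R_p = w^⊤R = w_MT^⊤ F_MT holds for any invertible decorrelating transformation t; a Python sketch (using a PCA-based t for illustration, since a calibrated minimum-torsion matrix is not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(5)

# Synthetic correlated returns for K assets
K, T = 6, 2000
A = rng.standard_normal((K, K))
R = A @ rng.standard_normal((K, T))              # K x T

# Any invertible decorrelating transformation t works for the identity
# below; here t is PCA-based for illustration (in practice one would use
# the calibrated minimum-torsion matrix t_MT)
Sigma = np.cov(R)
_, E = np.linalg.eigh(Sigma)
t = E.T

F = t @ R                                        # uncorrelated factors F = t R
w = rng.dirichlet(np.ones(K))                    # some portfolio allocation

# Factor exposures: w_F = (t^T)^{-1} w, so that w^T R = w_F^T F
w_F = np.linalg.solve(t.T, w)

Rp_assets = w @ R
Rp_factors = w_F @ F
print(np.allclose(Rp_assets, Rp_factors))        # True
```

The check is algebraic: w_F^⊤ F = w^⊤ t^{-1} t R = w^⊤ R, independent of which decorrelating t is chosen.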
2.3.3 Effective Rank
To assess the appropriate number of risk drivers to retain, we resort to PCA and the concept of effective rank suggested by Roy and Vetterli [84], which extends the notion of the rank of a matrix. Let us consider a correlation matrix C of size K \times K and proceed to a principal component decomposition

C = E \Lambda E^{\top},

where E contains the eigenvectors (of size K \times K) and \Lambda is the K \times K diagonal matrix of the eigenvalues

\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_K \ge 0.
We denote by \lambda = (\lambda_1, \lambda_2, \dots, \lambda_K)^{\top} the vector of the K non-negative eigenvalues and define the eigenvalue distribution as

p_i = \frac{\lambda_i}{\|\lambda\|_1}, \qquad i = 1, 2, \dots, K,

where \|\cdot\|_1 denotes the L_1-norm.
Definition 2. Effective rank

As stated by Roy and Vetterli [84], the effective rank of the matrix C is denoted by \mathrm{eRank}(C) and is defined as

\mathrm{eRank}(C) = \exp\bigl(H(p_1, p_2, \dots, p_K)\bigr),

where H(p_1, p_2, \dots, p_K) is the spectral entropy given by

H(p_1, p_2, \dots, p_K) = -\sum_{i=1}^{K} p_i \log p_i. \qquad (2.3)
The effective rank measure applied to C retrieves the effective number of risk drivers, which is
generally lower than the number of assets (K ). This can easily be explained by the fact that
some assets may display a relatively high correlation, which reduces the true dimension of the
correlation matrix.
Special Case: highly correlated markets
The CAPM-related literature shows that the market can be viewed as the first and predominant equity factor (Lintner [53], Mossin [68], Sharpe [88] and Treynor [95]). Thus, a problem arises when considering, for instance, the correlation matrix of a national stock market: in this particular case the principal component analysis merely yields a single large eigenvalue, corresponding to market-wide fluctuations, which makes the above measure ineffective (see also Kim and Jeong [46] for more details).
Definition 3. Modified effective rank
Following Kakushadze and Yu [44], we can circumvent this problem by removing the first eigenvalue, related to the so-called market factor, and separately proceeding with the eRank calculation described above on the remaining eigenvalues. The modified eRank metric used subsequently is defined as

eRank2(C) = exp( − ∑_{i=2}^{K} pi log pi ) + 1 = exp H(p2, p3, . . . , pK) + 1,

where the 1 accounts for the first factor that has been removed from the calculation. We therefore consider the L major risk drivers, defined as

L = ⌊eRank2(C)⌉ , (2.4)

where ⌊·⌉ denotes the nearest integer function.
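A sketch of the modified measure, assuming the eigenvalues are sorted in decreasing order and that the distribution is renormalized over the remaining eigenvalues once the market eigenvalue has been dropped; the eigenvalues below are made up for illustration.

```python
import math

def spectral_entropy(eigvals):
    """Spectral entropy of the normalized eigenvalue distribution."""
    total = sum(eigvals)
    p = [lam / total for lam in eigvals]
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def modified_erank(eigvals):
    """eRank2: drop the largest ('market') eigenvalue, add 1 back for it."""
    return math.exp(spectral_entropy(eigvals[1:])) + 1

# One dominant market eigenvalue plus a few smaller risk drivers.
eigvals = [6.0, 0.8, 0.6, 0.3, 0.2, 0.1]
L = round(modified_erank(eigvals))   # nearest integer, Equation (2.4)
print(L)
```

The dominant first eigenvalue no longer swamps the measure, so the count of risk drivers L reflects the structure of the remaining spectrum.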
Shrunk Uncorrelated Factors
Using the notion of effective rank defined in Equation (2.4), we shrink the torsion matrix tMT to retain only the L most relevant risk drivers. We define the shrunk factors and the corresponding torsion matrix of dimension K ×L as

FSMT ≡ (FMT)1:L , tSMT ≡ (tMT)1:L . (2.5)

Our portfolio can now be decomposed into a systematic component, explained by the L risk drivers, as well as a specific component denoted εp:

Rp = wᵀR = wSMTᵀ FSMT + εp ,

where wᵀR is the asset-based representation, wSMTᵀ FSMT the systematic component and εp the specific component.
2.3.4 Factor Risk Budgeting
An appealing feature of the uncorrelated factors obtained in Equation (2.2) is that we can easily compute the risk budgeting portfolio analytically. This is unfortunately not the case when the factors are not uniformly correlated with each other, except if the correlation reaches a lower bound or in case of perfect correlation (for further details, see Roncalli [82]).
In the case of uniform correlation ρi,j = ρ, the risk parity portfolio, with budgets bi = 1/K, where K corresponds to the number of factors, is given by:

wi = σi⁻¹ / ∑_{j=1}^{K} σj⁻¹ , i = 1, 2, . . . , K. (2.6)
In the more general case, when the investor defines risk budgets b1, . . . , bK for each factor, we get:

wi = bi σi⁻¹ / ∑_{j=1}^{K} bj σj⁻¹ , i = 1, 2, . . . , K. (2.7)
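For uncorrelated factors, Equations (2.6) and (2.7) reduce to simple normalizations; a minimal sketch, with made-up factor volatilities:

```python
def budget_weights(sigmas, budgets=None):
    """Risk budgeting weights of Equation (2.7); equal budgets b_i = 1/K
    recover the risk parity weights of Equation (2.6)."""
    k = len(sigmas)
    if budgets is None:
        budgets = [1.0 / k] * k                       # b_i = 1/K: risk parity
    raw = [b / s for b, s in zip(budgets, sigmas)]    # b_i * sigma_i^{-1}
    total = sum(raw)                                  # normalizing constant
    return [r / total for r in raw]

sigmas = [0.20, 0.10, 0.05]          # hypothetical factor volatilities
print(budget_weights(sigmas))        # risk parity: weights proportional to 1/sigma_i
```

Passing explicit budgets tilts the weights accordingly while keeping them summing to one.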
However, practitioners may set their risk budgets on the portfolio assets rather than on factors that change over time. We thus have to redirect these asset-based budgets to the underlying risk drivers. During our research for this project, we also devised a methodology⁵ that considers the loadings of the risk factors left aside in the shrinkage computation to reallocate the user-defined risk budgets to the corresponding L risk factors.
2.4 Diversification
This section details the concept of diversification and presents a way of measuring the level of
diversification within a portfolio.
2.4.1 Idiosyncratic Risk
According to arbitrage pricing theory (APT), an investment strategy should strive to diversify away unrewarded risk, also called specific risk. Our statistical factor model, which identifies L risk drivers with the help of the modified effective rank methodology (see Definition 3), enables the decomposition of asset returns into systematic and specific risk.
The covariance matrix Γ of our factor model can be written

Γ = EL ΛL E′L + Υ² , (2.8)

where EL ΛL E′L corresponds to the systematic risk related to the L identified factors and Υ² is a K ×K diagonal matrix expressing the idiosyncratic (specific) risk of each asset. EL is a
⁵As we only consider algorithmic risk budgeting strategies in this thesis, we do not give the details of this methodology, which has not been further investigated.
K ×L matrix corresponding to the exposures to the L factors, whereas ΛL is an L×L diagonal matrix corresponding to the variances of the L uncorrelated factors.
An investment strategy will achieve this diversification by setting risk budgets bi that are inversely proportional to the idiosyncratic risk υi² of the assets considered:

bi ∝ 1 / υi² , i = 1, 2, . . . , K.
2.4.2 Measuring Diversification
Following Meucci [61], we resort to the effective number of bets methodology to quantify the diversification of a given portfolio. We propose here to define the diversification level as a percentage, with 100% meaning perfect risk diversification. We showed in Section 2.3.3 that a portfolio can be decomposed into a systematic and a specific component:

Rp = wᵀR = wSMTᵀ FSMT + εp .
Intuitively, a perfectly diversified portfolio can be fully explained by the systematic component and displays no specific risk. As the factors FSMT are by definition uncorrelated, we can calculate the contribution of each risk factor to total risk:

RCi ≡ V(wSMT,i FSMT,i) / V(Rp) , i = 1, . . . , K,

where V denotes the variance. The contributions RCi sum to one and are non-negative, and can thus be considered as weightings. A risk parity portfolio will be characterized by an equal contribution of each risk factor to total risk.
Definition 4. Diversification measure
Computing the spectral entropy defined in Equation (2.3) of the risk contributions RCi, we obtain the effective number of bets, defined as

B ≡ exp H(RC1, RC2, . . . , RCK).

The measure B ranges from 1, when all the variability (risk) stems from a single risk driver, to L, when the total risk of a given portfolio is equally spread among the risk drivers. If we normalize the measure B by the number of risk drivers L retained and multiply by 100, we obtain a diversification measure expressed as a percentage:

D ≡ exp H(RC1, RC2, . . . , RCK) / L × 100 = B / L × 100. (2.9)
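The effective number of bets and the measure D follow directly from the risk contributions; a small sketch, with made-up contributions:

```python
import math

def effective_bets(rc):
    """B = exp H(RC_1, ..., RC_K), the effective number of bets."""
    return math.exp(-sum(c * math.log(c) for c in rc if c > 0))

def diversification_pct(rc, n_drivers):
    """D = B / L * 100, Equation (2.9)."""
    return effective_bets(rc) / n_drivers * 100

rc_parity = [0.25, 0.25, 0.25, 0.25]   # equal risk contributions, L = 4
rc_conc = [0.85, 0.05, 0.05, 0.05]     # risk concentrated in one driver
print(round(diversification_pct(rc_parity, 4)))   # 100: perfect diversification
print(round(diversification_pct(rc_conc, 4), 1))
```

The concentrated portfolio scores well below 100%, reflecting that most of its risk stems from a single driver.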
2.5 Investment Strategies
We apply the minimum-torsion approach in its standard version as well as the shrunk alter-
native, obtained through the modified eRank methodology used to identify the number of
risk drivers. Building on these uncorrelated sources of risk, we compare various investment
strategies that do not rely on expected returns or any external data. Only historical data of the
assets considered are required.
Naive Factor Diversification
Our first strategy is related to a standard naive diversification approach, also called equally-weighted. Instead of equally weighting the assets in the portfolio, we apply this technique to the K uncorrelated factors. We retrieve the corresponding weights of the K assets by using the computed torsion matrix tMT. These weights are defined as w = tMT u, where

ui = 1/K , i = 1, 2, . . . , K.
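The mapping from factor weights back to asset weights is a matrix-vector product with the torsion matrix; a sketch with a made-up 3 × 3 matrix standing in for tMT (the actual minimum-torsion matrix is estimated from the data):

```python
def factor_to_asset_weights(t, factor_w):
    """Map factor weights back to asset weights via the torsion matrix t."""
    return [sum(row[j] * factor_w[j] for j in range(len(factor_w))) for row in t]

K = 3
t = [[0.9, -0.3, 0.1],    # hypothetical torsion matrix (not a real t_MT)
     [0.2, 1.0, -0.2],
     [-0.1, 0.3, 1.1]]
w = factor_to_asset_weights(t, [1.0 / K] * K)   # equal weight 1/K on each factor
```

The same helper applies unchanged to the risk parity and risk budgeting factor weights introduced below.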
Factor Risk Parity
We also apply the traditional risk parity approach to the identified uncorrelated factors, where the overall portfolio risk is equally spread among the K risk drivers, i.e. the risk contribution of each factor is identical. Using Equation (2.6) and the torsion matrix tMT, we retrieve the asset weights as w = tMT u, where

σi⁻¹ = 1 / Sd(FSMT,i) , ui = σi⁻¹ / ∑_{j=1}^{K} σj⁻¹ , i = 1, 2, . . . , K,

and Sd(FSMT,i) denotes the standard deviation of the risk factor i.
Factor Risk Budgeting – Naive
The rationale behind the naive diversification approach led us to consider an alternative, where an equal budget is allocated to each factor, without considering any factor specificity such as volatility. This can be seen as an equally-weighted approach along the factors, which actually translates into a risk budgeting strategy along the assets. Using the torsion matrix tMT, we define the asset-based budgets as b = tMT u, where

ui = 1/K , i = 1, 2, . . . , K.
In this particular case, the methodology defined in Equation (2.7) cannot be used to retrieve the weights of the K assets, as the budgets are now expressed along the assets and correlations must be considered. We therefore use the algorithm developed by Spinu [91] to compute the allocation weights of the risk budgeting portfolio⁶.
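To give a flavour of this computation, the sketch below solves the same risk budgeting problem by cyclical coordinate descent on Spinu's convex formulation F(x) = ½ x′Σx − ∑ bi log xi; note that Spinu [91] uses Newton's method on this formulation instead, and the covariance matrix here is hypothetical.

```python
import math

def risk_budget_weights(cov, budgets, sweeps=200):
    """Cyclical coordinate descent on F(x) = 0.5 x'Sx - sum_i b_i log x_i,
    whose minimizer, once normalized, matches the risk budgets exactly."""
    k = len(budgets)
    x = [1.0 / k] * k
    for _ in range(sweeps):
        for i in range(k):
            c = sum(cov[i][j] * x[j] for j in range(k) if j != i)
            # first-order condition in x_i: cov[i][i] x_i^2 + c x_i - b_i = 0
            x[i] = (-c + math.sqrt(c * c + 4.0 * cov[i][i] * budgets[i])) / (2.0 * cov[i][i])
    total = sum(x)
    return [xi / total for xi in x]

cov = [[0.0400, 0.0120, 0.0060],     # hypothetical 3-asset covariance matrix
       [0.0120, 0.0225, 0.0045],
       [0.0060, 0.0045, 0.0100]]
w = risk_budget_weights(cov, [1 / 3, 1 / 3, 1 / 3])   # equal budgets -> risk parity
```

Each resulting risk contribution wi (Σw)i / (w′Σw) then matches its budget bi up to numerical tolerance.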
Factor Risk Budgeting – Proportional
We then consider two other risk budgeting strategies. In the first one, we set risk budgets that are inversely proportional to the volatility of the K factors. This framework is related to the early traditional risk parity strategies applied to assets (see Bhansali et al. [9]), where correlations were not considered. In our setting the underlying assumptions are met, as the factors are by definition uncorrelated. Using Equation (2.7) and tMT, we retrieve the asset weights as w = tMT u, where

bi = 1 / Sd(FSMT,i) , ui = bi σi⁻¹ / ∑_{j=1}^{K} bj σj⁻¹ , i = 1, 2, . . . , K,

and Sd(FSMT,i) denotes the standard deviation of the risk factor i.
Factor Risk Budgeting – Specific
In the second risk budgeting strategy, we set risk budgets that are inversely proportional to the specific risk of the assets (Υ²) left unexplained by the L factors, applying the risk decomposition in Equation (2.8). Using Equation (2.7) and tSMT, we retrieve the asset weights as w = tSMT u, where

bi = 1 / υi² , ui = bi σi⁻¹ / ∑_{j=1}^{K} bj σj⁻¹ , i = 1, 2, . . . , K,

and υi² denotes the specific risk of asset i.
⁶This algorithm is based on Newton's method.
2.6 Application
In this section we present a concrete application of the risk-budgeting strategies developed in
this chapter.
2.6.1 Goal
In this application the main purpose is to compare the performance and tail risk metrics of various investment strategies. A secondary goal is to evaluate the level of diversification of each strategy when considering the assets and the risk drivers (factors) respectively. Analyzing diversification along factors, which are assumed to be the true risk drivers of asset returns, should emphasize the distorted picture provided by a pure asset-based analysis.
2.6.2 Data
Our dataset is provided by Complementa Investment-Controlling AG⁷ and is composed of eight asset classes from May 1997 through December 2015: World and Emerging-Markets
Equities, Global Bonds, Inflation-Linked Bonds, High Yield and Emerging-Markets Bonds,
Hedge Funds and Commodities. All time series are hedged in CHF. Each portfolio is rebalanced
on a monthly basis, using the previous four years of data (48 months) to estimate second
moments and extract the risk factors. Trading costs of 20 basis points (bp) are considered for
the return calculations as well as a price impact cost of 1 bp. Our investment universe is meant
to be representative of a well-diversified Swiss endowment fund with exposure to different
risk premia.
We start by reporting summary statistics – annualized return and volatility as well as correlations, Value-at-Risk, Expected Shortfall and Maximum Drawdown – over the whole sample period. Table 2.1 highlights the significant variations across asset classes in terms of return and risk. Emerging-Markets Equities show a negative return (-0.01%) and the highest risk (25% volatility), whereas Emerging-Markets Bonds have the highest return (6%). The largest drawdown is recorded by Emerging-Markets Equities (-19%), while Global Bonds display the lowest one (-4%).
The correlation matrix across asset classes in Figure 2.1 emphasizes the potential for diversification provided by an optimal combination of these asset classes. High-Yield and Global Bonds surprisingly have the lowest correlation (-0.13), whereas Global and Emerging-Markets Equities exhibit the highest coefficient (0.83).
⁷Complementa Investment-Controlling AG supports investors in planning, organizing and monitoring the funding process. It has been an operationally independent unit of State Street Holdings Germany GmbH since October 2011.
Table 2.1 – Asset Classes – Statistics

                         Average Return  Volatility  Sharpe Ratio  VaR (95%)  ES (95%)    MDD
Global Equities                    0.02        0.17          0.10       0.64     -0.10  -0.12
EM Equities                       -0.01        0.25         -0.03       0.66     -0.13  -0.19
Global Bonds                       0.02        0.07          0.29       0.22     -0.03  -0.04
Inflation-Linked Bonds             0.04        0.08          0.45       0.19     -0.03  -0.05
High-Yield Bonds                   0.05        0.10          0.57       0.37     -0.03  -0.07
EM Bonds                           0.06        0.12          0.50       0.35     -0.03  -0.08
Hedge Funds                        0.02        0.12          0.14       0.42     -0.05  -0.08
Commodities                       -0.04        0.16         -0.22       0.70     -0.07  -0.11

This table shows the annualized return and volatility of each asset class over the period 05/1997-12/2015. Value-at-Risk and Expected Shortfall are calculated with a 95% confidence level. The Maximum Drawdown (MDD) over the period is displayed in the last column.
Figure 2.1 – Asset Classes – Correlations.

Correlation matrix (lower triangle; columns in the same asset order as the rows):

Global Equities          1
EM Equities              0.83   1
Global Bonds             0.35   0.18   1
Inflation-Linked Bonds   0.50   0.35   0.80   1
High-Yield Bonds         0.54   0.62  -0.13   0.20   1
EM Bonds                 0.48   0.66   0.03   0.18   0.69   1
Hedge Funds              0.68   0.56   0.66   0.73   0.15   0.17   1
Commodities              0.44   0.50   0.22   0.38   0.33   0.29   0.46   1

This figure displays asset class correlations over the period 05/1997-12/2015. Highly correlated asset classes are highlighted by blue dots with shadings related to the degree of correlation.
2.6.3 Benchmark Strategies
We provide two benchmark strategies to assess the incremental value of our approach. The first one is the well-established 60-40 allocation, which seeks long-term capital appreciation, taking current income into account, by investing 60% of its assets in Global Equities and 40% in Global Bonds. The second benchmark strategy is a simple equally-weighted approach, also called naive diversification, where each asset has the same weight in the portfolio.
Our first investment strategy, described in Strategy 2.5, leans on the naive diversification approach and applies it to the uncorrelated factors, i.e. each factor is equally weighted in the portfolio. Table 2.2 shows that the 60-40 allocation, rebalanced on a monthly basis, delivers a disappointing net performance (0.87%) and a low Sharpe ratio (0.09). Our naive diversification strategy along the factors manages to outperform the two benchmark strategies, with an average annualized return of 2.21% and a Sharpe ratio of 0.26, compared to 2.11% and 0.22 respectively for the naive benchmark strategy applied directly to the assets. However, this simple approach does not reduce the maximum drawdown significantly (28.65%, 27.59% and 24.97% for the three strategies respectively).
Table 2.2 – Risk Parity Strategies – Key figures

                           Average Return  Volatility  Sharpe Ratio  VaR (95%)  ES (95%)    MDD
60% / 40%                            0.87        9.33          0.09      -4.81     -6.62  28.65
equally-weighted: assets             2.11        9.44          0.22      -5.01     -8.72  27.59
equally-weighted: factors            2.21        8.41          0.26      -4.41     -8.81  24.97
risk parity: assets                  2.25        7.39          0.30      -3.85     -6.73  19.38
risk parity: factors                 2.52        7.05          0.36      -3.59     -6.42  17.91

This table presents the annualized return and volatility (after costs) as well as tail risk metrics of the standard equally-weighted approach and the risk parity strategies applied to the assets and to the statistical factors respectively.
2.6.4 Risk Parity Strategies
In a second step, we build on the traditional risk parity approach, relying on an equal contribution of each asset in the portfolio to the overall portfolio volatility, and apply it to the statistical factors identified using Definition 1. The approach is described in Strategy 2.5.
Figure 2.2 shows that the risk parity strategy applied to the factors improves on the factor-based naive strategy, posting a better net performance and lower drawdowns. Table 2.2 highlights the appeal of our factor risk parity strategy over traditional risk parity and naive diversification respectively.
2.6.5 Risk Budgeting Strategies
In a last step we extend our approach to a risk budgeting framework. We first construct a
portfolio relying on the naive risk budgeting approach detailed in Strategy 2.5, where we
allocate an equal risk budget to each factor.
Figure 2.2 – Risk Parity Strategies – Performance

[Figure: cumulative net performance, monthly returns and drawdowns, Apr. 2001 to Apr. 2015, for the 60% / 40%, equally-weighted factor and factor risk parity strategies.]

This figure compares the performance after costs of two benchmark strategies, the widespread 60-40 and the equally-weighted strategies, to the risk parity approach applied to the statistical factors.
Following Strategy 2.5, we form an additional portfolio where the risk budgets are inversely proportional to the volatility of the statistical factors. Finally, we construct the portfolio relying on the modified effective rank methodology defined in Definition 3 and diversify away the idiosyncratic risk of the assets, as detailed in Strategy 2.5.
We can observe in Tables 2.2 and 2.3 that the strategies applied to the assets directly are all
outperformed by their factor-based counterparts.
Table 2.3 – Risk Budgeting Strategies – Key figures

                                    Average Return  Volatility  Sharpe Ratio  VaR (95%)  ES (95%)    MDD
risk budgets: factors, naive                  2.90        7.73          0.38      -3.70     -4.91  12.28
risk budgets: assets, specific                1.29        7.53          0.17      -4.04     -7.64  17.92
risk budgets: factors, specific               3.58        6.84          0.52      -2.85     -4.09  10.10
risk budgets: assets, proportional            2.18        6.84          0.32      -3.49     -6.12  16.55
risk budgets: factors, proportional           2.38        6.37          0.37      -3.14     -5.85  15.91

This table presents the annualized return and volatility as well as tail risk metrics of the three risk budgeting strategies applied to assets and factors.
Table 2.3 shows that the naive risk budgeting approach provides a considerable improvement relative to the risk parity strategy. Despite a similar Sharpe ratio, tail risk has been substantially reduced (the maximum drawdown falls from 17.91% to 12.28%). However, the proportional risk budgeting approach delivers disappointing results in Figure 2.3, with a similar drawdown pattern to the naive risk budgeting strategy but an inferior annualized performance (2.38% vs 2.90%). The significant added value in terms of annualized return, Sharpe ratio as well as tail risk metrics is provided by Strategy 2.5.
Figure 2.3 – Risk Budgeting Strategies – Performance

[Figure: cumulative net performance, monthly returns and drawdowns, Apr. 2001 to Apr. 2015, for the factor-based naive, specific and proportional risk budgeting strategies.]

This figure compares the performance after costs of risk budgeting strategies based on equal risk budgets and on the volatility of the factors, as well as strategies relying on the diversification of specific risk left unexplained by the statistical factors.
Table 2.4 summarizes the risk-return characteristics of the factor-based strategies considered
in this chapter. Figure 2.4 gives an overview of the performance after costs over time as well as
downside risks of these strategies.
We can observe that the main strategy, relying on the diversification of specific risk, displays the best results: an average return of 3.58%, resulting in a Sharpe ratio of 0.52. This strategy is also characterized by a stable net performance and a relatively low maximum drawdown (10.10%). This dynamic approach, which identifies the number of risk drivers at each rebalancing step and diversifies along them, considerably reduces the portfolio tail risk without altering the performance.
Table 2.4 – Risk Parity vs. Risk Budgeting – Key figures

                                  Average Return  Volatility  Sharpe Ratio  VaR (95%)  ES (95%)    MDD
equally-weighted: factors                   2.21        8.41          0.26      -4.41     -8.81  24.97
risk parity: factors                        2.52        7.05          0.36      -3.59     -6.42  17.91
risk budgets: factors, naive                2.90        7.73          0.38      -3.70     -4.91  12.28
risk budgets: factors, specific             3.58        6.84          0.52      -2.85     -4.09  10.10

This table presents the annualized return and volatility (after costs) as well as tail risk metrics of the best performing factor-based strategies, relying on naive diversification, risk parity as well as risk budgeting techniques.
Figure 2.4 – Risk Parity vs. Risk Budgeting – Performance

[Figure: cumulative net performance, monthly returns and drawdowns, Apr. 2001 to Apr. 2015, for the equally-weighted, risk parity, naive risk budgeting and specific risk budgeting factor strategies.]

This figure compares the performance after costs of the best performing factor-based strategies, relying on naive diversification, risk parity as well as risk budgeting techniques.
Figure 2.5 – Diversification along assets and factors

[Figure: diversification measure D of each strategy, computed along assets and along factors.]

This figure compares the diversification measure of the five different strategies, when applied directly to the asset classes or using the statistical risk factors.
2.6.6 Diversification Analysis
Figure 2.5 reveals the importance of a thorough analysis of inherent risk drivers to assess the diversification of a given portfolio. Applying the measure D detailed in Equation (2.9) to the risk factors and assets respectively, we observe that the asset-based measure, blurred by correlation among assets, underestimates the true diversification. The traditional 60-40 strategy, invested in only two asset classes, reveals a relatively poor diversification along assets and factors (23% and 64% respectively); interestingly, the asset-based measure underestimates in this case the diversification level of this portfolio. We can also observe that the two risk parity portfolios are the only strategies showing perfect diversification (along assets and factors respectively). Such an extreme degree of diversification is not required and can even be sub-optimal in terms of risk-adjusted performance, as evidenced in the performance Tables 2.2 and 2.3. Moreover, the relatively poor net performance of asset-based strategies relative to their factor-based counterparts reveals the importance of identifying the risk factors driving the asset returns when constructing a portfolio⁸ (see Tables 2.2 and 2.3).
⁸This finding is consistent with other work, such as Bhansali et al. [9].
2.7 Conclusion
We propose a novel approach for dynamic risk budgeting. The model extends the statistical minimum-torsion approach proposed by Meucci et al. [63] and shrinks the number of uncorrelated factors with the help of the modified effective rank methodology of Kakushadze and Yu [44]. Relying on the rationale of arbitrage pricing theory (APT), we apply a risk budgeting investment strategy where we diversify away unrewarded risks. We compare four different investment strategies. The first one is similar to the equally-weighted approach and relies on a naive diversification of the uncorrelated factors. We then draw on the naive diversification rationale and propose a risk budgeting approach that assigns an equal risk budget to each factor; these budgets are then expressed as risk budgets along the assets, using the minimum-torsion matrix. The two remaining risk budgeting strategies set risk budgets that are inversely proportional to the volatility of the factors and to the specific risk left unexplained by the statistical factors, respectively. We show that substantial improvements in the Sharpe ratio as well as in tail risk metrics can be obtained by applying these dynamic statistical risk-based investment strategies. Promising results in terms of performance and risk management are provided by the last strategy considered, which will be used as the strategic allocation scheme in the real-world application presented in Chapter 6.
3 Portfolio Optimization
The major decision in the portfolio management process consists in allocating investment capital to a given universe of investable assets, with respect to a set of assumptions on the market dynamics and constraints. It has been shown that strategic asset allocation is decisive in determining the expected return and risk of a portfolio and that security selection only plays a minor role. This long-term decision process is crucial for institutional as well as private investors, and all financial aspects, such as current wealth, future income and expenses, goals, inflation, etc., should be considered.
Although some asset classes display a higher return than others over a long-term horizon, in the short run the investor cannot neglect risk in the analysis. As the financial crisis of 2008 showed, asset classes that have historically displayed a low correlation can suddenly drop sharply and simultaneously. Choosing the right balance between investment opportunities thus depends on the risk the investor is ready to accept, which usually fluctuates over time due to changes in wealth levels, market environment or investor's goals.
We start with an overview of the single-period optimization framework in Section 3.1 and continue our review with previous research in multi-period optimization and the solutions proposed in Section 3.2. Finally, a short overview of Dynamic Programming (DP) techniques used in multi-period optimization is provided in Section 3.3.
3.1 Single-Period Optimization
Markowitz [54], in his seminal work, treated asset returns as random variables and argued that, for the evaluation of a portfolio, one should consider both its expected return and its risk, representing risk by the portfolio's variance. His mean-variance framework laid the foundations for modern finance and explained how financial markets work. This mean-variance framework is referred to as modern portfolio theory, whereas post-modern portfolio theory considers further extensions including non-normal distributions and asymmetric risk measures.
3.1.1 Modern Portfolio Theory
Modern Portfolio Theory as proposed by Markowitz [54] frames the time dimension of investing as a single period over which the parameters of the probability distribution of asset returns are both known with certainty and fixed. The future is treated as a single period which starts today but ends only at some unknown moment in the future. This second assumption has received attention in the theoretical literature, but there has been little progress in terms of practical advances available to financial practitioners.
The "single-period" framework for portfolio optimization is legitimized by assuming that all frictions that impact portfolio formation and rebalancing are negligible. If the cost of rebalancing is zero, then the single-period assumption delivers the optimal solution. While these costs may be very small for some investment assets held by some investors, the necessary conditions are not fulfilled in most practical cases.
For almost all real-world investors, portfolio rebalancing is costly. For taxable investors holding
illiquid assets such as private equity or real estate, transaction costs are often predominant
with respect to return and risk considerations unless holding periods exceed multiple decades.
3.1.2 Mean-Variance Framework
Markowitz [54] paved the way to a new era of modern portfolio management when he presented the mean-variance framework for managing and optimizing portfolios. Portfolio variance is a valid risk measure for ranking investors' preferences if either the investor exhibits a quadratic utility function or the underlying asset returns are normally distributed.
min_w  (1/n) ∑_{i=1}^{n} ( ∑_{j=1}^{m} wj ( ri,j − μj ) )²

subject to  ∑_{j=1}^{m} wj μj = R ,

            ∑_{j=1}^{m} wj = 1 ,

            wj ≥ 0 , ∀ j ∈ {1, . . . , m} ,

where wj, j = 1, . . . , m, are the asset weights, ri,j, i = 1, . . . , n, are the observed returns and μj is the expected return of asset j.
This optimization problem effectively minimizes portfolio risk, measured by the variance, subject to the portfolio forecast return being equal to R, a full investment constraint and positivity constraints on the weights. While it is simple to express the problem in its quadratic form, such that the variance equals w′Σw, we leave the problem here in its more general nonlinear programming form, which allows nonlinear constraints, including long-short optimization with a leverage constraint.¹
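As a quick numerical check of the quadratic-form remark, the sketch below verifies that the objective above equals w′Σw when Σ is the (biased, divisor-n) sample covariance; the return data are made up for illustration.

```python
def sample_cov_and_mean(returns):
    """Biased sample covariance (divisor n), matching the 1/n in the objective."""
    n, m = len(returns), len(returns[0])
    mu = [sum(r[j] for r in returns) / n for j in range(m)]
    cov = [[sum((r[a] - mu[a]) * (r[b] - mu[b]) for r in returns) / n
            for b in range(m)] for a in range(m)]
    return cov, mu

def markowitz_objective(w, returns, mu):
    """(1/n) * sum_i ( sum_j w_j (r_ij - mu_j) )^2."""
    n = len(returns)
    return sum(sum(wj * (r[j] - mu[j]) for j, wj in enumerate(w)) ** 2
               for r in returns) / n

returns = [[0.02, 0.01], [-0.01, 0.00], [0.03, 0.02], [0.00, -0.01]]
cov, mu = sample_cov_and_mean(returns)
w = [0.6, 0.4]
quad = sum(w[a] * cov[a][b] * w[b] for a in range(2) for b in range(2))  # w'Σw
```

Expanding the square in the objective term by term yields exactly the quadratic form, which is why the problem can equally be handed to a quadratic programming solver.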
¹In the case of constraints exhibiting a quadratic form, the problem can also be posed as a second order cone
Criticism of variance as a valid method for assessing the risk of a given portfolio is mainly aimed at the quadratic utility assumption, which is a mathematical convenience rather than a reflection of reality, as it implies irrational investor behavior, preferring less to more after a certain point on the utility curve; the multivariate normality assumption, in turn, is not usually borne out by empirical data. Hanoch and Levy [39] were the first to criticize variance as a risk measure, as it penalizes both upward and downward deviations at the same rate².
However, its ease of use and tractability have made it a very popular choice, with numerous extensions providing for robustness and uncertainty, mainly in the derivation of the covariance matrix. For example, James and Stein [43] provide a shrinkage estimator, Black and Litterman [10] a semi-Bayesian approach, while Michaud [65] offers a general criticism of the approach together with a patented alternative based on resampling methods.
3.2 Multi-Period Optimization
The optimal portfolio selection problem has always played a predominant role in applied
financial research. Numerous papers strive to answer questions like how to construct an
optimal portfolio based on historical data, how the portfolio choice is influenced by asset
returns, etc.
The seminal work of Markowitz [54] relies on a trade-off between the expected return and risk of a given portfolio, where risk is defined as portfolio variance. His approach is equivalent to the well-known mean-variance utility maximization problem. This easy-to-implement methodology is, despite its caveats, still very popular in the financial industry and solves the static (single-period) portfolio choice problem (see Brandt and Santa-Clara [16]).
However, most industry problems rely on finding an optimal investment strategy over a long-term investment horizon, and this issue has not yet been fully solved. The multi-period portfolio selection problem was first formulated by Markowitz [55] in his book. Mossin [69] also covered this topic, followed by Samuelson [86] and Merton and Samuelson [60]. At the beginning of the 21st century, a sizeable literature focused on this topic (see Li and Ng [52], Steinbach [93], Leippold et al. [51], Brandt and Santa-Clara [16], Celikyurt and Özekici [20], Skaf and Boyd [89]), but a closed-form solution, except in very restrictive cases, has not been provided yet.
Literature Review
Much research has been carried out on the formulation of full multi-period optimization. Mossin [69] suggests an explicit multi-period approach for portfolio optimization. A research paper by Cargill and Meyer [18] focuses on the risk perspective of the multi-period
(SOCP) problem.
²The criticism was not only aimed at variance but at any symmetric dispersion measure.
optimization problem. This was followed by Merton [59] who introduced a continuous-time
approach similar to mean-variance optimization, and by Pliska [77] who provided a discrete-
time alternative related to the single period method.
A solution to multi-period mean-variance optimization problems relying on dynamic stochastic programming is provided by Li and Ng [52]. These methods investigate a range of potential paths of future outcomes and select the portfolio composition that meets a defined objective as well as the client's constraints. Such techniques are often used by institutional investors or high net-worth individuals for asset allocation purposes. The computational power required by this method often restricts the number of assets that can be dealt with. Even with today's improvements in computational efficiency, the methodology is still only viable for portfolios with a small number of assets. Using a set of simplifying assumptions, Sneddon [90] provides a closed-form solution to the multi-period optimization, including optimal turnover, that could be applied to problems with a large number of assets.
Another line of research in this domain focuses on the idea of creating rules that inform
investors when it is really necessary to rebalance their portfolio weights. In the absence of
statistically significant and economically material advantage such rules simply tell the investor
to do nothing, hence avoiding rebalancing costs altogether. Preliminary research in this area
includes Rubinstein [85] who examines the efficiency of continuous rebalancing and proposes
a rule for avoiding unnecessary turnover. Kroner and Sultan [48] propose a "hurdle" rule for
rebalancing currency hedges when return distributions are time-varying, while Engle et al.
[29] propose a similar hurdle on alpha improvement as the trigger for rebalancing actively
managed asset allocations.
Bootstrap resampling techniques were used by Bey et al. [8] to identify "indifference" regions
along the efficient frontier. In Gold [35], a similar technique is applied to define indifference
to rebalancing portfolios for illiquid asset classes such as real estate. In Michaud and Michaud
[64], a parametric resampling technique is applied to measure the confidence interval on
portfolio return and risk to design a “when to trade rule.”
In Markowitz and Van Dijk [56] a rebalancing rule based on game theory to approximate multi-
period optimization is defined, but the authors argue it is mathematically intractable (at least
in closed form) for large problems. In Kritzman et al. [47] the authors test the efficiency of the
rebalancing rule presented in Markowitz and Van Dijk [56] against full dynamic programming
for cases up to a maximum of five assets, as dynamic programming becomes computationally
infeasible for larger numbers of assets. The authors then extend the rule to one hundred assets.
Practitioners often resort to some kind of "all or nothing" rebalancing strategy. There are,
however, two main impediments to such extreme rules. When active managers are “inactive”
because the potential benefits of rebalancing are too small, this lack of trading is perceived by
clients as the manager being neglectful rather than as an analytically-driven decision to reduce
trading costs. Presumably this objection could be overcome by appropriate communication
between the asset manager and their investors. The second argument is that after a period
of inactivity, the eventual rebalancing concentrates the required trading into a particular
moment in time. For large investors the market impact arising from doing trades that are a
larger fraction of available trading volume per unit of time will create higher transaction costs
than if the trading had been done gradually between the previous portfolio rebalancing and
the current one. For portfolios that are composed of assets with homogeneous transaction
costs it is common to simply place heuristic limits on the amount of turnover allowed in a
given rebalancing procedure.
A value-added/turnover efficient frontier is proposed in Grinold and Stuckelman [37]. The
authors derive that, under certain common assumptions, value added, measured as the improvement
in utility, is approximately a square-root function of turnover. As such, investors can optimize
their portfolios without considering trading costs, and then simply choose an intermediate
point between the initial portfolio and the optimal portfolio that results in the best trade-off
between utility improvement and incurred transaction costs.
Multi-Period Optimization based on Expected Utility
Mossin [69] was the first to analyze optimal multi-period portfolio strategies based on maxi-
mizing expected utility. His research focused on isolating the class of utility functions of
terminal wealth for which the optimal decision is independent of asset returns beyond the
current period, whatever the intermediate wealth levels. Such functions are called myopic
and have the obvious benefit that the optimal
multi-period strategy can be achieved by only taking the current period into account. The
author found that for general asset return distributions, the logarithmic utility function is
completely myopic. When asset returns are serially independent, power functions are the ideal
candidates.
Mossin also concluded that for investors exhibiting a risk tolerance that is linear in wealth
(the so-called HARA utility functions), and in the presence of a risk-free asset whose return
is known for the whole investment horizon, these functions lead to partial myopia: the investor
would optimally invest in a given period as if he would only invest in the risk-free asset in
subsequent periods. Accordingly, if the risk-free rate is zero, complete myopia applies.
Hakansson [38] showed that even when asset returns are serially independent, for HARA utility
functions, no myopic strategies are optimal except in the very restrictive case where there
are no constraints on leverage or short sales. When such restrictions are present, only the
power and logarithmic utility functions lead to a myopic optimal strategy.
To summarize, a myopic strategy is optimal when the investor exhibits a logarithmic utility
function, for both serially dependent and independent asset return distributions. In the case
of the power utility function, myopia applies only for serially independent asset return
distributions. Finally, in the absence of restrictions on leverage and short sales and when
the investor exhibits a HARA utility function, a myopic strategy applies only for serially
independent distributions.
Multi-Period Optimization under Independence Assumption
Li and Ng [52] as well as Leippold et al. [51] showed that a closed-form solution to a discrete-
time multi-period portfolio optimization problem can be obtained within a mean-variance
framework when assuming independence of the asset return distributions.
Brandt and Santa-Clara [16] provided an answer to the multi-period portfolio selection prob-
lem assuming that the portfolio weights can be represented as a linear function of certain
state variables. This assumption leads to a massive simplification of the optimization problem
but provides only a local maximum, which may differ substantially from the global solution.
In continuous time, Duffie and Richardson [24], Basak and Chabakauri [4] as well as
Aït-Sahalia et al. [3] found solutions to the multi-period portfolio selection problem.
Multi-Period Optimization under Quadratic Utility Function
Bodnar et al. [11] derived a closed-form solution to the dynamic portfolio choice problem,
with and without a risk-free asset, under relatively weak assumptions. They imposed only the
existence of the conditional mean vectors and covariance matrices, and made no assumptions
about the autocorrelation structure or about the asset return distribution. Their solution can
be applied to stationary as well as non-stationary stochastic models. However, it relies on
the quadratic utility function, which displays increasing absolute risk aversion, as shown in
the previous chapter.
Brandt (2006) showed that the quadratic utility function is a good approximation of other
utility functions and, above all, provides welcome support to the mean-variance framework of
Markowitz. Indeed, Tobin [94] showed that the Bernoulli principle is met if returns are
normally distributed or if the investor displays a quadratic utility function. As asset
returns are practically never normally distributed, the quadratic utility assumption justifies
the use of the traditional mean-variance model.
Bodnar et al. [11] also proved that under the assumption of independent asset returns, the
optimal multi-period portfolio allocation at a given rebalancing time is closely related to the
optimal single-period portfolio allocation. Both portfolios differ only in the coefficient of risk
aversion. The authors showed that if the allocation is based on the tangency portfolio, the
multi-period solution is the same as the one obtained by solving the single-period problem at
each rebalancing time.
Multi-Period Optimization with a Downside Mean-Square Error Objective
Skaf and Boyd [89] consider the problem of multi-period portfolio optimization with arbitrary
distributions of asset returns and only a self-financing budget constraint. The authors use a
mean-square error objective function. They show that when no other constraint is added to the
model, the optimization problem can be solved by a standard dynamic programming approach
and the resulting optimal policy is affine. The sub-optimal policy for the constrained case
involves solving a convex quadratic program at each step, using the Bellman value function of
the unconstrained problem to approximate the future value of the portfolio. They provided
examples showing that even in the presence of transaction costs their sub-optimal policy
performs as well as in the case without transaction costs.
Instead of relying on common utility functions, which all have pros and cons when applied to
real-world problems, we chose to follow the idea proposed in Skaf and Boyd [89]. Based on
this idea, we focus our research on an investor willing to reach a desired wealth level
$w_{\text{target}}$ over a finite horizon $T$, and choose an objective function which penalizes
final wealth levels that fall below this targeted wealth level. However, such an objective
function does not display the useful convexity property required by the quadratic programming
approaches used by Skaf and Boyd.
3.3 Dynamic Programming Techniques
Optimizing a decision policy within an industrial framework usually leads to a problem of
sequential decision-making under uncertainty. This class of questions is addressed under the
framework of discrete-time stochastic control and in most cases can be formulated under the
umbrella of Markov Decision Processes (see Bäuerle and Rieder [5], Bertsekas [7], Feinberg and
Schwartz [33], and Puterman [79]).
Dynamic programming (DP) offers some techniques to find the optimal policy. DP represents
the optimal controls in terms of an optimization problem involving the value function of the
stochastic control problem (see Bertsekas [7]).
With an increasing number of time steps, severe difficulties arise when solving generic real-
world applications, since the underlying state variables in practice usually must be modeled
in terms of high-dimensional controlled Markov processes. These issues cause a variety of
problems, frequently referred to as the curse of dimensionality.
This makes even representing the value function intractable when the state or action spaces
are infinite, or as a practical matter, when the number of states or actions is large. Even when
the value function can be represented, evaluating the optimal policy can still be intractable.
Closed-form solutions to such problems are the exception; usually an exact solution is out of
reach, and it is not of primary importance in real-world applications. For these reasons,
approximate numerical solutions (approximate dynamic programming (ADP)) are targeted
almost always in practice as a general method for finding sub-optimal control policies (see
Powell [78]). In ADP, approximate value functions are substituted for value functions in the
expression for the optimal policy. The goal is to select the control-Lyapunov function (i.e. the
approximate value function) so that the performance of the resulting policy is close to optimal.
The accumulation of numerical inaccuracies is the main difficulty in the step-wise calculation
of approximate solutions via backward induction (see Bender et al. [6]). Due to the interleaved
application of numerical integration, the calculation of each value function relies on one
which was obtained in the previous step. This concatenation causes a deviation from the true
value functions, inevitably progressing with the number of time steps. This difficulty becomes
severe for a generic real-world problem, since the underlying state variables in practice usually
must be modeled in terms of high-dimensional controlled Markov processes.
4 Model Predictive Control
In this chapter, we introduce in Section 4.1 linear convex stochastic control problems con-
sidered in a multi-period portfolio optimization context, in the presence of transaction costs
and portfolio restrictions. We shortly outline in Section 4.2 how such linear convex prob-
lems can be handled by Model Predictive Control (MPC) techniques and detail in Section 4.3
the methodology, when scenarios are included in the multi-period portfolio optimization
problem. In Section 4.4 we explain the kind of transaction costs that can be handled by the
scenario-based MPC scheme and how to include them. In Section 4.5, we detail the inclusion
of portfolio constraints into the scenario-based MPC scheme. We formulate in Section 4.6 the
scenario-based multi-period MPC portfolio optimization problem and show how this problem
can be split into a convex quadratic and a non-quadratic component.
4.1 Introduction
We consider a multi-period optimal investment strategy in discrete time, with a finite
horizon and a time-varying distribution of returns. Such optimization problems are formulated
as stochastic control problems, usually with linear dynamics, which are easier to handle. These
linear dynamics require that we model the wealth evolution in terms of assets' values instead
of the usual portfolio weights. Investor constraints as well as transaction costs cannot be
neglected in real-world applications and can normally be defined as convex functions, leading
to linear convex stochastic control problems.
Linear convex stochastic control problems can easily be solved in the absence of transaction
costs, as they reduce to a sequence of single-period optimization problems. When only
so-called impact costs are considered, the optimization problem can also be solved by
dynamic programming (DP), because these costs can be formulated as a quadratic function.
In this case, Skaf and Boyd [89] showed that the resulting optimal trading policies reduce to
affine functions of the current portfolio. When a small number of assets is considered, i.e. a
maximum of three assets, we can resort to numerical dynamic programming to find the
optimal policy.
When non-quadratic transaction costs are considered, such as bid-ask spreads or brokerage
costs, the optimization problem is no longer computationally tractable. As an exact solution
cannot be found, we resort to a suitable approximation, a so-called sub-optimal policy. Sev-
eral techniques can be applied to derive the approximation, such as Approximate Dynamic
Programming and Model Predictive Control (MPC). As changes in financial markets occur
often and have to be considered, we focus on techniques that can directly incorporate these
market changes. ADP-based policies require some pre-computations to find the
approximate value functions, which can be computationally costly. MPC, however, has the
advantage of not requiring pre-computations and can thus deal with sudden market changes,
expressed in terms of modified return expectations as well as time-varying covariances.
4.2 Background
Model Predictive Control is built around the idea of controlling a system by predicting its
evolution and choosing an optimal control based on the forecasted trajectory. In a portfolio
management context, the controls correspond to a trade vector which is usually selected to
minimize so-called stage-cost functions over the system’s states and controls. In MPC, all
future asset returns are simply replaced by their expected values.
MPC is therefore often chosen for its ability to handle portfolio constraints, as well as its
appealing properties in terms of performance and robustness (see Morari et al. [67], Mayne
et al. [58], Mayne [57]).
Linear convex optimization problems can be solved by various efficient algorithms, which
aim at improving the computation time or at approximating the optimal solution with a
sub-optimal policy. Modern interior-point techniques, as presented in Wang and Boyd [96],
resort to a few complex iterations, whereas fast gradient, multiplicative dual update and
Alternating Direction Method of Multipliers (ADMM) schemes rely on many simple iterations
(see Nesterov and Nemirovskii [71], Richter et al. [80], Nesterov [70] and Parikh and Boyd [73]).
In this thesis, we resort to ADMM, also known as the Douglas-Rachford algorithm, for solving
our MPC problem. This algorithm is well suited to portfolio optimization problems and to
high-frequency trading contexts, where execution speed is favored over precision.
4.3 Scenario-Based MPC
In this framework, we handle uncertainty pertaining to the system’s dynamics by generating a
set of scenarios, i.e. a set of dynamics equations, and find for each scenario $s$ the sequence
of trades (controls) over the whole investment horizon that minimizes the expected costs.
The state and control vectors at time $t$ are denoted by $x_t^s \in \mathbb{R}^n$ and
$u_t^s \in \mathbb{R}^n$ respectively, where $s$ corresponds to a given scenario and
$t = 1, \dots, T$.
We consider an uncertain discrete-time linear convex optimization problem over a horizon $T$,
with $S$ dynamics equations and corresponding time-varying probabilities. At every time step
$t \ge 0$, given the current state $x(t)$, the scenario-based MPC problem is defined by

$$
\begin{aligned}
\min_{(x^s, u^s)_{s=1}^S} \quad & \sum_{s=1}^{S} p_t^s \, \Phi(x^s, u^s) \\
\text{subject to} \quad & x_{t+1}^s = A_t^s x_t^s + B_t^s u_t^s + f_t^s \\
& (x_t^s, u_t^s) \in \mathcal{X}_t^s \times \mathcal{U}_t^s \\
& x_0^s = x_{\text{start}}, \qquad s = 1, \dots, S, \; t = 0, \dots, T-1,
\end{aligned}
\tag{4.1}
$$

where $p_t^s \in [0,1]$ is the probability of scenario $s$ at time $t$ with
$\sum_{s=1}^S p_t^s = 1$, $f_t^s$ is a disturbance vector, and the sets $\mathcal{X}_t^s$ and
$\mathcal{U}_t^s$ are polyhedral constraints on the states and controls in each scenario.
For each scenario $s$, the objective function $\Phi(x^s, u^s)$ in (4.1) can be written as

$$
\Phi(x^s, u^s) := \frac{1}{2} \sum_{t=0}^{T} \left( (x_t^s)^T Q_t^s x_t^s + (u_t^s)^T R_t^s u_t^s \right),
\tag{4.2}
$$

where $Q_t^s \in \mathbb{R}^{n \times n}$ is a stage-cost matrix on the state and
$R_t^s \in \mathbb{R}^{n \times n}$ is the stage-cost matrix for the controls. $Q_t^s$ and
$R_t^s$ are usually symmetric positive semi-definite matrices.
We show in Section 4.6 how to translate this constrained portfolio optimization problem into
the constrained scenario-based linear-convex control problem that we have just defined.
4.4 Portfolio, Benchmark and Trading
The portfolio and benchmark universe are composed of $n$ assets. The state variable
$x_t^s \in \mathbb{R}^n$ corresponds to the dollar amount invested in each asset at time $t$
in scenario $s$, $b_t^s \in \mathbb{R}^n$ corresponds to the benchmark composition at time $t$
in scenario $s$, whereas the control variable $u_t^s \in \mathbb{R}^n$ is the amount of each
asset bought or sold at time $t$.

We define $\mu_t^s$ and $g_t^s = \mathbf{1} + \mu_t^s$ as the vectors of expected returns and
expected gains in scenario $s$ for the period $t$ to $t+1$, respectively.

We assume the following linear dynamics for the portfolio wealth in a given scenario $s$:

$$
x_{t+1}^s = G_t^s (x_t^s + u_t^s), \qquad s = 1, \dots, S, \; t = 0, \dots, T-1,
$$

where $G_t^s = \operatorname{diag}(g_t^s)$ is the diagonal matrix of expected asset gains at
time $t$, in scenario $s$. Asset gains are typically non-negative.
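These linear dynamics are straightforward to simulate. The following sketch (with illustrative gains and trades that are not taken from the thesis) propagates the dollar holdings of one scenario forward; since $G_t^s$ is diagonal, the matrix-vector product reduces to an element-wise multiplication:

```python
import numpy as np

def propagate_wealth(x0, trades, gains):
    """Propagate dollar holdings x_{t+1} = G_t (x_t + u_t) through one scenario.

    x0     : (n,) initial dollar holdings
    trades : (T, n) dollar trades u_t for t = 0, ..., T-1
    gains  : (T, n) gross gains g_t = 1 + mu_t per period
    """
    x = np.asarray(x0, dtype=float)
    path = [x]
    for u_t, g_t in zip(trades, gains):
        # G_t is diagonal, so G_t @ (x + u) is just g_t * (x + u) element-wise.
        x = g_t * (x + u_t)
        path.append(x)
    return np.array(path)

# Two assets, two periods; the first trade is self-financing (entries sum to zero).
x0 = np.array([100.0, 0.0])
trades = np.array([[-50.0, 50.0], [0.0, 0.0]])
gains = np.array([[1.02, 1.01], [1.00, 1.03]])
path = propagate_wealth(x0, trades, gains)
```

The returned `path` has one row per time step, which is convenient for checking constraints such as self-financing along the trajectory.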
The benchmark at time 0 is known and its dynamics in a given scenario are thus fully
deterministic:

$$
b_{t+1}^s = G_t^s b_t^s, \qquad s = 1, \dots, S, \; t = 0, \dots, T-1.
$$
We highlight that this formulation is needed to obtain the required linear wealth dynamics.
This requirement would not be met if other state variables had been chosen, such as portfolio
weights or the number of shares invested in each asset.
The trades in scenario $s$ are determined in each period $t$ by the policy
$\phi_t^s : \mathbb{R}^n \to \mathbb{R}^n$:

$$
u_t^s = \phi_t^s(x_t^s), \qquad s = 1, \dots, S, \; t = 0, \dots, T-1.
$$

We note that the trades done at time $t$ only depend on the portfolio holdings $x_t^s$ in a
given scenario. Bertsekas [7] showed that there is no added value in including past returns or
portfolio states in the trading policy. With this setting, we can easily enforce a self-financing
strategy by adding the constraint $\mathbf{1}^T u_t^s = 0$.
4.5 Constraints
The post-trade constraint set $\mathcal{C}_t$ defines which post-trade portfolios are
acceptable. $\mathcal{C}_t$ is assumed nonempty, and thus for any value of $x_t^s$ we can find
a $u_t^s$ for which

$$
x_t^{s*} = x_t^s + u_t^s \in \mathcal{C}_t, \qquad t = 0, \dots, T-1, \; s = 1, \dots, S,
$$

where $x_t^{s*}$ corresponds to the portfolio at time $t$, in scenario $s$, just after trading.
As the pre-trade portfolio $x_t^s$ is determined by the random asset gains realized in the
previous time period, it is not directly under our control and we thus have to set the
constraints on the post-trade portfolio. In the special case of a simple long-only constraint,
due to the non-negativity of the asset gains $g_t^s$, the pre-trade portfolio $x_t^s$ will also
meet the restriction.
4.5.1 Minimum and Maximum Weights
We can set constraints on the minimum and maximum allowed positions for each asset
separately or, better yet, set position limits relative to the total portfolio value (i.e. weights):

$$
-x_t^{s*} \le -(\mathbf{1}^T x_t^{s*})\, \gamma_t^{lb}, \qquad
x_t^{s*} \le (\mathbf{1}^T x_t^{s*})\, \gamma_t^{ub}, \qquad
s = 1, \dots, S, \; t = 0, \dots, T-1,
$$

where $\gamma_t^{lb}, \gamma_t^{ub} \in \mathbb{R}^n$ contain positive entries, which ensure
that the value invested in asset $i$ meets or exceeds the fraction $(\gamma_t^{lb})_i$ and does
not exceed the fraction $(\gamma_t^{ub})_i$ of the total portfolio value, respectively. This
constraint is convex, with $\mathcal{C}_t$ a polyhedron (see Boyd et al. [15]).
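A candidate post-trade portfolio can be checked against these relative position limits in a few lines. A small sketch with hypothetical bounds (the function name and the numbers are illustrative, not from the thesis):

```python
import numpy as np

def weight_bounds_satisfied(x_post, g_lb, g_ub):
    """Check the relative position limits element-wise:
    each position must lie between its lower and upper fraction
    of the total portfolio value 1^T x_post."""
    total = x_post.sum()
    return bool(np.all(x_post >= total * g_lb) and np.all(x_post <= total * g_ub))

x_post = np.array([40.0, 35.0, 25.0])   # post-trade dollar values (total = 100)
g_lb = np.array([0.10, 0.10, 0.10])     # at least 10% of total value in each asset
g_ub = np.array([0.50, 0.50, 0.50])     # at most 50% of total value in each asset
ok = weight_bounds_satisfied(x_post, g_lb, g_ub)
```

Here every position lies between 10% and 50% of the portfolio value, so the check passes; a portfolio with 90% in one asset would fail both bounds.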
4.5.2 Brokerage Costs and Bid-Ask Spread
Commissions charged by a broker at each transaction can take various forms. However, the
considered function has to be convex to be handled by standard MPC1. If we set the brokerage
fees proportional to the traded volume, we get a convex function given by

$$
\psi_t^s(x_t^s, u_t^s) = \kappa_t^T |u_t^s|, \qquad s = 1, \dots, S, \; t = 0, \dots, T-1,
\tag{4.3}
$$

where $\kappa_t \ge 0$ is the vector of commission rates and the absolute value is element-wise.
When the brokerage fees are uniform, i.e. equal across assets, the function reduces to
$\psi_t^s(x_t^s, u_t^s) = \kappa_t \mathbf{1}^T |u_t^s|$, where $\kappa_t \ge 0$ is a scalar.

Bid-ask spreads can be considered as additional transaction costs, which can be significant
for illiquid assets. These are modeled in the same way as brokerage costs: $(\kappa_t)_i$
corresponds to one-half the bid-ask spread for asset $i$ (see Boyd et al. [15]).
4.5.3 Price impact
When large orders are executed, prices tend to move against the trader as orders are filled.
Such indirect costs are referred to as price impact. A quadratic form for the price-impact
cost can be chosen to ensure convexity of the function and is given by

$$
\psi_t^s(x_t^s, u_t^s) = c_t^T (u_t^s)^2, \qquad s = 1, \dots, S, \; t = 0, \dots, T-1,
$$

where $(c_t)_i \ge 0$ and the square is element-wise.

Another price impact model has been used by Meucci and Nicolosi [62], the 3/2-power
transaction cost $\psi_t^s(x_t^s, u_t^s) = c_t^T |u_t^s|^{3/2}$, which also meets the convexity
requirement.
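The three convex cost models of Sections 4.5.2 and 4.5.3 are simple to evaluate for a given trade vector. The following sketch (with illustrative rate vectors, not taken from the thesis) implements all of them:

```python
import numpy as np

def bid_ask_cost(u, kappa):
    """Proportional brokerage / bid-ask cost: kappa^T |u| (element-wise abs)."""
    return float(kappa @ np.abs(u))

def quadratic_impact_cost(u, c):
    """Quadratic price-impact cost: c^T u^2 (element-wise square)."""
    return float(c @ u**2)

def power_impact_cost(u, c):
    """3/2-power impact cost c^T |u|^(3/2), as used by Meucci and Nicolosi."""
    return float(c @ np.abs(u)**1.5)

u = np.array([100.0, -400.0])      # dollar trades (buy 100, sell 400)
kappa = np.array([0.001, 0.002])   # one-half bid-ask spread per asset
c = np.array([1e-6, 1e-6])         # illustrative impact coefficients
```

All three are convex in `u`, which is what makes them admissible in the linear convex control framework above.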
4.6 Problem Description
Adapting from Meucci and Nicolosi [62], we solve a deterministic, discrete-time and finite-
horizon problem of an investor willing to maximize a satisfaction index relying on expected
risk-adjusted returns against a defined benchmark, under a set of defined scenarios with
time-varying probabilities. Transaction and impact costs as well as lower and upper bounds on
the portfolio weights are considered.

Whereas Meucci and Nicolosi [62] only consider quadratic impact costs in their framework,
we extend the formulation in order to take bid-ask costs into account. Moreover, while Meucci
and Nicolosi [62] track the changes in portfolio exposure at a given time $t$, we choose here
to model portfolio exposures and trades separately, allowing us to consider self-financing
strategies.
1Although non-convex constraints can be handled by hybrid MPC procedures.
We can define the following stage-cost function in each scenario $s$, which considers expected
excess return, tracking error, market impact and the transaction costs defined in (4.3):

$$
p_t^s \left( -\mu_t^{sT} (x_t^s + u_t^s - b_t^s)
+ \lambda_t (x_t^s + u_t^s - b_t^s)^T \Sigma_t^s (x_t^s + u_t^s - b_t^s)
+ \kappa_t^T |u_t^s| + u_t^{sT} \operatorname{diag}(c_t)\, u_t^s \right),
\tag{4.4}
$$

where $p_t^s$ is the (time-varying) probability assigned to scenario $s$, and $\kappa_t \ge 0$
and $c_t \ge 0$ are the vectors of bid-ask and (quadratic) market impact costs, respectively.
$\Sigma_t^s \succeq 0$ corresponds to the covariance matrix in scenario $s$, which is positive
semi-definite, and $\lambda_t > 0$ is the (time-varying) risk aversion parameter applied to the
quadratic risk (tracking error).
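For one scenario and one period, the stage cost (4.4) can be evaluated directly. The sketch below is an illustration of the formula with placeholder inputs, not code from the thesis:

```python
import numpy as np

def stage_cost(p, x, u, b, mu, lam, Sigma, kappa, c):
    """Stage cost (4.4) for one scenario and one period: the probability-weighted
    sum of negative expected excess return, tracking-error penalty, bid-ask
    cost and quadratic impact cost."""
    d = x + u - b                       # post-trade active position vs benchmark
    excess_return = -mu @ d             # negative expected excess return
    tracking_error = lam * d @ Sigma @ d
    bid_ask = kappa @ np.abs(u)
    impact = u @ np.diag(c) @ u
    return p * (excess_return + tracking_error + bid_ask + impact)

# Tiny two-asset example with placeholder numbers.
p, lam = 0.5, 2.0
x = np.array([0.6, 0.4]); u = np.array([0.1, -0.1]); b = np.array([0.5, 0.5])
mu = np.array([0.01, 0.005])
Sigma = np.array([[0.04, 0.01], [0.01, 0.09]])
kappa = np.array([0.001, 0.001]); c = np.array([0.01, 0.01])
cost = stage_cost(p, x, u, b, mu, lam, Sigma, kappa, c)
```

Summing this quantity over all periods and scenarios gives the objective of problem (4.5) below.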
4.6.1 Portfolio Restrictions
We assume that the initial portfolio is fully invested in the first asset and that we liquidate
the portfolio at the end of the investment horizon $T$. Moreover, we impose a self-financing
strategy2. We thus have the following equality constraints in this linear-convex framework:

$$
x_0^s = x_{\text{start}} = (1, 0, \dots, 0)^T, \qquad
x_T^s + u_T^s = 0, \qquad
\mathbf{1}^T u_t^s = 0, \qquad s = 1, \dots, S, \; t = 0, \dots, T-1.
$$
We also add a restriction on the post-trade portfolio weights, as defined in Section 4.5.1, and
obtain the following inequality constraints:

$$
-x_t^{s*} \le -(\mathbf{1}^T x_t^{s*})\, \gamma_t^{lb}, \qquad
x_t^{s*} \le (\mathbf{1}^T x_t^{s*})\, \gamma_t^{ub}, \qquad
s = 1, \dots, S, \; t = 0, \dots, T-1.
$$
2Note that this framework allows handling inflows and outflows as well.
4.6.2 Objective function
Instead of maximizing a satisfaction index, we consider the related minimization problem.
We thus minimize the negative of the satisfaction index.
Using our stage-cost functions (4.4) at each time step $t$, given a set of scenarios
$s = 1, \dots, S$ with probabilities of occurrence $p_t^s$, the initial/final portfolio and
self-financing restrictions as well as the weight constraints, we can formulate the objective
function of our constrained scenario-based linear-convex control problem:

$$
\begin{aligned}
\min_{(x,u)} \quad & \sum_{s=1}^{S} \sum_{t=0}^{T} p_t^s \Big(
-\mu_t^{sT} (x_t^s + u_t^s - b_t^s)
+ \lambda_t^s (x_t^s + u_t^s - b_t^s)^T \Sigma_t^s (x_t^s + u_t^s - b_t^s) \\
& \qquad\qquad\qquad + u_t^{sT} \operatorname{diag}(c_t)\, u_t^s + \kappa_t^T |u_t^s| \Big) \\
\text{subject to} \quad & x_{t+1}^s = G_t^s (x_t^s + u_t^s) + f_t^s \\
& b_{t+1}^s = G_t^s b_t^s \\
& \mathbf{1}^T u_t^s = 0 \\
& x_0^s = x_{\text{start}} \\
& x_T^s + u_T^s = 0 \\
& (\mathbf{1}^T x_t^{s*})\, \gamma_t^{lb} \le x_t^{s*} \le (\mathbf{1}^T x_t^{s*})\, \gamma_t^{ub},
\qquad s = 1, \dots, S, \; t = 0, \dots, T-1,
\end{aligned}
\tag{4.5}
$$

with state and control variables $x_t^s, u_t^s \in \mathbb{R}^n$ for $t = 0, \dots, T$ and
$s = 1, \dots, S$. The variable $f_t^s \in \mathbb{R}^n$ corresponds to cash flows at time $t$,
in scenario $s$.
A discount factor can be included to discount expected satisfaction.
4.7 Decomposition Quadratic / Non-Quadratic
The objective function consists of convex objectives, the so-called stage-cost functions
defined in (4.4). As shown in (4.2), we can rewrite the portfolio optimization problem (4.5)
as a combination of a convex quadratic part $\Phi(x,u)$ and a convex non-quadratic part
$\Psi(x,u)$:

$$
\begin{aligned}
\min_{(x,u)} \quad & \Phi(x,u) + \Psi(x,u) \\
\text{subject to} \quad & x_{t+1}^s = G_t^s (x_t^s + u_t^s) + f_t^s \\
& b_{t+1}^s = G_t^s b_t^s \\
& \mathbf{1}^T u_t^s = 0 \\
& x_0^s = x_{\text{start}} \\
& x_T^s + u_T^s = 0 \\
& (\mathbf{1}^T x_t^{s*})\, \gamma_t^{lb} \le x_t^{s*} \le (\mathbf{1}^T x_t^{s*})\, \gamma_t^{ub},
\qquad s = 1, \dots, S, \; t = 0, \dots, T-1,
\end{aligned}
$$
where $x \in \mathbb{R}^{S \cdot n \cdot (T+1)}$ denotes the portfolio composition (states) and
$u \in \mathbb{R}^{S \cdot n \cdot (T+1)}$ the sequence of trades (controls) for all scenarios
and over the investment horizon, respectively. Similarly, we define
$(x,u) \in \mathbb{R}^{S \cdot 2n \cdot (T+1)}$, which corresponds to the concatenated
portfolios and trades for all scenarios and over the investment horizon.
4.7.1 Quadratic Component
We denote by Φ(x,u) the quadratic part of the objective function, which includes all elements
of the objective function except the bid-ask (or brokerage) costs, which are non-quadratic.
Using the following definitions, where the stacked vectors collect the per-scenario components
and the matrices are block-diagonal,

$$
x = \begin{pmatrix} x^1 \\ x^2 \\ \vdots \\ x^S \end{pmatrix}, \qquad
u = \begin{pmatrix} u^1 \\ u^2 \\ \vdots \\ u^S \end{pmatrix}, \qquad
Q = \operatorname{blkdiag}\left( Q^1, Q^2, \dots, Q^S \right), \qquad
R = \operatorname{blkdiag}\left( R^1, R^2, \dots, R^S \right), \qquad
V = \operatorname{blkdiag}\left( V^1, V^2, \dots, V^S \right),
$$

where, for each scenario $s = 1, \dots, S$, the per-scenario components stack the per-period
components,

$$
x^s = \begin{pmatrix} x_1^s \\ x_2^s \\ \vdots \\ x_T^s \end{pmatrix}, \qquad
u^s = \begin{pmatrix} u_1^s \\ u_2^s \\ \vdots \\ u_T^s \end{pmatrix}, \qquad
Q^s = \operatorname{blkdiag}\left( Q_1^s, Q_2^s, \dots, Q_T^s \right), \qquad
R^s = \operatorname{blkdiag}\left( R_1^s, R_2^s, \dots, R_T^s \right), \qquad
V^s = \operatorname{blkdiag}\left( V_1^s, V_2^s, \dots, V_T^s \right).
$$
The per-period blocks are defined as

$$
\begin{aligned}
Q_t^s &:= p_t^s \left( 2 \lambda_t^s \Sigma_t^s \right) \\
V_t^s &:= p_t^s \left( 2 \lambda_t^s \Sigma_t^s \right) \\
R_t^s &:= p_t^s \left( 2 \lambda_t^s \Sigma_t^s + \operatorname{diag}(c_t^s) \right) \\
q_t^s &:= p_t^s \left( -\mu_t^s - Q_t^s b_t^s \right) \\
r_t^s &:= p_t^s \left( -\mu_t^s - V_t^s b_t^s \right).
\end{aligned}
$$

With these definitions, the convex quadratic terms of the function $\Phi(x,u)$ can be written
in a convenient matrix format:

$$
\Phi(x,u) = \frac{1}{2}
\begin{pmatrix} x \\ u \\ 1 \end{pmatrix}^T
\begin{pmatrix} Q & V & q \\ V^T & R & r \\ q^T & r^T & 0 \end{pmatrix}
\begin{pmatrix} x \\ u \\ 1 \end{pmatrix},
$$

where $Q$, $R$ and $V \succeq 0$.
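This block-diagonal structure maps directly onto standard linear-algebra tooling. Below is a sketch of assembling the per-scenario blocks with SciPy, following the per-period definitions above; the helper name, shapes and inputs are illustrative, not the thesis's code:

```python
import numpy as np
from scipy.linalg import block_diag

def build_quadratic_blocks(p, lam, Sigma, mu, b, c):
    """Assemble the stacked Q, V, R, q, r for one scenario from per-period
    probabilities p[t], risk aversions lam[t], covariances Sigma[t],
    expected returns mu[t], benchmarks b[t] and impact coefficients c[t]."""
    T = len(p)
    Q_t = [p[t] * 2.0 * lam[t] * Sigma[t] for t in range(T)]
    V_t = Q_t                                      # V_t^s equals Q_t^s here
    R_t = [p[t] * (2.0 * lam[t] * Sigma[t] + np.diag(c[t])) for t in range(T)]
    Q = block_diag(*Q_t)                           # (T*n, T*n) block-diagonal
    V = block_diag(*V_t)
    R = block_diag(*R_t)
    q = np.concatenate([p[t] * (-mu[t] - Q_t[t] @ b[t]) for t in range(T)])
    r = np.concatenate([p[t] * (-mu[t] - V_t[t] @ b[t]) for t in range(T)])
    return Q, V, R, q, r

# Two periods, two assets, illustrative inputs.
T, n = 2, 2
p = [0.5, 0.5]; lam = [1.0, 1.0]
Sigma = [np.eye(n) * 0.04] * T
mu = [np.full(n, 0.01)] * T
b = [np.full(n, 0.5)] * T
c = [np.full(n, 0.01)] * T
Q, V, R, q, r = build_quadratic_blocks(p, lam, Sigma, mu, b, c)
```

Because all off-diagonal blocks are zero, the quadratic part decouples across scenarios and periods, which is precisely what the fast solver of the next chapter exploits.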
4.7.2 Non-Quadratic Component
We denote by $\Psi(x,u)$ the non-quadratic convex part of the objective function, which
includes the bid-ask (or brokerage) costs and is defined by

$$
\Psi(x,u) = \kappa^T |u|,
$$

where $\kappa \in \mathbb{R}^{S \cdot n \cdot (T+1)}$ is a column vector which contains the
bid-ask cost for each asset in each scenario, and which is also allowed to vary over time:

$$
\kappa = \begin{pmatrix} \kappa^1 \\ \kappa^2 \\ \vdots \\ \kappa^S \end{pmatrix}, \qquad
\kappa^s = \begin{pmatrix} \kappa_0 \\ \kappa_1 \\ \vdots \\ \kappa_T \end{pmatrix}, \qquad
t = 0, \dots, T.
$$

Not only does this splitting into a quadratic and a non-quadratic convex part facilitate the
representation of the optimization problem, but it will also be helpful in formulating and
understanding the ADMM splitting algorithm in the next chapter.
4.8 Conclusion
We defined a linear-convex problem with restrictions that cannot be solved analytically.
Various efficient tools relying on generic interior-point cone solvers, such as SeDuMi or
SDPT3, can help solve such optimization problems. YALMIP and CVX are two parser-solvers
allowing the user to describe this optimization problem at a high level; the parser then
transforms and expresses it as a cone program. However, evaluating these scenario-based MPC
problems requires solving large quadratic programs, beyond the computational limits of such
solvers.
For real-time applications in a portfolio management context or for trading purposes, a very
high-accuracy trading policy is not required and the focus is set on reducing computation
times. In the following chapter we show how to solve the scenario-based MPC problem in (4.5)
by breaking the global problem into a quadratic optimal control part, which can be solved very
efficiently for all scenarios at once, and a set of single-period problems that can be solved
separately. We also extend the state-of-the-art methodology by integrating the weight
constraints defined in Section 4.5.1 into the two-set splitting framework without using an
additional splitting set.
5 Fast Scenario-Based Optimal Control
Several methods have been employed to solve the optimization problem in (4.5), such as
dynamic programming, approximate dynamic programming, and linear matrix inequalities (see
Boyd et al. [15]).

The Alternating Direction Method of Multipliers (ADMM), which was introduced by Glowinski
and Marroco [34], relies on a simple algorithm that is very well adapted to problems arising
in scenario-based MPC in particular. ADMM splits the whole optimization into sub-problems,
which are easier to solve, in order to converge to a coordinated solution of the (large)
global problem.
In Section 5.1 we start with a review of relevant convex optimization definitions as well as a
description of proximal operators that will be subsequently used by the ADMM algorithm.
The general methodology of the ADMM algorithm is presented in Section 5.2. In Section 5.3 we
apply the algorithm to solve the scenario-based MPC problem presented in the previous chapter
and, following the methods presented in Parikh and Boyd [73], we apply modern techniques from
linear algebra in Section 5.4 to solve the quadratic part of the optimization problem very
efficiently and handle high-dimensional optimization problems. The non-quadratic optimization
problem is solved in Section 5.5 using proximal operators.
In Section 5.6 we present our contribution and extend the current ADMM methodology by
including portfolio weight constraints in the two-set splitting algorithm, without using an
additional splitting set. Moreover, we propose a modification of the termination criterion
used in previous work, which improves convergence speed by embedding the probabilities
assigned to the scenarios into the criterion. The complete Extended Two-Set Alternating
Direction Method of Multipliers algorithm is presented. We conclude this chapter in
Section 5.7.
5.1 Definitions
5.1.1 Convex Functions
Definition 5. A function f : R^n → R ∪ {+∞} is convex if its domain dom f is convex and for any x, y ∈ dom f

    f(θx + (1 − θ)y) ≤ θ f(x) + (1 − θ) f(y),   θ ∈ [0, 1].
Definition 6. A function f : R^n → R ∪ {+∞} is strongly convex with modulus m > 0 if for all x ≠ y in dom f

    f(θx + (1 − θ)y) ≤ θ f(x) + (1 − θ) f(y) − (m/2) θ(1 − θ) ||x − y||_2^2,   θ ∈ [0, 1].

In particular, a strongly convex function is strictly convex, i.e. the inequality above is strict for θ ∈ (0, 1).
Definition 7. A function f : R^n → R ∪ {+∞} is called closed and proper if its epigraph

    epi f = {(x, t) ∈ R^n × R | f(x) ≤ t}   (5.1)

is a nonempty closed convex set. We write the effective domain of f as

    dom f = {x ∈ R^n | f(x) < +∞},

which corresponds to the set of points at which f takes finite values (see Parikh and Boyd [73] for more details).
Definition 8. The indicator function of a closed convex set C_t ⊆ R^{2n} is

    ψ_t(x_t, u_t) = 1_{C_t}(x_t, u_t) = { 0,  (x_t, u_t) ∈ C_t;   ∞,  otherwise. }

We note that the indicator function is a convex function.
5.1.2 Proximal Operators
Proximal operators solve small convex optimization problems, which often have closed-form solutions or can at least be solved very efficiently.
Definition 9. The proximal operator prox_f : R^n → R^n of f is

    prox_f(v) = argmin_x ( f(x) + (1/2) ||x − v||_2^2 ),

where || · ||_2 is the ℓ2 norm. The function being minimized is strongly convex and has a unique minimizer for every v ∈ R^n.
To simplify the notation, the function f is often scaled with a parameter ρ > 0 and we denote

Definition 10. the proximal operator of f with parameter ρ

    prox_{ρf}(v) = argmin_x ( f(x) + (ρ/2) ||x − v||_2^2 ).
In this thesis, transaction costs are formulated as an ℓ1 minimization problem (see Chapter 4, Equation 4.3). As shown in Parikh and Boyd [73],

Definition 11. The proximal operator of f = || · ||_1 is

    (prox_{ρf}(v))_i = { v_i − ρ,  v_i ≥ ρ;   0,  |v_i| ≤ ρ;   v_i + ρ,  v_i ≤ −ρ }

and is also known as the element-wise soft-thresholding operator. Parikh and Boyd [73] show that this can also be expressed in compact form as

    prox_{ρf}(v) = (v − ρ)_+ − (−v − ρ)_+.   (5.2)
The inclusion of inequality constraints related to portfolio weight restrictions, introduced in Section 5.6, relies on the projection onto a box or hyper-rectangle.

Definition 12. The projection onto a box C = {x | l ≤ x ≤ u} takes the form

    (Π_C(v))_k = { l_k,  v_k ≤ l_k;   v_k,  l_k ≤ v_k ≤ u_k;   u_k,  v_k ≥ u_k, }   (5.3)

where l_k, u_k are the lower and upper bounds of the box, respectively.
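For concreteness, the two operators above can be expressed in a few lines of code. This is an illustrative sketch (the function names are ours, not from the thesis), following (5.2) and (5.3) and assuming NumPy arrays:

```python
import numpy as np

def prox_l1(v, rho):
    """Element-wise soft-thresholding, the proximal operator of the l1 norm
    in Definition 11 / Equation (5.2): (v - rho)_+ - (-v - rho)_+."""
    return np.maximum(v - rho, 0.0) - np.maximum(-v - rho, 0.0)

def project_box(v, l, u):
    """Projection onto the box C = {x | l <= x <= u} of Equation (5.3)."""
    return np.minimum(np.maximum(v, l), u)

# Entries inside [-rho, rho] are set to zero; the others shrink towards zero by rho.
shrunk = prox_l1(np.array([3.0, 0.5, -2.0]), 1.0)
boxed = project_box(np.array([-0.2, 0.4, 1.3]), 0.0, 1.0)
```

Both operators act element-wise, which is what makes the non-quadratic step of the splitting algorithm separable across assets and scenarios.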
5.1.3 Proximal minimization
If the function f : R^n → R ∪ {+∞} is a closed proper convex function, the minimization can be carried out with the help of a proximal minimization algorithm. At each iteration, this algorithm, also known as proximal iteration, solves

    v^{k+1} := prox_{ρf}(v^k),

where k is an iteration counter and v^k is the kth iterate. If the problem is feasible, it can be shown that v^k converges to the set of minimizers of f and f(v^k) converges to its minimum value.
According to the definition, prox_f(v) is a point that trades off remaining
close to v and minimizing f. This is why it is also called a proximal point of v with respect to f. The parameter ρ controls this trade-off.
We will subsequently use extended closed proper convex functions, meaning functions that can take the value +∞ outside their domain.
Assumption 1. Functions f and g are closed convex functions.
5.2 ADMM
We present in this section a popular splitting method, known as the Alternating Direction Method of Multipliers (ADMM), which is a special case of the Douglas-Rachford splitting algorithm. It combines the decomposability of dual ascent with the strong convergence properties of the method of multipliers, and is particularly adapted to optimization problems that are too large to be handled by generic solvers (for more details see Parikh and Boyd [73]).
The algorithm solves convex optimization problems of the form

    minimize   f(x) + g(x̄)
    subject to  x − x̄ = 0,   (5.4)

with variables x, x̄ ∈ R^n, where f, g : R^n → R ∪ {+∞} are closed proper convex functions. The optimization variable has been split into two parts, denoted x and x̄, with the objective function separable across this splitting. The so-called consensus constraint ensures that x and x̄ agree.
This method is a so-called proximal algorithm, meaning it solves convex optimization problems using the proximal operators of the objective functions.
To solve the problem in (5.4) we can form the augmented Lagrangian

    L_ρ(x, x̄, y) := f(x) + g(x̄) + y^T(x − x̄) + (ρ/2) ||x − x̄||_2^2,

where ρ > 0 is a parameter and y ∈ R^n is the dual variable associated with the consensus constraint. The alternating direction method of multipliers (ADMM) then consists of the following iterations

    x^{k+1} := argmin_x L_ρ(x, x̄^k, y^k)
    x̄^{k+1} := argmin_{x̄} L_ρ(x^{k+1}, x̄, y^k)
    y^{k+1} := y^k + ρ(x^{k+1} − x̄^{k+1}),

where k is an iteration counter and ρ > 0 is the step size of the algorithm. L_ρ is first minimized over the variable x, using the most recent value of the other primal variable x̄ and of the dual variable y. We observe that the (scaled) dual variable corresponds to the sum of the consensus errors up to iteration k.
As shown in Parikh and Boyd [73], this method enables handling the two objectives separately and taking constraints into account, as the functions can by definition take on infinite values. The two functions f and g are accessed through their proximal operators, and it is assumed that these operators can be evaluated efficiently. Using the scaled dual variable and the proximal operators of f and g respectively, the algorithm reads

    x^{k+1} := prox_{ρf}(x̄^k − y^k)
    x̄^{k+1} := prox_{ρg}(x^{k+1} + y^k)
    y^{k+1} := y^k + x^{k+1} − x̄^{k+1}.
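To illustrate these three updates, the following sketch runs the scaled iteration on a toy consensus problem, min (1/2)||x − a||_2^2 + κ||x||_1, whose solution is the soft-thresholding of a by κ. The toy problem, the step size and the function name are our own choices and are not part of the thesis:

```python
import numpy as np

def scaled_admm(a, kappa, rho=1.0, iters=200):
    """Scaled two-block ADMM for f(x) = 0.5*||x - a||^2 and g(x) = kappa*||x||_1.
    prox_{rho f}(v) minimizes f(x) + (rho/2)||x - v||^2, available in closed form;
    prox_{rho g}(v) is soft-thresholding with threshold kappa / rho."""
    xbar = np.zeros_like(a)
    y = np.zeros_like(a)
    for _ in range(iters):
        x = (a + rho * (xbar - y)) / (1.0 + rho)   # x-update: prox of f
        v = x + y                                   # argument of the prox of g
        xbar = np.sign(v) * np.maximum(np.abs(v) - kappa / rho, 0.0)  # xbar-update
        y = y + x - xbar                            # scaled dual update
    return xbar

# The consensus iterates converge to the soft-thresholding of a by kappa.
x_opt = scaled_admm(np.array([3.0, 0.2, -1.5]), kappa=1.0)
```

With ρ = 1 the scaled and unscaled dual updates coincide; convergence here is linear because f is strongly convex.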
5.2.1 Accelerated ADMM
An accelerated variant of the ADMM presented above has been proposed by Goldstein et al. [36] and is presented in Algorithm 1. The convergence performance of the ADMM can be improved by this predictor-corrector acceleration step, which avoids the inherent spiralling around the optimum. Global convergence can be guaranteed if both objective functions f and g are strongly convex (see Definition 6).
As our portfolio optimization problem will consider constraints on the weights, this requirement is not met. However, Goldstein et al. [36] show that for weakly convex problems, a restart rule can be applied to ensure stability. The restart rule relies on a combined residual built from both the primal and dual residuals:

    e_k = (1/ρ) ||y^k − ŷ^k||_2^2 + ρ ||x̄^k − x̂^k||_2^2.

At every iteration k, we compare e_k to its previous value multiplied by a constant η ∈ [0, 1], usually chosen close to 1.¹ If the combined residual has decreased by a factor of at least η, we apply the acceleration; otherwise we “restart” the algorithm by setting α_{k+1} = 1.
Our extensive testing in the portfolio optimization framework has shown that the inclusion of the Nesterov-based acceleration step does not always improve the performance of the algorithm. Following Boyd et al. [14], we will also consider other acceleration techniques to improve the convergence rate in Section 5.6.

¹ We used η = 0.999 in all our experiments, as indicated in Goldstein et al. [36].
Algorithm 1 Accelerated Alternating Direction Method of Multipliers (aADMM)
Require: Initialize x̂^0 = x̄^{−1}, ŷ^0 = y^{−1}, ρ > 0, α_1 = 1, η ∈ [0, 1]
 1: for iteration k = 1, 2, . . . do
 2:   x^k = argmin_x L_ρ(x, x̂^k, ŷ^k)
 3:   x̄^k = argmin_{x̄} L_ρ(x^k, x̄, ŷ^k)
 4:   y^k = ŷ^k + ρ(x^k − x̄^k)
 5:   if e_k < η e_{k−1} then
 6:     α_{k+1} = (1 + √(1 + 4α_k^2)) / 2
 7:     ŷ^{k+1} = y^k + ((α_k − 1)/α_{k+1}) (y^k − y^{k−1})
 8:     x̂^{k+1} = x̄^k + ((α_k − 1)/α_{k+1}) (x̄^k − x̄^{k−1})
 9:   else
10:    α_{k+1} = 1, x̂^{k+1} = x̄^k and ŷ^{k+1} = y^k
11:  end if
12: end for
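The restart logic of lines 5-11 can be sketched as follows; this is a minimal illustration with our own helper names, not the thesis implementation:

```python
import numpy as np

def combined_residual(y, y_hat, xbar, xbar_hat, rho):
    """Combined residual e_k of Goldstein et al. [36]: a weighted sum of the
    squared dual and primal deviations from the predicted iterates."""
    return np.sum((y - y_hat) ** 2) / rho + rho * np.sum((xbar - xbar_hat) ** 2)

def acceleration_step(alpha_k, e_k, e_prev, eta=0.999):
    """Return (alpha_{k+1}, accelerate): apply the Nesterov update when the
    combined residual decreased by a factor eta, otherwise restart with alpha = 1."""
    if e_k < eta * e_prev:
        return (1.0 + np.sqrt(1.0 + 4.0 * alpha_k ** 2)) / 2.0, True
    return 1.0, False
```

The extrapolation weight starts at α_1 = 1, so the very first accelerated step uses α_2 = (1 + √5)/2.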
5.3 Splitting the MPC Problem
5.3.1 Overview
In this section we show how to fit the optimal control problem defined in (4.5) to the ADMM
framework and how to derive the algorithm, relying on the methodology presented in the
previous section. To ease the reading, we restate the scenario-based MPC problem
    min_{(x,u)}  Σ_{s=1}^{S} Σ_{t=0}^{T} p_t^s ( −(μ_t^s)^T (x_t^s + u_t^s − b_t^s) + λ_t^s (x_t^s + u_t^s − b_t^s)^T Σ_t^s (x_t^s + u_t^s − b_t^s)
                 + (u_t^s)^T diag(c_t) u_t^s + κ_t^T |u_t^s| )

    subject to  x_{t+1}^s = G_t^s (x_t^s + u_t^s) + f_t^s
                b_{t+1}^s = G_t^s (b_t^s)
                1^T u_t^s = 0
                x_0^s = x_start
                x_T^s + u_T^s = 0
                (1^T x_t^{s*}) γ_t^{lb} ≤ x_t^{s*} ≤ (1^T x_t^{s*}) γ_t^{ub},   s = 1, . . . , S,  t = 0, . . . , T − 1,   (5.5)

with state and control variables x_t^s, u_t^s ∈ R^n, t = 0, . . . , T and s = 1, . . . , S. The variable f_t^s ∈ R^n corresponds to the cash flows at time t.
We reformulate the problem as a combination of a convex quadratic part Φ(x,u) and a convex
non-quadratic part Ψ(x,u):

    min_{(x,u)}  Φ(x,u) + Ψ(x,u)

    subject to  x_{t+1}^s = G_t^s (x_t^s + u_t^s) + f_t^s
                b_{t+1}^s = G_t^s (b_t^s)
                1^T u_t^s = 0
                x_0^s = x_start
                x_T^s + u_T^s = 0
                (1^T x_t^{s*}) γ_t^{lb} ≤ x_t^{s*} ≤ (1^T x_t^{s*}) γ_t^{ub},   s = 1, . . . , S,  t = 0, . . . , T − 1.   (5.6)
The splitting decision is not unique and the choice can affect the speed of the algorithm significantly. Both sub-problems, when no closed-form solution exists, should be cheap to solve, e.g. with the help of proximal minimizations or simple projections. Expensive operations such as matrix inversions should be avoided, and quantities that remain constant over the iterations should be pre-factored.
We chose to split the objective function in (5.5) into a quadratic and a non-quadratic part (i.e. a two-set splitting), which will be solved alternately until convergence is reached.
5.3.2 Notation
Input data used in the optimization problem are the initial and final portfolios (initial and terminal states) x_start, x_final ∈ R^n, the dynamics transition (asset gains) matrix G, the self-financing portfolio, the vector of cash flows f, the quadratic cost elements Q, R, S, q and r (see Chapter 4, Section 4.7.1) and the non-quadratic cost function Ψ (see Chapter 4, Section 4.7.2). All data, except the initial portfolio, are given for the investment horizon t = 0, . . . , T and for the scenarios s = 1, . . . , S.
We use x = (x_0, . . . , x_T) ∈ R^{S(n·(T+1))} to denote the portfolio compositions (states) and u = (u_0, . . . , u_T) ∈ R^{S(n·(T+1))} for the trade sequences (controls). Moreover, we define w = (x, u) ∈ R^{S(2n·(T+1))}, which corresponds to the concatenated portfolios and trades for all scenarios over the investment horizon.
We start by expressing the equality constraints in the form of an indicator function. To this purpose, we define the set D of state and control variables that preserve the linear dynamics, the self-financing constraint, as well as the starting and final portfolio constraints. We denote by 1_D the indicator function on D, which is a closed proper convex function as explained in Definition 8. D reads

    D = {(x, u) | x_0^s = x_start, x_T^s = x_final, x_{t+1}^s = G_t^s (x_t^s + u_t^s) + f_t^s, b_{t+1}^s = G_t^s (b_t^s),
         1^T u_t^s = 0, s = 1, . . . , S, t = 0, . . . , T − 1}.   (5.7)
5.3.3 ADMM Formulation
We formulate the scenario-based MPC portfolio optimization problem to fit the ADMM framework. The objective function defined in (5.6) is composed of S · (T + 1) convex objective terms. We already split the function into a convex quadratic function Φ(x,u) and a convex non-quadratic one denoted Ψ(x,u), which we restate here for convenience:

    min_{(x,u)}  Φ(x,u) + Ψ(x,u)

    subject to  x_{t+1}^s = G_t^s (x_t^s + u_t^s) + f_t^s
                b_{t+1}^s = G_t^s (b_t^s)
                1^T u_t^s = 0
                x_0^s = x_start
                x_T^s + u_T^s = 0
                (1^T x_t^{s*}) γ_t^{lb} ≤ x_t^{s*} ≤ (1^T x_t^{s*}) γ_t^{ub},   s = 1, . . . , S,  t = 0, . . . , T − 1.
The quadratic part of the objective, denoted by Φ(x,u), reads

                 [ x ]^T [ Q    V    q ] [ x ]
    Φ(x,u) = (1/2)[ u ]   [ V^T  R    r ] [ u ] ,
                 [ 1 ]   [ q^T  r^T  0 ] [ 1 ]

whereas the non-quadratic function Ψ(x,u) reads

    Ψ(x,u) = κ^T |u|.
We refer to Chapter 4, Sections 4.7.1 and 4.7.2, respectively, for the derivation.
The equality constraints can easily be integrated into the convex optimal control problem through the indicator function 1_D defined in (5.7), which contains the equality constraints of the optimization problem. We can now rewrite the global problem

    min_{(x,u)}  1_D(x,u) + Φ(x,u) + Ψ(x,u)

    subject to  (1^T x_t^{s*}) γ_t^{lb} ≤ x_t^{s*} ≤ (1^T x_t^{s*}) γ_t^{ub},   s = 1, . . . , S,  t = 0, . . . , T − 1.   (5.8)
Relying on the method of augmented Lagrangians, we can formulate the optimal control problem defined in (5.8) in consensus form, as in (5.4) above:

    minimize   1_D(x,u) + Φ(x,u) + Ψ(x̄, ū)
    subject to  (x, u) = (x̄, ū).   (5.9)
We note that the inequality constraint, related to portfolio weights, is not included in this
formulation. We will provide an extension of the ADMM methodology in Section 5.6 to include portfolio weight constraints without using a three-set splitting.
We split the state-control variables and add the consensus constraint that they must agree. The first part of the objective contains the quadratic terms, the return dynamics, the self-financing restriction, as well as the initial and terminal portfolio constraints. The second part is separable across assets and contains the non-quadratic objective on the states and controls. The equality constraint simply states that the two pairs of state and control variables must be in consensus.
5.3.4 Splitting Operator
We define two pairs of state and control variables, denoted x, u ∈ R^{S(n·(T+1))} and x̄, ū ∈ R^{S(n·(T+1))}, and use two dual variables z and y. We initialize the four variables (x̄^0, ū^0) and (z^0, y^0).
We can write the augmented Lagrangian of the splitting problem in (5.9)

    L_ρ(x, x̄, u, ū, z, y) := 1_D(x,u) + Φ(x,u) + Ψ(x̄, ū) − ρ (z, y)^T (x − x̄, u − ū) + (ρ/2) ||(x − x̄, u − ū)||_2^2,

which we can easily reformulate in squared form

    L_ρ(x, x̄, u, ū, z, y) := 1_D(x,u) + Φ(x,u) + Ψ(x̄, ū) + (ρ/2) ||(x − x̄, u − ū) − (z, y)||_2^2 − (ρ/2) ||(z, y)||_2^2.
By minimizing over the primal variables and maximizing over the dual variables we derive the algorithm updates

    (x_opt, u_opt) := argmin_{(x,u)} ( 1_D(x,u) + Φ(x,u) + (ρ/2) ||(x, u) − (x̄, ū) − (z, y)||_2^2 )
    (x̄_opt, ū_opt) := argmin_{(x̄,ū)} ( Ψ(x̄, ū) + (ρ/2) ||(x̄, ū) − (x, u) + (z, y)||_2^2 )
    (z_opt, y_opt) := argmax_{(z,y)} ( −(ρ/2) ||(z, y)||_2^2 + (ρ/2) ||(z, y) − (x − x̄, u − ū)||_2^2 ).
By starting from initial values for the pairs of variables (x̄^0, ū^0) and (z^0, y^0), for k = 0, . . . , max_iter,
the algorithm reads

    (x^{k+1}, u^{k+1}) := argmin_{(x,u)} ( 1_D(x,u) + Φ(x,u) + (ρ/2) ||(x, u) − (x̄^k, ū^k) − (z^k, y^k)||_2^2 )   (5.10a)
    (x̄^{k+1}, ū^{k+1}) := argmin_{(x̄,ū)} ( Ψ(x̄, ū) + (ρ/2) ||(x̄, ū) − (x^{k+1}, u^{k+1}) + (z^k, y^k)||_2^2 )   (5.10b)
    (z^{k+1}, y^{k+1}) := (z^k, y^k) + (x̄^{k+1}, ū^{k+1}) − (x^{k+1}, u^{k+1}),   (5.10c)

where k is the iteration counter and ρ > 0 is the step size of the algorithm. (z^k, y^k) ∈ R^{S(2n·(T+1))} is the scaled dual variable related to the consensus constraint.
We begin by minimizing a sum of convex quadratic functions with respect to the state and control variables, subject to the return dynamics constraint; we thus have a convex quadratic control problem to solve. The second step involves the minimization of a convex objective function that is separable across assets and scenarios and is thus easily parallelizable. The right-hand side of Equation (5.10b) is the proximal operator of the function Ψ evaluated at (x^{k+1} − z^k, u^{k+1} − y^k).
We show in Sections 5.4 and 5.5 how to solve the two sub-problems efficiently for all scenarios s = 1, . . . , S.
Termination Criterion
As detailed in Boyd et al. [14], the ADMM algorithm converges under two mild assumptions, provided a solution exists.

Assumption 2. The extended-real-valued functions f : R^n → R ∪ {+∞} and g : R^m → R ∪ {+∞} are closed proper convex functions.

Assumption 3. The unaugmented Lagrangian L_0 has a saddle point.

Assumption 2 is satisfied if and only if the epigraph of the function, as defined in (5.1), is a closed nonempty convex set. Assumption 3 means explicitly that there exists (x_opt, x̄_opt, y_opt) for which

    L_0(x_opt, x̄_opt, y) ≤ L_0(x_opt, x̄_opt, y_opt) ≤ L_0(x, x̄, y_opt)

holds for all x, x̄ and y. For further details we refer to [14].

The primal and dual residuals for (5.9), denoted respectively r^k and s^k, are given by

    r^k = (x^k, u^k) − (x̄^k, ū^k),   s^k = ρ ( (x̄^k, ū^k) − (x̄^{k−1}, ū^{k−1}) ).
r^k and s^k converge to zero under the defined algorithm. A reasonable termination criterion is that both residuals fall below defined thresholds

    ||r^k||_2 < ε^primal,   ||s^k||_2 < ε^dual,

where ε^primal > 0 and ε^dual > 0 are tolerances for primal and dual feasibility, respectively. Following Boyd et al. [14], these thresholds can be set as

    ε^primal = ε^abs √((T+1)(2n)) + ε^rel max{ ||(x^k, u^k)||_2, ||(x̄^k, ū^k)||_2 }
    ε^dual   = ε^abs √((T+1)(2n)) + ε^rel ||(z^k, y^k)||_2,

where ε^abs > 0 and ε^rel > 0 are absolute and relative tolerance levels, respectively. This ensures that the tolerances scale with the size of the problem as well as with the magnitude of the iterates.
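This stopping test translates directly into code. The following is an illustrative sketch (the function name and the stacked-vector convention are ours, not the thesis implementation):

```python
import numpy as np

def stopping_test(w, wbar, wbar_prev, zy, rho, T, n,
                  eps_abs=1e-4, eps_rel=1e-3):
    """Check ||r^k||_2 < eps_primal and ||s^k||_2 < eps_dual, with tolerances
    scaled by problem size and iterate magnitudes. w, wbar stack states and
    controls; zy stacks the scaled dual variables."""
    r = w - wbar                      # primal residual
    s = rho * (wbar - wbar_prev)      # dual residual
    scale = eps_abs * np.sqrt((T + 1) * 2 * n)
    eps_primal = scale + eps_rel * max(np.linalg.norm(w), np.linalg.norm(wbar))
    eps_dual = scale + eps_rel * np.linalg.norm(zy)
    return np.linalg.norm(r) < eps_primal and np.linalg.norm(s) < eps_dual
```

Coinciding consecutive iterates trivially satisfy the criterion, since both residuals vanish while the tolerances stay strictly positive.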
In contrast to interior-point algorithms, the splitting method converges rapidly to modest accuracy, which is sufficient for portfolio construction purposes. Convergence to highly accurate solutions, however, can take many iterations.
5.4 Solving the Convex Quadratic Control Problem
The convex quadratic terms of the function Φ(x,u) do not include any transaction costs and thus depend only on quantities known at time t for each scenario s. We derive the analytical solution in the current section.
The first step of the splitting algorithm consists in solving a linear-quadratic optimal control problem that can be expressed as

    minimize   (1/2) w^T E w + c^T w
    subject to  A w = d,

where w ∈ R^{(2n)(T+1)} includes the state and control variables (x and u). The quadratic and affine terms are grouped in the objective function and the equality constraint contains the return dynamics of the system, the self-financing constraint, as well as the starting and final portfolio restrictions. We define for s = 1, . . . , S
    c = (c^1, . . . , c^S),   d = (d^1, . . . , d^S),

    E = blkdiag(E^1, . . . , E^S),   A = blkdiag(A^1, . . . , A^S),

i.e. the cost vectors are stacked across scenarios, and E and A are block diagonal with one block per scenario,
where
          [ q_0^s − ρ((x̄_0^s)^k + (z_0^s)^k) ]
          [ r_0^s − ρ((ū_0^s)^k + (y_0^s)^k) ]
          [               ...                ]
    c^s = [ q_t^s − ρ((x̄_t^s)^k + (z_t^s)^k) ] ,
          [ r_t^s − ρ((ū_t^s)^k + (y_t^s)^k) ]
          [               ...                ]
          [ q_T^s − ρ((x̄_T^s)^k + (z_T^s)^k) ]
          [ r_T^s − ρ((ū_T^s)^k + (y_T^s)^k) ]

    d^s = ( x_start, f_0^s, . . . , f_{T−1}^s, x_final )^T,   A^s = [ A_a^s ; A_b^s ; A_c^s ],
          [ Q_0 + ρI    V_0         · · ·       0           0        ]
          [ V_0^T       R_0 + ρI    · · ·       0           0        ]
    E^s = [   ...                     ...        ...                  ]
          [   0           0         · · ·    Q_T + ρI      V_T       ]
          [   0           0         · · ·    V_T^T        R_T + ρI   ]

and

            [  I         0         0        0      · · ·    0            0           0 ]
            [ −G_0^s    −G_0^s     I        0      · · ·    0            0           0 ]
    A_a^s = [  0         0        −G_1^s   −G_1^s  · · ·    0            0           0 ]
            [   ...                                  ...                               ]
            [  0         0         0        0      · · ·  −G_{T−1}^s   −G_{T−1}^s    I ]

            [ 0  I  0  0  · · ·  0  0  0  0 ]
    A_b^s = [ 0  0  0  I  · · ·  0  0  0  0 ]
            [   ...              ...         ]
            [ 0  0  0  0  · · ·  0  0  0  I ]

    A_c^s = [ 0  0  0  · · ·  0  0  0  I  I ].
We note that E^s is a block diagonal matrix, formed by T + 1 blocks of size 2n × 2n, and is positive definite since the parameter ρ is positive.
A_a^s encodes the dynamics (asset gains), A_b^s contains the self-financing constraint, whereas A_c^s imposes the final portfolio restriction. We note that A contains identity blocks that make it full rank ((T + 1)n). We thus face a standard equality-constrained quadratic program that can be solved efficiently thanks to the special structure (sparse, block diagonal) of the matrix A.
Necessary and sufficient optimality conditions for this type of problem are provided by the Karush-Kuhn-Tucker (KKT) conditions:

    [ E   A^T ] [ w ]   [ −c ]
    [ A    0  ] [ η ] = [  d ] ,   (5.12)
where η ∈ R(T+1)n are dual variables related to the equality constraints. This is an optimization
problem with S(3n)(T +1) equations and variables. As E is positive definite and A is full rank,
the KKT matrix in (5.12) is invertible but substantial computational effort can be saved by
exploiting its structure (sparsity) and avoiding the inversion.
The splitting algorithm requires solving this quadratic problem many times (i.e. at each iteration), but the coefficient matrix remains unchanged: only the values of c change at each iteration. We choose to solve and cache (5.12) following a modern approach described in Boyd and Vandenberghe [13] and applied in Stathopoulos et al. [92], using a sparse LDL^T decomposition. The authors show that the coefficient matrix can be factored as

    [ E   A^T ]
    [ A    0  ] = P L D L^T P^T,
where P is a permutation matrix chosen based on the sparsity pattern of (5.12), resulting in a lower-triangular matrix L with unit diagonal and few non-zeros, as well as a stable factorization (see Bunch and Parlett [17]). D is block diagonal with 1 × 1 and 2 × 2 blocks. This factorization enables solving (5.12) as

    [ w ]
    [ η ] = P(L^{−T}(D^{−1}(L^{−1}(P^T [ −c ; d ])))).   (5.13)

Forward and backward substitution are used to carry out the multiplications by L^{−1} and L^{−T} respectively, thus avoiding explicit inversion. As the algorithm requires solving this optimization problem many times, with only the values of c changing, we can compute and cache P, L and D^{−1}; at each iteration we then merely solve the system using (5.13).
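The factor-once/solve-many pattern can be illustrated on a small dense instance. In this sketch a plain LU factorization stands in for the sparse permuted LDL^T of the thesis; the toy dimensions, random data and helper name are ours:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

# Toy equality-constrained QP: minimize (1/2) w^T E w + c^T w subject to A w = d.
rng = np.random.default_rng(0)
m, p = 6, 2
M = rng.standard_normal((m, m))
E = M @ M.T + m * np.eye(m)            # positive definite objective block
A = rng.standard_normal((p, m))        # full-row-rank equality constraints
KKT = np.block([[E, A.T], [A, np.zeros((p, p))]])
factors = lu_factor(KKT)               # factor once, cache across iterations

def kkt_solve(c, d):
    """Solve the KKT system for a new right-hand side (-c, d),
    reusing the cached factorization."""
    sol = lu_solve(factors, np.concatenate([-c, d]))
    return sol[:m], sol[m:]            # primal solution w, multipliers eta
```

A quick sanity check of the result: the returned w satisfies both feasibility A w = d and stationarity E w + c + A^T η = 0, which are exactly the two block rows of the KKT system.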
Regularization. In order to ensure that the factorization detailed above always exists and is stable, we regularize the system, as in Saunders [87], by writing

    [ E   A^T  ]
    [ A   −εI ],

where ε > 0 is a constant. After regularization the matrix is quasi-definite and a stable factorization can be carried out for any permutation P. Saunders [87] suggests a value of 10^{−8} for ε, which ensures stability without substantially modifying the optimization problem.
5.5 Solving the Convex Non-Quadratic Problem
In the second step of the ADMM algorithm, we solve the following non-quadratic problems S · T times

    min_{(x̄_t^s, ū_t^s)}  Ψ_t(x̄_t^s, ū_t^s) + (ρ/2) ||(x̄_t^s, ū_t^s) − (v_t^s, w_t^s)||_2^2,   s = 1, . . . , S.

These are straightforward single-period proximal steps. In our scenario-based MPC optimal control problem defined in (5.6), Ψ(x,u) = κ^T|u| encompasses the minimization of the transaction costs. This problem can be solved efficiently with the proximal operator of the absolute value, which is given by the soft-thresholding operator detailed in (5.2).
This optimization step is separable over assets and can be evaluated in parallel.
5.6 Extending the State-of-the-Art
In this section we present our contribution, extending existing techniques to include the inequality constraint in (5.5) related to portfolio weights. Whereas previous research resorts to a three-set splitting scheme in the ADMM algorithm to treat equality and inequality constraints separately, we propose a new approach, named embedded update splitting, to include such constraints in the two-set splitting scheme defined in (5.6).
Although the scenario-based MPC problem is fully separable across scenarios, we show in Section 5.4 that the quadratic convex control problem can be solved efficiently at one go for all scenarios simultaneously. We provide an additional contribution to the state-of-the-art by using a probability-weighted norm for the primal and dual residuals of the ADMM splitting problem in the termination criterion, which improves the convergence speed significantly.
5.6.1 Improving Convergence
As mentioned in Section 5.2.1, the accelerated ADMM method does not always improve the convergence performance of the algorithm. A so-called relaxation method, relying on a combination of the Jacobi and Gauss-Seidel iterative methods (explained in Appendix A.1), has been used in previous work (see Eckstein and Bertsekas [26], Eckstein [25], Eckstein and Ferris [27]). Our extensive tests in a portfolio optimization context have shown that the number of iterations can be substantially reduced by this procedure.
Relaxation Method
The methodology simply replaces (x^{k+1}, u^{k+1}) of the first update equation in (5.10) with

    (x^{k+1}, u^{k+1}) = α(x^∗, u^∗) + (1 − α)(x̄^k, ū^k),   (5.14)

where α ∈ [0, 2] is the relaxation parameter. When α > 1 this procedure is referred to as over-relaxation, whereas it is referred to as under-relaxation when α < 1. Eckstein [25] and Eckstein and Ferris [27] suggest values of α between 1.5 and 1.8 to improve convergence. We note that if we set α = 1, we recover the Gauss-Seidel method. In line with common practice, we use a value of 1.8 subsequently.
Probability-weighted Residuals
The scenario-based MPC approach and the ADMM algorithm used are fully separable across scenarios and can be computed in parallel. In previous research, Kang et al. [45] show that the computations required by the algorithm scale linearly with the number of scenarios.
In this section, although our tests confirmed the convergence of the scenario-based MPC
portfolio problem expressed as a two-stage splitting problem, we suggest an alteration of the
termination criterion defined in Section 5.3.4. In Section 5.4 we benefit from the particular structure of the first sub-problem and solve the quadratic minimization at one go. We build on this appealing feature and propose to include the probabilities of the scenarios in the termination criterion. We show that the two-set splitting problem converges faster when using probability-weighted primal and dual residuals, regardless of the size of the optimization problem considered.
Using the respective probabilities assigned to the scenarios, we can write the weighted primal and dual residuals for (5.9), denoted respectively r_w^k and s_w^k, as

    r_w^k = Σ_{s=1}^{S} p_s ( (x_s^k, u_s^k) − (x̄_s^k, ū_s^k) )

    s_w^k = Σ_{s=1}^{S} p_s · ρ ( (x̄_s^k, ū_s^k) − (x̄_s^{k−1}, ū_s^{k−1}) ).   (5.15)
r_w^k and s_w^k converge to zero under the defined algorithm. The termination criterion is satisfied when both residuals are below defined thresholds

    ||r_w^k||_2 < ε_w^primal,   ||s_w^k||_2 < ε_w^dual,

where ε_w^primal > 0 and ε_w^dual > 0 are tolerances for primal and dual feasibility, respectively. These thresholds rely on the same approach used in Section 5.3.4, except that we again use a probability-weighted approach. We define the probability-weighted first and second sets of primal variables as
    x_w^k = Σ_{s=1}^{S} p_s x_s^k,   x̄_w^k = Σ_{s=1}^{S} p_s x̄_s^k,

    u_w^k = Σ_{s=1}^{S} p_s u_s^k,   ū_w^k = Σ_{s=1}^{S} p_s ū_s^k,

where x_w, x̄_w, u_w and ū_w denote the weighted first- and second-set primal states and controls, respectively. We compute the probability-weighted dual variables z_w^k and y_w^k as

    z_w^k = Σ_{s=1}^{S} p_s z_s^k,   y_w^k = Σ_{s=1}^{S} p_s y_s^k.
Using the weighted norms in (5.15) as well as the weighted state, control and dual variables just defined, the termination thresholds read

    ε_w^primal = ε^abs √((T+1)(2n)) + ε^rel max{ ||(x_w^k, u_w^k)||_2, ||(x̄_w^k, ū_w^k)||_2 }

    ε_w^dual   = ε^abs √((T+1)(2n)) + ε^rel ||(z_w^k, y_w^k)||_2,   (5.16)

where ε^abs > 0 and ε^rel > 0 are absolute and relative tolerance levels, respectively.
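In code, the weighted residuals of (5.15) are probability-weighted averages over the scenario dimension. The sketch below uses our own conventions (one row of stacked states/controls per scenario) and is not the thesis's R implementation:

```python
import numpy as np

def weighted_residual_norms(p, x, u, xbar, ubar, xbar_prev, ubar_prev, rho):
    """Probability-weighted primal and dual residuals of (5.15) and their
    Euclidean norms. Arrays have shape (S, m); p has shape (S,) and sums to 1."""
    wts = p[:, None]
    r_w = np.sum(wts * np.hstack([x - xbar, u - ubar]), axis=0)
    s_w = rho * np.sum(wts * np.hstack([xbar - xbar_prev, ubar - ubar_prev]), axis=0)
    return np.linalg.norm(r_w), np.linalg.norm(s_w)
```

When every scenario carries the same residual, the weighted residual coincides with the common one, since the probabilities sum to one; scenarios with small probability contribute proportionally less to the stopping decision.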
We illustrate the improved convergence speed of our termination criterion by comparing the
number of iterations needed by the weighted and non-weighted scheme, for small, medium,
large and extra-large optimization problems detailed in Section 5.6.3, respectively. Figure 5.1
shows that even for small problems the convergence is improved by using the weighted
methodology; the difference increases with the size of the problem.

                                     Small   Medium    Large   Extra Large
    States & controls (n)                4       20       20         50
    Horizon (T)                          5       10       15         20
    Number of scenarios (S)              2        5        7         10
    Step size (ρ)                     0.10     0.10     0.10       0.10
    Unweighted: Solve time (sec)      1.13    31.61    95.19     764.40
    Unweighted: Number of iterations    43      431      561        929
    Unweighted: Objective value      −0.78    −1.55    −2.44      −4.86
    Weighted: Solve time (sec)        1.19    22.95    84.22     501.27
    Weighted: Number of iterations      41      330      375        549
    Weighted: Objective value        −0.78    −1.55    −2.44      −4.86

Figure 5.1 – Convergence properties: weighted vs. unweighted scheme

Previous research also
considered an adaptive step size parameter ρ. We have shunned this practice in this thesis,
as a variation of ρ would require a recalculation of the pre-computed matrix in the convex quadratic control problem of Section 5.4.
In Section 5.7, we identify avenues for future research in this direction.
5.6.2 Extended Two-Set Splitting
Stathopoulos et al. [92] show how to formulate a standard convex quadratic optimal control problem under equality and inequality constraints with a three-set decomposition scheme, in which the objective function, the equality constraints and the inequality constraints are split into three sets of variables. O'Donoghue et al. [72] propose a simple scheme for taking straightforward long-only restrictions into account.
We propose here a modification of the update step given by the soft-thresholding operator in Equation (5.10b). To derive the modified update step, we rely on a special case met in stochastic portfolio optimization: the absence of transaction costs. In this simplified case, the optimal policy determining the trade sequence, denoted by φ(x,u), can be expressed as a function of the post-trade portfolio x^∗ alone:

    φ(x,u) = φ(x^∗).

Our optimization problem in (5.5) would then reduce to a convex quadratic problem with linear dynamics and inequality constraints. This problem would only require a minimization over a function of x^∗ = x + u to find the optimum x^∗_opt, and the resulting optimal policy would be retrieved by taking u = x^∗_opt − x.
We can thus describe the optimal policy problem as follows. We start by solving

    min_{x^∗}   φ(x^∗)
    subject to  x^∗ ∈ C,

over the variable x^∗ = x + u. The optimal policy reduces to rebalancing the portfolio to match the optimal post-trade portfolio. This affine optimal policy reads

    φ_opt(u) = x^∗_opt − x.   (5.17)
Based on this principle, we show here how to embed this straightforward rebalancing scheme into the update steps in (5.10) so as to account for weight constraints. To this purpose, we modify the ADMM algorithm in (5.10) and detail the procedure:

1. We solve the constrained quadratic problem in (5.10a) and retrieve the updated states and controls x_opt and u_opt of the first primal set.

2. We solve the soft-thresholding minimization in (5.10b) and retrieve the updated controls ū_opt of the second primal set.
3. We use the projection onto a box defined in (5.3) to impose the weight restrictions and retrieve the optimal post-trade portfolio:

       lb = (1^T(x + u)) γ^lb,   ub = (1^T(x + u)) γ^ub

       x^∗_opt := Π_C(x̄ + ū),

   where γ^lb and γ^ub ∈ R^{(n·S)(T+1)} contain the lower and upper bounds, respectively, which ensure that the value held in asset i meets or exceeds the fraction (γ_t^lb)_i and does not exceed the fraction (γ_t^ub)_i of the total portfolio value. We note that the portfolio value used in the projection is retrieved from the first set of primal variables x + u, whereas the vector that we project corresponds to the second set of primal variables x̄ + ū.
4. As explained in (5.17), we could simply re-update the second primal control vector by subtracting the optimal state vector from the optimal post-trade portfolio $x^*_{\text{opt}}$. Inspired by the splitting philosophy, we suggest instead to split the rebalancing scheme equally over the second set of states and controls:
$$\omega = \frac{x^*_{\text{opt}} - \tilde{x} - \tilde{u}}{2}, \qquad \tilde{x}_{\text{opt}} = \tilde{x} + \omega, \quad \tilde{u}_{\text{opt}} = \tilde{u} + \omega, \tag{5.18}$$
where $\omega$ corresponds to half of the required portfolio rebalancing after considering bid-ask costs and portfolio weight constraints.
We name this method the embedded update splitting; it displays appealing convergence properties compared to the simple rebalancing scheme proposed in (5.17). Extensive testing has shown that the simple scheme does not always converge, even for small optimization problems (4 assets and an investment horizon of 5 periods).
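Steps 3 and 4 above (box projection followed by the embedded update splitting) can be sketched as follows. This is an illustrative NumPy sketch, not the thesis' R implementation; the toy vectors and the bound fractions (5% and 30%) are assumptions:

```python
import numpy as np

def project_and_split(x, u, x2, u2, gamma_lb, gamma_ub):
    """Box projection and embedded update splitting (illustrative sketch).

    x, u   : first primal set (states/controls from the quadratic step)
    x2, u2 : second primal set (from the soft-thresholding step)
    """
    # The portfolio value from the FIRST primal set fixes the box bounds.
    value = np.sum(x + u)
    lb, ub = value * gamma_lb, value * gamma_ub
    # Project the SECOND primal set onto the box C = {z | lb <= z <= ub}.
    x_opt = np.clip(x2 + u2, lb, ub)
    # Split the required rebalancing equally over states and controls.
    omega = (x_opt - x2 - u2) / 2.0
    return x2 + omega, u2 + omega

# toy example: 3 assets, bounds at 5% and 30% of portfolio value
x  = np.array([0.5, 0.3, 0.2]);   u  = np.zeros(3)
x2 = np.array([0.6, 0.35, 0.05]); u2 = np.zeros(3)
x_new, u_new = project_and_split(x, u, x2, u2, 0.05, 0.30)
# by construction, x_new + u_new equals the projected post-trade portfolio
```

Note how the sum of the updated state and control vectors reproduces the projected post-trade portfolio, while the adjustment $\omega$ is shared equally between them.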
We can now combine the embedded update splitting in (5.18) with the over-relaxation method
in (5.14) and formulate the Extended Two-Set ADMM Algorithm:
Algorithm 2 Extended Two-Set Alternating Direction Method of Multipliers (eADMM)

Require: Initialize $\tilde{x}^{-1} = \hat{x}_0$, $\tilde{u}^{-1} = \hat{u}_0$, $z^{-1} = z_0$, $y^{-1} = y_0$
 1: for iteration $k = 1, 2, \ldots$ do
      ▷ Constrained quadratic minimization:
 2:   $(x^*, u^*, \eta) := P(L^{-T}(D^{-1}(L^{-1}(P^T [-c;\, d]))))$
      ▷ Over-relaxation:
 3:   $(x^{k+1}, u^{k+1}) := \alpha (x^*, u^*) + (1 - \alpha)(\tilde{x}^k, \tilde{u}^k)$
      ▷ Soft-thresholding (transaction costs):
 4:   $\tilde{u}^{k+1} := \operatorname{prox}(u^{k+1} - y^k)$
 5:   $\tilde{x}^{k+1} := x^{k+1} - z^k$
      ▷ Box projection (weight constraints):
 6:   $x^* := \Pi_{\mathcal{C} = \{x^* \mid lb \le x^* \le ub\}}(\tilde{x}^{k+1} + \tilde{u}^{k+1})$
      ▷ Embedded update splitting:
 7:   $\omega := (x^* - \tilde{x}^{k+1} - \tilde{u}^{k+1})\,/\,2$
 8:   $\tilde{x}^{k+1} := \tilde{x}^{k+1} + \omega$
 9:   $\tilde{u}^{k+1} := \tilde{u}^{k+1} + \omega$
      ▷ Dual update:
10:   $(z^{k+1}, y^{k+1}) := (z^k, y^k) + (\tilde{x}^{k+1}, \tilde{u}^{k+1}) - (x^{k+1}, u^{k+1})$
      ▷ Termination criterion:
11:   if $\|r_w^k\|_2 < \varepsilon_w^{\text{primal}}$ and $\|s_w^k\|_2 < \varepsilon_w^{\text{dual}}$ then
12:     Convergence := true; return $x^{k+1}$ and $u^{k+1}$
13:   else
14:     Convergence := false
15:   end if
16: end for
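To illustrate how over-relaxation, the dual update and a residual-based termination test fit together, here is a toy two-block ADMM solving $\min_x \tfrac{1}{2}\|x - a\|^2 + \kappa \|x\|_1$. This is a deliberately simplified stand-in for the portfolio problem, not Algorithm 2 itself; the choices of $\alpha$, $\rho$ and the tolerances are illustrative:

```python
import numpy as np

def soft(v, t):
    """Soft-thresholding (proximal operator of t*||.||_1)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def admm_l1(a, kappa, rho=1.0, alpha=1.6, tol=1e-8, max_iter=500):
    """Toy two-block ADMM for min 0.5*||x - a||^2 + kappa*||x||_1,
    illustrating over-relaxation, the dual update and residual-based stopping."""
    x = z = y = np.zeros_like(a)
    for _ in range(max_iter):
        x = (a + rho * (z - y)) / (1.0 + rho)   # quadratic block
        x_hat = alpha * x + (1 - alpha) * z     # over-relaxation
        z_old = z
        z = soft(x_hat + y, kappa / rho)        # soft-thresholding block
        y = y + x_hat - z                       # (scaled) dual update
        r = np.linalg.norm(x - z)               # primal residual
        s = rho * np.linalg.norm(z - z_old)     # dual residual
        if r < tol and s < tol:                 # termination criterion
            break
    return z

a = np.array([2.0, -0.3, 0.7])
print(admm_l1(a, kappa=0.5))   # converges to soft(a, 0.5) = [1.5, 0., 0.2]
```

The known closed-form solution of this toy problem, $x^* = \operatorname{soft}(a, \kappa)$, makes it easy to verify that the iteration converges.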
5.6.3 Numerical Results
We test the performance of the Extended Two-Set Alternating Direction Method of Multipliers (eADMM) presented in Algorithm 2. We implemented the algorithm in the language R without parallelization and benchmark it against the CVX package in Matlab, relying on the standard SDPT3 solver.
We consider a multi-period portfolio optimization problem, with transaction costs and lower and upper bound constraints set to 5% and 30%, respectively, for each asset. We generated a random matrix of covariances and expected returns and added noise to generate scenarios. In Table 5.1 we present average computational results over 15 runs for portfolios of different sizes (n = 4, 20, 30, 50), various horizons (T = 5, 10, 15, 20) and varying numbers of scenarios (S = 2, 5, 7, 10). All computations were performed on a 2 GHz Intel Core i7 CPU with 8 GB RAM. We highlight that, in contrast to R, Matlab has a slight edge due to the multi-threaded execution of subroutines. We used a tolerance level of $10^{-4}$ in the termination criterion presented in (5.16).
Table 5.1 – Computational Time Results for Stochastic MPC Problems
Small Medium Large Extra Large
States & controls (n) 4 20 30 50
Horizon (T ) 5 10 15 20
Number of scenarios (S) 2 5 7 10
Step size (ρ) 0.1 0.1 0.1 0.1
CVX: Solve time (sec) 7.42 45.34 107.51 593.34
CVX: Number of iterations 29.00 42.00 52.00 66.00
CVX: Objective value -0.78 -1.55 -2.86 -4.86
eADMM: Solve time (sec) 1.19 22.95 84.22 501.27
eADMM: Number of iterations 41.00 330.00 414.00 549.00
eADMM: Objective value -0.78 -1.55 -2.86 -4.86
This table compares the performance of the proposed Extended ADMM algorithm to the SDPT3 solver used in CVX (Matlab) for optimization problems of different sizes.
Comparing CVX to the Extended ADMM algorithm, the computational time is similar for small-size problems but diverges rapidly as the dimension of the problem increases. For large-scale problems we faced memory issues with CVX and were not able to solve the problem. The computational performance of the algorithm can be further improved through a C implementation and parallelization. We provide suggestions for future research in Chapter 7.
5.7 Conclusion
In this chapter we rely on the Alternating Direction Method of Multipliers (ADMM) for solving the scenario-based MPC problem presented in the previous chapter. We present techniques available to accelerate the algorithm and extend the state-of-the-art by providing an adapted two-set splitting scheme, which makes it possible to consider inequality constraints, as well as a probability-based termination criterion that improves convergence speed. We detail the extended fast optimal control algorithm developed and highlight the soundness of our approach using small-scale to large-scale optimization examples. This new algorithm will be applied in the next chapter to a real-world application related to large-scale portfolio optimization.
6 Application
In this chapter, we combine the different concepts derived in this thesis and apply them to a
real-world portfolio optimization problem.
We first suggest an approach to generate scenarios (views) about expected returns and co-
variances and subsequently illustrate how these scenarios are handled by the scenario-based
MPC framework detailed in Chapter 4. The ADMM methodology from Chapter 5 is used to
recompute the optimal multi-period asset allocation at each rebalancing date.
We explain in Section 6.3.2 the methodology used to generate the scenarios, leaning on the EUR/CHF exchange rate, which serves as an indicator and drives the dynamic asset allocation strategy. We suggest an innovative method for steering the risk aversion in the optimization dynamically, using the probabilities assigned to the scenarios.
We back-test the strategy in Section 6.4 and compare its performance relative to a given strategic allocation (benchmark) as well as naive diversification strategies. We show that our dynamic portfolio strategy outperforms the benchmark on a risk-adjusted basis and provides the desired portfolio stability without deviating significantly from the strategic asset allocation. We conclude this chapter in Section 6.5.
6.1 Background
We consider an equity portfolio manager who identifies various scenarios about future returns and risks of the assets constituting the Dow Jones Index. He is confronted with the pressure of delivering short-term performance to his clients, while ensuring long-term portfolio stability and not deviating too much from an exogenously assigned strategic allocation. The strategic allocation is defined by the risk-budgeting strategy proposed in (2.5), which aims at diversifying away the idiosyncratic risk left unexplained by the statistical risk drivers identified.
Although cash flows could easily be handled, we do not consider them in this application. However, we consider both transaction and impact costs and impose restrictions on portfolio weights.
6.2 Data
The dataset used in this chapter is retrieved from Yahoo Finance and is composed of twenty-nine stocks of the Dow Jones index, from May 1999 through June 2016. We also consider the exchange rate between the Euro and the Swiss Franc between 1998 and June 2016, which acts as an indicator for extracting market regimes with a Hidden Markov Model (see Section 6.3.2).
Table 6.1 – Dow Jones Stocks – Statistics
Stock Average Return Volatility Sharpe Ratio VaR (95%) ES (95%) MDD
AAPL 0.27 0.44 0.63 -0.04 -0.06 0.82
AXP 0.04 0.37 0.12 -0.04 -0.05 0.84
BA 0.09 0.31 0.29 -0.03 -0.04 0.71
CAT 0.08 0.33 0.24 -0.03 -0.05 0.73
CSCO 0.01 0.41 0.03 -0.04 -0.06 0.89
CVX 0.08 0.26 0.30 -0.02 -0.04 0.45
DD 0.03 0.29 0.11 -0.03 -0.04 0.70
DIS 0.09 0.31 0.27 -0.03 -0.04 0.68
GE 0.03 0.31 0.08 -0.03 -0.05 0.86
GS 0.06 0.39 0.14 -0.03 -0.05 0.79
HD 0.09 0.32 0.29 -0.03 -0.05 0.70
IBM 0.04 0.27 0.14 -0.03 -0.04 0.59
INTC 0.03 0.39 0.07 -0.04 -0.06 0.82
JNJ 0.08 0.20 0.42 -0.02 -0.03 0.36
JPM 0.04 0.41 0.09 -0.04 -0.06 0.74
KO 0.04 0.22 0.19 -0.02 -0.03 0.44
MCD 0.09 0.24 0.38 -0.02 -0.03 0.74
MMM 0.11 0.24 0.45 -0.02 -0.03 0.54
MRK 0.03 0.28 0.10 -0.03 -0.04 0.69
MSFT 0.04 0.32 0.12 -0.03 -0.05 0.69
NKE 0.13 0.32 0.43 -0.03 -0.04 0.59
PFE 0.03 0.26 0.11 -0.02 -0.04 0.69
PG 0.06 0.22 0.29 -0.02 -0.03 0.54
TRV 0.11 0.31 0.37 -0.03 -0.04 0.55
UNH 0.19 0.34 0.57 -0.03 -0.05 0.74
UTX 0.08 0.28 0.30 -0.03 -0.04 0.53
VZ 0.05 0.26 0.19 -0.02 -0.04 0.57
WMT 0.05 0.25 0.19 -0.02 -0.04 0.38
XOM 0.07 0.25 0.29 -0.02 -0.04 0.37
This table shows the annualized return and volatility of each stock over the period 05/1999-06/2016. Value-at-Risk and Expected Shortfall are calculated with a 95% confidence level. The Maximum Drawdown (MDD) over the period is displayed in the last column.
We report summary statistics – annualized return and volatility as well as correlations, Value-at-Risk, Expected Shortfall and Maximum Drawdown – over the whole sample period. Table 6.1 highlights the significant variations across stocks in terms of return and risk. Cisco displays the lowest return (1%) and one of the highest risks (41% volatility), whereas Apple has the highest return (27%). The largest drawdown is recorded by Cisco (-89%), while Johnson & Johnson displays the smallest one (-36%).
The correlation matrix across assets in Figure 6.1 emphasizes the potential for diversification provided by an optimal combination of these 29 stocks. Correlation coefficients range from 0.15 between Apple and Procter & Gamble to 0.83 between Chevron and Exxon Mobil.
Figure 6.1 – Asset Classes – Correlations.
This figure displays asset correlations over the period 05/1999-06/2016. Highly correlated stocks are highlighted by blue dots with shadings related to the degree of correlation.
6.3 Scenario Generator
Several methods can be used to generate scenarios about future returns and risks. Some rely on bootstrapping historical data; others resort to simulations or sampling from a given distribution.
Inspired by Hinz and Yee [42], we apply the Hidden Markov Model (HMM) methodology to
the FX-rate between EUR and CHF to identify market regimes and define scenarios for our
stocks in each of these regimes. We let the parameters of the asset price dynamics switch
between different market regimes, allowing more flexibility when sudden regime shifts occur.
We introduce the theory of Hidden Markov Models and explain the methodology used to
generate scenarios for the return dynamics of the 29 stocks.
6.3.1 Hidden Markov Model
Hidden Markov Models (HMMs) are powerful methods for modelling the time-varying dynamics of a statistical process and only require a set of (soft) assumptions. A stochastic process is modelled as a set of states, each of which emits a set of signals. Movements between different states characterize underlying changes in the stochastic process. HMMs have been extensively used in finance (see Hardy [40], Haussmann and Sass [41], Erlwein [30] and Lee [50]).
Definition 13. A Markov chain is a stochastic process $(X_t)$ with a countable set of states that respects the Markov property, defined as
$$P(X_{t+1} = j \mid X_0 = i_0, X_1 = i_1, \ldots, X_t = i_t) = P(X_{t+1} = j \mid X_t = i_t),$$
where $i_t$ denotes the prevailing state at time $t$.
The process evolves over time and transits from one state to another, as governed by the transition matrix $\Pi$. This transition matrix, also called the transition kernel, gives the probability $\Pi_{ij}$ of migrating to state $j$ given the current state $i$. We assume that all probabilities are stationary.
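As a minimal illustration of a transition matrix driving state changes, a two-state chain can be simulated as follows; the probabilities and the use of NumPy are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative stationary transition matrix: rows sum to one,
# entry Pi[i, j] = probability of moving from state i to state j.
Pi = np.array([[0.95, 0.05],
               [0.10, 0.90]])

def simulate_chain(Pi, T, state=0):
    """Simulate T steps of a Markov chain with transition matrix Pi."""
    path = [state]
    for _ in range(T):
        state = rng.choice(len(Pi), p=Pi[state])  # draw next state
        path.append(state)
    return path

path = simulate_chain(Pi, T=100)
```

With these (assumed) probabilities the chain is persistent: it tends to remain in its current regime and switches only occasionally, which is the behaviour exploited for regime identification below.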
The idea of Hidden Markov modelling is to realize a time series $(y_t)_{t=0}^T$ in such a way that it behaves as if it were driven by a background device which may operate in different regimes. Thereby, one supposes that the operating regime is not directly observed and evolves like a Markov chain $(x_t)_{t=0}^T$ on a finite space.
The major advantage is that it is possible to trace the evolution of the hidden states indirectly, based on the observation of $(y_t)_{t=0}^T$, using efficient recursive schemes for the calculation of the so-called hidden state estimate
$$\hat{x}_t = E(x_t \mid y_j,\ j \le t), \qquad t = 0, \ldots, T.$$
Thereby, at each time $t = 0, \ldots, T-1$, the probability vector $\hat{x}_t$ describes the distribution of $x_t$ conditioned on the past observations $(y_j)_{j=0}^t$. More importantly, such an approach reproduces a Markovian dynamics in the following sense: although $(y_t)_{t=0}^T$ is not Markovian in general, it turns out (see Yushkevich [97]) that the observations $(y_t)_{t=0}^T$ equipped with the latent variables $(x_t)_{t=0}^T$ form a two-component process such that the evolution $(x_t, y_t)_{t=0}^T$ is Markovian.
From this perspective, modelling a time series $(y_t)_{t=1}^T$ in this way yields a technique to address control problems in certain non-Markovian situations. Let us introduce the required ingredients. Assume that an unobservable global regime evolves like a Markov chain $(x_t)_{t=0}^T$ on the set $X = \{e_1, \ldots, e_d\}$ of unit vectors in $\mathbb{R}^d$, while the information available to the controller is gained from the observation of the process $(y_t)_{t=0}^T$, which takes values in a measure space $Y$.
We suppose that the joint evolution $((x_t, y_t))_{t=0}^T$ follows a Markov process whose transition kernels $Q_t$ for $t = 0, \ldots, T-1$ act on functions $\varphi : X \times Y \to \mathbb{R}$ as
$$\int \varphi(x', y')\, Q_t(d(x', y') \mid (x, y)) = \sum_{x' \in X} \int_Y \varphi(x', y')\, \Gamma_{x, x'}\, \mu_x(dy'). \tag{6.1}$$
Thereby, the stochastic matrix $\Gamma = (\Gamma_{x,x'})_{x,x' \in X}$ describes the transition from $x_t$ to $x_{t+1}$, whereas $\mu_x$ denotes the distribution of the observation $y_{t+1}$ conditioned on $x_t = x \in X$. Assuming that for each $x \in X$ the distribution $\mu_x$ is absolutely continuous with respect to a reference measure $\mu$ on $Y$, we introduce the densities
$$\nu_x(y) = \frac{d\mu_x}{d\mu}(y), \qquad y \in Y, \ x \in X,$$
to write the distributions as
$$\mu_x(dy) = \nu_x(y)\, \mu(dy), \qquad x \in X.$$
It turns out that the filtered process $(\hat{x}_t, y_t)_{t=0}^T$ follows a Markov process, driven by transition kernels $\hat{Q}_t$ which act for $t = 0, \ldots, T-1$ on functions $\varphi$ as
$$\int \varphi(x', y')\, \hat{Q}_t(d(x', y') \mid (x, y)) = \int_Y \varphi\!\left(\frac{\Gamma^{\top} V(y')\, x}{\|V(y')\, x\|},\ y'\right) \|V(y')\, x\|\, \mu(dy').$$
In this formula, $V(y)$ stands for the diagonal matrix whose diagonal elements are given by $(\nu_x(y))_{x \in X}$ for $y \in Y$, and the norm is defined as $\|z\| = \sum_{i=1}^{d} |z_i|$ for each $z \in \mathbb{R}^d$.
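One step of the recursive hidden-state update above can be sketched as follows. The Gaussian choice for the observation densities $\nu_x$, the transition matrix and the regime parameters are illustrative assumptions:

```python
import numpy as np

def filter_step(x_hat, y_new, Gamma, means, sds):
    """One recursive update of the hidden-state estimate:
    x_hat <- Gamma^T V(y) x_hat / ||V(y) x_hat||_1,
    where V(y) = diag(nu_x(y)) and nu_x are Gaussian densities (illustrative)."""
    # observation densities nu_x(y_new) for each regime x
    nu = np.exp(-0.5 * ((y_new - means) / sds) ** 2) / (sds * np.sqrt(2 * np.pi))
    v = nu * x_hat                   # V(y) x_hat
    return Gamma.T @ v / np.sum(v)   # normalise in the l1 norm

Gamma = np.array([[0.95, 0.05], [0.10, 0.90]])   # illustrative transitions
x_hat = np.array([0.5, 0.5])                      # maximal initial uncertainty
x_hat = filter_step(x_hat, y_new=-0.8, Gamma=Gamma,
                    means=np.array([0.1, -0.5]), sds=np.array([0.2, 0.6]))
# x_hat is again a probability vector over the two regimes
```

Since the rows of $\Gamma$ sum to one, the update always returns a valid probability vector, which is the property exploited when the filter is iterated over the observation sequence.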
6.3.2 Methodology
We consider the scenario-based MPC portfolio problem described in detail in Chapter 5, which we restate here for convenience:
$$
\begin{aligned}
\min_{(x,u)} \quad & \sum_{s=1}^{S} \sum_{t=0}^{T} p_t^s \Big( -(\mu_t^s)^T (x_t^s + u_t^s - b_t^s) + \lambda_t^s\, (x_t^s + u_t^s - b_t^s)^T \Sigma_t^s (x_t^s + u_t^s - b_t^s) \\
& \qquad\qquad\quad + (u_t^s)^T \operatorname{diag}(c_t)\, u_t^s + \kappa_t^T |u_t^s| \Big) \\
\text{subject to} \quad & x_{t+1}^s = G_t^s (x_t^s + u_t^s) + f_t^s \\
& b_{t+1}^s = G_t^s\, b_t^s \\
& \mathbf{1}^T u_t^s = 0 \\
& x_0^s = x_{\text{start}} \\
& x_T^s + u_T^s = 0 \\
& (\mathbf{1}^T x_t^{s*})\, \gamma_t^{lb} \le x_t^{s*} \le (\mathbf{1}^T x_t^{s*})\, \gamma_t^{ub}, \qquad s = 1, \ldots, S, \ t = 0, \ldots, T-1,
\end{aligned}
$$
with state and control variables $x_t^s \in \mathbb{R}^{n \cdot S}$, $u_t^s \in \mathbb{R}^{n \cdot S}$, $t = 0, \ldots, T$ and $s = 1, \ldots, S$. The variable $f_t^s \in \mathbb{R}^{n \cdot S}$ corresponds to cash flows at time $t$, and $T$ is the investment horizon.
Given a set of scenarios, the objective of the portfolio manager is to obtain a risk-adjusted
outperformance over a given benchmark portfolio, in the presence of transaction costs and
constraints.
To generate plausible scenarios over the horizon T, we start by fitting an HMM with Gaussian innovations to the EUR/CHF exchange rate over the period 1998-2000 and identify two market regimes. The exchange rate is used as an indicator and drives the dynamic asset allocation strategy over a large set of assets.
In each of the two regimes identified, we compute the (conditional) average asset returns as well as the covariance matrix, and consider that uncertainty increases with the length of the time horizon T. These two sets of return-risk pairs at each time t of the investment horizon deliver the scenarios used by the fast optimal control methodology detailed in Chapter 5. Note that, for illustration purposes, we do not re-fit the parameters of the model here, although preliminary results show that doing so could add value to the dynamic strategy.
The current regime probabilities and the transition kernel in (6.1) allow us not only to consider the current uncertainty pertaining to the two scenarios, but also the dynamics of the regime probabilities in the future.1
1 The scenario-based optimizer presented in Chapter 5 handles these time-varying probabilities.
Additionally, we suggest a novel approach to steer the (time-varying) risk aversion of the optimization procedure presented in Chapter 5, Section 5.3. The underlying principle relies on the observation that if the regime probabilities at time t are all equal, our uncertainty about the current regime is at its maximum. In other words, without strong confidence in his market views, the portfolio manager should not deviate too much from his strategic allocation (benchmark). Conversely, if a regime is currently assigned a probability of 1, our uncertainty is at its minimum and the portfolio manager can take calculated risks. We use these two boundaries to drive the risk aversion parameter, with the help of the spectral entropy, given by
$$H(p_1, p_2, \ldots, p_S) = -\sum_{s=1}^{S} p_s \log p_s,$$
where S denotes the number of regimes (scenarios). The quantity $\exp H(p_1, p_2, \ldots, p_S)$ ranges from 1 to S (S = 2 in our two-regime example). We can scale the spectral entropy and express it as a value, denoted by $\varpi$, between 0 and 1:
$$\varpi(p_1, p_2, \ldots, p_S) = \frac{\exp\left(-\sum_{s=1}^{S} p_s \log p_s\right)}{S}. \tag{6.2}$$
In the numerical example presented in Section 6.4, the risk aversion $\lambda$ ranges from 0.5 to 20: a low value of $\lambda$ lets the optimizer deviate from the strategic allocation (benchmark) and exploit the expected returns in each scenario, whereas a high value leads the optimizer to remain close to the benchmark. The risk aversion parameter $\lambda$ can thus be expressed as a function of the scaled spectral entropy in (6.2):
$$\lambda_t = \phi\big(\varpi(p_1, \ldots, p_s, \ldots, p_S)\big), \qquad t = 1, \ldots, T, \ s = 1, \ldots, S.$$
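The scaled entropy (6.2) and its affine mapping to the risk aversion bounds can be sketched as follows; the function and variable names are illustrative:

```python
import numpy as np

def scaled_entropy(p):
    """Scaled spectral entropy (6.2): exp(H(p)) / S; equals 1 for uniform p."""
    p = np.asarray(p, dtype=float)
    S = len(p)
    q = p[p > 0]                  # convention 0 * log 0 = 0
    H = -np.sum(q * np.log(q))    # spectral entropy
    return np.exp(H) / S

def risk_aversion(p, lam_min=0.5, lam_max=20.0):
    """Affine map of the scaled entropy to [lam_min, lam_max]:
    uniform probabilities (maximal uncertainty) -> lam_max (stay near benchmark),
    concentrated probabilities (high confidence) -> values near lam_min."""
    return scaled_entropy(p) * (lam_max - lam_min) + lam_min

print(risk_aversion([0.5, 0.5]))   # uniform regime probabilities -> 20.0
```

Note that a degenerate distribution such as $p = (1, 0)$ gives a scaled entropy of $1/S = 0.5$ rather than exactly 0, so the lower bound $\lambda^-$ is approached but not attained with S = 2.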
We summarize the steps required to retrieve the scenarios and the risk aversion parameter at time t (rebalancing date) as follows:
1. Fit the two-regime HMM to the FX data (fitted only once, at time t = 0) and retrieve the current regime probabilities as well as the transition matrix.
2. Define the conditional expected asset returns for each period of the investment horizon as the historical conditional returns in each regime, adding a Gaussian noise which scales with the distance from time t, i.e. we consider increasing uncertainty over the horizon:
$$E(R_{t+i} \mid s) = \frac{1}{M} \sum_{R_k \in s} R_k + \varepsilon_{t+i}, \qquad i = 1, \ldots, T, \ s = 1, \ldots, S,$$
where $R_{t+i} \in \mathbb{R}^n$ is the vector of expected returns for the n assets for the period t + i in scenario s, $\varepsilon_{t+i} \sim \mathcal{N}(0, \sqrt{t+i}) \in \mathbb{R}^n$ corresponds to the Gaussian noise growing with the forecasting horizon, and M is the number of historical observations counted in scenario s.
3. Define the conditional expected asset covariances for each period of the investment horizon as the historical covariances in each regime, again adding a Gaussian noise which scales with the distance from time t:
$$\Sigma_{t+i \mid s} = \widehat{\operatorname{Cov}}\big((R_k)_{R_k \in s}\big) + \varepsilon_{t+i}, \qquad i = 1, \ldots, T, \ s = 1, \ldots, S,$$
where $\Sigma_{t+i} \in \mathbb{R}^{n \times n}$ is the estimated covariance matrix for the period t + i in scenario s, computed from the historical observations in regime s, and $\varepsilon_{t+i} \sim \mathcal{N}(0, \sqrt{t+i}) \in \mathbb{R}^{n \times n}$ corresponds to the Gaussian noise growing with the forecasting horizon.
4. Compute the risk aversion parameter $\lambda_t$, kept fixed over the investment horizon, using the scaled entropy measure in (6.2), the current probabilities $p_s$ and the boundaries for the risk aversion:
$$\lambda_t = \varpi(p_1, \ldots, p_s, \ldots, p_S) \cdot (\lambda^+ - \lambda^-) + \lambda^-, \qquad t = 1, \ldots, T,$$
where $\lambda^-, \lambda^+ \in \mathbb{R}$ correspond to the minimum and maximum risk aversion, respectively.
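Steps 2 and 3 above can be sketched as follows. The regime sample, the noise magnitudes and the symmetrisation of the covariance perturbation are illustrative assumptions (the raw $\sqrt{t+i}$ scale is damped here so that the toy daily returns are not swamped by noise):

```python
import numpy as np

rng = np.random.default_rng(1)

def scenario_moments(returns_in_regime, t, horizon):
    """Conditional mean/covariance per period, with noise growing as sqrt(t+i)."""
    mu = returns_in_regime.mean(axis=0)               # historical conditional mean
    Sigma = np.cov(returns_in_regime, rowvar=False)   # historical conditional cov
    n = len(mu)
    mus, Sigmas = [], []
    for i in range(1, horizon + 1):
        scale = np.sqrt(t + i)                        # uncertainty grows with i
        mus.append(mu + scale * 1e-3 * rng.standard_normal(n))
        noise = scale * 1e-4 * rng.standard_normal((n, n))
        Sigmas.append(Sigma + (noise + noise.T) / 2)  # keep the matrix symmetric
    return mus, Sigmas

# toy regime sample: 500 daily observations of 4 assets
R = rng.normal(0.0004, 0.01, size=(500, 4))
mus, Sigmas = scenario_moments(R, t=0, horizon=21)   # one month of 21 days
```

Running this per regime yields, for each period of the horizon, one return-covariance pair per scenario, which is exactly the input expected by the scenario-based optimizer.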
6.4 Large-Scale Dynamic Portfolio Strategy
In this section we combine all the results presented in this thesis. We calibrate the HMM on the FX rate over the period 1998-2000 and back-test our dynamic investment strategy over the period 2001/01-2016/06.
6.4.1 Benchmark
We consider a strategic allocation given by the risk-budgeting strategy proposed in Chapter 2, Section 2.5, applied to the 29 assets constituting the Dow Jones 30 Index. This strategy serves as the benchmark in the multi-period optimization and relies on the identification of statistical risk drivers, aiming at diversifying away the specific risk left unexplained by the factors. For convenience, we restate the strategy here:
Benchmark: Factor Risk Budgeting – Specific
We set risk budgets that are inversely proportional to the specific risk of the assets ($\Upsilon^2$) that remains unexplained by the L statistical factors, as detailed in Chapter 2, Section 2.5. Applying the risk decomposition in Equation (2.8) and using the corresponding torsion matrix $t_{SMT}$ in (2.5), the asset weights are given by
$$b_i = \frac{1}{\upsilon_i^2}, \qquad w_i = t_{SMT}\, \frac{b_i\, \sigma_i^{-1}}{\sum_{j=1}^{K} b_j\, \sigma_j^{-1}}, \qquad i = 1, 2, \ldots, K,$$
where $\upsilon_i^2$ denotes the specific risk of asset i.
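A sketch of this weight computation, assuming the specific risks $\upsilon_i^2$, the asset volatilities $\sigma_i$ and a minimum-torsion matrix $t_{SMT}$ are already available (here an identity torsion matrix stands in for the real one, purely for illustration):

```python
import numpy as np

def specific_risk_budget_weights(t_smt, specific_risk, sigma):
    """Risk budgets inversely proportional to specific risk, mapped through
    the minimum-torsion matrix t_SMT (sketch of the benchmark strategy)."""
    b = 1.0 / specific_risk          # budgets b_i = 1 / upsilon_i^2
    raw = b / sigma                  # b_i * sigma_i^{-1}
    return t_smt @ (raw / raw.sum())  # normalise, then map via t_SMT

# toy inputs: 3 assets, identity torsion matrix for illustration
t_smt = np.eye(3)
specific_risk = np.array([0.02, 0.05, 0.01])   # upsilon_i^2
sigma = np.array([0.2, 0.3, 0.15])
w = specific_risk_budget_weights(t_smt, specific_risk, sigma)
```

With an identity torsion matrix the weights simply sum to one; a genuine $t_{SMT}$ redistributes them so that highly correlated assets are under-represented.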
6.4.2 Scenarios and Rebalancing
We allow the portfolio to rebalance on a monthly basis and thus consider an investment horizon of one month in the multi-period ADMM optimization procedure. We use two scenarios over the following month (i.e. 21 days), given by the average returns and covariances observed over the last two years (500 days) in each of the regimes identified by the HMM fitted to the EUR/CHF exchange rate over the period 1998-2000. We follow the methodology detailed in Section 6.3.2 to compute the conditional expected returns and covariances at each time step of the investment horizon.
We subsequently use these scenarios and the benchmark strategy to derive an optimal tactical portfolio, helped by the fast optimal control methodology presented in Chapter 5.
6.4.3 Risk Aversion
Time-varying probabilities assigned to the scenarios, retrieved with the transition kernel in (6.1), steer the risk aversion parameter dynamically between 0.5 and 20 at each rebalancing date, based on the methodology explained in Section 6.3.2.
6.4.4 Costs and Restrictions
We consider transaction costs of 10 bp (bid-ask) as well as quadratic costs of 1 bp (impact). No
more than 30% can be invested in a single stock and short positions are restricted to 20% of
the portfolio value.
6.4.5 Results
We compare the after-cost performance of the benchmark strategy with our dynamic investment strategies relying on market scenarios. Figure 6.2 shows that our dynamic portfolio approach clearly outperforms its benchmark on a risk-adjusted basis and reduces the drawdowns borne over the period.
Figure 6.2 – Dynamic Portfolio – Performance
[Figure: cumulative return, daily return and drawdown panels for the dynamic portfolio, the benchmark and the two naive strategies (assets, factors), 2001-2015.]
This figure compares the performance after costs of the dynamic portfolio relative to its benchmark and naive diversification approaches applied to the assets and market factors respectively.
Table 6.2 – Dynamic Portfolio vs. Benchmark – Key figures

                            Average Return  Volatility  Sharpe Ratio  VaR (95%)  ES (95%)  Max. Drawdown
Dynamic portfolio                    10.93        8.54          1.28      -3.26     -4.79          12.87
Benchmark                            13.22       13.81          0.96      -5.29     -9.73          20.74
Equally-weighted: assets             14.44       16.28          0.89      -5.58     -7.39          23.78
Equally-weighted: factors            15.38       15.11          1.02      -5.34     -8.42          19.88

This table presents the annualized return and volatility as well as tail risk metrics of the dynamic portfolio, the strategic allocation (benchmark) and two equally-weighted (naive) strategies applied to the assets and market factors respectively.
Table 6.2 summarizes the results of the back-tested dynamic and benchmark strategies as well as two naive diversification (equally-weighted) approaches detailed in Chapter 2, Section 2.6.3. The naive diversification along the factors provides a considerable improvement over the standard equally-weighted strategy, with Sharpe ratios of 1.02 and 0.89 respectively. However, both approaches bore substantial drawdowns during the two financial crises (2001, 2008). The strategic allocation (benchmark), relying on a specific diversification along the identified factors, displays a better risk-adjusted performance than the standard naive approach but does not outperform the naive diversification along the factors.
Remarkably, our dynamic portfolio strategy, relying on market scenarios with time-varying probabilities and risk aversion, displays the best risk-adjusted performance. It not only substantially improves the Sharpe ratio relative to the benchmark (from 0.96 to 1.28) and the other strategies, but also considerably reduces tail risks. In particular, the maximum drawdown shrinks from 20.74% for the benchmark to 12.87% for the dynamic portfolio. The tracking error, measured as the standard deviation of the return difference between the portfolio and the benchmark, amounts to 9.41%.2 This metric measures the extent of the deviation from the strategic allocation and reveals that the confidence placed in the scenarios did not reach extreme values, i.e. the risk aversion stayed within the defined boundaries.
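Tracking error as used here can be computed, for example, as follows; the assumption of daily data annualized with $\sqrt{252}$ and the toy return series are illustrative:

```python
import numpy as np

def tracking_error(port_returns, bench_returns, periods_per_year=252):
    """Annualized standard deviation of the active (portfolio - benchmark) returns."""
    active = np.asarray(port_returns) - np.asarray(bench_returns)
    return active.std(ddof=1) * np.sqrt(periods_per_year)

# toy daily return series for portfolio and benchmark
rng = np.random.default_rng(2)
rp = rng.normal(0.0005, 0.010, 1000)
rb = rng.normal(0.0004, 0.009, 1000)
te = tracking_error(rp, rb)   # annualized tracking error, as a fraction
```

A tracking error of roughly 9-10% per annum, as reported above, indicates a meaningful but bounded deviation from the strategic allocation.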
6.5 Conclusion
In this chapter, we applied our findings from the previous chapters to a real-world portfolio allocation problem. We suggested a method to generate market scenarios over the investment horizon: relying on a simple two-state Hidden Markov Model fitted to the EUR/CHF exchange rate, we used the identified market regimes to compute conditional expected asset returns and covariances. We handled these scenarios with the scenario-based MPC methodology detailed in this thesis and solved the multi-period allocation problem with the new fast optimal control algorithm developed in Chapter 5.
We proposed an innovative approach to dynamically steer the risk aversion over time, relying on the relative confidence placed in the scenarios considered. Using the spectral entropy measure, we derived a methodology to increase or reduce the risk aversion within pre-defined bounds.
We back-tested the dynamic allocation approach and compared the results with a strategic allocation (benchmark) as well as two naive diversification strategies, which assign an equal weight to the assets or factors respectively. Our results show that the dynamic approach, even after transaction costs, delivers a risk-adjusted outperformance compared to the other strategies and also significantly reduces tail risks. The appealing features displayed by the proposed dynamic framework, even in a simplified setting with only two regimes, pave the way for further research directions.
2 Tracking error is often used in practice for active mandates, which aim at outperforming an assigned benchmark.
7 Conclusion and Outlook
7.1 Conclusion
This thesis was motivated by the need to devise dynamic multi-period portfolio strategies that can be solved efficiently and outperform common standard strategies used in practice. We improved the state-of-the-art by proposing a new formulation of the two-set splitting ADMM algorithm used in previous work and considered a scenario-based approach to take uncertainty into account in the multi-period framework.
We presented in Chapter 1 the naive diversification method and the benchmark approach reported in previous research. We suggested a procedure, the so-called minimum-torsion approach, for retrieving uncorrelated factors. We devised an investment strategy relying on naive diversification, where we allocated wealth by applying a correction to the original equal-weight distribution. The correction was carried out with the help of the minimum-torsion matrix and results in a portfolio in which highly correlated assets are under-represented. We illustrated our findings with a case study on 375 stocks in the S&P 500 index.
Building on the modified naive approach and relying on these statistical factors, we proposed in Chapter 2 a novel dynamic approach, focusing on statistical analysis of the data and on risk-budgeting techniques. We suggested a shrunk version of the minimum-torsion matrix, using the effective rank approach to extract the number of risk factors driving asset returns. This purely statistical approach enabled a risk decomposition of a given portfolio into a systematic and a specific component, as well as an assessment of its level of diversification. We devised various dynamic investment strategies, in particular an innovative implementation of a risk-budgeting technique, where the budget of a given asset is inversely proportional to its idiosyncratic risk left unexplained by the statistical factors. We illustrated our approach through an empirical application.
We reviewed the single- and multi-period optimization frameworks in Chapter 3 and presented the solutions proposed in the literature. We gave a short overview of Dynamic Programming (DP) techniques and the issues that arise when considering a large-scale portfolio.
We presented in Chapter 4 Model Predictive Control (MPC), a widespread technique for solving linear convex optimization problems. We derived a scenario-based formulation of the portfolio optimization problem over a given investment horizon, in the presence of portfolio restrictions and transaction costs. We showed how the resulting problem can be split into a quadratic and a non-quadratic component.
In Chapter 5, we used the Alternating Direction Method of Multipliers (ADMM) to solve the scenario-based MPC portfolio optimization problem efficiently. We developed a new algorithm, which makes it possible to include inequality constraints in the two-set splitting scheme of the ADMM. We also proposed a modified stopping criterion for the ADMM, which relies on the probabilities of the scenarios considered and improves the convergence properties of the algorithm.
We presented a real-world large-scale multi-period portfolio application in Chapter 6, where we combined the different concepts derived in this thesis. We suggested an approach to generate scenarios relying on a Hidden Markov Model (HMM) and solved the constrained multi-period MPC problem with the new ADMM algorithm developed in this thesis. We suggested a novel concept to steer the risk aversion over time, relying on the probabilities assigned to the different scenarios. We finally back-tested the strategy with a large-scale portfolio and showed that the results provided the desired outperformance without deviating significantly from the strategic asset allocation.
7.2 Further Research
7.2.1 Risk-Budgeting
In Chapters 1 and 2, we suggested various investment strategies which display appealing risk-adjusted properties. This area should be built upon and improved, in particular in connection with risk-budgeting techniques. The factor-based risk-budgeting methods developed in this thesis rely on the decomposition of portfolio risk measured by the standard deviation. Given the non-normality of asset returns, future research in this field should strive to design strategies based on the diversification of tail or downside risks. This poses serious challenges, as the decorrelation achieved by PCA techniques or by the minimum-torsion approach used in this thesis does not lead to strict independence of the retrieved factors (tail risk), so one may need to resort to an Independent Component Analysis approach.
7.2.2 Multi-Period Optimization via ADMM
The techniques developed in this thesis to include inequality constraints into the two-set splitting scheme, and the improvement of the convergence properties when dealing with scenarios, can be extended further. As suggested by our industry partners, a promising direction is so-called GPU computing, which performs computations at a low level on the graphics processor and runs tasks in parallel on a machine with multiple cores.
Enthusiasm for improving the convergence properties of the ADMM algorithm has reignited
among researchers in recent years, and there is potential for accelerating the algorithm
further. One suggestion is to express the step-size parameter of the algorithm as a function
of the residuals and to use an adaptive step size at each iteration. However, as already
mentioned in Chapter 5, Section 5.6.1, this would require a recalculation of the pre-computed
matrix in the convex-quadratic control problem, although GPU computations could potentially
alleviate this issue.
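One concrete candidate for such a rule is the residual-balancing heuristic discussed by Boyd et al. [14]. The sketch below (a minimal Python illustration; the threshold mu and scaling factor tau are assumed tuning constants, not values prescribed in this thesis) enlarges the step size when the primal residual dominates and shrinks it when the dual residual does:

```python
def update_step_size(rho, r_norm, s_norm, mu=10.0, tau=2.0):
    """Residual-balancing update for the ADMM step size rho.

    r_norm: norm of the primal residual at the current iteration.
    s_norm: norm of the dual residual at the current iteration.
    """
    if r_norm > mu * s_norm:
        return rho * tau   # primal residual dominates: push harder
    if s_norm > mu * r_norm:
        return rho / tau   # dual residual dominates: relax
    return rho             # residuals balanced: keep rho unchanged
```

Every change of rho would trigger the matrix recomputation mentioned above, which is why such rules are often applied only every few iterations.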
Another area of research would be to use a risk penalty in the objective function that relies on
tail-risk measures. Value-at-Risk (VaR) and Conditional Value-at-Risk (CVaR) are nowadays
essential ingredients of proper risk management. Rockafellar and Uryasev [81] present a
single-period method for minimizing CVaR. The authors suggest a scenario decomposition,
in which case the optimization problem reduces to linear programming. Future work should
build on these results and implement the CVaR penalty in a multi-period optimization.
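As a minimal illustration of the Rockafellar-Uryasev idea (a hypothetical sketch, not code from this thesis), the function below evaluates their auxiliary function F(α) = α + E[max(ℓ − α, 0)]/(1 − β) on a set of scenario losses; minimizing F over α yields the empirical CVaR, and for an empirical distribution the minimizer can be searched among the scenario losses themselves:

```python
def cvar_scenarios(losses, beta=0.95):
    """Empirical CVaR via the Rockafellar-Uryasev auxiliary function.

    F(alpha) = alpha + mean(max(loss - alpha, 0)) / (1 - beta);
    the minimum of F over alpha equals CVaR at confidence level beta.
    """
    n = len(losses)

    def F(alpha):
        excess = sum(max(l - alpha, 0.0) for l in losses)
        return alpha + excess / ((1.0 - beta) * n)

    # For an empirical distribution the minimizer is one of the losses.
    return min(F(alpha) for alpha in losses)
```

With 100 equally likely losses 1, …, 100 and β = 0.95 this returns 98, the average of the five worst losses. Replacing each max(·, 0) term by an auxiliary variable with linear constraints is what turns the minimization into the linear program mentioned above.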
7.2.3 Investment Strategies
The scenario-based MPC framework developed in this thesis opens many opportunities for
rebalancing techniques. At each rebalancing time t, the optimal portfolio is recomputed,
given the scenarios and the investment horizon. Three rebalancing schemes are possible:
• A so-called open-loop scheme can be used, where we simply rebalance the portfolio
over the horizon according to the optimal controls computed at time t; no recourse is
possible.
• Closed-loop schemes, also called controlled approaches, take incoming market
information into account and recompute the optimal controls over the investment horizon.
• The last method simply implements the first control at time t and does not rebalance
over the investment horizon, i.e. it reduces to a buy-and-hold strategy over that
horizon.
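The closed-loop (receding-horizon) scheme can be sketched as follows; solve_plan and observe are hypothetical placeholders for the scenario-based MPC solver and the incoming market information, not interfaces defined in this thesis:

```python
def closed_loop(w0, horizon, n_periods, solve_plan, observe):
    """Receding-horizon rebalancing: at each period, re-solve the
    multi-period problem over the horizon using the latest market
    information and implement only the first optimal control."""
    w, path = w0, [w0]
    for t in range(n_periods):
        market = observe(t)                    # incoming information
        plan = solve_plan(w, market, horizon)  # controls over the horizon
        w = plan[0]                            # apply first control only
        path.append(w)
    return path
```

The open-loop scheme would instead apply the whole plan computed at t = 0, and the buy-and-hold variant would apply plan[0] once and then stop rebalancing.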
Future research might conduct a thorough investigation of the optimal rebalancing scheme,
relying on the tools developed in this thesis. Depending on the type of investor considered
and transaction costs incurred, a lower or higher rebalancing frequency might be appropriate.
7.2.4 Scenario Generator
We presented a scenario generation method relying on a Hidden Markov Model (HMM), which
identified three regimes. This area of research, like other machine-learning techniques, is
very popular, and much empirical work remains to be done. A direction for further research
is to find a combination of investment strategies whose performances are weakly correlated
within a given regime, making it possible to stabilize, or ideally improve, the risk-adjusted
performance of an aggregated portfolio.
In this thesis we considered the open-loop approach on a monthly basis, without refitting the
parameters of the HMM. A dynamic strategy could benefit from regular refitting because, over
the years, changing market dynamics could lead to the identification of other regimes, i.e.
the parameters of the HMM are time-varying.
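A minimal sketch of such a regime-based scenario generator, assuming the HMM parameters (transition matrix P, regime-conditional means mu and volatilities sigma; all illustrative inputs, not the fitted values of this thesis) are given, could look as follows:

```python
import random

def simulate_scenarios(P, mu, sigma, n_steps, n_scen, start=0, seed=0):
    """Draw return scenarios from a fitted Gaussian HMM: simulate a
    regime path from the transition matrix P, then sample one return
    per step from the Gaussian of the current regime."""
    rng = random.Random(seed)
    scenarios = []
    for _ in range(n_scen):
        state, path = start, []
        for _ in range(n_steps):
            u, cum = rng.random(), 0.0
            for j, p in enumerate(P[state]):  # next regime from row P[state]
                cum += p
                if u < cum:
                    state = j
                    break
            path.append(rng.gauss(mu[state], sigma[state]))
        scenarios.append(path)
    return scenarios
```

Refitting the HMM before each call would make P, mu, and sigma time-varying, in line with the suggestion above.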
A Appendix
A.1 The Jacobi and Gauss-Seidel Iterative Methods
A.1.1 Jacobi Method
The Jacobi Method is an iterative method which relies on a splitting methodology. Assume we
want to solve the system Ax = b, where A ∈ R^{n×n} is a sparse matrix. We split A = D − E − F,
where D is the diagonal of A and −E, −F are its strictly lower and strictly upper triangular
parts, move the two terms to different sides of the equation, and obtain

Dx = (E + F)x + b.

The iteration consists in replacing x on the left-hand side by x_k and x on the right-hand
side by x_{k−1}:

x_k = D^{−1}(E + F)x_{k−1} + D^{−1}b.
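Written componentwise, the iteration above can be sketched as follows (a small dense-matrix Python illustration, not code from this thesis):

```python
def jacobi(A, b, x0, n_iter=50):
    """Jacobi iteration x_k = D^{-1}((E + F) x_{k-1} + b): every
    component of the new iterate is computed from the previous one."""
    n = len(b)
    x = list(x0)
    for _ in range(n_iter):
        # the comprehension reads only the old x, so all components
        # are updated simultaneously
        x = [
            (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)
        ]
    return x
```

Convergence is guaranteed, for instance, when A is strictly diagonally dominant.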
A.1.2 Gauss-Seidel Method

The Gauss-Seidel Method is derived in the same way as the Jacobi Method but uses the splitting
A = (D − E) − F, resulting in the following iteration

(D − E)x_k = F x_{k−1} + b.

For each k ≥ 1, we generate x_k from x_{k−1} by

x_k = C_g x_{k−1} + c_g,

where C_g = (D − E)^{−1}F and c_g = (D − E)^{−1}b. This iterative method performs a forward
substitution at each step, overwriting each old value with the newly computed one, so that
updated components are used immediately within the same sweep. The error reduction is
typically faster than in the Jacobi method.
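For comparison, the same componentwise scheme with in-place updates gives a Gauss-Seidel sweep (again an illustrative Python sketch, not code from this thesis):

```python
def gauss_seidel(A, b, x0, n_iter=50):
    """Gauss-Seidel sweep: each component is overwritten immediately,
    so later components in the same sweep already use the updated
    values (a forward substitution with the lower-triangular part of A)."""
    n = len(b)
    x = list(x0)
    for _ in range(n_iter):
        for i in range(n):
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / A[i][i]
    return x
```

For well-behaved (e.g. diagonally dominant) systems this typically needs roughly half as many iterations as the Jacobi method for the same accuracy.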
Bibliography
[1] Noel Amenc, Felix Goltz, and Lionel Martellini. Smart Beta 2.0. The Journal of Index
Investing, 4(3):15–23, 2013.
[2] Noel Amenc, Felix Goltz, Ashish Lodh, Lionel Martellini, and Eric Shirbini. Risk Allo-
cation, Factor Investing and Smart Beta: Reconciling Innovations in Equity Portfolio
Construction. Edhec Publications, July 2014.
[3] Y. Aït-Sahalia, J. Cacho-Diaz, and T.R. Hurd. Portfolio choice with jumps: A closed-form
solution. The Annals of Applied Probability, 19:556–584, 2009.
[4] S. Basak and G. Chabakauri. Dynamic mean-variance asset allocation. Review of Finan-
cial Studies, 23:2970–3016, 2010.
[5] N. Bäuerle and U. Rieder. Markov Decision Processes with Applications to Finance.
Springer, Heidelberg, 2011. doi: 10.1007/978-3-642-18324-92.
[6] C. Bender, C. Gärtner, and N. Schweizer. Pathwise dynamic programming. Working
paper, 2015.
[7] D. Bertsekas. Dynamic Programming and Optimal Control, volume 1. Athena Scientific,
2005.
[8] R. Bey, R. Burgess, and P. Cook. Measurement of estimation risk in Markowitz portfolios.
Working Paper, 1990.
[9] Vineer Bhansali, Josh Davis, Graham Rennison, Jason C. Hsu, and Feifei Li. The Risk in
Risk Parity: A Factor Based Analysis of Asset Based Risk Parity. Social Science Research
Network Working Paper Series, October 2012. URL http://ssrn.com/abstract=2167058.
[10] F. Black and R. Litterman. Global portfolio optimization. Financial Analysts Journal, 48
(5):28–43, 1992.
[11] T. Bodnar, N. Parolya, and W. Schmid. A Closed-Form Solution of the Multi-Period
Portfolio Choice Problem for a Quadratic Utility Function. EUV working paper 292, 2012.
[12] J.P. Bouchaud and M. Potters. Theory of Financial Risk. Aleea-Saclay, Eyrolles, Paris, 1997.
[13] S. Boyd and L. Vandenberghe. Convex optimization. Cambridge University Press, 2004.
[14] Stephen Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein. Distributed optimization
and statistical learning via the alternating direction method of multipliers. Found. Trends
Mach. Learn., 3(1):1–122, 2011. doi: 10.1561/2200000016.
[15] Stephen Boyd, Mark T. Mueller, Brendan O’Donoghue, and Yang Wang. Performance
bounds and suboptimal policies for multi–period investment. Found. Trends Optim., 1
(1):1–72, January 2014. ISSN 2167-3888. doi: 10.1561/2400000001. URL http://dx.doi.
org/10.1561/2400000001.
[16] M. Brandt and P. Santa-Clara. Dynamic portfolio selection by augmenting the asset space.
The Journal of Finance, 61:2187–2217, 2006.
[17] J.R Bunch and B.N. Parlett. Direct methods for solving symmetric indefinite systems of
linear equations. SIAM Journal on Numerical Analysis, 8(4):639–655, 1971.
[18] Thomas F. Cargill and Robert A. Meyer. Multiperiod Portfolio Optimization And The Value
Of Risk Information. Advances in Financial Planning and Forecasting, v2(1):245–268,
1987.
[19] Mark Carhart. On persistence in mutual fund performance. Journal of Finance, 52(1):
57–82, 1997. URL http://EconPapers.repec.org/RePEc:bla:jfinan:v:52:y:1997:i:1:p:57-82.
[20] Celikyurt and Özekici. Multiperiod portfolio optimization models in stochastic markets
using the mean-variance approach. European Journal of Operational Research, 179:
186–202, 2007.
[21] Nai-Fu Chen, Richard Roll, and Stephen A Ross. Economic forces and the stock market.
The Journal of Business, 59(3):383–403, 1986. URL http://EconPapers.repec.org/RePEc:
ucp:jnlbus:v:59:y:1986:i:3:p:383-403.
[22] G. Connor. The three types of factor models: A comparison of their explanatory power.
Financial Analysts Journal, 519–531, 1995.
[23] G. Connor and R.A. Korajczyk. A test for the number of factors in an approximate factor
model. The Journal of Finance, 48(4):1263–1291, 1993. ISSN 1540-6261. doi: 10.1111/j.
1540-6261.1993.tb04754.x. URL http://dx.doi.org/10.1111/j.1540-6261.1993.tb04754.x.
[24] D. Duffie and H. Richardson. Mean-variance hedging in continuous time. Annals of
Probability, 1:1–15, 1991.
[25] J. Eckstein. Parallel alternating direction multiplier decomposition of convex pro-
grams. Journal of Optimization Theory and Applications, 80(1):39–62, 1994. doi:
10.1007/bf02196592.
[26] Jonathan Eckstein and Dimitri P. Bertsekas. On the Douglas-Rachford splitting method
and the proximal point algorithm for maximal monotone operators. Mathematical
Programming, 55(1-3):293–318, 1992. doi: 10.1007/bf01581204.
[27] Jonathan Eckstein and Michael C. Ferris. Operator-splitting methods for monotone affine
variational inequalities, with a parallel application to optimal control. INFORMS Journal
on Computing, 10(2):218–235, 1998. doi: 10.1287/ijoc.10.2.218.
[28] A. Edelman. Eigenvalues and condition numbers of random matrices. SIAM J. Matrix
Analy. Appl., 9(4):543, 1988.
[29] R. Engle, J. Mezrich, and L. You. Optimal asset allocation. 1998.
[30] Christina Erlwein. Applications of hidden Markov models in financial modelling, 2008.
[31] E.F. Fama and K.R. French. The cross-section of expected stock returns. Journal of finance,
pages 427–465, 1992.
[32] Eugene Fama and Kenneth French. Common risk factors in the returns on stocks and
bonds. Journal of Financial Economics, 33(1):3–56, 1993. URL http://EconPapers.repec.
org/RePEc:eee:jfinec:v:33:y:1993:i:1:p:3-56.
[33] E. A. Feinberg and A. Schwartz. Handbook of Markov Decision Processes. Kluwer Academic,
2002.
[34] R. Glowinski and A. Marroco. Sur l’approximation, par éléments finis d’ordre un, et la réso-
lution, par pénalisation-dualité d’une classe de problèmes de dirichlet non linéaires. Re-
vue française d’automatique, informatique, recherche opérationnelle. Analyse numérique,
9(R2):41–76, 1975. doi: 10.1051/m2an/197509r200411.
[35] R.B. Gold. Why the efficient frontier for real estate is fuzzy. Journal of Real Estate Portfolio
Management, 1:59–66, 1995.
[36] T. Goldstein, B. O'Donoghue, and S. Setzer. Fast alternating direction optimization
methods. Technical report, UCLA Computational and Applied Mathematics Report, 2012.
[37] R.C. Grinold and M. Stuckelman. The value-added/turnover frontier. The Journal of
Portfolio Management, pages 8–17, 1993.
[38] N.H. Hakansson. On myopic portfolio policies, with and without serial correlation of
yields. Journal of Business, 44(3):324–334, 1971.
[39] G. Hanoch and H. Levy. The efficiency analysis of choices involving risk. The Review of
Economic Studies, 36(3):335–346, 1969.
[40] Mary R. Hardy. A regime-switching model of long-term stock returns. North American
Actuarial Journal, 5(2):41–53, 2001. doi: 10.1080/10920277.2001.10595984.
[41] Ulrich G. Haussmann and Jörn Sass. Optimal terminal wealth under partial information
for HMM stock returns. Contemporary Mathematics of Finance, pages 171–185, 2004. doi:
10.1090/conm/351/06401.
[42] J. Hinz and J. Yee. Stochastic switching for partially observable dynamics and optimal
asset allocation. Working paper, 2016.
[43] W. James and C. Stein. Estimation with quadratic loss. In Proceedings of the Berkeley
Symposium on Mathematical Statistics and Probability, volume 4, page 361. University of
California Press, 1956.
[44] Zura Kakushadze and Willie Yu. Statistical Risk Models. Social Science Research Network
Working Paper Series, February 2016. URL http://ssrn.com/abstract=2732453.
[45] Jia Kang, Arvind U. Raghunathan, and Stefano Di Cairano. Decomposition via admm for
scenario-based model predictive control. 2015 American Control Conference (ACC), 2015.
doi: 10.1109/acc.2015.7170904.
[46] Dong-Hee Kim and Hawoong Jeong. Systematic analysis of group identification in stock
markets. Phys. Rev. E, 72:046133, Oct 2005. doi: 10.1103/PhysRevE.72.046133. URL
https://arxiv.org/pdf/physics/0503076.pdf.
[47] M. Kritzman, S. Mygren, and S. Page. Optimal rebalancing: A scalable solution. Revere
Street Working Papers, 2007.
[48] K.F. Kroner and J. Sultan. Time-varying distributions and dynamic hedging with foreign
currency futures. Journal of Financial and Quantitative Analysis, 28(4):535–551, 1993.
[49] Laurent Laloux, Pierre Cizeau, Jean-Philippe Bouchaud, and Marc Potters. Random
matrix theory and financial correlations. Science Finance (CFM) working paper archive
500053, Science Finance, Capital Fund Management, 1999. URL http://EconPapers.
repec.org/RePEc:sfi:sfiwpa:500053.
[50] D. Lee. Trading USDCHF filtered by Gold dynamics via HMM coupling. ArXiv e-prints,
August 2013.
[51] M. Leippold, P. Vanini, and F. Trojani. A geometric approach to multiperiod mean-
variance optimization of assets and liabilities. Journal of Economic Dynamics and Control,
28:1079–1113, 2004.
[52] D. Li and W.L. Ng. Optimal dynamic portfolio selection: multiperiod meanvariance
formulation. Mathematical Finance, 10:387–406, 2000.
[53] John Lintner. The valuation of risk assets and the selection of risky investments in stock
portfolios and capital budgets. The Review of Economics and Statistics, 47(1):13–37, 1965.
ISSN 00346535, 15309142. URL http://www.jstor.org/stable/1924119.
[54] H. Markowitz. Portfolio selection. The Journal of Finance, 7:77–91, 1952.
[55] H. Markowitz. Portfolio Selection: Efficient diversification of investments. John Wiley, New
York, 1959.
[56] H.M. Markowitz and E.L. Van Dijk. Single-period mean-variance analysis in a changing
world. Financial Analysts Journal, 59(2):20–44, 2003.
[57] David Q. Mayne. Model predictive control. Automatica, 50(12):2967–2986, December
2014. ISSN 0005-1098. doi: 10.1016/j.automatica.2014.10.128. URL http://dx.doi.org/10.
1016/j.automatica.2014.10.128.
[58] D.Q. Mayne, J.B. Rawlings, C.V. Rao, and P.O.M. Scokaert. Constrained model predictive
control: Stability and optimality. Automatica, 36(6):789–814, 2000. doi:
10.1016/s0005-1098(99)00214-9.
[59] R. Merton. Continuous-Time Finance. Oxford, 1990.
[60] R. C. Merton and P.A. Samuelson. Fallacy of the log-normal approximation to optimal
portfolio decision-making over many periods. Journal of Financial Economics, 1:67–94,
1974.
[61] A. Meucci. Simulations with exact means and covariances. 2009. Available at Symmys:
http://symmys.com/node/162.
[62] A. Meucci and M. Nicolosi. Dynamic portfolio management with views at multiple
horizons. 2015. Available at SSRN: http://ssrn.com/abstract=2583612.
[63] A. Meucci, A. Santangelo, and R. Deguest. Measuring portfolio diversification based
on optimized uncorrelated factors. 2013. Available at SSRN: http://ssrn.com/abstract=
2276632.
[64] Richard O. Michaud and Robert O. Michaud. Efficient Asset Management: A Practical
Guide to Stock Portfolio Optimization and Asset Allocation. Oxford University Press, 2
edition, 2008. URL http://EconPapers.repec.org/RePEc:oxp:obooks:9780195331912.
[65] R.O. Michaud. The Markowitz optimization enigma: Is optimized optimal? Financial
Analysts Journal, 45(1):31–42, 1989.
[66] G. Miller. Needles, haystacks, and hidden factors. The Journal of Portfolio Management,
32(2):25–32, 2006.
[67] Manfred Morari, Carlos E. Garcia, and David M. Prett. Model predictive control:
Theory and practice. Model Based Process Control, page 1–12, 1989. doi: 10.1016/
b978-0-08-035735-5.50006-1.
[68] J. Mossin. Equilibrium in a capital asset market. Econometrica, 34:766–783, 1966.
[69] J. Mossin. Optimal multiperiod portfolio policies. The Journal of Business, 41:215–229,
1968.
[70] Yu. Nesterov. Gradient methods for minimizing composite functions. Mathematical
Programming, 140(1):125–161, 2012. doi: 10.1007/s10107-012-0629-5.
[71] Yurii Nesterov and Arkadii Nemirovskii. Interior-point polynomial algorithms in convex
programming. 1994. doi: 10.1137/1.9781611970791.
[72] Brendan O'Donoghue, Giorgos Stathopoulos, and Stephen Boyd. A splitting method for
optimal control. IEEE Transactions on Control Systems Technology, 21(6):2432–2442, 2013.
doi: 10.1109/tcst.2012.2231960.
[73] Neal Parikh and Stephen Boyd. Proximal algorithms. Found. Trends Optim., 1(3):127–239,
January 2014. ISSN 2167-3888. doi: 10.1561/2400000003. URL http://dx.doi.org/10.1561/
2400000003.
[74] Eckhard Platen. A Benchmark Approach to Investing and Pricing. Research Paper Series
253, Quantitative Finance Research Centre, University of Technology, Sydney, August
2009.
[75] Eckhard Platen and David Heath. A Benchmark Approach to Quantitative Finance.
Springer Berlin, 2006. ISBN 9783540262121.
[76] Eckhard Platen and Renata Rendek. Approximating the numeraire portfolio by naive
diversification. Research Paper Series 281, Quantitative Finance Research Centre, Univer-
sity of Technology, Sydney, 2010.
[77] Stanley R. Pliska. Introduction to mathematical finance: discrete time models. Blackwell
Publishers, 1997.
[78] W. B. Powell. Approximate dynamic programming: Solving the curses of dimensionality.
Wiley, 2007.
[79] M.L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming.
Wiley, New York, 1994.
[80] Stefan Richter, Colin N. Jones, and Manfred Morari. Real-time input-constrained MPC
using fast gradient methods. Proceedings of the 48th IEEE Conference on Decision and
Control (CDC) held jointly with 2009 28th Chinese Control Conference, 2009. doi: 10.1109/
cdc.2009.5400619.
[81] R.T. Rockafellar and S. Uryasev. Optimization of conditional value-at-risk. Journal of risk,
2:21–42, 2000.
[82] T. Roncalli. Introduction to Risk Parity and Budgeting. Chapman Hall/CRC, 2013.
[83] T. Roncalli and G. Weisang. Risk parity portfolios with risk factors. Working Paper, 2012.
Available at: http://www.thierry-roncalli.com/download/risk-factor-parity.pdf.
[84] Olivier Roy and Martin Vetterli. The effective rank: A measure of effective dimensionality.
In 15th European Signal Processing Conference, EUSIPCO 2007, Poznan, Poland, Septem-
ber 3-7, 2007, pages 606–610, 2007. URL http://ieeexplore.ieee.org/xpl/articleDetails.jsp?
arnumber=7098875.
[85] M. Rubinstein. Continuously rebalanced investment strategies. The Journal of Portfolio
Management, 10(3):387–406, 1991.
[86] P. A. Samuelson. Lifetime portfolio selection by dynamic stochastic programming. Review
of Economic Studies, 51:239–246, 1969.
[87] M.A. Saunders. Cholesky-based Methods for Sparse Least Squares: the Benefits of Regular-
ization. 1996.
[88] William Sharpe. Capital asset prices: A theory of market equilibrium under conditions of
risk. Journal of Finance, 19(3):425–442, 1964. URL http://EconPapers.repec.org/RePEc:
bla:jfinan:v:19:y:1964:i:3:p:425-442.
[89] J. Skaf and S. Boyd. Multi-Period Portfolio Optimization with Constraints and Transaction
Costs. Stanford working paper, 2009.
[90] L. Sneddon. The dynamics of active portfolios. In Proceedings of the Northfield Research
Conference 2005. Northinfo, 2005.
[91] Florin Spinu. An algorithm for computing risk parity weights. SSRN Electronic Journal,
2013. doi: 10.2139/ssrn.2297383.
[92] G. Stathopoulos, T. Keviczky, and Y. Wang. A hierarchical time-splitting approach for
solving finite-time optimal control problems. In 2013 European Control Conference (ECC),
pages 3089–3094, July 2013.
[93] M. C. Steinbach. Markowitz revisited: Mean-variance models in financial portfolio
analysis. SIAM Review, 43:31–85, 2001.
[94] J. Tobin. Liquidity preference as behavior towards risk. Review of Economic Studies, 25:
65–86, 1958.
[95] Jack L. Treynor. Market Value, Time, and Risk. Social Science Research Network Working
Paper Series, 1961. URL http://ssrn.com/abstract=2600356.
[96] Yang Wang and Stephen Boyd. Fast model predictive control using online optimization.
IEEE Transactions on Control Systems Technology, 18(2):267–278, 2010. doi: 10.1109/tcst.
2009.2017934.
[97] A. A. Yushkevich. Reduction of a controlled Markov model with incomplete data to a
problem with complete information in the case of Borel state and control spaces. Theory
of Probability and Its Applications, 21(1):153–158, 1976. doi: 10.1137/1121014.