DOCTORAL THESIS

FAST SCENARIO-BASED OPTIMAL CONTROL

FOR STOCHASTIC PORTFOLIO OPTIMIZATION

with Application to a Large-Scale Portfolio

Author: Marc WEIBEL

Supervisor: Associate Professor Juri HINZ

Co-supervisor: Professor Marc WILDI

A thesis submitted to the Finance Discipline Group of the University of Technology Sydney, in fulfilment of the requirements for the degree of Doctor of Philosophy.

June 2019

Finance Discipline Group University of Technology Sydney PO Box 123, Broadway, NSW 2007, Australia

https://www.uts.edu.au/about/uts-business-school/finance

http://www.uts.edu.au/

Certificate of Original Authorship

I, Marc Weibel, declare that this thesis is submitted in fulfilment of the requirements for the

award of Degree of Doctor of Philosophy, in the Finance Discipline Group of the Faculty of

Sciences at the University of Technology Sydney.

This thesis is wholly my own work unless otherwise referenced or acknowledged. In addition, I

certify that all information sources and literature used are indicated in the thesis.

This document has not been submitted for qualifications at any other academic institution.

This research is supported by the Australian Government Research Training Program.

Signature of Student:

Date: 10/26/2018

Production Note:

Signature removed prior to publication.

Abstract

This thesis contributes towards the development of a fast optimal control algorithm, relying

on the Alternating Direction Method of Multipliers (ADMM), for solving large-scale linear convex

multi-period optimization problems as well as the design of investment strategies aiming at

stabilizing portfolio performance over time.

The first part of the thesis focuses on a statistical risk-budgeting method to improve naive

diversification strategies. We extend the so-called minimum-torsion approach and use advanced

modern techniques for covariance estimation and shrinkage purposes. We propose

a novel factor investing approach, which dynamically identifies statistical risk factors over

time. We devise dynamic investment strategies aiming at diversifying idiosyncratic risk left

unexplained by the factors.

In the second part of this thesis, we develop a fast algorithm for efficiently solving scenario-based model

predictive control (MPC) problems arising in multi-period portfolio optimization.

We derive an alteration of the termination criterion, using the probabilities assigned to the

scenarios, and provide a convergence analysis. We show that the proposed criterion outperforms

the standard approach and highlight our results with a numerical comparison with a

state-of-the-art algorithm. We also enhance the standard two-set splitting algorithm of the

ADMM method, by including inequality constraints through a so-called embedded splitting,

without recourse to an additional (costly) splitting set.

We present a real-world large-scale multi-period portfolio application, where we combine the

different concepts derived in this thesis. We propose an approach to generate scenarios relying

on a Hidden Markov Model (HMM) and solve the constrained multi-period MPC problem

with the ADMM algorithm developed. We also suggest an innovative concept to steer the

risk-aversion used in the objective function dynamically, building on the probability assigned

to each scenario. We back-test the strategy and show that the results obtained do provide the

expected risk-adjusted outperformance and stability, without deviating significantly from the

strategic asset allocation.

Key words: Risk-Budgeting, Diversification, Convex Optimization, Model Predictive Control,

Alternating Direction Method of Multipliers, Optimal Control.

Acknowledgements

During my career in the industry and at the Zurich University of Applied Sciences, I have

had the privilege and pleasure of working and interacting with many talented persons. These

people have contributed to my education and my evolution and it would be impossible to

name each of them.

First of all, I would like to thank my principal advisor, Juri Hinz. He gave me the opportunity to

pursue a PhD, and his dedication and support have helped me to navigate through my thesis

and to conduct proper research. He undertook the necessary steps for enrolling me into the

program at the University of Technology Sydney and helped me in every aspect of the PhD

thesis. I would like to express my deepest gratitude and respect to Juri for his support and

encouragement.

I would like to thank my co-adviser, Marc Wildi, who played an important role in the process

of my PhD study. Our close interaction brought me further in my research and his

well-founded knowledge of econometrics has helped me in taking the correct direction when I felt

uncertain.

I would also like to express my gratitude to my management at the Zurich University of

Applied Sciences, in particular Wolfgang Breymann and Jürg Hosang, who encouraged me to

accomplish this PhD.

My first contact with financial theory was during my Master Studies in Economics at the

University of Neuchatel, Switzerland. I took my first finance class with Prof. Michel Dubois and

resolved to take every class Michel offered thereafter. Michel was an incredible lecturer and

teacher and sparked my interest in financial topics.

I would like to thank the academics at UTS, who offered me the opportunity to pursue my studies

from Switzerland, and the reviewers who took the time necessary for reading my progress

reports and provided me with thoughtful advice.

On the private side, I would like to thank my family and especially my lovely wife Coralie for her

unconditional and constant support throughout the years. She sometimes let me work very

late into the night to finish a chapter and put a smile back on my face when I felt demoralized.

This thesis is dedicated to you, Coralie.

Contents

Declaration

Abstract

Acknowledgements

List of Figures

List of Tables

Abbreviations

Introduction

1 Alternative Diversification Strategies

1.1 Introduction

1.2 Naive Diversification

1.2.1 Mean-Variance and Naive Diversification

1.3 Random Matrix Theory

1.3.1 Overview

1.3.2 Theory

1.3.3 Random correlation matrices

1.4 Minimum-Torsion

1.4.1 Corrected-Benchmark Portfolio

1.5 Case Study

1.6 Conclusion

2 Statistical Risk Budgeting

2.1 Background

2.2 Factor Investing

2.2.1 Smart Beta

2.2.2 Risk Drivers

2.3 Statistical Factors

2.3.1 Minimum-Torsion Approach

2.3.2 Uncorrelated Factors

2.3.3 Effective Rank

2.3.4 Factor Risk Budgeting

2.4 Diversification

2.4.1 Idiosyncratic Risk

2.4.2 Measuring Diversification

2.5 Investment Strategies

2.6 Application

2.6.1 Goal

2.6.2 Data

2.6.3 Benchmark Strategies

2.6.4 Risk Parity Strategies

2.6.5 Risk Budgeting Strategies

2.6.6 Diversification Analysis

2.7 Conclusion

3 Portfolio Optimization

3.1 Single-Period Optimization

3.1.1 Modern Portfolio Theory

3.1.2 Mean-Variance Framework

3.2 Multi-Period Optimization

3.3 Dynamic Programming Techniques

4 Model Predictive Control

4.1 Introduction

4.2 Background

4.3 Scenario-Based MPC

4.4 Portfolio, Benchmark and Trading

4.5 Constraints

4.5.1 Minimum and Maximum Weights

4.5.2 Brokerage Costs and Bid-Ask Spread

4.5.3 Price impact

4.6 Problem Description

4.6.1 Portfolio Restrictions

4.6.2 Objective function

4.7 Decomposition Quadratic / Non-Quadratic

4.7.1 Quadratic Component

4.7.2 Non-Quadratic Component

4.8 Conclusion

5 Fast Scenario-Based Optimal Control

5.1 Definitions

5.1.1 Convex Functions

5.1.2 Proximal Operators

5.1.3 Proximal minimization

5.2 ADMM

5.2.1 Accelerated ADMM

5.3 Splitting the MPC Problem

5.3.1 Overview

5.3.2 Notation

5.3.3 ADMM Formulation

5.3.4 Splitting Operator

5.4 Solving the Convex Quadratic Control Problem

5.5 Solving the Convex Non-Quadratic Problem

5.6 Extending the State-of-the-Art

5.6.1 Improving Convergence

5.6.2 Extended Two-Set Splitting

5.6.3 Numerical Results

5.7 Conclusion

6 Application

6.1 Background

6.2 Data

6.3 Scenario Generator

6.3.1 Hidden Markov Model

6.3.2 Methodology

6.4 Large-Scale Dynamic Portfolio Strategy

6.4.1 Benchmark

6.4.2 Scenarios and Rebalancing

6.4.3 Risk Aversion

6.4.4 Costs and Restrictions

6.4.5 Results

6.5 Conclusion

7 Conclusion and Outlook

7.1 Conclusion

7.2 Further Research

7.2.1 Risk-Budgeting

7.2.2 Multi-Period Optimization via ADMM

7.2.3 Investment Strategies

7.2.4 Scenario Generator

A Appendix

A.1 The Jacobi and Gauss-Seidel Iterative Methods

A.1.1 Jacobi Method

A.1.2 Gauss-Seidel Method

Bibliography

List of Figures

1.1 Eigenvalues distribution of 200 random assets

1.2 Eigenvalues spectrum of 75 stocks in the S&P500

1.3 S&P500: Marchenko-Pastur density (best fit)

1.4 S&P500: Reshuffled assets

1.5 Eigenvalues spectrum of the S&P500 stocks

1.6 Cumulative performance

1.7 Outperformance vs. buy-and-hold portfolio

2.1 Asset Classes – Correlations

2.2 Risk Parity Strategies – Performance

2.3 Risk Budgeting Strategies – Performance

2.4 Risk Parity vs. Risk Budgeting – Performance

2.5 Diversification along assets and factors

5.1 Convergence properties: weighted vs. unweighted scheme

6.1 Asset Classes – Correlations

6.2 Dynamic Portfolio – Performance

List of Tables

1.1 Minimum-Torsion algorithm

2.1 Asset Classes – Statistics

2.2 Risk Parity Strategies – Key figures

2.3 Risk Budgeting Strategies – Key figures

2.4 Risk Parity vs. Risk Budgeting – Key figures

5.1 Computational Time Results for Stochastic MPC Problems

6.1 Dow Jones Stocks – Statistics

6.2 Dynamic Portfolio vs. Benchmark – Key figures

It’s not about finding your limits. It’s about finding what lies just beyond them.

— Unknown

To my wife Coralie . . .

Abbreviations

AADMM: Accelerated Alternating Direction Method of Multipliers

ADP: Approximate Dynamic Programming

ADMM: Alternating Direction Method of Multipliers

CVaR: Conditional Value-at-Risk

DP: Dynamic Programming

HMM: Hidden Markov Model

MPC: Model Predictive Control

PCA: Principal Component Analysis

RMT: Random Matrix Theory

VaR: Value-at-Risk

Introduction

Motivation

This project has been sponsored by the Swiss Government and led by the University of Applied

Sciences in Switzerland together with a Swiss industry partner. The partner is a company

based in Zurich, which provides advice, project management services and innovative software

products to financial institutions and trading firms.

We aim at developing a discrete-time multi-period portfolio optimization framework that will

be used in a real business environment for commercial purposes. As a first step, the industry partner

wants to obtain a so-called reference portfolio that does not require the estimation of asset

returns and does not assume market efficiency, unlike the standard framework formulated

in the seminal work of Markowitz [54]. We thus focus on risk-diversification methodology and

avoid the statistical issues of predicting returns.

Relying on so-called risk budgeting techniques, we extend the minimum-torsion approach

presented in Meucci et al. [63] and use advanced modern techniques for covariance estimation

and shrinkage purposes, based on Random Matrix Theory (RMT). We enhance the parity

approach with a factor-based risk-budgeting methodology relying on a consistent estimate of

the covariance matrix. We present a sound and consistent approach, extending the state of the art,

and examine real-world market data within a portfolio construction context.

Optimizing a multi-asset portfolio over a given investment horizon within an industrial framework

usually leads to a problem of sequential decision-making under uncertainty. Closed-form

solutions to real-world problems are the exception: an exact solution is usually out of reach

and is not of primary importance in practice. Our final objective in this thesis

is to develop a fast optimal control algorithm, which can handle constantly changing trading

signals efficiently in the presence of portfolio constraints and transaction costs. The final part

of the project, which is not handled by this thesis, aims at delivering high-quality software to

the industrial partner relying on these novel scientific insights, providing a sound and efficient

portfolio construction framework delivering stable performance.

Outline and Contribution

In Chapter 1, we review existing portfolio diversification methods and propose a statistical

risk-budgeting method to improve naive diversification strategies. We extend the minimum-torsion

approach presented in Meucci et al. [63] and use advanced modern techniques for

covariance estimation and shrinkage purposes, based on random matrix theory. We extend the

state of the art and develop a novel factor investing approach in Chapter 2, where we dynamically

identify statistical risk factors. We combine the minimum-torsion approach with the

concept of modified effective rank developed by Kakushadze and Yu [44] and devise dynamic

investment strategies aiming at diversifying idiosyncratic risk left unexplained by the factors.

We review the single- and multi-period optimization frameworks in Chapter 3 as well as the

solutions proposed in the literature. After a short overview of Dynamic Programming (DP)

techniques and the issues that arise when considering a large-scale portfolio, we introduce in

Chapter 4 Model Predictive Control (MPC), a widespread technique for solving linear convex

optimization problems often used in engineering. We derive a scenario-based formulation

of the portfolio optimization problem over a given investment horizon, in the presence of

transaction costs and portfolio restrictions. We show how the resulting problem can be

decomposed into a quadratic and non-quadratic component, whose particular structure can

be exploited.

In Chapter 5, we present the Alternating Direction Method of Multipliers (ADMM) and

use it to develop a fast algorithm for solving the scenario-based model predictive control

problems arising in multi-period portfolio optimization. We enhance the standard procedure

and derive an alteration of the termination criterion, using the probabilities assigned to

the scenarios, and provide a convergence analysis. We show that the proposed criterion

outperforms the standard approach and highlight our results with a numerical comparison

with state-of-the-art algorithms. We also enhance the two-set splitting algorithm of the

ADMM method, by including inequality constraints through a so-called embedded splitting,

without recourse to a third splitting of the variables.

We propose a real-world large-scale multi-period portfolio application in Chapter 6, where

we combine the different concepts derived in this thesis. We suggest a novel approach to

generate scenarios relying on a Hidden Markov Model (HMM) and solve the constrained multi-period

MPC problem with the new ADMM algorithm developed in this thesis. We suggest an

innovative concept to steer the risk-aversion dynamically, building on the probability assigned

to each scenario. We back-test the strategy and show that the results obtained do provide the

expected risk-adjusted outperformance and stability, without deviating significantly from the

strategic asset allocation.

We present our conclusions in Chapter 7 and identify areas for future research.

1 Alternative Diversification Strategies

After a short review of existing portfolio diversification methods in Section 1.1, we start our

analysis in Section 1.2 with the naive diversification approach and consider the traditional

one-period portfolio optimization as proposed by Markowitz [54]. The computation of the

minimum-variance portfolio should rely on a consistent estimate of the covariance matrix and

remove the noise inherent in financial data. To this end, we rely in Section 1.3

on Random Matrix Theory (RMT): using results on the eigenvalue distribution of random

matrices, we distinguish significant dependencies in the data from natural fluctuations

in the eigenvalues.

Using concepts of risk budgeting methodology, we aim at holding an equal fraction of the entire

portfolio wealth in each of the assets. However, due to asset correlations, some adaptations

have to be carried out. We follow here the so-called minimum-torsion approach,

described in Section 1.4, which retrieves uncorrelated factors. We devise an investment

strategy relying on the well-known naive diversification: we allocate wealth by applying a correction

to the original equal-weight distribution. The correction is carried out with the help of

the minimum-torsion matrix and results in a portfolio in which highly correlated assets are

under-represented. These findings are illustrated by the case study on 375 stocks in the S&P500

index in Section 1.5. We conclude this chapter in Section 1.6.

1.1 Introduction

In portfolio management practice, major difficulties originate from problems associated with

reliable estimation of statistical model parameters and the sensitivity of the optimal asset

allocation with respect to these quantities. Since there are several aspects of these difficulties

and an entire range of methods which address them, let us explain our concept focusing on

classical dynamic portfolio optimization. In this context, the major quantitative ingredients

are the (conditional) means and the covariances of the so-called asset log-returns. While a

reliable and statistically significant estimation of log-return means is virtually impossible, the

estimation of covariances may also be extremely difficult in practice, since for a large number

of assets we must consider the asymptotic behavior of the spectrum of random empirical

covariance matrices. Given the permanent changes over time in the intensity of price fluctuations and

the extreme sensitivity of the optimal portfolio weights, a naive construction of the optimal

portfolio has no value for practical applications.

In view of these problems, some practically relevant approaches to portfolio optimization have

been suggested in order to overcome or to diminish the dependence on statistical procedures

of model identification.

Let us emphasize the benchmark approach in this setting. This theory addresses an optimal

portfolio selection under minimal theoretical assumptions and presents considerations

justifying the asymptotic optimality of an equally-weighted portfolio. This investment strategy

attempts to hold an approximately equal fraction of the entire portfolio wealth in each of the

assets selected for investment. Since this strategy requires regular re-balancing of positions,

it is not a static investment, strictly speaking. However, empirical investigations show that

for appropriate diversification, even infrequent re-balancing can achieve reasonable performance.
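
As a rough illustration of the mechanics just described, the following Python sketch tracks the wealth of an equally-weighted portfolio that is reset to equal fractions every `rebalance_every` periods. The function name, the simulated returns and the two rebalancing frequencies are assumptions made for this example only; they are not taken from the empirical investigations cited here, and transaction costs are ignored.

```python
import numpy as np


def equal_weight_wealth(returns, rebalance_every=1):
    """Wealth path of an equally-weighted portfolio (initial wealth 1).

    returns: (T, N) array of simple per-period asset returns.
    rebalance_every: number of periods between resets to equal weights.
    """
    returns = np.asarray(returns, dtype=float)
    n_periods, n_assets = returns.shape
    holdings = np.full(n_assets, 1.0 / n_assets)  # value held in each asset
    wealth = np.empty(n_periods)
    for t in range(n_periods):
        if t % rebalance_every == 0:
            # Reset every position to the same fraction of current wealth.
            holdings = np.full(n_assets, holdings.sum() / n_assets)
        holdings = holdings * (1.0 + returns[t])  # positions drift with prices
        wealth[t] = holdings.sum()
    return wealth


# Hypothetical data: 500 periods, 10 assets, i.i.d. normal returns.
rng = np.random.default_rng(0)
simulated = rng.normal(0.0005, 0.01, size=(500, 10))
daily = equal_weight_wealth(simulated, rebalance_every=1)
monthly = equal_weight_wealth(simulated, rebalance_every=21)
```

Comparing `daily[-1]` with `monthly[-1]` on real data, net of trading costs, is one simple way to examine how much performance depends on the rebalancing frequency.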

A related set of ideas underlies the so-called risk-parity approach. In this framework,

the investor attempts to build a portfolio choosing portfolio weights such that the marginal

contribution from each asset position to an appropriately defined total portfolio risk is the

same. Such a risk-parity approach is used to build diversified portfolios which do not rely on

return expectations, with the focus on risk management rather than performance. However,

the risk parity approach has also been criticized, and some stylized dependence on expected

returns has been reintroduced, with extensions in terms of the so-called minimum-torsion

approach.
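
To make the notion of a risk contribution concrete, the sketch below assumes that total portfolio risk is measured by volatility (one possible choice; the text leaves the risk measure open) and computes each asset's contribution $w_i (\Sigma w)_i / \sqrt{w^T \Sigma w}$, which sums to the portfolio volatility. The three volatilities are hypothetical and the assets are taken to be uncorrelated, a special case in which inverse-volatility weights equalize the contributions.

```python
import numpy as np


def risk_contributions(weights, cov):
    """Contribution of each position to portfolio volatility.

    The contributions add up to the portfolio volatility sqrt(w' cov w).
    """
    weights = np.asarray(weights, dtype=float)
    cov = np.asarray(cov, dtype=float)
    port_vol = np.sqrt(weights @ cov @ weights)
    marginal = cov @ weights / port_vol  # d(port_vol) / d(weights)
    return weights * marginal


# Hypothetical, uncorrelated assets with volatilities of 5%, 10% and 20%.
vols = np.array([0.05, 0.10, 0.20])
cov = np.diag(vols ** 2)

equal_w = np.full(3, 1.0 / 3.0)
parity_w = (1.0 / vols) / (1.0 / vols).sum()  # inverse-volatility weights

rc_equal = risk_contributions(equal_w, cov)    # dominated by the 20% asset
rc_parity = risk_contributions(parity_w, cov)  # identical across assets
```

In this toy setting the parity weights are tilted toward the least volatile asset, while the equally-weighted portfolio takes most of its risk from the most volatile one. With correlated assets the equal-contribution weights generally have to be found numerically.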

A general framework, the so-called benchmark approach, is suggested in Platen [74]; it

assumes the existence of a numeraire portfolio. This numeraire portfolio displays positive

weights and, when used as a benchmark, turns all benchmarked portfolios into supermartingales.

Platen shows that this portfolio is equivalent to the Kelly portfolio, which maximizes

a logarithmic utility function. This numeraire portfolio cannot be systematically

outperformed by any other long-only portfolio. This theoretical numeraire portfolio can be

approximated by a worldwide diversified portfolio.

1.2 Naive Diversification

Explicit multi-period optimization and related processes have had very limited adoption

among practitioners because of several constraints explained in more detail in Chapter 3. The

most significant impediment is parameter estimation error.

Naive diversification approaches such as equally-weighted or market-weighted portfolios

are thus still used in the absence of valid alternatives, displaying reasonable performance due

to diversification. These methods do not rely on any statistical estimation and are justified

by asymptotic properties. Alternatively, risk-parity strategies have been developed in order

to circumvent the inherent flaws of the traditional mean-variance approach, i.e. to avoid using

expected returns and to focus on risk diversification over the assets in the portfolio. However,

these strategies have their limits, as shown in 2013, when most of the risk-parity products

displayed very disappointing performance. Covariance estimates are the key component of

such strategies and inherent noise should be removed before inferring the asset weightings.

Moreover, risk-parity products are heavily invested in fixed-income instruments, which have

historically displayed low volatility but could severely hurt future performance if interest rates

start to rise again globally.

A combination of these methods could deliver the desired performance stability of investors’

portfolios. We propose a novel approach for risk-budgeting purposes by extending the

minimum-torsion methodology, which is applied with stable covariance estimates. We identify

uncorrelated sources of risk within the portfolio. Using a risk-budgeting methodology instead

of plain risk parity makes it possible to avoid this extreme overweighting of low-volatility instruments.

1.2.1 Mean-Variance and Naive Diversification

The traditional mean-variance framework presented in Markowitz [54] is still the most widely used

approach in the financial industry. However, estimation theory has proven that this framework

produces optimal portfolio weights that change quite dramatically over time due to the

absence of large datasets. Several approaches have been proposed in order to stabilize or

shrink the covariance matrix. These minimum-risk approaches do not rely on the estimation

of expected returns and only focus on risk.

In several studies, Platen (see Platen and Heath [75], Platen [74], Platen and Rendek [76])

proposes an alternative to Markowitz-based strategies relying on naive diversification. This

diversification is deemed to approximate the “ideal” numeraire portfolio, which maximizes

the logarithmic utility function of an investor and thus dominates any other strictly positive

portfolio over a long period of time. Platen and Rendek [76] show that when the number

of assets tends towards infinity, the equally-weighted portfolio converges to the numeraire

portfolio. This robust approach is consistent and does not require any model of asset returns.

Moreover, even after deduction of transaction costs, such portfolios significantly dominate

corresponding market-weighted portfolios, thus empirically supporting the asymptotic behavior of

the numeraire “proxies”.

1.3 Random Matrix Theory

This section starts with an overview of Random Matrix Theory (RMT) and presents specific

concepts related to financial market returns.


1.3.1 Overview

The study of statistical factors inherent to asset returns has a long history in finance. Since

the seminal paper of Markowitz [54], using volatility (or variance) as a risk measure has

become standard. The additive property of variance in uncorrelated markets enables an easy

identification of the sources of risk within a portfolio:

V ar (Rp ) =N∑

i=1V ar (wi Ri )

where Rp is the portfolio returns and Ri the returns of asset i , with a weight wi in the portfolio.

However, financial assets do display correlation, and one often resorts to principal component analysis in order to extract uncorrelated sources of risk. To this end, we decompose the covariance matrix $\Sigma$ of asset returns:

$$E^{T} \Sigma E \equiv \Lambda,$$

where $\Lambda \equiv \mathrm{diag}(\lambda_1, \dots, \lambda_N)$ is a diagonal matrix containing the eigenvalues of $\Sigma$ and $E \equiv (e_1, \dots, e_N)$ are the corresponding eigenvectors (column-wise). These eigenvectors define $N$ uncorrelated factors, whose returns are defined by $R_{\mathrm{PCA}} \equiv E^{-1} R$. The eigenvalues in $\Lambda$ correspond to the variances of these uncorrelated factors.
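The decomposition above can be illustrated with a minimal R sketch. The covariance matrix `Sigma` and the simulated returns are hypothetical numbers chosen for illustration only; they are not taken from the data used in this thesis.

```r
# Hypothetical 3-asset covariance matrix (illustrative numbers only)
Sigma <- matrix(c(0.040, 0.012, 0.006,
                  0.012, 0.090, 0.018,
                  0.006, 0.018, 0.160), nrow = 3, byrow = TRUE)

eig <- eigen(Sigma, symmetric = TRUE)
E <- eig$vectors             # eigenvectors stored column-wise
Lambda <- diag(eig$values)   # eigenvalues on the diagonal

# E' Sigma E should reproduce Lambda up to rounding error
max_error <- max(abs(t(E) %*% Sigma %*% E - Lambda))

# Simulated N x T returns with covariance Sigma (assumption: Gaussian returns)
set.seed(1)
R <- t(chol(Sigma)) %*% matrix(rnorm(3 * 5000), nrow = 3)

# E is orthogonal, so E^{-1} = E' and R_PCA = E' R
R_pca <- t(E) %*% R
factor_cov <- cov(t(R_pca))  # approximately diagonal, close to Lambda
```

The off-diagonal entries of `factor_cov` are close to zero, while its diagonal approximates the eigenvalues, in line with their interpretation as factor variances.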

In Bouchaud and Potters [12], the authors show that the minimum variance portfolio, as

proposed by Markowitz [54], displays the largest weight on the eigenvectors of the correlation

matrix with the smallest eigenvalues. An effective empirical estimation of the correlation

matrix thus turns out to be a complicated task but plays a major role in portfolio construction.

If we consider N assets with a number of observations T not very large compared to N , one

can expect that the estimation of the covariances will be “noisy”, meaning that the empirical

correlation matrix is to a large extent composed of random entries. We thus have to be

careful when using empirical correlations in portfolio construction, above all in minimum-risk

strategies. It is of utmost importance to design a procedure that retains real information

and removes noise from the eigenvalues and eigenvectors.

1.3.2 Theory

Random Matrix Theory (RMT) emerged in the 1950s and is also of interest in a portfolio construction context. Algorithms used in the context of optimal portfolio liquidation, which trade off risk against impact cost, rely on the inversion of the covariance matrix. Small or zero eigenvalues are thus related to portfolios of assets that have nonzero returns but vanishing or low risk. Small samples or insufficient data lead to estimation errors that affect such portfolios. Random matrix techniques aim at solving this issue of small eigenvalues in the sample covariance matrix.



In their research paper, Laloux et al. [49] propose to compare the properties of the empirical

correlation matrix to a purely random matrix, retrieved from simulated independent returns.

The identification of deviations from the random matrix helps detect the presence of true

information.

1.3.3 Random correlation matrices

Let us consider time series of $N$ assets, with $T$ observations. The elements of the empirical correlation matrix $C$, of size $N \times N$, are given by

$$C_{ij} = \frac{1}{T} \sum_{t=1}^{T} r_{it} r_{jt},$$

where $r_{it}$ denotes the return of asset $i$ at time $t$, normalized by volatility such that $\mathrm{Var}[r_{it}] = 1$. In matrix form, the correlation matrix can be written as

$$C = \frac{1}{T} R R^{T},$$

where $R$ denotes the $N \times T$ matrix whose rows correspond to the return observations for each asset.

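Theorem 1 (Marchenko-Pastur theorem). In random matrix theory, the asymptotic behavior of the eigenvalues of large rectangular random matrices is described by the Marchenko-Pastur distribution. Let $X$ denote an $M \times N$ random matrix whose entries are independent, identically distributed random variables with mean 0 and finite variance $\sigma^2$. Let $Y_N = N^{-1} X X^{T}$ and let $\lambda_1, \lambda_2, \dots, \lambda_M$ be the eigenvalues of $Y_N$. Consider the random spectral measure

$$\mu_M(A) = \frac{1}{M} \sum_{j=1}^{M} \delta_{\lambda_j \in A}, \quad A \subset \mathbb{R}.$$

Assume that $M, N \to \infty$ so that the ratio $M/N \to \lambda \in (0, +\infty)$. Then $\mu_M \xrightarrow{d} \mu$, where

$$\mu(A) = \begin{cases} \left(1 - \frac{1}{\lambda}\right) \mathbf{1}_{0 \in A} + \nu(A), & \text{if } \lambda > 1, \\ \nu(A), & \text{if } 0 \le \lambda \le 1, \end{cases}$$

and

$$d\nu(x) = \frac{1}{2\pi\sigma^2} \frac{\sqrt{(\lambda_{\max} - x)(x - \lambda_{\min})}}{\lambda x} \mathbf{1}_{[\lambda_{\min}, \lambda_{\max}]}(x)\, dx, \qquad \lambda_{\max,\min} = \left(1 \pm \sqrt{\lambda}\right)^2 \sigma^2.$$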

We can now apply Theorem 1 within a portfolio context. Let $R_t \sim \mathcal{N}(m, 1_N)$ denote the independently and normally distributed asset returns¹ and $C$ the empirical correlation matrix.

We denote by $\rho_C(\lambda)$ the density of the eigenvalues of the correlation matrix $C$, defined as

$$\rho_C(\lambda) = \frac{1}{N} \frac{dn(\lambda)}{d\lambda},$$

where $n(\lambda)$ corresponds to the number of eigenvalues of the correlation matrix $C$ that are less than $\lambda$.

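In Edelman [28], it is shown that, as $N \to \infty$ and $T \to \infty$ with $Q = T/N \ge 1$, the density $\rho_C(\lambda)$ is exactly known:

$$\rho_C(\lambda) = \frac{Q}{2\pi\sigma^2} \frac{\sqrt{(\lambda_{\max} - \lambda)(\lambda - \lambda_{\min})}}{\lambda}, \tag{1.1}$$

$$\lambda_{\max,\min} = \left(1 \pm \sqrt{1/Q}\right)^2 \sigma^2, \tag{1.2}$$

with $\lambda \in [\lambda_{\min}, \lambda_{\max}]$. As asset returns have been scaled, the variance $\sigma^2$ is equal to 1.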

In the limit $Q = 1$, Laloux et al. [49] show that the distribution of the normalized eigenvalues is given by (1.1), and that important features can be extracted in the limit $N \to \infty$:

• the lower boundary of the spectrum is strictly positive, except for $Q = 1$; no eigenvalue lies between 0 and $\lambda_{\min}$. In the neighborhood of this boundary, the density of eigenvalues exhibits a sharp maximum, except in the limit $Q = 1$, corresponding to $\lambda_{\min} = 0$, where it diverges as $\sim 1/\sqrt{\lambda}$.

• the density of eigenvalues vanishes above $\lambda_{\max}$.

Let us consider a simple example of $T = 1000$ random asset returns ($N = 200$ assets), with constant variance $\sigma^2 = 1$.

When $N$ is finite, the features displayed in the neighborhood of the boundaries are not sharp. There is still a small probability of finding eigenvalues below $\lambda_{\min}$ and above $\lambda_{\max}$. This probability vanishes as $N$ becomes very large.
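The random example can be reproduced along the following lines. This is a minimal sketch under the assumption of independent standard normal returns; the helper `mp_density` and all variable names are ours and simply evaluate (1.1) and (1.2).

```r
set.seed(42)
N <- 200       # number of assets
T_obs <- 1000  # number of observations
Q <- T_obs / N
sigma2 <- 1

# Independent standard normal returns, one row per asset (assumption)
R <- matrix(rnorm(N * T_obs), nrow = N, ncol = T_obs)
C <- cor(t(R))  # empirical correlation matrix of the N assets
lambda <- eigen(C, symmetric = TRUE, only.values = TRUE)$values

# Theoretical bounds (1.2)
lambda_min <- sigma2 * (1 - sqrt(1 / Q))^2
lambda_max <- sigma2 * (1 + sqrt(1 / Q))^2

# Marchenko-Pastur density (1.1), set to zero outside [lambda_min, lambda_max]
mp_density <- function(x, Q, sigma2 = 1) {
  lo <- sigma2 * (1 - sqrt(1 / Q))^2
  hi <- sigma2 * (1 + sqrt(1 / Q))^2
  dens <- numeric(length(x))
  inside <- x > lo & x < hi
  dens[inside] <- Q / (2 * pi * sigma2) *
    sqrt((hi - x[inside]) * (x[inside] - lo)) / x[inside]
  dens
}

# Share of empirical eigenvalues falling inside the theoretical support
share_inside <- mean(lambda >= lambda_min & lambda <= lambda_max)
```

A histogram of `lambda` with `mp_density` superimposed gives a plot of the same type as Figure 1.1. For finite $N$, a few eigenvalues may fall slightly outside the bounds, so `share_inside` is close to, but not necessarily equal to, one.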

We consider a sample of 75 representative stocks ($N$) in the S&P500 index for which we have 200 observations ($T$). We thus have $Q = T/N \approx 2.67$. We display the spectrum of eigenvalues and superimpose the Marchenko-Pastur density, with $Q \approx 2.67$ and $\sigma^2 = 1$.

We can observe that the highest eigenvalues lie far above the upper bound $\lambda_{\max}$ and that the overall fit is not satisfactory.

¹ Asset returns have been scaled to have σ = 1.

Figure 1.1 – Eigenvalues distribution of 200 random assets (horizontal axis: eigenvalues; vertical axis: density)

Figure 1.2 – Eigenvalues spectrum of 75 stocks in the S&P500 (horizontal axis: eigenvalues; vertical axis: density)

When we look at the eigenvector associated with the largest eigenvalue, we notice, as expected, that its components are roughly equal across all stocks, which indicates that the first component is a proxy for the market itself. We can reject the hypothesis of “pure noise” for the first principal component. Another conjecture would be to assume that the other principal components, which are de facto orthogonal to the market proxy, are pure noise.

To this end, we can subtract the contribution of $\lambda_{\max}$ from the nominal value $\sigma^2 = 1$, leading to $\sigma^2 = 1 - \lambda_{\max}/N = 0.94$. Figure 1.3 displays the empirical distribution with this better fit for $\sigma^2$ (cyan line).

We see that some eigenvalues are still above $\lambda_{\max}$; they can thus be considered as information and reduce the random part of the correlation matrix. $\sigma^2$ can be considered as a parameter that we can adjust to optimize the fit. The best fit is obtained for $\sigma^2 = 0.44$ and corresponds to the red line in Figure 1.3. It accounts for roughly 95% of the spectrum, while the remaining highest eigenvalues are still well above the upper threshold $\lambda_{\max}$.

Figure 1.3 – S&P500: Marchenko-Pastur density (best fit) (horizontal axis: eigenvalues; vertical axis: density)

If we now randomize our asset returns by shuffling the returns of each asset, we obtain a spectrum of eigenvalues that in the limit follows a Marchenko-Pastur density, as shown in Example 1.3.3.

We consider the same stocks but shuffle the time series of each asset and compute the corresponding eigenvalues. We repeat this procedure 1,000 times and average the results. We can observe that the random returns are very well explained by the theoretical density.
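The reshuffling experiment can be sketched as follows. Since the stock data are not reproduced here, the sketch uses stand-in returns driven by one common factor; this data-generating process, the number of repetitions and the helper names are assumptions made for illustration only.

```r
set.seed(7)
N <- 75
T_obs <- 200

# Stand-in data (assumption): a common "market" factor plus idiosyncratic noise
market <- rnorm(T_obs)
returns <- sapply(1:N, function(i) 0.5 * market + rnorm(T_obs))  # T x N matrix

largest_eigenvalue <- function(x) {
  eigen(cor(x), symmetric = TRUE, only.values = TRUE)$values[1]
}

# Permuting each column independently keeps the marginal distribution of
# every asset but destroys the cross-sectional correlation
shuffle_columns <- function(x) apply(x, 2, sample)

lambda_max_mp <- (1 + sqrt(N / T_obs))^2  # upper bound (1.2) with sigma^2 = 1
top_original <- largest_eigenvalue(returns)
top_shuffled <- mean(replicate(100, largest_eigenvalue(shuffle_columns(returns))))
```

In this sketch the largest eigenvalue of the original data lies far above `lambda_max_mp`, whereas after shuffling it falls back to the neighborhood of the theoretical upper bound.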

Figure 1.4 – S&P500: Reshuffled assets (horizontal axis: eigenvalues; vertical axis: density)

1.4 Minimum-Torsion

Let us assume that the price evolution $(S(t) = (S_1(t), \dots, S_N(t)))_{t \in \mathbb{N}}$ of $N \in \mathbb{N}$ given financial assets follows an adapted stochastic process taking values in $\mathbb{R}^N$, realized on a filtered probability space $(\Omega, \mathcal{F}, P, (\mathcal{F}_t)_{t \in \mathbb{N}})$. If we denote by $\pi(t) = (\pi_1(t), \dots, \pi_N(t))$ the vector of fractions of the wealth invested in the assets $i = 1, \dots, N$ at time $t = 0, 1, 2, \dots$, then, following the self-financed strategy determined by $\pi = (\pi(t))_{t \in \mathbb{N}}$, the wealth $(S^{\pi}(t))_{t \in \mathbb{N}}$ evolves as

$$S^{\pi}(t+1) = S^{\pi}(t) \left(1 + \sum_{i=1}^{N} \pi_i(t) R_i(t+1)\right), \quad t = 0, 1, 2, \dots,$$

with the so-called returns

$$R_i(t+1) = (S_i(t+1) - S_i(t)) / S_i(t), \quad t \in \mathbb{N},\ i = 1, \dots, N,$$

of the assets $i = 1, \dots, N$. For instance, the so-called equally-weighted portfolio suggests holding the same fraction of the wealth in each asset at any time:

$$\pi_i(t) = \frac{1}{N}, \quad t \in \mathbb{N},\ i = 1, \dots, N.$$
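As a small worked example of the wealth recursion under the equally-weighted strategy, consider the following R sketch. The price paths and the initial wealth of 100 are hypothetical numbers chosen for illustration.

```r
# Hypothetical price paths for N = 3 assets over 4 dates (one row per date)
prices <- rbind(c(100, 50, 20),
                c(102, 49, 21),
                c(101, 51, 21),
                c(104, 52, 20))

# Simple returns R_i(t+1) = (S_i(t+1) - S_i(t)) / S_i(t)
returns <- diff(prices) / prices[-nrow(prices), ]

N <- ncol(prices)
weights <- rep(1 / N, N)  # equally-weighted fractions pi_i(t) = 1/N

# Wealth recursion S^pi(t+1) = S^pi(t) * (1 + sum_i pi_i(t) * R_i(t+1))
portfolio_returns <- as.vector(returns %*% weights)
wealth <- 100 * cumprod(c(1, 1 + portfolio_returns))
```

Because the fractions are reset to $1/N$ at every date, the strategy is rebalanced each period, which distinguishes it from a buy-and-hold portfolio started with equal weights.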

The traditional mean-variance considerations on portfolio optimization assume that the mean vector and covariance matrix of the returns $(R(t) = (R_i(t))_{i=1}^{N})_{t \in \mathbb{N}}$ do not change with time $t \in \mathbb{N}$. Let us agree that $R := R(1)$ represents the distribution of the returns. The main ingredients for the calculation of the optimal portfolio in the spirit of Markowitz are the return covariances

$$\Sigma = \mathrm{Cov}(R), \quad \sigma^2 = \mathrm{Var}(R) = \mathrm{diag}^{-1}(\Sigma), \tag{1.3}$$

whose reliable estimation has attracted persistent attention in the literature. Given the covariance matrix $\Sigma$, the so-called principal component analysis (PCA) is based on the diagonalization $D = T \Sigma T^{\top}$, with the entries of the diagonal matrix $D$ given by the eigenvalues of $\Sigma$, whose orthonormal eigenvectors are the rows of the orthogonal matrix $T$. The principal components are given by $(T R(t))_{t \in \mathbb{N}}$, which can be considered as an approximation of the returns of the synthetic assets $(T S(t))_{t \in \mathbb{N}}$. Such a linear transformation of the original price process $(S(t))_{t \in \mathbb{N}}$ to $(T S(t))_{t \in \mathbb{N}}$, whose components have uncorrelated returns, can be utilized in portfolio optimization. However, the process $(T S(t))_{t \in \mathbb{N}}$ may appear artificial. That is, to reach return uncorrelation, linear transformations other than $T$ may be of interest. We thus address the following question:

determine a linear transformation $T^{*} : \mathbb{R}^N \to \mathbb{R}^N$ such that the components of $T^{*} R$ are uncorrelated and are close to those of $R$. (1.4)

In what follows, we present an approach to this problem in terms of an algorithm which yields a matrix $T^{*}$ solving (1.4) in a certain sense.


Given the returns' covariance matrix $\Sigma$ and the vector $\sigma^2$ of variances as in (1.3), the solution to (1.4) is proposed in terms of minimizing the function

$$f(T) = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \frac{\mathrm{Var}((TR)_i - R_i)}{\mathrm{Var}(R_i)}}, \tag{1.5}$$

which is defined for each $N \times N$ matrix $T$, and will be minimized subject to the uncorrelation condition

$$\mathrm{Cov}(TR) \text{ is a diagonal matrix.}$$

Having introduced the random variable $Z = (Z_1, \dots, Z_N)$,

$$Z_i = \frac{R_i}{\mathrm{Var}(R_i)^{1/2}}, \quad i = 1, \dots, N,$$

representing the distribution of the normalized returns, the function (1.5) satisfies

$$f^2(\mathrm{diag}(\sigma)\, V\, \mathrm{diag}(\sigma)^{-1}) =: g(V) := \mathrm{tr}(\mathrm{Cov}(VZ - Z)),$$

where the symbol tr denotes the normalized trace. Let us interpret the problem (1.4) as that of determining

$$V^{*} = \arg\min_{V} g(V) \quad \text{subject to } \mathrm{Cov}(VZ) \in \mathcal{D},$$

where $\mathcal{D}$ denotes all $N \times N$ diagonal matrices, followed by the transformation

$$T^{*} = \mathrm{diag}(\sigma)\, V^{*}\, \mathrm{diag}(\sigma)^{-1}. \tag{1.6}$$

Based on Meucci et al. [63], we present an algorithm which recursively generates a sequence $(V^{(k)})_{k \in \mathbb{N}}$ of matrices such that

$(g(V^{(k)}))_{k \in \mathbb{N}}$ is decreasing, and $\mathrm{Cov}(V^{(k)} Z) \in \mathcal{D}$ for all $k \in \mathbb{N}$.

Interrupting this sequence at an appropriate step $k^{*} \in \mathbb{N}$, we obtain a matrix $V^{*} := V^{(k^{*})}$ which provides a solution to the problem (1.4).

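Let us rewrite the target function as

$$\begin{aligned} g(V) &= \mathrm{tr}(\mathrm{Cov}(VZ - Z)) \\ &= \mathrm{tr}(V C^2 V^{\top} - V C^2 - C^2 V^{\top} + C^2) \\ &= \mathrm{tr}(V C^2 V^{\top} - 2 V C^2) + \mathrm{tr}(C^2), \end{aligned}$$

where $C = \mathrm{Cov}(Z)^{1/2}$ represents the root of the correlation matrix of the returns $R$. Since $\mathrm{tr}(C^2)$ does not depend on $V$, we aim at the

minimization of $V \mapsto \mathrm{tr}(V C^2 V^{\top} - 2 V C^2)$ subject to $\mathrm{Cov}(VZ) = V C^2 V^{\top} \in \mathcal{D}$.

Using the change of variables $\Pi = VC$, we equivalently address the

minimization of $\Pi \mapsto \mathrm{tr}(\Pi \Pi^{\top} - 2 C \Pi)$ subject to $\Pi \Pi^{\top} \in \mathcal{D}$.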

To satisfy the restriction $\Pi \Pi^{\top} \in \mathcal{D}$, the matrix is represented using the polar decomposition

$$\Pi = DU,$$

meaning that

$$U \text{ is orthogonal, } UU^{\top} = 1, \tag{1.7}$$

and

$$D \in \mathcal{D} \text{ is positive definite and diagonal.} \tag{1.8}$$

With this decomposition, we address a separate minimization in $U$:

given $D$ as in (1.8), determine a minimizer of $U \mapsto \mathrm{tr}(D^2 - 2CDU)$ subject to (1.7),

which is solved by the minimizer

$$U = (D C^2 D)^{-1/2} D C,$$

and a separate minimization in $D$:

given $U$ as in (1.7), determine a minimizer of $D \mapsto \mathrm{tr}(D^2 - 2CDU)$ subject to (1.8),

which is solved by the minimizer

$$D = \mathrm{diag}(\mathrm{diag}^{-1}(UC)^{+}).$$
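The minimizer in $D$ can be verified entry by entry. The short derivation below is our own sketch: it writes $d_i$ for the diagonal entries of $D$, uses the cyclic property of the trace, and relaxes positive definiteness to $d_i \ge 0$ (an assumption on our side, consistent with the use of the positive part).

```latex
\begin{aligned}
\mathrm{tr}(D^2 - 2CDU)
  &= \mathrm{tr}(D^2) - 2\,\mathrm{tr}(D\,UC)
   = \frac{1}{N}\sum_{i=1}^{N}\left(d_i^2 - 2\,d_i\,(UC)_{ii}\right),\\
\frac{\partial}{\partial d_i}\left(d_i^2 - 2\,d_i\,(UC)_{ii}\right) = 0
  &\;\Longrightarrow\; d_i = (UC)_{ii},
  \qquad\text{hence } d_i = \max\{(UC)_{ii},\,0\}
  \text{ under the constraint } d_i \ge 0 .
\end{aligned}
```

The objective separates across the diagonal entries, which is why the update only requires the diagonal of $UC$.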

The successive alternation of both minimizations yields the algorithm in Table 1.1, whose details are given in Meucci et al. [63].

Given the square root $C$ of the return correlation matrix:

0. Initialization: $D^{(0)} \leftarrow 1$, $k \leftarrow 0$
1. Root and rotation: $U^{(k)} \leftarrow (D^{(k)} C^2 D^{(k)})^{-1/2} D^{(k)} C$
2. Stretching: $D^{(k+1)} \leftarrow \mathrm{diag}(\mathrm{diag}^{-1}(U^{(k)} C)^{+})$
3. Perturbation: $\Pi^{(k)} \leftarrow D^{(k+1)} U^{(k)}$
4. Interruption? If yes, return the result $V^{*} \leftarrow \Pi^{(k)} C^{-1}$
5. Continuation: set $k \leftarrow k + 1$ and go to 1.

Table 1.1 – Minimum-Torsion algorithm

Given the matrix $V^{*}$ returned by the algorithm in Table 1.1, the so-called minimum-torsion matrix $T^{*}$ is calculated from (1.6) and is considered as a solution to the problem (1.4). With the matrix $T^{*}$, the optimization of the portfolio can be addressed.

1.4.1 Corrected-Benchmark Portfolio

In the spirit of risk-parity strategies, which take correlations among assets into account, we use the torsion matrix in order to correct the equally-weighted portfolio (naive diversification). We calculate the corrected weights by multiplying the portfolio weights $(w_i = \frac{1}{N})_{i=1}^{N}$ by the torsion matrix $T^{*}$ computed in (1.6). This transformation does not necessarily result in a fully invested portfolio. The difference can either be invested in cash, or the weights can be scaled to add up to one. This methodology results in a portfolio where highly correlated assets are under-represented. We characterize the corrected-benchmark portfolio in terms of wealth fractions $\pi(t) = (\pi_i(t))_{i=1}^{N}$, invested in each asset $i = 1, \dots, N$ at time $t \in \mathbb{N}$, as

$$\pi^{\top}(t) = \frac{w^{*\top} T^{*}}{w^{*\top} T^{*} \vec{1}}, \quad t \in \mathbb{N}. \tag{1.9}$$

In this formula, $w^{*\top} T^{*}$ stands for the wealth fractions invested in risky assets given the decorrelation from the minimum-torsion matrix $T^{*}$. According to this approach, only a fraction $w^{*\top} T^{*} \vec{1} \in\, ]0,1[$ of the wealth would be invested (here $\vec{1}$ stands for the vector whose entries are equal to one). In order to achieve a full investment of all available funds, we scale this portfolio appropriately, obtaining (1.9). A similar strategy can be obtained if short positions are not feasible:

$$\pi^{\top}(t) = \frac{(w^{*\top} T^{*})^{+}}{(w^{*\top} T^{*})^{+} \vec{1}}, \quad t \in \mathbb{N}. \tag{1.10}$$

Here we use $(\cdot)^{+}$ to denote a component-wise application of the positive-part function. We examine the behavior of both portfolio strategies (1.9) and (1.10) in an empirical study in Section 1.5.
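Both weighting rules can be sketched in a few lines of R. The matrix `T_star` below is a hypothetical torsion matrix with illustrative entries, not one estimated from data, and the starting weights are assumed to be the equally-weighted ones.

```r
# Hypothetical 3 x 3 torsion matrix (illustrative numbers only)
T_star <- rbind(c( 1.10, -0.60, -0.05),
                c(-0.20,  1.00, -0.15),
                c(-0.10, -0.55,  1.15))
N <- nrow(T_star)
w <- rep(1 / N, N)                      # equally-weighted starting point

raw <- as.vector(t(w) %*% T_star)       # w' T*, need not sum to one
pi_unconstrained <- raw / sum(raw)      # rescaling as in (1.9)

raw_pos <- pmax(raw, 0)                 # component-wise positive part
pi_long_only <- raw_pos / sum(raw_pos)  # rescaling as in (1.10)
```

In this example the second asset receives a negative raw weight, so it is shorted under (1.9) and excluded under (1.10); both weight vectors sum to one after rescaling.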


1.5 Case Study

Due to data availability, we consider a representative sample of 375 stocks ($N$) in the S&P500 index for which we have 1,095 observations ($T$), covering the period from 2012/01 to 2016/05.

A common study design is to split the sample into a training and an independent testing set,

where the former is used to develop the model and the latter to evaluate its performance.

Accordingly, we start in a first step by analyzing the empirical histogram of the eigenvalues

of the considered stocks over the first half of the period (2012/01-2013/12) and superimpose

the theoretical Marchenko-Pastur density provided by the random matrix theory framework

detailed in Section 1.3.

Figure 1.5 – Eigenvalues spectrum of the S&P500 stocks

Based on this analysis (Figure 1.5), we retain the eigenvalues above $\lambda_{\max} = 2.262$, which are assumed to contain “information”, and shrink the remaining ones, which correspond to “noise”. For the shrinkage procedure we follow Laloux et al. [49] and replace the noisy eigenvalues with their average value, such that the trace of the covariance matrix remains unchanged. This results in a “denoised” covariance matrix that will be used to compute the correction matrix, given by the minimum-torsion methodology explained in Section 1.4.
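The shrinkage step can be sketched as follows. This is a minimal illustration, not the code used for the case study: the function name `clip_eigenvalues`, the simulated one-factor data and the choice of the threshold from (1.2) with $\sigma^2 = 1$ are assumptions.

```r
# Replace all eigenvalues at or below lambda_max by their average, which
# leaves the trace of the matrix unchanged
clip_eigenvalues <- function(C, lambda_max) {
  eig <- eigen(C, symmetric = TRUE)
  values <- eig$values
  noisy <- values <= lambda_max
  if (any(noisy)) values[noisy] <- mean(values[noisy])
  eig$vectors %*% diag(values, nrow = length(values)) %*% t(eig$vectors)
}

# Stand-in data (assumption): one common factor plus idiosyncratic noise
set.seed(3)
N <- 20
T_obs <- 100
market <- rnorm(T_obs)
X <- sapply(1:N, function(i) 0.6 * market + rnorm(T_obs))
C <- cor(X)

lambda_max <- (1 + sqrt(N / T_obs))^2  # upper bound (1.2) with sigma^2 = 1
C_clean <- clip_eigenvalues(C, lambda_max)
```

The eigenvectors are left untouched, so the eigenvalues above the threshold, here the one associated with the common factor, are preserved exactly.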

The following R code details the minimum-torsion algorithm:

# Compute the Minimum-Torsion Matrix
minimum_torsion <- function(cov.matrix) {
  # Returns a matrix T such that sum(Var(TR - R) / Var(R)) is minimal,
  # where R is a random vector with cov(R) = cov.matrix,
  # subject to the entries of TR being uncorrelated
  sigmas <- diag(cov.matrix)^(0.5)
  # Correlation matrix
  C2 <- diag(1 / sigmas) %*% cov.matrix %*% diag(1 / sigmas)
  E <- eigen(C2, symmetric = TRUE)
  # Square root of C2
  C <- E$vectors %*% diag(E$values^(0.5)) %*% t(E$vectors)
  # Inverse of the square root of C2
  Cinv <- E$vectors %*% diag(E$values^(-0.5)) %*% t(E$vectors)
  # Requirements for break conditions
  PIold <- C
  # Initialization
  D <- diag(x = 1, nrow = nrow(C), ncol = ncol(C))
  repeat {
    DC <- D %*% C
    E <- eigen(DC %*% t(DC), symmetric = TRUE)
    # Root and rotation: U = (D C^2 D)^(-1/2) D C
    U <- E$vectors %*% diag(E$values^(-0.5)) %*% t(E$vectors) %*% DC
    # Stretching: positive part of the diagonal of U C
    diagonal <- pmax(0, diag(U %*% C))
    D <- diag(x = diagonal)
    # Perturbation
    PI <- D %*% U
    tolerance <- max(abs(PI - PIold))
    PIold <- PI
    V <- PI %*% Cinv
    # Convergence check: must be decreasing
    print(sum(diag(V %*% C2 %*% t(V) - V %*% C2 - C2 %*% t(V) + C2)))
    if (tolerance < 0.00001) break
  }
  result <- diag(sigmas) %*% V %*% diag(1 / sigmas)
  return(result)
}

In a second step, we use the torsion matrix calibrated on the training sample and apply the corrected-benchmark strategy defined in Section 1.4.1 to the test sample covering the period 2014/01-2016/05. Figure 1.6 compares the cumulative performance of the naive strategy (equally-weighted) to the corrected-benchmark approach. We also consider an alternative portfolio, where we impose a long-only constraint in order to deliver a fair comparison to the equally-weighted strategy.

We can observe similar dynamics across the three strategies. The unconstrained corrected-benchmark portfolio, where short positions are allowed, displays a better performance together with higher volatility. The constrained portfolio also slightly outperforms the naive diversification approach with comparable volatility.

In order to validate the multi-period approach, where the portfolio is rebalanced dynamically over time, we compare in Figure 1.7 the three strategies with their buy-and-hold equivalents.

Figure 1.6 – Cumulative performance (Performance Comparison, 2013-12-31 to 2016-05-12; series: Naive, Corrected Naive, Corrected Naive Long-Only)

Figure 1.7 – Outperformance vs. buy-and-hold portfolio (Outperformance: Strategy vs. Buy-and-Hold, 2013-12-31 to 2016-05-12; series: Naive, Corrected Naive, Corrected Naive Long-Only)

An astonishing pattern emerges from this figure. Both the dynamic naive diversification approach and the long-only corrected-benchmark strategy underperform their buy-and-hold equivalents, whereas the unconstrained corrected-benchmark strategy moves from a cumulative outperformance of more than 6% in mid-June 2015 to an underperformance of 2% at the end of the testing period. We also note that all strategies suffered a drawdown in the second half of 2015, with a dramatic collapse in the performance of the unconstrained strategy.


1.6 Conclusion

Although our alternative diversification strategies, relying on a (constrained) corrected-benchmark approach, do not deliver outperformance after costs, we observed interesting features that deserve further study. This motivates us to pursue our research in the direction of multi-period optimization, taking various scenarios into account in order to stabilize the portfolio dynamics and deliver a consistent performance over time.

Several other techniques for “denoising” and estimating the covariance matrix remain to be tested and could be developed further. These are deferred to future work.

In the next chapter, we build on the results presented in this chapter to devise new investment strategies. In particular, we focus on retrieving uncorrelated risk drivers (factors) and investigate various strategies based on the diversification of the idiosyncratic risk left unexplained by the factors.


2 Statistical Risk Budgeting

In this chapter, we start with a short introduction to factor investing and the selection of risk

factors in Sections 2.1 and 2.2. In Section 2.3, we focus on statistical factors and propose a

novel dynamic approach, focusing on statistical analysis of the data and on risk budgeting

techniques. We extend the work of Meucci et al. [63] and propose a shrunk version of the

minimum-torsion matrix, using the effective rank approach of Roy and Vetterli [84] to extract

the number of risk factors driving asset returns.

Our pure statistical approach enables a risk decomposition of a given portfolio into a system-

atic and a specific component. We detail in Section 2.4 the methodology used to decompose

total risk and to assess the level of diversification in our statistical framework.

We devise various dynamic investment strategies in Section 2.5, especially an innovative

implementation of a risk budgeting technique, where the budget of a given asset is inversely

proportional to its idiosyncratic risk, left unexplained by the statistical factors. We illustrate

our approach through an empirical application in Section 2.6 and conclude in Section 2.7.

2.1 Background

The last financial crisis and, more recently, the dramatic events surrounding Greece have triggered a redesign of portfolio strategies among practitioners. Uncertainty about future asset returns in a portfolio optimization framework has led the financial industry to look for new solutions to propose to its clients.

Notably, the widespread 60-40 equity/fixed-income allocation has outlived its usefulness. After the financial crisis this allocation scheme fulfilled its task: bonds provided welcome downside protection when stock markets tumbled. However, in the current environment, where interest rates are hovering just above their all-time lows and stock markets are relatively expensive¹, this protection is no longer effective. Risk parity is meanwhile a well-established concept, which does not rely on return expectations and focuses on risk diversification within a portfolio (see Roncalli and Weisang [83]).

¹ The Shiller adjusted price-to-earnings ratio stands at nearly two standard deviations above the long-term average.

Alternatively, Smart Beta investment strategies have been proposed, which allow investors to diversify along identified risk drivers (factors) that are assumed to deliver a risk premium in a rule-based and transparent way. The increased popularity of these strategies is linked to a desire for portfolio risk management and diversification, as well as to the search for risk-adjusted returns above those of cap-weighted indices (see Amenc et al. [1], Amenc et al. [2]).

2.2 Factor Investing

Factor models have a long history in finance and have experienced a surge in popularity through the emergence of Smart Beta products. This growing acceptance of factor-based investment strategies is mainly due to their ease of implementation in terms of infrastructure and costs.

Factor-based investing is not a new topic and numerous academic studies have been conducted in this area. Asset pricing theory (APT) postulates that efficient diversification is decisive to eliminate unrewarded risks. By definition, these risks are unattractive to risk-averse investors, who are therefore only willing to accept risk if a decent reward is associated with it. Accordingly, in the current volatile markets and low-rate environment, market participants strive to select risk factors with a proven ability to deliver positive risk premia over the long run and to reduce idiosyncratic risks.

2.2.1 Smart Beta

Smart Beta investing (Amenc et al. [1]) aims above all at circumventing the shortcomings of

cap-weighted indices, that are mainly concentrated in a few stocks, with large capitalization

and high growth, and display sub-optimal factor exposures.

These highly concentrated cap-weighted products, characterized by a large-cap and growth

bias, display a strong presence of idiosyncratic risk, while empirical studies have proved

that small-cap and value investing provide positive rewards. Amenc et al. [2] construct factor

indices exposed to rewarded risk factors only and suggest that the so-called smart factor indices

lead to better risk-adjusted performance than a broad cap-weighted index, after considering

transaction costs. Smart Beta strategies thus fill the gap between traditional (cap-weighted)

passive investment and traditional active management.

2.2.2 Risk Drivers

Fixing the number of factors to include is a problem that has no conclusive answer and mainly depends on managers’ views on which factors are expected to deliver the required risk premia. In practice, one often resorts to subjective and empirical decisions based on experience, and the selected factors may differ depending on the portfolio considered and the current economic environment.

Connor and Korajczyk [23] developed a test statistic that does not require a strict factor structure² to determine the number of factors. Their test is based on the observation that, if $L$ is the appropriate number of factors, then there should be no significant decrease in the cross-sectional mean-square of idiosyncratic returns in moving from $L$ to $L+1$ factors.

Whatever the method used to fix the number of factors, the relationship among risk factors is not static, as could be observed during the financial crisis. Indeed, traditional asset classes

tend to display a high correlation in downturn periods, thus decreasing potential diversifica-

tion possibilities. Applying a traditional risk parity or, alternatively, a Smart Beta approach on

the retained factors thus implies that the volatility and the correlations of the factors should

be closely monitored, to maintain the desired risk contribution of each asset in the portfolio.

Changing correlations as well as the shift in the factors retained in the investment decision

require a dynamic investment management process. We suggest a statistical approach to

identify the risk factors driving asset returns and use a dynamic risk budgeting framework to

manage a diversified portfolio.

2.3 Statistical Factors

There is a large body of literature covering the two most widely used types of factor models, namely macroeconomic and fundamental factors (see Fama and French [31], Fama and French [32], Chen et al. [21], Carhart [19]). A third type aims at identifying and estimating risk drivers using statistical techniques such as principal component analysis (PCA). Miller [66] shows that statistical factors work best with high-frequency data and are more useful when combined with fundamental factors. Statistical factor models do not generally specify the number of factors in advance and extract them directly from asset returns (see Connor [22] for an overview of these three types of factor models).

Principal component analysis allows portfolio returns to be expressed as a combination of uncorrelated factors. However, this method raises several issues, notably the instability of the components related to the lowest eigenvalues. Another issue is that the principal components are not unique and are sensitive to the units of measurement³. Moreover, the factors extracted by PCA are often difficult to identify and interpret and can lead to counter-intuitive results when used in portfolio allocation decisions (see Meucci et al. [63] for more details). This technique has thus been rejected by most practitioners.

² i.e. where specific risks have zero correlation across assets.

³ When PCA is applied to the covariance matrix of asset prices, one obtains different results in different currencies.


2.3.1 Minimum-Torsion Approach

Meucci et al. [63] propose a new approach relying on uncorrelated statistical factors extracted from the asset returns, which remain as close as possible to the original data set. This minimum-torsion concept makes it possible to clearly identify the contribution of each risk factor within a portfolio of assets and “generalizes the marginal contributions to risk used in traditional risk parity”.

This methodology circumvents the identification problem met with the standard PCA decomposition and relies on a tracking-error minimization between uncorrelated factors and original assets. As the factors are forced to remain as close as possible to the data, we do not face this interpretation issue, and the approach can thus be used more easily in a risk budgeting framework.

Definition 1. Torsion Matrix

If we consider $K$ assets in a portfolio, Meucci et al. [63] show that we can find a so-called torsion matrix that decorrelates the original assets into $K$ uncorrelated factors. The linear transformation used to retrieve this torsion matrix is the one that least disrupts the original factors (assets), denoted $R$. Meucci et al. [63] suggest selecting the torsion matrix⁴ that minimizes the tracking error between the uncorrelated factors and the data:

$$t_{\mathrm{MT}} \equiv \underset{\mathrm{Cor}(tR) = I_{K \times K}}{\arg\min}\ \mathrm{NTE}_{tR \,\|\, R}, \tag{2.1}$$

where $R$ denotes the asset returns, $t$ a valid torsion matrix and NTE the normalized tracking error defined as

$$\mathrm{NTE}_{F \,\|\, R} \equiv \sqrt{\frac{1}{K} \sum_{k=1}^{K} \mathrm{V}\!\left(\frac{F_k - R_k}{\mathrm{Sd}(R_k)}\right)}.$$

Meucci et al. [63] show that this normalization, unlike the PCA approach, is not sensitive to

factors expressed in different units.
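For a linear transformation $F = tR$, the normalized tracking error can be computed directly from the covariance matrix of the returns, since $\mathrm{Cov}(tR - R) = (t - I)\,\Sigma\,(t - I)^{T}$. The sketch below is our own illustration; the function name and the covariance matrix are hypothetical, and the PCA rotation only serves as an example of a decorrelating transformation.

```r
# Normalized tracking error between F = t R and R, given Cov(R) = Sigma
normalized_tracking_error <- function(torsion, Sigma) {
  A <- torsion - diag(nrow(Sigma))
  te_var <- diag(A %*% Sigma %*% t(A))  # Var(F_k - R_k) for each k
  sqrt(mean(te_var / diag(Sigma)))
}

# Hypothetical 3-asset covariance matrix (illustrative numbers only)
Sigma <- matrix(c(0.040, 0.012, 0.006,
                  0.012, 0.090, 0.018,
                  0.006, 0.018, 0.160), nrow = 3, byrow = TRUE)

# The identity leaves the assets unchanged (NTE = 0) but does not decorrelate
nte_identity <- normalized_tracking_error(diag(3), Sigma)

# The PCA rotation decorrelates the assets at the cost of a positive NTE
t_pca <- t(eigen(Sigma, symmetric = TRUE)$vectors)
nte_pca <- normalized_tracking_error(t_pca, Sigma)
```

The minimum-torsion matrix is, by construction, the decorrelating transformation for which this quantity is smallest.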

2.3.2 Uncorrelated Factors

In a manner similar to PCA, after having solved the minimum-torsion optimization (2.1) above, we can easily retrieve the uncorrelated factors $F_{\mathrm{MT}}$:

$$F_{\mathrm{MT}} = t_{\mathrm{MT}} R. \tag{2.2}$$

The factor exposures, denoted $w_{\mathrm{MT}}$, can be obtained by inverting the transposed torsion matrix $t_{\mathrm{MT}}$ and multiplying it by the asset weights $w$. Eventually, the return of a given portfolio $R_p$ with an allocation $w = (w_1, \dots, w_K)$ can either be expressed as a linear combination of the asset returns $R$ or, alternatively, with the help of the uncorrelated factors $F_{\mathrm{MT}}$:

$$w_{\mathrm{MT}} = (t_{\mathrm{MT}}^{T})^{-1} w, \qquad R_p = w^{T} R = w_{\mathrm{MT}}^{T} F_{\mathrm{MT}}.$$

⁴ We refer to Chapter 1, Section 1.4 for the full derivation of the torsion matrix.
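The equivalence of the two representations of $R_p$ can be checked numerically. Since $R = t_{\mathrm{MT}}^{-1} F_{\mathrm{MT}}$, the exposures that make both expressions agree solve $t_{\mathrm{MT}}^{T} w_{\mathrm{MT}} = w$. The torsion matrix, weights and returns below are hypothetical numbers used for illustration only.

```r
# Hypothetical invertible torsion matrix (illustrative numbers only)
t_MT <- rbind(c( 1.05, -0.20, -0.10),
              c(-0.15,  1.10, -0.25),
              c(-0.05, -0.30,  1.20))
w <- c(0.5, 0.3, 0.2)             # asset weights
R_obs <- c(0.010, -0.020, 0.015)  # one observation of the asset returns

F_MT <- as.vector(t_MT %*% R_obs)  # uncorrelated factors, F = t R
w_MT <- solve(t(t_MT), w)          # factor exposures solving t' w_MT = w

Rp_assets <- sum(w * R_obs)        # w' R
Rp_factors <- sum(w_MT * F_MT)     # w_MT' F
```

Both `Rp_assets` and `Rp_factors` give the same portfolio return, up to rounding error.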

This methodology makes it possible to fully characterize the $K$ asset returns with the help of $K$ uncorrelated factors. However, in the presence of a large number of assets, it could be preferable to consider a subset of these risk factors and thus to shrink the torsion matrix.

2.3.3 Effective Rank

To assess the appropriate number of risk drivers to retain, we resort to PCA and the concept of effective rank suggested by Roy and Vetterli [84], which extends the notion of the rank of a matrix. Let us consider a correlation matrix $C$ of size $K \times K$ and proceed to a principal component decomposition

$$C = E \Lambda E',$$

where $E$ is the $K \times K$ matrix of eigenvectors and $\Lambda$ is the $K \times K$ diagonal matrix of the eigenvalues

$$\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_K \ge 0.$$

We denote by $\lambda = (\lambda_1, \lambda_2, \dots, \lambda_K)^{T}$ the vector of the $K$ non-negative eigenvalues and define the eigenvalue distribution as

$$p_i = \frac{\lambda_i}{\|\lambda\|_1}, \quad i = 1, 2, \dots, K,$$

where $\|\cdot\|_1$ denotes the $L_1$-norm.

Definition 2. Effective rank

As stated by Roy and Vetterli [84], the effective rank of the matrix $C$ is denoted by $\mathrm{eRank}(C)$ and is defined as

$$\mathrm{eRank}(C) = \exp H(p_1, p_2, \dots, p_K),$$

where $H(p_1, p_2, \dots, p_K)$ is the spectral entropy given by

$$H(p_1, p_2, \dots, p_K) = -\sum_{i=1}^{K} p_i \log p_i. \tag{2.3}$$

The effective rank measure applied to C retrieves the effective number of risk drivers, which is

generally lower than the number of assets (K ). This can easily be explained by the fact that

some assets may display a relatively high correlation, which reduces the true dimension of the

correlation matrix.
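As a minimal sketch of Definition 2, the following Python snippet computes the effective rank from a list of eigenvalues. The function names `spectral_entropy` and `effective_rank` are illustrative and do not come from the thesis; the eigenvalues are assumed to have been obtained beforehand from the decomposition of C, and the convention 0 log 0 = 0 is an assumption added here.

```python
import math


def spectral_entropy(probabilities):
    """Entropy H(p) = -sum_i p_i log p_i, treating 0 * log(0) as 0 (assumption)."""
    return -sum(p * math.log(p) for p in probabilities if p > 0.0)


def effective_rank(eigenvalues):
    """eRank = exp(H(p)) with p_i = lambda_i / ||lambda||_1, as in Equation (2.3)."""
    l1_norm = sum(abs(lam) for lam in eigenvalues)
    if l1_norm <= 0.0:
        raise ValueError("at least one eigenvalue must be non-zero")
    distribution = [abs(lam) / l1_norm for lam in eigenvalues]
    return math.exp(spectral_entropy(distribution))


# Uncorrelated assets: all eigenvalues of C equal one, so eRank is close to K = 4.
print(effective_rank([1.0, 1.0, 1.0, 1.0]))

# One dominant eigenvalue: the effective rank falls well below 2.
print(effective_rank([3.7, 0.1, 0.1, 0.1]))
```

The second call illustrates the point made above: strongly correlated assets concentrate the spectrum and lower the effective number of risk drivers.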



Special Case: highly correlated markets

The CAPM-related literature shows that the market can be viewed as the first and predominant

equity factor (Lintner [53], Mossin [68], Sharpe [88] and Treynor [95]). Thus, a problem arises

when considering for instance the correlation matrix of a national stock market: in this

particular case the principal component analysis results merely in a single large eigenvalue,

corresponding to market-wide fluctuations, which makes the above measure ineffective (see

also Kim and Jeong [46] for more details).

Definition 3. Modified effective rank

Following Kakushadze and Yu [44], we can circumvent this problem by removing the first eigenvalue, related to the so-called market factor, and proceeding separately with the eRank calculation described above on the remaining eigenvalues. The modified eRank metric used subsequently is defined as

$$\mathrm{eRank}_2(C) = \exp\Big(-\sum_{i=2}^{K} p_i \log p_i\Big) + 1 = \exp\big(H(p_2, p_3, \dots, p_K)\big) + 1,$$

where 1 is related to the first factor that has been removed from the calculation. We therefore consider the $L$ major risk drivers, defined as

$$L = \lfloor \mathrm{eRank}_2(C) \rceil, \tag{2.4}$$

where $\lfloor \cdot \rceil$ denotes the nearest integer function.
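The modified metric can be sketched in a few lines of Python. The helper names are hypothetical, and the snippet assumes that the distribution p_2, ..., p_K is renormalized over the remaining eigenvalues once the market eigenvalue has been dropped; the definition above does not state this explicitly, so it should be read as one possible interpretation. Ties at .5 are rounded up, which is also an assumption.

```python
import math


def _exp_entropy(values):
    """exp(H(p)) for p obtained by normalizing the positive entries of `values`."""
    total = sum(values)
    probabilities = [v / total for v in values if v > 0.0]
    return math.exp(-sum(p * math.log(p) for p in probabilities))


def modified_effective_rank(eigenvalues):
    """eRank2: drop the largest (market) eigenvalue, compute eRank on the rest, add 1."""
    ordered = sorted(eigenvalues, reverse=True)
    remaining = ordered[1:]
    if not remaining or sum(remaining) <= 0.0:
        return 1.0  # only the market factor is left
    # Assumption: the remaining eigenvalues are renormalized to sum to one.
    return _exp_entropy(remaining) + 1.0


def number_of_risk_drivers(eigenvalues):
    """L in Equation (2.4): eRank2 rounded to the nearest integer (halves rounded up)."""
    return int(math.floor(modified_effective_rank(eigenvalues) + 0.5))


# A dominant market eigenvalue followed by three equally important ones.
print(modified_effective_rank([5.0, 1.0, 1.0, 1.0]), number_of_risk_drivers([5.0, 1.0, 1.0, 1.0]))

# A dominant market eigenvalue and one clear secondary driver.
print(number_of_risk_drivers([6.0, 1.9, 0.05, 0.05]))
```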

Shrunk Uncorrelated Factors

Using the notion of effective rank defined in Equation (2.4), we shrink the torsion matrix $t_{\mathrm{MT}}$ to retain only the $L$ most relevant risk drivers. We define the shrunk factors and the corresponding torsion matrix of dimension $K \times L$ as

$$F_{\mathrm{SMT}} \underset{1:L}{\equiv} F_{\mathrm{MT}}, \qquad t_{\mathrm{SMT}} \underset{1:L}{\equiv} t_{\mathrm{MT}}. \tag{2.5}$$

Our portfolio can now be decomposed into a systematic component, explained by the $L$ risk drivers, and a specific component denoted $\varepsilon_p$:

$$R_p = \underbrace{w^T R}_{\text{asset-based}} = \underbrace{w_{\mathrm{SMT}}^T F_{\mathrm{SMT}}}_{\text{systematic}} + \underbrace{\varepsilon_p}_{\text{specific}}.$$

2.3.4 Factor Risk Budgeting

An appealing feature of the uncorrelated factors obtained in Equation (2.2) is that we can

easily compute the risk budgeting portfolio analytically. This is unfortunately not the case


when the factors are not uniformly correlated with each other, except if the correlation reaches

a lower bound or in case of perfect correlation (for further details, see Roncalli [82]).

In the case of uniform correlation $\rho_{i,j} = \rho$, the risk parity portfolio with budgets $b_i = 1/K$, where $K$ corresponds to the number of factors, is given by:

$$w_i = \frac{\sigma_i^{-1}}{\sum_{j=1}^{K} \sigma_j^{-1}}, \quad i = 1, 2, \dots, K. \tag{2.6}$$

In the more general case, when the investor defines risk budgets $b_1, \dots, b_K$ for each factor, we get:

$$w_i = \frac{b_i \sigma_i^{-1}}{\sum_{j=1}^{K} b_j \sigma_j^{-1}}, \quad i = 1, 2, \dots, K. \tag{2.7}$$
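Equations (2.6) and (2.7) translate directly into code. The sketch below uses hypothetical function names and made-up volatilities; it evaluates the two formulas as stated and, for the equal-budget case with uncorrelated factors, checks that each factor contributes the same share of the portfolio variance.

```python
def risk_budget_weights(sigmas, budgets=None):
    """Weights w_i = b_i / sigma_i / sum_j (b_j / sigma_j), Equation (2.7).

    With budgets=None the equal budgets b_i = 1/K of Equation (2.6) are used.
    """
    k = len(sigmas)
    if budgets is None:
        budgets = [1.0 / k] * k
    raw = [b / s for b, s in zip(budgets, sigmas)]
    total = sum(raw)
    return [x / total for x in raw]


def risk_contributions_uncorrelated(weights, sigmas):
    """Variance shares w_i^2 sigma_i^2 / sum_j w_j^2 sigma_j^2 (uncorrelated factors only)."""
    variances = [(w * s) ** 2 for w, s in zip(weights, sigmas)]
    total = sum(variances)
    return [v / total for v in variances]


# Illustrative factor volatilities (not taken from the thesis data).
sigmas = [0.05, 0.10, 0.20]

parity = risk_budget_weights(sigmas)
print(parity)                                           # low-volatility factor gets the largest weight
print(risk_contributions_uncorrelated(parity, sigmas))  # equal shares in the risk parity case

# User-defined budgets plugged into Equation (2.7).
print(risk_budget_weights(sigmas, budgets=[0.5, 0.3, 0.2]))
```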

However, practitioners may set their risk budgets on the portfolio assets rather than on factors that change over time. We thus have to redirect these asset-based budgets to the underlying risk drivers. During our research for this project, we also devised a methodology⁵ that considers the loadings of the risk factors left aside in the shrinkage computation to reallocate the user-based risk budgets to the corresponding L risk factors.

2.4 Diversification

This section details the concept of diversification and presents a way of measuring the level of

diversification within a portfolio.

2.4.1 Idiosyncratic Risk

According to the Asset Pricing Theory, an investment strategy should strive to diversify away unrewarded risks, also called specific risk. Our statistical factor model, which identifies $L$ risk drivers with the help of the modified effective rank methodology (see Definition 3), enables the decomposition of asset returns into systematic and specific risk. The covariance matrix $\Gamma$ of our factor model can be written

$$\Gamma = \underbrace{E_L \Lambda_L E_L'}_{\text{systematic risk}} + \underbrace{\Upsilon^2}_{\text{specific risk}}, \tag{2.8}$$

where $E_L \Lambda_L E_L'$ corresponds to the systematic risk related to the $L$ factors identified and $\Upsilon^2$ is a diagonal matrix of dimension $K \times K$ that expresses the idiosyncratic risk of each asset. $E_L$ is a $K \times L$ matrix corresponding to the exposures to the $L$ factors, whereas $\Lambda_L$ is an $L \times L$ diagonal matrix corresponding to the variances of the $L$ uncorrelated factors.

⁵ As we only consider algorithmic risk budgeting strategies in this thesis, we do not give the details of this methodology, which has not been further investigated.

An investment strategy will achieve this diversification by setting risk budgets $b_i$ that are inversely proportional to the idiosyncratic risk $\upsilon_i^2$ of the assets considered:

$$b_i \propto \frac{1}{\upsilon_i^2}, \quad i = 1, 2, \dots, K.$$

2.4.2 Measuring Diversification

Following Meucci [61], we resort to the effective number of bets methodology to quantify the diversification of a given portfolio. We propose here to define the diversification level as a percentage, with 100% meaning perfect risk diversification. We showed in Section 2.3.3 that a portfolio can be decomposed into a systematic and a specific component:

$$R_p = w^T R = w_{\mathrm{SMT}}^T F_{\mathrm{SMT}} + \varepsilon_p.$$

Intuitively, a perfectly diversified portfolio can be fully explained by the systematic component and displays no specific risk. As the factors $F_{\mathrm{SMT}}$ are by definition uncorrelated, we can now calculate the contribution of each risk factor to total risk:

$$RC_i \equiv \frac{\mathrm{V}\{w_{\mathrm{SMT},i}^{T} F_{\mathrm{SMT},i}\}}{\mathrm{V}\{R_p\}}, \quad i = 1, \dots, K,$$

where $\mathrm{V}$ denotes the variance. The contributions $RC_i$ sum to one, are non-negative and can thus be considered as weightings. A risk parity portfolio is characterized by an equal contribution of each risk factor to total risk.

Definition 4. Diversification measure

Computing the spectral entropy defined in Equation (2.3) of the risk contributions $RC_i$, we obtain the effective number of bets defined as

$$B \equiv \exp\big(H(RC_1, RC_2, \dots, RC_K)\big).$$

The measure $B$ ranges from 1, when all the variability (risk) stems from a single risk driver, to $L$, when the total risk of a given portfolio is equally spread among the risk drivers. If we normalize the measure $B$ by the number of risk drivers $L$ retained and multiply by 100, we obtain a diversification measure expressed as a percentage:

$$D \equiv \frac{\exp\big(H(RC_1, RC_2, \dots, RC_K)\big)}{L} \times 100 = \frac{B}{L} \times 100. \tag{2.9}$$
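The two quantities of Definition 4 can be illustrated with a short Python sketch. The function names and the risk contribution vectors below are hypothetical; the contributions are assumed to be non-negative and to sum to one, as stated above.

```python
import math


def effective_number_of_bets(risk_contributions):
    """B = exp(H(RC_1, ..., RC_K)), with 0 * log(0) treated as 0 (assumption)."""
    entropy = -sum(rc * math.log(rc) for rc in risk_contributions if rc > 0.0)
    return math.exp(entropy)


def diversification_percent(risk_contributions, n_risk_drivers):
    """D = B / L * 100, Equation (2.9)."""
    return effective_number_of_bets(risk_contributions) / n_risk_drivers * 100.0


# Risk spread evenly over L = 4 drivers: D is close to 100%.
print(diversification_percent([0.25, 0.25, 0.25, 0.25], 4))

# Risk concentrated in one driver: D drops markedly.
print(diversification_percent([0.85, 0.05, 0.05, 0.05], 4))

# All risk in a single driver: B equals 1.
print(effective_number_of_bets([1.0, 0.0, 0.0, 0.0]))
```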


2.5 Investment Strategies

We apply the minimum-torsion approach in its standard version as well as the shrunk alter-

native, obtained through the modified eRank methodology used to identify the number of

risk drivers. Building on these uncorrelated sources of risk, we compare various investment

strategies that do not rely on expected returns or any external data. Only historical data of the

assets considered are required.

Naive Factor Diversification

Our first strategy is related to a standard naive diversification approach, also called equally-

weighted. Instead of equally-weighting the assets in the portfolio, we apply this technique to

the K uncorrelated factors. We retrieve the corresponding weights of the K assets by using the

computed torsion matrix tMT. These weights are defined as

$$w_i = t_{\mathrm{MT}} \frac{1}{K}, \quad i = 1, 2, \dots, K.$$

Factor Risk Parity

We also apply the traditional risk parity approach to the uncorrelated factors identified, where

the overall portfolio risk is equally spread among the K risk drivers, i.e. the risk contribution of

each factor is identical. Using Equation (2.6) and the torsion matrix tMT we retrieve the asset

weights:

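$$\sigma_i^{-1} = \frac{1}{\mathrm{Sd}\{F_{\mathrm{SMT},i}\}},$$

$$w_i = t_{\mathrm{MT}} \frac{\sigma_i^{-1}}{\sum_{j=1}^{K} \sigma_j^{-1}}, \quad i = 1, 2, \dots, K,$$

where $\mathrm{Sd}\{F_{\mathrm{SMT},i}\}$ denotes the standard deviation of the risk factor $i$.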

Factor Risk Budgeting – Naive

The rationale behind the naive diversification approach led us to consider an alternative,

where an equal budget is allocated to each factor, without considering any factor specificity,

such as volatility. This can be seen as an equally-weighted approach along the factors, that

actually translates into a risk budgeting strategy along the assets. Using the torsion matrix tMT,

we define the asset-based budgets as:

$$b_i = t_{\mathrm{MT}} \frac{1}{K}, \quad i = 1, 2, \dots, K.$$


In this particular case, the methodology defined in Equation (2.7) cannot be used to retrieve the weights of the K assets, as the budgets are now expressed along the assets and correlations must be considered. We therefore use the algorithm developed by Spinu [91] to compute the allocation weights of the risk budgeting portfolio.⁶

Factor Risk Budgeting – Proportional

We then consider two other risk budgeting strategies. In the first one, we set risk budgets that are inversely proportional to the volatility of the K factors. This framework is related to the early traditional risk parity strategies applied to assets (see Bhansali et al. [9]), where the correlations were not considered. In our setting the underlying assumptions are met, as the factors are by definition uncorrelated. Using Equation (2.7) and $t_{\mathrm{MT}}$, we retrieve the asset weights:

$$b_i = \frac{1}{\mathrm{Sd}\{F_{\mathrm{SMT},i}\}},$$

$$w_i = t_{\mathrm{MT}} \frac{b_i \sigma_i^{-1}}{\sum_{j=1}^{K} b_j \sigma_j^{-1}}, \quad i = 1, 2, \dots, K,$$

where $\mathrm{Sd}\{F_{\mathrm{SMT},i}\}$ denotes the standard deviation of the risk factor $i$.

Factor Risk Budgeting – Specific

In the second risk budgeting strategy, we set risk budgets that are inversely proportional to the specific risk of the assets ($\Upsilon^2$), i.e. the risk that remains unexplained by the $L$ factors in the decomposition of Equation (2.8). Using Equation (2.7) and $t_{\mathrm{SMT}}$, we retrieve the asset weights:

$$b_i = \frac{1}{\upsilon_i^2},$$

$$w_i = t_{\mathrm{SMT}} \frac{b_i \sigma_i^{-1}}{\sum_{j=1}^{K} b_j \sigma_j^{-1}}, \quad i = 1, 2, \dots, K,$$

where $\upsilon_i^2$ denotes the specific risk of asset $i$.

⁶ This algorithm is based on Newton’s method.


2.6 Application

In this section we present a concrete application of the risk-budgeting strategies developed in

this chapter.

2.6.1 Goal

In this application the main purpose is to compare the performance and tail risk metrics of various investment strategies. A secondary goal is to evaluate the level of diversification of each strategy when considering the assets and the risk drivers (factors) respectively. Analyzing diversification along factors, which are assumed to be the true risk drivers of asset returns, should emphasize the distorted picture provided by a pure asset-based analysis.

2.6.2 Data

Our dataset is provided by Complementa Investment-Controlling AG⁷ and is composed of

eight asset classes from May 1997 through December 2015: World and Emerging-Markets

Equities, Global Bonds, Inflation-Linked Bonds, High Yield and Emerging-Markets Bonds,

Hedge Funds and Commodities. All time series are hedged in CHF. Each portfolio is rebalanced

on a monthly basis, using the previous four years of data (48 months) to estimate second

moments and extract the risk factors. Trading costs of 20 basis points (bp) are considered for

the return calculations as well as a price impact cost of 1 bp. Our investment universe is meant

to be representative of a well-diversified Swiss endowment fund with exposure to different

risk premia.
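The text gives the cost levels (20 bp trading costs plus 1 bp price impact) but not how they are charged. The sketch below assumes, purely for illustration, that both are proportional to the sum of absolute weight changes at each monthly rebalancing; the function names and the example weights are hypothetical and do not reproduce the thesis implementation.

```python
TRADING_COST = 0.0020  # 20 basis points, as stated in the text
PRICE_IMPACT = 0.0001  # 1 basis point, as stated in the text


def drifted_weights(weights, asset_returns):
    """Weights after one period of market moves, before rebalancing."""
    grown = [w * (1.0 + r) for w, r in zip(weights, asset_returns)]
    total = sum(grown)
    return [g / total for g in grown]


def rebalancing_cost(current, target, cost_rate=TRADING_COST + PRICE_IMPACT):
    """Assumed cost model: cost rate times the sum of absolute weight changes."""
    turnover = sum(abs(t - c) for t, c in zip(target, current))
    return turnover * cost_rate


def net_period_return(current, target, asset_returns):
    """Gross return of the target portfolio minus the cost of trading into it."""
    gross = sum(w * r for w, r in zip(target, asset_returns))
    return gross - rebalancing_cost(current, target)


# Moving from a 50/50 to a 60/40 allocation trades 20% of the portfolio.
print(rebalancing_cost([0.5, 0.5], [0.6, 0.4]))
print(net_period_return([0.5, 0.5], [0.6, 0.4], [0.02, -0.01]))
```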

We start by reporting summary statistics (annualized return and volatility as well as correlations, Value-at-Risk, Expected Shortfall and Maximum Drawdown) over the whole sample period. Table 2.1 highlights the significant variations across asset classes in terms of return and risk. Emerging Markets Equities show a negative return (-1%) and the highest risk (25% volatility), whereas Emerging Markets Bonds have the highest return (6%). The highest drawdown is recorded by Emerging Markets Equities (-19%), while Global Bonds display the lowest one (-4%).
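The exact conventions behind these statistics are not spelled out in this section. As a hedged illustration, the sketch below computes them from a series of monthly returns under common textbook assumptions: geometric annualization, sample standard deviation scaled by the square root of 12, historical (empirical quantile) VaR and ES, and drawdowns measured on the compounded wealth path. All function names are hypothetical.

```python
import math
import statistics


def annualized_return(monthly):
    """Geometric annualization of monthly returns (assumed convention)."""
    growth = math.prod(1.0 + r for r in monthly)
    return growth ** (12.0 / len(monthly)) - 1.0


def annualized_volatility(monthly):
    """Sample standard deviation scaled by sqrt(12) (assumed convention)."""
    return statistics.stdev(monthly) * math.sqrt(12.0)


def historical_var(monthly, level=0.95):
    """Empirical (1 - level) quantile of the monthly returns."""
    ordered = sorted(monthly)
    index = int(math.floor((1.0 - level) * len(ordered)))
    return ordered[index]


def expected_shortfall(monthly, level=0.95):
    """Average of the returns at or below the historical VaR."""
    threshold = historical_var(monthly, level)
    tail = [r for r in monthly if r <= threshold]
    return sum(tail) / len(tail)


def max_drawdown(monthly):
    """Largest peak-to-trough loss of the compounded wealth path (negative number)."""
    wealth, peak, worst = 1.0, 1.0, 0.0
    for r in monthly:
        wealth *= 1.0 + r
        peak = max(peak, wealth)
        worst = min(worst, wealth / peak - 1.0)
    return worst


# Toy series of 20 monthly returns from -10% to +9% (not thesis data).
sample = [i / 100.0 for i in range(-10, 10)]
print(annualized_return(sample), annualized_volatility(sample))
print(historical_var(sample), expected_shortfall(sample), max_drawdown(sample))
```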

The correlation matrix across asset classes in Figure 2.1 emphasizes the potential for diversification provided by an optimal combination of these asset classes. Surprisingly, High-Yield and Global Bonds have the lowest correlation (-0.13), whereas Global and Emerging Markets Equities exhibit the highest coefficient (0.83).

⁷ Complementa Investment-Controlling AG supports investors in planning, organizing and monitoring the funding process. It has been an operationally independent unit of State Street Holdings Germany GmbH since October 2011.


Table 2.1 – Asset Classes – Statistics

| | Average Return | Volatility | Sharpe Ratio | VaR (95%) | ES (95%) | MDD |
|---|---|---|---|---|---|---|
| Global Equities | 0.02 | 0.17 | 0.10 | 0.64 | -0.10 | -0.12 |
| EM Equities | -0.01 | 0.25 | -0.03 | 0.66 | -0.13 | -0.19 |
| Global Bonds | 0.02 | 0.07 | 0.29 | 0.22 | -0.03 | -0.04 |
| Inflation-Linked Bonds | 0.04 | 0.08 | 0.45 | 0.19 | -0.03 | -0.05 |
| High-Yield Bonds | 0.05 | 0.10 | 0.57 | 0.37 | -0.03 | -0.07 |
| EM Bonds | 0.06 | 0.12 | 0.50 | 0.35 | -0.03 | -0.08 |
| Hedge Funds | 0.02 | 0.12 | 0.14 | 0.42 | -0.05 | -0.08 |
| Commodities | -0.04 | 0.16 | -0.22 | 0.70 | -0.07 | -0.11 |

This table shows the annualized return and volatility of each asset class over the period 05/1997-12/2015. Value-at-Risk and Expected Shortfall are calculated with a 95% confidence level. The Maximum Drawdown (MDD) over the period is displayed in the last column.

Figure 2.1 – Asset Classes – Correlations.

| | Global Equities | EM Equities | Global Bonds | Inflation-Linked Bonds | High-Yield Bonds | EM Bonds | Hedge Funds | Commodities |
|---|---|---|---|---|---|---|---|---|
| Global Equities | 1 | 0.83 | 0.35 | 0.5 | 0.54 | 0.48 | 0.68 | 0.44 |
| EM Equities | | 1 | 0.18 | 0.35 | 0.62 | 0.66 | 0.56 | 0.5 |
| Global Bonds | | | 1 | 0.8 | -0.13 | 0.03 | 0.66 | 0.22 |
| Inflation-Linked Bonds | | | | 1 | 0.2 | 0.18 | 0.73 | 0.38 |
| High-Yield Bonds | | | | | 1 | 0.69 | 0.15 | 0.33 |
| EM Bonds | | | | | | 1 | 0.17 | 0.29 |
| Hedge Funds | | | | | | | 1 | 0.46 |
| Commodities | | | | | | | | 1 |

This figure displays asset classes correlations over the period 05/1997-12/2015. Highly correlated asset classes are highlighted by blue dots with shadings related to the degree of correlation.

2.6.3 Benchmark Strategies

We provide two benchmark strategies to assess the incremental value of our approach. The first one is the well-established 60-40 allocation, which seeks long-term capital appreciation, taking current income into account, by investing 60% of its assets in Global Equities and 40% in Global Bonds. The second benchmark strategy is a simple equally-weighted approach, also called naive diversification, where each asset has the same weight in the portfolio.

Our first investment strategy, described in Section 2.5 (Naive Factor Diversification), builds on the naive diversification approach and applies it to the uncorrelated factors, i.e. each factor is equally weighted in the portfolio. Table 2.2 shows that the 60-40 allocation, rebalanced on a monthly basis, delivers a disappointing net performance (0.87%) and a low Sharpe ratio (0.09). Our naive diversification strategy along the factors manages to outperform the two benchmark strategies, with an average annualized return of 2.21% and a Sharpe ratio of 0.26, compared to 2.11% and 0.22 respectively for the naive benchmark strategy applied to the assets directly. However, this simple approach does not reduce the maximum drawdown significantly (28.65%, 27.59% and 24.97% for the three strategies respectively).

Table 2.2 – Risk Parity Strategies – Key figures

| | Average Return | Volatility | Sharpe Ratio | VaR (95%) | ES (95%) | MDD |
|---|---|---|---|---|---|---|
| 60% / 40% | 0.87 | 9.33 | 0.09 | -4.81 | -6.62 | 28.65 |
| equally-weighted: assets | 2.11 | 9.44 | 0.22 | -5.01 | -8.72 | 27.59 |
| equally-weighted: factors | 2.21 | 8.41 | 0.26 | -4.41 | -8.81 | 24.97 |
| risk parity: assets | 2.25 | 7.39 | 0.30 | -3.85 | -6.73 | 19.38 |
| risk parity: factors | 2.52 | 7.05 | 0.36 | -3.59 | -6.42 | 17.91 |

This table presents the annualized return and volatility (after costs) as well as tail risk metrics of the standard equally-weighted approach and the risk parity strategies applied to the assets and to the statistical factors respectively.

2.6.4 Risk Parity Strategies

In a second step, we build on the traditional risk parity approach, which relies on an equal contribution of each asset in the portfolio to the overall portfolio volatility, and apply it to the statistical factors identified using Definition 1. The approach is described in Section 2.5 (Factor Risk Parity).

Figure 2.2 shows that the risk parity strategy applied to the factors improves on the factor-based naive strategy, posting a better net performance and lower drawdowns. Table 2.2 highlights the appeal of our factor risk parity strategy over traditional risk parity and naive diversification respectively.

2.6.5 Risk Budgeting Strategies

In a last step we extend our approach to a risk budgeting framework. We first construct a portfolio relying on the naive risk budgeting approach detailed in Section 2.5 (Factor Risk Budgeting – Naive), where we allocate an equal risk budget to each factor.


Figure 2.2 – Risk Parity Strategies – Performance

[Figure: net performance of the 60% / 40%, equally-weighted: factors and risk parity: factors strategies, shown in three panels (Cumulative Return, Monthly Return and Drawdown) from April 2001 to April 2015.]

This figure compares the performance after costs of two benchmark strategies, the widespread 60-40 and the equally-weighted strategies, to the risk parity approach applied to the statistical factors.

Following Section 2.5 (Factor Risk Budgeting – Proportional), we form an additional portfolio where the risk budgets are inversely proportional to the volatility of the statistical factors. Finally, we construct the portfolio relying on the modified effective rank methodology of Definition 3 and diversify away the idiosyncratic risk of the assets, as detailed in Section 2.5 (Factor Risk Budgeting – Specific).

We can observe in Tables 2.2 and 2.3 that the strategies applied to the assets directly are all outperformed by their factor-based counterparts.

Table 2.3 – Risk Budgeting Strategies – Key figures

| | Average Return | Volatility | Sharpe Ratio | VaR (95%) | ES (95%) | MDD |
|---|---|---|---|---|---|---|
| risk budgets: factors, naive | 2.90 | 7.73 | 0.38 | -3.70 | -4.91 | 12.28 |
| risk budgets: assets, specific | 1.29 | 7.53 | 0.17 | -4.04 | -7.64 | 17.92 |
| risk budgets: factors, specific | 3.58 | 6.84 | 0.52 | -2.85 | -4.09 | 10.10 |
| risk budgets: assets, proportional | 2.18 | 6.84 | 0.32 | -3.49 | -6.12 | 16.55 |
| risk budgets: factors, proportional | 2.38 | 6.37 | 0.37 | -3.14 | -5.85 | 15.91 |

This table presents the annualized return and volatility as well as tail risk metrics of the three risk budgeting strategies applied to assets and factors.


Table 2.3 shows that the naive risk budgeting approach provides a considerable improvement relative to the risk parity strategy. Despite a similar Sharpe ratio (0.38 vs 0.36), the maximum drawdown has been substantially reduced (from 17.91% to 12.28%). However, the proportional risk budgeting approach shows disappointing results in Figure 2.3, with a drawdown pattern similar to that of the naive risk budgeting strategy but an inferior annualized performance (2.38% vs 2.90%). The significant added value in terms of annualized return, Sharpe ratio and tail risk metrics is provided by the strategy of Section 2.5 (Factor Risk Budgeting – Specific).

Figure 2.3 – Risk Budgeting Strategies – Performance

[Figure: net performance of the risk budgets: factors, naive; risk budgets: factors, specific; and risk budgets: factors, proportional strategies, shown in three panels (Cumulative Return, Monthly Return and Drawdown).]

This figure compares the performance after costs of risk budgeting strategies based on equal risk budgets and on the volatility of the factors, as well as strategies relying on the diversification of specific risk left unexplained by the statistical factors.

Table 2.4 summarizes the risk-return characteristics of the factor-based strategies considered

in this chapter. Figure 2.4 gives an overview of the performance after costs over time as well as

downside risks of these strategies.


We can observe that the main strategy, relying on the diversification of specific risk, displays the best results: an average return of 3.58% resulting in a Sharpe ratio of 0.52. This strategy is also characterized by a stable net performance and a relatively low maximum drawdown (10.10%). This dynamic approach, which identifies the number of risk drivers at each rebalancing step and diversifies along them, considerably reduces the portfolio tail risk without impairing the performance.

Table 2.4 – Risk Parity vs. Risk Budgeting – Key figures

| | Average Return | Volatility | Sharpe Ratio | VaR (95%) | ES (95%) | MDD |
|---|---|---|---|---|---|---|
| equally-weighted: factors | 2.21 | 8.41 | 0.26 | -4.41 | -8.81 | 24.97 |
| risk parity: factors | 2.52 | 7.05 | 0.36 | -3.59 | -6.42 | 17.91 |
| risk budgets: factors, naive | 2.90 | 7.73 | 0.38 | -3.70 | -4.91 | 12.28 |
| risk budgets: factors, specific | 3.58 | 6.84 | 0.52 | -2.85 | -4.09 | 10.10 |

This table presents the annualized return and volatility (after costs) as well as tail risk metrics of the best performing factor-based strategies, relying on naive diversification, risk parity as well as risk budgeting techniques.

Figure 2.4 – Risk Parity vs. Risk Budgeting – Performance

[Figure: net performance of the equally-weighted: factors; risk parity: factors; risk budgets: factors, naive; and risk budgets: factors, specific strategies, shown in three panels (Cumulative Return, Monthly Return and Drawdown).]

This figure compares the performance after costs of the best performing factor-based strategies, relying on naive diversification, risk parity as well as risk budgeting techniques.


Figure 2.5 – Diversification along assets and factors

This figure compares the diversification measure of the five different strategies, when applied directly to the asset classes or using the statistical risk factors.

2.6.6 Diversification Analysis

Figure 2.5 reveals the importance of a thorough analysis of inherent risk drivers to assess the diversification of a given portfolio. Applying the measure D detailed in Equation (2.9) to the risk factors and assets respectively, we observe that the asset-based measure, blurred by correlation among assets, underestimates the true diversification. The traditional 60-40 strategy, invested in only two asset classes, reveals a relatively poor diversification along assets and factors (23% and 64% respectively). Interestingly, the asset-based measure underestimates in this case the diversification level of this portfolio. We can also observe that the two risk parity portfolios are the only strategies showing a perfect diversification (along assets and factors respectively). Such an extreme degree of diversification is not required and can even be sub-optimal in terms of risk-adjusted performance, as evidenced in Tables 2.2 and 2.3. Moreover, the relatively poor net performance of asset-based strategies relative to their factor-based counterparts reveals the importance of identifying the risk factors driving the asset returns when constructing a portfolio⁸ (see Tables 2.2 and 2.3).

⁸ This finding is consistent with other work, such as Bhansali et al. [9].


2.7 Conclusion

We propose a novel approach for dynamic risk budgeting. The model extends the statistical minimum-torsion approach proposed by Meucci et al. [63] and shrinks the number of uncorrelated factors with the help of the modified effective rank methodology of Kakushadze and Yu [44]. Relying on the rationale of the APT, we apply a risk budgeting investment strategy in which we diversify away unrewarded risks. We compare four different investment strategies. The first one is similar to the equally-weighted approach and relies on a naive diversification of the uncorrelated factors. We then draw on the naive diversification rationale and propose a risk budgeting approach that assigns an equal risk budget to each factor. These budgets are then expressed as risk budgets along the assets, using the minimum-torsion matrix. The two remaining risk budgeting strategies set risk budgets that are inversely proportional to the volatility of the factors and to the specific risk left unexplained by the statistical factors respectively. We show that substantial improvements in the Sharpe ratio as well as in tail risk metrics can be obtained by applying these dynamic statistical risk-based investment strategies. The last strategy considered provides promising results in terms of performance and risk management and will be used as the strategic allocation scheme in the real-world application presented in Chapter 6.


3 Portfolio Optimization

The major decision in the portfolio management process consists in allocating investment capital to a given universe of investable assets, with respect to a set of assumptions on the market dynamics and constraints. It has been shown that strategic asset allocation is decisive in determining the expected return and risk of a portfolio and that security selection only plays a minor role. This long-term decision process is crucial for institutional as well as private investors, and all financial aspects, such as current wealth, future incomes and outcomes, goals, inflation, etc., should be considered.

Although some asset classes display a higher return than others over a long-term horizon, in the short run the investor cannot neglect risk in the analysis. As we saw in the financial crisis of 2008, some asset classes that had historically displayed a low correlation suddenly dropped quickly and simultaneously. Choosing the right balance between investment opportunities thus depends on the risk the investor is ready to accept, which usually fluctuates over time due to changes in wealth levels, market environment or the investor’s goals.

We start with an overview of the single-period optimization framework in Section 3.1 and continue with a review of previous research in multi-period optimization and the solutions proposed in Section 3.2. Finally, a short overview of the Dynamic Programming (DP) techniques used in multi-period optimization is provided in Section 3.3.

3.1 Single-Period Optimization

Markowitz [54], in his seminal work, treated asset returns as random variables and showed that, when evaluating a portfolio, one should consider both its expected return and its risk, which he represented by the portfolio’s variance. His mean-variance framework laid the foundations for modern finance and explained how financial markets work. This mean-variance framework is referred to as modern portfolio theory, whereas post-modern portfolio theory considers further extensions including non-normal distributions and asymmetric risk measures.


3.1.1 Modern Portfolio Theory

Modern Portfolio Theory as proposed by Markowitz [54] frames the time dimension of invest-

ing as a single period over which the parameters of the probability distribution of asset returns

are both known with certainty and are fixed. The future is treated as a single period which

starts today but ends only at some unknown moment in the future. This second assumption

has received attention in the theoretical literature but there has been little progress in terms of

practical advances available to financial practitioners.

The "single-period" framework for portfolio optimization is legitimized by assuming that all frictions that impact portfolio formation and rebalancing are negligible. If the cost of rebalancing is zero, then the single-period assumption delivers the optimal solution. While these costs may be very small for some investment assets held by some investors, the necessary conditions are not fulfilled in most practical cases.

For almost all real-world investors, portfolio rebalancing is costly. For taxable investors holding

illiquid assets such as private equity or real estate, transaction costs are often predominant

with respect to return and risk considerations unless holding periods exceed multiple decades.

3.1.2 Mean-Variance Framework

Markowitz [54] paved the way to a new era of modern portfolio management when he presented the mean-variance framework for managing and optimizing portfolios. Portfolio variance is a valid risk measure for ranking an investor’s preferences if either the investor exhibits a quadratic utility function or the underlying asset returns are normally distributed.

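$$
\begin{aligned}
\min_{w} \quad & \frac{1}{n} \sum_{i=1}^{n} \Big( \sum_{j=1}^{m} w_j (r_{i,j} - \mu_j) \Big)^2 \\
\text{subject to} \quad & \sum_{j=1}^{m} w_j \mu_j = R, \\
& \sum_{j=1}^{m} w_j = 1, \\
& w_j \geq 0, \quad \forall j \in \{1, \dots, m\},
\end{aligned}
$$

where $w_j$, $j = 1, \dots, m$, are the asset weights, $r_{i,j}$, $i = 1, \dots, n$, are the return observations of asset $j$ and $\mu_j$ is its expected return.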
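The objective above is the sample variance of the portfolio return, so it coincides with the quadratic form w′Σw when Σ is the covariance matrix estimated with the same 1/n normalization. The following sketch checks this numerically on made-up data; the function names are hypothetical, and µ_j is assumed to be estimated by the sample mean of asset j.

```python
def column_means(returns):
    """Sample mean of each asset (column) over the n observations (rows)."""
    n = len(returns)
    m = len(returns[0])
    return [sum(row[j] for row in returns) / n for j in range(m)]


def objective(weights, returns):
    """(1/n) * sum_i ( sum_j w_j (r_ij - mu_j) )^2, the objective of the problem."""
    mu = column_means(returns)
    n = len(returns)
    return sum(
        sum(w * (r - m_j) for w, r, m_j in zip(weights, row, mu)) ** 2
        for row in returns
    ) / n


def covariance(returns):
    """Covariance matrix with 1/n normalization (assumption, to match the objective)."""
    mu = column_means(returns)
    n = len(returns)
    m = len(mu)
    return [
        [sum((row[a] - mu[a]) * (row[b] - mu[b]) for row in returns) / n for b in range(m)]
        for a in range(m)
    ]


def quadratic_form(weights, sigma):
    """w' Sigma w."""
    m = len(weights)
    return sum(weights[a] * sigma[a][b] * weights[b] for a in range(m) for b in range(m))


# Four observations of three assets (illustrative numbers only).
returns = [
    [0.01, 0.02, -0.01],
    [0.03, -0.01, 0.00],
    [-0.02, 0.01, 0.02],
    [0.00, 0.03, 0.01],
]
weights = [0.5, 0.3, 0.2]

print(objective(weights, returns))
print(quadratic_form(weights, covariance(returns)))
```

Both printed values agree up to floating-point rounding, which is why a quadratic programming solver can be used whenever the constraints remain linear.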

This optimization problem effectively minimizes portfolio risk, measured by the variance, subject to the portfolio forecast return being equal to R, a full investment constraint and positivity constraints on the weights. While it is simple to express the problem in its quadratic form, such that the variance is equal to w′Σw, we leave the problem here in its more general nonlinear programming form, which allows nonlinear constraints, including long-short optimization with a leverage constraint.¹

¹ In the case of constraints exhibiting a quadratic form, the problem can also be posed as a second-order cone (SOCP) problem.

Criticism of variance as a valid measure of the risk of a given portfolio is mainly aimed at the quadratic utility assumption, which is a mathematical convenience rather than a reflection of reality: it implies irrational investor behavior, namely preferring less to more beyond a certain point on the utility curve. The multivariate normality assumption, in turn, is usually not borne out by empirical data. Hanoch and Levy [39] were the first to criticize variance as a risk measure, since it penalizes upward and downward deviations at the same rate.²

² The criticism was not only aimed at variance but at any symmetric dispersion measure.

However, its ease of use and tractability have made it a very popular choice, with numerous extensions providing for robustness and uncertainty, mainly in the derivation of the covariance matrix. For example, James and Stein [43] provide a shrinkage estimator and Black and Litterman [10] a semi-Bayesian approach, while Michaud [65] offers a general criticism of the approach together with a patented alternative based on resampling methods.

3.2 Multi-Period Optimization

The optimal portfolio selection problem has always played a predominant role in applied financial research. Numerous papers strive to answer questions such as how to construct an optimal portfolio based on historical data or how the portfolio choice is influenced by asset returns.

The seminal work of Markowitz [54] relies on a trade-off between the expected return and the risk of a given portfolio, where risk is defined as portfolio variance. His formulation is equivalent to the well-known mean-variance utility maximization problem. This easy-to-implement methodology is, despite its caveats, still very popular in the financial industry and solves the static (single-period) portfolio choice problem (see Brandt and Santa-Clara [16]).

However, most industry problems involve finding an optimal investment strategy over a long-term investment horizon, and this issue has not yet been solved. The multi-period portfolio selection problem was first formulated by Markowitz [55] in his book. Mossin [69] also covered this topic in his paper, followed by Samuelson [86] and Merton and Samuelson [60]. At the beginning of the 21st century, numerous studies focused on this topic (see Li and Ng [52], Steinbach [93], Leippold et al. [51], Brandt and Santa-Clara [16], Celikyurt and Özekici [20], Skaf and Boyd [89]), but a closed-form solution, except in very restrictive cases, has not been provided yet.

Literature Review

Much research has been carried out on the formulation of full multi-period optimization. Mossin [69] suggests an explicit multi-period approach for portfolio optimization. A research paper written by Cargill and Meyer [18] focuses on the risk perspective of the multi-period



optimization problem. This was followed by Merton [59] who introduced a continuous-time

approach similar to mean-variance optimization, and by Pliska [77] who provided a discrete-

time alternative related to the single period method.

A solution to multi-period mean-variance optimization problems relying on dynamic stochastic programming is provided by Li and Ng [52]. Such methods investigate a range of potential paths of future outcomes and select the portfolio composition that meets a defined objective as well as the client’s constraints. These techniques are often used by institutional investors or high-net-worth individuals for asset allocation purposes. The computational power required by this method often restricts the number of assets that can be dealt with. Even with today’s improvements in computational efficiency, the methodology is still only viable for portfolios with a small number of assets. Using a set of simplifying assumptions, Sneddon [90] provides a closed-form solution to the multi-period optimization, including optimal turnover, that can be applied to problems with a large number of assets.

Another line of research in this domain focuses on the idea of creating rules that inform

investors when it is really necessary to rebalance their portfolio weights. In the absence of a

statistically significant and economically material advantage, such rules simply tell the investor

to do nothing, hence avoiding rebalancing costs altogether. Preliminary research in this area

includes Rubinstein [85] who examines the efficiency of continuous rebalancing and proposes

a rule for avoiding unnecessary turnover. Kroner and Sultan [48] propose a "hurdle" rule for

rebalancing currency hedges when return distributions are time-varying, while Engle et al.

[29] propose a similar hurdle on alpha improvement as the trigger for rebalancing actively

managed asset allocations.

Bootstrap resampling techniques were used by Bey et al. [8] to identify "indifference" regions

along the efficient frontier. In Gold [35], a similar technique is applied to define indifference

to rebalancing portfolios for illiquid asset classes such as real estate. In Michaud and Michaud

[64], a parametric resampling technique is applied to measure the confidence interval on

portfolio return and risk to design a “when to trade rule.”

In Markowitz and Van Dijk [56] a rebalancing rule based on game theory to approximate multi-

period optimization is defined, but the authors argue it is mathematically intractable (at least

in closed form) for large problems. In Kritzman et al. [47] the authors test the efficiency of the

rebalancing rule presented in Markowitz and Van Dijk [56] against full dynamic programming

for cases up to a maximum of five assets, as dynamic programming becomes computationally

unfeasible for larger numbers of assets. The authors extend the method to one hundred assets.

Practitioners often resort to some kind of "all or nothing" rebalancing strategy. There are

however two main impediments to such extreme rules. When active managers are “inactive”

because the potential benefits of rebalancing are too small, this lack of trading is perceived by

clients as the manager being neglectful rather than as an analytically-driven decision to reduce

trading costs. Presumably this objection could be overcome by appropriate communication

between the asset manager and their investors. The second argument is that after a period


of inactivity, the eventual rebalancing concentrates the required trading into a particular

moment in time. For large investors the market impact arising from doing trades that are a

larger fraction of available trading volume per unit of time will create higher transaction costs

than if the trading had been done gradually between the previous portfolio rebalancing and

the current one. For portfolios that are composed of assets with homogeneous transaction

costs it is common to simply place heuristic limits on the amount of turnover allowed in a

given rebalancing procedure.

A value-added/turnover efficient frontier is proposed in Grinold and Stuckelman [37]. The au-

thors derive that under certain common assumptions, value added measured as improvement

in utility, is approximately a square root function of turnover. As such, investors can optimize

their portfolios without considering trading costs, and then simply choose an intermediate

point between the initial portfolio and the optimal portfolio that results in the best trade-off

between utility improvement and incurred transaction costs.

Multi-Period Optimization based on Expected Utility

Mossin [69] was the first to analyze optimal multi-period portfolio strategies based on maximizing expected utility. His research focused on isolating the class of utility functions of terminal wealth for which the optimal decision at intermediate wealth levels is independent of asset returns beyond the current period. Such functions are called myopic and have the obvious benefit that the optimal multi-period strategy can be obtained by taking only the current period into account. The author found that, for general asset return distributions, the logarithmic utility function is completely myopic. When asset returns are serially independent, power utility functions are myopic as well. Mossin also concluded that for investors exhibiting a risk tolerance that is linear in wealth (the so-called HARA utility functions), and in the presence of a risk-free asset whose return is known for the whole investment horizon, these functions lead to partial myopia: the investor would optimally invest in a given period as if he would only invest in the risk-free asset in subsequent periods. Accordingly, if the risk-free rate is zero, complete myopia applies.

Hakansson [38] showed that even when asset returns are serially independent, for HARA utility

functions, no myopic strategies are optimal except in the very restrictive case of absence of

constraints on leverage or short sales. When such restrictions are present then only the power

and logarithmic utility functions lead to a myopic optimal strategy.

To summarize results we can say that a myopic strategy results when the investor exhibits a

logarithmic utility function, for serially dependent and independent asset returns distributions.

In the case of the power utility function, myopia applies only in the case of serially independent

asset returns distributions. Finally, in the absence of restrictions on leverage and short sales

and when the investor exhibits a HARA utility function, a myopic strategy applies only for

serially independent distributions.


Multi-Period Optimization under Independence Assumption

Li and Ng [52] as well as Leippold et al. [51] showed that a closed-form solution to a discrete-

time multi-period portfolio optimization problem can be obtained when assuming indepen-

dence of asset returns distribution within a mean-variance framework.

Brandt and Santa-Clara [16] provided an answer to the multi-period portfolio selection prob-

lem assuming that the portfolio weights can be presented as a linear function of certain state

variables. This assumption leads to a massive simplification of the optimization problem and

provides a local maximum, that could obviously substantially differ from the global solution.

When considering continuous-time, Duffie and Richardson [24], Basak and Chabakauri [4] as

well as Aït-Sahalia et al. [3] found a solution to the multi-period portfolio selection problem.

Multi-Period Optimization under Quadratic Utility Function

Bodnar et al. [11] derived a closed-form solution to the dynamic portfolio choice problem with and without a risk-free asset, using relatively weak assumptions. They only required the existence of the conditional mean vectors and conditional covariance matrices, and made no assumptions about the autocorrelation structure or the distribution of asset returns. Their solution can be applied to stationary as well as non-stationary stochastic models. However, the solution relies on the quadratic utility function, which displays increasing absolute risk aversion, as shown in the previous chapter.

Brandt (2006) showed that the quadratic utility function is a good approximation of other utility functions and, above all, lends welcome support to the mean-variance framework of Markowitz. Indeed, Tobin [94] showed in his paper that the Bernoulli principle is met if returns are normally distributed or if the investor displays a quadratic utility function. As asset returns are practically never normally distributed, the quadratic utility assumption justifies the use of the traditional mean-variance model.

Bodnar et al. [11] also proved that under the independence assumption of asset returns, the

optimal multi-period portfolio allocation at a given rebalancing time is closely related to the

optimal single-period portfolio allocation. Both portfolios differ only in the coefficient of risk

aversion. The authors showed that if the allocation is based on the tangency portfolio, the

multi-period solution is the same as the one obtained by solving the single-period problem at

each rebalancing time.

Multi-Period Optimization with a Downside Mean-Square Error Objective

Skaf and Boyd [89] consider the problem of multi-period portfolio optimization with an arbitrary distribution of asset returns and a self-financing budget constraint only. The authors use a mean-square error objective function. They show that when no other constraint is added to the model, the optimization problem can be solved by a standard dynamic programming approach and the resulting optimal policy is affine. When further constraints are present, they propose a sub-optimal policy that involves solving a convex quadratic program at each step, using the Bellman value function of the unconstrained problem to approximate the future value of the portfolio. They provide some examples showing that even in the presence of transaction costs their sub-optimal policy performs as well as without transaction costs.

Instead of relying on common utility functions, which all have pros and cons when applied to real-world problems, we chose to follow the idea proposed in Skaf and Boyd [89]. Based on this idea, we focus our research on an investor willing to reach a desired wealth level $w_{\text{target}}$ over a finite horizon $T$, and choose an objective function which penalizes final wealth levels that are below this targeted wealth level. However, such an objective function does not display the useful convexity property required by the quadratic programming approaches used by Skaf and Boyd.

3.3 Dynamic Programming Techniques

Optimizing a decision policy within an industrial framework usually leads to a problem of sequential decision-making under uncertainty. This class of problems is addressed within the framework of discrete-time stochastic control and can in most cases be formulated as a Markov decision process (see Bäuerle and Rieder [5], Bertsekas [7], Feinberg and Schwartz [33], and Puterman [79]).

Dynamic programming (DP) offers some techniques to find the optimal policy. DP represents

the optimal controls in terms of an optimization problem involving the value function of the

stochastic control problem (see Bertsekas [7]).
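To make the role of the value function concrete, the following is a minimal sketch of finite-horizon backward induction on a tiny finite Markov decision process. The wealth levels, the two actions, the transition probabilities and the squared-shortfall terminal cost are hypothetical choices made for illustration only and are not taken from the thesis.

```python
def backward_induction(states, actions, horizon, transition, stage_cost, terminal_cost):
    """Finite-horizon dynamic programming on a small finite Markov decision process.

    transition(x, a) returns a list of (probability, next_state) pairs.
    Returns value[t][x] for t = 0..horizon and policy[t][x] for t = 0..horizon-1.
    """
    value = [dict() for _ in range(horizon + 1)]
    policy = [dict() for _ in range(horizon)]
    for x in states:
        value[horizon][x] = terminal_cost(x)
    # Walk backwards in time: each value function is built from the next one.
    for t in range(horizon - 1, -1, -1):
        for x in states:
            best_a, best_v = None, float("inf")
            for a in actions:
                expected = sum(p * value[t + 1][y] for p, y in transition(x, a))
                total = stage_cost(x, a) + expected
                if total < best_v:
                    best_a, best_v = a, total
            value[t][x], policy[t][x] = best_v, best_a
    return value, policy


# Hypothetical toy problem: wealth levels 0..4, target level 4.
STATES = range(5)
ACTIONS = ("safe", "risky")

def transition(x, a):
    if a == "safe":
        return [(1.0, x)]                                   # wealth stays put
    return [(0.6, min(x + 1, 4)), (0.4, max(x - 1, 0))]     # risky bet

def stage_cost(x, a):
    return 0.0                                              # no running cost

def terminal_cost(x):
    return float((4 - x) ** 2)                              # squared shortfall to target

value, policy = backward_induction(STATES, ACTIONS, 3, transition, stage_cost, terminal_cost)
```

In this toy setting the policy takes the risky bet far below the target and stays safe close to it. The table `value[t][x]` needs one entry per state, which is what becomes impractical when the state is a high-dimensional portfolio vector.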

With an increasing number of time steps, severe difficulties arise when solving generic real-

world applications, since the underlying state variables in practice usually must be modeled

in terms of high-dimensional controlled Markov processes. These issues cause a variety of

problems, frequently referred to as the curse of dimensionality.

This makes even representing the value function intractable when the state or action spaces

are infinite, or as a practical matter, when the number of states or actions is large. Even when

the value function can be represented, evaluating the optimal policy can still be intractable.

Closed-form solutions to such problems are exceptions; usually an exact solution is out of reach and is not of primary importance in real-world applications. For these reasons, approximate numerical solutions, known as approximate dynamic programming (ADP), are almost always targeted in practice as a general method for finding sub-optimal control policies (see Powell [78]). In ADP, approximate value functions are substituted for value functions in the expression for the optimal policy. The goal is to select the control-Lyapunov function (i.e. the approximate value function) so that the performance of the resulting policy is close to optimal.

The accumulation of numerical inaccuracies is the main difficulty in the step-wise calculation

of approximate solutions via backward induction (see Bender et al. [6]). Due to the interleaved

application of numerical integration, the calculation of each value function relies on one

which was obtained in the previous step. This concatenation causes a deviation from the true

value functions, inevitably progressing with the number of time steps. This difficulty becomes

severe for a generic real-world problem, since the underlying state variables in practice usually

must be modeled in terms of high-dimensional controlled Markov processes.

4 Model Predictive Control

In this chapter, we introduce in Section 4.1 the linear convex stochastic control problems considered in a multi-period portfolio optimization context, in the presence of transaction costs and portfolio restrictions. We briefly outline in Section 4.2 how such linear convex problems can be handled by Model Predictive Control (MPC) techniques and detail in Section 4.3 the methodology when scenarios are included in the multi-period portfolio optimization problem. In Section 4.4 we explain the kind of transaction costs that can be handled by the scenario-based MPC scheme and how to include them. In Section 4.5, we detail the inclusion of portfolio constraints into the scenario-based MPC scheme. We formulate in Section 4.6 the scenario-based multi-period MPC portfolio optimization problem and show how this problem can be split into a convex quadratic and a non-quadratic component.

4.1 Introduction

We consider an optimal multi-period investment strategy in discrete time, with a finite horizon and a time-varying distribution of returns. Such optimization problems are formulated as stochastic control problems, usually with linear dynamics, which are easier to handle. These linear dynamics require that we model the wealth evolution in terms of the assets’ values instead of the usual portfolio weights. Investor constraints as well as transaction costs cannot be neglected in real-world applications and can normally be defined as convex functions, leading to linear convex stochastic control problems.

Linear convex stochastic control problems can easily be solved in the absence of transaction

costs, as they are reduced to a sequence of single-period optimization problems. When

only so-called impact costs are considered, the optimization problem can also be solved by

dynamic programming (DP), because these costs can be formulated as a quadratic function.

In this case, Skaf and Boyd [89] showed that the resulting optimal trading policies reduce to

affine functions of the current portfolio. When a small number of assets is considered, i.e. a

maximum of three assets, we can resort to numerical dynamic programming to find the

optimal policy.

When non-quadratic transaction costs are considered, such as bid-ask spread or brokerage

costs, the optimization problem is no longer computationally tractable. As an exact solution

cannot be found, we resort to a suitable approximation, a so-called sub-optimal policy. Sev-

eral techniques can be applied to derive the approximation, such as Approximate Dynamic

Programming and Model Predictive Control (MPC). As changes in financial markets can occur

often and have to be considered, we focus on techniques that can directly incorporate these

market changes. ADP-based policies require some pre-computations in order to find the

approximated value functions, which can be computationally costly. However, MPC has the

advantage of not requiring pre-computations and can thus deal with sudden market changes,

expressed in terms of modified returns expectations as well as time-varying covariances.

4.2 Background

Model Predictive Control is built around the idea of controlling a system by predicting its

evolution and choosing an optimal control based on the forecasted trajectory. In a portfolio

management context, the controls correspond to a trade vector which is usually selected to

minimize so-called stage-cost functions over the system’s states and controls. In MPC, all

future asset returns are simply replaced by their expected values.
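The receding-horizon idea can be sketched on a scalar toy system. This is an illustration under stated assumptions, not the portfolio model of this chapter: the dynamics `x_{t+1} = A * x_t + u_t + w_t`, the three admissible controls, the cost weights and the brute-force planner are all hypothetical. The planner replaces the future shocks by their mean (zero), and only the first planned control is applied before re-planning.

```python
from itertools import product

A = 1.0          # hypothetical scalar dynamics coefficient
EFFORT = 0.1     # hypothetical weight on the control in the stage cost

def plan(x0, horizon, controls):
    """Enumerate all control sequences for the certainty-equivalent model
    x_{t+1} = A * x_t + u_t (shock replaced by its mean, 0) and return the cheapest."""
    best_seq, best_cost = None, float("inf")
    for seq in product(controls, repeat=horizon):
        x, cost = x0, 0.0
        for u in seq:
            cost += x * x + EFFORT * u * u   # stage cost on state and control
            x = A * x + u                    # predicted (expected) evolution
        cost += x * x                        # terminal cost
        if cost < best_cost:
            best_seq, best_cost = seq, cost
    return best_seq

def run_mpc(x0, shocks, horizon=3, controls=(-1.0, 0.0, 1.0)):
    """Receding-horizon loop: re-plan at every step, apply only the first control."""
    x, path = x0, [x0]
    for w in shocks:
        u = plan(x, horizon, controls)[0]
        x = A * x + u + w                    # realized dynamics include the shock
        path.append(x)
    return path

path = run_mpc(3.0, shocks=[0.0, 0.5, 0.0, -0.5, 0.0, 0.0])
```

Because the plan is recomputed from the realized state at every step, the shocks are absorbed without any pre-computed value function. Exhaustive enumeration is used here only to keep the sketch self-contained; it would not scale to a portfolio problem.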

MPC is therefore often chosen for its ability to handle portfolio constraints, as well as its

appealing properties in terms of performance and robustness (see Morari et al. [67], Mayne

et al. [58], Mayne [57]).

Linear convex optimization problems can be solved by various efficient algorithms, which aim either at reducing the computation time or at approximating the optimal solution with a sub-optimal policy. Modern interior-point techniques, as presented in Wang and Boyd [96], rely on a few complex iterations, whereas fast gradient, multiplicative dual update and Alternating Direction Method of Multipliers (ADMM) methods rely on many simple iterations (see Nesterov and Nemirovskii [71], Richter et al. [80], Nesterov [70] and Parikh and Boyd [73]). In this thesis, we resort to ADMM, also known as the Douglas-Rachford algorithm, to solve our MPC problem. This algorithm is well suited to portfolio optimization problems and to high-frequency trading contexts, where execution speed is favored over precision.

4.3 Scenario-Based MPC

In this framework, we handle uncertainty pertaining to the system’s dynamics by generating a

set of scenarios, i.e. a set of dynamics equations and find for each scenario s the sequence of

trades (controls) over the whole investment horizon that minimizes the expected costs.

The state and control vectors at time $t$ are denoted by $x^s_t$ and $u^s_t \in \mathbb{R}^n$ respectively, where $s$ corresponds to a given scenario and $t = 1, \dots, T$.

We consider an uncertain discrete-time linear convex optimization problem over a horizon $T$, with $S$ dynamics equations and corresponding time-varying probabilities. At every time step $t \in \mathbb{Z}$, $t \ge 0$, given the current state $x(t)$, the scenario-based MPC problem is defined by

$$
\begin{aligned}
\min_{(x^s, u^s)_{s=1}^{S}} \quad & \sum_{s=1}^{S} p^s_t \, \Phi(x^s, u^s) \\
\text{subject to} \quad & x^s_{t+1} = A^s_t x^s_t + B^s_t u^s_t + f^s_t \\
& (x^s_{t+1}, u^s_t) \in \mathcal{X}^s_t \times \mathcal{U}^s_t \\
& x^s_0 = x_{\text{start}}, \quad s = 1, \dots, S, \quad t = 0, \dots, T-1,
\end{aligned}
\tag{4.1}
$$

where $p^s_t \in [0,1]$ is the probability of scenario $s$ at time $t$, with $\sum_{s=1}^{S} p^s_t = 1$, $f^s_t$ is a disturbance vector, and the sets $\mathcal{X}^s_t$ and $\mathcal{U}^s_t$ are polyhedral constraints on the states and controls in each scenario.

For each scenario $s$, the objective function $\Phi(x^s, u^s)$ in (4.1) can be written as

$$
\Phi(x^s, u^s) := \frac{1}{2} \sum_{t=0}^{T} \left( (x^s_t)^{\top} Q^s_t \, x^s_t + (u^s_t)^{\top} R^s_t \, u^s_t \right), \tag{4.2}
$$

where $Q^s_t \in \mathbb{R}^{n \times n}$ is the stage-cost matrix for the states and $R^s_t \in \mathbb{R}^{n \times n}$ is the stage-cost matrix for the controls. $Q^s_t$ and $R^s_t$ are usually symmetric positive semi-definite matrices.

We show in Section 4.6 how to translate the constrained portfolio optimization problem into the constrained scenario-based linear-convex control problem that we have just defined.

4.4 Portfolio, Benchmark and Trading

The portfolio and benchmark universe are composed of $n$ assets. The state variable $x^s_t \in \mathbb{R}^n$ corresponds to the dollar amount invested in each asset at time $t$ in scenario $s$, $b^s_t \in \mathbb{R}^n$ corresponds to the benchmark composition at time $t$ in scenario $s$, whereas the control variable $u^s_t \in \mathbb{R}^n$ is the amount of each asset bought or sold at time $t$.

We define $\mu^s_t$ and $g^s_t = \mathbf{1} + \mu^s_t$ as the vectors of expected returns and expected gains in scenario $s$ for the period $t$ to $t+1$, respectively.

We assume the following linear dynamics for the portfolio wealth in a given scenario $s$:

$$
x^s_{t+1} = G^s_t (x^s_t + u^s_t), \quad s = 1, \dots, S, \quad t = 0, \dots, T-1,
$$

where $G^s_t = \operatorname{diag}(g^s_t)$ is the diagonal matrix of expected asset gains at time $t$ in scenario $s$. Asset gains are typically non-negative.

The benchmark at time 0 is known and its dynamics in a given scenario are thus fully deterministic:

$$
b^s_{t+1} = G^s_t b^s_t, \quad s = 1, \dots, S, \quad t = 0, \dots, T-1.
$$

We highlight that this formulation is needed to obtain the required linear wealth dynamics. This requirement would not be met if other state variables had been chosen, such as portfolio weights or the number of shares held in each asset.

The trades in scenario $s$ are determined in each period $t$ by the policy $\phi^s_t : \mathbb{R}^n \to \mathbb{R}^n$:

$$
u^s_t = \phi^s_t(x^s_t), \quad s = 1, \dots, S, \quad t = 0, \dots, T-1.
$$

We note that the trades made at time $t$ depend only on the portfolio holdings $x^s_t$ in a given scenario. Bertsekas [7] showed that there is no added value in including past returns or portfolio states in the trading policy. With this setting, we can easily enforce a self-financing strategy by adding the constraint $\mathbf{1}^{\top} u^s_t = 0$.
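The dynamics above can be illustrated with a short sketch for a single scenario and a single period. The holdings, trades and gains below are hypothetical numbers chosen for illustration; the function names are not from the thesis.

```python
def step(holdings, trades, gains):
    """One period of x_{t+1} = G_t (x_t + u_t) with G_t = diag(gains)."""
    return [g * (x + u) for x, u, g in zip(holdings, trades, gains)]

def is_self_financing(trades, tol=1e-12):
    """Check the constraint 1^T u_t = 0 (purchases are funded by sales)."""
    return abs(sum(trades)) <= tol

x0 = [100.0, 0.0, 0.0]      # dollar holdings, fully invested in the first asset
b0 = [40.0, 30.0, 30.0]     # benchmark dollar holdings
u0 = [-50.0, 30.0, 20.0]    # sell 50 of asset 1, buy assets 2 and 3
g0 = [1.02, 0.99, 1.05]     # hypothetical gains g = 1 + mu for one scenario

x1 = step(x0, u0, g0)                   # traded portfolio after one period
b1 = step(b0, [0.0, 0.0, 0.0], g0)      # the benchmark is never traded
```

Because the state is expressed in dollar amounts, the update is a plain element-wise product, which is the linearity that the formulation relies on. With weights as state variables the update would require a renormalization by total wealth and would no longer be linear.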

4.5 Constraints

The post-trade constraint set $\mathcal{C}_t$ defines which post-trade portfolios are acceptable. $\mathcal{C}_t$ is assumed nonempty, and thus for any value of $x^s_t$ we can find a $u^s_t$ for which

$$
x^{s*}_t = x^s_t + u^s_t \in \mathcal{C}_t, \quad t = 0, \dots, T-1, \quad s = 1, \dots, S,
$$

where $x^{s*}_t$ corresponds to the portfolio at time $t$, in scenario $s$, just after trading. As the pre-trade portfolio $x^s_t$ is determined by the random asset gains $g^s_t$ in the previous time period, it is not directly under our control and we thus have to set the constraints on the post-trade portfolio. In the special case of a simple long-only constraint, due to the non-negativity of the asset gains $g^s_t$, the pre-trade portfolio $x^s_t$ will also meet the restriction.

4.5.1 Minimum and Maximum Weights

We can set constraints on the minimum and maximum allowed positions for each asset separately or, better yet, set position limits relative to the total portfolio value (i.e. weights):

$$
-x^{s*}_t \le -(\mathbf{1}^{\top} x^{s*}_t)\,\gamma^{lb}_t, \qquad x^{s*}_t \le (\mathbf{1}^{\top} x^{s*}_t)\,\gamma^{ub}_t, \quad s = 1, \dots, S, \quad t = 0, \dots, T-1,
$$

where $\gamma^{lb}_t, \gamma^{ub}_t \in \mathbb{R}^n$ contain positive entries and ensure that the value held in asset $i$ meets or exceeds the fraction $(\gamma^{lb}_t)_i$ and does not exceed the fraction $(\gamma^{ub}_t)_i$ of the total portfolio value, respectively. This constraint is convex, with $\mathcal{C}_t$ a polyhedron (see Boyd et al. [15]).
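As an illustration, the weight bounds can be checked directly on dollar holdings without ever forming the weights. The following sketch uses hypothetical bounds of 10% and 60% per asset; the helper names are assumptions made for this example.

```python
def post_trade(holdings, trades):
    """Post-trade portfolio x* = x + u (dollar amounts per asset)."""
    return [x + u for x, u in zip(holdings, trades)]

def within_weight_bounds(post, lower, upper, tol=1e-12):
    """Check (1^T x*) * lower <= x* <= (1^T x*) * upper element-wise."""
    total = sum(post)   # 1^T x*, the total portfolio value
    return all(total * lo - tol <= x <= total * hi + tol
               for x, lo, hi in zip(post, lower, upper))

gamma_lb = [0.10, 0.10, 0.10]   # hypothetical minimum weights
gamma_ub = [0.60, 0.60, 0.60]   # hypothetical maximum weights

diversified = within_weight_bounds(
    post_trade([100.0, 0.0, 0.0], [-50.0, 30.0, 20.0]), gamma_lb, gamma_ub)
concentrated = within_weight_bounds([100.0, 0.0, 0.0], gamma_lb, gamma_ub)
scaled = within_weight_bounds([500.0, 300.0, 200.0], gamma_lb, gamma_ub)
```

Both sides of each inequality are linear in the post-trade holdings, so scaling the portfolio leaves the outcome of the check unchanged. This is why the constraint set is a polyhedron even though it is stated in terms of weights.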

4.5.2 Brokerage Costs and Bid-Ask Spread

Commissions charged by a broker at each transaction can take various forms. However, the considered function has to be convex in order to be handled by standard MPC.$^1$ If we set the brokerage fees proportional to the traded volume, we get a convex function given by:

$$
\psi^s_t(x^s_t, u^s_t) = \kappa_t^{\top} |u^s_t|, \quad s = 1, \dots, S, \quad t = 0, \dots, T-1, \tag{4.3}
$$

where $\kappa_t \ge 0$ is the vector of commission rates and the absolute value is element-wise. When the brokerage fees are uniform, i.e. equal across assets, the function reduces to $\psi^s_t(x^s_t, u^s_t) = \kappa_t \mathbf{1}^{\top} |u^s_t|$, where $\kappa_t \ge 0$ is a scalar.

The bid-ask spread can be considered as an additional transaction cost, which can be significant for illiquid assets. It is modeled in the same way as brokerage costs, with $(\kappa_t)_i$ corresponding to one-half of the bid-ask spread for asset $i$ (see Boyd et al. [15]).

4.5.3 Price impact

When large orders are executed, prices tend to move against the trader as orders are filled. Such indirect costs are defined as price impact. A quadratic form for the price-impact cost can be chosen to ensure convexity of the function and is given by:

$$
\psi^s_t(x^s_t, u^s_t) = c_t^{\top} (u^s_t)^2, \quad s = 1, \dots, S, \quad t = 0, \dots, T-1,
$$

where $(c_t)_i \ge 0$ and the square is element-wise.

Another price impact model has been used by Meucci and Nicolosi [62], the 3/2 power transaction cost $\psi^s_t(x^s_t, u^s_t) = c_t^{\top} |u^s_t|^{3/2}$, which also meets the convexity requirement.
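The three cost models of this section can be compared side by side. The sketch below uses hypothetical trades and cost coefficients chosen only for illustration; the function names are not from the thesis.

```python
def linear_cost(trades, kappa):
    """Brokerage / bid-ask cost kappa^T |u| (element-wise absolute value)."""
    return sum(k * abs(u) for k, u in zip(kappa, trades))

def quadratic_impact(trades, c):
    """Quadratic price impact c^T u^2 (element-wise square)."""
    return sum(ci * u * u for ci, u in zip(c, trades))

def power_impact(trades, c, exponent=1.5):
    """Power-law price impact c^T |u|^(3/2) (element-wise power)."""
    return sum(ci * abs(u) ** exponent for ci, u in zip(c, trades))

u = [-4.0, 9.0]             # hypothetical trades: sell 4, buy 9
kappa = [0.001, 0.002]      # hypothetical half-spreads per unit traded
c = [0.01, 0.01]            # hypothetical impact coefficients

spread_cost = linear_cost(u, kappa)        # 0.001 * 4 + 0.002 * 9
impact_cost = quadratic_impact(u, c)       # 0.01 * 16 + 0.01 * 81
impact_cost_32 = power_impact(u, c)        # 0.01 * 8 + 0.01 * 27
```

All three functions are sums of convex functions of the individual trades with non-negative coefficients, which is what keeps the overall problem convex. Only the quadratic one fits into a quadratic stage cost, a distinction that matters for the decomposition later in this chapter.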

4.6 Problem Description

Adapting from Meucci and Nicolosi [62], we solve a deterministic, discrete-time and finite-horizon problem of an investor willing to maximize a satisfaction index relying on expected risk-adjusted returns against a defined benchmark under a set of defined scenarios with time-varying probabilities. Transaction costs, impact costs, and lower and upper bounds on the portfolio weights are considered.

Whereas Meucci and Nicolosi [62] only consider quadratic impact costs in their framework, we extend the formulation in order to take bid-ask costs into account. Moreover, whereas Meucci and Nicolosi [62] track the changes in portfolio exposure at a given time $t$, we choose here to model the portfolio exposures and the trades separately, which allows us to consider self-financing strategies.

$^1$ Although non-convex constraints can be handled by hybrid MPC procedures.

We can define the following stage-cost function in each scenario $s$, which considers the expected excess return, the tracking error, and the market impact and transaction costs defined in Section 4.5:

$$
p^s_t \Big( -(\mu^s_t)^{\top} (x^s_t + u^s_t - b^s_t) + \lambda_t \, (x^s_t + u^s_t - b^s_t)^{\top} \Sigma^s_t \, (x^s_t + u^s_t - b^s_t) + \kappa_t^{\top} |u^s_t| + (u^s_t)^{\top} \operatorname{diag}(c_t) \, u^s_t \Big), \tag{4.4}
$$

where $p^s_t$ is the (time-varying) probability assigned to scenario $s$, and $\kappa_t \ge 0$ and $c_t \ge 0$ are the vectors of bid-ask and (quadratic) market impact costs, respectively. $\Sigma^s_t \succeq 0$ is the covariance matrix in scenario $s$, which is positive semi-definite, and $\lambda_t > 0$ is the (time-varying) risk aversion parameter applied to the quadratic risk (tracking error).
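To see how the four terms of the stage cost combine, the following sketch evaluates (4.4) for one scenario and one time step with two assets. All numerical inputs are hypothetical and the function name is an assumption made for this example.

```python
def stage_cost(p, mu, x, u, b, lam, sigma, kappa, c):
    """Evaluate the stage cost (4.4) for one scenario and one time step."""
    n = len(x)
    d = [x[i] + u[i] - b[i] for i in range(n)]        # post-trade active position
    excess_return = sum(mu[i] * d[i] for i in range(n))
    risk = sum(d[i] * sigma[i][j] * d[j] for i in range(n) for j in range(n))
    spread = sum(kappa[i] * abs(u[i]) for i in range(n))
    impact = sum(c[i] * u[i] ** 2 for i in range(n))
    return p * (-excess_return + lam * risk + spread + impact)

# Hypothetical inputs for a two-asset example.
p, lam = 0.5, 2.0
mu = [0.02, 0.01]
sigma = [[0.04, 0.01], [0.01, 0.09]]
kappa = [0.001, 0.001]
c = [0.1, 0.1]
x = [1.0, 0.0]
b = [0.5, 0.5]

cost_no_trade = stage_cost(p, mu, x, [0.0, 0.0], b, lam, sigma, kappa, c)
cost_track = stage_cost(p, mu, x, [-0.5, 0.5], b, lam, sigma, kappa, c)
```

In this example, trading all the way to the benchmark removes the tracking-error term but pays spread and impact costs, and the two stage costs end up close to each other. This is the trade-off that the optimization resolves over the whole horizon.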

4.6.1 Portfolio Restrictions

We assume that the initial portfolio is fully invested in the first asset and that we liquidate the portfolio at the end of the investment horizon $T$. Moreover, we impose a self-financing strategy.$^2$ We thus have the following equality constraints in this linear-convex framework:

$$
\begin{aligned}
x^s_0 &= x_{\text{start}} = (1, 0, \dots, 0)^{\top} \\
x^s_T + u^s_T &= 0 \\
\mathbf{1}^{\top} u^s_t &= 0, \quad s = 1, \dots, S, \quad t = 0, \dots, T-1.
\end{aligned}
$$

We also add a restriction on post-trade portfolio weights, as defined in Section 4.5.1, and obtain the following inequality constraints:

$$
-x^{s*}_t \le -(\mathbf{1}^{\top} x^{s*}_t)\,\gamma^{lb}_t, \qquad x^{s*}_t \le (\mathbf{1}^{\top} x^{s*}_t)\,\gamma^{ub}_t, \quad s = 1, \dots, S, \quad t = 0, \dots, T-1.
$$

$^2$ Note that this framework can also handle in- and outflows.

4.6.2 Objective function

Instead of maximizing a satisfaction index, we consider the related minimization problem.

We thus minimize the negative of the satisfaction index.

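Using our stage-cost functions in (4.4) at each time step $t$, given a set of scenarios $s = 1, \dots, S$ and their probabilities of occurrence $p^s_t$, the initial/final portfolio and self-financing restrictions as well as the weight constraints, we can formulate our constrained scenario-based linear-convex control problem as

$$
\begin{aligned}
\min_{(x,u)} \quad & \sum_{s=1}^{S} \sum_{t=0}^{T} p^s_t \Big( -(\mu^s_t)^{\top} (x^s_t + u^s_t - b^s_t) + \lambda^s_t \, (x^s_t + u^s_t - b^s_t)^{\top} \Sigma^s_t \, (x^s_t + u^s_t - b^s_t) \\
& \qquad\qquad\qquad + (u^s_t)^{\top} \operatorname{diag}(c_t) \, u^s_t + \kappa_t^{\top} |u^s_t| \Big) \\
\text{subject to} \quad & x^s_{t+1} = G^s_t (x^s_t + u^s_t) + f^s_t \\
& b^s_{t+1} = G^s_t b^s_t \\
& \mathbf{1}^{\top} u^s_t = 0 \\
& x^s_0 = x_{\text{start}} \\
& x^s_T + u^s_T = 0 \\
& (\mathbf{1}^{\top} x^{s*}_t)\,\gamma^{lb}_t \le x^{s*}_t \le (\mathbf{1}^{\top} x^{s*}_t)\,\gamma^{ub}_t, \quad s = 1, \dots, S, \quad t = 0, \dots, T-1,
\end{aligned}
\tag{4.5}
$$

with state and control variables $x^s_t \in \mathbb{R}^{n \cdot S}$, $u^s_t \in \mathbb{R}^{n \cdot S}$, $t = 0, \dots, T$ and $s = 1, \dots, S$. The variable $f^s_t \in \mathbb{R}^{n \cdot S}$ corresponds to cash flows at time $t$, in scenario $s$.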

A discount factor can be included to discount expected satisfaction.

4.7 Decomposition Quadratic / Non-Quadratic

The objective function consists of convex objectives, the so-called stage-cost functions defined in (4.4). As shown in (4.2), we can rewrite the portfolio optimization problem (4.5) as a combination of a convex quadratic part $\Phi(x,u)$ and a convex non-quadratic part $\Psi(x,u)$:

$$
\begin{aligned}
\min_{(x,u)} \quad & \Phi(x,u) + \Psi(x,u) \\
\text{subject to} \quad & x^s_{t+1} = G^s_t (x^s_t + u^s_t) + f^s_t \\
& b^s_{t+1} = G^s_t b^s_t \\
& \mathbf{1}^{\top} u^s_t = 0 \\
& x^s_0 = x_{\text{start}} \\
& x^s_T + u^s_T = 0 \\
& (\mathbf{1}^{\top} x^{s*}_t)\,\gamma^{lb}_t \le x^{s*}_t \le (\mathbf{1}^{\top} x^{s*}_t)\,\gamma^{ub}_t, \quad s = 1, \dots, S, \quad t = 0, \dots, T-1,
\end{aligned}
$$

where $x \in \mathbb{R}^{S(n \cdot (T+1))}$ denotes the portfolio composition (states) and $u \in \mathbb{R}^{S(n \cdot (T+1))}$ the sequence of trades (controls) for all scenarios and over the investment horizon, respectively. Similarly, we define $(x,u) \in \mathbb{R}^{S(2n \cdot (T+1))}$, which corresponds to the concatenated portfolios and trades for all scenarios and over the investment horizon.

4.7.1 Quadratic Component

We denote by Φ(x,u) the quadratic part of the objective function, which includes all elements

of the objective functions except the bid-ask (or brokerage) costs, which are non-quadratic.

Using the following definitions

$$
x = \begin{bmatrix} x^1 \\ x^2 \\ \vdots \\ x^s \\ \vdots \\ x^S \end{bmatrix}, \quad
u = \begin{bmatrix} u^1 \\ u^2 \\ \vdots \\ u^s \\ \vdots \\ u^S \end{bmatrix}, \quad
Q = \begin{bmatrix} Q^1 & 0 & \cdots & 0 \\ 0 & Q^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & Q^S \end{bmatrix},
$$

$$
R = \begin{bmatrix} R^1 & 0 & \cdots & 0 \\ 0 & R^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & R^S \end{bmatrix}, \quad
V = \begin{bmatrix} V^1 & 0 & \cdots & 0 \\ 0 & V^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & V^S \end{bmatrix},
$$

where

$$
x^s = \begin{bmatrix} x^s_1 \\ x^s_2 \\ \vdots \\ x^s_t \\ \vdots \\ x^s_T \end{bmatrix}, \quad
u^s = \begin{bmatrix} u^s_1 \\ u^s_2 \\ \vdots \\ u^s_t \\ \vdots \\ u^s_T \end{bmatrix}, \quad
Q^s = \begin{bmatrix} Q^s_1 & 0 & \cdots & 0 \\ 0 & Q^s_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & Q^s_T \end{bmatrix},
$$

$$
R^s = \begin{bmatrix} R^s_1 & 0 & \cdots & 0 \\ 0 & R^s_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & R^s_T \end{bmatrix}, \quad
V^s = \begin{bmatrix} V^s_1 & 0 & \cdots & 0 \\ 0 & V^s_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & V^s_T \end{bmatrix}, \quad s = 1, \dots, S,
$$

and

$$
\begin{aligned}
Q^s_t &:= p^s_t \, (2 \lambda^s_t \Sigma^s_t) \\
V^s_t &:= p^s_t \, (2 \lambda^s_t \Sigma^s_t) \\
R^s_t &:= p^s_t \, (2 \lambda^s_t \Sigma^s_t + \operatorname{diag}(c^s_t)) \\
q^s_t &:= p^s_t \, (-\mu^s_t - Q^s_t b^s_t) \\
r^s_t &:= p^s_t \, (-\mu^s_t - V^s_t b^s_t),
\end{aligned}
$$

the convex quadratic terms of the function $\Phi(x,u)$ can be written in a convenient matrix format

$$
\Phi(x,u) = \frac{1}{2}
\begin{bmatrix} x \\ u \\ 1 \end{bmatrix}^{\top}
\begin{bmatrix} Q & V & q \\ V^{\top} & R & r \\ q^{\top} & r^{\top} & 0 \end{bmatrix}
\begin{bmatrix} x \\ u \\ 1 \end{bmatrix},
$$

where $Q$, $R$ and $V \succeq 0$.
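As a quick check of this block form, and assuming only the symmetric layout of the matrix (nothing specific to the portfolio data), multiplying out the product gives

$$
\frac{1}{2}
\begin{bmatrix} x \\ u \\ 1 \end{bmatrix}^{\top}
\begin{bmatrix} Q & V & q \\ V^{\top} & R & r \\ q^{\top} & r^{\top} & 0 \end{bmatrix}
\begin{bmatrix} x \\ u \\ 1 \end{bmatrix}
= \frac{1}{2} x^{\top} Q x + x^{\top} V u + \frac{1}{2} u^{\top} R u + q^{\top} x + r^{\top} u .
$$

The cross terms $x^{\top} V u$ and $u^{\top} V^{\top} x$ are equal scalars, which is why the factor $\frac{1}{2}$ disappears in front of $x^{\top} V u$, and the linear terms appear twice in the product for the same reason. The zero in the lower-right corner indicates that the form carries no constant term; terms that depend only on the benchmark do not affect the minimizer.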

4.7.2 Non-Quadratic Component

We denote by $\Psi(x,u)$ the non-quadratic convex part of the objective function, which includes the bid-ask (or brokerage) costs and is defined by

$$
\Psi(x,u) = \kappa^{\top} |u|,
$$

where $\kappa \in \mathbb{R}^{S(n \cdot (T+1))}$ is a column vector which contains the bid-ask cost for each asset in each scenario, which is also allowed to vary over time, and is written

$$
\kappa = \begin{bmatrix} \kappa^1 \\ \kappa^2 \\ \vdots \\ \kappa^s \\ \vdots \\ \kappa^S \end{bmatrix}, \quad
\kappa^s = \begin{bmatrix} \kappa_0 \\ \kappa_1 \\ \vdots \\ \kappa_t \\ \vdots \\ \kappa_T \end{bmatrix}, \quad t = 0, \dots, T.
$$

Not only does this splitting into quadratic and non-quadratic convex parts facilitate the representation of the optimization problem, but it will also be helpful in formulating and understanding the ADMM splitting algorithm in the next chapter.
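One reason the split is convenient is that a weighted absolute-value term is separable across its entries and has a closed-form proximal operator, the standard soft-thresholding map. The sketch below shows that standard operator only; the exact scaling and form used in the next chapter may differ, and the function name and inputs are assumptions made for illustration.

```python
def soft_threshold(v, thresholds):
    """Element-wise proximal operator of u -> sum_i thresholds[i] * |u_i|.

    For a penalty parameter rho, the proximal operator of (1 / rho) * kappa^T |u|
    is obtained by passing thresholds = [k / rho for k in kappa].
    """
    out = []
    for vi, ti in zip(v, thresholds):
        if vi > ti:
            out.append(vi - ti)      # shrink positive entries towards zero
        elif vi < -ti:
            out.append(vi + ti)      # shrink negative entries towards zero
        else:
            out.append(0.0)          # small trades are set exactly to zero
    return out

shrunk = soft_threshold([3.0, -0.5, -2.0], [1.0, 1.0, 0.5])
```

Entries whose magnitude does not exceed the threshold are mapped exactly to zero, which matches the intuition that a trade is not worth making when its benefit does not cover the bid-ask cost.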

4.8 Conclusion

We defined a linear-convex problem with restrictions that cannot be solved analytically. Various efficient tools relying on generic interior-point cone solvers, such as SeDuMi or SDPT3, can help solve such optimization problems. YALMIP and CVX are two parser-solvers that allow the user to describe the optimization problem at a high level; the parser then transforms it and expresses it as a cone program. However, evaluating these scenario-based MPC problems requires solving large quadratic problems, which are beyond the computational limits of such solvers.

For real-time applications in a portfolio management context or for trading purposes, a very high-accuracy trading policy is not required and the focus is set on reducing computation times. In the following chapter we show how to solve the scenario-based MPC problem in (4.5) by breaking the global problem into a quadratic optimal control part, which can be solved very efficiently for all scenarios at once, and a set of single-period problems that can be solved separately. We also extend the state-of-the-art methodology by integrating the weight constraints defined in Section 4.5.1 into the two-set splitting framework without using an additional splitting set.


5 Fast Scenario-Based Optimal Control

Several methods have been employed to solve the optimization problem in (4.5), such as dynamic programming, approximate dynamic programming and linear matrix inequalities (see Boyd et al. [15]).

The Alternating Direction Method of Multipliers (ADMM), which was introduced by Glowinski and Marroco [34], relies on a simple algorithm that is well suited to problems arising in scenario-based MPC in particular. ADMM splits the whole optimization into sub-problems that are easier to solve and coordinates their solutions to converge to a solution of the (large) global problem.

In Section 5.1 we start with a review of relevant convex optimization definitions as well as a

description of proximal operators that will be subsequently used by the ADMM algorithm.

The general methodology of the ADMM algorithm is presented in Section 5.2. We apply the

algorithm to solve the scenario-based MPC problem presented in the previous chapter in

Section 5.3 and following the methods presented in Parikh and Boyd [73], we apply modern

techniques from linear algebra in Section 5.4, to solve the quadratic part of the optimization

problem very efficiently and handle high-dimensional optimization problems. The non-

quadratic optimization problem is solved in Section 5.5 using proximal operators.

In Section 5.6 we present our contribution and extend the current ADMM methodology by including portfolio weight constraints in the two-set splitting algorithm, without using an additional splitting set. Moreover, we propose a modification of the termination criterion used in previous work, which improves convergence speed by embedding the probabilities assigned to the scenarios into the criterion. The complete Extended Two-Set Alternating Direction Method of Multipliers algorithm is presented. We conclude this chapter in Section 5.7.


5.1 Definitions

5.1.1 Convex Functions

Definition 5. A function $f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ is convex if its domain dom f is convex and for any $x, y \in \operatorname{dom} f$

\[
f(\theta x + (1-\theta)y) \le \theta f(x) + (1-\theta) f(y), \qquad \theta \in [0,1].
\]

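Definition 6. A function $f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ is strictly convex if the following strict inequality holds for $x \neq y$

\[
f(\theta x + (1-\theta)y) < \theta f(x) + (1-\theta) f(y), \qquad \theta \in (0,1).
\]

It is strongly convex if there exists $m > 0$ such that $f - \tfrac{m}{2}\|\cdot\|_2^2$ is convex. Every strongly convex function is strictly convex.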

Definition 7. A function $f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ is called closed proper convex if its epigraph

\[
\operatorname{epi} f = \{ (x,t) \in \mathbb{R}^n \times \mathbb{R} \mid f(x) \le t \} \tag{5.1}
\]

is a nonempty closed convex set. We can write the effective domain of f as

\[
\operatorname{dom} f = \{ x \in \mathbb{R}^n \mid f(x) < +\infty \},
\]

which corresponds to the set of points for which f takes on finite values (see Parikh and Boyd [73] for more details).

Definition 8. The indicator function of a closed convex set $C_t \subseteq \mathbb{R}^{2n}$ is

\[
\psi_t(x_t,u_t) = 1_{C_t}(x_t,u_t) =
\begin{cases}
0, & (x_t,u_t) \in C_t, \\
\infty, & \text{otherwise.}
\end{cases}
\]

We note that the indicator function is a convex function.

5.1.2 Proximal Operators

Proximal operators carry out small convex optimization problems. These often have closed-

form solutions or can at least be solved very efficiently.

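Definition 9. The proximal operator $\operatorname{prox}_f : \mathbb{R}^n \to \mathbb{R}^n$ of f is

\[
\operatorname{prox}_f(v) = \operatorname*{argmin}_x \left( f(x) + \tfrac{1}{2}\|x - v\|_2^2 \right),
\]

where $\|\cdot\|_2$ is the $\ell_2$ norm. The function minimized is strongly convex and has a unique minimizer for every $v \in \mathbb{R}^n$.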


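To simplify the notation, the function f is often scaled with a parameter ρ > 0, which leads to the following definition.

Definition 10. The proximal operator of f with parameter ρ is

\[
\operatorname{prox}_{\rho f}(v) = \operatorname*{argmin}_x \left( f(x) + \tfrac{\rho}{2}\|x - v\|_2^2 \right).
\]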

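In this thesis, transaction costs are formulated as an $\ell_1$ minimization problem (see Chapter 4, Equation 4.3). As shown in Parikh and Boyd [73],

Definition 11. The proximal operator of $f = \|\cdot\|_1$ is

\[
\left(\operatorname{prox}_{\rho f}(v)\right)_i =
\begin{cases}
v_i - 1/\rho, & v_i \ge 1/\rho, \\
0, & |v_i| \le 1/\rho, \\
v_i + 1/\rho, & v_i \le -1/\rho,
\end{cases}
\]

where the threshold equals 1/ρ because the quadratic term in Definition 10 is weighted by ρ/2. This operator is also known as the element-wise soft-thresholding operator. Parikh and Boyd [73] show that it can also be expressed in compact form as

\[
\operatorname{prox}_{\rho f}(v) = (v - 1/\rho)_+ - (-v - 1/\rho)_+ . \tag{5.2}
\]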
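As a small illustration, the soft-thresholding operator can be sketched in a few lines of Python. The function names and the generic threshold `tau` are assumptions made for this example; how `tau` relates to ρ depends on how the quadratic term of the proximal operator is scaled. The helper `l1_prox_objective` is only there to check numerically that the returned value minimizes |x| + (x − v)²/(2·tau).

```python
def soft_threshold(v, tau):
    """Element-wise soft-thresholding of the list v with threshold tau >= 0."""
    # Compact form (v - tau)_+ - (-v - tau)_+, where (a)_+ = max(a, 0).
    return [max(vi - tau, 0.0) - max(-vi - tau, 0.0) for vi in v]


def l1_prox_objective(x, v, tau):
    """Scalar objective |x| + (x - v)^2 / (2 * tau) minimized by soft_threshold."""
    return abs(x) + (x - v) ** 2 / (2.0 * tau)


v = [3.0, -0.5, -2.0]
shrunk = soft_threshold(v, 1.0)
print(shrunk)  # [2.0, 0.0, -1.0]
```

Entries whose magnitude is below the threshold are set exactly to zero, which is why small trades are suppressed entirely when this operator is applied to transaction costs.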

The inclusion of inequality constraints related to portfolio weight restrictions, introduced in Section 5.6, relies on the projection onto a box or hyper-rectangle.

Definition 12. The projection onto a box $C = \{ x \mid l \le x \le u \}$ takes the form

\[
(\Pi_C(v))_k =
\begin{cases}
l_k, & v_k \le l_k, \\
v_k, & l_k \le v_k \le u_k, \\
u_k, & v_k \ge u_k,
\end{cases}
\tag{5.3}
\]

where $l_k$, $u_k$ are the lower and upper bounds of the box, respectively.
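The projection in (5.3) amounts to clipping each component to its bounds. A minimal sketch follows; the function name `project_box` and the sample numbers are chosen for illustration only.

```python
def project_box(v, lower, upper):
    """Project v onto the box {x : lower <= x <= upper}, component by component."""
    return [min(max(vk, lk), uk) for vk, lk, uk in zip(v, lower, upper)]


# Three assets whose values must stay between 0 and 0.5.
print(project_box([-0.2, 0.3, 0.9], [0.0, 0.0, 0.0], [0.5, 0.5, 0.5]))  # [0.0, 0.3, 0.5]
```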

5.1.3 Proximal minimization

If the function $f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ is a closed proper convex function, the minimization can be carried out with the help of a proximal minimization algorithm. At each iteration, this algorithm, also known as proximal iteration, solves

\[
v^{k+1} := \operatorname{prox}_{\rho f}(v^k),
\]

where k is an iteration counter and $v^k$ denotes the kth iterate. If the problem is feasible, it can be shown that $v^k$ converges to the set of minimizers of the function f and $f(v^k)$ converges to its minimum.
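To make the proximal iteration concrete, the sketch below applies it to the scalar function f(x) = 0.5·(x − 3)², an example chosen here purely for illustration. With the ρ/2-weighted penalty, setting the derivative (x − 3) + ρ(x − v) to zero gives the closed-form proximal step used in `prox_quadratic`. All names are hypothetical.

```python
def prox_quadratic(v, rho, target=3.0):
    """Proximal step for f(x) = 0.5 * (x - target)^2 with penalty (rho / 2) * (x - v)^2."""
    return (target + rho * v) / (1.0 + rho)


def proximal_iteration(v0, rho, num_iters):
    """Repeat v <- prox(v) and record every iterate."""
    history = [v0]
    for _ in range(num_iters):
        history.append(prox_quadratic(history[-1], rho))
    return history


history = proximal_iteration(v0=0.0, rho=1.0, num_iters=30)
print(abs(history[-1] - 3.0) < 1e-6)  # True: the iterates approach the minimizer x = 3
```

A larger ρ keeps each iterate closer to the previous one, so more iterations are needed; a smaller ρ lets each step move further toward the minimizer.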

According to the definition, $\operatorname{prox}_f(v)$ is a point that strikes a trade-off between remaining close to v and minimizing f. This is why it is also called a proximal point of v with respect to f. The parameter ρ controls this trade-off.

We will subsequently use extended-valued closed proper convex functions, meaning that they can take the value +∞ outside their domain.

Assumption 1. Functions f and g are closed convex functions.

5.2 ADMM

We present in this section a popular splitting method, known as the Alternating Direction Method of Multipliers (ADMM), which is a special form of the Douglas-Rachford splitting algorithm. It combines the decomposability of dual ascent with the strong convergence properties of the method of multipliers and is particularly well suited to optimization problems that are too large to be handled by generic solvers (for more details see Parikh and Boyd [73]).

The algorithm solves convex optimization problems of the form

\[
\begin{aligned}
\text{minimize} \quad & f(x) + g(\tilde{x}) \\
\text{subject to} \quad & x - \tilde{x} = 0,
\end{aligned}
\tag{5.4}
\]

with variables $x \in \mathbb{R}^n$ and $\tilde{x} \in \mathbb{R}^n$, where $f, g : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ are closed proper convex functions. The optimization variable has been split into two parts, denoted $x$ and $\tilde{x}$, with the objective function separable across this splitting. The so-called consensus constraint ensures that $x$ and $\tilde{x}$ agree.

This method is a so-called proximal algorithm, meaning it solves convex optimization prob-

lems using proximal operators of the objective functions.

To solve the problem in (5.4) we can form the augmented Lagrangian

\[
L_\rho(x,\tilde{x},y) := f(x) + g(\tilde{x}) + y^T(x - \tilde{x}) + \tfrac{\rho}{2}\|x - \tilde{x}\|_2^2,
\]

where ρ > 0 is a parameter and $y \in \mathbb{R}^n$ is a dual variable associated with the consensus constraint. The alternating direction method of multipliers (ADMM) then consists of the following iterations

\[
\begin{aligned}
x^{k+1} &:= \operatorname*{argmin}_{x} L_\rho(x, \tilde{x}^k, y^k) \\
\tilde{x}^{k+1} &:= \operatorname*{argmin}_{\tilde{x}} L_\rho(x^{k+1}, \tilde{x}, y^k) \\
y^{k+1} &:= y^k + \rho(x^{k+1} - \tilde{x}^{k+1}),
\end{aligned}
\]

where k is an iteration counter and ρ > 0 is a step size used in the algorithm. $L_\rho$ is first minimized over the variable $x$, using the most recent values of the other primal variable $\tilde{x}$ and the dual variable $y$; it is then minimized over $\tilde{x}$ using the updated $x$, and finally the dual variable is updated. We observe that the (scaled) dual variable $y^k/\rho$ corresponds to the sum of the consensus errors up to iteration k.


As shown in Parikh and Boyd [73], this method enables handling the two objectives separately and taking constraints into account, as the functions can by definition take on infinite values. The two functions f and g are accessed only through their proximal operators, and it is assumed that these operators can be evaluated efficiently. Using the proximal operators of f and g respectively, and letting y now denote the scaled dual variable (the dual variable of the augmented Lagrangian divided by ρ), the algorithm reads

\[
\begin{aligned}
x^{k+1} &:= \operatorname{prox}_{\rho f}(\tilde{x}^k - y^k) \\
\tilde{x}^{k+1} &:= \operatorname{prox}_{\rho g}(x^{k+1} + y^k) \\
y^{k+1} &:= y^k + x^{k+1} - \tilde{x}^{k+1}.
\end{aligned}
\]
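The three updates can be tried on a toy consensus problem, chosen here only for illustration: f(x) = 0.5·‖x − a‖² and g(x̃) = λ‖x̃‖₁, whose known solution is the soft-thresholding of a with threshold λ. The sketch assumes the scaled form of the dual variable (named `y_scaled`), so the dual update carries no factor ρ. Both proximal steps have closed forms, and all function and variable names are hypothetical.

```python
def soft_threshold(v, tau):
    """Element-wise soft-thresholding with threshold tau."""
    return [max(vi - tau, 0.0) - max(-vi - tau, 0.0) for vi in v]


def admm_l1(a, lam, rho=1.0, num_iters=100):
    """Scaled-form ADMM for: minimize 0.5*||x - a||^2 + lam*||x_tilde||_1, x = x_tilde."""
    n = len(a)
    x = [0.0] * n
    x_tilde = [0.0] * n
    y_scaled = [0.0] * n  # scaled dual variable of the consensus constraint
    for _ in range(num_iters):
        # x-update: argmin 0.5*(x - a)^2 + (rho/2)*(x - (x_tilde - y))^2, per component
        x = [(ai + rho * (xt - yi)) / (1.0 + rho)
             for ai, xt, yi in zip(a, x_tilde, y_scaled)]
        # x_tilde-update: proximal step of lam*|.| at x + y, i.e. threshold lam / rho
        x_tilde = soft_threshold([xi + yi for xi, yi in zip(x, y_scaled)], lam / rho)
        # dual update: accumulate the consensus error x - x_tilde
        y_scaled = [yi + xi - xt for yi, xi, xt in zip(y_scaled, x, x_tilde)]
    return x, x_tilde


x, x_tilde = admm_l1([3.0, -0.5, -2.0], lam=1.0)
print([round(val, 6) for val in x_tilde])
```

At convergence both copies agree, which is the consensus constraint of (5.4) being met.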

5.2.1 Accelerated ADMM

An accelerated variant of the ADMM presented above has been proposed in Goldstein et al. [36] and is presented in Algorithm 1. The convergence performance of the ADMM can be improved by using this predictor-corrector acceleration step, which avoids the inherent spiral movements around the optimum. Global convergence can be guaranteed if both objective functions f and g are strongly convex (see Definition 6).

As our portfolio optimization problem will consider constraints on weights, this requirement is not met. However, Goldstein et al. [36] show that for weakly convex problems, a restart rule can be applied to ensure stability. The restart rule relies simultaneously on both the primal and dual residuals:

\[
e_k = \tfrac{1}{\rho}\|y^k - \hat{y}^k\|_2^2 + \rho\|\tilde{x}^k - \hat{x}^k\|_2^2 .
\]

At every iteration k, we compare $e_k$ to its previous value multiplied by a constant η ∈ [0,1], usually chosen close to 1 (we used η = 0.999 in all our experiments, as indicated in Goldstein et al. [36]). If the combined residual has decreased by a factor of at least η, we apply the acceleration; otherwise we “restart” the algorithm by setting $\alpha_{k+1} = 1$.

Our extensive testing in the portfolio optimization framework has shown that the inclusion of a Nesterov-based acceleration step does not always improve the performance of the algorithm. Following Boyd et al. [14], we will also consider other acceleration techniques to improve the convergence rate in Section 5.6.


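Algorithm 1 Accelerated Alternating Direction Method of Multipliers (aADMM)

Require: Initialize $\tilde{x}^{-1} = \hat{x}^{0}$, $y^{-1} = \hat{y}^{0}$, $\rho > 0$, $\alpha_1 = 1$, $\eta \in [0,1]$
 1: for iteration $k = 1, 2, \dots$ do
 2:   $x^{k} = \operatorname*{argmin}_{x} L_\rho(x, \hat{x}^{k}, \hat{y}^{k})$
 3:   $\tilde{x}^{k} = \operatorname*{argmin}_{\tilde{x}} L_\rho(x^{k}, \tilde{x}, \hat{y}^{k})$
 4:   $y^{k} = \hat{y}^{k} + \rho(x^{k} - \tilde{x}^{k})$
 5:   if $e_k < \eta e_{k-1}$ then
 6:     $\alpha_{k+1} = \dfrac{1 + \sqrt{1 + 4\alpha_k^2}}{2}$
 7:     $\hat{y}^{k+1} = y^{k} + \dfrac{\alpha_k - 1}{\alpha_{k+1}} (y^{k} - y^{k-1})$
 8:     $\hat{x}^{k+1} = \tilde{x}^{k} + \dfrac{\alpha_k - 1}{\alpha_{k+1}} (\tilde{x}^{k} - \tilde{x}^{k-1})$
 9:   else
10:     $\alpha_{k+1} = 1$, $\hat{x}^{k+1} = \tilde{x}^{k}$ and $\hat{y}^{k+1} = y^{k}$
11:   end if
12: end for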
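The acceleration and restart test in steps 5 to 10 of Algorithm 1 can be isolated in a small helper. The sketch below only computes the next value of α and the momentum weight (α_k − 1)/α_{k+1} applied to the differences of successive iterates; a weight of zero corresponds to a restart. The function names are assumptions for this illustration, and the default η = 0.999 is the value reported in the text.

```python
import math


def next_alpha(alpha):
    """Step 6 of Algorithm 1: alpha_{k+1} = (1 + sqrt(1 + 4 * alpha_k^2)) / 2."""
    return (1.0 + math.sqrt(1.0 + 4.0 * alpha * alpha)) / 2.0


def accelerate_or_restart(e_k, e_prev, alpha, eta=0.999):
    """Return (alpha_next, momentum_weight) for the predictor step.

    If the combined residual e_k dropped below eta * e_prev, the momentum weight
    (alpha - 1) / alpha_next is used; otherwise the method restarts with
    alpha_next = 1 and no momentum.
    """
    if e_k < eta * e_prev:
        alpha_next = next_alpha(alpha)
        return alpha_next, (alpha - 1.0) / alpha_next
    return 1.0, 0.0


alpha_2, weight_1 = accelerate_or_restart(e_k=0.5, e_prev=1.0, alpha=1.0)
alpha_3, weight_2 = accelerate_or_restart(e_k=0.2, e_prev=0.5, alpha=alpha_2)
print(weight_1, 0.0 < weight_2 < 1.0)  # 0.0 True
```

The first accelerated step carries no momentum because α₁ = 1; the weight then grows toward one as long as no restart is triggered.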

5.3 Splitting the MPC Problem

5.3.1 Overview

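In this section we show how to fit the optimal control problem defined in (4.5) to the ADMM framework and how to derive the algorithm, relying on the methodology presented in the previous section. For ease of reading, we restate the scenario-based MPC problem

\[
\begin{aligned}
\min_{(x,u)} \quad & \sum_{s=1}^{S}\sum_{t=0}^{T} p^s_t \Big( -(\mu^s_t)^T (x^s_t + u^s_t - b^s_t) + \lambda^s_t (x^s_t + u^s_t - b^s_t)^T \Sigma^s_t (x^s_t + u^s_t - b^s_t) \\
& \qquad\qquad\quad + (u^s_t)^T \operatorname{diag}(c_t)\, u^s_t + \kappa_t^T |u^s_t| \Big) \\
\text{subject to} \quad & x^s_{t+1} = G^s_t (x^s_t + u^s_t) + f^s_t \\
& b^s_{t+1} = G^s_t (b_t) \\
& 1^T u^s_t = 0 \\
& x^s_0 = x_{\text{start}} \\
& x^s_T + u^s_T = 0 \\
& (1^T x^{s*}_t)\,\gamma^{lb}_t \le x^{s*}_t \le (1^T x^{s*}_t)\,\gamma^{ub}_t, \qquad s = 1,\dots,S, \quad t = 0,\dots,T-1,
\end{aligned}
\tag{5.5}
\]

where $f^s_t \in \mathbb{R}^{n\cdot S}$ corresponds to cash flows at time t.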

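We reformulate the problem as a combination of a convex quadratic part Φ(x,u) and a convex non-quadratic part Ψ(x,u)

\[
\begin{aligned}
\min_{(x,u)} \quad & \Phi(x,u) + \Psi(x,u) \\
\text{subject to} \quad & x^s_{t+1} = G^s_t (x^s_t + u^s_t) + f^s_t \\
& b^s_{t+1} = G^s_t (b_t) \\
& 1^T u^s_t = 0 \\
& x^s_0 = x_{\text{start}} \\
& x^s_T + u^s_T = 0 \\
& (1^T x^{s*}_t)\,\gamma^{lb}_t \le x^{s*}_t \le (1^T x^{s*}_t)\,\gamma^{ub}_t, \qquad s = 1,\dots,S, \quad t = 0,\dots,T-1.
\end{aligned}
\tag{5.6}
\]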

The splitting decision is not unique and the choice can affect the speed of the algorithm significantly. Both sub-problems, when no closed-form solution exists, should be cheap to solve, e.g. with the help of proximal minimizations or simple projections. Expensive operations such as matrix inversions should be avoided, and quantities that remain constant over the iterations should be pre-factored.

We chose here to split the objective function in (5.5) into a quadratic and a non-quadratic part (i.e. a two-set splitting), which are solved alternately until convergence is reached.

5.3.2 Notation

Input data used in the optimization problem are the initial and final portfolios (initial and terminal states) $x_{\text{start}}, x_{\text{final}} \in \mathbb{R}^n$, the dynamics transition (asset gains) matrix G, the self-financing constraint, the vector of cash flows f, the quadratic cost elements Q, R, V, q and r (see Chapter 4, Section 4.7.1) and the non-quadratic cost function Ψ (see Chapter 4, Section 4.7.2). All data, except the initial portfolio, are given for the investment horizon t = 0, . . . ,T and for the scenarios s = 1, . . . ,S.

We use $x = (x_0,\dots,x_T) \in \mathbb{R}^{S(n\cdot(T+1))}$ to denote the portfolio compositions (states) and $u = (u_0,\dots,u_T) \in \mathbb{R}^{S(n\cdot(T+1))}$ for the trade sequences (controls). Moreover, we define $w = (x,u) \in \mathbb{R}^{S(2n\cdot(T+1))}$, which corresponds to the concatenated portfolios and trades for all scenarios and over the investment horizon.

We start by expressing the equality constraints in the form of an indicator function. To this end, we define the set D of the state and control variables that satisfy the linear dynamics, the self-financing constraint as well as the starting and final portfolio constraints. We denote by $1_D$ the indicator function of D, which is a closed proper convex function as explained in Definition 8. D reads

\[
\begin{aligned}
D = \big\{ (x,u) \;\big|\; & x^s_0 = x_{\text{start}},\; x^s_T = x_{\text{final}},\; x^s_{t+1} = G^s_t (x^s_t + u^s_t) + f^s_t,\; b^s_{t+1} = G^s_t (b_t), \\
& 1^T u^s_t = 0,\; s = 1,\dots,S,\; t = 0,\dots,T-1 \big\}.
\end{aligned}
\tag{5.7}
\]



5.3.3 ADMM Formulation

We formulate the scenario-based MPC portfolio optimization problem to fit the ADMM framework. The objective function defined in (5.6) is composed of S · (T + 1) convex objective terms. We already split the function into a convex quadratic function Φ(x,u) and a convex non-quadratic one denoted Ψ(x,u), which we restate here for convenience:

\[
\begin{aligned}
\min_{(x,u)} \quad & \Phi(x,u) + \Psi(x,u) \\
\text{subject to} \quad & x^s_{t+1} = G^s_t (x^s_t + u^s_t) + f^s_t \\
& b^s_{t+1} = G^s_t (b_t) \\
& 1^T u^s_t = 0 \\
& x^s_0 = x_{\text{start}} \\
& x^s_T + u^s_T = 0 \\
& (1^T x^{s*}_t)\,\gamma^{lb}_t \le x^{s*}_t \le (1^T x^{s*}_t)\,\gamma^{ub}_t, \qquad s = 1,\dots,S, \quad t = 0,\dots,T-1.
\end{aligned}
\]

The quadratic part of the function, denoted by Φ(x,u), reads

\[
\Phi(x,u) = \frac{1}{2}
\begin{pmatrix} x \\ u \\ 1 \end{pmatrix}^{T}
\begin{pmatrix} Q & V & q \\ V^T & R & r \\ q^T & r^T & 0 \end{pmatrix}
\begin{pmatrix} x \\ u \\ 1 \end{pmatrix},
\]

whereas the non-quadratic function Ψ(x,u) reads

\[
\Psi(x,u) = \kappa^{T}|u|.
\]

We refer to Chapter 4, Sections 4.7.1 and 4.7.2, respectively, for the derivation.

The equality constraints can be easily integrated into the convex optimal control problem by the inclusion of the indicator function $1_D$ defined in (5.7), which contains the equality constraints of the optimization problem. We can now rewrite the global problem

\[
\begin{aligned}
\min_{(x,u)} \quad & 1_D(x,u) + \Phi(x,u) + \Psi(x,u) \\
\text{subject to} \quad & (1^T x^{s*}_t)\,\gamma^{lb}_t \le x^{s*}_t \le (1^T x^{s*}_t)\,\gamma^{ub}_t, \qquad s = 1,\dots,S, \quad t = 0,\dots,T-1.
\end{aligned}
\tag{5.8}
\]

Relying on the method of augmented Lagrangians, we can formulate the optimal control problem defined in (5.8) in a consensus form as in (5.4) above:

\[
\begin{aligned}
\text{minimize} \quad & 1_D(x,u) + \Phi(x,u) + \Psi(\tilde{x},\tilde{u}) \\
\text{subject to} \quad & (x,u) = (\tilde{x},\tilde{u}).
\end{aligned}
\tag{5.9}
\]

We note that the inequality constraint, related to portfolio weights, is not included in this formulation. We will provide an extension to the ADMM methodology in Section 5.6 to include portfolio weight constraints without using a three-set splitting.

We split here the state-control variables and we add the consensus constraint that they must agree. The first part of the objective, $1_D(x,u) + \Phi(x,u)$, contains the quadratic objective, the return dynamics, the self-financing restriction, as well as the initial and terminal portfolio constraints. The second part, $\Psi(\tilde{x},\tilde{u})$, is separable across assets and contains the non-quadratic objective on the states and controls. The equality constraint simply states that the two pairs of state and control variables should be in consensus.

5.3.4 Splitting Operator

We define two pairs of state and control variables, denoted $x, u \in \mathbb{R}^{S(n\cdot(T+1))}$ and $\tilde{x}, \tilde{u} \in \mathbb{R}^{S(n\cdot(T+1))}$. We will use two dual variables z and y. We initialize the four variables $(\tilde{x}^0, \tilde{u}^0)$ and $(z^0, y^0)$ and iterate for k = 0, 1, . . ..

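We can write the augmented Lagrangian of the splitting problem in (5.9)

\[
\begin{aligned}
L_\rho(x,\tilde{x},u,\tilde{u},z,y) := {} & 1_D(x,u) + \Phi(x,u) + \Psi(\tilde{x},\tilde{u}) \\
& - \rho (z,y)^T (x - \tilde{x}, u - \tilde{u}) + \tfrac{\rho}{2}\|(x - \tilde{x}, u - \tilde{u})\|_2^2,
\end{aligned}
\]

which we can easily reformulate in a squared form

\[
L_\rho(x,\tilde{x},u,\tilde{u},z,y) := 1_D(x,u) + \Phi(x,u) + \Psi(\tilde{x},\tilde{u}) + \tfrac{\rho}{2}\|(x - \tilde{x}, u - \tilde{u}) - (z,y)\|_2^2 - \tfrac{\rho}{2}\|(z,y)\|_2^2 .
\]

By minimizing over the primal variables and maximizing over the dual variables we derive the algorithm updates

\[
\begin{aligned}
(x_{\text{opt}}, u_{\text{opt}}) &:= \operatorname*{argmin}_{(x,u)} \Big( 1_D(x,u) + \Phi(x,u) + \tfrac{\rho}{2}\|(x,u) - (\tilde{x},\tilde{u}) - (z,y)\|_2^2 \Big) \\
(\tilde{x}_{\text{opt}}, \tilde{u}_{\text{opt}}) &:= \operatorname*{argmin}_{(\tilde{x},\tilde{u})} \Big( \Psi(\tilde{x},\tilde{u}) + \tfrac{\rho}{2}\|(\tilde{x},\tilde{u}) - (x,u) + (z,y)\|_2^2 \Big) \\
(z_{\text{opt}}, y_{\text{opt}}) &:= \operatorname*{argmax}_{(z,y)} \Big( -\tfrac{\rho}{2}\|(z,y)\|_2^2 + \tfrac{\rho}{2}\|(z,y) - (x - \tilde{x}, u - \tilde{u})\|_2^2 \Big).
\end{aligned}
\]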

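By starting from initial values for the pairs of variables $(\tilde{x}^0, \tilde{u}^0)$ and $(z^0, y^0)$, for k = 0, . . . , max.iter, the algorithm reads

\[
\begin{aligned}
(x^{k+1}, u^{k+1}) &:= \operatorname*{argmin}_{(x,u)} \Big( 1_D(x,u) + \Phi(x,u) + \tfrac{\rho}{2}\|(x,u) - (\tilde{x}^k,\tilde{u}^k) - (z^k,y^k)\|_2^2 \Big) && \text{(5.10a)} \\
(\tilde{x}^{k+1}, \tilde{u}^{k+1}) &:= \operatorname*{argmin}_{(\tilde{x},\tilde{u})} \Big( \Psi(\tilde{x},\tilde{u}) + \tfrac{\rho}{2}\|(\tilde{x},\tilde{u}) - (x^{k+1},u^{k+1}) + (z^k,y^k)\|_2^2 \Big) && \text{(5.10b)} \\
(z^{k+1}, y^{k+1}) &:= (z^k,y^k) + (\tilde{x}^{k+1},\tilde{u}^{k+1}) - (x^{k+1},u^{k+1}), && \text{(5.10c)}
\end{aligned}
\]

where k corresponds to the iteration counter and ρ > 0 is the step size of the algorithm. $(z^k, y^k) \in \mathbb{R}^{S((2n)(T+1))}$ is the scaled dual variable related to the consensus constraint.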

We begin by minimizing a sum of convex quadratic functions with respect to the state and control variables, subject to a return dynamics constraint. We thus have a convex quadratic control problem to solve. The second step involves the minimization of a convex objective function that is separable across assets and scenarios and is thus easily parallelizable. The right-hand side of Equation (5.10b) is the proximal operator of the function Ψ evaluated at $(x^{k+1} - z^k, u^{k+1} - y^k)$.

We will show in Sections 5.4 and 5.5 how to efficiently solve the two convex sub-problems for all scenarios s = 1, . . . ,S.

Termination Criterion

As detailed in Boyd et al. [14], the ADMM algorithm converges under two mild assumptions, provided a solution exists.

Assumption 2. The extended-real-valued functions $f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ and $g : \mathbb{R}^m \to \mathbb{R} \cup \{+\infty\}$ are closed proper convex functions.

Assumption 3. The unaugmented Lagrangian $L_0$ has a saddle point.

Assumption 2 is satisfied if and only if the epigraph of each function, as defined in (5.1), is a closed nonempty convex set.

Assumption 3 means explicitly that there exist $(x_{\text{opt}}, \tilde{x}_{\text{opt}}, y_{\text{opt}})$ for which

\[
L_0(x_{\text{opt}}, \tilde{x}_{\text{opt}}, y) \le L_0(x_{\text{opt}}, \tilde{x}_{\text{opt}}, y_{\text{opt}}) \le L_0(x, \tilde{x}, y_{\text{opt}})
\]

holds for all $x$, $\tilde{x}$ and $y$. For further details we refer to [14].

The primal and dual residuals, denoted respectively $r^k$ and $s^k$, for (5.9) are given by

\[
r^k = (x^k, u^k) - (\tilde{x}^k, \tilde{u}^k), \qquad s^k = \rho\big( (\tilde{x}^k, \tilde{u}^k) - (\tilde{x}^{k-1}, \tilde{u}^{k-1}) \big).
\]


$r^k$ and $s^k$ converge to zero under the defined algorithm. A reasonable termination criterion is that both residuals fall below defined thresholds

\[
\|r^k\|_2 < \varepsilon^{\text{primal}}, \qquad \|s^k\|_2 < \varepsilon^{\text{dual}},
\]

where $\varepsilon^{\text{primal}} > 0$ and $\varepsilon^{\text{dual}} > 0$ are tolerances for primal and dual feasibility, respectively. Following Boyd et al. [14], these thresholds can be assigned in the following way:

\[
\begin{aligned}
\varepsilon^{\text{primal}} &= \varepsilon^{\text{abs}} \sqrt{(T+1)(2n)} + \varepsilon^{\text{rel}} \max\big\{ \|(x^k,u^k)\|_2, \|(\tilde{x}^k,\tilde{u}^k)\|_2 \big\} \\
\varepsilon^{\text{dual}} &= \varepsilon^{\text{abs}} \sqrt{(T+1)(2n)} + \varepsilon^{\text{rel}} \|(z^k,y^k)\|_2,
\end{aligned}
\]

where $\varepsilon^{\text{abs}} > 0$ and $\varepsilon^{\text{rel}} > 0$ are absolute and relative tolerance levels, respectively. In this way the tolerances scale with the size of the problem as well as with the magnitude of the variables.

In contrast to interior-point algorithms, the splitting method converges rapidly to modest accuracy, which is sufficient for portfolio construction purposes. However, convergence to accurate solutions can take many iterations.
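A possible implementation of this stopping test is sketched below for flat Python lists. The function names and the default tolerance values are assumptions for illustration (the text does not state the values used). The vector length plays the role of (T+1)(2n).

```python
import math


def l2_norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(c * c for c in v))


def has_converged(w, w_tilde, w_tilde_prev, dual, rho, eps_abs=1e-4, eps_rel=1e-3):
    """Check the primal/dual residual test for stacked variables w = (x, u).

    w            : first primal set (x^k, u^k)
    w_tilde      : second primal set at iteration k
    w_tilde_prev : second primal set at iteration k - 1
    dual         : scaled dual variable (z^k, y^k)
    """
    primal_residual = [a - b for a, b in zip(w, w_tilde)]
    dual_residual = [rho * (a - b) for a, b in zip(w_tilde, w_tilde_prev)]
    size_term = eps_abs * math.sqrt(len(w))  # len(w) stands for (T + 1) * 2n
    eps_primal = size_term + eps_rel * max(l2_norm(w), l2_norm(w_tilde))
    eps_dual = size_term + eps_rel * l2_norm(dual)
    return l2_norm(primal_residual) < eps_primal and l2_norm(dual_residual) < eps_dual


print(has_converged([1.0, 2.0], [1.0, 2.0], [1.0, 2.0], [0.0, 0.0], rho=0.1))  # True
print(has_converged([1.0, 2.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0], rho=0.1))  # False
```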

5.4 Solving the Convex Quadratic Control Problem

The convex quadratic control terms of the function Φ(x,u) do not include any transaction costs and thus depend only on the terms known at time t for each scenario s. We derive the analytical solution in the current section.

The first step of the splitting algorithm consists in solving a linear-quadratic optimal control problem that can be expressed as

\[
\begin{aligned}
\text{minimize} \quad & \tfrac{1}{2} w^T E w + c^T w \\
\text{subject to} \quad & A w = d,
\end{aligned}
\]

where $w \in \mathbb{R}^{(2n)(T+1)}$ includes the state and control variables (x and u). The quadratic and affine terms are grouped in the objective function and the equality constraint contains the return dynamics of the system, the self-financing constraint as well as the starting and final portfolio restrictions. We define for s = 1, . . . ,S

\[
c = \begin{pmatrix} c^1 \\ c^2 \\ \vdots \\ c^s \\ \vdots \\ c^S \end{pmatrix},
\qquad
d = \begin{pmatrix} d^1 \\ d^2 \\ \vdots \\ d^s \\ \vdots \\ d^S \end{pmatrix},
\]

\[
E = \begin{pmatrix}
E^1 & 0 & \cdots & 0 \\
0 & E^2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & E^S
\end{pmatrix},
\qquad
A = \begin{pmatrix}
A^1 & 0 & \cdots & 0 \\
0 & A^2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & A^S
\end{pmatrix},
\]

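where

\[
c^s = \begin{pmatrix}
q^s_0 - \rho\big((\tilde{x}^s_0)^k + (z^s_0)^k\big) \\
r^s_0 - \rho\big((\tilde{u}^s_0)^k + (y^s_0)^k\big) \\
\vdots \\
q^s_t - \rho\big((\tilde{x}^s_t)^k + (z^s_t)^k\big) \\
r^s_t - \rho\big((\tilde{u}^s_t)^k + (y^s_t)^k\big) \\
\vdots \\
q^s_T - \rho\big((\tilde{x}^s_T)^k + (z^s_T)^k\big) \\
r^s_T - \rho\big((\tilde{u}^s_T)^k + (y^s_T)^k\big)
\end{pmatrix},
\qquad
d^s = \begin{pmatrix}
x_{\text{start}} \\ f^s_0 \\ \vdots \\ f^s_t \\ \vdots \\ f^s_{T-1} \\ x_{\text{final}}
\end{pmatrix},
\qquad
A^s = \begin{pmatrix} A^s_a \\ A^s_b \\ A^s_c \end{pmatrix},
\]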

\[
E^s = \begin{pmatrix}
Q_0 + \rho I & V_0 & & & & & \\
V_0^T & R_0 + \rho I & & & & & \\
& & \ddots & & & & \\
& & & Q_t + \rho I & V_t & & \\
& & & V_t^T & R_t + \rho I & & \\
& & & & & \ddots & \\
& & & & & & \begin{matrix} Q_T + \rho I & V_T \\ V_T^T & R_T + \rho I \end{matrix}
\end{pmatrix}
\]



and

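\[
A^s_a = \begin{pmatrix}
I & 0 & 0 & 0 & \cdots & 0 & 0 & 0 & 0 \\
-G^s_0 & -G^s_0 & I & 0 & \cdots & 0 & 0 & 0 & 0 \\
0 & 0 & -G^s_1 & -G^s_1 & \cdots & 0 & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & -G^s_{T-1} & -G^s_{T-1} & I & 0
\end{pmatrix}
\]

\[
A^s_b = \begin{pmatrix}
0 & I & 0 & 0 & \cdots & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & I & \cdots & 0 & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & 0 & 0 & 0 & I
\end{pmatrix}
\]

\[
A^s_c = \begin{pmatrix} 0 & 0 & 0 & \cdots & 0 & 0 & 0 & I & I \end{pmatrix}.
\]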

We note that $E^s$ is a block diagonal matrix, formed by T + 1 blocks of size 2n × 2n, and is positive definite since the parameter ρ is positive.

$A^s_a$ encodes the dynamics (asset gains), $A^s_b$ contains the self-financing constraint, whereas $A^s_c$ imposes the final portfolio restriction. We note that A contains identity blocks that make it full rank ((T + 1)n). We thus face a standard equality-constrained quadratic program that can be solved efficiently due to the special structure (sparse, block diagonal) displayed by the matrix A.

Necessary and sufficient optimality conditions for this type of problem are provided by the Karush-Kuhn-Tucker (KKT) conditions:

\[
\begin{pmatrix} E & A^T \\ A & 0 \end{pmatrix}
\begin{pmatrix} w \\ \eta \end{pmatrix}
=
\begin{pmatrix} -c \\ d \end{pmatrix},
\tag{5.12}
\]

where $\eta \in \mathbb{R}^{(T+1)n}$ are dual variables related to the equality constraints. This is a linear system with S(3n)(T + 1) equations and variables. As E is positive definite and A is full rank, the KKT matrix in (5.12) is invertible, but substantial computational effort can be saved by exploiting its structure (sparsity) and avoiding the inversion.
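The KKT system (5.12) can be illustrated on a tiny dense instance. The sketch below assumes NumPy is available and uses made-up data (a 2 × 2 positive definite E and a single equality constraint); it solves the system with a generic dense solver rather than the structured sparse method discussed in the text, and then checks both optimality conditions.

```python
import numpy as np


def solve_kkt(E, A, c, d):
    """Solve [[E, A^T], [A, 0]] [w; eta] = [-c; d] for the primal w and dual eta."""
    n, m = E.shape[0], A.shape[0]
    kkt = np.block([[E, A.T], [A, np.zeros((m, m))]])
    rhs = np.concatenate([-c, d])
    solution = np.linalg.solve(kkt, rhs)
    return solution[:n], solution[n:]


# Illustrative data only: minimize 0.5 * w^T E w + c^T w subject to w_1 + w_2 = 1.
E = np.array([[2.0, 0.5], [0.5, 1.0]])
A = np.array([[1.0, 1.0]])
c = np.array([-1.0, 0.0])
d = np.array([1.0])

w, eta = solve_kkt(E, A, c, d)
print(np.allclose(A @ w, d))                # True: primal feasibility
print(np.allclose(E @ w + A.T @ eta, -c))   # True: stationarity
```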

The splitting algorithm used requires solving this quadratic problem many times (i.e. at each iteration), but we note that the coefficient matrix remains unchanged; only the values of c change at each iteration. We choose to factor and cache the coefficient matrix of (5.12) by following a modern approach described in Boyd and Vandenberghe [13] and applied in Stathopoulos et al. [92], using a sparse $LDL^T$ decomposition. The authors show that the coefficient matrix can be factored as

\[
\begin{pmatrix} E & A^T \\ A & 0 \end{pmatrix} = P L D L^T P^T,
\]

where P corresponds to a permutation matrix, chosen based on the sparsity of (5.12) so as to obtain a lower-triangular matrix L with ones on the diagonal and few non-zeros, as well as a stable factorization (see Bunch and Parlett [17]). D is block diagonal with 1 × 1 and 2 × 2 blocks. This factorization enables us to solve (5.12) as

\[
\begin{pmatrix} w \\ \eta \end{pmatrix}
= P \left( L^{-T} \left( D^{-1} \left( L^{-1} \left( P^T \begin{pmatrix} -c \\ d \end{pmatrix} \right) \right) \right) \right). \tag{5.13}
\]

Forward and backward substitution are used to carry out the multiplication by $L^{-1}$ and $L^{-T}$ respectively, thus avoiding the explicit formation of these inverses. As the algorithm solves this optimization problem many times, changing only the values of c, we can compute and cache P, L and $D^{-1}$. We then only have to solve the system using (5.13) at each iteration step.
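The factor-once, solve-many pattern can be sketched as follows. Note the substitution: instead of the sparse permuted LDLᵀ factorization described above, this illustration uses SciPy's dense LU routines `lu_factor` and `lu_solve`, simply because they make the caching idea easy to show on a toy matrix. The data and the helper name `solve_with_cache` are assumptions.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

# Illustrative KKT matrix built from a 2 x 2 block E and one equality constraint.
E = np.array([[2.0, 0.5], [0.5, 1.0]])
A = np.array([[1.0, 1.0]])
d = np.array([1.0])
kkt = np.block([[E, A.T], [A, np.zeros((1, 1))]])

# Factor once; the matrix does not change between ADMM iterations.
cached_factorization = lu_factor(kkt)


def solve_with_cache(c):
    """Solve the KKT system for a new linear term c using the cached factors."""
    rhs = np.concatenate([-c, d])
    solution = lu_solve(cached_factorization, rhs)
    return solution[:2], solution[2:]


# Only the right-hand side changes from one iteration to the next.
for c in (np.array([-1.0, 0.0]), np.array([0.3, -0.7])):
    w, eta = solve_with_cache(c)
    print(np.allclose(kkt @ np.concatenate([w, eta]), np.concatenate([-c, d])))  # True
```

Each additional solve then costs only the triangular substitutions, which is what makes repeated solves cheap relative to refactoring.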

Regularization. In order to ensure that the factorization detailed above always exists and is stable, as in Saunders [87] we regularize the system by writing

\[
\begin{pmatrix} E & A^T \\ A & -\varepsilon I \end{pmatrix},
\]

where ε > 0 is a constant. After regularization the matrix is quasi-definite and a stable factorization can be carried out for any permutation P. Saunders [87] suggests a value of $10^{-8}$ for ε, which ensures stability without modifying the optimization problem substantially.

5.5 Solving the Convex Non-Quadratic Problem

In the second step of the ADMM algorithm, we solve the following non-quadratic problems S · T times

\[
\min_{(\tilde{x}^s_t, \tilde{u}^s_t)} \; \Psi_t(\tilde{x}^s_t, \tilde{u}^s_t) + \tfrac{\rho}{2}\|(\tilde{x}^s_t, \tilde{u}^s_t) - (v^s_t, w^s_t)\|_2^2, \qquad s = 1,\dots,S.
\]

These are straightforward single-period proximal steps. In our scenario-based MPC optimal control problem defined in (5.6), $\Psi(x,u) = \kappa^T|u|$ represents the transaction costs. This problem can be solved efficiently with the proximal operator of the absolute value, which is given by the soft-thresholding operator detailed in (5.2).

This optimization step is separable over assets and can be evaluated in parallel.
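For the cost κᵀ|u| the single-period proximal step has a closed form: minimizing κ_i|u_i| + (ρ/2)(u_i − w_i)² component by component gives a soft-thresholding of w_i with threshold κ_i/ρ, while the state part is left unchanged because Ψ does not depend on it. The sketch below illustrates this; the function name and the numbers are hypothetical.

```python
def prox_transaction_cost(v, w, kappa, rho):
    """Minimize kappa^T |u| + (rho / 2) * ||(x, u) - (v, w)||^2 over (x, u).

    v, w  : points at which the proximal operator is evaluated (states, controls)
    kappa : per-asset bid-ask cost
    rho   : ADMM step size
    """
    x_tilde = list(v)  # the cost does not involve the states, so they are copied
    u_tilde = [max(wi - ki / rho, 0.0) - max(-wi - ki / rho, 0.0)
               for wi, ki in zip(w, kappa)]
    return x_tilde, u_tilde


# Two assets with thresholds kappa / rho = 0.25 and 0.5.
x_tilde, u_tilde = prox_transaction_cost(
    v=[1.0, 2.0], w=[1.0, -0.25], kappa=[0.5, 1.0], rho=2.0)
print(x_tilde, u_tilde)  # [1.0, 2.0] [0.75, 0.0]
```

The second trade is smaller than its threshold and is therefore cancelled, which reflects the no-trade region induced by proportional costs.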


5.6 Extending the State-of-the-Art

In this section we present our contribution, which extends existing techniques to include the inequality constraint in (5.5), related to portfolio weights. Whereas previous research resorts to a three-set splitting scheme in the ADMM algorithm to consider equality and inequality constraints separately, we propose here a new approach, named embedded update splitting, to include such constraints in the two-set splitting scheme defined in (5.6).

Although the scenario-based MPC problem is fully separable across scenarios, we show in Section 5.4 that the quadratic convex control problem can be solved efficiently for all scenarios simultaneously. We provide here an additional contribution to the state-of-the-art by using a probability-weighted norm for the primal and dual residuals of the ADMM splitting problem in the termination criterion, which improves the convergence speed significantly.

5.6.1 Improving Convergence

As mentioned in Section 5.2.1, the Accelerated ADMM Method does not always improve

the convergence performance of the algorithm. A so-called relaxation method, relying on

the combination of the two iterative methods of Jacobi and Gauss-Seidel (explained in Ap-

pendix A.1), has been used in previous work (see Eckstein and Bertsekas [26], Eckstein [25],

Eckstein and Ferris [27]). Our extensive tests in a portfolio optimization context have shown

that the number of iterations can be substantially reduced by using this procedure.

Relaxation Method

The methodology simply replaces $(x^{k+1}, u^{k+1})$ of the first update equation in (5.10) with

\[
(x^{k+1}, u^{k+1}) = \alpha (x^*, u^*) + (1-\alpha)(\tilde{x}^k, \tilde{u}^k), \tag{5.14}
\]

where $(x^*, u^*)$ denotes the solution of the first update and α ∈ [0,2] corresponds to the relaxation parameter. When α > 1 this procedure is referred to as over-relaxation, whereas it is referred to as under-relaxation when α < 1. Eckstein [25] and Eckstein and Ferris [27] suggest values of α between 1.5 and 1.8 to improve the convergence. We note that if we set α = 1, we get the Gauss-Seidel Method. In line with common practice, we use a value of 1.8 subsequently.
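The relaxation step is a one-line blend of the freshly computed first primal set with the previous second primal set. A minimal sketch, with a hypothetical function name and flat lists standing for the stacked states and controls:

```python
def relax(w_star, w_tilde_prev, alpha=1.8):
    """Blend the new first primal set w_star with the previous second primal set.

    alpha > 1 is over-relaxation, alpha < 1 under-relaxation, and alpha = 1
    returns w_star unchanged.
    """
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(w_star, w_tilde_prev)]


print(relax([1.0, 2.0], [1.0, 2.0]))             # [1.0, 2.0]: no change once in consensus
print(relax([1.0, 0.5], [0.0, 0.0], alpha=1.0))  # [1.0, 0.5]: plain update
```

With α > 1 the relaxed point is pushed beyond the new iterate, away from the previous second primal set, which is the source of the reduction in iteration count reported in the text.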

Probability-weighted Residuals

The scenario-based MPC approach and the ADMM algorithm used are fully separable across scenarios and can be computed in parallel. In previous research, Kang et al. [45] show that the computations required by the algorithm scale linearly with the number of scenarios.

Although our tests confirmed the convergence of the scenario-based MPC portfolio problem expressed as a two-stage splitting problem, we suggest in this section an alteration of the termination criterion defined in Section 5.3.4. In Section 5.4 we benefit from the particular structure of the first sub-problem and solve the quadratic minimization for all scenarios at once. We build on this appealing feature and propose to include the probabilities of the scenarios in the termination criterion. We show that the two-stage splitting problem converges faster when using probability-weighted primal and dual residuals, regardless of the size of the optimization problem considered.

Using the respective probabilities assigned to the scenarios, we can write the weighted primal and dual residuals, denoted respectively $r^k_w$ and $s^k_w$, for (5.9) as

\[
\begin{aligned}
r^k_w &= \sum_{s=1}^{S} p_s \big( (x^k_s, u^k_s) - (\tilde{x}^k_s, \tilde{u}^k_s) \big) \\
s^k_w &= \sum_{s=1}^{S} p_s \cdot \rho \big( (\tilde{x}^k_s, \tilde{u}^k_s) - (\tilde{x}^{k-1}_s, \tilde{u}^{k-1}_s) \big).
\end{aligned}
\tag{5.15}
\]

$r^k_w$ and $s^k_w$ converge to zero under the defined algorithm. The termination criterion is satisfied when both residuals are below defined thresholds

\[
\|r^k_w\|_2 < \varepsilon^{\text{primal}}_w, \qquad \|s^k_w\|_2 < \varepsilon^{\text{dual}}_w,
\]

where $\varepsilon^{\text{primal}}_w > 0$ and $\varepsilon^{\text{dual}}_w > 0$ are tolerances for primal and dual feasibility, respectively. These thresholds rely on the same approach used in Section 5.3.4, except that we again resort to a probability-weighted approach. We define the probability-weighted first and second sets of primal variables as

\[
\begin{aligned}
x^k_w &= \sum_{s=1}^{S} p_s x^k_s, & \tilde{x}^k_w &= \sum_{s=1}^{S} p_s \tilde{x}^k_s \\
u^k_w &= \sum_{s=1}^{S} p_s u^k_s, & \tilde{u}^k_w &= \sum_{s=1}^{S} p_s \tilde{u}^k_s,
\end{aligned}
\]

where $x_w$, $\tilde{x}_w$, $u_w$ and $\tilde{u}_w$ denote the weighted first and second primal states and controls, respectively. We compute the probability-weighted dual variables $z^k_w$ and $y^k_w$ as

\[
z^k_w = \sum_{s=1}^{S} p_s z^k_s, \qquad y^k_w = \sum_{s=1}^{S} p_s y^k_s .
\]

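Using the weighted norms in (5.15) as well as the weighted state, control and dual variables just defined, the termination criterion reads

\[
\begin{aligned}
\varepsilon^{\text{primal}}_w &= \varepsilon^{\text{abs}} \sqrt{(T+1)(2n)} + \varepsilon^{\text{rel}} \max\big\{ \|(x^k_w, u^k_w)\|_2, \|(\tilde{x}^k_w, \tilde{u}^k_w)\|_2 \big\} \\
\varepsilon^{\text{dual}}_w &= \varepsilon^{\text{abs}} \sqrt{(T+1)(2n)} + \varepsilon^{\text{rel}} \|(z^k_w, y^k_w)\|_2,
\end{aligned}
\tag{5.16}
\]

where $\varepsilon^{\text{abs}} > 0$ and $\varepsilon^{\text{rel}} > 0$ are absolute and relative tolerance levels, respectively.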
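The probability-weighted residuals of (5.15) can be computed as in the sketch below, where each scenario contributes one flat list holding its stacked states and controls. The function names and the two-scenario data are assumptions for illustration.

```python
import math


def weighted_sum(probs, vectors):
    """Probability-weighted sum of per-scenario vectors."""
    length = len(vectors[0])
    return [sum(p * vec[i] for p, vec in zip(probs, vectors)) for i in range(length)]


def weighted_residual_norms(probs, w, w_tilde, w_tilde_prev, rho):
    """Return (||r_w||_2, ||s_w||_2) for per-scenario lists of stacked (x, u) vectors."""
    primal = [[a - b for a, b in zip(ws, wts)] for ws, wts in zip(w, w_tilde)]
    dual = [[rho * (a - b) for a, b in zip(wts, wps)]
            for wts, wps in zip(w_tilde, w_tilde_prev)]
    r_w = weighted_sum(probs, primal)
    s_w = weighted_sum(probs, dual)
    norm = lambda vec: math.sqrt(sum(c * c for c in vec))
    return norm(r_w), norm(s_w)


probs = [0.5, 0.5]
w = [[1.0, 0.0], [0.0, 1.0]]
w_tilde = [[0.0, 0.0], [0.0, 0.0]]
r_norm, s_norm = weighted_residual_norms(probs, w, w_tilde, w_tilde, rho=2.0)
print(round(r_norm, 4), s_norm)  # 0.7071 0.0
```

Because the residuals are averaged before the norm is taken, the weighted norm can never exceed the probability-weighted average of the per-scenario norms (here 0.7071 against 1.0), which is consistent with the criterion being met in fewer iterations.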



We illustrate the improved convergence speed of our termination criterion by comparing the number of iterations needed by the weighted and non-weighted schemes for the small, medium, large and extra-large optimization problems detailed in Section 5.6.3. Figure 5.1 shows that even for small problems the number of iterations is reduced by using the weighted methodology, and that this difference increases with the size of the problem.

[Figure 5.1, upper panel: bar chart of the number of iterations for the small, medium, large and extra-large problems, unweighted vs. weighted criterion]

                                     Small    Medium    Large    Extra Large
States & controls (n)                    4        20       20             50
Horizon (T)                              5        10       15             20
Number of scenarios (S)                  2         5        7             10
Step size (ρ)                         0.10      0.10     0.10           0.10
Unweighted: Solve time (sec)          1.13     31.61    95.19         764.40
Unweighted: Number of iterations        43       431      561            929
Unweighted: Objective value          -0.78     -1.55    -2.44          -4.86
Weighted: Solve time (sec)            1.19     22.95    84.22         501.27
Weighted: Number of iterations          41       330      375            549
Weighted: Objective value            -0.78     -1.55    -2.44          -4.86

Figure 5.1 – Convergence properties: weighted vs. unweighted scheme

Previous research also considered an adaptive step size parameter ρ. We have not followed this practice in this thesis, as a variation of ρ would require a recalculation of the pre-computed matrix in the convex-quadratic control problem in Section 5.4.

In Section 5.7, we identify avenues for future research in this direction.



5.6.2 Extended Two-Set Splitting

Stathopoulos et al. [92] show how to formulate a standard convex quadratic optimal control problem under equality and inequality constraints with a three-set decomposition scheme. In the three-set setting, the objective function, the equality constraints and the inequality constraints are split into three sets of variables. O'Donoghue et al. [72] propose a simple scheme for taking straightforward long-only restrictions into account.

We propose here a modification of the update step given by the soft-thresholding operator

in Equation (5.10c). To derive the modified update step, we rely on a special case met in

stochastic portfolio optimization, assuming the absence of transaction costs. In this simplified

case, the optimal policy determining the trade sequence, denoted by φ(x,u), can be expressed

as a function of the post-trade portfolio x∗:

φ(x,u) =φ(x∗).

Our optimization problem in (5.5) would reduce to a convex quadratic problem, with a linear

dynamics and inequality constraints. This problem would only require a minimization over a

function of x∗ = x +u to find the optimum x∗opt and the resulting optimal policy would be

retrieved by taking u = x∗opt −x.

We can thus describe the optimal policy problem as follows. We start by solving

$$\min_{x^{*}} \; \phi(x^{*}) \quad \text{subject to} \quad x^{*} \in \mathcal{C},$$

over the variables x∗ = x + u. The optimal policy then reduces to rebalancing the portfolio to match the optimal post-trade portfolio. This affine optimal policy reads

$$\phi_{opt}(u) = x^{*}_{opt} - x. \tag{5.17}$$

Based on this principle, we show here how to embed this straightforward rebalancing scheme into the update step in Equation (5.10c) in order to account for weight constraints. To this end, we modify the ADMM algorithm in (5.10) and detail the procedure:

1. We solve the constrained quadratic problem in (5.10b) and retrieve the updated states and controls $x_{opt}$ and $u_{opt}$ of the first primal set.

2. We solve the soft-thresholding minimization in (5.10c) and retrieve the updated controls $\tilde{u}_{opt}$ of the second primal set.



3. We use the projection onto a box defined in (5.3) to impose weight restrictions and retrieve the optimal post-trade portfolio:

$$lb = \left(\mathbf{1}^{T}(x + u)\right)\gamma^{lb}, \qquad ub = \left(\mathbf{1}^{T}(x + u)\right)\gamma^{ub},$$

$$x^{*}_{opt} := \Pi_{\mathcal{C}}(\tilde{x} + \tilde{u}),$$

where $\gamma^{lb}$ and $\gamma^{ub} \in \mathbb{R}^{(n \cdot S)(T+1)}$ contain the lower and upper bounds, respectively, which ensure that the value held in asset i is at least the fraction $(\gamma^{lb}_t)_i$ and at most the fraction $(\gamma^{ub}_t)_i$ of the total portfolio value.

We note that the value of the portfolio used in the projection is retrieved from the first set of primal variables $x + u$, whereas the vector that we project corresponds to the second set of primal variables $\tilde{x} + \tilde{u}$.

4. As explained in (5.17), we could simply re-update the second primal control vector $\tilde{u}$ by subtracting the optimal state vector $x_{opt}$ from the optimal post-trade portfolio $x^{*}_{opt}$. Inspired by the splitting philosophy, we suggest instead to split the rebalancing equally over the second set of states and controls:

$$\omega = \frac{x^{*}_{opt} - \tilde{x} - \tilde{u}}{2}, \qquad \tilde{x}_{opt} = \tilde{x} + \omega, \qquad \tilde{u}_{opt} = \tilde{u} + \omega, \tag{5.18}$$

where ω corresponds to half of the required portfolio rebalancing after considering bid-ask costs and portfolio weight constraints.

We name this method the embedded update splitting. It displays appealing convergence properties compared to the simple rebalancing scheme proposed in (5.17): extensive testing has shown that the simple scheme does not always converge, even for small optimization problems (4 assets and an investment horizon of 5 periods).
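The box projection of step 3 and the equal split of Equation (5.18) can be illustrated with a minimal sketch. It works on a single period and a single scenario rather than on the stacked vectors used in the thesis, and the function name, the example holdings and the bounds of 5% and 30% are assumptions chosen for illustration only. The total portfolio value is assumed to be positive, so that the lower bound does not exceed the upper bound.

```python
import numpy as np

def embedded_update_splitting(x, u, x_tilde, u_tilde, gamma_lb, gamma_ub):
    """Box projection followed by the equal split of Equation (5.18).

    x, u             : first primal set (only used for the portfolio value)
    x_tilde, u_tilde : second primal set (the vector that is projected)
    gamma_lb/ub      : lower/upper weight fractions per asset (hypothetical)
    """
    value = np.sum(x + u)                          # 1^T (x + u)
    lb, ub = value * gamma_lb, value * gamma_ub    # bounds in currency units
    x_star = np.clip(x_tilde + u_tilde, lb, ub)    # projection onto the box
    omega = (x_star - x_tilde - u_tilde) / 2.0     # half of the rebalancing
    return x_tilde + omega, u_tilde + omega, x_star

# Illustrative three-asset example (all numbers are made up).
x = np.array([0.50, 0.30, 0.20])
u = np.zeros(3)
x_tilde = np.array([0.60, 0.35, 0.05])
u_tilde = np.array([0.10, -0.05, -0.05])
gamma_lb = np.full(3, 0.05)
gamma_ub = np.full(3, 0.30)

x_new, u_new, x_star = embedded_update_splitting(
    x, u, x_tilde, u_tilde, gamma_lb, gamma_ub)
```

By construction the updated second primal set reproduces the projected post-trade portfolio, since the two halves of the correction add up to the full rebalancing.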



We can now combine the embedded update splitting in (5.18) with the over-relaxation method

in (5.14) and formulate the Extended Two-Set ADMM Algorithm:

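Algorithm 2 Extended Two-Set Alternating Direction Method of Multipliers (eADMM)

Require: Initialize $\tilde{x}^{-1} = \hat{x}^{0}$, $\tilde{u}^{-1} = \hat{u}^{0}$, $z^{-1} = z^{0}$, $y^{-1} = y^{0}$

1: for iteration k = 1, 2, . . . do

▷ Constrained Quadratic Minimization:

2: $(x^{k+1}, u^{k+1}, \eta) := P\left(L^{-T}\left(D^{-1}\left(L^{-1}\left(P^{T}\begin{bmatrix} -c \\ d \end{bmatrix}\right)\right)\right)\right)$

▷ Over-Relaxation:

3: $(x^{k+1}, u^{k+1}) = \alpha (x^{*}, u^{*}) + (1 - \alpha)(\tilde{x}^{k}, \tilde{u}^{k})$

▷ Soft-Thresholding (Transaction Costs):

4: $\tilde{u}^{k+1} := \mathrm{prox}(u^{k} - y^{k})$

5: $\tilde{x}^{k+1} := (x^{k} - z^{k})$

▷ Box Projection (weight constraints):

6: $x^{*} := \Pi_{\mathcal{C} = \{x^{*} \mid lb \le x^{*} \le ub\}}(\tilde{x}^{k+1} + \tilde{u}^{k+1})$

▷ Embedded Update Splitting:

7: $\omega := \dfrac{x^{*} - \tilde{x}^{k+1} - \tilde{u}^{k+1}}{2}$

8: $\tilde{x}^{k+1} := \tilde{x}^{k+1} + \omega$

9: $\tilde{u}^{k+1} := \tilde{u}^{k+1} + \omega$

▷ Dual Update:

10: $(z^{k+1}, y^{k+1}) := (z^{k}, y^{k}) + (\tilde{x}^{k+1}, \tilde{u}^{k+1}) - (x^{k+1}, u^{k+1})$

▷ Termination Criterion:

11: if $\|r_w^{k}\|_2 < \varepsilon_w^{primal}$ and $\|s_w^{k}\|_2 < \varepsilon_w^{dual}$ then

12: Convergence = true, return $x^{k+1}$ and $u^{k+1}$

13: else

14: Convergence = false

15: end if

16: end for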
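The transaction cost step of the algorithm relies on a proximal operator. As a minimal sketch, and assuming that this operator is the usual element-wise soft-thresholding associated with a proportional cost term κ|u| (the exact form of Equation (5.10c) is not restated here), it could be written as follows. The threshold value and the input vector are hypothetical, and in an ADMM setting the threshold would typically be the cost divided by the step size ρ.

```python
import numpy as np

def soft_threshold(v, kappa):
    """Element-wise proximal operator of kappa * |u|.

    Entries smaller than kappa in absolute value are set to zero,
    larger entries are shrunk towards zero by kappa.
    """
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)

# Hypothetical candidate trades and threshold.
v = np.array([0.30, -0.02, 0.05, -0.40])
trades = soft_threshold(v, kappa=0.05)
```

Small candidate trades are suppressed entirely, which is how proportional bid-ask costs translate into a no-trade region around the current holdings.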



5.6.3 Numerical Results

To test the performance of the Extended Two-Set Alternating Direction Method of Multipliers (eADMM) presented in Algorithm 2, we implemented the algorithm in R without parallelization. We benchmark its performance against the CVX package in Matlab, relying on the standard SDPT3 solver.

We consider a multi-period portfolio optimization problem with transaction costs and with lower and upper bound constraints set to 5% and 30%, respectively, for each asset. We generated a random covariance matrix and random expected returns and applied noise to generate scenarios. In Table 5.1 we present average computational results over 15 runs for problems of increasing size, obtained by varying the number of states and controls n, the horizon T and the number of scenarios S. All computations were performed on a 2 GHz Intel Core i7 CPU with 8 GB of RAM. We highlight that, unlike R, Matlab executes its subroutines in multiple threads, which gives it a slight edge. We used a tolerance level of $10^{-4}$ in the termination criterion presented in (5.16).

Table 5.1 – Computational Time Results for Stochastic MPC Problems

Small Medium Large Extra Large

States & controls (n) 4 20 30 50

Horizon (T ) 5 10 15 20

Number of scenarios (S) 2 5 7 10

Step size (ρ) 0.1 0.1 0.1 0.1

CVX: Solve time (sec) 7.42 45.34 107.51 593.34

CVX: Number of iterations 29.00 42.00 52.00 66.00

CVX: Objective value -0.78 -1.55 -2.86 -4.86

eADMM: Solve time (sec) 1.19 22.95 84.22 501.27

eADMM: Number of iterations 41.00 330.00 414.00 549.00

eADMM: Objective value -0.78 -1.55 -2.86 -4.86

This table compares the performance of the proposed Extended ADMM algorithm to the SDPT3 solver used in CVX (Matlab) for optimization problems of different sizes.

Comparing CVX to the Extended ADMM algorithm, the computational times are similar for small problems but diverge rapidly as the dimension of the problem increases. For large-scale problems we faced memory issues with CVX and were not able to solve the problem. The computational performance of the algorithm can be further improved through an implementation in C and parallelization. We provide suggestions for future research in Chapter 7.


5.7 Conclusion

In this chapter we rely on the Alternating Direction Method of Multipliers (ADMM) to solve the scenario-based MPC problem presented in the previous chapter. We present the techniques available to accelerate the algorithm and extend the state of the art by providing an adapted two-set splitting scheme, which makes it possible to handle inequality constraints, as well as a probability-based termination criterion to improve convergence speed. We detail the extended fast optimal control algorithm developed and highlight the soundness of our approach using small-scale to large-scale optimization examples. This new algorithm will be applied in the next chapter to a real-world application related to large-scale portfolio optimization.


6 Application

In this chapter, we combine the different concepts derived in this thesis and apply them to a

real-world portfolio optimization problem.

We first suggest an approach to generate scenarios (views) about expected returns and co-

variances and subsequently illustrate how these scenarios are handled by the scenario-based

MPC framework detailed in Chapter 4. The ADMM methodology from Chapter 5 is used to

recompute the optimal multi-period asset allocation at each rebalancing date.

We explain the methodology used to generate the scenarios in Section 6.3.2, leaning on the EUR/CHF exchange rate, which is used as an indicator and drives the dynamic asset allocation strategy. We suggest an innovative method for steering the risk aversion in the optimization dynamically, using the probabilities assigned to the scenarios.

We back-test the strategy in Section 6.4 and compare its performance relative to a given strategic allocation (benchmark) as well as naive diversification strategies. We show that our dynamic portfolio strategy outperforms the benchmark on a risk-adjusted basis and provides the desired portfolio stability without deviating significantly from the strategic asset allocation. We conclude this chapter in Section 6.5.

6.1 Background

We consider an equity portfolio manager who identifies various scenarios about future returns and risks of the assets constituting the Dow Jones Index. He is confronted with the pressure of delivering short-term performance to his clients, while ensuring long-term portfolio stability and not deviating too much from an exogenously assigned strategic allocation. The strategic allocation is defined by the risk-budgeting strategy proposed in (2.5), which aims at diversifying away the idiosyncratic risk left unexplained by the statistical risk drivers identified.

Although cash flows can be easily handled, we do not consider them in this application. However, we consider both transaction and impact costs and use restrictions on portfolio weights.


6.2 Data

The dataset used in this chapter is retrieved from Yahoo Finance and is composed of twenty-

nine stocks of the Dow Jones index, from May 1999 through June 2016. We also consider the

exchange rate between Euro and Swiss Francs between 1998 and June 2016, which acts as an

indicator for extracting market regimes with a Hidden Markov Model (see Section 6.3.2).

Table 6.1 – Dow Jones Stocks – Statistics

Stock Average Return Volatility Sharpe Ratio VaR (95%) ES (95%) MDD

AAPL 0.27 0.44 0.63 -0.04 -0.06 0.82

AXP 0.04 0.37 0.12 -0.04 -0.05 0.84

BA 0.09 0.31 0.29 -0.03 -0.04 0.71

CAT 0.08 0.33 0.24 -0.03 -0.05 0.73

CSCO 0.01 0.41 0.03 -0.04 -0.06 0.89

CVX 0.08 0.26 0.30 -0.02 -0.04 0.45

DD 0.03 0.29 0.11 -0.03 -0.04 0.70

DIS 0.09 0.31 0.27 -0.03 -0.04 0.68

GE 0.03 0.31 0.08 -0.03 -0.05 0.86

GS 0.06 0.39 0.14 -0.03 -0.05 0.79

HD 0.09 0.32 0.29 -0.03 -0.05 0.70

IBM 0.04 0.27 0.14 -0.03 -0.04 0.59

INTC 0.03 0.39 0.07 -0.04 -0.06 0.82

JNJ 0.08 0.20 0.42 -0.02 -0.03 0.36

JPM 0.04 0.41 0.09 -0.04 -0.06 0.74

KO 0.04 0.22 0.19 -0.02 -0.03 0.44

MCD 0.09 0.24 0.38 -0.02 -0.03 0.74

MMM 0.11 0.24 0.45 -0.02 -0.03 0.54

MRK 0.03 0.28 0.10 -0.03 -0.04 0.69

MSFT 0.04 0.32 0.12 -0.03 -0.05 0.69

NKE 0.13 0.32 0.43 -0.03 -0.04 0.59

PFE 0.03 0.26 0.11 -0.02 -0.04 0.69

PG 0.06 0.22 0.29 -0.02 -0.03 0.54

TRV 0.11 0.31 0.37 -0.03 -0.04 0.55

UNH 0.19 0.34 0.57 -0.03 -0.05 0.74

UTX 0.08 0.28 0.30 -0.03 -0.04 0.53

VZ 0.05 0.26 0.19 -0.02 -0.04 0.57

WMT 0.05 0.25 0.19 -0.02 -0.04 0.38

XOM 0.07 0.25 0.29 -0.02 -0.04 0.37

This table shows the annualized return and volatility of each stock over the period 05/1999-06/2016. Value-at-Risk and Expected Shortfall are calculated with a 95% confidence level. The Maximum Drawdown (MDD) over the period is displayed in the last column.


We report summary statistics (annualized return and volatility, Value-at-Risk, Expected Shortfall and Maximum Drawdown) as well as correlations over the whole sample period. Table 6.1 highlights the significant variation across stocks in terms of return and risk. Cisco displays the lowest return (1%) and one of the highest risks (41% volatility), whereas Apple has the highest return (27%). The largest drawdown is recorded by Cisco (-89%), while Johnson & Johnson displays the smallest one (-36%).

The correlation matrix across assets in Figure 6.1 emphasizes the potential for diversification provided by an optimal combination of these 29 stocks. Correlation coefficients range from 0.15 between Apple and Procter & Gamble to 0.83 between Chevron and Exxon Mobil.

Figure 6.1 – Asset Classes – Correlations.

This figure displays asset correlations over the period 05/1999-06/2016. Highly correlated stocks are highlighted by blue dots with shadings related to the degree of correlation.


6.3 Scenario Generator

Several methods can be used to generate scenarios about future returns and risks. Some of them rely on bootstrapping historical data, others resort to simulations or to sampling from a given distribution.

Inspired by Hinz and Yee [42], we apply the Hidden Markov Model (HMM) methodology to

the FX-rate between EUR and CHF to identify market regimes and define scenarios for our

stocks in each of these regimes. We let the parameters of the asset price dynamics switch

between different market regimes, allowing more flexibility when sudden regime shifts occur.

We introduce the theory of Hidden Markov Models and explain the methodology used to

generate scenarios for the return dynamics of the 29 stocks.

6.3.1 Hidden Markov Model

Hidden Markov Models (HMMs) are powerful methods for modelling the time-varying dynamics of a statistical process and only require a set of (soft) assumptions. A stochastic process is modelled as a set of states, each of which possesses a set of signals. Movements between different states characterize underlying changes in the stochastic process. HMMs have been extensively used in finance (see Hardy [40], Haussmann and Sass [41], Erlwein [30] and Lee [50]).

Definition 13. A Markov chain is a stochastic process $(X_t)$ with a countable set of states that satisfies the Markov property, defined as

$$P(X_{t+1} = j \mid X_0 = i_0, X_1 = i_1, \ldots, X_t = i_t) = P(X_{t+1} = j \mid X_t = i_t),$$

where $i_t$ denotes the prevailing state at time t.

The process evolves over time and transits from one state to another, as defined by the transition matrix Π. This transition matrix, also named transition kernel, gives the probability $P_{ij}$ of the process migrating to state j given the current state i. We assume that all probabilities are stationary.
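With stationary transition probabilities, the distribution over states can be propagated forward by repeated multiplication with the transition matrix. The following minimal sketch illustrates this for a two-state chain. The matrix entries, the initial distribution and the helper name `propagate` are hypothetical and are not taken from the fitted model.

```python
import numpy as np

# Hypothetical two-state transition matrix: row i holds the
# probabilities P_ij of moving from state i to state j.
Pi = np.array([[0.95, 0.05],
               [0.10, 0.90]])

def propagate(p0, Pi, steps):
    """Return the state distributions p_0, ..., p_steps with p_{t+1} = p_t Pi."""
    path = [np.asarray(p0, dtype=float)]
    for _ in range(steps):
        path.append(path[-1] @ Pi)
    return np.vstack(path)

# Start in state 0 with certainty and look 21 periods ahead.
probs = propagate([1.0, 0.0], Pi, steps=21)
```

Each row of `probs` is a probability vector, and the rows drift from the initial certainty towards the stationary distribution of the chain as the horizon grows.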

The idea of Hidden Markov modelling is to realize a time series $(y_t)_{t=0}^{T}$ in such a way that it behaves as if it were driven by a background device which may operate in different regimes. Thereby, one supposes that the operating regime is not directly observed and evolves like a Markov chain $(x_t)_{t=0}^{T}$ on a finite space.

The major advantage is that it is possible to trace the evolution of the hidden states indirectly, based on the observation of $(y_t)_{t=0}^{T}$, using efficient recursive schemes for the calculation of the so-called hidden state estimate

$$\hat{x}_t = E(x_t \mid y_j,\ j \le t), \qquad t = 0, \ldots, T.$$



Thereby, at each time t = 0, . . . , T − 1, the probability vector $\hat{x}_t$ describes the distribution of $x_t$ conditioned on the past observations $(y_j)_{j=0}^{t}$. More importantly, such an approach reproduces Markovian dynamics in the following sense: although $(y_t)_{t=0}^{T}$ is not Markovian in general, it turns out (see Yushkevich [97]) that the observations $(y_t)_{t=0}^{T}$ equipped with the latent variables $(\hat{x}_t)_{t=0}^{T}$ form a two-component process such that

the evolution $(\hat{x}_t, y_t)_{t=0}^{T}$ is Markovian.

From this perspective, modelling a time series $(y_t)_{t=0}^{T}$ in this way yields a technique to address control problems in certain non-Markovian situations. Let us introduce the required ingredients. Assume that an unobservable global regime evolves like a Markov chain $(x_t)_{t=0}^{T}$ on the set $\mathcal{X} = \{e_1, \ldots, e_d\}$ of unit vectors in $\mathbb{R}^{d}$, while the information available to the controller is gained from the observation of the process $(y_t)_{t=0}^{T}$, which takes values in a measure space $\mathcal{Y}$.

We suppose that the joint evolution $((x_t, y_t))_{t=0}^{T}$ follows a Markov process whose transition kernels $Q_t$ for t = 0, . . . , T − 1 act on functions $\phi : \mathcal{X} \times \mathcal{Y} \to \mathbb{R}$ as

$$\int \phi(x', y')\, Q_t(d(x', y') \mid (x, y)) = \sum_{x' \in \mathcal{X}} \int_{\mathcal{Y}} \phi(x', y')\, \Gamma_{x,x'}\, \mu_x(dy'). \tag{6.1}$$

Thereby, the stochastic matrix $\Gamma = (\Gamma_{x,x'})_{x,x' \in \mathcal{X}}$ describes the transition from $x_t$ to $x_{t+1}$, whereas $\mu_x$ denotes the distribution of the observation $y_{t+1}$ conditioned on $x_t = x \in \mathcal{X}$. Assuming that for each $x \in \mathcal{X}$ the distribution $\mu_x$ is absolutely continuous with respect to a reference measure μ on $\mathcal{Y}$, we introduce the densities

$$\nu_x(y) = \frac{d\mu_x}{d\mu}(y), \qquad y \in \mathcal{Y},\ x \in \mathcal{X},$$

to write the distributions as

$$\mu_x(dy) = \nu_x(y)\, \mu(dy), \qquad x \in \mathcal{X}.$$

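It turns out that $(\hat{x}_t, y_t)_{t=0}^{T}$ follows a Markov process on the state space $\hat{\mathcal{X}} \times \mathcal{Y}$, where $\hat{\mathcal{X}}$ denotes the set of probability vectors in $\mathbb{R}^{d}$, driven by transition kernels $\hat{Q}_t$ which act for t = 0, . . . , T − 1 on functions $\phi : \hat{\mathcal{X}} \times \mathcal{Y} \to \mathbb{R}$ as

$$\int_{\hat{\mathcal{X}} \times \mathcal{Y}} \phi(\hat{x}', y')\, \hat{Q}_t(d(\hat{x}', y') \mid (\hat{x}, y)) = \int_{\mathcal{Y}} \phi\left(\frac{\Gamma^{\top} V(y')\, \hat{x}}{\|V(y')\, \hat{x}\|},\ y'\right) \|V(y')\, \hat{x}\|\, \mu(dy').$$

In this formula, V(y) stands for the diagonal matrix whose diagonal elements are given by $(\nu_x(y))_{x \in \mathcal{X}}$ for $y \in \mathcal{Y}$, and the norm is defined as $\|z\| = \sum_{i=1}^{d} |z_i|$ for each $z \in \mathbb{R}^{d}$.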
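The argument of φ in the last formula is the recursive update of the hidden state estimate: the current estimate is weighted by the observation densities, normalized in the 1-norm and pushed through the transposed transition matrix. The sketch below illustrates one such filter step, assuming Gaussian observation densities. The transition matrix, the regime means and standard deviations and the observations are hypothetical values chosen for illustration, not estimates from the EUR/CHF data.

```python
import numpy as np

def gaussian_density(y, mean, std):
    """Density of N(mean, std^2) evaluated at y (vectorized over regimes)."""
    return np.exp(-0.5 * ((y - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def filter_step(x_hat, y_next, Gamma, means, stds):
    """One update x_hat' = Gamma^T V(y') x_hat / ||V(y') x_hat||_1."""
    weighted = gaussian_density(y_next, means, stds) * x_hat   # V(y') x_hat
    return Gamma.T @ weighted / np.sum(np.abs(weighted))

# Hypothetical calm regime (index 0) and turbulent regime (index 1).
Gamma = np.array([[0.95, 0.05],
                  [0.10, 0.90]])
means = np.array([0.0, 0.0])
stds = np.array([0.005, 0.02])

x_hat = np.array([0.5, 0.5])                    # uninformative start
for y in [0.001, -0.002, 0.030, -0.025]:        # made-up observations
    x_hat = filter_step(x_hat, y, Gamma, means, stds)
```

Because the rows of Γ sum to one, the updated estimate remains a probability vector, and a sequence of large moves shifts most of the probability mass to the turbulent regime.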



6.3.2 Methodology

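We consider the scenario-based MPC portfolio problem described in detail in Chapter 5, which we restate here for convenience:

$$
\begin{aligned}
\min_{(x,u)} \quad & \sum_{s=1}^{S} \sum_{t=0}^{T} p_t^{s} \Big( -\mu_t^{s\,T}(x_t^{s} + u_t^{s} - b_t^{s}) + \lambda_t^{s}\, (x_t^{s} + u_t^{s} - b_t^{s})^{T}\, \Sigma_t^{s}\, (x_t^{s} + u_t^{s} - b_t^{s}) \\
& \qquad\qquad\qquad + u_t^{s\,T} \operatorname{diag}(c_t)\, u_t^{s} + \kappa_t^{T} |u_t^{s}| \Big) \\
\text{subject to} \quad & \ldots_t\, (x_t^{s} + u_t^{s}) + f_t^{s} \\
& b_{t+1}^{s} = G_t^{s}(b_t) \\
& \mathbf{1}^{T} u_t^{s} = 0 \\
& x_0^{s} = x_{start} \\
& x_T^{s} + u_T^{s} = 0 \\
& (\mathbf{1}^{T} x_t^{s*})\, \gamma_t^{lb} \le x_t^{s*} \le (\mathbf{1}^{T} x_t^{s*})\, \gamma_t^{ub}, \qquad s = 1, \ldots, S,\ t = 0, \ldots, T-1,
\end{aligned}
$$

where $f_t^{s} \in \mathbb{R}^{n \cdot S}$ corresponds to cash flows at time t and T is the investment horizon.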

Given a set of scenarios, the objective of the portfolio manager is to obtain a risk-adjusted

outperformance over a given benchmark portfolio, in the presence of transaction costs and

constraints.

To generate plausible scenarios over the horizon T we start by fitting an HMM with Gaussian innovations to the EUR/CHF exchange rate over the period 1998-2000 and identify two market regimes. The exchange rate is used as an indicator and drives the dynamic asset allocation strategy, which includes a large set of assets.

In each of the two regimes identified, we compute the (conditional) average asset returns as well as the covariance matrix and consider that uncertainty increases with the length of the time horizon T. These two sets of return-risk pairs at each time t of the investment horizon deliver the scenarios used by the fast optimal control methodology detailed in Chapter 5. Note that for illustration purposes, we do not re-fit the parameters of the model here, but preliminary results show that this could add value to the dynamic strategy results.

The current regime probabilities and the transition kernel in (6.1) allow us to consider not only the current uncertainty pertaining to the two scenarios, but also the dynamics of the regime probabilities in the future.¹

¹ The scenario-based optimizer presented in Chapter 5 handles these time-varying probabilities.



Additionally, we suggest a novel approach to steer the (time-varying) risk aversion of the optimization procedure presented in Chapter 5, Section 5.3. The underlying principle relies on the observation that if the probabilities at time t are all equal, our uncertainty about the current regime is at its maximum. In other words, without strong confidence in his market views, the portfolio manager should not deviate too much from his strategic allocation (benchmark). Conversely, if a regime is currently assigned a probability of 1, our uncertainty is at its minimum and the portfolio manager can take calculated risks.

We consider these two boundaries to drive the risk aversion parameter with the help of the spectral entropy, given by

$$H(p_1, p_2, \ldots, p_S) = -\sum_{s=1}^{S} p_s \log p_s,$$

where S denotes the number of regimes (scenarios). The quantity $\exp H(p_1, p_2, \ldots, p_S)$ ranges from 1 to S, with S = 2 in our two-regime example. We can scale the spectral entropy and express it as a value between 0 and 1, denoted by ϖ:

$$\varpi(p_1, p_2, \ldots, p_S) = \frac{\exp\left(-\sum_{s=1}^{S} p_s \log p_s\right)}{S}. \tag{6.2}$$

In the numerical example presented in Section 6.4 the risk aversion λ will range from 0.5 to 20: a low value of λ lets the optimizer deviate from the strategic allocation (benchmark) and exploit the expected returns in each scenario, whereas a high value leads the optimizer to remain close to the benchmark. The risk aversion parameter λ can thus be expressed as a function of the scaled spectral entropy in (6.2):

$$\lambda_t = \phi\big(\varpi(p_1, \ldots, p_s, \ldots, p_S)\big), \qquad t = 1, \ldots, T.$$
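The scaled entropy in (6.2) and its mapping to a risk aversion level can be sketched in a few lines. The sketch assumes the linear interpolation between a minimum and a maximum risk aversion that appears in step 4 of the procedure summarized in this section, with the bounds 0.5 and 20 of the numerical example. The function names and the convention 0 · log 0 = 0 are assumptions of this illustration.

```python
import numpy as np

def scaled_entropy(p):
    """exp(H(p)) / S, with the convention 0 * log(0) = 0."""
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]
    return np.exp(-np.sum(nz * np.log(nz))) / p.size

def risk_aversion(p, lam_min=0.5, lam_max=20.0):
    """Linear map of the scaled entropy onto [lam_min, lam_max]."""
    return scaled_entropy(p) * (lam_max - lam_min) + lam_min

lam_uncertain = risk_aversion([0.5, 0.5])   # equal probabilities
lam_confident = risk_aversion([1.0, 0.0])   # one regime is certain
```

Equal probabilities give a scaled entropy of 1 and therefore the maximum risk aversion. Since exp H is never below 1, the scaled entropy takes values in [1/S, 1], so with S = 2 this direct linear map returns values between the midpoint of the two bounds and the upper bound.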

We summarize the required steps for retrieving the scenarios and the risk aversion parameter at time t (rebalancing date) as follows:

1. Fit the two-regime HMM to the FX data (fitted only once, at time t = 0) and retrieve the current probabilities as well as the transition matrix.

2. Define the conditional expected asset returns for each period of the investment horizon as the historical conditional returns in each regime and add a Gaussian noise which scales with the distance from time t, i.e. we consider increasing uncertainty over the horizon:

$$E(R_{t+i} \mid s) = \frac{1}{M} \sum_{R_k \in s} R_k + \varepsilon_{t+i}, \qquad i = 1, \ldots, T,\ s = 1, \ldots, S,$$

where $R_{t+i} \in \mathbb{R}^{n}$ is the vector of expected returns for the n assets for the period t + i in scenario s, $\varepsilon_{t+i} \sim N(0, \sqrt{t+i}) \in \mathbb{R}^{n}$ corresponds to the Gaussian noise, whose dispersion increases over the forecasting horizon, and M is the number of historical observations counted in scenario s.

3. Define the conditional expected asset covariances for each period of the investment horizon as the historical covariances in each regime and add a Gaussian noise which scales with the distance from time t, i.e. we consider increasing uncertainty over the horizon:

$$\Sigma_{t+i} \mid s = \sum_{R_k \in s} \sum_{R_l \in s} \operatorname{Cov}(R_k, R_l) + \varepsilon_{t+i}, \qquad i = 1, \ldots, T,\ s = 1, \ldots, S,$$

where $\Sigma_{t+i} \in \mathbb{R}^{n \times n}$ is the estimated covariance matrix for the period t + i in scenario s and $\varepsilon_{t+i} \sim N(0, \sqrt{t+i}) \in \mathbb{R}^{n \times n}$ corresponds to the Gaussian noise, whose dispersion increases over the forecasting horizon.

4. Compute the risk aversion parameter $\lambda_t$, kept fixed over the investment horizon, using the scaled entropy measure in (6.2), the current probabilities $p_s$ and the boundaries for the risk aversion:

$$\lambda_t = \varpi(p_1, \ldots, p_s, \ldots, p_S) \cdot (\lambda^{+} - \lambda^{-}) + \lambda^{-}, \qquad t = 1, \ldots, T,$$

where $\lambda^{-}, \lambda^{+} \in \mathbb{R}$ correspond to the minimum and maximum risk aversion, respectively.
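Steps 2 and 3 above can be sketched as follows. This is an illustration under simplifying assumptions rather than the implementation used in the thesis: regime labels are taken as given (for instance the most likely HMM state per day), the noise standard deviation grows with the square root of the distance i from the rebalancing date, the noise scale is a hypothetical tuning parameter, and the covariance noise is symmetrized. The function name and the simulated input data are made up.

```python
import numpy as np

def regime_scenarios(returns, regimes, horizon, noise_scale=1e-4, seed=0):
    """Conditional means and covariances per regime, perturbed over the horizon.

    returns : (N, n) array of historical asset returns
    regimes : (N,) array of regime labels
    Returns a dict: regime -> (means of shape (horizon, n),
                               covariances of shape (horizon, n, n)).
    """
    rng = np.random.default_rng(seed)
    n = returns.shape[1]
    scenarios = {}
    for s in np.unique(regimes):
        R = returns[regimes == s]
        mu, cov = R.mean(axis=0), np.cov(R, rowvar=False)
        mus, covs = [], []
        for i in range(1, horizon + 1):
            sd = noise_scale * np.sqrt(i)          # uncertainty grows with i
            mus.append(mu + rng.normal(0.0, sd, size=n))
            E = rng.normal(0.0, sd, size=(n, n))
            covs.append(cov + (E + E.T) / 2.0)     # keep the matrix symmetric
        scenarios[int(s)] = (np.array(mus), np.array(covs))
    return scenarios

# Simulated inputs, for illustration only.
rng = np.random.default_rng(1)
hist_returns = rng.normal(0.0, 0.01, size=(500, 3))
hist_regimes = (np.arange(500) // 50) % 2          # alternating label blocks
scen = regime_scenarios(hist_returns, hist_regimes, horizon=21)
mu_path, cov_path = scen[0]
```

Adding symmetric noise does not guarantee that the perturbed covariance matrices remain positive semidefinite, so a practical implementation would need to check or repair this before passing them to the optimizer.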

6.4 Large-Scale Dynamic Portfolio Strategy

In this section we combine all the results presented in this thesis. We calibrate the HMM on the EUR/CHF exchange rate over the period 1998-2000 and back-test our dynamic investment strategy over the period 2001/01-2016/06.

6.4.1 Benchmark

We consider a strategic allocation given by the risk-budgeting strategy proposed in Chapter 2,

Section 2.5 applied to 29 assets constituting the Dow Jones 30 Index. This strategy will be used

as a benchmark in the multi-period optimization and relies on the identification of statistical

risk drivers, aiming at diversifying away the specific risk left unexplained by the factors. For

convenience purposes, we restate the strategy here:

Benchmark: Factor Risk Budgeting – Specific

We set risk budgets that are inversely proportional to the specific risk of the assets (Υ²) that remains unexplained by the L statistical factors, as detailed in Chapter 2, Section 2.5. Applying the risk decomposition in Equation (2.8) and using the corresponding torsion matrix $t_{SMT}$ in (2.5), the asset weights are given by

$$b_i = \frac{1}{\upsilon_i^{2}}, \qquad w_i = t_{SMT}\, \frac{b_i\, \sigma_i^{-1}}{\sum_{j=1}^{K} b_j\, \sigma_j^{-1}}, \qquad i = 1, 2, \ldots, K,$$

where $\upsilon_i^{2}$ denotes the specific risk of asset i.

6.4.2 Scenarios and Rebalancing

We allow the portfolio to rebalance on a monthly basis and thus consider an investment horizon of one month in the multi-period ADMM optimization procedure. We use two scenarios over the following month (i.e. 21 days), given by the average returns and covariances observed over the last two years (500 days) in each of the regimes identified by the HMM fitted to the EUR/CHF exchange rate over the period 1998-2000. We follow the methodology detailed in Section 6.3.2 to compute the conditional expected returns and covariances at each time step of the investment horizon.

We subsequently use these scenarios and the benchmark strategy to derive an optimal tactical portfolio, using the fast optimal control methodology presented in Chapter 5.

6.4.3 Risk Aversion

The time-varying probabilities assigned to the scenarios, retrieved with the transition kernel in (6.1), steer the risk aversion parameter dynamically between 0.5 and 20 at each rebalancing date, based on the methodology explained in Section 6.3.2.

6.4.4 Costs and Restrictions

We consider transaction costs of 10 bp (bid-ask) as well as quadratic costs of 1 bp (impact). No

more than 30% can be invested in a single stock and short positions are restricted to 20% of

the portfolio value.

6.4.5 Results

We compare the performance after costs of the benchmark strategy with that of our dynamic investment strategy relying on market scenarios. Figure 6.2 shows that our dynamic portfolio approach clearly outperforms its benchmark on a risk-adjusted basis and reduces the drawdowns borne over the period.



Figure 6.2 – Dynamic Portfolio – Performance

[Figure: three panels (Cumulative Return, Daily Return, Drawdown) over the period 2001-12-31 to 2015-12-01, for the series Dynamic Portfolio, Benchmark, Naive: assets and Naive: factors.]

This figure compares the performance after costs of the dynamic portfolio relative to its benchmark and naive diversification approaches applied to the assets and market factors respectively.

Table 6.2 – Dynamic Portfolio vs. Benchmark – Key figures

Strategy Average Return Volatility Sharpe Ratio VaR (95%) ES (95%) Max. Drawdown

dynamic portfolio 10.93 8.54 1.28 -3.26 -4.79 12.87

benchmark 13.22 13.81 0.96 -5.29 -9.73 20.74

equally-weighted: assets 14.44 16.28 0.89 -5.58 -7.39 23.78

equally-weighted: factors 15.38 15.11 1.02 -5.34 -8.42 19.88

This table presents the annualized return and volatility as well as tail risk metrics of the dynamic portfolio, the strategic allocation (benchmark) and two equally-weighted (naive) strategies applied to the assets and market factors respectively.

Table 6.2 summarizes the results of the back-tested dynamic and benchmark strategies as well as two naive diversification approaches (equally-weighted) detailed in Chapter 2, Section 2.6.3. It shows that the naive diversification approach along the factors provides a considerable improvement relative to the standard equally-weighted strategy, with Sharpe ratios of 1.02 and 0.89 respectively. However, these two approaches had to bear substantial drawdowns during the two financial crises (2001, 2008). The strategic allocation (benchmark), relying on a specific diversification along the factors identified, displays a better risk-adjusted performance than the standard naive approach but cannot outperform the naive diversification along the factors.

Remarkably, our dynamic portfolio strategy, relying on market scenarios with time-varying probabilities and risk aversion, displays the best results in terms of risk-adjusted performance. We not only substantially improved the Sharpe ratio relative to the benchmark (from 0.96 to 1.28) and the other strategies, but also considerably reduced tail risks. In particular, the maximum drawdown has been significantly reduced (from 20.74% for the benchmark to 12.87% for the dynamic portfolio). The tracking error, measured as the standard deviation of the return difference between the portfolio and the benchmark, amounts to 9.41%.² This metric measures the extent of the deviation from the strategic allocation and reveals that the underlying confidence placed in the scenarios did not reach extreme values, i.e. the risk aversion stayed within the defined boundaries.
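For reference, the key figures discussed here can be computed from return series along the following lines. This is a minimal sketch that assumes daily simple returns, annualization with 252 trading days and a zero risk-free rate in the Sharpe ratio. These conventions and the function names are assumptions of the illustration and may differ from the ones used to produce Table 6.2.

```python
import numpy as np

def tracking_error(port, bench, periods=252):
    """Annualized standard deviation of the return difference."""
    return np.std(port - bench, ddof=1) * np.sqrt(periods)

def sharpe_ratio(r, periods=252):
    """Annualized mean return over annualized volatility (zero risk-free rate)."""
    return np.mean(r) / np.std(r, ddof=1) * np.sqrt(periods)

def max_drawdown(r):
    """Largest peak-to-trough loss of the cumulative wealth path."""
    wealth = np.cumprod(1.0 + r)
    peak = np.maximum.accumulate(wealth)
    return np.max(1.0 - wealth / peak)

# Tiny deterministic example: +10%, then -50%, then +20%.
r_example = np.array([0.10, -0.50, 0.20])
mdd_example = max_drawdown(r_example)
te_example = tracking_error(r_example, r_example)
```

In the example the wealth path falls from 1.10 to 0.55 before recovering partially, which corresponds to a maximum drawdown of one half, and a portfolio that replicates its benchmark has a tracking error of zero.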

6.5 Conclusion

In this chapter, we applied our findings from previous chapters and proposed an application to a real-world portfolio allocation problem. We suggested a method to generate market scenarios over the investment horizon. Relying on a simple two-state Hidden Markov Model fitted to the EUR/CHF exchange rate, we used the resulting market regimes to compute conditional expected asset returns and covariances. We handled these scenarios with the scenario-based MPC methodology detailed in this thesis and solved the multi-period allocation problem with the new Fast Optimal Control algorithm developed in Chapter 5.

We proposed an innovative approach to dynamically steer the risk aversion over time, relying on the relative confidence placed in the scenarios considered. Using the spectral entropy measure, we derived a methodology to increase or reduce the risk aversion within pre-defined bounds.

We back-tested the dynamic allocation approach and compared the results with a strategic allocation (benchmark) as well as two naive diversification strategies, which assign an equal weight to the assets or factors respectively. Our results show that the dynamic approach, even after considering transaction costs, delivers a risk-adjusted outperformance compared to the other strategies and also significantly reduces tail risks. The appealing features displayed by the dynamic framework proposed, even within a simplified setting with only 2 regimes, pave the way for further research directions.

² Tracking error is often used in practice for active mandates, which aim at outperforming an assigned benchmark.


7 Conclusion and Outlook

7.1 Conclusion

This thesis was motivated by the need to devise dynamic multi-period portfolio strategies that can be solved efficiently and outperform the standard strategies commonly used in practice. We improved the state of the art by proposing a new formulation of the two-set splitting ADMM algorithm used in previous work, and considered a scenario-based approach to take uncertainty into account in the multi-period framework.

We presented in Chapter 1 the naive diversification method and the benchmark approach reported in previous research. We suggested a procedure, the so-called minimum-torsion approach, which makes it possible to retrieve uncorrelated factors. We devised an investment strategy relying on naive diversification, where we allocated wealth by applying a correction to the original equal-weight distribution. The correction was carried out with the help of the minimum-torsion matrix and results in a portfolio in which highly correlated assets are under-represented. We illustrated our findings with a case study on 375 stocks in the S&P 500 index.

Building on the modified naive approach and relying on these statistical factors, we proposed in Chapter 2 a novel dynamic approach, focusing on statistical analysis of the data and on risk budgeting techniques. We suggested a shrunk version of the minimum-torsion matrix, using the effective rank approach to extract the number of risk factors driving asset returns. This purely statistical approach enabled a decomposition of the risk of a given portfolio into a systematic and a specific component as well as an assessment of its level of diversification. We devised various dynamic investment strategies, especially an innovative implementation of a risk budgeting technique, where the budget of a given asset is inversely proportional to its idiosyncratic risk, left unexplained by the statistical factors. We illustrated our approach through an empirical application.

We reviewed the single and multi-period optimization framework in Chapter 3 and presented the solutions proposed in the literature. We gave a short overview of Dynamic Programming (DP) techniques and the issues that arise when considering a large-scale portfolio.


We presented in Chapter 4 Model Predictive Control (MPC), a widespread technique for solving linear convex optimization problems. We derived a scenario-based formulation of the portfolio optimization problem over a given investment horizon, in the presence of portfolio restrictions and transaction costs. We showed how the resulting problem can be expressed as the sum of a quadratic and a non-quadratic component.

In Chapter 5, we used the Alternating Direction Method of Multipliers (ADMM) to solve the scenario-based MPC portfolio optimization problem efficiently and quickly. We developed a new algorithm, which makes it possible to include inequality constraints in the two-set splitting scheme of the ADMM. We also proposed a modified stopping criterion for the ADMM, which relies on the probabilities of the scenarios considered and improves the convergence properties of the algorithm.

We presented a real-world large-scale multi-period portfolio application in Chapter 6, where we combined the different concepts derived in this thesis. We suggested an approach to generate scenarios relying on a Hidden Markov Model (HMM) and solved the constrained multi-period MPC problem with the new ADMM algorithm developed in this thesis. We suggested a novel concept to steer the risk aversion over time, relying on the probabilities assigned to the different scenarios. We finally back-tested the strategy with a large-scale portfolio and showed that the results obtained provided the desired outperformance, without deviating significantly from the strategic asset allocation.

7.2 Further Research

7.2.1 Risk-Budgeting

In Chapters 1 and 2, we suggested various investment strategies which display appealing risk-

adjusted properties. This area should be built upon and improved, in particular in connection

with risk-budgeting techniques. The factor-based risk budgeting methods developed in this

thesis rely on the decomposition of the portfolio risk, measured by the standard deviation.

Considering the non-normality of asset returns, future research in this field should strive to

design strategies based on the diversification of tail or downside risks. This poses serious

challenges, as the decorrelation achieved by PCA techniques or by the

minimum-torsion approach used in this thesis does not lead to strict independence of the

factors retrieved (tail risk), and one would have to resort to an Independent Component Analysis

approach.

7.2.2 Multi-Period Optimization via ADMM

The techniques developed in this thesis to include inequality constraints into the two-set

splitting scheme and the improvement of the convergence properties when dealing with scenarios

can be extended further. As suggested by our industry partners, a promising direction

is GPU computing, which allows computations to be performed at a low level on the

graphics card processor and tasks to be run in parallel on a machine with multiple cores.

Interest in improving the convergence properties of the ADMM algorithm has been rekindled

among researchers in recent years, and there is potential for accelerating the algorithm

further. One suggestion is to express the step size parameter in

the algorithm as a function of the residuals and use an adaptive step size at each iteration.

However, as already mentioned in Chapter 5, Section 5.6.1, this would require a recalculation

of the pre-computed matrix in the convex-quadratic control problem, although GPU computations

could potentially alleviate this issue.
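One known instance of such a rule is the residual-balancing heuristic described by Boyd et al. [14], which increases the step size when the primal residual dominates and decreases it when the dual residual dominates. The sketch below only illustrates that heuristic; the function names and the default values mu = 10 and tau = 2 are assumptions chosen for illustration, not the scheme implemented in this thesis.

```python
def update_step_size(rho, primal_residual, dual_residual, mu=10.0, tau=2.0):
    """Return the step size for the next ADMM iteration.

    primal_residual and dual_residual are the norms of the two residuals.
    """
    if primal_residual > mu * dual_residual:
        return rho * tau   # primal residual dominates: increase the penalty
    if dual_residual > mu * primal_residual:
        return rho / tau   # dual residual dominates: decrease the penalty
    return rho             # residuals are balanced: keep the step size

def rescale_scaled_dual(u, rho_old, rho_new):
    """In scaled form u = y / rho, so u is rescaled whenever rho changes."""
    return [value * rho_old / rho_new for value in u]

# Hypothetical residual norms at some iteration.
rho = 1.0
new_rho = update_step_size(rho, primal_residual=50.0, dual_residual=1.0)
u = rescale_scaled_dual([0.4, -0.2], rho, new_rho)
```

Whenever the returned value differs from the previous one, any factorization that depends on the step size has to be recomputed, which is the cost referred to above.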

Another area of research would be to use a risk penalty in the objective function that relies on

tail risk measures. Value-at-Risk (VaR) and Conditional Value-at-Risk (CVaR) are nowadays

essential ingredients of proper risk management. Rockafellar and Uryasev [81] present a

method for minimizing CVaR in a single-period framework. The authors suggest a scenario

decomposition, in which case the optimization problem can be reduced to a linear program.

Future work should build on these results and implement the CVaR penalty in a multi-period

optimization.
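For reference, the single-period scenario formulation can be sketched as the following linear program. The notation is an assumption introduced here for illustration: $r_s$ denotes the vector of asset returns in scenario $s$, $p_s$ its probability, $w$ the portfolio weights restricted to a polyhedral set $\mathcal{W}$, $\beta$ the confidence level, $\alpha$ an auxiliary variable and $u_s$ the loss in excess of $\alpha$ in scenario $s$.

```latex
\begin{aligned}
\min_{w,\,\alpha,\,u} \quad & \alpha + \frac{1}{1-\beta} \sum_{s=1}^{S} p_s u_s \\
\text{s.t.} \quad & u_s \geq -r_s^{\top} w - \alpha, \qquad s = 1, \dots, S, \\
& u_s \geq 0, \qquad s = 1, \dots, S, \\
& w \in \mathcal{W}.
\end{aligned}
```

At the optimum, the objective value equals the CVaR of the portfolio loss $-r^{\top} w$ at level $\beta$, and an optimal $\alpha$ is a corresponding VaR. Since the objective and the constraints are linear, a term of this form could in principle be added for each period of a multi-period objective.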

7.2.3 Investment Strategies

The scenario-based MPC framework developed in this thesis opens many opportunities for

rebalancing techniques. At each rebalancing time t, the optimal portfolio is recomputed,

given the scenarios and the investment horizon. Three rebalancing schemes can apply:

• A so-called open-loop scheme can be used, where we simply rebalance the portfolio

over the horizon given the optimal controls computed at time t; hence no recourse is

possible.

• Closed-loop schemes, also called controlled approaches, take incoming

market information into account and recompute the optimal controls over the investment horizon.

• The last method simply implements the first control at time t and does not rebalance over

the investment horizon, i.e. the method reduces to a buy-and-hold strategy over the

investment horizon.

Future research might conduct a thorough investigation of the optimal rebalancing scheme,

relying on the tools developed in this thesis. Depending on the type of investor considered

and transaction costs incurred, a lower or higher rebalancing frequency might be appropriate.

7.2.4 Scenario Generator

We presented a scenario generation method relying on a Hidden Markov Model (HMM), which

identified three regimes. This area of research, as well as other machine-learning techniques,

is very popular and much empirical research remains to be done. A direction for further

research is to find a combination of various investment strategies, whose performances are

weakly correlated in a given regime, making it possible to stabilize or ideally improve the risk-adjusted

performance of an aggregated portfolio.

In this thesis we considered the open-loop approach on a monthly basis, without refitting the

parameters of the HMM. A dynamic strategy could benefit from refitting on a regular

basis as, over the years, the market dynamics could lead to the identification of other regimes,

i.e. the parameters of the HMM are time-varying.

A Appendix

A.1 The Jacobi and Gauss-Seidel Iterative Methods

A.1.1 Jacobi Method

The Jacobi Method is an iterative method which relies on a splitting methodology. Assume we

want to solve the system $Ax = b$, where $A \in \mathbb{R}^{n \times n}$ is a sparse matrix. We split $A = D - E - F$,

where $D$ is the diagonal of $A$ and $-E$ and $-F$ are its strictly lower and upper triangular parts,

move the last two terms to the right-hand side of the equation and obtain

$$Dx = (E + F)x + b.$$

The iteration consists in replacing $x$ on the left side by $x^k$, while replacing $x$ on the right side

by $x^{k-1}$:

$$x^k = D^{-1}(E + F)x^{k-1} + D^{-1}b.$$
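The Jacobi update can be sketched in a few lines of plain Python. The function name, the dense list-of-lists storage, the fixed number of iterations and the small test system are assumptions made for illustration; a sparse implementation would only loop over the non-zero entries of each row.

```python
def jacobi(A, b, iterations=50):
    """Jacobi iteration for A x = b, starting from the zero vector."""
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        x_new = []
        for i in range(n):
            # Off-diagonal part of row i applied to the PREVIOUS iterate,
            # i.e. minus the i-th entry of (E + F) x^{k-1}.
            off_diagonal = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x_new.append((b[i] - off_diagonal) / A[i][i])
        x = x_new  # all components are updated simultaneously
    return x

# Hypothetical diagonally dominant system with exact solution (1, 2).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [6.0, 7.0]
x = jacobi(A, b)
```

Each component of the new iterate depends only on the previous iterate, so the inner loop can be run in parallel.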

A.1.2 Gauss-Seidel Method

The Gauss-Seidel Method is derived in the same way as the Jacobi Method but uses the splitting

$A = (L + D) + U$, where $L$ and $U$ are the strictly lower and upper triangular parts of $A$, resulting in the following iteration

$$(D + L)x^{k+1} = -Ux^k + b.$$

Equivalently, for each $k \geq 1$, we generate $x^k$ from $x^{k-1}$ by

$$x^k = C_g x^{k-1} + c_g,$$

where $C_g = -(D + L)^{-1}U$ and $c_g = (D + L)^{-1}b$. This iterative method performs a forward substitution

at each step and overwrites each old component with the newly calculated value. The error

reduction is typically faster than in the Jacobi method.
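A minimal sketch of this iteration in plain Python is given below. The forward substitution with the lower triangular matrix D + L amounts to sweeping through the components in order and reusing the values already updated in the current sweep. The function name, the dense storage, the fixed number of sweeps and the test system are assumptions made for illustration.

```python
def gauss_seidel(A, b, iterations=25):
    """Gauss-Seidel iteration for A x = b, starting from the zero vector."""
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        for i in range(n):
            # Components j < i already hold the new values of this sweep,
            # components j > i still hold the values of the previous sweep.
            off_diagonal = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - off_diagonal) / A[i][i]  # overwrite in place
    return x

# Hypothetical diagonally dominant system with exact solution (1, 2).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [6.0, 7.0]
x = gauss_seidel(A, b)
```

Because each component is overwritten as soon as it is computed, only one copy of the iterate is stored, at the price of a sequential sweep.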

Bibliography

[1] Noel Amenc, Felix Goltz, and Lionel Martellini. Smart Beta 2.0. The Journal of Index

Investing, 4(3):15–23, 2013.

[2] Noel Amenc, Felix Goltz, Ashish Lodh, Lionel Martellini, and Eric Shirbini. Risk Allo-

cation, Factor Investing and Smart Beta: Reconciling Innovations in Equity Portfolio

Construction. Edhec Publications, July 2014.

[3] Y. Aït-Sahalia, J. Cacho-Diaz, and T.R. Hurd. Portfolio choice with jumps: A closed-form

solution. The Annals of Applied Probability, 19:556–584, 2009.

[4] S. Basak and G. Chabakauri. Dynamic mean-variance asset allocation. Review of Finan-

cial Studies, 23:2970–3016, 2010.

[5] N. Bäuerle and U. Rieder. Markov Decision Processes with Applications to Finance.

Springer, Heidelberg, 2011. doi: 10.1007/978-3-642-18324-92.

[6] C. Bender, C. Gärtner, and N. Schweizer. Pathwise dynamic programming. Working

paper, 2015.

[7] D. Bertsekas. Dynamic Programming and Optimal Control, volume 1. Athena Scientific,

2005.

[8] R. Bey, R. Burgess, and P. Cook. Measurement of estimation risk in markowitz portfo-

lios. Working Paper, 1990. Available at: http://www.thierry-roncalli.com/download/

risk-factor-parity.pdf.

[9] Vineer Bhansali, Josh Davis, Graham Rennison, Jason C. Hsu, and Feifei Li. The Risk in

Risk Parity: A Factor Based Analysis of Asset Based Risk Parity. Social Science Research

Network Working Paper Series, October 2012. URL http://ssrn.com/abstract=2167058.

[10] F. Black and R. Litterman. Global portfolio optimization. Financial Analysts Journal, 48

(5):28–43, 1992.

[11] T. Bodnar, N. Parolya, and W. Schmid. A Closed-Form Solution of the Multi-Period

Portfolio Choice Problem for a Quadratic Utility Function. EUV working paper 292, 2012.

[12] J.P. Bouchaud and M. Potters. Theory of Financial Risk. Aleea-Saclay, Eyrolles, Paris, 1997.

[13] S. Boyd and L. Vandenberghe. Convex optimization. Cambridge University Press, 2004.

[14] Stephen Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein. Distributed optimization

and statistical learning via the alternating direction method of multipliers. Found. Trends

Optim., 3(1):1–122, 2011. ISSN 2167-3888. doi: 10.1561/2200000016.

[15] Stephen Boyd, Mark T. Mueller, Brendan O’Donoghue, and Yang Wang. Performance

bounds and suboptimal policies for multi-period investment. Found. Trends Optim., 1

(1):1–72, January 2014. ISSN 2167-3888. doi: 10.1561/2400000001. URL

http://dx.doi.org/10.1561/2400000001.

[16] M. Brandt and P. Santa-Clara. Dynamic portfolio selection by augmenting the asset space.

The Journal of Finance, 61:2187–2217, 2006.

[17] J.R. Bunch and B.N. Parlett. Direct methods for solving symmetric indefinite systems of

linear equations. SIAM Journal on Numerical Analysis, 8(4):639–655, 1971.

[18] Thomas F. Cargill and Robert A. Meyer. Multiperiod Portfolio Optimization And The Value

Of Risk Information. Advances in Financial Planning and Forecasting, v2(1):245–268,

1987.

[19] Mark Carhart. On persistence in mutual fund performance. Journal of Finance, 52(1):

57–82, 1997. URL http://EconPapers.repec.org/RePEc:bla:jfinan:v:52:y:1997:i:1:p:57-82.

[20] Celikyurt and Özekici. Multiperiod portfolio optimization models in stochastic markets

using the mean-variance approach. European Journal of Operational Research, 179:

186–202, 2007.

[21] Nai-Fu Chen, Richard Roll, and Stephen A Ross. Economic forces and the stock market.

The Journal of Business, 59(3):383–403, 1986. URL http://EconPapers.repec.org/RePEc:

ucp:jnlbus:v:59:y:1986:i:3:p:383-403.

[22] G. Connor. The three types of factor models: A comparison of their explanatory power.

Financial Analysts Journal, 519–531, 1995.

[23] G. Connor and R.A. Korajczyk. A test for the number of factors in an approximate factor

model. The Journal of Finance, 48(4):1263–1291, 1993. ISSN 1540-6261. doi: 10.1111/j.

1540-6261.1993.tb04754.x. URL http://dx.doi.org/10.1111/j.1540-6261.1993.tb04754.x.

[24] D. Duffie and H. Richardson. Mean-variance hedging in continuous time. Annals of

Probability, 1:1–15, 1991.

[25] J. Eckstein. Parallel alternating direction multiplier decomposition of convex pro-

grams. Journal of Optimization Theory and Applications, 80(1):39–62, 1994. doi:

10.1007/bf02196592.

[26] Jonathan Eckstein and Dimitri P. Bertsekas. On the Douglas-Rachford splitting method

and the proximal point algorithm for maximal monotone operators. Mathematical

Programming, 55(1-3):293–318, 1992. doi: 10.1007/bf01581204.

[27] Jonathan Eckstein and Michael C. Ferris. Operator-splitting methods for monotone affine

variational inequalities, with a parallel application to optimal control. INFORMS Journal

on Computing, 10(2):218–235, 1998. doi: 10.1287/ijoc.10.2.218.

[28] A. Edelman. Eigenvalues and condition numbers of random matrices. SIAM J. Matrix

Analy. Appl., 9(4):543, 1988.

[29] R. Engle, J. Mezrich, and L. You. Optimal asset allocation. 1998.

[30] Christina Erlwein. Applications of hidden markov models in financial modelling, 2008.

[31] E.F. Fama and K.R. French. The cross-section of expected stock returns. Journal of finance,

pages 427–465, 1992.

[32] Eugene Fama and Kenneth French. Common risk factors in the returns on stocks and

bonds. Journal of Financial Economics, 33(1):3–56, 1993. URL http://EconPapers.repec.

org/RePEc:eee:jfinec:v:33:y:1993:i:1:p:3-56.

[33] E. A. Feinberg and A. Schwartz. Handbook of Markov Decision Processes. Kluwer Academic,

2002.

[34] R. Glowinski and A. Marroco. Sur l’approximation, par éléments finis d’ordre un, et la réso-

lution, par pénalisation-dualité d’une classe de problèmes de dirichlet non linéaires. Re-

vue française d’automatique, informatique, recherche opérationnelle. Analyse numérique,

9(R2):41–76, 1975. doi: 10.1051/m2an/197509r200411.

[35] R.B. Gold. Why the efficient frontier for real estate is fuzzy. Journal of Real Estate Portfolio

Management, 1:59–66, 1995.

[36] T. Goldstein, B. O’Donoghue, and S. Setzer. Fast alternating direction optimization methods.

Technical report, UCLA Computational and Applied Mathematics Report, 2012.

[37] R.C. Grinold and M. Stuckelman. The value-added/turnover frontier. The Journal of

Portfolio Management, pages 8–17, 1993.

[38] N.H. Hakansson. On myopic portfolio policies, with and without serial correlation of

yields. Journal of Business, 44(3):324–334, 1971.

[39] G. Hanoch and H. Levy. The efficiency analysis of choices involving risk. The Review of

Economic Studies, 36(3):335–346, 1969.

[40] Mary R. Hardy. A regime-switching model of long-term stock returns. North American

Actuarial Journal, 5(2):41–53, 2001. doi: 10.1080/10920277.2001.10595984.

[41] Ulrich G. Haussmann and Jörn Sass. Optimal terminal wealth under partial information

for HMM stock returns. Contemporary Mathematics of Finance, pages 171–185, 2004. doi:

10.1090/conm/351/06401.

[42] J. Hinz and J. Yee. Stochastic switching for partially observable dynamics and optimal

asset allocation. Working paper, 2016.

[43] W. James and C. Stein. Estimation with quadratic loss. In Proceedings of the Berkeley

Symposium on Mathematical Statistics and Probability, volume 4, page 361. University of

California Press, 1956.

[44] Zura Kakushadze and Willie Yu. Statistical Risk Models. Social Science Research Network

Working Paper Series, February 2016. URL http://ssrn.com/abstract=2732453.

[45] Jia Kang, Arvind U. Raghunathan, and Stefano Di Cairano. Decomposition via admm for

scenario-based model predictive control. 2015 American Control Conference (ACC), 2015.

doi: 10.1109/acc.2015.7170904.

[46] Dong-Hee Kim and Hawoong Jeong. Systematic analysis of group identification in stock

markets. Phys. Rev. E, 72:046133, Oct 2005. doi: 10.1103/PhysRevE.72.046133. URL

https://arxiv.org/pdf/physics/0503076.pdf.

[47] M. Kritzman, S. Mygren, and S. Page. Optimal rebalancing: A scalable solution. Revere

Street Working Papers, 2007.

[48] K.F. Kroner and J. Sultan. Time-varying distributions and dynamic hedging with foreign

currency futures. Journal of Financial and Quantitative Analysis, 28(4):535–551, 1993.

[49] Laurent Laloux, Pierre Cizeau, Jean-Philippe Bouchaud, and Marc Potters. Random

matrix theory and financial correlations. Science Finance (CFM) working paper archive

500053, Science Finance, Capital Fund Management, 1999. URL http://EconPapers.

repec.org/RePEc:sfi:sfiwpa:500053.

[50] D. Lee. Trading USDCHF filtered by Gold dynamics via HMM coupling. ArXiv e-prints,

August 2013.

[51] M. Leippold, P. Vanini, and F. Trojani. A geometric approach to multiperiod mean-

variance optimization of assets and liabilities. Journal of Economic Dynamics and Control,

28:1079–1113, 2004.

[52] D. Li and W.L. Ng. Optimal dynamic portfolio selection: multiperiod mean-variance

formulation. Mathematical Finance, 10:387–406, 2000.

[53] John Lintner. The valuation of risk assets and the selection of risky investments in stock

portfolios and capital budgets. The Review of Economics and Statistics, 47(1):13–37, 1965.

ISSN 00346535, 15309142. URL http://www.jstor.org/stable/1924119.

[54] H. Markowitz. Portfolio selection. The Journal of Finance, 7:77–91, 1952.

[55] H. Markowitz. Portfolio Selection: Efficient diversification of investments. John Wiley, New

York, 1959.

[56] H.M. Markowitz and E.L. Van Dijk. Single-period mean-variance analysis in a changing

world. Financial Analysts Journal, 59(2):20–44, 2003.

[57] David Q. Mayne. Model predictive control. Automatica, 50(12):2967–2986, December

2014. ISSN 0005-1098. doi: 10.1016/j.automatica.2014.10.128. URL http://dx.doi.org/10.

1016/j.automatica.2014.10.128.

[58] D.Q. Mayne, J.B. Rawlings, C.V. Rao, and P.O.M. Scokaert. Constrained model predictive

control: Stability and optimality. Automatica, 36(6):789–814, 2000. doi:

10.1016/s0005-1098(99)00214-9.

[59] R. Merton. Continuous-Time Finance. Oxford, 1990.

[60] R. C. Merton and P.A. Samuelson. Fallacy of the log-normal approximation to optimal

portfolio decision-making over many periods. Journal of Financial Economics, 1:67–94,

1974.

[61] A. Meucci. Simulations with exact means and covariances. 2009. Available at Symmys:

http://symmys.com/node/162.

[62] A. Meucci and M. Nicolosi. Dynamic portfolio management with views at multiple

horizons. 2015. Available at SSRN: http://ssrn.com/abstract=2583612.

[63] A. Meucci, A. Santangelo, and R. Deguest. Measuring portfolio diversification based

on optimized uncorrelated factors. 2013. Available at SSRN: http://ssrn.com/abstract=

2276632.

[64] Richard O. Michaud and Robert O. Michaud. Efficient Asset Management: A Practical

Guide to Stock Portfolio Optimization and Asset Allocation. Oxford University Press, 2

edition, 2008. URL http://EconPapers.repec.org/RePEc:oxp:obooks:9780195331912.

[65] R.O. Michaud. The markowitz optimization enigma: Is optimized optimal? Financial

Analysts Journal, 45(1):31–42, 1989.

[66] G. Miller. Needles, haystacks, and hidden factors. The Journal of Portfolio Management,

32(2):25–32, 2006.

[67] Manfred Morari, Carlos E. Garcia, and David M. Prett. Model predictive control:

Theory and practice. Model Based Process Control, pages 1–12, 1989. doi: 10.1016/

b978-0-08-035735-5.50006-1.

[68] J. Mossin. Equilibrium in a Capital Asset Market. Econometrica, 34:766–783, 1966.

[69] J. Mossin. Optimal multiperiod portfolio policies. The Journal of Business, 41:215–229,

1968.

[70] Yu. Nesterov. Gradient methods for minimizing composite functions. Mathematical

Programming, 140(1):125–161, 2012. doi: 10.1007/s10107-012-0629-5.

[71] Yurii Nesterov and Arkadii Nemirovskii. Interior-point polynomial algorithms in convex

programming. 1994. doi: 10.1137/1.9781611970791.

[72] Brendan O’Donoghue, Giorgos Stathopoulos, and Stephen Boyd. A splitting method for

optimal control. IEEE Transactions on Control Systems Technology, 21(6):2432–2442, 2013.

doi: 10.1109/tcst.2012.2231960.

[73] Neal Parikh and Stephen Boyd. Proximal algorithms. Found. Trends Optim., 1(3):127–239,

January 2014. ISSN 2167-3888. doi: 10.1561/2400000003. URL http://dx.doi.org/10.1561/

2400000003.

[74] Eckhard Platen. A Benchmark Approach to Investing and Pricing. Research Paper Series

253, Quantitative Finance Research Centre, University of Technology, Sydney, August

2009.

[75] Eckhard Platen and David Heath. A Benchmark Approach to Quantitative Finance.

Springer, Berlin, 2006. ISBN 9783540262121.

[76] Eckhard Platen and Renata Rendek. Approximating the numeraire portfolio by naive

diversification. Research Paper Series 281, Quantitative Finance Research Centre, Univer-

sity of Technology, Sydney, 2010.

[77] Stanley R. Pliska. Introduction to mathematical finance: discrete time models. Blackwell

Publishers, 1997.

[78] W. B. Powell. Approximate dynamic programming: Solving the curses of dimensionality.

Wiley, 2007.

[79] M.L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming.

Wiley, New York, 1994.

[80] Stefan Richter, Colin N. Jones, and Manfred Morari. Real-time input-constrained mpc

using fast gradient methods. Proceedings of the 48th IEEE Conference on Decision and

Control (CDC) held jointly with 2009 28th Chinese Control Conference, 2009. doi: 10.1109/

cdc.2009.5400619.

[81] R.T. Rockafellar and S. Uryasev. Optimization of conditional value-at-risk. Journal of risk,

2:21–42, 2000.

[82] T. Roncalli. Introduction to Risk Parity and Budgeting. Chapman Hall/CRC, 2013.

[83] T. Roncalli and G. Weisang. Risk parity portfolios with risk factors. Working Paper, 2012.

Available at: http://www.thierry-roncalli.com/download/risk-factor-parity.pdf.

[84] Olivier Roy and Martin Vetterli. The effective rank: A measure of effective dimensionality.

In 15th European Signal Processing Conference, EUSIPCO 2007, Poznan, Poland, Septem-

ber 3-7, 2007, pages 606–610, 2007. URL http://ieeexplore.ieee.org/xpl/articleDetails.jsp?

arnumber=7098875.

[85] M. Rubinstein. Continuously rebalanced investment strategies. The Journal of Portfolio

Management, 10(3):387–406, 1991.

[86] P. A. Samuelson. Lifetime portfolio selection by dynamic stochastic programming. Review

of Economic Studies, 51:239–246, 1969.

[87] M.A. Saunders. Cholesky-based Methods for Sparse Least Squares: the Benefits of Regular-

ization. 1996.

[88] William Sharpe. Capital asset prices: A theory of market equilibrium under conditions of

risk. Journal of Finance, 19(3):425–442, 1964. URL http://EconPapers.repec.org/RePEc:

bla:jfinan:v:19:y:1964:i:3:p:425-442.

[89] J. Skaf and S. Boyd. Multi-Period Portfolio Optimization with Constraints and Transaction

Costs. Stanford working paper, 2009.

[90] L. Sneddon. The dynamics of active portfolios. In Proceedings of the Northfield Research

Conference 2005. Northinfo, 2005.

[91] Florin Spinu. An algorithm for computing risk parity weights. SSRN Electronic Journal,

2013. doi: 10.2139/ssrn.2297383.

[92] G. Stathopoulos, T. Keviczky, and Y. Wang. A hierarchical time-splitting approach for

solving finite-time optimal control problems. In 2013 European Control Conference (ECC),

pages 3089–3094, July 2013.

[93] M. C. Steinbach. Markowitz revisited: Mean-variance models in financial portfolio analysis.

SIAM Review, 43:31–85, 2001.

[94] J. Tobin. Liquidity preference as behavior towards risk. Review of Economic Studies, 25:

65–86, 1958.

[95] Jack L. Treynor. Market Value, Time, and Risk. Social Science Research Network Working

Paper Series, 1961. URL http://ssrn.com/abstract=2600356.

[96] Yang Wang and Stephen Boyd. Fast model predictive control using online optimization.

IEEE Transactions on Control Systems Technology, 18(2):267–278, 2010. doi: 10.1109/tcst.

2009.2017934.

[97] A. A. Yushkevich. Reduction of a controlled markov model with incomplete data to a

problem with complete information in the case of borel state and control space. Theory

of Probability Its Applications, 21(1):153–158, 1976. doi: 10.1137/1121014.
