Models of Price Impact
Part III Essay
by
Daniel Ritter
Date:
24th of April 2015
University of Cambridge
Faculty of Mathematics
Contents
1 Introduction
2 The Almgren Model
2.1 The Setup
2.2 The Linear Case
3 Adaptive Strategies
3.1 Mean-Variance Optimal Strategies
3.2 CARA Investors
4 Conclusion
1 Introduction
On the first pages of many introductory books to financial mathematics, the author gives
a number of assumptions and simplifications for the markets considered. Those simpli-
fications typically hold approximately in most cases of applications, but for a thorough
understanding of financial markets they have to be dropped and novel, stronger and more
complicated models of financial markets have to be developed. Two of those simplifica-
tions are a vanishing bid-ask spread and an infinite market depth, which means that there
is any desired number of shares available at the quoted price of an asset and trading this
asset has no influence on the evolution of the price.
For small market participants these assumptions reflect real markets in a satisfactory way.
Bid-ask spreads are typically one or a few ticks wide, and the real market depth is
much bigger than the volume traded by those small investors. They can execute the whole
trade at the given bid or ask price. For market participants who want to sell
(buy) a large number of shares of a certain stock, on the other hand, those simplifications
no longer reflect real-world behaviour well enough, since they would consume the entire
depth of the market at the given bid (ask) price and thus drive down (up) those prices.
This phenomenon is called price impact. Typically, a market participant is considered to
be large if he trades a number of 10 000 shares or more, which is then called a block trade.
This figure, however, should depend on the specific stock, since liquidity, and therefore
market depth, varies from stock to stock.
As described in the chapter “The Block Trader” of Sebastian Mallaby’s book “More
Money Than God” [11], those block trades gained significant importance in the period
from 1965 to 1984, when the percentage of trading volume on the markets represented by
block trades rose from less than 5 per cent to about 50 per cent, caused by an increasing
concentration of money in pension funds, insurance funds, and mutual funds, which had
previously been invested privately. [11, p. 52f.][16] Nowadays, owing to electronic trading, it is
possible for traders to split block trades of more than 10 000 shares into a number of smaller
trades with volumes of a few hundred shares. Each of these small trades has a small impact on
the price, and these impacts add up to the total price impact. This total impact, however,
depends significantly on how the large trade was split. It is, therefore, not only an
important question today, how trades influence the prices and how this can be modelled,
but even more which trading strategy is best for market participants wanting to sell or
buy a large number of shares. A recurring difficulty in these considerations is a trade-off:
slower selling or buying typically leads to less price impact, since the market has more
time to recover, whereas faster selling or buying reduces the exposure to random price
changes during the trading period. Finding the optimal middle course is one of today's
problems for companies, such as hedge funds, that trade large volumes on the market.
In this essay, we want to discuss one approach of modelling price impact and properties
of optimal trading strategies in this model. We use the approach made by Almgren and
Chriss in [1]. It suggests that the price impact of a trade consists of a permanent impact
and a temporary impact. The permanent impact reflects the fact that other market
participants assume our trading decision to be driven by some information about future
price changes. Also, it can be based on a speculation about further trades to follow the
first one in near future. It is called permanent impact since it influences the price for
the whole period we consider. The temporary impact is assumed to only influence the
price for the particular trade itself and stems from an imbalance in supply and demand
which moves the price away from the equilibrium price to a less favourable price. This
imbalance, however, is assumed to have been compensated by the time the next trade is made. The
basis for that is the widely discussed and empirically supported resilience of the order
book, which means that market makers tend to fill the gap left by trading. For
trades at an interval of some minutes the assumption of full compensation seems to be
sensible if the number of traded shares was not too large.
The essay is outlined as follows: In section 2, we will look at the Almgren model in more
detail. We will begin with a formal setup and then determine optimal strategies for a
special case. Then in section 3, we will discuss the benefit from permitting adaptive
strategies in our model. First, we will show that one can construct an optimal adaptive
trading strategy in the case of mean-variance optimisation. In the second part of section 3,
we will see that in the case of CARA (constant absolute risk aversion) utility optimisation
the optimal strategy is still a static one.
2 The Almgren Model
In this section, we will first formalise the setup of the Almgren model, which was broached
in the introduction. After that, we will explicitly solve the problem of mean-variance
optimisation for the linear case of the Almgren model, i. e. if we assume temporary
impact to be linear. The presented ideas come from Almgren and Chriss in [1, p. 7-14].
2.1 The Setup
The situation we want to consider in the following is that we are a trader on the market
and we hold a number of X0 ∈ R+0 shares of an asset at time 0. We want to liquidate
this portfolio by a prescribed time T < ∞. Note that we restrict ourselves to solving
the problem for this sale programme. It is obvious that the theory is also applicable for
buy programmes, where we start with 0 shares at time 0 and have to build up a certain
number of shares by prescribed time T . Also note that the situation can be generalised to
a portfolio of d assets whose price processes can be correlated in some way and make the
calculation more complicated. A discussion generalised to this possibility can be found in
[1, p. 36ff.] and [15], and to some extent in section 3.2.
The discrete times at which we are allowed to trade are given by $t_k := k\tau = kT/N$, where
$0 \le k \le N$ and $\tau := T/N$ is the period between each trade, typically a few minutes to some
hours. If empirical data suggests some intra-day seasonality in the traded volume, like in
[6, p. 1671], one can interpret time t as ‘volume time’ rather than physical time.
We follow a model for the underlying price process chosen by Almgren and Chriss in [1]
which is that of a discrete arithmetic random walk. That is, in periods in which we do
not trade any shares, the price would develop according to
\[ S_k = S_{k-1} + \sigma \tau^{1/2} \xi_k, \]
where $(\xi_k)_{k \ge 1}$ is a sequence of independent random variables with mean zero and variance one
on some probability space $(\Omega, \mathcal{F}, \mathbb{P})$, and $S_0$ is the deterministic, quoted price of the
asset at time 0. We assume that the process $(\xi_k)_{k \ge 1}$, or equivalently $(S_k)_{k \ge 0}$, induces a
filtration $(\mathcal{F}_k)_{k \ge 0} \subset \mathcal{F}$ and that $\mathcal{F}_N = \mathcal{F}$ is complete. Possible examples for $\xi_k$ would be
$\xi_k \sim \mathrm{Unif}\{-1,+1\}$ or $\xi_k \sim \mathcal{N}(0, 1)$. Note that we have not added any drift to the price
process. This is a sensible choice if we have no estimates on future price movements. A
discussion about optimal execution strategies with non-vanishing drift term can be found
in [1, p. 26ff.] and to some extent in section 3.2. Using an arithmetic random walk to
model the price process has the drawback that with positive probability the prices could
become negative. This probability, however, is very small and can be neglected for typical
trading periods $T$ of some hours or days, for which $S_0 \gg \sigma\sqrt{T}$, the standard deviation of $S_N$. The reason
for choosing an arithmetic random walk rather than some positive process, such as a discretised
geometric Brownian motion, is that it is mathematically more tractable and
allows an easier analysis since price changes are independent of the current price level.
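The price recursion above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the starting price of 50 is a hypothetical value, the choice $\xi_k \in \{-1,+1\}$ is one of the two examples named above, and $\sigma^2 = 0.95$, $\tau = 1$ are the values the essay later uses for its figures. The function name `simulate_prices` is our own.

```python
import math
import random


def simulate_prices(s0, sigma2, tau, n_steps, rng):
    """Return [S_0, ..., S_N] with S_k = S_{k-1} + sigma * tau**0.5 * xi_k."""
    step = math.sqrt(sigma2 * tau)  # sigma * tau^(1/2)
    prices = [s0]
    for _ in range(n_steps):
        xi = rng.choice((-1.0, 1.0))  # symmetric coin flip: mean 0, variance 1
        prices.append(prices[-1] + step * xi)
    return prices


rng = random.Random(0)  # fixed seed so the path is reproducible
# s0 = 50.0 is an assumed, purely illustrative starting price
path = simulate_prices(s0=50.0, sigma2=0.95, tau=1.0, n_steps=5, rng=rng)
increments = [b - a for a, b in zip(path, path[1:])]
print(path)
```

Every increment has the same absolute size $\sigma\tau^{1/2}$ regardless of the current price level, which is the independence property that makes the arithmetic walk tractable.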
A trading strategy $X = (X_k)_{k \ge 0}$ is specified by the random variables $X_k := X_{t_k} \in \mathbb{R}$, which
give the number of shares that we hold at time $t_k$. For now, we only consider
static trading strategies $X$, i. e. $(X_k)_{k \ge 0}$ is a sequence of deterministic constants.
Later, we will loosen this restriction and also permit certain dynamic strategies
which are previsible in the filtered probability space. That is, each random variable
$X_k$ may depend on information about the outcome of the random variables $\xi_1, \dots, \xi_{k-1}$.
Note that here and throughout the essay we allow non-integer values for the portfolio
holdings $X_k$. In practice, $X_0$ would be an integer and all results for $X_k$ would have to
be rounded to the nearest integer.
To simplify the notation in the sequel, define the number of shares of the asset sold
between times $t_{k-1}$ and $t_k$ to be $n_k := X_{k-1} - X_k$. This yields the following relation:
\[ X_k = X_0 - \sum_{j=1}^{k} n_j = \sum_{j=k+1}^{N} n_j \]
As mentioned before, in Almgren's model we assume the price impact to be composed of
a permanent part $I^{\mathrm{perm}}$ and a temporary part $I^{\mathrm{temp}}$ such that the actual equilibrium price
evolves according to
\[ S_k = S_{k-1} + \sigma\tau^{1/2}\xi_k - I^{\mathrm{perm}}_k \]
and the realised price for the $k$th trade is given by
\[ \tilde{S}_k = S_{k-1} - I^{\mathrm{temp}}_k. \]
So the price changes in the model are due to an exogenous factor, the volatility, which is
independent of the trading, and the endogenous factors permanent and temporary impact,
which are a reaction of the market to the trades. Both impacts should depend on the rate
of trading in the $k$th interval, which is given by $n_k/\tau$. We therefore set
\[ I^{\mathrm{perm}}_k = \tau g\left(\frac{n_k}{\tau}\right), \qquad I^{\mathrm{temp}}_k = h\left(\frac{n_k}{\tau}\right) \]
for some functions $g, h : \mathbb{R} \to \mathbb{R}$. A reasonable choice for $h$ has to be non-decreasing,
non-positive on $(-\infty, 0]$ and non-negative on $[0,\infty)$. Further, we assume $f(v) := v \cdot h(v)$
to be strictly convex, since larger trades should be penalised more heavily than smaller trades.
It is a result due to Huberman and Stanzl in [9] that the permanent impact has to be linear
if one wants to rule out quasi-arbitrage which “[...] is the availability of a series of trades
that generate infinite expected profits with an infinite Sharpe ratio.” [9, p. 1] That is,
not only the expected gain from such a strategy would be infinite, but even the expected
value of the gains divided by their standard deviation would be. This observation leads
to a linear choice for the permanent impact in our model. Since not having traded should
not be punished, we then get
g(v) = γv,
where γ > 0 and we have no constant summand.
Now we can investigate how different trading strategies lead to different revenues and
which strategies we favour. In absence of any price impact, we could just liquidate the
whole portfolio instantaneously and would receive a sum of S0X0. Considering price
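impact in the above sense, our revenue when liquidating $X_0$ shares with strategy $X$ until
time $T$, with the $k$th trade executed at the realised price $\tilde{S}_k = S_{k-1} - h(n_k/\tau)$, is:
\[
\begin{aligned}
R^X_T &= \sum_{k=1}^{N} n_k \tilde{S}_k \\
&= \sum_{k=1}^{N} (X_{k-1} - X_k) S_{k-1} - \sum_{k=1}^{N} n_k h\left(\frac{n_k}{\tau}\right) \\
&= \sum_{k=1}^{N} \left[ X_{k-1} S_{k-1} - X_k \left( S_k - \sigma\tau^{1/2}\xi_k + \tau g\left(\frac{n_k}{\tau}\right) \right) \right] - \sum_{k=1}^{N} \tau f\left(\frac{n_k}{\tau}\right) \\
&= \sum_{k=0}^{N-1} X_k S_k - \sum_{k=1}^{N} X_k S_k + \sum_{k=1}^{N} X_k \left( \sigma\tau^{1/2}\xi_k - \gamma n_k \right) - \sum_{k=1}^{N} \tau f\left(\frac{n_k}{\tau}\right) \\
&= S_0 X_0 + \sum_{k=1}^{N} \left( \sigma\tau^{1/2}\xi_k - \gamma n_k \right) X_k - \sum_{k=1}^{N} \tau f\left(\frac{n_k}{\tau}\right)
\end{aligned}
\]
We are doing this calculation without discounting the revenues from different times since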
we assume a short time horizon in which we liquidate the whole portfolio. The transaction
costs we have to pay when using execution strategy X then are
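\[ C(X_0, N, X) := S_0 X_0 - R^X_T = \sum_{k=1}^{N} \left( -\sigma\tau^{1/2}\xi_k + \gamma n_k \right) X_k + \sum_{k=1}^{N} \tau f\left(\frac{n_k}{\tau}\right), \]
where we identify $\sum_{k=1}^{N} -\sigma\tau^{1/2}\xi_k X_k$ as the effect of volatility, $\sum_{k=1}^{N} \tau f\left(\frac{n_k}{\tau}\right)$ as the effect
of the temporary impact, and
\[
\begin{aligned}
\sum_{k=1}^{N} \gamma n_k X_k &= \gamma \sum_{k=1}^{N} (X_{k-1} - X_k) X_k \\
&= \frac{\gamma}{2} \sum_{k=1}^{N} \left[ X_{k-1}^2 - X_k^2 - (X_k - X_{k-1})^2 \right] \\
&= \frac{\gamma}{2} X_0^2 - \frac{\gamma}{2} \sum_{k=1}^{N} n_k^2
\end{aligned}
\]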
as the effect of the permanent impact. One could imagine models where the temporary
impact carries some randomness. This case is discussed to some extent in [3]. Here,
however, the only randomness in the transaction costs lies in the volatility term of the
unaffected price process. One computes the expected transaction costs and their variance
as
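\[ E(X) := \mathbb{E}\left[ C(X_0, N, X) \right] = \frac{\gamma}{2} X_0^2 - \frac{\gamma}{2} \sum_{k=1}^{N} n_k^2 + \sum_{k=1}^{N} \tau f\left(\frac{n_k}{\tau}\right), \tag{1} \]
\[ V(X) := \mathrm{Var}\left[ C(X_0, N, X) \right] = \sum_{k=1}^{N} \sigma^2 \tau X_k^2. \tag{2} \]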
Since the square function is strictly convex, so is V (X) on the set of all liquidating
strategies X. Under sensible choices for γ and f , the expected costs E(X) are also
strictly convex. This has to be checked for each particular choice, however. For the rest
of this section, we will assume that γ and f are such that E(X) is strictly convex.
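Formulas (1) and (2) translate directly into code. The following is a minimal sketch, not part of the original essay: the function names are our own, and the particular impact function `f_linear` (with $f(v) = \varepsilon |v| + \eta v^2$) and all numerical values are illustrative assumptions.

```python
import math


def trades(X):
    """Shares sold per period: n_k = X_{k-1} - X_k for k = 1..N."""
    return [X[k - 1] - X[k] for k in range(1, len(X))]


def expected_cost(X, tau, gamma, f):
    """Expected transaction costs E(X), equation (1)."""
    n = trades(X)
    permanent = 0.5 * gamma * X[0] ** 2 - 0.5 * gamma * sum(nk ** 2 for nk in n)
    temporary = sum(tau * f(nk / tau) for nk in n)
    return permanent + temporary


def cost_variance(X, tau, sigma2):
    """Variance of transaction costs V(X), equation (2)."""
    return sum(sigma2 * tau * Xk ** 2 for Xk in X[1:])


def f_linear(v, eps=0.0625, eta=2.5e-6):
    """Assumed example: f(v) = v * h(v) with h(v) = eps * sgn(v) + eta * v."""
    return eps * abs(v) + eta * v * v


# Selling everything in the first period: X_1 = ... = X_N = 0
X_inst = [1e6, 0.0, 0.0, 0.0, 0.0, 0.0]
E_inst = expected_cost(X_inst, tau=1.0, gamma=2.5e-7, f=f_linear)
V_inst = cost_variance(X_inst, tau=1.0, sigma2=0.95)
print(E_inst, V_inst)
```

For the all-at-once trajectory the variance is zero and the two permanent-impact terms cancel, so only the temporary-impact term $\tau f(X_0/\tau)$ remains.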
Note that the strategy of immediately selling all shares, that is $X_1 = X_2 = \dots = X_N = 0$,
is the unique minimiser of $V(X)$ since it yields a variance of 0. We call this strategy the
instantaneous one and denote it by $X^{\mathrm{inst}}$. The expected (and deterministic) trading costs
in this case are
\[ E(X^{\mathrm{inst}}) = \frac{\gamma}{2} X_0^2 - \frac{\gamma}{2} X_0^2 + \tau f\left(\frac{X_0}{\tau}\right) = \tau f\left(\frac{X_0}{\tau}\right). \]
So without taking any risk, we have to pay costs of $\tau f(X_0/\tau)$ for liquidating the portfolio.
Strategies with higher expected costs, therefore, are typically considered as bad choices,
unless one is a risk-loving trader. No matter if one is a risk-averse, risk-neutral, or risk-
loving trader, however, one would always choose an execution strategy with lower expected
costs over one with the same variance but higher expected costs. This leads to the concept
of the efficient frontier, introduced by Almgren and Chriss in [1]. In practice, the purpose
of portfolio liquidation is to make its value available in cash, and not speculating with
it. So it is reasonable to assume a risk-averse trader who is trying to minimise expected
costs for a given maximum level of variance. Therefore, we call a strategy X efficient or
optimal, if it minimises expected costs for a given maximum level of variance $V_*$:
\[ \min_{X:\ V(X) \le V_*} E(X) \tag{3} \]
Due to the convexity of $V(X)$, the set $\{X : V(X) \le V_*\}$ is convex. Also, it is bounded
since $V(X) = \sum_{k=1}^{N} \sigma^2 \tau X_k^2$. Hence there exists a unique minimiser $X^*$ of the strictly
convex function $E(X)$. Since $X^*$ has variance $V(X^*)$, we can rewrite (3) by introducing
a Lagrange multiplier $\lambda$:
\[ \min_{X, \lambda}\ E(X) + \lambda \left( V(X) - V(X^*) \right) \]
Note that λ has to be non-negative by the equivalence to (3). If we wanted to solve
this explicitly, we would have to know V (X∗). Instead, we fix λ now and determine the
solution of
\[ \min_{X}\ E(X) + \lambda \left( V(X) - V(X^*) \right) \]
which is the same as the solution of
\[ \min_{X}\ E(X) + \lambda V(X), \tag{4} \]
where we got rid of the unknown constant $V(X^*)$ again. Since both $E(X)$ and $V(X)$
are strictly convex, we find a unique solution $X^*(\lambda)$ for each positive $\lambda$. As we vary $\lambda$
between 0 and $\infty$, we get the set of all efficient strategies $X^*$, for which we can determine
the corresponding variances $V_*$ again. What we did here was to change the parametrising
variable of the efficient frontier from $V_*$ to $\lambda$. The parameter $\lambda$ also has an economic
interpretation, since we can identify equation (4) as the typical approach of mean-variance
optimisation for the given level of risk aversion $\lambda$.
For some cases of temporary impact functions h, the minimisation problem can be solved
explicitly. In the next section, we will do this for linear impact.
2.2 The Linear Case
Throughout this section we will assume the temporary impact to be linear:
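\[ h\left(\frac{n_k}{\tau}\right) = \varepsilon\, \mathrm{sgn}(n_k) + \frac{\eta}{\tau} n_k \]
for constants $\varepsilon$ and $\eta$ greater than 0. The term $\varepsilon\, \mathrm{sgn}(n_k)$ can be interpreted as transaction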
costs, consisting of fees and half the bid-ask spread. Note that empirical studies refute
the assumption of linear impact and suggest a power law with exponent of one half [4][7]
or 3/5 [2, p. 20], or logarithmic behaviour [13, p. 6] instead. Solving the linear case is
easy and provides some insight into the behaviour of efficient strategies, however.
Plugging in the special form for h into (1), we can compute the expected transaction costs
for the linear model:
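\[
\begin{aligned}
E(X) &= \frac{\gamma}{2} X_0^2 - \frac{\gamma}{2} \sum_{k=1}^{N} n_k^2 + \sum_{k=1}^{N} \tau f\left(\frac{n_k}{\tau}\right) \\
&= \frac{\gamma}{2} X_0^2 - \frac{\gamma}{2} \sum_{k=1}^{N} n_k^2 + \sum_{k=1}^{N} n_k \left( \varepsilon\, \mathrm{sgn}(n_k) + \frac{\eta}{\tau} n_k \right) \\
&= \frac{\gamma}{2} X_0^2 + \sum_{k=1}^{N} \left[ \varepsilon |n_k| + \left(\frac{\eta}{\tau} - \frac{\gamma}{2}\right) n_k^2 \right]
\end{aligned}
\]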
Note that for η/τ ≤ γ/2 we could make this expression as small as we like by first selling
a huge amount of shares and then buying them again (or vice versa). This is due to
the fact that the parameters γ and η would be such that the permanent impact drives
down (up) the prices in the future even more than the temporary impact drives them up
(down) for the current sale. This, of course, makes no sense for real markets and violates
all assumptions we made on the model. Thus, we assume η/τ > γ/2 in the following.
But then we see that one can never improve a strategy by intermediate buying, since this
drives up both the expected transaction costs E(X) and their variance V (X) from (2).
From now on, we therefore restrict ourselves to pure sell programmes without intermediate
buying. That is, |nk| = nk and we get that
\[ E(X) = \frac{\gamma}{2} X_0^2 + \varepsilon X_0 + \sum_{k=1}^{N} \left(\frac{\eta}{\tau} - \frac{\gamma}{2}\right) n_k^2, \]
which is a strictly convex function on the set of strategies that liquidate X0 shares in time
$T$. In order to minimise $E(X)$, note that it is minimal if and only if $\sum_{k=1}^{N} n_k^2$ is. We can
solve this constrained optimisation problem (recall that $\sum_{k=1}^{N} n_k = X_0$) by introducing
a Lagrange multiplier which we call $\tilde{\lambda}$ to avoid confusion with the risk aversion $\lambda$ from
before:
\[ 0 = \frac{\partial}{\partial n_k} \left( \sum_{k=1}^{N} n_k^2 - \tilde{\lambda} \left( \sum_{k=1}^{N} n_k - X_0 \right) \right) = 2 n_k - \tilde{\lambda} \]
This yields $\tilde{\lambda} = 2 n_k$ for all $k$ and in particular all $n_k$ are equal. So it must hold that
$n_k = X_0/N$. We call this strategy the linear one and denote it by $X^{\mathrm{lin}}$. Plugging it into
our formula for E(X) gives us the smallest possible expected costs of
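\[
E_{\mathrm{lin}}(X_0, N) := E(X^{\mathrm{lin}}) = \frac{\gamma}{2} X_0^2 + \varepsilon X_0 + \sum_{k=1}^{N} \left(\frac{\eta}{\tau} - \frac{\gamma}{2}\right) \left(\frac{X_0}{N}\right)^2
= \frac{\gamma}{2} X_0^2 + \varepsilon X_0 + \left(\eta - \frac{\gamma\tau}{2}\right) \frac{X_0^2}{T}.
\]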
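Likewise, we can compute the variance of the strategy:
\[
\begin{aligned}
V_{\mathrm{lin}}(X_0, N) := V(X^{\mathrm{lin}}) &= \sum_{k=1}^{N} \sigma^2 \tau X_k^2 = \sum_{k=1}^{N} \sigma^2 \tau \left(\frac{N-k}{N} X_0\right)^2 \\
&= \sigma^2 \tau X_0^2 \sum_{\ell=0}^{N-1} \left(\frac{\ell}{N}\right)^2 = \sigma^2 \frac{T}{N} X_0^2\, \frac{(N-1)N(2N-1)}{6N^2} \\
&= \sigma^2 T X_0^2 \cdot \frac{1}{6} \left(1 - \frac{1}{N}\right) \left(2 - \frac{1}{N}\right)
\end{aligned}
\]
So for all $V_* \ge V_{\mathrm{lin}}(X_0, N) = \sigma^2 T X_0^2 \cdot \frac{1}{6} \left(1 - \frac{1}{N}\right) \left(2 - \frac{1}{N}\right)$ the strategy $X^{\mathrm{lin}}$ is the optimal
one. It corresponds to the risk aversion $\lambda = 0$ since it minimises $E(X)$. Also, we already
know that the optimal strategy for $\lambda = \infty$ is the instantaneous one, which yields a variance
of 0 and therefore minimises $\lim_{\lambda \to \infty} E(X) + \lambda V(X)$. Its expected costs are
\[ E(X^{\mathrm{inst}}) = \varepsilon X_0 + \frac{\eta}{\tau} X_0^2. \]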
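The closed forms for the linear and the instantaneous strategy can be checked numerically against the defining sums. The sketch below is our own illustration; the parameter values are the ones the essay later adopts from [1, Table 1] for its figures, and the helper names are hypothetical.

```python
import math

# Parameter values as used later in the essay for figures 1 and 2
X0, T, N = 1e6, 5.0, 5
tau = T / N
sigma2, eta, gamma, eps = 0.95, 2.5e-6, 2.5e-7, 0.0625


def expected_cost(X):
    """E(X) for linear temporary impact (valid for any trade signs)."""
    n = [X[k - 1] - X[k] for k in range(1, len(X))]
    return 0.5 * gamma * X[0] ** 2 + sum(
        eps * abs(nk) + (eta / tau - 0.5 * gamma) * nk ** 2 for nk in n)


def cost_variance(X):
    """V(X) = sum_k sigma^2 * tau * X_k^2."""
    return sum(sigma2 * tau * Xk ** 2 for Xk in X[1:])


X_lin = [X0 * (N - k) / N for k in range(N + 1)]  # equal trades n_k = X0 / N
X_inst = [X0] + [0.0] * N                         # everything sold at once

# Closed forms from the text
E_lin_closed = 0.5 * gamma * X0 ** 2 + eps * X0 + (eta - 0.5 * gamma * tau) * X0 ** 2 / T
V_lin_closed = sigma2 * T * X0 ** 2 * (1 - 1 / N) * (2 - 1 / N) / 6
E_inst_closed = eps * X0 + eta / tau * X0 ** 2

print(expected_cost(X_lin), E_lin_closed)
print(cost_variance(X_lin), V_lin_closed)
print(expected_cost(X_inst), E_inst_closed)
```

Each printed pair should agree up to floating-point rounding, which confirms that the simplifications in the derivation are consistent with formulas (1) and (2).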
We now want to compute the minimiser of $U(X) := E(X) + \lambda V(X)$ for general $\lambda \in (0,\infty)$
for the case of linear temporary impact, as it was done in [1, p. 13f.].
Proposition 2.1 In the Almgren model with linear temporary impact function, the unique
minimiser of mean-variance with risk aversion $\lambda$ is given by the strategy $X = (X_0, \dots, X_N)$,
with
\[ X_j = X_0\, \frac{\sinh(\kappa(T - t_j))}{\sinh(\kappa T)}, \]
where $\kappa$ is a solution of the equation
\[ 2(\cosh(\kappa\tau) - 1) \left(\frac{\eta}{\tau} - \frac{\gamma}{2}\right) = \lambda \sigma^2 \tau. \]
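Proof. We have
\[ U(X) = E(X) + \lambda V(X) = \frac{\gamma}{2} X_0^2 + \varepsilon X_0 + \sum_{k=1}^{N} \left(\frac{\eta}{\tau} - \frac{\gamma}{2}\right) (X_{k-1} - X_k)^2 + \lambda \sum_{k=1}^{N} \sigma^2 \tau X_k^2. \]
This yields, for $1 \le j \le N-1$,
\[
\begin{aligned}
0 = \frac{\partial U}{\partial X_j} &= \left(\frac{\eta}{\tau} - \frac{\gamma}{2}\right) \left[ 2(X_j - X_{j+1}) - 2(X_{j-1} - X_j) \right] + 2\lambda\sigma^2\tau X_j \\
&= 2\lambda\sigma^2\tau X_j - 2\left(\frac{\eta}{\tau} - \frac{\gamma}{2}\right) (X_{j-1} - 2X_j + X_{j+1}),
\end{aligned}
\]
which is equivalent to
\[ X_{j-1} - 2X_j + X_{j+1} = \frac{\lambda\sigma^2}{\eta - \frac{\gamma\tau}{2}}\, \tau^2 X_j = \tilde{\kappa}^2 \tau^2 X_j, \]
where $\tilde{\kappa}^2 := \lambda\sigma^2 / \left(\eta - \frac{\gamma\tau}{2}\right)$. Since $\tilde{\kappa}^2\tau^2 > 0$, a solution to this difference equation must be
unique, if it exists. We guess that the solution is of the form
\[ X_j = c_- e^{-\kappa t_j} + c_+ e^{\kappa t_j} = c_- e^{-\kappa j \tau} + c_+ e^{\kappa j \tau}. \]
Such a $\kappa$ must solve the difference equation, i. e.
\[
\begin{aligned}
0 &= X_{j-1} - 2X_j + X_{j+1} - \tilde{\kappa}^2\tau^2 X_j \\
&= c_- e^{-\kappa j \tau} \left( e^{\kappa\tau} - (2 + \tilde{\kappa}^2\tau^2) + e^{-\kappa\tau} \right) + c_+ e^{\kappa j \tau} \left( e^{-\kappa\tau} - (2 + \tilde{\kappa}^2\tau^2) + e^{\kappa\tau} \right) \\
&= \left( e^{-\kappa\tau} - (2 + \tilde{\kappa}^2\tau^2) + e^{\kappa\tau} \right) X_j.
\end{aligned}
\]
Since $X_0 \neq 0$ it follows that $e^{-\kappa\tau} - (2 + \tilde{\kappa}^2\tau^2) + e^{\kappa\tau} = 0$ and therefore
\[ 2\cosh(\kappa\tau) - 2 = e^{\kappa\tau} + e^{-\kappa\tau} - 2 = \tilde{\kappa}^2\tau^2, \]
which is the equation for $\kappa$ stated in the proposition.
For positive $\tilde{\kappa}^2\tau^2$ there exist exactly two solutions to this equation, one of them positive
and the other one negative and both having the same absolute value. This is not really
surprising and comes from the symmetry in our ansatz. So choose $\kappa$ to be the positive
solution to $2\cosh(\kappa\tau) - 2 = \tilde{\kappa}^2\tau^2$, say. We still have to determine the coefficients $c_\pm$ and
do this by using the constraints to our solution at times 0 and $N$:
\[
\begin{aligned}
X_0 &= c_- e^{-\kappa \cdot 0} + c_+ e^{\kappa \cdot 0} = c_- + c_+ \\
0 = X_N &= c_- e^{-\kappa N \tau} + c_+ e^{\kappa N \tau}
\end{aligned}
\]
This yields
\[ c_- = \frac{X_0 e^{\kappa N \tau}}{e^{\kappa N \tau} - e^{-\kappa N \tau}} \quad \text{and} \quad c_+ = -\frac{X_0 e^{-\kappa N \tau}}{e^{\kappa N \tau} - e^{-\kappa N \tau}}. \]
Altogether we have:
\[
\begin{aligned}
X_j &= \frac{X_0 e^{\kappa N \tau}}{e^{\kappa N \tau} - e^{-\kappa N \tau}}\, e^{-\kappa j \tau} - \frac{X_0 e^{-\kappa N \tau}}{e^{\kappa N \tau} - e^{-\kappa N \tau}}\, e^{\kappa j \tau} \\
&= X_0\, \frac{\sinh(\kappa(N - j)\tau)}{\sinh(\kappa N \tau)} \\
&= X_0\, \frac{\sinh(\kappa(T - t_j))}{\sinh(\kappa T)}
\end{aligned}
\]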
From the uniqueness of the solution, one can see a very important feature of optimal
trading strategies, namely that they are time-homogeneous. That is, re-evaluating the
strategy at a later time $t_k$ always yields the strategy obtained at time 0. The only thing
that changes is the start value $X_k$, which replaces the start value $X_0$. But at time $t_k$ the
difference equation stays the same and is uniquely determined by its boundary values Xk
and XN = 0. This statement can also be generalised to other impact functions h, as long
as the corresponding utility function U is quadratic. [1, p. 19]
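The trajectory of proposition 2.1 is straightforward to evaluate. The following sketch is our own illustration, not code from [1]: it solves $2(\cosh(\kappa\tau) - 1)(\eta/\tau - \gamma/2) = \lambda\sigma^2\tau$ for the positive $\kappa$ via the inverse hyperbolic cosine and treats $\lambda = 0$ as the linear limit. The function name and the example parameters (taken from the values the essay uses for figure 1) are assumptions for the purpose of the example.

```python
import math


def optimal_holdings(X0, T, N, lam, sigma2, eta, gamma):
    """Holdings X_j = X0 * sinh(kappa * (T - t_j)) / sinh(kappa * T), j = 0..N."""
    tau = T / N
    # Right-hand side of the difference equation divided by tau^2 * X_j
    kappa_tilde_sq = lam * sigma2 / (eta - 0.5 * gamma * tau)
    # Positive solution of 2 * (cosh(kappa * tau) - 1) = kappa_tilde^2 * tau^2
    kappa = math.acosh(1.0 + 0.5 * kappa_tilde_sq * tau ** 2) / tau
    if kappa == 0.0:
        # lam = 0: the sinh ratio degenerates to the linear strategy
        return [X0 * (N - j) / N for j in range(N + 1)]
    denom = math.sinh(kappa * N * tau)
    return [X0 * math.sinh(kappa * (N - j) * tau) / denom for j in range(N + 1)]


params = dict(X0=1e6, T=5.0, N=5, sigma2=0.95, eta=2.5e-6, gamma=2.5e-7)
X = optimal_holdings(lam=5e-7, **params)

# Residual of X_{j-1} - 2 X_j + X_{j+1} = kappa_tilde^2 * tau^2 * X_j
tau = params["T"] / params["N"]
kts = 5e-7 * params["sigma2"] / (params["eta"] - 0.5 * params["gamma"] * tau)
residuals = [X[j - 1] - 2 * X[j] + X[j + 1] - kts * tau ** 2 * X[j]
             for j in range(1, params["N"])]
print(X)
print(residuals)
```

The residuals should vanish up to rounding, and restarting the function from any intermediate holding $X_k$ with the remaining time reproduces the tail of the trajectory, which is the time-homogeneity described above.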
In figure 1 you can see the trajectories of the optimal strategies for the values $\lambda_1 = 0$,
$\lambda_2 = 5 \cdot 10^{-7}$, $\lambda_3 = 3 \cdot 10^{-6}$, and $\lambda_4 = \infty$. The underlying values for the parameters are
$X_0 = 10^6$, $T = 5\,\mathrm{d}$, $\tau = 1\,\mathrm{d}$, $N = 5$, $\sigma^2 = 0.95$, $\eta = 2.5 \cdot 10^{-6}$, and $\gamma = 2.5 \cdot 10^{-7}$. All those
values are adopted from [1, Table 1] and describe a typical situation when liquidating a
portfolio of one million shares within 5 days and being allowed to trade once a day. In
figure 2, one can see the efficient frontier for the linear case. The parameters are the same
as for figure 1. In addition, $\varepsilon$ is chosen to be $\varepsilon = 0.0625$, which comes from [1, Table 1]
as well.
Note the flatness of the curve near λ1. By allowing for only a little more expected costs
(e. g. 17% for λ2), one can enormously reduce the variance (e. g. 51% for λ2). Further
reduction of variance is accompanied by strongly increasing costs, however.
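The points on the efficient frontier for the four risk aversions can be recomputed with a short script. This is a sketch under the parameter values stated above, with helper names of our own choosing; it does not reproduce the figures themselves, and no particular output values are asserted here.

```python
import math

X0, T, N = 1e6, 5.0, 5
tau = T / N
sigma2, eta, gamma, eps = 0.95, 2.5e-6, 2.5e-7, 0.0625


def holdings(lam):
    """Optimal static holdings (X_0, ..., X_N) for risk aversion lam."""
    if math.isinf(lam):
        return [X0] + [0.0] * N  # instantaneous liquidation
    kappa_tilde_sq = lam * sigma2 / (eta - 0.5 * gamma * tau)
    kappa = math.acosh(1.0 + 0.5 * kappa_tilde_sq * tau ** 2) / tau
    if kappa == 0.0:
        return [X0 * (N - j) / N for j in range(N + 1)]  # linear strategy
    denom = math.sinh(kappa * N * tau)
    return [X0 * math.sinh(kappa * (N - j) * tau) / denom for j in range(N + 1)]


def mean_and_variance(X):
    """(E(X), V(X)) for a pure sell programme with linear temporary impact."""
    n = [X[k - 1] - X[k] for k in range(1, len(X))]
    mean = 0.5 * gamma * X[0] ** 2 + eps * X[0] + sum(
        (eta / tau - 0.5 * gamma) * nk ** 2 for nk in n)
    var = sum(sigma2 * tau * Xk ** 2 for Xk in X[1:])
    return mean, var


lambdas = [0.0, 5e-7, 3e-6, math.inf]
frontier = [mean_and_variance(holdings(lam)) for lam in lambdas]
for lam, (mean, var) in zip(lambdas, frontier):
    print(f"lambda={lam:g}  E={mean:.0f}  V={var:.4g}")
```

Moving from one risk aversion to the next higher one raises the expected costs and lowers the variance, which is the trade-off traced out by the frontier.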
Figure 1: Optimal Trading Trajectories for Different Levels of Risk Aversion
Figure 2: The Efficient Frontier
3 Adaptive Strategies
Up to now, we have only considered static trading strategies, and we remarked that
re-evaluation in the middle of trading has no effect on further trades. But as mentioned in the
beginning, dynamic strategies (fixed at the initial time for all possible outcomes) are also
possible. In the first part of this section, we will construct an optimal adaptive strategy
which strictly improves one's mean-variance, measured at initial time $t = 0$, as compared
to the optimal static strategy obtained in section 2.2. Here, we will follow Lorenz and
Almgren in [10, p. 11-16]. In the second part, we discuss the situation for an investor
with constant absolute risk aversion (CARA). We will show that in this case there is no
gain from permitting adaptive strategies. For that, we will follow Schied, Schöneborn,
and Tehranchi in [15, p. 3-16].
3.1 Mean-Variance Optimal Strategies
For this section, we keep the setting from the linear case and we want to allow previsible
strategies as well now.
We want to obtain optimal previsible strategies by using dynamic programming, i. e. we
will derive optimal dynamic strategies recursively. Before we can formulate our results,
however, we first have to define some new notation. Let
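\[
\mathcal{D}(X_0, N) = \left\{ (X, C) \;\middle|\;
\begin{array}{l}
X = (X_0, X_1, \dots, X_N) \text{ with } X_N = 0 \\
X \text{ is previsible} \\
X_0 \ge X_1 \ge \dots \ge X_N \\
C(X_0, N, X) \le C \text{ a.s.}
\end{array}
\right\}.
\]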
This denotes the set of all strategies which liquidate X0 shares in N steps. Also, with
each strategy in this set we get an upper bound C for the trading costs which is an
FN -measurable random variable itself and C(X0, N,X), for example, will always do the
job. Sometimes, however, it can be favourable to deliberately increase trading costs if
this decreases mean-variance. This inconvenience is due to the fact that variance is not
monotone and penalises downward deviations in the same way as upward
deviations. That is, it can happen that increasing trading costs decreases variance so
much that mean-variance in total goes down. Now define for a given level $E \in \mathbb{R}$ the set
\[ \mathcal{A}(X_0, N, E) = \left\{ (X, C) \in \mathcal{D}(X_0, N) \;\middle|\; \mathbb{E}[C] \le E \right\}, \]
which describes the set of strategies whose expected costs are bounded from above by the
constant $E$. Note that this set is empty if $\mathbb{E}[C(X_0, N, X)] > E$ for all possible execution
strategies $X$. In section 2.2 we have shown that the linear strategy $X^{\mathrm{lin}}$ minimises expected
trading costs on the set of static strategies. The result stays the same even if we allow
previsible strategies: Using the previsibility of $X$ and Jensen's inequality we get
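\[
\begin{aligned}
\mathbb{E}[C(X_0, N, X)] &= \mathbb{E}\left[ \frac{\gamma}{2} X_0^2 + \sum_{k=1}^{N} (-\sigma\tau^{1/2}\xi_k) X_k + \varepsilon X_0 + \sum_{k=1}^{N} \left(\frac{\eta}{\tau} - \frac{\gamma}{2}\right) n_k^2 \right] \\
&= \frac{\gamma}{2} X_0^2 + \sum_{k=1}^{N} (-\sigma\tau^{1/2})\, \mathbb{E}\left[ \mathbb{E}[\xi_k \mid \mathcal{F}_{k-1}]\, X_k \right] + \varepsilon X_0 + \sum_{k=1}^{N} \left(\frac{\eta}{\tau} - \frac{\gamma}{2}\right) \mathbb{E}\left[n_k^2\right] \\
&= \frac{\gamma}{2} X_0^2 + \varepsilon X_0 + \sum_{k=1}^{N} \left(\frac{\eta}{\tau} - \frac{\gamma}{2}\right) \mathbb{E}\left[n_k^2\right] \\
&\ge \frac{\gamma}{2} X_0^2 + \varepsilon X_0 + \sum_{k=1}^{N} \left(\frac{\eta}{\tau} - \frac{\gamma}{2}\right) \mathbb{E}[n_k]^2,
\end{aligned}
\]
where the volatility term vanishes because $X_k$ is $\mathcal{F}_{k-1}$-measurable and $\mathbb{E}[\xi_k \mid \mathcal{F}_{k-1}] = 0$.
As in section 2.2, the last expression becomes minimal only for $\mathbb{E}[n_k] = X_0/N$. Since the above
inequality is an equality if and only if every $n_k$ is almost surely constant, we get the unique minimising
strategy $n_k = X_0/N$, which is the linear strategy $X^{\mathrm{lin}}$. Using this, we get that $\mathcal{A}(X_0, N, E)$
is empty if and only if $E < E(X^{\mathrm{lin}})$.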
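We can also describe the efficient frontier from section 2.1 using the sets $\mathcal{A}(X_0, N, E)$:
\[ V_{\min}(E) = \inf\left\{ \mathrm{Var}(C) \;\middle|\; (X, C) \in \mathcal{A}(X_0, N, E) \right\} \]
For $E < E(X^{\mathrm{lin}})$ this is $V_{\min}(E) = \infty$. For $E \ge E(X^{\mathrm{lin}})$ we get
$V_{\min}(E) \le V(X^{\mathrm{lin}}) = \sigma^2 T X_0^2 \cdot \frac{1}{6} \left(1 - \frac{1}{N}\right) \left(2 - \frac{1}{N}\right)$, and for $E \ge E(X^{\mathrm{inst}})$ it holds that $V_{\min}(E) = 0$, since the strategy
of instantaneously selling all shares at time $t = 0$ has expected and deterministic costs
$E(X^{\mathrm{inst}})$ without any variance. Also, we define the set of all efficient strategies as
\[ \mathcal{E}(X_0, N) = \left\{ (X, C) \;\middle|\; \mathrm{Var}(\tilde{C}) \ge \mathrm{Var}(C) \text{ for all } (\tilde{X}, \tilde{C}) \in \mathcal{A}(X_0, N, \mathbb{E}[C]) \right\}. \]
That is, a strategy is called efficient if there is no strategy with the same or less expected
costs but lower variance.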
We want to show that an efficient strategy for $N$ steps is also efficient at each intermediate
step. For that, we denote the tail of a trading strategy $(X, C) \in \mathcal{D}(X_0, N)$, with
$X = (X_0, X_1, \dots, X_{N-1}, 0)$, by $(X, C)_{\xi_1} \in \mathcal{D}(X_1, N-1)$, where the subscript $\xi_1$ indicates
that the strategy is conditioned on the outcome of $\xi_1$, since all random variables
$X_2, \dots, X_N$ can make use of information about this outcome.
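In the linear case, the trading costs are given by
\[ C(X_0, N, X) = \frac{\gamma}{2} X_0^2 + \sum_{k=1}^{N} \left(\frac{\eta}{\tau} - \frac{\gamma}{2}\right) n_k^2 + \varepsilon X_0 - \sum_{k=1}^{N} \sigma\tau^{1/2}\xi_k X_k. \]
Since $(\gamma/2) X_0^2 + \varepsilon X_0$ is independent of the trading strategy, we will drop it in the following
and assume
\[ C(X_0, N, X) = \sum_{k=1}^{N} \left(\frac{\eta}{\tau} - \frac{\gamma}{2}\right) n_k^2 - \sum_{k=1}^{N} \sigma\tau^{1/2}\xi_k X_k. \]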
Then we have that
\[ C = C_{\xi_1} + \left(\frac{\eta}{\tau} - \frac{\gamma}{2}\right) (X_0 - X_1)^2 - \sigma\tau^{1/2}\xi_1 X_1, \]
where $C_{\xi_1}$ describes the cost bound of the tail strategy $(X, C)_{\xi_1}$. In the same manner, we
also denote the trade schedule of the tail strategy by $X_{\xi_1}$, which should not be confused
with the asset holdings $X_k$ of strategy $X$. It will always be clear which of them is meant
in the following. With all the introduced notation, we can now formulate the following
lemma from [10, p. 12]. Note that we have added some more details to the proof from
[10, p. 12f.] here but the idea is the same.
Lemma 3.1 For $N \ge 2$, let $(X, C) \in \mathcal{E}(X_0, N)$ be an efficient execution strategy with
$X = (X_0, X_1, \dots, X_{N-1}, 0)$. Then almost surely it holds that $(X, C)_{\xi_1} \in \mathcal{E}(X_1, N-1)$, i. e.
$B = \left\{ (X, C)_{\xi_1} \notin \mathcal{E}(X_1, N-1) \right\} \subseteq \Omega$ has probability zero.
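Proof. If $B = \emptyset$, then we are done. Suppose now $B \neq \emptyset$. Then by definition, on $B$ the
strategy $(X, C)_{\xi_1}$ is not efficient and there exists a strategy $(X^*_{\xi_1}, C^*_{\xi_1}) \in \mathcal{D}(X_1, N-1)$ on
$B$ such that $\mathrm{Var}(C^*_{\xi_1} \mid \xi_1) < \mathrm{Var}(C_{\xi_1} \mid \xi_1)$ while $\mathbb{E}[C^*_{\xi_1} \mid \xi_1] \le \mathbb{E}[C_{\xi_1} \mid \xi_1]$. Without
loss of generality, we can assume equality here by adding the $\sigma(\xi_1)$-measurable term
$\mathbb{E}[C_{\xi_1} \mid \xi_1] - \mathbb{E}[C^*_{\xi_1} \mid \xi_1]$, which has no effect on the conditional variance.
Define now a new strategy $(\tilde{X}, \tilde{C})$ by replacing the trade schedules at times $t_2, \dots, t_N$
and the cost bound in $(X, C)$ by $(X^*_{\xi_1}, C^*_{\xi_1})$ on $B$ and keeping the original strategy on $B^c$.
Since then $(\tilde{X}_{\xi_1}, \tilde{C}_{\xi_1}) \in \mathcal{D}(X_1, N-1)$ on the whole of $\Omega$, we have that
\[ \tilde{C}(\omega) = \tilde{C}_{\xi_1(\omega)}(\omega) + g(\xi_1(\omega)) \ge C(X_1, N-1, \tilde{X}_{\xi_1})(\omega) + g(\xi_1(\omega)) = C(X_0, N, \tilde{X})(\omega), \]
where we use the abbreviation $g(\xi_1) := \left(\frac{\eta}{\tau} - \frac{\gamma}{2}\right)(X_0 - X_1)^2 - \sigma\tau^{1/2}\xi_1 X_1$, which is a
$\sigma(\xi_1)$-measurable expression. Therefore $(\tilde{X}, \tilde{C}) \in \mathcal{D}(X_0, N)$. Also, from the fact that
$\mathbb{E}[C^*_{\xi_1} \mid \xi_1] = \mathbb{E}[C_{\xi_1} \mid \xi_1]$ on $B$, it follows that
\[
\begin{aligned}
\mathbb{E}[\tilde{C}] &= \mathbb{E}\left[ \mathbb{E}[\tilde{C} \mid \xi_1] \right] = \mathbb{E}\left[ \mathbb{E}[C^*_{\xi_1} \mid \xi_1] 1_B + \mathbb{E}[C_{\xi_1} \mid \xi_1] 1_{B^c} + g(\xi_1) \right] \\
&= \mathbb{E}\left[ \mathbb{E}[C_{\xi_1} \mid \xi_1] + g(\xi_1) \right] = \mathbb{E}[C]
\end{aligned}
\]
and
\[
\begin{aligned}
\mathrm{Var}(\tilde{C}) &= \mathbb{E}\left[ \mathrm{Var}(\tilde{C} \mid \xi_1) \right] + \mathrm{Var}\left( \mathbb{E}[\tilde{C} \mid \xi_1] \right) \\
&= \mathbb{E}\left[ \mathrm{Var}(\tilde{C}_{\xi_1} + g(\xi_1) \mid \xi_1) \right] + \mathrm{Var}\left( \mathbb{E}[\tilde{C}_{\xi_1} + g(\xi_1) \mid \xi_1] \right) \\
&= \mathbb{E}\left[ \mathrm{Var}(\tilde{C}_{\xi_1} \mid \xi_1) \right] + \mathrm{Var}\left( \mathbb{E}[\tilde{C}_{\xi_1} \mid \xi_1] + g(\xi_1) \right) \\
&= \mathbb{E}\left[ \mathrm{Var}(C^*_{\xi_1} \mid \xi_1) 1_B + \mathrm{Var}(C_{\xi_1} \mid \xi_1) 1_{B^c} \right] + \mathrm{Var}\left( \mathbb{E}[C_{\xi_1} \mid \xi_1] + g(\xi_1) \right).
\end{aligned}
\]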
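If $\mathbb{P}(B) > 0$, we could then conclude
\[
\begin{aligned}
\mathrm{Var}(\tilde{C}) &< \mathbb{E}\left[ \mathrm{Var}(C_{\xi_1} \mid \xi_1) 1_B + \mathrm{Var}(C_{\xi_1} \mid \xi_1) 1_{B^c} \right] + \mathrm{Var}\left( \mathbb{E}[C_{\xi_1} + g(\xi_1) \mid \xi_1] \right) \\
&= \mathbb{E}\left[ \mathrm{Var}(C_{\xi_1} \mid \xi_1) \right] + \mathrm{Var}\left( \mathbb{E}[C \mid \xi_1] \right) \\
&= \mathbb{E}\left[ \mathrm{Var}(C_{\xi_1} + g(\xi_1) \mid \xi_1) \right] + \mathrm{Var}\left( \mathbb{E}[C \mid \xi_1] \right) \\
&= \mathbb{E}\left[ \mathrm{Var}(C \mid \xi_1) \right] + \mathrm{Var}\left( \mathbb{E}[C \mid \xi_1] \right) = \mathrm{Var}(C),
\end{aligned}
\]
which would contradict our assumption of $(X, C) \in \mathcal{E}(X_0, N)$.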
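We now extend our previous definition of $V_{\min}$ to shorter intervals. For $1 \le k \le N$ and
$x \ge 0$ let
\[ J_k(x, c) = \inf\left\{ \mathrm{Var}(C) \;\middle|\; (X, C) \in \mathcal{A}(x, k, c) \right\}. \]
Then clearly, $V_{\min}(E) = J_N(X_0, E)$. Also we get the following properties:
\[
J_k(x, c) =
\begin{cases}
\infty, & c < \left(\frac{\eta}{\tau} - \frac{\gamma}{2}\right) \frac{x^2}{k} \\
V_{\mathrm{lin}}(x, k), & c = \left(\frac{\eta}{\tau} - \frac{\gamma}{2}\right) \frac{x^2}{k} \\
\text{non-increasing in } c, & \left(\frac{\eta}{\tau} - \frac{\gamma}{2}\right) \frac{x^2}{k} \le c \le \left(\frac{\eta}{\tau} - \frac{\gamma}{2}\right) x^2 \\
0, & c \ge \left(\frac{\eta}{\tau} - \frac{\gamma}{2}\right) x^2
\end{cases}
\]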
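This is due to the fact that the linear liquidation minimises expected costs, with variance
$V_{\mathrm{lin}}(x, k)$ and costs $\left(\frac{\eta}{\tau} - \frac{\gamma}{2}\right) \frac{x^2}{k}$. For $c \ge \left(\frac{\eta}{\tau} - \frac{\gamma}{2}\right) x^2$ we can choose the instantaneous
liquidation, which yields variance 0. In between, since we are always minimising, the
variance can only be non-increasing. For $k = 1$ linear and instantaneous liquidation
coincide, and hence
\[
J_1(x, c) =
\begin{cases}
\infty, & c < \left(\frac{\eta}{\tau} - \frac{\gamma}{2}\right) x^2 \\
0, & c \ge \left(\frac{\eta}{\tau} - \frac{\gamma}{2}\right) x^2.
\end{cases}
\]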
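The following relations between $J_k(x, c)$ and $\mathcal{E}(x, k)$ obviously hold:
\[ (X^*, C^*) = \operatorname*{argmin}_{(X, C) \in \mathcal{A}(x, k, c)} \mathrm{Var}(C) \;\Rightarrow\; (X^*, C^*) \in \mathcal{E}(x, k), \tag{5} \]
\[ (X, C) \in \mathcal{E}(x, k) \;\Rightarrow\; \mathrm{Var}(C) = J_k(x, \mathbb{E}[C]) \tag{6} \]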
We can now formulate the main statement of this section. It helps derive optimal previsible
strategies by a recursive scheme, which minimises some value function in each trading
period. In each step we give ourselves two control parameters, the number of shares y
to keep in the portfolio at the end of the period and the cost limit function z(ξ). When
following an optimal strategy, we commit ourselves to sell the remaining y shares using an
efficient strategy with cost bound $z(\xi)$ dependent on the price change $\sigma\tau^{1/2}\xi$. The theorem
and its proof come from [10, p. 14f.].
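Theorem 3.2 Let the stock price change in the next trading period be $\sigma\tau^{1/2}\xi$. Define
\[ \mathcal{G}_k(x, c) = \left\{ (y, z) \in \mathbb{R} \times L^1(\Omega, \mathbb{R}) \;\middle|\; \mathbb{E}[z(\xi)] + \left(\frac{\eta}{\tau} - \frac{\gamma}{2}\right)(x - y)^2 \le c,\ 0 \le y \le x \right\}. \]
Then for $k \ge 2$,
\[ J_k(x, c) = \min_{(y, z) \in \mathcal{G}_k(x, c)} \left( \mathrm{Var}\left( z(\xi) - \sigma\tau^{1/2}\xi y \right) + \mathbb{E}\left[ J_{k-1}(y, z(\xi)) \right] \right). \]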
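Proof. For given $x \ge 0$ and $c \ge E_{\mathrm{lin}}(x, k)$, let
\[ (X^*, C^*) = \operatorname*{argmin}_{(X, C) \in \mathcal{A}(x, k, c)} \mathrm{Var}(C). \]
That means that $X^* = (x, y, \dots)$ is an optimal strategy for selling $x$ shares in $k$ trading
periods with expected costs not exceeding $c$. By (5) this implies $(X^*, C^*) \in \mathcal{E}(x, k)$ and
further $\mathrm{Var}(C^*) = J_k(x, \mathbb{E}[C^*])$ by (6). Contained in this strategy $X^*$ we identify both
the number of shares to be sold in the first period, $x - y$, and the strategy of selling the
remaining $y$ shares in the remaining $k - 1$ periods. We denote this tail strategy, which
may depend on the outcome of $\xi$, by $(X^*, C^*)_\xi$.
By lemma 3.1, we know that $(X^*, C^*)_\xi \in \mathcal{E}(y, k-1)$ almost surely. Writing $z(\xi)$ for
$\mathbb{E}[C^*_\xi]$, this implies
\[ \mathrm{Var}(C^*_\xi) = J_{k-1}(y, z(\xi)), \]
again by (5) and (6). Also, since minimal expected costs are achieved by the linear trading
strategy and since $(X^*, C^*)_\xi \in \mathcal{E}(y, k-1)$, it must hold that
\[ z(\xi) \ge E_{\mathrm{lin}}(y, k-1). \]
We can then write
\[
\begin{aligned}
\mathbb{E}[C^* \mid \xi] &= z(\xi) + \left(\frac{\eta}{\tau} - \frac{\gamma}{2}\right)(x - y)^2 - \sigma\tau^{1/2}\xi y, \\
\mathrm{Var}(C^* \mid \xi) &= J_{k-1}(y, z(\xi)),
\end{aligned}
\]
and derive
\[
\begin{aligned}
\mathbb{E}[C^*] &= \mathbb{E}[z(\xi)] + \left(\frac{\eta}{\tau} - \frac{\gamma}{2}\right)(x - y)^2, \\
\mathrm{Var}(C^*) &= \mathrm{Var}\left( z(\xi) - \sigma\tau^{1/2}\xi y \right) + \mathbb{E}\left[ J_{k-1}(y, z(\xi)) \right].
\end{aligned}
\]
Since $(X^*, C^*)$ was chosen to be the minimiser of $\mathrm{Var}(C)$ over the set $\mathcal{A}(x, k, c)$, we can
equivalently minimise
\[ \mathrm{Var}(C^*) = \mathrm{Var}\left( z(\xi) - \sigma\tau^{1/2}\xi y \right) + \mathbb{E}\left[ J_{k-1}(y, z(\xi)) \right] \]
over all $(z(\xi), y)$ constrained to
\[
\begin{aligned}
\mathbb{E}[z(\xi)] + \left(\frac{\eta}{\tau} - \frac{\gamma}{2}\right)(x - y)^2 &\le c, \\
0 \le y &\le x, \\
z(\xi) &\ge E_{\mathrm{lin}}(y, k-1).
\end{aligned}
\]
Since $J_{k-1}(y, z(\xi))$ becomes $\infty$ for $z(\xi) < E_{\mathrm{lin}}(y, k-1)$, such a pair can never minimise
the expression above and we can drop the last constraint.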
Using the theorem, we can then describe optimal solutions recursively: We start with the
original problem
\[
V_{\min}(E) = \min_{(X,C) \in \mathcal{A}(X_0,N,E)} \mathrm{Var}(C).
\]
By the theorem, we get that
\[
V_{\min}(E) = J_N(X_0, E) = \min_{(y,z) \in G_N(X_0,E)} \left( \mathrm{Var}\left(z(\xi) - \sigma\tau^{1/2}\xi y\right) + E\left[J_{N-1}(y, z(\xi))\right] \right),
\]
which we can solve if $J_{N-1}(y, z(\xi))$ is known for all $(y, z) \in G_N(X_0, E)$. We can repeat this
minimisation down to $J_1$, where the instantaneous execution strategy is always the optimal
(and only) one. Substituting backwards yields all minimal values and their minimisers.
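To make one step of this recursion concrete, the following is a minimal sketch under strong simplifying assumptions that are not part of the essay: ξ takes only the values +1 and −1 with probability 1/2 each, so that the cost limit z(ξ) reduces to a pair of numbers, and the abbreviations `kappa` (for η/τ − γ/2) and `a` (for στ^{1/2}) as well as all parameter values are hypothetical. With J_1(x, c) = 0 whenever c covers the impact cost and ∞ otherwise, J_2 is obtained by a grid search over y; for fixed y the best pair (z_up, z_down) is chosen in closed form.

```python
import math

def j1(x, c, kappa):
    """Last period: sell everything. The cost is deterministic, so the
    variance is 0 if the budget c covers kappa * x**2, else infeasible."""
    return 0.0 if c >= kappa * x * x - 1e-12 else math.inf

def j2(x, c, kappa, a, grid=2000):
    """Two-period value J_2(x, c) by grid search over y in [0, x].

    Toy assumption: xi = +1 or -1 with probability 1/2 each, so the cost
    limit z(xi) is a pair (z_up, z_down) = (m + d, m - d).
    Returns (minimal variance, y, (z_up, z_down)).
    """
    best = (math.inf, None, None)
    for i in range(grid + 1):
        y = x * i / grid
        # Budget left for E[z] beyond the minimal tail cost kappa * y**2.
        slack = c - kappa * (x - y) ** 2 - kappa * y * y
        if slack < -1e-12:
            continue  # no admissible z for this y
        slack = max(slack, 0.0)
        m = kappa * y * y + slack   # use the whole budget for E[z]
        d = min(a * y, slack)       # tilt z towards the price move
        z_up, z_down = m + d, m - d
        # For the two-point xi, Var(z(xi) - a*xi*y) equals (d - a*y)**2.
        value = (d - a * y) ** 2 + 0.5 * (j1(y, z_up, kappa) + j1(y, z_down, kappa))
        if value < best[0]:
            best = (value, y, (z_up, z_down))
    return best

if __name__ == "__main__":
    for budget in (0.5, 0.6, 1.0):
        print(budget, j2(1.0, budget, kappa=1.0, a=0.5))
```

In this toy setting the minimiser allows a higher cost limit after a price rise (z_up ≥ z_down), which is the mechanism behind the adaptive behaviour discussed in the text. For more periods or a continuous ξ the inner minimisation over z is no longer available in closed form and would need a numerical scheme.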
Let k ≥ 2 and (y, z(ξ)) be the minimiser in
\[
J_k(x, c) = \min_{(y,z) \in G_k(x,c)} \left( \mathrm{Var}\left(z(\xi) - \sigma\tau^{1/2}\xi y\right) + E\left[J_{k-1}(y, z(\xi))\right] \right),
\]
and $\left(X^*_{k-1}(y, z(\xi)), C^*_{k-1}(y, z(\xi))\right)$ the minimiser in
\[
J_{k-1}(y, z(\xi)) = \min_{(X,C) \in \mathcal{A}(y,k-1,z(\xi))} \mathrm{Var}(C).
\]
Then recursively,
\begin{align*}
X^*_k(x, c) &= \left(x, X^*_{k-1}(y, z(\xi))\right), \\
C^*_k(x, c) &= C^*_{k-1}(y, z(\xi)) + \left(\frac{\eta}{\tau} - \frac{\gamma}{2}\right)(x - y)^2 - \sigma\tau^{1/2}\xi y,
\end{align*}
where $X^*_1(x, c) = (x, 0)$ and $C^*_1(x, c) = \max\left\{\left(\frac{\eta}{\tau} - \frac{\gamma}{2}\right)x^2, c\right\}$. Combining the strategies for
all steps yields a (in general previsible) strategy $X^*$ and a cost bound $C^* \ge C(X_0, N, X^*)$
which solve the original problem of minimising $\mathrm{Var}(C)$ with expected costs bounded by
some constant E. Note that, with our cost bound in the last step, we allow ourselves
to give away some money if c is greater than the costs actually incurred, $\left(\frac{\eta}{\tau} - \frac{\gamma}{2}\right)x^2$. As
mentioned before, this is sensible in the case where lower costs increase the variance of
the costs. By construction of our cost bounds we will never have total expected costs
higher than E, even when giving away money in the last step.
Figure 3: Adaptive Trajectories [10, p. 26]
It is not immediately clear that this optimisation process will not yield the same static
execution strategy as proposition 2.1, regardless of the additional information in each
period. Numerical simulations conducted by Lorenz and Almgren in [10], however, show
that this is not the case. In figure 3, you can see the optimal static solution (drawn in
black) for some point on the static efficient frontier. Also you can see the simulations
for optimal adaptive strategies in two rather extreme cases of price movements of the
asset (red and blue). Besides the fact that the adaptive trajectories do not coincide with
the static one, the figure also shows that adaptive strategies are aggressive in-the-money,
which means that we increase our trading speed when the asset price rises and decrease it
when the asset price falls. This is because, when the asset price rises, we can spend the
gains we make on higher impact costs in order to decrease future variance due to both market
volatility and the unexpected gains. If the asset price falls, we decrease future trading
costs by trading more slowly in order to compensate for the losses due to the price fall. This
argument, of course, only makes sense if price movements are uncorrelated, as is
the case for our model, or even negatively correlated. For a positive correlation, it would
be reasonable to keep holdings while prices are in an upwards trend and to get rid of them
when there is a downwards trend.
3.2 CARA Investors
In the previous sections, we have discussed optimal trading strategies with respect to
minimising mean-variance and, in section 3.1, we have seen that permitting adaptive
strategies can strictly improve the result of our optimisation. In this section, we fol-
low Schied, Schoneborn, and Tehranchi in [15] where they consider optimisation under
the utility function u(r) = − exp(−αr) instead, which is called exponential or CARA
(constant absolute risk aversion) utility function. It has the advantage that an optimal
solution is time consistent and does not depend on the investor’s initial wealth. Also for
a CARA utility function, we obtain that optimal trading strategies are deterministic and
cannot, as in the mean-variance case, be improved by allowing information dependent
trading. This will be the main result of this section. We update our setting a little bit
and change to a continuous time model from now on. Also, we consider a multi-asset
market with drift.
We assume that our probability space (Ω,F ,P) is equipped with a filtration (Ft)t≥0 sat-
isfying the usual conditions. For the price process, we consider a Bachelier model which
is the continuous time analogue of the previously considered model. That is, the price
processes are given by
\[
S^i_t = S^i_0 + \sum_{j=1}^{m} \sigma^{ij} B^j_t + b^i t, \qquad i = 1, \ldots, d,
\]
where $S_0 \in \mathbb{R}^d$ is the initial price vector, B is an m-dimensional Brownian motion adapted
to the filtration $(\mathcal{F}_t)_{t \ge 0}$ and starting at $B_0 = 0$, $\sigma \in \mathbb{R}^{d \times m}$ is the volatility matrix, and
$b \in \mathbb{R}^d$ is the drift vector. To rule out arbitrage possibilities in the unperturbed market,
we assume that $b \perp \ker \Sigma$, where $\Sigma := \sigma\sigma^\top$ is the covariance matrix. Otherwise one could
follow a constant strategy $H \in \ker \Sigma = \ker \sigma^\top$ with $b^\top H > 0$. Then
\[
d(H_t \cdot S_t) = H \cdot dS_t = H \cdot (\sigma\, dB_t + b\, dt) = (\sigma^\top H)^\top dB_t + b^\top H\, dt = b^\top H\, dt,
\]
and H would be an arbitrage possibility.
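As a small illustration of why the condition $b \perp \ker \Sigma$ is needed, consider the following hypothetical two-asset market driven by a single Brownian motion. The numbers are chosen purely for illustration and do not come from the essay.

```latex
% Illustrative example (assumed values): d = 2 assets, m = 1 Brownian motion.
\[
\sigma = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad
\Sigma = \sigma\sigma^\top = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, \qquad
\ker \Sigma = \ker \sigma^\top = \operatorname{span}\left\{ \begin{pmatrix} 1 \\ -1 \end{pmatrix} \right\}.
\]
% The condition b \perp \ker\Sigma holds if and only if b^1 = b^2.
% If instead b = (0.1, 0)^\top and H = (1, -1)^\top, then
\[
d(H \cdot S_t) = (\sigma^\top H)^\top dB_t + b^\top H \, dt = 0 \cdot dB_t + 0.1 \, dt ,
\]
% so holding asset 1 long and asset 2 short earns 0.1 per unit time without risk.
```

Both assets carry exactly the same noise here, so any difference in their drifts can be locked in by a long-short position; the orthogonality condition excludes precisely this.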
The trading strategies $(X_t)_{t \in [0,T]}$ are continuous time processes adapted to the filtration
$(\mathcal{F}_t)_{t \ge 0}$ and we assume them to be absolutely continuous. This reflects the fact that we cannot
trade a positive number of shares in an infinitesimal period of time. It assures us that
the derivative $\dot{X}_t$ exists almost everywhere, and is an $L^1$-function. Indeed, if ε > 0
is given, by the absolute continuity of X we find a δ > 0 such that for all intervals
$I_1 = (s_1, t_1), \ldots, I_m = (s_m, t_m)$ with $0 \le s_1 \le t_1 \le s_2 \le t_2 \le \ldots \le s_m \le t_m \le T$ and
$\sum_{k=1}^m |t_k - s_k| < \delta$, it follows that $\sum_{k=1}^m |X_{t_k} - X_{s_k}| < \varepsilon$. But then for each coordinate $X^i$ of
X we have
\[
\varepsilon > \sum_{k=1}^{m} |X_{t_k} - X_{s_k}| \ge \sum_{k=1}^{m} |X^i_{t_k} - X^i_{s_k}|,
\]
and hence $X^i$ is absolutely continuous. Thus, we know that the derivatives $\dot{X}^i$ exist
almost everywhere and are in $L^1 = L^1([0, T], \mathcal{B}([0, T]), \mathrm{Leb})$. The same holds for $\dot{X}$, which
is given by $\dot{X}_t = (\dot{X}^1_t, \ldots, \dot{X}^d_t)^\top$.
In addition, we assume that |Xt(ω)| is bounded for almost all ω ∈ Ω and all t ∈ [0, T ].
This assumption is sensible since we cannot buy or sell short arbitrarily many assets.
We introduce the following classification:
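\begin{align*}
\mathcal{X}_{\mathrm{det}}(T, X_0) &= \left\{ X : [0, T] \to \mathbb{R}^d \text{ absolutely continuous with given } X_0 \text{ and } X_T = 0 \right\}, \\
\mathcal{X}(T, X_0) &= \left\{ (X_t)_{t \in [0,T]} \text{ adapted with } t \mapsto X_t(\omega) \in \mathcal{X}_{\mathrm{det}}(T, X_0) \text{ almost surely and } \sup_{t \in [0,T]} |X_t| \in L^\infty(\mathbb{P}) \right\}.
\end{align*}
$\mathcal{X}_{\mathrm{det}}(T, X_0)$ describes the set of all admissible deterministic strategies which liquidate the
portfolio $X_0$ in time T. It is a proper subset of the set $\mathcal{X}(T, X_0)$, which denotes the set of
all admissible adapted strategies liquidating the portfolio.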
Analogously to the previous sections, we see that the execution costs of a strategy X are
given by
\[
C(X_0, T, X) = -\int_0^T X_t \cdot dS_t + F(X_0, T, X),
\]
where the functional F is given by
\[
F(X_0, T, X) := \int_0^T v_t \cdot \left[\Gamma(X_0 - X_t) + h(v_t)\right] dt = \frac{1}{2} X_0^\top \Gamma X_0 + \int_0^T f(v_t)\, dt.
\]
Here $v_t := -\dot{X}_t$ is the trading speed at time t and $\Gamma \in \mathbb{R}^{d \times d}$ describes the linear
permanent impact. Further, $h : \mathbb{R}^d \to \mathbb{R}^d$ is the temporary impact, and $f(v) := v \cdot h(v)$,
which we assume to be non-negative, strictly convex, to have superlinear growth and to
be continuously differentiable. Also analogously to before, the revenues of the strategy
$X \in \mathcal{X}(T, X_0)$ are
\[
R^X_T = S_0 \cdot X_0 - C(X_0, T, X).
\]
For the CARA utility function u(r) = − exp(−αr), α > 0, we can then formulate the
following theorem from [15, p. 6]. We have largely copied the proof from [15, p. 12f.] but
also filled in some details and adapted it to our case.
Theorem 3.3 We have
\[
\sup_{X \in \mathcal{X}(T,X_0)} E\left[u(R^X_T)\right] = \sup_{X \in \mathcal{X}_{\mathrm{det}}(T,X_0)} E\left[u(R^X_T)\right]. \tag{7}
\]
In particular, when there exists a deterministic strategy $X^*$ that maximises the expected
utility $E[u(R^X_T)]$ within the class $\mathcal{X}_{\mathrm{det}}(T, X_0)$ of deterministic strategies, then $X^*$ also
maximises the expected utility within the class $\mathcal{X}(T, X_0)$ of all strategies.
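Proof. We have to consider the expression
\[
E\left[u(R^X_T)\right] = -e^{-\alpha X_0 \cdot S_0}\, E\left[\exp\left(-\alpha \int_0^T X_t \cdot dS_t + \alpha F(X_0, T, X)\right)\right].
\]
First, we note that for deterministic X it holds that
\begin{align*}
E\left[\exp\left(-\alpha \int_0^T X_t \cdot dS_t\right)\right]
&= E\left[\exp\left(\int_0^T (-\alpha X_t) \cdot (\sigma\, dB_t + b\, dt)\right)\right] \\
&= \exp\left(\frac{1}{2}\int_0^T (-\alpha X_t)^\top \Sigma (-\alpha X_t)\, dt + \int_0^T (-\alpha X_t)^\top b\, dt\right) \tag{8}
\end{align*}
since $\sigma^\top(-\alpha X_t)$ is bounded and therefore
\[
\int_0^T (-\alpha X_t) \cdot \sigma\, dB_t \sim N\left(0, \int_0^T (-\alpha X_t)^\top \Sigma (-\alpha X_t)\, dt\right).
\]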
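Denoting the log-moment generating function of $S_1 - S_0$ by $\Lambda : \mathbb{R}^d \to \mathbb{R}$, we can compute
\[
\Lambda(\theta) = \log E\left[e^{\theta \cdot (S_1 - S_0)}\right] = \log E\left[e^{\theta \cdot (\sigma B_1 + b)}\right] = \log\left(e^{\frac{1}{2}\theta^\top \Sigma \theta} \cdot e^{\theta^\top b}\right) = \frac{1}{2}\theta^\top \Sigma \theta + \theta^\top b.
\]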
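The formula for Λ can be checked numerically. The following sketch does so for a single asset (d = m = 1) by integrating against the standard normal density with a plain trapezoid rule; the function names and the parameter values are hypothetical and chosen only for illustration.

```python
import math

def log_mgf(theta, sigma, b):
    """Lambda(theta) = 0.5 * theta^2 * sigma^2 + theta * b for d = m = 1."""
    return 0.5 * theta ** 2 * sigma ** 2 + theta * b

def mgf_by_quadrature(theta, sigma, b, dt=1.0, half_width=12.0, n=20000):
    """E[exp(theta * (sigma * sqrt(dt) * Z + b * dt))] for Z ~ N(0, 1),
    computed with the composite trapezoid rule on [-half_width, half_width]."""
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n + 1):
        z = -half_width + i * h
        weight = 0.5 if i in (0, n) else 1.0
        increment = sigma * math.sqrt(dt) * z + b * dt
        total += weight * math.exp(theta * increment - 0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)

if __name__ == "__main__":
    theta, sigma, b, dt = -0.8, 0.3, 0.05, 0.5
    print(math.log(mgf_by_quadrature(theta, sigma, b, dt)), dt * log_mgf(theta, sigma, b))
```

The two printed numbers should agree up to quadrature error, reflecting that the price increment over a period of length dt has log-moment generating function dt · Λ(θ).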
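Plugging this into (8) then yields
\[
E\left[\exp\left(-\alpha \int_0^T X_t \cdot dS_t\right)\right] = \exp\left(\int_0^T \Lambda(-\alpha X_t)\, dt\right),
\]
and for the original expression we get
\[
E\left[u(R^X_T)\right] = -\exp\left(-\alpha X_0 \cdot S_0 + \int_0^T \Lambda(-\alpha X_t)\, dt + \alpha F(X_0, T, X)\right),
\]
if $X \in \mathcal{X}_{\mathrm{det}}(T, X_0)$ and $F(X_0, T, X)$ therefore is deterministic.
Now define
\[
M := \inf_{X \in \mathcal{X}_{\mathrm{det}}(T,X_0)} \left(\int_0^T \Lambda(-\alpha X_t)\, dt + \alpha F(X_0, T, X)\right).
\]
If M = −∞, then obviously both sides in (7) equal zero and the statement of the theorem
holds. Suppose now M > −∞ and take ε > 0 and $X^\varepsilon \in \mathcal{X}_{\mathrm{det}}(T, X_0)$ such that
\[
\int_0^T \Lambda(-\alpha X^\varepsilon_t)\, dt + \alpha F(X_0, T, X^\varepsilon) \le M + \varepsilon.
\]
We now want to bound the expression $E\left[\exp\left(-\alpha \int_0^T X_t \cdot dS_t + \alpha F(X_0, T, X)\right)\right]$ for
arbitrary $X \in \mathcal{X}(T, X_0)$ using the deterministic strategy $X^\varepsilon$. In order to do so, we change
to the measure $\mathbb{P}^X$, given by the Radon-Nikodym density
\[
\frac{d\mathbb{P}^X}{d\mathbb{P}} = \exp\left(-\alpha \int_0^T X_t \cdot dS_t - \int_0^T \Lambda(-\alpha X_t)\, dt\right).
\]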
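This expression is always positive and we get that $\mathbb{P}^X \sim \mathbb{P}$. To derive that $\mathbb{P}^X$ is indeed
a probability measure we have to show that
\[
E\left[\exp\left(-\alpha \int_0^T X_t \cdot dS_t - \int_0^T \Lambda(-\alpha X_t)\, dt\right)\right] = 1. \tag{9}
\]
To prove this, we define the simple previsible processes
\[
X^n := \sum_{k=0}^{n-1} X_{t_k} 1_{(t_k, t_{k+1}]},
\]
with $t_k := kT/n$. Further, we define the processes $Z^n$ by
\[
Z^n_t := \exp\left(-\alpha \int_0^t X^n_u \cdot dS_u - \int_0^t \Lambda(-\alpha X^n_u)\, du\right).
\]
For u ≥ v, it holds that $B_u - B_v \sim N(0, (u - v) I_m)$, and so for θ an $\mathcal{F}_v$-measurable random
variable we get
\[
E\left[e^{\theta \cdot (S_u - S_v)} \,\middle|\, \mathcal{F}_v\right] = E\left[e^{\theta \cdot (\sigma(B_u - B_v) + b(u - v))} \,\middle|\, \mathcal{F}_v\right] = e^{\frac{1}{2}(u - v)\theta^\top \Sigma \theta + \theta^\top b (u - v)} = e^{(u - v)\Lambda(\theta)}.
\]
Thus, we compute
\begin{align*}
E[Z^n_T] &= E\left[\exp\left(-\alpha \int_0^T X^n_u \cdot dS_u - \int_0^T \Lambda(-\alpha X^n_u)\, du\right)\right] \\
&= E\left[\exp\left(\sum_{k=0}^{n-1} (-\alpha X_{t_k}) \cdot (S_{t_{k+1}} - S_{t_k}) - \sum_{k=0}^{n-1} \Lambda(-\alpha X_{t_k})(t_{k+1} - t_k)\right)\right] \\
&= E\Bigg[\exp\left(\sum_{k=0}^{n-2} (-\alpha X_{t_k}) \cdot (S_{t_{k+1}} - S_{t_k}) - \sum_{k=0}^{n-2} \Lambda(-\alpha X_{t_k})(t_{k+1} - t_k)\right) \\
&\qquad \times E\left[\exp\left(-\alpha X_{t_{n-1}} \cdot (S_{t_n} - S_{t_{n-1}}) - \Lambda(-\alpha X_{t_{n-1}})(t_n - t_{n-1})\right) \,\middle|\, \mathcal{F}_{t_{n-1}}\right]\Bigg] \\
&= E\left[\exp\left(\sum_{k=0}^{n-2} (-\alpha X_{t_k}) \cdot (S_{t_{k+1}} - S_{t_k}) - \sum_{k=0}^{n-2} \Lambda(-\alpha X_{t_k})(t_{k+1} - t_k)\right)\right],
\end{align*}
and taking conditional expectations with respect to $\mathcal{F}_{t_{n-2}}, \ldots, \mathcal{F}_{t_0}$ we get $E[Z^n_T] = 1$.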
Also, using that the exponential function is continuous and that
\[
-\alpha \int_0^T X^n_t \cdot dS_t - \int_0^T \Lambda(-\alpha X^n_t)\, dt \longrightarrow -\alpha \int_0^T X_t \cdot dS_t - \int_0^T \Lambda(-\alpha X_t)\, dt
\]
in probability by the definition of the integrals, we derive that
\[
Z^n_T \longrightarrow \exp\left(-\alpha \int_0^T X_t \cdot dS_t - \int_0^T \Lambda(-\alpha X_t)\, dt\right)
\]
in probability. In order to derive that $(Z^n_T)_{n \ge 1}$ is uniformly integrable, it is now sufficient
to show that it is bounded in $L^2$. But we can write $E[(Z^n_T)^2]$ as
\[
E\left[(Z^n_T)^2\right] = E\left[\exp\left(-\int_0^T 2\alpha X^n_t \cdot dS_t - \int_0^T \Lambda(-2\alpha X^n_t)\, dt\right) Y^n\right], \tag{10}
\]
where we define the sequence $(Y^n)_{n \ge 1}$ by
\[
Y^n := \exp\left(\int_0^T \left(\Lambda(-2\alpha X^n_t) - 2\Lambda(-\alpha X^n_t)\right) dt\right).
\]
Using that $\Lambda(\theta) = \frac{1}{2}\theta^\top \Sigma \theta + \theta^\top b$ is continuous we get that
\[
\sup_{\theta \in B(0, 2\alpha C)} |\Lambda(\theta)| =: m < \infty,
\]
where C is the bound we assumed on $|X_t(\omega)|$ for almost all ω. Thus, it holds almost
surely that
\[
Y^n \le \exp\left(\int_0^T (m + 2m)\, dt\right) = e^{3mT} =: K.
\]
Iteratively taking conditional expectations in (10), we also get $E[(Z^n_T)^2] \le K$. So $(Z^n_T)_{n \ge 1}$
is bounded in $L^2$ and hence uniformly integrable. This, however, shows (9), since on the
one hand $E[Z^n_T] = 1$ for all n and on the other hand by uniform integrability
\[
E[Z^n_T] \longrightarrow E\left[\exp\left(-\alpha \int_0^T X_t \cdot dS_t - \int_0^T \Lambda(-\alpha X_t)\, dt\right)\right].
\]
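Now we can conduct the measure change and derive that
\begin{align*}
E\left[\exp\left(-\alpha \int_0^T X_t \cdot dS_t + \alpha F(X_0, T, X)\right)\right]
&= E\left[\frac{d\mathbb{P}^X}{d\mathbb{P}} \exp\left(\int_0^T \Lambda(-\alpha X_t)\, dt + \alpha F(X_0, T, X)\right)\right] \\
&= E^X\left[\exp\left(\int_0^T \Lambda(-\alpha X_t)\, dt + \alpha F(X_0, T, X)\right)\right] \\
&\ge E^X\left[\exp(M)\right] \\
&\ge e^{-\varepsilon} E^X\left[\exp\left(\int_0^T \Lambda(-\alpha X^\varepsilon_t)\, dt + \alpha F(X_0, T, X^\varepsilon)\right)\right] \\
&= e^{-\varepsilon} E^X\left[-e^{\alpha X_0 \cdot S_0} E\left[u(R^{X^\varepsilon}_T)\right]\right] \\
&= -e^{-\varepsilon} e^{\alpha X_0 \cdot S_0} E\left[u(R^{X^\varepsilon}_T)\right],
\end{align*}
where the first inequality holds since $\mathbb{P}^X \sim \mathbb{P}$ and for $\mathbb{P}$-almost all ω ∈ Ω we have
$X(\omega) \in \mathcal{X}_{\mathrm{det}}(T, X_0)$, which implies that $\mathbb{P}^X$-almost surely
\[
\int_0^T \Lambda(-\alpha X_t)\, dt + \alpha F(X_0, T, X) \ge M.
\]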
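Then, we further derive
\begin{align*}
\sup_{X \in \mathcal{X}(T,X_0)} E\left[u(R^X_T)\right]
&= \sup_{X \in \mathcal{X}(T,X_0)} -e^{-\alpha X_0 \cdot S_0} E\left[\exp\left(-\alpha \int_0^T X_t \cdot dS_t + \alpha F(X_0, T, X)\right)\right] \\
&\le e^{-\varepsilon} E\left[u(R^{X^\varepsilon}_T)\right] \\
&\le e^{-\varepsilon} \sup_{X \in \mathcal{X}_{\mathrm{det}}(T,X_0)} E\left[u(R^X_T)\right],
\end{align*}
and by sending ε ↘ 0 we get the inequality "≤" in (7). The reverse inequality is immediate since
$\mathcal{X}_{\mathrm{det}}(T, X_0) \subset \mathcal{X}(T, X_0)$, which gives the result.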
Remark 3.4 The proof does not depend on our specific choices for the price process
and the cost functional. It still holds if we consider the price process $(S_t)_{t \ge 0}$ to be a
d-dimensional Lévy process for the filtration $(\mathcal{F}_t)_{t \ge 0}$ which has all exponential moments
$E[e^{\lambda \cdot S_t}] < \infty$. Also, we can replace our specific choice for the functional $F(X_0, T, X)$ in
the continuous Almgren model by an arbitrary functional $F : \mathcal{X}_{\mathrm{det}}(T, X_0) \to \mathbb{R} \cup \{\infty\}$
which yields the execution costs for each X(ω). To be sure that execution costs are not
infinite for all strategies, we would assume that $F(X^{\mathrm{lin}}) < \infty$, where $X^{\mathrm{lin}}$ is the strategy
of continuously trading at the same speed.
For our model, we can make use of the result to prove the following even stronger theorem
given in [15, p. 9]:
Theorem 3.5 For a CARA utility function, $u(x) = -e^{-\alpha x}$, α > 0, there exists a
$\mathbb{P}$-almost surely unique optimal strategy $X^* \in \mathcal{X}(T, X_0)$. This strategy $X^*$ is a
deterministic function of time.
Before we can prove theorem 3.5, we have to make some preparatory observations which
will show us how to approach the problem.
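Remark 3.6 First, note that if in (7) a maximiser exists, it is unique since $\mathcal{X}(T, X_0)$
is a convex set and we can check that $E[u(R^X_T)]$ as a function of X is strictly concave:
Let μ ∈ (0, 1) and $X \ne Y \in \mathcal{X}(T, X_0)$, then with $v^X_t = -\dot{X}_t$ and $v^Y_t = -\dot{Y}_t$ we have
\begin{align*}
R^{\mu X + (1-\mu)Y}_T &= S_0 \cdot X_0 + \int_0^T (\mu X_t + (1 - \mu) Y_t) \cdot dS_t - \frac{1}{2} X_0^\top \Gamma X_0 - \int_0^T f\left(\mu v^X_t + (1 - \mu) v^Y_t\right) dt \\
&> S_0 \cdot X_0 + \mu \int_0^T X_t \cdot dS_t + (1 - \mu) \int_0^T Y_t \cdot dS_t - \frac{1}{2} X_0^\top \Gamma X_0 - \mu \int_0^T f(v^X_t)\, dt - (1 - \mu) \int_0^T f(v^Y_t)\, dt \\
&= \mu R^X_T + (1 - \mu) R^Y_T,
\end{align*}
since f is strictly convex. As a utility function, u is strictly increasing and concave.
Therefore, we conclude
\[
E\left[u\left(R^{\mu X + (1-\mu)Y}_T\right)\right] > E\left[u\left(\mu R^X_T + (1 - \mu) R^Y_T\right)\right] \ge \mu E\left[u(R^X_T)\right] + (1 - \mu) E\left[u(R^Y_T)\right].
\]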
In order to maximise $E[u(R^X_T)]$ or equivalently to minimise $E[\exp(-\alpha R^X_T)]$, we now
define the value function of the problem by
\begin{align*}
V(T, X_0, R_0) &:= \inf_{X \in \mathcal{X}(T,X_0)} E\left[e^{-\alpha R^X_T}\right] \tag{11} \\
&= \inf_{X \in \mathcal{X}(T,X_0)} E\left[\exp\left(-\alpha R_0 - \alpha \int_0^T X_t \cdot dS_t + \alpha \int_0^T f(v_t)\, dt\right)\right],
\end{align*}
where $R_0 := S_0 \cdot X_0 - \frac{1}{2} X_0^\top \Gamma X_0$. More generally, we set $R_t := R_0 + \int_0^t X_u \cdot dS_u - \int_0^t f(v_u)\, du$,
which can be understood as the revenues we have secured until time t. This will later
allow us to evaluate V at different times t. We can rewrite expression (11) using theorem
3.3, the normal distribution of $C(X_0, T, X)$ if X is deterministic, and the fact that the
exponential function is both continuous and monotonically increasing:
\begin{align*}
V(T, X_0, R_0) &= \inf_{X \in \mathcal{X}(T,X_0)} E\left[e^{-\alpha R^X_T}\right] = \inf_{X \in \mathcal{X}_{\mathrm{det}}(T,X_0)} E\left[e^{-\alpha R^X_T}\right] \\
&= \inf_{X \in \mathcal{X}_{\mathrm{det}}(T,X_0)} E\left[e^{-\alpha S_0 \cdot X_0 + \alpha C(X_0, T, X)}\right] \\
&= \inf_{X \in \mathcal{X}_{\mathrm{det}}(T,X_0)} e^{-\alpha S_0 \cdot X_0} \cdot e^{\alpha E[C(X_0, T, X)] + \frac{\alpha^2}{2} \mathrm{Var}(C(X_0, T, X))} \\
&= \exp\left[-\alpha S_0 \cdot X_0 + \inf_{X \in \mathcal{X}_{\mathrm{det}}(T,X_0)} \left(\alpha E[C(X_0, T, X)] + \frac{\alpha^2}{2} \mathrm{Var}(C(X_0, T, X))\right)\right].
\end{align*}
So as noticed in [15, p. 10], the problem we actually have to solve is the mean-variance
minimisation with level of risk aversion λ = α/2:
\[
\inf_{X \in \mathcal{X}_{\mathrm{det}}(T,X_0)} \left(E[C(X_0, T, X)] + \frac{\alpha}{2} \mathrm{Var}(C(X_0, T, X))\right).
\]
This observation strongly simplifies the search for an optimal trading strategy. Note,
however, that we really have to restrict ourselves to deterministic strategies this time,
and we cannot improve the result by allowing adaptive strategies as in section 3.1, since
in this case $C(X_0, T, X)$ is not normally distributed any more and the above equation
fails. For
\[
C(X_0, T, X) = -\int_0^T X_t \cdot dS_t + \frac{1}{2} X_0^\top \Gamma X_0 + \int_0^T f(v_t)\, dt,
\]
we can now calculate the expectation as
\[
E[C(X_0, T, X)] = \frac{1}{2} X_0^\top \Gamma X_0 + \int_0^T \left(-b^\top X_t + f(v_t)\right) dt,
\]
and the variance as
\[
\mathrm{Var}(C(X_0, T, X)) = \int_0^T X_t^\top \Sigma X_t\, dt.
\]
Omitting the constant term $\frac{1}{2} X_0^\top \Gamma X_0$, we then obtain the Lagrangian problem
\[
\inf_X \int_0^T L(X_t, \dot{X}_t)\, dt
\]
where we minimise over absolutely continuous curves X starting at $X_0$ and ending at
$X_T = 0$, and the Lagrangian L is given by
\[
L(q, p) := \frac{\alpha}{2} q^\top \Sigma q - b^\top q + f(-p) \tag{12}
\]
for $q, p \in \mathbb{R}^d$. The Hamiltonian corresponding to L is
\[
H(q, p) := -\frac{\alpha}{2} q^\top \Sigma q + b^\top q + f^*(-p), \tag{13}
\]
where $f^*(z) = \sup_{x \in \mathbb{R}^d} (x^\top z - f(x))$ is the Fenchel-Legendre transform of f for $z \in \mathbb{R}^d$.
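For orientation, the Fenchel-Legendre transform can be written out explicitly in the simplest case. The following worked example assumes a quadratic cost $f(x) = \eta |x|^2$ with a constant η > 0, which corresponds to a linear temporary impact $h(v) = \eta v$; this specific choice is an illustration and is not required by the general setting.

```latex
% Worked example under the assumption f(x) = \eta |x|^2, \eta > 0.
\[
f^*(z) = \sup_{x \in \mathbb{R}^d} \left( x^\top z - \eta |x|^2 \right).
\]
% The expression in brackets is strictly concave in x; setting its gradient
% z - 2\eta x to zero gives the maximiser x = z / (2\eta), so that
\[
f^*(z) = \frac{|z|^2}{2\eta} - \frac{|z|^2}{4\eta} = \frac{|z|^2}{4\eta},
\qquad
H(q, p) = -\frac{\alpha}{2} q^\top \Sigma q + b^\top q + \frac{|p|^2}{4\eta}.
\]
```

In this case $f^*$ is again strictly convex, continuously differentiable and of superlinear growth, in line with the properties used later in the proof of theorem 3.5.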
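Altogether, we obtain
\begin{align*}
V(T, X_0, R_0) &= \exp\left[-\alpha S_0 \cdot X_0 + \inf_{X \in \mathcal{X}_{\mathrm{det}}(T,X_0)} \left(\frac{\alpha}{2} X_0^\top \Gamma X_0 + \alpha \int_0^T \left(-b^\top X_t + f(v_t)\right) dt + \frac{\alpha^2}{2} \int_0^T X_t^\top \Sigma X_t\, dt\right)\right] \\
&= \exp\left(-\alpha R_0 + \alpha \inf_{X \in \mathcal{X}_{\mathrm{det}}(T,X_0)} \int_0^T L(X_t, \dot{X}_t)\, dt\right). \tag{14}
\end{align*}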
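To see what the minimisation in (14) looks like in a concrete case, the sketch below treats a single asset without drift and with quadratic cost f(v) = ηv²; these simplifications, the function names and the parameter values are assumptions made for illustration only. The Euler-Lagrange equation of L then reads Ẍ = κ²X with κ² = ασ²/(2η), whose solution with X_0 = x0 and X_T = 0 is the sinh-shaped trajectory below. The code compares its objective with that of linear liquidation and with the closed-form value ηκx0² coth(κT).

```python
import math

def sinh_strategy(x0, alpha, sigma, eta, T):
    """Candidate minimiser X_t = x0 * sinh(kappa (T - t)) / sinh(kappa T)."""
    kappa = math.sqrt(alpha * sigma ** 2 / (2.0 * eta))

    def X(t):
        return x0 * math.sinh(kappa * (T - t)) / math.sinh(kappa * T)

    def dX(t):
        return -x0 * kappa * math.cosh(kappa * (T - t)) / math.sinh(kappa * T)

    return X, dX, kappa

def objective(X, dX, alpha, sigma, eta, T, n=20000):
    """Midpoint-rule value of the integral of alpha/2 sigma^2 X^2 + eta dX^2
    over [0, T] (one asset, zero drift, f(v) = eta v^2)."""
    h = T / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += 0.5 * alpha * sigma ** 2 * X(t) ** 2 + eta * dX(t) ** 2
    return h * total

# Hypothetical parameter values.
x0, alpha, sigma, eta, T = 1.0, 2.0, 1.0, 0.5, 1.0
X, dX, kappa = sinh_strategy(x0, alpha, sigma, eta, T)
j_sinh = objective(X, dX, alpha, sigma, eta, T)
j_lin = objective(lambda t: x0 * (1.0 - t / T), lambda t: -x0 / T, alpha, sigma, eta, T)
closed_form = eta * kappa * x0 ** 2 / math.tanh(kappa * T)

if __name__ == "__main__":
    print(j_sinh, closed_form, j_lin)
```

The sinh trajectory sells faster at the beginning than the linear one, trading higher impact costs for lower exposure to price risk; a larger α makes this front-loading more pronounced.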
In order to prove theorem 3.5, we still have to show that there is, indeed, a strategy
X ∈ Xdet(T,X0) minimising the integral in (14). To this end, consider the following
theorem which combines the results from Lemma 7.3 in [5, p. 73] and Theorem 7.1 in
[5, p. 74] for our particular case, but cannot be proven in the scope of this essay:
Theorem 3.7 For a function $L : \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$, consider the variational problem
\[
\inf \int_0^T L(Y_t, \dot{Y}_t)\, dt, \tag{15}
\]
where the infimum is taken over all Lipschitz continuous curves $(Y_t)_{0 \le t \le T}$ such that $Y_0 = 0$
and $Y_T = X_0$. Assume that the Hamiltonian H(q, p) corresponding to L(q, p) satisfies the
following conditions.
(H1) H(q, p) is strictly convex in p.
(H2) H is continuously differentiable.
(H3) H(q, p)/|p| → ∞ as |p| → ∞ for each q.
(H4) |∇qH| ≤ c1(p · ∇pH −H) + c2 for some constants c1, c2 with c1 ≥ 0.
(H5) p · ∇pH −H ≥ c3 for some constant c3.
(H6) H, |∇pH| ≤ g(|p|) for some non-negative, increasing function g : [0,∞)→ R.
Then (15) has an extremal. Furthermore,
\[
u(T, X_0) := \min \int_0^T L(Y_t, \dot{Y}_t)\, dt,
\]
with the minimum taken over all Lipschitz curves with $Y_0 = 0$ and $Y_T = X_0$, solves the
Hamilton-Jacobi equation
\[
\frac{\partial}{\partial T} u + H(X_0, \nabla_{X_0} u) = 0
\]
with initial condition u(0, 0) = 0.
N.B. The difference between theorem 3.7 and the original theorem and lemma in
[5, p. 73f.] is that, as suggested in [15, p. 14], we have already chosen the boundary
set $B = \{(0, 0)\} \subset \mathbb{R} \times \mathbb{R}^d$ and the boundary data f ∈ C(B) as f(0, 0) = 0 such that
all conditions required of f in the original theorem are vacuously true. Note that this
boundary data f has nothing to do with our function f(v) = v · h(v). Also, in theorem
3.7 we restrict ourselves to time-homogeneous Lagrangians respectively Hamiltonians, and
functions of time t in the original theorem become constants in our case.
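Remark 3.8 While in our original problem the strategies X start at $X_0$ and end at
$X_T = 0$, in theorem 3.7 they have to start at 0 and end at $X_0$. Simply by considering
$Y_t := X_{T-t}$, however, we are in the right position. We just have to change the Lagrangian
and Hamiltonian slightly [15, p. 14]:
\begin{align*}
\tilde{L}(q, p) &:= L(q, -p) = \frac{\alpha}{2} q^\top \Sigma q - b^\top q + f(p), \\
\tilde{H}(q, p) &:= H(q, -p) = -\frac{\alpha}{2} q^\top \Sigma q + b^\top q + f^*(p).
\end{align*}
This yields:
\begin{align*}
\int_0^T \tilde{L}(Y_t, \dot{Y}_t)\, dt &= \int_0^T L\left(X_{T-t}, -\tfrac{d}{dt} X_{T-t}\right) dt = \int_0^T L(X_{T-t}, \dot{X}_{T-t})\, dt \\
&= \int_T^0 -L(X_t, \dot{X}_t)\, dt = \int_0^T L(X_t, \dot{X}_t)\, dt.
\end{align*}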
Theorem 3.7 is now almost applicable to our case. However, while in (14) we take the
infimum over the set of absolutely continuous curves, in theorem 3.7 we minimise over the
strictly smaller set of Lipschitz continuous curves. The following lemma shows that we
can approximate each absolutely continuous curve Y by a sequence $(Y^n)_{n \ge 1}$ of Lipschitz
continuous curves such that $\int_0^T \tilde{L}(Y^n_t, \dot{Y}^n_t)\, dt \to \int_0^T \tilde{L}(Y_t, \dot{Y}_t)\, dt$ as n → ∞. In particular,
we find Lipschitz continuous curves $Y^n$ such that $\int_0^T \tilde{L}(Y^n_t, \dot{Y}^n_t)\, dt \to \inf \int_0^T \tilde{L}(Y_t, \dot{Y}_t)\, dt$
and there is no loss in restricting ourselves to this smaller set. Note that it is enough to
show the lemma under the assumption that $\int_0^T \tilde{L}(Y_t, \dot{Y}_t)\, dt < \infty$. Otherwise, we can either
leave out Y in the sequence which approaches the infimum, or if the infimum is infinite
itself, we can simply choose an arbitrary sequence of Lipschitz continuous curves which
then also has to approach infinity. While this fact was already observed in [15, p. 15], its
justification was left out there. Here, however, we want to give a detailed proof.
Lemma 3.9 For each absolutely continuous curve $Y : [0, T] \to \mathbb{R}^d$ with $Y_0 = 0$, $Y_T = X_0$,
and $\int_0^T \tilde{L}(Y_t, \dot{Y}_t)\, dt < \infty$, we can find a sequence of Lipschitz curves $Y^n : [0, T] \to \mathbb{R}^d$
such that $Y^n_0 = 0$, $Y^n_T = X_0$, and $\int_0^T \tilde{L}(Y^n_t, \dot{Y}^n_t)\, dt \to \int_0^T \tilde{L}(Y_t, \dot{Y}_t)\, dt$, where $\tilde{L}$ is as defined
above.
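Proof. Since Y is absolutely continuous, as discussed before, we know that the derivative
$\dot{Y} = (\dot{Y}^1, \ldots, \dot{Y}^d)^\top$ exists almost everywhere and is in $L^1 = L^1([0, T], \mathcal{B}([0, T]), \mathrm{Leb})$.
In particular, $\dot{Y}$ is a measurable function and hence it follows that
$\{0 \le t \le T : |\dot{Y}_t| \le R\} \in \mathcal{B}([0, T])$ for all R ≥ 0. Since $\dot{Y} \in L^1$, we find an R ≥ 0 such
that $B := \{0 \le t \le T : |\dot{Y}_t| \le R\} \in \mathcal{B}([0, T])$ has measure Leb(B) > 0. Define now
\[
Z^n_t := \dot{Y}_t \cdot \frac{|\dot{Y}_t| \wedge n}{|\dot{Y}_t|} + \frac{1_B(t)}{\mathrm{Leb}(B)} \int_0^T \dot{Y}_s \cdot \frac{(|\dot{Y}_s| - n)^+}{|\dot{Y}_s|}\, ds.
\]
Here and in the following, we always set $\dot{Y}_t / |\dot{Y}_t| = 0$ if $\dot{Y}_t = 0$. Since $\dot{Y} \in L^1$, we know
that
\[
\left| \frac{1_B(t)}{\mathrm{Leb}(B)} \int_0^T \dot{Y}_s \cdot \frac{(|\dot{Y}_s| - n)^+}{|\dot{Y}_s|}\, ds \right|
\le \frac{1}{\mathrm{Leb}(B)} \int_0^T |\dot{Y}_s| \cdot \frac{(|\dot{Y}_s| - n)^+}{|\dot{Y}_s|}\, ds
\le \frac{1}{\mathrm{Leb}(B)} \int_0^T |\dot{Y}_s|\, ds =: \beta < \infty,
\]
and hence
\[
|Z^n_t| \le |\dot{Y}_t| \cdot \frac{|\dot{Y}_t| \wedge n}{|\dot{Y}_t|} + \beta \le n + \beta.
\]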
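Now, define $Y^n_t := \int_0^t Z^n_s\, ds$. We get that
\[
|Y^n_t - Y^n_s| = \left| \int_s^t Z^n_u\, du \right| \le \left| \int_s^t |Z^n_u|\, du \right| \le \left| \int_s^t (n + \beta)\, du \right| = (n + \beta)|t - s|,
\]
and hence $Y^n$ is a sequence of Lipschitz continuous curves. Also, we have $Y^n_0 = 0$ and
\[
Y^n_T = \int_0^T Z^n_t\, dt = \int_0^T \dot{Y}_t \cdot \frac{|\dot{Y}_t| \wedge n}{|\dot{Y}_t|}\, dt + \frac{\mathrm{Leb}(B)}{\mathrm{Leb}(B)} \int_0^T \dot{Y}_t \cdot \frac{(|\dot{Y}_t| - n)^+}{|\dot{Y}_t|}\, dt = \int_0^T \dot{Y}_t\, dt = X_0,
\]
for all n.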
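The construction can be illustrated on a simple one-dimensional curve that is absolutely continuous but not Lipschitz. The sketch below uses the hypothetical example Y_t = √t on [0, 1] with R = 1, so that B = [1/4, 1]; the speed is capped at n and the mass removed near t = 0 is redistributed uniformly over B. All names and values are chosen for illustration only.

```python
import math

LEB_B = 0.75          # B = {t : |dY/dt| <= 1} = [1/4, 1] for Y_t = sqrt(t), T = 1
BETA = 1.0 / LEB_B    # (1 / Leb(B)) * integral of |dY/dt| over [0, 1], which equals Y_1 = 1

def y_dot(t):
    """Derivative of Y_t = sqrt(t); unbounded as t -> 0."""
    return 0.5 / math.sqrt(t)

def excess(n):
    """Integral over [0, 1] of (dY/dt - n)^+, in closed form for n >= 1."""
    return 1.0 / (4.0 * n)

def z_n(t, n):
    """Truncated speed plus the correction spread uniformly over B."""
    correction = excess(n) / LEB_B if t >= 0.25 else 0.0
    return min(y_dot(t), n) + correction

def y_n(t, n, steps=20000):
    """Y^n_t as the integral of Z^n over [0, t], midpoint rule."""
    h = t / steps
    return h * sum(z_n((i + 0.5) * h, n) for i in range(steps))

if __name__ == "__main__":
    for n in (5, 50):
        print(n, y_n(1.0, n), abs(y_n(0.5, n) - math.sqrt(0.5)))
```

Each approximating curve still ends at Y_1 = 1, has speed bounded by n + β, and approaches √t pointwise as n grows, which mirrors the three properties established in the proof.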
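It remains to show that $\int_0^T \tilde{L}(Y^n_t, Z^n_t)\, dt \to \int_0^T \tilde{L}(Y_t, \dot{Y}_t)\, dt$ as n → ∞:
First note that $Z^n_t \to \dot{Y}_t$ as n → ∞ for almost all t ∈ [0, T], where we use that $\dot{Y} \in L^1$
and hence
\[
\int_0^T \dot{Y}_t \cdot \frac{(|\dot{Y}_t| - n)^+}{|\dot{Y}_t|}\, dt \to 0, \quad \text{as } n \to \infty.
\]
Now, we want to use the dominated convergence theorem for the ith component of $Y^n_t$,
\[
Y^{i,n}_t = \int_0^t Z^{i,n}_s\, ds = \int_0^T Z^{i,n}_s 1_{[0,t]}(s)\, ds,
\]
where $Z^{i,n}_s$ is the ith component of $Z^n_s$. But since on B it holds that $R \ge |\dot{Y}_t| \ge |\dot{Y}^i_t|$, we get
that
\[
\left| Z^{i,n}_s 1_{[0,t]}(s) \right| \le \left| Z^{i,n}_s \right| \le |Z^n_s| \le 1_{B^c}(s) |\dot{Y}_s| + 1_B(s)(R + \beta),
\]
which is an integrable function of s since $\dot{Y} \in L^1$. The dominated convergence theorem
then yields
\[
Y^{i,n}_t = \int_0^T Z^{i,n}_s 1_{[0,t]}(s)\, ds \to \int_0^T \dot{Y}^i_s 1_{[0,t]}(s)\, ds = \int_0^t \dot{Y}^i_s\, ds = Y^i_t, \quad \text{as } n \to \infty,
\]
and hence $Y^n_t \to Y_t$ for all t. Since the function $q \mapsto \frac{\alpha}{2} q^\top \Sigma q - b^\top q$ is continuous, we then
have
\[
\frac{\alpha}{2} (Y^n_t)^\top \Sigma Y^n_t - b^\top Y^n_t \to \frac{\alpha}{2} (Y_t)^\top \Sigma Y_t - b^\top Y_t, \quad \text{as } n \to \infty,
\]
for all t ∈ [0, T]. Further, since $\dot{Y} \in L^1$, we have
\[
|Y^n_t| = \left| \int_0^t Z^n_s\, ds \right| \le \int_0^t |Z^n_s|\, ds \le \int_0^T |Z^n_s|\, ds \le \int_0^T \left( 1_{B^c}(s) |\dot{Y}_s| + 1_B(s)(R + \beta) \right) ds =: \gamma < \infty,
\]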
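and hence again with continuity of $q \mapsto \frac{\alpha}{2} q^\top \Sigma q - b^\top q$,
\[
\left| \frac{\alpha}{2} (Y^n_t)^\top \Sigma Y^n_t - b^\top Y^n_t \right| \le \sup_{z \in \mathbb{R}^d : |z| \le \gamma} \left| \frac{\alpha}{2} z^\top \Sigma z - b^\top z \right| < \infty.
\]
By the dominated convergence theorem we then have
\[
\int_0^T \left( \frac{\alpha}{2} (Y^n_t)^\top \Sigma Y^n_t - b^\top Y^n_t \right) dt \to \int_0^T \left( \frac{\alpha}{2} (Y_t)^\top \Sigma Y_t - b^\top Y_t \right) dt.
\]
Now it only remains to show that $\int_0^T f(Z^n_t)\, dt \to \int_0^T f(\dot{Y}_t)\, dt$:
Since f is continuous, we know that $f(Z^n_t) \to f(\dot{Y}_t)$ as n → ∞ for almost all t. Again,
we want to use dominated convergence and since f is non-negative we see that
\[
f(Z^n_t) = 1_{B^c}(t) f(Z^n_t) + 1_B(t) f(Z^n_t) \le f\left( \dot{Y}_t \cdot \frac{|\dot{Y}_t| \wedge n}{|\dot{Y}_t|} \right) + \max_{z \in \mathbb{R}^d : |z| \le R + \beta} f(z).
\]
The maximum is taken over a compact set and so the second summand is only a constant
for the continuous function f. To deal with the first summand, we note that f is assumed
strictly convex and non-negative, and f(0) = 0. Since for $|\dot{Y}_t| > 0$, $(|\dot{Y}_t| \wedge n)/|\dot{Y}_t| \in [0, 1]$,
we then get
\[
f\left( \dot{Y}_t \cdot \frac{|\dot{Y}_t| \wedge n}{|\dot{Y}_t|} \right) \le \frac{|\dot{Y}_t| \wedge n}{|\dot{Y}_t|} \cdot f(\dot{Y}_t) + \left( 1 - \frac{|\dot{Y}_t| \wedge n}{|\dot{Y}_t|} \right) f(0) \le f(\dot{Y}_t).
\]
For $\dot{Y}_t = 0$ the statement trivially holds. Thus,
\[
f(Z^n_t) \le f(\dot{Y}_t) + \text{constant},
\]
which is in $L^1$ by the assumption that $\int_0^T \tilde{L}(Y_t, \dot{Y}_t)\, dt < \infty$. So we can apply the dominated
convergence theorem once more and get
\[
\int_0^T f(Z^n_t)\, dt \to \int_0^T f(\dot{Y}_t)\, dt,
\]
which finishes the proof.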
We can now prove theorem 3.5. Note that it only remains to check conditions (H1)-(H6)
from theorem 3.7. This will yield the existence of a minimiser in (14) which then is
deterministic and unique as remarked before. In the proof of conditions (H1)-(H6), we
follow the idea in [15, p. 15]. Again we have added a couple of details to the original
proof.
Proof. First, we note that $f^*(p)$ is a strictly convex, continuously differentiable and
superlinearly growing function, since f was strictly convex, continuously differentiable and
superlinearly growing. See Theorem 26.6 in [14, p. 259] for instance, where we use the
equivalence of co-finiteness and superlinear growth in our case, and that a differentiable
convex function on the open set $\mathbb{R}^d$ is always continuously differentiable. Then, since in
\[
\tilde{H}(q, p) = -\frac{\alpha}{2} q^\top \Sigma q + b^\top q + f^*(p)
\]
the only dependence on p comes from $f^*(p)$, (H1) and (H3) are satisfied and $\tilde{H}$ is
continuously differentiable with respect to p. By computing
\[
\nabla_q \tilde{H}(q, p) = -\alpha \Sigma q + b
\]
we see that $\tilde{H}$ is also continuously differentiable with respect to q, and hence it is altogether
continuously differentiable and (H2) holds.
Theorem 26.6 in [14, p. 259] also gives us the equation
\[
f^*(p) = p \cdot (\nabla f)^{-1}(p) - f\left((\nabla f)^{-1}(p)\right).
\]
Since f is continuous on the closed set $\mathbb{R}^d$, it is in particular a closed convex function and
we can also apply Theorem 26.5 in [14, p. 258] which gives us the equation
\[
(\nabla f^*)(p) = (\nabla f)^{-1}(p).
\]
Thus, we get
\begin{align*}
p \cdot \nabla_p \tilde{H}(q, p) - \tilde{H}(q, p) &= p \cdot (\nabla f^*)(p) - f^*(p) + \frac{\alpha}{2} q^\top \Sigma q - b^\top q \\
&= p \cdot (\nabla f^*)(p) - \left[ p \cdot (\nabla f^*)(p) - f((\nabla f^*)(p)) \right] + \frac{\alpha}{2} q^\top \Sigma q - b^\top q \\
&= f((\nabla f^*)(p)) + \frac{\alpha}{2} q^\top \Sigma q - b^\top q \\
&\ge \frac{\alpha}{2} q^\top \Sigma q - b^\top q \tag{16} \\
&\ge \frac{\alpha}{4} q^\top \Sigma q - b^\top q, \tag{17}
\end{align*}
where we used f ≥ 0 and $\frac{\alpha}{4} q^\top \Sigma q \ge 0$.
(H5) says that $p \cdot \nabla_p \tilde{H} - \tilde{H}$ is bounded below by a constant. Therefore, we consider the
expression
\[
\frac{\alpha}{4} q^\top \Sigma q - b^\top q
\]
and show that it has a global minimum (not necessarily attained at a single point but
possibly on an affine subspace of $\mathbb{R}^d$). Since α > 0 and $\Sigma = \sigma\sigma^\top$ is positive semidefinite,
this is the case if its gradient becomes zero for some $q \in \mathbb{R}^d$. But we can compute
\[
\nabla_q \left( \frac{\alpha}{4} q^\top \Sigma q - b^\top q \right) = \frac{\alpha}{2} \Sigma q - b,
\]
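and so we have to solve the linear system
\[
\frac{\alpha}{2} \Sigma q = b.
\]
Since $\Sigma = \sigma\sigma^\top$ is a symmetric, positive semidefinite matrix, we find an orthonormal
basis $B = (v_1, v_2, \ldots, v_d)$ of eigenvectors of Σ with corresponding eigenvalues $e_i \ge 0$,
i = 1, . . . , d. With respect to this basis the linear system becomes
\[
\frac{\alpha}{2}
\begin{pmatrix} e_1 & & & \\ & e_2 & & \\ & & \ddots & \\ & & & e_d \end{pmatrix}
\begin{pmatrix} q_1 \\ q_2 \\ \vdots \\ q_d \end{pmatrix}
=
\begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_d \end{pmatrix}
\]
and we solve it by setting
\[
q_i := \begin{cases} \dfrac{2 b_i}{\alpha e_i}, & \text{if } e_i \ne 0, \\ \text{arbitrary (e.g. 0)}, & \text{if } e_i = 0, \end{cases} \qquad i = 1, \ldots, d.
\]
At this point we have to recall that we assumed $b \perp \ker \Sigma$ in the beginning. So we have
that $b_i = 0$ whenever $e_i = 0$ since this means that $v_i \in \ker \Sigma$. Hence, q as constructed
above, indeed, solves the linear system. Defining $c_3$ as the minimum, we then have
$p \cdot \nabla_p \tilde{H} - \tilde{H} \ge c_3$ and this is (H5).
Using the same inequality in (16), we also get
\[
p \cdot \nabla_p \tilde{H}(q, p) - \tilde{H}(q, p) \ge \frac{\alpha}{4} q^\top \Sigma q + c_3. \tag{18}
\]
Again, we consider the orthonormal basis $B = (v_1, \ldots, v_d)$ of eigenvectors of Σ and we
denote the coordinates of q with respect to B by $q_i$. That is, $q = \sum_{i=1}^d q_i v_i$. Using
\[
e_i q_i^2 \ge \begin{cases} e_i |q_i| \ge e_i |q_i| - e_i, & \text{if } |q_i| \ge 1, \\ 0 \ge e_i |q_i| - e_i, & \text{if } |q_i| < 1, \end{cases} \qquad i = 1, \ldots, d,
\]
(18) then becomes
\[
p \cdot \nabla_p \tilde{H}(q, p) - \tilde{H}(q, p) \ge \frac{\alpha}{4} \sum_{i=1}^d e_i q_i^2 + c_3 \ge \frac{\alpha}{4} \sum_{i=1}^d e_i |q_i| + c_3 - \frac{\alpha}{4} \sum_{i=1}^d e_i.
\]
Now, we compute
\[
|\nabla_q \tilde{H}(q, p)| = |-\alpha \Sigma q + b| \le \alpha |\Sigma q| + |b| \le \alpha \sum_{i=1}^d e_i |q_i| + |b|,
\]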
and we see that
\[
|\nabla_q \tilde{H}(q, p)| \le c_1 \left( p \cdot \nabla_p \tilde{H}(q, p) - \tilde{H}(q, p) \right) + c_2,
\]
for $c_1 := 4$ and $c_2 := \alpha \sum_{i=1}^d e_i - 4 c_3 + |b|$. Since 4 > 0 this is condition (H4).
Lastly, we have to check condition (H6). But we can simply set
\[
g_1(x) = \sup_{p \in \mathbb{R}^d : |p| \le x} f^*(p) + x
\]
and
\[
g_2(x) = \sup_{p \in \mathbb{R}^d : |p| \le x} |(\nabla f^*)(p)| + x.
\]
Since $f^*$ and $\nabla f^*$ are continuous, we have $g_1 : [0, \infty) \to \mathbb{R}$ and $g_2 : [0, \infty) \to \mathbb{R}$. Also, by
construction these functions are strictly increasing, and while $g_2$ trivially is non-negative,
$g_1$ also is since $f^*(0) = \sup_{q \in \mathbb{R}^d} (-f(q)) \ge -f(0) = 0$. We can now set $g = g_1 \vee g_2$, and
have condition (H6).
So we have shown the main result of this section. In the process of proving it, we came
across the value function of the problem. In the following, we want to prove a further
property of it.
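Having in mind the Martingale Principle of Optimal Control, it is suggested that the
process $(Y_t)_{0 \le t \le T}$ given by the time t value of the objective
\begin{align*}
Y_t &:= \exp\left(-\alpha R_0 - \alpha \int_0^t X_u \cdot dS_u + \alpha \int_0^t f(v_u)\, du\right) \\
&\qquad \times \inf_{(X_u)_{t \le u \le T} \in \mathcal{X}(T-t, X_t)} E\left[\exp\left(-\alpha \int_t^T X_u \cdot dS_u + \alpha \int_t^T f(v_u)\, du\right)\right] \\
&= e^{-\alpha R_t} \cdot \inf_{(X_u)_{t \le u \le T} \in \mathcal{X}(T-t, X_t)} E\left[\exp\left(-\alpha \int_t^T X_u \cdot dS_u + \alpha \int_t^T f(v_u)\, du\right)\right] \\
&= V(T - t, X_t, R_t)
\end{align*}
is a martingale under optimal control and a submartingale otherwise. Using $dX_t = -v_t\, dt$,
we get that
\begin{align*}
dY_t &= dV(T - t, X_t, R_t) \\
&= \frac{\partial V}{\partial T} \frac{\partial (T - t)}{\partial t}\, dt + \nabla_{X_0} V \cdot dX_t + \frac{\partial V}{\partial R_0}\, dR_t + \frac{1}{2} \frac{\partial^2 V}{(\partial R_0)^2}\, d\langle R \rangle_t \\
&= -\frac{\partial V}{\partial T}\, dt - v_t^\top \nabla_{X_0} V\, dt + \frac{\partial V}{\partial R_0} \left( X_t^\top \sigma\, dB_t + b^\top X_t\, dt - f(v_t)\, dt \right) + \frac{1}{2} \frac{\partial^2 V}{(\partial R_0)^2} X_t^\top \Sigma X_t\, dt
\end{align*}
and looking at the drift term we deduce that
\[
0 = \inf_{v_t \in \mathbb{R}^d} \left( -\frac{\partial V}{\partial T} - v_t^\top \nabla_{X_0} V + \frac{\partial V}{\partial R_0} b^\top X_t - \frac{\partial V}{\partial R_0} f(v_t) + \frac{1}{2} \frac{\partial^2 V}{(\partial R_0)^2} X_t^\top \Sigma X_t \right)
\]
or equivalently the Hamilton-Jacobi-Bellman equation
\[
\frac{\partial V}{\partial T} = \frac{1}{2} X_t^\top \Sigma X_t \frac{\partial^2 V}{(\partial R_0)^2} + b^\top X_t \frac{\partial V}{\partial R_0} + \inf_{\xi \in \mathbb{R}^d} \left( -\xi^\top \nabla_{X_0} V - \frac{\partial V}{\partial R_0} f(\xi) \right). \tag{19}
\]
Note that we only have to take the infimum over the derivatives $v_t$ since $X_t$ is already
known at time t, but $v_t$ can be chosen freely. Also note the typing error in [15, p. 14],
where the authors gave this equation with a minus sign in front of the infimum.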
In addition to this differential equation it is sensible to assume the singular initial condition
\[
\lim_{T \searrow 0} V(T, X_0, R_0) = \begin{cases} e^{-\alpha R_0}, & \text{if } X_0 = 0, \\ \infty, & \text{otherwise.} \end{cases} \tag{20}
\]
Here the singularity reflects the fact that we must finish the liquidation by time T. As
T ↘ 0, this is only possible if we already start with an empty portfolio.
Both the HJB-equation and the initial condition were observed by Schied, Schoneborn,
and Tehranchi in [15, p. 9].
We can show that the solution for the value function we obtained before, indeed, solves
this heuristically suggested differential equation with initial condition. [15, p. 14f.]
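Proposition 3.10 The function $V(T, X_0, R_0)$ as given in (14) solves the singular Cauchy
problem (19) and (20).
Proof. We define
\[
S(T, X_0) := \inf_{X \in \mathcal{X}_{\mathrm{det}}(T,X_0)} \int_0^T L(X_t, \dot{X}_t)\, dt = \inf \int_0^T \tilde{L}(Y_t, \dot{Y}_t)\, dt,
\]
where as before the second infimum is taken over all Lipschitz continuous curves starting
at $Y_0 = 0$ and ending at $Y_T = X_0$ by lemma 3.9. In the proof of theorem 3.5, we have
already shown that conditions (H1)-(H6) hold for the Hamiltonian $\tilde{H}$ corresponding to
the Lagrangian $\tilde{L}$. Therefore by theorem 3.7, we get
\[
0 = \frac{\partial S}{\partial T}(T, X_0) + \tilde{H}(X_0, \nabla_{X_0} S(T, X_0)) = \frac{\partial S}{\partial T}(T, X_0) + H(X_0, -\nabla_{X_0} S(T, X_0)).
\]
Using this, we can now check that $V(T, X_0, R_0) = \exp(-\alpha R_0 + \alpha S(T, X_0))$ solves (19):
\begin{align*}
\frac{\partial V}{\partial R_0} &= \frac{\partial}{\partial R_0} e^{-\alpha R_0 + \alpha S(T, X_0)} = -\alpha V, \\
\frac{\partial^2 V}{(\partial R_0)^2} &= \alpha^2 V, \\
\nabla_{X_0} V &= \nabla_{X_0} e^{-\alpha R_0 + \alpha S(T, X_0)} = \alpha V \nabla_{X_0} S(T, X_0) = -\frac{\partial V}{\partial R_0} \nabla_{X_0} S(T, X_0),
\end{align*}
\begin{align*}
\frac{\partial V}{\partial T} &= \frac{\partial}{\partial T} e^{-\alpha R_0 + \alpha S(T, X_0)} = \alpha V \frac{\partial S}{\partial T}(T, X_0) \\
&= \alpha V \left( -H(X_0, -\nabla_{X_0} S(T, X_0)) \right) \\
&= -\alpha V \left( -\frac{\alpha}{2} X_0^\top \Sigma X_0 + b^\top X_0 + f^*(\nabla_{X_0} S(T, X_0)) \right) \\
&= \frac{1}{2} X_0^\top \Sigma X_0 \frac{\partial^2 V}{(\partial R_0)^2} + b^\top X_0 \frac{\partial V}{\partial R_0} + \frac{\partial V}{\partial R_0} f^*\left( -\frac{\nabla_{X_0} V}{\partial V / \partial R_0} \right) \\
&= \frac{1}{2} X_0^\top \Sigma X_0 \frac{\partial^2 V}{(\partial R_0)^2} + b^\top X_0 \frac{\partial V}{\partial R_0} + (-\alpha V) \sup_{\xi \in \mathbb{R}^d} \left( -\frac{\xi^\top \nabla_{X_0} V}{\partial V / \partial R_0} - f(\xi) \right) \\
&= \frac{1}{2} X_0^\top \Sigma X_0 \frac{\partial^2 V}{(\partial R_0)^2} + b^\top X_0 \frac{\partial V}{\partial R_0} + \inf_{\xi \in \mathbb{R}^d} \left( -\xi^\top \nabla_{X_0} V - \frac{\partial V}{\partial R_0} f(\xi) \right),
\end{align*}
where we used that −αV < 0 when replacing the supremum by the infimum.
It remains to show the singular initial condition. First we note that for $X_0 = 0$, it holds that
S(T, 0) = 0 for all T and therefore
\[
\lim_{T \searrow 0} V(T, 0, R_0) = \lim_{T \searrow 0} e^{-\alpha R_0 + \alpha S(T, 0)} = \lim_{T \searrow 0} e^{-\alpha R_0} = e^{-\alpha R_0}.
\]
Further, we derive that
\begin{align*}
\int_0^T L(X_t, \dot{X}_t)\, dt &= \int_0^T \left( \frac{\alpha}{2} X_t^\top \Sigma X_t - b^\top X_t \right) dt + \int_0^T f(-\dot{X}_t)\, dt \\
&\ge \int_0^T c_3\, dt + T f\left( \int_0^T \frac{-\dot{X}_t}{T}\, dt \right) \\
&= T c_3 + T f\left( \frac{X_0}{T} \right)
\end{align*}
using the lower bound $c_3$ on $\frac{\alpha}{2} q^\top \Sigma q - b^\top q$ from the proof of condition (H5) in theorem
3.5 and Jensen's inequality for the convex function f. If $X_0 \ne 0$, by our assumption of f
having superlinear growth, the last term blows up as we send T ↘ 0. So S, as well as V,
approach ∞ as T ↘ 0.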
4 Conclusion
The aim of this essay was to introduce the Almgren model as an example of a model
for price impact in financial markets. We saw that under the strong assumptions of a
linear temporary impact and static trading strategies one can explicitly solve the
mean-variance minimisation of the costs incurred. We were also able to show that with
linear impact one can use dynamic programming to derive optimal adaptive strategies
which do strictly better in the mean-variance sense, since they react to price changes
by adjusting the trading speed. In the last part of the essay we proved a rather
counter-intuitive fact: if one instead optimises expected CARA utility, there is again a
unique optimal trading strategy, but it is deterministic. Interestingly, this optimal
strategy can again be obtained from a mean-variance optimisation over static strategies,
where the level of risk aversion λ is given by half the constant absolute risk aversion α
from the CARA utility function. For the case of a one-asset market with linear temporary
impact, the optimal strategy is then the one we derived in section 2.2.
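The relation λ = α/2 can be illustrated with a short sketch. It assumes a continuous-time, one-asset setting with zero drift, volatility sigma and temporary impact f(ξ) = eta * ξ², for which minimising expected cost plus λ times variance gives the familiar hyperbolic-sine liquidation trajectory. The function name, the parameter values and the closed form used here are assumptions for illustration and may differ in notation from the strategy of section 2.2.

```python
import math


def optimal_holdings(t, T, x0, alpha, sigma, eta):
    """Remaining position at time t for a CARA investor with risk aversion alpha.

    Assumes f(xi) = eta * xi**2, zero drift and the mean-variance
    risk aversion lam = alpha / 2 (all parameter names are hypothetical).
    """
    lam = alpha / 2.0
    kappa = math.sqrt(lam * sigma ** 2 / eta)  # urgency parameter
    return x0 * math.sinh(kappa * (T - t)) / math.sinh(kappa * T)


T, x0 = 1.0, 100.0
grid = [i / 10 for i in range(11)]

# Trajectories for a mildly and a strongly risk-averse investor.
mild = [optimal_holdings(t, T, x0, alpha=0.2, sigma=1.0, eta=1.0) for t in grid]
strong = [optimal_holdings(t, T, x0, alpha=20.0, sigma=1.0, eta=1.0) for t in grid]

for t, m, s in zip(grid, mild, strong):
    print(f"t = {t:.1f}   alpha=0.2: {m:7.2f}   alpha=20: {s:7.2f}")
```

Under these assumptions a larger α leads to faster selling at the start, while a small α gives a trajectory close to the straight line from x0 to 0.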
We have, however, mentioned that a linear temporary impact is strongly refuted by
empirical analyses. Unfortunately, the Almgren model itself makes no prediction about the
shape of the impact. It only relies on some assumptions which are economically necessary
but do not describe the impact in more detail. It is, therefore, required to estimate the
impact from real data or to establish another model which describes its shape. One way
of doing this is to look at the so-called limit order book (LOB for short), which contains
all current limit orders, i.e. buy orders at or below the bid price and sell orders at or
above the ask price. A large sell (buy) order would first fill all limit orders at the
bid (ask) price and then continue with orders at lower (higher) prices.
So the number of shares available at each price determines the actual price one has to
pay for an order of a certain number of shares, and this can be converted into a price
impact. In order to determine permanent and temporary price impact, one would have to
know how the LOB behaves over time. If all filled orders are instantaneously replaced
by new ones, we identify the price impact as temporary. Although the resilience of the
order book is an empirical fact, complete recovery before the next trade is only plausible
for long times between trades or for small trade sizes. The assumption in the Almgren
model that price impact consists only of a permanent and a temporary component may
therefore be too strong for the model to be applicable to trading at short intervals. It
would make sense to introduce some decay of the impact, as was done by Gatheral in [8]
or in the LOB approach first introduced by Obizhaeva and Wang in [12]. For sufficiently
long intervals between the executions of trades, however, the LOB is a good tool to
estimate temporary impact. The permanent impact can also be modelled using this approach
by identifying the new bid or ask price which ensues after the recovery.
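The conversion of an order book snapshot into a price impact can be sketched as follows. This is a minimal illustration for a sell order that walks down the bid side. The book levels, the function name and the definition of impact as the best bid minus the average execution price are assumptions made for the example, not data or definitions from the essay.

```python
def average_sell_price(bids, quantity):
    """Average price obtained when selling `quantity` shares against `bids`.

    `bids` is a list of (price, size) pairs sorted from the best (highest)
    bid downwards. Raises ValueError if the visible depth is insufficient.
    """
    remaining = quantity
    proceeds = 0.0
    for price, size in bids:
        filled = min(remaining, size)  # take what this level offers
        proceeds += filled * price
        remaining -= filled
        if remaining == 0:
            break
    if remaining > 0:
        raise ValueError("order exceeds the visible depth of the book")
    return proceeds / quantity


# Hypothetical snapshot of the bid side: (price, number of shares).
bids = [(100.0, 500), (99.9, 300), (99.8, 700)]

avg_price = average_sell_price(bids, 1000)
impact_per_share = bids[0][0] - avg_price  # shortfall relative to the best bid

print(f"average price: {avg_price:.2f}, impact per share: {impact_per_share:.2f}")
```

Whether this shortfall counts as temporary or permanent impact depends, as discussed above, on how the book refills after the trade, which a single snapshot cannot show.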
List of Figures
1 Optimal Trading Trajectories for Different Levels of Risk Aversion
2 The Efficient Frontier
3 Adaptive Trajectories
References
[1] R. Almgren and N. Chriss. Optimal execution of portfolio transactions. J. Risk 3:
5–39 (2000)
[2] R. Almgren, C. Thum, E. Hauptmann and H. Li. Direct estimation of equity
market impact. Risk 18: 57–62 (2005)
[3] R. F. Almgren. Optimal execution with nonlinear impact functions and trading-
enhanced risk. Applied Mathematical Finance 10(1): 1–18 (2003)
[4] BARRA. Market impact model handbook (Berkeley, California, Barra, 1997)
[5] S. H. Benton. The Hamilton-Jacobi equation: a global approach. Mathematics in
Science and Engineering. Academic Press, New York (1977)
[6] B. Biais, P. Hillion and C. Spatt. An empirical analysis of the limit order book and
the order flow in the Paris Bourse. Journal of Finance 50(5): 1655–1689 (1995)
[7] X. Gabaix, P. Gopikrishnan, V. Plerou and H. E. Stanley. A theory of power law
distributions in financial market fluctuations. Nature 423: 267–270 (2003)
[8] J. Gatheral. No-dynamic-arbitrage and market impact. Quantitative Finance 10(7):
749–759 (2010)
[9] G. Huberman and W. Stanzl. Price manipulation and quasi-arbitrage. Econometrica
72(4): 1247–1275 (2004)
[10] J. Lorenz and R. Almgren. Mean-variance optimal adaptive execution. Applied
Mathematical Finance 18(5-6): 395–422 (2011)
[11] S. Mallaby. More Money Than God. The Penguin Press, New York (2010)
[12] A. Obizhaeva and J. Wang. Optimal trading strategy and supply/demand dynamics.
Journal of Financial Markets 16(1): 1–32 (2013)
[13] M. Potters and J. P. Bouchaud. More statistical properties of order books and price
impact. Physica A: Statistical Mechanics and its Applications 324(1): 133–140 (2003)
[14] R. T. Rockafellar. Convex Analysis. Princeton University Press, Princeton, New
Jersey (1970)
[15] A. Schied, T. Schöneborn and M. Tehranchi. Optimal basket liquidation for CARA
investors is deterministic. Applied Mathematical Finance 17(5-6): 471–489 (2010)
[16] R. Smith. Street hazard. The Wall Street Journal (1985)