
Trading in VIX Derivatives¹

Presented at IAQF Thalesian Series

Andrew Papanicolaou

FRE DepartmentNYU TandonBrooklyn NY

April 25th 2017

¹ Joint work with Marco Avellaneda.

1 / 85

Fear

Figure : The VIX is the market’s impulse response to fear.

2 / 85

The VIX Index

For $t \le T$ let the options on SPX be

$$P(t,K,T) = e^{-r(T-t)}\,E_t (K - S_T)^+ ,$$

$$C(t,K,T) = e^{-r(T-t)}\,E_t (S_T - K)^+ ,$$

with $E_t$ being risk-neutral expectation.

The VIX is

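$$\mathrm{VIX}_t = \sqrt{\frac{2e^{r\tau}}{\tau}\left(\int_0^{E_t S_{t+\tau}} P(t,K,t+\tau)\,\frac{dK}{K^2} + \int_{E_t S_{t+\tau}}^{\infty} C(t,K,t+\tau)\,\frac{dK}{K^2}\right)}$$

with $\tau$ = 30 days.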
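As a rough numerical check of the VIX formula, the sketch below discretizes the strip of out-of-the-money options with a trapezoid rule. It assumes flat-volatility Black (1976) prices purely for illustration; the function names `black_price` and `vix_from_strip`, the forward level 2350 and the factor 100 (quoting in percentage points) are assumptions, not part of the talk. Under a flat 20% volatility the strip should return a value close to 20.

```python
import math
import numpy as np

def black_price(F, K, vol, tau, r, is_call):
    """Black (1976) price of a European option written on the forward F."""
    sd = vol * math.sqrt(tau)
    d1 = (math.log(F / K) + 0.5 * sd * sd) / sd
    d2 = d1 - sd
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    df = math.exp(-r * tau)
    if is_call:
        return df * (F * cdf(d1) - K * cdf(d2))
    return df * (K * cdf(-d2) - F * cdf(-d1))

def vix_from_strip(F, vol, tau=30.0 / 365.0, r=0.0, n=4001, width=10.0):
    """Discretize the VIX integral: puts below the forward, calls above it."""
    sd = vol * math.sqrt(tau)
    log_k = np.linspace(math.log(F) - width * sd, math.log(F) + width * sd, n)
    K = np.exp(log_k)
    otm = np.array([black_price(F, k, vol, tau, r, is_call=(k >= F)) for k in K])
    integrand = otm / K**2
    # trapezoid rule written out by hand
    integral = float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(K)))
    return 100.0 * math.sqrt(2.0 * math.exp(r * tau) / tau * integral)

vix_flat = vix_from_strip(F=2350.0, vol=0.20)
```

In practice the strip uses a finite set of listed strikes, so truncation and strike spacing matter; the dense grid here sidesteps both.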

3 / 85

Highlights of Talk

1. The VIX futures curve exhibits stationary behavior, with mean reversion toward a contango.

2. Which is a good model for capturing statistical dynamics of VIX futures?

3. How do complacency and the unknown factor into a stationary model?

4. Can we manage the negative roll yield in VIX ETNs?

5. Is the market showing too little concern?

4 / 85

The VIX Time Series

[Plot: VIX Daily Closing (1/2/2004 to 1/3/2017); vertical axis from 0 to 100.]

Figure : The VIX has measured market fear since 2004.

5 / 85

Statistical Properties of VIX & VX Futures

From the VIX time series...

1. time series mean, 17.16

2. time series mode, 12.64

3. augmented Dickey-Fuller stat: reject (no unit root)

Typically...

1. most days the VIX futures curve is in contango, with backwardation coming when there’s fear.

2. backwardation mean reverts within a few weeks, which is fast compared to interest rates or oil
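The unit-root test mentioned above can be sketched in a few lines. This is the plain (non-augmented) Dickey-Fuller regression with a constant, run on a simulated mean-reverting series rather than on VIX data; the function name `dickey_fuller_t`, the AR(1) parameters and the use of the asymptotic 5% critical value of about -2.86 are assumptions made for illustration.

```python
import numpy as np

def dickey_fuller_t(y):
    """t-statistic of b in the regression  dy_t = a + b*y_{t-1} + e_t."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])
    coef, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ coef
    s2 = resid @ resid / (len(dy) - 2)          # residual variance
    cov = s2 * np.linalg.inv(X.T @ X)           # OLS covariance of (a, b)
    return coef[1] / np.sqrt(cov[1, 1])

# Simulated log-level series with AR(1) mean reversion (not market data).
rng = np.random.default_rng(0)
n, phi, mean = 3000, 0.95, np.log(17.0)
x = np.empty(n)
x[0] = mean
for t in range(1, n):
    x[t] = mean + phi * (x[t - 1] - mean) + 0.06 * rng.standard_normal()

t_stat = dickey_fuller_t(x)
reject_unit_root = bool(t_stat < -2.86)         # asymptotic 5% level, constant only
```

The augmented version adds lagged differences of the series to the regression; a statistics package would normally be used for that and for exact critical values.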

6 / 85

VX Term Structure

Figure : A typical contango, Jan 23, 2017.

7 / 85

VX Term Structure

Figure : Backwardation for 5 months, Aug 17, 2011. Obvious dollar-neutral trade here.

8 / 85

VX Term Structure

Figure : Full on backwardation, Oct 16, 2008

9 / 85

VX Term Structure

Figure : April 11, 2017. Notice the “scoop”, which is typically how trouble starts.... Perhaps elections in France and the possibility of a French exit are causing fear.

10 / 85

VX Term Structure

Figure : VIX weeklies included, April 11, 2017. Weeklies are not liquid.

11 / 85

The “Dull” or Most Likely State

Figure : April 10th 2017. The blue line connects the modes for each VX contract.

12 / 85

Comparison with Other Stationary Curves

Other curves are stationary, but with different properties,

I Oil: normal backwardation and storage theory

  I normal for producers to hedge price (Keynes),

  I 100 years later, storage industry in Cushing OK, cash-and-carry arb puts lower bound on contango.

  I inventories/storage lower vol.

I Gold: considering only (21m)^3 in human hands, relatively cheap to store, relatively flat curve/low vol

I Rates: Lots of instruments (bonds, swaps, etc.), relatively complete market; hedgeable

Deep contango and high vol in non-storables:

I electricity

I VIX futures

13 / 85

Contango Roll Yield

This is the most likely curve: Long positions in short-term VX lose faster than long-term

[Sketch: VX plotted against maturity T.]

Figure : The most likely yield, and the red arrows illustrating the negative roll yield for long positions at all maturities.

14 / 85

A Non-Stationary Curve

[Sketch: VX plotted against maturity T.]

The “Scoop-Shaped” curve will revert back to the contango.

Figure : A non-stationary or transient state. The black arrows illustrate the positive roll yield for long positions at shorter maturities.

15 / 85

The “Dull” State of Fear

Figure : Sooner or later it becomes normal......

16 / 85

Bergomi’s Model [Bergomi, 2005, Bergomi, 2008]

Let $T > 0$ denote a future’s maturity,

$$F_{t,T} = E_t \mathrm{VIX}_T \quad \forall t \le T ,$$

where $E_t$ denotes a time-$t$ conditional risk-neutral expectation.

Bergomi model arises naturally from the risk-neutral martingale,

$$\frac{dF_{t,T}}{F_{t,T}} = \sum_{i=1}^{d} \sigma_i(T-t)\,dW^i_t ,$$

where each $\sigma_i(t)$ is a diffusion coefficient that tends toward zero as $t \to \infty$, and vector $dW_t$ is Brownian increments with correlations

$$dW^i_t\,dW^j_t = \rho_{ij}\,dt .$$

17 / 85

Bergomi’s Model

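The solution of the SDE for $F_{t,T}$ is

$$F_{t,T} = F_{t_0,T}\exp\left(\sum_{i=1}^{d}\int_{t_0}^{t}\sigma_i(T-s)\,dW^i_s - \frac{1}{2}\sum_{i,j=1}^{d}\int_{t_0}^{t}\rho_{ij}\,\sigma^2_{ij}(T-s)\,ds\right) ,$$

where $\sigma^2_{ij} = \sigma_i\sigma_j$ and $t_0 \le t$ is an initial time.

Different kernels:

I The exponential kernel $\sigma_i(t) = \sigma_i e^{-\kappa_i t}$ with $\kappa_i > 0$ (Markov)

I Power law $\sigma_i(t) = t^{-\gamma}$ with $\gamma > 0$ (fractional Brownian motion, non-Markov, [Gatheral et al., 2014])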

18 / 85

Rolling Contracts

Denote $\tau = T - t$ to have rolling futures contract,

$$V^\tau_t = F_{t,t+\tau} ,$$

for which there is the following expression:

$$V^\tau_t = F_{t_0,t+\tau}\exp\left(\sum_{i=1}^{d}\int_{t_0}^{t}\sigma_i(\tau+t-s)\,dW^i_s - \frac{1}{2}\sum_{i,j=1}^{d}\int_{t_0}^{t}\rho_{ij}\,\sigma^2_{ij}(\tau+t-s)\,ds\right) .$$

Take the volatility functions to be

$$\sigma_i(t) = \sigma_i e^{-\kappa_i t}$$

where $\kappa_i > 0$. We take the factor $X_t$ to be a stationary OU process,

$$X^i_t = \sigma_i\int_{-\infty}^{t} e^{-\kappa_i(t-s)}\,dW^i_s ,$$

and letting $t_0$ tend toward $-\infty$, ..........

19 / 85

Stationary State

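.......... we obtain the stationary model for the futures curve,

$$V^\tau_t = V_\infty\exp\left(\sum_{i=1}^{d} e^{-\kappa_i\tau}X^i_t - \frac{1}{2}\sum_{i,j=1}^{d}\rho_{ij}\,\frac{\sigma_i\sigma_j}{\kappa_i+\kappa_j}\,e^{-(\kappa_i+\kappa_j)\tau}\right) .$$

In particular, evaluating at $\tau = 0$ gives

$$\mathrm{VIX}_t = V_\infty\exp\left(\sum_{i=1}^{d} X^i_t - \frac{1}{2}\sum_{i,j=1}^{d}\rho_{ij}\,\frac{\sigma_i\sigma_j}{\kappa_i+\kappa_j}\right) .$$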

20 / 85

The “Dull” or Most Likely State

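The mode for this model.....

$$\mathrm{mode}(V^\tau_t) = V_\infty\exp\left(-\frac{1}{2}\sum_{i,j=1}^{d}\rho_{ij}\,\frac{\sigma_i\sigma_j}{\kappa_i+\kappa_j}\,e^{-(\kappa_i+\kappa_j)\tau}\right) .$$

This should be a contango....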
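To see the contango claim numerically, the sketch below evaluates the stationary two-factor curve at $X = 0$ using the risk-neutral values listed in the talk's "Estimated Parameters" table. The function name `curve`, the monthly grid and the convention that maturities are in years are assumptions for illustration only.

```python
import numpy as np

# Risk-neutral estimates copied from the "Estimated Parameters" table of the talk.
V_INF = 21.2068
KAPPA = np.array([0.6879, 23.7273])
SIGMA = np.array([1.3393, 1.8297])
RHO = np.array([[1.0, 0.2018], [0.2018, 1.0]])

def curve(tau, x):
    """Stationary 2-factor curve V^tau for factor values x (tau in years)."""
    tau = np.atleast_1d(np.asarray(tau, dtype=float))
    k_sum = KAPPA[:, None] + KAPPA[None, :]
    cov = RHO * np.outer(SIGMA, SIGMA) / k_sum       # rho_ij s_i s_j / (k_i + k_j)
    convexity = np.array([np.sum(cov * np.exp(-k_sum * t)) for t in tau])
    factor = np.exp(-np.outer(tau, KAPPA)) @ np.asarray(x, dtype=float)
    return V_INF * np.exp(factor - 0.5 * convexity)

taus = np.arange(8) * 30.0 / 365.0                   # 0, 1, ..., 7 months
mode_curve = curve(taus, x=[0.0, 0.0])               # the "dull" state, X = 0
is_contango = bool(np.all(np.diff(mode_curve) > 0))
```

Because every entry of the convexity matrix is positive for these parameters, the correction shrinks with maturity and the curve rises toward $V_\infty$. Passing a nonzero `x` gives the curve for an excited state.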

21 / 85

The Dull State for a 2-Factor Model

[Plot: VIX Term Structure from 2-Factor Gaussian OU; horizontal axis: maturity (in months); curves for VIX0 = 12.7028, VIX0 = 24.0596 and VIX0 = 32.0561.]

Figure : 2 Factors, $X^1_t$ and $X^2_t$ mean-reverting OU processes.

22 / 85

PCA and Model Selection

PCA from February 8th 2011 to December 15th 2016

I 8 rolling contracts (including the VIX)

I N = 1,499 days

I each day the VIX and the VX future curve form row entry in N × 8 matrix.

Notation,

$$V_{ij} = \ln(V^{\tau_j}_{t_i}) - \overline{\ln(V^{\tau_j})} ,$$

where

$$\overline{\ln(V^{\tau_j})} = \frac{1}{N}\sum_{i}\ln(V^{\tau_j}_{t_i}) ,$$

with $i = 1, 2, 3, \dots, N$, and $j = 0, 1, 2, \dots, 7$ with $\tau_j = j\,\frac{30}{365}$.

23 / 85

PCA and Model Selection

The singular value decomposition (SVD),

$$U S \psi' = V ,$$

where

I U is an N × 8 matrix with orthonormal columns,

I S is an 8 × 8 diagonal matrix containing the singular values, and

I ψ is an 8 × 8 orthonormal matrix whose columns are the principal components used to form any given futures curve.

In other words, for $d \le 8$ we have

$$\ln(V^{\tau_j}_{t_i}) \approx \overline{\ln(V^{\tau_j})} + \sum_{\ell=1}^{d} a_{i\ell}\,\psi_{j\ell} ,$$

with equality when $d = 8$, where the coefficient matrix is $a = US$.

24 / 85
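The SVD-based reconstruction can be sketched with NumPy as follows. The curves here are synthetic (two exponential loadings plus small noise), not the VX data used in the talk, and the names `log_curves`, `reconstruct` and `explained` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
N, J = 500, 8                                   # days, points on the curve
tau = np.arange(J) * 30.0 / 365.0

# Synthetic log-curves driven by two factors plus small noise (not market data).
factors = rng.standard_normal((N, 2)) * np.array([0.3, 0.1])
loadings = np.vstack([np.exp(-0.7 * tau), np.exp(-20.0 * tau)])
log_curves = np.log(18.0) + factors @ loadings + 0.005 * rng.standard_normal((N, J))

mean_curve = log_curves.mean(axis=0)
V = log_curves - mean_curve                     # demeaned N x 8 matrix
U, s, psi_t = np.linalg.svd(V, full_matrices=False)   # V = U @ diag(s) @ psi_t
a = U * s                                       # coefficient matrix a = U S

def reconstruct(d):
    """Rebuild all log-curves from the first d principal components."""
    return mean_curve + a[:, :d] @ psi_t[:d, :]

# Frobenius-norm reconstruction error for d = 1, ..., 8
errors = [float(np.linalg.norm(reconstruct(d) - log_curves)) for d in range(1, J + 1)]
explained = s**2 / np.sum(s**2)                 # variance share per component
```

The rows of `psi_t` are the columns of ψ. Inspecting `explained` is one simple way to judge how many components are worth keeping.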


[Plots: four panels of reconstructed VX curves, the first titled “VX Curve Reconstructed with 1 Components”; horizontal axis: Maturity (in months).]

Figure : PCA reconstruction of VX curves for April 10th 2014.

25 / 85


[Plots: four panels of reconstructed VX curves; vertical axis from 25 to 50.]

Figure : PCA reconstruction of VX futures curve for August 8th 2011, US credit downgrade.

26 / 85

Mode of PCA Weights

[Plots: Histogram of 1st PCA Weight; Histogram of 2nd PCA Weight.]

Figure : The histogram of weights for the 1st and 2nd principal components, $a_{i1}$ and $a_{i2}$ respectively.

27 / 85

The Most Likely Curve via PCA

[Plot: Most Likely Curve; horizontal axis: Maturity (in months); legend: most likely curve, mean curve.]

Figure : Recall mean VIX higher than mode VIX. Here mean curve is higher than mode curve,

$$\mathrm{mode}\big(\ln(V_{t_i})\big) \approx \overline{\ln(V)} + \mathrm{mode}(a_{i1})\,\psi_1 + \mathrm{mode}(a_{i2})\,\psi_2$$

28 / 85

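The Real-World or Statistical Model

The stationary, risk-neutral process,

$$dX^1_t = -\kappa_1 X^1_t\,dt + \sigma_1\,dW^1_t ,$$

$$dX^2_t = -\kappa_2 X^2_t\,dt + \sigma_2\,dW^2_t .$$

The real-world or statistical dynamics of the bivariate OU process are

$$dX^1_t = \kappa^p_1(\mu_1 - X^1_t)\,dt + \sigma_1\,dW^{p,1}_t ,$$

$$dX^2_t = \kappa^p_2(\mu_2 - X^2_t)\,dt + \sigma_2\,dW^{p,2}_t ,$$

where

$$d\begin{pmatrix}W^1_t\\ W^2_t\end{pmatrix} = \underbrace{\begin{pmatrix}\sigma_1 & 0\\ 0 & \sigma_2\end{pmatrix}^{-1}\begin{pmatrix}\kappa^p_1\mu_1 - (\kappa^p_1-\kappa_1)X^1_t\\ \kappa^p_2\mu_2 - (\kappa^p_2-\kappa_2)X^2_t\end{pmatrix}}_{\text{market price of volatility risk}}\,dt + d\begin{pmatrix}W^{p,1}_t\\ W^{p,2}_t\end{pmatrix} .$$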

29 / 85

Parameter Estimation

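The data is the observed VIX and VX futures,

$$Y^{\tau_j}_i = \ln(V^{\tau_j}_{t_i})$$

with $\tau_j = j \times 30$ days for $j = 0, 1, \dots, 7$. Let the parameters of the OU process be denoted by $\theta$,

$$\theta = (\underbrace{V_\infty, \kappa_1, \kappa_2, \sigma_1, \sigma_2, \rho}_{\text{risk neutral}},\ \underbrace{\kappa^p_1, \kappa^p_2, \mu_1, \mu_2}_{\text{real world}}) .$$

Using the model

$$Y_i = H^\theta X_{t_i} + G^\theta + \varepsilon_i \quad \text{(observed)} , \qquad (1)$$

$$X_{t_{i+1}} = A^\theta X_{t_i} + \mu^\theta + \Delta W^p_{i+1} \quad \text{(latent)} , \qquad (2)$$

where $\varepsilon_i = (\varepsilon^{\tau_j}_i)_j$, $\mathrm{cov}(\varepsilon_i) = R$ and $\mathrm{cov}(\Delta W^p_{i+1}) = Q^\theta$.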
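The slides do not spell out $H^\theta$, $G^\theta$, $A^\theta$, $\mu^\theta$ and $Q^\theta$. One plausible construction, sketched below, reads $H^\theta$ and $G^\theta$ off the log of the stationary curve and takes $A^\theta$, $\mu^\theta$, $Q^\theta$ from the exact discretization of the real-world OU dynamics over a step `dt`. The function name `state_space`, the daily step of 1/252 years and this particular mapping are assumptions, not the authors' stated implementation.

```python
import numpy as np

def state_space(theta, taus, dt=1.0 / 252.0):
    """Map theta to (H, G, A, mu_vec, Q) for the 2-factor model (a sketch)."""
    V_inf, k1, k2, s1, s2, rho, kp1, kp2, m1, m2 = theta
    kappa, sigma = np.array([k1, k2]), np.array([s1, s2])
    kappa_p, mu = np.array([kp1, kp2]), np.array([m1, m2])
    corr = np.array([[1.0, rho], [rho, 1.0]])
    ss = corr * np.outer(sigma, sigma)               # rho_ij sigma_i sigma_j
    taus = np.asarray(taus, dtype=float)

    # Observation equation from ln V^tau = ln V_inf + sum_i e^{-k_i tau} X^i - convexity
    H = np.exp(-np.outer(taus, kappa))               # H[j, i] = exp(-kappa_i tau_j)
    k_sum = kappa[:, None] + kappa[None, :]
    G = np.array([np.log(V_inf) - 0.5 * np.sum(ss / k_sum * np.exp(-k_sum * t))
                  for t in taus])

    # Exact one-step transition of the real-world OU process
    A = np.diag(np.exp(-kappa_p * dt))
    mu_vec = (1.0 - np.exp(-kappa_p * dt)) * mu
    kp_sum = kappa_p[:, None] + kappa_p[None, :]
    Q = ss * (1.0 - np.exp(-kp_sum * dt)) / kp_sum
    return H, G, A, mu_vec, Q

# Values from the talk's "Estimated Parameters" table, in the order of theta.
THETA = (21.2068, 0.6879, 23.7273, 1.3393, 1.8297, 0.2018,
         1.2938, 17.0911, 0.1904, -0.0056)
H, G, A, mu_vec, Q = state_space(THETA, np.arange(8) * 30.0 / 365.0)
```

The first row of `H` is all ones because $\tau_0 = 0$ corresponds to the VIX itself.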

30 / 85

Estimated Parameters

Estimated θ

Parameter   Estimate
V∞          21.2068
κ1          0.6879
κ2          23.7273
σ1          1.3393
σ2          1.8297
ρ           0.2018
κp1         1.2938
κp2         17.0911
µ1          0.1904
µ2          -0.0056

Table : Estimated VIX risk-neutral mode of 10.5068. For the statistical model, the optimization has constraints to look for a mean and mode that are equal to those of the VIX data; the estimated model’s statistical mode is 12.6400 and its mean is 18.9844, compared to the mode and mean of the VIX time series of 12.6400 and 17.1639, respectively. The total fitting error to the VX term structure is 8.50378.

31 / 85

Goodness of Fit

[Plots: two panels, “Kalman Filter” and “Least−Squares Estimator”, each showing X1 and X2 against t (in years).]

Figure : Left: The Kalman filter $\hat{X}^\theta_i$. Right: The least-square estimator $\hat{X}^{\theta,lsq}_i$.

I Important to use Kalman filter; maintains our a priori preference for OU dynamics X.

I Daily least squares (LSQ) estimation can be used after the parameters are estimated, but if daily LSQ is used in the iterative parameter estimation algorithm then there is overfitting.

32 / 85

Goodness of Fit

[Plots: data and fit for $V^\tau_t$ against t (in years), one panel for each of τ = 0, 30, 60, 90, 120 and 150 days.]

33 / 85

Goodness of Fit

[Plots: histograms titled “Innovations” and “Residuals $\hat{X}^\theta_{i+1} - A^\theta\hat{X}^\theta_i - \mu^\theta$”.]

Figure : Left: The histograms of the innovations $\nu_i$, which are normally distributed under the null hypothesis of normally-distributed $\varepsilon_i$ and $\Delta W^p$. Right: The histograms of the residuals $\hat{X}^\theta_{i+1} - A^\theta\hat{X}^\theta_i - \mu^\theta$, which would also be normally distributed under the null hypothesis.

34 / 85

Most Likely VX Curve Given High VIX

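The outlier probability for $V^\tau_t$ is equivalent to the probability of the normal random variable exceeding some large $M > 0$,

$$P^p\left(X^1_t + X^2_t \ge M\right) \le \exp\left(-\frac{1}{2}\,\frac{\big(M - (\mu_1+\mu_2)\big)^2}{(1,1)\,\Sigma\begin{pmatrix}1\\ 1\end{pmatrix}}\right) ,$$

where $P^p$ denotes the statistical probability measure, and

$$\Sigma = \begin{pmatrix}\dfrac{\sigma_1^2}{2\kappa^p_1} & \dfrac{\rho\sigma_1\sigma_2}{\kappa^p_1+\kappa^p_2}\\[2ex] \dfrac{\rho\sigma_1\sigma_2}{\kappa^p_1+\kappa^p_2} & \dfrac{\sigma_2^2}{2\kappa^p_2}\end{pmatrix} .$$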

35 / 85


Conditional on an exceedance over a threshold $M$, the most likely value of $(X^1_t, X^2_t)$ is found by maximizing the joint density subject to a constraint. The density is

$$p(x_1,x_2) = \frac{1}{2\pi|\Sigma|^{1/2}}\exp\left(-\frac{1}{2}(x_1-\mu_1,\ x_2-\mu_2)\,\Sigma^{-1}\begin{pmatrix}x_1-\mu_1\\ x_2-\mu_2\end{pmatrix}\right) ,$$

and the Lagrangian optimization problem is

$$\min_{x_1,x_2}\ \frac{1}{2}(x_1-\mu_1,\ x_2-\mu_2)\,\Sigma^{-1}\begin{pmatrix}x_1-\mu_1\\ x_2-\mu_2\end{pmatrix} - \delta\,(x_1+x_2-M) ,$$

where $\delta \ge 0$ is a Lagrange multiplier.

36 / 85


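The solution is

$$\begin{pmatrix}X^1\\ X^2\end{pmatrix}^{ml}_{(\tau,M)} = \begin{pmatrix}\mu_1\\ \mu_2\end{pmatrix} + \delta^*\,\Sigma\begin{pmatrix}1\\ 1\end{pmatrix} ,$$

where $\delta^*$ is the optimal Lagrange multiplier, which for $M > \mu_1+\mu_2$ is

$$\delta^* = \frac{M - (\mu_1+\mu_2)}{(1,1)\,\Sigma\begin{pmatrix}1\\ 1\end{pmatrix}} ;$$

obviously $\delta^* = 0$ if $M \le \mu_1+\mu_2$.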
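The constrained mode is simple enough to evaluate directly. The sketch below plugs the real-world estimates from the talk's parameter table into $\Sigma$ and computes $\delta^*$ and the most likely state for the threshold $M = 1.2$ that appears in the figure legend; the function name `most_likely_state` is an illustrative assumption.

```python
import numpy as np

# Real-world estimates from the talk's "Estimated Parameters" table.
SIGMA = np.array([1.3393, 1.8297])
KAPPA_P = np.array([1.2938, 17.0911])
MU = np.array([0.1904, -0.0056])
RHO = 0.2018

corr = np.array([[1.0, RHO], [RHO, 1.0]])
# Stationary covariance: rho_ij sigma_i sigma_j / (kappa^p_i + kappa^p_j)
Sigma = corr * np.outer(SIGMA, SIGMA) / (KAPPA_P[:, None] + KAPPA_P[None, :])

def most_likely_state(M):
    """Most likely (X^1, X^2) given X^1 + X^2 >= M, and the multiplier delta*."""
    ones = np.ones(2)
    delta = max(M - MU.sum(), 0.0) / (ones @ Sigma @ ones)
    return MU + delta * (Sigma @ ones), delta

x_ml, delta_star = most_likely_state(M=1.2)
```

When the threshold binds, the constraint holds with equality, so `x_ml.sum()` equals `M`; most of the excess is assigned to the slow factor because its stationary variance is much larger.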

37 / 85


[Plot: Most Likely Curve Given VIX High; horizontal axis: Maturity (in months), 0 to 25; legend: most likely curve given $X^1 + X^2 > 1.2$, most likely curve.]

Figure : The most likely curve given $X^1 + X^2 \ge M$.

38 / 85


[Plot: Most Likely Curve Given VIX High; horizontal axis: Maturity (in months), 0 to 25; legend: most likely curve given $X^1 + X^2 > 0.25$, most likely curve.]

Figure : The most likely curve given $X^1 + X^2 \ge M$.

39 / 85

Or the Curve if we know more...

[Plot: High VIX with Dull State Expected in 4 Months; horizontal axis: Maturity (in months), 0 to 12; legend: Non-Stationary Curve, most likely curve.]

Figure : Most likely given $X^1 + X^2 \ge M$ and $X^1 e^{-\kappa_1\tau_4} + X^2 e^{-\kappa_2\tau_4} = \mu_1 e^{-\kappa_1\tau_4} + \mu_2 e^{-\kappa_2\tau_4}$.

40 / 85

Summary to this Point

I Characterized stationarity of VX curve

I Looked at PCA and found 2 factors are sufficient

I Fit the Gaussian Bergomi model and found it captures some of the features.

41 / 85

Complacency

42 / 85

Complacency

Of considerable concern right now.

James Mackintosh of WSJ writes:

“At the moment, the gap between realized and implied volatility is normal. Investors are prepared for some rise in volatility, but from an abnormally low level.”

Typically, the spread VIX minus realized vol is positive to reflect premium in options.

However, JM remarks that low VIX isn’t the whole story, as the premium in SPX put options is more indicative of fear.

43 / 85

Complacency

Indeed, there are several causes and effects:

I stimulus and easy money every time the market goes down; younger traders only know the “Big Dip Era”

I The ETF craze, entire market “waxes and wanes” in unison because of the migration to passive funds

I post-crisis corporate buybacks

I Self-fulfilling prophecy:

  I VIX will stay low, so short the contango, which in turn drives down VIX prices

  I many funds shorting the long-dated VIX futures and collecting premium

  I The ZIV is an example of this short

44 / 85

Past VIX Events

Non-Complacent moments in the last 2 decades:

I the Russia crisis 8/98

I the dotcom collapse 3/00

I market euphoria begins 1/06

I the credit crunch 8/07

I Lehman collapse 9/08

I Greece debt 5/10

I Eurozone debt/US downgrade 8/11

Brexit and Trump election didn’t have much effect.

The French election had some effect in the last 2 weeks.

45 / 85

We’re Overdue for an Event

I The forest service practices controlled burns as a means to prevent disastrous wildfires.

I We’re overdue for a volatility event, the forest is littered with kindling...

I Is there a volatility forest service?

46 / 85

The VIX ETNs

[Plot: VXX and VXZ Daily Closing (3/1/2010 to 3/1/2017), with separate vertical axes for VXX and VXZ.]

Figure : Negative roll yields for positions in front end or in the back end. Front end is more volatile. $1 million invested in the first VIX ETN in 2009 would be $600 today.

47 / 85

Negative Roll Yield

[Plot: Contango Yield; series V2-V1 and V7-V4 against date, tick labels from 02-08-11 to 07-30-16.]

Figure : Negative roll yields for positions in front end or in the back end. Front end is more volatile.

48 / 85

Pairs Trading for VIX ETNs

[Plot: Co-Integrated Time Series log(VXX) − β1 − β2 log(VXY), March 2010 to March 2017.]

Figure : Engle-Granger rejects H0: ln(VXX) and ln(VXY) have no co-integration. Run regression ln(VXX) = β1 + β2 ln(VXY), then Dickey-Fuller test rejects unit root in residual.

49 / 85

Construction of ETNs

ETNs roll between 2 or more VX contracts

I let t be the calendar date

I let T1 and T2 be first and second expirations after t.

I let r = T2 − T1, i.e. the number of business days in the period between T1 and T2

I let θ be the number of business days between today and 1 month from today.

Figure : Contracts $F_{t,T_1}$ and $F_{t,T_2}$ used in ETN with roll horizon θ.

50 / 85

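Construction of ETNs

For $T_1 \le t + \theta \le T_2$ define

$$a(t) = \frac{T_2 - (t+\theta)}{r}$$

and notice

I $0 \le a(t) \le 1$

I $a(T_1 - \theta) = 1$ and $a(T_2 - \theta) = 0$

I linear in t.

Denote interpolation as

$$V_t = a(t)F^{T_1}_t + (1-a(t))F^{T_2}_t .$$

The change in $V_t$ is

$$dV_t = \left(a'(t)F^{T_1}_t - a'(t)F^{T_2}_t\right)dt + a(t)\,dF^{T_1}_t + (1-a(t))\,dF^{T_2}_t$$

$$\phantom{dV_t} = \frac{1}{\theta}\left(F^{T_2}_t - F^{T_1}_t\right)dt + a(t)\,dF^{T_1}_t + (1-a(t))\,dF^{T_2}_t ,$$

using $a'(t) = -1/r$ and $r \approx \theta$ when consecutive expirations are one month apart.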

Construction of ETNs

The value of the ETN, $E_t$, is given by the formula

$$\frac{dE_t}{E_t} = \frac{a(t)\,dF^{T_1}_t + (1-a(t))\,dF^{T_2}_t}{a(t)F^{T_1}_t + (1-a(t))F^{T_2}_t} + r\,dt$$

where r is the interest rate (not the roll period r = T2 − T1 used above).

This corresponds to the variation in a futures trading account in which an investor buys the front contract with 100% of his capital and then gradually buys the second contract and sells the first (“rolls”), buying calendar spreads on each date, so as to be fully invested in the first contract after the next expiration date, and so on.

52 / 85

The Roll Yield

The continuous-time analogue of the evolution of VXX (or any continuously rolled ETF/ETN) is therefore:

$$\frac{dE_t}{E_t} = \frac{dV^\theta_t}{V^\theta_t} - \frac{\partial\ln(V^\theta_t)}{\partial\theta}\,dt + r\,dt .$$

Note: The futures strategy implemented by the ETN issuer maintains the number of contracts constant in the roll, by transferring a number of contracts equal to

$$\frac{\text{Total number of contracts}}{\text{number of business days between expirations}}$$

from one maturity to the next one, by buying that number of calendar spreads daily.

53 / 85

Losing to Contango

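Since under a risk-neutral measure all self-financing futures strategies should have zero expected excess return, we should have

$$E_t\left[\frac{dV^\theta_t}{V^\theta_t} - \frac{\partial\ln(V^\theta_t)}{\partial\theta}\,dt\right] = 0 , \qquad (3)$$

so that the ETN earns the interest rate $r\,dt$ in expectation, where $E_t[*]$ is conditional expectation.

I This statement does not hold under the empirical measure, since we expect $V^\theta_t$ to be mean-reverting and $\frac{\partial\ln(V^\theta_t)}{\partial\theta}$ to be mostly positive,

I due to contango.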
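A toy calculation shows how the slope term erodes a long position. The sketch below freezes a hypothetical one-factor contango curve (so $dV^\theta_t = 0$), sets the interest rate to zero and compounds $dE_t/E_t = -\partial_\theta \ln V^\theta\,dt$ daily. The curve shape, its parameters and the function names `log_curve` and `etn_value` are assumptions chosen only to illustrate the mechanism.

```python
import math

def log_curve(tau, v_inf=21.0, c=0.7, kappa=0.7):
    """Hypothetical frozen contango: ln V^tau = ln v_inf - c * exp(-2*kappa*tau)."""
    return math.log(v_inf) - c * math.exp(-2.0 * kappa * tau)

def etn_value(theta, years, r=0.0, steps_per_year=252, h=1e-5):
    """Compound dE/E = (r - d ln V / d theta) dt while the curve stays frozen."""
    # central finite difference for the slope of the log-curve at maturity theta
    slope = (log_curve(theta + h) - log_curve(theta - h)) / (2.0 * h)
    dt = 1.0 / steps_per_year
    value = 1.0
    for _ in range(int(round(years * steps_per_year))):
        value *= 1.0 + (r - slope) * dt
    return value, slope

front, slope_front = etn_value(theta=1.0 / 12.0, years=1.0)   # 1-month roll
back, slope_back = etn_value(theta=5.0 / 12.0, years=1.0)     # 5-month roll
```

With these made-up numbers the front-end position loses more than the back-end one, in line with the earlier remark that long positions in short-term VX lose faster than long-term ones.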

54 / 85

Simulated ETNs

[Plots: “ETNs and VIX”, showing ETN Value for the 1 Month ETN and 7 Month ETN together with the VIX, and a second panel of the VIX; horizontal axes from 0 to 3.]

Figure : VXX and VXY simulated from fitted 2-factor model.

55 / 85

Simulated ETNs

[Plots: “Short 1m, Long 7m” ETN Value, and a second panel of the VIX; horizontal axes from 0 to 3.]

Figure : A simulated long-short in VXX and VXY.

56 / 85

2nd Summary

I Complacency and the risk of being too sure are of concern right now (April 2017)

I ETNs are a byproduct of successful bets on low vol (and in general the ETN craze)

I ETN losses are due to negative roll yield, which is similar to the insurance premium for low-strike SPX put options.

57 / 85

The Distribution of Fear

Figure : By buying and selling VIX options, traders contribute towards a collective distribution on the Fear Index.

58 / 85

VIX Tail Hedge (VXTH) Enter/Exit Strategy

A portfolio to hedge against tail events:

I 1% of portfolio weight in 30-∆ VIX call options if VIX in range of 15%-30%; rest in S&P500 stocks

I 1/2% of portfolio weight in 30-∆ VIX call options if VIX in range of 30%-50%; rest in S&P500 stocks

I 0% of portfolio weight in VIX options if VIX outside range of 15%-50%; all in S&P500 stocks

Big losses in the S&P500 coincide with spikes in VIX. VXTH provides positive cash flow on these days.

S&P 500 gained about 15% between 2006 and 2011; VXTH gained about 40%
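The allocation rule above can be written as a small lookup. The function name `vxth_call_weight` is hypothetical, and the treatment of the boundary values (a VIX of exactly 30 is assigned to the 1% bucket) is an assumption, since the stated ranges overlap at 30.

```python
def vxth_call_weight(vix):
    """Portfolio weight in 30-delta VIX calls as a function of the VIX level."""
    if 15.0 <= vix <= 30.0:
        return 0.01       # 1% in VIX calls
    if 30.0 < vix <= 50.0:
        return 0.005      # 1/2% in VIX calls
    return 0.0            # outside 15-50: no VIX options

def vxth_weights(vix):
    """Split between VIX calls and S&P 500 stocks; the two weights sum to one."""
    w_calls = vxth_call_weight(vix)
    return {"vix_calls": w_calls, "sp500": 1.0 - w_calls}

allocation = vxth_weights(22.0)
```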

59 / 85

VIX Tail Hedge (VXTH)

60 / 85

Non-Orthogonality of Markets

61 / 85

Time-Spread Portfolio [Papanicolaou, 2016]

A future on VIX squared: $V_{t,T} = E_t\mathrm{VIX}^2_T$

$$V_{t,T} = \frac{2e^{r(T+\tau-t)}}{\tau}\int_0^\infty\left(P(t,K,T+\tau) - P(t,Ke^{-r\tau},T)\right)\frac{dK}{K^2}$$

i.e. $V_{t,T}$ is not only an index, but an asset that can be held,

$$V_{t,T} = 2\int_0^\infty C_{vix}(t,K,T)\,dK ,$$

where $C_{vix}(t,K,T)$ denotes a VIX call option.

62 / 85

Prediction

Figure : The VIX premium swells when there is unknown.....

63 / 85

Past VIX Events

I the Russia crisis 8/98

I the dotcom collape 3/00

I market euphoria begins 1/06

I the credit crunch 8/07

I Lehman collapse 9/08

I Greece debt 5/10

I Eurozone debt/US downgrade 8/11

Brexit and Trump election didn’t have much effect.

Ambiguity and Overconfidence [Brenner et al., 2011].

What if France exits???

64 / 85

French Politics

It’s going to be how it was,..... which was pretty OK.

Or look out!! Will French exit have consequences???

65 / 85

I don’t get it??

I We have convinced ourselves that VIX is stationary

I ..... that the VX curve is stationary

I ..... and we even have a reasonable model

Then how is it that Marine Le Pen means all bets are off?

66 / 85

What is a Model for Risk?

I All this time researching and hedging,

I but then a jockey walks by at the betting window....

I Hey Sherlock!! Look up for just a second and see that this jockey is a juicer!!

67 / 85

How to Store?

Figure : Where can I purchase the Smell of Fear?

68 / 85

Fear in a Bottle

How can we buy tomorrow’s volatility?

I Think about cash-and-carry strategies for contango in storables (e.g. oil or gas).

I Already talked about complacency and unhedged shorting of long-dated VX.

I There is no carriable form of volatility.

I How can we implement a cash and carry?

It’s not clear how to buy tomorrow’s volatility.... today.

Suitable instruments for constructing future contracts:

I time-spread portfolio

I forward-start options

I compound options.

69 / 85

In Terms of Greeks

Let Θ denote option sensitivity to changes in time to maturity,

$$\Theta(t,K,T) = -\frac{\partial}{\partial T}P(t,K,T) .$$

Time-spread portfolio can be written as

$$V_{t,T} = -\frac{2}{\tau}\int_0^\infty\int_0^\tau\Theta(t,K,T+u)\,du\,\frac{dK}{K^2}$$

$$P(t,K,T+\tau) - P(t,K,T) = -\int_0^\tau\Theta(t,K,T+u)\,du .$$

VOLATILITY RISK QUANTIFIED AS EXPOSURE TO TIME

70 / 85

[Plot: option price against strike K (40 to 56), showing the intrinsic value and a shaded area whose height is $-\int_0^\tau\Theta(t,K,T+u)\,du = P(t,K,T+\tau) - P(t,K,T)$.]

71 / 85

Can Zero-Θ Instruments be Used to Construct Storage?

I We see the usefulness of instruments that do not decay due to passage of time,

I Time Spread portfolio is a contract on future volatility..... not a carry of today’s.

It seems really hard to carry fear.

An equivalent trade to cash-and-carry for VX has yet to be made.

72 / 85

The Thrill of Making Money

Figure : “The Snap”, the most iconic moment in surfing, Tom Carroll putting it all on the line @ the Banzai Pipeline, North Shore of Oahu ’91. The complacent line was to go straight, but Tom saw differently.

73 / 85

Thank You!

74 / 85

Generalized OU

We use a factor model where the factors are given by an $\mathbb{R}^d$-valued process $X_t$ that is mean reverting and in its stationary state,

$$X_t = \int_{-\infty}^{t} e^{-\kappa(t-s)}\,dL_s ,$$

where $\kappa > 0$ is a positive definite matrix with

$$\|\kappa^{-1}\| = \text{decorrelation time},$$

and $L_t$ is an $\mathbb{R}^d$-valued stable Lévy process having triple $(a, \sigma^2, \nu)$ and Lévy-Khintchine representation (see Chapter 1.2 in [Applebaum, 2004])

75 / 85

Generalized OU

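$$\log E e^{i\langle u,L_1\rangle} = i\langle u,a\rangle - \frac{\|\sigma u\|^2}{2} + \int_{\mathbb{R}^d\setminus\{0\}}\left(e^{i\langle u,x\rangle} - 1 - i\langle u,x\rangle 1_{|x|<1}\right)\nu(dx) ,$$

where σ is an invertible volatility matrix for a diffusion component, and $\nu(dx)$ is an intensity measure with $\int_{\mathbb{R}^d\setminus\{0\}}(1\wedge|x|^2)\,\nu(dx) < \infty$.

Using the model from [Bergomi, 2005, Bergomi, 2008, Ould Aly, 2014], the term structure is

$$V^\tau_t = V_\infty\exp\left(e^{-\kappa\tau}X_t - \frac{1}{2}E[[e^{-\kappa\tau}X_t]]\right) , \qquad (4)$$

where [[ · ]] denotes quadratic (cross) variation of a vector-valued process.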

76 / 85

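Generalized OU

Assuming the moment generating function (MGF) exists,

$$\Lambda_1(u) = \log E e^{uL_1} < \infty ,$$

for $0 \le u < K < \infty$. If $E\log(1+|L_1|) < \infty$ then

$$\Lambda_X(u) = \log E e^{uX_t} = \frac{1}{\kappa}\int_0^u \frac{dz}{z}\,\Lambda_1(z) ,$$

for all $u \in [0,K)$. Now we can construct the VIX futures curve,

$$V^\tau_t = V_\infty\exp\left(e^{-\kappa\tau}X_t - \Lambda_X(e^{-\kappa\tau})\right) ,$$

so that

$$\mathrm{VIX}_t = V_\infty e^{X_t-\Lambda_X(1)} ,$$

and the relation

$$V^\tau_t = V_\infty\left(\frac{\mathrm{VIX}_t}{V_\infty}\,e^{\Lambda_X(1)}\right)^{e^{-\kappa\tau}}\times\varphi(\tau) \qquad (5)$$

$$\varphi(\tau) = e^{-\Lambda_X(e^{-\kappa\tau})} . \qquad (6)$$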

77 / 85

Generalized OU

The yield is

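$$\log\left(\frac{V^\tau_t}{\mathrm{VIX}_t}\right) = (1-e^{-\kappa\tau})\log\left(\frac{V_\infty}{\mathrm{VIX}_t}\right) + e^{-\kappa\tau}\Lambda_X(1) - \Lambda_X(e^{-\kappa\tau}) . \qquad (7)$$

Now let $m(x)$ denote the density of $X_t$’s distribution. The Fourier transform of $m$ is

$$\hat{m}(q) = \int e^{-ixq}m(x)\,dx = e^{\Lambda_X(-iq)} ,$$

and so the density is given by

$$m(x) = \frac{1}{2\pi}\int e^{ixu+\Lambda_X(-iu)}\,du .$$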

78 / 85

Generalized OU

The mode (for a unimodal distribution) is $x^*$ such that

$$m'(x^*) = \frac{i}{2\pi}\int u\,e^{ix^*u+\Lambda_X(-iu)}\,du = 0 ,$$

and the most-likely value of VIX in the dull-state is

$$\mathrm{mode}(\mathrm{VIX}_t) = V_\infty e^{x^*-\Lambda_X(1)} .$$

79 / 85

The Double Nelson Model [Bayer et al., 2013]

Take $\mathrm{VIX}_t = X^1_t + X^2_t$ where

$$dX^1_t = \kappa_1(\mu_1 - X^1_t)\,dt + \sigma_1 X^1_t\,dW^1_t$$

$$dX^2_t = \kappa_2(\mu_2 - X^2_t)\,dt + \sigma_2 X^2_t\,dW^2_t .$$

Has heavy tails.

80 / 85

Kalman Filtering

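$$\hat{X}^\theta_i = E^\theta[X_{t_i}\,|\,Y_{0:i}] ,$$

$$\Omega^\theta = E^\theta(X_{t_i}-\hat{X}^\theta_i)(X_{t_i}-\hat{X}^\theta_i)^{tr} ,$$

and which are given by the Kalman filter,

$$\hat{X}^\theta_{i+1} = A^\theta\hat{X}^\theta_i + \mu^\theta + K^\theta\left(Y_{i+1} - H^\theta A^\theta\hat{X}^\theta_i - G^\theta\right) , \qquad (8)$$

where

$$\Omega^\theta = (I - K^\theta H^\theta)\left(H^\theta\Omega^\theta(H^\theta)^{tr} + R\right)^{-1} \qquad (9)$$

$$K^\theta = \Omega^\theta(H^\theta)^{tr}\left(H^\theta\Omega^\theta(H^\theta)^{tr} + R\right)^{-1} \qquad (10)$$

$$\Omega^\theta = A^\theta(A^\theta)^{tr} - A^\theta\Omega^\theta(H^\theta)^{tr}\left(H^\theta\Omega^\theta(H^\theta)^{tr} + R\right)^{-1}H^\theta\Omega^\theta(A^\theta)^{tr} + Q^\theta . \qquad (11)$$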

(11)

81 / 85

Innovations Process

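We denote innovation process as

$$\nu^\theta_i = Y_i - H^\theta A^\theta\hat{X}^\theta_{i-1} - G^\theta ,$$

which is an iid normal random variable under the null hypothesis that θ is the true parameter value,

$$\nu^\theta_i \sim \text{iid}\ N\left(0,\ H^\theta A^\theta\Omega^\theta(H^\theta A^\theta)^{tr} + H^\theta Q^\theta(H^\theta)^{tr} + R\right) .$$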
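For readers who want to experiment, here is a minimal sketch of a Kalman filter for the linear-Gaussian model (1)-(2). It uses the standard textbook predict/update recursion with a time-varying covariance, which is not necessarily the steady-state form written on the slides, and it runs on simulated data. The function name `kalman_filter` and all numerical values are assumptions.

```python
import numpy as np

def kalman_filter(Y, H, G, A, mu, Q, R, x0, P0):
    """Filtered means and innovations for Y[i] = H x_i + G + eps_i."""
    x, P = np.asarray(x0, dtype=float), np.asarray(P0, dtype=float)
    means, innovations = [], []
    for y in Y:
        # predict one step ahead
        x_pred = A @ x + mu
        P_pred = A @ P @ A.T + Q
        # update with the new observation
        nu = y - H @ x_pred - G                       # innovation
        S = H @ P_pred @ H.T + R                      # innovation covariance
        K = np.linalg.solve(S, H @ P_pred).T          # gain P_pred H' S^{-1}
        x = x_pred + K @ nu
        P = (np.eye(len(x)) - K @ H) @ P_pred
        means.append(x)
        innovations.append(nu)
    return np.array(means), np.array(innovations)

# Simulated two-factor example with made-up parameters (not the fitted ones).
rng = np.random.default_rng(2)
taus = np.arange(8) * 30.0 / 365.0
H = np.exp(-np.outer(taus, [0.7, 24.0]))
G = np.zeros(8)
A, mu = np.diag([0.995, 0.93]), np.zeros(2)
Q, R = np.diag([0.0064, 0.0121]), 0.02**2 * np.eye(8)
X = np.zeros((300, 2))
for i in range(1, 300):
    X[i] = A @ X[i - 1] + mu + rng.multivariate_normal(np.zeros(2), Q)
Y = X @ H.T + G + 0.02 * rng.standard_normal((300, 8))

X_hat, nu = kalman_filter(Y, H, G, A, mu, Q, R, x0=np.zeros(2), P0=np.eye(2))
rmse = float(np.sqrt(np.mean((X_hat - X) ** 2)))
```

A histogram of the columns of `nu` gives the kind of normality check described for the innovations.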

82 / 85

Maximum Likelihood Estimation (MLE)

Hence there is the log-likelihood function,

$$L(Y_{1:N}\,|\,\theta,R) = \sum_{i=1}^{N}\left(-\frac{1}{2}\left\|\left(H^\theta A^\theta\Omega^\theta(H^\theta A^\theta)^{tr} + H^\theta Q^\theta(H^\theta)^{tr} + R\right)^{-1/2}\nu^\theta_i\right\|^2 - \frac{1}{2}\ln\left|H^\theta A^\theta\Omega^\theta(H^\theta A^\theta)^{tr} + H^\theta Q^\theta(H^\theta)^{tr} + R\right|\right) ,$$

where $|\cdot|$ denotes matrix determinant. The maximum likelihood estimate (MLE) is

$$(\hat{\theta}, \hat{R})^{mle} = \arg\max_{\theta,R}\ L(Y_{1:N}\,|\,\theta,R) .$$

In practice filtering makes it difficult to implement code for finding an MLE, so instead there are iterated algorithms such as expectation maximization (EM) that, while suboptimal, will converge to a reasonable parameter estimate.

83 / 85

Iterative Scheme

Break the parameter space into the risk neutral and real-world parameters,

θ = (θ1, θ2)

where

θ1 = (V∞, κ1, κ2, σ1, σ2, ρ)

θ2 = (κp1 , κp2 , µ1, µ2) .

The following is an iteration method that works:

84 / 85

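Algorithm (Parameter Estimation)

Initialize with parameter estimates $\theta^{(0)} = (\theta^{(0)}_1, \theta^{(0)}_2)$ and $R^{(0)}$.

1. Compute Kalman Filter using $\theta^{(0)}$ and $R^{(0)}$, and re-estimate $\theta_1$,

$$\hat{\theta}_1 = \arg\min_{\theta_1}\sum_{i=1}^{N}\left\|Y_i - H^{\theta^{(0)}}\hat{X}^{\theta^{(0)}}_i - G^\theta\right\|^2 ,$$

and replace $\theta^{(0)}_1$ with $\hat{\theta}_1$;

2. Re-estimate $\theta_2$ and matrix $R$ with least-squares estimators,

$$\hat{X}^{\theta,lsq}_i = \left((H^\theta)^{tr}H^\theta\right)^{-1}(H^\theta)^{tr}(Y_i - G^\theta) ,$$

residuals $\eta^\theta_i = Y_i - H^\theta\hat{X}^{\theta,lsq}_i$ with $\bar{\eta}^\theta = \frac{1}{N}\sum_i \eta^\theta_i$ and covariance

$$\hat{\theta}_2 = \arg\min_{\theta_2}\sum_{i=1}^{N}\left\|(Q^\theta)^{-1/2}\left(\hat{X}^{lsq}_{i+1} - A^\theta\hat{X}^{lsq}_i - \mu^\theta\right)\right\|^2 ,$$

$$\hat{R} = \frac{1}{N}\sum_{i=1}^{N}\left(\eta^\theta_i - \bar{\eta}^\theta\right)\left(\eta^\theta_i - \bar{\eta}^\theta\right)^{tr} .$$

Replace $\theta^{(0)}_2$ with $\hat{\theta}_2$, replace $R^{(0)}$ with $\hat{R}$, and repeat from step #1.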

85 / 85

Applebaum, D. (2004). Lévy Processes and Stochastic Calculus. Cambridge University Press, Cambridge UK.

Bayer, C., Gatheral, J., and Karlsmark, M. (2013). Fast Ninomiya-Victoir calibration of the double-mean-reverting model. Quantitative Finance, 13(11):1813–1829.

Bergomi, L. (2005). Smile dynamics II. Risk, pages 67–73.

Bergomi, L. (2008). Smile dynamics III. Risk, pages 90–96.

Brenner, M., Izhakian, Y., and Sade, O. (2011). Ambiguity and overconfidence. SSRN 2284652.

Gatheral, J., Jaisson, T., and Rosenbaum, M. (2014). Volatility is rough. Available at SSRN 2509457.

Ould Aly, S. M. (2014). Forward variance dynamics: Bergomi’s model revisited. Applied Mathematical Finance, 21(1):84–107.

Papanicolaou, A. (2016). Analysis of VIX markets with a time-spread portfolio. Applied Mathematical Finance, 23(5):374–408.
