Upload
others
View
4
Download
0
Embed Size (px)
Citation preview
Factor Investing using Penalized PrincipalComponents
Markus Pelger 1 Martin Lettau 2
1Stanford University
2UC Berkeley
February 8th 2018
AI in Fintech Forum 2018
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Motivation
Motivation: Asset Pricing with Risk Factors
The Challenge of Asset Pricing
Most important question in finance: Why are prices different fordifferent assets?
Fundamental insight: Arbitrage Pricing Theory: Prices of financialassets should be explained by systematic risk factors.
Problem: “Chaos” in asset pricing factors: Over 300 potential assetpricing factors published!
Fundamental question: Which factors are really important inexplaining expected returns? Which are subsumed by others?
Goals of this paper:
Bring order into “factor chaos”
⇒ Summarize the pricing information of a large number of assets witha small number of factors
1
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Motivation
Why is it important?
Importance of factors for investing
1 Optimal portfolio construction
Only factors are compensated for systematic riskOptimal portfolio with highest Sharpe-ratio must be based onfactor portfolios(Sharpe-ratio=expected excess return/standard deviation)“Smart beta” investments = exposure to risk factors
2 Arbitrage opportunities
Find underpriced assets and earn “alpha”
3 Risk management
Factors explain risk-return trade-offFactors allow to manage systematic risk exposure
2
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Motivation
Contribution of this paper
Contribution
This Paper: Estimation approach for finding risk factors andoptimal factor investing
Key elements of estimator:
1 Statistical factors instead of pre-specified (and potentiallymiss-specified) factors
2 Uses information from large panel data sets: Many assets withmany time observations
3 Searches for factors explaining asset prices (explain differencesin expected returns) not only co-movement in the data
4 Allows time-variation in factor structure
3
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Motivation
Contribution of this paper
Results
Asymptotic distribution theory for weak and strong factors
⇒ No “blackbox approach”
Estimator discovers “weak” factors with high Sharpe-ratios
⇒ high Sharpe-ratio factors important for asset pricing andinvestment
Estimator strongly dominates conventional approach (PrincipalComponent Analysis (PCA))
⇒ PCA does not find all high Sharpe-ratio factors
Empirical results:
Investment: 3 times higher Sharpe-ratio then benchmarkfactors (PCA)Asset Pricing: Smaller pricing errors in- and out-of samplethan benchmark (PCA, 5 Fama-French factors, etc.) 4
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Model
The Model
Approximate Factor Model
Observe excess returns of N assets over T time periods:
Xt,i = Ft1×K
>︸ ︷︷ ︸factors
ΛiK×1︸︷︷︸
loadings
+ et,i︸︷︷︸idiosyncratic
i = 1, ...,N t = 1, ...,T
Matrix notation
X︸︷︷︸T×N
= F︸︷︷︸T×K
Λ>︸︷︷︸K×N
+ e︸︷︷︸T×N
N assets (large)T time-series observation (large)K systematic factors (fixed)
F , Λ and e are unknown5
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Model
The Model
Approximate Factor Model
Systematic and non-systematic risk (F and e uncorrelated):
Var(X ) = ΛVar(F )Λ>︸ ︷︷ ︸systematic
+ Var(e)︸ ︷︷ ︸non−systematic
⇒ Systematic factors should explain a large portion of thevariance
⇒ Idiosyncratic risk can be weakly correlated
Arbitrage-Pricing Theory (APT): The expected excess return isexplained by the risk-premium of the factors:
E [Xi ] = E [F ]Λ>i
⇒ Systematic factors should explain the cross-section of expectedreturns
6
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Model
The Model: Estimation of Latent Factors
Conventional approach: PCA (Principal component analysis)
Apply PCA to the sample covariance matrix:
1
TX>X − X X>
with X = sample mean of asset excess returns
Eigenvectors of largest eigenvalues estimate loadings Λ.
Much better approach: Risk-Premium PCA (RP-PCA)
Apply PCA to a covariance matrix with overweighted mean
1
TX>X + γX X> γ = risk-premium weight
Eigenvectors of largest eigenvalues estimate loadings Λ.
F estimator for factors: F = 1NX Λ = X (Λ>Λ)−1Λ>. 7
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Model
The Model: Objective Function
Conventional PCA: Objective Function
Minimize the unexplained variance:
minΛ,F
1
NT
N∑i=1
T∑t=1
(Xti − FtΛ>i )2
RP-PCA (Risk-Premium PCA): Objective Function
Minimize jointly the unexplained variance and pricing error
minΛ,F
1
NT
N∑i=1
T∑t=1
(Xti − FtΛ>i )2
︸ ︷︷ ︸unexplained variation
+γ1
N
N∑i=1
(Xi − FΛ>i
)2
︸ ︷︷ ︸pricing error
with Xi = 1T
∑Tt=1 Xt,i and F = 1
T
∑Tt=1 Ft and risk-premium weight γ
8
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Model
The Model: Interpretation
Interpretation of Risk-Premium-PCA (RP-PCA):
1 Combines variation and pricing error criterion functions:
Select factors with small cross-sectional pricing errors (alpha’s).Protects against spurious factor with vanishing loadings as itrequires the time-series errors to be small as well.
2 Penalized PCA: Search for factors explaining the time-series butpenalizes low Sharpe-ratios.
3 Information interpretation:
PCA of a covariance matrix uses only the second moment butignores first momentUsing more information leads to more efficient estimates.RP-PCA combines first and second moments efficiently.
9
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Model
The Model: Interpretation
Interpretation of Risk-Premium-PCA (RP-PCA): continued
4 Signal-strengthening: Intuitively the matrix 1T X>X + γX X>
converges to
Λ(ΣF + (1 + γ)µFµ
>F
)Λ> + Var(e)
with ΣF = Var(F ) and µF = E [F ]. The signal of weak factors witha small variance can be “pushed up” by their mean with the right γ.
10
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Illustration
Illustration (Size and accrual)
Illustration: Anomaly-sorted portfolios (Size and accrual)
Factors
1 PCA: Estimation based on PCA of correlation matrix, K = 32 RP-PCA: K = 3 and γ = 1003 FF-long/short: market, size and accrual (based on extreme
quantiles, same construction as Fama-French factors)
Data
Double-sorted portfolios according to size and accrual (fromKenneth French’s website)Monthly return data from 07/1963 to 05/2017(T = 647) for N = 25 portfoliosOut-of-sample: Rolling window of 20 years (T=240)
Optimal factor investing: maximum Sharpe-ratio portfolio
Ropt = F · Σ−1F µF 11
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Illustration
Illustration (Size and accrual)
In-sample Out-of-sampleSR RMS α Idio. Var. SR RMS α Idio. Var.
RP-PCA 0.33 0.06 2.11 0.23 0.08 2.19PCA 0.13 0.14 1.97 0.11 0.14 2.04
FF-long/short 0.21 0.12 2.63 0.11 0.12 2.23
Table: Maximal Sharpe-ratios, root-mean-squared pricing errors andunexplained idiosyncratic variation. K = 3 factors and γ = 100.
SR: Maximum Sharpe-ratio of linear combination of factors
Cross-sectional pricing errors α:
Pricing error αi = E [Xi ]− E [F ]Λ>i
RMS α: Root-mean-squared pricing errors√
1N
∑Ni=1 αi
2
Idiosyncratic Variation: 1NT
∑Ni=1
∑Tt=1(Xt,i − F>t Λi )
2
⇒ RP-PCA significantly better than PCA and quantile-sorted factors. 12
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Illustration
Loadings for statistical factors (Size and Accrual)
0 5 10 15 20 25
Portfolio
-0.5
0
0.5
Loadin
gs
1. PCA factor
0 5 10 15 20 25
Portfolio
-0.5
0
0.5
Loadin
gs
2. PCA factor
0 5 10 15 20 25
Portfolio
-0.5
0
0.5
Loadin
gs
3. PCA factor
0 5 10 15 20 25
Portfolio
-0.5
0
0.5
Loadin
gs
1. RP-PCA factor
0 5 10 15 20 25
Portfolio
-0.5
0
0.5
Loadin
gs
2. RP-PCA factor
0 5 10 15 20 25
Portfolio
-0.5
0
0.5
Loadin
gs
3. RP-PCA factor
⇒ RP-PCA detects accrual factor while 3rd PCA factor is noise.
13
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Illustration
Maximal Sharpe ratio (Size and accrual)
SR (In-sample)
1 factor 2 factors 3 factors0
0.05
0.1
0.15
0.2
0.25
0.3
0.35 =-1=0=1=10=50=100
SR (Out-of-sample)
1 factor 2 factors 3 factors0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
Figure: Maximal Sharpe-ratio by adding factors incrementally.
⇒ 1st and 2nd PCA and RP-PCA factors the same.
⇒ RP-PCA detects 3rd factor (accrual) for γ > 10.14
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Weak vs. Strong Factors
The Model
Strong vs. weak factor models
Strong factor model ( 1N Λ>Λ bounded)
Interpretation: strong factors affect most assets (proportionalto N), e.g. market factorStrong factors lead to exploding eigenvalues
⇒ RP-PCA always more efficient than PCA⇒ optimal γ relatively small
Weak factor model (Λ>Λ bounded)
Interpretation: weak factors affect a smaller fraction of assetsWeak factors lead to large but bounded eigenvalues
⇒ RP-PCA detects weak factors which cannot be detected byPCA
⇒ optimal γ relatively large
In Data: Typically 1-2 strong factors and several weak factors withhigh Sharpe-ratios 15
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Weak Factor Model
Weak Factor Model
Weak Factor Model
Weak factors have small variance or affect small fraction of assets
Λ>Λ bounded (after normalizing factor variances)
Statistical model: Spiked covariance from random matrix theory
Eigenvalues of sample covariance matrix separate into two areas:
The bulk, majority of eigenvaluesThe extremes, a few large outliers
Bulk spectrum converges to generalized Marchenko-Pastur distrib.
Large eigenvalues converge either to
A biased value (characterized by the Stieltjes transform)To the bulk of the spectrum if the true eigenvalue is too small
⇒ Phase transition phenomena: estimated eigenvectorsorthogonal to true eigenvectors if eigenvalues too small
16
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Weak Factor Model
Weak Factor Model
Intuition: Weak Factor Model
“Signal” matrix for PCA of covariance matrix:
ΛΣFΛ>
K largest eigenvalues θPCA1 , ..., θPCAK measure strength of signal
“Signal” matrix for RP-PCA:
Λ(ΣF + (1 + γ)µFµ
>F
)Λ>
K largest eigenvalues θRP-PCA1 , ..., θRP-PCA
K measure strength of signal
If µF 6= 0 and γ > −1 then RP-PCA signal always larger than PCAsignal:
θRP-PCAi > θPCAi
17
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Weak Factor Model
Weak Factor Model
Theorem 1: Risk-Premium PCA under weak factor model
The correlation of the estimated with the true factors is
Corr(F , F )p→ U︸︷︷︸
rotation
ρ1 0 · · · 00 ρ2 · · · 0
0 0. . .
...0 · · · 0 ρK
V︸︷︷︸rotation
with
ρ2i
p→ 1
1+θiB(θi ))if θi > θcrit
0 otherwise
Critical value θcrit and function B(.) depend only on the noise distributionand are known in closed-form
Based on closed-form expression choose optimal RP-weight γ
For θi > θcrit ρ2i is strictly increasing in θi .
⇒ RP-PCA strictly dominates PCA18
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Strong Factor Model
Strong Factor Model
Strong Factor Model
Strong factors affect most assets: e.g. market factor
1N Λ>Λ bounded (after normalizing factor variances)
Factors and loadings can be estimated consistently and areasymptotically normal distributed
Assumptions more general than in weak factor model
Asymptotic Efficiency
Choose RP-weight γ to obtain smallest asymptotic variance of estimators
RP-PCA (i.e. γ > −1) always more efficient than PCA
In “most cases” optimal γ = 0 (smaller than in weak factor model)
RP-PCA and PCA are both consistent
19
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Time-varying loadings
Time-varying loadings
Model with time-varying loadings
Observe panel of excess returns and L covariates Zi,t−1,l :
Xt,i = Ft1×K
> gK×1
(Zi,t−1,1, ...,Zi,t−1,L) + et,i
Loadings are function of L covariates Zi,t−1,l with l = 1, ..., Le.g. characteristics like size, book-to-market ratio, past returns,...
Factors and loading function are latent
Idea: Approximate non-linear function g by basis functions
⇒ Project return data on appropriate basis functions⇒ Equivalent to apply RP-PCA to portfolio data
Special case: Characteristics sorted portfolios corresponds tokernel basis functionsGeneral case: Obtain arbitrary interactions and break curse ofdimensionality by conditional tree sorting projection 20
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Empirical Results
Single-sorted portfolios
Portfolio Data
Monthly return data from 07/1963 to 12/2016 (T = 638) forN = 370 portfolios
Kozak, Nagel and Santosh (2017) data: 370 decile portfolios sortedaccording to 37 anomalies
Factors:
1 RP-PCA: K = 6 and γ = 100.2 PCA: K = 63 Fama-French 5: The five factor model of Fama-French
(market, size, value, investment and operating profitability, allfrom Kenneth French’s website).
4 Proxy factors: RP-PCA and PCA factors approximated with5% of largest position.
21
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Empirical Results
Single-sorted portfolios
In-sample Out-of-sampleSR RMS α Idio. Var. SR RMS α Idio. Var.
RP-PCA 0.66 0.15 2.73 0.53 0.11 3.19PCA 0.28 0.15 2.70 0.22 0.14 3.19
Fama-French 5 0.32 0.23 4.97 0.31 0.21 4.62
Table: Deciles of 37 single-sorted portfolios from 07/1963 to 12/2016(N = 370 and T = 638): Maximal Sharpe-ratios, root-mean-squaredpricing errors and unexplained idiosyncratic variation. K = 6 statisticalfactors.
RP-PCA strongly dominates PCA and Fama-French 5 factors
Results hold out-of-sample.
22
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Empirical Results
Single-sorted portfolios: Maximal Sharpe-ratio
SR (In-sample)
RP-PCA
RP-PCA ProxyPCA
PCA Proxy0
0.1
0.2
0.3
0.4
0.5
0.6
0.73 factors4 factors5 factors6 factors7 factors
SR (Out-of-sample)
RP-PCA
RP-PCA ProxyPCA
PCA Proxy0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
Figure: Maximal Sharpe-ratios.
⇒ Spike in Sharpe-ratio for 6 factors
23
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Empirical Results
Single-sorted portfolios: Pricing error
RMS (In-sample)
RP-PCA
RP-PCA ProxyPCA
PCA Proxy0
0.05
0.1
0.15
0.2
0.25
3 factors4 factors5 factors6 factors7 factors
RMS (Out-of-sample)
RP-PCA
RP-PCA ProxyPCA
PCA Proxy0
0.05
0.1
0.15
0.2
0.25
Figure: Root-mean-squared pricing errors.
⇒ RP-PCA has smaller out-of-sample pricing errors
24
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Empirical Results
Single-sorted portfolios: Idiosyncratic Variation
Idiosyncratic Variation (In-sample)
RP-PCA
RP-PCA ProxyPCA
PCA Proxy0
1
2
3
4
5
3 factors4 factors5 factors6 factors7 factors
Idiosyncratic Variation (Out-of-sample)
RP-PCA
RP-PCA ProxyPCA
PCA Proxy0
1
2
3
4
5
Figure: Unexplained idiosyncratic variation.
⇒ Unexplained variation similar for RP-PCA and PCA
25
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Empirical Results
Single-sorted portfolios: Maximal Sharpe-ratio
SR (In-sample)
1 factor
2 factors
3 factors
4 factors
5 factors
6 factors
7 factors0
0.1
0.2
0.3
0.4
0.5
0.6
0.7 =-1=0=1=10=50=100
SR (Out-of-sample)
1 factor
2 factors
3 factors
4 factors
5 factors
6 factors
7 factors0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
Figure: Maximal Sharpe-ratios for different RP-weights γ and number offactors K
⇒ Strong increase in Sharpe-ratios for γ ≥ 10.26
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Empirical Results
Optimal Portfolio with RP-PCA
Composition of Stochast Discount Factor (RP-PCA)
Indus
try M
om. R
ev.
Ind. R
el. Rev
. (L.V.)
Value-M
om-Prof
.
Indus
try Rel.
Revers
als
Value-P
rofita
bility
Seaso
nality
Value (
M)
Gross P
rofita
bility
Momen
tum (1
2m)
Idios
yncra
tic Vola
tility
Long
Run Rev
ersals
Asset T
urnov
er
Momen
tum-R
evers
als
Sales/P
rice
Return
on Asse
ts (A)
Inves
tmen
t/Asse
ts
Value (
A)
Return
on Boo
k Equ
ity (A)
Gross M
argins
Compo
site Is
suan
ce
Earning
s/Pric
eSize
Momen
tum (6
m)
Accrua
l
Cash F
lows/P
rice
Short-T
erm Rev
ersals
Value-M
omen
tum
Inves
tmen
t Grow
th
Indus
try M
omen
tum
Asset G
rowth
Dividen
d/Pric
e
Share
Volume
Net Ope
rating
Assets
Price
Inves
tmen
t/Cap
ital
Leve
rage
Sales G
rowth
-0.3
-0.2
-0.1
0
0.1
0.2
0.3High DecileLow Decile
Figure: Portfolio composition of highest Sharpe ratio portfolio(Stochastic Discount Factor) based on 6 RP-PCA factors.
27
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Empirical Results
Optimal Portfolio with RP-PCA (largest positions)
Composition of Stochast Discount Factor (RP-PCA)
Indu
stry M
om. R
ev.
Ind.
Rel.
Rev
. (L.
V.)
Value-
Mom
-Pro
f.
Indu
stry R
el. R
ever
sals
Value-
Profita
bility
Seaso
nality
Value
(M)
Gross
Pro
fitabil
ity
Mom
entu
m (1
2m)
Idios
yncr
atic
Volatili
ty
Long
Run
Rev
ersa
ls
Asset
Tur
nove
r
Mom
entu
m-R
ever
sals
Sales/P
rice
Retur
n on
Ass
ets (
A)
-0.3
-0.2
-0.1
0
0.1
0.2
0.3 High DecileLow Decile
Figure: Largest portfolios in highest Sharpe ratio portfolio (StochasticDiscount Factor) based on 6 RP-PCA factors. 28
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Empirical Results
Optimal Portfolio with PCA
Composition of Stochast Discount Factor (PCA)
Value-P
rofita
bility
Asset T
urnov
er
Sales/P
rice
Leve
rage
Value-M
om-Prof
.
Gross P
rofita
bility
Indus
try M
om. R
ev.
Earning
s/Pric
eSize
Idios
yncra
tic Vola
tility
Long
Run Rev
ersals
Ind. R
el. Rev
. (L.V.)
Return
on Boo
k Equ
ity (A)
Dividen
d/Pric
e
Cash F
lows/P
rice
Value-M
omen
tum
Value (
A)
Return
on Asse
ts (A)
Indus
try M
omen
tum
Momen
tum (6
m)
Momen
tum (1
2m)
Compo
site Is
suan
ce
Momen
tum-R
evers
als
Inves
tmen
t/Asse
ts
Share
Volume
Short-T
erm Rev
ersals
Asset G
rowthPric
e
Seaso
nality
Gross M
argins
Inves
tmen
t/Cap
ital
Inves
tmen
t Grow
th
Value (
M)
Indus
try Rel.
Revers
als
Sales G
rowth
Accrua
l
Net Ope
rating
Assets
-0.3
-0.2
-0.1
0
0.1
0.2
0.3High DecileLow Decile
Figure: Portfolio composition of highest Sharpe ratio portfolio(Stochastic Discount Factor) based on 6 PCA factors. 29
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Empirical Results
Optimal Portfolio with PCA (largest positions)
Composition of Stochast Discount Factor (PCA)
Value-
Profita
bility
Asset
Tur
nove
r
Sales/P
rice
Leve
rage
Value-
Mom
-Pro
f.
Gross
Pro
fitabil
ity
Indu
stry M
om. R
ev.
Earnin
gs/P
rice
Size
Idios
yncr
atic
Volatili
ty
Long
Run
Rev
ersa
ls
Ind.
Rel.
Rev
. (L.
V.)
Retur
n on
Boo
k Equ
ity (A
)
Divide
nd/P
rice
Cash
Flows/P
rice
-0.3
-0.2
-0.1
0
0.1
0.2
0.3 High DecileLow Decile
Figure: Largest portfolios in highest Sharpe ratio portfolio (StochasticDiscount Factor) based on 6 PCA factors. 30
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Empirical Results
Interpreting factors: Generalized correlations with proxies
RP-PCA PCA
1. Gen. Corr. 1.00 1.002. Gen. Corr. 1.00 1.003. Gen. Corr. 0.99 0.994. Gen. Corr. 0.98 0.995. Gen. Corr. 0.92 0.946. Gen. Corr. 0.78 0.89
Table: Generalized correlations of statistical factors with proxy factors(portfolios of 5% of assets).
Problem in interpreting factors: Factors only identified up toinvertible linear transformations.
Generalized correlations close to 1 measure of how many factors twosets have in common.
⇒ Proxy factors approximate statistical factors. 31
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Empirical Results
Interpreting factors: 6th proxy factor
6. Proxy RP-PCA Weights 6. Proxy PCA Weights
Momentum (6m) 1 0.28 Leverage 10 0.33Momentum (6m) 2 0.25 Asset Turnover 10 0.25Value (M) 10 0.25 Value-Profitability 10 0.25Value-Momentum 1 0.23 Profitability 10 0.22Industry Momentum 1 0.20 Asset Turnover 9 0.22Industry Reversals 9 0.19 Sales/Price 10 0.20Industry Momentum 2 0.19 Sales/Price 9 0.18Momentum (6m) 3 0.18 Size 10 0.17Idiosyncratic Volatility 2 -0.18 Value-Momentum-Profitability 1 -0.19Industry Mom. Reversals -0.18 Profitability 2 -0.19Value-Momentum 8 -0.20 Value-Profitability 1 -0.20Momentum (6m) 10 -0.21 Profitability 4 -0.20Value-Momentum 9 -0.23 Value-Profitability 2 -0.20Value-Momentum 10 -0.23 Profitability 1 -0.23Short-Term Reversals 1 -0.24 Idiosyncratic Volatility 1 -0.24Industry-Momentum 10 -0.24 Profitability 3 -0.25Industry Rel. Reversals 1 -0.28 Asset Turnover 2 -0.28Idiosyncratic Volatility 1 -0.38 Asset Turnover 1 -0.35
32
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Empirical Results
Time-stability of loadings
Figure: Time-varying rotated loadings for the first six factors. Loadingsare estimated on a rolling window with 240 months. 33
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Empirical Results
Time-stability of loadings of individual stocks
SR (In-sample)
RP-PCA PCA0
0.2
0.4
0.61 factor2 factors3 factors4 factors5 factors6 factors
SR (Out-of-sample)
RP-PCA PCA0
0.2
0.4
0.6
RMS (In-sample)
RP-PCA PCA0
0.1
0.2
0.3
0.4
RMS (Out-of-sample)
RP-PCA PCA0
0.1
0.2
0.3
0.4
Idiosyncratic Variation (In-sample)
RP-PCA PCA0
20
40
60
Idiosyncratic Variation (Out-of-sample)
RP-PCA PCA0
20
40
60
Figure 19: Stock price data from 01/1972 to 12/2016 (N = 270 and T = 500): MaximalSharpe-ratios, root-mean-squared pricing errors and unexplained idiosyncratic variationfor di↵erent number of factors. RP-weight = 10.
25
Figure: Stock price data (N = 270 and T = 500): Maximal Sharpe-ratiosfor different number of factors. RP-weight γ = 10.
Stock price data from 01/1972 to 12/2016 (N = 270 and T = 500)
⇒ Out-of-sample performance collapses
⇒ Constant loading model inappropriate34
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Empirical Results
Time-stability of loadings of individual stocks
Figure: Stock price data: Generalized correlations between loadingsestimated on the whole time horizon and a rolling window 35
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Conclusion
Conclusion
Methodology
Estimator for estimating priced latent factors from large data sets
Combines variation and pricing criterion function
Asymptotic theory under weak and strong factor model assumption
Detects weak factors with high Sharpe-ratio
More efficient than conventional PCA
Empirical Results
Strongly dominates PCA of the covariance matrix.
Potential to provide benchmark factors for horse races.
RP-PCA factor investing outperforms benchmark models
36
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Time-stability: Generalized correlations
0 50 100 150 200 250 300 350 400
Year
0
0.2
0.4
0.6
0.8
1G
ener
aliz
ed C
orre
latio
nRP-PCA (total vs. time-varying)
1st GC2nd GC3rd GC4th GC5th GC6th GC
0 50 100 150 200 250 300 350 400
Year
0
0.2
0.4
0.6
0.8
1
Gen
eral
ized
Cor
rela
tion
PCA (total vs. time-varying)
1st GC2nd GC3rd GC4th GC5th GC6th GC
Figure: Generalized correlations between loadings estimated on the wholetime horizon T = 638 and a rolling window with 240 months for K = 6factors.
A 1
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Time-stability of loadings of individual stocks
0 50 100 150 200 250 300Time
0
0.5
1
Gen
eral
ized
Cor
rela
tion RP-PCA (total vs. time-varying)
1st GC2nd GC3rd GC4th GC5th GC6th GC
0 50 100 150 200 250 300Time
0
0.5
1
Gen
eral
ized
Cor
rela
tion PCA (total vs. time-varying)
Figure: Stock price data (N = 270 and T = 500): Generalizedcorrelations between loadings estimated on the whole time horizon and arolling window with 240 months.
A 2
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Effect of Risk-Premium Weight γ
0 5 10 15 200
0.2
0.4SR
SR (In-sample)
1 factor2 factors3 factors
0 5 10 15 200
0.2
0.4
SR
SR (Out-of-sample)
0 5 10 15 200
0.1
0.2
0.3
RMS (In-sample)
0 5 10 15 200
0.1
0.2
0.3
RMS (Out-of-sample)
0 5 10 15 200
2
4
Varia
tion
Idiosyncratic Variation (In-sample)
0 5 10 15 200
2
4
Varia
tion
Idiosyncratic Variation (Out-of-sample)
Figure: Maximal Sharpe-ratios, root-mean-squared pricing errors andunexplained idiosyncratic variation.
⇒ RP-PCA detects 3rd factor (accrual) for γ > 10.A 3
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Literature (partial list)
Large-dimensional factor models with strong factors
Bai (2003): Distribution theoryFan et al. (2016): Projected PCA for time-varying loadingsKelly et al. (2017): Instrumented PCA for time-varyingloadingsPelger (2016), Aıt-Sahalia and Xiu (2015): High-frequency
Large-dimensional factor models with weak factors(based on random matrix theory)
Onatski (2012): Phase transition phenomenaBenauch-Georges and Nadakuditi (2011): Perturbation of largerandom matrices
Asset-pricing factors
Harvey and Liu (2015): Lucky factorsClarke (2015): Level, slope and curvature for stocksKozak, Nagel and Santosh (2015): PCA based factorsBryzgalova (2016): Spurious factors A 4
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Extreme deciles of single-sorted portfolios
Portfolio Data
Monthly return data from 07/1963 to 12/2016 (T = 638) forN = 64 portfolios
Kozak, Nagel and Santosh (2017) data: 370 decile portfolios sortedaccording to 37 anomalies
⇒ Here we take only the lowest and highest decile portfolio for eachanomaly (N = 64).
Factors:
1 RP-PCA: K = 6 and γ = 100.2 PCA: K = 63 Fama-French 5: The five factor model of Fama-French
(market, size, value, investment and operating profitability, allfrom Kenneth French’s website).
4 Proxy factors: RP-PCA and PCA factors approximated with 8largest positions. A 5
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Extreme Deciles
Anomaly Mean SD Sharpe-ratio Anomaly Mean SD Sharpe-ratio
Accruals - accrual 0.37 3.20 0.12 Momentum (12m) - mom12 1.28 6.91 0.19Asset Turnover - aturnover 0.40 3.84 0.10 Momentum-Reversals - momrev 0.47 4.82 0.10Cash Flows/Price - cfp 0.44 4.38 0.10 Net Operating Assets - noa 0.15 5.44 0.03Composite Issuance - ciss 0.46 3.31 0.14 Price - price 0.03 6.82 0.00Dividend/Price - divp 0.2 5.11 0.04 Gross Profitability - prof 0.36 3.41 0.11Earnings/Price - ep 0.57 4.76 0.12 Return on Assets (A) - roaa 0.21 4.07 0.05Gross Margins - gmargins 0.02 3.34 0.01 Return on Book Equity (A) - roea 0.08 4.40 0.02Asset Growth - growth 0.33 3.46 0.10 Seasonality - season 0.81 3.94 0.21Investment Growth - igrowth 0.37 2.69 0.14 Sales Growth - sgrowth 0.05 3.59 0.01Industry Momentum - indmom 0.49 6.17 0.08 Share Volume - shvol 0.00 6.00 0.00Industry Mom. Reversals - indmomrev 1.18 3.48 0.34 Size - size 0.29 4.81 0.06Industry Rel. Reversals - indrrev 1.00 4.11 0.24 Sales/Price sp 0.53 4.26 0.13Industry Rel. Rev. (L.V.) - indrrevlv 1.34 3.01 0.44 Short-Term Reversals - strev 0.36 5.27 0.07Investment/Assets - inv 0.49 3.09 0.16 Value-Momentum - valmom 0.51 5.05 0.10Investment/Capital - invcap 0.13 5.02 0.03 Value-Momentum-Prof. - valmomprof 0.84 4.85 0.17Idiosyncratic Volatility - ivol 0.56 7.22 0.08 Value-Profitability - valprof 0.76 3.84 0.20Leverage - lev 0.24 4.58 0.05 Value (A) - value 0.50 4.57 0.11Long Run Reversals - lrrev 0.46 5.02 0.09 Value (M) - valuem 0.43 5.89 0.07Momentum (6m) - mom 0.35 6.27 0.06
Table: Long-Short Portfolios of extreme deciles of 37 single-sortedportfolios from 07/1963 to 12/2016: Mean, standard deviation andSharpe-ratio.
A 6
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Extreme Deciles
In-sample Out-of-sampleSR RMS α Idio. Var. SR RMS α Idio. Var.
RP-PCA 0.64 0.18 3.59 0.53 0.15 4.23PCA 0.35 0.22 3.57 0.28 0.19 4.24
RP-PCA Proxy 0.62 0.19 4.08 0.48 0.17 4.19PCA Proxy 0.37 0.22 3.77 0.315 0.18 4.20
Fama-French 5 0.32 0.30 7.31 0.31 0.262 6.40
Table: First and last decile of 37 single-sorted portfolios from 07/1963 to12/2016 (N = 64 and T = 638): Maximal Sharpe-ratios,root-mean-squared pricing errors and unexplained idiosyncratic variation.K = 6 statistical factors.
RP-PCA strongly dominates PCA and Fama-French 5 factors
Results hold out-of-sample. A 7
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Extreme Deciles: Maximal Sharpe-ratio
SR (In-sample)
RP-PCA
RP-PCA ProxyPCA
PCA Proxy0
0.1
0.2
0.3
0.4
0.5
0.6
0.73 factors4 factors5 factors6 factors7 factors
SR (Out-of-sample)
RP-PCA
RP-PCA ProxyPCA
PCA Proxy0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
Figure: Maximal Sharpe-ratios.
⇒ Spike in Sharpe-ratio for 6 factors
A 8
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Extreme Deciles: Pricing error
RMS (In-sample)
RP-PCA
RP-PCA ProxyPCA
PCA Proxy0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
3 factors4 factors5 factors6 factors7 factors
RMS (Out-of-sample)
RP-PCA
RP-PCA ProxyPCA
PCA Proxy0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
Figure: Root-mean-squared pricing errors.
⇒ RP-PCA has smaller out-of-sample pricing errors
A 9
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Extreme Deciles: Idiosyncratic Variation
Idiosyncratic Variation (In-sample)
RP-PCA
RP-PCA ProxyPCA
PCA Proxy0
1
2
3
4
5
6
7
3 factors4 factors5 factors6 factors7 factors
Idiosyncratic Variation (Out-of-sample)
RP-PCA
RP-PCA ProxyPCA
PCA Proxy0
1
2
3
4
5
6
7
Figure: Unexplained idiosyncratic variation.
⇒ Unexplained variation similar for RP-PCA and PCA
A 10
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Extreme Deciles: Maximal Sharpe-ratio
SR (In-sample)
1 factor
2 factors
3 factors
4 factors
5 factors
6 factors
7 factors0
0.1
0.2
0.3
0.4
0.5
0.6
0.7=-1=0=1=10=50=100
SR (Out-of-sample)
1 factor
2 factors
3 factors
4 factors
5 factors
6 factors
7 factors0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
Figure: First and last decile of 37 single-sorted portfolios (N = 64 andT = 638): Maximal Sharpe-ratios for different RP-weights γ and numberof factors K .
A 11
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Interpreting factors: Generalized correlations with proxies
RP-PCA PCA
1. Gen. Corr. 1.00 1.002. Gen. Corr. 1.00 1.003. Gen. Corr. 0.98 0.994. Gen. Corr. 0.96 0.975. Gen. Corr. 0.88 0.956. Gen. Corr. 0.72 0.89
Table: Generalized correlations of statistical factors with proxy factors(portfolios of 8 assets).
Problem in interpreting factors: Factors only identified up toinvertible linear transformations.
Generalized correlations close to 1 measure of how many factors twosets have in common.
⇒ Proxy factors approximate statistical factors. A 12
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Interpreting factors: Cumulative absolute proxy weights
RP-PCA Proxy PCA Proxy
Momentum (12m) 1.70 Idiosyncratic Volatility 1.28Idiosyncratic Volatility 1.65 Momentum (12m) 1.14Industry Rel. Rev. (L.V.) 1.21 Momentum (6m) 1.04Momentum (6m) 1.21 Price 0.95Price 1.21 Asset Turnover 0.83Industry Mom. Reversals 1.11 Industry Rel. Rev. (L.V.) 0.79Value (M) 1.03 Value (M) 0.75Industry Rel. Reversals 0.84 Industry Momentum 0.73Share Volume 0.55 Industry Mom. Reversals 0.73Net Operating Assets 0.47 Dividend/Price 0.67Value-Momentum-Prof. 0.46 Gross Profitability 0.64Size 0.42 Sales/Price 0.58Return on Book Equity (A) 0.32 Value-Profitability 0.58Industry Momentum 0.29 Net Operating Assets 0.37Value-Momentum 0.27 Value (A) 0.36Long Run Reversals 0.24 Value-Momentum-Prof. 0.33Earnings/Price 0.22 Cash Flows/Price 0.31
Table: Composition of proxy factors: Anomaly categories and the sum ofabsolute values of the portfolio weights of the K = 6 proxy factors.
A 13
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Interpreting factors: Composition of proxies
2. Proxy (RP-PCA) 3. Proxy (RP-PCA) 4. Proxy (RP-PCA) 5. Proxy (RP-PCA) 6. Proxy (RP-PCA)
indrrevlv10 0.54 valmomprof10 0.17 mom1210 0.28 mom1210 -0.28 price1 0.38indmomrev10 0.52 indmomrev10 -0.20 mom10 0.26 mom10 -0.28 mom1 0.36
ivol10 0.24 ivol10 -0.21 valuem1 0.25 valmomprof10 -0.29 valuem10 0.34Accrual1 -0.21 mom121 -0.23 lrrev10 -0.24 roea1 -0.32 indrrev10 0.32
shvol1 -0.22 indrrevlv10 -0.26 mom1 -0.30 shvol1 -0.33 indrrev1 -0.26ep1 -0.22 indmomrev1 -0.40 valuem10 -0.44 price1 -0.37 valmom10 -0.27
indrrev1 -0.25 indrrevlv1 -0.41 price1 -0.45 size10 -0.42 indmom10 -0.29mom121 -0.42 ivol1 -0.67 mom121 -0.49 noa10 -0.47 ivol1 -0.53
2. Proxy (PCA) 3. Proxy (PCA) 4. Proxy (PCA) 5. Proxy (PCA) 6. Proxy (PCA)
ivol1 0.59 valuem10 0.46 mom10 0.36 divp10 0.30 valprof10 0.33indrrevlv10 0.43 price1 0.38 indmom10 0.35 roea1 -0.25 Aturnover10 0.32
indmomrev10 0.37 divp10 0.37 mom1210 0.34 shvol1 -0.27 sp10 0.27indrrevlv1 0.36 value10 0.36 valmomprof10 0.33 size10 -0.28 prof10 0.24
indmomrev1 0.36 sp10 0.32 valmom1 -0.31 mom1 -0.29 valprof1 -0.25ivol10 0.27 lrrev10 0.31 indmom1 -0.35 noa10 -0.37 prof1 -0.40
mom121 0.03 cfp10 0.31 mom121 -0.39 mom121 -0.38 ivol1 -0.42indmom1 0.03 valuem1 -0.29 mom1 -0.39 price1 -0.57 Aturnover1 -0.51
Table: Portfolio-composition of proxy factors for first and last decile of 37single-sorted portfolios: First proxy factors is an equally-weightedportfolio.
A 14
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Extreme Deciles: Time-Stability
Figure: Time-varying rotated loadings for the first six factors. Loadingsare estimated on a rolling window with 240 months. A 15
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Extreme Deciles: Time-Stability
0 50 100 150 200 250 300 350 400
Time
0
0.2
0.4
0.6
0.8
1
Gen
eral
ized
Cor
rela
tion
RP-PCA (total vs. time-varying, K=4)
1st GC2nd GC3rd GC4th GC
0 50 100 150 200 250 300 350 400
Time
0
0.2
0.4
0.6
0.8
1
Gen
eral
ized
Cor
rela
tion
PCA (total vs. time-varying, K=4)
1st GC2nd GC3rd GC4th GC
0 50 100 150 200 250 300 350 400
Time
0
0.2
0.4
0.6
0.8
1
Gen
eral
ized
Cor
rela
tion
RP-PCA (total vs. time-varying, K=6)
1st GC2nd GC3rd GC4th GC5th GC6th GC
0 50 100 150 200 250 300 350 400
Time
0
0.2
0.4
0.6
0.8
1
Gen
eral
ized
Cor
rela
tion
PCA (total vs. time-varying, K=6)
1st GC2nd GC3rd GC4th GC5th GC6th GC
Figure: Generalized correlations between loadings estimated on the wholetime horizon T = 638 and a rolling window. A 16
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Double-sorted portfolios
Double-sorted portfolios
Data
Monthly return data from 07/1963 to 05/2017 (T = 647)13 double sorted portfolios (consisting of 25 portfolios) fromKenneth French’s website
Factors
1 PCA: K = 32 RP-PCA: K = 3 and γ = 1003 FF-Long/Short factors: market + two specific anomaly
long-short factors
A 17
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Sharpe-ratios and pricing errors (in-sample)
Sharpe-Ratio αRPCA PCA FF-L/S RPCA PCA FF-L/S
Size and BM 0.25 0.22 0.21 0.13 0.13 0.14BM and Investment 0.24 0.17 0.24 0.08 0.11 0.12
BM and Profits 0.23 0.20 0.24 0.10 0.12 0.16Size and Accrual 0.33 0.13 0.21 0.06 0.14 0.12
Size and Beta 0.25 0.24 0.23 0.06 0.07 0.10Size and Investment 0.32 0.26 0.21 0.10 0.11 0.22
Size and Profits 0.22 0.21 0.25 0.06 0.06 0.16Size and Momentum 0.25 0.19 0.18 0.15 0.16 0.17Size and ST-Reversal 0.28 0.25 0.24 0.16 0.17 0.35
Size and Idio. Vol. 0.35 0.31 0.32 0.15 0.16 0.16Size and Total Vol. 0.34 0.30 0.31 0.16 0.16 0.16Profits and Invest. 0.28 0.24 0.30 0.10 0.12 0.11
Size and LT-Reversal 0.22 0.18 0.16 0.11 0.13 0.16
A 18
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Sharpe-ratios and pricing errors (out-of-sample)
Sharpe-Ratio αRPCA PCA FF-L/S RPCA PCA FF-L/S
Size and BM 0.23 0.18 0.16 0.18 0.19 0.19BM and Investment 0.21 0.15 0.24 0.13 0.17 0.17
BM and Profits 0.19 0.17 0.23 0.16 0.17 0.19Size and Accrual 0.24 0.09 0.11 0.08 0.14 0.12
Size and Beta 0.22 0.20 0.16 0.08 0.09 0.09Size and Investment 0.30 0.23 0.17 0.13 0.14 0.16
Size and Profits 0.22 0.20 0.20 0.10 0.1 0.17Size and Momentum 0.18 0.12 0.08 0.17 0.18 0.19Size and ST-Reversal 0.22 0.17 0.23 0.22 0.23 0.25
Size and Idio. Vol. 0.37 0.30 0.28 0.17 0.18 0.18Size and Total Vol. 0.35 0.28 0.27 0.18 0.20 0.19Profits and Invest. 0.31 0.25 0.29 0.12 0.15 0.14
Size and LT-Reversal 0.16 0.10 0.04 0.13 0.14 0.14
A 19
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
The Model: Objective function
Time-series objective function:
Minimize the unexplained variance:
minΛ,F
1
NT
N∑i=1
T∑t=1
(Xti − FtΛ>i )2
= minΛ
1
NTtrace
((XMΛ)>(XMΛ)
)s.t. F = X (Λ>Λ)−1Λ>
Projection matrix MΛ = IN − Λ(Λ>Λ)−1Λ>
Error (non-systematic risk): e = X − FΛ> = XMΛ
Λ proportional to eigenvectors of the first K largest eigenvalues of1
NT X>X minimizes time-series objective function
⇒ Motivation for PCA
A 20
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
The Model: Objective function
Cross-sectional objective function:
Minimize cross-sectional expected pricing error:
1
N
N∑i=1
(E [Xi ]− E [F ]Λ>i
)2
=1
N
N∑i=1
(1
TX>i 1−
1
T1>FΛ>i
)2
=1
Ntrace
((1
T1>XMΛ
)(1
T1>XMΛ
)>)s.t. F = X (Λ>Λ)−1Λ>
1 is vector T × 1 of 1’s and thus F>1T estimates factor mean
Why not estimate factors with cross-sectional objective function?
Factors not identifiedSpurious factor detection (Bryzgalova (2016))
A 21
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
The Model: Objective function
Combined objective function: Risk-Premium-PCA
minΛ,F
1
NTtrace
(((XMΛ)>(XMΛ
))+ γ
1
Ntrace
((1
T1>XMΛ
)(1
T1>XMΛ
)>)
= minΛ
1
NTtrace
(MΛX
>(I +
γ
T11>)XMΛ
)s.t. F = X (Λ>Λ)−1Λ>
The objective function is minimized by the eigenvectors of thelargest eigenvalues of 1
NT X>(IT + γ
T 11>)X .
Λ estimator for loadings: proportional to eigenvectors of the first Keigenvalues of 1
NT X>(IT + γ
T 11>)X
F estimator for factors: 1NX Λ = X (Λ>Λ)−1Λ>.
Estimator for the common component C = FΛ is C = F Λ>
A 22
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
The Model: Objective function
Weighted Combined objective function:
Straightforward extension to weighted objective function:
minΛ,F
1
NTtrace(Q>(X − FΛ>)>(X − FΛ>)Q)
+ γ1
Ntrace
(1>(X − FΛ>)QQ>(X − FΛ>)>1
)= min
Λtrace
(MΛQ
>X>(I +
γ
T11>)XQMΛ
)s.t. F = X (Λ>Λ)−1Λ>
Cross-sectional weighting matrix Q
Factors and loadings can be estimated by applying PCA toQ>X>
(I + γ
T 11>)XQ.
Today: Only Q equal to inverse of a diagonal matrix of standarddeviations. For γ = −1 corresponds to PCA of a correlation matrix.
Optimal choice of Q: GLS type argument A 23
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Simulation
Simulation parameters
N = 250 and T = 350.
Factors: K = 4
1. Factor represent the market with N(1.2, 9): Sharpe-ratio of0.42. Factor represents an industry factors following N(0.1, 1):Sharpe-ratio of 0.1.3. Factor follows N(0.4, 1): Sharpe-ratio of 0.4.4. Factor has a small variance but high Sharpe-ratio. It followsN(0.4, 0.16): Sharpe-ratio of 1.
Loadings normalized such that 1N Λ>Λ = IK .
Λi,1 = 1 and Λi,k ∼ N(0, 1) for k = 2, 3, 4.
Errors: Cross-sectional and time-series correlation andheteroskedasticity in the residuals. 1/3 of variation due tosystematic risk. A 24
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Simulation parameters
Errors
Residuals are modeled as e = σeDTAT εANDN :
ε is a T × N matrix and follows a multivariate standard normaldistribution
Time-series correlation in errors: AT creates an AR(1) model withparameter ρ = 0.1
Cross-sectional correlation in errors: AN is a Toeplitz-matrix with(β, β, β) on the right three off-diagonals with β = 0.7
Cross-sectional heteroskedasticity: DN is a diagonal matrix withindependent elements following N(1, 0.2)
Time-series heteroskedasticity: DT is a diagonal matrix withindependent elements following N(1, 0.2)
Signal-to-noise ratio: σ2e = 9
Parameters produce eigenvalues that are consistent with the data.
Around 1/3 of the variation is due to systematic risk and 2/3 isnon-systematic.
A 25
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Simulation
0 100 200 300 400
-1000
0
1000
2000
3000
4000
5000
60001. Factor
0 100 200 300 400
0
200
400
600
800
1000
12002. Factor
0 100 200 300 400
-500
0
500
1000
1500
2000
25003. Factor
0 100 200 300 400
0
500
1000
1500
2000
2500
30004. Factor
True factor
RP-PCA =0
RP-PCA =10
RP-PCA =50
RP-PCA =100
PCA
Figure: Sample paths of the cumulative returns of the first four factorsand the estimated factor processes. N = 250 and T = 350.
A 26
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Simulation
True Factor γ = −1 γ = 0 γ = 10 γ = 50 γ = 100
In-sampleSR 1.17 0.56 0.80 1.07 1.10 1.10α 0.42 0.57 0.51 0.42 0.41 0.41
Idio. Var. 23.77 22.728 22.784 22.912 22.923 22.925Ex. Var. 0.33 0.36 0.36 0.35 0.35 0.35
Out-of-sampleSR 1.17 0.60 0.79 0.92 0.92 0.93α 0.42 0.58 0.52 0.47 0.47 0.47
Idio. Var. 23.84 23.15 23.13 23.15 23.15 23.15Ex. Var. 0.33 0.35 0.35 0.35 0.35 0.35
Table: (i) Maximal Sharpe-ratio (SR), (ii) root-mean-squared pricingerror (α), unexplained variation and proportion of variation explained byK = 4 factors. N = 250 and T = 350.
A 27
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Simulation
γ = −1 γ = 0 γ = 10 γ = 50 γ = 100
In-sample1. Factor 0.98 0.98 0.98 0.98 0.982. Factor 0.92 0.92 0.92 0.92 0.923. Factor 0.92 0.92 0.92 0.92 0.92
4. Factor 0.15 0.45 0.68 0.69 0.69
Out-of-sample1. Factor 0.94 0.97 0.98 0.98 0.982. Factor 0.79 0.88 0.92 0.92 0.923. Factor 0.77 0.86 0.91 0.92 0.92
4. Factor 0.17 0.46 0.68 0.69 0.69
Table: Correlation between estimated and true factors. The risk-premiumweight γ = −1 corresponds to PCA. N = 250 and T = 350.
A 28
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Simulation
Simulation parameters
Consider only K = 1 factor
Sharpe-ratio =1, i.e. µF = σF
Different signal strength σ2F
Λi ∼ N(0, 1), i.e. normalized variance σ2F · N in weak factor model.
Same residuals as before
A 29
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Simulation: Correlation, N=150, T=150
Correlation with true factor (In-sample)
2=0.3 2=0.5 2=0.8 2=1.00
0.2
0.4
0.6
0.8
1
=-1=0=10=50=100
True factor
Correlation with true factor (Out-of-sample)
2=0.3 2=0.5 2=0.8 2=1.00
0.2
0.4
0.6
0.8
1
Figure: Correlation of estimated with true factor.
⇒ For signal σ2 < 0.8 a value of γ ≥ 10 is optimal.
A 30
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Simulation: Sharpe-ratio, N=150, T=150
SR (In-sample)
2=0.3 2=0.5 2=0.8 2=1.00
0.2
0.4
0.6
0.8
1
1.2
=-1=0=10=50=100
True factor
SR (Out-of-sample)
2=0.3 2=0.5 2=0.8 2=1.00
0.2
0.4
0.6
0.8
1
1.2
Figure: Maximal Sharpe-ratios.
⇒ For signal σ2 < 0.8 a value of γ ≥ 10 is optimal.
A 31
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Simulation: Pricing Error, N=150, T=150
RMS (In-sample)
2=0.3 2=0.5 2=0.8 2=1.00
0.2
0.4
0.6
0.8
1
=-1=0=10=50=100
True factor
RMS (Out-of-sample)
2=0.3 2=0.5 2=0.8 2=1.00
0.2
0.4
0.6
0.8
1
Figure: Root-mean-squared pricing errors.
⇒ For signal σ2 < 0.8 a value of γ ≥ 10 is optimal.
A 32
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Simulation: Idiosyncratic Variation, N=150, T=150
Idiosyncratic Variation (In-sample)
2=0.3 2=0.5 2=0.8 2=1.00
5
10
15
20
25
=-1=0=10=50=100
True factor
Idiosyncratic Variation (Out-of-sample)
2=0.3 2=0.5 2=0.8 2=1.00
5
10
15
20
25
Figure: Unexplained idiosyncratic variation.
⇒ Unexplained variation similar for RP-PCA and PCA.
A 33
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Simulation: Correlation, N=250, T=250
Correlation with true factor (In-sample)
2=0.3 2=0.5 2=0.8 2=1.00
0.2
0.4
0.6
0.8
1
=-1=0=10=50=100
True factor
Correlation with true factor (Out-of-sample)
2=0.3 2=0.5 2=0.8 2=1.00
0.2
0.4
0.6
0.8
1
Figure: Correlation of estimated with true factor.
⇒ For signal σ2 < 0.8 a value of γ ≥ 0 is optimal.
A 34
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Simulation: Correlation, N=50, T=250
Correlation with true factor (In-sample)
2=0.3 2=0.5 2=0.8 2=1.00
0.2
0.4
0.6
0.8
1
=-1=0=10=50=100
True factor
Correlation with true factor (Out-of-sample)
2=0.3 2=0.5 2=0.8 2=1.00
0.2
0.4
0.6
0.8
1
Figure: Correlation of estimated with true factor.
⇒ For signal σ2 < 0.8 a value of γ ≥ 10 is optimal.
A 35
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Simulation: Correlation, N=100, T=250
Correlation with true factor (In-sample)
2=0.3 2=0.5 2=0.8 2=1.00
0.2
0.4
0.6
0.8
1
=-1=0=10=50=100
True factor
Correlation with true factor (Out-of-sample)
2=0.3 2=0.5 2=0.8 2=1.00
0.2
0.4
0.6
0.8
1
Figure: Correlation of estimated with true factor.
⇒ For signal σ2 < 0.8 a value of γ ≥ 10 is optimal.
A 36
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Simulation: Correlation, N=250, T=150
Correlation with true factor (In-sample)
2=0.3 2=0.5 2=0.8 2=1.00
0.2
0.4
0.6
0.8
1
=-1=0=10=50=100
True factor
Correlation with true factor (Out-of-sample)
2=0.3 2=0.5 2=0.8 2=1.00
0.2
0.4
0.6
0.8
1
Figure: Correlation of estimated with true factor.
⇒ For signal σ2 < 0.8 a value of γ ≥ 0 is optimal.
A 37
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Simulation: Weak factor model prediction
0 0.5 1 1.5 20
0.5
1Statistical Model
PCARP-PCA ( =0)RP-PCA ( =10)RP-PCA ( =50)
0 0.5 1 1.5 20
0.5
1Monte-Carlo Simulation
PCARP-PCA ( =0)RP-PCA ( =10)RP-PCA ( =50)
Correlations between estimated and true factor based on the weak factor model
prediction and Monte-Carlo simulations for different variances of the factor.
The residuals have cross-sectional and time-series correlation and
heteroskedasticity. The Sharpe-ratio of the factor is 1. N = 250,T = 350. A 38
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Simulation: Weak factor model prediction
0 0.5 1 1.5 20
0.5
1Statistical Model
PCARP-PCA ( =0)RP-PCA ( =10)RP-PCA ( =50)
0 0.5 1 1.5 20
0.5
1Monte-Carlo Simulation
PCARP-PCA ( =0)RP-PCA ( =10)RP-PCA ( =50)
Correlations between estimated and true factor based on the weak factor model
prediction and Monte-Carlo simulations for different variances of the factor.
The residuals have only cross-sectional correlation. The Sharpe-ratio of the
factor is 1. N = 250,T = 350. A 39
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Weak Factor Model: Dependent residuals
0 5 10 15 20 25 30 350
0.2
0.4
0.6
0.8
1
dependent residualsi.i.d residuals
Figure: Values of ρ2i ( 1
1+θiB(θi ))if θi > σ2
crit and 0 otherwise) for different
signals θi . The average noise level is normalized in both cases to σ2e = 1
and c = 1. For the correlated residuals we assume that Σ1/2 is a Toeplitzmatrix with β, β, β on the right three off-diagonals with β = 0.7.N = 250,T = 350.
A 40
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Weak Factor Model
Weak Factor Model
Weak factors either have a small variance or affect a smallerfraction of assets:
Λ>Λ bounded (after normalizing factor variances)
Statistical model: Spiked covariance models from random matrixtheory
Eigenvalues of sample covariance matrix separate into two areas:
The bulk, majority of eigenvaluesThe extremes, a few large outliers
Bulk spectrum converges to generalized Marchenko-Pasturdistribution (under certain conditions)
A 41
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Weak Factor Model
Weak Factor Model
Large eigenvalues converge either to
A biased value characterized by the Stieltjes transform of thebulk spectrumTo the bulk of the spectrum if the true eigenvalue is belowsome critical threshold
⇒ Phase transition phenomena: estimated eigenvectorsorthogonal to true eigenvectors if eigenvalues too small
Onatski (2012): Weak factor model with phase transitionphenomena
Problem: All models in the literature assume that random processeshave mean zero
⇒ RP-PCA implicitly uses non-zero means of random variables
⇒ New tools necessary!A 42
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Weak Factor Model
Assumption 1: Weak Factor Model
1 Rate: Assume that NT→ c with 0 < c <∞.
2 Factors: F are uncorrelated among each other and are independent of eand Λ and have bounded first two moments.
µF :=1
T
T∑t=1
Ftp→ µF ΣF :=
1
TFtF
>t
p→ ΣF =
σ2F1· · · 0
.... . .
...0 · · · σ2
FK
3 Loadings: The column vectors of the loadings Λ are orthogonally
invariant and independent of ε and F (e.g. Λi,k ∼ N(0, 1N
) and
Λ>Λ = IK
4 Residuals: e = εΣ with εt,i ∼ N(0, 1). The empirical eigenvaluedistribution function of Σ converges to a non-random spectral distributionfunction with compact support and supremum of support b. Largesteigenvalues of Σ converge to b. A 43
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Weak Factor Model
Intuition: Weak Factor Model
“Signal” matrix for PCA of covariance matrix:
ΛΣFΛ>
K largest eigenvalues θPCA1 , ..., θPCAK measure strength of signal
“Signal” matrix for RP-PCA:
Λ(ΣF + (1 + γ)µFµ
>F
)Λ>
K largest eigenvalues θRP-PCA1 , ..., θRP-PCA
K measure strength of signal
If µF 6= 0 and γ > −1 then RP-PCA signal always larger than PCAsignal:
θRP-PCAi > θPCAi
A 44
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Weak Factor Model
Theorem 1: Risk-Premium PCA under weak factor model
Under Assumption 1 the correlation of the estimated with the true factors is
Corr(F , F )p→ U︸︷︷︸
rotation
ρ1 0 · · · 00 ρ2 · · · 0
0 0. . .
...0 · · · 0 ρK
V︸︷︷︸rotation
with
ρ2i
p→ 1
1+θiB(θi ))if θi > θcrit
0 otherwise
Critical value θcrit and function B(.) depend only on the noise distributionand are known in closed-form
Based on closed-form expression choose optimal RP-weight γ
For θi > θcrit ρ2i is strictly increasing in θi .
⇒ RP-PCA strictly dominates PCA A 45
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Strong Factor Model
Strong Factor Model
Strong factors affect most assets: e.g. market factor
1N Λ>Λ bounded (after normalizing factor variances)
Statistical model: Bai and Ng (2002) and Bai (2003) framework
Factors and loadings can be estimated consistently and areasymptotically normal distributed
RP-PCA provides a more efficient estimator of the loadings
Assumptions essentially identical to Bai (2003)
A 46
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Strong Factor Model
Asymptotic Distribution (up to rotation)
PCA under assumptions of Bai (2003): (up to rotation)
Asymptotically Λ behaves like OLS regression of F on X .Asymptotically F behaves like OLS regression of Λ on X>.
RP-PCA under slightly stronger assumptions as in Bai (2003):
Asymptotically Λ behaves like OLS regression of FW on XW
with W 2 =(IT + γ 11
>
T
)and 1 is a T × 1 vector of 1’s .
Asymptotically F behaves like OLS regression of Λ on X .
Asymptotic Efficiency
Choose RP-weight γ to obtain smallest asymptotic variance of estimators
RP-PCA (i.e. γ > −1) always more efficient than PCA
Optimal γ typically smaller than optimal value from weak factor model
RP-PCA and PCA are both consistentA 47
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Simplified Strong Factor Model
Assumption 2: Simplified Strong Factor Model
1 Rate: Same as in Assumption 1
2 Factors: Same as in Assumption 1
3 Loadings: Λ>Λ/Np→ IK and all loadings are bounded.
4 Residuals: e = εΣ with εt,i ∼ N(0, 1). All elements and all row sums ofΣ are bounded.
A 48
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Simplified Strong Factor Model
Proposition: Simplified Strong Factor Model
Assumption 2 holds. Then:
1 The factors and loadings can be estimated consistently.
2 The asymptotic distribution of the factors is not affected by γ.
3 The asymptotic distribution of the loadings is given by
√T(
Λi − Λi
)D→ N(0,Ωi )
Ωi =σ2ei
(ΣF + (1 + γ)µFµ
>F
)−1 (ΣF + (1 + γ)2µFµ
>F
)(ΣF + (1 + γ)µFµ
>F
)−1, E [e2
t,i ] = σ2ei
4 γ = 0 is optimal choice for smallest asymptotic variance.γ = −1, i.e. the covariance matrix, is not efficient.
A 49
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Time-varying loadings
Model with time-varying loadings
Observe panel of excess returns and L covariates Zi,t−1,l :
Xt,i = Ft1×K
> gK×1
(Zi,t−1,1, ...,Zi,t−1,L) + et,i
Loadings are function of L covariates Zi,t−1,l with l = 1, ..., Le.g. characteristics like size, book-to-market ratio, past returns, ...
Factors and loading function are latent
Idea: Similar to Projected PCA (Fan, Liao and Wang (2016)) andInstrumented PCA (Kelly, Pruitt, Su (2017)), but
we include the pricing error penaltyallow for general interactions between covariates
A 50
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Time-varying loadings
Projected RP-PCA (work in progress)
Approximate nonlinear function gk(.) by basis functions φm(.):
gk(Zi ,t−1) =M∑
m=1
bm,kφm(Zi,t−1) g(Zt−1)︸ ︷︷ ︸K×N
= B>︸︷︷︸K×M
Φ(Zt−1)︸ ︷︷ ︸M×N
Apply RP-PCA to projected data Xt = XtΦ(Zt−1)>
Xt = FtB>Φ(Zt−1)Φ(Zt−1)> + etΦ(Zt−1)> = FtB + et
Special case: φm = 1Zt−1∈Im ⇒ X characteristics sorted portfolios
Obtain arbitrary interactions and break curse of dimensionality byconditional tree sorting projection
Intuition: Projection creates M portfolios sorted on any functionalform and interaction of covariates Zt−1. A 51
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Weak Factor Model
Assumption 1: Weak Factor Model
1 Residual matrix can be represented as e = εΣ with εt,i ∼ N(0, 1). Theempirical eigenvalue distribution function of Σ converges to anon-random spectral distribution function with compact support. Thesupremum of the support is b.
2 The factors F are uncorrelated among each other and are independent ofe and Λ and have bounded first two moments.
µF :=1
T
T∑t=1
Ftp→ µF ΣF :=
1
TFtF
>t
p→ ΣF =
σ2F1· · · 0
.... . .
...0 · · · σ2
FK
3 The column vectors of the loadings Λ are orthogonally invariant and
independent of ε and F (e.g. Λi,k ∼ N(0, 1N
) and
Λ>Λ = IK
4 Assume that NT→ c with 0 < c <∞. A 52
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Weak Factor Model
Definition: Weak Factor Model
Average idiosyncratic noise σ2e := trace(Σ)/N
Denote by λ1 ≥ λ2 ≥ ... ≥ λN the ordered eigenvalues of 1Te>e. The
Cauchy transform (also called Stieltjes transform) of the eigenvalues isthe almost sure limit:
G(z) := a.s. limT→∞
1
N
N∑i=1
1
z − λi= a.s. lim
T→∞
1
Ntrace
((zIN −
1
Te>e)
)−1
B-function
B(z) :=a.s. limT→∞
c
N
N∑i=1
λi
(z − λi )2
=a.s. limT→∞
c
Ntrace
(((zIN −
1
Te>e)
)−2(1
Te>e
))
A 53
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Weak Factor Model
Estimator
Risk-premium PCA (RP-PCA): Apply PCA estimation to
Sγ := 1TX>
(IT + γ 11
>
T
)X
PCA : Apply PCA to estimated covariance matrix
S−1 := 1TX>
(IT − 11
>
T
)X , i.e. γ = −1.
⇒ PCA special case of RP-PCA
“Signal” Matrix for Covariance PCA
MVar = ΣF + cσ2e IK =
σ2F1
+ cσ2e · · · 0
.... . .
...0 · · · σ2
FK+ cσ2
e
⇒ Intuition: Largest K “true” eigenvalues of S−1.
A 54
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Weak Factor Model
Lemma: Covariance PCA
Assumption 1 holds. Define the critical value σ2crit = limz↓b
1G(z)
. The first K
largest eigenvalues λi of S−1 satisfy for i = 1, ...,K
λip→
G−1
(1
σ2Fi
+cσ2e
)if σ2
Fi+ cσ2
e > σ2crit
b otherwise
The correlation between the estimated and true factors converges to
Corr(F , F )p→
%1 · · · 0...
. . ....
0 · · · %K
with
%2i
p→
1
1+(σ2Fi
+cσ2e )B(λi ))
if σ2Fi
+ cσ2e > σ2
crit
0 otherwise
A 55
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Weak Factor Model
Corollary: Covariance PCA for i.i.d. errors
Assumption 1 holds, c ≥ 1 and et,i i.i.d. N(0, σ2e ). The largest K eigenvalues
of S−1 have the following limiting values:
λip→
σ2Fi
+σ2e
σ2Fi
(c + 1 + σ2e ) if σ2
Fi+ cσ2
e > σ2crit ⇔ σ2
F >√cσ2
e
σ2e (1 +
√c)2 otherwise
The correlation between the estimated and true factors converges to
Corr(F , F )p→
%1 · · · 0...
. . ....
0 · · · %K
with
%2i
p→
1− cσ4
eσ4Fi
1+cσ2
eσ2Fi
+σ4e
σ4Fi
(c2−c)if σ2
Fi+ cσ2
e > σ2crit
0 otherwise A 56
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Weak Factor Model
“Signal” Matrix for RP-PCA
“Signal” Matrix for RP-PCA
MRP =
(ΣF + cσ2
e Σ1/2F µF (1 + γ)
µ>F Σ1/2F (1 + γ) (1 + γ)(µ>F µF + cσ2
2)
)
Define γ =√γ + 1− 1 and note that (1 + γ)2 = 1 + γ.
⇒ Projection on K demeaned factors and on mean operator.
Denote by θ1 ≥ ... ≥ θK+1 the eigenvalues of the “signal matrix” MRP
and by U the corresponding orthonormal eigenvectors :
U>MRP U =
θ1 · · · 0...
. . ....
0 · · · θK+1
⇒ Intuition: θ1, ..., θK+1 largest K + 1 “true” eigenvalues of Sγ .
A 57
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Weak Factor Model
Theorem 1: Risk-Premium PCA under weak factor model
Assumption 1 holds. The first K largest eigenvalues θi i = 1, ...,K of Sγ satisfy
θip→
G−1
(1θi
)if θi > σ2
crit = limz↓b1
G(z)
b otherwise
The correlation of the estimated with the true factors converges to
Corr(F , F )p→(IK 0
)U︸ ︷︷ ︸
rotation
ρ1 0 · · · 00 ρ2 · · · 0
0 0. . .
...0 · · · 0 ρK0 · · · 0
D1/2K Σ
−1/2
F︸ ︷︷ ︸rotation
with
ρ2i
p→
1
1+θiB(θi ))if θi > σ2
crit
0 otherwiseA 58
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Weak Factor Model
Theorem 1: continued
ΣF =D1/2K
(ρ1 · · · 0...
. . ....
0 · · · ρK0 · · · 0
>
U>(IK 00 0
)U
ρ1 · · · 0...
. . ....
0 · · · ρK0 · · · 0
+
1− ρ21 · · · 0
.... . .
...0 · · · 1− ρ2
K
)D1/2K
DK =diag((θ1 · · · θK
))
A 59
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Weak Factor Model
Lemma: Detection of weak factors
If γ > −1 and µF 6= 0, then the first K eigenvalues of MRP are strictly largerthan the first K eigenvalues of MVar , i.e.
θi > σ2Fi + cσ2
e
For θi > σ2crit it holds that
∂θi∂θi
> 0∂ρi∂θi
> 0 i = 1, ...,K
Thus, if γ > −1 and µF 6= 0, then ρi > %i .
⇒ For µF 6= 0 RP-PCA always better than PCA.
A 60
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Weak Factor Model
Example: One-factor model
Assume that there is only one factor, i.e. K = 1. The “signal matrix” MRP
simplifies to
MRP =
(σ2F + cσ2
e σFµ(1 + γ)µσF (1 + γ) (µ2 + cσ2
e )(1 + γ)
)and has the eigenvalues:
θ1,2 =1
2σ2F + cσ2
e + (µ2 + cσ2e )(1 + γ)
± 1
2
√(σ2
F + cσ2e + (µ2 + cσ2
e )(1 + γ))2 − 4(1 + γ)cσ2e (σ2
F + µ2 + cσ2e )
The eigenvector of first eigenvalue θ1 has the components
U1,1 =µσF (1 + γ)√
(θ1 − (σ2F + cσ2
e ))2 + µ2σ2F (1 + γ)
U1,2 =θ1 − σ2
F + cσ2e√
(θ1 − (σ2F + cσ2
e ))2 + µ2σ2F (1 + γ) A 61
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Weak Factor Model
Corollary: One-factor model
The correlation between the estimated and true factor has the following limit:
Corr(F , F )p→ ρ1√
ρ21 + (1− ρ2
1)(θ1−(σ2
F+cσ2
e ))2+1
µ2σ2F
(1+γ)
A 62
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Strong Factor Model
Strong Factor Model
Strong factors affect most assets: e.g. market factor1N
Λ>Λ bounded (after normalizing factor variances)
Statistical model: Bai and Ng (2002) and Bai (2003) framework
Factors and loadings can be estimated consistently and are asymptoticallynormal distributed
RP-PCA provides a more efficient estimator of the loadings
Assumptions essentially identical to Bai (2003)
A 63
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Strong Factor Model
Asymptotic Distribution (up to rotation)
PCA under assumptions of Bai (2003):
Asymptotically Λ behaves like OLS regression of F on X .Asymptotically F behaves like OLS regression of Λ on X .
RP-PCA under slightly stronger assumptions as in Bai (2003):
Asymptotically Λ behaves like OLS regression of FW on XW
with W 2 =(IT + γ 11
>
T
).
Asymptotically F behaves like OLS regression of Λ on X .
Asymptotic Expansion
Asymptotic expansions (under slightly stronger assumptions as in Bai (2003)):
1√T(H>Λi − Λi
)=(
1TF>W 2F
)−1 1√TF>W 2ei + Op
(√TN
)+ op(1)
2√N(H>−1
Ft − Ft
)=(
1N
Λ>Λ)−1 1√
NΛ>e>t + Op
(√N
T
)+ op(1)
with known rotation matrix H. A 64
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Strong Factor Model
Assumption 2: Strong Factor Model
Assume the same assumptions as in Bai (2003) (Assumption A-G) hold and inaddition (
1√T
∑Tt=1 Ftet,i
1√T
∑Tt=1 et,i
)D→ N(0,Ω) Ω =
(Ω1,1 Ω1,2
Ω2,1 Ω2,2
)
A 65
Intro Model Illustration Theoretical Results Empirical Results Conclusion Appendix
Strong Factor Model
Theorem 2: Strong Factor Model
Assumption 2 holds and γ ∈ [−1,∞). Then:
For any choice of γ the factors, loadings and common components can beestimated consistently pointwise.
If√
TN→ 0 then
√T(H>Λi − Λi
)D→ N(0,Φ)
Φ =(
ΣF + (γ + 1)µFµ>F
)−1 (Ω1,1 + γµFΩ2,1 + γΩ1,2µF + γ2µFΩ2,2µF
)·(
ΣF + (γ + 1)µFµ>F
)−1
For γ = −1 this simplifies to the conventional case Σ−1F Ω1,1Σ−1
F .
If√
NT→ 0 then the asymptotic distribution of the factors is not affected
by the choice of γ.
The asymptotic distribution of the common component depends on γ ifand only if N
Tdoes not go to zero. For T
N→ 0
√T(Ct,i − Ct,i
)D→ N
(0,F>t ΦFt
)A 66