8/19/2019 Tema I (Ordinary Least Squares)
Ordinary Least Squares
Rómulo A. Chumacero
OLS
Motivation
• Economics is (essentially) an observational science
• Theory provides discussion regarding the relationship between variables
— Example: Monetary policy and macroeconomic conditions
• What?: Properties of OLS
• Why?: Most commonly used estimation technique
• How?: From simple to more complex
Outline
1. Simple (bivariate) linear regression
2. General framework for regression analysis
3. OLS estimator and its properties
4. CLS (OLS estimation subject to linear constraints)
5. Inference (Tests for linear constraints)
6. Prediction
An Example
Figure 1: Growth and Government size
Correlation Coefficient
• Intended to measure direction and closeness of linear association
• Observations: $\{(x_i, y_i)\}_{i=1}^{n}$
• Data expressed in deviations from the (sample) mean:
$\tilde{x}_i = x_i - \bar{x}$, where $\bar{x} = n^{-1}\sum_{i=1}^{n} x_i$ (and similarly for $y$)
• $\mathrm{Cov}(x, y) = \mathrm{E}(xy) - \mathrm{E}(x)\,\mathrm{E}(y)$; its sample analog is
$s_{xy} = n^{-1}\sum_{i=1}^{n} \tilde{x}_i \tilde{y}_i$
which depends on the units in which $x$ and $y$ are measured
• The correlation coefficient is a measure of linear association independent of units:
$r_{xy} = \dfrac{n^{-1}\sum_{i=1}^{n} \tilde{x}_i \tilde{y}_i}{s_x s_y} = \dfrac{s_{xy}}{s_x s_y}, \qquad s_x = \sqrt{n^{-1}\sum_{i=1}^{n} \tilde{x}_i^2}$
• Limits: $-1 \leq r_{xy} \leq 1$ (applying the Cauchy-Schwarz inequality)
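These formulas can be sketched numerically; the data and variable names below are illustrative, not from the slides:

```python
import numpy as np

# Simulated linearly associated data (illustrative)
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 0.5 * x + rng.normal(size=200)

# Deviations from the sample means
xt, yt = x - x.mean(), y - y.mean()

s_xy = (xt * yt).mean()                                   # unit-dependent covariance
r = s_xy / np.sqrt((xt ** 2).mean() * (yt ** 2).mean())   # unit-free correlation

assert np.isclose(r, np.corrcoef(x, y)[0, 1])  # matches NumPy's estimate
assert -1.0 <= r <= 1.0                        # Cauchy-Schwarz bound
```

Note that $r$ is invariant to rescaling either series, while $s_{xy}$ is not.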
Caution
• Fallacy: “Post hoc, ergo propter hoc” (after this, therefore because of this)
• Correlation is not causation
• Numerical and statistical significance may mean nothing
• Nonsense (spurious) correlation
• Yule (1926):
— Death rate vs. proportion of marriages in the Church of England (1866-1911)
— $r = 0.95$
— Ironic: to achieve immortality → close the church!
• A few more recent examples
Ice Cream causes Crime
Figure 2: Nonsense 1
Yet another reason to hate Bill Gates
Figure 3: Nonsense 2
Facebook and the Greeks
Figure 4: Nonsense 3
Let’s save the pirates
Figure 5: Nonsense 4
Divine justice
Figure 6: Nonsense 5?
Simple linear regression model
• Economics as a remedy for nonsense (correlation does not indicate direction of dependence)
• Take a stance:
$y_i = \beta_1 + \beta_2 x_i + e_i$
— Linear
— Dependent / independent
— Systematic / unpredictable
• $n$ observations, 2 unknowns
• Infinite possible solutions
— Fit a line by eye
— Choose two pairs of observations and join them
— Minimize distance between $y$ and its predictable component
∗ $\min \sum |e_i| \rightarrow$ LAD
∗ $\min \sum e_i^2 \rightarrow$ OLS
Our Example
Figure 7: Growth and Government size
Our Example
Figure 8: Linear regression
Simple linear regression model
• Define the sum of squared residuals function as:
$S(\beta_1, \beta_2) = \sum_{i=1}^{n} (y_i - \beta_1 - \beta_2 x_i)^2$
• Estimator: formula for estimating unknown parameters
• Estimate: numerical value obtained when sample data is substituted in the formula
• The OLS estimator $(\hat{\beta}_1, \hat{\beta}_2)$ minimizes $S(\beta_1, \beta_2)$. FONC:
$\left.\dfrac{\partial S}{\partial \beta_1}\right|_{\hat{\beta}} = -2\sum_{i=1}^{n}\left(y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i\right) = 0$
$\left.\dfrac{\partial S}{\partial \beta_2}\right|_{\hat{\beta}} = -2\sum_{i=1}^{n} x_i\left(y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i\right) = 0$
• Two equations, two unknowns:
$\hat{\beta}_1 = \bar{y} - \hat{\beta}_2 \bar{x}, \qquad \hat{\beta}_2 = \dfrac{s_{xy}}{s_x^2} = \dfrac{\sum_{i=1}^{n} \tilde{x}_i \tilde{y}_i}{\sum_{i=1}^{n} \tilde{x}_i^2}$
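A minimal sketch of these closed-form solutions (the simulated data and true coefficients are illustrative assumptions):

```python
import numpy as np

# Simulated sample with known coefficients (illustrative)
rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = 1.0 + 2.0 * x + rng.normal(size=100)

# Closed-form OLS from the first-order conditions
xt, yt = x - x.mean(), y - y.mean()
b2 = (xt * yt).sum() / (xt ** 2).sum()  # slope: s_xy / s_x^2
b1 = y.mean() - b2 * x.mean()           # intercept: ybar - b2 * xbar

# Cross-check against NumPy's degree-1 least-squares fit
slope, intercept = np.polyfit(x, y, 1)
assert np.isclose(b2, slope) and np.isclose(b1, intercept)
```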
Simple linear regression model
• Properties:
— $\hat{\beta}_1, \hat{\beta}_2$ minimize $S$
— The OLS line passes through the mean point $(\bar{x}, \bar{y})$
— Residuals $\hat{e}_i \equiv y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i$ are uncorrelated (in the sample) with $x_i$
Figure 9: SSR
General Framework
• Observational data $\{z_1, z_2, \ldots, z_n\}$
• Partition $z = (y, x)$ where $y \in \mathbb{R}$ and $x \in \mathbb{R}^k$
• Joint density: $f(z; \theta)$, with $\theta$ a vector of unknown parameters
• Conditional decomposition: $f(z; \theta) = f(y \mid x; \theta_1)\, f(x; \theta_2)$, where $f(x; \theta_2) = \int_{-\infty}^{\infty} f(z; \theta)\, dy$
• Regression analysis: statistical inferences on $\theta_1$.
Ignore $f(x; \theta_2)$ provided $\theta_1$ and $\theta_2$ are “variation free”
• $y$: ‘dependent’ or ‘endogenous’ variable. $x$: vector of ‘independent’ or ‘exogenous’ variables
• Conditional mean: $g(x; \theta_3)$. Conditional variance: $\sigma^2(x; \theta_4)$
$g(x; \theta_3) = \mathrm{E}(y \mid x) = \int_{-\infty}^{\infty} y\, f(y \mid x; \theta_1)\, dy$
$\sigma^2(x; \theta_4) = \int_{-\infty}^{\infty} y^2 f(y \mid x; \theta_1)\, dy - \left[g(x; \theta_3)\right]^2$
• $e$: difference between $y$ and its conditional mean:
$y = g(x; \theta_3) + e \qquad (1)$
General Framework
Proposition 1 Properties of $e$:
1. $\mathrm{E}(e \mid x) = 0$
2. $\mathrm{E}(e) = 0$
3. $\mathrm{E}[h(x)e] = 0$ for any function $h(\cdot)$
4. $\mathrm{E}(xe) = 0$
Proof. 1. By the definition of $e$ and the linearity of conditional expectations,
$\mathrm{E}(e \mid x) = \mathrm{E}[y - g(x) \mid x] = \mathrm{E}(y \mid x) - \mathrm{E}[g(x) \mid x] = g(x) - g(x) = 0$
2. By the law of iterated expectations and the first result,
$\mathrm{E}(e) = \mathrm{E}[\mathrm{E}(e \mid x)] = \mathrm{E}(0) = 0$
3. By essentially the same argument,
$\mathrm{E}[h(x)e] = \mathrm{E}\big[\mathrm{E}[h(x)e \mid x]\big] = \mathrm{E}\big[h(x)\,\mathrm{E}(e \mid x)\big] = \mathrm{E}[h(x) \cdot 0] = 0$
4. Follows from the third result setting $h(x) = x$
General Framework
• (1) + the first result of Proposition 1 give the regression framework:
$y = g(x; \theta_3) + e, \qquad \mathrm{E}(e \mid x) = 0$
• Important: framework, not model: it holds true by definition.
• $g(\cdot)$ and $\sigma^2(\cdot)$ can take any shape
• If $g(\cdot)$ is linear: Linear Regression Model (LRM).
$g(x; \theta_3) = x'\beta$
$\underset{n \times 1}{y} = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix}, \quad \underset{n \times k}{X} = \begin{bmatrix} x_1' \\ \vdots \\ x_n' \end{bmatrix} = \begin{bmatrix} x_{11} & \cdots & x_{1k} \\ \vdots & \ddots & \vdots \\ x_{n1} & \cdots & x_{nk} \end{bmatrix}, \quad \underset{n \times 1}{e} = \begin{bmatrix} e_1 \\ \vdots \\ e_n \end{bmatrix}$
Regression models
Definition 1 The Linear Regression Model (LRM) is:
1. $y_i = x_i'\beta + e_i$ or $y = X\beta + e$
2. $\mathrm{E}(e_i \mid x_i) = 0$
3. $\mathrm{rank}(X) = k$ or $\det(X'X) \neq 0$
4. $\mathrm{E}(e_i e_j) = 0 \ \forall\, i \neq j$
Definition 2 The Homoskedastic Linear Regression Model (HLRM) is the LRM plus
5. $\mathrm{E}(e_i^2 \mid x_i) = \sigma^2$ or $\mathrm{E}(ee' \mid X) = \sigma^2 I_n$
Definition 3 The Normal Linear Regression Model (NLRM) is the LRM plus
6. $e \sim N(0, \sigma^2 I_n)$
Definition of OLS Estimator
• Define the sum of squared residuals function as:
$S(\beta) = (y - X\beta)'(y - X\beta) = y'y - 2\beta'X'y + \beta'X'X\beta$
• The OLS estimator $(\hat{\beta})$ minimizes $S(\beta)$. FONC:
$\left.\dfrac{\partial S(\beta)}{\partial \beta}\right|_{\hat{\beta}} = -2X'y + 2X'X\hat{\beta} = 0$
which yields the normal equations $X'y = X'X\hat{\beta}$.
Proposition 2 $\hat{\beta} = (X'X)^{-1}(X'y)$ is the $\arg\min S(\beta)$
Proof. Using the normal equations: $\hat{\beta} = (X'X)^{-1}(X'y)$. SOSC:
$\left.\dfrac{\partial^2 S(\beta)}{\partial \beta\, \partial \beta'}\right|_{\hat{\beta}} = 2X'X$
then $\hat{\beta}$ is a minimum as $X'X$ is a p.d. matrix.
• Important implications:
— $\hat{\beta}$ is a linear function of $y$
— $\hat{\beta}$ is a random variable (a function of $X$ and $e$)
— $X'X$ must be of full rank
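The normal equations can be sketched in matrix form (simulated data; the coefficient values are illustrative). In practice a QR-based least-squares routine is preferred over forming $(X'X)^{-1}$ explicitly:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # include a constant
beta = np.array([1.0, -0.5, 2.0])
y = X @ beta + rng.normal(size=n)

# Normal equations: (X'X) beta_hat = X'y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Matches the numerically safer least-squares routine
beta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)
assert np.allclose(beta_hat, beta_ls)
```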
Interpretation
• Define the least squares residuals $\hat{e} = y - X\hat{\beta}$ (2) and $\hat{\sigma}^2 = n^{-1}\hat{e}'\hat{e}$. Then
$y = X\hat{\beta} + \hat{e} = Py + My$, where $P = X(X'X)^{-1}X'$ and $M = I_n - P$
Proposition 3 Let $Z$ be an $n \times r$ matrix of rank $r$. A matrix of the form $P = Z(Z'Z)^{-1}Z'$ is called a projection matrix and has the following properties:
i) $P = P' = P^2$ (hence $P$ is symmetric and idempotent)
ii) $\mathrm{rank}(P) = r$
iii) The characteristic roots (eigenvalues) of $P$ consist of $r$ ones and $n - r$ zeros
iv) If $v = Zc$ for some vector $c$, then $Pv = v$ (hence the word projection)
v) $M = I_n - P$ is also idempotent with rank $n - r$; its eigenvalues consist of $n - r$ ones and $r$ zeros, and if $v = Zc$, then $Mv = 0$
vi) $P$ can be written as $HH'$, where $H'H = I_r$, or as $h_1 h_1' + h_2 h_2' + \cdots + h_r h_r'$, where each $h_i$ is a vector and $H = (h_1, \ldots, h_r)$
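These properties are easy to verify numerically; this sketch uses random matrices purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 50, 3
X = rng.normal(size=(n, k))   # full column rank almost surely
y = rng.normal(size=n)

P = X @ np.linalg.inv(X.T @ X) @ X.T  # projection onto the column space of X
M = np.eye(n) - P                     # residual-maker ("annihilator")

assert np.allclose(P, P.T) and np.allclose(P, P @ P)  # symmetric, idempotent
assert round(np.trace(P)) == k                        # trace = rank = k
assert np.allclose(M @ X, 0)                          # M annihilates Col(X)
assert np.isclose((P @ y) @ (M @ y), 0)               # Py and My are orthogonal
```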
Interpretation
$y = X\hat{\beta} + \hat{e} = Py + My$
Figure 10: Orthogonal decomposition of $y$ into $Py \in \mathrm{Col}(X)$ and $My$
The Mean of $\hat{\beta}$
Proposition 4 In the LRM, $\mathrm{E}\big[(\hat{\beta} - \beta) \mid X\big] = 0$ and $\mathrm{E}\,\hat{\beta} = \beta$
Proof. By the previous results,
$\hat{\beta} = (X'X)^{-1}X'y = (X'X)^{-1}X'(X\beta + e) = \beta + (X'X)^{-1}X'e$
Then
$\mathrm{E}\big[(\hat{\beta} - \beta) \mid X\big] = \mathrm{E}\big[(X'X)^{-1}X'e \mid X\big] = (X'X)^{-1}X'\,\mathrm{E}(e \mid X) = 0$
Applying the law of iterated expectations, $\mathrm{E}\,\hat{\beta} = \mathrm{E}\big[\mathrm{E}(\hat{\beta} \mid X)\big] = \beta$
The Variance of $\hat{\beta}$
Proposition 5 In the HLRM, $\mathrm{V}(\hat{\beta} \mid X) = \sigma^2 (X'X)^{-1}$ and $\mathrm{V}(\hat{\beta}) = \sigma^2\, \mathrm{E}\big[(X'X)^{-1}\big]$
Proof. Since $\hat{\beta} - \beta = (X'X)^{-1}X'e$,
$\mathrm{V}(\hat{\beta} \mid X) = \mathrm{E}\big[(\hat{\beta} - \beta)(\hat{\beta} - \beta)' \mid X\big] = \mathrm{E}\big[(X'X)^{-1}X'ee'X(X'X)^{-1} \mid X\big] = (X'X)^{-1}X'\,\mathrm{E}[ee' \mid X]\,X(X'X)^{-1} = \sigma^2 (X'X)^{-1}$
Thus, $\mathrm{V}(\hat{\beta}) = \mathrm{E}\big[\mathrm{V}(\hat{\beta} \mid X)\big] + \mathrm{V}\big[\mathrm{E}(\hat{\beta} \mid X)\big] = \sigma^2\, \mathrm{E}\big[(X'X)^{-1}\big]$
• Important features of $\mathrm{V}(\hat{\beta} \mid X) = \sigma^2 (X'X)^{-1}$:
— Grows proportionally with $\sigma^2$
— Decreases with the sample size
— Decreases with the volatility of $x$
The Mean and Variance of $\hat{\sigma}^2$
Proposition 6 In the LRM, $\hat{\sigma}^2$ is biased.
Proof. We know that $\hat{e} = My$. It is trivial to verify that $\hat{e} = Me$. Then, $\hat{\sigma}^2 = n^{-1}\hat{e}'\hat{e} = n^{-1}e'Me$. This implies that
$\mathrm{E}(\hat{\sigma}^2 \mid X) = n^{-1}\mathrm{E}[e'Me \mid X] = n^{-1}\mathrm{E}[\mathrm{tr}(e'Me) \mid X] = n^{-1}\mathrm{E}[\mathrm{tr}(Mee') \mid X] = n^{-1}\sigma^2\,\mathrm{tr}(M) = \sigma^2 (n - k)\, n^{-1}$
Applying the law of iterated expectations we obtain $\mathrm{E}\,\hat{\sigma}^2 = \sigma^2 (n - k)\, n^{-1}$. Unbiased estimator:
$\tilde{\sigma}^2 = (n - k)^{-1}\hat{e}'\hat{e}$
Proposition 7 In the NLRM, $\mathrm{V}(\hat{\sigma}^2) = n^{-2}\, 2(n - k)\, \sigma^4$
• Important:
— With the exception of Proposition 7, normality is not required
— $\hat{\sigma}^2$ is biased, but it is the MLE under normality and is consistent
— The variances of $\hat{\beta}$ and $\hat{\sigma}^2$ depend on $\sigma^2$. Feasible estimate: $\widehat{\mathrm{V}}(\hat{\beta}) = \tilde{\sigma}^2 (X'X)^{-1}$
$\hat{\beta}$ is BLUE
Theorem 1 (Gauss-Markov) $\hat{\beta}$ is BLUE.
Proof. Let $A = (X'X)^{-1}X'$, so $\hat{\beta} = Ay$. Consider any other linear estimator $b = (A + C)y$. Then,
$\mathrm{E}(b \mid X) = (X'X)^{-1}X'X\beta + CX\beta = (I + CX)\beta$
For $b$ to be unbiased we require $CX = 0$, then:
$\mathrm{V}(b \mid X) = \mathrm{E}\big[(A + C)ee'(A + C)' \mid X\big] = \sigma^2 (A + C)(A + C)'$
As $(A + C)(A + C)' = (X'X)^{-1} + CC'$ (using $CX = 0$), we obtain
$\mathrm{V}(b \mid X) = \mathrm{V}(\hat{\beta} \mid X) + \sigma^2 CC'$
As $CC'$ is p.s.d. we have $\mathrm{V}(b \mid X) \geq \mathrm{V}(\hat{\beta} \mid X)$
• Despite its popularity, Gauss-Markov is not very powerful
— It restricts the quest to linear and unbiased estimators
— There may be “nonlinear” or biased estimators that do better (lower MSE)
— OLS is not BLUE when homoskedasticity is relaxed
Asymptotics I
• Unbiasedness is not that useful in practice (frequentist perspective)
• It is also not common in general contexts
• Asymptotic theory: properties of estimators when sample size is infinitely large
• Cornerstones: LLN (consistency) and CLT (inference)
Definition 4 (Convergence in probability) A sequence of real or vector valued random variables $\{z_n\}$ is said to converge to $z$ in probability if
$\lim_{n \to \infty} \Pr\left(\|z_n - z\| > \epsilon\right) = 0$ for any $\epsilon > 0$
We write $z_n \xrightarrow{p} z$ or $\mathrm{plim}\, z_n = z$.
Definition 5 (Convergence in mean square) $\{z_n\}$ converges to $z$ in mean square if
$\lim_{n \to \infty} \mathrm{E}(z_n - z)^2 = 0$
We write $z_n \xrightarrow{m.s.} z$.
Definition 6 (Almost sure convergence) $\{z_n\}$ converges to $z$ almost surely if
$\Pr\left[\lim_{n \to \infty} z_n = z\right] = 1$
We write $z_n \xrightarrow{a.s.} z$.
Definition 7 The estimator $\hat{\theta}_n$ of $\theta_0$ is said to be a weakly consistent estimator if $\hat{\theta}_n \xrightarrow{p} \theta_0$.
Definition 8 The estimator $\hat{\theta}_n$ of $\theta_0$ is said to be a strongly consistent estimator if $\hat{\theta}_n \xrightarrow{a.s.} \theta_0$.
Laws of Large Numbers and Consistency of $\hat{\beta}$
Theorem 2 (WLLN1, Chebyshev) Let $\mathrm{E}(z_i) = \mu_i$, $\mathrm{V}(z_i) = \sigma_i^2$, $\mathrm{Cov}(z_i, z_j) = 0\ \forall\, i \neq j$. If $\lim_{n \to \infty} n^{-2}\sum_{i=1}^{n} \sigma_i^2 = 0$, then
$\bar{z}_n - \mathrm{E}\,\bar{z}_n \xrightarrow{p} 0$
Theorem 3 (SLLN1, Kolmogorov) Let $\{z_i\}$ be independent with finite variance $\mathrm{V}(z_i) = \sigma_i^2 < \infty$. If $\sum_{i=1}^{\infty} \sigma_i^2 / i^2 < \infty$, then $\bar{z}_n - \mathrm{E}\,\bar{z}_n \xrightarrow{a.s.} 0$
• Assume that $n^{-1}X'X \xrightarrow{p} Q$ (invertible and nonstochastic). Then
$\hat{\beta} - \beta = (X'X)^{-1}X'e = \left(n^{-1}X'X\right)^{-1}\left(n^{-1}X'e\right) \xrightarrow{p} Q^{-1} \cdot 0 = 0$
• $\hat{\beta}$ is consistent: $\hat{\beta} \xrightarrow{p} \beta$
Analysis of Variance (ANOVA)
$y = X\hat{\beta} + \hat{e} \quad \Rightarrow \quad y - \iota\bar{y} = \left(X\hat{\beta} - \iota\bar{y}\right) + \hat{e}$
$\left(y - \iota\bar{y}\right)'\left(y - \iota\bar{y}\right) = \left(X\hat{\beta} - \iota\bar{y}\right)'\left(X\hat{\beta} - \iota\bar{y}\right) + 2\left(X\hat{\beta} - \iota\bar{y}\right)'\hat{e} + \hat{e}'\hat{e}$
but $X'\hat{e} = 0$ and $\iota'\hat{e} = 0$, so the cross term vanishes. Thus
$\left(y - \iota\bar{y}\right)'\left(y - \iota\bar{y}\right) = \left(X\hat{\beta} - \iota\bar{y}\right)'\left(X\hat{\beta} - \iota\bar{y}\right) + \hat{e}'\hat{e}$
This is called the ANOVA formula, often written as
$TSS = ESS + RSS$
$R^2 = \dfrac{ESS}{TSS} = 1 - \dfrac{RSS}{TSS} = 1 - \dfrac{\hat{e}'\hat{e}}{y^{*\prime}y^{*}}$
where $y^{*} = y - \iota\bar{y} = \left(I - n^{-1}\iota\iota'\right)y$. If the regressors include a constant, $0 \leq R^2 \leq 1$.
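The decomposition can be checked numerically (simulated data, illustrative coefficients); note the identity requires a constant among the regressors:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 150
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # constant included
y = X @ np.array([2.0, 1.5]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
yhat = X @ beta_hat
ehat = y - yhat

tss = ((y - y.mean()) ** 2).sum()
ess = ((yhat - y.mean()) ** 2).sum()
rss = (ehat ** 2).sum()

assert np.isclose(tss, ess + rss)  # ANOVA identity
r2 = 1 - rss / tss
assert 0.0 <= r2 <= 1.0
```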
Analysis of Variance (ANOVA)
• $R^2$ measures the share of the variation of $y$ accounted for by the variation of $X\hat{\beta}$
• It is not a “measure” of “goodness” of fit
• It doesn’t explain anything
• It is not even clear that $R^2$ has an interpretation in terms of forecast performance
Model 1: $y_i = \beta x_i + e_i$. Model 2: $y_i - x_i = \gamma x_i + e_i$, with $\gamma = \beta - 1$
• Mathematically identical: they yield the same implications and forecasts
• Yet the reported $R^2$ will differ greatly
• Suppose $\beta \simeq 1$. Second model: $R^2 \simeq 0$; the first model’s $R^2$ can be arbitrarily close to one
• $R^2$ increases as regressors are added. Theil proposed:
$\bar{R}^2 = 1 - \dfrac{n-1}{n-k}\left(1 - R^2\right) = 1 - \dfrac{\tilde{\sigma}^2}{\tilde{\sigma}_y^2}$
• Not used that much today, as better model evaluation criteria have been developed
OLS Estimator of a Subset of $\beta$
Partition $X = \begin{bmatrix} X_1 & X_2 \end{bmatrix}$, $\beta = \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix}$. Then $X'X\hat{\beta} = X'y$ can be written as:
$X_1'X_1\hat{\beta}_1 + X_1'X_2\hat{\beta}_2 = X_1'y \quad (3a)$
$X_2'X_1\hat{\beta}_1 + X_2'X_2\hat{\beta}_2 = X_2'y \quad (3b)$
Solving for $\hat{\beta}_2$ and reinserting in (3a) we obtain
$\hat{\beta}_1 = \left(X_1'M_2X_1\right)^{-1}X_1'M_2y, \qquad \hat{\beta}_2 = \left(X_2'M_1X_2\right)^{-1}X_2'M_1y$
where $M_j = I - P_j = I - X_j(X_j'X_j)^{-1}X_j'$ (for $j = 1, 2$).
Theorem 4 (Frisch-Waugh-Lovell) $\hat{\beta}_2$ and $\hat{e}$ can be computed using the following algorithm:
1. Regress $y$ on $X_1$, obtain residuals $\tilde{y}$
2. Regress $X_2$ on $X_1$, obtain residuals $\tilde{X}_2$
3. Regress $\tilde{y}$ on $\tilde{X}_2$, obtain $\hat{\beta}_2$ and residuals $\hat{e}$
FWL was used to speed computation
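The three-step algorithm above can be verified numerically (simulated data; names and coefficient values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 120
X1 = np.column_stack([np.ones(n), rng.normal(size=n)])
X2 = rng.normal(size=(n, 2))
X = np.hstack([X1, X2])
y = X @ np.array([1.0, 0.5, -1.0, 2.0]) + rng.normal(size=n)

def ols(A, b):
    return np.linalg.solve(A.T @ A, A.T @ b)

# Full regression: last two coefficients are beta_2
b_full = ols(X, y)[2:]

# FWL: residualize y and X2 on X1 (apply M1), then regress residuals
M1 = np.eye(n) - X1 @ np.linalg.inv(X1.T @ X1) @ X1.T
b_fwl = ols(M1 @ X2, M1 @ y)

assert np.allclose(b_full, b_fwl)  # same beta_2 from either route
```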
Application of FWL: (Demeaning)
Partition $X = \begin{bmatrix} X_1 & X_2 \end{bmatrix}$ where $X_1 = \iota$ and $X_2$ is the matrix of observed regressors.
$\tilde{X}_2 = M_1X_2 = X_2 - \iota(\iota'\iota)^{-1}\iota'X_2$
$\tilde{y} = M_1y = y - \iota(\iota'\iota)^{-1}\iota'y = y - \iota\bar{y}$
FWL states that $\hat{\beta}_2$ is the OLS estimate from the regression of $\tilde{y}$ on $\tilde{X}_2$:
$\hat{\beta}_2 = \left(\sum_{i=1}^{n} \tilde{x}_{2i}\tilde{x}_{2i}'\right)^{-1}\left(\sum_{i=1}^{n} \tilde{x}_{2i}\tilde{y}_i\right)$
Thus the OLS estimator for the slope coefficients is a regression with demeaned data.
Constrained Least Squares (CLS)
Assume the following constraint must hold:
$R'\beta = q \quad (4)$
($R$: $k \times r$ matrix of known constants; $q$: $r$-vector of known constants; $r < k$, $\mathrm{rank}(R) = r$). The CLS estimator of $\beta$, $\bar{\beta}$, is the value of $\beta$ that minimizes $S(\beta)$ subject to (4).
$\mathcal{L}(\beta, \lambda) = (y - X\beta)'(y - X\beta) + 2\lambda'(R'\beta - q)$
$\lambda$ is an $r$-vector of Lagrange multipliers. FONC:
$\left.\dfrac{\partial \mathcal{L}}{\partial \beta}\right|_{\bar{\beta}} = -2X'y + 2X'X\bar{\beta} + 2R\bar{\lambda} = 0$
$\left.\dfrac{\partial \mathcal{L}}{\partial \lambda}\right|_{\bar{\lambda}} = R'\bar{\beta} - q = 0$
$\bar{\beta} = \hat{\beta} - (X'X)^{-1}R\left[R'(X'X)^{-1}R\right]^{-1}\left(R'\hat{\beta} - q\right) \quad (5)$
$\bar{\sigma}^2 = n^{-1}\left(y - X\bar{\beta}\right)'\left(y - X\bar{\beta}\right)$
If the constraints are true, $\bar{\beta}$ is BLUE
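Equation (5) can be sketched directly; the particular constraint imposed here ($\beta_2 + \beta_3 = 1$) and the simulated data are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 0.7, 0.3]) + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
b_ols = XtX_inv @ X.T @ y

# Illustrative constraint: beta_2 + beta_3 = 1, i.e. R'beta = q
R = np.array([[0.0], [1.0], [1.0]])   # k x r, here r = 1
q = np.array([1.0])

# Equation (5): constrained least squares estimator
b_cls = b_ols - XtX_inv @ R @ np.linalg.inv(R.T @ XtX_inv @ R) @ (R.T @ b_ols - q)

assert np.allclose(R.T @ b_cls, q)  # constraint holds exactly
```

By construction the constrained fit cannot have a smaller residual sum of squares than the unconstrained one.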
Inference
• Up to now, the properties of the estimators did not depend on the distribution of $e$
• Consider the NLRM with $e \sim N(0, \sigma^2 I_n)$. Then:
$y \mid X \sim N\left(X\beta, \sigma^2 I_n\right)$
• On the other hand, as $\hat{\beta} = (X'X)^{-1}X'y$, then:
$\hat{\beta} \mid X \sim N\left(\beta, \sigma^2(X'X)^{-1}\right)$
• However, as $\hat{\beta} \xrightarrow{p} \beta$, it also converges in distribution to a degenerate distribution
• Thus, we require something more to conduct inference
• Next, we discuss finite (exact) and large sample distributions of estimators to test hypotheses
• Components:
— Null hypothesis H0
— Alternative hypothesis H1
— Test statistic (one tail, two tails)
— Rejection region
— Conclusion
Inference with Linear Constraints (normality)
$H_0: R'\beta = q \qquad H_1: R'\beta \neq q$
The $t$ Test ($r = 1$). Assume $e$ is normal; under the null hypothesis:
$R'\hat{\beta} \sim N\left(q, \sigma^2 R'(X'X)^{-1}R\right)$
$\dfrac{R'\hat{\beta} - q}{\left[\sigma^2 R'(X'X)^{-1}R\right]^{1/2}} \sim N(0, 1) \quad (6)$
This test statistic is used when $\sigma$ is known. If not, recall
$\dfrac{\hat{e}'\hat{e}}{\sigma^2} \sim \chi^2_{n-k} \quad (7)$
As (6) and (7) are independent:
$t = \dfrac{R'\hat{\beta} - q}{\left[\tilde{\sigma}^2 R'(X'X)^{-1}R\right]^{1/2}} \sim t_{n-k}$
(6) holds (asymptotically) even when normality of $e$ is not present.
If $H_0: \beta_1 = 0$, define $R = \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix}'$, $q = 0$:
$t = \dfrac{\hat{\beta}_1}{\sqrt{\widehat{\mathrm{V}}_{11}}}$
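A sketch of a slope $t$ test on simulated data (the true coefficients are illustrative); the $p$-value uses the normal approximation $2(1 - \Phi(|t|))$ rather than the exact $t_{n-k}$ tail:

```python
import numpy as np
from math import erfc, sqrt

rng = np.random.default_rng(7)
n, k = 100, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([0.5, 1.0]) + rng.normal(size=n)   # true slope = 1

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
ehat = y - X @ b
s2_tilde = ehat @ ehat / (n - k)   # unbiased estimate of sigma^2
V_hat = s2_tilde * XtX_inv         # estimated V(beta_hat | X)

t = b[1] / np.sqrt(V_hat[1, 1])    # H0: slope = 0
pval = erfc(abs(t) / sqrt(2.0))    # 2 * (1 - Phi(|t|)), normal approximation

assert 0.0 <= pval <= 1.0
```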
Inference with Linear Constraints (normality)
• Confidence interval:
$\Pr\left[\hat{\beta} - t_{\alpha/2}\sqrt{\widehat{\mathrm{V}}} \leq \beta \leq \hat{\beta} + t_{\alpha/2}\sqrt{\widehat{\mathrm{V}}}\right] = 1 - \alpha$
• Tail probability, or probability value ($p$-value) function:
$p = p(t) = \Pr\left(|T| \geq |t|\right) = 2\left(1 - \Phi(|t|)\right)$
Reject the null when the $p$-value is less than or equal to $\alpha$
• Confidence interval for $\sigma^2$:
$\Pr\left[\dfrac{(n-k)\,\tilde{\sigma}^2}{\chi^2_{n-k,\,1-\alpha/2}} \leq \sigma^2 \leq \dfrac{(n-k)\,\tilde{\sigma}^2}{\chi^2_{n-k,\,\alpha/2}}\right] = 1 - \alpha \quad (8)$
The $F$ Test (normality)
1. Under the null:
$\dfrac{S(\bar{\beta}) - S(\hat{\beta})}{\sigma^2} \sim \chi^2_r$
When $\sigma^2$ is not known, replace $\sigma^2$ with $\tilde{\sigma}^2$ and obtain
$F = \dfrac{S(\bar{\beta}) - S(\hat{\beta})}{r\,\tilde{\sigma}^2} = \dfrac{n-k}{r} \cdot \dfrac{\left(R'\hat{\beta} - q\right)'\left[R'(X'X)^{-1}R\right]^{-1}\left(R'\hat{\beta} - q\right)}{\hat{e}'\hat{e}} \sim F_{r,\,n-k} \quad (9)$
As with $t$ tests, reject the null when the computed value exceeds the critical value
36
OLS
8/19/2019 Tema I (Mínimos Cuadrados Ordinarios)
38/49
Asymptotics II
• How to conduct inference when $e$ is not necessarily normal?
Figure 11: Convergence in distribution
37
CLT
Definition 9 (Convergence in distribution) $\{z_n\}$ is said to converge to $z$ in distribution if the distribution function $F_n$ of $z_n$ converges to the distribution $F$ of $z$ at every continuity point of $F$. We write $z_n \xrightarrow{d} z$ and we call $F$ the limiting distribution of $\{z_n\}$. If $\{x_n\}$ and $\{y_n\}$ have the same limiting distribution, we write $x_n \stackrel{LD}{=} y_n$.
Theorem 5 (CLT1, Lindeberg-Lévy) Let $\{z_i\}$ be i.i.d. with $\mathrm{E}(z_i) = \mu$ and $\mathrm{V}(z_i) = \sigma^2$. Then
$w_n = \dfrac{\bar{z}_n - \mu}{\left[\mathrm{V}(\bar{z}_n)\right]^{1/2}} = \dfrac{\sqrt{n}\left(\bar{z}_n - \mu\right)}{\sigma} \xrightarrow{d} N(0, 1)$
• Assume that $n^{-1}X'X \xrightarrow{p} Q$ (invertible and nonstochastic) and that $n^{-1/2}X'e \xrightarrow{d} N\left(0, \sigma^2 Q\right)$. Then
$\sqrt{n}\left(\hat{\beta} - \beta\right) = \left(n^{-1}X'X\right)^{-1}\left(n^{-1/2}X'e\right) \xrightarrow{d} N\left(0, \sigma^2 Q^{-1}\right)$
• Thus, under the HLRM, the asymptotic distribution does not depend on the distribution of $e$
• Normal vs. $t$ test / $\chi^2$ vs. $F$ test
Tests for Structural Breaks
Suppose we have a two-regime regression:
$y_1 = X_1\beta_1 + e_1$
$y_2 = X_2\beta_2 + e_2$
$\mathrm{E}\begin{bmatrix} e_1 \\ e_2 \end{bmatrix}\begin{bmatrix} e_1' & e_2' \end{bmatrix} = \begin{bmatrix} \sigma_1^2 I_{n_1} & 0 \\ 0 & \sigma_2^2 I_{n_2} \end{bmatrix}$
$H_0: \beta_1 = \beta_2$
Assume $\sigma_1 = \sigma_2$. Define $y = X\beta + e$ with
$y = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}, \quad X = \begin{bmatrix} X_1 & 0 \\ 0 & X_2 \end{bmatrix}, \quad \beta = \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix}, \quad e = \begin{bmatrix} e_1 \\ e_2 \end{bmatrix}$
Applying (9) we obtain:
$\dfrac{n_1 + n_2 - 2k}{k} \cdot \dfrac{\left(\hat{\beta}_1 - \hat{\beta}_2\right)'\left[(X_1'X_1)^{-1} + (X_2'X_2)^{-1}\right]^{-1}\left(\hat{\beta}_1 - \hat{\beta}_2\right)}{y'\left[I - X(X'X)^{-1}X'\right]y} \sim F_{k,\,n_1+n_2-2k} \quad (10)$
where $\hat{\beta}_1 = (X_1'X_1)^{-1}X_1'y_1$ and $\hat{\beta}_2 = (X_2'X_2)^{-1}X_2'y_2$.
Tests for Structural Breaks
The same result can be derived as follows. Define, under the alternative (structural change),
$S(\hat{\beta}) = y'\left[I - X(X'X)^{-1}X'\right]y$
and, under the null hypothesis,
$S(\bar{\beta}) = y'\left[I - Z(Z'Z)^{-1}Z'\right]y, \qquad Z = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}$
$\dfrac{n_1 + n_2 - 2k}{k} \cdot \dfrac{S(\bar{\beta}) - S(\hat{\beta})}{S(\hat{\beta})} \sim F_{k,\,n_1+n_2-2k} \quad (11)$
An unbiased estimate of $\sigma^2$ is $\tilde{\sigma}^2 = S(\hat{\beta}) / (n_1 + n_2 - 2k)$
Chow tests are popular, but modern practice is skeptical. Recent theoretical and empirical applications treat the period of the possible break as an endogenous latent variable.
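The restricted-vs-unrestricted form of the Chow test can be sketched as follows; the regimes are simulated with the same coefficients, so the null is true by construction (all values illustrative):

```python
import numpy as np

rng = np.random.default_rng(8)
n1, n2, k = 80, 70, 2

def simulate(n, beta):
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    return X, X @ beta + rng.normal(size=n)

X1, y1 = simulate(n1, np.array([1.0, 2.0]))
X2, y2 = simulate(n2, np.array([1.0, 2.0]))   # same beta: H0 true

def rss(X, y):
    b = np.linalg.solve(X.T @ X, X.T @ y)
    return ((y - X @ b) ** 2).sum()

# Unrestricted: separate regressions; restricted: pooled (stacked) regression
S_unr = rss(X1, y1) + rss(X2, y2)
S_res = rss(np.vstack([X1, X2]), np.concatenate([y1, y2]))

F = ((S_res - S_unr) / k) / (S_unr / (n1 + n2 - 2 * k))  # compare with F(k, n1+n2-2k)
```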
Prediction
• Out-of-sample prediction of $y_s$ (for $s > n$) is not easy. In that period: $y_s = x_s'\beta + e_s$
• Types of uncertainty:
— Unpredictable component
— Parameter uncertainty
— Uncertainty about $x_s$
— Specification uncertainty
• Types of forecasts
— Point forecast
— Interval forecast
— Density forecast
• Active area of research
Prediction
• If the HLRM holds, the predictor that minimizes the MSE is $\hat{y}_s = x_s'\hat{\beta}$
• Given $x_s$, the mean squared prediction error is
$\mathrm{E}\left[\left(\hat{y}_s - y_s\right)^2 \mid X\right] = \sigma^2\left[1 + x_s'(X'X)^{-1}x_s\right]$
• To construct an estimator of the variance of the forecast error, substitute $\tilde{\sigma}^2$ for $\sigma^2$
• You may think that a confidence interval forecast could be formulated as:
$\Pr\left[\hat{y}_s - t_{\alpha/2}\sqrt{\widehat{\mathrm{V}}_s} \leq y_s \leq \hat{y}_s + t_{\alpha/2}\sqrt{\widehat{\mathrm{V}}_s}\right] = 1 - \alpha$
WRONG. Notice that
$\dfrac{y_s - \hat{y}_s}{\sqrt{\sigma^2\left[1 + x_s'(X'X)^{-1}x_s\right]}} = \dfrac{e_s + x_s'\left(\beta - \hat{\beta}\right)}{\sqrt{\sigma^2\left[1 + x_s'(X'X)^{-1}x_s\right]}}$
This ratio does not have a discernible limiting distribution (unless $e$ is normal). We didn’t need to impose normality for all the previous results (at least asymptotically).
We assumed that the econometrician knew $x_s$. If $x_s$ is stochastic and not known at $n$, the MSE could be seriously underestimated.
Prediction
Figure 12: Forecasting
Measures of predictive accuracy of forecasting models
$\mathrm{RMSE} = \sqrt{\dfrac{1}{s}\sum_{t=1}^{s}\left(y_t - \hat{y}_t\right)^2}$
$\mathrm{MAE} = \dfrac{1}{s}\sum_{t=1}^{s}\left|y_t - \hat{y}_t\right|$
Theil statistic:
$U = \sqrt{\dfrac{\sum_{t=1}^{s}\left(y_t - \hat{y}_t\right)^2}{\sum_{t=1}^{s} y_t^2}}$
$U_{\Delta} = \sqrt{\dfrac{\sum_{t=1}^{s}\left(\Delta y_t - \Delta\hat{y}_t\right)^2}{\sum_{t=1}^{s}\left(\Delta y_t\right)^2}}$
$\Delta y_t = y_t - y_{t-1}$ and $\Delta\hat{y}_t = \hat{y}_t - y_{t-1}$, or, in percentage changes,
$\Delta y_t = \dfrac{y_t - y_{t-1}}{y_{t-1}}$ and $\Delta\hat{y}_t = \dfrac{\hat{y}_t - y_{t-1}}{y_{t-1}}$
These measures will reflect the model’s ability to track turning points in the data
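A small sketch of these measures on toy actuals and forecasts (the numbers are illustrative):

```python
import numpy as np

y    = np.array([1.0, 1.2, 1.1, 1.4, 1.3, 1.6])   # actuals (toy data)
yhat = np.array([1.0, 1.1, 1.2, 1.3, 1.4, 1.5])   # forecasts (toy data)

err = y - yhat
rmse = np.sqrt((err ** 2).mean())
mae = np.abs(err).mean()
U = np.sqrt((err ** 2).sum() / (y ** 2).sum())     # Theil statistic

# Change-based variant: penalizes missed turning points
dy = np.diff(y)                 # actual changes y_t - y_{t-1}
dyhat = yhat[1:] - y[:-1]       # forecast changes yhat_t - y_{t-1}
U_delta = np.sqrt(((dy - dyhat) ** 2).sum() / (dy ** 2).sum())
```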
Evaluation
• When comparing 2 models, is one model really better than the other?
• Diebold-Mariano: a framework for comparing models. Define the loss differential
$d_t = L\left(\hat{e}_{1t}\right) - L\left(\hat{e}_{2t}\right)$
$DM = \dfrac{\bar{d}}{\sqrt{\widehat{\mathrm{V}}(\bar{d})}} \xrightarrow{d} N(0, 1)$
• Harvey, Leybourne, and Newbold (HLN): correct size distortions and use Student’s $t$:
$HLN = DM \cdot \left[\dfrac{s + 1 - 2h + s^{-1}h(h-1)}{s}\right]^{1/2}$
where $s$ is the number of forecasts and $h$ the forecast horizon
Finite Samples
• Statistical properties of most methods: known only asymptotically
• “Exact” finite sample theory can rarely be used to interpret estimates or test statistics
• Are theoretical properties reasonably good approximations for the problem at hand?
• How to proceed in these cases?
• Monte Carlo experiments and bootstrap
Monte Carlo Experiments
• Often used to analyze finite sample properties of estimators or test statistics
• Quantities approximated by generating many pseudo-random realizations of a stochastic process and averaging them
— Model and estimators or tests associated with the model. Objective: assess small sample properties.
— DGP: special case of the model. Specify “true” values of parameters, laws of motion of variables, and distributions of r.v.
— Experiment: a number of replications or samples, generating artificial samples of data according to the DGP and calculating the estimates or test statistics of interest
— After the replications, we have an equal number of estimates, which are subjected to statistical analysis
— Experiments may be performed by changing the sample size, values of parameters, etc. Response surfaces.
• Monte Carlo experiments are random. It is essential to perform enough replications so results are sufficiently accurate. Critical values, etc.
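A minimal Monte Carlo experiment in this spirit, checking the small-sample bias of $\hat{\sigma}^2$ derived earlier (design, parameter values, and replication count are illustrative):

```python
import numpy as np

rng = np.random.default_rng(9)
n, k, reps = 20, 2, 5000
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # fixed design
beta = np.array([1.0, 0.5])

s2_hat = np.empty(reps)
for r in range(reps):
    y = X @ beta + rng.normal(size=n)        # DGP with sigma^2 = 1
    b = np.linalg.solve(X.T @ X, X.T @ y)
    ehat = y - X @ b
    s2_hat[r] = ehat @ ehat / n              # biased estimator sigma_hat^2

# MC average should be near sigma^2 * (n - k) / n = 0.9, not 1.0
assert abs(s2_hat.mean() - (n - k) / n) < 0.02
```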
Bootstrap Resampling
• The bootstrap views the observed sample as a population
• The distribution function for this population is the EDF of the sample, and parameter estimates based on the observed sample are treated as the actual model parameters
• Conceptually: examine properties of estimators or test statistics in repeated samples drawn from a tangible data-sampling process that mimics the actual DGP
• The bootstrap does not represent the exact finite sample properties of estimators and test statistics under the actual DGP, but provides an approximation that improves as the size of the observed sample increases
• Reasons for acceptance in recent years:
— Avoids most of the strong distributional assumptions required in Monte Carlo
— Like Monte Carlo, it may be used to solve intractable estimation and inference problems by computation rather than reliance on asymptotic approximations, which may be very complicated in nonstandard problems
— Bootstrap approximations are often equivalent to first-order asymptotic results, and may dominate them in some cases.
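A sketch of this idea: a pairs bootstrap for the standard error of an OLS slope, resampling $(x_i, y_i)$ pairs from the observed sample with replacement (data and replication count are illustrative):

```python
import numpy as np

rng = np.random.default_rng(10)
n, B = 100, 2000
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

def slope(xv, yv):
    return np.polyfit(xv, yv, 1)[0]

# Treat the sample as the population: resample pairs with replacement
boot = np.empty(B)
for b in range(B):
    idx = rng.integers(0, n, size=n)
    boot[b] = slope(x[idx], y[idx])

se_boot = boot.std(ddof=1)   # bootstrap standard error of the slope
```

Resampling pairs (rather than residuals) keeps the empirical joint distribution of $(x, y)$ intact, so it does not rely on homoskedasticity.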