INTRODUCTION TO PANEL DATA ANALYSIS USING EVIEWS
FARIDAH NAJUNA MISMAN, PhD
FINANCE DEPARTMENT
FACULTY OF BUSINESS & MANAGEMENT
UiTM JOHOR
PANEL DATA WORKSHOP-23&24 MAY 2017 1
OUTLINE
1. Introduction
2. CLRM Assumptions
3. Static Panel Data Models
4. Getting Started with EViews 9
5. Data Analysis
6. Reading The Results
1. INTRODUCTION
There are three types of data structure:
1. Time series data are collected at regular time intervals, such as every month or every year. (N=1, t=1,…,T)
• Usually this represents the values for a single firm or a single variable at different points in time.
• Most macroeconomic data for real variables, e.g. GDP or consumption, are quarterly time series.
• Data for monetary variables such as interest rates are often monthly time series.
2. Cross-sectional data are the values for many different firms or households, collected at a single point in time. (i=1,…,N, T=1)
3. Panel data combine the other two: we have values for all members of a panel or group of firms or households, measured at more than one period in time. (i=1,…,N, t=1,…,T)
1. INTRODUCTION
Classical panel data: N>T, also known as a short or micro panel
Macro panel: T>N, also known as a long panel
Balanced panel: data are available for all cross-sections in all periods.
Number of observations: n = NT
Unbalanced panel: T differs across individuals. (Note: EViews can read unbalanced panels; the output in Section 6 is labelled "Total panel (unbalanced) observations".)
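The balanced/unbalanced distinction can be checked mechanically. The following is a minimal sketch in Python with pandas (not part of the EViews workflow in this workshop); the toy data, the column names `firm`, `year`, `y` and the helper `describe_panel` are hypothetical and only serve as illustration. It assumes each (unit, period) pair appears at most once.

```python
import pandas as pd

# Hypothetical toy panel: 3 firms observed over up to 3 years.
rows = [
    ("A", 2009, 1.2), ("A", 2010, 1.4), ("A", 2011, 1.1),
    ("B", 2009, 0.7), ("B", 2010, 0.9), ("B", 2011, 1.0),
    ("C", 2009, 2.3), ("C", 2010, 2.1),  # firm C is missing 2011
]
panel = pd.DataFrame(rows, columns=["firm", "year", "y"])


def describe_panel(df, unit="firm", time="year"):
    """Return N, T, n and whether the panel is balanced (n == N * T).

    Assumes there are no duplicate (unit, time) rows.
    """
    n_units = df[unit].nunique()     # N: number of cross-sections
    n_periods = df[time].nunique()   # T: number of distinct periods
    n_obs = len(df)                  # n: total observations
    balanced = n_obs == n_units * n_periods
    return {"N": n_units, "T": n_periods, "n": n_obs, "balanced": balanced}


unbalanced_info = describe_panel(panel)
balanced_info = describe_panel(panel[panel["firm"] != "C"])
print(unbalanced_info)
print(balanced_info)
```

Dropping firm C leaves every remaining firm observed in every year, so the second call satisfies n = NT while the first does not.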
1. INTRODUCTION
The choice of econometric model depends on the type of data:
1. Least squares regression: normally applied to cross-section data sets (e.g. Ordinary Least Squares, OLS).
2. Time-series models: normally applied to time series data, to uncover long-run relations and short-run dynamics.
3. Panel data modelling: normally used to capture heterogeneity across samples and to obtain a bigger sample size.
❖ Static panel data models: pooled OLS (POLS), fixed effects (FE), random effects (RE), between effects (BE)
❖ Dynamic panel data: GMM
❖ Panel unit root and cointegration (macro panel)
1. INTRODUCTION
Advantages & Disadvantages
Panel data allow us to control for variables we cannot observe or measure, such as:
❖ Time-invariant factors like geographical area or firm management characteristics.
❖ Variables that change over time but not across entities, like national policies, federal regulations and international agreements.
In other words, panel data can account for individual heterogeneity (uniqueness), which results in more efficient estimates.
1. INTRODUCTION
Advantages:
i. Larger sample size, more variation and less collinearity, which increase the precision of the estimates
ii. Ability to study dynamics: repeated cross-sectional observations show adjustment over time
iii. Ability to account for heterogeneity across individuals, which is often ignored in pooled data; estimates are more robust to misspecification due to omitted variables
Disadvantages:
i. Data availability and maintenance
ii. Measurement errors
iii. Self-selection bias
1. INTRODUCTION
Why Analyse Panel Data?
We are interested in describing change over time:
• social change, e.g. changing attitudes, behaviours, social relationships
• individual growth or development, e.g. life-course studies, child development, career trajectories, school achievement
• occurrence (or non-occurrence) of events
We want superior estimates of trends in social phenomena:
• Panel models can be used to inform policy, e.g. health, obesity
• Multiple observations on each unit can provide superior estimates compared to cross-sectional models of association
We want to estimate causal models:
• Policy evaluation
• Estimation of treatment effects
1. INTRODUCTION
What kind of data are required for panel analysis?
Basic panel methods require at least two “waves” of measurement. Consider student GPAs and job hours during two semesters of college
One way to organize the panel data is to create a single record for each combination of unit and time period
Notice that the data include:
A time-invariant unique identifier for each unit (StudentID)
A time-varying outcome (GPA)
An indicator for time (Semester)
Panel datasets can include other time-varying or time-invariant variables
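The record layout described above (one row per unit and time period) can be sketched as follows in Python with pandas. The numbers are hypothetical values invented for illustration, not the workshop data, and the column name `JobHours` is an assumed name for the job-hours variable.

```python
import pandas as pd

# Hypothetical values for illustration only; not taken from the workshop data.
long_df = pd.DataFrame({
    "StudentID": [1, 1, 2, 2, 3, 3],      # time-invariant unit identifier
    "Semester":  [1, 2, 1, 2, 1, 2],      # indicator for time (two waves)
    "GPA":       [3.1, 3.4, 2.8, 2.7, 3.6, 3.9],  # time-varying outcome
    "JobHours":  [10, 5, 20, 25, 0, 0],   # another time-varying variable
})

# One record per (unit, time) combination: index by both identifiers.
panel = long_df.set_index(["StudentID", "Semester"]).sort_index()

# The same GPA data in "wide" form: one row per student, one column per wave.
wide_gpa = long_df.pivot(index="StudentID", columns="Semester", values="GPA")

# Change in GPA between the two waves for each student.
gpa_change = wide_gpa[2] - wide_gpa[1]

print(panel)
print(gpa_change.round(2))
```

The long form is the layout panel estimators generally expect; the wide form is shown only because it makes the within-student change easy to compute.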
2. CLASSICAL LINEAR REGRESSION MODEL (CLRM)
Table taken from page 37 of “Applied Econometrics”, Asteriou & Hall, 2nd ed., 2011, Palgrave Macmillan.
3. PANEL DATA MODEL: POOLED OLS
Pooled OLS
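$y_{it} = \beta_0 + \beta X_{it} + \alpha_i + v_{it}$
In the assumptions below, $j$ indexes the observations on unit $i$ (the role played by $t$ in the model).
i. $\alpha_i$ and $v_{ij}$ are normally distributed and mutually independent,
ii. $E(\alpha_i) = E(v_{ij}) = 0$, for $i = 1,\dots,m$, $j = 1,2,\dots,m(i)$,
iii. $E(\alpha_i \alpha_{i'}) = \sigma_1^2$ if $i = i'$, and $0$ otherwise,
iv. $E(v_{ij} v_{i'j'}) = \sigma_2^2$ if $i = i'$ and $j = j'$, and $0$ otherwise.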
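To make the model concrete, here is a minimal simulation sketch in Python with NumPy (an illustration outside the EViews workflow). It draws data that satisfy the assumptions above and then estimates the coefficients by pooled OLS, i.e. by stacking all observations and ignoring the panel structure. The parameter values, sample sizes and the independence of the regressor from $\alpha_i$ are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

N, T = 200, 5                  # units and periods (balanced, n = N * T)
beta0, beta = 1.0, 2.0         # assumed "true" parameters for the simulation
sigma1, sigma2 = 1.0, 1.0      # std. dev. of alpha_i and v_it

alpha = rng.normal(0.0, sigma1, size=N)      # unit effect, one draw per i
x = rng.normal(0.0, 1.0, size=(N, T))        # regressor, independent of alpha
v = rng.normal(0.0, sigma2, size=(N, T))     # idiosyncratic error
y = beta0 + beta * x + alpha[:, None] + v    # y_it = b0 + b*x_it + a_i + v_it

# Pooled OLS: regress the stacked y on a constant and the stacked x.
X = np.column_stack([np.ones(N * T), x.ravel()])
coef, *_ = np.linalg.lstsq(X, y.ravel(), rcond=None)
beta0_hat, beta_hat = coef

print(f"beta0_hat = {beta0_hat:.3f}, beta_hat = {beta_hat:.3f}")
```

Because the simulated $\alpha_i$ is independent of the regressor, pooled OLS recovers the slope well here; if $\alpha_i$ were correlated with $X_{it}$, the pooled estimate would be biased.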
4. GETTING STARTED WITH EViews 9
5. DATA ANALYSIS
DESCRIPTIVE STATISTICS
CORRELATION ANALYSIS
POOLED OLS REGRESSION
NORMALITY TEST
DUMMY VARIABLES
6. READING THE RESULTS
Dependent Variable: CR
Method: Panel Least Squares
Date: 05/23/17 Time: 17:06
Sample (adjusted): 1996 2011
Periods included: 16
Cross-sections included: 17
Total panel (unbalanced) observations: 85

Variable    Coefficient    Std. Error    t-Statistic    Prob.
C            12.83313      2.387841       5.374368      0.0000
FE           -0.160617     0.039199      -4.097434      0.0001
FQ            2.032662     0.380137       5.347179      0.0000
CB            0.362423     0.185213       1.956787      0.0539
CAPR         -0.203388     0.075746      -2.685126      0.0088

R-squared             0.371546    Mean dependent var       6.020596
Adjusted R-squared    0.340123    S.D. dependent var       5.639222
S.E. of regression    4.580898    Akaike info criterion    5.938690
Sum squared resid     1678.770    Schwarz criterion        6.082375
Log likelihood       -247.3943    Hannan-Quinn criter.     5.996484
F-statistic           11.82412    Durbin-Watson stat       0.735389
Prob(F-statistic)     0.000000

Notes on the output:
• Periods included: the time included (T).
• Cross-sections included: the total number of groups (N).
• Total panel observations: n = NT in a balanced panel. This panel is unbalanced, so n = 85 is smaller than 16 × 17 = 272.
• C: the constant.
• Prob(F-statistic): if this number is < 0.05, the model is ok. This is the F test of whether the slope coefficients in the model are jointly different from zero.
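As a rough cross-check, the F-statistic of an OLS regression with a constant can be recomputed from R-squared, the number of observations and the number of slope coefficients. The sketch below is in Python rather than EViews and uses only the figures reported in the output; treating k = 4 (FE, FQ, CB, CAPR) as the number of restrictions is an assumption that holds for this specification without additional effects.

```python
# Values copied from the regression output.
r_squared = 0.371546   # R-squared
n_obs = 85             # total panel observations
k = 4                  # slope coefficients: FE, FQ, CB, CAPR (constant excluded)

# F = (R^2 / k) / ((1 - R^2) / (n - k - 1))
df_resid = n_obs - k - 1
f_stat = (r_squared / k) / ((1.0 - r_squared) / df_resid)

print(f"F({k}, {df_resid}) = {f_stat:.4f}")
```

The result agrees with the reported F-statistic of 11.82412 up to rounding of the inputs.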
Variable    Coefficient    Std. Error    t-Statistic    Prob.
C            12.83313      2.387841       5.374368      0.0000
FE           -0.160617     0.039199      -4.097434      0.0001
FQ            2.032662     0.380137       5.347179      0.0000
CB            0.362423     0.185213       1.956787      0.0539
CAPR         -0.203388     0.075746      -2.685126      0.0088

Coefficient: the coefficients of the regressors. They indicate how much Y changes when X increases by one unit, holding the other regressors constant.

t-Statistic: the t-values test the null hypothesis that each coefficient equals 0. To reject it, the absolute t-value has to be higher than 1.96 (5% significance level, i.e. 95% confidence). If this is the case, you can say that the variable has a significant influence on your dependent variable (Y). The higher the absolute t-value, the stronger the evidence that the variable is relevant.

Prob.: the two-tail p-values test the same null hypothesis that each coefficient equals 0. To reject it, the p-value has to be lower than 0.05 (95% confidence). If this is the case, you can say that the variable has a significant influence on your dependent variable (Y). Here CB, with a p-value of 0.0539, narrowly fails this test, while the other variables pass it.
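The t-statistic column is simply each coefficient divided by its standard error. A minimal sketch in Python (outside EViews) that recomputes it from the reported figures and applies the 1.96 rule of thumb is given below; the dictionary layout and the name `CRITICAL` are illustrative choices. Note that 1.96 is the large-sample critical value; with a sample this small the exact critical value from the t distribution is slightly larger.

```python
# (coefficient, standard error) pairs copied from the regression output.
estimates = {
    "C":    (12.83313, 2.387841),
    "FE":   (-0.160617, 0.039199),
    "FQ":   (2.032662, 0.380137),
    "CB":   (0.362423, 0.185213),
    "CAPR": (-0.203388, 0.075746),
}

CRITICAL = 1.96  # rule-of-thumb two-tailed 5% critical value

# t-statistic = coefficient / standard error
t_stats = {name: coef / se for name, (coef, se) in estimates.items()}

# A coefficient is flagged as significant when |t| exceeds the critical value.
significant = [name for name, t in t_stats.items() if abs(t) > CRITICAL]

for name, t in t_stats.items():
    verdict = "significant" if abs(t) > CRITICAL else "not significant"
    print(f"{name:5s} t = {t:8.4f}  {verdict} at the 5% level")
```

The recomputed values match the t-Statistic column up to rounding, and CB is the only regressor whose absolute t-value falls below 1.96.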
R-squared             0.371546    Mean dependent var       6.020596
Adjusted R-squared    0.340123    S.D. dependent var       5.639222
S.E. of regression    4.580898    Akaike info criterion    5.938690
Sum squared resid     1678.770    Schwarz criterion        6.082375
Log likelihood       -247.3943    Hannan-Quinn criter.     5.996484
F-statistic           11.82412    Durbin-Watson stat       0.735389
Prob(F-statistic)     0.000000

R-squared shows the proportion of the variance of Y explained by X (here about 37%).

Adjusted R-squared shows the same as R-squared, but adjusted for the number of cases and the number of variables. When the number of variables is small and the number of cases is very large, adjusted R-squared is close to R-squared.
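The adjustment can be reproduced by hand from the reported figures. The following Python sketch (not EViews code) applies the standard formulas for adjusted R-squared and for the standard error of the regression; taking k = 4 regressors plus a constant is read off the coefficient table.

```python
import math

# Values copied from the regression output.
r_squared = 0.371546   # R-squared
ssr = 1678.770         # sum squared resid
n_obs = 85             # number of cases (observations)
k = 4                  # number of variables (regressors, constant excluded)

df_resid = n_obs - k - 1   # residual degrees of freedom

# Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)
adj_r_squared = 1.0 - (1.0 - r_squared) * (n_obs - 1) / df_resid

# S.E. of regression = sqrt(SSR / (n - k - 1))
se_regression = math.sqrt(ssr / df_resid)

print(f"Adjusted R-squared = {adj_r_squared:.6f}")
print(f"S.E. of regression = {se_regression:.6f}")
```

Both values agree with the output up to rounding. The factor (n - 1) / (n - k - 1) approaches 1 as n grows relative to k, which is why adjusted R-squared moves closer to R-squared in large samples.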