Introduction to Panel data analysis using eviews · introduction to panel data analysis using eviews faridah najuna misman, phd finance department faculty of business & management

INTRODUCTION TO PANEL DATA ANALYSIS USING EVIEWS

FARIDAH NAJUNA MISMAN, PhD

FINANCE DEPARTMENT

FACULTY OF BUSINESS & MANAGEMENT

UiTM JOHOR

PANEL DATA WORKSHOP, 23 & 24 MAY 2017

OUTLINE

1. Introduction

2. CLRM Assumptions

3. Static Panel Data Models

4. Getting Started with EViews 9

5. Data Analysis

6. Reading The Results


1. INTRODUCTION

There are 3 types of data structure available:

1. Time series data are collected at regular time intervals, such as every month or every year. (N=1, t=1,…,T)

• Usually this represents the values for a single firm or a single variable at different points in time.

• Most macroeconomic data for real variables, e.g. GDP or consumption, are quarterly time series data.

• The data for monetary variables, such as interest rates, are often monthly time series data.

2. Cross-sectional data are the values for many different firms or households collected at a single point in time. (i=1,…,N, T=1)

3. Panel data combine the other two: we have values for all members of a panel or group of firms or households, measured at more than one period in time. (i=1,…,N, t=1,…,T)


1. INTRODUCTION

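Classical panel data: N > T, also known as a short or micro panel.

Macro panel: T > N, also known as a long panel.

Balanced panel: data are available for every cross-section in every period.

Number of observations: n = NT

Unbalanced panel: T differs across individuals, so n is smaller than NT. (Note: EViews can estimate unbalanced panels, as the "Total panel (unbalanced) observations" line in Section 6 shows, although some tests and procedures require balanced data.)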
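The definitions above can be checked mechanically. The following is a minimal sketch in plain Python, assuming the panel is given as a list of (unit, period) pairs; the function name `describe_panel` and the example firms are hypothetical and used only for illustration.

```python
from collections import Counter

def describe_panel(pairs):
    """Summarise a panel from its (unit, period) pairs."""
    pairs = set(pairs)                      # ignore accidental duplicate rows
    units = {u for u, _ in pairs}
    periods = {t for _, t in pairs}
    n_units, n_periods = len(units), len(periods)
    per_unit = Counter(u for u, _ in pairs)
    # Balanced: every unit is observed in every period, so n = N * T.
    balanced = all(per_unit[u] == n_periods for u in units)
    if n_units > n_periods:
        shape = "short (micro) panel"
    elif n_periods > n_units:
        shape = "long (macro) panel"
    else:
        shape = "N equals T"
    return {"N": n_units, "T": n_periods, "n": len(pairs),
            "balanced": balanced, "shape": shape}

# Hypothetical example: 3 firms observed in 2 years.
example = [(firm, year) for firm in ("A", "B", "C") for year in (2016, 2017)]
summary = describe_panel(example)
print(summary)
```

Dropping any single row from `example` leaves N and T unchanged but makes the panel unbalanced, with n smaller than NT.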


1. INTRODUCTION

Selection of the econometric model depends on the type of data:

1. Least Squares Regression: normally applied to cross-section data sets (e.g. Ordinary Least Squares, OLS).

2. Time-series Model: normally applied to time series data, to uncover long-run relations and short-run dynamics.

3. Panel Data Modelling: normally used to capture heterogeneity across units and to obtain a bigger sample size.

❖ Static panel data models: pooled OLS (POLS), fixed effects (FE), random effects (RE), between effects (BE)

❖ Dynamic panel data: GMM

❖ Panel unit root and cointegration (macro panel)


1. INTRODUCTION

Advantages & Disadvantages

Panel data allow us to control for variables we cannot observe or measure, such as:

❖ Time-invariant factors, like geographical area or firm management characteristics.

❖ Variables that change over time but not across entities, like national policies, federal regulation, international agreements.

In other words, panel data can take individual heterogeneity (uniqueness) into account, which results in more efficient estimates.


1. INTRODUCTION

Advantages:

i. Larger sample size, more variation and less collinearity, which increase the precision of the estimates.

ii. Ability to study dynamics: repeated cross-sectional observations show adjustment over time.

iii. Ability to account for heterogeneity across individuals, which is often ignored in pooled data; this makes the estimates more robust against misspecification due to omitted variables.

Disadvantages:

i. Data availability/maintenance

ii. Measurement errors

iii. Self-selection bias


1. INTRODUCTION

Why Analyse Panel Data?

We are interested in describing change over time:

o social change, e.g. changing attitudes, behaviours, social relationships

o individual growth or development, e.g. life-course studies, child development, career trajectories, school achievement

o occurrence (or non-occurrence) of events

We want superior estimates of trends in social phenomena:

o Panel models can be used to inform policy, e.g. health, obesity

o Multiple observations on each unit can provide superior estimates compared with cross-sectional models of association

We want to estimate causal models:

o Policy evaluation

o Estimation of treatment effects


1. INTRODUCTION

What kind of data are required for panel analysis?

Basic panel methods require at least two “waves” of measurement. Consider student GPAs and job hours during two semesters of college.

One way to organize the panel data is to create a single record for each combination of unit and time period.

Notice that the data include:

A time-invariant unique identifier for each unit (StudentID)

A time-varying outcome (GPA)

An indicator for time (Semester)

Panel datasets can include other time-varying or time-invariant variables
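The layout described above (one record per unit and time period) can be sketched as follows. This is an illustration only: the GPA and job-hour values are hypothetical, and `JobHours` and `gpa_change` are names introduced here, not taken from the workshop data.

```python
# Hypothetical records: one row per (student, semester) combination.
records = [
    {"StudentID": 1, "Semester": 1, "GPA": 3.0, "JobHours": 10},
    {"StudentID": 1, "Semester": 2, "GPA": 3.5, "JobHours": 5},
    {"StudentID": 2, "Semester": 1, "GPA": 2.5, "JobHours": 20},
    {"StudentID": 2, "Semester": 2, "GPA": 2.0, "JobHours": 25},
]

def gpa_change(rows):
    """Return {StudentID: GPA in the last wave minus GPA in the first wave}."""
    by_student = {}
    # Sort by unit, then by time, so each student's GPAs are in wave order.
    for row in sorted(rows, key=lambda r: (r["StudentID"], r["Semester"])):
        by_student.setdefault(row["StudentID"], []).append(row["GPA"])
    return {sid: gpas[-1] - gpas[0] for sid, gpas in by_student.items()}

changes = gpa_change(records)
print(changes)
```

Because each student appears in both waves, the within-student change in GPA can be computed, which is not possible with a single cross-section.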


2. CLASSICAL LINEAR REGRESSION MODEL (CLRM)

Table taken from page 37, “Applied Econometrics”, Asteriou & Hall, 2nd ed., 2011, Palgrave Macmillan.

3. PANEL DATA MODEL: POOLED OLS

Pooled OLS

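$$y_{it} = \beta_0 + \beta X_{it} + \alpha_i + \nu_{it}$$

i. $\alpha_i$ and $\nu_{it}$ are normally distributed and mutually independent,

ii. $E(\alpha_i) = E(\nu_{it}) = 0$, for $i = 1,\dots,N$, $t = 1,\dots,T_i$,

iii. $E(\alpha_i \alpha_{i'}) = \begin{cases} \sigma_1^2, & i = i' \\ 0, & \text{otherwise,} \end{cases}$

iv. $E(\nu_{it} \nu_{i't'}) = \begin{cases} \sigma_2^2, & i = i',\ t = t' \\ 0, & \text{otherwise.} \end{cases}$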
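To make the pooling idea concrete, here is a minimal sketch of pooled OLS with a single regressor in plain Python. It assumes each observation is a (unit, period, x, y) row; the function name `pooled_ols` and the sample values are hypothetical. Pooling simply stacks all unit-period observations and ignores which unit each row belongs to.

```python
def pooled_ols(panel):
    """Estimate y = b0 + b1 * x by OLS on the stacked panel rows.

    panel: list of (unit, period, x, y) tuples. The unit and period
    labels are ignored, which is what makes the estimator 'pooled'.
    """
    xs = [row[2] for row in panel]
    ys = [row[3] for row in panel]
    n = len(panel)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx            # slope
    b0 = y_bar - b1 * x_bar     # intercept
    return b0, b1

# Hypothetical panel: 2 units observed in 2 periods each.
sample = [
    (1, 1, 0.0, 1.2),
    (1, 2, 1.0, 2.9),
    (2, 1, 2.0, 5.1),
    (2, 2, 3.0, 6.8),
]
b0, b1 = pooled_ols(sample)
print(b0, b1)
```

This sketch covers only the point estimates; standard errors, multiple regressors and the fixed or random effects alternatives are left to EViews.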


4. GETTING STARTED WITH EViews 9










5. DATA ANALYSIS


DESCRIPTIVE STATISTICS



CORRELATION ANALYSIS





POOLED OLS REGRESSION





NORMALITY TEST




DUMMY VARIABLES







6. READING THE RESULTS

Dependent Variable: CR
Method: Panel Least Squares
Date: 05/23/17 Time: 17:06
Sample (adjusted): 1996 2011
Periods included: 16 (time periods included)
Cross-sections included: 17 (total number of groups)
Total panel (unbalanced) observations: 85 (n; n = NT holds only for a balanced panel)

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C             12.83313     2.387841      5.374368    0.0000
FE           -0.160617     0.039199     -4.097434    0.0001
FQ            2.032662     0.380137      5.347179    0.0000
CB            0.362423     0.185213      1.956787    0.0539
CAPR         -0.203388     0.075746     -2.685126    0.0088

R-squared             0.371546    Mean dependent var      6.020596
Adjusted R-squared    0.340123    S.D. dependent var      5.639222
S.E. of regression    4.580898    Akaike info criterion   5.938690
Sum squared resid     1678.770    Schwarz criterion       6.082375
Log likelihood       -247.3943    Hannan-Quinn criter.    5.996484
F-statistic           11.82412    Durbin-Watson stat      0.735389
Prob(F-statistic)     0.000000

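C is the constant (intercept).

Prob(F-statistic): if this number is below 0.05, the model as a whole is statistically significant. This is the F-test of whether the slope coefficients in the model are jointly different from zero.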
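The reported F-statistic can be reproduced from the R-squared in the output. The sketch below uses the standard relation F = (R²/(k−1)) / ((1−R²)/(n−k)), with n = 85 observations and k = 5 estimated coefficients (including the constant) read from the output above; the function name is hypothetical.

```python
def f_statistic(r_squared, n_obs, n_coefs):
    """Overall F-statistic of a regression with an intercept.

    n_coefs counts all estimated coefficients, including the constant.
    """
    explained = r_squared / (n_coefs - 1)
    unexplained = (1.0 - r_squared) / (n_obs - n_coefs)
    return explained / unexplained

# Values taken from the EViews output: R-squared, observations, coefficients.
f_value = f_statistic(0.371546, 85, 5)
print(round(f_value, 3))
```

The result agrees with the F-statistic row of the output up to rounding. Turning it into Prob(F-statistic) requires the F distribution with (4, 80) degrees of freedom, which is not covered in this sketch.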


Coefficient column: the coefficients of the regressors indicate how much Y changes when X increases by one unit, holding the other regressors constant.

The t-values test the hypothesis that each coefficient is different from 0. To reject the null hypothesis that a coefficient equals 0, the absolute t-value has to be higher than 1.96 (5% significance level, i.e. 95% confidence, in large samples). If this is the case, you can say that the variable has a significant influence on your dependent variable (Y). The higher the absolute t-value, the stronger the evidence that the variable is relevant.
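Each t-statistic is the coefficient divided by its standard error. The sketch below recomputes the t-values from the Coefficient and Std. Error columns of the output and applies the 1.96 rule of thumb; the dictionary layout and variable names are assumptions made for illustration.

```python
# (coefficient, standard error) pairs copied from the EViews output.
estimates = {
    "C":    (12.83313, 2.387841),
    "FE":   (-0.160617, 0.039199),
    "FQ":   (2.032662, 0.380137),
    "CB":   (0.362423, 0.185213),
    "CAPR": (-0.203388, 0.075746),
}

CRITICAL_VALUE = 1.96  # two-tailed, 5% level, large-sample approximation

t_values = {name: coef / se for name, (coef, se) in estimates.items()}
significant = {name: abs(t) > CRITICAL_VALUE for name, t in t_values.items()}

for name, t in t_values.items():
    verdict = "significant" if significant[name] else "not significant"
    print(f"{name:5s} t = {t:8.4f}  {verdict} at the 5% level")
```

Note that CB has a t-value just below 1.96, which is consistent with its reported p-value of 0.0539 being slightly above 0.05.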

Two-tail p-values test the hypothesis that each coefficient is different from 0. To reject the null hypothesis that a coefficient equals 0, the p-value has to be lower than 0.05 (95% confidence). If this is the case, you can say that the variable has a significant influence on your dependent variable (Y).


R-squared shows the proportion of the variance of Y explained by X.

Adjusted R-squared shows the same as R-squared, but adjusted for the number of cases and the number of variables.

When the number of variables is small and the number of cases is very large, Adjusted R-squared is close to R-squared.
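The adjustment can be written as Adjusted R² = 1 − (1 − R²)(n − 1)/(n − k), where n is the number of cases and k the number of estimated coefficients including the constant. The following sketch applies this standard formula to the values in the output (R² = 0.371546, n = 85, k = 5); the function name is hypothetical.

```python
def adjusted_r_squared(r_squared, n_obs, n_coefs):
    """Adjusted R-squared; n_coefs includes the constant."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_coefs)

# Values from the EViews output.
adj = adjusted_r_squared(0.371546, 85, 5)
print(round(adj, 6))

# With the same R-squared but a much larger (hypothetical) sample,
# the adjustment becomes negligible.
adj_large_n = adjusted_r_squared(0.371546, 100_000, 5)
print(round(adj_large_n, 6))
```

The first value agrees with the Adjusted R-squared row of the output up to rounding, and the second shows the adjusted figure approaching R-squared as the number of cases grows.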

