

Probability Statistics and Econometric Journal

Vol.1, Issue: 1; May - June- 2019

ISSN (5344 – 3321);

p –ISSN (9311 – 4223)

Probability Statistics and Econometric Journal

An official Publication of Center for International Research Development Double Blind Peer and Editorial Review International Referred Journal; Globally index

Available @CIRD.online/PSEJ: E-mail: [email protected]


ON THE SEVERITY OF MULTICOLLINEAR VARIABLES IN A LINEAR

REGRESSION MODEL

Ijomah, Maxwell Azubuike

Dept. of Maths/Statistics, University of Port Harcourt, Nigeria.

Abstract: The problem of multicollinearity in regression is well known and widely published, but the severity of this problem as measured by the available diagnostics has not been well established. There is disagreement among researchers regarding the cut-off point for severe collinearity in a multiple regression model. In this paper, a simulation study was carried out over various scenarios of different collinearity diagnostics to investigate the effects of collinearity under various correlation structures between two explanatory variables, and to compare these results with existing guidelines for deciding when collinearity is harmful. Our results reveal that a variance inflation factor above 5 (VIF > 5) or an eigenvalue less than 0.1 indicates severe collinearity.

Keywords: collinearity, severity, condition index, condition number, eigenvalues, variance inflation factor.

1 Introduction

Multiple regression models are widely used in applied statistical

techniques to quantify the relationship between a response

variable Y and multiple predictor variables Xi, and we utilize the

relationship to predict the value of the response variable from a

known level of predictor variable or variables.

The models take the general form

Y = β0 + β1X1 + β2X2 + ... + βpXp + ε    (1)

where β0 is a constant, β1, β2, ..., βp are regression parameters, and ε is a random error. When we have n observations on Y and the Xi's, this equation can be written in matrix form as

Y = Xβ + ε    (2)

where Y = (y1, y2, ..., yn)′ is an n-vector of responses, X is the n × (p + 1) matrix

X = [ 1  x11  x12  ...  x1p
      1  x21  x22  ...  x2p
      :   :    :         :
      1  xn1  xn2  ...  xnp ]

β = (β0, β1, ..., βp)′ is a (p + 1)-vector of parameters, and ε = (ε1, ε2, ..., εn)′ is an n-vector of error terms. An ordinary least squares solution to (1) is one in which the Euclidean norm of the residual vector Y − Xβ is minimized, that is,

min over β of ‖Y − Xβ‖²

Setting the gradient of the square of this norm, (Y − Xβ)′(Y − Xβ), to zero with respect to the vector β, the necessary condition for the solution vector β̂ is that β̂ must be a solution to

X′X β̂ = X′Y    (3)


In other symbols, the solution is

β̂ = (X′X)⁻¹ X′Y    (4)

The prime (′) denotes the transposition of a vector or matrix. Recall that one of the assumptions of the model is that X is of full rank, i.e. det(X′X) ≠ 0. This requirement says that no column of X can be written as an exact linear combination of the other columns. If X is not of full rank, then det(X′X) = 0, so that the ordinary least squares (OLS) estimate β̂ = (X′X)⁻¹X′Y is not uniquely defined and the sampling variance of the estimate is infinite. However, if the columns of X are nearly collinear (although not exactly), then det(X′X) is close to zero and the least squares coefficients become unstable, since V(β̂) = σ²(X′X)⁻¹ can be very large.
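As a concrete illustration of this instability, here is a minimal numerical sketch (not from the paper; variable names and the choice of r values are illustrative) showing how the slope-coefficient variances from σ²(X′X)⁻¹ inflate as two predictors become nearly collinear:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

def slope_variances(r, sigma2=1.0):
    """Diagonal of V(beta_hat) = sigma^2 (X'X)^{-1} for an intercept plus
    two standardized predictors with population correlation r."""
    z1 = rng.standard_normal(n)
    z2 = r * z1 + np.sqrt(1.0 - r**2) * rng.standard_normal(n)
    X = np.column_stack([np.ones(n), z1, z2])
    return sigma2 * np.diag(np.linalg.inv(X.T @ X))

var_low = slope_variances(0.1)    # nearly orthogonal predictors
var_high = slope_variances(0.95)  # near-collinear predictors
ratio = var_high[1] / var_low[1]  # slope variance inflated roughly tenfold
```

The approximation V(β̂j) ≈ σ²/(n(1 − r²)) for standardized predictors explains the blow-up: as r approaches 1, the denominator collapses.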

In practice neither of the above extreme cases is often met. In

most cases there is some degree of intercorrelation among the

explanatory variables. It should be noted that multicollinearity in

addition to regression analysis, is also connected to time series

analysis. It is also quite frequent in cross-section data

(Koutsoyiannis, 1977). When this assumption is violated, and

there is intercorrelation among the explanatory variables, we say there exists collinearity or multicollinearity between the explanatory

variables. Multicollinearity stands out among the possible

pitfalls of empirical analysis for the extent to which it is poorly

understood by practitioners. The presence of Multicollinearity

has some destructive effects on regression analysis such as

prediction inferences and estimations. Consequently, the validity

of parameter estimations becomes questionable (Montgomery et

al., 2001; Kutner et al., 2004; Chatterjee and Hadi, 2006; Midi

et al., 2010).

The problem with multicollinearity is the difficulty of separating the effects of two (or more) variables on an outcome variable. If two

variables are significantly related, it becomes impossible to

determine which of the variables accounts for variance in the

dependent variable. High interpredictor correlations will lead to

less stable estimate of regression weights. When the covariates

in the model are not independent from one another, collinearity

or multicollinearity problems arise in the analysis, which leads

to biased coefficient estimation and a loss of power.

Various econometric references have indicated that collinearity

increases estimates of parameter variance, yields high R2 in the

face of low parameter significance, and results in parameters

with incorrect signs and implausible magnitudes (Belsley et al.,

1980; Kmenta, 1986; Greene, 1990). Furthermore, any small

change in the data may lead to large differences in regression

coefficients, causing a loss in power and making interpretation

more difficult since there is a lot of common variation in the

variables (Vasu and Elmore 1975; Belsley 1976; Stewart 1987;

Dohoo et al., 1996; Tu et al., 2005). Collinearity therefore makes

it more difficult to achieve significance of the collinear

parameters. In fact, since the regression coefficients are

interpreted as the effect of change in their corresponding

variables, all other things held constant, our ability to interpret

the coefficients declines the more persistent and severe the

collinearity.

It has often been suggested in various studies that

multicollinearity has little or no effect on regression when it is

not severe. That is, moderate multicollinearity may not be

problematic (Judge et al., 1980; Belsley, 1991; Anderson and Wells, 2008). But the “big” question is: when is multicollinearity

said to be severe? With all of the statistical tests available for

detecting collinearity, none of them has satisfactorily shown

when exactly severity is attained in a regression model.

Unfortunately practitioners often inappropriately apply rules or

criteria that indicate that the multicollinearity diagnostics have

attained unacceptably high levels. Furthermore, the use of

the condition index to test for the incidence of multicollinearity only shows whether multicollinearity is present, not its degree. The typical level of collinearity is

more modest, and its impact is not well understood. This paper

therefore reconsiders the severity of multicollinear variables in a

linear regression under various correlation structures between two explanatory variables, to compare these results

with existing guidelines to decide harmful collinearity using both

eigenvalues and correlation coefficient, and to provide a

guideline on model selection under such correlation structures.

Dillon and Goldstein (1984) also suggested that large values of

∑λi−1 would indicate severe collinearity but did not precisely

define what “large” means in this context. This paper hopes to clarify it. The aim is to investigate when collinear variables can actually be considered severe using the above-mentioned statistical techniques. The rest of the paper is

organized as follows. Section 2 presents brief review for effects

of collinearity and diagnostic tools. Section 3 describes how to

generate correlated data, and develop a simulation study with

various scenarios of different collinearity structures and


empirical results are discussed in section 4. Summary and

conclusions are provided in Section 5.

2 A Review of Multicollinearity Diagnostics

To assess whether collinearity is indeed problematic, several

collinearity diagnostics are commonly employed in statistics.

One such diagnostic, the condition index, is the square root of the ratio of the largest to the smallest eigenvalue of the correlation matrix of the independent variables. If the condition number is not “too big”, the matrix is said to be well-conditioned, meaning that results are obtained with good accuracy; a matrix with a large condition number κ(A) (an ill-conditioned matrix) can generate approximations with huge error. Belsley et al. (1980) and Johnston (1984, p. 250) suggest that condition indices in excess of 20 are problematic. If 100 < CN < 1000, multicollinearity is moderate, and if CN > 1000 there is severe multicollinearity (Montgomery and Peck, 1981). Green, Tull and Albaum (1988), Tull and Hawkins (1990) and Lehmann et al. (1988) respectively suggest 0.9, 0.35 and 0.7 as thresholds of bivariate correlation for the harmful effect of collinearity. Regarding other collinearity diagnostics, Belsley (1991) suggested that a CI between 10 and 30 indicates possible problems of multicollinearity, and a CI larger than 30 the presence of multicollinearity.
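For two predictors, the condition number can be computed directly from the eigenvalues of the correlation matrix; a small sketch (illustrative, assuming NumPy, not taken from the paper):

```python
import numpy as np

def condition_number(R):
    """Square root of the ratio of largest to smallest eigenvalue of R."""
    eig = np.linalg.eigvalsh(R)  # eigenvalues in ascending order
    return np.sqrt(eig[-1] / eig[0])

# For two predictors with pairwise correlation r, the eigenvalues of the
# correlation matrix are 1 + r and 1 - r, so kappa = sqrt((1 + r)/(1 - r)).
kappa = condition_number(np.array([[1.0, 0.9], [0.9, 1.0]]))
```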

A second commonly used diagnostic is the variance inflation factor (VIF) (see Fox and Monette, 1992), given by

VIFi = 1 / (1 − Ri²)

where Ri² is the R² of a regression of regressor xi on all the remaining regressors. VIFs in excess of 10 are considered problematic (Hair et al., 1995). Various recommendations have been made concerning the magnitudes of variance inflation factors that are indicative of multicollinearity. A variance inflation factor of 10 or greater is usually considered sufficient to indicate multicollinearity (Chatterjee et al., 2000). According to some authors, multicollinearity is problematic if the largest VIF exceeds 10, or if the mean VIF is much greater than 1. However, there are no formal criteria for determining the magnitude of variance inflation factors that cause poorly estimated coefficients; the decision to consider a VIF to be large is essentially arbitrary.

Belsley et al. (1980) pointed out that there is not a clear cutoff

point to distinguish between "high" and "low" VIFs. Several

researchers (e.g., Hocking and Pendleton 1983; Craney and

Surles, 2002) have suggested that the "typical" cutoff values (or

rules of thumb) for "large" VIFs of 5 or 10 are based on the

associated Ri2 of 0.80 or 0.90, respectively. O’Brien (2007)

recommended that well-known VIF rules of thumb (e.g., VIFs

greater than 5 or 10 or 30) should be treated with caution when

making decisions to reduce collinearity (like eliminating one or

more predictors) and indicated that researchers should also

consider other factors (e.g., sample size) which influence the

variability of regression coefficients. Some investigators use

correlation coefficient cutoffs of 0.5 (Donath et al., 2012) and above, but the most typical cutoff is 0.80 (Berry & Feldman, 1985).

Although VIF greater than 5 or VIF greater than 10 (Kutner et al., 2004) are suggested for detecting multicollinearity, there is no universal agreement on the VIF cut-off that should be used to detect multicollinearity. Caution against misdiagnosing multicollinearity from low pairwise correlations and low VIFs has also been reported in the collinearity-diagnostics literature. O’Brien (2007) demonstrated that VIF rules of thumb should be interpreted with caution and put in the context of other factors that influence the stability of the specific regression coefficient estimate, and suggested that any VIF cut-off value should be based on practical considerations. Freund and Littell (1986) also suggested that the VIF be evaluated against the overall fit of the model, using the model R² statistic: VIF > 1/(1 − overall model R²) indicates that the correlation between the predictors is stronger than the regression relationship, so that multicollinearity can affect their coefficient estimates, while Hair et al. (1995) suggest that variance inflation factors (VIF) less than 10 are indicative of inconsequential collinearity.
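A minimal sketch of the VIF computation described above (illustrative code, assuming NumPy; not the paper's SAS implementation):

```python
import numpy as np

def vif(X):
    """VIF_i = 1/(1 - R_i^2), where R_i^2 comes from regressing column i
    of X on the remaining columns plus an intercept."""
    n, p = X.shape
    out = []
    for i in range(p):
        y = X[:, i]
        Z = np.column_stack([np.ones(n), np.delete(X, i, axis=1)])
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        resid = y - Z @ beta
        r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

rng = np.random.default_rng(1)
x1 = rng.standard_normal(2000)
x2 = 0.9 * x1 + np.sqrt(1 - 0.9**2) * rng.standard_normal(2000)
vifs = vif(np.column_stack([x1, x2]))  # both near 1/(1 - 0.9^2), about 5.26
```

With exactly two predictors, R1² and R2² both equal the squared pairwise correlation, so the two VIFs coincide.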

A third collinearity diagnostic, is the regressor correlation matrix

determinant (Johnston, 1984). This diagnostic ranges from 1

when there is no collinearity, to 0 when there is perfect

collinearity. This diagnostic does not incorporate the moderating

effect of correlations with the dependent variable.

Another commonly used diagnostic is the set of eigenvalues

which are just characteristic roots of X’X. A set of eigenvalues

of relatively equal magnitudes indicates that there is little

multicollinearity (Freund and Littell 2000: 99). A zero eigenvalue means perfect collinearity among the independent variables, and very small eigenvalues imply severe multicollinearity. Conventionally, an eigenvalue close to zero (say, less than 0.01) is taken to indicate severe multicollinearity.
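The eigenvalue diagnostic is easy to see in the two-predictor case, where the correlation matrix has eigenvalues 1 + r and 1 − r; a short sketch (illustrative, assuming NumPy, with r values chosen for illustration):

```python
import numpy as np

# For two predictors, the correlation matrix [[1, r], [r, 1]] has
# eigenvalues 1 + r and 1 - r: the smallest eigenvalue and the sum of
# inverse eigenvalues both deteriorate as |r| approaches 1.
rows = []
for r in (0.5, 0.9, 0.95):
    eig = np.linalg.eigvalsh(np.array([[1.0, r], [r, 1.0]]))
    rows.append((r, eig.min(), np.sum(1.0 / eig)))
```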

Farrar and Glauber, (1967) also proposed a procedure for

detecting multicollinearity comprising three tests (i.e.

Chi-square test, F-test and T-test). However, these tests have


been greatly criticized. Robert Wichers (1975) claims that the

third test, where the authors use the partial-correlation

coefficients, is ineffective, while O’Hagan and McCabe (1974)

quote, “Farrar and Glauber have made a fundamental mistake in

interpreting their diagnostics.”

Huang (1970) holds that multicollinearity is “harmful” if rij ≥ R². Such simple correlation coefficients are a sufficient but not a necessary condition for multicollinearity. In many cases

there are linear dependencies, which involve more than two

explanatory variables, that this method cannot detect (Judge et

al., 1985). A variable Xi then, would be harmfully multicollinear

only if its multiple correlation with other members of the

independent variable set, Ri2 , were greater than the dependent

variable’s multiple correlation with the entire set, R2 (Greene,

1993). In the case where the linear regression model consists of

more than two independent variables, the correlation matrix is

unable to reveal the presence or number of several coexisting

collinear relations (Belsley, 1991). Field (2009) claims that when the determinant of the correlation matrix is greater than 0.0001 there is no severe multicollinearity.

In summary, reviewing the literature on ways of diagnosing

severe collinearity reveals several points. First, a variety of

alternatives are available and may lead to dramatically different

conclusions based on their cutoff points. Second, what might be

gained from the different alternatives in any specific empirical

situation is often unclear. Part of this ambiguity is likely to be

due to inadequate knowledge about what degree of collinearity

is "harmful" (Mason and Perreault, 1991). In much of the

empirical research on harmful collinearity, data with extreme

levels of collinearity are used to provide rigorous tests of the

approach being proposed. Such extreme collinearity is rarely

found in actual cross-sectional data.

3 Materials and Methods

A simulation study was conducted to investigate the severity of multicollinearity on regression parameters. Several datasets of sample size 50, 100, 500 and 1000, with one response variable y and two predictors xi, i = 1, 2, were generated from a multivariate normal distribution, (y, (x1, x2)) ~ MVN(μ, Σ), with mean vector components drawn from (0.1, 0.2, 0.25, 0.3, 0.4, 0.45, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0), varying the correlation coefficient r while keeping X1 and X2 constant at 0.1. This is to enable us to capture any slight variation in the regression model as a result of changes in the correlation between X1 and X2. For the purpose of these simulations, we considered a 2 × 2 covariance matrix Σ = DRD, where R is a pre-specified correlation matrix. Since the signs of the correlation coefficients between predictors and the correlations between the response variable and the predictors can moderate the effect of collinearity on parameter inference, the correlations between the response variable y and the predictors xi, i = 1, 2, were fixed and estimated from the data using SAS 9.1 software. To simulate predictor variables with different degrees of collinearity, the Pearson pairwise correlation coefficient was varied from r = 0.1 to r = 0.98. As stated above, multicollinearity was assessed using the correlation coefficient, the determinant of the correlation matrix and the sum of inverse eigenvalues. In each correlation-matrix scenario, the average estimates of the regression coefficients, standard errors, t-test statistics, p-values, correlation matrix determinant |X′X| and sum of inverse eigenvalues (∑λi⁻¹) over 50, 100, 500 and 1000 simulations were calculated.
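The paper's simulations were run in SAS 9.1; as a rough, illustrative Python sketch of one scenario (function names, seed and replication count are assumptions, not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_scenario(n, r, n_rep=200):
    """Average collinearity diagnostics over n_rep simulated datasets of
    size n with target predictor correlation r."""
    vifs, min_eigs, sum_inv = [], [], []
    for _ in range(n_rep):
        x1 = rng.standard_normal(n)
        x2 = r * x1 + np.sqrt(1.0 - r**2) * rng.standard_normal(n)
        R = np.corrcoef(x1, x2)
        eig = np.linalg.eigvalsh(R)
        vifs.append(1.0 / (1.0 - R[0, 1] ** 2))
        min_eigs.append(eig.min())
        sum_inv.append(np.sum(1.0 / eig))
    return np.mean(vifs), np.mean(min_eigs), np.mean(sum_inv)

# At r = 0.9 the least eigenvalue averages near 1 - r = 0.1, the
# boundary the paper flags as severe.
avg_vif, avg_min_eig, avg_sum_inv = simulate_scenario(500, 0.9)
```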

4 Result of Analysis

Tables 1 and 2 present the values of the regression estimates and the multicollinearity diagnostics. The four diagnostic statistics used in this result are the variance inflation factor, the condition index, the sum of inverse eigenvalues and the determinant of the correlation matrix, as reported in Tables 1 and 2, respectively. To make the tables more readable, only the least eigenvalues of 0.1 and below are shown. In this study, we consider an eigenvalue of 0.1 to be relatively sensitive, and one less than 0.1 as severe and remarkable. Furthermore, the eigenvalues (less than 0.1) whose inverses show an initial jump in value will be used to identify the variates involved in a very near dependency (severe collinearity). A closer look at the table reveals that there is a discontinuity in the sum of inverse eigenvalues as the correlation increases at the various sample sizes, and such a point signals severe collinearity.


Table 1: Regression estimates with eigenvalues and determinants at different sample sizes

Sample size | r(x1,x2) | Estimate (SE; t; p) | Eigenvalues | det(R) | ∑λi⁻¹
50   | 0.8777 | β1 = 0.6404 (0.9663; 0.6627; 0.5108); β2 = 1.4395 (1.0567; 1.3623; 0.1796) | 1.8777, 0.1222 | 0.0055 | 8.7123
50   | 0.8977 | β1 = 0.6363 (0.9647; 0.6596; 0.5128); β2 = 1.4344 (1.0483; 1.3683; 0.1777) | 1.8977, 0.1022 | 0.0045 | 9.3065
50   | 0.9133 | β1 = 0.6331 (0.9639; 0.6568; 0.5145); β2 = 1.4301 (1.0414; 1.3732; 0.1762) | 1.9133, 0.0867 | 0.0038 | 12.0621
100  | 0.8712 | β1 = 0.5383 (0.6675; 0.8065; 0.4219); β2 = 1.7496 (0.6924; 2.5270; 0.0131) | 1.8712, 0.1288 | 0.0006 | 8.2965
100  | 0.8916 | β1 = 0.5251 (0.6648; 0.7901; 0.4314); β2 = 1.7373 (0.6878; 2.5258; 0.0132) | 1.8916, 0.1084 | 0.0005 | 9.7553
100  | 0.9077 | β1 = 0.5141 (0.6627; 0.7758; 0.4398); β2 = 1.7268 (0.6842; 2.5238; 0.0132) | 1.9077, 0.0923 | 0.0004 | 11.358
500  | 0.8630 | β1 = 1.0795 (0.3277; 3.2944; 0.0011); β2 = 0.9967 (0.3337; 2.9865; 0.0030) | 1.8630, 0.1370 | 5.9153E-6 | 7.8383
500  | 0.8865 | β1 = 1.0766 (0.3259; 3.3033; 0.0010); β2 = 0.9938 (0.3313; 2.9995; 0.0028) | 1.8865, 0.1135 | 4.8408E-6 | 8.0123
500  | 0.9046 | β1 = 1.0741 (0.3246; 3.3088; 0.0010); β2 = 0.9913 (0.3295; 3.0085; 0.0028) | 1.9046, 0.0954 | 4.0302E-6 | 11.0068
1000 | 0.8695 | β1 = 1.1116 (0.2371; 4.6891; 3.1248E-6); β2 = 0.9731 (0.2364; 4.1168; 0.00004) | 1.8695, 0.1305 | 7.2445E-7 | 8.1984
1000 | 0.8919 | β1 = 1.1085 (0.2357; 4.7036; 2.9162E-6); β2 = 0.9698 (0.2350; 4.1269; 0.00003) | 1.8919, 0.1081 | 5.9318E-7 | 9.7775
1000 | 0.9091 | β1 = 1.1057 (0.2346; 4.7132; 2.7848E-6); β2 = 0.9668 (0.2339; 4.1329; 0.00003) | 1.9091, 0.0909 | 4.9409E-7 | 11.5276

We further considered the severity using other diagnostic measures (variance inflation factor and condition index), and our result is shown in Table 2. Here again, a VIF above 5 is considered severe for two explanatory variables only. This agrees with the work of O’Brien (2007) and Hair et al. (2010). For the condition index, Belsley (1982) stated that κ(X) lower than 10 implies light collinearity, values between 10 and 30 imply moderate collinearity, and values higher than 30 imply strong collinearity. In addition, the regressors must be scaled to unit length (that is, divided by their standard deviations) and not centered. Our result in Table 2 shows that when the highest condition index is above 11, collinearity appears to be severe, except for the sample size of 1000, which was slightly lower than 11.

Table 2: Regression estimates with variance inflation factors and condition indices at different sample sizes

Sample size | r(x1,x2) | Estimate (SE; t; p) | VIF (both predictors) | Condition indices
50   | 0.8777 | β1 = 0.6404 (0.9663; 0.6627; 0.5108); β2 = 1.4395 (1.0567; 1.3623; 0.1796) | 4.3562 | 4.0851, 9.6633
50   | 0.8977 | β1 = 0.6363 (0.9647; 0.6596; 0.5128); β2 = 1.4344 (1.0483; 1.3683; 0.1777) | 5.1533 | 4.0896, 10.6045
50   | 0.9133 | β1 = 0.6331 (0.9639; 0.6568; 0.5145); β2 = 1.4301 (1.0414; 1.3732; 0.1762) | 6.0311 | 4.0914, 11.5478
100  | 0.8712 | β1 = 0.5383 (0.6675; 0.8065; 0.4219); β2 = 1.7496 (0.6924; 2.5270; 0.0131) | 4.1483 | 4.2457, 9.6528
100  | 0.8916 | β1 = 0.5251 (0.6648; 0.7901; 0.4314); β2 = 1.7373 (0.6878; 2.5258; 0.0132) | 4.8776 | 4.2538, 10.5904
100  | 0.9077 | β1 = 0.5141 (0.6627; 0.7758; 0.4398); β2 = 1.7268 (0.6842; 2.5238; 0.0132) | 5.6790 | 4.2589, 11.5291
500  | 0.8630 | β1 = 1.0795 (0.3277; 3.2944; 0.0011); β2 = 0.9967 (0.3337; 2.9865; 0.0030) | 4.6704 | 3.8710, 9.4603
500  | 0.8865 | β1 = 1.0766 (0.3259; 3.3033; 0.0010); β2 = 0.9938 (0.3313; 2.9995; 0.0028) | 5.5034 | 3.8837, 10.3905
500  | 0.9046 | β1 = 1.0741 (0.3246; 3.3088; 0.0010); β2 = 0.9913 (0.3295; 3.0085; 0.0028) | 6.4180 | 3.8930, 11.3220
1000 | 0.8695 | β1 = 1.1116 (0.2371; 4.6891; 3.1248E-6); β2 = 0.9731 (0.2364; 4.1168; 0.00004) | 4.0992 | 3.8501, 8.7473
1000 | 0.8919 | β1 = 1.1085 (0.2357; 4.7036; 2.9162E-6); β2 = 0.9698 (0.2350; 4.1269; 0.00003) | 4.8887 | 3.8702, 9.7029
1000 | 0.9091 | β1 = 1.1057 (0.2346; 4.7132; 2.7848E-6); β2 = 0.9668 (0.2339; 4.1329; 0.00003) | 5.7638 | 3.8848, 10.6601

5 Conclusion

As shown in the literature above, several authors have argued over the severity of collinearity in regression models using different measures with conflicting values. In our case, we have identified when collinearity can be considered a problem in regression with two explanatory variables. Following Dillon and Goldstein (1984), who argued that large values of ∑λi⁻¹ would indicate severe collinearity, our result shows that whenever there is a jump in the sum of inverse eigenvalues, this can be considered severe collinearity. Also, with a correlation of 0.9 and above, or a least eigenvalue less than 0.1, such a regression should be viewed critically.

Contrary to Marquardt (1970) and Hair et al. (1995), who suggested that a VIF should not exceed 10, we observed that this applies to more than two explanatory variables. For only two explanatory variables, if any of the VIF values exceeds 5, it implies that the associated regression coefficients are poorly estimated because of multicollinearity (cf. Montgomery, 2001): VIFs greater than 5 represent critical levels of multicollinearity where the coefficients are poorly estimated and the p-values are questionable.
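For exactly two predictors, the VIF cut-off translates directly into a correlation cut-off, which makes the consistency of the two rules easy to check (illustrative sketch, not from the paper):

```python
import math

def vif_two_predictors(r):
    """With exactly two predictors, R_i^2 = r^2, so VIF = 1 / (1 - r^2)."""
    return 1.0 / (1.0 - r * r)

# A VIF cut-off of 5 corresponds to |r| = sqrt(1 - 1/5), about 0.894, and
# the least eigenvalue of the correlation matrix, 1 - r, is then about
# 0.106: consistent with the r >= 0.9 / eigenvalue < 0.1 rule above.
r_at_vif5 = math.sqrt(1.0 - 1.0 / 5.0)
```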

The detection of multicollinearity in econometric models is usually based on the so-called condition number (CN) of the data matrix X. However, the computation of the CN, which is the largest condition index, gives misleading results in particular cases, and many commercial computer packages produce an inflated CN even in cases of spurious multicollinearity, i.e. even when no collinearity exists among the explanatory variables (Lazaridis, 2007). This is due to the very low total variation of some columns of the transformed data matrix, which is used to compute the CN. Again, Belsley et al. (1980) and Johnston (1984) suggest that condition indices in excess of 20 are problematic. However, as indicated above, this diagnostic does not apply in the two-variable case.

References


1. Belsley, D.A. (1991a). A guide to using the collinearity

diagnostics. Computer Science in Economics and

Management, 4, 33-50.

2. Belsley, D.A., Kuh, E. & Welsch, R.E. (1980).

Regression Diagnostics: Identifying Influential Data and

Sources of Collinearity. New York: John Wiley & Sons.

3. Belsley, D.A. (1982) "Assessing the presence of harmful

collinearity and other forms of weak data through a test

for signal-to-noise". Journal of Econometrics, 20, 211-

253.

4. Berry WD, Feldman S.(1985) Multiple Regression in

Practice (Quantitative Applications in the Social

Sciences) SAGE Publications; Thousand Oaks. CA

5. Chatterjee, S., A. S. Hadi, and B. Price. 2000. Regression

analysis by example. John Wiley and Sons, New York,

New York, USA.

6. Chatterjee, S. and A.S. Hadi, 2006. Regression

Analysis by Example. 4th Edn., Wiley, New

York, ISBN-10: 0471746966.

7. Craney, T.A., and Surles, J.G. (2002) “Model-

Dependent Variance Inflation Factor Cutoff Values,”

Quality Engineering, 14(3), 391-403.

8. Dillon, W.R. & Goldstein, M. (1984). Multivariate

Analysis Methods an Applications. New York: John

Wiley & Sons.

9. Dohoo IR, Ducrot C, Fourichon C. (1996). An overview

of techniques for dealing with large numbers of

independent variables in epidemiologic studies.

Preventive Veterinary Medicine, 29, 221-239.

10. Donath C, Grässel E, Baier D, Pfeiffer C, Bleich S, et

al.(2012). Predictors of binge drinking in adolescents:

ultimate and distal factors – a representative study. BMC

Public Health,12:263.

11. Farrar and Glauber (1967), Multicollinearity in

regression analysis, Review of Economics and Statistics,

49, pp. 92-107.

12. Field, A. (2009). Discovering Statistics using SPSS for

Windows, Third edition. Sage publications, Los

Angeles, London, New Delhi, Singapore, Washington

D.C.

13. Feldstein MS. (1973). Multicollinearity and the Mean

Square Error of Alternative Estimators.

Econometrica;41(2):337–346.

14. Fox J., and Monette G. (1992). Generalized Collinearity

Diagnostics. Journal of the American Statistical

Association. 1992; 87 (417):178–183.

15. Freund, R.J. & Littell, R.C. (1986) SAS System Linear

Regression, 1986 edition. SAS Institute Inc., Cary, NC,

USA

16. Greene (2000), "Econometric Analysis" .Fourth edition,

Upper Saddle River, NJ: Prentice- Hall.

17. Green, E., Tull, D. S. and Albaum, G. (1988) Research

for Marketing Decisions, 5th edn, Prentice-Hall, Inc.,

Englewood Clivs, NJ.

18. Hair J.F., Tatham R.L., Anderson R.E. (1998).

Multivariate Data Analysis. 5. Prentice Hall.

19. Hawking, R. R. and Pendleton, O. J. (1983), "The

regression dilemma" Commun. Stat.- Theo. Meth, 12,

497-527

20. Johnston, J. (1984). Econometric Methods. New York:

McGraw-Hill.

21. Judge, G. G., Griffths, W. E., Hill E. C., Lutkep¨ohl, H.

& Lee, T. C. (1985). The Theory and Practice of

Econometrics, 2nd edn. New York: Wiley.

22. Kmenta, J (1986). Elements of Econometrics. Ann

Arbor, MI: University of Michigan Press.

23. Koutsoyiannis. A. (1977) Theory of Econometrics.

Macmillan Education Limited.

24. Kutner, M.H., C.J. Nachtsheim and J. Neter,

(2004). Applied Linear Regression Models. 4th

Edn., McGraw Hill, New York.

25. Lazaridis A. (2007), A Note Regarding the Condition

Number: The Case of Spurious and Latent

Multicollinearity

26. Lehmann D. R., Gupta, S. and Steckel, J. (1988)

Marketing Research, Addison-Wesley Educational

Publishers, Inc., Reading, Massachussetts.

27. Mason C.H., Perreault, W.D., Jr. Collinearity, Power,

and Interpretation of Multiple Regression Analysis.

Journal of Marketing Research. 1991;28(3):268–280.

28. Montgomry D.C. and Peck E.A. (1981), "Introduction to

Linear Regression Analysis", New York, NY: Wiley

29. Montgomery, D.C., Peck E.A. and Vining

G.G.(2001). Introduction to Linear Regression

Analysis. 3rd Edn., Jon Wiley and Sons, New

York, USA., 672.

Page 8: ON THE SEVERITY OF MULTICOLLINEAR VARIABLES IN A LINEAR ...cird.online/PSEJ/wp-content/uploads/2019/08/CIRD-EPAJ-19-1415-fin… · Dept. of Maths/Statistics, University of Port Harcourt,

Probability Statistics and Econometric Journal

Vol.1, Issue: 1; May - June- 2019

ISSN (5344 – 3321);

p –ISSN (9311 – 4223)

Probability Statistics and Econometric Journal

An official Publication of Center for International Research Development Double Blind Peer and Editorial Review International Referred Journal; Globally index

Available @CIRD.online/PSEJ: E-mail: [email protected]

pg. 20

30. Midi, H., A. Bagheri and A.H.M.R. Imon, (2010).

The application of robust multicollinearity

diagnostic method based on robust coefficient

determination to a non-collinear data. J. Applied

Sci., 10: 611-619

31. O’Brien, R.M. (2007), “A Caution Regarding Rules of

Thumb for Variance Inflation Factors,” Quality &

Quantity, 41, 673-690.

32. Stewart GW. (1987). Collinearity and Least Square

Regression. Statistical Science, 2(1), 68-94.

33. Tu Y.K., Kellett M., Clerehugh V. (2005). Problems of

correlations between explanatory variables in multiple

regression analyses in the dental literature. British Dental

Journal, 199(7):457–461.

34. Tull, D. S. and Hawkins, D. I. (1990) Marketing

Research, 5th edn, Macmillan Publishing Company,

New York.

35. Vasu ES, Elmore PB.(1975). The Effect of

Multicollinearity and the Violation of the Assumption of

Normality on the Testing of Hypotheses in Regression

Analysis.Presented at the Annual Meeting of the

American Educational Research Association,

Washington, D.C., March 30-April 3.

36. Wichers C.R. (1975). The Detection of

Multicollinearity: A Comment.The Review of

Economics and Statistics, Vol. 57, No. 3 (Aug., 1975),

pp. 366-368

APPENDIX

The tables below report, for sample sizes n = 50, 100, 500 and 1000, the simulated OLS estimates and collinearity diagnostics for two explanatory variables as their correlation r(x1, x2) increases.

n = 50

| r(x1,x2) | Estimates (β1; β2) | Std. Error | t value | p-value | Eigenvalues (λ1; λ2) | det(R) | Σ(1/λ) | VIF | CI |
| 0.0005 | 0.8555; 1.6054 | 1.3510; 1.3727 | 0.6333; 1.1694 | 0.5296; 0.2481 | 1.0005; 0.9995 | 0.0789 | 2.0000 | 1.0000 | 1.5443; 2.1037 |
| 0.1369 | 0.8153; 1.5960 | 1.2293; 1.3323 | 0.6632; 1.1980 | 0.5104; 0.2369 | 1.1369; 0.8631 | 0.0604 | 2.0382 | 1.0191 | 2.2548; 3.0475 |
| 0.2287 | 0.7879; 1.5789 | 1.1716; 1.2974 | 0.6725; 1.2167 | 0.5046; 0.2298 | 1.2287; 0.7713 | 0.0502 | 2.1104 | 1.0552 | 2.6419; 3.3923 |
| 0.3216 | 0.7622; 1.5595 | 1.1235; 1.2620 | 0.6785; 1.2357 | 0.5008; 0.2227 | 1.3216; 0.6784 | 0.0413 | 2.2307 | 1.1154 | 3.0230; 3.6728 |
| 0.4806 | 0.7214; 1.5247 | 1.0560; 1.2008 | 0.6831; 1.2698 | 0.4979; 0.2104 | 1.4870; 0.5130 | 0.0281 | 2.6219 | 1.3109 | 3.6132; 4.2545 |
| 0.5551 | 0.7061; 1.5105 | 1.0337; 1.1762 | 0.6831; 1.2842 | 0.4979; 0.2054 | 1.5551; 0.4449 | 0.0234 | 2.8905 | 1.4453 | 3.7693; 4.6357 |
| 0.6133 | 0.6936; 1.4983 | 1.0168; 1.1553 | 0.6821; 1.2969 | 0.4985; 0.2010 | 1.6133; 0.3867 | 0.0197 | 3.2057 | 1.6028 | 3.8664; 5.0563 |
| 0.7071 | 0.6748; 1.4790 | 0.9943; 1.1224 | 0.6786; 1.3176 | 0.5007; 0.1940 | 1.7044; 0.2956 | 0.0144 | 3.9700 | 1.9850 | 3.9752; 5.9461 |
| 0.7697 | 0.6618; 1.4648 | 0.9812; 1.0986 | 0.6744; 1.3333 | 0.5034; 0.1889 | 1.7697; 0.2303 | 0.0108 | 4.9068 | 2.4534 | 4.0301; 6.8624 |
| 0.8169 | 0.6524; 1.4562 | 0.9735; 1.0808 | 0.6702; 1.3454 | 0.5060; 0.1849 | 1.8169; 0.1831 | 0.0084 | 6.0111 | 3.0056 | 4.0598; 7.7903 |
| 0.8516 | 0.6456; 1.4460 | 0.9690; 1.0673 | 0.6663; 1.3548 | 0.5085; 0.1819 | 1.8517; 0.1484 | 0.0067 | 7.2802 | 3.6001 | 4.0762; 8.7247 |
| 0.8777 | 0.6404; 1.4395 | 0.9663; 1.0567 | 0.6627; 1.3623 | 0.5108; 0.1796 | 1.8777; 0.1222 | 0.0055 | 8.7123 | 4.3562 | 4.0851; 9.6633 |
| 0.8977 | 0.6363; 1.4344 | 0.9647; 1.0483 | 0.6596; 1.3683 | 0.5128; 0.1777 | 1.8977; 0.1022 | 0.0045 | 9.3065 | 5.1533 | 4.0896; 10.6045 |
| 0.9133 | 0.6331; 1.4301 | 0.9639; 1.0414 | 0.6568; 1.3732 | 0.5145; 0.1762 | 1.9133; 0.0867 | 0.0038 | 12.0621 | 6.0311 | 4.0914; 11.5478 |
| 0.9257 | 0.6304; 1.4266 | 0.9636; 1.0358 | 0.6543; 1.3773 | 0.5161; 0.1749 | 1.9257; 0.0743 | 0.0033 | 13.9787 | 6.9893 | 4.0916; 12.4926 |
| 0.9356 | 0.6283; 1.4237 | 0.9635; 1.0311 | 0.6521; 1.3808 | 0.5175; 0.1739 | 1.9356; 0.0644 | 0.0031 | 16.0559 | 7.0280 | 4.0908; 13.4384 |
| 0.9438 | 0.6265; 1.4211 | 0.9636; 1.0271 | 0.6501; 1.3837 | 0.5188; 0.1730 | 1.9438; 0.0562 | 0.0024 | 18.2936 | 9.1468 | 4.0894; 14.3852 |
| 0.9504 | 0.6250; 1.4190 | 0.9639; 1.0237 | 0.6484; 1.3862 | 0.5199; 0.1722 | 1.9504; 0.0496 | 0.0022 | 20.6917 | 10.3458 | 4.0876; 15.3327 |
| 0.9560 | 0.6237; 1.4171 | 0.9642; 1.0207 | 0.6468; 1.3883 | 0.5209; 0.1716 | 1.9560; 0.0440 | 0.0019 | 23.2499 | 11.6250 | 4.0856; 16.2808 |
| 0.9607 | 0.6226; 1.4154 | 0.9646; 1.0181 | 0.6454; 1.3902 | 0.5218; 0.1710 | 1.9603; 0.0393 | 0.0017 | 25.9604 | 12.9842 | 4.0834; 17.2293 |
| 0.9647 | 0.6216; 1.4140 | 0.9650; 1.0159 | 0.6442; 1.3919 | 0.5226; 0.1705 | 1.9647; 0.0353 | 0.0015 | 28.8470 | 14.4235 | 4.0813; 18.1783 |
| 0.9681 | 0.6207; 1.4127 | 0.9655; 1.0139 | 0.6429; 1.3934 | 0.5234; 0.1701 | 1.9681; 0.0319 | 0.0014 | 31.8857 | 15.9428 | 4.0791; 19.1276 |
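For two predictors, the collinearity columns in these tables follow in closed form from the pairwise correlation r: the 2x2 correlation matrix has eigenvalues 1 + r and 1 - r, both VIFs equal 1/(1 - r^2), and the Σ(1/λ) column equals trace(R^-1) = VIF1 + VIF2. A minimal sketch of these identities in Python (note: the condition index computed here is that of the 2x2 correlation matrix; the CI column of the tables appears to be based on the full design matrix, so its values differ):

```python
import math

def collinearity_diagnostics(r):
    """Closed-form diagnostics for the correlation matrix R = [[1, r], [r, 1]]."""
    lam = (1 + r, 1 - r)                     # eigenvalues of R
    det_R = 1 - r * r                        # determinant of R
    vif = 1 / det_R                          # VIF, identical for both predictors
    sum_inv_lam = 1 / lam[0] + 1 / lam[1]    # trace(R^-1) = VIF1 + VIF2
    cond_index = math.sqrt(lam[0] / lam[1])  # condition index of R
    return lam, det_R, vif, sum_inv_lam, cond_index

lam, det_R, vif, s, ci = collinearity_diagnostics(0.9133)
print([round(x, 4) for x in lam])  # [1.9133, 0.0867], as in the eigenvalue column
print(round(vif, 2))               # 6.03, "severe" by the paper's VIF > 5 rule
```

This also explains why the eigenvalue and VIF columns move together: an eigenvalue of R falling below 0.1 corresponds to r above 0.9, where the VIF exceeds 5.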

n = 100

| r(x1,x2) | Estimates (β1; β2) | Std. Error | t value | p-value | Eigenvalues (λ1; λ2) | det(R) | Σ(1/λ) | VIF | CI |
| 0.0032 | 0.7344; 1.8694 | 0.9245; 0.9322 | 0.7944; 2.0053 | 0.4289; 0.0477 | 1.0032; 0.9968 | 0.0082 | 2.0000 | 1.0000 | 1.5952; 2.1771 |
| 0.1152 | 0.7870; 1.9454 | 0.8504; 0.8788 | 0.9255; 2.2138 | 0.3570; 0.0292 | 1.1552; 0.8448 | 0.0060 | 2.0493 | 1.0247 | 2.3517; 3.0709 |
| 0.2465 | 0.7778; 1.9469 | 0.8152; 0.8492 | 0.9542; 2.2928 | 0.3424; 0.0240 | 1.2465; 0.7535 | 0.0050 | 2.1294 | 1.0647 | 2.7689; 3.3787 |
| 0.3362 | 0.7580; 1.9361 | 0.7853; 0.8223 | 0.9653; 2.3544 | 0.3368; 0.0206 | 1.3362; 0.6638 | 0.0041 | 2.2548 | 1.1274 | 3.1992; 3.6071 |
| 0.4928 | 0.7094; 1.9001 | 0.7413; 0.7799 | 0.9570; 2.4365 | 0.3410; 0.0166 | 1.5571; 0.4429 | 0.0028 | 2.9002 | 1.3208 | 3.8790; 4.1034 |
| 0.5571 | 0.6859; 1.8810 | 0.7258; 0.7638 | 0.9450; 2.4626 | 0.3470; 0.0156 | 1.6124; 0.3876 | 0.0024 | 3.2001 | 1.4501 | 3.9738; 4.5506 |
| 0.6124 | 0.6642; 1.8627 | 0.7134; 0.7505 | 0.7134; 0.7505 | 0.7134; 0.7505 | 1.6998; 0.3002 | 0.0020 | 3.9192 | 1.6001 | 4.0413; 5.0058 |
| 0.6998 | 0.6270; 1.8304 | 0.6958; 0.7303 | 0.9012; 2.5064 | 0.3697; 0.0139 | 1.6998; 0.3002 | 0.0015 | 3.9192 | 1.9596 | 4.1298; 5.9252 |
| 0.7633 | 0.5973; 1.8039 | 0.6843; 0.7161 | 0.8728; 2.5190 | 0.3849; 0.0134 | 1.7633; 0.2367 | 0.0011 | 4.7917 | 2.3959 | 4.1816; 6.8518 |
| 0.8099 | 0.5734; 1.7822 | 0.6766; 0.7059 | 0.8475; 2.5250 | 0.3988; 0.0132 | 1.8099; 0.1900 | 0.0009 | 5.8136 | 2.9068 | 4.2133; 7.7828 |
| 0.8447 | 0.5542; 1.7644 | 0.6713; 0.6982 | 0.8256; 2.5271 | 0.4111; 0.0131 | 1.8447; 0.1553 | 0.0007 | 6.9823 | 3.4912 | 4.2331; 8.7168 |
| 0.8712 | 0.5383; 1.7496 | 0.6675; 0.6924 | 0.8065; 2.5270 | 0.4219; 0.0131 | 1.8712; 0.1288 | 0.0006 | 8.2965 | 4.1483 | 4.2457; 9.6528 |
| 0.8916 | 0.5251; 1.7373 | 0.6648; 0.6878 | 0.7901; 2.5258 | 0.4314; 0.0132 | 1.8916; 0.1084 | 0.0005 | 9.7553 | 4.8776 | 4.2538; 10.5904 |
| 0.9077 | 0.5141; 1.7268 | 0.6627; 0.6842 | 0.7758; 2.5238 | 0.4398; 0.0132 | 1.9077; 0.0923 | 0.0004 | 11.358 | 5.6790 | 4.2589; 11.5291 |
| 0.9205 | 0.5047; 1.7178 | 0.6612; 0.6813 | 0.7633; 2.5214 | 0.4471; 0.0133 | 1.9205; 0.0795 | 0.0003 | 13.1044 | 6.5522 | 4.2621; 12.4687 |
| 0.9356 | 0.4966; 1.7100 | 0.6601; 0.6789 | 0.7523; 2.5189 | 0.4537; 0.0134 | 1.9309; 0.0691 | 0.0003 | 14.9940 | 7.4970 | 4.2640; 13.4090 |
| 0.9394 | 0.4895; 1.7033 | 0.6592; 0.6769 | 0.7426; 2.5163 | 0.4966; 1.7100 | 1.9394; 0.0606 | 0.0003 | 17.0269 | 8.5135 | 4.2650; 14.3499 |
| 0.9465 | 0.4833; 1.6973 | 0.6585; 0.6751 | 0.7340; 2.5138 | 0.4647; 0.0136 | 1.9465; 0.0535 | 0.0002 | 19.2028 | 9.6014 | 4.2654; 15.2913 |
| 0.9560 | 0.4779; 1.6920 | 0.6579; 0.6738 | 0.7263; 2.5113 | 0.4694; 0.0137 | 1.9524; 0.0476 | 0.0002 | 21.5217 | 10.7608 | 4.2654; 16.2330 |
| 0.9574 | 0.4730; 1.6873 | 0.6575; 0.6725 | 0.7194; 2.5089 | 0.4736; 0.0138 | 1.9574; 0.0426 | 0.0002 | 23.9834 | 11.9917 | 4.2650; 17.1751 |
| 0.9617 | 0.4687; 1.6831 | 0.6572; 0.6714 | 0.7132; 2.5067 | 0.4774; 0.0139 | 1.9617; 0.0383 | 0.0002 | 26.5880 | 13.2940 | 4.2645; 18.1174 |
| 0.9653 | 0.4648; 1.6793 | 0.7075; 2.5045 | 0.7075; 2.5045 | 0.4809; 0.0139 | 1.9653; 0.0347 | 0.0001 | 29.3354 | 14.6677 | 4.2638; 19.060 |


n = 500

| r(x1,x2) | Estimates (β1; β2) | Std. Error | t value | p-value | Eigenvalues (λ1; λ2) | det(R) | Σ(1/λ) | VIF | CI |
| 0.0539 | 1.0284; 0.9338 | 0.4311; 0.4484 | 2.3856; 2.0824 | 0.0174; 0.0378 | 1.0539; 0.9460 | 0.0000722 | 2.0059 | 1.0020 | 1.5128; 1.8969 |
| 0.1321 | 1.0750; 0.9852 | 0.4014; 0.4185 | 2.6781; 2.3538 | 0.0076; 0.0190 | 1.2130; 0.7870 | 0.0000522 | 2.0951 | 1.0476 | 2.2417; 2.7061 |
| 0.3048 | 1.0867; 0.9987 | 0.3872; 0.4033 | 2.8064; 2.4757 | 0.0052; 0.0136 | 1.3048; 0.6952 | 0.0000429 | 2.2048 | 1.1024 | 2.6500; 2.9919 |
| 0.3926 | 1.093; 1.006 | 0.3751; 0.3901 | 2.9136; 2.5795 | 0.0037; 0.0102 | 1.3926; 0.6073 | 0.0000351 | 2.3645 | 1.1823 | 3.0737; 3.2069 |
| 0.5422 | 1.0962; 1.0114 | 0.3572; 0.3699 | 3.0686; 2.7341 | 0.0023; 0.0065 | 1.5422; 0.4578 | 0.0000239 | 2.8326 | 1.4163 | 3.4827; 3.9529 |
| 0.6022 | 1.0955; 1.0112 | 0.3509; 0.3625 | 3.1222; 2.7896 | 0.0019; 0.0054 | 1.6022; 0.3978 | 0.00002 | 3.1379 | 1.5690 | 3.5718; 4.4006 |
| 0.6532 | 1.0941; 1.0102 | 0.3458; 0.3565 | 3.1640; 2.8340 | 0.0017; 0.0048 | 1.6532; 0.3468 | 0.0000169 | 3.4887 | 1.7444 | 3.6395; 4.8522 |
| 0.7330 | 1.0904; 1.0070 | 0.3384; 0.3475 | 3.2222; 2.8981 | 0.0014; 0.0040 | 1.7330; 0.2670 | 0.0000124 | 4.3225 | 2.1612 | 3.7323; 5.7636 |
| 0.7903 | 1.0865; 1.0034 | 0.3335; 0.3413 | 3.2581; 2.9399 | 0.0012; 0.0034 | 1.7903; 0.2097 | 9.4258E-6 | 5.3275 | 2.6637 | 3.7901; 6.6822 |
| 0.8321 | 1.0828; 0.9999 | 0.3301; 0.3369 | 3.2804; 2.9677 | 0.0011; 0.0031 | 1.8321; 0.1680 | 7.3769E-6 | 6.5001 | 3.2501 | 3.8278; 7.6054 |
| 0.8630 | 1.0795; 0.9967 | 0.3277; 0.3337 | 3.2944; 2.9865 | 0.0011; 0.0030 | 1.8630; 0.1370 | 5.9153E-6 | 7.8383 | 3.9192 | 3.8532; 8.5318 |
| 0.8865 | 1.0766; 0.9938 | 0.3259; 0.3313 | 3.3033; 2.9995 | 0.0010; 0.0028 | 1.8865; 0.1135 | 4.8408E-6 | 8.0123 | 4.6704 | 3.8710; 9.4603 |
| 0.9046 | 1.0741; 0.9913 | 0.3246; 0.3295 | 3.3088; 3.0085 | 0.0010; 0.0028 | 1.9046; 0.0954 | 4.0302E-6 | 11.0068 | 5.5034 | 3.8837; 11.3220 |
| 0.9188 | 1.0719; 0.9891 | 0.3236; 0.3281 | 3.3122; 3.0150 | 0.0010; 0.0027 | 1.9188; 0.0812 | 3.4048E-6 | 12.8359 | 6.4180 | 3.8930; 11.4905 |
| 0.9301 | 1.0699; 0.9871 | 0.3228; 0.3269 | 3.3141; 3.0195 | 0.0010; 0.0027 | 1.9301; 0.0699 | 2.9129E-6 | 14.8277 | 7.4138 | 3.9000; 12.2543 |
| 0.9393 | 1.0682; 0.9854 | 0.3222; 0.3260 | 3.3150; 3.0228 | 0.0010; 0.0026 | 1.9393; 0.0607 | 2.5193E-6 | 16.9820 | 8.4910 | 3.9052; 13.1875 |
| 0.9468 | 1.0666; 0.9838 | 0.3217; 0.3252 | 3.3153; 3.0251 | 0.0010; 0.0026 | 1.9468; 0.0532 | 2.1998E-6 | 19.2986 | 9.6493 | 3.9091; 14.1212 |
| 0.9530 | 1.0653; 0.9824 | 0.3213; 0.3246 | 3.3151; 3.0267 | 0.0010; 0.0026 | 1.9529; 0.0470 | 1.937E-6 | 21.7775 | 10.8890 | 3.9122; 15.0555 |
| 0.9582 | 1.0640; 0.9812 | 0.3210; 0.3240 | 3.3146; 3.0279 | 0.0010; 0.0026 | 1.9582; 0.0418 | 1.7183E-6 | 24.4185 | 12.2093 | 3.9146; 15.9901 |
| 0.9626 | 1.0629; 0.9800 | 0.3207; 0.3236 | 3.3139; 3.0286 | 0.0010; 0.0026 | 1.9626; 0.0374 | 1.5344E-6 | 27.2216 | 13.6108 | 3.9165; 16.9251 |
| 0.9663 | 1.0619; 0.9790 | 0.3205; 0.3232 | 3.3131; 3.0291 | 0.0010; 0.0026 | 1.9663; 0.0337 | 1.3784E-6 | 30.1869 | 15.0934 | 3.9180; 17.8603 |
| 0.9695 | 1.0609; 0.9780 | 0.3203; 0.3229 | 3.3122; 3.0293 | 0.0010; 0.0026 | 1.9695; 0.0305 | 1.2449E-6 | 33.3141 | 16.6571 | 3.9191; 18.7958 |

n = 1000

| r(x1,x2) | Estimates (β1; β2) | Std. Error | t value | p-value | Eigenvalues (λ1; λ2) | det(R) | Σ(1/λ) | VIF | CI |
| 0.0765 | 1.0624; 0.9256 | 0.3097; 0.3145 | 3.4308; 2.9434 | 0.0006; 0.0033 | 1.0765; 0.9235 | 8.9052E-6 | 2.01178 | 1.0059 | 1.5207; 1.8639 |
| 0.2379 | 1.1108; 0.9752 | 0.2897; 0.2923 | 3.8346; 3.3363 | 0.0001; 0.0009 | 1.2379; 0.7621 | 6.3896E-6 | 2.12002 | 1.0600 | 2.2697; 2.6753 |
| 0.3291 | 1.1224; 0.9868 | 0.2799; 0.2817 | 4.0097; 3.5031 | 0.0001; 0.0005 | 1.3291; 0.6709 | 5.2396E-6 | 2.2429 | 1.1214 | 2.6888; 2.9946 |
| 0.4156 | 1.1284; 0.9927 | 0.2715; 0.2727 | 4.1557; 3.6408 | 0.00003; 0.0002 | 1.4156; 0.5844 | 4.285E-6 | 2.4175 | 1.2088 | 3.1124; 3.1949 |
| 0.5613 | 1.1309; 0.9946 | 0.2589; 0.2592 | 4.3676; 3.8374 | 0.00001; 0.0001 | 1.5612; 0.4387 | 2.9164E-6 | 2.9199 | 1.4599 | 3.4583; 4.0357 |
| 0.6194 | 1.1299; 0.9932 | 0.2544; 0.2544 | 4.4416; 3.9047 | 9.9278E-6; 0.00010 | 1.6194; 0.3806 | 2.4396E-6 | 3.2446 | 1.6223 | 3.5508; 4.4960 |
| 0.6686 | 1.1281; 0.9912 | 0.2507; 0.2505 | 4.4998; 3.9571 | 7.6017E-6; 0.00008 | 1.6686; 0.3314 | 2.0615E-6 | 3.6165 | 1.8083 | 3.6213; 4.9606 |
| 0.7452 | 1.1238; 0.9864 | 0.2453; 0.2448 | 4.5821; 4.0293 | 1.1281; 0.9912 | 1.7452; 0.2548 | 1.5154E-6 | 4.4975 | 2.2488 | 3.7190; 5.8985 |
| 0.8001 | 1.1193; 0.9816 | 0.2416; 0.2410 | 4.6340; 4.0733 | 4.0626E-6; 0.00005 | 1.8001; 0.1999 | 1.1529E-6 | 5.5569 | 2.7784 | 3.7809; 6.8438 |
| 0.8399 | 1.1153; 0.9771 | 0.2390; 0.2383 | 4.6673; 4.1002 | 3.4675E-6; 0.00004 | 1.8399; 0.1600 | 9.0289E-7 | 6.7912 | 3.3956 | 3.8219; 7.7940 |
| 0.8695 | 1.1116; 0.9731 | 0.2371; 0.2364 | 4.6891; 4.1168 | 3.1248E-6; 0.00004 | 1.8695; 0.1305 | 7.2445E-7 | 8.1984 | 4.0992 | 3.8501; 8.7473 |
| 0.8919 | 1.1085; 0.9698 | 0.2357; 0.2350 | 4.7036; 4.1269 | 2.9162E-6; 0.00003 | 1.8919; 0.1081 | 5.9318E-7 | 9.7775 | 4.8887 | 3.8702; 9.7029 |
| 0.9091 | 1.1057; 0.9668 | 0.2346; 0.2339 | 4.7132; 4.1329 | 2.7848E-6; 0.00003 | 1.9091; 0.0909 | 4.9409E-7 | 11.5276 | 5.7638 | 3.8848; 10.6601 |
| 0.9227 | 1.1032; 0.9642 | 0.2338; 0.2331 | 4.7196; 4.1363 | 2.7007E-6; 0.00004 | 1.9226; 0.0774 | 4.1759E-7 | 13.4483 | 6.7241 | 3.8958; 11.6186 |
| 0.9334 | 1.1010; 0.9619 | 0.2331; 0.2325 | 4.7238; 4.1380 | 2.6468E-6; 0.00004 | 1.9334; 0.0666 | 3.5738E-7 | 15.5392 | 7.7696 | 3.9042; 12.5781 |
| 0.9422 | 1.0992; 0.9599 | 0.2326; 0.2319 | 4.7264; 4.1384 | 2.6129E-6; 0.00004 | 1.9421; 0.0579 | 3.0919E-7 | 17.8004 | 8.9002 | 3.9107; 13.5383 |
| 0.9493 | 1.0976; 0.9581 | 0.2321; 0.2315 | 4.7281; 4.1381 | 2.5927E-6; 0.00004 | 1.9493; 0.0507 | 2.7005E-7 | 20.2314 | 10.1157 | 3.9158; 14.4991 |
| 0.9552 | 1.0960; 0.9565 | 0.2318; 0.2312 | 4.7289; 4.1373 | 2.5819E-6; 0.00004 | 1.9552; 0.0448 | 2.3785E-7 | 22.8323 | 11.4161 | 3.9198; 15.4604 |
| 0.9606 | 1.0946; 0.9550 | 0.2315; 0.2309 | 4.7293; 4.1361 | 2.5778E-6; 0.00003 | 1.9601; 0.0399 | 2.1104E-7 | 25.6029 | 12.8015 | 3.9231; 16.4222 |
| 0.9643 | 1.0934; 0.9537 | 0.2312; 0.2307 | 4.7292; 4.1347 | 2.5784E-6; 0.00004 | 1.9643; 0.0357 | 1.885E-7 | 28.5433 | 14.2717 | 3.9258; 17.3843 |
| 0.9679 | 1.0923; 0.9526 | 0.2309; 0.2305 | 4.7289; 4.1332 | 2.5824E-6; 0.00004 | 1.9679; 0.0321 | 1.6936E-7 | 31.6533 | 16.8257 | 3.9279; 18.2466 |
| 0.9710 | 1.0913; 0.9515 | 0.2308; 0.2303 | 4.7283; 4.1316 | 2.5888E-6; 0.00004 | 1.9710; 0.0291 | 1.5299E-7 | 34.9330 | 17.4665 | 3.9297; 19.3093 |
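A row of these tables can be reproduced in outline by simulating two predictors with a target correlation and fitting OLS. The sketch below is an illustration only: the true coefficients (β0 = 0, β1 = β2 = 1), unit error variance, and Gaussian generator are assumptions for demonstration, since the appendix does not restate the paper's data-generating design.

```python
import numpy as np

def simulate_row(n=1000, r=0.9, beta=(0.0, 1.0, 1.0), sigma=1.0, seed=1):
    """Simulate y = b0 + b1*x1 + b2*x2 + e with Corr(x1, x2) ~ r and fit OLS."""
    rng = np.random.default_rng(seed)
    z1, z2 = rng.standard_normal(n), rng.standard_normal(n)
    x1 = z1
    x2 = r * z1 + np.sqrt(1 - r**2) * z2      # gives Corr(x1, x2) close to r
    X = np.column_stack([np.ones(n), x1, x2])  # design matrix with intercept
    y = X @ np.array(beta) + sigma * rng.standard_normal(n)
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y                      # OLS estimates (b0, b1, b2)
    resid = y - X @ b
    s2 = resid @ resid / (n - X.shape[1])      # residual variance estimate
    se = np.sqrt(s2 * np.diag(XtX_inv))        # standard errors
    r_hat = np.corrcoef(x1, x2)[0, 1]
    vif = 1 / (1 - r_hat**2)                   # VIF, same for both predictors
    return b, se, vif

b, se, vif = simulate_row(n=1000, r=0.9)
t = b / se  # t statistics, as reported in the tables
```

As the tables show, raising r inflates the standard errors (and hence shrinks the t statistics) without changing the fitted relationship itself, which is the behavior the VIF and eigenvalue thresholds are meant to flag.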