Basic output through to regression models
Title: Stata2Mplus conversion for ego_ghq12_id.dta.dta
  List of variables converted shown below

  ghq01 : ghq time1 item 1
  ghq02 : ghq time1 item 2
  ghq03 : ghq time1 item 3
  ghq04 : ghq time1 item 4
  ghq05 : ghq time1 item 5
  ghq06 : ghq time1 item 6
  ghq07 : ghq time1 item 7
  ghq08 : ghq time1 item 8
  ghq09 : ghq time1 item 9
  ghq10 : ghq time1 item 10
  ghq11 : ghq time1 item 11
  ghq12 : ghq time1 item 12
  f1 : Scores for factor 1
  id :

Data:
  File is "C:\work\courses\mar09_course\ego\ego_ghq12_id.dta.dat" ;
  listwise is on;
Variable:
  Names are
    ghq01 ghq02 ghq03 ghq04 ghq05 ghq06 ghq07 ghq08 ghq09 ghq10 ghq11 ghq12 f1 id;
  Missing are all (-9999) ;
  !usevariables = ghq01 ghq03 ghq05 ghq07 ghq09 ghq11;
  usevariables = ghq02 ghq04 ghq06 ghq08 ghq10 ghq12;
  idvariable = id;
Analysis:
  Type = basic ;
output:
  !sampstat;
plot:
  type is plot3;
savedata:
  file is "C:\work\courses\mar09_course\ego\ego_odd.dat" ;
Output
INPUT READING TERMINATED NORMALLY
Stata2Mplus conversion for ego_ghq12_id.dta.dta
List of variables converted shown below
ghq01 : ghq time1 item 1
ghq02 : ghq time1 item 2
ghq03 : ghq time1 item 3
ghq04 : ghq time1 item 4
ghq05 : ghq time1 item 5
ghq06 : ghq time1 item 6
ghq07 : ghq time1 item 7
ghq08 : ghq time1 item 8
ghq09 : ghq time1 item 9
ghq10 : ghq time1 item 10
ghq11 : ghq time1 item 11
ghq12 : ghq time1 item 12
f1 : Scores for factor 1
id :
SUMMARY OF ANALYSIS
Number of groups                                    1
Number of observations                           1119
Number of dependent variables                       6
Number of independent variables                     0
Number of continuous latent variables               0

Observed dependent variables

  Continuous
   GHQ02  GHQ04  GHQ06  GHQ08  GHQ10  GHQ12

Variables with special functions

  ID variable      ID

Estimator                                          ML
Information matrix                           OBSERVED
Maximum number of iterations                     1000
Convergence criterion                       0.500D-04
Maximum number of steepest descent iterations      20
Maximum number of iterations for H1              2000
Convergence criterion for H1                0.100D-03

Input data file(s)
  C:\work\courses\mar09_course\ego\ego_ghq12_id.dta.dat
Input data format  FREE
SUMMARY OF DATA
Number of missing data patterns 1
SUMMARY OF MISSING DATA PATTERNS
MISSING DATA PATTERNS (x = not missing)

           1
 GHQ02     x
 GHQ04     x
 GHQ06     x
 GHQ08     x
 GHQ10     x
 GHQ12     x

MISSING DATA PATTERN FREQUENCIES

 Pattern   Frequency
       1        1119
COVARIANCE COVERAGE OF DATA
Minimum covariance coverage value 0.100
PROPORTION OF DATA PRESENT
Covariance Coverage
           GHQ02     GHQ04     GHQ06     GHQ08     GHQ10     GHQ12
           ________  ________  ________  ________  ________  ________
 GHQ02     1.000
 GHQ04     1.000     1.000
 GHQ06     1.000     1.000     1.000
 GHQ08     1.000     1.000     1.000     1.000
 GHQ10     1.000     1.000     1.000     1.000     1.000
 GHQ12     1.000     1.000     1.000     1.000     1.000     1.000
RESULTS FOR BASIC ANALYSIS
ESTIMATED SAMPLE STATISTICS
Means
           GHQ02     GHQ04     GHQ06     GHQ08     GHQ10     GHQ12
           ________  ________  ________  ________  ________  ________
 1         2.161     2.123     2.060     2.195     1.987     2.223

Covariances
           GHQ02     GHQ04     GHQ06     GHQ08     GHQ10     GHQ12
           ________  ________  ________  ________  ________  ________
 GHQ02     0.768
 GHQ04     0.152     0.373
 GHQ06     0.350     0.229     0.653
 GHQ08     0.211     0.199     0.271     0.387
 GHQ10     0.387     0.264     0.439     0.305     0.873
 GHQ12     0.266     0.196     0.312     0.250     0.380     0.520

Correlations
           GHQ02     GHQ04     GHQ06     GHQ08     GHQ10     GHQ12
           ________  ________  ________  ________  ________  ________
 GHQ02     1.000
 GHQ04     0.284     1.000
 GHQ06     0.494     0.465     1.000
 GHQ08     0.387     0.525     0.538     1.000
 GHQ10     0.473     0.464     0.581     0.524     1.000
 GHQ12     0.421     0.445     0.535     0.556     0.564     1.000
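The Correlations table is just the Covariances table rescaled by the two standard deviations. As a quick check (a minimal sketch in Python, not part of the Mplus run; the dictionary and function names are made up for illustration), two entries can be recomputed from the printed, three-decimal covariances:

```python
import math

# Variances (diagonal) and covariances (off-diagonal) copied from the
# printed "Covariances" table; values are rounded to three decimals there.
variances = {"GHQ02": 0.768, "GHQ04": 0.373, "GHQ10": 0.873, "GHQ12": 0.520}
covariances = {("GHQ04", "GHQ02"): 0.152, ("GHQ12", "GHQ10"): 0.380}


def correlation(cov, var_a, var_b):
    """Pearson correlation from a covariance and the two variances."""
    return cov / math.sqrt(var_a * var_b)


# Recompute the correlation for each pair listed above.
recomputed = {
    pair: correlation(cov, variances[pair[0]], variances[pair[1]])
    for pair, cov in covariances.items()
}

for pair, r in recomputed.items():
    print(pair, round(r, 3))
```

Up to rounding of the inputs, the results agree with the printed correlations of 0.284 (GHQ04 with GHQ02) and 0.564 (GHQ12 with GHQ10).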
PLOT INFORMATION
The following plots are available:
  Histograms (sample values)
  Scatterplots (sample values)
SAVEDATA INFORMATION
Order and format of variables

  GHQ02    F10.3
  GHQ04    F10.3
  GHQ06    F10.3
  GHQ08    F10.3
  GHQ10    F10.3
  GHQ12    F10.3
  ID       I5

Save file
  C:\work\courses\mar09_course\ego\ego_odd.dat
Save file format
  6F10.3 I5
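The save file format 6F10.3 I5 describes a fixed-width record: six real-valued fields of width 10 followed by one integer field of width 5. If the saved file needs to be read back outside Mplus, a record can be split by position. The sketch below is only an illustration in Python; the function name and the example record are hypothetical, and it assumes every field holds a number (any missing-value flag in the file would need extra handling).

```python
def parse_record(line, n_float=6, float_width=10, id_width=5):
    """Split one fixed-width record laid out as 6F10.3 followed by I5."""
    values = []
    for i in range(n_float):
        field = line[i * float_width:(i + 1) * float_width]
        values.append(float(field))  # float() ignores the padding spaces
    start = n_float * float_width
    record_id = int(line[start:start + id_width])
    return values, record_id


# Hypothetical record built here only to exercise the parser;
# it is not a row from ego_odd.dat.
example = "".join(f"{v:10.3f}" for v in [2, 1, 2, 2, 3, 2]) + f"{17:5d}"
values, record_id = parse_record(example)
print(values, record_id)
```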
Histograms and Scatterplots
Define new variables
Variable:
  Names are
    ghq01 ghq02 ghq03 ghq04 ghq05 ghq06 ghq07 ghq08 ghq09 ghq10 ghq11 ghq12 f1 id;
  Missing are all (-9999) ;
  usevariables = sumodd sumeven;
  idvariable = id;
Define:
  sumodd = ghq01 + ghq03 + ghq05 + ghq07 + ghq09 + ghq11;
  sumeven = ghq02 + ghq04 + ghq06 + ghq08 + ghq10 + ghq12;
Analysis: Type = basic ;
Etc.
Histogram dialogue box
8 bins
10 bins
12 bins
multihist ghq02 ghq04 ghq06 ghq08 ghq10 ghq12
[Figure: frequency histograms, one panel per item (ghq time1 items 2, 4, 6, 8, 10 and 12; n=1119/1119 in each panel); x-axis: response category 1 to 4; y-axis: Frequency]
Scatterplot dialogue box
Full sample
Random sample of 250
[Figure: scatterplot of Sum of odd items (y-axis) against Sum of even items (x-axis); both axes run from 5 to 25]
Regression models
Linear Regression
Variable:
  Names are
    ghq01 ghq02 ghq03 ghq04 ghq05 ghq06 ghq07 ghq08 ghq09 ghq10 ghq11 ghq12 f1 id;
  Missing are all (-9999) ;
  usevariables = sumodd sumeven;
  idvariable = id;
Define:
  sumodd = ghq01 + ghq03 + ghq05 + ghq07 + ghq09 + ghq11;
  sumeven = ghq02 + ghq04 + ghq06 + ghq08 + ghq10 + ghq12;
Analysis:
  estimator = ML;
Model:
  sumodd on sumeven;
output:
  sampstat cinterval;
plot:
  type is plot3;
savedata:
  file is "C:\work\courses\mar09_course\ego\ego_oddeven_regress.dat" ;
  SAVE = MAHALANOBIS COOKS INFLUENCE;
TESTS OF MODEL FIT

Chi-Square Test of Model Fit

          Value                              0.000
          Degrees of Freedom                     0
          P-Value                           0.0000

Chi-Square Test of Model Fit for the Baseline Model

          Value                           1635.553
          Degrees of Freedom                     1
          P-Value                           0.0000

CFI/TLI

          CFI                                1.000
          TLI                                1.000

Loglikelihood

          H0 Value                       -5155.247
          H1 Value                       -5155.247
Output
Information Criteria

          Number of Free Parameters              3
          Akaike (AIC)                   10316.495
          Bayesian (BIC)                 10331.555
          Sample-Size Adjusted BIC       10322.027
            (n* = (n + 2) / 24)

RMSEA (Root Mean Square Error Of Approximation)

          Estimate                           0.000
          90 Percent C.I.                    0.000  0.000
          Probability RMSEA <= .05           0.000

SRMR (Standardized Root Mean Square Residual)

          Value                              0.000
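The information criteria in this output follow directly from the H0 loglikelihood, the number of free parameters and the sample size. The sketch below (plain Python, written for this note rather than taken from Mplus; variable names are illustrative) recomputes them from the printed values using the usual definitions AIC = -2LL + 2k and BIC = -2LL + k ln(n), with n replaced by n* = (n + 2) / 24 for the sample-size adjusted BIC.

```python
import math

loglik_h0 = -5155.247  # H0 loglikelihood as printed (three decimals)
n_params = 3           # slope, intercept, residual variance
n_obs = 1119

deviance = -2.0 * loglik_h0

aic = deviance + 2 * n_params
bic = deviance + n_params * math.log(n_obs)

# Sample-size adjusted BIC uses n* = (n + 2) / 24 in place of n.
n_star = (n_obs + 2) / 24
adjusted_bic = deviance + n_params * math.log(n_star)

print(round(aic, 3), round(bic, 3), round(adjusted_bic, 3))
```

The recomputed values differ from the printed 10316.495, 10331.555 and 10322.027 only in the third decimal, because the loglikelihood is itself rounded in the output.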
MODEL RESULTS

                                                 Two-Tailed
                    Estimate     S.E.  Est./S.E.    P-Value

 SUMODD   ON
    SUMEVEN            0.890    0.015     60.886      0.000

 Intercepts
    SUMODD             1.941    0.193     10.051      0.000

 Residual Variances
    SUMODD             2.868    0.121     23.654      0.000

CONFIDENCE INTERVALS OF MODEL RESULTS

                  Lower .5%  Lower 2.5%  Estimate  Upper 2.5%  Upper .5%

 SUMODD   ON
    SUMEVEN           0.852       0.861     0.890       0.919      0.928

 Intercepts
    SUMODD            1.444       1.563     1.941       2.320      2.439

 Residual Variances
    SUMODD            2.556       2.631     2.868       3.106      3.181
Compare with Stata
      Source |       SS       df       MS              Number of obs =    1119
-------------+------------------------------           F(  1,  1117) = 3700.56
       Model |  10633.9457     1  10633.9457           Prob > F      =  0.0000
    Residual |  3209.82016  1117  2.87360802           R-squared     =  0.7681
-------------+------------------------------           Adj R-squared =  0.7679
       Total |  13843.7659  1118  12.3826171           Root MSE      =  1.6952

------------------------------------------------------------------------------
      sumodd |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     sumeven |   .8900851   .0146318    60.83   0.000     .8613762    .9187941
       _cons |   1.941059      .1933    10.04   0.000     1.561787    2.320332
------------------------------------------------------------------------------
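OLS / ML estimation: the slope and intercept are identical in the two programs (0.890 and 1.941). The small differences elsewhere come from the divisor for the residual variance. Stata's OLS mean square (2.874) divides the residual sum of squares by n - 2 = 1117, whereas the Mplus ML residual variance (2.868) divides it by n = 1119, so the ML standard errors are marginally smaller and Est./S.E. (60.886) is marginally larger than Stata's t (60.83).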
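The link between the Stata (OLS) and Mplus (ML) results can be checked with a few lines of arithmetic. The following is a minimal sketch in Python using only numbers printed in the two outputs; it assumes the standard results that the ML residual variance is the residual sum of squares divided by n rather than n - 2, and that the likelihood-ratio test of a single slope in a normal linear model equals n ln(SS_total / SS_residual). Variable names are illustrative.

```python
import math

n = 1119
ss_resid = 3209.82016      # Residual SS from the Stata ANOVA table
ss_total = 13843.7659      # Total SS from the Stata ANOVA table
se_slope_ols = 0.0146318   # Stata Std. Err. for sumeven
se_const_ols = 0.1933      # Stata Std. Err. for _cons

# Residual variance: OLS divides by n - 2, ML divides by n.
resid_var_ols = ss_resid / (n - 2)
resid_var_ml = ss_resid / n

# Standard errors scale with the square root of the residual variance.
shrink = math.sqrt((n - 2) / n)
se_slope_ml = se_slope_ols * shrink
se_const_ml = se_const_ols * shrink

# Likelihood-ratio test of the slope against an intercept-only model.
baseline_chi2 = n * math.log(ss_total / ss_resid)

print(round(resid_var_ols, 3), round(resid_var_ml, 3))
print(round(se_slope_ml, 3), round(se_const_ml, 3))
print(round(baseline_chi2, 2))
```

Under these assumptions the rescaled values reproduce the Mplus residual variance (2.868), standard errors (0.015 and 0.193) and the baseline chi-square (1635.553) to the precision printed.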
SAVE = MAHALANOBIS COOKS INFLUENCE;
[Figure: Mahalanobis distance (y-axis, ticks 1 to 15) plotted against Observation (x-axis, ticks 1 to 1105)]
[Figure: Influence (y-axis, 0 to 0.3) plotted against Sum of even items (x-axis, 0 to 30)]
Logistic regression 1 – continuous predictor
Variable:
  Names are
    ghq01 ghq02 ghq03 ghq04 ghq05 ghq06 ghq07 ghq08 ghq09 ghq10 ghq11 ghq12 f1 id;
  Missing are all (-9999) ;
  usevariables = sumodd sumeven;
  categorical are sumodd;
  idvariable = id;
Define:
  sumodd = ghq01 + ghq03 + ghq05 + ghq07 + ghq09 + ghq11;
  sumeven = ghq02 + ghq04 + ghq06 + ghq08 + ghq10 + ghq12;
  cut sumodd (16);
Analysis:
  estimator = ML;
Model:
  sumodd on sumeven;
output:
  sampstat; cinterval;
MODEL RESULTS

                                                 Two-Tailed
                    Estimate     S.E.  Est./S.E.    P-Value

 SUMODD   ON
    SUMEVEN            0.970    0.070     13.856      0.000

 Thresholds
    SUMODD$1          15.665    1.080     14.499      0.000
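Mplus reports a threshold for the binary outcome rather than an intercept; the threshold is the intercept with the sign reversed, so the fitted logit is -threshold + slope * sumeven. The sketch below (Python, illustrative only; the function name is made up) turns the printed estimates into predicted probabilities of scoring above the cut of 16 on the odd items.

```python
import math

threshold = 15.665  # SUMODD$1 as printed
slope = 0.970       # SUMODD ON SUMEVEN as printed


def prob_above_cut(sumeven):
    """P(sumodd > 16 | sumeven) implied by the printed estimates."""
    logit = -threshold + slope * sumeven
    return 1.0 / (1.0 + math.exp(-logit))


for x in (12, 16, 20):
    print(x, round(prob_above_cut(x), 3))
```

With these rounded estimates the predicted probability is about 0.46 at sumeven = 16, close to zero at 12 and close to one at 20, which fits the steep slope.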
CONFIDENCE INTERVALS OF MODEL RESULTS

                  Lower .5%  Lower 2.5%  Estimate  Upper 2.5%  Upper .5%

 SUMODD   ON
    SUMEVEN           0.790       0.833     0.970       1.107      1.150

 Thresholds
    SUMODD$1         12.882      13.547    15.665      17.783     18.448

CONFIDENCE INTERVALS FOR THE LOGISTIC REGRESSION ODDS RATIO RESULTS

 SUMODD   ON
    SUMEVEN           2.203       2.300     2.638       3.026      3.159
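The odds ratio results are the logit-scale results exponentiated. A short Python check (a sketch using the rounded values printed above; names are illustrative) also shows how the standard error that Stata's logistic command reports on the odds ratio scale (.184695 in the comparison that follows) relates to the logit-scale standard error through the delta method, SE(OR) ≈ OR × SE(logit).

```python
import math

estimate = 0.970  # logit-scale slope
se = 0.070        # logit-scale standard error
ci_logit = {
    "lower_0.5": 0.790,
    "lower_2.5": 0.833,
    "upper_2.5": 1.107,
    "upper_0.5": 1.150,
}

# Exponentiate the estimate and each confidence limit.
odds_ratio = math.exp(estimate)
ci_odds_ratio = {name: math.exp(limit) for name, limit in ci_logit.items()}

# Delta-method standard error on the odds ratio scale.
se_odds_ratio = odds_ratio * se

print(round(odds_ratio, 3), round(se_odds_ratio, 3))
print({name: round(value, 3) for name, value in ci_odds_ratio.items()})
```

Small third-decimal differences from the printed odds ratio limits are expected, because the inputs here are already rounded to three decimals.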
Compare with Stata

. gen sumodd_g = sumodd
. recode sumodd_g 0/16=0 17/24=1
(sumodd_g: 1119 changes made)

. tab sumodd_g

   sumodd_g |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |        916       81.86       81.86
          1 |        203       18.14      100.00
------------+-----------------------------------
      Total |      1,119      100.00

. logistic sumodd_g sumeven

Logistic regression                               Number of obs   =       1119
                                                  LR chi2(1)      =     666.85
                                                  Prob > chi2     =     0.0000
Log likelihood = -196.45269                       Pseudo R2       =     0.6292

------------------------------------------------------------------------------
    sumodd_g | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     sumeven |   2.638126    .184695    13.86   0.000     2.299868    3.026134
------------------------------------------------------------------------------
Logistic regression 2 – binary predictor
Variable:
  Names are
    ghq01 ghq02 ghq03 ghq04 ghq05 ghq06 ghq07 ghq08 ghq09 ghq10 ghq11 ghq12 f1 id;
  Missing are all (-9999) ;
  usevariables = sumodd sumeven;
  categorical are sumodd;
  idvariable = id;
Define:
  sumodd = ghq01 + ghq03 + ghq05 + ghq07 + ghq09 + ghq11;
  sumeven = ghq02 + ghq04 + ghq06 + ghq08 + ghq10 + ghq12;
  cut sumeven (16);
  cut sumodd (16);
Analysis:
  estimator = ML;
Model:
  sumodd on sumeven;
output:
  sampstat; cinterval;
Note on the "categorical are" line: don't put sumeven here. The list names the outcome (sumodd) only; the dichotomised predictor sumeven stays out of it.
MODEL RESULTS

                                                 Two-Tailed
                    Estimate     S.E.  Est./S.E.    P-Value

 SUMODD   ON
    SUMEVEN            4.647    0.273     17.020      0.000

 Thresholds
    SUMODD$1           2.687    0.132     20.307      0.000

CONFIDENCE INTERVALS OF MODEL RESULTS

                  Lower .5%  Lower 2.5%  Estimate  Upper 2.5%  Upper .5%

 SUMODD   ON
    SUMEVEN           3.944       4.112     4.647       5.182      5.350

 Thresholds
    SUMODD$1          2.346       2.428     2.687       2.946      3.028

CONFIDENCE INTERVALS FOR THE LOGISTIC REGRESSION ODDS RATIO RESULTS

 SUMODD   ON
    SUMEVEN          51.618      61.069   104.289     178.096    210.706
Compare with Stata

. gen sumeven_g = sumeven
. recode sumeven_g 0/16=0 17/24=1
(sumeven_g: 1119 changes made)

. tab sumeven_g

  sumeven_g |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |        957       85.52       85.52
          1 |        162       14.48      100.00
------------+-----------------------------------
      Total |      1,119      100.00

. xi: logistic sumodd_g i.sumeven_g

Logistic regression                               Number of obs   =       1119
                                                  LR chi2(1)      =     484.77
                                                  Prob > chi2     =     0.0000
Log likelihood = -287.49045                       Pseudo R2       =     0.4574

------------------------------------------------------------------------------
    sumodd_g | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
_Isumeven_~1 |   104.2885   28.47434    17.02   0.000      61.0702    178.0917
------------------------------------------------------------------------------
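With a binary outcome and a single binary predictor, the logistic model is saturated, so its estimates can be read off the 2x2 table of the two dichotomised scores. The cross-tabulation is not printed in either output; the cell counts used below are an assumption, back-calculated from the reported threshold and slope together with the marginal totals in the two tab listings (957/162 and 916/203). The sketch is in Python and the variable names are illustrative.

```python
import math

# ASSUMED cell counts (not printed in the output), chosen to match the
# marginal totals 957/162 (sumeven_g) and 916/203 (sumodd_g):
#                 sumodd_g = 0   sumodd_g = 1
# sumeven_g = 0        896            61
# sumeven_g = 1         20           142
a, b = 896, 61
c, d = 20, 142

odds_ratio = (a * d) / (b * c)                   # cross-product ratio
log_odds_ratio = math.log(odds_ratio)            # logit-scale slope
se_log_or = math.sqrt(1/a + 1/b + 1/c + 1/d)     # Wald standard error
threshold = -math.log(b / a)                     # minus the logit at sumeven_g = 0

z = 1.959964  # two-sided 95% normal quantile
ci_95 = (
    math.exp(log_odds_ratio - z * se_log_or),
    math.exp(log_odds_ratio + z * se_log_or),
)

print(round(odds_ratio, 4), round(log_odds_ratio, 3), round(se_log_or, 3))
print(round(threshold, 3), tuple(round(limit, 2) for limit in ci_95))
```

If the assumed counts are right, this reproduces the slope (4.647), its standard error (0.273), the threshold (2.687) and the odds ratio of about 104.29 with its 95% interval, up to small numerical differences between the programs.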
Logistic regression 3 – ordinal predictor
Variable:
  Names are
    ghq01 ghq02 ghq03 ghq04 ghq05 ghq06 ghq07 ghq08 ghq09 ghq10 ghq11 ghq12 f1 id;
  Missing are all (-9999) ;
  usevariables = sumodd ghq02_1 ghq02_2;
  categorical are sumodd;
  idvariable = id;
Define:
  sumodd = ghq01 + ghq03 + ghq05 + ghq07 + ghq09 + ghq11;
  !sumeven = ghq02 + ghq04 + ghq06 + ghq08 + ghq10 + ghq12;
  cut sumodd (16);
  ghq02_1 = ghq02;
  ghq02_2 = ghq02;
  cut ghq02_1 (1);
  cut ghq02_2 (2);
  if ghq02_2 eq 1 then ghq02_1 = 0;
Analysis:
  estimator = ML;
Model:
  sumodd on ghq02_1 ghq02_2;
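The Define block builds two dummy variables from the four-category item ghq02. To see which category ends up where, the same steps can be traced in Python. This is only a sketch: it assumes that a single-cutpoint cut codes values at or below the cutpoint as 0 and values above it as 1 (consistent with cut sumodd (16) being matched in Stata by recode 0/16=0 17/24=1), and the function names are invented for the illustration.

```python
def cut(value, cutpoint):
    """Single-cutpoint dichotomisation: 0 if value <= cutpoint, else 1."""
    return 0 if value <= cutpoint else 1


def dummies_for_ghq02(ghq02):
    """Trace the Define statements for one value of ghq02."""
    ghq02_1 = cut(ghq02, 1)   # cut ghq02_1 (1);
    ghq02_2 = cut(ghq02, 2)   # cut ghq02_2 (2);
    if ghq02_2 == 1:          # if ghq02_2 eq 1 then ghq02_1 = 0;
        ghq02_1 = 0
    return ghq02_1, ghq02_2


coding = {value: dummies_for_ghq02(value) for value in (1, 2, 3, 4)}
print(coding)
# {1: (0, 0), 2: (1, 0), 3: (0, 1), 4: (0, 1)}
```

So category 1 is the reference, ghq02_1 flags category 2, and ghq02_2 flags categories 3 and 4 combined.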
MODEL RESULTS

                                                 Two-Tailed
                    Estimate     S.E.  Est./S.E.    P-Value

 SUMODD   ON
    GHQ02_1            2.103    0.524      4.015      0.000
    GHQ02_2            3.786    0.515      7.348      0.000

 Thresholds
    SUMODD$1           4.182    0.504      8.301      0.000

CONFIDENCE INTERVALS OF MODEL RESULTS

                  Lower .5%  Lower 2.5%  Estimate  Upper 2.5%  Upper .5%

 SUMODD   ON
    GHQ02_1           0.754       1.076     2.103       3.129      3.452
    GHQ02_2           2.459       2.776     3.786       4.796      5.113

 Thresholds
    SUMODD$1          2.884       3.195     4.182       5.170      5.480

CONFIDENCE INTERVALS FOR THE LOGISTIC REGRESSION ODDS RATIO RESULTS

 SUMODD   ON
    GHQ02_1           2.125       2.933     8.188      22.853     31.550
    GHQ02_2          11.691      16.056    44.075     120.987    166.159
Compare with Stata
. recode ghq02 4=3
(ghq02: 88 changes made)

. xi: logistic sumodd_g i.ghq02
i.ghq02           _Ighq02_1-3         (naturally coded; _Ighq02_1 omitted)

Logistic regression                               Number of obs   =       1119
                                                  LR chi2(2)      =     190.38
                                                  Prob > chi2     =     0.0000
Log likelihood = -434.68929                       Pseudo R2       =     0.1796

------------------------------------------------------------------------------
    sumodd_g | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   _Ighq02_2 |     8.1875   4.287567     4.02   0.000     2.933598    22.85083
   _Ighq02_3 |   44.07477    22.7058     7.35   0.000     16.05759    120.9761
------------------------------------------------------------------------------