
Stata: Linear Regression

Stata 3, linear regression

Hein Stigum (H.S.), Mar-20

Presentation, data and programs at: http://folk.uio.no/heins/ courses

SYNTHETIC DATA EXAMPLE

Birth weight by gestational age

Linear regression

Birth weight by gestational age

Regression idea

Model:
$$y = \beta_0 + \beta_1 x + \varepsilon$$
where $y$ = outcome, $x$ = covariate, $\beta_1$ = coefficient (effect of $x$), $\varepsilon$ = residual (error).

Model with many cofactors (covariates $x_1, x_2$):
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$$

[Figure: birth weight (gram) by gestational age (days).]

Model, measure and assumptions

• Model
  $$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2)$$
• Association measure
  $\beta_1$ = change in $y$ for a one-unit increase in $x_1$
• Assumptions
  – Independent errors
  – Linear effects
  – Constant error variance
• Robustness
  – Influence

Association measure

Model:
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2$$

Start with $x_1 = 1$ and $x_1 = 2$, with $x_2$ unchanged:
$$\Delta y = y_2 - y_1 = (\beta_0 + \beta_1 \cdot 2 + \beta_2 x_2) - (\beta_0 + \beta_1 \cdot 1 + \beta_2 x_2) = \beta_1$$

Hence:
$$\Delta y = \beta_1$$
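A numerical check of the association measure can be sketched in Stata. The data below are simulated for illustration only (the variable names follow the slides, but the coefficients and sample size are assumptions, not the course dataset). The difference between the predicted outcome at gest=281 and gest=280, with sex held fixed, should equal the estimated coefficient for gest.

```stata
* Hypothetical synthetic data, not the course dataset
clear
set seed 12345
set obs 500
generate gest = round(rnormal(280, 10))
generate sex  = runiform() < 0.5
generate bw   = -1500 + 18*gest - 150*sex + rnormal(0, 400)

regress bw gest sex

* Predicted outcome at gest = 280 and gest = 281, sex held at 0
scalar y1 = _b[_cons] + _b[gest]*280 + _b[sex]*0
scalar y2 = _b[_cons] + _b[gest]*281 + _b[sex]*0

* The one-unit difference equals the coefficient for gest
display "delta y = " (scalar(y2) - scalar(y1)) "   b[gest] = " _b[gest]
assert reldif(scalar(y2) - scalar(y1), _b[gest]) < 1e-6
```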

Purpose of regression

• Estimation
  – Estimate association between outcome and exposure adjusted for other covariates
• Prediction
  – Use an estimated model to predict the outcome given covariates in a new dataset

Outcome distributions by exposure

[Figure: three panels of outcome distributions for exposed and unexposed groups, each with a suggested analysis.]

• Linear regression
• Quantile regression, or cutoff and logistic regression
• Linear regression, or transform and then linear regression

Workflow

• DAG
• Scatter and density plots
• Bivariate analysis
• Regression
  – Model estimation
  – Test of assumptions
    • Independent errors
    • Linear effects
    • Constant error variance
  – Robustness
    • Influence

[DAG nodes: E = gest age, D = birth weight, C1 = sex, C2 = education.]

Scatter and density plots

• Scatter of birth weight by gestational age: look for deviations from linearity and outliers
• Distribution of birth weight for low/high gestational age (gest<280 days, gest>=280 days): look for shift in shape

[Figure: scatter plot of birth weight (gr) against gestational age, with observations 370 and 4962 labeled; density plot of birth weight (gr) for gest<280 days and gest>=280 days.]

Syntax

• Estimation
    regress y x1 x2              // linear regression
    regress y c.age i.sex        // continuous age, categorical sex
    regress y c.age##i.sex       // main effects + interaction
• Compare models
    estimates store m1           // save model
    estimates table m1 m2        // compare coefficients
    estimates stats m1 m2        // compare model fit
• Post estimation
    predict res, residuals       // predict residuals

Model 1: outcome + exposure

    regress bw gest              // crude model
    estimates store m1           // store model results

Model 2 and 3: Add covariates

    regress bw gest i.educ sex       // add covariates
    estimates table m1 m2 m3         // compare coefs
    estimates stats m1 m2 m3         // compare fit

• Estimate association: m1 is biased, m2=m3. m3 more precise?
• Prediction: m3 is best
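The model comparison can be sketched end to end as follows. The slide shows only one regress command, so the definitions of m2 (adjusting for education only) and m3 (adding sex) are assumptions made here for illustration, and the data are simulated rather than the course dataset.

```stata
* Hypothetical synthetic data, not the course dataset
clear
set seed 12345
set obs 500
generate gest = round(rnormal(280, 10))
generate sex  = runiform() < 0.5
generate educ = 1 + (runiform() > 0.3) + (runiform() > 0.7)
generate bw   = -1500 + 18*gest + 70*(educ==2) + 100*(educ==3) - 150*sex + rnormal(0, 400)

* m1: crude model
regress bw gest
estimates store m1

* m2: assumed here to adjust for education only
regress bw gest i.educ
estimates store m2

* m3: assumed here to add sex as well
regress bw gest i.educ sex
estimates store m3

* Coefficients and standard errors side by side, then fit statistics (AIC, BIC)
estimates table m1 m2 m3, b(%8.1f) se(%8.1f)
estimates stats m1 m2 m3
```

Comparing the standard error of gest across m2 and m3 in the table is one way to look at the "m3 more precise?" question.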

Factor (categorical) variables

• Variable
  – educ = 1, 2, 3 for low, medium and high education
• Built in
  – i.educ      use educ=1 as base (reference)
  – ib3.educ    use educ=3 as base (reference)
• Manual "dummies"
  – educ=1 as base, make dummies for 2 and 3
    generate Medium = (educ==2) if educ<.
    generate High   = (educ==3) if educ<.
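A short check that the built-in factor notation and the manual dummies give the same estimates, sketched on simulated data (the data-generating values are assumptions for illustration only):

```stata
* Hypothetical synthetic data, not the course dataset
clear
set seed 12345
set obs 500
generate gest = round(rnormal(280, 10))
generate sex  = runiform() < 0.5
generate educ = 1 + (runiform() > 0.3) + (runiform() > 0.7)
generate bw   = -1500 + 18*gest + 70*(educ==2) + 100*(educ==3) - 150*sex + rnormal(0, 400)

* Built-in factor variable, educ=1 as base
regress bw gest i.educ sex
scalar b_medium = _b[2.educ]
scalar b_high   = _b[3.educ]

* Manual dummies, educ=1 as base
generate Medium = (educ==2) if educ<.
generate High   = (educ==3) if educ<.
regress bw gest Medium High sex

* Both parameterizations give the same coefficients
assert reldif(_b[Medium], scalar(b_medium)) < 1e-6
assert reldif(_b[High],   scalar(b_high))   < 1e-6
```

The `if educ<.` qualifier keeps the dummies missing where educ is missing; without it, missing education would be coded 0.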

Create meaningful constant

Expected birth weight:
$$E(bw) = \beta_0 + \beta_1 \cdot gest + \beta_2 \cdot educ_2 + \beta_3 \cdot educ_3 + \beta_4 \cdot sex$$

Expected birth weight at:
• gest=0, educ=1, sex=0: $\hat\beta_0 = -1572$ gr, not meaningful
• gest=280, educ=1, sex=0: $\hat\beta_0 + \hat\beta_1 \cdot 280 = 3426$ gr

Margins:
    margins, at(gest=0 educ=1 sex=0)       // = -1572, not meaningful
    margins, at(gest=280 educ=1 sex=0)     // = 3426
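An alternative route to a meaningful constant is to center gestational age at 280 days before fitting. This is a sketch on simulated data; the centered variable gest280 is a name introduced here for illustration and is not part of the slides. The constant of the centered model should match the margins result at gest=280, educ=1, sex=0.

```stata
* Hypothetical synthetic data, not the course dataset
clear
set seed 12345
set obs 500
generate gest = round(rnormal(280, 10))
generate sex  = runiform() < 0.5
generate educ = 1 + (runiform() > 0.3) + (runiform() > 0.7)
generate bw   = -1500 + 18*gest + 70*(educ==2) + 100*(educ==3) - 150*sex + rnormal(0, 400)

* Expected birth weight at the reference point via margins
regress bw gest i.educ sex
margins, at(gest=280 educ=1 sex=0)
matrix M = r(b)

* Same quantity as the constant of a model with gest centered at 280
generate gest280 = gest - 280
regress bw gest280 i.educ sex
display "constant after centering = " _b[_cons] "   margins = " M[1,1]
assert reldif(_b[_cons], M[1,1]) < 1e-6
```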

Results so far

                         coeff     95% conf. int.
Birth weight at ref      3426      (3385, 3467)
Gestational age
  per day                17.9      (16, 20)
Education
  Low                    0
  Medium                 71.5      (25, 118)
  High                   99.1      (51, 148)
Sex
  Boy                    0
  Girl                   -154.3    (-187, -121)

Would normally check for interaction now!

ASSUMPTIONS

Test of assumptions

• Assumptions
  – Independent residuals: discuss
  – Linear effects: plot residuals versus predicted y
  – Constant variance: estat hettest (p=0.9, no heteroskedasticity)

    predict res, residuals       // residuals
    predict pred, xb             // linear prediction
    scatter res pred             // residuals versus predicted y

[Figure: scatter plot of residuals against linear prediction.]
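The two checks can be run together after a fitted model. This is a minimal sketch on simulated data with constant error variance by construction (an assumption of the example, not a property of the course dataset). `rvfplot` is Stata's built-in residual-versus-fitted plot and is used here in place of the manual predict and scatter steps.

```stata
* Hypothetical synthetic data, not the course dataset
clear
set seed 12345
set obs 500
generate gest = round(rnormal(280, 10))
generate sex  = runiform() < 0.5
generate bw   = -1500 + 18*gest - 150*sex + rnormal(0, 400)

regress bw gest sex

* Breusch-Pagan / Cook-Weisberg test; a large p-value gives no evidence of heteroskedasticity
estat hettest
display "hettest p-value = " r(p)

* Residuals versus fitted values; look for curvature (non-linearity) or a fan shape (non-constant variance)
rvfplot, yline(0)
```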

Violations of assumptions

• Dependent residuals
  Use mixed models or GEE
• Non-linear effects
  Add square term or spline
• Non-constant variance
  Use robust variance estimation

[Figures: residual plots illustrating violations: residuals against gest, and residuals (res) against predicted values (p).]
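The three remedies might look like this in Stata. This is a sketch under stated assumptions: the data are simulated, and the cluster variable mother (two births per mother) is hypothetical and introduced only to give the mixed model something to group on. `mixed` requires Stata 13 or later (`xtmixed` in earlier versions).

```stata
* Hypothetical synthetic data with two births per (hypothetical) mother
clear
set seed 12345
set obs 500
generate mother = ceil(_n/2)
generate u = rnormal(0, 200) if mod(_n, 2) == 1
replace  u = u[_n-1] if missing(u)          // siblings share the mother effect
generate gest = round(rnormal(280, 10))
generate sex  = runiform() < 0.5
generate bw   = -1500 + 18*gest - 150*sex + u + rnormal(0, 400)

* Non-constant variance: robust (sandwich) variance estimation
regress bw gest sex, vce(robust)

* Non-linear effect: add a square term for gest
regress bw c.gest##c.gest sex

* Dependent residuals: mixed model with a random intercept per mother
mixed bw gest sex || mother:
```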

ROBUSTNESS

Measures of influence

Influence idea

[Figure: scatter plot of birth weight (gr) against gestational age (days) with one outlier, showing the regression line with the outlier and the regression line without the outlier.]

Measures of influence

• Measure change in:
  – Predicted outcome
  – Deviance
  – Coefficients (beta)
• Delta beta
  – Remove obs 1, see change
  – Remove obs 2, see change

[Figure: influence plotted against observation id.]
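The "remove one observation, see change" idea can be done by hand for a few observations. This is a sketch on simulated data, refitting the model without each of the first three observations. Note that Stata's own `dfbeta` reports this change scaled by a standard error, so the raw differences printed here are not on the same scale.

```stata
* Hypothetical synthetic data, not the course dataset
clear
set seed 12345
set obs 200
generate id   = _n
generate gest = round(rnormal(280, 10))
generate bw   = -1500 + 18*gest + rnormal(0, 400)

* Coefficient from the full data
regress bw gest
scalar b_full = _b[gest]

* Leave out one observation at a time and record the change in the coefficient
forvalues i = 1/3 {
    quietly regress bw gest if id != `i'
    display "obs `i' removed: change in b[gest] = " %9.4f (_b[gest] - scalar(b_full))
}
```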

Leverage versus residuals squared

    lvr2plot, mlabel(id)

[Figure: leverage against normalized residual squared, points labeled by id (1 to 5000); observations with high influence are marked.]

Delta-beta for gestational age

    dfbeta gest                  // creates _dfbeta_1
    scatter _dfbeta_1 id

Note: delta-beta is variable specific

beta(gest) = 17.9. If obs nr 370 is removed, beta will change from 17.9 to 18.6

[Figure: Dfbeta for gest against id, with observations 370 and 4962 labeled.]

Removing outlier

    regress bw gest i.educ sex if id!=370    // refit without obs 370
    est store m4
    est table m3 m4, b(%8.1f)

Removing outlier

                         Full model (N=5000)          Outlier removed (N=4999)
                         coeff     95% conf. int.     coeff     95% conf. int.
Birth weight at ref      3426      (3385, 3467)       3433      (3391, 3474)
Gestational age
  per day                17.9      (16, 20)           18.5      (17, 20)
Education
  Low                    0                            0
  Medium                 71.5      (25, 118)          64.2      (18, 110)
  High                   99.1      (51, 148)          88.6      (40, 137)
Sex
  Boy                    0                            0
  Girl                   -154.3    (-187, -121)       -152.7    (-185, -120)

One outlier affected several estimates

Final model


Help

• Linear regression
  – help regress
    • syntax and options
  – help regress postestimation
    • dfbeta
    • estat hettest
    • lvr2plot
    • predict
    • margins


NON-LINEAR EFFECTS
bw2


bw2: Non-linear effects


[Figure: birth weight in grams (y-axis, 1000 to 6000) versus gestational age (x-axis, 240 to 320)]

Handle: add polynomial or spline


Non-linear effects: polynomial


regress bw2 c.gest##c.gest i.educ sex    // 2nd-order polynomial in gest

margins, at(gest=(250(10)310))           // predicted bw2 by gest
marginsplot                              // plot

[Figure: Predictive Margins with 95% CIs. Linear prediction (y-axis, 2500 to 4000) versus gestational age (x-axis, 250 to 310)]


Non-linear effects: spline

• Cubic spline

mkspline g=gest, cubic nknots(4)         // make spline with 4 knots
regress bw2 g1 g2 g3 i.educ sex          // regression with spline

• Plot

gen igest=5*round(gest/5)                // gest rounded to the nearest 5 days
margins, over(igest)                     // predicted bw2 by gest
marginsplot

• Linear spline

// if g1-g3 already exist from the cubic spline, drop them first
mkspline g1 280 g2=gest                  // make linear spline with knot at 280
regress bw2 g1 g2 i.educ sex             // regression with spline

[Figure: Predictive Margins with 95% CIs. Linear prediction (y-axis, 2500 to 4000) versus igest (x-axis, 250 to 310)]
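It may help to see what a linear spline with one knot actually puts into the model. The following is a minimal sketch, assuming Stata's built-in auto data instead of the birth weight data; the knot at 3000 and the names w1, w2, w1_manual and w2_manual are arbitrary choices for the illustration. Without the marginal option, the two coefficients are the slopes below and above the knot.

```stata
* Sketch on Stata's built-in auto data (not the birth weight data)
sysuse auto, clear
mkspline w1 3000 w2 = weight             // linear spline, knot at 3000 (arbitrary)
gen w1_manual = min(weight, 3000)        // part of weight up to the knot
gen w2_manual = max(weight, 3000) - 3000 // part of weight above the knot
assert w1 == w1_manual & w2 == w2_manual // mkspline built the same two variables
regress price w1 w2                      // _b[w1], _b[w2]: slope below / above knot
test w1 = w2                             // equal slopes would mean a straight line
```

If the test gives no evidence against equal slopes, a single linear term may be enough.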


INTERACTION
bw3


Interaction definitions

• Interaction: combined effect of two variables

• Scale
  – Linear models: additive
    • y=b0+b1x1+b2x2, effect of both x1 and x2 = b1+b2
  – Logistic, Poisson, Cox: multiplicative

• Interaction
  – deviation from additivity (multiplicativity)
    ⇔
  – effect of x1 depends on x2
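One way to make "deviation from additivity" concrete on the linear scale is to add a product term to the model above. The coefficient b_3 does not appear on the slides; it is introduced here as an assumption, in the same notation:

```latex
\begin{align*}
y &= b_0 + b_1 x_1 + b_2 x_2 + b_3 x_1 x_2 + e \\
\text{effect of } x_1 \text{ at a given } x_2 &= b_1 + b_3 x_2 \\
\text{effect of both } x_1 = 1 \text{ and } x_2 = 1 &= b_1 + b_2 + b_3
\end{align*}
```

With b_3 = 0 the combined effect is b_1 + b_2 (additive), and the effect of x_1 no longer depends on x_2.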


bw3: Interaction (only linear effects)

• Add interaction terms

regress bw3 c.gest##i.sex i.educ         // gest-sex interaction

• Show results

margins, dydx(gest) at(sex=0)            // effect of gest for boys
margins, dydx(gest) at(sex=1)            // effect of gest for girls
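The same pattern can be tried without the course data. The following is a minimal sketch, assuming Stata's built-in auto data, with weight in the role of gest and foreign in the role of sex; it also shows that the two slopes reported by margins can be read directly from the coefficients with lincom.

```stata
* Sketch on Stata's built-in auto data (not the birth weight data)
sysuse auto, clear
regress price c.weight##i.foreign        // weight-foreign interaction
margins, dydx(weight) at(foreign=0)      // slope of weight, domestic cars
margins, dydx(weight) at(foreign=1)      // slope of weight, foreign cars
* The same slopes from the coefficients:
lincom weight                            // slope when foreign==0
lincom weight + 1.foreign#c.weight       // slope when foreign==1
```

The interaction coefficient 1.foreign#c.weight is the difference between the two slopes, so its test in the regress output is the test of interaction.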


Summing up 1

• Build model
  – regress bw gest                      // crude model
  – est store m1                         // store
  – regress bw gest i.educ sex           // full model
  – est store m2
  – est table m1 m2                      // compare coefficients

• Interaction
  – regress bw3 c.gest##i.sex i.educ     // test interaction
  – margins, dydx(gest) at(sex=0)        // gest for boys

• Assumptions
  – predict res, residuals               // residuals
  – predict pred, xb                     // predicted
  – scatter res pred                     // plot


Summing up 2

• Non-linearity (linear spline)
  – mkspline g1 280 g2=gest              // spline with knot at 280
  – regress bw2 g1 g2 i.educ sex         // regression with spline

• Robustness
  – dfbeta gest                          // delta-beta
  – scatter _dfbeta_1 id                 // plot versus id
