04/18/23 H.S. 1
Stata Introduction, Short v2
Hein Stigum
Presentation, data and programs at:
http://folk.uio.no/heins/
courses
Stata introduction
• General use
– Interface and menu
– Do-files and syntax
– Data handling
• Analysis
– Descriptive
– Graphs
– Bivariate
Why Stata
• Pro
– Aimed at epidemiology
– Many methods, growing
– Graphics
– Structured, programmable
– Coming soon to a course near you
• Con
– Data are held in memory, so memory must exceed file size
Interface
Interface Stata 9
Interface Stata 12
Dofile
Dataedit
Menu
Do-file example
New do-file: icon or Ctrl-9
Run: mark the lines to run, then Ctrl-D
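As an illustration of what such a do-file might contain, here is a minimal sketch. The file name and the use of the auto example data (shipped with Stata) are assumptions made for illustration; the course data are not used here.

```stata
* example.do (hypothetical file name): a minimal do-file
version 12              // run the commands as Stata 12 would
sysuse auto, clear      // example data shipped with Stata
describe                // describe all variables
summarize price mpg     // basic descriptives for two variables
```

Mark these lines in the do-file editor and press Ctrl-D to run them.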
Syntax
• Syntax
[bysort varlist:] command [varlist] [if exp] [in range] [, opts]
• Examples
– mean age
– mean age if sex==1
– bysort sex: summarize age
– summarize age, detail
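The syntax elements above can be tried one at a time. The sketch below assumes the auto example data shipped with Stata, with price and foreign standing in for the age and sex variables of the slides.

```stata
sysuse auto, clear                  // example data shipped with Stata
mean price                          // command varlist
mean price if foreign==1            // restrict with: if exp
bysort foreign: summarize price     // repeat the command per group
summarize price, detail             // options follow the comma
list make price in 1/5              // restrict with: in range
```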
Data handling
Import data
• Using SPSS 14.0-17.0
– Save as, Stata Version 8 SE
Use and save data
• Open data
– use "C:\Course\Myfile.dta", clear
• Describe
– describe    // describe all variables
– list x1 x2 in 1/20    // list obs nr 1 to 20
• Save data
– save "C:\Course\Myfile.dta", replace
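A self-contained round trip of the commands above might look like this. The file name auto_copy.dta is hypothetical and is written to the current working directory; the auto data stand in for the course file.

```stata
sysuse auto, clear                 // example data shipped with Stata
save "auto_copy.dta", replace      // hypothetical file name, overwritten if it exists
use "auto_copy.dta", clear         // open it again, dropping data in memory
describe                           // describe all variables
list make price in 1/5             // list the first 5 observations
```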
Use data from web
• webuse "file"    // use data from Stata homepage
1. webuse set "http://www.med.uio.no/forskning/doktorgrad-karriere/forskerutdanning/kurs/biostatistikk/mf9510-logistisk-regresjon-overlevelsesanalyse-cox/"    // set homepage
2. webuse "birth1"    // data for exercise 1
Generate, replace
• Index
– generate index=0
– replace index=1 if sex==1 & age<30
• Young/Old
– generate old=(age>50) if age<.
• Serial numbers, lags
– generate id=_n
– generate age1=age[_n-1]
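The same patterns can be sketched on the auto example data shipped with Stata. The variable names index, heavy, id and prevprice are made up for illustration.

```stata
sysuse auto, clear
generate index = 0
replace index = 1 if foreign==1 & mpg<30       // index for a subgroup
generate heavy = (weight>3000) if weight<.     // 0/1 indicator, left missing if weight is missing
generate id = _n                               // serial number of the observation
generate prevprice = price[_n-1]               // lag: value from the previous observation
list id make price prevprice heavy in 1/5
```

The first observation has no predecessor, so prevprice is missing there.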
Dates
• From numeric to date
ex: m=12, d=2, y=1987
generate birth=mdy(m,d,y)
format birth %td
• From string to date
ex: bstr="01.12.1987"
generate birth=date(bstr,"DMY")
format birth %td
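Both conversions can be checked against each other on a one-row dataset. This is only a sketch: the values are typed in by hand, and birth1 and birth2 are hypothetical variable names.

```stata
clear
input m d y str10 bstr
12 2 1987 "02.12.1987"
end
generate birth1 = mdy(m, d, y)          // from numeric month, day, year
generate birth2 = date(bstr, "DMY")     // from a day.month.year string
format birth1 birth2 %td                // display as dates
list
```

Both variables hold the same date, stored as the number of days since 01jan1960.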
Missing
• Note!
– Represented as "."
– Missing values are large numbers
– age>30 will include missing.
– age>30 if age<. will not.
• Test
– replace age=0 if (age==.)
• Remove
– drop if age==.
• Change
– replace educ=. if educ==99
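The effect of missing values on comparisons can be seen in the auto example data shipped with Stata, where rep78 has some missing values. A small sketch, with goodrep as a hypothetical variable name:

```stata
sysuse auto, clear
count if rep78==.                          // number of missing values
count if rep78>3                           // missing counts as "large", so it is included
count if rep78>3 & rep78<.                 // missing excluded
generate goodrep = (rep78>3) if rep78<.    // indicator stays missing where rep78 is missing
tabulate goodrep, missing
```

The second count exceeds the third by exactly the number of missing values.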
Describe missing
• Summarize variables
summarize id bullied sex
misstable summarize bullied sex    // new command
• Missing in tables
tab bullied sex, missing
Help
• General
– help command
– findit keyword    // search Stata+net
• Examples
– help table
– findit aflogit
Summing up
• Use do-files
– Run: Mark, Ctrl-D
• Syntax
– command [varlist] [if exp] [in range] [, options]
• Missing
– age>30 if age<.
– generate old=(age>50) if age<.
• Help
– help describe
Descriptive
Descriptive
• Continuous
summarize weight
summarize weight, detail    // fractiles ++
• Categorical
tabulate bullied
tabulate bullied, nolab    // show coding
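A runnable version of these descriptives, sketched on the auto example data shipped with Stata (price as the continuous variable, foreign as the categorical one):

```stata
sysuse auto, clear
summarize price                      // N, mean, SD, min, max
summarize price, detail              // adds percentiles, skewness, kurtosis
display "median price = " r(p50)     // results are stored in r()
tabulate foreign                     // frequencies with value labels
tabulate foreign, nolabel            // same table, showing the numeric coding
```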
Other descriptives
tabstat mAge, stat( N min p50 mean max) by(parity)
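The same kind of table can be sketched on the auto example data shipped with Stata, with mpg and foreign standing in for mAge and parity:

```stata
sysuse auto, clear
* one row per group, one column per requested statistic
tabstat mpg, stat(n min p50 mean max) by(foreign)
```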
Graphics
Twoway plots
• Syntax
– twoway (plot1, opts) (plot2, opts), opts
• One plot
– kdensity bw
– scatter bw gest
[Figures: kernel density estimate of birth weight (kernel = epanechnikov, bandwidth = 102.3251); scatter plot of birth weight against gestational age]
Weight distribution by sex
[Figure: kernel densities of weight (gram), one line per sex]
twoway ( kdensity bw if sex==1, lcolor(blue) ) ///
       ( kdensity bw if sex==2, lcolor(red) )
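An overlaid density plot of this kind can be sketched on the auto example data shipped with Stata. Here mpg and foreign stand in for bw and sex, and the graph name dens and the title and legend text are made up for illustration.

```stata
sysuse auto, clear
twoway ( kdensity mpg if foreign==0, lcolor(blue) ) ///
       ( kdensity mpg if foreign==1, lcolor(red) ), ///
       title("MPG distribution by origin") ///
       xtitle("Miles per gallon") ///
       legend(order(1 "Domestic" 2 "Foreign")) ///
       name(dens, replace)
```

Options for a single plot go inside its parentheses; options after the final comma apply to the whole graph.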
twoway (scatter bw gest) (fpfitci bw gest) (lfit bw gest)
[Figure: Weight by gestational age; y-axis: gram, x-axis: days; legend: scatter, smooth with CI, line fit]
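The three-layer plot can be reproduced in outline on the auto example data shipped with Stata, with mpg and weight standing in for bw and gest. The graph name fit is hypothetical.

```stata
sysuse auto, clear
twoway (scatter mpg weight)  /// raw data points
       (fpfitci mpg weight)  /// fractional-polynomial smooth with CI
       (lfit mpg weight),    /// straight-line fit
       name(fit, replace)
```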
Titles
[Figure: scatter plot showing where title, subtitle, xtitle, ytitle and note are placed]
scatter bw gest, title("title") subtitle("subtitle") ///
    xtitle("xtitle") ytitle("ytitle") note("note")
Bivariate analysis
2 independent samples
[Figure: kernel densities of birth weight by sex]
twoway ( kdensity weight if sex==1, lcolor(blue) ) ///
( kdensity weight if sex==2, lcolor(red) )
Equal means?
Equal variance?
Do boys and girls have the same mean birth weight?
2 independent samples test
ttest weight, by(sex)    // 2-sample t-test
ttest weight, by(sex) unequal    // 2-sample t-test, unequal variances
ttest w1==w2    // paired t-test
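A two-sample comparison can be sketched on the auto example data shipped with Stata, with mpg and foreign standing in for weight and sex. Using sdtest to look at the equal-variance question is an addition made here for illustration, not a command from the slides.

```stata
sysuse auto, clear
sdtest mpg, by(foreign)              // equal variances?
ttest mpg, by(foreign)               // equal means, assuming equal variances
ttest mpg, by(foreign) unequal       // equal means, allowing unequal variances
display "p-value (unequal variances) = " r(p)
```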
Crosstables
equal proportions?
Are boys bullied as much as girls?
tabulate bullied sex, col chi2 nofreq
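A cross table with a chi-square test can be sketched on the auto example data shipped with Stata. The indicator highmpg and its cut-off of 25 are made up for illustration and stand in for bullied.

```stata
sysuse auto, clear
generate highmpg = (mpg>25) if mpg<.         // hypothetical binary outcome
tabulate highmpg foreign, col chi2 nofreq    // column percentages and chi-square test
display "chi2 p-value = " r(p)
```

With col and nofreq, the table shows only column percentages, so the proportion with highmpg==1 can be compared directly across the two columns.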
Summing up
• Descriptive
summarize weight
tabulate sex
• Graphs
twoway (plot1, opts) (plot2, opts), opts
• Bivariate
ttest weight, by(sex)
tabulate bullied sex, chi2