2014 Nordic and Baltic Stata Users Group Meeting: Working sideways in Stata. Jakob Hjort, Data Manager, MPH, Department of Cardiology, Aarhus University Hospital, DK-8200 Aarhus, Denmark


Page 1

2014 Nordic and Baltic Stata Users Group Meeting

Working sideways in Stata

Jakob Hjort
Data Manager, MPH

Department of Cardiology
Aarhus University Hospital

DK-8200 Aarhus
Denmark

Page 2

The rectangular dataset


Page 4

Statistics

The rectangular dataset

"It is not the data we want, it's the essence of data"

results

Page 5

Data management

The rectangular dataset


Page 7

The rectangular dataset

Data management
Statistics

Page 8

Data management
Statistics

The rectangular dataset - transpose?

Page 9

    use "family.dta", clear
    * Dataset with: fam_name, inc_mother & inc_father

    mata
        st_view(x=0,.,("inc_mother","inc_father"))
        income=colsum(x')'
        st_addvar("long","inc_household")
        st_store(.,"inc_household",income)
    end

    list fam_name inc_mother inc_father inc_household

The rectangular dataset – subset in matrix using mata?
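A minimal sketch of the same idea on toy data. The three families and their incomes are made up for illustration and stand in for family.dta. Mata's rowsum() is used here in place of colsum(x')', which should give the same row totals without the two transposes.

```stata
* Toy data standing in for family.dta (values are invented for illustration)
clear
input str10 fam_name long inc_mother long inc_father
"Hansen"  300000 280000
"Jensen"  250000 310000
"Nielsen" 410000      0
end

mata:
// View onto the two income columns of the Stata dataset
st_view(x = ., ., ("inc_mother", "inc_father"))
// Row totals, computed before the new variable is added
income = rowsum(x)
// Add the target variable and store the totals in it
idx = st_addvar("long", "inc_household")
st_store(., idx, income)
end

* Cross-check against a plain generate
generate long check = inc_mother + inc_father
assert inc_household == check
list fam_name inc_mother inc_father inc_household
```

For a simple sum like this, the direct approaches on the following slides are shorter. The Mata route mainly pays off when the subset really is needed as a matrix.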


Page 11

generate [type] newvar=exp [if] [in]

The direct approach

Data management example, computing BMI from Weight and Height:

    generate BMI=Weight/Height^2
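A small runnable sketch of the BMI example. The three observations are invented, and the units (weight in kilograms, height in metres) are an assumption, since the slide does not state them.

```stata
* Invented observations; units assumed to be kg and metres
clear
input double Weight double Height
70 1.75
90 1.80
60    .
end

* generate works row by row: one BMI value per observation
generate double BMI = Weight/Height^2

* A missing Height propagates to a missing BMI
assert missing(BMI) if missing(Height)
list
```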


Page 13

egen [type] newvar=fcn(arguments) [if] [in] [,options]

Functions: rowtotal, rowmin, rowmax, rowfirst, rowlast, rowmean, rowmedian, rowmiss, rownonmiss, rowpctile, rowsd, concat, anycount, anymatch, anyvalue, count, diff, fill, group, iqr, kurt, max, mdev, mean, median, min, mode, mtr, pc, pctile, rank, sd, seq, skew, std, tag, total

The direct approach

Data management example, summing the monthly variables incJan, incFeb, incMar, incApr, incMay, incJun, incJul, … into income:

    egen income=rowtotal(inc*)
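A short sketch of rowtotal() on invented data with only three months, showing how missing values are handled. By default rowtotal() treats missing values as 0; with the missing option, a row where every variable is missing gets a missing total.

```stata
* Invented monthly incomes, including missing values
clear
input incJan incFeb incMar
100 200 300
150   . 250
  .   .   .
end

* Default: missing values count as 0
egen income = rowtotal(inc*)

* missing option: all-missing rows get a missing total.
* The range incJan-incMar is used here because inc* would now
* also match the newly created variable income.
egen income_m = rowtotal(incJan-incMar), missing

list
```

Note the naming trap: once income exists, the wildcard inc* also picks it up, so a second run of the same egen command would add the old total to the new one.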


Page 15

Looking under the hood – just for inspiration

viewsource _growmin.ado shows the source of the rowmin() function of egen:

    program define _growmin
        version 6, missing
        gettoken type 0 : 0
        gettoken g 0 : 0
        gettoken eqs 0 : 0

        syntax varlist [if] [in] [, BY(string)]
        if `"`by'"' != "" {
            _egennoby rowmin() `"`by'"'
        }

        tempvar touse
        mark `touse' `if' `in'
        quietly {
            * 1. Initialize target variable
            gen `type' `g' = .
            * 2. Prepare the variable-list
            tokenize `varlist'
            * 3. Looping
            while "`1'"!="" {
                * 4. In-the-loop-commands
                replace `g' = cond(`1' < `g',`1',`g')
                mac shift
            }
        }
    end

The four steps:

1. Initialize target variable
2. Prepare the variable-list
3. Looping
4. In-the-loop-commands

Page 16

Prepare the variable-list

Three ways to build the list:

    local vars incJan incFeb incMar incApr incMay incJun ///
        incJul incAug incSep incOct incNov incDec

Full specification of each and every variable. OK with 12, but what in case of hundreds? The list is stored in `vars'.

    unab vars: inc*
    unab vars: incJan-incDec

Variables can be specified with wildcards or ranges. The expanded list is stored in `vars'. (unab means unabbreviate; the command itself, however, cannot be written out in full.)

    ds inc*
    ds incJan-incDec

Variables can be specified with wildcards or ranges. The list is stored in `r(varlist)'. Nice feature: the expanded list is shown for inspection:

    incJan incFeb incMar incApr incMay incJun incJul incAug incSep incOct incNov incDec
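A quick sketch, on a made-up one-row dataset with three month variables and one unrelated variable, suggesting that unab and ds arrive at the same list. The names other and dsvars are hypothetical and only serve the illustration.

```stata
* Made-up dataset: three inc* variables plus one that must not match
clear
set obs 1
foreach m in Jan Feb Mar {
    generate inc`m' = 1
}
generate other = 0

* unab: expanded list goes straight into the local macro vars
unab vars: inc*

* ds: list is shown on screen and returned in r(varlist);
* copy it at once, before another command overwrites r()
ds inc*
local dsvars `r(varlist)'

assert "`vars'" == "`dsvars'"
display "`vars'"
```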

Page 17

Looping

"foreach" is the quickest and the most transparent loop command

    foreach lvar in incJan incFeb {
        // do stuff with "`lvar'"
    }

    unab lvar: inc*
    foreach lvar in `lvar' {
        // do stuff with "`lvar'"
    }

    ds inc*
    foreach lvar in `r(varlist)' {
        // do stuff with "`lvar'"
    }
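As a possible fourth variant, foreach can expand the wildcard itself through its "of varlist" form, so no separate unab or ds step is needed. A minimal sketch on invented data, where the counter n is a hypothetical helper used only to show that the loop ran once per variable:

```stata
* Invented data: three monthly variables, three observations
clear
set obs 3
generate incJan = 10*_n
generate incFeb = 20*_n
generate incMar = 30*_n

local n = 0
foreach lvar of varlist inc* {
    * "do stuff" with the current variable: here, show its mean
    quietly summarize `lvar'
    display "`lvar': mean = " r(mean)
    local ++n
}
display "looped over `n' variables"
```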

Page 18

Looping

Macro references such as `lvar' open with a left single-quote and close with a right single-quote:

Left single-quote (`): hold Alt and press 0 9 6 on the numeric keypad.
Right single-quote ('): hold Alt and press 0 3 9 on the numeric keypad.

Page 19

In the loop

Variant 1, with cond():

    generate minimum=.
    unab vars: inc*
    foreach lvar in `vars' {
        replace minimum = cond(`lvar' < minimum,`lvar',minimum)
    }

Variant 2, with the if qualifier:

    generate minimum=.
    unab vars: inc*
    foreach lvar in `vars' {
        replace minimum = `lvar' if `lvar'<minimum
    }

Variant 3, with the if command:

    generate minimum=.
    unab vars: inc*
    foreach lvar in `vars' {
        if `lvar'<minimum {
            replace minimum = `lvar'
        }
    }

! The slide's warning mark presumably belongs to variant 3: the if command evaluates its expression once, using the first observation only, and not row by row.
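The difference between the if qualifier and the if command can be checked on a tiny invented dataset. In this sketch the variable names min_ifcmd and check are hypothetical, and egen's rowmin() serves as the reference.

```stata
* Invented data: the row minimum is 1 in both rows
clear
input incJan incFeb incMar
3 1 2
1 5 4
end

unab vars: incJan-incMar

* if qualifier: the condition is evaluated for every observation
generate minimum = .
foreach lvar in `vars' {
    replace minimum = `lvar' if `lvar' < minimum
}

* if command: the condition is evaluated once, for observation 1,
* and the replace then hits all rows or none
generate min_ifcmd = .
foreach lvar in `vars' {
    if `lvar' < min_ifcmd {
        replace min_ifcmd = `lvar'
    }
}

* Reference result from egen
egen check = rowmin(incJan-incMar)
assert minimum == check
list
```

In the second row, min_ifcmd ends up holding the value of incFeb rather than the row minimum, because only the first observation decided whether each replace ran.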

Page 20

Some of the Danish participants who know "the DREAM database" will probably be able to see how these approaches can be useful when working with this fantastic but difficult construction.

Page 21

Thank you very much