Everything I wish I had known about research design and data analysis…
Statlab Workshop
Fall 2006
Kyle Hood and Frank Farach
Outline of a paper
  Introduction
  Theory
  Data Description
  Analysis
  Conclusion
Identifying a Question
  Tradeoff between work in and results
    Easy to do, trivial results
    Result is interesting, but difficulty is high
  New tools open up new questions
    New statistical or computational tools make formerly difficult questions approachable
  New theory opens up new questions
Introduction
  Topic
    Most general level
  Question
    What is the question you want to answer?
    Be specific
    Ask only what you can answer
  Review the Literature
  “Stay the course”
Theory
  Categorize your theory
    Descriptive vs. causal
  Write down your theory
    In paragraph form
    Using a statistical model
  Identify testable hypotheses from theory
  Do you need statistics after all?
    Quantitative v. Qualitative research
Variables
  Dependent Variable (response, outcome, criterion)
  Independent Variables (explanatory or predictor variables)
    Treatment Variable
    Covariates / Confounding Variables
  Categorical and Continuous Variables
  Remember: the types of variables we choose determine the statistics we use
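
To make the last point concrete, here is a minimal sketch of how variable types map onto the analyses named later in this workshop. The function name `suggest_analysis` and the lookup table are illustrative assumptions, not part of the original slides, and the table covers only the methods the slides mention.

```python
def suggest_analysis(independent, dependent):
    """Suggest an analysis family from the types of the two variables.

    Both arguments must be "categorical" or "continuous". This is a
    hypothetical helper that covers only the methods named in the slides.
    """
    table = {
        ("continuous", "continuous"): "correlation or regression",
        ("categorical", "continuous"): "t-test (two groups) or ANOVA (more than two)",
        ("categorical", "categorical"): "chi-square test on a cross-tab",
    }
    key = (independent, dependent)
    if key not in table:
        # Other combinations exist but are outside the scope of this sketch.
        raise ValueError(f"combination {key} is not covered by this sketch")
    return table[key]


print(suggest_analysis("categorical", "continuous"))
```

Deciding which row applies before collecting data is one way to act on "think about analyses early".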
You need Data
  Think about analyses early!
  Collecting your own data
    Retrospective, prospective, experimental & observational methods
  Can find most data you’ll need on-line!
    Statlab Webpage (http://statlab.stat.yale.edu)
    Advisors
    Yale StatCat (http://ssrs.yale.edu/statcat/)
    ICPSR (http://www.icpsr.umich.edu)
    Reference Librarian (Julie Linden)
Endogeneity
  Problem: “Independent” variables are not really independent
    The “dependent” and “independent” variables are determined in equilibrium (example: effect of education on wages)
  Treatment effects will be biased
  Modeling approaches to deal with this
    Assumption-based methods
    Instruments
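
The bias, and how an instrument can remove it, can be illustrated with a small simulation. This is a sketch under stated assumptions rather than a recipe: the data are simulated, the unobserved "ability" confounder, the instrument, and every coefficient are invented for illustration, and the estimator shown is the simple one-instrument ratio cov(z, y) / cov(z, x).

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


random.seed(0)        # fixed seed so the run is repeatable
TRUE_EFFECT = 1.0     # assumed causal effect of education on wages
n = 20000

# Unobserved confounder: raises both education and wages.
ability = [random.gauss(0, 1) for _ in range(n)]
# Instrument: shifts education but has no direct path to wages.
instrument = [random.gauss(0, 1) for _ in range(n)]

education = [z + a + random.gauss(0, 1) for z, a in zip(instrument, ability)]
wages = [TRUE_EFFECT * x + 2.0 * a + random.gauss(0, 1)
         for x, a in zip(education, ability)]

# Naive regression slope: biased because education is correlated with ability.
ols_slope = cov(education, wages) / cov(education, education)
# Instrumental-variable slope: uses only the variation driven by the instrument.
iv_slope = cov(instrument, wages) / cov(instrument, education)

print(f"true effect {TRUE_EFFECT:.2f}, naive slope {ols_slope:.2f}, IV slope {iv_slope:.2f}")
```

In this setup the naive slope overstates the effect because ability is left out, while the instrument-based slope lands near the assumed true value. In real data the hard part is arguing that the instrument affects the outcome only through the treatment.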
So, you want to make a survey
  Extensive on-line resources and software
  Question types determine analyses
    Open- vs. closed-ended questions, Likert scales, rank-order data
    Assumptions of normality
  Validity
    Internal & External validity
  Pilot testing
    You need variance to analyze!
  Sample size
    It depends: power, effect size, cost (UCLA power calculator)
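
As a rough illustration of how power and effect size drive sample size, the sketch below uses the standard normal-approximation formula for comparing two group means, n per group = 2 * ((z_alpha + z_power) / d)^2, where d is the standardized effect size. The function name `n_per_group` is hypothetical, and this is not the method of the UCLA calculator mentioned above; exact t-based calculators typically report slightly larger numbers.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-sample comparison.

    effect_size is the standardized mean difference (Cohen's d).
    Uses the normal approximation, so treat the result as a lower bound.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


for d in (0.2, 0.5, 0.8):
    print(f"effect size {d}: about {n_per_group(d)} per group")
```

Halving the effect size roughly quadruples the required sample, which is why cost enters the decision.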
Once You’ve Found or Collected your data
  Download the data and documentation
    StatTransfer (Statlab)
  Determine data file type
    Probably a text file (.txt, .dat, .raw)
    Converting text & delimited files
  Choose a statistical software program
    SPSS, Stata, SAS, Matlab, Excel, R, C++
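
Converting a delimited text file is usually a matter of reading it with the right delimiter and writing it back out in the format your software expects. The sketch below uses Python's standard `csv` module on a small in-memory example; the column names and values are made up, and a real file would be opened with `open(path, newline="")` instead of `io.StringIO`.

```python
import csv
import io

# Invented tab-delimited content standing in for a downloaded .txt or .dat file.
raw_text = "id\tage\tincome\n1\t34\t52000\n2\t29\t\n3\t41\t61000\n"


def read_delimited(text, delimiter="\t"):
    """Parse delimited text into a list of dicts keyed by the header row."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return list(reader)


def to_csv(rows):
    """Write the rows back out as comma-separated text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


rows = read_delimited(raw_text)
csv_text = to_csv(rows)
print(csv_text)
```

Note that the empty income for respondent 2 comes through as an empty string; check the documentation for how missing values are coded before analyzing.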
Managing your data
  Back up all Master Data Files
    CDR/CDRW, USB Key
  Codebook
    All codes
    Adding variables, cases, computing new variables
  Keep a roadmap
    Keep a log of all analyses with what you have done
  Save syntax files
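
One possible way to keep the codebook and the log in step with the data is to update both every time a variable is computed. The sketch below is an assumption about how that might look in a script: the variable names, codes, and the helper `add_variable` are invented for illustration.

```python
from datetime import date

# Codebook: one entry per variable, with a label and the meaning of each code.
codebook = {
    "sex": {"label": "Respondent sex", "codes": {1: "male", 2: "female", 9: "missing"}},
    "income": {"label": "Annual income in dollars", "codes": {-1: "missing"}},
}
analysis_log = []  # running record of what was done, in order


def add_variable(data, name, label, compute, codes=None):
    """Compute a new variable for every case and document it in one step."""
    for row in data:
        row[name] = compute(row)
    codebook[name] = {"label": label, "codes": codes or {}}
    analysis_log.append(f"{date.today().isoformat()}: computed '{name}' ({label})")


def income_in_thousands(row):
    """Rescale income, turning the missing code (-1) into None."""
    if row["income"] == -1:
        return None
    return row["income"] / 1000


data = [{"sex": 1, "income": 52000}, {"sex": 2, "income": -1}, {"sex": 2, "income": 61000}]
add_variable(data, "income_k", "Annual income in thousands of dollars", income_in_thousands)

print(codebook["income_k"])
print(analysis_log)
```

Because the data change, the codebook entry, and the log line are produced by the same call, none of the three can be forgotten.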
Syntax Files
  What are they?
    Text-files used to enter commands in bulk
  Is it worth learning?
    You will make mistakes, need to make changes
    SPSS and many other programs let you use pull down menus
  How do I know what to write?
    Program’s manual provides the underlying command
So, how do I analyze my data?
  Correlation
    Correlation allows you to quantify relationships between variables (r, r-squared)
    Regression allows prediction of dependent variable based on one or more independent variables
  Group differences
    t-test & ANOVA
    Chi-square for categorical and frequency data
  Significance v. effect size
  More Complex Models
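
The quantities above can be computed by hand on a toy example. This sketch uses only the Python standard library and invented numbers; the function names are illustrative. It reports r and r-squared, a one-predictor regression line, a Welch t statistic for two groups, and Cohen's d as the matching effect size. Turning the t statistic into a p-value requires a t distribution, which a statistics package would supply.

```python
import math
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simple_regression(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx


def welch_t(g1, g2):
    """t statistic for a difference in means without assuming equal variances."""
    se = math.sqrt(stdev(g1) ** 2 / len(g1) + stdev(g2) ** 2 / len(g2))
    return (mean(g1) - mean(g2)) / se


def cohens_d(g1, g2):
    """Standardized mean difference using the pooled standard deviation."""
    n1, n2 = len(g1), len(g2)
    pooled = math.sqrt(((n1 - 1) * stdev(g1) ** 2 + (n2 - 1) * stdev(g2) ** 2) / (n1 + n2 - 2))
    return (mean(g1) - mean(g2)) / pooled


# Invented data: hours studied vs. score, and two small groups.
hours, score = [1, 2, 3, 4, 5], [2, 4, 5, 4, 5]
treated, control = [5, 6, 7, 8], [3, 4, 5, 6]

r = pearson_r(hours, score)
slope, intercept = simple_regression(hours, score)
t_stat = welch_t(treated, control)
d = cohens_d(treated, control)

print(f"r = {r:.3f}, r-squared = {r * r:.3f}")
print(f"predicted score = {intercept:.2f} + {slope:.2f} * hours")
print(f"t = {t_stat:.3f}, Cohen's d = {d:.3f}")
```

The t statistic grows with sample size while Cohen's d does not, which is the distinction behind "significance v. effect size".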
Descriptive Statistics
  Variables
    Dependent Variable(s)
    Independent Variable(s)
    Important Covariates
  Graphs
  Summary Statistics on Key Variables
    Number, Mean, Minimum, Maximum, Standard Deviation
  Cross-Tabs
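
A minimal sketch of the summary statistics and a cross-tab, using only the Python standard library. The records and variable names are invented for illustration; any statistics package produces the same table with a single command.

```python
from collections import Counter
from statistics import mean, stdev


def summarize(values):
    """Number, mean, minimum, maximum, and sample standard deviation."""
    return {
        "n": len(values),
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
        "sd": stdev(values),
    }


def crosstab(rows, row_var, col_var):
    """Count the cases in each combination of two categorical variables."""
    return Counter((r[row_var], r[col_var]) for r in rows)


# Invented records standing in for a real data file.
records = [
    {"group": "treated", "improved": "yes", "score": 6},
    {"group": "treated", "improved": "yes", "score": 4},
    {"group": "control", "improved": "no", "score": 2},
    {"group": "control", "improved": "yes", "score": 4},
]

summary = summarize([r["score"] for r in records])
table = crosstab(records, "group", "improved")

print(summary)
for (group, improved), count in sorted(table.items()):
    print(f"{group:8s} improved={improved:3s} {count}")
```

Combinations that never occur (here, treated cases that did not improve) are simply absent from the counter and read as zero when looked up.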
Putting Output into a Paper
  Cut and Paste
  Graphs
    Cut and Paste into Word Processing document
    Save as .jpeg or .tif file
  Tables
    Cut and Paste
    Format in Word Processing document
    Import into Excel, format, and then place in Word
More Advanced Analysis
  Multivariate techniques help to account for confounding factors, allow for testing change over time and more complex hypotheses…
  (See: Tabachnick & Fidell, Using Multivariate Statistics)
  1) Be honest about your abilities.
  2) Ask for help.
  3) You are best off including only techniques that you fully understand.
Take Away Messages
  1) Determine your question, methods and statistics before you start
  2) Keep a codebook of everything
  3) Keep a log of all commands issued
  4) Save data at every step
  5) Ask for help
  6) Don’t get in over your head