Day 13: Multiple Regression and Reporting Results
Daniel J. Mallinson
School of Public Affairs
Penn State Harrisburg
mallinson@psu.edu
PADM-HADM 503
Mallinson Day 13 November 16, 2017 1 / 42
Road map
Multiple Regression
Multicollinearity
Autoregression
Things to keep in mind when reporting results
Recap
Bivariate Regression and Correlation
Figure: “U.S. Phillips Curve” by Farcaster, CC BY-SA 3.0
Multiple Regression
Two or more IVs
Principles and interpretations from bivariate regression apply
Multi-dimensional analysis, so scatterplots are not as useful, but the basic notion still applies
Multiple Regression
A generic multiple regression formula:
$$Y = a + b_1 X_1 + b_2 X_2 + b_3 X_3 + \dots + b_n X_n \tag{1}$$
Multiple Regression
Statistics You Should Know About:
R-square (R²)
A measure of association; indicates the amount of variance in the DV explained by the model (IVs)
Adjusted R²
A more conservative version of R²; adjusted for the number of IVs
Unstandardized coefficients (b)
Specific relationships between an individual IV and the DV
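The adjustment for the number of IVs can be made concrete with a short Python sketch. It assumes the standard formula, adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1), where n is the number of cases and k the number of IVs. The function name and the example values are illustrative assumptions, not output from the course data.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R-square for a model with k IVs estimated on n cases."""
    if n - k - 1 <= 0:
        raise ValueError("need more cases than IVs plus one")
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Hypothetical values: the same R-square is penalized more as IVs are added.
print(round(adjusted_r2(0.50, n=50, k=2), 3))
print(round(adjusted_r2(0.50, n=50, k=10), 3))
```

With the number of cases fixed, adding IVs lowers adjusted R² unless they raise R² enough to offset the penalty.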
Multiple Regression
Statistics You Should Know About:
Beta Weights
Standardized beta coefficients; used for comparing variables measured on different scales
F-Ratio
Significance test for the model, recall F from ANOVA
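Both statistics can be computed by hand, which may help when reading the SPSS tables. The sketch below assumes the usual textbook formulas: a beta weight is the unstandardized coefficient rescaled by sd(IV)/sd(DV), and the model F-ratio is (R²/k) / ((1 − R²)/(n − k − 1)). The function names and the sample numbers are hypothetical.

```python
from statistics import stdev
from typing import Sequence


def beta_weight(b: float, x: Sequence[float], y: Sequence[float]) -> float:
    """Standardized coefficient: b rescaled by sd(IV) / sd(DV)."""
    return b * stdev(x) / stdev(y)


def f_ratio(r2: float, n: int, k: int) -> float:
    """Overall F statistic for a model with k IVs and n cases."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


# Made-up data: y is exactly 2 * x, so b = 2 and the beta weight is 1.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]
print(round(beta_weight(2.0, x, y), 3))
print(round(f_ratio(0.50, n=23, k=2), 3))
```

Because beta weights are unit-free, they can be compared across IVs measured on different scales, while the unstandardized coefficients cannot.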
Multiple Regression
Steps in Analysis:
1 Need a theoretical model → prevents “garbage can” models
2 Enter all relevant IVs into analysis
3 Search for violations of regression assumptions (e.g., multicollinearity, autocorrelation)
4 Make careful decisions about variables to remove/add/transform; do not remove variables just because they are not statistically significant
5 Interpret your final (“well-specified”) model
Multiple Regression
An SPSS example:
Use country.sav data file
122 countries
Tip: if you have multiple variables, you need more cases (e.g., countries)
DV: Female Life Expectancy
Multiple Regression
The Model:
Multiple Regression
In SPSS:
Analyze
Regression
Linear
Select variables (see model above)
Method: “Enter”
Under “Statistics” select:
Estimates
Model fit
Collinearity diagnostics
Multiple Regression
There are four methods of selecting variables to enter into an analysis:
Enter
Forward selection
Backward selection
Stepwise selection
Enter is the most common method. For the other three, consult a book on regression analysis
Multiple Regression
Results:
Multicollinearity
IVs are not independent of each other
Violates a regression assumption: each IV has an impact on the DV and not on the other IVs
Thus, their effects on the DV cannot be isolated from each other
Need to identify whether this problem exists and correct it
Sampling problem, but not always a solution
Multicollinearity
How to detect:
Look at last two columns of output
If any Tolerance value is close to zero, there is a multicollinearity problem for the IV it is associated with (listed on the left of the table)
If the VIF is larger than 5 (or larger than 10, according to some statisticians), then there is a multicollinearity problem for the IV it is associated with
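The two columns carry the same information, which a short Python sketch can show. It assumes the standard definitions: for each IV, tolerance = 1 − R²_j, where R²_j comes from regressing that IV on all the other IVs, and VIF = 1 / tolerance. The function names and the R²_j values are hypothetical. The default cutoff of 5 follows the rule of thumb above.

```python
def tolerance(r2_j: float) -> float:
    """Tolerance of an IV, given R-square from regressing it on the other IVs."""
    return 1 - r2_j


def vif(r2_j: float) -> float:
    """Variance inflation factor: the reciprocal of tolerance."""
    tol = tolerance(r2_j)
    if tol <= 0:
        raise ValueError("IV is perfectly predicted by the other IVs")
    return 1 / tol


def has_multicollinearity(r2_j: float, vif_cutoff: float = 5.0) -> bool:
    """Flag an IV whose VIF exceeds the chosen rule-of-thumb cutoff."""
    return vif(r2_j) > vif_cutoff


# Hypothetical IVs: one mostly explained by the other IVs, one not.
for r2_j in (0.90, 0.50):
    print(round(tolerance(r2_j), 2), round(vif(r2_j), 2), has_multicollinearity(r2_j))
```

A tolerance near zero and a large VIF are therefore two views of the same problem: most of the IV's variance is already accounted for by the other IVs.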
Multicollinearity
In our results:
Variables with multicollinearity problems include:
GDP per capita
Phones per 100 people
See the tolerance and VIF statistics
These are correlated highly with at least some of the other variables
Multicollinearity
Another method of detection:
A simpler method is to run bivariate Pearson’s r correlations for all IVs
Can create a correlation table and make decisions about which variables to include
Helps you identify IVs with high correlations
Ask whether they are measuring the same concept
In SPSS: Analyze → Correlate → Bivariate Correlations → Choose all variables
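The same screening can be sketched outside SPSS. The Python example below computes Pearson’s r for every pair of IVs and lists the pairs above a cutoff. The data are made up (they are not the country.sav values), the variable names are hypothetical, and the 0.8 cutoff is an assumption chosen for illustration, since the slides do not specify one.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson's r between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def high_correlations(ivs, cutoff=0.8):
    """Return (name1, name2, r) for IV pairs with |r| at or above cutoff."""
    pairs = []
    for (name_a, a), (name_b, b) in combinations(ivs.items(), 2):
        r = pearson_r(a, b)
        if abs(r) >= cutoff:
            pairs.append((name_a, name_b, round(r, 3)))
    return pairs


# Made-up data for three hypothetical IVs.
ivs = {
    "gdp": [1.0, 2.0, 3.0, 4.0, 5.0],
    "phones": [2.1, 3.9, 6.2, 8.0, 9.9],
    "urban": [5.0, 1.0, 4.0, 2.0, 3.0],
}
print(high_correlations(ivs))
```

A flagged pair is a prompt to ask whether the two IVs measure the same concept, not an automatic reason to drop one.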
Multiple Regression
Removing variables and re-running model
Only remove insignificant variables if they are not vital to your theory or to models in past literature
Removing even insignificant variables will have an impact on the estimation of the model (including other coefficients)
You have more latitude in this area if the model is simply exploratory
You do need to address any multicollinearity problem
Multiple Regression
Re-run the model without Phones per 100 people
Multiple Regression
SPSS Output
Table 1: Variables Entered/Removed
Shows the variables included, type of selection, and any removed variables
Table 3: ANOVA
Shows the F-ratio for the entire model; a Sig. of .000 (p < .001) means the model as a whole is statistically significant
Multiple Regression
SPSS Output
Table 2: Model Summary
Look at R (the multiple correlation), R-square, and adjusted R-square
These are scores for the two IVs combined
R² (.66) means the two IVs together explain 66% of the variance in female life expectancy
Multiple Regression
SPSS Output
Table 4: Coefficients
Tolerance and VIF are good
Pct Urban and Doctors are statistically significant predictors
The model is good
Multiple Regression
What does the regression equation mean?
Female life expectancy = 53.905 + .111 Percent urban + .000 GDP + .040 Radios + .009 Hospital beds + .465 Doctors
In the reduced model with only percent urban and doctors as IVs (Model 2):
The base for female life expectancy (no urban population and no doctors) is 53.5 years
Every percentage point increase in urbanization increases female life expectancy by .142 years
Every additional doctor per 10,000 people increases female life expectancy by .568 years
Isolated effects: each coefficient holds the other IV constant
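The interpretation can be checked by plugging values into the equation. This Python sketch uses the constant and coefficients that the interpretation above relies on (53.546, .142, and .568, from the two-IV model reported as Model 2 in the results table). The function name and the example country are hypothetical.

```python
# Coefficients from the two-IV model (Model 2) reported in these slides.
CONSTANT = 53.546
B_URBAN = 0.142     # years per percentage point of urbanization
B_DOCTORS = 0.568   # years per additional doctor per 10,000 people


def predicted_female_life_expectancy(pct_urban, doctors_per_10k):
    """Predicted DV from the fitted equation Y = a + b1*X1 + b2*X2."""
    return CONSTANT + B_URBAN * pct_urban + B_DOCTORS * doctors_per_10k


# Hypothetical country: 50% urban, 10 doctors per 10,000 people.
base = predicted_female_life_expectancy(50, 10)
one_more_doctor = predicted_female_life_expectancy(50, 11)
print(round(base, 3))
print(round(one_more_doctor - base, 3))  # isolated effect of one more doctor
```

Holding urbanization fixed and changing only the doctors value changes the prediction by exactly the doctors coefficient, which is what "isolated effect" means here.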
Multiple Regression
The Beta Coefficients (Standardized Coefficients) in Table 4:
Which of the two IVs is more important?
The betas of the two variables can be compared:
Doctors per 10,000 people: .469
Percent urban: .248
If you want to increase female life expectancy, the best method is by increasing the number of doctors per 10,000 people
Urbanization helps, but comes in second
Multiple Regression
SPSS Output
Table 5: Collinearity Diagnostics
Useful if you want to conduct more detailed analyses of multicollinearity
Guidelines for interpreting the table:
If eigenvalue is close to zero, there is a problem of multicollinearity
If the condition index is larger than 15, there is a problem of multicollinearity
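The two guidelines are linked, as a brief Python sketch can illustrate. It assumes the usual definition of the condition index for each dimension: the square root of the largest eigenvalue divided by that dimension's eigenvalue. The eigenvalues below are made up and the function names are hypothetical. The cutoff of 15 is the one given above.

```python
from math import sqrt


def condition_indices(eigenvalues):
    """Condition index per dimension: sqrt(largest eigenvalue / eigenvalue)."""
    if min(eigenvalues) <= 0:
        raise ValueError("eigenvalues must be positive")
    largest = max(eigenvalues)
    return [sqrt(largest / ev) for ev in eigenvalues]


def flag_dimensions(eigenvalues, cutoff=15.0):
    """True for each dimension whose condition index exceeds the cutoff."""
    return [ci > cutoff for ci in condition_indices(eigenvalues)]


# Hypothetical eigenvalues; the last one is close to zero.
eigenvalues = [2.5, 0.4, 0.01]
print([round(ci, 2) for ci in condition_indices(eigenvalues)])
print(flag_dimensions(eigenvalues))
```

An eigenvalue near zero produces a large condition index, so the two rules of thumb tend to flag the same dimension.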
Multiple Regression
Presenting Results of a Single Model
                            Unstandardized Regression
                            Coefficient (s.e.)           Beta Weight
Percent Urban, 1992         0.111∗ (0.035)               0.248
GDP Per Capita              0.000 (0.000)                0.093
Radios per 100 people       0.040 (0.027)                0.110
Hospital Beds per 10,000    0.009 (0.030)                0.025
Doctors per 10,000          0.465∗ (0.104)               0.469
Constant                    53.905∗ (1.483)
R²                          .679
Adjusted R²                 .664
∗p < 0.05
Multiple Regression
Presenting Results of Multiple Models
                            Model 1                Model 2
                            Coefficient   Beta     Coefficient   Beta
Percent Urban, 1992         0.111∗        0.248    0.142∗        0.313
                            (0.035)                (0.033)
GDP Per Capita              0.000         0.093
                            (0.000)
Radios per 100 people       0.040         0.110
                            (0.027)
Hospital Beds per 10,000    0.009         0.025
                            (0.030)
Doctors per 10,000          0.465∗        0.469    0.568∗        0.565
                            (0.104)                (0.074)
Constant                    53.905∗                53.546∗
                            (1.483)                (1.379)
R²                          .679                   .660
Adjusted R²                 .664                   .654
∗p < 0.05
Standard errors in parentheses
Multiple Regression
Causal Inference:
Remember: Correlation ≠ Causation
Regression implies causation (the IVs are modeled as causes of the DV), but it does not establish it
Statistical controls vs. experimental control
Beware of potential omitted variable bias
Interpret findings with appropriate caution
Communicating Research Findings
Outline
Preliminary Reminders
Guidelines for Paper Writing
Components of a Quantitative Research Paper
Oral Presentations
Ethical Issues
Preliminary Reminders
Audience
Presentation should be adjusted for the needs of the audience
Audience analysis:
Tailoring to audience
Making contents and message clear
Contents
For all kinds of audiences, presentations should be accurate, clear,coherent, and concise
Guidelines for Paper Writing
Accuracy
Appropriate uses of concepts cited in a paper
Appropriate uses of analytical methods and accuracy in calculations
Citing sources properly
Guidelines for Paper Writing
Clarity
Use the following carefully and sparingly:
Complex sentences
Conjunctions (“although,” “however”) and pronouns (“this,” “that”)
Allusions (indirect, vague references): avoid
Metaphors, Embellishments, Poetic Expressions, and Cliches
Double-check and revise your draft
Guidelines for Paper Writing
Coherence
Make an outline of your paper
Discuss a few concepts or issues and clarify and/or elaborate on them (focus)
Do not casually list many concepts or issues
Paragraphs should be the right size (not too long or short)
Each should have one or a few clear point(s)
Guidelines for Paper Writing
Conciseness
Keep it brief!
Make sure that every point you make is directly relevant to your main point(s) in the paper
Make sure that every word you use has a specific function in a sentence
Components of a Quantitative Research Paper
See Example 15.1 in textbook
A generic outline:
Executive Summary or Abstract
Introduction
Review/Theory
Methodology
Findings/Results
Recommendations/Conclusions
Qualitative research papers may have somewhat different outlines
Components of a Quantitative Research Paper
Executive Summaries and Abstracts
Differences between the two (see Example 15.2 in text)
Background Information
Problem, research questions, purpose of study
Literature Review
Three types:
Chronological order
Organize discussion around key variables (method for classes)
Organize discussion around theoretical approaches
Components of a Quantitative Research Paper
Methodology
Design, sampling, operational definitions of variables, procedures of data collection, and data analysis methods used
Findings/Results
Focus on important findings (do not cover everything); use tables and graphs
Recommendations/Conclusions
Include a summary and conclusions; recommendations are part of a professional (not academic) report
Oral Presentations
Use the traditional plan of research papers (background, methods, results, recommendation), but with different emphases
It may be necessary to be informal during the presentation
Practice, practice, practice!
Ethical Issues
Fabrication, falsification
Plagiarism
Full disclosure for handling research errors
Peer review: blind and double-blind review
Saving (and publishing) data for others’ use
APA guidelines: Keep data for at least 5 years
Use a Dataverse: https://dataverse.harvard.edu/
Questions?
Figure: Q&A by Libby Levi, CC BY-SA 2.0
Lab/Homework
Using the county data file edited.sav, I want you to conduct a brief study of the correlates of crime in North Carolina. For this, you will use the crime index variable (CrimeIndex) as the dependent variable. Look through the dataset and documentation to identify at least 5 variables that you believe to be associated with crime. Provide me with the following:
Hypothesis for each variable
Table of initial and final model results
Discussion of whether there is multicollinearity among some of the variables and what you will do about it
Interpret the statistically significant coefficients, both in their raw form and their beta weights
Report which of your hypotheses are supported and which are not.
Include SPSS output in an appendix, but do not rely on it for your brief report. Instead, make actual tables in Word.