ECON 497: Economic Research and Forecasting
Spring 2005 Bellas
Midterm
Name: ________________
You have three hours and twenty minutes to complete this exam.
Answer all questions, and explain your answers. Fifty points total,
points per part indicated in parentheses.
1. Linear regression involves estimating a linear relationship
between one or more independent or explanatory variables and a
dependent variable. Imagine that such a relationship has been
estimated between the price of a car in thousands of dollars (Pi),
the interior space in cubic feet (Si) and a dummy variable
indicating whether it has four wheel drive (Di):
The estimated equation is:
Pi = 8.3 + 0.1Si + 2.3Di
A. Calculate the predicted price for a car with interior space of
100 cubic feet that does not have four wheel drive. (1)
8.3 + 0.1*100 + 2.3*0 = 18.3 or $18,300
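The arithmetic can be checked with a short Python sketch (the function name and units are illustrative, not part of the exam):

```python
# Predicted price from the estimated equation P = 8.3 + 0.1*S + 2.3*D,
# with price measured in thousands of dollars.
def predicted_price(space_cuft, four_wheel_drive):
    return 8.3 + 0.1 * space_cuft + 2.3 * four_wheel_drive

print(predicted_price(100, 0))  # about 18.3, i.e. $18,300
```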
B. What is the interpretation of the coefficient on the four wheel
drive dummy variable? (1)
Other things being the same, a car with four wheel drive will be
priced 2300 dollars more than a car that doesn’t have four wheel
drive.
C. As you’re presenting these results to a hostile crowd, a heckler
in the crowd asks you if you really believe that a car that has no
interior space (Si=0) and doesn’t have four wheel drive (Di=0)
would sell for $8,300. How do you respond? (1)
Zero interior space is probably outside the range of the sample of
cars on which the model is based, so the model's prediction for a
car with zero interior space is probably invalid; the intercept by
itself is an extrapolation, not a meaningful price.
2. Dummy variables take the value of 0 or 1 and allow qualitative
factors to be represented in linear regression. In addition,
interactive or slope dummies allow the effects of a second variable
to vary from one qualitative group to another. For purposes of this
question, imagine that the annual wage (in $1,000) of a person with
a bachelor's degree (Wi) is estimated as a function of their age
(Ai) and whether or not they took a course in economics (Ei):
i.   Wi = β0 + β1*Ai
ii.  Wi = β0 + β1*Ai + β2*Ei
iii. Wi = β0 + β1*Ai + β2*Ei + β3*Ai*Ei
Regression results showed positive values for β0 in all three
models.
A. Imagine that you were to graph the predicted wage against age
based on the results of the first model (model i.). Show what this
would look like. (1)
B. Imagine that you were to graph the predicted wage against age
based on the results of the third model (model iii.). Show what
this would look like, being clear to specify the predictions for
the economists and non-economists. (1)
C. What would it mean if the estimated interaction coefficient β3
were positive and significant? (1)
This would mean that as they age, economists’ wages increase at a
faster rate than do non-economists’ wages.
D. Imagine that the estimated coefficients in the second model
(model ii.) were
Wi = 12.0 + 1.5*Ai + 3.2*Ei
Calculate what the estimated coefficients would be if the economics
dummy were replaced with a “didn’t take economics” dummy. (1)
Wi = 15.2 + 1.5*Ai - 3.2*DTEi
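As a quick check, a Python sketch (coefficients taken from the exam, not re-estimated) confirms that the two parameterizations give identical predictions:

```python
# Wage equations from part D; E = took economics, DTE = 1 - E.
def wage_econ_dummy(age, econ):
    return 12.0 + 1.5 * age + 3.2 * econ

def wage_no_econ_dummy(age, dte):
    return 15.2 + 1.5 * age - 3.2 * dte

for age in (25, 40, 60):
    for econ in (0, 1):
        assert abs(wage_econ_dummy(age, econ)
                   - wage_no_econ_dummy(age, 1 - econ)) < 1e-9
print("identical predictions")
```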
3. What would an economics class be without assumptions? This is
especially true in an econometrics class because the basic
regression model, conversationally known as ordinary least squares
(OLS to its friends) relies on seven classical assumptions. If
these assumptions are satisfied, OLS is the best linear unbiased
estimator (BLUE) that can possibly exist. Without them, it is
not.
A. One assumption is that the error term has constant variance.
What is the eight-syllable term given to violation of this
assumption? (1)
Heteroskedasticity
B. Another assumption is that no explanatory variable is a linear
function of any other explanatory variable(s). What is the
eight-syllable term given to the violation of this assumption?
(1)
Multicollinearity
C. How do the above violations bias coefficient estimates?
(2)
Neither one biases the coefficient estimates; both violations
affect the variances (standard errors) of the estimates rather
than their expected values.
4. One of the classical assumptions is that the model is correctly
specified, meaning that all relevant explanatory variables are
included. Of course, you can’t include all relevant explanatory
variables; there’s always something missing. In question #1, the
example was given of estimating car prices as a function of
interior space and a four wheel drive dummy variable. Imagine that
some car manufacturers are seen as being cooler than others, but
coolness isn’t something that can be quantified, so it is left out
of the equation. How would the estimated coefficient on interior
space be biased if cooler auto manufacturers tended to make smaller
cars? (2)
Coolness would have a positive impact on a car’s price, and
coolness and interior space (size) are negatively correlated. So,
if coolness is excluded, the estimated coefficient on size is
biased downward (a negative bias).
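This direction of bias can be illustrated with a small simulation (all numbers below are made up for illustration):

```python
# Omitted-variable bias: coolness raises price, coolness and size are
# negatively correlated, so regressing price on size alone biases the
# size coefficient downward from its true value of 0.1.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
coolness = rng.normal(size=n)
size = 100 - 5 * coolness + rng.normal(scale=3, size=n)  # cooler -> smaller
price = 8.3 + 0.1 * size + 2.3 * coolness + rng.normal(scale=0.5, size=n)

slope_short = np.polyfit(size, price, 1)[0]  # coolness omitted
print(slope_short)  # well below 0.1 (negative here)
```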
5. One linear regression hypothesis test that all regression
packages do is an F-test of the explanatory power of the
model.
A. What is the null hypothesis of this test? (1)
The null hypothesis is that all of the slope coefficients are
zero.
B. If you get a p-value (known in SPSS as a SIG. value) of 0.038
for this F-test, what does this imply about the explanatory power
of your model? (1)
This is a small p-value, which means that the null hypothesis
should be rejected in favor of the alternative. That is, at least
one of the slope coefficients is not zero.
6. As nice as the F-test is, the thing that most folks are really
interested in is the t-test of significance of the estimated
coefficients.
A. What is the null hypothesis of this test? (2)
The null hypothesis is that the coefficient in question is
zero.
B. If you get a p-value of 0.237 for this t-test, what does this
imply about the estimated coefficient on the variable in question?
(2)
This implies that the estimated coefficient is not significantly
different from zero.
C. If you get an estimated coefficient of 0.038 and an associated
p-value of 0.237 for this t-test and someone asked you your best
guess about the value of the coefficient on that variable, what
value would you tell them? (2)
Your best guess as to the value is the estimated coefficient of
0.038, even if this is not significantly different from zero.
7. Here is some totally fake SPSS output. Calculate the correct
values for the blanks. If you can’t calculate a value, make your
best guess and justify it.
ANOVA
[The ANOVA and coefficients tables did not survive conversion; the
values referenced in the answers below come from that output.]
A. (2) 2800 + 1200 = 4000
B. (2) This can’t be calculated from the output given, but because
the R2 is so large (see part F) the Sig. value is very small,
probably 0.000.
C. (2) t = B/SE = 7.00/3.50 = 2.00
D. (2) t = B/SE, so 1.00 = 3.00/SE, which gives SE = 3.00
E. (2) The t-value is 6.00, so the Sig. value is probably
0.000.
F. Calculate the R2 for this regression. (2) 2800/4000 =
0.70.
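The arithmetic behind these blanks can be laid out in a few lines of Python (values are the ones quoted in the answers above):

```python
# Filling in the blanks from the fake ANOVA/coefficients output.
reg_ss, resid_ss = 2800.0, 1200.0
total_ss = reg_ss + resid_ss      # part A: 4000.0
r_squared = reg_ss / total_ss     # part F: 0.70
t_c = 7.00 / 3.50                 # part C: t = B/SE = 2.00
se_d = 3.00 / 1.00                # part D: SE = B/t = 3.00
print(total_ss, r_squared, t_c, se_d)
```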
8. Among my favorite things about the Studenmund text are the four
criteria for determining whether an explanatory variable should be
added to a regression. Consider the following output from a
regression (this is actual data) of the price of a house on its
size in square feet and the number of bathrooms. You might also
consider adding the size of the lot on which the house sits. Here
are the regression results without and with lot size.
Model Summary
[The Model Summary tables and the regression without LOT did not
survive conversion; the coefficients for the regression including
LOT are below.]

Coefficients:
             B          Std. Error   Beta    t       Sig.
(Constant)   -14944.2   34193.198            -.437   .663
SQFT         178.794    26.432       .635    6.764   .000
BATHS        -15255.3   23840.001    -.059   -.640   .523
LOT          8.873      4.250        .131    2.088   .038
Discuss whether or not lot size should be included in the
regression based on Studenmund’s four criteria. (2)
Theory: Land value is part of a house’s value, so it should be
included.
Adj. R2: This increases when lot size is added, so it should be
included.
t-test: The estimated coefficient on LOT is positive and
significant (Sig. = .038), so include it.
Bias: The estimated coefficients on SQFT and BATHS don’t change
much when LOT is added, so by this criterion it need not be
included.
Overall, I would say that it should be included.
9. Imagine that you’re regressing the number of packs of cigarettes
consumed annually (Ci) on the price of a pack of cigarettes in
dollars (Pi). Offer an interpretation of the coefficient on price
from each of the following models.
A. Ci = β0 – β1*Pi (2)
If the price increases by one unit (one dollar), the quantity
consumed will fall by β1 units (packs).
B. LN(Ci) = β0 – β1*Pi (2)
If the price increases by one unit, the quantity consumed will
fall by approximately 100 × β1 percent.
C. LN(Ci) = β0 – β1*LN(Pi) (2)
Given the minus sign in the model, β1 is the magnitude of the
price elasticity of demand: a one percent increase in price
reduces the quantity consumed by β1 percent.
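The log-log interpretation in part C can be checked by simulation (all numbers assumed for illustration):

```python
# Generate demand with a known constant elasticity, then recover it by
# regressing LN(packs) on LN(price).
import numpy as np

rng = np.random.default_rng(1)
price = rng.uniform(2.0, 8.0, size=5_000)
true_elasticity = -0.4
packs = (200 * price ** true_elasticity
         * np.exp(rng.normal(scale=0.1, size=5_000)))

slope = np.polyfit(np.log(price), np.log(packs), 1)[0]
print(slope)  # close to -0.4
```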
10. I get some sick pleasure out of watching people worry about
multicollinearity.
A. What options are available for detection of this
multicollinearity? (2)
You can check whether you have good overall explanatory power (a
high R2 and a significant F-statistic) but few or no significant
estimated coefficients.
You can look at correlation coefficients among the explanatory
variables.
You can calculate variance inflation factors (VIFs) when you do a
regression.
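The VIF option can be sketched by hand in Python (the helper below is illustrative, not SPSS output): regress each explanatory variable on the others and compute 1/(1 - R2).

```python
import numpy as np

def vifs(X):
    """Variance inflation factor for each column of X (no constant)."""
    n, k = X.shape
    out = []
    for j in range(k):
        others = np.delete(X, j, axis=1)
        Z = np.column_stack([np.ones(n), others])   # add a constant
        beta, *_ = np.linalg.lstsq(Z, X[:, j], rcond=None)
        resid = X[:, j] - Z @ beta
        r2 = 1 - resid.var() / X[:, j].var()
        out.append(1.0 / (1.0 - r2))
    return out

rng = np.random.default_rng(2)
x1 = rng.normal(size=500)
x2 = x1 + rng.normal(scale=0.1, size=500)   # nearly collinear with x1
x3 = rng.normal(size=500)
print(vifs(np.column_stack([x1, x2, x3])))  # large, large, near 1
```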
B. In one word, what should you do to address this problem in your
regression? (1)
Nothing.
Most potential solutions to multicollinearity are worse than the
multicollinearity itself. Excluding an explanatory variable, for
example, would introduce omitted variable bias, whereas
multicollinearity does not bias coefficient estimates.
11. Heteroskedasticity is sometimes a problem in regression
analysis.
A. Draw a scatterplot, being careful to label the axes correctly,
that demonstrates heteroskedasticity. (2)
B. What are the consequences of heteroskedasticity? (2)
Estimated coefficients are not biased but the standard errors will
be artificially small, so that estimated coefficients may appear to
be significant when they aren’t really significant.
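A small simulation (illustrative numbers only) shows the consequence described above: with errors whose variance grows with x, the usual OLS formula understates the true sampling variability of the slope.

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 200, 2_000
slopes, reported_ses = [], []
for _ in range(reps):
    x = rng.uniform(0, 10, size=n)
    e = rng.normal(scale=0.1 * x ** 2)   # heteroskedastic errors
    y = 1.0 + 0.5 * x + e
    xd = x - x.mean()
    b = (xd @ y) / (xd @ xd)             # OLS slope
    resid = y - y.mean() - b * xd
    s2 = (resid @ resid) / (n - 2)
    slopes.append(b)
    reported_ses.append(np.sqrt(s2 / (xd @ xd)))  # usual OLS SE formula

print(np.std(slopes), np.mean(reported_ses))  # true spread > reported SE
```

A scatterplot of any one simulated (x, y) sample would also show the fan shape asked for in part A.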