A Critical Examination of Hedonic Analysis of a Regression Model (HARM) and META-ANALYSIS Albert R. Wilson BSSE, MBA, CRE (Ret). Regression Model. A model intended to allow an exploration of the hypothetical relationship between possible explanatory variables and the sales price. - PowerPoint PPT Presentation
A Critical Examination of Hedonic Analysis of a Regression Model (HARM) and META-ANALYSIS
Albert R. Wilson, BSSE, MBA, CRE (Ret)
2
Regression Model
A model intended to allow an exploration of the hypothetical relationship between possible explanatory variables and the sales price
3
Regression Model
• Reflection of reality
• The touchstone of that reality? Actual market participants
4
“Estimated” versus “Predicted”
• Estimated = Sale IN database
• Predicted = Sale NOT IN database
5
Predicted Sales Prices
At the mean, the variance of a predicted sales price is larger than the variance of an estimated sales price by σ² (the variance in the data)
6
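The relationship between the two variances can be checked numerically. The following is a minimal sketch, assuming a simple one-variable least-squares model and made-up data (the square footages and prices are hypothetical, not taken from the case examples). At the mean of the explanatory variable, the variance of an estimated price is s²/n and the variance of a predicted price is s²(1 + 1/n), so the two differ by s², the estimate of σ².

```python
# Hypothetical data: living area (sq ft) and sale price ($). Illustrative only.
sqft = [1500, 1800, 2000, 2200, 2500, 3000]
price = [150000, 171000, 186000, 190000, 215000, 240000]

n = len(sqft)
x_bar = sum(sqft) / n
y_bar = sum(price) / n

# Ordinary least-squares slope and intercept for price = a + b * sqft.
sxx = sum((x - x_bar) ** 2 for x in sqft)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(sqft, price))
b = sxy / sxx
a = y_bar - b * x_bar

# Residual variance s^2: the estimate of sigma^2 (variance in the data).
residuals = [y - (a + b * x) for x, y in zip(sqft, price)]
s2 = sum(r ** 2 for r in residuals) / (n - 2)

def var_estimated(x0):
    """Variance of the estimated mean price at x0 (sale IN the database)."""
    return s2 * (1 / n + (x0 - x_bar) ** 2 / sxx)

def var_predicted(x0):
    """Variance of a predicted price at x0 (sale NOT in the database)."""
    return s2 * (1 + 1 / n + (x0 - x_bar) ** 2 / sxx)

gap = var_predicted(x_bar) - var_estimated(x_bar)
print(f"s^2 = {s2:.1f}, gap at the mean = {gap:.1f}")
```

Away from the mean both variances grow by the same amount, so the gap of s² remains at every value of the explanatory variable.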
Mean Confidence Intervals (MCI): Estimated and Predicted
MCI FOR PREDICTED 4.38 TIMES MCI FOR ESTIMATED
7
DATABASE EDITING
GARBAGE IN => GARBAGE OUT (GIGO)
8
Case Example
Influence of the Removal of “Flipping Transactions” on the Predicted Prices for 33 Properties
PREDICTED SALES PRICES
PROPERTY NO.      AS PRESENTED   FLIPS REMOVED   % CHANGE
SUM               5,069,239      4,018,112       (1,051,127)
n                 391            379             -12
Adj. R-squared    0.7684         0.7593          -0.0091
9
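The summary rows of the flipping example report the change in absolute terms. As a small arithmetic sketch using only the totals on the slide (the variable names are illustrative), the same change can be expressed as a percentage of the total as presented:

```python
# Totals from the slide: sum of predicted prices for the 33 properties.
sum_as_presented = 5_069_239
sum_flips_removed = 4_018_112

change = sum_flips_removed - sum_as_presented
pct_change = 100 * change / sum_as_presented

print(f"Change: {change:,} ({pct_change:.1f}%)")
```

Removing 12 of the 391 sales reduces the summed predictions by roughly a fifth, while the adjusted R-squared barely moves.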
Editing and Confirmation of Data
STEP 1: Edit to identify obvious issues (the desk edit)
Case Example
Assessor’s Data: 4,325
Removed: 747 (17.3%)
R-Squared: 0.79 0.83
MLS Data: 1,888
Removed: 779 (44.3%)
10
Editing and Confirmation of Data
STEP 2: Identify sales that are not appropriate to the analysis
11
Editing and Confirmation of Data
STEP 3: Sales confirmation
• A values-neutral interview of sale participants
• OBJECT: to elicit the primary factors motivating the conclusion of the sale price
MUST NOT INTRODUCE ANALYST OPINION
THIS IS THE ONLY MEANS OF IDENTIFYING/CONFIRMING THE REASONS FOR A CONCLUDED PRICE
12
Regression Model Considerations
Faithfully represent:
• Identified concerns of actual market participants
• Restrictions imposed by the data
Estimates of prices are the ONLY VERIFIABLE OUTPUT
13
Coefficient Calculation
Result of iterative calculations designed to provide the most accurate estimates of sales prices in the database
14
Coefficient Calculation
Goodness of Fit
• Measures of the Goodness of Fit apply only to the relationship between the estimated and actual sales prices in the database
• They do not apply to the coefficients
15
Most commonly-cited Goodness-of-Fit Measure
R-Squared (Coefficient of Determination)
16
R-Squared
• Generally-applied interpretation:
  – R-Squared is the amount of variance “explained” by the model
17
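For reference, the conventional calculation behind this interpretation is R-Squared = 1 − (residual sum of squares) / (total sum of squares). A minimal sketch with hypothetical actual and estimated prices (not from the case data; the function name is illustrative):

```python
def r_squared(actual, estimated):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_total = sum((y - mean_actual) ** 2 for y in actual)
    ss_residual = sum((y - e) ** 2 for y, e in zip(actual, estimated))
    return 1 - ss_residual / ss_total

# Hypothetical actual sale prices and the model's estimates for the same sales.
actual = [200000, 220000, 250000, 270000, 310000]
estimated = [205000, 215000, 255000, 265000, 310000]

print(round(r_squared(actual, estimated), 4))
```

Only actual and estimated prices for sales in the database enter the calculation. No coefficient appears anywhere in it.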
Low R-Squared Models
Mathematically, as the R-Squared approaches 0.30, it becomes more likely that the model is only measuring random effects
18
The Omitted and Additional Variable Problem
• Omitting generally increases the magnitude and statistical significance of the remaining coefficients
• Adding generally decreases the magnitude and statistical significance of the remaining variable coefficients
19
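The omission effect can be reproduced on a toy data set. This is a minimal sketch under stated assumptions: the data are hypothetical, prices follow an assumed exact relationship (50 + 40 × size + 9 × pool, in $000s), and larger homes are assumed more likely to have a pool. The helper functions `solve` and `ols` are written here for illustration and are not from any library.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols(columns, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    cols = [[1.0] * len(y)] + columns
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    return solve(xtx, xty)

# Hypothetical data: size in thousands of sq ft, pool indicator, price in $000s.
size = [1.5, 1.8, 2.0, 2.2, 2.5, 3.0]
pool = [0, 0, 0, 1, 1, 1]  # assumed: larger homes are more likely to have a pool
price = [50 + 40 * s + 9 * p for s, p in zip(size, pool)]  # assumed relationship

full = ols([size, pool], price)   # intercept, size, pool
reduced = ols([size], price)      # pool omitted

print("size coefficient with pool:   ", round(full[1], 2))
print("size coefficient without pool:", round(reduced[1], 2))
```

With the pool variable omitted, the size coefficient absorbs part of the pool effect and rises above the assumed 40. This illustrates one instance of the omission case only, in a setting where the two variables are positively related.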
Illustration of Omitting or Adding a Variable

                Base Model          Added Variable–APN                 Omitted Variable–Pool
Variable        Coeff.     t-stat   Coeff.      t-stat   % Change      Coeff.     t-stat   % Change
Intercept       67,370     17.52    -663,632    -8.14    -1085.06%     66,293     17.14    -1.60%
APN                                 .023        8.98
Fixtures        2,653      5.39     2,511       5.15     -5.35%        2,886      5.84     8.74%
NoPatio         (12,801)   -7.77    (5,036)     -2.73    -60.66%       (13,451)   -8.13    5.08%
SqFt            40.79      29.23    42.80       30.61    4.93%         41.59      29.72    1.96%
Pool            8,366      6.77     8,908       7.28     6.48%
Garage          19,382     12.90    20,153      13.54    3.98%         19,980     13.24    3.09%
Middle Ring     (16,141)   -11.24   (11,230)    -7.38    -30.43%       (15,276)   -10.61   -5.36%
Inner Ring      (8,875)    -4.52    (7,114)     -3.64    -19.84%       (8,012)    -4.06    -9.72%
2000            207        0.08     1,787       -0.67    763.29%       271        0.10     30.92%
2001            (2,017)    -0.76    665         0.258    -132.97%      (2,028)    -0.76    0.55%
2002            (719)      -0.25    3,976       1.36     -652.99%      (615)      -0.21    -14.46%
2003            7,213      2.67     7,647       2.86     6.02%         7,258      2.71     0.62%
2004            41,149     15.50    40,380      15.37    -1.87%        40,901     15.31    -0.60%
2005            132,077    51.04    130,662     50.93    -1.07%        131,129    50.43    -0.72%
2006            160,367    45.29    159,842     45.63    -0.33%        159,897    44.89    -0.29%
R-Squared       0.83                0.83                               0.83
20
Consequences of Variable Selection
Including the Assessor’s Parcel Number
APN Coefficient Value: 0.023
t-statistic: 8.98
Mean Value: 30,834,360
R-Squared: 0.83
Mean Sale Price: $211,000
Results in an incremental increase in the sales price of 0.023 x 30,834,360 = $709,190
(APN Coef.) x (Mean Value) = (Incremental Increase)
21
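The APN arithmetic can be restated as a short sketch that uses only the slide's figures (the variable names are illustrative):

```python
apn_coefficient = 0.023        # APN coefficient from the model
apn_mean_value = 30_834_360    # mean value of the APN variable
mean_sale_price = 211_000      # mean sale price in the database ($)

# Contribution of the APN term at its mean value: coefficient x mean value.
apn_contribution = apn_coefficient * apn_mean_value
ratio = apn_contribution / mean_sale_price

print(f"APN contribution: ${apn_contribution:,.0f}")
print(f"Relative to the mean sale price: {ratio:.2f}x")
```

At its mean value the APN term alone is more than three times the mean sale price of $211,000.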
Consequences of Variable Selection
Omission of a Variable:
• Removal of “Pool”; present in 38% of properties
  – SQFT Coefficient changed from $40.79 to $41.59
  – Approximately the same t-statistic
• Removal of “Fixtures”; present in 100% of properties
  – SQFT Coefficient changed from $40.79 to $46.50
  – T-statistic = 50.94
22
Coefficients
Coefficients are simply multipliers for the explanatory variable
23
Causation in Real Estate
From the Real Estate Appraiser’s perspective:
1. Causation demonstrated through sales confirmation interviews.
2. Causation NEVER proven through a regression.
24
Strengths and Weaknesses
• Can never be better than the data
• Requires significant amount of data: five to 15 or more sales
• Upper limit to the amount of data: too much may be worse than too little
• Guide: Are the sales competitive to the subject?
• Estimate of sales prices most accurate at the mean value of the data
• Variance of a predicted sales price larger than variance of estimated
• Thousands of possible regression models
25
Further Considerations
• Absent standards, the “Rubber Ruler” may apply
• When recognized and published standards are not used, author must demonstrate the accuracy and reliability of his/her work
26
Hedonic Analysis
The Hedonic Assumption
The coefficient accurately and only represents the contribution of the declared meaning of the explanatory variable to the sale price
28
Hedonic Analysis
The validity of the hedonic assumption must be demonstrated
29
“Revealed Preference”
Idea cannot be supported for real estate
Supporting Literature
Not a single paper demonstrated the validity of the hedonic assumption
PLUS
• NO indication of confirmation of raw data
• NO indication of adherence to any recognized / published standards
• NO indication of confirmation of results with the normal or typical market participant
THE RUBBER RULER EFFECT IS MUCH IN EVIDENCE.
31
Regression Model Accuracy
If the regression model is inaccurate, then there is no reason to expect the coefficients to be accurate or meaningful.
Therefore the HARM cannot be accurate.
32
CASE EXAMPLE: TO POOL OR NOT TO POOL
• Using the data from the previous case.
• Does a pool influence value?
• By how much?
• Under the Hedonic Approach, the coefficient is the marginal contribution to value.
33
COMBINED POOL AND NO POOLS
Variable                   COEFFICIENT   MEAN VALUES   EXPECTED VALUES
Intercept                  54,089.83     1             54,090
ORIG_FIXTURES              2,805.33      8.73          24,491
ORIG_NOPATIO               -14,116.47    0.34          -4,800
ORIG_POOL                  9,161.98      0.38          3,482
ORIG_SQF                   41.52         2283.62       94,815
ORIG_X_3GARAGE             16,212.83     0.4           6,485
SY2000                     5,980.33      1             5,980
EXPECTED MEAN SALE PRICE                               184,543
Adj R2                     0.8816

COMBINED POOL AND NO POOLS, POOL COEFFICIENT SET TO ZERO
Variable                   COEFFICIENT   MEAN VALUES   EXPECTED VALUES
Intercept                  54,089.83     1             54,090
ORIG_FIXTURES              2,805.33      8.73          24,491
ORIG_NOPATIO               -14,116.47    0.34          -4,800
ORIG_POOL                  9,161.98      0             0
ORIG_SQF                   41.52         2283.62       94,815
ORIG_X_3GARAGE             16,212.83     0.4           6,485
SY2000                     5,980.33      1             5,980
EXPECTED MEAN SALE PRICE                               181,061
Adj R2                     0.8816
34
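Each entry in the EXPECTED VALUES column is a coefficient multiplied by the mean value of its variable, and the expected mean sale price is the sum of those products. A minimal sketch that recomputes both panels from the slide's figures (the dictionary layout and function name are illustrative choices, not the author's). As printed, the second panel keeps the pool coefficient and sets the pool mean value to 0, which is what the override below does.

```python
# (coefficient, mean value) pairs from the combined model on the slide.
combined = {
    "Intercept":      (54_089.83, 1),
    "ORIG_FIXTURES":  (2_805.33, 8.73),
    "ORIG_NOPATIO":   (-14_116.47, 0.34),
    "ORIG_POOL":      (9_161.98, 0.38),
    "ORIG_SQF":       (41.52, 2283.62),
    "ORIG_X_3GARAGE": (16_212.83, 0.4),
    "SY2000":         (5_980.33, 1),
}

def expected_price(model, overrides=None):
    """Sum of coefficient x mean value, with optional replacement mean values."""
    overrides = overrides or {}
    return sum(coef * overrides.get(name, mean)
               for name, (coef, mean) in model.items())

with_pool_share = expected_price(combined)
pool_set_to_zero = expected_price(combined, {"ORIG_POOL": 0})
difference = with_pool_share - pool_set_to_zero

print(f"{with_pool_share:,.2f}")
print(f"{pool_set_to_zero:,.2f}")
print(f"{difference:,.2f}")
```

The results agree with the slide to within about a dollar. The slide appears to sum the rounded EXPECTED VALUES column, while the sketch sums the unrounded products.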
TO POOL OR NOT TO POOL (CONT.)
• What are the coefficients if there is no pool?
35
COMBINED WITH NO POOL VARIABLE
Variable                   COEFFICIENT    MEAN VALUES   EXPECTED VALUES
Intercept                  52788.1063     1             52,788
ORIG_FIXTURES              3,087.8801     8.73          26,957
ORIG_NOPATIO               -14,724.7843   0.34          -5,006
ORIG_SQF                   42.3986        2283.62       96,822
ORIG_X_3GARAGE             16,924.691     0.4           6,770
SY2000                     5,727.7462     1             5,728
EXPECTED MEAN SALE PRICE                                184,059
Adj R2                     0.8790
36
Comparison (combined model vs. model with no pool variable)
• Orig Fixt      2,805      3,088
• Orig-nopatio   -14,116    -14,725
• Orig-pool      9,162      NA
• Orig-sqf       41.52      42.40
• Orig-garage    16,213     16,925
• SY2000         5,980      5,728
• ESP            $184,543   $184,059
• R-sq           0.88       0.88
37
TO POOL OR NOT TO POOL (CONT.)
• WHAT HAPPENS IF WE CONSIDER A DATABASE WITH POOLS, AND SEPARATELY A DATABASE WITHOUT POOLS?
38
WITH POOL ON PROPERTY
Variable                   COEFFICIENT   MEAN VALUES   EXPECTED VALUES
Intercept                  65,957.89     1.00          65,958
ORIG_FIXTURES              2,505.59      9.65          24,179
ORIG_NOPATIO               -15,415.46    0.22          -3,391
ORIG_POOL
ORIG_SQF                   41.63         2,586.79      107,690
ORIG_X_3GARAGE             15,768.93     0.40          6,308
SY2000                     4,211.37      1.00          4,211
EXPECTED MEAN SALE PRICE                               204,954
Adj R2                     0.8711

WITHOUT POOL ON PROPERTY
Variable                   COEFFICIENT   MEAN VALUES   EXPECTED VALUES
Intercept                  54,993.78     1.00          54,994
ORIG_FIXTURES              2,784.14      8.16          22,719
ORIG_NOPATIO               -14,838.47    0.41          -6,084
ORIG_POOL
ORIG_SQF                   41.46         2,097.20      86,956
ORIG_X_3GARAGE             16,308.32     0.31          5,056
SY2000                     7,209.87      1.00          7,210
EXPECTED MEAN SALE PRICE                               170,850
Adj R2                     0.8895
39
POOLS AND NO POOLS SEPARATELY
• ESTIMATED SALE PRICE WITH POOL $204,954
  – R-SQUARED 0.87
• ESTIMATED SALE PRICE W/O POOL $170,850
  – R-SQUARED 0.89
40
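The figures from the pool slides can be put side by side. A small sketch using only numbers from the slides (the variable names are illustrative). The two sub-databases also differ in mean square footage and other characteristics, so the gap is not attributable to the pool alone.

```python
price_with_pool = 204_954      # expected mean sale price, properties with a pool
price_without_pool = 170_850   # expected mean sale price, properties without a pool
pool_coefficient = 9_161.98    # ORIG_POOL coefficient in the combined model

gap = price_with_pool - price_without_pool
ratio = gap / pool_coefficient

print(f"Difference between the two databases: ${gap:,}")
print(f"Pool coefficient in the combined model: ${pool_coefficient:,.0f}")
print(f"Ratio: {ratio:.1f}")
```

The pool coefficient from the combined model and the difference between the separately modeled databases give very different answers to "By how much?".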
The Coefficient – What Counts?
ALL THAT STATISTICAL SIGNIFICANCE CAN TELL US IS THAT FOR THIS MODEL AND DATABASE THE COEFFICIENT IS A SIGNIFICANT (OR INSIGNIFICANT) MULTIPLIER FOR THE EXPLANATORY VARIABLE. NOTHING MORE.
41
The Appropriate Standard:
Economic Significance
For us, economic significance is determined by what the normal or typical participant considers important to the conclusion of the transaction.
42
A Criticality:
NOT ONE hedonic analysis encountered to date has actually asked this question:
“What was important to you in concluding your transaction?”
43
Hedonic Analysis of a Regression Model (HARM) is:
• Highly inaccurate and unreliable method
• Not appropriate for appraisal work
Observations apply to hedonic analysis, NOT regression models!
44