CHALLENGES IN VISITOR-VOLUME ESTIMATION: OVERVIEW. Presented By: Dr. Michael Kaylen, University of Missouri


Page 1: CHALLENGES IN VISITOR-VOLUME ESTIMATION: OVERVIEW

CHALLENGES IN VISITOR-VOLUME ESTIMATION: OVERVIEW

Presented By: Dr. Michael Kaylen

University of Missouri

Page 2: CHALLENGES IN VISITOR-VOLUME ESTIMATION: OVERVIEW

Outline

I. Statistical Background: Estimators, Projection Weights, and Properties of Estimators

II. Sources of Error in a Survey: Frame vs. Target Populations, Non-Response Bias

III. Travel-Related Measures: Travel Parties, Household Trips, Person Trips, Person Nights, Traveler Expenditures

IV. A Note on Sample Size

V. A Note on Stratification

VI. Considerations

Page 3: CHALLENGES IN VISITOR-VOLUME ESTIMATION: OVERVIEW

Statistical Background

Properties of Estimators for Population Parameters

• Population parameters are characteristics of populations.

Example: Missouri is interested in the population of households in the continental U.S. (excluding those in Missouri) at the start of the first quarter of 2008. A parameter of interest is the total number of trips those households took to Missouri during the first quarter of 2008. If there are $N$ households in the population of interest and $y_h$ is the number of trips household $h$ made to Missouri, then the parameter is given by:

$$Y = \sum_{h=1}^{N} y_h$$

Page 4: CHALLENGES IN VISITOR-VOLUME ESTIMATION: OVERVIEW

Statistical Background

Properties of Estimators for Population Parameters

• Estimators for unknown parameters are functions of the elements of a random sample.

Example: For the Missouri case, suppose the sampling design is such that the probability of household $h$'s inclusion in the sample set ($S$) is given by $\pi_h$. The "design weight" is $w_h = 1/\pi_h$ and an estimator for $Y$ is

$$\hat{Y} = \sum_{h \in S} w_h y_h$$

If we randomly sample one out of 1,000 households, the inclusion probability is just 1/1,000 and the design weight ("projection weight") for every household in that stratum would be 1,000.
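The design-weighted estimator on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `estimate_total` and the four-household sample are hypothetical, not data from the presentation.

```python
def estimate_total(sample):
    """Design-weighted estimate of a population total.

    `sample` is an iterable of (y_h, pi_h) pairs, where y_h is the number
    of trips reported by household h and pi_h is its inclusion probability.
    Each household is weighted by w_h = 1 / pi_h.
    """
    return sum(y_h * (1.0 / pi_h) for y_h, pi_h in sample)


# Hypothetical sample: four households, each drawn with probability 1/1,000,
# reporting 0, 2, 0 and 1 trips to Missouri.
sample = [(0, 1 / 1000), (2, 1 / 1000), (0, 1 / 1000), (1, 1 / 1000)]
estimated_trips = estimate_total(sample)
print(f"Estimated total trips: {estimated_trips:,.0f}")
```

Each sampled household stands in for 1,000 households, so the three reported trips project to roughly 3,000 trips in the population.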

Page 5: CHALLENGES IN VISITOR-VOLUME ESTIMATION: OVERVIEW

Statistical Background

Properties of Estimators for Population Parameters

• 3 Properties of Estimators

Bias: The expected (average) difference between an estimator and the parameter.

When we take a sample and calculate the value of an estimator for that sample, we have an estimate. The difference between that estimate and the true value for the parameter is referred to as sampling error. An estimator is unbiased if its average sampling error is zero.

Variance: The expected (average) squared difference between an estimator and the expected (average) value for the estimator.

Mean Squared Error: The expected (average) squared difference between an estimator and the parameter.

Note: MSE = Bias² + Variance
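The identity in the note follows from the three definitions on this slide. A short derivation, writing $\hat{Y}$ for the estimator and $Y$ for the parameter (the notation used on the earlier slides), is sketched below; the cross term vanishes because $E[\hat{Y} - E[\hat{Y}]] = 0$.

```latex
\begin{aligned}
\mathrm{MSE}(\hat{Y}) &= E\big[(\hat{Y} - Y)^2\big] \\
&= E\big[(\hat{Y} - E[\hat{Y}] + E[\hat{Y}] - Y)^2\big] \\
&= E\big[(\hat{Y} - E[\hat{Y}])^2\big]
   + 2\,(E[\hat{Y}] - Y)\,E\big[\hat{Y} - E[\hat{Y}]\big]
   + (E[\hat{Y}] - Y)^2 \\
&= \mathrm{Variance}(\hat{Y}) + \mathrm{Bias}(\hat{Y})^2
\end{aligned}
```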

Page 6: CHALLENGES IN VISITOR-VOLUME ESTIMATION: OVERVIEW

SOURCES OF ERROR IN A SURVEY

[Diagram labels: Target population]

Page 7: CHALLENGES IN VISITOR-VOLUME ESTIMATION: OVERVIEW

SOURCES OF ERROR IN A SURVEY

[Diagram labels: Target population, Frame population]

Page 8: CHALLENGES IN VISITOR-VOLUME ESTIMATION: OVERVIEW

SOURCES OF ERROR IN A SURVEY

[Diagram labels: Target population, Frame population, Sample]

Page 9: CHALLENGES IN VISITOR-VOLUME ESTIMATION: OVERVIEW

SOURCES OF ERROR IN A SURVEY

[Diagram labels: Target population, Frame population, Sample, Response set]

Page 10: CHALLENGES IN VISITOR-VOLUME ESTIMATION: OVERVIEW

SOURCES OF ERROR IN A SURVEY

[Diagram labels: Target population, Frame population, Sample, Response set, Nonresponse set]

Page 11: CHALLENGES IN VISITOR-VOLUME ESTIMATION: OVERVIEW

Travel-Related Measures

Household 1
  HH Members: John, Katy, Steve
  Trip 1
    Travel Party: John, Katy, Steve
    Number of Nights: 6
    Household Expenditures: $170
  Trip 2
    Travel Party: John, Steve (joined with Tom)
    Number of Nights: 4
    Household Expenditures: $70

Household 2
  HH Members: Tom, Mary
  Trip
    Travel Party: Tom (joined with John, Steve)
    Number of Nights: 4
    Household Expenditures: $80

Page 12: CHALLENGES IN VISITOR-VOLUME ESTIMATION: OVERVIEW

Travel-Related Measures

Missouri Perspective

  Measure                         Calculation                                    Total
  Travel Party Trips              (John, Katy, Steve) & (John, Steve, Tom)       2
  Household Trips                 (John, Katy, Steve) & (John, Steve) & (Tom)    3
  Person Trips (# of Travelers)   3 + 3                                          6
  Person Nights                   3(6) + 3(4)                                    30
  Traveler Expenditures           $170 + $70 + $80                               $320

Survey Results

  Measure             Household 1            Projection   Household 2   Projection   Households 1&2
  # Trips to MO       2                      4            1             2            2 + 1 = 3
  # Persons from HH   3 + 2 = 5              10           1             2            5 + 1 = 6
  HH Person Nights    (3x6) + (2x4) = 26     52           4             8            26 + 4 = 30
  HH Expenditures     $170 + $70 = $240      $480         $80           $160         $240 + $80 = $320
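The five travel measures can be recomputed from one record per household per trip. The sketch below is an illustration under assumed names: the `HouseholdTrip` record and the `party_id` labels "A" and "B" are hypothetical, while the travelers, nights, and expenditures come from the two-household example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HouseholdTrip:
    household: int      # household that reports the trip
    party_id: str       # hypothetical label; shared by households traveling together
    travelers: tuple    # members of this household on the trip
    nights: int
    spend: int          # household expenditures in dollars


trips = [
    HouseholdTrip(1, "A", ("John", "Katy", "Steve"), 6, 170),
    HouseholdTrip(1, "B", ("John", "Steve"), 4, 70),
    HouseholdTrip(2, "B", ("Tom",), 4, 80),
]

# A travel party spans households, so count distinct party labels.
travel_party_trips = len({t.party_id for t in trips})
# Each household on a trip contributes one household trip.
household_trips = len(trips)
person_trips = sum(len(t.travelers) for t in trips)
person_nights = sum(len(t.travelers) * t.nights for t in trips)
traveler_expenditures = sum(t.spend for t in trips)


def household_report(household):
    """What a single household would report in a household-level survey."""
    own = [t for t in trips if t.household == household]
    return {
        "trips": len(own),
        "persons": sum(len(t.travelers) for t in own),
        "person_nights": sum(len(t.travelers) * t.nights for t in own),
        "expenditures": sum(t.spend for t in own),
    }


print(travel_party_trips, household_trips, person_trips,
      person_nights, traveler_expenditures)
print(household_report(1))
print(household_report(2))
```

Keeping the household as the unit of each record is what makes the household-level answers add up to the Missouri totals without counting the shared trip twice.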

Page 13: CHALLENGES IN VISITOR-VOLUME ESTIMATION: OVERVIEW

A Note on Sample Size

Recently found on the Web:

For results based on this sample of 2,679 registered voters, the maximum margin of sampling error is ±2 percentage points.

Page 14: CHALLENGES IN VISITOR-VOLUME ESTIMATION: OVERVIEW

A Note on Sample Size

Margin of Sampling Error = Radius of Confidence Interval for a Statistic from a Survey, usually referring to a 95% Confidence Interval.

Example: 95% Confidence Interval for Percentage Favoring Obama is 48% ± 2%.

Page 15: CHALLENGES IN VISITOR-VOLUME ESTIMATION: OVERVIEW

A Note on Sample Size

Why do travel volume estimates need large samples?

Answer: Relative Margin of Error matters.

If an estimated proportion is p and the margin of error is ME, the relative margin of error is: RME = ME/p

In the example, RME = .02/.48 = 0.042, so the ME is about 4.2% of p.

Page 16: CHALLENGES IN VISITOR-VOLUME ESTIMATION: OVERVIEW

A Note on Sample Size

Travel Example: From past studies, we know about 1% of households in continental U.S. (excluding MO) visit MO in any given month. If we want to estimate the percentage for a given month, we need a smaller confidence interval than 1% ± 2%! In this case, RME = .02/.01 = 2, so the ME is 200% of the estimated percentage.
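The margin-of-error arithmetic in the polling and travel examples can be checked with a short sketch. It assumes simple random sampling and the usual normal approximation with z = 1.96 for a 95% interval; the slides do not say how the quoted poll computed its margin, so the function names and that method are assumptions.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p from n responses.

    Assumes simple random sampling and the normal approximation.
    """
    return z * math.sqrt(p * (1 - p) / n)


def relative_margin_of_error(me, p):
    """RME = ME / p, as defined in the presentation."""
    return me / p


# The "maximum" margin occurs at p = 0.5.
max_me = margin_of_error(0.5, 2679)
poll_rme = relative_margin_of_error(0.02, 0.48)
travel_rme = relative_margin_of_error(0.02, 0.01)

print(f"Maximum ME for n = 2,679: {max_me:.2%}")
print(f"RME when p = 48%: {poll_rme:.1%}")
print(f"RME when p = 1%:  {travel_rme:.0%}")
```

Under these assumptions the maximum margin for 2,679 respondents comes out just under two percentage points, consistent with the quoted poll, while the same absolute margin is twice the size of a 1% visitation rate.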

Page 17: CHALLENGES IN VISITOR-VOLUME ESTIMATION: OVERVIEW

A Note on Sample Size

                        1% Households Visit          3% Households Visit
  ME       RME          CI: 1M +/-    N              CI: 3M +/-    N
  0.20%    20.00%       200,000       9,508          600,000       3,105
  0.15%    15.00%       150,000       16,903         450,000       5,521
  0.10%    10.00%       100,000       38,032         300,000       12,421
  0.05%    5.00%        50,000        152,127        150,000       49,685
  0.04%    4.00%        40,000        237,699        120,000       77,632
  0.03%    3.00%        30,000        422,576        90,000        138,013
  0.02%    2.00%        20,000        950,796        60,000        310,529
  0.01%    1.00%        10,000        3,803,184      30,000        1,242,117

(The ME column applies to the 1% case; the RME values apply to both cases.)

How big of a RME can be tolerated? How big of a sample do you need to achieve it?
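The sample sizes in the table are consistent with the standard formula for a proportion, n = z² p(1 − p) / ME², with ME = RME × p. The sketch below assumes simple random sampling and z = 1.96; the presentation does not state the formula it used, so treat this as one plausible reconstruction, and the function name as hypothetical.

```python
def required_sample_size(p, rme, z=1.96):
    """Sample size needed so the 95% margin of error equals rme * p.

    Assumes simple random sampling and the normal approximation.
    """
    me = rme * p
    return z ** 2 * p * (1 - p) / me ** 2


for rme in (0.20, 0.15, 0.10, 0.05, 0.04, 0.03, 0.02, 0.01):
    n_1pct = round(required_sample_size(0.01, rme))
    n_3pct = round(required_sample_size(0.03, rme))
    print(f"RME {rme:6.2%}   1% visit: {n_1pct:>9,}   3% visit: {n_3pct:>9,}")
```

Halving the tolerated RME quadruples the required sample, which is why small visitation rates push travel surveys toward very large samples.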

Page 18: CHALLENGES IN VISITOR-VOLUME ESTIMATION: OVERVIEW

A Note on Stratification

Data providers often use sampling designs based on stratification of demographic variables such as household income, region, education, etc. There are two issues the user might want to consider.

Even though the final weights balance the sample for non-response, increased variance due to non-response may be important.

For example, consider a stratum calling for 10 households to be sampled, each representing 1,000 households. If only 5 of the households respond to the survey, we'll end up with 5 respondents, each having a weight of 2,000. The net effect of this smaller sample with larger weights is that we are more likely to get estimates far away from the true value. There is not a bias problem, but there is an increase in variance.
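The size of the variance increase can be gauged with the textbook variance of an estimated total under simple random sampling, Var = N²(1 − n/N)S²/n. The sketch below assumes the 5 respondents behave like a random subsample of the 10 selected households, and the within-stratum variance `s2` is a made-up value that cancels in the ratio.

```python
def total_variance(N, n, s2):
    """Variance of an estimated stratum total under simple random sampling.

    N: households in the stratum, n: responding households,
    s2: variance of trips per household within the stratum (hypothetical).
    """
    return N ** 2 * (1 - n / N) * s2 / n


N = 10 * 1000   # 10 sampled households, each representing 1,000
s2 = 0.05       # hypothetical within-stratum variance

var_full = total_variance(N, 10, s2)   # all 10 respond, weight 1,000 each
var_half = total_variance(N, 5, s2)    # only 5 respond, weight 2,000 each
variance_ratio = var_half / var_full

print(f"Variance ratio (5 vs. 10 respondents): {variance_ratio:.3f}")
```

Under these assumptions the variance roughly doubles, so the standard error of the stratum total grows by a factor of about 1.4.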

Page 19: CHALLENGES IN VISITOR-VOLUME ESTIMATION: OVERVIEW

A Note on Stratification

Data providers often use sampling designs based on stratification of demographic variables such as household income, region, education, etc. There are two issues the user might want to consider.

The strata definitions may not coincide with the user's needs. For example, the proportion of households that visit MO over a given time period is much higher for its neighboring states than for the non-neighboring states. If a sample over- or under-represents the neighboring states, we are likely to over- or under-estimate household visitation to MO, respectively.
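A small numeric sketch shows how misrepresenting the neighboring states shifts the estimate. All visitation rates and shares below are hypothetical, chosen only to illustrate the direction of the error.

```python
def overall_rate(shares, rates):
    """Overall visitation rate implied by group shares and group rates."""
    return sum(shares[group] * rates[group] for group in shares)


# Hypothetical visitation rates by origin group.
rates = {"neighbor": 0.05, "non_neighbor": 0.005}

# Hypothetical shares of households: true population vs. two skewed samples.
population_shares = {"neighbor": 0.10, "non_neighbor": 0.90}
over_sampled = {"neighbor": 0.20, "non_neighbor": 0.80}
under_sampled = {"neighbor": 0.05, "non_neighbor": 0.95}

true_rate = overall_rate(population_shares, rates)
rate_if_over = overall_rate(over_sampled, rates)
rate_if_under = overall_rate(under_sampled, rates)

print(f"True rate:                 {true_rate:.3%}")
print(f"Neighbors over-sampled:    {rate_if_over:.3%}")
print(f"Neighbors under-sampled:   {rate_if_under:.3%}")
```

Weighting respondents back to the population shares of neighboring and non-neighboring states would remove this distortion, which is why strata that do not separate the two groups can be a problem for this use.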

Page 20: CHALLENGES IN VISITOR-VOLUME ESTIMATION: OVERVIEW

Considerations

What is the Frame Population versus the Target Population?

Is the sample size adequate? (relative margin of error is key)

What is the response rate?

Page 21: CHALLENGES IN VISITOR-VOLUME ESTIMATION: OVERVIEW

Considerations

Are the design (projection) weights reflective of the sampling unit? (e.g., beware of sample designs based on households with weights based on people)

Are the survey questions relative to the sampling unit? (e.g., beware of potential double counting from sampling at the household level but asking questions about travel parties)

Page 22: CHALLENGES IN VISITOR-VOLUME ESTIMATION: OVERVIEW

Considerations

Is the sample balanced for demographic variables that will likely covary with the study variables (e.g., household income, location, etc.)?

Do you have access to all of the data (including weights, non-travelers, non-responders)?

Page 23: CHALLENGES IN VISITOR-VOLUME ESTIMATION: OVERVIEW

THANK YOU!

Questions, Comments, Slides?

www.teri.missouri.edu