Use of Administrative Data in Statistics Canada’s Annual Survey of Manufactures Steve Matthews and Wesley Yung May 16, 2004 The United Nations Statistical Commission and Economic Commission for Europe Conference of European Statisticians


Use of Administrative Data in Statistics Canada’s Annual Survey of Manufactures

Steve Matthews and Wesley Yung

May 16, 2004

The United Nations Statistical Commission and Economic Commission for Europe

Conference of European Statisticians

Outline

  • Introduction
  • Tax data programs at Statistics Canada
  • The Annual Survey of Manufactures (ASM)
      • Overview
      • Strategy for use of tax data
      • Analytical studies
  • Conclusions and Future Work

Introduction

  • Desire to increase use of tax data
      • Reduce respondent burden
      • Reduce survey costs
  • Can be used at many stages of survey process
      • Stratification
      • Survey data validation
      • Edit and imputation
      • Estimation

Tax Data programs at Statistics Canada

  • Tax data available to Statistics Canada
      • Collected by Canada Revenue Agency (CRA)
      • Access via a data-sharing agreement
      • To be used only for statistical purposes
  • Two extensive tax data programs
      • Unincorporated businesses (T1)
      • Incorporated businesses (T2)

Tax Data programs at Statistics Canada (cont’d)

  • T1 - Population
      • Unincorporated businesses
      • Account for small share of revenues
  • Administrative Data
      • Sample-based
      • Limited set of variables
      • Edit and imputation is applied
      • Weighted benchmarked estimates

Tax Data programs at Statistics Canada (cont’d)

  • T2 - Population
      • Incorporated businesses
      • Account for large share of revenues
  • Administrative Data
      • Census-based
      • Extensive set of variables
      • Edit and imputation is applied
      • Micro-data is produced

The Annual Survey of Manufactures

  • Manufacturing is an important sector of Canadian economy
      • ~17% of GDP
  • Annual Survey of Manufactures
      • Take-none Portion and Survey Portion
      • Extensive questionnaire (financial and commodity)
      • Data requirements (pseudo-census)

The Annual Survey of Manufactures (cont’d)

  • Target population
      • Drawn from Statistics Canada’s Business Register (BR)
      • All businesses classified to manufacturing
  • Sample design
      • Non-survey portion
          • Administrative data
      • Survey portion
          • Stratified SRS (Stratum = NAICS * Province * Size)
          • Small take-some / Large take-some / Take-all
          • Collected via mail-out / mail-back, follow-up via telephone
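
The stratified design above can be illustrated with a minimal sketch. It assumes a frame held as a list of dictionaries and a sampling fraction per size class; the field names, the fractions, and the rounding rule are hypothetical and are not taken from the ASM.

```python
import random
from collections import defaultdict


def stratified_srs(frame, sampling_fraction, seed=0):
    """Draw a simple random sample without replacement within each stratum.

    frame: list of dicts with hypothetical keys 'id', 'naics', 'province', 'size'.
    sampling_fraction: dict mapping size class -> fraction (1.0 means take-all).
    Returns a list of (unit, design_weight) pairs.
    """
    rng = random.Random(seed)

    # Stratum = NAICS x Province x Size, as on the slide.
    strata = defaultdict(list)
    for unit in frame:
        strata[(unit["naics"], unit["province"], unit["size"])].append(unit)

    sample = []
    for key in sorted(strata):
        units = strata[key]
        fraction = sampling_fraction[key[2]]
        # Assumed rounding rule: at least one unit, at most the whole stratum.
        n = min(len(units), max(1, round(fraction * len(units))))
        weight = len(units) / n  # design weight N_h / n_h
        for unit in rng.sample(units, n):
            sample.append((unit, weight))
    return sample


# Illustrative frame, not real data.
strata_sizes = (
    [("311", "ON", "small")] * 20
    + [("311", "ON", "large")] * 10
    + [("311", "ON", "take-all")] * 3
    + [("332", "QC", "small")] * 12
)
frame = [
    {"id": i, "naics": naics, "province": prov, "size": size}
    for i, (naics, prov, size) in enumerate(strata_sizes)
]
fractions = {"small": 0.25, "large": 0.5, "take-all": 1.0}
sample = stratified_srs(frame, fractions)
```

Within each stratum the design weights sum to the stratum size, so the weighted sample represents the whole frame; take-all units carry a weight of 1.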

The Annual Survey of Manufactures (cont’d)

  • Edit and Imputation
      • Edits applied to ensure accuracy and coherence
      • Extensive imputation to produce ‘pseudo-census’ dataset
          • Historical imputation
          • Ratio imputation
          • Nearest-neighbour donor imputation
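
The three imputation methods named above can be sketched in a few lines. This is a generic illustration, not the ASM implementation: the function names, the trend factor, and the use of a single auxiliary variable (for example frame revenue) are assumptions.

```python
def historical_impute(previous_value, trend=1.0):
    """Carry forward last period's value, optionally scaled by an assumed trend factor."""
    return previous_value * trend


def ratio_impute(aux_value, donors):
    """Scale the auxiliary value by the ratio observed among respondents.

    donors: list of (y, x) pairs from respondents in the same imputation class,
    where y is the target variable and x the auxiliary variable.
    """
    ratio = sum(y for y, _ in donors) / sum(x for _, x in donors)
    return ratio * aux_value


def nearest_neighbour_impute(aux_value, donors):
    """Copy y from the respondent whose auxiliary value is closest."""
    y, _ = min(donors, key=lambda d: abs(d[1] - aux_value))
    return y


# Illustrative respondents: (total expenses, frame revenue).
donors = [(120.0, 100.0), (230.0, 200.0), (330.0, 300.0)]
by_history = historical_impute(500.0, trend=1.02)
by_ratio = ratio_impute(150.0, donors)
by_donor = nearest_neighbour_impute(190.0, donors)
```

Ratio imputation preserves the class-level relationship between the two variables, while donor imputation returns a value that was actually reported by some business.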

The Annual Survey of Manufactures (cont’d)

  • Estimation
      • Non-survey portion (tax data)
          • Total Expenses only
          • T1: weighted domain estimates
          • T2: aggregates from administrative census dataset
      • Survey portion (survey data and imputed data)
          • Aggregates from pseudo-census dataset
          • Domains of interest: NAICS and Province
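
The difference between a weighted domain estimate and a pseudo-census aggregate can be shown with a short sketch. The record layouts and the numbers are illustrative assumptions; the only point is that one estimator weights sampled units while the other sums a value for every unit.

```python
from collections import defaultdict


def weighted_domain_totals(sample):
    """Weighted (Horvitz-Thompson style) totals by domain.

    sample: iterable of (domain, design_weight, value) for sampled units.
    """
    totals = defaultdict(float)
    for domain, weight, value in sample:
        totals[domain] += weight * value
    return dict(totals)


def pseudo_census_totals(records):
    """Unweighted totals by domain.

    records: iterable of (domain, value) for every unit in the survey portion,
    where value is either reported or imputed.
    """
    totals = defaultdict(float)
    for domain, value in records:
        totals[domain] += value
    return dict(totals)


# Illustrative domain keys (NAICS, Province) and values.
on_food = ("311", "ON")
qc_metal = ("332", "QC")
sample = [(on_food, 4.0, 10.0), (on_food, 4.0, 12.0), (qc_metal, 1.0, 50.0)]
records = [(on_food, 10.0), (on_food, 12.0)] + [(on_food, 11.0)] * 6 + [(qc_metal, 50.0)]

weighted = weighted_domain_totals(sample)
aggregated = pseudo_census_totals(records)
```

Because the pseudo-census dataset holds one record per unit, any domain can be tabulated by simple aggregation without carrying design weights.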

Analytical Studies

  • Motivation for two studies:
      • Which variables should be ‘replaced’?
      • What are the effects of the strategy on final estimates for all variables?
  • Study 1 – Data comparison
  • Study 2 – Impact Analysis

Analytical Study 1

  • Study to select appropriate variables
      • Comparison of reported data collected via survey and tax
      • Simple businesses only
      • Assess suitability for substitution of survey data
  • Based on ~6,000 businesses

Analytical Study 1 (cont’d)

  • Correlation Analysis
      • Wide range of correlations
          • Total Expenses: 0.9
          • Total Energy Expenses: -0.10
  • Reporting Patterns
      • Same pattern (zero or positive) for individual businesses
          • Total Expenses: 99%
          • Total Energy Expenses: 50%

Analytical Study 1 (cont’d)

  • Distribution of Ratios
      • Examined histograms, fraction between 0.9 and 1.1
          • Total Expenses: 60%
          • Total Energy Expenses: 16%
  • Population Estimates
      • Relative difference between tax and survey-based estimates
          • Total Expenses: 3%
          • Total Energy Expenses: 28%
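
The four comparison measures used in Study 1 (correlation, reporting pattern, ratio distribution, relative difference of totals) can be computed along the following lines. This is a sketch under assumptions: values are paired per business and non-negative, the ratio is taken as tax over survey, and businesses with a zero survey value are left out of the ratio measure. The slides do not specify these details.

```python
import math


def pearson(x, y):
    """Pearson correlation between paired survey and tax values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def same_pattern_rate(survey, tax):
    """Share of businesses reporting the same pattern (both zero or both positive)."""
    same = sum((s > 0) == (t > 0) for s, t in zip(survey, tax))
    return same / len(survey)


def share_ratio_within(survey, tax, lo=0.9, hi=1.1):
    """Share of tax/survey ratios in [lo, hi], over businesses with survey > 0."""
    ratios = [t / s for s, t in zip(survey, tax) if s > 0]
    return sum(lo <= r <= hi for r in ratios) / len(ratios)


def relative_difference_of_totals(survey, tax):
    """Absolute relative difference between tax-based and survey-based totals."""
    return abs(sum(tax) - sum(survey)) / sum(survey)


# Illustrative values for four businesses, not study data.
survey = [100.0, 0.0, 50.0, 200.0]
tax = [105.0, 0.0, 80.0, 0.0]
pattern = same_pattern_rate(survey, tax)
within = share_ratio_within(survey, tax)
rel_diff = relative_difference_of_totals(survey, tax)
```

A variable that scores well on all four measures, as Total Expenses does in the study, is a natural candidate for direct substitution; a variable with low agreement, as Total Energy Expenses shows, is not.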

Analytical Study 1 (cont’d)

  • Selected several variables for direct substitution
      • Section totals and sub-totals
          • expenses, revenues, inventories, etc.
  • Remaining variables are imputed
      • Imputation => assign distribution of details within each total
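
One way to read "assign distribution of details within each total" is as a pro-rating step: the total comes from tax data, and the detail lines are allocated in the proportions observed for a donor or imputation class. The sketch below assumes that reading; the function and variable names are hypothetical.

```python
def distribute_details(total, donor_details):
    """Allocate a substituted total across detail lines using a donor's proportions.

    total: the section total taken from tax data.
    donor_details: dict of detail name -> donor value, used only for its shares.
    """
    donor_total = sum(donor_details.values())
    if donor_total == 0:
        raise ValueError("donor details sum to zero, so no distribution is available")
    return {name: total * value / donor_total for name, value in donor_details.items()}


# Illustrative numbers: a tax-based expense total and a donor's detail breakdown.
details = distribute_details(
    1000.0, {"materials": 300.0, "energy": 50.0, "wages": 150.0}
)
```

By construction the imputed details add up to the substituted total, which keeps the record internally coherent.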

Analytical Study 1 - Conclusions

  • Distinctly different results for different variables
      • Direct substitution seems feasible for totals
      • Direct substitution not recommended for details
  • Use standard methods to impute other variables

Analytical Study 2

  • Analysis to evaluate impact of tax data strategy
  • Bias
      • Comparison of estimates from different scenarios
  • Variance
      • Shao-Steel approach for variance estimation
      • Reflects variance from sampling and imputation
      • Assume equal probability of response within imputation class
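
As a hedged sketch of the idea behind this kind of variance estimation, the total error of an imputed estimator can be decomposed by conditioning on the response indicators first and the sample second (the "reverse" ordering usually associated with the Shao-Steel approach). The notation is assumed here and does not appear in the slides: $\hat{Y}_I$ is the imputed estimator, $Y$ the true total, $p$ the sampling design, $q$ the response mechanism, and $\mathbf{r}$ the vector of response indicators for the population.

```latex
\operatorname{Var}\bigl(\hat{Y}_I - Y\bigr)
  = \underbrace{E_q\!\left[ V_p\bigl(\hat{Y}_I - Y \mid \mathbf{r}\bigr) \right]}_{\text{sampling component}}
  + \underbrace{V_q\!\left[ E_p\bigl(\hat{Y}_I - Y \mid \mathbf{r}\bigr) \right]}_{\text{response / imputation component}}
```

The assumption of equal response probability within each imputation class is what allows the second component to be estimated from the respondents alone.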

Analytical Study 2 (cont’d)

Scenarios     Tax Data Used in Imputation                        Estimator          Variance
HT – No Tax   None (ratio imputation based on frame revenues)    Horvitz-Thompson   Sampling, Imputation
PC – No Tax   None (ratio imputation based on frame revenues)    Pseudo-census      Imputation
PC – Tax      Non-response (in or out of sample);                Pseudo-census      Imputation
              direct substitution; ratio imputation

Analytical Study 2 (cont’d)

Comparison of resulting estimates for Total Expenses

Relative Difference from “HT – No Tax” – Total Expenses

Scenario       All Manufacturing    NAICS3 x Province*
PC – No Tax    1.8%                 0.0%
PC – Tax       0.5%                 1.3%

* Median value for all such domains

Analytical Study 2 (cont’d)

Comparison of estimated CVs for Total Expenses

Coefficient of Variation – Total Expenses

Scenario       All Manufacturing    NAICS3 x Province*
HT – No Tax    0.3%                 1.5%
PC – No Tax    0.3%                 1.5%
PC – Tax       0.1%                 0.7%

* Median value for all such domains
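
The two summary measures reported in these tables can be computed as in the sketch below. It assumes that the relative difference is taken in absolute value against the "HT – No Tax" baseline and that the domain-level figure is the median over all NAICS3 x Province domains; the input numbers are illustrative and are not the study's estimates.

```python
import math
from statistics import median


def relative_difference(estimate, baseline):
    """Absolute relative difference of a scenario estimate from the baseline estimate."""
    return abs(estimate - baseline) / abs(baseline)


def coefficient_of_variation(estimate, variance):
    """CV = standard error divided by the estimate."""
    return math.sqrt(variance) / abs(estimate)


def median_over_domains(values_by_domain):
    """Median of a measure across domains (for example NAICS3 x Province cells)."""
    return median(values_by_domain.values())


# Illustrative inputs only.
rel = relative_difference(1018.0, 1000.0)
cv = coefficient_of_variation(1000.0, 9.0)
domain_cvs = {("311", "ON"): 0.010, ("332", "QC"): 0.020, ("336", "BC"): 0.015}
typical_cv = median_over_domains(domain_cvs)
```

Reporting the median over domains gives a single figure for the typical small-domain precision without letting a few very small cells dominate.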

Analytical Study 2 (cont’d)

Comparison of resulting estimates for Total Energy Expenses

Relative Difference from “HT – No Tax” – Total Energy Expenses

Scenario       All Manufacturing    NAICS3 x Province*
PC – No Tax    1.2%                 0.0%
PC – Tax       0.8%                 1.2%

* Median value for all such domains

Analytical Study 2 (cont’d)

Comparison of estimated CVs for Total Energy Expenses

Coefficient of Variation – Total Energy Expenses

Scenario       All Manufacturing    NAICS3 x Province*
HT – No Tax    0.3%                 1.8%
PC – No Tax    0.4%                 1.8%
PC – Tax       0.4%                 1.8%

* Median value for all such domains

Analytical Study 2 - Conclusions

  • Bias
      • Small relative difference between estimated totals from scenarios
  • Variance
      • Relatively low CV for all options
      • Tax substitution variables: Scenario 3 (PC – Tax) most efficient
      • Non-tax substitution variables: Scenario 1 (HT – No Tax) most efficient
  • Analytical capabilities
      • Scenarios 2 and 3 (the pseudo-census scenarios) provide most detail

Conclusions

  • Results used to select 2004 strategy – “PC – Tax”
      • Meets needs of data users
      • Reduced cost and response burden
      • Maintain (improve) quality
  • Striving to further increase use of tax data
      • Increased portion of population
      • Increased number of variables

Future Work

  • Editing of tax data
      • Similar approach to survey data approach
      • Potential to expand list of direct substitution variables
  • Indirect use of tax data
      • More adaptive models
  • Quality indicators
      • Account for increased variance and potential for bias due to imputation