Topic (vii): Editing and Imputation of Census data

Oslo, 26.9.2012, UNECE Work Session on Statistical Data Editing (SDE)


Discussion

Session organizer: Daniel Kilchmann / SFSO

Paper Summaries

WP.40 (Slovenia) – Editing of multiple source data in the case of Slovenian Agricultural Census 2010.

– Huge number of data tables and variables.

– Metadata driven applications for linkage and calculation of derived variables → independence from IT, decrease of IT workload, traceability and repeatability, documentation, but skilled subject staff needed, management and control of huge amount of metadata.

– Priority setting with overlapping sources, large differences.

– Combined data sources → decrease of reporting burden and higher data quality, but insignificant effect on costs and increase of E&I workload.
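The priority setting for overlapping sources could be sketched as follows. This is only an illustration, not the Slovenian implementation: the source names, the `resolve_by_priority` helper, and the 20% tolerance for flagging "large differences" are all assumptions, and values are assumed to be non-negative numbers.

```python
def resolve_by_priority(values_by_source, priority, tolerance=0.2):
    """Pick the value from the highest-priority source that reported one.

    Returns (value, source, needs_review). needs_review is True when the
    available sources differ by more than `tolerance` relative to the
    largest value (a hypothetical rule for "large differences").
    """
    available = [
        (source, values_by_source[source])
        for source in priority
        if values_by_source.get(source) is not None
    ]
    if not available:
        return None, None, False

    source, value = available[0]  # first hit in priority order wins
    numbers = [v for _, v in available]
    highest, lowest = max(numbers), min(numbers)
    needs_review = highest > 0 and (highest - lowest) / highest > tolerance
    return value, source, needs_review


# Hypothetical example: one variable reported by three overlapping sources.
reported = {"subsidy_admin": None, "register": 10.0, "questionnaire": 14.0}
order = ["subsidy_admin", "register", "questionnaire"]
value, source, needs_review = resolve_by_priority(reported, order)
```

Returning the chosen source alongside the value keeps the decision traceable, which matches the traceability and repeatability goals listed above.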

WP.41 (Austria) – The data imputation process of the Austrian register-based Census.

– Huge cost reduction!

– Census data base built up by unique key → unique and multiple sources, derived variables.

– Splitting into census subjects → dependencies between subjects.

– Deterministic imputation, derivation from different sources.

– Random distributional imputation inside imputation classes (decks).

– Huge number of quality indicators.
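Random imputation inside imputation classes can be illustrated with a minimal hot-deck sketch. The function name, the record layout, and the class variables are assumptions made for illustration and are not taken from the Austrian system.

```python
import random
from collections import defaultdict


def hot_deck_impute(records, target, class_vars, seed=0):
    """Fill missing `target` values with a randomly drawn donor value
    from the same imputation class (deck)."""
    rng = random.Random(seed)

    # Build one deck of observed values per imputation class.
    decks = defaultdict(list)
    for rec in records:
        if rec[target] is not None:
            decks[tuple(rec[v] for v in class_vars)].append(rec[target])

    result = []
    for rec in records:
        new = dict(rec)
        new["imputed"] = False
        if new[target] is None:
            deck = decks.get(tuple(rec[v] for v in class_vars))
            if deck:  # leave the value missing if the class has no donors
                new[target] = rng.choice(deck)
                new["imputed"] = True
        result.append(new)
    return result


# Hypothetical records: occupation is imputed within sex-by-age-group classes.
people = [
    {"sex": "f", "age_group": "30-39", "occupation": "teacher"},
    {"sex": "f", "age_group": "30-39", "occupation": "teacher"},
    {"sex": "f", "age_group": "30-39", "occupation": None},
    {"sex": "m", "age_group": "60-69", "occupation": None},
]
completed = hot_deck_impute(people, "occupation", ["sex", "age_group"])
```

Because donors are drawn at random from the observed values of the class, the imputed values tend to follow the observed distribution within each class. The `imputed` flag is one simple input for quality indicators.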

WP.42 and WP.43 (UK) – The Practical Implementation of the 2011 UK Census Imputation Methodology; Item Imputation of Census data in an automated production environment.

– Highly automated process.

– Shift from EDIS to CANCEIS → Modules, HH first, increased imputation quality.

– Deterministic imputation for relationship.

– Fall back imputation → tuning CANCEIS in automated production environment.

– Hard edit rules with variables in more than one module → e.g. missingness of addresses.

– Soft edit rules: increase of rare characteristics.

– Reordering of household members (available in newer version of CANCEIS)?

WP.44 (Abu Dhabi) – Edit and Imputation of the 2011 Abu Dhabi Census.

– Donor imputation: CANCEIS.

– Deterministic imputation: SAS → out of scope responses.

– Relax edit rules for large households: extended households, multiple wives, large expatriate population.

– Manual imputation → very large households.

– Predictive, Estimation and Distributional Accuracy for test data.

– Shift from manual to mainly donor imputation → decrease in workload, measurable changes, reproducible.
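The difference between predictive and distributional accuracy on test data can be shown with a small sketch. The two measures below (share of exact matches, and total variation distance between category frequencies) are illustrative assumptions; the paper's own accuracy measures may be defined differently.

```python
from collections import Counter


def predictive_accuracy(true_values, imputed_values):
    """Share of records whose imputed value equals the true value."""
    matches = sum(t == i for t, i in zip(true_values, imputed_values))
    return matches / len(true_values)


def distributional_distance(true_values, imputed_values):
    """Total variation distance between the two category distributions
    (0.0 means identical marginal distributions)."""
    n_true, n_imp = len(true_values), len(imputed_values)
    true_counts, imp_counts = Counter(true_values), Counter(imputed_values)
    categories = set(true_counts) | set(imp_counts)
    return 0.5 * sum(
        abs(true_counts[c] / n_true - imp_counts[c] / n_imp) for c in categories
    )


# Hypothetical test data: true values were blanked, then imputed.
truth = ["a", "b", "a", "b"]
imputed = ["b", "a", "a", "b"]
record_level = predictive_accuracy(truth, imputed)
distribution_level = distributional_distance(truth, imputed)
```

In this example half of the individual values are wrong, yet the marginal distribution is reproduced exactly, which is why the two kinds of accuracy are assessed separately.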

WP.45 (Mexico) – Editing Census Data: Mexico's experience.

– Traditional Census with 6 kinds of questionnaires.

– Vector's methodology: generate all possible combinations of values that variables involved in an edit rule can have → specific treatment.

– Editing Criteria Simulator to assess quality of edits.

– Urban Environment form: standardization of street names, presentation on map → inconsistencies.
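The idea of generating every combination of values for the variables in an edit rule can be sketched as below. The variables, their domains, and the edit rule are hypothetical examples, not taken from the Mexican census.

```python
from itertools import product


def enumerate_vectors(domains, edit_rule):
    """Yield every combination of values (a vector) for the variables in
    `domains`, together with whether it passes the edit rule."""
    names = list(domains)
    for values in product(*(domains[name] for name in names)):
        vector = dict(zip(names, values))
        yield vector, edit_rule(vector)


# Hypothetical edit rule: a child cannot be recorded as married.
domains = {
    "age_group": ["child", "adult"],
    "marital_status": ["single", "married"],
}


def child_not_married(vector):
    return not (
        vector["age_group"] == "child" and vector["marital_status"] == "married"
    )


vectors = list(enumerate_vectors(domains, child_not_married))
failing = [vector for vector, passes in vectors if not passes]
```

Each failing vector can then be assigned its own treatment. The number of vectors is the product of the domain sizes, so the enumeration stays manageable only for rules that involve a few categorical variables.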

General discussion

• Is it worth shifting from classical Census to register-based Census?

• Are registers used for Censuses stable – can we guarantee stable figures?

• How to define the priority ordering of several sources and how to measure the efficiency of this decision?

• Is it better to have several sources or just one per variable?

• Reliability of registers vs reliability of questionnaires?

• What about surveys other than Censuses: is a register-based strategy implemented or planned? Conclusions?

• Respondent burden vs. user burden (in case of flags indicating the source of data or other para-/metadata)?

General discussion

• Re-use of the process easier for classical Census or for register-based Census?

• Is a fully automated process realistic?

• Shift from good to better E&I tool: will we be able to solve all problems one day? Is that needed? Stability of the process?

• Rare sub-populations – 'important' for researchers and how about E&I?

• Always sequence in processing: can we solve problems due to that, e.g. how to solve the problem with edit rules involving variables from different modules?

• Are there new dimensions of editing planned, like some sort of spatial editing?
