Oslo, 26.9.2012, UNECE Work session on SDE, Topic (vii): Editing and Imputation of Census data

Topic (vii): Editing and Imputation of Census data

Discussion

Session organizer: Daniel Kilchmann / SFSO

Paper Summaries

WP.40 (Slovenia) – Editing of multiple source data in the case of Slovenian Agricultural Census 2010.

– Huge number of data tables and variables.

– Metadata-driven applications for linkage and calculation of derived variables → independence from IT, reduced IT workload, traceability and repeatability, documentation; but skilled subject-matter staff are needed, and a huge amount of metadata has to be managed and controlled.

– Priority setting with overlapping sources, large differences.

– Combined data sources → decrease of reporting burden and higher data quality, but insignificant effect on costs and increase of E&I workload.
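The priority setting for overlapping sources can be illustrated with a minimal sketch. It is not the Slovenian application: the source names, the priority order and the function names are hypothetical, and the rule shown (take the first non-missing value in priority order, and report how far the sources disagree) is only one plausible way to implement such a decision.

```python
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical priority order: earlier sources win when several report a value.
SOURCE_PRIORITY: List[str] = ["admin_register", "subsidy_database", "questionnaire"]


def resolve(values_by_source: Dict[str, Optional[Any]],
            priority: List[str] = SOURCE_PRIORITY) -> Tuple[Optional[Any], Optional[str]]:
    """Return (value, source) from the highest-priority source that reports a value."""
    for source in priority:
        value = values_by_source.get(source)
        if value is not None:
            return value, source
    return None, None


def relative_gap(values_by_source: Dict[str, Optional[float]]) -> float:
    """Largest relative difference among reported non-negative values.

    Useful for flagging units where overlapping sources differ widely.
    """
    reported = [v for v in values_by_source.values() if v is not None]
    if len(reported) < 2 or max(reported) == 0:
        return 0.0
    return (max(reported) - min(reported)) / max(reported)


# One farm, one variable (e.g. utilised area), reported by two sources.
farm_area = {"questionnaire": 12.0, "subsidy_database": 15.0, "admin_register": None}
value, source = resolve(farm_area)
gap = relative_gap(farm_area)
```

Keeping the chosen source next to the value gives the traceability mentioned above, and the gap can drive a review threshold for large differences.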

WP.41 (Austria) – The data imputation process of the Austrian register-based Census.

– Huge cost reduction!

– Census database built up by unique key → unique and multiple sources, derived variables.

– Splitting into census subjects → dependencies between subjects.

– Deterministic imputation, derivation from different sources.

– Random distributional imputation inside imputation classes (decks).

– Huge number of quality indicators.
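Random imputation inside imputation classes can be sketched as a simple random hot deck. This is an illustration under stated assumptions, not the Austrian implementation: the record layout, the class variables and the function name `hot_deck_impute` are hypothetical. Missing values are filled by drawing at random from the observed values of the same class, so the imputed values follow the class distribution in expectation.

```python
import random
from collections import defaultdict


def hot_deck_impute(records, class_keys, target, seed=0):
    """Fill missing `target` values by a random draw from the same imputation class.

    records    : list of dicts, with None marking a missing value
    class_keys : variables that define the imputation class (the "deck")
    Returns new records carrying an `imputed` flag; the input is not modified.
    """
    rng = random.Random(seed)  # fixed seed keeps the run reproducible

    # Build one deck of observed values per imputation class.
    decks = defaultdict(list)
    for rec in records:
        if rec.get(target) is not None:
            decks[tuple(rec[k] for k in class_keys)].append(rec[target])

    result = []
    for rec in records:
        new = dict(rec)
        new["imputed"] = False
        if new.get(target) is None:
            deck = decks.get(tuple(rec[k] for k in class_keys))
            if deck:  # classes without donors are left missing
                new[target] = rng.choice(deck)
                new["imputed"] = True
        result.append(new)
    return result


people = [
    {"sex": "F", "age": "30-39", "status": "employed"},
    {"sex": "F", "age": "30-39", "status": None},
    {"sex": "M", "age": "60+", "status": None},
]
completed = hot_deck_impute(people, ["sex", "age"], "status")
```

The `imputed` flag is the kind of information a quality indicator could be computed from, for example the imputation rate per variable.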

WP.42 and WP.43 (UK) – The Practical Implementation of the 2011 UK Census Imputation Methodology; Item Imputation of Census data in an automated production environment.

– Highly automated process.

– Shift from EDIS to CANCEIS → modules, HH first, increased imputation quality.

– Deterministic imputation for relationship.

– Fall-back imputation → tuning CANCEIS in automated production environment.

– Hard edit rules with variables in more than one module → e.g. missingness of addresses.

– Soft edit rules: increase of rare characteristics.

– Reordering of household members (available in newer version of CANCEIS)?

WP.44 (Abu Dhabi) – Edit and Imputation of the 2011 Abu Dhabi Census.

– Donor imputation: CANCEIS.

– Deterministic imputation: SAS → out-of-scope responses.

– Relax edit rules for large households: extended households, multiple wives, large expatriate population.

– Manual imputation → very large households.

– Predictive, Estimation and Distributional Accuracy for test data.

– Shift from manual to mainly donor imputation → decrease in workload, measurable changes, reproducible.
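The general idea of donor imputation can be sketched as a nearest-neighbour search. This is a generic illustration and not the CANCEIS algorithm or its interface: the matching variables, the mismatch-count distance and the function names are assumptions made for the example.

```python
def distance(recipient, donor, match_vars):
    """Number of matching variables on which recipient and donor disagree."""
    return sum(1 for v in match_vars if recipient[v] != donor[v])


def nearest_donor_impute(recipient, donors, match_vars, target):
    """Copy `target` from the closest donor; ties go to the first donor listed.

    Donors are assumed to be complete records that already pass all edit rules.
    Returns a new record; the recipient is left unchanged if no donor exists.
    """
    out = dict(recipient)
    if donors:
        best = min(donors, key=lambda d: distance(recipient, d, match_vars))
        out[target] = best[target]
    return out


donors = [
    {"age_group": "18+", "sex": "M", "nationality": "expat", "occupation": "engineer"},
    {"age_group": "18+", "sex": "F", "nationality": "national", "occupation": "teacher"},
]
person = {"age_group": "18+", "sex": "F", "nationality": "national", "occupation": None}
filled = nearest_donor_impute(person, donors, ["age_group", "sex", "nationality"], "occupation")
```

Because the donor choice is a deterministic function of the data, rerunning the process gives the same result, which is the reproducibility gain noted in the last bullet.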

WP.45 (Mexico) – Editing Census Data: Mexico's experience.

– Traditional Census with 6 kinds of questionnaires.

– Vector's methodology: generate all possible combinations of values that variables involved in an edit rule can have → specific treatment.

– Editing Criteria Simulator to assess quality of edits.

– Urban Environment form: standardization of street names, presentation on map → inconsistencies.
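The enumeration step of the vector approach, generating every combination of values for the variables in an edit rule, can be sketched as follows. The value domains and the edit rule are hypothetical and chosen only for illustration; the slides do not describe how the Mexican system assigns a treatment to each combination.

```python
from itertools import product

# Hypothetical value domains for the variables involved in one edit rule.
DOMAINS = {
    "age_group": ["0-11", "12-17", "18+"],
    "marital_status": ["single", "married", "widowed"],
}


def violates_edit(combo):
    """Hypothetical edit rule: children under 12 must be recorded as single."""
    return combo["age_group"] == "0-11" and combo["marital_status"] != "single"


def enumerate_vectors(domains):
    """Return every combination of values as a dict, one per vector."""
    names = list(domains)
    return [dict(zip(names, values))
            for values in product(*(domains[n] for n in names))]


vectors = enumerate_vectors(DOMAINS)                 # 3 x 3 = 9 combinations
failing = [v for v in vectors if violates_edit(v)]   # combinations needing treatment
```

Each failing vector can then be mapped to its own treatment, and the full list shows at a glance how many combinations an edit rule rejects.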

General discussion

• Is it worth shifting from classical Census to register-based Census?

• Are registers used for Censuses stable – can we guarantee stable figures?

• How to define the priority ordering of several sources and how to measure the efficiency of this decision?

• Is it better to have several sources or just one per variable?

• Reliability of registers vs. reliability of questionnaires?

• What about surveys other than the Census: is a register-based strategy implemented or planned? Conclusions?

• Respondent burden vs. user burden (in case of flags indicating the source of data or other para-/metadata)?

General discussion

• Is re-use of the process easier for a classical Census or for a register-based Census?

• Is a fully automated process realistic?

• Shift from good to better E&I tool: will we be able to solve all problems one day? Is that needed? Stability of the process?

• Rare sub-populations – 'important' for researchers, but how about E&I?

• Processing is always sequential: can we solve the problems this causes, e.g. edit rules involving variables from different modules?

• Are there new dimensions of editing planned, like some sort of spatial editing?
