Quality issues in the 2001 Census
Ludi Simpson, Cathie Marsh Centre for Census and Survey Research
Quality for which purposes?
Macro administration and planning: fine geography, timeliness, constant questions
Policy evaluation, research: cross-tabulated questions
Micro administration: identified individuals
Major changes for the 2001 census
Content and design of the questionnaire
  New questions: health, lowest floor level, caring for others, relationship between residents, religion
  Students at term-time address, no visitors
Fieldwork
  Pre-printed list of addresses
  Contracts for major operations
  Post-back
  Focus on hard to count areas
More major changes for the 2001 census
Processing and validation
  All forms scanned
  All responses fully coded and processed – no 10% tables
  Redesigned census coverage survey (CCS): impute the whole population before output, a One Number Census
Output
  Simpler cross-tabulations and key statistics UK wide
  Standard area statistics free on the internet
  Census Access Project: all standard area statistics; migration and commuting; vector boundaries. Not included: commissioned output; SARs.
Confidentiality - disclosure control
Plan
The quality of enumeration, and adjustments before publication
Differential undercount
Disclosure control: small cell adjustment
One Number Census
“The key to our whole strategy has been to try to minimise the amount of differential undercount”
ONC Steering Group, and ONS evidence to Treasury Committee
Courtesy of Mt Meagre cosmetic stones
Procedures affecting differential undercount
Community Liaison programme
Pre-listed addresses
Double EDs for enumerators in easier areas
Post-back
Checks on returned forms
Centralised form production
Local enumerator loyalty
Census non-response 1991 and 2001
[Figure: scatter plot of local authority districts. x-axis: 1991 % non-response (0 to 16); y-axis: 2001 % non-response, ONC2001, total persons (0 to 40). The plot is divided into "More in 2001" and "Less in 2001". Labelled districts include Lambeth, Haringey, Hackney, Camden and Brent.]
Person non-response
(1) Differential between LADs
                        1991        2001
Mean of LADs            2.7%        5.2%
Range                   0% - 14%    1% - 36%
Standard Deviation      1.9%        4.6%
Inter Quartile Range    1.1%        3.2%
SD/mean                 0.72        0.88
(2) Differential between types of people
Biases severe, but not as marked as in 1991:
  Young men or young people generally
  Private renters, unemployed, not-White
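The spread measures in table (1) can be reproduced from a list of district rates. The sketch below is illustrative only: the function name `spread_summary` and both lists of rates are hypothetical, not the 1991 or 2001 figures, and it assumes an unweighted mean and a population standard deviation, which the slide does not specify.

```python
from statistics import mean, pstdev, quantiles

def spread_summary(rates):
    """Summarise the between-district spread of non-response rates (in %).

    Assumption: unweighted districts and population standard deviation.
    """
    m = mean(rates)
    sd = pstdev(rates)
    q1, _, q3 = quantiles(rates, n=4)  # quartile cut points
    return {
        "mean": m,
        "range": (min(rates), max(rates)),
        "sd": sd,
        "iqr": q3 - q1,
        "sd_over_mean": sd / m,  # relative spread, comparable across censuses
    }

# Hypothetical district rates for two censuses (NOT the real data)
earlier = spread_summary([2.0, 3.0, 4.0, 5.0, 6.0])
later = spread_summary([1.0, 2.0, 4.0, 6.0, 17.0])

# A larger SD/mean means the districts differ more relative to the average
print(round(earlier["sd_over_mean"], 2), round(later["sd_over_mean"], 2))
```

SD/mean is the useful row when the mean itself changes between censuses, because it shows whether the differences between districts grew faster than the overall level.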
Item non-response: missing, invalid, inconsistent responses
Under 1%: age, sex, marital status
1-5%: 15 variables
5-10%: religion, provision of care, qualifications, employment status, supervisor status, industry, workplace address, hours worked, travel to work, number of rooms
Over 10%: professional qualifications 17%, company size 14%
LAD variation: Wokingham, Eastleigh, Hart and E Dorset best; Manchester, Blackburn and 5 London Boroughs worst.
“Biases … were in the same direction as those present in the 1991 Census, but were less marked.” (Edit-imputation evaluation report)
Conclusions on non-response
Post-back problems jeopardised quality
Neither levelling up of response rates nor levelling down
Lower response than 1991
Wider geographical differences than 1991
More of all types of people missed
Future preparation for next census
Measures to prevent disclosure have an impact on data quality
Measures to prevent disclosure:
  Thresholds for census areas: OAs 40 households and 100 residents
  Broad output categories
  No SARs sub-regional geography; no large households
  Imputed records not distinguishable
  Record swapping between areas
  July 2002: adjust small numbers to eliminate all 1s and 2s
    Table totals the sum of internal cells
    Different tables, different totals
Impact on small area indicators
Unemployment rate from ST028, Grassington
Key: All = all people; EconAct = economically active; Emp = employee; SEwE = self-employed (with employees); SEnE = self-employed (without employees); PT/FT = part-time/full-time; Unemp = economically active: unemployed
                  All  EconAct   Emp PT   Emp FT  SEwE PT  SEwE FT  SEnE PT  SEnE FT    Unemp
ALL PEOPLE      1,102      729      148      293       15       71       42      120       21
  16               13        3        0        0        0        0        0        0        0
  17               10        4        0        0        0        0        0        0        0
  18               12        9        0        0        0        0        0        0        3
  19               13       13        0        7        0        0        0        3        0
  20 to 24         42       39        8       25        0        0        0        0        3
  25 to 29         45       39        4       23        3        0        3        6        0
  30 to 34         63       58       13       26        0        5        6        5        3
  35 to 39        100       90       24       36        0       10        3       17        0
  40 to 44         93       81       14       35        3        3        9       17        0
  45 to 49        103       97       17       47        3       13        3        8        6
  50 to 54        136      115       23       50        0       17        3       19        3
  55 to 59        133       97       24       28        0       12        6       27        0
  60 to 64        111       40       14       10        0        5        4        7        0
  65 to 69        136       38        7        6        6        3        5        8        3
  70 to 74         92        6        0        0        0        3        0        3        0
Males             526      375       22      169        3       55       14       94        9
  16                8        3        0        0        0        0        0        0        0
  17                3        0        0        0        0        0        0        0        0
  18                9        6        0        0        0        0        0        0        3
Unemployment rate = (sum of 30 cells) divided by (sum of 390 cells) = (21 ± 4) divided by (729 ± 11)
Rate: 2.9%, likely to be in error by 0.6%
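One way to arrive at an error of about 0.6% on the 2.9% rate is to combine the relative errors of numerator and denominator in quadrature. This is a sketch under an assumption the slide does not state, namely that the two errors are treated as independent; the function name `ratio_with_error` is hypothetical.

```python
from math import sqrt

def ratio_with_error(num, num_err, den, den_err):
    """Return (rate, error) for num/den.

    Assumption: the errors on numerator and denominator are independent,
    so their relative errors are combined in quadrature.
    """
    rate = num / den
    rel_err = sqrt((num_err / num) ** 2 + (den_err / den) ** 2)
    return rate, rate * rel_err

# Figures from the slide: 21 ± 4 unemployed out of 729 ± 11 economically active
rate, err = ratio_with_error(21, 4, 729, 11)
print(f"{rate:.1%} ± {err:.1%}")  # 2.9% ± 0.6%
```

Almost all of the uncertainty comes from the small numerator: 4 on 21 is a relative error of about 19%, against 1.5% for the denominator.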
Impact of aggregating small area adjustment: 8,850 wards, 16 age groups (KS02)
[Figure: x-axis: difference between Ward value and sum of OA values (-30 to 30); y-axis: number of OA values of 0 or 3 (-20 to 120). Series: K0202D to K0209D and K0202LT4 to K0209LT4.]
Impact of small area adjustment: advice for users
Substitute 1.5 for 0s and 3s? No!
Average error = sqrt(no. of 0s and 3s) × 0.8
  5% of the time the error is more than twice this.
Beware percentages based on small rounded totals
Statistical analyses are affected: measurement error
Aggregate a minimal number of areas or cells
  Key statistics better than Standard Tables
  Use Univariate tables for denominators
  1 ward is better than the sum of its OAs
  1 ward minus 2 OAs is better than 18 OAs
Impact of adjustment is only worse than in 1991 if 0s and 3s are one third of summed values
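The rule of thumb above is easy to apply when deciding how to build a total. The sketch below is a minimal illustration: the function name `average_error` is hypothetical, and the ward comparison assumes that every figure involved is an adjusted 0 or 3, which will not always hold.

```python
from math import sqrt

def average_error(n_adjusted_cells):
    """Rule of thumb from the slide: 0.8 * sqrt(number of 0s and 3s summed)."""
    return 0.8 * sqrt(n_adjusted_cells)

# A total built from 30 adjusted cells
typical = average_error(30)
print(round(typical, 1))      # average error
print(round(2 * typical, 1))  # exceeded about 5% of the time, per the slide

# "1 ward minus 2 OAs is better than 18 OAs":
# three adjusted figures are combined instead of eighteen (assumption: all are 0s or 3s)
ward_minus_two_oas = average_error(3)
sum_of_eighteen_oas = average_error(18)
print(round(ward_minus_two_oas, 1), round(sum_of_eighteen_oas, 1))
```

Because the error grows with the square root of the number of adjusted figures, the route to a total that touches the fewest cells is the more reliable one.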
Variability of Tabular Counts by Processes
[Figure: contribution of each process to the variability of tabular counts, for four variables. Processes: Data Capture; Edit and Imputation; One Number Census; Disclosure Control (Small Cell Adjustment).]
Overall standard error: AGE 1.08%; SEX 1.07%; MARITAL STATUS 1.23%; ETHNIC 1.35%
Component values shown: 0.91% appears for every variable; the other values (0.0002%, 0.03%, 0.001%, 0.02%, 0.01%, 0.41%, 0.16%, 0.14%) cannot be matched to a process from the extracted figure.
Note that there is no small cell adjustment variability for sex and marital status as there were no small counts for these variables
Measuring census error
What is the probable distance of the 2001 ONC from the truth?
For future Census planners: for which populations is the ONC expected to have greater error than the census enumeration and other alternatives?
Population size; variable; undercount
How accurate is the census population?
Root mean square error, %
               Enumeration   Census + absent households   One Number Census
E&W            6.2%                                        0.1%
LAD            10.4%                                       0.74%
Ward
Output Area
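The slide does not say how the root mean square error was calculated. A common definition, assumed here, is the square root of the mean squared percentage error across areas, measured against a benchmark taken as the truth. The function name `rmse_percent` and all the numbers are hypothetical.

```python
from math import sqrt

def rmse_percent(estimates, benchmarks):
    """Root mean square of percentage errors across areas.

    Assumption: benchmarks stand in for the true populations, which in
    practice are unknown and must themselves be estimated.
    """
    pct_errors = [100.0 * (e - b) / b for e, b in zip(estimates, benchmarks)]
    return sqrt(sum(p * p for p in pct_errors) / len(pct_errors))

# Three hypothetical areas, each under-enumerated by 5% or 10%
benchmarks = [1000, 2000, 500]
enumerated = [950, 1900, 450]
result = rmse_percent(enumerated, benchmarks)
print(round(result, 2))
```

Squaring before averaging means a few badly counted areas dominate the figure, which is one reason the error is larger for LADs than for England and Wales as a whole.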
Statistics of a complete population versus speedy delivery of results
Compromise: a firm output prospectus
Achievement of data release
  Standard area statistics: Feb 03 – Sept 03 (3-4 months delay)
  Origin-Destination statistics: Jan 04?
  Documentation, software
  Public availability of key statistics via NeSS/SCROL…
Census enumeration and the population, Great Britain
[Figure: y-axis: millions (51 to 59); x-axis: 1981, 1991, 2001. Series: Enumerated; Estimated by CVS/CCS; Rolled on from 1981.]
2001 ONC. Additions to enumerated 48,843,000
Addition                                              Number      Evidence                     Judgements
Dual system estimate of undercount                    3,199,000   Census Coverage Survey       Medium
Revised household estimate                            230,000     Addresspoint with LFS        Medium
Census day to June 30                                 43,000      Births, deaths, migration    Light
Revised persons estimate                              193,000     Longitudinal Study           Medium
Further revised persons estimate                      ?           Address matching Mcr, Westr  Medium
Unmonitored int’l migration 81-91                     351,000     2001 census                  Heavy
Unmonitored int’l migration 91-01                     c111,000    Improved 91-98               Light
                                                      c85,000     Visitor switchers            Medium
                                                      c108,000    Migrant switchers            Heavy
Unexplained difference with rolled forward MYE 2001   291,000     2001 census
[Side labels on the slide, partly lost in extraction: "October 200", "203-04"]
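The first addition in the table comes from a dual system estimate. As a rough illustration only, the basic capture-recapture (Lincoln-Petersen) form of that estimator is sketched below; the ONC procedure involved further steps that are not reproduced here. The function names and the sample-area counts are hypothetical; the two national figures are taken from the table.

```python
def dual_system_estimate(census_count, survey_count, matched):
    """Basic Lincoln-Petersen estimate of the true population of an area.

    census_count: people counted by the census
    survey_count: people counted by an independent coverage survey
    matched:      people found in both
    Assumption: the two counts are independent and matching is error-free.
    """
    if matched <= 0:
        raise ValueError("need at least one matched person")
    return census_count * survey_count / matched

def undercount_rate(enumerated, estimated_missed):
    """Share of the estimated population that the enumeration missed."""
    return estimated_missed / (enumerated + estimated_missed)

# Hypothetical sample area: 900 in census, 850 in survey, 800 in both
area_estimate = dual_system_estimate(900, 850, 800)
print(area_estimate)  # 956.25

# Figures from the table: 48,843,000 enumerated, 3,199,000 added
national_rate = undercount_rate(48_843_000, 3_199_000)
print(f"{national_rate:.1%}")  # 6.1%
```

The national rate of about 6% calculated from the table's own figures is consistent with the England and Wales undercount quoted elsewhere in the presentation.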
Sub-national
Best ever sub-national estimation procedures for census year
Methods agreed by users before census
Concerns:
  Insufficient information about quality assurance
  Administrative comparisons not acted upon
  Unconvincing geography of undercount
QA of local population: comparators
  Students, armed forces, prisoners
  Local pre-census population estimates and administrative records
  Child benefit, pensioner, births, school census, adjusted GP patients
  “calculate a range of plausible values for the number of people of each sex within five-year age groups in each geographical area” (ONC Guide)
  Diagnostic range up to double that of the comparators
Composition of undercount, by type of District
Columns: (a) undercount rate, all persons; (b) ratio of undercount rates, Male 20-34 / Female 20-34; (c) ratio of undercount rates, All 20-34 / All other; (d) change in population sex ratio 20-34 M/F, ONC2001-MYE2000
Type of District                 (a)    (b)    (c)      (d)
Inner London                     22%   1.11   1.41   -0.126
Outer London                     10%   1.15   1.95   -0.146
Principal metropolitan cities     9%   1.30   1.64   -0.131
Large cities                      7%   1.30   1.90   -0.146
Small cities                      6%   1.36   1.91   -0.125
Resort, port and retirement       5%   1.25   2.21   -0.073
Other metropolitan Districts      5%   1.25   1.75   -0.091
New towns                         5%   1.35   1.76   -0.063
Industrial areas                  4%   1.30   1.85   -0.074
Urban and mixed urban-rural       4%   1.35   2.30   -0.032
Remoter, mainly rural             4%   1.42   2.13   -0.028
England and Wales                 6%   1.22   2.06   -0.086
Where did population estimates fall?
A few Districts in each region account for most of change
Districts with transient populations – students, armed forces, seasonal labour, immigration
ONS now focusing on outliers
Manchester and Westminster address-matching
Were address lists complete?
Conclusions
  Wide-spread undercount – new types of people
  Impact on output reduced by ONC imputation?
  Uncertain population total
    Young men, children
  Multi-source error for small areas: advice for users
Future priorities
  Fieldwork: improved management
  Output: firm timetable, pre-release documentation
  Information: users are the Census’ best friends
  Third way: validated administrative records
  International migration: beyond interim revisions
  Residence definitions: avoid ‘legal population’
  Why were people missed?