2011 Census Dissemination Workshops
London 16th May & Manchester 17th May
Agenda
• Welcome and Introductions
• 2011 Census Progress Update
• 2011 Census Outputs
• The ONS Web Strategy
• Census Data Feeds: Development & Demonstrations
• LUNCH
• Alternatives to Data Feeds – Bulk Delivery
• Metadata
• Next Steps
Objectives of the 2011 Census
• To provide accurate census population estimates
• National response rate of at least 94%
• All LAs have a response rate of at least 80%
• To provide accurate population characteristics
Changes from 2001 (E&W)
• Address checking before Census Day
• Post-out and post-back of questionnaires
• On-line completion
• Questionnaire tracking
• Intensive, targeted and flexible follow up of non response
All systems tested during
Rehearsal
Key operational dates
• 4 March Census helpline/online help centre went live
• 7 March Postal delivery started
• 7 March Special enumerators started
• 27 March Census Day
• 6 April Census follow up started
• 16 April Special enumeration collection finished
• 18 April Non compliance coordinators started
• 6 May Follow up finished
• 9 May Census Coverage Survey (CCS) started
• 2 June CCS finishes
• June 2012 Data processing complete
• July 2012 First census outputs
HTC categories
• HTC-1 (40% - Easiest)
• HTC-2 (40%)
• HTC-3 (10%)
• HTC-4 (8%)
• HTC-5 (2% - Hardest)
• Large data capture & processing facility
• Receipting
• Scanning
• Character recognition
• Coding
• Processing
• Archiving
Census processing
[Diagram of census operations: Address Check, Post Out, Postal Collection, Internet, Follow up, CEs, CCS, Processing]
How will the data be handled?
• Security and confidentiality are top priorities for Census
• Confidentiality protected by law
• Strict physical and IT security
• Independent security reviews
• Under lock and key for 100 years
The Census Coverage Survey
• Similar to 2001
• Large sample survey
• 6 weeks after census day
• Short paper based interview
• Independent of census
Coverage Adjustment
• Compare census returns with census coverage survey results
• Adjust for households and persons estimated to have been missed
• Uses the census coverage survey to characterise the households and persons missed
• Final results include these adjustments
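The comparison of census returns with the coverage survey can be illustrated with a simple dual-system (capture-recapture) estimate. This is only a minimal sketch: the slides do not state which estimator is applied, so the formula, the function names and the counts below are assumptions for illustration, not the ONS method.

```python
def dual_system_estimate(census_count, ccs_count, matched_count):
    """Estimate the population of a sampled area from two independent counts.

    census_count  - people counted by the census in the area
    ccs_count     - people counted by the coverage survey in the same area
    matched_count - people found in both counts
    """
    if matched_count <= 0:
        raise ValueError("need at least one matched record")
    return census_count * ccs_count / matched_count


def estimated_missed(census_count, ccs_count, matched_count):
    """Estimated number of people the census missed in the area."""
    total = dual_system_estimate(census_count, ccs_count, matched_count)
    return total - census_count


# Hypothetical counts for one area (not real census data).
total = dual_system_estimate(900, 880, 800)
missed = estimated_missed(900, 880, 800)
```

With these made-up counts the census figure of 900 would be adjusted upwards by the estimated number of people missed.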
Geography: 2001 Output Areas
• Census output geography separated from data collection geography
• A geography created from Census data
• Consistent size in population/no of households
• AND
• socially homogeneous
• meets confidentiality thresholds
• aligns with 2003 administrative boundaries
• Consistent throughout UK
2001 Output Areas
• 175,434 Output Areas
• Mean 297 persons; 123 households
• Freely available digital boundary data
• Building blocks for “neighbourhood” geographies: Super Output Areas (LSOAs, MSOAs)
Image courtesy of David Martin. This work is based on data provided through EDINA UKBORDERS with the support of the ESRC and JISC and uses boundary material which is copyright of the Crown.
Super Output Areas (SOAs)
• Created 2004, for Neighbourhood Statistics
• Groupings of Output Areas
• Layered hierarchy – lower & middle layers
• Each layer has size thresholds and targets, offering different levels of statistical reporting
• Lower layer: approx 35,000 LSOAs, average population approx 1,500
• Middle layer: approx 7,000 MSOAs, average population approx 7,200
• Upper SOAs not created
2011 Census Output Geography Policy
• Maintain approx 95%+ of the OA/SOA hierarchy
• OAs/SOAs will be redesigned only where:
- they have undergone significant population change since 2001
- they have been split by local authority boundary change since 2003
- they have been independently assessed as lacking social homogeneity.
2011 Census Output Geography Policy
• Splits and mergers of current hierarchy
• Supports comparability between 2001 and 2011, and other national statistics
• Where OAs/SOAs are redesigned they will:
- not align to ward and parish boundaries that have changed since 2003
- not align to real-world features
- not contain only a single large communal establishment
- not contain less than 100 persons and 40 households
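The minimum size rule for a redesigned Output Area can be written as a simple check. This is a minimal sketch: the two thresholds come from the policy above, while the function name and the example areas are hypothetical.

```python
MIN_PERSONS = 100     # minimum persons, from the policy above
MIN_HOUSEHOLDS = 40   # minimum households, from the policy above


def meets_oa_thresholds(persons, households):
    """True if an area has at least 100 persons and at least 40 households."""
    return persons >= MIN_PERSONS and households >= MIN_HOUSEHOLDS


# Hypothetical candidate areas: name -> (persons, households).
candidates = {"area_a": (297, 123), "area_b": (95, 50), "area_c": (140, 35)}
too_small = sorted(
    name for name, (p, h) in candidates.items() if not meets_oa_thresholds(p, h)
)
```

Areas that fail the check would be candidates for merging with a neighbour.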
Changes since 2001 - population
• Population growth, especially migration
• More and smaller households
• Newly built properties
• Sub-division of existing properties
• Changing socio-economic characteristics of areas
NEW: Workplace Zones
• OAs are based on where people live not work
• OAs can be unsuitable for workplace statistics
• Some OAs contain no or few businesses; some contain many businesses or a large employer, e.g. business parks, City of London
• Workplace Zones project looking at splitting/merging OAs for a new geography nesting with OAs
Disclosure Control
• Population threshold is equal to that of Output Areas – minimum of 100 workers and maximum of 625
• No household threshold as individual households cannot be identified within Census data
• The Statistics and Registration Service Act 2007 prevents disclosure of any individual workplace
• Combining at least 3 postcodes containing no less than 100 workers should prevent disclosure
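The thresholds above can be combined into one check for a proposed Workplace Zone. This is a sketch under one stated assumption: the 100 to 625 worker range is applied to the combined total across the zone's postcodes. The function name and the postcode labels are hypothetical.

```python
def is_valid_workplace_zone(postcode_workers,
                            min_workers=100,
                            max_workers=625,
                            min_postcodes=3):
    """Check a proposed zone against the disclosure control thresholds.

    postcode_workers maps each postcode in the zone to its worker count.
    Assumption: the worker limits apply to the zone total.
    """
    total_workers = sum(postcode_workers.values())
    enough_postcodes = len(postcode_workers) >= min_postcodes
    return enough_postcodes and min_workers <= total_workers <= max_workers


# Hypothetical zones; the postcode labels are placeholders.
zone_ok = {"PC1": 60, "PC2": 30, "PC3": 25}      # 3 postcodes, 115 workers
zone_too_few_postcodes = {"PC1": 80, "PC2": 70}  # only 2 postcodes
zone_too_large = {"PC1": 300, "PC2": 200, "PC3": 200}  # 700 workers
```

A zone failing the check would need to be merged with neighbours or split further before publication.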
Workplace Zones
• Significant demand for equivalent output in 2011 for 340 out of 400 Census 2001 tables
• Where possible the table specifications and layouts used in 2001 will be reproduced for the 2011 Census. These were included in the second round consultation
Outputs Content Consultation
1st Round
• Harmonised UK approach
• No post tabular disclosure control
• UALA and above:
  • Greater detail
  • Increased flexibility
• Below UALA:
  • Any univariate table at standard geography
• In general, cross classifications below UALA contain the same level of detail as 2001
Output Design:
Disclosure control
The Dataset Spectrum: Standalone, Univariate, Concatenated Limited Relationships, Small Relational, Large Relational
Standalone
· Tables of indicators
· Concatenated tables
· Include multiple variables, and often multiple dimension items, but all data is in ‘univariate’ form
· Multiple stats and measurement units
Univariate
· Single variables
· Most detailed breakdown of classification
· No concatenation
· Single table population, stats unit and measurement unit
KEY STATISTICS & UNIVARIATE
[Table sketches: a univariate table with one row per geography and Age columns Aged 1, Aged 2, Aged 3 ... Aged n; a key statistics table with one row per geography and columns Males Aged 0-4, % of HHs with a car, % of HHs without a car, Area size in hectares]
Cross Tabulations
· Multi-variate
· Single Statistical and Measurement Unit
· Single ‘Table Population’
· No concatenation
· Hard to present as cross-tab by geography on-line
· Some but limited customisation opportunity
[Table sketch for Geog 1: rows Age 1, Age 2, Age 3 ... Age n for Males and for Females; columns Hat / No Hat within Good Health and Bad Health]
Concatenated limited relationships
· Grouping of simple cross-tabulations for a particular population/theme
· Concatenated simple cross-tabulations
· In this case ‘age’ not cross-classifiable by ‘health’
[Table sketch for Geog 1: Method of Travel to work (Car, Bus, Train, Bicycle) shown against Males and Females, Age 1, Age 2, Age 3 ... Age n, and Good Health / Bad Health]
Multivariate
• Just completed; see:
  • User guide/policy documentation
  • Geography and UK harmonisation indicator
  • Set of indicative tables
Second Round Consultation
ONS Dissemination Strategy
Options for supply of Census data
• Direction of government policy
• ONS Web Strategy
• Options of supply of data
• Aspirations for ‘re-use community’
• Call to action
Direction of government policy
• Transparency Board and Public Data principles
• Open Government Licence
• 5 star data ratings
• Public Data Corporation
• FOI changes
Public Data Principles
• Public data policy and practice will be clearly driven by the public and businesses who want and use the data, including what data is released when and in what form
• Public data will be published in reusable, machine-readable form
• Public data will be released under the same open licence which enables free reuse, including commercial reuse
• Public data will be available and easy to find through a single easy to use online access point (data.gov.uk)
• Public data will be published using open standards
• Public data underlying the Government’s own websites will be published in reusable form for others to use
• Public data will be timely and fine grained
• Release data quickly, and then re-publish it in linked data form
• Public data will be freely available to use in any lawful way
• Public bodies should actively encourage the re-use of their public data
• Public bodies should maintain and publish inventories of their data holdings
Open Government Licence
You are free to:
• copy, publish, distribute and transmit the Information;
• adapt the Information;
• exploit the Information commercially for example, by combining it with other Information, or by including it in your own product or application.
You must, where you do any of the above:
• acknowledge the source of the Information
• ensure that you do not use the Information in a way that suggests any official status
• ensure that you do not mislead others or misrepresent the Information or its source;
• ensure that your use of the Information does not breach the Data Protection Act 1998 or the Privacy and Electronic Communications (EC Directive) Regulations 2003.
http://www.nationalarchives.gov.uk/doc/open-government-licence
5 star data rating
★ Available on the web (whatever format), but with an open licence
★★ Available as machine-readable structured data (e.g. Excel instead of image scan of a table)
★★★ All the above, plus: non-proprietary format (e.g. CSV instead of Excel)
★★★★ All the above, plus: use open standards from W3C to identify things, so that people can point at your stuff
★★★★★ All the above, plus: link your data to other people’s data to provide context
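Because each star requires all the ones before it, the rating can be computed by counting criteria in order until one fails. This is a minimal sketch; the function and parameter names are hypothetical and only paraphrase the five levels above.

```python
def star_rating(on_web_with_open_licence,
                machine_readable,
                non_proprietary_format,
                uses_w3c_open_standards,
                linked_to_other_data):
    """Return 0-5 stars; a level only counts if every lower level is met."""
    criteria = [
        on_web_with_open_licence,
        machine_readable,
        non_proprietary_format,
        uses_w3c_open_standards,
        linked_to_other_data,
    ]
    stars = 0
    for met in criteria:
        if not met:
            break  # higher levels build on this one, so stop counting
        stars += 1
    return stars


# A CSV file published on the web under an open licence.
csv_rating = star_rating(True, True, True, False, False)
# A scanned image of a table under an open licence.
scan_rating = star_rating(True, False, False, False, False)
```

Under this reading, an openly licensed CSV file scores three stars and a scanned table scores one.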
Other government policy
• Public Data Corporation
http://www.cabinetoffice.gov.uk/news/public-data-corporation-free-public-data-and-drive-innovation
• FOI changes
1. Enabling re-use of our content by others
2. A single ONS website
3. Integration of back office systems
4. To fit within a wider official statistics strategy
5. User focus
6. New ways of presenting and communicating our statistics
7. Automating data take-on
8. Open data standards and metadata
9. Focusing on core requirements and using partnerships
10. Keep it simple
ONS Web Strategy
[Diagram: ONS Web Strategy architecture. Components shown: Data Suppliers (e.g. DWP) and other data sources; ONS API (write); ONS Content Repository; ONS API (read); ONS website; bulk supply; Bulk Users (e.g. Eurostat, Bank, HMT); Partners (e.g. CASWEB, SASPAC, DirectGov); external systems, including mash-ups (e.g. Local Authority); specific audience groups; community forum. User groups shown: users of ONS website, users in this organisation, users of ONS data on other systems (examples given: students, local authorities, citizen).]
Options for supply
• ONS Website
  • Use data on-line
  • Download to own system
• ONS API
• Intermediary
• Bulk Data Supply
Aspirations for re-use community
• Open
• Sharing
• Encouraging
• Enabling
[Don’t have to behave in this way to re-use our data, but we are keen to encourage this type of use]
• Ordnance Survey example
OS Openspace model
A community – sharing ideas and work (aspiration)
• Forum
• Galleries
• Blogs
• Example code / interactions
• Enablement tools
‘Call to action’
• If you think you may be interested in accessing ONS data through the API service – we want to hear from you
• We are keen to get a better understanding of how API potentially will be used, in order to deliver the appropriate solution
Census Data Feeds Update
Data Feeds Update
• ESRC funded group started in 2007
• ONS established Census Web Services Working Group in 2008
• Membership
  • ONS/GROS/NISRA/WAG
  • ESRC/JISC
  • SASPAC/Local Government
  • NOMIS
  • Home Office/Ordnance Survey
  • BBC
Data Feeds: Work To Date
• Working in partnership with Manchester University to create large and complex test datasets; using 2001 data to replicate 2011.
• Launched an “Alpha” API (clone of the API behind the new website) in December 2010.
• SASPAC, Manchester University, and NOMIS rapidly built applications to access the test data via the API
Work To Do
• More datasets from Manchester University; based on the indicative table layouts.
• Working towards a “Beta” API in early 2012; with more data, metadata, and partners.
• Working with the ONS Web Strategy Implementation team to develop a fully operating Data Feeds Service prior to release of 2011 Census data in September 2012.
Data Feeds: Application Development
• Three Applications developed December 2010
• NOMIS: Data Picker and Report
• SASPAC: Integrating data into desktop software
• CDU: Visualisation application
Live Links
• NOMIS live: http://nmtest.dur.ac.uk/Default.aspx and http://nmtest.dur.ac.uk/scenario3.aspx
• NOMIS video https://www.nomisweb.co.uk/Projects/ONS/WDP/FR3/demo.htm
• CDU video
http://cdu.mimas.ac.uk/projects/3dapp/demo.htm
• SASPAC video:
http://www.saspac.org/dev/SASPAC-API-demo-Jan2011.swf
Data Feeds Demonstrations
2011 Census Dissemination Workshop
• LUNCH
This Afternoon
• Feedback on Data Feeds
• Alternatives to Data Feeds
• Metadata
• Group exercises and discussion
• Over to you……………………..
Discussion
• Feedback: Data Feeds
Bulk Supply: Some Changes Since 2001 (Slide 1 of 400……………….)
• 2001 Census: Bulk Output
  • SuperTABLE
  • E&W data reformatted from SuperSTAR
    – Provided in CSV format
    – Funded by Census Access
• 2011 Census: Bulk Output
  • Data structured to enable transfer to ONS website
  • Format not decided, probably
    – CSV for data
    – SDMX for metadata
  • No funding for reformatting
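If the data does arrive as CSV, users would need to load it into their own systems themselves. The sketch below shows one way to do that with the Python standard library. The format has not been decided, so the column names, codes and values here are entirely hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical bulk extract: one row per geography and table cell.
SAMPLE = """geography_code,table_id,cell_id,value
AREA_A,T01,1,120
AREA_A,T01,2,135
AREA_B,T01,1,98
AREA_B,T01,2,101
"""


def load_bulk_table(text):
    """Parse CSV text into {geography_code: {cell_id: value}}."""
    table = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(text)):
        table[row["geography_code"]][row["cell_id"]] = int(row["value"])
    return dict(table)


data = load_bulk_table(SAMPLE)
area_a_total = sum(data["AREA_A"].values())
```

Labels for each geography and cell would come from the accompanying metadata rather than from the data file itself.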
Bulk Data in Standard Format
Census Outputs Dataset Production
Link Datasets to Supporting Information and Disseminate using
ONS API and Website
Customer Services
Provide Bulk Outputs direct to Customer
Standard Format Outputs
Structural Description of Each Dataset
Content of Each Dataset (Cell/Observation Values)
“Structural Metadata”
Supporting Information = “Reference Metadata”
Discussion
• Alternatives to Data Feeds
Metadata
• Structural:
  • Needed to make datasets
• Reference/supporting information:
  • Needed to understand the data in the datasets
Metadata
About the Census
• Legislation and Policy documentation
• Programme/Project documentation
• Operational and Statistical Processing documentation
• Documentation of Statistical Methodology
• Quality Management documentation
• Timeliness and Punctuality
• Relevance
About the Data
• Product and Service documentation
• Statistical Population
• Accuracy and Reliability
• Comparability
• Confidentiality
• Coherence
• Comments
Index of Variables
• Definition
• Associated Variables
• Source Questions
Glossary
• Definition
Mandatory Optional
Structural Metadata Reference Metadata
Dataset Details
• Dimension
• Dimension Item
• Target Classification
• Target Classification Item
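One way to picture the structural side of "Dataset Details" is as a dataset described by its dimensions and their items. This is a minimal sketch only: the class and field names are hypothetical, target classifications are left out, and the example items are borrowed from the table sketches earlier in the slides.

```python
from dataclasses import dataclass, field
from math import prod


@dataclass
class Dimension:
    """A dimension of a dataset and the items it can take."""
    name: str
    items: list


@dataclass
class DatasetDetails:
    """Structural description of one dataset (hypothetical layout)."""
    title: str
    dimensions: list = field(default_factory=list)

    def cell_count(self):
        """Number of cells per geography in a full cross-tabulation."""
        return prod(len(d.items) for d in self.dimensions)


sex = Dimension("Sex", ["Males", "Females"])
health = Dimension("Health", ["Good Health", "Bad Health"])
dataset = DatasetDetails("Sex by health", [sex, health])
cells = dataset.cell_count()
```

A description like this is what software would need in order to label and lay out the cell values supplied in a bulk file.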
Metadata
• Exercise and Discussion
• What is the demand for it?
• What is the priority information?
• How would you like to access metadata?
2011 Census Dissemination
• Summary
• Next Steps
2011 Census Dissemination
• Thank you for coming……
More information, Q&A reports and slides available on CWSWG website www.ukcensusoutputs.net
Direct link to slides http://bit.ly/lz56pr
• Please give us your feedback
Email: [email protected]
• Interested in API ?
• Register for trials with Beta API in 2012
• Register for Census Web Services Working Group (www.ukcensusoutputs.net)
• Interested in Bulk Delivery?
• Register for further workshops & working group
• Form a Census Bulk Data Working Group?