1
Experiences With Developing and Using Metadata-Driven Processing Systems for the Economic Census
June 19, 2007
Mark [email protected]
2
What is Metadata?
Data
•12
•Yes
•02152005
Metadata
•Number of months in operation
•Does your company conduct research and development?
•Date completed
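The pairing above can be sketched in a few lines of Python. This is only an illustration, not the Census Bureau's system: the `FieldMetadata` class, the `kind` codes, and the reading of 02152005 as an MMDDYYYY date are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class FieldMetadata:
    """Hypothetical description of one reported value."""
    label: str  # what the value means
    kind: str   # how to read it: "integer", "yes_no", or "date_mmddyyyy"


def interpret(raw: str, meta: FieldMetadata):
    """Turn a raw reported string into a typed value using its metadata."""
    if meta.kind == "integer":
        return int(raw)
    if meta.kind == "yes_no":
        return raw.strip().lower() == "yes"
    if meta.kind == "date_mmddyyyy":  # assumed date layout
        return datetime.strptime(raw, "%m%d%Y").date()
    raise ValueError(f"unknown kind: {meta.kind}")


# The three data values from the slide, each paired with its metadata.
RESPONSES = [
    ("12", FieldMetadata("Number of months in operation", "integer")),
    ("Yes", FieldMetadata(
        "Does your company conduct research and development?", "yes_no")),
    ("02152005", FieldMetadata("Date completed", "date_mmddyyyy")),
]

described = {meta.label: interpret(raw, meta) for raw, meta in RESPONSES}
```

Without the metadata, "12" and "02152005" are just strings; with it, the same values become a month count and a calendar date.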
3
Forms Design - 1997
Before Metadata:
• Each trade worked independently
• Inconsistent layout practices
• No standards for content or design
• Design of each paper questionnaire one page at a time
• A handful of custom coded computerized questionnaires
4
Dissemination - 1997
Some Metadata in place:
• Major advancements to reduce paper and focus on electronic data products
• Metadata was used to create publications and dissemination products
  – Metadata was handled as independent files
  – No centralized Metadata system existed
• Inconsistencies existed across questionnaires and data products
5
Examples of Inconsistencies from 1997 – Question Numbering
SECTOR                           EMPLOYMENT   PAYROLL    FRINGE BENEFITS
Construction                     Item 5       Item 6     Item 8
Mining Long                      Item 2       Item 3A    Item 3C
Mining Short                     Item 2       Item 3A    N/A
Annual Survey of Manufactures    Item 2       Item 3A    Item 3C
Manufacturing Long               Item 2       Item 3A    N/A
Manufacturing Short              Item 2       Item 3     N/A
Retail Long                      Item 6       Item 5a    N/A
Retail Short                     N/A          N/A        N/A
Service Long                     Item 7       Item 6a    N/A
Service Short                    N/A          N/A        N/A
Wholesale Long                   Item 6a      Item 5a    N/A
Wholesale Short                  Item 5a      Item 4a    N/A
Transportation/Utilities Long    Item 6       Item 5a    N/A
Transportation Short             N/A          N/A        N/A
Finance Long                     Item 6       Item 5a    N/A
Finance Short                    N/A          N/A        N/A
Auxiliaries                      Item 7       Item 6     N/A
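One way to see the size of the problem is to count how many different item labels the same topic carried across forms. The sketch below copies a subset of rows from the table above. The dictionary layout and the `labels_by_topic` helper are hypothetical names for illustration, not part of any Census system.

```python
from collections import defaultdict

# Subset of the 1997 table; None stands for "N/A".
NUMBERING_1997 = {
    "Construction": {"employment": "Item 5", "payroll": "Item 6",
                     "fringe_benefits": "Item 8"},
    "Mining Long": {"employment": "Item 2", "payroll": "Item 3A",
                    "fringe_benefits": "Item 3C"},
    "Manufacturing Short": {"employment": "Item 2", "payroll": "Item 3",
                            "fringe_benefits": None},
    "Retail Long": {"employment": "Item 6", "payroll": "Item 5a",
                    "fringe_benefits": None},
    "Service Long": {"employment": "Item 7", "payroll": "Item 6a",
                     "fringe_benefits": None},
    "Wholesale Short": {"employment": "Item 5a", "payroll": "Item 4a",
                        "fringe_benefits": None},
}


def labels_by_topic(numbering):
    """Collect the distinct item labels used for each topic across forms."""
    result = defaultdict(set)
    for topics in numbering.values():
        for topic, label in topics.items():
            if label is not None:
                result[topic].add(label)
    return dict(result)


distinct = labels_by_topic(NUMBERING_1997)
for topic, labels in sorted(distinct.items()):
    print(f"{topic}: {len(labels)} different labels")
```

Even in this six-form subset, payroll appears under six different labels, which is the kind of drift a shared metadata store is meant to prevent.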
6
Examples of Inconsistencies from 1997 – Forms Design
Picture 2
Picture 1
7
Examples of Inconsistencies from 1997 – Dissemination
Picture 1
Picture 2
8
Questionnaire Design - 2002
After Metadata:
• Each trade worked only on trade-specific questions
• Established standards for questions and layouts
• Design of each question once, allowing for re-use across all questionnaires
• Introduced a generalized system to offer computerized questionnaires to all respondents
9
Categorization of Forms Content
• Question Number (generated)
• Question Title
• Question Instructions
• Item Numbers
• Headers (calculated)
• Item Wording
• Item Instruction
• Data Elements
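A rough sketch of how these categories could fit together follows. The class names, the library keys, the placeholder wording, and the "1a, 1b" item-numbering scheme are all assumptions for illustration; the slides do not describe the actual schema. The point is that a question is stored once, while its number is generated from its position on each questionnaire.

```python
import string
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    wording: str
    data_element: str      # hypothetical key of the stored data value
    instruction: str = ""


@dataclass(frozen=True)
class Question:
    title: str
    instructions: str
    items: tuple


# Each question is designed once (placeholder content, not real form text).
LIBRARY = {
    "employment": Question(
        "Employment", "Placeholder question instructions.",
        (Item("Placeholder item wording A", "EMP_A"),
         Item("Placeholder item wording B", "EMP_B"))),
    "payroll": Question(
        "Payroll", "Placeholder question instructions.",
        (Item("Placeholder item wording", "PAY_A"),)),
}


def build_questionnaire(keys):
    """Assemble a form; question and item numbers come from position."""
    form = []
    for q_number, key in enumerate(keys, start=1):
        question = LIBRARY[key]
        items = [(f"{q_number}{string.ascii_lowercase[i]}", item.wording)
                 for i, item in enumerate(question.items)]
        form.append((q_number, question.title, items))
    return form


# Two hypothetical questionnaires re-using the same stored questions.
form_a = build_questionnaire(["employment", "payroll"])
form_b = build_questionnaire(["payroll", "employment"])
```

Because numbers are generated rather than typed in, the same question can sit at position 1 on one form and position 2 on another without anyone editing its content.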
10
Reusable Content - Questions
11
Dissemination - 2002
With Integrated Metadata System:
• Metadata was entered once and used multiple times
• Data products were more consistent across subject area and format
• The system allowed hundreds of users to analyze and output publications simultaneously
12
Categorization of Dissemination Content
Headnote - Will be included in metadata for 2007. Processed outside system in 2002.
Footnotes - Metadata
File Name/Table Title - Metadata
Stub - Metadata (Both wording and layout from Metadata)
Table Layout/Header Contents - Metadata (Both wording and layout from Metadata)
Data - Physical data is not metadata. (Layout of data comes from Metadata extract rules)
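The split between metadata and physical data can be sketched as follows. This is a minimal illustration under stated assumptions: the `TABLE_METADATA` keys, both render functions, and every value in `DATA` are invented placeholders, not real census figures or the production system's design.

```python
import csv
import io

# Everything about the table except the numbers lives in metadata.
TABLE_METADATA = {
    "title": "Hypothetical Table 1",
    "stub": "Sector",
    "headers": ["Employment", "Payroll"],
    "footnotes": ["Placeholder footnote."],
}

# Physical data is kept separately (placeholder values only).
DATA = [("Sector A", 100, 200), ("Sector B", 300, 400)]


def render_text(meta, rows):
    """Plain-text publication: title, header row, data rows, footnotes."""
    lines = [meta["title"], " | ".join([meta["stub"], *meta["headers"]])]
    lines += [" | ".join(str(cell) for cell in row) for row in rows]
    lines += meta["footnotes"]
    return "\n".join(lines)


def render_csv(meta, rows):
    """Electronic data product built from the same metadata."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([meta["stub"], *meta["headers"]])
    writer.writerows(rows)
    return buffer.getvalue()


text_product = render_text(TABLE_METADATA, DATA)
csv_product = render_csv(TABLE_METADATA, DATA)
```

Both products take their stub and header wording from the same metadata entry, so a correction made there shows up in every output format.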
13
Forms Design Process Improvements for 2007
• The redesign of the database has streamlined processes and improved performance.
• The integration of tools has fostered contiguous development of paper and electronic forms.
• Electronic forms have been completed early and made available for the advance customer outreach program.
14
Dissemination Plans for 2007
• Continue using same system with some improvements
  – System upgrade from UNIX to LINUX Blade
  – Implementation of Software Quality Assurance
• Incorporate additional new products into the current system
• Upgrade publication tools to utilize all metadata
15
Questions