1
An Assessment of the U.S. Census Bureau’s Experience With Developing
Generalized Economic Programs Processing Systems
ICES 2007
Eddie J. Salyers, Beverly Eng
2
Objectives for generalized systems
• Reduce resources and processing redundancies.
• Easily accommodate changes in surveys without laborious efforts to retool systems.
• Provide an infrastructure to introduce new surveys quickly.
3
Objectives for generalized systems
• Improve corporate planning and coordination.
– Utilize resources to benefit all.
– Provide a common language.
– Provide greater staff mobility.
4
Economic Programs
• Economic Census
– Every five years
• Current Surveys
– 110 current surveys, including monthlies, quarterlies, and annuals
5
Different Approaches
• Economic Census
– One process at a time (for example, edits, tabs)
• Current Surveys
– One survey at a time
6
Generalization Background
• 1994 Reorganization
– ESMPD
– EPCD
• Functional Organization
• Current Surveys StEPS team
• Economic Census “Plain Vanilla” Edit Team
7
Economic Census Background
• Historically some Generalization
– Mailout and Data Capture
– Organizational lines
• Prior to 1997: 5 Systems (5 Divisions)
– Agriculture
– Manufacturing and Mining
– Construction
– Retail / Wholesale / Comm. & Utilities / FIRE / Services
– Auxiliary Establishments
8
Economic Census Background
• 1997: Three Systems
– Agriculture transferred to USDA
– Auxiliary Establishments merged into Services
9
Economic Census Development
• Edits
– “Plain Vanilla”
• Team included subject matter analysts, mathematical statisticians, and programmers
• Basic Requirements
• Designed for incorporation in 3 systems
10
Economic Census Development
• Plain Vanilla (PV)
– Functionality
• Ratio – Ratios compared to bounds, such as Annual Payroll / Number of Employees
• Balance – Inventory stages of fabrication equal total
• Verification – Compare value to list, such as a valid industry code
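
The three kinds of checks listed above can be sketched in Python as follows. This is only an illustration, not the PV edit itself; the field names, bounds, and code list are hypothetical.

```python
# Hypothetical field names, bounds, and code list, chosen only for illustration.
RATIO_BOUNDS = {("annual_payroll", "employees"): (10.0, 200.0)}
INVENTORY_STAGES = ["materials", "work_in_process", "finished_goods"]
VALID_INDUSTRY_CODES = {"A01", "B02"}


def ratio_failures(record):
    """Ratio edit: numerator / denominator must fall within [lower, upper]."""
    failed = []
    for (num, den), (lower, upper) in RATIO_BOUNDS.items():
        # A zero denominator leaves the ratio undefined, so it is flagged too.
        if record[den] == 0 or not lower <= record[num] / record[den] <= upper:
            failed.append(f"ratio:{num}/{den}")
    return failed


def balance_failures(record):
    """Balance edit: inventory stages of fabrication must sum to the total."""
    detail = sum(record[stage] for stage in INVENTORY_STAGES)
    return [] if detail == record["total_inventory"] else ["balance:total_inventory"]


def verification_failures(record):
    """Verification edit: the industry code must appear in the valid list."""
    ok = record["industry_code"] in VALID_INDUSTRY_CODES
    return [] if ok else ["verification:industry_code"]


def run_edits(record):
    """Apply all three edit types and return the names of the failed edits."""
    return ratio_failures(record) + balance_failures(record) + verification_failures(record)


good = {"annual_payroll": 500.0, "employees": 10, "materials": 20,
        "work_in_process": 30, "finished_goods": 50, "total_inventory": 100,
        "industry_code": "A01"}
bad = dict(good, annual_payroll=5000.0, total_inventory=90, industry_code="Z99")

print(run_edits(good))  # no failures
print(run_edits(bad))   # one failure per edit type
```

Keeping the bounds and code lists in data rather than in the functions is what lets one routine serve several trade areas, at least in this simplified picture.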
11
Economic Census Development
• Plain Vanilla (PV) Edit
– Ratio
• Simultaneous equations
• Built on SPEER (Structured Program for Economic Editing and Referrals)
• SPEER previously used for Annual Survey of Manufactures (ASM) and Census of Mfg. & Mining
• Fellegi-Holt model
12
Economic Census Development
• Plain Vanilla (PV) Edit
– Written in FORTRAN / Script processor in SAS
– First used for Annual Survey of Manufactures 1996
– Expanded to 1997 Economic Census
13
Economic Census Development
• Plain Vanilla (PV) Edit Experiences
– Several Problems in 1997 Economic Census
– Study of problems and resolution
• Edit did too much
• Edit didn’t do enough
14
Economic Census Development
• Plain Vanilla (PV) Problems - 1997
– Edit did too much
• Over edits
• Ratio module assures variables meet all implicit and explicit constraints
• Does not work well on poorly correlated pairs or poorly reported items
• Does not work well on partial year reporters
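
One way to read "implicit constraints": in a Fellegi-Holt style ratio edit, two explicit ratio bounds that share a variable imply a bound on a third ratio. The sketch below assumes all values are positive and uses hypothetical bounds; it is not taken from SPEER or the PV code.

```python
def implied_ratio_bounds(x_over_y, y_over_z):
    """Given (low, high) bounds on x/y and y/z, return the implied bounds on x/z.

    Valid for positive values: x/z = (x/y) * (y/z), so the bounds multiply.
    """
    (low_xy, high_xy), (low_yz, high_yz) = x_over_y, y_over_z
    return (low_xy * low_yz, high_xy * high_yz)


# Hypothetical explicit edits.
receipts_per_payroll = (2, 8)
payroll_per_employee = (10, 100)

receipts_per_employee = implied_ratio_bounds(receipts_per_payroll, payroll_per_employee)
print(receipts_per_employee)  # (20, 800)
```

Because the bounds multiply, wide explicit bounds on a poorly correlated pair turn into even wider implied bounds on the related ratios.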
15
Economic Census Development
• Plain Vanilla (PV) Problems - 1997
– Edit didn’t do enough
• Needed edit for poorly correlated or reported items (ex. Hotel rooms / receipts)
• Problems balancing complex matrices
– Inadequate training / difficulty developing scripts
– Use of one script for several different trade areas
16
Economic Census Development
• Plain Vanilla (PV) Improvements for 2002
– Improved balance routines
– Appropriate number of scripts to run edit (one per trade)
– New routine to test a variable while freezing one variable (hotel rooms / receipts)
– Expertise from 1997 Experience
• Development of interactive tool to aid in writing scripts not completed
17
Economic Census Development
• Plain Vanilla (PV) Lessons Learned
– Requirements gathering should include as many users as practical and not rely only on selected experts.
– Establish and maintain knowledgeable support of generalized software during implementation.
– Software to meet even basic edit requirements for a program as large and diverse as the Economic Census will be complex.
18
Economic Census Development
• Plain Vanilla (PV) Lessons Learned
– Too much generalization can be a bad thing.
– Use of generalized software can reduce overall support requirements, but may also shift the demand for resources.
– It is absolutely necessary to provide accessible classroom training.
19
Economic Census Development
• Trade Area Interactive Problem Solving Environment (TIPSE)
– Oracle software
– Approach
• Determine what is common
• Accept differences
20
Economic Census Development
• Trade Area Interactive Problem Solving Environment (TIPSE)
– Lesson Learned
• Centralizing development of systems along functional lines leads to efficiencies even when custom elements are allowed in the system.
21
Economic Census Development
– ECONDD (Economic Census Query System)
• Visual Basic system to allow analysts to search the database
• Built in 1997 exclusively for selected trades
• Expanded in 2002 for all trades
22
Economic Census Development
– ECONDD Lessons Learned
• Software developed for a specific area can be modified to meet the needs of other areas
• Users are receptive to using generalized software as long as they get improved functionality
23
Economic Census Development
– Tabulations and Macro Analytical Review System (MARS)
• SAS using Hybrid Online-Analytical Processing for MARS
• Same files for review and publication
• Allows analysts to select the summary data to review by industry and geography
• Allows for reach-through to micro data
• Sort, subset, and save as SAS or Excel
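
The summary-then-reach-through pattern described above can be sketched as follows. MARS itself is SAS based; this Python fragment only illustrates the idea, and the records, field names, and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical micro data: one row per establishment.
MICRO = [
    {"id": 1, "industry": "A", "state": "MD", "receipts": 100},
    {"id": 2, "industry": "A", "state": "MD", "receipts": 250},
    {"id": 3, "industry": "B", "state": "VA", "receipts": 75},
]


def summarize(rows):
    """Tabulate receipts by (industry, geography) cell for macro review."""
    totals = defaultdict(int)
    for row in rows:
        totals[(row["industry"], row["state"])] += row["receipts"]
    return dict(totals)


def reach_through(rows, industry, state):
    """Return the micro records that contribute to one summary cell."""
    return [r for r in rows if r["industry"] == industry and r["state"] == state]


summary = summarize(MICRO)
print(summary[("A", "MD")])  # 350
print([r["id"] for r in reach_through(MICRO, "A", "MD")])  # [1, 2]
```

Both functions read the same rows, which mirrors the slide's point that review and publication use the same files.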
24
Economic Census Development
– Tabulations and Macro Analytical Review System (MARS)
• Lessons Learned
– Provide comprehensive training when rolling out the system.
– Limit the custom coded functionality if users have a more familiar way to accomplish the tasks such as using Excel.
– System performance continues to be a challenge with these large datasets even when using the latest and greatest technology.
25
Economic Census Development
– Dissemination Metadata User Interface (DMUI) and Final Data Review Tool (FDRT)
• DMUI – Oracle
• FDRT – SAS
• Publish PDF files of formatted publications
• Database files for American FactFinder (online)
• Database files for CD-ROM
26
Economic Census Development
– Dissemination Metadata User Interface (DMUI) and Final Data Review Tool (FDRT)
• DMUI allows users to enter and update metadata for publications such as publication content and variable descriptions
• FDRT provides for final review of published data and allows for updating of flags and footnotes
• The systems generate the publication files using common tabulation files and metadata assuring consistency across formats
27
Economic Census Development
– Dissemination Metadata User Interface (DMUI) and Final Data Review Tool (FDRT)
• Lessons Learned
– Collecting and assembling metadata from legacy systems can be very challenging.
– Handoffs between systems can be problematic and require unplanned programming to make the pieces fit together.
28
Economic Census Development
– Overall Lessons Learned
• Users would like a common look and feel
• Multiple passwords and logins are annoying
• Having separate systems resulted in some duplication of functionality
• System performance was a major concern for some components
• More TRAINING is needed
29
Current Economic Surveys
• Annual, quarterly, and monthly programs.
• Many have as their frame the Business Register or derivative files from the Register.
• Establishment or company based.
• Exceptions: construction surveys are project based and do not use the Register.
• Primarily mail-out/mail-back programs.
30
Processing of Current Economic Surveys
Prior to 1995:
– Each area developed its own systems to accommodate specific program needs.
– Resulted in 16 different processing systems.
– Many systems performed similar functions.
– Separate resources maintained and managed each system.
31
Processing of Current Economic Surveys (continued)
Prior to 1995:
– Multiple groups were solving similar processing problems.
– Areas with more resources had better systems.
– Enhancements were program-specific.
32
Current Economic Surveys
Development of Generalized System
In May 1995:
• Team included subject matter analysts, mathematical statisticians, and programmers.
• Today, that system is known as the Standard Economic Processing System, or StEPS.
33
What is StEPS?
Standard Economic Processing System
• SAS
• Complete Integrated System for Surveys
– Mail out, check-in, and telephone follow up
– Data capture: interactive forms data entry/batch keying entry
– Data review and correction
– Editing data
– Imputation of data
– Tabulation
34
Benefits of Using SAS Language
• Runs on many platforms (Unix, Linux, etc.).
• Site License
• Fourth generation language:
– Reduced development time for programmers, and a common language for statistical staff
• Comes with self-contained products, such as:
– Pull-down menus; data query capabilities; data analysis package: SAS/INSIGHT; and point-and-click analysis: SAS/ASSIST
35
General Requirements for Developing StEPS
• Run any survey and reference period of data.
• Process items that differ across surveys, e.g.:
– Revenue data from the Quarterly Services Survey
– Repair expenditures from the Survey of Residential Alterations and Repairs.
• Accommodate changes in survey items from one statistical period to the next.
36
General Requirements for Developing StEPS (continued)
• Account for respondents (i.e., survey units) changing between statistical periods.
• Allow surveys to take generalized components and “particularize” them to their specific survey.
• Allow subject matter experts to control changes to their surveys without needing “processors” or programmers.
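
These requirements amount to a parameter-driven design: one generalized routine, with the survey, period, items, and limits supplied as data that subject matter staff can change. A minimal Python sketch of that idea follows. StEPS itself is written in SAS; the table layout, period label, limits, and function name here are hypothetical, and only the two survey items come from the slides.

```python
# Hypothetical parameter table keyed by (survey, statistical period).
# In this sketch, subject matter staff would edit this table, not the code.
SURVEY_ITEMS = {
    ("QSS", "2007Q1"): {"revenue": (0, 1_000_000)},
    ("SRAR", "2007Q1"): {"repair_expenditures": (0, 50_000)},
}


def range_failures(survey, period, record):
    """Generalized range check: items and limits come from the parameter table.

    Returns the items that are missing from the record or outside their limits.
    """
    limits = SURVEY_ITEMS[(survey, period)]
    return [
        item
        for item, (low, high) in limits.items()
        if item not in record or not low <= record[item] <= high
    ]


print(range_failures("QSS", "2007Q1", {"revenue": 500}))
print(range_failures("SRAR", "2007Q1", {"repair_expenditures": 60_000}))
```

Adding an item for a new period means adding a row to the table; the routine does not change.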
37
StEPS Processing
Managing Change
• Three types of change:
– Emergency fix (correction needed immediately)
– Non-emergency fix
– Enhancement
• Improvements identified by programmers
• Additional functions identified through migration efforts
• Additional functions identified through chartered projects
• Improvements identified by users
38
StEPS Accomplishments
• StEPS freed up resources to do other work
• StEPS easily accommodated surveys migrated from less sophisticated legacy systems
• StEPS provided infrastructure to set up new surveys quickly
39
Problems With StEPS and the Migration Process
Process for implementing enhancements needed for a survey
• Enhancements needed for a specific survey can be detrimental to another
• Change Control Board
• StEPS Methodology Advisory Group
40
Problems With StEPS and the Migration Process
Underestimation of resources and time needed to migrate a survey
• Migration team was not usually dedicated to this effort.
• Program areas experienced turnover of staff.
• Some program areas had limited staff knowledgeable about the surveys and legacy systems.
41
Problems With StEPS and the Migration Process
General Resistance from Survey Statisticians
• Satisfaction with their customized legacy system
• Possible loss of functionality by going to a generalized system
• Additional work created from the migration effort
42
Problems With StEPS and the Migration Process
Performance
• Slow – interactive sessions and batch update runs
• SAS Institute – resolution of many processing problems
• Migration of StEPS to Blade/Linux environment
43
Problems With StEPS and the Migration Process
Conflicting Staff Priorities
• Reassigned coordination lead to create buy-in
• Dedicated teams focused on migration effort
44
StEPS Evaluation – Lessons Learned
• Get ‘buy-in’ from managers and staff.
• Dedicate adequate resources to the migration effort.
• Identify and document requirements.
• Implement and document a change control process.
• Commit adequate resources for training and documentation of the generalized system.
45
StEPS Evaluation – Lessons Learned
• Define and document testing strategy and test plans.
• Improve usability and functionality of StEPS.
• Improve performance.
46
Conclusion
• Generalized Systems
– Increased flexibility in moving staff
– Efficiencies in development and maintenance
– Decreased start-up time for new surveys
– More classroom training desirable