The Continuing Evolution of Generalized Systems at
Statistics Canada for Business Survey Processing
Chris Mohl
Statistics Canada
Outline
Why Generalize?
Factors Influencing the Evolution
The Systems
Development, Support and Maintenance
Lessons Learned
Possible Future Activities
Conclusions
Why Generalize Systems?
Fully researched methods
Thoroughly tested
Complete documentation
Expert support team
Minimal user programming required – improves timeliness
Coherent methods across surveys
Factors Influencing the Evolution
Changes in technology
Mainframe to PC/UNIX processing
Some underlying software no longer supported
Statistics Canada’s SAS site license
Need for new or more sophisticated methods
The Systems
Can be classified into three groupings
Mature Systems: no new development
Redesigned Systems: reengineering of old systems
New Development Systems: new methodologies
Mature Systems
The longest surviving generalized systems
No new functionality being added – only maintenance
SAS macros
Interface built with SAS/AF
Can be run in batch mode (macro call within SAS program) or via interface
PC or UNIX
Mature Systems
Generalized Sampling (GSAM)
Performs functions related to sample selection for ongoing and ad hoc surveys
Stratification, Allocation, Sampling, Frame Maintenance
Generalized Estimation System (GES)
Performs functions related to weighting and estimation
One-stage element and cluster, two-phase element designs
Mostly design based, some synthetic, jackknife
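As an illustration of the design-based estimation and jackknife variance mentioned above, here is a minimal Python sketch for a stratified simple random sample. It is not GES code: the function names, the strata, and the sample values are hypothetical, and the finite population correction is ignored for brevity.

```python
from statistics import mean, variance


def stratified_total(sample, pop_sizes):
    """Design-based (expansion) estimate of a total: sum of N_h * ybar_h."""
    return sum(pop_sizes[h] * mean(ys) for h, ys in sample.items())


def jackknife_variance(sample, pop_sizes):
    """Delete-one stratified jackknife variance of the estimated total.

    Each stratum needs at least two sampled units.
    """
    total_var = 0.0
    for h, ys in sample.items():
        n_h = len(ys)
        replicates = []
        for i in range(n_h):
            reduced = dict(sample)            # shallow copy of the strata
            reduced[h] = ys[:i] + ys[i + 1:]  # drop unit i from stratum h
            replicates.append(stratified_total(reduced, pop_sizes))
        centre = mean(replicates)
        total_var += (n_h - 1) / n_h * sum((r - centre) ** 2 for r in replicates)
    return total_var


def textbook_variance(sample, pop_sizes):
    """Closed-form variance estimate, sum of N_h^2 * s_h^2 / n_h (no fpc)."""
    return sum(pop_sizes[h] ** 2 * variance(ys) / len(ys)
               for h, ys in sample.items())


# Hypothetical data: two strata of businesses with known population sizes.
sample = {"small": [12.0, 15.0, 9.0, 14.0], "large": [120.0, 95.0, 140.0]}
pop_sizes = {"small": 200, "large": 30}

print(stratified_total(sample, pop_sizes))
print(jackknife_variance(sample, pop_sizes))
print(textbook_variance(sample, pop_sizes))
```

For a linear estimator such as this total, the jackknife and the closed-form variance agree up to rounding; the jackknife becomes useful when the estimator is nonlinear (ratios, for example).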
Example of GES Interface Screen
Redesigned Systems
Generalized systems previously existed that performed similar functions but needed replacement
Why?
Often due to outdated architecture – mainframe, obsolete software
New capabilities in SAS
New methodologies couldn’t be integrated into previous system
Redesigned Systems
Banff (replaces Oracle-based GEIS)
Performs edit and imputation of numeric continuous data
Nine custom built SAS procedures
SAS Enterprise Guide based “interface” (Banff wizards)
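To make "edit and imputation" concrete, here is a minimal Python sketch of the general idea: apply an edit rule to each record, then impute the failing field from records that pass. It does not reproduce any Banff procedure; the field names, the edit rule, and the choice of ratio imputation are all assumptions made for illustration.

```python
def passes_edits(record):
    """Hypothetical edit rule: expenses must be reported and non-negative."""
    expenses = record["expenses"]
    return expenses is not None and expenses >= 0


def ratio_impute(records):
    """Replace failing expenses with revenue times the clean expenses/revenue ratio."""
    clean = [r for r in records if passes_edits(r)]
    ratio = sum(r["expenses"] for r in clean) / sum(r["revenue"] for r in clean)
    result = []
    for r in records:
        if passes_edits(r):
            result.append({**r, "imputed": False})
        else:
            # Flag the record so imputed values stay distinguishable later.
            result.append({**r, "expenses": r["revenue"] * ratio, "imputed": True})
    return result


# Hypothetical records: one missing value and one negative value.
records = [
    {"revenue": 100.0, "expenses": 60.0},
    {"revenue": 200.0, "expenses": 120.0},
    {"revenue": 150.0, "expenses": None},
    {"revenue": 80.0, "expenses": -5.0},
]

for row in ratio_impute(records):
    print(row)
```

Keeping an explicit imputation flag on each record is a design choice of this sketch; it lets downstream estimation account for imputed values.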
Example of Banff SAS Procedure
Example of Banff Wizard
Redesigned Systems
New CONFID
Performs protection of tabular economic data
SAS-based custom built procedures (like Banff) and macros for PC and UNIX
Jasper (replacement for ACTR)
Performs automated coding of character strings
Retains interface-based processing, but may later build SAS-based custom built procedures
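Automated coding of character strings can be pictured as matching a normalized write-in response against a reference file of coded phrases. The Python sketch below shows one simple way to do this with word overlap; it is not the Jasper or ACTR method, and the reference phrases, codes, stop words, and threshold are hypothetical.

```python
import re

STOPWORDS = {"of", "and", "the", "a"}  # hypothetical stop-word list


def normalise(text):
    """Lowercase, strip punctuation, drop stop words, ignore word order."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return frozenset(w for w in words if w not in STOPWORDS)


def code_text(text, reference, threshold=0.5):
    """Return the code of the best-matching reference phrase, or None.

    Similarity is the Jaccard overlap between the two word sets.
    """
    words = normalise(text)
    best_code, best_score = None, 0.0
    for phrase, code in reference.items():
        ref_words = normalise(phrase)
        union = words | ref_words
        score = len(words & ref_words) / len(union) if union else 0.0
        if score > best_score:
            best_code, best_score = code, score
    return best_code if best_score >= threshold else None


# Hypothetical reference file mapping phrases to classification codes.
reference = {"retail bakery": "B01", "motor vehicle repair": "M02"}

print(code_text("The Bakery, retail", reference))
print(code_text("Repair of motor vehicles", reference))
print(code_text("software consulting", reference))
```

Responses that fall below the threshold return `None`, which in practice would be routed to manual coding.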
New Development Systems
Fill needs for functionality not already available in other generalized systems
Replace customized programs that may already exist
New Development Systems
Statistical Macro Extensions (StatMx)
New functionality not available in GES / GSAM
Multi-stage design estimation, Lavallée-Hidiroglou allocation, extended synthetic estimation
SAS macros, no interface
Forillon
Time series processing
Benchmarking sub-annual series, Raking to retain additivity, trend computations, variance calculations, analytical tools
SAS-based procedures and Enterprise Guide “interface”
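"Raking to retain additivity" can be illustrated with classical two-way raking (iterative proportional fitting), which rescales a table until its rows and columns add up to given totals. This Python sketch is an assumption about the general technique, not Forillon's procedure; the table values and target totals are hypothetical.

```python
def rake(table, row_targets, col_targets, iterations=50):
    """Alternately rescale rows and columns so they sum to the targets.

    Assumes strictly positive cells and targets with equal grand totals.
    """
    cells = [row[:] for row in table]  # work on a copy
    for _ in range(iterations):
        for i, target in enumerate(row_targets):
            row_sum = sum(cells[i])
            cells[i] = [v * target / row_sum for v in cells[i]]
        for j, target in enumerate(col_targets):
            col_sum = sum(row[j] for row in cells)
            for row in cells:
                row[j] *= target / col_sum
    return cells


# Hypothetical component series (rows) by period (columns),
# with row and column totals that the raked table must respect.
table = [[10.0, 20.0], [30.0, 40.0]]
row_targets = [35.0, 65.0]
col_targets = [45.0, 55.0]

raked = rake(table, row_targets, col_targets)
for row in raked:
    print(row)
```

A fixed iteration count keeps the sketch short; a production routine would normally stop on a convergence tolerance instead.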
Development, Support and Maintenance
Most systems developed and maintained by teams of individuals from two groups
Mathematical statisticians (Methodology Branch)
Programmers (Informatics Branch)
Certain projects are the sole responsibility of one group
Moving away from such situations
Development
Methodologists review mathematical needs
Consultation with potential users, literature searches, research into mathematical methods
Programmers review informatics needs
Methodologists write specifications
Programmers produce new version
Methodologists do final certification
Documentation is written
Support
Team members are not directly responsible for implementing the systems; they assist users
Mathematical questions go to methodologists, informatics questions to programmers
Amount of support depends upon number of users, complexity of the methods, “newness” of the system
Maintenance
May consist of bug fixes or adding new functionality
May be identified by the users or by team members
Team members work together to decide whether a change merits attention, then implement and certify it
Costs
Generalized systems require a very significant outlay of resources
Varies significantly from project to project
Development of a large project: 2-3 methodologists, 2-3 programmers over several years
Support and maintenance: 1 methodologist, 1 programmer per year
Lessons Learned
Reduce Software Diversity
Emphasis put on SAS, reduce reliance on different programming languages
Easier to move people from one project to another
Users only need to know one language
Learning SAS is part of staff’s early training
Lessons Learned
Traditional interfaces are expensive – there are alternatives
Interface development can cost as much as the mathematical functionality
Changes can be difficult
Often does not upgrade as well as rest of the system
Most users prefer batch processing for production
Can be necessary when tool is used by non-technical personnel
SAS Enterprise Guide being successfully used
Lessons Learned
People like things they are familiar with
Customized SAS procedures (Banff, Forillon) have been favorably received
Centralization of resources is beneficial
People can take ideas used in one project and apply them to others
Examples: Enterprise Guide interfaces, Customized SAS procedures
Lessons Learned
Modularity and flexibility are important
Some early systems too rigid – successful ones had more flexibility
Users only want pieces of certain systems
Reduce custom-built systems, put in generalized systems
People often “borrow” other programs and don’t understand all the implications
Support is a problem when person leaves project
However, timing sometimes makes it necessary
Lessons Learned
Buy when possible, but don’t get cornered
No need to build certain components, e.g. a linear programming function
Ensure that changing to an alternate component is not difficult
Make sure that the support is there
Stay up to date on technology
Don’t wait too long to react to advances
e.g. Mainframe → PC in the 1990s, Linux
Possible Future Activities
Current Systems
Banff – categorical data capabilities
New CONFID – add functionality
Jasper – review of methodology used
Forillon – add functionality
StatMx – advanced variance calculations?
Possible Future Activities
General avenues
Continue movement towards SAS-based procedures and Enterprise Guide interfaces
Buy components when possible – free up programming resources for specialized tasks
Metadata table-based processor
Conclusions
Generalized Systems have become a critical part of business survey processing
Due to the investments made in development we have to keep them relevant
Moving towards a more standardized look and feel
Use what we have learned in the past to help shape the future
For more information, please contact
Chris Mohl
Chris.Mohl@statcan.ca