Can We Trust Data Users to Consider Data Quality? Presented at the 2008 European Conference on Quality in Official Statistics

Background

The American Community Survey (ACS) is an innovative approach for collecting and publishing demographic, social, economic, and housing data

National sample of about 3 million addresses each year

Background

Combining ACS samples over time permits publication for smallest geographic areas

Combining ACS samples over space permits publication for shorter time periods

ACS Data Release Schedule

Type of Estimate   Year of Data Release
                   2006   2007   2008        2009        2010        2011
5-year             NA     NA     NA          NA          2005-2009   2006-2010
3-year             NA     NA     2005-2007   2006-2008   2007-2009   2008-2010
1-year             2005   2006   2007        2008        2009        2010

NA: Not Available

Dissemination Options

5-year estimates released for all geographic areas to produce data similar to census sample data

1-year and 3-year estimates released only for a subset of these geographic areas

ACS Data Users

Technically advanced users have the experience and can usually be trusted to consider quality

Novice users who lack this experience may not understand or take quality into account

Consequences

Release of data that are not perceived as credible leads to loss of trust in the integrity of the survey in general

“ACS strikes again! It’s hard to believe that the Census Bureau expects users to accept these numbers.”

ACS Dissemination Philosophy

Release as many data as possible to as many areas as possible while being certain that confidentiality is retained

Produce accompanying information on sampling error and educational materials for users

Methods

1-year estimates are only published for geographic areas with a minimum population of 65,000

Products reflect the use of a table-based data release rule and the availability of detailed and collapsed tables
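
As a minimal sketch of the population threshold described above, the following Python snippet checks which areas would qualify for 1-year estimates. The area names and populations are hypothetical and only illustrate the 65,000 cutoff; the table-based data release rule is not modeled here because the slides do not give its details.

```python
# Minimum population for 1-year estimates, as stated in the slides.
ONE_YEAR_MIN_POPULATION = 65_000


def receives_one_year_estimates(population: int,
                                threshold: int = ONE_YEAR_MIN_POPULATION) -> bool:
    """Return True when an area meets the minimum population threshold."""
    return population >= threshold


# Hypothetical areas and populations, for illustration only.
areas = {"Area A": 48_000, "Area B": 65_000, "Area C": 310_000}

published = [name for name, pop in areas.items()
             if receives_one_year_estimates(pop)]
print(published)
```

Passing a different `threshold` (for example 125,000) shows how raising the cutoff narrows the set of areas that receive 1-year estimates.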

Example of a Collapsed Table

Example - Margins of Error

Example - Confidence Intervals
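
The slide's own example is not preserved in this transcript, so the following is only a sketch of how a published margin of error relates to a standard error and a confidence interval. It assumes the ACS convention of publishing margins of error at the 90 percent confidence level (multiplier 1.645); the estimate and margin of error used here are hypothetical.

```python
# ACS margins of error are published at the 90 percent confidence level.
Z_90 = 1.645


def standard_error(moe: float, z: float = Z_90) -> float:
    """Convert a published margin of error into a standard error."""
    return moe / z


def confidence_interval(estimate: float, moe: float) -> tuple[float, float]:
    """Return the (lower, upper) bounds implied by the margin of error."""
    return (estimate - moe, estimate + moe)


# Hypothetical estimate and margin of error, for illustration only.
estimate, moe = 12_400, 1_645

print(standard_error(moe))
print(confidence_interval(estimate, moe))
```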

Example - Statistical Testing
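
The slide's own example is not preserved in this transcript. As a hedged illustration, the sketch below tests whether two estimates differ significantly at the 90 percent confidence level, using the standard z-test for a difference. It assumes the two estimates are independent and that their margins of error are 90 percent margins; all input values are hypothetical.

```python
import math

# Critical value for the 90 percent confidence level.
Z_90 = 1.645


def difference_test(est1: float, moe1: float,
                    est2: float, moe2: float) -> tuple[bool, float]:
    """Return (is_significant, z) for the difference of two estimates.

    Assumes independent estimates with 90 percent margins of error.
    """
    se1 = moe1 / Z_90
    se2 = moe2 / Z_90
    z = (est1 - est2) / math.sqrt(se1 ** 2 + se2 ** 2)
    return abs(z) > Z_90, z


# Hypothetical estimates, for illustration only.
significant, z = difference_test(12_400, 1_645, 15_000, 1_645)
print(significant, round(z, 3))

not_significant, z2 = difference_test(12_400, 1_645, 14_000, 1_645)
print(not_significant, round(z2, 3))
```

A novice user who compares only the two point estimates would call both pairs "different"; the test shows that only the first difference exceeds what sampling error alone could explain.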

Other Educational Materials

Website includes numerous documents describing survey methods and survey quality

Separate ACS web page on Quality Measures

Review of 2006 ACS Data

Summary of the total estimates produced from the ACS sample

Reliability of published estimates

Effectiveness of publication thresholds and data release rules

Estimate Size – 2006 ACS

Number of published (blue) and not published (grey) estimates (in millions)

[Bar chart by estimate size category: Estimates of Zero; Less than 0.5%; 0.5 - 1.0%; 1.0 - 5.0%; 5.0 - 10.0%; 10.0 - 20.0%; Greater than 20%. Data labels, in extracted order: 5.1, 15.2, 9.1, 27.5, 11.9, 9, 18.3, 35.1, 7.5, 3.7, 10.1, 4.4, 3.8, 8.3]

Reliability of 2006 ACS Estimates

Number of published estimates (in millions) with CVs of less than 30% (blue) and CVs of 30% or greater (red)

[Bar chart by estimate size category: Estimates of Zero; Less than 0.5%; 0.5 - 1.0%; 1.0 - 5.0%; 5.0 - 10.0%; 10.0 - 20.0%; Greater than 20%. Data labels, in extracted order: 2.8, 3.7, 20.5, 11.0, 8.6, 18.0, 5.1, 12.4, 5.4, 7.0, 0.9, 0.5, 0.4]
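
The 30 percent CV cutoff in the reliability chart can be illustrated with a short sketch. It assumes the usual definition of the coefficient of variation (standard error divided by the estimate, as a percentage) and a standard error derived from a 90 percent margin of error. The input values are hypothetical, and treating zero estimates as having no CV is a choice made for this sketch only.

```python
from typing import Optional

# Multiplier for margins of error published at the 90 percent level.
Z_90 = 1.645


def coefficient_of_variation(estimate: float, moe: float) -> Optional[float]:
    """Return the CV in percent, or None when the estimate is zero."""
    if estimate == 0:
        return None  # CV is undefined for an estimate of zero
    standard_error = moe / Z_90
    return 100.0 * standard_error / abs(estimate)


def is_reliable(estimate: float, moe: float, max_cv: float = 30.0) -> bool:
    """True when the CV is below the cutoff (30 percent in the slides)."""
    cv = coefficient_of_variation(estimate, moe)
    return cv is not None and cv < max_cv


# Hypothetical estimates and margins of error, for illustration only.
print(coefficient_of_variation(5_000, 1_645), is_reliable(5_000, 1_645))
print(coefficient_of_variation(2_000, 1_645), is_reliable(2_000, 1_645))
```

With the same margin of error, the smaller estimate has the larger CV, which is consistent with the chart grouping reliability by estimate size.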

Effectiveness of Thresholds and Release Rules – 2006 ACS

Number of estimates (in millions) with CVs of 30% or greater with release rules (red) or without release rules (grey) given varying publication thresholds

[Bar chart by publication threshold: 65,000; 125,000; 250,000; 500,000; 1 Million. Data labels, in extracted order: 51.8, 18.6, 11.6, 3.5, 31.7, 20.1, 9.0, 2.0, 6.0, 94.7]

Effect of Threshold Changes on Scope of Publication

Number of geographic areas receiving 1-year estimates given varying publication thresholds

Publication threshold   Geographic areas
65,000                  6,502
125,000                 3,830
250,000                 1,533
500,000                 1,023
1 Million               292

Conclusions

Continued release of 1-year estimates based on 65,000 threshold and use of data release rule

Expansion of educational materials for users with emphasis on quality

New Initiatives

Survey of users to obtain feedback on measures of sampling error

Development of on-line calculator

Testing of alternative visual display of ACS data

New Initiatives

Data user guides for targeted audiences

On-line tutorial

Training materials and train-the-trainer sessions

More information

Deborah H. Griffin

U.S. Census Bureau

[email protected]

(301) 763-2855