GIS Data Quality

GIS Data Quality. Lecture Outline Accuracy Precision Error –Error Sources Using Multiple Data Sets Together –Completeness –Compatibility –Consistency




Lecture Outline

• Accuracy

• Precision

• Error

– Error Sources

• Using Multiple Data Sets Together

– Completeness

– Compatibility

– Consistency

– Applicability

– Error Propagation

• Recognizing and Avoiding Error

• Metadata


Accuracy and Precision

• Accuracy: the degree to which information on a map or in a digital database matches true or accepted values

• Precision: the level of measurement and exactness of description in a GIS database

• Error: inaccuracy and imprecision of data

• Accuracy and Precision can be applied to both spatial and non-spatial data

• Spatial – often refers to scale of map from which data is derived; can also refer to GPS data

• Non-spatial – describes level of detail in the attribute data


Accuracy and Precision

[Figure: four panels labeled Accurate/Imprecise, Inaccurate/Precise, Inaccurate/Imprecise, and Accurate/Precise]


Sources of Inaccuracy and Imprecision

• Obvious Sources of Error

– Age of Data

– Areal Coverage

– Map Scale

– Density of Observations

– Relevance (use of “surrogate” data)

– Data Format

– Accessibility

– Cost

• Error from Natural Variation or Original Measurements

– Positional Accuracy

– Accuracy of Content

– Variation in the Data

• Processing Errors

– Numerical Errors

– Topological Errors

– Classification and Generalization

– Digitizing and Geocoding


Scale Effects on Position

Scale          Horizontal Accuracy
1:12,500       9.5 m
1:25,000       12.7 m
1:50,000       25.4 m
1:100,000      50.8 m
1:250,000      126.9 m
1:1,000,000    507.9 m

From: US National Map Accuracy Standards


Error Sources Associated With Digitizing


Spatial Data Error

• Location errors

– Example: a schoolhouse is located 30 feet away from its marked location on a map

– A 300 meter contour line is offset 5 meters to the northwest

– A satellite image pixel is located 2.4 meters away from its actual location on the ground

• Attribute errors

– A schoolhouse is incorrectly labeled as a church

– A 300 meter contour line is actually supposed to be a 310 meter contour line

– A 300 meter contour line actually represents an elevation of 302 meters

– A classified satellite image pixel is labeled forest when it is actually a field


Spatial Data Error

• One data point – error/accuracy can be easily defined.

• Data sets/maps – error/accuracy must be summarized.

• How is accuracy determined and summarized?

– Very accurate data must be collected (sampled) about a subset of the full dataset/map.

– This accurate sample is then compared with the original data.

– A summary is created that compares these 2 datasets (the sample with the same measurements from the original data).
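
The sample-and-compare procedure on this slide can be sketched for attribute data as follows. This is a minimal illustration, not a prescribed method: the function names and the six sample labels are hypothetical, and "overall agreement" is just one simple way to summarize the comparison.

```python
from collections import Counter


def overall_agreement(reference, mapped):
    """Fraction of sampled sites where the map label matches the reference label."""
    if len(reference) != len(mapped) or not reference:
        raise ValueError("reference and mapped must be non-empty and the same length")
    matches = sum(1 for r, m in zip(reference, mapped) if r == m)
    return matches / len(reference)


def confusion_counts(reference, mapped):
    """Count (reference label, mapped label) pairs to show which classes get confused."""
    return Counter(zip(reference, mapped))


# Hypothetical data: a field-checked sample (treated as very accurate) and
# the labels the original map gives at the same six sites.
reference = ["forest", "forest", "field", "field", "forest", "water"]
mapped = ["forest", "field", "field", "field", "forest", "water"]

print("overall agreement:", overall_agreement(reference, mapped))
print("forest mapped as field:", confusion_counts(reference, mapped)[("forest", "field")])
```

The pair counts show where the disagreements are concentrated, which a single agreement figure cannot.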


Spatial Data Error

• Locational data accuracy can be summarized with Root Mean Square Error (RMSE).

– A kind of average of the distance between where points/pixels are represented and their actual location on the ground.

• Locational data can also be summarized in other ways.

• For example:

– For horizontal data, the USGS uses the US National Map Accuracy Standards:

– 90% of all measurable points are within 1/50 of an inch (measured on the map) for maps at scales of 1:20,000 or smaller, and within 1/30 of an inch for maps at scales larger than 1:20,000.
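
Both summaries can be sketched in a few lines. This is an illustrative sketch only: the function names, the four coordinate pairs (in meters), and the 1:24,000 scale are assumptions, and the tolerance function simply converts the map distances quoted above to ground distance.

```python
import math

INCH_TO_M = 0.0254  # exact definition of the inch in meters


def rmse(mapped, reference):
    """Root mean square of the distances between mapped and reference positions."""
    squared = [(mx - rx) ** 2 + (my - ry) ** 2
               for (mx, my), (rx, ry) in zip(mapped, reference)]
    return math.sqrt(sum(squared) / len(squared))


def nmas_tolerance_m(scale_denominator):
    """Ground distance for the map tolerance quoted above.

    Scales larger than 1:20,000 (denominator below 20,000) use 1/30 inch;
    scales of 1:20,000 or smaller use 1/50 inch.
    """
    map_inches = 1 / 30 if scale_denominator < 20000 else 1 / 50
    return map_inches * scale_denominator * INCH_TO_M


def share_within(mapped, reference, tolerance_m):
    """Fraction of points whose positional error is within the tolerance."""
    distances = [math.hypot(mx - rx, my - ry)
                 for (mx, my), (rx, ry) in zip(mapped, reference)]
    return sum(d <= tolerance_m for d in distances) / len(distances)


# Hypothetical check points: surveyed (reference) vs. map-derived positions.
reference = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0), (100.0, 100.0)]
mapped = [(3.0, 4.0), (100.0, 0.0), (0.0, 106.0), (92.0, 94.0)]

tolerance = nmas_tolerance_m(24000)  # assumed map scale of 1:24,000
print("RMSE (m):", rmse(mapped, reference))
print("tolerance (m):", tolerance)
print("meets 90% criterion:", share_within(mapped, reference, tolerance) >= 0.9)
```

RMSE gives a single typical error distance, while the 90% test asks how many points fall inside a scale-dependent threshold.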


Error

• Error is unbiased when the error is in ‘random’ directions

– GPS data

– Human error in surveying points

• Error is biased when there is systematic variation in accuracy within a geographic data set

– Example: GIS tech mistypes coordinate values when entering control points to register map to digitizing tablet, so all coordinate data from this map is systematically offset (biased)

– Example: the wrong datum is being used
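
One way to tell biased from unbiased error is to look at the average error vector: random errors tend to cancel, while a systematic offset does not. The sketch below assumes a small set of hypothetical check points in which the mapped coordinates are shifted by roughly (5, -2) meters; the function name and data are illustrative only.

```python
import math
from statistics import mean


def error_summary(mapped, reference):
    """Return (mean error vector, RMSE, RMSE after removing the mean offset)."""
    dx = [mx - rx for (mx, _), (rx, _) in zip(mapped, reference)]
    dy = [my - ry for (_, my), (_, ry) in zip(mapped, reference)]
    bias = (mean(dx), mean(dy))  # near (0, 0) when error is unbiased
    total = math.sqrt(mean([x * x + y * y for x, y in zip(dx, dy)]))
    residual = math.sqrt(mean([(x - bias[0]) ** 2 + (y - bias[1]) ** 2
                               for x, y in zip(dx, dy)]))
    return bias, total, residual


# Hypothetical check points (meters): true positions vs. digitized positions.
reference = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
mapped = [(5.1, -2.0), (14.9, -2.1), (5.0, 8.1), (15.0, 8.0)]

bias, total, residual = error_summary(mapped, reference)
print("mean error vector:", bias)
print("RMSE:", total)
print("RMSE after removing mean offset:", residual)
```

A large mean error vector combined with a small residual RMSE suggests a systematic shift, such as mistyped control points or a wrong datum, rather than random scatter.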


Error when Using Multiple Data Sets

• Error Propagation – one error leads to another

– using a mis-registered point to register another layer

– additive effect

– E.g., what happens if layer digitized with a spatial bias problem is used as the spatial reference to create another, new layer?

• Error Cascading – erroneous, imprecise and inaccurate information will skew a GIS solution when information is combined selectively into new layers

– errors propagate from layer to layer repeatedly

– effect can be additive or multiplicative
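
The additive and multiplicative effects can be illustrated with simple arithmetic. This is a sketch under stated assumptions, not a full error model: the additive case assumes every offset points the same way (a worst case), the multiplicative case assumes the layers' errors are independent, and all names and numbers are hypothetical.

```python
import math


def worst_case_offset(offsets_m):
    """Additive effect: offsets from chained registrations all pointing the same way."""
    return sum(offsets_m)


def overlay_accuracy(layer_accuracies):
    """Multiplicative effect: chance that every layer is correct at a location,
    assuming the layers' errors are independent of one another."""
    return math.prod(layer_accuracies)


# Hypothetical example: three layers, each registered from the previous one
# with a 5 m offset, and each 90% accurate in its attribute labels.
print("worst-case positional offset (m):", worst_case_offset([5.0, 5.0, 5.0]))
print("combined attribute accuracy:", overlay_accuracy([0.9, 0.9, 0.9]))
```

Under these assumptions, three individually acceptable layers combine into a product that is correct at fewer than three out of four locations.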


Propagation & Cascading


Using Multiple Data Sets

• Four Data Quality Considerations:

– Completeness

• A complete data set will cover the study area and time period in its entirety

• No data set is 100% complete

– Compatibility

• Data sets must be compatible with one another

• Scale, data capture methods, etc.

– Consistency

• There must be consistency between and within data sets

• Data development, data capture methods

– Applicability

• Data must be appropriate for your intended use


Documenting Your Data – Metadata

• Metadata - data about data

– Used to document all aspects of a data set

– Allows the user to determine the usefulness of data set

– Organizations want to maintain their investment

– To share information about available data

• Data catalogs & clearinghouses

– To aid data transfer & appropriate use

• Metadata standards set by the Federal Geographic Data Committee (FGDC): http://www.fgdc.gov

• All data distributed on the web and by sanctioned data distributors should have FGDC-compliant metadata


Metadata in ArcGIS

• Visible in ArcCatalog

• Contained in the .xml part of a shapefile
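
Because the metadata is stored as XML, it can also be read outside ArcCatalog. The sketch below parses an inline sample with Python's standard library. The sample content is invented, and the element paths assume FGDC CSDGM-style short names (`idinfo`, `dataqual`, and so on); metadata written by a particular ArcGIS version may use different element names, so treat the paths as assumptions to check against a real file.

```python
import xml.etree.ElementTree as ET

# Hypothetical FGDC-style metadata; a real shapefile typically keeps this
# in a sidecar file named like "<name>.shp.xml".
SAMPLE_XML = """<metadata>
  <idinfo>
    <citation><citeinfo><title>Hypothetical Roads Layer</title></citeinfo></citation>
    <descript><abstract>Illustrative abstract text.</abstract></descript>
  </idinfo>
  <dataqual>
    <posacc><horizpa><horizpar>Illustrative accuracy statement.</horizpar></horizpa></posacc>
  </dataqual>
</metadata>"""

# Assumed element paths, relative to the <metadata> root.
FIELDS = {
    "title": "idinfo/citation/citeinfo/title",
    "abstract": "idinfo/descript/abstract",
    "horizontal_accuracy": "dataqual/posacc/horizpa/horizpar",
}


def summarize_metadata(xml_text):
    """Return the selected metadata fields; a missing element yields None."""
    root = ET.fromstring(xml_text)
    return {name: root.findtext(path) for name, path in FIELDS.items()}


for name, value in summarize_metadata(SAMPLE_XML).items():
    print(f"{name}: {value}")
```

To read a file on disk instead, `ET.parse(path).getroot()` can replace `ET.fromstring(xml_text)`.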


Reminders

• Case study #7 will be on Friday (Oct. 5th)

• Mid-term study guide will be posted online on Friday (Oct. 5th)

• Mid-term review will be on Monday (Oct. 8th)

– Come with questions

• Mid-term exam will be next Wednesday (Oct. 10th)

• Lab 3 is due next Friday (Oct. 12th)

– This was written incorrectly in the Lab 3 document