richard-todd
GIS Data Quality
Lecture Outline
• Accuracy
• Precision
• Error
– Error Sources
• Using Multiple Data Sets Together
– Completeness
– Compatibility
– Consistency
– Applicability
– Error Propagation
• Recognizing and Avoiding Error
• Metadata
Accuracy and Precision
• Accuracy: the degree to which information on a map or in a digital database matches true or accepted values
• Precision: the level of measurement and exactness of description in a GIS database
• Error: inaccuracy and imprecision of data
• Accuracy and Precision can be applied to both spatial and non-spatial data
• Spatial – often refers to the scale of the map from which data is derived; can also refer to GPS data
• Non-spatial – describes level of detail in the attribute data
Accuracy and Precision
[Figure: four panels labeled Accurate/Imprecise, Inaccurate/Precise, Inaccurate/Imprecise, and Accurate/Precise]
Sources of Inaccuracy and Imprecision
• Obvious Sources of Error
– Age of Data
– Areal Coverage
– Map Scale
– Density of Observations
– Relevance (use of “surrogate” data)
– Data Format
– Accessibility
– Cost
• Error from Natural Variation or Original Measurements
– Positional Accuracy
– Accuracy of Content
– Variation in the Data
• Processing Errors
– Numerical Errors
– Topological Errors
– Classification and Generalization
– Digitizing and Geocoding
Scale Effects on Position
Scale          Horizontal Accuracy
1:12,500       9.5 m
1:25,000       12.7 m
1:50,000       25.4 m
1:100,000      50.8 m
1:250,000      126.9 m
1:1,000,000    507.9 m
From: US National Map Accuracy Standards
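The arithmetic behind a scale-to-accuracy table of this kind is a unit conversion: an allowable error measured on the paper map, multiplied by the scale denominator, gives the corresponding distance on the ground. A minimal sketch, assuming a map tolerance of 1/50 inch (the function name and the default tolerance are illustrative choices, not taken from the slides):

```python
INCH_TO_M = 0.0254  # exact length of one international inch in metres


def ground_tolerance_m(scale_denominator, map_tolerance_in=1 / 50):
    """Ground distance (metres) that corresponds to a tolerance measured on the map."""
    return map_tolerance_in * scale_denominator * INCH_TO_M


# Hypothetical usage: print the ground tolerance for a few map scales.
for denom in (25_000, 50_000, 100_000):
    print(f"1:{denom:,} -> {ground_tolerance_m(denom):.1f} m")
```

Under that assumption, 1:25,000, 1:50,000 and 1:100,000 come out at 12.7 m, 25.4 m and 50.8 m, in line with the table.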
Error Sources Associated With Digitizing
Spatial Data Error
• Location errors
– Example: a schoolhouse is located 30 feet away from its marked location on a map
– A 300 meter contour line is offset 5 meters to the northwest
– A satellite image pixel is located 2.4 meters away from its actual location on the ground
• Attribute errors
– A schoolhouse is incorrectly labeled as a church
– A 300 meter contour line is actually supposed to be a 310 meter contour line
– A 300 meter contour line actually represents an elevation of 302 meters
– A classified satellite image pixel is labeled forest when it is actually a field
Spatial Data Error
• One data point – error/accuracy can be easily defined.
• Data sets/maps – error/accuracy must be summarized.
• How is accuracy determined and summarized?
– Very accurate data must be collected (sampled) about a subset of the full dataset/map.
– This accurate sample is then compared with the original data
– A summary is created that compares these 2 datasets (the sample with the same measurements from the original data)
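For attribute data, the sample-versus-original comparison described here can be summarized as a simple agreement rate. The sketch below is one possible summary, not necessarily the one used in the course; the function name and the sample labels are made up for illustration.

```python
def overall_accuracy(reference_labels, map_labels):
    """Fraction of sampled locations where the map label matches the reference label."""
    if len(reference_labels) != len(map_labels):
        raise ValueError("samples must be paired one-to-one")
    if not reference_labels:
        raise ValueError("need at least one sample")
    matches = sum(r == m for r, m in zip(reference_labels, map_labels))
    return matches / len(reference_labels)


# Hypothetical sample: accurate field checks vs. labels from the classified image.
reference = ["forest", "field", "forest", "water", "field"]
classified = ["forest", "forest", "forest", "water", "field"]
print(overall_accuracy(reference, classified))  # 4 of 5 labels agree -> 0.8
```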
Spatial Data Error
• Locational data accuracy can be summarized with Root Mean Square Error (RMSE).
– A kind of average of the distance between where points/pixels are represented and their actual location on the ground.
• Locational data can also be summarized in other ways. For example:
– For horizontal data, the USGS uses the US National Map Accuracy Standards:
• 90% of all measurable points are within 1/50 of an inch (measured on the map) for maps at scales of 1:20,000 or smaller, and within 1/30 of an inch for maps at scales larger than 1:20,000.
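As a rough illustration of the RMSE summary mentioned on this slide, the sketch below squares the distance between each mapped point and its true position, averages the squares, and takes the square root. The function name and coordinates are hypothetical, and the points are assumed to be in a planar coordinate system with consistent units.

```python
import math


def positional_rmse(map_points, true_points):
    """Root mean square of the distances between mapped and true positions."""
    if len(map_points) != len(true_points) or not map_points:
        raise ValueError("need equal-length, non-empty point lists")
    squared = [
        (mx - tx) ** 2 + (my - ty) ** 2
        for (mx, my), (tx, ty) in zip(map_points, true_points)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical check points: one is 5 units off, the other is exact.
mapped = [(3.0, 4.0), (10.0, 10.0)]
truth = [(0.0, 0.0), (10.0, 10.0)]
print(round(positional_rmse(mapped, truth), 3))
```

Because the distances are squared before averaging, a few large misplacements raise the RMSE more than many small ones would.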
Error
• Error is unbiased when the error is in ‘random’ directions
– GPS data
– Human error in surveying points
• Error is biased when there is systematic variation in accuracy within a geographic data set
– Example: GIS tech mistypes coordinate values when entering control points to register map to digitizing tablet; all coordinate data from this map is systematically offset (biased)
– Example: the wrong datum is being used
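One informal way to tell the two cases apart is to average the error vectors at a set of check points: random errors tend to cancel toward zero, while a systematic offset survives the averaging. A minimal sketch with made-up coordinates (the function name is illustrative):

```python
def mean_offset(map_points, true_points):
    """Average (dx, dy) between mapped and true positions."""
    if len(map_points) != len(true_points) or not map_points:
        raise ValueError("need equal-length, non-empty point lists")
    n = len(map_points)
    dx = sum(m[0] - t[0] for m, t in zip(map_points, true_points)) / n
    dy = sum(m[1] - t[1] for m, t in zip(map_points, true_points)) / n
    return dx, dy


truth = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]

# Hypothetical biased layer: every point shifted by the same amount.
shifted = [(x + 2.0, y - 1.0) for x, y in truth]

# Hypothetical unbiased layer: errors point in different directions.
scattered = [(1.0, 0.0), (9.0, 0.0), (0.0, 11.0), (10.0, 9.0)]

print(mean_offset(shifted, truth))    # consistent offset remains
print(mean_offset(scattered, truth))  # offsets cancel out
```

A mean offset near zero does not prove the data are accurate; it only suggests the errors are not all pushed in one direction.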
Error when Using Multiple Data Sets
• Error Propagation – one error leads to another
– using a mis-registered point to register another layer
– additive effect
– E.g., what happens if layer digitized with a spatial bias problem is used as the spatial reference to create another, new layer?
• Error Cascading – erroneous, imprecise and inaccurate information will skew a GIS solution when information is combined selectively into new layers
– errors propagate from layer to layer repeatedly
– effect can be additive or multiplicative
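The additive and multiplicative cases can be sketched with two toy calculations. Both are simplifications under stated assumptions: in the additive case each layer is registered to the previous one and inherits its offset; in the multiplicative case each layer has an independent chance of being correct at a given location. Function names and numbers are hypothetical.

```python
from math import prod


def chained_offset(offsets):
    """Additive case: total (dx, dy) shift when each layer inherits the previous layer's offset."""
    return (sum(dx for dx, _ in offsets), sum(dy for _, dy in offsets))


def overlay_accuracy(layer_accuracies):
    """Multiplicative case: chance a location is correct in every layer, assuming independent errors."""
    return prod(layer_accuracies)


# Layer A is offset by (2.0, 0.0); layer B, registered to A, adds (1.5, -0.5).
print(chained_offset([(2.0, 0.0), (1.5, -0.5)]))

# Three layers, each assumed 90% accurate, combined in one overlay.
print(round(overlay_accuracy([0.9, 0.9, 0.9]), 3))
```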
Propagation & Cascading
Using Multiple Data Sets
• Four Data Quality Considerations:
– Completeness
• A complete data set will cover the study area and time period in its entirety
• No data set is 100% complete
– Compatibility
• Data sets must be compatible with one another
• Scale, data capture methods, etc.
– Consistency
• There must be consistency between and within data sets
• Data development, data capture methods
– Applicability
• Data must be appropriate for your intended use
Documenting Your Data – Metadata
• Metadata - data about data
– Used to document all aspects of a data set
– Allows the user to determine the usefulness of data set
– Organizations want to maintain their investment
– To share information about available data
• Data catalogs & clearinghouses
– To aid data transfer & appropriate use
• Metadata standards set by the Federal Geographic Data Committee (FGDC): http://www.fgdc.gov
• All data distributed on the web and by sanctioned data distributors should have FGDC-compliant metadata
Metadata in ArcGIS
• Visible in ArcCatalog
• Contained in the .xml part of a shapefile
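Because the metadata is stored as XML, it can also be read outside ArcCatalog. The sketch below parses a small, invented metadata string with Python's standard library. The element paths (`idinfo/citation/citeinfo/...`) are assumed to follow the FGDC-style layout; a real file may be organized differently, so treat the paths as placeholders to adjust.

```python
import xml.etree.ElementTree as ET

# Invented sample; a real shapefile's .xml file will contain far more detail.
SAMPLE = """<metadata>
  <idinfo>
    <citation>
      <citeinfo>
        <origin>Example County GIS Office</origin>
        <pubdate>20070101</pubdate>
        <title>Example Schools Layer</title>
      </citeinfo>
    </citation>
  </idinfo>
</metadata>"""


def summarize_metadata(xml_text):
    """Pull a few citation fields out of an FGDC-style metadata document."""
    root = ET.fromstring(xml_text)
    fields = {
        "title": "idinfo/citation/citeinfo/title",
        "originator": "idinfo/citation/citeinfo/origin",
        "published": "idinfo/citation/citeinfo/pubdate",
    }
    # findtext returns None when an element is missing, so gaps stay visible.
    return {name: root.findtext(path) for name, path in fields.items()}


print(summarize_metadata(SAMPLE))
```

To read an actual file instead of a string, `ET.parse(path).getroot()` can replace the `ET.fromstring` call.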
Reminders
• Case study #7 will be on Friday (Oct. 5th)
• Mid-term study guide will be posted online on Friday (Oct. 5th)
• Mid-term review will be on Monday (Oct. 8th)
– Come with questions
• Mid-term exam will be next Wednesday (Oct. 10th)
• Lab 3 is due next Friday (Oct. 12th)
– This was written incorrectly in the Lab 3 document