A Digital Geolibrary: Integrating Keywords and Placenames, ECDL 2003
A Digital GeoLibrary: Integrating Keywords and Place Names
Mathew Weaver and Lois Delcambre
Computer Science and Engineering Department
OGI School of Science and Engineering
Oregon Health & Science University
Leonard Shapiro, Jason Brewster, Afrem Gutema
Department of Computer Science
College of Engineering & Computer Science
Portland State University
Timothy Tolle
Monitoring Specialist, Strategic Planning
Region 6 – Pacific Northwest Region
USDA Forest Service
Outline
• Introduction:
– Metadata++: a digital library for natural resource management
– The Problem: place names as keywords vs. locations with geographic footprint
• Our Solution: Metadata++ and standard GIS
• Discussion
The Metadata++ Digital Library
• Partner: United States Forest Service, Inventory and Monitoring, Region 6, Pacific Northwest Region
• Primary user: Natural resource managers
• Primary content: Agency-approved reports and documents (e.g., Decision Notices, Appeal Decisions, Environmental Assessments, Environmental Impact Statements, Specialist Reports)
Metadata++: A Digital Library Built Using Hierarchical Controlled Vocabularies (CVs)
[Figure: tree views of hierarchical controlled vocabularies. Terms shown include Atmosphere, Air Quality, Emissions, Weather, evaporation, and Vegetation Management.]
Broader terms with their narrower terms, as recoverable from the figure:
• Air Management
– burning
– smoke
– standards
• Air Pollution
– CO
– smoke
• Climate
– moisture
• Fire Weather
– dry
• Processes
– inversion
Metadata++ Controlled Vocabularies
• Numerous, well-structured, standard CVs in wide use (taxonomic classification of plants/animals, vegetation classification)
• CVs of interest identified by experts
• Terms are often phrases (e.g., “Adaptive Management Area”)
• Broader/narrower term
• Synonyms
• Multiple CVs permitted for each topic
• No notion of preferred term
• Term may appear in multiple locations in hierarchy
• Currently no need to distinguish word senses
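The properties listed above (broader/narrower terms, synonyms with no preferred term, and a term appearing at several places in the hierarchy) can be illustrated with a small in-memory model. This is only a sketch under assumptions: the `Vocabulary` class and its methods are hypothetical and are not part of Metadata++, and the synonym pair "CO" / "carbon monoxide" is added purely for illustration. The hierarchy terms are taken from the vocabulary figure.

```python
class Vocabulary:
    """Toy model of hierarchical controlled vocabularies (hypothetical API)."""

    def __init__(self):
        self.paths = set()    # every root-to-term path, stored as a tuple
        self.synonyms = {}    # term -> set of synonymous terms (symmetric)

    def add_path(self, *terms):
        # Register the path and all of its prefixes, so broader terms exist too.
        for end in range(1, len(terms) + 1):
            self.paths.add(tuple(terms[:end]))

    def add_synonym(self, a, b):
        # No preferred term: both directions are stored equally.
        self.synonyms.setdefault(a, set()).add(b)
        self.synonyms.setdefault(b, set()).add(a)

    def locations(self, term):
        # A term may appear at several places in the hierarchy.
        return sorted(p for p in self.paths if p[-1] == term)

    def narrower(self, *path):
        # Direct narrower terms of the term found at the given path.
        return sorted(p[-1] for p in self.paths
                      if len(p) == len(path) + 1 and p[:-1] == path)

    def expand(self, term):
        # The term itself, its synonyms, and every term below any occurrence.
        result = {term} | self.synonyms.get(term, set())
        for p in self.paths:
            if term in p[:-1]:
                result.add(p[-1])
        return result


cv = Vocabulary()
cv.add_path("Air Management", "burning")
cv.add_path("Air Management", "smoke")
cv.add_path("Air Management", "standards")
cv.add_path("Air Pollution", "CO")
cv.add_path("Air Pollution", "smoke")
cv.add_synonym("CO", "carbon monoxide")  # illustrative synonym, not from the slides

smoke_locations = cv.locations("smoke")
pollution_terms = cv.expand("Air Pollution")
```

Storing whole paths rather than single parent links is one simple way to let the same term ("smoke" here) sit under more than one broader term without distinguishing word senses.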
Location is very important! Numerous CVs for location.
And… most users are very familiar with a GIS (Geographic Information System).
But we need search by location AND keywords!
Our Solution: Metadata++ & GIS
• Metadata++: Knows about documents. Represents controlled vocabularies. Supports complex search for terms.
• GIS: Knows about polygons, lines, features. Performs spatial reasoning.
• Exchanged between Metadata++ and the GIS (diagram labels):
– locations selected by a user
– synonyms for locations
– CVs of placenames (locations)
– documents (with locations) to display on map
We want to exploit the strengths of each system, without requiring Metadata++ to do spatial reasoning and without requiring the GIS to know about documents.
Key Idea: Assign a unique ID to each place name that appears in a GIS dataset and that is also known to Metadata++. Send IDs back and forth.
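The ID exchange described above can be sketched with two small stand-in classes. Everything here is an assumption made for illustration: `ToyGIS` and `ToyLibrary` are hypothetical, bounding boxes stand in for real GIS polygons, and the coordinates, IDs, and report titles are invented. The sketch only shows that the spatial side never sees documents and the library side never sees geometry; the two share nothing but place IDs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box, a stand-in for a real GIS footprint."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def intersects(self, other):
        return (self.min_x <= other.max_x and other.min_x <= self.max_x
                and self.min_y <= other.max_y and other.min_y <= self.max_y)


class ToyGIS:
    """Knows footprints and does the spatial reasoning; knows no documents."""

    def __init__(self):
        self.footprints = {}  # place ID -> Box

    def add(self, place_id, box):
        self.footprints[place_id] = box

    def ids_in(self, selection):
        # IDs of all places whose footprint touches the user's selection.
        return {pid for pid, box in self.footprints.items()
                if box.intersects(selection)}


class ToyLibrary:
    """Knows terms and documents; does no spatial reasoning."""

    def __init__(self):
        self.place_name = {}  # place ID -> place name term
        self.docs = {}        # document title -> set of terms

    def search(self, place_ids, keywords):
        places = {self.place_name[i] for i in place_ids if i in self.place_name}
        wanted = set(keywords)
        return sorted(title for title, terms in self.docs.items()
                      if terms & places and wanted <= terms)


gis = ToyGIS()
gis.add("id-1", Box(0, 0, 2, 2))  # invented footprint
gis.add("id-2", Box(5, 5, 7, 7))  # invented footprint

library = ToyLibrary()
library.place_name = {"id-1": "Clackamas County", "id-2": "Columbia County"}
library.docs = {
    "Report A": {"Clackamas County", "smoke"},
    "Report B": {"Columbia County", "smoke"},
    "Report C": {"Clackamas County", "moisture"},
}

selected_ids = gis.ids_in(Box(1, 1, 3, 3))         # user selects a map region
results = library.search(selected_ids, ["smoke"])  # location AND keyword
```

Only the set of IDs crosses the boundary between the two objects, which mirrors the "send IDs back and forth" idea.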
Spatial Synonym Discovery
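[Figure: Metadata++ placename tree shown alongside a GIS map view.]
Metadata++ placenames visible in the tree:
• North Santiam River
• Lower Willamette River
• Yamhill River
• Political
– Washington
– Oregon: Clackamas County, Columbia County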
Related work terminology
• Geographic information retrieval (GIR): retrieve documents based on geographic references within documents.
• Consider three kinds of documents:
– Georeferenced documents have spatial footprint(s) (coordinate, polygon, etc.).
– Georeferenceable documents contain implicit references to geographic locations (place names).
– Non-georeferenceable documents have no geographic reference.
• Spatial queries: work well for georeferenced documents; require that georeferenceable documents have associated footprint(s); do not work at all for non-georeferenceable documents.
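The three kinds of documents defined above can be made concrete with a short classification sketch. The `Document` fields, the `classify` function, and the sample titles are hypothetical and chosen only for illustration; a real system would detect place names and footprints in its own way.

```python
from dataclasses import dataclass, field
from enum import Enum


class Kind(Enum):
    GEOREFERENCED = "georeferenced"
    GEOREFERENCEABLE = "georeferenceable"
    NON_GEOREFERENCEABLE = "non-georeferenceable"


@dataclass
class Document:
    title: str
    footprints: list = field(default_factory=list)  # coordinates, polygons, etc.
    place_names: set = field(default_factory=set)   # place names found in the text


def classify(doc):
    if doc.footprints:
        return Kind.GEOREFERENCED         # has explicit spatial footprint(s)
    if doc.place_names:
        return Kind.GEOREFERENCEABLE      # only implicit references (place names)
    return Kind.NON_GEOREFERENCEABLE      # no geographic reference at all


def answers_spatial_query(doc):
    # A purely spatial query can only match documents that carry a footprint.
    return bool(doc.footprints)


survey = Document("Survey", footprints=[(45.3, -122.6)])      # invented coordinate
assessment = Document("Assessment", place_names={"Yamhill River"})
glossary = Document("Glossary")

kinds = [classify(d) for d in (survey, assessment, glossary)]
```

The georeferenceable case is the one where a placename vocabulary helps: the document has no footprint yet, but its place names can be matched to places that do.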
Related Work
• The Alexandria Digital Library Project Gazetteer
– manages placenames, with geographic footprints
– includes many, many placenames; intended for general use
– includes extensions for associating terms with documents, in addition to footprints
• Some GIR systems use an ontology that includes place names.
• GeoVSM supports keyword and spatial description and search of a single set of documents. Our system accommodates non-georeferenceable documents and allows the user to easily combine place names with any other (non-spatial) terms in their search.
• G-Portal [9] is a map-based digital library architecture for georeferenced resources. The map-based interface is used to search for documents.
Metadata++ Implementation Details
• CVs are implemented using the file system
– Every term is a folder
– Narrower terms appear as subfolders
– “Windows Explorer” can be used to browse and edit the terms and their hierarchical relationships
• For “places” (terms with spatial footprints)
– A shortcut to the GIS dataset is placed in the folder for the term (placename)
– The name of the shortcut is a “guid”, which serves as the term id inside the GIS system
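The folder layout described above can be mimicked in a few lines. This is a sketch under stated assumptions, not the Metadata++ implementation: the helper names are hypothetical, an empty file with a `.lnk` suffix stands in for a real Windows shortcut to the GIS dataset, and the GUID is generated with Python's `uuid` module rather than taken from any GIS.

```python
import tempfile
import uuid
from pathlib import Path


def add_term(root, *path):
    """Create the folder for a term; every broader term becomes a parent folder."""
    folder = root.joinpath(*path)
    folder.mkdir(parents=True, exist_ok=True)
    return folder


def mark_as_place(folder):
    """Drop a GUID-named placeholder 'shortcut' into the term's folder."""
    guid = str(uuid.uuid4())
    (folder / f"{guid}.lnk").touch()  # stand-in for a shortcut to the GIS dataset
    return guid


def narrower_terms(folder):
    """Narrower terms are simply the subfolders."""
    return sorted(p.name for p in folder.iterdir() if p.is_dir())


def place_id(folder):
    """Return the GUID of a place term, or None if the term has no footprint."""
    for p in folder.iterdir():
        if p.is_file() and p.suffix == ".lnk":
            return p.stem
    return None


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    oregon = add_term(root, "Political", "Oregon")
    clackamas = add_term(root, "Political", "Oregon", "Clackamas County")
    add_term(root, "Political", "Oregon", "Columbia County")

    guid = mark_as_place(clackamas)
    children = narrower_terms(oregon)
    clackamas_id = place_id(clackamas)
    oregon_id = place_id(oregon)
```

Because the hierarchy is just a directory tree, any file browser can edit it, which matches the point about using "Windows Explorer" to maintain the terms.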
Work in Progress
• Implementing G-Map “lite” – for web browser access to Metadata++ with SVG map viewer (limited GIS capability)
• Implementing G-Map “power user” – for users with GIS software on their desktop
• Testing and refining the Metadata++ system with real documents and users
• Formalizing the Metadata++ model