A Digital GeoLibrary: Integrating Keywords and Place Names

ECDL 2003

Mathew Weaver and Lois Delcambre
Computer Science and Engineering Department
OGI School of Science and Engineering
Oregon Health & Science University

Leonard Shapiro, Jason Brewster, Afrem Gutema
Department of Computer Science
College of Engineering & Computer Science
Portland State University

Timothy Tolle
Monitoring Specialist, Strategic Planning
Region 6 – Pacific Northwest Region
USDA Forest Service

Outline

• Introduction:
  – Metadata++: a digital library for natural resource management
  – The Problem: place names as keywords vs. locations with geographic footprint

• Our Solution: Metadata++ and standard GIS

• Discussion

The Metadata++ Digital Library

• Partner: United States Forest Service, Inventory and Monitoring, Region 6, Pacific Northwest Region

• Primary user: Natural resource managers

• Primary content: Agency-approved reports and documents (e.g., Decision Notices, Appeal Decisions, Environmental Assessments, Environmental Impact Statements, Specialist Reports)

Metadata++: A Digital Library Built Using Hierarchical Controlled Vocabularies (CVs)

[Figure: example hierarchical controlled vocabularies. Terms shown include Atmosphere, Air Quality, Air Management, Air Pollution, Emissions, Weather, Climate, Fire Weather, Processes, and Vegetation Management, with narrower terms such as burning, smoke, standards, CO, moisture, dry, inversion, and evaporation. The term "smoke" appears in more than one place.]

Metadata++ Controlled Vocabularies

• Numerous, well-structured, standard CVs in wide use (taxonomic classification of plants/animals, vegetation classification)

• CVs of interest identified by experts

• Terms are often phrases (e.g., “Adaptive Management Area”)

• Broader/narrower term

• Synonyms

• Multiple CVs permitted for each topic

• No notion of preferred term

• Term may appear in multiple locations in hierarchy

• Currently no need to distinguish word senses
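
The properties listed above can be sketched as a small data structure. This is an illustration only, not the Metadata++ implementation: the class and method names are hypothetical, the nesting of the example terms is assumed, and the synonym "haze" is invented for the example. Terms are stored as full paths so the same term can appear in several places, and synonyms are symmetric, so no term is preferred.

```python
from collections import defaultdict


class ControlledVocabulary:
    """Toy hierarchical CV (hypothetical API): each term is stored as a full path."""

    def __init__(self, name):
        self.name = name
        self.paths = set()                # one tuple per position in the hierarchy
        self.synonyms = defaultdict(set)  # symmetric links, no preferred term

    def add_term(self, *path):
        # Register the term together with all of its broader terms.
        for i in range(1, len(path) + 1):
            self.paths.add(tuple(path[:i]))

    def add_synonym(self, a, b):
        self.synonyms[a].add(b)
        self.synonyms[b].add(a)

    def locations(self, term):
        """Every position in the hierarchy where the term appears."""
        return sorted(p for p in self.paths if p[-1] == term)

    def narrower(self, path):
        """Direct narrower terms of the term at the given path."""
        path = tuple(path)
        return sorted(p[-1] for p in self.paths
                      if len(p) == len(path) + 1 and p[:-1] == path)


# Term names come from the slides; the nesting shown here is an assumption.
cv = ControlledVocabulary("Atmosphere")
cv.add_term("Atmosphere", "Air Quality", "Air Pollution", "smoke")
cv.add_term("Atmosphere", "Air Quality", "Air Management", "smoke")
cv.add_term("Atmosphere", "Air Quality", "Air Management", "burning")
cv.add_synonym("smoke", "haze")  # "haze" is a made-up synonym

print(cv.locations("smoke"))
print(cv.narrower(("Atmosphere", "Air Quality", "Air Management")))
```

Storing paths rather than bare term names is one simple way to let "smoke" sit under both Air Pollution and Air Management without distinguishing word senses.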

Location is very important! Numerous CVs for location.

And… most users are very familiar with a GIS (Geographic Information System).

But we need search by location AND keywords!

Our Solution: Metadata++ & GIS

Metadata++
• Knows about documents.
• Represents controlled vocabularies.
• Supports complex search for terms.

GIS
• Knows about polygons, lines, features.
• Performs spatial reasoning.

Exchanged between Metadata++ and GIS
• locations selected by a user
• synonyms for locations
• CVs of placenames (locations)
• Documents (with locations) to display on map

We want to exploit the strengths of each system: Metadata++ should not have to do spatial reasoning, and the GIS should not have to know about documents.

Key Idea: Assign a unique ID to each place name that appears in a GIS dataset and that is also known to Metadata++. Send IDs back and forth.
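
A minimal sketch of the ID exchange, under stated assumptions: the function names, the document names, and the term assignments are all hypothetical, and both "systems" are plain Python dictionaries. It only shows that the GIS side needs nothing but IDs, while the Metadata++ side combines those IDs with ordinary keywords.

```python
import uuid

# Shared registry: one ID per place name known to both systems.
# Place names appear in the slides; the IDs are generated here.
PLACE_IDS = {name: str(uuid.uuid4())
             for name in ("Clackamas County", "Columbia County", "Yamhill River")}
ID_TO_PLACE = {pid: name for name, pid in PLACE_IDS.items()}

# Metadata++ side: documents indexed by terms (keywords and place names alike).
# Document names and their terms are hypothetical.
DOCUMENT_TERMS = {
    "decision_notice_A": {"smoke", "Clackamas County"},
    "env_assessment_B": {"smoke", "Columbia County"},
    "specialist_report_C": {"moisture", "Clackamas County"},
}


def gis_selected_ids(selected_names):
    """GIS side: return only the IDs of the places the user selected on the map."""
    return [PLACE_IDS[name] for name in selected_names]


def search(keywords, place_ids):
    """Metadata++ side: documents with every keyword and at least one selected place."""
    places = {ID_TO_PLACE[pid] for pid in place_ids}
    return sorted(doc for doc, terms in DOCUMENT_TERMS.items()
                  if set(keywords) <= terms and (not places or terms & places))


ids = gis_selected_ids(["Clackamas County"])
print(search(["smoke"], ids))
```

Because only opaque IDs cross the boundary, neither side has to understand the other's data model.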

Selecting Locations

Spatial Synonym Discovery

Metadata++

  North Santiam River
  Lower Willamette River
  Yamhill River
  - Political
    + Washington
    - Oregon
      Clackamas County
      Columbia County

GIS
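
The slide does not spell out how synonyms are discovered. One plausible reading, sketched below, is that the GIS reports places whose footprints overlap the selected place. Everything here is an assumption: the overlap criterion, the use of bounding boxes instead of real polygons, the function names, and above all the coordinates, which are made up and do not describe the real locations.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box standing in for a real GIS footprint."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def overlaps(self, other: "Box") -> bool:
        # Two boxes overlap when their extents intersect on both axes.
        return (self.min_x < other.max_x and other.min_x < self.max_x and
                self.min_y < other.max_y and other.min_y < self.max_y)


# Coordinates are invented for illustration; they are NOT real footprints.
FOOTPRINTS = {
    "Clackamas County": Box(4, 0, 8, 4),
    "Columbia County": Box(0, 6, 3, 9),
    "North Santiam River": Box(5, 0, 9, 1),
    "Yamhill River": Box(0, 2, 2, 3),
}


def spatial_synonyms(place):
    """Names of all other places whose (toy) footprint overlaps the given place."""
    target = FOOTPRINTS[place]
    return sorted(name for name, box in FOOTPRINTS.items()
                  if name != place and box.overlaps(target))


print(spatial_synonyms("Clackamas County"))
```

A real GIS would run this test on polygons and lines and return the IDs of the overlapping places rather than their names.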

Related work terminology

• Geographic information retrieval (GIR): retrieve documents based on geographic references within documents.

• Consider three kinds of documents:
  – Georeferenced documents have spatial footprint(s) (coordinate, polygon, etc.).
  – Georeferenceable documents contain implicit references to geographic locations (place names).
  – Non-georeferenceable documents have no geographic reference.

• Spatial queries: work well for georeferenced documents; require that georeferenceable documents have associated footprint(s); do not work at all for non-georeferenceable documents.
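
The three document kinds can be expressed as a small classifier. This is a sketch with hypothetical names (`Document`, `classify`, `spatially_queryable`); it assumes a document is described only by its list of footprints and its list of place names.

```python
from dataclasses import dataclass, field
from enum import Enum


class Kind(Enum):
    GEOREFERENCED = "georeferenced"
    GEOREFERENCEABLE = "georeferenceable"
    NON_GEOREFERENCEABLE = "non-georeferenceable"


@dataclass
class Document:
    title: str
    footprints: list = field(default_factory=list)   # coordinates, polygons, etc.
    place_names: list = field(default_factory=list)  # implicit geographic references


def classify(doc):
    """Apply the three definitions in order: footprint, then place name, then neither."""
    if doc.footprints:
        return Kind.GEOREFERENCED
    if doc.place_names:
        return Kind.GEOREFERENCEABLE
    return Kind.NON_GEOREFERENCEABLE


def spatially_queryable(doc):
    """A spatial query needs a footprint, so only georeferenced documents qualify as-is."""
    return classify(doc) is Kind.GEOREFERENCED


report = Document("Specialist Report", place_names=["Clackamas County"])
print(classify(report).value, spatially_queryable(report))
```

The georeferenceable case is the one the key idea targets: a place name in the text can be tied to a footprint through its ID without the document itself carrying coordinates.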

Related Work

• The Alexandria Digital Library Project Gazetteer
  – Manages placenames, with geographic footprints
  – Includes many, many placenames; intended for general use
  – Includes extensions for associating terms with documents, in addition to footprints

• Some GIR systems use an ontology that includes place names.

• GeoVSM supports keyword and spatial description and search of a single set of documents. Our system accommodates non-georeferenceable documents and allows the user to easily combine place names with any other (non-spatial) terms in their search.

• G-Portal [9] is a map-based digital library architecture for georeferenced resources. The map-based interface is used to search for documents.

Metadata++ Implementation Details

• CVs are implemented using the file system
  – Every term is a folder
  – Narrower terms appear as subfolders
  – “Windows Explorer” can be used to browse and edit the terms and their hierarchical relationships

• For “places” (terms with spatial footprints)
  – A shortcut to the GIS dataset is placed in the folder for the term (placename)
  – The name of the shortcut is a “guid”, which serves as the term ID inside the GIS system
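
The folder-based layout can be imitated in a few lines. This is a sketch, not the actual Metadata++ code: a plain empty file stands in for the Windows shortcut, the ".gisref" suffix and the function names are made up, and the folder nesting is assumed from the example terms.

```python
import tempfile
import uuid
from pathlib import Path


def add_term(root, *path, is_place=False):
    """Create the folder for a term; for a place, drop a GUID-named marker file in it."""
    folder = Path(root).joinpath(*path)
    folder.mkdir(parents=True, exist_ok=True)
    if is_place:
        # Stand-in for the shortcut to the GIS dataset; ".gisref" is a made-up suffix.
        (folder / f"{uuid.uuid4()}.gisref").touch()
    return folder


def place_ids(root):
    """Map each GUID found in the tree to the term (folder name) that holds it."""
    return {f.stem: f.parent.name for f in Path(root).rglob("*.gisref")}


with tempfile.TemporaryDirectory() as root:
    add_term(root, "Political", "Oregon", "Clackamas County", is_place=True)
    add_term(root, "Political", "Oregon", "Columbia County", is_place=True)
    add_term(root, "Atmosphere", "Climate", "moisture")

    ids = place_ids(root)
    # Narrower terms of "Oregon" are simply its subfolders.
    narrower = sorted(p.name for p in (Path(root) / "Political" / "Oregon").iterdir())

print(narrower)
print(len(ids))
```

Using the file system this way means any file browser doubles as a vocabulary editor, at the cost of whatever naming restrictions the file system imposes on terms.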

Work in Progress

• Implementing G-Map “lite” – for web browser access to Metadata++ with SVG map viewer (limited GIS capability)

• Implementing G-Map “power user” – for users with GIS software on their desktop

• Testing and refining the Metadata++ system with real documents and users

• Formalizing the Metadata++ model

Questions?