Ontologies for Geographic Information Processing
U. VISSER, H. STUCKENSCHMIDT, G. SCHUSTER & T. VÖGELE
TZI - Center for Computing Technologies, University of Bremen, Universitätsallee 21-23,
D-28359 Bremen, Germany
{visser|heiner|schuster|vogele}@informatik.uni-bremen.de
Received (12/01/1999; Revised 03/20/2000)
Abstract. The development towards open geographical information systems (GIS) and the interoperability between these systems place new requirements on the description of the underlying data. The exchange of data between GIS is problematic and often fails due to confusion about the meaning of concepts. Semantic translators, i.e. translators between GIS and/or catalogue systems which give the user the option to map data between the systems, are a current research topic. This paper gives an overview of formal ontologies and how they can be used for geographical information integration. An architecture for intelligent, semantic-based information retrieval is introduced, and a case study shows how this approach can be used for general purposes.
Keywords: Ontology, Semantic Translation, Intelligent Information Integration, Information Retrieval, Geo-processing
1. Introduction
Information processing in geographical applications is a complex task. The following subsections give a brief overview of the problem domain and of recent activities by various organizations.
1.1. The Problem Domain
Solving a problem in an environmental domain (e.g. where does the high sulfate concentration in the river come from?) involves various data from different areas (e.g. data regarding the river, adjacent waste dumps, ground water flow, etc.). Frequently, the data are not available from one database but are distributed and have different formats (e.g. single data, time series, spatial data with different resolutions).
This requires thorough data preparation before the actual analysis can be accomplished. Recent studies in areas such as data warehousing (Wiener et al., 1996), information integration (Galhardas et al., 1998; Bergamashi et al., 1999) and interoperability between GIS (Vckovski et al., 1999) have addressed these problems. The problem domain is complex, not completely understood, and dynamic. Inventing and developing methods for environmental information systems is a challenging task for both computer and geo-scientists.
How, then, do we integrate the intelligent methods and algorithms computer scientists offer us, and how do we merge these with the knowledge of environmental experts?
1.2. Spatio-Temporal Information
Spatio-temporal information can be described as a window at a certain place and time. This window gives insight into the data, but is only one window in the space-time continuum. The OpenGIS(TM) Consortium (OGC) defines place as a measurable piece of the real world. Time is a point, an interval, or a collection of points and intervals in what we perceive as the time continuum. Time and place can be measured and surveyed, and their coordinates in a particular spatial or temporal reference system can be derived. The consortium uses the term location to cover both place and time (OGC, 1999b). Günther (Günther, 1998) describes properties of environmental data as follows (extract):
1. Complex: spatial data objects often have a complex structure. An object could be represented by a single point or by thousands of polygons.
2. Dynamic: spatial data are dynamic. Insertions and deletions interact with updates.
3. Large: spatial data tend to be large (in terms of the amount of data, e.g. geographical maps).
4. No standard algebra defined: basically this means that there is no standard set of operators.
1.3. Geographic Information Systems
Geographical information systems are essential tools to analyse and visualize spatio-temporal information. Originally developed for the creation of thematic maps, GIS support data capture (e.g. digitizing), data storage (DBMS, spatial DBMS), and data analysis (e.g. combination of spatial and non-spatial data). Recently, the OGC has formulated new requirements for GIS. The objective of the OGC is the full integration of geo-spatial data and geo-processing resources into mainstream computing. Open and interoperable geo-processing, i.e. the ability to share heterogeneous geo-data and geo-processing resources transparently in a networked environment, is the main aim of this organization. The interoperability of GIS imposes new requirements which can be met in two ways. Firstly, the developers of GIS have to come together and define de facto standards. The OGC's abstract specification models are a first (and big!) step in this direction. Secondly, semantic translators (see section 1.4.2) can be developed to define the meaning of concepts. For example, the concept forest in the ATKIS (AdV, 1998) catalogue has different semantics than the concept forest in the CORINE land cover catalogue (EEA, 1997-1999). We will discuss this example later in this paper. Figure 1 gives an overview of the OGC's vision for the future of GIS: the idea is to define 'simple features' and compose these into a customized GIS system.

Fig. 1. Development of GIS: GIS I (GIS as autonomous systems on various platforms), GIS II (GIS uses components), GIS III (simple features as basic functions of various GIS as components), towards a fully componentized GIS.
1.4. Current Activities
As mentioned previously, the activities related to interoperability between GIS fall into two main streams. Firstly, the OGC is working on the standardization of components and has published its first frozen abstract specification (OGC, 1999a). Secondly, the activities around semantic translators are worth mentioning. The OGC's topic 14, Semantics and Information Communities, has not yet been considered in depth; however, a core task force will be working on this topic in the future.
1.4.1. Standardization: The OGC is working on an evolution of traditional GIS solutions in which proprietary data models and monolithic software functions are made interoperable and extensible. Applications which adhere to the objectives of Open GIS are free to access and use various types of distributed data, and to utilize multiple geo-processing tools and services. A formal specification, the Open Geodata Interoperability Specification (OGIS), is currently under development and will define the types and methods necessary to build interoperable systems. At the 2nd International Conference on Interoperating Geographic Information Systems (Vckovski et al., 1999), almost 50% of the contributions were related to the OpenGIS ideas. We can assume that the OGC and its ideas and visions will continue in the near future. The relation between the IT industry, with its standards approach, and the GIS interoperability approach was one of the topics at the above-mentioned conference (Berre, 1999). General problems and solutions for syntactic and semantic interoperability in the context of IT standards, such as ISO RM-ODP, ISO CSMF, CORBA/EJB/COM+, UML, XML, and the European DISGIS Esprit IV project (DISGIS-Project, 1999), which deals with practical experiences regarding the use of ISO/TC211 and OpenGIS interoperability approaches, were discussed.
In summary, standardization is one way to overcome interoperability deficiencies between GIS. However, it is hard to convince vendors to support standardization, and it is a time-consuming task.
1.4.2. Semantic Translation: In order to achieve interoperability between GIS, the data from one system have to be found and integrated into another system. However, this integration process is not always supported by the user's system. Therefore, tools are required to achieve interoperability of data sources. Problems that might arise due to heterogeneity of the data are already well-known within the distributed database systems community (e.g. Kim and Seo, 1991; Saltor and Rodriguez, 1997). In general, these problems can be divided into three categories:
- syntax (e.g. data format heterogeneity),
- structure (e.g. homonyms, synonyms or different attributes in database tables), and
- semantics (e.g. intended meaning of terms in a special context or application).
Syntactic approaches can help to overcome problems that belong to the first two categories. Unfortunately, they do not adequately support the reconciliation of semantic heterogeneity. As environmental science has an interdisciplinary character, environmental information often faces semantic problems. These arise, among other things, from the use of different terminologies established for certain purposes. This type of semantic heterogeneity is already a problem for human experts communicating with each other; it becomes even more challenging when attempting to integrate these terminologies automatically. Furthermore, semantic approaches must address two main problems:
- attaching semantics to information sources and entities, and
- drawing conclusions from the semantic annotations available.
Early approaches to semantic integration were mainly based on the use of thesauri to translate between specific vocabularies. Fulton (Fulton, 1996) defined the term semantic plug and play, an architecture in which the relationships among the data are managed through the models that define that data and the operations performed on it. More recently, semantic enrichment of information has been discovered as a promising application area for well-known AI techniques and methods (Fensel, 1999). In this course of research, more powerful concepts for semantic annotation have been developed. Semantic annotation facilities should become more widely applicable with the further development of the Resource Description Framework RDF (W3C, 1999c), which is intended to become a standard annotation language for semantic information on the world wide web.
With semantic translation, data translation can go beyond the traditional mapping and conversion of geometric primitives. If we look at the term 'semantics' with respect to geographical data, we are referring to the meaning of a concept (e.g. the concept forest in a geographical sense). This is quite different from the term 'semantics' as it is used in programming languages, where semantics determines the exact function of a language.
Commercial and Non-Commercial Tools: Currently, there are a few commercial and non-commercial systems on the market that make use of semantic translation. An example: the Feature Manipulation Engine (FME) (FME, 1999), originally developed for the Canadian Government, is 'emerging as a de facto standard in the industry for sharing geospatial data between diverse applications' (Michael Cosentino, Geospatial Market Development Manager, Sun Systems Inc.). Underlying the engine is a rich data model which is internally consistent and inherently extensible. Constructs within the models of the input or output formats or systems are mapped to constructs in the engine's model. The engine provides a series of methods to carry out model-to-model transformations, applicable to data either on input or output. Cosentino argues that this functionality ensures that neither the data provider nor the data consumer feels constrained; they can use their respective systems any way they wish. FME provides a translation tool through which sophisticated spatial translation operations between various standard GIS data formats can be performed. FME is the core of a number of applications, such as the Geo-Task Server (Huber, 1998).
Other Activities: The German federal/states working group 'environmental information systems' (BLAK UIS) stated at a workshop this year (Bock et al., 1999) that semantic interoperability is required for open environmental systems. It is anticipated that the authorities will perform further work on this topic. One idea to overcome the deficiencies of exchanging or comparing data between GIS and/or between catalogues is to use ontologies. The advantage of ontologies is the existence of formal semantics. This allows ontologies for concepts (such as forest) to be defined for different catalogue systems, and axioms to be defined for the 'translation' between those ontologies.
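As a rough sketch of this idea (in Python; the attributes, thresholds and function name below are invented for illustration and do not reproduce the actual ATKIS or CORINE definitions), such a translation axiom can be expressed as a rule that tests whether an object classified in one catalogue also satisfies the defining conditions of the other:

```python
# Illustrative only: both catalogue definitions below are invented;
# real ATKIS and CORINE 'forest' definitions differ.

# Each catalogue defines 'forest' by its own (hypothetical) conditions.
atkis_forest = {"vegetation": "trees", "min_area_ha": 0.1}
corine_forest = {"vegetation": "trees", "min_area_ha": 25, "min_crown_cover_pct": 30}

def atkis_to_corine_forest(obj):
    """Translation axiom (invented): an ATKIS forest object counts as a
    CORINE forest only if it also satisfies CORINE's stricter conditions."""
    return (obj.get("vegetation") == "trees"
            and obj.get("area_ha", 0) >= corine_forest["min_area_ha"]
            and obj.get("crown_cover_pct", 0) >= corine_forest["min_crown_cover_pct"])

obj = {"vegetation": "trees", "area_ha": 40, "crown_cover_pct": 60}
print(atkis_to_corine_forest(obj))  # True
```

The point of the formal semantics is that such axioms can be stated once, per concept pair, instead of being re-derived by every user who exchanges data between the catalogues.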
2. Ontologies and their Application
The term 'ontology' has been used in many ways and across different communities (Guarino and Giaretta, 1995). If we want to motivate the use of ontologies for geographic information processing, we have to make clear what we have in mind when we refer to ontologies. We mainly follow the description given in (Uschold and Gruninger, 1996). In the following sections we introduce ontologies as an explication of some shared vocabulary or conceptualization of a specific subject matter. We briefly describe the way an ontology explicates concepts and their properties, and argue for the benefit of this explication in different typical application scenarios.
2.1. Shared Vocabularies and Conceptualizations
In general, each person has an individual view of the world. However, there is a common basis of understanding in terms of the language we use to communicate with each other. Terms from natural language can therefore be assumed to be a shared vocabulary relying on a (mostly) common understanding of certain concepts, with only little variety. This common understanding relies on an idea of how the world is organized. We often call this idea a 'conceptualization' of the world. Such conceptualizations provide a terminology that can be used for communication.
The example of natural language already shows that a conceptualization is never universally valid, but holds only for the limited number of persons committing to it. This fact is reflected in the existence of different languages, which differ more (English and Japanese) or less (German and Dutch). Things get even worse when we are dealing not with everyday language but with terminologies developed for special scientific or economic areas. In these cases we often find situations where the same term refers to different phenomena. The use of the term 'ontology' in philosophy and its use in computer science may serve as an example. The consequence is a separation into different groups that share a terminology and its conceptualization. These groups are also called information communities.
The main problem with the use of a shared terminology according to a specific conceptualization of the world is that much information remains implicit. When a mathematician talks about the binomial coefficient (n choose k), he has much more in mind than just the formula itself. He will also think about its interpretation (the number of subsets of a certain size) and its potential uses (e.g. estimating the chance of winning in a lottery). Ontologies set out to overcome the problem of implicit and hidden knowledge by making the conceptualization of a domain (e.g. mathematics) explicit. This corresponds to a popular definition of the term ontology in computer science (Gruber, 1993):
"An ontology is an explicit specification of a conceptualization."
An ontology is used to make the assumptions about the meaning of a term available. It can also be seen as an explication of the context a term is normally used in. Lenat (1998), for example, describes context in terms of twelve independent dimensions that have to be known in order to completely understand a piece of knowledge, and shows how these dimensions can be explicated using the 'Cyc' ontology.
2.2. Specification of Context Knowledge
There are many different ways in which an ontology may explicate a conceptualization and the corresponding context knowledge. The possibilities range from a purely informal natural language description of a term, corresponding to a glossary, up to strictly formal approaches with the expressive power of full first-order predicate logic or even beyond (e.g. Ontolingua (Gruber, 1991)). Jasper and Uschold (1999) distinguish two ways in which the mechanisms for the specification of context knowledge by an ontology can be compared:
Level of Formality: The specification of a conceptualization and its implicit context knowledge can be done at different levels of formality. As already mentioned above, a glossary of terms can also be seen as an ontology despite its purely informal character. A first step towards more formality is to prescribe a structure to be used for the description. A good example of this approach is the new standard web annotation language XML (Bray et al., 1998). XML offers the possibility to define terms and organize them in a simple hierarchy according to the expected structure of the web document. The organization of the terms is called a 'Document Type Definition' (DTD). A DTD is an ontology describing the terminology of a web page at a low level of formality. However, the rather informal character of XML encourages its misuse: while the hierarchy of an XML specification was originally designed to describe layout, it can also be exploited to represent sub-type hierarchies (van Harmelen and Fensel, 1999), which may lead to confusion. This problem can be solved by assigning formal semantics to the structures used for the description of the ontology. An example is the conceptual modeling language CML (Schreiber et al., 1994). CML offers primitives to describe a domain that can be given a formal semantics in terms of first-order logic (Aben, 1993). However, a formalization is only available for the structural part of a specification; assertions about terms and the description of dynamic knowledge are not formalized, leaving total freedom for the description. At the other extreme there are specification languages which are completely formal. A prominent example is the Knowledge Interchange Format KIF (Genesereth and Fikes, 1992), which was designed to enable different knowledge-based systems to exchange knowledge. KIF has been used as a basis for the Ontolingua language (see above), thus giving a formal semantics to that language as well.
Extent of Explication: The other comparison criterion is the extent of explication that is reached by the ontology. This criterion is strongly connected with the expressive power of the specification language used. We already mentioned DTDs, which are mainly a simple hierarchy of terms. We can generalize this by saying that the least expressive specification of an ontology consists of an organization of terms in a network using two-place relations. This idea goes back to the use of semantic networks in the seventies. Many extensions of the basic idea have been proposed. One of the most influential was the use of roles that could be filled by entities of a certain type (Brachman, 1977). This kind of value restriction can still be found in recent approaches; RDF schema descriptions (Brickley et al., 1998), which might become a new standard for the description of web pages, are an example. An RDF schema contains class definitions with associated properties that can be restricted by so-called constraint properties. However, default values and value range descriptions are not expressive enough to cover all possible conceptualizations. More expressive power can be provided by allowing classes to be specified by logical formulas. These formulas can be restricted to a decidable subset of first-order logic; this is the approach of so-called description logics (Borgida and Patel-Schneider, 1994). Nevertheless, there are also approaches allowing for more expressive descriptions. In Ontolingua, for example, classes can be defined by arbitrary KIF expressions. Beyond the expressiveness of full first-order predicate logic there are also special-purpose languages with extended expressiveness to cover specific needs of their application area. Examples are specification languages for knowledge-based systems, often including variants of dynamic logic to describe system dynamics (compare Fensel and van Harmelen, 1994).
2.3. Benefits for Applications
Ontologies are useful for many different applications that can be classified into several areas (Jasper and Uschold, 1999). Each of these areas makes different demands on the level of formality and the extent of explication provided by the ontology. We briefly review common application areas, namely the support of communication processes, the specification of systems and information entities, and the interoperability of computer systems.
Communication: Information communities are useful because they facilitate communication and cooperation among their members through the use of a shared terminology with a well-defined meaning. On the other hand, the formation of information communities makes communication between members of different information communities very difficult, because they do not agree on a common conceptualization. They may use the shared vocabulary of natural language; however, most of the vocabulary used in their information communities is highly specialized and not shared with other communities. This situation demands an explication and explanation of the terminology used. Informal ontologies with a large extent of explication are a good choice to overcome these problems. While definitions have always played an important role in scientific literature, conceptual models of certain domains are rather new. Nowadays, however, systems analysis and related fields such as software engineering rely on conceptual modeling to communicate the structure and details of a problem domain, as well as the proposed solution, between domain experts and engineers. Prominent examples of ontologies used for communication are Entity-Relationship (ER) diagrams (Chen, 1976) and object-oriented modeling languages such as UML (OMG, 1999; Rumbaugh et al., 1991).
Systems Engineering: ER diagrams and UML are not only used for communication; they also serve as construction plans for data and systems, guiding the process of developing the system. The use of ontologies for the description of information and systems has many benefits. The ontology can be used to identify requirements as well as inconsistencies in a chosen design. It can help to acquire or search for available information. Once a system component has been implemented, its specification can be used for maintenance and extension purposes. Another very challenging application of ontology-based specification is the reuse of existing software. In this case the specifying ontology serves as a basis for deciding whether an existing component matches the requirements of a given task.
Depending on the purpose of the specification, ontologies of different formal strength and expressiveness are to be used. While the communication of design decisions and the acquisition of additional information normally benefit from rather informal and expressive ontology representations (often graphical), the directed search for information needs a rather strict specification with a limited vocabulary to limit the computational effort. At the moment, the support of semi-automatic software reuse seems to be one of the most challenging applications of ontologies, because it requires expressive ontologies with a high level of formal strength (see, for example, van Heijst et al., 1997).
Interoperability: The above considerations might give the impression that the benefits of ontologies are limited to systems analysis and design. However, an important application area of ontologies is the integration of existing systems. The ability to exchange information at runtime, also known as interoperability, is an important topic. The attempt to provide interoperability suffers from problems similar to those associated with communication amongst different information communities. The important difference is that the actors are not persons able to perform abstraction and common-sense reasoning about the meaning of terms, but machines. In order to enable machines to understand each other, we also have to explicate the context of each system, but on a much higher level of formality in order to make it machine-understandable (the KIF language was originally defined for the purpose of exchanging knowledge models between different knowledge-based systems). Ontologies are often used as interlinguas to provide interoperability (Uschold and Gruninger, 1996): they serve as a common format for data interchange. Each system that interoperates with other systems has to transfer its information into this common framework. Interoperability is achieved by explicitly considering contextual knowledge in the translation process.
3. Interoperability and Integration of Information Sources
The interoperability of geographical information systems is an important topic in geoinformatics research (compare Vckovski et al., 1999). Many problems have to be solved in order to provide complete interoperability between heterogeneous systems. One of the most basic is the integration of the information used by the different systems. This information often shows significant differences in representation and structure, making integration a challenging task. We distinguish four levels of integration we have to cover in order to provide interoperability:
Technical integration: The World Wide Web provides a well-established infrastructure to exchange large amounts of information from all over the world. Information from web pages and web databases can be accessed in principle.
Syntactic integration: Many standards have evolved that can be used to integrate different information sources. Beside classical database interfaces such as ODBC, web-oriented standards such as HTML and XML gain importance (see www.w3c.org).
Structural integration: The first problem that goes beyond the purely syntactic level is the integration of heterogeneous structures. This problem is solved by mediator systems defining mapping rules between different information structures (Chawathe et al., 1994).
Semantic integration: In the following, we use the terms semantic integration and semantic translation to denote the resolution of semantic conflicts that prevent a one-to-one mapping between concepts or terms. Throughout this paper we will focus on this problem.
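To make the distinction between the structural and the semantic level concrete, consider a toy mediator-style mapping in Python (the schemas, field names and values are invented for illustration). Such rules handle structural heterogeneity, but they cannot resolve semantic conflicts such as differing definitions of forest:

```python
# Toy mediator mapping (invented schemas): structural integration
# renames fields; semantic conflicts (e.g. two catalogues defining
# 'forest' differently) are NOT resolved by rules like these.

mapping_rules = {
    "landuse": "land_cover",   # synonym in the target schema
    "flaeche": "area_ha",      # German attribute name mapped to English
}

def mediate(record):
    """Translate a source record's structure into the target schema."""
    return {mapping_rules.get(key, key): value for key, value in record.items()}

src = {"landuse": "forest", "flaeche": 12.5}
print(mediate(src))  # {'land_cover': 'forest', 'area_ha': 12.5}
```

The mapped record still carries the value 'forest' with the source catalogue's meaning; deciding what that value corresponds to in the target catalogue is precisely the semantic integration problem addressed below.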
We have argued that ontologies can be useful for solving the semantic integration problem (Stuckenschmidt et al., 1999). In the following we present a general approach to semantic translation and discuss the role of ontologies in this approach.
3.1. Semantic Translation as Context Transformation
Our approach to the semantic integration problem is based on the view that each information source serves as a context for the interpretation of the information contained therein. This view implies that an information entity can only be completely understood within its source, unless we find ways to preserve the contextual information in the translation process. This claim has two implications:
1. We have to represent the context of an information entity given by its source.
2. We have to use this contextual information to integrate an entity into the new context given by the target of the translation.
We have shown that the contextual knowledge of an information entity can be represented by necessary and sufficient conditions for deciding whether the entity belongs to a certain class of objects. Using these conditions, the integration of an entity into a new context is equivalent to a classification based on its contextual knowledge (Stuckenschmidt and Visser, 2000). Details of this approach are given below.
3.1.1. Contextual Knowledge: In information sources, contextual knowledge is often hidden in type information. Most information sources are based on a data model describing classes, attributes, and relations. Each entity within the information source is assigned to one of these categories; we will refer to them as 'concepts' in the sequel. Depending on the intended use of the information source, each concept is assumed to serve a special function and to show special properties necessary for that function. Some of these properties will be explicitly contained in the information source, while other properties remain implicit because there is a silent agreement that a property always holds. In order to support semantic translation, we have to explicate these hidden assumptions by defining the necessary and sufficient conditions an information entity has to fulfill in order to belong to a concept.
Necessary Conditions: Concepts are described by a set of necessary conditions in terms of values v_i for some properties p_i. We write p_i^X to denote that the entity X shows property p_i. We claim that there are properties that are characteristic for a concept and can therefore always be observed for instances of that class. We write N_c = {p_1, ..., p_m} to denote that the concept c has necessary conditions p_1, ..., p_m. Assuming that class and property definitions always refer to the same entity X, we get the following equation:

N_c ≡ c(X) ⇒ p_1^X ∧ ... ∧ p_m^X    (1)
Sufficient Conditions: On the other hand, we assume that an entity automatically belongs to the concept c if it shows sufficient characteristic properties. We write S_c = {p_1, ..., p_n} to denote that p_1, ..., p_n are sufficient conditions indicating that X belongs to the concept c. We characterize the class c by the following equation:

S_c ≡ p_1^X ∧ ... ∧ p_n^X ⇒ c(X)    (2)

The distinction between necessary and sufficient conditions for concept membership enables us to identify entities that definitely belong to a concept, because they show all sufficient conditions. On the other hand, we can identify entities that clearly do not belong to the concept, because they do not fulfill the necessary conditions.
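These two membership tests can be sketched directly in code. The following Python fragment is illustrative only; the class name, method names, and the concept and property strings are hypothetical, not part of the formalism above:

```python
# Sketch of membership tests via necessary (eq. 1) and sufficient (eq. 2)
# conditions; concept and property names are invented examples.

class Concept:
    def __init__(self, name, necessary, sufficient):
        self.name = name
        self.necessary = set(necessary)    # N_c: properties every member shows
        self.sufficient = set(sufficient)  # S_c: properties that imply membership

    def definitely_member(self, props):
        # Equation (2): all sufficient conditions observed implies c(X).
        return self.sufficient <= set(props)

    def definitely_not_member(self, props):
        # Contrapositive of equation (1): a missing necessary
        # condition rules out membership.
        return not (self.necessary <= set(props))

forest = Concept("forest",
                 necessary=["has_trees"],
                 sufficient=["has_trees", "canopy_closed", "area_large"])

print(forest.definitely_member({"has_trees", "canopy_closed", "area_large"}))  # True
print(forest.definitely_not_member({"grass_only"}))  # True
```

Note that an entity can satisfy all necessary conditions without showing all sufficient ones; in that case neither test fires, reflecting the gap the distinction deliberately leaves open.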
3.1.2. Context Transformation: Concepts identify common properties of their members by defining necessary conditions for membership. A classification problem is characterized by the determination of membership relations between an object and a set of predefined concepts. The identification process starts with data about the object that has to be classified; these data are provided by so-called observations. During the classification process, the observed data are matched against the necessary conditions provided by the class definitions, leading to one or more classes. The match between observations and membership conditions is performed using knowledge that associates properties of objects with their class. This view of classification can be formalized in the following way (Stefik, 1995):
- Let C be a set of solution classes (in our case, concept predicates {c_1, ..., c_m}).
- Let O be a set of observations (in our case, the necessary conditions for concept membership {N_c | c ∈ C}).
- Let R be a set of classification rules (in our case, the sufficient conditions for class membership {S_c | c ∈ C}).
Then, in principle, a classification task is to find a solution class c_i ∈ C such that

O ∧ R ⇒ c_i(X)    (3)
In terms of the definitions given above, semantic translation is equivalent to a re-classification of entities already classified in a semantic structure C_S = {c_1^S, ..., c_n^S} using another semantic structure C_T = {c_1^T, ..., c_m^T}. The process of re-classification can be based upon the semantic characterizations given by both structures. The source structure provides the observations (O = {N_c | c ∈ C_S}), while solution classes and classification rules are provided by the target structure (C = C_T, R = {S_c | c ∈ C_T}). Using these definitions, a single information entity can be translated from one context into the other by finding a concept definition c_i^T in the target structure satisfying equation 3.
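The re-classification scheme of equations (2) and (3) can be sketched in a few lines of Python. This is an illustrative re-implementation under simplifying assumptions (conditions are reduced to sets of atomic property names; all concept and property names below are invented), not the system described in this paper:

```python
# Sketch of semantic translation as re-classification.
# A concept is characterized by necessary and sufficient conditions,
# each modeled here as a set of atomic property names.

from dataclasses import dataclass

@dataclass
class Concept:
    name: str
    necessary: frozenset = frozenset()   # N_c: properties every member shows
    sufficient: frozenset = frozenset()  # S_c: properties guaranteeing membership

def translate(source_concept: Concept, target: list) -> list:
    """Re-classify an entity known only by its source concept.

    The necessary conditions of the source concept act as the
    observations O; the entity matches a target concept c_i if the
    observations entail its sufficient conditions S_c (equation 3)."""
    observations = source_concept.necessary
    return [c.name for c in target if c.sufficient and c.sufficient <= observations]

# Invented toy structures: a source 'Forest' concept and two target concepts.
forest_src = Concept("Forest", necessary=frozenset({"vegetated", "woody", "large_area"}))
targets = [
    Concept("Mixed-Forest", sufficient=frozenset({"woody", "large_area"})),
    Concept("Urban-Fabric", sufficient=frozenset({"built_up"})),
]
print(translate(forest_src, targets))  # -> ['Mixed-Forest']
```

The subset test `c.sufficient <= observations` is the propositional core of equation (3); a real classifier would of course work on structured conditions rather than flat property sets.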
3.2. Support for the Translation Process
The considerations from the last section provide a theoretical foundation for semantic translation. However, there are still many problems that have to be solved to make this approach function properly. The most important question is how and what kind of context knowledge has to be considered in the translation process, because the choice of representation has a major impact on the classification method to choose and on the expected results. Ontologies can play an important role in the translation process, because their ability to explicate context knowledge provides great support. In the following we analyze the roles different ontologies play in our translation approach and describe how they support the whole process of information integration.
3.2.1. The Role of Ontologies: A closer look at the semantic translation approach described above reveals that different ontologies are used for different purposes within the approach. In order to get clear notions of these different roles we adopt the distinction made in (Jasper and Uschold, 1999). They distinguish three roles an ontology can play in an application scenario, each associated with a level of application:

L0: Operational data
L1: Ontology
L2: Ontology representation language

We will see that each of these roles occurs within our framework. Each role is filled by a different kind of ontology with a different extent of explication, according to the specific requirements.
Operational information that should be translated from one information source to another corresponds to L0. We argued that the real task is to determine the concept an information entity belongs to in a new context, so we translate type annotations rather than the information entity itself. This type information is already an ontology in the sense of an explicit specification of a conceptualization, because we have to describe the concepts we want to translate. As a consequence, we are already concerned with an ontology at the level of operational data. However, this ontology does not show a large extent of explication, because it consists of a set of concept terms arranged in a simple taxonomy.
Specification of contextual knowledge is the basis for the translation of information entities. We use necessary and sufficient conditions for concept membership to specify contextual knowledge. This kind of context explication is a typical application of an ontology. The description of necessary and sufficient conditions is therefore an ontology corresponding to level L1. It shows a larger extent of explication than the pure taxonomy of concept terms, because it explicates the intended meaning of these terms. Each information source to be integrated is supposed to be specified by such an ontology, enabling us to use its contextual knowledge in the translation process.
Properties of concepts defining necessary and sufficient conditions serve as a common vocabulary used to build the ontologies of the different information sources to be integrated. As such, they can be seen as an ontology representation language corresponding to level L2. They have to be shared across all information sources to enable a classifier to check whether conditions are fulfilled. They explicate a common understanding of a basic vocabulary that is necessary to explain and exchange the specialized vocabularies of different information sources. The extent of explication required from an ontology specifying properties largely depends on the complexity of the information to be translated and on the efficiency requirements of the translation. If complex information has to be translated once, more complex property definitions may be used than in the case of simple information that has to be translated in real time.
3.2.2. Process and Supporting Technologies: In order to clarify the use of the different ontologies, we now discuss the process of intelligent information integration that is implied by our approach. The process sketched below describes the actors, supporting tools, and knowledge items (i.e. ontologies) involved. Notice that although the approach described above translates only between two sources at a time, it is not limited to bilateral integration, because we do not use a hard-coded translator but a general classifier that is able to integrate every information source owning a suitable semantic annotation.
Fig. 2. Authoring of Shared Properties (Step 1)
Authoring of Shared Terminology: Our approach relies on the use of a shared terminology in terms of properties used to define different concepts. This shared terminology has to be general enough to be used across all information sources to be integrated, but specific enough to make meaningful definitions possible. Therefore, the shared terminology will normally be built by an independent domain expert who is familiar with typical tasks and problems in a domain, but who is not concerned with a specific information source. As building a domain ontology is a challenging task, sufficient tool support has to be provided to build that ontology. Figure 2 illustrates this process.

A growing number of ontology editors exists (Duineveld et al., 1999). The choice of a tool has to be based on the special needs of the domain to be modeled and on the knowledge of the expert.
Annotation of Information Sources: Once a common vocabulary exists, it can be used to annotate different information sources. In this case, annotation means that the inherent concept hierarchy of an information source is extracted and each concept is described by necessary and sufficient conditions using the terminology built in step one. The result of this annotation process is an ontology of the information source to be integrated.

Fig. 3. Annotation of Information Sources (Step 2)

Fig. 4. Translation by Re-Classification (Step 3)

The annotation will normally be done by the owner of an information source who wants to provide better access to his information. In order to enable the information owner to annotate his information, he has to know about the right vocabulary to use. It will be beneficial to provide tool support for this step as well. We need an annotation tool with different repositories of vocabularies according to different domains of interest. Figure 3 illustrates the annotation step.
In the case study described later, we used the Ontolingua editor (www.ksl.stanford.edu) to build information ontologies from scratch. This is possible as long as the same group of people owns both source and target information. However, in real-life scenarios, information sources are normally completely distributed, making annotation support based on property repositories unavoidable.
Semantic Translation of Information Entities: The only purpose of the steps described above is to lay the foundation for the actual translation step. The existence of ontologies for all information sources to be integrated enables the translator to work on these ontologies instead of treating the real data. This way of using ontologies as surrogates for information sources has already been investigated in the context of information retrieval (Visser and Stuckenschmidt, 1999). In that paper we showed that the search for interesting information can be enhanced by ontologies. Concerning semantic translation, the use of ontologies as surrogates for information sources enables us to restrict translation to the transformation of the type information attached to an information entity, by manipulating the concept terms indicating the type of the entity.
Figure 4 illustrates this manipulation. The new concept term describing the type of an information entity in the target information source is determined automatically by a classifier that uses the ontologies of the source and target structures as classification knowledge. This is possible because both ontologies are based on the same basic vocabulary that was built in the first step of the integration approach.
4. Geographic Information Sources and
Ontologies, an Example
This section gives an example of the geographical information sources we used and their description with Ontolingua. We use two catalogue systems, namely the German ATKIS-OK-250 (AdV, 1998) and the European CORINE land cover catalogue (EEA, 1997-1999). The vegetation ontology will be used for the definition of primitives (e.g. forest-trees, forest-plants, grass).
4.1. Ontologies for ATKIS, CORINE and Vegetation
4.1.1. The ATKIS-OK-250 Catalogue: ATKIS (Amtliches Topographisch-Kartographisches Informationssystem) is an official information system in Germany. It is a project of the head surveying offices of all the German states. The working group offers digital landscape models (e.g. DLM 250, 1:250 000) with detailed documentation in the object catalogue OK-250. This catalogue is the basis for our description. The ontology for our concept forest consists of classes, functions and instances. One class is the following:
;;; ------------------ Classes --------------
;;; Forest
(Define-Okbc-Frame Forest
:Direct-Superclasses (Vegetation-Area)
:Direct-Types (Class Primitive)
:Own-Slots ((Arity 1))
:Sentences
((=> (Forest ?X0)
(Or (Has-Vegetation ?X0 Forest-Plants)
(And (Has-Vegetation ?X0 Grass)
(Is-Cultivated ?X0 1))))
(=> (Forest ?X0)
(> (Size-In-Hectares ?X0) 10)))
:Template-Facets
((Size-In-Hectares (Numeric-Minimum 10))))
We see that the class Forest is a subclass of Vegetation-Area. We also see that there is an internal rule which says that a thing is a forest if it has forest plants or cultivated grass as vegetation. In addition, the size of the area has to be at least 10 hectares.
;;; ------------------ Instance --------------
;;; Stadtwald-1990
(Define-Okbc-Frame Stadtwald-1990
:Direct-Types (Forest)
:Own-Slots ((Is-Cultivated 0)
(Has-Vegetation Forest-Trees)
(Size-In-Hectares 25)))
;;; Weidedamm3-1990
(Define-Okbc-Frame Weidedamm3-1990
:Direct-Types (Forest)
:Own-Slots ((Is-Cultivated 0)
(Has-Vegetation Forest-Plants)
(Size-In-Hectares 12)))
The instances Stadtwald-1990 and Weidedamm3-1990 show the state of two particular areas in 1990. The Stadtwald is not cultivated, has forest trees, and covers an area of 25 hectares, while the Weidedamm3 is not cultivated, has forest plants, and measures twelve hectares.
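Read procedurally, the frame sentences above amount to a simple membership test. The following Python sketch is our own illustrative re-encoding of that rule (the dictionary representation and field names are invented; the paper itself works with OKBC frames and PROLOG):

```python
# Membership test derived from the ATKIS-OK-250 'Forest' frame:
# an entity is a forest if it is more than 10 hectares and either
# has forest plants as vegetation or has cultivated grass.

def is_forest(entity: dict) -> bool:
    if entity.get("size_in_hectares", 0) <= 10:
        return False
    vegetation = entity.get("has_vegetation")
    if vegetation == "forest_plants":
        return True
    return vegetation == "grass" and bool(entity.get("is_cultivated", False))

stadtwald_1990 = {"has_vegetation": "forest_trees", "size_in_hectares": 25,
                  "is_cultivated": False}
weidedamm3_1990 = {"has_vegetation": "forest_plants", "size_in_hectares": 12,
                   "is_cultivated": False}

print(is_forest(weidedamm3_1990))  # True: forest plants, 12 ha
print(is_forest(stadtwald_1990))   # False in this flat sketch: 'forest_trees'
                                   # is not yet known to be a forest plant
```

Note that the second call fails only because this sketch lacks taxonomic knowledge; recognizing forest trees as forest plants is exactly what the vegetation ontology of section 4.1.3 provides.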
4.1.2. CORINE land cover: From 1985 to 1990, the European Commission carried out the CORINE Programme (Co-ordination of Information on the Environment). The results are essentially of three types, corresponding to the three aims of the Programme: (a) an information system on the state of the environment in the European Community has been created (the CORINE system). It is composed of a series of databases describing the environment in the European Community, as well as of databases with background information. (b) Nomenclatures and methodologies were developed for carrying out the programme, which are now used as the reference in the areas concerned at the Community level. (c) A systematic effort was made to coordinate activities with all the bodies involved in the production of environmental information, especially at the international level. As a result of this activity, and indeed of the whole programme, several groups of international scientists have been working together towards agreed targets. They now share a pool of expertise on various themes of environmental information.
This nomenclature with its 44 classes is the basis for our description. In order to demonstrate the hierarchy, we use a tree to describe parts of the ontology (see fig. 6). Here we see that a forest has forest plants and is at least 25 ha in size, because the superclass of Forest is Forests-And-Semi-Natural-Areas and, above that, Area; the minimum size in hectares of an area is 25 (see the facet in the class Area). Please note that according to the CORINE nomenclature, sport and leisure facilities are artificial non-agricultural vegetated areas, which themselves have sod grass as vegetation. Let us define some instances for this ontology:
;;; ------------------ Instance --------------
;;; Pauliner-Marsch
(Define-Okbc-Frame Pauliner-Marsch
:Direct-Types (Sport-And-Leisure-Facilities)
:Own-Slots ((Size-In-Hectares 40)))
;;; Stadtwald-2000
(Define-Okbc-Frame Stadtwald-2000
:Direct-Types (Mixed-Forest)
:Own-Slots ((Size-In-Hectares 25)
(Is-Cultivated 0)
(Has-Vegetation Forest-Trees)))
;;; Weidedamm3-2000
(Define-Okbc-Frame Weidedamm3-2000
:Direct-Types (Discontinuous-Urban-Fabric)
:Own-Slots ((Size-In-Hectares 30)
(Is-Cultivated 1)))
This is the information we would get out of a classified satellite image. There is a Stadtwald-2000 instance, which is of type mixed forest with 25 ha, not cultivated, and with forest trees. There is also the Pauliner Marsch, a sport-and-leisure facility according to CORINE land cover, with 40 ha.
4.1.3. Vegetation: If we want to match or process the knowledge from the ontologies described above, we either have to define another ontology which matches the concepts, or we use a domain ontology which acts as a foundation for the two ontologies (see also figure 6). In this ontology the primitives such as plants, soil type etc. are defined. Please note that we show a part of the ontology as a tree for better understanding (figure 5).

We define forest trees as forest plants, and those as plants. Likewise, sod grass is grass, and grass is plants. Special cultures such as vine or hop are also defined as plants.
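The taxonomic reasoning used throughout the examples (e.g. concluding that a forest tree is a plant) reduces to the transitive closure of the subclass relation. A minimal Python sketch, with the class names taken from the vegetation ontology and the traversal code invented for illustration (each class is assumed to have at most one superclass here):

```python
# Part of the vegetation ontology as a subclass-of table, plus the
# transitive 'a_kind_of' check used in the PROLOG examples.

subclass_of = {
    "forest_trees": "forest_plants",
    "forest_plants": "plants",
    "sod_grass": "grass",
    "grass": "plants",
    "special_culture": "plants",
}

def a_kind_of(sub: str, sup: str) -> bool:
    """True if sub equals sup or reaches sup by following subclass links."""
    while sub is not None:
        if sub == sup:
            return True
        sub = subclass_of.get(sub)
    return False

print(a_kind_of("forest_trees", "plants"))      # True
print(a_kind_of("sod_grass", "forest_plants"))  # False
```

Like the PROLOG predicate traced later, the check first tries equality and then climbs the hierarchy link by link.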
4.2. Flexible Retrieval of Geographic Information
In this section we show how the methods described above can be used for flexible retrieval of geographical information. We mentioned in section 3 how information can be retrieved in general and that ontology-based information retrieval offers benefits for this process. In order to show how this works, we come back to our concept forest within the ATKIS-OK-250 and CORINE land cover catalogues.
The use of ontologies gives us two options: (a) integrated views and (b) verification. An integrated view, from the user's perspective, merges the data between the catalogues. This process can be seen as two layers lying on top of each other. The view needs a third ontology with axioms for the translation process between the concepts. The second option gives users the opportunity to verify ATKIS-OK-250 data with CORINE land cover data, or vice versa.
A query interface (this could be an intelligent dialogue within a GIS) sends its request to an inference engine. The inference engine builds up the actual knowledge base by using the ontologies of the concepts. The interesting part of the whole idea is that the inference engine can infer on the actual knowledge base and is therefore able to
derive new knowledge which can be used for further questions.

Fig. 5. Part of the vegetation ontology

Fig. 6. Part of the CORINE land cover ontology
A typical problem within the planning process of authorities is the use of heterogeneous data categorized according to ATKIS-OK-250 and of CORINE land cover satellite pictures (see figure 7). In order to map or exchange data between these two catalogues, we first look at the ontologies, which are partly described above (ATKIS-OK-250) and in figure 6 (CORINE land cover). As theorem prover we use a PROLOG system, in this case SWI-Prolog (Wielemaker, 1998).
A simple query to an inference engine could be: "What is the superclass of 'sod grass'?" Using a PROLOG system, we would pose the query subclass_Of(sod_Grass, X) and obtain the solution grass. We would get all solutions and the complete class hierarchy if we query subclass_Of(X, Y).

Fig. 7. Deductive Integration of Geographic Information

More complex queries require more
complex representations. We use the following axioms to describe the concept forest. There are two possibilities:

1. The size in hectares must be greater than 10 and the vegetation has to be forest plants. The concept forest plants is defined in the vegetation ontology.
2. The size in hectares must be greater than 10 and the vegetation has to be grass. The concept grass is also defined in the vegetation ontology. In addition, the vegetation has to be cultivated.
We denote the rules in PROLOG syntax:

forest(X) :-
    size_In_Hectares(X, Y), Y > 10,
    has_Vegetation(X, Z),
    a_kind_of(Z, forest_Plants).

forest(X) :-
    size_In_Hectares(X, Y), Y > 10,
    has_Vegetation(X, Z),
    a_kind_of(Z, grass),
    is_Cultivated(X, true).
An example for an integrated view would be the following scenario: the user wants to see the development of the forests within a certain area over the last years. He uses ATKIS-OK-250 data within his GIS and wants to verify the data with current satellite images. He gets classified CORINE land cover data and looks for the equivalent of forest in this catalogue (see fig. 6). The theorem prover derives the answer to this question while building up the KB. The query would be forest(stadtwald_2000). The following shows the path through the KB (traced); an Exit marks a return with a Yes.
Call: forest(stadtwald_2000)
Call: size_In_Hectares(stadtwald_2000, _L144)
Exit: size_In_Hectares(stadtwald_2000, 25)
Call: 25>10
Exit: 25>10
Call: has_Vegetation(stadtwald_2000, _L145)
Exit: has_Vegetation(stadtwald_2000, forest_Trees)
Call: a_kind_of(forest_Trees, forest_Plants)
Call: class(forest_Trees)
Exit: class(forest_Trees)
Call: class(forest_Plants)
Exit: class(forest_Plants)
Call: forest_Trees=forest_Plants
Fail: forest_Trees=forest_Plants
Redo: a_kind_of(forest_Trees, forest_Plants)
Call: subclass_Of(forest_Trees, forest_Plants)
Exit: subclass_Of(forest_Trees, forest_Plants)
Exit: a_kind_of(forest_Trees, forest_Plants)
Exit: forest(stadtwald_2000)
We can see that the theorem prover first checks whether the size is bigger than 10. It knows (through the instance entered by the user) that stadtwald_2000 has forest trees. The system then looks up forest_Trees and concludes that a forest tree is a forest plant. This matches the first PROLOG rule mentioned above, and therefore the answer to the query is Yes. The user now checks whether the area Weidedamm III, which according to the ATKIS-OK-250 catalogue was a forest in 1990, is still a forest. He checks this with the help of current satellite images in CORINE land cover format. The query would be forest(weidedamm3_2000), and the answer can be seen here:
Call: forest(weidedamm3_2000)
Call: size_In_Hectares(weidedamm3_2000, _L144)
Exit: size_In_Hectares(weidedamm3_2000, 30)
Call: 30>10
Exit: 30>10
Call: has_Vegetation(weidedamm3_2000, _L145)
Fail: has_Vegetation(weidedamm3_2000, _L145)
Fail: forest(weidedamm3_2000)
As we see, the query fails because the goal has_Vegetation fails. This is because there is no vegetation in this area anymore; the satellite image was classified as Discontinuous Urban Fabric within the CORINE catalogue. The results presented are not very surprising, because most of the conditions for membership of the concept forest were directly met by the instance 'stadtwald', and the absence of vegetation in the instance 'weidedamm' is also a criterion that is easy to check. Nevertheless, we want to show that the ontological foundation enables us to perform reasoning that produces results that are not obvious and require some additional knowledge. The ATKIS-OK-250 ontology gives two possible definitions of the concept forest. We will use the second one to deduce that the so-called Pauliner Marsch is also a forest according to that ontology. The only information we have is that it is a member of the concept Sport_And_Leisure_Facilities taken from the CORINE land cover ontology. To classify this area as a forest we need background knowledge about the vegetation and cultivation of sport and leisure facilities. This background knowledge can also be specified using PROLOG clauses:
speci�ed using PROLOG clauses:
1. is Cultivated(X, true) :-
2. instance Of(X, Y),
3. a kind of(Y, arti�cial Surfaces).
4. has Vegetation(X, sod Grass) :-
5. instance Of(X, Y),
6. a kind of(Y, sport And Leisure Facilities).
The clauses attach characterizing properties to the concepts artificial_Surfaces and Sport_And_Leisure_Facilities. These properties are inherited by the instances of the subconcepts, thereby completing the properties needed to classify the instance under consideration as a member of the concept forest. The trace of the PROLOG reasoner illustrates this:
Call: forest(pauliner_Marsch)
Call: size_In_Hectares(pauliner_Marsch, _L193)
Exit: size_In_Hectares(pauliner_Marsch, 40)
Call: 40>10
Exit: 40>10
Call: has_Vegetation(pauliner_Marsch, _L194)
Call: instance_Of(pauliner_Marsch, _L206)
Exit: instance_Of(pauliner_Marsch,
sport_And_Leisure_Facilities)
Call: a_kind_of(sport_And_Leisure_Facilities,
sport_And_Leisure_Facilities)
...
Exit: a_kind_of(sport_And_Leisure_Facilities,
sport_And_Leisure_Facilities)
Exit: has_Vegetation(pauliner_Marsch, sod_Grass)
Call: a_kind_of(sod_Grass, grass)
...
Exit: a_kind_of(sod_Grass, grass)
After comparing the size of the instance with the required size, the vegetation is checked. As there is no vegetation defined for the instance, the axiom mentioned above is used to deduce that all instances of that class have sod grass as vegetation. Just as in the examples above, the system is able to identify sod grass as a kind of grass. The fact that the Pauliner Marsch is cultivated is deduced in a similar way, as shown in the trace below.
Call: is_Cultivated(pauliner_Marsch, true)
Call: instance_Of(pauliner_Marsch, _L252)
Exit: instance_Of(pauliner_Marsch,
sport_And_Leisure_Facilities)
Call: a_kind_of(sport_And_Leisure_Facilities,
artificial_Surfaces)
...
Redo: a_kind_of(artificial_Non_Agricultural_
Vegetated_Area,
artificial_Surfaces)
Call: subclass_Of(artificial_Non_Agricultural_
Vegetated_Area,_L276)
Exit: subclass_Of(artificial_Non_Agricultural_
Vegetated_Area,
artificial_Surfaces)
Call: a_kind_of(artificial_Surfaces,
artificial_Surfaces)
...
Exit: a_kind_of(artificial_Surfaces,
artificial_Surfaces)
Exit: a_kind_of(artificial_Non_Agricultural_
Vegetated_Area,
artificial_Surfaces)
Exit: a_kind_of(sport_And_Leisure_Facilities,
artificial_Surfaces)
Exit: is_Cultivated(pauliner_Marsch, true)
Exit: forest(pauliner_Marsch)
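The same default-property derivation can be sketched outside PROLOG. The following Python fragment is an invented re-implementation of the two background-knowledge clauses (the class names follow the CORINE example; the data structures and function names are our own, and each class is assumed to have a single superclass):

```python
# Default slot values derived from class membership, mirroring the two
# background-knowledge clauses: instances of artificial surfaces count
# as cultivated, and sport and leisure facilities have sod grass.

corine_superclasses = {
    "sport_and_leisure_facilities": "artificial_non_agricultural_vegetated_area",
    "artificial_non_agricultural_vegetated_area": "artificial_surfaces",
}

def ancestors(cls):
    """Yield the class itself and every superclass above it."""
    while cls is not None:
        yield cls
        cls = corine_superclasses.get(cls)

def derived_slots(instance_class: str) -> dict:
    """Slots an instance inherits purely from its class membership."""
    slots = {}
    if "artificial_surfaces" in ancestors(instance_class):
        slots["is_cultivated"] = True
    if "sport_and_leisure_facilities" in ancestors(instance_class):
        slots["has_vegetation"] = "sod_grass"
    return slots

print(derived_slots("sport_and_leisure_facilities"))
# -> {'is_cultivated': True, 'has_vegetation': 'sod_grass'}
```

Together with the size slot given in the instance, these derived values are exactly what the second forest rule needs, which is why the PROLOG query above succeeds.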
4.3. Results of Example
This example by no means covers all possibilities of ontology-based information integration. We restricted ourselves to simple taxonomic reasoning using only the second-order predicates class, subclass_Of and instance_Of. One can imagine making use of other concepts, such as range restrictions on slots or mathematical properties of relations.
This example shows that mapping between two catalogue systems can be done by using ontologies for the description of the data. We describe concepts with what we call 'primitives', basic items of concepts, e.g. a tree or grass. The only requirement is that there must be a shared vocabulary, meaning that the same primitives have to be used in the catalogue systems. However, the composition of these primitives can differ.
5. Discussion
In this paper we demonstrated how the use of formal ontologies can enhance intelligent information retrieval. We showed that ontologies with formal semantics can help to generate semantic translators between data sources. There are several ways to translate one data source into another, but the benefits of using underlying ontologies and an additional inference engine with the ability to derive new knowledge are obvious. We outlined the advantages of ontologies and stated that their formal semantics can help to support the semi-automatic translation process between data sources.

We noted that adding new knowledge to an ontology is easier than adding knowledge to semi-structured meta-data. The formal description helps us to find errors (e.g. in units or scales). Ontologies therefore help to improve data quality. Ontologies are also more flexible, because we can use them not only for static information but also for brokering functional components (Benjamins et al., 1999).
In addition, we can think about the integration of other sources. The satellite picture, for instance, could be pre-processed with advanced image operators (e.g. for texture, edges and colors). This additional knowledge could be transferred into a knowledge base semi-automatically and could act as an additional source for potential queries.
We believe that our approach can be seen as a step towards solving the semantic translation problem. Annotating data with domain knowledge means finding, acquiring and representing that knowledge. As this is a time-consuming task which has to be done by domain experts and knowledge engineers, it is difficult to foresee whether this direction will be followed by organizations. However, if scientists are able to provide sufficient tools which are easy to use, this can help to overcome the obstacles.
References
1. M. Aben. Formally specifying re-usable knowledge model components. Knowledge Acquisition Journal, 5:119-141, 1993.
2. AdV. Amtliches Topographisch-Kartographisches Informationssystem ATKIS. Landesvermessungsamt NRW, Bonn, 1998.
3. V.R. Benjamins, B. Wielinga, J. Wielemaker, and D. Fensel. Towards brokering problem-solving knowledge on the internet. In D. Fensel and R. Studer, editors, Knowledge Acquisition, Modeling and Management, volume 1621 of Lecture Notes in Artificial Intelligence. Springer, 1999.
4. Bergamashi, Castano, Vincini, and Beneventano. Intelligent techniques for the extraction and integration of heterogeneous information. In Workshop Intelligent Information Integration, IJCAI 99, Stockholm, Sweden, 1999.
5. A. Berre. The IT standards approach to GIS interoperability. Tutorial T2 of the 2nd International Conference on Interoperating Geographic Information Systems, 1999.
6. M. Bock, K. Greve, and W. Kuhn, editors. Offene Umweltinformationssysteme: Chancen und Möglichkeiten der OpenGIS-Entwicklung im Umweltbereich, volume 7 of IFGI prints, Münster, February 1999. Institut für Geoinformatik, Universität Münster.
7. A. Borgida and P.F. Patel-Schneider. A semantics and complete algorithm for subsumption in the CLASSIC description logic. JAIR, 1:277-308, 1994.
8. R.J. Brachman. What's in a concept: Structural foundations for semantic nets. International Journal of Man-Machine Studies, 9:127-152, 1977.
9. T. Bray, J. Paoli, and C.M. Sperberg-McQueen. Extensible Markup Language (XML) 1.0. Technical Report REC-xml, W3C, 1998.
10. D. Brickley, R. Guha, and A. Layman. Resource Description Framework schema specification. Technical Report PR-rdf-schema, W3C, 1998.
11. S. Chawathe, H. Garcia-Molina, J. Hammer, K. Ireland, Y. Papakonstantinou, J. Ullman, and J. Widom. The TSIMMIS Project: Integration of heterogeneous information sources. In Proceedings of the IPSJ Conference, pages 7-18, 1994.
12. P.P.-S. Chen. The entity-relationship model: toward a unified view of data. ACM Transactions on Database Systems, (1):9-36, 1976.
13. DISGIS-Project. Distributed geographical information systems (DISGIS). White paper, http://www.disgis.com/White 1.htm/, July 1999. (ESPRIT IV 22.084).
14. A.J. Duineveld, R. Stoter, M.R. Weiden, B. Kenepa, and V.R. Benjamins. WonderTools? A comparative study of ontological engineering tools. In Proceedings of the 12th Banff Knowledge Acquisition for Knowledge-Based Systems Workshop [19].
15. EEA. CORINE land cover. Technical guide, European Environmental Agency, ETC/LC, European Topic Centre on Land Cover, 1997-1999.
16. D. Fensel. Intelligent information integration. In Proceedings of the IJCAI'99 Workshop, Stockholm, Sweden, 1999.
17. D. Fensel and F. van Harmelen. A comparison of languages which operationalise and formalise KADS models of expertise. The Knowledge Engineering Review, 9:105-146, 1994.
18. J. Fulton. Semantic plug and play. In Proceedings of the Joint Workshop on Standards for the Use of Models that Define the Data and Processes of Information Systems, Seattle, WA, 1996.
19. B. Gaines, R. Kremer, and M. Musen. Proceedings of the 12th Banff Knowledge Acquisition for Knowledge-Based Systems Workshop. Technical report, University of Calgary / Stanford University, 1999.
20. H. Galhardas, E. Simon, and A. Tomasic. A framework for classifying environmental metadata. In AAAI Workshop on AI and Information Integration, Madison, WI, 1998.
21. M.R. Genesereth and R.E. Fikes. Knowledge Interchange Format version 3.0 reference manual. Report of the Knowledge Systems Laboratory KSL 91-1, Stanford University, 1992.
22. O. Günther. Environmental Information Systems. Springer, Berlin, 1998.
23. Object Management Group. OMG Unified Modeling Language specification, UML v1.3. Document ad/99-06-08, Object Management Group (OMG), 1999.
24. T. Gruber. Ontolingua: A mechanism to support portable ontologies. KSL Report KSL-91-66, Stanford University, 1991.
25. T.R. Gruber. A translation approach to portable ontology specifications. Knowledge Acquisition, 5(2), 1993.
26. N. Guarino and P. Giaretta. Ontologies and knowledge bases: Towards a terminological clarification. In N. Mars, editor, Towards Very Large Knowledge Bases: Knowledge Building and Knowledge Sharing, pages 25-32. Amsterdam, 1995.
27. M. Huber. High-Tech-Entscheidungstrends bei Geodaten-Servern. GeoBit, 98(3):18-20, 1998.
28. R. Jasper and M. Uschold. A framework for understanding and classifying ontology applications. In Proceedings of the 12th Banff Knowledge Acquisition for Knowledge-Based Systems Workshop [19].
29. W. Kim and J. Seo. Classifying schematic and data heterogeneity in multidatabase systems. IEEE Computer, 24(12):12-18, 1991.
30. D.B. Lenat. The dimensions of context space. Available on the website of the Cycorp Corporation (http://www.cyc.com/publications), 1998.
31. FME Ltd. Semantic translation. White paper, http://safe.com/whitepaper o.htm, July 1999.
32. Enrico Motta. Reusable Components for Knowledge
Models. PhD thesis, KMI, The Open University,
United Kingdom, 1997.
33. OGC. The OpenGIS(TM) abstract specification. Technical Report 99-100r1.doc, Open GIS Consortium, 1999. (1999a)
34. OGC. Topic 2: Spatial reference systems. Technical Report 99-100r1.doc, Open GIS Consortium, 1999. (1999b)
35. J. Rumbaugh, M. Blaha, W. Premerlani, F. Eddy, and W. Lorensen. Object-Oriented Modeling and Design. Prentice Hall International, Inc., Englewood Cliffs, New Jersey, 1991.
36. F. Saltor and E. Rodriguez. On intelligent access
to heterogeneous information. In Proceedings of
the 4th Workshop Knowledge Representation Meets
Databases (KRDB '97), Athens, Greece, 1997.
37. A. Th. Schreiber, B. Wielinga, H. Akkermans, W. van de Velde, and A. Anjewierden. CML: The CommonKADS conceptual modeling language. In Steels et al., editors, A Future of Knowledge Acquisition, Proc. 8th European Knowledge Acquisition Workshop (EKAW 94), number 867 in Lecture Notes in Artificial Intelligence. Springer, 1994.
38. M. Stefik. Introduction to Knowledge Systems. Morgan Kaufmann, San Francisco, California, 1995.
39. H. Stuckenschmidt and U. Visser. Semantic translation based on approximate re-classification. In Proceedings of the Workshop on Semantic Approximation, Granularity and Vagueness at KR 2000, 2000. Accepted.
40. H. Stuckenschmidt, U. Visser, G. Schuster, and T. Vögele. Ontologies for geographic information integration. In Visser and Pundt, editors, Proceedings of the Workshop "Intelligent Methods in Environmental Protection: Special Aspects of Processing in Space and Time", 13th International Symposium of Computer Science for Environmental Protection (UI 99), number 5/99 in Research Reports of the Department of Mathematics and Computer Science, University of Bremen. University of Bremen, 1999.
41. M. Uschold and M. Gruninger. Ontologies: Princi-
ples, methods and applications. Knowledge Engineer-
ing Review, 11(2), 1996.
42. F. van Harmelen and D. Fensel. Practical knowledge
representation for the web. In D. Fensel, editor, Pro-
ceedings of the IJCAI'99 Workshop on Intelligent In-
formation Integration, 1999.
43. G. van Heijst, A.T. Schreiber, and B.J. Wielinga. Using explicit ontologies for KBS development. International Journal of Human-Computer Studies, 46(2/3):183-292, 1997.
44. A. Vckovski, K.E. Brassel, and H.-J. Schek, editors. Proceedings of the 2nd International Conference on Interoperating Geographic Information Systems, volume 1580 of Lecture Notes in Computer Science, Zürich, 1999. Springer.
45. U. Visser and H. Stuckenschmidt. Intelligent, location-dependent acquisition and retrieval of environmental information. In M. Rumor, editor, Information Technology in the Service of Local Government Planning and Management. The Urban Data Management Society, 1999.
46. W3C. Resource Description Framework (RDF) Schema specification. http://www.w3.org/TR/PR-rdf-schema, March 1999. W3C Proposed Recommendation.
47. J. Wielemaker. SWI-Prolog 3.1. Reference manual, Univ. of Amsterdam, Dept. of Social Science Informatics (SWI), 1998.
48. J.L. Wiener, H. Gupta, W.J. Labio, Y. Zhuge, H. Garcia-Molina, and J. Widom. WHIPS: A system prototype for warehouse view maintenance. In Workshop on Materialized Views, pages 26-33, Montreal, Canada, 1996.