6
The 9th International Conference on Computer Supported Cooperative Work in Design Proceedings Peer-to-Peer Collaborative Integration of Dynamic Ontologies Juliana Lucas de Rezende', Jairo Francisco de Souza*, Jan0 Moreira de So~za'~* 'UFRJ - Institute of Mathematics 2COPPE/UFW - Computer Science Department, Graduate School of Engineering [email protected], fiairobd,jano)@cos. uJi.jj. br Abstract With the grown avaiiabilily of large and specialized online ontologies, the questions about the integraiion of independently developed ontologies have become even more important. To facilitate the ontology integration process, this paper presents an ontoloa integration support module, that promotes the creation of new ontologies by reusing others. The hypothesis is that the ontology designer will achieve a reduction in the time dedicated to create a new ontology, as well as obtain ontologies with better quuliv. The experimental use of the profolype developed showed evidence that the hypothesis can be confirmed Keywords: Peer-to-Peer Collaboration, Ontology, Collaborative Design, CSCW. 1. Introduction According to Xexeo [I], the design activity has been described as belonging to a class of problems that have no optimal solution, only satisfactory ones. They are complex, usually interdisciplinary in nature and require a group of people to solve it. The need for a number of individuals to work together raises problems in the CSCW domain: the fact that these individuals come from different backgrounds means that they will have difficulty interacting and understanding each other, which may compromise the project. Thus, it becomes necessary to establish a common language so that all parties can better cooperate. In this fashion, knowledge exchange among all involved in the process becomes easier, as does reuse of objects previously created and stored. New technologies, such as high-speed networks, allow real-time cooperative work. [2] In this sense, the peer-to-peer (P2P) approach can greatly contribute to cooperative work. For instance, P2P architecture permits direct interaction between peers, which promotes a more dynamic application. P2P applications are naturaIIy not-centralized, and this fact allows applications to form small cooperation groups without the presence of a central server, which could become stressed by the number of connections. In other words, the P2P architecture offers a more robust and fault-resilient environment for cooperative work. [3] An ontology [4] is a formal specification of concepts and their relationships. By defining a cOmmon vocabulaty, ontologies reduce concept definition mistakes, allowing for shared understanding, communication improvemenf and more detailed description of resources. Ontologies have crossed the borders of philosophy, achieving an important position in Computer Science. Some ontologies have only inheritance relationships, while others have several other relationships, such as, for example: part-of, part-whole, contain, etc. Ontologies are particularly useful as a design tool because they render the communication between people and the interoperability between systems easier. [5] As a design too1, ontologies have been used in specific methodologies to support the development of software FI. 1.1, Motivation To complement the learning process, a system was developed to promote knowledge building, dissemination, and exchange in learning communities. This system is called Knowledge Chains Editor (KCE), and is based on a process for building personal knowledge through the exchanging of knowledge chains (KC). [7] It is implemented over COPPEER, a framework for creating very flexible collaborative P2P applications that provides non-specific collaboration tools as plug-ins. [l] The KC Figure 11 [7] is a structure that was created to organize the knowledge structure and organization. A KC is made up of a header (which contains basic information related to the chain) and a knowledge units (KU) [7] list. Figure 1. Kowledge Organization 1158

Peer-to-peer collaborative integration of dynamic ontologies

Embed Size (px)

Citation preview

The 9th International Conference on Computer Supported Cooperative Work in Design Proceedings

Peer-to-Peer Collaborative Integration of Dynamic Ontologies

Juliana Lucas de Rezende', Jairo Francisco de Souza*, Jan0 Moreira de So~za'~* 'UFRJ - Institute of Mathematics

2COPPE/UFW - Computer Science Department, Graduate School of Engineering [email protected], fiairobd, jano)@cos. uJi.jj. br

Abstract

With the grown avaiiabilily of large and specialized online ontologies, the questions about the integraiion of independently developed ontologies have become even more important. To facilitate the ontology integration process, this paper presents an ontoloa integration support module, that promotes the creation of new ontologies by reusing others. The hypothesis is that the ontology designer will achieve a reduction in the time dedicated to create a new ontology, as well as obtain ontologies with better quuliv. The experimental use of the profolype developed showed evidence that the hypothesis can be confirmed

Keywords: Peer-to-Peer Collaboration, Ontology, Collaborative Design, CSCW.

1. Introduction

According to Xexeo [I], the design activity has been described as belonging to a class of problems that have no optimal solution, only satisfactory ones. They are complex, usually interdisciplinary in nature and require a group of people to solve it.

The need for a number of individuals to work together raises problems in the CSCW domain: the fact that these individuals come from different backgrounds means that they will have difficulty interacting and understanding each other, which may compromise the project. Thus, it becomes necessary to establish a common language so that all parties can better cooperate. In this fashion, knowledge exchange among all involved in the process becomes easier, as does reuse of objects previously created and stored.

New technologies, such as high-speed networks, allow real-time cooperative work. [2] In this sense, the peer-to-peer (P2P) approach can greatly contribute to cooperative work. For instance, P2P architecture permits direct interaction between peers, which promotes a more dynamic application. P2P applications are naturaIIy not-centralized, and this fact allows applications to form small cooperation groups without the presence of a central server, which could become stressed by the number of connections. In other

words, the P2P architecture offers a more robust and fault-resilient environment for cooperative work. [3]

An ontology [4] is a formal specification of concepts and their relationships. By defining a cOmmon vocabulaty, ontologies reduce concept definition mistakes, allowing for shared understanding, communication improvemenf and more detailed description of resources. Ontologies have crossed the borders of philosophy, achieving an important position in Computer Science.

Some ontologies have only inheritance relationships, while others have several other relationships, such as, for example: part-of, part-whole, contain, etc.

Ontologies are particularly useful as a design tool because they render the communication between people and the interoperability between systems easier. [5 ] As a design too1, ontologies have been used in specific methodologies to support the development of software FI.

1.1, Motivation

To complement the learning process, a system was developed to promote knowledge building, dissemination, and exchange in learning communities. This system is called Knowledge Chains Editor (KCE), and is based on a process for building personal knowledge through the exchanging of knowledge chains (KC). [7] It is implemented over COPPEER, a framework for creating very flexible collaborative P2P applications that provides non-specific collaboration tools as plug-ins. [ l ]

The KC Figure 11 [7] is a structure that was created to organize the knowledge structure and organization. A KC is made up of a header (which contains basic information related to the chain) and a knowledge units (KU) [7] list.

Figure 1. Kowledge Organization

1158

The 9th Intemational Conference on Computer Supported Cooperative Work in Design Proceedings

Conceptually, knowledge can be decomposed into smaller knowledge. Such decomposition may occur recursively. For simplification effects, it was considered that there is a basic unit which can be represented as a KU. A KW is a structure formed by an attribute set.

To build his KC, the apprentice can use the KCE. In c u e of questioning, the apprentice must create a KU whose state is “question”. At this moment, the system starts the KU search. It sends messages to other peers and waits for an answer. Each peer is responsible for the internal search. The internal search consists of verifying if there are any KUs similar to the one in the search. All KUs found are retumed to the solicitor. Figure 21

Figure 2. KCE Architecture

An important question is how to compare two KUs and how to say they are similar. Currently, the comparison between two KUs to verify their similarity is carried out in a very simple way: the names and keywords are compared. The use of ontologies cm make that easier, however, there are several problems when one tries to use independently developed ontologies together, or when existing ontologies are adapted for new purposes. Although there is already a lot of research done in this area, there are still many open questions. In this paper, we investigate the problems that may.arise.

This article is divided into 4 sections, the first section being the current introduction. In the second section, we presents the collaborative ontology editor, where the ontology integration support module will be added. In the third section, we describe the ontology integration and how it can support the ontology designer. Finally, we present our conclusions and future works.

2. Collaborative Ontology Editor

In a multidisciplinary group, each person may have his or her own “personal ontology”, but a shared ontology must be constructed for work to be effective. We believe that a collectively designed ontology is more powerful than an imposed one. In the collectively built ontology, each person will have the opportunity to provide input on the shared representation, and an

agreement will have to be reached at some point. Therefore, a tool to support the cooperative development of ontologies will be useful not only to achieve the goal of creating the ontology, but also to heIp the individuals to have a better performance in the project.

Meanwhile, because their use becomes more common with time, in a wide variety of applications, with many projects developing new ontologies, it is possible to encounter two or more different ontologies representing the same or similar knowledge [5]. Taking into account that ontology creation can be a complex and time- consuming process, a tool that allows for cooperative work, enabling sharing and reuse of ontologies, is considered of great utility.

The Collaborative Ontology Editor (COE) is a P2P application designed to allow ontology developers to share their knowledge. i t provides many activities: ontology creation, ontology edition, ontology sharing, ontology reuse, and other traditional P2P mechanisms. It is implemented over COPPEER; therefore, its users can also take advantage of non-specific collaboration tools provided by it, such as an instant messaging (chat) tool and a file exchange tool.

COE provides a visual interface Figure 31 where users can manipulate the ontology in graphical or textual form. It uses’hyperbolic trees [SI as the main abstraction to manipulate an ontology. The user can navigate the ontology (moving nodes to the center of the image), insert, remove, and move nodes (and their corresponding sub-trees).

Figure 3. COE Interface

3. Ontology Integration and User Support

To improve the search in the KCE, we propose the use of ontologies. As previously stated, KCE and COE are implemented over the COPPEER framework, and, for this reason, KCE can make use of the ontologies created in COE. In this case, for every KU created, the user would have the option of associating it to an ontological concept, which would make future searches and searches by other users in the KC that contains this KU, easier.

1159

The 9th International Conference on Computer Supported Cooperative Work in Design Proceedings

However, this solution would bring serious problems. The first problem would be relative to the creation of a domain ontology. It would be impossible to create a domain ontology, as the domain is unpredictable in our context. The. proposed solution would be, instead of creating a central ontology, to create several ontologies.

Another, more serious, problem is that these ontologies have a dynamic behavior, as they are constantly altered, increasing their domain andor granularity, but ontologies are “naturally” static. So, what can be done to make altering ontologies easier? Many different types of alterations are possible, ranging from the addition of one node to the need for ontology integration. An example of integrating necessity occurs when two ontologies from the same area cover different aspects and contain overlapping information. The simpler cases are supported by COE.

This paper proposes to improve the search in KCE through the use of dynamic ontologies. For this, we have specified an ontology integration support module to go with COE.

Before we present the proposed module, it is necessary to define ontology integration. Ontology integration occurs when the user creates a new ontology from two or more existing ontologies with overlapping parts. [9] There are two ways in which that may occur: by merge or union. [lo] An ontology merge occurs when the user creates a new ontology from two or more existing ontologies concerning the same subject, merging them into a single one that ‘t~nifies” all of them. An ontology union occurs when the user builds a new ontology by reusing other available ontologies (assemble, extend, specialize). Examples are shown in Figure 4.

Figure 4. Ontology Integration: Merge and Union

The ontology integration support module is especially concerned with facilitating the creation of ontologies by reusing others (union) and the integration of ontologies rendered through a KC search in KCE with local ontologies (merge).

In order to obtain the ontologies in a KCE search, each KU must have an attribute that contains the ontoIogica1 term, where it is connected to an ontology concept. Each peer has zero or n ontologies, and each KU is associated to only one ontology. When a KU is returned, KCE also r e m s the ontology associated to it.

If the KU is incorporated by the user into his context, it offers the possibility of integrating the associated ontology to the KU returned. The user needs to become the “owner” of the ontology and, thus, he will be qualified to perform alterations on it. If that isn’t the case, the alterations made by the creator of the ontology will be reflected.

According to Klein [lo], mismatches between ontologies are the key type of problems that hinder the integrate of independently developed ontologies. He classified the different types of mismatches between two levels. The first one is the language or meta-model level. This is the level of the language primitives that are used to specify an ontology. Mismatches at this level are mismatches between the mechanisms to define classes, relations and so on. The second level is the ontology or model level, at which the actual ontology of domain lives. A mismatch at this level is a difference in the way the domain is modelled.

Mismatches at the language level occurs when ontologies written in different ontology languages are integrated. In total, are distinguished four vpes of mismatches: syntax, logical representation, semantics of primitives and language expressivity.

Mismatches at the ontology level happens when two or more ontologies, that describe (partly) overlapping domains, are integrated. These mismatches may occur when the ontologies are written in the same language, as well as when they use different languages. These mismatches are categorized as conceptualization and explication. Conceptualization mismatches difference in the way a domain is interpreted. There are two types of conceptualization mismatches: scope, model coverage and granularity. Explication mismatches difference in the way the conceptualization is specified. The first two of them result from explicit choices of the modeler about the style of modeling: paradigm and concept description (modeling conventions). Further, the next two types of differences can be classified as terminologicaf mismatches: synonym terms and homonym terms. Finally, there is a one trivial type of difference, the encoding.

After presenting ontologies mismatches, it is necessary to present the ontologies similarity measures. There are two categories: linguistic (syntactic) similarity and semantic similarity. Linguistic similarity reach the synonym lists and string matching (shared substrings, common prefixes, common sufixes). Semantic similarity reach the structure of the concept, the concept position in the ontology and graph-based analysis.

The integration process can be organized into three phases: find the places in the ontologies where they overlap; relate concepts that are semantically close via equivalence and subsumption relations (aligning); check the consistency, coherency and non-redundancy of the resuIt. [ 111

1160

The 9th International Conference on Computer Supported Cooperative Work in Design Proceedings

T1 o T 2 T1 =T2 D1 =D2 Equivalence Synonymy

D l o o D2 OverLap Overlap

D10 D2 Homonymy D isjointness

D1> D2 Additional IS-A

(intersection)

Due to the difficulty of performing automatic ontology integration, the goal of this research is to create the ontology integration support module, through which the system suggests integrations to the user. The suggestions possess a degree of trustworthiness. To define the suggestions that will be offered in the event of a certain integration, the system must process the function defmed using fuzzy. Due to the fact that the exact definition of this function is extremely complex, we have decided to calculate similarities by comparing the concept attributes and their inheritance relationships. After validation of the prototype, tests will be performed to exchange the comparison function for a fizzy hnction.

Studies carried out in the area of ontology integration offer a few methodologies. There is no consensus regarding a single methodology and, thus, each group of scientists joining a project requiring ontology integration end up formulating a new methodology. [9, 101 Those methodologies have a lot in common, but there is no consensus, which makes it difficult to devise tools to aid in the process of ontology integration. However, more and more attention is being focused on the area, and a few academic systems have emerged, such as PROMPT 1121 (a tool developed at Stanford, which can be used with Protigi [13]), and Chimaera [ll].

PROMPT is an interactive ontology-merging tool. It guides the user through the merging process making suggestions, determining conflicts, and proposing conflict-resolution strategies.

Chimaera is an ontology merging and diagnosis tool developed by the Stanford University Knowledge Systems Laboratory (KSL). It also has diagnostic support for verifying, validating, and critiquing ontologies.

The ontology integration support module proposed here, intends to group all advantages of PROMPT and Chimaera.

The following is an example of ontology merging, in which two different ontologies having the same domain will be integrated. Figure 51

Synonym list : Library (A) -> Library (El) People (A) -> Person (B) Joumal (A) -> Magazine (B) Newspaper (A) ->Newspaper

Periodical-Publication (A) -> Publication (B) Hyponym list:

It is important to note that the “Journal” concept in ontology A became the “Magazine” concept in ontology B, due to the fact that they are synonym concepts. The other modifications are also related to the synonym and hyponym lists.

It is very difficult to put most methodologies into use quickly, especially when you are dealing with lay users. One example of this is the OntoClean [14]

methodology, developed to evaluate ontological decisions and based on formal notions that are general enough to be used by any ontology, regardless of domain. Among those formal notions are: “essence”, “rigidity”, “identity”, and “unity”.

Figure 5. Ontology Merging

According to Kavouras [15], the integration problem can be summarized in’ the table shown below (Table 1). Considering that T is a term (name) and D is a definition (a natural language definition), then a concept C is represented by the function C = (T, D). Different combinations of these two elements result in a set of possible semantic relations between similar concepts.

Table 1. Different combinations of term (T) and

1161

The 9th International Conference on Computer Supported Cooperative Work in Design Proceedings

While, computationally, comparing T1 and T2 doesn't present a high level of difficulty since it is a comparison of strings, comparing D1 and D2 presents an extremely high level of difficulty, especially when the domain isn't properly represented in the ontology.

To solve the problem, we reduced D to D = (d, S , IS-A), where d is the description of the defiition, S is the set of synonyms of the concept, and IS-A are its inheritance relationships. The structure of C is presented below, where IS-A and the synonyms are separated by semicolons:

<concept name=CONCEPD <description>DESCRIPTION</description> <synonym* SYNONYMSdsynonymc <IS-A>is-a4S-A> 4concepP

The folIowing represents an instance of C:

< concept name =CAR> <description~Four-wheeled vehicle that uses

<synonymeAWTOMOBILE; PASSENGER fuel combustion as its source of energy+description>

VEHICLE4synonymP <IS-ArMOTOR VEHICLEG'IS-A> 4 concepe

The similarity calculation should be performed by comparing keywords (except for the stopwords) contained in the synonyms and description, and by calculating the distance from concept C to the first common parent. In other words, let's imagine C has a parent D, E has a parent F. and G has a parent H. Thus, simply by calcuIating the distance, if D, F, and G have the same T (are possibly the same), and C is at a distance of 3 from D, E at 4 from F, and G at 5 from H, it is more likely that C is similar to E than C to G. To clarify the idea, let see the example bellow [Figure 61:

lor% Zebra J L x

Figure 6. Ontology Example

The concept Horse is more likely the concept Zebra than the concept Monkey. It occurs because Horse and Zebra have Quadruped as a common parent (distance by 1) while Horse and Monkey have Animal as a common parent (distance by 2). So, how much lesser is the distance, greater is the similarity degree. This

characteristic can be useful when two similar concepts don't have been specified as synonyms by the user.

4. Conclusion and Future Works

With the grown availability of large and specialized online ontologies, the questions about the integration of independentIy deveIoped ontologies have become even more important. Ontology integration is a complex process which importance has grown because it permits the creation of new ontologies from two or more existing ontologies. With this facility, to create a new ontology, the ontology designer don't need to begin from zero.

In this paper, we analyzed the problems that hinder the combined use of ontologies. These problems are of several kinds and may occur at several levels. This examination is still very general, and will be worked out further in the future.

The most difficult problems are those of conceptual integration. There are a lot of techniques and heuristics for suggesting alignments, but it still is an opened problem. We think that semantic mapping at the model level will remain a task that requires a certain level of human intervention.

This work presented the ontology integration support model that will be integrated to COE, and hopes that this functionality brings facilities to the ontology designer. There are'any systems focuses on the integration ontology problem (some of them were cited here), however, each one focus in one part of the problem. Our target is to congregate the best practices to solve each part of the problem, to completely solve the problem of the integration of ontologies.

The completeness and correctness of the ontolongies should be assured; however, those problems will be dealt with by COPPEER, as they are necessary hnctionalities in any service made available in a P2P environment.

The experimental use of the ontology integration support module showed evidence that when it is used by the ontology designer, the hypothesis that he would achieve a reduction in the time dedicated to create a new ontology, as well as obtain ontologies with better quality was confirmed. In order to evaluate the reach of the objective, experiments aimed at obtaining qualitative and quantitative data that would make possible the verification of the hypothesis under consideration must be carried out.

It is necessary to point out that it is not the goal of this work to ensure the quality of the new ontology. Our goal is to offer support for the ontology integration process.

Due to the fact that this work is still in progress, many future projects are expected to ensue. The first one is to exchange the comparison function for a fuzzy finction.

1162

The 9th International Conference on Computer Supported Cooperative Work in Design Proceedings

The problems listed are mismatches between ontologies, however, mismatches are not the only problems that have to be solved when one want to use several ontologies together for one task. As changes to ontologies are inevitable in an open domain, it becomes very important to keep track of the changes and of its impact on the dependencies of that ontology. It is often not practically possible to synchronize the changes of an ontology with the revisions to the applications and datasources that use them. Therefore, a versioning method is needed to handle revisions of ontologies and the impact on existing sources. In some sense, the versioning problem can also be regarded as a derivation of ontology integration; it results from changes (possibly required by integration tasks) to individual ontologies.

The central question that a versioning scheme answers is: how to reuse existing ontologies in new situations, without invalidating the existing ones. A versioning scheme provides ways to disambiguate the interpretation of concepts for users of the ontology revisions, and it makes the compatibility of the revisions explicit. To improve COE, ontology versioning can be added to it.

5. Acknowledgements

This work was partially supported by CAPES and Wq.

6. References

[l] G. Xexeo, A. Vivacqua, J.M. de Souza, et al, “Peer-to- Peer Collaborative Editing of Ontologies”, Preceedings ofSth InfernationaI Conference on CSCW in Design, Xiamen, China, 2004. [2] F. Fluckiger. “Understanding Networked Multimedia, Applications and Technology”, Prentice Hall, Hemel Hemstead, England, 1995.

[3] Y. Chawathe, et al, “Making Gnutella-like PZP Systems Scalable”, Proceedings of ACM SIGCOMM, University of Karlsrehe, Germany, 2003. [4] T.R. Gruber, ‘‘Towad Principles for the Design of Ontologies Used for Knowledge Sharing”, Internafionnl Workshop on Formal Ontology, 1993. [5] T.R. Gruber, “A Translation Approach to Portable Ontologies”, 1993, Knowledge Acquisifion, 5 (2): 199-220. [6] F.O. Zlot, R.K.M., Rochq “Modeling Task Knowledge to Support Software Development”, Proceedings of the 14th International Conference on Sofwore Engeneering and Knowledge Engeneering, Ischia, Italy, 2002, pp 35-42. 171 S.L. de Rezende, R.L.S. da Silva, J.M. de Souza, M. Ramirez, “Building Personal Knowledge through Exchanging Knowledge Chains”, Preceedings of IADIS lntematiunal Conference on WBC, Algarve, Portugal, Fevereiro, 2005. [8] HyperTree Java Library - available in http://hypert.ree.sourceforge.net, December, 2004. [9] S. Pinto, et al, “Some Issues on Ontology Integration”, IJCAI-99, in Proceedings of the Workshop on Ontologies and Problem Solving Methodr, Stocholm, Sweden, August 1999. [lo] M. Klein, Combining and relating ontologies: an analysis of problems and solutions, IJCAI 2001, Workshop on Ontologies and Information Sharing, 200 1. [ I l l D.L. McGuimess, et al, “An environment for merging and testing large ontologies”, in KR2000, Principles of Knowledge Represenfation and Reasoning, San Francisco,

1121 N.F. Noy, M.A. Musen, “PROMPT: Algorithm and tool for automated ontology merging and alignment”, in AAAI- 2000, Proceedings of !he 7th Naiional Conference on ArtiJiciul Intelligence, AAAI/MIT Press, Austin, TX, 2000. [I31 Protdgb Project, Stanford University - available in http://protege.stanford.edu/index.h~, December, 2004. [14] N. Guarino, C. Welty, “Evaluating ontological decisions with OntoClean”. ACM Special Issue: Ontologv Applications and Design, vol. 45, issue 2, pp 61-65, February 2002. [I51 M. Kavouras, “Unified Ontological Framework for Semantic Integration”, Workshop on Nexf Generation Geospatial Information, Boston, October, 2003.

2000, pp 483-493.

1163