Knowledge Discovery in Ontology Learning A survey

Outline

• Introduction

• OL Data Input

• OL Application Fields

• OL Methods

• OL Tools (practical session)

Introduction

• Ontology Engineering is a time-consuming task

• Ontology Learning (OL) is the semi-automatic process supporting ontology engineering

• OL is a bottom-up, data-driven process

• OL is an interdisciplinary field

OL Data Input

• Pure NL text

• Ontologies

• KB (DB) instances

• Schemata

– DB schemata

– Web schemata

• Log files

OL Application Fields

• OL can support Ontology Engineering (and management) in different phases.

– Ontology extraction: based on some input, the ontology engineer gets an ontology proposal.

– Ontology reuse: pruning existing domain ontologies for a specific application.

– Ontology interoperability (multiple ontology management): mapping discovery.

OL Methods (outline)

• Ontology Extraction (from text)

– Weak ontology notion

    • Document Ontology extraction

– Strong ontology notion

    • Association rules

    • Conceptual clustering

• Ontology Reuse

– Ontology Pruning

• Ontology Learning for interoperability

Document Ontology extraction (1)

• Extraction of concepts from a set of documents and identification of relationships between these concepts with different individual terms [3]

• No semantic relations extraction

• Only concepts extraction (aggregation of terms identified with the same concept)

• Use of statistical analysis over a set of documents

• Good for domain specific applications

Document Ontology extraction (2)

• Input (text documents)

• Pre-processing

• Normalization

• LSI (using SVD)

• Document Ontology Construction

Document Ontology extraction (3)

Singular Value Decomposition:

$A_{m \times n} = U_{m \times r} \, S_{r \times r} \, V^{T}_{r \times n}$

[Figure labels: Documents, Concepts]
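As a rough illustration of the decomposition, the sketch below builds a small term-document count matrix and extracts only the dominant singular triple by power iteration. Power iteration is a simplification chosen here to stay within the Python standard library; it is not the full SVD used for LSI in [3]. The function names and the toy corpus are hypothetical.

```python
import math
from collections import Counter


def term_document_matrix(docs):
    """Build a terms x documents count matrix from whitespace-tokenized text."""
    counts = [Counter(doc.lower().split()) for doc in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[float(c[term]) for c in counts] for term in vocab]
    return vocab, matrix


def dominant_singular_triple(matrix, iterations=200):
    """Power iteration on A A^T: returns (sigma, u, v) for the largest singular value."""
    m, n = len(matrix), len(matrix[0])
    u = [1.0] * m
    for _ in range(iterations):
        v = [sum(matrix[i][j] * u[i] for i in range(m)) for j in range(n)]  # A^T u
        u = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(m)]  # A v
        norm = math.sqrt(sum(x * x for x in u))
        u = [x / norm for x in u]
    v = [sum(matrix[i][j] * u[i] for i in range(m)) for j in range(n)]
    sigma = math.sqrt(sum(x * x for x in v))
    v = [x / sigma for x in v]
    return sigma, u, v


# Hypothetical toy corpus: u weights the terms, v weights the documents.
docs = ["ontology learning from text", "ontology pruning", "text clustering"]
vocab, A = term_document_matrix(docs)
sigma, u, v = dominant_singular_triple(A)
```

The vector `u` can be read as the term weights of the strongest latent "concept" and `v` as the weight of that concept in each document; a full LSI step would keep the first r such triples rather than one.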

Association Rules (1)

• Make use of shallow text processing techniques [6]

• No taxonomic relation

• Assumption: syntactic relations ⇒ semantic relations

Association Rules (2)

• Preprocess the text documents

– Morphological analysis

– Recognition of named entities

– Retrieval of domain specific concepts (if available)

– Disambiguation using context information

• Determine the Concept Pairs set (CP) using several heuristics (either general or domain dependent)

– NP-PP heuristic

– Sentence heuristic

– Title heuristic
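A minimal sketch of how a Concept Pairs set might be collected, assuming the sentence heuristic simply pairs every two known concepts that co-occur in one sentence. That reading, the naive sentence splitter, and the names `concept_pairs_by_sentence` and `lexicon` are assumptions made for illustration, not the procedure of [6].

```python
import re
from itertools import combinations


def concept_pairs_by_sentence(text, lexicon):
    """Return the set of unordered concept pairs that co-occur in a sentence.

    `lexicon` (hypothetical) maps a lower-case surface word to a concept name.
    """
    pairs = set()
    for sentence in re.split(r"[.!?]+", text):
        words = re.findall(r"[a-z]+", sentence.lower())
        # Sort so that each unordered pair has one canonical form.
        concepts = sorted({lexicon[w] for w in words if w in lexicon})
        pairs.update(combinations(concepts, 2))
    return pairs


# Hypothetical lexicon and text.
lexicon = {"hotel": "Hotel", "beach": "Beach", "city": "City"}
text = "The hotel is near the beach. The city is large."
cp = concept_pairs_by_sentence(text, lexicon)
```

A real pipeline would run this on the output of the preprocessing steps (morphological analysis, named entity recognition, disambiguation) rather than on raw lower-cased words.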

Association Rules (3)

• Determine $T = \{\{a_{i,1}, \dots, a_{i,n}\} \mid (a_{i,1}, a_{i,2}) \in CP \land \forall m > 2: ((a_{i,1}, a_{i,m}) \in H \lor (a_{i,2}, a_{i,m}) \in H)\}$

• Determine support and confidence for all association rules $X_k \Rightarrow Y_k$, where $|X_k| = |Y_k| = 1$

• Propose to the user only the rules that exceed user-defined thresholds

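$\mathrm{support}(X_k \Rightarrow Y_k) = \frac{|\{t_i \mid X_k \cup Y_k \subseteq t_i\}|}{n}$, where $n$ is the number of transactions in $T$

$\mathrm{confidence}(X_k \Rightarrow Y_k) = \frac{|\{t_i \mid X_k \cup Y_k \subseteq t_i\}|}{|\{t_i \mid X_k \subseteq t_i\}|}$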
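The two measures can be sketched for single-item rules as follows, using the standard definitions of support and confidence. The transactions and thresholds are hypothetical toy values, and `propose_rules` is an illustrative name, not part of any tool mentioned here.

```python
from itertools import permutations


def support(transactions, x, y):
    """Fraction of all transactions that contain both x and y."""
    return sum(1 for t in transactions if x in t and y in t) / len(transactions)


def confidence(transactions, x, y):
    """Among the transactions containing x, the fraction that also contain y."""
    with_x = [t for t in transactions if x in t]
    return sum(1 for t in with_x if y in t) / len(with_x) if with_x else 0.0


def propose_rules(transactions, min_support, min_confidence):
    """Return the rules (x, y) whose support and confidence exceed both thresholds."""
    items = sorted(set().union(*transactions))
    return [
        (x, y)
        for x, y in permutations(items, 2)
        if support(transactions, x, y) > min_support
        and confidence(transactions, x, y) > min_confidence
    ]


# Hypothetical transactions: each one is a set of concepts.
TRANSACTIONS = [
    {"hotel", "area"},
    {"hotel", "area", "room"},
    {"hotel", "room"},
    {"area", "city"},
]
rules = propose_rules(TRANSACTIONS, min_support=0.4, min_confidence=0.6)
```

Support is symmetric in x and y, while confidence is not, which is why both directions of a pair are tested separately.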

Conceptual Clustering (1)

• Use of a conceptual clustering approach [2,5] to extract a hierarchy of concepts and to learn subcategorization frames

• In our case, the examples to cluster are sets of words, each associated with the frequency of the corresponding instantiated frame in the corpora

• A syntactic parser provides parsed sentences including attachments of noun phrases to verbs and clauses

• Unambiguously parsed sentences are not a requirement; noise is taken into account

• The meaning of the concepts of the ontology is characterized by the subcategorization frames they appear in

Conceptual Clustering (2)

E.g.:

<to travel> <subject: father> <by: car>
<to travel> <subject: neighbor> <by: train>
<to drive> <subject: friend> <by: car>
<to drive> <subject: colleague> <by: motor-bike>
<to drive> <subject: friend> <by: motor-bike>
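The example frames can be turned into starting clusters with a few lines of code. This is a sketch under the assumption that a starting cluster collects the head words seen with the same verb in the same role, together with their frequencies; the triple encoding and the name `basic_clusters` are hypothetical.

```python
from collections import Counter, defaultdict

# The five example frames, flattened into (verb, role, head word) triples.
FRAMES = [
    ("to travel", "subject", "father"), ("to travel", "by", "car"),
    ("to travel", "subject", "neighbor"), ("to travel", "by", "train"),
    ("to drive", "subject", "friend"), ("to drive", "by", "car"),
    ("to drive", "subject", "colleague"), ("to drive", "by", "motor-bike"),
    ("to drive", "subject", "friend"), ("to drive", "by", "motor-bike"),
]


def basic_clusters(frames):
    """Group head words by (verb, role) and count how often each one occurs."""
    clusters = defaultdict(Counter)
    for verb, role, word in frames:
        clusters[(verb, role)][word] += 1
    return dict(clusters)


clusters = basic_clusters(FRAMES)
```

Here `clusters[("to drive", "by")]` holds the words `car` and `motor-bike` with their counts, which is the kind of word set with frequencies that the clustering step works on.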

Conceptual Clustering (3)

C1: to cook in

oven (4), stew pan (12), frying pan (2)

C2: to put in

oven (5), stew pan (3), wok (6), pan (2)

Clusters which have a maximum overlap (that is, clusters which contain the same words with the same frequencies) have to be merged.
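To make the merging criterion concrete, the sketch below scores the overlap of two clusters and merges them. The overlap score used here is a weighted Jaccard ratio, which is an assumption chosen for illustration and not the distance defined in [2]; the cluster contents follow one reading of the cooking example (three words under "to cook in", four under "to put in").

```python
from collections import Counter


def overlap(c1, c2):
    """Weighted Jaccard overlap of two Counters: shared over total frequency mass."""
    words = set(c1) | set(c2)
    shared = sum(min(c1[w], c2[w]) for w in words)
    total = sum(max(c1[w], c2[w]) for w in words)
    return shared / total if total else 0.0


def merge(c1, c2):
    """Merge two clusters by adding up the frequencies of their words."""
    return c1 + c2


C1 = Counter({"oven": 4, "stew pan": 12, "frying pan": 2})       # to cook in
C2 = Counter({"oven": 5, "stew pan": 3, "wok": 6, "pan": 2})     # to put in

score = overlap(C1, C2)
merged = merge(C1, C2)
```

With this score, two clusters that contain exactly the same words with the same frequencies reach the maximum value of 1, and clusters with no word in common score 0.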

Ontology Pruning

• Ontology pruning is a data-driven means to reuse existing (general) ontologies in order to tailor them to a certain domain [4]

• The approach uses data-oriented techniques that are based on word/concept frequencies

• The idea is to compare the frequencies of words/concepts in two different corpora, one domain-specific and one generic

• Words/concepts whose frequency in the domain-specific corpus exceeds their frequency in the generic corpus by a certain percentage are accepted; the others are rejected
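The frequency comparison can be sketched as below. The use of relative frequencies, the ratio threshold `min_ratio`, and the toy corpora are assumptions for illustration; [4] may define the acceptance criterion differently.

```python
from collections import Counter


def relative_frequencies(tokens):
    """Map each word to its share of all tokens in the corpus."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def prune(domain_tokens, generic_tokens, min_ratio=2.0):
    """Keep the words whose relative frequency in the domain corpus is at least
    `min_ratio` times their relative frequency in the generic corpus."""
    domain = relative_frequencies(domain_tokens)
    generic = relative_frequencies(generic_tokens)
    return {
        word
        for word, freq in domain.items()
        if freq >= min_ratio * generic.get(word, 0.0)
    }


# Hypothetical toy corpora.
domain_corpus = "hotel room hotel booking the the".split()
generic_corpus = "the the the the house room".split()
accepted = prune(domain_corpus, generic_corpus)
```

Words that never occur in the generic corpus are always accepted in this sketch, since their generic frequency is taken as zero.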

OL for Interoperability (1)

• The key challenge here is to find semantic mappings between similar elements from two ontologies [1]

• First problem: how can we define a meaningful similarity measure?

• Second problem: how can we compute such a measure using the available data?

• The assumption here is that instances are available that can be used to learn concepts

OL for Interoperability (2)

• Similarity Measure

– Many definitions are possible (it is task dependent)

– Many similarity measures are based on the joint probability distribution: P(A, B), P(¬A, B), P(A, ¬B), P(¬A, ¬B)

– Jaccard coefficient: $JC(A,B) = \frac{P(A \cap B)}{P(A \cup B)} = \frac{P(A,B)}{P(A,B) + P(\neg A,B) + P(A,\neg B)}$
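As a small worked illustration, the Jaccard coefficient can be computed directly from three entries of the joint distribution. The function name and the handling of an empty union (returning 0) are choices made for this sketch.

```python
def jaccard(p_ab, p_not_a_b, p_a_not_b):
    """JC(A, B) = P(A, B) / (P(A, B) + P(not A, B) + P(A, not B))."""
    union = p_ab + p_not_a_b + p_a_not_b
    return p_ab / union if union > 0 else 0.0


# Hypothetical probabilities: A and B overlap on half of their union.
jc = jaccard(p_ab=0.2, p_not_a_b=0.1, p_a_not_b=0.1)
```

The coefficient is 1 when A and B cover exactly the same instances and 0 when they are disjoint; P(¬A, ¬B) does not enter the formula.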

OL for Interoperability (3)

• Distribution estimator

– We assume that we have a set of instances that is representative of the universe covered by the ontology

– $N(U_i^{A,B})$ is the number of instances of the $i$-th ontology that belong to both A and B

– $P(A,B) = \frac{N(U_1^{A,B}) + N(U_2^{A,B})}{N(U_1) + N(U_2)}$

– Problem: what if A and B do not belong to the same ontology? (because this is our case!)
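The estimator amounts to a ratio of instance counts pooled over both ontologies. A minimal sketch, assuming the four counts are already known (the parameter names and numbers are hypothetical):

```python
def estimate_joint(n_u1_ab, n_u2_ab, n_u1, n_u2):
    """Estimate P(A, B) as [N(U1^{A,B}) + N(U2^{A,B})] / [N(U1) + N(U2)]."""
    return (n_u1_ab + n_u2_ab) / (n_u1 + n_u2)


# Hypothetical counts: 3 of 7 instances in ontology 1 and 2 of 6 in ontology 2
# belong to both A and B.
p_ab = estimate_joint(n_u1_ab=3, n_u2_ab=2, n_u1=7, n_u2=6)
```

The same function applies to the other three entries of the joint distribution by passing the counts for the corresponding subsets.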

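OL for Interoperability (4)

[Figure: a learner L is trained on the instances t1, ..., t7 of the first taxonomy (node labels R, E, F) and the trained learner is then applied to the instances s1, ..., s6 of the second taxonomy (node labels I, J), which splits them into the sets $U_2^{A,B}$, $U_2^{A,\neg B}$, $U_2^{\neg A,B}$ and $U_2^{\neg A,\neg B}$.]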
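One way to read this step: membership in B is known for the instances of the second ontology, membership in A is predicted by the trained learner, and the two together give the four subsets of U2. The sketch below assumes this reading; `learner` and `in_b` are hypothetical callables, and the keyword test stands in for a real trained classifier.

```python
def partition_by_learner(instances, in_b, learner):
    """Split instances of the second ontology into the four U2 subsets.

    `in_b(x)` tells whether x is an instance of B (known from its own ontology);
    `learner(x)` predicts whether x would be an instance of A.
    """
    parts = {
        ("A", "B"): [], ("A", "not B"): [],
        ("not A", "B"): [], ("not A", "not B"): [],
    }
    for x in instances:
        key = ("A" if learner(x) else "not A", "B" if in_b(x) else "not B")
        parts[key].append(x)
    return parts


# Hypothetical instances of the second ontology, each with its known B label.
labelled = {"grand hotel": True, "city hostel": True, "beach bar": False}
parts = partition_by_learner(
    instances=list(labelled),
    in_b=lambda x: labelled[x],
    learner=lambda x: "hotel" in x,   # stand-in for the trained learner L
)
```

The sizes of the four lists give the counts for the second ontology; swapping the roles of the two ontologies gives the corresponding counts for the first.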

OL Tools (KAON)

• http://kaon.semanticweb.org

• Open Source

• Java based

• Implements a modular framework

• Text2Onto, module for OL from text (association rules, see Association Rules (1))

• Ontology Pruning implemented (simple filter on TF)

References

[1] A. Doan, J. Madhavan, P. Domingos, A. Halevy. Learning to map between ontologies on the Semantic Web. In Proc. of the 11th International World Wide Web Conference (WWW 2002), Hawaii, USA, May 2002.

[2] D. Faure, C. Nedellec. A corpus-based conceptual clustering method for verb frames and ontology acquisition. In 1st International Conference on Language Resources and Evaluation, Workshop on Adapting lexical and corpus resources to sublanguages and applications, Granada, Spain, pages 1--8, 1998.

[3] G. R. Maddi, C. S. Velvadapu, S. Srivastava, J. Gil de Lamadrid. Ontology Extraction from text documents by Singular Value Decomposition.

[4] A. Maedche, R. Volz, R. Studer, B. Lauser. Pruning-based identification of a domain in ontologies. In Proc. of I-KNOW'03, Graz, Austria, July 2003.

[5] A. Maedche, V. Zacharias. Ontology-based Instance Clustering. In Proc. of ECML/PKDD. Springer, 2002.

[6] A. Maedche, S. Staab. Discovering Conceptual Relations from Text. In Proc. of ECAI-2000.
