© Paul Buitelaar – EON@ISWC07, November 2007, Busan, South-Korea Evaluating Ontology Search Towards Benchmarking in Ontology Search Paul Buitelaar, Thomas Eigner Competence Center Semantic Web & Language Technology Lab DFKI GmbH Saarbrücken, Germany

Evaluating Ontology Search

Towards Benchmarking in Ontology Search

Paul Buitelaar, Thomas Eigner

Competence Center Semantic Web & Language Technology Lab
DFKI GmbH
Saarbrücken, Germany

Overview

Ontology Search
  Knowledge reuse (integration with Ontology Learning)

OntoSelect
  Browse (ontologies, labels, classes, properties)
  Search by topic

Evaluating Ontology Search
  Benchmark (evaluation) data set
  Experiment (compare SWOOGLE, OntoSelect)

Conclusions

Ontology Search

There are more and more ontologies published on the (Semantic) Web
  Available as RDFS or OWL files (also still DAML)

Opens up possibilities for reuse of knowledge
  Access through ontology search engines and/or (manual/automatic) organization in ontology libraries

But: increasingly harder to find the right one for your application
  Increasing research in ontology search/selection (Alani et al., Buitelaar et al., Ding et al., Sabou et al.) – SWOOGLE, OntoSelect, Watson

OntoSelect

Ontology Library and Search Engine
  http://olp.dfki.de/OntoSelect

Monitors the web for ontologies with automatic harvesting and indexing

Browse and search
  On ontologies, classes, properties and (multilingual) labels
  Ontology search integrates relevance feedback over Wikipedia for the search term

Ontology publishing
  Submit ontologies - will be automatically integrated

Statistics
  On formats, languages, labels used, ontology publishing

Paul Buitelaar, Thomas Eigner, Thierry Declerck. OntoSelect: A Dynamic Ontology Library with Support for Ontology Selection. In: Proc. of the Demo Session at the International Semantic Web Conference, Hiroshima, Japan, Nov. 2004.

OntoSelect – Browse

Ontology Search

Keyword as Wikipedia Topic

Keyword Expansion (Extraction)

Relevance Feedback from Wikipedia

Ranked Results (Browsable)

Search Criteria

Relevance criteria address ontology content, structure, status:

Coverage - Term Matching
  How many of the terms in a text collection are covered by labels for classes and properties?

Structure - Properties Relative to Classes
  How detailed is the knowledge structure that the ontology represents?

Connectedness - Number of Included Ontologies
  Is the ontology connected to other ontologies and how well established are these?
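
The three criteria above could be combined into a single ranking score roughly as follows. This is only a minimal sketch: the slides do not give the formulas OntoSelect uses, so the `OntologyStats` record, the three scoring functions, and the weighted sum with equal default weights are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Iterable, Set, Tuple


@dataclass
class OntologyStats:
    """Hypothetical summary of one indexed ontology (not OntoSelect's schema)."""
    labels: Set[str]      # lower-cased class and property labels
    num_classes: int
    num_properties: int
    num_included: int     # number of included (imported) ontologies


def coverage(stats: OntologyStats, terms: Iterable[str]) -> float:
    """Coverage: fraction of the given terms matched by a class/property label."""
    terms = list(terms)
    if not terms:
        return 0.0
    matched = sum(1 for term in terms if term.lower() in stats.labels)
    return matched / len(terms)


def structure(stats: OntologyStats) -> float:
    """Structure: number of properties relative to number of classes."""
    if stats.num_classes == 0:
        return 0.0
    return stats.num_properties / stats.num_classes


def connectedness(stats: OntologyStats) -> float:
    """Connectedness: here simply the count of included ontologies."""
    return float(stats.num_included)


def relevance(stats: OntologyStats, terms: Iterable[str],
              weights: Tuple[float, float, float] = (1.0, 1.0, 1.0)) -> float:
    """Assumed combination: a weighted sum of the three criteria."""
    w_cov, w_str, w_con = weights
    return (w_cov * coverage(stats, terms)
            + w_str * structure(stats)
            + w_con * connectedness(stats))
```

Changing the `weights` tuple corresponds to the "weighting of relevance criteria" that the experiment varies; a real system would probably also normalise the three values to a common scale before summing them.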

Evaluation – Benchmark

Benchmark: 15 Wikipedia topics and 57 manually assigned ontologies out of 1056 cached through OntoSelect

15 Wikipedia topics were selected out of the set of all (37284) class/property labels in OntoSelect, by:
  Filtering out labels that did not correspond to a Wikipedia page > 5658 labels / topics
  5658 labels were used as search terms in SWOOGLE to filter out labels that returned fewer than 10 ontologies (out of the 1056 in OntoSelect) > 3084 labels / topics
  Out of 3084 labels we manually selected useful topics, e.g. we left out very short labels (‘v’) and very abstract ones (‘thing’) > 50 topics
  We randomly selected 15 for which we manually checked the ontologies retrieved from OntoSelect and SWOOGLE > 15 topics with 57 assigned ontologies
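
The selection funnel above can be sketched as a chain of filters followed by a random sample. The three callbacks (`has_wikipedia_page`, `swoogle_hits`, `is_useful`) are hypothetical stand-ins for the Wikipedia lookup, the SWOOGLE query, and the manual judgement; none of them is an API from the original work, and the fixed seed is only there to make the sketch reproducible.

```python
import random
from typing import Callable, Iterable, List


def select_topics(labels: Iterable[str],
                  has_wikipedia_page: Callable[[str], bool],
                  swoogle_hits: Callable[[str], int],
                  is_useful: Callable[[str], bool],
                  k: int = 15,
                  min_hits: int = 10,
                  seed: int = 0) -> List[str]:
    """Reduce a set of labels to k benchmark topics (illustrative only)."""
    # Step 1: keep labels that correspond to a Wikipedia page.
    with_page = [label for label in labels if has_wikipedia_page(label)]
    # Step 2: drop labels that return fewer than min_hits ontologies.
    frequent = [label for label in with_page if swoogle_hits(label) >= min_hits]
    # Step 3: manual filter, e.g. no very short or very abstract labels.
    useful = [label for label in frequent if is_useful(label)]
    # Step 4: random sample of (at most) k topics.
    rng = random.Random(seed)
    return rng.sample(useful, min(k, len(useful)))
```

With toy data, a label such as `v` or `thing` survives the first two steps but is removed by the manual filter, which mirrors the examples given on the slide.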

Evaluation – Benchmark by Topic

15 (Wikipedia) topics with number of assigned ontologies:

Atmosphere (2)
Biology (11)
City (3)
  http://www.mindswap.org/2003/owl/geo/geoFeatures.owl
  http://www.glue.umd.edu/ katyn/CMSC828y/location.daml
  http://www.daml.org/2001/02/geofile/geofile-ont
Communication (10)
Economy (1)
Infrastructure (2)
Institution (1)
Math (3)
Military (5)
Newspaper (2)
Oil (0)
Production (1)
Publication (6)
Railroad (1)
Tourism (9)
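
As a quick consistency check, the per-topic counts listed above can be written down as data and compared against the totals stated for the benchmark (15 topics, 57 assigned ontologies). The dictionary name `BENCHMARK` is just an illustrative choice; the numbers are taken directly from the slide.

```python
# Topic -> number of manually assigned ontologies, as listed on the slide.
BENCHMARK = {
    "Atmosphere": 2, "Biology": 11, "City": 3, "Communication": 10,
    "Economy": 1, "Infrastructure": 2, "Institution": 1, "Math": 3,
    "Military": 5, "Newspaper": 2, "Oil": 0, "Production": 1,
    "Publication": 6, "Railroad": 1, "Tourism": 9,
}

num_topics = len(BENCHMARK)               # 15
num_assigned = sum(BENCHMARK.values())    # 57

# Topics for which no relevant ontology was found during manual checking.
topics_without_ontologies = [t for t, n in BENCHMARK.items() if n == 0]
```

The counts add up to the stated totals. One topic (Oil) has no assigned ontology, so any per-topic recall measure needs an explicit convention for that case.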

Evaluation – Experiment

Comparison of (average) results between SWOOGLE and OntoSelect

Use OntoSelect benchmark
  15 topics (queries)
  57 assigned ontologies (relevance assessments)
  1056 ontologies (data set)

Use different configurations for OntoSelect
  With/without keyword expansion/extraction
  With/without class names (in addition to labels)
  With/without property labels
  Weighting of relevance criteria
  …
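
One way such an experiment could be scored is sketched below. The slides only speak of "(average) results" and do not name a metric, so the use of precision and recall averaged over topics is an assumption, as are the function names and the treatment of a topic with no assigned ontologies (scored as 0.0). The configuration grid covers only the three with/without options; the weighting of criteria is left out.

```python
from itertools import product
from typing import Dict, Iterable, List, Set, Tuple


def precision_recall(retrieved: Iterable[str], relevant: Set[str]) -> Tuple[float, float]:
    """Precision and recall of one result list against the assigned ontologies."""
    retrieved_set = set(retrieved)
    hits = len(retrieved_set & relevant)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant) if relevant else 0.0  # assumed convention
    return precision, recall


def average_scores(results: Dict[str, List[str]],
                   assessments: Dict[str, Set[str]]) -> Tuple[float, float]:
    """Mean precision and mean recall over all benchmark topics."""
    pairs = [precision_recall(results.get(topic, []), relevant)
             for topic, relevant in assessments.items()]
    n = len(pairs)
    if n == 0:
        return 0.0, 0.0
    return (sum(p for p, _ in pairs) / n, sum(r for _, r in pairs) / n)


# All with/without combinations of the three switches named on the slide.
OPTION_NAMES = ("keyword_expansion", "class_names", "property_labels")
CONFIGURATIONS = [dict(zip(OPTION_NAMES, flags))
                  for flags in product((False, True), repeat=len(OPTION_NAMES))]
```

Each entry of `CONFIGURATIONS` would be passed to the search engine under test, and `average_scores` applied to the ranked lists it returns, giving one pair of averages per configuration to compare against SWOOGLE.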

Evaluation – Results

Evaluation – Weighting of ‘title’

Conclusions

It is too early to draw conclusions from the evaluation
  Many more configurations (weights) to compare
  Extend the benchmark
  Comparison with other ontology search engines

Main contribution of the presented work
  First comprehensive benchmark for topic-driven evaluation of ontology search
  (Extended) Benchmark will be made publicly available
  http://olp.dfki.de/OntoSelect