K-DRIVE
A Vague Sense Classifier for Detecting Vague Definitions in Ontologies
Panos Alexopoulos, John Pavlopoulos
14th Conference of the European Chapter of the Association for Computational Linguistics
Gothenburg, Sweden, 26–30 April 2014




Introduction: Vagueness

●Vagueness is a semantic phenomenon where predicates admit borderline cases, i.e. cases where it is not determinately true that the predicate applies or not (Shapiro 2006).

●This happens when predicates have blurred boundaries:
  ●What is the threshold number of years separating old from not-old films?
  ●What are the exact criteria that distinguish modern restaurants from non-modern ones?


Introduction: Vagueness Consequences

●The problem with vague terms in semantic data is the possibility of disagreements!

●E.g., when we asked domain experts to provide instances of the concept Critical Business Process, they disagreed about whether certain processes should be regarded as critical or not.

●The problem was that different experts had different criteria of process criticality and could not decide which of these were sufficient to classify a process as critical.


Introduction: Problematic Scenarios

1. Structuring Data with a Vague Ontology: Possible disagreement among experts when defining class and relation instances.

2. Utilizing Vague Facts in Ontology-Based Systems: Reasoning results might not meet users’ expectations.

3. Integrating Vague Semantic Information: The merging of particular vague elements can lead to data that will not be valid for all its users.


Automatic Vagueness Detection: Problem Definition & Approach

Problem Definition

●Can we automatically determine whether an ontology entity (class, relation, etc.) is vague or not?
  ●“StrategicClient”, defined as “A client that has a high value for the company”, is vague.
  ●“AmericanCompany”, defined as “A company that has legal status in the United States”, is not.

Approach

●We train a binary classifier that distinguishes between vague and non-vague word senses.
●Training is supervised, using examples from WordNet.
●We use this classifier to determine whether a given ontology element definition is vague or not.


Automatic Vagueness Detection: Data

●2,000 adjective senses from WordNet:
  ●1,000 vague
  ●1,000 non-vague

●Inter-annotator agreement on the vague/non-vague annotation among 3 human judges was 0.64 (Cohen’s Kappa).

Vague Senses

• Abnormal: not normal, not typical or usual or regular or conforming to a norm
• Impenitent: impervious to moral persuasion
• Notorious: known widely and usually unfavorably
• Aroused: emotionally aroused

Non-Vague Senses

• Compound: composed of more than one part
• Biweekly: occurring every two weeks
• Irregular: falling below the manufacturer's standard
• Outermost: situated at the farthest possible point from a center
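The agreement figure on this slide can be illustrated with a small sketch of Cohen's Kappa. Cohen's Kappa is defined for two annotators; the slides do not say how it was aggregated over three judges, so averaging the pairwise values (`mean_pairwise_kappa`) is an assumption made here for illustration, and both function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(labels_a, labels_b):
    """Cohen's Kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's label proportions.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in set(freq_a) | set(freq_b)) / (n * n)
    if expected == 1.0:  # both annotators always used the same single label
        return 1.0
    return (observed - expected) / (1.0 - expected)


def mean_pairwise_kappa(annotations):
    """ASSUMPTION: aggregate several judges by averaging pairwise Kappa."""
    pairs = list(combinations(annotations, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)
```

Given three equally long label lists, one per judge, `mean_pairwise_kappa([judge1, judge2, judge3])` returns a single agreement score. Fleiss' Kappa would be another common choice for more than two annotators.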


Automatic Vagueness Detection: Training and Evaluation

●80% of the data used to train a multinomial Naive Bayes classifier.

●We removed stop words and we used the bag of words assumption to represent each instance.

●The remaining 20% of the data was used as a test set.

●Classification accuracy was 84%!
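The setup described above (stop-word removal, bag-of-words features, multinomial Naive Bayes, 80/20 split) can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the stop-word list, the add-one smoothing, the random split and the eight-item `TOY_DATA` (taken from the example senses on the data slide) are all assumptions, and the toy data is far too small to reproduce the reported accuracy.

```python
import math
import random
from collections import Counter

# ASSUMPTION: a tiny illustrative stop-word list; the slides do not name one.
STOP_WORDS = {"a", "an", "the", "of", "or", "and", "to", "in", "at", "than", "from", "not"}

# Toy examples copied from the data slide (the real dataset has 2,000 senses).
TOY_DATA = [
    ("not normal, not typical or usual or regular or conforming to a norm", "vague"),
    ("impervious to moral persuasion", "vague"),
    ("known widely and usually unfavorably", "vague"),
    ("emotionally aroused", "vague"),
    ("composed of more than one part", "non-vague"),
    ("occurring every two weeks", "non-vague"),
    ("falling below the manufacturer's standard", "non-vague"),
    ("situated at the farthest possible point from a center", "non-vague"),
]


def tokenize(definition):
    """Lowercase, split on non-letters, drop stop words (bag of words)."""
    words = "".join(c if c.isalpha() else " " for c in definition.lower()).split()
    return [w for w in words if w not in STOP_WORDS]


def train(examples):
    """Count classes and per-class word occurrences."""
    class_counts, word_counts, vocab = Counter(), {}, set()
    for text, label in examples:
        class_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for w in tokenize(text):
            counts[w] += 1
            vocab.add(w)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log posterior (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in tokenize(text):
            if w in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][w] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, examples):
    return sum(predict(model, t) == y for t, y in examples) / len(examples)


def split(examples, train_fraction=0.8, seed=0):
    """Shuffle and split into training and test sets."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

A typical run would call `split` on the full annotated dataset, `train` on the first part, and `accuracy` on the held-out part. The same `predict` function can then be applied to the textual definition of an ontology element.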


Automatic Vagueness Detection: Comparison with Subjectivity Analyzer

●We also used a subjective sense classifier to classify our dataset’s senses as subjective or objective.

●From the 1000 vague senses, only 167 were classified as subjective while from the 1000 non-vague ones 993.

●This shows that treating vagueness in the same way as subjectiveness is not really effective.


Automatic Vagueness Detection: Use Case: Detecting Vagueness in CiTO Ontology

●As an ontology use case we considered CiTO, an ontology that enables characterization of the nature or type of citations.

●CiTO consists primarily of relations, many of which are vague (e.g. plagiarizes).

●We selected 44 relations and we had 3 human judges manually classify them as vague or not.

●Then we applied our WordNet-trained vagueness classifier to the textual definitions of the same relations.


Automatic Vagueness Detection: Use Case: Detecting Vagueness in CiTO Ontology

Vague Relations

• plagiarizes: A property indicating that the author of the citing entity plagiarizes the cited entity, by including textual or other elements from the cited entity without formal acknowledgement of their source.
• citesAsAuthority: The citing entity cites the cited entity as one that provides an authoritative description or definition of the subject under discussion.

Non-Vague Relations

• sharesAuthorInstitutionWith: Each entity has at least one author that shares a common institutional affiliation with an author of the other entity.
• providesDataFor: The cited entity presents data that are used in work described in the citing entity.


Automatic Vagueness Detection: Use Case: Detecting Vagueness in CiTO Ontology

●Classification results:
  ●82% of relations were correctly classified as vague/non-vague.
  ●94% accuracy for non-vague relations.
  ●74% accuracy for vague relations.
●Again, we classified the same relations with the subjectivity classifier:
  ●40% of vague/non-vague relations were classified as subjective/objective respectively.
  ●94% of non-vague relations were classified as objective.
  ●7% of vague relations were classified as subjective.
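The overall and per-class figures reported above are related in a simple way, which the following sketch makes explicit. It assumes gold labels from the human judges and predicted labels from a classifier are available as two parallel lists; the function name `per_class_accuracy` is hypothetical, and no counts from the actual CiTO experiment are reproduced here.

```python
from collections import Counter


def per_class_accuracy(gold, predicted):
    """Return (overall accuracy, {class: accuracy on items of that class})."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("need two non-empty label lists of equal length")
    correct, total = Counter(), Counter()
    for g, p in zip(gold, predicted):
        total[g] += 1
        if g == p:
            correct[g] += 1
    overall = sum(correct.values()) / len(gold)
    by_class = {label: correct[label] / total[label] for label in total}
    return overall, by_class
```

The overall accuracy is the average of the per-class accuracies weighted by class size, which is why it falls between the non-vague and vague figures.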


Vagueness-Aware Semantic Data: Future Work

●Incorporate the current classifier into an ontology analysis tool.

●Improve the classifier by considering new features.

●See whether it is possible to build a vague sense lexicon.


Questions? Thank you!


Dr. Panos Alexopoulos

Semantic Applications Research Manager
