A Vague Sense Classifier for Detecting Vague Definitions in Ontologies

K-DRIVE

Panos Alexopoulos, John Pavlopoulos

14th Conference of the European Chapter of the Association for Computational Linguistics

Gothenburg, Sweden, 26–30 April 2014

Vagueness (Introduction)

●Vagueness is a semantic phenomenon where predicates admit borderline cases, i.e. cases where it is not determinately true that the predicate applies or not (Shapiro 2006).

●This happens when predicates have blurred boundaries:
  ●What is the threshold number of years separating old and not old films?
  ●What are the exact criteria that distinguish modern restaurants from non-modern ones?

Vagueness Consequences (Introduction)

●The problem with vague terms in semantic data is the possibility of disagreements!

●E.g., when we asked domain experts to provide instances of the concept Critical Business Process, they disagreed about whether certain processes should be regarded as critical or not.

●The problem was that different experts had different criteria of process criticality and could not decide which of these were sufficient to classify a process as critical.

Problematic Scenarios (Introduction)

1. Structuring Data with a Vague Ontology: Possible disagreement among experts when defining class and relation instances.

2. Utilizing Vague Facts in Ontology-Based Systems: Reasoning results might not meet users’ expectations.

3. Integrating Vague Semantic Information: The merging of particular vague elements can lead to data that will not be valid for all its users.

Problem Definition & Approach (Automatic Vagueness Detection)

Problem Definition

●Can we automatically determine whether an ontology entity (class, relation, etc.) is vague or not?

●“StrategicClient”, defined as “A client that has a high value for the company”, is vague!

●“AmericanCompany”, defined as “A company that has legal status in the United States”, is not!

Approach

●We train a binary classifier that distinguishes between vague and non-vague word senses.

●Training is supervised, using examples from WordNet.

●We use this classifier to determine whether a given ontology element definition is vague or not.

Data (Automatic Vagueness Detection)

●2,000 adjective senses from WordNet:
  ●1,000 vague
  ●1,000 non-vague

●Inter-annotator agreement on the vague/non-vague annotation among 3 human judges was 0.64 (Cohen’s Kappa).

Vague Senses

• Abnormal: not normal, not typical or usual or regular or conforming to a norm

• Impenitent: impervious to moral persuasion

• Notorious: known widely and usually unfavorably

• Aroused: emotionally aroused

Non Vague Senses

• Compound: composed of more than one part

• Biweekly: occurring every two weeks

• Irregular: falling below the manufacturer's standard

• Outermost: situated at the farthest possible point from a center
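
The agreement figure reported for the annotation can be made concrete with a short sketch. Cohen's kappa is defined for a pair of annotators, and the slides do not say how the three judges' annotations were combined, so the pairwise function below and its toy labels are illustrative assumptions rather than the study's data or procedure.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equally long, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each annotator's own label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a.keys() | freq_b.keys()) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one and the same label throughout
    return (p_o - p_e) / (1 - p_e)


# Toy annotations (hypothetical, not the study's data).
judge_1 = ["vague", "vague", "non-vague", "non-vague", "vague", "non-vague"]
judge_2 = ["vague", "non-vague", "non-vague", "non-vague", "vague", "vague"]
kappa = cohen_kappa(judge_1, judge_2)
print(f"kappa = {kappa:.2f}")
```

With three judges, one possible (assumed) aggregation is to compute this value for each pair of judges and average the results.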

Training and Evaluation (Automatic Vagueness Detection)

●80% of the data was used to train a multinomial Naive Bayes classifier.

●We removed stop words and represented each instance as a bag of words.

●The remaining 20% of the data was used as a test set.

●Classification accuracy was 84%!
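
As a rough illustration of the setup described above, the following sketch implements a multinomial Naive Bayes classifier over bag-of-words features with an 80/20 split, using only the Python standard library. The stop-word list, the Laplace smoothing (`alpha=1.0`), the random seed, and all function and class names are assumptions made for this example. The eight examples are the sample senses shown on the Data slide, far fewer than the 2,000 senses used in the study, so the accuracy it prints says nothing about the reported 84%.

```python
import math
import random
from collections import Counter, defaultdict

# Small illustrative stop-word list (an assumption, not the authors' list).
STOP_WORDS = {"a", "an", "the", "of", "or", "and", "to", "for", "in",
              "at", "by", "from", "that", "is", "has"}


def tokenize(text):
    """Lower-case, split on non-letters, and drop stop words."""
    words = "".join(ch if ch.isalpha() else " " for ch in text.lower()).split()
    return [w for w in words if w not in STOP_WORDS]


class MultinomialNB:
    """Multinomial Naive Bayes with add-alpha (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(doc):
                if word in self.vocab:  # skip words never seen in training
                    score += math.log((counts[word] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def train_test_split(items, train_fraction=0.8, seed=0):
    """Shuffle a copy of the items and cut it into train and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def accuracy(model, examples):
    """Fraction of (text, label) pairs the model labels correctly."""
    return sum(model.predict(text) == label for text, label in examples) / len(examples)


# Sample senses from the Data slide, used here only as toy data.
examples = [
    ("not normal, not typical or usual or regular or conforming to a norm", "vague"),
    ("impervious to moral persuasion", "vague"),
    ("known widely and usually unfavorably", "vague"),
    ("emotionally aroused", "vague"),
    ("composed of more than one part", "non-vague"),
    ("occurring every two weeks", "non-vague"),
    ("falling below the manufacturer's standard", "non-vague"),
    ("situated at the farthest possible point from a center", "non-vague"),
]

train, test = train_test_split(examples)
model = MultinomialNB().fit([text for text, _ in train], [label for _, label in train])
acc = accuracy(model, test)
print(f"toy test accuracy: {acc:.2f}")
```

Unseen words are simply skipped at prediction time, which is one common design choice; a real pipeline would more likely rely on an established library implementation and a fuller stop-word list.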

Comparison with Subjectivity Analyzer (Automatic Vagueness Detection)

●We also used a subjective sense classifier to classify our dataset’s senses as subjective or objective.

●Of the 1,000 vague senses, only 167 were classified as subjective, while 993 of the 1,000 non-vague senses were classified as objective.

●This shows that treating vagueness in the same way as subjectivity is not really effective.
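
The counts above can be turned into rates with a few lines of arithmetic. The sketch below treats a "subjective" label as a stand-in prediction for "vague" and assumes that the 993 figure counts non-vague senses labelled objective; the function name and this scoring scheme are illustrative assumptions, not part of the original evaluation.

```python
def proxy_scores(vague_as_subjective, n_vague, nonvague_as_objective, n_nonvague):
    """Score 'subjective' as a stand-in prediction for 'vague'."""
    recall_vague = vague_as_subjective / n_vague
    recall_nonvague = nonvague_as_objective / n_nonvague
    overall = (vague_as_subjective + nonvague_as_objective) / (n_vague + n_nonvague)
    return recall_vague, recall_nonvague, overall


recall_vague, recall_nonvague, overall = proxy_scores(167, 1000, 993, 1000)
print(f"vague: {recall_vague:.1%}, non-vague: {recall_nonvague:.1%}, overall: {overall:.1%}")
# prints "vague: 16.7%, non-vague: 99.3%, overall: 58.0%"
```

Under these assumptions the subjectivity classifier recovers about one vague sense in six. The overall figure is computed on all 2,000 senses, whereas the 84% accuracy of the vagueness classifier was measured on the 20% test set, so the two numbers are not strictly like-for-like.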

Use Case: Detecting Vagueness in CiTO Ontology (Automatic Vagueness Detection)

●As an ontology use case we considered CiTO, an ontology that enables characterization of the nature or type of citations.

●CiTO consists primarily of relations, many of which are vague (e.g. plagiarizes).

●We selected 44 relations and we had 3 human judges manually classify them as vague or not.

●Then we applied our WordNet-trained vagueness classifier to the textual definitions of the same relations.

Vague Relations

• plagiarizes: A property indicating that the author of the citing entity plagiarizes the cited entity, by including textual or other elements from the cited entity without formal acknowledgement of their source.

• citesAsAuthority: The citing entity cites the cited entity as one that provides an authoritative description or definition of the subject under discussion.

Non Vague Relations

• sharesAuthorInstitutionWith: Each entity has at least one author that shares a common institutional affiliation with an author of the other entity.

• providesDataFor: The cited entity presents data that are used in work described in the citing entity.

●Classification Results:
  ●82% of relations were correctly classified as vague/non-vague.
  ●94% accuracy for non-vague relations.
  ●74% accuracy for vague relations.

●Again, we classified the same relations with the subjectivity classifier:
  ●40% of vague/non-vague relations were classified as subjective/objective respectively.
  ●94% of non-vague relations were classified as objective.
  ●7% of vague relations were classified as subjective.

Future Work (Vagueness-Aware Semantic Data)

●Incorporate the current classifier into an ontology analysis tool.

●Improve the classifier by considering new features.

●See whether it is possible to build a vague sense lexicon.

Questions? Thank you!

Dr. Panos Alexopoulos

Semantic Applications Research Manager

palexopoulos@isoco.com

(t) +34 913 349 797
