K-DRIVE
A Vague Sense Classifier for Detecting Vague Definitions in Ontologies
Panos Alexopoulos, John Pavlopoulos
14th Conference of the European Chapter of the Association for Computational Linguistics
Gothenburg, Sweden, 26–30 April 2014
Vagueness (Introduction)
●Vagueness is a semantic phenomenon where predicates admit borderline cases, i.e. cases where it is not determinately true whether the predicate applies or not (Shapiro 2006).
●This happens when predicates have blurred boundaries:
  ●What is the threshold number of years separating old from not-old films?
  ●What are the exact criteria that distinguish modern restaurants from non-modern ones?
Vagueness Consequences (Introduction)
●The problem with vague terms in semantic data is the possibility of disagreements!
●E.g., when we asked domain experts to provide instances of the concept Critical Business Process, they disputed whether certain processes should be regarded as critical or not.
●The problem was that different experts had different criteria of process criticality and could not decide which of these were sufficient to classify a process as critical.
Problematic Scenarios (Introduction)
1. Structuring Data with a Vague Ontology: Possible disagreement among experts when defining class and relation instances.
2. Utilizing Vague Facts in Ontology-Based Systems: Reasoning results might not meet users’ expectations.
3. Integrating Vague Semantic Information: The merging of particular vague elements can lead to data that will not be valid for all its users.
Problem Definition & Approach (Automatic Vagueness Detection)
Problem Definition
●Can we automatically determine whether an ontology entity (class, relation etc.) is vague or not?
  ●“StrategicClient” as “A client that has a high value for the company” is vague!
  ●“AmericanCompany” as “A company that has legal status in the United States” is not!
Approach
●We train a binary classifier to distinguish between vague and non-vague word senses.
●Training is supervised, using examples from WordNet.
●We use this classifier to determine whether a given ontology element definition is vague or not.
Data (Automatic Vagueness Detection)
●2,000 adjective senses from WordNet:
  ●1,000 vague
  ●1,000 non-vague
●Inter-annotator agreement on the vague/non-vague annotation among 3 human judges was 0.64 (Cohen’s Kappa).
Vague senses:
• Abnormal: not normal, not typical or usual or regular or conforming to a norm
• Impenitent: impervious to moral persuasion
• Notorious: known widely and usually unfavorably
• Aroused: emotionally aroused
Non-vague senses:
• Compound: composed of more than one part
• Biweekly: occurring every two weeks
• Irregular: falling below the manufacturer's standard
• Outermost: situated at the farthest possible point from a center
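The agreement figure reported above is a Cohen's Kappa value. As a reminder of how that statistic is computed, here is a minimal sketch for two annotators. The slides do not say how the labels of the three judges were combined, so the function `cohen_kappa` and the toy labels below are purely illustrative assumptions, not the authors' procedure or data.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's label proportions.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in set(freq_a) | set(freq_b)) / (n * n)
    if expected == 1.0:  # both annotators used a single identical label
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Made-up labels for four senses (not taken from the dataset).
judge_1 = ["vague", "vague", "non-vague", "non-vague"]
judge_2 = ["vague", "non-vague", "non-vague", "non-vague"]
print(cohen_kappa(judge_1, judge_2))
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is lower than the plain percentage of matching labels.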
Training and Evaluation (Automatic Vagueness Detection)
●80% of the data used to train a multinomial Naive Bayes classifier.
●We removed stop words and represented each instance as a bag of words.
●The remaining 20% of the data was used as a test set.
●Classification accuracy was 84%!
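The setup above (stop-word removal, bag-of-words features, multinomial Naive Bayes) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the stop-word list, the add-one smoothing, and the class name `MultinomialNaiveBayes` are hypothetical, and the training data are just the eight example glosses shown on the Data slide, not the 2,000-sense dataset.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stop-word list; the slides do not say which list was used.
STOP_WORDS = {"a", "an", "the", "of", "or", "and", "to", "than", "from", "at"}


def tokenize(text):
    """Lowercase, split into words, and drop stop words (bag of words)."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]


class MultinomialNaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (an assumption)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total)  # log prior
            for word in tokenize(text):
                if word in self.vocab:  # words unseen in training are ignored
                    score += math.log((counts[word] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


# The example glosses from the Data slide, with their annotated labels.
TRAIN = [
    ("not normal, not typical or usual or regular or conforming to a norm", "vague"),
    ("impervious to moral persuasion", "vague"),
    ("known widely and usually unfavorably", "vague"),
    ("emotionally aroused", "vague"),
    ("composed of more than one part", "non-vague"),
    ("occurring every two weeks", "non-vague"),
    ("falling below the manufacturer's standard", "non-vague"),
    ("situated at the farthest possible point from a center", "non-vague"),
]
clf = MultinomialNaiveBayes().fit([t for t, _ in TRAIN], [l for _, l in TRAIN])
print(clf.predict("occurring every two months"))
```

With so few training glosses the predictions mostly reflect word overlap; the reported accuracy comes from the full 80/20 split, which this sketch does not reproduce.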
Comparison with Subjectivity Analyzer (Automatic Vagueness Detection)
●We also used a subjective sense classifier to classify our dataset’s senses as subjective or objective.
●From the 1000 vague senses, only 167 were classified as subjective while from the 1000 non-vague ones 993.
●This shows that treating vagueness in the same way as subjectiveness is not really effective.
Use Case: Detecting Vagueness in CiTO Ontology (Automatic Vagueness Detection)
●As an ontology use case we considered CiTO, an ontology that enables characterization of the nature or type of citations.
●CiTO consists primarily of relations, many of which are vague (e.g. plagiarizes).
●We selected 44 relations and we had 3 human judges manually classify them as vague or not.
●Then we applied our WordNet-trained vagueness classifier to the textual definitions of the same relations.
Vague relations:
• plagiarizes: A property indicating that the author of the citing entity plagiarizes the cited entity, by including textual or other elements from the cited entity without formal acknowledgement of their source.
• citesAsAuthority: The citing entity cites the cited entity as one that provides an authoritative description or definition of the subject under discussion.
Non-vague relations:
• sharesAuthorInstitutionWith: Each entity has at least one author that shares a common institutional affiliation with an author of the other entity.
• providesDataFor: The cited entity presents data that are used in work described in the citing entity.
●Classification results:
  ●82% of relations were correctly classified as vague/non-vague.
  ●94% accuracy for non-vague relations.
  ●74% accuracy for vague relations.
●Again, we classified the same relations with the subjectivity classifier:
  ●40% of vague/non-vague relations were classified as subjective/objective respectively.
  ●94% of non-vague relations were classified as objective.
  ●7% of vague relations were classified as subjective.
Future Work (Vagueness-Aware Semantic Data)
●Incorporate the current classifier into an ontology analysis tool.
●Improve the classifier by exploring new features.
●See whether it is possible to build a vague sense lexicon.
Questions? Thank you!
Dr. Panos Alexopoulos
Semantic Applications Research Manager
(t) +34 913 349 797