
Page 1: WP2 2nd Review

Combining Human and Computational Intelligence

Ilya Zaihrayeu, Pierre Andrews, Juan Pane

Page 2: WP2 2nd Review

Semantic annotation lifecycle

Users attach free text annotations to resources. What if they could use semantic annotations instead, to leverage semantic technology services such as reasoning and semantic search? A semantic annotation adds structure and/or meaning.

• Problem 1: help the user find and understand the meaning of semantic annotations
• Problem 2: extract (semantic) annotations from the context of the user and the resource at publishing time
• Problem 3: QoS of semantics-enabled services
• Problem 4: semi-automatic semantification of existing annotations

Page 3: WP2 2nd Review

Index: meaning summarization

• Problem 1: help the user find and understand the meaning of semantic annotations

Page 4: WP2 2nd Review

Meaning summarization: why?

• The right meaning of the words used in an annotation is in the mind of the people using them
• E.g., "Java":
  – an island in Indonesia south of Borneo; one of the world's most densely populated regions
  – a beverage consisting of an infusion of ground coffee beans; "he ordered a cup of coffee"
  – a simple platform-independent object-oriented programming language used for writing applets that are downloaded from the World Wide Web by a client and run on the client's machine
• These descriptions are too long for the user to grasp the meaning immediately – too high a barrier to start generating semantic annotations
• One-word summaries convey the meaning at a glance: island, beverage, programming language

Page 5: WP2 2nd Review

Meaning summarization: an example

One-word summaries are generated from the relations in the knowledge base, sense definitions, synonyms and hypernym terms (a sketch follows below).
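The slides do not spell out the summarization algorithm itself; as an illustration only, here is a minimal sketch of how a one-word summary could be derived from a WordNet sense, preferring a synonym of the ambiguous term and falling back to the hypernym's lemma (this preference order is an assumption, not the evaluated method):

```python
from nltk.corpus import wordnet as wn

def one_word_summary(term, synset):
    """Pick a single word that summarizes one sense of `term`.
    Assumed heuristic (not the algorithm from the slides): prefer a
    synonym from the same synset, otherwise fall back to the lemma of
    the first direct hypernym."""
    for lemma in synset.lemma_names():
        if lemma.lower() != term.lower():
            return lemma.replace("_", " ")
    for hypernym in synset.hypernyms():
        return hypernym.lemma_names()[0].replace("_", " ")
    return term

# The three "java" senses from the previous slide
for synset in wn.synsets("java", pos=wn.NOUN):
    print(synset.definition()[:40], "->", one_word_summary("java", synset))
```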

Page 6: WP2 2nd Review

Meaning summarization: evaluation results

• Best precision: 63%
• Discriminating power: 76.4% (e.g., if we talk about Java, does the word "coffee" mean the same as "island"?)

Page 7: WP2 2nd Review

Index: gold standard dataset

• Problem 3: QoS of semantics-enabled services
• Problem 4: semi-automatic semantification of existing annotations

In order to evaluate the performance of the algorithms, a gold standard dataset is needed.

Page 8: WP2 2nd Review

Proposed Approach

Goal: create a gold standard folksonomy annotated with senses.

Pipeline: Tag → Preprocessing (80% accuracy) → Tokens → Disambiguation (59% accuracy) → Senses (sketched below)
• Example tag: "javaisland"
• Candidate tokenizations: "Java island", "Java is land", …
• Senses: Java – an island in Indonesia to the south of Borneo; Island – a land mass that is surrounded by water

Dataset statistics:
• # of annotations: 4,296
• Unique tags: 857
• Unique URLs: 644
• Unique users: 1,194
• Annotator agreement: 81%
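As an illustration of the two pipeline stages (not the evaluated implementation), a minimal sketch that splits a concatenated tag into dictionary tokens using WordNet as the vocabulary and then disambiguates the tokens with NLTK's simplified Lesk:

```python
from nltk.corpus import wordnet as wn
from nltk.wsd import lesk

def split_tag(tag):
    """Preprocessing: greedily split a tag such as 'javaisland' into
    words found in WordNet. Naive recursive splitter, for illustration."""
    tag = tag.lower()
    if not tag:
        return []
    for i in range(len(tag), 0, -1):
        head, rest = tag[:i], tag[i:]
        if wn.synsets(head):
            tail = split_tag(rest)
            if tail is not None:
                return [head] + tail
    return None  # no full split into dictionary words

def disambiguate(tokens):
    """Disambiguation: assign a WordNet sense to each token, using the
    other tokens of the annotation as context (simplified Lesk; not the
    WSD algorithm evaluated in the slides)."""
    return {tok: lesk(tokens, tok) for tok in tokens}

tokens = split_tag("javaisland") or []   # -> ['java', 'island']
print(tokens)
print(disambiguate(tokens))
```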

Page 9: WP2 2nd Review

A Platform for Gold Standards of Semantic Annotation Systems

• Manual validation
• RDF export
• Evaluation of:
  – Preprocessing
  – WSD
  – BoW Search
  – Convergence
• Open source: http://sourceforge.net/projects/tags2con/
• 7 modules, 25K lines of code, 26% comments

Page 10: WP2 2nd Review

Delicious RDF Dataset @ LOD cloud

Dereferenceable at: http://disi.unitn.it/~knowdive/dataset/delicious/

• # of triples: 85,908
• Outlinks to LOD cloud (WN synsets): 651

(A loading sketch follows below.)
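A minimal sketch of consuming the dataset with rdflib; the dump file name and the WordNet namespace used for the outlinks are assumptions, so check the dataset page for the actual download and vocabulary:

```python
from rdflib import Graph

g = Graph()
# Hypothetical local copy of the dump; see
# http://disi.unitn.it/~knowdive/dataset/delicious/ for the real files.
g.parse("delicious-dump.rdf")

print(len(g), "triples")

# Count outlinks whose objects look like WordNet synset URIs
# (namespace is an assumption; the dataset may use a different one).
WN = "http://www.w3.org/2006/03/wn/wn20/"
wn_links = sum(1 for s, p, o in g if str(o).startswith(WN))
print(wn_links, "outlinks to WordNet synsets")
```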

Page 11: WP2 2nd Review

Index: QoS for semantic search

• Problem 3: QoS of semantics-enabled services

Page 12: WP2 2nd Review

Semantic search: why?

• With free text search, the following problems may reduce precision and recall:
  – synonymy problem: searching for "images" should return resources annotated with "picture"
  – polysemy problem: searching for "java" (island) should not return resources annotated with "java" (coffee beverage)
  – specificity gap problem: searching for "animals" should also return resources annotated with "dogs"
• Semantic, meaning-based search can address the problems listed above (a query expansion sketch follows below)
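A minimal sketch (not the project's search engine) of how a meaning-based query can address synonymy and the specificity gap, by expanding the query sense with its synonyms and hyponym lemmas from WordNet before matching annotations:

```python
from nltk.corpus import wordnet as wn

def expand_query(synset, depth=2):
    """Collect the lemmas of a query sense and of its hyponyms down to
    `depth` is-a levels, so that a query for 'animal' also matches
    resources tagged 'dog', and 'image' also matches 'picture'."""
    terms = {l.lower().replace("_", " ") for l in synset.lemma_names()}
    if depth > 0:
        for hyponym in synset.hyponyms():
            terms |= expand_query(hyponym, depth - 1)
    return terms

query_terms = expand_query(wn.synset("animal.n.01"))
print("dog" in query_terms)   # True: reached via animal -> domestic animal -> dog
```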

Page 13: WP2 2nd Review

Semantics vs Folksonomy

[Diagram: a user submits a query, which is matched against resource annotations. Three query types are compared: "raw" queries built from the original tags (e.g., "javaisland"), bag-of-words (BoW) queries built from the tokens (e.g., "java island"), and semantic queries built from the senses (e.g., Java(island), island(land)).]

• Specificity Gap (SG): the number of is-a steps between the query term and the annotation term, e.g., taxi → car (SG=1) → vehicle (SG=2)
• Semantic search returns complete and correct results (the baseline)
• Recall goes down as the specificity gap increases (an SG computation sketch follows below)
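A minimal sketch, using WordNet's is-a hierarchy instead of the small vocabulary drawn in the slide, of measuring the specificity gap between a query sense and an annotation sense as the number of is-a steps between them:

```python
from nltk.corpus import wordnet as wn

def specificity_gap(query_synset, annotation_synset):
    """Number of is-a steps from the annotation sense up to the query
    sense; None if the query sense is not a generalization of the
    annotation sense. Illustrative only."""
    if query_synset not in annotation_synset.lowest_common_hypernyms(query_synset):
        return None
    return annotation_synset.shortest_path_distance(query_synset)

taxi = wn.synsets("taxi", pos=wn.NOUN)[0]
car = wn.synset("car.n.01")
vehicle = wn.synset("vehicle.n.01")

print(specificity_gap(car, taxi))      # 1: taxi is-a car
print(specificity_gap(vehicle, taxi))  # larger: WordNet's chain has more steps than the slide's example
```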

Page 14: WP2 2nd Review

Index: semantic convergence

• Problem 4: semi-automatic semantification of existing annotations

Page 15: WP2 2nd Review

Semantic convergence: why?

Many tags cannot be mapped to a WordNet sense, especially in the programming and web domain:

                      "General" domains              Random: programming
                      (cooking, travel, education)   and web domain
  With a WN sense           71%                           49%
  Missing sense             15%                           35%
  Cannot decide              5%                            6%
  I don't know               4%                            3%
  Abbreviation               2%                            5%
  Other                      3%                            1%

Examples of tags with missing senses: Ajax, Mac, Apple, CSS, … (a coverage check sketch follows below)
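A rough automatic proxy for such coverage numbers, simply checking each distinct tag against WordNet (categories such as "abbreviation" or "cannot decide" of course require the manual validation used in the slides):

```python
from nltk.corpus import wordnet as wn

def wordnet_coverage(tags):
    """Fraction of distinct tags for which WordNet has at least one
    sense, plus the tags with no sense (candidates for convergence)."""
    tags = {t.lower() for t in tags}
    covered = {t for t in tags if wn.synsets(t)}
    return len(covered) / len(tags), sorted(tags - covered)

ratio, missing = wordnet_coverage(["cooking", "travel", "css", "ajax", "apple"])
print(f"{ratio:.0%} with a WN sense; missing: {missing}")
```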

Page 16: WP2 2nd Review

Semantic convergence: proposed solution

• Find new senses of terms:
  – find different senses of the same term (word senses)
  – find synonyms of a term (synonym sets – synsets)
• Place the new synset in the vocabulary is-a hierarchy
• What we improve over the state of the art:
  – better use of Machine Learning techniques
  – the polysemy issue is not considered in the state of the art
  – missing or "subjective" evaluations in the state of the art
• Evaluation using the Delicious dataset (an induction sketch follows below)
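The slides do not spell out the induction algorithm; as a sketch of the general idea only, tags can be grouped into candidate synsets by clustering them on the similarity of their co-occurrence profiles across bookmarks (the greedy clustering and the threshold are assumptions):

```python
from collections import defaultdict
from itertools import combinations

def cooccurrence_vectors(bookmarks):
    """bookmarks: iterable of tag sets, one per bookmarked resource.
    Returns, for each tag, the counts of tags it co-occurs with."""
    vectors = defaultdict(lambda: defaultdict(int))
    for tags in bookmarks:
        for a, b in combinations(sorted(tags), 2):
            vectors[a][b] += 1
            vectors[b][a] += 1
    return vectors

def similarity(u, v):
    """Cosine similarity between two sparse co-occurrence vectors."""
    dot = sum(u[t] * v[t] for t in set(u) & set(v))
    norm = (sum(x * x for x in u.values()) * sum(x * x for x in v.values())) ** 0.5
    return dot / norm if norm else 0.0

def induce_synsets(bookmarks, threshold=0.7):
    """Greedily group tags with similar co-occurrence profiles; each
    group is a candidate synset to be placed in the is-a hierarchy."""
    vectors = cooccurrence_vectors(bookmarks)
    synsets, assigned = [], set()
    for tag in vectors:
        if tag in assigned:
            continue
        group = {tag} | {other for other in vectors
                         if other not in assigned and other != tag
                         and similarity(vectors[tag], vectors[other]) >= threshold}
        assigned |= group
        synsets.append(group)
    return synsets

bookmarks = [{"css", "stylesheet", "web"}, {"css", "web", "design"},
             {"stylesheet", "web", "design"}, {"java", "island", "travel"}]
print(induce_synsets(bookmarks))
```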

Page 17: WP2 2nd Review

Convergence Evaluation: Finding Senses

[Diagrams: tags t1–t5 connected through shared bookmarks B1–B4 (tag collocation) and through shared users U1, U2 (user collocation).]

• Tag Collocation: Precision 56%, Recall 73%
• User Collocation: Precision 57%, Recall 68%
• Random Baseline: Precision 42%, Recall 29%

(A collocation sketch follows below.)
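A minimal sketch of the two collocation signals over Delicious-style (user, tag, bookmark) triples; the Jaccard similarity and the data layout are assumptions, not the evaluated method:

```python
from collections import defaultdict

def collocation_contexts(triples):
    """triples: iterable of (user, tag, bookmark).
    For each tag, collect the bookmarks it annotates (tag collocation
    context) and the users who used it (user collocation context)."""
    by_bookmark, by_user = defaultdict(set), defaultdict(set)
    for user, tag, bookmark in triples:
        by_bookmark[tag].add(bookmark)
        by_user[tag].add(user)
    return by_bookmark, by_user

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

triples = [("u1", "css", "b1"), ("u1", "stylesheet", "b1"),
           ("u2", "css", "b2"), ("u2", "stylesheets", "b2")]
by_bookmark, by_user = collocation_contexts(triples)

# Two tags are candidate synonyms if they share bookmarks and/or users
print(jaccard(by_bookmark["css"], by_bookmark["stylesheet"]))  # tag collocation signal
print(jaccard(by_user["css"], by_user["stylesheet"]))          # user collocation signal
```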

Page 18: WP2 2nd Review

Semantic annotation lifecycle

[Recap of the lifecycle diagram from slide 2: users attach free text annotations; semantic annotations add structure and/or meaning and enable services such as reasoning and semantic search. The four problems – (1) helping the user understand the meaning of semantic annotations, (2) extracting (semantic) annotations from the context of the user and the resource at publishing time, (3) QoS of semantics-enabled services, (4) semi-automatic semantification of existing annotations – are addressed by combining human and computational intelligence.]

Conclusions

Page 19: WP2 2nd Review

Conclusions

• We developed and evaluated a meaning summarization algorithm
• We developed a "semantic folksonomy" evaluation platform
• We studied the effect of semantics on social tagging systems:
  – how much can semantics help?
  – how much does the user need to be involved?
  – how can human and computer intelligence be combined in the generation and consumption of semantic annotations?
• We developed and evaluated a knowledge base enrichment algorithm
• We built and used a gold standard dataset for evaluating:
  – Word Sense Disambiguation
  – Tag Preprocessing
  – Semantic Search
  – Semantic Convergence

Page 20: WP2 2nd Review

Integration with the use cases

Page 21: WP2 2nd Review

Publications

• Semantic Disambiguation in Folksonomy: a Case Study
  Pierre Andrews, Juan Pane, and Ilya Zaihrayeu;
  Advanced Language Technologies for Digital Libraries, Springer's LNCS.
• Semantic Annotation of Images on Flickr
  Pierre Andrews, Sergey Kanshin, Juan Pane, and Ilya Zaihrayeu;
  ESWC 2011
• A Classification of Semantic Annotation Systems
  Pierre Andrews, Sergey Kanshin, Juan Pane, and Ilya Zaihrayeu;
  Semantic Web Journal – second review phase
• Sense Induction in Folksonomies
  Pierre Andrews, Juan Pane, and Ilya Zaihrayeu;
  IJCAI-LHD 2011 – under review
• Evaluating the Quality of Service in Semantic Annotation Systems
  Ilya Zaihrayeu, Pierre Andrews, and Juan Pane;
  in preparation