34
Amine Hallili, PhD student Catherine Faron Zucker & Fabien Gandon, Advisors Elena Cabrio, Supervisor 1

Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

  • Upload
    others

  • View
    12

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

Amine Hallili, PhD student

Catherine Faron Zucker & Fabien Gandon, Advisors

Elena Cabrio, Supervisor

1

Page 2: Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

Headlines Introduction

Motivations Research questions

Chatbot Definition Categories Our Chatbot ?

Ongoing work Our proposal Knowledge Base Ontology (Schema.org, GoodRelations) Pattern Extraction Property Matching Response Generation

Perspectives References

2

Page 3: Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

Introduction

3

Page 4: Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

Context & Motivations Why ?

New means of communication FAQ

Social Networks

Mobile Applications

Search Engines

Huge amount of underexploited data especially in Commercial Domain Linked Data

Log files

Raw Text ...

4

Page 5: Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

Research questions How to construct a Knowledge Base using website APIs ?

Proposing a platform to extract information

How to fully understand user’s question ?

Natural Language Processing

How to keep users interested in interacting with the system?

Natural Language Generation

Friendly interface

Dialog mode

5

Page 6: Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

Scenario

6

Give me the price of a Nexus 5!

and who sells it?

the price of Nexus 5 is 400$!

several sellers were found. The main one is Google! Do you want to see other sellers?

No, show me the white version, sold by Google and located in France!

here are the images of Nexus 5 white version, sold by Google and located in France...

Page 7: Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

ChatBot

7

Page 8: Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

Chatbot – State of the art

Chatbot, ChatterBot, CleverBot, Chat-Robot (Allen et al) : Computer program designed to simulate an intelligent conversation with one or more human users via auditory or textual methods, primarily for engaging in small talk.

Natural Language Dialog system (NLDs)

Expert System (Liao 2005)

Question Answering system (Hirschman & al)

Multiagent system (Wooldridge 2009)

8

Page 9: Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

Chatbot – state of the art

9

Page 10: Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

Ongoing work

10

Page 11: Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

Our proposal

11

Combining the benefits of both QA systems & NLDs to propose :

A rich KB for data extraction and reasoning

NLP tools to interpret user's question

NLG techniques to generate well-formed sentences.

Integrating Dialog mode to keep user interacting with the system.

Page 12: Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

Our starting point

12

QAKiS (Cabrio & al 1)

Question Answering wiKiframework System

Test it at qakis.org

Page 13: Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

Our contributions

13

QAKiS from Open Domain (DBpedia)

=> Closed Domain (Commercial)

Natural Language Generation

Question with constraints (N-Relations)

Dialog Mode

Page 14: Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

14

Question

Dialog Manager

Response

Pattern Finder Triple Feeder

Type Recognizer NLP

Off–line Feed KB

NLG

Subject Predicat Value

Ontology

Triple store

Property Recognizer

NE Recognizer

Query Generator

N-Relations Handler

Pattern Picker

Response Formater

Page 15: Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

15

Question

Dialog Manager

Response

Pattern Finder Triple Feeder

Type Recognizer NLP

Off–line Feed KB

NLG

Subject Predicat Value

Ontology

Triple store

Property Recognizer

NE Recognizer

Query Generator

N-Relations Handler

Pattern Picker

Response Formater

Page 16: Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

Knowledge Base creation

16

<sbo:Product rdf:about=‘#Nexus_5’ > <sbo:hasPrice>400</sbo:hasPrice>

</sbo:Product>

Amazon API

BestBuy API

eBay API

[eBay, Amazon, BestBuy] API Ex : getPrice(Nexus_5) => 400$

Data Transformer

RDF Knowledge Base

Page 17: Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

Knowledge Base - Example

17

Page 18: Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

18

Question

Dialog Manager

Response

Pattern Finder Triple Feeder

Type Recognizer NLP

Off–line Feed KB

NLG

Subject Predicat Value

Ontology

Triple store

Property Recognizer

NE Recognizer

Query Generator

N-Relations Handler

Pattern Picker

Response Formater

Page 19: Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

Ontology reuse

19

Why we need an Ontology ?

Data structuration, Domain representation, Inference.

Existing ontologies on commercial domain

Schema.org Ontology

Covers several domains

Used by state of the art search engines

Partial coverage of the commercial domain

GoodRelations Ontology (Hepp 2008)

Better coverage of the commercial domain

Page 20: Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

GoodRelations Ontology

20

Page 21: Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

GoodRelations Ontology

21

Page 22: Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

22

Question

Dialog Manager

Response

Pattern Finder Triple Feeder

Type Recognizer NLP

Off–line Feed KB

NLG

Subject Predicat Value

Ontology

Triple store

Property Recognizer

NE Recognizer

Query Generator

N-Relations Handler

Pattern Picker

Response Formater

Page 23: Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

Pattern Extraction - Algorithm API based method

Crawler & annotation based method

For each property

Parse product pages

Get all sentences containing the domain and range values

Make generic patterns

- All pages are tested !

+ Finds extra patterns

+ Easy to implement

For each page => {Subject}

Parse annotation

=> Graph representing the page

For each property

Get all sentences containing the domain and range values

Make generic patterns

- Requires annotated pages

+ More efficient

+ Less time execution

23

Page 24: Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

Pattern extraction – API method

24

Subject

<sch:hasDimension>

<sch:hasDisplay>

Page 25: Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

Pattern extraction – Crawler Method

25

Properties

metadata

Sentences expressing properties

Page 26: Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

26

Question

Dialog Manager

Response

Pattern Finder Triple Feeder

Type Recognizer NLP

Off–line Feed KB

NLG

Subject Predicat Value

Ontology

Triple store

Property Recognizer

NE Recognizer

Query Generator

N-Relations Handler

Pattern Picker

Response Formater

Page 27: Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

Property Matching Module

27

<sbo:hasPrice>

[Product] price is [Double]

The price of [Product] is [Double]

[Product] costs [Double]

Give me the price of a Nexus 5! High score

Page 28: Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

Property Matching (N-Relation)

28

2-relations : Give me the address of Nexus 5 seller !

Give me the Nexus 5 seller !

Give me his address ! => high score

NE : Nexus 5 => [Product]

<hasAddress>

Domain : Seller Range : Address

<soldBy>

Domain : Product Range : Seller Same type

Nexus_5

LaFnac

10 Jean Medecin, 06000, Nice

Page 29: Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

Property Matching (N-Relation)

29

Property1

Domain : D1 Range : R1

Property2

Domain : D2 Range : R2

Property4

Domain : D4 Range : R4

Property5

Domain : D5 Range : R5

Property3

Domain : D3 Range : R3

Graph representing the question :

Or / And ? Or / And ?

No link ???

No domain or no Range ?!

Page 30: Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

30

Question

Dialog Manager

Response

Pattern Finder Triple Feeder

Type Recognizer NLP

Off–line Feed KB

NLG

Subject Predicat Value

Ontology

Triple store

Property Recognizer

NE Recognizer

Query Generator

N-Relations Handler

Pattern Picker

Response Formater

Page 31: Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

NL Generation

31

<sbo:hasPrice>

{subject} price is {value}

{subject} costs {value}

Give me the price of a Nexus 5!

Nexus 5 costs 400$!

Page 32: Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

Give me the price of a Nexus 5!

Dialog Manager

Nexus 5 costs 400$

Pattern Finder Triple Feeder

<sbo:Product> NLP

Off–line Feed KB

NLG

Subject Predicat Value Nexus5 hasPrice 400$

Ontology

Triple store

<sbo:hasPrice>

<sbr:Nexus_5>

Query Generator Select ?v where { <sbr:Nexus_5> <sbo:hasPrice> ?v }

{subject} costs {value}

Nexus 5 costs 400$!

Page 33: Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

Perspectives Short term :

NE Recognition improvement

KNN, Similarity, N-Gram, TF-IDF algorithms

N-Relations Implementation

Scale to a bigger KB

Middle term :

Dialog Mode

Multiagent systems

Conversational behavior systems

Serendipity

33

Page 34: Amine Hallili, PhD student Catherine Faron Zucker & Fabien ...wimmics.inria.fr/doc/slides/amine.pdf · Pattern Extraction - Algorithm API based method Crawler & annotation based method

References (Allen et al) J. F. Allen, D. K. Byron, M. Dzikovska, G. Ferguson, L. Galescu, and A. Stent. Toward conversational human-computer interaction. AI Magazine, 22(4):2738, 2001.

(Liao 2005) S.-H. Liao. Expert system methodologies and applications - a decade review from 1995 to 2004. Expert Syst. Appl., 28(1):93-103, 2005.

(Hirschman & al) L. Hirschman and R. J. Gaizauskas. Natural language question answering: the view from here. Natural Language Engineering, 7(4):275300, 2001.

(Wooldridge 2009) M. J. Wooldridge. An Introduction to MultiAgent Systems (2. ed.). Wiley, 2009.

(Cabrio & al 1) E. Cabrio, J. Cojan, A. P. Aprosio, B. Magnini, A. Lavelli, and F. Gandon. Qakis: an open domain qa system based on relational patterns. In International Semantic Web Conference (Posters & Demos), 2012.

(Cabrio & al .2) E. Cabrio, J. Cojan, A. Palmero Aprosio, and F. Gandon. Natural language interaction with the web of data by mining its textual side. Intelligenza Articiale, 6(2):121-133, 2012.

(Augello & al .1) A. Augello, G. Pilato, G. Vassallo, and S. Gaglio. A semantic layer on semi-structured data sources for intuitive chatbots. In CISIS, pages 760-765, 2009.

(Augello & al .2) A. Augello, G. Pilato, A. Mach, and S. Gaglio. An approach to enhance chatbot semantic power and maintainability: Experiences within the frasi project. In ICSC, pages 186-193. IEEE Computer Society, 2012.

(Hepp 2008) M. Hepp. Goodrelations: An ontology for describing products and services offers on the web. In EKAW, pages 329-346, 2008.

34