QUARK: Architecture for a question answering machine

Felix Burkhardt


Quark: Architecture for a QA Machine Contents

• Motivation
• Architecture
• Modules
  • Frontends
  • Spellchecker
  • Dialog manager
  • Semantic Search
  • Answer Templates
• Summary & Outlook

Figure: Chatbot "Tinka" on the T-mobile.at website


Quark: Architecture for a QA Machine Motivation

• Automatic question answering systems are valuable because they ease the work of human agents and save costs.
• One example is so-called "chatbots", i.e. automatic dialog systems.


Quark: Architecture for a QA Machine Architecture

• We developed an architecture to develop, test and compare several components of such a question answering system.
• It is also used to build demonstrators for management.


Quark: Architecture for a QA Machine Frontends

• There are several interfaces, e.g. mobile apps for demonstration
• Web interfaces for testing and maintenance


Quark: Architecture for a QA Machine Spellchecker

• In the current Tinka implementation we use the open source Hunspell spellchecker
• The API was algorithmically enhanced by supporting
  • a weighted user lexicon
  • recognition of bi-grams and tri-grams
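The two enhancements can be illustrated with a minimal sketch. It does not call Hunspell: `difflib` from the Python standard library stands in as the similarity measure, and the lexicon entries, weights, cutoff and function names are all assumptions made up for illustration, not the Tinka implementation.

```python
import difflib

# Hypothetical weighted user lexicon: entry -> weight (higher = preferred).
# Entries may be single words, bi-grams or tri-grams.
USER_LEXICON = {
    "prepaid": 3.0,
    "sim karte": 5.0,      # bi-gram
    "magenta mobil": 4.0,  # bi-gram
    "roaming": 2.0,
}


def best_match(candidate, lexicon, cutoff=0.85):
    """Return (entry, score) of the best lexicon entry, or None.

    An entry qualifies if its string similarity reaches the cutoff;
    among qualifying entries, similarity * weight decides.
    """
    best = None
    for entry, weight in lexicon.items():
        sim = difflib.SequenceMatcher(None, candidate, entry).ratio()
        if sim >= cutoff:
            score = sim * weight
            if best is None or score > best[1]:
                best = (entry, score)
    return best


def correct(text, lexicon=USER_LEXICON, max_n=3):
    """Correct a text, trying tri-grams, then bi-grams, then single words."""
    tokens = text.lower().split()
    out = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_n, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            match = best_match(candidate, lexicon)
            if match is not None:
                out.append(match[0])  # replace the n-gram by the lexicon entry
                i += n
                break
        else:
            out.append(tokens[i])     # no lexicon entry is close enough
            i += 1
    return " ".join(out)


if __name__ == "__main__":
    print(correct("meine sim kartte ist weg"))
```

Trying the longest n-gram first lets a multi-word product name win over its individual words; the weights only break ties between entries that are all close enough to the input.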


Quark: Architecture for a QA Machine Dialog manager

• Central instance
• A dialog model for slot-filling and ellipsis handling is still missing
• The main challenge is currently the order of the result list, i.e. a quality measure for the answers from several modules

QA-item: central data structure for answers
• Question
• Interpretation
• Answer
• Probability: each knowledge source must provide a probability value so the results can be ranked in a meaningful way. Some knowledge sources are more specific than others (e.g. QA-search more than FAQ) and are ranked higher per se.

Dialog-manager: central instance that takes queries, distributes them to several answer modules and consolidates the output. Later, session handling for multi-step dialogs will also be handled here.
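The QA-item and the distribute-and-consolidate step can be sketched as follows. This is a minimal illustration under assumptions: the field names follow the slide, but the `source` field, the priority table, the two toy answer modules and the lexicographic ranking rule (source priority first, probability second) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class QAItem:
    """Central data structure for answers, with the four fields from the slide."""
    question: str
    interpretation: str
    answer: str
    probability: float
    source: str = ""  # assumption: the producing module is recorded for ranking


# Hypothetical priorities: more specific sources rank higher per se.
SOURCE_PRIORITY = {"qa-search": 2, "faq": 1}

AnswerModule = Callable[[str], List[QAItem]]


def faq_module(query: str) -> List[QAItem]:
    """Toy stand-in for an FAQ lookup."""
    return [QAItem(query, "faq match", "See the FAQ entry on invoices.", 0.9, "faq")]


def qa_search_module(query: str) -> List[QAItem]:
    """Toy stand-in for the more specific QA search."""
    return [
        QAItem(query, "invoice date", "Invoices are sent monthly.", 0.6, "qa-search"),
        QAItem(query, "invoice amount", "The amount is shown online.", 0.7, "qa-search"),
    ]


class DialogManager:
    """Takes a query, distributes it to all modules, consolidates the output."""

    def __init__(self, modules: Iterable[AnswerModule]):
        self.modules = list(modules)

    def ask(self, query: str) -> List[QAItem]:
        items: List[QAItem] = []
        for module in self.modules:
            items.extend(module(query))
        # Rank by source priority first, then by the module's probability.
        items.sort(
            key=lambda it: (SOURCE_PRIORITY.get(it.source, 0), it.probability),
            reverse=True,
        )
        return items


if __name__ == "__main__":
    for item in DialogManager([faq_module, qa_search_module]).ask("When is my invoice sent?"):
        print(item.source, item.probability, item.answer)
```

The sort key makes the open problem visible: probabilities from different modules are only compared within one source, because a common quality measure across modules is not yet available.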


Quark: Architecture for a QA Machine Semantic Search

• For content that already contains answers (FAQ entries and content extracted from the forum), the user query only has to be matched with the question.
• Finding important words and synonyms for "query expansion" can be done with GATE's term extractor, which had to be adapted for German.
• The search index of a SOLR search engine can then be enhanced by these terms.
• The number of matching terms would be part of the quality criterion.
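The matching idea can be sketched in a few lines. The sketch uses neither GATE nor SOLR: the synonym table, stop word list and FAQ entries are invented for illustration, and the expansion is applied to the query rather than to the index to keep the example short.

```python
import re

# Hypothetical synonym table; in the architecture such terms would come
# from a term extractor rather than being written by hand.
SYNONYMS = {
    "bill": {"invoice"},
    "cancel": {"terminate"},
    "phone": {"mobile", "handset"},
}

STOPWORDS = {"how", "do", "i", "my", "can", "where", "the", "a", "is"}

# Toy FAQ: (question, answer) pairs.
FAQ = [
    ("How do I terminate my contract?", "A1"),
    ("Where can I find my invoice?", "A2"),
]


def terms(text):
    """Lower-cased content words of a text."""
    return {t for t in re.findall(r"\w+", text.lower()) if t not in STOPWORDS}


def expand(query_terms):
    """Add the known synonyms of every query term."""
    expanded = set(query_terms)
    for term in query_terms:
        expanded |= SYNONYMS.get(term, set())
    return expanded


def search(query, faq=FAQ):
    """Return (matching_terms, question, answer) tuples, best match first."""
    query_terms = expand(terms(query))
    hits = []
    for question, answer in faq:
        n_matching = len(query_terms & terms(question))
        if n_matching:
            hits.append((n_matching, question, answer))
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return hits


if __name__ == "__main__":
    print(search("How can I cancel my contract"))
```

The count of matching terms is returned with each hit so that it can feed into a quality criterion, as the last bullet suggests.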


Quark: Architecture for a QA Machine Topic classification of Forum data

• Together with the DAI (Distributed Artificial Intelligence) Labor of TU Berlin we investigated the topic classification of the "Telekom Hilft" user forum
• We compared "classical machine learning" with deep neural nets
• Both reached 55% accuracy and 83% "one in three" accuracy
• We also investigated subclustering with a DNN (4 subclusters per category)

Figure (processing steps): Apache Lucene preprocessing; NER / disambiguation; Multinomial Naive Bayes; Deep Temporal Convolutional Neural Network *; Comparison

* Zhang, Zhao, LeCun, 2015
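To make the two reported measures concrete, here is a minimal sketch of a multinomial Naive Bayes classifier together with a top-k accuracy function (k=1 is plain accuracy, k=3 one reading of "one in three"). The training posts, topic labels and class design are toy assumptions; this is not the experimental setup of the study, and the neural network is not covered.

```python
import math
from collections import Counter, defaultdict


class MultinomialNB:
    """Multinomial Naive Bayes on whitespace tokens with add-alpha smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for doc, label in zip(docs, labels):
            tokens = doc.lower().split()
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def rank(self, doc):
        """Return all topics, most probable first."""
        n_docs = sum(self.class_counts.values())
        scores = {}
        for label, count in self.class_counts.items():
            total = sum(self.word_counts[label].values())
            denom = total + self.alpha * len(self.vocab)
            score = math.log(count / n_docs)  # log prior
            for token in doc.lower().split():
                if token in self.vocab:       # unknown words are ignored
                    score += math.log((self.word_counts[label][token] + self.alpha) / denom)
            scores[label] = score
        return sorted(scores, key=scores.get, reverse=True)


def top_k_accuracy(model, docs, labels, k):
    """Share of documents whose true topic is among the k best-ranked topics."""
    hits = sum(1 for doc, label in zip(docs, labels) if label in model.rank(doc)[:k])
    return hits / len(docs)


# Toy forum posts with made-up topics.
TRAIN = [
    ("router wlan connection slow", "internet"),
    ("dsl router reset", "internet"),
    ("invoice amount wrong", "billing"),
    ("invoice payment date", "billing"),
    ("sim card locked", "mobile"),
    ("roaming sim abroad", "mobile"),
    ("tv receiver channel", "tv"),
]

if __name__ == "__main__":
    model = MultinomialNB().fit(*zip(*TRAIN))
    test_docs = ["router slow", "router invoice wrong"]
    test_labels = ["internet", "internet"]
    print(top_k_accuracy(model, test_docs, test_labels, k=1))
    print(top_k_accuracy(model, test_docs, test_labels, k=3))
```

The second test post mixes vocabulary from two topics, which is the kind of case where the correct topic misses the first rank but still appears among the top three.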


Quark: Architecture for a QA Machine Answer Templates

• We also use GATE to annotate terms in user queries, based on gazetteers.
• Each string gets annotated with part-of-speech, lemma and NER
• Disambiguation is not done yet
• Via JAPE grammars a pre-defined answer (and question) template is determined
• With this template, an RDF database is queried via SPARQL and the answer is filled in.
• SPARQL is the query language for RDF, a W3C recommendation for semantic annotation
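The chain from annotation to filled answer can be sketched as follows. Everything here is a stand-in: a dictionary replaces the GATE gazetteer, regular expressions replace the JAPE grammars, and an in-memory table replaces the RDF store, so the SPARQL string is only built, not executed. The `ex:` names, products and values are invented for illustration.

```python
import re

# Hypothetical gazetteer: surface form -> resource in a made-up namespace.
GAZETTEER = {
    "tariff s": "ex:TariffS",
    "tariff m": "ex:TariffM",
}

# Hypothetical templates: a trigger pattern, the RDF predicate to query
# and the answer text with slots.
TEMPLATES = [
    {
        "pattern": re.compile(r"\b(price|cost|costs)\b"),
        "predicate": "ex:monthlyPrice",
        "answer": "{label} costs {value} EUR per month.",
    },
    {
        "pattern": re.compile(r"\bdata\b"),
        "predicate": "ex:dataVolumeGB",
        "answer": "{label} includes {value} GB of data.",
    },
]

# Toy stand-in for the RDF store: (subject, predicate) -> object.
TRIPLES = {
    ("ex:TariffS", "ex:monthlyPrice"): "20",
    ("ex:TariffS", "ex:dataVolumeGB"): "5",
    ("ex:TariffM", "ex:monthlyPrice"): "30",
}


def annotate(query):
    """Return (surface form, resource) of the first gazetteer hit, or None."""
    lowered = query.lower()
    for surface, resource in GAZETTEER.items():
        if surface in lowered:
            return surface, resource
    return None


def build_sparql(subject, predicate):
    """Build the SPARQL query a real RDF store would receive."""
    return (
        "PREFIX ex: <http://example.org/> "
        f"SELECT ?value WHERE {{ {subject} {predicate} ?value . }}"
    )


def answer(query):
    """Pick a template, look up the value and fill the answer text."""
    entity = annotate(query)
    if entity is None:
        return None
    surface, resource = entity
    for template in TEMPLATES:
        if template["pattern"].search(query.lower()):
            value = TRIPLES.get((resource, template["predicate"]))
            if value is not None:
                return template["answer"].format(label=surface.title(), value=value)
    return None


if __name__ == "__main__":
    print(build_sparql("ex:TariffS", "ex:monthlyPrice"))
    print(answer("What does Tariff S cost?"))
```

Substring lookup in the gazetteer is deliberately naive and would confuse overlapping names; this mirrors the slide's remark that disambiguation is not done yet.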


Quark: Architecture for a QA Machine Summary and Outlook

• We presented an architecture to answer customer questions automatically

• It is based on open source technology like
  • GATE
  • SOLR
  • RDF/OWL
• It is used to evaluate vendors, build demonstrators and test sub-modules for production
• Many things are missing, e.g.:
  • Dialog models
  • Common quality-of-answer criterion across sub-modules
  • Machine learning of underlying ontology


Quark: Architecture for a QA Machine

Thank you for your attention!


QUARK BACKUP Ontology learning


QUARK BACKUP Machine categorization
