Felix Burkhardt | ARCHITECTURE FOR A QUESTION ANSWERING MACHINE

QUARK: Architecture for a question answering machine

Felix Burkhardt

Quark: Architecture for a QA Machine Contents

Motivation

Architecture

Modules

Frontends

Spellchecker

Dialogmanager

Semantic Search

Answer Templates

Summary & Outlook

Chatbot „Tinka“ on the T-mobile.at website

Quark: Architecture for a QA Machine Motivation

Automatic question answering systems are valuable because they ease the work of human agents and save costs.

One example is so-called „chatbots“, i.e. automatic dialog systems.

Quark: Architecture for a QA Machine Architecture

• We developed an architecture to develop, test and compare several components of such a question answering system.

• It is also used to build demonstrators for management

Quark: Architecture for a QA Machine Frontends

• There are several interfaces, e.g. mobile apps for demonstration

• Web interfaces for testing and maintenance

Quark: Architecture for a QA Machine Spellchecker

• In the current Tinka implementation we use the open source Hunspell spellchecker

• The API was algorithmically enhanced to support:

• A weighted user lexicon

• Recognition of bi-grams and tri-grams
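
The two enhancements can be illustrated with a minimal sketch. It does not use the Hunspell API, and it is not the Tinka implementation: the lexicon entries, the weights, the similarity threshold and the function names are all assumptions made for illustration. Similarity is approximated with Python's standard `difflib`.

```python
from difflib import SequenceMatcher

# Hypothetical weighted user lexicon: entry -> weight (higher = preferred).
# Entries may be single words, bi-grams or tri-grams.
USER_LEXICON = {
    "prepaid": 2.0,
    "magenta": 1.5,
    "sim karte": 3.0,            # bi-gram
    "magenta mobil start": 2.5,  # tri-gram
}


def similarity(a: str, b: str) -> float:
    """String similarity in [0, 1] (stand-in for a real spellchecker score)."""
    return SequenceMatcher(None, a, b).ratio()


def best_entry(text: str, min_similarity: float = 0.8):
    """Return the lexicon entry with the highest weighted similarity, or None."""
    best, best_score = None, 0.0
    for entry, weight in USER_LEXICON.items():
        sim = similarity(text, entry)
        if sim >= min_similarity and sim * weight > best_score:
            best, best_score = entry, sim * weight
    return best


def correct(query: str, max_n: int = 3) -> str:
    """Greedy left-to-right correction that tries the longest n-gram first."""
    tokens = query.lower().split()
    out, i = [], 0
    while i < len(tokens):
        for n in range(min(max_n, len(tokens) - i), 0, -1):
            match = best_entry(" ".join(tokens[i:i + n]))
            if match is not None:
                out.append(match)
                i += n
                break
        else:
            # No lexicon entry is close enough: keep the token unchanged.
            out.append(tokens[i])
            i += 1
    return " ".join(out)
```

With this toy lexicon, `correct("sim kartte defekt")` repairs the bi-gram as a unit instead of correcting each word in isolation.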

Quark: Architecture for a QA Machine Dialogmanager

• Central instance

• A dialog model for slot-filling and ellipsis handling is still missing

• The main challenge is currently the order of the result list, i.e. a quality measure for the answers from several modules

QA-item

Central data structure for answers:

- Question
- Interpretation
- Answer
- Probability: each knowledge source must provide a probability value so that the results can be ranked in a meaningful way. Some knowledge sources are more specific than others (e.g. QA-search more than FAQ) and are ranked higher per se.

Dialog-manager

Central instance that takes queries, distributes them to several answer modules and consolidates the output. Later, session handling for multi-step dialogs will also be handled here.
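
The QA-item and the dialog manager described above can be sketched as follows. This is a rough illustration under stated assumptions, not the Quark code: the class and field names, the `SOURCE_PRIOR` table and the idea of multiplying the probability by a per-source prior are hypothetical, chosen only to show how more specific sources could be ranked higher.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class QAItem:
    """Central data structure for one candidate answer."""
    question: str        # the user query
    interpretation: str  # how the module understood the query
    answer: str
    probability: float   # confidence reported by the knowledge source
    source: str = ""     # filled in by the dialog manager


# Hypothetical priors: more specific sources get a higher weight.
SOURCE_PRIOR = {"qa_search": 1.0, "faq": 0.8}

AnswerModule = Callable[[str], List[QAItem]]


class DialogManager:
    """Takes a query, distributes it to all modules, consolidates the output."""

    def __init__(self, modules: Dict[str, AnswerModule]):
        self.modules = modules

    def ask(self, query: str) -> List[QAItem]:
        items = []
        for name, module in self.modules.items():
            for item in module(query):
                item.source = name
                items.append(item)
        # Assumed ranking rule: probability scaled by the source prior.
        return sorted(items, key=self._score, reverse=True)

    @staticmethod
    def _score(item: QAItem) -> float:
        return item.probability * SOURCE_PRIOR.get(item.source, 0.5)
```

A module is any callable that maps a query string to a list of `QAItem` objects, so new knowledge sources can be registered without changing the manager.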

Quark: Architecture for a QA Machine Semantic Search

• With content that already contains answers (FAQs and content extracted from the forum), the user query only has to be matched with the question.

• Important words and synonyms for „query expansion“ can be found with GATE's term extractor, which had to be adapted for German.

• The search index of a SOLR search engine can then be enhanced with these terms.

• The number of matching terms would be part of the quality criterion
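
The matching idea above can be illustrated without GATE or SOLR. The sketch below is an assumption-laden toy: the synonym table, the stopword list and all function names are hypothetical, and a plain Python set stands in for the enhanced search index. It only shows how stored questions can be enriched with synonyms and how the number of matching terms can serve as a score.

```python
import re

# Hypothetical synonym table and stopword list (illustration only).
SYNONYMS = {"bill": {"invoice"}, "phone": {"mobile", "handset"}}
STOPWORDS = {"how", "do", "i", "my", "the", "a", "is", "where", "can", "to"}


def terms(text: str) -> set:
    """Lower-cased word set without stopwords."""
    return {t for t in re.findall(r"\w+", text.lower()) if t not in STOPWORDS}


def index_entry(question: str) -> set:
    """Terms of a stored question, enriched with their synonyms."""
    base = terms(question)
    expanded = set(base)
    for term in base:
        expanded |= SYNONYMS.get(term, set())
    return expanded


def rank(query: str, stored_questions: list) -> list:
    """Return (matching term count, question) pairs, best match first."""
    query_terms = terms(query)
    scored = [(len(query_terms & index_entry(q)), q) for q in stored_questions]
    return sorted((s for s in scored if s[0] > 0),
                  key=lambda s: s[0], reverse=True)
```

For the stored questions "Where can I find my bill?" and "How do I unlock my phone?", the query "invoice for my mobile handset" matches the second question on two terms and the first on one.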

Quark: Architecture for a QA Machine Topic classification of Forum data

• Together with the DAI (Distributed Artificial Intelligence) Labor of TU-Berlin we investigated the topic classification of the „Telekom Hilft“ user forum

• We compared „classical machine learning“ with deep neural nets.

• Both reached 55% accuracy and 83% for „one in three“

• We also investigated subclustering with a DNN (4 subclusters per category)

Diagram labels: Apache Lucene preprocessing; NER / disambiguation; Multinomial Naive Bayes; Deep Temporal Convolutional Neural Network *; Comparison

* Zhang, Zhao, LeCun, 2015
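
The two figures reported for the classifiers can be computed with one helper, assuming that „one in three“ means that the correct topic is among the three highest-scoring predictions. That reading is an assumption, as are the topic names and scores in the usage note; the sketch is independent of the classifier that produced the scores.

```python
def top_k_accuracy(scores, gold, k=1):
    """Share of items whose gold topic is among the k best-scored topics.

    scores: one dict per forum post, mapping topic -> classifier score
    gold:   the correct topic for each post
    """
    hits = 0
    for topic_scores, label in zip(scores, gold):
        ranked = sorted(topic_scores, key=topic_scores.get, reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(gold)
```

With `k=1` this is plain accuracy; with `k=3` it corresponds to the assumed „one in three“ measure. For two toy posts where the gold topic is ranked first for one and third for the other, `k=1` yields 0.5 and `k=3` yields 1.0.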

Quark: Architecture for a QA Machine Answer Templates

• We also use GATE to annotate terms in user queries, based on gazetteers.

• Each string gets annotated with part-of-speech, lemma and NER tags

• Disambiguation is not done yet

• Via JAPE grammars a pre-defined answer (and question) template is determined

• With this template, a SPARQL database (RDF store) is queried and the answer is filled in.

• SPARQL is the query language for RDF, a W3C Recommendation for semantic annotation
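
The last two steps, querying with a template and filling the answer, can be sketched with plain string templates. Everything specific here is hypothetical: the template id, the `ex:` vocabulary, the slot names and the sample values are invented for illustration, the template selection via JAPE is skipped, and the query is only built, not sent to an RDF store.

```python
from string import Template

# Hypothetical template store: one SPARQL query and one answer per template.
TEMPLATES = {
    "price_of_tariff": {
        "query": Template(
            "PREFIX ex: <http://example.org/telco#>\n"
            'SELECT ?price WHERE { ?t ex:name "$tariff" ; '
            "ex:monthlyPrice ?price . }"
        ),
        "answer": Template("The $tariff tariff costs $price per month."),
    }
}


def escape_literal(value: str) -> str:
    """Escape backslashes and quotes so the slot value stays a string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def build_query(template_id: str, slots: dict) -> str:
    """Fill the annotated slots into the SPARQL query of the chosen template."""
    safe = {name: escape_literal(value) for name, value in slots.items()}
    return TEMPLATES[template_id]["query"].substitute(safe)


def fill_answer(template_id: str, slots: dict, bindings: dict) -> str:
    """Combine query slots and result bindings into the final answer text."""
    return TEMPLATES[template_id]["answer"].substitute({**slots, **bindings})
```

In this sketch the annotation step would deliver the template id and the slots, for example `{"tariff": "Magenta S"}`, and the result bindings of the SPARQL query, for example `{"price": "20 EUR"}`, are passed to `fill_answer`.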

Quark: Architecture for a QA Machine Summary and Outlook

• We presented an architecture to answer customer questions automatically

• It is based on open source technology like

• GATE

• SOLR

• RDF/OWL

• It is used to evaluate vendors, build demonstrators and test sub-modules for production

• Many things are missing, e.g.:

• Dialog models

• Common quality-of-answer criterion across sub-modules

• Machine learning of underlying ontology

Quark: Architecture for a QA Machine

Thank you for your attention!

QUARK BACKUP Ontology learning

QUARK BACKUP Machine categorization

Technology
