Towards Learning Dialogue Structures from Speech Data and Domain Knowledge: Challenges to Conceptual Clustering using Multiple and Complex Knowledge Sources

Jens-Uwe Möller
Natural Language Systems Division, Dept. of Computer Science, Univ. of Hamburg

Overview
- Dialog modeling is based on a set of units called dialog acts
- Dialog acts taken from theory don't fit a specific domain
- Labeling dialogs is time-consuming and subjective
- Goal: learn application-specific dialog acts from speech data using conceptual clustering

The learning task
- Learning dialog acts from turns
- Unsupervised classification (no prior definition of dialog acts is given)
- Hierarchical classification with inspectable classification rules

Features
- Domain knowledge: structure of the task; task knowledge represented by goals and plans
- Word recognizer: word hypotheses
- Prosodic data: pauses and stress mark important units
- Lexical semantics
- Syntax (less important in spoken dialog)
- Semantics (larger units of lexical semantics)

COBWEB
- Symbolic machine learning algorithm
- Builds a classification tree
- Distinctions between subnodes are made by a function over all attributes
- Supports probabilistic data
- Supports multiple overlapping hierarchies (for ambiguous cases)
- Can handle multiple entries of one attribute (e.g. a stream of words)
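
The slide does not name the function used to distinguish subnodes. In the standard formulation of COBWEB that function is category utility, so the sketch below assumes that measure, nominal attributes, and non-empty clusters. The function names and the toy turn features (`stress`, `topic`) are hypothetical and only illustrate how a partition of turns could be scored.

```python
from collections import Counter

def expected_correct_guesses(instances):
    """Sum over all (attribute, value) pairs of P(attribute = value)^2."""
    n = len(instances)
    counts = Counter()
    for inst in instances:
        for attr, value in inst.items():
            counts[(attr, value)] += 1
    return sum((c / n) ** 2 for c in counts.values())

def category_utility(partition):
    """Category utility of a partition given as a list of non-empty clusters.

    Each cluster is a list of instances; each instance is a dict that maps
    an attribute name to a nominal value.
    """
    all_instances = [inst for cluster in partition for inst in cluster]
    n = len(all_instances)
    baseline = expected_correct_guesses(all_instances)
    gain = sum(
        (len(cluster) / n) * (expected_correct_guesses(cluster) - baseline)
        for cluster in partition
    )
    return gain / len(partition)

# Hypothetical toy turns described by two nominal attributes.
turns = [
    {"stress": "yes", "topic": "date"},
    {"stress": "yes", "topic": "date"},
    {"stress": "no", "topic": "place"},
    {"stress": "no", "topic": "place"},
]
good_split = [turns[:2], turns[2:]]
bad_split = [[turns[0], turns[2]], [turns[1], turns[3]]]

print(category_utility(good_split))
print(category_utility(bad_split))
```

A partition that groups turns with matching attribute values scores higher than one that mixes them, which is the property a tree-building step would exploit when deciding where to place a new turn.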

COBWEB (2)
- Learning from simultaneous events
- Learning from structured data: Conceptual Graphs
- Learning case descriptions from terminological descriptions
- Subsumption = correlation criterion over structured data, e.g. subsumption of individuals to classes

Metrics for Measuring Domain Independence of Semantic Classes

Andrew Pargellis, Eric Fosler-Lussier, Alexandros Potamianos, Chin-Hui Lee
Dialogue Systems Research Dept., Bell Labs, Lucent Technologies, Murray Hill, NJ, USA

Introduction
- Employ semantic classes (concepts) from another domain
- Need to identify domain-independent concepts based on comparison across domains
- Domain-independent concepts should occur in similar syntactic (lexical) contexts across domains

Comparing concepts across domains
- Concept-comparison method
- Concept-projection method

Concept-comparison method
- Find the similarity between all pairs of concepts across the two domains
- Two concepts are similar if their respective bigram contexts are similar
- Use left- and right-context bigram language models

Kullback-Leibler (KL) distance
- Compare how "san francisco" and "newark" are used in the Travel domain with how "comedies" and "westerns" are used in the Movie domain

Distance between two concepts
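
The distance formula itself did not survive on this slide. As a rough illustration only, the sketch below assumes a symmetric KL distance summed over the left-context and right-context word distributions of two concepts, with add-one smoothing over their shared context vocabulary. The smoothing scheme, the function names, and the toy counts are assumptions, not the authors' definition.

```python
import math
from collections import Counter

def smoothed_distribution(counts, vocab, alpha=1.0):
    """Additively smoothed probability of each context word in vocab."""
    total = sum(counts.get(w, 0) for w in vocab) + alpha * len(vocab)
    return {w: (counts.get(w, 0) + alpha) / total for w in vocab}

def kl_divergence(p, q):
    """KL(p || q) for two distributions defined over the same keys."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)

def concept_distance(ctx_a, ctx_b):
    """Symmetric KL distance summed over left and right contexts.

    ctx_a and ctx_b map "left" and "right" to Counters of the words seen
    immediately before and after the members of each concept.
    """
    total = 0.0
    for side in ("left", "right"):
        vocab = set(ctx_a[side]) | set(ctx_b[side])
        p = smoothed_distribution(ctx_a[side], vocab)
        q = smoothed_distribution(ctx_b[side], vocab)
        total += kl_divergence(p, q) + kl_divergence(q, p)
    return total

# Hypothetical context counts for three concepts.
city = {"left": Counter({"to": 5, "from": 4}), "right": Counter({"on": 3, "</s>": 2})}
airport = {"left": Counter({"to": 3, "from": 5}), "right": Counter({"on": 2, "</s>": 3})}
genre = {"left": Counter({"any": 4, "see": 3}), "right": Counter({"playing": 5})}

print(concept_distance(city, airport))
print(concept_distance(city, genre))
```

Concepts used in similar contexts (the two travel-like concepts here) end up closer than concepts whose contexts barely overlap.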

Concept-projection method
- Measures how well a single concept from one domain is represented in another domain
- Example: how the words "comedies" and "westerns" are used in both domains
- Useful for identifying the degree of domain independence of a particular concept

Result: Concept-comparison

Result: Concept-projection

Concept Example

Semi-Automatic Acquisition of Domain-Specific Semantic Structures

Siu K.C., Meng H.M.
Human-Computer Communications Laboratory, Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong

Grammar induction
- Uses unannotated corpora
- Portable across domains and languages
- Output grammar has reasonable coverage of within-domain data and rejects out-of-domain data
- Amenable to interactive refinement by a human
- Supports optional injection of prior knowledge

Spatial clustering
- Uses Kullback-Leibler distance
- Uses left and right contexts
- Considers words w1, w2 (later c1, c2) pairwise, restricted to words with at least a preset minimum number of occurrences (set to 5)

Temporal clustering
- Uses Mutual Information (MI)
- The N pairs with the highest MI are clustered (N=5 in the experiment)
- Spatial clustering and temporal clustering are applied iteratively
- Post-processing by a human
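
The slides do not give the MI formula or how it is estimated. The sketch below assumes pointwise mutual information over adjacent word pairs, estimated from raw unigram and bigram counts, and returns the N highest-scoring pairs as merge candidates. The `min_count` threshold, the function name, and the toy corpus are hypothetical.

```python
import math
from collections import Counter

def top_mi_pairs(sentences, n=5, min_count=2):
    """Return the n adjacent word pairs with the highest pointwise MI.

    sentences is a list of token lists. Pairs seen fewer than min_count
    times are skipped (an assumed guard against unreliable estimates).
    """
    unigrams = Counter()
    bigrams = Counter()
    for sent in sentences:
        unigrams.update(sent)
        bigrams.update(zip(sent, sent[1:]))
    total_uni = sum(unigrams.values())
    total_bi = sum(bigrams.values())

    scored = []
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue
        p_pair = count / total_bi
        p1 = unigrams[w1] / total_uni
        p2 = unigrams[w2] / total_uni
        scored.append((math.log(p_pair / (p1 * p2)), (w1, w2)))
    scored.sort(reverse=True)
    return [pair for _, pair in scored[:n]]

# Hypothetical toy corpus.
corpus = [
    "i want to fly to new york".split(),
    "show flights to new york please".split(),
    "i want to go to boston".split(),
    "flights to new york on monday".split(),
]
print(top_mi_pairs(corpus, n=2))
```

In an iterative setup, each selected pair would be replaced by a single phrase token before the next round of spatial clustering.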

Automatic Concept Identification in Goal-Oriented Conversations

Ananlada Chotimongkol and Alexander I. Rudnicky
Language Technologies Institute, Carnegie Mellon University

Concept identification
- First step towards the goal of automatically inferring domain ontologies
- Goal-oriented human-human conversation has a clear structure
- This structure can be used to automatically identify domain topics, e.g. dialog classification

Clustering algorithm
- Hierarchical clustering
- Mutual-information based: criterion = minimize the loss of average mutual information
- Kullback-Leibler based: criterion = merge the word pair with the minimum distance

Evaluation metrics
- Reference concepts from a class-based n-gram model
- Cluster concept = majority concept
- Precision
- Recall
- Singularity score (SS)
- Quality score (QS)
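
The slide lists the metrics without defining them. The sketch below is one plausible reading of the first three items: a cluster is labeled with the majority reference concept among its words, precision is the share of cluster words that belong to that concept, and recall is the share of that concept's reference words found in the cluster. SS and QS are left out because their definitions are not given here. The function name and the toy data are hypothetical.

```python
from collections import Counter

def score_cluster(cluster, reference):
    """Label a cluster with its majority concept and score it.

    cluster is a list of words; reference maps each known word to its
    reference concept. Words missing from the reference count against
    precision. Returns (majority_concept, precision, recall).
    """
    labels = Counter(reference[w] for w in cluster if w in reference)
    if not labels:
        return None, 0.0, 0.0
    majority, hits = labels.most_common(1)[0]
    concept_size = sum(1 for label in reference.values() if label == majority)
    precision = hits / len(cluster)
    recall = hits / concept_size
    return majority, precision, recall

# Hypothetical reference concepts and one induced cluster.
reference = {
    "boston": "city",
    "newark": "city",
    "denver": "city",
    "monday": "day",
    "friday": "day",
}
cluster = ["boston", "newark", "monday"]
print(score_cluster(cluster, reference))
```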
