Ştefan Trăuşan-Matu, Traian Rebedea
“Politehnica” University of Bucharest,
Research Institute for Artificial Intelligence of the Romanian Academy
Outline
• Chat Conversations with Multiple Participants
• A Polyphonic Model of Discourse
• Inter-animation Analysis
• The PolyCAFe Analysis System
March 23, 2010 Cicling 2010 - Iasi, Romania 2
Chat Conversations with Multiple Participants
• Multiple participants (≥3), conferencing style
• Very important given the spread of forums and instant messengers (chats)
• Important tool for
  • Computer-Supported Collaborative Learning (CSCL)
  • Cooperative work online
• Particular feature: multiple, parallel discussion chains
• There is a need for
  • Determining important utterances
  • Assessing the contributions of the participants
  • Measuring the degree of collaboration (inter-animation analysis)
Example: CSCL assignment
• Students had to debate in chat sessions in groups ranging from 3 to 8
• In the first part of the conversation, each student had to defend a technology by presenting its features and advantages, and criticize the others by invoking their flaws and drawbacks
• In the final part of the chat, they had to discuss how they could integrate all these technologies in a single online collaboration platform
CSCL assignment: Problems
• How to assist teachers in evaluating students’ work in chats?
• Offer assistance to students
• Abstraction tools
• Automatic feedback
Experiments with chat-based CSCL
• K-12 students solving mathematics problems both individually and collaboratively in the VMT project at Drexel University, Philadelphia, USA
• Computer Science students at Bucharest “Politehnica” University, Romania, in
  • Human-Computer Interaction course in Romanian and French – role playing and debate
  • Natural Language Processing – role playing and debate
  • Algorithm Design – problem solving
The VMT chat environment
VMT Referencing facility
Discourse
• Monologue
  • Unidirectional model of communication, from a speaker to a listener – theories and methods of analysis:
    • Rhetorical Schema Theory
    • Centering Theory
    • Co-reference resolution
• Dialogue
  • Phone-like, face-to-face style – units of analysis:
    • Speech acts
    • Dialog acts
    • Adjacency pairs
• Multi-party conversation
Theories for analysing multi-party conversation
• Discourse analysis (Tannen)
• Conversation analysis (Sacks, Jefferson, Schegloff)
• Accountable talk (Resnick)
• Transactivity (Teasley, Berkowitz & Gibbs, Joshi & Rose)
• Inter-animation (Bakhtin, Wegerif, Trausan-Matu)
• Polyphony (Bakhtin, Trausan-Matu et al.)
Transactivity analysis
• TF-IDF
• Latent Semantic Analysis
• Naïve Bayes
• Social Network Analysis
• WordNet (wordnet.princeton.edu)
• Support Vector Machines
• Collins’ perceptron
• TagHelper environment
Almost all of these are also based on a two-interlocutor model, in which one person speaks at a time, resulting in a single discussion thread
A socio-cultural perspective on Natural Language
• Sfard: “rather than speaking about ‘acquisition of knowledge,’ many people prefer to view learning as becoming a participant in a certain discourse” (2000)
• Wertsch: Lotman – text is a “thinking device” (1981)
• Stahl: “to learn is to become a skilled member of communities of practice … and to become competent at using their … speech genres” (2006)
• Koschmann: “the voices of others become woven into what we say, write, and think” (1999)
• Wegerif – teaching thinking skills by inter-animation: “meaning-making requires the inter-animation of more than one perspective” (2005)
Dialogism – Mikhail Bakhtin
• Basis for the CSCL paradigm (Koschmann, 1999)
• “… Any true understanding is dialogic in nature” (Voloshinov-Bakhtin, 1973)
• Opposed to de Saussure’s ideas:
  • Real-life dialog should be the focus, not written text
  • Words are not arbitrary
  • Utterances (not sentences) should be the unit of analysis
• Speech genres
• Polyphony
• Inter-animation of voices
Polyphony and counterpoint
• Concept derived from classical music
• “These are different voices singing variously on a single theme. This is indeed 'multivoicedness,' exposing the diversity of life and the great complexity of human experience. 'Everything in life is counterpoint, that is, opposition,'” (Bakhtin, 1984)
• Multiple voices – each utterance contains multiple voices
• Voices inter-animate in an unmerged way:
  • “a plurality of independent and unmerged voices and consciousnesses” (Bakhtin)
Polyphonic inter-animation
Polyphony
Words, voices and threads
• Positions assigned to participants – voices
• Additional voices – frequent concepts – repeated words become voices, stronger or weaker
• Voices continue and influence each other through explicit or implicit links
• Voices correspond to chains or threads of utterances:
  • repeated words
  • lexical chains
  • co-references
  • reasoning or argumentation
  • rhetorical schemas
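The simplest kind of chain listed above, repeated words, can be illustrated with a minimal sketch. It is not the PolyCAFe implementation: the stopword list, the function name `voice_chains` and the four-line chat are all invented for illustration, and chain length is used here only as a rough stand-in for how strong a voice is.

```python
import re
from collections import defaultdict

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "is", "are", "to", "of", "and", "for",
             "in", "we", "i", "it", "that", "can", "but"}


def voice_chains(utterances, min_length=2):
    """Map each repeated content word to the indices of the utterances using it.

    A word that occurs in at least `min_length` utterances is treated as a
    candidate voice; the list of indices is its chain (thread).
    """
    chains = defaultdict(list)
    for index, (_speaker, text) in enumerate(utterances):
        content_words = set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS
        for word in content_words:
            chains[word].append(index)
    return {word: idx for word, idx in chains.items() if len(idx) >= min_length}


# Invented chat, loosely in the spirit of a debate about technologies.
chat = [
    ("ana", "I think the wiki is best for shared documents"),
    ("dan", "A forum keeps discussions organised"),
    ("ana", "But a wiki can link to the forum"),
    ("ion", "We can embed the forum in the wiki"),
]

chains = voice_chains(chat)
for word, indices in chains.items():
    print(word, indices)
```

In this toy chat the words "wiki" and "forum" each form a chain of three utterances, so both would count as voices that run in parallel through the conversation.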
LTfLL - EU FP7 Project (2008-2011)
• Language Technologies for Lifelong Learning
• Netherlands, France, United Kingdom, Germany, Austria, Romania, Bulgaria
• PolyCAFe system (Polyphony-based Collaboration Analysis and Feedback generation)
• The system has just been validated with students and tutors
NLP pipe
• Spelling correction, stemmer, tokenizer, Named Entity Recognizer, POS tagger and parser, and NP-chunker. Stanford NLP software (http://nlp.stanford.edu/software)
• Spellchecker: Jazzy (http://www.ibm.com/developerworks/java/library/j-jazzy/)
• Alternative NLP pipes are under development:
  • GATE (http://gate.ac.uk)
  • LingPipe (http://aliasi.com/lingpipe/)
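As a rough illustration of how such a pipe chains its stages, the sketch below passes one chat utterance (quoted elsewhere in these slides) through two toy stages. It does not use the Stanford tools or Jazzy: `CORRECTIONS`, `tokenize`, `correct_spelling` and `run_pipe` are hypothetical stand-ins, and the stage order is an assumption.

```python
import re

# Hypothetical correction table standing in for a real spellchecker.
CORRECTIONS = {"stil": "still", "acount": "account"}


def tokenize(text):
    """Split an utterance into lowercase word and number tokens."""
    return re.findall(r"[a-z]+|\d+", text.lower())


def correct_spelling(tokens):
    """Replace known misspellings; leave every other token unchanged."""
    return [CORRECTIONS.get(token, token) for token in tokens]


def run_pipe(text, stages):
    """Feed the output of each stage into the next one."""
    data = text
    for stage in stages:
        data = stage(data)
    return data


tokens = run_pipe(
    "that would stil have to acount for the overlap that way",
    [tokenize, correct_spelling],
)
print(tokens)
```

Later stages (stemming, POS tagging, parsing) would be appended to the `stages` list in the same way, each consuming the previous stage's output.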
Pattern Language
• Regular expression search
• Synonyms, hypernyms and hyponyms via WordNet
• Words’ stems and their part of speech (POS)
• Utterances as the search unit: for example, specifying that a word should be searched for in the previous n utterances, or that two expressions should occur in two different utterances
Pattern Language (ex.)
• <S "convergence"> #[*] cube searches for pairs of utterances that have a synonym of “convergence” in the first utterance and “cube” in the second
• 1103 # 1107. overlap # cube [that would stil have to acount for the overlap that way] # [an idea: Each cube is assigned to 3 edges. Then add the edges on the diagonalish face.]
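The behaviour of the example pattern can be approximated with a short sketch. It is not the PolyCAFe pattern engine and does not parse the pattern syntax: the function `find_pairs`, its `window` parameter and the hand-written synonym set (used here in place of a WordNet lookup) are assumptions. Only the two utterances quoted on the slide are included, since the ones between 1103 and 1107 are not shown.

```python
import re


def words(text):
    """Return the set of lowercase word tokens in an utterance."""
    return set(re.findall(r"[a-z]+", text.lower()))


def find_pairs(utterances, first_terms, second_term, window=None):
    """Find (id1, id2) pairs of utterances matching a two-part pattern.

    The earlier utterance must contain any word from `first_terms` and a
    later one must contain `second_term`. If `window` is given, the two
    utterances may be at most `window` positions apart in the list.
    """
    tokenized = [(uid, words(text)) for uid, text in utterances]
    matches = []
    for i, (uid1, words1) in enumerate(tokenized):
        if not (words1 & first_terms):
            continue
        for j in range(i + 1, len(tokenized)):
            if window is not None and j - i > window:
                break
            uid2, words2 = tokenized[j]
            if second_term in words2:
                matches.append((uid1, uid2))
    return matches


# Hand-written stand-in for the synonyms a WordNet lookup might return.
SYNONYMS_OF_CONVERGENCE = {"convergence", "overlap", "intersection"}

chat = [
    (1103, "that would stil have to acount for the overlap that way"),
    (1107, "an idea: Each cube is assigned to 3 edges. "
           "Then add the edges on the diagonalish face."),
]

pairs = find_pairs(chat, SYNONYMS_OF_CONVERGENCE, "cube")
print(pairs)  # [(1103, 1107)]
```

Treating the utterance, not the sentence, as the search unit is what lets a single pattern link contributions from different participants.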
Classical NLP analysis (content and discourse analysis)
• Identification of concepts (using NLP pipe, pattern language, cue-phrases and graph algorithms)
• Utterance features:
  • Speech acts
  • Argumentation types in utterances (as in Toulmin’s theory: Warrant, Concession, Rebuttal and Qualifiers)
• Implicit links:
  • Repetitions
  • Adjacency pairs
  • Co-references (with the BART system, http://bart-coref.org/)
Social network analysis
• Consider explicit and implicit referencing as arcs between participants, which are the nodes
• A kind of PageRank algorithm: an utterance is important if it is referred to by important utterances; the strength of a voice (of an utterance) depends on the strength of the utterances that refer to it
• Determines whether a person is central or peripheral
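The ranking idea can be sketched with a plain PageRank-style iteration over the utterance reference graph. This is a generic textbook formulation, not necessarily the variant used in PolyCAFe: the damping factor, the iteration count, the per-speaker summation and the example links and speakers are all assumptions made for illustration.

```python
def rank_utterances(links, n, damping=0.85, iterations=50):
    """PageRank-style scores for n utterances.

    `links` holds (source, target) pairs meaning that utterance `source`
    refers to utterance `target`, so `target` gains strength from `source`.
    """
    out_links = {i: [] for i in range(n)}
    for source, target in links:
        out_links[source].append(target)

    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = [(1.0 - damping) / n] * n
        for source in range(n):
            targets = out_links[source]
            if targets:
                share = damping * scores[source] / len(targets)
                for target in targets:
                    new_scores[target] += share
            else:
                # An utterance that refers to nothing spreads its score evenly.
                share = damping * scores[source] / n
                for target in range(n):
                    new_scores[target] += share
        scores = new_scores
    return scores


def participant_strength(scores, speakers):
    """Sum utterance scores per speaker as a rough centrality indicator."""
    totals = {}
    for score, speaker in zip(scores, speakers):
        totals[speaker] = totals.get(speaker, 0.0) + score
    return totals


# Invented example: utterances 1 and 2 refer to 0, and utterance 3 refers to 2.
links = [(1, 0), (2, 0), (3, 2)]
speakers = ["ana", "dan", "ion", "dan"]

scores = rank_utterances(links, n=4)
strength = participant_strength(scores, speakers)
print(scores)
print(strength)
```

Utterance 0 ends up with the highest score because it is referred to by two utterances, one of which is itself referred to. Summing scores per speaker is one simple way to separate central from peripheral participants.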
Polyphony, Inter-animation and Collaboration analysis
• Assign an importance value to each utterance considering several indicators of inter-animation (collaboration)
  • Detection of voices (chains) inter-animation patterns (Trausan-Matu) in the chat
  • Consider several criteria such as the presence in the chat of questions, agreement, disagreement
  • Presence of others’ voices
  • Social Networks metrics
• Machine learning approach (genetic algorithms and neural networks) for tuning the
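One simple way to combine such indicators into a single importance value is a weighted sum, sketched below. The slides do not give the actual scoring formula, so the linear form, the `Indicators` fields and the weights in `HYPOTHETICAL_WEIGHTS` are all assumptions; in the system described here such parameters are tuned by machine learning, not set by hand.

```python
from dataclasses import dataclass


@dataclass
class Indicators:
    """Per-utterance indicators of inter-animation (hypothetical fields)."""
    is_question: bool
    shows_agreement: bool
    shows_disagreement: bool
    echoed_voices: int     # how many other participants' voices are picked up
    network_score: float   # e.g. a rank derived from the reference graph


# Hand-picked weights, for illustration only.
HYPOTHETICAL_WEIGHTS = {
    "question": 1.0,
    "agreement": 0.5,
    "disagreement": 0.5,
    "echoed_voices": 0.75,
    "network": 2.0,
}


def importance(ind, weights=HYPOTHETICAL_WEIGHTS):
    """Weighted sum of the indicators; booleans count as 0 or 1."""
    return (
        weights["question"] * ind.is_question
        + weights["agreement"] * ind.shows_agreement
        + weights["disagreement"] * ind.shows_disagreement
        + weights["echoed_voices"] * ind.echoed_voices
        + weights["network"] * ind.network_score
    )


example = Indicators(
    is_question=True,
    shows_agreement=False,
    shows_disagreement=True,
    echoed_voices=2,
    network_score=0.5,
)
print(importance(example))  # 4.0
```

A linear form keeps each indicator's contribution easy to inspect, which matters when the score is used to give feedback to students and tutors.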
Thank You!
Questions?