Empaceptor: The empathy interceptor
Developed by Elias Holman for MAS630 Final Project
5/11/04
Project Constraints
• Start from a small amount of built-in knowledge
• Make it work with tools that users already use, with tasks that they already perform
• Don’t require heavyweight processing or non-standard hardware (anyone can use it)
What is Empaceptor?
• A pure Java text-processing tool that attempts to build up an association between a word and a set of configurable emotions.
• A service which can be invoked either via a Java API or standard TCP/IP sockets.
What did you build?
[Architecture diagram. Components: EmpaceptorServer, Empaceptor XML Persistence, mbox file, POP Server, Empaceptor Email Service. Connection labels: shell script invocation, return value of service, POP over TCP/IP, standard sockets over TCP/IP, SMTP over TCP/IP, local disk access, TCP/IP to incoming mail server.]
Empaceptor architecture showing 1) Incoming mail path 2) Mbox file access 3) Outgoing mail path
The Server
• Processes incoming text
• Learns from text, building an association based on co-occurrence
• Scores text, based on learned associations
• Persists associations in a simple XML file stored in the user’s home directory
[Diagram: EmpaceptorServer, Empaceptor XML Persistence]
How learning works
• A list of emotion words is the starting point.
• Words occurring with emotion words are marked as occurring in an emotional context.
• Words occurring without emotion words are simply marked as occurring.
• The learned emotional score of a word is emotional occurrences / all occurrences, for each emotion.
• WordNet gives broader coverage, so scores are really for synsets, not words.
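The counting scheme above can be sketched in a few lines. This is an illustration in Python, not Empaceptor's Java code, and it makes two assumptions the slides do not state: the co-occurrence window is a single sentence, and plain lowercased words stand in for WordNet synsets. The class and method names (`Associations`, `learn`, `score_word`) are hypothetical.

```python
import re
from collections import Counter


class Associations:
    """Co-occurrence counts between words and a set of emotion words."""

    def __init__(self, emotions):
        self.emotions = set(emotions)
        self.occurrences = Counter()  # word -> total occurrences
        self.emotional = Counter()    # (word, emotion) -> emotional occurrences

    def learn(self, text):
        # Assumption: "occurring with" means "in the same sentence".
        for sentence in re.split(r"[.!?]+", text.lower()):
            words = re.findall(r"[a-z']+", sentence)
            present = self.emotions.intersection(words)
            for word in words:
                self.occurrences[word] += 1
                for emotion in present:
                    self.emotional[(word, emotion)] += 1

    def score_word(self, word, emotion):
        # Learned score = emotional occurrences / all occurrences.
        word = word.lower()
        total = self.occurrences[word]
        if total == 0:
            return 0.0  # assumption: unseen words score zero
        return self.emotional[(word, emotion)] / total
```

Under these assumptions, learning from "I am happy about my new car. I put it in my garage." with "happy" as the only emotion word gives "car" a happy score of 1/1 and "I" a score of 1/2, because "I" appears once in each sentence.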
The Base Emotion Set
• alarmed • alert • angry • annoyed • anxious • astonished • bold • bored • calm • cautious • compassionate • concerned • confident • curious • delighted • depressed • disappointed • disgusted • distressed • embarrassed • envious • excited • fatigued
• fearful • frustrated • generous • gloomy • grateful • guilty • blissful • happy • haughty • helpless • hopeful • humiliated • indifferent • inferior • interested • joyful • lonely • miserable • mirthful • nervous • panicked • passive • patient
• pleased • proud • regretful • relaxed • relieved • restrained • sad • satisfied • scornful • serene • shameful • sorrowful • surprised • suspicious • sympathetic • tender • tense • tranquil • vigilant • weak • worried • yearning
Scoring a piece of text
• Each individual word is scored as in learning (although occurrences are not counted as in learning).
• Score of text is the sum of the scores of its words divided by the number of words.
Example text: I am happy about my new car. I put it in my garage.

Word     Happy weight
I        1/2 (0.5)
am       1/1 (1.0)
happy    1/1 (1.0)
about    1/1 (1.0)
my       1/2 (0.5)
new      1/1 (1.0)
car      1/1 (1.0)
put      0/1 (0.0)
it       0/1 (0.0)
in       0/1 (0.0)
garage   0/1 (0.0)
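As a worked illustration of the scoring rule, the sketch below averages the happy weights from the table over the words of a text. It is written in Python rather than Empaceptor's Java, the names `HAPPY_WEIGHTS` and `score_text` are hypothetical, and treating unknown words as weight 0 is an assumption.

```python
import re

# Happy weights copied from the example table.
HAPPY_WEIGHTS = {
    "i": 0.5, "am": 1.0, "happy": 1.0, "about": 1.0, "my": 0.5, "new": 1.0,
    "car": 1.0, "put": 0.0, "it": 0.0, "in": 0.0, "garage": 0.0,
}


def score_text(text, weights):
    """Sum of word scores divided by the number of words."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    # Assumption: a word with no learned weight contributes 0.
    return sum(weights.get(word, 0.0) for word in words) / len(words)
```

Applying the formula to the full example text (13 words, weights summing to 7) gives 7/13, about 0.54. The second sentence alone gives 1/6, about 0.17, since only "I" and "my" carry any weight.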
Mbox import
• Uses GNU open-source tools to read in mbox file.
• Strips off header and sends content to Empaceptor.
• Messages are used for learning, but are not scored themselves.
• Easy way to load a large corpus of training data (just point it at your Inbox)
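The import step can be approximated as follows. Note the swap: Empaceptor uses GNU open-source Java tools to read the mbox file, while this sketch uses Python's standard `mailbox` module to show the same idea of dropping headers and keeping only the body text. The function name `message_bodies` is hypothetical.

```python
import mailbox


def message_bodies(mbox_path):
    """Yield the text body of each message in an mbox file, without headers."""
    box = mailbox.mbox(mbox_path)
    try:
        for message in box:
            if message.is_multipart():
                # Keep only the plain-text parts of multipart messages.
                parts = [p for p in message.walk()
                         if p.get_content_type() == "text/plain"]
            else:
                parts = [message]
            for part in parts:
                payload = part.get_payload(decode=True)  # bytes, or None
                if not payload:
                    continue
                charset = part.get_content_charset() or "utf-8"
                try:
                    yield payload.decode(charset, errors="replace")
                except LookupError:  # unknown charset name in the header
                    yield payload.decode("utf-8", errors="replace")
    finally:
        box.close()
```

Each yielded body would then be passed to the learner only, since imported messages are used for training and are not scored.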
[Diagram: EmpaceptorServer, mbox file; connection: local disk access]
Incoming Mail Path
• Email program must be set up to invoke an arbitrary shell process as a filter (supported by Evolution and others).
• Email program pipes the message and the desired scoring emotion to the Java Empaceptor client process.
• Java Empaceptor client sends them over TCP/IP to the Empaceptor server for scoring only, not for learning.
• Empaceptor server scores the message and compares the score with a configurable threshold. If above the threshold, it returns “yes”, otherwise “no”.
• Empaceptor client process exits with the corresponding integer value: 1 for “yes”, 0 for “no”, 15 for error (couldn’t connect to server, etc.).
• Email program can use this information to take arbitrary action (e.g. set the color of the message).
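The client side of this path might look roughly like the sketch below, written in Python rather than the Java of the real client. Only the exit codes (1, 0, 15) come from the slides. The wire format is an assumption made for illustration: the emotion on the first line, the message after it, and a single "yes" or "no" line in reply. The function name `exit_code_for` and its parameters are hypothetical, and the scoring server's port is not given in the slides, so it is left as a required argument.

```python
import socket

EXIT_NO, EXIT_YES, EXIT_ERROR = 0, 1, 15  # exit codes from the slides


def exit_code_for(emotion, message, host, port, timeout=5.0):
    """Ask a scoring server about one message and map its reply to an exit code."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            # Hypothetical request format: emotion line, then the message.
            conn.sendall(f"{emotion}\n{message}".encode("utf-8"))
            conn.shutdown(socket.SHUT_WR)  # tell the server the request is complete
            with conn.makefile("r", encoding="utf-8") as reader:
                reply = reader.readline().strip()
    except (OSError, UnicodeError):
        return EXIT_ERROR  # could not connect, timed out, or got a garbled reply
    if reply == "yes":
        return EXIT_YES
    if reply == "no":
        return EXIT_NO
    return EXIT_ERROR
```

A filter script would read the message from standard input, call this function, and pass the result to `sys.exit`, so the mail client sees 1, 0, or 15 as the process status.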
[Diagram: incoming mail path. Components: EmpaceptorServer, POP Server, Empaceptor Email Service. Connection labels: shell script invocation, return value of service, POP over TCP/IP, standard sockets over TCP/IP.]
SMTP Server
• Modified version of jes (Java Email Server) SMTP server.
• Sends outgoing messages to Empaceptor for scoring and learning before delivery.
• Runs on a non-standard port (12223), so it can run alongside standard SMTP or sendmail.
[Diagram: outgoing mail path. Components: EmpaceptorServer, SMTP server. Connection labels: SMTP over TCP/IP, TCP/IP to incoming mail server.]
The Empaceptor GUI
• Not much to look at, just a set of tasks to perform
• Open and learn from mbox file
• Start/Stop server to handle email client requests
• Start/Stop SMTP server
• Alter set of emotions to process
• Set threshold for email client requests to return “yes” value.
• Test box for entering messages to see their resulting scores as a table.
Let’s see a demo
Usage test
• Ran email through Empaceptor for about five days (tweaking along the way)
• Set up email client (Evolution) to look for happy emails, and tag as purple if found.
• Configured to use SMTP for outgoing mail
• Primed system with Sent mail folder’s mbox file (about 440 messages)
Usage test - Issues
• Email service was too slow. Made scoring faster, and made learning asynchronous for quick response time.
• Fiddled with the threshold: 0.1 seems best.
• Bugs, as always
Usage test - Lessons Learned
• System is easily fooled by idiomatic usage of emotional base words:
– Happy hour
– I’m happy to take care of that
• 440 messages is not a big enough training set to start.
• Performed reasonably well in my opinion, but lots of false positives, and also: what is a good performance metric?
Related Work
• Liu et al.’s affective email processing work
– Uses common-sense reasoning
– Specialized email client to do processing
• Eudora mail Mood Watch
– Just keyword spotting, only for strong language and offensive content
• This approach is somewhere in-between
Future Work
• The ability to look at the last 10, 100, or 1,000 messages to see trends.
• More sophisticated text processing.
• Take advantage of recency.
• Get it out into the world!
Thanks
The open-source/free-software community
Professor Picard for feedback and support