Taxonomy and Social Media
Social Taxonomies
Tom Reamy
Chief Knowledge Architect
KAPS Group
Program Chair – Text Analytics World
Knowledge Architecture Professional Services
http://www.kapsgroup.com
Agenda
Introduction
It’s a Different World – Content and Intent
New Approaches – To Taxonomy – Text Analytics
New Applications – and Opportunities
Conclusion
Introduction: KAPS Group
Knowledge Architecture Professional Services – Network of Consultants
Applied Theory – Faceted & emotion taxonomies, natural categories
Services:
– Strategy – IM & KM – Text Analytics, Social Media, Integration
– Taxonomy/Text Analytics, Social Media development, consulting
– Text Analytics Quick Start – Audit, Evaluation, Pilot
Partners – Smart Logic, Expert Systems, SAS, SAP, IBM, FAST, Concept Searching, Attensity, Clarabridge, Lexalytics
Clients: Genentech, Novartis, Northwestern Mutual Life, Financial Times, Hyatt, Home Depot, Harvard Business Library, British Parliament, Battelle, Amdocs, FDA, GAO, World Bank, Dept. of Transportation, etc.
Program Chair – Text Analytics World – March 29-April 1 – SF
Presentations, Articles, White Papers – www.kapsgroup.com
Current – Book – Text Analytics: How to Conquer Information Overload, Get Real Value from Social Media, and Add Smart Text to Big Data
New Content Characteristics
It’s a Very Different World
Scale – orders of magnitude – 100’s of millions, billions
Speed – 20-100 million a day
Size – Twitter, blogs, forums, email
– 140 characters to a few sentences
Quality – misspellings, lack of structure, incoherence
Conversations – not stand-alone docs
– Can’t tell what a “document” is about without reference to previous threads
Purpose – communicate – social grooming, rant
– Not exchange of ideas, policies, etc.
Simple Content Complexity – single thoughts, simplicity of emotion
New Content Characteristics
It’s a Very Different World – Search and Taxonomy
i tried very slow, NO GOOGLE search, some apps not working.. This is not a "with GOOGLE" My friend has incredible, that is much batter.. Anyways i returned samsung, replace incredible. What's great about it: 4" LCD What's not so great: NOT A GOOGLE PHONE
(nt 2.0)willie John ci to/for: wanted to know about charges for pic mail for ;bill date 4/5/2010 | repeat: no | auth: pin | ptns affected: 7777777777 | information/instructions given: sup gave pic mail for free and gave adj for $ 2.40 new bal is $ 147.53 | any mobile, anytime: n | ir: yes | ir-email: n |
New Content Characteristics
It’s a Very Different World – Topical Current Content
Content not archived (for users)
No real need for search (or just very simple search)
Very poor (if any) metadata – not faceted search
Focus on phrases, sentences – not documents
Little need of a subject taxonomy
About emotions, things, products, people
Who are the users? They don’t need our help
Taxonomies, we don’t need no stinking taxonomies!
It’s a Very Different World
So why are we talking about it at a taxonomy boot camp?
Taxonomy = structure (purists can leave now)
All of this content is a rich source of research material
Companies are mining this resource and they need to add structure to get deeper understanding
Varieties of structure:
– Simple topical taxonomies, 2-3 levels
– Emotion taxonomies, Ontologies and Semantic Networks
– Dynamic taxonomies – built on public taxonomies, enterprise taxonomy – exposed in hierarchical triples
Need more automatic / semi-automatic solutions
– Advanced text analytics
New Kinds of Social Taxonomies
New Taxonomies – Appraisal
– Appraisal Groups – Adjective and modifiers – “not very good”
– Four types – Attitude, Orientation, Graduation, Polarity
– Supports more subtle distinctions than positive or negative
Emotion taxonomies
– Joy, Sadness, Fear, Anger, Surprise, Disgust
– New Complex – pride, shame, embarrassment, love, awe
– New situational/transient – confusion, concentration, skepticism
Beyond Keywords – Need Text Analytics
– Analysis of phrases, multiple contexts – conditionals, oblique
– Analysis of conversations – dynamic of exchange, private language
– Enterprise taxonomy rolled into a categorization taxonomy
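The appraisal-group idea above can be sketched in a few lines of Python. This is a minimal illustration only, assuming the last word of a phrase is the head adjective and every earlier word is a modifier. The lexicon entries, the modifier lists, the field names, and the `parse_group` helper are all hypothetical and do not come from the slides or from any particular tool.

```python
from dataclasses import dataclass

# Hypothetical mini-lexicon: head adjective -> (attitude type, orientation).
HEADS = {
    "good": ("appreciation", "positive"),
    "happy": ("affect", "positive"),
    "slow": ("appreciation", "negative"),
}
# Hypothetical modifier lists, for illustration only.
INTENSIFIERS = {"very": "high", "extremely": "high", "slightly": "low"}
NEGATORS = {"not", "never"}


@dataclass
class AppraisalGroup:
    head: str
    attitude: str
    orientation: str  # orientation of the head word on its own
    graduation: str   # "low", "neutral" or "high"
    polarity: str     # "marked" if a negator is present, else "unmarked"

    def effective_orientation(self) -> str:
        """Flip the head's orientation when the group is negated."""
        if self.polarity == "unmarked":
            return self.orientation
        return "negative" if self.orientation == "positive" else "positive"


def parse_group(phrase: str) -> AppraisalGroup:
    """Treat the last word as the head adjective and the rest as modifiers."""
    words = phrase.lower().split()
    if not words or words[-1] not in HEADS:
        raise ValueError(f"no known head adjective in {phrase!r}")
    head = words[-1]
    attitude, orientation = HEADS[head]
    graduation, polarity = "neutral", "unmarked"
    for word in words[:-1]:
        if word in INTENSIFIERS:
            graduation = INTENSIFIERS[word]
        elif word in NEGATORS:
            polarity = "marked"
    return AppraisalGroup(head, attitude, orientation, graduation, polarity)


group = parse_group("not very good")
print(group)
print(group.effective_orientation())
```

Keeping the four attributes separate, rather than collapsing them into one score, is what lets “not very good” be told apart from “very good” and from “not good”.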
Case Study – Categorization & Sentiment
Taxonomy and Social Media: Applications
New Range of Applications
Real Sentiment Analysis – Limited value of Positive and Negative
– Degrees of intensity, complexity of emotions and documents
– Contextual rules – “I would have loved X except for the battery”
Expertise Analysis
– Experts think & write differently – process, chunks
– Categorization rules for documents, authors, communities
Behavior Prediction – TA and Predictive Analytics, Social Analytics
Crowd Sourcing – technical support to Wikis
Political – conservative and liberal minds/texts
– Disgust, shame, cooperation, openness
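The contextual-rule example above (“I would have loved X except for the battery”) shows why keyword polarity alone falls short. The sketch below contrasts a naive keyword check with one contextual rule. It is only an illustration under stated assumptions: the tiny lexicon, the regular expression, and both function names are hypothetical, and a production rule set would need far broader coverage.

```python
import re

# Hypothetical tiny lexicon of positive words, for illustration only.
POSITIVE = {"loved", "love", "liked", "great"}

# Counterfactual frame: "would have <word> ... except for / if not for / but for <aspect>".
COUNTERFACTUAL = re.compile(
    r"\bwould have (?P<verb>\w+)\b.*?"
    r"\b(?:except for|if not for|but for)\s+(?:the\s+)?(?P<aspect>\w+)",
    re.IGNORECASE,
)


def keyword_sentiment(text: str) -> str:
    """Naive baseline: any positive word makes the text positive."""
    words = re.findall(r"[a-z']+", text.lower())
    return "positive" if any(w in POSITIVE for w in words) else "neutral"


def contextual_sentiment(text: str) -> dict:
    """Apply the counterfactual rule first, then fall back to keywords."""
    match = COUNTERFACTUAL.search(text)
    if match and match.group("verb").lower() in POSITIVE:
        # A positive verb inside a counterfactual frame signals a complaint
        # about the aspect named after "except for".
        return {"sentiment": "negative", "aspect": match.group("aspect").lower()}
    return {"sentiment": keyword_sentiment(text), "aspect": None}


sentence = "I would have loved X except for the battery"
print(keyword_sentiment(sentence))
print(contextual_sentiment(sentence))
```

The keyword baseline labels the sentence positive because of “loved”, while the contextual rule reads it as a negative remark about the battery.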
Taxonomy and Social Media: Applications
Pronoun Analysis: Fraud Detection; Enron Emails
Patterns of “Function” words reveal wide range of insights
Function words = pronouns, articles, prepositions, conjunctions, etc.
– Used at a high rate, short and hard to detect, very social, processed in the brain differently than content words
Areas: sex, age, power-status, personality – individuals and groups
Lying / Fraud detection: Documents with lies have
– Fewer and shorter words, fewer conjunctions, more positive emotion words
– More use of “if, any, those, he, she, they, you”, less “I”
– More social and causal words, more discrepancy words
Current research – 76% accuracy in some contexts
Text Analytics can improve accuracy and utilize new sources
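Several of the signals listed above are simple counts, so a feature extractor is easy to sketch. The code below only computes raw features (word count, mean word length, and rates per 100 words). It does not classify anything, and it sets no thresholds, because the slides give none. The marker-word list is taken from the slide. The conjunction list, the tokenizer, and the function name are assumptions made for illustration.

```python
import re

# Words the slide lists as more frequent in documents containing lies.
MARKER_WORDS = {"if", "any", "those", "he", "she", "they", "you"}
# Hypothetical short conjunction list, for illustration only.
CONJUNCTIONS = {"and", "but", "or", "so", "because", "although"}


def function_word_profile(text: str) -> dict:
    """Return simple function-word features for one document."""
    # Keep contractions such as "I'm" together as one token.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    total = len(tokens)
    if total == 0:
        raise ValueError("text contains no words")
    # Count "I" and its contractions ("I'm", "I've", ...) as first person.
    first_person = sum(1 for t in tokens if t.split("'")[0] == "i")
    markers = sum(1 for t in tokens if t in MARKER_WORDS)
    conjunctions = sum(1 for t in tokens if t in CONJUNCTIONS)
    return {
        "word_count": total,
        "mean_word_length": sum(len(t) for t in tokens) / total,
        "first_person_count": first_person,
        "marker_count": markers,
        "conjunction_count": conjunctions,
        # Rates per 100 words make documents of different length comparable.
        "first_person_rate": 100 * first_person / total,
        "marker_rate": 100 * markers / total,
        "conjunction_rate": 100 * conjunctions / total,
    }


profile = function_word_profile(
    "I think they said you could cancel if any of those fees apply"
)
print(profile)
```

Features like these would normally feed a trained statistical model rather than being read directly, since a single document's rates say little on their own.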
Taxonomy and Social Media: Applications
Behavior Prediction – Telecom Customer Service
Basic Rule
(START_20, (AND,
    (DIST_7, "[cancel]", "[cancel-what-cust]"),
    (NOT, (DIST_10, "[cancel]", (OR, "[one-line]", "[restore]", "[if]")))))
Examples:
– customer called to say he will cancell his account if the does not stop receiving a call from the ad agency.
– cci and is upset that he has the asl charge and wants it off or her is going to cancel his act
– ask about the contract expiration date as she wanted to cxl teh acct
Combine sophisticated rules with sentiment statistical training and Predictive Analytics and behavior monitoring
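The basic rule above can be approximated in plain Python to show how its parts interact. This is a sketch under stated assumptions, not the rule engine used in the case study. The operator semantics are guesses: `START_20` is read as “only look at the first 20 words”, and `DIST_n` as “the two terms occur within n words of each other”. The vocabularies behind each bracketed concept are hypothetical, as are all function names.

```python
import re

# Hypothetical vocabularies for the bracketed concepts in the rule.
CONCEPTS = {
    "[cancel]": {"cancel", "cancell", "cxl"},
    "[cancel-what-cust]": {"account", "acct", "act", "service"},
    "[one-line]": {"line"},
    "[restore]": {"restore", "restored"},
    "[if]": {"if"},
}


def positions(tokens: list, concept: str) -> list:
    """Indexes of the tokens that belong to the given concept."""
    vocab = CONCEPTS[concept]
    return [i for i, tok in enumerate(tokens) if tok in vocab]


def any_of(tokens: list, *concepts: str) -> list:
    """OR: positions matching any of the listed concepts."""
    return sorted({i for c in concepts for i in positions(tokens, c)})


def within(n: int, first: list, second: list) -> bool:
    """DIST_n (assumed): some pair of positions is at most n words apart."""
    return any(abs(i - j) <= n for i in first for j in second)


def cancel_intent(text: str) -> bool:
    # START_20 (assumed): restrict matching to the first 20 words.
    tokens = re.findall(r"[a-z]+", text.lower())[:20]
    cancel = positions(tokens, "[cancel]")
    target = positions(tokens, "[cancel-what-cust]")
    blockers = any_of(tokens, "[one-line]", "[restore]", "[if]")
    # AND of a positive proximity test and a negated one.
    return within(7, cancel, target) and not within(10, cancel, blockers)


print(cancel_intent(
    "ask about the contract expiration date as she wanted to cxl teh acct"
))
```

Under these assumed semantics the third example fires, while the first example is filtered out because a conditional “if” sits close to the cancel term. Whether the original rule engine behaves the same way is not confirmed by the slide.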
Taxonomy, Text Analytics, and Social Media
Conclusions
Social Media is a Different World
– Content, Scale, Questions
New Types of Taxonomy
– Smaller, more dynamic subject taxonomies
– Appraisal, Emotion, Things, Motivations, Actions, etc.
Taxonomists – Time to Explore new structures
– Ontologies, semantic networks, all of above
Text Analytics – needs good taxonomy design – levels, etc.
– Adds a platform – flexible and powerful auto-tagging
Result: New Types of Applications
– Stand alone and with standard search/taxonomy
– Merge data and text, external and internal
Questions?