39
Technologies and Challenges Multilingual Sentiment Analysis

Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Technologies and Challenges Multilingual Sentiment Analysis

Page 2: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Lei Zhang (张镭) Computer  Scien.st  in  Text  Analy.cs  team  at  Adobe,  Ph.D.  in  Computer  Science    

Machine  Learning    80%  

Natural  Language  Processing    100%  

 Data  Mining  90%  

Page 3: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Sentiment Analysis

Computa.onal  study  of  people’s  opinions,  appraisals,  a?tudes  from  subjec.ve  informa.on  (i.e.,  text,  audio  and  video  etc.  )        Basic  task  of  sen.ment  analysis  is  to  classify  the  polari.es  of  a  given  informa.on  (posi:ve,  nega:ve,  or  neutral)  

“The  digital  media  cloud  so0ware  is  hard  to  install  .”    

“The  new  Adobe  digital  media  cloud  is  great  !”      

Page 4: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Traditional Approaches for Sentiment Analysis

•  Individuals                  Get  opinions  from  family,  friends  or  colleagues.    •  Organiza.ons              Get  opinions  from  polls  and  surveys.                                  

Page 5: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Current Technologies

With  the  booming  of  Web,  especially  Social  Media,  a  huge  amount  of  subjec.ve  data  generated  by  people  in  forums,  blogs,  TwiPer,  etc.                                      Now,  it  is  impossible  for  human-­‐being  to  analyze  them  all.      We  want  computer  to  automa.cally  mine  sen.ments  from  the  big  data.  It  is  a  challenging  task  but  very  useful.  It  would  provide  insight  for  people  and  organiza.ons  to  make  decisions.                              

Page 6: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Social Media Analytics

         Listen      Understand      

Retrieve,  store  and  analyze  people’s  conversa.ons    

Explore  topics  driving  people’  conversa.ons  and  understand  internal  dynamics  

Evaluate  impact  of  marke.ng  campaigns  in  social  media  channels  

Iden.fy  issues,  themes  and  paPerns.  Deliver  appropriate  content  to  target  audience  

01 03

02 04  Act        Measure  

Page 7: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Sen:ments   Reasons   Ac:ons  

Sentiment Analysis in Social Media Analytics

For  example,  “The  new  digital  media  cloud  so0ware  is  hard  to  install”    

Page 8: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

An Example Review

“  I  bought  an  iPhone  a  few  days  ago.  It  was  such  a  nice  phone.  The  touch  screen  was  really  cool.  The  voice  quality  was  clear  too.  Although  the  baAery  life  was  not  long,  that  is  ok  for  me.  However,  my  mother  was  mad  with  me  as  I  did  not  tell  her  before  I  bought  the  phone.  She  also  thought  the  phone  was  too  expensive,  and  wanted  me  to  return  it  to  the  shop.  …”    

What  we  see  here?  

           Sen.ments,  targets  of  sen.ments  and  opinion  holders    

Page 9: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

What is a Sentiment

•  A  sen.ment  is  a  quintuple        (ej,  fjk,  soijkl,  hi,  tl),   where    –  ej  is  a  target  en.ty.  –  fjk  is  a  feature(aspect)  of  the  en.ty  ej.  –  soijkl   is   the   sen.ment   value   of   the   sen.ment   of   the   opinion   holder   hi   on  

feature(aspect)   fjk   of   en.ty   ej   at   .me   tl.   soijkl   is   +ve,   -­‐ve,   or   neu,   or   a  more  granular  ra.ng.    

–  hi  is  an  opinion  holder.    –  tl  is  the  .me  when  the  sen.ment  is  expressed.  

Page 10: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Sentiment Analysis Objective

•  Objec.ve:  given  an  opinionated  document,    –  Discover  all  quintuples  (ej,  fjk,  soijkl,  hi,  tl),    

•  i.e.,  mine  the  five  corresponding  pieces  of  informa.on  in  each  quintuple,  and  

–  Or,  solve  some  simpler  problems.  

•  With  the  quintuples,    –  Unstructured  text  →  Structured  data  

•  Tradi.onal  data   and   visualiza.on   tools   can  be  used   to   slice,   dice  and  visualize  the  results  in  all  kinds  of  ways  

•  Enable  qualita.ve  and  quan.ta.ve  analysis.        

Page 11: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Sentiment Classification: Document Level

•  Classify  a  document  (e.g.,  a  review)  based  on  the  overall  sen.ment  expressed  by  opinion  holder    –  Classes:  posi.ve,  or  nega.ve  (and  neutral)  

•  It  assumes  

 Each  document  focuses  on  a  single  en.ty  and  contains  opinions  from  a  single  opinion  holder  

 

Page 12: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Subjectivity Analysis : Sentence Level

•  Sentence-­‐level  sen.ment  analysis  has  two  tasks:  –  Subjec.vity  classifica.on:  Subjec.ve  or  objec.ve.  

•  Objec.ve:  e.g.,    “I  bought  an  iPhone  a  few  days  ago.”  •  Subjec.ve:  e.g.,    “It  is  such  a  nice  phone.”    

–  Sen.ment  classifica.on:  For  subjec.ve  sentences  or  clauses,  classify  posi.ve  or  nega.ve.    •  Posi.ve:    e.g.,    “It  is  such  a  nice  phone.”    •  Nega.ve:  e.g.,    “The  screen  is  bad.”    

 

Page 13: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Aspect-based Sentiment Analysis

•  Sen.ment  classifica.on  at  both  document  and  sentence  (or  clause)  levels  are  NOT  sufficient,      –  They  do  not  tell  what  people  like  and/or  dislike    –  A  posi.ve   sen.ment  on  an  en.ty  does  not  mean   that   the  opinion  

holder  likes  everything.  –  An  nega.ve  sen.ment  on  an  en.ty  does  not  mean  that  the  opinion  

holder  dislikes  everything.      

Page 14: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Aspect-based Sentiment Summary “I  bought  an  iPhone  a  few  days  ago.  It  was  such  a  nice  phone.  The  touch  screen  was  really  cool.  The  voice  quality  was  clear  too.  Although  the  baAery  life  was  not  long,  that  is  ok  for  me.  However,  my  mother  was  mad  with  me  as  I  did  not  tell  her  before  I  bought  the  phone.  She  also  thought  the  phone  was  too  expensive,  and  wanted  me  to  return  it  to  the  shop.  …”    

….

Aspect  based  summary:  Aspect1:  Touch  screen  Posi.ve:    212  ¢  The  touch  screen  was  really  cool.    ¢  The  touch  screen  was  so  easy  to  use  and  

can  do  amazing  things.    …  Nega.ve:  6  ¢  The  screen  is  easily  scratched.  ¢  I  have  a  lot  of  difficulty  in  removing  finger  

marks  from  the  touch  screen.    …    Aspect2:  baHery  life  …  

Page 15: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Aspect-based Sentiment Summary User Case

Page 16: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Sentiment Analysis is a Multifaceted Problem

•  (ej,  fjk,  soijkl,  hi,  tl),  

–  ej  -­‐  an  en.ty:    Named  en.ty  extrac.on  (more)  –  fjk  –  an  feature(aspect)  of  ej:    Informa.on  extrac.on  –  soijkl  is  sen.ment:    Sen.ment  determina.on    –  hi  is  an  opinion  holder:    Informa.on/Data  Extrac.on  –  tl  is  the  .me:    Data  Extrac.on  

•  Rela.on  extrac.on  •  Synonym  match  (voice  =  sound  quality)  …    

Page 17: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Methods for Sentiment Analysis

Machine  Learning  Method   Lexicon-­‐based  Method  

Page 18: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Machine Learning Method for Tweets

Picture  from  Hassan  Saif  

Train  a  sen.ment  classifier  to  determine  posi.ve,  nega.ve  and  neutral  sen.ments  of  text    

Page 19: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Machine Learning Method

Pros:  1.  Tradi.onal  and  dominant  sen.ment  analysis  method  for  long  

documents  (e.g.  reviews,  blogs)    

Cons:  1.  Domain-­‐transfer  problem.  A  sen.ment  classifier  may  perform  well  in  

one  domain  but  ofen  performs  bad  in  another  domain.        

2.  Manually  labeling  a  large  set  of  texts  is  labor-­‐intensive  and  .me-­‐consuming.  

Page 20: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Lexicon-based Method

I  made  a  big  mistake  last  night  :(    

Nega.ve  

Opinion  Lexicon  

Natural  Language  Processing  (NLP)  

Algorithm  

great  sad  

:(  

wrong  mistake  

bad  

love  good  

Page 21: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Lexicon-based Method for Tweets

Text  input   Tokeniza.on  

Part  of  Speech  Tagging  

Tweet  Construct  Detec.on  (e.g.,  Emo.con,  Hashtag)    

Named  En.ty  Recogni.on  

Sentence/Clause  Chunking   Conjunc.on  Analysis  

Coreference  Analysis  

Logical  Analysis  

Syntac.c  Parsing  (e.g.,  Modifier,  SVO  )    

Rela.on  Extrac.on  

   

Sen.ment  Assignment  (En.ty  Level  

or  Tweet  Level)  

   

Sen.ment  Aggrega.on  

Opinion  Lexicon  

Page 22: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Lexicon-based Method

Pros:  1.  Only  need  opinion  lexicon;  do  not  need  to  label  training  examples.  2.  Generally  domain-­‐independent.    

Cons:  1.  Some  linguis:c  knowledge  is  required.  

Page 23: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Adobe Social

“Sammy”  is  the    sen.ment  analysis  engine  for  Adobe  Social.          

Page 24: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Sammy Modules

Web  applica:on  (.WAR)

Java  library  (.JAR)

Lexicon-­‐based  classifier

Machine  Learning-­‐based  classifier

Common  NLP  modules

Web  module  

Runs  on  Tomcat  server

Pure  Java  library

Page 25: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Multilingual Sentiment Analysis

As  the  use  of  Social  Media  has  spread  globally,  there  is  an  increasing  importance  to  analyze  mul.lingual  social  media  data.    Sammy  need  to  consider  the  following  (challenges)  for  mul.lingual  analysis:              -­‐  Language  encoding              -­‐  Language-­‐specific  natural  language  processing  (NLP)  tools              -­‐  Language-­‐specific  dic.onaries  and  resources              -­‐  Cross-­‐language  sen.ment  analysis  approaches    

Page 26: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Language Encoding

There  are  many  different  encodings  used  worldwide,  some  of  them  designed  for  a  par.cular  language,  others  covering  the  en.re  range  of  characters  defined  by  Unicode.    We  uses  the  facili.es  provided  by  Java  and  so  it  has  access  to  over  100  different  encodings  including  the  most  popular  locales  ones,  such  as  ISO  8859-­‐1  in  Western  countries  or  ISO-­‐8859-­‐9  in  Eastern  Europe.          

Page 27: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Language Identification

Language  iden.fica.on  is  to  determine  natural  language  by  inspec.on  of  given  data.    It  is  an  important  preprocess  step  for  sen.ment  analysis.  There  are  three  main  approaches  as  follows.    (1)  Common  words  approach  (2)  Sta.s.cal  approach    (3)  N-­‐gram  approach          

Page 28: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Common Words Approach

The  basic  idea:    (1)  Sample  text  in  different  languages    (2)  Store  highly  frequent  words  for  each  language  in  a  database.      (3)  The  text  to  be  classified  is  compared  to  all  the  word  lists  in  database.  (4)  Via  a  scoring  system,  the  word  list  with  most  occurrences  indicates  the  language  of  the  text.      

Page 29: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Statistical Approach

The  basic  idea    (machine  learning):    (1)  Sample  texts  in  different  languages  (2)  Segment  strings,  compute  probabili.es  of  the  occurrence  of  all  string  sequences,  and  get  a  language  model    (a  probability  distribu.on  over  sequences  of  words)  for  each  language.    (3)  For  the  text  to  be  iden.fied,  compute  (  Markov  model)  the  probability              p  (text  |  language  model)  for  all  modes.  (4)  Pick  the  highest  probability  of  the  model  that  produced  the  text.  

 

Page 30: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

N-gram Approach

N-­‐gram  is  a  con.guous  sequence  of  sequence  of  n  items  from  a  give  sequence  of  text.    The  n-­‐gram  of  size  1  is  referred  as  a  “unigram”;  size  2  is  a  “bigram”;  size  3  is  a  “trigram”  …    e.g.,  the  word  “garden”,      bi-­‐grams:      ga,  ar,  rd,  de,  en    tri-­‐grams:    gar,  ard,  rde,  den    

Page 31: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

N-gram Approach (continue)

The  basic  idea:      

(1)  Sample  texts  in  different  languages  (2)  Generate  N-­‐gram  profile  for  each  language  (3)  For  the  text  to  be  iden.fied,  calculates  the  N-­‐gram  profile  and  compares  it  to  the  language  specific  N-­‐gram  profiles.    (4)  The  language  profile  which  has  the  smallest  distance  to  our  text  N-­‐gram  profile  indicates  the  language.      

Page 32: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Language Identification Challenges

For  social  media  data,  we  have  some  new  challenges.      (1) Handing  very  short  texts                  (2)  Handling  texts  of  unknown  language  and  texts  comprised  of  mul.ple  languages.            e.g.,  “  @LEGOJurassic  \n@AudiJapan  \nアウディA5\nアウディ―A5\n#アウディA5\n  #アウディ―A5\nA1  A3  A3  A4  Q1  TT  “  

Page 33: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Language-specific Part of Speech Part  of  Speech  (POS)  is  a  category  of  words  which  have  similar  gramma.cal  proper.es.  Commonly  listed  English  parts  of  speech  are  noun,  verb,  adjec.ve,  adverb,  pronoun  etc.        Input:  “My  dog  also  likes  ea^ng  Sausage”  Output:  “My/PRP$  dog/NN  also/RB  likes/VBZ  ea^ng/VBG  Sausage/NN”    Other  languages  have  their  own  POS.    e.g.,  Japanese  has  language  specific  POS  such  as  “助詞”  (auxiliary  word).    We  need  to  incorporate  language-­‐specific  POS  informa.on  for  analysis.          

Page 34: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Language-specific Opinion Lexicon Opinion  Lexicon  is  a  list  of  opinion  (bearing)  word  (e.g.,  “good”  “bad”).  It  plays  a  cri.cal  role  in  sen.ment  analysis.    We  have  several  well-­‐regarded  sen.ment  lexicons  in  English.  The  same  is  not  true  for  most  of  the  world’s  languages.    Two  main  approaches  to  get  opinion  lexicons  from  other  languages  (1)    Machine  transla.on    (2)    Graph  propaga.on  (given  seed  words,  try  to  expand  words  by  external  knowledge  bases,  such  as  Wik.onary  and  WordNet).              

Page 35: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Multilingual Sentiment Analysis Approaches

For  sen.ment  analysis  at  document  level  and  sentence  level,  its  basic  idea  is  as  follows.    Focused  on  using  extensive  resources  and  tools  available  in  English  and  automated  transla.ons  to  help  build  sen.ment  analysis  systems  in  other  languages  which  have  few  resources  or  tools      

Page 36: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Multilingual Sentiment Analysis Approaches (Continue)

Current  research  proposed  two  main  strategies:      (1)  Translate  test  sentences  in  the  target  language  into  the  source  language  and  classify  them  using  a  source  language  classifier.    (2)  Translate  a  source  language  training  corpus  into  the  target  language  and  build  a  corpus-­‐based  classifier  in  the  target  language.    

Page 37: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Multilingual Sentiment Analysis Approaches (Continue)

For  sen.ment  analysis  at  aspect  level,  the  basic  idea  is  as  follows:    Apply  language-­‐specific  tools  (POS  tagger,  Parser)  to  extract  useful  informa.on  between  aspect  and  sen.ment,  and  then  apply  language  –agonis.c  aggrega.on  methods  to  determine  sen.ments.    

Page 38: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Reference

•  Bing  Liu,    Sen.ment  Analysis  and  Opinion  Mining,  Morgan  &  Claypool,  2012  •  Simon  Kranig,    Evalua.on  of  Language  Iden.fica.on  Methods,    Thesis  

Page 39: Technologies and Challengesfiles.meetup.com/3154762/Multilingual Sentiment Analysis.pdf · Sentiment Analysis Computaonal)study)of)people’s)opinions,)appraisals,)atudes)from) subjec.ve)informaon)(i.e.,)text,)audio)and)video)etc.))

Thank      You  Q  &  A