DESCRIPTION
This essay investigates sentiment analysis: biases, spamming issues, methods and techniques, and its everyday application through the Datumbox API software and the UNIMI side project 'Voices from the Blogs'. Written with student Margherita Zucchini.
TWITTER SENTIMENT ANALYSIS
Margherita Zucchini - Giulia Sosio
Web Communication Course 2014/2015
Summary
• Introduction
• 1 Sentiment Analysis: contents
◦ What is Sentiment Analysis
◦ Different Level of Sentiment Analysis
◦ Opinion, Emotion and Spam Issue
• 2 Sentiment Analysis: methodologies
◦ How to evaluate a sentiment opinion
◦ How to create an efficient lexicon
◦ Summarization
◦ An example of Sentiment Analysis system: Datumbox API
• 3 Twitter
◦ An Italian case study: Voices from the Blogs and Ihappy index
▪ Ihappy index
• Conclusion
• Bibliography
Twitter Sentiment Analysis 1 Sosio Giulia e Zucchini Margherita
Introduction
Opinions and related concepts such as sentiments, evaluations, attitudes and
emotions are the subjects of study of Sentiment Analysis and Opinion Mining. The
rapid growth of the field coincides with that of social media on the Web,
such as reviews, forum discussions, blogs, microblogs, Twitter and other social
networks. In our paper, we focus on using Twitter, the most popular microblogging
platform, for the task of sentiment analysis.
With the growth of social media, individuals and organizations are increasingly
using the content of these tools for decision making. Someone who wants to buy a
consumer product is no longer limited to asking friends, because there are many
user reviews and discussions about the product in public forums. For a firm it may no
longer be necessary to conduct surveys, opinion polls and focus groups. However,
finding and monitoring opinion sites on the Web and distilling the information
contained in them remains a formidable task because of the proliferation of diverse
sources. Each site typically contains a huge volume of opinion text that is not always
easily deciphered in long blogs and forum posts.
Another important task in this kind of analysis is teaching a computer to
understand symbolic language and attitudes such as irony. How can a computer
understand whether a writer is joking? This is not a small problem. Suppose we are
investigating the Web for a firm that needs to understand the impact of a new product
on the market, or that is studying a new sector to decide whether to invest in it.
Simply counting how many times one or more linguistic markers appear may not be
enough, because each language has its own symbolic and sub-symbolic markers. Even
when groups share the same language, the meaning of the same word changes from
community to community, and it is not possible to write a program for every way of
speaking around the world. This task also matters for analyzing our society and the
changes within it, especially during large mass events: Twitter and Facebook, for
example, were important media during the Arab Spring.
In this paper we investigate the answers offered by information and communication
technology and their application in the real world.
1. Sentiment Analysis: contents
What is Sentiment Analysis
Sentiment analysis is a mathematical approach to investigating language, not just on
Twitter or Facebook but in any text written by humans. Its aim is to understand the
meaning of a word or a sentence using statistical and logical information. It is used
not only to learn what people think or feel about a topic but also to learn more
about the writer (for example, his or her education or social group).
To work well, this kind of approach needs a good dictionary and lexicon, and a sound
structure for deciding which words are important in a specific language or grammar
and which are not.
The problem with this kind of analysis lies in the size of dictionaries and lexicons,
but also in the ability to recognize the right word to analyze. Each language has
different rules about where a word must appear, and determining a word's role in a
sentence is not simple. On the other hand, supplying more and more information about
grammar rules and meanings is not a solution either, because not all speakers of a
language use correct grammar, and speech changes more often than grammar books and
dictionaries.
Researchers have found a possible solution to these problems: investigating several
levels at the same time. We discuss it in the next section.
Different Level of Sentiment Analysis
In general, sentiment analysis has been investigated at three main levels.
Document level: the task at this level is to classify whether a whole opinion
document expresses a positive or negative sentiment. This level assumes that each
document expresses opinions on a single entity, not on several.
Sentence level: the task is to determine whether each sentence expresses a positive,
negative or neutral opinion. This level of analysis is closely related to subjectivity
classification, which distinguishes sentences that express factual information from
sentences that express subjective views and opinions.
Entity and Aspect level: neither the document-level nor the sentence-level analysis
discovers exactly what people like and dislike. Aspect level performs a finer-grained
analysis. Instead of looking at language constructs, it looks directly at the opinion
itself. It is based on the idea that an opinion consists of a sentiment (positive or
negative) and a target (of the opinion). An opinion whose target is not identified is
of limited use.
To improve the lexicon, researchers have created algorithms that let a computer
determine at least whether two adjectives have the same polarity, using grammatical
indicators such as “and” or “but” to group words into clusters of similar or opposite
orientation. In this way it is possible to build a good lexicon that knows the
relations between two or more words and their polarity. By the polarity of a word we
mean its orientation (positive, negative or neutral) together with the score that
distinguishes it from other words, or from the same word with or without a modifier.
Polarity is calculated by this formula:
As we can see, polarity is calculated as a positive quantity, and the index “i”
represents each day of the observation period. More texts over more days therefore
give better precision about the polarity of a word and about its relationship with
other words. The threshold at which to stop and consider the polarity of an entity
defined is, of course, fixed by the researcher. In the end, our lexicon would appear
to a computer this way:
Using this method we can grow our lexicon starting from just a few simple but useful
words and organize it into clusters that have the same or a similar meaning and
polarity. In this way sentiment analysis can become more accurate and specific each
time. Some researchers, however, believe it is impossible for an algorithm to really
understand human languages, because languages work not only at the level of grammar
but also at the level of meaning, and at several levels at once.
What about a word that is not in the official dictionary but belongs to slang or to a
dialect? Sentiment analysis is relatively simple if we analyze articles or blogs
written to be read by everyone, but it becomes harder when we analyze the opinions of
ordinary people. For example, to talk about rubbish a citizen of Bologna will use the
word “rusco”, which is used only in Bologna and comes from the name of a waste tax.
The meaning of this word is unknown to most people in Italy, but everyone in Bologna
uses it, even as a topic of comments and opinions, and adjectives and adverbs derived
from it are used to talk not just about rubbish but also about people. A sentiment
analysis algorithm therefore needs the ability to recognize whether a sentence is an
opinion, and to identify its topic and polarity, which are discussed in the next
section.
Opinion, Emotion and Spam Issue
In the sentiment analysis field, an opinion is represented by a quintuple
(e_i, a_ij, s_ijkl, h_k, t_l), where e_i is the name of an entity, a_ij is an aspect
of e_i, s_ijkl is the sentiment on aspect a_ij of entity e_i, h_k is the opinion
holder, and t_l is the time when the opinion is expressed by h_k. The sentiment
s_ijkl is positive, negative or neutral, or is expressed with different
strength/intensity levels.
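The quintuple maps naturally onto a small record type. The following is a minimal sketch, not part of any library: the class name `Opinion`, the helper `sentiment_by_aspect`, and the choice of encoding the sentiment as +1, 0 or -1 are our own assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Opinion:
    """One opinion quintuple (e_i, a_ij, s_ijkl, h_k, t_l)."""
    entity: str      # e_i: the entity the opinion is about
    aspect: str      # a_ij: the aspect of the entity being judged
    sentiment: int   # s_ijkl: assumed encoding +1 / 0 / -1
    holder: str      # h_k: who expressed the opinion
    time: date       # t_l: when the opinion was expressed


def sentiment_by_aspect(opinions):
    """Sum the sentiment values for every (entity, aspect) pair."""
    totals = {}
    for opinion in opinions:
        key = (opinion.entity, opinion.aspect)
        totals[key] = totals.get(key, 0) + opinion.sentiment
    return totals
```

Keeping the holder and the time in each record is what later makes it possible to group opinions by person or by day rather than only by target.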
We can distinguish different types of opinions:
• regular opinion, often referred to simply as an opinion. It has two main
subtypes, direct opinion and indirect opinion (the latter often occurs in the medical
domain and is harder to deal with);
• comparative opinion, which expresses a relation of similarities or differences
between two or more entities and/or a preference of the opinion holder based on
some shared aspects of the entities; a comparison can be gradable (it expresses an
ordering of the entities being compared), equative (a relation of the type "equal
to"), superlative ("greater or less than all others") or non-gradable (a relation
between two or more entities that does not grade them);
• explicit opinion, which is a subjective statement that gives a regular or
comparative opinion;
• implicit opinion, which is an objective statement that implies a regular or
comparative opinion.
We define an emotion to be "a personal positive or negative feeling."
Here are some examples:
Emotions, our subjective feelings and thoughts, have been studied in multiple
fields, such as psychology, philosophy and sociology. According to Parrott (2001),
people have six primary emotions: love, joy, surprise, anger, sadness and fear, which
can be subdivided into many secondary and tertiary emotions.
According to consumer behavior research, evaluations can be broadly categorized
into two types: rational evaluations and emotional evaluations. Rational evaluation is
based on tangible beliefs and utilitarian attitudes. Emotional evaluation, instead, goes
deep into people's state of mind. To make use of these types of evaluations in
practice, sentiment ratings can be designed as emotional negative (-2), rational
negative (-1), neutral (0), rational positive (+1), and emotional positive (+2). In
practice, neutral often means no opinion or sentiment expressed.
The most important indicators of sentiments are the so-called opinion words.
These are words that are commonly used to express positive or negative sentiments.
For example good, wonderful and amazing are positive sentiment words and bad,
poor and terrible are negative sentiment words. There are also phrases and idioms,
of course. A list of such constructs is called a sentiment lexicon (or opinion lexicon).
There are, however, some problems with sentiment lexicons:
• a positive or negative sentiment word may have opposite orientations in different
application domains (compare "this camera sucks" with "this vacuum cleaner really
sucks");
• a sentence containing sentiment words may not express any sentiment at all;
• sarcastic sentences with or without sentiment words are hard to deal with;
• many sentences without sentiment words can easily imply opinions.
Opinion spamming has become a major issue. Social media enable anyone from
anywhere in the world to freely express his or her views and opinions without
disclosing his or her true identity. This allows people with hidden agendas or
malicious intent to game the system: they give the impression of being independent
members of the public and post fake opinions to promote or discredit target products,
services, organizations or individuals without disclosing their true intentions. We
can identify three types of spam: fake reviews (untruthful reviews that are not based
on the reviewers' genuine experience of using the products or services, but are
written with hidden motives); reviews about brands only, which comment not on the
specific product or service but only on the brand or the manufacturer; and
non-reviews, that is, advertisements and other irrelevant text containing no opinions.
Another problem in opinion mining is that some sources, such as Twitter, present
opinions in a particular structure. Tweets are short (at most 140 characters),
informal, and use a lot of Internet slang and many emoticons. Twitter postings are
actually easier to analyze (in comparison with forums, articles and Facebook posts)
because of the length limit, and it is often easier to achieve high sentiment
analysis accuracy on them. Reviews are also easier because they are highly focused
and contain little irrelevant information. To get a better idea of what we mean, look
at the table below¹:
¹ This rating was produced by Namrata Godbole, Manjunath Srinivasaiah and Steven Skiena using the Lydia sentiment analysis system.
We can see that the ratings given to the same topic by newspapers and by blogs
differ considerably, and sometimes they clash. So whom should we believe in order to
know what people think about a topic? The difference is due to the nature of blogs
and microblogs such as Twitter, but also to the need for more words and for lexicons
that include slang words and abbreviations.
2. Sentiment Analysis: methodologies
How to evaluate a sentiment opinion
1. Mark sentiment words and phrases: this step marks all sentiment words
and phrases in the sentence. Each positive word is assigned the sentiment score of
+1 and each negative word is assigned the sentiment score of -1. For example, we
have the sentence, “The voice quality of this phone is not good, but the battery life is
long.” After this step, the sentence becomes “The voice quality of this phone is not
good [+1], but the battery life is long” because “good” is a positive sentiment word
(the aspects in the sentence are "voice quality" and "battery life").
2. Apply sentiment shifters: sentiment shifters are words and phrases that
can change sentiment orientations. There are several types of such shifters.
Negation words like not, never, none, nobody, nowhere, neither, and cannot are the
most common type. In our example, the negation “not” turns “good [+1]” into
“not good [-1]”.
3. Handle but-clauses: words or phrases that indicate contrast need special
handling because they often change sentiment orientations too. The most commonly
used contrast word in English is “but”. A sentence containing a contrast word or
phrase is handled by applying the following rule: the sentiment orientations before
the contrast word (e.g. but) and after the contrast word are opposite to each other if
the opinion on one side cannot be determined.
4. Aggregate opinions: this step applies an opinion aggregation function to
the resulting sentiment scores to determine the final orientation of the sentiment on
each aspect in the sentence.
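The four steps above can be sketched in a few lines of Python. This is a simplified illustration under our own assumptions, not the reference algorithm: the lexicon contains only the opinion words quoted in this essay, a negation flips a sentiment word only if it appears within the two preceding tokens, only the contrast word “but” is handled, and the aggregation function is a plain sum per clause. The function names are hypothetical.

```python
import re

# Tiny illustrative lexicon built from the opinion words quoted in this essay.
POSITIVE = {"good", "wonderful", "amazing"}
NEGATIVE = {"bad", "poor", "terrible"}
NEGATIONS = {"not", "never", "none", "nobody", "nowhere", "neither", "cannot"}


def clause_score(tokens, window=2):
    """Steps 1, 2 and 4 for one clause: mark, shift, then sum the scores."""
    total = 0
    for i, token in enumerate(tokens):
        score = 1 if token in POSITIVE else -1 if token in NEGATIVE else 0
        # A negation within the `window` preceding tokens flips the orientation.
        if score and any(t in NEGATIONS for t in tokens[max(0, i - window):i]):
            score = -score
        total += score
    return total


def aspect_sentiment(sentence, aspects):
    """Return a label for each aspect that is found in the sentence."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    if "but" in tokens:
        cut = tokens.index("but")
        clauses = [tokens[:cut], tokens[cut + 1:]]
    else:
        clauses = [tokens]
    scores = [clause_score(clause) for clause in clauses]
    # Step 3: if one side of "but" is undetermined, give it the opposite
    # orientation of the other side.
    if len(scores) == 2:
        if scores[0] == 0:
            scores[0] = -scores[1]
        elif scores[1] == 0:
            scores[1] = -scores[0]
    labels = {}
    for aspect in aspects:
        words = aspect.lower().split()
        for clause, score in zip(clauses, scores):
            if all(word in clause for word in words):
                labels[aspect] = ("positive" if score > 0
                                  else "negative" if score < 0 else "neutral")
                break
    return labels
```

Called on the example sentence with the aspects "voice quality" and "battery life", the sketch labels the first aspect negative (because of “not good”) and the second positive, even though “long” is not in the lexicon: the orientation of the second clause comes entirely from the but-rule.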
How to create an efficient lexicon
It is also crucial to create a sentiment lexicon that is accurately matched to
the subject under analysis. Researchers have proposed many approaches to compile
sentiment words. The three main approaches are the manual approach, the
dictionary-based approach and the corpus-based approach.
• The manual approach is labor intensive and time consuming, and is thus not
usually used alone but combined with automated approaches as the final check,
because automated methods make mistakes.
• The dictionary-based approach works as follows: a small set of sentiment words
(seeds) with known positive or negative orientations is first collected manually,
which is very easy. The algorithm then grows this set by searching WordNet or
another online dictionary for their synonyms and antonyms. The newly found words are
added to the seed list, and the next iteration begins. The iterative process ends
when no more new words can be found. After the process completes, a manual
inspection step is used to clean up the list.
• The corpus-based approach has been applied to two main scenarios: first, given a
seed list of known (often general-purpose) sentiment words, discovering other
sentiment words and their orientations from a domain corpus; second, adapting a
general-purpose sentiment lexicon to a new one using a domain corpus for sentiment
analysis applications in the domain. Although the corpus-based approach may also
be used to build a general-purpose sentiment lexicon if a very large and very diverse
corpus is available, the dictionary-based approach is usually more effective for that
because a dictionary has all words.
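The iterative growth of the dictionary-based approach can be sketched as follows. To keep the example self-contained we use a toy in-memory thesaurus instead of WordNet; the two dictionaries and the function name `grow_lexicon` are assumptions made for illustration. Synonyms inherit the polarity of the word that led to them, antonyms receive the opposite polarity, and the loop stops when no new word is found.

```python
from collections import deque

# Toy thesaurus standing in for WordNet (illustrative data only).
SYNONYMS = {"good": ["fine", "great"], "great": ["wonderful"], "bad": ["poor"]}
ANTONYMS = {"good": ["bad"], "wonderful": ["terrible"]}


def grow_lexicon(seeds, synonyms, antonyms):
    """Expand a seed lexicon {word: +1 or -1} until no new words appear."""
    lexicon = dict(seeds)
    queue = deque(lexicon)          # words whose neighbours are still unexplored
    while queue:
        word = queue.popleft()
        polarity = lexicon[word]
        for other in synonyms.get(word, ()):
            if other not in lexicon:
                lexicon[other] = polarity       # synonym: same orientation
                queue.append(other)
        for other in antonyms.get(word, ()):
            if other not in lexicon:
                lexicon[other] = -polarity      # antonym: opposite orientation
                queue.append(other)
    return lexicon
```

Starting from the single seed `{"good": 1}`, the toy thesaurus yields seven words. In a real setting the result would still go through the manual inspection step mentioned above, because a word's first discovered polarity is kept even if a later path through the dictionary would contradict it.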
Thanks to the contributions of many researchers, several general-purpose subjectivity,
sentiment and emotion lexicons have been constructed, and some of them are also
publicly available:
• General Inquirer lexicon (Stone, 1968): (http://www.wjh.harvard.edu/~inquirer/spreadsheet_guide.htm)
• Sentiment lexicon (Hu and Liu, 2004): (http://www.cs.uic.edu/~liub/FBS/sentiment-analysis.html)
• MPQA subjectivity lexicon (Wilson, Wiebe and Hoffmann, 2005): (http://www.cs.pitt.edu/mpqa/subj_lexicon.html)
• SentiWordNet (Esuli and Sebastiani, 2006): (http://sentiwordnet.isti.cnr.it/)
• Emotion lexicon (Mohammad and Turney, 2010): (http://www.purl.org/net/emolex)
Summarization
Opinion summarization is still an active research area. Most opinion
summarization methods which produce a short text summary have not focused on
the quantitative side (proportions of positive and negative opinions). Future research
can deal with this problem while also producing human readable texts. We should
note that the opinion summarization research cannot progress alone because it
critically depends on results and techniques from other areas of research in
sentiment analysis, e.g., aspect or topic extraction and sentiment classification. All
these research directions will need to go hand-in-hand.
An example of Sentiment Analysis system: Datumbox API
Social Media Monitoring is one of the hottest topics nowadays. As more and more
companies use Social Media Marketing to promote their brands, it has become
necessary for them to be able to evaluate the effectiveness of their campaigns.
Evaluating opinions requires performing Sentiment Analysis, which is the task of
automatically identifying the polarity, the subjectivity and the emotional states of
a particular document or sentence. It requires Machine Learning and Natural
Language Processing techniques, and this is where most developers hit a wall when
they try to build their own tools.
To create an efficient tool, we can use free, open-source software called
Datumbox API 1.0. A tool built on Datumbox needs at least two modules: one that
evaluates how many people are influenced by a certain campaign and one that finds
out what people think about the particular topic. To work with Twitter, the tool
needs at least two things: it must be able to connect to Twitter, and it must
evaluate the polarity of the tweets based on their words.
The first main steps are: log in to Twitter using your credentials, click the “Create
new Application” button and fill in the form to register a new app. Once it is
created, select the application, go to the “Details” tab (the first tab) and click
the “Create my access token” button at the bottom of the page. Then go to the
“OAuth tool” tab and note down the values: Consumer Key, Consumer Secret, Access
Token and Access Token Secret.
Here is an example, just a few lines, of code that uses Twitter and Datumbox API
credentials together to create a Sentiment Analysis tool.
<?php
class TwitterSentimentAnalysis {
    protected $datumbox_api_key; //Your Datumbox API Key.
    protected $consumer_key; //Your Twitter Consumer Key.
    protected $consumer_secret; //Your Twitter Consumer Secret.
    protected $access_key; //Your Twitter Access Key.
    protected $access_secret; //Your Twitter Access Secret.

    /**
    * The constructor of the class
    *
    * @param string $datumbox_api_key Your Datumbox API Key
    * @param string $consumer_key Your Twitter Consumer Key
    * @param string $consumer_secret Your Twitter Consumer Secret
    * @param string $access_key Your Twitter Access Key
    * @param string $access_secret Your Twitter Access Secret
    *
    * @return TwitterSentimentAnalysis
    */
    public function __construct($datumbox_api_key, $consumer_key, $consumer_secret, $access_key, $access_secret){
        $this->datumbox_api_key=$datumbox_api_key;
        $this->consumer_key=$consumer_key;
        $this->consumer_secret=$consumer_secret;
        $this->access_key=$access_key;
        $this->access_secret=$access_secret;
    }
    [...]
    $sentiment=$DatumboxAPI->TwitterSentimentAnalysis($tweet['text']); //call Datumbox service to get the sentiment
    if($sentiment!=false) { //if the sentiment is not false, the API call was successful.
        $results[]=array( //add the tweet message in the results
            'id'=>$tweet['id_str'],
            'user'=>$tweet['user']['name'],
            'text'=>$tweet['text'],
            'url'=>'https://twitter.com/'.$tweet['user']['name'].'/status/'.$tweet['id_str'],
            'sentiment'=>$sentiment,
        );
    .....
To detect the sentiment of the tweets, Datumbox uses a Machine Learning framework
to build a classifier capable of detecting positive, negative and neutral tweets.
The training set consisted of 1.2 million tweets evenly distributed across the three
categories. The software tokenizes the tweets by extracting their bigrams and by
taking into account the URLs, the hashtags, the usernames and the emoticons.
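As an illustration of this kind of tokenization, here is a minimal sketch in Python. It is not Datumbox's implementation: the regular expression, the placeholder tokens `URL` and `USERNAME`, the small set of recognized emoticons and the function names are all our own assumptions.

```python
import re

# One alternative per token type; alternatives are tried from top to bottom.
TOKEN_RE = re.compile(r"""
    (?P<url>https?://\S+)                 # links
  | (?P<user>@\w+)                        # @mentions
  | (?P<hashtag>\#\w+)                    # hashtags
  | (?P<emoticon>[:;=][-']?[)(DPp])       # a few common emoticons, e.g. :) ;-( :D
  | (?P<word>[A-Za-z0-9']+)               # ordinary words
""", re.VERBOSE)


def tokenize(tweet):
    """Split a tweet into tokens, normalizing URLs and usernames."""
    tokens = []
    for match in TOKEN_RE.finditer(tweet):
        kind = match.lastgroup
        if kind == "url":
            tokens.append("URL")            # every link becomes one feature
        elif kind == "user":
            tokens.append("USERNAME")       # every mention becomes one feature
        elif kind in ("hashtag", "word"):
            tokens.append(match.group().lower())
        else:
            tokens.append(match.group())    # emoticons are kept as written
    return tokens


def bigrams(tokens):
    """Return the list of adjacent token pairs."""
    return list(zip(tokens, tokens[1:]))
```

Replacing every link and every mention with a single placeholder keeps the vocabulary small: what matters to the classifier is that a tweet contains a URL or addresses someone, not which URL or whom.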
To select the best features, it tries several different algorithms and in the end
chooses Mutual Information. The results were evaluated with the 10-fold
cross-validation method, and the best-performing classifier achieved an accuracy of
83.26%.
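Mutual Information measures how much knowing whether a feature is present in a tweet tells us about the tweet's class. The sketch below computes it for one feature over a labeled collection, using the standard definition with base-2 logarithms. It only illustrates the measure and is not Datumbox's code; representing each document as a set of tokens and the function name are our assumptions.

```python
import math
from collections import Counter


def mutual_information(docs, labels, feature):
    """Mutual information (in bits) between feature presence and class label.

    docs   -- list of token sets, one per document
    labels -- list of class labels, aligned with docs
    """
    n = len(docs)
    joint = Counter((feature in doc, label) for doc, label in zip(docs, labels))
    presence = Counter(feature in doc for doc in docs)
    classes = Counter(labels)
    mi = 0.0
    # Only observed (presence, label) pairs contribute; unseen pairs add zero.
    for (present, label), count in joint.items():
        p_joint = count / n
        p_indep = (presence[present] / n) * (classes[label] / n)
        mi += p_joint * math.log2(p_joint / p_indep)
    return mi
```

A feature that appears in exactly the positive documents of a balanced two-class collection scores 1 bit, while a feature spread evenly over both classes scores 0. Ranking all features by this score and keeping the top ones is the usual way such a measure is used for feature selection.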
3. Twitter
In the past few years, there has been a huge growth in the use of microblogging
platforms such as Twitter. Spurred by that growth, companies and media
organizations are increasingly seeking ways to mine Twitter for information about
what people think and feel about their products and services. Companies such as
Twitrratr (twitrratr.com), tweetfeel (www.tweetfeel.com), and Social Mention
(www.socialmention.com) are just a few that advertise Twitter sentiment analysis as
one of their services.
Twitter messages have many unique attributes, which are worth underlining before
we talk about sentiment analysis methods for them.
1. Length. The maximum length of a Twitter message is 140 characters.
From our training set, we calculated that the average length of a tweet is 14 words,
and the average length of a sentence is 78 characters.
2. Language model. Twitter users post messages from many different
devices, including their cell phones. The frequency of misspellings and slang in
tweets is much higher than in other domains.
3. Twitter is used by different people to express their opinions about different
topics, so it is a valuable source of people’s opinions.
4. Twitter contains an enormous number of text posts, and it grows every day.
The collected corpus can be arbitrarily large.
5. Twitter’s audience varies from regular users to celebrities, company
representatives, politicians, and even country presidents. Therefore, it is possible
to collect text posts from users of different social and interest groups.
So investigating Twitter, and applying sentiment analysis to it, is becoming more
important every day. Through Twitter people connect with one another, creating a
social web where it is possible to organize meetings and, for example, flash mobs,
show them to the rest of the world and receive feedback in a way that was impossible
in any earlier historical period.
In the fourth year of my high school, a history teacher asked himself and us what
would have become of the French Revolution if politicians had had mobile phones,
Facebook or at least a TV. Would it have happened or not? The same question now
arises for everybody with the events of the Arab Spring. Those social movements were
organized on social platforms, particularly Twitter, and the instant feedback from
other users gave participants a strength that was impossible in events before social
networks. The problem is that we have no evidence of this: these events were born on
the Internet and thanks to social networks, but nobody has a real instrument to
measure the power of this new place. As with all research of this kind, there is also
a dark side: understanding these media is also a way to understand how to control
them. Firms and governments already have accounts, official or not, and strategies to
promote themselves, to inform and also to control people; but everything on these
social networks is public, and anybody can use the data if they ask, and often pay.
From this point of view, sentiment analysis of social networks and blogs casts some
interesting dark shadows.
What would happen if the police could know where an opinion starts to spread
around the Web? Or if a firm with image problems could know what to write to
correct them and prevent problems the following year? During political campaigns,
for example, sentiment analysis is used to understand what the public thinks about
the candidates' statements and how the campaign is going. Unfortunately, sentiment
analysis methods are not yet efficient enough, and even in the USA not everybody is
on Twitter, so a computer alone cannot give an accurate picture of what is happening
on the Web. From an algorithmic point of view, the problem is to create a program
that can understand the semantic part of a text and relate how many times a topic
appears to the context in which it appears. In the next section we present Voices
from the Blogs, which proposes an interesting solution to these problems.
An Italian case study: Voices from the Blogs and Ihappy index
Voices from the Blogs (VFB) is a start-up that aims to study Twitter and other
social media with a new sentiment analysis approach that integrates algorithms and
human work. The idea is that, to eliminate misclassification errors, the work of the
computer must be checked and, where needed, supplemented by humans. Using this
method, VFB grows its lexicon and classifies better than other systems, and with the
data and the statistical methods illustrated above it gives a more specific and
accurate report of what the Web feels about the topics under discussion. It has also
created an index called Ihappy, which measures the happiness of each city in a
country. VFB collaborates with some important Italian newspapers, for example Il Sole
24 Ore and Il Corriere della Sera, and it has also created an index for Wired.com to
measure the desire for innovation and creativity in Italy. It is important to observe
that this index, and VFB's findings in general, give a better picture of what the Web
is feeling and how people talk about important topics, especially controversial ones,
which, as we noted before, are the real problem for sentiment analysis systems that
use only statistical and logical methods and do not involve humans. This method is
very useful because it does not leave all the work to a computer with a small or
large lexicon: a person checks the computer and teaches it how to translate human
language into machine language, and this is possible for every language in the world
that a researcher speaks. Voices from the Blogs has also created an app that shows
its results every day.
Ihappy index
Ihappy is an index created by Voices from the Blogs to analyze the mood of Twitter
users day by day. It is not just a temporal index but also a geographic one, thanks
to Twitter's geolocation system. So for each day of the year we can know where
people are happier and, thanks to tags, even why. The analysis starts from a small
sample of happy and unhappy tweets that Twitter makes available to everyone every
day; using text analysis methods and human supervision, as VFB already does for its
ordinary text analysis of the Web, it is possible to understand the mood of Twitter.
The level of happiness in a particular city and on Twitter is calculated by this
formula:
Ihappy = (number of happy posts / number of happy and unhappy posts) × 100%
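The formula translates directly into code. In this small sketch the function name `ihappy` and the decision to raise an error when there are no classified posts are our own choices; how VFB itself handles that case is not stated in our sources.

```python
def ihappy(happy_posts, unhappy_posts):
    """Percentage of happy posts among all posts classified as happy or unhappy."""
    total = happy_posts + unhappy_posts
    if total == 0:
        raise ValueError("no classified posts: the index is undefined")
    return 100.0 * happy_posts / total
```

For instance, a city with 60 happy and 40 unhappy tweets on a given day gets `ihappy(60, 40)`, that is, an index of 60%. Posts that are neither happy nor unhappy do not enter the denominator, so the index compares only the two moods with each other.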
It is important to understand that VFB does not just look at words or particular
topics: the integration of human and computer also permits a semantic analysis of
tweets. In this way the Ihappy index also takes into account whether somebody is
talking about the birth of his or her son or about a sunny day after a week of rain
and fog.
The analysis of Twitter by VFB shows us that some variables are dynamic and
others are static. The first are connected with events or particular days of
the year:
• the happiness of one day persists over the following days
• the day analyzed: a holiday or a celebration, for example Mother's Day, will be
a happier day than, for example, the day on which the clocks change
• the facts of the day: whether or not there was good news during the day changes
our level of happiness
The second kind of variables are fixed and do not change from day to day:
• geographic: where we live changes our happiness, which decreases as altitude
increases but rises again if the district is on the coast; for example, if Milano
were on the sea, the happiness of its citizens would be 1.3 points higher
• institutional and political: here we do not mean the political color of the city
administration but the quality of life and of social services
• demographic: living in a more populated city makes people happier, maybe
because a city offers more opportunities than a town
The Ihappy ebook of 2013 shows us a happier country than the year before, with
310 happy days. We can also see that the variable that really changes the color of
a day is the one connected with collective events, such as a holiday or a success in
an important field. The Ihappy index also gives us an idea of how important
communication and social networks are becoming for understanding our society and,
in some cases, for controlling it.
Conclusion
After covering the basic aspects of Sentiment Analysis, from the key points of this
type of research (what an opinion actually is, how it is expressed, what the
different levels of analysis are and which characteristics distinguish different
emotions) to the technical and arithmetical methodologies, we focused on the Twitter
phenomenon (the real “search field” for this kind of analysis) and on a business
built around this topic, Voices from the Blogs.
This excursus is important because it allows us to draw our conclusion. In our
opinion, Sentiment Analysis as it is today is not yet fully capable of catching the
real moods and trends that appear on social media. It still misses the key point,
namely the countless forms of communication that human beings use beyond a
standard lexicon: slang, dialects, inside jokes among friends, sarcasm and so on.
It is not about misspellings or graphic emoticons, which, as we have seen, are well
integrated into Sentiment Analysis lexicons. It has to do with the individuality of
each user and the specific ways in which only he or she communicates with the
digital world.
Of course we are talking about a growing field that is refining its filters and
structures in order to keep up with trends, so we hope that in time it will be
possible to see an integrated structure in which computers are fed with human
minds and natural procedures. It is the same kind of integration we have seen in
Artificial Neural Networks, a field that is still growing and expanding because its
options are unlimited.
Bibliography
• International Sentiment Analysis for News and Blogs, by Mikhail Bautin, Lohit Vijayarenu and Steven Skiena
• Large-Scale Sentiment Analysis for News and Blogs, by Namrata Godbole, Manjunath Srinivasaiah and Steven Skiena
• Sentiment Analysis and Opinion Mining, by Bing Liu, Morgan & Claypool Publishers, May 2012
• Twitter as a Corpus for Sentiment Analysis and Opinion Mining, by Alexander Pak and Patrick Paroubek
• Twitter Sentiment Analysis, by Alec Go, Lei Huang and Richa Bhayani
• IHappy 2013, by Voices from the Blogs (Andrea Ceron, Luigi Curini and Stefano M. Iacus)
• Voices from the Blogs, http://voicesfromtheblogs.com