Why Watching Movie Tweets Won’t Tell the Whole Story?


Felix Ming-Fai Wong, Soumya Sen, Mung Chiang
EE, Princeton University
WOSN’12. August 17, Helsinki.


Twitter: A Gold Mine?

Apps: Mombo, TwitCritics, Fflick

Or Too Good to be True?


Why representativeness?
  • Selection bias
      • Gender
      • Age
      • Education
      • Computer skills
      • Ethnicity
      • Interests

The silent majority

Why Movies?
  • Large online interest
  • Relatively unambiguous
  • Right in timing: Oscars

Key Questions
  • Is there a bias in Twitter movie ratings?
  • How do Twitter ratings compare to those on other online sites?
  • Is there a quantifiable difference across types of movies?
      • Oscar-nominated vs. non-nominated commercial movies
  • Can hype ensure box-office gains?

Data Stats
  • 12 million tweets
      • 1.77M valid movie tweets
  • February 2 to March 12, 2012
  • 34 movie titles
      • New releases (20)
      • Oscar-nominees (14)

Example Movies

  Commercial movies (non-nominees)      Oscar-nominees*
  The Grey                              The Descendants
  Underworld: Awakening                 The Artist
  Contraband                            The Help
  Haywire                               Hugo
  The Woman in Black                    Midnight in Paris
  Chronicle                             Moneyball
  The Vow                               The Tree of Life
  Journey 2: The Mysterious Island      War Horse

  * Oscar-nominees for Best Picture or Best Animated Film

Rating Comparison
  • Twitter movie ratings
      • Binary classification: positivity/negativity
  • IMDb, Rotten Tomatoes ratings
      • Rating scale: 0 to 10 (IMDb), 0 to 5 (Rotten Tomatoes)
  • Benchmark definition
      • Rotten Tomatoes: positive cutoff at 3.5/5
      • IMDb: positive cutoff at 7/10
  • Movie score: weighted average
  • High mutual information
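The benchmark definition above can be sketched in a few lines of Python. This is an illustration only, not the authors' code: treating the cutoffs as inclusive (a rating equal to the cutoff counts as positive) is an assumption, and the `weights` parameter is hypothetical, since the slides mention a weighted average without stating the weights.

```python
from typing import Optional, Sequence

# Cutoffs taken from the benchmark definition on the slide.
# Treating them as inclusive (>=) is an assumption.
CUTOFFS = {"rt": 3.5, "imdb": 7.0}


def is_positive(rating: float, site: str) -> bool:
    """Return True if a rating counts as positive for the given site."""
    return rating >= CUTOFFS[site]


def positive_share(
    ratings: Sequence[float],
    site: str,
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Weighted fraction of positive ratings for one movie.

    With weights=None every rating counts equally. The weighting scheme
    used in the talk is not given, so `weights` is a placeholder.
    """
    if weights is None:
        weights = [1.0] * len(ratings)
    if len(weights) != len(ratings):
        raise ValueError("ratings and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    positive = sum(w for r, w in zip(ratings, weights) if is_positive(r, site))
    return positive / total
```

For example, `positive_share([8, 6, 7, 9], "imdb")` counts three of four ratings as positive. A score of this kind lies between 0 and 1 and can be compared directly with the fraction of positive tweets for the same movie.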

Challenges
  • Common words in titles
      • e.g., "Thanks for the help", "the grey sky"
  • No API support for exact matching
      • e.g., "a grey cat in the box"
  • Misrepresentations
      • e.g., "underworld awakening"
  • Non-standard vocabulary
      • e.g., "sick movie"
  • Non-English tweets
  • Retweets: taken as approval of the original opinion
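Two of the challenges above, the lack of exact matching and titles written without punctuation, can be handled by a local phrase filter applied after tweets are fetched. The sketch below is a hypothetical illustration; the function names and the normalization rules are assumptions, not the authors' pipeline.

```python
import re


def normalize(text: str) -> str:
    """Lowercase, turn every run of non-alphanumerics into one space."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", text.lower()).split())


def mentions_title(tweet: str, title: str) -> bool:
    """True if the normalized title occurs as a contiguous phrase in the tweet."""
    pattern = r"\b" + re.escape(normalize(title)) + r"\b"
    return re.search(pattern, normalize(tweet)) is not None
```

With this filter, "a grey cat in the box" no longer matches "The Grey", and "underworld awakening" matches "Underworld: Awakening". Phrase matching does not solve the common-words problem: "Thanks for the help" still matches "The Help", so a further relevance check is needed.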

Tweet Classification

Steps
  • Preprocessing
      • Remove usernames
      • Conversion: punctuation marks (e.g., "!" to a meta-word "exclmark"), emoticons (e.g., ":)"), URLs, "@"
  • Feature vector conversion
      • Preprocessed tweet to binary feature vector (using MALLET)
  • Training & classification
      • 11K non-repeated, randomly sampled tweets manually labeled
      • Trained three classifiers

[Figure: Preprocessing, Training Data, Classification, Analysis]

Classifier Comparison
  • SVM-based classifier performance is significantly better
  • Numbers in brackets are balanced accuracy rates
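The preprocessing and feature-vector steps above, and the balanced accuracy mentioned in the classifier comparison, can be sketched as follows. The talk used MALLET (a Java toolkit); this plain-Python version is a stand-in for illustration. Only the meta-word "exclmark" comes from the slides; the other meta-words ("questmark", "urllink", "smileyface", "frownyface") and the emoticon list are assumptions.

```python
import re

URL_RE = re.compile(r"https?://\S+")
USER_RE = re.compile(r"@\w+")
# Hypothetical emoticon-to-meta-word table; the slides do not list one.
EMOTICONS = {":)": " smileyface ", ":-)": " smileyface ",
             ":(": " frownyface ", ":-(": " frownyface "}


def preprocess(tweet: str) -> list:
    """Turn a raw tweet into a list of lowercase tokens and meta-words."""
    text = tweet.lower()
    text = URL_RE.sub(" urllink ", text)    # URLs become one meta-word
    text = USER_RE.sub(" ", text)           # remove usernames
    for emoticon, word in EMOTICONS.items():
        text = text.replace(emoticon, word)
    text = text.replace("!", " exclmark ")  # meta-word named on the slide
    text = text.replace("?", " questmark ")
    return re.findall(r"[a-z0-9']+", text)


def build_vocabulary(token_lists) -> dict:
    """Map each distinct token to a feature index, in order of first use."""
    vocab = {}
    for tokens in token_lists:
        for token in tokens:
            vocab.setdefault(token, len(vocab))
    return vocab


def to_binary_vector(tokens, vocab) -> list:
    """1 if a vocabulary token appears in the tweet, else 0."""
    vector = [0] * len(vocab)
    for token in tokens:
        index = vocab.get(token)
        if index is not None:
            vector[index] = 1
    return vector


def balanced_accuracy(y_true, y_pred) -> float:
    """Mean of per-class recall, which is robust to class imbalance."""
    recalls = []
    for label in set(y_true):
        total = sum(1 for t in y_true if t == label)
        correct = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        recalls.append(correct / total)
    return sum(recalls) / len(recalls)
```

Balanced accuracy matters here because positive tweets dominate: a classifier that always predicts "positive" can have high plain accuracy while its balanced accuracy stays at 0.5 for two classes.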

Temporal Context

[Figure panels: Woman in Black, Chronicle, The Vow, Safe House]

  • Most "current" tweets are mentions
  • LBS
  • "Before" or "After" tweets are much more positive
[Figure: Fraction of tweets in joint-classes]

Sentiment: Positive Bias

[Figure panels: Woman in Black, Chronicle, The Vow, Safe House, Haywire, Star Wars I]
  • Twitter users are overwhelmingly more positive for almost all movies

Rotten Tomatoes vs. IMDb?
  • Good match between RT and IMDb (Benchmark)

Twitter vs. IMDb vs. RT ratings (1)
  • New (non-nominated) movies are rated more positively by Twitter users

[Figure panels: IMDb vs. Twitter, RT vs. Twitter]

Twitter vs. IMDb vs. RT ratings (2)
  • Oscar-nominees rate more positively on IMDb and RT

[Figure panels: IMDb vs. Twitter, RT vs. Twitter]

Quantification Metrics (1)
  • (x*, y*) are the medians of the proportion of positive ratings on Twitter and on IMDb/RT


  • Positiveness (P):
  • Bias (B):
[Figure: axes IMDb/RT score and Twitter score, with x*, y*, and 1 marked]

Quantification Metrics (2)

  • Inferrability (I): if one knows the average rating on Twitter, how well can one predict the ratings on IMDb/RT?

Summary Metrics

  • Oscar-nominees have higher positiveness
  • RT and IMDb have high mutual inferrability and low bias
  • Twitter users are more positively biased for non-nominated new releases

Hype-approval vs. Ratings

  • Higher (lower) Hype-approval [relation symbol not preserved] Higher (lower) IMDb/RT ratings
[Figure panels: The Vow, This Means War]

Hype-approval vs. Box-office
  • High Hype + High IMDb rating: box-office success
  • Other combinations: unpredictable

[Figure label: Journey 2]

Summary
  • Tweeters aren't representative enough
  • Need to be careful about tweets as online polls
  • Tweeters tend to appear more positive, have specific interests and taste
  • Need for quantitative metrics
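As a small illustration of the quantitative metrics called for above, the sketch below computes the median point (x*, y*) from per-movie proportions of positive ratings, and the mutual information between two binarized rating series. The formulas for P, B, and I are not preserved in this transcript, so mutual information is used here only as an assumed stand-in for "how well one source predicts the other", prompted by the "High mutual information" note under Rating Comparison. It should not be read as the authors' definition of inferrability.

```python
import math
from collections import Counter
from statistics import median


def median_point(twitter_shares, site_shares):
    """(x*, y*): medians of per-movie positive shares on Twitter and on IMDb/RT."""
    return median(twitter_shares), median(site_shares)


def mutual_information(xs, ys) -> float:
    """Mutual information in bits between two equal-length discrete sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))
    count_x = Counter(xs)
    count_y = Counter(ys)
    total = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) )
        total += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return total
```

A value near 0 means that knowing whether a movie is rated positively on Twitter says little about its rating class on IMDb/RT; for binary labels the maximum is 1 bit.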