
Twitter Sentiment Analysis - final - no personal


Page 1: Twitter Sentiment Analysis - final - no personal
Page 2: Twitter Sentiment Analysis - final - no personal

PARENTAL ADVISORY

EXPLICIT CONTENT

Page 3: Twitter Sentiment Analysis - final - no personal

What Makes a Good Model?

Team Grant

dammnit I'm lit. &dammnit I kn0 ders b0ut2be kiLLer traFFic! & ya d0nt even kn0 h0w haPPy I am dats its back2sch00l

POSITIVE

Or, Twitter Sentiment Analysis:

using models to classify tweets

so you don’t have to

Page 4: Twitter Sentiment Analysis - final - no personal

a good model is

1. Valuable
2. Accurate
3. Sophisticated
4. Agile

Page 5: Twitter Sentiment Analysis - final - no personal

1. a good model is valuable

Page 6: Twitter Sentiment Analysis - final - no personal

$20M=

Page 7: Twitter Sentiment Analysis - final - no personal

2. a good model is accurate

(and the limits of that accuracy understood)

Page 8: Twitter Sentiment Analysis - final - no personal

77%*

*Performance on 20% hold-out sample. It’s a hell of a lot better on the training sample. (Obviously.)

Page 9: Twitter Sentiment Analysis - final - no personal

*2% better than

55% hosted sentiment classifier

75% trained sentiment classifier

Page 10: Twitter Sentiment Analysis - final - no personal

39% 11%

12% 38%
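
The four cells above read as a 2×2 confusion matrix given in shares of the hold-out sample. A minimal sketch, assuming the diagonal cells (39% and 38%) are the cases where the model and the human label agree; that reading is consistent with the 77% accuracy and the 23% error quoted on the neighbouring slides, but the row and column orientation is not stated in the deck and the variable names are illustrative:

```python
# Shares of the hold-out sample, copied from the slide (percent).
# Assumption: the diagonal holds the cases where model and human label agree.
confusion = [
    [39, 11],
    [12, 38],
]

total = sum(sum(row) for row in confusion)
correct = confusion[0][0] + confusion[1][1]
wrong = confusion[0][1] + confusion[1][0]

accuracy = correct / total
error_rate = wrong / total

print(f"accuracy:   {accuracy:.0%}")
print(f"error rate: {error_rate:.0%}")
```

Under that assumption the two kinds of mistake are almost balanced (11% vs. 12%), so the model does not appear to lean strongly toward either label.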

Page 11: Twitter Sentiment Analysis - final - no personal

of the 23% the model got wrong…

model error 41%

neutral 30%

human error 15%

other 13%

You ever have those days where you feel like you = FAIL. Yeah. It's one of those days.

Model + / Human -

UP is intense! i cried and laughed

Model - / Human +

Sorry, typo - Environmentalism.

Model - / Human +

@Zee It's good, but buggy like a motherfucker.

Model + / Human -

I really hate twitter... i don't know what i'm doing here

Model - / Human +

so tierd could drop DEAD x

Model - / Human +

ActiveRecord::HasManyThroughSourceAssociationMacroError: Invalid source reflection macro :has_one for has_many -> http://bit.ly/135UWH

Model + / Human -

@Dichenlachman I like that you abbreviated bathrooms to b'throoms when b'throoms is the same no. of letters as bathrooms... Bathrooms

Model - / Human +

Page 12: Twitter Sentiment Analysis - final - no personal

bootstrapped hold-out performance

[Histogram: bootstrapped hold-out performance, x-axis 0.75 to 0.79]

μ = 0.768

-2σ = 0.755, -1σ = 0.761, +1σ = 0.775, +2σ = 0.781, +3σ = 0.788
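
A bootstrap of hold-out performance can be sketched as follows. This is a minimal illustration, not the team's code: the per-tweet correctness flags, the sample size of 1,000, and the function name `bootstrap_accuracy` are all made up for the example, so the spread it prints will not match the slide.

```python
import random
import statistics


def bootstrap_accuracy(correct_flags, n_resamples=1000, seed=0):
    """Resample per-tweet correctness flags with replacement and return the
    mean and standard deviation of the resampled accuracies."""
    rng = random.Random(seed)
    n = len(correct_flags)
    accuracies = []
    for _ in range(n_resamples):
        sample = rng.choices(correct_flags, k=n)
        accuracies.append(sum(sample) / n)
    return statistics.mean(accuracies), statistics.stdev(accuracies)


# Hypothetical hold-out results: 1 = model agreed with the human label.
# 768 hits out of 1,000 tweets is invented to mirror the slide's mean only.
flags = [1] * 768 + [0] * 232
mu, sigma = bootstrap_accuracy(flags)
for k in (-2, -1, 0, 1, 2):
    print(f"{k:+d} sigma: {mu + k * sigma:.3f}")
```

The width of the band depends on the hold-out size: more hold-out tweets give a tighter distribution around the same mean.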

Page 13: Twitter Sentiment Analysis - final - no personal

3. a good model is sophisticated

(but not too sophisticated)

Page 14: Twitter Sentiment Analysis - final - no personal

classification process

raw tweets

NLP & features

model specification

training

analysis

Page 15: Twitter Sentiment Analysis - final - no personal

1 raw tweet

MADD-E. it huuuurt like a MOTHERFUCKER fml & i’m not that short & yr tall & i did grow some balls & date night tonight!1! http://bit.ly/Nos9D

2 expand contractions, social media lexicon, corrected XML, repeat replace

MADD-E. it hurt like a MOTHERFUCKER fuck my life & I am not that short & yr tall & i did grow some balls & date night tonight!1! http://bit.ly/Nos9D

3 spellcheck, remove punctuation, remove numbers, all lowercase

made it hurt like a motherfucker fuck my life & i am not that short & your tall & i did grow some balls & date night tonight htp bit ly/nos

4 uni-grams, bi-grams { made, it, made it, … }

5 vectorize [ 0 0 1 0 0 … 0 0 1 0 0 1 ]

6 model
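
The numbered steps on this slide can be sketched in a few lines of standard-library Python. This is a rough illustration under stated assumptions: the lookup tables are hypothetical and tiny, spellchecking is left out, URLs get no special handling, and the function names (`normalize`, `ngrams`, `features`, `vectorize`) are invented for the example rather than taken from the deck.

```python
import re

# Hypothetical, deliberately tiny lookup tables; a real system would use
# much larger contraction and social-media lexicons.
CONTRACTIONS = {"i'm": "i am", "i’m": "i am", "don't": "do not"}
SLANG = {"fml": "fuck my life"}


def normalize(tweet):
    """Steps 2-3 (minus spellcheck): expand contractions and slang, collapse
    repeated letters, strip punctuation and digits, lowercase."""
    text = tweet.lower()
    for short, full in CONTRACTIONS.items():
        text = text.replace(short, full)
    text = re.sub(r"(.)\1{2,}", r"\1", text)   # huuuurt -> hurt
    text = re.sub(r"[^a-z\s]", " ", text)      # punctuation and digits
    words = [SLANG.get(w, w) for w in text.split()]
    return " ".join(words).split()


def ngrams(tokens, n):
    """All runs of n consecutive tokens, joined with a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def features(tokens):
    """Step 4: uni-grams plus bi-grams."""
    return ngrams(tokens, 1) + ngrams(tokens, 2)


def vectorize(feature_list, vocabulary):
    """Step 5: binary presence vector over a fixed, ordered vocabulary."""
    present = set(feature_list)
    return [1 if term in present else 0 for term in vocabulary]


tokens = normalize("it huuuurt like hell, fml & i'm not that short!1!")
feats = features(tokens)
# Hypothetical vocabulary; in practice it is built from the training set.
vocabulary = ["hurt", "happy", "it hurt", "not that"]
vector = vectorize(feats, vocabulary)
print(tokens)
print(vector)
```

The vector is what the model in step 6 sees: one position per uni-gram or bi-gram in the vocabulary, with no memory of word order beyond the bi-grams.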

Page 16: Twitter Sentiment Analysis - final - no personal

why didn’t we do other cool NLP stuff?

[Bar chart: accuracy (axis 0.74 to 0.78) for: tweets; what we did; english only; remove Twitter symbols; remove stop words; stem]


Page 18: Twitter Sentiment Analysis - final - no personal

why does that happen?

raw → spellcheck → normalize case → stem / lemmatize

LOVIN’ → LOVING → loving → love

LOVIN, LOVING

Loving, loving

love, loved

Page 19: Twitter Sentiment Analysis - final - no personal

why does that happen?

raw → spellcheck → normalize case → stem / lemmatize

LOVIN’ → LOVING → loving → love

fewer dimensions (good)

less information (bad)
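
The trade-off can be made concrete by counting distinct features after each normalization step. A minimal sketch, assuming hypothetical stand-ins for the spellchecker and the stemmer (two small dictionaries, just large enough for the LOVIN’ example above):

```python
# Hypothetical stand-ins for a spellchecker and a stemmer, just large enough
# to reproduce the LOVIN' example from the slide.
SPELLFIX = {"LOVIN'": "LOVING", "LOVIN": "LOVING"}
STEMS = {"loving": "love", "loved": "love"}

tokens = ["LOVIN'", "LOVIN", "LOVING", "Loving", "loving", "love", "loved"]

raw = set(tokens)
spellchecked = {SPELLFIX.get(t, t) for t in raw}
lowercased = {t.lower() for t in spellchecked}
stemmed = {STEMS.get(t, t) for t in lowercased}

stages = [
    ("raw", raw),
    ("spellcheck", spellchecked),
    ("normalize case", lowercased),
    ("stem / lemmatize", stemmed),
]
for name, vocab in stages:
    print(f"{name:<17} {len(vocab)} distinct features: {sorted(vocab)}")
```

Each step shrinks the feature space, which helps the model generalize, but it also erases distinctions such as shouted capitals or casual spelling that may themselves carry sentiment.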

Page 20: Twitter Sentiment Analysis - final - no personal

team Grant model specification

Page 21: Twitter Sentiment Analysis - final - no personal

[Bar chart, axis 0.45 to 0.75, one bar per model: linear SVM; naïve Bayes (SKL); naïve Bayes (NLTK); random forests (SKL); optimized non-linear SVM; naïve Bayes (Big Data); neural networks; decision tree (SKL); decision tree (NLTK); VADER classifier; TextBlob PatternAnalyzer; KNN; decision tree (NLTK, lexicon); decision tree (NLTK, emoticons)]

analysis


Page 24: Twitter Sentiment Analysis - final - no personal

how do they work?

linear SVM

naïve Bayes

random forest

Page 25: Twitter Sentiment Analysis - final - no personal

consensus

76%

76%

74%

DEMOCRACY! 77%
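
The "democracy" here reads as majority voting across classifiers. A minimal sketch of that idea, assuming three voters and two labels so that ties cannot occur; the toy predictions and the function names are hypothetical and are not the deck's data:

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by most classifiers for one tweet."""
    return Counter(predictions).most_common(1)[0][0]


def ensemble_accuracy(per_model_predictions, truth):
    """Vote tweet by tweet, then score the voted labels against the truth."""
    voted = [majority_vote(votes) for votes in zip(*per_model_predictions)]
    hits = sum(v == t for v, t in zip(voted, truth))
    return voted, hits / len(truth)


# Hypothetical toy data: three classifiers, five tweets.
truth   = ["+", "-", "+", "-", "+"]
model_a = ["+", "-", "-", "-", "+"]   # 4 of 5 correct
model_b = ["+", "+", "+", "-", "+"]   # 4 of 5 correct
model_c = ["-", "-", "+", "-", "-"]   # 3 of 5 correct

voted, accuracy = ensemble_accuracy([model_a, model_b, model_c], truth)
print(voted, accuracy)
```

In this toy case the vote beats every individual model because the models make their mistakes on different tweets; when classifiers tend to fail on the same tweets, voting gains much less.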

Page 26: Twitter Sentiment Analysis - final - no personal

4. a good model is agile

Page 27: Twitter Sentiment Analysis - final - no personal

1. genetically diverse

2. ensemble can handle more libraries / classifiers

3. modular design
   a) NLP
   b) feature detection
   c) models

4. sequential checks

5. quick enough to classify the firehose

6. easily incorporate new cases for re-training
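
One way to picture the modular design in item 3 is a pipeline whose three stages are plain callables that can be swapped independently. This is a hypothetical sketch, not the team's architecture: the class name `SentimentPipeline`, its fields, and the placeholder `keyword_model` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SentimentPipeline:
    """Three swappable stages: NLP clean-up, feature detection, model."""
    nlp_steps: List[Callable[[str], str]]
    featurize: Callable[[str], List[str]]
    model: Callable[[List[str]], str]

    def classify(self, tweet: str) -> str:
        # Run the clean-up steps in order, then featurize and classify.
        for step in self.nlp_steps:
            tweet = step(tweet)
        return self.model(self.featurize(tweet))


def keyword_model(features):
    # Hypothetical placeholder standing in for a trained classifier.
    return "+" if "happy" in features else "-"


pipeline = SentimentPipeline(
    nlp_steps=[str.lower, str.strip],
    featurize=str.split,
    model=keyword_model,
)
label = pipeline.classify("  So HAPPY its back2school ")
print(label)
```

Because each stage is only a function, a new clean-up step, feature detector, or re-trained model can replace the old one without touching the other two.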

Page 28: Twitter Sentiment Analysis - final - no personal

?