CS3730 Fall 2008 Subjectivity and Sentiment Analysis Lecture (Day 2): Introduction to linguistic subjectivity


Definitions and Annotation Scheme

• Manual annotation: human markup of corpora (bodies of text)

• Why?
  – Understand the problem
  – Create gold standards (and training data)

Wiebe, Wilson, Cardie, LRE 2005

Wilson & Wiebe, ACL-2005 workshop

Somasundaran, Wiebe, Hoffmann, Litman, ACL-2006 workshop

Somasundaran, Ruppenhofer, Wiebe, SIGdial 2007

Wilson, 2008 PhD dissertation


What is Subjectivity?

• The linguistic expression of somebody’s opinions, sentiments, emotions, evaluations, beliefs, speculations (private states)

Private state: a state that is not open to objective observation or verification.

Quirk, Greenbaum, Leech, Svartvik (1985). A Comprehensive Grammar of the English Language.


Overview

• Fine-grained: expression-level rather than sentence or document level

• Annotate
  – Subjective expressions
  – Material attributed to a source, but presented objectively


• Focus on three ways private states are expressed in language


Direct Subjective Expressions

• Direct mentions of private states

The United States fears a spill-over from the anti-terrorist campaign.

• Private states expressed in speech events

“We foresaw electoral fraud but not daylight robbery,” Tsvangirai said.


Expressive Subjective Elements [Banfield 1982]

• “We foresaw electoral fraud but not daylight robbery,” Tsvangirai said

• The part of the US human rights report about China is full of absurdities and fabrications


Objective Speech Events

• Material attributed to a source, but presented as objective fact

The government, it added, has amended the Pakistan Citizenship Act 10 of 1951 to enable women of Pakistani descent to claim Pakistani nationality for their children born to foreign husbands.

Nested Sources

“The report is full of absurdities,” Xirao-Nima said the next day.

(Writer, Xirao-Nima) (Writer, Xirao-Nima)

(Writer)

Objective speech event
  anchor: the entire sentence
  source: <writer>
  implicit: true

Direct subjective
  anchor: said
  source: <writer, Xirao-Nima>
  intensity: high
  expression intensity: neutral

Expressive subjective element
  anchor: full of absurdities
  source: <writer, Xirao-Nima>
  intensity: high
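The three frames for this sentence can be written down as plain data records. The following is a minimal Python sketch, not the format of any annotation tool: the class name `Frame`, its field names, and the helper `frames_by_source` are illustrative assumptions that simply mirror the attribute labels on the slide.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Frame:
    """One annotation frame; fields mirror the slide labels (hypothetical names)."""
    kind: str                      # e.g. "direct subjective"
    anchor: str                    # text span the frame is attached to
    source: Tuple[str, ...]        # nested source, outermost (writer) first
    intensity: Optional[str] = None
    expression_intensity: Optional[str] = None
    implicit: bool = False


SENTENCE = "“The report is full of absurdities,” Xirao-Nima said the next day."

frames = [
    Frame("objective speech event", SENTENCE, ("writer",), implicit=True),
    Frame("direct subjective", "said", ("writer", "Xirao-Nima"),
          intensity="high", expression_intensity="neutral"),
    Frame("expressive subjective element", "full of absurdities",
          ("writer", "Xirao-Nima"), intensity="high"),
]


def frames_by_source(all_frames, source):
    """Return the frames whose nested source equals `source` exactly."""
    return [f for f in all_frames if f.source == tuple(source)]


for f in frames:
    print(f.kind, "<" + ", ".join(f.source) + ">")
```

Storing the source as a tuple keeps the nesting order explicit, so frames attributed to Xirao-Nima by the writer can be selected with `frames_by_source(frames, ("writer", "Xirao-Nima"))`.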

“The US fears a spill-over,” said Xirao-Nima, a professor of foreign affairs at the Central University for Nationalities.

(writer, Xirao-Nima, US) (writer, Xirao-Nima) (writer)

Objective speech event
  anchor: the entire sentence
  source: <writer>
  implicit: true

Objective speech event
  anchor: said
  source: <writer, Xirao-Nima>

Direct subjective
  anchor: fears
  source: <writer, Xirao-Nima, US>
  intensity: medium
  expression intensity: medium
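A nested source such as <writer, Xirao-Nima, US> reads left to right as a chain of attribution: the writer reports Xirao-Nima, who in turn reports the private state of the US. The sketch below assumes sources are stored as tuples with the writer first; the function names are hypothetical and only illustrate how the nesting can be queried.

```python
def immediate_source(nested):
    """The innermost agent: whose private state or speech is described."""
    return nested[-1]


def nesting_depth(nested):
    """Number of attribution levels, counting the writer as level 1."""
    return len(nested)


def is_embedded_in(inner, outer):
    """True if `inner` strictly extends `outer`, i.e. is reported inside it."""
    inner, outer = tuple(inner), tuple(outer)
    return len(inner) > len(outer) and inner[:len(outer)] == outer


fears = ("writer", "Xirao-Nima", "US")   # direct subjective: "fears"
said = ("writer", "Xirao-Nima")          # objective speech event: "said"
sentence = ("writer",)                   # the sentence as a whole

print(immediate_source(fears), nesting_depth(fears))
print(is_embedded_in(fears, said), is_embedded_in(said, fears))
```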

The report has been strongly criticized and condemned by many countries.

Objective speech event
  anchor: the entire sentence
  source: <writer>
  implicit: true

Direct subjective
  anchor: strongly criticized and condemned
  source: <writer, many-countries>
  intensity: high
  expression intensity: high

As usual, the US State Department published its annual report on human rights practices in world countries last Monday.

And as usual, the portion about China contains little truth and many absurdities, exaggerations and fabrications.

Objective speech event
  anchor: the entire 1st sentence
  source: <writer>
  implicit: true

Direct subjective
  anchor: the entire 2nd sentence
  source: <writer>
  implicit: true
  intensity: high

Expressive subjective element
  anchor: little truth
  source: <writer>
  intensity: medium

…

Expressive subjective element
  anchor: many absurdities, exaggerations, and fabrications
  source: <writer>
  intensity: medium

Expressive subjective element
  anchor: And as usual
  source: <writer>
  intensity: low

Example

The Foreign Ministry said Thursday that it was “surprised, to put it mildly” by the U.S. State Department’s criticism of Russia’s human rights record and objected in particular to the “odious” section on Chechnya.

(writer, FM, FM) (writer, FM) (writer, FM)

(writer, FM, FM, SD)

(writer, FM) (writer, FM)

(General) Subjectivity Types [Wilson 2008]

Other (including cognitive)

Note: similar ideas: polarity, semantic orientation, sentiment


Extensions [Wilson 2008]

I think people are happy because Chavez has fallen.

direct subjective
  span: are happy
  source: <writer, I, People>
  attitude:

inferred attitude
  span: are happy because Chavez has fallen
  type: neg sentiment
  intensity: medium
  target:

target
  span: Chavez has fallen

target
  span: Chavez

attitude
  span: are happy
  type: pos sentiment
  intensity: medium
  target:

direct subjective
  span: think
  source: <writer, I>
  attitude:

attitude
  span: think
  type: positive arguing
  intensity: medium
  target:

target
  span: people are happy because Chavez has fallen
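In this extension each direct subjective frame links to one or more attitude frames, and each attitude links to its targets. The sketch below shows one way to hold those links in Python. All class and function names are hypothetical. The pairing of attitudes with targets is also an assumption, since the arrows on the slide did not survive: it takes the positive sentiment to be aimed at "Chavez has fallen" and the inferred negative sentiment at "Chavez".

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Target:
    span: str


@dataclass
class Attitude:
    span: str
    type: str
    intensity: str
    targets: List[Target] = field(default_factory=list)
    inferred: bool = False


@dataclass
class DirectSubjective:
    span: str
    source: Tuple[str, ...]
    attitudes: List[Attitude] = field(default_factory=list)


# Attitude-to-target pairing below is an assumed reading of the slide.
happy = DirectSubjective("are happy", ("writer", "I", "People"), [
    Attitude("are happy", "pos sentiment", "medium",
             [Target("Chavez has fallen")]),
    Attitude("are happy because Chavez has fallen", "neg sentiment", "medium",
             [Target("Chavez")], inferred=True),
])
think = DirectSubjective("think", ("writer", "I"), [
    Attitude("think", "positive arguing", "medium",
             [Target("people are happy because Chavez has fallen")]),
])


def attitude_triples(ds):
    """Flatten a frame into (immediate source, attitude type, target span)."""
    return [(ds.source[-1], a.type, t.span)
            for a in ds.attitudes for t in a.targets]


for frame in (happy, think):
    for triple in attitude_triples(frame):
        print(triple)
```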

As usual, the US State Department published its annual report on human rights practices in world countries last Monday.

GATE_objective-speech-event (2, 2) nested-source=w implicit=true []

GATE_agent (46, 108) id=report ['its', 'annual', 'report', 'on', 'human', 'right', 'practice', 'in', 'world', 'country']

And as usual, the portion about China contains little truth and many absurdities, exaggerations and fabrications.

GATE_expressive-subjectivity (128, 140) nested-source=w polarity=neutral intensity=low ['and', 'as', 'usual']

GATE_direct-subjective (128, 128) nested-source=w attitude-link=a100 intensity=high implicit=true []

GATE_target (142, 165) id=t100 ['the', 'portion', 'about', 'china']

GATE_agent (160, 165) id=china ['china']

GATE_attitude (166, 240) intensity=high id=a100 attitude-type=sentiment-neg target-link=t100 ['contain', 'little', 'truth', 'and', 'many', 'absurdity', 'exaggeration', 'and', 'fabrication']

GATE_expressive-subjectivity (175, 187) nested-source=w polarity=negative intensity=medium ['little', 'truth']

GATE_expressive-subjectivity (192, 240) nested-source=w polarity=negative intensity=high ['many', 'absurdity', 'exaggeration', 'and', 'fabrication']
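Each annotation line above follows the same visible pattern: a `GATE_` type, a character span in parentheses, `key=value` attributes, and a bracketed token list. The sketch below parses one such line with Python's standard `re` and `ast` modules. The pattern is inferred only from the lines shown on these slides, so the regular expression and the function name `parse_annotation` are assumptions, not a description of the real corpus file format.

```python
import ast
import re

# Pattern inferred from the slide text only (an assumption, not a file spec).
LINE_RE = re.compile(
    r"^GATE_(?P<type>[\w-]+)\s+\((?P<start>\d+),\s*(?P<end>\d+)\)\s*"
    r"(?P<attrs>.*?)\s*(?P<tokens>\[.*\])\s*$"
)


def parse_annotation(line):
    """Split one annotation line into type, span, attributes and tokens."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not an annotation line: {line!r}")
    # Every attribute seen on the slides has the form key=value.
    attrs = dict(pair.split("=", 1) for pair in match.group("attrs").split())
    return {
        "type": match.group("type"),
        "span": (int(match.group("start")), int(match.group("end"))),
        "attrs": attrs,
        # The token list is written as a Python-style list literal.
        "tokens": ast.literal_eval(match.group("tokens")),
    }


line = ("GATE_expressive-subjectivity (128, 140) nested-source=w "
        "polarity=neutral intensity=low ['and', 'as', 'usual']")
parsed = parse_annotation(line)
print(parsed["type"], parsed["span"], parsed["attrs"], parsed["tokens"])
```

A token list that contains an elision mark, as on one of the later slides, is not a valid list literal and would need separate handling.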

Its aim of the 2001 report is to tarnish China's image and exert political pressure on the Chinese Government, human rights experts said at a seminar held by the China Society for Study of Human Rights (CSSHR) on Friday.

GATE_objective-speech-event (248, 248) nested-source=w implicit=true []

GATE_direct-subjective (380, 384) nested-source=w,experts expression-intensity=neutral attitude-link=a110 intensity=medium ['say']

GATE_attitude (248, 357) intensity=medium-high id=a110 attitude-type=sentiment-neg target-link=t2 ['its', 'aim', 'of', 'the', 'report', … 'the', 'chinese', 'government']

GATE_target (259, 274) id=t2 ['the', 'report']

GATE_expressive-subjectivity (281, 288) nested-source=w,experts polarity=negative intensity=medium ['tarnish']

GATE_direct-subjective (252, 255) nested-source=w,experts,report polarity=neutral expression-intensity=medium attitude-link=a120,a130 intensity=medium ['aim']

GATE_attitude (252, 255) intensity=medium id=a120 attitude-type=intention-pos target-link=t3 ['aim']

GATE_target (278, 357) id=t3 ['to', 'tarnish', 'china', "'s", 'image', 'and', 'exert', 'political', 'pressure', 'on', 'the', 'chinese', 'government']

GATE_attitude (252, 288) intensity=medium id=a130 attitude-type=sentiment-neg target-link=t4 ['aim', 'of', 'the', 'report', 'be', 'to', 'tarnish']

GATE_target (289, 294) id=t4 ['china']

GATE_agent (359, 379) nested-source=w,experts id=experts ['human', 'right', 'expert']

GATE_agent (259, 274) nested-source=w,experts,report nested-target=w,experts,report ['the', 'report']

"The United States was slandering China again," said Xirao-Nima, a professor of Tibetan history at the Central University for Nationalities.

GATE_objective-speech-event (475, 475) nested-source=w implicit=true []

GATE_direct-subjective (523, 527) nested-source=w,nima expression-intensity=neutral attitude-link=a140 intensity=high ['say']

GATE_attitude (494, 508) intensity=high id=a140 attitude-type=sentiment-neg target-link=t5 ['be', 'slander']

GATE_target (476, 493) id=t5 ['the', 'unite', 'state']

GATE_expressive-subjectivity (498, 508) nested-source=w,nima polarity=negative intensity=high ['slander']

GATE_direct-subjective (494, 508) nested-source=w,nima,US polarity=negative expression-intensity=high attitude-link=a150 intensity=high ['be', 'slander']

GATE_attitude (494, 508) intensity=high id=a150 attitude-type=sentiment-neg target-link=t6 ['be', 'slander']

GATE_target (509, 514) id=t6 ['china']

GATE_agent (528, 538) nested-source=w,nima id=nima ['xirao-', 'nima']

GATE_agent (476, 493) nested-source=w,nima,US nested-target=w,nima,US id=US ['the', 'unite', 'state']

GATE_agent (509, 514) nested-target=w,nima,US,china ['china']

These are all the annotations

"It shows that these so-called truths are not true at all," said Xirao-Nima

GATE_objective-speech-event (3111, 3111) nested-source=w implicit=true []

GATE_direct-subjective (3170, 3174) attitude-type=negative intensity=high attitude-link=a350,a355 expression-intensity=neutral nested-source=w,nima attitude-toward=report ['say']

GATE_attitude (3111, 3167) intensity=medium-high id=a355 attitude-type=arguing-neg target-link=t101 ['it', 'show', 'that', 'these', 'so-call', 'truth', 'be', 'not', 'true', 'at', 'all']

GATE_attitude (3111, 3167) intensity=high id=a350 attitude-type=sentiment-neg target-link=t101 ['it', 'show', 'that', 'these', 'so-call', 'truth', 'be', 'not', 'true', 'at', 'all']

GATE_target (3125, 3147) id=t101 ['these', 'so-call', 'truth']

GATE_expressive-subjectivity (3131, 3147) nested-source=w,nima polarity=negative intensity=medium ['so-call', 'truth']

GATE_expressive-subjectivity (3152, 3167) nested-source=w,nima polarity=negative intensity=high ['not', 'true', 'at', 'all']

GATE_agent (3175, 3185) nested-source=w,nima ['xirao-', 'nima']
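The annotations for this sentence are tied together by identifiers: a direct subjective frame names its attitudes through `attitude-link`, and each attitude names its target through `target-link`. The sketch below follows those links for a hand-copied subset of the annotations. The dictionary layout and the function name `resolve_links` are assumptions made for illustration; only the attribute names and values come from the slide.

```python
# Hand-copied subset of the annotations, as simple dictionaries.
annotations = [
    {"type": "direct-subjective", "tokens": ["say"],
     "attrs": {"nested-source": "w,nima", "attitude-link": "a350,a355"}},
    {"type": "attitude", "tokens": [],
     "attrs": {"id": "a355", "attitude-type": "arguing-neg",
               "target-link": "t101"}},
    {"type": "attitude", "tokens": [],
     "attrs": {"id": "a350", "attitude-type": "sentiment-neg",
               "target-link": "t101"}},
    {"type": "target", "tokens": ["these", "so-call", "truth"],
     "attrs": {"id": "t101"}},
]


def resolve_links(items):
    """Return (immediate source, attitude type, target text) triples."""
    by_id = {a["attrs"]["id"]: a for a in items if "id" in a["attrs"]}
    triples = []
    for frame in items:
        if frame["type"] != "direct-subjective":
            continue
        source = frame["attrs"]["nested-source"].split(",")
        for att_id in frame["attrs"].get("attitude-link", "").split(","):
            attitude = by_id.get(att_id)
            if attitude is None:      # no link, or a dangling identifier
                continue
            for tgt_id in attitude["attrs"].get("target-link", "").split(","):
                target = by_id.get(tgt_id)
                if target is not None:
                    triples.append((source[-1],
                                    attitude["attrs"]["attitude-type"],
                                    " ".join(target["tokens"])))
    return triples


for triple in resolve_links(annotations):
    print(triple)
```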


Layering with Other Annotation Schemes

• E.g. Time, Lexical Semantics, Discourse…

• Richer interpretations via combination

• Potential disambiguation both ways

• Example with the Penn Discourse Treebank (PDTB)
  – Version 2 recently released through the Linguistic Data Consortium
  – Joshi, Webber, Prasad, Miltsakaki, …
  – http://www.seas.upenn.edu/~pdtb/


• Swapna will cover the following material later in the course in more detail. This is to give us an idea now.


The type “Cause” is used when the connective indicates that the situations described in Arg1 and Arg2 are causally influenced and the two are not in a conditional relation …


Polarity preserved across Result relation

Other firms "are dealing with the masses. I don't believe they have the culture" to adequately service high-net-worth individuals, he adds.


Polarity preserved across Result relation: PDTB

[Other firms "are dealing with the masses ARG1]. I don't believe IMPLICIT_SO [they have the culture" to adequately service high-net-worth individuals ARG2], he adds.

ARG2 is a result of ARG1


X said Y: “X said” sets up X’s belief space

“I don’t believe” explicit in second sentence

“Swartz said” implicit in first sentence

ARG spans: discourse relation within Swartz’s belief space


Polarity preserved across Result relation: subjectivity

Other firms “[are dealing with the masses SENTIMENT-NEG]. I [don't believe they have the culture" to adequately service high-net-worth individuals SENTIMENT-NEG], he adds.

Attitude span includes “don’t believe”; schemes require different notions of spans


Two negative properties, where the second is a result of the first

Discourse relation between ARGs inside his belief space


Semantics of result: specific subtype, where a negative state of affairs is the result of another one
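The observation that polarity is preserved across the Result relation can be checked mechanically once both layers are available for the same text. The sketch below assumes each argument carries a single attitude label such as `sentiment-neg`; the classes `Argument` and `DiscourseRelation` and both functions are hypothetical names, and the argument texts are simplified from the example sentence.

```python
from dataclasses import dataclass


@dataclass
class Argument:
    text: str
    attitude_type: str      # subjectivity label, e.g. "sentiment-neg"


@dataclass
class DiscourseRelation:
    sense: str              # discourse label, e.g. "Result"
    arg1: Argument
    arg2: Argument
    implicit: bool = False


def polarity(attitude_type):
    """Return the "pos" or "neg" suffix of a label, or None if it has none."""
    suffix = attitude_type.rsplit("-", 1)[-1]
    return suffix if suffix in ("pos", "neg") else None


def polarity_preserved(relation):
    """True if both arguments carry the same explicit polarity."""
    p1 = polarity(relation.arg1.attitude_type)
    p2 = polarity(relation.arg2.attitude_type)
    return p1 is not None and p1 == p2


relation = DiscourseRelation(
    sense="Result",
    arg1=Argument("are dealing with the masses", "sentiment-neg"),
    arg2=Argument("don't believe they have the culture", "sentiment-neg"),
    implicit=True,
)
print(relation.sense, polarity_preserved(relation))
```

Counting how often this check holds per relation sense would be one way to test whether the pattern on the slide is a general tendency or specific to this example.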


The class tag “COMPARISON” applies when the connective indicates that a discourse relation is established between Arg1 and Arg2 in order to highlight prominent differences between the two situations.

In that suit, the SEC accused Mr. Antar of engaging in a "massive financial fraud" to overstate the earnings of Crazy Eddie, Edison, N.J., over a three-year period. Through his lawyers, Mr. Antar has denied allegations in the SEC suit and in civil suits previously filed by shareholders against Mr. Antar and others.

PDTB

[In that suit, the SEC accused Mr. Antar of engaging in a "massive financial fraud" to overstate the earnings of Crazy Eddie, Edison, N.J., over a three-year period. ARG1] IMPLICIT_CONTRAST [Through his lawyers, Mr. Antar has denied allegations in the SEC suit and in civil suits previously filed by shareholders against Mr. Antar and others. ARG2]

Contrast between the SEC accusing Mr. Antar of something, and his denying the accusation


Subjectivity

In that suit, the SEC [[accused SENTIMENT-NEG] Mr. Antar of engaging in a "massive financial fraud" to overstate the earnings of Crazy Eddie, Edison, N.J. ARGUING-POS], over a three-year period.

Through his lawyers, Mr. Antar [has denied AGREE-NEG] allegations in the SEC suit and in civil suits previously filed by shareholders against Mr. Antar and others.

Two attitudes combined into one large disagreement between two parties


Subjectivity: arguing-pos and agree-neg with different sources; Hypothesis: common with contrast. Help recognize the implicit contrast.
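The hypothesis can be phrased as a simple cue: two adjacent attitudes, one arguing-pos and one agree-neg, held by different sources, suggest a contrast even when no connective is present. The sketch below encodes only that cue. The function name and the `(source, attitude_type)` pair representation are assumptions, and the rule is an untested heuristic rather than a validated classifier.

```python
def suggests_contrast(first, second):
    """Heuristic cue for an implicit contrast between two attitudes.

    Each argument is a (source, attitude_type) pair. The cue fires when
    the sources differ and the types are arguing-pos and agree-neg,
    in either order.
    """
    (source1, type1), (source2, type2) = first, second
    return source1 != source2 and {type1, type2} == {"arguing-pos", "agree-neg"}


sec = ("SEC", "arguing-pos")          # the SEC's accusation
antar = ("Mr. Antar", "agree-neg")    # Mr. Antar's denial

print(suggests_contrast(sec, antar))
```

Requiring different sources matters here: the same two labels held by a single source would not describe a disagreement between parties.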


Semantics of comparison: specific case of highlighting prominent differences in attitudes of different people