
Page 1: Opinions Breakout Session

Opinions Breakout Session

Eduard Hovy

Page 2: Opinions Breakout Session

Sources of data

• Jan Wiebe’s NRRC study, 2001

• Columbia

• ISI

• New annotations in next 2 months

• New TREC annotations

Page 3: Opinions Breakout Session

Annotations

Given a text and a topic question, identify text fragments that express:

• An opinion O with 5 values (= pro, anti, neutral, moot, unknown)

• A holder H: one of three types (= person, organization, group)

• A reason R (certain key phrases help)

(Everything must relate directly to the question, not to a derived issue or reason.)

• Index related fragments by number
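
As one way to read this spec, each indexed group of fragments could be modeled as a typed record. This data model is an illustrative reading of the bullets above, not part of the original specification; all names are chosen for clarity.

from dataclasses import dataclass
from enum import Enum
from typing import Optional

class OpinionValue(Enum):
    PRO = "pro"
    ANTI = "anti"
    NEUTRAL = "neutral"
    MOOT = "moot"
    UNKNOWN = "unknown"

class HolderType(Enum):
    PERSON = "person"
    ORGANIZATION = "organization"
    GROUP = "group"

@dataclass
class OpinionTriangle:
    """One O-H-R triangle; the index links its related fragments."""
    index: int
    opinion_text: str
    opinion_value: OpinionValue
    holder_text: str
    holder_type: HolderType
    reason_text: Optional[str] = None  # a reason fragment may be absent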

Page 4: Opinions Breakout Session

Example

Should the census count illegal aliens?

• Groups which have filed suit to ignore the aliens contend large concentrations of them could result in some states gaining seats in the House of Representatives at the expense of other states.

• Meanwhile, other groups want the final census totals to be increased to account for people who may be overlooked in the census—most often blacks and Hispanics living in urban areas.

• If the two sides trying to force changes in the 1990 census both get their way, the results would nearly balance one another, a population expert said Wednesday.

Page 5: Opinions Breakout Session

Example annotation

Should the census count illegal aliens?

• <OP ID="1" T="H">Groups which have filed suit</OP> <OP ID="1" T="O" V="ANTI">to ignore the aliens</OP> contend large concentrations of them could <OP ID="1" T="R">result in some states gaining seats in the House of Representatives at the expense of other states</OP>.

• Meanwhile, <OP ID="2" T="H">other groups</OP> <OP ID="2" T="O" V="PRO">want the final census totals to be increased</OP> <OP ID="2" T="R">to account for people who may be overlooked</OP> in the census—most often blacks and Hispanics living in urban areas.

• If the two sides trying to force changes in the 1990 census both get their way, <OP ID="3" T="O" V="MOOT">the results would nearly balance one another</OP>, <OP ID="3" T="H">a population expert</OP> said Wednesday.
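
To make the markup concrete, here is a minimal sketch of a parser that groups these OP spans into O-H-R triangles keyed by ID. It assumes the attribute order shown above (ID, then T, then optional V); the function name and dictionary output format are illustrative, not part of the annotation spec.

import re
from collections import defaultdict

# Matches one annotation span, e.g. <OP ID="1" T="O" V="ANTI">...</OP>.
# The V attribute appears only on opinion (T="O") spans.
OP_PATTERN = re.compile(
    r'<OP ID="(?P<id>\d+)" T="(?P<type>[HOR])"(?: V="(?P<value>\w+)")?>'
    r'(?P<text>.*?)</OP>'
)

def parse_triangles(annotated_text):
    """Group annotated spans into O-H-R triangles keyed by ID."""
    triangles = defaultdict(dict)
    for m in OP_PATTERN.finditer(annotated_text):
        entry = {"text": m.group("text")}
        if m.group("value"):
            entry["value"] = m.group("value")
        triangles[m.group("id")][m.group("type")] = entry
    return dict(triangles)

example = (
    '<OP ID="1" T="H">Groups which have filed suit</OP> '
    '<OP ID="1" T="O" V="ANTI">to ignore the aliens</OP> contend '
    'large concentrations of them could <OP ID="1" T="R">result in '
    'some states gaining seats</OP>.'
)
print(parse_triangles(example))
# {'1': {'H': {'text': 'Groups which have filed suit'},
#        'O': {'text': 'to ignore the aliens', 'value': 'ANTI'},
#        'R': {'text': 'result in some states gaining seats'}}}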

Page 6: Opinions Breakout Session

Strategies for determining annotations

• Determining opinions: like QA pinpointing, locate topic phrase in the sentence

• Determining opinion values: locate keywords with opinion valences, such as “want”, “anti”, “decry”, “welcome”, etc.

• Determining holders: locate Named Entity of appropriate semantic type close to topic, or as subject of opinion statement

• Determining reasons: locate phrase or sentence near (usually after) topic and introduced by markers such as “because”, “could/would/will result in”, “(in order) to”, etc.
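
A rough sketch of how these heuristics might look as code, assuming simple keyword and pattern matching. The word lists and markers below are illustrative stand-ins for real valence lexicons and named-entity tools, which the slide only names.

import re

# Illustrative valence keywords; a real system would use a full lexicon.
PRO_WORDS = {"want", "welcome", "support", "favor"}
ANTI_WORDS = {"anti", "decry", "oppose", "reject"}

# Markers that often introduce a reason phrase (usually after the topic);
# bare "to" is omitted here as too noisy for a keyword match.
REASON_MARKERS = re.compile(
    r"\b(because|in order to|(?:could|would|will) result in)\b", re.IGNORECASE
)

def guess_opinion_value(sentence):
    """Assign a coarse opinion value from keyword valences."""
    words = set(re.findall(r"\w+", sentence.lower()))
    if words & PRO_WORDS:
        return "PRO"
    if words & ANTI_WORDS:
        return "ANTI"
    return "UNKNOWN"

def find_reason(sentence):
    """Return the tail of the sentence starting at a reason marker, if any."""
    m = REASON_MARKERS.search(sentence)
    return sentence[m.start():] if m else None

# Holder detection (a named entity near the topic, or the subject of the
# opinion statement) is left to an NER tool and is not sketched here.
s = ("Groups which have filed suit to ignore the aliens contend large "
     "concentrations of them could result in some states gaining seats.")
print(guess_opinion_value(s))  # UNKNOWN (no valence keyword in this sentence)
print(find_reason(s))          # "could result in some states gaining seats."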

Page 7: Opinions Breakout Session

Potential evaluation measures

• The annotation produces a list of ‘O-H-R triangles’. Group all triangles with the same O value together into a set. This produces a list of opinion sets.

• Now measure:

– number of opinion sets, compared to those created by people

– correctness of each found set (set precision)

– number of missing sets (set recall)

– correctness of each triangle within each set (statement precision)

– number of triangles missing from each set (statement recall)
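
A minimal sketch of the set-level measures, assuming triangles in the dictionary form produced by the parser sketch above and using the opinion value as the grouping key; treating value agreement as the match criterion is a simplifying assumption, not the evaluation protocol itself.

from collections import defaultdict

def group_into_opinion_sets(triangles):
    """Group O-H-R triangles into opinion sets keyed by their O value."""
    sets = defaultdict(list)
    for tri in triangles.values():
        sets[tri.get("O", {}).get("value", "UNKNOWN")].append(tri)
    return dict(sets)

def set_scores(system_triangles, gold_triangles):
    """Set precision and recall over the opinion values found vs. gold."""
    sys_sets = group_into_opinion_sets(system_triangles)
    gold_sets = group_into_opinion_sets(gold_triangles)
    matched = sys_sets.keys() & gold_sets.keys()
    precision = len(matched) / len(sys_sets) if sys_sets else 0.0
    recall = len(matched) / len(gold_sets) if gold_sets else 0.0
    return precision, recall

Statement precision and recall within each matched set would then compare the individual triangles, for example by exact span match against the gold fragments.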

Page 8: Opinions Breakout Session

Plan for next 6 months

• We create specs for opinion annotations

• We annotate text by hand

• NIST annotates some texts in prep for TREC

• Pilot evaluation (machine & maybe human)

You’re welcome to join! Download 25 texts (5 topics) from http://www.isi.edu/natural-language/projects/Opinions/opinion-sources.tar