Learning Causality for News Events Prediction

Kira Radinsky, Sagie Davidovich, Shaul Markovitch
Technion - Israel Institute of Technology


What is Prediction?

“A description of what one thinks will take place in the future, based on previous knowledge.” [Online Dictionary]

“…a rigorous, often quantitative, statement, forecasting what will happen under specific conditions.” [Wikipedia]

Why is News Event Prediction Important?

Event | Predicted event (Pundit)
Al-Qaida demands hostage exchange | A country will refuse the demand
Volcano erupts in Democratic Republic of Congo | Thousands of people flee from Congo
7.0 magnitude earthquake strikes Haitian coast | Tsunami-warning is issued
China overtakes Germany as world's biggest exporter | Wheat price will fall

Application areas: Strategic intelligence, Strategic planning, Financial investments

Outline

• Motivation
• Problem definition
• Solution
• Representation
• Algorithm
• Evaluation

Problem Definition: Events Prediction

Ev is a set of events
T is a discrete representation of time

Prediction function [formula missing from the extracted slide text], s.t.: [event] occurred at time [time], [event] occurred at time [time]


Causality Mining Process: Overview

News articles acquisition
• Crawling [NYT 1851-2009]
• Modeling & normalization

Causality pattern classification
• <Pattern, Constraint, Confidence>

Event extraction
• Tagging
• Dependency parsing (Stanford parser)

Thematic roles assignment
• Based on VerbNet index

Thematic roles normalization
• Base forms
• URIs assignment (contextual disambiguation)

Causality relations extraction
• Context inference
• State inference

Causality graph building
• Built on 20 machines
• 300 million nodes
• 1 billion edges
• 13 million news articles in total
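
A minimal sketch of how a <Pattern, Constraint, Confidence> triple could drive causality extraction from a sentence. The connectives, the constraint, and the confidence values are illustrative assumptions; the slides do not list the actual patterns.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class CausalityPattern:
    """Hypothetical <Pattern, Constraint, Confidence> triple."""
    regex: re.Pattern                       # pattern with 'cause' and 'effect' groups
    constraint: Callable[[str, str], bool]  # extra check on the matched spans
    confidence: float                       # assumed reliability of the pattern


def at_least_two_words(cause: str, effect: str) -> bool:
    # Illustrative constraint: each side must contain at least two words.
    return len(cause.split()) >= 2 and len(effect.split()) >= 2


# Illustrative patterns only; the confidence values are made up.
PATTERNS: List[CausalityPattern] = [
    CausalityPattern(re.compile(r"^(?P<effect>.+?) because (?P<cause>.+)$", re.I),
                     at_least_two_words, 0.9),
    CausalityPattern(re.compile(r"^(?P<cause>.+?) leads to (?P<effect>.+)$", re.I),
                     at_least_two_words, 0.8),
    CausalityPattern(re.compile(r"^(?P<effect>.+?) after (?P<cause>.+)$", re.I),
                     at_least_two_words, 0.5),
]


def extract_causality(sentence: str) -> Optional[Tuple[str, str, float]]:
    """Return (cause, effect, confidence) from the most confident matching pattern."""
    text = sentence.strip().rstrip(".")
    best: Optional[Tuple[str, str, float]] = None
    for pattern in PATTERNS:
        match = pattern.regex.match(text)
        if match and pattern.constraint(match["cause"], match["effect"]):
            if best is None or pattern.confidence > best[2]:
                best = (match["cause"], match["effect"], pattern.confidence)
    return best
```

For instance, `extract_causality("Thousands flee Congo after volcano erupts.")` yields the cause "volcano erupts" and the effect "Thousands flee Congo" with the assumed confidence 0.5.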


Modeling an Event

Comparison between events (canonical)
1. (Lexicon & syntax) Language- and wording-independent
2. (Semantic) Non-ambiguous

Generalization / abstraction
Reasoning

Many philosophies
• Property Exemplification of Events theory (Kim 1993)
• Conceptual Dependency theory (Schank 1972)

Event & Causality Representation

• Event representation
• Causality representation

Event1: “US Army bombs a weapon warehouse in Kabul with missiles”
  Actor: US Army
  Action: bombs
  Theme: Weapon warehouse
  Instrument: Missiles
  Location: Kabul
  Time-frame: 1/2/1987 11:00AM +(2h)

Event1 caused Event2

Event2: “5 Afghan troops were killed”
  Action: kill
  Theme: Troops (Quantifier: 5, Attribute: Afghan)
  Time-frame: 1/2/1987 11:15AM +(3h)
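
One way to hold this representation in code is a small role-based record plus a directed "caused" edge. This is a sketch only: the class and field names are assumptions that mirror the role labels on the slide, and the date 1/2/1987 is read here as 1 February 1987, which the slide does not confirm.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Entity:
    """An object filling a role, with an optional quantifier and attributes."""
    name: str
    quantifier: Optional[int] = None
    attributes: List[str] = field(default_factory=list)


@dataclass
class Event:
    """Role-based event record using the role labels from the slide."""
    action: str
    actor: Optional[Entity] = None
    theme: Optional[Entity] = None
    instrument: Optional[Entity] = None
    location: Optional[Entity] = None
    start: Optional[datetime] = None        # beginning of the time-frame
    duration: Optional[timedelta] = None    # the "+(2h)" part of the time-frame


@dataclass
class Causality:
    """Directed 'caused' edge between two events."""
    cause: Event
    effect: Event


# Assumption: 1/2/1987 is interpreted as day/month/year.
event1 = Event(action="bomb", actor=Entity("US Army"), theme=Entity("Weapon warehouse"),
               instrument=Entity("Missiles"), location=Entity("Kabul"),
               start=datetime(1987, 2, 1, 11, 0), duration=timedelta(hours=2))
event2 = Event(action="kill", theme=Entity("Troops", quantifier=5, attributes=["Afghan"]),
               start=datetime(1987, 2, 1, 11, 15), duration=timedelta(hours=3))
edge = Causality(cause=event1, effect=event2)
```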

Outline

• Motivation
• Problem definition
• Solution
• Representation
• Algorithm
• Causality Mining Process
• Evaluation

Machine Learning Problem Definition

Goal function: [formula missing from the extracted slide text]

The learning algorithm receives a set of examples and produces a hypothesis which is a good approximation of the goal function.

Algorithm Outline

Learning Phase
1. Generalize events
2. Causality prediction rule generation

Prediction Phase
3. Finding similar generalized event
4. Application of causality prediction rule

Algorithm Outline

Learning Phase
1. Generalize events
   1. How do we generalize objects?
   2. How do we generalize actions?
   3. How do we generalize an event?
2. Causality prediction rule generation

Generalizing Objects

Example ontology entry: Russian Federation
• Name official English: the Russian Federation
• ISO3 Code: RUS
• UN Code: 643
• FAOSTAT code: 185
• DBPedia ID: Russia
• Currency Name: Rouble (Rub)
• Is in group: Eastern Europe
• Land border: China
• Is successor of / Is predecessor of: USSR


Ontology – Linked data
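
A small sketch of object generalization over linked data: two entities are generalized to the set of ontology relations they share. The store layout and the function name are assumptions. The Russian Federation facts follow the previous slide, and the Ukraine entry is an illustrative addition so that there is something to generalize with.

```python
from typing import Dict, Set, Tuple

# Toy linked-data store: entity -> set of (relation, value) pairs.
ONTOLOGY: Dict[str, Set[Tuple[str, str]]] = {
    "Russian Federation": {
        ("is_in_group", "Eastern Europe"),
        ("land_border", "China"),
        ("is_successor_of", "USSR"),
        ("currency_name", "Rouble (Rub)"),
    },
    # Illustrative entry, not taken from the slides.
    "Ukraine": {
        ("is_in_group", "Eastern Europe"),
        ("land_border", "Poland"),
    },
}


def generalize_objects(a: str, b: str) -> Set[Tuple[str, str]]:
    """Return the (relation, value) pairs that both entities share."""
    return ONTOLOGY.get(a, set()) & ONTOLOGY.get(b, set())


shared = generalize_objects("Russian Federation", "Ukraine")
# shared == {("is_in_group", "Eastern Europe")}
```

Under this toy store, the two countries generalize to "a member of the group Eastern Europe"; an entity unknown to the store generalizes to nothing.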

Generalizing Actions

Levin classes (Levin 1993) – 270 classes

Class Hit-18.1
Roles and Restrictions: Agent[+int_control] Patient[+concrete] Instrument[+concrete]
Members: bang, bash, hit, kick, ...
Frames:
  Name: Basic Transitive
  Example: Paula hit the ball
  Syntax: Agent V Patient
  Semantics: cause(Agent, E) manner(during(E), directedmotion, Agent) !contact(during(E), Agent, Patient) manner(end(E), forceful, Agent) contact(end(E), Agent, Patient)
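
A sketch of action generalization as a verb-class lookup: two verbs generalize to their shared class. Only the members shown on the slide are included, and the table and function names are assumptions; a fuller system would load the complete class inventory.

```python
from typing import Dict, Optional, Set

# Only the members shown on the slide for class Hit-18.1 (the slide's list is truncated).
VERB_CLASSES: Dict[str, Set[str]] = {
    "Hit-18.1": {"bang", "bash", "hit", "kick"},
}


def verb_class(verb: str) -> Optional[str]:
    """Return the name of the class containing the verb, or None if unknown."""
    for name, members in VERB_CLASSES.items():
        if verb.lower() in members:
            return name
    return None


def generalize_action(verb_a: str, verb_b: str) -> Optional[str]:
    """Two verbs generalize to their common class when they share one."""
    class_a, class_b = verb_class(verb_a), verb_class(verb_b)
    return class_a if class_a is not None and class_a == class_b else None
```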

Generalizing Events: Putting it all together

Present event: “NATO strikes an army base in Baghdad”
Past event: “US Army bombs a weapon warehouse in Kabul with missiles”

[Figure: the two events drawn as role graphs (Actor, Action, Theme, Instrument, Location, Time-frame) and linked through the ontology. The actions are connected as "Similar verb"; the locations and themes are connected via rdf:type to City and Military facility.]

Generalization rule
Actor: [state of NATO]
Property: [Hit1.1]
Theme: [Military facility]
Location: [Arab City]
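
A sketch of role-by-role event generalization: equal values are kept, values that share a parent concept are replaced by that concept, and everything else becomes a wildcard. The parent table is a toy assumption built from the figure labels; it does not derive the slide's more specific "[Arab City]" or actor generalization.

```python
from typing import Dict, Optional

# Toy rdf:type / verb-class lookup; entries are assumptions based on the figure labels.
PARENT: Dict[str, str] = {
    "Kabul": "City",
    "Baghdad": "City",
    "Weapon warehouse": "Military facility",
    "Army base": "Military facility",
    "bombs": "Hit1.1",
    "strikes": "Hit1.1",
}

Event = Dict[str, str]  # role name -> value


def generalize_value(a: str, b: str) -> Optional[str]:
    """Keep equal values, lift to a shared parent, otherwise return None (wildcard)."""
    if a == b:
        return a
    parent_a, parent_b = PARENT.get(a), PARENT.get(b)
    if parent_a is not None and parent_a == parent_b:
        return f"[{parent_a}]"
    return None


def generalize_events(e1: Event, e2: Event) -> Dict[str, Optional[str]]:
    """Generalize the roles that both events define."""
    return {role: generalize_value(e1[role], e2[role])
            for role in sorted(e1.keys() & e2.keys())}


present = {"Actor": "NATO", "Action": "strikes", "Theme": "Army base", "Location": "Baghdad"}
past = {"Actor": "US Army", "Action": "bombs", "Theme": "Weapon warehouse",
        "Location": "Kabul", "Instrument": "Missiles"}
generalized = generalize_events(present, past)
```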


Generalizing Events: HAC algorithm
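
A compact sketch of hierarchical agglomerative clustering (HAC) over arbitrary items with a caller-supplied distance. Average linkage is an assumption here; the slide does not state which linkage is used. The function expects at least one item.

```python
from itertools import combinations
from typing import Callable, List, Sequence, Tuple, Union

# A tree is either an item index (leaf) or a pair of subtrees (merge).
Tree = Union[int, Tuple["Tree", "Tree"]]


def leaves(tree: Tree) -> List[int]:
    """Indices of the items under a tree node."""
    if isinstance(tree, int):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])


def hac(items: Sequence, dist: Callable[[object, object], float]) -> Tree:
    """Average-linkage agglomerative clustering; returns a binary merge tree."""
    clusters: List[Tree] = list(range(len(items)))

    def linkage(a: Tree, b: Tree) -> float:
        # Mean pairwise distance between the members of the two clusters.
        pairs = [(i, j) for i in leaves(a) for j in leaves(b)]
        return sum(dist(items[i], items[j]) for i, j in pairs) / len(pairs)

    while len(clusters) > 1:
        # Find the closest pair of current clusters and merge it.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        merged = (clusters[i], clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0]


tree = hac([0.0, 0.1, 5.0, 5.2], lambda x, y: abs(x - y))
# tree == ((0, 1), (2, 3)): the two nearby pairs merge first, then join at the root.
```

In the setting of the talk, the items would be events and `dist` an event distance metric; each internal node of the tree then corresponds to a generalized event.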

Generalizing Events: Event distance metric

Present event: “NATO strikes an army base in Baghdad”
Past event: “US Army bombs a weapon warehouse in Kabul with missiles”

[Figure: the same two role graphs (Actor, Action, Theme, Instrument, Location, Time-frame), linked through the ontology by "Similar verb" and by rdf:type to City and Military facility.]
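
The slide text gives no formula for the distance, so the following is only one plausible shape for it: a per-role distance averaged over all roles. The three distance levels (0, 0.5, 1) and the type table are assumptions made for illustration.

```python
from typing import Dict

# Toy type lookup; entries are assumptions based on the figure labels.
TYPE_OF: Dict[str, str] = {
    "Kabul": "City",
    "Baghdad": "City",
    "Weapon warehouse": "Military facility",
    "Army base": "Military facility",
    "bombs": "Hit1.1",
    "strikes": "Hit1.1",
}

Event = Dict[str, str]  # role name -> value


def role_distance(a: str, b: str) -> float:
    """0 for identical values, 0.5 for values of the same type, 1 otherwise."""
    if a == b:
        return 0.0
    type_a, type_b = TYPE_OF.get(a), TYPE_OF.get(b)
    if type_a is not None and type_a == type_b:
        return 0.5
    return 1.0


def event_distance(e1: Event, e2: Event) -> float:
    """Average role distance; a role missing from one event counts as distance 1."""
    roles = sorted(e1.keys() | e2.keys())
    if not roles:
        return 0.0
    total = sum(role_distance(e1[r], e2[r]) if r in e1 and r in e2 else 1.0
                for r in roles)
    return total / len(roles)


present = {"Actor": "NATO", "Action": "strikes", "Theme": "Army base", "Location": "Baghdad"}
past = {"Actor": "US Army", "Action": "bombs", "Theme": "Weapon warehouse",
        "Location": "Kabul", "Instrument": "Missiles"}
d = event_distance(present, past)  # (0.5 + 1 + 1 + 0.5 + 0.5) / 5 = 0.7
```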

Algorithm Outline

Learning Phase
1. Generalize events
2. Causality prediction rule generation

Prediction Rule Generation

Cause event: “US Army bombs a weapon warehouse in Kabul with missiles”
  Actor: US Army
  Action: bombs
  Theme: Weapon warehouse
  Instrument: Missiles
  Location: Kabul
  Time-frame: 1/2/1987 11:00AM +(2h)

Cause event caused Effect event

Effect event: “5 Afghan troops were killed”
  Action: kill
  Theme: Troops (Quantifier: 5, Attribute: Afghan)
  Time-frame: 1/2/1987 11:15AM +(3h)

Ontology links: Kabul (Location) → Country: Afghanistan → Nationality: Afghan

Generated rule:
EffectThemeAttribute = CauseLocationCountryNationality
EffectAction = kill
EffectTheme = Troops
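
A sketch of how such a rule could be derived from one cause/effect pair: for every effect role, search the ontology for a property path that starts at some cause role value and ends at the effect value; if none exists, keep the effect value as a constant. The breadth-first search, the depth limit, and the toy ontology are assumptions.

```python
from collections import deque
from typing import Dict, List, Optional

# Toy ontology: entity -> {property: entity}; entries mirror the labels on the slide.
ONTOLOGY: Dict[str, Dict[str, str]] = {
    "Kabul": {"Country": "Afghanistan"},
    "Afghanistan": {"Nationality": "Afghan"},
}


def find_path(start: str, target: str, max_depth: int = 3) -> Optional[List[str]]:
    """Breadth-first search for a property path leading from start to target."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        if len(path) == max_depth:
            continue
        for prop, nxt in ONTOLOGY.get(node, {}).items():
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [prop]))
    return None


def generate_rule(cause: Dict[str, str], effect: Dict[str, str]) -> Dict[str, str]:
    """Express each effect role as a path from a cause role, or as a constant."""
    rule: Dict[str, str] = {}
    for effect_role, effect_value in effect.items():
        rule["Effect" + effect_role] = effect_value  # default: constant
        for cause_role, cause_value in cause.items():
            path = find_path(cause_value, effect_value)
            if path is not None:
                rule["Effect" + effect_role] = "Cause" + cause_role + "".join(path)
                break
    return rule


cause = {"Actor": "US Army", "Action": "bombs", "Theme": "Weapon warehouse",
         "Instrument": "Missiles", "Location": "Kabul"}
effect = {"Action": "kill", "Theme": "Troops", "ThemeAttribute": "Afghan"}
rule = generate_rule(cause, effect)
```

With this toy ontology, `rule` reproduces the three lines of the rule shown on the slide.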


Algorithm Outline

Prediction Phase

1. Finding similar generalized event

2. Application of causality prediction rule

Finding Similar Generalized Event

[Figure: query “Baghdad bombing”; values shown in the figure: 0.2, 0.3, 0.7, 0.8, 0.75, 0.2, 0.65, 0.1]
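
The figure itself is not recoverable, so the following is an assumed reading of this step: generalized events form a tree (for example the HAC merge tree), and the input event is matched by a greedy descent that follows the most similar child for as long as similarity does not drop. The node labels and scores are invented for illustration and are not the values from the figure.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """A generalized event in the cluster tree."""
    label: str
    children: List["Node"] = field(default_factory=list)


def find_most_similar(root: Node, similarity: Callable[[Node], float]) -> Node:
    """Greedy descent: move to the best child while it is at least as similar."""
    current = root
    while current.children:
        best = max(current.children, key=similarity)
        if similarity(best) < similarity(current):
            break
        current = best
    return current


# Illustrative tree and similarity scores for one hypothetical input event.
tree = Node("event", [
    Node("military action", [Node("bombing"), Node("ground assault")]),
    Node("natural disaster"),
])
scores = {"event": 0.2, "military action": 0.7, "natural disaster": 0.1,
          "bombing": 0.8, "ground assault": 0.3}
match = find_most_similar(tree, lambda node: scores[node.label])
```

Here `match.label` is "bombing"; in a real system the similarity would compare the input event with each node's generalized event rather than read a fixed table.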

Prediction Rule Application

Input event
  Actor: Actor1
  Action: bomb
  Theme: Theme1
  Instrument: Instrument1
  Location: Location1
  Time-frame: T1

Input event caused Predicted effect event

Predicted effect event
  Action: kill
  Theme: Troops (Attribute: Nationality of the Country of Location1)
  Time-frame: T1 + ∆

Rule applied:
EffectThemeAttribute = CauseLocationCountryNationality
EffectAction = kill
EffectTheme = Troops
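
A sketch of applying the rule to a new input event. The rule is stored in a structured form (constant, or cause role plus property path) rather than as the concatenated string on the slide; that encoding, the function names, and the Baghdad entries in the toy ontology are assumptions.

```python
from typing import Dict, List, Optional, Tuple, Union

# Toy ontology: entity -> {property: entity}; illustrative entries only.
ONTOLOGY: Dict[str, Dict[str, str]] = {
    "Baghdad": {"Country": "Iraq"},
    "Iraq": {"Nationality": "Iraqi"},
}

# An effect role is either a constant or (cause role, property path).
RoleSpec = Union[str, Tuple[str, List[str]]]

RULE: Dict[str, RoleSpec] = {
    "Action": "kill",                                            # EffectAction = kill
    "Theme": "Troops",                                           # EffectTheme = Troops
    "ThemeAttribute": ("Location", ["Country", "Nationality"]),  # CauseLocationCountryNationality
}


def follow(entity: Optional[str], path: List[str]) -> Optional[str]:
    """Walk the property path through the ontology; None if any step is missing."""
    for prop in path:
        if entity is None:
            return None
        entity = ONTOLOGY.get(entity, {}).get(prop)
    return entity


def apply_rule(rule: Dict[str, RoleSpec], cause: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Instantiate the effect event for a given cause event."""
    effect: Dict[str, Optional[str]] = {}
    for role, spec in rule.items():
        if isinstance(spec, str):
            effect[role] = spec
        else:
            cause_role, path = spec
            effect[role] = follow(cause.get(cause_role), path)
    return effect


predicted = apply_rule(RULE, {"Action": "bomb", "Location": "Baghdad"})
# predicted == {"Action": "kill", "Theme": "Troops", "ThemeAttribute": "Iraqi"}
```

The time-frame of the predicted event (T1 + ∆ on the slide) is left out of this sketch.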


Prediction Evaluation

Human Group 1:
• Mark events E that can cause other events.

Human Group 2:
• Given: random sample of events from E, predictions, and time of events
• Search the web and give an estimate of the prediction accuracy

Prediction Accuracy Results

          | Highly certain | Certain
Algorithm | 0.58           | 0.49
Humans    | 0.4            | 0.38

Causality Evaluation

Human Group 1:

• Mark events E to be used as test input for the other two control groups and the algorithm.

Human Group 2:

• Given: Random sample of events from E.

• State what you think would happen following this event.

Human Group 3:

• Given: algorithm predictions + human (2nd group) predictions

• Evaluate the quality of the predictions

Causality Results

          | [0,1) | [1,2) | [2,3) | [3,4] | Avg. Rank | Avg. Accuracy
Algorithm | 0     | 2     | 19    | 29    | 3.08      | 77%
Humans    | 0     | 3     | 24    | 23    | 2.86      | 72%

• The results are statistically significant

Event | Predicted event (Human) | Predicted event (Pundit)
Al-Qaida demands hostage exchange | Al-Qaida exchanges hostage | A country will refuse the demand
Volcano erupts in Democratic Republic of Congo | Scientists in Republic of Congo investigate lava beds | Thousands of people flee from Congo
7.0 magnitude earthquake strikes Haitian coast | Tsunami in Haiti affects coast | Tsunami-warning is issued
2 Palestinians reportedly shot dead by Israeli troops | Israeli citizens protest against Palestinian leaders | War will be waged
Professor of Tehran University killed in bombing | Tehran students remember slain professor in memorial service | Professor funeral will be held
Alleged drug kingpin arrested in Mexico | Mafia kills people with guns in town | Kingpin will be sent to prison
UK bans Islamist group | Islamist group would adopt another name in the UK | Group will grow
China overtakes Germany as world's biggest exporter | German officials suspend tariffs | Wheat price will fall

Extraction Evaluation

Accuracy of Extraction
Action | Actor | Object | Instrument | Location | Time
93%    | 74%   | 76%    | 79%        | 79%      | 100%

Entity Ontology Matching
Actor | Object | Instrument | Location
84%   | 83%    | 79%        | 89%

Related work

Causality Information Extraction
Goal: Extract causality relations from a text
Techniques:
1. Usage of handcrafted domain-specific patterns [Kaplan and Berry-Rogghe, 1991]
2. Usage of handcrafted linguistic patterns [Garcia 1997], [Khoo, Chan, & Niu 2000], [Girju & Moldovan 2002]
3. Semi-supervised pattern learning approaches, based on text features [Blanco, Castell, & Moldovan 2008], [Sil & Huang & Yates 2010]
4. Supervised pattern learning approaches based on text features [Riloff 1996], [Riloff & Jones 1999], [Agichtein & Gravano, 2000; Lin & Pantel, 2001]

Related work

Temporal Information Extraction
Goal: Predict the temporal order of events or time expressions described in text
Technique: Learn classifiers that predict the temporal order of a pair of events based on predefined features of the pair.

[Ling & Weld, 2010; Mani, Schiffman, & Zhang, 2003; Lapata & Lascarides, 2006; Chambers, Wang, & Jurafsky, 2007; Tatu & Srikanth, 2008; Yoshikawa, Riedel, Asahara, & Matsumoto, 2009]

Future work

• Going beyond human tagged examples

• Incorporating time into the equation

• When will correlation mean causality?

• Using other sources than news

• Incorporating real time data (Twitter, Facebook)

• Incorporating numerical data (Stocks, Weather, Forex)

• Can we predict general facts?

• Can a machine predict better than an expert?

Summary

• Canonical event representation
• Machine learning algorithm for events prediction
• Leveraging world knowledge for generalization
• Using text as human tagged examples
• Causality mining from text
• Contribution to machine common-sense understanding

“The best way to predict the future is to invent it” [Alan Kay]