Learning Causality for News Events Prediction
Kira Radinsky, Sagie Davidovich, Shaul Markovitch
Technion - Israel Institute of Technology
What is Prediction?
“A description of what one thinks will take place in the future, based on previous knowledge.” [Online Dictionary]
“…a rigorous, often quantitative, statement, forecasting what will happen under specific conditions.” [Wikipedia]
Why is News Event Prediction Important?
Event | Predicted event (Pundit) | Application
Al-Qaida demands hostage exchange | A country will refuse the demand | Strategic intelligence
Volcano erupts in Democratic Republic of Congo | Thousands of people flee from Congo | Strategic planning
7.0 magnitude earthquake strikes Haitian coast | Tsunami-warning is issued | Strategic planning
China overtakes Germany as world's biggest exporter | Wheat price will fall | Financial investments
Outline
• Motivation
• Problem definition
• Solution
• Representation
• Algorithm
• Evaluation
Problem Definition: Events Prediction
Prediction Function
…, s.t.: … occurred at time …, … occurred at time …
Ev is a set of events
T is a discrete representation of time
Causality Mining Process: Overview
News articles acquisition
• Crawling [NYT 1851-2009]
• Modeling & normalization
Causality pattern classification
• <Pattern, Constraint, Confidence>
Event extraction
• Tagging
• Dependency parsing (Stanford parser)
Thematic roles assignment
• Based on VerbNet index
Thematic roles normalization
• Base forms
• URIs assignment (contextual disambiguation)
Causality relations extraction
• Context inference
• State inference
Causality graph building
• Built on 20 machines
• 300 million nodes
• 1 billion edges
• 13 million news articles in total
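The <Pattern, Constraint, Confidence> triple from the pattern classification step can be sketched as below. This is a minimal illustration only: the regular expressions, the word-count constraint, and the confidence values are assumptions made for the example, not the patterns used in the presented system.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class CausalityPattern:
    """One <Pattern, Constraint, Confidence> triple (all values illustrative)."""
    regex: "re.Pattern[str]"                # Pattern: named groups 'cause' and 'effect'
    constraint: Callable[[str, str], bool]  # Constraint: filter on the matched spans
    confidence: float                       # Confidence: made-up number for the sketch


def at_least_two_words(cause: str, effect: str) -> bool:
    # Hypothetical constraint: reject matches where either side is a single word.
    return len(cause.split()) >= 2 and len(effect.split()) >= 2


# Patterns are tried in order; the first one that matches and passes its constraint wins.
PATTERNS = [
    CausalityPattern(
        re.compile(r"^(?P<effect>.+?) because of (?P<cause>.+)$", re.I),
        at_least_two_words, 0.9),
    CausalityPattern(
        re.compile(r"^(?P<cause>.+?) (?:causes|caused|leads to|led to) (?P<effect>.+)$", re.I),
        at_least_two_words, 0.8),
    CausalityPattern(
        re.compile(r"^(?P<effect>.+?) after (?P<cause>.+)$", re.I),
        at_least_two_words, 0.5),
]


def extract_causality(sentence: str) -> Optional[Tuple[str, str, float]]:
    """Return (cause text, effect text, confidence), or None if no pattern applies."""
    text = sentence.strip().rstrip(".")
    for pattern in PATTERNS:
        match = pattern.regex.match(text)
        if match and pattern.constraint(match.group("cause"), match.group("effect")):
            return match.group("cause"), match.group("effect"), pattern.confidence
    return None


print(extract_causality("The earthquake caused a tsunami warning."))
print(extract_causality("Thousands flee Congo after volcano erupts."))
```

In a full pipeline the matched cause and effect strings would then go through event extraction (tagging, dependency parsing, role assignment) rather than being used as raw text.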
Modeling an Event
Comparison between events (canonical)
1. (Lexicon & syntax) Language- and wording-independent
2. (Semantic) Unambiguous
Generalization / abstraction
Reasoning
Many philosophies
• Property Exemplification of Events theory (Kim 1993)
• Conceptual Dependency theory (Schank 1972)
Event & Causality Representation
• Event Representation
• Causality Representation
Event1: “US Army bombs a weapon warehouse in Kabul with missiles”
• Actor: US Army
• Action: bombs
• Theme: Weapon warehouse
• Instrument: Missiles
• Location: Kabul
• Time-frame: 1/2/1987 11:00AM +(2h)
Event2 (caused by Event1): “5 Afghan troops were killed”
• Action: kill
• Theme: Troops (Quantifier: 5, Attribute: Afghan)
• Time-frame: 1/2/1987 11:15AM +(3h)
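The two example events and the causal link between them could be held in a structure like the one below. This is a sketch under assumptions: the class and field names are hypothetical, and the date 1/2/1987 is read as 2 January 1987 because the slide does not state the date order.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Dict, List, Optional


@dataclass
class Event:
    """Canonical event: an action plus thematic roles and a time-frame."""
    action: str
    start: datetime
    duration: timedelta
    actor: Optional[str] = None
    theme: Optional[str] = None
    instrument: Optional[str] = None
    location: Optional[str] = None
    theme_modifiers: Dict[str, str] = field(default_factory=dict)  # e.g. Quantifier, Attribute
    causes: List["Event"] = field(default_factory=list)            # outgoing "caused" edges


def add_causality(cause: Event, effect: Event) -> None:
    """Record that `cause` caused `effect`; the effect may not start before the cause."""
    if effect.start < cause.start:
        raise ValueError("effect starts before its cause")
    cause.causes.append(effect)


# "US Army bombs a weapon warehouse in Kabul with missiles"
event1 = Event(
    action="bomb", actor="US Army", theme="Weapon warehouse",
    instrument="Missiles", location="Kabul",
    start=datetime(1987, 1, 2, 11, 0),  # assumed date order, see lead-in
    duration=timedelta(hours=2),
)

# "5 Afghan troops were killed"
event2 = Event(
    action="kill", theme="Troops",
    theme_modifiers={"Quantifier": "5", "Attribute": "Afghan"},
    start=datetime(1987, 1, 2, 11, 15),
    duration=timedelta(hours=3),
)

add_causality(event1, event2)
```

Keeping the roles as separate fields, rather than the raw sentence, is what makes events comparable independently of wording.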
Machine Learning Problem Definition
The learning algorithm receives a set of examples of the goal function and produces a hypothesis which is a good approximation of the goal function.
Algorithm Outline
Learning Phase
1. Generalize events
2. Causality prediction rule generation
Prediction Phase
3. Finding similar generalized event
4. Application of causality prediction rule
Algorithm Outline
Learning Phase
1. Generalize events
• How do we generalize objects?
• How do we generalize actions?
• How do we generalize an event?
2. Causality prediction rule generation
Generalizing Objects
Ontology – Linked data
Russian Federation
• Name official English: the Russian Federation
• ISO3 code: RUS
• UN code: 643
• FAOSTAT code: 185
• DBPedia ID: Russia
• Currency name: Rouble (Rub)
• Is in group: Eastern Europe
• Land border: China
• Is successor of: USSR (USSR is predecessor of Russian Federation)
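One way to picture object generalization over linked data is to collect what is reachable from an entity along ontology relations and intersect those sets for two entities. The sketch below assumes a toy in-memory graph; the relation identifiers (`isInGroup`, `landBorder`, `isSuccessorOf`) are hypothetical names for the slide labels, and the function names are illustrative.

```python
from typing import Dict, List, Set, Tuple

# Toy linked-data graph: entity -> relation -> list of targets.
ONTOLOGY: Dict[str, Dict[str, List[str]]] = {
    "Russian Federation": {
        "isInGroup": ["Eastern Europe"],
        "landBorder": ["China"],
        "isSuccessorOf": ["USSR"],
    },
    "Kabul": {"rdf:type": ["City"]},
    "Baghdad": {"rdf:type": ["City"]},
    "Army base": {"rdf:type": ["Military facility"]},
    "Weapon warehouse": {"rdf:type": ["Military facility"]},
}

Path = Tuple[str, ...]


def generalizations(entity: str, max_depth: int = 2) -> Set[Tuple[Path, str]]:
    """All (relation path, target) pairs reachable from `entity` within `max_depth` hops."""
    found: Set[Tuple[Path, str]] = set()
    frontier: List[Tuple[Path, str]] = [((), entity)]
    for _ in range(max_depth):
        next_frontier: List[Tuple[Path, str]] = []
        for path, node in frontier:
            for relation, targets in ONTOLOGY.get(node, {}).items():
                for target in targets:
                    item = (path + (relation,), target)
                    if item not in found:
                        found.add(item)
                        next_frontier.append(item)
        frontier = next_frontier
    return found


def common_generalizations(a: str, b: str, max_depth: int = 2) -> Set[Tuple[Path, str]]:
    """Generalizations shared by two entities: same relation path, same target."""
    return generalizations(a, max_depth) & generalizations(b, max_depth)


print(common_generalizations("Kabul", "Baghdad"))
print(generalizations("Russian Federation"))
```

Two objects that share a (path, target) pair, such as Kabul and Baghdad both having `rdf:type` City, can be replaced by that shared concept when events are generalized.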
Generalizing Actions
Levin classes (Levin 1993) – 270 classes
Class Hit-18.1
Roles and Restrictions: Agent[+int_control] Patient[+concrete] Instrument[+concrete]
Members: bang, bash, hit, kick, ...
Frames:
Name | Example | Syntax | Semantics
Basic Transitive | Paula hit the ball | Agent V Patient | cause(Agent, E) manner(during(E), directedmotion, Agent) !contact(during(E), Agent, Patient) manner(end(E), forceful, Agent) contact(end(E), Agent, Patient)
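Action generalization can be pictured as a lookup from a verb's base form to its verb class, so that two verbs count as similar when they map to the same class. The sketch below is illustrative only: the table holds just the four members listed above, and the function names are hypothetical.

```python
from typing import Dict, Optional

# Verb base form -> class id. Only the members listed for Hit-18.1 are included.
VERB_TO_CLASS: Dict[str, str] = {
    "bang": "hit-18.1",
    "bash": "hit-18.1",
    "hit": "hit-18.1",
    "kick": "hit-18.1",
}


def verb_class(verb: str) -> Optional[str]:
    """Class of a verb given in base form, or None if the verb is not in the table."""
    return VERB_TO_CLASS.get(verb.lower())


def generalize_action(verb: str) -> str:
    """Replace a verb by its class; unknown verbs are kept as they are."""
    return verb_class(verb) or verb.lower()


def similar_actions(a: str, b: str) -> bool:
    """Two actions are treated as similar when they generalize to the same value."""
    return generalize_action(a) == generalize_action(b)


print(similar_actions("hit", "kick"))
print(similar_actions("hit", "flee"))
```

The lookup expects base forms, which matches the normalization step in the mining process; inflected forms such as "kicked" would need lemmatization first.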
Generalizing Events: Putting it all together
Present Event: “NATO strikes an army base in Baghdad”
• Actor: NATO
• Action: strikes
• Theme: Army base
• Instrument: Missiles
• Location: Baghdad
• Time-frame: 1/2/1987 11:00AM +(2h)
Past Event: “US Army bombs a weapon warehouse in Kabul with missiles”
• Actor: US Army
• Action: bombs
• Theme: Weapon warehouse
• Location: Kabul
• Time-frame: 1/2/1987 11:00AM +(2h)
Links between the two events
• Action: similar verb
• Location: rdf:type City
• Theme: rdf:type Military facility
• Actor: US, Country Army
Generalization rule
• Actor: [state of NATO]
• Property: [Hit1.1]
• Theme: [Military facility]
• Location: [Arab City]
Generalizing Events: HAC algorithm
Generalizing Events: Event distance metric
Present Event: “NATO strikes an army base in Baghdad”
Past Event: “US Army bombs a weapon warehouse in Kabul with missiles”
[Figure: the two events compared role by role (Actor, Action, Theme, Instrument, Location, Time-frame), linked by similar verb and rdf:type (City, Military facility)]
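The combination of an event distance metric with HAC (hierarchical agglomerative clustering) can be sketched as follows. Everything specific here is an assumption for illustration: the per-role scores (0, 0.5, 1), the `PARENT` generalization table, average linkage, and the stopping threshold are not taken from the slides.

```python
from itertools import combinations
from typing import Dict, List, Optional

Event = Dict[str, str]
ROLES = ("Actor", "Action", "Theme", "Location")

# Hypothetical one-step generalizations (shared type or similar-verb class).
PARENT: Dict[str, str] = {
    "strikes": "similar-verb class", "bombs": "similar-verb class",
    "Baghdad": "City", "Kabul": "City",
    "Army base": "Military facility", "Weapon warehouse": "Military facility",
}


def role_distance(a: Optional[str], b: Optional[str]) -> float:
    """0 if equal, 0.5 if both generalize to the same concept, 1 otherwise."""
    if a == b:
        return 0.0
    if a in PARENT and PARENT.get(a) == PARENT.get(b):
        return 0.5
    return 1.0


def event_distance(e1: Event, e2: Event) -> float:
    """Mean of the per-role distances."""
    return sum(role_distance(e1.get(r), e2.get(r)) for r in ROLES) / len(ROLES)


def cluster_distance(c1: List[int], c2: List[int], events: List[Event]) -> float:
    """Average linkage: mean distance over all cross-cluster event pairs."""
    total = sum(event_distance(events[i], events[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def hac(events: List[Event], threshold: float) -> List[List[int]]:
    """Merge the closest pair of clusters until the closest pair exceeds `threshold`."""
    clusters = [[i] for i in range(len(events))]
    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: cluster_distance(clusters[p[0]], clusters[p[1]], events),
        )
        if cluster_distance(clusters[i], clusters[j], events) > threshold:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters


events = [
    {"Actor": "NATO", "Action": "strikes", "Theme": "Army base", "Location": "Baghdad"},
    {"Actor": "US Army", "Action": "bombs", "Theme": "Weapon warehouse", "Location": "Kabul"},
    {"Actor": "Volcano", "Action": "erupts", "Location": "Democratic Republic of Congo"},
]
print(hac(events, threshold=0.7))
```

With these toy scores the two military events end up in one cluster and the volcano event stays on its own. A full hierarchy would keep the merge history instead of stopping at a threshold.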
Algorithm Outline
Learning Phase
1. Generalize events
2. Causality prediction rule generation
Prediction Rule Generation
Cause Event: “US Army bombs a weapon warehouse in Kabul with missiles”
• Actor: US Army (Type: Country Army)
• Action: bombs
• Theme: Weapon warehouse
• Instrument: Missiles
• Location: Kabul
• Time-frame: 1/2/1987 11:00AM +(2h)
Effect Event (caused by the cause event): “5 Afghan troops were killed”
• Action: kill
• Theme: Troops (Quantifier: 5, Attribute: Afghan)
• Time-frame: 1/2/1987 11:15AM +(3h)
Ontology path: Kabul → Country: Afghanistan → Nationality: Afghan
Generated rule
EffectThemeAttribute = CauseLocationCountryNationality
EffectAction = kill
EffectTheme = Troops
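Rule generation of this kind can be sketched as a search for an ontology path from some role of the cause event to each value of the effect event; values with no such path are kept as constants. The breadth-first search, the toy ontology, and all function names below are assumptions for illustration.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

# Toy ontology: entity -> relation -> target.
ONTOLOGY: Dict[str, Dict[str, str]] = {
    "Kabul": {"Country": "Afghanistan"},
    "Afghanistan": {"Nationality": "Afghan"},
}


def find_path(start: str, goal: str, max_depth: int = 3) -> Optional[List[str]]:
    """Shortest list of relations leading from `start` to `goal`, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        if len(path) == max_depth:
            continue
        for relation, target in ONTOLOGY.get(node, {}).items():
            if target not in seen:
                seen.add(target)
                queue.append((target, path + [relation]))
    return None


def generate_rule(cause: Dict[str, str], effect: Dict[str, str]) -> Dict[str, Tuple]:
    """For each effect role: a path from a cause role if one exists, else a constant."""
    rule: Dict[str, Tuple] = {}
    for effect_role, effect_value in effect.items():
        rule[effect_role] = ("constant", effect_value)
        for cause_role, cause_value in cause.items():
            path = find_path(cause_value, effect_value)
            if path is not None:
                rule[effect_role] = ("path", cause_role, path)
                break
    return rule


def format_rule(rule: Dict[str, Tuple]) -> List[str]:
    """Render the rule in the notation used on the slide."""
    lines = []
    for role, spec in rule.items():
        if spec[0] == "constant":
            lines.append(f"Effect{role} = {spec[1]}")
        else:
            lines.append(f"Effect{role} = Cause{spec[1]}{''.join(spec[2])}")
    return lines


cause = {"Actor": "US Army", "Action": "bombs", "Theme": "Weapon warehouse",
         "Instrument": "Missiles", "Location": "Kabul"}
effect = {"Action": "kill", "Theme": "Troops", "ThemeAttribute": "Afghan"}

for line in format_rule(generate_rule(cause, effect)):
    print(line)
```

The point of storing a path instead of the literal value "Afghan" is that the rule then transfers to cause events in other locations.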
Algorithm Outline
Prediction Phase
1. Finding similar generalized event
2. Application of causality prediction rule
Finding Similar Generalized Event
[Figure: the input event “Baghdad bombing” compared against generalized events, with similarity scores 0.2, 0.3, 0.7, 0.8, 0.75, 0.2, 0.65, 0.1]
Prediction Rule Application
Input Event
• Actor: Actor1
• Action: bomb
• Theme: Theme1
• Instrument: Instrument1
• Location: Location1
• Time-frame: T1
Predicted Effect Event (caused by the input event)
• Action: kill
• Theme: Troops (Attribute: Nationality of the Country of Location1)
• Time-frame: T1 + ∆
Applied rule
EffectThemeAttribute = CauseLocationCountryNationality
EffectAction = kill
EffectTheme = Troops
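Applying such a rule to a new input event amounts to copying the constant parts and following the stored ontology path from the input event's own role values. The sketch below uses the “Baghdad bombing” input as an example; the rule encoding, the toy ontology entries, the timestamp, and the value of ∆ are assumptions for illustration.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Tuple

# Toy ontology: entity -> relation -> target.
ONTOLOGY: Dict[str, Dict[str, str]] = {
    "Baghdad": {"Country": "Iraq"},
    "Iraq": {"Nationality": "Iraqi"},
}

# EffectThemeAttribute = CauseLocationCountryNationality, EffectAction = kill,
# EffectTheme = Troops, in a hypothetical tuple encoding.
RULE: Dict[str, Tuple] = {
    "Action": ("constant", "kill"),
    "Theme": ("constant", "Troops"),
    "ThemeAttribute": ("path", "Location", ["Country", "Nationality"]),
}


def follow(entity: Optional[str], relations: List[str]) -> Optional[str]:
    """Walk the ontology from `entity` along `relations`; None if any hop is missing."""
    node = entity
    for relation in relations:
        node = ONTOLOGY.get(node, {}).get(relation)
        if node is None:
            return None
    return node


def apply_rule(rule: Dict[str, Tuple], cause: dict, delta: timedelta) -> Optional[dict]:
    """Build the predicted effect event, or None if a path cannot be resolved."""
    effect: dict = {}
    for role, spec in rule.items():
        if spec[0] == "constant":
            effect[role] = spec[1]
        else:
            _, cause_role, relations = spec
            value = follow(cause.get(cause_role), relations)
            if value is None:
                return None
            effect[role] = value
    effect["Time"] = cause["Time"] + delta  # T1 + ∆
    return effect


input_event = {
    "Action": "bomb",
    "Location": "Baghdad",
    "Time": datetime(2010, 1, 1, 9, 0),  # hypothetical T1
}
print(apply_rule(RULE, input_event, delta=timedelta(hours=3)))
```

Returning None when a path cannot be resolved keeps the sketch from emitting a prediction with a missing attribute; a real system might instead back off to a more general rule.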
Prediction Evaluation
Human Group 1:
• Mark events E that can cause other events.
Human Group 2:
• Given: random sample of events from E, predictions and time of events
• Search the web and give an estimate of the prediction accuracy
Prediction Accuracy Results
 | Highly certain | Certain
Algorithm | 0.58 | 0.49
Humans | 0.4 | 0.38
Causality Evaluation
Human Group 1:
• Mark events E to be used as test events for the other two human groups and the algorithm.
Human Group 2:
• Given: random sample of events from E.
• State what you think would happen following this event.
Human Group 3:
• Given: algorithm predictions + human (2nd group) predictions
• Evaluate the quality of the predictions
Causality Results
• The results are statistically significant
 | [0,1) | [1,2) | [2,3) | [3,4] | Avg. Rank | Avg. Accuracy
Algorithm | 0 | 2 | 19 | 29 | 3.08 | 77%
Humans | 0 | 3 | 24 | 23 | 2.86 | 72%
Event | Predicted event (Human) | Predicted event (Pundit)
Al-Qaida demands hostage exchange | Al-Qaida exchanges hostage | A country will refuse the demand
Volcano erupts in Democratic Republic of Congo | Scientists in Republic of Congo investigate lava beds | Thousands of people flee from Congo
7.0 magnitude earthquake strikes Haitian coast | Tsunami in Haiti effects coast | Tsunami-warning is issued
2 Palestinians reportedly shot dead by Israeli troops | Israeli citizens protest against Palestinian leaders | War will be waged
Professor of Tehran University killed in bombing | Tehran students remember slain professor in memorial service | Professor funeral will be held
Alleged drug kingpin arrested in Mexico | Mafia kills people with guns in town | Kingpin will be sent to prison
UK bans Islamist group | Islamist group would adopt another name in the UK | Group will grow
China overtakes Germany as world's biggest exporter | German officials suspend tariffs | Wheat price will fall
Accuracy of Extraction
Extraction Evaluation
Action | Actor | Object | Instrument | Location | Time
93% | 74% | 76% | 79% | 79% | 100%
Entity Ontology Matching
Actor | Object | Instrument | Location
84% | 83% | 79% | 89%
Related work
Causality Information Extraction
Goal: Extract causality relations from a text
Techniques:
1. Usage of handcrafted domain-specific patterns [Kaplan and Berry-Rogghe, 1991]
2. Usage of handcrafted linguistic patterns [Garcia 1997], [Khoo, Chan, & Niu 2000], [Girju & Moldovan 2002]
3. Semi-supervised pattern learning approaches, based on text features [Blanco, Castell, & Moldovan 2008], [Sil & Huang & Yates 2010]
4. Supervised pattern learning approaches based on text features [Riloff 1996], [Riloff & Jones 1999], [Agichtein & Gravano, 2000; Lin & Pantel, 2001]
Related work
Temporal Information Extraction
Goal: Predicting the temporal order of events or time expressions described in text
Technique: Learn classifiers that predict the temporal order of a pair of events based on predefined features of the pair. [Ling & Weld, 2010; Mani, Schiffman, & Zhang, 2003; Lapata & Lascarides, 2006; Chambers, Wang, & Jurafsky, 2007; Tatu & Srikanth, 2008; Yoshikawa, Riedel, Asahara, & Matsumoto, 2009]
Future work
• Going beyond human tagged examples
• Incorporating time into the equation
• When will correlation mean causality?
• Using other sources than news
• Incorporating real time data (Twitter, Facebook)
• Incorporating numerical data (Stocks, Weather, Forex)
• Can we predict general facts?
• Can a machine predict better than an expert?
Summary
• Canonical event representation
• Machine learning algorithm for events prediction
• Leveraging world knowledge for generalization
• Using text as human tagged examples
• Causality mining from text
• Contribution to machine common-sense understanding
“The best way to predict the future is to invent it” [Alan Kay]