Query Based Event Extraction along a Timeline
H.L. Chieu and Y.K. Lee DSO National Laboratories, Singapore
(SIGIR 2004)
Abstract
We present a framework and a system that extracts events relevant to a query and places such events along a timeline
Each event is represented by a sentence, based on the assumption that "important" events are widely cited in many documents for a period of time within which these events are of interest
Evaluation was performed using G8 leader names as queries: human evaluators compared manually constructed and system-generated timelines
Introduction
The growth of the WWW means that users need tools to find what they want
In this paper, we attempt to summarize a big collection of documents by placing sentences that report "important" events related to the query along a timeline
For this purpose, we assume that documents are time-tagged, and each sentence is assigned a date by using simple rules
Introduction
Events that are of interest to many people are often reported in many different news articles from different sources
Updates and commentaries on such events would also persist for a period of time
Our system makes use of this property to determine whether events are "important"
Related Work
Past work on summarization focused on either summarizing single documents, or summarizing similarities and differences in small clusters of documents
Recent DUC tasks introduced summarization focused by a viewpoint or a question, but still over small clusters of documents
McKeown et al. (2001) used different summarizers for different types of document clusters (MultiGen, DEMS)
Schiffman et al. (2000) summarized collections of documents to produce biographical summaries without taking temporal information into account
Related Work
Work has been done on abstracting, where sentences are generated instead of extracted (Barzilay, 1999; McKeown, 1999)
Knight and Marcu (2000) explored summarization by sentence compression
Summarization systems can also be based on information extraction systems, where summaries are generated from extracted templates (Radev, 1998)
For multidocument summarization, the ordering of sentences within a summary has also been explored to produce coherent summaries for the reader (Barzilay, 2002, 2003)
Sentences extracted on a timeline refer to different events that happened on different dates
Related Work
Most work on temporal aspects of linguistics addresses the annotation of temporal expressions and temporal attributes of events (Mani, 2000, 2001)
The DARPA ACE task of Relation Detection and Tracking involves attaching temporal attributes to relations
The area of TDT brought forth some work on summarization of incoming streams of news
Swan and Allan (2000) pioneered work on timeline overviews
Allan et al. (2001) addressed temporal summarization of news topics by providing sentence-based summaries, defining precision and recall measures that capture relevance and novelty
Related Work
Smith (2002) extracted associations between dates and locations and showed that the detected associations corresponded well with known events
Kleinberg (2002) modeled data streams to infer a hierarchical structure of bursts in the stream, and extracted bursts of term usage in titles of papers from conference databases
In this paper, our system summarizes documents returned by a query-based search and aims to extract events related to an entity or type of event by sentence extraction along a timeline
Framework For Timeline Construction
Ranking Sentences Ranking Metric: “Interest”
We define “interesting event” in SC(q) as events that are a subject of interest in many sentences in SC(q)
The sentences in the set SC(q) can then be ranked according to their interest:

Interest(s|q) = cardinality({s' ∈ SC(q) | s' reports the same event as s regarding q})
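As a minimal sketch, assuming SC(q) is a list of sentences and the "same event" test is supplied by the caller (the paper later approximates it with cosine similarity between sentence vectors), the count-based interest is just a cardinality:

```python
def interest(s, sc_q, same_event):
    """Count the sentences in SC(q) that report the same event as s.
    `same_event` is a caller-supplied predicate standing in for the
    paper's paraphrase test."""
    return sum(1 for s2 in sc_q if same_event(s, s2))

# Illustrative predicate: sentences sharing at least two words
same_event = lambda a, b: len(set(a.split()) & set(b.split())) >= 2
sc = ["quake hits iran", "iran quake kills dozens", "leader visits paris"]
```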
Framework For Timeline Construction
Ranking Sentences Ranking Metric: “Burstiness”
Reports of “important” events often cluster around the date of their occurrence
To sift out “important” events, we compare the number of reports of an event within and outside a date duration
By modeling reports of events as a random process with an unknown binomial distribution, we can check for associations between events and dates
Framework For Timeline Construction
Ranking Sentences Ranking Metric: “Burstiness”
Instead of taking a fixed value for k, we calculate the log likelihood ratio for each value of k from 1 to 10:

Burstiness(s) = max over n, k (with k ≤ 10 and n ∈ [1, k]) of LL(n, k),
where LL(n, k) = log likelihood ratio for the contingency table counting reports of the event inside versus outside a k-day window containing date(s)
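The log likelihood ratio for a 2×2 contingency table can be computed with Dunning's formula; this sketch is generic (here, the table would count sentences reporting the event inside vs. outside the candidate date window):

```python
from math import log

def _sum_xlogx(ks):
    # Sum of k*log(k) terms, with the convention 0*log(0) = 0
    return sum(k * log(k) for k in ks if k > 0)

def llr(k11, k12, k21, k22):
    """Dunning's log likelihood ratio for the 2x2 contingency table
    [[k11, k12], [k21, k22]]: 0 when rows and columns are independent,
    large when the association is strong."""
    row1, row2 = k11 + k12, k21 + k22
    col1, col2 = k11 + k21, k12 + k22
    n = row1 + row2
    return 2 * (_sum_xlogx([k11, k12, k21, k22]) + _sum_xlogx([n])
                - _sum_xlogx([row1, row2]) - _sum_xlogx([col1, col2]))
```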
Implementation Assumptions
In step (1), a sentence s is deemed relevant to query q, if one of the terms in q appears in s
In step (2), we resolve date(s) with a few rules
Only one event is mentioned per sentence: each sentence is mapped to one day
Implementation Resolving Date
We further assume that the first date expression detected in a sentence s is the date of the event mentioned in s
We crafted simple rules to detect date expressions in sentences, and resolve them to absolute dates using the article date as reference
In the case where no date expression is detected in the entire sentence s, date(s) is taken to be the date of publication of the article containing s
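The rules above can be sketched as follows; the specific patterns here (yesterday/tomorrow, bare weekday names) are illustrative assumptions, since the paper's actual rule set is not listed:

```python
import re
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]

def resolve_date(sentence, article_date):
    """Resolve the first date expression in a sentence to an absolute
    date, using the article date as reference; fall back to the
    publication date when no expression is found."""
    s = sentence.lower()
    if "yesterday" in s:
        return article_date - timedelta(days=1)
    if "tomorrow" in s:
        return article_date + timedelta(days=1)
    m = re.search(r"\b(" + "|".join(WEEKDAYS) + r")\b", s)
    if m:
        # Interpret a bare weekday name as the most recent such day
        delta = (article_date.weekday() - WEEKDAYS.index(m.group(1))) % 7
        return article_date - timedelta(days=delta)
    return article_date  # no date expression detected
```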
Implementation Mapping sentences to vectors
In order to compare sentences, we mapped each sentence to a vector
Instead of using the traditional idf, we use inverse date frequency (iDf) for term weighting: sentences are grouped by dates rather than documents
We use a cutoff of 3 for date-frequency: all terms that occur in 3 dates or less would have iDf calculated based on a date-frequency of 3
Besides removing stop words from each sentence vector, words that are found to be date expression are also removed
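A sketch of iDf weighting under the assumption that it takes the usual log(D / df) form, with dates in place of documents and the date-frequency cutoff of 3 described above:

```python
from math import log

def idf_weights(sentences_by_date, cutoff=3):
    """Inverse date frequency (iDf): a term's date-frequency is the
    number of distinct dates on which it occurs; frequencies below the
    cutoff are raised to the cutoff (3 in the paper). The log(D / df)
    form is an assumption."""
    dates_per_term = {}
    for d, sents in sentences_by_date.items():
        for term in {t for s in sents for t in s.split()}:
            dates_per_term.setdefault(term, set()).add(d)
    n_dates = len(sentences_by_date)
    return {t: log(n_dates / max(len(ds), cutoff))
            for t, ds in dates_per_term.items()}
```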
Implementation Sentences reporting the same event
The decision of whether two sentences report the same event boils down to a paraphrase-detection problem
We use the cosine similarity between the vectors of the two sentences as the probability that they paraphrase each other
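With sentence vectors represented as sparse term-weight dicts, the cosine used as a paraphrase probability is:

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two sparse term-weight vectors (dicts),
    used here as the probability that two sentences paraphrase each other."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0
```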
Implementation Sentences reporting the same event
Interest and burstiness were redefined: sentences reporting events happening on different dates are unlikely to be paraphrases of each other, so the redefinition takes the date difference into account through a window of T days, where T is a constant, T = 10
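The slide's redefined equation was not captured, so the following is a hypothetical sketch assuming the redefinition sums paraphrase probabilities (cosines) only over sentences dated within T days of date(s); `vec` and `day` are illustrative maps, not the paper's structures:

```python
from math import sqrt

def cosine(u, v):
    # Cosine similarity between sparse term-weight dicts
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = sqrt(sum(w * w for w in u.values()))
    nv = sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def weighted_interest(s, sc_q, vec, day, T=10):
    """Assumed form of the redefined interest: sum paraphrase
    probabilities over sentences whose dates fall within T days of
    date(s)."""
    return sum(cosine(vec[s], vec[s2])
               for s2 in sc_q if abs(day[s2] - day[s]) <= T)
```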
Implementation Sentences reporting the same event
Interest and burstiness were redefined: the log likelihood ratio can still be calculated
Implementation Removing duplicated sentences
In order to decide the range of the neighborhood in which sentences are deleted, we define an extent for each sentence, for each of the measures interest and burstiness
For both measures, extents can take a maximum value of 10 days
When picking the top N sentences after ranking, sentences are discarded if they fall within any of the extents of sentences already selected
P=80%
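The selection step can be sketched as a greedy walk; `ranked`, `day`, and `extent` are assumed structures (rank-ordered sentences, sentence dates, and per-sentence extents in days), not the paper's exact ones:

```python
def select_top(ranked, day, extent, n=10):
    """Greedy duplicate removal: walk sentences best-first and skip any
    whose date falls within the extent (a day window, at most 10 days)
    of an already selected sentence."""
    chosen = []
    for s in ranked:
        if all(abs(day[s] - day[c]) > extent[c] for c in chosen):
            chosen.append(s)
            if len(chosen) == n:
                break
    return chosen
```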
Implementation Efficiency
Direct computation of the sum of cosines would still be expensive if SC(q) contains a large number of sentences
As the dot product is distributive with respect to addition, the sum of dot products used in the measures can be converted to a dot product of sums
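Concretely, Σᵢ (s · sᵢ) = s · (Σᵢ sᵢ), so the per-query vector sum can be built once instead of taking |SC(q)| dot products (to obtain a sum of cosines, each sᵢ would be normalized to unit length before summing). A sketch with sparse vectors:

```python
def vec_sum(vectors):
    # Component-wise sum of sparse vectors (dicts term -> weight)
    total = {}
    for v in vectors:
        for t, w in v.items():
            total[t] = total.get(t, 0.0) + w
    return total

def dot(u, v):
    # Sparse dot product
    return sum(w * v.get(t, 0.0) for t, w in u.items())

# Distributivity: sum of dot products equals dot product with the sum
s = {"a": 2.0, "b": 1.0}
vs = [{"a": 1.0}, {"b": 3.0}, {"a": 1.0, "b": 1.0}]
```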
Case Study on Development Corpus
For development, we conducted our experiments on articles from the corpus of Reuters Volume 1, dated from January to August 1997
We discuss the performance of our system using the queries "earthquake(s)" and "quake(s)"
We found a chronology of major earthquakes in Reuters articles on the Internet (all quakes in this list "killed more than 1,000 people")
Although many earthquakes occurred from January to August 1997, only two of them were major enough to figure in this list:
10/5/1997, Eastern Iran
28/2/1997, Northeast Iran
Evaluation
We performed a small experiment to compare our timelines against manually constructed timelines
We chose to use articles from the first 6 months of 2002 from the English Gigaword corpus: Agence France Presse English Service, Associated Press Worldstream English Service, and The New York Times Newswire Service
For our experiment, we used person names, namely, the eight leaders of the G8 countries in 2002
Evaluation
From machine-generated timelines, we automatically removed events that are not in 1/2002~6/2002
For this evaluation, we used only the top ten sentences generated by each measure
We employed four arts undergraduates majoring in history to help us evaluate our system
The evaluation was carried out in three phases
Evaluation Phase 1
Evaluators were told to construct timelines of ten sentences for queries assigned to them by doing their own research (4 queries/person)
Phase 2
They were given 4 timelines (2 machine-generated, 2 manually constructed) for each of the 8 queries, and asked to answer questions on each timeline
They were also told to rank the 4 timelines for each query from best to worst, and to rank all 40 sentences from 1 to 40, in terms of importance
While ranking sentences, they were asked to tie ranks for sentences that relate "equivalent" events
Phase 3
They were asked to judge the dates of events for the machine-generated timelines, given the sentence and its source document
Conclusion and Future Work
We have described a system for extracting sentences from a big corpus to be placed along a timeline, given a query from the user
The advantages of our work: it is efficient, and sentences are good units of information, as they allow quick access to their source documents and may be informative enough for the user
It is our intention to integrate the system with a search engine so that it can work in real time on user queries