Query Based Event Extraction along a Timeline
H.L. Chieu and Y.K. Lee DSO National Laboratories, Singapore
(SIGIR 2004)
Abstract
We present a framework and a system that extracts events relevant to a query and places such events along a timeline
Each event is represented by a sentence, based on the assumption that "important" events are widely cited in many documents for a period of time within which these events are of interest
Evaluation was performed using G8 leader names as queries: human evaluators compared manually constructed and system-generated timelines
Introduction
The growth of the WWW means that users need tools to find what they want
In this paper, we attempt to summarize a big collection of documents by placing sentences that report "important" events related to the query along a timeline
For this purpose, we assume that documents are time-tagged, and each sentence is assigned a date by using simple rules
Introduction
Events that are of interest to many people are often reported in many different news articles from different sources
Updates and commentaries on such events would also persist for a period of time
Our system makes use of this property to determine whether events are "important"
Related Work
Past work on summarization focused on either summarizing single documents, or summarizing similarities and differences in small clusters of documents
Recent DUC tasks introduced summarization focused by a viewpoint or a question, but still over small clusters of documents
McKeown et al. (2001) used different summarizers for different types of document clusters (MultiGen, DEMS)
Schiffman et al. (2000) summarized collections of documents to produce biographical summaries without taking temporal information into account
Related Work
Work has been done on abstracting, where sentences are generated instead of extracted (Barzilay, 1999; McKeown, 1999)
Knight and Marcu (2000) explored summarization by sentence compression
Summarization systems can also be based on information extraction systems, where summaries are generated from extracted templates (Radev, 1998)
For multidocument summarization, the ordering of sentences within a summary has also been explored to produce coherent summaries for the reader (Barzilay, 2002, 2003)
Sentences extracted on a timeline refer to different events that happened on different dates
Related Work
Most work on temporal aspects of linguistics addresses the annotation of temporal expressions and temporal attributes of events (Mani, 2000, 2001)
The DARPA ACE task of Relation Detection and Tracking involves attaching temporal attributes to relations
The area of TDT brought forth some work on summarization of incoming streams of news
Swan and Allan (2000) pioneered work on timeline overviews
Allan et al. (2001) addressed temporal summarization of news topics by providing sentence-based summaries, defining precision and recall measures that capture relevance and novelty
Related Work
Smith (2002) extracted associations between dates and locations and showed that the detected associations corresponded well with known events
Kleinberg (2002) modeled data streams to infer a hierarchical structure of bursts in the stream, and extracted bursts of term usage in titles of papers from conference databases
In this paper, our system summarizes documents returned by a query-based search and aims to extract events related to an entity or type of event by sentence extraction along a timeline
Framework For Timeline Construction
Ranking Sentences Ranking Metric: “Interest”
We define “interesting event” in SC(q) as events that are a subject of interest in many sentences in SC(q)
The sentences in the set SC(q) can then be ranked according to their interest:

Interest(s|q) = cardinality({s' ∈ SC(q) | s' reports the same event as s regarding q})
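As a minimal sketch, assuming SC(q) is a list of sentences and the "same event" test is supplied by the caller (the paper later approximates it with cosine similarity between sentence vectors), the count-based interest is just a cardinality:

```python
def interest(s, sc_q, same_event):
    """Count the sentences in SC(q) that report the same event as s.
    `same_event` is a caller-supplied predicate standing in for the
    paper's paraphrase test."""
    return sum(1 for s2 in sc_q if same_event(s, s2))

# Illustrative predicate: sentences sharing at least two words
same_event = lambda a, b: len(set(a.split()) & set(b.split())) >= 2
sc = ["quake hits iran", "iran quake kills dozens", "leader visits paris"]
```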
Framework For Timeline Construction
Ranking Sentences Ranking Metric: “Burstiness”
Reports of “important” events often cluster around the date of their occurrence
To sift out “important” events, we compare the number of reports of an event within and outside a date duration
By modeling reports of events as a random process with an unknown binomial distribution, we can check for associations between events and dates
Framework For Timeline Construction
Ranking Sentences Ranking Metric: “Burstiness”
Instead of taking a fixed value for k, we calculate the log likelihood ratio for each value of k from 1 to 10:

Burstiness(s) = max over n, k (with k ≤ 10 and n ∈ [1, k]) of LL(n, k),
where LL(n, k) = log likelihood ratio for the contingency table counting reports of the event inside versus outside a k-day window containing date(s)
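The log likelihood ratio for a 2×2 contingency table can be computed with Dunning's formula; this sketch is generic (here, the table would count sentences reporting the event inside vs. outside the candidate date window):

```python
from math import log

def _sum_xlogx(ks):
    # Sum of k*log(k) terms, with the convention 0*log(0) = 0
    return sum(k * log(k) for k in ks if k > 0)

def llr(k11, k12, k21, k22):
    """Dunning's log likelihood ratio for the 2x2 contingency table
    [[k11, k12], [k21, k22]]: 0 when rows and columns are independent,
    large when the association is strong."""
    row1, row2 = k11 + k12, k21 + k22
    col1, col2 = k11 + k21, k12 + k22
    n = row1 + row2
    return 2 * (_sum_xlogx([k11, k12, k21, k22]) + _sum_xlogx([n])
                - _sum_xlogx([row1, row2]) - _sum_xlogx([col1, col2]))
```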
Implementation Assumptions
In step (1), a sentence s is deemed relevant to query q, if one of the terms in q appears in s
In step (2), we resolve date(s) with a few rules
Only one event is mentioned per sentence: each sentence is mapped to one day
Implementation Resolving Date
We further assume that the first date expression detected in a sentence s is the date of the event mentioned in s
We crafted simple rules to detect date expressions in sentences, and resolve them to absolute dates using the article date as reference
In the case where no date expression is detected in the entire sentence s, date(s) is taken to be the date of publication of the article containing s
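The rules above can be sketched as follows; the specific patterns here (yesterday/tomorrow, bare weekday names) are illustrative assumptions, since the paper's actual rule set is not listed:

```python
import re
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]

def resolve_date(sentence, article_date):
    """Resolve the first date expression in a sentence to an absolute
    date, using the article date as reference; fall back to the
    publication date when no expression is found."""
    s = sentence.lower()
    if "yesterday" in s:
        return article_date - timedelta(days=1)
    if "tomorrow" in s:
        return article_date + timedelta(days=1)
    m = re.search(r"\b(" + "|".join(WEEKDAYS) + r")\b", s)
    if m:
        # Interpret a bare weekday name as the most recent such day
        delta = (article_date.weekday() - WEEKDAYS.index(m.group(1))) % 7
        return article_date - timedelta(days=delta)
    return article_date  # no date expression detected
```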
Implementation Mapping sentences to vectors
In order to compare sentences, we mapped each sentence to a vector
Instead of using the traditional idf, we use inverse date frequency (iDf) for term weighting: sentences are grouped by dates rather than documents
We use a cutoff of 3 for date-frequency: all terms that occur in 3 dates or less would have iDf calculated based on a date-frequency of 3
Besides removing stop words from each sentence vector, words that are found to be date expression are also removed
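A sketch of iDf weighting under the assumption that it takes the usual log(D / df) form, with dates in place of documents and the date-frequency cutoff of 3 described above:

```python
from math import log

def idf_weights(sentences_by_date, cutoff=3):
    """Inverse date frequency (iDf): a term's date-frequency is the
    number of distinct dates on which it occurs; frequencies below the
    cutoff are raised to the cutoff (3 in the paper). The log(D / df)
    form is an assumption."""
    dates_per_term = {}
    for d, sents in sentences_by_date.items():
        for term in {t for s in sents for t in s.split()}:
            dates_per_term.setdefault(term, set()).add(d)
    n_dates = len(sentences_by_date)
    return {t: log(n_dates / max(len(ds), cutoff))
            for t, ds in dates_per_term.items()}
```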
Implementation Sentences reporting the same event
The decision of whether two sentences report the same event boils down to a paraphrase-detection problem
We use the cosine similarity between the vectors of the two sentences as the probability that they paraphrase each other
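With sentence vectors represented as sparse term-weight dicts, the cosine used as a paraphrase probability is:

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two sparse term-weight vectors (dicts),
    used here as the probability that two sentences paraphrase each other."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0
```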
Implementation Sentences reporting the same event
Interest and burstiness were redefined: sentences reporting events happening on different dates are unlikely to be paraphrases of each other, so the redefinition takes the date difference into account through a window of T days, where T is a constant, T = 10
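The slide's redefined equation was not captured, so the following is a hypothetical sketch assuming the redefinition sums paraphrase probabilities (cosines) only over sentences dated within T days of date(s); `vec` and `day` are illustrative maps, not the paper's structures:

```python
from math import sqrt

def cosine(u, v):
    # Cosine similarity between sparse term-weight dicts
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = sqrt(sum(w * w for w in u.values()))
    nv = sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def weighted_interest(s, sc_q, vec, day, T=10):
    """Assumed form of the redefined interest: sum paraphrase
    probabilities over sentences whose dates fall within T days of
    date(s)."""
    return sum(cosine(vec[s], vec[s2])
               for s2 in sc_q if abs(day[s2] - day[s]) <= T)
```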
Implementation Sentences reporting the same event
Interest and burstiness were redefined: the log likelihood ratio can still be calculated
Implementation Removing duplicated sentences
In order to decide the range of the neighborhood in which sentences are deleted, we define an extent for each sentence, for each of the measures interest and burstiness
For both measures, extents can take a maximum value of 10 days
When picking the top N sentences after ranking, sentences are discarded if they fall within any of the extents of sentences already selected
P=80%
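The selection step can be sketched as a greedy walk; `ranked`, `day`, and `extent` are assumed structures (rank-ordered sentences, sentence dates, and per-sentence extents in days), not the paper's exact ones:

```python
def select_top(ranked, day, extent, n=10):
    """Greedy duplicate removal: walk sentences best-first and skip any
    whose date falls within the extent (a day window, at most 10 days)
    of an already selected sentence."""
    chosen = []
    for s in ranked:
        if all(abs(day[s] - day[c]) > extent[c] for c in chosen):
            chosen.append(s)
            if len(chosen) == n:
                break
    return chosen
```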
Implementation Efficiency
Direct computation of the sum of cosines would still be expensive if SC(q) contains a large number of sentences
As the dot product is distributive with respect to addition, the sum of dot products used in the measures can be converted to a dot product of sums
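Concretely, Σᵢ (s · sᵢ) = s · (Σᵢ sᵢ), so the per-query vector sum can be built once instead of taking |SC(q)| dot products (to obtain a sum of cosines, each sᵢ would be normalized to unit length before summing). A sketch with sparse vectors:

```python
def vec_sum(vectors):
    # Component-wise sum of sparse vectors (dicts term -> weight)
    total = {}
    for v in vectors:
        for t, w in v.items():
            total[t] = total.get(t, 0.0) + w
    return total

def dot(u, v):
    # Sparse dot product
    return sum(w * v.get(t, 0.0) for t, w in u.items())

# Distributivity: sum of dot products equals dot product with the sum
s = {"a": 2.0, "b": 1.0}
vs = [{"a": 1.0}, {"b": 3.0}, {"a": 1.0, "b": 1.0}]
```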
Case Study on Development Corpus
For development, we conducted our experiments on articles from the corpus of Reuters Volume 1, dated from January to August 1997
We discuss the performance of our system using the queries "earthquake(s)" and "quake(s)"
We found a chronology of major earthquakes in Reuters articles on the Internet (all quakes in this list "killed more than 1,000 people")
Although many earthquakes occurred from January to August 1997, only two of them were major enough to figure in this list:
10/5/1997, Eastern Iran
28/2/1997, Northeast Iran
Evaluation
We performed a small experiment to compare our timelines against manually constructed timelines
We chose to use articles from the first 6 months of 2002 from the English Gigaword corpus: Agence France Presse English Service, Associated Press Worldstream English Service, and The New York Times Newswire Service
For our experiment, we used person names, namely, the eight leaders of the G8 countries in 2002
Evaluation
From machine-generated timelines, we automatically removed events that are not in 1/2002~6/2002
For this evaluation, we used only the top ten sentences generated by each measure
We employed four arts undergraduates majoring in history to help us evaluate our system
The evaluation was carried out in three phases
Evaluation Phase 1
Evaluators were told to construct timelines of ten sentences for queries assigned to them by doing their own research (4 queries/person)
Phase 2
They were given 4 timelines (2 machine-generated, 2 manually constructed) for each of the 8 queries, and asked to answer questions on each timeline
They were also told to rank the 4 timelines for each query from best to worst, and to rank all 40 sentences from 1 to 40, in terms of importance
While ranking sentences, they were asked to tie ranks for sentences that relate "equivalent" events
Phase 3
They were asked to judge the dates of events for the machine-generated timelines, given the sentence and its source document
Conclusion and Future Work
We have described a system for extracting sentences from a big corpus to be placed along a timeline, given a query from the user
The advantages of our work: it is efficient, and sentences are good units of information, as they allow quick access to their source documents and may be informative enough for the user
It is our intention to integrate the system with a search engine so that it can work in real time on user queries