CMU TDT Report
TIDES PI Meeting 2002

The CMU TDT Team: Jaime Carbonell, Yiming Yang, Ralf Brown, Jian Zhang, Nianli Ma, Chun Jin
Language Technologies Institute, CMU
Time Line for TDT Activities

- Restarted TDT: Summer 2001
  - Tasks: FSD, SLD, Detection
- New techniques: Nov 2001 – present
  - Topic-conditional novelty (FSD)
  - Situated NEs (all tasks)
  - Source-conditional interpolated training (SLD)
- Evaluations
  - TDT: Oct 2001, July 2002
  - New FSD (internal): July 2002 (KDD Conference)
2002 Dry Run Results: DET

Evaluation Conditions                        Systran   EBMT         DICT
SR=nwt+bnasr TE=mul,eng boundary DEF=10      0.3646    0.3465 [1]
SR=nwt+bnasr TE=mul,eng noboundary DEF=10    0.4040
SR=nwt+bnman TE=arb,eng boundary DEF=10      0.2011                 0.6799 [2]
                                                                    0.1966 [3]
SR=nwt+bnman TE=arb,nat boundary DEF=10      0.1732

[1] Using our Mandarin-to-English EBMT, with Systran's story boundaries substituted for our own.
[2] Using our dictionary-based Arabic-to-English translation with our own boundaries, so the evaluation boundaries and our results are mismatched.
[3] Using our dictionary-based Arabic-to-English translation, with Systran's boundaries substituted for our own.
Baseline FSD Method (Unconditional)

- Dissimilarity with the past
- Decision threshold on the most-similar story
- (Linear) temporal decay
- Length filter (for teasers)
- Cosine similarity with standard weights:

  tfidf = (1 + log(tf)) * log(N/df)
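The baseline decision rule above can be sketched as follows. This is a minimal illustration, not the CMU implementation: the threshold and linear decay rate are placeholder values, and `past_docs` is an assumed list of (age, term-weight dict) pairs.

```python
import math
from collections import Counter

def ltc_weights(tf_counts, df, n_docs):
    """The slide's standard weighting: (1 + log tf) * log(N/df)."""
    return {term: (1.0 + math.log(tf)) * math.log(n_docs / df.get(term, 1))
            for term, tf in tf_counts.items()}

def cosine(u, v):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def is_first_story(new_doc, past_docs, threshold=0.2, decay=0.01):
    """Flag a story as 'first' if its decayed similarity to every past
    story stays below the decision threshold (linear temporal decay)."""
    best = 0.0
    for age, old_doc in past_docs:  # age = time elapsed since old_doc arrived
        sim = cosine(new_doc, old_doc) * max(0.0, 1.0 - decay * age)
        best = max(best, sim)
    return best < threshold
```

A length filter (dropping very short "teaser" segments before scoring) would sit in front of this loop.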
2002 Dry Run Results: FSD

Evaluation Conditions                          (C_fsd)_norm   (C_fsd)_norm, optimal
SR=nwt+bnasr; TE=eng,nat; boundary; DEF=10     0.6174         0.5846
SR=nwt+bnasr; TE=eng,nat; noboundary; DEF=10   0.6899         0.6403
2002 Dry Run DET: CMU-FSD
FSD Observations

- Cross-site comparable baselines (cost ≈ .7)
- "Events-vs-topics" issue (e.g. Asia crisis)
- A few mislabeled stories wreak havoc for FSD
- Eager auto-segmentation is a problem (misses)
- Recommendations for TDT labeling
  - FSD on true events, or events within topic(s)
  - Change the auto-segmentation optimality criterion??
- Recommendations for TDT researchers
  - Keep working hard on FSD – not cracked yet
New FSD Directions

- Topic-conditional models
  - E.g. "airplane," "investigation," "FAA," "FBI," "casualties" characterize the topic, not the event
  - "TWA 800," "March 12, 1997" characterize the event
- First categorize into a topic, then use maximally-discriminative terms within that topic
- Rely on situated named entities
  - E.g. "Arcan as victim," "Sharon as peacemaker"
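The two-step scheme above (categorize into a topic, then detect novelty within it) can be sketched as follows. The `classify` callback, the `past_by_topic` dict, and the threshold are illustrative assumptions, not the actual CMU components.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def two_level_fsd(new_doc, past_by_topic, classify, threshold=0.5):
    """Two-level FSD: (1) assign the story to a broad topic, then
    (2) test novelty only against past stories of that topic, so that
    topic-general words ("airplane", "FAA") no longer mask
    event-specific terms ("TWA 800")."""
    topic = classify(new_doc)                      # step 1: broad topic
    rivals = past_by_topic.setdefault(topic, [])
    best = max((cosine(new_doc, d) for d in rivals), default=0.0)
    rivals.append(new_doc)
    return topic, best < threshold                 # step 2: novelty within topic
```

Down-weighting topic-specific stop words and up-weighting named entities, as in the ideal case later in the deck, would be applied to the term-weight dicts before this comparison.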
Broad Topics vs Events
Two-level Scheme for FSD
Confusability between Intra-topic Events
[Similarity-matrix plots: AIRPLANE ACCIDENTS vs. BOMBINGS]

- Each data point in the matrix is the similarity between the two corresponding documents.
- Documents are sorted by event as the first key and by time of arrival as the second key, so the diagonal sub-matrices are intra-event document similarities, while the off-diagonal sub-matrices are inter-event document similarities.
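The matrix construction described in those notes can be sketched as follows; the per-document fields `event`, `time`, and `vec` are an assumed schema for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def confusability_matrix(docs):
    """Pairwise similarity matrix over documents sorted by (event,
    arrival time); with that ordering, the diagonal blocks hold
    intra-event similarities and the off-diagonal blocks hold
    inter-event similarities."""
    docs = sorted(docs, key=lambda d: (d['event'], d['time']))
    vecs = [d['vec'] for d in docs]
    return [[cosine(u, v) for v in vecs] for u in vecs]
```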
Measuring Effectiveness of NEs
[1] f denotes a named entity; S_k, the k-th of the seven types of named entities.
[2] We use the effectiveness of each NE type to measure how well it can differentiate intra-topic events.
Effectiveness of Named Entities
Experimental Design

- Baseline: conventional FSD
- Simple case: two-level FSD with "perfect" topic labels
- Ideal case: two-level FSD with "perfect" topic labels, weighted NEs, and topic-specific stop words removed
- Real case: same as the ideal case, but using system-predicted topic labels
Data Description

- Broadcast News: published by Primary Source Media; 261,209 transcripts of news articles from ABC, CNN, NPR, and MSNBC, covering 1992 to 1998.
- Document structure: each document (story) is composed of several fields, such as Title, Topic, Keywords, Date, Abstract, and Body.
- (Training) topic labels provided by PSM (4 topics): airplane accidents, bombings, tornados, hijackings.
- CMU students labeled 36 events within the 4 topics (divided into 50% training and 50% test).
Results for Topic-Conditioned FSD
Confusability Reduction (5 events within the topic "airplane accidents" in test data)

NOTE:
1. These graphs contain only test data (5 events for the topic "airplane accidents").
2. The left graph is the baseline; the right one is the ideal case.
Topic-Conditioned Approach to First Story Detection for TDT