Notes on ICASSP 2004
Arthur Chan
May 24, 2004
This Presentation (5 pages)
  Brief notes on ICASSP 2004
  NIST RT 04 evaluation results
  Other interesting things related to CALO
NIST RT 04 Meeting Transcription – Headlines
Meeting transcription
  A challenge to core technology, evaluation, and resource preparation.
Core technology
  Speaker segmentation
  Speech to text (STT)
Evaluation
  A new evaluation scheme has been devised for overlapped speech.
Resource preparation
  LDC has a big headache in preparing the data.
Speaker Segmentation
Segmenting the speech
  Search for the number of speakers.
  Get speaker turns.
  Measured by diarization rate.
Insights (from ISL):
  The more speakers, the harder the task.
  A new measure called speaker speaking time entropy is proposed.
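These notes do not define the speaker speaking time entropy. One plausible reading, assumed here and not confirmed by the ISL presentation, is the Shannon entropy of the distribution of total speaking time over speakers: it is 0 when one speaker does all the talking and grows as more speakers share the floor evenly. The function name and the input format are hypothetical.

```python
import math


def speaking_time_entropy(seconds_per_speaker):
    """Shannon entropy (in bits) of the speaking-time distribution.

    ASSUMPTION: this is one plausible definition of "speaker speaking
    time entropy"; the notes do not give the exact formula used by ISL.
    seconds_per_speaker maps a speaker label to total speaking time.
    """
    total = sum(seconds_per_speaker.values())
    if total <= 0:
        raise ValueError("total speaking time must be positive")

    entropy = 0.0
    for seconds in seconds_per_speaker.values():
        if seconds > 0:  # silent speakers contribute nothing
            p = seconds / total
            entropy -= p * math.log2(p)
    return entropy


if __name__ == "__main__":
    # A meeting dominated by one speaker versus an evenly shared one.
    print(speaking_time_entropy({"A": 900, "B": 50, "C": 50}))
    print(speaking_time_entropy({"A": 300, "B": 300, "C": 300, "D": 300}))
```

Under this reading, a higher value means more evenly distributed talk among more speakers, which matches the observation that more speakers make the task harder.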
STT
Very hard task
  ICSI and ISL use state-of-the-art technology:
    + Constrained linear transform
    + Discriminative training (DT-MAP)
    + Speaker adaptive training
  Individual headphone results: WER of 34.8% for non-overlapping speech.
  Some meetings are very hard: many people speak at the same time.
  Trained on 4 different subsets of data; ICSI data is just one of them (70% of the total).
Insights (ICSI):
  Feature-based techniques don't help much.
  Multiple-distance microphone and microphone-array techniques help.
Conclusion: we will also have a hard time.
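For reference, the WER figure quoted above is the word-level edit distance (substitutions, deletions, insertions) between the reference transcript and the recognizer output, divided by the number of reference words. The sketch below is a minimal version of that computation, assuming whitespace tokenization and no text normalization; it is not the NIST scoring tool, and the function name is hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference words.

    ASSUMPTION: words are split on whitespace and compared exactly;
    official scoring applies additional normalization rules.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one insertion against a 5-word reference.
    print(word_error_rate("the meeting starts at noon",
                          "the meeting start at noon today"))
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.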
Evaluation and Resource Preparation
Evaluation:
  Overlapped speech requires different schemes for evaluation.
  Will require multiple string matching. (Details unknown yet.)
Resource preparation:
  Currently, no tool can satisfy the need of transcribing multiple channels of speech with interaction.
  Professional transcribers failed.
Other interesting news from ICASSP related to CALO
Project EARS:
  Lightly supervised training
    3000 hours of closed-captioned speech are used.
  Discriminative training is found to be useful for some sites.
Others