http://fvalente.zxq.net/presentations/cbmi_2012.pdf
Detecting and Labeling Folk Literature in Spoken Cultural Heritage Archives using Structural and Prosodic Features
Fabio Valente and Petr Motlicek
CBMI 2012
Fabio Valente and Petr Motlicek (Idiap Research Institute)Detecting and Labeling Folk Literature CBMI 2012 1 / 23
Outline of the talk
1 Introduction and Motivation
2 Data Description
3 System Description
4 Experiments
5 Conclusion
Introduction and Motivation
Audio archives of cultural heritage are large and heterogeneous.
Audio and speech processing technologies have been applied to automatically structure, index and access them.
SpeechFind project (2005)
MALACH project (2004)
CHoral project (2009)
Typically broadcast quality audio.
Data represent monolingual testimonies or interviews.
Tasks include segmentation, transcription and information retrieval.
Archives des parlers patois de la Suisse romande et des regions voisines (Radio Suisse Romande)
Devoted to the preservation and the disclosure of Swiss French dialects from various regions.
Broadcast between 1950 and 1980 (varying and heterogeneous quality).
Contains interviews, presentations and discussions, but also folk literature (stories, poems, theatrical representations, tales) spoken in many dialects.
Folk literature accounts for almost 50% of the data.
Previous work focused mainly on segmenting, detecting and indexing in monolingual interviews and testimonies.
Previous work made extensive use of ASR-based transcriptions.
Can folk literature also be detected and labeled?
How well does this perform without any ASR/lexical information (the dialects are too sparse to adapt ASR)?
Data Description
The radio shows were broadcast between 1950 and 1980 for the preservation and disclosure of Swiss French dialects from various regions (Fribourg, Valais, Vaud, Jura) as well as neighboring regions of Italy and France.
The corpus includes over 1500 recordings.
A subset of them also includes the show scripts with approximate “story” boundaries as well as a short description, the dialect spoken, the names of the participants and their roles (anchorman, guests, actors).
90 shows were uniformly sampled across the 30 years, thus covering very different acoustic quality in the audio.
The radio broadcast shows contain 60 different story labels including interviews, tales, poetry, recitals, theatrical representations, discussions, bibliographical references.
The story labels from the show scripts are clustered into folk literature (tales, history, legends, recitals, anecdotes, poems, theatre) and conventional broadcast data (introductions, openings, interviews, debates, comments).
Finer classification scheme: Story-telling, Poetry, Theatre, Functional, Interview, Other
Story Label      Number of Stories   Time Percentage
Story-telling    92                  16%
Poetry           67                  10%
Theatre          26                  15%
Functionals      230                 16%
Interview        128                 35%
Others           24                  8%
About 40% of the audio (16% + 10% + 15% = 41%) consists of folk literature.
Summary of structural, stylistic and speaker properties for the different story labels. Speaking style properties as in [Llisterri2002].
Story Label     Story Structure                     Speakers                         Speaking style
Functionals     Monologues                          Professional                     Prompted, Scripted
Interview       Dialogues, Multiparty               Professional, non-Professional   Spontaneous
Story-telling   Monologues                          Professional                     Narrative, Expressive
Poetry          Monologues                          Professional, non-Professional   Narrative, Expressive
Theatre         Monologues, Dialogues, Multiparty   Professional                     Expressive
Structural and stylistic properties have already been used for broadcast segmentation into topics and stories, speaker role recognition and content summarization.
System Description
Assign a label to each speaker turn, in two different tasks:
Folk spoken literature versus conventional broadcast.
Storytelling, Poetry, Theatre, Functional, Interview and Other.
1 Automatic Speaker Turn Extraction/Diarization.
2 Structural Feature Extraction.
3 Acoustic/Prosodic Feature Extraction.
Automatic Speaker Turn Extraction
The raw audio is pre-processed with Wiener-filter de-noising.
19 MFCC coefficients are extracted.
Speech/non-speech detection uses an HMM/GMM, with the GMM trained on broadcast speech.
A speaker diarization system is used to infer “who spoke when”.
A Turn is defined as a speech region from the same speaker uninterrupted by pauses longer than 500 ms, while a Sentence is a speech region from the same speaker uninterrupted by any silence/pause.
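The Turn and Sentence definitions above can be sketched in a few lines. This is only an illustration, assuming the diarization output is a time-sorted list of pause-free (speaker, start, end) speech regions in seconds; the names `Turn` and `group_into_turns` are hypothetical and not taken from the authors' system.

```python
from dataclasses import dataclass, field

MAX_PAUSE_S = 0.5  # pauses longer than 500 ms end a turn (value from the slide)


@dataclass
class Turn:
    speaker: str
    # Each sentence is a (start, end) pair: one pause-free speech region.
    sentences: list = field(default_factory=list)

    @property
    def start(self):
        return self.sentences[0][0]

    @property
    def end(self):
        return self.sentences[-1][1]


def group_into_turns(segments, max_pause=MAX_PAUSE_S):
    """Group time-sorted (speaker, start, end) regions into turns.

    Assumption: every input region is already pause-free, so each region
    counts as one sentence.
    """
    turns = []
    for speaker, start, end in segments:
        last = turns[-1] if turns else None
        same_turn = (
            last is not None
            and last.speaker == speaker
            and start - last.end <= max_pause
        )
        if same_turn:
            last.sentences.append((start, end))
        else:
            turns.append(Turn(speaker, [(start, end)]))
    return turns


segments = [("A", 0.0, 2.0), ("A", 2.3, 4.0), ("A", 5.0, 6.0), ("B", 6.1, 7.0)]
turns = group_into_turns(segments)
```

In this toy input the 0.3 s pause keeps the first two regions in one turn, the 1.0 s pause starts a new turn for the same speaker, and the speaker change starts a third.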
Structural Feature Extraction
Features related to the structure of the show or type of story.
Turn duration, the maximum and the minimum sentence duration in the turn, the number of sentences in the turn, the relative position of the turn in the show, the ratio between the amount of speech and the amount of silence in the turn.
To capture longer-term dependencies, N-grams of quantized consecutive structural features are introduced.
Example: interviews are often a sequence of shorter and longer turns from journalist and interviewees.
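The structural features listed above, and the N-grams built from them, might be computed roughly as follows. This is a sketch under assumptions: a turn is given as time-sorted (start, end) sentence pairs in seconds, and the bin edges used for quantization are made up for illustration, since the slides do not say how features were quantized.

```python
from collections import Counter


def structural_features(turn_sentences, show_duration):
    """Structural features of one turn given its (start, end) sentences."""
    durations = [end - start for start, end in turn_sentences]
    turn_start = turn_sentences[0][0]
    turn_end = turn_sentences[-1][1]
    turn_duration = turn_end - turn_start
    speech = sum(durations)
    silence = turn_duration - speech
    return {
        "turn_duration": turn_duration,
        "max_sentence": max(durations),
        "min_sentence": min(durations),
        "n_sentences": len(durations),
        "relative_position": turn_start / show_duration,
        "speech_silence_ratio": speech / silence if silence > 0 else float("inf"),
    }


def quantize(value, edges):
    """Bin index of value: the number of bin edges at or below it."""
    return sum(value >= edge for edge in edges)


def ngrams(symbols, n):
    """All length-n windows over a sequence of quantized symbols."""
    return [tuple(symbols[i:i + n]) for i in range(len(symbols) - n + 1)]


# Hypothetical bin edges (seconds) for short / medium / long turns.
EDGES = (5.0, 30.0)
turn_durations = [2.0, 40.0, 3.0, 55.0]  # e.g. question, answer, question, answer
symbols = [quantize(d, EDGES) for d in turn_durations]
bigram_counts = Counter(ngrams(symbols, 2))
```

The alternation of short and long turns in the interview example shows up as repeated (short, long) bigrams, which a classifier can use as count features.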
Acoustic/Prosodic Feature Extraction
Features related to acoustic/prosodic speaker behavior are extracted at turn level.
The average speaking rate, the average articulation rate, various F0 statistics (mean, median, minimum, maximum, variance and slope), and minimum, maximum and mean RMS energy (extracted using Praat), computed based on pseudo-syllable estimation.
To capture longer-term dependencies, N-grams of quantized consecutive structural features are introduced.
Example: interviews are often a sequence of professional and non-professional speaker turns.
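As an illustration of the turn-level prosodic features, the sketch below assumes that F0 and RMS values have already been extracted (for example with Praat) and that a pseudo-syllable count is available. It uses the common definitions in which speaking rate includes pauses and articulation rate excludes them; the slides do not spell these out, and the function names are hypothetical.

```python
import statistics


def f0_slope(times, f0):
    """Least-squares slope of F0 (Hz) against time (s)."""
    t_mean = statistics.fmean(times)
    f_mean = statistics.fmean(f0)
    num = sum((t - t_mean) * (f - f_mean) for t, f in zip(times, f0))
    den = sum((t - t_mean) ** 2 for t in times)
    return num / den if den else 0.0


def prosodic_features(n_syllables, turn_duration, pause_duration, times, f0, rms):
    """Turn-level prosodic features; durations in seconds, F0 in Hz."""
    speech_duration = turn_duration - pause_duration
    return {
        # Syllables per second over the whole turn, pauses included.
        "speaking_rate": n_syllables / turn_duration,
        # Syllables per second over speech only, pauses excluded.
        "articulation_rate": n_syllables / speech_duration,
        "f0_mean": statistics.fmean(f0),
        "f0_median": statistics.median(f0),
        "f0_min": min(f0),
        "f0_max": max(f0),
        # Population variance; the slides do not say which estimator was used.
        "f0_variance": statistics.pvariance(f0),
        "f0_slope": f0_slope(times, f0),
        "rms_min": min(rms),
        "rms_max": max(rms),
        "rms_mean": statistics.fmean(rms),
    }


feats = prosodic_features(
    n_syllables=20,
    turn_duration=10.0,
    pause_duration=2.0,
    times=[0.0, 1.0, 2.0, 3.0],
    f0=[100.0, 110.0, 120.0, 130.0],
    rms=[0.1, 0.3, 0.2],
)
```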
Boosting Classifier
In order to integrate continuous and discrete features as well as N-gram counts into the same classifier, a boosting approach is used.
Multi-class Boosting is implemented.
The weak learners are one-level decision trees.
Results are reported in terms of F-measure (Precision/Recall).
Precision and Recall are computed as the percentage of time correctly labeled/retrieved, in order to take into account possible mismatches between the manual boundaries and the speaker diarization output.
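To make the idea of boosting one-level decision trees concrete, here is a minimal sketch of binary discrete AdaBoost with threshold stumps. It is a simplification: the slides describe a multi-class boosting setup whose exact variant is not specified, so this two-class version (for instance folk literature as +1 versus conventional broadcast as -1) and all function names are assumptions for illustration only.

```python
import math


def stump_predict(row, feature, threshold, polarity):
    """One-level tree: polarity if the feature reaches the threshold, else -polarity."""
    return polarity if row[feature] >= threshold else -polarity


def train_stump(X, y, w):
    """Pick the stump with the lowest weighted error; labels are -1 or +1."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for polarity in (1, -1):
                err = sum(
                    wi
                    for row, yi, wi in zip(X, y, w)
                    if stump_predict(row, feature, threshold, polarity) != yi
                )
                if best is None or err < best[0]:
                    best = (err, feature, threshold, polarity)
    return best


def adaboost(X, y, rounds=10):
    """Return a list of (alpha, feature, threshold, polarity) weighted stumps."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, feature, threshold, polarity = train_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, feature, threshold, polarity))
        # Increase the weight of misclassified examples, then renormalize.
        w = [
            wi * math.exp(-alpha * yi * stump_predict(row, feature, threshold, polarity))
            for row, yi, wi in zip(X, y, w)
        ]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def predict(model, row):
    score = sum(a * stump_predict(row, f, t, p) for a, f, t, p in model)
    return 1 if score >= 0 else -1


# Toy data: column 0 is a continuous feature, column 1 an N-gram count.
X = [[1.0, 0], [2.0, 1], [8.0, 0], [9.0, 1]]
y = [-1, -1, 1, 1]
model = adaboost(X, y, rounds=3)
```

Because each stump thresholds a single column, continuous measurements and integer counts can sit side by side in the same feature vector without rescaling, which is the property the slide highlights.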
Experiments
F-measure (Precision/Recall) for folk literature versus conventional broadcast:
Story Label              Prosodic           Structural         Prosodic+Structural
Folk literature          0.80 (0.84/0.77)   0.69 (0.73/0.66)   0.83 (0.87/0.80)
Conventional Broadcast   0.91 (0.89/0.93)   0.87 (0.89/0.85)   0.92 (0.94/0.90)
F-measure (Precision/Recall) for the finer classification scheme:
Story Label     Prosodic           Structural         Prosodic+Structural
Storytelling    0.57 (0.58/0.56)   0.48 (0.47/0.48)   0.57 (0.59/0.55)
Poetry          0.37 (0.42/0.33)   0.28 (0.34/0.24)   0.45 (0.47/0.43)
Theatre         0.72 (0.81/0.65)   0.69 (0.74/0.64)   0.81 (0.84/0.77)
Functionals     0.79 (0.75/0.83)   0.79 (0.77/0.82)   0.83 (0.81/0.84)
Interview       0.83 (0.87/0.85)   0.72 (0.70/0.75)   0.87 (0.84/0.90)
Others          0.14 (0.32/0.09)   0.13 (0.22/0.09)   0.23 (0.35/0.17)
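Since the scores above are computed over time rather than over turn counts, a small sketch may help show what that means. It assumes reference and hypothesis are lists of non-overlapping (start, end, label) segments in seconds and uses the balanced F-measure 2PR/(P+R); the function names are hypothetical and this is not the authors' scoring tool.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def time_prf(reference, hypothesis, label):
    """Time-weighted precision, recall and F-measure for one label."""
    ref = [(s, e) for s, e, lab in reference if lab == label]
    hyp = [(s, e) for s, e, lab in hypothesis if lab == label]
    # Seconds where both reference and hypothesis carry this label.
    correct = sum(overlap(rs, r_end, hs, h_end) for rs, r_end in ref for hs, h_end in hyp)
    hyp_time = sum(e - s for s, e in hyp)
    ref_time = sum(e - s for s, e in ref)
    precision = correct / hyp_time if hyp_time else 0.0
    recall = correct / ref_time if ref_time else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Manual story boundary at 60 s, diarization-based boundary at 50 s.
reference = [(0.0, 60.0, "Poetry"), (60.0, 100.0, "Interview")]
hypothesis = [(0.0, 50.0, "Poetry"), (50.0, 100.0, "Interview")]
poetry = time_prf(reference, hypothesis, "Poetry")
interview = time_prf(reference, hypothesis, "Interview")
```

A 10-second boundary mismatch only costs the 10 seconds that are mislabeled, instead of counting a whole turn as wrong.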
Prosodic features outperform Structural features.
Results are well above chance classification.
Interview / Functionals / Theatre are recognized above 0.8 F-measure.
Confusion between Storytelling / Poetry.
[Figure: Confusion Matrix of Inferred Labels versus Reference Labels for Stories, Poetry, Theatre, Functional, Interview and Other, on a scale from 0 to 1.]
Storytelling and poetry are both characterized by narrative/expressive speaking styles; however, poetry is composed with more attention to rhythm and meter.
Linear regression on consecutive syllables can help quantify rhythm and meter [Kochanski2010].
              Prosodic+Structural   + Prediction Features
Storytelling  0.57 (0.59/0.55)      0.64 (0.65/0.63)
Poetry        0.45 (0.47/0.43)      0.52 (0.57/0.48)
Theatre       0.81 (0.84/0.77)      0.85 (0.87/0.82)
Functionals   0.83 (0.81/0.84)      0.85 (0.83/0.87)
Interview     0.87 (0.84/0.90)      0.88 (0.86/0.91)
Others        0.23 (0.35/0.17)      0.23 (0.35/0.18)
Small improvements in disambiguating Storytelling from Poetry.
Still a lot of confusion between those two classes.
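As an illustration of the regression idea above, here is a minimal sketch, not the authors' implementation. It assumes per-syllable durations (in seconds) are already available, and it uses only the immediately preceding syllable as a predictor; the function name rhythm_predictability and both example sequences are hypothetical. The returned R^2 is the fraction of variance in a syllable's duration explained by the previous syllable, so a strictly metrical sequence scores close to 1 and an irregular one scores lower.

```python
def rhythm_predictability(durations):
    """R^2 of a linear regression predicting each syllable
    duration from the duration of the syllable before it.

    Illustrative assumption: a single-lag predictor on durations only.
    """
    x = list(durations[:-1])  # preceding syllable
    y = list(durations[1:])   # following syllable
    n = len(x)
    if n < 3:
        raise ValueError("need at least four syllables")

    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

    if syy == 0.0:
        return 1.0  # constant targets are trivially predictable
    if sxx == 0.0:
        return 0.0  # a constant predictor explains nothing

    # Ordinary least-squares fit: y ~ intercept + slope * x
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    return 1.0 - sse / syy


# Hypothetical durations: a strict short/long alternation vs. an irregular sequence.
metrical = [0.12, 0.30] * 8
irregular = [0.20, 0.11, 0.35, 0.18, 0.09, 0.27,
             0.31, 0.14, 0.22, 0.40, 0.16, 0.25]
print(rhythm_predictability(metrical))
print(rhythm_predictability(irregular))
```

A score like this could be appended to the prosodic and structural features of each turn; a longer regression context or other syllable properties (pitch, energy) would be natural extensions, but those are assumptions beyond what the slides state.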
Conclusion
Most previous work on cultural heritage archives focused on broadcast interviews/testimonies and made use of ASR words.
Feasibility study on the Archives des parlers patois de la Suisse romande et des regions voisines.
Half of those archives consist of storytelling, poetry, ...
Spoken in several French Swiss dialects, sparse and under-represented.
Can we detect and label folk literature in those archives without any word information?
Turn-based evaluation, with turns obtained using diarization.
Detection based on structural features achieves an F-measure of 0.69.
Detection based on prosodic features achieves an F-measure of 0.80.
Combining the two feature sets achieves an F-measure of 0.83.
Some data types are more easily recognized than others.
Future work: make use of a story/boundary detector and richer feature sets.
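For reference, the F-measure quoted in these results is the harmonic mean of precision and recall. The sketch below assumes that the pair in parentheses in the Experiments table (for example 0.59/0.55 for Storytelling) is precision and recall; the slides do not state the order, but the harmonic mean is symmetric, so the order does not affect the result. The function name f_measure is illustrative.

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0.0:
        return 0.0  # convention when nothing is retrieved correctly
    return 2.0 * precision * recall / (precision + recall)


# Values taken from the Experiments table (Prosodic+Structural column).
print(round(f_measure(0.59, 0.55), 2))  # Storytelling, listed as 0.57
print(round(f_measure(0.35, 0.17), 2))  # Others, listed as 0.23
```

Recomputing from the two-decimal values reproduces most table entries; a few differ by 0.01, presumably because the reported F-measures were computed before rounding.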