The University of South Florida audiovisual phoneme database
v 1.0
Frisch, S.A., Stearns, A.M.,
Hardin, S.A., & Nikjeh, D.A.
University of South Florida
This work was supported by NIH-NIDCD R03 06164
Phoneme Database Project
• Recorded wordlist demonstrating all English phonemes in initial, medial, and final word position (if possible)
• Audiovisual recordings
– Acoustics
– Face video
– Ultrasound of tongue
– Flexible endoscopy of pharynx, larynx
Purpose
• Potential for multimedia tools in teaching phonetics/speech science
• Students have ready access to multimedia computers
• Freeware for acoustic analysis is available
• Need for multimedia resources appropriate for students’ needs
Methods – Recording Parameters
• Ultrasound
– Mid-sagittal image of tongue posture
– Probe in direct contact with jaw (no compressible acoustically transparent standoff)
– No head stabilization
• Digital video camera
– Aimed at angle to front of face
– Shows lip and jaw movement
Methods – Recording Parameters
• Flexible endoscopy
– Shows laryngeal setting (but cannot see glottal cycle)
– Also shows pharyngeal articulation
• Audio recording captured as part of all video recordings, used to synchronize videos with one another
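Synchronizing recordings through their shared audio can be illustrated with a brute-force cross-correlation. The sketch below is only one plausible approach, not the procedure used for the database: it assumes both audio tracks have already been decoded to lists of samples at the same sample rate, and the names estimate_lag and lag_to_seconds are hypothetical.

```python
def estimate_lag(reference, other, max_lag):
    """Estimate how many samples `other` is delayed relative to `reference`.

    Tries every shift from -max_lag to +max_lag and keeps the one with the
    largest cross-correlation score. A positive result means the shared
    sound occurs later in `other` than in `reference`.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, r in enumerate(reference):
            j = i + lag
            if 0 <= j < len(other):
                score += r * other[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def lag_to_seconds(lag_samples, sample_rate_hz):
    """Convert a lag in samples to seconds."""
    return lag_samples / sample_rate_hz


# Toy signals (not real data): the same short burst, delayed by 3 samples.
reference = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0]
other = [0, 0, 0, 0, 0, 1, 2, 1, 0, 0]
lag = estimate_lag(reference, other, max_lag=5)
```

With the estimated lag in hand, one video would be trimmed or padded by the corresponding duration so that both start at the same moment. For real recordings, an FFT-based correlation would be much faster than this nested loop.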
Word List
• Each English phoneme in word initial, word medial, and word final position where allowed by English phonotactics
• Common words used wherever possible
• Some additional gaps in database due to recording difficulties
• See handout for complete list
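The idea of covering each phoneme in initial, medial, and final position can be pictured as a small coverage table. The sketch below is illustrative only: the three words and their transcriptions are made up for the example and are not taken from the database word list, and the function names are hypothetical.

```python
ALL_POSITIONS = {"initial", "medial", "final"}


def position_coverage(lexicon):
    """Map each phoneme to the set of word positions in which it occurs.

    `lexicon` maps a word to its list of phonemes. A phoneme in a
    one-phoneme word is counted as initial only (a simplification).
    """
    coverage = {}
    for phonemes in lexicon.values():
        last = len(phonemes) - 1
        for i, phoneme in enumerate(phonemes):
            if i == 0:
                position = "initial"
            elif i == last:
                position = "final"
            else:
                position = "medial"
            coverage.setdefault(phoneme, set()).add(position)
    return coverage


def missing_positions(coverage):
    """List, per phoneme, the positions that no word in the list covers."""
    return {
        phoneme: sorted(ALL_POSITIONS - seen)
        for phoneme, seen in coverage.items()
        if seen != ALL_POSITIONS
    }


# Hypothetical mini word list with rough transcriptions.
lexicon = {
    "pat": ["p", "æ", "t"],
    "tap": ["t", "æ", "p"],
    "happy": ["h", "æ", "p", "i"],
}
gaps = missing_positions(position_coverage(lexicon))
```

Here /p/ is fully covered, while /t/ still lacks a medial example. A gap reported this way may be a true phonotactic gap (for example, English /h/ does not occur word-finally) rather than a missing recording, so the output needs to be read against the phonotactics.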
Procedure
• Each word was read clearly in isolation
• Considerable pause between each word, with articulators moved back to “neutral” position
Post-Processing
• Video recordings were superimposed to create a single video file showing facial video, endoscopy of larynx, and ultrasound of tongue position
• Noise reduction applied to audio to eliminate machine noise from recording environment
Using the Database
• Recordings can be viewed with freeware Wavesurfer program
• Allows display of common acoustic phonetic analysis windows in conjunction with video image
• Cursor position in acoustic analysis window is tied to the appropriate frame in the video image
• Download from http://www.speech.kth.se/wavesurfer/
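The link between cursor position and video frame comes down to converting a time in seconds to a frame index. The sketch below shows that arithmetic under stated assumptions: it is not taken from the Wavesurfer source, the function name is hypothetical, and the frame rate and frame count are example values rather than properties of the database videos.

```python
def frame_for_time(t_seconds, fps, n_frames):
    """Return the index of the video frame on screen at time t_seconds.

    Frame k is assumed to cover the interval [k / fps, (k + 1) / fps).
    Times past the end of the clip are clamped to the last frame.
    """
    if t_seconds < 0:
        raise ValueError("time must be non-negative")
    if n_frames < 1:
        raise ValueError("video must contain at least one frame")
    index = int(t_seconds * fps)
    return min(index, n_frames - 1)


# Example values only: a 100-frame clip at an assumed 30 frames per second.
frame = frame_for_time(0.5, fps=30.0, n_frames=100)
```

Because a video frame lasts far longer than an audio sample, many cursor positions in the waveform or spectrogram map to the same frame.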
Example 1 – okay
• Ultrasound shows tongue body raised and tongue root pulled forward to produce high front vowel /e/
• Endoscope window shows arytenoid cartilages are approximated and glottis is closed for voicing
• Video clip shows lips pulled apart for unrounded vowel production
• Etc…
Example 2 – voice
• Sample image of diphthong off-glide //
• Cursor positioned on spectrogram at end of diphthong
• Ultrasound shows visible tongue tip and body raising, and advancement of the tongue root
• Face video shows lip spreading and jaw raising
• Endoscope shows approximation of the arytenoids and vocal folds
Conclusion
• Ultrasound and other multimedia tools have great potential to enhance teaching and learning in phonetics/speech science
• Copies of the database, version 1.0, are available in a compressed file archive on CD-ROM
• Additional suggestions for improvements or additions to the database are welcome