CS 551/652: Structure of Spoken Language Lecture 2: Spectrogram Reading and Introductory Phonetics John-Paul Hosom Fall 2008

Page 2: Phonetics

Spectrogram Reading

Why bother??

What’s the point of spectrogram reading? Do people read spectrograms as part of their job? Do computers “read” spectrograms in order to recognize speech?

There are some jobs that require spectrogram reading (e.g. phonetic time alignment), but not many. Automatic speech recognition systems do not process speech in this way.

Primary reason for spectrogram reading: If you’re going to work on a problem, it’s advisable to understand the nature of that problem. Spectrogram reading provides a direct method for “hands-on” learning of the characteristics of speech. Studying phonetics, signal processing, or techniques in speech recognition/speech synthesis does not fully convey the complexity and structure of spoken language.

Page 3: Phonetics

A great website on spectrogram reading:

http://home.cc.umanitoba.ca/~robh/

includes “how to” tips on spectrogram reading, a monthly “mystery spectrogram”, and archives of past months’ spectrograms.

Page 4: Phonetics

Phonetics: Introduction

Phonology: A description of the systems and patterns of sounds that occur in a language (abstract), often involving comparisons between languages and/or evolution of a language over time.

Phonetics: A branch of phonology that deals with individual speech sounds, their production, and their written representation.

Phoneme:
• A unit of speech that can be used to differentiate words (e.g. “cat” /k ae t/ vs. “bat” /b ae t/).
• Phonemes identify minimal pairs in a language.
• The set of phonemes in a language is subject to interpretation; most languages have 20 to 40 phonemes.
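
The idea of a minimal pair can be made concrete with a small sketch. It assumes words are stored as lists of phoneme symbols in the notation used on these slides; the toy lexicon and the function names are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy lexicon; pronunciations use the slide notation.
LEXICON = {
    "cat": ["k", "ae", "t"],
    "bat": ["b", "ae", "t"],
    "bit": ["b", "ih", "t"],
    "tea": ["t", "iy"],
}

def is_minimal_pair(a, b):
    """True if two phoneme sequences have equal length and differ in exactly one position."""
    if len(a) != len(b):
        return False
    return sum(x != y for x, y in zip(a, b)) == 1

def minimal_pairs(lexicon):
    """Return sorted (word1, word2) tuples whose pronunciations form a minimal pair."""
    return sorted(
        (w1, w2)
        for (w1, p1), (w2, p2) in combinations(sorted(lexicon.items()), 2)
        if is_minimal_pair(p1, p2)
    )

print(minimal_pairs(LEXICON))  # [('bat', 'bit'), ('bat', 'cat')]
```

Here “cat”/“bat” differ only in the first phoneme and “bat”/“bit” only in the vowel, so /k/ vs. /b/ and /ae/ vs. /ih/ each distinguish words in this toy lexicon.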

Page 5: Phonetics

Phonetics: Introduction

Allophone: A speech sound constituting one of the systematic phonetic variants of a given phoneme. Different allophones are predictable from environment (e.g. “toe”, “caught”, “fitness”, “writer”; “sill”, “still”, “spill”)

Phone: An acoustic realization of a phoneme. (Many different phones may represent the same phoneme.)

“The phoneme /s/ consists of more than 100 allophones” − Pickett, The Acoustics of Speech Communication, p. 7.

Phonemes indicated by / /; phones (allophones) indicated by [ ].

Page 6: Phonetics

Phonetics: Introduction

Syllable:
• Unit of speech containing one or more phonemes.
• A vowel in a syllable is called the syllable nucleus.
• Most syllables contain one vowel (or diphthong); some contain only a lateral (“bott/le”) or nasal (“butt/on”) as the most intense sound.
• Syllable boundaries are sometimes ambiguous (“tas/ty” vs. “tast/y” vs. “ta/sty”).

Coarticulation: The “blending” of two or more adjacent phones, causing a non-distinct boundary between them. Coarticulation is caused by smooth changes in the articulators (lips, tongue, jaw) over time.

Page 7: Phonetics

Phonetics: Introduction

Coarticulation Example:

“you are”: /y uw aa r/

Page 8: Phonetics

Phonetics: Introduction (adapted from Schane, p. 4-6)

• Speech signal is continuous; we perceive discrete entities. (How many sound units are in the word “cat”?)

• One assumption of phonology: utterances can be represented as a sequence of discrete units.

• Are such units purely an “invention” of linguistics? Spoonerisms (“belly jeans” vs. “jelly beans”) and rhymes indicate small units of language. (Spoonerisms are named after Reverend William Archibald Spooner, 1844-1930.)

• Utterances of the same word(s) have many differences… we’re usually only interested in those differences that are “linguistically significant” or that are “perceived as different”.

• Implies a somewhat subjective nature to phonology, whereas we want an objective measure of perceived or produced units.
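
The spoonerism example suggests that speakers manipulate phoneme-sized units. A minimal sketch of that manipulation, assuming each word is a list of phoneme symbols in the slide notation and that only the first phoneme of each word is exchanged (the function name `spoonerize` is hypothetical):

```python
def spoonerize(first, second):
    """Swap the initial phonemes of two words given as phoneme lists."""
    if not first or not second:
        raise ValueError("both words need at least one phoneme")
    return [second[0]] + first[1:], [first[0]] + second[1:]

jelly = ["jh", "eh", "l", "iy"]
beans = ["b", "iy", "n", "z"]

# "jelly beans" becomes "belly jeans"
print(spoonerize(jelly, beans))
```

The swap only works cleanly because the utterance is treated as a sequence of discrete units, which is the assumption of phonology stated in the list above.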

Page 9: Phonetics

Phonetics: Distinctive Phonetic Features

• Phonemes do not differ randomly from one another; there are relationships among phonemes (e.g. /p/ vs. /t/ vs. /ah/)

• A (distinctive) feature is a “phonetic property that can be used to classify sounds” [Ladefoged, p. 42]

• Typically, features are associated with aspects of articulation

• Features may be binary or multi-valued

• Capital letters indicate feature name: Manner
  square brackets [] indicate feature value: [+fricative]

Page 10: Phonetics

Phonetics: Distinctive Phonetic Features

• Exact set of features and feature values depends on goals (no “right” or “wrong” set of features or values)

• Distinctive features provide a vocabulary for describing speech

• Are distinctive features purely an “invention” of linguistics? Memory tasks show that when people forget a phoneme, they usually remember a phoneme with similar distinctive features.

Page 11: Phonetics

Phonetics: Distinctive Phonetic Features

[Figure: The Speech Production Apparatus (from Olive, p. 23). Labeled parts: nasal tract, (hard) palate, oral tract, velum (soft palate), velic port, tongue, tongue tip, pharynx, glottis (vocal folds and space between vocal cords), vocal folds (larynx) = vocal cords, alveolar ridge, lips, teeth.]

Page 12: Phonetics

Phonetics: Distinctive Phonetic Features*

Feature       Description
Consonantal   produced with a constriction along center line of oral cavity. Only vowels, /w/, /h/, and /y/ are not.
Vocalic       largely unobstructed vocal tract. Vowels and liquids (/l/, /r/) are vocalic; glides (/w/, /y/) are not.
Anterior      point of articulation near alveolar ridge, including all labial and dental sounds.
Coronal       articulation involves front of tongue
Continuant    no complete obstruction in oral cavity; only nasals, stops, and affricates are non-continuant
Strident      articulation with long, narrow constriction; such as /s/, /z/, /f/, /v/, /sh/, /zh/, /ch/, /jh/
Voiced        vibration of the vocal folds occurs during articulation

Page 13: Phonetics

Phonetics: Distinctive Phonetic Features*

Feature   Description
Lateral   contact between corona of tongue and roof of mouth, with lowering of sides of tongue (only /l/ in English)
Nasal     lowering of the velum, which opens the velic port to the nasal cavity.
High      vowel with high tongue position (narrow constriction); in English, /iy/, /ih/, /uh/, /uw/
Low       vowel with low tongue position (no constriction); /ae/, /ao/, /aa/ are (some) low vowels in English.
Back      vowel produced with tongue toward back of mouth; /uw/, /uh/, /ah/, /ao/, /aa/, /ow/ are back vowels
Round     articulation involving rounding of the lips; only /uw/, /ow/, /ao/, and /uh/ are rounded in English. However, /uh/ may take an unrounded form.

*Adapted from “Language” by C.E.Cairns and F. Williams in Normal Aspects of Speech, Hearing, and Language, edited by Minifie, Hixon, and Williams, 1973, p. 424, as printed in Daniloff p. 51.
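
Binary features like those tabulated above give a simple way to quantify how similar two phonemes are. The sketch below is one possible encoding, not a method taken from the cited sources: it uses five of the features and a handful of consonants, with values filled in from the descriptions above, and counts differing features as a distance. The names `FEATURES`, `PHONEMES`, `differing_features`, and `feature_distance` are hypothetical.

```python
FEATURES = ("consonantal", "continuant", "voiced", "nasal", "strident")

# True = [+feature], False = [-feature]; values follow the feature descriptions
# (stops and nasals are non-continuant; /s/ and /z/ are strident).
PHONEMES = {
    "p": (True, False, False, False, False),
    "b": (True, False, True, False, False),
    "m": (True, False, True, True, False),
    "s": (True, True, False, False, True),
    "z": (True, True, True, False, True),
}

def differing_features(a, b):
    """Names of the features on which phonemes a and b disagree."""
    return [
        name
        for name, x, y in zip(FEATURES, PHONEMES[a], PHONEMES[b])
        if x != y
    ]

def feature_distance(a, b):
    """Number of features on which phonemes a and b disagree."""
    return len(differing_features(a, b))

print(differing_features("p", "b"))  # ['voiced']
print(feature_distance("p", "z"))    # 3
```

Under this encoding /p/ and /b/ are one feature apart while /p/ and /z/ are three apart, which matches the intuition that phonemes with similar distinctive features are more easily confused.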

Page 14: Phonetics

Phonetics: More Distinctive Phonetic Features*

Feature       Description
Sonorant      “resonant quality” of a sound; vowels are +sonorant, stops and fricatives are –sonorant. Nasals are also sonorant.
Syllabic      is the phoneme the main sound in a syllable? Vowels are syllabic, stops are usually –syllabic, but there are syllabic nasals and liquids.
Tense         tense vowels are longer, more fully articulated, and more “distinct,” e.g. /iy ey uw ow aa/; lax vowels are less so, e.g. /ih eh uh ah/.
Aspirated     produced without a constriction in the vocal tract, but also without voicing (/h/).
Glottalized   produced with aperiodic or extremely low-frequency vibrations of the vocal cords.
Diphthong     a single phoneme composed of two or more other phonemes in sequence (/ay/, /oy/, /ey/, /aw/, /ow/)

* from Schane, pp. 26-32

Page 15: Phonetics

Phonetics: Distinctive Phonetic Features

Physiological Features:
• Manner
  stop /p/, fricative /s/, affricate /ch/, liquid /l/, /r/, glide /y/, /w/, nasal /m/, vowel /ah/, aspiration /h/

• Place
  bilabial /p/, labiodental /f/, dental /th/, alveolar /t/, palato-alveolar /r/, palatal /sh/, velar /k/, glottal /h/, front /iy/, mid /ah/, back /aa/ (can combine mid + back)

• Height
  high /iy/, mid-high /ih/, mid /ax/, mid-low /eh/, low /aa/
  or high /iy/, mid /eh/, low /aa/ (3 values, plus tense/lax)

• Tenseness, Nasality, Rounding
  same as previous descriptions

Page 16: Phonetics

Phonetics: Distinctive Feature Relationships: Vowels

        Front                   Back
        Unrounded   Rounded     Unrounded   Rounded
High    i (iy)      ü           ɨ (ix)      u (uw)
Mid     e (eh)      ö           ^ (ah)      o (ow)
Low     æ (ae)      œ           a (aa)      ɔ (ao)

        Front, –Round      Back, +Round       Back, –Round
        Tense    Lax       Tense    Lax       Tense    Lax
High    iy       ih        uw       uh                 ix
Mid     ey       eh        ow                          ah, ax†
Low              ae        ao                 aa

* from Schane, pp. 12-13. †/ax/ is slightly more centralized than /ah/, and shorter in duration

Page 17: Phonetics

Phonetics: Distinctive Phonetic Features: The Case of /ae/

• /ae/ is classified in the preceding table as “lax”, but we have been considering it as “tense”.

• One Rule for Differentiating Tense/Lax: A lax vowel can never be a word-final stressed vowel

e.g. /iy/ can be word final: “be” /b iy/, “tea” /t iy/
     /ih/ cannot be word final in a one-syllable word: /b ih/, /t ih/
     /ah/ can be word final, but only if unstressed.

• According to this rule, both /eh/ and /ae/ are lax, because they can not be word-final stressed vowels. In this case, the tense vowel in contrast to /eh/ is /ey/.

• However, /ae/ is long in duration (e.g. Forgie and Forgie (1959) and Peterson and Lehiste (1960)), making it acoustically more similar to a tense vowel.

• For spectrogram reading, we’re more concerned with acoustics, so we’ll call /ae/ a tense vowel, although others may call it lax.
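
The word-final rule can be checked mechanically when pronunciations carry stress marks. The following is a minimal sketch, assuming CMU-dictionary-style symbols in which each vowel ends in a stress digit (0 = unstressed, 1 = primary, 2 = secondary). The function names, the set `LAX_BY_RULE`, and the choice to count secondary stress as stressed are assumptions of this sketch.

```python
# Vowels that pattern as lax under the word-final rule (/ae/ included, per the rule).
LAX_BY_RULE = {"ih", "eh", "uh", "ah", "ae"}

def final_stressed_vowel(pron):
    """Return the word-final vowel (lowercase, no stress digit) if it is stressed, else None."""
    if not pron:
        return None
    last = pron[-1]
    if last[-1] in "12":  # assumption: primary and secondary stress both count
        return last[:-1].lower()
    return None

def breaks_lax_rule(pron):
    """True if the word ends in a stressed vowel that the rule classifies as lax."""
    return final_stressed_vowel(pron) in LAX_BY_RULE

print(breaks_lax_rule(["B", "IY1"]))               # False: tense /iy/ may end a word
print(breaks_lax_rule(["S", "OW1", "F", "AH0"]))   # False: final /ah/ is unstressed
print(breaks_lax_rule(["B", "IH1"]))               # True: stressed word-final /ih/
```

A word that returns True here would be an exception to the rule, which is why such endings are expected to be rare in a pronunciation dictionary.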

Page 18: Phonetics

Phonetics: Distinctive Phonetic Features: The Case of /ae/

• Looking at 130,000 words in the CMU dictionary:

PHN    CNT     PCNT      EXAMPLES
/iy/   12945   0.10002
/ih/   15      0.00012   “chui”, “des”, “kiwani”, “lui”, “moishe”, “pih”, “to”
/eh/   30      0.00023   “bienvenue”, “des”, “eh”, “moshe”, “yahweh”, “zeh”
/ae/   5       0.00004   “dhaka”, “lashua”, “losoya”, “pah”, “yeah”
/uw/   714     0.00552
/uh/   2       0.00002   “l’heureux”, “milieu”
/ah/   6413    0.04955
/aa/   170     0.00131
/ao/   243     0.00188
/ey/   962     0.00743
/ay/   379     0.00293
/oy/   167     0.00129
/yu/   171     0.00132
/aw/   226     0.00175
/ow/   5137    0.03969
               0.21280   21% of words end in vowel/diphthong
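
A tally of this kind could be produced along the following lines. This is a sketch, not the script used for the lecture: it assumes the plain-text CMU dictionary layout (one entry per line, the word followed by its phonemes, vowels ending in a stress digit, comment lines starting with `;;;`), and the sample entries are illustrative rather than a real excerpt. The function name `final_vowel_counts` is hypothetical.

```python
from collections import Counter

SAMPLE = """\
;;; illustrative entries in CMU-dictionary style (not a real excerpt)
BE  B IY1
TEA  T IY1
CAT  K AE1 T
SOFA  S OW1 F AH0
YEAH  Y AE1
TO  T UW1
"""

def final_vowel_counts(text):
    """Count word-final vowels; return (Counter of vowels, number of words)."""
    counts = Counter()
    n_words = 0
    for line in text.splitlines():
        if not line.strip() or line.startswith(";;;"):
            continue  # skip blank lines and comments
        word, *phones = line.split()
        if not phones:
            continue  # malformed entry with no pronunciation
        n_words += 1
        last = phones[-1]
        if last[-1].isdigit():  # only vowels carry a stress digit
            counts[last[:-1].lower()] += 1
    return counts, n_words

counts, n_words = final_vowel_counts(SAMPLE)
for vowel, count in counts.most_common():
    print(f"/{vowel}/ {count} {count / n_words:.5f}")
```

Dividing each count by the total number of words gives the PCNT-style fraction; restricting the tally to stressed final vowels would only require checking the stress digit before counting.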

Page 19: Phonetics

Phonetics: Distinctive Feature Relationships: Vowels

[Figure: vowel chart with horizontal axis Front, Central, Back and vertical axis High, Mid, Low, showing iy, ih, eh, ae, ah, aa, ao, uh, uw, ix, ax, ju, ey, ay, aw, ow, oy]

from Ladefoged, pp. 38, 81, 218 with correction to /aw/

Page 20: Phonetics

Phonetics: Distinctive Feature Relationships: Consonants

Place of articulation (columns, front to back): bilabial, labio-dental, dental, alveolar, palato-alveolar, palatal, velar, glottal

Manner       Voicing   Phonemes (in order of place)
stops        +voice    b  d  g
             -voice    p  t  k
fricatives   +voice    v  dh  z  zh
             -voice    f  th  s  sh  h
affricates   +voice    jh
             -voice    ch
nasals       +voice    m  n  ng
glides       +voice    w  y  (w)
retroflex    +voice    r
lateral      +voice    l

Obstruents: stops, fricatives, affricates. Approximants: glides, retroflex, lateral.

from Olive, p. 28 and Daniloff, p. 56
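
One relationship the consonant table makes visible is that obstruents come in voiced/voiceless pairs sharing manner and place. A small sketch of that lookup, assuming the rows are stored in the order printed in the table so that position in a row identifies the place of articulation; /h/ is left out because the table gives it no voiced partner. The names `OBSTRUENTS` and `voicing_counterpart` are hypothetical.

```python
# Rows of the consonant table for obstruents, in the printed order.
# /h/ is omitted: it has no voiced counterpart in the table.
OBSTRUENTS = {
    ("stop", "+voice"): ["b", "d", "g"],
    ("stop", "-voice"): ["p", "t", "k"],
    ("fricative", "+voice"): ["v", "dh", "z", "zh"],
    ("fricative", "-voice"): ["f", "th", "s", "sh"],
    ("affricate", "+voice"): ["jh"],
    ("affricate", "-voice"): ["ch"],
}

def voicing_counterpart(phoneme):
    """Return the phoneme with the same manner and place but opposite voicing, or None."""
    for (manner, voicing), row in OBSTRUENTS.items():
        if phoneme in row:
            other = "-voice" if voicing == "+voice" else "+voice"
            return OBSTRUENTS[(manner, other)][row.index(phoneme)]
    return None  # nasals, glides, /r/, /l/ and /h/ have no pair here

print(voicing_counterpart("z"))   # s
print(voicing_counterpart("ch"))  # jh
print(voicing_counterpart("m"))   # None
```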

Page 21: Phonetics

Phonetics: Distinctive Feature Relationships: Consonants

Place (columns): Labial, Coronal (+anterior, -anterior), Dorsal

Manner        Features               Phonemes (in order of place)
stop          -sibilant, +nasal      m  n  ng
              -sibilant, -nasal      p b  t d  k g
              +sibilant              ch jh
fricative     +sibilant              s z  sh zh
              -sibilant              f v  th dh
approximant   -lateral               w  r  y
              +lateral               l

from Ladefoged, p. 44

Page 22: Phonetics

Approximants: Terminology

• “Approximants” are NOT the same as “Semi-Vowels” (although Rabiner states they are the same…). American English /r/ is debatable, but we’ll exclude it from the Semi-Vowels for consistency. (Ladefoged p. 229)

• Approximants can be divided into two groups: Liquids and Glides. Liquid = {/l/, /r/}, Glide = {/w/, /y/}. (Again, Rabiner confuses things by mixing up these sets.)

• Lateral = {/l/}

• Retroflex = {/r/, /er/, /axr/}. (In some cases, /er/ is considered a retroflex but /r/ isn’t; we’ll keep things simple by calling /r/ a retroflex).

• Central Approximants = {/r/, /w/, /y/}, Lateral Approximant = {/l/}
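
The set definitions above can be written down directly and checked for consistency. A minimal sketch using Python sets, with the memberships copied from the list above; the names `GROUPS`, `APPROXIMANTS`, and `labels` are hypothetical.

```python
GROUPS = {
    "liquid": {"l", "r"},
    "glide": {"w", "y"},
    "lateral": {"l"},
    "retroflex": {"r", "er", "axr"},
    "central approximant": {"r", "w", "y"},
    "lateral approximant": {"l"},
}

# Approximants are the liquids plus the glides.
APPROXIMANTS = GROUPS["liquid"] | GROUPS["glide"]

def labels(phoneme):
    """Sorted list of every group that contains the phoneme."""
    return sorted(name for name, members in GROUPS.items() if phoneme in members)

# The central/lateral split covers exactly the same phonemes as liquid/glide.
assert APPROXIMANTS == GROUPS["central approximant"] | GROUPS["lateral approximant"]

print(labels("r"))  # ['central approximant', 'liquid', 'retroflex']
print(labels("l"))  # ['lateral', 'lateral approximant', 'liquid']
```

Note that /er/ and /axr/ are retroflex but are not in the approximant set under these definitions, so the retroflex group and the approximant group only overlap in /r/.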

Page 23: Phonetics

Approximants: Terminology

Approximant
    Semi-Vowel / Glide: /y/ /w/ (central approximants)
    Liquid
        Retroflex: /r, er, axr/ (central approximants)
        Lateral: /l/ (lateral approximant)