Source: tts.speech.cs.cmu.edu/courses/11492/slides/human_speech2.pdf

Speech Processing 11-492/18-492

Human Speech Processing: Phonetics and Phonology

The vocal tract

From meat to voice

• Blow air out from the lungs
• Vibrate the vocal folds in the larynx
• Vocal tract shape defines resonance
• Obstructions modify the sound
  • Tongue, teeth, lips, velum (nasal passage)

The ear

From sound to brain waves

• Sound waves
  • Vibrate the ear drum
  • Cause fluid in the cochlea to vibrate
• Spiral cochlea
  • Vibrates hairs inside the cochlea
  • Different frequencies vibrate different hairs
  • Converts time domain to frequency domain
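
The time-domain to frequency-domain conversion described above is loosely analogous to a discrete Fourier transform. The following is a minimal sketch under stated assumptions, not a model of the cochlea: the function name `dft_magnitudes`, the 8 kHz sample rate, and the two test tones are all illustrative choices that do not come from the slides.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Return the magnitude of each DFT bin from 0 up to the Nyquist bin."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        # Correlate the signal with a complex sinusoid at bin frequency k.
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(samples))
        mags.append(abs(acc))
    return mags

SAMPLE_RATE = 8000  # Hz (assumed for this example)
N = 400             # 50 ms frame, so each bin is 20 Hz wide

# Time-domain signal: a 200 Hz tone plus a weaker 1000 Hz tone.
signal = [math.sin(2 * math.pi * 200 * t / SAMPLE_RATE)
          + 0.5 * math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE)
          for t in range(N)]

mags = dft_magnitudes(signal)
bin_hz = SAMPLE_RATE / N

# The two strongest bins recover the two tones, much as different
# frequencies excite different hairs along the cochlea.
strongest = sorted(range(len(mags)), key=lambda k: mags[k], reverse=True)[:2]
peak_freqs = sorted(k * bin_hz for k in strongest)
print(peak_freqs)
```

The naive DFT is quadratic in the frame length; a real system would use an FFT, but the direct sum keeps the time-to-frequency mapping visible.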

From grunts to meaning

• Grunts and vocalization
  • Lots of variation available (continuous systems, not discrete)
• Noises become distinct, recognizable
  • Grow into languages, dialects and idiolects
• What are the fundamental units?

Articulatory Movements

Electromagnetic Articulograph

Phonemes

• Defined as fundamental units of speech
• If you change one, it (can) change the meaning
  • “pat” to “bat”
  • “pat” to “pam”
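
The "change one unit, change the meaning" test above is the idea of a minimal pair. A small sketch of how it could be checked, assuming hand-written ARPAbet-style transcriptions; the helper `is_minimal_pair` and the tiny lexicon are hypothetical and only cover the slide's three words.

```python
from itertools import combinations

def is_minimal_pair(a, b):
    """True if two phoneme sequences have equal length and differ in exactly one slot."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1

# Transcriptions written by hand for this example (assumed, not from a dictionary).
lexicon = {
    "pat": ["P", "AE", "T"],
    "bat": ["B", "AE", "T"],
    "pam": ["P", "AE", "M"],
}

pairs = [(w1, w2) for w1, w2 in combinations(lexicon, 2)
         if is_minimal_pair(lexicon[w1], lexicon[w2])]
print(pairs)  # "bat"/"pam" is excluded because two phonemes differ
```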

Vowel Space

• One or two banded frequencies (formants)

English (US) Vowels

AA  wAshington        AE  fAt, bAd
AH  bUt, hUsh         AO  lAWn, mAll
AW  hOW, sOUth        AX  About, cAnoe
AY  hIde, bUY         EH  gEt, fEAther
ER  makER, sEARch     EY  gAte, EIght
IH  bIt, shIp         IY  bEAt, shEEp
OW  lOne, nOse        OY  tOY, OYster
UH  fUll              UW  fOOl

English Consonants

• Stops: P, B, T, D, K, G
• Fricatives: F, V, HH, S, Z, SH, ZH
• Affricates: CH, JH
• Nasals: N, M, NG
• Liquids and glides: L, R, Y, W
• Note: voiced vs unvoiced: P vs B, F vs V
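
The voiced/unvoiced note above can be represented as a simple lookup table. In this sketch only the P/B and F/V pairs come from the slide; the remaining pairs are the usual ARPAbet pairings added for illustration, and `toggle_voicing` is a hypothetical helper name.

```python
# Voiceless -> voiced counterparts. Only P/B and F/V appear on the slide;
# the other pairs are standard pairings added here as an assumption.
VOICELESS_TO_VOICED = {
    "P": "B", "T": "D", "K": "G",   # stops
    "F": "V", "S": "Z", "SH": "ZH", # fricatives
    "CH": "JH",                     # affricates
}
VOICED_TO_VOICELESS = {voiced: voiceless
                       for voiceless, voiced in VOICELESS_TO_VOICED.items()}

def toggle_voicing(phone):
    """Return the voicing counterpart of a phone, or the phone itself if it has none."""
    if phone in VOICELESS_TO_VOICED:
        return VOICELESS_TO_VOICED[phone]
    return VOICED_TO_VOICELESS.get(phone, phone)

print(toggle_voicing("P"), toggle_voicing("V"), toggle_voicing("M"))
```

Nasals, liquids and glides map to themselves here because they have no voiceless counterpart in this phone set.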

Number of Phonemes in Language

• US English: 43
• UK English: 44
• Japanese: 25
• Hindi: 81
• Numbers aren’t definite though
  • Depends on who you ask, and what you want it for

Not all variation is Phonetic

• Phonology: linguistically discrete units
  • May be a number of different ways to say them
  • /r/ trill (Scottish or Spanish) vs US way
• Phonetics vs Phonemics
  • Phonemics: discrete units
  • Phonetics: all sounds
• /t/ in US English: becomes “flap”
  • “water” / w ao t er /
  • “water” / w ao dx er /
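
The flap example above can be written as a rewrite rule from the phonemic form to a phonetic form. This is a deliberately simplified sketch: it assumes /t/ flaps whenever it sits between two vowels, whereas the real US English rule also depends on stress. The function name `apply_flapping` is hypothetical; the vowel set is the slide's US English vowel list in lower case.

```python
# Vowel symbols from the English (US) vowel slide, lower-cased to match
# the "w ao t er" transcription style.
VOWELS = {"aa", "ae", "ah", "ao", "aw", "ax", "ay", "eh",
          "er", "ey", "ih", "iy", "ow", "oy", "uh", "uw"}

def apply_flapping(phones):
    """Replace 't' with the flap 'dx' when it occurs between two vowels.

    Simplification: stress is ignored, so this over-applies in words
    where the following vowel is stressed.
    """
    out = list(phones)
    for i in range(1, len(phones) - 1):
        if (phones[i] == "t"
                and phones[i - 1] in VOWELS
                and phones[i + 1] in VOWELS):
            out[i] = "dx"
    return out

print(apply_flapping("w ao t er".split()))
```

A word-initial or word-final /t/ is left alone because it has no vowel on both sides.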

Dialect and Idiolect

• Variation within language (and speakers)
• Phonetic:
  • “Don” vs “Dawn”, “Cot” vs “Caught”
  • R deletion (Haavaad vs Harvard)
• Word choice:
  • Y’all, Yins
  • Politeness levels

Not all languages use the same set

• Aspirated stops (Korean, Hindi)
  • P vs PH
  • English uses both, but doesn’t care
  • Pot vs sPot (place hand over mouth)
• L vs R is not a phonological distinction in Japanese
• US English dialects:
  • Mary, Merry, Marry
• Scottish English vs US English
  • No distinction between “pull” and “pool”
  • Distinction between “for” and “four”

Different language dimensions

• Vowel length
  • Bit vs beat
  • Japanese: shujin (husband) vs shuujin (prisoner)
• Tones
  • F0 (tune) used to distinguish words
  • Chinese, Thai, Burmese
• Clicks
  • Xhosa

Co-articulation

• Voicing actually doesn’t always stop
  • “have honey”, “impossible”
• Nasalized vowels, lip rounding
  • “min” vs “bit”, “sow” vs “see”
• Lexical stress:
  • EMphasis, emPHAsis
  • PROject, proJECT
• Reduction, contraction
  • “A boy is riding a bike”
  • “I want to go to Disneyland.”
  • “I will go tomorrow”

Prosody

• Intonation
  • Tune
• Duration
  • How long or short each phoneme is
• Phrasing
  • Where the breaks are

Intonation (F0)

• Rate of vocal fold vibration during voiced speech
  • Males: 80-140 times a second (Hz)
  • Females: 130-220 times a second (Hz)
  • Children: 180-320 times a second (Hz)
• Used for:
  • Emphasis
  • Style: questions, statements, confidence etc
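
Since F0 is a vibration rate, it can be estimated from a waveform by finding the lag at which the signal best matches a shifted copy of itself. The sketch below uses a plain autocorrelation search, which is one common approach and not necessarily the one used in the course. The names `estimate_f0` and `matching_ranges`, the 8 kHz sample rate, and the synthetic 100 Hz tone are assumptions; the range table is taken from the slide.

```python
import math

# F0 ranges in Hz, as listed on the slide.
F0_RANGES_HZ = {"male": (80, 140), "female": (130, 220), "child": (180, 320)}

def estimate_f0(samples, sample_rate, min_hz=80, max_hz=320):
    """Estimate F0 as sample_rate / lag for the lag with the largest autocorrelation."""
    min_lag = int(sample_rate / max_hz)
    max_lag = int(sample_rate / min_hz)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation at this lag.
        score = sum(samples[t] * samples[t + lag]
                    for t in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

def matching_ranges(f0_hz):
    """Return every speaker group whose range contains f0_hz (ranges overlap)."""
    return [name for name, (lo, hi) in F0_RANGES_HZ.items() if lo <= f0_hz <= hi]

SAMPLE_RATE = 8000  # Hz (assumed)
# Synthetic "voiced" signal: a pure 100 Hz tone, 50 ms long.
tone = [math.sin(2 * math.pi * 100 * t / SAMPLE_RATE) for t in range(400)]

f0 = estimate_f0(tone, SAMPLE_RATE)
print(round(f0), matching_ranges(f0))
```

Because the slide's ranges overlap, a value such as 200 Hz matches both the female and child ranges, so F0 alone does not identify the speaker group.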

Intonation Contour

Intonation Information

• Large pitch range (female)
• Authoritative since it goes down at the end
  • News reader
• Emphasis for Finance H*
• Final has a rise: more information to come
• Female American newsreader from WBUR (Boston University Radio)

Intonation Examples

• Fixed durations, flat F0.
• Declining F0
• “hat” accents on stressed syllables
• Accents and end tones
• Statistically trained
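
The first three examples above can be pictured as increasingly structured F0 contours. The following is a rough sketch only: the slides do not give a method, so the per-frame representation, the function names, the Hz values, and the flat-topped shape used for the "hat" accent are all assumptions made for illustration.

```python
def flat_contour(n_frames, f0_hz):
    """Same F0 value in every frame."""
    return [f0_hz] * n_frames

def declining_contour(n_frames, start_hz, end_hz):
    """F0 falls linearly from start_hz to end_hz across the utterance."""
    if n_frames == 1:
        return [start_hz]
    step = (end_hz - start_hz) / (n_frames - 1)
    return [start_hz + i * step for i in range(n_frames)]

def add_hat_accent(contour, start, end, boost_hz):
    """Raise F0 by boost_hz over frames [start, end), e.g. a stressed syllable."""
    return [f0 + boost_hz if start <= i < end else f0
            for i, f0 in enumerate(contour)]

flat = flat_contour(5, 100.0)
decline = declining_contour(5, 120.0, 100.0)
accented = add_hat_accent(flat, 1, 3, 20.0)
print(flat, decline, accented)
```

The last two items on the slide, accents with end tones and statistically trained models, would need linguistic labels or training data and are not sketched here.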

Words

• Words
  • The things with space around them (sort of)
  • Chinese, Thai, Japanese don’t use spaces
  • Speech doesn’t use spaces
    • Blackboard vs Black Board
• English
  • Morphology: walk, walks, walking, walked
• Japanese
  • Morphology: aruku, arukimasu, arukimashita, aruite, arukitai, arukitakatta, arukemasu, …

Speech Acts

• Words aren’t always what they seem
  • Can you pass the salt?
  • Boston. Boston! Boston?
  • Yeah, right
• Multiple ways to say the same thing:
  • I want to go to Boston.
  • Yes

Human Speech

• Human production and perception
  • Quite different from computers
• Phonology
  • Defining the alphabet of speech
  • Different languages make different distinctions
• Intonation
  • How it’s said
