speech processing basics

Speech Processing

• Fundamentals of Digital Speech Processing

1. Anatomy and physiology of speech organs

2. The process of speech production

3. The acoustic theory of speech production

4. Digital models for speech signals

Applications of Speech Processing

• 1. Speech recognition: speech to text

• 2. Speech understanding: the meaning matters rather than the exact words (for example, speech translation)

• 3. Speech synthesis: text to speech, so the computer can speak to you

• 4. Word processing: check and correct spelling, grammar, and style

• 5. Text prediction: speed up word processing

• 6. Automatic summarization: topic identification, summary generation

• 7. Text mining: extracting the necessary data

• Anatomy: the study of the structure of the bodies of people or animals.

• Physiology: the study of how the bodies of people and animals function. For speech, this includes understanding the higher-order mechanisms within the human central nervous system that account for speech production.

• Acoustics: the scientific study of sound.

• Phonetics: the study of the sounds of words and of the sounds used in languages.

• Phoneme: the smallest unit of sound that is significant in a language.

• Articulation: the action of producing a sound or word clearly, in speech or music.

• Linguistics: the study of the way in which language works.

• Semantics: the branch of linguistics that deals with the meanings of words and sentences.

Speech Processing

[Figure: speech processing and the fields it draws on]

• Signal processing: Fourier transforms, discrete-time filters, AR(MA) models

• Information theory: entropy, communication theory, rate-distortion theory

• Statistical SP: stochastic models

• Acoustics: psychoacoustics, room acoustics, speech production

• Phonetics

• Algorithms (programming)

ASR: Application

[Figure: block diagram of automatic speech recognition. Blocks shown: voice input, analog-to-digital conversion, acoustic model, language model, speech engine (recognition), display, feedback.]

Speech Generation

• First, the talker formulates a message (in his or her mind) to transmit to the listener via speech.

• Message formulation can be thought of as creating printed text that expresses the words of the message.

• The next step is converting the message into a language code.

• This roughly corresponds to converting the printed text of the message into a sequence of phonemes corresponding to the sounds that make up the words, together with the pitch accents associated with those sounds.

• Once the language code is chosen, the talker must execute a series of neuromuscular commands that cause the vocal cords to vibrate when appropriate and shape the vocal tract so that the proper sequence of speech sounds is created and spoken. The final output is an acoustic signal.

Speech Recognition

• First, the listener processes the acoustic signal along the basilar membrane in the inner ear, which provides a running spectrum analysis of the incoming signal.

• The neural activity along the auditory nerve is converted into a language code at higher processing centers within the brain, and message comprehension is achieved.
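The "running spectrum analysis" mentioned above has a rough engineering analogue in short-time Fourier analysis: the signal is cut into overlapping frames and the spectrum of each frame is computed in turn. The sketch below is only that analogue, not a model of the inner ear. The function names, the frame length, the hop size, the 8 kHz sampling rate, and the synthetic two-tone test signal are all assumptions chosen for illustration.

```python
import cmath
import math

def hann(n):
    """Hann window of length n, used to taper each frame."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def dft_magnitudes(frame):
    """Magnitude of the DFT for bins 0..N/2 (direct O(N^2) evaluation)."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]

def running_spectrum(signal, frame_len=64, hop=32):
    """Spectrum of each overlapping, windowed frame, in time order."""
    window = hann(frame_len)
    spectra = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        spectra.append(dft_magnitudes(frame))
    return spectra

fs = 8000  # assumed sampling rate in Hz
# Synthetic input: a 500 Hz tone followed by a 2000 Hz tone.
signal = [math.sin(2 * math.pi * 500 * t / fs) for t in range(256)]
signal += [math.sin(2 * math.pi * 2000 * t / fs) for t in range(256)]

spectrum = running_spectrum(signal)
# Frequency of the strongest bin in each frame (bin spacing is fs / frame_len).
peak_hz = [max(range(len(m)), key=m.__getitem__) * fs / 64 for m in spectrum]
print(peak_hz)
```

The list of peak frequencies tracks the change in the input over time, which is the sense in which the analysis is "running".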

• The lungs and the associated muscles act as the source of air for exciting the vocal mechanism.

• Muscle force pushes air out of the lungs (shown as a piston pushing up within a cylinder) and through the bronchi and trachea.

• When the vocal cords are tensed, the air flow causes them to vibrate, producing so-called voiced speech sounds.

• When the vocal cords are relaxed, the air flow must pass through a constriction in the vocal tract to produce a sound; it thereby becomes turbulent, producing so-called unvoiced speech sounds.
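The two excitation modes described above can be imitated with a very simple digital model: a periodic pulse train stands in for vibrating vocal cords, random noise stands in for turbulent air flow, and a filter stands in for the vocal tract. The sketch below is a toy illustration under those assumptions. The single two-pole resonator, the resonance frequencies and bandwidths, the 8 kHz sampling rate, and all function names are hypothetical choices, not values from these slides.

```python
import math
import random

FS = 8000  # assumed sampling rate in Hz

def impulse_train(n, period):
    """Voiced excitation: one unit pulse every `period` samples."""
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]

def white_noise(n, seed=0):
    """Unvoiced excitation: Gaussian noise standing in for turbulent air flow."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

def resonator(x, freq_hz, bandwidth_hz, fs=FS):
    """Two-pole filter y[n] = x[n] + a1*y[n-1] + a2*y[n-2] with one resonance."""
    r = math.exp(-math.pi * bandwidth_hz / fs)  # pole radius, below 1 so stable
    a1 = 2.0 * r * math.cos(2.0 * math.pi * freq_hz / fs)
    a2 = -r * r
    y, y1, y2 = [], 0.0, 0.0
    for sample in x:
        y0 = sample + a1 * y1 + a2 * y2
        y.append(y0)
        y1, y2 = y0, y1
    return y

# Voiced: pulses every 80 samples (100 Hz at 8 kHz) through a 500 Hz resonance.
voiced = resonator(impulse_train(800, period=80), 500.0, 100.0)
# Unvoiced: noise through a higher, broader resonance.
unvoiced = resonator(white_noise(800), 2500.0, 400.0)
print(len(voiced), len(unvoiced))
```

Once the start-up transient has decayed, the voiced output repeats every 80 samples, while the unvoiced output stays noise-like. A real vocal tract has several resonances that change over time, which this sketch does not attempt to capture.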

Classifications

• 1. Silence (S): no speech is produced.

• 2. Unvoiced (U): the vocal cords are not vibrating, so the speech signal is aperiodic or random in nature.

• 3. Voiced (V): the vocal cords vibrate periodically as air flows from the lungs, so the speech signal is approximately periodic.
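One common way to approximate the three classes above on short frames is to combine two simple measurements: short-time energy (very low for silence) and zero-crossing rate (high for noise-like unvoiced sounds, low for periodic voiced sounds). The sketch below assumes this approach; the slides do not prescribe it. The thresholds `energy_floor` and `zcr_split`, the function names, and the synthetic test frames are illustrative assumptions, and real speech would need thresholds tuned to the recording.

```python
import math
import random

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def classify_frame(frame, energy_floor=1e-4, zcr_split=0.25):
    """Return 'S', 'U', or 'V'. Both thresholds are illustrative assumptions."""
    if short_time_energy(frame) < energy_floor:
        return "S"
    return "U" if zero_crossing_rate(frame) > zcr_split else "V"

fs = 8000  # assumed sampling rate in Hz
n = 240    # 30 ms frame at 8 kHz

# Synthetic stand-ins for the three classes.
silence = [0.0] * n
voiced = [0.5 * math.sin(2 * math.pi * 120 * t / fs) for t in range(n)]
rng = random.Random(0)
unvoiced = [0.1 * rng.gauss(0.0, 1.0) for _ in range(n)]

labels = [classify_frame(f) for f in (silence, voiced, unvoiced)]
print(labels)
```

Energy is checked first so that near-silent frames are not labeled by the zero-crossing rate of background noise.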

Speech Waveform Characteristics

• Loudness

• Voiced/Unvoiced.

• Pitch.

– Fundamental frequency.

• Spectral envelope.

– Formants.
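Pitch is tied to the fundamental frequency of a voiced frame, and one widely used way to estimate it is to find the lag at which the frame best matches a shifted copy of itself (the autocorrelation peak). The sketch below assumes that method; the slides do not name one. The search range of 60 to 400 Hz, the 8 kHz sampling rate, the function name `estimate_f0`, and the synthetic test frame are assumptions for illustration.

```python
import math

def estimate_f0(frame, fs, f_min=60.0, f_max=400.0):
    """Estimate the fundamental frequency (Hz) from the autocorrelation peak.

    Only lags corresponding to f_min..f_max are searched (assumed range).
    """
    lag_min = int(fs / f_max)
    lag_max = int(fs / f_min)
    best_lag, best_value = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Unnormalized autocorrelation of the frame at this lag.
        value = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return fs / best_lag

fs = 8000  # assumed sampling rate in Hz
# Synthetic voiced frame: a 100 Hz fundamental plus two weaker harmonics.
frame = [
    math.sin(2 * math.pi * 100 * t / fs)
    + 0.5 * math.sin(2 * math.pi * 200 * t / fs)
    + 0.3 * math.sin(2 * math.pi * 300 * t / fs)
    for t in range(400)
]

f0 = estimate_f0(frame, fs)
print(f0)
```

The frame must be longer than the largest lag searched (here 133 samples), and the method is only meaningful on voiced frames, since unvoiced frames have no clear period.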

Speech Waveform Characteristics (cont.)

[Figure: example waveforms of voiced speech (/ih/) and unvoiced speech (/s/)]

Phoneme Hierarchy

Speech sounds

• Vowels: iy, ih, ae, aa, ah, ao, ax, eh, er, ow, uh, uw

• Diphthongs: ay, ey, oy, aw

• Consonants

– Plosive: p, b, t, d, k, g

– Nasal: m, n, ng

– Fricative: f, v, th, dh, s, z, sh, zh, h

– Retroflex liquid

– Lateral liquid

The phoneme set is language dependent; there are about 50 in English.

Signal processing

Digital speech processing

• Speech signals are composed of a sequence of sounds.

• The arrangement of these sounds is governed by the rules of language. The study of these rules and their implications in human communication is the domain of linguistics.

• The study and classification of the sounds of speech is called phonetics.
