CASSI Speech Recognition: Adding Speech Recognition to Embedded Devices
by
G.V.S.GIREESH (05981A0444), B.Tech Final year, Electronics & Communications Engineering
RAGHU ENGINEERING COLLEGE
INTRODUCTION
What is CASSI?
CASSI stands for Conversay Advanced Symbolic Speech Interface.
It can be used in a variety of embedded systems.
It runs on either single or dual-processor hardware designs.
CASSI provides continuous, speaker-independent speech recognition.
Conversay developers and customers write application code that uses the CASSI API to integrate speech recognition and text-to-speech (TTS) capability into embedded products.
What is TTS?
Text-To-Speech (TTS):
CASSI contains two modules for performing TTS: Rosetta and a TTS synthesis module.
Rosetta, the text-to-phonetics unit, accepts arbitrary written text as input and outputs a string of phonemes for CASSI to synthesize.
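The text-to-phonetics step can be pictured with a minimal Python sketch. It is not Rosetta or the CASSI API: the lexicon, the ARPAbet-style phoneme symbols, and the function name `text_to_phonemes` are all assumptions made for illustration. A real unit that accepts arbitrary text would also need letter-to-sound rules for words missing from the lexicon.

```python
# Hypothetical toy lexicon. The ARPAbet-style symbols are an assumption,
# not Rosetta's actual output format.
TOY_LEXICON = {
    "yes": ["Y", "EH", "S"],
    "no": ["N", "OW"],
    "stop": ["S", "T", "AA", "P"],
}


def text_to_phonemes(text, lexicon=TOY_LEXICON):
    """Return a flat list of phonemes for the words in `text`.

    Words missing from the lexicon are marked with a placeholder
    instead of being guessed.
    """
    phonemes = []
    for word in text.lower().split():
        word = word.strip(".,!?;:")  # drop surrounding punctuation
        if not word:
            continue
        phonemes.extend(lexicon.get(word, ["<unk:" + word + ">"]))
    return phonemes


print(text_to_phonemes("Yes, stop!"))
```

The resulting phoneme list is what a synthesis module would then turn into audio.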
Process of incorporating speech technology
1. Definition of capabilities
2. Analysis of hardware resources
3. User interface design
4. Development
HARDWARE ENVIRONMENT:
CASSI is modular in nature.
It is suitable for a variety of systems.
It can be used with single-processor designs, where one processor handles all component execution.
In dual-processor designs, feature extraction and TTS synthesis may be separated onto their own DSP (or other front-end signal processor).
Front-End Block:
The front-end block handles the signal-processing side of recognition and TTS, that is, feature extraction and TTS synthesis.
Processor Block (Back-End):
The processor block performs all other code functions, including topic management and search.
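The front-end / back-end split can be imitated in software with two workers joined by a queue. The Python sketch below is only an analogy for the dual-processor design: the function names, the energy feature, and the 0.01 threshold are hypothetical and are not taken from CASSI.

```python
import queue
import threading


def front_end(audio_blocks, out_queue):
    """Stand-in for the DSP side: turn each raw audio block into a feature."""
    for block in audio_blocks:
        energy = sum(s * s for s in block) / len(block)
        out_queue.put(energy)
    out_queue.put(None)  # sentinel: no more features will arrive


def back_end(in_queue, results):
    """Stand-in for the main processor: consume features and decide."""
    while True:
        feature = in_queue.get()
        if feature is None:
            break
        # Hypothetical decision rule; a real back-end would run the search.
        results.append("speech" if feature > 0.01 else "silence")


def run_pipeline(audio_blocks):
    features = queue.Queue()
    results = []
    worker = threading.Thread(target=back_end, args=(features, results))
    worker.start()
    front_end(audio_blocks, features)
    worker.join()
    return results


print(run_pipeline([[0.0, 0.0], [0.5, -0.5]]))
```

In a single-processor design the same two functions would simply run one after the other on the same processor.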
AUTOMATIC SPEECH RECOGNITION
What does speaker dependent / adaptive / independent mean?
What does continuous speech and isolated-word mean?
A continuous speech system operates on speech in which words are connected together, i.e. not separated by pauses.
An isolated-word system operates on single words at a time, requiring a pause between each word.
Isolated-word recognition is the simplest form of recognition.
Continuous speech is more difficult to handle because of a variety of effects, such as unclear word boundaries and coarticulation between neighbouring words.
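Why pauses make the isolated-word case simpler can be shown with a minimal Python sketch that cuts a signal into words wherever the energy stays low for long enough. The function name, frame length, and threshold values are assumptions chosen for the example. Continuous speech offers no such pauses, so this approach would return one long segment.

```python
def split_on_pauses(samples, frame_len=4, threshold=0.1, min_pause_frames=2):
    """Return (start, end) sample index pairs, one per detected word."""
    # Mean absolute amplitude per frame serves as a crude energy measure.
    n_frames = len(samples) // frame_len
    active = []
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        active.append(sum(abs(s) for s in frame) / frame_len > threshold)

    words = []
    start = None      # first frame of the word currently being tracked
    silent_run = 0    # consecutive quiet frames seen inside that word
    for i, is_active in enumerate(active):
        if is_active:
            if start is None:
                start = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_pause_frames:
                end = i - silent_run + 1  # frame just after the last loud one
                words.append((start * frame_len, end * frame_len))
                start = None
                silent_run = 0
    if start is not None:  # signal ended while a word was still open
        end = n_frames - silent_run
        words.append((start * frame_len, end * frame_len))
    return words


signal = [0.0] * 8 + [0.5] * 8 + [0.0] * 8 + [0.5] * 4 + [0.0] * 4
print(split_on_pauses(signal))
```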
The Process of Speech Recognition
Acoustic-Phonetic
Pattern Recognition
Artificial Intelligence
INTERFACE
The Experiment
'Yes' spoken by the first person
'Yes' spoken by the second person
Divide the sound wave into evenly spaced blocks.
Process each block for important characteristics.
Attempt to associate each block with a phone, which is the most basic unit of speech, producing a string of phones.
Find the word whose model is the most likely match.
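The first three steps above can be sketched in Python as follows. This is a toy illustration under stated assumptions: the two features (energy and zero-crossing rate), the prototype values in `PHONE_PROTOTYPES`, and the nearest-prototype rule are hypothetical choices for the example, not the method used in the experiment.

```python
import math


def split_into_blocks(samples, block_len):
    """Cut the waveform into consecutive, equally sized blocks.

    A trailing partial block is dropped.
    """
    last_start = len(samples) - block_len
    return [samples[i:i + block_len] for i in range(0, last_start + 1, block_len)]


def block_features(block):
    """Return (energy, zero-crossing rate); the block needs at least 2 samples."""
    energy = sum(s * s for s in block) / len(block)
    crossings = sum(1 for a, b in zip(block, block[1:]) if (a < 0) != (b < 0))
    return (energy, crossings / (len(block) - 1))


# Hypothetical (energy, zero-crossing rate) prototypes for three labels.
PHONE_PROTOTYPES = {"sil": (0.0, 0.0), "EH": (0.5, 0.1), "S": (0.1, 0.9)}


def nearest_phone(features, prototypes=PHONE_PROTOTYPES):
    """Pick the label whose prototype is closest to the block's features."""
    return min(prototypes, key=lambda p: math.dist(features, prototypes[p]))


def phones_for(samples, block_len):
    """Produce one phone label per block, i.e. a string of phones."""
    return [nearest_phone(block_features(b))
            for b in split_into_blocks(samples, block_len)]


wave = [0.0] * 4 + [0.7] * 4 + [0.3, -0.3, 0.3, -0.3]
print(phones_for(wave, 4))
```

Two speakers saying the same word produce different waveforms, which is why the labelling works on features of each block rather than on the raw samples.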
The Basic Steps
Speech recognition systems use a basic three-stage architecture:
Feature detection, in which the raw acoustic waveform is represented in a more useful space.
Probabilistic classification of the feature vectors, in which each frame is scored for how likely it is to be a version of each speech sound.
Search for the best word-sequence hypothesis, in which a word sequence is found that is consistent with the constraints of the lexicon and grammar.
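The search stage can be illustrated with a minimal Python sketch that picks the lexicon word closest to a sequence of per-frame phone labels. Plain edit distance stands in here for the probabilistic search of a real recogniser, there is no grammar, and the lexicon and phone symbols are hypothetical.

```python
from itertools import groupby

# Hypothetical lexicon mapping words to phone sequences.
LEXICON = {
    "yes": ["Y", "EH", "S"],
    "no": ["N", "OW"],
    "yet": ["Y", "EH", "T"],
}


def collapse(frame_phones, silence="sil"):
    """Merge runs of identical frame labels and drop silence."""
    return [p for p, _ in groupby(frame_phones) if p != silence]


def edit_distance(a, b):
    """Levenshtein distance between two phone sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


def best_word(frame_phones, lexicon=LEXICON):
    """Return the lexicon word whose phones best match the frame labels."""
    phones = collapse(frame_phones)
    return min(lexicon, key=lambda w: edit_distance(phones, lexicon[w]))


frames = ["sil", "Y", "Y", "EH", "EH", "EH", "S", "S", "sil"]
print(best_word(frames))
```

Restricting the answer to words in the lexicon is what lets the search recover from individual frames that were labelled wrongly.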
ADVANTAGES OF SPEECH RECOGNITION
It makes recorded audio and video data easy to search and index.
Speech recognition is also useful as a form of input.
It allows people working in active environments, such as hospitals, to use computers.
It allows people with disabilities to use computers.
CONCLUSION
Visual cues to help computers decipher speech sounds that are obscured by environmental noise.
Speech-to-speech translation project for spontaneous speech
Multi-engine Spanish-to-English machine translation system
Building synthetic voices
Thank You
Under The Esteemed Guidance Of
Mr. K. PAVAN KUMAR (Asst. Professor)
Electronics & Communication Engineering Department
RAGHU ENGINEERING COLLEGE