GIREESH PPT

CASSI Speech Recognition: Adding Speech Recognition to Embedded Devices

by

G.V.S.GIREESH (05981A0444), B.Tech Final year, Electronics & Communications Engineering

RAGHU ENGINEERING COLLEGE

INTRODUCTION

What is CASSI?

Conversay Advanced Symbolic Speech Interface

It can be used in a variety of embedded systems.

It runs on either single- or dual-processor hardware designs.

CASSI provides continuous, speaker-independent speech recognition.

Conversay developers and customers write application code that uses the CASSI API to integrate speech recognition and text-to-speech (TTS) capability into embedded products.

What is TTS?

Text-To-Speech (TTS):

CASSI contains two modules for performing TTS: Rosetta and a TTS synthesis module.

Rosetta, the text-to-phonetics unit, accepts arbitrary written text as input and outputs a string of phonemes for CASSI to synthesize.
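The text-to-phonetics step can be pictured with a small sketch. This is not Rosetta's interface, which these slides do not show; the lexicon, the phoneme symbols and the function name `text_to_phonemes` are all assumptions made up for illustration.

```python
# Toy text-to-phonetics lookup. The lexicon and phoneme symbols below are
# illustrative assumptions, not Rosetta's actual data or interface.
TOY_LEXICON = {
    "yes": ["Y", "EH", "S"],
    "no": ["N", "OW"],
    "stop": ["S", "T", "AA", "P"],
}


def text_to_phonemes(text):
    """Return a flat list of phoneme symbols for the words in `text`.

    Words missing from the toy lexicon are spelled out letter by letter,
    a crude stand-in for real letter-to-sound rules.
    """
    phonemes = []
    for word in text.lower().split():
        word = word.strip(".,!?")  # drop simple surrounding punctuation
        if not word:
            continue
        phonemes.extend(TOY_LEXICON.get(word, list(word.upper())))
    return phonemes
```

A real text-to-phonetics unit must also handle numbers, abbreviations and words outside any dictionary, which is why it accepts "arbitrary written text" rather than a fixed word list.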

Process of incorporating speech technology

1. Definition of capabilities

2. Analysis of hardware resources

3. User interface design

4. Development

HARDWARE ENVIRONMENT:

Modular nature.

Suitable for a variety of systems.

Used with single-processor designs, where one processor handles all component execution.

In dual-processor designs, feature extraction and TTS synthesis may be separated onto their own DSP (or other front-end signal processor).

Front-End Block:

The front-end block handles the signal-processing side of recognition and TTS: feature extraction and TTS synthesis.

Processor Block (Back-End):

The processor block performs all other code functions, including topic management and search
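The front-end / back-end split can be sketched as two functions that communicate only through a stream of features. Everything here is hypothetical: the function names, the single amplitude feature and the threshold are assumptions chosen for brevity, and the back end is only a stand-in for real topic management and search.

```python
def front_end(samples, frame_size):
    """Front-end role: turn raw samples into one feature per frame.

    The feature is just the mean absolute amplitude of the frame,
    an assumption chosen to keep the sketch short.
    """
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        yield sum(abs(s) for s in frame) / frame_size


def back_end(features, threshold):
    """Back-end role: consume features and make the decisions.

    This stand-in only labels each frame as speech or silence.
    """
    return ["speech" if f >= threshold else "silence" for f in features]


samples = [0, 0, 0, 0, 5, -6, 7, -5, 0, 1, 0, -1]
labels = back_end(front_end(samples, frame_size=4), threshold=2.0)
```

Because only the compact features cross the boundary, the two halves could run on one processor or be placed on a DSP and a host processor without changing the back-end code.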

AUTOMATIC SPEECH RECOGNITION

What do speaker-dependent, speaker-adaptive and speaker-independent mean?

What do continuous speech and isolated-word mean?

A continuous speech system operates on speech in which words are connected together, i.e. not separated by pauses.

An isolated-word system operates on single words at a time, requiring a pause between each word.

This is the simplest form of recognition.

Continuous speech is more difficult to handle because of a variety of effects.
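The reliance of an isolated-word system on pauses can be illustrated with a minimal sketch. It assumes that an earlier stage has already marked each frame as speech or not; the function name and the `min_pause_frames` parameter are hypothetical.

```python
def split_isolated_words(is_speech, min_pause_frames):
    """Group frame indices into words separated by long enough pauses.

    `is_speech` holds one boolean per frame. A run of at least
    `min_pause_frames` non-speech frames ends the current word; shorter
    gaps are treated as part of the word.
    Returns a list of (first_frame, last_frame) pairs, inclusive.
    """
    words = []
    start = None        # first frame of the word being built
    last_speech = None  # most recent speech frame seen in that word
    for i, flag in enumerate(is_speech):
        if flag:
            if start is None:
                start = i
            last_speech = i
        elif start is not None and i - last_speech >= min_pause_frames:
            words.append((start, last_speech))
            start = None
    if start is not None:
        words.append((start, last_speech))
    return words
```

With connected speech there are no such pauses, so this approach returns the whole utterance as one span. A continuous system has to find word boundaries as part of the recognition search itself.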

The Process of Speech Recognition

Acoustic-Phonetic

Pattern Recognition

Artificial Intelligence

INTERFACE

The Experiment

'Yes' spoken by the first person

'Yes' spoken by the second person

Divide the sound wave into evenly spaced blocks.

Process each block for important characteristics.

Attempt to associate each block with a phone, which is the most basic unit of speech, producing a string of phones.

Find the word whose model is the most likely match.
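The four steps above can be sketched as follows. This is a simplified illustration, not the method used in the experiment: the block features (energy and zero crossings), the use of edit distance as the "most likely match" criterion, and all names are assumptions. The step that maps a block's features to a phone label is left out, since it needs a trained classifier.

```python
def split_into_blocks(samples, block_size):
    """Divide the waveform into evenly spaced, non-overlapping blocks."""
    return [samples[i:i + block_size]
            for i in range(0, len(samples) - block_size + 1, block_size)]


def block_features(block):
    """Two simple characteristics: mean energy and zero-crossing count."""
    energy = sum(s * s for s in block) / len(block)
    crossings = sum(1 for a, b in zip(block, block[1:]) if (a < 0) != (b < 0))
    return energy, crossings


def collapse_repeats(labels):
    """Merge consecutive identical per-block labels into a phone string."""
    phones = []
    for label in labels:
        if not phones or phones[-1] != label:
            phones.append(label)
    return phones


def edit_distance(a, b):
    """Levenshtein distance between two phone sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def best_word(phones, word_models):
    """Pick the word whose phone sequence is closest to `phones`."""
    return min(word_models, key=lambda w: edit_distance(phones, word_models[w]))
```

For example, per-block labels such as `["Y", "Y", "EH", "EH", "S"]` collapse to the phone string `["Y", "EH", "S"]`, which is then compared against each word model.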

The Basic Steps

Speech recognition systems use a basic three-stage architecture:

Feature detection, in which the raw acoustic waveform is represented in a more useful space.

Probabilistic classification of the feature vectors, in which each frame is scored for how likely it is to be a version of each speech sound.

Search for the best word-sequence hypothesis, in which a word sequence is found that is consistent with the constraints of the lexicon and grammar.
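The second and third stages can be sketched under strong simplifying assumptions: the first stage is assumed to have already produced one number per frame, each phone is modelled by a single Gaussian (mean, variance), and the "grammar" allows exactly one word per utterance. All names and model values are hypothetical.

```python
import math


def log_gaussian(x, mean, var):
    """Log-density of x under a one-dimensional Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def score_frames(features, phone_models):
    """Stage 2: log-likelihood of every phone for every frame."""
    return [{p: log_gaussian(x, m, v) for p, (m, v) in phone_models.items()}
            for x in features]


def align_score(frame_scores, phones):
    """Best log-score for aligning all frames, in order, to `phones`.

    Left-to-right Viterbi alignment: each phone must cover at least one
    frame. Returns -inf when no valid alignment exists.
    """
    neg_inf = float("-inf")
    if not frame_scores or not phones:
        return neg_inf
    n = len(phones)
    best = [neg_inf] * n
    best[0] = frame_scores[0][phones[0]]
    for scores in frame_scores[1:]:
        new = [neg_inf] * n
        for k in range(n):
            stay = best[k]                                # remain in phone k
            advance = best[k - 1] if k > 0 else neg_inf   # enter phone k
            prev = max(stay, advance)
            if prev > neg_inf:
                new[k] = prev + scores[phones[k]]
        best = new
    return best[-1]


def recognize(features, phone_models, lexicon):
    """Stage 3: search the lexicon for the best-scoring word."""
    frame_scores = score_frames(features, phone_models)
    return max(lexicon, key=lambda w: align_score(frame_scores, lexicon[w]))
```

The lexicon constrains the search to known phone sequences. A practical system would extend the same idea to sequences of words and use a grammar or language model to rule out unlikely word orders.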

ADVANTAGES OF SPEECH RECOGNITION

It makes it easy to search and index recorded audio and video data.

Speech recognition is also useful as a form of input.

It allows people working in active environments, such as hospitals, to use computers.

It allows people with disabilities to use computers.

CONCLUSION

Visual cues can help computers decipher speech sounds that are obscured by environmental noise.

Speech-to-speech translation project for spontaneous speech

Multi-engine Spanish-to-English machine translation system

Building synthetic voices

Thank You

Under The Esteemed Guidance Of

Mr. K. PAVAN KUMAR (Asst. Professor)

Electronics & Communication Engineering Department

RAGHU ENGINEERING COLLEGE
