Tech Seminar Ppt

CVSR COLLEGE OF ENGINEERING

DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING

TECHNICAL SEMINAR ON

SPEECH TO TEXT CONVERSION

BY

Y. RAJENDER REDDY (08H61A04C5)

INTRODUCTION

Speech recognition is the process of capturing spoken words with a microphone or telephone and converting them into a digitally stored set of words.

Speech-to-text conversion is one application of speech recognition.

A speech-to-text system improves accessibility by providing data entry options for blind, deaf, or physically handicapped users.

BLOCK DIAGRAM

Speech acquisition

Speech preprocessing

Hidden Markov model

Text storage

External Hardware

Through Microphone

SPEECH ACQUISITION

The microphone input port with the audio codec receives the signal, amplifies it, and converts it into 16-bit PCM digital samples at a sampling rate of 8 kHz.

The system needs a parallel/serial interface to the Nios II processor and an application running on the processor that acquires and stores data in memory.

The received samples are stored in memory on the Altera Development and Education (DE2) board.
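To make the acquisition figures concrete, the sketch below quantizes samples to 16-bit PCM and works out how much memory a recording needs at 8 kHz. It runs on a PC with a synthetic tone standing in for the microphone; the function names and the 440 Hz test tone are illustrative assumptions, not part of the DE2 design.

```python
import math
import struct

SAMPLE_RATE_HZ = 8000   # sampling rate stated on the slide
BYTES_PER_SAMPLE = 2    # 16-bit PCM

def to_pcm16(x):
    """Quantize a float in [-1.0, 1.0] to a signed 16-bit integer."""
    x = max(-1.0, min(1.0, x))          # clip out-of-range input
    return int(round(x * 32767))

def buffer_size_bytes(seconds):
    """Memory needed to hold `seconds` of mono 16-bit audio."""
    return int(seconds * SAMPLE_RATE_HZ) * BYTES_PER_SAMPLE

# One second of a 440 Hz tone (assumed test input, not real speech).
tone = [to_pcm16(0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE_HZ))
        for n in range(SAMPLE_RATE_HZ)]

# Pack as little-endian int16, the layout a sample buffer would typically hold.
packed = struct.pack("<%dh" % len(tone), *tone)
```

At these settings one second of speech occupies 16,000 bytes, which is what `buffer_size_bytes(1)` computes.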

SPEECH PREPROCESSING

Preprocessing takes the speech samples as input, blocks the samples into frames, and returns a unique pattern for each sample.

The unique pattern is obtained through the following steps:

1. The digital samples are divided into overlapped frames.

2. The system checks the frames for voice activity using endpoint detection and energy threshold calculations.

3. The speech samples are passed through a pre-emphasis filter.

4. The frames with voice activity are passed through a Hamming window.

SPEECH PREPROCESSING (CONTINUED)

5. The system finds linear predictive coding (LPC) coefficients for each frame.

6. From the LPC coefficients, the system determines the cepstral coefficients.

The cepstral coefficients serve as feature vectors.
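The six preprocessing steps can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the seminar's implementation: the frame length (240 samples, 30 ms at 8 kHz), hop (80 samples), pre-emphasis factor (0.95), LPC order (10), number of cepstral coefficients (12), and energy threshold are all assumed values, and endpoint detection is reduced to a simple energy check. LPC coefficients are found with the autocorrelation method and the Levinson-Durbin recursion.

```python
import math
import random

def split_frames(samples, frame_len, hop):
    """Step 1: overlapped frames (hop < frame_len gives the overlap)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def energy(frame):
    """Step 2: mean squared amplitude, compared against a threshold."""
    return sum(s * s for s in frame) / len(frame)

def pre_emphasis(frame, alpha=0.95):
    """Step 3: y[n] = x[n] - alpha * x[n-1] (alpha is an assumed value)."""
    return [frame[0]] + [frame[n] - alpha * frame[n - 1]
                         for n in range(1, len(frame))]

def hamming(frame):
    """Step 4: taper the frame with a Hamming window."""
    size = len(frame)
    return [s * (0.54 - 0.46 * math.cos(2 * math.pi * n / (size - 1)))
            for n, s in enumerate(frame)]

def autocorrelation(frame, order):
    return [sum(frame[n] * frame[n - k] for n in range(k, len(frame)))
            for k in range(order + 1)]

def levinson_durbin(r, order):
    """Step 5: predictor coefficients a[1..p], with x[n] ~ sum a[k] * x[n-k]."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]

def lpc_to_cepstrum(a, n_ceps):
    """Step 6: cepstral coefficients c[1..n_ceps] from LPC coefficients."""
    p = len(a)
    c = [0.0] * (n_ceps + 1)
    for m in range(1, n_ceps + 1):
        acc = a[m - 1] if m <= p else 0.0
        for k in range(max(1, m - p), m):
            acc += (k / m) * c[k] * a[m - k - 1]
        c[m] = acc
    return c[1:]

def extract_features(samples, frame_len=240, hop=80, order=10,
                     n_ceps=12, threshold=1e4):
    """Run steps 1 to 6; frames below the energy threshold are skipped."""
    features = []
    for frame in split_frames(samples, frame_len, hop):
        if energy(frame) < threshold:
            continue
        windowed = hamming(pre_emphasis(frame))
        lpc = levinson_durbin(autocorrelation(windowed, order), order)
        features.append(lpc_to_cepstrum(lpc, n_ceps))
    return features

# Assumed test input: 0.1 s of silence, then 0.1 s of a noisy 300 Hz tone.
rng = random.Random(0)
signal = [0.0] * 800 + [8000 * math.sin(2 * math.pi * 300 * n / 8000)
                        + rng.uniform(-200, 200) for n in range(800)]
features = extract_features(signal)
```

Each entry of `features` is one feature vector for one voiced frame; the silent frames at the start of `signal` are dropped by the energy check.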

HIDDEN MARKOV MODEL

A hidden Markov model (HMM) is used for speech recognition, which converts speech to text.

This model consists of three steps:

• Training

• HMM-Based recognition

• Digit Models

TRAINING

Training involves creating a pattern representative of the features of a class using one or more test patterns that correspond to speech sounds of the same class.

An important part of speech-to-text conversion using pattern recognition is training.
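As a rough illustration of "creating a pattern representative of the features of a class", the sketch below averages the feature vectors of each class into a single reference pattern. This is a deliberately simplified stand-in: training a real HMM estimates state transition and output probabilities (for example with the Baum-Welch algorithm), which the slides do not detail. The function name and the sample data are hypothetical.

```python
def train_reference_patterns(training_set):
    """Map each class label to the mean of its training feature vectors.

    training_set: dict of label -> list of equal-length feature vectors.
    """
    references = {}
    for label, vectors in training_set.items():
        dim = len(vectors[0])
        references[label] = [sum(v[d] for v in vectors) / len(vectors)
                             for d in range(dim)]
    return references

# Hypothetical two-dimensional features for two spoken digits.
training_set = {
    "one": [[1.0, 2.0], [3.0, 4.0]],
    "two": [[10.0, 0.0], [12.0, 2.0], [14.0, 4.0]],
}
references = train_reference_patterns(training_set)
```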

HMM-BASED RECOGNITION

Recognition is the process of comparing the unknown test pattern with each sound class reference pattern and computing a measure of similarity between the test pattern and each reference pattern.

DIGIT MODELS

The input speech sample is preprocessed and the feature vector is extracted.

Then, the index of the nearest codebook vector for each frame is sent to all digit models.

The model with the maximum probability is chosen as the recognized digit.
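The digit-model decision can be sketched as follows, assuming discrete HMMs scored with the forward algorithm. Each frame's feature vector is replaced by the index of its nearest codebook vector, every digit model computes the probability of that index sequence, and the most probable model wins. The codebook, the two toy models, and all function names are made up for illustration; the slides do not give the actual model parameters.

```python
import math

def nearest_codebook_index(vector, codebook):
    """Vector quantization: index of the codebook entry closest to `vector`."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2
                                 for a, b in zip(vector, codebook[i])))

def forward_log_likelihood(obs, start, trans, emit):
    """log P(obs | model) for a discrete HMM, scaled forward algorithm."""
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[r] * trans[r][s] for r in range(n_states))
                     * emit[s][obs[t]] for s in range(n_states)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")       # sequence impossible under this model
        log_prob += math.log(scale)
        alpha = [x / scale for x in alpha]   # rescale to avoid underflow
    return log_prob

def recognize(feature_vectors, codebook, digit_models):
    """Return the best-scoring digit and the per-digit log-likelihoods."""
    obs = [nearest_codebook_index(v, codebook) for v in feature_vectors]
    scores = {digit: forward_log_likelihood(obs, *model)
              for digit, model in digit_models.items()}
    return max(scores, key=scores.get), scores

# Toy data (assumed): a 2-entry codebook and two 2-state left-to-right models,
# each given as (start probabilities, transition matrix, emission matrix).
codebook = [[0.0, 0.0], [1.0, 1.0]]
digit_models = {
    "0": ([1.0, 0.0], [[0.5, 0.5], [0.0, 1.0]], [[0.9, 0.1], [0.1, 0.9]]),
    "1": ([1.0, 0.0], [[0.5, 0.5], [0.0, 1.0]], [[0.1, 0.9], [0.9, 0.1]]),
}
frames = [[0.1, 0.0], [0.0, 0.2], [0.9, 1.1], [1.0, 0.8]]
digit, scores = recognize(frames, codebook, digit_models)
```

Working in log-likelihoods with per-step scaling keeps the comparison numerically stable for long utterances, where raw probabilities would underflow.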

TEXT STORAGE

The Nios II processor on the DE2 board sends the digital speech data to a PC.

A target program running on the PC receives the text and writes it to the disk.
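A PC-side storage program could look like the sketch below. The slides do not say how the board and the PC are connected or what the message format is, so this assumes newline-terminated ASCII text arriving on a byte stream, with an in-memory buffer standing in for the real link. The function name and file name are hypothetical.

```python
import io
import os
import tempfile

def store_text(stream, path):
    """Append each non-empty line received on `stream` to the file at `path`.

    Returns the number of lines written.
    """
    count = 0
    with open(path, "a", encoding="utf-8") as out:
        for raw in stream:
            text = raw.decode("ascii", errors="replace").strip()
            if text:
                out.write(text + "\n")
                count += 1
    return count

# Stand-in for the board-to-PC link: two recognized digits and a blank line.
link = io.BytesIO(b"3\n\n7\n")
demo_path = os.path.join(tempfile.mkdtemp(), "recognized.txt")
stored = store_text(link, demo_path)
```

With a real serial or USB connection, any object that yields bytes line by line could take the place of `link`.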

APPLICATIONS

Interactive voice response system (IVRS)

Voice-dialing in mobile phones and telephones

Hands-free dialing in wireless Bluetooth headsets

PIN and numeric password entry modules

Automated teller machines (ATMs)

REFERENCES

1. Topic taken from seminartopics.co.in/ece-seminar-topics/

2. Garg, Mohit. Linear Prediction Algorithms. Indian Institute of Technology, Bombay, India, Apr 2003.

3. Li, Gongjun and Taiyi Huang. An Improved Training Algorithm in HMM-Based Speech Recognition. National Laboratory of Pattern Recognition, Chinese Academy of Sciences, Beijing.

4. Altera Nios II Document

THANK YOU

QUERIES
