voice recognition


6/8/2013

Department of Instrumentation

Technical seminar on

DEVELOPING A VOICE RECOGNITION SYSTEM

Mahitha G.
USN: 1RV12LBI07
M.Tech II sem

Under the guidance of
Dr. K. V. Padmaja
Associate Professor & Dean, BMSPI
Dept. of IT, RVCE

DEPT. OF IT-BMSP&I, 6/8/2013, Developing a voice recognition system

INTRODUCTION

- Voice recognition systems are systems that can recognize the voices of individuals.
- Voice recognition is the process of translating spoken words into text on the computer.
- Voice recognition is an alternative to typing on a keyboard.

Voice recognition systems are systems that can be used to recognize the voices of individuals. They are designed to recognize automatically who is speaking and what he or she is saying, on the basis of information contained in the speech waves.

How Humans Do This?

Articulation produces sound waves, which the ear conveys to the brain for processing.

Humans too have a voice recognition system, which works as shown here. Speech signals produced by one individual are received by the ear of another, and the spoken words are then recognized by the human brain.

How might computers do this?

- Digitization
- Acoustic analysis of the speech signal
- Linguistic interpretation

Figure labels: acoustic waveform, acoustic signal, speech recognition

Now, owing to advances in technology, we are trying to implement this system using computers. This figure shows how a voice recognition system (VRS) can be built using computers. First the speech is acquired using a microphone, and then the speech signals are analyzed. The system then works according to the dictionary model.

- Isolated systems require a brief pause between spoken words, while continuous systems do not.
- Speaker-dependent systems recognize speech from only one speaker, while speaker-independent systems can recognize anyone's speech.

Before going further, let us look at how voice recognition systems are classified, along two axes. First, on the basis of how speech is recorded, they are divided into isolated-voice and continuous-voice recognition systems. An isolated system requires a brief pause between spoken words, while a continuous system does not.

Second, on the basis of speakers, they are classified as speaker-dependent and speaker-independent. A speaker-dependent system can recognize the voice of a single user, while a speaker-independent system can recognize the voice of anyone.

How to create a Voice Recognition System

- Speech acquisition (collection)
- Speech analysis
- User interface development

Now, how do we create a voice recognition system? First the speech is collected, then speech analysis is performed, and finally a user interface is developed.

Speech Acquisition

- For training purposes, the speech is acquired using a microphone for analysis.
- The sound card of the PC converts the analog speech input into digital format for further analysis.

Speech Analysis

- The first important step in speech analysis is to separate each word from the ambient noise.
- Each spoken word is then compared with the built-in acoustic model or dictionary, which is created during the training session.
- The above step is done with the help of an efficient speech detection algorithm.

User Interface Development

The final step is to develop a user interface, so that all users can use the system with ease. For example, the Speech Recognition interface of Windows 7 is very compact (figure: Windows 7 Speech Recognition interface).
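The slides do not say which speech detection algorithm is used to separate words from ambient noise in the Speech Analysis step. As a minimal sketch, assuming a simple short-time energy threshold (a common baseline, not necessarily the method used in the seminar), the idea looks like this. The names `frame_energies`, `detect_words`, `demo_signal`, and the values of `frame_len` and `threshold` are all illustrative assumptions.

```python
import math

def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies

def detect_words(samples, frame_len=80, threshold=0.01):
    """Return (start, end) sample indices of runs of frames whose
    energy exceeds the threshold. The threshold is an assumed value;
    in practice it would be tuned to the ambient noise level."""
    energies = frame_energies(samples, frame_len)
    segments = []
    start = None
    for i, energy in enumerate(energies):
        if energy > threshold and start is None:
            start = i * frame_len                   # a word begins
        elif energy <= threshold and start is not None:
            segments.append((start, i * frame_len))  # the word ends
            start = None
    if start is not None:                            # word runs to the end
        segments.append((start, len(energies) * frame_len))
    return segments

def demo_signal(rate=8000):
    """Synthetic test input: low-level hum with two loud tone bursts
    standing in for two isolated words."""
    quiet = [0.01 * math.sin(2 * math.pi * 50 * n / rate) for n in range(800)]
    word = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(1600)]
    return quiet + word + quiet + word + quiet

print(detect_words(demo_signal()))
```

The pauses between the two bursts are what make the segmentation possible, which is why isolated-word systems ask the speaker to pause between words.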

Block diagram of a voice recognition system

Process of Speech Recognition

Figure labels: Voice Input, Analog to Digital, Acoustic Model, Language Model, Display, Speech Engine, Feedback

Speech recognition

How it works?

- Record the voice command (time domain).
- Transform it into the frequency domain using the Fourier transform and get the magnitude spectrum.
- Compare the spectra of the voice commands.

Program block diagram

Figure labels: Voice Command, Frequency Spectrum, Compare with stored voice commands, Command, Do they match? (YES / NO), Fourier Transform

Fourier transform
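The three steps under "How it works?" (record, transform to a magnitude spectrum, compare with stored commands) can be sketched as follows. This is only an illustration under stated assumptions: it uses a naive discrete Fourier transform from the Python standard library, compares single-frame spectra by Euclidean distance, and is exercised on synthetic tones rather than recorded speech. The function names, the `max_distance` threshold, and the command names "open" and "close" are hypothetical.

```python
import cmath
import math

def magnitude_spectrum(samples):
    """Magnitude of the DFT for bins 0..N/2 (naive O(N^2) transform)."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(samples))
        spectrum.append(abs(acc))
    return spectrum

def normalize(spectrum):
    """Scale to unit length so loudness does not affect the comparison."""
    norm = math.sqrt(sum(v * v for v in spectrum)) or 1.0
    return [v / norm for v in spectrum]

def distance(a, b):
    """Euclidean distance between two equal-length spectra."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_command(samples, templates, max_distance=0.5):
    """Return the closest stored command name, or None when nothing is
    close enough (the "NO" branch of the block diagram)."""
    query = normalize(magnitude_spectrum(samples))
    best_name, best_dist = None, float("inf")
    for name, stored in templates.items():
        d = distance(query, stored)
        if d < best_dist:
            best_name, best_dist = name, d
    return best_name if best_dist <= max_distance else None

def tone(freq_bin, n=64, amplitude=1.0):
    """Synthetic stand-in for a recorded command: one sine wave."""
    return [amplitude * math.sin(2 * math.pi * freq_bin * t / n)
            for t in range(n)]

# "Training": store one normalized spectrum per command.
templates = {
    "open": normalize(magnitude_spectrum(tone(5))),
    "close": normalize(magnitude_spectrum(tone(12))),
}

print(match_command(tone(5, amplitude=0.3), templates))
print(match_command(tone(20), templates))
```

Real speech varies in length and timing, so a practical system would split the signal into short frames and align them before comparing; a fast Fourier transform would also replace the naive loop used here.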

Figure: speech signal (time domain) and its frequency-domain representation

APPLICATIONS

- Telephony and other domains
- People with disabilities
- Training air-traffic controllers
- High-performance fighter aircraft
- Electronic medical records, etc.

People with Disability

This picture shows how a voice recognition system helps a man with a disability to complete his work.

Voice Recognition Gadgets

Voice recognition systems can be embedded in modern gadgets, for example TVs.

Voice Recognition in High-Performance Fighter Aircraft

Voice recognition systems have been added to high-performance fighter aircraft. Such a system helps the pilot to control the various subsystems effectively.

The F-35 is the first US fighter aircraft with a voice recognition system able to hear the pilot's spoken commands to manage various aircraft subsystems, such as communications and navigation.

Speech Recognition: MS Office 2003

Open MS Word, then Tools, then Speech.

- This enables the language bar for both speech-to-text and text-to-speech options.
- You will be guided through the training needed to create a user voice profile (15 minutes).
- You will need a microphone.
- You can dictate directly into MS Office, but not into other applications.

Speech Recognition: Vista and Windows 7

Built into the operating system:

- Open Speech Recognition by clicking the Start button, clicking Control Panel, clicking Ease of Access, and then clicking Speech Recognition.
- Click Set up microphone and follow the instructions in the wizard.

Microsoft Speech Recognition: Windows 7

http://www.microsoft.com/enable/products/windowsvista/speech.aspx

Dictating and Correcting

Voice Recognition System: Flaws and Weaknesses

- Low signal-to-noise ratio
- Overlapping speech
- Differentiation between homonyms
- Intensive use of computer power

How to Remove Flaws and Weaknesses of VRS

- Use a high-quality microphone.
- Use a good-quality sound card.
- Train the system properly.
- If possible, work in a quiet environment.
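The signal-to-noise ratio listed among the flaws can be put into numbers, which helps when judging whether a better microphone or a quieter room has actually helped. The sketch below uses the standard definition, ten times the base-10 logarithm of the ratio of signal power to noise power. The function names and the sample values are illustrative assumptions, and in practice the noise power would be estimated from a stretch of recording with no speech in it.

```python
import math

def power(samples):
    """Average power: mean of the squared sample values."""
    return sum(x * x for x in samples) / len(samples)

def snr_db(signal, noise):
    """Signal-to-noise ratio in decibels."""
    return 10 * math.log10(power(signal) / power(noise))

# Hypothetical samples: speech at amplitude 1.0, background at 0.1.
speech = [1.0, -1.0] * 4
background = [0.1, -0.1] * 4
print(round(snr_db(speech, background), 1))
```

A tenfold difference in amplitude is a hundredfold difference in power, so this example comes out at about 20 dB.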

HUMAN PERFORMANCE (according to a paper written by Lippmann)

Figure labels: Digits Error Rate, Word Error Rate

BENEFITS AND CHALLENGES

- Spelling
- Concentration and attention
- Ergonomics
- Reading and speech
- Hands-free use
- Pronunciation & articulation
- Endurance

Recent Improvements in SR

- Faster training: ~10 min
- Better recognition: ~95%
- More compatible software
- Better system control/command

Future of SR

SUI: Speech-based User Interface

Improvements needed:
 - Greater accuracy
 - Greater system control/command
 - More compatible software

CONCLUSION

- Human performance figures suggest that we still have enormous room for improvement.
- At present, several new algorithms are being developed to implement voice recognition systems.
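The slides do not say how the error rates in the human performance comparison or the ~95% recognition figure were measured. Word error rate is conventionally computed as the number of word substitutions, deletions, and insertions needed to turn the recognizer output into the reference transcript, divided by the number of words in the reference. A minimal sketch of that calculation follows; the function name and the example sentences are illustrative.

```python
def word_error_rate(reference, hypothesis):
    """(substitutions + deletions + insertions) / reference word count,
    using word-level edit distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # prev[j] is the edit distance between the reference words seen so
    # far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)

# One substitution ("the" -> "a") and one deletion ("now") in 4 words.
print(word_error_rate("open the file now", "open a file"))
```

Because insertions count as errors, the word error rate can exceed 1.0 when the recognizer outputs many more words than were spoken.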


IEEE PAPERS ON VOICE RECOGNITION SYSTEM

[1] An Interactive and Efficient Voice Processing System for Embedded Applications

The objective of this paper is to present the design of an embedded system that will help physically impaired individuals in their day-to-day life. The paper proposes a speech recognition and colour sensing technique based on formant frequency and Euclidean distance analysis for embedded systems. The complete system consists of three subsystems: the speech recognition system, a central controller, and the robotic arm. The experimental and simulation results show that the proposed algorithm strikes a good balance between computational complexity and recognition accuracy, and is thus more useful for embedded systems. Speech recognition is successfully implemented on Atmel's ATmega16 microcontroller. Compared with the existing system, the proposed system provides robustness and reliability.

[2] Hardware-Software Co-design of Automatic Speech Recognition System for Embedded Real-time Applications

The system consists of a standard microprocessor and a hardware accelerator for Gaussian mixture model (GMM) emission probability calculation, implemented on a field-programmable gate array. Experiments on widely used benchmark data show that the real-time factor of the proposed system is 0.62, which is about three times faster than the pure software-based baseline system, while the word accuracy rate is preserved at 93.33%. As part of the recognizer, a new adaptive beam-pruning algorithm is also proposed and implemented, which further reduces the average real-time factor to 0.54 with a word accuracy rate of 93.16%. The proposed speech recognizer is suitable for integration in various types of voice (speech)-controlled applications.

The proposed ASR system shows much better real-time factors than the other approaches without decreasing the word accuracy rate. Other advantages of the proposed approach include rapid prototyping, flexibility in design modifications, and ease of integrating ASR with other applications. These advantages, both quantitative and qualitative, suggest that the proposed co-processing architecture is an attractive approach for embedded ASR. Aside from word accuracy and timing performance, power consumption is another important issue for embedded applications. The proposed architecture is not tied to any specific target technology.
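The real-time factor quoted for paper [2] is the time the recognizer spends processing divided by the duration of the audio it processed, so any value below 1.0 means the recognizer keeps up with live speech. The short sketch below works through the two figures given above; the function names and the 10-second utterance are illustrative assumptions, not values from the paper.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """Processing time divided by audio duration."""
    return processing_seconds / audio_seconds

def relative_reduction(before, after):
    """Fractional drop from one real-time factor to another."""
    return (before - after) / before

# At a real-time factor of 0.62, a hypothetical 10 s utterance
# would take about 6.2 s to recognize.
print(real_time_factor(6.2, 10.0))

# Effect of the adaptive beam pruning reported above: 0.62 -> 0.54.
print(round(relative_reduction(0.62, 0.54) * 100, 1))
```

On these numbers the pruning step cuts processing time by roughly 13%, at the reported cost of 0.17 percentage points of word accuracy (93.33% to 93.16%).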

REFERENCES

[1] Robert Keefer, Yan Liu, and Nikolaos Bourbakis, "The Development and Evaluation of an Eyes-Free Interaction Model for Mobile Reading Devices," IEEE Transactions on Human-Machine Systems, vol. 43, no. 1, January 2013.

[2] Vinayak Nayyar (Department of Electronics and Instrumentation) and Abhinav Kumar (Department of Electronics and Communication), SRM University, Chennai, India, "An Interactive and Efficient Voice Processing System for Embedded Applications," Mediterranean Conference on Embedded Computing (MECO), 2012.

[3] Yujing Si, Ta Li, Shang Cai, Jielin Pan, and Yonghong Yan, "Recurrent Neural Network Language Model in Mandarin Voice Input System," 2012 8th International Conference on Natural Computation (ICNC 2012).


[4] Octavian Cheng, Waleed Abdulla, and Zoran Salcic, "Hardware-Software Co-design of Automatic Speech Recognition System for Embedded Real-Time Applications," IEEE Transactions on Industrial Electronics, vol. 58, no. 3, March 2011.

[5] Chee Cheun Huang and Julien Epps, School of Electrical Engineering and Telecommunications, The University of New South Wales, Sydney, and National ICT Australia (NICTA), Sydney, Australia, "A Study of Automatic Phonetic Segmentation for Forensic Voice Comparison," ICASSP 2012.

[6] Afeez Olalekan, Alex Page, and Ying Sun, Department of Electrical, Computer and Biomedical Engineering, University of Rhode Island, Kingston, RI, USA, "Optimizing the Functionality of a Voice Recognition System for Assistive Technology."


[7] Kurzweil Applied Intelligence, Inc., "Developing Continuous Speech Recognition Technology that Uses Natural Language Processing Commands," status report 93-01-0101 (research and data collected during July to September 2001 and April to June 2002).

[8] Luke Makischuk, Abderahmane Sebaa, Ameneh Sadat, Yazdaninik Asma Faizi, "Speech Recognition and Its Application in Voice-Based Robot Control System."

[9] Agus Trihandoyo, Adam Belloum, and Kun-Mean Hou, Heudiasyc CNRS URA 817, Université de Technologie de Compiègne, France, "A Real-Time Speech Recognition Architecture for a Multi-Channel Interactive Voice Response System," IEEE, 1995.

[10] Yifan Gong and Yu-Hung Kao, Texas Instruments Incorporated, Dallas, TX, USA, "Implementing a High Accuracy Speaker-Independent Continuous Speech Recognizer on a Fixed-Point DSP," IEEE, 2000.
