Linear Predictive Pitch Synchronization: Voiced speech detection, analysis and synthesis
Jim Bryan, Florida Institute of Technology
ECE5525 Final Project, December 7, 2010

Pitch synchronous windowing is a critical part of many speech processing algorithms
- Homomorphic filtering, for example, is based on the principle that the pitch frequency may be liftered from the vocal tract response via simple subtraction
- Linear prediction based signal reconstruction is simpler with pitch synchronous windowing
- The covariance method needs the pitch synchronous, glottal-closed portion of the speech

Window selection for overlap and add reconstruction
- Bartlett: simple triangle
- Hann: raised cosine type
- Hamming: raised cosine type

Bartlett window overlap and add response
Hann overlap and add response
Hamming window overlap and add response
Blackman-Harris overlap and add response
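
The overlap and add responses listed above can be checked numerically. The following is a minimal sketch, not the code used for the slides: it assumes periodic (DFT-even) window definitions, the common 4-term Blackman-Harris coefficients, and a frame length that is an integer multiple of the hop. The function names `periodic_window`, `ola_response` and `ola_ripple` are hypothetical helpers introduced for illustration.

```python
import math

def periodic_window(name, n):
    """Return a length-n periodic (DFT-even) window as a list of floats."""
    if name == "bartlett":
        # Simple triangle: 0 at i = 0, peak of 1 at i = n / 2.
        return [1.0 - abs(2.0 * i / n - 1.0) for i in range(n)]
    # Cosine-sum windows: w[i] = sum_k (-1)^k * a_k * cos(2*pi*k*i/n).
    coeffs = {
        "hann": (0.5, 0.5),
        "hamming": (0.54, 0.46),
        "blackman-harris": (0.35875, 0.48829, 0.14128, 0.01168),
    }[name]
    return [
        sum((-1) ** k * a * math.cos(2.0 * math.pi * k * i / n)
            for k, a in enumerate(coeffs))
        for i in range(n)
    ]

def ola_response(window, hop):
    """Steady-state sum of shifted windows over one hop interval."""
    n = len(window)
    if n % hop:
        raise ValueError("window length must be a multiple of the hop")
    return [sum(window[i + m * hop] for m in range(n // hop))
            for i in range(hop)]

def ola_ripple(window, hop):
    """Peak-to-peak variation of the overlap-add sum (0 means constant)."""
    resp = ola_response(window, hop)
    return max(resp) - min(resp)
```

Under these assumptions, the Bartlett, Hann and Hamming windows sum to a constant at 50% overlap (hop of half the frame), while the 4-term Blackman-Harris window shows ripple at 50% overlap and only becomes constant at 75% overlap (hop of a quarter frame).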
Window selection based on spectral leakage and frequency resolution
Hann Window
Hamming window
Blackman-Harris
Window overlap and add: frame rate versus frame length considerations

Linear prediction wide search pitch period estimation
- Single 12th order all-pole model
- Voiced speech is contained within the sample window
- Use inverse filtering to get glottal pulses
- Take autocorrelation of the residual to determine pitch period

Male speaker Moon
Female speaker Moon
Male voice
Female voice
Male residual
Female residual
Autocorrelation of Male
Autocorrelation of female
Synthesize Male single model
Male single model
Female single model
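
The wide search pitch estimate described in this section (one 12th order all-pole model, inverse filtering, autocorrelation of the residual) can be sketched as follows. This is an illustration under stated assumptions rather than the author's implementation: the slides do not say which LP solution method was used for the single model, so the autocorrelation method with the Levinson-Durbin recursion is assumed here, and the Hamming window, the 50 to 400 Hz search range, and all function names are hypothetical choices.

```python
import math

def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite sequence (no normalization)."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]

def levinson(r, order):
    """Levinson-Durbin recursion; returns A(z) coefficients [1, a1, ..., ap]."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:          # silent or degenerate frame: stop early
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err          # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k      # updated prediction error energy
    return a

def inverse_filter(x, a):
    """Residual e[n] = sum_j a[j] * x[n - j], with zero initial state."""
    return [sum(a[j] * x[n - j] for j in range(len(a)) if n - j >= 0)
            for n in range(len(x))]

def estimate_pitch_period(x, fs, order=12, fmin=50.0, fmax=400.0):
    """Return (lag in samples, normalized peak, LP coefficients, residual)."""
    n = len(x)
    w = [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]
    xw = [xi * wi for xi, wi in zip(x, w)]
    a = levinson(autocorr(xw, order), order)
    res = inverse_filter(xw, a)
    lo = int(fs / fmax)
    hi = min(int(fs / fmin), n - 1)
    r = autocorr(res, hi)
    # The pitch period is taken as the lag of the largest residual
    # autocorrelation value inside the assumed search range.
    lag = max(range(lo, hi + 1), key=lambda k: r[k])
    return lag, r[lag] / r[0], a, res
```

Calling `estimate_pitch_period(samples, fs)` on a window that contains only the voiced sound returns the lag of the strongest residual autocorrelation peak, so the pitch frequency estimate is `fs / lag`.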

Pitch synchronous processing
1. Segment the speech waveform so that the frame length is 3 pitch periods. Make sure the window length is even.
2. Set the Hamming window length to the frame length and the frame rate to the frame length.
3. Generate a 12-pole LP model for each frame.
4. Inverse filter each frame and save the AR model coefficients and residual in a matrix, where each row is a residual.
5. Take the autocorrelation of the residual of the frame.
6. Find the autocorrelation peak.
7. Determine the pitch period for each frame based on the autocorrelation of the residual of the frame.
8. If the frame does not have a valid pitch period, determine if the frame is fricative or plosive. If the variance of the autocorrelation is low, the frame is fricative. Otherwise the frame is plosive.
9. Save the pitch period for each frame in a vector, along with the peak of the autocorrelation and the fricative or plosive status.
10. Reconstruct the frame by filtering the residual with the AR coefficients, or synthesize the waveform by estimating the glottal pulse train, adding impulsive fricative noise or a single impulse for plosive frames.
11. Overlap and add segments to reconstruct the signal.
12. Compare to the original speech using the sum of squared errors (SSE).
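
The frame sizing and frame classification steps can be sketched as below. The slides give the decision rule (a valid pitch peak means voiced; otherwise low autocorrelation variance means fricative, else plosive) but no numeric thresholds, so `voiced_thresh` and `var_thresh` are placeholder assumptions, as is the choice to measure the variance over the pitch search lags only. All function names are hypothetical.

```python
def frame_length_for_pitch(pitch_period, periods_per_frame=3):
    """Frame length of 3 pitch periods, rounded up to an even sample count."""
    n = periods_per_frame * pitch_period
    return n + (n % 2)

def normalized_autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] divided by r[0] (all zeros if silent)."""
    n = len(x)
    max_lag = min(max_lag, n - 1)
    r0 = sum(v * v for v in x)
    if r0 == 0.0:
        return [0.0] * (max_lag + 1)
    return [sum(x[i] * x[i + k] for i in range(n - k)) / r0
            for k in range(max_lag + 1)]

def classify_frame(residual, min_lag, max_lag,
                   voiced_thresh=0.3, var_thresh=0.01):
    """Return (kind, pitch_period or None, autocorrelation peak value).

    voiced_thresh and var_thresh are assumed values, not from the slides.
    """
    r = normalized_autocorr(residual, max_lag)
    lags = range(min_lag, len(r))
    peak_lag = max(lags, key=lambda k: r[k])
    peak = r[peak_lag]
    if peak >= voiced_thresh:
        return "voiced", peak_lag, peak
    # No valid pitch period: decide fricative vs plosive from the
    # variance of the autocorrelation over the search lags.
    vals = [r[k] for k in lags]
    mean = sum(vals) / len(vals)
    var = sum((v - mean) ** 2 for v in vals) / len(vals)
    kind = "fricative" if var < var_thresh else "plosive"
    return kind, None, peak
```

The returned tuple maps onto the per-frame record described in step 9: the pitch period, the autocorrelation peak, and the fricative or plosive status. The thresholds would need tuning against real speech before the labels could be trusted.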
Overlap and add Reconstruction male
Overlap and add female
Overlap and add reconstruction male
Overlap and add reconstruction female
Reconstructed Male
Reconstructed female
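
For the reconstruction path that filters the saved residual with the saved AR coefficients, the following sketch shows why a frame can be recovered from its own residual: the synthesis filter 1/A(z) is the exact inverse of the analysis filter A(z) when both start from zero state. The coefficient convention A(z) = 1 + a1 z^-1 + ... + ap z^-p and the function names are assumptions made for this example.

```python
def inverse_filter(x, a):
    """Analysis filter A(z): e[n] = x[n] + sum_{j>=1} a[j] * x[n - j]."""
    return [sum(a[j] * x[n - j] for j in range(len(a)) if n - j >= 0)
            for n in range(len(x))]

def synthesis_filter(e, a):
    """All-pole filter 1/A(z): y[n] = e[n] - sum_{j>=1} a[j] * y[n - j]."""
    y = []
    for n in range(len(e)):
        acc = e[n]
        for j in range(1, len(a)):
            if n - j >= 0:
                acc -= a[j] * y[n - j]
        y.append(acc)
    return y

def sse(original, reconstructed):
    """Sum of squared errors between two equal-length sequences."""
    return sum((p - q) ** 2 for p, q in zip(original, reconstructed))
```

With the unmodified residual the SSE is zero up to floating-point rounding. A nonzero SSE is expected once the residual is replaced by an estimated pulse train or noise, or when frame boundaries and windowing are included.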
Conclusions
- Many speech processing applications use a combination of windowing and overlap and add for signal reconstruction
- Pitch synchronous windowing is necessary for accurate results in speech processing; homomorphic deconvolution requires it
- A single set of coefficients for a single voiced sound appears to be a reasonable approach
- The pitch period estimate is extracted from the residual of the inverse filtered voiced sound through the autocorrelation function
- Pitch synchronous windowing is a good foundation for all types of signal processing applications