ZCR Based Identification of Voiced Unvoiced and Silent Parts of Speech Signal in Presence of Background Noise

  • 8/10/2019 ZCR Based Identification of Voiced Unvoiced and Silent Parts of Speech Signal in Presence of Background Noise

    ZCR Based Identification of Voiced, Unvoiced and Silent Parts of Speech Signal in Presence of Background Noise

    Presented by

    Sivaranjan Goswami, B. Tech. 4th Year

    Department of Electronics and Communication Engineering

    Don Bosco College of Engineering and Technology

    Assam Don Bosco University

    Guwahati, Assam (India)

    Contact: [email protected]

    Outline of Presentation

    Introduction and a Brief Overview

    Speech Signal

    Experimental Details

    Proposed Algorithms

    Experimental Results

    Discussion and Bibliography

    Introduction (1 of 2)

    The identification of voiced, unvoiced and silent parts of a speech signal is an important step in speech processing.

    If the background is quiet, it can be achieved easily by estimating the short-time zero-crossing rate and the short-time average magnitude.

    However, in the presence of background noise, it is a challenging task.

    Introduction (2 of 2)

    A simple algorithm is designed based on the short-time zero-crossing rate (ZCR) and the short-time average magnitude to identify the voiced, unvoiced and silent frames of speech in a quiet background.

    The algorithm is then improved to serve the same purpose in the presence of real background noise.

    The second algorithm is found to reduce the errors of the first algorithm by approximately 60%.

    A Brief Overview

    The first algorithm relies entirely on the short-time zero-crossing rate (ZCR) and the short-time average magnitude to identify the voiced, unvoiced and silent frames of speech.

    The modified algorithm processes only background noise for 1 second at the beginning and creates a reference of the background noise.

    The noise reference is used to separate voiced or unvoiced samples from samples containing only noise.

    Speech Signal

    Human Speech Production System

    Types of Excitation

    Voiced: high amplitude, low frequency (low ZCR), quasi-periodic pulses

    Unvoiced: random signal with low amplitude and high ZCR

    Mixed

    Plosive

    Whisper

    Silent

    Only voiced and unvoiced excitations are of interest here.

    Experimental Details

    Calculation of Zero-Crossing Rate (ZCR)

    The ZCR of a signal within a short time interval t is found using the equation:

    $$\mathrm{ZCR}_{\text{average}} = \frac{N}{2t} \qquad (1)$$

    where N is the number of times the polarity of the signal changes during t.
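Equation (1) can be illustrated with a minimal Python sketch. The function names, the 8 kHz sample rate and the 200 Hz test tone are assumptions made for this example and do not come from the slides; exact zero samples are skipped when counting polarity changes, which is one possible convention among several.

```python
import math


def count_polarity_changes(samples):
    """Count sign changes between consecutive non-zero samples (N in equation 1)."""
    changes = 0
    previous_sign = 0
    for value in samples:
        sign = (value > 0) - (value < 0)
        if sign == 0:
            continue  # assumption: exact zeros do not count as a polarity
        if previous_sign != 0 and sign != previous_sign:
            changes += 1
        previous_sign = sign
    return changes


def average_zcr(samples, sample_rate):
    """Equation (1): ZCR_average = N / (2 t), with t the frame duration in seconds."""
    duration = len(samples) / sample_rate
    return count_polarity_changes(samples) / (2.0 * duration)


# Hypothetical test input: a 20 ms frame of a 200 Hz tone sampled at 8 kHz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200 * n / sample_rate + 0.1) for n in range(160)]
zcr = average_zcr(frame, sample_rate)
```

Because a periodic signal changes polarity twice per cycle, dividing N by 2t gives a value in hertz that approximates the tone frequency, up to the crossings lost at the frame edges.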

    Decision of Voiced and Unvoiced Speech

    For every time frame, the average ZCR, f, is calculated, and the power, x, corresponding to the frequency f is calculated using the Fourier transform. The result is then subjected to the threshold conditions given in relations (2) and (3):

    Unvoiced: $f_N \ge a$ and $|x_N| \le b$ (2)

    Voiced: $f_N \le c$ and $|x_N| \ge d$ (3)

    where the subscript N denotes a normalized value and a, b, c, d are user-defined threshold values between 0 and 1.
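The threshold test in relations (2) and (3) might look like the sketch below. The inequality directions are an assumption inferred from the earlier description (unvoiced: high ZCR and low amplitude; voiced: low ZCR and high amplitude), and the default values of a, b, c and d are hypothetical placeholders, not values reported in the presentation.

```python
def classify_frame(f_n, x_n, a=0.5, b=0.3, c=0.3, d=0.5):
    """Label one frame from its normalized ZCR f_n and normalized power x_n.

    The thresholds a, b, c, d are hypothetical; the slides only state that
    they are user-defined values between 0 and 1.
    """
    if f_n >= a and abs(x_n) <= b:   # relation (2), assumed direction
        return "unvoiced"
    if f_n <= c and abs(x_n) >= d:   # relation (3), assumed direction
        return "voiced"
    return "silent"                  # neither condition holds


labels = [classify_frame(0.9, 0.1), classify_frame(0.1, 0.9), classify_frame(0.4, 0.4)]
```

A frame that satisfies neither relation falls through to "silent", which matches the rule used in the quiet-background algorithm.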

    Proposed Algorithms

    Algorithm for Quiet Background

    1. Start.
    2. Calculate the ZCR of a 20 ms frame.
    3. Calculate the power using the Fourier transform.
    4. Store the ZCR and power in memory.
    5. Are all frames considered? If no, return to step 2 for the next frame; if yes, continue.
    6. Normalize the ZCR and power of a frame.
    7. Apply relations 2 and 3 to decide voiced/unvoiced.
    8. Mark the frame as silent if it is neither voiced nor unvoiced.
    9. Are all frames considered? If no, return to step 6 for the next frame; if yes, continue.
    10. Display the result.
    11. End.
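The two-loop flow for a quiet background could be sketched in Python as follows. Several details are assumptions rather than the author's implementation: the power "corresponding to the frequency f" is read as a one-bin discrete Fourier transform evaluated at f, normalization divides by the maximum over all frames, the inequality directions follow the voiced/unvoiced descriptions, and the thresholds and the synthetic test signal are hypothetical.

```python
import cmath
import math

FRAME_SECONDS = 0.020  # 20 ms frames, as in the flowchart


def frame_zcr(frame, sample_rate):
    """Equation (1): polarity changes divided by twice the frame duration."""
    signs = [1 if s > 0 else -1 for s in frame if s != 0]
    changes = sum(1 for p, q in zip(signs, signs[1:]) if p != q)
    return changes / (2.0 * len(frame) / sample_rate)


def power_at(frame, sample_rate, freq):
    """Power of the Fourier component at `freq` (assumed one-bin DFT)."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq * n / sample_rate)
              for n, s in enumerate(frame))
    return abs(acc) ** 2 / len(frame)


def normalize(values):
    """Scale to [0, 1] by the maximum (assumed normalization)."""
    peak = max(values, default=0.0)
    return [v / peak if peak > 0 else 0.0 for v in values]


def classify_quiet(signal, sample_rate, a=0.5, b=0.3, c=0.3, d=0.5):
    size = round(FRAME_SECONDS * sample_rate)
    frames = [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]
    # First loop: ZCR and power of every frame, stored in memory.
    zcrs = [frame_zcr(f, sample_rate) for f in frames]
    powers = [power_at(f, sample_rate, z) for f, z in zip(frames, zcrs)]
    # Second loop: normalize, then apply relations (2) and (3).
    labels = []
    for f_n, x_n in zip(normalize(zcrs), normalize(powers)):
        if f_n >= a and abs(x_n) <= b:
            labels.append("unvoiced")
        elif f_n <= c and abs(x_n) >= d:
            labels.append("voiced")
        else:
            labels.append("silent")
    return labels


# Hypothetical signal: silence, a loud 200 Hz tone, a faint 3 kHz tone.
rate = 8000
silence = [0.0] * 160
voiced = [math.sin(0.1 + 2 * math.pi * 200 * n / rate) for n in range(160)]
unvoiced = [0.1 * math.sin(0.1 + 2 * math.pi * 3000 * n / rate) for n in range(160)]
labels = classify_quiet(silence + voiced + unvoiced, rate)
```

The second loop cannot start until the first has finished, because normalization needs the maximum over all frames. This is the dependency that makes the method offline rather than real time.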

    Assumptions for Background Noise

    Algorithm 1 is modified for a noisy background under the following assumptions:

    1. The first 1 second of the signal contains only background noise.

    2. The frequency of the noise source is different from the vocal tract frequency or ZCR.

    3. The human voice has the dominant amplitude, since the mouth is closer to the microphone than the noise source.

    ZCR of Voiced Speech is Independent of Noise

    As shown in the figure, the ZCR of voiced speech is independent of noise, under assumption 3.

    Distinguishing Noise and Unvoiced Speech

    When Algorithm 1 is applied to speech with background noise, many of the silent frames are also marked as unvoiced because of their similar amplitude and ZCR.

    The modified algorithm resolves this problem under assumptions 1 and 2.

    The first 1 second of the recording is pure background noise. Hence, a noise reference can be created using the ZCR information of the first 1 second of the recorded speech.

    Algorithm for Creating the Noise Reference

    1. Start.
    2. Calculate the ZCR of a 20 ms frame.
    3. Store the ZCR in the Noise Reference vector.
    4. Are all frames considered? If no, return to step 2 for the next frame; if yes, continue.
    5. Delete redundant time-frames with repeated ZCRs to reduce the size of the Noise Reference vector.
    6. End.
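Building the noise reference from the first second of a recording might be sketched as below. The function names are hypothetical, duplicates are dropped while the vector is filled rather than in a separate pass, and the 50 Hz hum used as test input is an assumption chosen so that every frame yields the same ZCR.

```python
import math


def frame_zcr(frame, sample_rate):
    """Equation (1): polarity changes divided by twice the frame duration."""
    signs = [1 if s > 0 else -1 for s in frame if s != 0]
    changes = sum(1 for p, q in zip(signs, signs[1:]) if p != q)
    return changes / (2.0 * len(frame) / sample_rate)


def build_noise_reference(signal, sample_rate, noise_seconds=1.0, frame_seconds=0.020):
    """Collect the distinct ZCR values of the leading noise-only segment."""
    size = round(frame_seconds * sample_rate)
    limit = min(len(signal), round(noise_seconds * sample_rate))
    reference = []
    for start in range(0, limit - size + 1, size):
        zcr = frame_zcr(signal[start:start + size], sample_rate)
        if zcr not in reference:  # skip repeated ZCRs to keep the vector small
            reference.append(zcr)
    return reference


# Hypothetical noise: one second of 50 Hz hum sampled at 8 kHz.
rate = 8000
hum = [0.05 * math.sin(0.1 + 2 * math.pi * 50 * n / rate) for n in range(rate)]
reference = build_noise_reference(hum, rate)
```

With this strictly periodic hum, the 50 frames of the first second collapse to a single reference entry. Broadband noise would leave more distinct values in the vector.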

    Modified Algorithm for Noisy Background

    1. Start.
    2. Calculate the ZCR of a 20 ms frame.
    3. Calculate the power using the Fourier transform.
    4. Store the ZCR and power in memory.
    5. Are all frames considered? If no, return to step 2 for the next frame; if yes, continue.
    6. Normalize the ZCR and power of a frame.
    7. Is the frame marked voiced by relation 3? If no, continue with step 8.
    8. Is the ZCR close to any ZCR in the noise reference? If yes, mark the frame as silent; if no, apply relation 2 to decide unvoiced/silent.
    9. Update the noise reference.
    10. Are all frames considered? If no, return to step 6 for the next frame; if yes, continue.
    11. Display the result.
    12. End.
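The decision stage of the modified algorithm could be sketched as follows, working on per-frame ZCR and power values that are assumed to have been computed already. Three points are assumptions and not taken from the slides: "close" is interpreted as an absolute ZCR difference within a hypothetical `tolerance`, the noise reference is updated only with frames that end up silent without matching an existing entry, and the thresholds and input numbers are illustrative.

```python
def classify_noisy(zcrs, powers, noise_reference, tolerance=25.0,
                   a=0.5, b=0.3, c=0.3, d=0.5):
    """Label frames as voiced, unvoiced or silent using a ZCR noise reference."""
    reference = list(noise_reference)  # work on a copy; the caller's list is untouched
    peak_f = max(zcrs, default=0.0)
    peak_x = max(powers, default=0.0)
    labels = []
    for zcr, power in zip(zcrs, powers):
        f_n = zcr / peak_f if peak_f > 0 else 0.0
        x_n = power / peak_x if peak_x > 0 else 0.0
        if f_n <= c and abs(x_n) >= d:
            # Voiced test first: under assumption 3 it is unaffected by noise.
            labels.append("voiced")
        elif any(abs(zcr - ref) <= tolerance for ref in reference):
            # ZCR matches the background noise: treat the frame as silent.
            labels.append("silent")
        elif f_n >= a and abs(x_n) <= b:
            labels.append("unvoiced")
        else:
            labels.append("silent")
            reference.append(zcr)  # assumed update policy
    return labels


# Hypothetical per-frame features (ZCR in Hz, power in arbitrary units).
zcrs = [180.0, 3000.0, 2950.0, 1000.0]
powers = [16.0, 0.2, 0.15, 0.1]
labels = classify_noisy(zcrs, powers, noise_reference=[3000.0])
```

In this example the second frame would be labelled unvoiced by the thresholds alone, but its ZCR matches the noise reference, so it is marked silent instead. This is the kind of silent-frame error the noise reference is meant to remove.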

    Experimental Results

    Case 1: Quiet Background

    For a quiet background, the 1st algorithm and the modified algorithm give similar results.

    [Figure panels: 1st Algorithm; Modified Algorithm]

    Case 2: Additive White Gaussian Noise (AWGN)

    [Figure panels: 1st Algorithm; Modified Algorithm]

    In this case, the 1st algorithm gives poor results. The second algorithm improves the results, but the accuracy is still poor, since AWGN has a uniform power spectral density.

    Case 3: Real Noise

    [Figure panels: 1st Algorithm; Modified Algorithm]

    In this case, the 1st algorithm gives poor results; the second algorithm improves the results, since most of the assumptions are satisfied.

    Comparison of the Two Algorithms

    Table: Percentage of Silent Frames Marked Unvoiced

    Background       First Algorithm    Modified Algorithm
    No Noise         0%                 0%
    AWGN             80%                30%
    Natural Noise    58%                23%
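The relative error reduction implied by the table can be checked with a short calculation. The sketch assumes that the improvement is measured as the drop in misclassified silent frames relative to the first algorithm; the variable names are illustrative.

```python
# Percentage of silent frames marked unvoiced: (first algorithm, modified algorithm)
results = {"AWGN": (80.0, 30.0), "Natural Noise": (58.0, 23.0)}


def relative_reduction(first, modified):
    """Error reduction of the modified algorithm, as a percentage of the first."""
    return 100.0 * (first - modified) / first


reductions = {name: relative_reduction(*pair) for name, pair in results.items()}
# AWGN: (80 - 30) / 80 = 62.5 %; Natural Noise: (58 - 23) / 58, about 60.3 %
```

Both cases give a reduction of roughly 60%. The quiet-background row is left out because its error is already 0% and a relative reduction is undefined there.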

    Discussion and Bibliography

    Discussion

    Advantages:

    Simple to implement.

    Accuracy is high.

    The information is found to be useful in speech enhancement.

    Drawbacks:

    The first 1 second must contain only background noise.

    The algorithm involves two loops; hence, it needs further modification in order to be implemented in real time.

    It may not give accurate results if the noise contains human voice, because the noise will then also contain voiced and unvoiced parts.

    Bibliography (1 of 2)

    1. Bachu R.G., Kopparthi S., Adapa B., Barkana B.D., "Separation of Voiced and Unvoiced using Zero crossing rate and Energy of the Speech Signal", Electrical Engineering Department, School of Engineering, University of Bridgeport; available at http://audio-fingerprint.googlecode.com/svn-history/r62/trunk/referencias/ASEE12008 0044 paper.pdf

    2. Thierry Dutoit, "A (Short) Introduction to Speech Processing"; available at http://tcts.fpms.ac.be/cours/1005-07-08/speech/icme2002 intro.pdf

    3. John R. Deller, Jr., John H. L. Hansen and John G. Proakis, "Discrete-Time Processing of Speech Signals", John Wiley and Sons, Inc., New York

    4. Douglas, S.C., Chapter 18, "Introduction to Adaptive Filters", in Digital Signal Processing Handbook, Ed. Vijay K. Madisetti and Douglas B. Williams; Boca Raton: CRC Press LLC, 1999; available at http://www.dsp-book.narod.ru/DSPMW/18.PDF

    5. S. Ghaemmaghami, M. Deriche, and B. Boashash, "A new approach to pitch and voicing detection through spectrum periodicity measurement", 1997 IEEE TENCON - Speech and Image Technologies for Computing and Telecommunications, pp. 743-746

    To download full paper:

    https://gauhati.academia.edu/SivaranjanGoswami
