  • Research Article

    Detection of Modulated Tones in Modulated Noise by Non-human Primates

    PETER BOHLEN,1 MARGIT DYLLA,1 COURTNEY TIMMS,1 AND RAMNARAYAN RAMACHANDRAN1

    1Department of Hearing and Speech Sciences, Vanderbilt University School of Medicine, Nashville, TN 37232, USA

    Received: 24 January 2014; Accepted: 8 May 2014; Online publication: 5 June 2014

    ABSTRACT

    In natural environments, many sounds are amplitude-modulated. Amplitude modulation is thought to be a signal that aids auditory object formation. A previous study of the detection of signals in noise found that when tones or noise were amplitude-modulated, the noise was a less effective masker, and detection thresholds for tones in noise were lowered. These results suggest that the detection of modulated signals in modulated noise would be enhanced. This paper describes the results of experiments investigating how detection is modified when both signal and noise were amplitude-modulated. Two monkeys (Macaca mulatta) were trained to detect amplitude-modulated tones in continuous, amplitude-modulated broadband noise. When the phase difference of otherwise similarly amplitude-modulated tones and noise was varied, detection thresholds were highest when the modulations were in phase and lowest when the modulations were anti-phase. When the depth of the modulation of tones or noise was varied, detection thresholds decreased if the modulations were anti-phase. When the modulations were in phase, increasing the depth of tone modulation caused an increase in tone detection thresholds, but increasing depth of noise modulations did not affect tone detection thresholds. Changing the modulation frequency of tone or noise caused changes in threshold that saturated at modulation frequencies higher than 20 Hz; thresholds decreased when the tone and noise modulations were in phase and decreased when they were anti-phase. The relationship between reaction times and tone level was not modified by manipulations to the nature of temporal variations in the signal or noise. The changes in behavioral threshold were consistent with a model where the brain subtracted noise from signal. These results suggest that the parameters of the modulation of signals and maskers heavily influence detection in very predictable ways. These results are consistent with some results in humans and avians and form the baseline for neurophysiological studies of mechanisms of detection in noise.

    Keywords: amplitude modulation, detection, behavior, comodulation

    INTRODUCTION

    The amplitudes of natural sounds fluctuate with time. Due to the prevalence of temporally modulated sounds, the auditory system may be specially adapted to encode and even take advantage of these features (Gans 1992). Studies of physiological responses of auditory-responsive neurons have shown that one such adaptation, phase locking, could lead to an up to 20 dB enhancement in sensitivity to sounds (Joris et al. 1994). However, natural environments are composed of multitudes of sounds, and the amplitude of any or all of them could vary with time. Thus, behaviorally relevant target sounds and behaviorally irrelevant distractors could both tap into the auditory sensitivity for modulations. This represents part of the complexity of the auditory scene analysis problem and highlights the difficulty of auditory processing in the complex, noisy environments that characterize the natural world. Research in visual systems suggests that visual scene analysis, specifically scene segmentation, depends on feature borders and contrasts between local stimulus properties and global stimulus properties (e.g., Julesz 1986; reviewed in Nothdurft 1994).

    Correspondence to: Ramnarayan Ramachandran, Department of Hearing and Speech Sciences, Vanderbilt University School of Medicine, Nashville, TN 37232, USA. Telephone: (615) 322-4991; email: [email protected]

    JARO: Journal of the Association for Research in Otolaryngology 15: 801–821 (2014). DOI: 10.1007/s10162-014-0467-7. © 2014 Association for Research in Otolaryngology

    While many studies of auditory scene analysis highlight pattern discrimination and identification, some studies deal with the processing of contrast between local signals and global signals. Amplitude modulation is one way to integrate multiple stimuli into a single auditory object (Yost and Sheft 1989). Consistent with such a hypothesis, detection thresholds of a steady-state signal in a modulated masker were lower relative to when the signal and the masker were not temporally modulated or when the modulation of the masker was uncorrelated across different spectral regions (e.g., Hall et al. 1984; Schooneveldt and Moore 1989; Fantini 1991; Langemann and Klump 2001; Dylla et al. 2013). When both signals (local stimulus to an auditory filter) and masker (global stimulus) were temporally modulated, behavioral performance was highly dependent on temporal correlations between the signal and the masker: detection thresholds were lower when the modulation of the signal and the masker were different relative to when the signal and the masker were modulated similarly (e.g., McFadden 1987; Cohen and Schubert 1987; Fantini and Moore 1994). Since animals also live in environments where signals and maskers are both modulated, potentially similar results and rules could apply to animals as well (Bee and Micheyl 2008). Consistent with that hypothesis, experiments in avians have found that correlations between signal and masker resulted in higher thresholds for the detection of signal relative to when the signal and masker were not correlated with each other (corvids: Jensen 2007; passerines: Langemann and Klump 2007).

    With the recent popularity of the macaque as a model for hearing, it is an open question whether some of the properties of scene analysis and auditory object processing that have been described in humans apply to macaques as well. Studies have found that macaques have U-shaped audiograms, similar to humans (e.g., Stebbins et al. 1966; Pfingst et al. 1975, 1978), and the modification of the audiograms by noise is similar to that in humans (compare results from Dylla et al. (2013) with Hawkins and Stevens (1950)). An early indication of modulation-based release in masking in macaques was observed when tone detection thresholds were lower when either the signal or the noise was modulated (Dylla et al. 2013), consistent with findings in humans and other species (e.g., Gustafsson and Arlinger 1994; Bacon et al. 1998; Langemann and Klump 2001; Velez and Bee 2010). In this paper, we extend the findings of our previous behavioral study to further investigate how detection is modified when both tones and noise were time-varying (temporal variation was created by amplitude modulation) and suggest a model for the computation underlying the detection. If amplitude modulation helps auditory object formation, then thresholds to detect an amplitude-modulated signal in a similarly amplitude-modulated noise would be higher than when the signal and noise were modulated differently. Theories of dip listening would suggest that detection thresholds would decrease as the energy in the dip of the masker decreased. The behavioral performance of the monkeys is consistent with both predictions, and an energetic masking model where the nervous system effectively subtracts noise from the signal can account for the results. The results of these experiments form the baseline for neurophysiological experiments exploring the mechanisms underlying auditory scene analysis, auditory object formation, and the detection of signals in noise.

    METHODS

    Experiments were conducted on two male rhesus macaque monkeys (Macaca mulatta) that were both 5 years of age at the beginning of these experiments (monkeys C and D). The monkeys were prepared for chronic experiments using standard techniques used in primate research (e.g., Ramachandran and Lisberger 2005; Dylla et al. 2013), and their audiograms as well as the effects of noise on their audiograms were consistent with previous reports on non-human primates, including studies from our laboratory (Stebbins et al. 1966; Pfingst et al. 1975, 1978; Dylla et al. 2013). All procedures were approved by the Institutional Animal Care and Use Committee at Vanderbilt University and were in strict compliance with the guidelines for animal research established by the National Institutes of Health.

    The surgical and experimental procedures have been described in detail earlier (Dylla et al. 2013). Briefly, monkeys were prepared for this study with a surgical procedure conducted using isoflurane anesthesia and performed under sterile conditions. During this surgical procedure, bone cement and screws were used to secure a head holder to the skull. The monkey was allowed to recover with a regimen of analgesics and antibiotics (if necessary) and was under careful observation by both laboratory staff and veterinary personnel. The head holder was used to position the monkey's head in a constant location in the chair (via a head-post) relative to the speakers during experiments.

    All experiments were conducted in a double-walled pseudo-anechoic sound booth (model 1200A, Industrial Acoustics Corp., NY). The monkeys were seated comfortably in an acrylic primate chair that was custom-designed for their comfort and to leave the area around the ears clear. The monkeys' heads were fixed to the chair by means of the implanted head holder such that the head was level with the center of speakers positioned directly in front at a distance of 90.1 cm from the ears. The speakers (Rhyme Acoustics speakers, Madisound, WI) could deliver sounds between 50 Hz and 40 kHz and were driven by linear amplifiers such that the output of the speakers varied by ±3 dB over the entire frequency range. The efficacy of the sound system was frequently tested by calibrating the output with a ½″ probe microphone system (PS 9200, ACO Pacific, Belmont, CA). All calibrations were performed with the probe microphone placed at the location of one of the ears of the monkey with its head fixed. The same speaker was used to deliver tones and noise, so that there was no spatial separation between the two stimuli. Tones were calibrated by presenting the stimuli, measuring the signal with the probe microphone placed at the location of the monkey's ears, and using the known sensitivity of the microphone. Noise was calibrated by filtering the noise into 1-Hz bands using custom software written in Matlab, calibrating the sound pressure level over the entire frequency range of the noise (thus measuring dB spectrum level, see below) and then calculating the overall level based on the known relationship between decibel overall level and decibel spectrum level (see below for details).

    Behavioral Task

    The experiments were controlled by a computer running OpenEx software (System 3, TDT Inc., Alachua, FL). Signals (tones and noise) were generated with a sampling rate of 97.6 kHz. Lever state was sampled at a rate of 24.4 kHz, with a temporal resolution of about 40 μs on the lever release. The details of the task, basic stimulus, and experimental conditions have been described elsewhere (Dylla et al. 2013). Briefly, the monkeys initiated trials by holding down a lever (Model 829 Single Axis Hall Effect Joystick, P3America, San Diego, CA). When signals (duration=200 ms, 10 ms rise and fall times) were presented (∼80 % of the trials, tones/amplitude-modulated tones), monkeys were required to release the lever within a 600 ms response window beginning at tone onset. A correct release resulted in fluid reward, incorrect non-releases were not penalized, and early release was treated as a false alarm. On catch trials (∼20 % of the trials, when no signals were presented), monkeys were required to hold through the response window. Correct rejects were not rewarded, but incorrect releases (false alarms) resulted in a variable duration (6–10 s) time-out period during which no new trials could be initiated. Broadband noise (bandwidth 5 Hz–40 kHz) was used and was presented continuously, beginning 10 s before the first trial could be initiated so that the monkey was adapted to the noise. On signal trials, monkeys were required to detect signal (tone/modulated tone) in noise (broadband noise/amplitude-modulated broadband noise), and on catch trials, monkeys were required to reject the noise.
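The trial contingencies described above can be summarized in a short sketch. This is an illustration only, not the OpenEx control code: the function name, the outcome labels, and the convention that release time is measured from the start of the response window are assumptions introduced here.

```python
from typing import Optional

RESPONSE_WINDOW_S = 0.600  # 600 ms response window, as stated in the text


def classify_trial(signal_present: bool, release_time_s: Optional[float]) -> str:
    """Label one trial of the Go/No-Go task (hypothetical helper).

    release_time_s is the lever release time relative to the start of the
    response window (tone onset on signal trials); None means the lever
    was held through the window. Negative values are early releases.
    """
    if release_time_s is not None and release_time_s < 0:
        return "false_alarm"  # early release is treated as a false alarm
    released = release_time_s is not None and release_time_s <= RESPONSE_WINDOW_S
    if signal_present:
        return "hit" if released else "miss"  # hits rewarded, misses not penalized
    # catch trial: a release triggers the 6-10 s time-out
    return "false_alarm" if released else "correct_reject"


print(classify_trial(True, 0.35))   # release inside the window on a signal trial
print(classify_trial(False, None))  # lever held on a catch trial
```

Hit and false alarm rates for a block follow by counting these labels at each tone level and on catch trials.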

    Tones were generated using the formula $S(t) = A\sin(2\pi f_c t + \phi_c)$, where $S(t)$ represents the tone signal, $A$ represents the amplitude in volts, $f_c$ represents the carrier (tone) frequency, and $\phi_c$ represents the carrier phase. The carrier phase was set to 0 in all of the experiments described below. Broadband noise ($N(t)$) was generated using the "Random" function in OpenEx, which generated flat-spectrum noise with roughly equal amplitude at all frequencies and was further band-limited to 40 kHz. The amplitude of the broadband noise is always given as the total level, in decibels (dB). Usually, the mean noise amplitude was set at a 55-dB overall level. The amplitude in dB SPL spectrum level may be computed by subtracting from that overall level an amount equal to 10*log10(bandwidth), here about 46 dB. The measure of signal level used was power (the signal duration was not taken into account for the calculation of signal level). In these experiments, the sound pressure level of the tone could vary over a 90-dB range, going from −16 to 74 dB SPL. Tone levels were usually presented in steps of 3 or 5 dB, and sound pressure levels were randomly interleaved within a block. Under the conditions of the experiments, broadband noise at 55 dB caused a roughly 30 dB change in tone thresholds across many frequencies, consistent with previous results in our laboratory (Dylla et al. 2013). Figure 1 shows the audiograms of the two monkeys to tones presented alone (large symbols and solid lines) and in continuous broadband noise at the noise level used in this study (55-dB overall level; small symbols and dashed lines). The noise level used caused significant threshold shifts that showed frequency-specific trends consistent with previous data in macaques (Dylla et al. 2013) and with data from humans (Hawkins and Stevens 1950). Note that the use of higher noise levels (99 dB SPL spectrum level) would result in higher masked thresholds (e.g., Dylla et al. 2013) and may cause different amounts of masking release as a result of parametric variations in the signal or noise modulations.
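As a quick check of the level arithmetic, the conversion from overall level to spectrum level can be sketched as follows. The function name is hypothetical; the 55-dB overall level and 40-kHz bandwidth are the values given in the text.

```python
import math


def spectrum_level_db(overall_level_db: float, bandwidth_hz: float) -> float:
    """Convert an overall noise level (dB) to a per-hertz spectrum level (dB)."""
    return overall_level_db - 10.0 * math.log10(bandwidth_hz)


# Values from the text: 55 dB overall level, noise band-limited to 40 kHz.
correction_db = 10.0 * math.log10(40_000)
level_db = spectrum_level_db(55.0, 40_000)
print(f"bandwidth correction: {correction_db:.1f} dB")  # about 46 dB
print(f"spectrum level: {level_db:.1f} dB")             # about 9 dB
```

This assumes a flat spectrum across the whole band, as described for the noise generator; a shaped spectrum would need a per-band calculation instead.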


    Temporal variations of signals were created via sinusoidal amplitude modulation. For any signal $S(t)$, sinusoidal amplitude modulation was produced according to

    $$S_{AM}(t) = S(t)\,\left[1 + d_s \sin\left(2\pi f_{ms} t + \phi_s\right)\right],$$

    where $S_{AM}(t)$ is the amplitude-modulated signal, $d_s$ is the depth of signal modulation, and $f_{ms}$ and $\phi_s$ represent the modulation frequency and modulation phase of the signal, respectively. Amplitude-modulated noises were created similarly according to

    $$N_{AM}(t) = N(t)\,\left[1 + d_n \sin\left(2\pi f_{mn} t + \phi_n\right)\right],$$

    where $N_{AM}(t)$ is the amplitude-modulated noise, $d_n$ is the depth of noise modulation, and $f_{mn}$ and $\phi_n$ represent the modulation frequency and modulation phase of the noise, respectively. In both of these cases, the mean sound pressure level will be provided in the data, so the signal and the noise had peaks that were 6 dB higher than the reported mean level when the modulation depth was set at 1. The parameterization shown above allowed us the opportunity to vary $d_s$, $d_n$, $f_{ms}$, $f_{mn}$, $\phi_s$, and $\phi_n$ independently. The experiments were performed in a block design so that all modulation parameters were constant within a block, except for $A$; this way, the threshold and reaction time metrics could be determined using the method of constant stimuli. Across blocks, modulation parameters could be systematically varied and their effects on behavior measured.
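The modulation equations can be illustrated with a minimal sketch. The function names are hypothetical, and the example parameters (10 Hz modulation, depth 1, 12.8 kHz carrier) are taken from the exemplar condition reported later in the paper rather than from the stimulus-generation code, which is not shown.

```python
import math


def am_envelope(t: float, depth: float, f_mod: float, phase: float) -> float:
    """Modulator 1 + d*sin(2*pi*f_m*t + phi), with phase in radians."""
    return 1.0 + depth * math.sin(2.0 * math.pi * f_mod * t + phase)


def am_tone(t, amplitude, f_carrier, depth, f_mod, phase_mod, phase_carrier=0.0):
    """Sinusoidally amplitude-modulated tone S_AM(t) at time t (seconds)."""
    carrier = amplitude * math.sin(2.0 * math.pi * f_carrier * t + phase_carrier)
    return carrier * am_envelope(t, depth, f_mod, phase_mod)


def peak_gain_db(depth: float) -> float:
    """Envelope peak relative to the unmodulated amplitude, in dB."""
    return 20.0 * math.log10(1.0 + depth)


# At full depth the envelope peak is about 6 dB above the unmodulated level.
print(f"{peak_gain_db(1.0):.2f} dB")

# Anti-phase modulators (phase difference of 180 degrees): a quarter cycle
# into a 10 Hz modulation, one envelope is at its peak and the other at its dip.
t = 0.025
print(am_envelope(t, 1.0, 10.0, 0.0), am_envelope(t, 1.0, 10.0, math.pi))

# One sample of a 12.8 kHz tone modulated at 10 Hz with depth 1.
sample = am_tone(t, amplitude=1.0, f_carrier=12_800, depth=1.0, f_mod=10.0, phase_mod=0.0)
```

The same envelope function applies to the noise: multiplying each noise sample by `am_envelope` with the noise parameters gives the modulated masker.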

    Data Analysis

    The analytical techniques have been described previously (Dylla et al. 2013). All analyses were based on signal detection theoretic methods (Green and Swets 1966; Macmillan and Creelman 2005) and implemented using MATLAB (Mathworks, Natick, MA). Briefly, the hit rate (H) and false alarm rate (FA) were calculated based on the number of releases at each tone sound pressure level (A) for each block. Signal detection theory dictates that the behavioral sensitivity for a Go/No-Go task can be analyzed in the following way:

    $$p(c) = z^{-1}\left(\frac{z(H) - z(FA)}{2}\right),$$

    where $z$ converts hit rate and false alarm rate into units of standard deviation of a standard normal distribution (z-score, norminv in MATLAB) (Macmillan and Creelman 2005). The inverse z ($z^{-1}$) then converts a unique number of standard deviations of a standard normal distribution into a probability correct ($p(c)$, normcdf in MATLAB). Care was taken to adjust for cases when hit rates and false alarm rates were 1 and 0, respectively, using methods described previously (Dylla et al. 2013; Macmillan and Creelman 2005). The probability correct values were calculated for all signal amplitudes to create the psychometric function.
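A minimal sketch of this calculation follows, using the Python standard library in place of MATLAB's norminv and normcdf. The function names are hypothetical, and the 1/(2N) adjustment for rates of 0 or 1 is one common correction chosen here as an assumption; the adjustment actually used is described in the cited references.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def clamp_rate(rate: float, n_trials: int) -> float:
    """Keep a rate away from 0 and 1 so that its z-score stays finite.

    Assumption: uses the common 1/(2N) correction, which may differ from
    the adjustment used in the study.
    """
    lo = 1.0 / (2.0 * n_trials)
    return min(max(rate, lo), 1.0 - lo)


def probability_correct(hit_rate, fa_rate, n_signal, n_catch):
    """p(c) = z^-1((z(H) - z(FA)) / 2) for a Go/No-Go task."""
    z_h = _STD_NORMAL.inv_cdf(clamp_rate(hit_rate, n_signal))  # norminv
    z_fa = _STD_NORMAL.inv_cdf(clamp_rate(fa_rate, n_catch))   # norminv
    return _STD_NORMAL.cdf((z_h - z_fa) / 2.0)                 # normcdf


# Equal hit and false alarm rates give chance performance, p(c) = 0.5.
print(probability_correct(0.5, 0.5, n_signal=20, n_catch=20))
```

Computing this value at every tone level in a block gives the points of the psychometric function.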

    The false alarms (10 % or less in all the blocks) and sometimes less than perfect performance at higher sound pressure levels cause the psychometric functions to be non-ideal. To account for that, psychometric functions were fit with a modified Weibull cumulative distribution function (cdf) using the following equation:

    $$p(c)_{fit} = c - d \cdot e^{-(level/\lambda)^{k}},$$

    where $level$ represents the tone sound pressure level in dB SPL, and is related to $A$ by a logarithmic function, $\lambda$ and $k$ represent the threshold and slope parameters, respectively, and $c$ and $d$ represent the probability correct at higher sound levels, and the estimates of chance performance at sound levels below threshold, respectively. To account for the sound pressure levels below 0 dB SPL, sound levels were translated by up to 16 dB, fit with a Weibull function, and then sound levels and thresholds were translated back by the same amount as the original translation. From the Weibull cdf, threshold was calculated as that tone sound pressure level that would cause a probability correct value of 0.76.

    [Figure 1: threshold (dB SPL) vs. frequency (Hz) for monkeys C and D, for tones alone and for tones in 55 dB noise.]

    FIG. 1. Thresholds to tones alone and to tones in noise. Threshold to a 200-ms tone is plotted against the tone frequency for monkeys C (red circles and lines) and D (blue diamonds and lines). Thresholds are shown when tones were presented alone (large symbols, solid lines) and when tones were presented embedded in continuous broadband noise at a 55-dB overall level.

    These analyses were performed under the various conditions used in this study.
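Once the Weibull parameters have been fit, the threshold follows by inverting the function at the 0.76 criterion. The sketch below shows only that inversion, not the fitting step; the function names and the parameter values are illustrative assumptions, not values from the study.

```python
import math


def weibull_pc(level, lam, k, c, d):
    """Modified Weibull cdf: p(c) = c - d * exp(-(level / lam) ** k)."""
    return c - d * math.exp(-((level / lam) ** k))


def threshold_level(lam, k, c, d, criterion=0.76):
    """Level at which the fitted function crosses the criterion p(c)."""
    if not (c - d < criterion < c):
        raise ValueError("criterion lies outside the range of the fitted function")
    return lam * (-math.log((c - criterion) / d)) ** (1.0 / k)


# Hypothetical fit on levels translated upward by 16 dB (so that -16 dB SPL
# maps to 0); the threshold is translated back by the same amount.
shift_db = 16.0
params = dict(lam=30.0, k=4.0, c=0.98, d=0.48)
threshold_db_spl = threshold_level(**params) - shift_db
print(f"threshold: {threshold_db_spl:.1f} dB SPL")
```

The range check reflects the fact that the criterion can only be reached if it lies between the fitted floor (c − d) and ceiling (c) of the function.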

    In all cases, reaction time was also computed, based on the time of the lever release. Reaction time was computed as follows:

    Reaction time = time of lever release − tone onset time

    Reaction time was computed on all correct Go responses. We performed statistical analyses on the reaction times to explore the variation of reaction time with signal strength, with noise level, and with the modulation of noise or signal.

    Statistical Analysis

    All statistical analyses were implemented using MATLAB and were either coded by one of the authors based on theory described in Zar (1984) or implemented using a built-in function.

    In many cases, the variability in the data could be estimated only by using bootstrap methods (Efron and Tibshirani 1993). Briefly, trials were resampled using random draws with replacement, while taking care to maintain the substructure of the block (e.g., number of trials at each sound level). For example, the variability in threshold measurements would be estimated by resampling each block of behavioral responses 1,000 times. The responses at each tone level (including catch trials) were drawn with replacement from the original dataset at that particular tone level, taking care that the number of bootstrapped trials at that tone level matched the number obtained behaviorally. This was repeated at all tone levels to generate one bootstrapped set of behavioral data and, from it, one bootstrapped threshold estimate. The same procedure was repeated 1,000 times to generate 1,000 estimates of bootstrapped threshold. This procedure permitted the calculation of the variability of the threshold measured. In all cases, the number of iterations was restricted to be the lowest number such that the parameters converged. In most cases, the distributions converged by 1,000 iterations.
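The resampling scheme can be sketched as below. This is a simplified illustration under stated assumptions: the data layout (a dictionary of 0/1 lever releases per tone level), the function names, and the example block are hypothetical, and the statistic shown is a hit rate rather than the full Weibull-fit threshold used in the study.

```python
import random


def resample_block(responses_by_level, rng):
    """Draw one bootstrap replicate, preserving the trial count at each level.

    responses_by_level maps a tone level (or "catch") to the list of 0/1
    lever releases recorded at that level.
    """
    return {
        level: rng.choices(responses, k=len(responses))  # draws with replacement
        for level, responses in responses_by_level.items()
    }


def bootstrap_statistic(responses_by_level, statistic, n_iter=1000, seed=0):
    """Apply `statistic` to n_iter bootstrap replicates of one block."""
    rng = random.Random(seed)
    return [statistic(resample_block(responses_by_level, rng)) for _ in range(n_iter)]


# Hypothetical block: 20 trials each on catch trials and at two tone levels.
block = {"catch": [0] * 18 + [1] * 2, 20: [0] * 12 + [1] * 8, 40: [1] * 20}
hit_rate_at_20 = lambda rep: sum(rep[20]) / len(rep[20])
estimates = bootstrap_statistic(block, hit_rate_at_20, n_iter=1000, seed=1)
print(len(estimates), min(estimates), max(estimates))
```

Replacing `hit_rate_at_20` with a function that refits the psychometric function and returns its threshold would give the bootstrapped threshold distribution described above.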

    RESULTS

    Effect of Phase Difference

    One way of varying the temporal relationship between two modulated sounds is to impose a phase difference between the modulations. The effect of phase difference between the modulations of tone and noise (δϕ=ϕs−ϕn) was investigated in two macaques. Dip listening theories predict that as more of the signal (modulated tone) occurred in the dips of the noise, thresholds would be reduced; i.e., the thresholds would be lowered when phase differences approached 180° and would be systematically higher as the phase differences deviated from 180°. Figure 2 shows the results of such a manipulation in one monkey during the detection of a 12.8 kHz tone in broadband noise. Both the tone and the noise were amplitude-modulated at 10 Hz, and both tone and noise were modulated to a depth of 1. Figure 2A shows the hit rates (colored circles) and false alarm rates (colored dashed lines, labeled FA) as a function of the tone sound pressure level during the detection task for four different phase differences. The different colors represent different phase differences between the tone and the noise modulations (see legend). The hit rates diverged from false alarm rates at very different sound pressure levels depending on the phase difference of the modulations. This implies that the monkey could reliably release the lever at lower sound levels when the tone and noise modulations were in anti-phase at tone onset (δϕ=180°) relative to when the tone and noise modulations were in phase at tone onset (δϕ=0°). The tone levels required for a reliable lever release for the intermediate phase differences (δϕ=90° and δϕ=270°) were intermediate to those for the other two δϕ values and appeared similar to each other. The behavioral accuracy in the task at each sound pressure level was calculated by taking hit rate and false alarm rate into consideration (as in the "METHODS" section) and plotted as psychometric functions relating probability correct (p(c)) and tone sound pressure level in Figure 2B. The psychometric functions were fit with Weibull cdfs to generate smooth estimates of behavioral accuracy and to estimate behavioral thresholds. The psychometric functions varied with the modulation phase difference in a manner similar to the hit rates. The detection thresholds were lowest for δϕ=180°, intermediate for δϕ=90° and δϕ=270°, and highest when δϕ=0°. These results are consistent with theories of dip listening that suggest decreases in threshold as more of the signal falls into the dip of the masker.

    Figure 2C shows how response times changed with sound pressure level. The color scheme is the same as in Figure 2B. In all cases, the reaction times decreased as the tone levels increased, similar to the trend for steady state tones, and steady state tones masked by noise. The slopes of the reaction time vs. tone level relationship were not significantly different with modulation phase difference (ANOVA after bootstrapping, F(7,993)=1.58, p=0.137).

    Figure 3 shows how the phase differences between the signal and noise modulations (δϕ=ϕs−ϕn) influenced detection thresholds and reaction times. Figure 3A shows the relationship between the thresholds and δϕ for the exemplar case shown in Figure 2. The thresholds decreased as the phase difference increased from 0 to 180°, but then increased as phase difference wrapped back to 360°. The thresholds appeared to be sinusoidally modulated by phase difference and were best fit with a sinusoidal function related to half the phase difference and amplitude of 16.4 dB. The sinusoidal shape of the curve fit was consistent with a subtraction model, where the noise amplitude was subtracted from the signal amplitude or one where the modulation waveform of the noise was subtracted from the modulation waveform of the tone. Figure 3B shows the trend over all other frequencies tested, ranging between 0.4 and 25.6 kHz (shown in different colors). The offset in the curves was highly correlated with and was most likely related to the audiometric thresholds at those frequencies. The trend in threshold changes as a function of modulation phase difference was similar across fc values, and the magnitude of the threshold change as a function of δϕ was not significantly different as a function of frequency (Kruskal-Wallis test, df=5, H=8.57, p=0.127). These results did not vary depending on the onset phase of the tone or noise modulation, as long as δϕ was maintained constant. These results are consistent with listening in the "dips" of the noise; as the phase difference between the signal and noise modulations was varied, the amount of signal in the dips of the noise increased, which could result in improved thresholds.
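The fitted curve reported for the exemplar case in Figure 3A (Threshold = 21.02 − 16.4·sin(δϕ·π/360), with δϕ in degrees) can be evaluated with a short sketch. The function name is hypothetical; the two coefficients are the ones printed in the figure for monkey C at 12.8 kHz and should not be read as general values.

```python
import math


def fitted_threshold_db(delta_phi_deg, offset_db=21.02, amplitude_db=16.4):
    """Threshold (dB SPL) from the half-angle sinusoid fit in Figure 3A."""
    half_angle_rad = math.radians(delta_phi_deg) / 2.0  # same as delta_phi * pi / 360
    return offset_db - amplitude_db * math.sin(half_angle_rad)


for delta_phi in (0, 90, 180, 270, 360):
    print(delta_phi, round(fitted_threshold_db(delta_phi), 2))
```

Because the sine takes half the phase difference, the curve completes a single dip over 0 to 360°, with the highest thresholds in phase (0° and 360°), the lowest at anti-phase (180°), and equal values at 90° and 270°.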

    Figure 3C and shows the effect of δϕ on reactiontimes at the exemplar fc (12.8 kHz) condition shownin Figure 1. The slope of the linear fit to reaction timevs. sound level did not differ significantly as a functionof phase difference for any frequency studied (seeFig. 2 for an example). We investigated the reactiontimes at each sound level as a function of themodulation phase difference δϕ. The reaction times

    at each sound pressure level did not vary significantlywith δϕ (individual reaction times are not shown forclarity, line joining medians are shown in Fig. 3C).When we examined the reaction times at sound levelsrelative to threshold, the reaction times did not varysignificantly as a function of δϕ (Fig. 3D, line joining

    -20 0 20 40 60 80

    Pro

    port

    ion

    of le

    ver

    rele

    ases

    0

    0.5

    1

    δφ = 270ºδφ = 180ºδφ = 90ºδφ = 0º

    A

    -20 0 20 40 60 80

    Pro

    babi

    lity

    corr

    ect

    0.5

    0.75

    1B

    δφ = 270ºδφ = 180ºδφ = 90ºδφ = 0º

    Data Fit

    Tone Level (dB SPL)

    -20 0 20 40 60 80

    Rea

    ctio

    n T

    ime

    (s)

    0.2

    0.3

    0.4

    0.5

    0.6C

    δφ = 270ºδφ = 180ºδφ = 90ºδφ = 0º

    Data Fit

    fc=12.8 kHz, fms=fmn=10 Hz, ds=dn=1Effect of modulation phase difference

    FA

    FIG. 2. The effect of changing the phase difference between theamplitude modulation of the signal and noise waveform during adetection task. A Hit rate (probability of lever release) vs. tone levelduring detection of a 12.8-kHz tone embedded in broadband noise.Tone and noise were both amplitude-modulated at 10Hz and a depth of1. Noise level was 55-dB overall level. Performance during phasedifferences of 0 ° (black), 90 ° (green), 180 ° (blue), and 270 ° (red) areshown.Dashed horizontal lines represent false alarm rate (FA) during theblocks of the phase differences shown and are color-coded. BBehavioral accuracy (probability correct, see the “METHODS” sectionfor calculation) vs. tone level for the exemplar conditions shown in A.The symbols are color-coded as in A. Weibull cumulative distributionfunction (cdf) fits are shown and are color-coded by phase. Thehorizontal line shows p(c)=0.76; the vertical dashed lines show thebehavioral thresholds under the phase difference conditions shown. CReaction time vs. sound level for during the detection of the amplitude-modulated tone. The reaction times are color-coded based on the phasedifference between tone and noise modulation as in A and B. Thereaction time vs. level relationship was captured by a linear fit (showncolor-coded).


    medians shown for clarity). This lack of significant modulation held for both monkey subjects and all tone frequency conditions studied.

    Effect of Modulation Depth

    The depth of modulation should have a large effect on detection thresholds. Our previous study found that modulation of signal or noise caused a masking release (lower thresholds) relative to thresholds for unmodulated tones in unmodulated noise (Dylla et al. 2013). Thus, as the depth of the tone or the noise modulation was parametrically increased from 0 to 1, thresholds would be expected to parametrically decrease. When modulation depth is changed, the depth of the trough (or dip) changes by a much larger amount than the height of the peak. The reduction in behavioral thresholds could be expected due to the dramatic increase in the depth of the dip when the noise modulation depth was increased (thus resulting in a much larger signal to noise ratio around the dip). Figure 4 shows an exemplar case describing the effects of changing modulation depth during detection of modulated tones in modulated noise. Figure 4A shows the hit rate during the detection of an 8 kHz tone modulated at 10 Hz at various tone modulation depths (ds); the masker was broadband noise modulated at 10 Hz at a depth of unity and was presented at a 55 dB overall level (9 dB SPL spectrum level). Increasing tone modulation depths causes small increases in the peak amplitude of the signal (up to 6 dB for ds=1). The noise modulation was in

    [Figure 3 (panels A–D); fms=fmn=10 Hz, ds=dn=1. X-axis: δφ, phase difference (º), 0 to 360. Y-axes: A and B, threshold (dB SPL); C and D, reaction time (s). A: monkey C, fc = 12.8 kHz; fit: Threshold = 21.02−16.4*sin(δφ*π/360). B: legend C, fc=0.4 kHz; D, fc=0.8 kHz; D, fc=1.6 kHz; C, fc=3.2 kHz; C, fc=12.8 kHz; C, fc=25.6 kHz; fit: Threshold = 23.7−15.02*sin(δφ*π/360). C: tone levels 22, 25, 28, 32, 35, 38, and 42 dB. D: first level above threshold, then +3, +6, +10, +13, +16, and +20 dB.]

    FIG. 3. Behavioral performance as a result of varying phase difference between tone and noise modulations. A Threshold as a function of modulation phase difference for the exemplar frequency shown in Figure 1. The circles represent thresholds at the various modulation phase differences, and the dashed red line represents the best fit (sinusoid) to the threshold variations. B Threshold as a function of modulation phase difference for multiple tone frequencies tested. The individual frequencies are color-coded. Fits to individual fc values are not shown. The dashed line is the best fit to the entire data. C Trends of reaction time as a result of modulation phase difference. The lines connect median thresholds at specific sound levels. Different colors show different sound levels. D Similar to C, but levels are considered relative to threshold.
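    The sinusoidal fit printed on Figure 3A, Threshold = 21.02−16.4*sin(δφ*π/360) with δφ in degrees, can be evaluated directly. The sketch below only restates that equation; the function and parameter names are illustrative and are not taken from the paper. Because the argument is δφ*π/360, the sine completes half a cycle over 0 to 360 °, so the fitted threshold is highest at 0 ° and 360 ° and lowest at 180 °.

```python
import math


def fitted_threshold(dphi_deg, offset_db=21.02, amplitude_db=16.4):
    """Threshold (dB SPL) from the sinusoidal fit shown on Fig. 3A.

    dphi_deg is the modulation phase difference in degrees.
    delta_phi * pi / 360 equals half of delta_phi expressed in radians.
    """
    return offset_db - amplitude_db * math.sin(math.radians(dphi_deg) / 2.0)


if __name__ == "__main__":
    for dphi in (0, 90, 180, 270, 360):
        print(dphi, round(fitted_threshold(dphi), 2))
```

    Passing offset_db=23.7 and amplitude_db=15.02 gives the pooled fit shown on Figure 3B.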


    [Figure 4 (panels A–F). Left column (A–C): Variation of depth of tone modulation; fc=8 kHz, fms=fmn=10 Hz, dn=1, δφ=0˚; legend: ds=1.0, ds=0.25. Right column (D–F): Variation of depth of noise modulation; fc=25.6 kHz, fms=fmn=10 Hz, ds=1, δφ=180˚; legend: dn=1.0, dn=0.25. X-axis: Tone Level (dB SPL). Y-axes: proportion of lever release (A, D); probability correct, p(c) (B, E); reaction time (s) (C, F). Data and fit are shown.]

    FIG. 4. The effect of varying depth of modulation of tones (A–C) or noise (D–F) on the detection of modulated tones in modulated noise. Format is similar to Figure 2. A Hit rate vs. tone level during detection of an 8-kHz tone in broadband noise at a 55-dB overall level for two depths of tone modulation: 0.25 (green) and 1.0 (red). Tone and noise were amplitude-modulated at 10 Hz, and the modulations had a phase difference of 0 °. The depth of noise modulation was held at 1. Dashed lines show false alarm rates. B Probability correct vs. tone level for the two depths of tone modulation in A. The psychometric functions (circles) were fit with a Weibull cdf (solid lines). The horizontal line represents the threshold criterion (p(c)=0.76), and the vertical lines represent threshold under the two conditions. C Reaction time vs. tone level during detection at the two depths of tone modulation. The relation of reaction times (circles) to sound level was captured by a linear fit (solid lines). D–F Same as A–C, but hit rates (D), probability correct and thresholds (E), and reaction times (F) vs. tone level when the depth of noise modulation was manipulated. Tone frequency was 25.6 kHz, tone and noise modulation frequencies were set at 10 Hz, and noise level was 55 dB. The depth of tone modulation was held at 1.0, and the modulations had a phase difference of 180 °.


    phase with the tone modulation at tone onset (δϕ=0 °). The different colored symbols show hit rates at two different tone modulation depths (ds=0.25 and ds=1), and the hit rate vs. sound level function shows that as the modulation depth increased, tone levels required to produce hit rates above the false alarm rates increased. Figure 4B shows the behavioral accuracy (p(c)) for the same case. The psychometric functions (circles) and the associated Weibull fits (lines) detailing the behavioral performance at the two depths of tone modulation show that the tone detection thresholds increased as the tone modulation depth increased. The reaction times under these conditions are shown in Figure 4C. In all cases, reaction times decreased as the tone levels increased. Comparing reaction times across the depths of modulations, the slopes were not significantly different across the different modulation depths (ANOVA after bootstrapping, F(3,997)=1.47, p=0.22).
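    Thresholds in this study are read from Weibull cdf fits at the criterion p(c)=0.76. The exact parameterization is given in the METHODS section, which is not reproduced here, so the sketch below assumes a commonly used form, p(c) = c − d·exp(−(x/λ)^k), where x is a positive (shifted) tone level, c is the saturation accuracy, and d is the range. The function names, parameter names, and default values are all hypothetical.

```python
import math


def weibull_pc(level, lam, k, c=1.0, d=0.5):
    """Assumed Weibull psychometric function: p(c) = c - d * exp(-(level/lam)**k).

    level must be positive (e.g. tone level shifted so the lowest level is > 0).
    """
    return c - d * math.exp(-((level / lam) ** k))


def weibull_threshold(lam, k, c=1.0, d=0.5, criterion=0.76):
    """Invert the assumed Weibull form to find the level where p(c) = criterion."""
    ratio = (c - criterion) / d
    if not 0.0 < ratio < 1.0:
        raise ValueError("criterion must lie strictly between c - d and c")
    return lam * (-math.log(ratio)) ** (1.0 / k)


if __name__ == "__main__":
    # Hypothetical fit parameters, for illustration only.
    thr = weibull_threshold(lam=30.0, k=4.0)
    print(thr, weibull_pc(thr, lam=30.0, k=4.0))
```

    Under this assumed form, a rightward shift of the fitted function (larger λ) raises the threshold, which is how the increase in threshold with tone modulation depth would appear in Figure 4B.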

    Figure 4D–F shows similar data for a case in which the depth of noise modulation (dn) was varied. Increasing the depth of noise modulation caused a small increase in the peak amplitude and large decreases in amplitude at the trough (e.g., Malone et al. 2010). Figure 4D shows hit rates for two different dn values when a 25.6-kHz tone was being detected; tone modulation frequency and depth were held constant at 10 Hz and 1, respectively, the noise modulation frequency was 10 Hz, and the modulation phase difference δϕ was 180 °. The mean noise level was held constant at a 55 dB overall level across the different modulation depth conditions. The tone level required to produce hit rates higher than the false alarm rate was lower for dn=1 compared with dn=0.25. This is in contrast to the experiments with tone modulation where the tone and noise modulation were in phase (see Fig. 4A). The resulting psychometric functions and their Weibull fits (Fig. 4E) show that the behavioral accuracy increased and thresholds decreased as the noise modulation depths increased. As in previous cases, there were no significant changes in the relationship between reaction time and tone level as a function of the noise modulation depth (ANOVA after bootstrapping, F(3,997)=1.14, p=0.33).
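    The asymmetry between peak and trough can be made concrete with a short calculation. The sketch below assumes a sinusoidal envelope of the form 1 + d·sin(2πf·t) applied to an unmodulated waveform without level renormalization; that form is an assumption, although it reproduces the increase in peak amplitude of up to 6 dB for a depth of 1 mentioned in the text. The function names are illustrative.

```python
import math


def peak_change_db(depth):
    """Change in peak envelope level (dB) relative to the unmodulated waveform."""
    return 20.0 * math.log10(1.0 + depth)


def trough_change_db(depth):
    """Change in trough envelope level (dB); unbounded at full modulation depth."""
    if depth >= 1.0:
        return float("-inf")
    return 20.0 * math.log10(1.0 - depth)


if __name__ == "__main__":
    for d in (0.25, 0.5, 0.75, 1.0):
        print(d, round(peak_change_db(d), 2), trough_change_db(d))
```

    Under this assumption the peak grows by at most about 6 dB, while the trough falls without bound as the depth approaches 1, which is why depth changes matter most for whichever waveform's dip governs detection.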

    The exemplar data and data from some other tone frequencies (fc) are summarized in Figure 5. For all examples and data shown, the tone and noise modulation frequencies were held constant and equal at 10 Hz. As expected from Figure 4A, varying the depth of tone modulation resulted in increased tone detection thresholds when tone and noise modulations were in phase (δϕ=0 °) (Fig. 5A). The exemplar case of Figure 4A–C is shown in blue. The threshold changes as a function of ds were significantly different from zero for each case (ANOVA after bootstrapping, p<0.01) and were fit with a line. The slopes of the linear fits at the different tone frequencies were all significantly different from zero (t test for slopes, p<0.01 in all cases) and were not significantly different from each other (ANOVA after bootstrapping, F(2,997)=1.48, p=0.228). This result could be because (1) the noise and the tone modulations became more similar as the depth of tone modulation increased or (2) the signal energy in the dips of the masker decreased with increased depth of tone modulation. When the tone and noise modulations were 180 ° out of phase at tone onset (δϕ=180 °), dip listening theories would predict that the trend would be reversed relative to the in-phase condition, due to the increase in the amplitude of the signal peak during the dips of the masker. The experimental test of the hypothesis showed that the trend between threshold and tone modulation depth when the tone and noise modulations were anti-phase at tone onset was reversed relative to when the modulations were in phase (Fig. 5B). Increasing the depth of modulation of the tone caused a decrease in the tone detection thresholds. The threshold changes were significantly different from zero (t test for slopes, p<0.008 in all cases). The relationship between threshold and ds was best captured by a linear fit. This trend held across all tone carrier frequencies tested. The slopes of the linear fit were not significantly different from each other for the various frequencies tested (ANOVA after resampling, F(2,997)=1.79, p=0.1675). Note that the threshold difference between the highest and lowest modulation depth conditions was smaller when δϕ=180 ° (modulations were anti-phase) compared to when δϕ=0 ° (modulations were in phase). This result is consistent with smaller increases in the peak of the modulated signal with increases in modulation depth (important for δϕ=180 °) as opposed to large decreases in trough depth with increases in modulation depth (important for δϕ=0 °) (e.g., Malone et al. 2010).

    The effect of varying noise modulation depth on tone detection thresholds is shown in Figure 5C and D. Changing the noise modulation depth changes the depth of the dip in the masker; thus, lower noise modulation depths were expected to be associated with higher tone detection thresholds when the tone and noise modulations are anti-phase, and vice versa. As shown in Figure 4E, increasing the depth of modulation of noise (dn) caused a decrease in tone detection thresholds when the tone and noise modulations were 180 ° out of phase. This trend is summarized for the exemplar frequency (shown in blue) and for some other frequencies (other colors) in Figure 5D. The thresholds varied significantly as a result of changing dn (t test for slopes, p<0.01 in all cases), and the relationship between them was captured by a linear fit. The slopes of the linear fit were not significantly different across frequency (ANOVA after resampling, F(2,997)=2.013, p=0.15). The threshold changes as a result of changing dn when δϕ=180 ° were comparable to the threshold differences after changing ds when δϕ=0 ° (compare Fig. 5A and D).


    Surprisingly, changing dn while keeping δϕ=0 ° did not result in a significant change in tone thresholds (Kruskal Wallis test, p>0.3 in each case). Figure 5C shows the summary of two examples of changing dn (using two different tone frequencies). In these cases, the tone and noise were modulated at 10 Hz, and the modulations were in phase. The slope of the relationship between modulation depth and tone threshold was not significantly different from 0 for either of the two examples or the several other tone carrier frequencies tested (t test for slopes, p>0.24 in each case).

    Effect of Modulation Frequency

    If the tone and noise were modulated at the same frequency (fms=fmn), one would expect that the tone detection threshold would be high; when the modulation frequencies are different, detection thresholds would be expected to be lower than in the equal modulation frequency case (Bregman 1994). We tested that prediction by varying the tone modulation frequency or the noise modulation frequency across blocks. The results of two experiments are shown in Figure 6. Figure 6A–C shows the results of an experiment in which the tone

    [Figure 5 (panels A–D); fms=fmn=10 Hz. Y-axis: threshold (dB SPL). A: Variation of tone modulation depth (ds), dn=1, δφ=0º. B: Variation of tone modulation depth (ds), dn=1, δφ=180º. C: Variation of noise modulation depth (dn), ds=1, δφ=0º. D: Variation of noise modulation depth (dn), ds=1, δφ=180º. Legends give monkey (C or D) and fc (8, 12.8, or 25.6 kHz); data and fit are shown.]

    FIG. 5. The effects of varying depths of tone and noise modulations. A Threshold as a function of change in the depth of tone modulation. Thresholds are shown for three different tone frequencies (different colors) at various depths of tone modulation. The relationship was best captured by a linear fit (solid lines). The tone and noise modulations were in phase during these blocks. B Similar to A, but for these blocks, the tone and noise modulations were anti-phase (phase difference=180 °). C Threshold as a function of depth of noise modulation. Format is the same as in A and B. For these blocks, the tone and noise modulations were in phase. D Similar to C, but for these blocks, tone and noise modulations were anti-phase.


    [Figure 6 (panels A–F). Left column (A–C): Variation of tone modulation freq.; fc=3.2 kHz, fmn=10 Hz, ds=dn=1, δφ=0˚; legend: fms=10, 20, 40. Right column (D–F): Variation of noise modulation freq.; fc=25.6 kHz, fms=10 Hz, ds=dn=1, δφ=0˚; legend: fmn=10, 20, 40. X-axis: tone level (dB SPL). Y-axes: proportion of lever release (A, D); probability correct, p(c) (B, E); reaction time (s) (C, F). Data and fit are shown.]

    FIG. 6. The effects of varying modulation frequency of tones (A–C) or noise (D–F) on detection of modulated tone in modulated noise. Format of the figure is the same as in Figure 4. A–C Hit rate vs. tone level (A), psychometric functions, Weibull cdf fits and detection thresholds (B), and reaction times vs. tone level (C) during detection of modulated tone in modulated noise. Tone frequency was 3.2 kHz; noise level was 55 dB, noise modulation frequency was 10 Hz, tone and noise modulation depths were 1.0 each, and the modulations were in phase at tone onset. Data are shown for three tone modulation frequencies: 10 Hz (blue), 20 Hz (green), and 40 Hz (red). D–F Similar to A–C, but as noise modulation frequencies were changed. Tone frequency was 25.6 kHz, noise level was 55 dB, frequency of tone modulation was 10 Hz, depth of modulation of tone and noise was 1.0, and the tone and noise modulations were in phase at tone onset.


    modulation frequency was changed between blocks (varying fms), and Figure 6D–F shows the results of an experiment in which noise modulation frequency (fmn) was varied. In both cases, the modulations of tone and noise were in phase at the onset of the tone (onset phase difference, δϕ=0 °). Figure 6A shows the hit rates as the tone modulation frequency was changed between 10 (blue), 20 (green), and 40 Hz (red). The false alarm rates were zero in all cases and are shown as separated dashed lines for clarity. As expected, for each modulation frequency, the hit rates matched false alarm rates for low sound levels and then increased rapidly until they reached high values close to unity for higher sound levels (Fig. 6A). The effect of changing modulation frequency of the tone was that the tone level at which the hit rates diverged from false alarm rates was lower as fms changed from 10 to 20 Hz and differed more from the modulation frequency of noise (Fig. 6A). But at a higher fms, the threshold did not change much. This was also true at a higher fms value tested (80 Hz, data not shown). The behavioral accuracy was computed from the hit rates, and the psychometric functions and Weibull fits in Figure 6B show that the detection thresholds decreased as the tone modulation frequency increased from 10 to 20 Hz (compare blue and green symbols and lines), but did not show a large change going from fms=20 Hz to fms=40 Hz. The reaction times as a result of changing the tone modulation frequency are shown in Figure 6C. As in previous cases, reaction times decreased as tone level increased under each of the tone modulation frequency conditions. The relationship between reaction time and tone level was best captured by a linear fit (shown by the blue, green, and red lines). There was no change in the relationship between reaction time and tone level as a result of changing the tone modulation frequency (slopes were not different, intercepts were not different). Reaction times examined in greater detail as a function of modulation frequency (similar to Fig. 3C and D) did not show a trend when examined with absolute sound level or with sound level re: threshold (data not shown).

    Figure 6D–F shows an example in which fmn was varied over different blocks. As mentioned above, the phase difference between the modulations was zero at tone onset. Figure 6D shows the hit rates, using the same format as Figure 6A. False alarm rates were ∼6 % for the fmn=10 Hz condition (blue dashed lines), but were zero for the other two conditions (green and red dashed lines). As with Figure 6A, the false alarm rates are shown staggered for the two cases when they were zero. The effect of changing fmn was different from the effect of changing fms. At fmn=20 Hz, the tone levels required to change the hit rate from the false alarm rate to higher levels were reduced relative to fmn=10 Hz (compare blue and green symbols, Fig. 6D). When the noise modulation frequency was changed to 40 Hz, the tone levels required to change the hit rate to levels above the false alarm rate increased above those for the 20-Hz condition, but were still lower than those for the 10-Hz condition. This trend was reflected in the psychometric functions and their Weibull fits (Fig. 6E). Psychometric functions for fmn=20 Hz were shifted to lower levels relative to those for fmn=10 Hz, as well as those for fmn=40 Hz; the psychometric functions for fmn=40 Hz were shifted to lower levels relative to fmn=10 Hz (Fig. 6E). As in previous cases, reaction times decreased as the tone levels increased and were best related to tone level by a linear fit. The linear fit was not significantly impacted by changes in fmn. A closer examination revealed that reaction times were not impacted by fmn, whether one examined the relationship based on absolute tone sound pressure level or tone sound pressure level re: threshold (data not shown).

    Figure 7 summarizes the results of effects on threshold at various fc values as a result of changing fmn or fms. Figure 7A shows the effect of varying fms while keeping δϕ=0 °. Theories of dip listening predict that the detection thresholds would be lower when tone modulation frequencies increased, due to more signal energy in the dip of the masker. Each color and symbol represents a different tone frequency (fc) tested (see legend in Fig. 7B for details). For all of these cases, fmn=10 Hz and noise level was 55 dB overall level. The detection threshold was largest at fms=10 Hz and was lower for higher values of fms. The thresholds for fms>10 Hz were not different from each other (ANOVA after resampling for each frequency, p>0.2). A similar trend held when the noise modulation frequency (fmn) was changed for the same fc values tested (Fig. 7B). When noise modulation frequencies varied, previous studies have found that the thresholds increased due to a reduction in the duration of the masker dip, and thus smaller integration time (e.g., Velez and Bee 2010). In these experiments, tone detection thresholds were highest when fmn=10 Hz and were lower for the other values of fmn. However, thresholds at fmn=20 Hz were lower than those for higher fmn values, a trend that held for all fc values (ANOVA after resampling, p>0.17).

    One concern is that when the modulation frequency was changed, the instantaneous phase of the tone modulation waveform and the noise modulation waveform changed as a function of time. If a subject had multiple looks at the stimuli during the tone presentation (i.e., the subject were to sample instantaneous signal and noise waveforms multiple times) and based the response on instantaneous phase difference, then the phase difference at tone onset would have no influence on the effect of modulation frequency on detection thresholds. This was tested by measuring the effect of modulation frequency with δϕ=180 °. As a result of this manipulation, the relationship between


    thresholds and modulation frequency had an inverted shape relative to δϕ=0 °. One example is shown for changes in fms and one for changes in fmn. Both δϕ=0 ° and δϕ=180 ° cases are shown for both modulation frequency variations. When δϕ=0 ° and fms was varied, thresholds at 10 Hz were highest, and thresholds at higher fms values were not different from each other (Fig. 7C, see red symbols). When δϕ=180 °, the thresholds at 10 Hz were lower than thresholds at higher fms values, and the thresholds at higher fms values were not different from each other (blue symbols and lines, Fig. 7C). Note that the thresholds at fms≥20 Hz did not differ as a function of δϕ (Kruskal Wallis test after resampling, p>0.11 at every fc value tested). This trend was true for other tone frequencies tested (data not shown). Similarly, changing the δϕ values while varying fmn caused the relationship between fmn and threshold to be inverted relative to δϕ=0 °. When δϕ=0 °, thresholds were highest at fmn=10 Hz, lowest at fmn=20 Hz, and had values intermediate between the above two at higher fmn values (red symbols and lines, Fig. 7D). When δϕ=180 °, thresholds at 10 Hz were lowest, and thresholds were higher at the other fmn values. Similar to when fms was varied, the thresholds for fmn≥20 Hz in the δϕ=180 ° and the δϕ=0 ° conditions were not significantly different from each other (Kruskal Wallis test after resampling, p>0.2 for every fc value). The same trend was observed at all fc values tested (data not shown).
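    The multiple-looks concern rests on the fact that, for sinusoidal modulators of unequal frequency, the phase difference between the two envelopes drifts linearly with time from its onset value. A minimal sketch of that drift follows; the function name and arguments are illustrative, and pure sinusoidal modulators are assumed.

```python
def instantaneous_phase_difference(t_s, fms_hz, fmn_hz, onset_dphi_deg=0.0):
    """Phase difference (degrees, wrapped to 0-360) between tone and noise
    modulators at time t_s after tone onset, assuming sinusoidal modulators."""
    return (onset_dphi_deg + 360.0 * (fms_hz - fmn_hz) * t_s) % 360.0


if __name__ == "__main__":
    # With fms = 20 Hz and fmn = 10 Hz the modulators sweep through every
    # relative phase once per 100 ms, whatever the onset phase difference.
    for t_ms in (0, 25, 50, 75, 100):
        print(t_ms, instantaneous_phase_difference(t_ms / 1000.0, 20.0, 10.0))
```

    When fms=fmn, the phase difference stays at its onset value throughout the tone; once the frequencies differ, the onset value only sets the starting point of the sweep.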

    Predictions of a Model Based on Stimulus Structure

    In situations such as this, it is instructive to look at a simple model to fit the behavioral data to attempt to infer the computations taking place underlying this behavior. Our goal is to compare the best model with

    [Figure 7 (panels A–D); ds=dn=1. Y-axis: threshold (dB SPL). A and C: Variation of tone modulation freq.; x-axis: tone modulation frequency (fms, Hz), 10 to 80; fmn=10 Hz. B and D: Variation of noise modulation freq.; x-axis: modulation frequency of noise (fmn, Hz), 10 to 80; fms=10 Hz. A and B: δφ=0º; legend: D, fc=0.8 kHz; D, fc=1.6 kHz; D, fc=3.2 kHz; D, fc=25.6 kHz; C, fc=3.2 kHz; C, fc=12.8 kHz; C, fc=25.6 kHz. C and D: fc = 3.2 kHz; legend: C, δφ=180˚; C, δφ=0˚; D, δφ=0˚; D, δφ=180˚.]

    FIG. 7. The effects of varying the frequency of tone or noise modulations. A Threshold as a function of tone modulation frequency. Thresholds for detection of tones of varying carrier frequencies (fc, different colors and symbols; legend with panel B) in modulated noise, when tone modulation frequencies were changed. The tone and noise modulation frequencies were in phase during these blocks. B Similar to A, but shows detection thresholds for tones having same fc values as in A (different colors) in modulated noise when the noise modulation frequencies were varied. C Threshold as a function of tone modulation frequency when the tone and noise modulations were in phase (red) or anti-phase (blue) at tone onset for two subjects (solid and dashed lines, respectively). The tone frequency was 3.2 kHz. D Similar to C, but for variations in frequency of noise modulation.


    empirical results and models of processing at each of the various stages of the pathway to localize transformations in signal processing. The sinusoidal change in threshold with the variation of modulation phase differences suggests that a difference model would fit the threshold changes as a result of the manipulation of tone and noise parameters, reminiscent of the equalization-cancellation (EC) model proposed for binaural processing (Durlach 1963). Note that a secondary formulation of the model would involve just subtraction of the envelopes, similar to models proposed by Hall et al. (1988); the trend in the predicted results of a model that computed envelope differences matched those of the proposed model, but getting threshold equivalents proved problematic since the envelope subtraction model was independent of stimulus or noise levels. An alternate formulation of the model computed signal to noise ratio at the dips of the masker (dips were designated as the time intervals when the instantaneous masker levels fell below steady state levels, after Velez and Bee (2010)). The results of this model showed trends that did not match the behavioral results for the effect of manipulating tone modulation frequency and δϕ=0 ° (see below). To formalize the difference model in our analysis, we computed the difference between the amplitude-modulated signal waveform and the amplitude-modulated noise waveform (SAM(t)−NAM(t)) as a function of time for each tone level. The amplitudes of the tone and noise waveform were logarithmically transformed so as to match sensitivity to perceptually related parameters. If the log-transformed noise amplitude for t=t0 was larger than the log-transformed signal amplitude at t=t0, the signal would be masked; in those cases, we set the difference equal to zero. When the log-transformed signal amplitude exceeded the log-transformed noise amplitude (either positive or negative), the absolute value of the difference was calculated. The total energy of this difference function was calculated by integrating the difference waveform numerically over time, consistent with the finding that for short duration (<1 s) signals, signal to noise ratio is best expressed as a dimensionless quantity of signal energy to noise spectrum level (e.g., Green et al. 1959). The area under this accumulated difference curve was calculated using the trapezoidal rule and should be directly related to the behavioral performance at that tone level. That is, if the area under the difference curve increased with the parametric variation, it is expected that the hit rate would increase. The area was calculated at all tone levels that were used in the experiment. We then made an assumption that the criterion for behavioral threshold was the same across all conditions and used a criterion to define the threshold for a specific set of parameters. We set the criterion such that the threshold to tones alone matched audiometric thresholds. The criterion was varied in the simulation; the specific value of the criterion changed the absolute threshold level, but did not affect the change in threshold as a result of the parametric manipulation. Figure 8 provides the results of such a model calculation.
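    The steps of the difference model described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' implementation: it operates on the modulation envelopes rather than on the full waveforms, treats noise_db as an assumed effective masker level, clamps the log-transformed envelopes at an arbitrary floor so that a full-depth trough stays finite, and uses a hypothetical 200-ms tone duration. All names, default values, and the criterion are illustrative.

```python
import math


def envelope_db(level_db, depth, fm_hz, phase_deg, t_s, floor_db=-20.0):
    """Log-transformed (dB) sinusoidal AM envelope, clamped at an assumed floor."""
    m = 1.0 + depth * math.sin(2.0 * math.pi * fm_hz * t_s + math.radians(phase_deg))
    if m <= 0.0:
        return floor_db  # envelope reaches zero at a full-depth trough
    return max(level_db + 20.0 * math.log10(m), floor_db)


def difference_area(tone_db, noise_db, dphi_deg, fms=10.0, fmn=10.0,
                    ds=1.0, dn=1.0, dur_s=0.2, n=2001):
    """Trapezoidal integral of the rectified (signal - noise) dB difference."""
    dt = dur_s / (n - 1)
    diffs = []
    for i in range(n):
        t = i * dt
        s = envelope_db(tone_db, ds, fms, dphi_deg, t)
        nz = envelope_db(noise_db, dn, fmn, 0.0, t)
        diffs.append(max(s - nz, 0.0))  # masked samples contribute zero
    return dt * (sum(diffs) - 0.5 * (diffs[0] + diffs[-1]))


def model_threshold(levels_db, criterion, **conditions):
    """Lowest tested tone level whose difference area reaches the criterion."""
    for level in sorted(levels_db):
        if difference_area(level, **conditions) >= criterion:
            return level
    return None


if __name__ == "__main__":
    levels = range(0, 61, 2)
    for dphi in (0.0, 90.0, 180.0):
        print(dphi, model_threshold(levels, 1.0, noise_db=30.0, dphi_deg=dphi))
```

    In this sketch, an in-phase tone below the masker level never exceeds the masker envelope, so its difference area is zero, whereas an anti-phase tone of the same level exceeds the masker around the masker's dips. The same criterion therefore yields a lower threshold in the anti-phase condition, in the direction of the behavioral result.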

    The results shown in Figure 8 were obtained using a tone frequency of 1 kHz. The results did not change with the use of other frequencies, so the results shown could represent any frequency within the audible range of the primate. The simulation was such that the model predicted a threshold of 1.5 dB SPL when a 1-kHz tone was presented alone, and the predicted threshold in 55 dB noise was 29 dB SPL. These values were similar to the actual threshold values measured for the two monkeys used as subjects in this study (tone alone: monkey C: −0.5 dB; D: −0.8 dB; tone in 55 dB noise: C: 29.5 dB; D: 30.5 dB; see Fig. 1). The criterion area value used to define threshold for all subsequent simulations was kept identical to that for the unmodulated tone presented alone and in noise. The noise and the tone were amplitude-modulated, and the model was run as described. The effects of manipulating the modulation phase difference on the model thresholds are shown in Figure 8A. The thresholds were highest at δϕ=0 °; as the values of δϕ increased from 0 to 180 °, thresholds decreased and then increased as δϕ wrapped back around to 360 ° (green circles, Fig. 8A). This trend in model thresholds closely matches the trend in the data shown in Figure 3A and B. The relationship between the threshold and δϕ was best fit with a sinusoidal function with an amplitude of 13.9 dB (green line, Fig. 8A), much like the behavioral data were fit by a sinusoid (Fig. 3A and B). The magnitude of the effect of phase difference on model thresholds was very similar to its effect on the behavioral thresholds (compare Fig. 3A and B with Fig. 8A). This suggests that a difference model is sufficient to capture the effects of changing δϕ.

    FIG. 8. Results of a simple energy difference model that predicts variation in behavioral threshold as a result of the parameter manipulations for the studies presented here. Parameters match those used in the experiments. A Effect of varying phase difference between tone and noise modulations. Circles show thresholds, solid line represents best fit to the data. B Effect of varying frequency of noise amplitude modulation. Thresholds are shown when tone and noise modulation were in phase (red triangles) and anti-phase (blue diamonds) at tone onset. Dashed and dotted lines show best fit to the symbols (exponential functions). C Similar to B, but for variations in tone amplitude modulation frequency. Inset. The results of behavioral experiments in two monkeys testing the effects of frequency of tone modulation between 10 and 20 Hz show an undershoot in threshold to match model predictions when tone and noise modulations were in phase at tone onset. D Effects of varying depth of noise modulation. Format is similar to C. E Similar to D, but parameter varied was depth of tone modulation.


    [Figure 8, panels A–E. A: threshold (dB SPL) vs. δφ, onset phase difference (°). B: threshold (dB SPL) vs. noise modulation frequency (fmn, Hz) for δφ = 0° and δφ = 180°. C: threshold vs. tone modulation frequency (fms, Hz) for δφ = 0° and δφ = 180°; inset: threshold (dB SPL) vs. tone modulation frequency (Hz) for δφ = 0°. D: threshold (dB SPL) vs. depth of noise modulation (dn) for δφ = 0° and δφ = 180°. E: threshold vs. depth of tone modulation (ds) for δφ = 0° and δφ = 180°.]


    Figure 8B and C show the effect of varying the frequency of amplitude modulation of the tone or the noise on the model thresholds. Figure 8B shows the effects of varying the frequency of noise modulation (fmn) while keeping the frequency of tone modulation (fms) constant at 10 Hz. With δϕ=0°, as fmn diverged from fms, the thresholds decreased rapidly and then saturated for fmn≥20 Hz (red triangles, Fig. 8B). The range of model threshold values and their trend are similar to the behavioral data shown in Figure 6B. This relationship was best captured by an exponential function of (fmn−fms) (red dashed line, Fig. 8B). With δϕ=180°, the thresholds increased from a smaller value at fmn=10 Hz and saturated at fmn≥20 Hz (blue diamonds, Fig. 8B). It is noteworthy that the model thresholds matched for the two δϕ values, in a manner similar to the behavioral data (Fig. 7D). This result is consistent with behavioral thresholds for δϕ=180° that increased as fmn increased and saturated at fmn values larger than 20 Hz (Fig. 7D). An exponential function of (fmn−fms) best fit the model thresholds. Note also that the model predicts the threshold changes in the δϕ=0° condition to be larger than those in the δϕ=180° condition. This is, in fact, consistent with the behavioral data from both subjects in this study (see Fig. 7D).
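The saturating exponential dependence on (fmn−fms) can be illustrated with a short sketch. The functional form below (an exponential approach to a common plateau) is one plausible reading of "exponential function", and every numerical value (start thresholds, plateau, and the constant `tau_hz`) is hypothetical, chosen only so that both curves are close to saturation by fmn=20 Hz; none of these values come from the paper.

```python
import math

def saturating_threshold(f_diff_hz, start_db, plateau_db, tau_hz):
    """Threshold that starts at start_db when f_diff_hz = 0 and
    approaches plateau_db exponentially as f_diff_hz grows."""
    return plateau_db + (start_db - plateau_db) * math.exp(-f_diff_hz / tau_hz)

# Hypothetical parameters: fms fixed at 10 Hz, fmn varied.
FMS_HZ = 10.0
PLATEAU_DB = 22.0      # assumed common saturation value
TAU_HZ = 3.0           # assumed rate constant

def in_phase(fmn_hz):      # delta-phi = 0: starts high, falls to the plateau
    return saturating_threshold(fmn_hz - FMS_HZ, 30.0, PLATEAU_DB, TAU_HZ)

def anti_phase(fmn_hz):    # delta-phi = 180: starts low, rises to the plateau
    return saturating_threshold(fmn_hz - FMS_HZ, 17.0, PLATEAU_DB, TAU_HZ)

for fmn in (10, 20, 40, 80):
    print(fmn, round(in_phase(fmn), 2), round(anti_phase(fmn), 2))
```

With these assumed values, the two curves start on opposite sides of the plateau and converge for fmn≥20 Hz, and the in-phase curve spans the larger range, which mirrors the qualitative pattern described in the text.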

    The effects on threshold of changing fms while holding fmn constant are shown in Figure 8C. As fms diverged from fmn, and with δϕ=180°, the model predicted thresholds that increased and then saturated for fms≥20 Hz (blue diamonds, Fig. 8C), similar to the effect of changing fmn. The behavioral results were similar to this function (see Fig. 7C). However, when δϕ=0°, the area increased as fms increased, attained a peak value at fms=13 Hz, and then decreased to saturate for fms≥20 Hz. Because of the low resolution of the sampling of fms values in the data reported in Figure 7, the correlation with the behavioral values is not clear. To clarify the match of this model result with behavior, we tested the monkeys at fms=11, 12, and 13 Hz. The behavioral results (see Fig. 8C, inset) show that the thresholds initially decreased, attained a minimum at fms=13 Hz, and then increased at fms=20 Hz. This trend, which was observed for both monkeys (different colors, Fig. 8C, inset), matched the model prediction. Further analyses with the model indicated that the dip in threshold coincided with the frequency at which the lowest cross-correlation was obtained with a 10-Hz sine wave (representing the noise modulation) for a 200-ms duration signal, and with the frequency at which the modulation envelopes of the signal and the noise were most different cumulatively. The range of threshold values and their trend matched behavioral values. As with fmn values, the model predicted similar thresholds in the δϕ=0° and δϕ=180° conditions for fms≥20 Hz and predicted smaller threshold changes in the δϕ=180° condition relative to the δϕ=0° condition. Both of these were also consistent with behavioral results (Fig. 7).
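The cross-correlation analysis mentioned above can be sketched as a frequency sweep. This is an illustration under stated assumptions, not the authors' procedure: it uses the zero-lag normalized correlation between raw sine waves that both start at zero phase, a 200-ms window, a hypothetical sampling rate `fs_hz`, and an integer-frequency grid. Because the location of the minimum depends on these choices, the sketch need not land exactly on the 13 Hz reported in the text.

```python
import math

def normalized_correlation(f_ref_hz, f_test_hz, duration_s=0.2, fs_hz=10000):
    """Zero-lag normalized correlation between two sine waves over a
    finite window (both start at zero phase; fs_hz is an assumption)."""
    n = int(round(duration_s * fs_hz))
    ref = [math.sin(2 * math.pi * f_ref_hz * i / fs_hz) for i in range(n)]
    test = [math.sin(2 * math.pi * f_test_hz * i / fs_hz) for i in range(n)]
    dot = sum(r * t for r, t in zip(ref, test))
    norm = math.sqrt(sum(r * r for r in ref) * sum(t * t for t in test))
    return dot / norm

# Sweep the tone modulation frequency against a 10-Hz noise modulation.
corr = {f: normalized_correlation(10.0, float(f)) for f in range(10, 21)}
least_similar_hz = min(corr, key=corr.get)

for f in sorted(corr):
    print(f, round(corr[f], 3))
print("lowest correlation at", least_similar_hz, "Hz")
```

The correlation is 1 at 10 Hz, falls below zero a few hertz above it, and returns toward zero near 20 Hz, so the least similar modulation frequency lies between those two values for a 200-ms window.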

    The model predictions for the effects of modulation depth are shown in Figure 8D and E. For these calculations, fms and fmn were held constant at 10 Hz. When the depth of tone modulation (ds) increased from 0 to 1, and δϕ=0°, the predicted thresholds increased (red triangles, Fig. 8E) and were fit with a straight line. This trend and its magnitude are both consistent with the behavioral data (compare with Fig. 5). When ds increased from 0 to 1, and δϕ=180°, the predicted thresholds decreased (blue diamonds, Fig. 8E) and were best fit with a straight line. This trend and the range of thresholds predicted were also consistent with the behavioral data (Fig. 5). Note that the model is consistent with larger threshold differences in the δϕ=180° condition relative to the δϕ=0° condition. When the depth of noise modulation (dn) increased from 0 to 1, and δϕ=180°, the thresholds decreased (blue diamonds, Fig. 8D) and were fit with a straight line. This is consistent with behavioral thresholds decreasing under the same conditions (Fig. 5). When dn increased from 0 to 1, and δϕ=0°, the model thresholds stayed identical for modulation depths of 0 to 0.75 and then increased by 2 dB for a noise modulation depth of 1 (red triangles, Fig. 8D). The model thresholds were fit with a straight line, the slope of which was not different from zero (t test for slopes, p=0.473). This result is also similar to the behavioral data, which suggest that varying dn does not significantly change amplitude-modulated tone detection thresholds (Fig. 5). This may be related to increased sensitivity to tone modulation relative to noise modulation as a result of two-tone suppression at the level of the auditory nerve. Note also that the model is consistent with larger threshold differences in the δϕ=180° condition relative to the δϕ=0° condition. Thus, a single energy difference accumulation model can account for all the results.

    DISCUSSION

    The results of this study show the effects of varying the temporal relationship between time-varying signal and time-varying noise. By systematically varying the various parameters that characterize the relationship between signal and noise modulations, the results of this study show that the computations underlying the detection of signal in noise are consistent with a differencing operation.

    Comparison with Previous Results

    The power spectrum model of hearing suggests that during the masked detection of a sine-wave signal, the subject utilizes information from the auditory filter that is centered on the sinusoid to be detected (Moore 2003). While there is evidence that the system is able to utilize across-frequency cues in certain circumstances, a test of within- vs. across-frequency band cues requires manipulating the bandwidth of the noise, which was not done in this study. The results of this study are generally consistent with the finding that when the modulation properties of the signal and the noise are different, signal detection thresholds are lower; when the modulation properties of signal and noise are similar, signal detection thresholds are higher. These results are consistent with previous findings in humans (e.g., McFadden 1987; Cohen and Schubert 1987; Fantini and Moore 1994), passerines (Langemann and Klump 2007), and corvids (Jensen 2007).

    Recent studies have suggested that dip listening (listening selectively during the trough of the masker) is sufficient to explain changes in detection thresholds or recognition thresholds (e.g., Velez and Bee 2010, 2011). Some of the results of this study are generally consistent with dip listening mechanisms that account for trends in threshold changes. For example, changing δϕ changes the amount of signal energy in the dip of the masker; the least signal energy was at δϕ=0°, and the most energy was at δϕ=180°, which would result in threshold changes consistent with the trends observed in Figure 3B. However, dip listening theories predict that as the modulation frequency of the masker increases, detection thresholds should increase because of the reduced duration of the dips (e.g., Gustafsson and Arlinger 1994; Bacon et al. 1998; Velez and Bee 2010, 2011). That prediction is not consistent with the findings after the manipulation of noise modulation frequency (Fig. 7). In fact, the only model that explains the data across all conditions is the energy difference model (see Fig. 8).
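The dependence of signal energy in the masker dip on δϕ can be made concrete with a small numerical sketch. It is an illustration only and not part of the model used in this paper. The sinusoidal envelope form 1 + d·sin(2πft), the definition of the "dip" as the interval in which the noise envelope is below its mean, the full modulation depth, and the sampling rate are all assumptions.

```python
import math

def signal_energy_in_dip(dphi_deg, f_mod_hz=10.0, depth=1.0,
                         duration_s=0.2, fs_hz=10000):
    """Fraction of tone-envelope energy that falls where the noise
    envelope is below its mean (an assumed definition of the dip)."""
    n = int(round(duration_s * fs_hz))
    dphi = math.radians(dphi_deg)
    in_dip = 0.0
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / fs_hz   # midpoint sampling avoids exact zero crossings
        noise_env = 1.0 + depth * math.sin(2 * math.pi * f_mod_hz * t)
        tone_env = 1.0 + depth * math.sin(2 * math.pi * f_mod_hz * t + dphi)
        energy = tone_env ** 2
        total += energy
        if noise_env < 1.0:
            in_dip += energy
    return in_dip / total

dip_energy = {p: signal_energy_in_dip(p) for p in range(0, 360, 45)}
for p in sorted(dip_energy):
    print(p, round(dip_energy[p], 3))
```

Under these assumptions the fraction is smallest at δϕ=0°, largest at δϕ=180°, and equal for phase differences equidistant from 180° (for example 90° and 270°).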

    In general, detection experiments involving modulated sounds have been done in the context of comodulation. In comodulation masking release (CMR) experiments, the detectability of static signals of various sorts was determined in the presence of multiple narrow bands of noise having either the same or different modulations or in the presence of bandpass noise that was amplitude-modulated (e.g., Hall et al. 1984; Cohen and Schubert 1987; Hall 1986; McFadden 1986). The present experiments cannot be directly compared with CMR studies because both signal and noise were modulated in this study. A better comparison would be studies of comodulation detection differences (CDDs), in which subjects were asked to detect a modulated band of noise that was simultaneously masked by one or more spectrally non-overlapping noisebands (called cue or flanking bands) that were also modulated. When the modulation of the signal band is different from that of the flanking band(s), detection thresholds can be 10–12 dB better when all of the flanking band envelopes are the same (co-uncorrelated condition) than when all the flanking band modulations are different (all random condition; Cohen and Schubert 1987; McFadden 1987; Wright 1990; Fantini and Moore 1994). Experiments in corvids and passerines showed similar threshold changes under CDD measurements (Langemann and Klump 2007; Jensen 2007), suggesting that the CDD is not specific to humans but may be a general processing mode used to segregate sounds in complex environments (Cohen and Schubert 1987; Bee and Micheyl 2008). The experimental conditions in this paper are similar to the "all correlated" condition in CDD studies (when the signal and the masker had similar modulations) or the "all uncorrelated" condition (when the signal and masker had different modulations). While the experiments in the current study did not test CMR or CDD explicitly by using bands of noise as signal or noise, the results are consistent with the large threshold changes that result from changes in the correlation between the signal and noise in the above studies.

    The parameters manipulated in this paper have also been manipulated elsewhere, but mainly in studies of the CMR, in which signals were unmodulated and maskers were modulated (e.g., Hall et al. 1988; Schooneveldt and Moore 1989; Grose and Hall 1989; Fantini 1991). Many studies have documented that human subjects are able to discriminate the modulation parameters manipulated here (e.g., Wakefield and Edwards 1987; Yost and Sheft 1989; Wakefield and Viemeister 1990). Most of the experiments in which the signal and masker were both modulated involved manipulation of the correlation between the different noisebands (noisebands generated with different amplitude and phase parameters; McFadden and Wright 1990; Wright 1990; Borrill and Moore 2002) rather than the depth of modulation, the modulation phase, or the modulation frequency. The threshold changes in the current study were roughly comparable to those seen in human behavior as a result of changed noiseband correlations (e.g., ∼10 dB, McFadden and Wright 1990); however, maskers in previous studies had no spectral overlap with the signal (e.g. Cohen and Schubert 1987; Langemann and Klump 2007). Perhaps the lack of uncertainty about the signal or noiseband modulations contributed to the large effects in the current study (see the ∼15-dB threshold change for varying the modulation phase difference in Fig. 3, and the ∼10–20-dB threshold change while manipulating depth of modulation, Fig. 5).

    In general, the detection or discrimination of target sounds among distractors is facilitated under conditions that promote the perceptual segregation of targets from interferers, especially if targets and interferers share some common features (Gockel et al. 1999; Micheyl and Carlyon 1998; Micheyl et al. 2005) or when they vary rapidly and unpredictably over time (Kidd et al. 1994, 1995, 2002; Micheyl et al. 2007). This suggests that (1) changing the modulation phase difference between signal and noise increases the segregation between signal and noise (Fig. 3); (2) changing the depth of modulation of the tone to values closer to the depth of noise modulation when the tone and noise modulations were in phase decreased the segregation of signal and noise (Fig. 5); (3) when tone and noise modulations were anti-phase, changing the depth of noise modulation to values closer to the tone modulation depth improved the segregation of tone and noise (Fig. 5B and D; probably as a result of enhanced dip listening arising from deeper dips in noise modulation); (4) changing the difference between tone and noise modulation frequency increased segregation when the tone and noise modulations were in phase, whereas the same stimulus manipulation when the modulations were anti-phase at tone onset decreased segregation (Fig. 7). All of these are consistent with theories of auditory scene analysis, which suggest that when modulation parameters are different, stream segregation is enhanced (Bregman 1994). This is also consistent with the idea that amplitude modulation is an important contributor to object formation (Yost and Sheft 1989) and that the monkeys had lower thresholds for detecting signal in noise when the differences between the properties of the signal and noise modulations were larger (implying that signal and noise were treated as two different objects). An addition to the theories of scene analysis here is that the various factors causing stream segregation are not independent; rather, they interact in predictable ways (e.g., modulation phase and depth of modulation).

    Alternatively, all the data in this study could be explained by selectively listening in the dips of the masker. Dip listening caused enhanced behavioral performance when the masker modulation frequencies were low (Gustafsson and Arlinger 1994; Bacon et al. 1998; Velez and Bee 2010). However, these data show that the relationship between the modulation frequencies of signal and masker forms an important determinant of behavioral performance.

    Previous studies have suggested either perceptual segregation of signal and noise, or dip listening, or suppression in the auditory pathways as a mechanism to explain effects such as those seen in this paper (e.g. Borrill and Moore 2002; Moore and Borrill 2002; see Moore 2003 for a review). The perceptual performance in such stream segregation or dip listening tasks can be explained by comparing the different segregated streams (effectively a subtraction operation, similar to Durlach's EC model (1963)). This would suggest that the signal and the noise in this study could be segregated when the parameters of the signal and noise were different (e.g., when δϕ≠0, or when fms≠fmn, or when ds≠dn). Consistent with such a suggestion, the threshold changes that resulted from the manipulations in this study were consistent with a differencing or comparator operation being performed by the auditory system (compare Fig. 8 with Figs. 3, 5, and 7). This is consistent with previous suggestions that subtraction mechanisms may be in play in a detection task (e.g., Hall et al. 1988). However, dip listening, which could not apply to the Hall et al. study, may also apply to the current study: the thresholds were lower when the tone and the noise modulations were anti-phase at the onset of the signal relative to when the tone and noise modulations were in phase. In the cases where modulation frequency was changed, a higher tone modulation frequency meant that even if the signal and noise modulations were in phase at signal onset, there was some signal during the dip of the noise; when signal and noise modulations were anti-phase at signal onset, the energy in the dip decreased as a result of changing the tone modulation frequency, so thresholds increased as the tone modulation frequency increased (Fig. 7). However, even in those conditions where dip listening could explain the behavioral performance, a signal-to-noise difference or comparison best explained the threshold changes. Some previous studies have found that perceptual streaming and dip listening did not apply under certain conditions, and that the only mechanism that could explain detection based on correlations would be neural suppression (Borrill and Moore 2002).

    An interesting finding is that the δϕ values that are equally separated from 180° produced roughly equal thresholds (i.e., thresholds for 45° and 315° phase shifts were very similar, as were thresholds for 90° and 270°, etc.; Figs. 1 and 2). This suggests that the exact timing of the peaks and troughs of the signal and noise did not matter, just whether the peaks were coincident or not. This suggests that the behavioral strategy used by the monkeys did not involve the relative timing of the features of the tone and noise stimulus and was possibly related simply to the stimulus energy.

    The one manipulation that did not affect behavioral thresholds (or simulated thresholds) was varying the depth of noise modulation while the tone and noise modulations were in phase at the onset of the signal (data: Fig. 5C; model: Fig. 8D). Previous results in macaques (Dylla et al. 2013) and in humans (e.g., Hall et al. 1984) suggest that detection thresholds in modulated maskers were much lower than those in steady-state (unmodulated) maskers. These two results together suggest that the system is highly tuned to the salience of the tone modulation, and that any advantage provided by noise modulation was potentially minimal when the tone and noise were in phase.

    An unusual prediction of the model was that the detection threshold for modulated tones would decrease from its high value at fms=10 Hz, be lowest at fms=13 Hz, and then saturate at a higher value for fms≥20 Hz (see Fig. 8C). Thus, 13 Hz represents the tone modulation frequency at which an observer would notice the greatest dissimilarity between the tone and the noise modulation waveforms. This finding lends some credence to the idea that correlations could play a role in the generation of perceptual streams. Such sensitivity might require modulation shape discrimination, such as that observed in the auditory cortex in macaques (Malone et al. 2007).

    Potential Neurophysiological Mechanisms

    While the neurophysiological responses under these exact conditions have not been studied, some studies have examined neuronal responses when signals are masked by modulated noise. Studies in songbirds have shown that changes in envelope correlation between signal and noisebands cause changes in the response thresholds of neurons in the analog of the primary auditory cortex (Bee et al. 2007). The range of changes in these forebrain neuronal thresholds roughly matches those observed behaviorally in the same species (Langemann and Klump 2007). Similarly, other studies in mammalian species have shown that cortical neurons modulate their responses in a manner similar to behavior in response to stimuli in the presence of modulated maskers (Fishman et al. 2001, 2012). One possible mechanism of signal detection in the presence of a time-varying masker is that the masker causes a synchronization of the responses of a population of neurons, and the presentation of a signal (modulated or otherwise) desynchronizes the responses of neurons tuned to the signal parameters from the rest of the neurons (Nelken et al. 1999). Recent studies have found that neurons in auditory cortex are very sensitive to changes in amplitude (both increases and decreases) and function as envelope shape discriminators with a wide range of response characteristics (Malone et al. 2007, 2010). This suggests that cortical neurons would be able to respond differentially to the parameters of the modulations of tones and noise. This is consistent with results from Sutter's laboratory that show that cortical neurons change their responses depending on the modulation parameters as well as the behavioral state of the animal and the variations in behavioral performance (Yin et al. 2011; Niwa et al. 2012a, b; Johnson et al. 2012). While those results suggest that there is enough information in the responses of forebrain neurons to account for behavior, it is not clear whether such information about auditory objects is present in the earlier parts of the brain or what exact computations or mechanisms are involved in the generation of the responses.

    Studies in the visual system have implicated early structures like the primary visual cortex and even the retina in some species in the processing of local vs. global stimulus properties to account for scene segmentation (e.g., Olveczky et al. 2003; Baccus et al. 2008; Nothdurft 1994). Very few studies of the auditory system have looked at neuronal responses in relation to scene segmentation, and they have been mainly in the auditory cortex (e.g. Fishman et al. 2001, 2012; Fishman and Steinschneider 2010; Gutschalk et al. 2005; Nelken and Bar-Yosef 2008). Modulations clearly are a major signal for segregating or integrating sounds. Modulated sounds change responses in multiple parts of the auditory system. The manipulations in this study (the onset phase difference, modulation frequency, and modulation depth) cause changes of activity in many parts of the auditory system such as the cochlear nucleus (CN; e.g. Rhode and Greenberg 1994; Joris et al. 1994; Moller 1976) and the inferior colliculus (e.g., Nelson and Carney 2007; Krishna and Semple 2000; Langner and Schreiner 1988; Muller-Preuss et al. 1994; Rees and Moller 1983). However, very few studies have directly tested the neuronal correlates of signal detection when the masker is temporally modulated (CN: Pressnitzer et al. 2001; Neuert et al. 2004). The studies in the cochlear nucleus found that very few neurons in the ventral CN showed neuronal correlates of enhanced thresholds such as those seen in behavior (Pressnitzer et al. 2001). However, a majority of neurons in the dorsal CN showed such neuronal correlates (Neuert et al. 2004), and these threshold enhancements were postulated to result from wideband inhibition. Nevertheless, the magnitude of neuronal threshold changes in the dorsal CN could not account for behavioral threshold changes. A study in the inferior colliculus (IC) of cats has shown that a majority of IC neurons show responses that are associated with wideband inhibition beyond that observed in the CN (Davis et al. 2003). This suggests that neurons in the IC should show detection threshold changes that are larger than those seen in CN and may be more in line with behavioral observations. These responses may be further modified at the level of thalamus and cortex to represent the behaviorally similar changes in neuronal responses seen in the forebrain during such tasks (e.g., Bee et al. 2007; Fishman et al. 2012).

    ACKNOWLEDGMENTS

    This research was funded by a grant from the National Institutes of Health, R01 DC 11092. The authors would like to thank Mary Feurtado for help during surgery and Bruce and Roger Williams for the hardware. Meagan Quinlan and Dr. Jason Grigsby collected some preliminary data and performed some preliminary data analysis.

    REFERENCES

    BACCUS SA, OLVECZKY BP, MANU M, MEISTER M (2008) A retinal circuit that computes object motion. J Neurosci 28:6807–6817

    BACON SP, OPIE JM, MONTOYA DY (1998) The effects of hearing loss and noise masking on the masking release for speech in temporally complex backgrounds. J Speech Lang Hear Res 41:549–563

    BEE MA, MICHEYL C (2008) The cocktail party problem: what is it? How can it be solved? And why should animal behaviorists study it? J Comp Psychol 122:235–251

    BEE MA, BUSCHERMOHLE M, KLUMP GM (2007) Detecting modulated signals in modulated noise: (II) neural thresholds in the songbird forebrain. Eur J Neurosci 26:1979–1994

    BORRILL SJ, MOORE BC (2002) Evidence that comodulation detection differences depend on within-channel mechanisms. J Acoust Soc Am 111:309–319

    BREGMAN AS (1994) Auditory scene analysis: the perceptual organization of sound. MIT Press, Cambridge

    COHEN MF, SCHUBERT ED (1987) The effect of cross-spectrum correlation on the detectability of a noise band. J Acoust Soc Am 81:721–723

    DAVIS KA, RAMACHANDRAN R, MAY BJ (2003) Auditory processing of spectral cues for sound localization in the inferior colliculus. J Assoc Res Otolaryngol 4:148–163

    DURLACH NI (1963) Equalization and cancellation theory of binaural masking level differences. J Acoust Soc Am 35:1206–1218

    DYLLA M, HRNICEK A, RICE C, RAMACHANDRAN R (2013) Detection of tones and their modification by noise in nonhuman primates. J Assoc Res Otolaryngol 14:547–560

    EFRON B, TIBSHIRANI RJ (1993) An introduction to the bootstrap. Chapman & Hall/CRC, Boca Raton

    FANTINI DA (1991) The processing of envelope information in comodulation masking release (CMR) and envelope discrimination. J Acoust Soc Am 90:1876–1888

    FANTINI DA, MOORE BC (1994) Profile analysis and comodulation detection differences using narrow bands of noise and their relation to comodulation masking release. J Acoust Soc Am 95:2180–2191

    FISHMAN YI, STEINSCHNEIDER M (2010) Neural correlates of auditory scene analysis based on inharmonicity in monkey primary auditory cortex. J Neurosci 30:12480–12494

    FISHMAN YI, VOLKOV IO, NOH MD, GARELL PC, BAKKEN H, AREZZO JC, HOWARD MA, STEINSCHNEIDER M (2001) Consonance and dissonance of musical chords: neural correlates in auditory cortex of monkeys and humans. J Neurophysiol 86:2761–2788

    FISHMAN YI, MICHEYL C, STEINSCHNEIDER M (2012) Neural mechanisms of rhythmic masking release in monkey primary auditory cortex: implications for models of auditory scene analysis. J Neurophysiol 107:2366–2382

    GANS C (1992) An overview of the evolutionary biology of hearing. In: Webster DB, Fay RR, Popper AN (eds) The evolutionary biology of hearing. Springer, New York, pp 3–13

    GOCKEL H, CARLYON RP, MICHEYL C (1999) Context dependence of fundamental-frequency discrimination: lateralized temporal fringes. J Acoust Soc Am 106:3553–3563

    GREEN DM, SWETS JA (1966) Signal detection theory and psychophysics. Krieger, Huntingdon

    GREEN DM, MCKEY MJ, LICKLIDER JCR (1959) Detection of a pulsed sinusoid in noise as a function of frequency. J Acoust Soc Am 31:1446–1452

    GROSE JH, HALL JW 3RD (1989) Comodulation masking release using SAM tonal complex maskers: effects of modulation depth and signal position. J Acoust Soc Am 85:1276–1284

    GUSTAFSSON HA, ARLINGER SD (1994) Masking of speech by amplitude-modulated noise. J Acoust Soc Am 95:518–529

    GUTSCHALK A, MICHEYL C, MELCHER JR, RUPP A, SCHERG M, OXENHAM AJ (2005) Neuromagnetic correlates of streaming in human auditory cortex. J Neurosci 25:5382–5388

    HALL JW (1986) The effect of across-frequency differences in masking level on spectro-temporal pattern analysis. J Acoust Soc Am 79:781–787

    HALL JW, HAGGARD MP, FERNANDES MA (1984) Detection in noise by spectro-temporal pattern analysis. J Acoust Soc Am 76:50–56

    HALL JW 3RD, GROSE JH, HAGGARD MP (1988) Comodulation masking release for multicomponent signals. J Acoust Soc Am 83:677–686

    HAWKINS JEJ, STEVENS SS (1950) The masking of pure tones and of speech by white noise. J Acoust Soc Am 22:6–13

    JENSEN KK (2007) Comodulation detection differences in the hooded crow (Corvus corone cornix), with direct comparison to human subjects. J Acoust Soc Am 121:1783–1789

    JOHNSON JS, YIN P, O’CONNOR KN, SUTTER ML (2012) Ability of primary auditory cortical neurons to detect amplitude modulation with rate an