
Development of the North American Listening in Spatialized Noise–Sentences Test (NA LiSN-S): Sentence Equivalence, Normative Data, and Test–Retest Reliability Studies

DOI: 10.3766/jaaa.20.2.6

Sharon Cameron*†, David Brown‡§, Robert Keith§, Jeffrey Martin**, Charlene Watson††, Harvey Dillon*

Abstract

Background: The Listening in Spatialized Noise–Sentences test (LiSN-S®) was originally developed in Australia to assess auditory stream segregation skills in children with suspected central auditory processing disorder (CAPD). The software produces a three-dimensional auditory environment under headphones. A simple repetition-response protocol is utilized to determine speech reception thresholds (SRTs) for sentences presented from 0 degrees azimuth in competing speech. The competing speech (looped children’s stories) is manipulated with respect to its location (0 degrees vs. +90 degrees and −90 degrees azimuth) and the vocal quality of the speaker(s) (same as, or different to, the speaker of the target stimulus). Performance is measured as two SRT and three advantage measures. The advantage measures represent the benefit in dB gained when either talker, spatial, or both talker and spatial cues combined are incorporated in the maskers.

Purpose: The objective of this research was to develop a version of the LiSN-S suitable for use in the United States and Canada. The original sentences and children’s stories were reviewed for unfamiliar semantic items and rerecorded by native North American speakers.

Research Design: In a descriptive design, a sentence equivalence study was conducted to determine the relative intelligibility of the rerecorded sentences and adjust the amplitude of the sentences for equal intelligibility. Normative data and test–retest reliability data were then collected.

Study Sample: Twenty-four children with normal hearing aged 8 years, 3 months, to 10 years, 0 months, took part in the sentence equivalence study. Seventy-two normal-hearing children aged 6 years, 2 months, to 11 years, 10 months, took part in the normative data study. Thirty-six children returned between two and three months after the initial assessment for retesting. Participants were recruited from sites in Cincinnati, Dallas, and Calgary.

Results: The sentence equivalence study showed that post-adjustment, sentence intelligibility increased by 18.7 percent for each 1 dB increase in signal-to-noise ratio. Analysis of the normative data revealed no significant differences on any performance measure as a consequence of data collection site or gender. Inter- and intra-participant variation was minimal. A trend of improved performance as a function of increasing age was found across performance measures, and cutoff scores, calculated as two standard deviations below the mean, were adjusted for age. Test–retest differences were not significant on any measure of the North American (NA) LiSN-S (p ranging from .080 to .862). Mean test–retest differences on the various NA LiSN-S performance measures ranged from 0.1 dB to 0.6 dB. One-sided critical difference scores calculated from the retest data ranged from 3 to 3.9 dB. These scores, which take into account mean practice effects and day-to-day fluctuations in performance, can be used to determine whether a child has improved on the NA LiSN-S on retest.

Conclusions: The NA LiSN-S is a potentially valuable tool for assessing auditory stream segregation skills in children. The availability of one-sided critical difference scores makes the NA LiSN-S useful for monitoring listening performance over time and determining the effects of maturation, compensation (such as an assistive listening device), or remediation.

Key Words: Auditory stream segregation, (central) auditory processing disorder

Abbreviations: BKB = Bamford-Kowal-Bench sentences test; CAPD = central auditory processing disorder; eSRT = estimate of speech reception threshold; FFT = fast Fourier transform; HpTF = headphone transfer function; HRTF = head-related transfer function; KEMAR = Knowles Electronics Manikin for Acoustic Research; LiSN-S = Listening in Spatialized Noise–Sentences test; NA LiSN-S = North American Listening in Spatialized Noise–Sentences test; rms = root mean square; SNR = signal-to-noise ratio

*National Acoustic Laboratories; †Macquarie University; ‡Cincinnati Children’s Hospital Medical Center; §University of Cincinnati; **University of Texas at Dallas; ††Community Audiology Services, Calgary Health Region

Sharon Cameron, Ph.D., National Health and Medical Research Council Public Health (Australia) Fellow and Research Scientist, National Acoustic Laboratories, 126 Greville St., Chatswood, NSW, 2067, Australia; Phone: +61 2 9412 6851; Fax: +61 2 9411 8273; E-mail: [email protected]

A commercial version of the test described in this article will be released shortly. Financial returns from that commercialization will benefit Dr. Cameron and the organizations involved in this study. This outcome has in no way influenced the research reported in this article.

J Am Acad Audiol 20:128–146 (2009)

The following article outlines the development of a North American–accented and semantically appropriate version of the Listening in Spatialized Noise–Sentences test (LiSN-S® [Cameron and Dillon, 2006]). The LiSN-S was developed in Australia to assess auditory stream segregation skills in children with suspected central auditory processing disorder (CAPD). Auditory stream segregation is the process by which a listener is able to differentiate the various auditory signals that arrive simultaneously at the ears and form meaningful representations of the incoming acoustic signals (Sussman et al, 1999). Auditory cues such as the perceived spatial location of sounds or the pitch of speakers’ voices help this process of segregating the total stream of sound (Bregman, 1990).

The LiSN-S is presented using a personal computer. Output levels are directly controlled by the software via an external USB sound card. A three-dimensional auditory environment under headphones is created by presynthesizing the speech stimuli with head-related transfer functions (HRTFs). This approach offers several advantages over traditional soundfield testing. First, it minimizes the variability in the sound pressure level at the eardrum caused by a listener’s head movements (Wilber, 2002). Second, it offsets potential differences in stimulus delivery due to variations in loudspeaker and listener placement that exist between clinics. Third, it reduces the effects of reverberation in the test environment (Koehnke and Besing, 1997).

On the LiSN-S, a simple repetition-response protocol is used to assess a listener’s speech reception threshold (SRT) for target sentences presented in competing speech maskers (children’s stories). Using HRTFs, the targets are perceived as coming from directly in front of the listener (0 degrees azimuth), whereas the maskers, relative to the targets, vary according to their perceived spatial location (0 degrees vs. +90 degrees and −90 degrees azimuth), the vocal identity of the speaker(s) of the stories (same as, or different to, the speaker of the target sentences), or both. This results in four listening conditions: same voice at 0 degrees (or low-cue SRT), same voice at ±90 degrees, different voices at 0 degrees, and different voices at ±90 degrees (or high-cue SRT).

Performance on the LiSN-S is evaluated on the low- and high-cue SRT, as well as on three “advantage” measures. These advantage measures represent the benefit in dB gained when either vocal, spatial, or both vocal and spatial cues are incorporated in the maskers, compared to the baseline (low-cue SRT) condition where no cues are present in the maskers (see Figure 1). The use of relative measures of performance (i.e., difference scores) serves to minimize the influence of higher-order language, learning, and communication skills on test performance. For example, as such skills affect both the SRT when the distracters are presented at 0 degrees and the SRT when they are spatially separated at ±90 degrees, these skills will have minimal effect on the difference between the SRTs in these two conditions. Thus, the differences that inevitably exist between individuals in such functions can be accounted for, allowing for clearer evaluation of their abilities to use spatial and voice cues to aid speech understanding.

Figure 1. The Listening in Spatialized Noise–Sentences test speech reception threshold (SRT) and advantage measures.
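The advantage measures described above can be expressed as simple differences between SRTs. The following is a minimal sketch in Python, assuming that each advantage is computed as the low-cue SRT minus the SRT in the condition that adds the cue, so that a positive value indicates a benefit. The function name, argument names, and example SRTs are hypothetical and are not taken from the LiSN-S software.

```python
def advantage_measures(srt_sv0, srt_sv90, srt_dv0, srt_dv90):
    """Return the two SRT and three advantage measures (all in dB).

    A lower (more negative) SRT means better performance, so each
    advantage is the low-cue SRT minus the SRT in the cued condition.
    """
    low_cue = srt_sv0    # same voice at 0 degrees: no cues in the maskers
    high_cue = srt_dv90  # different voices at +/-90 degrees: both cues
    return {
        "low_cue_srt": low_cue,
        "high_cue_srt": high_cue,
        "talker_advantage": low_cue - srt_dv0,
        "spatial_advantage": low_cue - srt_sv90,
        "total_advantage": low_cue - high_cue,
    }


# Hypothetical SRTs in dB, for illustration only (not data from the study).
example = advantage_measures(srt_sv0=-1.0, srt_sv90=-11.0,
                             srt_dv0=-5.0, srt_dv90=-14.0)
```

Because each advantage is a difference between two SRTs from the same listener, any factor that shifts both SRTs by the same amount cancels out of the result.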

The LiSN-S has been shown to be sensitive to auditory streaming deficits in children whose primary difficulties in the classroom stem from poor listening behavior, as opposed to those with documented learning and attention disorders (Cameron and Dillon, 2008). For these children, interestingly, significant differences on LiSN-S occurred only in the conditions where the physical location of the maskers was manipulated (high-cue SRT, p = .001; spatial advantage, p < .0001; and total advantage, p < .0001). These results provide further evidence to suggest that the LiSN-S procedure is capable of differentiating not only an auditory versus language disorder but also a spatial versus vocal streaming segregation disorder.

The LiSN-S was developed in Australia. The target sentences were written by Australian speech pathologists, and the distracter children’s stories were written by an Australian novelist. All the speech stimuli were recorded by Australian speakers (Cameron and Dillon, 2007a). The sentence equivalence data, normative data, and test–retest reliability data were collected from Australian children (Cameron and Dillon, 2007a, 2007c).

Previous research has shown that performance on audiologic tests that utilize speech stimuli may be detrimentally affected in nonnative populations due to factors such as unfamiliar accent and semantic items (Golding et al, 1996; Marriage et al, 2001; Dawes and Bishop, 2007). In the development of an Australian version of the Staggered Spondaic Word Test (Katz, 1962), for example, Golding et al (1996) found poorer performances on the Australian-accented version of the American baseball term “batboy” for seven out of 10 young Australian listeners. Since “batboy” was considered unfamiliar to the Australian population, it was subsequently substituted with a more familiar spondee. A further trial on 33 normal-hearing young adults using the substituted word showed that the overall percentage error and standard deviation were reduced. Marriage et al (2001) found that the mean scores for British children aged seven and eight for both the filtered words and auditory figure-ground subtests of the SCAN (Keith, 1986) were significantly poorer than North American normative data. It was concluded that vocabulary factors contributed to the poorer results of the U.K. population sample and the changed overall acoustic pattern of the target stimuli did not allow clear word matching with familiar forms. Dawes and Bishop (2007) compared scores on a revised version of the SCAN (SCAN-C [Keith, 2000]) in 99 British children aged six to 10 years. All age groups scored significantly worse on the filtered words and auditory figure-ground subtests of the SCAN-C, as well as on the composite scores. It was concluded that applying North American norms to the scores obtained by British children results in a high rate of overidentification of listening difficulties.

In light of the potential detrimental effects of accent and unfamiliar semantic items on Australian LiSN-S performance in the North American population, it was decided to replace any unfamiliar semantic items with those more suitable for a North American population and to record the stimuli using native North American speakers. This article reports on the development and recording of the stimuli; a sentence equivalence study, normative data study, and test–retest reliability study for the North American LiSN-S (NA LiSN-S [Cameron and Dillon, 2007b]) follow. Comparison of the results of these studies to the results of the respective Australian data is discussed.

DEVELOPMENT OF THE NORTH AMERICAN LiSN-S

NA LiSN-S Software Development

The NA LiSN-S graphic user interface and signal processing application program were developed in the C# programming language and were based on the LiSN-S software described in Cameron and Dillon (2007a). An image of the playback screen used to administer the NA LiSN-S is provided in Figure 2.

Figure 2. The Listening in Spatialized Noise–Sentences test playback screen. The graph shows the history of the target level as the range of correct responses from greater than 50% correct to less than 50% correct is repeatedly traversed. The top horizontal line shows the level of the distracters, and the lower horizontal line shows the average level of the targets during the stable region.

Journal of the American Academy of Audiology/Volume 20, Number 2, 2009


Speech Stimuli

A total of 180 sentences used in the development of the Australian LiSN-S were also utilized for the NA LiSN-S. The target sentences were developed by the Cooperative Research Centre for Cochlear Implant and Hearing Aid Innovation and were used under license from HearWorks Pty Limited. The sentences were written by Australian registered speech pathologists specializing in the rehabilitation of children with hearing loss. Each sentence was constructed in accordance with the criteria used in the development of the Bamford-Kowal-Bench sentences test (BKB [Bamford and Wilson, 1979]). The BKB sentences contain mainly Stage 3 and some Stage 2 clause structures as described in the Language Assessment, Remediation, and Screening Procedure (Crystal, 1989) and are suitable for children from 4.6 years of age (Kowal, 1979).

The semantic content of each sentence was analyzed independently by a native North American speaker from the University of Cincinnati and a native Canadian speaker from the National Acoustic Laboratories. Changes were then amalgamated, and the final list was agreed to by both reviewers. A total of 27 substitutions were made. For example, “shop” was changed to “store,” “cricket” team was changed to “baseball” team, and “nappies” was changed to “diapers.” Examples of some of the sentences used in the NA LiSN-S appear in Appendix A.

Two published Australian children’s stories entitled “Loopy Lizard’s Tail” and “The Great Big Tiny Traffic Jam” were used as the competing speech stimuli. Although listeners were instructed not to attend to the competing stories, the semantic content of the stories was also analyzed by the reviewers, and five changes were made. For example, “skirting board” was changed to “baseboard,” and “peeped out” was changed to “peered out.” An extract from “Loopy Lizard’s Tail” appears as Appendix B.

Recording

The North American versions of the LiSN-S target sentences and distracter stories were recorded at a professional recording studio in Sydney, Australia, by native North American actors who had been in Australia for less than 12 months. Female 1 (who is also a North American dialect coach) recorded the target sentences, as well as both stories. Female 2 recorded “Loopy Lizard’s Tail,” and Female 3 recorded “The Great Big Tiny Traffic Jam.” All stimuli were produced with a general North American accent. General North American English is the term given to any American accent that is relatively free of noticeable regional influences and is found in contemporary North American–made films and television programs (Green, 2002). All speakers were of the same ethnicity and of approximately the same age. All speakers were instructed to speak with a normal clear voice while maintaining a normal rhythm of speech and to avoid placing emphasis on key words. Specifically, clarity, pace, and effort were maintained across words. These qualifications were implemented to prevent listeners from using cues, such as accent, to detect differences between the stories and target sentences.

Editing

The analog signal was recorded directly onto hard disk. The standard sampling frequency used in compact disk recordings of 44.1 kHz with a 16 bit digitization was utilized. The individual target sentences and distracter discourse were extracted from the recordings and edited using Adobe Audition Version 1.5. A silent period of 100 msec was inserted immediately preceding and following each distracter story. Extraneous pauses were removed during the editing process to ensure that the stories ran smoothly and at a constant intensity level. The stories were approximately two minutes and 30 seconds in length.

Level Normalization

The root mean square (rms) levels of each target sentence and the individual distracter stories were ascertained using Adobe Audition 1.5. These rms levels were then averaged (in dB) across all stimuli. The rms amplitude of each sentence and distracter was compared to the average rms in order to obtain a correction factor (i.e., difference between each sentence/distracter rms and the average). Once equated for rms amplitude, all stimuli were then reduced by 7 dB to ensure that no clipping occurred when the distracters were convolved with HRTFs at +90 or −90 degrees azimuth. The final corrected rms level (prior to convolution) for all sentences was −27.1 dB re: digital full scale.
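The level normalization steps described above (measure each rms level in dB, average in dB, derive a per-stimulus correction, then reduce by 7 dB) can be sketched as follows. This is an illustrative reconstruction only: the study used Adobe Audition for these measurements, and the function names and the convention that full scale corresponds to an amplitude of 1.0 are assumptions.

```python
import math


def rms_db(samples):
    """RMS level in dB re: digital full scale (assumed amplitude 1.0)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(mean_square)


def normalization_gains(stimuli, headroom_db=7.0):
    """Gain in dB per stimulus: equate to the average rms, then reduce."""
    levels = [rms_db(s) for s in stimuli]
    average = sum(levels) / len(levels)  # the average is taken in dB
    # Correction factor (average - level), followed by the 7 dB reduction.
    return [(average - level) - headroom_db for level in levels]


def apply_gain(samples, gain_db):
    """Scale samples by a gain expressed in dB."""
    factor = 10.0 ** (gain_db / 20.0)
    return [s * factor for s in samples]
```

After `apply_gain` is applied with the gains returned by `normalization_gains`, every stimulus has the same rms level, namely the original average minus 7 dB.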

Convolution

Each sentence recorded by Female 1 was convolved with HRTFs recorded at 0 degrees azimuth. “Loopy Lizard’s Tail” (recorded by Female 1 and Female 2) was convolved with HRTFs recorded at 0 degrees azimuth and −90 degrees azimuth. “The Great Big Tiny Traffic Jam” (recorded by Female 1 and Female 3) was convolved with HRTFs recorded at 0 degrees azimuth and +90 degrees azimuth. The HRTFs were recorded in a chamber, anechoic above 50 Hz, using a Knowles Electronics Manikin for Acoustic Research (KEMAR) containing a Zwislocki coupler and half-inch microphone (see Cameron et al, 2006, for complete description). Knowles Electronics small-sized pinnae were fitted to simulate the outer ear. The HRTFs were produced from swept sine waves ranging in frequency from 50 to 20,000 Hz presented from a single loudspeaker positioned 1 m from the center point of KEMAR’s head.

The speech files were synthesized using the LiSN convolution program developed using MATLAB software (MathWorks Inc., 2002) as described in Cameron et al, 2006. In summary, the various speech files were converted to the frequency domain by a fast Fourier transform (FFT) and then multiplied by the HRTFs, as well as an inverse headphone response that is described in the postequalization procedure section below. An inverse FFT was then applied to convert the signals back into the time domain for playback.
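The frequency-domain procedure summarized above (transform, multiply by the HRTF and the inverse headphone response, inverse transform) can be illustrated with a short Python sketch. It is not the MATLAB program used in the study: the function names are hypothetical, the filters are assumed to be supplied as impulse responses for a single ear, and a naive discrete Fourier transform stands in for an FFT so that the example has no external dependencies.

```python
import cmath


def dft(x, inverse=False):
    """Naive discrete Fourier transform, O(n^2); stands in for an FFT."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def zero_pad(x, n):
    """Extend x with zeros to length n."""
    return list(x) + [0.0] * (n - len(x))


def spatialize(speech, hrir, inverse_hp_ir):
    """Filter speech with an HRTF and an inverse headphone response.

    Zero-padding to the full output length makes the frequency-domain
    product equal to linear (not circular) convolution.
    """
    n = len(speech) + len(hrir) + len(inverse_hp_ir) - 2
    spectra = [dft(zero_pad(x, n)) for x in (speech, hrir, inverse_hp_ir)]
    product = [s * h * p for s, h, p in zip(*spectra)]
    return [value.real for value in dft(product, inverse=True)]
```

A binaural stimulus would require one such call per ear, each with the impulse response measured at that ear. With real recordings an FFT library would be used in place of the naive transform.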

Postequalization Procedure

A postequalization procedure was also implemented to correct for the response of the headphones used during playback. Swept sine wave signals were played through Sennheiser HD215 circumaural audiometric high-frequency headphones on a KEMAR and recorded by a Stanford Research Systems two-channel network signal analyzer in order to measure the headphone-to-eardrum transfer function (HpTF). A filter with the inverse transfer function of the HpTF was developed, as described in Cameron et al, 2006. The inverse HpTF was convolved with the spatialized LiSN-S speech materials, effectively canceling out the HpTF that occurs during playback. The convolved, postequalized stimuli were saved as WAV files.

Stimulus Generation

The convolved and hence spatialized speech files were stored in an NA LiSN-S subdirectory for subsequent playback. The playback screen and related programs retrieved the spatialized target and distracter speech files and combined and scaled these stimuli in dB to produce a binaural output signal. The right and left ear components of the binaural signal were assigned to the right and left channels of the computer sound card, respectively.

Calibration

The mean rms level of the combined distracters (averaged across the recordings made by Female 1 and Females 2 and 3) at 0 degrees was −22.3 dB and at ±90 degrees was −21.3 dB. The 1 dB difference between the level of the distracters at ±90 degrees and 0 degrees occurs as a consequence of the HRTFs applied and was intentionally not corrected for. All signal-to-noise ratios were, therefore, defined relative to the level of the distracters at 0 degrees, where both the target sentences and distracters shared the same head-related transfer functions. A 1 kHz reference tone was created with an amplitude 10 dB greater than the average of the combined total rms levels of the two distracter files convolved with HRTFs at 0 degrees, that is, −12.3 dB.
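As an illustration of the reference tone described above, the following Python sketch generates a 1 kHz sine wave whose rms level is 10 dB above the −22.3 dB level of the combined distracters at 0 degrees. The function name, the one-second duration, and the convention that 0 dB corresponds to an rms amplitude of 1.0 are assumptions made for the example; the article does not state how the tone was synthesized.

```python
import math


def reference_tone(rms_dbfs, freq_hz=1000.0, sample_rate=44100,
                   duration_s=1.0):
    """Sine tone with the requested rms level in dB re: full scale."""
    rms = 10.0 ** (rms_dbfs / 20.0)
    peak = rms * math.sqrt(2.0)  # a sine wave's peak is sqrt(2) times its rms
    n = int(sample_rate * duration_s)
    return [peak * math.sin(2.0 * math.pi * freq_hz * t / sample_rate)
            for t in range(n)]


distracter_rms_dbfs = -22.3  # combined distracters at 0 degrees (from the text)
tone = reference_tone(distracter_rms_dbfs + 10.0)  # 10 dB above: -12.3 dB
```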

During the sentence equivalence study, the NA LiSN-S was administered using a PC, and the stimuli were presented through Sennheiser HD215 headphones, which were connected directly to the headphone socket of the PC. In order to determine the exact output levels in mV required to present the NA LiSN-S stimuli at a designated level in dB SPL, the various stimuli were presented through the left and right ear headphones to a Brüel and Kjær type 4153 artificial ear using a flat plate adaptor. Equivalent dBV rms levels were measured directly from the headphone socket of the PC using a Stanford Research Systems two-channel network signal analyzer.

When the 1 kHz reference tone was activated from the playback screen, the volume of the PC was adjusted until the electrical level of the calibration signal was 21 mV (as measured by a voltmeter). At this point, the dB SPL in each ear of the two combined distracters at 0 degrees azimuth (recorded by Female 1 and Females 2 and 3) matched the corresponding level on the LiSN-S competition slider bar. Similarly, the dB SPL of the target (recorded by Female 1) matched the corresponding level on the LiSN-S target slider bar. Daily calibration was achieved by adjusting the PC volume control until the electrical level of the calibration signal applied to the headphones was 21 mV.

For the NA LiSN-S normative data study and test–retest reliability studies, the headphones were connected to the headphone socket of the PC via a Miglia Harmony Express USB sound card. The sensitivity of the sound card was automatically set to a predetermined level by the LiSN-S software in order to achieve the same signal levels as described previously. This alleviated the need for daily calibration. At this preset level, the combined distracters at 0 degrees had a long-term rms level of 55 dB SPL as measured in a Brüel and Kjær type 4153 artificial ear.

EXPERIMENT 1—SENTENCE

EQUIVALENCE STUDY

The following study was conducted to determine the relative intelligibility of the LiSN-S sentences and to adjust the level of the sentences for equal intelligibility. Approval to conduct the sentence equivalence study was obtained from the Institutional Review Board of the Cincinnati Children’s Hospital Medical Center.



Participants

Data were collected from 24 children with normal hearing aged 8 years, 3 months, to 10 years, 0 months (mean age 9 years, 1 month). There were 12 males and 12 females. Participants were recruited from friends and family of staff at the Cincinnati Children’s Hospital Medical Center. The participants were included in the study if they had North American English as a first language, no history of hearing disorders, and no reported learning or attention disorders. On the day of testing all participants had pure-tone thresholds of ≤15 dB HL at 500 to 4000 Hz and ≤20 dB HL at 250 and 8000 Hz, as well as normal Type A tympanograms and 1000 Hz ipsilateral acoustic reflexes present at 95 dB HL.

Materials

The LiSN-S stimuli were administered using a PC

and Sennheiser HD215 headphones. The headphones

were connected directly to the headphone socket of the

PC via a Miglia Harmony Express USB sound card.

The daily calibration procedure is described in the

above section on the development of the LiSN-S under

‘‘Calibration.’’

Design and Procedure

Testing was carried out after school hours (between 3 p.m. and 6 p.m.) in an acoustically treated room suitable for testing hearing thresholds at the Cincinnati Children’s Hospital Medical Center. Target sentences were initially presented at a level of 62 dB SPL, as measured in a Brüel and Kjær type 4153 artificial ear. Competing children’s stories, looped during playback, were presented at a constant level of 55 dB SPL. The target and competing signals were presented simultaneously to both ears. The stimuli were all presented in the “same voice at 0°” condition, whereby the target sentences and distracter stories were all spoken by the same female speaker and were processed with the head-related transfer functions appropriate to a source at 0 degrees azimuth (directly in front of the listener). The listener’s task was to repeat the words heard in each target sentence. A 1000 Hz 200 msec tone burst was presented before each sentence to alert the listener that a sentence would be presented. A silent gap of 500 msec separated the tone burst from the onset of the sentence. The tone burst was presented at a constant level of 55 dB SPL.

The signal-to-noise ratio (SNR) was adjusted adaptively in each condition by varying the target level to determine each participant’s SRT. The SNR was decreased by 2 dB if a listener scored more than 50 percent of words correct and increased by 2 dB if he or she scored less than 50 percent of words correct. The SNR was not adjusted if a response of exactly 50 percent correct was recorded (for example, three out of six words correctly identified). All words in each sentence were scored individually, including the definite article “the” and the indefinite articles “a” and “an.” At a minimum, five sentences were provided as practice; however, practice continued until one upward reversal in performance (i.e., the sentence score dropped below 50 percent of words correct) was recorded. Testing ceased in a particular condition when the listener had either (a) completed the entire 30 sentences in any one condition or (b) completed the practice sentences plus a minimum of a further 17 scored sentences, and the standard error, calculated automatically in real time over the scored sentences, was less than 1 dB. None of the sentences used to form the initial estimate of SRT (i.e., eSRT) was repeated in any subsequent study.
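The adaptive rules above can be sketched as a short simulation. The sketch below is illustrative and makes several assumptions that the text does not confirm: the SRT is taken as the mean SNR over the scored sentences, the standard error is the sample standard deviation divided by the square root of the number of scored sentences, and the 30-sentence limit includes the practice sentences. The starting SNR of +7 dB follows from the 62 dB SPL target and 55 dB SPL distracter levels; the function name and the simulated listener are hypothetical.

```python
import math
import statistics


def run_adaptive_track(score_sentence, start_snr=7.0, step_db=2.0,
                       max_sentences=30, min_practice=5,
                       min_scored=17, max_se_db=1.0):
    """Run one adaptive track and return the estimated SRT (dB SNR).

    score_sentence(snr) returns the proportion of words repeated
    correctly (0.0 to 1.0) for a sentence presented at that SNR.
    """
    snr = start_snr
    in_practice = True
    reversal_seen = False
    scored_snrs = []
    presented = 0
    while presented < max_sentences:
        proportion = score_sentence(snr)
        presented += 1
        if in_practice:
            # Practice lasts at least min_practice sentences and until
            # the score has dropped below 50% at least once.
            if proportion < 0.5:
                reversal_seen = True
            if presented >= min_practice and reversal_seen:
                in_practice = False
        else:
            scored_snrs.append(snr)
            if len(scored_snrs) >= min_scored:
                se = statistics.stdev(scored_snrs) / math.sqrt(len(scored_snrs))
                if se < max_se_db:
                    break
        # 2 dB steps: harder after >50% correct, easier after <50% correct,
        # unchanged after exactly 50% correct.
        if proportion > 0.5:
            snr -= step_db
        elif proportion < 0.5:
            snr += step_db
    srt = statistics.mean(scored_snrs) if scored_snrs else None
    return {"srt": srt, "n_scored": len(scored_snrs), "n_presented": presented}


# Hypothetical deterministic listener: every word correct above -3 dB SNR.
result = run_adaptive_track(lambda snr: 1.0 if snr > -3.0 else 0.0)
```

With this simulated listener the track descends in 2 dB steps during practice, then alternates between −1 and −3 dB SNR, and stops once 17 scored sentences have been collected with a standard error below 1 dB.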

An additional 150 sentences were presented at three fixed SNRs to determine the relative intelligibility of the sentences. For each participant, an SRT was obtained for 50 sentences presented at his or her eSRT, 50 sentences presented at his or her eSRT + 2 dB, and 50 sentences presented at his or her eSRT − 2 dB. The sentences assigned to each SNR were counterbalanced across participants. Logit curves were fitted for each sentence using least squares regression based on the equation: exp(a − b * SNR)/(1 + exp[a − b * SNR]). The dependent variable was the proportion of words correct at each SNR for a particular sentence averaged across participants. All analyses were made with Statistica 7.0.

The resulting b values are related to the slope of the steepest portion of the curve for each sentence. The median b value across sentences was −0.594, or 15 percent per dB (calculated as −b/4). The ratio of a/b for any sentence (referred to as r) represents the SRT or the SNR needed to achieve 50 percent correct identification of words in that sentence. The median value of r (rmed) was −0.4 dB. The sentences were then adjusted in dB for equal intelligibility, with the required adjustment for any sentence calculated as r − rmed. A sentence was discarded if (a) the required adjustment was too great, that is, r − rmed < −2.0 dB or > +2.0 dB; (b) the slope was too shallow (<6% per dB), that is, −b/4 < 0.06 or |b| < 0.25; or (c) the slope was too steep (>50% per dB), that is, −b/4 > 0.5 or |b| > 2.
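The quantities derived from each fitted logit curve can be made concrete with a short Python sketch. It does not perform the least squares fit itself (which the authors carried out in Statistica); it only evaluates the curve and applies the slope and adjustment criteria stated above to a given pair of coefficients. The function names and the example coefficients other than the median b of −0.594 and median r of −0.4 dB are hypothetical.

```python
import math


def proportion_correct(snr, a, b):
    """Logit curve from the text: exp(a - b*SNR) / (1 + exp(a - b*SNR))."""
    z = a - b * snr
    return math.exp(z) / (1.0 + math.exp(z))


def slope_per_db(b):
    """Slope at the 50% point, as a proportion per dB (-b/4)."""
    return -b / 4.0


def srt(a, b):
    """SNR giving 50% of words correct (r = a/b)."""
    return a / b


def keep_sentence(a, b, r_med, max_adjust_db=2.0,
                  min_slope=0.06, max_slope=0.5):
    """Apply the three discard criteria; True means the sentence is retained."""
    adjustment = srt(a, b) - r_med
    slope = slope_per_db(b)
    return abs(adjustment) <= max_adjust_db and min_slope <= slope <= max_slope
```

For the median coefficient b = −0.594, `slope_per_db` gives 0.1485, which corresponds to the 15 percent per dB quoted above. The slope follows from differentiating the curve with respect to SNR, which yields −b·p·(1 − p), or −b/4 at p = 0.5.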

Based on these criteria, 30 sentences were discarded. Logit curves for the remaining 120 unadjusted sentences are shown in Figure 3. The remaining sentences were adjusted in amplitude for equal intelligibility and used in the normative data study. The mean slope of the retained sentences was 18.7 percent per dB. Logit curves for the sentences postadjustment are shown in Figure 4. The average and median length of the sentences was five words per sentence across the total number of sentences. The sentences were then allocated to four lists for use in the normative data study. Each list also had an average and median sentence length of five words.

EXPERIMENT 2—NORMATIVE DATA STUDY

Approval to conduct the normative data study was

obtained from the Institutional Review Boards of

the Cincinnati Children’s Hospital Medical Center, the

University of Texas at Dallas, and the University of

Calgary.

Participants

Data were collected from 72 children with normal

hearing aged 6 years, 2 months, to 11 years, 10

months. Participants were recruited from friends and

family of staff at the Cincinnati Children’s Hospital

and Calgary Health Region. Participants recruited by

the University of Texas at Dallas were from local

primary schools. Participant details are provided in

Table 1. Inclusion criteria were as per Experiment 1.


The materials used in the normative data study are as described for the sentence equivalence study. Output levels are described in the above section on the development of the LiSN-S under “Calibration.” Testing was again carried out in an acoustically treated room suitable for testing hearing thresholds at the various facilities. Testing occurred between 9 a.m. and 3 p.m. at the University of Texas at Dallas and between 10 a.m. and 7 p.m. at the other sites.

The LiSN-S target sentences were initially presented at a level of 62 dB SPL. The competing discourse was presented at a constant level of 55 dB SPL. The participant’s task was to repeat as many words as possible heard in each sentence. The instructions provided to each participant are attached as Appendix C. Up to 30 sentences were presented in each of the four conditions of distracter location and voice: same voice at 0 degrees (SV0°), same voice at ±90 degrees (SV±90°), different voices at 0 degrees (DV0°), and different voices at ±90 degrees (DV±90°). The organization of the target sentences and distracter stories is provided in Table 2. The SNR was adjusted adaptively as described for Experiment 1. The presentation order of the LiSN-S conditions was counterbalanced among participants using a Latin-square protocol to enable analysis of the effect of practice on performance.
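One way to counterbalance four conditions with a Latin square is sketched below in Python. The article does not specify which Latin square was used, so the cyclic construction, the function names, and the assignment of participants to rows by index are assumptions made for illustration.

```python
def latin_square_orders(conditions):
    """Cyclic Latin square: row i is the condition list rotated by i."""
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)]
            for row in range(n)]


conditions = ["SV0", "SV90", "DV0", "DV90"]
orders = latin_square_orders(conditions)


def order_for(participant_index):
    """Presentation order for a participant, cycling through the rows."""
    return orders[participant_index % len(orders)]
```

In such a square each condition appears exactly once in every row and once in every serial position, so across each block of four participants every condition is tested first, second, third, and fourth equally often.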

RESULTS—EXPERIMENT 2

LiSN-S Conditions

Effect of Data Collection Site

Figure 3. Logit curves for 120 North American Listening in Spatialized Noise–Sentences test sentences prior to adjustment for equal intelligibility (SRT = speech reception threshold).

Figure 4. Logit curves for 120 North American Listening in Spatialized Noise–Sentences test sentences following adjustment for equal intelligibility (SRT = speech reception threshold).

Table 1. Details of the 72 Participants in the North American Listening in Spatialized Noise–Sentences Test Normative Data Study

Age Group   n    Male   Female   Minimum Age       Maximum Age       Mean Age
                                 (years, months)   (years, months)   (years, months)
6           12   6      6        6, 2              6, 11             6, 6
7           12   8      4        7, 0              7, 7              7, 3
8           12   5      7        8, 1              8, 11             8, 5
9           12   6      6        9, 0              9, 11             9, 6
10          12   7      5        10, 0             10, 11            10, 5
11          12   6      6        11, 0             11, 11            11, 6

The mean SRT and interparticipant standard deviations for the LiSN-S SRT and advantage measures for each collection site are presented in Table 3. Separate

analyses of variance (ANOVAs) were performed to

determine the effect of collection site (Ohio, Alberta,

and Texas) on each of the performance measures. As the five measures were derived from the four basic LiSN-S conditions (SV0°, SV±90°, DV0°, and DV±90°), the alpha level of 0.05 was multiplied by 4/5 to give an adjusted level of 0.04 to avoid inflating the Type I error rate.

There was no effect of collection site for any of the LiSN-S SRT or advantage measures: low-cue SRT, F(2, 69) = 0.130, p = .878; high-cue SRT, F(2, 69) = 0.933, p = .398; talker advantage, F(2, 69) = 0.397, p = .674; spatial advantage, F(2, 69) = 1.570, p = .215; total advantage, F(2, 69) = 0.949, p = .392. As no significant differences were found between collection sites, data were combined for the following analyses.

Comparison of North American and

Australian Data

The mean SRTs and interparticipant standard deviations for the LiSN-S SRT and advantage measures for the combined North American data and the Australian data are presented in Table 4. Separate ANOVAs were performed to determine whether differences existed in the normative data between the two countries on the various performance measures. There was a significant difference on all the LiSN-S SRT and advantage measures between countries: low-cue SRT, F(1, 140) = 7.421, p = .007; high-cue SRT, F(1, 140) = 50.261, p < .001; talker advantage, F(1, 140) = 22.477, p < .001; spatial advantage, F(1, 140) = 58.890, p < .001; total advantage, F(1, 140) = 46.353, p < .001.

Main Effects and Interactions

Table 5 details the mean SRTs and interparticipant standard deviations for the four LiSN-S distracter conditions: SV0°, SV±90°, DV0°, and DV±90°. Age groups and test sites were combined. An ANOVA of mean SRT was performed for the repeated-measures factors of distracter location (0 degrees vs. ±90 degrees) and distracter voice (same vs. different). An alpha level of 0.05 was used for all comparisons. The Greenhouse-Geisser correction factor was applied to the degrees of freedom of the main effects and interaction to ensure that violations of sphericity did not influence the significance levels calculated for any of the analyses.

Table 2. Organization of Target Sentences and Distracter Stories for Each North American Listening in Spatialized Noise–Sentences Test (LiSN-S) Condition in the Normative Data Study

LiSN-S Condition               Distracter Speaker   Distracter Location   Distracter Story
Same Voice: 0°                 Female 1             0°                    Loopy Lizard’s Tail
                               Female 1             0°                    The Great Big Tiny Traffic Jam
Same Voice: + and −90°         Female 1             −90°                  Loopy Lizard’s Tail
                               Female 1             +90°                  The Great Big Tiny Traffic Jam
Different Voices: 0°           Female 2             0°                    Loopy Lizard’s Tail
                               Female 3             0°                    The Great Big Tiny Traffic Jam
Different Voices: + and −90°   Female 2             −90°                  Loopy Lizard’s Tail
                               Female 3             +90°                  The Great Big Tiny Traffic Jam

Note: In each condition the target sentences were spoken by Female 1 and presented at 0 degrees azimuth (directly in front of the speaker).

Table 3. Average Speech Reception Thresholds (SRTs, Expressed as Signal-to-Noise Ratios) and Interparticipant Standard Deviations (in dB) for Each of the SRT and Advantage Measures for the 72 Children in the North American Normative Data Study as a Function of Data Collection Site

Site                            n    Variable   Low-Cue SRT   High-Cue SRT   Talker Advantage   Spatial Advantage   Total Advantage
University of Cincinnati        24   Mean       −0.2          −10.7          5.4                9.2                 10.5
                                     SD         0.9           2.3            2.2                2.1                 2.4
Calgary Health Region           24   Mean       −0.3          −11.6          4.9                8.8                 11.3
                                     SD         1.7           3.6            2.6                1.8                 2.6
University of Texas at Dallas   24   Mean       −0.4          −11.7          5.5                9.8                 11.4
                                     SD         0.8           1.9            2.2                1.7                 1.7

There was a significant main effect for distracter location (F[1, 71] = 307.66, p < .001), with the ±90 degrees condition resulting in a lower SRT than the 0 degrees condition, averaged across distracter voice. An analysis of simple contrasts revealed that the ±90 degrees location produced a significantly lower mean SRT than the 0 degrees location for both the same voice distracter (F[1, 71] = 1811.45, p < .001) and the different voices distracter (F[1, 71] = 497, p < .001). There was also a significant main effect of voice (F[1, 71] = 1831.87, p < .001) averaged across location. Simple contrasts revealed that the mean SRT of the same voice distracter was significantly higher than that for the different voices distracter at both the 0 degrees location (F[1, 71] = 378.30, p < .001) and the ±90 degrees location (F[1, 81] = 51.19, p < .001). An interaction between distracter location and speaker voice was also significant, indicating that the benefit from separation in the same voice condition (9.3 dB) was significantly greater than the benefit from separation in the different voices conditions (5.8 dB; F[1, 81] = 121.34, p < .001).

Overall, the listening benefit obtained from the spatial separation was influenced by the similarity of the voices between the speaker(s) of the distracters and that of the speaker of the target sentences. Benefit from separation is calculated by subtracting the SRT in the ±90 degrees location from that in the 0 degrees location for a particular condition of distracter voice, so that a positive value indicates a benefit. The benefit from separation for the same voice condition is referred to as the LiSN-S spatial advantage measure (Cameron et al, 2006).
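The derivation of the five performance measures from the four condition SRTs can be sketched as follows. The spatial and total advantage definitions follow the text; the talker advantage definition (SV0° minus DV0°) is an assumption inferred from the tabulated means, and the dictionary keys and function name are hypothetical.

```python
# Mean SRTs (dB SNR) for the four conditions, taken from Table 5.
MEAN_SRT = {"SV0": -0.3, "SV±90": -9.6, "DV0": -5.6, "DV±90": -11.3}


def lisn_s_measures(srt):
    """Derive the two SRT measures and three advantage measures (in dB)."""
    low_cue = srt["SV0"]  # same voice, no spatial separation
    return {
        "low_cue_srt": low_cue,
        "high_cue_srt": srt["DV±90"],
        # Assumed definition: benefit of a different talker at 0 degrees.
        "talker_advantage": round(low_cue - srt["DV0"], 1),
        # Benefit of spatial separation with the same voice.
        "spatial_advantage": round(low_cue - srt["SV±90"], 1),
        # Benefit of talker and spatial cues combined.
        "total_advantage": round(low_cue - srt["DV±90"], 1),
    }


measures = lisn_s_measures(MEAN_SRT)
print(measures)
```

Applied to the rounded group means, this reproduces the talker (5.3 dB) and spatial (9.3 dB) advantages in Table 4. The total advantage comes out at 11.0 dB rather than the tabulated 11.1 dB, presumably because the published value was computed from unrounded individual scores.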

Effect of Age on LiSN-S Performance Measures

The mean SRT and advantage measures for the

children in the normative data study are illustrated in

Figure 5. There was a trend of decreasing SRT and

increasing advantage, as age increased, across measures.

The interparticipant standard deviations of the measures

ranged from 0.7 dB for the eight-year-olds on the low-cue

SRT measure to 2.8 dB for the seven-year-olds on the

high-cue SRT measure. Separate ANOVAs were performed to determine the effect of age on the performance

measures. As for previous analyses, the alpha level of

0.05 was multiplied by 4/5 to give an adjusted level of 0.04

to avoid inflating the Type I error rate.

For the low-cue SRT there was a significant main effect of age (F[5, 66] = 2.697, p = .028). Post hoc tests using Tukey’s HSD revealed no significant differences between age groups. There was also a significant main effect of age for the high-cue SRT (F[5, 66] = 3.877, p = .004). Post hoc tests revealed that the six-year-olds required a significantly higher SRT than children aged nine (p = .043) and 11 (p = .011). The seven-year-olds required a higher SRT than the 11-year-olds (p = .034). No differences in thresholds were significant between other combinations of age groups.

There was a significant main effect of age for the talker advantage measure (F[5, 66] = 4.335, p = .002). The six-year-olds obtained a significantly smaller advantage than children aged nine (p = .005) and 11 (p = .003). A significant main effect of age was also found for the spatial advantage measure (F[5, 66] = 3.537, p = .007). The six-year-olds obtained a smaller advantage than the 10-year-olds (p = .012) and 11-year-olds (p = .007). No other differences in advantage measures were significant among the other combinations of age groups. There was no significant main effect of age for the total advantage measure (F[5, 66] = 2.062, p = .081).

Gender Effects

An analysis was conducted in order to investigate

gender effects in the children. Mean scores and

standard deviations for the 34 females and 38 males

on the various LiSN-S SRT and advantage measures

are provided in Table 6, along with the results of

ANOVAs that were performed with each measure as

the dependent variable, a fixed factor of gender, and

age as a covariate. There was no significant effect of

gender for any LiSN-S measure.

Practice Effects

The effect of practice on performance on the North American LiSN-S was examined for the 72 children in the normative data study. In order to determine whether practice improved performance, the mean SRTs were compared for the four basic LiSN-S conditions (SV0°, SV±90°, DV0°, and DV±90°) as a function of presentation order (first, second, third, or fourth). Participants in each age group completed the various conditions in exactly the same order. Age groups were combined to provide sufficient numbers in each condition and task combination to calculate meaningful inferential statistics.

Table 5. Average Speech Reception Thresholds (SRTs) and Interparticipant Standard Deviations (in dB) for Each of the Four Distracter Location Conditions for the 72 Children in the North American Normative Data Study

Condition                 SRT      SD
Same Voice         0°     −0.3     1.2
                   ±90°   −9.6     2.3
Different Voices   0°     −5.6     2.9
                   ±90°   −11.3    2.7

Table 4. Comparison of Average Speech Reception Thresholds (SRTs, Expressed as Signal-to-Noise Ratios) and Interparticipant Standard Deviations (in dB) for Each of the SRT and Advantage Measures for the 70 Children in the Australian Normative Data Study and the 72 Children in the North American Normative Data Study

Site            n    Variable   Low-Cue SRT   High-Cue SRT   Talker Advantage   Spatial Advantage   Total Advantage
Australian      70   Mean       −0.8          −14.2          3.6                11.7                13.3
                     SD         1.2           2.0            1.8                1.9                 1.6
North America   72   Mean       −0.3          −11.3          5.3                9.3                 11.1
                     SD         1.2           2.7            2.3                1.9                 2.3

Figure 5. Results on the various Listening in Spatialized Noise–Sentences test (a–b) speech reception threshold (SRT) and (c–e) advantage measures for children in the normative data study. Error bars represent the 95% confidence intervals from the mean.

Table 7 shows the mean thresholds in dB for each LiSN-S condition as a function of presentation order. One-way ANOVAs revealed no significant differences in mean SRTs as a function of presentation order for the SV0° condition (F[3, 68] = 0.078, p = .972), the SV±90° condition (F[3, 68] = 0.538, p = .658), or the DV±90° condition (F[3, 68] = 1.616, p = .194). There was, however, a significant difference for the DV0° condition (F[3, 68] = 3.207, p = .028). Post hoc tests using Tukey’s HSD revealed that, at −4.2 dB, the first presentation resulted in a significantly higher SRT than the third presentation at −7.0 dB (p = .018). However, there was no significant difference in SRT between the second presentation (at −5.3 dB) and any subsequent presentation.

Standard Error, Time Analysis, and Distribution

of Data

As discussed in the method section, testing in any one LiSN-S condition was terminated once a participant had completed 30 sentences or once 17 sentences had been completed (plus at least five practice sentences including one reversal) and his or her standard error was less than 1 dB. In the present study, the median standard error ranged from 0.77 dB in the SV0° condition to 0.98 dB in the DV±90° condition, with a range of 0.59 to 1.79 dB across all age groups. Normal probability-probability plots reveal that the data followed a normal distribution for all SRT and advantage measures. The median time taken to complete any LiSN-S condition was two minutes, 40 seconds (mean two minutes, 45 seconds). Total time taken to complete the testing was on average approximately 11 minutes, plus five minutes for instructions and breaks.
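The termination rule stated above can be written as a small predicate. This is a minimal sketch, not the LiSN-S software's implementation: the function and parameter names are hypothetical, the standard error is taken as an input because its calculation is described in the method section rather than here, and whether the 30-sentence limit includes practice items is not specified in this passage.

```python
MAX_SENTENCES = 30       # upper limit on sentences per condition
MIN_SENTENCES = 17       # minimum scored sentences before early termination
SE_CRITERION_DB = 1.0    # standard error must fall below this value


def should_stop(scored_sentences, standard_error_db, practice_complete=True):
    """Return True when testing in one LiSN-S condition can end.

    practice_complete stands for "at least five practice sentences,
    including one reversal, have been presented" (assumed to be tracked
    by the caller).
    """
    if scored_sentences >= MAX_SENTENCES:
        return True
    return (
        practice_complete
        and scored_sentences >= MIN_SENTENCES
        and standard_error_db < SE_CRITERION_DB
    )
```

For example, a child with a standard error of 0.9 dB after 17 scored sentences would stop early, whereas a child whose standard error is still 1.2 dB would continue until the criterion is met or 30 sentences have been presented.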

Regression Analysis and LiSN-S Cutoff Scores

As a strong trend of improved performance with

increasing age was found for the various LiSN-S SRT

and advantage measures, it was determined that cutoff

scores, calculated as two standard deviations below the

mean, would need to be adjusted for age for each

performance measure. These cutoff scores represent

the level below which performance on the LiSN-S is

considered to be outside normal limits.

A regression analysis was conducted with the score on each measure as the dependent variable and age (ranging from 6.21 to 11.87 years) as the independent variable. The cutoff scores were adjusted for age using the formula

cutoff score = intercept + (B-value × age) + (2 × SD of residuals from the age-corrected trend lines)

for the LiSN-S SRT measures and

cutoff score = intercept + (B-value × age) − (2 × SD of residuals from the age-corrected trend lines)

for the LiSN-S advantage measures. All regression data are presented in Table 8. Figure 6 provides scatter plots of the regression analysis showing the individual data points.
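As a worked illustration of the cutoff formulas, the sketch below applies them to the intercepts, B-values, and residual standard deviations reported in Table 8. The dictionary layout and function name are assumptions made for this example.

```python
# (intercept, B-value, SD of residuals, measure type), values from Table 8.
REGRESSION = {
    "low_cue_srt": (2.26, -0.28, 1.05, "srt"),
    "high_cue_srt": (-4.16, -0.70, 2.38, "srt"),
    "talker_advantage": (-0.17, 0.60, 2.07, "advantage"),
    "spatial_advantage": (5.18, 0.46, 1.67, "advantage"),
    "total_advantage": (6.46, 0.41, 2.18, "advantage"),
}


def cutoff_score(measure, age_years):
    """Age-adjusted cutoff (dB) two residual SDs on the poorer side of the trend."""
    intercept, b_value, sd_residuals, kind = REGRESSION[measure]
    predicted = intercept + b_value * age_years
    # Poorer performance is a higher SRT but a lower advantage score.
    sign = 1 if kind == "srt" else -1
    return predicted + sign * 2 * sd_residuals


for name in REGRESSION:
    print(f"{name}: {cutoff_score(name, 8.0):.2f} dB at age 8.0")
```

For a child aged 8.0 years this gives a low-cue SRT cutoff of about 2.1 dB (scores above it are outside normal limits) and a talker advantage cutoff of about 0.5 dB (scores below it are outside normal limits).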

With respect to practice effects, as the first presentation of the LiSN-S DV0° condition resulted in an SRT that was significantly higher than that for the third presentation, the DV0° condition should not be presented first. It is recommended that presentation order be (1) DV±90°, (2) SV±90°, (3) DV0°, and (4) SV0°. This configuration represents a gradient from “easy” to “difficult” and controls for the practice effects that were demonstrated for the DV0° condition.

Table 6. Mean Speech Reception Thresholds (SRTs) and Advantage Measure Scores (in dB) for 38 Males and 34 Females Aged 6 to 11 Years, Together with Results of ANOVA Investigating Effects of Gender on North American Listening in Spatialized Noise–Sentences Test Performance, with Age as a Covariate

                    Males            Females
Measure             Mean     SD      Mean     SD      F(1, 70)   p
Low-Cue SRT         −0.3     1.0     −0.3     1.3     0.02       .882
High-Cue SRT        −11.2    2.4     −11.5    3.0     0.37       .548
Talker Advantage    5.0      2.1     5.6      2.6     1.29       .260
Spatial Advantage   9.2      1.7     9.4      2.1     0.40       .527
Total Advantage     10.9     2.1     11.3     2.5     0.619      .434

Table 7. Mean Speech Reception Thresholds (SRTs) and Standard Deviations (in dB) of Each Listening in Spatialized Noise–Sentences Test Condition, as a Function of Presentation Order for the 72 Children in the Normative Data Study

                          First Presentation   Second Presentation   Third Presentation   Fourth Presentation
Condition                 SRT      SD          SRT      SD           SRT      SD          SRT      SD
Same Voice         0°     −0.4     1.5         −0.2     1.1          −0.2     1.1         −0.3     1.2
                   ±90°   −9.8     2.2         −10.0    2.8          −9.6     1.7         −9.1     2.5
Different Voices   0°     −4.2     2.5         −5.3     2.8          −7.0     3.4         −5.9     2.2
                   ±90°   −10.7    3.0         −10.7    2.5          −11.8    2.3         −12.3    2.7

Note: Age groups are combined.

Effect of Change of Presentation Order

on Performance

It must be acknowledged that only a quarter of the participants in the current study received the LiSN-S in the recommended order described above. To determine the effect of change of presentation order on performance, a regression analysis was conducted with SRT in each LiSN-S condition as the dependent variable and presentation order (1, 2, 3, or 4) as the independent variable. The adjustment to the normative data needed to test in the recommended order was calculated as the number of steps away from the order midpoint (2.5) multiplied by the B-value (dB/step). The calculated adjustments were 0.9 dB for the DV±90° condition, −0.1 dB for the SV±90° condition, −0.3 dB for the DV0° condition, and 0 dB in the SV0° condition. Whereas most required adjustments were negligible, it was decided to adjust the normative data cutoff scores to reflect the 0.9 dB effect of presenting the DV±90° condition first as stipulated by the recommended order. As such, the intercept for the high-cue SRT (i.e., DV±90° SNR) was increased by 0.9 dB (from −5.06 to −4.16 dB). As the SNR of the DV±90° condition is also utilized in the calculation of the total advantage score, the intercept for this measure was decreased by 0.9 dB (from 7.36 to 6.46 dB). The adjustments are noted in Table 8.
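The order adjustment can be approximated from the group means in Table 7. The sketch below is an illustration under stated assumptions, not the authors' analysis: it fits the slope to the four position means rather than to individual scores (the two agree only if the same number of children were tested in each position), and all names are hypothetical.

```python
ORDER_MIDPOINT = 2.5

# Mean SRT (dB) at presentation positions 1 to 4, from Table 7.
MEAN_SRT_BY_POSITION = {
    "DV±90": [-10.7, -10.7, -11.8, -12.3],
    "SV±90": [-9.8, -10.0, -9.6, -9.1],
    "DV0": [-4.2, -5.3, -7.0, -5.9],
    "SV0": [-0.4, -0.2, -0.2, -0.3],
}

# Position of each condition in the recommended presentation order.
RECOMMENDED_POSITION = {"DV±90": 1, "SV±90": 2, "DV0": 3, "SV0": 4}


def slope_db_per_step(means):
    """Least-squares slope (the B-value, dB per step) of SRT on position."""
    positions = list(range(1, len(means) + 1))
    mean_x = sum(positions) / len(positions)
    mean_y = sum(means) / len(means)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(positions, means))
    denominator = sum((x - mean_x) ** 2 for x in positions)
    return numerator / denominator


def order_adjustment(condition):
    """Steps away from the order midpoint multiplied by the B-value."""
    steps = RECOMMENDED_POSITION[condition] - ORDER_MIDPOINT
    return steps * slope_db_per_step(MEAN_SRT_BY_POSITION[condition])


for name in RECOMMENDED_POSITION:
    print(f"{name}: {order_adjustment(name):+.2f} dB")
```

Rounded to one decimal place, the sketch gives the same adjustments as those reported in the text for all four conditions.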

EXPERIMENT 3—TEST–RETEST

RELIABILITY STUDY

Approval to conduct the test–retest reliability study

was obtained from the Institutional Review

Boards of the Cincinnati Children’s Hospital Medical

Center, the University of Texas at Dallas, and the

University of Calgary.

Participants

Data were collected by Cincinnati Children’s Hospital, Calgary Health Region, and the University of

Texas at Dallas. Participants were 36 of the 72

children who had taken part in the normative data

study who agreed to also take part in the test–retest

reliability study. Participant details are provided in

Table 9. Participants recruited by the Cincinnati

Children’s Hospital were tested between 10 a.m. and

6 p.m. Participants recruited by Calgary Health

Region were tested between 3 p.m. and 6 p.m.

Participants from the University of Texas at Dallas

were tested between 10.30 a.m. and 6 p.m.


The materials and procedures used in the test–

retest reliability study were as for the normative data

study. The four LiSN-S conditions were presented to

each participant in the same order that they were

presented during the normative data study. Retesting

on the LiSN-S was carried out between 2 months, 0 days, and 3 months, 25 days, following the initial

testing (median 2 months, 9 days; mean 2 months, 13

days).

RESULTS—EXPERIMENT 3

All analyses were performed with Statistica 7.1.

Test–Retest Paired Comparisons

The mean scores and standard deviations for the various LiSN-S conditions, and the advantage measures derived from those conditions, at test and retest are provided in Table 10. Results of difference scores between test and retest, as well as the t and P values of paired-samples t-tests, are also provided. Except for the spatial advantage measure, all differences were in the direction representing an improvement in performance. The maximum improvement in performance on retest was 0.7 dB on the different voices 0 degrees condition. Minimum change was 0.1 dB on the spatial advantage and total advantage conditions. There were no significant differences in performance between test and retest on any LiSN-S measure (p ranged from .080 to .862).

Table 8. Data Utilized in the Calculation of Listening in Spatialized Noise–Sentences Test Cutoff Scores for the 72 Children in the Normative Data Study

Measure             Mean (dB)   SD (Residuals; dB)   Intercept   B-Value   r²
Low-Cue SRT         −0.3        1.05                 2.26        −0.28     0.181
High-Cue SRT        −11.3       2.38                 −4.16 (a)   −0.70     0.209
Talker Advantage    5.3         2.07                 −0.17       0.60      0.207
Spatial Advantage   9.3         1.67                 5.18        0.46      0.186
Total Advantage     11.1        2.18                 6.46 (b)    0.41      0.099

Note: All r² values are significant at p < .05. SRT = speech reception threshold.
(a) Increased by 0.9 dB to account for practice effects.
(b) Decreased by 0.9 dB to account for practice effects.

Effect of Age on LiSN-S Performance Measures

Figure 7 depicts the mean test and retest scores for each of the LiSN-S SRT and advantage measures, as a function of age. A repeated-measures ANOVA was performed for each LiSN-S SRT and advantage measure, with age as a between-participants factor, to determine whether test–retest differences differed with age. An alpha level of 0.05 was used for all comparisons. There was no significant interaction of test session and age for the low-cue SRT (F[5, 30] = 0.28, p = .922), the high-cue SRT (F[5, 30] = 0.16, p = .975), spatial advantage (F[5, 30] = 1.83, p = .137), or total advantage (F[5, 30] = 0.56, p = .733). The age by test session interaction was significant for the talker advantage measure (F[5, 30] = 3.82, p = .008). Post hoc tests using Tukey’s HSD revealed that the retest scores of the 11-year-olds on talker advantage were significantly better than the test scores of the six-year-olds on that measure (p = .016). However, there were no significant differences in talker advantage scores between test and retest within any age group (for example, between six-year-olds at test and retest).

Figure 6. Linear regression scatter plots of Listening in Spatialized Noise–Sentences test (a–b) speech reception threshold (SRT) and (c–e) advantage measures for children in the normative data study. Prediction lines represent 95% intervals from the mean.

Test–Retest Correlation Analysis

A Pearson product-moment correlation analysis was performed for each of the LiSN-S SRT and advantage measures. All correlations were significant except for the low-cue SRT measure: low-cue SRT, r = 0.1, p = .678; high-cue SRT, r = 0.7, p < .001; talker advantage, r = 0.5, p < .001; spatial advantage, r = 0.4, p = .007; total advantage, r = 0.6, p < .001. Lack of correlation between test and retest scores for the low-cue SRT condition is expected due to the small size of the interparticipant spread. Nearly all data lie within 2 dB of the mean for both test and retest scores. Scatter plots in Figure 8 show the correlation of test versus retest scores for each of the LiSN-S SRT and advantage measures.

Test–Retest Correction Factors

Table 11 displays the calculations of the one-sided critical differences required to determine whether a child with (C)APD has improved on the LiSN-S following remediation or compensation, taking a correction factor for test–retest differences into account. That is, for any individual child, an improvement on a particular LiSN-S SRT or advantage measure should be greater than the listener’s score, plus the mean test–retest study difference, minus 1.64 × the standard deviation of the mean test–retest reliability study difference for the SRT measures, and plus that amount for the advantage measures. Critical difference measures, including the correction factor, ranged from 3.0 dB on the total advantage measure to 3.9 dB on the talker advantage measure.
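The critical difference rule can be sketched as a small function. The Table 11 values are not reproduced in this section, so the numbers in the usage example are hypothetical, as are the function and parameter names. The sign convention (change expressed as retest minus test, so a negative mean change for SRT measures) is also an assumption.

```python
Z_ONE_SIDED = 1.64  # multiplier quoted in the text for a one-sided criterion


def retest_threshold(initial_score, mean_retest_change, sd_of_change, kind):
    """Score a retest must pass before a change counts as genuine improvement.

    kind is "srt" (lower is better, retest must fall below the threshold)
    or "advantage" (higher is better, retest must exceed the threshold).
    """
    if kind == "srt":
        return initial_score + mean_retest_change - Z_ONE_SIDED * sd_of_change
    if kind == "advantage":
        return initial_score + mean_retest_change + Z_ONE_SIDED * sd_of_change
    raise ValueError("kind must be 'srt' or 'advantage'")


# Hypothetical child: initial SRT of 2.0 dB, with an assumed mean retest
# change of -0.5 dB and an assumed SD of change of 1.5 dB.
print(round(retest_threshold(2.0, -0.5, 1.5, "srt"), 2))
```

Under these illustrative inputs the retest SRT would need to fall below about −1.0 dB, that is, improve by roughly 3 dB, before the change exceeds what practice and day-to-day fluctuation alone would be expected to produce.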

DISCUSSION

The development of the North American LiSN-S and the subsequent sentence equivalence, normative data, and test–retest reliability studies were carried out in line with the design and procedures employed in the development of the Australian LiSN-S (Cameron and Dillon, 2007a, 2007c). The NA LiSN-S sentence equivalence study showed that postequalization, intelligibility increased across sentences by 18.7 percent for each 1 dB increase in SNR, which is comparable with the increase of 17 percent per dB for the Australian LiSN-S.

Table 10. Mean Scores and Standard Deviations (in dB) for the 36 Participants at Test and Retest on the Various North American Listening in Spatialized Noise–Sentences Test Conditions and the Advantage Measures Calculated from Those Conditions

                                        Test             Retest
Measure                                 Mean     SD      Mean     SD      Paired Difference   t Value   P Value
Same Voice 0° (Low-Cue SRT)             −0.4     1.29    −0.9     1.26    0.5                 1.80      .080
Same Voice ±90°                         −9.1     4.51    −9.7     4.55    0.5                 1.66      .106
Different Voices 0°                     −5.6     3.92    −6.4     4.14    0.7                 1.59      .120
Different Voices ±90° (High-Cue SRT)    −11.8    2.56    −12.3    2.01    0.6                 1.80      .080
Talker Advantage                        5.8      2.30    6.0      2.43    0.2                 2.27      .621
Spatial Advantage                       9.5      1.89    9.5      1.84    0.1                 0.24      .815
Total Advantage                         11.4     2.16    11.4     1.53    0.1                 −0.17     .862

Note: SRT = speech reception threshold.

Table 9. Details of the 36 Participants in the Test–Retest Reliability Study for the North American Listening in Spatialized Noise–Sentences Test

Age Group   n   Male   Female   Minimum Age       Maximum Age       Mean Age
                                (years, months)   (years, months)   (years, months)
6           8   4      4        6, 2              6, 10             6, 7
7           3   2      1        7, 0              7, 7              7, 5
8           7   3      4        8, 1              8, 11             8, 6
9           6   5      1        9, 4              9, 11             9, 8
10          3   2      1        10, 3             10, 11            10, 8
11          9   6      3        11, 0             11, 11            11, 6

The results of the NA LiSN-S normative data study revealed that there was no effect of collection site (Ohio vs. Alberta vs. Texas) on any performance measure (p ranging from .215 for spatial advantage to .878 for low-cue SRT). As for the Australian study, there was a trend of decreasing SRT as age increased, across measures. Interparticipant standard deviations across performance measures were small for both versions of the LiSN-S (NA LiSN-S standard deviations ranged from 1.2 dB in the SV0° condition to 2.9 dB in the DV0° condition; Australian LiSN-S standard deviations ranged from 1.3 dB in the SV0° condition to 2.8 dB in the SV±90° condition). Intraparticipant standard deviations were also small for both versions (NA LiSN-S ranged from 0.77 dB in the SV0° condition to 0.98 dB in the DV±90° condition; Australian LiSN-S ranged from 0.86 dB in the SV0° condition to 0.97 dB in the DV±90° condition). There were no significant gender effects found for either version of the LiSN-S for any performance measure.

Figure 7. Dot plots depicting mean test and retest scores for each of the North American Listening in Spatialized Noise–Sentences test (a–b) speech reception threshold (SRT) and (c–e) advantage measures as a function of age. Circular filled symbols connected by solid lines represent the test scores. Square unfilled symbols connected by dashed lines represent the retest scores. Error bars represent the 95% confidence intervals from the mean.

Analysis of variance revealed that there were

significant differences in mean score between the

North American and Australian versions of the LiSN-S across performance measures (ranging from a

difference between versions of 0.3 dB on the low-cue

SRT to 2.9 dB on the high-cue SRT). In each case

except for talker advantage, performance was slightly

better for the Australian children. It can only be

speculated as to why these slight differences occurred,

as the design and procedures utilized in the production

of the LiSN-S in both versions were identical. It could

be suggested that the time of day (and hence the

alertness of the children) at which data collection

occurred may have resulted in the differences in mean

SRT/advantage between versions. All the Australian

data were collected between 9.30 a.m. and 2.30 p.m.,

whereas only the data from Texas were collected

during a similar time frame, with the data from

Alberta and Cincinnati mainly collected after school

hours. It is conceivable that the scores achieved by the

North American children were poorer because they

were tested later in the day and fatigue played a role in

the results. However, if this were the case, the scores

achieved in Texas should have been significantly

better than in the other sites, and this is not the case.

Also, scores on the talker advantage measure were

better for the North American children than for the

Australian children, making test time an unlikely

cause of differences in SRT/advantage between versions. Slight differences in the actual recordings (for

example, in respect to pace of speakers, differences

between speaker voices) may have contributed to the

differences in SRT/advantage between versions.

As a trend of improved performance with increasing

age was found for both the North American and

Australian versions of the LiSN-S, it was determined

that the cutoff scores would need to be adjusted for age.

The cutoff scores for the NA LiSN-S represent the level

below which performance on a particular performance

measure is considered to be outside normal limits and

is calculated from the intercept and B-values obtained

from a regression analysis of SRT on age. All r² values

were significant for both the Australian and North

American versions of the LiSN-S.

The effect of practice on the NA LiSN-S was also examined by measuring the effect of position within the four subtests on the performance of each subtest. For the DV0° condition there was a significant effect of test order between the first and third position. There was no effect of practice between the second and any subsequent position for this condition, nor was there any significant effect of practice on any other NA LiSN-S condition (p ranging from .194 to .972). A significant effect of practice was also found between the first and third (and fourth) positions for the DV0° condition of the Australian LiSN-S. Again there were no significant differences between the second and any subsequent position.

To account for the practice effect found for the DV0° condition it was stipulated for the Australian LiSN-S that the DV0° condition be presented after the DV±90° condition and the SV±90° condition when this test is utilized clinically or in future studies. This presentation order is also recommended for the North American version of the LiSN-S. For both versions, a regression analysis was conducted to determine the effect of change of presentation order on performance. The adjustment to the normative data needed to account for the delivery of the LiSN-S conditions in the recommended order (DV±90°, SV±90°, DV0°, then SV0°) was calculated as the number of steps away from the order midpoint (2.5) multiplied by the B-value (dB/step). For the Australian LiSN-S the greatest modification required to adjust the normative data to account for the recommended order was only 0.2 dB, and any adjustment was therefore considered unnecessary (Cameron and Dillon, 2007a). For the NA LiSN-S most required adjustments were again negligible (0 dB for the SV0° condition, −0.1 dB for the SV±90° condition, and −0.3 dB for the DV0° condition). However, as the adjustment needed to compensate for presenting the DV±90° condition first was almost 1 dB, the intercept used in the calculation of the NA LiSN-S cutoff scores for this condition (and also the total advantage measure that is consequently affected, as it is calculated as the difference between the SV0° and DV±90° conditions) was adjusted accordingly.

The NA LiSN-S test–retest reliability study revealed that differences in mean SRT between test and retest were small across performance measures (0.1 to 0.7 dB). This emulated the test–retest differences found for the Australian LiSN-S, which ranged from 0.1 to 1.1 dB (Cameron and Dillon, 2007c). While there were no significant differences in mean SRT/advantage between test and retest for any performance measure of the NA LiSN-S, significant differences were found for all Australian LiSN-S performance measures except spatial advantage. Test–retest differences did not vary significantly with age for either the Australian or North American versions of the LiSN-S. A correlation analysis of test and retest scores across performance measures was statistically significant for the Australian LiSN-S (r ranging from 0.3 to 0.8). The correlation between test and retest was also significant for all of the North American LiSN-S performance measures except the low-cue SRT, which was attributed to the extremely small interparticipant spread on this measure.

Figure 8. Scatter plots depicting the correlation of test vs. retest scores for each of the Listening in Spatialized Noise–Sentences test (a–b) speech reception threshold (SRT) and (c–e) advantage measures. Lines show the 95% confidence interval of the regression line.

As for the Australian LiSN-S, the test–retest data

from the NA LiSN-S study were utilized to develop

one-sided critical difference scores. These scores can be

used to determine whether a child has genuinely

improved on the LiSN-S following a period of remediation or compensation with an assistive listening

device. Critical difference scores for the NA LiSN-S

ranged from 3.0 dB on the total advantage measure to

3.9 dB on the talker advantage measure. These scores

were highly comparable to the Australian LiSN-S

critical difference scores, which ranged from 2.5 dB

on the low-cue SRT to 4.4 dB on talker advantage.

CONCLUSION

In previous studies (Cameron and Dillon, 2007a, 2007c, 2008), the Australian LiSN-S was reported to be a fast and efficient assessment tool with potential to be used clinically to evaluate auditory streaming skills in children with suspected CAPD. The present study has described the development of a North American–accented and semantically appropriate version of the LiSN-S, suitable for use in the United States and Canada. The normative data were not affected by data collection site or gender, and inter- and intraparticipant variation was minimal. An expected trend of improved performance as a function of increasing age was found for all measures, and the calculation of the cutoff scores that determine the level below which performance on the NA LiSN-S is considered to be outside normal limits was adjusted for age accordingly. Test–retest differences were not significant on any NA LiSN-S measure. The calculation of one-sided critical difference scores, which take into account mean practice effects and day-to-day fluctuation in performance, makes the NA LiSN-S a potentially valuable tool for monitoring performance over time and for evaluating the effects of maturation, remediation, or compensation (such as the use of an assistive listening device).

Acknowledgments. We would like to thank Jill Anderson and Cari Olsen from the University of Cincinnati, who collected the data for the sentence equivalence study, normative data study, and test–retest reliability study for that facility. The distracter children’s stories used in this project were provided by Ms. Vashti Farrer. The contribution of Stephen Cameron in the production of the LiSN-S software program is also sincerely appreciated.

REFERENCES

Bamford J, Wilson I. (1979) Methodological considerations and practical aspects of the BKB sentence lists. In: Bench J, Bamford J, eds. Speech-Hearing Tests and the Spoken Language of Hearing-Impaired Children. London: Academic Press, 146–187.

Bregman AS. (1990) Auditory Scene Analysis. Cambridge: MIT Press.

Cameron S, Dillon H. (2006) Listening in Spatialized Noise Test (LiSN®)–Sentences (Version 1.0.0) [computer software]. Sydney: National Acoustic Laboratories.

Cameron S, Dillon H. (2007a) Development of the Listening in Spatialized Noise–Sentences test (LiSN-S). Ear Hear 28(2):196–211.

Cameron S, Dillon H. (2007b) North American Listening in Spatialized Noise–Sentences test (NA LiSN-S) (Version 1.1.0) [computer software]. Sydney: National Acoustic Laboratories.

Cameron S, Dillon H. (2007c) The Listening in Spatialized Noise–Sentences test (LiSN-S): test–retest reliability study. Int J Audiol 46:145–153.

Cameron S, Dillon H. (2008) The Listening in Spatialized Noise–Sentences test: comparison to prototype LiSN test and results from children with either a suspected (central) auditory processing disorder or a confirmed language disorder. J Am Acad Audiol 19(5):377–391.

Cameron S, Dillon H, Newall P. (2006) Development and evaluation of the Listening in Spatialized Noise test. Ear Hear 27(1):30–42.

Crystal D. (1989) Grammatical Analysis of Language Disability. London: Cole and Whurr Limited.

Dawes P, Bishop DVM. (2007) The SCAN-C in testing for auditory processing disorder in a sample of British children. Int J Audiol 46:780–786.

Golding M, Lilly DJ, Lay JW. (1996) A Staggered Spondaic Word (SSW) test for Australian use. Aust J Audiol 18(2):81–88.

Table 11. Calculation of the One-Sided Critical Differences Needed to Infer a Genuine Improvement in Auditory Performance on Retest While Taking into Account Mean Practice Effects and Day-to-Day Fluctuations in Performance (in dB)

| Condition | Correction Factor (Mean Test–Retest Difference) | SD of the Mean Test–Retest Difference | 1.64 × SD | Critical Difference (Including Correction) |
| Low-Cue SRT | −0.52 | 1.74 | 2.85 | −3.4 |
| High-Cue SRT | −0.59 | 1.96 | 3.21 | −3.8 |
| Talker Advantage | 0.19 | 2.27 | 3.72 | 3.9 |
| Spatial Advantage | −0.07 | 1.98 | 3.25 | 3.2 |
| Total Advantage | 0.05 | 1.80 | 2.95 | 3.0 |

Note: SRT = speech reception threshold.
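The arithmetic in Table 11 can be reproduced with a short sketch. It assumes, based on the signs of the tabled values, that the 1.64 × SD margin is applied in the direction of improvement: subtracted for the two SRT measures (where a lower threshold is better) and added for the three advantage measures. The names used are illustrative only and are not part of the LiSN-S software.

```python
# Sketch only: reproduce the one-sided critical differences in Table 11.
# Assumption: the 1.64 x SD margin is signed in the direction of improvement
# (negative for SRTs, positive for advantage measures).

Z_ONE_SIDED = 1.64  # multiplier given in the Table 11 column header

# measure: (mean test-retest difference, SD of the difference, direction)
TABLE_11 = {
    "Low-Cue SRT": (-0.52, 1.74, -1),
    "High-Cue SRT": (-0.59, 1.96, -1),
    "Talker Advantage": (0.19, 2.27, +1),
    "Spatial Advantage": (-0.07, 1.98, +1),
    "Total Advantage": (0.05, 1.80, +1),
}


def critical_difference(mean_diff: float, sd_diff: float, direction: int) -> float:
    """Mean practice effect plus the signed one-sided margin, in dB."""
    return mean_diff + direction * Z_ONE_SIDED * sd_diff


CRITICAL = {
    name: round(critical_difference(*row), 1) for name, row in TABLE_11.items()
}

if __name__ == "__main__":
    for name, value in CRITICAL.items():
        print(f"{name}: {value:+.1f} dB")
```

For example, the low-cue SRT value follows from −0.52 − (1.64 × 1.74) = −3.37, which rounds to the −3.4 dB shown in the table.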



Green LJ. (2002) African American English: A Linguistic Introduction. New York: Cambridge University Press.

Katz J. (1962) The use of staggered spondaic words for assessing the integrity of the central auditory nervous system. J Speech Disord 33:132–146.

Keith RW. (1986) SCAN—A Screening Test for Auditory Processing Disorders. San Diego: Psychological Corporation.

Keith RW. (2000) SCAN-C Test for Auditory Processing Disorders in Children—Revised. San Antonio: Psychological Corporation.

Koehnke J, Besing J. (1997) Clinical applications of 3-D auditory tests. Semin Hear 18(4):345–354.

Kowal A. (1979) Sentence list construction and pilot test. In: Bench J, Bamford J, eds. Speech-Hearing Tests and the Spoken Language of Hearing-Impaired Children. London: Academic Press, 110–145.

Marriage J, King J, Briggs J, Lutman ME. (2001) The reliability of the SCAN test: results from a primary school population in the UK. Br J Audiol 35:199–208.

Sussman E, Ritter W, Vaughan HG. (1999) An investigation of the auditory streaming effect using event-related brain potentials. Psychophysiology 36:22–34.

The MathWorks, Inc. (2002a) MATLAB (Version 6.5.6) [computer software]. Natick, MA: The MathWorks, Inc.

Wilber LA. (2002) Transducers for audiologic testing. In: Katz J, ed. Handbook of Clinical Audiology. Baltimore: Lippincott Williams and Wilkins, 88–95.

APPENDIX A

Practice Sentences—LiSN-S Same Voice 0 Degrees Condition

1. The boys are watching the game.

2. A dog is hiding the bone.

3. Two girls went to the store.

4. A painting hangs on the wall.

5. Some people go to the gym.

APPENDIX B

Extract from the LiSN-S distracter continuous discourse presented at 0 degrees and −90 degrees:

“Loopy Lizard’s Tail” by Vashti Farrer

Loopy lizard was on his way home. His mother had told him not to dawdle, but Loopy wanted to play. So he ran across a path and through the grass, where he pretended to hide. Then he scampered up a wall and peered into a crack to see who lived there. Then Loopy stopped very still in the sun, as if he were asleep, just to feel how warm it was. Suddenly a big, fierce dog came down the path. “He must be fierce,” thought Loopy, “because he has big teeth.” And he started to run as fast as he could along the path, to get out of the dog’s way. But the big dog chased him.

APPENDIX C

LiSN-S Instructions

1. You are going to hear some sentences over these headphones.

2. The sentences are said by a lady called “Miss Smith.”

3. Miss Smith will sound as if she is standing just in front of you.

4. There will be a “beep” before each sentence so you will know when it is about to start.

5. Your job is to repeat back the sentence that Miss Smith says.

6. I’ll pretend to be Miss Smith, and I want you to repeat the sentence you hear.

7. “Beep.” “The dog had a bone.”

8. Child repeats “The dog had a bone.”

9. Good, that’s easy isn’t it? But there’s a trick. At the same time that Miss Smith is telling you the sentence there are some very tricky people talking at the same time.

10. Sometimes the tricky people sound like they are standing right next to Miss Smith, sometimes they will sound like they are standing next to you.

11. No matter where the tricky people are I don’t want you to listen to them.

12. Just listen for the “beep” and the sentence.

13. Miss Smith always starts out louder than the tricky people, so you shouldn’t have any trouble hearing her.

14. But sometimes the tricky people get loud. If you only hear a bit of the sentence I want you to tell me all the words that you hear.

15. So if you just heard “dog” and “bone,” what would you say?

16. Child repeats “dog” and “bone.”

17. Great. If you don’t hear Miss Smith at all, just shake your head and I’ll go straight on to the next sentence.

18. Once you’ve heard the sentence tell me what you’ve heard straight away so you don’t forget it.

19. In the first lot of sentences the tricky people will be standing right next to you. Don’t listen to them. Just concentrate on Miss Smith in front.

20. The tricky people start first and then Miss Smith starts a few seconds later. Ready?

21. Describe where the tricky people are before each listening condition.

(a) “Same Voice—±90 Degrees” Condition: “Now the tricky people will be next to you again, but their voices will be a bit different. Ignore them and just listen for Miss Smith.”

(b) “Different Voices—0 Degrees” Condition: “Now the tricky people will be next to Miss Smith. Just listen for the beep and the sentence.”

(c) “Same Voice—0 Degrees” Condition: “Now the tricky people will be next to Miss Smith, and their voices will be very similar to Miss Smith’s voice. So you will have to listen very hard for the beep and Miss Smith.”

