Development of the North American Listening in Spatialized Noise–Sentences Test (NA LiSN-S): Sentence Equivalence, Normative Data, and Test–Retest Reliability Studies
DOI: 10.3766/jaaa.20.2.6
Sharon Cameron*†, David Brown‡§, Robert Keith§,
Jeffrey Martin**, Charlene Watson††, Harvey Dillon*
Abstract
Background: The Listening in Spatialized Noise–Sentences test (LiSN-S®) was originally developed in Australia to assess auditory stream segregation skills in children with suspected central auditory processing disorder (CAPD). The software produces a three-dimensional auditory environment under headphones. A simple repetition-response protocol is utilized to determine speech reception thresholds (SRTs) for sentences presented from 0 degrees azimuth in competing speech. The competing speech (looped children’s stories) is manipulated with respect to its location (0 degrees vs. +90 degrees and −90 degrees azimuth) and the vocal quality of the speaker(s) (same as, or different to, the speaker of the target stimulus). Performance is measured as two SRT and three advantage measures. The advantage measures represent the benefit in dB gained when either talker, spatial, or both talker and spatial cues combined are incorporated in the maskers.
Purpose: The objective of this research was to develop a version of the LiSN-S suitable for use in the United States and Canada. The original sentences and children’s stories were reviewed for unfamiliar
semantic items and rerecorded by native North American speakers.
Research Design: In a descriptive design, a sentence equivalence study was conducted to determine
the relative intelligibility of the rerecorded sentences and adjust the amplitude of the sentences for equal intelligibility. Normative data and test–retest reliability data were then collected.
Study Sample: Twenty-four children with normal hearing aged 8 years, 3 months, to 10 years, 0 months, took part in the sentence equivalence study. Seventy-two normal-hearing children aged 6 years, 2
months, to 11 years, 10 months, took part in the normative data study. Thirty-six children returned between two and three months after the initial assessment for retesting. Participants were recruited from
sites in Cincinnati, Dallas, and Calgary.
Results: The sentence equivalence study showed that, post-adjustment, sentence intelligibility increased by 18.7 percent for each 1 dB increase in signal-to-noise ratio. Analysis of the normative data revealed no significant differences on any performance measure as a consequence of data collection site or gender. Inter- and intra-participant variation was minimal. A trend of improved performance as a function of increasing age was found across performance measures, and cutoff scores, calculated as two standard deviations below the mean, were adjusted for age. Test–retest differences were not significant on any measure of the North American (NA) LiSN-S (p ranging from .080 to .862). Mean test–retest differences on the various NA LiSN-S performance measures ranged from 0.1 dB to 0.6 dB. One-sided critical difference scores calculated from the retest data ranged from 3 to 3.9 dB. These scores, which take into account mean practice effects and day-to-day fluctuations in performance, can be used to determine whether a child has improved on the NA LiSN-S on retest.
*National Acoustic Laboratories; †Macquarie University; ‡Cincinnati Children’s Hospital Medical Center; §University of Cincinnati; **University of Texas at Dallas; ††Community Audiology Services, Calgary Health Region
Sharon Cameron, Ph.D., National Health and Medical Research Council Public Health (Australia) Fellow and Research Scientist, National Acoustic Laboratories, 126 Greville St., Chatswood, NSW, 2067, Australia; Phone: +61 2 9412 6851; Fax: +61 2 9411 8273; E-mail: [email protected]
A commercial version of the test described in this article will be released shortly. Financial returns from that commercialization will benefit Dr. Cameron and the organizations involved in this study. This outcome has in no way influenced the research reported in this article.
J Am Acad Audiol 20:128–146 (2009)
Conclusions: The NA LiSN-S is a potentially valuable tool for assessing auditory stream segregation
skills in children. The availability of one-sided critical difference scores makes the NA LiSN-S useful for monitoring listening performance over time and determining the effects of maturation, compensation
(such as an assistive listening device), or remediation.
Key Words: Auditory stream segregation, (central) auditory processing disorder
Abbreviations: BKB = Bamford-Kowal-Bench sentences test; CAPD = central auditory processing disorder; eSRT = estimate of speech reception threshold; FFT = fast Fourier transform; HpTF = headphone transfer function; HRTF = head-related transfer function; KEMAR = Knowles Electronics Manikin for Acoustic Research; LiSN-S = Listening in Spatialized Noise–Sentences test; NA LiSN-S = North American Listening in Spatialized Noise–Sentences test; rms = root mean square; SNR = signal-to-noise ratio
The following article outlines the development of
a North American–accented and semantically
appropriate version of the Listening in Spatialized Noise–Sentences test (LiSN-S® [Cameron and
Dillon, 2006]). The LiSN-S was developed in Australia
to assess auditory stream segregation skills in
children with suspected central auditory processing
disorder (CAPD). Auditory stream segregation is the
process by which a listener is able to differentiate the
various auditory signals that arrive simultaneously
at the ears and form meaningful representations of the incoming acoustic signals (Sussman et al, 1999).
Auditory cues such as the perceived spatial location of
sounds or the pitch of speakers’ voices help this
process of segregating the total stream of sound
(Bregman, 1990).
The LiSN-S is presented using a personal computer.
Output levels are directly controlled by the software
via an external USB sound card. A three-dimensional auditory environment under headphones is created by
presynthesizing the speech stimuli with head-related
transfer functions (HRTFs). This approach offers
several advantages over traditional soundfield testing.
First, it minimizes the variability in the sound
pressure level at the eardrum caused by a listener’s
head movements (Wilber, 2002). Second, it offsets
potential differences in stimulus delivery due to variations in loudspeaker and listener placement that
exist between clinics. Third, it reduces the effects of
reverberation in the test environment (Koehnke and
Besing, 1997).
On the LiSN-S, a simple repetition-response protocol is used to assess a listener’s speech reception threshold (SRT) for target sentences presented in competing speech maskers (children’s stories). Using HRTFs, the targets are perceived as coming from directly in front of the listener (0 degrees azimuth), whereas the maskers, relative to the targets, vary according to their perceived spatial location (0 degrees vs. +90 degrees and −90 degrees azimuth), the vocal identity of the speaker(s) of the stories (same as, or different to, the speaker of the target sentences), or both. This results in four listening conditions: same voice at 0 degrees (or low-cue SRT), same voice at ±90 degrees, different voices at 0 degrees, and different voices at ±90 degrees (or high-cue SRT).
Figure 1. The Listening in Spatialized Noise–Sentences test speech reception threshold (SRT) and advantage measures.
NA LiSN-S/Cameron et al
Performance on the LiSN-S is evaluated on the low- and high-cue SRT, as well as on three ‘‘advantage’’ measures. These advantage measures represent the benefit in dB gained when either vocal, spatial, or both vocal and spatial cues are incorporated in the maskers, compared to the baseline (low-cue SRT) condition where no cues are present in the maskers (see Figure 1). The use of relative measures of performance (i.e., difference scores) serves to minimize the influence of higher-order language, learning, and communication skills on test performance. For example, as such skills affect both the SRT when the distracters are presented at 0 degrees and the SRT when they are spatially separated at ±90 degrees, these skills will have minimal effect on the difference between the SRTs in these two conditions. Thus, the differences that inevitably exist between individuals in such functions can be accounted for, allowing for clearer evaluation of their abilities to use spatial and voice cues to aid speech understanding.
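The difference scores described above can be sketched in a few lines. This is an illustration only: the dictionary keys and the example SRT values are invented, and the sign convention (baseline SRT minus cued SRT, so that a larger positive number means more benefit) is an assumption based on the description of the advantage measures as a benefit in dB relative to the low-cue condition.

```python
def advantage_measures(srt_db):
    """Derive the two SRT and three advantage scores from four condition SRTs (dB).

    srt_db uses hypothetical keys: "SV0", "SV90", "DV0", "DV90".
    """
    low_cue = srt_db["SV0"]     # same voice, 0 degrees (baseline, no cues)
    high_cue = srt_db["DV90"]   # different voices, +/-90 degrees (both cues)
    return {
        "low_cue_srt": low_cue,
        "high_cue_srt": high_cue,
        # Assumed convention: a lower SRT is better, so benefit = baseline - cued.
        "talker_advantage": low_cue - srt_db["DV0"],
        "spatial_advantage": low_cue - srt_db["SV90"],
        "total_advantage": low_cue - high_cue,
    }


# Made-up SRTs for illustration; these are not data from the study.
example = {"SV0": -1.0, "SV90": -11.0, "DV0": -5.5, "DV90": -14.0}
scores = advantage_measures(example)
```

Because any listener-wide shift in SRT is added to every condition, it cancels in each subtraction, which is the property the paragraph above relies on.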
The LiSN-S has been shown to be sensitive to auditory streaming deficits in children whose primary difficulties in the classroom stem from poor listening behavior, as opposed to those with documented learning and attention disorders (Cameron and Dillon, 2008). For these children, interestingly, significant differences on the LiSN-S occurred only in the conditions where the physical location of the maskers was manipulated (high-cue SRT, p = .001; spatial advantage, p < .0001; and total advantage, p < .0001). These results provide further evidence to suggest that the LiSN-S procedure is capable of differentiating not only an auditory versus language disorder but also a spatial versus vocal streaming segregation disorder.
The LiSN-S was developed in Australia. The target
sentences were written by Australian speech pathologists,
and the distracter children’s stories were written
by an Australian novelist. All the speech stimuli were
recorded by Australian speakers (Cameron and Dillon,
2007a). The sentence equivalence data, normative
data, and test–retest reliability data were collected
from Australian children (Cameron and Dillon, 2007a,
2007c).
Previous research has shown that performance on
audiologic tests that utilize speech stimuli may be
detrimentally affected in nonnative populations due to
factors such as unfamiliar accent and semantic items
(Golding et al, 1996; Marriage et al, 2001; Dawes and
Bishop, 2007). In the development of an Australian
version of the Staggered Spondaic Word Test (Katz,
1962), for example, Golding et al (1996) found poorer
performances on the Australian-accented version of
the American baseball term ‘‘batboy’’ for seven out of
10 young Australian listeners. Since ‘‘batboy’’ was
considered unfamiliar to the Australian population, it
was subsequently substituted with a more familiar
spondee. A further trial on 33 normal-hearing young
adults using the substituted word showed that the
overall percentage error and standard deviation were
reduced. Marriage et al (2001) found that the mean
scores for British children aged seven and eight for
both the filtered words and auditory figure-ground
subtests of the SCAN (Keith, 1986) were significantly
poorer than North American normative data. It was
concluded that vocabulary factors contributed to the
poorer results of the U.K. population sample and the
changed overall acoustic pattern of the target stimuli
did not allow clear word matching with familiar forms.
Dawes and Bishop (2007) compared scores on a revised
version of the SCAN (SCAN-C [Keith, 2000]) in 99
British children aged six to 10 years. All age groups
scored significantly worse on the filtered words and
auditory figure-ground subtests of the SCAN-C, as well
as on the composite scores. It was concluded that
applying North American norms to the scores obtained
by British children results in a high rate of overidentification
of listening difficulties.
In light of the potential detrimental effects of accent
and unfamiliar semantic items on Australian LiSN-S
performance in the North American population, it was
decided to replace any unfamiliar semantic items with
those more suitable for a North American population
and to record the stimuli using native North American
speakers. This article reports on the development and
recording of the stimuli; a sentence equivalence study,
normative data study, and test–retest reliability study
for the North American LiSN-S (NA LiSN-S [Cameron
and Dillon, 2007b]) follow. Comparison of the results of
these studies to the results of the respective Australian
data is discussed.
DEVELOPMENT OF THE NORTH AMERICAN LiSN-S
NA LiSN-S Software Development
The NA LiSN-S graphic user interface and signal
processing application program were developed in the
C# programming language and were based on the
LiSN-S software described in Cameron and Dillon
(2007a). An image of the playback screen used to
administer the NA LiSN-S is provided in Figure 2.
Figure 2. The Listening in Spatialized Noise–Sentences test playback screen. The graph shows the history of the target level as the range of correct responses from greater than 50% correct to less than 50% correct is repeatedly traversed. The top horizontal line shows the level of the distracters, and the lower horizontal line shows the average level of the targets during the stable region.
Journal of the American Academy of Audiology/Volume 20, Number 2, 2009
Speech Stimuli
A total of 180 sentences used in the development of
the Australian LiSN-S were also utilized for the NA
LiSN-S. The target sentences were developed by the
Cooperative Research Centre for Cochlear Implant and
Hearing Aid Innovation and were used under license
from HearWorks Pty Limited. The sentences were
written by Australian registered speech pathologists
specializing in the rehabilitation of children with
hearing loss. Each sentence was constructed in
accordance with the criteria used in the development
of the Bamford-Kowal-Bench sentences test (BKB
[Bamford and Wilson, 1979]). The BKB sentences
contain mainly Stage 3 and some Stage 2 clause
structures as described in the Language Assessment,
Remediation, and Screening Procedure (Crystal, 1989)
and are suitable for children from 4.6 years of age
(Kowal, 1979).
The semantic content of each sentence was analyzed
independently by a native North American speaker
from the University of Cincinnati and a native
Canadian speaker from the National Acoustic Laboratories.
Changes were then amalgamated, and the final
list was agreed to by both reviewers. A total of 27
substitutions were made. For example, ‘‘shop’’ was
changed to ‘‘store,’’ ‘‘cricket’’ team was changed to
‘‘baseball’’ team, and ‘‘nappies’’ was changed to ‘‘diapers.’’
Examples of some of the sentences used in the
NA LiSN-S appear in Appendix A.
Two published Australian children’s stories entitled
‘‘Loopy Lizard’s Tail’’ and ‘‘The Great Big Tiny Traffic
Jam’’ were used as the competing speech stimuli.
Although listeners were instructed not to attend to the
competing stories, the semantic content of the stories
was also analyzed by the reviewers, and five changes
were made. For example, ‘‘skirting board’’ was changed
to ‘‘baseboard,’’ and ‘‘peeped out’’ was changed to
‘‘peered out.’’ An extract from ‘‘Loopy Lizard’s Tail’’
appears as Appendix B.
Recording
The North American versions of the LiSN-S target
sentences and distracter stories were recorded at a
professional recording studio in Sydney, Australia, by
native North American actors who had been in
Australia for less than 12 months. Female 1 (who is
also a North American dialect coach) recorded the
target sentences, as well as both stories. Female 2
recorded ‘‘Loopy Lizard’s Tail,’’ and Female 3 recorded
‘‘The Great Big Tiny Traffic Jam.’’ All stimuli were
produced with a general North American accent.
General North American English is the term given to
any American accent that is relatively free of noticeable
regional influences and is found in contemporary
North American–made films and television programs
(Green, 2002). All speakers were of the same ethnicity
and of approximately the same age. All speakers were
instructed to speak with a normal clear voice while
maintaining a normal rhythm of speech and to avoid
placing emphasis on key words. Specifically, clarity,
pace, and effort were maintained across words. These
qualifications were implemented to prevent listeners
from using cues, such as accent, to detect differences
between the stories and target sentences.
Editing
The analog signal was recorded directly onto hard
disk. The standard sampling frequency used in
compact disk recordings of 44.1 kHz with a 16 bit
digitization was utilized. The individual target sentences
and distracter discourse were extracted from
the recordings and edited using Adobe Audition
Version 1.5. A silent period of 100 msec was inserted
immediately preceding and following each distracter
story. Extraneous pauses were removed during the
editing process to ensure that the stories ran smoothly
and at a constant intensity level. The stories were
approximately two minutes and 30 seconds in length.
Level Normalization
The root mean square (rms) levels of each target sentence and the individual distracter stories were ascertained using Adobe Audition 1.5. These rms levels were then averaged (in dB) across all stimuli. The rms amplitude of each sentence and distracter was compared to the average rms in order to obtain a correction factor (i.e., the difference between each sentence/distracter rms and the average). Once equated for rms amplitude, all stimuli were then reduced by 7 dB to ensure that no clipping occurred when the distracters were convolved with HRTFs at +90 or −90 degrees azimuth. The final corrected rms level (prior to convolution) for all sentences was −27.1 dB re: digital full scale.
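The level normalization steps can be illustrated with a short sketch. It is not the Adobe Audition procedure itself: the function names are hypothetical, stimuli are assumed to be lists of samples scaled so that an rms of 1.0 corresponds to 0 dB re digital full scale, and the two test signals are made up.

```python
import math


def rms_db(samples):
    """RMS level in dB, assuming an rms of 1.0 corresponds to 0 dB full scale."""
    mean_square = sum(x * x for x in samples) / len(samples)
    return 10.0 * math.log10(mean_square)


def normalize_levels(stimuli, headroom_db=7.0):
    """Equate every stimulus to the average rms level, then attenuate for headroom."""
    levels = [rms_db(s) for s in stimuli]
    average_db = sum(levels) / len(levels)   # levels are averaged in dB
    normalized = []
    for samples, level in zip(stimuli, levels):
        correction_db = average_db - level   # per-stimulus correction factor
        gain = 10.0 ** ((correction_db - headroom_db) / 20.0)
        normalized.append([x * gain for x in samples])
    return normalized, average_db - headroom_db


# Two made-up stimuli at -40 dB and -20 dB rms.
quiet = [0.01, -0.01] * 4
loud = [0.1, -0.1] * 4
normalized, target_db = normalize_levels([quiet, loud])
```

Averaging in dB rather than in linear amplitude follows the wording of the text; the 7 dB attenuation is applied after the stimuli have been equated.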
Convolution
Each sentence recorded by Female 1 was convolved
with HRTFs recorded at 0 degrees azimuth. ‘‘Loopy
Lizard’s Tail’’ (recorded by Female 1 and Female 2)
was convolved with HRTFs recorded at 0 degrees
azimuth and −90 degrees azimuth. ‘‘The Great Big
Tiny Traffic Jam’’ (recorded by Female 1 and Female 3)
was convolved with HRTFs recorded at 0 degrees
azimuth and +90 degrees azimuth. The HRTFs were
recorded in a chamber, anechoic above 50 Hz, using a
Knowles Electronics Manikin for Acoustic Research
(KEMAR) containing a Zwislocki coupler and half-inch
microphone (see Cameron et al, 2006, for complete
description). Knowles Electronics small-sized pinnae
were fitted to simulate the outer ear. The HRTFs were
produced from swept sine waves ranging in frequency
from 50 to 20,000 Hz presented from a single loudspeaker
positioned 1 m from the center point of
KEMAR’s head.
The speech files were synthesized using the LiSN
convolution program developed using MATLAB software
(MathWorks Inc., 2002) as described in Cameron
et al (2006). In summary, the various speech files were
converted to the frequency domain by a fast Fourier
transform (FFT) and then multiplied by the HRTFs, as
well as an inverse headphone response that is
described in the postequalization procedure section
below. An inverse FFT was then applied to convert the
signals back into the time domain for playback.
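The frequency-domain processing summarized above can be sketched as follows. This is not the MATLAB program used by the authors: it is a minimal standard-library illustration for one ear, the function names are hypothetical, and the HRTF and inverse headphone response are assumed to be supplied as time-domain impulse responses. The small recursive FFT stands in for a library routine.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return [complex(values[0])]
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def spatialize(speech, hrir, inverse_hp_ir):
    """Filter speech with one ear's HRTF and an inverse headphone response."""
    out_len = len(speech) + len(hrir) + len(inverse_hp_ir) - 2
    size = 1
    while size < out_len:      # zero-pad so circular convolution equals linear
        size *= 2

    def spectrum(signal):
        return fft(list(signal) + [0.0] * (size - len(signal)))

    # Multiplying spectra is equivalent to convolving in the time domain.
    product = [s * h * p for s, h, p in
               zip(spectrum(speech), spectrum(hrir), spectrum(inverse_hp_ir))]
    filtered = fft(product, inverse=True)
    return [value.real / size for value in filtered[:out_len]]


# Toy check: a one-sample delay as the "HRTF" and a gain of 2 as the
# "inverse headphone response".
delayed = spatialize([1.0, 0.5, -0.25], hrir=[0.0, 1.0], inverse_hp_ir=[2.0])
```

A binaural stimulus would need this step twice, once with the left-ear and once with the right-ear response.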
Postequalization Procedure
A postequalization procedure was also implemented
to correct for the response of the headphones used
during playback. Swept sine wave signals were played
through Sennheiser HD215 circumaural audiometric
high-frequency headphones on a KEMAR and recorded
by a Stanford Research Systems two-channel network
signal analyzer in order to measure the headphone-to-eardrum
transfer function (HpTF). A filter with the
inverse transfer function of the HpTF was developed,
as described in Cameron et al, 2006. The inverse HpTF
was convolved with the spatialized LiSN-S speech
materials, effectively canceling out the HpTF that
occurs during playback. The convolved, postequalized
stimuli were saved as WAV files.
Stimulus Generation
The convolved and hence spatialized speech files
were stored in an NA LiSN-S subdirectory for
subsequent playback. The playback screen and related
programs retrieved the spatialized target and distracter
speech files and combined and scaled these stimuli
in dB to produce a binaural output signal. The right
and left ear components of the binaural signal were
assigned to the right and left channels of the computer
sound card, respectively.
Calibration
The mean rms level of the combined distracters (averaged across the recordings made by Female 1 and Females 2 and 3) at 0 degrees was −22.3 dB and at ±90 degrees was −21.3 dB. The 1 dB difference between the level of the distracters at ±90 degrees and 0 degrees occurs as a consequence of the HRTFs applied and was intentionally not corrected for. All signal-to-noise ratios were, therefore, defined relative to the level of the distracters at 0 degrees, where both the target sentences and distracters shared the same head-related transfer functions. A 1 kHz reference tone was created with an amplitude 10 dB greater than the average of the combined total rms levels of the two distracter files convolved with HRTFs at 0 degrees, that is, −12.3 dB.
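As a worked illustration of the reference-tone level, the sketch below adds 10 dB to the distracter level at 0 degrees and synthesizes a 1 kHz sine at the resulting rms level. The function name, the one-second duration, and the convention that an rms of 1.0 equals 0 dB re digital full scale are assumptions; the 44.1 kHz sampling rate is the one stated for the recordings.

```python
import math


def reference_tone(level_db_fs, freq_hz=1000.0, sample_rate=44100, duration_s=1.0):
    """Sine tone whose rms level is level_db_fs (rms of 1.0 assumed to be 0 dB)."""
    rms = 10.0 ** (level_db_fs / 20.0)
    peak = rms * math.sqrt(2.0)          # a sine's peak is sqrt(2) times its rms
    count = int(sample_rate * duration_s)
    return [peak * math.sin(2.0 * math.pi * freq_hz * i / sample_rate)
            for i in range(count)]


distracter_rms_db = -22.3                    # combined distracters at 0 degrees
tone_level_db = distracter_rms_db + 10.0     # reference tone level in dB
tone = reference_tone(tone_level_db)
```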
During the sentence equivalence study, the NA LiSN-S was administered using a PC, and the stimuli were presented through Sennheiser HD215 headphones, which were connected directly to the headphone socket of the PC. In order to determine the exact output levels in mV required to present the NA LiSN-S stimuli at a designated level in dB SPL, the various stimuli were presented through the left and right ear headphones to a Brüel and Kjær type 4153 artificial ear using a flat plate adaptor. Equivalent dBV rms levels were measured directly from the headphone socket of the PC using a Stanford Research Systems two-channel network signal analyzer.
When the 1 kHz reference tone was activated from
the playback screen, the volume of the PC was
adjusted until the electrical level of the calibration
signal was 21 mV (as measured by a voltmeter). At
this point, the dB SPL in each ear of the two combined
distracters at 0 degrees azimuth (recorded by Female 1
and Females 2 and 3) matched the corresponding level
on the LiSN-S competition slider bar. Similarly, the dB
SPL of the target (recorded by Female 1) matched the
corresponding level on the LiSN-S target slider bar.
Daily calibration was achieved by adjusting the PC
volume control until the electrical level of the
calibration signal applied to the headphones was
21 mV.
For the NA LiSN-S normative data study and test–retest reliability studies, the headphones were connected to the headphone socket of the PC via a Miglia Harmony Express USB sound card. The sensitivity of the sound card was automatically set to a predetermined level by the LiSN-S software in order to achieve the same signal levels as described previously. This alleviated the need for daily calibration. At this preset level, the combined distracters at 0 degrees had a long-term rms level of 55 dB SPL as measured in a Brüel and Kjær type 4153 artificial ear.
EXPERIMENT 1—SENTENCE
EQUIVALENCE STUDY
The following study was conducted to determine the
relative intelligibility of the LiSN-S sentences and
to adjust the level of the sentences for equal intelligibility.
Approval to conduct the sentence equivalence
study was obtained from the Institutional Review
Board of the Cincinnati Children’s Hospital Medical
Center.
Participants
Data were collected from 24 children with normal hearing aged 8 years, 3 months, to 10 years, 0 months (mean age 9 years, 1 month). There were 12 males and 12 females. Participants were recruited from friends and family of staff at the Cincinnati Children’s Hospital Medical Center. The participants were included in the study if they had North American English as a first language, no history of hearing disorders, and no reported learning or attention disorders. On the day of testing all participants had pure-tone thresholds of ≤15 dB HL at 500 to 4000 Hz and ≤20 dB HL at 250 and 8000 Hz, as well as normal Type A tympanograms and 1000 Hz ipsilateral acoustic reflexes present at 95 dB HL.
Materials
The LiSN-S stimuli were administered using a PC
and Sennheiser HD215 headphones. The headphones
were connected directly to the headphone socket of the
PC via a Miglia Harmony Express USB sound card.
The daily calibration procedure is described in the
above section on the development of the LiSN-S under
‘‘Calibration.’’
Design and Procedure
Testing was carried out after school hours (between 3 p.m. and 6 p.m.) in an acoustically treated room suitable for testing hearing thresholds at the Cincinnati Children’s Hospital Medical Center. Target sentences were initially presented at a level of 62 dB SPL, as measured in a Brüel and Kjær type 4153 artificial ear. Competing children’s stories, looped during playback, were presented at a constant level of 55 dB SPL. The target and competing signals were presented simultaneously to both ears. The stimuli were all presented in the ‘‘same voice at 0 degrees’’ condition, whereby the target sentences and distracter stories were all spoken by the same female speaker and were processed with the head-related transfer functions appropriate to a source at 0 degrees azimuth (directly in front of the listener). The listener’s task was to repeat the words heard in each target sentence. A 1000 Hz, 200 msec tone burst was presented before each sentence to alert the listener that a sentence would be presented. A silent gap of 500 msec separated the tone burst from the onset of the sentence. The tone burst was presented at a constant level of 55 dB SPL.
The signal-to-noise ratio (SNR) was adjusted adaptively
in each condition by varying the target level to
determine each participant’s SRT. The SNR was
decreased by 2 dB if a listener scored more than 50
percent of words correct and increased by 2 dB if he or
she scored less than 50 percent of words correct. The
SNR was not adjusted if a response of exactly 50
percent correct was recorded (for example, three out of
six words correctly identified). All words in each
sentence were scored individually, including the
definite article ‘‘the’’ and the indefinite articles ‘‘a’’
and ‘‘an.’’ At a minimum, five sentences were provided
as practice; however, practice continued until one
upward reversal in performance (i.e., the sentence
score dropped below 50 percent of words correct) was
recorded. Testing ceased in a particular condition
when the listener had either (a) completed the entire
30 sentences in any one condition or (b) completed the
practice sentences plus a minimum of a further 17
scored sentences, and the standard error, calculated
automatically in real time over the scored sentences,
was less than 1 dB. None of the sentences used to form
the initial estimate of SRT (i.e., eSRT) was repeated in
any subsequent study.
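The adaptive rule and stopping criteria can be sketched as below. This is an interpretation, not the LiSN-S software: the function names and the simulated listener are hypothetical, the standard error is assumed to be the sample standard deviation of the scored SNRs divided by the square root of their number, the 30-sentence limit is assumed to include practice sentences, and the SRT is taken as the mean SNR over the scored sentences. The starting SNR of +7 dB follows from the 62 dB SPL target and 55 dB SPL distracter levels.

```python
import statistics


def next_snr(snr_db, words_correct, words_total, step_db=2.0):
    """Apply the 2 dB up/down rule based on the proportion of words correct."""
    proportion = words_correct / words_total
    if proportion > 0.5:
        return snr_db - step_db     # more than half correct: make it harder
    if proportion < 0.5:
        return snr_db + step_db     # less than half correct: make it easier
    return snr_db                   # exactly half correct: no change


def run_track(score_sentence, start_snr_db=7.0, max_sentences=30,
              min_practice=5, min_scored=17, max_se_db=1.0):
    """score_sentence(snr_db) must return (words_correct, words_total)."""
    snr = start_snr_db
    scored = []
    practising = True
    reversal_seen = False
    for presented in range(1, max_sentences + 1):
        correct, total = score_sentence(snr)
        if practising:
            # Practice lasts at least min_practice sentences and until the
            # first sentence scored below 50 percent.
            reversal_seen = reversal_seen or (correct / total < 0.5)
            if presented >= min_practice and reversal_seen:
                practising = False
        else:
            scored.append(snr)
            if len(scored) >= min_scored:
                se = statistics.stdev(scored) / len(scored) ** 0.5
                if se < max_se_db:
                    break
        snr = next_snr(snr, correct, total)
    return statistics.mean(scored) if scored else None


def simulated_listener(snr_db):
    # Hypothetical listener: every word correct above -1 dB SNR, none otherwise.
    return (5, 5) if snr_db > -1.0 else (0, 5)


srt = run_track(simulated_listener)
```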
An additional 150 sentences were presented at three fixed SNRs to determine the relative intelligibility of the sentences. For each participant, an SRT was obtained for 50 sentences presented at his or her eSRT, 50 sentences presented at his or her eSRT + 2 dB, and 50 sentences presented at his or her eSRT − 2 dB. The sentences assigned to each SNR were counterbalanced across participants. Logit curves were fitted for each sentence using least squares regression based on the equation $\exp(a - b \cdot \mathrm{SNR}) / (1 + \exp[a - b \cdot \mathrm{SNR}])$. The dependent variable was the proportion of words correct at each SNR for a particular sentence averaged across participants. All analyses were made with Statistica 7.0.
The resulting b values are related to the slope of the steepest portion of the curve for each sentence. The median b value across sentences was −0.594, or 15 percent per dB (calculated as $-b/4$). The ratio of a/b for any sentence (referred to as r) represents the SRT or the SNR needed to achieve 50 percent correct identification of words in that sentence. The median value of r ($r_{\mathrm{med}}$) was −0.4 dB. The sentences were then adjusted in dB for equal intelligibility, with the required adjustment for any sentence calculated as $r - r_{\mathrm{med}}$. A sentence was discarded if (a) the required adjustment was too great, that is, $r - r_{\mathrm{med}} < -2.0$ dB or $> +2.0$ dB; (b) the slope was too shallow (<6% per dB), that is, $-b/4 < 0.06$ or approximately $b > -0.25$; or (c) the slope was too steep (>50% per dB), that is, $-b/4 > 0.5$ or $b < -2$.
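The fitting and screening steps can be sketched as follows. Note the substitution: the study fitted the logistic function by least squares in Statistica, whereas this illustration uses ordinary linear regression on the log-odds, a simpler technique that gives the same a and b only when the data lie exactly on a logistic curve. Function names, the clipping constant, and the example values are hypothetical.

```python
import math


def fit_logit(snrs_db, proportions, eps=0.01):
    """Estimate (a, b) in p = exp(a - b*SNR) / (1 + exp(a - b*SNR)).

    Linear regression on the log-odds is used here as a stand-in for the
    least squares fit of the logistic function itself.
    """
    xs = list(snrs_db)
    ys = []
    for p in proportions:
        p = min(max(p, eps), 1.0 - eps)       # keep the log-odds finite
        ys.append(math.log(p / (1.0 - p)))    # log-odds = a - b * SNR
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                         # regression slope equals -b
    a = mean_y - slope * mean_x
    return a, -slope


def sentence_decision(a, b, r_median):
    """Return (keep, adjustment_db, slope_per_db) under the stated criteria."""
    slope_per_db = -b / 4.0       # steepest slope, in proportion correct per dB
    r = a / b                     # SNR giving 50 percent words correct
    adjustment = r - r_median
    keep = abs(adjustment) <= 2.0 and 0.06 <= slope_per_db <= 0.5
    return keep, adjustment, slope_per_db


# Synthetic sentence generated from a = 0.3, b = -0.6 (made-up values).
snrs = [-2.0, 0.0, 2.0]
props = [1.0 / (1.0 + math.exp(-(0.3 + 0.6 * s))) for s in snrs]
a_hat, b_hat = fit_logit(snrs, props)
keep, adjustment, slope = sentence_decision(a_hat, b_hat, r_median=-0.4)
```

With b negative, a steeper curve means a more negative b, which is why the shallow-slope criterion corresponds to b close to zero.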
Based on these criteria, 30 sentences were discarded.
Logit curves for the remaining 120 unadjusted sen-
tences are shown in Figure 3. The remaining sentences
were adjusted in amplitude for equal intelligibility and
used in the normative data study. The mean slope of
the retained sentences was 18.7 percent per dB. Logit
curves for the sentences postadjustment are shown in
Figure 4. The average and median length of the sentences was five words per sentence across the total number of sentences. The sentences were then allocated to four lists for use in the normative data study. Each list also had an average and median sentence length of five words.
EXPERIMENT 2—NORMATIVE DATA STUDY
Approval to conduct the normative data study was
obtained from the Institutional Review Boards of
the Cincinnati Children’s Hospital Medical Center, the
University of Texas at Dallas, and the University of
Calgary.
Participants
Data were collected from 72 children with normal
hearing aged 6 years, 2 months, to 11 years, 10
months. Participants were recruited from friends and
family of staff at the Cincinnati Children’s Hospital
and Calgary Health Region. Participants recruited by
the University of Texas at Dallas were from local
primary schools. Participant details are provided in
Table 1. Inclusion criteria were as per Experiment 1.
Design and Procedure
The materials used in the normative data study were as described for the sentence equivalence study. Output levels are described in the above section on the development of the LiSN-S under ‘‘Calibration.’’ Testing was again carried out in an acoustically treated room suitable for testing hearing thresholds at the various facilities. Testing occurred between 9 a.m. and 3 p.m. at the University of Texas at Dallas and between 10 a.m. and 7 p.m. at the other sites.
The LiSN-S target sentences were initially presented at a level of 62 dB SPL. The competing discourse was presented at a constant level of 55 dB SPL. The participant’s task was to repeat as many words as possible heard in each sentence. The instructions provided to each participant are attached as Appendix C. Up to 30 sentences were presented in each of the four conditions of distracter location and voice: same voice at 0 degrees (SV0°), same voice at ±90 degrees (SV±90°), different voices at 0 degrees (DV0°), and different voices at ±90 degrees (DV±90°). The organization of the target sentences and distracter stories is provided in Table 2. The SNR was adjusted adaptively as described for Experiment 1. The presentation order of the LiSN-S conditions was counterbalanced among participants using a Latin-square protocol to enable analysis of the effect of practice on performance.
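One way to counterbalance four conditions with a Latin square is sketched below. The text does not say which Latin square was used, so the cyclic construction, the condition labels, and the assignment of participants to rows by index are all assumptions made for illustration.

```python
CONDITIONS = ["SV0", "SV90", "DV0", "DV90"]   # hypothetical condition labels


def cyclic_latin_square(items):
    """Each item appears once in every row and once in every column."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


def order_for(participant_index, items=CONDITIONS):
    """Presentation order for a participant, cycling through the square's rows."""
    square = cyclic_latin_square(items)
    return square[participant_index % len(items)]


square = cyclic_latin_square(CONDITIONS)
second_participant = order_for(1)
```

A cyclic square balances the position of each condition across participants but not which condition precedes which; a balanced (Williams) design would be needed for that.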
RESULTS—EXPERIMENT 2
LiSN-S Conditions
Effect of Data Collection Site
The mean SRT and interparticipant standard devi-
ations for the LiSN-S SRT and advantage measures for
Figure 3. Logit curves for 120 North American Listening inSpatialized Noise–Sentences test sentences prior to adjustmentfor equal intelligibility (SRT 5 speech reception threshold).
Figure 4. Logit curves for 120 North American Listening inSpatialized Noise–Sentences test sentences following adjustmentfor equal intelligibility (SRT 5 speech reception threshold).
Table 1. Details of the 72 Participants in the North American Listening in Spatialized Noise–Sentences Test NormativeData Study
Age Group n Male Female Minimum Age (years, months) Maximum Age (years, months) Mean Age (years, months)
6 12 6 6 6, 2 6, 11 6, 6
7 12 8 4 7, 0 7, 7 7, 3
8 12 5 7 8, 1 8, 11 8, 5
9 12 6 6 9, 0 9, 11 9, 6
10 12 7 5 10, 0 10, 11 10, 5
11 12 6 6 11, 0 11, 11 11, 6
Journal of the American Academy of Audiology/Volume 20, Number 2, 2009
134
each collection site are presented in Table 3. Separate
analyses of variance (ANOVAs) were performed to
determine the effect of collection site (Ohio, Alberta,
and Texas) on each of the performance measures. As
the five measures were derived from the four basic
LiSN-S conditions (SV0°, SV±90°, DV0°, and DV±90°), the alpha level of 0.05 was multiplied by 4/5 to give an
adjusted level of 0.04 to avoid inflating the Type I error
rate.
There was no effect of collection site for any of the
LiSN-S SRT or advantage measures: low-cue SRT, F(2,
69) = 0.130, p = .878; high-cue SRT, F(2, 69) = 0.933, p
= .398; talker advantage, F(2, 69) = 0.397, p = .674;
spatial advantage, F(2, 69) = 1.570, p = .215; total
advantage, F(2, 69) = 0.949, p = .392. As no significant
differences were found between collection sites, data
were combined for the following analyses.
Comparison of North American and
Australian Data
The mean SRTs and interparticipant standard
deviations for the LiSN-S SRT and advantage mea-
sures for the combined North American data and the
Australian data are presented in Table 4. Separate
ANOVAs were performed to determine whether differ-
ences existed in the normative data between the two
countries on the various performance measures. There
was a significant difference on all the LiSN-S SRT and
advantage measures between countries: low-cue SRT,
F(1, 140) = 7.421, p = .007; high-cue SRT, F(1, 140) =
50.261, p < .001; talker advantage, F(1, 140) = 22.477,
p < .001; spatial advantage, F(1, 140) = 58.890, p <
.001; total advantage, F(1, 140) = 46.353, p < .001.
Main Effects and Interactions
Table 5 details the mean SRTs and interparticipant
standard deviations for the four LiSN-S distracter
conditions: SV0°, SV±90°, DV0°, and DV±90°. Age
groups and test sites were combined. An ANOVA of
mean SRT was performed for the repeated-measures
factors of distracter location (0 degrees vs. ±90 degrees)
and distracter voice (same vs. different). An alpha level
of 0.05 was used for all comparisons. The Greenhouse-
Geisser correction factor was applied to the degrees of
freedom of the main effects and interaction to ensure
that violations of sphericity did not influence the
significance levels calculated for any of the analyses.
There was a significant main effect for distracter
location (F[1, 71] = 307.66, p < .001), with the ±90
degrees condition resulting in a lower SRT than the 0
degrees condition, averaged across distracter voice. An
analysis of simple contrasts revealed that the ±90
degrees location produced a significantly lower mean
SRT than the 0 degrees location for both the same voice
distracter (F[1, 71] = 1811.45, p < .001) and the
different voices distracter (F[1, 71] = 497, p < .001).
There was also a significant main effect of voice (F[1, 71]
= 1831.87, p < .001) averaged across location. Simple
contrasts revealed that the mean SRT of the same voice
distracter was significantly higher than that for the
Table 2. Organization of Target Sentences and Distracter Stories for Each North American Listening in Spatialized Noise–Sentences Test (LiSN-S) Condition in the Normative Data Study
LiSN-S Condition | Distracter Speaker | Distracter Location | Distracter Story
Same Voice: 0° | Female 1 | 0° | Loopy Lizard’s Tail
Same Voice: 0° | Female 1 | 0° | The Great Big Tiny Traffic Jam
Same Voice: + and −90° | Female 1 | −90° | Loopy Lizard’s Tail
Same Voice: + and −90° | Female 1 | +90° | The Great Big Tiny Traffic Jam
Different Voices: 0° | Female 2 | 0° | Loopy Lizard’s Tail
Different Voices: 0° | Female 3 | 0° | The Great Big Tiny Traffic Jam
Different Voices: + and −90° | Female 2 | −90° | Loopy Lizard’s Tail
Different Voices: + and −90° | Female 3 | +90° | The Great Big Tiny Traffic Jam
Note: In each condition the target sentences were spoken by Female 1 and presented at 0 degrees azimuth (directly in front of the speaker).
Table 3. Average Speech Reception Thresholds (SRTs, Expressed as Signal-to-Noise Ratios) and Interparticipant Standard Deviations (in dB) for Each of the SRT and Advantage Measures for the 72 Children in the North American Normative Data Study as a Function of Data Collection Site
Site | n | Variable | Low-Cue SRT | High-Cue SRT | Talker Advantage | Spatial Advantage | Total Advantage
University of Cincinnati | 24 | Mean | −0.2 | −10.7 | 5.4 | 9.2 | 10.5
University of Cincinnati | 24 | SD | 0.9 | 2.3 | 2.2 | 2.1 | 2.4
Calgary Health Region | 24 | Mean | −0.3 | −11.6 | 4.9 | 8.8 | 11.3
Calgary Health Region | 24 | SD | 1.7 | 3.6 | 2.6 | 1.8 | 2.6
University of Texas at Dallas | 24 | Mean | −0.4 | −11.7 | 5.5 | 9.8 | 11.4
University of Texas at Dallas | 24 | SD | 0.8 | 1.9 | 2.2 | 1.7 | 1.7
NA LiSN-S/Cameron et al
different voices distracter at both the 0 degrees location
(F[1, 71] = 378.30, p < .001) and the ±90 degrees
location (F[1, 71] = 51.19, p < .001). An interaction
between distracter location and speaker voice was also
significant, indicating that the benefit from separation
in the same voice condition (9.3 dB) was significantly
greater than the benefit from separation in the different
voices conditions (5.8 dB; F[1, 71] = 121.34, p < .001).
Overall, the listening benefit obtained from the
spatial separation was influenced by the similarity of
the voices between the speaker(s) of the distracters and
that of the speaker of the target sentences. Benefit
from separation is calculated by subtracting the SRT in
the ±90 degrees location from that in the 0 degrees
location for a particular condition of distracter voice.
The benefit from separation for the same voice
condition is referred to as the LiSN-S spatial advan-
tage measure (Cameron et al, 2006).
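The derivation of the performance measures from the four condition SRTs can be illustrated with a minimal Python sketch. It uses the mean SRTs from Table 5. The low-cue SRT (SV0°), high-cue SRT (DV±90°), spatial advantage, and total advantage follow the definitions given in the text; the talker advantage formula (SV0° minus DV0°) is an assumption that is consistent with the means in Table 4. The function and variable names are illustrative only and are not part of the LiSN-S software.

```python
# Mean SRTs in dB (signal-to-noise ratio) for the four conditions, from Table 5.
MEAN_SRT = {"SV0": -0.3, "SV90": -9.6, "DV0": -5.6, "DV90": -11.3}


def performance_measures(srt):
    """Return the two SRT measures and three advantage measures (all in dB)."""
    return {
        "low_cue_srt": srt["SV0"],
        "high_cue_srt": srt["DV90"],
        # Assumed definition: benefit of different talkers with no spatial cue.
        "talker_advantage": srt["SV0"] - srt["DV0"],
        # Benefit from separation in the same voice condition.
        "spatial_advantage": srt["SV0"] - srt["SV90"],
        # Difference between the SV0 and DV+/-90 conditions.
        "total_advantage": srt["SV0"] - srt["DV90"],
    }


measures = performance_measures(MEAN_SRT)
for name, value in measures.items():
    print(f"{name}: {value:.1f} dB")
```

With these rounded inputs the talker and spatial advantages agree with the North American means in Table 4 (5.3 and 9.3 dB). The total advantage comes out at 11.0 dB rather than the reported 11.1 dB, presumably because the published means were computed from unrounded individual scores.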
Effect of Age on LiSN-S Performance Measures
The mean SRT and advantage measures for the
children in the normative data study are illustrated in
Figure 5. There was a trend of decreasing SRT and
increasing advantage, as age increased, across measures.
The interparticipant standard deviations of the measures
ranged from 0.7 dB for the eight-year-olds on the low-cue
SRT measure to 2.8 dB for the seven-year-olds on the
high-cue SRT measure. Separate ANOVAs were per-
formed to determine the effect of age on the performance
measures. As for previous analyses, the alpha level of
0.05 was multiplied by 4/5 to give an adjusted level of 0.04
to avoid inflating the Type I error rate.
For the low-cue SRT there was a significant main
effect of age (F[5, 66] = 2.697, p = .028). Post hoc tests
using Tukey’s HSD revealed no significant differences
between age groups. There was also a significant main
effect of age for the high-cue SRT (F[5, 66] = 3.877, p =
.004). Post hoc tests revealed that the six-year-olds
required a significantly higher SRT than children aged
nine (p = .043) and 11 (p = .011). The seven-year-olds
required a higher SRT than the 11-year-olds (p = .034).
No differences in thresholds were significant between
other combinations of age groups.
There was a significant main effect of age for the
talker advantage measure (F[5, 66] = 4.335, p = .002).
The six-year-olds obtained a significantly smaller
advantage than children aged nine (p = .005) and 11 (p = .003). A
significant main effect of age was also found for the
spatial advantage measure (F[5, 66] = 3.537, p = .007).
The six-year-olds obtained a smaller advantage than the 10-
year-olds (p = .012) and 11-year-olds (p = .007). No
other differences in advantage measures were
significant between age groups.
There was no significant main effect of age for the total
advantage measure (F[5, 66] = 2.062, p = .081).
Gender Effects
An analysis was conducted in order to investigate
gender effects in the children. Mean scores and
standard deviations for the 34 females and 38 males
on the various LiSN-S SRT and advantage measures
are provided in Table 6, along with the results of
ANOVAs that were performed with each measure as
the dependent variable, a fixed factor of gender, and
age as a covariate. There was no significant effect of
gender for any LiSN-S measure.
Practice Effects
The effect of practice on performance on the North
American LiSN-S was examined for the 72 children in the
normative data study. In order to determine whether
practice improved performance, the mean SRTs were
compared for the four basic LiSN-S conditions (SV0°, SV±90°, DV0°, and DV±90°) as a function of presentation
order (first, second, third, or fourth). Participants in each
age group completed the various conditions in exactly the
same order. Age groups were combined to provide
sufficient numbers in each condition and task combina-
tion to calculate meaningful inferential statistics.
Table 5. Average Speech Reception Thresholds (SRTs) and Interparticipant Standard Deviations (in dB) for Each of the Four Distracter Location Conditions for the 72 Children in the North American Normative Data Study
Condition | Location | SRT | SD
Same Voice | 0° | −0.3 | 1.2
Same Voice | ±90° | −9.6 | 2.3
Different Voices | 0° | −5.6 | 2.9
Different Voices | ±90° | −11.3 | 2.7
Table 4. Comparison of Average Speech Reception Thresholds (SRTs, Expressed as Signal-to-Noise Ratios) and Interparticipant Standard Deviations (in dB) for Each of the SRT and Advantage Measures for the 70 Children in the Australian Normative Data Study and the 72 Children in the North American Normative Data Study
Site | n | Variable | Low-Cue SRT | High-Cue SRT | Talker Advantage | Spatial Advantage | Total Advantage
Australian | 70 | Mean | −0.8 | −14.2 | 3.6 | 11.7 | 13.3
Australian | 70 | SD | 1.2 | 2.0 | 1.8 | 1.9 | 1.6
North America | 72 | Mean | −0.3 | −11.3 | 5.3 | 9.3 | 11.1
North America | 72 | SD | 1.2 | 2.7 | 2.3 | 1.9 | 2.3
Figure 5. Results on the various Listening in Spatialized Noise–Sentences test (a–b) speech reception threshold (SRT) and (c–e) advantage measures for children in the normative data study. Error bars represent the 95% confidence intervals from the mean.
Table 7 shows the mean thresholds in dB for each
LiSN-S condition as a function of presentation order.
One-way ANOVAs revealed no significant differences
in mean SRTs as a function of presentation order for
the SV0° condition (F[3, 68] = 0.078, p = .972),
the SV±90° condition (F[3, 68] = 0.538, p = .658), or
the DV±90° condition (F[3, 68] = 1.616, p = .194).
There was, however, a significant difference for the
DV0° condition (F[3, 68] = 3.207, p = .028). Post hoc
tests using Tukey’s HSD revealed that, at −4.2 dB, the
first presentation resulted in a significantly higher
SRT than the third presentation at −7.0 dB (p = .018).
However, there was no significant difference in SRT
between the second presentation (at −5.3 dB) and any
subsequent presentation.
Standard Error, Time Analysis, and Distribution
of Data
As discussed in the method section, testing in any one
LiSN-S condition was terminated once a participant had
completed 30 sentences or once 17 sentences had been
completed (plus at least five practice sentences including
one reversal) and his or her standard error was less than
1 dB. In the present study, the median standard error
ranged from 0.77 dB in the SV0° condition to 0.98 dB in
the DV±90° condition, with a range of 0.59 to 1.79 dB
across all age groups. Normal probability-probability
plots reveal that the data followed a normal distribution
for all SRT and advantage measures. The median time
taken to complete any LiSN-S condition was two
minutes, 40 seconds (mean two minutes, 45 seconds).
Total time taken to complete the testing was on average
approximately 11 minutes, plus five minutes for in-
structions and breaks.
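The termination rule described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the LiSN-S implementation: the standard error is assumed to be the sample standard deviation of the scored (post-practice) SNRs divided by the square root of their number, the 30-sentence limit is assumed to count practice sentences, and all names are hypothetical.

```python
import math
import statistics

MAX_SENTENCES = 30       # upper limit on sentences per condition
MIN_SCORED = 17          # scored sentences required before early termination
SE_LIMIT_DB = 1.0        # standard error criterion in dB


def standard_error(snrs):
    """Standard error of the mean SNR (assumed definition: SD / sqrt(n))."""
    return statistics.stdev(snrs) / math.sqrt(len(snrs))


def should_stop(total_presented, scored_snrs):
    """Return True once either termination criterion is met."""
    if total_presented >= MAX_SENTENCES:
        return True
    enough_scored = len(scored_snrs) >= MIN_SCORED
    return enough_scored and standard_error(scored_snrs) < SE_LIMIT_DB


# Hypothetical track: 5 practice sentences plus 17 scored sentences whose
# SNRs alternate closely around threshold.
stable_track = [-1.0, 1.0] * 8 + [0.0]
print(should_stop(22, stable_track))
```

Under these assumptions a stable track ends after 22 sentences, whereas a track with widely varying SNRs continues until the 30-sentence limit is reached.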
Regression Analysis and LiSN-S Cutoff Scores
As a strong trend of improved performance with
increasing age was found for the various LiSN-S SRT
and advantage measures, it was determined that cutoff
scores, calculated as two standard deviations below the
mean, would need to be adjusted for age for each
performance measure. These cutoff scores represent
the level below which performance on the LiSN-S is
considered to be outside normal limits.
A regression analysis was conducted with the score on
each measure as the dependent variable and age
(ranging from 6.21 to 11.87 years) as the independent
variable. The cutoff scores were adjusted for age using
the formula
cutoff score = intercept + (B-value × age) + (2 × SD of residuals from the age-corrected trend lines)
for the LiSN-S SRT measures and
cutoff score = intercept + (B-value × age) − (2 × SD of residuals from the age-corrected trend lines)
for the LiSN-S advantage measures. All regression
data are presented in Table 8. Figure 6 provides
scatter plots of the regression analysis showing the
individual data points.
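As a worked illustration of the two cutoff formulas, the following Python sketch applies the intercepts, B-values, and residual standard deviations listed in Table 8. The function names and the `outside_normal_limits` helper are hypothetical; the helper assumes that a higher SRT and a lower advantage score both indicate poorer performance, in line with the signs used in the formulas.

```python
# (intercept, B-value, SD of residuals) for each measure, from Table 8.
REGRESSION = {
    "low_cue_srt": (2.26, -0.28, 1.05),
    "high_cue_srt": (-4.16, -0.70, 2.38),
    "talker_advantage": (-0.17, 0.60, 2.07),
    "spatial_advantage": (5.18, 0.46, 1.67),
    "total_advantage": (6.46, 0.41, 2.18),
}
SRT_MEASURES = {"low_cue_srt", "high_cue_srt"}


def cutoff_score(measure, age_years):
    """Age-adjusted cutoff in dB: predicted score +/- 2 SD of the residuals."""
    intercept, b_value, sd_residuals = REGRESSION[measure]
    predicted = intercept + b_value * age_years
    # SRT measures add 2 SD; advantage measures subtract 2 SD.
    sign = 1 if measure in SRT_MEASURES else -1
    return predicted + sign * 2 * sd_residuals


def outside_normal_limits(measure, age_years, score):
    """Assumed decision rule: poorer than the cutoff is outside normal limits."""
    cutoff = cutoff_score(measure, age_years)
    if measure in SRT_MEASURES:
        return score > cutoff
    return score < cutoff


for name in REGRESSION:
    print(f"{name} cutoff at 8.0 years: {cutoff_score(name, 8.0):.2f} dB")
```

For a child aged 8.0 years, for example, the low-cue SRT cutoff is 2.26 + (−0.28 × 8) + (2 × 1.05) = 2.12 dB, and the talker advantage cutoff is −0.17 + (0.60 × 8) − (2 × 2.07) = 0.49 dB.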
In respect to practice effects, as the first presentation
of the LiSN-S DV0° condition resulted in an SRT
that was significantly higher than that for the third
Table 6. Mean Speech Reception Thresholds (SRTs) and Advantage Measure Scores (in dB) for 34 Males and 38 Females Aged 6 to 11 Years, Together with Results of ANOVA Investigating Effects of Gender on North American Listening in Spatialized Noise–Sentences Test Performance, with Age as a Covariate
Measure | Males Mean | Males SD | Females Mean | Females SD | F(1, 70) | p
Low-Cue SRT | −0.3 | 1.0 | −0.3 | 1.3 | 0.02 | .882
High-Cue SRT | −11.2 | 2.4 | −11.5 | 3.0 | 0.37 | .548
Talker Advantage | 5.0 | 2.1 | 5.6 | 2.6 | 1.29 | .260
Spatial Advantage | 9.2 | 1.7 | 9.4 | 2.1 | 0.40 | .527
Total Advantage | 10.9 | 2.1 | 11.3 | 2.5 | 0.619 | .434
Table 7. Mean Speech Reception Thresholds (SRTs) and Standard Deviations (in dB) of Each Listening in Spatialized Noise–Sentences Test Condition, as a Function of Presentation Order for the 72 Children in the Normative Data Study
Condition | Location | First SRT | First SD | Second SRT | Second SD | Third SRT | Third SD | Fourth SRT | Fourth SD
Same Voice | 0° | −0.4 | 1.5 | −0.2 | 1.1 | −0.2 | 1.1 | −0.3 | 1.2
Same Voice | ±90° | −9.8 | 2.2 | −10.0 | 2.8 | −9.6 | 1.7 | −9.1 | 2.5
Different Voices | 0° | −4.2 | 2.5 | −5.3 | 2.8 | −7.0 | 3.4 | −5.9 | 2.2
Different Voices | ±90° | −10.7 | 3.0 | −10.7 | 2.5 | −11.8 | 2.3 | −12.3 | 2.7
Note: Age groups are combined.
presentation, the DV0° condition should not be
presented first. It is recommended that presentation
order be (1) DV±90°, (2) SV±90°, (3) DV0°, and (4)
SV0°. This configuration represents a gradient from “easy” to “difficult” and controls for the practice effects
that were demonstrated for the DV0° condition.
Effect of Change of Presentation Order
on Performance
It must be acknowledged that only a quarter of the
participants in the current study received the LiSN-S
in the recommended order described above. To deter-
mine the effect of change of presentation order on
performance, a regression analysis was conducted with presentation order (1, 2, 3, or 4) as the independent
variable and SRT in each LiSN-S condition as the
dependent variable. The adjustment to the normative
data needed to test in the recommended order was
calculated as the number of steps away from the order
midpoint (2.5) multiplied by the B-value (dB/step).
The calculated adjustments were 0.9 dB for the
DV±90° condition, −0.1 dB for the SV±90° condition,
−0.3 dB for the DV0° condition, and 0 dB for the SV0° condition. Whereas most required adjustments were
negligible, it was decided to adjust the normative
data cutoff scores to reflect the 0.9 dB effect of presenting the DV±90° condition first as stipulated
by the recommended order. As such, the intercept for
the high-cue SRT (i.e., DV±90° SNR) was increased by
0.9 dB (from −5.06 to −4.16 dB). As the SNR of the
DV±90° condition is also utilized in the calculation of
the total advantage score, the intercept for this
measure was decreased by 0.9 dB (from 7.36 to
6.46 dB). The adjustments are noted in Table 8.
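The order adjustment can be approximated from the group means in Table 7, as in the Python sketch below. The B-values themselves are not reported here, so the sketch estimates each slope by an ordinary least-squares fit of mean SRT on presentation position. This is an assumption: it matches a regression on individual scores only if each presentation position was completed by the same number of children. All names are illustrative.

```python
ORDER_MIDPOINT = 2.5

# Mean SRT (dB) at presentation positions 1 to 4, from Table 7.
MEAN_SRT_BY_POSITION = {
    "SV0": [-0.4, -0.2, -0.2, -0.3],
    "SV90": [-9.8, -10.0, -9.6, -9.1],
    "DV0": [-4.2, -5.3, -7.0, -5.9],
    "DV90": [-10.7, -10.7, -11.8, -12.3],
}
# Position of each condition in the recommended presentation order.
RECOMMENDED_POSITION = {"DV90": 1, "SV90": 2, "DV0": 3, "SV0": 4}


def slope_db_per_step(srts):
    """Least-squares slope (B-value, dB/step) of SRT on position 1..n."""
    positions = range(1, len(srts) + 1)
    mean_x = sum(positions) / len(srts)
    mean_y = sum(srts) / len(srts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(positions, srts))
    sxx = sum((x - mean_x) ** 2 for x in positions)
    return sxy / sxx


def order_adjustment(condition):
    """Steps away from the order midpoint multiplied by the B-value."""
    b_value = slope_db_per_step(MEAN_SRT_BY_POSITION[condition])
    steps = RECOMMENDED_POSITION[condition] - ORDER_MIDPOINT
    return steps * b_value


for condition in ("DV90", "SV90", "DV0", "SV0"):
    print(f"{condition}: {order_adjustment(condition):+.3f} dB")
```

Rounded to one decimal place, the resulting values (approximately 0.9, −0.1, −0.3, and 0 dB) agree with the adjustments reported in the text.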
EXPERIMENT 3—TEST–RETEST
RELIABILITY STUDY
Approval to conduct the test–retest reliability study
was obtained from the Institutional Review
Boards of the Cincinnati Children’s Hospital Medical
Center, the University of Texas at Dallas, and the
University of Calgary.
Participants
Data were collected by Cincinnati Children’s Hos-
pital, Calgary Health Region, and the University of
Texas at Dallas. Participants were 36 of the 72
children who had taken part in the normative data
study who agreed to also take part in the test–retest
reliability study. Participant details are provided in
Table 9. Participants recruited by the Cincinnati
Children’s Hospital were tested between 10 a.m. and
6 p.m. Participants recruited by Calgary Health
Region were tested between 3 p.m. and 6 p.m.
Participants from the University of Texas at Dallas
were tested between 10.30 a.m. and 6 p.m.
Design and Procedure
The materials and procedures used in the test–
retest reliability study were as for the normative data
study. The four LiSN-S conditions were presented to
each participant in the same order that they were
presented during the normative data study. Retesting
on the LiSN-S was carried out between 2 months, 0
days, and 3 months, 25 days, following the initial
testing (median 2 months, 9 days; mean 2 months, 13
days).
RESULTS—EXPERIMENT 3
All analyses were performed with Statistica 7.1.
Test–Retest Paired Comparisons
The mean scores and standard deviations for the
various LiSN-S conditions—and the advantage mea-
sures derived from the various conditions—at test
and retest are provided in Table 10. Results of
difference scores between test and retest, as well
as the t and P values of paired-samples t-tests, are
also provided. Except for the spatial advantage
measure, all differences were in the direction
representing an improvement in performance. The
maximum improvement in performance on retest
Table 8. Data Utilized in the Calculation of Listening in Spatialized Noise–Sentences Test Cutoff Scores for the 72 Children in the Normative Data Study
Measure | Mean (dB) | SD (Residuals; dB) | Intercept | B-Value | r²
Low-Cue SRT | −0.3 | 1.05 | 2.26 | −0.28 | 0.181
High-Cue SRT | −11.3 | 2.38 | −4.16 (a) | −0.70 | 0.209
Talker Advantage | 5.3 | 2.07 | −0.17 | 0.60 | 0.207
Spatial Advantage | 9.3 | 1.67 | 5.18 | 0.46 | 0.186
Total Advantage | 11.1 | 2.18 | 6.46 (b) | 0.41 | 0.099
Note: All r² values are significant at p < .05. SRT = speech reception threshold. (a) Increased by 0.9 dB to account for practice effects. (b) Decreased by 0.9 dB to account for practice effects.
was 0.7 dB on the different voices 0 degrees
condition. Minimum change was 0.1 dB on the
spatial advantage and total advantage conditions. There were no significant differences in performance
between test and retest on any LiSN-S measure (p
ranged from .080 to .862).
Effect of Age on LiSN-S Performance Measures
Figure 7 depicts the mean test and retest scores for each of the LiSN-S SRT and advantage measures, as a
function of age. A repeated-measures ANOVA was
performed for each LiSN-S SRT and advantage
Figure 6. Linear regression scatter plots of Listening in Spatialized Noise–Sentences test (a–b) speech reception threshold (SRT) and (c–e) advantage measures for children in the normative data study. Prediction lines represent 95% intervals from the mean.
measure, with age as a between-participants factor, to
determine whether test–retest differences differed
with age. An alpha level of 0.05 was used for all
comparisons. There was no significant interaction of
test session and age for the low-cue SRT (F[5, 30] =
0.28, p = .922), the high-cue SRT (F[5, 30] = 0.16, p =
.975), spatial advantage (F[5, 30] = 1.83, p = .137), or
total advantage (F[5, 30] = 0.56, p = .733). The age by
test session interaction was significant for the talker
advantage measure (F[5, 30] = 3.82, p = .008). Post
hoc tests using Tukey’s HSD revealed that the retest
scores of the 11-year-olds on talker advantage were
significantly better than the test scores of the six-year-
olds on that measure (p = .016). However, there were
no significant differences in talker advantage scores
between test and retest within any age group (for
example, between six-year-olds at test and retest).
Test–Retest Correlation Analysis
A Pearson product-moment correlation analysis was
performed for each of the LiSN-S SRT and advantage
measures. All correlations were significant except for
the low-cue SRT measure: low-cue SRT, r = 0.1, p
= .678; high-cue SRT, r = 0.7, p < .001; talker
advantage, r = 0.5, p < .001; spatial advantage, r =
0.4, p = .007; total advantage, r = 0.6, p < .001. Lack of
correlation between test and retest scores for the low-
cue SRT condition is expected due to the small size of
the interparticipant spread. Nearly all data lie within
2 dB of the mean for both test and retest scores.
Scatter plots in Figure 8 show the correlation of test
versus retest scores for each of the LiSN-S SRT and
advantage measures.
Test–Retest Correction Factors
Table 11 displays the calculations of the one-sided
critical differences required to determine whether a
child with (C)APD has improved on the LiSN-S
following remediation or compensation, taking a
correction factor for test–retest differences into ac-
count. That is, for any individual child, an improve-
ment on a particular LiSN-S SRT or advantage
measure should be greater than the listener’s score,
plus the mean test–retest study difference, minus 1.64 ×
the standard deviation of the mean test–retest
reliability study difference for the SRT measures and plus
that amount for the advantage measures. Critical
difference measures, including the correction factor,
ranged from 3.0 dB on the total advantage measure to
3.9 dB on the talker advantage measure.
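The critical difference calculation can be sketched in Python as shown below. Table 11 is not reproduced in this section, so the numerical inputs are hypothetical and serve only to illustrate the arithmetic. The sketch assumes that the mean test–retest difference is expressed as a positive amount of improvement and that 1.64 is the one-sided 95% normal deviate; the function names are illustrative.

```python
ONE_SIDED_Z = 1.64  # one-sided 95% criterion used in the text


def critical_difference(mean_retest_diff, sd_retest_diff, z=ONE_SIDED_Z):
    """Improvement (dB) that must be exceeded to count as a genuine change."""
    return mean_retest_diff + z * sd_retest_diff


def has_improved(initial, followup, mean_retest_diff, sd_retest_diff,
                 lower_is_better):
    """Compare an individual's change with the critical difference.

    lower_is_better is True for SRT measures and False for advantage measures.
    """
    change = (initial - followup) if lower_is_better else (followup - initial)
    return change > critical_difference(mean_retest_diff, sd_retest_diff)


# Hypothetical inputs: mean test-retest improvement 0.5 dB, SD 1.6 dB.
print(round(critical_difference(0.5, 1.6), 2))
# Hypothetical child whose low-cue SRT changes from -0.4 dB to -4.0 dB.
print(has_improved(-0.4, -4.0, 0.5, 1.6, lower_is_better=True))
```

With these illustrative inputs the critical difference is 0.5 + (1.64 × 1.6) = 3.12 dB, so the hypothetical 3.6 dB improvement exceeds it, whereas an improvement of 1.6 dB would not.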
DISCUSSION
The development of the North American LiSN-S and
the subsequent sentence equivalence, normative
data, and test–retest reliability studies were carried
out in line with the design and procedures employed in
the development of the Australian LiSN-S (Cameron
and Dillon, 2007a, 2007c). The NA LiSN-S sentence
equivalence study showed that postequalization,
Table 10. Mean Scores and Standard Deviations (in dB) for the 36 Participants at Test and Retest on the Various North American Listening in Spatialized Noise–Sentences Test Conditions and the Advantage Measures Calculated from Those Conditions
Measure | Test Mean | Test SD | Retest Mean | Retest SD | Paired Difference | t Value | P Value
Same Voice 0° (Low-Cue SRT) | −0.4 | 1.29 | −0.9 | 1.26 | 0.5 | 1.80 | .080
Same Voice ±90° | −9.1 | 4.51 | −9.7 | 4.55 | 0.5 | 1.66 | .106
Different Voices 0° | −5.6 | 3.92 | −6.4 | 4.14 | 0.7 | 1.59 | .120
Different Voices ±90° (High-Cue SRT) | −11.8 | 2.56 | −12.3 | 2.01 | 0.6 | 1.80 | .080
Talker Advantage | 5.8 | 2.30 | 6.0 | 2.43 | 0.2 | 2.27 | .621
Spatial Advantage | 9.5 | 1.89 | 9.5 | 1.84 | 0.1 | 0.24 | .815
Total Advantage | 11.4 | 2.16 | 11.4 | 1.53 | 0.1 | −0.17 | .862
Note: SRT = speech reception threshold.
Table 9. Details of the 36 Participants in the Test–Retest Reliability Study for the North American Listening in Spatialized Noise–Sentences Test
Age Group | n | Male | Female | Minimum Age (years, months) | Maximum Age (years, months) | Mean Age (years, months)
6 | 8 | 4 | 4 | 6, 2 | 6, 10 | 6, 7
7 | 3 | 2 | 1 | 7, 0 | 7, 7 | 7, 5
8 | 7 | 3 | 4 | 8, 1 | 8, 11 | 8, 6
9 | 6 | 5 | 1 | 9, 4 | 9, 11 | 9, 8
10 | 3 | 2 | 1 | 10, 3 | 10, 11 | 10, 8
11 | 9 | 6 | 3 | 11, 0 | 11, 11 | 11, 6
intelligibility increased across sentences by 18.7
percent for each 1 dB increase in SNR, which is
comparable with the increase of 17 percent per dB for the Australian LiSN-S.
The results of the NA LiSN-S normative data study
revealed that there was no effect of collection site (Ohio
vs. Alberta vs. Texas) on any performance measure (p
ranging from .215 for spatial advantage to .878 for low-
cue SRT). As for the Australian study, there was a
trend of decreasing SRT as age increased, across
measures. Interparticipant standard deviations across
performance measures were small for both versions of
the LiSN-S (NA LiSN-S standard deviations ranged
from 1.2 dB in the SV0° condition to 2.9 dB in the DV0°
Figure 7. Dot plots depicting mean test and retest scores for each of the North American Listening in Spatialized Noise–Sentences test (a–b) speech reception threshold (SRT) and (c–e) advantage measures as a function of age. Circular filled symbols connected by solid lines represent the test scores. Square unfilled symbols connected by dashed lines represent the retest scores. Error bars represent the 95% confidence intervals from the mean.
condition; Australian LiSN-S standard deviations
ranged from 1.3 dB in the SV0° condition to 2.8 dB in
the SV±90° condition). Intraparticipant standard
deviations were also small for both versions (NA
LiSN-S ranged from 0.77 dB in the SV0° condition to
0.98 dB in the DV±90° condition; Australian LiSN-S
ranged from 0.86 dB in the SV0° condition to 0.97 dB
in the DV±90° condition). There were no significant
gender effects found for either version of the LiSN-S
for any performance measure.
Analysis of variance revealed that there were
significant differences in mean score between the
North American and Australian versions of the LiSN-
S across performance measures (ranging from a
difference between versions of 0.3 dB on the low-cue
SRT to 2.9 dB on the high-cue SRT). In each case
except for talker advantage, performance was slightly
better for the Australian children. It can only be
speculated as to why these slight differences occurred,
as the design and procedures utilized in the production
of the LiSN-S in both versions were identical. It could
be suggested that the time of day (and hence the
alertness of the children) at which data collection
occurred may have resulted in the differences in mean
SRT/advantage between versions. All the Australian
data were collected between 9.30 a.m. and 2.30 p.m.,
whereas only the data from Texas were collected
during a similar time frame, with the data from
Alberta and Cincinnati mainly collected after school
hours. It is conceivable that the scores achieved by the
North American children were poorer because they
were tested later in the day and fatigue played a role in
the results. However, if this were the case, the scores
achieved in Texas should have been significantly
better than in the other sites, and this is not the case.
Also, scores on the talker advantage measure were
better for the North American children than for the
Australian children, making test time an unlikely
cause of differences in SRT/advantage between ver-
sions. Slight differences in the actual recordings (for
example, in respect to pace of speakers, differences
between speaker voices) may have contributed to the
differences in SRT/advantage between versions.
As a trend of improved performance with increasing
age was found for both the North American and
Australian versions of the LiSN-S, it was determined
that the cutoff scores would need to be adjusted for age.
The cutoff scores for the NA LiSN-S represent the level
below which performance on a particular performance
measure is considered to be outside normal limits and
are calculated from the intercept and B-values obtained
from a regression analysis of SRT on age. All r² values
were significant for both the Australian and North
American versions of the LiSN-S.
The effect of practice on the NA LiSN-S was also
examined by measuring the effect of position within the
four subtests on the performance of each subtest. For the
DV0° condition there was a significant effect of test order
between the first and third position. There was no effect
of practice between the second and any subsequent
position for this condition, nor was there any significant
effect of practice on any other NA LiSN-S condition (p
ranging from .194 to .972). A significant effect of practice
was also found between the first and third (and fourth)
positions for the DV0° condition of the Australian LiSN-
S. Again there were no significant differences between
the second and any subsequent position.
To account for the practice effect found for the DV0° condition it was stipulated for the Australian LiSN-S
that the DV0° condition be presented after the DV±90° condition and the SV±90° condition when this test is
utilized clinically or in future studies. This presentation
order is also recommended for the North American
version of the LiSN-S. For both versions, a regression
analysis was conducted to determine the effect of
change of presentation order on performance. The
adjustment to the normative data needed to account
for the delivery of the LiSN-S conditions in the
recommended order (DV±90°, SV±90°, DV0°, then
SV0°) was calculated as the number of steps away from
the order midpoint (2.5) multiplied by the B-value (dB/
step). For the Australian LiSN-S the greatest modification
required to adjust the normative data to account
for the recommended order was only 0.2 dB, and any
adjustment was therefore considered unnecessary
(Cameron and Dillon, 2007a). For the NA LiSN-S most
required adjustments were again negligible (0 dB
for the SV0° condition, −0.1 dB for the SV±90° condition, and −0.3 dB for the DV0° condition). However,
as the adjustment needed to compensate for presenting
the DV±90° condition first was almost 1 dB, the
intercept used in the calculation of the NA LiSN-S
cutoff scores for this condition (and also the total
advantage measure that is consequently affected, as it
is calculated as the difference between the SV0° and
DV±90° conditions) was adjusted accordingly.
The NA LiSN-S test–retest reliability study revealed
that differences in mean SRT between test and retest
were small across performance measures (0.1 to 0.7 dB).
This emulated the test–retest differences found for the
Australian LiSN-S, which ranged from 0.1 to 1.1 dB
(Cameron and Dillon, 2007c). While there were no
significant differences in mean SRT/advantage between
test and retest for any performance measure of the NA
LiSN-S, significant differences were found for all
Australian LiSN-S performance measures except spatial
advantage. Test–retest differences did not vary signifi-
cantly with age for either the Australian or North
American versions of the LiSN-S. A correlation analysis
of test and retest scores across performance measures
was statistically significant for the Australian LiSN-S (r
ranging from 0.3 to 0.8). The correlation between test
Figure 8. Scatter plots depicting the correlation of test vs. retest scores for each of the Listening in Spatialized Noise–Sentences test (a–b) speech reception threshold (SRT) and (c–e) advantage measures. Lines show the 95% confidence interval of the regression line.
and retest was also significant for all of the North
American LiSN-S performance measures except the low-
cue SRT, which was attributed to the extremely small
interparticipant spread on this measure.
As for the Australian LiSN-S, the test–retest data
from the NA LiSN-S study were utilized to develop
one-sided critical difference scores. These scores can be
used to determine whether a child has genuinely
improved on the LiSN-S following a period of remedi-
ation or compensation with an assistive listening
device. Critical difference scores for the NA LiSN-S
ranged from 3.0 dB on the total advantage measure to
3.9 dB on the talker advantage measure. These scores
were highly comparable to the Australian LiSN-S
critical difference scores, which ranged from 2.5 dB
on the low-cue SRT to 4.4 dB on talker advantage.
CONCLUSION
In previous studies (Cameron and Dillon, 2007a,
2007c, 2008), the Australian LiSN-S was reported to
be a fast and efficient assessment tool with potential to
be used clinically to evaluate auditory streaming skills
in children with suspected CAPD. The present study
has described the development of a North American–
accented and semantically appropriate version of the
LiSN-S, appropriate for use in the United States and
Canada. The normative data were not affected by data
collection site or gender, and inter- and intrapartici-
pant variation was minimal. An expected trend of
improved performance as a function of increasing age
was found for all measures, and the calculation of the
cutoff scores that determine the level below which
performance on the NA LiSN-S is considered to be
outside normal limits was adjusted for age accordingly.
Test–retest differences were not significant on any NA
LiSN-S measure. The calculation of one-sided critical
difference scores, which take into account mean
practice effects and day-to-day fluctuation in perfor-
mance, makes the NA LiSN-S a potentially valuable
tool for monitoring performance over time and the
effects of maturation, remediation, or compensation
such as an assistive listening device.
Acknowledgments. We would like to thank Jill Anderson
and Cari Olsen from the University of Cincinnati, who
collected the data for the sentence equivalence study,
normative data study, and test–retest reliability study for
that facility. The distracter children’s stories used in this
project were provided by Ms. Vashti Farrer. The contribution
of Stephen Cameron in the production of the LiSN-S software
program is also sincerely appreciated.
REFERENCES
Bamford J, Wilson I. (1979) Methodological considerations and practical aspects of the BKB sentence lists. In: Bench J, Bamford J, eds. Speech-Hearing Tests and the Spoken Language of Hearing-Impaired Children. London: Academic Press, 146–187.
Bregman AS. (1990) Auditory Scene Analysis. Cambridge: MIT Press.
Cameron S, Dillon H. (2006) Listening in Spatialized Noise Test (LiSN®)–Sentences (Version 1.0.0) [computer software]. Sydney: National Acoustic Laboratories.
Cameron S, Dillon H. (2007a) Development of the Listening in Spatialized Noise–Sentences test (LiSN-S). Ear Hear 28(2):196–211.
Cameron S, Dillon H. (2007b) North American Listening in Spatialized Noise–Sentences test (NA LiSN-S) (Version 1.1.0) [computer software]. Sydney: National Acoustic Laboratories.
Cameron S, Dillon H. (2007c) The Listening in Spatialized Noise–Sentences test (LiSN-S): test–retest reliability study. Int J Audiol 46:145–153.
Cameron S, Dillon H. (2008) The Listening in Spatialized Noise–Sentences test: comparison to prototype LiSN test and results from children with either a suspected (central) auditory processing disorder or a confirmed language disorder. J Am Acad Audiol 19(5):377–391.
Cameron S, Dillon H, Newall P. (2006) Development and evaluation of the Listening in Spatialized Noise test. Ear Hear 27(1):30–42.
Crystal D. (1989) Grammatical Analysis of Language Disability. London: Cole and Whurr Limited.
Dawes P, Bishop DVM. (2007) The SCAN-C in testing for auditory processing disorder in a sample of British children. Int J Audiol 46:780–786.
Golding M, Lilly DJ, Lay JW. (1996) A Staggered Spondaic Word (SSW) test for Australian use. Aust J Audiol 18(2):81–88.
Table 11. Calculation of the One-Sided Critical Differences Needed to Infer a Genuine Improvement in Auditory Performance on Retest While Taking into Account Mean Practice Effects and Day-to-Day Fluctuations in Performance (in dB)

| Condition | Correction Factor (Mean Test–Retest Difference) | SD of the Mean Test–Retest Difference | 1.64 × SD | Critical Difference (Including Correction) |
|---|---|---|---|---|
| Low-Cue SRT | −0.52 | 1.74 | 2.85 | −3.4 |
| High-Cue SRT | −0.59 | 1.96 | 3.21 | −3.8 |
| Talker Advantage | 0.19 | 2.27 | 3.72 | 3.9 |
| Spatial Advantage | −0.07 | 1.98 | 3.25 | 3.2 |
| Total Advantage | 0.05 | 1.80 | 2.95 | 3.0 |

Note: SRT = speech reception threshold.
Green LJ. (2002) African American English: A Linguistic Introduction. New York: Cambridge University Press.
Katz J. (1962) The use of staggered spondaic words for assessing the integrity of the central auditory nervous system. J Speech Disord 33:132–146.
Keith RW. (1986) SCAN—A Screening Test for Auditory Processing Disorders. San Diego: Psychological Corporation.
Keith RW. (2000) SCAN-C Test for Auditory Processing Disorders in Children—Revised. San Antonio: Psychological Corporation.
Koehnke J, Besing J. (1997) Clinical applications of 3-D auditory tests. Semin Hear 18(4):345–354.
Kowal A. (1979) Sentence list construction and pilot test. In: Bench J, Bamford J, eds. Speech-Hearing Tests and the Spoken Language of Hearing-Impaired Children. London: Academic Press, 110–145.
Marriage J, King J, Briggs J, Lutman ME. (2001) The reliability of the SCAN test: results from a primary school population in the UK. Br J Audiol 35:199–208.
Sussman E, Ritter W, Vaughan HG. (1999) An investigation of the auditory streaming effect using event-related brain potentials. Psychophysiology 36:22–34.
The MathWorks, Inc. (2002a) MATLAB (Version 6.5.6) [Computer software]. Natick, MA: The MathWorks, Inc.
Wilber LA. (2002) Transducers for audiologic testing. In: Katz J, ed. Handbook of Clinical Audiology. Baltimore: Lippincott Williams and Wilkins, 88–95.
APPENDIX A
Practice Sentences—LiSN-S Same Voice 0 Degrees
Condition
1. The boys are watching the game.
2. A dog is hiding the bone.
3. Two girls went to the store.
4. A painting hangs on the wall.
5. Some people go to the gym.
APPENDIX B
Extract from the LiSN-S distracter continuous discourse
presented at 0 degrees and −90 degrees:
“Loopy Lizard’s Tail” by Vashti Farrer
Loopy lizard was on his way home. His mother had
told him not to dawdle, but Loopy wanted to play.
So he ran across a path and through the grass,
where he pretended to hide. Then he scampered up
a wall and peered into a crack to see who lived
there. Then Loopy stopped very still in the sun, as if
he were asleep, just to feel how warm it was.
Suddenly a big, fierce dog came down the path. “He
must be fierce,” thought Loopy, “because he has big
teeth.” And he started to run as fast as he could
along the path, to get out of the dog’s way. But the
big dog chased him.
APPENDIX C
LiSN-S Instructions
1. You are going to hear some sentences over these headphones.
2. The sentences are said by a lady called “Miss Smith.”
3. Miss Smith will sound as if she is standing just in front of you.
4. There will be a “beep” before each sentence so you will know when it is about to start.
5. Your job is to repeat back the sentence that Miss Smith says.
6. I’ll pretend to be Miss Smith, and I want you to repeat the sentence you hear.
7. “Beep.” “The dog had a bone.”
8. Child repeats “The dog had a bone.”
9. Good, that’s easy isn’t it? But there’s a trick. At the same time that Miss Smith is telling you the sentence there are some very tricky people talking at the same time.
10. Sometimes the tricky people sound like they are standing right next to Miss Smith, sometimes they will sound like they are standing next to you.
11. No matter where the tricky people are I don’t want you to listen to them.
12. Just listen for the “beep” and the sentence.
13. Miss Smith always starts out louder than the tricky people, so you shouldn’t have any trouble hearing her.
14. But sometimes the tricky people get loud. If you only hear a bit of the sentence I want you to tell me all the words that you hear.
15. So if you just heard “dog” and “bone,” what would you say?
16. Child repeats “dog” and “bone.”
17. Great. If you don’t hear Miss Smith at all, just shake your head and I’ll go straight on to the next sentence.
18. Once you’ve heard the sentence tell me what you’ve heard straight away so you don’t forget it.
19. In the first lot of sentences the tricky people will be standing right next to you. Don’t listen to them. Just concentrate on Miss Smith in front.
20. The tricky people start first and then Miss Smith starts a few seconds later. Ready?
21. Describe where the tricky people are before each listening condition.
(a) “Same Voice—±90 Degrees” Condition: “Now the tricky people will be next to you again, but their voices will be a bit different. Ignore them and just listen for Miss Smith.”
(b) “Different Voices—0 Degrees” Condition: “Now the tricky people will be next to Miss Smith. Just listen for the beep and the sentence.”
(c) “Same Voice—0 Degrees” Condition: “Now the tricky people will be next to Miss Smith, and their voices will be very similar to Miss Smith’s voice. So you will have to listen very hard for the beep and Miss Smith.”
Journal of the American Academy of Audiology/Volume 20, Number 2, 2009