Studi Linguistici e Filologici Online 10 Dipartimento di Linguistica–Università di Pisa

www.humnet.unipi.it/slifo

Studi Linguistici e Filologici Online ISSN 1724-5230 Volume 10 (2013) – pagg. 183-218 G. Marotta, L. Iacoponi, A. Idone – “Asymmetries between Perception and Production. Pitch and Length in two varieties of Italian (Pisan and Crotonese)”

ASYMMETRIES BETWEEN PERCEPTION AND PRODUCTION.
PITCH AND LENGTH IN TWO VARIETIES OF ITALIAN (PISAN AND CROTONESE)

GIOVANNA MAROTTA, LUCA IACOPONI, ALICE IDONE1

1 Introduction

In the last decade, a series of empirical studies on speech perception has focused on the perceptual effects produced by the interaction of the two leading acoustic parameters: duration and F0 [cf. Pisoni & Remez, 2005].

The relevance of a multidimensional approach becomes especially clear in dealing with phenomena like prominence, because the perceptual salience of an auditory object depends on the particular combination of different physical elements (e.g. frequency, duration, intensity, voice quality) and cannot be derived from any single one of them [Niebuhr, 2009].

In this study, we present original experimental data regarding the relevance of two prosodic parameters, length and tone modulation, in the perception of prominent vowels by Italian listeners. Our focus is twofold: on the one hand, the impact of the listener's native linguistic variety on perception, and on the other, the role of musical training in tasks involving the recognition of prosodic parameters like pitch and length.

1 All authors contributed extensively to the work presented in this paper: G. M. conceived the study, supervised the experiment and wrote §1 and §5; L. I. wrote code, analysed output data and wrote §3 and §4; A. I. administered the experiment, edited the manuscript and wrote §2.

2 State of the art

2.1 F0 Modulation and Length

The influence of fundamental frequency (F0) on the perception of vowel duration is a vexata quaestio, and previous findings are conflicting. Although it is widely accepted that a dynamic F0 contour lengthens perceived duration [Gussenhoven, 2004; Yu, 2006; Galloway, 2008], there is also experimental evidence challenging this claim [Rosen, 1977a, 1977b; van Dommelen, 1993; Lehnert-LeHouillier, 2007].

The lack of consistent results probably stems from the procedural differences among the experiments: the effect of dynamic F0 in an accentual language like Swedish [Rosen, 1977a, 1977b]; the rating of synthetic monosyllables on a 7-point duration scale [Yu, 2006]; the inclusion of syllabic structure as a relevant variable [van Dommelen, 1993]; articulatory explanations and correlations between degree of vowel length and degree of height [Gussenhoven, 2004].

For the purpose of this work, given the previous mixed findings about the possible influence of F0 on the perception of length, the ABX test was chosen because it is less subjective than explicit qualitative judgments.

2.2 Language dependence

The relevance of the native language in the perception of prosodic parameters like duration and F0 variation has recently been questioned. Previous studies dealing with language dependence do not, by and large, allow conclusive explanations. Both sources of variability, namely the native language of the listener and the language chosen as the source of the test stimuli, have been called into doubt from time to time, depending on the acoustic parameter considered.

In the specific case of Intrinsic Pitch (IP), Pape & Mooshammer [2006] demonstrated its dependence on the native language of the listener, but not on the language of the stimuli. In a more recent and exhaustive study, Pape [2008] delved into the phenomenon of pitch and F0 insensitivity among listeners of Romance languages. The data collected seemed to indicate that Romance listeners tended to be pitch-insensitive, attending to vowel quality rather than F0 variation because of their small vowel inventories. The peculiarities of the native language are thus thought to influence the perception of acoustic parameters.

Lehnert-LeHouillier [2007] likewise confirms the importance of this variable by demonstrating that the lengthening effect of dynamic F0 occurs only in listeners of some languages (Japanese) but not of others (Thai, German and Spanish). Galloway [2008], by contrast, challenges the dependence of this perceptual effect on native language, since in her experiment the lengthening effect appeared in listeners of rhythmically different languages: syllable-timed French and stress-timed Swiss German.

In conclusion, the evidence for an influence of the native language on the perception of length and melodic contour remains controversial.

To limit the number of linguistic variables involved in the experiment, two varieties of the same language rather than two different languages were compared.

2.3 The role of musical training

Several psycholinguistic studies have recently focused on the

connection between the domains of music and speech perception. In

particular, the basic issue concerns the possible influence of musical

education on the processing of speech acoustic parameters.

In a recent paper, Schön et al. [2004] contributed to the debate by using brain imaging techniques. They manipulated the F0 in the final part of word and note sequences, testing French musicians and non-musicians. The results revealed that in both domains, language prosody and music, musicians detected weak F0 manipulations better and faster than non-musicians. Together with other scholars [Deutsch et al., 2004], they demonstrated that this evidence also has a neural counterpart, which supports the connection between music and speech. They claimed that musical training facilitates the detection of pitch changes not only in music but in language as well, calling similar cognitive processes into play.

Furthermore, it was demonstrated that musical training not only influences and improves perceptual processing, but that the resulting discrimination abilities are specific to the domains that the training emphasizes [Rauscher & Hinton, 2003].

Moreover, the variable ‘musician’ is claimed to be independent

of the native language: all professional musicians, for example, are

pitch-sensitive, with nearly identical results across all languages

[Pape, 2008].

Nevertheless, musical and linguistic processing are not completely comparable, and the typology of the language involved can slightly modify the picture, as experiments involving tone languages demonstrate [Stevens et al., 2004; Schwanhäußer & Burnham, 2005; Bidelman, Gandour & Krishnan, 2011].

The aforementioned experiments depict a general frame in which, in spite of differences in the processing phase, musical competence can be an important element for a better perception of acoustic parameters. Nevertheless, other scholars [Niebuhr, 2009] avoid explanations of this kind and attribute the better perceptual performance of musicians to a matter of meta-language: the poor performance of non-musicians is due to the fact that they are less confident with the conceptualization of acoustic parameters.

For the experiment, the sample included professional or semi-professional musicians and subjects who had no explicit knowledge that could influence the perception of sound stimuli. Contrary to some experiments, students from the Linguistics Department or from the Laboratory of Phonetics were excluded, as in their case the criterion of explicit knowledge was only vaguely satisfied.

2.4 Pisan and Crotonese Italian

Two varieties of the same language were chosen rather than two

different languages to narrow down the number of prosodic variables

that could influence the perception of the stimuli.

The stimuli represent two regional varieties of Italian, not dialects. Italian dialects vary greatly, and mutual intelligibility among the major groups is often limited. Regional varieties of Italian may differ in many respects under the influence of local dialects, especially at the phonological level, but they remain mutually intelligible. Most features of standard Italian are shared by regional varieties.

The comprehension of the stimuli was therefore guaranteed and

it was possible to include among the participants listeners belonging to

the macro-linguistic areas of Tuscany and Southern Calabria.

Typical features of Pisan are the consonantal lenition phenomenon known as Tuscan Gorgia [see Marotta, 2001; 2008], the deaffrication of /ʧ/ and /ʤ/ in intervocalic position [see Bertinetto and Loporcaro 2005] and the lowering of the mid vowels /ɛ/ and /ɔ/ [see Calamai 2004]. Crotonese, like most Southern dialects, is characterised by consonantal fortition [see Loporcaro, 2009] and, being an extreme Southern Italian dialect [Pellegrini, 1977], by the neutralisation of the phonological oppositions /e/ ~ /ɛ/ and /o/ ~ /ɔ/ in tonic position [see Fanciullo, 1994].

The Pisan and Crotonese varieties were chosen by virtue of their use of the two prosodic parameters (i.e. duration and frequency) to convey prominence. In Pisan, long and modulated vowels occur in case of prominence [cf. Marotta et al. 2004; Marotta et al. 2011]. In detail, compared with Florence, the Pisa area shows a stronger increase in stressed vowel duration and in frequency range; at the same time, on the perception side, longer vowels and greater modulation are clearly identified as distinctive features of the Italian spoken in Pisa [Calamai & Ricci, 2005a; 2005b]. Crotonese, on the other hand, like most Calabrian dialects, shows very long prominent vowels with less F0 modulation than Pisan [Romito & Trumper, 1989; Mendicino & Romito, 1991; Romito, 1993; Marotta & Sardelli, 2007, 2009].

The geographic position of the varieties chosen for the

experiment is shown in Figure 1.

Figure 1: Geographical location of Pisa and Crotone with reference to their linguistic area, i.e. Tuscan and Southern Calabrian. 

3 Method

3.1 Stimulus Set

The auditory stimuli are part of the set of stimuli used in a previous experiment [see Marotta et al., 2011]. Eight words containing prominent vowels in an open stressed penultimate syllable, a context where tonic lengthening occurs in Italian [see Marotta 1999; D’Imperio & Rosenthal 1999], were extracted from semi-spontaneous conversations held at different times by a Pisan and a Crotonese speaker.

Prominent vowels are here defined as segments with a special degree of perceptual salience in an utterance. As is well known, in a phonetic string a segment, as well as a syllable, can be perceived as prominent following a relevant modification of the three basic acoustic parameters, i.e. duration, intensity and frequency, which are perceived as changes in length, volume and tone [cf. Rietveld & Gussenhoven, 1985; Kohler, 2008]. On perceptual grounds, we refer to prominence as defined by Terken [1991]: “the prominence is a property by which linguistic units are perceived as standing out from their environment”. For a more detailed discussion of the criteria adopted for selecting the prominent vowels used in the stimulus set, we refer the reader to Marotta et al. [2011].

The varieties under scrutiny were chosen because they display asymmetrical behavior in the use of duration and pitch to convey prominence: Pisan displays pitch modulation in stressed vowels, while phonetic lengthening is observed in the Crotonese variety (cf. §2.4) [Marotta et al., 2004; 2011]. To limit the number of variables, intensity variation was not included among the stimulus modifications. Finally, the target vowel in all words is low or mid-low. It was observed early on [Grammont, 1993] that low vowels are longer than high vowels, a tendency confirmed by recent measurements of the varieties used in this experiment [Marotta et al. 2011].

The Pisan speaker is a 51-year-old woman, born and raised in Pisa by Pisan parents. The Crotonese speaker is a 25-year-old student, living in Pisa at the time of the interview, but a native and fluent speaker of the Crotonese dialect. Both accents were easily perceivable as regional. Each speaker was recorded in the Laboratory of Phonetics at the University of Pisa, using a Marantz PMD671 digital solid-state recorder equipped with a Sennheiser MKE 40-EW microphone.

All stimuli were sampled at 44.1 kHz, with a bit rate of 1411 kbps and a sample size of 16 bits. The eight words were then extracted from the recordings, and each stressed vowel was modified for duration and/or F0 using the Praat software (http://www.fon.hum.uva.nl/praat). The vowel was first shortened by 30 ms (D1) and then by a further 30 ms (D2 = -60 ms). Three modifications were made to the pitch: F0 was levelled to its maximum (HP), levelled to its minimum (LP), or inverted (IP). All stimuli were then normalized at the beginning and at the end of the modification to avoid pitch smearing. Figure 2 shows, as an example, the spectrogram of the original Pisan stimulus ‘dottorato’; Figure 3 shows the same word after the pitch contour modifications. Table 1 and Table 2 list the recorded words together with the relevant acoustic parameters, i.e. segment duration and F0 value at the onset, peak (or valley) and end points of the prominent vowel.2
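The duration and pitch manipulations described above can be illustrated on a toy contour. The following Python fragment is only a sketch and not the Praat procedure used to build the stimuli: it operates on a list of F0 values rather than on audio, all function names are hypothetical, and the inversion rule (mirroring the contour around the midpoint between its minimum and maximum) is an assumption, since the text does not specify how the contour was inverted. The three F0 values and the vowel duration of ‘dottorato’ are those reported for the Pisan speaker in Table 1.

```python
# Toy sketch of the stimulus manipulations; not the Praat procedure itself.

def level_to_max(f0_hz):
    """HP: replace every F0 value with the contour maximum."""
    return [max(f0_hz)] * len(f0_hz)


def level_to_min(f0_hz):
    """LP: replace every F0 value with the contour minimum."""
    return [min(f0_hz)] * len(f0_hz)


def invert(f0_hz):
    """IP (assumed rule): mirror the contour around the midpoint of its range."""
    pivot = (max(f0_hz) + min(f0_hz)) / 2
    return [2 * pivot - value for value in f0_hz]


def shortened_durations(duration_ms, step_ms=30):
    """D1 and D2: vowel duration after one and two 30 ms shortening steps."""
    return {"D1": duration_ms - step_ms, "D2": duration_ms - 2 * step_ms}


# 'dottorato', Pisan speaker (Table 1): F0 at onset, peak and end; vowel duration.
dottorato_f0 = [232, 341, 335]
dottorato_ms = 242

hp = level_to_max(dottorato_f0)
lp = level_to_min(dottorato_f0)
ip = invert(dottorato_f0)
durations = shortened_durations(dottorato_ms)
print(hp, lp, ip, durations)
```

Under the assumed mirroring rule, a rising contour becomes a falling one while keeping the same F0 range, which is one plausible reading of "inverted"; other definitions (for example, time reversal of the contour) would give different values.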

 

Figure 2: Spectrogram for the unmodified stimulus [dotːoˈraːθo] ‘doctorate’ with the pitch curve drawn in blue

 

Figure 3: The three pitch modifications to the word [dotːoˈraːθo] ‘doctorate’

 

 

                                                            

2 For further details about the speech stimuli, we refer the reader to Marotta et al.

[2011].

Selected words | Original duration | Original F0 (onset – peak/valley – end)
Bene [ˈbɛːne] ‘good’ | 257 ms | 233 Hz – 202 Hz – 194 Hz
Dottorato [dotːoˈraːθo] ‘doctorate’ | 242 ms | 232 Hz – 341 Hz – 335 Hz
Emiliano [emiˈljaːno] ‘Emiliano’ PR-N | 189 ms | 209 Hz – 245 Hz – 203 Hz
Valerio [vaˈlɛːrjo] ‘Valerio’ PR-N | 272 ms | 256 Hz – 302 Hz – 283 Hz

Table 1: Stimuli recorded by the Pisan speaker

 

Selected words | Original duration | Original F0 (onset – peak/valley – end)
Bene [ˈbɛːne] ‘good’ | 146 ms | 99 Hz – 95 Hz – 89 Hz
Cucinare [kuʧiˈnaːre] ‘to cook’ | 186 ms | 146 Hz – 196 Hz – 153 Hz
Lezione [leˈʦːjɔːne] ‘lesson’ | 149 ms | 103 Hz – 130 Hz – 105 Hz
Prestigioso [prestiˈʤːɔːso] ‘prestigious’ | 174 ms | 152 Hz – 195 Hz – 162 Hz

Table 2: Stimuli recorded by the Crotonese speaker

3.2 Sampling

In total, 60 participants were recruited for the experiment. The sampling frame was divided according to two variables: musical training and variety of Italian. The sample was composed of Pisans (n=20), Crotonians (n=20), and a control group (n=20) whose variety of Italian was neither Tuscan nor Calabrian. To be selected for the experiment, the listeners of the first two groups had to meet the following criteria: 1) they had to have been born and raised in one of the two linguistic areas of interest; 2) they had to show native fluency in their variety; 3) they must not have lived outside their region for more than three consecutive years, and then only during adulthood. With respect to the last prerequisite, 3 Crotonians and 14 participants in the control group were alumni of the University of Pisa. Half of the participants in each group (n=30 in total) were professional musicians with at least 5 years of training and an average of 4 years of experience in their primary instrument. Of those, 12 had undergone formal training in Italian conservatories.

None of the non-musicians had ever had any formal musical training, training in phonetics or phonology, or training in any other sound-related skill. The mean age of participants was 28, with most participants being undergraduate students between 19 and 25 years old (n=27). A high school diploma was the minimum educational requirement. All participants declared that they had no visual, hearing or cognitive impairments.

The total number of participants within the groups is

summarized in Table 3.

N=60 | Pisa | Crotone | Control
Musician | 10 | 10 | 10
Non-Musician | 10 | 10 | 10

Table 3: Total number of participants divided into the 6 groups

3.3 Design and Procedure

The experiment was divided into two blocks, each containing a set of trials with stimuli from the same variety. For each word in the stimulus set, three stimuli were chosen to form the block’s triplets. The triplets contained all possible pairings of original and modified stimuli of the same word and of the same modification group, plus a third stimulus identical to the first, to the second, or to both in the case where the first two were identical. The order of the stimuli in the pair did not matter for the generation of the combinations.

All the experiments involving the Pisan and Crotonese groups were conducted in a calm and silent environment in the Laboratory of Phonetics at Pisa University. Responses were recorded with a standard Italian keyboard on which a coloured label was applied to each of the keys corresponding to a possible response. The stimuli were delivered through high-definition MDR-XD20 headphones. The experiment ran on a P4 dual-core computer equipped with a Realtek high definition audio card; a Samsung R522 was used for some participants of the Crotonese group (N=17).
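The triplet construction described at the start of this section can be sketched in a few lines of Python. This is one possible reading of the description, not the script used in the experiment: the labels, the function name `abx_triplets` and the decision to pair every stimulus of a modification group with every other one (including itself) are assumptions, and the resulting trial counts are a consequence of those assumptions rather than figures reported in the paper.

```python
from itertools import combinations_with_replacement

# Modification groups from section 3.1; "orig" stands for the unmodified stimulus.
PITCH_GROUP = ["orig", "HP", "LP", "IP"]
DURATION_GROUP = ["orig", "D1", "D2"]


def abx_triplets(word, group):
    """Build (word, A, B, X) triplets for one word and one modification group.

    Pairs are unordered, so (orig, HP) is generated but (HP, orig) is not.
    X repeats A or B; when A and B are identical there is a single triplet.
    """
    triplets = []
    for a, b in combinations_with_replacement(group, 2):
        candidates = [a] if a == b else [a, b]
        for x in candidates:
            triplets.append((word, a, b, x))
    return triplets


pitch_trials = abx_triplets("dottorato", PITCH_GROUP)
duration_trials = abx_triplets("dottorato", DURATION_GROUP)
print(len(pitch_trials), len(duration_trials))
```

Because the pitch group contains four stimuli and the duration group three, any construction along these lines yields more pitch trials than duration trials per word.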

Between December 2010 and March 2011 each person who met

the study inclusion criteria was called in to undergo the experiment.

In order to optimize the environmental conditions, for the comfort of

the participant and to avoid any external conditioning factors (peer

pressure, background noise, operator disposal, etc.) the participants

were tested one at a time, assured confidentiality and given the

opportunity to decline to participate in the study. The purpose of the

study was stated only after the experiment had concluded.

All operators followed the same standardized procedure. First, all sensitive information was collected by the operator, who filled in a sociolinguistic form; the participant was then given the following instructions:

"You will hear a series of three words. You will have to press the

labeled key '1' if you recognize that the third word is identical to the

first, '2' if identical to the second and 'Don't know' if you can't hear

any difference. You can skip a word if you have trouble hearing the

stimulus because of external distractions such as noise, or a

temporary lack of attention."

A five-stimulus practice session was then delivered to familiarize the participants with the task. If the participant did not feel comfortable enough or did not understand the task, the practice session was run again (this happened only for p=13). The stimuli were grouped into two consecutive blocks, one including the Pisan stimuli and the other the Crotonese. Between the two blocks, the participants were allowed a short pause. The order of the blocks as well as the order of the stimuli within each block was randomized in each session. The stimuli were delivered in an ABX procedure, with a 1-second interval between stimuli and a 2-second window to answer. The stimuli could only be listened to once. The experiment lasted about 30 minutes. The delivery of the stimuli, the recording of the responses and part of the data analysis were carried out using the Presentation® software (Version 14.7, www.neurobs.com).
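The randomization scheme just described (random block order, random trial order within each block) can be sketched as follows. The sketch is written in plain Python rather than in the Presentation® scripting environment actually used; the function name `build_session`, the `seed` parameter and the placeholder trial labels are hypothetical.

```python
import random


def build_session(blocks, seed=None):
    """Return a list of (block_name, shuffled_trials) in random block order.

    `blocks` maps a block name to its list of trials. Trials never leave
    their block, so each variety is still presented as one consecutive block.
    """
    rng = random.Random(seed)
    block_order = list(blocks)
    rng.shuffle(block_order)
    session = []
    for name in block_order:
        trials = list(blocks[name])  # copy, so the input is left untouched
        rng.shuffle(trials)
        session.append((name, trials))
    return session


# Placeholder trial labels; a real session would hold the ABX triplets.
example_blocks = {
    "Pisa": ["p1", "p2", "p3", "p4"],
    "Crotone": ["c1", "c2", "c3", "c4"],
}
session = build_session(example_blocks, seed=1)
for block_name, trials in session:
    print(block_name, trials)
```

Passing a different seed per participant gives each session its own order while keeping the order reproducible for later inspection.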

4 Results

Data were analyzed using Neurobs Analyzer® (Version 14.7) and R (Version 2.14.1). A t-test among the different groups was computed on the number of correct responses for each group, divided by stimulus modification, stimulus source and speaker location, and then used to investigate the relationship between speaker and listener groups. The mean values of correct responses for all groups are given in Table 4.

Mod. | Source | Pisan | m-Pisan | Crotonese | m-Crotonese | control | m-control
HP | Pisa | 0.55 | 1 | 0.83 | 0.95 | 0.6 | 0.9
HP | Crotone | 0.67 | 0.93 | 0.67 | 0.97 | 0.6 | 0.93
LP | Pisa | 0.6 | 0.8 | 0.82 | 0.78 | 0.57 | 0.88
LP | Crotone | 0.41 | 0.69 | 0.55 | 0.66 | 0.5 | 0.73
IP | Pisa | 0.71 | 0.91 | 0.78 | 0.91 | 0.6 | 0.94
IP | Crotone | 0.13 | 0.6 | 0.4 | 0.6 | 0.24 | 0.71
D1 | Pisa | 0.41 | 0.56 | 0.33 | 0.55 | 0.42 | 0.51
D1 | Crotone | 0.52 | 0.81 | 0.59 | 0.75 | 0.5 | 0.7
D2 | Pisa | 0.58 | 0.73 | 0.39 | 0.65 | 0.4 | 0.58
D2 | Crotone | 0.58 | 0.68 | 0.56 | 0.72 | 0.47 | 0.73

Table 4: Mean values of correct responses for modification, stimulus source and speaker location. The first two columns (stimuli) indicate the stimulus modification and location; the other columns (listeners) indicate the location of the participants. The m- prefix on the participant location indicates that the group is composed of musicians.
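The cell means of Table 4 can be aggregated to get a rough view of the main contrasts. The Python sketch below is not the authors' analysis (which was run in Neurobs Analyzer® and R and is based on participant-level counts): it simply takes unweighted means of the published cell means, and the helper name `cell_mean` and its parameters are hypothetical.

```python
from statistics import mean

COLUMNS = ["Pisan", "m-Pisan", "Crotonese", "m-Crotonese", "control", "m-control"]

# (modification, stimulus source) -> mean correct responses per listener group.
TABLE4 = {
    ("HP", "Pisa"): [0.55, 1.0, 0.83, 0.95, 0.6, 0.9],
    ("HP", "Crotone"): [0.67, 0.93, 0.67, 0.97, 0.6, 0.93],
    ("LP", "Pisa"): [0.6, 0.8, 0.82, 0.78, 0.57, 0.88],
    ("LP", "Crotone"): [0.41, 0.69, 0.55, 0.66, 0.5, 0.73],
    ("IP", "Pisa"): [0.71, 0.91, 0.78, 0.91, 0.6, 0.94],
    ("IP", "Crotone"): [0.13, 0.6, 0.4, 0.6, 0.24, 0.71],
    ("D1", "Pisa"): [0.41, 0.56, 0.33, 0.55, 0.42, 0.51],
    ("D1", "Crotone"): [0.52, 0.81, 0.59, 0.75, 0.5, 0.7],
    ("D2", "Pisa"): [0.58, 0.73, 0.39, 0.65, 0.4, 0.58],
    ("D2", "Crotone"): [0.58, 0.68, 0.56, 0.72, 0.47, 0.73],
}

PITCH_MODS = {"HP", "LP", "IP"}
DURATION_MODS = {"D1", "D2"}


def cell_mean(mods=None, source=None, musicians=None):
    """Unweighted mean of the Table 4 cells matching the given filters."""
    values = []
    for (mod, src), row in TABLE4.items():
        if mods is not None and mod not in mods:
            continue
        if source is not None and src != source:
            continue
        for column, value in zip(COLUMNS, row):
            is_musician = column.startswith("m-")
            if musicians is None or musicians == is_musician:
                values.append(value)
    return mean(values)


musician_mean = cell_mean(musicians=True)
non_musician_mean = cell_mean(musicians=False)
pitch_pisa = cell_mean(mods=PITCH_MODS, source="Pisa")
pitch_crotone = cell_mean(mods=PITCH_MODS, source="Crotone")
duration_pisa = cell_mean(mods=DURATION_MODS, source="Pisa")
duration_crotone = cell_mean(mods=DURATION_MODS, source="Crotone")
print(musician_mean, non_musician_mean)
print(pitch_pisa, pitch_crotone, duration_pisa, duration_crotone)
```

Even at this coarse level the aggregated means point in the same direction as the effects discussed in this section: musician groups score higher than non-musician groups, pitch modifications are recognized better in Pisan stimuli, and duration modifications are recognized better in Crotonese stimuli.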

Figure 4 shows the average numbers of correct responses by all

speakers divided into musicians and non-musicians.

Figure 4: Mean values of correct responses for musicians and non-musicians. The y-axis represents the number of subjects who answered correctly, while the x-axis shows the number of correct responses  

As repeatedly reported in the literature (see §2.3), musicians uniformly perform better than non-musicians in both the pitch (p<0.001) and the duration discrimination task (p<0.001; see Figure 5). By further splitting the data, though, we unexpectedly observed an over-recognition effect when the stimuli were identical (see Figure 6). While musicians do better than non-musicians in the discrimination tasks, they seem to fail to recognize when stimulus A is the same as stimulus B. The difference is statistically significant (p<0.001).

Figure 5: Percentage of incorrect responses for pitch and duration discrimination tasks, divided for musical training

 

 

Figure 6: Percentage of incorrect responses when the first stimulus is the same as the second (A=B) and when it is different (A≠B)

 

Central to the design of the experiment is the variable ‘location’, i.e. the relevance of the native variety of the listeners. The impact of the source of the stimulus, in association with the subject location, on the recognition of prosodic differences was analyzed among the different groups. The mean values and their standard deviations are shown in Table 5. The first effect observed concerns the discrimination of pitch variations. All three groups of listeners (Pisans, Crotonians and controls) recognized pitch differences occurring in Pisan stimuli significantly better than those occurring in Crotonese stimuli (p<0.001). A similar but mirror-image effect was found for duration differences: the Crotonese stimuli were better recognized than their Pisan counterparts (p=0.008). The difference between the two significance levels could be due to the fact that, in conveying prominence, the relevance of duration in Crotonese is less evident than that of pitch in Pisan.

However, a direct comparison between pitch and duration is

obviously not possible: the two parameters as well as their measures

are intrinsically different (cf. Jones & Munhall, 2000; 2005).

Therefore, the two variables can be only indirectly compared with

reference to the geographical origin of the three groups of listeners.

The weight of the prosodic features in a phonetic production varies as

a continuum where different factors play a role and the impact of

pitch, duration and intensity on the indication of prominence may vary

from language to language. In our case, duration in the Crotonese

stimuli appears to be less overwhelming than pitch in Pisan.

Stimulus location (N=30) | Pitch | Duration
Pisa | 136.4 (SD 18.35) | 94. (SD 5.01)
Crotone | 62.2 (SD 9.78) | 105. (SD 6.51)
p-value | < 0.001 | 0.008

Table 5: Correct responses for pitch and duration recognition tasks in Pisan and Crotonese stimuli

No effect was found when the subject location was the variable at stake. Listeners of a particular variety do not perform better in recognizing a particular stimulus modification (pitch, p=0.091; duration, p=0.062; see Table 6). Similarly, Pisan and Crotonese listeners are not better at recognizing stimuli from their own variety, and the controls exhibit no preference for a particular location: no effect could be found when considering speaker location (p=0.303; Table 7).3

                                                            

3 The high mean recorded for Pisan stimuli is probably due to the fact that pitch modified stimuli were 1/3 more numerous than duration stimuli coupled with the observation that pitch recognition is easier within the Pisan variety (see Table 5).

Listener location (N=20) | Pitch | Duration
Pisans | 44. (SD 8.62) | 35. (SD 5.74)
Crotonians | 48.8 (SD 4.21) | 29.8 (SD 8.26)
Controls | 43.6 (SD 11.30) | 29.6 (SD 9.54)
p-value | 0.0911 | 0.0617

Table 6: Mean values of correct responses for pitch and duration tasks according to listener location

Listener location (N=20) | Pisa | Crotone
Pisans | 74. (SD 16.97) | 49.6 (SD 14.02)
Crotonians | 71.2 (SD 13.33) | 54.4 (SD 9.41)
Control | 68.4 (SD 19.79) | 48.6 (SD 13.71)
p-value | 0.581 | 0.303

Table 7: Mean values of correct responses for Pisan and Crotonese stimuli in all listener groups

Therefore, in our opinion the most original result of our analysis

concerns the fact that listeners do not recognize the stimuli relative to

their own language variety better than the others.

5 Discussion

The results of the ABX discrimination test indicate that the

native variety of the speakers/listeners does not influence the

recognition of prosodic modifications to speech stimuli. No significant

difference was found in Pisan and Crotonese groups when listening to

their variety-specific prominent prosodic features. Pisan listeners did

not perform significantly better than Crotonians at recognizing pitch

variation (see Figure 7) and Crotonians did not discriminate duration

differences better than Pisans (see Figure 8). Similar percentages of

error were also found for the control subjects. The result is also

confirmed by the fact that Pisan and Crotonese listeners do not

discriminate between the prosodic differences in the stimuli relative to

their own variety better than the stimuli from the other variety (Table

4, §4.2).

Figure 7: Boxplot of correct responses for pitch modified stimuli

Figure 8: Boxplot of correct responses for duration modified stimuli

On the other hand, stimulus origin seems to be the variable responsible for the variation found in perception. All participants uniformly performed better in recognizing pitch variation in Pisan stimuli (see Figure 9) and duration in Crotonese stimuli (see Figure 10), no matter what their native variety was (Figure 11). The intrinsic phonetic cues used to mark prominence are thus not only evident acoustically, but are also more easily perceived by all listeners. The primary prominence features differ in the two varieties of Italian considered here: duration for Crotonese and pitch for Pisan [cf. Marotta et al. 2011]. However, they are recognized with the same degree of accuracy no matter what the native variety of the listener is.

This suggests that the distribution of language specific prosodic

features observed in production is not always mirrored by an

equivalent and symmetrical behavior in perception.

 

Figure 9: Incorrect responses by all listeners for Pisan and Crotonese stimuli modified only in F0

 

Figure 10: Incorrect responses by all listeners for Pisan and Crotonese stimuli modified only in duration

 


Figure 11: Incorrect responses by all listeners for Pisan and Crotonese stimuli modified for duration and F0

Our ABX discrimination experiment adds new evidence to the

controversial debate on the differences in perception between musicians

and non-musicians. The analysis of the responses obtained where the

first stimulus had to be recognized as different from the second

confirmed and reproduced the results previously obtained by

experiments using the same setting. Musicians scored a considerably

higher number of correct responses both in duration and pitch

recognition tasks (Figure 5). The experiment provides fresh data when

the variable A=B is considered, that is when the two stimuli have to be

recognized as being identical. Unexpectedly, the analysis of the data

revealed that in this task the pattern of correct answers is reversed

between the groups. Musicians scored worse than subjects with no

explicit musical training, and the difference is highly statistically

significant (Figure 6).
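One way to ask whether such a reversal reflects a response bias rather than a loss of sensitivity is to combine the two trial types in a signal detection measure. The Python sketch below is not part of the analysis reported here. It uses the simple equal-variance yes/no formulas as a rough approximation, treating a "different" answer on an A≠B trial as a hit and a "different" answer on an A=B trial as a false alarm. The rates are invented for illustration and the function name `sensitivity_and_bias` is hypothetical.

```python
from statistics import NormalDist


def sensitivity_and_bias(hit_rate, false_alarm_rate):
    """Return (d_prime, criterion) under the equal-variance yes/no model.

    Both rates must lie strictly between 0 and 1, otherwise the
    inverse normal CDF is undefined.
    """
    z = NormalDist().inv_cdf
    d_prime = z(hit_rate) - z(false_alarm_rate)
    criterion = -0.5 * (z(hit_rate) + z(false_alarm_rate))
    return d_prime, criterion


# Hypothetical rates (NOT the paper's data).
# Hit: "different" answered on an A != B trial.
# False alarm: "different" answered on an A = B trial,
# i.e. 1 minus the proportion correct on identical stimuli.
musician_d, musician_c = sensitivity_and_bias(hit_rate=0.90, false_alarm_rate=0.40)
nonmusician_d, nonmusician_c = sensitivity_and_bias(hit_rate=0.70, false_alarm_rate=0.20)

print(f"musicians:     d' = {musician_d:.2f}, c = {musician_c:.2f}")
print(f"non-musicians: d' = {nonmusician_d:.2f}, c = {nonmusician_c:.2f}")
```

With these invented rates the two groups have similar sensitivity, while the negative criterion of the first group indicates a tendency to answer "different". A pattern of this kind would be compatible with an oversensitivity account of the A=B errors, but only the real response counts could show whether it holds.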


6 Conclusion

In previous sections of the article (cf. § 2.4 and passim) we

observed that Pisan and Crotonese speakers consistently differ in the

production of non-distinctive prosodic features: pitch and vowel

length have a different weight in marking prominent vowels

depending on the language variety. This finding could suggest that a

difference in production is reflected in a difference in perception.

Previous studies showed that prosodic features, when distinctive in a

language, can improve the perception accuracy of the corresponding

acoustic correlates in recognition tasks. Speakers of tonal languages

are better at recognizing pitch differences [Stevens et al., 2004], and if

vowel duration is distinctive in a language, the speakers show an

improved ability to recognize small differences in vowel length. The

present experiment differs from the aforementioned studies

in two respects: first, instead of sampling speakers and stimuli from two

languages, two varieties of the same language were used, i.e. Pisan

and Crotonese; second, duration and pitch have no distinctive status in

either of the Italian varieties considered.

The results obtained from the ABX discrimination test do not

confirm any perceptual impact of the native linguistic variety in

judging stimuli manipulated for duration and F0: the subjects’

performance is neither different nor better when they listen to their own variety. At

the same time, listeners appear to be sensitive to both the prosodic

parameters taken into account. In particular, the sensitivity is driven

by the feature that is specific to each of the two varieties considered: pitch

for Pisan and length for Crotonese. The native variety of the listeners

does not influence the discrimination of prosodic variations in

speech stimuli, because no significant difference was found in the

ability of the two groups of listeners (Pisans and Crotonians) to

recognize their variety-specific prosodic feature of prominence:

Pisans did not perform significantly better than Crotonians in

recognizing pitch variation, and Crotonians did not discriminate

duration differences better than Pisans.

On the other hand, the results of our experiment suggest that

non-distinctive prosodic parameters, like F0 and duration in Italian

varieties, though used as features of prominence in production, do not

systematically affect the perception of listeners in a symmetric way.

Finally, the results of our perception experiment confirm the

relevance of the variable ‘musical training’: in line with previous studies

(Rauscher & Hinton [2003], Schön et al. [2004], Pape &

Mooshammer [2006], Pape [2008] among others), subjects with good

musical competence perform better in prosodic perception tasks.

However, the interpretation of these results is not straightforward. Is the

better performance of musicians directly derived from their

competence in the musical code, or does it depend indirectly on an

easier mastery of the discrimination task? As a matter

of fact, the subjects listened to speech stimuli, not music or

psychoacoustic stimuli. Their explicit knowledge of music may not be

related to any special linguistic skill, and in fact may even influence

perception negatively. In the case of identical stimuli (A = B), our

listeners with a musical education showed a higher percentage of

errors than listeners without any musical training, possibly

because of their oversensitivity to pitch changes.

Our data appear to support the results obtained by Schön et al.

[2004], who observed that the scalp negativity measured during a

similar recognition task was different for musicians (temporal sites

bilaterally) and for non-musicians (centrally, left temporal sites),

suggesting that there is no improvement in the specific abilities

investigated in the experiments and that the two groups simply used

different strategies that are not directly comparable. Future research will shed

more light on the interaction between competence in music and

prosodic perception.

GIOVANNA MAROTTA, LUCA IACOPONI, ALICE IDONE

DEPARTMENT OF LINGUISTICS, UNIVERSITY OF PISA

7 References

Bidelman, G. M.; Gandour, J. T.; Krishnan, A.: Musicians demonstrate

experience-dependent brainstem enhancement of musical scale

features within continuously gliding pitch. Neuroscience

Letters 503(3): 203-207 (2011).

Bertinetto, P. M.; Loporcaro, M.: The sound pattern of Standard

Italian, as compared with the varieties spoken in Florence,

Milan and Rome. Journal of the International Phonetic

Association 35: 131-151 (2005).

Calamai, S.: Il vocalismo tonico dell’area pisana e livornese. Aspetti

storici, percettivi e acustici, (Edizioni dell’Orso, Alessandria

2004).

Calamai, S.; Ricci, I.: Sulla percezione dei confini vocalici in Toscana:

primi risultati. In Cosi, P. (ed.), Atti del I Convegno Nazionale

AISV (EDK Editore, Torriana 2005a).

Calamai, S.; Ricci, I.: Un esperimento di matched-guise in Toscana.

Studi Linguistici e Filologici Online 3.1, pp. 63-105 (2005b)

(www.humnet.unipi.it/slifo.htlm).

Deutsch, D.; Henthorn, T.; Dolson, M.: Absolute Pitch, Speech, and

Tone Language: Some Experiments and a Proposed

Framework. Music Percept 21: 339-356 (2004).

D'Imperio, M.; Rosenthall, S.: Phonetics and Phonology of Main

Stress in Italian. Phonology 16(1): 1-27 (1999).

Fanciullo, F.: Fra Oriente e Occidente. Per una storia linguistica

dell'Italia meridionale (ETS, Pisa 1997).

Galloway, R.E.: Should rhythm metrics take account of fundamental

frequency? Poster presented at the Workshop on Empirical

Approaches to Speech Rhythm (EASR08), 28th March 2008,

(University College of London 2008).

Grammont, M.: Traité de phonétique (Delagrave, Paris 1933).

Gussenhoven, C.: Perceived vowel duration. In H. Quené & V. van

Heuven (eds.), On Speech and Language: Studies for Sieb G.

Nooteboom, LOT. 65-71 (Utrecht 2004).

Jones, A.J.; Munhall, K.G.: Perceptual calibration of F0 production:

evidence from feedback perturbation. Journal of the Acoustical

Society of America 108:1246-1251 (2000).

Jones, A.J.; Munhall, K.G.: Remapping Auditory-Motor

Representations in Voice Production, Current Biology

15:1768-1772 (2005).

Kohler, K.J.: The Perception of Prominence Patterns. Phonetica 65:

257-269 (2008).

Lehnert–Le Houillier, H.: The influence of dynamic F0 on the

perception of vowel duration: cross linguistic evidence. In

Proceedings of the 16th International Congress of Phonetic

Sciences: 757 – 760 (Saarland University, Saarbrücken 2007).

Loporcaro, M.: Profilo linguistico dei dialetti italiani (Laterza,

Roma Bari 2009).

Marotta, G.: Modelli e misure ritmiche. La durata vocalica in italiano

(Zanichelli, Bologna 1985).

Marotta, G.: Degenerate Feet nella fonologia metrica dell'italiano. In

P. Benincà, A. Mioni & L. Vanelli (eds.), Fonologia e

morfologia dell'italiano e dei dialetti d'Italia. Atti del XXXI

Congresso S.L.I.: 97-116 (Bulzoni, Roma 1999).

Marotta, G.: Non solo spiranti. La ‘gorgia toscana’ nel parlato di Pisa.

L’Italia Dialettale 62: 27-60 (2001).

Marotta, G.: Lenition in Tuscan Italian (gorgia toscana). In J. Brandao

de Carvalho, T. Scheer & Ph. Ségéral (eds.), Lenition and

Fortition: 235-272 (Mouton-de Gruyter, Berlin 2008).

Marotta, G.; Calamai, S.; Sardelli, E.: Non di sola lunghezza. La

modulazione di F0 come indice socio-fonetico. In A. De

Dominicis, L. Mori, M. Stefani (eds.), Costituzione, gestione e

restauro di corpora vocali. Atti delle XIV Giornate del GFS:

210-215 (Esagrafica, Roma 2004).

Marotta, G.; Molino, A.; Bertini, C.: Lunghezza o frequenza: quale

parametro per la prominenza? In B. Gili Fivela, A. Stella, L.

Garrapa, M. Grimaldi (eds.), Contesto comunicativo e

variabilità nella produzione e percezione della lingua. Atti del

VII Convegno AISV: 31-42 (Bulzoni, Roma 2011).

Marotta, G.; Sardelli, E.: Sulla prosodia della domanda con soggetto

postverbale in due varietà di italiano toscano. In P. Cosi, E.

Magno Caldognetto, A. Zamboni (eds.), Studi di fonetica in

ricordo di F. Ferrero: 205-212 (Unipress, Padova 2003).

Marotta, G.; Sardelli, E.: Prosodic parameters for the detection of

regional varieties in Italian. In Proceedings of the 16th

International Congress of Phonetic Sciences: 682-704

(Saarland University, Saarbrücken 2007).

Marotta, G.; Sardelli, E.: Prosodiatopia: parametri prosodici per un

modello di riconoscimento diatopico. In G. Ferrari, R. Benatti

& M. Mosca (eds.), Linguistica e modelli tecnologici di

ricerca, Atti del XL Congresso SLI: 411-436 (Bulzoni, Roma

2009).

Mendicino, A.; Romito, L.: Isocronia e base di articolazione: uno

studio su alcune varietà meridionali. Quaderni del

Dipartimento di Linguistica, Università della Calabria, Serie

Linguistica 3: 49 – 67 (1991).

Niebuhr, O.: F0 – based rhythm effects on the perception of local

syllable prominence. Phonetica 66: 95 – 112 (2009).

Pape, D.: The native language influence on perceptual Intrinsic Pitch:

Cross-linguistic data from German, Italian, Portuguese, and

Spanish. In Proceedings of the 4th Conference on Speech

Prosody, pp. 743-746 (Campinas, Brazil 2008).

Pape, D.; Mooshammer, C.: Is Intrinsic pitch language-dependent?

Evidence from a cross-linguistic vowel pitch perception

experiment. In Proceedings of the ISCA International

Workshop on Multilinguistic MULTILING (Stellenbosch,

South Africa 2006).

Pellegrini, G. B.: La Carta dei Dialetti d’Italia. (Pacini editore, Pisa

1977).

Pisoni, D.; Remez, R. E.: The Handbook of Speech Perception.

(Blackwell, Oxford 2005).

Rauscher, F.H.; Hinton, S.C.: Type of music training selectively

influences perceptual processing. In Proceedings of the

European Society for the Cognitive Sciences of Music

(Hannover, Germany 2003).

Rietveld, A. C. M.; Gussenhoven, C.: On the relation between speech

excursion size and pitch prominence. Journal of Phonetics 13:

299 – 308 (1985).

Romito, L.: Cenni sui correlati elettroacustici dell’accento in alcune

varietà di italiano. In Atti delle IV Giornate di Studio del GFS,

pp. 107–119 (1993).

Romito, L.; Trumper, J.: Un problema della coarticolazione:

l’isocronia rivisitata. In Atti del XVII Convegno

dell’Associazione Italiana di Acustica, pp. 449 – 455 (1989).

Romito, L.; Turano, T.; Loporcaro, M.; Mendicino, A.: Micro e

Macrofenomeni di centralizzazione nella variazione diafasica:

rilevanza dei dati fonetico-acustici per il quadro dialettologico

del calabrese. Atti del convegno "VII Giornate di Studio del

Gruppo di Fonetica Sperimentale (G.F.S.)", Napoli, novembre

1996, pp.157 – 175 (1997).

Rosen, S. M.: The Effect of Fundamental Frequency patterns on

perceived duration. Speech Transmission Laboratory Quarterly

Progress and Status Report 1:17 – 30 (1977a).

Rosen, S. M.: Fundamental frequency patterns and the long-short

vowel distinction in Swedish. Speech Transmission Laboratory

Quarterly Progress and Status Report 1: 31-37 (1977b).

Schön, D.; Magne, M.; Besson, M.: The music of speech: Music

training facilitates pitch processing in both music and

language. Psychophysiology 41 (3): 341-349 (2004).

Schwanhäußer, B.; Burnham, D.: Lexical Tone and Pitch Perception in

Tone and Non-Tone Language Speakers. In Proceedings of the

9th European Conference on Speech Communication and

Technology ISCA: 1701–1704 (Bonn, 2005).

Stevens, C.; Keller, P. E.; Tyler, M. D.: Language tonality and its

effect on the perception of contour in short spoken and musical

items. In Proceedings of the 8th International Conference on

Music Perception and Cognition, Evanston, IL, pp. 713-716

(Evanston 2004).

Terken, J.: Fundamental Frequency and Perceived Prominence.

Journal of the Acoustical Society of America 89: 1768 – 1776

(1991).

Van Dommelen, W.: Does dynamic F0 increase perceived duration?

New light on an old issue. Journal of Phonetics 21:367 – 386

(1993).

Yu, A. C. L.: Tonal effects on perceived vowel duration. Laboratory

Phonology 10 (Paris, France 2006).