
ORIGINAL RESEARCH ARTICLE
published: 02 September 2013
doi: 10.3389/fpsyg.2013.00566

Music and speech prosody: a common rhythm

Maija Hausen 1,2*, Ritva Torppa 1,2, Viljami R. Salmela 3, Martti Vainio 4 and Teppo Särkämö 1,2

1 Cognitive Brain Research Unit, Institute of Behavioural Sciences, University of Helsinki, Helsinki, Finland
2 Finnish Center of Excellence in Interdisciplinary Music Research, University of Jyväskylä, Jyväskylä, Finland
3 Institute of Behavioural Sciences, University of Helsinki, Helsinki, Finland
4 Department of Speech Sciences, Institute of Behavioural Sciences, University of Helsinki, Helsinki, Finland

Edited by: Josef P. Rauschecker, Georgetown University School of Medicine, USA

Reviewed by: Mireille Besson, Centre National de la Recherche Scientifique, France; Barbara Tillmann, Centre National de la Recherche Scientifique, France; Aniruddh Patel, Tufts University, USA

*Correspondence: Maija Hausen, Cognitive Brain Research Unit, Institute of Behavioural Sciences, University of Helsinki, PO Box 9 (Siltavuorenpenger 1 B), FIN-00014 Helsinki, Finland
e-mail: [email protected]

Disorders of music and speech perception, known as amusia and aphasia, have traditionally been regarded as dissociated deficits based on studies of brain damaged patients. This has been taken as evidence that music and speech are perceived by largely separate and independent networks in the brain. However, recent studies of congenital amusia have broadened this view by showing that the deficit is associated with problems in perceiving speech prosody, especially intonation and emotional prosody. In the present study the association between the perception of music and speech prosody was investigated with healthy Finnish adults (n = 61) using an on-line music perception test including the Scale subtest of the Montreal Battery of Evaluation of Amusia (MBEA) and Off-Beat and Out-of-key tasks as well as a prosodic verbal task that measures the perception of word stress. Regression analyses showed that there was a clear association between prosody perception and music perception, especially in the domain of rhythm perception. This association was evident after controlling for music education, age, pitch perception, visuospatial perception, and working memory. Pitch perception was significantly associated with music perception but not with prosody perception. The association between music perception and visuospatial perception (measured using analogous tasks) was less clear. Overall, the pattern of results indicates that there is a robust link between music and speech perception and that this link can be mediated by rhythmic cues (time and stress).

    Keywords: music perception, MBEA, speech prosody perception, word stress, visuospatial perception

    INTRODUCTION

Music and speech have been considered as two aspects of the highly developed human cognition. But how much do they have in common? Evolutionary theories suggest that music and speech may have had a common origin in the form of an early communication system based on holistic vocalizations and body gestures (Mithen, 2005) and that music may have played a crucial role in social interaction and communication, especially between the mother and the infant (Trehub, 2003). Another view holds that the development of music can be understood more as a by-product of other adaptive functions related to, for example, language and emotion (Pinker, 1997). Whether their origins are linked or not, both music and speech are auditory communication systems that utilize similar acoustic cues for many purposes, for example for expressing emotions (Juslin and Laukka, 2003). Especially in infant-directed speech, the musical aspects of language (rhythm, timbral contrast, melodic contour) are the central means of communication, and there is novel evidence that newborns show largely overlapping neural activity to infant-directed speech and to instrumental music (Kotilahti et al., 2010). It has been suggested that the musical aspects of language might also be used as scaffolding for the later development of semantic and syntactic aspects of language (Brandt et al., 2012).

In addition to the links that have been found in early development, music and speech seem to be behaviorally and neurally interrelated also later in life. Evidence from functional magnetic resonance imaging (fMRI) studies of healthy adults suggests that perceiving music and speech engages at least partly overlapping neural regions, especially in superior, anterior and posterior temporal areas, temporoparietal areas, and inferior frontal areas (Koelsch et al., 2002; Tillmann et al., 2003; Rauschecker and Scott, 2009; Schön et al., 2010; Abrams et al., 2011; Rogalsky et al., 2011), including also Broca's and Wernicke's areas in the left hemisphere that were previously thought to be language-specific. Similarly, studies using electroencephalography (EEG) and magnetoencephalography (MEG) have shown that in both speech and music the discrimination of phrases induces similar closure positive shift (CPS) responses (Steinhauer et al., 1999; Knösche et al., 2005) and that syntactic violations in both speech and music elicit similar P600 responses in the brain (Patel et al., 1998). An EEG study of healthy non-musicians also showed that music may induce similar semantic priming effects as words when semantically related or unrelated words are presented visually after hearing music excerpts or spoken sentences (Koelsch et al., 2004).

A clear link between speech and music has also been shown in behavioral and neuroimaging studies of musical training (for a recent review, see Kraus and Chandrasekaran, 2010; Shahin, 2011). Compared to non-musicians, superior speech processing skills have been found in adult musicians (Schön et al., 2004; Chartrand and Belin, 2006; Marques et al., 2007; Lima and Castro, 2011; Marie et al., 2011a,b) and musician children (Magne et al., 2006). Also, musical training has been shown to enhance speech-related skills in longitudinal studies where the non-musician participants were randomly assigned to a music training group and a control group (Thompson et al., 2004; Moreno et al., 2009; Degé and Schwarzer, 2011; Chobert et al., 2012; François et al., 2012). The superior speech-related skills of musicians or participants in the musical training group include the perception of basic acoustic cues in speech, such as pitch (Schön et al., 2004; Magne et al., 2006; Marques et al., 2007; Moreno et al., 2009), timbre (Chartrand and Belin, 2006), and vowel duration (Chobert et al., 2012). These results support the hypothesis that music and speech are at least partly based on shared neural resources (Patel, 2008, 2012). The improved processing of these basic acoustic parameters can also lead to enhanced processing of more complex attributes of speech, which can be taken as evidence of transfer of training effects (Besson et al., 2011). The enhanced higher-level processing of speech includes speech segmentation (Degé and Schwarzer, 2011; François et al., 2012) and the perception of phonemic structure (Degé and Schwarzer, 2011), metric structure (Marie et al., 2011b), segmental and tone variations in a foreign tone-language (Marie et al., 2011a), phonological variations (Slevc and Miyake, 2006) and emotional prosody (Thompson et al., 2004; Lima and Castro, 2011). Musical ability is also related to enhanced expressive language skills, such as productive phonological ability (Slevc and Miyake, 2006) and pronunciation (Milovanov et al., 2008) in a foreign language, as well as reading phonologically complex words in one's native language (Moreno et al., 2009).

The enhanced processing of linguistic sounds is coupled with electrophysiologically measured changes across different auditory processing stages, starting from the brainstem (Musacchia et al., 2007; Wong et al., 2007) and extending to the auditory cortex and other auditory temporal lobe areas (Magne et al., 2006; Musacchia et al., 2008; Moreno et al., 2009; Marie et al., 2011a,b). Years of musical training have been found to correlate with stronger neural activity induced by linguistic sounds at both subcortical and cortical levels (Musacchia et al., 2008). This result and the results of the longitudinal studies (Thompson et al., 2004; Moreno et al., 2009; Degé and Schwarzer, 2011; Chobert et al., 2012; François et al., 2012) suggest that the possible transfer effects are more likely results of training rather than genetic predispositions. When studying the possible transfer of training effects of music expertise on speech processing, it is important to consider general cognitive abilities as possible mediators. ERP studies show that attention does not explain the effects, but results regarding memory present a more mixed picture (Besson et al., 2011). A clear correlation between music lessons and general intelligence has been found (Schellenberg, 2006), indicating that the transfer effects between music and language can partly be explained by enhanced general cognitive abilities when these are not controlled for.

Conversely, language experience may also have an effect on the development of music perception. For example, speakers of a tone-language (e.g., Chinese) have better abilities in imitating and discriminating musical pitch (Pfordresher and Brown, 2009; Bidelman et al., 2011) and they acquire absolute pitch more often than Western speakers (Deutsch et al., 2006). Also, speakers of a quantity language (Finnish) have been found to show enhanced processing of the duration of non-speech sounds similar to that of French musicians, compared to French non-musicians (Marie et al., 2012).

Processing speech and music appear to be linked in the healthy brain, but does the same hold true in the damaged brain? Disorders of music and speech perception/expression, known as amusia and aphasia, have traditionally been regarded as independent, separable deficits based on double dissociations observed in studies of brain damaged patients (amusia without aphasia: Peretz, 1990; Peretz and Kolinsky, 1993; Griffiths et al., 1997; Dalla Bella and Peretz, 1999; aphasia without amusia: Basso and Capitani, 1985; Mendez, 2001; for a review, see Peretz and Coltheart, 2003). However, recent studies suggest that this double dissociation may not be absolute. In Broca's aphasia, problems in the syntactic (structural) processing of language have been shown to be associated with problems in processing structural relations in music (Patel, 2005; Patel et al., 2008a). Musical practices are useful also in the rehabilitation of the language abilities of patients with non-fluent aphasia (Racette et al., 2006; Schlaug et al., 2010; Stahl et al., 2011), suggesting a further link between the processing of speech and music in the damaged brain. Moreover, persons with congenital amusia have been found to have lower than average abilities in phonemic and phonological awareness (Jones et al., 2009), in the perception of emotional prosody (Thompson, 2007; Thompson et al., 2012), speech intonation (Patel et al., 2005, 2008b; Jiang et al., 2010; Liu et al., 2010) and subtle pitch variation in speech signals (Tillmann et al., 2011b), and in the discrimination of lexical tones (Nan et al., 2010; Tillmann et al., 2011a). Collectively, these results suggest that amusia may be associated with fine-grained deficits in the processing of speech.

Similar to music, the central elements in speech prosody are melody (intonation) and rhythm (stress and timing) (Nooteboom, 1997). Studies of acquired amusia show that the melodic and rhythmic processing of music can be dissociated (Peretz, 1990; Peretz and Kolinsky, 1993; Di Pietro et al., 2004), suggesting that they may be partly separate functions. Previously, the association between music and speech processing has mainly been found to exist between the perception of the melodic aspect of music and speech (Schön et al., 2004; Patel et al., 2005, 2008b; Magne et al., 2006; Marques et al., 2007; Moreno et al., 2009; Jiang et al., 2010; Liu et al., 2010; Nan et al., 2010). However, rhythm also has important functions in both music and speech. Speech is perceived as a sequence of events in time, and the term speech rhythm is used to refer to the way these events are distributed in time. The patterns of stressed (strong) and unstressed (weak) tones or syllables build up the meter of both music and speech (Jusczyk et al., 1999; for a review, see Cason and Schön, 2012). Speech rhythm can be used in segmenting words from fluent speech: the word stress patterns that are typical in one's native language help to detect word boundaries (Vroomen et al., 1998; Houston et al., 2004). Depending on the language, word stress is expressed with changes in fundamental frequency, intensity, and/or duration (Morton and Jassem, 1965). Fundamental frequency (f0) is often thought to be a dominant prosodic cue for word stress (Lieberman, 1960; Morton and Jassem, 1965) and word segmentation (Spinelli et al., 2010); however, changes in syllable duration and sound intensity are also associated with the prosodic patterns that signal stress (Lieberman, 1960; Morton and Jassem, 1965). For example, the results from Kochanski et al. (2005) suggest that in English intensity and duration may play an even more important role in the detection of syllabic stress than f0. In Finnish, word or lexical stress alone is signaled with durational cues (Suomi et al., 2003), as well as intensity, whereas sentence stress is additionally signaled with fundamental frequency (Vainio and Järvikivi, 2007).

Although there are relatively few studies looking at rhythm or meter associating the perception of music and speech, there are some recent findings that support this association. For example, Marie et al. (2011b) found that musicians perceive the metric structure of words more accurately than non-musicians: incongruous syllable lengthenings elicited stronger ERP activations in musicians both automatically and when they were task-relevant. Also, priming with rhythmic tones can enhance the phonological processing of speech (Cason and Schön, 2012) and the synchronizing of musical meter and linguistic stress in songs can enhance the processing of both lyrics and musical meter (Gordon et al., 2011).

Another cognitive domain that has recently been linked to music perception is visuospatial processing. A stimulus-response compatibility effect has been found between the pitch (high/low) of auditory stimuli and the location (up/down) of the answer button (Rusconi et al., 2006). There is also evidence that musicians' abilities in visuospatial perception are superior to average (Brochard et al., 2004; Patston et al., 2006). Moreover, congenital amusics have been found to have below average performance in a mental rotation task (Douglas and Bilkey, 2007), although this finding has not been replicated (Tillmann et al., 2010). Williamson et al. (2011) found that a subgroup of amusics were slower but as accurate as the control group in the mental rotation task, but did not find any group differences in a range of other visuospatial tasks. Douglas and Bilkey (2007) also found that the stimulus-response compatibility effect was not as strong in amusics as in the control group. In another study, the amusic group reported more problems in visuospatial perception than the control group, but this was not confirmed by any objective measure (Peretz et al., 2008). Taken together, there is some preliminary evidence that visuospatial and musical processing might be linked, but more research is still clearly needed.

The main aim of the present study was to systematically determine the association between music perception (as indicated by a computerized music perception test including the Scale subtest of the Montreal Battery of Evaluation of Amusia, as well as Off-beat and Out-of-key tasks) and the perception of speech prosody, using a large sample of healthy adult subjects (N = 61). To measure the perception of speech prosody, we used a novel experiment that does not focus only on pitch contour (such as the statement-question sentence tests used in many previous studies) but measures the perception of word stress utilizing a natural combination of fundamental frequency, timing, and intensity variations. Thus, this experiment is suitable for assessing the possible connection of the perception of both rhythm and pitch in music to prosodic perception. We concentrate on the role of the acoustic differences in the perception of word stress, not the linguistic aspects of this prosodic phenomenon (see, for example, Vogel and Raimy, 2002). Second, the study investigated the possible association between visuospatial perception and music perception. Possible confounding variables, including auditory working memory and pitch perception threshold, were controlled for.

MATERIALS AND METHODS

PARTICIPANTS

Sixty-four healthy Finnish adults were recruited into the study between June and August 2011. The ethical committee of the Faculty of Behavioural Sciences of the University of Helsinki approved the study and the participants gave their written informed consent. Inclusion criteria were age between 19 and 60 years, self-reported normal hearing, and speaking Finnish as a first language or at a comparable level (by self-report). Exclusion criteria were being a professional musician and/or having obtained music education at a professional level. Of the 64 tested participants, 13 reported having visited an audiologist; one participant was excluded from the analysis because of a deaf ear. However, the other participants who had visited an audiologist had suspected hearing problems that had proved to be either non-existent, transient, or very mild (reported by the participants and controlled by statistical analyses; see section Associations Between the Music Perception Test and Demographical and Musical Background Variables). None of the participants had had a cerebral vascular accident or a brain trauma. Another participant was excluded because of weaker than first-language level skills in Finnish. One participant was found to perform significantly (>3 SD) below the average total score in the music perception test. In questionnaires, this participant also reported lacking a sense of music and being unable to discriminate out-of-key tones, further suggesting that the participant might have congenital amusia. In order to limit this study to healthy participants with musical abilities in the normal range (without musical deficits or professional expertise in music), the data from this participant were excluded from further analysis. Thus, data from 61 participants were used in the statistical analysis. Fifty-eight (95.1%) of the analyzed participants spoke Finnish as their first language and three participants (4.9%) spoke Finnish at a level comparable to a first language. Other characteristics of the analyzed participants are shown in Table 1.

ASSESSMENT METHODS

Music, speech prosody, pitch, and visuospatial perception abilities were assessed with computerized tests, and working memory was evaluated using a traditional paper-and-pencil test. The computer was a laptop with a 12″ display and headphones. In addition, the participants filled out a paper questionnaire. The place of the testing was arranged individually for each participant: most assessments were done in a quiet work space at a public library. The researcher gave verbal instructions for all tests except the on-line music perception test, in which the participant read the instructions from the laptop screen. The duration of the assessment session was ca. 1.5 h on average, ranging from 1 to 2 h.

Music perception
Music perception was measured with an on-line computer-based music perception test including the Scale subtest of the original Montreal Battery of Evaluation of Amusia (MBEA; Peretz et al., 2003) as well as the Off-beat and Out-of-key tasks from the on-line version of the test (Peretz et al., 2008).


Table 1 | Characteristics of the participants.

Male/female                               21/40 (34/66%)
Mean age (range)                          39.0 (19–59)

Education level
  Primary level                           0 (0%)
  Secondary level                         23 (38%)
  Lowest level tertiary                   6 (10%)
  Bachelor level                          17 (28%)
  Master level or higher                  15 (25%)
Mean education in years (range)           17.1 (10–32)

Musical education: no/yes                 19/42 (31/69%)
  Musical playschool                      4 (7%)
  Special music class in school           6 (10%)
  Private lessons or with parents         23 (37%)
  Music institute or conservatory         13 (21%)
  Independent music learning              26 (43%)
Mean musical training in years (range)    3.7 (0–19)

Self-reported cognitive problems
  Reading problems                        5 (8%)
  Speech problems                         3 (5%)
  Spatial orientation problems            5 (8%)
  Problems in maths                       12 (20%)
  Attentional problems                    5 (8%)
  Memory problems                         6 (10%)

The on-line version is constructed to measure the same underlying constructs as the MBEA and it has a high correlation with the original MBEA that is administered in a laboratory setting (Peretz et al., 2008). The instructions were translated into Finnish and Swedish for the present study. The test used in the present study comprised the Scale subtest (Peretz et al., 2003), the Off-beat subtest (Peretz et al., 2008), and the Out-of-key subtest (Peretz et al., 2008) (see http://www.brams.umontreal.ca/amusia-demo/ for a demo in English or French). The test included 30 melodies composed for the MBEA (Peretz et al., 2003) following Western tonal-harmonic conventions. The Scale subtest comprised piano tones while the Off-beat and the Out-of-key subtests used 10 different timbres (e.g., piano, saxophone, and clarinet). In the Scale subtest, the participants were presented with 31 trials, including one catch trial that was not included in the statistical analysis. Each trial was a pair of melodies and the task was to judge if the melodies were similar or different. In half (15) of the trials the melodies were the same and in half (15) of the trials the second melody had an out-of-scale tone (on average, 4.3 semitones apart from the original pitch). In the Off-beat and Out-of-key subtests, the subjects were presented with 24 trials of which 12 were normal melodies and 12 were incongruous, having a time delay (Off-beat) or an out-of-scale tone (Out-of-key) on the first downbeat in the third bar of the four-bar melody. In the Off-beat subtest the task was to judge if the melody contained an unusual delay. The 12 incongruous trials had a silence of 5/7 of the beat duration (i.e., 357 ms) prior to a critical tone, disrupting the local meter. In the Out-of-key subtest the task was to judge if the melody contained an out-of-tune tone. In the 12 incongruous trials the melody had a 500 ms long tone that was outside the key of the melody, sounding like a wrong note. The subtests were always presented in the same order (Scale, Off-beat, Out-of-key) and each subtest began with 2–4 examples of congruous and incongruous trials. The volume level was adjusted individually to a level that was clearly audible to the subject. At the end the participants filled out an on-line questionnaire about their history and musical background (see Appendix for the questionnaire in English; the participants filled it in in Finnish or Swedish). The whole test was completed in 20–30 min.

Speech prosody (word stress) perception
Speech prosody perception was assessed with a listening experiment that measures the identification of word stress as it is produced to separate a compound word into a phrase of two separate words. The task examines the perception of word and syllabic stress as it is used to signal either word-level stress or moderate sentence stress, and it is designed so that all prosodic cues, namely f0, intensity, and duration, play a role (O'Halpin, 2010). The word stress examined in this study differs from so-called lexical stress, where the stress pattern differentiates the meaning of two phonetically identical words from each other, as well as from sentence-level stress, where a word is accented or emphasized to contrast it with other words in the utterance. The task is designed to measure the perception of syllabic stress at the level which aids in separating words from the surrounding syllables.

The test is originally based on work by Vogel and Raimy (2002) and O'Halpin (2010) and it has been adapted into Finnish by Torppa et al. (2010). Finnish has a fixed stress on the first syllable of a word; thus, a compound word has only one stressed syllable that is accented in an utterance context, as opposed to two accents in a similar two-word phrase. Typically, the first syllable of the second word of a compound has a secondary stress that differentiates it from a totally unstressed syllable. The materials in the test were spoken with a so-called broad focus where (in the case of a phrase) neither of the two words stood out as more emphatic (as is the case in the so-called narrow or contrastive focus). The stimuli were analyzed acoustically using Praat (Boersma, 2001) with respect to the (potentially) stressed syllables. We measured the raw f0 maxima and intensity maxima as well as the syllable durations, and the differences between the values of the two syllables in each utterance were calculated; the results are summarized in Table 2. Table 2 shows the differences in f0, intensity, and duration between the first syllable of the first and second word of compound/phrased words and the results of paired t-tests on the significance of the differences. As shown in Table 2, for duration differences the statistical result did not reach significance; however, the difference between the compound vs. phrased utterances in the duration of the vowel (nucleus) in the second syllable of the second word was significant, t(28) = 2.45, p = 0.02. Thus, the compound words were found to differ from the phrases with respect to all prosodic parameters (f0, duration, and intensity), showing that the difference was not produced with any single prosodic parameter. An example of an utterance pair (produced by a 10-year-old female child) is shown in Figure 1.
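As a concrete illustration of the comparison behind Table 2, the sketch below computes per-utterance f0 differences in semitones and tests compounds against phrases. The numeric values are hypothetical placeholders (the actual measurements came from Praat), and the use of a two-sample t-test is my reading of the reported degrees of freedom (28, for 14 + 16 utterances); it is not the authors' analysis script.

```python
import numpy as np
from scipy import stats

def f0_diff_semitones(peak1_hz, peak2_hz):
    # Difference between the two stressed syllables' f0 peaks, in semitones:
    # 12 * log2(f1 / f2)
    return 12.0 * np.log2(np.asarray(peak1_hz, float) / np.asarray(peak2_hz, float))

# Hypothetical per-utterance f0 peaks (Hz) for the first stressed syllable
# of word 1 vs. word 2 (the study had 14 compound and 16 phrase utterances).
compounds = f0_diff_semitones([310, 298, 305], [182, 175, 171])
phrases = f0_diff_semitones([288, 292, 281], [219, 224, 210])

# Compounds should show a larger f0 drop than phrases
# (cf. 9.2 vs. 4.8 semitones in Table 2).
t, p = stats.ttest_ind(compounds, phrases)
print(f"t = {t:.2f}, p = {p:.4f}")
```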


Table 2 | The differences between the cues for word stress in first and second stressed syllables in compound/phrase utterances.

Stimulus   N    Mean duration           Mean f0 difference    Mean intensity
                difference in ms (sd)   in semitones (sd)     difference in dB (sd)
Compound   14   2.0 (69.2)              9.2 (2.4)             8.6 (5.7)
Phrase     16   33.3 (98.6)             4.8 (2.7)             1.1 (2.6)

t-test between compounds vs. phrases:
  Duration: t(28) = 1.11, p = 0.27    f0: t(27) = 4.61, p < 0.001    Intensity: t(28) = 2.93, p = 0.007

The mean differences were calculated as follows: (a) Duration: the duration of the first syllable vowel (nucleus) of the first part of the compound minus the duration of the first syllable vowel (nucleus) in the second part of the compound or phrase, i.e., kIssankEllo or kIssan kEllo, respectively. (b) f0 and intensity: the peak value of the f0/intensity in the first part of the compound minus the peak value of the f0/intensity in the second part of the compound/phrase. The f0 differences were calculated in semitones. One f0 value was missing due to creaky voice.

FIGURE 1 | Example of the spectrum of a compound word (above; audio file 1) and a two-word phrase (audio file 2) with f0 (black line) and intensity (red line) contours. The scale is 9–400 Hz for f0 and 0–100 dB for intensity. Each figure shows the spectrogram, f0 track, and intensity contour of the utterance. The extent of the words in question and the orthographic text are also shown.

In each trial, the participants heard an utterance produced with a stress pattern that denoted it either as a compound (e.g., näytä KISsankello [ˈkisːanˌkelːo], meaning "show the harebell flower" or literally "cat's-bell" in English) or as a phrase comprised of the same two words (e.g., näytä KISsan KELlo [ˈkisːan ˈkelːo], meaning "show the cat's bell" in English). A similar pair of utterances in English would be, for example, BLUEbell and BLUE BELL ([ˈbluːbɛl] and [ˈbluː ˈbɛl], respectively). As the participants heard the utterance (supplementary audio files 1 and 2), they were presented with two pictures on the screen (see Figure 2) and the task was to choose which picture matched the utterance they heard by pressing a button. There were six different pairs of utterances (a compound word and a phrase).

FIGURE 2 | Example of the word stress task. The left picture represents the compound word kissankello and the right picture the phrase kissan kello.

The utterances were spoken by four different people: an adult male, an adult female, a female child of 10 years, and a female child of 7 years. The original Finnish test version used by Torppa et al. (2010) had 48 trials. For the present study a shorter 30-trial version was made by excluding 18 trials, of which 2 were found to be too difficult and 16 too easy for the nine healthy adult participants in a pilot study. The duration of the test was ca 4–5 min. The test was carried out using Presentation software (www.neurobs.com).

Visuospatial perception
Visuospatial perception was assessed by a test that was developed for this study as a visuospatial analogy for the MBEA Scale subtest. The stimuli were created and the test was conducted using Matlab and the Psychophysics Toolbox extension (Brainard, 1997). In each trial the participants were presented with two series of Gabor patches (contrast 75%; spatial frequency ca. 0.8 c/°; size approximately 2°) proceeding from left to right. There was a 500 ms pause between the two series. A single Gabor was presented at a time (there was a 50 ms pause between two Gabors, and the duration of each Gabor varied) and the Gabors formed a continuous path.


The path was formed by simultaneously changing the position and the orientation of the Gabor relative to the preceding Gabor. The orientation of the Gabor followed the direction of the path. On half of the trials the two Gabor series were identical; on the other half the second path was changed (Figure 3, Supplementary movie files 1 and 2). In change trials the second series had one Gabor that deviated from the expected path (Figure 3B, supplementary movie file 2). The participants' task was to judge whether the two paths were similar or different. The paths were constructed as analogous to the melodies in the MBEA Scale subtest: each Gabor was analogous to a tone in the melody and each deviating Gabor was analogous to an out-of-scale tone. Every semitone difference in the melody was equivalent to a 12° difference in the Gabor orientation and a corresponding change in Gabor location, except for the deviant Gabor, which had a 22° location change per semitone. The orientation change, 12°, was within the association field of contour integration (Field et al., 1993). Like the MBEA Scale test, the test began with two example trials: one trial with two similar series and one trial with a difference in the second series. The experiment had 30 trials of which 15 contained two similar series and 15 contained a deviant figure in the second series. In a pilot study with 11 participants, the type (location, orientation, both) and the size (4–22°) of the deviant Gabor change were varied. From the different types and sizes, the deviant change (location, 22°) was chosen to match the level of difficulty of the MBEA Scale test (Peretz et al., 2003; norms updated in 2008). The duration of the test was ca 10 min.

FIGURE 3 | Example of the visuospatial task with the original sequence of Gabor figures (A) and a sequence with a change in the location and orientation of one of the Gabor figures (B). Note that in the actual test, only a single Gabor was presented at a time.
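For readers unfamiliar with the stimulus type, the sketch below generates one Gabor patch (a sinusoidal grating under a Gaussian envelope) and a path of orientations using the paper's 12° per semitone mapping. The study's stimuli were produced in Matlab with Psychophysics Toolbox; this numpy version and all parameter values other than the 75% contrast and the 12° step are illustrative assumptions.

```python
import numpy as np

def gabor_patch(size_px=128, cycles_per_patch=4.0, theta_deg=0.0,
                contrast=0.75, sigma_frac=0.15):
    """Luminance image of a Gabor: cosine grating times a Gaussian envelope."""
    half = size_px // 2
    y, x = np.mgrid[-half:half, -half:half].astype(float)
    theta = np.deg2rad(theta_deg)
    # Coordinate along the grating's modulation axis
    xt = x * np.cos(theta) + y * np.sin(theta)
    grating = np.cos(2.0 * np.pi * cycles_per_patch * xt / size_px)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * (sigma_frac * size_px) ** 2))
    # Mid-gray background with the grating modulated around it
    return 0.5 + 0.5 * contrast * grating * envelope

# A path analogous to a melody: each interval re-orients the next Gabor
# by 12 degrees per semitone (the intervals below are hypothetical).
semitone_steps = [2, -1, 3, -2]
orientations = np.cumsum([12.0 * s for s in semitone_steps])
patches = [gabor_patch(theta_deg=t) for t in orientations]
```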

Pitch perception
The pitch perception test was a shortened adaptation of the test used by Hyde and Peretz (2004) and it was carried out using Presentation software (www.neurobs.com). In every trial the subjects heard a sequence of five successive tones and their task was to judge if all five tones were similar or if there was a change in pitch. The duration of a tone was always 100 ms and the inter-tone interval (ITI; onset to onset) was 350 ms. In the standard sequence, all tones were played at the pitch level of C6 (1047 Hz) and in the sequences that contained a change, the fourth tone was altered. The altered tones were 1/16, 1/8, 1/4, 1/2, or 1 semitone (3, 7, 15, 30 or 62 Hz) upward or downward from C6. The different change sizes and the changes upward and downward were presented equally many times. The order of the trials was randomized. The test contained 80 trials: 40 standard sequences and 40 sequences with the fourth tone altered in pitch. Three example trials are presented in the Supplementary files: a standard trial with no change (supplementary audio file 3) and two change trials (1 semitone upwards, audio file 4, and downwards, audio file 5). The test was substantially shorter than the test by Hyde and Peretz (2004). It also contained smaller pitch changes, because the difficulty level was set to match participants who were not recruited for having problems in the perception of music. The duration of the test was ca 3–4 min.
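The Hz values in parentheses follow directly from the semitone offsets: a change of s semitones from a base frequency f corresponds to f·(2^(s/12) − 1) Hz. A quick check (the rounding to whole Hz is my assumption about how the paper's values were derived):

```python
BASE_HZ = 1047.0  # C6, the pitch level of the standard sequence

def semitone_offset_hz(base_hz: float, semitones: float) -> float:
    """Absolute frequency change (Hz) for a shift of `semitones` from base_hz."""
    return base_hz * (2.0 ** (semitones / 12.0) - 1.0)

for s in (1 / 16, 1 / 8, 1 / 4, 1 / 2, 1):
    print(f"{s:6.4f} semitone -> {semitone_offset_hz(BASE_HZ, s):5.1f} Hz")
# Prints ~3.8, 7.6, 15.2, 30.7, and 62.3 Hz, i.e., the paper's rounded
# 3, 7, 15, 30, and 62 Hz values.
```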

Auditory working memory
Auditory working memory and attention span were measured with the Digit Span subtest of the Wechsler Adult Intelligence Scale III (WAIS-III; Wechsler, 1997). In the first part of the test, the participants hear a sequence of numbers read by the researcher and their task is to repeat the numbers in the same order. In the second part the task is to repeat the number sequence in reverse order. The test proceeds from the shortest sequences (two numbers) to the longer ones (max. nine numbers in the first and eight numbers in the second part of the test). Every sequence that the participant repeats correctly is scored as one point and the maximum total score is 30. The duration of the test was ca 5 min.

Questionnaires
The subjects filled out two questionnaires: a computerized questionnaire after the music perception test (same as in Peretz et al., 2008) as well as a paper questionnaire at the end of the assessment session. In the questionnaires the participants were asked about their musical and general educational background; cognitive problems; and musical abilities, hobbies, and preferences (see Appendix: Data Sheet 1). The last part of the paper questionnaire was the Brief Music in Mood Regulation scale (Saarikallio, 2012). The links between music perception, different kinds of musical hobbies, and music in mood regulation will be presented elsewhere in more detail; in the present study, only questions regarding first language, cognitive problems, years of musical and general education, and education level were analyzed.

STATISTICAL ANALYSIS

The associations between the MBEA scores and background variables were first examined using t-tests, ANOVAs, and Pearson correlation coefficients, depending on the variable type. The variables that had significant associations with the music perception scores were then included in further analysis. Pitch perception and auditory working memory were also regarded as possible confounding variables and controlled for when examining the associations that word stress and visuospatial perception had with music perception. Linear step-wise regression analyses were performed to see how much the different variables could explain the variation of the music perception total score and subtest scores. All statistical analyses were performed using PASW Statistics 18.

RESULTS

DESCRIPTIVE STATISTICS OF THE MBEA AND OTHER TESTS

Table 3 presents the ranges, means, and standard deviations of the music perception scores. Total music perception scores were calculated as the mean score across the three subtests. Discrimination (d′) and response bias [ln(β)] indexes for the subtests were also calculated. The analysis of d′ yielded highly similar associations to other variables as the proportion of correct answers (hit rate + correct rejections) and hence only the latter is reported. There was no significant correlation between response bias and the proportion of correct answers in the music perception total score [r(59) = 0.18, p = 0.17]. There was a small response bias toward congruous responses in the Off-beat [t(60) = 15.23, p < 0.001] and Out-of-key subtests [t(60) = 5.07, p < 0.001].

ASSOCIATIONS BETWEEN THE MUSIC PERCEPTION TEST AND DEMOGRAPHICAL AND MUSICAL BACKGROUND VARIABLES

[…] (p > 0.05 in all t-tests). First language was also not significantly associated with word stress perception, t(59) = 1.08, p = 0.29. Suspected or mild hearing problems were not significantly associated with either the pitch discrimination threshold [t(59) = 0.52, p = 0.61] or word stress perception [t(59) = 0.55, p = 0.59]. The associations with the music perception total score are shown in Table 5. However, owing to the relatively small number of self-reported cognitive problems, possible associations cannot be reliably ruled out for most problems.
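For reference, the d′ and ln(β) indexes mentioned above are the standard signal detection measures; a minimal sketch, assuming hit and false-alarm proportions have already been computed per subtest (the exact correction the authors used for proportions of 0 or 1, if any, is not described):

```python
from statistics import NormalDist

def sdt_indexes(hit_rate: float, fa_rate: float):
    """Discrimination d' and response bias ln(beta) from signal detection theory.

    d'       = z(H) - z(F)
    ln(beta) = (z(F)**2 - z(H)**2) / 2
    Rates of exactly 0 or 1 need a correction (e.g., 1/(2N)) before use.
    """
    z = NormalDist().inv_cdf
    zh, zf = z(hit_rate), z(fa_rate)
    return zh - zf, (zf ** 2 - zh ** 2) / 2.0

# Example with made-up counts: 13 hits and 3 false alarms out of 15 each.
d_prime, ln_beta = sdt_indexes(13 / 15, 3 / 15)
print(f"d' = {d_prime:.2f}, ln(beta) = {ln_beta:.2f}")
```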

Age was not linearly correlated with the music perception total score [r(59) = 0.03, p = 0.79], but when the age groups were compared to each other using an ANOVA, a significant association was found [F(3, 57) = 6.21, p = 0.001]. The music perception score seemed to rise until the age group of 40–49 years, but the age group of 50–59 years had the lowest scores. A post hoc test (Tukey HSD) showed that the age group 40–49 years had significantly higher music perception scores than the groups 19–29 years (p = 0.004) and 50–59 years (p = 0.002). The average music perception scores of the age groups are shown in Table 6.

Level of education did not differentiate the participants regarding their music perception scores [F(3, 57) = 1.81, p = 0.16], and neither were education years significantly correlated with music perception [r(56) = 0.10, p = 0.46]. The participants who had received some kind of music education in addition to the compulsory music lessons in school (N = 42) had higher music perception scores than those who had received only the compulsory lessons (N = 19) [t(59) = 2.75, p = 0.008]. The difference was 4.7% on average. The correlation between years of music education (0–19) and the total music perception score was significant [r(51) = 0.32, p = 0.019].

FIGURE 4 | Distributions of the music perception subtest and total scores.


    FIGURE 5 | Scatter plots indicating the relationships between the three music perception subtests.

    Table 4 | Other tests of perception and memory: basic descriptive statistics.

                                      Range              Mean         Standard deviation

Speech prosody perception             19–30 (63–100%)    25.0 (83%)   2.7 (9%)
Visuospatial perception               17–30 (57–100%)    23.8 (79%)   2.9 (10%)
Auditory working memory               10–22 (33–73%)     15.8 (53%)   3.0 (10%)
Pitch perception
  No change trials                    13–40 (33–100%)    32.4 (81%)   6.4 (16%)
  Change trials                       22–38 (55–95%)     30.3 (76%)   4.4 (11%)
    3 Hz change (1/16 semitone)       0–7 (0–88%)        2.7 (34%)    2.1 (26%)
    7 Hz change (1/8 semitone)        0–8 (0–100%)       4.5 (57%)    2.0 (25%)
    15 Hz change (1/4 semitone)       4–8 (50–100%)      7.1 (89%)    0.9 (12%)
    30 Hz change (1/2 semitone)       6–8 (75–100%)      7.8 (98%)    0.4 (5%)
    62 Hz change (1 semitone)         8 (100%)           8 (100%)     0 (0%)
Pitch discrimination threshold (Hz)   3.0–26.1           9.9          4.7

Table 5 | Background variables' associations with the music perception total score.

Background variable                              N       Mean music perception scores (%)   Significance of the difference
Gender: female/male                              40/21   84/82                              t(59) = 0.96, p = 0.34
First language: Finnish/Swedish                  58/3    84/81                              t(59) = 0.73, p = 0.47
Self-reported cognitive problems
  Problems in reading: yes/no                    5/53    83/84                              t(56) = 0.16, p = 0.87
  Attention problems: yes/no                     5/53    83/84                              t(56) = 0.30, p = 0.76
  Problems in speech: yes/no                     3/55    79/84                              t(57) = 1.40, p = 0.17
  Problems in mathematics: yes/no                12/45   83/84                              t(55) = 0.69, p = 0.49
  Memory problems: yes/no                        6/51    85/84                              t(55) = 0.43, p = 0.67
  Problems in visuospatial orientation: yes/no   5/52    82/84                              t(55) = 0.79, p = 0.43
Suspected or mild hearing problems: yes/no       12/49   81/84                              t(60) = 1.49, p = 0.14

ASSOCIATIONS BETWEEN MUSIC PERCEPTION, WORD STRESS PERCEPTION AND VISUOSPATIAL PERCEPTION

Table 7 shows the correlations between the possible confounding variables (pitch perception, auditory working memory, music education, and general education) and word stress perception, visuospatial perception, and music perception.

Step-wise regression analyses were performed to see how much the different variables could explain the variation of the music perception total score and subtests. Four different models of predictors were examined: first the possibly confounding background variables, then the possibly confounding variables measured by tests, and lastly the test scores that were the main interest of this study. In the first model, age group (under/over 50 years) and music education (no/yes) were used as predictors. These background variables were included in the regression analysis because they were significantly associated with the music perception scores. Second, pitch discrimination threshold and auditory working memory score were added to the model. Third, the visuospatial perception score was added as a predictor. Finally, the word stress score was added to the model.

Table 6 | Average music perception scores of the age groups.

Age group (years)   N    Music perception score mean (sd) (%)
19–29               17   81.2 (6.2)
30–39               14   85.0 (6.8)
40–49               12   89.0 (3.2)
50–59               18   80.7 (6.5)



Table 7 | Correlations between speech prosody and visuospatial perception, music perception and possible confounding variables.

                                             Word stress   Visuospatial   Music perception (total)
Pitch perception: change trials (df = 59)    0.01          0.05           0.31*
  No change trials (df = 59)                 0.06          0.07           0.15
  All trials (df = 59)                       0.08          0.14           0.09
Pitch discrimination threshold (df = 59)     0.13          0.03           0.32**
Auditory working memory (df = 59)            0.26*         0.10           0.10
  Digit span forwards (df = 59)              0.26*         0.07           0.07
  Digit span backwards (df = 59)             0.13          0.11           0.06
Music education (years) (df = 51)            0.12          0.02           0.32*
General education (years) (df = 56)          0.08          0.11           0.10

**p < 0.01; *p < 0.05.

Table 8 shows the regression analyses, including the coefficients of determination (R²) of the different models. As can be seen from the R² change in the regression analysis for the total music perception score, both visuospatial perception and word stress perception explained about 8% of the variation of the total music perception score while controlling for music education, age, auditory working memory, and pitch discrimination threshold. Music education and pitch discrimination threshold were also significant predictors.
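The four nested models in Table 8 amount to a hierarchical (block-wise) regression in which R² change is tracked as each block of predictors enters. A minimal sketch with statsmodels, assuming a DataFrame with one column per variable (the column names are mine, not the paper's):

```python
import pandas as pd
import statsmodels.api as sm

# Predictor blocks entered in the order used in Table 8.
BLOCKS = [
    ["music_education", "age_group"],       # model 1: background variables
    ["working_memory", "pitch_threshold"],  # model 2 adds test-based confounds
    ["visuospatial"],                       # model 3 adds visuospatial score
    ["word_stress"],                        # model 4 adds word stress score
]

def hierarchical_r2(df: pd.DataFrame, outcome: str) -> None:
    """Fit nested OLS models, printing R^2 and R^2 change per block."""
    predictors, prev_r2 = [], 0.0
    for block in BLOCKS:
        predictors = predictors + block
        fit = sm.OLS(df[outcome], sm.add_constant(df[predictors])).fit()
        print(f"{predictors}: R2 = {fit.rsquared:.2f} "
              f"(change = {fit.rsquared - prev_r2:.2f})")
        prev_r2 = fit.rsquared
```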

When the Scale subtest was analyzed separately, age group was a significant predictor in the first model, but the further regression models were not significant. Visuospatial perception had only a marginally significant association with the Scale subtest that was analogous with it. The final regression model for the Out-of-key subtest was significant and explained 24% of the variance. The most significant predictor was music education. In the regression analysis of the Off-beat subtest, the final model was significant and explained 33% of the variance. The most significant predictor was word stress perception, which alone explained 9% of the variance. Figure 6 shows that word stress perception correlated highly significantly with the music perception total score [r(59) = 0.34, p = 0.007] and with the Off-beat score [r(59) = 0.39, p = 0.002], but not with the Scale and Out-of-key scores.

DISCUSSION

The most important finding of this study is the association found between the perception of music and speech prosody, more specifically word stress. Auditory working memory, pitch perception abilities, or background variables like musical education did not explain this association. This finding gives support to the hypothesis that the processing of music and speech is to some extent based on shared neural resources. The association was found in a normal, healthy population and thus strengthens the generalizability of the associations previously found in musicians and in people with problems in the perception of music or language.

The most powerful background variable influencing music perception was music education. Age was also found to be related to music perception, as Peretz et al. (2008) also found, but in the present study the association was not linear. Older persons' lower performance in the music perception test might be partly explained by less music education; however, this does not explain the finding that the youngest age group also had lower than average performance. However, the relation between age group and music perception was not very strong, as age group was not a significant predictor in the regression analysis including other, more strongly related variables. Gender, general education, and self-reported cognitive problems were not associated with the music perception scores. Music education and age group (under/over 50 years) were controlled for in the statistical analysis and did not affect the associations that were the main findings of this study. Auditory working memory was significantly associated only with the word stress task and did not explain any of the relations that were found.

ASSOCIATION BETWEEN MUSIC AND SPEECH PROSODY

Patel (2012) argues that the apparent contradiction between the dissociation between speech and music perception found in brain damage studies (Peretz, 1990; Peretz and Kolinsky, 1993; Griffiths et al., 1997; Dalla Bella and Peretz, 1999) and the associations found in brain imaging studies (Patel et al., 1998; Steinhauer et al., 1999; Koelsch et al., 2002; Tillmann et al., 2003; Knösche et al., 2005; Schön et al., 2010; Abrams et al., 2011; Rogalsky et al., 2011) may be explained by a resource-sharing framework. According to this framework, music and speech have separate representations in long-term memory, and damage to these representations may lead to a specific deficit of musical or linguistic cognition. However, in the normal brain, music and language also share neural resources in similar cognitive operations. In the introduction we also pointed out that the enhanced abilities in music and speech may be based on transfer of training (Besson et al., 2011); however, as the association found in this study was significant after controlling for musical training, our results may be best interpreted as support for the hypothesis of shared neural resources.

The most important difference between the neural bases of processing speech and music is that, at least in most right-handed persons, music is processed dominantly in the right hemisphere while speech is dominantly processed in the left hemisphere (Tervaniemi and Hugdahl, 2003; Zatorre and Gandour, 2008). Zatorre and Gandour (2008) suggest that this difference may not indicate an abstract difference between the domains of speech and music, but could be explained by the acoustic differences: the left hemisphere is specialized in processing the fast temporal acoustic changes that are important for speech perception, while the fine-grained pitch changes that are important for music perception are more precisely processed by the right hemisphere. However, although the temporal changes that are central in the processing of musical rhythm are relatively slower than those typical in speech, temporal processing is important in processing both speech and musical rhythm. It is thus worth investigating whether the processing of musical rhythm may have even closer associations to speech processing than the melodic aspect of music.


Table 8 | Regression analysis.

Model   Variable                         Beta    t        F(df)                R²     R² change

MUSIC PERCEPTION TOTAL SCORE
1                                                         F(2, 58) = 5.21**    0.15   0.15
        Music education                  0.28    2.27*
        Age group                        0.20    1.62
2                                                         F(4, 56) = 3.77**    0.21   0.06
        Music education                  0.28    2.25*
        Age group                        0.14    1.07
        Auditory working memory          0.01    0.05
        Pitch discrimination threshold   0.25    2.07*
3                                                         F(5, 55) = 4.43**    0.29   0.08
        Music education                  0.28    2.37*
        Age group                        0.08    0.61
        Auditory working memory          0.02    0.19
        Pitch discrimination threshold   0.26    2.23*
        Visuospatial perception          0.28    2.40*
4                                                         F(6, 54) = 5.27***   0.37   0.08
        Music education                  0.28    2.47*
        Age group                        0.09    0.74
        Auditory working memory          0.10    0.85
        Pitch discrimination threshold   0.23    2.03*
        Visuospatial perception          0.27    2.42*
        Word stress perception           0.30    2.65*

SCALE SUBTEST
1                                                         F(2, 58) = 3.67*     0.11   0.11
        Music education                  0.15    1.20
        Age group                        0.26    2.00*
2                                                         F(4, 56) = 1.78      0.11   0.00
        Music education                  0.15    1.15
        Age group                        0.25    1.79+
        Auditory working memory          0.01    0.10
        Pitch discrimination threshold   0.04    0.30
3                                                         F(5, 55) = 2.05+     0.16   0.05
        Music education                  0.15    1.19
        Age group                        0.20    1.44
        Auditory working memory          0.00    0.01
        Pitch discrimination threshold   0.05    0.36
        Visuospatial perception          0.22    1.71+
4                                                         F(6, 54) = 1.85      0.17   0.01
        Music education                  0.15    1.18
        Age group                        0.20    1.47
        Auditory working memory          0.03    0.22
        Pitch discrimination threshold   0.03    0.25
        Visuospatial perception          0.21    1.67
        Word stress perception           0.12    0.91

OUT-OF-KEY SUBTEST
1                                                         F(2, 58) = 3.35*     0.10   0.10
        Music education                  0.31    2.40*
        Age group                        0.04    0.31
2                                                         F(4, 56) = 2.77*     0.17   0.06
        Music education                  0.32    2.51*
        Age group                        0.01    0.04
        Auditory working memory          0.12    0.98
        Pitch discrimination threshold   0.23    1.82+
3                                                         F(5, 55) = 2.55*     0.19   0.02
        Music education                  0.32    2.54*
        Age group                        0.03    0.21
        Auditory working memory          0.13    1.06
        Pitch discrimination threshold   0.24    1.87+
        Visuospatial perception          0.15    1.23
4                                                         F(6, 54) = 2.87*     0.24   0.05
        Music education                  0.32    2.58*
        Age group                        0.00    0.00
        Auditory working memory          0.20    1.53
        Pitch discrimination threshold   0.21    1.68+
        Visuospatial perception          0.14    1.18
        Word stress perception           0.24    1.96+

OFF-BEAT SUBTEST
1                                                         F(2, 58) = 1.67      0.05   0.05
        Music education                  0.14    1.07
        Age group                        0.15    1.15
2                                                         F(4, 56) = 2.83*     0.17   0.11
        Music education                  0.12    0.90
        Age group                        0.04    0.33
        Auditory working memory          0.13    1.06
        Pitch discrimination threshold   0.32    2.52*
3                                                         F(5, 55) = 3.33*     0.23   0.06
        Music education                  0.11    0.96
        Age group                        0.01    0.10
        Auditory working memory          0.12    0.97
        Pitch discrimination threshold   0.33    2.67*
        Visuospatial perception          0.26    2.15*
4                                                         F(6, 54) = 4.34**    0.33   0.09
        Music education                  0.12    1.18
        Age group                        0.00    0.00
        Auditory working memory          0.04    0.32
        Pitch discrimination threshold   0.29    2.48*
        Visuospatial perception          0.25    2.15*
        Word stress perception           0.32    2.73**

***p < 0.001; **p < 0.01; *p < 0.05; +p < 0.10.


Rhythm as a new link
Shared neural mechanisms have thus far been found especially concerning pitch perception in music and speech: congenital amusia has been found to be associated with problems in perceiving speech intonation (Patel et al., 2005, 2008b; Jiang et al., 2010; Liu et al., 2010), emotional prosody (Thompson, 2007; Thompson et al., 2012) and the discrimination of lexical tones (Nan et al., 2010). It seems likely that the processing of coarse-grained pitch in music and speech relies on a shared mechanism (Zatorre and Baum, 2012). However, in this study the aim was to find out whether music and speech perception may be associated not only in the melodic but also in the rhythmic aspect. Indeed, the strongest association that the word stress test had with music perception was with the Off-beat subtest, which measures the ability to perceive unusual time delays in music. This gives support to the hypothesis that the perception of rhythm in music and speech is connected.


FIGURE 6 | Scatter plots indicating the relationships between the word stress task and the music perception test.

Word stress perception was not a significant predictor of performance in the Scale subtest, and for performance in the Out-of-key subtest it was only a marginally significant predictor. The marginally significant relation with the Out-of-key subtest, which measures the ability to notice melodically deviant tones, suggests that melodic cues affect word stress perception to some extent, but as the pitch perception threshold was not associated with the word stress perception scores, it seems likely that the word stress test does not rely on fine-grained pitch perception. This finding suggests that pitch is not the only auditory cue relating the perception of music and speech: the relation can also be found via rhythm perception.

Although previous research has focused more on studying the melodic than the rhythmic aspect of music and speech, there are studies showing that rhythm might actually be a strong link between the two domains. For instance, musicians have been found to perceive the metric structure of words more precisely than non-musicians (Marie et al., 2011b). Also, classical music composed by French and English composers differs in its use of rhythm: the musical rhythm is associated with the rhythm of speech in the French and English languages (Patel and Daniele, 2003; the duration-contrast measure used in that comparison is sketched after this paragraph). In a study that investigated the effects of melodic and rhythmic cues on non-fluent aphasics' speech production, the results suggest that rhythm may actually be more crucial than melody (Stahl et al., 2011). Also, children with reading or language problems have been found to have auditory difficulties in rhythmic processing: deficits in phonological processing are associated with problems in the auditory processing of amplitude rise time, which is critical for rhythmic timing in both language and music (Corriveau et al., 2007; Corriveau and Goswami, 2009; Goswami et al., 2010; Goswami, 2012). Musical metrical sensitivity has been found to predict phonological awareness and reading development, and it has been suggested that dyslexia may have its roots in a temporal processing deficit (Huss et al., 2011). Moreover, the members of the KE family, who suffer from a genetic developmental disorder of speech and language, have been found to have problems also in the perception and production of rhythm (Alcock et al., 2000). The affected family members have both expressive and receptive language difficulties, and the disorder has been connected to a mutation in the FOXP2 gene that has been considered language-specific (Lai et al., 2001). Alcock et al. (2000) studied the musical abilities of the affected members and found that their perception and production of pitch did not differ from the control group while their rhythmic abilities were significantly lower, suggesting that the disorder might be based on difficulties of temporal processing. However, as Besson and Schön (2012) point out, genetics can only provide limited information about the modularity of music and language.
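The durational measure underlying the Patel and Daniele (2003) comparison is the normalized pairwise variability index (nPVI), which quantifies the contrast between successive event durations; a minimal sketch of the computation, using made-up duration sequences purely for illustration, is given below.

    # Normalized pairwise variability index (nPVI) over successive durations
    # (note values in music, vowel durations in speech). Higher values mean
    # stronger long-short contrast between neighboring events.
    def npvi(durations):
        pairs = zip(durations[:-1], durations[1:])
        terms = [abs(a - b) / ((a + b) / 2.0) for a, b in pairs]
        return 100.0 * sum(terms) / len(terms)

    # Made-up examples: alternating long-short durations (a more
    # "stress-timed" pattern, as in English) score higher than
    # near-isochronous ones (as in French).
    print(npvi([250, 120, 260, 110]))  # high durational contrast (~75)
    print(npvi([180, 170, 185, 175]))  # low durational contrast (~7)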

Neuroimaging and neuropsychological studies show that musical rhythm is processed more evenly in both hemispheres whereas the melodic aspect is more clearly lateralized to the right hemisphere, at least in most right-handed persons (Peretz and Zatorre, 2005). Because language is more left-lateralized, it is reasonable to hypothesize that it may share mechanisms with musical rhythm. Also, the perception and production of speech and music have been found to be neurally connected (Rauschecker and Scott, 2009), especially when processing rhythm: motor regions like the cerebellum, the basal ganglia, the supplementary motor area, and the premotor cortex are central in the processing of both musical rhythm (Grahn and Brett, 2007; Chen et al., 2008) and speech rhythm (Kotz and Schwartze, 2010). Stewart et al. (2006) proposed that the importance of motor regions gives support to a motor theory of rhythm perception, as a parallel to the motor theory of speech perception (Liberman and Mattingly, 1985): the perception of rhythm might be based on the motor mechanisms required for its production. Similar developmental mechanisms for speech and music might explain why training in one modality causes improvement in the other (Patel, 2008). The association between the perception of rhythm in speech and music can also be related to dynamic attending theory, which proposes that the allocation of attention depends on synchronization between internal oscillations and external temporal structure (Jones, 1976; Large and Jones, 1999). Recent neuroimaging studies have found evidence that attending to and predictive coding of specific time scales is indeed important in speech perception (Kotz and Schwartze, 2010; Luo and Poeppel, 2012). Kubanek et al. (2013) found that the temporal envelope of speech, which is critical for understanding speech, is robustly tracked in belt areas at an early stage of the auditory pathway, and that the same areas are activated also when processing the temporal envelope of non-speech sounds.
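As an illustration of this cue, a temporal envelope can be extracted with a few standard signal-processing steps; the sketch below is one common approach among several (assuming a mono waveform), taking the magnitude of the analytic signal and low-pass filtering it so that only the slow, syllable-rate amplitude modulations remain.

    import numpy as np
    from scipy.signal import hilbert, butter, filtfilt

    def temporal_envelope(waveform, sample_rate, cutoff_hz=30.0):
        # Instantaneous amplitude via the analytic (Hilbert-transformed) signal...
        envelope = np.abs(hilbert(waveform))
        # ...smoothed to retain only modulations below cutoff_hz.
        b, a = butter(4, cutoff_hz / (sample_rate / 2.0), btype="low")
        return filtfilt(b, a, envelope)

    # Example: a 440 Hz tone with a 300 ms amplitude ramp, i.e., a gradual
    # rise time of the kind discussed in the dyslexia literature above.
    sr = 16000
    t = np.arange(0, 1.0, 1.0 / sr)
    tone = np.minimum(t / 0.3, 1.0) * np.sin(2 * np.pi * 440.0 * t)
    env = temporal_envelope(tone, sr)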

Studies on acquired (Peretz, 1990; Peretz and Kolinsky, 1993; Di Pietro et al., 2004) and congenital amusia (Hyde and Peretz, 2004; Thompson, 2007; Peretz et al., 2008; Phillips-Silver et al., 2011) have found double dissociations between the deficits of melody and rhythm perception in music. Even though congenital amusia is considered to be mainly a deficit of fine-grained pitch perception (Hyde and Peretz, 2004), there are cases of congenital amusia in which the main problem lies in rhythm perception (Peretz et al., 2003; Thompson, 2007; Phillips-Silver et al., 2011). It has been proposed that there are different types of congenital amusia and that the speech perception deficits associated with congenital amusia might only concern a subgroup of amusics (Patel et al., 2008b), whereas some might actually have a specific deficit in rhythm perception (Thompson, 2007; Phillips-Silver et al., 2011). In the present study, the Off-beat subtest, which measures the perception of rhythmic deviations, was not significantly associated with the other music perception subtests measuring the perception of melodic deviations, which further strengthens the hypothesis that rhythm and melody perception can be independent.

Taken together, we found evidence that the perception of speech prosody could be associated with the perception of music via the perception of rhythm, and that the perception of rhythm and melody are separable. This raises the question of whether the type of acoustic property (melody or rhythm) of the stimuli under focus might sometimes orient the perception process more than the category (speech or music). This hypothesis is in line with Patel's (2012) resource-sharing framework, which suggests that cognitive operations might share brain mechanisms while the domains may have separate representations in long-term memory.

    MUSIC AND VISUOSPATIAL PERCEPTION: IS THERE AN ASSOCIATION?

Because the perception of musical pitch may be spatial in nature (Rusconi et al., 2006), the possibility of an association between music and visuospatial perception has been suggested. Previous research results concerning this association are somewhat contradictory: congenital amusics may have lower than average abilities in visuospatial processing (Douglas and Bilkey, 2007), but this effect has not been replicated (Tillmann et al., 2010; Williamson et al., 2011). In the present study the hypothesis was investigated by creating a visuospatial task that was analogous to the Scale subtest of the MBEA. In the regression analysis where the possible confounding variables were controlled for, the visuospatial test was only a marginally significant predictor of the Scale subtest. Also, self-reported problems in visuospatial perception were not found to be significantly associated with the music perception scores. However, the visuospatial test was a significant predictor of the music perception total score and the Off-beat subtest score when pitch perception, short-term memory, age group, and music education were controlled for. It is possible that these associations may be at least partly explained by some confounding factor (e.g., attention), which we were not able to control for here. Because the expected association between the analogous tests of music and visuospatial perception was not significant, the present results remain somewhat inconclusive concerning the link between music perception and visuospatial processing, and more research is still needed.

    CONCLUSIONS

The main result of this study is the observed strong association between music and word stress perception in healthy subjects. Our findings strengthen the hypothesis that music and speech perception are linked and show that this link does not exist only via the perception of pitch, as found in former studies, but also via the rhythmic aspect. The study also replicated former findings of the independence of rhythm and melody perception. Taken together, our results raise the interesting possibility that the perception of rhythm and melody could be more clearly separable than music and speech, at least in some cases. However, our data cannot provide any definitive answers, and it is clear that more work is still needed.

In the future, more research is still needed to better understand the association between the processing of rhythm or meter in speech and music. The perception of rhythm comprises many aspects, and it is important to find out exactly which characteristics of the rhythm of speech and music are processed similarly. One possibility is to study communication forms in which rhythm is a central variable for both speech and music: rap music might be one example. This line of research would be interesting and fruitful in delineating the commonalities and cross-boundaries between music and speech in the temporal and spectral domains.

ACKNOWLEDGMENTS
We would like to thank Andrew Faulkner for his helpful advice in adapting the word stress test into the Finnish language, Miika Leminen for his help in carrying out the Finnish version of the word stress test, Valtteri Wikström for programming the visuospatial perception test, Tommi Makkonen for programming the pitch perception test, Jari Lipsanen for statistical advice, and Mari Tervaniemi for valuable help in designing this study and discussing the results.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: http://www.frontiersin.org/Auditory_Cognitive_Neuroscience/10.3389/fpsyg.2013.00566/abstract

REFERENCES
Abrams, D. A., Bhatara, A., Ryali, S., Balaban, E., Levitin, D. J., and Menon, V. (2011). Decoding temporal structure in music and speech relies on shared brain resources but elicits different fine-scale spatial patterns. Cereb. Cortex 21, 1507–1518. doi: 10.1093/cercor/bhq198
Alcock, J. A., Passingham, R. E., Watkins, K., and Vargha-Khadem, F. (2000). Pitch and timing abilities in inherited speech and language impairment. Brain Lang. 75, 34–46. doi: 10.1006/brln.2000.2323
Basso, A., and Capitani, E. (1985). Spared musical abilities in a conductor with global aphasia and ideomotor apraxia. J. Neurol. Neurosurg. Psychiatry 48, 407–412. doi: 10.1136/jnnp.48.5.407
Besson, M., Chobert, J., and Marie, C. (2011). Transfer of training between music and speech: common processing, attention, and memory. Front. Psychol. 2:94. doi: 10.3389/fpsyg.2011.00094
Besson, M., and Schön, D. (2012). "What remains of modularity?", in Language and Music as Cognitive Systems, eds P. Rebuschat, M. Rohrmeier, J. Hawkins, and I. Cross (Oxford, UK: Oxford University Press), 283–291.

Bidelman, G. M., Gandour, J. T., and Krishnan, A. (2011). Cross-domain effects of music and language experience on the representation of pitch in the human auditory brainstem. J. Cogn. Neurosci. 23, 425–434. doi: 10.1162/jocn.2009.21362

Boersma, P. (2001). Praat, a system for doing phonetics by computer. Glot Int. 5, 341–345.
Brainard, D. H. (1997). The psychophysics toolbox. Spat. Vis. 10, 433–436. doi: 10.1163/156856897X00357
Brandt, A., Gerbian, M., and Slevc, L. R. (2012). Music and early language acquisition. Front. Psychol. 3:327. doi: 10.3389/fpsyg.2012.00327
Brochard, R., Dufour, A., and Després, O. (2004). Effect of musical expertise on visuospatial abilities: evidence from reaction times and mental imagery. Brain Cogn. 54, 103–109. doi: 10.1016/S0278-2626(03)00264-1
Cason, N., and Schön, D. (2012). Rhythmic priming enhances the phonological processing of speech. Neuropsychologia 50, 2652–2658. doi: 10.1016/j.neuropsychologia.2012.07.018
Chartrand, J.-P., and Belin, P. (2006). Superior voice timbre processing in musicians. Neurosci. Lett. 405, 164–167. doi: 10.1016/j.neulet.2006.06.053
Chen, J. L., Penhune, V. B., and Zatorre, R. J. (2008). Listening to musical rhythms recruits motor regions of the brain. Cereb. Cortex 18, 2844–2854. doi: 10.1093/cercor/bhn042
Chobert, J., François, C., Velay, J.-L., and Besson, M. (2012). Twelve months of active musical training in 8- to 10-year-old children enhances the preattentive processing of syllabic duration and voice onset time. Cereb. Cortex. doi: 10.1093/cercor/bhs377. [Epub ahead of print].
Corriveau, K., Pasquini, E., and Goswami, U. (2007). Basic auditory processing skills and specific language impairment: a new look at an old hypothesis. J. Speech Lang. Hear. Res. 50, 647–666. doi: 10.1044/1092-4388(2007/046)
Corriveau, K. H., and Goswami, U. (2009). Rhythmic motor entrainment in children with speech and language impairments: tapping to the beat. Cortex 45, 119–130. doi: 10.1016/j.cortex.2007.09.008
Dalla Bella, S., and Peretz, I. (1999). Music agnosias: selective impairments of music recognition after brain damage. J. New Music Res. 28, 209–216. doi: 10.1076/jnmr.28.3.209.3108

Degé, F., and Schwarzer, G. (2011). The effect of a music program on phonological awareness in preschoolers. Front. Psychol. 2:124. doi: 10.3389/fpsyg.2011.00124
Deutsch, D., Henthorn, T., Marvin, E., and Xu, H. (2006). Absolute pitch among American and Chinese conservatory students: prevalence differences, and evidence for a speech-related critical period. J. Acoust. Soc. Am. 119, 719–722. doi: 10.1121/1.2151799
Di Pietro, M., Laganaro, M., Leemann, B., and Schnider, A. (2004). Receptive amusia: temporal auditory processing deficit in a professional musician following a left temporo-parietal lesion. Neuropsychologia 42, 868–877. doi: 10.1016/j.neuropsychologia.2003.12.004
Douglas, K. M., and Bilkey, D. K. (2007). Amusia is associated with deficits in spatial processing. Nat. Neurosci. 10, 915–921. doi: 10.1038/nn1925
Field, D. J., Hayes, A., and Hess, R. F. (1993). Contour integration by the human visual system: evidence for a local "association field". Vision Res. 33, 173–193. doi: 10.1016/0042-6989(93)90156-Q
François, C., Chobert, J., Besson, M., and Schön, D. (2012). Music training for the development of speech segmentation. Cereb. Cortex. doi: 10.1093/cercor/bhs180. [Epub ahead of print].
Gordon, R. L., Magne, C. L., and Large, E. W. (2011). EEG correlates of song prosody: a new look at the relationship between linguistic and musical rhythm. Front. Psychol. 2:352. doi: 10.3389/fpsyg.2011.00352
Goswami, U. (2012). "Language, music, and children's brains: a rhythmic timing perspective on language and music as cognitive systems," in Language and Music as Cognitive Systems, eds P. Rebuschat, M. Rohrmeier, J. Hawkins, and I. Cross (Oxford, UK: Oxford University Press), 292–301.
Goswami, U., Gerson, D., and Astruc, L. (2010). Amplitude envelope perception, phonology and prosodic sensitivity in children with developmental dyslexia. Read. Writ. 23, 995–1019. doi: 10.1007/s11145-009-9186-6
Grahn, J. A., and Brett, M. (2007). Rhythm and beat perception in motor areas of the brain. J. Cogn. Neurosci. 19, 893–906. doi: 10.1162/jocn.2007.19.5.893
Griffiths, T. D., Rees, A., Witton, C., Cross, P. M., Shakir, R. A., and Green, G. G. (1997). Spatial and temporal auditory processing deficits following right hemisphere infarction: a psychophysical study. Brain 120, 785–794. doi: 10.1093/brain/120.5.785
Houston, D., Santelmann, L., and Jusczyk, P. (2004). English-learning infants' segmentation of trisyllabic words from fluent speech. Lang. Cogn. Proc. 19, 97–136. doi: 10.1080/01690960344000143
Huss, M., Verney, J. P., Fosker, T., Mead, N., and Goswami, U. (2011). Music, rhythm, rise time perception and developmental dyslexia: perception of musical meter predicts reading and phonology. Cortex 47, 674–689. doi: 10.1016/j.cortex.2010.07.010
Hyde, K., and Peretz, I. (2004). Brains that are out of tune but in time. Psychol. Sci. 15, 356–360. doi: 10.1111/j.0956-7976.2004.00683.x

Jiang, C., Hamm, J. P., Lim, V. K., Kirk, I. J., and Yang, Y. (2010). Processing melodic contour and speech intonation in congenital amusics with Mandarin Chinese. Neuropsychologia 48, 2630–2639. doi: 10.1016/j.neuropsychologia.2010.05.009
Jones, L. J., Lucker, J., Zalewski, C., Brewer, C., and Drayna, D. (2009). Phonological processing in adults with deficits in musical pitch recognition. J. Commun. Disord. 42, 226–234. doi: 10.1016/j.jcomdis.2009.01.001
Jones, M. R. (1976). Time, our lost dimension: toward a new theory of perception, attention, and memory. Psychol. Rev. 83, 323–335. doi: 10.1037/0033-295X.83.5.323
Jusczyk, P. W., Houston, D. M., and Newsome, M. (1999). The beginnings of word segmentation in English-learning infants. Cogn. Psychol. 39, 159–207. doi: 10.1006/cogp.1999.0716
Juslin, P. N., and Laukka, P. (2003). Communication of emotions in vocal expression and music performance: different channels, same code? Psychol. Bull. 129, 770–814. doi: 10.1037/0033-2909.129.5.770
Knösche, T. R., Neuhaus, C., Haueisen, J., Alter, K., Maess, B., Witte, O. W., et al. (2005). Perception of phrase structure in music. Hum. Brain Mapp. 24, 259–273. doi: 10.1002/hbm.20088
Kochanski, G., Grabe, E., Coleman, J., and Rosner, B. (2005). Loudness predicts prominence: fundamental frequency lends little. J. Acoust. Soc. Am. 118, 1038–1054. doi: 10.1121/1.1923349
Koelsch, S., Gunter, T. C., von Cramon, D. Y., Zysset, S., Lohmann, G., and Friederici, A. D. (2002). Bach speaks: a cortical "language-network" serves the processing of music. Neuroimage 17, 956–966. doi: 10.1006/nimg.2002.1154
Koelsch, S., Kasper, E., Sammler, D., Schulze, K., Gunter, T., and Friederici, A. D. (2004). Music, language and meaning: brain signatures of semantic processing. Nat. Neurosci. 7, 302–307. doi: 10.1038/nn1197
Kotilahti, K., Nissilä, I., Näsi, T., Lipiäinen, L., Noponen, T., Meriläinen, P., et al. (2010). Hemodynamic responses to speech and music in newborn infants. Hum. Brain Mapp. 31, 595–603.
Kotz, S. A., and Schwartze, M. (2010). Cortical speech processing unplugged: a timely subcortico-cortical framework. Trends Cogn. Sci. 14, 392–399. doi: 10.1016/j.tics.2010.06.005
Kraus, N., and Chandrasekaran, B. (2010). Music training for the development of auditory skills. Nat. Rev. Neurosci. 11, 599–605. doi: 10.1038/nrn2882
Kubanek, J., Brunner, P., Gunduz, A., Poeppel, D., and Schalk, G. (2013). The tracking of speech envelope in the human cortex. PLoS ONE 8:e53398. doi: 10.1371/journal.pone.0053398
Lai, C. S., Fisher, S. E., Hurst, J. A., Vargha-Khadem, F., and Monaco, A. P. (2001). A forkhead-domain gene is mutated in a severe speech and language disorder. Nature 413, 519–523. doi: 10.1038/35097076
Large, E. W., and Jones, M. R. (1999). The dynamics of attending: how people track time-varying events. Psychol. Rev. 106, 119–159. doi: 10.1037/0033-295X.106.1.119
Liberman, A. M., and Mattingly, I. G. (1985). The motor theory of speech perception revised. Cognition 21, 1–36. doi: 10.1016/0010-0277(85)90021-6
Lieberman, P. (1960). Some acoustic correlates of word stress in American English. J. Acoust. Soc. Am. 32, 451–454. doi: 10.1121/1.1908095
Lima, C. F., and Castro, S. L. (2011). Speaking to the trained ear: musical expertise enhances the recognition of emotions in speech prosody. Emotion 11, 1021–1031. doi: 10.1037/a0024521
Liu, F., Patel, A. D., Fourcin, A., and Stewart, L. (2010). Intonation processing in congenital amusia: discrimination, identification and imitation. Brain 133, 1682–1693. doi: 10.1093/brain/awq089


Luo, H., and Poeppel, D. (2012). Cortical oscillations in auditory perception and speech: evidence for two temporal windows in human auditory cortex. Front. Psychol. 3:170. doi: 10.3389/fpsyg.2012.00170
Magne, C., Schön, D., and Besson, M. (2006). Musician children detect pitch violations in both music and language better than nonmusician children: behavioral and electrophysiological approaches. J. Cogn. Neurosci. 18, 199–211. doi: 10.1162/jocn.2006.18.2.199
Marie, C., Delogu, F., Lampis, G., Belardinelli, M. O., and Besson, M. (2011a). Influence of musical expertise on segmental and tonal processing in Mandarin Chinese. J. Cogn. Neurosci. 23, 2701–2715. doi: 10.1162/jocn.2010.21585
Marie, C., Magne, C., and Besson, M. (2011b). Musicians and the metric structure of words. J. Cogn. Neurosci. 23, 294–305. doi: 10.1162/jocn.2010.21413
Marie, C., Kujala, T., and Besson, M. (2012). Musical and linguistic expertise influence pre-attentive and attentive processing of non-speech sounds. Cortex 48, 447–457. doi: 10.1016/j.cortex.2010.11.006
Marques, C., Moreno, S., Castro, S. L., and Besson, M. (2007). Musicians detect pitch violation in a foreign language better than nonmusicians: behavioral and electrophysiological evidence. J. Cogn. Neurosci. 19, 1453–1463. doi: 10.1162/jocn.2007.19.9.1453
Mendez, M. (2001). Generalized auditory agnosia with spared music recognition in a left-hander: analysis of a case with a right temporal stroke. Cortex 37, 139–150. doi: 10.1016/S0010-9452(08)70563-X
Milovanov, R., Huotilainen, M., Välimäki, V., Esquef, P. A., and Tervaniemi, M. (2008). Musical aptitude and second language pronunciation skills in school-aged children: neural and behavioral evidence. Brain Res. 1194, 81–89. doi: 10.1016/j.brainres.2007.11.042
Mithen, S. (2005). The Singing Neanderthals: The Origins of Music, Language, Mind and Body. Cambridge: Harvard University Press.
Moreno, S., Marques, C., Santos, A., Santos, M., Castro, S. L., and Besson, M. (2009). Musical training influences linguistic abilities in 8-year-old children: more evidence for brain plasticity. Cereb. Cortex 19, 712–723. doi: 10.1093/cercor/bhn120
Morton, J., and Jassem, W. (1965). Acoustic correlates of stress. Lang. Speech 8, 159–181.
Musacchia, G., Sams, M., Skoe, E., and Kraus, N. (2007). Musicians have enhanced subcortical auditory and audiovisual processing of speech and music. Proc. Natl. Acad. Sci. U.S.A. 104, 15894–15898. doi: 10.1073/pnas.0701498104
Musacchia, G., Strait, D., and Kraus, N. (2008). Relationships between behavior, brainstem and cortical encoding of seen and heard speech in musicians and nonmusicians. Hear. Res. 241, 34–42. doi: 10.1016/j.heares.2008.04.013
Nan, Y., Sun, Y., and Peretz, I. (2010). Congenital amusia in speakers of a tone language: association with lexical tone agnosia. Brain 133, 2635–2642. doi: 10.1093/brain/awq178
Nooteboom, S. (1997). "The prosody of speech: melody and rhythm," in The Handbook of Phonetic Sciences, eds W. J. Hardcastle and J. Laver (Oxford, UK: Blackwell), 640–673.
O'Halpin, R. (2010). The Perception and Production of Stress and Intonation by Children with Cochlear Implants. Dissertation, University College London. Available online at: http://eprints.ucl.ac.uk/20406/

Patel, A. D. (2005). The relationship of music to the melody of speech and to syntactic processing disorders in aphasia. Ann. N.Y. Acad. Sci. 1060, 59–70. doi: 10.1196/annals.1360.005
Patel, A. D. (2008). Music, Language, and the Brain. New York, NY: Oxford University Press.
Patel, A. D. (2012). "Language, music, and the brain: a resource-sharing framework," in Language and Music as Cognitive Systems, eds P. Rebuschat, M. Rohrmeier, J. Hawkins, and I. Cross (Oxford, UK: Oxford University Press), 204–223.
Patel, A. D., and Daniele, J. R. (2003). An empirical comparison of rhythm in language and music. Cognition 87, B35–B45. doi: 10.1016/S0010-0277(02)00187-7
Patel, A. D., Gibson, E., Ratner, J., Besson, M., and Holcomb, P. J. (1998). Processing syntactic relations in language and music: an event-related potential study. J. Cogn. Neurosci. 10, 717–733. doi: 10.1162/089892998563121
Patel, A. D., Iversen, J. R., Wassenaar, M., and Hagoort, P. (2008a). Musical syntactic processing in agrammatic Broca's aphasia. Aphasiology 22, 776–789. doi: 10.1080/02687030701803804
Patel, A. D., Wong, M., Foxton, J., Lochy, A., and Peretz, I. (2008b). Speech intonation perception deficits in musical tone deafness (congenital amusia). Music Percept. 25, 357–368. doi: 10.1525/mp.2008.25.4.357
Patel, A. D., Foxton, J. M., and Griffiths, T. D. (2005). Musically tone-deaf individuals have difficulty discriminating intonation contours extracted from speech. Brain Cogn. 59, 310–313. doi: 10.1016/j.bandc.2004.10.003
Patston, L. L. M., Corballis, M. C., Hogg, S. L., and Tippett, L. J. (2006). The neglect of musicians: line bisection reveals an opposite bias. Psychol. Sci. 17, 1029–1031. doi: 10.1111/j.1467-9280.2006.01823.x
Peretz, I. (1990). Processing of local and global musical information by unilateral brain-damaged patients. Brain 113, 1185–1205. doi: 10.1093/brain/113.4.1185
Peretz, I., Champod, S., and Hyde, K. (2003). Varieties of musical disorders: the Montreal Battery of Evaluation of Amusia. Ann. N.Y. Acad. Sci. 999, 58–75. doi: 10.1196/annals.1284.006
Peretz, I., and Coltheart, M. (2003). Modularity of music processing. Nat. Neurosci. 6, 688–691. doi: 10.1038/nn1083
Peretz, I., Gosselin, N., Tillmann, B., Cuddy, L. L., Gagnon, B., Trimmer, C. G., et al. (2008). On-line identification of congenital amusia. Music Percept. 25, 331–343. doi: 10.1525/mp.2008.25.4.331
Peretz, I., and Kolinsky, R. (1993). Boundaries of separability between melody and rhythm in music discrimination: a neuropsychological perspective. Q. J. Exp. Psychol. 46A, 301–325.
Peretz, I., and Zatorre, R. J. (2005). Brain organization for music processing. Annu. Rev. Psychol. 56, 89–114. doi: 10.1146/annurev.psych.56.091103.070225
Pfordresher, P. Q., and Brown, S. (2009). Enhanced production and perception of musical pitch in tone language speakers. Attent. Percept. Psychophys. 71, 1385–1398. doi: 10.3758/APP.71.6.1385
Phillips-Silver, J., Toiviainen, P., Gosselin, N., Piché, O., Nozaradan, S., Palmer, C., et al. (2011). Born to dance but beat deaf: a new form of congenital amusia. Neuropsychologia 49, 961–969. doi: 10.1016/j.neuropsychologia.2011.02.002
Pinker, S. (1997). How the Mind Works. London: Allen Lane.
Racette, A., Bard, C., and Peretz, I. (2006). Making non-fluent aphasics speak: sing along! Brain 129, 2571–2584. doi: 10.1093/brain/awl250
Rauschecker, J. P., and Scott, S. (2009). Maps and streams in the auditory cortex: nonhuman primates illuminate human speech processing. Nat. Neurosci. 12, 718–724. doi: 10.1038/nn.2331
Rogalsky, C., Rong, F., Saberi, K., and Hickok, G. (2011). Functional anatomy of language and music perception: temporal and structural factors investigated using functional magnetic resonance imaging. J. Neurosci. 31, 3843–3852. doi: 10.1523/JNEUROSCI.4515-10.2011
Rusconi, E., Kwan, B., Giordano, B. L., Umiltà, C., and Butterworth, B. (2006). Spatial representa-