The secret formula of the perfect melody: Can musical features …mas03dm/papers/Features_Cogn... · 2014-06-02 · An old question: Can a perfectly composed melody make you go insane

The secret formula of the perfect melody: Can musical features predict cognitive responses?

Daniel Müllensiefen

Goldsmiths, University of London

An old question:

Can a perfectly composed melody

• make you go insane (Sirens, Ulysses)
• heal (Pythagoras)
• turn sorrows into joy (Magic Flute)
• make walls tumble (Jericho)
• make you will-less (Pied Piper of Hameln)
• charm wild beasts and soften the heart of Death (Orpheus)

???

=> Great stories, often effective templates for media reports and scientific narratives (e.g. the Mozart effect)

More generally (and modestly)

Can musical structure influence / bias behaviour?

Can we predict human behaviour from musical structure?

Are all musical structures equally likely to generate certain cognitive or emotional responses? (H0)

or

Are certain structures more prone than others to trigger a certain reaction? (H1)

Empirical research since late 1960s, interdisciplinary area ‘Experimental Aesthetics’

I’m guilty of

• The hitsong formula: Melodic features that make some songs more commercially successful than others (Kopiez & Müllensiefen, 2011)
• The singalong formula: Vocal features that motivate people to sing along to music in nightclubs (Pawley & Müllensiefen, 2012)
• The formula for sexy songs: Vocal features that are conducive to romantic situations (Spotify consultancy)
• The formula for memorable tunes: Melodic features that make melodies more memorable (Müllensiefen & Halpern, 2014)
• The earworm formula: Melodic features of songs that become earworms (Williamson & Müllensiefen, 2012; Müllensiefen et al. in prep.)

The general approach

1. Collect data on music-related behaviour and record melodies associated with different types of behaviour
2. Extract melody features (+ compare to large corpus of melodies)
3. Data-mine melody features to explain type of behaviour

Computing features of melodies: A corpus approach (FANTASTIC, Müllensiefen, 2009)

• Summary features
• m-type features (e.g. m-type of length 2: “s1e_s1e”; feature mean.Honores.H)
• 2nd order features (e.g. mtcf.norm.log.dist)
• Distribution of features in large corpus

Summary Features

The task: Summarise the content of a melodic phrase

= ?

Summary Features

Cognitive Hypothesis: Listeners abstract summary representation of short melodies during listening

Format: Value that represents particular aspect of melody

Pitch range (p.range):

p.range = max(p) − min(p)

Standard deviation of absolute intervals (i.abs.std):
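A minimal sketch of these two summary features, assuming pitches are given as MIDI note numbers (so values are in semitones). The function names and the example phrase are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, not something stated on the slide.

```python
from statistics import stdev


def p_range(pitches):
    """Pitch range: highest pitch minus lowest pitch."""
    return max(pitches) - min(pitches)


def i_abs_std(pitches):
    """Standard deviation of the absolute intervals between successive notes.

    Assumption: sample standard deviation (n - 1 denominator).
    Needs at least three notes, i.e. at least two intervals.
    """
    abs_intervals = [abs(b - a) for a, b in zip(pitches, pitches[1:])]
    return stdev(abs_intervals)


phrase = [60, 62, 64, 60, 67]  # hypothetical phrase as MIDI pitch numbers
print(p_range(phrase))    # 7
print(i_abs_std(phrase))
```

Each function returns a single number per phrase, matching the "value that represents a particular aspect of the melody" format.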

Summary Features

Relative number of direction changes in interpolated contour representation (int.cont.dir.changes):

$$\text{int.cont.dir.changes} = \frac{\sum_{\operatorname{sgn}(x_i) \neq \operatorname{sgn}(x_{i+1})} 1}{\sum_{x_i \neq x_{i+1}} 1}$$
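A small sketch of the direction-change feature, reading it as: the number of adjacent pairs of contour gradients whose signs differ, divided by the number of adjacent pairs whose values differ. The input is assumed to be an already computed list of gradients of the interpolated contour (the interpolation itself is not shown). The function names and the zero-denominator fallback are hypothetical choices.

```python
def sign(v):
    """Return -1, 0 or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def int_cont_dir_changes(gradients):
    """Ratio of sign changes to value changes between adjacent gradients."""
    pairs = list(zip(gradients, gradients[1:]))
    sign_changes = sum(1 for a, b in pairs if sign(a) != sign(b))
    value_changes = sum(1 for a, b in pairs if a != b)
    # Assumption: a completely constant contour gets the value 0.0.
    return sign_changes / value_changes if value_changes else 0.0


print(int_cont_dir_changes([1, 2, -1]))  # 0.5: one sign change, two value changes
```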

Simple summary features

• Huron contour class
• Contour variability
• Tonal clarity
• Tonal strength
• Note density
• Length
• Entropy (pitches, intervals, durations)
• Descriptive statistics of distribution of pitches, intervals, durations
• …

m-type features

Cognitive Hypothesis: Listeners use literal representation of short melodic sequences (motives) for processing

Format of m-type: String of digits (similar to “word type” in linguistics); similar to n-gram models

m-type of length 2:

“s1e_s1e”

m-type of length 4:

“s1q_s1l_s1q_s1l”
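The n-gram analogy can be sketched as follows. The code assumes each note transition has already been encoded as a token such as "s1e" (the encoding step itself is not shown), and simply joins n consecutive tokens with underscores. The function name and the example token sequence are hypothetical.

```python
from collections import Counter


def m_types(tokens, n):
    """Return all m-types of length n: runs of n consecutive tokens joined by '_'."""
    return ["_".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = ["s1e", "s1e", "s1q", "s1l"]  # hypothetical pre-encoded phrase
print(m_types(tokens, 2))              # ['s1e_s1e', 's1e_s1q', 's1q_s1l']
print(Counter(m_types(tokens, 2)))     # frequency of each m-type in the phrase
```

The resulting frequency counts are the raw material for the m-type features that summarise the distribution of m-types in a melody.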

m-type features

Format of m-type feature: Number that represents distribution of m-types in melody

$$\text{mean.Honores.H} = \frac{1}{n} \cdot 100 \cdot \frac{\log N}{1.01 - \frac{V(1,N)}{V(N)}}$$

Simple m-type features:

• Yule's K
• Simpson's D
• Sichel's S
• Honoré's H
• Productivity
• Entropy
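As an illustration, two of these distribution measures can be computed from a list of m-type tokens. The sketch reads Honoré's H as 100 · log N / (1.01 − V(1,N)/V(N)), with N the number of tokens, V(N) the number of distinct types and V(1,N) the number of types occurring exactly once. The natural logarithm, the unnormalised entropy in bits, and the function names are assumptions; averaging over several m-type lengths (the "mean" in mean.Honores.H) is not shown.

```python
import math
from collections import Counter


def honores_h(tokens):
    """Honoré's H for one list of m-type tokens (natural log assumed)."""
    counts = Counter(tokens)
    n_tokens = sum(counts.values())                       # N
    n_types = len(counts)                                 # V(N)
    n_hapax = sum(1 for c in counts.values() if c == 1)   # V(1, N)
    return 100 * math.log(n_tokens) / (1.01 - n_hapax / n_types)


def entropy_bits(tokens):
    """Shannon entropy (in bits) of the m-type distribution."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


tokens = ["a", "a", "b", "c"]  # hypothetical m-type tokens
print(honores_h(tokens))
print(entropy_bits(tokens))    # 1.5
```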

2nd order features from a corpus

The M4S Corpus of Popular Music (Müllensiefen, Wiggins & Lewis, 2008):

• 14,067 high-quality MIDI transcriptions
• Representative sample of commercial pop songs from 1950 to 2006
• Complete compositional structure (all melodies, harmonies, rhythms, instrumental parts, lyrics)
• Some performance information (MIDI patches, some expressive timing)

2nd order summary features

Cognitive Hypothesis: Listeners encode commonness of feature value

Method: Replacing feature values by their relative frequencies
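A minimal sketch of this replacement step for a discrete feature: the 2nd order value of a melody's feature is the relative frequency of that value among all melodies in the reference corpus. Function names and the example values are hypothetical, and continuous features would first need binning (an assumption not covered here).

```python
from collections import Counter


def relative_frequencies(corpus_values):
    """Map each observed feature value to its relative frequency in the corpus."""
    counts = Counter(corpus_values)
    total = len(corpus_values)
    return {value: count / total for value, count in counts.items()}


def second_order(value, corpus_values):
    """Replace a feature value by its relative frequency (0.0 if never observed)."""
    return relative_frequencies(corpus_values).get(value, 0.0)


corpus_p_range = [5, 7, 7, 12]          # hypothetical pitch ranges in a tiny corpus
print(second_order(7, corpus_p_range))  # 0.5
```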

2nd order m-type features

Cognitive Hypothesis: Listeners are sensitive to commonness of m-types

Method: Use frequency information on m-types from large corpus

Example: Normalised distance of m-type frequencies in melody and corpus (mtcf.norm.log.dist)

=> measures whether uncommon m-types are used frequently in melody

$$\text{mtcf.norm.log.dist} = \frac{\sum_{\tau_i \in m} \left( TF'_{\tau_i} - DF'_{\tau_i} \right)}{TF_{\tau \in m}}$$

τ : m-type

TF’ (Term Frequency): Normalised log-frequency of m-type in melody

DF’ (Document Frequency): Normalised log-frequency of m-type in corpus

FANTASTIC

Feature ANalysis Technology Accessing STatistics In a Corpus

• Open source toolbox for computational analysis of melodies*
• 58 features currently implemented
• Ideas from: Descriptive statistics, music theory, music cognition, computational linguistics, music information retrieval
• 2 feature categories: Summary features and m-type features
• Context modelling via integration of corpus: 2nd order features

*http://www.doc.gold.ac.uk/isms/m4s/?page=Software%20and%20Documentation

Similar Approaches

Folk Song Research / Ethnomusicology

• Bartók (1936), Bartók & Lord (1951)
• Lomax (1977)
• Steinbeck (1982)
• Jesser (1992)
• Sagrillo (1999)
• Volk et al. (2008)
• van Kranenburg, Volk, Wiering (2012)

Popular Music Research

• Moore (2006)
• Kramarz (2006)
• Furnes (2006)

Computational / Cognitive Musicology

• Eerola et al. (2001, 2007)
• McCay (2005)
• Huron (2006)
• Frieler (2008)

Computational Linguistics / Cognitive Psychology

• Baayen (2001)
• Landauer et al. (2007)
• Sedlmeier & Betsch (2002)
• Cortese et al. (2010)
• …

Test case: Memory

Do structural features make melodies more

memorable?

Müllensiefen & Halpern (2014)

Melody memory recognition paradigm

Participants
• 34 students
• Low to medium musical training

Study list (40 items)
• Randomly selected vocal phrases from pop songs; MIDI piano timbre.

Test list (40 old + 40 new items)
• Old and new counterbalanced over listeners

Measures of melody memorability
• Item AUC scores (based on recognition ratings => explicit memory)
• Difference in pleasantness ratings (=> implicit memory)
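One way to sketch an item-level AUC score: for a given melody, compare the recognition ratings from listeners for whom it was old with the ratings from listeners for whom it was new. The AUC is then the probability that a randomly chosen "old" rating exceeds a randomly chosen "new" rating, with ties counted as one half. This pairwise formulation, the assumption that higher ratings mean more confident "old" responses, and the function name are illustrative choices rather than the study's documented procedure.

```python
def item_auc(old_ratings, new_ratings):
    """Area under the ROC curve via pairwise comparison of ratings.

    old_ratings: ratings from listeners who heard the melody in the study list.
    new_ratings: ratings from listeners for whom the melody was new.
    """
    wins = 0.0
    for old in old_ratings:
        for new in new_ratings:
            if old > new:
                wins += 1.0
            elif old == new:
                wins += 0.5  # ties count as half
    return wins / (len(old_ratings) * len(new_ratings))


print(item_auc([6, 5, 4], [1, 2, 3]))  # 1.0: perfectly recognised melody
print(item_auc([5, 2], [3, 4]))        # 0.5: chance level
```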

Large variation in memorability across melodies

Number of Correct Rejections, False Alarms, Hits and Misses for all 80 test melodies

Feature Modelling

Goal: Explain variability in explicit and implicit memorability of melodies

134 features:

• 1st order summary and m-type features
• 2nd order features, frequency information from 80-item testset
• 2nd order features, frequency information from pop corpus

Modeling: Partial Least Squares Regression (PLSR)

• Handles k>n problem
• Aggregates many features to few distinct components
• Built-in cross-validation avoids over-fitting
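To illustrate how PLSR aggregates many features into a component even when there are more features than observations (k > n), here is a pure-Python sketch of the first component of single-response PLS. It is a toy illustration on made-up numbers, not the modelling pipeline used in the study: features are only centred (not scaled), only one component is extracted, and cross-validation is not shown. All names and data are hypothetical.

```python
import math


def pls1_first_component(X, y):
    """First PLS component for one response: weights, scores and fitted values."""
    n, k = len(X), len(X[0])
    x_means = [sum(row[j] for row in X) / n for j in range(k)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_means[j] for j in range(k)] for row in X]  # centred features
    yc = [v - y_mean for v in y]                                 # centred response

    # Weight vector: covariance direction between each feature and the response.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(k)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]

    # Component scores: one aggregated value per observation.
    t = [sum(Xc[i][j] * w[j] for j in range(k)) for i in range(n)]

    # Regress the response on the single component.
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    fitted = [y_mean + q * ti for ti in t]
    return w, t, fitted


# Three melodies, five features: more features than observations (k > n).
X = [[1, 2, 0, 1, 3], [2, 1, 1, 0, 2], [0, 3, 2, 1, 1]]
y = [1.0, 2.0, 0.5]
weights, scores, fitted = pls1_first_component(X, y)
print(weights)
print(fitted)
```

The weight vector shows how strongly each feature contributes to the component, which is the kind of per-feature value a component table reports.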

Feature name | Explicit Model: E1 E2 | Implicit Memory Model: I1 I2 I3 I4

2nd-order features (Testset as context)

Summary Features: d.median, int.contour.class, int.cont.dir.change
Values: -.209, .263, .217

M-Type Features: mtcf.TFDF.spearman, mtcf.TFDF.kendall, norm.log.dist, log.max.DF, mean.log.DF, mean.g.weight, mean.gl.weight
Values: -.238, .239, .237, -.254, .255, .260, -.213, -.212, .207, -.234

2nd-order features (Corpus as context)

Summary Features: d.eq.trans, int.cont.grad.std, d.mode, int.cont.dir.change
Values: .208, .201, -.227, .227

M-Type Features: TFDF.spearman, TFDF.kendall, mean.log.DF, g.weight, g.weight, mean.gl.weight, TFIDF.m.D
Values: -.235, .235, .238, .226, .224, -.261, .261, .205, .266, .206, -.231, -.229



[Figure: Recognition memory test (labels 1 to 6)]

Summary PLSR models

1. Explicit and implicit memory models share a similar component (E1 and I1) but differ in other components (E2 vs. I2, I3, I4)

2. Summary features less important than m-type features

3. Frequency information from corpora very important

4. Uniqueness of m-types (with respect to corpus) very important

Do structural features explain the memorability of melodies?

Yes, but to a modest degree

Proportion variance explained:

Explicit model: 9.3% (CV), 49.5% (non-CV)

Implicit model: 25.3% (CV), 76.7% (non-CV)

Test case: Earworms

What is it that makes a tune sticky?

Identifying features of earworms in 3 steps (Finkel & Müllensiefen, 2012; Müllensiefen, Jakubowski et al., in prep)

1) Gathering earworms

• 3000 (out of ~6000) participants on earwormery.com
• Most recent, most frequent earworm (artist, title, part of song)

1) Gathering earworms

• Frequency distribution of 2816 nominated earworm tunes
• Only 505 named more than once
• Top 5:
  • Lady Gaga: Bad Romance (33)
  • Kylie Minogue: Can’t get you out of my head (24)
  • Journey: Don’t stop believing (21)
  • Gotye: Somebody that I used to know (19)
  • Maroon 5: Moves like Jagger (15)

2) Matching earworm and non-earworm tunes

• Select songs named an earworm by at least 3 independent people
• Match to non-earworm songs on artist, genre, weeks in charts, highest chart position, days since chart entry (genetic algorithm for optimal matching)

Most frequent earworm tunes:

Similarly successful but never mentioned as earworms:

3) Predict earworms vs non-earworms from melodic features

• Logistic regression model
• Variable selection from feature clusters (summary features only)
• Model selection via BIC
• Classification accuracy 61%
• Significant predictors: median of note durations (+), proportion of large intervals (-)

$$p(\text{earworm} = 1) = \frac{1}{1 + e^{-(1.079 + 0.064 \cdot \text{d.median} - 0.723 \cdot \text{i.mode})}}$$

=> Longer durations and smaller intervals make tunes sticky (maybe because they are easier to sing?)
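The fitted model can be evaluated directly. The sketch below reads the equation as p(earworm = 1) = 1 / (1 + exp(-(1.079 + 0.064 · d.median - 0.723 · i.mode))); the units of d.median and i.mode are not given on the slide, so the input values are purely hypothetical, as is the function name.

```python
import math


def p_earworm(d_median, i_mode):
    """Predicted probability that a tune is an earworm under the fitted logistic model."""
    z = 1.079 + 0.064 * d_median - 0.723 * i_mode
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical inputs: the probability rises with d.median and falls with i.mode.
print(p_earworm(d_median=4, i_mode=1))
print(p_earworm(d_median=8, i_mode=1))
print(p_earworm(d_median=4, i_mode=3))
```

The signs of the two coefficients correspond to the (+) and (-) markers on the significant predictors.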

The hunt for the earworm formula

• Very few people have common earworms
• Earworm tunes are partly (but only partly) explained by their popularity
• Simple summary features are part of a classification model
• m-type features and 2nd order features yet to be tested

The wider picture

Finding the secret formula of the perfect melody is difficult:

• Many potential features to consider => Aggregation / Feature selection
• People are very idiosyncratic in their responses to music => even more so for ‘real’ music and real-world music behaviour
• Potentially many confounding factors

BUT:

• We found significant relationships between musical structure and human behaviours
• Mainly driven by m-type features
• Statistical information from corpus important

=> Points to general cognitive mechanism for event frequency processing

However

We’re still far from the golden times of music manipulation


2) Controlling for popularity

• Hurdle model of frequency distribution of 1552 songs with chart data
• Predictors: Popularity and recency indicators
• Significant predictor: number of weeks in charts (p < .001)

[Figure panels: Empirical data; Hurdle model]

A complex summary feature: Polynomial phrase contour (Müllensiefen & Wiggins, 2011)

$$y = 66.578 - 1.992 \cdot x + 0.326 \cdot x^3 - 0.071 \cdot x^4 - 0.012 \cdot x^5 + 0.003 \cdot x^6$$

Represent a melodic phrase by a polynomial curve and a polynomial function
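A short sketch of evaluating such a contour polynomial. It reads the example as y = 66.578 - 1.992·x + 0.326·x^3 - 0.071·x^4 - 0.012·x^5 + 0.003·x^6 (no quadratic term), and treats x as a time coordinate whose scaling is not specified here; both the reading and the function name are assumptions.

```python
# Coefficients for powers 0..6; the x^2 coefficient is taken to be zero.
COEFFS = [66.578, -1.992, 0.0, 0.326, -0.071, -0.012, 0.003]


def contour(x, coeffs=COEFFS):
    """Evaluate the contour polynomial at x using Horner's scheme."""
    y = 0.0
    for c in reversed(coeffs):
        y = y * x + c
    return y


print(contour(0))  # 66.578, the constant term
for x in (-2, -1, 0, 1, 2):  # hypothetical time points
    print(x, round(contour(x), 3))
```

Storing a phrase as a handful of coefficients is what makes the contour usable as a compact summary feature.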
