The secret formula of the
perfect melody: Can musical
features predict cognitive
responses?
Daniel Müllensiefen
Goldsmiths, University of London
An old question:
Can a perfectly composed melody
make you go insane (Sirens, Ulysses)
heal (Pythagoras)
turn sorrows into joy (Magic Flute)
make walls tumble (Jericho)
make you will-less (Pied Piper of Hameln)
charm wild beasts and soften the heart of Death (Orpheus)
???
=> Great stories, often effective templates for media reports and scientific narratives (e.g. the Mozart effect)
More generally (and modestly)
Can musical structure influence / bias behaviour?
Can we predict human behaviour from musical structure?
Are all musical structures equally likely to generate certain cognitive or emotional responses? (H0)
or
Are certain structures more prone than others to trigger a certain reaction? (H1)
Empirical research since late 1960s, interdisciplinary area ‘Experimental Aesthetics’
I’m guilty of
• The hitsong formula: Melodic features that make some songs more commercially successful than others (Kopiez & Müllensiefen, 2011)
• The singalong formula: Vocal features that motivate people to sing along to music in nightclubs (Pawley & Müllensiefen, 2012)
• The formula for sexy songs: Vocal features that are conducive to romantic situations (Spotify consultancy)
• The formula for memorable tunes: Melodic features that make melodies more memorable (Müllensiefen & Halpern, 2014)
• The earworm formula: Melodic features of songs that become earworms (Williamson & Müllensiefen, 2012; Müllensiefen et al. in prep.)
The general approach
1. Collect data on music-related behaviour and record melodies associated with different types of behaviour
2. Extract melody features (+ compare to large corpus of melodies)
3. Data-mine melody features to explain type of behaviour
Computing features of melodies:
A corpus approach (FANTASTIC, Müllensiefen, 2009)
Summary features
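m-type of length 2: “s1e_s1e”
m-type features
$$\text{mean.Honores.H} = \frac{1}{n} \cdot 100 \cdot \frac{\log N}{1.01 - \frac{V(1,N)}{V(N)}}$$
2nd order features
$$\text{mtcf.norm.log.dist} = \frac{\sum_{\tau_i \in m} \left( TF'_{\tau_i} - DF'_{\tau_i} \right)}{TF_{\tau \in m}}$$
Distribution of features in large corpus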
Summary Features
The task: Summarise the content of a melodic phrase
= ?
Summary Features
Cognitive Hypothesis: Listeners abstract summary representation of short melodies during listening
Format: Value that represents particular aspect of melody
Pitch range (p.range):
p.range = max(p) − min(p)
Standard deviation of absolute intervals (i.abs.std):
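The two features named above can be sketched in a few lines. This is a minimal illustration, assuming a melody is given as a list of MIDI pitch numbers; the function names and the example phrase are hypothetical and not taken from FANTASTIC itself. The slides do not say whether the sample or the population standard deviation is meant, so the sample version is used here as an assumption.

```python
from statistics import stdev

def p_range(pitches):
    """Pitch range: highest pitch minus lowest pitch."""
    return max(pitches) - min(pitches)

def i_abs_std(pitches):
    """Standard deviation of the absolute intervals between successive notes."""
    abs_intervals = [abs(b - a) for a, b in zip(pitches, pitches[1:])]
    # Sample standard deviation (n - 1 denominator); this choice is an assumption.
    return stdev(abs_intervals)

melody = [60, 62, 64, 60, 67, 65]  # hypothetical phrase as MIDI pitch numbers
print(p_range(melody), round(i_abs_std(melody), 3))
```

Both functions reduce a whole phrase to a single number, which is the "summary representation" format described on the slide.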
Summary Features
Relative number of direction changes in interpolated contour representation (int.cont.dir.changes):
$$\text{int.cont.dir.changes} = \frac{\sum_{\operatorname{sgn}(x_i) \neq \operatorname{sgn}(x_{i+1})} 1}{\sum_{x_i \neq x_{i+1}} 1}$$
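The ratio behind int.cont.dir.changes can be sketched as follows: count the positions where the sign of x changes between neighbours, and divide by the number of positions where the value of x changes at all. The sketch assumes x is a sequence of values taken from the interpolated contour (for instance its gradients); that reading, the function names, and the zero fallback for a completely flat sequence are assumptions, not details stated on the slide.

```python
def sgn(v):
    """Sign of v as -1, 0 or 1."""
    return (v > 0) - (v < 0)

def int_cont_dir_changes(x):
    """Sign changes between neighbours, relative to all value changes."""
    pairs = list(zip(x, x[1:]))
    sign_changes = sum(1 for a, b in pairs if sgn(a) != sgn(b))
    value_changes = sum(1 for a, b in pairs if a != b)
    # Returning 0.0 for a flat sequence is an assumption made for this sketch.
    return sign_changes / value_changes if value_changes else 0.0

x = [1.0, 1.0, -0.5, -0.5, 2.0, 1.0]  # hypothetical contour values
print(int_cont_dir_changes(x))
```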
Simple summary features
• Huron contour class
• Contour variability
• Tonal clarity
• Tonal strength
• Note density
• Length
• Entropy (pitches, intervals, durations)
• Descriptive statistics of distribution of pitches, intervals, durations
• …
m-type features
Cognitive Hypothesis: Listeners use literal representation of short melodic sequences (motives) for processing
Format of m-type: String of digits (similar to “word type” in linguistics); similar to n-gram models
m-type of length 2:
“s1e_s1e”
m-type of length 4:
“s1q_s1l_s1q_s1l”
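The n-gram analogy can be made concrete with a short sketch. It assumes each note transition has already been encoded as a token such as "s1e", and that m-types are formed from overlapping windows of tokens joined by underscores, as in the examples above. The token sequence and the function name are hypothetical, and how FANTASTIC treats phrase boundaries is not covered here.

```python
from collections import Counter

def m_types(tokens, n):
    """All overlapping windows of n tokens, joined with '_'."""
    return ["_".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = ["s1e", "s1e", "s1q", "s1l", "s1q", "s1l"]  # hypothetical encoding of a phrase
counts_len2 = Counter(m_types(tokens, 2))  # frequency of each m-type of length 2
print(counts_len2)
print(m_types(tokens, 4))
```

The resulting frequency table is the raw material for the m-type features: each feature summarises this distribution as one number.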
m-type features
Format of m-type feature: Number that represents distribution of m-types in melody
$$\text{mean.Honores.H} = \frac{1}{n} \cdot 100 \cdot \frac{\log N}{1.01 - \frac{V(1,N)}{V(N)}}$$
Simple m-type features:
• Yule’s K
• Simpson’s D
• Sichel’s S
• Honoré’s H
• Productivity
• Entropy
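As an illustration of how an m-type feature turns a frequency distribution into one number, here is a minimal sketch of Honoré's H. The symbols are not defined on the slide, so the sketch assumes their usual meaning in lexical statistics: N is the number of m-type tokens, V(N) the number of distinct m-types, and V(1,N) the number of m-types that occur exactly once. The natural logarithm and the reading of the 1/n factor as an average over n m-type lengths are also assumptions.

```python
import math
from collections import Counter

def honores_h(mtype_tokens):
    """Honoré's H for one list of m-type tokens (1.01 constant as on the slide)."""
    counts = Counter(mtype_tokens)
    n_tokens = len(mtype_tokens)                          # N
    n_types = len(counts)                                 # V(N)
    n_hapax = sum(1 for c in counts.values() if c == 1)   # V(1, N)
    return 100 * math.log(n_tokens) / (1.01 - n_hapax / n_types)

def mean_honores_h(tokens_by_length):
    """Average of H over several m-type lengths (assumed meaning of the 1/n factor)."""
    values = [honores_h(tokens) for tokens in tokens_by_length]
    return sum(values) / len(values)

example = ["a", "a", "b", "c"]  # hypothetical m-type tokens
print(honores_h(example))
```

The constant 1.01 keeps the denominator positive even when every m-type occurs only once.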
2nd order features from a corpus
The M4S Corpus of Popular Music (Müllensiefen, Wiggins & Lewis, 2008):
• 14,067 high-quality MIDI transcriptions
• Representative sample of commercial pop songs from 1950 to 2006
• Complete compositional structure (all melodies, harmonies, rhythms, instrumental parts, lyrics)
• Some performance information (MIDI patches, some expressive timing)
2nd order summary features
Cognitive Hypothesis: Listeners encode commonness of feature value
Method: Replacing feature values by their relative frequencies
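The replacement step can be sketched as a lookup: a melody's feature value is swapped for the share of corpus melodies that have (roughly) the same value. The slides do not say how continuous values are grouped, so the fixed-width binning below is an assumption, as are the function name and the example corpus values.

```python
import math
from collections import Counter

def second_order(value, corpus_values, bin_width=1.0):
    """Relative frequency, within the corpus, of the bin that `value` falls into."""
    def bin_of(v):
        # Fixed-width bins are an assumption made for this sketch.
        return math.floor(v / bin_width)
    counts = Counter(bin_of(v) for v in corpus_values)
    return counts[bin_of(value)] / len(corpus_values)

corpus_p_range = [5, 7, 7, 12, 7, 9, 5, 12]  # hypothetical p.range values in a corpus
print(second_order(7, corpus_p_range))  # a common value gets a high score
print(second_order(4, corpus_p_range))  # a value unseen in the corpus gets 0
```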
2nd order m-type features
Cognitive Hypothesis: Listeners are sensitive to commonness of m-types
Method: Use frequency information on m-types from large corpus
Example: Normalised distance of m-type frequencies in melody and corpus (mtcf.norm.log.dist)
=> measures whether uncommon m-types are used frequently in melody
$$\text{mtcf.norm.log.dist} = \frac{\sum_{\tau_i \in m} \left( TF'_{\tau_i} - DF'_{\tau_i} \right)}{TF_{\tau \in m}}$$
τ : m-type
TF’ (Term Frequency): Normalised log-frequency of m-type in melody
DF’ (Document Frequency): Normalised log-frequency of m-type in corpus
FANTASTIC
Feature ANalysis Technology Accessing STatistics In a Corpus
• Open source toolbox for computational analysis of melodies*
• 58 features currently implemented
• Ideas from: Descriptive statistics, music theory, music cognition, computational linguistics, music information retrieval
• 2 feature categories: Summary features and m-type features
• Context modelling via integration of corpus: 2nd order features
* http://www.doc.gold.ac.uk/isms/m4s/?page=Software%20and%20Documentation
Similar Approaches
Folk Song Research / Ethnomusicology
• Bartók (1936), Bartók & Lord (1951)
• Lomax (1977)
• Steinbeck (1982)
• Jesser (1992)
• Sagrillo (1999)
• Volk et al. (2008)
• van Kranenburg, Volk, Wiering (2012)
Popular Music Research
• Moore (2006)
• Kramarz (2006)
• Furnes (2006)
Computational / Cognitive Musicology
• Eerola et al. (2001, 2007)
• McCay (2005)
• Huron (2006)
• Frieler (2008)
Computational Linguistics / Cognitive Psychology
• Baayen (2001)
• Landauer et al. (2007)
• Sedlmeier & Betsch (2002)
• Cortese et al. (2010)
• …
Test case: Memory
Do structural features make melodies more
memorable?
Müllensiefen & Halpern (2014)
Melody memory recognition paradigm
Participants
• 34 students
• Low to medium musical training
Study list (40 items)
• Randomly selected vocal phrases from pop songs; MIDI piano timbre
Test list (40 old + 40 new items)
• Old and new counterbalanced over listeners
Measures of melody memorability
• Item AUC scores (based on recognition ratings => explicit memory)
• Difference in pleasantness ratings (=> implicit memory)
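An item-level AUC can be sketched as follows. Because old and new status is counterbalanced, each melody collects recognition ratings from listeners who had studied it and from listeners who had not. One common way to get an AUC from such data, assumed here and not confirmed by the slides, is the probability that a rating from an "old" presentation exceeds a rating from a "new" presentation, with ties counted as one half. The function name and the ratings are hypothetical.

```python
def item_auc(old_ratings, new_ratings):
    """Share of (old, new) rating pairs where old > new; ties count as 0.5."""
    wins = 0.0
    for o in old_ratings:
        for n in new_ratings:
            if o > n:
                wins += 1.0
            elif o == n:
                wins += 0.5
    return wins / (len(old_ratings) * len(new_ratings))

old = [6, 5, 4, 5]  # hypothetical ratings from listeners who studied the melody
new = [2, 3, 4, 1]  # hypothetical ratings from listeners who did not
print(item_auc(old, new))
```

A value of 0.5 means the melody was not recognised above chance; values near 1 mean it was easy to tell apart when old versus new.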
Large variation in memorability across melodies
Number of Correct Rejections, False Alarms, Hits and Misses for all 80 test melodies
Feature Modelling
Goal: Explain variability in explicit and implicit memorability of melodies
134 features:
• 1st order summary and m-type features
• 2nd order features, frequency information from 80-item testset
• 2nd order features, frequency information from pop corpus
Modelling: Partial Least Squares Regression (PLSR)
• Handles the k > n problem (more features than observations)
• Aggregates many features into a few distinct components
• Built-in cross-validation guards against over-fitting
Feature name Explicit Model Implicit Memory Model
E1 E2 I1 I2 I3 I4
2nd-order features (Testset as context)
Summary Features
d.median
int.contour.class
int.cont.dir.change
-.209
.263
.217
M-Type Features
mtcf.TFDF.spearman
mtcf.TFDF.kendall
norm.log.dist
log.max.DF
mean.log.DF
mean.g.weight
mean.gl.weight
-.238
.239
.237
-.254
.255
.260
-.213
-.212
.207
-.234
2nd-order features (Corpus as context)
Summary Features
d.eq.trans
int.cont.grad.std
d.mode
int.cont.dir.change
.208
.201
-.227
.227
M-Type Features
TFDF.spearman
TFDF.kendall
mean.log.DF
g.weight
g.weight
mean.gl.weight
TFIDF.m.D
-.235
.235
.238
.226
.224
-.261
.261
.205
.266
.206
-.231
-.229
[Figure: Recognition memory test; axis values 1 to 6]
Summary PLSR models
1. Explicit and implicit memory models share a similar component (E1 and I1) but differ in other components (E2 vs. I2, I3, I4)
2. Summary features less important than m-type features
3. Frequency information from corpora very important
4. Uniqueness of m-types (with respect to corpus) very important
Do structural features explain the memorability of melodies?
Yes, but to a modest degree
Proportion variance explained:
Explicit model: 9.3% (CV), 49.5% (non-CV)
Implicit model: 25.3% (CV), 76.7% (non-CV)
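The gap between the cross-validated (CV) and non-cross-validated figures can be illustrated with a much simpler model than PLSR. The sketch below swaps in ordinary least squares with a single predictor and leave-one-out cross-validation; it is not the PLSR procedure used in the study, and the data are hypothetical. It only shows why variance explained drops when each item is predicted by a model that never saw it.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for one predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope

def r2_in_sample(xs, ys):
    """Variance explained when fitting and evaluating on the same items."""
    a, b = fit_line(xs, ys)
    my = sum(ys) / len(ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

def r2_leave_one_out(xs, ys):
    """Variance explained when each item is predicted from all other items."""
    my = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a + b * xs[i])) ** 2
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - press / ss_tot

xs = [1, 2, 3, 4, 5, 6, 7, 8]                      # hypothetical feature values
ys = [2.1, 1.9, 3.5, 3.0, 4.8, 4.1, 5.9, 5.2]      # hypothetical memorability scores
print(r2_in_sample(xs, ys), r2_leave_one_out(xs, ys))
```

With many features and few items, as in the memory study, the difference between the two numbers is typically much larger than in this one-predictor toy case.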
Test case: Earworms
What is it that makes a tune sticky?
Identifying features of earworms in 3 steps (Finkel & Müllensiefen, 2012; Müllensiefen, Jakubowski et al., in prep)
1) Gathering earworms
• 3000 (out of ~6000) participants on earwormery.com
• Most recent, most frequent earworm (artist, title, part of song)
1) Gathering earworms
• Frequency distribution of 2816 nominated earworm tunes
• Only 505 named more than once
• Top 5:
  • Lady Gaga: Bad Romance (33)
  • Kylie Minogue: Can’t get you out of my head (24)
  • Journey: Don’t stop believing (21)
  • Gotye: Somebody that I used to know (19)
  • Maroon 5: Moves like Jagger (15)
2) Matching earworm and non-earworm tunes
• Select songs named an earworm by at least 3 independent people
• Match to non-earworm songs on artist, genre, weeks in charts, highest chart position, days since chart entry (genetic algorithm for optimal matching)
Most frequent earworm tunes:
Similarly successful but never mentioned as earworms:
3) Predict earworms vs non-earworms from melodic features
• Logistic regression model
• Variable selection from feature clusters (summary features only)
• Model selection via BIC
• Classification accuracy 61%
• Significant predictors: median of note durations (+), proportions of large intervals (-)
$$p(\text{earworm} = 1) = \frac{1}{1 + e^{-(1.079 + 0.064 \cdot \text{d.median} - 0.723 \cdot \text{i.mode})}}$$
=> Longer durations and smaller intervals make tunes sticky (maybe because they are easier to sing?)
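The fitted model can be turned into a small function to see how the two predictors move the predicted probability. The coefficients are those read from the slide (intercept 1.079, d.median 0.064, i.mode -0.723); the units and scaling of the two features are not given in the slides, so the input values below are purely hypothetical.

```python
import math

def p_earworm(d_median, i_mode):
    """Predicted probability that a tune is an earworm, per the slide's coefficients."""
    linear = 1.079 + 0.064 * d_median - 0.723 * i_mode
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical feature values: compare a baseline with a larger i.mode and a larger d.median.
print(p_earworm(d_median=2.0, i_mode=1.0))
print(p_earworm(d_median=2.0, i_mode=3.0))   # larger i.mode lowers the probability
print(p_earworm(d_median=6.0, i_mode=1.0))   # larger d.median raises it
```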
The hunt for the earworm formula
• Very few people share the same earworms
• Earworm tunes are partly (but only partly) explained by their popularity
• Simple summary features are part of a classification model
• m-type features and 2nd order features yet to be tested
The wider picture
Finding the secret formula of the perfect melody is difficult:
• Many potential features to consider => Aggregation / Feature selection
• People are very idiosyncratic in their responses to music => even more so for ‘real’ music and real-world music behaviour
• Potentially many confounding factors
BUT:
• We found significant relationships between musical structure and human behaviours
• Mainly driven by m-type features
• Statistical information from corpus important
=> Points to general cognitive mechanism for event frequency processing
However
We’re still far from the golden times of music manipulation
2) Controlling for popularity
• Hurdle model of frequency distribution of 1552 songs with chart data
• Predictors: Popularity and recency indicators
• Significant predictor: number of weeks in charts (p < .001)
[Figure panels: Empirical data; Hurdle model]
A complex summary feature: Polynomial phrase contour (Müllensiefen & Wiggins, 2011)
$$y = 66.578 - 1.992 \cdot x + 0.326 \cdot x^3 - 0.071 \cdot x^4 - 0.012 \cdot x^5 + 0.003 \cdot x^6$$
Represent a melodic phrase
by a polynomial curve
and a polynomial function
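A minimal sketch of evaluating such a contour polynomial is given below. The coefficients and powers are taken as read from the slide, where no squared term is legible. The meaning of the axes is an assumption: x is treated as a position within the phrase and y as pitch, and the sample positions are hypothetical.

```python
def phrase_contour(x):
    """Evaluate the example contour polynomial at position x."""
    # Power of x -> coefficient, as read from the slide (no x**2 term appears there).
    coeffs = {0: 66.578, 1: -1.992, 3: 0.326, 4: -0.071, 5: -0.012, 6: 0.003}
    return sum(c * x ** power for power, c in coeffs.items())

# Sample the curve at a few hypothetical positions along the phrase.
for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(x, round(phrase_contour(x), 3))
```

The appeal of this representation is that a whole phrase contour is reduced to a handful of coefficients, which can then be used as features.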