Emotion in Music Task: Lessons Learned Anna Aljanaki 1 Yi-Hsuan Yang 2 Mohammad Soleymani 1 1 University of Geneva, Switzerland 2 Academia Sinica, Taiwan 20-21 October, MediaEval 2016

MediaEval 2016 - Emotion in Music Task: Lessons Learned

Page 1: MediaEval 2016 - Emotion in Music Task: Lessons Learned

Emotion in Music Task: Lessons Learned

Anna Aljanaki (1), Yi-Hsuan Yang (2), Mohammad Soleymani (1)

(1) University of Geneva, Switzerland
(2) Academia Sinica, Taiwan

20-21 October, MediaEval 2016

Page 2: MediaEval 2016 - Emotion in Music Task: Lessons Learned

Emotion in Music Task

- 2013: Emotion in Music Brave New Task.
  - Organized by M. Soleymani, M.N. Caro, E.M. Schmidt and Y.-H. Yang
  - 2 subtasks: dynamic (per-second) music emotion recognition and song-level emotion recognition
  - 3 participating teams

Page 3: MediaEval 2016 - Emotion in Music Task: Lessons Learned

Emotion in Music Task

- Focused on audio analysis (optionally, metadata)
- Most attention was paid to recognizing how emotion changes over time
- Used the valence/arousal model

Page 4: MediaEval 2016 - Emotion in Music Task: Lessons Learned

Valence/Arousal model

Page 5: MediaEval 2016 - Emotion in Music Task: Lessons Learned

Dynamic emotion tracking (over duration of a piece)

Page 7: MediaEval 2016 - Emotion in Music Task: Lessons Learned

Emotion in Music Task

- 2013: Emotion in Music Brave New Task.
  - Organized by M. Soleymani, M.N. Caro, E.M. Schmidt and Y.-H. Yang
  - 2 tasks: dynamic (per-second) music emotion recognition and song-level emotion recognition
  - 3 participating teams
- 2014: Emotion in Music Task, Second Edition
  - Organized by A. Aljanaki, Y.-H. Yang, M. Soleymani
  - 2 tasks: dynamic (per-second) music emotion recognition and feature design
  - 7 participating teams
- 2015: Emotion in Music Task, Third Edition.
  - Organized by A. Aljanaki, Y.-H. Yang, M. Soleymani
  - 1 task: dynamic (per-second) music emotion recognition, with three submissions: features, prediction on baseline features, prediction on custom features.
  - 11 participating teams

Page 9: MediaEval 2016 - Emotion in Music Task: Lessons Learned

Quality of the annotations

Year                        2013          2014          2015
Total length                9h 18min      12h 30min     3h 46min
Cronbach’s α for arousal    .28 ± 0.28    .31 ± 0.30    .66 ± 0.26
GAM’s R^2 for arousal       .13 ± 0.10    .14 ± 0.11    .44 ± 0.19
Cronbach’s α for valence    .28 ± 0.29    .20 ± 0.24    .51 ± 0.35
GAM’s R^2 for valence       .13 ± 0.10    .10 ± 0.08    .37 ± 0.21

- 2013 & 2014 – 45-second excerpts. 2015 – full songs.
- 2013 & 2014 – Amazon Mechanical Turk workers. 2015 – both lab and AMT workers.
- 2015 – introduced preliminary listening.
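
To make the agreement figures in the table concrete, here is a minimal sketch of Cronbach's α for one song. It assumes each annotator is treated as an "item" and each time step as an observation; the slides do not say how the published values were computed, so this layout and the function name `cronbach_alpha` are illustrative assumptions.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for one song.

    ratings: list of k lists, one per annotator, each holding that
    annotator's per-second values for the same song (equal lengths).
    Assumption: annotators are the "items", time steps the observations.
    """
    k = len(ratings)
    if k < 2:
        raise ValueError("need at least two annotators")
    n = len(ratings[0])
    if n < 2 or any(len(r) != n for r in ratings):
        raise ValueError("all annotators must cover the same >= 2 time steps")

    # Sum of the sample variances of each annotator's series over time.
    item_var_sum = sum(variance(r) for r in ratings)

    # Sample variance over time of the summed rating across annotators.
    totals = [sum(r[t] for r in ratings) for t in range(n)]
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("summed ratings are constant; alpha is undefined")

    return k / (k - 1) * (1 - item_var_sum / total_var)


# Usage: three annotators who agree perfectly on a rising contour.
alpha = cronbach_alpha([[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]])
```

Values near 1 mean the annotators' curves move together; low values, as in the 2013 and 2014 columns, mean the curves share little common variation.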

Page 10: MediaEval 2016 - Emotion in Music Task: Lessons Learned

Quality of the annotations - Arousal

Page 11: MediaEval 2016 - Emotion in Music Task: Lessons Learned

Quality of the annotations - Valence

Page 12: MediaEval 2016 - Emotion in Music Task: Lessons Learned

Continuous annotation interface

Page 13: MediaEval 2016 - Emotion in Music Task: Lessons Learned

Continuous annotation problems

- Absolute scale
- Reaction time
- Scaling (’zoom’ levels)

Page 14: MediaEval 2016 - Emotion in Music Task: Lessons Learned

Continuous annotation problems

Absolute scale ratings

Page 15: MediaEval 2016 - Emotion in Music Task: Lessons Learned

Continuous annotation problems

We tried to scale each annotation to the dynamic mean of the song: $a_{j,i} = a_{j,i} + (A_j - A)$
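
The slide does not define the symbols in this correction, so the following is only one plausible reading, sketched under stated assumptions: each annotator's curve for a song is shifted by a constant so that its mean over time matches the mean over all annotators of that song. The function name `shift_to_song_mean` and the input layout are hypothetical.

```python
from statistics import mean


def shift_to_song_mean(annotations):
    """Shift each annotator's series for ONE song by a constant offset.

    annotations: list of per-annotator lists of per-second values.
    Assumed reading of the slide's formula: the offset moves each
    annotator's own mean onto the mean of all annotations of the song.
    The shape of each curve (its dynamics) is left untouched.
    """
    # Song-level mean over every annotator and every time step.
    song_mean = mean([v for series in annotations for v in series])

    shifted = []
    for series in annotations:
        offset = song_mean - mean(series)  # constant per annotator
        shifted.append([v + offset for v in series])
    return shifted


# Usage: two annotators with the same contour but different absolute levels.
aligned = shift_to_song_mean([[0.0, 0.2], [0.4, 0.6]])
```

A constant shift of this kind removes disagreement about the absolute level while keeping each annotator's relative changes, which is the part of the signal the dynamic task cares about.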

Page 16: MediaEval 2016 - Emotion in Music Task: Lessons Learned

Continuous annotation problems

There is a reaction time in the annotations. Before listeners can give judgements on the emotional content of music, they need to listen to it for some time.

Page 17: MediaEval 2016 - Emotion in Music Task: Lessons Learned

Continuous annotation problems

There is a scaling problem – the unit of emotional expression can be a structural section, a phrase, or a single note.

Page 18: MediaEval 2016 - Emotion in Music Task: Lessons Learned

Best solutions

Year, method        ρ            RMSE
2013, BLSTM-RNN     .31 ± .37    .08 ± .05
2014, LSTM          .35 ± .45    .10 ± .05
2015, BLSTM-RNN     .66 ± .25    .12 ± .06

Table: Winning algorithms on arousal, ordered by Spearman’s ρ. BLSTM-RNN – Bi-directional Long Short-Term Memory Recurrent Neural Networks.

Year, method        ρ            RMSE
2013, BLSTM-RNN     .19 ± .43    .08 ± .04
2014, LSTM          .20 ± .49    .08 ± .05
2015, BLSTM-RNN     .17 ± .09    .12 ± .54

Table: Winning algorithms on valence, ordered by Spearman’s ρ.
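
The two metrics in these tables can be sketched as follows. The sketch assumes that ρ and RMSE are computed per song between the predicted and annotated per-second curves and then summarized as mean ± standard deviation across songs; the slides do not state the aggregation, so that step, the choice of population standard deviation, and all function names are assumptions.

```python
from math import sqrt
from statistics import mean, pstdev


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError for a constant series."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed series."""
    return pearson(average_ranks(x), average_ranks(y))


def rmse(pred, truth):
    """Root-mean-square error between two equal-length series."""
    return sqrt(mean([(p - t) ** 2 for p, t in zip(pred, truth)]))


def summarize(per_song_pairs):
    """per_song_pairs: list of (predicted, annotated) series, one per song.

    Returns ((mean rho, sd rho), (mean RMSE, sd RMSE)) across songs.
    """
    rhos = [spearman_rho(p, t) for p, t in per_song_pairs]
    errors = [rmse(p, t) for p, t in per_song_pairs]
    return (mean(rhos), pstdev(rhos)), (mean(errors), pstdev(errors))
```

The two metrics answer different questions: ρ rewards following the shape of the emotion curve regardless of its level, while RMSE penalizes any absolute deviation. That is why a system can improve on ρ from year to year while its RMSE gets worse, as in the arousal table.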

Page 20: MediaEval 2016 - Emotion in Music Task: Lessons Learned

Possible solutions and modifications

- Change the task from emotion tracking to dynamics tracking (diminuendo, crescendo, rallentando)
- Change the data collection interface

Page 21: MediaEval 2016 - Emotion in Music Task: Lessons Learned

Categorical interface

Page 22: MediaEval 2016 - Emotion in Music Task: Lessons Learned

Possible solutions and modifications

- Change the task from emotion tracking to dynamics tracking (diminuendo, crescendo, rallentando)
- Change the data collection interface
- Find a practical task where continuous tracking is necessary.
  - Retrieval by an emotional trajectory
  - Thumbnailing
  - Emotion prediction from physiological signals and audio

Page 23: MediaEval 2016 - Emotion in Music Task: Lessons Learned

Acknowledgements

We thank Erik M. Schmidt, Mike N. Caro, Cheng-Ya Sha, Alexander Lansky, Sung-Yen Liu and Eduardo Coutinho for their contributions to the development of the task, and the anonymous Turkers for their work.