Michael Kuhn Distributed Computing Group (DISCO) ETH Zurich The MusicExplorer Project: Mapping the World of Music

Michael Kuhn

Distributed Computing Group (DISCO), ETH Zurich

The MusicExplorer Project: Mapping the World of Music

„Today, I would like to listen to something cheerful.“

„Something like Lenny Kravitz would be great.“

„Who can help me to discover my collection?“

„On my shelf, AC/DC is next to ZZ Top...“

„play random songs that match my mood“

What's the Talk About?

basic idea: map of music

constructing the map (MDS, PLSA)

using the map (YouTube, Android)

Map of Music: What is it good for?

Advantages of a Map

• Similar songs are close to each other

• Quickly find nearest neighbors

• Span (and play) volumes

• Create smooth playlists by interpolation

• Visualize a collection

• Low memory footprint
  – Well suited for the mobile domain

[Figure: example map showing the songs Hey Jude, Imagine, My Prerogative, I Want It That Way, Praise You, and Galvanize in regions labeled rock, pop, and electronic]

A convenient basis for building music software
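Two of these advantages, nearest-neighbor lookup and playlists by interpolation, can be sketched in a few lines once every song has coordinates. The example below is a minimal illustration, not the project's implementation: the 2D coordinates are invented, and the helper names `nearest` and `smooth_playlist` are hypothetical.

```python
import math

# Hypothetical 2D coordinates; the maps in this talk use 10 or 32 dimensions.
song_map = {
    "Hey Jude": (0.0, 0.0),
    "Imagine": (1.0, 0.5),
    "My Prerogative": (4.0, 1.0),
    "I Want It That Way": (5.0, 0.0),
    "Praise You": (8.0, 3.0),
    "Galvanize": (9.0, 4.0),
}

def nearest(point, k=1, exclude=()):
    """Return the k songs whose coordinates are closest to `point`."""
    candidates = [s for s in song_map if s not in exclude]
    candidates.sort(key=lambda s: math.dist(song_map[s], point))
    return candidates[:k]

def smooth_playlist(start, end, steps):
    """Walk the straight line from start to end, picking the nearest unused song per step."""
    a, b = song_map[start], song_map[end]
    playlist = [start]
    for i in range(1, steps):
        t = i / steps
        waypoint = tuple(x + t * (y - x) for x, y in zip(a, b))
        pick = nearest(waypoint, exclude=playlist + [end])
        if pick:
            playlist.append(pick[0])
    playlist.append(end)
    return playlist

print(smooth_playlist("Hey Jude", "Galvanize", 4))
```

With these made-up coordinates the playlist drifts from the rock corner through pop toward electronic. A linear scan is used for clarity; a spatial index would be the natural choice for a large collection.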

How to Construct a Map of Music?

Similar or different???

Music Similarity

• Audio Analysis

• Usage Data

From Usage Data to Similarity

Collaborative Filtering

Folksonomies (Tags)

Method 1: Collaborative Filtering and MDS

Basic Idea

d = ?

1. Item-to-item collaborative filtering

2. Graph for all-pairs distances

3. MDS to embed the graph (i.e. distances) into Euclidean space
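The graph step can be illustrated with a small sketch. It assumes that the distance between two songs is the length of the shortest path between them in the similarity graph, and that edge lengths have already been derived from the similarities; both the toy graph and the function names are hypothetical.

```python
import heapq

def shortest_paths(graph, source):
    """Dijkstra: shortest path length from `source` to every reachable node."""
    dist = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, length in graph[node].items():
            candidate = d + length
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return dist

def all_pairs(graph):
    """Run Dijkstra from every node to get the full distance table."""
    return {node: shortest_paths(graph, node) for node in graph}

# Hypothetical undirected graph; edge lengths are illustrative only.
graph = {
    "A": {"B": 2.0, "C": 5.0},
    "B": {"A": 2.0, "C": 3.0, "D": 4.0},
    "C": {"A": 5.0, "B": 3.0, "D": 3.0},
    "D": {"B": 4.0, "C": 3.0},
}
distances = all_pairs(graph)
print(distances["A"]["D"])
```

The resulting table is the input that an MDS step would then embed into Euclidean space.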

Item-to-Item Collaborative Filtering

People Who Listen To This Song also Listen to...

[Linden et al., 2003]

Pairwise Similarity

• Users who listen to A also listen to B
  – Top-50 listened songs per user
  – Normalization (cosine similarity)

$$\mathrm{sim}(A, B) = \frac{c}{\sqrt{a \cdot b}}$$

where c = number of common users (co-occurrences), a = occurrences of song A, b = occurrences of song B
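As a rough illustration of this similarity measure, the sketch below counts occurrences and co-occurrences over per-user top lists and applies the cosine normalization, i.e. the co-occurrence count divided by the square root of the product of the two occurrence counts. The listening data and the function name `pairwise_similarity` are made up for the example.

```python
import math
from collections import Counter
from itertools import combinations

def pairwise_similarity(user_top_songs):
    """user_top_songs maps each user to the set of songs in that user's top list."""
    occurrences = Counter()    # number of users per song
    cooccurrences = Counter()  # number of users per (sorted) song pair
    for songs in user_top_songs.values():
        occurrences.update(songs)
        for pair in combinations(sorted(songs), 2):
            cooccurrences[pair] += 1
    return {
        (a, b): c / math.sqrt(occurrences[a] * occurrences[b])
        for (a, b), c in cooccurrences.items()
    }

# Hypothetical listening data (the talk uses the top 50 songs per user).
users = {
    "u1": {"Time", "Money", "So What"},
    "u2": {"Time", "Money"},
    "u3": {"So What", "Time"},
    "u4": {"Money"},
}
sim = pairwise_similarity(users)
print(sim[("Money", "Time")])
```

Pairs that never co-occur are simply absent from the result, which keeps the similarity graph sparse.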

Graph Embedding

[Figure: a weighted graph on the nodes A, B, C, D, and E (edge weights between 2 and 5) and its embedding as points in the plane]

Classical Multidimensional Scaling (MDS)

• Well known for dimensionality reduction
  – First described by Young and Householder, 1938

• Principal Component Analysis (PCA):
  – Project onto the hyperplane that maximizes variance.
  – Computed by solving an eigenvalue problem.

• Basic idea of MDS:
  – Assume that the exact positions y_1, ..., y_N in a high-dimensional space are given.
  – It can be shown that, knowing only the distances d(y_i, y_j) between points, we can calculate the same result as applying PCA to y_1, ..., y_N.

• Problem: Complexity O(n^2 log n)
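For reference, the standard textbook form of classical MDS makes the link to PCA explicit. The notation below is an assumption of this note, not taken from the slides: Y is the matrix whose rows are the points y_i, n is the number of points, and m is the target dimensionality.

```latex
% D^{(2)}: n x n matrix of squared distances d(y_i, y_j)^2
% J = I - \tfrac{1}{n}\mathbf{1}\mathbf{1}^{\top}: centering matrix
\begin{align}
B &= -\tfrac{1}{2}\, J\, D^{(2)} J = (JY)(JY)^{\top}
  && \text{(Gram matrix of the centered points)} \\
B &= V \Lambda V^{\top}
  && \text{(eigendecomposition, } \lambda_1 \ge \lambda_2 \ge \dots) \\
X &= V_m \Lambda_m^{1/2}
  && \text{(coordinates in } m \text{ dimensions)}
\end{align}
```

The rows of X agree with the PCA scores of the centered points on the first m principal components, up to the sign of each axis, which is why the distances alone are enough.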

Landmark MDS [de Silva and Tenenbaum, 1999]

• Select k landmarks and embed them using MDS

• For the remaining points:
  – Place according to distances from landmarks

• Complexity: O(k n log n)
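The placement of the remaining points is usually written as a distance-based triangulation. The formula below follows the published landmark MDS formulation as commonly stated; the symbols are assumptions of this note, with m denoting the embedding dimensionality and k the number of landmarks.

```latex
% (\lambda_i, v_i), i = 1..m: top eigenpairs from classical MDS on the k landmarks
% \delta_a: vector of squared distances from point a to the k landmarks
% \delta_\mu: mean of the columns of the squared landmark distance matrix
\begin{align}
L^{\#} &= \begin{pmatrix}
  v_1^{\top} / \sqrt{\lambda_1} \\ \vdots \\ v_m^{\top} / \sqrt{\lambda_m}
\end{pmatrix} \\
x_a &= -\tfrac{1}{2}\, L^{\#} \left( \delta_a - \delta_\mu \right)
\end{align}
```

Each non-landmark point therefore needs only its k distances to the landmarks, not the full distance matrix.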

Iterative Embedding

• Assumption: some links erroneously shortcut certain paths

• Idea: Use the embedding as an estimator for distance
  – Shortcut edges get stretched
  – Remove edges with the worst stretch and re-embed

• Example: Kleinberg graph (20x20 grid with random edges)

[Figure panels: original embedding (spring embedder); after 6 rounds; after 12 rounds; after 30 rounds]
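One round of the pruning idea can be sketched as follows. This is an illustration under assumptions, not the authors' code: stretch is taken to be the embedded distance divided by the edge's nominal length, the fraction of edges removed per round is an invented parameter, and the re-embedding step is left out.

```python
import math

def stretch(edge, positions):
    """Ratio of the embedded distance to the edge's nominal length."""
    u, v, length = edge
    return math.dist(positions[u], positions[v]) / length

def prune_worst_edges(edges, positions, fraction=0.1):
    """Drop the `fraction` of edges with the largest stretch (at least one edge)."""
    ranked = sorted(edges, key=lambda e: stretch(e, positions), reverse=True)
    n_remove = max(1, int(len(edges) * fraction))
    return ranked[n_remove:]

# Hypothetical embedding of a 4-node path plus one shortcut edge (a, d).
positions = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (2.0, 0.0), "d": (3.0, 0.0)}
edges = [("a", "b", 1.0), ("b", "c", 1.0), ("c", "d", 1.0), ("a", "d", 1.0)]
kept = prune_worst_edges(edges, positions, fraction=0.25)
print(kept)
```

In a full loop, the graph would be embedded again from the kept edges and the round repeated, as in the 6, 12, and 30 round snapshots.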

Evaluation: Dimensionality

Example Neighborhoods in 10D Space

• Neighborhood 1:
  – Pink Floyd - Time
  – Pink Floyd - On the Run
  – Pink Floyd - Any Colour You Like
  – Pink Floyd - The Great Gig in the Sky
  – Pink Floyd - Eclipse
  – Pink Floyd - Us and Them
  – Pink Floyd - Brain Damage
  – Pink Floyd - Speak to Me
  – Pink Floyd - Money
  – Pink Floyd - Breathe
  – Pink Floyd - One of These Days

• Neighborhood 2:
  – Miles Davis - So What
  – Horace Silver - Song For My Father
  – Bill Evans - All of You
  – Miles Davis - Freddie Freeloader
  – Nat King Cole - The More I See You
  – Miles Davis - So Near
  – Miles Davis - Flamenco Sketches
  – Charles Mingus - Eat That Chicken
  – Jimmy Smith - On the Sunny Side
  – Julie London - Daddy
  – Bill Evans - My Man's Gone Now

10 dimensions give reasonable quality

Method 2: Social Tags and PLSA

Similarity from Social Tagging

Probabilistic Latent Semantic Analysis (PLSA)

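[Figure: PLSA model. Documents d_1, ..., d_N are linked to latent classes z_1, ..., z_K (unknown) with probabilities P(z|d); the latent classes are linked to words w_1, ..., w_M with probabilities P(w|z).]

• documents = songs

• latent semantic classes = latent music style classes

• words = tags

[Hofmann, 1999]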
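Written out, the two sets of probabilities in the figure combine into the usual PLSA mixture. The indexing below is an assumption of this note: d is a song, w a tag, and z_1, ..., z_K the latent classes.

```latex
\begin{align}
P(w \mid d) &= \sum_{k=1}^{K} P(w \mid z_k)\, P(z_k \mid d),
&
\sum_{k=1}^{K} P(z_k \mid d) &= 1
\end{align}
```

Each song is thus described by K numbers P(z_k | d), and each latent class by a distribution over tags.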

PLSA: Interpretation as Space

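[Figure: the PLSA model with songs d_1, ..., d_N, latent music style classes z_1, ..., z_K, and tags w_1, ..., w_M, linked by P(z|d) and P(w|z)]

For each song d, the probabilities P(z|d) can be seen as a vector that defines a point in space [Hofmann, 1999]

K small: dimensionality reduction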

PLSA Space

[Figure: 3D plot with the axes latent class 1, latent class 2, and latent class 3]

Probabilities sum to 1: (K-1)-dimensional hyperplane

Similar documents (songs) are close to each other

music space: 32 dimensions
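A toy version of this space, with K = 3 instead of 32, shows how song coordinates, distances, and probabilistic tag assignments fit together. All probabilities below are invented, and using the Euclidean distance between the class vectors is an assumption of this sketch.

```python
import math

# Hypothetical P(z|d): coordinates of three songs over K = 3 latent classes.
song_topics = {
    "song_rock_1": [0.80, 0.15, 0.05],
    "song_rock_2": [0.70, 0.20, 0.10],
    "song_electro": [0.05, 0.15, 0.80],
}
# Hypothetical P(w|z): one tag distribution per latent class.
tag_given_topic = [
    {"rock": 0.7, "pop": 0.2, "electronic": 0.1},
    {"rock": 0.2, "pop": 0.6, "electronic": 0.2},
    {"rock": 0.1, "pop": 0.1, "electronic": 0.8},
]

def tag_probabilities(song):
    """P(w|d) as the sum over latent classes of P(w|z) * P(z|d)."""
    weights = song_topics[song]
    return {
        tag: sum(p_z * tag_given_topic[z][tag] for z, p_z in enumerate(weights))
        for tag in tag_given_topic[0]
    }

def distance(song_a, song_b):
    """Euclidean distance between two songs in the latent class space."""
    return math.dist(song_topics[song_a], song_topics[song_b])

print(tag_probabilities("song_rock_1"))
print(distance("song_rock_1", "song_rock_2"), distance("song_rock_1", "song_electro"))
```

Because every coordinate vector sums to 1, all songs lie on a plane inside the 3D space, which is the K = 3 case of the hyperplane mentioned on the slide.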

LMDS vs. PLSA Space

• Advantages of LMDS:
  – Same accuracy at lower dimensionality (10 vs. 32)

• Advantages of PLSA:
  – Natural meaning of tags
  – Assignment of tags to songs (probabilistic)

Current sizes (approx.):
  – LMDS: 600K tracks
  – PLSA: 1.1M tracks

Using the Map

Visualization?

high-dimensional!

From 10D to 2D

• Identify relevant tags

• Find centroids of these tags in 10D

• Apply Principal Component Analysis (PCA) to these centroids
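The projection can be sketched in plain Python. This is a minimal illustration under assumptions: the tagged songs are invented 3D points standing in for the 10D map, and PCA is done with power iteration and deflation rather than a linear algebra library.

```python
import math

def centroid(points):
    """Component-wise mean of a list of equally long vectors."""
    return [sum(axis) / len(points) for axis in zip(*points)]

def matvec(matrix, v):
    return [sum(m * x for m, x in zip(row, v)) for row in matrix]

def top_eigenvector(matrix, iterations=500):
    """Power iteration: dominant eigenvector of a symmetric PSD matrix."""
    v = [1.0 / (i + 1) for i in range(len(matrix))]  # fixed start vector
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = matvec(matrix, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

def pca_2d(rows):
    """Project rows onto their first two principal components."""
    mean = centroid(rows)
    centered = [[x - m for x, m in zip(row, mean)] for row in rows]
    dim = len(mean)
    cov = [[sum(c[i] * c[j] for c in centered) / len(rows) for j in range(dim)]
           for i in range(dim)]
    axes = []
    for _ in range(2):
        v = top_eigenvector(cov)
        eigenvalue = sum(a * b for a, b in zip(v, matvec(cov, v)))
        axes.append(v)
        # Deflation: subtract the found component so the next pass finds the second axis.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(dim)]
               for i in range(dim)]
    return [[sum(a * b for a, b in zip(c, axis)) for axis in axes] for c in centered]

# Hypothetical tagged songs in 3D (standing in for the 10D map).
tagged_songs = {
    "rock": [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    "pop": [(5.0, 1.0, 0.0), (5.0, 3.0, 0.0)],
    "electronic": [(9.0, 0.0, 1.0), (9.0, 0.0, -1.0)],
}
tags = list(tagged_songs)
centroids = [centroid(tagged_songs[t]) for t in tags]
coords_2d = dict(zip(tags, pca_2d(centroids)))
print(coords_2d)
```

Running PCA on the tag centroids rather than on all songs keeps the 2D view oriented along the directions that separate the chosen tags; individual songs could then be projected with the same two axes.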

What people chose during the Researchers' Night in Zurich

YouJuke – The YouTube Jukebox

YouTube as media source

Music map to create smart playlist

Reaching YouJuke

www.youjuke.org

apps.facebook.com/youjuke

μseek

„play random songs that match my mood“

„On my shelf, John Lennon is next to the Beatles...“

μseek: The map of music on Android

An Intelligent iPod-Shuffle

skip =

listen =

Realization

After only a few skips, we know pretty well which songs match the user's mood
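How skips and listens might steer the shuffle can be sketched with a simple scoring rule. The rule below is purely hypothetical and is not taken from the talk: songs near listened songs score higher, songs near skipped songs score lower, and the next song is drawn at random with weights derived from those scores.

```python
import math
import random

def mood_score(candidate, listened, skipped, positions):
    """Higher when the candidate lies near listened songs and far from skipped ones."""
    def closeness(song):
        return 1.0 / (1.0 + math.dist(positions[candidate], positions[song]))
    return sum(closeness(s) for s in listened) - sum(closeness(s) for s in skipped)

def next_song(positions, listened, skipped, rng=random):
    """Pick an unplayed song at random, weighted by its mood score."""
    played = set(listened) | set(skipped)
    candidates = [s for s in positions if s not in played]
    if not candidates:
        return None
    scores = [mood_score(c, listened, skipped, positions) for c in candidates]
    lowest = min(scores)
    weights = [s - lowest + 1e-6 for s in scores]  # shift so all weights are positive
    return rng.choices(candidates, weights=weights, k=1)[0]

# Hypothetical map positions and feedback.
positions = {
    "rock_a": (0.0, 0.0), "rock_b": (1.0, 0.0), "rock_c": (0.0, 1.0),
    "dance_a": (9.0, 9.0), "dance_b": (10.0, 9.0),
}
listened, skipped = ["rock_a"], ["dance_a"]
print(next_song(positions, listened, skipped, rng=random.Random(0)))
```

Keeping the choice random, rather than always taking the best-scoring song, preserves the shuffle character while still biasing playback toward the region of the map the user seems to like.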

Work in Progress: Who is Dancing?

AC/DC

Beatles

Prodigy

„On my shelf, AC/DC is next to ZZ Top...“

Browsing Covers

Video

www.musicexplorer.org/museek

Internet connection only required at first startup!

• Thanks to:
  – Lukas Bossard
  – Mihai Calin
  – Olga Goussevskaia
  – Michael Lorenzi
  – Roger Wattenhofer
  – Samuel Welten

• URLs:
  – www.musicexplorer.org/museek
  – www.youjuke.org
  – apps.facebook.com/youjuke

• E-Mail:
  – [email protected] (Michael Kuhn)

Questions?