
Auralist: Introducing Serendipity into Music Recommendation

WSDM ’12 (ACM International Conference on Web Search and Data Mining)

1. INTRODUCTION

• The majority of research focuses on improving the accuracy of recommendation

• The dangers of optimizing for accuracy alone
– 1. it produces boring and ineffective recommendations
– 2. it harms a user's personal growth and experience

1. INTRODUCTION

• Contributions:
• 1. Balance the conflicting goals
– accuracy
– diversity
– novelty
– serendipity
• 2. Use metrics to measure all three non-accuracy factors simultaneously

1. INTRODUCTION

• Next…
• 2. Why accuracy is not enough (and the other three properties)
• 3. The Auralist framework (and algorithm)
• 4. Evaluation (quantitative evaluation)
• 5. User study (qualitative evaluation)

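2. WHY ACCURACY IS NOT ENOUGH

• First, the boring part: the notation
– the set of users
– the rating matrix
– the Top-N recommendations for user u
– the set of items
– the validation set of user u
– the training set of user u
– the popularity of item i
– the set of LDA topics
– the LDA item-topic matrix
– the number of listeners of artist (item) i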

2.1 ACCURACY

• average Top-N Recall

Intuitively: the number of items in the recommendation list that also appear in the validation set (the larger the better)
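The recall intuition can be made concrete with a small sketch. It assumes recall is the fraction of a user's validation items that show up in the Top-N list, averaged over users; the function and variable names are illustrative and do not come from the slides.

```python
def top_n_recall(recommended, held_out, n):
    """Fraction of a user's held-out items found in the first n recommendations."""
    if not held_out:
        return 0.0
    hits = set(recommended[:n]) & set(held_out)
    return len(hits) / len(set(held_out))


def average_top_n_recall(recs_by_user, held_out_by_user, n):
    """Mean Top-N recall over all users that have at least one held-out item."""
    users = [u for u, items in held_out_by_user.items() if items]
    if not users:
        return 0.0
    total = sum(
        top_n_recall(recs_by_user.get(u, []), held_out_by_user[u], n)
        for u in users
    )
    return total / len(users)


# Hypothetical toy data: two users, ranked recommendations and validation items.
recs = {"u1": ["a", "b", "c", "d"], "u2": ["x", "y", "z"]}
held_out = {"u1": ["b", "z"], "u2": ["x"]}
print(average_top_n_recall(recs, held_out, n=2))
```

A higher value means more of the held-out items were recovered.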

2.1 ACCURACY

• average Rank score

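• where

Intuitively: how highly the items in the validation set are placed by the recommender. Preference is expressed as a rank position, so the smaller the better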
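The formula itself is not reproduced here, so the following is only a sketch of one common way to compute such a score. It assumes a percentile rank (0 for the top of a user's ranked list, 1 for the bottom) averaged over all validation items; the names are hypothetical.

```python
def percentile_rank(ranked_items, item):
    """Position of item in the ranked list, scaled to [0, 1] (0 = top)."""
    if len(ranked_items) < 2:
        return 0.0
    return ranked_items.index(item) / (len(ranked_items) - 1)


def average_rank_score(ranking_by_user, held_out_by_user):
    """Mean percentile rank of held-out items; lower means they are ranked higher."""
    ranks = []
    for user, items in held_out_by_user.items():
        ranking = ranking_by_user.get(user, [])
        for item in items:
            if item in ranking:  # items the recommender never ranked are skipped
                ranks.append(percentile_rank(ranking, item))
    return sum(ranks) / len(ranks) if ranks else 0.0


# Hypothetical toy data: one user, a full ranking of five items.
ranking = {"u1": ["a", "b", "c", "d", "e"]}
held_out = {"u1": ["a", "c"]}
print(average_rank_score(ranking, held_out))
```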

2.1 ACCURACY

• Accuracy-focused recommenders produce recommendations that appear superficially “good”

• But are in fact inferior in terms of actual user satisfaction

2. DIVERSITY

Intuitively: the pairwise cosine similarity between all items in the recommendation list (the smaller the better)
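A minimal sketch of this intuition, assuming each recommended item is described by a numeric feature vector and that the list-level score is the mean cosine similarity over all unordered pairs. The names are illustrative.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def intra_list_similarity(item_vectors):
    """Mean pairwise cosine similarity of a recommendation list (lower = more diverse)."""
    pairs = list(combinations(item_vectors, 2))
    if not pairs:
        return 0.0
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)


# Hypothetical list: two identical items and one unrelated item.
print(intra_list_similarity([[1, 0], [1, 0], [0, 1]]))
```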

2. NOVELTY

Intuitively: higher values mean that more globally “unexplored” items are being recommended
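One way to turn this into a number is sketched below. It assumes an item's popularity is the fraction of all users who listen to it, and scores a list by the mean of -log2(popularity), so rarely heard items contribute more. The names and the choice of log base are assumptions.

```python
import math


def novelty(recommended, listener_counts, total_users):
    """Mean self-information (-log2 of popularity) of the recommended items."""
    if not recommended:
        return 0.0
    scores = [
        -math.log2(listener_counts[item] / total_users)
        for item in recommended
    ]
    return sum(scores) / len(scores)


# Hypothetical counts: "hit" is heard by half of 100 users, "niche" by a quarter.
counts = {"hit": 50, "niche": 25}
print(novelty(["hit", "niche"], counts, total_users=100))
```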

2. SERENDIPITY

Intuitively: the pairwise similarity between the items in the training set and the recommended items (the smaller the better)
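As a sketch of this intuition, the snippet below averages the cosine similarity between every item in a user's training history and every recommended item. Calling the result "unserendipity" (lower means more surprising) and using plain feature vectors are assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def unserendipity(history_vectors, recommended_vectors):
    """Mean similarity between history items and recommended items."""
    sims = [
        cosine(h, r)
        for h in history_vectors
        for r in recommended_vectors
    ]
    return sum(sims) / len(sims) if sims else 0.0


# Hypothetical user: one known item, one familiar and one unrelated recommendation.
print(unserendipity([[1, 0]], [[1, 0], [0, 1]]))
```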

3. THE AURALIST FRAMEWORK

• Basic
– LDA model

• Hybrid
– Listener Diversity + Basic ---> Community-Aware
– Declustering + Basic ---> Bubble-Aware

• Full

3.1 BASIC AURALIST

• Using LDA (Latent Dirichlet Allocation)
• That is: the word dimensions of a document's original vector space are converted into “Topic” dimensions

3.1 BASIC AURALIST

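• An example
• A document A contains the two near-synonyms “computer” and “microcomputer”.
• After vectorizing A, “computer” might be dimension 2 of the full vocabulary and “microcomputer” dimension 3.
• The projection onto each dimension is simply taken to be the word's TF (the number of times it occurs in the document).
• A={x,1,1,x,...,x}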

3.1 BASIC AURALIST

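• Word vector space: A={x,1,1,x,...,x}
• In this vector space the “computer” and “microcomputer” dimensions are treated as orthogonal, i.e. the two words are taken to have completely different meanings.
• The two word dimensions are “merged” into a single Topic dimension, and each word is represented by a weight within that Topic.
• Topic vector space: A={y,(p1+p2),y,...,y}
• This reduces the dimensionality (which looks like a good thing)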
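The merge of two word dimensions into one Topic dimension can be illustrated with a toy projection. The weights below are set by hand to stand in for p1 and p2; a real LDA model would estimate them from data, and all names here are hypothetical.

```python
def to_topic_space(term_counts, word_topic_weights, n_topics):
    """Project a bag-of-words document onto topic dimensions.

    term_counts: word -> term frequency in the document
    word_topic_weights: word -> {topic index: weight of the word in that topic}
    """
    topic_vector = [0.0] * n_topics
    for word, tf in term_counts.items():
        for topic, weight in word_topic_weights.get(word, {}).items():
            topic_vector[topic] += tf * weight
    return topic_vector


# Document A uses each of the two near-synonyms once.
doc_a = {"computer": 1, "microcomputer": 1}

# Hand-picked weights (p1 = 0.5, p2 = 0.25): both words load on topic 1.
weights = {
    "computer": {1: 0.5},
    "microcomputer": {1: 0.25},
}

print(to_topic_space(doc_a, weights, n_topics=3))  # [0.0, 0.75, 0.0]
```

Two separate word dimensions end up as a single topic entry equal to p1 + p2.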

3.1 BASIC AURALIST

• Document: The William Randolph Hearst Foundation will give $1.25 million to Lincoln Center, Metropolitan Opera Co., New York Philharmonic and Juilliard School. “Our board felt that we had a real opportunity to make a mark on the future of the performing arts with these grants, an act every bit as important as our traditional areas of support in health, medical research, education and the social services,” Hearst Foundation President Randolph A. Hearst said Monday in announcing the grants. Lincoln Center’s share will be $200,000 for its new building, which will house young artists and provide new public facilities. The Metropolitan Opera Co. and New York Philharmonic will receive $400,000 each. The Juilliard School, where music and the performing arts are taught, will get $250,000. The Hearst Foundation, a leading supporter of the Lincoln Center Consolidated Corporate Fund, will make its usual annual $100,000 donation, too.

3.1 BASIC AURALIST

• Words ---> Topics

3.1 BASIC AURALIST

• Artist-based LDA model

3.1 BASIC AURALIST

• similarity between artist topic vectors

• the score that user u associates with item i
– the LDA similarity is used directly for item-based recommendation

• Sort all items by their Basic score to obtain the recommendation list
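The scoring and sorting steps can be sketched as follows. This assumes that the similarity between artists is the cosine of their topic vectors and that a candidate's Basic score is the sum of its similarities to the artists in the user's history; both are assumptions for illustration, as are all names.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def basic_scores(history, topic_vectors):
    """Score every artist not in the history by summed similarity to the history."""
    return {
        artist: sum(cosine(vector, topic_vectors[h]) for h in history)
        for artist, vector in topic_vectors.items()
        if artist not in history
    }


def recommend(history, topic_vectors, n):
    """Sort candidates by Basic score, highest first, and keep the top n."""
    scores = basic_scores(history, topic_vectors)
    return sorted(scores, key=scores.get, reverse=True)[:n]


# Hypothetical topic vectors for three artists over two topics.
vectors = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
print(recommend(["a"], vectors, n=2))  # ['b', 'c']
```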

3.2 Two hybrid versions of Auralist

• “A” that includes
– Artist-based LDA
– Listener Diversity
– Declustering

3.2.1 Community-Aware Auralist

• Listener Diversity(the entropy over its topic distribution)

• The Rank

• Give it some offset

• The offset
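Listener Diversity is described above as the entropy over an artist's topic distribution. A minimal sketch of that entropy, assuming the distribution is a list of probabilities summing to 1 and using log base 2 (the base is an assumption):

```python
import math


def listener_diversity(topic_distribution):
    """Shannon entropy (in bits) of an artist's distribution over topics."""
    return -sum(p * math.log2(p) for p in topic_distribution if p > 0)


# Hypothetical artists: one heard across four communities, one confined to a single one.
print(listener_diversity([0.25, 0.25, 0.25, 0.25]))  # 2.0
print(listener_diversity([1.0, 0.0, 0.0, 0.0]))
```

An artist whose listeners are spread evenly over many topics gets a high value, while an artist tied to one topic gets zero.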

3.2.2 Bubble-Aware Auralist

• The rank

3.2.2 Bubble-Aware Auralist

• algorithm

4. EVALUATION

• 1. Basic Auralist
• 2. The state of the art:
– Implicit SVD (singular value decomposition) method
• 3. Community-aware Auralist (λ1=0.05)
• 4. Bubble-aware Auralist (λ2=0.2)
• 5. Full Auralist
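The list gives the weights λ1 and λ2 but not how they enter the hybrids. The sketch below assumes a linear interpolation of rank positions, with the Basic ranking weighted by 1 - λ1 - λ2; this combination rule and all names are assumptions, not taken from the slides.

```python
def rank_positions(scores):
    """Map each item to its position (0 = best) when sorted by descending score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {item: position for position, item in enumerate(ordered)}


def hybrid_ranking(basic, community, bubble, lam1, lam2):
    """Blend three rankings of the same items; a lower combined rank is better."""
    r_basic = rank_positions(basic)
    r_community = rank_positions(community)
    r_bubble = rank_positions(bubble)
    combined = {
        item: (1 - lam1 - lam2) * r_basic[item]
        + lam1 * r_community[item]
        + lam2 * r_bubble[item]
        for item in basic
    }
    return sorted(combined, key=combined.get)


# Hypothetical scores: Basic prefers "a", the other two components prefer "c".
basic = {"a": 3.0, "b": 2.0, "c": 1.0}
other = {"a": 1.0, "b": 2.0, "c": 3.0}
print(hybrid_ranking(basic, other, other, lam1=0.05, lam2=0.2))  # ['a', 'b', 'c']
print(hybrid_ranking(basic, other, other, lam1=0.5, lam2=0.5))   # ['c', 'b', 'a']
```

Setting λ2 to 0 gives a community-aware blend and setting λ1 to 0 a bubble-aware one. With small weights the Basic ordering still dominates.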

4.1 DATASET

• user.getTopArtists() from the Last.fm API
• Quantity: 360k users
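For orientation, a request to user.getTopArtists might be assembled as below. This is a sketch of building the URL only, with no network call. The API root and parameter names follow the public Last.fm web API as commonly documented; the user name and key are placeholders.

```python
from urllib.parse import urlencode

# Root of the Last.fm web API (version 2.0).
API_ROOT = "http://ws.audioscrobbler.com/2.0/"


def top_artists_url(user, api_key, limit=50):
    """Build the request URL for one user's top artists, asking for JSON."""
    params = {
        "method": "user.gettopartists",
        "user": user,
        "api_key": api_key,
        "limit": limit,
        "format": "json",
    }
    return API_ROOT + "?" + urlencode(params)


# Placeholder credentials; a real key must be requested from Last.fm.
print(top_artists_url("some_user", "YOUR_API_KEY", limit=10))
```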

4.2 Basic Auralist Recommendation

4.2 Hybrid versions of Auralist

• Accuracy performance

4.2 Hybrid versions of Auralist

• Diversity, Novelty, Serendipity performance

5. USER STUDY

• Full Auralist

• λ1=0.03
• λ2=0.20

5.1 Experimental Method

• involved 21 participants
• included a mix of
– undergraduates/postgraduates
– men/women
– ages between 18 and 27
– varying nationalities

5.2 User Ratings

5.2 User Satisfaction

• “[Full Auralist] was more satisfying because it introduced me to new artists. [Basic] was filled entirely with new artists which, while very good, were things that I listened to all the time on a regular basis. [Full Auralist] had artists that were of the same quality of those I listen to but which I'd never heard of.”

• “I found [the Full Auralist list] more surprising than [Basic]. Most artists I had not heard of (which is what I prefer). Listening to them gave me at least five new artists I could look into and use in the future.”

• “While I enjoyed the songs on the [Full Auralist] list less, I liked that there was more new music on it than the first list. So I'm going to say that I preferred the [Full Auralist] list.”

• “[The Basic list was better], more familiar music & more my taste, although [Full Auralist] introduced me to a few good bands.”

• “[The Full Auralist list] was way too jazzy, and had very few artists I connected with immediately. While [the Basic list] had a vast majority of artists I knew well and have opinions of, the few unknowns were really very congenial.”

Comments from the study participants, most of them praising Full Auralist