The Benefit of Using Tag-Based Profiles
Claudiu Firan, Wolfgang Nejdl, Raluca Paiu
5th Latin American Web Congress, 2007


Music Recommendation

[Diagram labels: Personal Music, Community Data]

Challenges

Collaborative Filtering
• Cold start problem
  • Items with no ratings
  • Users with no profile
• Poor artist variety in recommended pieces
• Slow

Content Based Techniques
• Unreliability in modeling user’s preferences
• Content similarity does not necessarily reflect preferences
• Slow

Hybrid Methods
• Heavy user input

New Approach

[Diagram labels: Personal Music, Community Data, Personal Tags]

Why Use Tags?

Tags are:
• Written chaotically
• Not verified
• Unstructured
• Heterogeneous
• Unreliable

But when there are many of them, the correct ones emerge

“Wisdom of the masses”

Last.fm – “The Social Music Revolution”

[Figure labels: Track, Artist, Similar Artists, Albums, Track Usage Info, Similar Tracks, Tags (with weight), User Comments]

Tracks, Tags, and Profiles

User Profiles

weight = preference(user, item)

Track-based Profiles (TR)

preference(user, track) = log(user_track_#listened)

TR profile: <track_i, weight_i> …
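A minimal sketch of how a track-based (TR) profile could be computed from the formula above. The natural logarithm, the function name `track_profile` and the listen counts are illustrative assumptions; the slide does not state the log base.

```python
import math

def track_profile(listen_counts):
    """Map each track to weight = log(number of times the user listened to it).

    Assumption: natural logarithm; tracks with a non-positive count are skipped.
    """
    return {track: math.log(n) for track, n in listen_counts.items() if n > 0}

# Made-up listen counts for one user.
listens = {"track_a": 20, "track_b": 5, "track_c": 1}
profile_tr = track_profile(listens)
```

Note that a track listened to exactly once receives weight 0 under this formula, because log(1) = 0.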

Track-Tag-based Profiles (TT)

preference(user, tag) = log( Σ_i ( log(user_track_i_#listened) ∙ log(user_tag_track_i_#tagged) ) ) [∙ ITF(tag)]

ITF = Inverse Tag Frequency
• With ITF: TTI
• Without ITF: TTN

TTN / TTI profile: <tag_i, weight_i> …
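The track-tag formula can be sketched as follows. Several details are assumptions not given on the slide: the natural logarithm, the definition of ITF as log(number of tracks / number of tracks carrying the tag), and the choice to drop tags whose inner sum is not positive (where the outer log is undefined). All names and counts are hypothetical.

```python
import math

def inverse_tag_frequency(tag_counts):
    """Assumed ITF: log(total tracks / tracks carrying the tag)."""
    n_tracks = len(tag_counts)
    carriers = {}
    for tags in tag_counts.values():
        for tag in tags:
            carriers[tag] = carriers.get(tag, 0) + 1
    return {tag: math.log(n_tracks / c) for tag, c in carriers.items()}

def track_tag_profile(listen_counts, tag_counts, itf=None):
    """listen_counts: {track: listens}; tag_counts: {track: {tag: times tagged}}.

    itf=None gives the TTN variant; passing an ITF dict gives TTI.
    """
    sums = {}
    for track, listened in listen_counts.items():
        for tag, tagged in tag_counts.get(track, {}).items():
            if listened < 1 or tagged < 1:
                continue  # log undefined for non-positive counts
            sums[tag] = sums.get(tag, 0.0) + math.log(listened) * math.log(tagged)
    profile = {}
    for tag, total in sums.items():
        if total <= 0:
            continue  # assumption: skip tags where the outer log is undefined
        weight = math.log(total)
        if itf is not None:
            weight *= itf.get(tag, 1.0)
        profile[tag] = weight
    return profile

# Made-up data: listens per track and tag counts per track.
listens = {"t1": 30, "t2": 12, "t3": 4}
tags = {"t1": {"rock": 50, "indie": 8}, "t2": {"rock": 20}, "t3": {"jazz": 15}}

ttn = track_tag_profile(listens, tags)
tti = track_tag_profile(listens, tags, itf=inverse_tag_frequency(tags))
```

In this toy data the frequent tag "rock" dominates the TTN profile, while the ITF factor pushes the rarer tag "indie" to the top of the TTI profile.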

Tag-based Profiles (TG)

preference(user, tag) = log(user_tag_#used)

TG profile: <tag_i, weight_i> …

User Profiles from Personal MP3s

1. Read personal playlist from PC

2. Match MP3s against our database

3. Add overall average usage information values
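The three steps above can be sketched roughly as below. This is only an illustration: the playlist is passed in as an in-memory list (reading files from disk is omitted), `DATABASE` is a hypothetical stand-in for the crawled track database, matching is done by lower-cased artist and title, and using the log of the average listen count as the weight is an assumption that mirrors the track-based formula.

```python
import math

# Hypothetical database: (artist, title) -> average number of listens per user.
DATABASE = {
    ("radiohead", "creep"): 12.0,
    ("miles davis", "so what"): 7.5,
}

def normalize(artist, title):
    """Normalize metadata so that minor spelling differences still match."""
    return artist.strip().lower(), title.strip().lower()

def profile_from_playlist(playlist, database=DATABASE):
    """Match playlist entries against the database and weight the matches."""
    profile, unmatched = {}, []
    for artist, title in playlist:
        key = normalize(artist, title)
        avg_listens = database.get(key)
        if avg_listens is None:
            unmatched.append((artist, title))  # not in the database
        elif avg_listens > 0:
            profile[key] = math.log(avg_listens)  # assumed weighting
    return profile, unmatched

playlist = [("Radiohead", "Creep "), ("Unknown Band", "Demo")]
profile, unmatched = profile_from_playlist(playlist)
```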

Collaborative Filtering vs. Search

Track- & Tag-based Recommendations

[Diagram labels: Collaborative Filtering; <track_i, weight_i> …; <tag_i, weight_i> …]

Tag-based Search

<tag_i, weight_i> …
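The slide does not spell out how the tag profile is turned into a search, so the following is only one plausible sketch: each candidate track is scored by the sum of profile weight times the track's own tag weight, and the top k tracks the user does not already know are returned. The scoring rule, function name and data are all assumptions.

```python
def search_by_tags(profile, track_tags, known=(), k=10):
    """Rank tracks by overlap between the user's tag profile and track tags.

    profile: {tag: weight}; track_tags: {track: {tag: weight}};
    known: tracks already in the user's collection, excluded from results.
    """
    scores = {}
    for track, tags in track_tags.items():
        if track in known:
            continue
        score = sum(w * tags.get(tag, 0.0) for tag, w in profile.items())
        if score > 0:
            scores[track] = score
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Made-up profile and tagged tracks.
profile = {"rock": 3.0, "indie": 2.0}
track_tags = {
    "t1": {"rock": 1.0},
    "t2": {"rock": 0.5, "indie": 1.0},
    "t3": {"jazz": 1.0},
    "t4": {"indie": 0.2},
}
results = search_by_tags(profile, track_tags, known={"t1"})
```

Because the score needs only the user's own tag weights and the tags on each track, a search of this kind does not depend on ratings from other users.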

Algorithms

Experiments & Outcome

Last.fm Crawled Data

• 317,058 tracks
• 21,177 tags (most prominent ones are music genres)
• 289,654 users, of whom 12,193 listened to at least 50 tracks and used at least 10 tags

Experimental Setup

1. Create user profiles
   • 18 subjects
   • 658 tracks on average in user profile (not statistically significant in influencing algorithm outcome)
2. Run algorithms
   • 7 algorithms
   • 10 recommended items per algorithm per user
3. Two scores
   • Quality of recommendation [0-2]: NDCG
   • Novelty of recommendation [0-2]: Average
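The two scores can be sketched as below for one user's list of ratings on the 0-2 scale. The slides do not say which NDCG variant was used, so this assumes the common form with linear gain and a log2 position discount. By default it normalizes against the best ordering of the same ratings; passing `max_rating` normalizes against a list in which every item received the top rating. Both options and all names are assumptions.

```python
import math

def dcg(ratings):
    """Discounted cumulative gain with linear gain and log2 discount."""
    return sum(r / math.log2(pos + 1) for pos, r in enumerate(ratings, start=1))

def ndcg(ratings, max_rating=None):
    """Normalized DCG for a ranked list of graded ratings."""
    if max_rating is None:
        ideal = dcg(sorted(ratings, reverse=True))  # best reordering
    else:
        ideal = dcg([max_rating] * len(ratings))    # every item rated best
    return dcg(ratings) / ideal if ideal > 0 else 0.0

def average(ratings):
    """Plain mean, as used for the novelty score."""
    return sum(ratings) / len(ratings)

# Made-up ratings for 10 recommended items, in ranked order.
quality = [2, 2, 1, 0, 1, 2, 0, 0, 1, 0]
novelty = [0, 1, 2, 2, 1, 0, 2, 2, 1, 2]

quality_score = ndcg(quality, max_rating=2)
novelty_score = average(novelty)
```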

Results

Nr  Algorithm  NDCG  Signif. vs. CFTR  Average Novelty  Average Popularity
1   CFTR       0.54  -                 1.39             15,177
2   CFTG       0.25  Highly            1.83             4,065
3   CFTTI      0.36  Highly            1.72             6,632
4   CFTTN      0.37  Highly            1.74             13,671
5   STG        0.60  No                1.07             7,587
6   STTI       0.73  Highly            0.82             10,380
7   STTN       0.77  Highly            0.78             16,309

CFTR: Baseline

STG:
• Lower popularity
• Higher quality

STTI & STTN:
• Huge improvement
• Statistically significant

NDCG – Novelty:
• High inverse correlation
• Pearson c = -0.987
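As a quick check of the inverse relationship, the Pearson coefficient can be recomputed from the NDCG and Average Novelty columns of the results table. The values below are the rounded figures from the slide, so the result should land close to the reported -0.987 without necessarily matching it to three decimals. The helper name `pearson` is illustrative.

```python
import math

# Rounded values from the results table, algorithms 1-7 in order.
ndcg_scores = [0.54, 0.25, 0.36, 0.37, 0.60, 0.73, 0.77]
novelty_scores = [1.39, 1.83, 1.72, 1.74, 1.07, 0.82, 0.78]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)

r = pearson(ndcg_scores, novelty_scores)
print(f"Pearson r = {r:.3f}")
```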

Gain over the Baseline (CF on Tracks)

Conclusions

• CF on tag-based profiles performs worse than CF on track-based profiles
• Search with tags improved recommendation performance substantially
  • 44% increase in quality
  • Instant results – virtually no time delay
  • No cold start problem
• Tag-based profiles also work with less rich music repositories
• Results probably influenced by the consistent tag usage on Last.fm: mostly genres