The Benefit of Using Tag-Based Profiles Claudiu Firan, Wolfgang Nejdl, Raluca Paiu 5 th Latin...


The Benefit of Using Tag-Based Profiles

Claudiu Firan, Wolfgang Nejdl, Raluca Paiu
5th Latin American Web Congress, 2007

Music Recommendation

Personal Music

Community

Data

Challenges

Collaborative Filtering

Content Based Techniques

Hybrid Methods

• Cold start problem
  • Items with no ratings
  • Users with no profile

• Poor artist variety in recommended pieces
• Slow

• Unreliability in modeling user’s preferences
• Content similarity does not necessarily reflect preferences
• Slow

• Heavy user input

New Approach

Personal Music

Community

Data

Personal Tags

Why Use Tags?

Tags are:
• Written chaotically
• Not verified
• Unstructured
• Heterogeneous
• Unreliable

But when there are many of them, the correct ones emerge

“Wisdom of the masses”

Last.fm – “The Social Music Revolution”

Track

Artist

Similar Artists

Albums

Track Usage

Info

Similar Tracks

Tags (with weight)

User Comments

Tracks, Tags, and Profiles

User Profiles

weight = preference(user, item)

Track-based Profiles (TR)

preference(user, track) = log(user_track_#listened)

TR

<track_i, weight_i> …
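The TR formula can be illustrated with a minimal sketch. The function name `track_profile` and the sample play counts are hypothetical; the slides do not say which logarithm base is used, so the natural logarithm is assumed here.

```python
import math

def track_profile(listen_counts):
    """Map each track to log(number of times the user listened to it)."""
    # Tracks with a zero count are skipped because log(0) is undefined.
    return {track: math.log(n) for track, n in listen_counts.items() if n > 0}

# Hypothetical listening history: track -> play count.
plays = {"Track A": 20, "Track B": 5, "Track C": 1}
profile = track_profile(plays)  # list of <track, weight> pairs as a dict
```

Note that under this formula a track listened to exactly once receives weight log(1) = 0.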

Track-Tag-based Profiles (TT)

preference(user, tag) = log( Σ_i ( log(user_track_i_#listened) ∙ log(user_tag_track_i_#tagged) ) ) [∙ ITF(tag)]

ITF = Inverse Tag Frequency
• With: TTI
• Without: TTN

TTN

TTI

<tag_i, weight_i> …
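A minimal sketch of the TT formula follows. The slides do not define how ITF is computed, so the sketch takes ITF values as an optional, caller-supplied mapping; the function name, the data layout, and all sample numbers (including the ITF values) are assumptions made for illustration.

```python
import math

def track_tag_profile(listen_counts, track_tags, itf=None):
    """Compute tag weights from listened tracks and the tags on those tracks.

    listen_counts: {track: times the user listened}
    track_tags:    {track: {tag: times the tag was applied to the track}}
    itf:           optional {tag: ITF value}; None gives the TTN variant.
    """
    sums = {}
    for track, listened in listen_counts.items():
        for tag, tagged in track_tags.get(track, {}).items():
            if listened > 0 and tagged > 0:
                term = math.log(listened) * math.log(tagged)
                sums[tag] = sums.get(tag, 0.0) + term
    profile = {}
    for tag, total in sums.items():
        if total <= 0:
            continue  # the outer log is undefined for a non-positive sum
        weight = math.log(total)
        if itf is not None:
            weight *= itf.get(tag, 1.0)  # TTI variant; missing tags are left unscaled
        profile[tag] = weight
    return profile

# Hypothetical data.
plays = {"Track A": 20, "Track B": 5}
tags = {"Track A": {"rock": 50, "indie": 10}, "Track B": {"rock": 30}}
ttn = track_tag_profile(plays, tags)
tti = track_tag_profile(plays, tags, itf={"rock": 0.5, "indie": 2.0})
```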

Tag-based Profiles (TG)

preference(user, tag) = log(user_tag_#used)

TG

<tag_i, weight_i> …

User Profiles from Personal MP3s

1. Read personal playlist from PC

2. Match MP3s against our database

3. Fill in usage information with overall average values
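The three steps above can be sketched as follows. Everything here is an assumption made for illustration: the slides do not say how MP3s are matched or which averages are used, so the sketch matches on a normalized (artist, title) key and substitutes a per-track community average play count (the hypothetical `avg_listens` field) for the missing personal count.

```python
import math

def normalize(artist, title):
    """Build a case- and whitespace-insensitive lookup key."""
    return (artist.strip().lower(), title.strip().lower())

def profile_from_playlist(playlist, database):
    """playlist: iterable of (artist, title); database: {key: record}."""
    profile = {}
    for artist, title in playlist:              # step 1: entries read from the playlist
        record = database.get(normalize(artist, title))
        if record is None:
            continue                            # step 2: unmatched MP3s are skipped
        avg = record["avg_listens"]
        if avg > 0:
            # step 3: no personal play count exists, so the average stands in for it
            profile[record["track_id"]] = math.log(avg)
    return profile

# Hypothetical database and playlist.
database = {("some artist", "some song"): {"track_id": "T1", "avg_listens": 12.0}}
playlist = [(" Some Artist", "Some Song "), ("Unknown", "Missing")]
profile = profile_from_playlist(playlist, database)
```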


Collaborative Filtering vs. Search

Track- & Tag-based Recommendations

Collaborative Filtering

<track_i, weight_i> …

<tag_i, weight_i> …
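Collaborative filtering over either kind of profile can be sketched as below. The slides do not give the similarity measure or neighborhood size, so this is a generic user-based variant with cosine similarity, not necessarily the one used in the experiments; all names and sample data are hypothetical.

```python
import math

def cosine(p, q):
    """Cosine similarity between two sparse profiles stored as dicts."""
    dot = sum(w * q[key] for key, w in p.items() if key in q)
    norm = math.sqrt(sum(w * w for w in p.values())) * math.sqrt(sum(w * w for w in q.values()))
    return dot / norm if norm else 0.0

def recommend_cf(target_profile, known_tracks, community, k=2, n=10):
    """community: {user: (similarity_profile, track_profile)}.

    The similarity profile may hold tracks or tags; recommended items are
    always tracks taken from the k most similar users.
    """
    neighbors = sorted(
        community.values(),
        key=lambda pair: cosine(target_profile, pair[0]),
        reverse=True,
    )[:k]
    scores = {}
    for sim_profile, tracks in neighbors:
        sim = cosine(target_profile, sim_profile)
        for track, weight in tracks.items():
            if track not in known_tracks:
                scores[track] = scores.get(track, 0.0) + sim * weight
    return sorted(scores, key=scores.get, reverse=True)[:n]

# Hypothetical tag-based example.
target = {"rock": 2.0, "indie": 1.0}
community = {
    "u1": ({"rock": 1.5, "indie": 1.0}, {"T1": 2.0, "T2": 1.0}),
    "u2": ({"jazz": 2.0}, {"T3": 3.0}),
}
recommended = recommend_cf(target, {"T2"}, community, k=1)
```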

Tag-based Search

<tag_i, weight_i> …
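Tag-based search uses the tag profile as a query against tagged tracks. The slides do not specify the ranking function, so the sketch below assumes a simple weighted overlap: each track is scored by the sum of user tag weight times track tag weight. Function name, index layout, and sample weights are hypothetical.

```python
def search_by_tags(tag_profile, track_tags, known_tracks=(), n=10):
    """Rank tracks by the weighted overlap between user tags and track tags.

    tag_profile: {tag: user weight}
    track_tags:  {track: {tag: tag weight on that track}}
    """
    scores = {}
    for track, tags in track_tags.items():
        if track in known_tracks:
            continue  # do not recommend what the user already has
        score = sum(w * tags[tag] for tag, w in tag_profile.items() if tag in tags)
        if score > 0:
            scores[track] = score
    return sorted(scores, key=scores.get, reverse=True)[:n]

# Hypothetical profile and tag index.
profile = {"rock": 2.0, "indie": 1.0}
index = {
    "T1": {"rock": 0.9},
    "T2": {"indie": 0.8, "rock": 0.2},
    "T3": {"jazz": 1.0},
}
results = search_by_tags(profile, index)
```

Because the query needs only the user's own tags and no other users' ratings, a sketch like this runs without any neighborhood computation.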

Algorithms

Experiments & Outcome

Last.fm Crawled Data

• 317,058 tracks

• 21,177 tags (most prominent ones are music genres)

• 289,654 users, of whom 12,193 listened to at least 50 tracks and used at least 10 tags

Experimental Setup

1. Create user profiles
   • 18 subjects
   • 658 tracks on average in user profile (not statistically significant in influencing algorithm outcome)

2. Run algorithms
   • 7 algorithms
   • 10 recommended items per algorithm per user

3. Two scores
   • Quality of recommendation [0-2]: NDCG
   • Novelty of recommendation [0-2]: Average
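The two scores can be sketched as follows. NDCG has several variants and the slides do not say which one was used, so this sketch assumes one common formulation (gain equal to the rating, log2 position discount, ideal ordering taken from the same ratings). Function names and sample ratings are hypothetical.

```python
import math

def dcg(ratings):
    """Discounted cumulative gain for ratings listed in ranked order."""
    return sum(r / math.log2(pos + 2) for pos, r in enumerate(ratings))

def ndcg(ratings):
    """DCG normalized by the DCG of the best possible ordering."""
    ideal = dcg(sorted(ratings, reverse=True))
    return dcg(ratings) / ideal if ideal > 0 else 0.0

def average_novelty(ratings):
    """Plain mean of the per-item novelty ratings."""
    return sum(ratings) / len(ratings)

# Hypothetical ratings on the 0-2 scale for 10 recommended items.
quality = [2, 2, 1, 0, 2, 1, 0, 0, 1, 2]
novelty = [0, 1, 2, 2, 0, 1, 2, 2, 1, 0]
quality_score = ndcg(quality)
novelty_score = average_novelty(novelty)
```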


Results

Nr  Algorithm  NDCG  Signif. vs. CFTR  Average Novelty  Average Popularity
1   CFTR       0.54  -                 1.39             15,177
2   CFTG       0.25  Highly            1.83              4,065
3   CFTTI      0.36  Highly            1.72              6,632
4   CFTTN      0.37  Highly            1.74             13,671
5   STG        0.60  No                1.07              7,587
6   STTI       0.73  Highly            0.82             10,380
7   STTN       0.77  Highly            0.78             16,309

CFTR: Baseline

STG:
• Lower popularity
• Higher quality

STTI & STTN:
• Huge improvement
• Statistically significant

NDCG – Novelty:
• High inverse correlation
• Pearson c = -0.987
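The inverse correlation between NDCG and novelty can be checked against the results table with a short sketch. It uses the rounded values from the table, so the result is strongly negative but need not match the reported coefficient to the last digit.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# NDCG and average novelty per algorithm, rows 1-7 of the results table.
ndcg_scores = [0.54, 0.25, 0.36, 0.37, 0.60, 0.73, 0.77]
novelty_scores = [1.39, 1.83, 1.72, 1.74, 1.07, 0.82, 0.78]
r = pearson(ndcg_scores, novelty_scores)
```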

Gain over the Baseline (CF on Tracks)
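The gain of each algorithm over the CFTR baseline can be recomputed from the NDCG column of the results table. This is a minimal sketch assuming gain means relative change in NDCG; because the table values are rounded, the percentages may differ slightly from those reported elsewhere in the slides.

```python
def gain_over_baseline(scores, baseline="CFTR"):
    """Relative NDCG change of every algorithm with respect to the baseline."""
    base = scores[baseline]
    return {name: (s - base) / base for name, s in scores.items() if name != baseline}

# NDCG values from the results table.
ndcg_by_algorithm = {
    "CFTR": 0.54, "CFTG": 0.25, "CFTTI": 0.36, "CFTTN": 0.37,
    "STG": 0.60, "STTI": 0.73, "STTN": 0.77,
}
gains = gain_over_baseline(ndcg_by_algorithm)
for name, gain in gains.items():
    print(f"{name}: {gain:+.1%}")
```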


Conclusions

• CF on tag-based profiles performs worse than CF on track-based profiles

• Search with tags improved recommendation performance substantially
  • 44% increase in quality
  • Instant results – virtually no time delay
  • No cold start problem

• Tag-based profiles also work with less rich music repositories

• Results probably influenced by the consistent tag usage on Last.fm: mostly genres