
User Profiling based on Folksonomy Information in Web 2.0 for Personalized Recommender Systems

Huizhi (Elly) Liang

Supervisors: Yue Xu, Yuefeng Li, Richi Nayak

Queensland University of Technology, Australia

Agenda

1 Introduction
2 Literature Review
3 The Proposed Approaches
4 Experiments
5 Conclusion

1 Introduction

Information overload
Personalization
  “Personalization is the ability to provide content and services tailored to individuals based on knowledge about their preferences and behaviours” (Hagen, 1999)
  Recommender systems
User profiling
  Explicit user profiling: explicit ratings
  Implicit user profiling: web logs
  Other information sources

Web 2.0

Web 2.0: the read-and-write web (O’Reilly Media, 2004)
  A platform for users to conduct online participation, collaboration and interaction
  Expressing opinions, sharing information, building networks
  Wikipedia, Facebook, Delicious, Twitter
Plenty of new user information
  Folksonomy (tags), reviews, networks, blogs, micro-blogs, etc.
Opportunities
  Providing possible new solutions to profile users

Folksonomy

Folksonomy = folk + taxonomy
Tags: typical Web 2.0 information
  Keywords given by users to organize and classify items
  The wisdom of crowds
Multiple functions
  Item organizing and sharing
  Building networks
  Expressing users’ explicit topic interests and opinions

Tag Cloud

[Figure: folksonomy tags shown as a tag cloud, alongside taxonomy categories]

Taxonomy
  Given by experts
  Standard vocabulary & structural relationships
  Well recognized as common knowledge
  Independent of user communities
  No information about users’ personal viewpoints or preferences
Folksonomy
  Given by users explicitly and proactively
  Reflecting users’ personal viewpoints and topic preferences
  Less intrusive & multiple functions
  Lightweight textual information
  Contains a lot of noise

2 Literature Review

User Profiling

Web user profiling
  Web content & structure
  Web log & Web usage
  Taxonomy & ontology
User profiling in Web 2.0
  New user information sources
    Folksonomy, blogs, reviews, micro-blogs
    Videos, audio, images
    Friends, trust network, followers, following

User Profiling 2

User profiling based on folksonomy
Approaches
  Users’ own tags
  Associated tags
  Latent topics of tags
  Popular tags
Challenges
  Distinctive features of tags
  Tag quality problem
    Semantic ambiguity and synonyms
    About 60% of tags are personal tags

Recommender system

Recommendation tasks
  Top N Recommendation (Precision, Recall, F1)
  Rating Prediction (Mean Absolute Error, Root Mean Squared Error)
Recommendation approaches
  Content based
    Term vector model
    Latent Dirichlet Allocation (LDA)
  Collaborative Filtering (CF)
    Memory based CF: User-KNN & Item-KNN
    Model based CF: Matrix Factorization techniques
  Hybrid

Recommender system 2

Recommender systems based on Taxonomy
  Ziegler’s approach (CIKM, 2004)
Recommender systems based on Folksonomy
  Tag recommendations
    Tensor based approach (KDD, 2009)
    Graph based approach (SIGIR, 2009)
  Item recommendations
    Tso-Sutter’s approach (SAC, 2008)
    Clustering (RecSys, 2009)
    LDA approach (HT, 2009)
    Graph Rank (2010)
    Special tag rating function (WWW, 2009)

Research Problem

Research Gap
  Features of folksonomy
  Noise of folksonomy
  Combining with taxonomy
Research Problem
  Profiling users based on folksonomy information in Web 2.0 and enhancing recommender systems

3 The Proposed Approaches

User Profiling Models
  User Profiling based on Folksonomy
  User Profiling based on Taxonomy
  Hybrid User Profiling
Recommender System
  Top N item recommendation
  Recommendation making

The Proposed Approaches

[Diagram: User Profiling, comprising User Profiling-Folksonomy, User Profiling-Taxonomy and User Profiling-Hybrid]

The Relationship Modelling

The multiple relationships of tagging
Two dimensional relationships
  User-Item relationship
  User-Tag relationship
  Item-Tag relationship
Three dimensional relationship
  Personal tagging behaviour
  User-Tag-Item relationship
    (User×Tag)-Item mapping
    Item-(User×Tag) mapping
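One way to hold these relationships in memory is to index a list of (user, tag, item) records several ways at once. The sketch below is only an illustration: the function name `build_relations`, the dictionary keys and the sample records are assumptions, not part of the presented system.

```python
from collections import defaultdict

def build_relations(triples):
    """Index (user, tag, item) tagging records in the ways the slide lists."""
    rel = {
        "user_items": defaultdict(set),      # User-Item relationship
        "user_tags": defaultdict(set),       # User-Tag relationship
        "item_tags": defaultdict(set),       # Item-Tag relationship
        "user_tag_items": defaultdict(set),  # (User×Tag)-Item mapping
        "item_user_tags": defaultdict(set),  # Item-(User×Tag) mapping
    }
    for user, tag, item in triples:
        rel["user_items"][user].add(item)
        rel["user_tags"][user].add(tag)
        rel["item_tags"][item].add(tag)
        rel["user_tag_items"][(user, tag)].add(item)
        rel["item_user_tags"][item].add((user, tag))
    return rel

# Hypothetical tagging records, not taken from the datasets.
triples = [
    ("u1", "apple", "i1"),
    ("u1", "garden", "i1"),
    ("u2", "apple", "i2"),
    ("u2", "internet", "i2"),
]
relations = build_relations(triples)
```

Keeping the (user, tag) pair as a single key preserves the personal tagging behaviour that the two dimensional projections lose.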

Part 1: User Profiling Approaches based on Folksonomy

Tag representation-Folksonomy
Item representation-Folksonomy
User representation-Folksonomy
[Diagram: User Profiling-Folksonomy, with Tag Representation-Folksonomy, Item Representation-Folksonomy and User Representation-Folksonomy]

Tag representation-Folksonomy

Reduce the noise of tags
  Find the personally related tags of each tag
  Determine the relevance weight
Relevance weight of two tags with respect to a user
  The collected items of a tag
  The expectation of the probability of a tag being used for the collected items
  Probability of a tag being used for an item = (number of users who used the tag for the item) / (number of users who collected the item)
[Figure: example tag representation for the tag “apple”, with related tags “garden”, “globalization” and “internet” and example weights 0.16 and 0.34]
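The relevance weight of two tags with respect to a user can be sketched as an average over the items the user collected with the first tag. This is a minimal reading of the slide, assuming the expectation is a plain mean and that "collected" means "tagged at least once"; the function name and sample records are hypothetical.

```python
from collections import defaultdict

def tag_relevance(triples, user, tag_a, tag_b):
    """Mean, over the items `user` collected with `tag_a`, of the share of
    each item's collectors who used `tag_b` for that item."""
    collectors = defaultdict(set)  # item -> users who collected it
    taggers = defaultdict(set)     # (item, tag) -> users who used the tag for it
    for u, t, i in triples:
        collectors[i].add(u)
        taggers[(i, t)].add(u)
    items = {i for u, t, i in triples if u == user and t == tag_a}
    if not items:
        return 0.0
    shares = [len(taggers[(i, tag_b)]) / len(collectors[i]) for i in items]
    return sum(shares) / len(shares)

# Hypothetical tagging records (user, tag, item).
triples = [
    ("u1", "apple", "i1"), ("u1", "garden", "i1"),
    ("u2", "garden", "i1"), ("u3", "fruit", "i1"),
    ("u1", "apple", "i2"), ("u2", "internet", "i2"),
]
weight = tag_relevance(triples, "u1", "apple", "garden")
```

In this toy data u1 used “apple” on two items; “garden” was used by two of the three collectors of the first item and by none of the collectors of the second, so the weight is the mean of 2/3 and 0.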

Item representation-Folksonomy

Expand the tags of each item
  Find the relevant tags of each item
  Determine the relevance weight
The relevance of an item to a tag
  User-tag pairs
  The relevance of two tags with respect to a user
  Inverse item frequency
[Figure: example item representation with tags “garden”, “apple”, “globalization”, “internet” and “0403”]

User Representation-Folksonomy

Find users’ preferences to tags
The preference weight of a user to a tag
  Preferences to one tag
  The relevance of two tags with respect to a user
  Inverse user frequency
  Preference to one tag = (number of items collected with the tag by the user) / (number of items collected by the user)
[Figure: example user representation with tags including “garden”, “internet” and “0403”]
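Two of the ingredients named above, the preference to one tag and the inverse user frequency, can be sketched directly from tagging records. The logarithmic form of the inverse user frequency is an assumption borrowed from the usual tf-idf convention, and how the slide's method combines the ingredients is not specified here, so the sketch computes them separately.

```python
import math

def tag_preference(triples, user, tag):
    """Items the user collected with `tag`, over all items the user collected."""
    all_items = {i for u, t, i in triples if u == user}
    tag_items = {i for u, t, i in triples if u == user and t == tag}
    return len(tag_items) / len(all_items) if all_items else 0.0

def inverse_user_frequency(triples, tag):
    """log(total users / users who used `tag`); assumed tf-idf style weighting."""
    users = {u for u, _, _ in triples}
    tag_users = {u for u, t, _ in triples if t == tag}
    return math.log(len(users) / len(tag_users)) if tag_users else 0.0

# Hypothetical tagging records (user, tag, item).
triples = [
    ("u1", "apple", "i1"), ("u1", "garden", "i1"),
    ("u2", "garden", "i1"), ("u3", "fruit", "i1"),
    ("u1", "apple", "i2"), ("u2", "internet", "i2"),
]
pref = tag_preference(triples, "u1", "garden")
iuf = inverse_user_frequency(triples, "garden")
```

Here u1 collected two items and used “garden” on one of them, so the preference is 0.5; “garden” is used by two of the three users, which gives it a low inverse user frequency.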

[Diagram: User: item preferences (implicit ratings) and topic preferences (tag vocabulary). Item: tag vocabulary. Example tags: “garden”, “internet”, “0403”]

Part 2: User Profiling based on Taxonomy

Advantages of Taxonomy
  Standard vocabulary
  Well recognized
  Independent of user communities
  Experts’ viewpoints
Representations
  Item representation-Taxonomy
  Tag representation-Taxonomy
  User representation-Taxonomy
[Diagram: User Profiling-Taxonomy, with Tag Representation-Taxonomy, Item Representation-Taxonomy and User Representation-Taxonomy]

Item Representation-Taxonomy

Find the relevant taxonomic topics of each item
The relevance of an item to a taxonomic topic
  The average weight of a taxonomic topic in all descriptors
  The weight of a taxonomic topic in an item descriptor
    Deploy weight from leaf topic to root topic
  Inverse item frequency
[Figure: example item representation with taxonomic topics “programming”, “book”, “computers” and “networks”]
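Deploying weight from a leaf topic to the root can be pictured as walking up the taxonomy and giving each ancestor a smaller share. The sketch below is an assumption-laden illustration: the geometric `decay` factor, the normalisation to a total of 1 and the small `parent` map are all hypothetical and are not the weighting scheme of the presented work.

```python
def deploy_weight(leaf, parent, decay=0.5):
    """Spread one unit of weight along the path from `leaf` up to the root.

    `parent` maps a topic to its parent topic; the root has no entry.
    Each step towards the root multiplies the share by `decay` (assumed).
    """
    weights = {}
    share = 1.0
    node = leaf
    while node is not None:
        weights[node] = share
        share *= decay
        node = parent.get(node)
    total = sum(weights.values())
    return {topic: w / total for topic, w in weights.items()}

# Hypothetical taxonomy fragment: book > computers > {programming, networks}.
parent = {"programming": "computers", "networks": "computers", "computers": "book"}
descriptor = deploy_weight("programming", parent)
```

With these assumptions the leaf keeps the largest share and the root the smallest, so general topics such as “book” do not dominate an item's descriptor.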

Tag Representation-Taxonomy

Reduce the noise of tags
Find the personal semantic meaning of each tag
The relevance of a tag to a taxonomic topic with respect to a user
  The collected items of a tag
  Average relevance weight of a taxonomic topic to the collected items
[Figure: example of the tag “apple” linked to the taxonomic topics “computers”, “programming”, “databases” and “networks”, and to “garden”, “flowers” and “fruit”]
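The personal meaning of a tag can be sketched as the average topic weight over the items a given user collected with that tag. The function name, the `item_topics` structure and the sample data below are hypothetical; the sketch only illustrates the averaging step named on the slide.

```python
def tag_topic_relevance(triples, item_topics, user, tag):
    """Average weight of each taxonomic topic over the items that
    `user` collected with `tag`."""
    items = {i for u, t, i in triples if u == user and t == tag}
    if not items:
        return {}
    totals = {}
    for item in items:
        for topic, weight in item_topics.get(item, {}).items():
            totals[topic] = totals.get(topic, 0.0) + weight
    return {topic: w / len(items) for topic, w in totals.items()}

# Hypothetical records: both users use "apple", on very different items.
triples = [("u1", "apple", "i1"), ("u1", "apple", "i2"), ("u2", "apple", "i3")]
item_topics = {
    "i1": {"computers": 0.6, "programming": 0.4},
    "i2": {"computers": 1.0},
    "i3": {"fruit": 1.0},
}
meaning_u1 = tag_topic_relevance(triples, item_topics, "u1", "apple")
meaning_u2 = tag_topic_relevance(triples, item_topics, "u2", "apple")
```

In this toy data the same tag resolves to computing topics for u1 and to “fruit” for u2, which is the kind of ambiguity the taxonomy view is meant to reduce.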

User Representation-Taxonomy

Find users’ preferences to taxonomic topics
The preference weight of a user to a taxonomic topic
  Preference to a tag
  Relevance of a tag to a taxonomic topic with respect to the user
  Inverse user frequency
[Figure: example user representation with labels “databases”, “programming”, “computers”, “book” and “0403”]

[Diagram: User: item preferences (implicit ratings) and topic preferences (taxonomy vocabulary). Item: taxonomy vocabulary. Example topics: “databases”, “programming”, “computers”, “book”, “networks”]

Part 3: Hybrid User Profiling

Combine Part 1 and Part 2
  Wisdom of crowds: tag vocabulary & users’ viewpoints
  Wisdom of experts: taxonomy vocabulary & experts’ viewpoints

Tag representation-Hybrid

Item representation-Hybrid

User representation-Hybrid

Personalized Recommendation Making

Top N item recommendation
  Neighbourhood Formation
  Recommendation Generation
[Diagram: User Profiling (User Profiling-Hybrid) and Recommendation Making]

Neighbourhood Formation

K-Nearest Neighbourhood: User-KNN
  Similarity of item preferences
  Similarity of topic preferences
    Tags
    Taxonomic topics
  Linear combination
[Diagram: User Similarity, combining Item Preferences and Topic Preferences (Tags, Taxonomic topics)]
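A user-KNN neighbourhood built from a linear combination of similarities might look like the sketch below. Cosine similarity, the mixing weights `alpha` and `beta`, and the profile layout with `items`, `tags` and `topics` vectors are all assumptions made for illustration; the slides do not state which similarity measure or weights were used.

```python
import math

def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def user_similarity(p, q, alpha=0.5, beta=0.5):
    """Linear combination of item-preference and topic-preference similarity;
    the topic part itself mixes tags and taxonomic topics (assumed weights)."""
    topic_sim = beta * cosine(p["tags"], q["tags"]) \
        + (1 - beta) * cosine(p["topics"], q["topics"])
    return alpha * cosine(p["items"], q["items"]) + (1 - alpha) * topic_sim

def nearest_neighbours(target, profiles, k):
    """Return the k users most similar to `target`."""
    scored = [(user_similarity(profiles[target], prof), user)
              for user, prof in profiles.items() if user != target]
    scored.sort(reverse=True)
    return [user for _, user in scored[:k]]

# Hypothetical user profiles.
profiles = {
    "u1": {"items": {"i1": 1.0}, "tags": {"apple": 1.0}, "topics": {"computers": 1.0}},
    "u2": {"items": {"i1": 1.0}, "tags": {"apple": 1.0}, "topics": {"computers": 1.0}},
    "u3": {"items": {"i9": 1.0}, "tags": {"garden": 1.0}, "topics": {"fruit": 1.0}},
}
neighbours = nearest_neighbours("u1", profiles, k=1)
```

Setting `alpha` to 1 reduces this to similarity on item preferences alone, which makes it easy to compare the single-source and combined variants.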

Neighbourhood Formation 2

K-Nearest Neighbourhood: Item-KNN
  Similarity of tags
  Similarity of taxonomic topics
  Linear combination
[Diagram: Item similarity, combining Tags and Taxonomic Topics]

Recommendation Generation

Candidate items
  Neighbour items & not tagged by the target user
User based recommendation
  Prediction Score: User Similarity, Content matching (Tags, Taxonomic Topics)
Item based recommendation
  Prediction Score: Item Similarity
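The user based variant can be sketched as scoring each candidate item by the similarity of the neighbours who hold it. This simplified sketch leaves out the content matching term and assumes the prediction score is a plain sum of neighbour similarities; the names and sample data are hypothetical.

```python
def recommend(target, neighbours, user_items, n):
    """Rank the neighbours' items that the target user has not tagged.

    `neighbours` maps each neighbour to its similarity with `target`.
    The score of an item is the summed similarity of neighbours holding it.
    """
    seen = user_items.get(target, set())
    scores = {}
    for neighbour, sim in neighbours.items():
        for item in user_items.get(neighbour, set()) - seen:
            scores[item] = scores.get(item, 0.0) + sim
    # Highest score first; ties broken by item id for a stable order.
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:n]

# Hypothetical data.
user_items = {"u1": {"i1"}, "u2": {"i1", "i2", "i3"}, "u3": {"i3", "i4"}}
neighbours = {"u2": 0.8, "u3": 0.5}
top2 = recommend("u1", neighbours, user_items, n=2)
```

Item i3 is held by both neighbours and so ranks first, followed by i2; i1 is excluded because the target user already tagged it.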

4 Experiments

Datasets

D1: Amazon.com
  4112 users, 34201 tags, 30467 items, 9919 taxonomic topics
D2: CiteULike “Who-posted-what” dataset
  7103 users, 78414 tags, 117279 items

Power Law Distributions

[Figure: two panels for D1 and D2. Tags panel: Number of Tags (y-axis) against Number of Users (x-axis). Items panel: Number of Items (y-axis) against Number of Users (x-axis)]

Experiment setup

Top N item recommendation
Experiment setup
  5-fold cross-validation
  80% training & 20% testing
Evaluation Metrics
  Precision, Recall, F1 Measure
Comparisons
  Proposed Models
    Folksonomy Model: FM-User, FM-Item
    Taxonomy Model: TM-User, TM-Item
    Hybrid Model: FTM-User, FTM-Item
  Baseline Models
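For reference, the three evaluation metrics for a single user's Top N list can be computed as in the sketch below. It uses the standard definitions; the function name and the sample lists are hypothetical, and averaging over users and folds is left out.

```python
def precision_recall_f1(recommended, relevant):
    """Precision, recall and F1 of one Top N list against the test items."""
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical Top 3 list and held-out test items for one user.
p, r, f1 = precision_recall_f1(["i3", "i2", "i9"], {"i2", "i5"})
```

One of the three recommended items is in the test set of two items, giving a precision of 1/3, a recall of 1/2 and an F1 of 0.4.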

Results-I Folksonomy Model

Tag Noise Removing Approaches (Dataset D1)
Parameter setting
  FM-User: : 0.8-1.0 , 1 : 0.4-0.5
  FM-Item: 1 : 0.4-0.5
[Figure: Precision, Recall and F1 versus Top N (N = 1 to 10). Legend: FM-User, FM-Item, Clustering, ARTELDA]

Results-I

The Comparison of the State-of-the-art approaches (Dataset D1)
[Figure: Precision, Recall and F1 versus Top N (N = 1 to 10). Legend: FM-User, Graph Rank, Tag tf-iuf, Tso-Sutter’s approach, CF-Item]

Results-I

Comparison results of Dataset D2
[Figure: Precision, Recall and F1 versus Top N (N = 1 to 10). Legend: FM-User, Graph Rank, Clustering, CF-Item]

Results-2 Taxonomy Model

Parameter setting (Dataset D1)
  TM-User: : 0.8-1.0 , 1 : 0.4-0.5
  TM-Item: 1 : 0.4-0.5
[Figure: Precision, Recall and F1 versus Top N (N = 1 to 10). Legend: TM-User, TM-Item, TPR]

Results-3 Hybrid Models

Parameter setting (Dataset D1)
  FTM-User:
  FTM-Item: 1=0.3,
Hybrid Models vs. Single Models
Folksonomy Model vs. Taxonomy Model
[Figure: Precision and Recall versus Top N (N = 1 to 3). Legend: FTM-User, FTM-Item, FM-User, TM-User]

Results-3

The influence of personal tags
  D1 personal tags: 67%; tags remaining at θ = 10: 4.8%
  D2 personal tags: 70%; tags remaining at θ = 10: 5.2%
Findings
  Personal tags can improve the precision results
  Precision values decreased dramatically when a large proportion (i.e., 90%) of tags (i.e., θ = 5) was removed

[Figure: Number of Tags (log scale) at each threshold θ = 1 to 10]
θ    1      2      3      4      5     6     7     8     9     10
D2   78414  23229  14045  10299  8229  6839  5837  5124  4573  4131
D1   34201  11298  6906   4897   3784  3051  2544  2155  1884  1648

[Figure: Top-3 Precision versus θ for FM-User on D1 and D2, with TM-User on D1 shown at 0.24. Labelled points (number of tags, precision): FM-User, D2: (78414, 0.21), (23229, 0.19), (4131, 0.16); FM-User, D1: (34201, 0.31), (11298, 0.28), (4897, 0.24), (1648, 0.2); TM-User, D1: (9919, 0.24)]
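The tag counts above are consistent with reading θ as the minimum number of distinct users who must have used a tag for it to be kept, so that θ = 2 drops the personal tags used by a single user. That reading is an assumption, as are the function name and sample records in this sketch of the filtering step.

```python
from collections import defaultdict

def filter_tags(triples, theta):
    """Keep only records whose tag was used by at least `theta` distinct users."""
    users_per_tag = defaultdict(set)
    for user, tag, _ in triples:
        users_per_tag[tag].add(user)
    kept = {tag for tag, users in users_per_tag.items() if len(users) >= theta}
    return [record for record in triples if record[1] in kept]

# Hypothetical records: "0403" is a personal tag used by one user only.
triples = [
    ("u1", "apple", "i1"),
    ("u2", "apple", "i2"),
    ("u3", "apple", "i3"),
    ("u1", "0403", "i1"),
]
shared_only = filter_tags(triples, theta=2)
```

Re-running the recommender on the filtered records for increasing θ would reproduce the kind of comparison shown in the precision figure.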

Discussions

The proposed approaches outperformed other related work
  The Hybrid Model performed the best
Each tag counts
Folksonomy can be used as a quality information source (rich personalization information)

5 Conclusions

Conclusions

Web 2.0
  New user information
  Modelling the relationships of tagging behaviour
  Tag quality problem
  The wisdom of crowds & experts
Proposed three user profiling models
  User profiling based on folksonomy
  User profiling based on taxonomy
  Hybrid user profiling
Utilized the proposed user profiles to improve recommender systems
  User based
  Item based
Evaluation
  Experiments

Contributions

Advantages
  Domain free
  Language free
Information overload
User profiling and web personalization
Recommender systems
Web 2.0

Future Work

Time factor
Cross folksonomy recommendations
Mobile platform application
Integrate with other user information
  Explicit ratings
  Tweets
  Friendship network

Published Work

Liang, H. et al. (2010). Personalized Recommender System Based on Item Taxonomy and Folksonomy. CIKM

Liang, H. et al. (2010). Connecting Users and Items with Weighted Tags for Personalized Item Recommendations. Hypertext

Liang, H. et al. (2010). A Hybrid Recommender System based on Weighted Tags. SDM Workshop

Liang, H. et al. (2010). Mining Users’ Opinions based on Item Folksonomy and Taxonomy for Personalized Recommender Systems. ICDM Workshop

Liang, H. et al. (2010). Parallel User profiling based on folksonomy for Large Scaled Recommender Systems-An implementation of Cascading MapReduce. ICDM Workshop

Liang, H. et al. (2009). Collaborative Filtering Recommender Systems based on Popular Tags. ADCS

Liang, H. et al. (2009). Tag Based Collaborative Filtering for Recommender Systems. RSKT

Liang, H. et al. (2009). Personalized Recommender Systems Integrating Social tags and Item Taxonomy. WI

Liang, H. et al. (2008). Collaborative Filtering Recommender Systems Using Tag Information. WI Workshop

Bhuiyan, T., Xu, Y., Jøsang, A., & Liang, H. (2010). Developing Trust Networks Based on User Tagging Information for Recommendation Making. WISE

Acknowledgements

Time, Supervisor Team, HPC group, Panel Members, ISS, Anonymous Reviewers, Papers, Staff
Colleagues, Friends, Google, Books, Sunshine, CSC, Trees, Stars, Music, Trips, Blogs, Beaches, Family …

Questions & Answers
