User Profiling based on Folksonomy Information in Web 2.0 for
Personalized Recommender Systems
Huizhi (Elly) Liang
Supervisors: Yue Xu, Yuefeng Li, Richi Nayak
Queensland University of Technology, Australia
Agenda
1 Introduction
2 Literature Review
3 The Proposed Approaches
4 Experiments
5 Conclusion
1 Introduction
Information overload
Personalization
  “Personalization is the ability to provide content and services tailored to individuals based on knowledge about their preferences and behaviours” (Hagen, 1999)
Recommender systems
User profiling
  Explicit user profiles: explicit ratings
  Implicit user profiling: web log
  Other information sources
Web 2.0
Web 2.0: Read and Write web (O’Reilly Media, 2004)
  A platform for users to conduct online participation, collaboration and interaction
  Expressing opinions, sharing information, building networks
  Wikipedia, Facebook, Delicious, Twitter
Plenty of new user information
  Folksonomy (tags), reviews, networks, blogs, micro-blogs, etc.
Opportunities
  Providing possible new solutions to profile users
Folksonomy
Folksonomy = folk + taxonomy
Tags: typical Web 2.0 information
  Keywords given by users to organize and classify items
  The wisdom of crowds
Multiple functions
  Item organizing and sharing
  Building networks
  Expressing users’ explicit topic interests and opinions
Tag Cloud
Folksonomy Tags
Taxonomy categories
Taxonomy
  Given by experts
  Standard vocabulary & structural relationship
  Well recognized as common knowledge
  Independent of user communities
  No information on users’ personal viewpoints or preferences
Folksonomy
  Given by users explicitly and proactively
  Reflecting users’ personal viewpoints and topic preferences
  Less intrusive & multiple functions
  Lightweight textual information
  Contains a lot of noise
2 Literature Review
User Profiling
Web User profiling Web content & structure Web log & Web usage Taxonomy & Ontology
User Profiling in Web 2.0 New user information sources
Folksonomy, blogs, reviews, micro-blogs Videos, audios, images Friends, trust network, followers, following
User Profiling 2
User Profiling based on folksonomy Approaches
Users’ own tags Associated tags Latent topics of tags Popular tags
Challenges Distinctive features of tags Tag quality problem
Semantic ambiguity and synonyms About 60% of tags are personal tags
Recommender system
Recommendation tasks
  Top N recommendation (Precision, Recall, F1)
  Rating prediction (Mean Absolute Error, Root Mean Squared Error)
Recommendation approaches
  Content based: term vector model, Latent Dirichlet Allocation (LDA)
  Collaborative Filtering (CF)
    Memory based CF: User-KNN & Item-KNN
    Model based CF: matrix factorization techniques
  Hybrid
Recommender system 2
Recommender systems based on Taxonomy
  Ziegler’s approach (CIKM, 2004)
Recommender systems based on Folksonomy
  Tag recommendations
    Tensor based approach (KDD, 2009)
    Graph based approach (SIGIR, 2009)
  Item recommendations
    Tso-Sutter’s approach (SAC, 2008)
    Clustering (RecSys, 2009)
    LDA approach (HT, 2009)
    Graph Rank (2010)
    Special tag rating function (WWW, 2009)
Research Problem
Research Gap Features of folksonomy Noise of folksonomy Combining with taxonomy
Research Problem
  Profiling users based on folksonomy information in Web 2.0 to enhance recommender systems
3 The Proposed Approaches
User Profiling Models User Profiling based on Folksonomy User Profiling based on Taxonomy Hybrid User Profiling
Recommender System Top N item recommendation
Recommendation making
The Proposed Approaches
User Profiling
User Profiling-Folksonomy
User Profiling-Taxonomy
User Profiling-Hybrid
The Relationship Modelling The Multiple relationships of tagging
Two dimensional relationships User-Item relationship User-Tag relationship Item-Tag relationship
Three dimensional relationship Personal tagging behavior User-Tag-Item relationship
(User×Tag)-Item mapping Item-(User×Tag) mapping
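The relationships listed above can be illustrated with a small sketch. It assumes tagging data is stored as (user, tag, item) triples; the function name `project_relationships` and the sample data are hypothetical and not taken from the slides.

```python
from collections import defaultdict

def project_relationships(triples):
    """Derive the two-dimensional relations and the (user, tag) -> items mapping
    from three-dimensional (user, tag, item) tagging records."""
    user_items = defaultdict(set)      # User-Item relationship
    user_tags = defaultdict(set)       # User-Tag relationship
    item_tags = defaultdict(set)       # Item-Tag relationship
    user_tag_items = defaultdict(set)  # (User x Tag)-Item mapping
    for user, tag, item in triples:
        user_items[user].add(item)
        user_tags[user].add(tag)
        item_tags[item].add(tag)
        user_tag_items[(user, tag)].add(item)
    return user_items, user_tags, item_tags, user_tag_items

# Hypothetical sample data.
triples = [
    ("u1", "apple", "i1"),
    ("u1", "garden", "i1"),
    ("u2", "apple", "i2"),
    ("u2", "internet", "i2"),
]
user_items, user_tags, item_tags, user_tag_items = project_relationships(triples)
```

Keeping the (user, tag) pair as a key preserves personal tagging behaviour: the same tag "apple" points to different items for `u1` and `u2`.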
Part 1: User Profiling Approaches based on Folksonomy
Tag representation-Folksonomy Item representation-Folksonomy User representation-Folksonomy
[Figure: Tag Representation-Folksonomy, Item Representation-Folksonomy, and User Representation-Folksonomy feeding into User Profiling-Folksonomy.]
Tag representation-Folksonomy
Reduce the noise of tags Find the personally related tags of each tag Determine the relevance weight
Relevance weight of two tags with respect to a user The collected items of a tag The expectation of the probability of a tag being used for the
collected items
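[Figure: example tag representations for the tag “apple”, with the related tags “garden”, “globalization”, and “internet” and example relevance weights 0.16 and 0.34.]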
Probability of a tag being used for an item = (number of users who used the tag for the item) / (number of users who collected the item)
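One possible reading of the relevance weight described on this slide is sketched below: for a given user, the relevance of tag B to tag A is the average, over the items the user collected with tag A, of the share of each item's collectors who used tag B for it. The function name `tag_relevance`, the triple layout, and the sample data are assumptions; the slides do not give the exact formula.

```python
from collections import defaultdict

def tag_relevance(triples, user, tag_a, tag_b):
    """Average, over the items `user` collected with `tag_a`, of the fraction of
    each item's collectors who used `tag_b` for that item."""
    item_users = defaultdict(set)      # item -> users who collected it
    item_tag_users = defaultdict(set)  # (item, tag) -> users who used the tag for the item
    collected = set()                  # items `user` collected with `tag_a`
    for u, t, i in triples:
        item_users[i].add(u)
        item_tag_users[(i, t)].add(u)
        if u == user and t == tag_a:
            collected.add(i)
    if not collected:
        return 0.0
    probs = [
        len(item_tag_users.get((i, tag_b), ())) / len(item_users[i])
        for i in collected
    ]
    return sum(probs) / len(probs)

# Hypothetical sample data.
triples = [
    ("u1", "apple", "i1"),
    ("u2", "apple", "i1"),
    ("u2", "fruit", "i1"),
    ("u1", "apple", "i2"),
    ("u3", "internet", "i2"),
]
fruit_weight = tag_relevance(triples, "u1", "apple", "fruit")
internet_weight = tag_relevance(triples, "u1", "apple", "internet")
```

Because the average runs only over the target user's own items for the tag, two users who both use "apple" can end up with different sets of related tags.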
Item representation-Folksonomy
Expand the tags of each item Find the relevant tags of each item Determine the relevance weight
The relevance of an item to a tag User-tag pairs The relevance of two tags with respect to a user Inverse item frequency
[Figure: example item represented by the tags “garden”, “apple”, “globalization”, “internet”, and “0403”.]
User Representation-Folksonomy
Find users’ preferences to tags The preference weight of a user to a tag
Preferences to one tag The relevance of two tags with respect to a user Inverse user frequency
[Figure: example user represented by the tags “garden”, “apple”, “globalization”, “internet”, and “0403”.]
Preference to one tag = (number of items collected with the tag by the user) / (number of items collected by the user)
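A minimal sketch of the preference weight follows. It multiplies the share of the user's items that carry the tag by an inverse user frequency, taken here as log(total users / users who used the tag). That logarithmic form is an assumption borrowed from the usual tf-idf convention, and the relevance between tags is left out for brevity. The function name `tag_preferences` and the sample data are hypothetical.

```python
import math
from collections import defaultdict

def tag_preferences(triples, user):
    """Return {tag: weight} for `user`, where weight is
    (share of the user's items collected with the tag) * (inverse user frequency)."""
    users = set()
    user_items = set()                 # items collected by `user`
    user_tag_items = defaultdict(set)  # tag -> items `user` collected with it
    tag_users = defaultdict(set)       # tag -> all users who used it
    for u, t, i in triples:
        users.add(u)
        tag_users[t].add(u)
        if u == user:
            user_items.add(i)
            user_tag_items[t].add(i)
    prefs = {}
    for tag, items in user_tag_items.items():
        share = len(items) / len(user_items)
        iuf = math.log(len(users) / len(tag_users[tag]))  # assumed log form
        prefs[tag] = share * iuf
    return prefs

# Hypothetical sample data.
triples = [
    ("u1", "apple", "i1"),
    ("u1", "apple", "i2"),
    ("u1", "garden", "i1"),
    ("u2", "apple", "i3"),
]
prefs = tag_preferences(triples, "u1")
```

In this toy data every user uses "apple", so its inverse user frequency is zero, while the rarer "garden" keeps a positive weight.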
User Profiling-Folksonomy
User: item preferences (implicit ratings) and topic preferences (tag vocabulary)
Item: tag vocabulary
[Figure: example user and item profiles over the tags “garden”, “apple”, “globalization”, “internet”, and “0403”.]
Part 2: User Profiling based on Taxonomy
Advantages of Taxonomy
  Standard vocabulary
  Well recognized
  Independent of user communities
  Experts’ viewpoints
Representations
  Item representation-Taxonomy
  Tag representation-Taxonomy
  User representation-Taxonomy
[Figure: Tag Representation-Taxonomy, Item Representation-Taxonomy, and User Representation-Taxonomy feeding into User Profiling-Taxonomy.]
Find the relevant taxonomic topics of each item The relevance of an item to a taxonomic topic
The average weight of a taxonomic topic in all descriptors The weight of a taxonomic topic in an item descriptor Deploy weight from leaf topic to root topic
Inverse item frequency
Item Representation-Taxonomy
[Figure: example item represented by the taxonomic topics “programming”, “book”, “computers”, and “networks”.]
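The step "deploy weight from leaf topic to root topic" can be sketched as a walk up the taxonomy path. The slides do not state how much weight each ancestor receives, so the constant `decay` factor below is purely an assumption, as are the function name `propagate_to_root` and the toy hierarchy.

```python
def propagate_to_root(parent, leaf, weight, decay=0.5):
    """Assign `weight` to `leaf` and a decayed weight to each ancestor up to the root.

    `parent` maps each topic to its parent topic (None for the root).
    `decay` is a hypothetical attenuation factor, not taken from the slides.
    """
    weights = {}
    topic = leaf
    while topic is not None:
        weights[topic] = weight
        weight *= decay
        topic = parent.get(topic)
    return weights

# Hypothetical three-level taxonomy path: book > computers > programming.
parent = {"programming": "computers", "computers": "book", "book": None}
topic_weights = propagate_to_root(parent, "programming", 1.0)
```

With this scheme an item described only by a leaf topic still gets non-zero weights on the broader topics, which lets items in sibling categories be compared.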
Reduce the noise of tags Find the personal semantic meaning of each tag
The relevance of a tag to a taxonomic topic with respect to a user The collected items of a tag Average relevance weight of a taxonomic topic to the collected
items
Tag Representation-Taxonomy
[Figure: the tag “apple” linked to the taxonomic topics “computers”, “programming”, “databases”, and “networks” in one case and to “garden”, “flowers”, and “fruit” in another.]
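The personal semantic meaning of a tag can be sketched as the average topic weight over the items a user collected with that tag. The sketch assumes each item already has a `{topic: weight}` representation; the function name `tag_topic_relevance` and the sample data are hypothetical.

```python
def tag_topic_relevance(triples, item_topics, user, tag):
    """Average the taxonomic topic weights of the items `user` collected with `tag`."""
    items = {i for u, t, i in triples if u == user and t == tag}
    if not items:
        return {}
    totals = {}
    for item in items:
        for topic, weight in item_topics.get(item, {}).items():
            totals[topic] = totals.get(topic, 0.0) + weight
    return {topic: total / len(items) for topic, total in totals.items()}

# Hypothetical sample data: two users use the same tag for different kinds of items.
triples = [("u1", "apple", "i1"), ("u2", "apple", "i2")]
item_topics = {
    "i1": {"computers": 1.0, "programming": 0.5},
    "i2": {"garden": 1.0, "fruit": 0.5},
}
meaning_u1 = tag_topic_relevance(triples, item_topics, "u1", "apple")
meaning_u2 = tag_topic_relevance(triples, item_topics, "u2", "apple")
```

The same tag resolves to computing topics for one user and gardening topics for the other, which is how an ambiguous tag is disambiguated per user.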
Find users’ preferences to taxonomic topics The preference weight of a user to a taxonomic topic
Preference to a tag Relevance of a tag to a taxonomic topic with respect to the user Inverse user frequency
User Representation-Taxonomy
[Figure: example user profile with the taxonomic topics “databases”, “programming”, “computers”, and “book”; the label “0403” also appears.]
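User Profiling-Taxonomy
User: item preferences (implicit ratings) and topic preferences (taxonomy vocabulary)
Item: taxonomy vocabulary
[Figure: example user profile over the topics “databases”, “programming”, “computers”, and “book”, and example item profile over the topics “book”, “computers”, “programming”, and “networks”.]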
Part 3: Hybrid User Profiling
Combine Part 1 and Part 2
  Wisdom of crowds: tag vocabulary & users’ viewpoints
  Wisdom of experts: taxonomy vocabulary & experts’ viewpoints
Tag representation-Hybrid
Item representation-Hybrid
User representation-Hybrid
Personalized Recommendation Making
Top N item recommendation
Neighborhood Formation
Recommendation Generation
User Profiling-Folksonomy
User Profiling-Taxonomy
User Profiling-Hybrid
User Profiling
Recommendation Making
Neighbourhood Formation
K-Nearest Neighbourhood User-KNN
Similarity of item preferences Similarity of topic preference
Tags Taxonomic topics
Linear combination
Item Preferences
Topic Preferences
Tags Taxonomic topics
User Similarity
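A minimal sketch of User-KNN neighbourhood formation with a linear combination of similarities follows. It assumes cosine similarity over sparse preference vectors and a single mixing weight `alpha`; both are assumptions, since the slides name only "linear combination". The profile layout and function names are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def user_similarity(p, q, alpha=0.5):
    """Linear combination of item-preference and topic-preference similarity."""
    return (alpha * cosine(p["items"], q["items"])
            + (1 - alpha) * cosine(p["topics"], q["topics"]))

def nearest_neighbours(target, profiles, k, alpha=0.5):
    """Return the k users most similar to `target` (ties broken by user id)."""
    sims = [
        (user_similarity(profiles[target], prof, alpha), user)
        for user, prof in profiles.items()
        if user != target
    ]
    sims.sort(key=lambda pair: (-pair[0], pair[1]))
    return [user for _, user in sims[:k]]

# Hypothetical profiles: item preferences plus topic preferences (tags or taxonomic topics).
profiles = {
    "u1": {"items": {"i1": 1.0, "i2": 1.0}, "topics": {"apple": 1.0, "garden": 0.5}},
    "u2": {"items": {"i1": 1.0}, "topics": {"apple": 1.0}},
    "u3": {"items": {"i3": 1.0}, "topics": {"internet": 1.0}},
}
neighbours = nearest_neighbours("u1", profiles, k=1)
```

The topic vector can hold tags, taxonomic topics, or both, so the same routine would serve the folksonomy, taxonomy, and hybrid models.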
Neighbourhood Formation 2
K-Nearest Neighbourhood Item-KNN
Similarity of Tags Similarity of Taxonomic topics Linear combination
Tags Taxonomic Topics
Item similarity
Recommendation Generation
Candidate items: neighbour items not tagged by the target user
User based recommendation
  Prediction score: user similarity, content matching (tags, taxonomic topics)
Item based recommendation
  Prediction score: item similarity
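User based recommendation generation can be sketched as follows. Candidates are the neighbours' items that the target user has not tagged, and each candidate is scored by the summed similarity of the neighbours who hold it. The content-matching part of the prediction score is omitted, and the summation rule, the function name, and the sample data are assumptions rather than the exact scoring function from the slides.

```python
def recommend_user_based(target_items, neighbours, user_items, n):
    """Rank candidate items by the summed similarity of the neighbours who collected them.

    `neighbours` is a list of (user, similarity) pairs.
    """
    scores = {}
    for neighbour, similarity in neighbours:
        for item in user_items[neighbour]:
            if item not in target_items:  # skip items the target user already tagged
                scores[item] = scores.get(item, 0.0) + similarity
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:n]

# Hypothetical sample data.
target_items = {"i1"}
neighbours = [("u2", 0.9), ("u3", 0.4)]
user_items = {"u2": {"i1", "i2"}, "u3": {"i2", "i3"}}
top_items = recommend_user_based(target_items, neighbours, user_items, n=2)
```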
4 Experiments
Datasets
D1: Amazon.com
  4112 users, 34201 tags, 30467 items, 9919 taxonomic topics
D2: CiteULike “Who-posted-what” dataset
  7103 users, 78414 tags, 117279 items
Power Law Distributions
[Figure: power-law distributions for D1 and D2. Tags panel: number of tags vs. number of users. Items panel: number of items vs. number of users.]
Experiment setup
Top N item recommendation
Experiment setup
  5-fold, 80% training & 20% testing
Evaluation metrics
  Precision, Recall, F1 measure
Comparisons
  Proposed models
    Folksonomy Model: FM-User, FM-Item
    Taxonomy Model: TM-User, TM-Item
    Hybrid Model: FTM-User, FTM-Item
  Baseline models
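The evaluation metrics named in the setup can be computed per user as in the sketch below, using the standard definitions of precision, recall, and F1 for a top N list. The function name and the sample data are hypothetical.

```python
def precision_recall_f1(recommended, relevant):
    """Standard top N metrics for one user.

    `recommended` is the top N list; `relevant` is the set of held-out test items.
    """
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical example: 4 recommended items, 5 held-out items, 2 hits.
recommended = ["i1", "i2", "i3", "i4"]
relevant = {"i2", "i4", "i7", "i8", "i9"}
precision, recall, f1 = precision_recall_f1(recommended, relevant)
```

In a 5-fold setup these values would typically be averaged over all test users and then over the five folds.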
Tag Noise Removing Approaches (Dataset D1) Parameter setting
FM-User: : 0.8-1.0 , 1 : 0.4-0.5
FM-Item: 1 : 0.4-0.5
Results-I Folksonomy Model
[Figure: Precision, Recall, and F1 vs. Top N (1 to 10) for FM-User, FM-Item, Clustering, ARTE, and LDA.]
The Comparison of the State-of-the-art approaches (Dataset D1)
Results-I
[Figure: Precision, Recall, and F1 vs. Top N (1 to 10) for FM-User, Graph Rank, Tag tf-iuf, Tso-Sutter’s approach, and CF-Item.]
Comparison results of Dataset D2
Results-I
[Figure: Precision, Recall, and F1 vs. Top N (1 to 10) for FM-User, Graph Rank, Clustering, and CF-Item.]
Parameter setting (Dataset D1) TM-User:
: 0.8-1.0 , 1 : 0.4-0.5 TM-Item:
1 : 0.4-0.5
Results-2 Taxonomy Model
[Figure: Precision, Recall, and F1 vs. Top N (1 to 10) for TM-User, TM-Item, and TPR.]
Parameter setting (Dataset D1) FTM-User: FTM-Item: 1=0.3,
Hybrid Models vs. Single Models
Folksonomy Model vs. Taxonomy Model
Results-3 Hybrid Models
[Figure: Precision and Recall vs. Top N (1 to 3) for FTM-User, FTM-Item, FM-User, and TM-User.]
Results-3
The influence of personal tags
  D1: personal tags 67%; θ = 10: 4.8%
  D2: personal tags 70%; θ = 10: 5.2%
Findings
  Personal tags can improve the precision results
  Precision values decreased dramatically when a large proportion (i.e., 90%) of tags (i.e., θ = 5) was removed
[Figure: number of tags (log scale) vs. θ for D1 and D2, and Top-3 precision vs. θ.]
Number of tags
  θ     1      2      3      4      5     6     7     8     9     10
  D2    78414  23229  14045  10299  8229  6839  5837  5124  4573  4131
  D1    34201  11298  6906   4897   3784  3051  2544  2155  1884  1648
Top-3 precision (number of tags, precision)
  FM-User, D1: (34201, 0.31), (11298, 0.28), (4897, 0.24), (1648, 0.2)
  FM-User, D2: (78414, 0.21), (23229, 0.19), (4131, 0.16)
  TM-User, D1: (9919, 0.24)
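The tag counts per θ can be reproduced with a simple filter. The sketch below assumes θ is the minimum number of distinct users who used a tag, so θ = 1 keeps every tag and tags used by a single user are the personal tags. The slides do not state this definition explicitly, and the function name and sample data are hypothetical.

```python
from collections import defaultdict

def tags_with_min_users(triples, theta):
    """Return the tags used by at least `theta` distinct users."""
    tag_users = defaultdict(set)
    for user, tag, _item in triples:
        tag_users[tag].add(user)
    return {tag for tag, users in tag_users.items() if len(users) >= theta}

# Hypothetical sample data: "apple" is shared, "0403" and "garden" are personal tags.
triples = [
    ("u1", "apple", "i1"),
    ("u2", "apple", "i2"),
    ("u1", "0403", "i1"),
    ("u2", "garden", "i2"),
]
all_tags = tags_with_min_users(triples, theta=1)
shared_tags = tags_with_min_users(triples, theta=2)
personal_share = 1 - len(shared_tags) / len(all_tags)
```

Running the profiling and recommendation steps on the filtered tag set for each θ would give the precision curve against the number of remaining tags.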
Discussions
The proposed approaches outperformed other related work
The Hybrid Model performed the best
Each tag counts
Folksonomy can be used as a quality information source (rich personalization information)
5 Conclusions
Conclusions
Web 2.0
  New user information
  Modelling the relationships of tagging behaviour
  Tag quality problem
  The wisdom of crowds & experts
Proposed three user profiling models
  User profiling based on folksonomy
  User profiling based on taxonomy
  Hybrid user profiling
Utilized the proposed user profiles to improve recommender systems
  User based
  Item based
Evaluation
  Experiments
Contributions
Advantages Domain free Language free
Information overload User profiling and web personalization Recommender systems Web 2.0
Future Work
Time factor
Cross folksonomy recommendations
Mobile platform application
Integrate with other user information: explicit ratings, tweets, friendship network
Published Work
Liang, H. et al. (2010). Personalized Recommender System Based on Item Taxonomy and Folksonomy. CIKM
Liang, H. et al. (2010). Connecting Users and Items with Weighted Tags for Personalized Item Recommendations. Hypertext
Liang, H. et al. (2010). A Hybrid Recommender System based on Weighted Tags. SDM Workshop
Liang, H. et al. (2010). Mining Users’ Opinions based on Item Folksonomy and Taxonomy for Personalized Recommender Systems. ICDM Workshop
Liang, H. et al. (2010). Parallel User profiling based on folksonomy for Large Scaled Recommender Systems-An implementation of Cascading MapReduce. ICDM Workshop
Liang, H. et al. (2009). Collaborative Filtering Recommender Systems based on Popular Tags. ADCS
Liang, H. et al. (2009). Tag Based Collaborative Filtering for Recommender Systems. RSKT
Liang, H. et al. (2009). Personalized Recommender Systems Integrating Social tags and Item Taxonomy. WI
Liang, H. et al. (2008). Collaborative Filtering Recommender Systems Using Tag Information. WI Workshop
Bhuiyan, T., Xu, Y., Jøsang, A., & Liang, H. (2010). Developing Trust Networks Based on User Tagging Information for Recommendation Making. WISE
Acknowledgements
Time Supervisor Team HPC group Panel Members ISS Anonymous Reviewers Papers Staff
Colleagues Friends Google Books Sunshine CSC Trees Stars Music Trips Blogs Beaches Family …
Questions & Answers