Learning to Recommend with User Generated Content
Yueshen Xu1, Zhiyuan Chen2, Jianwei Yin1, Zizheng Wu1 and Taojun Yao1
1School of Computer Science and Technology, Zhejiang University
2University of Illinois at Chicago
[email protected]; [email protected]
2015/6/9, Zhejiang University
Junxiang Wang
Yueshen Xu, WAIM, 2015
Outline
Background
Introduction
Related Work
Recommendation with UGC in User Side
Matrix Factorization
Topic Analysis for Items through Topic Modeling
User Interest Distribution
User Topic Regularization
Recommendation with UGC in Item Side
Item Topic Regularization
Experiment and Evaluation
Reference
Keywords: Recommendation, User Generated Content, Topic Modeling, Matrix Factorization
Background
Recommendation in General
Collaborative Filtering (CF)
- Matrix Factorization (MF)
Content-based approach
- Pandora music genome project
User Generated Content (UGC)
social tags, reviews, question answers, blogs, tweets, etc.
tag-based / review-based recommendation
Problems in existing works
not every web site has all kinds of UGC
the item-word / user-word space is highly sparse
synonymy & polysemy
most works focus on only a single kind of UGC
        item1  item2  item3  item4
user1   r11
user2          r22
user3
user4   r41                  r44
user5                 r53
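The sparsity of the rating matrix above can be made concrete with a small sketch. The numeric rating values are hypothetical placeholders for r11, r22, r41, r44 and r53, and the helper names are illustrative only.

```python
# Observed ratings from the matrix above, keyed by (user, item).
# The numeric values are hypothetical placeholders, not data from the slides.
ratings = {
    ("user1", "item1"): 4.0,  # r11
    ("user2", "item2"): 3.0,  # r22
    ("user4", "item1"): 5.0,  # r41
    ("user4", "item4"): 2.0,  # r44
    ("user5", "item3"): 4.0,  # r53
}
users = ["user1", "user2", "user3", "user4", "user5"]
items = ["item1", "item2", "item3", "item4"]


def density(ratings, users, items):
    """Fraction of user-item cells that hold an observed rating."""
    return len(ratings) / (len(users) * len(items))


def cold_users(ratings, users):
    """Users without any observed rating (user3 in the matrix above)."""
    seen = {user for (user, _item) in ratings}
    return [user for user in users if user not in seen]


matrix_density = density(ratings, users, items)
unrated = cold_users(ratings, users)
```

Only 5 of the 20 cells are observed here, which is the situation CF methods have to cope with.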
Other related work
social / trust-based recommendation: helpful but limited
- many sites have no social relationships: Amazon, eBay, Newegg, Jingdong, Expedia, etc.
- but they do have UGC
Description/Profile-based recommendation
- static content
- fails to distinguish different items
- unrelated to a user's preference
UGC, in contrast:
emphasizes an item's features
- the words an item receives frequently
grows dynamically
is associated with a user's preference / interested topics
- e.g. "I like science fiction films, so I wrote a lot of movie reviews that contain words like fiction, tech, super, hero, robotic, machine"
natural chunking (social tags)
Contribution
Main contributions
We study UGC in learning user interests and learning item features
We propose a novel user-oriented collaborative filtering model and a
novel item-oriented collaborative filtering model
We propose a way to utilize different types of UGC in a unified way in
recommender systems
We expand an existing dataset by crawling new data, and conduct
extensive experiments on three real-world datasets, which attest to the
effectiveness of the proposed models.
Recommendation with UGC in User
Side
Topic analysis for items through topic modeling
Terms in UGC are combined to compose the term set W
each item owns an aggregated term list
pLSA/LDA/HDP/nCRP/PAM: any of them works
$\Theta = \{\theta_j\}$, where $\theta_j = (\theta_{j1}, \theta_{j2}, \ldots, \theta_{jK})$ is the topic/aspect distribution of document j (i.e., item j): this is what we need
User Interest Distribution
Cluster items into groups according to the similarity of their topics (K-Means/GMM/K-Medoids: any of them works)
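The clustering step can be sketched as follows. This is a minimal illustration that assumes the topic distributions have already been produced by some topic model; the `theta` values are hypothetical, and the plain k-means with a deterministic start is only one of the clustering choices the slide allows.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two topic distributions."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain k-means; centroids start at the first k points (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each item goes to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: mean of the members; empty clusters keep their centroid.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


# Hypothetical topic distributions theta_j over K = 3 topics for five items.
theta = [
    [0.80, 0.10, 0.10],  # item 0
    [0.10, 0.80, 0.10],  # item 1
    [0.70, 0.20, 0.10],  # item 2
    [0.20, 0.70, 0.10],  # item 3
    [0.75, 0.15, 0.10],  # item 4
]
clusters = kmeans(theta, k=2)
```

Items 0, 2 and 4 share a dominant topic and end up in one cluster, items 1 and 3 in the other, regardless of which product category each item belongs to.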
User Interest Distribution (cont.)
Intuition: find items with similar topics even though they are in different categories, e.g. clothes, gadgets, books, toys and DVDs that are all about Harry Potter
Aggregate each user's consumption records on each cluster $C_k$
$sim(i, f)$: similarity between users $i$ and $f$, computed by PCC, cosine or KL divergence
the weight of $f$ as one of user $i$'s neighbors: $w_{if} = \frac{sim(i, f)}{\sum_{f' \in L(i)} sim(i, f')}$
A novel regularization: user topic regularization (UTR)
$\min \sum_{i=1}^{m} \| U_i - \sum_{f \in L(i)} w_{if} U_f \|_F^2$
Intuition: users with similar interested topics tend to have similar latent features
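A minimal sketch of the user interest distribution and the neighbor weights. It assumes that a user's interest distribution is the normalized count of consumed items per cluster, that cosine is the chosen similarity, and that the neighborhood L(i) is given; the data and function names are hypothetical.

```python
import math


def interest_distribution(consumed_items, item_cluster, num_clusters):
    """Share of a user's consumption records that falls into each cluster."""
    counts = [0] * num_clusters
    for item in consumed_items:
        counts[item_cluster[item]] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def cosine(a, b):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def neighbor_weights(user, neighbors, dist):
    """Normalized weights: sim(i, f) divided by the sum of sim(i, f') over L(i)."""
    sims = {f: cosine(dist[user], dist[f]) for f in neighbors}
    total = sum(sims.values())
    return {f: (s / total if total else 0.0) for f, s in sims.items()}


# Hypothetical data: item -> cluster id, user -> consumed items.
item_cluster = {"a": 0, "b": 0, "c": 1, "d": 1}
consumed = {"u1": ["a", "b", "c"], "u2": ["a", "b"], "u3": ["c", "d"]}
dist = {u: interest_distribution(its, item_cluster, 2) for u, its in consumed.items()}
weights = neighbor_weights("u1", ["u2", "u3"], dist)
```

Because the weights are normalized over the neighborhood, they sum to 1, so the regularizer pulls each user's latent vector toward a weighted average of the neighbors' vectors.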
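A new MF model (UTR-MF)
$\min_{U,V} L = \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{n} I_{ij} (R_{ij} - U_i^T V_j)^2 + \frac{\lambda_U}{2} \| U \|_F^2 + \frac{\lambda_V}{2} \| V \|_F^2 + \frac{\alpha}{2} \sum_{i=1}^{m} \| U_i - \sum_{f \in L(i)} w_{if} U_f \|_F^2$
Solved by gradient descent / coordinate descent
Gradient Descent
$\frac{\partial L}{\partial U_i} = \sum_{j=1}^{n} I_{ij} (R_{ij} - U_i^T V_j)(-V_j) + \lambda_U U_i + \alpha \left( U_i - \sum_{f \in L(i)} w_{if} U_f \right) + \alpha \sum_{g \in G(i)} \left( U_g - \sum_{f' \in L(g)} w_{gf'} U_{f'} \right) \times (-w_{gi})$
$\frac{\partial L}{\partial V_j} = \sum_{i=1}^{m} I_{ij} (R_{ij} - U_i^T V_j)(-U_i) + \lambda_V V_j$
$G(i)$ is the set of users whose neighborhoods include user $i$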
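One full-batch gradient descent step for UTR-MF can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: ratings are stored in a dict keyed by (i, j) so that only observed entries are visited, `W[i]` maps each neighbor f in L(i) to its weight, the squared error is scaled by 1/2, and all names, data and hyperparameters are hypothetical.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def residual(U, W, i):
    """U_i minus the weighted sum of the latent vectors of the users in L(i)."""
    return [U[i][k] - sum(w * U[f][k] for f, w in W[i].items())
            for k in range(len(U[i]))]


def utr_mf_loss(R, U, V, W, lam_u, lam_v, alpha):
    """UTR-MF objective with the squared error scaled by 1/2 (assumed)."""
    loss = 0.5 * sum((r - dot(U[i], V[j])) ** 2 for (i, j), r in R.items())
    loss += 0.5 * lam_u * sum(dot(u, u) for u in U)
    loss += 0.5 * lam_v * sum(dot(v, v) for v in V)
    for i in range(len(U)):
        res = residual(U, W, i)
        loss += 0.5 * alpha * dot(res, res)
    return loss


def utr_mf_step(R, U, V, W, lam_u, lam_v, alpha, lr):
    """One gradient descent step on U and V; returns the updated copies."""
    d = len(U[0])
    res = [residual(U, W, i) for i in range(len(U))]
    # Norm regularizers plus the user's own topic-regularization term.
    grad_u = [[lam_u * U[i][k] + alpha * res[i][k] for k in range(d)]
              for i in range(len(U))]
    grad_v = [[lam_v * v_k for v_k in v] for v in V]
    # Rating term: (R_ij - U_i^T V_j) times (-V_j) or (-U_i), observed entries only.
    for (i, j), r in R.items():
        err = r - dot(U[i], V[j])
        for k in range(d):
            grad_u[i][k] -= err * V[j][k]
            grad_v[j][k] -= err * U[i][k]
    # Every user g with i in L(g), i.e. g in G(i), adds -alpha * w_gi * residual_g.
    for g in range(len(U)):
        for i, w_gi in W[g].items():
            for k in range(d):
                grad_u[i][k] -= alpha * w_gi * res[g][k]
    new_u = [[u - lr * gu for u, gu in zip(U[i], grad_u[i])] for i in range(len(U))]
    new_v = [[v - lr * gv for v, gv in zip(V[j], grad_v[j])] for j in range(len(V))]
    return new_u, new_v


random.seed(0)
R = {(0, 0): 5.0, (0, 1): 3.0, (1, 0): 4.0, (2, 2): 1.0}  # hypothetical ratings
W = [{1: 1.0}, {0: 0.6, 2: 0.4}, {1: 1.0}]                # hypothetical weights
U = [[random.uniform(0, 0.1) for _ in range(2)] for _ in range(3)]
V = [[random.uniform(0, 0.1) for _ in range(2)] for _ in range(3)]
before = utr_mf_loss(R, U, V, W, 0.1, 0.1, 0.5)
for _ in range(200):
    U, V = utr_mf_step(R, U, V, W, 0.1, 0.1, 0.5, lr=0.01)
after = utr_mf_loss(R, U, V, W, 0.1, 0.1, 0.5)
```

Iterating over `W[g]` to reach each neighbor i is a convenient way to cover G(i) without building the reverse index explicitly.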
Recommendation with UGC in Item
Side
Intuition for items: similar UGC → similar topic distribution → similar latent features
$sim(j, h)$: similarity between items $j$ and $h$, computed by PCC, cosine or KL divergence
$w_{jh} = \frac{sim(j, h)}{\sum_{h' \in H(j)} sim(j, h')}$
A novel regularization: item topic regularization (ITR)
$\min \sum_{j=1}^{n} \| V_j - \sum_{h \in H(j)} w_{jh} V_h \|_F^2$
A new MF model (ITR-MF):
$\min_{U,V} L = \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{n} I_{ij} (R_{ij} - U_i^T V_j)^2 + \frac{\lambda_U}{2} \| U \|_F^2 + \frac{\lambda_V}{2} \| V \|_F^2 + \frac{\alpha}{2} \sum_{j=1}^{n} \| V_j - \sum_{h \in H(j)} w_{jh} V_h \|_F^2$
A natural combination: UTR + ITR
Solved by gradient descent / coordinate descent
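The slides spell out gradients only for UTR-MF. By symmetry, the ITR-MF gradients can be sketched as follows, assuming the squared-error term is scaled by 1/2 as the UTR-MF gradients imply. $Q(j)$, the set of items whose neighborhoods include item $j$, is notation introduced here for illustration and does not come from the slides.

```latex
\frac{\partial L}{\partial U_i}
  = \sum_{j=1}^{n} I_{ij} (R_{ij} - U_i^T V_j)(-V_j) + \lambda_U U_i

\frac{\partial L}{\partial V_j}
  = \sum_{i=1}^{m} I_{ij} (R_{ij} - U_i^T V_j)(-U_i) + \lambda_V V_j
  + \alpha \Big( V_j - \sum_{h \in H(j)} w_{jh} V_h \Big)
  + \alpha \sum_{q \in Q(j)} \Big( V_q - \sum_{h' \in H(q)} w_{qh'} V_{h'} \Big) (-w_{qj})
```

The regularization terms move from the user gradient to the item gradient; for the UTR + ITR combination, both gradients would carry their respective regularization terms.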
Experiment and Evaluation
Real-world datasets
MovieLens (social tag + rating)
Last.fm (expanded, social tag + rating)
Yelp (review + rating)
Evaluation metrics: RMSE and MAE
Compared baseline models: UserCF, ItemCF, PMF, TF-IDF MF, CTR
In the social tag case:
Experimental results (cont.)
UTR-MF and ITR-MF outperform the other baselines in all cases
For example, on the Last.fm dataset, ITR-MF achieves a 14% improvement over PMF and an 8% improvement over CTR
ITR-MF performs better than UTR-MF: a user's preference is harder to infer, probably because a user's preference can change dynamically
Experimental results (cont.)
In the review case, the improvement is similar to that in the social tag case
UTR-MF and ITR-MF outperform the other baselines in all cases
ITR-MF performs better than UTR-MF: a user's preference is harder to infer
The improvements are significant according to the paired t-test ($p < 0.001$)
For more details, please refer to our paper
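The two evaluation metrics used above, RMSE and MAE, can be sketched in a few lines using their standard definitions. The rating values are hypothetical and are not taken from the experiments.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error over paired ratings."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mae(actual, predicted):
    """Mean absolute error over paired ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical held-out ratings and model predictions.
actual = [4.0, 3.0, 5.0, 2.0]
predicted = [3.5, 3.0, 4.0, 2.5]
rmse_value = rmse(actual, predicted)
mae_value = mae(actual, predicted)
```

Lower is better for both metrics; RMSE penalizes large errors more heavily than MAE.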
Conclusion
We demonstrate that different types of UGC can be integrated
into the MF model in a unified way
User preferences and item features can be learned from UGC
text
Our two novel regularization terms are effective for modeling user
preferences and item features
Our two MF-extended models achieve large improvements over the baselines
Future Work
Study other types of UGC, such as tweets and blogs, to learn user
preferences and influential events in SNS
Reference
[1] Adomavicius, G. and Tuzhilin, A.: Toward the next generation of recommender systems: A survey of
the state-of-the-art and possible extensions. In: IEEE TKDE, 17(6):734-749 (2005)
[2] Aggarwal, C.C. and Zhai, C.: Mining Text Data. In: Springer, New York (2012)
[3] Bischoff, K., Firan, C.S., Nejdl, W., and Paiu, R.: Can all tags be used for search? In: ACM CIKM, pp.
193-202 (2008)
[4] Blei, D.M., Ng, A. Y., and Jordan, M. I.: Latent dirichlet allocation. In: JMLR, 3:993-1022 (2003)
[5] Cantador, I., Brusilovsky, P., and Kuflik, T.: HetRec workshop. In: ACM RecSys, New York, USA (2011)
[6] Chen, C., Zheng, X., Wang, Y., Hong, F. and Lin, Z.: Context-Aware Collaborative Topic Regression
with Social Matrix Factorization for Recommender Systems. In: AAAI, pp. 9-15 (2014)
[7] Fang, Y. and Si, L.: Matrix co-factorization for recommendation with rich side information and implicit
feedback. In: HetRec (workshop of RecSys), pp. 65-69 (2011)
[8] Griffiths, T. L. and Steyvers, M.: Finding Scientific Topics. In: PNAS (2004)
[9] Koren, Y., Bell, R., and Volinsky, C.: Matrix factorization techniques for recommender systems. In:
Computer, 42(8):30-37 (2009)
[10] Liang, H., Xu, Y., Li, Y., Nayak, R., and Tao, X.: Connecting users and items with weighted tags for
personalized item recommendations. In: Hypertext, pp. 51-60 (2010)
[11] Liu, X. and Aberer, K.: SoCo: a social network aided context-aware recommender system. In: WWW,
pp. 781-802 (2013)
[12] Ma, H., Zhou, D., Liu, C., Lyu, M.R., and King, I.: Recommender systems with social regularization.
In: ACM WSDM, pp. 287-296 (2011)
[13] McAuley, J.J. and Leskovec, J.: Hidden factors and hidden topics: understanding rating
dimensions with review text. In: ACM RecSys, pp. 165-172 (2013)
[14] Moens, M.-F., Li, J. and Chua, T.-S.: Mining User Generated Content. In: Chapman and Hall/CRC
(2014)
[15] Pandora. Music genome project. In: http://www.pandora.com/about/mgp
[16] Purushotham, S. and Liu, Y.: Collaborative topic regression with social matrix factorization for
recommendation systems. In: IEEE ICML, pp. 759-766 (2012)
[17] Resnick, P., Iacovou, N., Suchak, M., Bergstrom, P., and Riedl, J.: Grouplens: An open
architecture for collaborative filtering of netnews. In: CSCW, pp. 175-186 (1994)
[18] Rovi. Recommendations api version 2.0. In:
http://proddoc.rovicorp.com/mashery/index.php/Recommendations
[19] Salakhutdinov, R. and Mnih, A.: Probabilistic matrix factorization. In: NIPS
[20] Sarwar, B., Karypis, G., Konstan, J., and Riedl, J.: Item-based collaborative filtering
recommendation algorithms. In: WWW, pp. 285-295 (2001)
[21] Wang, C. and Blei, D.M.: Collaborative topic modeling for recommending scientific articles. In: ACM
SIGKDD, pp. 448-456 (2011)
[22] Yang, X., Steck, H., and Liu, Y.: Circle-based recommendation in online social networks. In: ACM
SIGKDD, pp. 1267-1275 (2012)
[23] Zhang, Y., Lai, G., Zhang, M., Zhang, Y., Liu, Y. and Ma, S.: Explicit factor models for explainable
recommendation based on phrase-level sentiment analysis. In: ACM SIGIR, pp. 83-92 (2014)
Thank you!
Q&A