Point-of-Interest Recommendations: Learning Potential Check-ins from Friends
Huayu Li∗, Yong Ge+, Richang Hong−, Hengshu Zhu×
∗University of North Carolina at Charlotte
+University of Arizona
−Hefei University of Technology
×Baidu Research-Big Data Lab
Outline
Introduction
Research Problem
Research Challenges
Related Work
Methodologies
Experiments
Introduction
Users, Mobile Devices, Location-based Social Network (LBSN) Services
Information Overload
• Foursquare: 65 million venues
• Facebook: 16 million local businesses
• Yelp: 2.1 million claimed businesses
New Region
Which One?
A location recommender system is very important!
Research Problem
Given a set of users and the set of locations they have visited before, the objective is to recommend to an individual user the locations that she might be interested in visiting.
[Figure legend: visited locations, recommended locations]
Research Challenges
Complex Decision Making Process
• Social Network Influence
• Geographical Influence
Data Sparsity Issue
• Each user visits only a limited number of locations.
• For a new user or a new location, no check-in information is available.
Implicit Feedback Issue
• Only check-in frequencies are observed, without explicit ratings.
• Users' explicit preferences for locations are unknown.
Related Work
Modeling Social Network Influence
• Social regularization constraint (WSDM'11)
• Social correlations (CIKM'12, IJCAI'13, ICDM'15)
• User-based collaborative filtering (SIGIR'11)
Modeling Geographical Influence
• Incorporating geographical distance (KDD'11, SIGIR'11, AAAI'12, SIGSPATIAL'13, KDD'14, ICDM'15)
• Incorporating activity area (KDD'14)
• Incorporating nearest neighbors (CIKM'14)
Methods: Framework
• Learn potential locations from friends
• Learn user's preference for locations
Definition of Friends
Social Friends $\mathcal{F}_i^s$
• The users who are socially connected with the target user $i$ in LBSNs.
Location Friends $\mathcal{F}_i^l$
• The users who have checked in at the same locations as the target user $i$.
Neighboring Friends $\mathcal{F}_i^n$
• The users who live physically closest to the target user $i$.
[Figure: target user $u_i$ with friends $f_1$ to $f_6$ and locations $l_1$ to $l_5$]
$\mathcal{F}_i = \mathcal{F}_i^s \cup S(\mathcal{F}_i^l) \cup S(\mathcal{F}_i^n)$
Methods: Learning Potential Locations
Problem definition: For the target user $i$, given the set of locations that her friends have checked in at before but that she has never visited, the problem is to find the top potential locations that she might be interested in.
Methods: Learning Potential Locations
Location candidate $l_j$: what is the probability $P_{ij}^{pot}$ that the target user $u_i$ visits it? Two approaches:
• Linear Aggregation
• Random Walk
Methods: Linear Aggregation
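Probability $P_{ij}^{pot}$ that user $i$ visits a location $j$:
$$P_{ij}^{pot} \propto \max_{f \in \mathcal{F}_i^j} \{ Sim(i, f; j) \}$$
$$Sim(i, f; j) = \zeta \, Sim_u(i, f) + (1 - \zeta) \, P_{ij}^G$$
• $Sim_u(i, f)$: similarity of user interest
• $P_{ij}^G$: similarity of geo-location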
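The linear aggregation rule can be illustrated with a minimal Python sketch. It assumes that $\mathcal{F}_i^j$ denotes the friends of user $i$ who have checked in at location $j$, and that the two similarity terms are already available as lookup tables; the function name, the dictionaries, and the toy values are all hypothetical.

```python
def potential_score(i, j, friends_who_visited, sim_user, geo_prob, zeta=0.5):
    """Unnormalised P_ij^pot: best blended similarity over friends of i who visited j.

    Assumption: friends_who_visited[(i, j)] plays the role of F_i^j.
    """
    candidates = friends_who_visited.get((i, j), [])
    if not candidates:
        return 0.0  # no friend has visited j, so j is not a potential location
    return max(
        zeta * sim_user[(i, f)] + (1 - zeta) * geo_prob[(i, j)]
        for f in candidates
    )


# Toy data (hypothetical): user "u" has two friends who visited "cafe".
sim_user = {("u", "f1"): 0.2, ("u", "f2"): 0.8}
geo_prob = {("u", "cafe"): 0.5}
friends_who_visited = {("u", "cafe"): ["f1", "f2"]}

score = potential_score("u", "cafe", friends_who_visited, sim_user, geo_prob)
```

Taking the maximum means that a single highly similar friend is enough to make a location a strong candidate.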
Methods: Random Walk
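• Nodes: users and locations
• Links: user-user, user-location, location-location
$$\mathbf{y} = (1 - \beta)\,\mathbf{A}\mathbf{y} + \frac{\beta}{|\mathcal{M}_i^o \cap \mathcal{M}_i^f| + |\mathcal{F}_i| + 1}\,\mathbf{x}$$
• $\mathbf{A}$: transition matrix
• $\mathbf{x}$: restart nodes
$P_{ij}^{pot}$ is the steady-state probability corresponding to location $j$.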
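A random walk with restart of this kind can be solved by power iteration. The sketch below is an illustration only: it assumes a column-stochastic transition matrix and a restart vector spread uniformly over the restart nodes (on the reading that the denominator on the slide counts those nodes). The toy graph and all names are hypothetical.

```python
import numpy as np


def random_walk_with_restart(A, restart_nodes, beta=0.15, tol=1e-10, max_iter=1000):
    """Iterate y = (1 - beta) * A @ y + beta * x until the update is below tol."""
    n = A.shape[0]
    x = np.zeros(n)
    x[list(restart_nodes)] = 1.0 / len(restart_nodes)  # uniform restart distribution
    y = x.copy()
    for _ in range(max_iter):
        y_next = (1 - beta) * A @ y + beta * x
        if np.abs(y_next - y).sum() < tol:
            return y_next
        y = y_next
    return y


# Toy graph (hypothetical): node 0 = target user, node 1 = friend,
# nodes 2 and 3 = locations.
adjacency = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 1, 1, 0],
], dtype=float)
A = adjacency / adjacency.sum(axis=0, keepdims=True)  # make columns sum to 1

y = random_walk_with_restart(A, restart_nodes=[0, 1], beta=0.2)
```

The entries of `y` at the location nodes would then serve as the potential scores for those locations.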
Methods: Learning Potential Locations
[Figure: a user's locations grouped into Observed Locations, Potential Locations, and Other Unobserved Locations]
Recommendation Models
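The preference $\hat{p}_{ij}$ of user $i$ for location $j$:
$$\hat{p}_{ij} = (q_i c_j + \varepsilon)\, \mathbf{u}_i^T \mathbf{v}_j$$
• $q_i c_j$: user's preference for category
• $\varepsilon$: tuning parameter
• $\mathbf{u}_i^T \mathbf{v}_j$: user's typical preference for location
[Figure: matrix view of the model, with the users' preference for locations $\mathbf{P}$, the category feature matrix, $\tilde{\mathbf{Q}} = \mathbf{Q} + \boldsymbol{\varepsilon}$, the user latent matrix $\mathbf{U}$ (dimension $d$), and the location latent matrix $\mathbf{V}$]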
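As a small illustration of the preference formula, the sketch below evaluates it for one user and one location. It assumes that $q_i$ and $c_j$ are vectors combined by a dot product (the slide does not spell this out); the function name and the toy values are hypothetical.

```python
import numpy as np


def predict_preference(q_i, c_j, u_i, v_j, eps=0.1):
    """p_hat_ij = (q_i . c_j + eps) * (u_i . v_j).

    Assumption: q_i is the user's category preference vector and
    c_j is the location's category feature vector.
    """
    category_term = float(np.dot(q_i, c_j)) + eps
    return category_term * float(np.dot(u_i, v_j))


q_i = np.array([0.5, 0.0])   # user's preference over two categories
c_j = np.array([1.0, 0.0])   # location belongs to the first category
u_i = np.array([1.0, 2.0])   # user latent vector
v_j = np.array([0.5, 0.25])  # location latent vector

p_hat = predict_preference(q_i, c_j, u_i, v_j, eps=0.1)
```

With these numbers the category term is 0.6 and the latent term is 1.0. The tuning parameter keeps the score from collapsing to zero when the user has shown no interest in the category.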
Recommendation Models
Loss function of general form
$$\arg\min_{\mathbf{U},\mathbf{V},\mathbf{Q}} \sum_i E_i(p_{ij}, p_{ik}, p_{ih}, \hat{p}_{ij}, \hat{p}_{ik}, \hat{p}_{ih}) + \Theta(\mathbf{U}, \mathbf{V}, \mathbf{Q}), \quad \forall\, j \in \mathcal{M}_i^o,\ k \in \mathcal{M}_i^p,\ h \in \mathcal{M}_i^u$$
• $\hat{p}_{ij}, \hat{p}_{ik}, \hat{p}_{ih}$: estimated values
• $\mathcal{M}_i^o$: observed locations
• $\mathcal{M}_i^p$: potential locations
• $\mathcal{M}_i^u$: other unobserved locations
Regularization term
$$\Theta(\mathbf{U}, \mathbf{V}, \mathbf{Q}) = \frac{\lambda_u}{2}\|\mathbf{U}\|_2^2 + \frac{\lambda_v}{2}\|\mathbf{V}\|_2^2 + \frac{\lambda_q}{2}\|\mathbf{Q}\|_2^2$$
Two choices of $E_i$: Square Error based Model and Ranking Error based Model
Square Error based Model
The user's preference for a location is defined as:
$$p_{ij} = \begin{cases} 1 & \text{if } j \in \mathcal{M}_i^o \text{ (observed locations)} \\ \alpha & \text{if } j \in \mathcal{M}_i^p \text{ (potential locations)} \\ 0 & \text{otherwise (other unobserved locations)} \end{cases}$$
Squared error loss function
$$E_i(\cdot) = \sum_{j=1}^{M} w_{ij} (p_{ij} - \hat{p}_{ij})^2$$
Weight matrix
$$w_{ij} = \begin{cases} 1 + \gamma \times r_{ij} & \text{if } j \in \mathcal{M}_i^o \\ 1 & \text{otherwise} \end{cases}$$
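The target and weight definitions for the square error model can be sketched in a few lines of NumPy. The sketch assumes that $r_{ij}$ is the check-in count of user $i$ at location $j$, which the slide does not state explicitly; the function names and toy data are hypothetical.

```python
import numpy as np


def targets_and_weights(checkins, potential_mask, alpha=0.5, gamma=1.0):
    """Build p_ij (1 / alpha / 0) and w_ij (1 + gamma * r_ij for observed, else 1).

    Assumption: checkins[i, j] is the check-in count r_ij.
    """
    observed = checkins > 0
    p = np.where(observed, 1.0, np.where(potential_mask, alpha, 0.0))
    w = np.where(observed, 1.0 + gamma * checkins, 1.0)
    return p, w


def squared_error(p, w, p_hat):
    """Weighted squared error summed over all user-location pairs."""
    return float(np.sum(w * (p - p_hat) ** 2))


# One user, three locations: visited 3 times, potential, other unobserved.
checkins = np.array([[3, 0, 0]])
potential_mask = np.array([[False, True, False]])

p, w = targets_and_weights(checkins, potential_mask, alpha=0.5, gamma=1.0)
loss = squared_error(p, w, np.zeros_like(p))
```

Here the targets are `[1, 0.5, 0]` and the weights are `[4, 1, 1]`, so frequently visited locations dominate the loss while potential locations pull the prediction toward the intermediate value `alpha`.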
Square Error based Model
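Squared error based objective function
$$\mathcal{L} = \min_{\mathbf{U},\mathbf{V},\mathbf{Q}} \sum_{i=1}^{N} \sum_{j=1}^{M} w_{ij} (p_{ij} - \hat{p}_{ij})^2 + \Theta(\mathbf{U}, \mathbf{V}, \mathbf{Q})$$
Optimization: Alternating Least Squares
• Initialization
• Alternating Update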
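The alternating scheme can be illustrated on a simplified version of the objective. The sketch below drops the category term and fits plain $\mathbf{u}_i^T \mathbf{v}_j$ to the targets with weighted alternating least squares. It is an assumption-laden simplification, not the update rules from the talk, and every name in it is hypothetical.

```python
import numpy as np


def objective(P, W, U, V, lam):
    """Weighted squared error plus (lam / 2) * squared norms of U and V."""
    err = np.sum(W * (P - U @ V.T) ** 2)
    return float(err + 0.5 * lam * (np.sum(U ** 2) + np.sum(V ** 2)))


def solve_rows(P, W, F, lam):
    """For each row r, minimise sum_c W[r,c] * (P[r,c] - x @ F[c])**2 + lam/2 * |x|^2."""
    d = F.shape[1]
    out = np.empty((P.shape[0], d))
    for r in range(P.shape[0]):
        FW = F.T * W[r]  # d x n: each column of F.T scaled by its weight
        out[r] = np.linalg.solve(FW @ F + 0.5 * lam * np.eye(d), FW @ P[r])
    return out


def als(P, W, d=2, lam=0.1, n_iter=10, seed=0):
    rng = np.random.default_rng(seed)
    U = rng.normal(scale=0.1, size=(P.shape[0], d))  # initialization
    V = rng.normal(scale=0.1, size=(P.shape[1], d))
    history = [objective(P, W, U, V, lam)]
    for _ in range(n_iter):                          # alternating update
        U = solve_rows(P, W, V, lam)                 # fix V, solve for users
        V = solve_rows(P.T, W.T, U, lam)             # fix U, solve for locations
        history.append(objective(P, W, U, V, lam))
    return U, V, history


P = np.array([[1.0, 0.5, 0.0], [0.0, 1.0, 0.5]])  # toy targets p_ij
W = np.array([[3.0, 1.0, 1.0], [1.0, 2.0, 1.0]])  # toy weights w_ij
U, V, history = als(P, W)
```

Each half-step solves a regularized least squares problem exactly, so the recorded objective values never increase from one iteration to the next.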
Ranking Error based Model
Model the ranking order among a user's preferences for the three types of locations:
$$\begin{cases} \hat{p}_{ij} > \hat{p}_{ik} \\ \hat{p}_{ik} > \hat{p}_{ih} \end{cases}, \quad \forall\, j \in \mathcal{M}_i^o,\ k \in \mathcal{M}_i^p,\ h \in \mathcal{M}_i^u$$
• Observed Location > Potential Location
• Potential Location > Other Unobserved Location
Ranking error loss function, using the logistic function $\sigma$ to model ranking order:
$$E_i(\cdot) = -\sum_{j \in \mathcal{M}_i^o} \sum_{k \in \mathcal{M}_i^p} \ln \sigma(\hat{p}_{ij} - \hat{p}_{ik}) - \sum_{k \in \mathcal{M}_i^p} \sum_{h \in \mathcal{M}_i^u} \ln \sigma(\hat{p}_{ik} - \hat{p}_{ih})$$
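The ranking error for a single user can be computed directly from predicted preferences. This is a minimal sketch with hypothetical names and toy scores, showing that the loss is small when the predicted order is observed > potential > other unobserved, and large when that order is reversed.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def ranking_loss(p_hat, observed, potential, unobserved):
    """E_i: negative log-likelihood of the two pairwise orderings for one user."""
    loss = 0.0
    for j in observed:
        for k in potential:
            loss -= math.log(sigmoid(p_hat[j] - p_hat[k]))
    for k in potential:
        for h in unobserved:
            loss -= math.log(sigmoid(p_hat[k] - p_hat[h]))
    return loss


# Location "a" is observed, "b" is potential, "c" is other unobserved.
good = ranking_loss({"a": 2.0, "b": 1.0, "c": 0.0}, ["a"], ["b"], ["c"])
bad = ranking_loss({"a": 0.0, "b": 1.0, "c": 2.0}, ["a"], ["b"], ["c"])
```

Only score differences enter the loss, so no explicit rating scale is needed.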
Ranking Error based Model
Ranking error based objective function
Optimization: Stochastic Gradient Descent with Bootstrap Sampling
• Initialization
• Sampling
• Update
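The initialization, sampling, and update steps might look as follows for a simplified model that scores locations with plain $\mathbf{u}_i^T \mathbf{v}_j$ and omits the category term. This is an illustrative sketch under those assumptions, not the talk's algorithm; the function name, hyperparameters, and toy data are hypothetical, and every user is assumed to have at least one location of each type.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def sgd_ranking(observed, potential, n_locations, d=4, lr=0.05, lam=0.01,
                n_steps=5000, seed=0):
    """SGD on -ln s(p_ij - p_ik) - ln s(p_ik - p_ih) with L2 regularization."""
    rng = np.random.default_rng(seed)
    n_users = len(observed)
    U = rng.normal(scale=0.1, size=(n_users, d))       # initialization
    V = rng.normal(scale=0.1, size=(n_locations, d))
    for _ in range(n_steps):
        i = rng.integers(n_users)                      # sampling (with replacement)
        others = [l for l in range(n_locations)
                  if l not in observed[i] and l not in potential[i]]
        j = rng.choice(sorted(observed[i]))
        k = rng.choice(sorted(potential[i]))
        h = rng.choice(others)
        u, vj, vk, vh = U[i].copy(), V[j].copy(), V[k].copy(), V[h].copy()
        g1 = 1.0 - sigmoid(u @ (vj - vk))              # gradient scale, first pair
        g2 = 1.0 - sigmoid(u @ (vk - vh))              # gradient scale, second pair
        U[i] += lr * (g1 * (vj - vk) + g2 * (vk - vh) - lam * u)   # update
        V[j] += lr * (g1 * u - lam * vj)
        V[k] += lr * ((g2 - g1) * u - lam * vk)
        V[h] += lr * (-g2 * u - lam * vh)
    return U, V


# Two users, four locations (toy data).
observed = [{0}, {1}]
potential = [{1}, {2}]
U, V = sgd_ranking(observed, potential, n_locations=4)
```

Sampling one (observed, potential, other) triple per step avoids enumerating every pair of locations, which is what makes this style of optimization practical on sparse check-in data.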
Incorporating Geographical Influence
The check-in probability is refined by a power-law function of the distance between the user's home position and the location:
$$\hat{p}_{ij} \propto p_{ij}^G \times \sigma(\hat{p}_{ij}), \quad p_{ij}^G = powerlaw(d(i, j))$$
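A short sketch of the geographical refinement follows. The power-law form `a * d ** b` with a negative exponent, the parameter values, and the distance floor are all assumptions made for illustration; the slide names only a power-law function of distance.

```python
import math


def power_law(distance_km, a=1.0, b=-1.0, min_km=0.1):
    """Hypothetical power law a * d^b; the floor avoids division by zero at d = 0."""
    return a * max(distance_km, min_km) ** b


def refine(p_hat, distance_km):
    """Unnormalised refined score: power-law geo term times sigmoid(p_hat)."""
    return power_law(distance_km) * (1.0 / (1.0 + math.exp(-p_hat)))


near = refine(0.0, 1.0)   # same preference, 1 km from home
far = refine(0.0, 10.0)   # same preference, 10 km from home
```

With equal learned preference, the nearer location receives the higher refined score.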
Recommendation Strategies
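[Figure: target user $i$ and a new location]
• Standard Recommendation
• New User Recommendation: $\hat{p}_{ij} = (q_i c_j + \varepsilon)\, \mathbf{u}_i^T \mathbf{v}_j$
• New Location Recommendation:
$$\hat{p}_{ij} \propto p_{ij}^G \times \sigma\!\left( \frac{\sum_{l \in \psi_j} Sim_G(j, l)\, \hat{p}_{il}}{\sum_{l \in \psi_j} Sim_G(j, l)} \right)$$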
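The new location formula is a similarity-weighted average of the user's predicted preferences for other locations, passed through the logistic function and scaled by the geographical term. The sketch below assumes that $\psi_j$ is a set of already known locations near the new location $j$ and that the similarities $Sim_G(j, l)$ are precomputed; both readings, and all names, are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def new_location_score(p_hat_neighbors, sims, geo_prob):
    """Unnormalised score for a location with no check-ins.

    p_hat_neighbors[n]: predicted preference of the user for neighbor n.
    sims[n]: geographical similarity Sim_G between the new location and neighbor n.
    """
    total = sum(sims)
    if total == 0:
        return 0.0  # no usable neighbors
    weighted = sum(s * p for s, p in zip(sims, p_hat_neighbors)) / total
    return geo_prob * sigmoid(weighted)


# Two equally similar neighbors with opposite preferences cancel out.
score = new_location_score([2.0, -2.0], [1.0, 1.0], geo_prob=0.4)
```

Because the new location has no latent vector of its own, it borrows preference information from its neighbors.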
Experiments
Datasets: Gowalla
Test Methodology
• Select 80% of the data for training and use the remaining 20% for testing, split according to timestamp.
Evaluation Metrics
• Top-K Recommendation Accuracy (Precision@K and Recall@K)
Statistics of Data Set
| #User | #Location | #Check-in | Sparsity | New Location Rec: #New Location | New Location Rec: #Test | New User Rec: #New User | New User Rec: #Test |
| 52,216 | 98,351 | 2,577,336 | 0.0399% | 78,881 | 568,937 | 9,326 | 79,153 |
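Precision@K and Recall@K for one user can be computed as in the sketch below, which uses the standard definitions of the two metrics. The function name and the toy lists are hypothetical.

```python
def precision_recall_at_k(ranked, relevant, k):
    """Precision@K = hits / K; Recall@K = hits / number of relevant test locations."""
    top_k = ranked[:k]
    hits = sum(1 for loc in top_k if loc in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Recommended order vs. locations the user actually visited in the test period.
precision, recall = precision_recall_at_k(["a", "b", "c", "d"], {"b", "d", "e"}, k=2)
```

In this example one of the top two recommendations is a hit, so Precision@2 is 1/2 and Recall@2 is 1/3. Averaging over all test users gives the figures reported for a given K.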
Exp.: Standard Recommendation
[Figure: Precision@K and Recall@K]
Modeling unobserved check-ins can improve recommendation accuracy!
Exp.: Standard Recommendation
[Figure: Precision@K and Recall@K]
Modeling potential check-ins can benefit recommendation!
Exp.: New User Recommendation
[Figure: Precision@K and Recall@K]
Modeling potential check-ins can solve the user cold-start issue!
Exp.: New Location Recommendation
[Figure: performance comparison for new location recommendation in terms of Precision@K and Recall@K]
Modeling potential check-ins can solve the location cold-start issue!
Conclusion
• Empirically analyze the correlations between users and their three types of friends using real-world data
• For each user, learn a set of locations that her friends have checked in at before and that she is most likely to be interested in
• Develop matrix factorization based models with different error loss functions using the learned potential check-ins, and propose two scalable optimization methods
• Design three different recommendation strategies
Thank You