Convolutional Neural Network based Recommender System Deep Learning based Recommender System (Zhang et al. 2017) Presented by Jiin Seo November 28, 2017

Source: stat.snu.ac.kr/idea/seminar/20171128/CNN based RS.pdf · 2017. 11. 29.


Convolutional Neural Network based Recommender System

Deep Learning based Recommender System (Zhang et al. 2017)

Presented by Jiin Seo

November 28, 2017


Outline

1. Attention based CNN

2. Personalized CNN (CNN-PerMLP)

3. Deep Cooperative Neural Network (DeepCoNN)

4. Convolutional Matrix Factorization (ConvMF)

5. CNN for Image Feature Extraction (VPOI)

6. CNN for Audio Feature Extraction (WMF)

7. CNN for Text Feature Extraction



1. Attention based CNN

Attention based CNN (Gong et al. 2016)

• Hashtag recommendation in microblog

• Multi-class classification problem

• (Global channel + Local channel) ⇒ Convolutional layer

• An attention mechanism scans the input microblog and selects trigger words. For each tag, it focuses on only a small subset of the words.


1. Attention based CNN

Architecture

Figure: The architecture of the attention-based Convolutional Neural Network


1. Attention based CNN

Notations

• Given an input microblog $m$ of length $n$, each word is represented by a vector $w_i \in \mathbb{R}^d$ ($d$: dimension of the word vector).

• $w_{i:i+j}$: the concatenation of words $w_i, w_{i+1}, \cdots, w_{i+j}$


1. Attention based CNN

Local Attention Channel . 1) Local attention layer

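• The attention layer generates a sequence of trigger words $(w_i, \cdots, w_j)$ from a small window (window size: $h$).

• The score of the central word $w_{(2i+h-1)/2}$ is

$$s_{(2i+h-1)/2} = g(M^l * w_{i:i+h} + b)$$

$g$: non-linear function, $M^l \in \mathbb{R}^{h \times d}$: parameter matrix, $b$: bias

• Extract the trigger words by thresholding the scores:

$$\hat{w}_i = \begin{cases} w_i & \text{if } s_i > \eta, \\ 0 & \text{if } s_i \le \eta, \end{cases} \qquad 0 \le i \le n$$

• The threshold: $\eta = \delta \cdot \min\{s\} + (1 - \delta) \cdot \max\{s\}$, where $s$ is the sequence of scores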
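The scoring and thresholding steps on this slide can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the function name, the choice $g = \tanh$, and the zero padding that lets every word sit at the centre of a window are all assumptions.

```python
import numpy as np

def extract_trigger_words(words, M_l, b, h, delta):
    """Score each word from its surrounding window and zero out non-trigger words.

    words: (n, d) word embeddings; M_l: (h, d) parameter matrix; b: scalar bias.
    The zero padding and g = tanh are illustrative assumptions.
    """
    n, d = words.shape
    pad = h // 2
    padded = np.vstack([np.zeros((pad, d)), words, np.zeros((h - 1 - pad, d))])
    # s = g(M_l * w_window + b) for the window centred on each word
    scores = np.array([np.tanh(np.sum(M_l * padded[i:i + h]) + b) for i in range(n)])
    # eta = delta * min{s} + (1 - delta) * max{s}
    eta = delta * scores.min() + (1.0 - delta) * scores.max()
    mask = scores > eta
    # Keep w_i where s_i > eta, replace it by the zero vector otherwise
    return words * mask[:, None], scores, eta
```

With $\delta$ close to 1 the threshold approaches the minimum score and most words are kept; with $\delta$ close to 0 only the highest-scoring words survive.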


1. Attention based CNN

Local Attention Channel . 2) Folding layer

• Abstract the features of the trigger words $\hat{w}$.

$$z = g(M^l * \mathrm{folding}(\hat{w}) + b)$$

where $g$: non-linear function, $M^l \in \mathbb{R}^{d \times r}$ and $b \in \mathbb{R}^r$

• folding: the sum over all trigger words, taken separately for each dimension

$$f_i = \sum_j \hat{w}_{j,i}$$

• Output: a fixed-length vector that represents the embeddings of the trigger words $\hat{w}$.


1. Attention based CNN

Global Channel . 1) Convolutional Layer

• For each tag, all the words are encoded.

• A CNN architecture models the whole microblog.

• Abstract the features:

$$z_i = g(M^g \cdot w_{i:i+l-1} + b)$$

$g$: non-linear function, $M^g \in \mathbb{R}^{l \times d}$ ($l$: window size) and $b \in \mathbb{R}$

• The filter is applied to every window of words in the microblog: $\{w_{1:l}, w_{2:l+1}, \cdots, w_{n-l+1:n}\}$

• The resulting feature map:

$$z = [z_1, z_2, \cdots, z_{n-l+1}]$$


1. Attention based CNN

Global Channel . 2) Pooling Layer

• A max-over-time pooling operation is applied.

• It extracts the most important feature of each feature map.

• To obtain multiple features, the model uses multiple filters with varying window sizes.

• Output: a fixed-length vector that represents the embedding of the input microblog $m$.
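The global channel (one feature map per filter, followed by max-over-time pooling) can be sketched as below. This is a minimal NumPy illustration under stated assumptions: the function names are hypothetical, $g$ is taken to be $\tanh$, and each filter is an $l \times d$ matrix with a scalar bias.

```python
import numpy as np

def conv_feature_map(words, M_g, b):
    """Slide one l x d filter over the microblog: z_i = g(M_g . w_{i:i+l-1} + b).

    words: (n, d) word embeddings. g = tanh is an illustrative assumption.
    Returns a feature map of length n - l + 1.
    """
    n, _ = words.shape
    l = M_g.shape[0]
    return np.array([np.tanh(np.sum(M_g * words[i:i + l]) + b)
                     for i in range(n - l + 1)])

def encode_microblog(words, filters, biases):
    """Max-over-time pooling: keep the largest value of each feature map.

    Filters may have different window sizes; the output length equals the
    number of filters, independent of the microblog length n.
    """
    return np.array([conv_feature_map(words, M, b).max()
                     for M, b in zip(filters, biases)])
```

Because pooling keeps one value per filter, microblogs of different lengths map to vectors of the same size.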


1. Attention based CNN

Combining the Outputs of both channels

• The outputs of the local attention channel and the global channel are fed to a simple convolutional layer.

• Combine the information as follows:

$$h = \tanh(M * v[h_g; h_l] + b)$$

$h_g, h_l$: the feature vectors extracted from the global and local channels, $M$: filter matrix for the convolutional operation, $b$: bias


1. Attention based CNN

Training

• Parameters: $\Theta = \{W, M^l, M^g\}$, where $W$ is the word embeddings and $M^l, M^g$ are the parameters of the two channels

• Training objective function:

$$J = \sum_{(m,a) \in D} -\log p(a \mid m)$$

where $D$ is the training corpus and $a$ is the hashtag for microblog $m$.

• To minimize the objective function, we use AdaDelta.


1. Attention based CNN

Hashtag Recommendation

• Given an unlabelled dataset, we first train our model on the training data and save the model that performs best on the validation dataset.

• Encode each microblog through the local attention channel and the global channel of the saved model.

• Combine the features generated by both channels.

• A fully connected layer gives the scores of the hashtags for the $d$-th microblog:

$$P(y_d = a \mid h_d; \beta) = \frac{\exp(\beta^{(a)T} h_d)}{\sum_{j \in A} \exp(\beta^{(j)T} h_d)}$$

$A$: set of candidate hashtags, $\beta$: parameters, $h_d$: feature vector

• Rank the hashtags for each microblog and recommend the top-ranked hashtags.
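The softmax scoring and ranking step can be sketched as follows. The function names and the layout of $\beta$ as a matrix with one row per candidate hashtag are assumptions made for illustration.

```python
import numpy as np

def hashtag_probabilities(h_d, beta):
    """Softmax over candidate hashtags: P(y_d = a | h_d; beta).

    h_d: (r,) feature vector of the microblog.
    beta: (|A|, r) matrix, one parameter row per candidate hashtag (assumed layout).
    """
    logits = beta @ h_d
    logits = logits - logits.max()  # subtract the max for numerical stability
    e = np.exp(logits)
    return e / e.sum()

def recommend_hashtags(h_d, beta, hashtags, k):
    """Return the k hashtags with the highest probability."""
    probs = hashtag_probabilities(h_d, beta)
    top = np.argsort(-probs)[:k]
    return [hashtags[i] for i in top]
```

Since the softmax is monotone in the logits, ranking by $\beta^{(a)T} h_d$ directly would give the same order; the probabilities matter mainly for training.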


1. Attention based CNN

Result

• Attention based CNN outperforms state-of-the-art methods.

• The trigger-word method can improve the performance.

• Multiple channels achieve better performance than a single channel.



2. Personalized CNN (CNN-PerMLP)

Personalized CNN for Tag Recommendation (Nguyen et al. 2016)

• Image tag recommender system

• A personalized content-aware tag recommender suggests a ranked list of relevant tags ($T_{u,i}$).

• CNN-PerMLP employs

  • Convolutional Neural Networks
  • Personalized Fully-Connected Layer
  • Multilayer Perceptron as the Predictor


2. CNN-PerMLP

Architecture

Figure: The architecture of CNN-PerMLP


2. CNN-PerMLP

Notations

• $U$: users, $I$: images, $T$: tags

• $A = (a_{u,i,t}) \in \mathbb{R}^{|U| \times |I| \times |T|}$,

$$a_{u,i,t} = \begin{cases} 1 & \text{if } u \text{ assigns the tag } t \text{ to the image } i, \\ 0 & \text{otherwise} \end{cases}$$

• $S := \{(u,i,t) \mid a_{u,i,t} \in A \wedge a_{u,i,t} = 1\}$: the observed tagging set

• $T_{u,i} := \{t \in T \mid (u,i,t) \in S\}$: the set of relevant tags of a user-image pair

• $P_S := \{(u,i) \mid \exists t \in T : (u,i,t) \in S\}$: all observed posts


2. CNN-PerMLP

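Notations

• The collection of all square RGB images: $R = \{R_{i,q} \mid R_{i,q} \in \mathbb{R}^{d \times d \times 3} \wedge i \in I \wedge q \in Q\}$, where $z_i \in \mathbb{R}^m$ is the visual features of the $i$-th image $R_i$ and $Q$ is the set of patches

• The final scores of tags are calculated as follows:

$$\hat{y}(u,i,t) = \underset{R_{i,q},\, q \in Q}{\mathrm{avg}}\ \hat{y}'(u, R_{i,q}, t)$$

• Top-K tag list:

$$\hat{T}_{u,i} := \underset{t \in T,\ |\hat{T}_{u,i}| = K}{\arg\max}\ \hat{y}(u,i,t)$$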


2. CNN-PerMLP

Convolution Neural Networks

• The visual features are obtained by passing a patch $q$ of the image $i$ through the CNN feature extractor.

• Convolutional layer

$$\tau^k_{ij} = \varphi\Big(b^k + \sum_{a=1}^{p_1} (W^k_a * \xi^a)_{ij}\Big)$$

$\tau^k$: $k$-th output feature map, $\xi^a$: $a$-th input feature map, $*$: convolution operator, $\varphi$: activation function, $W^k \in \mathbb{R}^{p_1 \times p_2 \times p_2}$, $b^k$: weights and bias of the filter for $\tau^k$

• Max pooling operator

$$\tau^k_{ij} = \max_{a,b}\, (\xi^k)_{a,b}, \qquad \xi^k: k\text{-th feature map}$$

• Output:

$$z^q_i = f_{cnn}(R^q_i) : \mathbb{R}^{d \times d \times 3} \to \mathbb{R}^m$$


2. CNN-PerMLP

Personalized Fully-Connected Layer

• To personalize the visual features of an image, the user's information (ID) has to be combined with the features from the CNN.

• This layer captures the interaction between the user and each visual feature.

• Input:

  • $z^q_i$: the visual feature vector
  • $\kappa := \{0,1\}^{|U|}$: the sparse vector (user's features)

• Output (user-aware features):

$$\psi_j(u, z^q_i) = \varphi(b_j + w^{per}_j \cdot (z^q_i)_j + V_j \kappa_u)$$

$w^{per} \in \mathbb{R}^m$: the weights of the visual features, $V \in \mathbb{R}^{m \times |U|}$: the weights of the user features, $\varphi$: activation function
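A minimal NumPy sketch of the personalized layer is given below. It assumes that $\kappa_u$ is a one-hot encoding of the user ID and that $\varphi$ is ReLU; neither choice is stated on the slide, and the function name is hypothetical.

```python
import numpy as np

def personalized_layer(z, user_index, w_per, V, b):
    """User-aware features psi_j = phi(b_j + w_per_j * z_j + V_j kappa_u).

    z: (m,) visual features; w_per: (m,) per-feature weights;
    V: (m, |U|) user weights; b: (m,) biases.
    kappa_u is built as a one-hot user vector and phi is ReLU (both assumptions).
    """
    kappa = np.zeros(V.shape[1])
    kappa[user_index] = 1.0
    # Each visual feature is scaled on its own and shifted by a user-specific term
    return np.maximum(0.0, b + w_per * z + V @ kappa)
```

With a one-hot $\kappa_u$, the product $V \kappa_u$ simply selects the column of $V$ belonging to user $u$, so the layer adds a learned per-user offset to each visual feature.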


2. CNN-PerMLP

Multilayer Perceptron as the Predictor

• To compute the scores of the tags, an MLP is adopted.

• The network has one hidden layer.

• The neural network score function:

$$\hat{y}'(u, R_{i,q}, t_j) = \varphi(w^{out}_j \cdot \varphi(W^{hidden} \psi + b^{hidden}) + b^{out}_j)$$

$W^{hidden}, b^{hidden}$: the weights and biases of the hidden layer; $w^{out}_j \in W^{out}, b^{out}$: the weights and biases of the output layer


2. CNN-PerMLP

Optimization

• We adopt the Bayesian Personalized Ranking (BPR) optimization criterion.

• BPR finds the model's parameters that maximize the difference between the scores of the relevant and irrelevant tags.

Figure: The algorithm of BPR
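The BPR criterion can be written as a loss over pairs of one relevant and one irrelevant tag for the same post. The sketch below shows only this pairwise loss, with a hypothetical function name; the sampling of tag pairs and the gradient updates of the algorithm in the figure are not reproduced here.

```python
import numpy as np

def bpr_loss(pos_scores, neg_scores):
    """Mean of -log(sigmoid(y_pos - y_neg)) over sampled tag pairs.

    pos_scores: scores of relevant tags; neg_scores: scores of irrelevant tags
    sampled for the same (user, image) posts. Minimizing this loss pushes the
    relevant score above the irrelevant one.
    """
    diff = np.asarray(pos_scores, dtype=float) - np.asarray(neg_scores, dtype=float)
    # -log(sigmoid(d)) = log(1 + exp(-d)), evaluated stably with logaddexp
    return float(np.mean(np.logaddexp(0.0, -diff)))
```

The loss depends only on score differences, so BPR optimizes the ranking of tags rather than their absolute scores.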



3. Deep Cooperative Neural Network (DeepCoNN)

DeepCoNN (Zheng et al. 2017)

• Joint Deep Modeling of Users and Items using Reviews

• DeepCoNN adopts two parallel CNNs to model user behaviors and item properties from review texts.

• In the shared layer, a Factorization Machine (FM) is applied to capture their interactions for rating prediction.


3. DeepCoNN

DeepCoNN

• DeepCoNN alleviates the sparsity problem and enhances model interpretability.

• DeepCoNN represents review text using a pre-trained word-embedding technique.


3. DeepCoNN

Architecture

Figure: The architecture of DeepCoNN


3. DeepCoNN

Notations

• Each tuple $(u, i, r_{ui}, w_{ui})$ denotes a review written by user $u$ for item $i$ with rating $r_{ui}$ and review text $w_{ui}$.

• A network for users ($Net_u$): user reviews $\to x_u$ (rates)

• A network for items ($Net_i$): item reviews $\to y_i$ (rates)

• We focus on $Net_u$ in detail. The same process is applied to $Net_i$.


3. DeepCoNN

Word Representation(Look-up Layer)

• A word embedding $f : M \to \mathbb{R}^n$

• Matrix of word vectors for user $u$:

$$V^u_{1:n} = \phi(d^u_1) \oplus \phi(d^u_2) \oplus \cdots \oplus \phi(d^u_n)$$

$d^u_k$: $k$-th word of the single document $d^u_{1:n}$, consisting of $n$ words; $\phi(d^u_k) \in \mathbb{R}^c$: look-up function; $\oplus$: the concatenation operator

• The order of words is preserved in the matrix $V^u_{1:n}$.


3. DeepCoNN

CNN Layers . 1) Convolution Layer

• The convolution layer consists of $m$ neurons.

• Each neuron $j$ in the convolutional layer uses a filter $K_j \in \mathbb{R}^{c \times t}$.

• Convolution operation:

$$z_j = f(V^u_{1:n} * K_j + b_j)$$

$*$: convolution operator, $f(x) = \max\{0, x\}$: activation function (ReLU)


3. DeepCoNN

CNN Layers . 2) Max Pooling Layer

• Max pooling captures the most important feature of each feature map.

• The convolutional results are reduced to a fixed-size vector.

$$o_j = \max\{z_1, z_2, \cdots, z_{n-t+1}\}$$

• Output vector of the convolutional layer, using multiple filters:

$$O = \{o_1, o_2, \cdots, o_{n_1}\}, \quad n_1: \text{number of kernels in the convolutional layer}$$


3. DeepCoNN

CNN Layers . 3) Fully Connected Layer

• Output (rates for user $u$):

$$x_u = f(W \times O + g), \quad x_u \in \mathbb{R}^{n_2 \times 1}$$

$W$: weight matrix, $g$: bias term

• $y_i$ can be obtained with the same process.

• Dropout is also applied to prevent overfitting.


3. DeepCoNN

Shared Layer

• This layer maps the features of users and items into the same feature space.

• Concatenate $x_u$ and $y_i$ into a single vector.

$$z = (x_u, y_i)$$

• A Factorization Machine (FM) models all nested variable interactions in $z$.

• The objective function:

$$J = w_0 + \sum_{i=1}^{|z|} w_i z_i + \sum_{i=1}^{|z|} \sum_{j=i+1}^{|z|} \langle v_i, v_j \rangle z_i z_j$$

$w_0$: the global bias, $w_i$: the strength of the $i$-th variable in $z$

$$\langle v_i, v_j \rangle = \sum_{f=1}^{|z|} v_{i,f}\, v_{j,f}$$
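The FM expression above can be evaluated directly with a double loop, or in linear time with the standard FM identity $\sum_{i<j} \langle v_i, v_j \rangle z_i z_j = \tfrac{1}{2} \sum_f \big[(\sum_i v_{i,f} z_i)^2 - \sum_i v_{i,f}^2 z_i^2\big]$. The sketch below shows both; the function names and the storage of the factors as rows of a matrix `V` are assumptions for illustration.

```python
import numpy as np

def fm_score(z, w0, w, V):
    """Direct evaluation: w0 + sum_i w_i z_i + sum_{i<j} <v_i, v_j> z_i z_j.

    z: (n,) concatenated user/item features; w: (n,) linear weights;
    V: (n, k) matrix whose i-th row is the factor vector v_i (assumed layout).
    """
    total = w0 + w @ z
    n = len(z)
    for i in range(n):
        for j in range(i + 1, n):
            total += (V[i] @ V[j]) * z[i] * z[j]
    return total

def fm_score_fast(z, w0, w, V):
    """Same value via the linear-time identity for the pairwise term."""
    s = V.T @ z                   # per-factor sum_i v_{i,f} z_i
    sq = (V ** 2).T @ (z ** 2)    # per-factor sum_i v_{i,f}^2 z_i^2
    return w0 + w @ z + 0.5 * np.sum(s ** 2 - sq)
```

The fast form costs $O(nk)$ instead of $O(n^2 k)$, which matters when $z$ is long.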



4. Convolutional Matrix Factorization (ConvMF)

ConvMF (Kim et al. 2016)

• Document context-aware recommendation model

• CNN (Convolutional neural network)+ PMF (Probabilistic matrix factorization)

• The CNN generates a document latent vector for each item, which PMF uses as the mean of the item latent model for rating prediction.


4. ConvMF

Architecture

Figure: The architecture of ConvMF


4. ConvMF

Convolutional neural network(CNN)

• Convolution layer for generating local features

• Pooling layer for representing data in a more concise form


4. ConvMF

Matrix Factorization(MF)

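• Goal: find latent models of users and items in a shared latent space.

• $R \in \mathbb{R}^{N \times M}$: rating matrix ($N$ users, $M$ items)

• $u_i \in \mathbb{R}^k$, $v_j \in \mathbb{R}^k$: latent models of user $i$ and item $j$

• The rating $r_{ij}$ of user $i$ on item $j$ is approximated by the inner product of the corresponding latent models.

$$r_{ij} \approx \hat{r}_{ij} = u_i^T v_j$$

• Minimize a loss function:

$$L = \sum_i^N \sum_j^M I_{ij} (r_{ij} - u_i^T v_j)^2 + \lambda_u \sum_i^N \| u_i \|^2 + \lambda_v \sum_j^M \| v_j \|^2$$

where $I_{ij}$ is 1 if user $i$ rated item $j$ and 0 otherwise.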
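The MF loss can be computed in a few lines of NumPy. In this sketch the latent models are stored as rows of `U` and `V`, which is a layout chosen for illustration; the function name is hypothetical.

```python
import numpy as np

def mf_loss(R, I, U, V, lam_u, lam_v):
    """Regularized squared error over observed ratings.

    R: (N, M) ratings; I: (N, M) indicator of observed entries;
    U: (N, k) with rows u_i; V: (M, k) with rows v_j (assumed layout).
    """
    pred = U @ V.T                      # pred[i, j] = u_i^T v_j
    fit = np.sum(I * (R - pred) ** 2)   # unobserved entries contribute nothing
    return fit + lam_u * np.sum(U ** 2) + lam_v * np.sum(V ** 2)
```

The indicator keeps missing ratings from being treated as zeros, so only observed entries drive the fit term.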


4. ConvMF

Probabilistic Model of ConvMF

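• Goal: find user and item latent models $U \in \mathbb{R}^{k \times N}$, $V \in \mathbb{R}^{k \times M}$.

• $U^T V$ reconstructs the rating matrix $R$.

• The conditional distribution over observed ratings is given by

$$p(R \mid U, V, \sigma^2) = \prod_i^N \prod_j^M N(r_{ij} \mid u_i^T v_j, \sigma^2)^{I_{ij}}$$

where $N(x \mid \mu, \sigma^2)$ is the p.d.f. of the normal distribution.

• The user latent models have a zero-mean Gaussian prior:

$$p(U \mid \sigma_U^2) = \prod_i^N N(u_i \mid 0, \sigma_U^2 I)$$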


4. ConvMF

Probabilistic Model of ConvMF

• The item latent model is generated from three variables:

  • internal weights $W$ in the CNN
  • $X_j$ representing the document of item $j$
  • Gaussian noise

• Item latent model

$$v_j = cnn(W, X_j) + \epsilon_j, \qquad \epsilon_j \sim N(0, \sigma_V^2 I)$$

• For each $w_k$ in $W$, we place a zero-mean Gaussian prior:

$$p(W \mid \sigma_W^2) = \prod_k N(w_k \mid 0, \sigma_W^2)$$

• The conditional distribution over the item latent models:

$$p(V \mid W, X, \sigma_V^2) = \prod_j^M N(v_j \mid cnn(W, X_j), \sigma_V^2 I)$$

where $X$ is the set of description documents of items


4. ConvMF

CNN

• Goal : Generating document latent vectors from documents of items

• 1) embedding layer, 2) convolution layer, 3) pooling layer, and 4) output layer

Figure: CNN architecture for ConvMF


4. ConvMF

CNN . 1) Embedding Layer

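• A raw document $\to$ a dense numeric matrix

• Document: sequence of $l$ words

• Document matrix:

$$D = \begin{bmatrix} & | & | & | & \\ \cdots & w_{i-1} & w_i & w_{i+1} & \cdots \\ & | & | & | & \end{bmatrix}, \quad D \in \mathbb{R}^{p \times l} \tag{1}$$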


4. ConvMF

CNN . 2) Convolutional Layer

• The convolutional layer extracts contextual features.

• A contextual feature is extracted by the $j$-th shared weight $W^j_c \in \mathbb{R}^{p \times ws}$:

$$c^j_i = f(W^j_c * D_{(:,\, i:(i+ws-1))} + b^j_c)$$

$*$: convolution operator, $ws$: window size, $f$: activation function (ReLU)

• Contextual feature vector with $W^j_c$:

$$c^j = [c^j_1, c^j_2, \cdots, c^j_i, \cdots, c^j_{l-ws+1}] \in \mathbb{R}^{l-ws+1}$$

• We use multiple shared weights to capture multiple types of contextual features:

$$W^j_c, \quad j = 1, 2, \cdots, n_c$$


4. ConvMF

CNN . 3) Pooling Layer

• Max-pooling

$$d_f = [\max(c^1), \max(c^2), \cdots, \max(c^j), \cdots, \max(c^{n_c})]$$


4. ConvMF

CNN . 4) Output Layer

• We project $d_f$ onto the $k$-dimensional space of the user and item latent models.

• Document latent vector using a nonlinear projection:

$$s = \tanh(W_{f_2}\{\tanh(W_{f_1} d_f + b_{f_1})\} + b_{f_2})$$

where $W_{f_1} \in \mathbb{R}^{f \times n_c}$, $W_{f_2} \in \mathbb{R}^{k \times f}$ are projection matrices, $b_{f_1} \in \mathbb{R}^f$, $b_{f_2} \in \mathbb{R}^k$ are bias vectors for $W_{f_1}, W_{f_2}$, and $s \in \mathbb{R}^k$

• Output (document latent vector of item $j$):

$$s_j = cnn(W, X_j)$$

$X_j$: a raw document of item $j$, $W$: all the weight and bias variables


4. ConvMF

Optimization

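• To optimize the variables, we use maximum a posteriori (MAP) estimation.

$$\max_{U,V,W}\ p(U, V, W \mid R, X, \sigma^2, \sigma_U^2, \sigma_V^2, \sigma_W^2) = \max_{U,V,W}\ [\,p(R \mid U, V, \sigma^2)\, p(U \mid \sigma_U^2)\, p(V \mid W, X, \sigma_V^2)\, p(W \mid \sigma_W^2)\,]$$

• Taking the negative logarithm, this is equivalent to minimizing

$$L(U,V,W) = \sum_i^N \sum_j^M \frac{I_{ij}}{2} (r_{ij} - u_i^T v_j)^2 + \frac{\lambda_U}{2} \sum_i^N \| u_i \|^2 + \frac{\lambda_V}{2} \sum_j^M \| v_j - cnn(W, X_j) \|^2 + \frac{\lambda_W}{2} \sum_k^{|w_k|} \| w_k \|^2$$

where $\lambda_U = \sigma^2/\sigma_U^2$, $\lambda_V = \sigma^2/\sigma_V^2$, and $\lambda_W = \sigma^2/\sigma_W^2$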


4. ConvMF

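Optimization

• We adopt coordinate descent to optimize the variables iteratively:

$$u_i \leftarrow (V I_i V^T + \lambda_U I_K)^{-1} V R_i$$

$$v_j \leftarrow (U I_j U^T + \lambda_V I_K)^{-1} (U R_j + \lambda_V\, cnn(W, X_j))$$

where $I_i = \mathrm{diag}(I_{ij})$, $j = 1, \cdots, M$, and $R_i$ is a vector with $(r_{ij})_{j=1}^M$ for user $i$.

• To optimize $W$, we use the back-propagation algorithm.

$$E(W) = \frac{\lambda_V}{2} \sum_j^M \| v_j - cnn(W, X_j) \|^2 + \frac{\lambda_W}{2} \sum_k^{|w_k|} \| w_k \|^2 + \text{constant}$$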
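The two closed-form updates can be sketched in NumPy as below. This is an illustration under stated assumptions: function names are hypothetical, the rating vectors `R_i` and `R_j` hold zeros at unobserved positions, and `cnn_j` stands for the precomputed CNN output $cnn(W, X_j)$ rather than a call into an actual network.

```python
import numpy as np

def update_user(V, R_i, I_i_diag, lam_u):
    """u_i <- (V I_i V^T + lam_U I_K)^{-1} V R_i.

    V: (k, M) item latent models; R_i: (M,) ratings of user i, zero where
    unobserved (assumption); I_i_diag: (M,) diagonal of I_i.
    """
    k = V.shape[0]
    A = (V * I_i_diag) @ V.T + lam_u * np.eye(k)
    return np.linalg.solve(A, V @ R_i)

def update_item(U, R_j, I_j_diag, lam_v, cnn_j):
    """v_j <- (U I_j U^T + lam_V I_K)^{-1} (U R_j + lam_V cnn(W, X_j)).

    U: (k, N) user latent models; cnn_j: (k,) document latent vector of item j,
    passed in as a precomputed array (assumption).
    """
    k = U.shape[0]
    A = (U * I_j_diag) @ U.T + lam_v * np.eye(k)
    return np.linalg.solve(A, U @ R_j + lam_v * cnn_j)
```

Solving the linear system with `np.linalg.solve` avoids forming the explicit inverse. For large $\lambda_V$ the item update stays close to the CNN output, which shows how the document acts as a prior on $v_j$.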


4. ConvMF

Optimization

• With optimized $U$, $V$, and $W$, we can finally predict the unknown ratings of users on items.

$$r_{ij} \approx E[r_{ij} \mid u_i^T v_j, \sigma^2] = u_i^T v_j = u_i^T (cnn(W, X_j) + \epsilon_j)$$


4. ConvMF

Result

• ConvMF significantly outperforms the state-of-the-art competitors.

• ConvMF deals well with the sparsity problem and skewed data by using contextual information.

• A pre-trained word embedding model increases the performance when the number of ratings is insufficient.

• ConvMF can distinguish subtle contextual differences of the same word via different shared weights.



5. CNN for Image Feature Extraction (VPOI)

Visual Content Enhanced POI recommendation (VPOI) (Wang et al. 2016)

• Goal: recommend $k$ unvisited POIs to each user.

• VPOI incorporates visual contents for POI recommendations.

• Photos reflect users' interests and provide informative descriptions of locations.

Figure: Example of Images Posted by Users


5. VPOI

Architecture

Figure: The architecture of VPOI


5. VPOI

POI Recommender

• POI recommendation is also called location recommendation.

• POI recommendation focuses on

  • geographical influence
  • social correlations
  • temporal patterns
  • textual content indications


5. VPOI

Notations

• $U = \{u_1, u_2, \cdots, u_n\}$, $L = \{l_1, l_2, \cdots, l_m\}$, $\mathcal{P} = \{p_1, p_2, \cdots, p_N\}$: the sets of users, locations, and photos

• $X \in \mathbb{R}^{n \times m}$: user-POI check-in matrix, $X_{ij}$ = frequency or rating of $u_i$ on $l_j$

• $R \in \mathbb{R}^{n \times m}$: normalized version of $X$

$$R_{ij} = g(X_{ij}), \qquad g(x) = \frac{1}{1 + \exp(-x)}$$

• $\mathcal{P}_{u_i}$: the set of images uploaded by user $u_i$

• $\mathcal{P}_{l_j}$: the set of images that are tagged $l_j$


5. VPOI

Basic POI Recommender

• Probabilistic Matrix Factorization (PMF)

• The POI recommender is one-class CF, where only positive samples are given.

• The conditional distribution over observed ratings is

$$P(R \mid U, V, \sigma^2) = \prod_{i=1}^n \prod_{j=1}^m [N(R_{ij} \mid u_i^T v_j, \sigma^2)]^{Y_{ij}}$$

where $U \in \mathbb{R}^{K \times n}$ and $V \in \mathbb{R}^{K \times m}$ are the latent feature matrices of users and POIs, respectively, and $Y$ is the indicator matrix ($Y_{ij} = 1$ if $R_{ij} > 0$ and 0 otherwise).


5. VPOI

Basic POI Recommender

• The user check-in data model is

$$P(U, V \mid R) \propto \prod_{i=1}^n N(u_i \mid 0, \sigma_u^2 I) \prod_{j=1}^m N(v_j \mid 0, \sigma_v^2 I) \prod_{i=1}^n \prod_{j=1}^m [N(R_{ij} \mid u_i^T v_j, \sigma^2)]^{Y_{ij}}$$


5. VPOI

Extracting and Modeling

• The VGG16 model is chosen.

• For an input image $p_k$, the visual contents are the output of VGG16. We denote it as $CNN(p_k)$.

Figure: The architecture of VGG16 model


5. VPOI

Extracting and Modeling

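• Probability that $p_s$ belongs to $u_i$:

$$P(f_{is} = 1 \mid u_i, p_s) = \frac{\exp(u_i^T \cdot P \cdot CNN(p_s))}{\sum_{p_k \in \mathcal{P}} \exp(u_i^T \cdot P \cdot CNN(p_k))}$$

where $P \in \mathbb{R}^{K \times d}$ is the interaction matrix between the visual contents and the latent user features, $\mathcal{P}$ is the set of photos, and $f_{is}$ denotes whether $p_s$ is posted by $u_i$.

• By maximizing $P(f_{is} = 1 \mid u_i, p_s)$ for $p_s \in \mathcal{P}_{u_i}$, we force $u_i$ to be similar to the visual contents.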
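The photo-to-user probability is a softmax over all photos of a bilinear score. A minimal NumPy sketch follows; the function name is hypothetical, and the CNN outputs are assumed to be precomputed and stacked into one array with a row per photo.

```python
import numpy as np

def photo_user_probabilities(u_i, P, cnn_features):
    """Softmax over photos of the bilinear score u_i^T P CNN(p_k).

    u_i: (K,) latent user features; P: (K, d) interaction matrix;
    cnn_features: (num_photos, d) precomputed visual contents (assumption).
    Returns one probability per photo.
    """
    logits = cnn_features @ (P.T @ u_i)
    logits = logits - logits.max()  # subtract the max for numerical stability
    e = np.exp(logits)
    return e / e.sum()
```

Maximizing the probability of the photos a user actually posted raises their bilinear scores relative to all other photos, which is what pulls $u_i$ towards the visual contents.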


5. VPOI

Extracting and Modeling

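• Probability that $p_t$ is associated with $l_j$:

$$P(g_{jt} = 1 \mid l_j, p_t) = \frac{\exp(v_j^T \cdot Q \cdot CNN(p_t))}{\sum_{p_k \in \mathcal{P}} \exp(v_j^T \cdot Q \cdot CNN(p_k))}$$

where $Q \in \mathbb{R}^{K \times d}$ is the interaction matrix between the visual contents and the latent POI features, $\mathcal{P}$ is the set of photos, and $g_{jt}$ denotes whether $p_t$ is associated with $l_j$.

• By maximizing $P(g_{jt} = 1 \mid l_j, p_t)$ for $p_t \in \mathcal{P}_{l_j}$, we force $v_j$ to be similar to the visual contents.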


5. VPOI

Extracting and Modeling

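• The image features:

$$P(F, G \mid \mathcal{P}, U, V, P, Q) = \Big[\prod_{i=1}^n \prod_{p_s \in \mathcal{P}_{u_i}} P(f_{is} = 1 \mid u_i, p_s)\Big] \cdot \Big[\prod_{j=1}^m \prod_{p_t \in \mathcal{P}_{l_j}} P(g_{jt} = 1 \mid l_j, p_t)\Big]$$

where $F = \{f_{is} : p_s \in \mathcal{P}_{u_i}, \forall u_i \in U\}$, $G = \{g_{jt} : p_t \in \mathcal{P}_{l_j}, \forall l_j \in L\}$, and $\mathcal{P}$ is the set of photos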


5. VPOI

VPOI Framework

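$$\max_{U,V,P,Q,CNN}\ P(U, V, P, Q \mid R, F, G, \mathcal{P})$$

• The posterior distribution is

$$P(U,V,P,Q \mid R,F,G,\mathcal{P}) \propto P(R,F,G \mid U,V,P,Q,\mathcal{P})\, P(U,V,P,Q \mid \mathcal{P}) = P(R \mid U,V)\, P(F,G \mid \mathcal{P},U,V,P,Q)\, P(P)\, P(Q)\, P(U)\, P(V)$$

where $\mathcal{P}$ is the set of photos and $P$ is the user interaction matrix.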


5. VPOI

VPOI Framework

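• The VPOI framework can be written as

$$\max_{U,V,P,Q,CNN}\ -\| Y \odot (R - U^T V) \|_F^2 - \lambda_1 (\| U \|_F^2 + \| V \|_F^2) + \alpha \sum_{i=1}^n \sum_{p_k \in \mathcal{P}_{u_i}} \log P(f_{ik} = 1 \mid u_i, p_k) - \lambda_2 \| P \|_F^2 + \alpha \sum_{j=1}^m \sum_{p_k \in \mathcal{P}_{l_j}} \log P(g_{jk} = 1 \mid l_j, p_k) - \lambda_2 \| Q \|_F^2$$

where $\lambda_1 = \sigma^2/\sigma_u^2 = \sigma^2/\sigma_v^2$, $\lambda_2 = \sigma^2/\sigma_p^2 = \sigma^2/\sigma_q^2$, $\alpha = 2\sigma^2$, and $\odot$ is the Hadamard product.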


5. VPOI

Algorithm

Figure: The architecture of VGG16 model


5. VPOI

Result

• VPOI outperforms representative state-of-the-art POI recommender systems.

• The proposed framework alleviates the cold-start problem for recommendation by incorporating images.


Page 66: Convolutional Neural Network based Recommender Systemstat.snu.ac.kr/idea/seminar/20171128/CNN based RS.pdf · 2017. 11. 29. · based Recommender System Deep Learning based Recommender

6. CNN for Audio Feature Extraction

Deep Content-based Music Recommendation (van den Oord et al. 2013)

• We propose to use a latent factor model for recommendation, and to predict the latent factors from music audio when they cannot be obtained from usage data.

6. CNN for Audio Feature Extraction (WMF)

Weighted Matrix Factorization (WMF)

• The Taste Profile Subset contains play counts per song and per user.

• To learn latent factor representations of all users and items, we use WMF.

• $r_{ui}$: play count for user $u$ and song $i$

• Define the preference and confidence variables

$$p_{ui} = I(r_{ui} > 0), \qquad c_{ui} = 1 + \alpha \log(1 + \epsilon^{-1} r_{ui}).$$

• If $p_{ui} = 1$, we assume that user $u$ enjoys song $i$.

• $c_{ui}$ measures how certain we are about this particular preference.
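
A minimal sketch of the two variables, to make the role of each concrete. The default values of `alpha` and `epsilon` and the play counts are illustrative assumptions, not values taken from these slides.

```python
import math


def preference(play_count):
    """p_ui = I(r_ui > 0): 1 if the user played the song at least once, else 0."""
    return 1 if play_count > 0 else 0


def confidence(play_count, alpha=2.0, epsilon=1e-6):
    """c_ui = 1 + alpha * log(1 + r_ui / epsilon).

    alpha and epsilon are hyperparameters; the defaults here are assumptions.
    """
    return 1.0 + alpha * math.log(1.0 + play_count / epsilon)


# Hypothetical play counts of one user for four songs.
play_counts = [0, 1, 5, 40]
prefs = [preference(r) for r in play_counts]
confs = [confidence(r) for r in play_counts]
```

An unplayed song keeps the baseline confidence of 1, and the confidence grows only logarithmically with the play count, so a heavily replayed song does not dominate the loss.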

6. CNN for Audio Feature Extraction

Weighted Matrix Factorization (WMF) (Kim et al. 2016)

• WMF objective function:

$$\min_{x_*, y_*} \sum_{u,i} c_{ui} (p_{ui} - x_u^T y_i)^2 + \lambda \Big( \sum_u \| x_u \|^2 + \sum_i \| y_i \|^2 \Big)$$

where $x_u$ is the latent factor vector for user $u$, and $y_i$ is the latent factor vector for song $i$.

• It consists of a confidence-weighted MSE and an L2 regularization term.

• The objective is minimized with alternating least squares (ALS).
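
The objective and one way to run ALS on it can be sketched in plain Python. This is an illustration, not the presenter's or the paper's implementation: it uses the standard closed-form update in which each factor vector solves $(\sum_j c_j f_j f_j^T + \lambda I) z = \sum_j c_j p_j f_j$ with the other side held fixed. The helper names and the toy data are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def transpose(M):
    return [list(col) for col in zip(*M)]


def wmf_objective(P, C, X, Y, lam):
    """sum_{u,i} c_ui (p_ui - x_u^T y_i)^2 + lam (sum_u |x_u|^2 + sum_i |y_i|^2)."""
    err = sum(C[u][i] * (P[u][i] - dot(X[u], Y[i])) ** 2
              for u in range(len(X)) for i in range(len(Y)))
    reg = sum(dot(x, x) for x in X) + sum(dot(y, y) for y in Y)
    return err + lam * reg


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [b[r]] for r, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - tail) / M[r][r]
    return z


def als_half_step(P_rows, C_rows, fixed, lam):
    """Re-solve every row's factor vector while the `fixed` factors stay constant."""
    k = len(fixed[0])
    updated = []
    for p_row, c_row in zip(P_rows, C_rows):
        A = [[lam if a == b else 0.0 for b in range(k)] for a in range(k)]
        rhs = [0.0] * k
        for f, p, c in zip(fixed, p_row, c_row):
            for a in range(k):
                rhs[a] += c * p * f[a]
                for b in range(k):
                    A[a][b] += c * f[a] * f[b]
        updated.append(solve(A, rhs))
    return updated


# Toy data: 2 users, 3 songs, 2 latent factors (all values hypothetical).
P = [[1, 0, 1], [0, 1, 1]]
C = [[3.0, 1.0, 2.0], [1.0, 4.0, 2.0]]
X = [[0.1, 0.2], [0.3, 0.1]]
Y = [[0.2, 0.1], [0.1, 0.3], [0.2, 0.2]]
lam = 0.1

before = wmf_objective(P, C, X, Y, lam)
for _ in range(10):
    X = als_half_step(P, C, Y, lam)                         # update users
    Y = als_half_step(transpose(P), transpose(C), X, lam)   # update songs
after = wmf_objective(P, C, X, Y, lam)
```

Each half-step exactly minimizes a convex quadratic in the vectors being updated, so the objective cannot increase from one sweep to the next.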

6. CNN for Audio Feature Extraction

Predicting latent factors from music audio

• Regression problem

• Two methods (to convert music audio signals into a fixed-size representation):

  • Bag-of-words representation

  • Deep CNN

6. CNN for Audio Feature Extraction

Objective functions

• $y_i$: the latent factor vector for song $i$, obtained with WMF

• $y'_i$: the corresponding prediction by the model

• Minimize MSE:

$$\min_\theta \sum_i \| y_i - y'_i \|^2$$

• Minimize WPE (weighted prediction error):

$$\min_\theta \sum_{u,i} c_{ui} (p_{ui} - x_u^T y'_i)^2$$
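
The difference between the two objectives can be seen on a toy example. This is a sketch with hypothetical numbers: the predicted factors for the first song are off in a direction that the single user's factor vector ignores, so the MSE is positive while the WPE is zero.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mse_objective(Y_true, Y_pred):
    """sum_i |y_i - y'_i|^2 over songs."""
    return sum(sum((t - p) ** 2 for t, p in zip(y, y_hat))
               for y, y_hat in zip(Y_true, Y_pred))


def wpe_objective(P, C, X, Y_pred):
    """sum_{u,i} c_ui (p_ui - x_u^T y'_i)^2, with the user factors X held fixed."""
    return sum(C[u][i] * (P[u][i] - dot(X[u], Y_pred[i])) ** 2
               for u in range(len(X)) for i in range(len(Y_pred)))


# One user, two songs, two latent factors (all values hypothetical).
X = [[1.0, 0.0]]                      # user factors from WMF
Y_true = [[1.0, 0.0], [0.0, 1.0]]     # song factors from WMF
Y_pred = [[1.0, 0.5], [0.0, 1.0]]     # song factors predicted from audio
P = [[1, 0]]
C = [[2.0, 1.0]]

mse = mse_objective(Y_true, Y_pred)
wpe = wpe_objective(P, C, X, Y_pred)
```

MSE compares the predicted factors with the WMF factors directly, whereas WPE only measures how the prediction changes the confidence-weighted fit to the observed preferences.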

6. CNN for Audio Feature Extraction

Result

• Predicting latent factors from music audio is a viable method for recommending new and unpopular music.

• The deep CNN significantly outperforms the traditional approaches.

7. CNN for Text Feature Extraction

e-Learning Resources Recommendation (Shen et al. 2016)

• Automatic Recommendation Technology for e-Learning Resources with CNN

• Text information: the course introduction or the classroom content, the abstract or full content of the learning resources.

• CNN can be used to predict the latent factors from the text information.

• We predict the rating scores between students and learning resources.

7. CNN for Text Feature Extraction

Architecture

Figure: The architecture of the recommendation algorithm

7. CNN for Text Feature Extraction

Training process

• Language model is employed for the input of CNN.

• LFM(Latent Factor Model) is employed for the output of CNN.

• CNN bridges the semantic gap between text information and the vectors of latent factors.

7. CNN for Text Feature Extraction

Recommendation process

• CNN: the input text information → the features of the learning resource

• We combine it with the student’s preferences

• The rating score between a student and a learning resource can be predicted.
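
A minimal sketch of this recommendation step. It assumes that the features and the preferences are combined by an inner product, as in a latent factor model. The function names, resource names and numbers are hypothetical.

```python
def predict_rating(student_factors, resource_factors):
    """Inner product of a student's preference vector and a resource's feature vector."""
    return sum(s * r for s, r in zip(student_factors, resource_factors))


def recommend(student_factors, resources, top_n=2):
    """Rank resources (name -> CNN-predicted factors) by predicted rating."""
    scored = [(name, predict_rating(student_factors, factors))
              for name, factors in resources.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


# Hypothetical student preferences and CNN outputs for three new resources.
student = [1.0, 0.0]
resources = {"a": [0.2, 0.9], "b": [0.8, 0.1], "c": [0.5, 0.5]}
top = recommend(student, resources)
```

Because the resource vectors come from the text alone, a resource with no rating history can still be scored.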

7. CNN for Text Feature Extraction

Model

• The CNN can be used to predict the latent factors from the text information.

• The input is produced by a language model from the text information.

• The target output is obtained by a latent factor model from the historical rating scores.

7. CNN for Text Feature Extraction

Model - CNN

• Four layers of CNN:

  • convolutional layer with multiple feature maps

  • a mean-over-time pooling layer

  • an over-time convolutional layer

  • fully connected layer

Figure: The Construction of CNN

7. CNN for Text Feature Extraction

Model - CNN. 1) convolutional layer

• $x_i \in \mathbb{R}^k$: $k$-dimensional word representation of the $i$-th word

• $x = [x_1, x_2, \cdots, x_n] \in \mathbb{R}^{n \times k}$

$$c_i = f(w \cdot x_i + b)$$

where $w \in \mathbb{R}^k$ is a filter, $b \in \mathbb{R}$ is a bias, and $f$ is a non-linear function.

• Feature map: $c = [c_1, c_2, \cdots, c_n] \in \mathbb{R}^n$
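
A minimal sketch of this layer for a single filter. The choice of `tanh` for the non-linearity $f$ and the toy word vectors are assumptions; the slide does not specify them.

```python
import math


def conv_feature_map(words, w, b, f=math.tanh):
    """Return [c_1, ..., c_n] with c_i = f(w . x_i + b) for each word vector x_i."""
    return [f(sum(wj * xj for wj, xj in zip(w, x)) + b) for x in words]


# n = 3 words with k = 2 dimensions each (hypothetical values).
words = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
c = conv_feature_map(words, w=[0.5, -0.5], b=0.0)
```

Running the same function once per filter gives the multiple feature maps mentioned for this layer.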

7. CNN for Text Feature Extraction

Model - CNN. 2) mean-over-time pooling layer

• We apply a mean-over-time region pooling operation over the feature map.

• Pooling operation in $\lambda$ regions:

$$b_i = \max\{ c_{(i-1) \times (n/\lambda) + 1}, \cdots, c_{i \times (n/\lambda)} \}, \quad i \in [1, \lambda]$$

$$\mathbf{b} = [b_1, b_2, \cdots, b_\lambda]$$
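
The region pooling can be sketched as below. The sketch follows the formula as written, which takes the maximum of each region. The `pool` parameter is a hypothetical addition so that a mean over each region can be substituted, and the sketch assumes that $n$ is divisible by $\lambda$.

```python
def region_pool(c, num_regions, pool=max):
    """Split the feature map c into num_regions equal regions and pool each one."""
    n = len(c)
    if n % num_regions != 0:
        raise ValueError("this sketch assumes n is divisible by the number of regions")
    size = n // num_regions
    # Region i (0-based) covers c[i*size : (i+1)*size], i.e. the 1-based
    # entries c_{(i)*(n/lambda)+1} .. c_{(i+1)*(n/lambda)} of the formula.
    return [pool(c[i * size:(i + 1) * size]) for i in range(num_regions)]


def mean(values):
    return sum(values) / len(values)


feature_map = [0.1, 0.9, 0.4, 0.3, 0.8, 0.2]   # hypothetical feature map, n = 6
b_max = region_pool(feature_map, num_regions=3)
b_mean = region_pool([1.0, 3.0, 2.0, 4.0], num_regions=2, pool=mean)
```

Whatever the length $n$ of the input text, the output always has $\lambda$ entries, which gives the later layers a fixed-size input.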

7. CNN for Text Feature Extraction

Model - CNN. 3) convolutional layer

• Feature value:

$$a = f(w \cdot \mathbf{b} + b)$$

where $w \in \mathbb{R}^\lambda$ is a filter, $\mathbf{b}$ is the pooled vector from the previous layer, $b \in \mathbb{R}$ is a bias, and $f$ is a non-linear function.

• The process extracts one feature from one filter. The model uses multiple filters to obtain multiple features.
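
A minimal sketch of extracting one feature per filter from the pooled vector. As before, `tanh` stands in for the unspecified non-linearity $f$, and the filters, biases and pooled values are hypothetical.

```python
import math


def feature_values(pooled, filters, biases, f=math.tanh):
    """Return one feature a = f(w . pooled + bias) per (filter, bias) pair."""
    return [f(sum(wj * bj for wj, bj in zip(w, pooled)) + bias)
            for w, bias in zip(filters, biases)]


pooled = [1.0, 2.0, 3.0]                        # lambda = 3 pooled values
filters = [[0.5, 0.0, -0.5], [0.0, 0.0, 0.0]]   # two filters in R^lambda
biases = [0.0, 0.0]
a = feature_values(pooled, filters, biases)
```

The resulting list of features, one per filter, is the input to the fully connected layer.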

7. CNN for Text Feature Extraction

Model - CNN. 4) Fully Connected Layer

• Input: the features from the previous layer.

• Output: the predicted latent factors.

• The process extracts one feature from one filter. The model uses multiple filters to obtain multiple features.

7. CNN for Text Feature Extraction

Model - CNN

• Minimize the mean squared error (MSE) of the predictions:

$$\arg\min_{w, b} \sum_i \| y'_i - y_i \|^2$$

where $y'_i$ is the latent factor vector for article $i$ and $y_i$ is the output of the CNN.

7. CNN for Text Feature Extraction

Model - LFM

• The LFM results represent the features of students’ preferences and learning resources.

Figure: The Process of LFM

7. CNN for Text Feature Extraction

Model - LFM L1R

• We proposed a modified matrix factorization method with L1-norm-based regularization.

$$J(U, V) = \sum_{ij} (U_{i*} \cdot V_{*j} - r_{ij})^2 + \lambda_1 \| U \|_1 + \lambda_2 \| V \|_1$$

• $U$: the relationship between the students and the latent factors

• $V$: the relationship between the learning resources and the latent factors

• $r_{ij}$: the rating score given by the $i$-th student to the $j$-th learning resource

• To minimize it, the split Bregman iteration method is used.
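
A sketch of evaluating $J(U, V)$, together with the soft-thresholding ("shrink") operator that split Bregman schemes typically use to handle L1 terms in closed form. This is not the full iteration from the paper. The dense rating matrix, the helper names and the numbers are assumptions made for illustration.

```python
def l1_mf_objective(U, V, R, lam1, lam2):
    """J(U, V) = sum_ij (U_i* . V_*j - r_ij)^2 + lam1 |U|_1 + lam2 |V|_1.

    U is students x factors, V is factors x resources, R is students x resources.
    Every entry of R is treated as observed, which is an assumption of this sketch.
    """
    fit = 0.0
    for i, row in enumerate(R):
        for j, r_ij in enumerate(row):
            pred = sum(U[i][f] * V[f][j] for f in range(len(V)))
            fit += (pred - r_ij) ** 2
    l1_u = sum(abs(x) for row in U for x in row)
    l1_v = sum(abs(x) for row in V for x in row)
    return fit + lam1 * l1_u + lam2 * l1_v


def shrink(x, threshold):
    """Soft-thresholding: sign(x) * max(|x| - threshold, 0)."""
    if x > threshold:
        return x - threshold
    if x < -threshold:
        return x + threshold
    return 0.0


# Hypothetical 2 x 2 example with 2 latent factors.
U = [[1.0, 0.0], [0.0, 2.0]]
V = [[1.0, 2.0], [0.0, 1.0]]
R = [[1.0, 2.0], [0.0, 1.0]]
J = l1_mf_objective(U, V, R, lam1=0.1, lam2=0.2)
```

The L1 penalties push small entries of $U$ and $V$ to exactly zero, which is what the shrink step produces: any value within the threshold of zero is set to zero.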

7. CNN for Text Feature Extraction

Model - Language Model

• Topic Model is employed.

• The Latent Dirichlet Allocation (LDA) method is used to train the topic model.

7. CNN for Text Feature Extraction

Result

• It achieves significant improvements over conventional methods.

• It can also work well when the existing recommendation algorithms suffer from the cold-start problem.