FACTORIZATION MACHINE: MODEL, OPTIMIZATION AND APPLICATIONS

Yang LIU
Email: yliu@cse.cuhk.edu.hk
Supervisors: Prof. Andrew Yao, Prof. Shengyu Zhang


OUTLINE

Factorization machine (FM)
  A generic predictor
  Auto feature interaction

Learning algorithm
  Stochastic gradient descent (SGD)
  …

Applications
  Recommendation systems
  Regression and classification
  …


DOUBAN MOVIE


PREDICTION TASK

e.g. Alice rates Titanic 5 at time 13


PREDICTION TASK

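Format: feature vector $x \in \mathbb{R}^n$ with target $y$; $y \in \mathbb{R}$ for regression, $y$ a class label for classification

Training set: $\{(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \dots\}$

Testing set: $\{x^{(1)}, x^{(2)}, \dots\}$

Objective: to predict $y$ for each $x$ in the testing set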
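As a minimal sketch of this format, the example "Alice rates Titanic 5 at time 13" can be encoded as one real-valued feature vector plus a target. The user and movie lists and the helper `encode` are hypothetical, chosen only for illustration; the slides do not specify an encoding.

```python
import numpy as np

# Hypothetical vocabularies; only Alice and Titanic come from the slides.
USERS = ["Alice", "Bob", "Charlie"]
MOVIES = ["Titanic", "Notting Hill", "Star Wars"]


def encode(user: str, movie: str, time: float) -> np.ndarray:
    """One-hot user block, one-hot movie block, then the raw time value."""
    x = np.zeros(len(USERS) + len(MOVIES) + 1)
    x[USERS.index(user)] = 1.0
    x[len(USERS) + MOVIES.index(movie)] = 1.0
    x[-1] = time
    return x


x = encode("Alice", "Titanic", 13)  # feature vector
y = 5.0                             # regression target: the rating
```

Most entries of `x` are zero, which is the sparse setting the later slides refer to.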


LINEAR MODEL – FEATURE ENGINEERING

Linear SVM

Logistic Regression

$\hat{y}(x) = \frac{1}{1 + w_0 \exp(-w^T x)}$

Jia
Is this the correct format of linear SVM? Do we need to mention that features are independent in linear models?

FACTORIZATION MODEL

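Model parameters $\Theta = \{w_0, w_1, \dots, w_n, v_{1,1}, \dots, v_{n,k}\}$, where $k$ is the inner dimension

Linear: $\hat{y}(x) = w_0 + \sum_{i=1}^{n} w_i x_i$

FM: $\hat{y}(x) = w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle v_i, v_j \rangle x_i x_j$

The last term models the interaction between variables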
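A minimal sketch of the second-order FM model in its standard form, a bias plus a linear term plus one weighted product per feature pair, evaluated directly with a double loop. The function name `fm_predict_naive` and the toy values are assumptions for illustration.

```python
import numpy as np


def fm_predict_naive(x, w0, w, V):
    """Evaluate w0 + sum_i w_i x_i + sum_{i<j} <v_i, v_j> x_i x_j directly."""
    n = x.shape[0]
    y_hat = w0 + float(w @ x)
    for i in range(n):
        for j in range(i + 1, n):
            y_hat += float(V[i] @ V[j]) * x[i] * x[j]
    return y_hat


# Toy values (assumed): n = 3 features, inner dimension k = 2.
x = np.array([1.0, 0.0, 2.0])
w0 = 0.5
w = np.array([0.1, 0.2, 0.3])
V = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
y_hat = fm_predict_naive(x, w0, w, V)
```

Here the linear part contributes 0.7 and the only nonzero interaction (features 0 and 2) contributes 2, so `y_hat` is 3.2 up to floating-point rounding.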


INTERACTION MATRIX

$W \in \mathbb{R}^{n \times n}$, with entries $w_{i,j} = \langle v_i, v_j \rangle$

$W = V V^T$, where $V \in \mathbb{R}^{n \times k}$ has rows $v_i$ and $k$ is the inner dimension

Each entry: $w_{ij} = v_i^T v_j$, e.g. $v_A^T v_{TI}$

Factorization Machine
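The factorization of the interaction matrix can be sketched numerically. The sizes `n` and `k` and the random factors below are assumptions for illustration; the point is only that every pairwise weight is an inner product of two rows of `V`, so `W` is symmetric with rank at most `k`.

```python
import numpy as np

n, k = 100, 5                    # assumed sizes: n features, inner dimension k
rng = np.random.default_rng(0)
V = rng.normal(size=(n, k))      # row i is the factor vector v_i

W = V @ V.T                      # W[i, j] = <v_i, v_j>

i, j = 1, 4
entry_matches = bool(np.isclose(W[i, j], V[i] @ V[j]))
is_symmetric = bool(np.allclose(W, W.T))
rank_W = int(np.linalg.matrix_rank(W))   # at most k

# Parameter counts: one weight per feature pair vs. k weights per feature.
pairwise_params = n * (n - 1) // 2   # 4950
factorized_params = n * k            # 500
```

Because weights for different pairs share factor vectors, a pair that never co-occurs in training still gets an estimate through the factors of its two features.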

Jia
Is "machine" an appropriate description? Why use "machine"?

FM: PROPERTIES

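Expressiveness: any positive semidefinite $W$ can be written as $V V^T$ for sufficiently large $k$

Feature dependency: $w_{i,j}$ and $w_{i,l}$ are dependent, since both share $v_i$

Linear computation complexity: $O(kn)$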
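The linear complexity follows from a standard rewriting of the pairwise term of the FM model. The derivation below is a sketch; the coordinate notation $v_{i,f}$ for the $f$-th entry of $v_i$ is an assumption, not taken from the slides.

```latex
\begin{aligned}
\sum_{i=1}^{n}\sum_{j=i+1}^{n} \langle v_i, v_j\rangle x_i x_j
&= \frac{1}{2}\left(\sum_{i=1}^{n}\sum_{j=1}^{n} \langle v_i, v_j\rangle x_i x_j
   - \sum_{i=1}^{n} \langle v_i, v_i\rangle x_i^2\right) \\
&= \frac{1}{2}\sum_{f=1}^{k}\left(\Big(\sum_{i=1}^{n} v_{i,f}\,x_i\Big)^{2}
   - \sum_{i=1}^{n} v_{i,f}^{2}\,x_i^{2}\right)
\end{aligned}
```

Each inner sum takes $O(n)$ operations and there are $k$ of them, giving $O(kn)$; for a sparse $x$ only the nonzero entries contribute.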


OPTIMIZATION TARGET

Min ERROR
Min ERROR + Regularization

Loss function
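One common way to write this target is sketched below. It assumes a per-example loss $l$, a training set $D$, and L2 regularization with weights $\lambda_\theta$; all three symbols are assumptions, since the slide's own formula did not survive extraction.

```latex
\min_{\Theta} \; \sum_{(x, y) \in D} l\big(\hat{y}(x \mid \Theta),\, y\big)
\;+\; \sum_{\theta \in \Theta} \lambda_{\theta}\, \theta^{2}
```

Typical choices for $l$ are squared loss for regression and logistic loss for classification.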


STOCHASTIC GRADIENT DESCENT (SGD)

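For item $(x, y)$, update $\theta$ by: $\theta \leftarrow \theta - \eta \left( \frac{\partial}{\partial \theta} l(\hat{y}(x), y) + 2 \lambda_\theta \theta \right)$

$\theta_0$: initial value of $\theta$; $\eta$: learning rate; $\lambda_\theta$: regularization

Pros
  Easy to implement
  Fast convergence on big training data

Cons
  Parameter tuning
  Sequential method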
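A minimal sketch of SGD training for an FM regressor, assuming squared loss, an L2 penalty, and factors initialised from a zero-mean normal distribution. The synthetic data, the helpers `fm_predict` and `sgd_step`, and every hyperparameter value are hypothetical choices for illustration, not the author's setup.

```python
import numpy as np


def fm_predict(x, w0, w, V):
    """O(kn) evaluation of the FM model."""
    s = V.T @ x                                   # s[f] = sum_i v_{i,f} x_i
    pair = 0.5 * float(np.sum(s ** 2 - (V ** 2).T @ (x ** 2)))
    return w0 + float(w @ x) + pair


def sgd_step(x, y, w0, w, V, lr, reg):
    """One SGD update on (y_hat - y)^2 + reg * theta^2; w and V change in place."""
    s = V.T @ x
    err = fm_predict(x, w0, w, V) - y
    grad_V = np.outer(x, s) - V * (x ** 2)[:, None]   # d y_hat / d v_{i,f}
    w0 -= lr * 2.0 * err                               # bias left unregularized
    w -= lr * (2.0 * err * x + 2.0 * reg * w)
    V -= lr * (2.0 * err * grad_V + 2.0 * reg * V)
    return w0


rng = np.random.default_rng(0)
n, k = 8, 3
X = rng.uniform(-1.0, 1.0, size=(200, n))
y = X[:, 0] * X[:, 1] + 0.5 * X[:, 2]    # synthetic target with one interaction

w0, w = 0.0, np.zeros(n)
V = rng.normal(scale=0.1, size=(n, k))    # small random initial factors


def mse():
    return float(np.mean([(fm_predict(x, w0, w, V) - t) ** 2 for x, t in zip(X, y)]))


mse_before = mse()
for epoch in range(30):
    for idx in rng.permutation(len(X)):   # visit training items in random order
        w0 = sgd_step(X[idx], y[idx], w0, w, V, lr=0.01, reg=1e-4)
mse_after = mse()
```

The loop illustrates both cons listed above: `lr`, `reg`, and the initial scale all need tuning, and each update depends on the parameters left by the previous one, so the items are processed sequentially.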


APPLICATIONS

EMI Music Hackathon 2012: song recommendation

Given:
  Historical ratings
  User demographics

# features: 51K
# items in training: 188K


RESULTS FOR EMI MUSIC

FM: Root Mean Square Error (RMSE) 13.27626
  Target value in [0, 100]
  The best (SVD++) is 13.24598

Details
  Regression
  Converges in 100 iterations
  Time for each iteration: < 1 s

Environment: Windows 7, Intel Core 2 Duo CPU 2.53 GHz, 6 GB RAM
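For reference, RMSE can be computed as in the short sketch below. The ratings are made-up values on the [0, 100] scale and are unrelated to the EMI data.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean square error between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Made-up ratings on the [0, 100] scale, for illustration only.
score = rmse([50, 80, 10, 100], [53, 76, 10, 88])  # 6.5
```

RMSE is in the same units as the target, so a value near 13 means predictions are off by roughly 13 points on the 100-point scale.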


OTHER APPLICATIONS

Ads CTR prediction (KDD Cup 2012)

Features: User_info, Ad_info, Query_info, Position, etc.
# features: 7.2M
# items in training: 160M
Classification

Performance: AUC 0.80178; the best (SVM) is 0.80893
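AUC can be read as the fraction of (positive, negative) pairs that the model ranks in the right order. The sketch below computes it that way on made-up click data; the function `auc` is a hypothetical helper, and its pairwise comparison is only practical for small inputs, not for 160M items.

```python
import numpy as np


def auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]   # every positive score minus every negative
    return float((np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size)


# Made-up clicks (1) and non-clicks (0) with predicted click probabilities.
value = auc([1, 0, 1, 0, 0], [0.9, 0.3, 0.4, 0.5, 0.1])
```

In this toy case five of the six positive/negative pairs are ordered correctly, so `value` is 5/6.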


OTHER APPLICATIONS

HiCloud App Recommendation

Features: App_info, Smartphone model, installed apps, etc.
# features: 9.5M
# items in training: 16M
Classification

Performance: Top 5: 8%, Top 10: 18%, Top 20: 32%; AUC: 0.78


SUMMARY

FM: a general predictor
  Works under sparsity
  Linear computation complexity
  Estimates interactions automatically
  Works with any real-valued feature vector

THANKS!