Fast ALS-based matrix factorization for explicit and implicit feedback datasets. István Pilászy, Dávid Zibriczky, Domonkos Tikk. Gravity R&D Ltd., www.gravityrd.com. 28 September 2010.


Page 1: Fast ALS-based matrix factorization for explicit and implicit feedback datasets

Fast ALS-based matrix factorization for explicit and implicit feedback datasets

István Pilászy, Dávid Zibriczky, Domonkos Tikk

Gravity R&D Ltd. www.gravityrd.com

28 September 2010

Page 2: Collaborative filtering

Page 3: Problem setting

[Figure: a user-item rating matrix with a few known ratings (e.g. 5, 4, 3, 4, 4, 2, 4, 1) and many missing entries to be predicted.]

Page 4: Ridge Regression

• Predict yᵢ from xᵢ: ŷᵢ = xᵢᵀw

• Training data: 8 examples with K = 5 features each, collected in X (one example per row), with targets y; find w such that y ≈ Xw:

X =
1.6 2.0 2.5 1.4 2.7
0.7 2.2 2.5 2.9 0.5
0.0 0.2 2.2 2.1 1.8
0.1 2.9 1.3 1.3 1.3
1.3 1.7 2.5 2.0 2.9
1.7 2.7 2.0 0.2 1.1
1.7 2.9 0.4 0.3 1.8
1.3 1.2 0.1 0.6 0.1

y = (1.9, 1.9, 1.2, 2.1, 2.3, 1.6, 1.9, 0.8)

w = (?, ?, ?, ?, ?)

Page 5: Ridge Regression

• Optimal solution (for the X and y above):

w = (0.0, 0.4, 0.01, 0.3, 0.3)

Page 6: Ridge Regression

• Computing the optimal solution: w = (XᵀX)⁻¹(Xᵀy)

XᵀX =
12.2 18.3 13.3  8.6 13.6
18.3 37.3 25.6 18.9 23.9
13.3 25.6 29.4 22.6 23.8
 8.6 18.9 22.6 21.0 17.3
13.6 23.9 23.8 17.3 25.3

Xᵀy = (14.4, 28.7, 24.4, 19.1, 22.6)

• Matrix inversion is costly: O(K³)

• Sum of squared errors of the optimal solution: 0.055
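The closed-form solve above can be sketched in a few lines of NumPy. This is a minimal illustration, not the talk's code: the data is random (only shaped like the slide's 8×5 example), and the regularization weight lam is an assumed parameter.

```python
import numpy as np

# Random data shaped like the slide's example: 8 examples, K = 5 features.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 3.0, size=(8, 5))  # one example per row
y = rng.uniform(0.0, 2.5, size=8)
lam = 0.1  # assumed ridge regularization weight

# w = (X^T X + lam*I)^{-1} (X^T y); factorizing the K x K system is O(K^3).
A = X.T @ X + lam * np.eye(X.shape[1])
b = X.T @ y
w = np.linalg.solve(A, b)  # solve() is preferred over forming the inverse

sse = float(np.sum((X @ w - y) ** 2))
```

Note that `np.linalg.solve` factorizes A rather than inverting it explicitly; the O(K³) cost is the same, but it is numerically safer.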

Page 7: RR1: RR with coordinate descent

• Idea: optimize only one variable of w at a time

• Start with zero: w = (0.0, 0.0, 0.0, 0.0, 0.0)

• Sum of squared errors: 24.6

Page 8: RR1: RR with coordinate descent

• Idea: optimize only one variable of w at a time

• Start with zero, then optimize w1: w = (1.2, 0.0, 0.0, 0.0, 0.0)

• Sum of squared errors: 7.5

Page 9: RR1: RR with coordinate descent

• Idea: optimize only one variable of w at a time

• Start with zero, then optimize w1, then optimize w2: w = (1.2, 0.2, 0.0, 0.0, 0.0)

• Sum of squared errors: 6.2

Page 10: RR1: RR with coordinate descent

• Idea: optimize only one variable of w at a time

• Start with zero, then optimize w1, then w2, then w3: w = (1.2, 0.2, 0.1, 0.0, 0.0)

• Sum of squared errors: 5.7

Page 11: RR1: RR with coordinate descent

• Idea: optimize only one variable of w at a time

• … w4: w = (1.2, 0.2, 0.1, 0.1, 0.0)

• Sum of squared errors: 5.4

Page 12: RR1: RR with coordinate descent

• Idea: optimize only one variable of w at a time

• … w5: w = (1.2, 0.2, 0.1, 0.1, 0.1)

• Sum of squared errors: 5.0

Page 13: RR1: RR with coordinate descent

• Idea: optimize only one variable of w at a time

• … w1 again: w = (0.8, 0.2, 0.1, 0.1, 0.1)

• Sum of squared errors: 3.4

Page 14: RR1: RR with coordinate descent

• Idea: optimize only one variable of w at a time

• … w2 again: w = (0.8, 0.3, 0.1, 0.1, 0.1)

• Sum of squared errors: 2.9

Page 15: RR1: RR with coordinate descent

• Idea: optimize only one variable of w at a time

• … w3 again: w = (0.8, 0.3, 0.2, 0.1, 0.1)

• Sum of squared errors: 2.7

Page 16: RR1: RR with coordinate descent

• Idea: optimize only one variable of w at a time

• … after a while: w = (0.0, 0.4, 0.01, 0.3, 0.3)

• Sum of squared errors: 0.055

• No remarkable difference from the optimal solution

• Cost: n examples, e epochs: O(K·n·e)
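The coordinate-descent loop walked through above can be sketched as follows. This is our own minimal implementation of the idea, not the talk's code: random data of the same shape, an assumed lam, and an incremental residual update so that one epoch over n examples and K features costs O(K·n) instead of the O(K³) closed-form solve.

```python
import numpy as np

def rr1(X, y, lam=0.1, epochs=1):
    """Cyclic coordinate descent for ridge regression (RR1 sketch)."""
    n, K = X.shape
    w = np.zeros(K)          # start with the all-zero vector
    r = y - X @ w            # current residual
    for _ in range(epochs):
        for k in range(K):
            xk = X[:, k]
            # optimal w[k] with all other coordinates frozen
            wk_new = xk @ (r + xk * w[k]) / (xk @ xk + lam)
            r += xk * (w[k] - wk_new)   # keep the residual up to date
            w[k] = wk_new
    return w

rng = np.random.default_rng(1)
X = rng.uniform(0.0, 3.0, size=(8, 5))
y = rng.uniform(0.0, 2.5, size=8)

sse_zero = float(np.sum(y ** 2))              # error of the all-zero start
w_rr1 = rr1(X, y, lam=0.1, epochs=500)        # many epochs: converges to RR
w_exact = np.linalg.solve(X.T @ X + 0.1 * np.eye(5), X.T @ y)
sse_rr1 = float(np.sum((X @ w_rr1 - y) ** 2))
```

With enough epochs the iterate matches the closed-form ridge solution, mirroring the slides' "no remarkable difference" observation.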

Page 17: Matrix factorization

• The rating matrix R of size (M × N) is approximated as the product of two lower-rank matrices: R ≈ PQᵀ, with r̂ᵤᵢ = pᵤᵀqᵢ

• P: user feature matrix of size (M × K)

• Q: item (movie) feature matrix of size (N × K)

• K: number of features
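In code, the factorization and the per-rating prediction look like this; the sizes follow the definitions above, while the values are random placeholders of our own.

```python
import numpy as np

rng = np.random.default_rng(1)
M, N, K = 4, 5, 2                 # M users, N items, K features
P = rng.standard_normal((M, K))   # user feature matrix, M x K
Q = rng.standard_normal((N, K))   # item feature matrix, N x K

R_hat = P @ Q.T                   # M x N matrix of all predictions, R ≈ P Q^T
u, i = 1, 3
r_ui = P[u] @ Q[i]                # single prediction: r̂_ui = p_u^T q_i
```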

Page 18: Matrix factorization for explicit feedback

[Figure: a rating matrix R with a few known ratings (5, 5, 4, 3, 1, 2, 4) is approximated as R ≈ PQᵀ with K = 2; the two rows of Qᵀ are (1.7, 0.7, 1.0, 1.3, 0.8) and (0, 0.7, 0.4, 1.7, 0.3), and each entry of the approximation is r̂ᵤᵢ = pᵤᵀqᵢ.]

Page 19: Finding P and Q

• Init Q randomly:

Qᵀ =
0.3 0.9 0.7 1.3 0.5
0.6 1.2 0.3 1.6 1.1

• Find p1

[Figure: R holds the known ratings (5, 5, 4, 3, 1, 2, 4); the row p1 = (?, ?) of P is to be computed.]

Page 20: Finding p1 with RR

• The examples are the feature vectors of the items user 1 rated, and the targets are the ratings:

x1 = (0.3, 0.6), x2 = (0.7, 0.3), x3 = (0.5, 1.1); y = (5, 4, 3)

• Optimal solution: p1 = (3.2, 2.3)

Page 21: Finding p1 with RR

[Figure: the same R and Q as before, with the computed p1 = (3.2, 2.3) now filled into the first row of P.]

Page 22: Alternating Least Squares (ALS)

• Initialize Q randomly

• Repeat

• Recompute P

• Compute p1 with RR

• Compute p2 with RR

• … (for each user)

• Recompute Q

• Compute q1 with RR

• … (for each item)
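The loop above can be sketched as follows. This is a minimal ALS implementation of our own, with an assumed regularization weight lam and toy data shaped like the slides' example; it is an illustration, not the talk's code.

```python
import numpy as np

def als(R, observed, K=2, lam=0.1, iters=20, seed=0):
    """Alternating least squares for explicit feedback (sketch)."""
    rng = np.random.default_rng(seed)
    M, N = R.shape
    P = np.zeros((M, K))
    Q = rng.standard_normal((N, K))   # initialize Q randomly
    I = np.eye(K)
    for _ in range(iters):
        for u in range(M):            # recompute P: one ridge solve per user
            J = np.flatnonzero(observed[u])
            P[u] = np.linalg.solve(Q[J].T @ Q[J] + lam * I, Q[J].T @ R[u, J])
        for i in range(N):            # recompute Q: one ridge solve per item
            U = np.flatnonzero(observed[:, i])
            Q[i] = np.linalg.solve(P[U].T @ P[U] + lam * I, P[U].T @ R[U, i])
    return P, Q

# Toy data: 4 users, 5 items, a handful of known ratings.
R = np.zeros((4, 5))
obs = np.zeros((4, 5), dtype=bool)
for (u, i, r) in [(0, 0, 5), (0, 2, 4), (0, 4, 3), (1, 1, 5), (1, 3, 3),
                  (2, 0, 1), (2, 3, 2), (3, 2, 4), (3, 4, 4)]:
    R[u, i] = r
    obs[u, i] = True

P, Q = als(R, obs, K=2, lam=0.1, iters=20)
rmse = float(np.sqrt(np.mean((P @ Q.T - R)[obs] ** 2)))
```

Each inner solve is exactly the per-user (or per-item) ridge regression of the previous slides; only the observed entries contribute to the design matrix.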

Page 23: ALS1: ALS with RR1

• ALS relies on RR:

• recomputation of vectors with RR

• when recomputing p1, the previously computed value is ignored

• ALS1 relies on RR1:

• optimize the previously computed p1, one scalar at a time

• the previously computed value is not lost

• run RR1 for only one epoch

• ALS is just an approximation method. Likewise ALS1.

• Cost of ALS1: O(K·n·e), with e = 1

Page 24: Implicit feedback

[Figure: a binary rating matrix R (1 = watched, 0 = not watched) approximated as R ≈ PQᵀ, with r̂ᵤᵢ = pᵤᵀqᵢ; the two rows of Qᵀ are (0.1, 0.7, 0.3, 0, 0.2) and (0, 0.7, 0.4, 0.4, 0.4).]

Page 25: Implicit feedback: IALS

• The matrix is fully specified: each user watched each item.

• Zeros are less important, but still important. Many 0-s, few 1-s.

• Recall that w = (XᵀX)⁻¹(Xᵀy)

• Idea (Hu, Koren, Volinsky):

• consider a user who watched nothing

• compute XᵀX and Xᵀy for this user (the null-user)

• when recomputing p1, compare her to the null-user

• based on the cached XᵀX and Xᵀy, update them according to the differences

• In this way, only the number of 1-s affects performance, not the number of 0-s

• IALS: alternating least squares with this trick
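The null-user caching idea can be sketched as follows. The cached Gram matrix QᵀQ is what XᵀX would be for a user who watched nothing; for a real user, only the watched items correct the system, so the per-user cost depends on the 1-s, not the 0-s. The confidence weight alpha and the weighting scheme below are assumptions in the spirit of Hu, Koren and Volinsky, not the talk's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(2)
N, K, alpha, lam = 100, 4, 40.0, 0.1
Q = rng.standard_normal((N, K)) * 0.1

G0 = Q.T @ Q                      # cached once per sweep: the null-user's X^T X
watched = np.array([3, 17, 42])   # this user's few 1-s

# Correct the cached system using only the watched items.
Qw = Q[watched]
A = G0 + alpha * (Qw.T @ Qw) + lam * np.eye(K)   # corrected X^T X + lam*I
b = (1.0 + alpha) * Qw.sum(axis=0)               # corrected X^T y (y = 1 on watched)
p_u = np.linalg.solve(A, b)                      # user vector; cost independent of the 0-s
```

The correction adds one rank-1-per-watched-item update to the cached Gram matrix, which is exactly why the runtime scales with the number of 1-s.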

Page 26: Implicit feedback: IALS1

• The RR1 trick cannot be applied here

Page 27: Implicit feedback: IALS1

• The RR1 trick cannot be applied here

• But, wait…!

Page 28: Implicit feedback: IALS1

• XᵀX is just a matrix. No matter how many items we have, its dimension is the same (K×K).

• If we are lucky, we can find K items which generate this matrix.

• What if we are unlucky? We can still create synthetic items.

• Assume that the null-user did not watch these K items.

• XᵀX and Xᵀy are the same if the synthetic items are created appropriately.

[The slide shows the 5×5 XᵀX of the running example.]

Page 29: Implicit feedback: IALS1

• Can we find a Z matrix such that Z is small (K×K) and ZᵀZ = XᵀX?

• We can, by eigenvalue decomposition: XᵀX = SΛSᵀ, Z := Λ^(1/2)Sᵀ

[The slide shows the 5×5 XᵀX of the running example and the resulting 5×5 Z.]
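The construction can be verified in a few lines: eigendecompose the K×K Gram matrix and scale the eigenvector rows, so that the K rows of Z act as K synthetic items reproducing XᵀX exactly. The data here is random, for illustration only.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((50, 5))   # many items, K = 5 features
G = X.T @ X                        # the K x K Gram matrix X^T X

# G = S @ diag(lams) @ S.T (G is symmetric positive semidefinite).
lams, S = np.linalg.eigh(G)

# Z = Lambda^{1/2} S^T, so Z^T Z = S Lambda S^T = G.
# clip() guards against tiny negative eigenvalues from round-off.
Z = np.sqrt(np.clip(lams, 0.0, None))[:, None] * S.T
```

However many items X has, Z stays K×K, which is what makes the synthetic-item trick cheap.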

Page 30: Implicit feedback: IALS1

• If a user watched N items, we can run RR1 with N+K examples

• To recompute pᵤ, we need O((N+K)K) steps (assume 1 epoch)

• Is it better in practice than the O(K³ + K²N) of IALS?

Page 31: Evaluation of ALS vs. ALS1

• Probe10 RMSE on the Netflix Prize dataset, after 25 epochs

Page 32: Evaluation of ALS vs. ALS1

• Time-accuracy tradeoff

Page 33: Evaluation of IALS vs. IALS1

• Average Relative Position on the test subset of a proprietary implicit feedback dataset, after 20 epochs. Lower is better.

Page 34: Evaluation of IALS vs. IALS1

• Time-accuracy tradeoff

Page 35: Conclusions

• We learned two tricks:

• ALS1: RR1 can be used instead of RR in ALS

• IALS1: we can create a few synthetic examples to replace the non-watching of many examples

• ALS and IALS are approximation algorithms, so why not make them even more approximative?

• ALS1 and IALS1 offer better time-accuracy tradeoffs, especially when K is large

• They can be even 10x faster (or even 100x faster, for unrealistic K values)

TODO: precision, recall, other datasets.

Page 36: Thank you for your attention

?