Transcript
Page 1: Fast ALS-Based Matrix Factorization for Recommender Systems

Fast ALS-Based Matrix Factorization for Recommender Systems

David Zibriczky

LAWA Workpackage Meeting

16th January, 2013

Page 2: Fast ALS-Based Matrix Factorization for Recommender Systems

Problem setting

Page 3: Fast ALS-Based Matrix Factorization for Recommender Systems

Item Recommendation

• Classical item recommendation problem (see Netflix)

• Explicit feedback (ratings)

[Figure: a user's ratings of The Matrix, The Matrix 2, Twilight and The Matrix 3; one rating (5) is known, the others (?) are unknown]

Page 4: Fast ALS-Based Matrix Factorization for Recommender Systems

Collaborative Filtering (Explicit)

• Classical item recommendation problem (see Netflix)

• Explicit feedback (ratings)

• Collaborative Filtering

• Based on other users

[Figure: ratings by several users of The Matrix, The Matrix 2, Twilight and The Matrix 3; known ratings (4 and 5) alongside unknown ones (?)]

Page 5: Fast ALS-Based Matrix Factorization for Recommender Systems

Collaborative Filtering (Implicit)

• Items are not only movies (live content, products, holidays, …)

• Implicit feedback (buy, view, …)

• Less information about preferences

[Figure: users and Item1, Item2, Item3, Item4 with unknown preferences (?)]

Page 6: Fast ALS-Based Matrix Factorization for Recommender Systems

Industrial motivation

• Keeping the response time low

• User models must stay up to date, so adaptation should be fast

• Items may change rapidly, so training time can become a bottleneck of live performance

• Increasing amount of data from a customer → increasing training time

• Limited resources

Page 7: Fast ALS-Based Matrix Factorization for Recommender Systems

Model

Page 8: Fast ALS-Based Matrix Factorization for Recommender Systems

Preference Matrix

• Matrix representation

• Implicit feedback: assuming positive preference

• Value = 1

• Estimation of unknown preference?

• Sorting items by estimation → item recommendation

R      Item1  Item2  Item3  Item4
User1    1      ?      ?      ?
User2    ?      ?      1      ?
User3    1      1      ?      ?
User4    ?      1      ?      1

Page 9: Fast ALS-Based Matrix Factorization for Recommender Systems

Matrix Factorization

$\hat{R} = P Q^T$, $\hat{r}_{ui} = p_u^T q_i$

$R_{N \times M}$: preference matrix

$P_{N \times K}$: user feature matrix

$Q_{M \times K}$: item feature matrix

$N$: #users

$M$: #items

$K$: #features

$K \ll M$, $K \ll N$

$p_u := (P_u)^T$ (row $u$ of $P$, as a column vector)

$q_i := (Q_i)^T$ (row $i$ of $Q$, as a column vector)

[Figure: entry $r_{ui}$ of $R$ (rows User1, User2, User3, …; columns Item1, Item2, Item3, …) is the product of row $p_u^T$ of $P$ and column $q_i$ of $Q^T$]
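The factorization model above can be illustrated with a minimal sketch. The factor values are made up for illustration and the helper names `predict` and `rank_items` are hypothetical; the only thing taken from the slide is that the estimated preference is the dot product of a user feature vector and an item feature vector.

```python
def predict(P, Q, u, i):
    """Estimated preference of user u for item i: dot product of p_u and q_i."""
    return sum(p * q for p, q in zip(P[u], Q[i]))


def rank_items(P, Q, u):
    """Item indices sorted by estimated preference for user u, best first."""
    scores = [predict(P, Q, u, i) for i in range(len(Q))]
    return sorted(range(len(Q)), key=lambda i: scores[i], reverse=True)


# Toy factors (made-up numbers): N = 2 users, M = 3 items, K = 2 features.
P = [[1.0, 0.0],
     [0.5, 0.5]]
Q = [[0.2, 0.9],
     [0.8, 0.1],
     [0.4, 0.4]]

ranking = rank_items(P, Q, 0)
```

Sorting items by the estimated score is what turns the model into a recommendation list.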

Page 10: Fast ALS-Based Matrix Factorization for Recommender Systems

Objective Function

Page 11: Fast ALS-Based Matrix Factorization for Recommender Systems

Preference Matrix

R      Item1  Item2  Item3  Item4
User1    1
User2                  1
User3    1      1
User4           1             1

Page 12: Fast ALS-Based Matrix Factorization for Recommender Systems

Preference Matrix

• Zero value for unknown preference (zero example). In practice: many 0s, few 1s.

R      Item1  Item2  Item3  Item4
User1    1      0      0      0
User2    0      0      1      0
User3    1      1      0      0
User4    0      1      0      1

Page 13: Fast ALS-Based Matrix Factorization for Recommender Systems

Confidence Matrix

• Zero value for unknown preference (zero example). In practice: many 0s, few 1s.

• $c_{ui}$: confidence for known feedback (a constant, or a function of the context of the event)

• Zero examples are less important, but still important.

R      Item1  Item2  Item3  Item4
User1    1      0      0      0
User2    0      0      1      0
User3    1      1      0      0
User4    0      1      0      1

C      Item1  Item2  Item3  Item4
User1  c_11     1      1      1
User2    1      1    c_23     1
User3  c_31   c_32     1      1
User4    1    c_42     1    c_44

Page 14: Fast ALS-Based Matrix Factorization for Recommender Systems

Weighted Sum of Squared Errors

• Objective function:

$f(P, Q) = WSSE = \sum_{(u,i)} c_{ui} (r_{ui} - \hat{r}_{ui})^2$

$P = ?$, $Q = ?$

C      Item1  Item2  Item3  Item4
User1  c_11     1      1      1
User2    1      1    c_23     1
User3  c_31   c_32     1      1
User4    1    c_42     1    c_44

R      Item1  Item2  Item3  Item4
User1    1      0      0      0
User2    0      0      1      0
User3    1      1      0      0
User4    0      1      0      1

Page 15: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer

Page 16: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Alternating Least Squares

• Ridge Regression

• $p_u = (Q^T C_u Q)^{-1} Q^T C_u R_r(u)$

• $q_i = (P^T C_i P)^{-1} P^T C_i R_c(i)$

($R_r(u)$: row $u$ of $R$; $R_c(i)$: column $i$ of $R$)

Q^T    0.1  -0.4   0.8   0.6
       0.6   0.7  -0.7  -0.2

P     -0.2   0.6
       0.6   0.4
       0.7   0.2
       0.5  -0.2

R      Item1  Item2  Item3  Item4
User1    1      0      0      0
User2    0      0      1      0
User3    1      1      0      0
User4    0      1      0      1
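A minimal sketch of one ALS user update follows, assuming no extra regularization term (the slide formula shows none). The helper names `solve` and `als_user_update` and the confidence value 5 are hypothetical; the item matrix is the Q^T of this slide written row per item. The linear system is solved directly instead of forming the inverse.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[k]] for k, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def als_user_update(r_u, c_u, Q):
    """Recompute p_u with Q fixed: solve (Q^T C_u Q) p_u = Q^T C_u r_u."""
    K = len(Q[0])
    A = [[sum(c * q[a] * q[b] for c, q in zip(c_u, Q)) for b in range(K)]
         for a in range(K)]
    rhs = [sum(c * q[a] * r for c, q, r in zip(c_u, Q, r_u)) for a in range(K)]
    return solve(A, rhs)


# Item features from the slide (one row per item), User1's row of R.
Q = [[0.1, 0.6], [-0.4, 0.7], [0.8, -0.7], [0.6, -0.2]]
r_u = [1, 0, 0, 0]
c_u = [5.0, 1.0, 1.0, 1.0]  # assumed confidence 5 for the single positive

p_u = als_user_update(r_u, c_u, Q)
```

The item update is symmetric: swap the roles of P and Q and use a column of R.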

Page 17: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Alternating Least Squares

(Next step: Q^T updated, P unchanged)

Q^T    0.3  -0.3   0.7   0.7
       0.7   0.8  -0.5  -0.1

P     -0.2   0.6
       0.6   0.4
       0.7   0.2
       0.5  -0.2

Page 18: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Alternating Least Squares

(Next step: P updated, Q^T unchanged)

Q^T    0.3  -0.3   0.7   0.7
       0.7   0.8  -0.5  -0.1

P     -0.2   0.7
       0.6   0.5
       0.8   0.2
       0.6  -0.2

Page 19: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Alternating Least Squares

• Complexity of naive solution: $O(IK^2NM + IK^3(N + M))$

$E$: number of examples, $I$: number of iterations

• Improvement (Hu, Koren, Volinsky)

Ridge Regression: $p_u = (Q^T C_u Q)^{-1} Q^T C_u R_r(u)$

Computing $Q^T C_u Q$ from scratch for every user, $O(IK^2NM)$, is costly. Instead: $Q^T C_u Q = Q^T Q + Q^T (C_u - I) Q = COV_{Q0} + COV_{Q+}$ (in this formula $I$ is the identity matrix)

$COV_{Q0}$ is user independent; it needs to be calculated only once, at the start of the iteration.

Calculating $COV_{Q+}$ needs only $\#P(u)^+$ steps.

o $\#P(u)^+$: number of positive examples of user $u$

Complexity: $O(IK^2E + IK^3(N + M)) = O(IK^2(E + K(N + M)))$

Codename: IALS

• Complexity issues on large datasets:

If $K$ is low: $O(IK^2E)$ is dominant

If $K$ is high: $O(IK^3(N + M))$ is dominant
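The covariance split used by IALS can be checked numerically with a short sketch. The helper name `outer_sum`, the confidence value 5 and the choice of user are assumptions for illustration; the item features are the Q^T of an earlier slide written row per item.

```python
def outer_sum(rows, weights, K):
    """Return sum_i w_i * q_i q_i^T as a K x K list of lists."""
    A = [[0.0] * K for _ in range(K)]
    for q, w in zip(rows, weights):
        for a in range(K):
            for b in range(K):
                A[a][b] += w * q[a] * q[b]
    return A


K = 2
Q = [[0.1, 0.6], [-0.4, 0.7], [0.8, -0.7], [0.6, -0.2]]  # M = 4 items
c_u = [5.0, 1.0, 1.0, 1.0]  # assumed: one positive example with confidence 5

# Naive: weighted covariance over all M items, repeated for every user.
naive = outer_sum(Q, c_u, K)

# Split: Q^T Q is shared by all users and computed once per sweep ...
cov0 = outer_sum(Q, [1.0] * len(Q), K)
# ... and only the user's positive examples contribute a correction.
positives = [i for i, c in enumerate(c_u) if c != 1.0]
cov_plus = outer_sum([Q[i] for i in positives],
                     [c_u[i] - 1.0 for i in positives], K)
fast = [[cov0[a][b] + cov_plus[a][b] for b in range(K)] for a in range(K)]
```

Both routes give the same K x K matrix, but the per-user work in the second one depends on the number of positives rather than on M.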

Page 20: Fast ALS-Based Matrix Factorization for Recommender Systems

Problem: Complexity

Page 21: Fast ALS-Based Matrix Factorization for Recommender Systems

Ridge Regression with Coordinate Descent

R      Item1  Item2  Item3  Item4
User1    1      0      0      0

Q^T    0.9  -0.4   0.8   0.6
       0.6   0.7  -0.7  -0.2
      -0.1  -0.4  -0.1   0.6

P       ?     ?     ?

Page 22: Fast ALS-Based Matrix Factorization for Recommender Systems

Ridge Regression with Coordinate Descent

• Initialize with zero values

P       0     0     0

Page 23: Fast ALS-Based Matrix Factorization for Recommender Systems

Ridge Regression with Coordinate Descent

P      0.51   0     0

• Target vector: $e_u = r_u - p_u^T Q^T$ (the confidences $c_{ui}$ enter through the weighted sums below)

• Optimize only one feature of $p_u$ at once

• $p_{uk} = \frac{\sum_{i=1}^{M} c_{ui} q_{ik} e_{ui}}{\sum_{i=1}^{M} c_{ui} q_{ik} q_{ik}} = \frac{SQE}{SQQ}$

• $e_{ui} = e_{ui} - p_{uk} q_{ik}$

• Apply more iterations
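The per-feature update can be sketched as below, under stated assumptions: the residual is kept unweighted and the confidences are applied inside the two sums, feature k's current contribution is added back to the residual before it is re-estimated (needed from the second sweep on), and the function name `cd_user_update`, the optional ridge term `lam` and the confidence value 5 are hypothetical. The numbers it produces are not meant to reproduce the values shown on the slides.

```python
def cd_user_update(r_u, c_u, Q, p_init, sweeps=1, lam=0.0):
    """Coordinate descent on one user's weighted least-squares problem."""
    M, K = len(Q), len(p_init)
    p = list(p_init)
    # Residual e_ui = r_ui - p_u^T q_i for the starting vector.
    e = [r_u[i] - sum(p[k] * Q[i][k] for k in range(K)) for i in range(M)]
    for _ in range(sweeps):
        for k in range(K):
            # Remove feature k's current contribution from the residual.
            for i in range(M):
                e[i] += p[k] * Q[i][k]
            sqe = sum(c_u[i] * Q[i][k] * e[i] for i in range(M))
            sqq = sum(c_u[i] * Q[i][k] * Q[i][k] for i in range(M))
            p[k] = sqe / (lam + sqq)  # lam is an assumed ridge term, 0 by default
            # Put the new contribution back into the residual.
            for i in range(M):
                e[i] -= p[k] * Q[i][k]
    return p


# Item features from the slides (one row per item) and User1's row of R.
Q = [[0.9, 0.6, -0.1], [-0.4, 0.7, -0.4], [0.8, -0.7, -0.1], [0.6, -0.2, 0.6]]
r_u = [1, 0, 0, 0]
c_u = [5.0, 1.0, 1.0, 1.0]  # assumed confidence 5 for the positive example

p_u = cd_user_update(r_u, c_u, Q, [0.0, 0.0, 0.0], sweeps=200)
```

With enough sweeps the result approaches the exact weighted least-squares solution; in an ALS setting only one or a few sweeps per user would typically be run.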

Page 24: Fast ALS-Based Matrix Factorization for Recommender Systems

Ridge Regression with Coordinate Descent

P      0.51   0.10   0

Page 25: Fast ALS-Based Matrix Factorization for Recommender Systems

Ridge Regression with Coordinate Descent

P      0.51   0.10   0.08

Page 26: Fast ALS-Based Matrix Factorization for Recommender Systems

Ridge Regression with Coordinate Descent

P      0.47   0.10   0.08

Page 27: Fast ALS-Based Matrix Factorization for Recommender Systems

Ridge Regression with Coordinate Descent

P      0.46   0.11   0.07

Page 28: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Coordinate Descent

• Ridge Regression with Coordinate Descent

Q^T    0.1   0.4   1.1   0.6
       0.6   0.7   1.5   1.0

P      0.3   0
       0     0
       0     0
       0     0

R      Item1  Item2  Item3  Item4
User1    1      0      0      0
User2    0      0      1      0
User3    1      1      0      0
User4    0      1      0      1

Page 29: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Coordinate Descent

Q^T    0.1   0.4   1.1   0.6
       0.6   0.7   1.5   1.0

P      0.3  -0.1
       0     0
       0     0
       0     0

Page 30: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Coordinate Descent

Q^T    0.1   0.4   1.1   0.6
       0.6   0.7   1.5   1.0

P      0.3  -0.1
       0.1   0
       0     0
       0     0

Page 31: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Coordinate Descent

Q^T    0.1   0.4   1.1   0.6
       0.6   0.7   1.5   1.0

P      0.3  -0.1
       0.1   0.5
       0     0
       0     0

Page 32: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Coordinate Descent

Q^T    0.1   0.4   1.1   0.6
       0.6   0.7   1.5   1.0

P      0.3  -0.1
       0.1  -0.5
      -0.4   0.2
       0.5  -0.4

Page 33: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Coordinate Descent

Q^T    0.1   0     0     0
       0     0     0     0

P      0.3  -0.1
       0.1  -0.5
      -0.4   0.2
       0.5  -0.4

Page 34: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Coordinate Descent

Q^T    0.1   0     0     0
       0.6   0     0     0

P      0.3  -0.1
       0.1  -0.5
      -0.4   0.2
       0.5  -0.4

Page 35: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Coordinate Descent

Q^T    0.1   0.4   0     0
       0.6   0     0     0

P      0.3  -0.1
       0.1  -0.5
      -0.4   0.2
       0.5  -0.4

Page 36: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Coordinate Descent

Q^T    0.1   0.4  -0.1   0.2
       0.6   0.7   0.8   0.5

P      0.3  -0.1
       0.1  -0.5
      -0.4   0.2
       0.5  -0.4

Page 37: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Coordinate Descent

Q^T    0.1   0.4  -0.1   0.2
       0.6   0.7   0.8   0.5

P      0.2   0
       0     0
       0     0
       0     0

Page 38: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Coordinate Descent

Q^T    0.1   0.4  -0.1   0.2
       0.6   0.7   0.8   0.5

P      0.2  -0.1
       0     0
       0     0
       0     0

Page 39: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Coordinate Descent

Q^T    0.1   0.4  -0.1   0.2
       0.6   0.7   0.8   0.5

P      0.2  -0.1
       0.1  -0.4
      -0.3   0.1
       0.5  -0.6

Page 40: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Coordinate Descent

• Complexity of naive solution: $O(IKNM)$

• Ridge Regression with Coordinate Descent computes the features directly from the examples, so the covariance precomputation trick cannot be applied here.

Page 41: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Coordinate Descent Improvement

• Synthetic examples (Pilászy, Zibriczky, Tikk)

• Solution of Ridge Regression with CD: $p_{uk} = \frac{\sum_{i=1}^{M} c_{ui} q_{ik} e_{ui}}{\sum_{i=1}^{M} c_{ui} q_{ik} q_{ik}} = \frac{SQE}{SQQ}$

• Calculate the statistics for a user who watched nothing ($SQE_0$ and $SQQ_0$)

• The solution is calculated incrementally: $p_{uk} = \frac{SQE}{SQQ} = \frac{SQE_0 + SQE_+}{SQQ_0 + SQQ_+}$ ($M + \#P(u)^+$ steps)

• Eigenvalue decomposition: $Q^T Q = S \Lambda S^T = (\sqrt{\Lambda} S^T)^T (\sqrt{\Lambda} S^T) = G^T G$

• Zero examples are compressed to synthetic examples: $Q_{M \times K} \to G_{K \times K}$

• $SGG_0 = SQQ_0$, but it needs only $K$ steps to compute: $p_{uk} = \frac{SGE_0 + SQE_+}{SGG_0 + SQQ_+}$ ($K + \#P(u)^+$ steps)

• $SGE_0$ is calculated the same way as $SQE_0$, but using only $K$ steps.

• Complexity: $O(IK(E + KM + KN)) = O(IK(E + K(M + N)))$
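The compression of zero examples into K synthetic examples can be checked with a small sketch. Note the substitution: instead of the eigenvalue decomposition named on the slide, this sketch uses a Cholesky factor, which is simply another matrix G with G^T G = Q^T Q and is easier to write without a linear-algebra library. The helper names (`gram`, `cholesky_upper`, `zero_stats`) and the vector `p` are hypothetical.

```python
import math


def gram(X):
    """X^T X for a matrix given as a list of rows."""
    K = len(X[0])
    return [[sum(x[a] * x[b] for x in X) for b in range(K)] for a in range(K)]


def cholesky_upper(A):
    """Upper-triangular G with G^T G = A (A symmetric positive definite)."""
    K = len(A)
    L = [[0.0] * K for _ in range(K)]
    for a in range(K):
        for b in range(a + 1):
            s = sum(L[a][j] * L[b][j] for j in range(b))
            if a == b:
                L[a][a] = math.sqrt(A[a][a] - s)
            else:
                L[a][b] = (A[a][b] - s) / L[b][b]
    return [[L[b][a] for b in range(K)] for a in range(K)]  # G = L^T


def zero_stats(X, p, k):
    """SQE0 and SQQ0 for feature k when every target is 0 and every confidence is 1."""
    sqe = sqq = 0.0
    for x in X:
        # Residual of a zero example with feature k's contribution left out.
        e = 0.0 - sum(p[l] * x[l] for l in range(len(p)) if l != k)
        sqe += x[k] * e
        sqq += x[k] * x[k]
    return sqe, sqq


Q = [[0.9, 0.6, -0.1], [-0.4, 0.7, -0.4], [0.8, -0.7, -0.1], [0.6, -0.2, 0.6]]
G = cholesky_upper(gram(Q))     # K = 3 synthetic examples
p = [0.3, -0.1, 0.2]            # made-up current user vector

full = zero_stats(Q, p, 1)       # loops over all M = 4 items
synthetic = zero_stats(G, p, 1)  # loops over K = 3 synthetic rows
```

Both calls return the same pair of statistics, because they depend on the item features only through Q^T Q. With M in the millions and K in the tens or hundreds, the second loop is the one that stays cheap.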

Page 42: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer – Coordinate Descent

• Complexity of naive solution: $O(IKNM)$

• Ridge Regression with Coordinate Descent computes the features directly from the examples, so the covariance precomputation trick cannot be applied here.

• Synthetic examples

• Codename: IALS1

• Complexity reduction (IALS → IALS1): $O(IK(E + K(M + N)))$

• IALS1 requires a higher $K$ than IALS for the same accuracy.

Page 43: Fast ALS-Based Matrix Factorization for Recommender Systems

Optimizer โ€“ Coordinate Descent

...does it work in practice?


Page 44: Fast ALS-Based Matrix Factorization for Recommender Systems

Comparison

• Average Rank Position (ARP) on a subset of a proprietary implicit feedback dataset. Lower is better.

• IALS1 offers better time-accuracy tradeoffs, especially when $K$ is large.

         IALS                 IALS1
K        ARP      time (s)    ARP      time (s)
5        0.1903        153    0.1898        112
10       0.1578        254    0.1588        134
20       0.1427        644    0.1432        209
50       0.1334       2862    0.1344        525
100      0.1314      11441    0.1325       1361
250      0.1311      92944    0.1312       6651
500      N/A           N/A    0.1282      24697
1000     N/A           N/A    0.1242     104611

[Figure: ARP (0.120 to 0.155) versus training time in seconds (axis ticks 100, 1000, 10000, 100000) for IALS and IALS1]

Page 45: Fast ALS-Based Matrix Factorization for Recommender Systems

Conclusion

• Explicit feedback is rarely provided, or not at all.

• Implicit feedback is more general.

• Alternating Least Squares has complexity issues.

• Efficient solution by using approximation and synthetic examples.

• IALS1 offers better time-accuracy tradeoffs, especially when $K$ is large.

• IALS is an approximation algorithm too, so why not make it even more approximate?

Page 46: Fast ALS-Based Matrix Factorization for Recommender Systems

Other algorithms

Page 47: Fast ALS-Based Matrix Factorization for Recommender Systems

Model – Tensor Factorization

• Different preferences during the day

• Time period 1: 06:00-14:00

R1 Item1 Item2 Item3 …
User1 1 …
User2 1 …
User3 …
… … … … …

Page 48: Fast ALS-Based Matrix Factorization for Recommender Systems

Model – Tensor Factorization

• Different preferences during the day

• Time period 2: 14:00-22:00

R1 Item1 Item2 Item3 …
User1 1 …
User2 1 0 …
User3 …
… … … … …

R2 Item1 Item2 Item3 …
User1 1 …
User2 1 …
User3 1 …
… … … … …

Page 49: Fast ALS-Based Matrix Factorization for Recommender Systems

Model – Tensor Factorization

• Different preferences during the day

• Time period 3: 22:00-06:00

R1 Item1 Item2 Item3 …
User1 1 …
User2 1 0 …
User3 …
… … … … …

R2 Item1 Item2 Item3 …
User1 0 1 …
User2 1 …
User3 1 …
… … … … …

R3 Item1 Item2 Item3 …
User1 1 …
User2 …
User3 1 1 …
… … … … …

Page 50: Fast ALS-Based Matrix Factorization for Recommender Systems

Model – Tensor Factorization

R1 Item1 Item2 Item3 …
User1 1 …
User2 1 0 …
User3 …
… … … … …

R2 Item1 Item2 Item3 …
User1 0 1 …
User2 1 …
User3 1 …
… … … … …

R3 Item1 Item2 Item3 …
User1 …
User2 $r_{uit}$ …
User3 …
… … … … …

Q^T    q11  q21  q31  …
       q12  q22  q32  …

P      p11  p12
       p21  p22
       p31  p32
       …    …

T      t11  t12
       t21  t22
       t31  t32

$R_{N \times M \times L}$: preference tensor (one $N \times M$ preference matrix per time period)

$P_{N \times K}$: user feature matrix

$Q_{M \times K}$: item feature matrix

$T_{L \times K}$: time feature matrix

$N$: #users

$M$: #items

$L$: #time periods

$K$: #features

$\hat{r}_{uit} = \sum_{k} p_{uk} q_{ik} t_{tk}$

$\hat{R} = P \circ Q \circ T$

Page 51: Fast ALS-Based Matrix Factorization for Recommender Systems

Comparison – ITALS vs. IALS

• Data sets: Netflix Rating 5, IPTV Provider VOD rental, Grocery buys

• Evaluation Metric: Recall@20, Precision-Recall@20

• Number of features: 20

Test case (20)        IALS    ITALS
Netflix Probe         0.087   0.097
Netflix Time Split    0.054   0.071
IPTV VOD 1day         0.063   0.112
IPTV VOD 1week        0.055   0.100
Grocery               0.065   0.103

Page 52: Fast ALS-Based Matrix Factorization for Recommender Systems

Comparison โ€“ ITALS vs. IALS


Page 53: Fast ALS-Based Matrix Factorization for Recommender Systems

Objective Function – Ranking-based objective function

• Ranking-based objective function approach:

$f(\theta) = \sum_{u \in U} \sum_{i \in I} c_{ui} \sum_{j \in I} s_j [(r_{ui} - r_{uj}) - (\hat{r}_{ui} - \hat{r}_{uj})]^2$

• $r_{ui} - r_{uj}$: difference of preference between items $i$ and $j$

• $\hat{r}_{ui} - \hat{r}_{uj}$: estimated difference of preference between items $i$ and $j$

• $s_j$: importance of item $j$ in the objective function

• Model: Matrix Factorization

• Optimizer: Alternating Least Squares

• Name: RankALS

Page 54: Fast ALS-Based Matrix Factorization for Recommender Systems

Comparison โ€“ RankIALS vs. IALS


Page 55: Fast ALS-Based Matrix Factorization for Recommender Systems

Comparison โ€“ RankIALS vs. IALS


Page 56: Fast ALS-Based Matrix Factorization for Recommender Systems

Related Publications

• Alternating Least Squares with Coordinate Descent

I. Pilászy, D. Zibriczky, D. Tikk: Fast ALS-based matrix factorization for explicit and implicit feedback datasets. RecSys 2010

• Tensor Factorization

B. Hidasi, D. Tikk: Fast ALS-Based Tensor Factorization for Context-Aware Recommendation from Implicit Feedback. ECML PKDD 2012

• Personalized Ranking

G. Takács, D. Tikk: Alternating least squares for personalized ranking. RecSys 2012

• IPTV Case Study

D. Zibriczky, B. Hidasi, Z. Petres, D. Tikk: Personalized recommendation of linear content on interactive TV platforms: beating the cold start and noisy implicit user feedback. TVMMP @ UMAP 2012

