Personalizing Session-based Recommendations with Hierarchical Recurrent Neural Networks
Massimo Quadrana (Politecnico di Milano), Alexandros Karatzoglou (Telefonica R&D), Balázs Hidasi (Gravity R&D), Paolo Cremonesi (Politecnico di Milano)
29/08/2017, Como
Traditional session-based recommendation
[Figure: sessions over time from anonymous users (Anonym1, Anonym2, Anonym3), with topics Soccer, Cartoons, NBA]
Personalized session-based recommendation
[Figure: sessions over time from identified users (User1, User2), with topics Soccer, Cartoons, NBA, Cupcakes; User1's history is annotated "Sports!"]
Research question
How can we effectively combine the long-term (historical) preferences of the user with her short-term (session) intent?
Naïve solution: concatenation
• Whole user history as a single sequence
• Trivial implementation but limited effectiveness
[Figure: User1's sessions (Session 1, Session 2, …, Session N) laid out as one sequence and processed by a chain of RNN steps]
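A minimal sketch of what concatenation changes at the data level, assuming next-item training pairs are built from consecutive events. The helper `next_item_pairs` and the toy item ids are hypothetical and only serve as illustration.

```python
def next_item_pairs(sequence):
    """Return (input, target) pairs for next-item prediction."""
    return list(zip(sequence[:-1], sequence[1:]))


# Toy history of one user: two sessions (hypothetical item ids).
sessions = [["a", "b", "c"], ["d", "e"]]

# Session-based view: pairs never cross a session boundary.
per_session = [pair for s in sessions for pair in next_item_pairs(s)]

# Concatenated view: the whole history is a single sequence.
flat_history = [item for s in sessions for item in s]
concatenated = next_item_pairs(flat_history)

# Pairs that exist only because sessions were glued together.
cross_session = [pair for pair in concatenated if pair not in per_session]
print(per_session)
print(concatenated)
print(cross_session)
```

In this toy case the only difference is the pair that links the last event of one session to the first event of the next, so a model trained this way sees session boundaries as ordinary transitions.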
Hierarchical RNN
• Decouple user and session representations
  • User RNN ($GRU_{usr}$)
    • Relays and evolves the user latent state across sessions
  • Session RNN ($GRU_{ses}$)
    • Generates personalized session-based recommendations
• Seamlessly personalize the Session RNN with cross-session information transfer
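Architecture
[Figure: User1, Session 1. Input item ids $i_{1,1}, i_{1,2}, i_{1,3}$ feed the session-level representation (initial state $s_{1,0}$, final state $s_1$), which predicts $i_{1,2}, i_{1,3}, i_{1,4}$; the user-level state starts at $c_0$]
$GRU_{ses}$ (first session only):
• Initialization: $s_{1,0} = 0$
• Update: $s_{1,n} = GRU_{ses}(i_{1,n}, s_{1,n-1})$
$GRU_{usr}$:
• Initialization: $c_0 = 0$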
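Architecture
[Figure: User1, Session 1. At the end of the session, the user-level representation is updated from $c_0$ to $c_1$ using the final session-level representation $s_1$]
$GRU_{usr}$:
• Update: $c_m = GRU_{usr}(s_m, c_{m-1})$, where $s_m$ is the last session state and $c_{m-1}$ is the previous user state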
Architecture - HRNN Init
[Figure: User1, Session 1 and Session 2. The user-level representation $c_1$ is used to initialize the session-level state of Session 2 (session initialization)]
$GRU_{ses}$ (from the 2nd session on):
• Initialization: $s_{m,0} = \tanh(W_{init} c_{m-1} + b_{init})$
• Update: $s_{m,n} = GRU_{ses}(i_{m,n}, s_{m,n-1})$
Architecture - HRNN All
[Figure: User1, Session 1 and Session 2. The user-level representation $c_1$ initializes the session-level state of Session 2 (session initialization) and is also fed to every step of that session (user representation propagation)]
$GRU_{ses}$ (from the 2nd session on):
• Initialization: $s_{m,0} = \tanh(W_{init} c_{m-1} + b_{init})$
• Update: $s_{m,n} = GRU_{ses}(i_{m,n}, s_{m,n-1}, c_{m-1})$
Architecture - Complete
[Figure: complete architecture for User1 over Session 1 and Session 2: input item ids, session-level representations $s_1, s_2$ with their predictions, user-level representations $c_0, c_1, c_2$, session initialization, and user representation propagation]
Two identical sessions from different users will produce different recommendations
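The update rules of the HRNN Init variant can be sketched in plain Python as follows. This is an illustrative toy, not the authors' implementation: it assumes one-hot item inputs, a bias-free GRU cell, equal user and session hidden sizes, and random untrained weights. The names `GRUCell`, `hrnn_init_states`, `rand_matrix`, and all sizes are hypothetical.

```python
import math
import random


def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


class GRUCell:
    """Bias-free GRU cell: h' = (1 - z) * h + z * candidate."""

    def __init__(self, n_in, n_hid, rng):
        self.W = {g: rand_matrix(n_hid, n_in, rng) for g in "zrh"}
        self.U = {g: rand_matrix(n_hid, n_hid, rng) for g in "zrh"}

    def __call__(self, x, h):
        z = [sigmoid(a + b) for a, b in zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        rh = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a + b) for a, b in zip(matvec(self.W["h"], x), matvec(self.U["h"], rh))]
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def one_hot(item, n_items):
    return [1.0 if j == item else 0.0 for j in range(n_items)]


def hrnn_init_states(sessions, gru_ses, gru_usr, W_init, b_init, n_items):
    """Return the final session state s_m for every session of one user."""
    n_hid = len(b_init)
    c = [0.0] * n_hid                      # c_0 = 0
    finals = []
    for m, session in enumerate(sessions):
        if m == 0:
            s = [0.0] * n_hid              # s_{1,0} = 0
        else:                              # s_{m,0} = tanh(W_init c_{m-1} + b_init)
            s = [math.tanh(a + b) for a, b in zip(matvec(W_init, c), b_init)]
        for item in session:               # s_{m,n} = GRU_ses(i_{m,n}, s_{m,n-1})
            s = gru_ses(one_hot(item, n_items), s)
        c = gru_usr(s, c)                  # c_m = GRU_usr(s_m, c_{m-1})
        finals.append(s)
    return finals


rng = random.Random(0)
N_ITEMS, N_HID = 6, 4
gru_ses = GRUCell(N_ITEMS, N_HID, rng)
gru_usr = GRUCell(N_HID, N_HID, rng)
W_init = rand_matrix(N_HID, N_HID, rng)
b_init = [0.0] * N_HID

# Two users with different histories but an identical current session [3, 4].
user_a = [[0, 1, 2], [3, 4]]
user_b = [[5, 5, 1], [3, 4]]
state_a = hrnn_init_states(user_a, gru_ses, gru_usr, W_init, b_init, N_ITEMS)[-1]
state_b = hrnn_init_states(user_b, gru_ses, gru_usr, W_init, b_init, N_ITEMS)[-1]
print(state_a)
print(state_b)
```

Because the session state of the second session starts from the user state, the two final states differ even though the current sessions are identical. An HRNN All variant would additionally pass $c_{m-1}$ to the session GRU at every step; that is not shown here.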
Training
• Based on GRU4Rec [Hidasi et al., 2016]
  • Gated Recurrent Units
  • Ranking losses (cross-entropy, BPR, TOP1)
  • Output sampling
  • Dropout regularization
  • Adagrad w/ momentum
• User-parallel mini-batches
[Figure: user-parallel mini-batches. The sessions of User 1, User 2, and User 3 (Session 1, Session 2, …) are laid out side by side; each mini-batch (Mini-batch 1, Mini-batch 2, …) takes the current item of each user as Input and the next item of the same session as Output]
Hidasi B., Karatzoglou A., Baltrunas L. and Tikk D. Session-based recommendations with recurrent neural networks. International Conference on Learning Representations, 2016.
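One plausible reading of user-parallel mini-batches is sketched below: each batch slot follows one user through her sessions in order, and a new user takes over the slot when the previous one is exhausted. The generator `user_parallel_batches`, its flag outputs, and the toy data are assumptions made for illustration. The sketch also assumes every session has at least two events and simply stops when a slot cannot be refilled.

```python
def user_parallel_batches(histories, batch_size):
    """Yield (inputs, targets, new_session, new_user) lists, one entry per slot.

    histories: list of (user_id, sessions); each session is a list of item ids
    with at least two events.
    """
    if len(histories) < batch_size:
        raise ValueError("need at least batch_size users")
    user_idx = list(range(batch_size))   # user currently held by each slot
    sess_idx = [0] * batch_size          # session index within that user
    pos = [0] * batch_size               # event position within that session
    next_user = batch_size
    while True:
        inputs, targets, new_session, new_user = [], [], [], []
        for b in range(batch_size):
            session = histories[user_idx[b]][1][sess_idx[b]]
            inputs.append(session[pos[b]])
            targets.append(session[pos[b] + 1])
            # Flags tell the model when to re-initialize its states.
            new_session.append(pos[b] == 0)
            new_user.append(pos[b] == 0 and sess_idx[b] == 0)
        yield inputs, targets, new_session, new_user
        for b in range(batch_size):
            pos[b] += 1
            session = histories[user_idx[b]][1][sess_idx[b]]
            if pos[b] + 1 < len(session):
                continue                 # still inside the same session
            pos[b] = 0
            sess_idx[b] += 1             # move to the user's next session
            if sess_idx[b] < len(histories[user_idx[b]][1]):
                continue
            if next_user == len(histories):
                return                   # no user left to refill this slot
            user_idx[b] = next_user      # the next user takes over the slot
            sess_idx[b] = 0
            next_user += 1


histories = [
    ("u1", [[1, 2, 3, 4], [5, 6, 7]]),
    ("u2", [[8, 9]]),
    ("u3", [[10, 11], [12, 13]]),
]
batches = list(user_parallel_batches(histories, batch_size=2))
for batch in batches:
    print(batch)
```

In a hierarchical model, a `new_session` flag would trigger the session-state initialization from the user state, and a `new_user` flag would reset the user state to zero. Stopping at the first slot that cannot be refilled keeps the batch width constant at the cost of dropping the remaining events of the other slots.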
Experiments
• Datasets
  • Job postings (XING)
    • “Sessionized” RecSys Challenge 2016 dataset (30-minute idle threshold)
    • Repetitions and “delete” interactions removed
    • 11k users, 60k items (min. 5 sessions/user, 20 events/item)
    • Training/Test: 78k sessions (488k events) / 11k sessions (58k events)
  • Online video site (VIDEO)
    • 13k users, 20k items (min. 5 sessions/user, 10 events/item)
    • Training/Test: 120k sessions (745k events) / 13k sessions (78k events)
Evaluation
• Methods:
  • Personalized Popularity (PPOP)
  • Co-occurrence Item-kNN
  • Session-based RNN (RNN)
  • RNN on concatenated sessions (RNN Concat)
  • Hierarchical RNNs
    • HRNN Init (initialization only)
    • HRNN All (initialization + propagation in input)
• Procedure
  • Sequential next-item prediction (Recall / Precision / MRR @5)
  • RNNs: average over 10 iterations with different random seeds
  • Bootstrapped evaluation (RNN Concat and HRNNs)
  • Discarded the first prediction of each session made by RNN Concat
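For reference, Recall@k and MRR@k for next-item prediction can be computed as in the sketch below, assuming one ground-truth next item per prediction and a ranked list of recommended item ids. The function name `recall_mrr_at_k` and the toy rankings are hypothetical.

```python
def recall_mrr_at_k(ranked_lists, targets, k=5):
    """Average Recall@k and MRR@k over a list of next-item predictions."""
    hits, reciprocal_ranks = 0, 0.0
    for ranked, target in zip(ranked_lists, targets):
        top_k = ranked[:k]
        if target in top_k:
            hits += 1
            reciprocal_ranks += 1.0 / (top_k.index(target) + 1)
    n = len(targets)
    return hits / n, reciprocal_ranks / n


# Three predictions with toy item ids, best-ranked item first.
ranked_lists = [
    [3, 1, 2, 7, 9, 4],   # target 1 is at rank 2
    [5, 6, 8, 2, 1, 0],   # target 0 is at rank 6, outside the top 5
    [4, 2, 9, 3, 7, 1],   # target 4 is at rank 1
]
targets = [1, 0, 4]
recall, mrr = recall_mrr_at_k(ranked_lists, targets, k=5)
print(recall, mrr)
```

A target ranked below position k contributes zero to both metrics, which is why MRR@k can never exceed Recall@k.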
Overall results - XING
Method       # Hidden units   Recall@5   MRR@5
Item-kNN     -                0.0697     0.0406
PPOP         -                0.1326     0.0939
RNN          500              0.1317     0.0796
RNN Concat   500              0.1467     0.0878
HRNN All     500+500          0.1482     0.0925
HRNN Init    500+500          0.1473     0.0901
• PPOP is a strong baseline due to repetitiveness across sessions
• Only personalized models work
• +11% Recall vs RNN / PPOP (HRNN All)
• Comparable MRR to PPOP (HRNN All)
• No significant difference between HRNNs
Overall results - VIDEO
Method       # Hidden units   Recall@5   MRR@5
Item-kNN     -                0.4192     0.2916
PPOP         -                0.3887     0.3031
RNN          500              0.5551     0.3886
RNN Concat   500              0.5582     0.4333
HRNN All     500+500          0.5191     0.3877
HRNN Init    500+500          0.5947     0.4433
• RNN-based models outperform all baselines significantly
• HRNN Init outperforms all baselines
• +7% Recall vs RNN & RNN Concat (HRNN Init)
• +19% / +2% MRR vs RNN / RNN Concat (HRNN Init)
• HRNNs differ significantly
• Forced propagation degrades performance of HRNN All
In-depth analysis
• History length
  • # Sessions in the user profile
  • Short: ≤ 6 sessions
  • Long: > 6 sessions
• Position within session
  • Beginning [1, 2] - Middle [3, 4] - End (4, Inf)
  • Only sessions with length > 4
  • 6,736 sessions XING
  • 8,254 sessions VIDEO
History length   XING   VIDEO
Short            67%    54%
Long             33%    46%
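The two groupings used in this analysis can be expressed as small helper functions. This is a sketch under stated assumptions: positions are 1-based, and End is taken to mean every event after the 4th. The function names are hypothetical.

```python
def history_bucket(n_sessions):
    """Group a user by the number of sessions in her profile."""
    return "Short" if n_sessions <= 6 else "Long"


def position_bucket(position):
    """Group an event by its 1-based position within the session."""
    if position <= 2:
        return "Beginning"
    if position <= 4:
        return "Middle"
    return "End"  # assumption: everything after the 4th event


print([history_bucket(n) for n in (3, 6, 7)])
print([position_bucket(p) for p in range(1, 7)])
```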
History length - XING
[Figure: two bar charts, Recall and MRR, by history length (Short, Long) for RNN, RNN Concat, HRNN All, HRNN Init. Recall bar labels, in source order: 0.1304, 0.1355, 0.1449, 0.1504, 0.1464, 0.1518, 0.1457, 0.1505. MRR bar labels, in source order: 0.0860, 0.0824, 0.0932, 0.0916, 0.0985, 0.0957, 0.0968, 0.0929]
History length - VIDEO
[Figure: two bar charts, Recall and MRR, by history length (Short, Long) for RNN, RNN Concat, HRNN All, HRNN Init. Recall bar labels, in source order: 0.4999, 0.5167, 0.4770, 0.5025, 0.4753, 0.4763, 0.5249, 0.5535. MRR bar labels, in source order: 0.3388, 0.3306, 0.3491, 0.3658, 0.3440, 0.3308, 0.3820, 0.3954]
Analysis within session - XING
[Figure: two charts, Recall and MRR, by position within session (Beginning, Middle, End) for RNN, RNN Concat, HRNN All, HRNN Init]
Session evolution - VIDEO
[Figure: two charts, Recall and MRR, by position within session (Beginning, Middle, End) for RNN, RNN Concat, HRNN All, HRNN Init]
Experiments with a large dataset
• Validate HRNNs' effectiveness on a large dataset (VIDEOXXL)
  • 810k users, 380k videos, 8.5M sessions, 33M events
  • Evaluation on top-50k popular items only
  • HRNN Init: +28% Recall / +41% MRR over RNN
VIDEOXXL
Method       # Hidden units   Recall@5   MRR@5
RNN          100              0.3415     0.2314
RNN Concat   100              0.3459     0.2368
HRNN All     100+100          0.3621     0.2658
HRNN Init    100+100          0.4362     0.3261
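The relative gains quoted for VIDEOXXL follow directly from the table values. A short check, using a hypothetical helper `relative_gain`:

```python
def relative_gain(new, base):
    """Relative improvement of `new` over `base`, as a fraction."""
    return (new - base) / base


# Recall@5 and MRR@5 from the VIDEOXXL table.
rnn_recall, rnn_mrr = 0.3415, 0.2314
hrnn_init_recall, hrnn_init_mrr = 0.4362, 0.3261

recall_gain = relative_gain(hrnn_init_recall, rnn_recall)
mrr_gain = relative_gain(hrnn_init_mrr, rnn_mrr)
print(f"Recall@5: +{recall_gain:.0%}, MRR@5: +{mrr_gain:.0%}")
```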
Summary
• Cross-session knowledge transfer works!
• Naïve concatenation is only partially effective
• Both HRNN variants work well when personalization is “easy”
• HRNN Init is significantly more effective in complex scenarios
Future work
• Attention models
• Multimodal models (user/item features) [Hidasi et al., 2016]
• Enhanced GRU4Rec losses [Hidasi & Karatzoglou, 2017]
• Other domains (music, e-commerce, etc.)
• Code available at https://github.com/mquad/hgru4rec
[Hidasi et al., 2016] Parallel Recurrent Neural Network Architectures for Feature-rich Session-based Recommendations. ACM RecSys 2016.
[Hidasi & Karatzoglou, 2017] Recurrent Neural Networks with Top-k Gains for Session-based Recommendations. arXiv:1706.03847
Thanks! Questions?