23
Politecnico di Milano Team Pumpkin-Pie 21-11-2016 RecSysChallenge 2016

RecSysChallenge 2016policloud.polimi.it/wp-content/uploads/2016/12/RecSys... · 2016. 12. 19. · 21-11-2016 Politecnico di Milano Team Pumpkin-Pie RecSysChallenge 2016 International

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: RecSysChallenge 2016policloud.polimi.it/wp-content/uploads/2016/12/RecSys... · 2016. 12. 19. · 21-11-2016 Politecnico di Milano Team Pumpkin-Pie RecSysChallenge 2016 International

Politecnico di Milano Team Pumpkin-Pie21-11-2016

RecSysChallenge

2016

Page 2: RecSysChallenge 2016policloud.polimi.it/wp-content/uploads/2016/12/RecSys... · 2016. 12. 19. · 21-11-2016 Politecnico di Milano Team Pumpkin-Pie RecSysChallenge 2016 International

Politecnico di Milano Team Pumpkin-Pie21-11-2016

RecSysChallenge 2016International competition organized for the

10° RecSys Conference, held in Boston (at

MIT) in September 2016

Focus on job recommendation problem:

Given a XING website user, predict those job

postings the user will positively interact with

in the near future (one week)

Page 3: RecSysChallenge 2016policloud.polimi.it/wp-content/uploads/2016/12/RecSys... · 2016. 12. 19. · 21-11-2016 Politecnico di Milano Team Pumpkin-Pie RecSysChallenge 2016 International

Politecnico di Milano Team Pumpkin-Pie21-11-2016

Team Pumpkin-Pie

T. Carpi

M. Edemanti

E. Kamberoski

E. Sacchi

R. Pagano

M. Quadrana

Page 4: RecSysChallenge 2016policloud.polimi.it/wp-content/uploads/2016/12/RecSys... · 2016. 12. 19. · 21-11-2016 Politecnico di Milano Team Pumpkin-Pie RecSysChallenge 2016 International

Politecnico di Milano Team Pumpkin-Pie21-11-2016

Dataset

• 1 500 000 users

• 1 300 000 items

• 8 850 000 interactions

• 10 350 000 rows of impressions

• Make prediction for 150 000 users

Total size of the dataset alone: about 10 Gb!

Page 5: RecSysChallenge 2016policloud.polimi.it/wp-content/uploads/2016/12/RecSys... · 2016. 12. 19. · 21-11-2016 Politecnico di Milano Team Pumpkin-Pie RecSysChallenge 2016 International

Politecnico di Milano Team Pumpkin-Pie21-11-2016

Computational Resources

From PoliCloud:

• 44 cores

• 94 Gb ram

• 760 Gb storage

• 6 VM

From TU Delft:

• 64 cores

• 128 Gb ram

• 100 Gb storage

• 1 VM

Page 6: RecSysChallenge 2016policloud.polimi.it/wp-content/uploads/2016/12/RecSys... · 2016. 12. 19. · 21-11-2016 Politecnico di Milano Team Pumpkin-Pie RecSysChallenge 2016 International

Politecnico di Milano Team Pumpkin-Pie21-11-2016

Mim-Solutions

Impress TV

Alibaba

PumpkinPie

Page 7: RecSysChallenge 2016policloud.polimi.it/wp-content/uploads/2016/12/RecSys... · 2016. 12. 19. · 21-11-2016 Politecnico di Milano Team Pumpkin-Pie RecSysChallenge 2016 International

Politecnico di Milano Team Pumpkin-Pie21-11-2016

Page 8: RecSysChallenge 2016policloud.polimi.it/wp-content/uploads/2016/12/RecSys... · 2016. 12. 19. · 21-11-2016 Politecnico di Milano Team Pumpkin-Pie RecSysChallenge 2016 International

Politecnico di Milano Team Pumpkin-Pie21-11-2016

The multi-stack ensemble

Page 9: RecSysChallenge 2016policloud.polimi.it/wp-content/uploads/2016/12/RecSys... · 2016. 12. 19. · 21-11-2016 Politecnico di Milano Team Pumpkin-Pie RecSysChallenge 2016 International

Politecnico di Milano Team Pumpkin-Pie21-11-2016

Multi-Stack Ensemble

InteractionsImpressions

UBCF IntInt UBCF IntImp UBCF ImpImpUBCF ImpInt IBCF IntInt IBCF ImpImp KBUIS KBIS

Baseline

Final Ensemble

EnsCF+CB

Ens CF Ens CB

Page 10: RecSysChallenge 2016policloud.polimi.it/wp-content/uploads/2016/12/RecSys... · 2016. 12. 19. · 21-11-2016 Politecnico di Milano Team Pumpkin-Pie RecSysChallenge 2016 International

Politecnico di Milano Team Pumpkin-Pie21-11-2016

UBCF IntInt UBCF IntImp UBCF ImpImpUBCF ImpInt IBCF IntInt IBCF ImpImp KBUIS KBIS

Ens CF Ens CB

Multi-Stack Ensemble

Page 11: RecSysChallenge 2016policloud.polimi.it/wp-content/uploads/2016/12/RecSys... · 2016. 12. 19. · 21-11-2016 Politecnico di Milano Team Pumpkin-Pie RecSysChallenge 2016 International

Politecnico di Milano Team Pumpkin-Pie21-11-2016

UBCF IntInt UBCF IntImp UBCF ImpImpUBCF ImpInt IBCF IntInt IBCF ImpImp KBUIS KBIS

EnsCF+CB

Ens CF Ens CB

Multi-Stack Ensemble

Page 12: RecSysChallenge 2016policloud.polimi.it/wp-content/uploads/2016/12/RecSys... · 2016. 12. 19. · 21-11-2016 Politecnico di Milano Team Pumpkin-Pie RecSysChallenge 2016 International

Politecnico di Milano Team Pumpkin-Pie21-11-2016

InteractionsImpressions

UBCF IntInt UBCF IntImp UBCF ImpImpUBCF ImpInt IBCF IntInt IBCF ImpImp KBUIS KBIS

Baseline

Final Ensemble

EnsCF+CB

Ens CF Ens CB

Multi-Stack Ensemble

Page 13: RecSysChallenge 2016policloud.polimi.it/wp-content/uploads/2016/12/RecSys... · 2016. 12. 19. · 21-11-2016 Politecnico di Milano Team Pumpkin-Pie RecSysChallenge 2016 International

Politecnico di Milano Team Pumpkin-Pie21-11-2016

2-Step Algorithm

Voting-Based Technique

Reduce Function

+

Page 14: RecSysChallenge 2016policloud.polimi.it/wp-content/uploads/2016/12/RecSys... · 2016. 12. 19. · 21-11-2016 Politecnico di Milano Team Pumpkin-Pie RecSysChallenge 2016 International

Politecnico di Milano Team Pumpkin-Pie21-11-2016

Linear Ensemble

weight for algorithm a

rank of item i in algorithm a

decay for algorithm a

Page 15: RecSysChallenge 2016policloud.polimi.it/wp-content/uploads/2016/12/RecSys... · 2016. 12. 19. · 21-11-2016 Politecnico di Milano Team Pumpkin-Pie RecSysChallenge 2016 International

Politecnico di Milano Team Pumpkin-Pie21-11-2016

red 1.9990

pink 1.9980

yellow 1.9970

blue 1.9960

green 1.9985

orange 1.9970

purple 1.9955

red 1.9940

Algorithm A

Weight 2

Decay 0.001

Algorithm B

Weight 2

Decay 0.0015

Linear Ensemble

red 3.9930

1

2

3

4

1

2

3

4

Page 16: RecSysChallenge 2016policloud.polimi.it/wp-content/uploads/2016/12/RecSys... · 2016. 12. 19. · 21-11-2016 Politecnico di Milano Team Pumpkin-Pie RecSysChallenge 2016 International

Politecnico di Milano Team Pumpkin-Pie21-11-2016

Evaluation-Score Ensemble

local test score

# of elements

Page 17: RecSysChallenge 2016policloud.polimi.it/wp-content/uploads/2016/12/RecSys... · 2016. 12. 19. · 21-11-2016 Politecnico di Milano Team Pumpkin-Pie RecSysChallenge 2016 International

Politecnico di Milano Team Pumpkin-Pie21-11-2016

Evaluation-Score Ensemble

red 7.5660

pink 7.5660

purple 5.5660

orange 5.5660

yellow 9.4575

green 9.4575

blue 6.9575

grey 6.9575

Algorithm A

l_a 200k

n_a 1 Mln

Weight 0.2

Algorithm B

l_b 200k

n_b 800k

Weight 0.25

Page 18: RecSysChallenge 2016policloud.polimi.it/wp-content/uploads/2016/12/RecSys... · 2016. 12. 19. · 21-11-2016 Politecnico di Milano Team Pumpkin-Pie RecSysChallenge 2016 International

Politecnico di Milano Team Pumpkin-Pie21-11-2016

Algorithms

Page 19: RecSysChallenge 2016policloud.polimi.it/wp-content/uploads/2016/12/RecSys... · 2016. 12. 19. · 21-11-2016 Politecnico di Milano Team Pumpkin-Pie RecSysChallenge 2016 International

Politecnico di Milano Team Pumpkin-Pie21-11-2016

Collaborative-Filtering

IDF value as a rate for the job

Page 20: RecSysChallenge 2016policloud.polimi.it/wp-content/uploads/2016/12/RecSys... · 2016. 12. 19. · 21-11-2016 Politecnico di Milano Team Pumpkin-Pie RecSysChallenge 2016 International

Politecnico di Milano Team Pumpkin-Pie21-11-2016

Collaborative-Filtering

User-based e Item-based

Page 21: RecSysChallenge 2016policloud.polimi.it/wp-content/uploads/2016/12/RecSys... · 2016. 12. 19. · 21-11-2016 Politecnico di Milano Team Pumpkin-Pie RecSysChallenge 2016 International

Politecnico di Milano Team Pumpkin-Pie21-11-2016

Content-Based

Concept-based joint User-Item similarity

Page 22: RecSysChallenge 2016policloud.polimi.it/wp-content/uploads/2016/12/RecSys... · 2016. 12. 19. · 21-11-2016 Politecnico di Milano Team Pumpkin-Pie RecSysChallenge 2016 International

Politecnico di Milano Team Pumpkin-Pie21-11-2016

Interactions &

Impressions

Already clicked jobs are likely to

be clicked again

ReplyBookm

ark

Click

Page 23: RecSysChallenge 2016policloud.polimi.it/wp-content/uploads/2016/12/RecSys... · 2016. 12. 19. · 21-11-2016 Politecnico di Milano Team Pumpkin-Pie RecSysChallenge 2016 International

Politecnico di Milano Team Pumpkin-Pie21-11-2016

Interactions &

Impressions

A B C D E F

B C DA

B D AC

Dataset

Filtering-step

Reordering-step