The Cinematch System: Operation, Scale, Coverage, Accuracy, Impact
Jim Bennett
9/13/06
What Is Netflix?
• “Connecting people to the movies they love”
• Online DVD movie rental:
– Users subscribe for a fixed fee per month
• Plans define #movies out at once, #turns in a month
– Find, then queue up movies on website
– USPS delivers DVDs within 1 business day in most areas
– Keep as long as you want; no late fees
– Return in pre-paid mailer when done
– Next DVD on your queue sent automatically
• Working on movie delivery over the net
• Choice of 65,000 titles…which ones?
Give Ratings, Get Recommendations
Show Interest, Get Recommendations
Netflix and Cinematch Scale
• 5M active customers
– Ship 1.4M disks per day from 40 locations
• 1.4B ratings since 1997
– 2M ratings per day
– 1B predictions per day
• Item-to-item analysis with many data-conditioning heuristics
• 2 days to retrain on new ratings
• Manual item setup for “coldstart” titles
– Automatically retired
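The slide names only "item-to-item analysis" and does not describe Cinematch's actual algorithm or its data-conditioning heuristics. As a rough illustration of the general technique, here is a minimal sketch of a textbook item-to-item predictor: mean-centered cosine similarity between movies, then a similarity-weighted average of the user's own ratings. The toy ratings, the function names, and the choice to use only positively similar movies are all assumptions made for the example.

```python
import math
from collections import defaultdict

# Hypothetical toy data: user -> {movie: stars on a 1-5 scale}.
ratings = {
    "ann": {"A": 5, "B": 4, "C": 1, "D": 2},
    "bob": {"A": 4, "B": 5, "C": 2, "D": 1},
    "cat": {"A": 1, "B": 2, "C": 5, "D": 5},
    "dan": {"A": 5, "B": 4, "D": 1},
}

def item_vectors(ratings):
    """Invert to movie -> {user: stars minus that user's mean rating}."""
    vectors = defaultdict(dict)
    for user, rated in ratings.items():
        mean = sum(rated.values()) / len(rated)
        for movie, stars in rated.items():
            vectors[movie][user] = stars - mean
    return vectors

def cosine(u, v):
    """Cosine similarity over the users who rated both movies."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[k] * v[k] for k in common)
    norm_u = math.sqrt(sum(u[k] ** 2 for k in common))
    norm_v = math.sqrt(sum(v[k] ** 2 for k in common))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def predict(ratings, user, movie):
    """Similarity-weighted average of the user's ratings on similar movies."""
    vectors = item_vectors(ratings)
    rated = ratings[user]
    num = den = 0.0
    for other, stars in rated.items():
        if other == movie:
            continue
        sim = cosine(vectors[movie], vectors[other])
        if sim > 0:  # assumption: ignore dissimilar movies
            num += sim * stars
            den += sim
    if den == 0:
        # No similar movie rated by this user: fall back to the user's mean.
        return sum(rated.values()) / len(rated)
    return num / den

print(predict(ratings, "dan", "C"))
```

In this toy data the only movie positively similar to C is D, so the prediction for dan on C equals dan's rating of D. A production system at the scale described here would precompute the movie-to-movie similarities offline, which fits the slide's note about a multi-day retraining cycle.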
Cinematch Operation
[Figure: Ratings distribution. Annotations: Wizard of Oz; Gone with the Wind; Netflix starts DVD rentals.]
[Figure: Ratings distribution. Eras: Silent, B&W, Color.]
Predictive Coverage
[Figure: number of titles (0 to 8000) by release year, 1908 to 2001. Series: Total, Predictees. Annotation: 20K predictees (30%).]
Predictable Films by Genre
[Figure: number of films (0 to 6000) per genre. Genres: Music & Musicals, Foreign, Drama, Documentary, Children & Family, Comedy, Television, Classics, Sports, Action & Adventure, Horror, Special Interest, Thrillers, Anime & Animation, Sci-Fi & Fantasy, Romance, Independent, Gay & Lesbian. Series: Popular, Predictable, Total.]
* Popular = top 10K by ratings
Climbing Mount Predictable
[Figure: axes are # movies and # user ratings. Labels: Predictable movies; Shooting stars; 4 and 5 stars; Predictably bad (<3); Predictable.]
Prediction Accuracy
[Figure: Error as user ratings increase. Error in +/- stars (-0.2 to 1.4) by number of user ratings (<=5, <=10, <=20, <=50, <=100, <=200, <=300, <=500, >500). Series: RMSE, MAE, Bias.]
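For reference, the three error series named on this slide (RMSE, MAE, and bias) can be computed as in the sketch below. The standard definitions are used; the sign convention for bias (predicted minus actual, so a negative value means under-prediction) and the sample numbers are assumptions, since the slides do not state them.

```python
import math

def error_metrics(predicted, actual):
    """Return RMSE, MAE and bias (mean signed error) in stars."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    # Assumed sign convention: error = predicted - actual.
    errors = [p - a for p, a in zip(predicted, actual)]
    n = len(errors)
    return {
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        "mae": sum(abs(e) for e in errors) / n,
        "bias": sum(errors) / n,
    }

# Hypothetical predictions and the ratings users actually gave.
predicted = [4.0, 3.0, 5.0, 2.0]
actual = [5, 3, 4, 4]
metrics = error_metrics(predicted, actual)
print(metrics)
```

RMSE penalizes large misses more heavily than MAE, so RMSE is never smaller than MAE on the same data; bias can sit near zero even when both are large, because over- and under-predictions cancel.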
Error by Confidence
[Figure: Error as confidence increases. Error in +/- stars (-0.2 to 1.2) by confidence (Average, 0, 1, 2, 3). Series: RMSE, MAE, Bias.]
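A breakdown like this one can be produced by grouping predictions into buckets and computing the error within each bucket. The sketch below does that for RMSE, assuming each prediction already carries an integer confidence level from 0 to 3. How Cinematch assigns confidence is not described in the slides, so the records here are purely hypothetical.

```python
import math
from collections import defaultdict

def rmse_by_bucket(records):
    """records: iterable of (bucket, predicted, actual) -> {bucket: RMSE}."""
    squared = defaultdict(list)
    for bucket, predicted, actual in records:
        squared[bucket].append((predicted - actual) ** 2)
    return {b: math.sqrt(sum(v) / len(v)) for b, v in sorted(squared.items())}

# Hypothetical (confidence level, predicted stars, actual stars) records.
records = [
    (0, 3.0, 5), (0, 4.0, 2),
    (1, 3.5, 5), (1, 4.0, 3),
    (3, 4.5, 5), (3, 2.0, 2),
]
by_confidence = rmse_by_bucket(records)
for level, rmse in by_confidence.items():
    print(level, round(rmse, 3))
```

The same grouping works for any bucketing key, for example the number of ratings a user has supplied.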
Does It Matter?
• Absolutely critical to retaining users
– As Cinematch (CM) has improved and RMSE has fallen, the percentage of 4-5 star movies rented has increased
• Important to users:
– There are only so many new releases
– Help jog memories about movies to see
– CM reflects the collective memory of good movies
[Figure: Cinematch-based User.]
What’s Next?
• Anticipate scale of 20M subscribers in 2010-2012
– Nearly 10B ratings, 10M/day
– 5B predictions/day
• Improved learning algorithms
– Improve coverage, accuracy and learning speed
• Help the non-rater
• Explore getting movie tastes beyond ratings
• Encode traits of movies that predict emotional response
• Motivate a user to take an unknown but likely great movie
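To put the projected scale in perspective, the short calculation below compares the 2006 figures quoted earlier in the deck (5M customers, 2M ratings per day, 1B predictions per day) with the 2010-2012 projection, and converts daily volumes to average per-second rates. The even spread of load across the day is an assumption; real traffic would peak well above these averages.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures quoted in the deck for 2006 and for the 2010-2012 projection.
current = {
    "subscribers": 5_000_000,
    "ratings_per_day": 2_000_000,
    "predictions_per_day": 1_000_000_000,
}
projected = {
    "subscribers": 20_000_000,
    "ratings_per_day": 10_000_000,
    "predictions_per_day": 5_000_000_000,
}

def per_second(per_day):
    """Average rate per second, assuming load is spread evenly over the day."""
    return per_day / SECONDS_PER_DAY

def growth(current, projected):
    """Multiplicative growth factor for each quantity."""
    return {key: projected[key] / current[key] for key in current}

factors = growth(current, projected)
print(factors)
print(round(per_second(current["predictions_per_day"])))
print(round(per_second(projected["predictions_per_day"])))
```

Subscribers grow 4x while ratings and predictions grow 5x, so the projection implies slightly more activity per subscriber, and an average prediction load rising from roughly 11,600 to roughly 57,900 per second.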