Forecasting the FIFA World Cup
Combining goal- and result-based team ability parameters
Pieter Robberechts, Jesse Davis
http://people.cs.kuleuven.be/pieter.robberechts
Introduction
A popular research topic since the '60s
Two popular approaches:
1. Goal-based models
Model the number of goals scored by both teams
2. Result-based models
Model win-draw-loss outcomes directly
Typical approach:
1. Estimate team abilities based on historical match data
2. Use them to predict future match outcomes
Match outcome prediction
Data → Team ratings → Predictions
Data scraped from:
- post WW2 international games from http://eloratings.net
- betting odds from http://betexplorer.com/
Two rating systems were explored:
- ELO ratings (result-based)
- ODM ratings (goal-based)
Team ...
Strength 2320 2237 2220 2207 ....
The ELO rating system
A result-based rating system
Given:
$R_H, R_A$: current home and away team ratings
$S_H$: actual score of the home team ($1$ if the home team won, $0.5$ for a draw, $0$ if the home team lost)
Then:
$E_H = \frac{1}{1 + 10^{-(R_H - R_A)/400}}$: expected score for the home team
$R'_H = R_H + k(S_H - E_H)$: updated rating of the home team
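The update rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, assuming the usual Elo convention that a higher-rated home team gets an expected score above 0.5; the function names and the default k = 20 are hypothetical and not taken from the slides.

```python
def expected_score(r_home: float, r_away: float) -> float:
    """Expected score E_H of the home team on the Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** (-(r_home - r_away) / 400.0))


def update_home_rating(r_home: float, r_away: float, s_home: float,
                       k: float = 20.0) -> float:
    """Return the home team's rating R'_H after one match.

    s_home is 1 for a home win, 0.5 for a draw and 0 for a home loss.
    k = 20 is an arbitrary illustrative value (an assumption).
    """
    return r_home + k * (s_home - expected_score(r_home, r_away))
```

With equal ratings the expected score is 0.5, so a home win moves the home rating up by k/2.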
The ELO rating system
A result-based rating system
Problem:
- Not all games are taken equally seriously
- Most games are played against weak opponents
‣ Competitiveness factor
‣ Margin of victory
$R'_H = R_H + k(S_H - E_H)$ with $k = k_0 w_i (1+\delta)^{\gamma}$
$k_0$: recentness factor
$w_i$: competitiveness factor
$\delta$: margin of victory
$\gamma$: margin of victory weight
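As a rough sketch, the match-dependent update factor k = k0 · w_i · (1 + δ)^γ could be computed as below. The function name is hypothetical, and treating δ as the absolute goal difference is an assumption; the slides only call it the margin of victory.

```python
def k_factor(k0: float, w: float, goal_diff: int, gamma: float) -> float:
    """Match-dependent update factor k = k0 * w * (1 + delta) ** gamma.

    k0:        recentness factor
    w:         competitiveness factor of the match type
    goal_diff: goal difference; its absolute value is used as delta (assumption)
    gamma:     margin of victory weight
    """
    delta = abs(goal_diff)
    return k0 * w * (1 + delta) ** gamma
```

A drawn game (δ = 0) leaves k at k0 · w, while larger victories scale the update by (1 + δ)^γ.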
Offense-Defense ratings
A goal-based rating system
Given:
$A_{ij}$ = score team $j$ generated against team $i$
$A_{ij} = 0$ otherwise
Then:
$o_j = \sum_{i=1}^{n} \frac{A_{ij}}{d_i}$: offensive rating of team $j$
$d_i = \sum_{j=1}^{n} \frac{A_{ij}}{o_j}$: defensive rating of team $i$
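Because the offensive and defensive ratings are defined in terms of each other, they are usually computed by alternating the two sums until they stabilise. The following is a minimal sketch of that fixed-point iteration, assuming all defensive ratings start at 1 and every team has scored and conceded at least once (otherwise a division by zero occurs). The function name, the iteration count and the toy score matrix are illustrative assumptions.

```python
def odm_ratings(A, iterations=100):
    """Alternate the offensive and defensive sums of the ODM definition.

    A[i][j] is the score team j generated against team i (0 if they
    never met). Returns (o, d): offensive and defensive rating lists.
    """
    n = len(A)
    d = [1.0] * n  # initial defensive ratings (assumption)
    o = [1.0] * n
    for _ in range(iterations):
        # o_j = sum_i A_ij / d_i
        o = [sum(A[i][j] / d[i] for i in range(n)) for j in range(n)]
        # d_i = sum_j A_ij / o_j
        d = [sum(A[i][j] / o[j] for j in range(n)) for i in range(n)]
    return o, d


# Toy example with three teams; team 0 scores most and concedes least.
A = [[0, 1, 1],
     [3, 0, 1],
     [4, 2, 0]]
o, d = odm_ratings(A)
```

A high offensive rating and a low defensive rating both indicate a strong team, so team 0 ends up with the largest o and the smallest d in this toy example.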
Offense-Defense ratings
A goal-based rating system
Problem:
- Large disparities between the number of games played and the strength of the opponents
- Teams in different confederations rarely play each other
Solution:
Update ratings sequentially
For each team:
- Pre-game ratings = weighted sum of a team's post-game ratings
- Post-game ratings = ODM procedure with pre-game ratings as initial ratings
Match outcome prediction
Via team rating systems
Two prediction models were explored:
- Ordered logit regression (result-based)
- Bivariate Poisson regression (goal-based)
[Figure: a predictor takes each team's Elo, attack (att) and defence (def) ratings and outputs outcome probabilities, e.g. [ 0.43 0.33 0.24 ] for "Belgium wins", "It's a tie", "England wins"]
Home advantage?
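To illustrate how an ordered logit model turns ratings into three outcome probabilities, here is a minimal sketch of the textbook formulation: a linear predictor z is compared against two ordered cutpoints. The feature vector, coefficients and cutpoint values are hypothetical; the slides do not give the fitted parameters or the exact feature set.

```python
import math


def _sigmoid(t: float) -> float:
    return 1.0 / (1.0 + math.exp(-t))


def ordered_logit_probs(features, coefficients, cutpoints):
    """Return [P(home win), P(draw), P(away win)].

    Outcomes are ordered away win < draw < home win, and
    P(outcome <= k) = sigmoid(c_k - z) with z = coefficients . features.
    cutpoints = (c_low, c_high) must satisfy c_low < c_high.
    """
    c_low, c_high = cutpoints
    z = sum(b * x for b, x in zip(coefficients, features))
    p_away = _sigmoid(c_low - z)             # P(away win)
    p_away_or_draw = _sigmoid(c_high - z)    # P(away win or draw)
    return [1.0 - p_away_or_draw, p_away_or_draw - p_away, p_away]


# Hypothetical single feature, e.g. a scaled rating difference.
even_match = ordered_logit_probs([0.0], [1.0], (-0.5, 0.5))
home_favoured = ordered_logit_probs([1.0], [1.0], (-0.5, 0.5))
```

With symmetric cutpoints and a zero rating difference, home and away wins are equally likely; a positive difference shifts probability towards the home win.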
Tuning the predictive power
How accurate are our predictions?
3 possible interpretations:
1. How many games are predicted correctly? → Accuracy
2. How certain was the model about the true outcome? → Logarithmic loss
3. How certain was the model about the true ordered outcome? → Ranked Probability Score (RPS)
$\mathrm{RPS} = \frac{1}{r-1} \sum_{k=1}^{r-1} \left( \sum_{l=1}^{k} (\hat{p}_l - y_l) \right)^2$
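The two probabilistic metrics can be sketched as follows for a single match, where r is the number of ordered outcomes (3 for win, draw, loss), the predicted probabilities are given in that order, and the observed outcome is one-hot. The function names are illustrative assumptions.

```python
import math


def ranked_probability_score(probs, outcome_index):
    """RPS for one match: mean squared difference of cumulative distributions.

    probs:         predicted probabilities over the ordered outcomes
    outcome_index: index of the outcome that actually occurred
    """
    r = len(probs)
    cum_pred = 0.0
    cum_obs = 0.0
    total = 0.0
    for k in range(r - 1):
        cum_pred += probs[k]
        cum_obs += 1.0 if k == outcome_index else 0.0
        total += (cum_pred - cum_obs) ** 2
    return total / (r - 1)


def log_loss(probs, outcome_index):
    """Logarithmic loss for one match: -log of the true outcome's probability."""
    return -math.log(probs[outcome_index])


# Probabilities ordered as [home win, draw, away win].
prediction = [0.43, 0.33, 0.24]
rps_home_win = ranked_probability_score(prediction, 0)
rps_away_win = ranked_probability_score(prediction, 2)
```

For the prediction [0.43, 0.33, 0.24], the RPS is 0.19125 if the home team wins and 0.38125 if the away team wins. Unlike log loss, RPS penalises a forecast more the further its probability mass lies from the true outcome in the ordering.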
Tuning the predictive power
Dataset: Training set | Validation set | Test set
Minimise RPS with the L-BFGS algorithm:
Until convergence:
    For each game ∈ Training set:
        update_rating(game)
        If game ∈ Validation set:
            make_prediction(game)
        End if
    End for
    Compute average RPS
    Update rating and prediction model parameters
Test set: apply best model
Challenge I: Match outcome prediction
The models were validated on the 2002, 2006, 2010 and 2014 World Cups
[Figure: Accuracy, LogLoss and RPS per World Cup (2002, 2006, 2010, 2014, all) for ELO ordered logit, ELO bivariate Poisson, ODM ordered logit, ODM bivariate Poisson, ELO+ODM ordered logit, ELO+ODM bivariate Poisson, Random forest and Bookmakers]
Challenge I: Match outcome prediction
And compared with the 2017 Soccer Prediction Challenge submissions
[Figure: Accuracy and RPS for Bookmakers, ELO ordered logit, ELO+ODM ordered logit, Berrar et al., Hubáček et al., Constantinou and Tsokos et al.]
[Figure: Accuracy, LogLoss and RPS per World Cup: 2014 (Elo, Elo+ODM, FiveThirtyEight), 2010 (Elo, Elo+ODM), 2006 (Elo, Elo+ODM), 2002 (Elo, Elo+ODM)]
Challenge II: Tournament elimination
How accurately can we predict the round of elimination of each team in previous World Cups?
Our predictions
Others' predictions
[Figure: Accuracy, LogLoss and RPS for FiveThirtyEight, Zeileis et al., Groll et al., Our model and UBS]
Values shown: 0.5, 0.563, 0.594, 0.563, 0.531, 0.201, 0.224, 0.186, 0.185, 0.182, 0.192, 0.132, 0.126, 0.127, 0.124
Tournament elimination
Online interactive: https://dtai.cs.kuleuven.be/sports/worldcup18/
Thanks!
Any questions?
Interactive at: https://dtai.cs.kuleuven.be/sports/worldcup18/