The evolution of hockey statistics – an ongoing story Bruce McCurdy Analytics, Big Data, and the Cloud 2012 April 25

The Evolution of Hockey Statistics


by: Bruce McCurdy


  • The evolution of hockey statistics – an ongoing story

    Bruce McCurdy

    Analytics, Big Data, and the Cloud

    2012 April 25

  • Traditional game summaries

  • 1967-68: Plus/minus formally introduced, as well as individual shots on goal / Shooting %

  • 1983-84: Goaltender save percentage added

    Grant Fuhr


  • 1998-99: Time on ice published, opening the door for rate stats

    Chris Pronger

  • 1998: NHL introduces Zone Time

    but turfs it in 2002. Why?!

  • 1998: NHL starts to (sporadically) maintain the Real Time Scoring System (RTSS)

  • but there remain huge problems due to lack of standardization & rink bias

    Oilers have twice as many giveaways as Florida ... or do they?

  • Ranking teams by their RTSS totals at home and on the road yields orderings that might as well be randomized for giveaways and takeaways, and very nearly so for hits and blocked shots.

    The same exercise for Goals For, by contrast, yields a crudely similar ordering home to away.

    Significant home scorer bias in turnover stats. 45% more giveaways and 33% more takeaways by home teams league-wide!

    As a result, RTSS is highly unreliable: it can serve to rank players within a given team but is almost useless for comparing players from different clubs.
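The home/away comparison above can be sketched in a few lines. This is a minimal illustration, not the method used for the slide: the Spearman rank correlation is an assumed choice of measure, the function names are hypothetical, and the giveaway counts are made-up numbers rather than NHL data.

```python
def ranks(values):
    """Return 1-based ranks (largest value gets rank 1); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(home, away):
    """Spearman rank correlation of two equal-length lists without ties."""
    n = len(home)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(home), ranks(away)))
    return 1 - 6 * d2 / (n * (n * n - 1))


def home_bias(home_total, away_total):
    """Fractional excess of events credited to home teams over road teams."""
    return home_total / away_total - 1


# Hypothetical giveaway counts for five teams (same team order in both lists).
home_giveaways = [520, 310, 450, 280, 390]
away_giveaways = [300, 330, 290, 340, 310]

rho = spearman(home_giveaways, away_giveaways)
bias = home_bias(sum(home_giveaways), sum(away_giveaways))
```

A correlation near +1 would mean the home and road orderings agree (as the slide reports, crudely, for Goals For); a value near zero or below suggests the counts reflect the scorer in each rink more than the teams themselves.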

  • 2002-03: NHL introduces play-by-play reports

    though problems remain with accuracy of some data, e.g. shot distance

  • Stripping of PxP data allows detailed on-ice analysis of individual players

    Even-strength shots / Fenwick / Corsi from timeonice.com
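As a rough sketch of how these three counts relate: shots are shots on goal (including goals), Fenwick adds missed shots, and Corsi adds blocked shots on top of that. The event format below, a list of `(event_type, shooting_team)` pairs with the labels `SHOT`, `GOAL`, `MISS` and `BLOCK`, is a simplified assumption and not the layout of the NHL play-by-play files; the function name is hypothetical.

```python
from collections import Counter


def attempt_counts(events, team):
    """Tally shots, Fenwick and Corsi for and against `team`.

    events: iterable of (event_type, shooting_team) pairs, where event_type
    is one of "SHOT", "GOAL", "MISS", "BLOCK" (assumed labels).
    """
    tally = Counter()
    for kind, shooter in events:
        side = "for" if shooter == team else "against"
        tally[(side, kind)] += 1

    out = {}
    for side in ("for", "against"):
        shots = tally[(side, "SHOT")] + tally[(side, "GOAL")]
        fenwick = shots + tally[(side, "MISS")]      # unblocked attempts
        corsi = fenwick + tally[(side, "BLOCK")]     # all attempts
        out[side] = {"shots": shots, "fenwick": fenwick, "corsi": corsi}
    return out


# Made-up sequence of even-strength events while one player is on the ice.
events = [
    ("SHOT", "EDM"), ("MISS", "EDM"), ("BLOCK", "EDM"), ("GOAL", "EDM"),
    ("SHOT", "FLA"), ("BLOCK", "FLA"),
]
on_ice = attempt_counts(events, "EDM")
```

Restricting `events` to the time a given player is on the ice gives that player's on-ice totals; the for-minus-against difference is the usual headline number.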

  • Head-to-head match-ups (timeonice.com)

  • Customizable, sortable stats from behindthenet.ca

    Available stats:

    Even strength / powerplay / shorthanded

    Scoring per 60 minutes

    On/off ice plus/minus per 60

    On/off ice shots / Fenwick / Corsi per 60

    On-ice Sh% / Sv% / PDO

    QualComp / QualTeam

    Penalties drawn / taken

    ZoneStart / ZoneFinish
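Two of the building blocks in this list are simple arithmetic. The sketch below assumes time on ice is given in minutes and expresses PDO as a fraction (on-ice shooting percentage plus on-ice save percentage); sites differ on whether they scale it by 100 or 1000. The function names and sample figures are hypothetical.

```python
def per_60(count, toi_minutes):
    """Convert a raw count into a rate per 60 minutes of ice time."""
    return count * 60.0 / toi_minutes


def on_ice_percentages(goals_for, shots_for, goals_against, shots_against):
    """Return (on-ice Sh%, on-ice Sv%, PDO) as fractions."""
    sh_pct = goals_for / shots_for
    sv_pct = 1 - goals_against / shots_against
    return sh_pct, sv_pct, sh_pct + sv_pct


# Made-up season line: 10 points in 300 even-strength minutes.
points_rate = per_60(10, 300)

# Made-up on-ice totals: 8 goals on 100 shots for, 9 goals on 100 shots against.
sh, sv, pdo = on_ice_percentages(8, 100, 9, 100)
```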

  • Many stats need to be parsed in terms of positive / negative / neutral game states, e.g.:

    Leading / trailing / tied (score effects are HUGELY important)

    PP / PK / EV

    O-zone / D-zone / neutral zone

    Taken in isolation without context, modern stats will be distorted; e.g. soft-minutes players used in offensive situations should be expected to post positive numbers in stats like Relative Corsi
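One way to respect game state is to bucket every shot attempt by the score at the moment it was taken and report the share of attempts separately for each bucket. The sketch below assumes a hypothetical input format, `(team_goals, opp_goals, is_for)` per attempt, and covers only the leading / trailing / tied split; manpower and zone could be handled the same way.

```python
from collections import defaultdict


def score_state(team_goals, opp_goals):
    """Classify the score from the team's point of view."""
    if team_goals > opp_goals:
        return "leading"
    if team_goals < opp_goals:
        return "trailing"
    return "tied"


def attempt_share_by_state(attempts):
    """Share of shot attempts taken by the team, split by score state.

    attempts: iterable of (team_goals, opp_goals, is_for), where the goal
    counts are the score when the attempt was taken (assumed format).
    """
    counts = defaultdict(lambda: [0, 0])  # state -> [for, against]
    for team_goals, opp_goals, is_for in attempts:
        state = score_state(team_goals, opp_goals)
        counts[state][0 if is_for else 1] += 1
    return {s: f / (f + a) for s, (f, a) in counts.items()}


# Made-up attempts: two while tied, three while leading 1-0, one while trailing 0-1.
attempts = [
    (0, 0, True), (0, 0, False),
    (1, 0, False), (1, 0, False), (1, 0, True),
    (0, 1, True),
]
shares = attempt_share_by_state(attempts)
```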

  • "A chance is counted any time a team directs a shot cleanly on-net from within home-plate. Shots on goal and misses are counted, but blocked shots are not (unless the player who blocks the shot is acting like a goaltender). Generally speaking, we are more generous with the boundaries of home-plate if there is dangerous puck movement immediately preceding the scoring chance, or if the scoring chance is screened. If you want to get a visual handle on home-plate, check this image."

    Scoring chances

  • One weakness of the current method is that home plate isn't the best template for the scoring area

    Another is that scoring chances are just 1s and 0s, with no extra weight for first-class chances as suggested by heat-map colour coding

  • Actually, scoring areas vary for different types of shots and manpower situations. The scoring chance model is greatly simplified from this reality.

  • Common SC errors and outcomes

    NHL data doesn't properly record on-ice players: +1 or -1 for selected players

    Scoring chance improperly credited (or missed): +1 or -1 for 10 players

    Scoring chance recorded at wrong game time: +1 or -1 for up to 20 players

    Scoring chance recorded but for wrong team: +2 or -2 for 10 players
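The arithmetic behind these outcomes can be checked with a small sketch. It assumes five skaters per side and a hypothetical input format in which each chance is a pair `(skaters_for, skaters_against)`; player labels such as `H1` and `A6` are placeholders.

```python
from collections import Counter


def chance_differential(chances):
    """On-ice scoring chance differential per player.

    chances: iterable of (skaters_for, skaters_against) pairs (assumed format).
    """
    diff = Counter()
    for skaters_for, skaters_against in chances:
        for player in skaters_for:
            diff[player] += 1
        for player in skaters_against:
            diff[player] -= 1
    return diff


home_line = ["H1", "H2", "H3", "H4", "H5"]
away_line = ["A1", "A2", "A3", "A4", "A5"]
home_next = ["H6", "H7", "H8", "H9", "H10"]
away_next = ["A6", "A7", "A8", "A9", "A10"]
everyone = home_line + away_line + home_next + away_next

truth = chance_differential([(home_line, away_line)])

# Chance credited to the wrong team: every skater on the ice swings by 2.
wrong_team = chance_differential([(away_line, home_line)])
team_swing = {p: wrong_team[p] - truth[p] for p in home_line + away_line}

# Chance logged at the wrong time, after a full change on both sides:
# the 10 correct players lose their mark and 10 others pick one up.
wrong_time = chance_differential([(home_next, away_next)])
time_errors = [p for p in everyone if wrong_time[p] != truth[p]]
```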

  • Neilson Numbers

    Based on ideas of Roger Neilson

    Assignment of individual responsibility on scoring chances for and against

    Requires an extra degree of qualitative judgement over and above deciding whether a scoring chance has occurred

    Eliminates false positives/negatives, however individual numbers don't reconcile to team totals

    Fewer recording errors than on-ice scoring chances as players are identified as part of the process

    Same system can be used to assign unofficial assists on GF or errors on GA

    Reliant on a knowledgeable scorer, but as with other scoring chance systems, would work better if 3 or 5 scorers worked independently, then pooled results.
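The slide does not say how independent scorers would pool their results; a simple majority vote is one assumed possibility, sketched below. Each scorer's log is taken to be a set of `(period, second, team)` tuples, a hypothetical format, and events are matched exactly. Real logs would need some tolerance on the recorded time before two entries count as the same chance.

```python
from collections import Counter


def pool_chances(scorer_logs):
    """Keep the chances recorded by a strict majority of scorers.

    scorer_logs: list of collections of hashable chance records, one per scorer.
    """
    votes = Counter()
    for log in scorer_logs:
        votes.update(set(log))  # each scorer votes at most once per chance
    needed = len(scorer_logs) // 2 + 1
    return {chance for chance, count in votes.items() if count >= needed}


# Made-up logs from three scorers watching the same period.
chance_a = (1, 312, "EDM")
chance_b = (1, 745, "FLA")
chance_c = (1, 1010, "EDM")
logs = [
    {chance_a, chance_b},
    {chance_a, chance_c},
    {chance_a, chance_b},
]
pooled = pool_chances(logs)
```

An odd number of scorers, as the slide suggests with 3 or 5, avoids tied votes.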

  • Sample box:

  • Zone Start: fad or trend?

  • Possession

    Hockey is a transition game: offense to defense, defense to offense, one team to another. Hundreds of tiny fragments of action, some leading somewhere, most going nowhere. Only one thing is clear. A fragmented game must be played in fragments. Grand designs do not work. Before offense turns to defense, or defense to offense, there is a moment of disequilibrium when a defense is vulnerable, when a game's sudden, unexpected swings can be turned to advantage. It is what you do at this moment, when possession changes, that makes the difference.

    -- Ken Dryden, The Game

  • It is noteworthy that in general our teamwork was considerably above our main contenders. In the game against the Canadian team, the players of the USSR squad made 110 passes, while the Canadians made 60 passes; in the game against Czechoslovakia we made 106 passes, they made 70; in the game against Sweden we made 49 more passes than they did. This is an indication of quite stable habits and a high culture of playing, a correct understanding of the game by the Soviet players.

    -- Anatoli Tarasov, Road to Olympus

  • Good pass: plus.

    Bad pass: minus.

    Good clearance: plus.

    Bad clearance: minus.

    Good rush: plus.

    Bad rush: minus.

    Good shoot in: plus.

    Bad shoot in: minus.

    Tarasov Numbers
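The plus/minus bookkeeping above amounts to a per-player tally. A minimal sketch, assuming each observation is recorded as `(player, play_type, good)` with the four play types from the slide; the format, the function name and the player names are all hypothetical.

```python
from collections import defaultdict

PLAY_TYPES = {"pass", "clearance", "rush", "shoot_in"}


def tarasov_numbers(observations):
    """Tally pluses, minuses and net score per player.

    observations: iterable of (player, play_type, good) where good is a bool.
    """
    totals = defaultdict(lambda: {"plus": 0, "minus": 0})
    for player, play_type, good in observations:
        if play_type not in PLAY_TYPES:
            raise ValueError(f"unknown play type: {play_type}")
        totals[player]["plus" if good else "minus"] += 1
    return {
        player: {**t, "net": t["plus"] - t["minus"]}
        for player, t in totals.items()
    }


# Made-up observations from a single shift.
observations = [
    ("Smith", "pass", True),
    ("Smith", "pass", False),
    ("Smith", "rush", True),
    ("Jones", "clearance", False),
]
numbers = tarasov_numbers(observations)
```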

  • and many more advanced ideas

    Goals Versus Threshold (GVT)

    Defence Independent Goalie Rating (DIGR)

    Shot Quality (SQF / SQA)

    Predicted Goals Scored (PGS)

    Zone Start Adjusted Corsi (ZSAC)

    Etc.

    No time to do them all justice here

    Thanks for listening!