The annual NHL draft has become a great source of entertainment for fans. Because teams make player selections based on a combination of game theory and data, the draft is also fertile ground for analysts. Game theory in particular is the foundation of the Draft Probability Tool presented in this piece. It will help you explore how teams should approach the draft strategically: if you’re interested in a specific player, do you need to trade up or down to get him? How much should you give up or ask for? How far should you trade up or down to still get the player you value highly? This tool helps answer those questions.
One of the greatest challenges in sports analytics is determining the skill of a player independently of the quality of his teammates. While a number of tools already exist (e.g. WOWYs in hockey), their use, and misuse, comes with significant limitations and collinearity concerns. This is where regression-based approaches can provide a more rigorous alternative for isolating a player’s true talent.
An encouraging recent development in hockey analytics has been Ryan Stimson’s Passing Project, which you can read about here. The goal of this post is to introduce a regression-based method to estimate an NHL player’s expected scoring performance independently of the passing strength of his teammates. To this end, player and linemate data for the 2014-2015 season, from Stimson’s Passing Project and from Muneeb Alam, were used to devise a rate-based metric of a player’s projected goals. The difference between a player’s projected goals per 60 minutes and actual goals per 60 minutes will be called Delta Box Score, or DBS.
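To make the definition concrete, here is a minimal Python sketch of the DBS arithmetic. The function names and the example numbers are hypothetical, and the sign convention (actual minus projected) is an assumption, since the definition above only says "difference". The projected rate is taken as a given input here; the regression that produces it is not shown.

```python
def per_60(count, minutes):
    """Convert a raw count into a rate per 60 minutes of ice time."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return count * 60.0 / minutes


def delta_box_score(projected_goals_per_60, actual_goals, minutes):
    """DBS for one player.

    Assumption: DBS = actual goals/60 minus projected goals/60.
    Flip the sign if the opposite convention is intended.
    """
    return per_60(actual_goals, minutes) - projected_goals_per_60


# Hypothetical player: 20 goals in 1,200 minutes, projected at 0.8 goals per 60.
dbs = delta_box_score(0.8, 20, 1200)
print(round(dbs, 3))
```

Under this convention, a positive DBS would mean the player scored more than his projection suggested.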
Back in October 2015, @asmae_t and I first unveiled an Expected Goals model which proved to be a better predictor of team and player goalscoring performance than any other public model to date. Thanks to feedback from the community, a few adjustments and corrections have been made since then. The changes are the following:
- Score state was a variable that was accounted for in the model but was not explicitly mentioned in the original write-up. Recall that after accounting for all variables, including score state, it was found that a shot attempted by a trailing team still has a lower likelihood of resulting in a goal than a shot taken by a leading team.
- The shot multiplier in Part I of the original write-up was adjusted using a historical weighted average instead of in-season data. Thus, a 2016 shot multiplier, for example, would be based on the average of the regressed goals (rGoals) and regressed shots (rShots) of 2014 and 2015. This adjustment improved the model’s performance against score-adjusted Corsi and goals % in predicting future scoring, as seen in the graph below. We thank @Cane_Matt again for pointing out this error.
Analysis of goaltending performance in hockey has traditionally relied on save percentage (Sv%). Recent efforts have improved on this statistic, for example by adjusting for shot location and by computing goals saved above average (GSAA). The common denominator of all these recent developments has been the use of completed shots on goal to analyze and predict goaltender performance.
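For readers less familiar with these two statistics, the following Python sketch shows how they are commonly computed. The function names and the sample numbers are hypothetical, and the GSAA formula used here (saves minus the saves a league-average goaltender would make on the same shots) is the usual public definition, assumed rather than taken from this post.

```python
def save_percentage(saves, shots_on_goal):
    """Sv% = saves / shots on goal."""
    if shots_on_goal <= 0:
        raise ValueError("shots_on_goal must be positive")
    return saves / shots_on_goal


def goals_saved_above_average(saves, shots_on_goal, league_sv_pct):
    """GSAA = actual saves minus the saves expected from a
    league-average goaltender facing the same number of shots."""
    expected_saves = shots_on_goal * league_sv_pct
    return saves - expected_saves


# Hypothetical goaltender: 1,840 saves on 2,000 shots, league Sv% of .915.
print(round(save_percentage(1840, 2000), 3))
print(round(goals_saved_above_average(1840, 2000, 0.915), 1))
```

Note that both numbers only count shots that reached the net, which is exactly the common denominator described above.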
Expected goals models have been developed in a number of sports to better predict future performance. For sports like hockey and soccer, where goals are scarce and inherently random, expected goals models have proved to be particularly useful at predicting future scoring. This is because they take into account shot attempts, which are better predictors of team and player performance than goal totals alone.
A notable example is Brian Macdonald’s expected goals model dating back to 2012, which used shot differentials (Corsi, Fenwick) and other variables like faceoffs, zone starts and hits. Important developments have been made since then with regard to the predictive value of those variables, particularly those pertaining to shot quality.
Shot quality has been the subject of spirited debate despite evidence suggesting that it plays an important role in predicting goals. The evidence shows that shot characteristics like distance and angle can significantly influence the probability of a given shot resulting in a goal. Previous attempts to account for shot quality in an expected goals model were made by Alan Ryder (see here and here).
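As a rough illustration of how distance and angle could feed into a goal probability, here is a minimal Python sketch using a logistic function. This is an assumption for illustration only: the post does not state the functional form at this point, and the coefficients below are made-up placeholders, not fitted values from any model.

```python
import math


def goal_probability(distance_ft, angle_deg,
                     intercept=-1.0, b_distance=-0.05, b_angle=-0.02):
    """Hypothetical logistic model of P(goal) for a single shot.

    All coefficients are illustrative placeholders. Negative slopes
    encode the idea that longer and wider-angle shots score less often.
    """
    z = intercept + b_distance * distance_ft + b_angle * abs(angle_deg)
    return 1.0 / (1.0 + math.exp(-z))


# Compare a close, central shot with a distant, sharp-angle one.
close_shot = goal_probability(distance_ft=10, angle_deg=0)
far_shot = goal_probability(distance_ft=50, angle_deg=45)
print(close_shot > far_shot)
```

Summing such per-shot probabilities over all of a team's or player's shots is one simple way to arrive at an expected goals total, provided the coefficients come from real data.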
In Part I, an updated expected goals (xG) model will be presented that accounts for shot quality and a number of other variables. Part II will deal with testing the performance of xG against previous models like score-adjusted Corsi and goals percentage.