Testing and Final Remarks

(Photo by Andre Ringuette/NHLI via Getty Images)

This is Part 5 of a 5-part series detailing my WAR model. Part 1 of the series can be found here, Part 2 here, Part 3 here and Part 4 here.

Introduction

At the beginning of this exercise I set out to encapsulate the best estimate of an NHL player’s true value. An adjusted plus-minus system (XPM) was introduced to help contextualize shot attempt numbers. A box plus-minus system (BPM) was introduced to help contextualize metrics such as goals and assists. The ability to win faceoffs, as well as to draw and avoid taking penalties, was also included.

“WAR is not meant to be a perfectly precise indicator of a player’s contribution, but rather an estimate of their value to date. Given the imperfections of some of the available data and the assumptions made to calculate other components, WAR works best as an approximation. WAR is trying to answer the time-honored question: How valuable is each player to his team? Comparing two players offensively is useful, but it discounts the potential contribution a player can make by saving runs on defense or special teams. WAR is a simple attempt to combine a player’s total contribution into a single value.

“The goal of WAR is to provide a holistic metric of player value that allows for comparisons across teams and years and a framework for player evaluation. While there will likely be improvements to the process by which we calculate the inputs of WAR, the basic idea is something fans and analysts have desired for decades. WAR estimates a player’s total value and allows us to make comparisons among players with vastly different skill sets.” (FanGraphs)

The final study will examine the repeatability and predictiveness of the WAR components.

Continue reading

Extras, Blending and Seasonal Adjustment

(Photo by Rich Graessle/Icon Sportswire)

This is Part 4 of a 5-part series detailing my WAR model. Part 1 of the series can be found here, Part 2 here and Part 3 here.

Introduction

Now that we have covered the overall player models here and here, we will explore how to blend the two to achieve maximum out-of-sample predictive power. We will also touch on what I have coined the “extras” section, made up of penalties and faceoffs. Faceoff ability is a fairly standard and well-accepted player skill, even though it is overvalued by many hockey “traditionalists.” Penalties are an aspect of player value that typically goes unaccounted for in most current analysis. Finally, we will implement a yearly adjustment most commonly used in baseball WAR.

Continue reading

Introducing Box Plus-Minus

EDMONTON, AB - OCTOBER 23: Connor McDavid #97 of the Edmonton Oilers skates during a game against the Washington Capitals on October 23, 2015 at Rexall Place in Edmonton, Alberta, Canada. (Photo by Andy Devlin/NHLI via Getty Images)

This is Part 3 of a 5-part series detailing my WAR model. Part 1 of the series can be found here and Part 2 here.

Introduction

Box Plus-Minus (BPM) is a box score-based metric for evaluating a hockey player’s quality and contribution to the team. It is very different from an Expected Plus-Minus type model, which is a play-by-play regression metric. BPM relies on a player’s box score information to estimate that player’s performance relative to replacement level. Box Plus-Minus type metrics have long populated basketball circles; there is a great summation of some of the original creations here, with many newer versions popping up, including Dredge, DRE and Player Tracking Plus Minus. A version has even already been brought to hockey in the form of Game Score. Here I will attempt to create my own version of Box Plus-Minus for the NHL.

Continue reading

Introducing Expected Plus-Minus

This is Part 2 of a 5-part series detailing my WAR model. Part 1 of the series can be found here.

Introduction

“Basically, anything that WOWY can do, I think can be done better with regression-type methods” - Andrew C. Thomas, Lead Hockey Researcher, Minnesota Wild

Adjusted Plus-Minus metrics were first introduced into NBA circles around 2004 by Dan Rosenbaum. The basketball community has since seen many iterations, including those by Steve Ilardi/Aaron Barzilai, Joseph Sill and Jeremias Engelmann. Soon after these metrics made their debut in the public sphere they were adopted for hockey, where they have themselves seen many different iterations: Schuckers/D.Lock/Wells/Knickerbocker/R.Lock, Brian Macdonald, Gramacy/Jensen/Taddy, Thomas/Ventura/Jensen/Ma and Emmanuel Perry. I even made my own attempt in the summer of 2015, which I coined Corsi Plus-Minus. For whatever reason, these metrics have struggled to take hold in the hockey community, unlike in basketball circles.

Continue reading

A Primer on @DTMAboutHeart’s WAR Model

Introduction

Hockey stats have existed for about as long as the game itself. Simple boxscore stats such as goals and assists can be traced back almost a full century now. These stats have helped inform fans, coaches, and managers of the value held by players. Around the 1950s the Montreal Canadiens began to track a player’s plus-minus with the idea that simple boxscore stats failed to capture many important elements of a game. Plus-minus was a good start towards tracking impact that is not captured in traditional boxscore stats, but it has recently been shown to be quite incomplete and lacking by modern evaluation standards.

Continue reading

NHL Draft Probability Tool

SUNRISE, FL – JUNE 27: the Boston Bruins during the 2015 NHL Draft at BB&T Center on June 27, 2015 in Sunrise, Florida. (Photo by Dave Sandford/NHLI via Getty Images)

The annual NHL draft has become a great source of entertainment for fans. Since teams make player selections based on a combination of game theory and data, the draft is fertile ground for analysts as well. Game theory specifically is the foundation for the Draft Probability Tool that will be presented in this piece. It will help you explore how teams should approach the draft strategically: if you’re interested in a specific player, do you need to trade up or down to get him? How much should you be giving up or asking for? How far can you trade up or down and still get the player you value highly? This tool helps answer those questions.

Continue reading

Delta Box Score: a model for predicting player scoring independent of teammate quality

 

Introduction

One of the greatest challenges in sports analytics is determining the skill of a player independently of quality of teammates. While a number of tools already exist (e.g. WOWYs in hockey), their (mis)use lends itself to significant limitations and collinearity concerns. This is where regression-based approaches can provide a more rigorous alternative in isolating a player’s true talent.  

An encouraging development in hockey analytics as of late has been Ryan Stimson’s Passing Project, which you can read about here. The goal of this post is to introduce a regression-based method to estimate an NHL player’s expected scoring performance independently of the passing strength of his teammates. To this end, player and linemate data for the 2014-2015 season, from Stimson’s Passing Project and Muneeb Alam, were used to devise a rate-based metric of a player’s projected goals. The difference between a player’s projected goals per 60 minutes and actual goals per 60 minutes will be called Delta Box Score, or DBS.
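As a minimal sketch of the DBS definition above, the arithmetic can be written out in Python. The function names are hypothetical, the projected rate is simply passed in rather than produced by the regression, and the sign convention (projected minus actual) is an assumption, since the text only says "difference".

```python
def per_60(goals: float, minutes: float) -> float:
    """Convert a raw goal count into a rate per 60 minutes of ice time."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return goals * 60.0 / minutes


def delta_box_score(projected_g60: float, actual_g60: float) -> float:
    """Difference between projected and actual goals per 60 minutes.

    Assumption: projected minus actual. The source does not state the sign.
    """
    return projected_g60 - actual_g60


# Illustrative numbers only, not real player data.
actual_rate = per_60(goals=10, minutes=600)
dbs = delta_box_score(projected_g60=1.2, actual_g60=actual_rate)
```

Under this convention, a positive DBS would mean the player scored less than projected.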

Continue reading

Expected Goals Data Release

Back in October 2015, @asmae_t and I first unveiled an Expected Goals model, which proved to be a better predictor of team and player goalscoring performance than any other public model to date. Thanks to feedback from the community, a few adjustments and corrections have been made since then. The changes were the following:

  1. Score state was a variable that was accounted for in the model but was not explicitly mentioned in the original write-up. Recall that after accounting for all variables, including score state, it was found that a shot attempted by a trailing team still has a lower likelihood of resulting in a goal than a shot taken by a leading team.
  2. The shot multiplier in Part I of the original write-up was adjusted using a historical weighted average instead of in-season data. Thus, a 2016 shot multiplier for example would be based on the average of the regressed goals (rGoals) and regressed shots (rShots) of 2014 and 2015. This adjustment improved the model’s performance against score-adjusted Corsi and goals % in predicting future scoring, as seen in the graph below. We thank @Cane_Matt again for pointing out this error.

(Figure: Corrected Version of xG)
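The historical averaging described in the second item can be sketched as follows. This is only an illustration: the excerpt does not give the multiplier's formula, so the ratio of average rGoals to average rShots, the unweighted mean, and every name here are assumptions.

```python
def historical_shot_multiplier(r_goals_by_season, r_shots_by_season,
                               target_season, lookback=2):
    """Build a multiplier for target_season from prior seasons only.

    Assumption: the multiplier is mean(rGoals) / mean(rShots) over the
    `lookback` seasons before target_season, with equal weights.
    """
    prior_seasons = range(target_season - lookback, target_season)
    mean_r_goals = sum(r_goals_by_season[s] for s in prior_seasons) / lookback
    mean_r_shots = sum(r_shots_by_season[s] for s in prior_seasons) / lookback
    return mean_r_goals / mean_r_shots


# Illustrative values only, not real data.
r_goals = {2014: 12.0, 2015: 8.0}
r_shots = {2014: 110.0, 2015: 90.0}
multiplier_2016 = historical_shot_multiplier(r_goals, r_shots, 2016)
```

The point of the design is that nothing from the target season enters the calculation, which is what separates it from the original in-season version.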

Continue reading

xSV% is a better predictor of goaltending performance than existing models

This piece is co-authored by DTMAboutHeart and asmean.

Analysis of goaltending performance in hockey has traditionally relied on save percentage (Sv%). Recent efforts have improved on this statistic, such as adjusting for shot location and accounting for goals saved above average (GSAA). The common denominator of all these recent developments has been the use of completed shots on goal to analyze and predict goaltender performance.

Continue reading

Expected Goals are a better predictor of future scoring than Corsi, Goals

This piece is co-authored by DTMAboutHeart and asmean.

Introduction

Expected goals models have been developed in a number of sports to better predict future performance. For sports like hockey and soccer, where goals are inherently random and scarce, expected goals models have proved particularly useful at predicting future scoring. This is because they take into account shot attempts, which are better predictors of team and player performance than goal totals alone.

A notable example is Brian Macdonald’s expected goals model dating back to 2012, which used shot differentials (Corsi, Fenwick) and other variables like faceoffs, zone starts and hits. Important developments have been made since then in regards to the predictive value of those variables, particularly those pertaining to shot quality.

Shot quality has been the subject of spirited debate despite evidence suggesting that it plays an important role in predicting goals. The evidence shows that shot characteristics like distance and angle can significantly influence the probability of a given shot resulting in a goal. Previous attempts to account for shot quality in an expected goals model were made by Alan Ryder (see here and here).
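To make the distance and angle idea concrete, here is a rough Python sketch that derives both from shot coordinates and feeds them into a logistic curve. It is not the xG model from this piece: the coefficients are invented for illustration, and the coordinate system (feet from center ice, net at x = 89) is an assumption.

```python
import math

# Assumed coordinates: origin at center ice, attacking net on the goal line.
NET_X, NET_Y = 89.0, 0.0


def shot_distance_and_angle(x: float, y: float) -> tuple[float, float]:
    """Return (distance in feet, angle in degrees off the net's center line)."""
    dx = NET_X - x
    dy = y - NET_Y
    distance = math.hypot(dx, dy)
    angle = math.degrees(math.atan2(abs(dy), dx))
    return distance, angle


def goal_probability(distance: float, angle: float,
                     intercept: float = -1.0,
                     b_distance: float = -0.05,
                     b_angle: float = -0.02) -> float:
    """Logistic curve with made-up coefficients (illustrative only)."""
    z = intercept + b_distance * distance + b_angle * angle
    return 1.0 / (1.0 + math.exp(-z))


close_shot = goal_probability(*shot_distance_and_angle(79.0, 0.0))
far_wide_shot = goal_probability(*shot_distance_and_angle(50.0, 30.0))
```

With negative coefficients on both inputs, a close shot from straight on gets a higher probability than a long shot from a sharp angle. A fitted model would estimate those coefficients from shot data.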

In Part I, an updated expected goals (xG) model will be presented that accounts for shot quality and a number of other variables. Part II will deal with testing the performance of xG against previous models like score-adjusted Corsi and goals percentage.

Continue reading