Testing and Final Remarks

(Photo by Andre Ringuette/NHLI via Getty Images)

This is Part 5 of a five-part series detailing my WAR model. Part 1 of the series can be found here, Part 2 here, Part 3 here, and Part 4 here.

Introduction

At the beginning of this exercise, I set out to capture the best estimate of an NHL player’s true value. An adjusted plus-minus system (XPM) was introduced to help contextualize shot attempt numbers. A box plus-minus system (BPM) was introduced to help contextualize metrics such as goals and assists. The ability to win faceoffs, as well as to draw and avoid taking penalties, was also included.

“WAR is not meant to be a perfectly precise indicator of a player’s contribution, but rather an estimate of their value to date. Given the imperfections of some of the available data and the assumptions made to calculate other components, WAR works best as an approximation. WAR is trying to answer the time-honored question: How valuable is each player to his team? Comparing two players offensively is useful, but it discounts the potential contribution a player can make by saving runs on defense or special teams. WAR is a simple attempt to combine a player’s total contribution into a single value.

The goal of WAR is to provide a holistic metric of player value that allows for comparisons across teams and years and a framework for player evaluation. While there will likely be improvements to the process by which we calculate the inputs of WAR, the basic idea is something fans and analysts have desired for decades. WAR estimates a player’s total value and allows us to make comparisons among players with vastly different skill sets.” (FanGraphs)

The final study will examine the repeatability and predictiveness of the WAR components.

Repeatability

Here is a look at the year-to-year repeatability of our six new stats (at a per 60 rate).

[Image: year-to-year repeatability of the six new stats, per 60]
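For readers who want to run a similar check on their own data, here is a minimal sketch of one common way to measure year-to-year repeatability: pair each player’s per 60 rate in one season with the same player’s rate in the following season and take the Pearson correlation of the pairs. The post does not state which statistic was used for the figures above, so the choice of Pearson correlation, the function names, and the input layout are all assumptions made for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def consecutive_season_pairs(rates):
    """Pair each player-season rate with the same player's next-season rate.

    `rates` is a hypothetical layout: {(player, season): per_60_rate},
    where `season` is an integer such as 2014.
    """
    year1, year2 = [], []
    for (player, season), rate in sorted(rates.items()):
        next_rate = rates.get((player, season + 1))
        if next_rate is not None:
            year1.append(rate)
            year2.append(next_rate)
    return year1, year2


def repeatability(rates):
    """Year-to-year correlation of a per 60 stat."""
    year1, year2 = consecutive_season_pairs(rates)
    return pearson(year1, year2)
```

A value near 1 would mean that players largely keep their rate from one season to the next, while a value near 0 would mean that the stat is mostly noise at the season level.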

Here is how the even-strength per 60 stats stack up against the most popular metrics currently used to evaluate a player’s value:

[Image: even-strength per 60 stats compared with the most popular current metrics]

Predictiveness

The point of hockey games is to win, and teams win by scoring more goals than their opponents. This is why shot attempt metrics have gained such prominence in recent years: their value comes from their ability to predict future goals and future winning. I have based this test on a similar test done by Neil Paine. Since there are not many individual metrics currently available for special teams, we will focus on even-strength metrics for these tests.

The test will be conducted as follows: we will assume that for a given season we know exactly how much playing time each player will receive and use the player ratings from a previous season to predict their value in the next season.

Player’s Rating in Year 1 * Player’s Playing Time in Year 2 = Player’s Predicted Value in Year 2.

We will then sum each player’s predicted Year 2 value by team and see how well that team total correlates with the team’s goals for and goals against totals in Year 2. The lockout-shortened season was removed from this study, as were players whose Year 1 rating came in fewer than 300 even-strength minutes. Here are the results of that test:

[Image: results of the predictiveness test]
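The procedure described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code used to produce the results: it assumes ratings are expressed per 60 minutes (so the predicted value is rating × minutes / 60), it uses Pearson correlation as the measure of fit, and the field names and the `team_outcome_y2` mapping are hypothetical.

```python
from collections import defaultdict
from math import sqrt

MIN_YEAR1_MINUTES = 300  # even-strength minutes cutoff stated in the text


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def predicted_team_totals(players):
    """Sum each player's predicted Year 2 value by Year 2 team.

    Each player is a dict with hypothetical keys:
    rating_per60_y1, minutes_y1, minutes_y2, team_y2.
    Players under the Year 1 minutes cutoff are skipped.
    """
    totals = defaultdict(float)
    for p in players:
        if p["minutes_y1"] < MIN_YEAR1_MINUTES:
            continue
        # Year 1 rating (per 60) applied to known Year 2 playing time.
        totals[p["team_y2"]] += p["rating_per60_y1"] * p["minutes_y2"] / 60.0
    return dict(totals)


def predictive_correlation(players, team_outcome_y2):
    """Correlate predicted team totals with an actual Year 2 team outcome.

    `team_outcome_y2` maps team -> the Year 2 figure being predicted,
    for example goals for (assumption: the caller picks the target).
    """
    totals = predicted_team_totals(players)
    teams = sorted(t for t in totals if t in team_outcome_y2)
    return pearson([totals[t] for t in teams],
                   [team_outcome_y2[t] for t in teams])
```

Running the same function once per metric (WAR components, shot attempt metrics, and so on) against the same team outcomes would give a side-by-side comparison of how well each metric’s Year 1 ratings anticipate Year 2 results.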

Application

WAR might be presented in some circles as a one-number metric, but each of its components provides a unique insight. Each component has a clearly different level of repeatability and none is exempt from outliers, so common sense still needs to prevail. If a player has a history of drawing penalties and one year suddenly does not, that information should be taken into consideration.

“Given the nature of the calculation and potential measurement errors, WAR should be used as a guide for separating groups of players and not as a precise estimate. For example, a player that has been worth 2.4 WAR and a player that has been worth 2.1 WAR over the course of a season cannot be distinguished from one another using WAR. It is simply too close for this particular tool to tell them apart. WAR can tell you that these two players are likely about equal in value, but you need to dig deeper to separate them.

However, a 2.4 WAR player and a 0.5 WAR player are different enough that you can have a high level of confidence that the first player has been more valuable to their team over the given season” – (FanGraphs)

Conclusion

This WAR model seems to provide a much more accurate depiction of a player’s true talent level than current metrics. Further testing will be required, especially with regard to the special teams metrics. The introduction of the varying components of this model is the first step in what will surely be a long and developing process. As more people dive into these numbers, there will hopefully be new discoveries that help move the conversation around hockey analytics forward.

All WAR and GAR data can be found here.

  • GAR is Goals Above Replacement and is simply WAR multiplied by the Goal to Win conversion factors discussed earlier (here).
  • Pure Offense = Even Strength Offense + Power Play Offense
  • Pure Defense = Even Strength Defense

“I’m just standing on the shoulders of giants.  WAR is simply an evolution, something that many creative people have had a hand in, or provided the necessary inspiration.  While my fingerprints are all over it, it’s more in terms of shaping it, than creating it.  All the hard work was done at the beginning, and then the rest of us came over to mold it, chip away at it, and present it in a way that could be consumed.” – Tom Tango

Please let me know if you have any thoughts, questions, concerns or suggestions. You can comment below, reach me via email here: DTMAboutHeart@gmail.com, or via Twitter @DTMAboutHeart.

3 thoughts on “Testing and Final Remarks”

  1. Thoroughly enjoyed and interesting results. Very interesting that penalty differential can be such a large component a la Kadri. Hopefully this type of research leads to greater debate versus corsi.
