A Primer on @DTMAboutHeart’s WAR Model


Introduction

Hockey stats have existed for about as long as the game itself. Simple boxscore stats such as goals and assists can be traced back almost a full century. These stats have helped inform fans, coaches, and managers of the value of players. Around the 1950s, the Montreal Canadiens began to track a player’s plus-minus, on the idea that simple boxscore stats failed to capture many important elements of a game. Plus-minus was a good start towards tracking impact that is not captured in traditional boxscore stats, but it has recently been shown to be quite incomplete by modern evaluation standards.

It is no surprise to many that hockey has lagged behind most sports in the depth and sophistication of its analytics movement. This is through no real fault of those participating in the research, but is simply a by-product of limited data access (relatively speaking) and a smaller fanbase. The most famous stat to come out of the modern hockey analytics movement is probably Corsi (or simply all shot attempts). Corsi has numerous benefits over goal-based metrics, mostly because shot attempts accumulate faster than goals, leading to more reliable information. Its true value lies in its ability to be more predictive of who will be better in the future.

Current Shot Attempt Metrics

A hiccup in hockey’s analytics movement lies not with shot attempt metrics themselves, but in how they are considered and presented. Hockey is a difficult game to analyze. It is much more fluid than baseball, it doesn’t have the high scoring of basketball (even if you equate Corsi to points), and it doesn’t have the continuity or tactical prowess of soccer. However, when many people talk about hockey stats (even the more analytically minded ones), they tend to simplify hockey’s many factors to the point of detriment. The three most popular methods currently used to analyze shot attempt numbers in hockey are as follows (using Corsi as an example, but one could substitute Fenwick or Expected Goals):

  • Corsi For%
    • All shots directed at the opponent’s net when Player X is on the ice / all shots taken either by Player X’s team or their opponents while Player X is on the ice
  • Corsi Relative%
    • The difference between how Player X’s team performs with Player X on the ice and how Player X’s team performs when Player X isn’t on the ice
  • WOWYs – With Or Without Yous
    • Similar to Corsi Relative%, but broken down to compare (typically) two players; it is presented in three parts
      • Together
        • How Player X and Player Y perform (their CF%) when they are on the ice together
      • Player X Apart
        • How Player X performs when they are not on the ice with Player Y
      • Player Y Apart
        • How Player Y performs when they are not on the ice with Player X
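To make the three definitions above concrete, here is a minimal Python sketch. The `Segment` format (a stretch of play with a fixed set of skaters on the ice plus shot attempt counts for and against), the function names, and the toy numbers are all assumptions made for illustration; they are not real data and not the format any particular site uses.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One stretch of play with a fixed set of skaters on the ice (hypothetical format)."""
    on_ice: frozenset    # skaters from Player X's team on the ice
    shots_for: int       # shot attempts by Player X's team
    shots_against: int   # shot attempts by the opponent


def cf_pct(segments):
    """Corsi For%: attempts for / (attempts for + attempts against), as a percentage."""
    cf = sum(s.shots_for for s in segments)
    ca = sum(s.shots_against for s in segments)
    total = cf + ca
    return 100.0 * cf / total if total else float("nan")


def corsi_rel(segments, player):
    """Corsi Relative%: team CF% with the player on the ice minus CF% with the player off."""
    on = [s for s in segments if player in s.on_ice]
    off = [s for s in segments if player not in s.on_ice]
    return cf_pct(on) - cf_pct(off)


def wowy(segments, x, y):
    """With Or Without You: CF% together, X without Y, and Y without X."""
    together = [s for s in segments if x in s.on_ice and y in s.on_ice]
    x_apart = [s for s in segments if x in s.on_ice and y not in s.on_ice]
    y_apart = [s for s in segments if y in s.on_ice and x not in s.on_ice]
    return {
        "together": cf_pct(together),
        "x_apart": cf_pct(x_apart),
        "y_apart": cf_pct(y_apart),
    }


# Toy data, invented purely for illustration.
segments = [
    Segment(frozenset({"X", "Y"}), 6, 4),
    Segment(frozenset({"X"}), 3, 3),
    Segment(frozenset({"Y"}), 2, 6),
    Segment(frozenset({"Z"}), 4, 6),
]

print(cf_pct([s for s in segments if "X" in s.on_ice]))  # CF% with X on the ice
print(corsi_rel(segments, "X"))
print(wowy(segments, "X", "Y"))
```

Note that every one of these numbers is a raw on-ice ratio: nothing in the calculation accounts for who else was on the ice at the time.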

The issue with these metrics can be best demonstrated by the plot below that shows the amount of time each forward on the New York Rangers played together during the 2015-2016 season (players that share more ice-time together have thicker connecting lines).  

[Figure: network plot of ice time shared between New York Rangers forwards, 2015-2016 season]

It’s a mess, and this doesn’t even include the six defensemen a team will typically dress. Due to hockey’s sheer number of players (18 players regularly participate in an NHL game vs. 10 players in an NBA game) and its fluid substitution method (changing on the fly), current methods vastly oversimplify the nuanced interactions between NHL players. Simply looking at a player’s on vs. off numbers (Corsi Relative%) or looking at a player’s performance without one specific player (WOWYs) discards a lot of important information.
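For a rough sense of scale, here is a small Python sketch of how shared ice time between pairs of teammates could be tallied. The `shifts` format (a set of skaters plus the seconds they spent on the ice together), the placeholder player names, and the numbers are assumptions for illustration only, not real Rangers data.

```python
from collections import Counter
from itertools import combinations
from math import comb


def shared_ice_time(shifts):
    """Total seconds each pair of teammates spent on the ice together.

    shifts: iterable of (players_on_ice, seconds) pairs (hypothetical format).
    """
    shared = Counter()
    for players, seconds in shifts:
        # Sort so each pair is always stored under the same (a, b) key.
        for a, b in combinations(sorted(players), 2):
            shared[(a, b)] += seconds
    return shared


# Toy data with placeholder names, invented purely for illustration.
shifts = [
    ({"A", "B", "C"}, 40),
    ({"A", "B", "D"}, 30),
    ({"C", "D", "E"}, 45),
]
pairs = shared_ice_time(shifts)
print(pairs[("A", "B")])

# With 18 skaters dressed and six of them defensemen, 12 forwards remain.
possible_forward_pairs = comb(12, 2)  # 66 distinct forward pairs
possible_skater_pairs = comb(18, 2)   # 153 distinct pairs among all 18 skaters
print(possible_forward_pairs, possible_skater_pairs)
```

A single WOWY looks at only one of those pairs at a time, which is part of why it leaves so much information on the table.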

One Number Metrics

“Know exactly what every player in baseball is worth to you. You can put a dollar figure on it.” – Billy Beane

No stat is perfect. That shouldn’t need to be said, but I feel it is necessary to get out in front of any strawman arguments. However, just because a single-number metric isn’t perfect doesn’t mean it can’t be incredibly useful. There is a ton of information that goes into evaluating a hockey player, and it is extremely difficult for people to properly track and weight all of those variables in an effective manner. The best example of this is Domenic Galamini’s HERO charts.

[Figure: screenshot of HERO charts]

While extremely useful for immediate analysis, they can leave one with more questions than answers. The two charts mainly show production (goals and assists) and possession (shot attempts), yet which is more important, and by how much? And if you go deeper into your analysis, how much of a player’s rating depends on their teammates or their competition? How do you handle a player’s WOWY when the time-on-ice samples with and/or without a given player are small? These are some examples of the types of questions that people need to answer when trying to determine player value.

Standard analysis typically involves gathering many different statistics and interpreting them through a mishmash technique. Weighting each factor on a case-by-case basis is difficult, time-consuming and inconsistent. The value in many one-number metrics (such as my WAR model) is that they are built on a theoretical construct. My model is a method for decomposing value into fractional parts that can then be attributed to particular players. The methodology came first and the numbers are the result of that methodology. These numbers are not absolutes, nor are they perfect rankings of player value; think of them more as strong estimates of a player’s on-ice contributions to a team.

Conclusion

My WAR model consists of 6 major components (and some minor components within those):

  • Even-Strength Offense
  • Even-Strength Defense
  • Power-Play Offense
  • Drawing Penalties
  • Taking Penalties
  • Faceoffs

The even-strength and special teams portions of this model are actually the result of two submodels: Expected Plus-Minus and Box Plus-Minus. The “extras” (penalties and faceoffs) are derived from two models originally developed by War-On-Ice creators Sam Ventura and Andrew C. Thomas. There is also a seasonal adjustment. There is no penalty kill defense included in this model; the justification for leaving it out will be expanded upon later. All of these areas will be explained in further detail later in this series, as will testing of the model’s repeatability and predictive power.
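As a purely structural illustration of the six components listed above, the Python sketch below stores one value per component and adds them up. It assumes the components are already expressed in the same units and simply sum to a total, and it leaves out the seasonal adjustment entirely; the class name, field names, and numbers are hypothetical, and this is not the model’s actual calculation.

```python
from dataclasses import dataclass, fields


@dataclass
class WarComponents:
    """One value per major component, assumed to share the same units (hypothetical)."""
    ev_offense: float
    ev_defense: float
    pp_offense: float
    drawing_penalties: float
    taking_penalties: float  # signed: taking many penalties would be negative
    faceoffs: float

    def total(self) -> float:
        """Sum of all components; additivity is an assumption of this sketch."""
        return sum(getattr(self, f.name) for f in fields(self))


# Made-up values for an imaginary player, not model output.
player = WarComponents(
    ev_offense=1.0,
    ev_defense=0.5,
    pp_offense=0.25,
    drawing_penalties=0.25,
    taking_penalties=-0.5,
    faceoffs=0.0,
)
print(player.total())
```

Keeping the components separate like this makes it easy to see where a player’s total comes from rather than only the final number.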

Please let me know if you have any thoughts, questions, concerns or suggestions. You can comment below, or reach me via email at DTMAboutHeart@gmail.com or on Twitter @DTMAboutHeart.
