Every once in a while I will rant on the concepts and ideas behind what numbers suggest in a series called Behind the Numbers, as a tip of the hat to the website that brought me into hockey analytics: Behind the Net.
Hockey’s plus/minus may be the worst statistic in hockey, although goalie statistics not based on save percentage (like GAA or Win%, which just add a team component to a goalie’s save percentage) give it some competition. It could even be in contention for the worst statistic in sport.
Now, some people may read that and think I’m simply saying this because I value shot metrics over goal metrics in player evaluations. While I do feel that way, it is only one of a few reasons that plus/minus fails to be useful.
Shot based metrics are typically better than goals
I wanted to get the obvious one for a hockey stats guy to argue out of the way first.
Don’t get me wrong, goals are the end objective, and in the very long run they should be worth at least a look, but we also know that sometimes players get lucky bounces, or that goaltenders steal games. This uncertainty means we can’t rely on goals for predictive value.
The purpose of analysis is to maximize the probability of future outscoring, and to do this requires looking at those metrics that suggest success is most likely in the future.
Noting who outscored who doesn’t really matter if they do not continue outscoring. This is important.
What is most likely to happen in the future is a better measure of who players truly are than what happened before.
We’re no longer limited to shot quantity in the toolbox either, as shot metrics can now account for shot quality, like with the expected goal (xG) model seen above.
Goals are rare, and with any rare statistical event it becomes easy for the sample to be skewed by outliers and not be indicative of the “true-talent level” or population. There are also more confounding variables added with the highly unstable impact that is goaltending performance on both ends while a player is on the ice.
Combine a rare event with highly variable confounders and you get a number that takes a very long while to settle. With outscoring, this often means multiple seasons.
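To put rough numbers on how slowly a rare event settles, here is a minimal sketch using the textbook standard error of a proportion. The on-ice counts (80 goals versus 1,200 shot attempts) are made-up round figures for illustration, not data from any real player, and the simple binomial model ignores goaltending and every other confounder, so treat it as a lower bound on the noise.

```python
import math


def share_standard_error(true_share: float, events: int) -> float:
    """Standard error of an observed share under a simple binomial model."""
    return math.sqrt(true_share * (1.0 - true_share) / events)


# Hypothetical on-ice event counts for one skater-season (assumed, not real data).
ON_ICE_GOALS = 80
ON_ICE_SHOT_ATTEMPTS = 1200

# Assume the skater's team truly controls exactly half of each event type.
goal_se = share_standard_error(0.50, ON_ICE_GOALS)
shot_se = share_standard_error(0.50, ON_ICE_SHOT_ATTEMPTS)

print(f"goal share: 50% +/- {100 * goal_se:.1f} points")
print(f"shot share: 50% +/- {100 * shot_se:.1f} points")
```

Under these assumptions, a truly break-even player’s goal share wobbles by about 5.6 percentage points from luck alone, while the shot share wobbles by about 1.4.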
Plus/Minus arbitrarily chooses to exclude some goals and not others
This is sometimes news to those that use plus/minus as a stat. The number is not exclusive to 5v5, or even to even strength play. The plus/minus statistic includes all even strength goals, all goals with either goaltender pulled (as hockey views goalie pulled as even strength), and short handed goals.
This causes some very odd skewing in plus/minus to particular player types.
Because shorthanded goals count while power play goals do not, time on the power play can only hurt a skater’s rating (or leave it even). It also means that a penalty killer can only improve their rating with tallies in the plus column (or stay even). The longer they spend in those situations, the greater the likely skew, regardless of how effective they are in their respective special teams deployment.
The skew is increased further by empty nets. Pulling the goalie is a desperation tactic that increases the chance of scoring while increasing the chance of a goal against even more.
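The inclusion rule described here can be sketched in a few lines. The `Goal` type, its field names, and the sample goal lists are all hypothetical and invented for illustration. The only rule encoded is the one stated in the text: even strength goals (goalie-pulled goals are treated as even strength) and shorthanded goals count, power play goals do not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Goal:
    scored_by_us: bool  # True if the skater's own team scored
    scorer_state: str   # scoring team's manpower: "EV", "PP", or "SH"


def counts_for_plus_minus(goal: Goal) -> bool:
    """Even strength and shorthanded goals count; power play goals do not."""
    return goal.scorer_state in ("EV", "SH")


def plus_minus(on_ice_goals: list[Goal]) -> int:
    """Sum pluses and minuses over the goals a skater was on the ice for."""
    total = 0
    for goal in on_ice_goals:
        if counts_for_plus_minus(goal):
            total += 1 if goal.scored_by_us else -1
    return total


# Made-up special teams results for two skaters (not real data).
pp_specialist = [
    Goal(True, "PP"), Goal(True, "PP"), Goal(True, "PP"),  # ignored
    Goal(False, "SH"),  # opponent scores shorthanded: a minus
]
penalty_killer = [
    Goal(False, "PP"), Goal(False, "PP"), Goal(False, "PP"),  # ignored
    Goal(True, "SH"),  # own team scores shorthanded: a plus
]

print(plus_minus(pp_specialist))   # -1
print(plus_minus(penalty_killer))  # 1
```

In this toy example the power play specialist is on the ice for three goals for and one against yet finishes at -1, while the penalty killer is outscored three to one and finishes at +1.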
This means that players who typically play with the lead and the opposition’s goaltender pulled are more likely to garner pluses than minuses over the long run. Players who are playing from behind and have their own goalie pulled are the opposite.
There is a tendency for penalty killers to play in the lead situation, and power play unit skaters to go out with the trailing situation, so these two factors in general skew in the same direction for skaters.
A handful of short handed and empty net goals may not seem like much, until you realize that the standard deviation in even strength goal differential is only about nine goals.
There is no science or evidence-backed reasoning to the meaningfulness of using some goals and not others. It is simply because it seemed right to the first person who made the rules for the number, and then tradition has kept it that way.
One example I like to use is Mark Stuart of the Winnipeg Jets for the 2014-2015 season:
Mark Stuart carried the Jets’ team-worst goal differential of the 13 defenders when looking at even strength (-3), power play (+0), and penalty kill (-19). He ended the season with the 6th-best plus/minus, with a +5 rating.
Stuart logged huge minutes on the penalty kill and essentially none on the power play. This, plus empty net goals, skewed his plus/minus up despite being the worst differential player in essentially all the situations.
Plus/Minus is a counting statistic
Every NHL skater to play at least one game of hockey since the 2007-08 season.
It may not be intuitive why a raw differential does not work well for comparing outscoring, but two major problems come from the statistic being a counting differential.
1: The worst plus/minus player is unlikely to be the worst at outscoring the opposition
Often when people defend plus/minus, they point at career leaders in the statistic. The issue with any raw counting statistic is that ice time is a huge factor in the values.
It makes sense that the leaders in plus/minus would be good players, since on the whole, coaches are quite good about assigning ice time to their best goal scorers and preventers. Thus, with that many minutes and a high talent level, they are likely to rack up large plus/minus.
While there are flaws in the market and in the processes of NHL management, for the most part statistical analysts view decision-making to be good. Hockey is not like baseball, where there is huge, undervalued, low-hanging fruit being missed. On average, ice time is given to the better Corsi players, the better point per minute producers, and the better WAR players.
The same cannot be said of plus/minus.
What those who point at plus/minus leaders as evidence of its effectiveness end up ignoring are all the skaters that have tallied the most minuses. Just like the players with the highest plus/minus rating are big minute players, the players with the lowest plus/minus rating are also the players eating a ton of minutes.
The weakest players are not given enough ice time to accumulate large negative values in plus/minus. The most negative players typically have some combination of playing lots of minutes, playing for weaker teams, and carrying a low PDO. They are very rarely the most negative players relative to ice time as well.
2: Not all similar ratings are created equal
What does a +5 rating tell you about a player? Well, that the team scored 5 more goals with them on the ice than the opposition scored (…well, not exactly, since this excludes power play goals for and goals against while shorthanded, as we noted earlier).
How good is that though?
Is the skater +5/-0, or +10/-5, or +15/-10, or +20/-15, or +25/-20, … or +50/-45? That scale runs from a team controlling 100% of the goals down through 66.7%, 60%, 57%, 56%, … 53%. Those are some huge differences in performance.
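The arithmetic behind that scale is easy to check. A minimal sketch, where `goal_share` is just a helper name chosen here and the goals for/against pairs are the ones listed above:

```python
def goal_share(goals_for: int, goals_against: int) -> float:
    """Team's share of all on-ice goals, as a fraction between 0 and 1."""
    return goals_for / (goals_for + goals_against)


# The goals for / goals against pairs from the text; each one is a +5 rating.
records = [(5, 0), (10, 5), (15, 10), (20, 15), (25, 20), (50, 45)]

for gf, ga in records:
    print(f"+{gf}/-{ga}: rating {gf - ga:+d}, goal share {goal_share(gf, ga):.1%}")
```

Every line prints the same +5 rating while the goal share falls from 100% to under 53%.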
When looking at very large multi-season samples to diminish shooting and save percentage variance, players with their team controlling about 60% of goals are top players on top teams, like Pavel Datsyuk, Nicklas Lidstrom, Jonathan Toews, Patrice Bergeron, and the Sedin Twins. Meanwhile, the 53% goal share range has players like Mark Fistric, Brenden Morrow, David Perron, and Justin Abdelkader.
Yes, outscoring is important… but even in the large samples, plus/minus fails to properly measure outscoring.
When only looking at the surface level, it is understandable why some would like plus/minus. Outscoring is the ultimate objective of hockey, so you might want to look at who has outscored the most.
Looking at goals as a measure of outscoring doesn’t necessarily tell you who will outscore in the future, and we have far more effective means of determining who will.
Plus/minus doesn’t even say who outscored the most because of its arbitrary methodology, and can even give you false positives: a guy who has a positive plus/minus rating might actually have a negative goal differential. Those who have the worst plus/minus may not even be the worst plus/minus players relative to their ice time. Two players can have the same rating and be vastly different in their outscoring performance.
If you want to use goals, do not use plus/minus. Use goal% (team’s share of on-ice goals) or turn a goal differential into a rate relative to ice time instead. Also, separate out different situations, like the power play, penalty kill, and even strength, from each other.
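As a rough sketch of what that looks like in practice, the snippet below computes goal% and a goal differential per 60 minutes separately for each situation. The `SituationLine` type, its field names, and every number in `lines` are invented for illustration and do not describe any real player.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SituationLine:
    situation: str      # e.g. "5v5", "PP", "PK"
    goals_for: int      # on-ice goals for in this situation
    goals_against: int  # on-ice goals against in this situation
    minutes: float      # time on ice in this situation (assumed > 0)

    def goal_pct(self) -> float:
        """Team's share of on-ice goals; NaN if no goals were scored."""
        total = self.goals_for + self.goals_against
        return self.goals_for / total if total else float("nan")

    def diff_per_60(self) -> float:
        """Goal differential scaled to 60 minutes of ice time."""
        return 60.0 * (self.goals_for - self.goals_against) / self.minutes


# Hypothetical season for one skater, split by situation (not real data).
lines = [
    SituationLine("5v5", goals_for=45, goals_against=40, minutes=1200.0),
    SituationLine("PP", goals_for=20, goals_against=2, minutes=200.0),
    SituationLine("PK", goals_for=1, goals_against=18, minutes=150.0),
]

for line in lines:
    print(
        f"{line.situation}: goal% {line.goal_pct():.1%}, "
        f"differential per 60 {line.diff_per_60():+.2f}"
    )
```

Keeping the situations on separate rows means a power play or penalty kill result is only ever compared against other power play or penalty kill results, rather than being folded into a single total.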
However, for samples the size of a single season, which is how we tend to look at players, we have far better measures of a player’s effectiveness than goal metrics.
Anything plus/minus tries to do, there is something else that does it better.