Input versus Output: An Ongoing Battle that No One Knows About

The XKCD comic is written by Randall Munroe, a physicist who probably doesn’t know what hockey’s underlying numbers (i.e., #fancystats or advanced statistics) even are, let alone support them… yet, for the most part, he gets it.

Mainstream sports commentary is full of poor analysis when it comes to using numbers appropriately. Most of this comes from not understanding the difference between inputs and outputs, and how much a player can control certain factors. (It should be noted that this is a broad generalization; not everyone falls into this category.)

Benjamin Wendorf touched on some of these points in his recent article Why The Hockey News’ Ken Campbell is Wrong About Alex Ovechkin, but Campbell still didn’t get it.

What happened:

For those that do not know, here is a quick summary of Campbell’s article:

* Campbell stated that Alex Ovechkin will likely score 50+ goals again this season but this will be his worst 50+ goal season for overall play

* Campbell judged Ovechkin’s overall play by the Russian forward’s very poor and team-low plus/minus

* Campbell hypothesized that Ovechkin’s poor plus/minus was due to a lack of defensive effort and acumen (to Campbell’s credit he also pointed out that the Washington Capitals had an effect due to being a weaker defensive team overall, apart from Ovechkin)

* Campbell then used video of goals from the previous game as evidence supporting his hypothesis, writing: “Not all of the goals scored by the Blue Jackets were Ovechkin’s fault, but take a look at these four and decide for yourself how much Ovechkin looks like he even has a faint interest in playing two-way hockey.” (Note: Confirmation bias alert)

* Finally Campbell conjectured that the Capitals will never be more than a one-series-or-no-playoffs team unless Ovechkin improves his defensive game

Wendorf’s article pointed out that in 2012, when the Capitals lost in the second round of the playoffs, Campbell raved about Ovechkin’s excellent two-way play, yet the underlying numbers did not support this (note: it was also Ovechkin’s career-worst postseason +/- per game). Wendorf also showed how otherworldly Ovechkin’s offense has been and provided a comparable season of Mario Lemieux’s, all while alluding to the real root of Ovechkin’s poor plus/minus… something that Campbell seemed to miss:

I’m sure Ken Campbell as a writer understands the definition of irony…

So, what’s the problem with Campbell’s article?

The problem is that the whole foundation of Campbell’s hypothesis rests on two claims: that plus/minus is indicative of Ovechkin’s overall play, and that Ovechkin’s poor plus/minus this season is due to his poor defensive play. Both claims are wrong in this situation.

Now let’s get something straight before we dive into the numbers; the point being argued is not whether or not Ovechkin is a superstar defensively. I don’t think there is any split decision on this. Rather, the argument is whether Campbell’s assertion has any merit given the evidence he is using to support his hypothesis.

Well then what’s the problem with plus/minus?

There are many, the biggest being that goals are very infrequent events, especially when you limit them to even strength or 5v5. The whole idea of statistics is that once a sample is large enough, it gives you an accurate estimate of what to expect from the total population. The larger the sample, the less false positives and negatives distort it. So, the less frequent an event is, the longer it takes for the sample to become large enough.

The Capitals have seen 41 5v5 goals against while Ovechkin has been on the ice (interestingly, only 21 in score-close situations) thus far this season. For comparison, this is fewer than the number of shot attempts against (Corsi-against events) that the Capitals see in an average game. Corsi doesn’t become very reliable until around the 20-game mark for teams, so you can get an idea of how reliable that plus/minus sample is for Ovechkin.
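To put a rough number on "infrequent," here is a minimal sketch that assumes event counts behave like Poisson counts, in which case the relative noise in a count of N events is about 1/sqrt(N). The 41 goals against comes from the paragraph above; the shot-attempt total is a made-up round number for illustration only, not a real Capitals figure.

```python
import math


def relative_standard_error(event_count: int) -> float:
    """Relative standard error of a count, assuming it is Poisson-distributed."""
    if event_count <= 0:
        raise ValueError("event_count must be positive")
    # For a Poisson count N the standard deviation is sqrt(N),
    # so the error relative to the count itself is sqrt(N) / N = 1 / sqrt(N).
    return 1.0 / math.sqrt(event_count)


goals_against = 41            # 5v5 goals against with Ovechkin on the ice (from the text)
shot_attempts_against = 900   # HYPOTHETICAL Corsi-against total, for illustration only

print(f"goals against:         {relative_standard_error(goals_against):.1%}")
print(f"shot attempts against: {relative_standard_error(shot_attempts_against):.1%}")
```

Under that assumption, a 41-event sample carries roughly 16% noise before any question of talent comes up, while a sample in the hundreds is down to a few percent.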

The other important factor when using statistics to evaluate players is how much control a player has over the statistic in question. This is measured statistically by how repeatable an outcome is. I’ve always likened it to playing a game of basketball with your friend: if he sinks a fancy trick shot and you think it was lucky, you will ask him to try it again.

Plus/minus has very low repeatability, which demonstrates that players have very little control over it. There are many factors that contribute to plus/minus: nine other players on the ice, goaltending talent differences, what type of game your coach expects of your line (top-6 players will always take more risks and chances relative to bottom-6 players who are asked to play safe), what type of minutes your coach puts you on the ice for (again with taking risks: are you trying to keep a lead or make a lead), bounces (ie: luck), the huge variance that is small-sample save percentage, etc.
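One way to see what "repeatability" means is a toy split-half test: give every simulated player a fixed true talent, generate two half-seasons of events, and check how well the first half's differential predicts the second. The sketch below is entirely hypothetical (the talent spread, the event rates of 20 and 450 per half, and the Poisson model are all assumptions, not fitted to NHL data); it only illustrates why a rare-event differential repeats worse than a frequent-event one when the underlying talent is identical.

```python
import math
import random


def poisson(rng: random.Random, mean: float) -> int:
    """Draw one Poisson sample (Knuth's method; fine for means below ~700)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def split_half_repeatability(events_per_half, players=400, talent_sd=0.05, seed=1):
    """Correlate first-half and second-half differentials for simulated players."""
    rng = random.Random(seed)
    first_half, second_half = [], []
    for _ in range(players):
        # Hypothetical true talent: tilts events for vs. against by a few percent.
        talent = max(-0.3, min(0.3, rng.gauss(0.0, talent_sd)))
        rate_for = events_per_half * (1 + talent)
        rate_against = events_per_half * (1 - talent)
        first_half.append(poisson(rng, rate_for) - poisson(rng, rate_against))
        second_half.append(poisson(rng, rate_for) - poisson(rng, rate_against))
    return pearson(first_half, second_half)


goal_r = split_half_repeatability(events_per_half=20)    # rare events, goal-like
shot_r = split_half_repeatability(events_per_half=450)   # frequent events, shot-attempt-like
print(f"goal-like differential repeatability: {goal_r:.2f}")
print(f"shot-like differential repeatability: {shot_r:.2f}")
```

The talent in both runs is the same; only the event frequency changes. This leaves out everything else listed above (teammates, goaltending, usage), all of which would push the goal-based number down further.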

Combine these two factors and you get a very unreliable statistic:

Check out at 53 games how reliable that +/- is…it’s that stumbling line far below the others.

Gabriel Desjardins once responded to how reliable +/- really is: “I think it’s very reliable – in fact, you would do reasonably well signing the player who finishes dead last in the league in +/- every season provided he wasn’t on an expansion team.”

This is where the major tripping point for mainstream hockey analysts lies. Plus/minus is the desired output, so theoretically you’d think that better players would reach that desired output more often. Across the NHL as a whole this is true; however, it is not true often enough to be a reliable piece of evidence for individual players (and sometimes even teams).

To read up on more of these I would suggest these articles and their subsequent links:

The Importance and Misconceptions of Advance Hockey Analytics

Percentages and Probabilities: Does luck exist and how do you factor for it?

Why is Ovechkin’s plus/minus so low anyways?

The short answer is mostly the two factors we discussed above, although this doesn’t disprove Campbell; it only creates reasonable doubt.

In an article that is probably among my favorites, Tyler Dellow shows statistics broken up into buckets for particular lines (estimated by ATOI):

Using this data, we see that 2011-12’s average first line playing the same amount of 5v5 minutes as Ovechkin would see 38 goals for and 34 goals against, as opposed to Ovechkin’s 25 GF and 41 GA.

The first thing you’ll notice is that goals-for show the bigger difference and are therefore the larger problem. With Ovechkin’s primary purpose being to score goals, you may be tempted to place the blame on Ovechkin. The problem with that theory is that Ovechkin’s 5v5 shooting and goal-scoring rate is actually above his average over the last six seasons. In actuality, the problem is everyone else… sort of.

As Wendorf showed in his article, the issue is his linemates on average have only scored on 3% of their shots. That’s half the shooting percentage of a normal fourth line. Where is the call for Nicklas Backstrom and Marcus Johansson to improve their offense? Well obviously there is none because people know it’s just been situational luck. Yet this has been the largest factor for Ovechkin’s poor plus/minus; his linemates haven’t received the normal amount of bounces.

To place some perspective on it: Ovechkin has 14 goals and only 2 assists with Backstrom on the ice; he also has 15 goals and only 3 assists with Johansson. You can’t blame Ovechkin for those players running into bad luck.

But his GA is worse than the average first line’s too.

Yes, and I’m sure that some small percentage of it will have to do with Ovechkin’s game. No one would be able to convince me that Ovechkin is as good defensively as Pavel Datsyuk, Patrice Bergeron or Anze Kopitar – and the underlying numbers show this. However, the difference between average and what Ovechkin has seen is just over 1 goal for every 10 games, and with that small of a value you can’t convince anyone that it is overwhelmingly significant.
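For anyone who wants to check the arithmetic, here is a quick sketch using the goal totals quoted earlier (25 GF and 41 GA for Ovechkin, 38 GF and 34 GA for the average first line in the same minutes). The 53-game figure is an assumption taken from the 53-game mark mentioned above.

```python
GAMES_PLAYED = 53  # ASSUMPTION: the 53-game mark referenced earlier in the post

ovechkin = {"goals_for": 25, "goals_against": 41}
average_first_line = {"goals_for": 38, "goals_against": 34}


def per_ten_games(goal_gap: int, games: int) -> float:
    """Convert a raw goal gap into a rate per 10 games."""
    return goal_gap / games * 10


# Extra goals against compared with the average first line.
extra_against = per_ten_games(
    ovechkin["goals_against"] - average_first_line["goals_against"], GAMES_PLAYED
)
# Missing goals for compared with the average first line.
missing_for = per_ten_games(
    average_first_line["goals_for"] - ovechkin["goals_for"], GAMES_PLAYED
)

print(f"extra goals against per 10 games: {extra_against:.2f}")  # 1.32
print(f"missing goals for per 10 games:   {missing_for:.2f}")    # 2.45
```

On those assumptions the goals-for shortfall is nearly twice the size of the goals-against excess, which is the same point made above: the offense around Ovechkin is the bigger gap.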

In the end, the game is about out-scoring, not singularly creating or preventing goals. Ovechkin is 3rd in team shot attempt differentials, 4th if you remove blocked shots, and 4th for those that hit the net. These are Ovechkin’s inputs; his team-low goal differential is the output.

Underlying numbers are the inputs. Judge a player by their inputs; those are the factors players control and are repeatable. Do not focus on the factors a player doesn’t control.


What does this have to do with Ovechkin’s defensive play? Certainly more than Campbell citing Ovechkin’s plus/minus did.

Did we actually watch the game that night or did we just run spreadsheets? I did; I can’t speak for Ben, though. On the other hand, it’s not like watching the game changes these numbers…

But “watch the games, genius?” Mr. Campbell said there is a place for analysis like this, and I think he’s right. Its place is to remind people that certain numbers have certain meanings and people can get them wrong; in this example it was Ken Campbell.

Bonus Section: Craig Button defends his friend Campbell by pointing out something that doesn’t have anything to do with the subject

Button replied to Wendorf’s text with a link that doesn’t work because he typed it twice. I googled to find what he was alluding to: this article, whose only similarity to the discussion is that both revolve around Ovechkin.

The faux advanced stats website started by the Star has been quite the interesting place. The numbers are real but we again see the same problems with poor analysis. So, to add this in…

First, Jay Palansky mentions assist-to-goal ratios, which are trivia rather than indicative of anything at a raw data level. He writes as though particular ratios are better than others, which is nonsensical. Meanwhile, Wendorf’s response to Campbell demonstrated why certain players, like Ovechkin, should take a larger share of shots. Then Palansky mentions how the ratio has dropped severely this year (see above re: linemates’ shooting percentages), as if the steep drop were the result of Ovechkin playing worse, which we now know is not true.

Then he talks about Ovechkin’s power play : even strength goal ratio. Again, he’s assuming these ratios are good or bad. Now normally, advanced stats folks are wary of power-play production because it is highly variable, but many successive, dominant seasons show that Ovechkin is a legitimate powerhouse in this respect. Palansky ignores the fact that Ovechkin has been one of the league’s top goal scorers in 5v5 over the last four years as well; that’s kind of understandable, at least, because he’s just that much better playing on the power play. But then he states Ovechkin can’t produce on 5v5, focusing on this season (that we already know is hurt by his teammates’ unlucky shooting percentage), ignoring that the last 4 years show him to be a top 10 5v5 point producer.

Palansky continues by noting the Capitals’ goal percentage, which is essentially plus/minus expressed as a percentage; doing so removes the comparative problem of TOI differences. Regardless, it’s redundant; we already know that over this sample the plus/minus is bunk and have uncovered the major reasons why.
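To show why goal percentage is just plus/minus in another form, here is a small sketch assuming the usual definition, goals for divided by total goals for and against. It reuses the 5v5 totals quoted earlier in this post; the function name is mine, not something from Palansky’s article.

```python
def goal_percentage(goals_for: int, goals_against: int) -> float:
    """Share of all on-ice goals that went in the right net (assumed definition)."""
    total = goals_for + goals_against
    if total == 0:
        raise ValueError("no goals recorded")
    return goals_for / total


# 5v5 totals quoted earlier: Ovechkin vs. the average first line in equal minutes.
ovechkin_gf_pct = goal_percentage(25, 41)
first_line_gf_pct = goal_percentage(38, 34)

print(f"Ovechkin:           {ovechkin_gf_pct:.1%}")    # 37.9%
print(f"Average first line: {first_line_gf_pct:.1%}")  # 52.8%
```

The percentage is built from the same two goal counts as plus/minus, so it inherits the same small-sample problem; all it adds is a scale that doesn’t depend on ice time.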

Ovechkin does nothing but score goals? Probably should have looked at an important advanced stat like pushing possession too.

