
The only context that can explain Andrew MacDonald’s performance is if he’s actually wearing these chains under his uniform.
Eric Tulsky frequently points out on Twitter that common critiques of analytics people (whether in hockey or any other sport) tend to act as if those involved with analytics are kind of stupid and have ignored the obvious. For example, people tend to respond to arguments involving Corsi and possession by bringing up the obvious subject of context – “Sure he has a bad Corsi, but he gets tough minutes!” And the general response, of course, is: yes, we have considered that, and we wouldn’t be making these assertions had we not done so. Hockey analytics has come up with a multitude of statistics to measure context – Behind The Net alone has 3 metrics for quality of competition and 3 metrics for quality of teammates, plus a measure of zone starts – HA has multiple different measures for the same thing, and so now does Extra Skater (with Time on Ice QualComp and QualTeam).
But what’s ALSO the case, and is WAY too often overlooked, is that a good many of these statistics were created without actually looking too deeply into how much these parts of context actually MATTER. It’s one thing to say a D Man plays tough competition or tough zone starts (or weak opponents and cushy zone starts). But too often people don’t take the extra step of seeing how much that matters.
For example, Andrew MacDonald is the Isles’ premier left-handed “shutdown defenseman.” But for three years now, he’s kind of been awful in the possession numbers, culminating in even worse numbers this year. Yet people have always noted that MacDonald has played incredibly tough competition year in and year out, and decently hard zone starts, as if that justified his performance. So I took a look at how well MacDonald performed against competition of various levels, and sure enough, he underperformed against even poor competition (and was terrible vs even average opponents). Context did not explain MacDonald’s performance – the numbers were terrible because he himself was poor.
In fact, what people who have looked into these things have frequently found is that certain parts of context simply don’t make as much of a difference as you would think. The impact of quality of competition seems to be very low over the course of a season, since the differences in competition faced mostly even out over the long run (see this article by David Johnson or this by Eric Tulsky) – nearly all players face an average opponent somewhere between +1 Corsi/60 and -1 Corsi/60 (and the most extreme are around 2). That’s a 2-shot difference – basically nothing. The impact of zone starts is similarly a lot less than previously believed – around 0.31 Fenwick per offensive-zone start (roughly 0.4 Corsi) as seen in this Eric T study, which is about half of what was previously believed. The various neutral zone entry trackers (myself included) have all found very similar results.
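To put that zone-start number in perspective, here is a rough back-of-the-envelope sketch. It uses the roughly 0.4 Corsi per offensive-zone start figure from above; the player, his start counts, and his raw Corsi are all made up for illustration, and I'm assuming a defensive-zone start costs about as much as an offensive-zone start gains, which is a simplification rather than anything the study itself states.

```python
# Approximate value of one offensive-zone start, in Corsi events
# (the ~0.4 figure quoted above; treat it as a rough estimate).
CORSI_PER_OZ_START = 0.4


def zone_start_adjusted_corsi(corsi_diff, oz_starts, dz_starts,
                              per_start=CORSI_PER_OZ_START):
    """Remove the estimated zone-start effect from a raw Corsi differential.

    Assumption (mine, for illustration): the effect is symmetric, so only
    the net number of offensive-zone starts (OZ minus DZ) matters.
    """
    net_oz_starts = oz_starts - dz_starts
    return corsi_diff - per_start * net_oz_starts


# Hypothetical player: 270 OZ starts vs 330 DZ starts (45% offensive),
# with a raw Corsi differential of -150 on the season.
raw = -150
adjusted = zone_start_adjusted_corsi(raw, oz_starts=270, dz_starts=330)
print(f"raw: {raw}, zone-start adjusted: {adjusted:.1f}")
```

Under those assumptions, a fairly tough 45% zone-start deployment only buys this made-up player back about 24 Corsi events over a full season. He goes from -150 to roughly -126, which is still bad.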
Take a look at this classic graph from Hawerchuk:
People have cited this graph constantly to talk about how competition and zone starts matter – in fact, Hawerchuk does it himself in the post this graph comes from. But the difference in performance as you go across the competition axis is very, very small! Hell, this is even the case if you go to extremes. With zone starts, you do see a difference, but in the middle, where most players fall, the difference is small. The gap between 45% zone starts and 55% isn’t huge, and most players will be within such a gap. Yet people – even people who are quite fluent in analytics – will constantly cite such contextual measures as if they justify good or bad performance for various players.
This isn’t to say that no part of context is important, of course – all studies have found that quality of teammates can make a big difference in shot differentials (since unlike competition, a team can control the level of teammates each player is on the ice with to near perfection). But the point is that in an effort to make sure we DO control for context, we too often go overboard and overemphasize context to the point where we fail to recognize what is right under our noses. Sometimes players with bad numbers are bad. Sometimes players with good numbers are good.
This makes a lot of sense, in that we have to realize the course of play in the NHL and its results, particularly in possession, fall within the window of 60-40, even on an individual basis. Hence why I threw this out there a little while ago: http://benwendorf.tumblr.com/post/71330563375/rule-of-60-40-nhl-possession-fenwick
I think, at some point, we might have to embrace a rule of 60-40 and use it to start building a new sense of “player contributions,” but for the moment it’s just a theory that needs to be poked and prodded.
Good stuff Garik. People who use Corsi need to understand the limited impact of QoC specifically on performance.
This was why Steve and I calculated dCorsi, to account for context by weighting the above-mentioned factors in addition to TOI/GP and score effects. We really need to write an article explaining the purpose and calculation of dCorsi. Seems like something that a lot of people easily misinterpret as an evaluator of overall performance as opposed to an evaluator of how well a player over/underperforms.
I think a post explaining such things would be great, along with a formula even if 99% of people wouldn’t bother trying to calculate it. I think it’s great to see people objectively attempting to find ways to adjust for context and looking at how much each part of context affects things.
I’ll be looking at something slightly different in my next post.