Around the New Year, I put together a survey inspired by one the soccer analytics community ran last year. Thom Lawrence organized that one, and the results were informative and cool to see. He’s also a good follow on Twitter. This post will go over the results from nearly 500 responses.
Happy Max Corsi Productivity Day! We’ve reached the point in the season where Corsi best predicts future winning percentage. There are plenty of more advanced ways to predict how the rest of the season will go, but Corsi offers a simple baseline, and that simplicity helps explain why it is so important. I’ll first explain what that means and why it matters, then take a look at how we can use it to predict basic shifts in the standings for the rest of the NHL season.
On November 21st, a similar version of the above graph was published, showing that Eric Tulsky’s addition to the Carolina Hurricanes may have improved their shot differentials but harmed their shooting percentage.
No longer though:
Garret Sparks of the Toronto Maple Leafs made history in his NHL debut after being drafted in the 7th round and working his way up from the ECHL. By all accounts, he did it on merit by maintaining a .924 sv% since turning pro, including a .940 mark over the past two years in the minors.
He’s earned his big break, but in a way he is lucky to be playing for an organization that values performance and statistical trends as much as the Leafs do. I’m not sure his story would have unfolded quite this way had he been born a couple of years earlier, or had he belonged to a team that only tries out a young goalie if he’s over 6’5″. But we’ll get back to that.
Unfortunately, size couldn’t work forever…the Ducks’ failure to advance to the Stanley Cup Finals realized the 30% chance that none of our brackets correctly picked both series winners last round. My only conclusion is we don’t know anything about hockey.
In a related story, SAP bricked one of their picks as well, so the Finals will ultimately determine if their “85% accurate model” manages to do better than a coin flip this year (as of right now, they are 8 for 14). Let’s see how truculence, size, and experience did last round, where they stand for the playoffs, and which one of them will accurately predict who wins the Cup.
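For a rough sense of what "8 for 14" means next to a coin, here is a minimal sketch. It assumes every series is an independent 50/50 pick, which is my own simplification and not how SAP's model (or any real bracket) works; the function name `prob_at_least` is purely illustrative.

```python
from math import comb


def prob_at_least(wins: int, series: int, p: float = 0.5) -> float:
    """Chance of picking at least `wins` of `series` correctly when each
    pick is independently right with probability `p` (assumed, not measured)."""
    return sum(
        comb(series, k) * p**k * (1 - p) ** (series - k)
        for k in range(wins, series + 1)
    )


# Hypothetical coin-flipper facing the same 14 series as SAP so far.
coin_flip_chance = prob_at_least(8, 14)
print(f"Coin flipper gets 8+ of 14: {coin_flip_chance:.1%}")

# The two possible final records once the Finals are decided.
print(f"Finals pick right: {9 / 15:.1%}, Finals pick wrong: {8 / 15:.1%}")
```

Under those assumptions, a coin flipper matches or beats 8 of 14 a bit under 40% of the time, so the record so far says little either way.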
Out of curiosity, and having access to some of the data, I decided I could chart the distribution of player overall ratings in the EA NHL series in its first decade of existence (the first of the series and NHL 99 being the exceptions). Knowing full well that, by 2005, there was a popular gripe that “anybody could get a 70 overall rating,” it seemed like it would be fun to see how we arrived at that point. As you can see, the ’93 version was remarkable in its near-even distribution; most famously, Tampa Bay Lightning defenseman Shawn Chambers received an overall rating of 1. The subsequent games never attempted a similar approach; there were marked divergences for the ’96 and ’04 versions, the latter essentially bringing us to the place where it seems anyone can get a 70 rating. I’d be interested to hear your comments suggesting theories and/or evidence for why we saw this kind of movement.
At this point I’m inclined to say, as an NHLPA-approved product, it probably wasn’t enjoyable for the players to have low ratings, and thus have that opinion of them reflected to thousands of young fans. More importantly, those fans probably didn’t get much of a kick out of playing with poorer players (playing against them, on the other hand…). I’d also guess that, when you are rating a player’s numerous attributes, it’s hard to end up with a 1 overall unless you had negative values (which they didn’t) or a very low weighting for multiple attributes (which they mostly didn’t).
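To make that last guess concrete, here is a small sketch of an overall rating built as a weighted average. The attribute names, values, and weights are all hypothetical, and I am not claiming this is the formula EA used; it only illustrates why non-negative attributes and reasonably spread weights keep the overall well above 1.

```python
def overall(attrs: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of attribute ratings (an assumed formula, not EA's)."""
    total_weight = sum(weights[name] for name in attrs)
    return sum(attrs[name] * weights[name] for name in attrs) / total_weight


# Hypothetical player: one terrible attribute, the rest mediocre to decent.
attrs = {"speed": 45, "shot": 1, "passing": 30, "checking": 60}

# Equal weights: a single rating of 1 cannot drag the overall down to 1.
equal = {name: 1.0 for name in attrs}
print(overall(attrs, equal))

# Only when nearly all the weight sits on the bad attribute does the
# overall approach it.
skewed = {"speed": 0.01, "shot": 0.97, "passing": 0.01, "checking": 0.01}
print(round(overall(attrs, skewed), 2))
```

With non-negative values, a weighted average can never fall below the lowest attribute, so an overall of 1 needs almost everything that carries weight to sit at or near 1.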
Why would I even bother looking at this anyway? Well, for two reasons. One, after boxcar statistics (goals, assists, points) and +/-, video game ratings were really the next attempt to derive a publicly-consumed statistic for player talent and value. Whole generations observed, and potentially internalized, the way these games conceptualized important and unimportant elements of the game. Understanding hockey should be as much an understanding of society as it is an understanding of the technical components of the game.
Postscript: I plan on breaking down this data in a more complex fashion in future posts, so stay tuned…
Postscript II: Best theory I’ve seen so far, from Reddit user “DavidPuddy666” — that the inclusion of CHL and other leagues raised this bar. For the most part, though, I recall the international rosters and European leagues following these distributions. In other words, you didn’t have a bunch of sub-50 overalls buried on international rosters. The European leagues were even worse for this; top players in Euro leagues are still rated as if they would be top NHL players. As for the CHL leagues and the AHL, Puddy might have a point — but the AHL didn’t appear till NHL 08, and the CHL leagues till NHL 11. In fact, the international teams theory also has this chronological issue, as only the best international teams make their appearance first in NHL 97, before an additional 16 international teams are added for NHL 98.
Another round in the books, so it’s time to re-assess truculence, size, and experience in our Stanley Cup Playoffs predictions and reload for the Conference Finals. SAP had a better-than-coin-flip 2nd round, getting 3 of 4 series right, and you’ll be disappointed to know that that pulls them ahead of our more-celebrated team “virtues.” For those interested after our previous post, Nicholas Emptage over at Puck Prediction nailed the 2nd round and his model improved to 10-2 these playoffs — Bravo.
Let’s see how everything broke down for us…
In February of the 2009-10 season, John Buccigross of ESPN was spurred by a mailbag question to do a quick thought experiment: does he think Ovechkin could set the all-time goals mark? Gabe Desjardins voiced skepticism of Bucci’s optimistic projection but didn’t offer a counter-projection, presumably because, as he wrote:
Basically, careers are incredibly unpredictable – nobody plays 82 games a year from age 20 to age 40. And players who play at a very high level at a young age tend to not sustain that level of play until they’re 40…So, to answer the reader’s question: I believe that there is presently no significant likelihood that Alex Ovechkin finishes his career with 894 goals. He needs to display an uncommon level of durability for the next decade, and not just lead the league in goal-scoring, but do so by such a wide margin that he scores as much as Gretzky, Hull or Lemieux did in an era with vastly higher offensive levels.
That said, I thought it would be fun, with five full years gone, to see how Bucci did, and try to build a prediction model with the same data he had available.
The first round has come and gone, and as we expected before a game had been played, brackets were not going to be fun for everyone. Most people leaning on statistical models saw their brackets chewed up by the vagaries of the playoff sample; SAP, if you’ll remember, hailed their overfit model and its “prediction” of 85% of the past 15 years of playoff series — and proceeded to do no better than a coin flip (they missed all the Eastern teams, and got all the Western matchups). An exception to the #fancystats slaughter was Nicholas Emptage, who went 6-2, which is a good thing if your site is called Puck Prediction. Not even Nicholas was a match for the gut of Steve Simmons, though, who went 8-0 in the first round. It’s the Simmons Hockey League, y’all, and he’s just sliding into our DMs.
But the big question is: how did our brackets, built on the tried and tested virtues of truculence, size, and experience, fare in this ultimate battle of wits and twits?
Sometimes, I hear questions float around about whether the analytics movement has changed the NHL all that much. I wrote about this a bit in my most recent post, looking at player usage, but there’s more to be said. Thankfully, I had a great opportunity to contribute to a documentary for Grantland and ESPN called “Knuckles vs. Numbers,” which focused on the influence of analytics on the shrinking role of the enforcer. Besides me, you’ll see interviews with Sean McIndoe (@DownGoesBrown), Steve Burtch, Paul Bissonnette, Colton Orr, and Brian McGrattan. Check it out, get the word out, it’s worth your time.
Now that you’ve enjoyed that, I have some behind-the-scenes anecdotes and information from the experience that are worth mentioning.