Revisiting the NHL Regression Predictions from January 1st

Photo by “User:Zucc63” via Wikimedia Commons, modified by author

If you’ll remember, one of the inaugural posts here was a regression prediction piece, using a combination of PDO and Fenwick Close to see who might improve or decline over the latter half of the season. I decided to put together a table of the teams I predicted would negatively or positively regress, just using the aforementioned data:

I pegged Anaheim, Colorado, Montreal, Phoenix, Toronto, and Washington for negative regression, and Florida and New Jersey for positive regression. So, even with really rudimentary predictors, I was able to build fairly successful predictions for the rest of this season from a half-season sample. In previous years, the fancy stats folks usually picked the much more obvious targets (Toronto being the big one this year), but it’s very possible to go further if you want to.

More on “Corsi & Context”, with some added predictive modelling


INTRODUCTION

I have always been of the opinion that Corsi is part of the larger puzzle in trying to gain greater understanding of the game and how a player can affect their team’s chance to win. Like all statistics, though, it needs appropriate sample size and context, and will never tell you everything. Teammates, opponents, luck, system, strategy and the moments in which a coach deploys a player will always affect results… although there can also be times when context is overly stressed. While Corsi does tend to need less context than many other hockey statistics, it should be kept in mind that two players with the same Corsi% are not always created equal.

Tyler Dellow wrote a piece on context that is definitely worth a read. In the article, Dellow used two tables showing how Corsi changes depending on ice time for the 2011-12 season.

We will revisit this article using a larger sample and look at both forwards and defensemen.

Friday Quick Graph: NHL 5v5 TOI Peak at 24, 25 Years Old

This is the distribution of the skater performances w/200+ 5v5 TOI from the seasons 2007-08 through 2011-12 (n = 3,334). Use as reference for the below two charts. Notice that our line gets a little wacky as our n drops near the tails.

Some of you already know this, but I enjoy distributions, and I think they get sorely under-used in analysis (although, in the end, they are the basis of predictive work). This piece is a bit old (the data is across all skaters, 2007-08 through 2011-12, n = 3,334), but it shows the number of skaters with 200+ minutes of 5v5 time at each age grouping. The peak is clearly at 24 or 25 among this group, but we should be clear with what “peak” means. Although even-strength time can be a pretty good indicator of overall player talent, it’s still a shaky signal (c’mon, we know not all coaches put the “right” guys out there sometimes). Further, powerplay time can sometimes be a drag on better players’ energy for even-strength time, which can also compromise this signal. Nevertheless, if you were to sort all players into even-strength time groupings (say, forwards in 4 groups by ESTOI, and defensemen in 3 groups by ESTOI) you’d see that the top would generally perform better possession and offense-wise than the second, and so on down.
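For anyone who wants to tally a distribution like this themselves, here is a minimal Python sketch. The record layout and the sample rows are invented for illustration; only the 200-minute 5v5 cutoff comes from the description above.

```python
from collections import Counter

# Hypothetical skater-season records: (player, age, 5v5 minutes).
# Names and numbers are made up; they are not real NHL data.
SEASONS = [
    ("Skater A", 22, 950.0),
    ("Skater B", 24, 1210.5),
    ("Skater C", 24, 180.0),   # below the cutoff, so excluded
    ("Skater D", 25, 1033.2),
    ("Skater E", 31, 640.8),
    ("Skater F", 24, 800.0),
]

MIN_TOI = 200.0  # the 200+ minute 5v5 threshold described above


def skaters_by_age(seasons, min_toi=MIN_TOI):
    """Count qualifying skater-seasons at each age."""
    return Counter(age for _, age, toi in seasons if toi >= min_toi)


def peak_ages(counts):
    """Return the age(s) with the most qualifying skater-seasons."""
    top = max(counts.values())
    return sorted(age for age, n in counts.items() if n == top)


counts = skaters_by_age(SEASONS)
for age in sorted(counts):
    print(age, counts[age])
print("peak age(s):", peak_ages(counts))
```

Note that this counts skater-seasons, so a player who qualifies in several seasons shows up once per season at the age he was that year.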

With that in mind, “peak” is also about health. Though we’ve not had much research into it (hint, hint), we have reason to suspect that injuries might drag on possession measures a bit. That said, 24-25 can also be a performance peak because players tend not to suffer major injuries until that age or later.

I plan on digging into this data again (now that I have my ES data back to 1997-98) and splitting into forward and defense groups, but this is a good start.

NHL Defensemen and Shooting Contributions back to 1967-68


Photo by Dave Stanley via Wikimedia Commons

I have kicked around this data in the past, most prominently in my theoretical post on offensive systems, but I really wanted to get further into the intricacies of defensemen and their historical place in team shooting (among other offensive contributions). By looking at how much a defenseman contributes to a team’s shot generation (expressed as a percentage of team shots in the games a player played, or %TSh), we can draw some interesting comparisons across NHL eras, but I haven’t yet explored how the role of the defenseman has (or hasn’t) evolved from the Expansion Era to the present, nor have I taken a look at some of the more exceptional defense shooting teams. Let me correct that now.


Friday Quick Graphs: When did “Score Effects” Emerge in NHL History?

Back in 2009, Tyler Dellow first elaborated on the idea of what we now call “score effects,” or how teams with a lead will go into a “defensive shell” and purposely withdraw from the possession battle to preserve their score. Score effects are the primary reason the go-to possession stat is “Fenwick Close” today – the “close” implies the importance of looking at possession measures when teams still have a reason to engage. The limits of historical shot recording, and the possibility of score effects, are precisely why I’ve advocated the use of 2pS% (shot-differential percentage from the first two periods) as an historical possession measure.
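As a rough illustration, 2pS% can be sketched in a few lines of Python. This assumes the measure is a team's shots in the first two periods divided by both teams' shots in those periods; the function names and the sample shot counts are hypothetical.

```python
def two_period_shot_pct(shots_for, shots_against):
    """2pS%: share of all shots in periods 1 and 2 taken by the team.

    Both arguments are per-period shot counts, index 0 = 1st period.
    Assumption: 2pS% = shots for / (shots for + shots against).
    """
    sf = sum(shots_for[:2])
    sa = sum(shots_against[:2])
    if sf + sa == 0:
        return None  # no shots recorded, percentage undefined
    return 100.0 * sf / (sf + sa)


def full_game_shot_pct(shots_for, shots_against):
    """Same calculation over every recorded period, for comparison."""
    sf, sa = sum(shots_for), sum(shots_against)
    if sf + sa == 0:
        return None
    return 100.0 * sf / (sf + sa)


# Made-up game: the team leads after two periods, then sits back in the 3rd.
team = [12, 10, 4]
opponent = [8, 10, 14]
print(two_period_shot_pct(team, opponent))
print(full_game_shot_pct(team, opponent))
```

In this invented game the team controls the shot battle through two periods, yet its full-game percentage falls under 50 once the third period is included, which is the kind of distortion that leaving out the third period is meant to avoid.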

The one thing I never completely took for granted was that score effects had always existed in the NHL. To test this, I broke down each game into individual period shot battles, and looked separately at the correlation* of 1st, 2nd, and 3rd period shots-for percentages to final goals-for percentages. The result above clearly shows that the correlation for 3rd period SF% begins to drop away drastically after 1977 or so, after a quarter-century of running pretty close to the others. It does seem possible, then, that the re-introduction of overtime in 1983-84 (gone since 1943-44) had an impact on the growth of score effects (although I’m not sure how); on the other hand, the introduction of the “loser point” in 1999-2000 doesn’t seem to have had any effect. We can also do a similar graph of correlations to goals-for percentage to validate the use of 2pS%:

As you can see, score effects have essentially become the norm, much to the detriment of overall shot differential. At any rate, whoever put two and two together back in the 1970s probably had the right idea; I’d put forward the hypothesis that the 1970s NHL was ripe for change and innovation (a lot of competition; growth of the league meant more decision-makers and more opportunities to exploit market inefficiencies). In that kind of environment, protecting the lead quickly became a best practice, and it steadily grew to a league-wide practice by the mid-1990s or so.

* A correlation is a value from -1.0 to +1.0 describing how the variation in one variable relates to the variation in another. A positive correlation means that as one goes up, the other tends to go up; a negative correlation means that as one goes up, the other tends to go down. The closer to 0.0, the weaker the (linear) relationship between the variables.
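For readers who want to see the correlation spelled out, here is a minimal Python sketch of the standard Pearson coefficient. It is not necessarily the exact calculation behind the graphs, and the sample percentages are invented for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length (2 or more)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant list")
    return cov / sqrt(var_x * var_y)


# Hypothetical team-seasons: 3rd period SF% and final GF% (made-up numbers).
third_period_sf_pct = [48.0, 51.5, 45.2, 53.1, 49.9]
final_gf_pct = [52.3, 50.1, 47.8, 51.0, 49.5]
print(round(pearson(third_period_sf_pct, final_gf_pct), 3))
```

Running the same function on 1st, 2nd, and 3rd period SF% for each season would give three values per season to plot against each other.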

Consistency in the NHL: How often do teams tend to play “their game”

Source: Bruce Bennett/Getty Images North America

INTRODUCTION:

Our very first published article used shot attempt differentials to see if certain teams were more consistent than others in their performance. We observed that teams differed greatly in how they performed on average, but not so much in their levels of consistency, as in the spread of their performances.

One of the commenters on the article, under the name “Anthony Delage”, wondered if teams differed much in playing “their game”, or in other words: how often do low-event teams play low-event games, and high-event teams play high-event games?

See more after the jump.


The Top “Young Guns” in NHL History


Photo by “Djcz” via Wikimedia Commons

I don’t think we engage enough with the place in history that many of today’s best players hold, and I partly attribute that to the difficulty of finding points of comparison across generations. Simply using raw scoring data doesn’t do the best job because a.) everyone knows Gretzky wins, and b.) we know that scoring fluctuated drastically in the 1980s, and it wasn’t because all the best shooters and passers were playing then. With that in mind, I’ve stewed over ways to bring these different generations together, in such a way that we can be comfortable comparing them. It’s led me to build a couple of metrics that move a little bit away from the counting statistics (G, A, PTS) and towards measures of a player’s share of their team’s results.

The two metrics I’m focusing on for these young guns both relate to offensive measures, but I think that generally they also allude to a player’s importance to play overall. I tend to agree with Vic Ferrari’s assertion (see his third comment here) that forwards and only a select number of defensemen play much of a role in driving offense, and recalling some of the player types implicated in Steve Burtch’s work over at Pension Plan Puppets on Shut-Down Index, I’d propose that players that drive possession (forwards and defense) more generally will return some signals in regards to shooting or playmaking. Whether that simply means, in the future, we’ll get more from simply looking at passes and shots (or robots will do the whole darn thing and save me the trouble), I can’t say. For now, though, I created %TSh, or percentage of team shots, which expresses the proportion of team shooting a player does (in games they played), and %TA, which does the same exercise with team assists. While the issue of whether this expresses positive possession players is ripe for debate, it’s indisputable that players strong in these metrics will be drivers of offense for their teams.
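To make the two definitions concrete, here is a minimal Python sketch of a share-of-team calculation that covers both %TSh and %TA. The function name and the game logs are hypothetical; the only thing taken from the text is that team totals are limited to games the player actually played.

```python
def team_share(player_totals, team_totals):
    """Player's percentage of a team total over the games he played.

    Both lists hold one value per game, covering only games in which the
    player appeared. Pass shots for %TSh, or assists for %TA.
    """
    if len(player_totals) != len(team_totals):
        raise ValueError("need one player value and one team value per game")
    team_sum = sum(team_totals)
    if team_sum == 0:
        return None  # team recorded nothing, share undefined
    return 100.0 * sum(player_totals) / team_sum


# Made-up three-game log for one player.
player_shots = [4, 2, 6]
team_shots = [30, 25, 35]
player_assists = [1, 0, 2]
team_assists = [5, 3, 4]

print("%TSh:", round(team_share(player_shots, team_shots), 1))
print("%TA:", round(team_share(player_assists, team_assists), 1))
```

Because the denominator only includes games played, a player who misses time is not penalized for shots or assists his team piled up without him.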

In that spirit, I wanted to delve into some nifty historical data; I’ve been able to go all the way back to 1967-68 with data on %TSh and %TA, and it returns some fascinating studies on NHL legends vis-à-vis today’s stars. For this piece, I’m focusing on the players that get everyone excited, so-called “young guns,” or players under 25 that have already demonstrated their ability at the top level. How do contemporary young guns measure up all-time?


Outperforming PDO: Mirages and Oases in the NHL

Above is the progressive stabilization (game-by-game, cumulatively) of all-situations PDO over time for the 30 NHL teams. It’s a demonstration of the pull of PDO towards the average (1000, or the sum of team SV% and shooting percentage with decimals removed), and it gives you a sense of the end game: an actual spread of PDO, from roughly 975 to roughly 1025. In other words, if you were just to use this data, you could probably conclude that it’s not outside expectations for a team to finish about 25 points (or 2.5%) on either side of 1000.
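As a sketch of what "game-by-game, cumulatively" means here, the following Python computes a running PDO from per-game goal and shot counts. The tuple layout and the two sample games are made up for illustration.

```python
def pdo(goals_for, shots_for, goals_against, shots_against):
    """PDO = 1000 * (shooting percentage + save percentage)."""
    if shots_for == 0 or shots_against == 0:
        return None  # a percentage is undefined without shots
    shooting_pct = goals_for / shots_for
    save_pct = 1.0 - goals_against / shots_against
    return 1000.0 * (shooting_pct + save_pct)


def cumulative_pdo(games):
    """Running PDO after each game.

    `games` is a list of (goals_for, shots_for, goals_against, shots_against)
    tuples in schedule order; totals accumulate before each calculation.
    """
    gf = sf = ga = sa = 0
    running = []
    for g_gf, g_sf, g_ga, g_sa in games:
        gf += g_gf
        sf += g_sf
        ga += g_ga
        sa += g_sa
        running.append(pdo(gf, sf, ga, sa))
    return running


# Two invented games: a hot night followed by a cold one.
running = cumulative_pdo([(3, 30, 2, 25), (1, 28, 4, 32)])
print([round(value, 1) for value in running])
```

Accumulating goals and shots before dividing, rather than averaging each game's PDO, is what lets the line settle as the sample grows.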

That’s all well and good, but PDO combines two very different things, a team’s shooting and its goaltending, two variables that understandably have very little to do with each other (they are slightly related because rink counting bias usually affects both). Shooting percentage can hinge on a number of contextual variables, though its reliance on a team’s whole player population usually brings it a bit more in line with league averages. Save percentage, on the other hand, hinges on one player, and what’s more, past performances suggest that a single goaltender can quite significantly outperform expectations. In this piece, I want to jump into the sliding variables of PDO, and what we can expect from teams, but first I want to begin with why I’m working with all-situations PDO.


NHL Systems, NHL History, and Forward vs. Defense Shooting

Photo by “ravenswing” via Wikimedia Commons

“It’s a matter of systems,” “They don’t have a good system,” “There is no system there”…we hear phrases much like this frequently, and I wonder just how much weight we give the word “system” in a game that flows and relies on instinct and reflex. Teams have some kind of system, no doubt, but it’s funny how the actions of any kind of system pale in comparison to the number of times we notice the classic breakout, setting up of the zone, or cycle. What I’m trying to say is, might we be putting too much emphasis on system, when systems are not clearly producing different shot quality? Might we be overstating the role of something practiced for a couple of months, maybe a year or two, versus 15-30 years of playing experience, and all the instincts, common tactics, and reflexes?

In my mind, systems are important in and of themselves, because their organizing principles are intuitive. Cover the man or take away the passing lanes, apply forecheck pressure or trap in the neutral zone…these base ideas probably need to be there to keep things from devolving into pickup hockey. And you all know that game, where everyone’s a superstar forward and nobody backchecks. Seriously, no wonder you guys can’t ever find two goalies.

Anyway, with my current treasure trove of game-by-game, player-by-player data going back to 1987-88 (thanks to Hockey Reference’s excellent Play Index), I wanted to see just how much the game has evolved since the late 1980s, particularly in regards to defensemen involvement in the offense. We already know that shots-for per team, per game went from 30.4 in 1987-88 to 29.1 today, so not a heck of a lot has changed in shot generation, but goals per game per team have changed drastically, from 3.71 in ’87-’88 to 2.75 today. This information alone should suggest we probably haven’t improved too much in regards to what we might call offensive systems. Has defensemen involvement increased, and driven the scoring down? Have teams attempted more forward involvement to improve scoring? Will Guy Boucher ever convince us he has the key to better offense again?

I took data from about 30,000 individual player performances in 1987-88 and about 26,000 in 2012-13; I compared the player’s shot totals to their team totals in those games and derived my %TSh, or percentage of team shots metric, previously used in my piece on Career Charting.


A Rule of 60-40: Thoughts on Individual Player Possession Metrics

Image

The image above is the distribution of individual offensive zone start percentage (or the percentage of times a player started their shift in the offensive zone) and the distribution of individual Fenwick percentages (shots-for and shots-missed for that player’s team divided by all shots-for and shots-missed, both teams, all tabulated when that player is on the ice). I specifically targeted player-season performances in which the player participated in at least 20 games, as that’s roughly the number of games it takes before these measures start to settle down.
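For reference, here is a minimal Python sketch of the two measures as defined above, plus the 20-game filter. The function names and sample counts are hypothetical. Dividing offensive zone starts by offensive plus defensive zone starts (leaving out neutral-zone starts) follows a common convention and is an assumption here, not something stated in the text.

```python
MIN_GAMES = 20  # the games-played cutoff described above


def fenwick_pct(sog_for, missed_for, sog_against, missed_against):
    """On-ice Fenwick%: the team's share of all shots on goal plus misses."""
    fenwick_for = sog_for + missed_for
    fenwick_against = sog_against + missed_against
    total = fenwick_for + fenwick_against
    if total == 0:
        return None
    return 100.0 * fenwick_for / total


def oz_start_pct(oz_starts, dz_starts):
    """Offensive zone start%, assuming neutral-zone starts are excluded."""
    total = oz_starts + dz_starts
    if total == 0:
        return None
    return 100.0 * oz_starts / total


def qualifies(games_played, min_games=MIN_GAMES):
    """True if a player-season meets the games-played cutoff."""
    return games_played >= min_games


# Made-up on-ice totals for one player-season.
if qualifies(games_played=62):
    print(fenwick_pct(300, 120, 250, 130))
    print(oz_start_pct(200, 300))
```

Both functions return a percentage on a 0 to 100 scale, so the two distributions can be plotted on the same axis.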

These distributions tell us a few important things for understanding possession, deployment, and how we might analyze the game. Most importantly, after the jump I have a modest proposal, a 60-40 Rule, that might help us in the chase for those elusive, all-encompassing player value metrics.
