Something in hockey has been bugging me for years. Technically a lot of things about hockey bug me, but let’s not get sidetracked right off the bat. The irritating aspect of hockey I want to focus on today is neutral zone faceoff wins. They rarely lead to anything interesting.
In the last two posts, we’ve looked at the big picture outputs of the xG model for players and teams. Now, let’s zoom in on the model itself to try and understand it better. How is it making its decisions? Which variables provide the most important information, and in what way do those variables affect the outcome?
Last time, we saw that a team exiting its defensive zone with possession is much more likely to enter its offensive zone. Do the advantages end there, or do possession exits also improve the quality of zone entries? Perhaps leaving the defensive zone with possession makes it easier for a team to keep possession as it enters the offensive zone, and that leads to more shots per entry. Maybe pass-outs create space for more passes in the offensive zone, which improves shot quality.
It turns out that there is not much of a difference in entry quality by exit type; exiting with possession makes it more likely to gain the offensive zone, but the advantages quickly dissipate. That said, there are some interesting variations in how those zone entries play out. The differences are small enough that they could be random chance, but it’s worth taking stock of what we know with the data we have.
Over the last decade, teams have taken significant steps to improve their NHL entry draft approach. To do this, a number of teams have bolstered their analytics staff to identify the current “gaps” in prospect scouting. Whether it’s the Detroit Red Wings being the first team to dive head first into drafting Russian players, and then later Swedish players, or the Tampa Bay Lightning prioritizing small, skilled forwards, teams are looking for any available edge. More recently, the Pittsburgh Penguins have put a premium on overage players, as Namita Nandakumar found that overage players make the NHL faster. What’s the next big market inefficiency?
We recently released the final version of our contract projections for the 2019 NHL free agent class (they can be found here). Our initial projections went up in mid-April, and even though it’s only been a few weeks, we’ve had numerous questions about how the model was designed, how it works, what it means, etc. I thought we might be able to answer all the questions about it on Twitter, but alas it was just a dream. A quick recap: this is our third year doing contract projections for the NHL offseason. While the model/projections this year may seem quite complicated, our first version was very simple: a few catch-all stats and a linear regression model to predict salary cap percentage (cap hit / salary cap). We use cap percentage to keep salaries on the same scale as the salary cap changes. Over the last few years, we’ve developed a few new methods, and this year we took quite a bit of inspiration from the method Matt Cane used for his 2018 NHL offseason salary projections.
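To make the cap percentage idea concrete, here is a minimal sketch of the arithmetic. The function names and the dollar figures are made up for illustration and are not taken from the projections themselves; the only thing borrowed from the post is the definition cap percentage = cap hit / salary cap.

```python
def cap_percentage(cap_hit, salary_cap):
    """Share of the salary cap that a contract's cap hit takes up."""
    return cap_hit / salary_cap


def cap_hit_under(cap_pct, salary_cap):
    """Convert a cap percentage back into dollars under a given salary cap."""
    return cap_pct * salary_cap


# Hypothetical numbers, chosen only to keep the arithmetic easy to follow.
signing_year_cap = 80_000_000
later_year_cap = 82_000_000

pct = cap_percentage(8_000_000, signing_year_cap)
equivalent_hit = cap_hit_under(pct, later_year_cap)

print(f"cap percentage: {pct:.1%}")
print(f"equivalent cap hit under the later cap: ${equivalent_hit:,.0f}")
```

Working in percentages means a contract signed under one cap can be compared directly with a contract signed under another, which is why it makes a convenient target for a model trained on several seasons of signings.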
On Wednesday night, Hockey-Graphs became aware that one of our contributors, Jason “jsonbaik” Baik, had been convicted of sexual assault in Allegheny County, Pennsylvania (Pittsburgh). To be utterly clear, Hockey-Graphs condemns these actions absolutely. Upon becoming aware of this horrible news, we have terminated our relationship with Mr. Baik and all contributions from Mr. Baik have been removed from this site.
A picture is worth a thousand words. Yes, it’s a cliché, but when it comes to visualizing data, an individual can tell a story via the choices they make when presenting their data. One of the most common visualizations is a plot showcasing the frequency and distribution of an event. Data like this are often presented in a histogram or box-and-whisker-plot. However, a limitation of both of these types of plots is that neither shows the individual where each data point falls. On the other hand, a beeswarm plot allows the user to see where each individual point falls across a range. A random jitter effect is applied to maintain a minimum distance between each point to minimize overlap.
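As a rough illustration of how points can be spread out so that they keep a minimum distance from one another, here is a small sketch. It uses a deterministic greedy placement rather than random jitter, and it is not the algorithm of any particular plotting library; the function name `beeswarm_offsets` and the sample values are hypothetical.

```python
import math


def beeswarm_offsets(values, min_dist):
    """Return one sideways offset per value so that no two points end up
    closer than min_dist. Each point keeps its value on the data axis and
    is nudged along the other axis until it clears the points placed so far.
    """
    placed = []   # (value, offset) pairs already positioned
    offsets = {}  # original index -> chosen offset
    # Place points in order of value; sorted() is stable, so ties keep input order.
    for idx in sorted(range(len(values)), key=lambda i: values[i]):
        x = values[idx]
        step = 0
        while True:
            # Try the centre line first, then +d, -d, +2d, -2d, and so on.
            candidates = [0.0] if step == 0 else [step * min_dist, -step * min_dist]
            chosen = next(
                (c for c in candidates
                 if all(math.hypot(x - px, c - py) >= min_dist for px, py in placed)),
                None,
            )
            if chosen is not None:
                break
            step += 1
        placed.append((x, chosen))
        offsets[idx] = chosen
    return [offsets[i] for i in range(len(values))]


# Made-up sample: three identical values, one close neighbour, one outlier.
sample = [1.0, 1.0, 1.0, 1.1, 2.0]
sample_offsets = beeswarm_offsets(sample, min_dist=0.2)
print(list(zip(sample, sample_offsets)))
```

Plotting each value against its offset gives the familiar swarm shape: isolated points sit on the centre line, while crowded regions fan out to either side.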
Inspired by the wonderful graphs from Namita Nandakumar and Emmanuel Perry, I thought I would attempt to visualize how goaltenders have fared in goals saved above average over the course of their careers.
With the release of the Kindle version of my book, I wanted to provide an excerpt to promote it. You can purchase the Kindle at the above link or the paperback here. If you already purchased a paperback, you should be able to obtain the Kindle version for free on your Amazon account. Let me know if you run into any trouble there. Enjoy!
In the last 10 years, I have been impressed by
the development of the hockey analytics community in North America as well as
the tools made available to the public in the hope of increasing the general hockey
Unfortunately, in Switzerland, the Swiss Ice Hockey Federation (SIHF) does not provide the same level of information as is available in North America and keeps part of its proprietary data for itself. As such, fans and journalists, except on very rare occasions, don’t have access to the same kind of in-depth research and analysis that exists for the NHL or some other European leagues. Plus/minus is still THE hockey statistic for some journalists or analysts.
The first part of my project with the Hockey-Graphs Mentorship program was to create a platform entirely dedicated to Swiss hockey statistics, called NL Ice Data. The main goal was to exploit the available data as fully as possible and to give fans access to additional statistics the SIHF doesn’t necessarily provide:
GF/GA: for players, RelGF%, GF/60, …;
time on ice deployment and evolution;
aggregated shot tracker maps per player, goalie and team;
and many others.
Current features include the same core of statistics for players, goalkeepers and teams: statistics, fouls, shootouts and shot tracker maps. Easy to use, the website provides interactive tables and charts so that fans can engage more with data. Additional features, charts and metrics will be added as the project progresses.
By slowly integrating further metrics and concepts after the website’s launch (xG or Game Score for example), the modest goal is to build overall knowledge amongst fans. A secondary goal is to have a platform ready to publish more *advanced* statistics (including at the player level) as soon as the League publishes more of its proprietary data.