Yesterday we looked at the team and skater results from the 2016 – 2018 data that was used to train the xG model. That’s a pretty robust dataset, but it’s unfortunately a bit out of date. People care about this season, and past years are old news. So let’s take a look at the data that Corey Sznajder has tracked for 2018 – 2019 so far.
There are few questions in hockey analytics more fundamental than who played well. Consequently, a large portion of hockey analysis has focused on how best to measure results. This is some of the best-known work in “fancy stats”; when evaluating players and teams, many people who used to look at goals scored moved to Corsi and then to expected goals (xG).
The concept of an xG model is simple: look at the results of past shots to predict whether or not a particular shot will become a goal. Then credit the player who took the shot with that “expected” likelihood of scoring on that shot, regardless of whether or not it went in. Several such models have been developed, including by Emmanuel Perry, Evolving Wild, Moneypuck, and many others.
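To make the idea concrete, here is a minimal sketch in Python. It is not how any of the models named above work: public xG models typically fit a regression or boosted-tree model on many shot features, whereas this toy version just looks up the historical goal rate for shots of the same kind. The feature choice (shot type and whether a pass preceded the shot), the function names, and the data are all made up for illustration.

```python
from collections import defaultdict

def fit_xg(history):
    """Estimate goal probability per shot category from past shots.

    history: iterable of (shot_type, after_pass, was_goal) tuples.
    Returns {(shot_type, after_pass): goals / shots}.
    """
    counts = defaultdict(lambda: [0, 0])  # category -> [goals, shots]
    for shot_type, after_pass, was_goal in history:
        bucket = counts[(shot_type, after_pass)]
        bucket[0] += int(was_goal)
        bucket[1] += 1
    return {key: goals / shots for key, (goals, shots) in counts.items()}

def credit_shooters(model, shots, default=0.0):
    """Credit each shooter with the expected value of every shot taken,
    regardless of whether the shot actually went in."""
    credit = defaultdict(float)
    for player, shot_type, after_pass in shots:
        credit[player] += model.get((shot_type, after_pass), default)
    return dict(credit)

# Invented historical shots: (shot_type, after_pass, was_goal)
history = [
    ("wrist", True, True), ("wrist", True, False),
    ("wrist", False, True), ("wrist", False, False),
    ("wrist", False, False), ("wrist", False, False),
    ("slap", False, False), ("slap", False, False),
]
model = fit_xg(history)

# Invented new shots: (player, shot_type, after_pass)
new_shots = [
    ("Player A", "wrist", True),
    ("Player A", "wrist", False),
    ("Player B", "slap", False),
]
credit = credit_shooters(model, new_shots)
```

In this invented sample, wrist shots that follow a pass score half the time and those that don't score a quarter of the time, so Player A is credited with 0.75 expected goals from two shots whether or not either one went in.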
However, there remains additional room for improving these models. They do impressive work based on the available play-by-play (pbp) data, but that only captures so much. There are big gaps in information, and we know that filling them would make us better at predicting goals.
Perhaps the biggest gap is pre-shot movement. We know that passes before a shot affect the quality of the scoring chance, but the pbp data does not include them. Thankfully, Corey Sznajder’s data does. While it does not cover every single shot over multiple seasons, it is a substantial dataset; when I pulled the data for this model, it had roughly half of the 2016-2017 and 2017-2018 seasons included: 72 thousand shots from 1,085 games. While the number of games tracked varies by team, we have at least 43 for every team except Vegas, for which we have 26. We can use this data to build the first public xG model that incorporates passes.
This is one of my favorite plays:
Almost every team is coached to make their opponent fight for every inch. Skjei’s end-to-end rush cuts through those defenses and leaves his team in a much better position than when he started.
But just how much better off did he leave them? How does that compare to alternative outcomes? And which players are the best at making these plays? We have unanswered questions about transitional play. We’d like to study them in more detail, but the gif above doesn’t appear anywhere in the league’s play-by-play data to help conduct analysis.
The first significant breakthrough in hockey analytics occurred in the mid-2000s, when analysts discovered the importance of Corsi in describing past results and predicting future success. Since that time, we’ve seen the creation of expected goals, WAR models, and more. Many have suggested that the next big breakthrough in hockey analytics will come once the NHL is able to provide tracking data. We’ve already seen some of the incredible applications of MLB’s Statcast data and the NBA’s SportVU data. Unfortunately, the NHL has no immediate plans to provide this data publicly, and as such, many analysts have decided to collect the data manually.
The importance of zone entries in hockey statistical analysis will come as no surprise to anyone familiar with the public analytics community. Back in 2011, then-Broad Street Hockey writer (and current Carolina Hurricanes manager of analytics) Eric Tulsky initiated a video tracking project that became the first organized foray into the zone entry question, and later resulted in a Sloan Analytics Conference presentation. Tulsky determined that “controlled” entries (those that came with possession of the puck) resulted, on average, in more than twice as many shots as “uncontrolled” entries, a key finding that provided concrete direction for additional research on the topic.
Tulsky’s initial Sloan project was limited, however, by a lack of data: only two teams had their full regular seasons tracked, and just two others reached the half-season threshold. As a result, further research would wait until a larger dataset became available. Luckily for the community, Corey Sznajder undertook a massive tracking project encompassing the entire 2013-14 season, and released the data to the public. That data enabled further advances, including Garik16’s work on team zone performance and the repeatability of player performance in each individual zone.
Team Canada won the cup. Team Canada went undefeated. They were the favourites going in, and they came out the winner. Not only did they win, but they went about it in dominant fashion. They rarely trailed and they controlled nearly every facet of the game.
It wouldn’t be surprising for many to hear that the team also dominated in the shots column… but they were not the most effective team in every aspect, which raises some interesting questions.
Last time, I showed how data and video evidence can be combined to inform tactical offensive zone decisions. Today, I’m going to do the same thing in the neutral zone. Neutral zone play has been a hot topic among analysts for many years, going back to this paper written by Eric Tulsky, Geoffrey Detweiler, Robert Spencer, and Corey Sznajder. Our own garik16 wrote a great piece covering neutral zone tracking. Jen Lute Costella’s work shows that scoring occurs sooner with a controlled entry than an uncontrolled entry.
However, for all the work that goes into zone entries, there have been few efforts to account for how predictive these metrics are. At the end of the day, what matters is how we can better predict future goal-scoring. Also, in looking at our passing data, what can we learn about how actions are linked when entering the zone? Does simply getting into the offensive zone matter? Does it matter whether it’s controlled or not? Or does what happens after you enter the zone matter far more? Lastly, what decisions can we make to improve the team’s process using this data?
Player A is a sniper. Player B is a playmaker. Quick: If the two of them get a 2-on-1 break, what do you expect each of them to do? Odds are you would expect the playmaker to pass and the sniper to shoot. You may not know how good each of these players is, but the monikers give you a rough idea of each player’s relative strengths and how they generally try to succeed.
We have plenty of different names that explain a player’s general “role”. We use words like sniper, dangler, two-way player, and power forward (even if we can’t agree on what that last one actually means). However, these names are usually limited to the offensive zone. We have no easy way to describe what a player does in the neutral zone.
Hockey analysts have repeatedly shown the value of neutral zone play. If a player performs well in the neutral zone, he or she is helping generate offense for their team and limiting the opponent’s chances. In addition, neutral zone play is repeatable, and the player is likely to continue to drive possession for their team. If you can identify players who thrive in the neutral zone, you are in a position to help your team improve.
But while neutral zone play is important, we still have a very limited understanding of it. Between the distance from the goal, the fluidity of play, and the relative scarcity of data, most people don’t know how players perform in the middle third of the ice. Furthermore, we don’t even have a complete idea of how to make those evaluations. When figuring out how good a player is in the neutral zone, should offense and defense be evaluated separately, or are overall results enough? What skills translate to strong neutral zone play? What playing styles?
Some day we will reach the point where we can comprehensively analyze which power plays are the best, which players drive that success, and, most elusively, which roles to place players in to maximize a unit’s output. Statistically, though, our special teams cupboard is pretty bare. This season, as many of you know, I took on the long and arduous task of hockey tracking in the interest of getting us even one step closer to our objective: how can we better evaluate and predict power play success? So let’s dive right in.