Expected Goals Model with Pre-Shot Movement, Part 3: 2018-2019 Data

Yesterday we looked at the team and skater results from the 2016 – 2018 data that was used to train the xG model. That’s a pretty robust dataset, but it’s unfortunately a bit out of date. People care about this season, and past years are old news. So let’s take a look at the data that Corey Sznajder has tracked for 2018 – 2019 so far.


Expected Goals Model with Pre-Shot Movement, Part 1: The Model

There are few questions in hockey analytics more fundamental than who played well. Consequently, a large portion of hockey analysis has been focused on how to best measure results. This work is some of the most well-known work in “fancy stats”; when evaluating players and teams, many people who used to look at goals scored moved to focusing on Corsi and then expected goals (xG).

The concept of an xG model is simple: look at the results of past shots to predict whether or not a particular shot will become a goal. Then credit the player who took the shot with that “expected” likelihood of scoring on that shot, regardless of whether or not it went in. Several such models have been developed, including by Emmanuel Perry, Evolving Wild, Moneypuck, and many others.
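To make the idea concrete, here is a minimal sketch of the "learn from past shots, then credit the shooter" logic. It is not any of the models named above: those use richer features and proper statistical fitting, while this version simply looks up the historical goal rate for a hypothetical distance bucket. The shot data, the bucket names, and the function names `fit_goal_rates` and `credit_xg` are all made up for illustration.

```python
from collections import defaultdict

# Toy historical shots: (distance_bucket, was_goal).
# Made-up values for illustration only, not real tracking data.
past_shots = [
    ("close", True), ("close", False), ("close", False), ("close", True),
    ("far", False), ("far", False), ("far", False), ("far", True),
]


def fit_goal_rates(shots):
    """Return the observed goal rate for each bucket of past shots."""
    goals = defaultdict(int)
    attempts = defaultdict(int)
    for bucket, was_goal in shots:
        attempts[bucket] += 1
        goals[bucket] += int(was_goal)
    return {bucket: goals[bucket] / attempts[bucket] for bucket in attempts}


def credit_xg(new_shots, rates):
    """Sum each shooter's expected goals, regardless of whether the shots went in."""
    totals = defaultdict(float)
    for shooter, bucket in new_shots:
        totals[shooter] += rates[bucket]
    return dict(totals)


rates = fit_goal_rates(past_shots)

# New shots to evaluate: (shooter, distance_bucket). Hypothetical players.
new_shots = [("Player A", "close"), ("Player A", "far"), ("Player B", "far")]
xg = credit_xg(new_shots, rates)
print(xg)
```

A real model would replace the lookup table with something like a logistic regression or gradient-boosted trees over many shot features, but the crediting step works the same way: each shot adds its predicted probability to the shooter's total.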

However, there remains additional room for improving these models. They do impressive work based on the available play-by-play (pbp) data, but that only captures so much. There are big gaps in information, and we know that filling them would make us better at predicting goals.

Perhaps the biggest gap is pre-shot movement. We know that passes before a shot affect the quality of the scoring chance, but the pbp data does not include them. Thankfully, Corey Sznajder’s data does. While it does not cover every single shot over multiple seasons, it is a substantial dataset; when I pulled the data for this model, it had roughly half of the 2016-2017 and 2017-2018 seasons included: 72 thousand shots from 1,085 games. While the number of games tracked varies by team, we have at least 43 for every team except Vegas, for which we have 26. We can use this data to build the first public xG model that incorporates passes.


Goal Scorer Cluster Analysis

“They don’t ask how. They ask how many.”

-Hockey Proverb

“But seriously though… how?”

-Me

To state the obvious: goal-scoring is an essential skill for a hockey team. Players have made long careers by putting the puck in the net.

But how do players create goals? Skaters rely on all sorts of skills to score; some are fast, some have a huge shot, and some know how to be in the right place for an easy tap-in. But we don’t have a rigorous view of what those skills are, how they fit together, and which players rely on which ones.

In this piece, I take 100 of the top NHL goal-scorers and apply unsupervised learning techniques to group them into specific goal-scoring types. The result is a classification that buckets the scorers into 5 categories: bombers, rushers, chance makers, chaos makers, and physical forces. These categories can help players understand how to apply their skill set to goal scoring. They can also help teams make sure that their system is putting their top players in a position to score.


Identifying Playing Styles with Clustering

One of the aspects of player performance that is discussed ad nauseam is chemistry. How well do certain players elevate their performance with one player or another due to some inherent ability to find the other on the ice? To know what a teammate is going to do? However, very little has been done to analyze this phenomenon. In this piece, I posit that by identifying playing styles, something that’s been done in the NBA, we can quantify how well certain players will complement one another.

All data is from 5v5 situations in the 2015 – 2016 and current seasons, totaling almost 900 games tracked by the Passing Project volunteers and Corey Sznajder. Special thanks to Asmae for her guidance throughout this piece.

I want to stress that this is a first foray into this type of analysis. The style names I’ve chosen are relatively arbitrary, and a player landing in one style rather than another doesn’t necessarily mean they are better than another player. Players may have similar styles, but some will simply be more effective due to their ability. Finally, given that each day we accumulate more data, a player with a smaller sample size could find themselves in a different cluster in future analysis.


Mikael Granlund, Playing Behind the Net, & Predicting Goals

Recently, I showed how passing data is a better predictor of future player scoring than existing public metrics. In this piece, I’m going to show that by accounting for shot quality via passing metrics, we can more accurately predict a team’s and a player’s on-ice goal-scoring rates. I’m going to do this by quantifying the pre-shot movement that occurs when a player is on the ice. Finally, I’ll spend some time discussing certain forwards and teams that caught my eye. All data is from 5v5 situations. Special thanks to Dr. McCurdy for pulling the on-ice player data for me. All non-passing project data is from Corsica.


Redefining Defensemen based on Transitional Play

Last time, I showed how passing data is a better predictor of future player scoring than existing public metrics. In this piece, I’m going to spend some time talking about how we can more reliably evaluate offensive and defensive contributions from defensemen, which has been difficult due to a lack of data. The difficulty stems not only from a lack of data, but also from a lack of flexibility regarding the identity of the position. Defensemen are traditionally thought of as existing to defend and “make a good first pass,” and I feel this limits the scope of both how we evaluate the position and its responsibilities.

In order to better evaluate defensemen, we need to identify specific metrics that we can tie to future goals. By looking at entry assists (a pass occurring in the neutral or defensive zones that precedes a shot), both for and against, we can quantify how effective a defenseman is at generating offense in transition, as well as at suppressing those chances. The importance of those things at the team level is something I’ve previously discussed (transition here and defensive work here with Matt Cane). Once we identify these metrics as having a strong impact on future scoring and goal suppression, we naturally then reevaluate what the proper roles are for a defenseman, which in turn forces us to reevaluate how we evaluate them.

Personally, I’d like to see us think of them more as fullbacks or midfielders in soccer (this is part of a larger concept of redefining positions and responsibilities, which will be posted in the next month or so, I hope). There are still going to be various types of players based on their individual skill set and team tactics, but supporting play, overlapping on the attack, and distribution are all pillars of what teams should look for. Let’s get to it.

All data is from 5v5 situations and special thanks to Dr. McCurdy for pulling the on-ice player data for me. All non-passing project data is from Corsica.


Expected Primary Points are a better predictor of future scoring than Shots, Points

While I have spent a lot of time over the last several months digging into how we can quantify passages of play and inform better tactical decisions, it’s time to revisit how passing impacts scoring at the player level. We have only been using half of the picture in terms of individual shots and goals for player evaluation. Sure, we have primary and total points, but primary assists aren’t a very useful metric. The rate at which players create shot assists also appeared to have significantly more value than a player’s own shots in some analysis I did last year.

This piece will release individual passing data for the 2014 – 2015, 2015 – 2016, and 2016 – 2017 seasons. The last of these was tracked by Corey Sznajder, and the earlier seasons were tracked by me and many others. However, it is important to provide context and meaning to the numbers rather than simply inundate you with data.


How Can We Quantify Power Play Performance In Formation?


Last week I wrote about a new metric, ZEFR Rate, which measures zone entry success on the power play and is relatively repeatable and predictive of future goal scoring efficiency. The metric was based around the idea that getting into formation efficiently — most frequently a 1-3-1 — is a catalyst for power play success.

But now let’s say you’re a team that has perfected your entry scheme, and you find yourself setting up in formation at a consistent rate. What now? How can one maximize one’s use of possession in formation to score goals at the highest possible rate?


Shot quality and save percentage revisited, again…


Listen; I get it. Some people are sick and tired of this supposed debate that’s been ongoing for over ten years now. But what really is the actual debate all about? What is it we are arguing on Twitter over? What should we be aware of?
