# Goal Scorer Cluster Analysis

-Hockey Proverb

“But seriously though… how?”

-Me

To state the obvious: goal-scoring is an essential skill for a hockey team. Players have made long careers by putting the puck in the net.

But how do players create goals? Skaters rely on all sorts of skills to score; some are fast, some have a huge shot, and some know how to be in the right place for an easy tap-in. But we don’t have a rigorous view of what those skills are, how they fit together, and which players rely on which ones.

In this piece, I take 100 of the top NHL goal-scorers and apply unsupervised learning techniques to group them into distinct goal-scoring types. The result is a classification that sorts the scorers into five categories: bombers, rushers, chance makers, chaos makers, and physical forces. These categories can help players understand how to apply their skill sets to goal-scoring. They can also help teams make sure that their systems put their top players in a position to score.
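To make the idea of grouping scorers concrete, here is a minimal sketch of one common unsupervised technique, k-means clustering. It is an assumption that anything like k-means applies here; the piece only says "unsupervised learning techniques". The two features (a scaled shot distance and a scaled share of shots off the rush), the synthetic players, and every function name are hypothetical and exist only for illustration.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_point_init(points, k):
    """Deterministic seeding: start at the first point, then repeatedly
    add the point farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    return centroids

def k_means(points, k, iterations=50):
    """Plain k-means: assign each point to its nearest centroid, then
    move each centroid to the mean of its members, until stable."""
    centroids = farthest_point_init(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        new_centroids = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster in place
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids

# Synthetic data, NOT real NHL numbers: 5 made-up archetypes, 20 players each.
# Hypothetical features, both scaled to 0..1:
# (shot distance, share of shots off the rush).
rng = random.Random(0)
archetypes = [(0.1, 0.1), (0.9, 0.1), (0.1, 0.9), (0.9, 0.9), (0.5, 0.5)]
players = [
    (cx + rng.gauss(0, 0.03), cy + rng.gauss(0, 0.03))
    for cx, cy in archetypes
    for _ in range(20)
]

labels, centroids = k_means(players, k=5)
for j, centroid in enumerate(centroids):
    print(j, labels.count(j), tuple(round(v, 2) for v in centroid))
```

With real data the clusters would not be this cleanly separated, and the features would need to be put on a common scale before distances between players mean anything.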

# Estimating Shot Assist Quantities for Skaters

Hockey fans and analysts have always appreciated the importance of passing. But until the passing project led by Ryan Stimson, we couldn’t quantify that importance. His work, supported by a team of volunteers and other analysts, has established that the passing sequence prior to a shot is a significant predictor of the likelihood of the shot becoming a goal. His work also showed that shot contributions, meaning shots and shot assists combined, are a better predictor of future performance for both players and teams than shots alone.

Knowing that, the logical next step is to use passing data in analysis whenever possible. Unfortunately, the NHL does not provide passing data, so it must be manually tracked by people like Corey Sznajder. Corey’s work is invaluable, and I encourage you to support him, but he’s only one person.

This article attempts to estimate a player’s shot assist total in a given sample using publicly available data, to help fill the gaps where tracked data doesn’t exist.

# Who You Calling Weak? Draft Class Variance

This year’s NHL draft class is weak. I don’t follow junior prospects closely, but that’s what I’ve heard from more knowledgeable sources. It’s a fair claim; Nolan Patrick and Nico Hischier seem talented, but they are not among the game-changing talents that have recently been drafted first overall.

However, it’s harder to judge the draft class past the very top. Scouting is hard, especially for hundreds of prospects across the world. It’s possible that while there is no clear star in the draft class, the rest of the draft is as strong as ever.

That would have big implications for draft strategy. The conventional wisdom is that teams may trade more picks this year because they believe the weak draft class makes the picks less valuable. But if the draft is typical after the first few picks, that would be a poor use of assets.

We don’t yet know how well this year’s draft class will do in the NHL. But we can use historical data to ask questions that establish expectations: how well does each draft class typically perform, and how much does this vary by year?

# Improving Opposition Analysis by Examining Tactical Matchups

On Monday, I introduced some work on quantifying and identifying team playing styles, which built upon my earlier work on identifying individual playing styles. Today we’re going to discuss how to make this data actionable.

What are the quantifiable traits of successful teams? What plays are they executing that make them successful? How can we use data to then build a style of play that is more successful than what we’re currently doing? The way we bridge the gap between front office and behind the bench is by providing data that improves matchup preparation, lineup optimization, and tactical decisions.

This is what I mean by actionable: applying data-driven analysis and decision-making inside the coach’s room and on the ice. All data is from 5v5 situations and is either from the Passing Project or from Corsica.

# Friday Quick Graphs: The Dangers of Binning Data

If you’ve ever read a little math, you likely know the dangers of binning continuous data when testing relationships between two variables. It is one of the easiest and most common mistakes that an amateur statistician might make, largely because, intuitively, it seems like it should make sense.

But it doesn’t, and here’s why.
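One common way the problem shows up can be sketched in a few lines of Python. This is an illustration on synthetic data under my own assumptions (a weak linear relationship, ten equal-width bins, made-up function names), and it may not be the exact case the post goes on to make. It compares the correlation on the raw points with the correlation on the bin averages.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def bin_means(xs, ys, n_bins):
    """Split x into equal-width bins and return the mean x and mean y
    of each non-empty bin."""
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / n_bins
    buckets = [[] for _ in range(n_bins)]
    for x, y in zip(xs, ys):
        index = min(int((x - lo) / width), n_bins - 1)  # clamp the max value
        buckets[index].append((x, y))
    binned_x, binned_y = [], []
    for bucket in buckets:
        if bucket:
            binned_x.append(sum(x for x, _ in bucket) / len(bucket))
            binned_y.append(sum(y for _, y in bucket) / len(bucket))
    return binned_x, binned_y

# Synthetic data: y depends on x only weakly, with a lot of noise on top.
rng = random.Random(42)
xs = [rng.random() for _ in range(2000)]
ys = [x + rng.gauss(0, 1) for x in xs]

r_raw = pearson(xs, ys)
r_binned = pearson(*bin_means(xs, ys, 10))

print(f"correlation on raw points:   {r_raw:.2f}")
print(f"correlation on bin averages: {r_binned:.2f}")
```

Averaging within each bin throws away most of the noise, so the binned correlation comes out far higher than the raw one even though the underlying relationship has not changed. Reading the binned figure as the strength of the relationship for individual observations would overstate it badly.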