In part 1, we covered WAR in hockey and baseball, discussed each field’s prior philosophies, and cemented the goals for our own WAR model. This part will be devoted to the process – how we assign value to players over multiple components to sum to a total value for any given player. We’ll cover the two main modeling aspects and how we adjust for overall team performance. Given our affinity for baseball’s philosophy and the overall influence it’s had on us, let’s first go back to baseball and look at how they do it, briefly.
In this piece we will cover Adjusted Plus-Minus (APM) / Regularized Adjusted Plus-Minus (RAPM) as a method for evaluating skaters in the NHL. Some of you may be familiar with this process – both of these methods were developed for evaluating players in the NBA and have since been modified to do the same for skaters in the NHL. We first need to acknowledge the work of Brian Macdonald. He proposed how the NBA RAPM models could be applied for skater evaluation in hockey in three papers on the subject: paper 1, paper 2, and paper 3. We highly encourage you to read these papers as they were instrumental in our own development of the RAPM method.
While the APM/RAPM method is established in the NBA and to a much lesser extent the NHL, we feel (especially for hockey) revisiting the history, process, and implementation of the RAPM technique is overdue. This method has become the go-to public framework for evaluating a given player’s value within the NBA. There are multiple versions of the framework, which we can collectively call “regression analysis”, but APM was the original method developed. The goal of this type of analysis (APM/RAPM) is to isolate a given player’s contribution while on the ice independent of all factors that we can account for. Put simply, this allows us to better measure the individual performance of a given player in an environment where many factors can impact their raw results. We will start with the history of the technique, move on to a demonstration of how linear regression works for this purpose, and finally cover how we apply this to measuring skater performance in the NHL.
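To make the idea of isolating a player’s contribution a little more concrete, here is a minimal sketch of a regularized plus-minus regression on synthetic data. It is not the model described in the full article or in Macdonald’s papers: the player count, shift count, noise level, penalty `lam`, and the helper names `simulate_shifts` and `ridge_fit` are all assumptions made for illustration. Each row is a hypothetical shift, with +1 for the five skaters on one side and -1 for the five on the other, and the target is a made-up shot-differential value for that shift.

```python
import numpy as np

def simulate_shifts(n_players=20, n_shifts=2000, seed=0):
    """Build a synthetic shift-level dataset (every value here is made up)."""
    rng = np.random.default_rng(seed)
    # Hypothetical "true" per-player impact on shot differential.
    true_impact = rng.normal(0.0, 0.5, size=n_players)
    X = np.zeros((n_shifts, n_players))
    for row in range(n_shifts):
        on_ice = rng.choice(n_players, size=10, replace=False)
        X[row, on_ice[:5]] = 1.0   # five skaters on one side
        X[row, on_ice[5:]] = -1.0  # five skaters on the other side
    # Observed result = sum of on-ice impacts plus noise.
    y = X @ true_impact + rng.normal(0.0, 1.0, size=n_shifts)
    return X, y, true_impact

def ridge_fit(X, y, lam=10.0):
    """Closed-form ridge regression: solve (X'X + lam*I) b = X'y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

X, y, true_impact = simulate_shifts()
estimates = ridge_fit(X, y)

# How well do the estimated coefficients track the simulated impacts?
recovery_corr = np.corrcoef(estimates, true_impact)[0, 1]
print(round(recovery_corr, 3))
```

In this toy setup every row of `X` sums to zero, so the unpenalized least-squares problem has no unique solution. The ridge penalty is one way to make the system solvable, which hints at why the regularized variant tends to be preferred in practice.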
From a casual fan’s perspective, the intensity traditionally ramps up in the playoffs because teams are closer to the grand prize, the Stanley Cup. Fans are hyped up by the storylines and rivalries for every series, and so each event feels all the more momentous. So, how different are the rates of goals, shots, or hits from the regular season to the playoffs? Does the fact that a game is played during the playoffs change these rates significantly? Which rates don’t change that much?
My office was recently planning an offsite social event. During a team meeting, we brainstormed what activity to do together. Along with ideas like mini golf, hiking, and wine tasting, someone suggested karaoke. The team initially responded positively, so when everyone turned to me, I said “sure, that sounds fun”. Then someone put the options in a Google Form for us to all vote on privately. I opened it at my desk and immediately voted for karaoke dead last. I didn’t want to be a downer in public, but there was no way I was doing karaoke.
Being in public changes our behavior. It’s a natural trait and totally understandable. What’s interesting is understanding when and how it changes, and the NHL awards voting may have given us an opportunity to do just that. For the 2017-2018 season, the Professional Hockey Writers Association (PHWA) made their individual voter ballots public for the first time, and it appears that this may have affected how some writers voted.
Though it was completely tangential to @SteveBurtch’s line of thinking, his brief comments pondering the competitiveness between the middle of NHL lineups yesterday (which I can’t locate now, natch) got me thinking about whether the NHL and team management have gotten any more efficient or competitive over the last decade. With 10 years in the books for complex Corsi data, and hockey’s seeming “Moneyball moment” fully here regardless of the quibbling on social and mainstream media, is the league getting any tighter?
“They don’t ask how. They ask how many.”
“But seriously though… how?”
To state the obvious: goal-scoring is an essential skill for a hockey team. Players have made long careers by putting the puck in the net.
But how do players create goals? Skaters rely on all sorts of skills to score; some are fast, some have a huge shot, and some know how to be in the right place for an easy tap-in. But we don’t have a rigorous view of what those skills are, how they fit together, and which players rely on which ones.
In this piece, I take 100 of the top NHL goal-scorers and apply unsupervised learning techniques to group them into specific goal-scoring types. The result is a classification that buckets the scorers into 5 categories: bombers, rushers, chance makers, chaos makers, and physical forces. These categories can help players understand how to apply their skill set to goal-scoring. They can also help teams make sure that their system is putting their top players in a position to score.
Hockey fans and analysts have always appreciated the importance of passing. But until the passing project led by Ryan Stimson, we couldn’t quantify that importance. His work, supported by a team of volunteers and other analysts, has established that the passing sequence prior to a shot is a significant predictor of the likelihood of the shot becoming a goal. His work also showed that measuring shots and shot assists combined as shot contributions is a better predictor of future performance for both players and teams than shots alone.
Knowing that, the logical next step is to use passing data in analysis whenever possible. Unfortunately, the NHL does not provide passing data, so it must be manually tracked by people like Corey Sznajder. Corey’s work is invaluable and I encourage you to support him, but he’s only one person.
This article attempts to estimate a player’s quantity of shot assists in a given sample using publicly available data to help fill in gaps where tracked data doesn’t exist.
This year’s NHL draft class is weak. I don’t follow junior prospects closely, but that’s what I’ve heard from more knowledgeable sources. It’s a fair claim; Nolan Patrick and Nico Hischier seem talented but not among the game-changing talents that have recently been drafted first overall.
However, it’s harder to judge the draft class past the very top. Scouting is hard, especially for hundreds of prospects across the world. It’s possible that while there is no clear star in the draft class, the rest of the draft is as strong as ever.
That would have big implications for draft strategy. The conventional wisdom is that teams may trade more picks this year because they believe the weak draft class makes the picks less valuable. But if the draft is typical after the first few picks, that would be a poor use of assets.
We don’t yet know how well this year’s draft class will do in the NHL. But we can use historical data to ask questions that establish expectations: how well does each draft class typically perform, and how much does this vary by year?
On Monday, I introduced some work on quantifying and identifying team playing styles, which built upon my earlier work on identifying individual playing styles. Today we’re going to discuss how to make this data actionable.
What are the quantifiable traits of successful teams? What plays are they executing that make them successful? How can we use data to then build a style of play that is more successful than what we’re currently doing? The way we bridge the gap between front office and behind the bench is by providing data to improve matchup preparation, optimize lineups, and enhance tactical decisions.
This is what I mean by actionable: applying data-driven analysis and decision-making inside the coach’s room and on the ice. All data is from 5v5 situations and is either from the Passing Project or from Corsica.
If you’ve ever read a little math, you likely know the dangers of binning continuous data when testing relationships between two variables. It is one of the easiest and most common mistakes that an amateur statistician might make, largely because, intuitively, it seems like it should make sense.
But it doesn’t, and here’s why.
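One commonly cited danger can be sketched with a small simulation, though it may not be the specific argument the full post goes on to make. The example below assumes two variables with a modest true correlation (the sample size, the value of `rho`, and the choice of ten bins are all arbitrary illustration values), then compares the correlation in the raw data with the correlation between bin averages.

```python
import numpy as np

rng = np.random.default_rng(1)
n, rho = 5000, 0.3  # arbitrary sample size and true correlation

# Two standard-normal variables with correlation rho.
x = rng.normal(size=n)
y = rho * x + np.sqrt(1 - rho**2) * rng.normal(size=n)

# Correlation on the raw, unbinned data.
raw_r = np.corrcoef(x, y)[0, 1]

# Bin x into deciles, then average x and y within each bin.
edges = np.quantile(x, np.linspace(0, 1, 11))
bin_ids = np.digitize(x, edges[1:-1])  # bin labels 0..9
bin_x = np.array([x[bin_ids == b].mean() for b in range(10)])
bin_y = np.array([y[bin_ids == b].mean() for b in range(10)])

# Correlation between the ten bin averages.
binned_r = np.corrcoef(bin_x, bin_y)[0, 1]

print(round(raw_r, 2), round(binned_r, 2))
```

Averaging within bins removes most of the point-to-point noise, so the bin averages can line up far more neatly than the underlying data. Reading the binned correlation as if it described individual observations would overstate the strength of the relationship.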