While the hockey analytics community has focused primarily on roster optimization, a small subset of the community has done a great deal of work on prospect analytics. This includes Gabriel Desjardins’ work on NHL Equivalent scoring, Josh Weissbock and Cam Lawrence’s work on Player Cohort Success (since purchased by the Florida Panthers), and Rhys Jessop’s work on adjusted scoring metrics.
As a big fan of prospect scouting and analytics, I wanted to add to the community by expanding upon the work done by Jessop.
Why Adjusted Scoring
Statistics matter, whether you use them or not. In hockey analytics, statistics are merely a measure of the player’s performance. A team hopes to draft the best possible performer, regardless of whether or not they use statistics to evaluate the prospects.
This is why we see statistics perform so well in predicting future success. For example, Jessop showed that scoring was a huge signal in predicting the best defensemen to take in the draft, and that a simple model based on scoring could do about as well as billion-dollar corporations.
Now, with the models I mentioned earlier, there are pros and cons to each. Knowing the strengths and weaknesses of each tool allows the analyst to optimize their work.
In terms of draft and prospect analysis, Desjardins’ NHLE model is often misused. The tool was constructed to estimate how much scoring a player would sustain when moving to the NHL the next season. It does so by looking at how the average player who left a particular league and made the NHL scored the following season.
The largest issue with using this as a draft tool is that the average player leaving one league is not always in a situation equivalent to that of the average player leaving another. For example, a junior player leaving a league like the CHL would be at a very different point in the development timeline than a player leaving the AHL, who is likely far older and taking a depth role.
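The equivalency idea described above can be sketched in a few lines of Python. This is only an illustration: the function name `translation_factor`, the pooled points-per-game ratio, and the sample numbers are all assumptions made for the example, not Desjardins’ actual method or data.

```python
def translation_factor(transitions):
    """Pooled NHL points per game divided by pooled prior-league points per game.

    `transitions` holds one tuple per player who moved to the NHL the next
    season: (league_points, league_games, nhl_points, nhl_games).
    Pooling before dividing is an assumption made for this sketch.
    """
    league_points = sum(t[0] for t in transitions)
    league_games = sum(t[1] for t in transitions)
    nhl_points = sum(t[2] for t in transitions)
    nhl_games = sum(t[3] for t in transitions)
    league_ppg = league_points / league_games
    nhl_ppg = nhl_points / nhl_games
    return nhl_ppg / league_ppg


# Invented sample, for illustration only.
sample = [
    (80, 60, 30, 75),
    (60, 60, 20, 50),
    (100, 80, 40, 75),
]
factor = translation_factor(sample)

# Apply the factor to a hypothetical prospect's points per game.
prospect_ppg = 90 / 68
equivalent_ppg = factor * prospect_ppg
print(f"factor={factor:.3f}, NHL-equivalent points per game={equivalent_ppg:.2f}")
```

Note that the factor describes the average player who actually made the jump, which is exactly why it can mislead when applied to a draft-eligible junior.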
PCS, and similar models, worked around this issue by looking at how players statistically similar to the prospect performed, both in terms of how often the similar players made it into the NHL and how well those that did make it performed over their careers. PCS started with the variables height, scoring, and age, as they were the strongest signals of future success.
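A cohort approach of this kind might look roughly like the sketch below. The `Player` class, the tolerance values used to decide who counts as "similar", and the historical sample are all hypothetical; the actual PCS model is proprietary and is not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Player:
    height_cm: float
    ppg: float                        # points per game in the draft season
    age: float                        # age in the draft season
    nhl_ppg: Optional[float] = None   # None means the player never made the NHL


def find_cohort(prospect, history, tol_height=3.0, tol_ppg=0.15, tol_age=0.5):
    """Historical players within an assumed tolerance on all three variables."""
    return [
        p for p in history
        if abs(p.height_cm - prospect.height_cm) <= tol_height
        and abs(p.ppg - prospect.ppg) <= tol_ppg
        and abs(p.age - prospect.age) <= tol_age
    ]


def cohort_summary(prospect, history):
    """Return (cohort size, share who made the NHL, their mean NHL ppg)."""
    similar = find_cohort(prospect, history)
    made = [p for p in similar if p.nhl_ppg is not None]
    rate = len(made) / len(similar) if similar else None
    mean_nhl_ppg = sum(p.nhl_ppg for p in made) / len(made) if made else None
    return len(similar), rate, mean_nhl_ppg


# Invented history, for illustration only.
history = [
    Player(183, 1.10, 17.8, 0.55),
    Player(185, 1.00, 18.0),
    Player(181, 1.20, 17.6, 0.35),
    Player(184, 1.05, 18.2),
    Player(170, 1.10, 17.8, 0.60),   # excluded: height too different
    Player(183, 0.60, 17.9),         # excluded: scoring too different
]
prospect = Player(183, 1.10, 17.9)
size, rate, mean_nhl_ppg = cohort_summary(prospect, history)
print(size, rate, mean_nhl_ppg)
```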
One issue with exclusively using a model like this as a draft tool is the possibility of confounding success that comes from inherent biases with success that comes from being the best possible choice. For example, there is a strong chance that NHL organizations are biased towards height, and so a model may overvalue how much height signals effectiveness.
In addition, height is an input that helps make a player as effective as they are (or are not). The players who perform best at a lower level are doing so in part due to inputs like height, and against players with different inputs, so their performance already reflects some of what height contributes. (There are, however, some differences between leagues that could make certain player types translate more effectively than others.)
Finally, there is the issue of small samples at the extremes of each variable. It becomes difficult to quantitatively compare multiple players who are part of very small cohorts, where one player making or missing the NHL could create a large swing in the calculated expected outcomes.
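To make the small-sample problem concrete, here is a quick arithmetic sketch. The cohort sizes and counts are invented, and the helper names are hypothetical.

```python
def nhl_rate(made, cohort_size):
    """Share of a cohort that made the NHL."""
    return made / cohort_size


def one_player_swing(made, cohort_size):
    """How far the rate moves if one more cohort member had made the NHL."""
    return nhl_rate(made + 1, cohort_size) - nhl_rate(made, cohort_size)


# Invented cohorts: both start at a 25% success rate.
small_swing = one_player_swing(1, 4)       # 1 of 4 becomes 2 of 4
large_swing = one_player_swing(25, 100)    # 25 of 100 becomes 26 of 100
print(f"small cohort: {small_swing:.0%}, large cohort: {large_swing:.0%}")
```

In the four-player cohort a single outcome moves the estimate by 25 percentage points; in the hundred-player cohort it moves by one.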
This is why I would suggest not looking merely at models like PCS in prospect analysis, but also including overall player performance models. These would be things like adjusted scoring and, hopefully in the future, shot-based differentials like Corsi or Expected Goals.
We know that players provide value beyond scoring, so looking at how a player tilts the ice in the team’s favour will provide valuable information for future prospect analysis. However, there is a relationship between those who drive scoring and those who drive shot differentials.
Jessop’s original metric adjusted for two factors: Age and Era.
Age adjustments make intuitive sense, as being older and having more time to develop physically and to gain experience leads to better performance, especially at draft age. Era adjustments, meanwhile, allow us to compare players to those from past seasons, and also help to dismiss the “inflated scoring” argument often used against players in higher-scoring leagues (although the higher-scoring league isn’t always the one popular opinion says it is, as Jessop has shown).
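As a minimal sketch of what these two adjustments could look like, consider the Python below. The formulas and constants are assumptions chosen for illustration (a simple ratio to a baseline scoring environment, and a linear age discount); they are not Jessop’s published method.

```python
BASELINE_GOALS_PER_GAME = 6.0   # assumed reference scoring environment
REFERENCE_AGE = 18.0            # assumed reference age for draft-eligible players
AGE_SLOPE = 0.10                # assumed 10% discount per year above the reference age


def era_adjust(ppg, league_goals_per_game, baseline=BASELINE_GOALS_PER_GAME):
    """Scale points per game to a common scoring environment."""
    return ppg * baseline / league_goals_per_game


def age_adjust(ppg, age, reference_age=REFERENCE_AGE, slope=AGE_SLOPE):
    """Discount older players and credit younger ones, linearly in age."""
    return ppg / (1.0 + slope * (age - reference_age))


# Invented example: 1.2 points per game in a league averaging 7.2 goals per game.
era_ppg = era_adjust(1.2, 7.2)
older = age_adjust(era_ppg, 18.5)
younger = age_adjust(era_ppg, 17.5)
print(f"era-adjusted={era_ppg:.2f}, at 18.5={older:.3f}, at 17.5={younger:.3f}")
```

With identical raw scoring, the younger player ends up with the higher adjusted rate, which is the intended direction of the age adjustment.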
To build on this, I have added two other adjustments: Secondary-Assist and League. Together with Era and Age, these give the metric its name: SEAL-adjusted scoring.
Secondary assists have a lot more noise to them. They are highly volatile from year to year, and scorekeepers are notoriously biased and inaccurate in awarding them consistently across different arenas. That said, they still provide value, even if the signal is weaker. For example, my early work at Hockey Data has suggested that overall assists per 60 minutes has a stronger relationship with successful passes per 60 minutes than primary or secondary assists alone.
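One simple way to keep secondary assists while trusting them less is to count them at a discount. The sketch below assumes a weight of 0.5 purely for illustration; the weight actually used in this metric is not stated here, and the function name is hypothetical.

```python
def weighted_points(goals, primary_assists, secondary_assists, secondary_weight=0.5):
    """Points total with secondary assists counted at an assumed partial weight."""
    return goals + primary_assists + secondary_weight * secondary_assists


# Two invented players with the same 60 raw points but different assist mixes.
player_a = weighted_points(goals=25, primary_assists=20, secondary_assists=15)
player_b = weighted_points(goals=20, primary_assists=15, secondary_assists=25)
print(player_a, player_b)
```

The player whose total leans on secondary assists drops further, while still receiving some credit for them.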
Era adjustments account for the goal rate differences between different leagues and seasons, but there is still a difference in quality that separates leagues beyond their scoring rates. For this reason, I added a league adjustment with a method similar to Desjardins’ NHLEs, although looking at multiple league translations beyond the NHL while holding age constant.
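A league adjustment along these lines might be sketched as follows. Everything here is an assumption for illustration: the league labels are placeholders, the data are invented, and "holding age constant" is interpreted as using only players within a small window of a chosen age in the earlier season.

```python
from collections import defaultdict


def league_factors(transitions, age, age_window=0.5):
    """Pooled points-per-game ratio for each (from_league, to_league) pair.

    Only players whose age in the earlier season is within `age_window`
    of `age` are used, so the comparison is made at a roughly fixed age.
    """
    totals = defaultdict(lambda: [0.0, 0.0, 0.0, 0.0])
    for t in transitions:
        if abs(t["age"] - age) > age_window:
            continue  # outside the age band being held constant
        acc = totals[(t["from"], t["to"])]
        acc[0] += t["from_points"]
        acc[1] += t["from_games"]
        acc[2] += t["to_points"]
        acc[3] += t["to_games"]
    factors = {}
    for pair, (from_pts, from_gp, to_pts, to_gp) in totals.items():
        factors[pair] = (to_pts / to_gp) / (from_pts / from_gp)
    return factors


# Invented transitions between placeholder leagues A, B and C.
transitions = [
    {"from": "A", "to": "B", "age": 18.0, "from_points": 50, "from_games": 50, "to_points": 20, "to_games": 40},
    {"from": "A", "to": "B", "age": 18.2, "from_points": 30, "from_games": 50, "to_points": 12, "to_games": 40},
    {"from": "A", "to": "B", "age": 20.0, "from_points": 70, "from_games": 50, "to_points": 40, "to_games": 40},
    {"from": "C", "to": "B", "age": 17.9, "from_points": 60, "from_games": 50, "to_points": 18, "to_games": 36},
]
factors = league_factors(transitions, age=18.0)
print(factors)
```

The 20-year-old is excluded from the A-to-B factor, so an older player’s easier transition does not inflate the translation applied to draft-age players.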
Here is a link to a Google Document displaying the SEAL-Adjusted scoring for every OHL, WHL, QMJHL, and USHL player who is also first-time draft eligible. In addition, I have added a few key players of interest from the BCHL and NCAA: Tyson Jost, Dante Fabbro, Dennis Cholowski, Luke Kunin, and Charlie McAvoy.
These additional factors help to advance the analysis of draft prospects, but there are still further enhancements to consider. In the future I hope to improve adjusted scoring by accounting for quality of team, quality of opposition, and power play scoring, while also including more leagues, such as those in Europe, college, and tier-II junior.