Comparing Scoring Talent with Empirical Bayes

Note: In this piece, I will use the phrases “scoring rate” and “5on5 Primary Points per Hour” interchangeably.

Introduction

Nico Hischier and Nikita Kucherov are both exceptional talents, but suppose, for whatever reason, we want to identify the superior scorer of the two forwards. A good start would involve examining their 5on5 scoring rates in recent seasons…

[Figure: 5on5 scoring rates of Hischier and Kucherov in recent seasons]

Evidently, Hischier has managed the higher scoring rate, but that alone conveys little without some notion of uncertainty, a significant issue given that the two samples are of unequal size. Kucherov’s sample is about three times larger (3359 minutes vs. 1077 minutes), so it is reasonable to expect that his observed scoring rate is more indicative of his true scoring talent. However, it is unclear how much more comfortable we should feel with the data being in Nikita’s favor, or how that should factor into our comparison of the two players.

The relationship between evidence accumulation and uncertainty is important to understand when conducting analysis in any sport. David Robinson (2015) encountered a similar dilemma with regard to batting averages in baseball. He presented an interesting solution built on a Bayesian approach, specifically empirical Bayes estimation. This framework rests on the prior belief that an MLB batter likely possesses near-league-average batting talent and is much less likely to sit at either extreme of the talent spectrum. As evidence is collected in the form of at-bats, the batter’s talent estimate can stray from the prior belief if that initial assumption conflicts with the evidence. The more data available for a specific batter, the less weight is placed on the prior belief, so when data is plentiful, the evidence dominates. The final updated beliefs are summarized by a posterior distribution, which can be used both to estimate a player’s true talent level and to convey the uncertainty implicit in that estimate. In the following sections we will walk through the steps involved in applying the same empirical Bayes method employed by Robinson to our Hischier vs. Kucherov problem.
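
To make the shrinkage idea concrete, here is a minimal sketch. It is not the method from the piece: Robinson’s batting example uses a beta prior on a success proportion, while this sketch swaps in a gamma prior on a rate (a gamma-Poisson model), since points per hour is a count over time. The prior parameters and both point totals are hypothetical placeholders chosen for illustration; only the minute totals come from the text. In a real empirical Bayes analysis the prior would be fit to league-wide data.

```python
import math
from dataclasses import dataclass


@dataclass
class Skater:
    name: str
    primary_points: int  # HYPOTHETICAL count, for illustration only
    minutes: float       # 5on5 minutes (the two values used below are from the text)


# HYPOTHETICAL prior: Gamma(shape, rate) over primary points per hour.
# Empirical Bayes would estimate these from league-wide data, not pick them by hand.
PRIOR_SHAPE = 6.0
PRIOR_RATE_HOURS = 4.0  # prior mean = 6.0 / 4.0 = 1.5 points per hour


def posterior(skater):
    """Gamma-Poisson update: add points to the shape and hours to the rate."""
    hours = skater.minutes / 60.0
    return PRIOR_SHAPE + skater.primary_points, PRIOR_RATE_HOURS + hours


def summarize(skater):
    """Return the raw rate plus the posterior mean and standard deviation."""
    shape, rate = posterior(skater)
    return {
        "observed": skater.primary_points / (skater.minutes / 60.0),
        "estimate": shape / rate,       # posterior mean
        "sd": math.sqrt(shape) / rate,  # posterior standard deviation
    }


# Point totals are made up; they are NOT the players' real numbers.
hischier = Skater("Hischier", primary_points=35, minutes=1077)
kucherov = Skater("Kucherov", primary_points=100, minutes=3359)

results = {s.name: summarize(s) for s in (hischier, kucherov)}
for name, r in results.items():
    print(f"{name}: observed {r['observed']:.2f}, "
          f"estimate {r['estimate']:.2f} +/- {r['sd']:.2f}")
```

Under these assumptions, the smaller sample is pulled further toward the prior mean and keeps a wider posterior, which is exactly the behavior described above: the more minutes a player has, the more the evidence dominates.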

Continue reading

Linking penalties and game minute in the NHL

By Ingrid Rolland and Michael Lopez

At the 10-minute mark of the first period during Game 2 of Tampa Bay’s 2nd-round series with Boston, Torey Krug was sent to the box for two minutes after committing this slashing violation against Brayden Point.

[Image: Krug’s slash on Point]

The Lightning cashed in on the ensuing power play, with Point scoring the game’s opening goal.

Fast forward to later in this same game, with Tampa Bay clinging to a 3-2 advantage and less than four minutes remaining in regulation. Brad Marchand skated past Anton Stralman for a scoring chance, and the Lightning defender reached around to commit what looked to be a similar violation to the one deemed a penalty on Krug above.

[Image: Stralman’s reach on Marchand]

No penalty on Stralman was called, however, and Tampa Bay held on for a 4-2 win. It was the first of the team’s four consecutive triumphs over the Bruins that earned the Lightning a spot in the Eastern Conference Final.

“We hate to harp on the refs, but tonight they deserved to get harped on,” opined NBC’s Jeremy Roenick after the game. “How can you call [Krug’s] penalty early in the game, in such a big playoff game?”

Continue reading

The Launch of the Tape to Tape Project

Earlier this year, Rushil Ram, Mike Gallimore, and Prashanth Iyer launched Tape to Tape, an online tracking system that can be used to record locations of shot assists, zone exits, and zone entries. Rushil and I will be running the Tape to Tape Project in order to compile a database of these statistics with the application Rushil created. We have already had close to 30 trackers sign up from an announcement on Twitter last week.

Each individual will track zone exits, zone entries, and shot assists for the games they sign up for. Once the games are complete, the data will be exported to a public Dropbox folder. The goal of this project is to enhance our understanding of these microstats as they pertain to coaching decisions, player performance, and wins. What follows is a description of what we will be tracking, a brief summary of the research that explains why these specific microstats are important, and an outline of how we will be tracking these events.

Continue reading

An Introduction to NWHL Game Score

In July of 2016, Dom Luszczyszyn released a metric called Game Score. Based on the baseball stat created by Bill James (and ported to basketball by John Hollinger), Game Score aims to measure single-game player productivity.

While it’s often easy to compare players across larger sample sizes, comparing two different players’ performance on a given night can be difficult. If player A has a goal, two shots, and took a penalty, did that player outperform player B who had two assists and one shot? Game score attempts to answer that question by weighting each of the actions of each player to give us a single number representing their overall performance in that game.
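
As a rough sketch of that weighting idea, the example below scores the two hypothetical box-score lines from the paragraph above. The weights are invented for illustration; they are not Dom’s weights or the NWHL Game Score weights, and the stat names are assumptions as well.

```python
# HYPOTHETICAL weights, for illustration only (not the actual Game Score weights).
WEIGHTS = {
    "goals": 1.0,
    "assists": 0.6,
    "shots": 0.1,
    "penalties_taken": -0.2,
}


def game_score(box_line):
    """Weighted sum of a player's box-score counts for a single game."""
    return sum(WEIGHTS[stat] * count for stat, count in box_line.items())


# Player A: a goal, two shots, and a penalty taken.
player_a = {"goals": 1, "shots": 2, "penalties_taken": 1}
# Player B: two assists and one shot.
player_b = {"assists": 2, "shots": 1}

score_a = game_score(player_a)
score_b = game_score(player_b)
print(f"Player A: {score_a:.2f}, Player B: {score_b:.2f}")
```

With these made-up weights player B comes out ahead, but a different set of weights could flip the result, which is why the choice of weights is the heart of the metric.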

Dom’s main goal was to create a better way to evaluate single-game performance; mine was to create a better statistic for evaluating the total contributions of players. There are no advanced metrics, like Corsi For percentage, or even Goals For percentage, available at this time in the NWHL. Because of this, points are the best way to evaluate players, even though other box score stats are available.

Continue reading

Hockey-Graphs Podcast Episode 9: Erik Karlsson and Market Value

Chris Watkins joined Adam Stringham to discuss some of his new work and Erik Karlsson’s recent comments. Is the NHL entering a new age of superstar transition? Will the league’s best players start jumping around in free agency? Any comments are appreciated; the goal is to produce a podcast that people want to hear. Please subscribe to the podcast on iTunes!

Hockey-Graphs Podcast Episode 8: Market Efficiency and Diminishing Returns

Shawn Ferris joined Adam Stringham to discuss some of his work over the last year, including his piece on whether shot parity is increasing, a look at how teams relying on high-percentage chances are less consistent in their expected goal output, and some of his upcoming work. Any comments are appreciated; the goal is to produce a podcast that people want to hear. Please subscribe to the podcast on iTunes!

#RITHAC 2017 Slides & Video

Yesterday, the third annual Rochester Institute of Technology Hockey Analytics Conference was held. Below are links to the slides for each presenter, as well as links to a stream of the morning and afternoon sessions. Please refer to this post for the time of each person’s talk or panel. More detailed recaps are undoubtedly coming from others, so this is simply a reference for streams and slides for those who missed the event or would like to revisit certain talks.

Continue reading

Fear and Loathing in Las Vegas: An analytical deep dive into the Vegas Expansion Draft

Despite the aura of calm projected by Golden Knights owner Bill Foley, his nascent desert franchise is already on the clock. The recent announcement that the Oakland Raiders will be moving to Las Vegas in 2019 has already undermined Foley’s plan to be the only show in town. If the Golden Knights don’t win over the Vegas fanbase in relatively short order, it could prove almost impossible for hockey to ever get a foothold in Sin City.

As a result, the team faced a variety of difficult decisions going into the 2017 expansion draft. On one hand, the team could try to win immediately with aging veterans like Eric Staal, which would allow them to establish a foothold in the market but also put them at risk of years of mediocrity as older players lose their fights with Father Time. On the other hand, the team could tank in the hopes of finding stars at the top of the draft, but the resulting losing could further exacerbate the fanbase’s preference for the incoming NFL juggernaut.

In order to evaluate the quality of GM George McPhee’s selections, I viewed each pick as a “trade” and applied a prototype of a “Trade Machine” to look at each selection, given the choices available. For example, Vegas chose Clayton Stoner and Shea Theodore from the Ducks over Sami Vatanen, which means, in essence, the selection was a trade of Theodore and Stoner for Vatanen straight up. After looking at all 31 selections, I compared Vegas’ actual roster to an optimal roster calculated using DTMAboutHeart’s GAR statistic. The results are below.

Continue reading