June | 2018 | Hockey Graphs

My office was recently planning an offsite social event. During a team meeting, we brainstormed what activity to do together. Along with ideas like mini golf, hiking, and wine tasting, someone suggested karaoke. The team initially responded positively, so when everyone turned to me, I said “sure, that sounds fun”. Then someone put the options in a Google Form for us to all vote on privately. I opened it at my desk and immediately voted for karaoke dead last. I didn’t want to be a downer in public, but there was no way I was doing karaoke.

Being in public changes our behavior. It’s a natural trait and totally understandable. What’s interesting is understanding when and how it changes, and the NHL awards voting may have given us an opportunity to do just that. For the 2017-2018 season, the Professional Hockey Writers Association (PHWA) made their individual voter ballots public for the first time, and it appears that this may have affected how some writers voted.


Note: In this piece, I will use the phrases “scoring rate” and “5on5 Primary Points per Hour” interchangeably.

Introduction

Nico Hischier and Nikita Kucherov are both exceptional talents, but suppose that, for whatever reason, we want to identify the superior scorer of the two forwards. A good start would be to examine their 5on5 scoring rates in recent seasons…

[Figure: comparison of Hischier’s and Kucherov’s 5on5 scoring rates]

Evidently, Hischier has managed the higher scoring rate, but that conveys little without some notion of uncertainty, a significant issue given that the two samples are of unequal size. Kucherov’s sample is about three times larger (3359 minutes vs. 1077 minutes), so it is reasonable to expect that his observed scoring rate is more indicative of his true scoring talent. However, it is unclear how much more confidence the larger sample should give us in Nikita’s rate, and how that should factor into our comparison of the two players.

The relationship between evidence accumulation and uncertainty is important to understand when conducting analysis in any sport. David Robinson (2015) encountered a similar dilemma with regard to batting averages in baseball. He presented an interesting solution involving a Bayesian approach, specifically empirical Bayes estimation. This framework is built on the prior belief that an MLB batter most likely possesses near-league-average batting talent and is comparatively unlikely to sit at either extreme of the talent spectrum. As evidence is collected in the form of at-bats, the batter’s talent estimate can stray from the prior belief if that initial assumption stands in contrast with the evidence. The more data available for a specific batter, the less weight is placed on the prior belief. Therefore, when data is plentiful, the evidence dominates. The final updated beliefs are summarized by a posterior distribution, which can be used both to estimate a player’s true talent level and to convey the uncertainty in that estimate. In the following sections we will walk through the steps involved in applying the same empirical Bayes method employed by Robinson to our Hischier vs. Kucherov problem.
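To make the updating mechanics concrete, here is a minimal Python sketch of the baseball version of the idea, using a beta prior on batting average. It is an illustration under stated assumptions, not the code behind this piece: the function names, the league rates, and both batters are hypothetical, and the prior is fit with a simple method-of-moments step chosen for brevity, which may differ from how Robinson fits his.

```python
from math import sqrt
from statistics import mean, pvariance


def fit_beta_prior(rates):
    """Method-of-moments fit of a Beta(alpha, beta) prior to observed rates.

    Assumption: a simplification for illustration; other fitting methods
    (e.g. maximum likelihood) are equally valid choices.
    """
    m = mean(rates)
    v = pvariance(rates)
    strength = m * (1 - m) / v - 1  # alpha + beta, the "weight" of the prior
    return m * strength, (1 - m) * strength


def update(prior_alpha, prior_beta, hits, at_bats):
    """Conjugate update: beta prior plus binomial evidence gives a beta posterior."""
    return prior_alpha + hits, prior_beta + (at_bats - hits)


def beta_mean(alpha, beta):
    """Point estimate of talent: the mean of the beta distribution."""
    return alpha / (alpha + beta)


def beta_sd(alpha, beta):
    """Uncertainty of the estimate: the standard deviation of the beta distribution."""
    total = alpha + beta
    return sqrt(alpha * beta / (total ** 2 * (total + 1)))


if __name__ == "__main__":
    # Hypothetical prior and batters, made up purely for illustration.
    prior = (80.0, 220.0)
    batters = {"small sample": (4, 10), "large sample": (300, 1000)}
    for label, (hits, at_bats) in batters.items():
        post = update(*prior, hits, at_bats)
        print(label, hits / at_bats, beta_mean(*post), beta_sd(*post))
```

With these made-up numbers, the batter with the higher raw average but only 10 at-bats is pulled strongly toward the prior, while the batter with 1000 at-bats stays close to his observed rate and carries a narrower posterior. That is the behavior described above: the more evidence there is, the less the prior matters.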



Visualizing and analyzing hockey and statistics


Public Ballots May Be Changing Award Voting Behavior

Comparing Scoring Talent with Empirical Bayes