Cowritten by Brendan Kumagai, Mikael Nahabedian, Thibaud Châtel, and Tyrel Stokes
This is part 2 of a two-part series introducing our Bayesian space-time model for evaluating offensive sequences and player actions. In part 1, we outlined our methodology to build the model and explained the reasoning behind the key metric derived from it, Possession Added Value (PAV). In this second part, we illustrate how our model and the PAV metric can be used for team and individual player analysis. Read Part 1 here.
Since the Big Data Cup in March, our team has continued to improve the model to better estimate the Possession Added Value of offensive sequences. With some extra time on our hands, we cleaned up a few coding bugs and made two changes to the underlying models. First, we now explicitly separate failed and completed passes. Second, we drastically improved the models that predict the location of the next event. With these changes, we can simulate play sequences more realistically and value passing more accurately. As a result, the findings presented below may differ somewhat from our Big Data Cup paper.
Rush vs OZ Play
The PAV metric helps us assess a player’s overall contribution within offensive sequences. It can help us break down a player’s offensive impact for each of the following events: entries, passes, shots, turnovers and recoveries. Despite the fact that our dataset is limited in terms of the number of observations for non-Erie players, it is interesting to see some general trends emerge.
When looking at initial results from our model, we notice that both time and space play a huge role in how much value players can expect to add within offensive sequences. The first few seconds following a zone entry are the most dangerous: during this time period, the slot area is expected to be very valuable. If a team manages to access this area of the ice on the rush, they should expect to be able to create a very dangerous offensive sequence. Following a zone entry, defensive structures start to take shape with time, making the slot area harder to access.
In the figures above, the area with high expected goals – characterized by brighter colours in the slot – decreases in size as we transition from rush play to offensive zone play. Thus, as time goes on, the defence will set up and offensive teams will have less time and space in high-value areas of the ice, which will lead to a decay of value in these areas.
The dataset published by Stathletes for the Big Data Cup included passing plays (both failed and completed), which represent a significant portion of the puck plays made on the ice. This resolution of data, not usually available in the public sphere, allowed us to assess the relative value of passing plays as part of offensive sequences by incorporating passes into our PAV equation.
In general, traditional hockey fans assess the playmaking ability of players by focusing on a few game-changing passes. But these game-changing passes don’t occur often in a game. As the defense sets up and locks down the middle of the ice, the attacking team’s advantage in time and space fades away. As a result, we can see in the figure above that the vast majority of passes happen along the boards, through low-to-high or high-to-low plays, with space-time characteristics that yield very low offensive potential.
To reinforce this idea, our model also highlights a significant difference in success rates in the offensive zone between passing from the middle of the ice towards the boards and passing from the boards towards the middle of the ice. To put it simply, passing from high-value areas to low-value areas is easy; passing from low-value areas to high-value areas is difficult and rarely succeeds.
In summary, our model suggests that perimeter passes, which make up the bulk of all passes, neither increase nor decrease the value of a possession significantly: they do not infiltrate the high-value regions, but they also carry a low risk of a failed pass or turnover. That said, passes which greatly improve the condition of the puck are some of the highest-value plays possible, and players who can make them consistently can have large PAV passing values. This is in contrast with our Big Data Cup presentation and is the biggest change due to the model tweaks mentioned in the introduction.
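The directional asymmetry described above can be checked with a simple tally. The following is a minimal sketch, not our model: the zone labels ("slot", "boards") and the toy pass records are invented purely for illustration, and the function name is hypothetical.

```python
from collections import defaultdict

# Toy pass records, invented for illustration only (not real data).
# Each record is (origin_zone, target_zone, completed).
passes = [
    ("slot", "boards", True),
    ("slot", "boards", True),
    ("slot", "boards", False),
    ("boards", "slot", True),
    ("boards", "slot", False),
    ("boards", "slot", False),
    ("boards", "boards", True),
]


def completion_rate_by_direction(records):
    """Return {(origin, target): share of attempted passes that were completed}."""
    attempts = defaultdict(int)
    completed = defaultdict(int)
    for origin, target, ok in records:
        attempts[(origin, target)] += 1
        completed[(origin, target)] += int(ok)
    return {direction: completed[direction] / attempts[direction]
            for direction in attempts}


rates = completion_rate_by_direction(passes)
for (origin, target), rate in sorted(rates.items()):
    print(f"{origin} -> {target}: {rate:.2f}")
```

With real event data, the zones would more plausibly come from binning the x/y coordinates of each pass rather than from hand-assigned labels.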
However, passing plays are one of the most difficult hockey events to properly assess given that many external factors would need to be taken into consideration to understand the decision-making process behind them. For instance, there might be some long-term benefits in prolonging puck possession by passing to less favourable areas of the OZ, depending on the circumstances, which is reflected in the (slightly) positive average value of successful passes, even those around the perimeter.
Furthermore, without player movement data, our model can only capture proxies for the opposing team’s defensive structure, which is key to understanding the decision-making process of players when passing the puck. Therefore, our model cannot fully differentiate a case where a player willingly passes the puck to a lower-value part of the ice to gain a certain tactical advantage from a situation where a poor decision truly hinders the team’s ability to generate offence. To go further, one can only dream of the day when data are available on players’ passing and receiving ability to assess the probability of a successful pass, along with distance, backhand vs forehand, the presence of defensemen in the way, their pass-breaking ability, the position of their stick, etc.
The contribution of the other events to the PAV equation is much simpler to capture. For the most part, players draw positive PAVs from zone entries. This is in line with the foundations of our model, as a zone entry moves a team into the offensive zone and, in most cases, into a more favourable position to generate offence. Even if a dump-in is historically half as productive as a controlled entry [1], it still moves the needle upwards, going from outside of the offensive zone to the possibility of a shot. Similarly, players generally draw positive PAVs from puck recoveries, as the recovery of the puck is tied to the possibility of generating offence. On the other hand, all players get a negative PAV from turnovers. Finally, about 90% of players receive a positive PAV from shots; the rare negative cases can be interpreted as a player taking an abundance of lower-danger shots when more beneficial options are generally available.
How to use PAV
As a metric assessing the full involvement of a player inside the game, in connection with the causes and consequences of his decisions, PAV should be seen as an overview go-to stat, first and foremost. However, the components of PAV open a full realm of possibilities for in-depth analysis into the whys and hows.
The performance of a player on PAV and each of its components can be measured compared to team average, league average, adjusted for position, deployment, age or league if we think in terms of scouting. Once such a metric is available across different leagues, it would be easy to weigh it so the performance of a 17 year old player in the OHL can be compared to the impact of an 18 year old prospect in the Finnish Liiga, for instance.
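One simple way to express a player's PAV relative to peers is a z-score within position. The sketch below illustrates that idea only; it is not an adjustment we have implemented. The player names, positions and PAV values are made up, and the function name is hypothetical. Adjusting for deployment, age or league would need additional data and modelling choices not shown here.

```python
from statistics import mean, pstdev

# Hypothetical rows of (player, position, average PAV per event); not real data.
players = [
    ("A", "F", 0.030),
    ("B", "F", 0.010),
    ("C", "F", 0.020),
    ("D", "D", 0.005),
    ("E", "D", 0.015),
]


def pav_vs_position_average(rows):
    """Return {player: z-score of PAV within the player's position group}."""
    by_position = {}
    for _, position, pav in rows:
        by_position.setdefault(position, []).append(pav)
    # Population mean and standard deviation for each position group.
    stats = {pos: (mean(vals), pstdev(vals)) for pos, vals in by_position.items()}
    scores = {}
    for name, position, pav in rows:
        mu, sd = stats[position]
        # Guard against a group with no spread (e.g. a single player).
        scores[name] = (pav - mu) / sd if sd > 0 else 0.0
    return scores


z_scores = pav_vs_position_average(players)
```

A positive score means the player sits above the average of the same position group; the same pattern extends to grouping by team or league instead of position.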
The figure above shows how each of the Erie Otters’ players contributed to their team’s offensive sequences. The best players (i.e., Golod, Yetman, Singer, Hoffman) created significant value from zone entries, shots and puck recoveries when compared to the rest of the team.
But beyond the raw numbers, it was also very intuitive to build heatmaps showing where players created value on the ice looking at all events and for each of the PAV components. This sort of visual is critical to ensure a proper access and use of a new metric like PAV. In addition, this sort of heatmap provides an indication for where a player is adding value rather than only examining areas of the ice in which a player is active. There is a significant difference between being active and actually creating value. This is where PAV can come in and add a new layer to the analysis.
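The difference between being active and creating value can be made concrete by binning the same events twice: once by count and once by summed PAV. This is a minimal sketch under stated assumptions: the coordinates, the 10-unit cell size and the PAV values are invented for illustration, and our actual heatmaps are not necessarily built this way.

```python
# Hypothetical events for one player: (x, y, pav). Values are illustrative only.
events = [
    (172.0, 40.0, 0.08),   # two touches in a central cell
    (175.0, 44.0, 0.05),
    (190.0, 5.0, 0.00),    # three touches in a corner cell
    (192.0, 8.0, -0.01),
    (195.0, 3.0, 0.01),
]


def bin_events(rows, cell=10.0):
    """Return (counts, value): per-cell event counts and per-cell summed PAV."""
    counts, value = {}, {}
    for x, y, pav in rows:
        key = (int(x // cell), int(y // cell))  # grid cell index
        counts[key] = counts.get(key, 0) + 1
        value[key] = value.get(key, 0.0) + pav
    return counts, value


counts, value = bin_events(events)
most_active_cell = max(counts, key=counts.get)
most_valuable_cell = max(value, key=value.get)
```

In this toy example the most active cell is in the corner, while the most valuable cell is the central one, which is the distinction a PAV heatmap is meant to surface.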
Case Study: Connor Lockhart
To put it simply, PAV answers a very simple question regarding individual player performance: on average, by how much does a player improve or worsen the condition of the puck on the ice for his team with each puck touch?
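Read as an average over puck touches, that question reduces to a grouped mean. The sketch below assumes each event already carries a PAV value produced by the model; the player names, event types and numbers are hypothetical.

```python
from collections import defaultdict

# Hypothetical per-event PAV values: (player, event_type, pav). Not real data.
touches = [
    ("Player A", "entry", 0.04),
    ("Player A", "pass", 0.01),
    ("Player A", "turnover", -0.03),
    ("Player A", "shot", 0.06),
    ("Player B", "pass", 0.00),
    ("Player B", "recovery", 0.02),
]


def average_pav_per_touch(rows):
    """Return {player: mean PAV over all of that player's events}."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for player, _event_type, pav in rows:
        totals[player] += pav
        counts[player] += 1
    return {player: totals[player] / counts[player] for player in totals}


avg_pav = average_pav_per_touch(touches)
```

Grouping by (player, event type) instead of player alone would give the per-component breakdown for entries, passes, shots, turnovers and recoveries.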
In terms of scouting, PAV can be used as a starting point to help identify strengths and weaknesses in a prospect’s game. Combining it, afterwards, with other metrics as well as video analysis will enhance the scouting process. The following paragraphs exemplify how our metric can be utilized to analyze a prospect’s game.
Connor Lockhart is a prospect eligible for the upcoming NHL Draft. Looking at his player card, we notice that this undersized right winger adds value to his team’s offensive possessions in different ways.
In terms of entries, Connor Lockhart ranks around the 38th percentile in PAV among OHL forwards. With a 50% carry rate on zone entries, Lockhart gives his team full control of the puck, and with it the chance to generate sustained offensive pressure, on only every second entry. As a right winger, he tends to enter the zone from the right side rather than the middle of the ice. Therefore, in order to improve his PAV in this aspect of the game, Lockhart could work on using crossovers to gain speed and change lanes through the neutral zone. These lane changes will help him initiate more entries through the middle of the ice and generate a higher volume of chances off the rush.
In terms of recoveries, Lockhart ranks around the 53rd percentile in PAV among league forwards. Continuing to focus on smartly recovering the puck on both sides of the OZ and playing inside contact along the boards will help Lockhart sustain his above-average track record in this category.
From a shooting perspective, most of his attempts are in the slot (high danger) area making him an offensive threat for the opposition. His shooting PAV, which is around the 53rd percentile, helps highlight his ability to quickly release his shot in tight areas of the OZ, as a 16-year-old forward: in 2019-2020, Lockhart’s release was about 0.3 seconds faster than league average.
Lockhart is below league average both in terms of completed (46th percentile) and incomplete passes (40th percentile). When analyzing his passing clusters, we notice that most of his passes do not add much value from an offensive standpoint (low cycle passes, low to high passes…). In order to improve his passing PAV, Connor Lockhart should leverage and work on two key elements of his game. First of all, starting with a good puck reception is key: improving puck control with a better controlled first touch in a dynamic position (catching the puck in a weight shift or a crossover) would go a long way in allowing Lockhart to open up passing lanes in valuable areas of the ice. Once the valuable passing lanes are open, moving the puck quicker would limit his opponents’ reaction time and allow him to complete a higher rate of valuable passes. In 2019-2020, Lockhart was about 0.1 seconds slower than league average to pass the puck.
All in all, Lockhart ranked around the 50th percentile among OHL forwards in his rookie year, looking at average PAV per event, being the 7th best attacker on his team. He has shown some interesting signs of offensive upside, which could help convince an NHL team to take a chance on him in rounds 4-7 in the upcoming NHL draft.
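For readers unfamiliar with percentile rankings like those quoted in this case study, here is one common way to compute them. This is a sketch under an assumed definition (share of peers below the player, with ties counted as half), which may differ from the exact convention behind the numbers above; the league values are invented.

```python
def percentile_rank(value, peers):
    """Percentile of `value` within `peers` (ties count as half), from 0 to 100."""
    below = sum(1 for p in peers if p < value)
    equal = sum(1 for p in peers if p == value)
    return 100.0 * (below + 0.5 * equal) / len(peers)


# Hypothetical average PAV per event for five forwards; not real data.
league_forwards = [0.010, 0.015, 0.020, 0.025, 0.030]

middle_rank = percentile_rank(0.020, league_forwards)
top_rank = percentile_rank(0.030, league_forwards)
```

Under this definition, the player in the middle of a five-player list lands exactly on the 50th percentile, and the top player lands on the 90th rather than the 100th.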
While we have presented preliminary rankings of OHL players, there are many avenues to extend our work. There are two main branches to explore moving forward with the model we have presented.
First, we can continue to build upon our model and methodology. This can include expanding our model to the full ice, quantifying defensive contributions, accounting for quality of teammates, and extending to higher resolution tracking data.
Second, this article is just scratching the surface of potential data analyses with the PAV metric we have developed. The robust nature of this metric would allow us to cluster players by play style, analyze the spatiotemporal changes in PAV over a season, and incorporate uncertainties into player and sequence evaluation.
[1] Châtel, Thibaud. “Introducing Offensive Sequences and the Hockey Decision Tree.” Hockey Graphs, 2020, https://hockey-graphs.com/2020/03/26/introducing-offensive-sequences-and-the-hockey-decision-tree/.