I’ve seen many statistical articles look at different ways to determine whether or not shot volume inflates a goaltender’s save percentage; however, I’ve never been satisfied with the methods used, regardless of the outcomes. So, I finally went and looked at the data myself.

It’s been seven months since I’ve written anything on save percentage. With all that wait, you’d think I’d give you a big, long, and in-depth article… but I won’t.

I had one planned, but accidentally lost all my data. Of course, errors always come in clumps. Instead of recovering the lost data, I ended up permanently removing it. To make matters worse, extraskater.com going dark made the information a hassle to manually extract again. I probably could write some code *(or get someone else to)* to pull the information again… but I still have one piece remaining from the original data: the graph.

What is this graph of? What does it mean?

The issue I always had with previous methods is that they never attempted to look at whether a particular goaltender’s save percentage changes with the shot volume that goaltender experiences.

The above image, though, is an attempt to do just that.

The data was taken from each game since the 2010-11 season for the 40 active goaltenders who have played at least 100 games in that period. Then, each game where a goaltender faced fewer than 10 shots or played less than 40 minutes was removed from the population. Each data point on the graph then represents one of the remaining games.
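The selection rules above can be sketched roughly as follows. This is only an illustration, since the original data was lost: the record layout (`goalie`, `shots`, `minutes`) and the function name `filter_games` are hypothetical, and I am assuming the 100-game threshold is counted before the short games are dropped.

```python
from collections import Counter


def filter_games(games, min_shots=10, min_minutes=40.0, min_games=100):
    """Keep qualifying games for goaltenders with enough games played.

    `games` is a list of dicts with the hypothetical keys
    "goalie", "shots", and "minutes" (one dict per goaltender game).
    """
    # Assumption: games played are counted on the full, unfiltered sample.
    games_played = Counter(g["goalie"] for g in games)
    return [
        g
        for g in games
        if games_played[g["goalie"]] >= min_games
        and g["shots"] >= min_shots      # drop games with fewer than 10 shots
        and g["minutes"] >= min_minutes  # drop games under 40 minutes played
    ]
```

Each dict that comes back would be one point on the graph.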

The two variables are simply the goaltender’s save percentage and shots against rate *relative* to the average **that goaltender** *has faced throughout the sample.*

*(FYI: the per X minutes value for shot rates was selected to give the x and y variables a similar mean and standard deviation, so that the difference in scale between the variables does not affect the relationship viewed on the graph… If I recall correctly, it was per 30 minutes… sorry for not having that)*
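As a rough sketch of how those two variables could be built: the code below assumes "relative" means each game's value minus that goaltender's own average, that the average is a simple mean over the goaltender's games, and that the rate is per 30 minutes. None of those details can be confirmed from the lost data, and the field names are hypothetical.

```python
from collections import defaultdict


def relative_values(games, per_minutes=30.0):
    """Return (relative shot rate, relative save percentage) per game.

    `games` is a list of dicts with the hypothetical keys
    "goalie", "shots", "goals", and "minutes".
    """
    by_goalie = defaultdict(list)
    for g in games:
        save_pct = 1.0 - g["goals"] / g["shots"]
        shot_rate = g["shots"] / g["minutes"] * per_minutes
        by_goalie[g["goalie"]].append((shot_rate, save_pct))

    points = []
    for rows in by_goalie.values():
        # Assumption: the baseline is the mean of the goaltender's own games.
        mean_rate = sum(rate for rate, _ in rows) / len(rows)
        mean_sv = sum(sv for _, sv in rows) / len(rows)
        points.extend((rate - mean_rate, sv - mean_sv) for rate, sv in rows)
    return points
```

Subtracting each goaltender's own average is what keeps a strong goaltender on a high-shot team from creating a relationship on his own.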

Basically, the graph shows whether or not a goaltender tends to save a greater percentage of shots when they face more shots, all relative to their normal.

There are a few issues with the graph that I never fixed before losing all my information. Mostly the issues are cosmetic; there is no title, no axis labels, and the axis ranges should be similar given the data. The last problem causes the data to visually seem far more like a line than it actually is, as the image is being severely stretched from left to right.

There is one real issue though: the x and y variables are actually the opposite of what they should be. However, the most important part survives these errors: the strength of the correlation, which does not depend on which variable sits on which axis.

The R^2 between these two variables is found to be 0.016, almost nonexistent. What does this mean? For the forty active goaltenders who played at least one hundred NHL games over the past four seasons, there is no substantial relationship between facing more or fewer shots against and playing better, in terms of save percentage. *(I unfortunately do not have N to give you a p-value, sorry again)*
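For anyone wanting to redo the calculation, here is a minimal sketch of R^2 for a simple two-variable relationship, written in plain Python. The function name and the sample numbers are made up for illustration and are not the lost data. It also shows that swapping x and y leaves the value unchanged. For scale, an R^2 of 0.016 corresponds to a correlation of roughly 0.13 in absolute value.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


# Made-up illustrative values, not real goaltender data.
rel_shot_rate = [1.0, 2.0, 3.0, 4.0, 5.0]
rel_save_pct = [2.0, 1.0, 4.0, 3.0, 5.0]

# The formula is symmetric in its two inputs, so the axis order is irrelevant.
print(r_squared(rel_shot_rate, rel_save_pct))
print(r_squared(rel_save_pct, rel_shot_rate))
```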

It should be noted that I repeated the same experiment for shot attempts *(ie: Corsi)* and non-blocked shot attempts *(ie: Fenwick)* against, with almost no difference in correlation.

While this doesn’t prove anything unequivocally, it is another piece of evidence that an NHL goaltender’s save percentage is predominantly a construct of natural variance and goaltending skill.

I looked at something similar a while ago and found that in general, both EV and Total save percentage tend to increase logarithmically with shots against (http://puckplusplus.wordpress.com/2013/06/29/shots-against-and-even-strength-shooting-percentage/). If you look at the aggregate save percentage by shots against (i.e. save percentage in all games for all goalies where a goalie faces 10 shots, 15 shots, 20 shots, etc.), the data is a little clearer. I think I also found that if you’re looking at predicting save percentage in a single game, the number of shots against is a better predictor than a goalie’s total season save percentage (I’d have to double check that, but I’m pretty sure it’s true).

Ya, I had something similar when I took them in groups, but I wonder how much of it is caused by sample bias, with a few quick goals causing a goalie to be pulled, for example (in part why I removed games with fewer than 10 shots or less than 40 minutes).

The other issue I have is that there is a natural problem with save percentage when looking at it from a total shots standpoint. You have 20 shots against and there are only a certain number of “reasonable” possibilities.

Let’s look at the save percentage when allowing 3, 2, or 1 goals against for a given number of shots against:

15: .800, .867, .933

20: .850, .900, .950

25: .880, .920, .960

30: .900, .933, .967

35: .914, .943, .971

This is why we see a distinction between your method’s outcomes and mine.
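The grid of “reasonable” outcomes above is just (shots - goals) / shots for a handful of goal totals, which a short sketch makes easy to reproduce (the function name is only for illustration):

```python
def possible_save_pcts(shots, goals_allowed=(3, 2, 1)):
    """Save percentage for each goals-against total on a fixed shot count."""
    return [round((shots - goals) / shots, 3) for goals in goals_allowed]


for shots in (15, 20, 25, 30, 35):
    print(shots, possible_save_pcts(shots))
```

The lowest of the three values climbs as the shot count grows, so grouping games by total shots builds in an upward trend even if the goals allowed stay the same.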

There is also the possibility of sample bias in that maybe the teams that tend to have high shots against (ex: Leafs) actually have better goalies (ex: Reimer/Bernier), while the teams that tend to have low shots against (ex: Devils) actually had worse goalies (ex: Brodeur).

I’m pretty sure I applied the same 10 shot/40 minute threshold when I did my analysis, but I didn’t take into account the rate portion of it, which may skew things. Your point about the distinct outcomes is definitely a valid one too, and something that’s somewhat tricky. One of my initial aims when I was looking at calculating expected goals against by shots against was to create a continuous version of Rob Vollman’s quality starts metric. Obviously it evolved after that, but the binary nature and the change in the metric at 20 shots were things I was hoping to improve upon.

Your point about the sample bias is interesting. I wonder if there’s any way to test whether the causation runs the other way, i.e. teams with good goalies tend to give up more shots because they have more “trust” in their goalie, while teams with bad goalies will protect them from quality chances against.

It is completely possible.