Behind the Numbers: Scoring first and conditional probability

Every once in a while I will rant on the concepts and ideas behind what the numbers suggest in a series called Behind the Numbers, as a tip of the hat to the website that brought me into hockey analytics: Behind the Net.

Not long ago Jason Gregor tweeted about the value of scoring first.

It may be a bit controversial and difficult to get right away, but the value of scoring first is not special. Long ago, Mr. Eric Tulsky, now of the Carolina Hurricanes, showed that the value of scoring first equals the value of any other goal.

Conditional probability can be difficult. Like actually. I’m not making fun of Jason Gregor; it can be a difficult concept to grasp. Only a very small percentage of people understand the Monty Hall problem intuitively.

Not everyone will get what Tulsky described in that article right away. I find probability trees helpful, so I will try to show what is going on here. Now as a fair warning: I’m not artistic, and I wrote this quickly, so the drawings won’t look very pretty.

Let’s start off really simple and set some conditions. On average a winning team scores two-thirds, or 67%, of the goals. This also means that on average the winning team carries a 67% probability of scoring the next goal at any point of the game.

In fact, we do find that the team that scores first wins about two-thirds of the time, or about 67%. The team that does not score first wins the rest of the time, or about 33%:

[Image: hand-drawn probability tree for the first goal]

The winning team scoring first 67% of the time also means that the team that scores first has a win percentage of 67%. The two are equivalent here.
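One way to see why the two statements are equivalent is Bayes' rule. This is a sketch, assuming every game has exactly one winner and exactly one team that scores first, so a randomly chosen team has a 1/2 chance of each. The symbols W (the team wins) and S_1 (the team scores first) are my own labels for illustration:

```latex
P(W \mid S_1)
  = \frac{P(S_1 \mid W)\, P(W)}{P(S_1)}
  = \frac{\tfrac{2}{3} \cdot \tfrac{1}{2}}{\tfrac{1}{2}}
  = \tfrac{2}{3}
```

Because P(W) and P(S_1) are both 1/2, they cancel, and "67% of winners scored first" turns into "67% of first scorers won".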

The part that gets difficult is when we look at the value of the second goal (or the third goal). We still find that the team that scores the second goal wins about two-thirds of the time:

[Image: hand-drawn probability tree for the second goal]

Now this may seem confusing, but we have to realize that the value of scoring the second goal is independent of whether or not you scored the first goal. This independence has meaning. Scoring the second goal could tie the game at 1-1, or it could give you a 2-0 lead:

[Image: hand-drawn probability tree]

Independence has a very important meaning in probability. The value of scoring second is independent, or agnostic, of who scored first. If you do want to account for who scored first, you have to break it down further:

[Image: hand-drawn probability tree splitting the second goal by who scored first]

About two-thirds of the winning teams end up on the left side, scoring the first goal. About one-third of winning teams end up on the right side, having been scored on first.

The value of scoring second, though, is both the value of scoring second given you scored first AND the value of scoring second given you were not the team that scored first. After all, you can score the second goal after scoring first or you can score the second goal after being scored on first.

Mathematically you combine both the ‘x1’ and ‘x3’ values.

How does this fit on our graph? Well:

[Image: hand-drawn full probability tree for the first two goals]

How do you read this?

Two-thirds of the time, the winning team will score the first goal. Getting the first goal is important, as you now have a 1-0 lead, can potentially lead by two with the next goal, and are not trailing.

Provided that they have scored the first goal, the winning team will score the second goal about two-thirds of the time. Provided that they have not scored the first goal, the winning team will score the second goal about two-thirds of the time. This means about 45% (two-thirds of 67%) of the time the winning team will have a 2-0 lead, about 44% of the time they are tied (one-third of 67% and two-thirds of 33% are both 22%), and about 11% (one-third of 33%) of the time they are trailing by two goals. This also means that the winning team scored the second goal 67% of the time [45+22], which is equivalent to teams that score the second goal having a win% of 67%.
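The tree arithmetic above can be checked by enumerating every branch. This is a minimal sketch, assuming (as the post does) that the eventual winner scores each goal with probability 2/3 regardless of what came before. The names `winner_paths` and `P_WINNER_SCORES` are made up for this illustration:

```python
from fractions import Fraction
from itertools import product

# Assumption taken from the post: the eventual winner scores any given
# goal with probability 2/3, independent of earlier goals.
P_WINNER_SCORES = Fraction(2, 3)


def winner_paths(n_goals, p=P_WINNER_SCORES):
    """Map each goal sequence (True = eventual winner scored) to its probability."""
    paths = {}
    for seq in product([True, False], repeat=n_goals):
        prob = Fraction(1)
        for winner_scored in seq:
            prob *= p if winner_scored else 1 - p
        paths[seq] = prob
    return paths


paths = winner_paths(2)

lead_2_0 = paths[(True, True)]                           # winner scored both
tied_1_1 = paths[(True, False)] + paths[(False, True)]   # two ways to reach 1-1
trail_0_2 = paths[(False, False)]                        # winner allowed both
scored_second = paths[(True, True)] + paths[(False, True)]

for label, value in [("2-0", lead_2_0), ("1-1", tied_1_1),
                     ("0-2", trail_0_2), ("scored second", scored_second)]:
    print(f"{label}: {value} = {float(value):.1%}")
```

With exact thirds the 2-0 branch comes out to 4/9, or about 44.4%; the 45% in the text comes from multiplying the rounded 67% by itself.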

Getting the second goal is important, just as important as the first goal.

Why?

Well, if you have a 1-0 lead, you increase the lead to two goals, which makes you a heavy favourite to win. The win percentage of teams with a 2-0 lead is about 80.4% [45/(11+45)]. Allow the second goal and all of a sudden you are on equal footing with the opposition once again, losing the impact of the first goal.

If you are trailing 0-1, you tie things up and put yourself back on equal footing. You also avoid placing yourself at a huge disadvantage by trailing by two. The win percentage of teams trailing 0-2 is about 19.6% [11/(11+45)].

The win percentage of those that score the second goal, provided that they have scored the first, is about 80%. The win percentage of those that score the second goal, provided that they have not scored the first, is about 50% [22/(22+22)]. The win percentage of those that have scored the second goal regardless of who scored the first is about 67%. The “provided that” part signals conditional probability.
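The conditional win percentages follow from comparing how often a given score belongs to the eventual winner versus the eventual loser. This is a rough sketch under the same assumption that the winner scores each goal with probability 2/3. The helper names `path_prob` and `win_pct` are hypothetical:

```python
from fractions import Fraction

# Assumption from the post: the eventual winner scores each goal
# with probability 2/3.
P = Fraction(2, 3)


def path_prob(seq, p=P):
    """Probability that the eventual winner follows seq (True = winner scored)."""
    prob = Fraction(1)
    for scored in seq:
        prob *= p if scored else 1 - p
    return prob


def win_pct(seq, p=P):
    """Win percentage of teams whose own goal sequence is seq.

    A team with sequence seq is the winner with weight path_prob(seq).
    It is the loser whenever the winner followed the mirrored sequence.
    """
    as_winner = path_prob(seq, p)
    as_loser = path_prob(tuple(not s for s in seq), p)
    return as_winner / (as_winner + as_loser)


print("scored first and second:", win_pct((True, True)))
print("scored first, not second:", win_pct((True, False)))
print("scored second, not first:", win_pct((False, True)))
print("scored neither:", win_pct((False, False)))
```

Using exact thirds gives 4/5, 1/2, 1/2, and 1/5. The 80.4% and 19.6% quoted in the text differ slightly only because they are built from the rounded 45 and 11.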

Goals are important, and scoring the first one is important as well. After all, hockey is merely a goal scoring contest; the most goals wins. Scoring the first goal gives you a head-start in scoring the most.

However, the importance of scoring the first goal does not seem to come from “setting the pace”, or at least if it does, there are “momentum” factors in every other goal that are equally important.

A team should not try to score the first goal. A team should try to score all of the goals, focusing on one goal at a time.

Addendum

I noticed some confusion with the full tree on Twitter, so I thought I should both expand and explain.

The 45, 22, 22, and 11 percentages are the percentages of all winners who were at that score after two goals.

We get them by multiplying the probability that the winner scored (or did not score) the second goal by the percentage from the previous step. 45% is two-thirds of 67%, so 45% of teams that won had a 2-0 lead. 22% is both one-third of 67% and two-thirds of 33%, so there are two ways to reach 1-1, and 44% (22+22) of all winners were tied 1-1.

Now, winning percentage is different from the percentage of winning teams that had a given score.

In the short graphs the two were the same, as there were only two options: you either led by one or you didn’t. 33% of all winning teams trailed 0-1, and the winning percentage of teams that trailed 0-1 was also 33% [33/(33+67)].

In the longer graph the two values are not the same. 22% of all teams that won trailed 0-1 and then scored the second goal. The win percentage of teams tied 1-1 who allowed the first goal is not 22% though… It is 50% [22/(22+22)]. Remember, 22% of all teams that won trailed 0-1 and then scored the second goal, while exactly the same percentage of teams that won were ahead 1-0 and allowed the second goal against.

List of things:

  • 67% of all winning teams scored the first goal
  • 67% of all winning teams scored the second goal
  • 67% of all winning teams scored the third goal
  • The winning percentage of teams that scored first is 67%
  • The winning percentage of teams that scored second is 67%
  • The winning percentage of teams that scored third is 67%
  • 45% of all winning teams had a 2-0 lead (scored first and second)
  • 44% of all winning teams had a 1-1 tie (half scored first and half did not)
  • 11% of all winning teams had a 0-2 deficit (scored on first and second)
  • The winning percentage of teams that scored first and second is 80%
  • The winning percentage of a team that scored first but not second is 50%
  • The winning percentage of a team that scored second but not first is 50%
  • The winning percentage of a team that scored neither first nor second is 20%

The value of scoring first is the 67% winning percentage of all teams that score first.

The value of scoring second is the 67% winning percentage of all teams that score second. This value includes the improved winning percentage of a 1-0 team moving to 2-0 (67% to 80%) and the improved winning percentage of a 0-1 team moving to 1-1 (33% to 50%).
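To check that the first, second, and third goals all carry the same 67% win percentage, here is a small sketch that sums over every branch of a three-goal tree. It assumes the same 2/3 figure as the rest of the post, and `win_pct_scoring_kth` is a name invented for this example:

```python
from fractions import Fraction
from itertools import product


def win_pct_scoring_kth(k, n_goals, p=Fraction(2, 3)):
    """Win percentage of the team that scored goal k (1-indexed),
    regardless of who scored the other goals.

    Assumes the eventual winner scores each goal with probability p.
    """
    as_winner = Fraction(0)
    as_loser = Fraction(0)
    # Each seq is from the eventual winner's side: True = winner scored.
    for seq in product([True, False], repeat=n_goals):
        prob = Fraction(1)
        for winner_scored in seq:
            prob *= p if winner_scored else 1 - p
        if seq[k - 1]:
            as_winner += prob  # the scorer of goal k went on to win
        else:
            as_loser += prob   # the scorer of goal k went on to lose
    return as_winner / (as_winner + as_loser)


for k in (1, 2, 3):
    print(f"goal {k}: {win_pct_scoring_kth(k, n_goals=3)}")
```

Each goal comes out to 2/3, because the branches for the other goals sum to one and drop out of the ratio.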

7 thoughts on “Behind the Numbers: Scoring first and conditional probability”

  1. If a team has 80.4% chance of winning a game by going up 2-0, then the scoring the first goal of the game is more important than the second. You have no chance to get that 80.4% chance to win the game if you are scored upon first, you need to tie the game up and then score two more times in a row to get another two goal lead. Wouldn’t be given the chance to have a win probability of 80.4% after two goals be more valuable than a chance to claw back to a 50% win probability by tying a game 1-1? The first goal is more valuable than any other goal because it gets a team on the path to a higher win probability faster.

  2. Sorry I have gotten a bit obsessed with this, but I did some math and I think I have figured out the value of the first goal. Here’s how I figure it, the value of the first goal is the ability to get to a two goal lead faster and increase the win probability from 67% to 80.4%. If a team scores first that means that half the time they will either be tied after 2 goals or up two goals. The team that scores first has an average win probability of 65.2%, or (50% + 80.4%)/2. The team that allows the first goal will either be tied or be down two goals and then have an average win probability of 34.8% or (50 + 19.6)/2. The team that scores first will end up on average to have a 30.40% better chance of winning the game after two goals are scored. I do not have the win probabilities for after the third goal and fourth etcetera, but I would venture a guess that the average win probabilities of these situations is what nudges the win percentage up to the 67% for a team that scores first.

  3. Strategies vastly change once the first goal is scored on both sides of the puck. I don’t see anywhere above about the implementation of a trap system by the team scoring first. Or a more aggressive forecheck by the team down 1-0. Time left in the game is also a factor. I would love to know the stats of the 2000 Dallas stars. Or Guy Boucher coached teams.

  4. I think you are leaving out a major point by ignoring time. If the first goal comes at 19:59 of the third period, there is a 100% chance that team will win. And thus there is a sliding spectrum of value (or win %) for the first goal that approaches that 100% mark the later the game goes.

    I think a second mistake you are making in your analysis is in not giving value to the quality of the teams. You are treating the teams as equals, which may not be the case. I think this is the same error inherent in statistics that quote a team’s chance to come back from 3-0 down in a series. Historically, most 3-0 series are dominated by a superior team. Thus the heavy statistics of (I think) about a 97% win rate. But sometimes an equal team or even a better team might find itself down 3-0. (I’d argue Tampa this year.) Tampa’s chances of winning 4 straight are much higher than many other 3-0 situations (Even though they ended up getting swept.)

  5. This is wrong. The chance of any team scoring is 50/50 adjusted to relative strengths of the 2 teams. The stats show that the team scoring first will win about 67% of the time but that statistic is based on a full game. I am a professional player of soccer investing. My investment strategies work in part on that basis.

  6. Has anyone taken a hockey season and actually seen who does win how often with first, second, or third goals? Hockey is more than just raw probability —- there are momentum factors as well as the clock factors.

    • Yes, the very first link provided is an article that did just that. This article is to explain why this pattern exists.
