Monday, 25 May 2015

Updated NHL Expected Goals Model


Here is the latest rendition of my Expected Goals model. If you haven't read the original post, you should probably read it here before continuing. The only substantial change from the previous version is that this one now includes rush shots, since it has previously been shown that rush shots, by the very fact that they are rush shots, result in a higher shooting percentage. My model currently only accounts for 5-on-5 situations and includes a total of five factors:
  • Adjusted Distance
    • The farther out a shot is taken, the lower the likelihood that it results in a goal
  • Type of Shot
    • Snap/Slap/Backhand/Wraparound/etc...
  • Rebound - Yes/No?
    • A rebound is defined as a shot taking place less than 4
  • Score Situation
    • Up a goal/down a goal/tied/etc…
  • Rush Shot - Yes/No?
    • Rush shots have a higher shooting percentage
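
As a rough illustration of how these five factors could be fed to a regression, here is a minimal Python sketch that turns one shot into a row of numbers. The category lists, the "Wrist" baseline, the three-way score state, and the function name encode_shot are all assumptions made for the example; the actual model's categories and encoding may differ.

```python
# Hypothetical category lists; the first shot type and "tied" act as baselines
# and therefore get no column of their own.
SHOT_TYPES = ["Wrist", "Snap", "Slap", "Backhand", "Wraparound"]
SCORE_STATES = ["trailing", "tied", "leading"]


def encode_shot(distance, shot_type, is_rebound, score_state, is_rush):
    """Encode one shot's five factors as a flat list of floats."""
    row = [float(distance)]
    # One indicator column per non-baseline shot type.
    row += [1.0 if shot_type == t else 0.0 for t in SHOT_TYPES[1:]]
    row.append(1.0 if is_rebound else 0.0)
    # One indicator column per non-baseline score state.
    row += [1.0 if score_state == s else 0.0 for s in SCORE_STATES if s != "tied"]
    row.append(1.0 if is_rush else 0.0)
    return row


# A slap shot off the rush from an adjusted distance of 20, with the game tied.
example_row = encode_shot(20, "Slap", False, "tied", True)
```

Dropping one baseline level per categorical factor is a common way to keep the indicator columns from being redundant with the intercept.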

Results

The graphs below are the same sort as in the previous post, along with the correlations for each. The ExpGF correlation rose slightly from 0.58 to 0.61, while the ExpGA correlation stayed consistent at 0.60. That isn't to say adding rush shots didn't affect the model. There is definitely some difference, both positive and negative, for certain teams, typically within the 10 goal range.



Upcoming


I still plan on adding some aspect of regressed shooter and goaltender talent somehow into the model. I am close to releasing ExpG at the player level, hopefully within the next week. Around the time I am able to incorporate goaltender talent into the model I should also be able to update my xSV% with the shot quality aspects of this model.

Expected Goals


Here are the updated results below. Note that dGF, dGA, and dGF% are calculated as actual minus expected. Therefore, a positive dGF means that a team scored more goals than the model predicted they would. A positive dGA means that a team allowed more goals against than the model would have predicted. I will update this spreadsheet in its own tab at the top of this site too. Please let me know any questions or feedback you might have. Enjoy!
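
For anyone who wants to reproduce the "actual minus expected" columns, here is a small Python sketch. It assumes GF% means goals for divided by total goals (for plus against), expressed as a percentage; the function name and the team totals are invented for the example.

```python
def goal_differentials(gf, ga, exp_gf, exp_ga):
    """Return dGF, dGA and dGF%, each computed as actual minus expected."""
    # Assumed definition: GF% = 100 * GF / (GF + GA).
    gf_pct = 100.0 * gf / (gf + ga)
    exp_gf_pct = 100.0 * exp_gf / (exp_gf + exp_ga)
    return {
        "dGF": gf - exp_gf,           # positive: scored more than predicted
        "dGA": ga - exp_ga,           # positive: allowed more than predicted
        "dGF%": gf_pct - exp_gf_pct,  # positive: better goal share than predicted
    }


# Made-up team: 160 goals for, 140 against, with 150 expected each way.
example = goal_differentials(gf=160, ga=140, exp_gf=150.0, exp_ga=150.0)
```

In this made-up case the team scored 10 more and allowed 10 fewer than expected, so dGF is positive and dGA is negative.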



Thursday, 21 May 2015

NHL Expected Goals Model


Did anyone ever consider shot quality? 

UPDATE: This model has since been improved upon, and the updated version is shown here. This post still provides good background on the basics of the model.

Shot quality and possession metrics have always been somewhat of a point of contention. Expected Goals (ExpG) helps to combine these two facets in hopes of providing better information about the game. Expected Goals are not a novel concept: a version has been presented previously by Brian Macdonald for hockey, and the original motivation for my study was Michael Caley's soccer version. I hope to lay out my ExpG model in a way that makes hockey sense, where everyone can understand why each factor was added into the model. The model works by assigning a value to each shot taken over the course of a season, based on the model's predicted probability of that shot resulting in a goal. To calculate a team's final ExpG, all you have to do is sum up all of these probabilities. First I will break down the methodology that goes into this model. If you don't care and just want to see the results, skip down to the Expected Goals section or check out the Expected Goals tab above.
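
To make the summing step concrete, here is a minimal Python sketch. The team labels and per-shot probabilities are made up for illustration and do not come from the actual data.

```python
from collections import defaultdict


def team_expected_goals(shots):
    """Sum per-shot goal probabilities into one ExpG total per team."""
    totals = defaultdict(float)
    for team, goal_probability in shots:
        totals[team] += goal_probability
    return dict(totals)


# Hypothetical shots: (team, model probability that the shot becomes a goal).
shots = [
    ("TOR", 0.25),
    ("TOR", 0.5),
    ("MTL", 0.125),
    ("TOR", 0.125),
    ("MTL", 0.25),
]

exp_g = team_expected_goals(shots)
```

Here the three TOR shots add up to 0.875 expected goals and the two MTL shots to 0.375.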

Methodology


My model uses a logistic regression to arrive at each goal probability. Basically, it uses a set of independent variables to produce the odds of a binary outcome occurring: in our case, either a goal was scored or it wasn't. I reran the logistic regression for each season instead of using one big logistic regression. This helps to account for minor changes in the style of league play, although the regression coefficients didn't actually change much year-to-year. So far my model only accounts for 5-on-5 situations. Here are the factors taken into account by the model:
  • Adjusted Distance
    • The farther out a shot is taken, the lower the likelihood that it results in a goal
  • Type of Shot
    • Snap/Slap/Backhand/Wraparound/etc...
  • Rebound - Yes/No?
    • A rebound is defined as a shot taking place less than 4
  • Score Situation
    • Up a goal/down a goal/tied/etc…
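
The way a logistic regression turns these factors into a goal probability can be sketched in a few lines of Python. Every coefficient below is a placeholder chosen for illustration, not a fitted value from the model; the "Wrist" category, the three score states, and the choice of feet as the distance unit are also assumptions. The only property carried over from the list above is that the distance coefficient is negative.

```python
import math

# Placeholder coefficients (hypothetical, not the fitted values).
INTERCEPT = -1.5
COEF_DISTANCE = -0.05  # per unit of adjusted distance; negative by design
COEF_SHOT_TYPE = {"Wrist": 0.0, "Snap": 0.15, "Slap": 0.1, "Backhand": -0.1}
COEF_REBOUND = 0.8     # assumed positive for the example
COEF_SCORE_STATE = {"tied": 0.0, "leading": 0.1, "trailing": -0.05}


def goal_probability(distance, shot_type, is_rebound, score_state):
    """Combine the factors into log-odds, then map them to a probability."""
    log_odds = (
        INTERCEPT
        + COEF_DISTANCE * distance
        + COEF_SHOT_TYPE[shot_type]
        + COEF_REBOUND * (1.0 if is_rebound else 0.0)
        + COEF_SCORE_STATE[score_state]
    )
    # Logistic (sigmoid) function: always strictly between 0 and 1.
    return 1.0 / (1.0 + math.exp(-log_odds))


p_close = goal_probability(10, "Wrist", False, "tied")
p_far = goal_probability(50, "Wrist", False, "tied")
p_close_rebound = goal_probability(10, "Wrist", True, "tied")
```

With a negative distance coefficient, the shot from 50 gets a lower probability than the otherwise identical shot from 10.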


Results


In the two graphs below you can see how well ExpG, both offensively and defensively, correlates with actual results. Each point represents one team from one season, except 2012-2013 was removed due to the lockout. 



There will always be some outliers in a given season but I think the model does a relatively good job. The chart below shows that ExpG comes out on top when compared to Corsi and Scoring Chances in terms of correlation to real goals for and against in a given season.


Metric            Goals For   Goals Against
ExpG              0.58        0.6
Corsi             0.493       0.57
Scoring Chances   0.53        0.562
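
For reference, a correlation between predicted and actual team-season totals could be computed along these lines. This sketch assumes the figures are Pearson correlation coefficients, which the post does not state, and the four team-season values are invented.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented values: one entry per team-season.
exp_gf = [150.0, 160.0, 170.0, 180.0]
actual_gf = [148, 165, 166, 185]

r = pearson_r(exp_gf, actual_gf)
```

The same function would be applied separately to goals for and goals against, and to each metric being compared.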

Future Work


In the coming weeks I will be focusing my efforts on two different aspects of this model. Firstly, I will investigate how well it predicts future goals from one season to the next, as well as doing something similar to what Micah Blake McCurdy did with score-adjusted Corsi. Secondly, I will be looking at other factors to add into the model. I plan on adding rush shots as a factor, though the current state of my data will require some tweaking before I can do that. I also plan on exploring the effects of incorporating shooter talent and goaltender talent. Finally, I plan on releasing ExpG at the player level and using aspects of this model to improve xSV%.

Expected Goals


I just wanted to thank War-On-Ice and Sam Ventura for the data used in this project. Finally, here are the results below. Note that dGF, dGA, and dGF% are calculated as actual minus expected. I will give this spreadsheet its own tab at the top of this site too. Please let me know any questions or feedback you might have. Enjoy!