How do you classify hockey players? Many would go by the classical six positions (C, LW, RW, LD, RD, G), while some would argue for a rover (see picture above). I suggest a different distinction. Obviously, goalies are their own category, so they're excluded from this analysis. That leaves players, which I will further break down into forwards and defence. Forwards and defence tend to have very distinct roles, with a few exceptions (D. Byfuglien and B. Burns). In this post I am going to focus on forwards. It isn't easy to decide just which position a forward plays. Don't bother asking the PHWA (see the Ovechkin debacle), because they obviously can't tell. NHL.com is no help either, since many of their positional declarations are hilariously out of place (e.g. Zetterberg is listed as a LW despite taking over 1000 face-offs last season, which places him 48th in the entire NHL). Then there is the issue of 1st/2nd/3rd/4th line. These roles are usually overstated by most media types, and then there is a designation problem: if a player performs like a 1st liner but his coach plays him on the 2nd line, what really is he? I know, deep stuff. Long story short, breaking players down into categories is easier said than done.
K-Means Clustering
Therefore, I set out on a fun exercise to reclassify forwards based on their playing characteristics. I used k-means clustering to break the players down into 8 categories based on these characteristics. I want to stress that these measurements are meant to reflect a player's playing style, not how well or poorly they performed. The chart below shows the average of each measurement broken down by cluster. I arbitrarily named the clusters myself, so you shouldn't read too much into those names. Come up with your own if you want. (You should click on that picture if you want to look at the cluster characteristics more carefully.)
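For anyone who wants to try something similar, here is a minimal sketch of k-means in plain Python. It is not the code behind this post. The two features (shots and hits per 60 minutes), the six made-up rows and k=2 are assumptions for illustration only; the real exercise used 8 clusters. Standardizing the columns first is also my assumption, though it is a common step so that no single stat dominates the distance.

```python
import random
from statistics import mean, pstdev

def standardize(rows):
    """Scale each column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm: assign to nearest centroid, then recompute centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random starting centroids
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [mean(col) for col in zip(*members)]
    return labels, centroids

# Invented rows, NOT real player data: (shots per 60, hits per 60).
raw = [
    (10, 2), (11, 3), (9, 2),    # shoot-first profiles
    (4, 12), (5, 11), (3, 13),   # hit-first profiles
]
labels, centroids = kmeans(standardize(raw), k=2)
print(labels)
```

The cluster numbers themselves carry no meaning; only which rows end up sharing a number matters, which is why any names attached to clusters are a judgment call.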
Here is a random sample of ten players and which cluster they belong to. Please don't get too upset if you don't like a certain player's cluster. Remember these clusters group players by "playing style" not skill level.
Below are some box-and-whisker plots which break down the clusters by Corsi%, Age, TOI/GM and AAV. Here is a quick run-through of how to read a box-and-whisker plot:
- The big solid line running across the whole graph is the mean value for the whole sample. Example: The mean Corsi% for all forwards is 50%.
- Within each box is another solid line that marks the median value for that cluster (the middle value of that cluster, not the mean).
- The box itself encompasses the values between the lower and upper quartiles, from the 25th to the 75th percentile.
- The whiskers mark the top and bottom 25%, excluding outliers.
- Dots denote outliers.
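To make those pieces concrete, here is a small sketch that computes the numbers behind one box from a list of values. The Corsi% values are invented, and the 1.5 x IQR rule for flagging outliers is an assumption: it is the most common convention, but the plots in this post may have been drawn with a different rule.

```python
from statistics import quantiles

def box_stats(values, whisker=1.5):
    """Quartiles, whisker ends and outliers for a single box plot."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumed convention: points beyond 1.5 * IQR from the box are outliers.
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": q2,
        "q3": q3,
        "whisker_low": min(inside),    # lowest value that is not an outlier
        "whisker_high": max(inside),   # highest value that is not an outlier
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }

# Invented Corsi% values for one hypothetical cluster.
corsi = [44, 47, 48, 49, 50, 51, 52, 53, 66]
stats = box_stats(corsi)
print(stats)
```

In this toy list the 66 sits far above everyone else, so it would be drawn as a dot rather than stretching the top whisker.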
- All-Around seems like the group a player would want to be in, but it still encompasses a range of players from Sidney Crosby to Manny Malhotra.
- "Safe" Depth is labelled as such because it is populated by lower-end players (see AAV), yet their Corsi% compares favourably with that of the Depth cluster.
- High Impact players do a bit of everything, including things that don't involve scoring, e.g. they draw more penalties than they take and dish out more hits than they receive.
- Power Forwards are big guys (yes, I subjectively looked at the cluster of players for 30 seconds and thought I saw a bunch of perceived power forwards) who prefer to pass more than they shoot but also take more penalties than they draw, probably due to a lack of foot speed.
- Depth players are few in number (only 13) and dish out hits like crazy, yet clearly trail in the Corsi%, TOI/GM and AAV categories.
- Passers create a lot of opportunities for their teammates and are wizards at taking the puck away from their opponents more often than they give it up.
- Depth Scorers are typically young players who have been held down in the lineup by their coach yet can really shoot the lights out.
- Scoring Wingers are very similar to Depth Scorers yet have been given a larger playing opportunity.
- I would love to do the same exercise for defencemen, but there doesn't seem to be enough distinction between them using my current attribute metrics. Maybe I will discover a better way to classify them in the future, who knows.
Please let me know if you have any thoughts or questions. You can comment below or reach me via email here: DTMAboutHeart@gmail.com or via Twitter here: @DTMAboutHeart