Not Just for Nerds: A Beginner's Guide to Hockey Analytics
Every hockey fan cares about statistics. For years, goals, save percentage, hits, and more have been on the tip of every hockey fan's tongue. They're a huge part of how we evaluate players, an easy way to determine a player's strengths...and weaknesses.
Over the past decade, especially since the 2013 lockout, "basic" statistics such as assists and wins have perhaps been leapfrogged by a series of "advanced" statistics. Generally, these categories attempt to measure a player's impact, positively or negatively, on the play of his team. That may sound basic, but it's easy for a casual hockey fan to look at a stat line and wonder what a "Corsi" is, or why they should care which zone a player starts their shifts in.
On the surface, hockey analytics are daunting if you have little to no experience with advanced stats. There are so many of them, and so many numbers and percentages associated with them, that it can be hard to remember what's what. That's why I'm putting this post together.
In some of my articles, I've given a general reference to some of these stats, usually with a brief description of what they mean. But with the 2019-20 season on the horizon, I want to use analytics more consistently during my game recaps or when analyzing a player's performance. To make sure we are all on the same page, I'm putting this piece together - a guide to some of the most common advanced metrics in hockey, what they measure, and what a "good" number is for each stat. This will stay pinned on our Twitter (@BackOTNet) so it remains accessible over the course of the season.
Corsi
This is probably the most common advanced stat in all of hockey. All that Corsi measures is shot attempt differential. This means all shot attempts, whether they go in, are stopped by the goaltender, are blocked, hit the post, or miss the net entirely, are factored in. Here's an example:
In a hypothetical game, the Philadelphia Flyers have 30 shot attempts. The Pittsburgh Penguins, their opponent, have 20 shot attempts. To find the Flyers' Corsi, divide their shot attempts by the total shot attempts for both teams. In this case, that's 30 Flyers shot attempts divided by 50 shot attempts (30 for the Flyers + 20 for the Penguins). That comes out to .6, or 60%, which is the Flyers' Corsi For% for that game. 50% is average, and anything at 55% or better is elite, so obviously the Flyers played very well in this hypothetical contest.
This can be applied to individual skaters, too. For example, in this same hypothetical game, Claude Giroux is on the ice (he doesn't necessarily have to take the shot attempt himself) for three Flyers shot attempts and three Penguins shot attempts. His Corsi For% (often shortened to just "Corsi") for that game would be 50%.
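For anyone who likes to see the arithmetic spelled out, here is a minimal Python sketch of the Corsi For% calculation using the numbers from the example above. The function name corsi_for_pct is made up for illustration and does not come from any stats site.

```python
def corsi_for_pct(attempts_for: int, attempts_against: int) -> float:
    """Share of all shot attempts that belong to one side, as a percentage."""
    total = attempts_for + attempts_against
    if total == 0:
        raise ValueError("no shot attempts recorded")
    return 100 * attempts_for / total


# Team level: 30 Flyers attempts, 20 Penguins attempts
flyers_cf = corsi_for_pct(30, 20)   # 60.0

# Player level: attempts for and against while Giroux is on the ice
giroux_cf = corsi_for_pct(3, 3)     # 50.0

print(f"Flyers Corsi For%: {flyers_cf:.1f}")
print(f"Giroux Corsi For%: {giroux_cf:.1f}")
```

The same function works for a team or a single skater. The only thing that changes is which shot attempts you count.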
An important component of many advanced stats is how a player compares relative to his teammates. Relative to teammate metrics are helpful when trying to identify good players on bad teams or bad players on good teams. To find a player's Corsi For% relative to his teammates (shortened to Corsi For% RelTM), subtract his team's Corsi For% from the individual player's Corsi For%. If the number is positive, it means the player is helping his team win the shot differential battle, because a larger percentage of the shots belong to his team when he is on the ice compared to being off it. And of course, the inverse applies as well.
Using the same numbers from our earlier example, Claude Giroux's Corsi For% RelTM would be -10% (50% minus 60%), which is really bad. In other words, out of every 10 shot attempts, the Flyers average one more when Giroux is off the ice than when he is on it. That's a pretty drastic difference, bigger than most single-game numbers you'll see for this stat (especially for an elite player like Giroux, who would rarely if ever put up a number that poor).
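Here is a small sketch of that subtraction in Python, using the simplified version described in this post (player's Corsi For% minus the team's). The function name is hypothetical, and public stats sites may calculate their relative metrics a little differently.

```python
def corsi_rel_tm(player_cf_pct: float, team_cf_pct: float) -> float:
    """Player's Corsi For% minus his team's, in percentage points.

    Positive means the team gets a bigger share of shot attempts
    with the player on the ice.
    """
    return player_cf_pct - team_cf_pct


# Numbers from the hypothetical Flyers-Penguins game
giroux_rel = corsi_rel_tm(player_cf_pct=50.0, team_cf_pct=60.0)
print(f"Giroux Corsi For% RelTM: {giroux_rel:+.1f}")  # -10.0
```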
Here's a simple chart to look at to determine how good a player's Corsi is, per one of the best advanced stats gurus out there, The Athletic's Charlie O'Connor:
Corsi and xG (which I'll get to next)
Very good: 52% and above
Solid: 50-52%
Weak: 48-50%
Poor: Below 48%

Corsi and xG Relative to Teammates (Rel for short)
Awesome: +5.0% or better
Very Good: +3.0% to +5.0%
Good: +1.0% to +3.0%
Passable: -1.0% to +1.0%
Unimpressive: -3.0% to -1.0%
Bad: -5.0% to -3.0%
Yikes: worse than -5.0%
Expected Goals (xG)
I'm sure you've heard the phrase "quality over quantity" at some point before. While controlling the flow of the game and winning the shot attempt battle are nice, is it worth it to outshoot an opponent 30-10 if your 30 shots are floaters from the point and their 10 are breakaways (or other dangerous chances)?
Since you're never going to see a game with a difference in shot quality as drastic as I just described, the answer is probably "to a point." This is where the statistic expected goals comes into play - and yes, it is exactly what it sounds like. Expected goals uses several factors, including shot location, to determine the likelihood of a shot going into the net. Each shot is given a value (the higher the value, the more likely the shot is to become a goal), and those values are added up into one number.
Like Corsi, expected goals can be expressed as a percentage. It's common to see "the Flyers were expected to outscore the Penguins 3.75 - 2.25" (yes, I know it's not possible to score a fraction of a goal, just stay with me here) or "the Flyers had 62.5% of the expected goals in their last game" (3.75/6, which comes from adding 3.75 and 2.25).
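The percentage version works the same way as Corsi For%, just with expected goals totals in place of shot attempts. Here is a short sketch in Python. The function name is made up, and the totals are the hypothetical ones from the sentence above. In practice those totals would come from whichever xG model you're looking at.

```python
def xg_share_pct(xg_for: float, xg_against: float) -> float:
    """One team's share of the total expected goals, as a percentage."""
    total = xg_for + xg_against
    if total == 0:
        raise ValueError("no expected goals recorded")
    return 100 * xg_for / total


# Flyers expected to outscore the Penguins 3.75 - 2.25
flyers_xg_pct = xg_share_pct(3.75, 2.25)
print(f"Flyers xG share: {flyers_xg_pct:.1f}%")  # 62.5%
```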
The easiest question to ask is how a particular shot gets its value, and the answer to that is unfortunately "I don't know." Only the people who put together models, who are all way smarter than I am, could answer that for you (I'll list a bunch of advanced stats resources at the end of this article). There are also other factors than the location of a shot that can make it dangerous - passes leading to the shot, traffic in front of the net, the quality of the shooter, a change of direction from the initial shot. Most models account for factors other than shot location, but each one is a little bit different so you might see a small discrepancy depending on where you look.
While some of these factors are accounted for better than others, the bottom line is that expected goals is a bit of an imperfect stat. However, because what it measures is so important, and because most models do an effective (though perhaps not perfect) job at accounting for factors beyond shot location to determine quality, it's still a very popular stat and is often listed side-by-side with Corsi. The goal of hockey is to score goals, which is done by creating as many shot attempts as possible from as dangerous a location as possible. Corsi and expected goals (xG), respectively, tell those narratives effectively.
Goals, Assists, Points, etc. Per 60
One of the unique things about hockey is how ice time is allotted. Even the very best players in the league only play 20-25 minutes in a regular season game, less than 50% of the time. Per 60 metrics are probably best described as a way to measure how efficiently a player produces a given stat with the ice time he gets.
Take two hypothetical players, for example. Player A scores 41 points in one season, averaging 20 minutes of ice time. Player B scores the same 41 points, but only receives an average of 15 minutes of ice time. Both play all 82 games. Here's how to find their points per 60 (you can do the same for goals, assists, or basically any base statistic, too):
Player A: (41/82) x (60/20) = 0.5 x 3 = 1.5 points per 60
Player B: (41/82) x (60/15) = 0.5 x 4 = 2 points per 60
If you're really smart, you've probably realized that Player C could score fewer points than Player D but have a higher points per 60 if there was a large enough discrepancy in ice time. The higher a player's points per 60, the more efficient a scorer the player is. This is a great stat if you're into fantasy hockey (or just want to make smart predictions to your friends) to find diamonds in the rough. Players who already have a good points per 60 and are slated to get more ice time - like the Sharks' Kevin Labanc - are excellent breakout candidates.
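If you'd rather let a computer do the division, here is a quick Python sketch of the per 60 calculation for Player A and Player B. The function name and parameters are my own for illustration. It works from average ice time per game, the same way the example does.

```python
def per_60(stat_total: int, games_played: int, avg_toi_minutes: float) -> float:
    """Rate of a counting stat (points, goals, assists...) per 60 minutes of ice time."""
    total_minutes = games_played * avg_toi_minutes
    if total_minutes == 0:
        raise ValueError("player has no ice time")
    return 60 * stat_total / total_minutes


player_a = per_60(stat_total=41, games_played=82, avg_toi_minutes=20)  # 1.5
player_b = per_60(stat_total=41, games_played=82, avg_toi_minutes=15)  # 2.0

print(f"Player A: {player_a:.2f} points per 60")
print(f"Player B: {player_b:.2f} points per 60")
```

Swap in goals or assists for stat_total and the same function gives goals per 60 or assists per 60.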
Zone Starts
If points per 60 measures the quantity of opportunity given to a player (relative to their scoring efficiency), zone starts measure the quality of chances a coach gives a player. The best way to have a good Corsi and xG is to spend a lot of time in the offensive zone. In order to do that, you usually need to either make a nifty entry into the offensive zone to retain possession or apply a strong forecheck to win back possession. Then you need to find a shooting lane and actually take the shot. Sounds like a lot of work, right?
Or you could let a teammate do all of that, and when the goalie makes a save and covers the puck, you jump over the boards and blast a shot right off a face-off win. It's a lot easier to generate shots if you're starting a lot of shifts in the offensive zone, because the hard work of getting into the zone has already been taken care of. It's a good "well, actually" stat to use - a player with amazing Corsi and/or xG should maybe be taken with a grain of salt if they have a ton of o-zone starts, or given a pass if their numbers are a bit below average but they're starting most of their shifts in the defensive zone.
Controlled Zone Entries For and Exits For/Against
With puck possession viewed as more important than ever, entering and exiting the offensive and defensive zones cleanly matters just as much. Maintaining possession while crossing either blue line is important to limit your opponent's attack time and increase your own. Though the dump and chase remains a popular and somewhat effective strategy, entering the offensive zone while maintaining possession of the puck is paramount. It eliminates the need to forecheck and force the defense into a mistake, skipping right to the end goal of the dump and chase by controlling the puck in a dangerous area by default. This stat is more important for forwards, since they're the ones usually tasked with bringing the puck into the offensive zone.
In the other zone, puck-moving defensemen have become much more popular over the past decade. Though sometimes a defenseman will be forced to simply fling the puck out of the defensive zone, relying on that is a poor strategy. Almost every single time, it leads to the opposition regaining possession in the neutral zone, allowing them to more easily transition back into your zone. And when the other team has the puck, defensemen will ideally step up on the puck carrier, forcing them to either dump the puck in or attempt a (hopefully difficult) pass. Zone entries against and zone exits are primarily used to analyze defensemen, since they're the ones who break the puck out and the first players back to defend. Like Corsi and xG, both are commonly expressed as a percentage.
PDO
Luck is always going to play a major role in deciding the outcome of any sports game. Measuring exactly how lucky each team is, though, is easier said than done, at least by the eye test. This is where PDO comes in. To find PDO, simply add a team's shooting percentage and its save percentage.
One hundred is considered average for PDO. If your team's number is below 100, then they're probably getting a little bit unlucky. And on the other hand, a number above 100 suggests your team is getting lucky.
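To make the addition concrete, here is a small Python sketch. The function name and the season totals are invented for illustration. The only thing taken from the definition above is that PDO is shooting percentage plus save percentage.

```python
def pdo(goals_for: int, shots_for: int, goals_against: int, shots_against: int) -> float:
    """Shooting percentage plus save percentage, both on a 0-100 scale."""
    shooting_pct = 100 * goals_for / shots_for
    save_pct = 100 * (shots_against - goals_against) / shots_against
    return shooting_pct + save_pct


# Hypothetical team: 250 goals on 2500 shots (10.0% shooting),
# 225 goals allowed on 2500 shots against (91.0% save percentage)
team_pdo = pdo(goals_for=250, shots_for=2500, goals_against=225, shots_against=2500)
print(f"PDO: {team_pdo:.1f}")  # 101.0, a bit above the average of 100
```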
Granted, better teams are more likely to have a PDO above 100, because they have more talented skaters and goaltenders who are more likely to perform well compared to an average or below average team. But a team's skill isn't going to sway PDO by more than about one percentage point either way, in my opinion. Many teams in recent memory (2014 Avalanche, 2019 Islanders) have made the playoffs and done well while riding a high PDO. But banking on a high PDO to be sustainable over a long period of time, especially if you don't have an elite goaltender and/or elite goal scorers, is trouble waiting to happen.
Expected Save Percentage/Goals Saved Above Average
I'm grouping them into the same category since both are roughly the inverse of expected goals for. Both take into account the quality and quantity of the shots faced and come up with a save percentage and number of goals allowed for an average goaltender. If a goalie's actual save percentage is higher than his expected save percentage, he's playing better than expected (obviously), and vice versa. The same thing goes for GSAA - the higher in the positives, the better, and the lower in the negatives, the worse.
There are plenty of other advanced stats out there (looking at you, Fenwick), but these are the ones I see myself using the most. One important caveat to note is that most of the time you see these analytics, they will only be addressing 5-on-5 play. The reason for that is the vast majority of a hockey game is played at 5-on-5, so only looking at that even playing field is the best indicator. 5-on-4 numbers for these stats are also common, but should be looked at separately when analyzing the total effectiveness of a player.
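Here is a rough Python sketch of how the two goalie numbers relate, following the description above. It assumes you already have the number of goals an average goaltender would be expected to allow on the same shots, which in practice comes from a model. The function names and the sample numbers are made up, and real sites may define these stats a little differently.

```python
def save_pct(shots_faced: int, goals_allowed: float) -> float:
    """Save percentage on a 0-100 scale."""
    return 100 * (shots_faced - goals_allowed) / shots_faced


def goals_saved_above_average(expected_goals_allowed: float, goals_allowed: int) -> float:
    """Goals an average goalie would have allowed minus goals actually allowed."""
    return expected_goals_allowed - goals_allowed


# Hypothetical goalie: 1000 shots faced, 80 goals allowed,
# while an average goalie would be expected to allow 90 on those shots
actual_sv = save_pct(1000, 80)      # 92.0
expected_sv = save_pct(1000, 90.0)  # 91.0
gsaa = goals_saved_above_average(90.0, 80)  # 10.0

print(f"Actual save %: {actual_sv:.1f}, expected save %: {expected_sv:.1f}")
print(f"Goals saved above average: {gsaa:+.1f}")
```

In this made-up case the goalie's actual save percentage is higher than his expected one and his GSAA is positive, so both numbers say he's outperforming an average goaltender.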
Advanced Stats Resources
Below are some of the best websites and Twitter accounts to check out if you're interested in learning more. Feel free to reach out to me (or any of the people listed below) for further clarification.
Phancystats.com (specifically for the Lehigh Valley Phantoms)
Charlie O'Connor: @charlieo_conn
Dom Luszczyszyn: @domluszczyszyn
@BackOTNet/@_AndrewMcG (That's me!)
I also recommend checking out this article by Charlie O'Connor, which goes over many of the same stats and does a good job at explaining them if you're still confused or want more info (the aforementioned table is in the comments of this article):