This feature today is not for everybody. You have to be a stats fan for this one to be fun to read. Last year, we were asked to explain some of the advanced basketball metrics used today. Then, a couple weeks ago, we were asked again what certain metrics were. So, this will be an attempt to explain the basic advanced metrics and then to some degree how one might use this data to determine an approximate point spread difference.
If you are not familiar with advanced metrics in sports, it all started with baseball many decades ago. The legendary general manager of the then Brooklyn Dodgers, Branch Rickey, was always many years ahead of his contemporaries. He had basically created the farm system for Minor League baseball in the late 1920s, and he opened the game to African Americans and then created a pipeline in Latin America for the Dodgers to tap. Around the same time, Rickey was looking for a statistical advantage in evaluating baseball players, using mathematics to find hidden gems of talent that the competition might have overlooked. This was 50 years before Moneyball. Rickey aligned with a mathematics genius by the name of Allan Roth, who had previously tried to show some of his ideas to other baseball owners, none of whom had any interest. Rickey was more than interested, and he hired Roth to work for the Dodgers at about the same time that Jackie Robinson debuted in Brooklyn.
Roth was one of the first baseball statisticians to realize that RBIs were basically worthless as a stat. For two decades, he worked for the Dodgers charting where players hit balls, what pitches they hit, and who fielded or did not field balls. This kind of charting would not become the norm for another 40 years.
Many others presented statistical data for baseball through the 1960s, 1970s, and 1980s. Some of these math experts wrote books, such as Earnshaw Cook and his great work, Percentage Baseball. It is this book, which I read many years ago, that roped me into the world of advanced baseball statistics.
A lot of you reading this know who Billy Beane is and what “Moneyball” is. Let me clue you in on something. This was not the first big leap into computer-generated advanced baseball statistics. It wasn’t even the first attempt by the Oakland Athletics. Beane’s predecessor, Sandy Alderson, brought the computer into baseball in a big way, but you could argue that Earl Weaver with the Baltimore Orioles already had a basic, no-frills database of his batters’ and pitchers’ successes and failures against the pitchers and hitters throughout the American League.
About the time that Moneyball came out in book form, other mathematics experts began looking at different sports. Computer specialists had come up with somewhat successful algorithms to pick winners against the point spread in football, and one or two became quite wealthy until the state of Nevada banned their wagering for life.
In the late 1990’s, the NBA began looking for ways to take advantage of the numbers to maximize talent. Was it worth it to shoot the 3-pointer? Was it better to have a strong rebounding team that maybe didn’t shoot as well than a weaker rebounding team that shot better? What was the best number of minutes to play your star players, and the best number of minutes to play your second team players? Could statistics show enough consistency to partially answer these questions?
Of course, the questions can never fully be answered. Until computers can read the minds of humans, they can never determine if Stephen Curry may have strained his right shoulder lifting his amazing daughter up in the air earlier that day. The computer cannot determine if the star player had a little too much pizza the night before and didn’t get a good night’s sleep. There is missing data that will be discovered in the future because basketball analytics are far behind baseball in the evolutionary process.
Basketball is starting to catch up now that very expensive tracking software exists in NBA arenas. Multiple cameras placed in the rafters feed into a computer that can show teams where all 10 players were on the floor for each 1/100 of a second of the game. If the power forward was beaten for an easy jumper when the shooter came off a baseline screen, the computer records this. Within a few years, the game will become every bit as scientific as baseball, and you will see more Caltech and MIT grads working in front offices.
By now, you must realize that trying to explain all the advanced basketball metrics would be terribly boring and very difficult to do. I admit that I am not the authority on basketball metrics, but then I get paid for baseball analytics and not basketball analytics.
Here is a brief look at some of the advanced stats for basketball. If you are interested, you should be able to set up these formulas on a spreadsheet and then plug your team’s stats in and have some of the more popular advanced stats for the team you follow. You can even use these stats for lower levels (high school, middle school, youth league), but the formulas must be altered by an amount I cannot give you. There is a difference between NBA and college formulas, and there will be differences as you go down in experience. Some of it has to do with how many fouls it takes to put a team in the bonus and what that bonus is.
Let’s Begin
I must start with the most basic of advanced statistics. This first set of stats will give you a lot more than the basic statistics. They are called “The Four Factors,” but they are used for both a team’s offense and a team’s defense, so it is really eight factors.
Credit here must be given to the very brilliant Caltech statistician Dean Oliver, who wrote the number one book on basketball statistics, Basketball on Paper. It is required reading if this is your field of interest. Oliver capitalized on his data and sold it and himself to a handful of NBA teams, but the basketball media wasn’t ready for his ideas.
They scrutinized every move made through his recommendations, forcing the NBA teams to give in to their fans that bought into the media’s opinions. Of course, many of these media hacks cannot balance their own checkbooks, so their scrutiny comes without credibility. I say this because I was once a media hack in a top 30 market who believed a lot of the preconceived misconceptions of sports.
The “Four Factors” (again eight factors since this is figured for the offense and the defense) are:
1. Effective Field Goal Percentage
2. Turnover Rate
3. Offensive Rebounding Rate
4. Free Throw Rate.
While these factors are still quite valid, they have been surpassed somewhat by more advanced data. For example, True Shooting Percentage is more detailed than eFG%.
Here are the easy calculations for the Four Factors.
1. Effective Field Goal Percentage
This stat combines three-point shooting and two-point shooting into one stat. A made three-pointer is worth 50% more than a made two-pointer. So, if you make 1/3 of your three-pointers, it is the same as making 50% of your two-pointers.
The formula for eFG% is: (Field Goals Made + (0.5* 3-pointers Made))/Field Goals Attempted.
Let’s say that Duke takes 58 total shots in a game. They make 26 of these shots, and 8 of them are three-pointers. The calculation would be:
(26 +(0.5*8))/58 which equals .517 or 51.7%.
It works the same for defense. Let’s say in the same game, Duke’s opponent took 57 shots and made 24 with 7 of them three-pointers.
(24+(0.5*7))/57 = .482 or 48.2%.
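If you would rather script this than build a spreadsheet, the eFG% formula can be sketched in a few lines of Python. This is only a minimal illustration; the function name and argument names are made up for this example, and the inputs are the Duke game numbers used above.

```python
def effective_fg_pct(fg_made, three_made, fg_attempted):
    """eFG% = (FGM + 0.5 * 3PM) / FGA, returned as a fraction (0.517 = 51.7%)."""
    if fg_attempted <= 0:
        raise ValueError("fg_attempted must be positive")
    return (fg_made + 0.5 * three_made) / fg_attempted


# Duke's offense and Duke's defense (the opponent's shooting) from the example.
duke_efg = effective_fg_pct(fg_made=26, three_made=8, fg_attempted=58)
opp_efg = effective_fg_pct(fg_made=24, three_made=7, fg_attempted=57)

print(f"Duke eFG%:     {duke_efg:.1%}")  # 51.7%
print(f"Opponent eFG%: {opp_efg:.1%}")   # 48.2%
```

The same function covers both halves of the factor: feed it your team's shooting for offense and the opponent's shooting for defense.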
1A. True Shooting Percentage combines Effective Field Goal Percentage with foul shooting into one combined scoring stat. As you will see with the 4th factor, there is debate over how to use FT Rate properly.
The NBA formula for True Shooting Percentage is: Pts/(2*(FGA+(.44*FTA))). As I mentioned above, the formula for college basketball is a little different, and it has to do with the different free throw rules at the two levels.
For college, it is: Pts/(2*(FGA+(.465*FTA)))
Let’s look at this for an individual. Here are Steph Curry’s Shooting Stats for his last year at Davidson.
Curry scored 974 points in 2008-09. He took 687 shots from the field and 251 foul shots.
974/(2*(687+(.465*251)))= .606 or 60.6% which for a guard is outstanding.
Compare this to Kareem Abdul-Jabbar’s sophomore season at UCLA in 1966-67, his dominant first year on the varsity, after which the NCAA made the mistake of banning the dunk. Jabbar, known then by his birth name of Lew Alcindor, scored 870 points that year with 519 shots from the field and 274 foul shots.
870/(2*(519+(.465*274)))=.673 or 67.3%.
You can see that a dominant post player like Jabbar was worth more in shooting than a top outside shooter like Curry. This is a relative statement, but it is like saying Babe Ruth was worth more as a hitter than Ty Cobb.
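The True Shooting calculation can be sketched the same way. The snippet below is a rough illustration, not an official implementation: the constant and function names are invented for this example, and the .44 and .465 free throw weights are simply the NBA and college values given above.

```python
NBA_FT_WEIGHT = 0.44       # free throw weight used in the NBA formula above
COLLEGE_FT_WEIGHT = 0.465  # free throw weight used in the college formula above


def true_shooting_pct(points, fga, fta, ft_weight=COLLEGE_FT_WEIGHT):
    """TS% = Pts / (2 * (FGA + ft_weight * FTA)), returned as a fraction."""
    return points / (2 * (fga + ft_weight * fta))


# College seasons quoted in the text, so the college weight applies.
curry_ts = true_shooting_pct(points=974, fga=687, fta=251)
alcindor_ts = true_shooting_pct(points=870, fga=519, fta=274)

print(f"Curry 2008-09:    {curry_ts:.1%}")     # 60.6%
print(f"Alcindor 1966-67: {alcindor_ts:.1%}")  # 67.3%
```

For an NBA player, pass `ft_weight=NBA_FT_WEIGHT` instead of relying on the default.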
2. Turnover Rate
This measures the rate at which a team commits a turnover or forces the opponent to commit a turnover. We will stick with team stats for now, because the formulas for individuals are a bit more complex.
The calculation for Turnover Rate is:
TO/(FGA+(.44*FTA)+TO) for NBA, and
TO/(FGA+(.465*FTA)+TO) for College
We will calculate a couple of extremes here. Let’s look at Temple in 1987-88 and Arkansas in 1993-94. Temple’s Coach John Chaney guided the 1987-88 Owls to the regular season number one ranking using an aggressive 2-3 matchup zone defense and a patient offense that valued every offensive possession like gold. Temple did not gamble on offense or defense, as they never attempted to force their offense or try to create turnovers with defensive pressure, preferring to force opponents to shoot poor shots.
Arkansas coach Nolan Richardson guided the Razorbacks to the national title in 1993-94. His teams pressed full court for 40 minutes (40 minutes of Hell) and played up-tempo fast-breaking offense. Arkansas committed more turnovers on offense, but they forced a lot more turnovers than average, and they came up with a lot of steals that led to easy points.
Temple in 1987-88 in 34 games
Offense: 305 Turnovers, 2,050 FGA, 704 FTA
Defense: 423 Turnovers, 1981 FGA, 513 FTA
Offensive TO Rate: 305/(2,050+(.465*704)+305) = .114 or 11.4%
Defensive TO Rate: 423/(1981+(.465*513)+423) = .160 or 16.0%
Arkansas in 1993-94 in 34 games
Offense: 539 Turnovers, 2,363 FGA, 834 FTA
Defense: 725 Turnovers, 2,234 FGA, 817 FTA
Offensive TO Rate: 539/(2,363+(.465*834)+539) = .164 or 16.4%
Defensive TO Rate: 725/(2,234+(.465*817)+725) = .217 or 21.7%
Which team was better at total turnover differential, Temple in 1988 or Arkansas in 1994? It was basically a wash. Temple played conservative basketball about as well as it can be played, going 32-2 and outscoring opponents by 15+ points per game. Arkansas played havoc basketball and went 31-3, outscoring opponents by almost 18 points per game. Both styles worked.
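Here is one way the Temple and Arkansas turnover rates could be computed in Python. Treat it as a sketch under stated assumptions: the function name, the dictionary layout, and the "edge" figure (turnover rate forced minus turnover rate committed) are illustrative choices for this example, and the college weight of .465 is the one given above.

```python
def turnover_rate(turnovers, fga, fta, ft_weight=0.465):
    """TO / (FGA + ft_weight * FTA + TO), returned as a fraction."""
    return turnovers / (fga + ft_weight * fta + turnovers)


# (turnovers, FGA, FTA) season totals quoted in the text.
seasons = {
    "Temple 1987-88": {"offense": (305, 2050, 704), "defense": (423, 1981, 513)},
    "Arkansas 1993-94": {"offense": (539, 2363, 834), "defense": (725, 2234, 817)},
}

results = {}
for team, sides in seasons.items():
    committed = turnover_rate(*sides["offense"])  # the team's own turnover rate
    forced = turnover_rate(*sides["defense"])     # the opponents' turnover rate
    results[team] = (committed, forced, forced - committed)
    print(f"{team}: commits {committed:.1%}, forces {forced:.1%}, "
          f"edge {forced - committed:+.1%}")
```

With these inputs the edge comes out to roughly 4.6 percentage points for Temple and 5.3 for Arkansas, which is why the two opposite styles grade out so close.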
3. Offensive Rebound Rate (and, of course, Defensive Rebound Rate)
This measures the rate at which a team gets offensive rebounds and the rate at which it keeps its opponents from getting offensive rebounds, which is the same thing as its rate of getting defensive rebounds. Each stat lets the statistician see its opposite at a glance without having to perform a second calculation. If Michigan State gets 36% of the rebounds on their offensive side of the floor, then Michigan State’s opponents will obviously get 64% of the rebounds on their defensive end of the floor.
The calculation for Offensive Rebound Rate is: Off. Reb/(Off. Reb + opponents Def. Reb),
and thus the Defensive Rebound Rate is: Def. Reb/(Def. Reb + opponents Off. Reb)
Coach Tom Izzo has his Michigan State Spartans totally dominating the glass this year. Their rebounding margin of 11 boards per game is giving the Spartans an incredible advantage in games (how much we will see later).
Let’s calculate their Offensive Rebound Rate so far this season:
Michigan State: Offensive Rebounds = 190 Defensive Rebounds = 513
Opponents: Offensive Rebounds = 176 Defensive Rebounds = 337
Michigan State’s Off. Rebound Rate = 190/(190+337) = .361 or 36.1%
Michigan State’s Def. Rebound Rate = 513/(513+176) = .745 or 74.5%
You can also figure total Rebound Rate, which isn’t a Four Factor, but easy enough by taking Michigan State’s percentage of total rebounds. (190+513)/(190+513+337+176) = 57.8%
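The rebounding rates can be sketched in Python as well. The function and variable names below are illustrative, and the snippet assumes the 513 defensive rebounds belong to Michigan State and the 337 to its opponents, which is how the worked calculations use those totals.

```python
def offensive_rebound_rate(off_reb, opp_def_reb):
    """Share of available rebounds a team grabs on its own offensive end."""
    return off_reb / (off_reb + opp_def_reb)


def defensive_rebound_rate(def_reb, opp_off_reb):
    """Share of available rebounds a team grabs on its defensive end."""
    return def_reb / (def_reb + opp_off_reb)


# Season totals from the example (assumed split: MSU 190/513, opponents 176/337).
msu_off, msu_def = 190, 513
opp_off, opp_def = 176, 337

off_rate = offensive_rebound_rate(msu_off, opp_def)
def_rate = defensive_rebound_rate(msu_def, opp_off)
total_rate = (msu_off + msu_def) / (msu_off + msu_def + opp_off + opp_def)

print(f"Offensive rebound rate: {off_rate:.1%}")    # 36.1%
print(f"Defensive rebound rate: {def_rate:.1%}")    # 74.5%
print(f"Total rebound rate:     {total_rate:.1%}")  # 57.8%
```

The opponent's rates fall out for free: their offensive rebound rate is one minus Michigan State's defensive rate, and vice versa.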
4. Free Throw Rate
This is the most controversial of the Four Factors, and there are now multiple theories about how best to calculate this stat. The original formula was simply FTA/FGA. Many metric specialists (including me) believe this is not the best way to calculate free throw rate. For one, this original formula does not account for made free throws. Under it, Shaquille O’Neal would rate just as well as, and maybe better than, Steph Curry, and there is no way you can convince me that Shaq’s free throw rate should be as strong as or stronger than Curry’s.
There is another school of thought, which is the one the PiRate Ratings have adopted, and that is Free Throws Made per 100 possessions. The calculation is a bit more involved since you need the number of possessions, but total possessions is now kept as a stat in college basketball, and there is a formula that accurately approximates possessions.
Our Accepted FT Rate Calculation is: FT Made per 100 possessions.
If you do not have the number of possessions, you calculate it this way:
NBA: FGA + (.44 * FTA) - Off. Rebounds + Turnovers
College: FGA + (.465 * FTA) - Off. Rebounds + Turnovers
An example from a real game: last Sunday’s Michigan vs. Indiana game.
Michigan took 58 shots in the game. They had 16 Free Throw Attempts, 7 offensive rebounds, and an amazing 2 turnovers.
Let’s calculate their possessions: 58 + (.465*16) -7 + 2 = 60.44
In the actual game box score, Michigan had 60 possessions. In other words, this formula is very accurate, and when there is a difference of one possession in the calculation, it usually is because the team that controlled the opening tap also had the last possession of the half.
Michigan made 12 free throws in their 60 possessions, so we now have to normalize this to how many they would have made in 100 possessions, which is quite simple.
12/60*100 = 20.0, so Michigan’s Free Throw Rate in this game was 20.0.
If we use the original formula, Michigan had 16 FTA and 58 FGA for a rate of 16/58 or 27.6%. We feel that this overstates Michigan’s rate here. Because there were just 60 possessions in this game (about as low as a 30-second shot clock game can produce), the rate was inflated.
A third school of thought states the formula as FT Made / FG Attempted, which is a bit more accurate than FTA/FGA, but we still prefer making our rate per 100 possessions.
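To compare the three free throw rate variants side by side, here is a short Python sketch using the Michigan numbers from the game above. The function names are hypothetical, and the possession estimate uses the college weight of .465 from the formula given earlier.

```python
def estimate_possessions(fga, fta, off_reb, turnovers, ft_weight=0.465):
    """Estimated possessions: FGA + ft_weight * FTA - Off. Rebounds + Turnovers."""
    return fga + ft_weight * fta - off_reb + turnovers


def ft_made_per_100(ft_made, possessions):
    """Free throws made, normalized to 100 possessions."""
    return ft_made / possessions * 100


# Michigan vs. Indiana: 58 FGA, 16 FTA, 12 FT made, 7 off. rebounds, 2 turnovers.
est_poss = estimate_possessions(fga=58, fta=16, off_reb=7, turnovers=2)
per_100 = ft_made_per_100(ft_made=12, possessions=60)  # box score listed 60
fta_per_fga = 16 / 58   # the original Four Factors formula
ftm_per_fga = 12 / 58   # the FT Made / FG Attempted variant

print(f"Estimated possessions: {est_poss:.2f}")     # 60.44
print(f"FT made per 100 poss.: {per_100:.1f}")      # 20.0
print(f"FTA/FGA:               {fta_per_fga:.1%}")  # 27.6%
print(f"FTM/FGA:               {ftm_per_fga:.1%}")  # 20.7%
```

When a box score does not list possessions, the estimate can be passed straight into `ft_made_per_100` in place of the official count.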
Putting it all together
So, now you have the four factors. How can we take this data before a game is played and determine an estimated point spread? It is not an exact science.
Let’s return briefly to baseball. In baseball, you have the infamous WAR stat, where players are rated in wins above a replacement player, a replacement player being somebody you can pick up on waivers or call up from AAA. There is no WAR stat at this time for basketball, although many statisticians have tried to calculate one from game stats. The problem is that it is hard to judge defense in basketball compared to judging pitching and fielding in baseball.
So, the answer is to find a way to determine how much weight to place on each of the Four (Eight) Factors to try to determine which team is better.
In the NBA, this calculation is considerably easier than in college, because strength of schedule only marginally differs in pro basketball, as most teams play an equal schedule strength. It can be argued that Golden State’s schedule is easier than Philadelphia’s schedule, because the Warriors won’t play Golden State, while the 76ers don’t benefit from playing Philadelphia, but that becomes negligible as the season progresses.
In college basketball, the Patriot League and the Big Ten are not close to comparable, so Lehigh’s Four Factors’ stats are not equal with Michigan’s Four Factors’ stats.
Originally, Oliver determined that Effective Field Goal Percentage was by far the most important of the Four Factors, and since there are a lot more shots taken in a basketball game than anything else, it goes without saying that this factor should be the most important. If your team can consistently beat its opponents in eFG%, they will win more games than they lose. If your team has an eFG% that is 10% better than the opponents, then your team is playing at a championship level.
Oliver believed that eFG% was about 40% of the success or failure of a team. He stated that turnover rate was worth 25%, offensive rebound rate was worth 20%, and FT Rate was worth the remaining 15%. In back-testing, these numbers approximated success or failure in the NBA.
It took many hours of algorithm testing for the PiRate Ratings to come up with percentages to apply to these factors. In the end, we had to create two more factors to approach legitimate accuracy.
If you have followed this site during basketball season for some time, you have probably heard about our own creation called “R+T Factor.” This is a refined version of the rebounding rate and turnover rate, and it points to the probable reason why Oliver gave a bit more weight to turnovers than rebounds. The key is to separate turnovers into steals and everything else. A steal in basketball is worth more than a rebound. When a team steals the ball, its chances of getting an easy basket and/or drawing a foul are much higher than when it obtains a rebound. After working with the formula for a few years, we finally came up with one we like.
Our R+T rating is: (R*2) + (S*.5) + (6-Opp S) + T, where
R= Rebound Margin
S= Average Steals Per Game
T= Turnover Margin
In 2017, one NCAA Team had a rebound margin of 12.3 per game. They had a turnover margin of 1.8 per game (which means that they committed 1.8 fewer turnovers per game than their opponents), averaged 7.1 steals per game, and opponents averaged 6.2 steals per game.
This team’s R+T Rating was: (12.3*2) + (7.1*0.5) + (6-6.2) + 1.8 = 29.75
This team played in one of the top power conferences in the NCAA, and their rating of 29.75 was the best among the power conference teams. When a power conference team has an R+T rating over 10, they are Sweet 16 caliber. At 15, they are Final Four caliber. So, it can be deduced that this team did fairly well in the 2017 tournament.
This team was national champion North Carolina.
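The R+T formula is easy to put into a small function. The sketch below is a minimal Python version; the function and argument names are illustrative and are not taken from the PiRate Ratings’ own code. Evaluated exactly as the formula is written, with the rebound margin doubled, the 2017 inputs come to 29.75.

```python
def r_plus_t(rebound_margin, steals, opp_steals, turnover_margin):
    """R+T = (R * 2) + (S * 0.5) + (6 - Opp S) + T, all on a per-game basis."""
    return (rebound_margin * 2) + (steals * 0.5) + (6 - opp_steals) + turnover_margin


# Per-game figures quoted for the 2017 example team.
rating = r_plus_t(rebound_margin=12.3, steals=7.1, opp_steals=6.2, turnover_margin=1.8)
print(f"R+T rating: {rating:.2f}")  # 29.75
```

Because every input is a per-game average, the same function works at any point in a season as long as the averages are current.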
This R+T stat tries to estimate the number of extra scoring opportunities a team gets in a game. The stat is much more valuable in the NCAA Tournament where there are 25-30 really strong teams playing. When the pressure is on, many times these extra opportunities decide the outcomes. While effective field goal percentage is still the number one variable, the R+T rating becomes more and more valuable as the tournament progresses. By the Sweet 16, the teams with the best R+T rating usually continue to advance, and in many years, the team with the number one R+T rating weighted by schedule strength wins the National Championship. In every season in the 21st Century, the champion has been among the nation’s leaders in R+T factor weighted against schedule strength.
The obvious second added factor in predicting basketball games is schedule strength. If a team in the Ivy League outscores its opposition by 10 points per game, they are not as good as a team from the ACC outscoring opponents by 10 points per game.
At the start of conference play, one SEC team may have played a non-conference schedule that on average is 10 points weaker per game than another team. Kentucky usually plays a much harder pre-conference schedule than Vanderbilt or Ole Miss. Tennessee has played a more difficult schedule than Missouri.
Once conferences have played more than half of their league schedules, you can even calculate ratings based only on conference games played and then take those ratings and rank the conferences overall to get a more accurate rating for every team.
For example, let’s say that on February 20 with 80% of the Big 12 conference games in the books, Texas Tech is 1 point better than Kansas, 3 points better than Iowa State, and so on down to Oklahoma State being 14 points weaker than Texas Tech. Let’s say that Stephen F. Austin is 3 points better than Abilene Christian in the Southland Conference and 5 points better than Sam Houston. Overall, the Big 12 is calculated to be 17 points better on average than the Southland conference, so Texas Tech would be 17 points better than SF Austin, 20 points better than Abilene Christian, and 22 points better than Sam Houston. Oklahoma State would then be 8 points better than Sam Houston, since they are 14 points weaker than Texas Tech.
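The chaining in this example can be written out as a small lookup. This is only a sketch of the arithmetic in the paragraph above, not the actual PiRate algorithm: the dictionaries, the function name, and the choice to apply the 17-point conference gap between the two top-rated teams (as the example does) are assumptions made for illustration.

```python
# Points each team trails its own conference's top-rated team (0 = top team).
big12 = {"Texas Tech": 0, "Kansas": 1, "Iowa State": 3, "Oklahoma State": 14}
southland = {"Stephen F. Austin": 0, "Abilene Christian": 3, "Sam Houston": 5}

CONFERENCE_GAP = 17  # Big 12 over Southland, in points, from the example


def cross_conference_margin(strong_team, weak_team, gap=CONFERENCE_GAP):
    """Points by which a Big 12 team rates ahead of a Southland team."""
    return gap - big12[strong_team] + southland[weak_team]


for a, b in [("Texas Tech", "Stephen F. Austin"),
             ("Texas Tech", "Abilene Christian"),
             ("Texas Tech", "Sam Houston"),
             ("Oklahoma State", "Sam Houston")]:
    print(f"{a} over {b}: {cross_conference_margin(a, b)} points")
```

The four printed margins are 17, 20, 22, and 8 points, matching the figures in the example.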
We don’t actually figure the ratings this way, but we have an algorithm that does a similar calculation for every team based on their overall strength of schedule for the season. It is a close cousin of, but goes more in-depth than, the Quadrant system used by the NCAA Selection Committee and by our Bracketology experts when they pick their weekly selections. By the way, you can now see our PiRate Rating Bracketology at the Bracket Matrix, at http://www.bracketmatrix.com/ Our abbreviation there is “Pi.”
There are many additional advanced analytical basketball ratings. You can break down each of the Four Factors for individual players, and there are ratings that calculate individual offensive and defensive efficiency, as well as the Usage Rate, which tries to estimate how much a player is used in his team’s games by looking at what he does while he is in the game. Some teams’ most efficient players may not have the top usage rates on their teams, while less efficient players get more game usage. Teams can look at these stats, and good coaches can adjust their lineups to get their more efficient players more game time while limiting players that may be harming the team. Then, there are coaches that continue to play the wrong players for too many minutes, while their more efficient players don’t play enough. There is a phrase for the coaches that continually do this: We call it “Soon to be unemployed.”