Questions and Answers
Every year, we receive a couple dozen questions from readers, and we do not always have time to reply to each one. Since June, and especially since August 29, we have received some excellent questions and comments, which we will answer today. For future reference, drop us a line at: pirate_ratings at live dot com. When we have collected enough questions, we will respond with another one of these segments.
1. How do you calculate your ratings? Is this something anybody could do if they had your equations?
Answer: This is a tough one to answer. Our ratings are not 100% mathematical. A formula provides the base, but the data we feed it is not cut and dried. Whereas many other ratings take game scores and schedule strengths and fit them with a least-squares or least-error method, our ratings try to interpret those scores before running the data.
For instance, let us take a game between Oregon and Idaho. The rating may state that Oregon should win by 77 points. If the Ducks lead 42-0 with six minutes left in the second quarter and then coast to a 66-0 win, playing the scout team the entire fourth quarter, should we really state that they performed 11 points below par and penalize them in their next rating? The Ducks could have won 112-0 without emptying the bench. We look at how the score was made and not just the score.
In another instance, let’s say the final score of a game was 28 to 14. There are many different ways to interpret this 14-point win. It could have been 21-14 with seconds remaining and the trailing team knocking at the door to tie it up and force overtime. Let’s say the trailing team threw a pass into the end zone, and the ball went through the receiver’s hands and hit his shoulder pad. The ball went flying through the air. Had it flown left, another receiver would have easily caught it for a touchdown. Instead, it flew right, into the hands of the strong safety, who returned it 106 yards for a touchdown to make the score 28-14 instead of 21-21. The direction of a deflection cannot be counted as 14 points; no one play is worth that much.
What if this 28-14 game was 28-0 with six minutes to go, the scrubs scored a touchdown to cut it to 28-7, and then the leading team’s scrubs fumbled and gave up another touchdown with four minutes to go? The leading team then put its starters back in and drove from its 25 to the opposing 5-yard line before running out the clock. This game could have been 42-0 if not for the reserves, and those reserves will have little impact on a future close game.
2. What are the differences in your three ratings—PiRate, Mean, and Bias?
Answer: Okay, this one can differ depending on the year in question. The formula behind the PiRate Regular ratings stays the same every year; it has not deviated since the advent of the Internet made statistical research so easy.
The Mean and Bias ratings have been tinkered with over the last 10 years. In fact, the Mean rating has changed since 2011. We perform 14 different calculations to start each season. We look at returning lettermen and starters. Each player at a position has a certain value, so that a returning starting left tackle earns the same points for Oregon and Alabama as he does for Georgia State and South Alabama. This data is looked at in many ways. In one system, we may give more emphasis to the quarterback and wide receivers than in another system. Our favorite calculation actually gives more weight to the interior lines than to any of the skill positions.
After we calculate all the ratings, we adjust the previous year’s final rating for each team by the change in personnel entering this year. For the PiRate regular rating, we take the 5 calculations that have always been used. For the Mean rating, we take the 14 calculations and take the average rating. For the Bias rating, we take the original 5 calculations and weight them a little differently. Two of the calculations count 30% each; a third calculation counts 20%; and the other two count 10% each. Thus, the PiRate Regular and Bias ratings will begin the season differing very little.
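The blending above can be sketched in code. This is only an illustration with made-up numbers: the actual 5 and 14 preseason calculations are not published, and the straight average for the Regular rating is our assumption; only the 30/30/20/10/10 Bias weights come from the description above.

```python
# Sketch of the preseason blending described above. The calculation
# values are hypothetical, and averaging the 5 calculations for the
# Regular rating is an assumption; the Bias weights follow the text.

def pirate_regular(five_calcs):
    """PiRate Regular: the original 5 calculations, averaged."""
    return sum(five_calcs) / len(five_calcs)

def mean_rating(fourteen_calcs):
    """Mean: the simple average of all 14 calculations."""
    return sum(fourteen_calcs) / len(fourteen_calcs)

def bias_rating(five_calcs):
    """Bias: the same 5 calculations, weighted 30/30/20/10/10."""
    weights = (0.30, 0.30, 0.20, 0.10, 0.10)
    return sum(w * c for w, c in zip(weights, five_calcs))

five = [118.0, 116.5, 117.2, 115.8, 119.0]  # made-up calculation outputs
print(round(pirate_regular(five), 2))  # → 117.3
print(round(bias_rating(five), 2))     # → 117.27
```

With weights this close to uniform, the two numbers differ very little, which matches the observation that the Regular and Bias ratings begin the season nearly identical.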
Additionally, each of the three ratings has a unique updating formula. The PiRate Regular rating has the most conservative update and will not vary as much as the other two. The Bias rating has a more liberal update; like the betting public, it emphasizes the most recent game over all others. The Mean rating will usually produce a smaller spread, treating the most recent game as part of a larger trend that is often overemphasized. Thus, the Mean rating will frequently differ from the other two in the predicted winner. This is great for our purposes, for when the three ratings agree in a similar point range, we believe the game is less uncertain than the average game. In fact, over the last few years, when the three ratings take the same side of a selection, and the difference is two points or more on all three ratings, that selection has been the correct side about 62% of the time. At 62%, you can get rich slowly if you have the courage to believe it will continue. Of course, that 62% has a rather high standard deviation. One year, the accuracy was just 46.4%; another, it was 73.1%. One year, the number of plays this system generated exceeded 240 for the season, while just a couple of years ago, there were only 97 plays for the season (which happened to be the 73.1% year at 68-25-4).
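That agreement filter can be sketched as a small function. The function and argument names are our own framing of the rule, not the site's actual code; the two-point threshold and the same-side requirement come from the description above.

```python
# Sketch of the "three ratings agree" rule: a play qualifies only when
# all three predicted margins fall on the same side of the betting line,
# each by at least two points. Names and signature are hypothetical.

def qualifies(regular, mean, bias, spread, edge=2.0):
    diffs = (regular - spread, mean - spread, bias - spread)
    return all(d >= edge for d in diffs) or all(d <= -edge for d in diffs)

# Favorite laid at -7; all three ratings see a win by 9 or more:
print(qualifies(9.5, 9.0, 10.5, 7.0))  # → True
# The Mean rating sees only a 6-point win, so no play:
print(qualifies(9.5, 6.0, 10.5, 7.0))  # → False
```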
3. You once said that strength of schedule did not count for much in your system. How can you be accurate then?
Answer: This statement is somewhat true, but let us explain what we mean. We believe that the strength of a team lies in its talent, its teamwork, its coaching, and its commitment to win. The schedule does not indicate how good a team may or may not be. It may be how the rankings and BCS standings are determined, but we do not issue ratings to try to pick how the teams will be ranked or even which teams will play in the National Championship Game. We want to rate the teams from best to worst and only care to compare which teams are actually better than others and by how many points.
Here is why strength of schedule is useless to us. Let’s say that my friend the high school coach has just been hired at Old Dominion as the Monarchs move to FBS status. In the first three years there, he successfully recruits the next Peyton Manning, Adrian Peterson, Calvin Johnson, Larry Fitzgerald, Brandon Marshall, Anthony Gonzalez, Maurkice Pouncey, Mike Iupati, Andy Levitre, Ryan Clady, and Joe Staley to start on offense. On defense, he signs Geno Atkins, Vince Wilfork, J.J. Watt, Julius Peppers, Patrick Willis, Clay Matthews, DeMarcus Ware, Darrelle Revis, Charles Tillman, Earl Thomas, and Eric Berry to start on defense.
Without a doubt, no team in college football could equal talent like this. Not only are these guys obvious first-team All-Americans, every one is a future first-team All-NFL selection. Even Alabama could not compete against this team.
Now, this ODU team’s schedule is: Georgia State, Charlotte, Appalachian St., Louisiana-Monroe, Massachusetts, Troy, South Alabama, Louisiana-Lafayette, Arkansas St., Georgia Southern, Texas St., and Army. There is no doubt that they will go 12-0 and outscore this dozen by about 500-700 points. Yet, the strength of schedule may rank this team around #20. If this were this season, they would not even compete for an at-large BCS Bowl Bid, and they would have to settle for something like the New Orleans or Military Bowl.
This has been the case in the past. In 1970, Arizona State had the best team in the nation. They did not get a chance to play in a big bowl and had to settle for the Peach, where they won handily. Nebraska was two touchdowns weaker in 1970 than they would be in 1971, and the Sun Devils had the better team in 1970, when they ran the table and proved unstoppable on offense.
In 1969, Penn State was probably a little better than Texas. The Longhorns’ new Wishbone offense proved to be an excellent weapon, but by the end of the season, teams had figured out how to slow it down. Only a miracle comeback even got UT to the Cotton Bowl, and once there, they had trouble with a very good but not great Notre Dame team. Meanwhile, Penn State had perhaps the best college defense of that era. That defense and the special teams actually scored or set up more points than they gave up. Additionally, it was a team that went 11-0 for the second consecutive season and would place a host of players in the NFL. How strong was that Penn St. team? Their second and third running options were Franco Harris and Lydell Mitchell, two future NFL stars. Their quarterback, Chuck Burkhart, NEVER lost a game in which he was the starting QB—that includes college, high school, and junior high—undefeated for life!
The end of the BCS era does not signal the end of this travesty. Big-name schools with gaudy schedules will still beat out other schools for one of the four playoff berths. There should be no selection of playoff berths; there should be set guidelines that allow each team to qualify for a berth by winning on the field, just like in the NFL. The last several Super Bowl champs might never have been in the playoffs at all if they had to be selected as one of the top four teams. Baltimore would have been left out last year, and the Green Bay Packers and New York Giants would not have qualified when they won their most recent Super Bowls. It is our opinion that this tournament needs to be eight deep, with each of the eight teams qualifying by winning on the field and clinching a spot based only on games played and never on human selection.
5. You used to report for Vanderbilt, and you stated that you married into a University of Wisconsin and Green Bay Packer family. How do we know that you do not fudge on these teams and rate them higher than they deserve?
Answer: You are confusing ranking and rating. Ranking might bring human partiality into play, but we are trying to rate teams so that the ratings can be used to select against the spread. We would be quite happy for these three teams to win every week, but what most excites us is picking all the winners against the spread. Our love of being accurate is really all that matters. We have no influence over the rankings, so it really does not matter which teams we cheer for. And, to tell you the truth, some of us root for different teams, and we are not fanatical fans. Our founder has cultivated friendships with athletic officials at numerous schools, including those at the University of Tennessee and the University of Minnesota, and with personnel of the Chicago Bears and Cleveland Browns. He never roots against anybody. His love of the game is what keeps his interest going, and as a long-time coach in football and basketball, his first love is watching teams practice. As most long-time coaches will agree, they miss the practices when they retire; they don’t really miss the games, the schmoozing with alumni, the media, and so on.
6. What happened to your Computer Simulations?
Answer: We regret to say that we lost access to the college campus computer that allowed us to run these simulations. So, unless this changes, we will not be able to offer this service in the future.
7. I want to make my own ratings. Can you offer help?
Answer: This is one we get a lot. If you want to make your own ratings, do what we did when we started out in 1969. Begin with your own personal belief about each team. Within each conference, rank the teams against one another. Then, take the best teams in each conference and rank them against each other. It should look something like it did for our founder in October 1969:
Southwest Conference:
Arkansas 0, Texas -1, Texas Tech -27, TCU -29, SMU -30, Texas A&M -30, Rice -34, Baylor -41.
He did this for every conference as well as the numerous independents, which he had broken down into four regions since there were so many then.
At the time, Ohio St. was number one overall. They received the top rating at 120, or 20 points better than the average team and about 40 points better than the typical weak team. He had Arkansas as the third best of the teams, about 3 points weaker than Ohio St. Thus for the SWC, the teams had these ratings:
Arkansas 117, Texas 116, Texas Tech 90, TCU 88, SMU 87, Texas A&M 87, Rice 83, and Baylor 76.
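Those absolute numbers follow directly from the within-conference offsets: pin the league's best team to its rating on the national scale and add each offset. A short sketch reproduces the list above.

```python
# Reproduce the 1969 SWC ratings: anchor Arkansas at 117 (3 points
# behind Ohio St.'s 120) and add each team's within-conference offset.

swc_offsets = {
    "Arkansas": 0, "Texas": -1, "Texas Tech": -27, "TCU": -29,
    "SMU": -30, "Texas A&M": -30, "Rice": -34, "Baylor": -41,
}
ANCHOR = 117  # Arkansas's absolute rating on the national scale

swc_ratings = {team: ANCHOR + off for team, off in swc_offsets.items()}
print(swc_ratings["Texas Tech"])  # → 90
print(swc_ratings["Baylor"])      # → 76
```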
At the time, he gave every team with a large stadium 4 points home field advantage, every team with an average stadium 3 points, and every team with a small stadium 2 points.
After each game, he raised or lowered the rating from 1 to 6 points based on the outcome of the game, or left it the same. Whatever he gave to one team, he took the opposite away from the other. It was crude, but he was 9 years old.
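The bookkeeping of that boyhood system can be sketched as follows. How many points a result was "worth" was a subjective call, so here it is simply a parameter; the stadium-size classifications and all game results below are hypothetical.

```python
# Sketch of the 1969 system: a stadium-size home edge (4/3/2 points)
# and a zero-sum post-game adjustment of 0 to 6 points, judged by eye.

HOME_EDGE = {"large": 4, "average": 3, "small": 2}

def update(ratings, winner, loser, points):
    """Move `points` (0-6) from loser to winner; zero-sum by design."""
    if not 0 <= points <= 6:
        raise ValueError("adjustment must be between 0 and 6 points")
    ratings[winner] += points
    ratings[loser] -= points

ratings = {"Ohio St.": 120, "Arkansas": 117}
update(ratings, "Arkansas", "Ohio St.", 2)  # hypothetical upset result
# Arkansas rises to 119, Ohio St. drops to 118.

# Predicted margin for a rematch in a large stadium at Arkansas:
margin = ratings["Arkansas"] + HOME_EDGE["large"] - ratings["Ohio St."]
print(margin)  # → 5
```

The zero-sum transfer keeps the national average rating fixed, which is what lets a single absolute scale (100 for an average team) stay meaningful all season.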
8. Have you ever considered using more colors in your blog?
Answer: That was a great suggestion, and we took your advice this summer and began using team colors.