Bayes Theorem and the College Football Title Contenders

Why an 18th Century English Presbyterian Minister Matters to College Football Fans

This is an attempt to use Bayesian probabilities to narrow down the list of legitimate national title contenders in 2014 using two data points: (1) the percentage of blue-chip recruits on a team's roster and (2) a team's Pythagorean win difference from the prior year. Both have been used in prediction models and are well accepted as useful statistical tools in an analyst's toolbox. As far as I know, though, they haven't been used together to develop a conditional (Bayesian) probability of reaching a given win total. I'm using 12 wins as the benchmark that a team must reach in the regular season if it's going to be in the running for a berth in the national championship tournament.

Thomas Bayes was an 18th Century English statistician and minister.  Among his many accomplishments was an idea that didn't gain traction until long after his death: the idea of conditional probability. Bayesian probability attempts to use multiple data points to narrow down the likelihood of a given result by asking the question "what is the probability of A happening if B happened?"
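In symbols, that question is usually written as Bayes' Theorem. The form below is the standard textbook statement, with A and B standing for generic events rather than anything specific to this analysis:

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}, \qquad P(B) > 0
```

Here P(A) is the probability of A before B is known, and P(A | B) is the updated probability once B is known to have happened.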

Before I dig into the results of the analysis, here's a quick summary of how I derived my starting data and the assumptions I had to make.

First of all, applying Bayes' Theorem to two data points this way requires that the outcomes being estimated be mutually exclusive and that the data points themselves be independent. In this case mutual exclusivity was not an issue: I chose "12 or more wins" and "fewer than 12 wins" as the two outcomes. Since a team can't be both, they are mutually exclusive. Independence is harder to establish.

The Pythagorean Theorem of Football is based on a team's average points for and average points against. It gives a number between 0 and 1 which, when multiplied by the total number of games a team played in a season, gives an expected win total for the season. It is a direct calculation from on-field play. The blue-chip percentage also has an impact on expected wins: it's been well established that having blue chips on a team is important if a team aspires to play at an elite level. Because of this, I cannot say with confidence that the two are independent. That doesn't invalidate the utility of Bayes' Theorem, but the results become less useful as the relationship between the data points grows closer. I believe that the relationship between a Pythagorean expected win total and the percentage of blue-chip recruits on a team is sufficiently distant for this approach to be worthwhile.
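As a rough illustration of the calculation described above, here is a minimal Python sketch. The function names are hypothetical, and the exponent of 2.37 is an assumption (a value often quoted for football), not necessarily the one behind the numbers in this article. The difference is taken as actual wins minus expected wins, which matches the sign of the figures used here (an undefeated team comes out positive).

```python
def pythagorean_expected_wins(points_for, points_against, games, exponent=2.37):
    """Expected wins from points scored and points allowed.

    exponent=2.37 is an assumed value often quoted for football.
    Season totals or per-game averages give the same win fraction,
    because only the ratio of the two inputs matters.
    """
    if points_for < 0 or points_against < 0:
        raise ValueError("points must be non-negative")
    scored = points_for ** exponent
    allowed = points_against ** exponent
    if scored + allowed == 0:
        raise ValueError("at least one of the point values must be positive")
    win_fraction = scored / (scored + allowed)  # a number between 0 and 1
    return win_fraction * games


def pythagorean_difference(actual_wins, points_for, points_against, games,
                           exponent=2.37):
    """Actual wins minus Pythagorean expected wins."""
    expected = pythagorean_expected_wins(points_for, points_against, games, exponent)
    return actual_wins - expected


# Hypothetical team: scores and allows 30 points per game over 12 games,
# so it is expected to win half of them, but it actually won 8.
print(pythagorean_expected_wins(30, 30, 12))
print(pythagorean_difference(8, 30, 30, 12))
```

A positive difference means the team won more games than its scoring suggests; a negative one means it won fewer.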

I chose the Pythagorean Theorem of Football because it is a good indicator of whether a team's win-loss record is congruent with its performance on the field, and therefore of whether a team will do better or worse the next year. The blue-chip percentage is a good proxy for the amount of pure talent on the team. In building a prediction model this way I'm assuming that a team's schedule will be similar from year to year.

The analysis is built on calculating the probability that a team will reach 12 regular season wins. In order to calculate this, I developed probabilities that a team with a given difference between its actual wins and expected wins would experience a change in its win total of a given amount. For instance, in 2013 Bowling Green had an expected win difference of -1.9. This is strong evidence that its actual win total was not truly indicative of its potential. There is strong historical evidence that teams that fall 2 games or more below their expected win totals will experience a bounce back the next year. For this reason, the probability that a team with a Pythagorean win difference between -2 and -1.75 will improve by no more than 3 wins is about 30%. Florida State won all its games, so it had an expected win difference of .54. The probability that its win total would change by no more than -1 is .52.
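The lookup described above can be sketched in a few lines of Python. Only the two rows needed for the Bowling Green and Florida State examples are copied from the table at the end of the article. The names are hypothetical, and the way group boundaries are handled is an assumption.

```python
# Win-change columns and two rows copied from the table at the end of the article.
CHANGE_COLUMNS = [-4, -3, -2, -1, 0, 1, 2, 3, 4]
CHANGE_TABLE = {
    # (low, high) bounds of the Pythagorean difference group: one value per column
    (0.5, 0.75): [0.891, 0.798, 0.669, 0.516, 0.361, 0.226, 0.125, 0.061, 0.026],
    (-2.0, -1.75): [0.945, 0.902, 0.840, 0.755, 0.651, 0.534, 0.415, 0.302, 0.206],
}


def change_probability(pythag_diff, win_change):
    """Return the tabulated probability for a team's group and a win-change column."""
    column = CHANGE_COLUMNS.index(win_change)  # ValueError if not a table column
    for (low, high), row in CHANGE_TABLE.items():
        # Assumption: a group includes its lower bound and excludes its upper bound.
        if low <= pythag_diff < high:
            return row[column]
    raise LookupError(f"no group loaded for a difference of {pythag_diff}")


print(change_probability(-1.917, 3))   # Bowling Green, +3 wins column -> 0.302
print(change_probability(0.541, -1))   # Florida State, -1 wins column -> 0.516
```

Each row's values fall as the column value rises, so within a group a larger jump in wins is always the less likely entry.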

I've attached the table of probabilities at the end of the article.

Next, I calculated the four-year average blue-chip percentage for each team from 2008 to 2013 and used that to establish probabilities that a team with a given blue-chip percentage would reach 12 wins. The table of probabilities for this is at the end of the article as well.
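Turning a team's blue-chip percentage into a probability is again a simple group lookup. A minimal Python sketch, using the values from the blue-chip table at the end of the article, is below. The names are hypothetical, and putting a share that lands exactly on a boundary into the lower group is an assumption.

```python
# (upper bound of the blue-chip group, probability of 12 wins),
# copied from the blue-chip table at the end of the article.
BLUE_CHIP_GROUPS = [
    (0.0, 0.015),
    (0.1, 0.025),
    (0.2, 0.044),
    (0.3, 0.057),
    (0.4, 0.072),
    (0.5, 0.256),
    (0.6, 0.207),
    (0.8, 0.337),
]


def blue_chip_probability(bc_share):
    """Probability of 12 wins for a four-year blue-chip share between 0 and 0.8."""
    if not 0.0 <= bc_share <= 0.8:
        raise ValueError("the table only covers shares from 0 to 0.8")
    for upper, probability in BLUE_CHIP_GROUPS:
        # Assumption: a share exactly on a boundary falls in the lower group.
        if bc_share <= upper:
            return probability


print(blue_chip_probability(0.515))  # Florida State -> 0.207
print(blue_chip_probability(0.0))    # a team with no blue chips -> 0.015
```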

With those two probabilities I calculated the probability that a team would reach 12 wins, GIVEN that it had a certain percentage of blue-chip recruits on the team.
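For readers who want to see the mechanics of a conditional update, here is a minimal, generic Python sketch of Bayes' Theorem for two mutually exclusive outcomes. It is not the exact calculation behind the rankings in this article; the function name and the input numbers are illustrative assumptions only.

```python
def bayes_posterior(prior, likelihood, likelihood_if_not):
    """P(A | B) from P(A), P(B | A) and P(B | not A).

    A and "not A" are the two mutually exclusive outcomes,
    for example "12 or more wins" and "fewer than 12 wins".
    """
    for p in (prior, likelihood, likelihood_if_not):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must be between 0 and 1")
    # Total probability of seeing the evidence B under either outcome.
    evidence = likelihood * prior + likelihood_if_not * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("the evidence has zero probability under both outcomes")
    return likelihood * prior / evidence


# Illustrative numbers only (not taken from this article's tables):
posterior = bayes_posterior(prior=0.2, likelihood=0.6, likelihood_if_not=0.3)
print(round(posterior, 3))  # 0.333
```

In this made-up case the evidence is twice as likely under outcome A as under its alternative, so the probability of A rises from 0.2 to about one in three.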

The top 10 by this method are below.

| Rank | School | Prior Pythag Diff | 4-year BC % | P(12 wins) |
|---|---|---|---|---|
| 1 | Florida State | 0.541 | 0.515 | 0.804 |
| 2 | Louisville | 0.123 | 0.126 | 0.562 |
| 3 | Ohio State | 0.452 | 0.714 | 0.539 |
| 4 | Michigan State | 1.032 | 0.188 | 0.347 |
| 5 | Bowling Green | -1.917 | 0.000 | 0.334 |
| 6 | Alabama | -0.747 | 0.711 | 0.324 |
| 7 | Stanford | 0.298 | 0.383 | 0.245 |
| 8 | Marshall | -1.120 | 0.018 | 0.245 |
| 9 | Baylor | -0.117 | 0.143 | 0.234 |
| 10 | Oregon | -0.112 | 0.493 | 0.234 |

Florida State at #1 is no surprise. They were dominant all year and are loaded with talent. The gap between their probability of 12 wins and that of the rest of the FBS is telling. They are, and deserve to be, the odds-on favorite to win the national championship next year.

Bowling Green's and Marshall's presence on the list has a lot to do with seasons in which they fell well below their expected win totals. Because of this, there is a probability boost toward a higher win total in 2014.

Despite losing its final two games, Ohio State still won about half a game more than its Pythagorean expected total. It's hard to know what to think of that, especially given its recent dominance in recruiting. Still, because it has an outstanding 4-year blue-chip average, Ohio State has the third-highest probability of 12 regular season wins.

Alabama, on the heels of the kick-six and a deflating loss to Oklahoma in the Sugar Bowl, fell almost a full game short of expectations.  That, along with its dominance in recruiting, keeps it toward the top of the list.

| Rank | School | Prior Pythag Diff | 4-year BC % | P(12 wins) |
|---|---|---|---|---|
| 1 | Florida State | 0.541 | 0.515 | 0.804 |
| 2 | Louisville | 0.123 | 0.126 | 0.562 |
| 3 | Ohio State | 0.452 | 0.714 | 0.539 |
| 4 | Michigan State | 1.032 | 0.188 | 0.347 |
| 5 | Bowling Green | -1.917 | 0.000 | 0.334 |
| 6 | Alabama | -0.747 | 0.711 | 0.324 |
| 7 | Stanford | 0.298 | 0.383 | 0.245 |
| 8 | Marshall | -1.120 | 0.018 | 0.245 |
| 9 | Baylor | -0.117 | 0.143 | 0.234 |
| 10 | Oregon | -0.112 | 0.493 | 0.234 |
| 11 | Missouri | 1.307 | 0.165 | 0.202 |
| 12 | Utah State | -2.187 | 0.000 | 0.180 |
| 13 | Wisconsin | -1.967 | 0.175 | 0.158 |
| 14 | Auburn | 1.652 | 0.524 | 0.143 |
| 15 | Northern Illinois | 1.718 | 0.000 | 0.143 |
| 16 | Arizona State | 0.087 | 0.124 | 0.091 |
| 17 | Oklahoma State | -0.247 | 0.221 | 0.081 |
| 18 | East Carolina | 0.315 | 0.010 | 0.079 |
| 19 | UCLA | 0.404 | 0.408 | 0.079 |
| 20 | LSU | 0.298 | 0.609 | 0.079 |
| 21 | USC | 0.497 | 0.713 | 0.079 |
| 22 | Clemson | 0.740 | 0.384 | 0.078 |
| 23 | North Texas | -1.206 | 0.000 | 0.077 |
| 24 | Washington | -0.813 | 0.195 | 0.072 |
| 25 | Fresno State | 2.042 | 0.000 | 0.065 |
| 26 | South Carolina | 1.129 | 0.251 | 0.056 |
| 27 | UCF | 2.302 | 0.000 | 0.051 |
| 28 | Cincinnati | -0.350 | 0.052 | 0.047 |
| 29 | Boise State | -1.285 | 0.031 | 0.025 |
| 30 | Houston | -1.331 | 0.010 | 0.025 |
| 31 | Georgia Tech | -2.379 | 0.065 | 0.021 |
| 32 | Iowa | -0.778 | 0.083 | 0.021 |
| 33 | Navy | 0.290 | 0.000 | 0.020 |
| 34 | Texas A&M | 0.306 | 0.448 | 0.020 |
| 35 | Ball State | 0.526 | 0.000 | 0.020 |
| 36 | Kansas State | -1.038 | 0.034 | 0.019 |
| 37 | Duke | 1.414 | 0.000 | 0.014 |
| 38 | Rice | 1.521 | 0.000 | 0.014 |
| 39 | Colorado State | -0.493 | 0.000 | 0.013 |
| 40 | North Carolina | -1.523 | 0.189 | 0.013 |
| 41 | Arizona | -0.734 | 0.084 | 0.006 |
| 42 | BYU | -0.662 | 0.052 | 0.006 |
| 43 | Buffalo | -0.041 | 0.000 | 0.005 |
| 44 | Georgia | -0.159 | 0.543 | 0.005 |
| 45 | Mississippi | -0.166 | 0.257 | 0.005 |
| 46 | Michigan | -0.800 | 0.504 | 0.005 |
| 47 | Mississippi State | -0.820 | 0.178 | 0.005 |
| 48 | Vanderbilt | 1.078 | 0.096 | 0.005 |
| 49 | Notre Dame | 1.107 | 0.633 | 0.005 |
| 50 | Nebraska | 0.739 | 0.293 | 0.004 |
| 51 | Minnesota | 0.465 | 0.036 | 0.004 |
| 52 | Texas Tech | 0.371 | 0.165 | 0.004 |
| 53 | Virginia Tech | 0.395 | 0.148 | 0.004 |
| 54 | Oklahoma | 1.806 | 0.334 | 0.003 |
| 55 | Louisiana-Lafayette | 0.782 | 0.000 | 0.002 |
| 56 | Miami (Fl) | 0.844 | 0.303 | 0.002 |
| 57 | Oregon State | -0.237 | 0.030 | 0.001 |
| 58 | Tulane | -0.553 | 0.000 | 0.001 |
| 59 | Toledo | 0.050 | 0.009 | 0.001 |
| 60 | Arkansas State | 0.720 | 0.000 | 0.001 |
| 61 | Old Dominion | 0.569 | 0.000 | 0.001 |
| 62 | Texas | 0.599 | 0.609 | 0.001 |
| 63 | Western Kentucky | 0.522 | 0.000 | 0.001 |
| 64 | Penn State | 0.394 | 0.200 | 0.001 |
| 65 | Maryland | 0.265 | 0.099 | 0.001 |
| 66 | Florida Atlantic | -1.130 | 0.000 | 0.001 |
| 67 | South Alabama | -1.029 | 0.000 | 0.001 |
| 68 | TCU | -1.955 | 0.116 | 0.000 |
| 69 | Middle Tennessee | 0.779 | 0.000 | 0.000 |
| 70 | UTSA | 1.192 | 0.000 | 0.000 |
| 71 | Washington State | -0.151 | 0.026 | 0.000 |
| 72 | Indiana | -0.929 | 0.067 | 0.000 |
| 73 | Northwestern | -0.772 | 0.104 | 0.000 |
| 74 | Utah | -1.272 | 0.061 | 0.000 |
| 75 | Ohio | 0.521 | 0.000 | 0.000 |
| 76 | Pittsburgh | 0.727 | 0.127 | 0.000 |
| 77 | Rutgers | 0.324 | 0.113 | 0.000 |
| 78 | Troy | 0.350 | 0.000 | 0.000 |
| 79 | Syracuse | 1.304 | 0.037 | 0.000 |
| 80 | Illinois | -0.835 | 0.034 | 0.000 |
| 81 | Boston College | 0.813 | 0.011 | 0.000 |
| 82 | UNLV | 0.950 | 0.011 | 0.000 |
| 83 | Wyoming | 0.054 | 0.000 | 0.000 |
| 84 | West Virginia | -0.465 | 0.058 | 0.000 |
| 85 | Kent State | -0.313 | 0.000 | 0.000 |
| 86 | Nevada | -0.403 | 0.000 | 0.000 |
| 87 | Florida | -1.252 | 0.499 | 0.000 |
| 88 | San Jose State | 0.524 | 0.000 | 0.000 |
| 89 | SMU | 0.436 | 0.021 | 0.000 |
| 90 | Tennessee | 0.287 | 0.315 | 0.000 |
| 91 | Akron | 1.212 | 0.010 | 0.000 |
| 92 | Central Michigan | 1.351 | 0.000 | 0.000 |
| 93 | San Diego State | 1.950 | 0.000 | 0.000 |
| 94 | Wake Forest | -0.236 | 0.000 | 0.000 |
| 95 | Georgia State | -2.209 | 0.000 | 0.000 |
| 96 | Louisiana Tech | 0.012 | 0.000 | 0.000 |
| 97 | Texas State | 0.881 | 0.000 | 0.000 |
| 98 | Army | -1.333 | 0.000 | 0.000 |
| 99 | Memphis | -1.491 | 0.000 | 0.000 |
| 100 | New Mexico | -1.275 | 0.000 | 0.000 |
| 101 | Temple | -2.818 | 0.007 | 0.000 |
| 102 | North Carolina State | -1.200 | 0.025 | 0.000 |
| 103 | UAB | -0.921 | 0.000 | 0.000 |
| 104 | Virginia | -0.872 | 0.166 | 0.000 |
| 105 | Tulsa | -0.097 | 0.000 | 0.000 |
| 106 | Kansas | 1.021 | 0.033 | 0.000 |
| 107 | Arkansas | -0.514 | 0.176 | 0.000 |
| 108 | Iowa State | -0.657 | 0.022 | 0.000 |
| 109 | Colorado | 0.549 | 0.021 | 0.000 |
| 110 | Kentucky | -1.394 | 0.102 | 0.000 |
| 111 | Hawaii | -2.803 | 0.000 | 0.000 |
| 112 | Air Force | -1.058 | 0.000 | 0.000 |
| 113 | Idaho | -0.312 | 0.000 | 0.000 |
| 114 | Purdue | -0.338 | 0.036 | 0.000 |
| 115 | Southern Miss | -0.439 | 0.009 | 0.000 |
| 116 | Louisiana-Monroe | 1.954 | 0.000 | 0.000 |
| 117 | New Mexico State | 0.114 | 0.000 | 0.000 |
| 118 | South Florida | 0.025 | 0.059 | 0.000 |
| 119 | UTEP | -0.540 | 0.000 | 0.000 |
| 120 | Eastern Michigan | 0.495 | 0.012 | 0.000 |
| 121 | California | -1.127 | 0.194 | 0.000 |
| 122 | Western Michigan | -1.003 | 0.000 | 0.000 |
| 123 | Massachusetts | -0.086 | 0.000 | 0.000 |
| 124 | Miami (Oh) | -0.638 | 0.000 | 0.000 |
| 125 | Florida International | 0.409 | 0.000 | 0.000 |

The probability of a change in total wins of no more than:

| Pythag Diff Grp | -4 | -3 | -2 | -1 | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|---|---|---|---|
| 2.25 + | 0.768 | 0.628 | 0.470 | 0.316 | 0.189 | 0.099 | 0.046 | 0.018 | 0.006 |
| 2 to 2.25 | 0.839 | 0.736 | 0.607 | 0.464 | 0.326 | 0.209 | 0.121 | 0.063 | 0.029 |
| 1.75 to 2 | 0.728 | 0.566 | 0.392 | 0.237 | 0.124 | 0.055 | 0.021 | 0.007 | 0.002 |
| 1.5 to 1.75 | 0.802 | 0.691 | 0.559 | 0.420 | 0.290 | 0.183 | 0.105 | 0.054 | 0.025 |
| 1.25 to 1.5 | 0.887 | 0.789 | 0.653 | 0.493 | 0.335 | 0.201 | 0.106 | 0.049 | 0.019 |
| 1 to 1.25 | 0.785 | 0.678 | 0.552 | 0.422 | 0.299 | 0.196 | 0.118 | 0.065 | 0.033 |
| 0.75 to 1 | 0.929 | 0.845 | 0.712 | 0.542 | 0.364 | 0.211 | 0.105 | 0.044 | 0.015 |
| 0.5 to 0.75 | 0.891 | 0.798 | 0.669 | 0.516 | 0.361 | 0.226 | 0.125 | 0.061 | 0.026 |
| 0.25 to 0.5 | 0.950 | 0.894 | 0.802 | 0.673 | 0.520 | 0.363 | 0.227 | 0.125 | 0.061 |
| 0 to 0.25 | 0.950 | 0.895 | 0.805 | 0.681 | 0.531 | 0.377 | 0.241 | 0.137 | 0.068 |
| -0.25 to 0 | 0.932 | 0.868 | 0.772 | 0.646 | 0.501 | 0.356 | 0.229 | 0.133 | 0.069 |
| -0.5 to -0.25 | 0.943 | 0.890 | 0.808 | 0.696 | 0.563 | 0.421 | 0.289 | 0.181 | 0.102 |
| -0.75 to -0.5 | 0.966 | 0.921 | 0.842 | 0.723 | 0.572 | 0.409 | 0.261 | 0.146 | 0.072 |
| -1 to -0.75 | 0.961 | 0.919 | 0.849 | 0.749 | 0.621 | 0.478 | 0.338 | 0.218 | 0.127 |
| -1.25 to -1 | 0.982 | 0.955 | 0.900 | 0.809 | 0.679 | 0.523 | 0.363 | 0.224 | 0.121 |
| -1.5 to -1.25 | 0.986 | 0.962 | 0.914 | 0.831 | 0.708 | 0.555 | 0.393 | 0.248 | 0.138 |
| -1.75 to -1.5 | 0.984 | 0.961 | 0.916 | 0.842 | 0.733 | 0.597 | 0.447 | 0.304 | 0.186 |
| -2 to -1.75 | 0.945 | 0.902 | 0.840 | 0.755 | 0.651 | 0.534 | 0.415 | 0.302 | 0.206 |
| -2.25 to -2 | 0.961 | 0.926 | 0.870 | 0.791 | 0.688 | 0.567 | 0.440 | 0.319 | 0.215 |
| Below -2.25 | 0.994 | 0.983 | 0.957 | 0.905 | 0.816 | 0.689 | 0.534 | 0.373 | 0.232 |

The probability of a team reaching 12 wins based on percentage of blue-chip recruits:

| BC% Grp | P(12 wins) |
|---|---|
| 0 | 0.015 |
| 0 - .1 | 0.025 |
| .1 - .2 | 0.044 |
| .2 - .3 | 0.057 |
| .3 - .4 | 0.072 |
| .4 - .5 | 0.256 |
| .5 - .6 | 0.207 |
| .6 - .8 | 0.337 |