One of my favorite pieces of baseball analytics is the Marcel projection system pioneered by Tom Tango. Its idea is very simple:
The Marcel the Monkey Forecasting System (or the Marcels for short) is the most advanced forecasting system ever conceived.
Actually, it is the most basic forecasting system you can have, that uses as little intelligence as possible. So, that's the allusion to the monkey.
The system takes a three-year weighted average, applies an age adjustment, and regresses players toward an average player to produce a projection for the upcoming season. I want to do something similar for College Football, and I'll present my results here.
The backbone of pretty much any projection is past performance: how you did in the past is usually a pretty good indicator of how you will do in the future. Which rating system you use is obviously important. ESPN likes to tout FPI, Bill C uses his S&P+ and Brian Fremeau's FEI extensively in his opponent previews, and those are just the big three. Those are all proprietary systems that, while very accurate, rely on data that is hard to come by. For this simple projection system I am going to use the Massey Rating system. Massey Ratings take each game's margin of victory and adjust it for the quality of the opponent. The math behind it is a little more complicated, but no more so than a linear regression with a dummy indicator for each team and each game's margin of victory as the dependent variable. If you want more info on this, just ask me in the comments or on Twitter.
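For the curious, here is a minimal sketch of that regression idea in Python. It is not my actual code: the function name `massey_ratings`, the three made-up teams, and the use of NumPy's least-squares solver are all assumptions for illustration, and it ignores things like home-field advantage.

```python
import numpy as np

def massey_ratings(games, teams):
    """Least-squares ratings from (team_a, team_b, margin) tuples,
    where margin is team_a's points minus team_b's points."""
    index = {team: i for i, team in enumerate(teams)}
    X = np.zeros((len(games), len(teams)))
    y = np.zeros(len(games))
    for row, (team_a, team_b, margin) in enumerate(games):
        X[row, index[team_a]] = 1.0   # dummy indicator for the first team
        X[row, index[team_b]] = -1.0  # and for its opponent
        y[row] = margin
    # Every row of X sums to zero, so X is rank-deficient. lstsq returns
    # the minimum-norm solution, which centers the ratings around zero.
    ratings, *_ = np.linalg.lstsq(X, y, rcond=None)
    return dict(zip(teams, ratings))

# Hypothetical three-team season (made-up results, not real data).
teams = ["Team A", "Team B", "Team C"]
games = [
    ("Team A", "Team B", 10),
    ("Team B", "Team C", 10),
    ("Team A", "Team C", 20),
]
ratings = massey_ratings(games, teams)
```

In this toy season the margins are perfectly consistent, so the ratings come out to roughly +10, 0, and -10. With real results the fit is a compromise across all games, which is where the opponent adjustment comes from.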
The Massey Ratings are very easy to calculate after each season for use in predicting the next season. For example, here are the top 10 teams according to this method from last season:
As you can see, the system isn't perfect and certainly has its flaws, but I think for an objective ranking system it does pretty well.
In College Football, recruiting matters; no one can deny that. It isn't the be-all and end-all of a team's success, but man, it's tough to accomplish much without it, and even harder to sustain that success. I was lucky enough to find Matthew Smith's 247 recruiting database and can now track each team's recruiting success through the years. Using this data we can look at how well past recruiting success predicts the winner of future games. For example, from 2010 to 2014 the team whose sophomores had the higher average 247 composite class ranking won 64% of the time. For the visual learners, here is a plot of each game's recruiting rankings along with who ended up winning.
A blue dot means the home team won the game and a red dot means the away team did. Below the dashed line, where the home team had the better recruiting class last year, there is a higher concentration of blue dots; above the line, the away team wins more of the games.
Again, is this the perfect prediction? No, but it's certainly useful.
Draft Talent Lost
More and better recruits coming into your program is obviously a good thing for your team's future success, but what about losing talent to the draft? While replacing talented players is usually tough, success in the draft could also be a sign of the health of your program. To answer this I downloaded the past 11 drafts from Pro Football Reference and used Chase Stuart's Draft Value Chart to assign a value to each pick for a team. For example, last year Florida State had 13 players drafted, the most of any school over that time period. However, in 2010 Oklahoma actually lost more draft value on only 7 picks, because the Draft Value Chart gives more value to the earliest picks and Oklahoma had three players drafted in the top 4 that year.
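To show why a few early picks can outweigh a bigger draft class, here is a rough sketch of the bookkeeping. The `pick_value` curve is a made-up placeholder that just happens to be top-heavy; it is not Chase Stuart's actual chart, and the schools and pick numbers are hypothetical.

```python
from collections import defaultdict

def pick_value(pick):
    """Placeholder top-heavy value curve. NOT Chase Stuart's real chart,
    just an assumed shape where early picks are worth far more."""
    return 100.0 / pick ** 0.5

def draft_value_by_school(picks):
    """Sum the value of every pick for each school.
    `picks` is a list of (school, overall_pick_number) tuples."""
    totals = defaultdict(float)
    for school, pick in picks:
        totals[school] += pick_value(pick)
    return dict(totals)

# Hypothetical draft: School A has 3 very early picks,
# School B has 5 picks that all come late.
picks = [
    ("School A", 1), ("School A", 3), ("School A", 4),
    ("School B", 100), ("School B", 120), ("School B", 150),
    ("School B", 180), ("School B", 200),
]
draft_value = draft_value_by_school(picks)
```

Under this assumed curve School A ends up with more total draft value despite having fewer players selected, which is the same effect described for Oklahoma above.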
But how well does draft value indicate future success? From 2010 to 2014 the team with more draft talent lost in the previous draft won 57% of the time. So losing players to the draft actually seems to be a positive sign for your team. If you don't believe me I'd be happy to provide more analysis on the matter; just let me know in the comments.
Putting it all Together
For this simple Marcel projection system those are the only factors I'm including. We could definitely improve it by using a better rating method to measure team strength and by adding information like starters lost, coaching turnover, or production lost beyond just draft talent. But that's not the goal of this system; the goal is simply to provide a bare-minimum level of prediction.
To test this method I built a simple logistic regression model that estimates a team's chances of winning a game based on the two teams' Massey Ratings from last year and the year before, recruiting rankings for their sophomore and class, and the draft talent lost in the previous draft. I fit the model on the 2010-2013 seasons and used it to predict the 2014 season. The model correctly predicted 70% of the games last season. That's without any updating during the season, simply from what we knew at the start of the season. Again, not the best, but certainly a serviceable baseline for other projection algorithms.
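If you want a feel for the fit-on-earlier-seasons, predict-the-next-one workflow, here is a bare-bones sketch. Everything in it is an assumption for illustration: the data is synthetic, the four feature columns are imagined as standardized team-minus-opponent differences, and the model is fit with a hand-rolled gradient descent in NumPy rather than whatever package you would normally reach for.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, n_iter=2000):
    """Fit logistic regression by plain gradient descent on the log loss.
    Returns (weights, intercept)."""
    n_rows, n_cols = X.shape
    w = np.zeros(n_cols)
    b = 0.0
    for _ in range(n_iter):
        p = sigmoid(X @ w + b)
        w -= lr * (X.T @ (p - y)) / n_rows
        b -= lr * np.mean(p - y)
    return w, b

# Synthetic stand-in data (NOT real games). Assumed columns, each a
# team-minus-opponent difference: last year's rating, the rating from
# the year before, recruiting ranking, draft value lost.
rng = np.random.default_rng(0)
n_games = 400
X = rng.normal(size=(n_games, 4))
assumed_w = np.array([1.5, 0.5, 0.5, 0.3])  # made-up "true" effects
y = (rng.random(n_games) < sigmoid(X @ assumed_w)).astype(float)

# Fit on the earlier games, then score the held-out ones.
train, test = slice(0, 300), slice(300, None)
w, b = fit_logistic(X[train], y[train])
predicted_win = sigmoid(X[test] @ w + b) > 0.5
accuracy = np.mean(predicted_win == y[test])
```

The key design point is that the held-out games are never seen during fitting, which mirrors predicting 2014 from a model built on 2010-2013.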
The model predicts the probability of one team winning a single game. To get a sense of the strongest teams I created games so that each team would have a matchup against every other team and predicted their chance of winning each game on a neutral field. I then took each team's average win probability across those matchups to see which teams the model viewed as the strongest coming into the year. I also simulated the actual 2015 schedule 1000 times using the predicted probabilities of each team winning according to the Marcel Model. This allowed me to get an estimate of the average number of wins for each team. Here are the top 10 teams according to this method:
| Rank | Team | Marcel Team Rating | Simulated # of Wins |
(*) The Sports Reference Schedule is missing three LSU games so this win total is artificially low for now.
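The two steps behind those columns, the round-robin team rating and the simulated win totals, can be sketched roughly like this. The `win_prob` function here is a hypothetical stand-in for the fitted model (a simple logistic curve on a rating difference, with an assumed `scale`), and the four teams and their schedule are made up.

```python
import math
import random

def win_prob(rating_a, rating_b, scale=10.0):
    """Hypothetical stand-in for the fitted model: neutral-field chance
    that team A beats team B, logistic in the rating difference."""
    return 1.0 / (1.0 + math.exp(-(rating_a - rating_b) / scale))

def round_robin_strength(ratings):
    """Average win probability against every other team."""
    strength = {}
    for team, rating in ratings.items():
        probs = [win_prob(rating, opp_rating)
                 for opp, opp_rating in ratings.items() if opp != team]
        strength[team] = sum(probs) / len(probs)
    return strength

def simulate_wins(schedule, ratings, n_sims=1000, seed=0):
    """Replay the schedule n_sims times; return average wins per team."""
    rng = random.Random(seed)
    totals = {team: 0 for team in ratings}
    for _ in range(n_sims):
        for team_a, team_b in schedule:
            if rng.random() < win_prob(ratings[team_a], ratings[team_b]):
                totals[team_a] += 1
            else:
                totals[team_b] += 1
    return {team: wins / n_sims for team, wins in totals.items()}

# Made-up ratings and schedule, purely for illustration.
ratings = {"Team A": 15.0, "Team B": 5.0, "Team C": -5.0, "Team D": -15.0}
schedule = [("Team A", "Team B"), ("Team A", "Team C"),
            ("Team B", "Team D"), ("Team C", "Team D")]
strength = round_robin_strength(ratings)
avg_wins = simulate_wins(schedule, ratings)
```

Averaging over many replays is what pulls the win totals toward the middle: a team that is a 70% favorite in every game is credited with 0.7 wins per game, not a full win.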
I've also built a Shiny App at https://mattmills49.shinyapps.io/Win_Totals_shiny that lets you view win totals for any team. It's still a work in progress and I haven't listed the team names out yet, so you'll have to do your best to guess the right name (for example, to view Ole Miss you'd actually have to enter "Mississippi").
I think my projections are pretty conservative, so the win totals are probably lower than you would see at ESPN or Football Outsiders. But that's basically it. I hope to be doing more around these projections as the season gets closer, so give me a follow on Twitter or pay attention to Football Study Hall if you'd like some more updates. And as always, any comments or criticisms are welcome in the comments.