This is the 2nd installment of a series of articles on third downs in college football. You can find the most recent one here.
In my first post I showed third down conversion rates by distance to go for a first, as well as the pass % for each distance to go. The one thing that stuck out to me was an apparent spike in conversion % at third-and-13. Yes, it could easily be a small sample size issue, but my hope is after this article I will be able to convince both myself and you that there is something else at work here. First, I need to thank CFB Stats for making a ton of college football data available for the past eight seasons; this wouldn't be possible without them. Let's now get a look at exactly what I am talking about when it comes to third-and-13.
This is the same chart as before. It shows conversion %, pass conversion %, and rush conversion % for all third downs on non-garbage-time plays in games between 1-A teams over the last five seasons. The non-garbage-time conventions are the same ones used by the F/+ ratings. The chart shows a clear spike in pass conversion % at third-and-13 compared to third-and-12 and third-and-14. In addition, this spike is only seen in passing plays. No season showed a spike in run conversion % at this distance, and no season showed a decline in pass conversion % at this distance like you would expect if there was nothing going on here. So in the last five years, third-and-13 has a higher pass conversion % than third-and-11 or third-and-12. But does this actually tell us anything about predicting third down performance in the future?
Sample Size?
Here are the sample sizes for third downs where a passing play occurred from 11, 12, 13, 14, and 15 yards to go for a first down.
Distance to Go | 11 | 12 | 13 | 14 | 15 |
Total | 2480 | 1924 | 1547 | 1235 | 1278 |
Conversions | 642 | 431 | 408 | 264 | 217 |
Conversion % | 25.9% | 22.4% | 26.4% | 21.4% | 17.0% |
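For anyone who wants to check the arithmetic, here is a minimal Python sketch that uses only the totals and conversions from the table above. It is not the code behind the original analysis, and the names `conversion_rate` and `flips_to_match` are purely illustrative. It recomputes each conversion %, then counts how many third-and-13 conversions would have to become failures for that rate to drop to the third-and-12 rate.

```python
# Passing third downs, copied from the table above.
totals = {11: 2480, 12: 1924, 13: 1547, 14: 1235, 15: 1278}
conversions = {11: 642, 12: 431, 13: 408, 14: 264, 15: 217}


def conversion_rate(distance):
    """Share of passing third downs at this distance that were converted."""
    return conversions[distance] / totals[distance]


def flips_to_match(distance, target_distance):
    """Conversions at `distance` that would have to become failures for its
    rate to fall to the rate at `target_distance` (attempts held fixed)."""
    target_conversions = conversion_rate(target_distance) * totals[distance]
    return conversions[distance] - target_conversions


for d in sorted(totals):
    print(f"third-and-{d}: {conversion_rate(d):.1%}")

flips = flips_to_match(13, 12)
share = flips / totals[13]
print(f"flips needed: {flips:.1f} ({share:.1%} of third-and-13 passes)")
```

Holding the number of attempts fixed is the key assumption here: only outcomes change, not how often teams face third-and-13.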
Something to note: roughly 61 successful third-and-13s over the last five years (about 4% of all passing third-and-13s that occurred) would have to change to unsuccessful in order for the conversion % to be the same as third-and-12. Perhaps that gives more strength to the thought that the sample is large enough to run some statistical tests on the data. One test we can run to determine whether two samples are significantly different is a Student's T-Test. It tests the difference between the means of two samples, taking into account sample size and variance, so it can pick up small effects in large samples as well as large effects in small samples.
A low P-value (something less than .1/.05/.025, depending on the rigor you want) tells us that the difference in the means of the two samples is significantly different from 0. In addition, the Student's T-Test was developed by a statistician at the Guinness Brewery, which is far and away the coolest statistics history lesson I have heard. We can test each distance-to-go against the others to see whether they are in fact different. The P-value in this table is the chance of seeing a gap at least this large between two distances if their true conversion rates were essentially the same.
Distance To Go | 11 | 12 | 13 | 14 | 15 |
11 | X | 0.007 | 0.733 | 0.002 | ~0 |
12 | 0.007 | X | 0.007 | 0.496 | 0.001 |
13 | 0.733 | 0.007 | X | 0.002 | ~0 |
14 | 0.002 | 0.496 | 0.002 | X | 0.005 |
15 | ~0 | 0.001 | ~0 | 0.005 | X |
You read this chart like this: if third-and-13 and third-and-12 really had the same conversion %, the chance of seeing a gap as large as the one in this data set is less than 1% (.007). There are some caveats I kind of ignored here. The T-Test assumes the sample means are normally distributed, and the data isn't quite a normal bell curve.
In addition, these were two-sided T-Tests even though I was specifically looking at whether third-and-13 was greater than the others, not just different. But what is important is that the sample sizes do give us enough statistical evidence to say that there is a difference in means.
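As a rough cross-check on the P-value table, here is a minimal Python sketch of a closely related test: a pooled two-proportion z-test. This is a swapped-in technique, not the Student's T-Test itself and not the code behind the table, but on made/missed conversion data with samples this large the two should land very close together. The function name `two_proportion_z_test` is illustrative, and the counts are copied from the sample size table. The sketch also returns a one-sided P-value, which matches the "is third-and-13 greater?" question.

```python
import math

# Passing third downs, copied from the sample size table.
totals = {11: 2480, 12: 1924, 13: 1547, 14: 1235, 15: 1278}
conversions = {11: 642, 12: 431, 13: 408, 14: 264, 15: 217}


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled z-test of H0: the two conversion rates are equal.

    Returns (z, two_sided_p, one_sided_p), where one_sided_p is for the
    alternative that rate 1 is greater than rate 2.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Standard normal tail areas expressed through erfc.
    two_sided_p = math.erfc(abs(z) / math.sqrt(2))
    one_sided_p = 0.5 * math.erfc(z / math.sqrt(2))
    return z, two_sided_p, one_sided_p


for other in (11, 12, 14, 15):
    z, p_two, p_one = two_proportion_z_test(
        conversions[13], totals[13], conversions[other], totals[other]
    )
    print(f"13 vs {other}: z={z:+.2f}  two-sided p={p_two:.3f}  one-sided p={p_one:.3f}")
```

When third-and-13 has the higher rate, the one-sided P-value is simply half of the two-sided one, so switching to a one-sided test would only strengthen the comparisons against third-and-12 and third-and-14.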
Performance On Third Downs
A reader in my first post suggested that I should look at how teams are performing on a yards-per-play basis to see if anything was happening. So let's look at a frequency histogram of how many yards teams gained on all third-down passing plays in my data set within the last five years.
The average third down passing play (including sacks) gained 5.7 yards, but the median was zero yards (about 25,000 third down passing plays gained 0 yards, nearly half of all third down passing plays). This is neat to know, but what I am really interested in is what happens when teams have 11, 12, 13, 14, or 15 yards to go. The histograms for those distances are by percent, as opposed to total number of observations, with bucket widths of 1 yard.
(I am using R to make these plots, and I could think of no better way to display the information. If you have any ideas or suggestions to make these easier to compare then please shoot me an email or comment.)
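To make the "by percent, 1-yard buckets" idea concrete, here is a minimal Python sketch. The plots in this post were made in R, so this is a stand-in rather than the actual plotting code, and the `yards_gained` list is a made-up handful of plays, not real data. It shows how a pile of zero-yard plays can leave the median at zero while a few big gains pull the average up, and how converting counts to percentages lets distances with different sample sizes be compared on one scale.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical yards gained on a few third-down passes (NOT real data).
yards_gained = [0, 0, 0, 0, -7, 5, 12, 0, 23, 8, 0, 14]


def percent_histogram(yards):
    """Map each whole-yard bucket to the percent of plays that landed in it."""
    counts = Counter(yards)
    n = len(yards)
    return {bucket: 100 * count / n for bucket, count in sorted(counts.items())}


hist = percent_histogram(yards_gained)
for bucket, pct in hist.items():
    bar = "#" * round(pct / 2)  # one '#' per two percentage points
    print(f"{bucket:>3} yds | {bar} {pct:.1f}%")

print(f"mean = {mean(yards_gained):.2f}, median = {median(yards_gained)}")
```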
It's tough to get much out of these plots; it is kind of information overload. What can the histograms of the individual distances tell us? One interesting thing to note is that, at every distance to go, one of the least frequent yardages gained is the exact yardage needed for a first down. I think there is at least one thing at play here: the measuring process for first downs could cause this gap. If the spot is just past the first down marker, the spotter/recorder might round up and say the ball is at the next yard line, and might round down if the offense didn't get there. What do you guys think? I almost feel like the effect should be the opposite, though. If a team knows it has to get to a certain distance, I feel like that distance would show up more often than average, because players might do all they can to reach the first down marker instead of just going out of bounds like they normally would. I'm not sure what to make of it quite yet.
Back to what we wanted to know in the first place, is third-and-13 any different than other distances? Here is the chart we looked at earlier, now with the average yards gained at each distance.
Distance to Go | 11 | 12 | 13 | 14 | 15 |
Total | 2480 | 1924 | 1547 | 1235 | 1278 |
Conversion % | 25.89% | 22.40% | 26.37% | 21.38% | 16.98% |
Avg Yards Gained | 5.76 | 5.60 | 6.56 | 6.20 | 5.58 |
As you can see, teams on third-and-13 gained almost one yard more per play than they did on third-and-11 and third-and-12. Teams at third-and-13 start a yard farther from the marker than teams at third-and-12, but they are making it right back up with the success of their plays. Heck, the average yards gained on passing plays at third-and-13 is higher than the average yards gained on ALL passing third downs ... wow. The average distance "short of a first" is basically the same for third-and-12 and third-and-13 (about 6.4 yards each). So if a higher percentage of plays gain 7.5 yards more than average on third-and-13 than on third-and-12, the third-and-13 conversion % would be higher. And that effect is exactly what we are seeing in the histograms.
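The "short of a first" comparison is just distance to go minus average yards gained. Here is a minimal Python sketch using the averages from the table above (the dictionary names are illustrative):

```python
# Average yards gained on passing third downs, copied from the table above.
avg_yards_gained = {11: 5.76, 12: 5.60, 13: 6.56, 14: 6.20, 15: 5.58}

# Average distance "short of a first" = yards needed minus average yards gained.
avg_short = {distance: distance - gained for distance, gained in avg_yards_gained.items()}

for distance in sorted(avg_short):
    print(f"third-and-{distance}: {avg_short[distance]:.2f} yards short on average")
```

On these numbers third-and-12 and third-and-13 sit within a few hundredths of a yard of each other, while third-and-14 and third-and-15 fall further behind.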
The presentation of this sucks, I know, I just have no idea how else to present it. Fortunately, I think we are done with the histograms.
Potential Causes
But what is causing this effect? I believe third-and-13 represents an optimal distance where offenses can still run their base packages and gain 13+ yards, defenses have to respect the run as a possibility, and offenses don't have to sacrifice extra blockers to go run routes. At third-and-13, perhaps defenses are blitzing less because they don't want to get "burnt," but it is still a short enough distance that offenses can get the yards while holding more potential receivers back as blockers than they might at third-and-15. This last part is testable. Here is the percentage of passing third downs that end in sacks:
Distance to Go | 11 | 12 | 13 | 14 | 15 |
Sack % | 10.2% | 9.6% | 8.6% | 10.1% | 11.0% |
The sample sizes are a little smaller than for the conversions (133 sacks vs 408 conversions on third-and-13), but there is still a clear dip at third-and-13. The total sack % on all third downs was 7.9%. I think this tells us that teams can get more yards on third-and-13 without taking on any increased risk of a sack.
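The sack counts behind those percentages can be roughly recovered by multiplying each sack % by the number of passing plays at that distance. A minimal Python sketch, assuming the totals from the earlier sample size table apply to the sack table as well (the results are approximate because the published percentages are rounded):

```python
# Passing third downs per distance (sample size table) and sack % (table above).
totals = {11: 2480, 12: 1924, 13: 1547, 14: 1235, 15: 1278}
sack_pct = {11: 10.2, 12: 9.6, 13: 8.6, 14: 10.1, 15: 11.0}

# Approximate sack counts; rounding in the published percentages limits precision.
approx_sacks = {d: round(totals[d] * sack_pct[d] / 100) for d in totals}

for d in sorted(approx_sacks):
    print(f"third-and-{d}: about {approx_sacks[d]} sacks on {totals[d]} passes")
```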
What does all this tell us? The data shows that at 13 yards to go, teams convert a higher percentage of passing plays, gain more yards per play, and get sacked on a lower percentage of plays than at either 12 or 11 yards to go. All of these factors together make it seem like there is some real effect going on here. But there is one final test we can run to be sure.
Just one last test, and ... uh oh
Let's just re-run the original analysis on the years 2005-07. I didn't know these years were available when I started the analysis, and someone suggested I test them, so here it goes. Here are the third down conversion %, average yards gained, and sack % by distance to go for 1-A games in 2005-07.
Distance to Go | 11 | 12 | 13 | 14 | 15 |
Total | 1593 | 1203 | 935 | 782 | 820 |
Conversion % | 25.7% | 25.3% | 20.4% | 18.9% | 18.4% |
Avg Yards Gained | 5.25 | 5.77 | 5.01 | 5.35 | 5.50 |
Sack % | 10.6% | 10.5% | 10.8% | 11.9% | 10.7% |
Well.......
Well, that settles it: there is literally ZERO evidence in the 2005-07 data of the third-and-13 effect we were seeing from 2008-12. Either something has changed since 2007 that makes it easier for offenses to convert third-and-13s, or this whole exercise was a lesson in small sample sizes. Either way it was fun, so please comment away with any questions or suggestions.
(If anyone is interested, here is the compiled third down data from 2008-12 I used for this analysis.)
** Edit: Better Gif **