I wanted to learn more about sacks, beyond the fact that they are awesome, so I decided to try to discover their impact on a game, as well as what factors go into forcing and allowing them. This post will cover their effect on the point differential of a game.
As always, I need to thank the wonderful www.cfbstats.com for providing play-by-play data since the 2005 season; this post, like most of my posts, would not be possible without them. The rest of this section details how I created my data set, so feel free to skip to the next section.
I grabbed all play-by-play files, as well as specific information on each rushing and passing play, from the 2005 season through the 2012 season (meaning the season started in those years). I left out last season to use as a test data set in case I needed it. I then restricted my data to games between FBS opponents. To deal with teams moving from FCS to FBS, I used the 2005 FBS list for every season, so some FBS-FBS games from recent years are missing, but given the time it saved me, the loss of a small percentage of data seemed worth it. For some reason the NCAA records sacks as rushing attempts, so my first step was to change all sacks from rushing to passing plays. I then added this subset of data to the passing files to get a complete list of every pass attempt (*) from 2005-2012.
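The reclassification step above is simple bookkeeping. Here is a minimal sketch in Python/pandas (I actually did this in R, and the column names here are made up, not the real cfbstats.com fields):

```python
import pandas as pd

# Hypothetical play-by-play rows; real cfbstats.com columns differ.
plays = pd.DataFrame({
    "play_type": ["rush", "rush", "pass", "rush"],
    "is_sack":   [False, True, False, False],
    "yards":     [4, -7, 12, 2],
})

# The NCAA records sacks as rushing attempts, so re-label any
# sacked play as a pass before building the pass-attempt table.
plays.loc[plays["is_sack"], "play_type"] = "pass"

# Every pass attempt = throws + sacks (scrambles are still lost).
pass_attempts = plays[plays["play_type"] == "pass"]
```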
From there I wanted to filter out garbage-time plays. Garbage time is defined as one team having a lead of more than 28, 24, 21, or 16 points in the 1st, 2nd, 3rd, or 4th quarter, respectively. This removed any plays where teams were getting crushed and throwing the ball no matter what. The rest of the game was still kept in the data set, so only a small percentage of plays were removed. I then found non-garbage-time final scores for each game using the plyr package in R, as well as the sack rate and number of pass attempts for each team in each game. I could finally get to work on answering my initial question: what kind of an impact do sacks have on how many points a team scores?
(*) Pass attempts include only throwing attempts and sacks, not drop-backs where a quarterback was able to scramble for a positive gain. Sorry, just a limitation of the data.
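The garbage-time filter described above is just a quarter-dependent lead threshold. A minimal sketch (again in pandas rather than the R/plyr I actually used, with invented column names):

```python
import pandas as pd

# Garbage-time lead thresholds by quarter, per the definition above.
THRESHOLDS = {1: 28, 2: 24, 3: 21, 4: 16}

def is_garbage_time(quarter: int, score_diff: int) -> bool:
    """True when either team leads by more than the quarter's threshold."""
    return abs(score_diff) > THRESHOLDS[quarter]

# Hypothetical rows: (quarter, offense score minus defense score).
plays = pd.DataFrame({"quarter": [1, 2, 3, 4, 4],
                      "score_diff": [14, -27, 10, -17, 3]})

# Keep only non-garbage-time plays.
kept = plays[~plays.apply(
    lambda r: is_garbage_time(r["quarter"], r["score_diff"]), axis=1)]
```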
The Effect of Sack Rates on Points Scored
The first thing I wanted to test was how much we can learn about how many points a team scores from just the percentage of its drop-backs that ended in a sack: essentially its Offensive Sack Rate. Because the games were filtered for garbage time, some games featured almost no passes, let alone sacks. In fact, two games in the past five years featured non-garbage-time sack rates of 100% (Air Force vs. Hawaii in 2012 and Navy vs. Wake Forest in 2009). I ran a simple correlation study on the relationship between Offensive Sack Rate and Points Scored for every team in every game from 2005-2012. The results are in the following table:
| Sample | Size of Sample | Correlation (r) |
| --- | --- | --- |
| All games (*) | 11,176 | -.26 |
| Games with > 5 drop-backs | 11,090 | -.27 |
| Games with > 10 drop-backs | 10,650 | -.26 |
| Games with > 20 drop-backs | 8,387 | -.25 |
(*) "Games" here means an offense's performance, so there are two offensive performances for each game.
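The correlation study itself is one line per cutoff. A toy pandas version (the real data set has around 11,000 offensive performances; these six rows are fabricated just to show the shape of the computation):

```python
import pandas as pd

# Toy team-game rows: drop-backs, sacks allowed, and points scored.
games = pd.DataFrame({
    "dropbacks": [30, 25, 8, 40, 22, 33],
    "sacks":     [1, 4, 3, 0, 2, 5],
    "points":    [35, 13, 10, 42, 24, 17],
})
games["sack_rate"] = games["sacks"] / games["dropbacks"]

# Pearson r between offensive sack rate and points scored,
# at the same drop-back cutoffs as the table above.
corrs = {}
for cutoff in (0, 5, 10, 20):
    sub = games[games["dropbacks"] > cutoff]
    corrs[cutoff] = sub["sack_rate"].corr(sub["points"])
```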
Here is a graph showing the relationship visually for all offensive performances with more than 5 drop-backs (Sack Rate is measured as a decimal: the percentage of drop-backs that result in a sack):
So obviously there is a lot of variation here, but in general there is a negative trend between the percentage of sacks an offense allows in a game and the number of points that team scores, and really that is all I am interested in.
Sack Rates vs Points Allowed
How much does an offense's sack rate correlate with how many points that team allows on defense? Not much, it turns out:
| Sample | Size of Sample | Correlation (r) |
| --- | --- | --- |
| All games (*) | 11,176 | -.05 |
| Games with > 5 drop-backs | 11,090 | -.06 |
| Games with > 10 drop-backs | 10,650 | -.05 |
| Games with > 20 drop-backs | 8,387 | -.03 |
This makes sense: why would how much an offense gets sacked in non-garbage time tell you anything about how many points that same team allows on defense? There is a slight decrease in correlation as we restrict the size of our sample. The only connection I can think of between a team's offensive sack rate and its points allowed is the talent level of the team.
So I hypothesize that when we restrict our data to games with more than 20 drop-backs, we are evening out the talent gap between the two teams. Games that get into garbage time quickly (think of most FSU games this year) don't give the opposing offense enough time to generate a decent number of pass attempts, so the more we restrict our data, the fewer extreme games we get in our sample. The average margin of victory in all games was 0 (one team's MOV is simply the opposite of the other team's), but the average margin of victory for teams with at least 20 drop-backs in a game was .03, which means the majority of the teams we removed were losing. This seems to support my theory that the correlation drops because the talent gap is narrower.
I could be completely and totally wrong though, just a theory of mine.
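The zero-average point is mechanical: every game contributes two rows with opposite margins, so the unfiltered mean is exactly zero, and any filter that mostly removes losing rows pushes the mean positive. A tiny made-up demonstration:

```python
import pandas as pd

# Two rows per game with opposite margins of victory (fabricated).
games = pd.DataFrame({
    "game_id":   [1, 1, 2, 2, 3, 3],
    "mov":       [21, -21, 3, -3, 35, -35],
    "dropbacks": [25, 24, 30, 28, 33, 12],  # blown-out loser throws little
})

assert games["mov"].mean() == 0  # opposite margins cancel exactly

# A drop-back filter removes the team crushed in game 3,
# so the surviving sample skews toward winners.
kept = games[games["dropbacks"] > 20]
```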
Sack Rate Margin vs Point Differential
Now that we know there is a small relationship between Sack Rate and Points Scored, I wanted to explore how much a team's sack rate margin affects its margin of victory. To define terms, Sack Rate Margin is Sack Rate Allowed (offense) minus Sack Rate Forced (defense), so a good Sack Rate Margin is a negative value (more sacks forced than allowed). Margin of Victory is Points Scored minus Points Allowed, all in non-garbage time. Here is the table, which should be familiar by now, showing the results of this study:
| Sample | Size of Sample | Correlation (r) |
| --- | --- | --- |
| All games (*) | 11,176 | -.37 |
| Games with > 5 drop-backs | 11,090 | -.38 |
| Games with > 10 drop-backs | 10,650 | -.37 |
| Games with > 20 drop-backs | 8,387 | -.33 |
I love plots a lot more, so here is a visual representation of this table (only for teams with at least 6 drop-backs):
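The Sack Rate Margin bookkeeping, and the r-squared that turns a correlation into "variance explained," can be sketched like this (fabricated four-row sample; column names are mine, not the real data's):

```python
import pandas as pd

# One row per team-game: "_off" = sack rate allowed, "_def" = sack rate forced.
df = pd.DataFrame({
    "sack_rate_off":  [0.04, 0.15, 0.00, 0.10],
    "sack_rate_def":  [0.12, 0.02, 0.08, 0.05],
    "points_for":     [31, 10, 38, 20],
    "points_against": [17, 27, 13, 24],
})

# Sack Rate Margin: allowed minus forced, so negative is good.
df["srm"] = df["sack_rate_off"] - df["sack_rate_def"]
df["mov"] = df["points_for"] - df["points_against"]

r = df["srm"].corr(df["mov"])
r_squared = r ** 2  # share of MOV variance "explained" by the margin
```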
- There is a small, negative relationship between the percentage of a team's drop-backs that result in a sack and the number of points that team scores in non-garbage time. A four-percentage-point decrease in sack rate (one fewer sack per 25 drop-backs) would be expected to increase points scored in a game by about two points.
- There is practically no relationship between the percentage of a team's drop-backs that result in a sack and the number of points that team allows in non-garbage time, other than a possible bias from the talent gap within that game.
- 14% of the variation in a team's non-garbage-time margin of victory can be explained by the difference between its sack rate on offense and its sack rate on defense. Teams that sack their opponent more than they get sacked themselves tend to outscore their opponents (I'm telling you guys, this is revolutionary stuff right here). An eight-percentage-point improvement in sack rate margin (one fewer sack on offense and one more sack on defense per 25 drop-backs) would be expected to increase your margin of victory by almost 5 points.
- The correlation drops for each metric as you require more drop-backs. Perhaps the size of the effects I am finding would decrease if I restricted my sample to BCS-BCS games instead of just FBS-FBS.
- This study only considers sacks per pass attempt, not the down or distance on which those sacks occurred. It also doesn't take into account the passing style of teams (it has been shown that teams that throw further downfield get sacked more) or strength of schedule. But don't worry, all that analysis is coming.
- I used non-garbage-time sacks and non-garbage-time points scored because I assumed they would be higher-quality data than raw game totals. I will test this assumption in an upcoming post.