Measuring Skill Player Performance using Multilevel Modeling

Estimating Skill Player Talent using a mathematical modeling tool known as Mixed Effects Models

Trying to determine who was the best running back, receiver, or quarterback throughout the year is usually an exercise in futility. Take one of the most basic statistics we have for measuring performance, yards per attempt. If you look at last year's leaders in rushing yards per attempt, though, you'll notice it's a flawed statistic.

Leaders in Yards Per Attempt for the 2014 season
Rank Name Team YPA Carries
1 Reggie Love Wisconsin 45.0 1
2 Riley Dixon Syracuse 42.0 1
3 Sam Irwin-Hill Arkansas 37.0 2
4 Ridge Jones New Mexico 34.7 3
5 Shane Wynn Indiana 34.5 4
6 DeAndre Smelter Georgia Tech 34.3 3
7 Devin Lucien UCLA 34.0 1
8 Brad Bars Penn State 32.0 1
9 Tyler Trosin Arkansas State 32.0 1
10 Derek Di Nardo Virginia Tech 30.0 1

Notice anything? The most carries by anyone on the list is 4, not exactly a sustained string of excellence. The range of end-of-season yards per carry values for college football players depends almost entirely on the number of carries a runner gets:

num_ypa
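This dependence on carry count is what you'd expect from sampling noise alone. As a rough sketch (simulated data, not the 2014 play-by-play; the per-carry distribution and its parameters are assumptions chosen only for illustration), the following gives every runner the same true ability and shows how the spread of end-of-season yards per carry narrows as carries increase:

```python
import random
import statistics

random.seed(0)

def simulate_carry():
    # Assumed per-carry distribution: about 4.5 yards on average
    # with a standard deviation of about 6 yards (illustrative only).
    return random.gauss(4.5, 6.0)

def season_ypc(n_carries):
    # End-of-season yards per carry for one runner with n_carries attempts.
    return statistics.fmean(simulate_carry() for _ in range(n_carries))

def ypc_spread(n_carries, n_runners=2000):
    # Standard deviation of season YPC across identical simulated runners.
    return statistics.pstdev(season_ypc(n_carries) for _ in range(n_runners))

spreads = {n: ypc_spread(n) for n in (1, 5, 25, 100)}
for n, s in spreads.items():
    print(f"{n:>3} carries: spread in YPC of about {s:.2f} yards")
```

Every simulated runner here is equally talented, so any difference in yards per carry is pure luck, and the one-carry runners are the ones who end up at the extremes.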

Another issue with yards per attempt is that it doesn't account for the quality of opposition faced or the strength of the blocking by your team. Lucky for us, there is a mathematical modeling procedure called Multilevel Modeling that can at least start to account for all three weaknesses of our basic tool for measuring skill player performance: small samples, opposition quality, and team blocking.

I'll present the basics of how Multilevel Models work, the difference between the model estimates and yards per carry, and how the model affects players and teams based on number of carries and quality of opposition. Then I'll discuss some possible next steps. Oh, and all the code for this analysis is available on GitHub, although the data isn't. Sorry, I don't pay the bills. :/

Multilevel Models

If you want a proper introduction to Multilevel Models, you can't do better than to read the one that was posted this past month to the Stitch Fix Technology Blog. You'll notice that they refer to it as Mixed Effects models, and that is also fine. The modeling procedure has many names, but I'll stick with multilevel modeling. For a very basic summary of multilevel models, here is what Andrew Gelman had to say:

Multilevel (hierarchical) modeling is a generalization of linear and generalized linear modeling in which regression coefficients are themselves given a model, whose parameters are also estimated from data

What this means is that we fit a regression using the players and teams themselves as inputs (as dummy variables), with the yards gained on each carry as the dependent variable. Ordinary linear regression can't be used in this situation because runners in college football don't switch teams during a season; Justin Thomas will only run when Georgia Tech has the ball.
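To see why that nesting is a problem, consider a toy version with made-up effects (every name and number below is hypothetical). Because a runner only ever appears with his own team, moving a constant from the team effect to its runners' effects leaves every prediction unchanged, so least squares has no unique answer:

```python
# Hypothetical effects, in yards per carry, for a toy two-team league.
team_effect = {"Team A": 1.0, "Team B": -0.5}
runner_effect = {"Runner 1": 0.8, "Runner 2": -0.2, "Runner 3": 0.3}
runner_team = {"Runner 1": "Team A", "Runner 2": "Team A", "Runner 3": "Team B"}

def predict(runner, teams, runners, baseline=4.5):
    # Predicted yards on a carry: baseline + team effect + runner effect.
    return baseline + teams[runner_team[runner]] + runners[runner]

# Shift 2 yards of credit from Team A to each of its runners.
shift = 2.0
alt_team = dict(team_effect)
alt_team["Team A"] -= shift
alt_runner = {
    r: e + (shift if runner_team[r] == "Team A" else 0.0)
    for r, e in runner_effect.items()
}

for r in runner_team:
    original = predict(r, team_effect, runner_effect)
    shifted = predict(r, alt_team, alt_runner)
    print(r, round(original, 3), round(shifted, 3))
```

Both sets of effects fit the data equally well, which is exactly the ambiguity the model needs some extra assumption to resolve.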

In other words, there is collinearity in our regression variables: each offense's dummy variable is just the sum of its runners' dummy variables. Multilevel models assume that there is natural variation between the individuals in a group and shrink each individual's estimate towards the group mean. The math behind the mixed effects model determines when an individual has done enough for his estimate to deviate from the overall group mean, either by playing very well or very poorly in a limited number of plays or by having enough observations to demonstrate his own ability. This shrinking allows us to fit a model that we couldn't fit with ordinary least squares regression.
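The shrinkage has a simple closed form in the stripped-down case of a single random intercept with known variances. The sketch below covers only that simplified case; the model in this post has three crossed groups and estimates its variances from the data, and the numeric values here (`league_mean`, `between_var`, `within_var`) are assumptions picked for illustration.

```python
def shrunken_estimate(player_mean, n_carries, league_mean=5.0,
                      between_var=1.0, within_var=36.0):
    # between_var: assumed variance of true ability across runners.
    # within_var: assumed variance of yards gained from carry to carry.
    # The weight on the player's own average grows with his carries.
    weight = n_carries * between_var / (n_carries * between_var + within_var)
    return league_mean + weight * (player_mean - league_mean)

one_big_run = shrunken_estimate(player_mean=45.0, n_carries=1)
workhorse = shrunken_estimate(player_mean=6.0, n_carries=250)
print(f"1 carry at 45.0 YPC    -> {one_big_run:.2f}")
print(f"250 carries at 6.0 YPC -> {workhorse:.2f}")
```

With these made-up variances, the single 45-yard carry is pulled most of the way back to the league mean, while the 250-carry runner keeps an estimate close to his raw average.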

Trey Causey has written a post measuring quarterbacks in the NFL using multilevel models, and if you are still really confused and want more detail I'd recommend checking his post out.

Comparison to Yards per Carry

Basically, instead of measuring running backs by their yards per carry, I'm going to fit a Multilevel Model with the runners, offenses, and defenses as random effects and extract the estimated coefficients from each group. This will allow me to measure the estimated impact each running back has on his yards per carry after taking into account the effects of the rest of his offense and the defenses he has faced. We can compare these to the yards per carry values for the runners in the 2014 season, as the model was fit on the 2014 play-by-play data.
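Written out, a model of this kind might look like the following. This is a sketch of a standard crossed random-intercept specification, not a transcription of the fitted model; the symbols and the normality assumptions are mine, chosen to match the description of runners, offenses, and defenses as random effects.

```latex
y_i = \mu + r_{j[i]} + o_{k[i]} + d_{l[i]} + \varepsilon_i,
\qquad
r_j \sim \mathcal{N}(0, \sigma_r^2),\quad
o_k \sim \mathcal{N}(0, \sigma_o^2),\quad
d_l \sim \mathcal{N}(0, \sigma_d^2),\quad
\varepsilon_i \sim \mathcal{N}(0, \sigma_\varepsilon^2)
```

Here y_i is the yards gained on carry i, mu is the overall average, and j[i], k[i], and l[i] pick out the runner, offense, and defense involved in that carry. The estimates reported later in this post sit near typical yards per carry values, which suggests they are on the scale of mu plus the group effect rather than the bare deviation, though that is an inference from the numbers.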

Here is how the distributions of both Yards per Carry and the coefficient estimates from the Multilevel Model look. I trimmed the x-axis just to help with the visuals; there are some runners with yards per attempt outside these values.

runndist

As you can see, the Multilevel Model gives a much tighter distribution for the value of a runner, and the huge outliers in yards per carry are greatly reduced. The effect that the model shrinkage has on runners can best be seen in the following scatterplot that compares a player's yards per carry to his coefficient estimate:

llmmypa

There are two main trends at play here. The first is the steeper line of small dots. These are the players with very few carries but extreme values in yards per carry. This is basically the most that a runner's yards per carry can impact their multilevel model coefficient. Without more carries, each additional 10 yards per carry only gets you about a third of a yard of value on the multilevel model coefficient scale.

The second trend at play is the much flatter line of runners with the larger dots. These are runners with enough carries to establish their own skill level in the model and to set themselves apart from the random variation associated with the overall group of runners. If two runners in this group have similar yards per carry and numbers of carries but different model coefficients, then one of them probably faced tougher opponents or had teammates with similar success. For example, here are the four running backs whose numbers were most similar to Dalvin Cook's last year:

Runner Team YPC Attempts Multilevel Model Coefficient Estimate (NCAA Percentile)
Dalvin Cook FSU 5.929 170 6.127 (96th)
Troy McCormick Utah 5.933 30 5.115 (80th)
Larry Rose NMSU 5.924 186 5.523 (90th)
Robert Council Morgan State 5.923 13 4.849 (71st)
James Conner Pitt 5.922 298 5.746 (93rd)

A good comparison is Larry Rose. New Mexico State obviously played an easier schedule than FSU did, so Dalvin Cook receives a boost in the Multilevel Model even though he has very similar raw numbers to Larry Rose. I think there is a lot more I can discuss on the differences between the Multilevel Model coefficient estimates and Yards per Carry, but for the sake of brevity I'll continue on.

Team Estimates from Multilevel Models

Here are the same plots as before, comparing yards per play and the model output, except now for both offenses and defenses.

off_ypa

def_ypa

Because teams have a lot more observations, their model estimates differ from their raw yards per carry a lot less than individual players' estimates do. But there is still some adjustment going on for the quality of opposition faced.

Conclusions and More Questions

So what's next? I think there are many more areas of research on this topic that need to be explored. The first step is expanding the scope. Why look at just yards per play when you can do success rate, first downs per play, fumbles per play, etc.? You could turn all of these model outputs into a composite score for skill players.

And why stop with running plays? If you had target data you could fit a multilevel model with random intercepts for quarterbacks, receivers, offenses, and defenses.

By the way, I do and I will :)

The tough question is, how would you determine if these models are any good besides an eye test? Is it important for this model to predict future out-of-sample yards per play? Do I care if the model is accurate as long as it separates the good running backs from the bad? Maybe an in-sample validation test is more important, since I want to determine who has been the best *so far* and not necessarily whose talent level is highest. I honestly don't know the answer to these questions and would love some feedback, so please feel free to comment on this article or get in touch on Twitter or by email.
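One way the out-of-sample question could be checked is a simple holdout comparison. The sketch below is only an illustration on simulated data, using a single random intercept with variances assumed known (which flatters the shrunken estimate); none of the constants come from the 2014 data. It asks whether raw or shrunken yards per carry better predicts a runner's average over a set of held-out carries:

```python
import random
import statistics

random.seed(1)
LEAGUE_MEAN, BETWEEN_SD, WITHIN_SD = 5.0, 1.0, 6.0  # assumed values

def carries(true_ability, n):
    # Simulated yards gained on n carries for one runner.
    return [random.gauss(true_ability, WITHIN_SD) for _ in range(n)]

def shrink(mean, n):
    # Pull a raw average toward the league mean; more carries, less pull.
    weight = n * BETWEEN_SD**2 / (n * BETWEEN_SD**2 + WITHIN_SD**2)
    return LEAGUE_MEAN + weight * (mean - LEAGUE_MEAN)

raw_sq_err, shrunk_sq_err = [], []
for _ in range(500):
    ability = random.gauss(LEAGUE_MEAN, BETWEEN_SD)
    n_train = random.randint(1, 40)
    train_mean = statistics.fmean(carries(ability, n_train))
    holdout_mean = statistics.fmean(carries(ability, 100))
    raw_sq_err.append((train_mean - holdout_mean) ** 2)
    shrunk_sq_err.append((shrink(train_mean, n_train) - holdout_mean) ** 2)

raw_rmse = statistics.fmean(raw_sq_err) ** 0.5
shrunk_rmse = statistics.fmean(shrunk_sq_err) ** 0.5
print(f"raw YPC RMSE:      {raw_rmse:.2f}")
print(f"shrunken YPC RMSE: {shrunk_rmse:.2f}")
```

On real data the same idea would mean fitting on one part of the season and scoring predictions on the rest, which speaks to the talent question but not necessarily to who has been the best so far.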

And just for fun here are the top 10 runners, offenses, and defenses according to the multilevel model coefficients.

Top 10 Multilevel Model Coefficient Estimates for Runners
Rank Player Model Estimate
1 Shane Wynn 8.22
2 Marcus Mariota 7.91
3 Jhurell Pressley 7.76
4 Tyler Murphy 7.67
5 Sherman Alston 7.63
6 Jojo Natson 7.60
7 Tanner McEvoy 7.48
8 Matt Davis 7.46
9 Ridge Jones 7.38
10 Ricardo Louis 7.38

Top 10 Multilevel Model Coefficient Estimates for Offenses
Rank Team Model Estimate
1 Georgia Tech 5.76
2 Georgia Southern 5.71
3 Navy 5.49
4 New Mexico 5.48
5 Wisconsin 5.46
6 Marshall 5.30
7 Indiana 5.28
8 Auburn 5.22
9 Oregon 5.17
10 Georgia 5.15

Top 10 Multilevel Model Coefficient Estimates for Defenses
Rank Team Model Estimate
1 Penn State 3.39
2 Alabama 3.49
3 TCU 3.56
4 Ole Miss 3.63
5 Michigan 3.65
6 Florida 3.65
7 Arkansas 3.68
8 Virginia 3.73
9 Clemson 3.80
10 Missouri 3.81