Friday, November 8, 2024

WAR Crimes: Why the concept of WAR simply does not work in football


At the conclusion of the 2024 NFL Draft, the Green Bay Packers were pilloried by most major draft graders, primarily for taking a bunch of safeties and off-ball linebackers. You see, many nerds (not me, different nerds) have put a lot of time and effort into developing “WAR” (Wins Above Replacement) for football, and a big part of creating a WAR statistic is figuring out which positions are more or less valuable, and off-ball linebackers and safeties aren’t valuable. According to these nerds, your team should be composed entirely of quarterbacks.

I kid, but…not really. The Packers have tried to fill the linebacker spot through late-round picks like Kamal Martin and discount free agents like Christian Kirksey. It even worked, temporarily, with De’Vondre Campbell. But generally speaking, the Packers’ inside linebackers have been terrible since Clay Matthews spent a few seasons moonlighting there to prevent mobile quarterbacks from rushing for 200 yards at a time.

In 2010, when the Packers won the Super Bowl, they had Desmond Bishop, who you may remember scooped up an important fumble in that game. He was PFF’s 9th overall linebacker that year with an outstanding 89.1 grade. Bishop was a 6th round pick who managed two excellent seasons for the Packers before injuries robbed him of his effectiveness. Outside of Campbell’s fluky 2021 season, no Packer linebacker has eclipsed a PFF score of 80 since Bishop. The Packers have suffered mightily for it, frequently getting destroyed on the ground in key moments while watching fantastic opposing linebackers like San Francisco’s Fred Warner (90 grade, 2nd overall in 2023) eat up the middle of the field and the run game. By the way, the only linebacker with a higher grade than Warner last season was Buffalo’s Tyrel Dodson, who led the team in tackles, and tackles for loss, in Buffalo’s AFC Divisional Round loss to the Chiefs.

Now, linebackers don’t win you championships — that would be a silly argument. And Dodson was an undrafted free agent who has worked himself into a very good player. Warner was a 3rd round pick and seems well worth it. So, my question to nerds everywhere is, how exactly do you propose football teams go about acquiring linebackers? You Poindexters are great at telling us how NOT to acquire linebackers and safeties, so what exactly is your idea on what teams should do? The Packers have tried not investing in them, which is what you seem to want. The result has been repeated, high-profile, heart-wrenching playoff failure.

WAR and Positional Value

As the foremost expert in turning baseball statistics into football statistics in the entire world, I find the very notion of football WAR offensive. There is of course SOME value in understanding that quarterbacks, who play by far the most valuable single position in any “big team” sport (not including basketball), are much more valuable than running backs, but you don’t need WAR for that. You absolutely SHOULD NOT use WAR for that, and I’m here to tell you why that is.

First of all, baseball (which is rigorously tracked by high-tech computer systems developed by defense contractor-level engineers over a period of three decades) features discrete, easily quantifiable events, a collection of largely individual performances, and a gigantic 162-game sample per player. And even in that case, the WAR stat is still an approximation-at-best of player value. WAR is often represented to the tenths place (for instance, Milwaukee Brewers catcher William Contreras has a Baseball Prospectus WAR of 2.1 as of this writing), but most baseball analytics people will tell you that we should really only use integers, because the precision implied by the tenths digit creates a false sense of accuracy.

That is because WAR is a combination of several different statistics that vary widely in reliability, and combining all of them into a single number brings some uncertainty with it. Baseball defense is harder to measure than baseball offense, and so players who derive much of their value from defense will have a “less certain” WAR than those who get on base and hit dingers.

But that’s enough about baseball. The point is that even though baseball is perfectly set up for WAR to work well, it’s still just an approximation. Football shares almost nothing with baseball in terms of what makes WAR work. In football, the 11 players on the field are interacting with their teammates AND the 11 opponents across from them all the time. The ways that players create value in football versus baseball are completely different, with few analogues at an individual level. But let’s dig a little deeper.

What’s wrong with WAR in football specifically?

1. Maximum Player Value Inequality

In baseball, every player has the capacity to be roughly as valuable as every other player. A starting pitcher can be worth roughly as much as a first baseman, even though their jobs are completely different. In football, the quarterback is SO much more valuable than every other player that the rest of the team is left fighting over WAR scraps.

Baseball teams have a 26-man roster, where players are splitting up something like 60-80 available WAR. In football, you have 48 players who can dress for every game. A top quarterback will be worth something like six wins by himself (and likely more for your Hall-of-Famers). Average-to-good quarterbacks are worth two to three wins by themselves. That means that, generally speaking, you have 47 players splitting value from roughly 11-15 games (not wins, games). Let’s assume that Jordan Love was a 3-WAR QB last year for the 9-8 Packers. That leaves six wins for everyone else, which means that, on average, the other 47 players on the team had a WAR of 0.13. (Note: I’m not even going to get into “replacement level” here, but factoring in replacement level value would reduce this number even more.) Of course, that value would not be evenly distributed, but you see the problem immediately. WAR in baseball isn’t accurate to the tenths place despite all of the advantages we have in measuring baseball versus football, yet to even create a WAR framework in football, we’re forced to deal with value metrics in the hundredths.
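
For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The function name is made up for illustration, the 48-man game-day roster and the 3-WAR quarterback are the same assumptions used above, and like the paragraph above it ignores replacement level entirely.

```python
def average_non_qb_war(team_wins, qb_war, active_players=48):
    """Average WAR left over for each non-QB, ignoring replacement level.

    All inputs are the rough assumptions from the text, not measured values.
    """
    remaining_wins = team_wins - qb_war          # wins not credited to the QB
    return remaining_wins / (active_players - 1)  # split evenly among everyone else


# The Jordan Love example: 9 team wins and an assumed 3-WAR quarterback.
avg = average_non_qb_war(team_wins=9, qb_war=3)
print(round(avg, 2))  # 0.13
```

Bump the quarterback up to six wins and the per-player share for everyone else drops to about 0.06, which is the "fighting over scraps" problem in one line.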

PFF has done the most work trying to develop football WAR. A few years ago, Eric Eager and George Chahrouri published a paper for the Sloan Analytics Conference advancing the following mean WAR values per position (with variance):

Outside of quarterback, which is orders of magnitude more valuable, I don’t know how anyone could argue with any confidence that a corner is so much more valuable than a safety or even a linebacker to justify a bad draft grade for picking a safety or a linebacker.

2. On Defense, the Least Valuable Player Is the Most Important Player

Imagine for a second that the Packer secondary was a baseball team. In this scenario, Jaire Alexander gets to hit just as often as Jonathan Owens or Darnell Savage. In fact, if you hit Alexander at the top of the lineup, he’ll actually get a few additional chances. And if Alexander hits well, he’ll be more valuable, and accumulate more WAR than his teammates. Alexander’s value will manifest itself in excellent hitting, and importantly, his value will count.

Unfortunately, football isn’t anything like this. As a defensive player, it is the opposing offense that dictates how many “at bats” each defensive player gets, and if they feel like Alexander is an elite talent who can hurt them, they can target Darnell Savage or Jonathan Owens instead. Alexander might be awesome, but if there are attractive alternatives for the offense to attack, that value may never manifest in any real way.

As a result, on defense, your lowest WAR player is your most important player, especially if that player is in the secondary. The more poor players you have, the more it weakens stronger players who:

a. Wind up out of position trying to compensate for their bad teammates, and

b. Never see any targets to defend in the first place.
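
A toy model can make the weakest-link idea concrete. This is only a sketch under invented assumptions: the coverage ratings are hypothetical 0-100 numbers, the defenders are generic placeholders rather than real players, and the linear "yards allowed" formula is made up purely for illustration. The one assumption that matters is that the offense always throws at the worst defender.

```python
def yards_allowed_per_target(coverage_ratings):
    """Toy model: the offense always attacks the weakest defender.

    coverage_ratings maps a defender label to a hypothetical 0-100 rating
    (higher is better). The linear formula below is invented for illustration.
    """
    weakest = min(coverage_ratings.values())
    return 15.0 - 0.1 * weakest


secondary = {"star_corner": 90, "second_corner": 70, "safety": 50}
baseline = yards_allowed_per_target(secondary)

# Make the star even better: nobody throws at him, so nothing changes.
better_star = dict(secondary, star_corner=99)

# Upgrade the weakest link instead: now the whole unit improves.
better_safety = dict(secondary, safety=65)

print(baseline)
print(yards_allowed_per_target(better_star))
print(yards_allowed_per_target(better_safety))
```

In this model the star corner's rating never enters the result at all, which is exactly why assigning him an individual WAR number is so shaky.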

This goes for light defensive linemen and slow linebackers who can be run on as well. When Colin Kaepernick ran for 181 yards against the Packers in the playoffs, they still had Sam Shields, who actually had an interception. They had Clay Matthews, who had a sack. Hell, they still had Charles Woodson who had six tackles and two passes defended in that game. But do you know who led the Packers in tackles that day? Brad Jones. Brad fricking Jones had ten tackles. The same Brad Jones who Seattle would target on their fake field goal in the NFC Championship Game a few years later. A.J. Hawk, who was second on the team with eight tackles, did not help matters, and may have been worse than Jones in this game. Off-ball linebackers are traditionally the least impactful players on defense, until they’re not.

Traditional notions of positional value aren’t just inaccurate on defense, they’re conceptually incoherent. Talking about the WAR or relative positional value of an individual corner versus an edge isn’t just wrong, it’s nonsensical. Edge rushers are the one position on defense that can affirmatively create some value all by themselves, but even they are limited by the number of passing downs where the ball doesn’t come out immediately. The value of a good corner can be completely erased by the value of a bad opposite corner or a bad safety if a quarterback can just attack a bad alternative with impunity. If, say, Jonathan Owens can’t cover George Kittle, it doesn’t matter how Jaire Alexander is doing on Brandon Aiyuk. If Darnell Savage is striking out seven times while Jaire Alexander is sitting in the dugout not getting at bats, how can you hope to come up with a meaningful value for Alexander’s WAR, exactly?

3. Outside of Quarterbacks, Any Created Value Can Be Completely Wiped Out by Luck or Referees

The WAR section over at Pro Football Focus has a nice rundown of WAR variability by position, but it also contains this illuminating quote:

The highest player-to-player variability in WAR is generated by defensive linemen, due likely to the existence of the J.J. Watts and Aaron Donalds of the world. Linemen (both offensive and defensive) WAR is suppressed significantly by offsides/illegal procedure penalties. For example, a false start/offsides penalty on first down and 10 yards to go is worth the better part of one expected point. A holding penalty on an offensive lineman or a roughing the passer penalty are worth even more and represent value that is very difficult to accumulate during the other snaps in each [game], each of which consists of a one-on-one battle with little direct effect on the outcome of a play and for which each player wins some and loses some.

There are two important points to take away from this quote. The first is that the value created by most positions is so small that outliers (like Donald and Watt) can, to some extent, throw off the analysis. While the average defensive lineman may be worth peanuts, exceptional players can be worth an enormous amount, even at low value positions.

But the bigger issue takes us back to our first point, about just how small these values are in the first place. A holding penalty can basically wipe out all of the value that an offensive lineman might accumulate in a game. The value accumulated by non-QBs is so small that a referee making a call on something that happens on nearly every play completely swamps that value. And if a 10-yard holding penalty can entirely erase the value of an offensive lineman, what does a fumble recovery, which is almost completely random and far more impactful than a penalty, do to a receiver or a tight end?


Football WAR can never work. The measure of value is too small, the sample size of 17 games is too small, the number of players vying for shares is too large, and football is filled with moments, often determined completely by chance, which are too large for any of this to matter.
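
The sample-size point can be sketched with a simple coin-flip model. Treating every game as an independent 50/50 event is a simplifying assumption of this sketch, not something PFF or anyone else actually models, and the function name is invented. It compares how much luck alone moves a team's winning percentage over a 17-game NFL season versus a 162-game baseball season.

```python
import math


def win_pct_std(games, p=0.5):
    """Standard deviation of winning percentage from luck alone.

    Assumes each game is an independent coin flip with win probability p
    (a binomial model, used here only as a rough illustration).
    """
    return math.sqrt(p * (1 - p) / games)


nfl = win_pct_std(17)
mlb = win_pct_std(162)
print(f"NFL: {nfl:.3f}  MLB: {mlb:.3f}  ratio: {nfl / mlb:.1f}")
# NFL: 0.121  MLB: 0.039  ratio: 3.1
```

Under that assumption, luck by itself is worth about two wins in either direction over a 17-game season, which is more than the total value assigned to almost every non-quarterback.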

But this is good news for the Packers, who just spent most of a draft on off-ball linebackers and safeties. It’s good news because rather than adhering to some purely academic notion of positional value, they have a more sophisticated understanding of cascading value, especially on defense. And they’ve learned this lesson the hard way, through mostly ignoring run defense, mostly focusing on corners and edge players at the expense of the “lesser positions,” and getting repeatedly burned by it.

The bottom line is that you cannot put a definitive number on most football positions. The most valuable player is the quarterback. The second most “valuable” player is the worst player you must put on the field on defense. Everyone else is pretty much the same, and anyone who tells you otherwise probably wants you to pay extra for the advanced stats on their website.
