The Fantasy of Mining Football Stats

Gregory Shamus

All across America, millions of fantasy football fans are pulling all-nighters in their basements, internet cafes or home offices, using tools like OpenOffice to set up a never-ending download of every single statistic from every single play that occurs in the NFL. Using programming languages like Java, the FFers (short for fantasy football players) parse the data according to whatever esoteric parameters these intrepid geeks design in their quest to find the “FF Golden Rule”.

For while the Geeks may inherit the earth, currently they're too busy trying to quantify every single aspect of professional football to realize there is a great big world out there with more interesting problems to solve.

To these geeks, this FF Golden Rule is an algorithm that will both unlock the secret of how to predict which combination of players will win them their league championship, and also allow them to predict which NFL player is the best (ever!) and which NFL team will win as the game progresses.

As of 2011, it was estimated that there were 37 to 47 million fantasy football players in the United States and Canada, and the "sport" has grown into a $1 billion industry. Is it any wonder that NFL Commissioner Roger Goodell will do anything to cater to this one specific audience, even if it means changing the nature of the sport itself to satisfy this profitable parasite that has attached itself to professional football?

These FFers hope to raise the level of football statistical analysis to that brought to baseball by Bill James (the Baseball Abstract) and Billy Beane. Early pioneers like Brian Burke (Advanced NFL Stats), Aaron Schatz (Football Outsiders) and countless other sabermetricians have begun to influence, and even be employed by, professional and college football front offices in the never-ending quest to predict what will happen before the event occurs.

But can something like a football game be quantified to such a degree? And even if it can, should it?

Leaving aside for the moment the question of enjoying a sport for its own sake instead of reducing it to a point scoring math exercise, consider the magnitude of trying to quantify a football game.

An offensive set of downs that doesn't result in a first down, turnover, offensive penalty or touchdown has 1,716 different possible combinations of plays (e.g. incomplete pass, run for no gain, sack, pass for less than a first down). On average an NFL team possesses the ball 12 times a game. Taken as separate and distinct events, that's 20,592 possible combinations of plays per game; over a 16-game season that's 329,472 per team.
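For anyone who wants to check the multiplication, here is a minimal sketch. The 1,716 and 12 are the figures quoted above; the 16-game regular season is an assumption, though it is the only season length that reproduces the 329,472 total.

```python
# Figures quoted in the article.
COMBINATIONS_PER_SERIES = 1_716
POSSESSIONS_PER_GAME = 12
# Assumption: a 16-game regular season, implied by the per-season total.
GAMES_PER_SEASON = 16

per_game = COMBINATIONS_PER_SERIES * POSSESSIONS_PER_GAME
per_season = per_game * GAMES_PER_SEASON

print(f"{per_game:,} combinations per game")      # 20,592 combinations per game
print(f"{per_season:,} combinations per season")  # 329,472 combinations per season
```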

Calculating the probability of each of those combinations doesn't even begin to scratch the surface of what the FFers are attempting to do; they are trying to measure the success rate of 352 offensive players and quantify the probability of success of a given play (and hence the "success" of the player[s] involved) based on upwards of 161 different variables.

In addition to basic metrics such as down and distance, turnover differentials or yards per catch, these geeks are coming up with variables such as the propensity of visiting quarterbacks to throw on third down in the second quarter with less than 7 and a half minutes left while leading by more than four points while playing on natural turf facing into the sun on a cold day in the eleventh week of the season just three games after having a bye week and having flown west for more than three and a half hours. Then they are testing these variables against several years' worth of data.

Basically what the FFers, the sabermetricians and internet sites like Advanced NFL Stats are doing is called data mining. This is the practice of taking a set of historical data points and attempting to find a pattern of cause and effect by using as many different variables as possible and then devising algorithms (rules) based on linear regression to calculate the probability of such events occurring again in the future. If the rule as written doesn't work with a particular piece of historical data, they just shrug it off and create another rule making that particular piece of data an exception to the first rule.
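The "exception to the rule" habit can be sketched in a few lines. This is an illustration under loud assumptions, not anyone's actual model: it swaps the linear regression for a one-line rule, and the games are pure coin flips invented for the example, so there is nothing real to find. Every name here (`make_games`, `base_rule`, `patched_rule`) is hypothetical.

```python
import random

rng = random.Random(7)  # fixed seed so the run is repeatable

def make_games(n):
    """Hypothetical games as (home_favored, home_won) pairs of coin flips."""
    return [(rng.random() < 0.5, rng.random() < 0.5) for _ in range(n)]

history = make_games(200)  # the data the rule is built on
future = make_games(200)   # games the rule has never seen

def base_rule(favored):
    # The first "rule": the favored team wins.
    return favored

# Every historical miss is written off as an exception, keyed by game number.
exceptions = {
    i: won
    for i, (favored, won) in enumerate(history)
    if base_rule(favored) != won
}

def patched_rule(game_number, favored):
    return exceptions.get(game_number, base_rule(favored))

def hit_rate(games, rule):
    hits = sum(rule(i, favored) == won for i, (favored, won) in enumerate(games))
    return hits / len(games)

in_sample = hit_rate(history, patched_rule)
# No recorded exception describes a future game, so only the base rule applies.
out_of_sample = hit_rate(future, lambda i, favored: base_rule(favored))

print(f"exceptions written: {len(exceptions)}")
print(f"accuracy on history: {in_sample:.0%}")
print(f"accuracy on new games: {out_of_sample:.0%}")
```

The patched rule is perfect on the history by construction, which says nothing about how it does on the next 200 games.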

With all their technical knowledge, sophisticated mathematical models and supposed affirmation of success by their brethren in the world of baseball, you would think it only a matter of time before the FF Golden Rule is discovered.

However, if they left their basements more often, or looked up from their computer screens once in a while, or maybe even stopped and let a little bit of the non-football world into their lives, they could learn something from history. What they have lost sight of, with their eyes blinded by the strain of trying to quantify the impact the collective exhalations of the home team's fans have on the opposing quarterback's passer rating, is that what they're trying to do has been tried before in a similar arena: the stock market. It has been tried before, and it has failed.

In the 1990s a company was born on the basis of its founders' claim that they had developed a new and sophisticated method of investing in stocks. David and Tom Gardner leveraged the internet to self-promote their investment advice under the name of Motley Fool. Their stock selection process, as described in their 1996 book "The Motley Fool Investment Guide", was a derivative of a strategy devised by Michael O'Higgins in his 1991 book "Beating the Dow".

The two Fools claimed that their "Foolish Four" methodology of selecting and investing in four stocks out of the 30 different companies in the Dow Jones Industrial Average was tested against two decades of stock data (1973 to 1993) and was found to produce an average annual return of 25.5 percent as compared to the 11.2 percent the Dow returned over that same period. The Fools went on to state that their methodology should "...grant its fans the same 25 percent annualized returns going forward that it has served up in the past".
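To get a feel for how big that claim is, here is a rough sketch of what the two rates would do to the same stake over 20 years. It assumes the quoted averages behave like annually compounded returns, which the article does not confirm, and the $1,000 stake is a made-up figure for illustration.

```python
def grow(start: float, annual_return: float, years: int) -> float:
    """Value of `start` after compounding `annual_return` once a year."""
    return start * (1 + annual_return) ** years

YEARS = 20      # 1973 to 1993, per the article
STAKE = 1_000   # hypothetical starting investment in dollars

foolish_four = grow(STAKE, 0.255, YEARS)  # claimed Foolish Four rate
dow = grow(STAKE, 0.112, YEARS)           # quoted Dow rate
ratio = foolish_four / dow

print(f"Foolish Four: ${foolish_four:,.0f}")
print(f"Dow:          ${dow:,.0f}")
print(f"Ratio:        {ratio:.1f}x")
```

Under those assumptions the claimed method ends up with roughly eleven times the money of the index, which is the kind of gap that ought to invite suspicion.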

The Fools made quite a name for themselves, and a lot of money to boot. With repeated testing by the hundreds of thousands of investors who bought their book and subscribed to their newsletter, it was shown that within the 20-year period of data used in the formulation of their strategy, their Foolish algorithm accurately picked four stocks every year that outperformed the Dow. To their credit, the Fools did advise that testing of the Foolish Four process was ongoing, both backwards in time as well as moving forward by using the model to pick stocks for the upcoming years and then measuring the results.

It's a good thing they issued that advice; by the year 2000, empirical testing of their Foolish Four methodology determined that their "Fool proof" method of picking stocks produced no better results than if investors had simply invested in any of the 30 stocks picked at random.

The statistical degree of confidence of a given model cannot be known without quantifying the number of times the model had to be "tweaked" to overcome a failure to produce a historical result. What happened to the Fools, and what will happen to the FFers is that over time as they add more and more variables to their algorithms and thus create "rules" to govern these variables, incidents where the model didn't work as expected will become exceptions, and such exceptions are themselves a rule. The rules governing exceptions will grow alongside rules governing exceptions to the exceptions, and so on.
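The point about counting the tweaks can be sketched as a small experiment. This is a toy under stated assumptions, not a reconstruction of the Fools' method or of any football model: the games are fair coin flips, each "rule" is nothing but a list of random guesses, and the counts (500 rules, 40 past games, 2,000 future games) are arbitrary numbers chosen for the example.

```python
import random

rng = random.Random(2013)  # fixed seed so the run is repeatable

N_RULES = 500     # hypothetical number of rule variations tried
N_HISTORY = 40    # past games used to pick the "best" rule
N_FUTURE = 2_000  # later games the chosen rule has never seen

def coin_flips(n):
    """A list of n fair coin flips; there is no pattern to discover."""
    return [rng.random() < 0.5 for _ in range(n)]

def accuracy(predictions, outcomes):
    hits = sum(p == o for p, o in zip(predictions, outcomes))
    return hits / len(outcomes)

history = coin_flips(N_HISTORY)
future = coin_flips(N_FUTURE)

# Each rule is a pair of guess lists (past, future), so none has any skill.
rules = [(coin_flips(N_HISTORY), coin_flips(N_FUTURE)) for _ in range(N_RULES)]

# Keep whichever rule happened to look best on the historical games.
best_past, best_future = max(rules, key=lambda r: accuracy(r[0], history))

in_sample = accuracy(best_past, history)
out_of_sample = accuracy(best_future, future)

print(f"best of {N_RULES} rules on history: {in_sample:.0%}")
print(f"same rule on unseen games:  {out_of_sample:.0%}")
```

Try enough rules and one of them will look clever on the history by luck alone; on games it has not seen, it goes back to being a coin flip. Without knowing how many rules were tried, the historical hit rate tells you very little.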

The Fools used 20 years' worth of data to construct their model, and after only four years of trying to "predict" future results of their investment strategy, they failed and had to issue a mea culpa to all of their followers.

They used data on 30 companies over a 20-year span. That's 7,300 days of operations for each company or 219,000 company days for the group. These were national and multi-national companies whose operations could be and were monitored on a daily, if not hourly, basis by professional Wall Street analysts and brokers; if a company had an aging CEO who had missed a public speaking engagement, Wall Street was immediately buying or selling that company's stock in response. The many factors that affect a company's stock performance had been known to the players in the New York Stock Exchange for over 150 years, but the Fools, through data mining, thought they had come up with a way to "beat the system".

The Motley Fool company has moved on from its "Fool Proof" type of investment advice and now offers such cutting-edge advice as "Treat every dollar as an investment" and "Fools look for well-managed companies".

FFers and other sabermetricians currently boast of mining data (but they won't call it that) that stretches all the way back to 2002. That's a mere ten years, or 160 games' worth of data per team, or 5,120 team days for the league. With this paltry amount of data at their fingertips, these sabermetricians are expounding on the "statistical accuracy" of their prognosticating models, albeit only at 65% or so. One such group, Advanced NFL Stats, gave each team a 50 percent probability of winning at halftime of the AFC playoff game between the Denver Broncos and Baltimore Ravens; with two minutes to play in regulation, they were giving the Ravens a 1 percent chance of winning. The Ravens won in double overtime.
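The gap between the two data sets is easy to put side by side. The stock figures are the ones counted earlier in the article; the 32 teams and 16-game season are assumptions, chosen because they reproduce the 160 games and 5,120 team days quoted above.

```python
# Stock data behind the Foolish Four, as counted in the article.
companies, stock_years, days_per_year = 30, 20, 365
company_days = companies * stock_years * days_per_year

# NFL data. Assumptions: 32 teams and a 16-game regular season.
teams, nfl_years, games_per_season = 32, 10, 16
games_per_team = nfl_years * games_per_season
team_days = teams * games_per_team

print(f"{company_days:,} company days")  # 219,000 company days
print(f"{games_per_team} games per team")  # 160 games per team
print(f"{team_days:,} team days")  # 5,120 team days
print(f"ratio: about {company_days / team_days:.0f} to 1")  # ratio: about 43 to 1
```

So the Fools had roughly 43 times as many data points to work with, and their model still failed within four years.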

The problem is, no matter how smart the sabermetricians think they are, they can't possibly know every factor that impacts a player's or team's performance. Take for example passing stats. How do you quantify the impact of the bump rule? It was implemented in the late 1970s, thanks to Mel Blount of the Steelers. So how do you equalize passing effectiveness before and after the rule was put into place? Or the chop block rule; if the league ever decides the playing careers of defensive linemen have some worth and does away with the rule allowing chop blocks, how do you adjust the running stats before and after such a rule to take it into account? Maybe eliminating that rule will bring back balance between rushing and passing, and by doing so, the aerial circus atmosphere of today's NFL games might change. Or to curry favor with the FFers, maybe Goodell imposes some other rule that on its surface has no connection, but in reality (meaning outside of the basement) subtly influences the game in an indeterminable way. How will the sabermetricians handle that in their models if they can't quantify it?

If you arbitrarily decide to forget the stats from before the passing bump rule, you have now just further limited the size of your database, potentially reducing factors that could impact your algorithm. If you make an "educated guess", then you have introduced a subjectivity to your analysis, and you have begun to invalidate what you were trying to objectively measure.

Maybe the sabermetricians and FFers should forget about the FF Golden Rule and instead consider a derivative of that other Golden Rule: Avoid making the mistakes of others before their mistakes fall unto you. Do that and just enjoy the game being played for the drama and athleticism it contains.
