NFL Team Ranking Approximation From 1970 – Current

Yesterday was the last day of my semester and a great day to put one of my favorite models from linear algebra, the Massey Method, to good use. As I revisited my linear algebra book to look back at everything we’d learned throughout the semester, the idea of finding out how NFL teams have ranked from day one until now caught my interest. So I committed to finding an approximation using the Massey Method, and here I will share what I’ve found.

Finding the Data

I first checked the NFL’s site to see how far back the schedules went and noticed that they covered 1970 to the present. I then searched Google for an API or dataset that would give me the score of each game. I found some resources, but they weren’t exactly what I was looking for. So I ended up writing a Python script (source code here) that crawled the week page of each season and scraped the scores and teams for all games.

How the Massey Method Works

The Massey Method works from the number of games played between teams and the point differential in each game. The rating for each team is found using a least squares approximation. Here’s the general algorithm for the Massey Method:

    1. Write down an equation for every single match played. An equation that relates two teams with ratings r1 and r2 and a point differential d can be expressed as r1 – r2 = d.
    2. Collect those equations into a system of the form Ar̄ = p̄, with one row per game and one column per team.
    3. Form the least squares system AᵀAr̂ = Aᵀp̄.
    4. Change the left matrix of this system by setting the last row to a row consisting solely of the number 1. Also change the bottom entry of the new right vector to 0. This is needed because AᵀA is singular on its own (adding the same constant to every rating leaves each difference r1 – r2 unchanged); the replaced row forces the ratings to sum to 0.
    5. Now solve the system to determine the ratings, and sort them to get the ranking.
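To make the steps above concrete, here is a minimal sketch in Python using NumPy. It is not the script I used for the results below. The function name `massey_ratings`, the `(team_a, team_b, margin)` tuple layout, and the three made-up teams are all assumptions chosen for illustration.

```python
import numpy as np

def massey_ratings(games, teams):
    """Return a dict mapping each team to its Massey rating.

    games: list of (team_a, team_b, margin) tuples, where margin is
           team_a's points minus team_b's points (hypothetical layout).
    teams: list of all team names appearing in games.
    """
    index = {team: i for i, team in enumerate(teams)}

    # Steps 1-2: one row per game, encoding r_a - r_b = margin.
    A = np.zeros((len(games), len(teams)))
    p = np.zeros(len(games))
    for row, (team_a, team_b, margin) in enumerate(games):
        A[row, index[team_a]] = 1.0
        A[row, index[team_b]] = -1.0
        p[row] = margin

    # Step 3: least squares (normal) system A^T A r = A^T p.
    M = A.T @ A
    b = A.T @ p

    # Step 4: replace the last row with ones and the last entry with 0,
    # which forces the ratings to sum to zero.
    M[-1, :] = 1.0
    b[-1] = 0.0

    # Step 5: solve for the ratings.
    r = np.linalg.solve(M, b)
    return dict(zip(teams, r))

# Made-up example: A beat B by 7, B beat C by 3, A beat C by 10.
teams = ["A", "B", "C"]
games = [("A", "B", 7), ("B", "C", 3), ("A", "C", 10)]
ratings = massey_ratings(games, teams)

# Print teams from highest to lowest rating.
for team, rating in sorted(ratings.items(), key=lambda kv: kv[1], reverse=True):
    print(team, round(rating, 3))
```

Because the three made-up results are consistent with each other, the ratings reproduce every margin exactly and sum to zero. With real data the margins conflict, and the solution is the best fit in the least squares sense. Note that the sketch assumes every team is connected to every other through some chain of games; otherwise the system stays singular even after step 4.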

Final Results

In total, over roughly 49 years of NFL history, 11,652 games were played across 32 teams. This counts only regular season and playoff games. Preseason games are skewed and not worth adding to the data, since coaches mainly use that time to test out rookies and evaluate their team as a whole unit. That being said, here are the results that my script returned:

Team Rank
Pittsburgh Steelers 3.707942162671477
Dallas Cowboys 2.9660468156960293
Baltimore Ravens 2.848222524156146
New England Patriots 2.666908177632458
Denver Broncos 2.3116494180207723
San Francisco 49ers 2.190527900025692
Miami Dolphins 2.0144090483449753
Minnesota Vikings 1.3172208931756877
Green Bay Packers 1.175465583138116
Washington Redskins 1.0590513996484625
Philadelphia Eagles 0.9921052597864558
Kansas City Chiefs 0.8187838466188585
Seattle Seahawks 0.7669103717005967
Los Angeles Chargers 0.47777279702627246
Oakland Raiders 0.37408321617291307
Los Angeles Rams -0.2173506542743492
New York Giants -0.2642147178334522
Carolina Panthers -0.3937363102811801
Chicago Bears -0.6790915673325525
Cincinnati Bengals -0.7162153192730953
Indianapolis Colts -0.773696735004739
Buffalo Bills -1.1664086499973125
Jacksonville Jaguars -1.367965684118028
New York Jets -1.3913297554590114
New Orleans Saints -1.4872441285045244
Tennessee Titans -1.65339271127078
Atlanta Falcons -1.9260637829642524
Detroit Lions -2.0466076262955615
Houston Texans -2.281760321146092
Cleveland Browns -2.6836219194417277
Arizona Cardinals -2.979684378641172
Tampa Bay Buccaneers -3.6587151519770678

Analysis of Results

As you may have already noticed, the table above is sorted in decreasing order, with the highest-rated team at the top and the lowest-rated team at the bottom. But what does that really mean in terms of the question we asked at the start: which teams have the greatest point differential? The higher a team sits in the table, the better its point differential against the opponents it has faced. For example, the Pittsburgh Steelers tend to win games by a greater point differential than the Dallas Cowboys, the Dallas Cowboys tend to win games by a greater point differential than the Baltimore Ravens, and so on. It is important to note that a higher point differential ranking does not necessarily mean that one team is better than another.
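One way to read the numbers follows from the model's own equation, r1 – r2 = d: the difference between two ratings is the point differential the fit assigns to a game between those two teams. The short sketch below does that arithmetic for a few rows copied from the table. The helper name `fitted_margin` is made up for illustration, and the output is a property of the least squares fit, not a prediction I have validated.

```python
# Ratings copied from the results table above.
ratings = {
    "Pittsburgh Steelers": 3.707942162671477,
    "Dallas Cowboys": 2.9660468156960293,
    "Tampa Bay Buccaneers": -3.6587151519770678,
}

def fitted_margin(team_a, team_b, ratings):
    """Point differential the model assigns to team_a versus team_b.

    This is just r_a - r_b from the equation r1 - r2 = d.
    """
    return ratings[team_a] - ratings[team_b]

top_two = fitted_margin("Pittsburgh Steelers", "Dallas Cowboys", ratings)
top_vs_bottom = fitted_margin("Pittsburgh Steelers", "Tampa Bay Buccaneers", ratings)

print(round(top_two, 2))        # gap between the top two teams
print(round(top_vs_bottom, 2))  # gap between the top and bottom teams
```

The gap between the top two teams is under a point, while the gap between the top and bottom of the table is a little over seven points, so neighboring positions in the table are separated by very little.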

Conclusions

There are many ways this data could’ve been used to extract meaningful information, but I found this approach particularly interesting for two reasons: (1) I’m very fond of the Massey Method, specifically the least squares approximation aspect of it, and (2) I wanted to see which teams tend to score more than their opponents on average. At the very least, this project was a great way to test a data approximation model on real data, and that is something I value. I hope you found this helpful or at least an interesting read. As usual, feel free to express your opinions and concerns below and I’d be glad to respond.

Sources

  • Featured image can be found here.
