Simulating the College Football Playoff

Monte Carlo Simulation Provides Us Insights

Welcome to the “Illumined Insights” newsletter! Thank you so much for subscribing. This weekly newsletter touches on all things analytics and data science with a focus on areas such as data visualization, AI, and sports analytics.

This week we take a look at the development of a Monte Carlo simulation model for the College Football Playoff. We also look at what such a simulation would look like for a 12-team version of the playoff.

Stephen Hill, Ph.D.

In this week’s edition of the newsletter we develop a Monte Carlo simulation model of the College Football Playoff to help us answer the question: Who will raise the trophy in Houston on January 8th? Monte Carlo simulation is a computational technique that uses random sampling to approximate the behavior of complex systems. In a Monte Carlo simulation model, random values are generated from probability distributions, allowing for the study of outcomes under different scenarios. This method is particularly useful when analytical solutions are difficult or impossible to develop. Monte Carlo simulation is widely used in fields like finance, engineering, and physics to help model uncertainty and variability.

So how does this technique work? Let’s look at a simple example: flipping a coin. When we flip a coin, we have two possible results: heads and tails. We assume that the probability of getting a heads is 0.5 and getting a tails is also 0.5. We’ll assume that heads and tails are the only possible outcomes of a coin flip (ignoring the possibility of the coin ending on its edge or some other bizarre outcome). All we have to do now is map the probabilities to the number line between 0 and 1. Let’s assign heads to the values between 0 and 0.5 and tails to the values between 0.5 and 1. Note that it doesn’t matter which order we do the mapping (i.e., tails could be mapped from 0 to 0.5, if we wanted). We end up with a number line that looks like the image below.

Coin flip probabilities (drawn with https://excalidraw.com/)

So how do we actually simulate the result of a coin flip? We use a random number generator that can generate values between 0 and 1! In Excel, we would use the “RAND” function. In R, we use the “runif” function, which draws values from a uniform distribution. The example below shows how the “runif” function would be used to generate a single random number between 0 and 1. The “set.seed” function is used to ensure that the random number generation process can be replicated. The seed number is, itself, arbitrary. If we all run this code with the same seed, we should all get the same random number.

R runif code

Running this code yields a random number of 0.2875775. What coin flip result does this map to? Because the random number is between 0 and 0.5, we would say that the coin flip result is a “heads”.

Coin flip probability mapping
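The single draw and its mapping to a coin flip, as described above, can be sketched in a few lines of R. The seed value of 123 is an assumption (the seed is not stated in the text), although with R's default generator it should reproduce the 0.2875775 quoted above. The variable names `u` and `flip` are illustrative only.

```r
# Seed 123 is an assumption; any fixed seed makes the draw reproducible.
set.seed(123)

# Draw one value from the uniform distribution on (0, 1).
u <- runif(1)
print(u)

# Map the draw onto the number line: below 0.5 is heads, otherwise tails.
flip <- if (u < 0.5) "heads" else "tails"
print(flip)
```

Changing the seed changes the draw, but running the script twice with the same seed gives the same result both times.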

We could simulate flipping many coins by generating many random numbers and mapping each to a coin flip result. We can then examine the distribution of coin flip results. For a simulation of 100 coin flips, we get 53 “heads” and 47 “tails”.

Coin flip simulation results
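Scaling up from one flip to many only requires asking “runif” for more values. The sketch below simulates 100 flips and tallies them. The seed is arbitrary and is not necessarily the one behind the 53/47 split reported above, so the counts it prints may differ.

```r
# Arbitrary seed (an assumption), used only for reproducibility.
set.seed(2023)

n_flips <- 100

# One uniform draw per flip, each mapped to heads or tails.
u <- runif(n_flips)
flips <- ifelse(u < 0.5, "heads", "tails")

# Tally the outcomes; fixing the factor levels keeps both categories
# in the table even if one of them never occurs.
counts <- table(factor(flips, levels = c("heads", "tails")))
print(counts)
print(counts / n_flips)
```

With only 100 flips, splits a few points away from 50/50 are routine. Increasing `n_flips` pulls the proportions closer to 0.5.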

So what does all of this have to do with college football simulation? Well, if we have a probability estimate that a team will win a match-up with another team, we can treat the simulation of the game just like a coin flip, but with potentially uneven probabilities associated with the outcomes. For example, if Team A has a 0.75 probability (75%) of defeating Team B, we simply set up the number line accordingly: values between 0 and 0.75 map to a Team A win and values between 0.75 and 1 map to a Team B win. We can then generate random numbers as we did before to simulate the game many times.

Team A versus Team B probabilities
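The Team A versus Team B example translates directly into code: the only change from the coin flip is the cut point on the number line. In the sketch below, `simulate_game` is a hypothetical helper written for illustration and the seed is arbitrary.

```r
# Hypothetical helper: returns the winner of one simulated game, where
# p_a is the probability that team_a beats team_b.
simulate_game <- function(team_a, team_b, p_a) {
  if (runif(1) < p_a) team_a else team_b
}

set.seed(42)  # arbitrary seed, for reproducibility only
n_sims <- 10000

# Play the same game many times and record each winner.
winners <- replicate(n_sims, simulate_game("Team A", "Team B", 0.75))

# The share of simulated wins should land close to the input probability.
share_a <- mean(winners == "Team A")
print(share_a)
```

A single game tells us little, but over many repetitions Team A's share of wins settles near 0.75.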

Where do we get probabilities for college football game match-ups? We could make our own estimates or we could use estimates produced by others. For this article, we use the probability estimates provided by the Massey Ratings match-up tool (https://masseyratings.com/game.php?s=cf2023). This tool allows us to select any two college football teams and then generates win probability estimates for the game. For example, if we choose Alabama playing Tennessee at a neutral site we get the following:

Alabama vs. Tennessee Massey Ratings match-up probabilities

We’ll use the probabilities generated by the Massey tool to help us simulate the 2023 College Football Playoff. As a fun extension of this, we’ll then simulate a 12-team version of the Playoff for 2023.

Let’s get started by taking a look at the bracket for the four-team 2023 College Football Playoff. Top-seeded Michigan takes on #4 Alabama in the Rose Bowl and #2 Washington plays #3 Texas in the Sugar Bowl. The winners of these games play for the National Championship in Houston.

2023 College Football Playoff bracket (produced with https://brackethq.com/maker/)

We need probabilities for each potential match-up. We get these from Massey’s tool.

2023 College Football Playoff win probabilities (from https://masseyratings.com/game.php?s=cf2023)

We now need code to help us simulate the playoff, record the results, and present the results. There’s quite a bit of code, so I’ll refer readers to a GitHub repository: https://github.com/stephenhillphd/CFPPlayoffSim.
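For readers who want the gist before opening the repository, here is a minimal sketch of how a four-team bracket could be simulated. It is not the code from the repository. The win probabilities are HYPOTHETICAL placeholders rather than the Massey estimates, and the helper names (`set_prob`, `play`, `simulate_playoff`) are made up for illustration.

```r
teams <- c("Michigan", "Washington", "Texas", "Alabama")

# p[a, b] holds the probability that team a beats team b.
p <- matrix(NA_real_, 4, 4, dimnames = list(teams, teams))

# Fill both directions of a match-up at once.
set_prob <- function(p, a, b, p_a) {
  p[a, b] <- p_a
  p[b, a] <- 1 - p_a
  p
}

# HYPOTHETICAL placeholder probabilities, NOT the Massey estimates.
p <- set_prob(p, "Michigan",   "Alabama",    0.55)
p <- set_prob(p, "Washington", "Texas",      0.45)
p <- set_prob(p, "Michigan",   "Washington", 0.60)
p <- set_prob(p, "Michigan",   "Texas",      0.55)
p <- set_prob(p, "Alabama",    "Washington", 0.60)
p <- set_prob(p, "Alabama",    "Texas",      0.55)

# One game: a weighted coin flip using the looked-up probability.
play <- function(a, b) if (runif(1) < p[a, b]) a else b

# One playoff: two semifinals, then the winners meet for the title.
simulate_playoff <- function() {
  semi_1 <- play("Michigan", "Alabama")   # Rose Bowl
  semi_2 <- play("Washington", "Texas")   # Sugar Bowl
  play(semi_1, semi_2)                    # National Championship
}

set.seed(1)  # arbitrary seed
n_sims <- 10000
champions <- replicate(n_sims, simulate_playoff())

# Share of simulations in which each team wins the title.
title_share <- sort(table(champions) / n_sims, decreasing = TRUE)
print(round(title_share, 3))
```

Swapping the placeholder values for the Massey probabilities is the only change needed to turn this into a rough version of the four-team model.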

Let’s examine the results of simulating the 2023 College Football Playoff 10,000 times. Note that I’m using the awesome “gt” package for R together with the “cfbplotR” package to produce the table and include team logos.

2023 College Football Playoff Simulation Results

Michigan wins the National Championship about 33% of the time across 10,000 simulations with Alabama a close second at about 30%.

Is this simulation model any good? One way to validate the model would be to compare the model results with probabilities from a different model. For example, how do the probabilities compare to those derived from odds from betting markets? The table below shows the current (as of December 15th) National Championship odds from the FanDuel Sportsbook. The “FanDuel Odds” column expresses the odds for each of the four teams to win the National Championship (given in American-style odds). These odds can be converted to implied probabilities (expressed as percentages in the “Odds Implied %” column).

You’ll notice that the implied percentages sum to greater than 100%. This occurs due to the sportsbook’s “vig” or edge that is incorporated into the various wagers to help the sportsbook make a profit. We can remove the vig to get “vig-free” odds. These percentages sum to 100%. We can compare the vig-free percentages with the percentages from our simulation model. It looks like our model is pretty good (or at least well-aligned with the betting markets).

2023 College Football Playoff Odds Comparison
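The odds conversion described above can be sketched as follows. For American odds, a positive line of +X implies a probability of 100 / (X + 100), and a negative line of -X implies X / (X + 100). The odds below are HYPOTHETICAL placeholders for four generic teams, not the FanDuel values in the table. Rescaling the implied probabilities so they sum to one is a simple way to remove the vig, and it is an assumption that the article uses this method.

```r
# Convert American odds to implied probabilities.
american_to_implied <- function(odds) {
  ifelse(odds > 0,
         100 / (odds + 100),      # underdog-style line, e.g. +200
         -odds / (-odds + 100))   # favorite-style line, e.g. -150
}

# HYPOTHETICAL odds for illustration only.
odds <- c(TeamA = 180, TeamB = 200, TeamC = 300, TeamD = 350)

implied <- american_to_implied(odds)

# The overround: how far the implied probabilities exceed 100%.
overround <- sum(implied) - 1

# Proportional normalization (an assumed method) removes the vig.
vig_free <- implied / sum(implied)

print(round(100 * implied, 1))
print(round(100 * overround, 1))
print(round(100 * vig_free, 1))
```

The vig-free values can then be placed side by side with the simulated title percentages to compare the two.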

What do you think about these simulation results?

Let’s have a bit more fun with simulation and consider a “what might have been” had the 12-team version of the College Football Playoff been implemented for the 2023 season. The 12-team playoff dramatically reshapes the playoff field by imposing the following rules:

  • The six highest-ranked conference champions receive an automatic berth in the playoff. In practice, this almost guarantees that the Power 5 conference (ACC, Big 10, Big 12, Pac-12, and SEC) champions earn automatic bids. They are then accompanied by a sixth champion from the Group of 5 conferences (American, C-USA, MAC, Mountain West, and Sun Belt). It remains to be seen how this will play out with the Pac-12 reduced to just two teams starting in 2024.

  • The top four conference champions receive byes to the quarterfinal round. A team that is not a conference champion cannot receive a bye.

  • In the first round, the 5, 6, 7, and 8 seeds will play home games against the 12, 11, 10, and 9 seeds, respectively.

  • Quarterfinals, semifinals, and the national championship game will be played at neutral sites.

Based on the 2023 College Football Playoff final rankings, the teams that would participate in a 12-team playoff would be:

  • #1 Michigan (Big 10 champion)

  • #2 Washington (Pac-12 champion)

  • #3 Texas (Big 12 champion)

  • #4 Alabama (SEC champion)

  • #5 Florida State (ACC champion)

  • #6 Georgia (at-large)

  • #7 Ohio State (at-large)

  • #8 Oregon (at-large)

  • #9 Missouri (at-large)

  • #10 Penn State (at-large)

  • #11 Ole Miss (at-large)

  • #12 Liberty (Conference USA champion)

The bracket then looks like:

12-team 2023 College Football Playoff bracket (produced with https://brackethq.com/maker/)

Again, we use Massey’s match-up tool to find estimated probabilities for each potential match-up in the playoff.

2023 12-team playoff win probabilities (from https://masseyratings.com/game.php?s=cf2023)

We then simulate the 12-team playoff 10,000 times. As with the four-team playoff, the R code for the simulation model is available on GitHub (https://github.com/stephenhillphd/CFPPlayoffSim). Let’s take a look at the results:

2023 12-team College Football Playoffs Simulation Results

The top three teams are the same as in the four-team version of the playoff, but the order is a bit different, with Alabama taking the top spot with an 18.4% chance to win the championship. The #2 seed Washington is only the seventh most likely team to win under this format, as the potential match-up with Ohio State in the quarterfinals drags the Huskies’ chances down.

For fun, let’s take a look at the combinations of teams that meet for the championship across the 10,000 simulations. As you might expect, Alabama, Michigan, Ohio State, Georgia, and Texas feature prominently. Scrolling to the bottom of the match-ups list, how would you feel about a Liberty versus Ole Miss national title game? This combination occurred in 2 of the 10,000 simulations.
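The 12-team simulation and the tally of title-game pairings can be sketched along the same lines. This is not the repository code, and several pieces are assumptions. The ratings are HYPOTHETICAL placeholders (all equal, so every game is a 50/50 flip), and the logistic link from rating difference to win probability stands in for the Massey lookups. The quarterfinal and semifinal pairings are also assumed, although they are consistent with the Washington and Ohio State quarterfinal mentioned above.

```r
# Teams in seed order, taken from the list above.
teams <- c("Michigan", "Washington", "Texas", "Alabama", "Florida State",
           "Georgia", "Ohio State", "Oregon", "Missouri", "Penn State",
           "Ole Miss", "Liberty")

# HYPOTHETICAL ratings: all equal, a placeholder for real estimates.
rating <- setNames(rep(0, length(teams)), teams)

# Hypothetical first-round home edge; set above 0 to favor the host.
first_round_edge <- 0

# Assumed logistic link from rating difference to win probability.
win_prob <- function(a, b, home_edge = 0) {
  plogis(rating[[a]] - rating[[b]] + home_edge)
}

play <- function(a, b, home_edge = 0) {
  if (runif(1) < win_prob(a, b, home_edge)) a else b
}

simulate_bracket <- function() {
  s <- teams  # s[k] is the team seeded k
  # First round: seeds 5 to 8 host seeds 12 to 9.
  r1 <- c(play(s[5], s[12], first_round_edge),
          play(s[6], s[11], first_round_edge),
          play(s[7], s[10], first_round_edge),
          play(s[8], s[9],  first_round_edge))
  # Quarterfinals (assumed pairings): 4 v 5/12, 3 v 6/11, 2 v 7/10, 1 v 8/9.
  qf <- c(play(s[4], r1[1]), play(s[3], r1[2]),
          play(s[2], r1[3]), play(s[1], r1[4]))
  # Semifinals (assumed): 1-seed side v 4-seed side, 2-seed side v 3-seed side.
  sf <- c(play(qf[4], qf[1]), play(qf[3], qf[2]))
  c(finalist_1 = sf[1], finalist_2 = sf[2], champion = play(sf[1], sf[2]))
}

set.seed(12)  # arbitrary seed
n_sims <- 10000
results <- replicate(n_sims, simulate_bracket())  # 3 x n_sims matrix

# Title share per team, keeping teams that never win in the table.
title_share <- table(factor(results["champion", ], levels = teams)) / n_sims
print(round(title_share, 3))

# Tally title-game pairings, ignoring which side of the bracket each came from.
finals <- apply(results[c("finalist_1", "finalist_2"), ], 2,
                function(x) paste(sort(x), collapse = " vs. "))
print(head(sort(table(finals), decreasing = TRUE), 5))
```

With equal ratings, the byes alone give each of the top four seeds a title chance of about 12.5% and every other team about 6.25%, which makes a handy sanity check before plugging in real probabilities.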

Feedback?

Did you enjoy this week’s newsletter? Do you have a topic, tool, or technique that you would like to see featured in a future edition? I’d love to hear from you!
