What does “all models are wrong, some are useful” mean? I love this quote, but it requires a bit more explanation.
This quote from George Box is famous because it is provocative and somewhat controversial.
But I am alarmed by the comments I have seen around this quote from actual data science practitioners who do not seem to understand it. It is imperative to understand this concept to avoid making huge mistakes.
All statistical models are wrong
Yes, you are reading that correctly. All models are wrong. Models approximate reality; they are not reality itself. Therefore, they are wrong…even when they are right.
Wait…what the heck?
Let me explain with an example. Imagine we are modeling how much revenue a sales person will generate in a month. What factors do you think would be important? A few could be:
- How much experience the sales person has (x1)
- The average income of customers in the sales territory (x2)
- The number of sales calls in that month (x3)
Let’s imagine that you can use those three factors to model how much revenue they would generate. A linear regression model with an adjusted R-squared of 0.7 would explain about 70% of the variation in their performance. For real-world scenarios, that is a fairly accurate model.
If we typed it out, it would look like this:
y = a*x1 + b*x2 + c*x3 + error
What we are saying is that revenue (y) is equal to a calculated coefficient (a) times their experience (x1), plus a coefficient (b) times the average income (x2), and so on, plus an error term, sometimes called the remainder or residual. This is not the y-intercept, which the simplified equation above leaves out. (The very fact that the name can be “error” should be a clue about models being “wrong.”)
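With an adjusted R-squared of 0.7, individual predictions will still miss by a noticeable margin. For illustration, say we see predictions of ~$70k to $130k if the actual answer is $100k. (R-squared does not set that range by itself; the range depends on how spread out the residuals are.)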
Sometimes, it will predict the exact number. For salesperson 24601 it predicted a revenue of $118k in September of 2018 and that salesperson hit exactly $118k! Is that model “wrong?”
Yes. Every single time it is wrong. Even when its prediction is correct.
Do you understand why? Because the model cannot possibly be true. Their revenue does not actually equal their experience times a coefficient, plus average income times a coefficient, plus number of sales calls times a coefficient. It doesn’t. There are an infinite number of other factors that go into real-world performance.
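To make that concrete, here is a minimal sketch in plain Python. Everything in it is invented for illustration: the data is simulated, the coefficients are hypothetical, and the `hidden` factor stands in for all the things the model never sees. Unlike the simplified equation above, the fit includes an intercept. The numbers were picked so that R-squared should land near 0.7 and revenue averages around $100k.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

rng = random.Random(0)
n = 500
rows, y = [], []
for _ in range(n):
    x1 = rng.uniform(0, 20)     # years of experience (hypothetical range)
    x2 = rng.uniform(40, 90)    # average customer income, in $k
    x3 = rng.uniform(20, 120)   # sales calls in the month
    hidden = rng.gauss(0, 1)    # a real-world factor the model never sees
    revenue = 20 + 2.0 * x1 + 0.5 * x2 + 0.4 * x3 + 10 * hidden + rng.gauss(0, 6)
    rows.append([1.0, x1, x2, x3])  # leading 1.0 is the intercept column
    y.append(revenue)               # revenue in $k

# Ordinary least squares via the normal equations (X'X) beta = X'y.
k = 4
XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
coef = solve(XtX, Xty)

pred = [sum(c * v for c, v in zip(coef, r)) for r in rows]
resid = [yi - pi for yi, pi in zip(y, pred)]
mean_y = sum(y) / n
ss_res = sum(e * e for e in resid)
ss_tot = sum((yi - mean_y) ** 2 for yi in y)

p = 3  # number of predictors, not counting the intercept
r2 = 1 - ss_res / ss_tot
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
resid_sd = (ss_res / (n - p - 1)) ** 0.5

print(f"adjusted R-squared: {adj_r2:.2f}")
print(f"typical miss (residual std dev): ${resid_sd:.1f}k")
```

The fitted model is useful, since it explains most of the variation, but it is still wrong: the hidden factor is part of how the revenue was generated and the model has no term for it.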
Is it important that all models are wrong?
Yes, it is important to realize that. Modelers can get so confident in their models that they forget that the models are not reality.
Some people claim that overconfidence was a cause of the financial crash in 2008. Traders were so overconfident in their models that they took risks that were too large. That sounds logical, and may be correct, but any data nerd will tell you that it is exceptionally rare for there to be a single cause to any large result in a complex system.
However, there are plenty of times when unforeseen interactions or relationships between causes lead to massive effects. The fact that foreclosures were treated as independent events in the models, but are not independent in reality, was absolutely a reason why the risks of credit default swaps were underpriced.
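Here is a toy simulation of why the independence assumption matters. It is only a sketch under made-up assumptions, not a description of how anything was actually priced: 100 hypothetical loans, a 5% average default rate in both scenarios, and a simple "bad year" switch (the probabilities 0.10, 0.32, and 0.02 are invented so the average stays at 5%).

```python
import random

rng = random.Random(42)
N_LOANS = 100   # hypothetical portfolio size
TRIALS = 5000   # simulated years
TAIL = 15       # "disaster" means more than 15 defaults in a year

def defaults_independent():
    """Every loan defaults on its own with probability 5%."""
    return sum(rng.random() < 0.05 for _ in range(N_LOANS))

def defaults_correlated():
    """Same 5% average, but a shared bad year hits all loans at once."""
    p = 0.32 if rng.random() < 0.10 else 0.02  # 0.10*0.32 + 0.90*0.02 = 0.05
    return sum(rng.random() < p for _ in range(N_LOANS))

indep = [defaults_independent() for _ in range(TRIALS)]
corr = [defaults_correlated() for _ in range(TRIALS)]

mean_indep = sum(indep) / TRIALS
mean_corr = sum(corr) / TRIALS
tail_indep = sum(d > TAIL for d in indep) / TRIALS
tail_corr = sum(d > TAIL for d in corr) / TRIALS

print(f"average defaults: independent={mean_indep:.1f}, correlated={mean_corr:.1f}")
print(f"P(more than {TAIL} defaults): independent={tail_indep:.4f}, correlated={tail_corr:.4f}")
```

Both scenarios produce about the same average number of defaults, so a model that only checks the average looks fine. The chance of a very bad year is where they part ways, and that is the part the independence assumption hides.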
Some models ARE useful
In fact, many models are useful. Models are not new, and many have been in place in industries like insurance for a long time.
There are myriad ways to test whether a model is useful. A modeler needs to understand the ramifications of different errors (Type I, a false positive, or Type II, a false negative). There are also many tests and metrics to determine a model’s performance. Here is a solid article on several tests and their uses:
In fact, George Box, the author of the original quote, later clarified, “the practical question is how wrong do they have to be to not be useful?”
Answering that question is WAY more complex than it sounds. It involves combining statistical evaluations with business realities. The metrics in the article above are a good start, but you also need to know which ones to use based on the situation.
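As a small sketch of what “based on the situation” can mean, suppose a false positive (Type I) and a false negative (Type II) cost the business very different amounts. The confusion-matrix counts and dollar costs below are all hypothetical, chosen only to show that the model with the better accuracy is not automatically the more useful one.

```python
# Hypothetical business costs per mistake.
FP_COST = 50     # Type I: we acted on a problem that was not there
FN_COST = 1000   # Type II: we missed a problem that was there

def accuracy(tp, fp, fn, tn):
    """Share of all predictions that were correct."""
    return (tp + tn) / (tp + fp + fn + tn)

def total_cost(tp, fp, fn, tn):
    """Dollar cost of the mistakes only; correct calls cost nothing here."""
    return fp * FP_COST + fn * FN_COST

# Invented confusion-matrix counts for two candidate models on 1,000 cases.
model_a = dict(tp=40, fp=10, fn=60, tn=890)   # cautious: few alerts, many misses
model_b = dict(tp=85, fp=120, fn=15, tn=780)  # noisy: many alerts, few misses

acc_a, acc_b = accuracy(**model_a), accuracy(**model_b)
cost_a, cost_b = total_cost(**model_a), total_cost(**model_b)

print(f"Model A: accuracy={acc_a:.3f}, cost of errors=${cost_a:,}")
print(f"Model B: accuracy={acc_b:.3f}, cost of errors=${cost_b:,}")
```

With these made-up costs, Model A wins on accuracy while Model B costs far less in mistakes. Flip the two costs and the answer flips with them, which is why the metric has to follow the business problem.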
I am not sure I feel qualified to write an in-depth article on answering how wrong a model has to be in order to not be useful. However, I agree with Roman Josue de las Heras Torres when he commented (LinkedIn post below), “I think this is a principle tightly tied to the Occam’s Razor principle. It is our job to find models which can be simple enough and useful to solve or simplify a problem.”
Bradford Salyer also added a good point when he brought up that, “models are helpful to…understanding how your business and assumptions evolve over time.” This ties to the problem of concept drift, which can be impactful and difficult to deal with as a modeler. Modeling situations involving human behavior has this issue because people often change their behavior due to the results of the model.
Imagine a retailer has a model to direct employees to fix out-of-stocks. If the model tells a store employee to go fix the toilet paper display three weeks in a row, that employee is probably going to do it on week 4 regardless of what the model says to do. When that happens the model stops working because the problems it predicts are preempted.
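That feedback loop can be written down as a deliberately tiny simulation. It is a sketch of the scenario above, not of any real system: the three-week habit threshold and the assumption that the shelf would otherwise run empty every week are both invented.

```python
HABIT_AFTER = 3  # hypothetical: flags in a row before the employee acts unprompted
WEEKS = 8

def simulate(weeks=WEEKS, habit_after=HABIT_AFTER):
    """Return, week by week, whether the model's flag found a real out-of-stock."""
    consecutive_flags = 0
    hits = []
    for _ in range(weeks):
        shelf_would_be_empty = True  # the underlying demand pattern never changes
        model_flags = True           # so a model trained on history keeps flagging
        # Once the habit forms, the employee restocks before the flag is checked.
        preempted = consecutive_flags >= habit_after
        hits.append(model_flags and shelf_would_be_empty and not preempted)
        consecutive_flags += 1
    return hits

hits = simulate()
for week, hit in enumerate(hits, start=1):
    print(f"week {week}: flag found a real problem = {hit}")
```

Nothing about demand changed, yet from week 4 on every flag looks like a false alarm. Measured on its own predictions, the model appears to have stopped working, even though it is the reason the shelf stays full.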
How does a modeler get around a problem like the retail example I described?
Seriously, I am asking…how do you personally deal with that? That is an example of something I personally worked on, and it required us to make the model very narrow in order to be useful for more than a few weeks. I would love to hear thoughts on what we could have done differently.