What is the No Free Lunch Theorem?
A fundamental concept for effectively implementing machine learning algorithms.
Before I dive into the definition of the No Free Lunch theorem, let’s quickly discuss the context. The beauty of data science and machine learning is that no two datasets will ever be the same. The size, noise and content will always be different. Therefore, our approach to every problem must be different.
So, What Is It?
The No Free Lunch theorem states that there is no one model that works best for every problem. The assumptions of a great model for one problem may not hold for another problem. Therefore, it is common in machine learning to try multiple models and find one that works best for a particular problem.
This theorem comes from a 1996 paper in which David Wolpert demonstrated that if you make absolutely no assumptions about the data, then there is no reason to prefer one model over any other.
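To make the "no assumptions" idea concrete, here is a toy sketch (an illustration only, not Wolpert's proof). It assumes a tiny world of 3-bit inputs with binary labels, treats every possible labeling as equally likely, and measures accuracy only on inputs the learner never saw. The three learners and all function names are hypothetical, chosen for illustration.

```python
from itertools import product

# All 8 possible 3-bit inputs; the first 5 are training points, the last 3 are held out.
inputs = list(product([0, 1], repeat=3))
train_x, test_x = inputs[:5], inputs[5:]

def majority_learner(train_labels):
    """Predict the most common training label for every unseen input."""
    guess = int(sum(train_labels) * 2 > len(train_labels))
    return lambda x: guess

def minority_learner(train_labels):
    """Predict the opposite of the majority learner (a deliberately odd choice)."""
    guess = 1 - int(sum(train_labels) * 2 > len(train_labels))
    return lambda x: guess

def parity_learner(train_labels):
    """Ignore the training labels and predict the parity of the input bits."""
    return lambda x: sum(x) % 2

def average_off_training_accuracy(learner):
    """Average held-out accuracy over every possible labeling of the inputs."""
    correct = 0
    total = 0
    for labels in product([0, 1], repeat=len(inputs)):
        target = dict(zip(inputs, labels))
        model = learner([target[x] for x in train_x])
        correct += sum(model(x) == target[x] for x in test_x)
        total += len(test_x)
    return correct / total

for learner in (majority_learner, minority_learner, parity_learner):
    print(learner.__name__, average_off_training_accuracy(learner))
```

Averaged over all 256 possible labelings, every learner scores exactly 0.5 on the held-out inputs, including the one that deliberately bets against the training data. A learner can only pull ahead once some labelings are more likely than others, which is exactly what an assumption about the data provides.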
Judging a Book by its Cover
Ultimately, a model is a simplified version of the observations. The simplifications are meant to discard the superfluous details that are unlikely to generalize to new instances. To decide what data to discard and what data to keep, you must make assumptions.
There is no model that is a priori guaranteed to work better (hence the name of the theorem). So, you may be thinking, the only way to know for sure which model is best is to evaluate them all. Since this is not possible, in practice you make some reasonable assumptions about the data and evaluate only a few reasonable models. For instance, for simple tasks you may evaluate linear models with various levels of regularization, and for a complex problem you may evaluate various neural networks.
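As a rough sketch of what "evaluate a few reasonable models" can look like, the snippet below compares one-dimensional ridge regression at several regularization strengths using k-fold cross-validation. Everything here is an assumption made for illustration: the synthetic data (a noisy line), the candidate strengths, and the helper names fit_ridge_1d and cross_val_mse. In a real project you would more likely reach for a library than hand-roll this.

```python
import random

def fit_ridge_1d(xs, ys, alpha):
    """Closed-form 1-D ridge regression; alpha penalizes the slope, not the intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x

def cross_val_mse(xs, ys, alpha, k=5):
    """Average held-out mean squared error over k folds."""
    fold_errors = []
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        test = [i for i in range(len(xs)) if i % k == fold]
        model = fit_ridge_1d([xs[i] for i in train], [ys[i] for i in train], alpha)
        errors = [(model(xs[i]) - ys[i]) ** 2 for i in test]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / k

# Hypothetical synthetic data: y = 2x + 1 plus Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(-3, 3) for _ in range(60)]
ys = [2 * x + 1 + rng.gauss(0, 1) for x in xs]

# Evaluate a handful of candidates rather than every conceivable model.
candidate_alphas = [0.0, 1.0, 10.0, 100.0]
scores = {alpha: cross_val_mse(xs, ys, alpha) for alpha in candidate_alphas}
best_alpha = min(scores, key=scores.get)

for alpha, mse in scores.items():
    print(f"alpha={alpha}: cross-validated MSE = {mse:.3f}")
print("best alpha:", best_alpha)
```

The design choice worth noticing is that the candidate list is short and chosen up front. Restricting the search to linear models already encodes an assumption about the data, and cross-validation then picks among the few options that assumption leaves open.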
The Plan of Attack
So, the next time you are tackling a data science problem, step back and ask yourself, “What is the best method for this particular problem?” and go from there. By thinking about our assumptions and shortlisting a few suitable models at the start of a machine learning project, we can reach a well-performing model with far less wasted effort.