# What is the No Free Lunch Theorem?

• Last Updated : 25 Jun, 2021

The No Free Lunch Theorem is often cited in optimization and machine learning, frequently with little understanding of what it means or implies.

The theorem asserts that when the performance of all optimization algorithms is averaged across all conceivable problems, they all perform equally well. It implies that no single best optimization algorithm exists. Because of the close link between optimization, search, and machine learning, it also implies that there is no single best machine learning algorithm for predictive modelling tasks such as classification and regression.

The NFL theorems all agree on one point: there is no “best” algorithm for a given class of problems, since all algorithms perform similarly on average. Mathematically, the computational cost of finding a solution is the same for any solution method when averaged across all problems in the class. As a result, no method provides a shortcut.

There are two No Free Lunch (NFL) theorems in general: one for machine learning and one for search and optimization. These two theorems are connected and are frequently combined into a single general postulate (the folklore theorem).
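The machine learning version can be illustrated with a toy experiment. The sketch below is not taken from the NFL papers; it assumes a tiny input space of 3 binary features, a fixed train/test split, and two hypothetical learners (`majority_learner` and `nearest_neighbor_learner`) chosen purely for illustration. It enumerates every possible labeling of the inputs and scores each learner only on points it has not seen.

```python
from itertools import product

# All 8 inputs with 3 binary features. The first 5 form the training set,
# the remaining 3 are held out (off-training-set points).
inputs = list(product([0, 1], repeat=3))
train_x, test_x = inputs[:5], inputs[5:]

def majority_learner(train_pairs, x):
    """Predict the most common training label (ties go to 1)."""
    ones = sum(label for _, label in train_pairs)
    return 1 if 2 * ones >= len(train_pairs) else 0

def nearest_neighbor_learner(train_pairs, x):
    """Predict the label of the closest training point (Hamming distance)."""
    def distance(pair):
        return sum(a != b for a, b in zip(pair[0], x))
    return min(train_pairs, key=distance)[1]

def total_correct(learner):
    """Count correct held-out predictions, summed over every possible target."""
    correct = 0
    for labels in product([0, 1], repeat=len(inputs)):
        target = dict(zip(inputs, labels))
        train_pairs = [(x, target[x]) for x in train_x]
        correct += sum(learner(train_pairs, x) == target[x] for x in test_x)
    return correct

# 2**8 = 256 possible targets, each scored on 3 held-out points.
n_predictions = 2 ** len(inputs) * len(test_x)
for learner in (majority_learner, nearest_neighbor_learner):
    print(learner.__name__, total_correct(learner) / n_predictions)
```

Both learners end up with an average held-out accuracy of exactly 0.5. The reason is that, once the training labels are fixed, every labeling of the unseen points is equally represented, so any fixed prediction is right half the time. Real problems are not drawn uniformly from all possible labelings, which is why learners can still differ in practice.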

Although many other scholars have contributed to the literature on the No Free Lunch theorems, David Wolpert is the name most closely associated with them.
Surprisingly, the idea that may have inspired the NFL theorems was first put forward by an 18th-century philosopher. Yes, you read that correctly: a philosopher, not a mathematician or a statistician.

Figure 1. Understanding NFL.

David Hume, a Scottish philosopher, posed the problem of induction in the mid-1700s. This is the philosophical question of whether inductive reasoning leads to true knowledge.

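Inductive reasoning is a type of thinking in which we make inferences about the world based on previous observations. The NFL theorems make a related point for learning algorithms: past observations alone, without further assumptions, do not justify preferring one generalization over another.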

According to the “No Free Lunch” theorem, there is no single model that works best for every problem. Because the assumptions that make a model excellent for one problem may not hold for another, it is common in machine learning to try several models and keep the one that performs best on the specific problem. This is especially true in supervised learning, where validation or cross-validation is frequently used to compare the predictive accuracy of models of varying complexity in order to select the best one. A given model may also be trained using several methods; for example, linear regression can be fitted using the normal equations or gradient descent.
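The last point, that one model can be trained in more than one way, can be sketched as follows. This is a minimal illustration under stated assumptions: the five data points, the learning rate, and the step count are made up for the example, and the closed form shown is the single-feature case of the normal equations.

```python
# Made-up data, roughly following y = 2x + 1 (an assumption for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]

def fit_normal_equations(xs, ys):
    """Closed-form least-squares fit of y = w * x + b (single feature)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    w = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    return w, mean_y - w * mean_x

def fit_gradient_descent(xs, ys, learning_rate=0.05, steps=5000):
    """Minimize the mean squared error by batch gradient descent."""
    n = len(xs)
    w, b = 0.0, 0.0
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2 / n * sum(errors)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

w_exact, b_exact = fit_normal_equations(xs, ys)
w_gd, b_gd = fit_gradient_descent(xs, ys)
print(w_exact, b_exact)
print(w_gd, b_gd)
```

On this data both methods arrive at essentially the same line. The choice between them is a matter of cost and scale (a direct solve versus many cheap iterations), not of which model is being learned.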

According to the “No Free Lunch” theorem, all optimization algorithms that never re-evaluate a point they have already visited perform equally well when averaged over all possible optimization problems. This fundamental theoretical result has had the greatest impact on optimization, search, and supervised learning. The first No Free Lunch theorem was quickly followed by a series of research papers that defined a whole field of study, with meaningful results across the many scientific disciplines in which efficient exploration of a search space is a central task.
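The averaging claim can be checked by brute force on a very small search space. The sketch below assumes 4 candidate points and 3 possible objective values, and compares two hypothetical strategies (`ascending` and `adaptive`, invented for this example) that never revisit a point. Performance is measured as the best value seen after `m` evaluations, summed over all 81 possible objective functions.

```python
from itertools import product

X = range(4)  # candidate solutions
Y = range(3)  # possible objective values

def ascending(history):
    """Fixed strategy: always try the smallest unvisited point."""
    visited = {x for x, _ in history}
    return min(x for x in X if x not in visited)

def adaptive(history):
    """Adaptive strategy: jump to the largest unvisited point after seeing
    a value >= 1, otherwise take the smallest unvisited point."""
    visited = {x for x, _ in history}
    unvisited = [x for x in X if x not in visited]
    if history and history[-1][1] >= 1:
        return max(unvisited)
    return min(unvisited)

def best_after(algorithm, f, m):
    """Best objective value seen after m distinct evaluations of f."""
    history = []
    for _ in range(m):
        x = algorithm(history)
        history.append((x, f[x]))
    return max(y for _, y in history)

def total_best(algorithm, m):
    """Sum of best-after-m values over all len(Y)**len(X) = 81 functions."""
    return sum(best_after(algorithm, f, m) for f in product(Y, repeat=len(X)))

for m in (1, 2, 3):
    print(m, total_best(ascending, m), total_best(adaptive, m))
```

For every `m`, the two totals are identical, even though the strategies visit points in different orders on most individual functions. The equality depends on weighting every function equally and on never re-evaluating a point; drop either assumption and the totals can differ.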

In general, the objective (utility) function is as important as the algorithm. An effective solution comes from matching the algorithm to the objective function. If nothing is known about the structure of the objective function, and one is simply working with a black box, there is no guarantee that any particular method will outperform a (pseudo)random search.

Wolpert and Macready developed a framework to investigate the relationship between effective optimization algorithms and the problems they solve. Within it, a series of “no free lunch” (NFL) theorems establish that any improved performance over one class of problems is offset by worse performance over another class. These theorems also give a geometric interpretation of what it means for an algorithm to be well matched to an optimization problem.
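The trade-off between problem classes is easy to see in a deliberately simple case. The sketch below assumes two scan orders over ten points and two made-up objective functions, one increasing and one decreasing, and counts how many evaluations each scan needs before it first samples the global maximum.

```python
def evaluations_to_find_max(order, f):
    """Number of evaluations until the global maximum of f is first sampled.

    The maximum is computed up front only to score the search; the scan
    itself just walks through `order`.
    """
    best = max(f(x) for x in order)
    for count, x in enumerate(order, start=1):
        if f(x) == best:
            return count

points = list(range(10))
ascending = points          # scan left to right
descending = points[::-1]   # scan right to left

def increasing(x):
    return x                # maximum at the right edge

def decreasing(x):
    return -x               # maximum at the left edge

for f in (increasing, decreasing):
    print(f.__name__,
          evaluations_to_find_max(ascending, f),
          evaluations_to_find_max(descending, f))
# prints:
# increasing 10 1
# decreasing 1 10
```

Each scan order is ideal for one class of functions and as bad as possible for the other, so what one gains on the first class it gives back on the second.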

The NFL framework has also been applied to information-theoretic aspects of optimization and to benchmark measures of performance.

The same "no free lunch" idea appears in project design: adding options to a project incurs both direct and opportunity costs. As a result, building in such options may increase the initial development cost. Direct costs are the costs of the additional development effort required to build a given flexibility into the project’s architecture. Opportunity costs are the costs of not being able to do something else (for example, add a feature) because of the time and effort spent on creating that flexibility.

Conclusion:
Machine learning models adhere to the Garbage In, Garbage Out (GIGO) principle: predictions depend on the quality of the data on which the model is trained. A great deal of research has gone into these theorems, and some argue that they do not apply in every practical setting. In practice, it is better to concentrate on understanding the data and on building the best-performing model for the problem at hand.
