Optimization for Data Science

Last Updated : 16 Jul, 2020

From a mathematical standpoint, data science rests on three pillars that a practitioner needs to understand well: linear algebra, statistics, and optimization. Optimization is used in almost every data science algorithm, and understanding its concepts requires a solid grounding in linear algebra.

What’s Optimization?
Wikipedia defines optimization as a problem in which you maximize or minimize a real function by systematically choosing input values from an allowed set and computing the value of the function. In other words, optimization is always about finding the best solution. Suppose you have some function of interest, say f(x), and you want the best solution for it. What does best mean here? It means either the smallest value of the function (minimization) or the largest value (maximization).

Why Optimization for Machine Learning?

Almost all machine learning algorithms can be viewed as solutions to optimization problems. Even when a technique was originally derived from another field, such as biology, it can still be interpreted as the solution to some optimization problem.
A basic understanding of optimization will help you:
- Understand more deeply how machine learning algorithms work.
- Rationalize the behavior of an algorithm. If you get a result and want to interpret it, a deep understanding of optimization lets you see why you got that result.
- At an even higher level of understanding, develop new algorithms yourself.
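
One way to make the "learning as optimization" view concrete is a minimal sketch in which fitting a one-parameter model y = w * x is treated as minimizing a mean squared error. The toy data, the function names and the learning rate below are all assumptions chosen for illustration, and plain gradient descent is only one of many possible solvers.

```python
# Hypothetical toy data generated from y = 2x (an assumption for illustration).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]


def loss(w):
    """Objective: mean squared error of the model y = w * x on the toy data."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def loss_gradient(w):
    """Derivative of the mean squared error with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def fit(w=0.0, learning_rate=0.01, steps=500):
    """Plain gradient descent: repeatedly step against the gradient."""
    for _ in range(steps):
        w -= learning_rate * loss_gradient(w)
    return w


w_fit = fit()
print("fitted w:", w_fit, "loss:", loss(w_fit))
```

Here "training" is nothing more than searching for the value of w that minimizes the loss; on this data the search settles very close to w = 2.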

Components of an Optimization Problem

Generally, an optimization problem has three components.

minimize f(x), w.r.t x, subject to a ≤ x ≤ b

The objective function (f(x)): The first component is the objective function f(x), which we are trying to either maximize or minimize. We usually talk about minimization problems, simply because a maximization problem with f(x) can be converted into a minimization problem with -f(x). So, without loss of generality, we can restrict attention to minimization problems.
Decision variables (x): The second component is the decision variables, whose values we are free to choose in order to minimize the function. So, we write this as min f(x).
Constraints (a ≤ x ≤ b): The third component is the constraints, which restrict x to some allowed set.

So, whenever you look at an optimization problem, look for these three components.
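
The three components can be sketched in a few lines of Python. The quadratic objective, the bounds and the helper names below are hypothetical, and golden-section search is used only as one simple method that works when the function has a single minimum on the interval. The last helper shows the conversion of a maximization problem into a minimization problem by negating the function.

```python
import math


def objective(x):
    """Objective function f(x); this quadratic is a hypothetical example."""
    return (x - 1.0) ** 2 + 2.0


# Constraint a <= x <= b (assumed values for illustration).
LOWER, UPPER = -2.0, 4.0


def golden_section_minimize(f, a, b, tol=1e-8):
    """Minimize f on [a, b], assuming f has a single minimum there."""
    inv_phi = (math.sqrt(5) - 1) / 2
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while abs(b - a) > tol:
        if f(c) < f(d):
            b = d  # the minimum lies in [a, d]
        else:
            a = c  # the minimum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2


def golden_section_maximize(g, a, b):
    """Maximize g by minimizing -g, as described in the text."""
    return golden_section_minimize(lambda x: -g(x), a, b)


# Decision variable x chosen to minimize the objective within the bounds.
x_min = golden_section_minimize(objective, LOWER, UPPER)

# A hypothetical function to maximize on the same interval.
x_max = golden_section_maximize(lambda x: 5.0 - (x - 3.0) ** 2, LOWER, UPPER)

print("argmin:", x_min, "argmax:", x_max)
```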

Types of Optimization Problems:

Depending on whether constraints are present:

Constrained optimization problems: When constraints are given and the solution must satisfy them, we call the problem a constrained optimization problem.
Unconstrained optimization problems: When there are no constraints, we call the problem an unconstrained optimization problem.
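
The difference can be seen on a small example. The sketch below minimizes the hypothetical function f(x) = (x - 3)^2 twice: once with no constraint, and once with the assumed constraint 0 ≤ x ≤ 2. It uses gradient descent and, in the constrained case, projects each iterate back into the allowed interval (projected gradient descent); the function names and step size are illustrative choices.

```python
def gradient(x):
    """Derivative of the hypothetical objective f(x) = (x - 3)^2."""
    return 2.0 * (x - 3.0)


def minimize(bounds=None, x=0.0, learning_rate=0.1, steps=200):
    """Gradient descent; if bounds are given, project x back into them."""
    for _ in range(steps):
        x -= learning_rate * gradient(x)
        if bounds is not None:
            low, high = bounds
            x = min(max(x, low), high)  # projection onto [low, high]
    return x


x_unconstrained = minimize()                 # free to reach x = 3
x_constrained = minimize(bounds=(0.0, 2.0))  # stops at the boundary x = 2

print("unconstrained:", x_unconstrained)
print("constrained:  ", x_constrained)
```

The unconstrained solution sits at the bottom of the parabola, while the constrained solution ends up on the boundary of the allowed set, because the true minimum is not feasible.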

Depending on the types of objective functions, decision variables and constraints:

If the decision variable (x) is a continuous variable: A variable x is said to be continuous if it can take any real value within an interval. In the example below, x can take any of the infinitely many values between -2 and 2.

min f(x), x ∈ (-2, 2)
- Linear programming problem: If the decision variable (x) is continuous and both the objective function (f) and all the constraints are linear, the problem is known as a linear programming problem.
- Nonlinear programming problem: If the decision variable (x) is continuous but either the objective function (f) or at least one of the constraints is non-linear, the problem is known as a non-linear programming problem.
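
As an illustration of a linear programming problem, the sketch below solves a small two-variable example by checking every corner of the feasible region. It relies on the fact that a linear objective over a bounded, non-empty feasible region attains its minimum at a corner. The coefficients and the function name are hypothetical, and this brute-force approach is only meant for tiny examples, not as a description of how practical solvers work.

```python
from itertools import combinations

# Each constraint is (a1, a2, b), meaning a1*x + a2*y <= b (assumed values).
constraints = [
    (1.0, 1.0, 4.0),    # x + y  <= 4
    (1.0, 3.0, 9.0),    # x + 3y <= 9
    (1.0, 0.0, 3.0),    # x      <= 3
    (-1.0, 0.0, 0.0),   # x >= 0
    (0.0, -1.0, 0.0),   # y >= 0
]

# Minimize -3x - 2y, which is the same as maximizing 3x + 2y.
cost = (-3.0, -2.0)


def solve_lp_2d(cost, constraints, eps=1e-9):
    """Enumerate intersections of constraint lines and keep the best feasible one."""
    best_point, best_value = None, float("inf")
    for (a1, a2, b1), (c1, c2, b2) in combinations(constraints, 2):
        det = a1 * c2 - a2 * c1
        if abs(det) < eps:
            continue  # parallel lines have no unique intersection
        # Cramer's rule for the 2x2 system formed by the two boundary lines.
        x = (b1 * c2 - a2 * b2) / det
        y = (a1 * b2 - b1 * c1) / det
        feasible = all(p * x + q * y <= r + eps for p, q, r in constraints)
        if feasible:
            value = cost[0] * x + cost[1] * y
            if value < best_value:
                best_point, best_value = (x, y), value
    return best_point, best_value


point, value = solve_lp_2d(cost, constraints)
print("optimal point:", point, "objective:", value)
```

Replacing the objective with something like x^2 + y^2, or a constraint with x * y ≤ 1, would make the problem non-linear, and the corner argument used here would no longer apply.
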
If the decision variable (x) is an integer variable: Integers are numbers whose fractional part is zero, such as -3, -2, 0, 1, 10 and 100.

min f(x), x ∈ {0, 1, 2, 3}
- Linear integer programming problem: If the decision variable (x) is an integer variable and both the objective function (f) and all the constraints are linear, the problem is known as a linear integer programming problem.
- Nonlinear integer programming problem: If the decision variable (x) is an integer variable but either the objective function (f) or at least one of the constraints is non-linear, the problem is known as a non-linear integer programming problem.
- Binary integer programming problem: If the decision variable (x) can take only the binary values 0 and 1, the problem is known as a binary integer programming problem.
  
  min f(x), x ∈ {0, 1}
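
Because integer and binary variables take values from a finite or countable set, very small problems can be solved by simply trying every allowed value. The sketch below does this for a hypothetical non-linear objective over x ∈ {0, 1, 2, 3}, and for a small binary problem in which each variable decides whether an item is selected under a weight limit. All numbers and names are assumptions for illustration; exhaustive enumeration becomes impractical as the number of variables grows.

```python
from itertools import product


def f(x):
    """Hypothetical non-linear objective for the integer example."""
    return (x - 1.6) ** 2


# Integer problem: min f(x), x in {0, 1, 2, 3}.
best_integer = min([0, 1, 2, 3], key=f)

# Binary problem: choose items (1 = take, 0 = leave) to maximize total value
# subject to a total weight limit. Objective and constraint are both linear.
values = [6, 10, 12]
weights = [1, 2, 3]
capacity = 5


def total(coefficients, choice):
    """Linear function: sum of coefficient * variable."""
    return sum(c * z for c, z in zip(coefficients, choice))


feasible = [
    z for z in product([0, 1], repeat=len(values))
    if total(weights, z) <= capacity
]
# Maximizing the value is written as minimizing its negative.
best_choice = min(feasible, key=lambda z: -total(values, z))

print("best integer x:", best_integer)
print("best binary choice:", best_choice, "value:", total(values, best_choice))
```
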
If the decision variable (x) is a mixed variable: If some of the decision variables are continuous and others are integer, the decision variable is known as a mixed variable.

min f(x1, x2), x1 ∈ {0, 1, 2, 3} and x2 ∈ (-2, 2)
- Mixed-integer linear programming problem: If the decision variable (x) is a mixed variable and both the objective function (f) and all the constraints are linear, the problem is known as a mixed-integer linear programming problem.
- Mixed-integer non-linear programming problem: If the decision variable (x) is a mixed variable but either the objective function (f) or at least one of the constraints is non-linear, the problem is known as a mixed-integer non-linear programming problem.
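
A small mixed-integer non-linear problem can be sketched by combining the two ideas: enumerate the integer variable x1, and for each value minimize over the continuous variable x2. The objective below is hypothetical, the search runs over the closed interval [-2, 2] as a stand-in for (-2, 2), and ternary search is used on the assumption that the function has a single minimum in x2 for each fixed x1.

```python
def f(x1, x2):
    """Hypothetical non-linear objective of one integer and one continuous variable."""
    return (x1 - 1.6) ** 2 + (x2 - 0.5 * x1) ** 2


def minimize_continuous(g, low, high, iterations=100):
    """Ternary search for the minimum of g on [low, high] (single minimum assumed)."""
    for _ in range(iterations):
        m1 = low + (high - low) / 3
        m2 = high - (high - low) / 3
        if g(m1) < g(m2):
            high = m2
        else:
            low = m1
    return (low + high) / 2


best = None
for x1 in [0, 1, 2, 3]:  # integer variable: try every allowed value
    # Continuous variable: search the interval for this fixed x1.
    x2 = minimize_continuous(lambda t, x1=x1: f(x1, t), -2.0, 2.0)
    candidate = (f(x1, x2), x1, x2)
    if best is None or candidate[0] < best[0]:
        best = candidate

best_value, best_x1, best_x2 = best
print("x1:", best_x1, "x2:", best_x2, "f:", best_value)
```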
