Unconstrained Multivariate Optimization

Wikipedia defines optimization as a problem in which you maximize or minimize a real function by systematically choosing input values from an allowed set and computing the value of the function. In other words, optimization is always about finding the best solution. Suppose we have some function, say f(x), and want to find the best solution for it. What does "best" mean here? It means either the input that minimizes the function or the input that maximizes it.
Generally, an optimization problem has three components.

minimize f(x),
w.r.t x , 
subject to a < x < b 

where, f(x) : Objective function 
x : Decision variable 
a < x < b : Constraint 
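
The three components can be made concrete with a small sketch. The objective below, the bounds 0 and 5, and the helper name grid_minimize are all assumptions chosen for illustration; a plain grid scan is used only because it is easy to read, not because it is an efficient method.

```python
def objective(x):
    """Hypothetical objective function f(x), chosen only for illustration."""
    return (x - 2.0) ** 2 + 1.0


def grid_minimize(f, a, b, steps=10_000):
    """Scan evenly spaced points strictly inside (a, b) and keep the best one."""
    best_x, best_val = None, float("inf")
    for i in range(1, steps):  # i = 0 and i = steps are skipped, so a < x < b
        x = a + (b - a) * i / steps  # candidate value of the decision variable
        val = f(x)
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val


# Objective function: objective, decision variable: x, constraint: 0 < x < 5
x_best, f_best = grid_minimize(objective, a=0.0, b=5.0)
print(x_best, f_best)
```

Here the scan settles near x = 2, where this particular objective takes its smallest value inside the allowed interval.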
 

What’s a multivariate optimization problem?

In a multivariate optimization problem, there are multiple variables that act as decision variables in the optimization problem.



z = f(x1, x2, x3, ..., xn)

In problems of this type, z is in general some non-linear function of the decision variables x1, x2, x3, ..., xn, so there are n variables that one can manipulate or choose in order to optimize z. Univariate optimization can be illustrated with two-dimensional pictures, because the x-direction shows the value of the decision variable and the y-direction shows the value of the function. With two decision variables we need three-dimensional pictures, and with more than two decision variables the problem becomes difficult to visualize.

What’s unconstrained multivariate optimization?

As the name suggests, multivariate optimization with no constraints is known as unconstrained multivariate optimization.
Example:

min f(x̄)
w.r.t x̄
x̄ ∈ R^n

An unconstrained problem is typically written in the form above: we minimize f(x̄), which is called the objective function. The variable we adjust to minimize it, called the decision variable, is written after "w.r.t" (with respect to), here x̄. The last line states that x̄ is continuous, that is, each of its n components can take any value on the real number line.

The necessary and sufficient conditions for x̄* to be the minimizer of the function f(x̄)

In multivariate optimization, the conditions for x̄* to be a (local) minimizer of the function f(x̄) are given below. The first condition is necessary, and the two conditions together are sufficient:

First-order necessary condition: ∇ f(x̄*) = 0

Second-order sufficiency condition: ∇²f(x̄*) has to be positive definite.

where,

 \nabla f(\bar{x}^*) = \text{Gradient} = \begin{bmatrix} \partial f/ \partial x_1\\ \partial f/ \partial x_2\\ \vdots\\ \partial f/ \partial x_n\\ \end{bmatrix} , and

 \nabla ^2 f(\bar{x}^*) = \text{Hessian} = \begin{bmatrix} \partial ^2f/ \partial x_1^2 & \partial ^2f/\partial x_1 \partial x_2 & \cdots & \partial ^2f/ \partial x_1 \partial x_n\\ \partial ^2f/\partial x_2 \partial x_1 & \partial ^2f/ \partial x_2^2 & \cdots & \partial ^2f/ \partial x_2 \partial x_n\\ \vdots & \vdots & \ddots & \vdots\\ \partial ^2f/\partial x_n \partial x_1 & \partial ^2f/\partial x_n \partial x_2 & \cdots & \partial ^2f/ \partial x_n^2\\ \end{bmatrix}
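
When the partial derivatives are tedious to derive by hand, the gradient and the Hessian can be approximated numerically. The following is a minimal sketch using central finite differences. The test function f, the step sizes h, and the helper names gradient and hessian are assumptions made for illustration and are not part of the definitions above.

```python
def gradient(f, x, h=1e-5):
    """Central-difference approximation of the gradient of f at the point x."""
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad


def hessian(f, x, h=1e-4):
    """Central-difference approximation of the Hessian of f at the point x."""
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Shift coordinates i and j by +h or -h in all four combinations
            x_pp, x_pm, x_mp, x_mm = list(x), list(x), list(x), list(x)
            x_pp[i] += h; x_pp[j] += h
            x_pm[i] += h; x_pm[j] -= h
            x_mp[i] -= h; x_mp[j] += h
            x_mm[i] -= h; x_mm[j] -= h
            H[i][j] = (f(x_pp) - f(x_pm) - f(x_mp) + f(x_mm)) / (4 * h * h)
    return H


def f(x):
    """Hypothetical test function: x1^2 + 3*x1*x2 + 2*x2^2."""
    return x[0] ** 2 + 3 * x[0] * x[1] + 2 * x[1] ** 2


point = [1.0, 2.0]
grad_at_point = gradient(f, point)  # analytic gradient at (1, 2) is [8, 11]
hess_at_point = hessian(f, point)   # analytic Hessian is [[2, 3], [3, 4]]
print(grad_at_point)
print(hess_at_point)
```

The printed values agree with the analytic ones only up to small rounding errors, so comparisons should use a tolerance rather than exact equality.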

Let us quickly solve a numerical example on this to understand these conditions better.

Numerical Example

Problem:
min f(x_1, x_2) = x_1 + 2x_2 + 4x_1^2 - x_1 x_2 + 2x_2^2

Solution:
According to the first-order condition

 \nabla f(x^*) = \begin{bmatrix} \partial f/ \partial x_1\\ \partial f/ \partial x_2\\ \end{bmatrix} = \begin{bmatrix} 1 + 8x_1 - x_2\\ 2 - x_1 + 4x_2\\ \end{bmatrix} = \begin{bmatrix} 0\\ 0\\ \end{bmatrix}
Solving these two equations gives the values of x_1 ^* and x_2 ^* (rounded to two decimal places):
 \begin{bmatrix} x_1 ^*\\ x_2 ^*\\ \end{bmatrix} = \begin{bmatrix} -6/31\\ -17/31\\ \end{bmatrix} \approx \begin{bmatrix} -0.19\\ -0.55\\ \end{bmatrix}
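
The first-order condition here is a pair of linear equations, 8x_1 - x_2 = -1 and -x_1 + 4x_2 = -2, so it can be solved exactly. The sketch below applies Cramer's rule with Python's fractions module to avoid rounding. The helper name solve_2x2 is an assumption made for illustration.

```python
from fractions import Fraction


def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve [[a11, a12], [a21, a22]] @ [x1, x2] = [b1, b2] by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("the system has no unique solution")
    x1 = (b1 * a22 - a12 * b2) / det
    x2 = (a11 * b2 - a21 * b1) / det
    return x1, x2


# Gradient set to zero:  8*x1 - x2 = -1  and  -x1 + 4*x2 = -2
x1_star, x2_star = solve_2x2(
    Fraction(8), Fraction(-1),
    Fraction(-1), Fraction(4),
    Fraction(-1), Fraction(-2),
)
print(x1_star, x2_star)                # -6/31 -17/31
print(float(x1_star), float(x2_star))  # about -0.194 and -0.548
```

Substituting these values back into both gradient components gives exactly zero, which confirms that the point is a stationary point.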
To check whether this is a maximum point or a minimum point, we look at the second-order sufficiency condition:
 \nabla ^2 f(x^*) = \begin{bmatrix} \partial ^2f/ \partial x_1^2 & \partial ^2f/\partial x_1 \partial x_2\\ \partial ^2f/\partial x_2 \partial x_1 & \partial ^2f/ \partial x_2^2\\ \end{bmatrix} = \begin{bmatrix} 8 & -1\\ -1 & 4\\ \end{bmatrix}
A symmetric matrix such as the Hessian is positive definite if all of its eigenvalues are positive, so let us find the eigenvalues of the Hessian matrix above. They are the roots of the characteristic equation λ² - 12λ + 31 = 0, that is, λ = 6 ± √5:
 \begin{bmatrix} \lambda _1\\ \lambda _2\\ \end{bmatrix} \approx \begin{bmatrix} 3.76\\ 8.24\\ \end{bmatrix}
Both eigenvalues are positive, so the Hessian is positive definite and the point found above is a minimum point. Because the Hessian of this quadratic function is the same at every point, this is also the global minimum.
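
The eigenvalue check can also be reproduced in a few lines of Python. For a symmetric 2x2 matrix the eigenvalues have a closed form, which the sketch below uses. The function name eigenvalues_symmetric_2x2 is an assumption made for illustration, and the formula applies only to the 2x2 symmetric case.

```python
import math


def eigenvalues_symmetric_2x2(a, b, d):
    """Eigenvalues of the symmetric matrix [[a, b], [b, d]], smallest first."""
    mean = (a + d) / 2.0
    radius = math.sqrt(((a - d) / 2.0) ** 2 + b * b)
    return mean - radius, mean + radius


# Hessian of the example: [[8, -1], [-1, 4]]
lam_min, lam_max = eigenvalues_symmetric_2x2(8.0, -1.0, 4.0)
is_positive_definite = lam_min > 0  # all eigenvalues positive

print(round(lam_min, 2), round(lam_max, 2))  # 3.76 8.24
print(is_positive_definite)                  # True
```

For larger matrices a numerical linear algebra library would normally be used instead of a closed-form expression.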




Technical Content Engineer at GeeksForGeeks
