Multivariate Optimization – KKT Conditions

What’s a multivariate optimization problem?

In a multivariate optimization problem, more than one variable acts as a decision variable.

z = f(x1, x2, x3, …, xn)

Here z is a general, possibly non-linear, function of the decision variables x1, x2, x3, …, xn, so there are n variables that one can manipulate to optimize z. Univariate optimization can be illustrated with two-dimensional pictures, because the x-axis shows the value of the decision variable and the y-axis shows the value of the function. With two decision variables the picture needs three dimensions, and with more than two decision variables the problem becomes difficult to visualize.

Why are we interested in KKT conditions?

Multivariate optimization with an inequality constraint: In mathematics, an inequality is a relation that makes a non-equal comparison between two numbers or other mathematical expressions. It is most often used to compare two numbers on the number line by their size. The common notations for inequalities are <, >, ≤ and ≥. When an objective function with more than one decision variable must be optimized subject to a constraint written as an inequality, the problem is called a multivariate optimization problem with an inequality constraint.
Example:



min 2 x1^2 + 4 x2^2
st
3x1 + 2x2 ≤ 12

Here x1 and x2 are the two decision variables, and 3x1 + 2x2 ≤ 12 is the inequality constraint.
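To get a feel for this example, the problem can be explored numerically. The following is a minimal sketch, not a real solver: it evaluates the objective on a coarse grid of candidate points and keeps the best one that satisfies the constraint. The function names and the grid range [-5, 5] are assumptions made for illustration only.

```python
def f(x1, x2):
    # Objective function of the example: 2*x1^2 + 4*x2^2
    return 2 * x1**2 + 4 * x2**2

def satisfies_constraint(x1, x2):
    # Inequality constraint of the example: 3*x1 + 2*x2 <= 12
    return 3 * x1 + 2 * x2 <= 12

# Hypothetical search grid: integer points on [-5, 5] x [-5, 5]
grid = [(a, b) for a in range(-5, 6) for b in range(-5, 6)]

# Keep only the points inside the feasible region, then pick the smallest f
feasible = [p for p in grid if satisfies_constraint(*p)]
best = min(feasible, key=lambda p: f(*p))

print(best, f(*best))
```

On this grid the best feasible point is the origin, where the objective is 0 and the constraint holds with room to spare (0 ≤ 12). A grid search like this only illustrates the problem; it does not scale to many variables.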

In multivariate optimization with inequality constraints, a point x̄* can be a minimizer only if it satisfies the KKT conditions (assuming some regularity conditions hold). That is why we are interested in the KKT conditions.

KKT Conditions:

KKT stands for Karush–Kuhn–Tucker. In mathematical optimization, the Karush–Kuhn–Tucker (KKT) conditions, also known as the Kuhn–Tucker conditions, are first derivative tests (sometimes called first-order necessary conditions) for a solution in nonlinear programming to be optimal, provided that some regularity conditions are satisfied.

In general, a multivariate optimization problem can contain both equality and inequality constraints.

z = min f(x̄)
st
hi (x̄) = 0, i = 1, 2, …, l
gj (x̄) ≤ 0, j = 1, 2, …, m


Here we have ‘l’ equality constraints and ‘m’ inequality constraints.
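As a rough illustration of the general form, the constraints can be stored as two lists of functions and a candidate point can be tested for feasibility. This is only a sketch: the equality constraint x1 + x2 = 1 is a hypothetical addition, and the helper name `is_feasible` and the tolerance `tol` are assumptions, not part of any library.

```python
# Hypothetical problem in the general form, with x = (x1, x2):
#   h_1(x) = x1 + x2 - 1        (must equal 0)
#   g_1(x) = 3*x1 + 2*x2 - 12   (must be <= 0)
equalities = [lambda x: x[0] + x[1] - 1]
inequalities = [lambda x: 3 * x[0] + 2 * x[1] - 12]

def is_feasible(x, equalities, inequalities, tol=1e-9):
    """Return True if x satisfies every constraint up to the tolerance tol."""
    equalities_hold = all(abs(h(x)) <= tol for h in equalities)
    inequalities_hold = all(g(x) <= tol for g in inequalities)
    return equalities_hold and inequalities_hold

print(is_feasible((0.5, 0.5), equalities, inequalities))  # lies on x1 + x2 = 1
print(is_feasible((1.0, 1.0), equalities, inequalities))  # violates x1 + x2 = 1
```

A small tolerance is used because floating-point arithmetic rarely produces an exact zero for an equality constraint.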

Here are the conditions that a point x^* must satisfy to be an optimum of a multivariate optimization problem with both equality and inequality constraints.

  • Condition 1:
    \nabla f(x^*) + \sum_{i=1}^{l} [\nabla h_i(x^*)] \lambda_i^* + \sum_{j=1}^{m} [\nabla g_j(x^*)] \mu_j^* = 0
    
    where,
    f(x) = f(x_1, x_2, \ldots, x_n) = Objective function
    h_i(x) = h_i(x_1, x_2, \ldots, x_n) = i-th equality constraint
    g_j(x) = g_j(x_1, x_2, \ldots, x_n) = j-th inequality constraint
    \lambda_i^* = Scalar multiplier for the i-th equality constraint
    \mu_j^* = Scalar multiplier for the j-th inequality constraint
    
  • Condition 2:
    h_i(x^*) = 0, for i = 1, \ldots, l
    

    This condition ensures that the optimum satisfies equality constraints.

  • Condition 3:
    \lambda_i^* \in \mathbb{R}, for i = 1, \ldots, l

    Each \lambda_i^* can be any real number (positive, negative, or zero), and there is one such multiplier for each equality constraint.

  • Condition 4:
    g_j(x^*) \leq 0, j = 1, ..., m
    

    Just as condition 2 requires the optimum to satisfy the equality constraints, this condition requires it to satisfy the inequality constraints. Together they ensure that the optimum point lies in the feasible region.

  • Condition 5:
    \mu_j^* g_j(x^*) = 0, for j = 1, \ldots, m

    This is where the real difference between the equality-constrained and the inequality-constrained situation shows up. The condition is known as the complementary slackness condition. It says that the product of each inequality constraint and its corresponding \mu_j^* must be 0. So for each j, either \mu_j^* is 0, in which case g_j(x^*) can take any value allowed by condition 4, or g_j(x^*) is 0, in which case \mu_j^* has to be computed and must turn out to be greater than or equal to 0.

  • Condition 6:
     
    \mu_j^* \geq 0, for j = 1, \ldots, m

    Condition 5 leaves two cases for each inequality constraint: either \mu_j^* is 0, or g_j(x^*) is 0 and \mu_j^* has to be computed. This condition requires the computed \mu_j^* to be greater than or equal to 0. If some \mu_j^* were negative, moving from x^* into the region where g_j < 0 would stay feasible and would reduce the objective, so x^* could still be improved. The sign condition therefore ensures that no further improvement is possible from the optimum point.
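The conditions can be worked through on the earlier example, min 2 x1^2 + 4 x2^2 subject to 3x1 + 2x2 ≤ 12. There are no equality constraints, so only conditions 1, 4, 5 and 6 apply, with g(x) = 3x1 + 2x2 - 12. Condition 1 gives 4x1 + 3\mu = 0 and 8x2 + 2\mu = 0, and condition 5 splits the problem into two cases. The sketch below checks both candidates with exact fractions; the helper `is_kkt_point` and the other names are hypothetical and written only for this example.

```python
from fractions import Fraction

def grad_f(x1, x2):
    # Gradient of f(x1, x2) = 2*x1^2 + 4*x2^2
    return (4 * x1, 8 * x2)

GRAD_G = (3, 2)  # gradient of g(x1, x2) = 3*x1 + 2*x2 - 12 (constant)

def g(x1, x2):
    # Inequality constraint written in the form g(x) <= 0
    return 3 * x1 + 2 * x2 - 12

def is_kkt_point(x1, x2, mu):
    """Return True if (x1, x2, mu) satisfies conditions 1, 4, 5 and 6."""
    df = grad_f(x1, x2)
    stationarity = all(df[k] + mu * GRAD_G[k] == 0 for k in range(2))  # condition 1
    primal_feasible = g(x1, x2) <= 0                                   # condition 4
    complementary = mu * g(x1, x2) == 0                                # condition 5
    dual_feasible = mu >= 0                                            # condition 6
    return stationarity and primal_feasible and complementary and dual_feasible

# Case A: mu = 0 (constraint inactive). Condition 1 then gives x1 = x2 = 0.
case_a = (Fraction(0), Fraction(0), Fraction(0))

# Case B: g = 0 (constraint active). Condition 1 gives x1 = -3*mu/4 and
# x2 = -mu/4; substituting into 3*x1 + 2*x2 = 12 gives mu = -48/11.
mu_b = Fraction(-48, 11)
case_b = (-3 * mu_b / 4, -mu_b / 4, mu_b)

print(is_kkt_point(*case_a))
print(is_kkt_point(*case_b))
```

Case A passes every check, so (0, 0) with \mu = 0 is a KKT point. Case B satisfies conditions 1, 4 and 5 but fails condition 6 because its multiplier is negative, so it is rejected.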
