ML | Cost function in Logistic Regression

In the case of Linear Regression, the Cost function is –

  J(\Theta) = \frac{1}{m} \sum_{i = 1}^{m} \frac{1}{2} [h_{\Theta}(x^{(i)}) - y^{(i)}]^{2}
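As a quick illustration, here is a minimal NumPy sketch of this squared-error cost (the names linear_cost, X, y and theta are illustrative assumptions, not part of the original article):

import numpy as np

def linear_cost(theta, X, y):
    # J(theta) = (1/m) * sum( 0.5 * (h_theta(x_i) - y_i)^2 )
    m = len(y)
    predictions = X @ theta          # h_theta(x) = theta^T x for every example
    errors = predictions - y
    return np.sum(0.5 * errors ** 2) / m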

But for Logistic Regression,

  h_{\Theta}(x) = g(\Theta^{T}x)



where

  g(z) = \frac{1}{1 + e^{-z}}

is the sigmoid (logistic) function. Substituting this hypothesis into the squared-error cost above produces a non-convex function of Θ with many local optima, which makes it very hard for Gradient Descent to find the global optimum.
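A minimal NumPy sketch of this hypothesis, assuming a design matrix X and a parameter vector theta (names chosen here for illustration):

import numpy as np

def sigmoid(z):
    # g(z) = 1 / (1 + e^{-z})
    return 1.0 / (1.0 + np.exp(-z))

def hypothesis(theta, X):
    # h_theta(x) = g(theta^T x), evaluated for every row of X
    return sigmoid(X @ theta)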

So, for Logistic Regression the cost function is

  Cost(h_{\Theta}(x), y) = \begin{cases} -\log(h_{\Theta}(x)) & \text{if } y = 1 \\ -\log(1 - h_{\Theta}(x)) & \text{if } y = 0 \end{cases}

If y = 1:
Cost = 0 when hθ(x) = 1, but as hθ(x) -> 0, Cost -> Infinity.

If y = 0:
Cost = 0 when hθ(x) = 0, but as hθ(x) -> 1, Cost -> Infinity.

So,

  Cost(h_{\Theta}(x), y) = \begin{cases} 0 & \text{if } h_{\Theta}(x) = y \\ \infty & \text{if } y = 0 \text{ and } h_{\Theta}(x) \rightarrow 1 \\ \infty & \text{if } y = 1 \text{ and } h_{\Theta}(x) \rightarrow 0 \end{cases}
  Cost(h_{\Theta}(x),y) = -y log(h_{\Theta}(x)) - (1-y) log(1-h_{\Theta}(x))
  J(\Theta) = \frac{1}{m}\sum_{i=1}^{m} Cost(h_{\Theta}(x^{(i)}), y^{(i)})
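Below is a minimal NumPy sketch of this combined cost, assuming a design matrix X, labels y and parameters theta; the small eps guard against log(0) is an added assumption, not part of the original formula:

import numpy as np

def logistic_cost(theta, X, y):
    # J(theta) = (1/m) * sum( -y*log(h) - (1-y)*log(1-h) )
    m = len(y)
    h = 1.0 / (1.0 + np.exp(-(X @ theta)))   # sigmoid hypothesis
    eps = 1e-15                              # avoids log(0) for saturated predictions
    return np.sum(-y * np.log(h + eps) - (1 - y) * np.log(1 - h + eps)) / m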

To fit the parameters θ, J(θ) has to be minimized, and Gradient Descent is used for that.

Gradient Descent – The update rule looks similar to that of Linear Regression, but the difference lies in the hypothesis hθ(x), which is now the sigmoid of θᵀx (all θj are updated simultaneously):

  \Theta_{j} := \Theta_{j} - \frac{\alpha}{m} \sum_{i = 1}^{m}(h_{\Theta}(x^{(i)}) - y^{(i)})x_{j}^{(i)}
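A minimal batch Gradient Descent sketch based on this update rule (the learning rate alpha, iteration count and variable names are illustrative assumptions):

import numpy as np

def gradient_descent(theta, X, y, alpha=0.1, iterations=1000):
    m = len(y)
    for _ in range(iterations):
        h = 1.0 / (1.0 + np.exp(-(X @ theta)))   # sigmoid hypothesis on all examples
        gradient = X.T @ (h - y) / m             # partial derivatives of J(theta)
        theta = theta - alpha * gradient         # simultaneous update of every theta_j
    return theta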

