ML | Cost function in Logistic Regression
  • Last Updated : 06 May, 2019

In the case of Linear Regression, the Cost function is –

  J(\Theta) = \frac{1}{m} \sum_{i = 1}^{m} \frac{1}{2} [h_{\Theta}(x^{(i)}) - y^{(i)}]^{2}
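As a quick illustration (a minimal NumPy sketch, assuming X already contains a bias column so that hΘ(x) = Θᵀx; the function name linear_cost is my own, not from the article):

import numpy as np

def linear_cost(theta, X, y):
    # J(theta) = (1/m) * sum_i (1/2) * (h_theta(x^(i)) - y^(i))**2
    m = len(y)
    errors = X @ theta - y              # h_theta(x^(i)) - y^(i) for every example
    return np.sum(0.5 * errors ** 2) / m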

But for Logistic Regression the hypothesis is

  h_{\Theta}(x) = g(\Theta^{T}x)

where g is the sigmoid (logistic) function. Plugging this non-linear hypothesis into the squared-error cost above yields a non-convex cost function with many local optima, which makes it very hard for Gradient Descent to find the global optimum.
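A minimal sketch of this hypothesis (same bias-augmented X as above; the helper names are my own):

import numpy as np

def sigmoid(z):
    # Logistic function g(z) = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + np.exp(-z))

def hypothesis(theta, X):
    # h_theta(x) = g(theta^T x), evaluated for every row of X
    return sigmoid(X @ theta)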

So, for Logistic Regression the cost function is

  Cost(h_{\Theta}(x), y) = \begin{cases} -\log(h_{\Theta}(x)) & \text{if } y = 1 \\ -\log(1 - h_{\Theta}(x)) & \text{if } y = 0 \end{cases}

If y = 1

Cost = 0 if hθ(x) = 1
But as hθ(x) -> 0,
Cost -> Infinity

If y = 0

Cost = 0 if hθ(x) = 0
But as hθ(x) -> 1,
Cost -> Infinity
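As a quick numeric check of this behaviour (a minimal sketch; the helper name single_cost is my own, not from the article):

import numpy as np

def single_cost(h, y):
    # Piecewise cost for one example: -log(h) if y = 1, -log(1 - h) if y = 0
    return -np.log(h) if y == 1 else -np.log(1.0 - h)

print(single_cost(0.99, 1))   # ~0.01 : prediction close to the true label 1, small cost
print(single_cost(0.01, 1))   # ~4.6  : prediction near 0 while y = 1, cost blows up
print(single_cost(0.01, 0))   # ~0.01 : prediction close to the true label 0, small cost
print(single_cost(0.99, 0))   # ~4.6  : prediction near 1 while y = 0, cost blows up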

So, the cost behaves as follows:

  Cost(h_{\Theta}(x), y) = \begin{cases} 0 & \text{if } h_{\Theta}(x) = y \\ \infty & \text{if } y = 0 \text{ and } h_{\Theta}(x) \rightarrow 1 \\ \infty & \text{if } y = 1 \text{ and } h_{\Theta}(x) \rightarrow 0 \end{cases}

Both cases can be combined into a single expression, and the overall cost J(Θ) is its average over the m training examples:

  Cost(h_{\Theta}(x), y) = -y \log(h_{\Theta}(x)) - (1 - y) \log(1 - h_{\Theta}(x))
  J(\Theta) = \frac{1}{m} \sum_{i=1}^{m} Cost(h_{\Theta}(x^{(i)}), y^{(i)})
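As an illustrative sketch (the function name logistic_cost and the bias-augmented design matrix X are assumptions, not from the article), the averaged cost can be computed in vectorised form:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_cost(theta, X, y):
    # J(theta) = (1/m) * sum_i [ -y_i*log(h_i) - (1 - y_i)*log(1 - h_i) ]
    m = len(y)
    h = sigmoid(X @ theta)                     # h_theta(x^(i)) for every example
    return np.sum(-y * np.log(h) - (1 - y) * np.log(1 - h)) / m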

To fit the parameters Θ, J(Θ) has to be minimized, and for that Gradient Descent is used.

Gradient Descent – The update rule looks similar to that of Linear Regression, but the difference lies in the hypothesis hθ(x), which is now the sigmoid g(Θᵀx). Repeating until convergence, all Θj are updated simultaneously (the factor 1/m comes from the average in J(Θ)):

 \Theta_{j} := \Theta_{j} - \frac{\alpha}{m} \sum_{i = 1}^{m} (h_{\Theta}(x^{(i)}) - y^{(i)})\, x_j^{(i)}
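A minimal batch Gradient Descent sketch under the same assumptions (bias-augmented X; the values of alpha and num_iters are arbitrary illustrative choices):

import numpy as np

def gradient_descent(X, y, alpha=0.1, num_iters=1000):
    # Repeats theta_j := theta_j - (alpha/m) * sum_i (h(x^(i)) - y^(i)) * x_j^(i)
    # for all j simultaneously, using the sigmoid hypothesis.
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(num_iters):
        h = 1.0 / (1.0 + np.exp(-(X @ theta)))   # sigmoid hypothesis h_theta(x)
        gradient = (X.T @ (h - y)) / m           # vectorised sum over all m examples
        theta -= alpha * gradient                # simultaneous update of every theta_j
    return theta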
