ML | Cost function in Logistic Regression

Last Updated : 06 May, 2019

In the case of Linear Regression, the Cost function is –

  J(\Theta) = \frac{1}{m} \sum_{i = 1}^{m} \frac{1}{2} [h_{\Theta}(x^{(i)}) - y^{(i)}]^{2}
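As a rough NumPy sketch (the function and variable names here are illustrative, not from the article), this squared-error cost can be computed over all m training examples like so:

```python
import numpy as np

def linear_cost(theta, X, y):
    """Squared-error cost J(theta) for linear regression.

    X : (m, n) feature matrix, y : (m,) targets, theta : (n,) parameters.
    """
    m = len(y)
    predictions = X @ theta            # h_theta(x) = theta^T x for every example
    errors = predictions - y
    return (1 / m) * np.sum(0.5 * errors ** 2)
```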

But for Logistic Regression the hypothesis is

  h_{\Theta}(x) = g(\Theta^{T}x), \quad g(z) = \frac{1}{1 + e^{-z}}

Plugging this non-linear (sigmoid) hypothesis into the squared-error cost above results in a non-convex cost function with many local optima, which makes it very hard for Gradient Descent to reach the global optimum.
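A minimal sketch of this hypothesis in NumPy (function names are illustrative) might look like:

```python
import numpy as np

def sigmoid(z):
    """Logistic (sigmoid) function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def hypothesis(theta, X):
    """Logistic regression hypothesis h_theta(x) = g(theta^T x)."""
    return sigmoid(X @ theta)
```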

So, for Logistic Regression the cost function is

  Cost(h_{\Theta}(x),y) = \begin{cases} -\log(h_{\Theta}(x)) & \text{if } y = 1 \\ -\log(1-h_{\Theta}(x)) & \text{if } y = 0 \end{cases}
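This per-example cost translates directly into code; here is a small sketch (the helper name is hypothetical):

```python
import numpy as np

def example_cost(h_x, y):
    """Cost for a single training example, following the piecewise definition."""
    return -np.log(h_x) if y == 1 else -np.log(1.0 - h_x)
```

For instance, with y = 1 a confident correct prediction such as hθ(x) = 0.999 costs about 0.001, while hθ(x) = 10⁻⁶ costs roughly 13.8, showing how the cost grows without bound as hθ(x) → 0.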

If y = 1

Cost = 0 if hθ(x) = 1
But as,
hθ(x) -> 0
Cost -> Infinity

If y = 0

Cost = 0 if hθ(x) = 0
But as,
hθ(x) -> 1
Cost -> Infinity

So,

  Cost(h_{\Theta}(x),y) = \begin{cases} 0 & \text{if } h_{\Theta}(x) = y \\ \infty & \text{if } y = 0 \text{ and } h_{\Theta}(x) \rightarrow 1 \\ \infty & \text{if } y = 1 \text{ and } h_{\Theta}(x) \rightarrow 0 \end{cases}

The two cases can be written as a single expression, and the overall cost is its average over the m training examples:

  Cost(h_{\Theta}(x),y) = -y \log(h_{\Theta}(x)) - (1-y) \log(1-h_{\Theta}(x))
  J({\Theta}) = \frac{1}{m}\sum_{i=1}^{m} Cost(h_{\Theta}(x^{(i)}), y^{(i)})
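A vectorized sketch of J(θ) in NumPy (again with illustrative names, assuming y holds 0/1 labels) could be:

```python
import numpy as np

def logistic_cost(theta, X, y):
    """Cross-entropy cost J(theta), averaged over the m training examples."""
    m = len(y)
    h_x = 1.0 / (1.0 + np.exp(-(X @ theta)))            # h_theta(x) for every example
    cost = -y * np.log(h_x) - (1 - y) * np.log(1 - h_x)  # per-example cost
    return np.sum(cost) / m
```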

To fit the parameters θ, J(θ) has to be minimized, and for that Gradient Descent is used.

Gradient Descent – the update rule looks similar to that of Linear Regression, but the difference lies in the hypothesis hθ(x):

  \Theta_{j} := \Theta_{j} - \frac{\alpha}{m} \sum_{i = 1}^{m}(h_{\Theta}(x^{(i)})- y^{(i)})\, x_j^{(i)}

(with all θj updated simultaneously)
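Putting it together, a minimal batch gradient descent loop might look like the sketch below (the learning rate, iteration count, and function name are assumptions for illustration, not values from the article):

```python
import numpy as np

def gradient_descent(theta, X, y, alpha=0.1, iterations=1000):
    """Batch gradient descent on the logistic regression cost."""
    m = len(y)
    for _ in range(iterations):
        h_x = 1.0 / (1.0 + np.exp(-(X @ theta)))   # current predictions h_theta(x)
        gradient = (X.T @ (h_x - y)) / m           # dJ/dtheta, averaged over m examples
        theta = theta - alpha * gradient           # simultaneous update of all theta_j
    return theta
```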
