
How to Implement Softmax and Cross-Entropy in Python and PyTorch

Multiclass classification is an application of deep learning/machine learning where the model is given an input and assigns it to one of several categorical labels. For example, given a set of images of animals, the model classifies each image as a cat, dog, horse, etc.

For this purpose, where the model produces one output score per class, a simple logistic (sigmoid) function cannot be used. Instead, another activation function, called the Softmax function, is used along with the cross-entropy loss.



Softmax Function:

The softmax formula is represented as:

\text{softmax}(z_i) = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}

where the values of z_i are the elements of the input vector and can take any real value. The denominator is the normalization term, which guarantees that all the output values of the function sum to 1, making them a valid probability distribution.

The softmax function and the sigmoid function are similar to each other. Softmax operates on vectors while the sigmoid takes scalar values. Thus, the sigmoid function is a special case of the softmax function for a classifier with only two classes. In machine learning, the term "sigmoid function" most often refers to the logistic sigmoid function. Mathematically, it is defined by:

\sigma(z) = \frac{1}{1 + e^{-z}}

The above function is used for classification between 2 classes, i.e., 1 and 0. In the case of multiclass classification, the softmax function is used. The softmax converts the output for each class to a probability value (between 0 and 1), exponentially normalized across the classes.
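To see this relationship concretely, here is a minimal NumPy sketch (the helper functions are illustrative, not part of the article's code) showing that softmax over the two logits [z, 0] reproduces sigmoid(z):

import numpy as np

def sigmoid(z):
    # Logistic sigmoid: 1 / (1 + e^-z)
    return 1.0 / (1.0 + np.exp(-z))

def softmax(values):
    # Softmax as defined above
    exp_values = np.exp(values)
    return exp_values / np.sum(exp_values)

z = 1.5
# The first entry of softmax([z, 0]) equals sigmoid(z),
# since e^z / (e^z + e^0) = 1 / (1 + e^-z)
print(softmax(np.array([z, 0.0]))[0])  # ~0.8176
print(sigmoid(z))                      # ~0.8176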

Example:

The below code implements the softmax function using Python and NumPy.




# The below code implements the softmax function
# using python and numpy. It takes:
# Input: An array/list of values
# Output: An array/list of softmax values
 
 
# Importing the required libraries
import numpy as np
 
# Defining the softmax function
def softmax(values):
 
    # Computing element wise exponential value
    exp_values = np.exp(values)
 
    # Computing sum of these values
    exp_values_sum = np.sum(exp_values)
 
    # Returning the softmax output.
    return exp_values/exp_values_sum
 
 
if __name__ == '__main__':
 
    # Input to be fed
    values = [2, 4, 5, 3]
 
    # Output achieved
    output = softmax(values)
    print("Softmax Output: ", output)
    print("Sum of Softmax Values: ", np.sum(output))

Output:

 
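One practical caveat: for large input values, np.exp can overflow. A common numerically stable variant (shown here as a minimal sketch, not part of the original implementation) subtracts the maximum value before exponentiating, which does not change the result:

import numpy as np

def stable_softmax(values):
    # Shifting by the maximum avoids overflow in np.exp;
    # softmax is invariant to adding a constant to all inputs
    values = np.asarray(values, dtype=float)
    shifted = values - np.max(values)
    exp_values = np.exp(shifted)
    return exp_values / np.sum(exp_values)

# Works even where np.exp(1000) alone would overflow
print(stable_softmax([1000, 1001, 1002]))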

Implementing Softmax using Python and PyTorch:

Below, we see how to implement the softmax function using Python and PyTorch. For this purpose, we use the torch.nn.functional module provided by PyTorch.

Syntax: torch.nn.functional.softmax(input, dim=None, _stacklevel=3, dtype=None)

Parameters

  • input: The input tensor on which softmax is to be applied.
  • dim: Integer value. Indicates the dimension along which the softmax is applied.
  • dtype (optional) – The desired data type of the returned tensor. If set, the input is cast to dtype before the operation is applied. Default: None

Example:

The below code implements the softmax function using PyTorch.




# The below code implements the softmax function
# using the function softmax provided by
# torch.nn.functional in PyTorch.
# Input: A tensor of input values
# Output: A tensor of computed softmax values
 
 
# Importing the required Libraries
import torch.nn.functional as F
import torch
 
# The input tensor to be passed
input_ = torch.tensor([1, 2, 3])
 
# Computing the softmax values
softmax = F.softmax(input_.float(), dim=0)
print("Softmax values are: ", softmax)
 
# Sum of all the softmax values
print("Sum of the softmax values: ", torch.sum(softmax))

Output:

 
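The dim argument matters once the input has more than one dimension: it selects the axis along which the values are normalized. A minimal sketch with a small batch of illustrative values:

import torch
import torch.nn.functional as F

# A batch of 2 samples with 3 class scores each
logits = torch.tensor([[1.0, 2.0, 3.0],
                       [1.0, 1.0, 1.0]])

# dim=1 normalizes across the classes of each sample,
# so every row sums to 1
probs = F.softmax(logits, dim=1)
print(probs)
print(probs.sum(dim=1))  # tensor([1., 1.])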

Cross-Entropy Loss 

Loss functions are the objective functions used in any machine learning task to train the corresponding model. One of the most important loss functions used here is Cross-Entropy Loss, also known as logistic loss or log loss, used in classification tasks. The understanding of Cross-Entropy Loss is based on the Softmax activation function.
The softmax function returns a vector of C class probabilities, where each entry is the probability of the corresponding class. The cross-entropy loss measures the deviation of this vector from the true probability distribution.

Entropy

The entropy of a random variable X is defined as the level of disorder or randomness inherent in its possible outcomes.
For a probability distribution p(x), entropy is defined as:

H(X) = -\sum_{x} p(x) \log p(x)

Since p(x) <= 1, log(p(x)) <= 0; the negative sign is used so that the entropy comes out as a non-negative value.
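As a quick sanity check, a minimal NumPy sketch (the distributions below are illustrative) computing the entropy of a few distributions:

import numpy as np

def entropy(p):
    # H(X) = -sum p(x) * log(p(x)); entries with p(x) = 0 contribute 0
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log(p))

print(entropy([0.5, 0.5]))                # fair coin: log(2) ~ 0.693
print(entropy([1.0, 0.0]))                # deterministic outcome: 0.0
print(entropy([0.25, 0.25, 0.25, 0.25]))  # log(4) ~ 1.386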

Cross-Entropy

Mathematically, cross-entropy is defined as:

H(p, q) = -\sum_{i} p_i \log(q_i)

Here p_i is the true probability of a class, while q_i is the probability computed using the Softmax function.
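For example, with a one-hot true distribution p = [1, 0, 0] and predicted probabilities q = [0.7, 0.2, 0.1], only the true class contributes, so H(p, q) = -log(0.7) ≈ 0.357.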

Implementing Cross-Entropy Loss using Python and NumPy

Below, we discuss the implementation of Cross-Entropy Loss using Python and the NumPy library.
 

Example:

The below code implements the cross-entropy loss using Python and NumPy.




# The below code implements the cross entropy
# loss between the predicted values and the
# true values of class labels. The function:
# Inputs: Predicted values, True values
# Output: The cross entropy loss between them.


# Importing the required library
import numpy as np

# Softmax function (as defined earlier)
def softmax(values):
    exp_values = np.exp(values)
    return exp_values / np.sum(exp_values)

# Cross Entropy function.
def cross_entropy(y_pred, y_true):

    # computing softmax values for predicted values
    y_pred = softmax(y_pred)
    loss = 0
     
    # Doing cross entropy Loss
    for i in range(len(y_pred)):
 
        # Here, the loss is computed using the
        # above mathematical formulation.
        loss = loss + (-1 * y_true[i]*np.log(y_pred[i]))
 
    return loss
 
# y_true: True Probability Distribution
y_true = [1, 0, 0, 0, 0]
 
# y_pred: Predicted values for each class
y_pred = [10, 5, 3, 1, 4]
 
# Calling the cross_entropy function by passing
# the suitable values
cross_entropy_loss = cross_entropy(y_pred, y_true)
 
print("Cross Entropy Loss: ", cross_entropy_loss)

Output:

 

Implementing Cross-Entropy Loss using PyTorch:

For implementing Cross-Entropy Loss using PyTorch, we use the torch.nn module. The relevant class is described below.

torch.nn.CrossEntropyLoss(): 

This class implements the cross-entropy loss between the input and the target value.

Syntax: torch.nn.CrossEntropyLoss(weight=None, ignore_index=-100, reduce=None, reduction='mean', label_smoothing=0.0)

Parameters:

  • weight (optional) – A rescaling weight given to each class. Has to be a Tensor of size C, where C is the number of classes.
  • ignore_index (optional) – An integer value that specifies a target value to be ignored, so that it does not contribute to the gradient.
  • reduce (optional) – Boolean value. By default, the losses are averaged or summed over the observations of each minibatch. When set to False, it returns a loss per batch element. Default: True
  • reduction (optional) – A string which specifies the reduction applied to the output. Can be: 'none' | 'mean' | 'sum' (see the sketch after this list for a comparison).
    1. 'none': No reduction is applied.
    2. 'mean': The weighted mean of the output is taken.
    3. 'sum': The output is summed.
    Default: 'mean'
  • label_smoothing (optional) – A float value in the range [0.0, 1.0]. Specifies the amount of smoothing applied when computing the loss.
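A minimal sketch comparing the 'none' and 'mean' reduction modes (the logits and labels below are illustrative):

import torch
import torch.nn as nn

# A batch of two samples with three class scores each
y_pred = torch.tensor([[2.0, 1.0, 0.1],
                       [0.5, 2.5, 0.3]])
y_true = torch.tensor([0, 1])

# 'none' keeps one loss value per sample
loss_none = nn.CrossEntropyLoss(reduction='none')(y_pred, y_true)

# 'mean' (the default) averages the per-sample losses
loss_mean = nn.CrossEntropyLoss(reduction='mean')(y_pred, y_true)

print(loss_none)  # tensor of shape (2,)
print(loss_mean)  # scalar, equal to loss_none.mean()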

Below is the code:

The code below computes the cross-entropy loss between the input and the target value using PyTorch.




# The below code implements the cross
# entropy loss using the CrossEntropyLoss()
# class provided by torch.nn module in pytorch.
 
 
# Importing the required library
import torch.nn as nn
import torch
 
# Defining the object for this class.
loss = nn.CrossEntropyLoss()
 
# y_pred: Predicted values (raw scores/logits for each class)
y_pred = torch.tensor([[1.4, 0.4, 1.1, 0.1, 2.3]])
 
# y_true: True class label
y_true = torch.tensor([0])
 
# Passing these values to the loss object.
cross_entropy_loss = loss(y_pred, y_true)
 
# Printing the value of the loss.
print("Cross Entropy Loss: ", cross_entropy_loss.item())

Output:

 
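As a sanity check, the loss above should equal the negative log-softmax score of the true class, since CrossEntropyLoss applies log-softmax internally. A minimal sketch reusing the same values:

import torch
import torch.nn as nn
import torch.nn.functional as F

y_pred = torch.tensor([[1.4, 0.4, 1.1, 0.1, 2.3]])
y_true = torch.tensor([0])

# CrossEntropyLoss applies log-softmax internally, then picks
# the negative log-probability of the true class
loss = nn.CrossEntropyLoss()(y_pred, y_true)
manual = -F.log_softmax(y_pred, dim=1)[0, y_true[0]]

print(loss.item(), manual.item())  # both values are equal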

