Hopfield Neural Network

The Hopfield Neural Network, invented by Dr. John J. Hopfield, consists of one layer of ‘n’ fully connected recurrent neurons. It is generally used for auto-association and optimization tasks. Its state is computed through a converging iterative process, and it behaves quite differently from ordinary feed-forward neural networks.

Discrete Hopfield Network

It is a fully interconnected neural network where each unit is connected to every other unit. It behaves in a discrete manner, i.e. it gives finite distinct output, generally of two types: 

  • Binary (0/1)
  • Bipolar (-1/1)

The weights associated with this network are symmetric in nature and have the following properties. 

[Tex]1.\ w_{ij} = w_{ji} \\ 2.\ w_{ii} = 0   [/Tex]
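These two properties are easy to verify programmatically. Below is a minimal sketch, assuming the weights are held in a NumPy array W; the helper name is an illustrative choice.

```python
import numpy as np

def is_valid_hopfield_weights(W):
    """Check that W is symmetric (w_ij = w_ji) and has a zero diagonal (w_ii = 0)."""
    W = np.asarray(W)
    return np.allclose(W, W.T) and np.allclose(np.diag(W), 0)
```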

Structure & Architecture of Hopfield Network

  • Each neuron has an inverting and a non-inverting output.
  • Being fully connected, the output of each neuron is an input to all other neurons, but not to itself.

The figure below shows a sample representation of a Discrete Hopfield Neural Network architecture with the following elements.
 

Discrete Hopfield Network Architecture


  • [ x1 , x2 , ... , xn ] -> input to the n given neurons
  • [ y1 , y2 , ... , yn ] -> output obtained from the n given neurons
  • wij -> weight associated with the connection between the i-th and the j-th neuron

Training Algorithm

For storing a set of input patterns S(p) [p = 1 to P], where S(p) = S1(p) … Si(p) … Sn(p), the weight matrix is given by:

  • For binary patterns

[Tex]w_{ij} = \sum_{p = 1}^{P} [2s_{i}(p) - 1][2s_{j}(p) - 1]\ (where\ w_{ij} = 0\ for\ all\ i = j) [/Tex]

  • For bipolar patterns 

[Tex]w_{ij} = \sum_{p = 1}^{P} [s_{i}(p)s_{j}(p)]\ (where\ w_{ij} = 0\ for\ all\ i= j) [/Tex]

(i.e. the weights have no self-connections; a short code sketch of this storage rule is given below)
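For illustration, the bipolar storage rule above can be written as a short NumPy sketch. The function name store_patterns and its interface are our own choices, not part of any standard library.

```python
import numpy as np

def store_patterns(patterns):
    """Hebbian storage: w_ij = sum over patterns of s_i(p) * s_j(p), with w_ii = 0.

    patterns: array-like of shape (P, n) with bipolar entries in {-1, +1}.
    Returns the symmetric (n, n) weight matrix.
    """
    patterns = np.asarray(patterns, dtype=float)
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for s in patterns:
        W += np.outer(s, s)        # accumulate the outer product s(p)^T s(p)
    np.fill_diagonal(W, 0)         # remove self-connections
    return W
```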

The steps involved in training and applying a Hopfield network are listed below (a short code sketch of the update procedure follows the list):

1. Initialize the weights wij to store the patterns (using the training rule above).
2. For each input vector x, perform steps 3 to 7.
3. Set the initial activations of the network equal to the external input vector x:

[Tex]y_i = x_i\ (for\ i = 1\ to\ n) [/Tex]

4. For each unit yi, perform steps 5 to 7.
5. Calculate the total input of the unit using the equation given below:

[Tex]y_{in_{i}} = x_i + \sum_{j} [y_jw_{ji}] [/Tex]

6. Apply the activation function over the total input to calculate the output:

[Tex]y_i = \begin{cases} 1 & \text{ if } y_{in_i}>\theta_i \\ y_i & \text{ if } y_{in_i}=\theta_i \\ 0 & \text{ if } y_{in_i}<\theta_i \end{cases} [/Tex]

(where θi is the threshold, normally taken as 0)

7. Feed back the obtained output yi to all other units, thereby updating the activation vector.
8. Test the network for convergence.
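Below is a minimal sketch of this asynchronous update procedure for binary (0/1) units with a threshold of 0. The function name recall, the random update order, and the sweep limit are illustrative choices.

```python
import numpy as np

def recall(W, x, theta=0.0, max_sweeps=100):
    """Asynchronous Hopfield recall of a binary (0/1) pattern.

    W: symmetric (n, n) weight matrix with zero diagonal,
    x: external binary input vector, theta: firing threshold.
    """
    x = np.asarray(x, dtype=float)
    y = x.copy()                                   # step 3: y_i = x_i
    for _ in range(max_sweeps):
        y_prev = y.copy()
        for i in np.random.permutation(len(y)):    # update one unit at a time
            y_in = x[i] + y @ W[:, i]              # step 5: total input y_in_i
            if y_in > theta:                       # step 6: apply activation
                y[i] = 1.0
            elif y_in < theta:
                y[i] = 0.0
            # if y_in == theta, y[i] is left unchanged
        if np.array_equal(y, y_prev):              # step 8: test for convergence
            return y
    return y
```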

Consider the following problem. We are required to create a Discrete Hopfield Network that stores the input vector [1 1 1 -1] (bipolar representation) or, equivalently, [1 1 1 0] (binary representation). We then test the network with the first and second components of the stored vector missing (i.e. with the input [0 0 1 0]).

Given the input vector x = [1 1 1 -1] (bipolar), we initialize the weight matrix wij as:

[Tex]w_{ij} = \sum [s^T(p)s(p)] \\ = \begin{bmatrix} 1 \\ 1 \\ 1 \\ -1 \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 & -1 \end{bmatrix} \\ = \begin{bmatrix} 1 & 1 & 1 & -1 \\  1 & 1 & 1 & -1 \\   1 & 1 & 1 & -1 \\    -1 & -1 & -1 & 1 \\\end{bmatrix} [/Tex]

and the weight matrix with no self-connections is:

[Tex]w_{ij} = \begin{bmatrix} 0 & 1 & 1 & -1 \\  1 & 0 & 1 & -1 \\   1 & 1 & 0 & -1 \\    -1 & -1 & -1 & 0 \\\end{bmatrix} [/Tex]
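As a quick check, the same matrix can be reproduced with a couple of NumPy calls:

```python
import numpy as np

s = np.array([1, 1, 1, -1])       # stored bipolar pattern
W = np.outer(s, s)                # outer product s^T(p) s(p)
np.fill_diagonal(W, 0)            # zero the diagonal (no self-connections)
print(W)
# [[ 0  1  1 -1]
#  [ 1  0  1 -1]
#  [ 1  1  0 -1]
#  [-1 -1 -1  0]]
```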

As per the question, the input vector with missing entries is x = [0 0 1 0] ([x1 x2 x3 x4]) (binary). Set yi = x = [0 0 1 0] ([y1 y2 y3 y4]). Choose a unit yi for updating its activation (the order does not matter) and take the i-th column of the weight matrix for the calculation.

(We repeat the following steps for every unit yi and check whether the network converges.)

[Tex]y_{in_{1}} = x_1 + \sum_{j = 1}^4 [y_j w_{j1}] \\ = 0\ + \begin{bmatrix} 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ 1 \\ -1 \end{bmatrix} \\ = 0\ + 1 \\ = 1 \\ Applying\ activation,\ y_{in_1} > 0 \implies y_1 = 1 \\ giving\ feedback\ to\ other\ units,\ we\ get \\ y = \begin{bmatrix} 1 & 0 & 1 & 0 \end{bmatrix} \\ which\ is\ not\ equal\ to\ input\ vector\ \\ x = \begin{bmatrix} 1 & 1 & 1 & 0 \end{bmatrix} \\ Hence,\ no\ convergence. [/Tex]

Now for next unit, we will take updated value via feedback. (i.e. y = [1 0 1 0])

[Tex]y_{in_{3}} = x_3 + \sum_{j = 1}^4 [y_j w_{j3}] \\ = 1\ + \begin{bmatrix} 1 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 0 \\ -1 \end{bmatrix} \\ = 1\ + 1 \\ = 2 \\ Applying\ activation,\ y_{in_3} > 0 \implies y_3= 1 \\ giving\ feedback\ to\ other\ units,\ we\ get \\ y = \begin{bmatrix} 1 & 0 & 1 & 0 \end{bmatrix} \\ which\ is\ not\ equal\ to\ input\ vector\ \\ x = \begin{bmatrix} 1 & 1 & 1 & 0 \end{bmatrix} \\ Hence,\ no\ convergence. [/Tex]

Now for next unit, we will take updated value via feedback. (i.e. y = [1 0 1 0])

[Tex]y_{in_{4}} = x_4 + \sum_{j = 1}^4 [y_j w_{j4}] \\ = 0\ + \begin{bmatrix} 1 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} -1 \\ -1 \\ -1 \\ 0 \end{bmatrix} \\ = 0\ + (-1)\ + (-1) \\ = -2 \\ Applying\ activation,\ y_{in_4} < 0 \implies y_4= 0 \\ giving\ feedback\ to\ other\ units,\ we\ get \\ y = \begin{bmatrix} 1 & 0 & 1 & 0 \end{bmatrix} \\ which\ is\ not\ equal\ to\ input\ vector\ \\ x = \begin{bmatrix} 1 & 1 & 1 & 0 \end{bmatrix} \\ Hence,\ no\ convergence. [/Tex]

Now for next unit, we will take updated value via feedback. (i.e. y = [1 0 1 0])

[Tex]y_{in_{2}} = x_2 + \sum_{j = 1}^4 [y_j w_{j2}] \\ = 0\ + \begin{bmatrix} 1 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 1 \\ -1 \end{bmatrix} \\ = 0\ + 1\ + 1 \\ = 2 \\ Applying\ activation,\ y_{in_2} > 0 \implies y_2= 1 \\ giving\ feedback\ to\ other\ units,\ we\ get \\ y = \begin{bmatrix} 1 & 1 & 1 & 0 \end{bmatrix} \\ which\ is\ equal\ to\ input\ vector\ \\ x = \begin{bmatrix} 1 & 1 & 1 & 0 \end{bmatrix} \\ Hence,\ convergence\ with\ vector\ x. [/Tex]
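The same hand calculation can be reproduced in code. The sketch below fixes the update order to units y1, y3, y4, y2 so that it mirrors the steps above:

```python
import numpy as np

W = np.array([[ 0,  1,  1, -1],      # weight matrix with no self-connections
              [ 1,  0,  1, -1],
              [ 1,  1,  0, -1],
              [-1, -1, -1,  0]], dtype=float)

x = np.array([0, 0, 1, 0], dtype=float)   # input with missing first and second entries
y = x.copy()                               # initial activations: y_i = x_i
for i in [0, 2, 3, 1]:                     # units y1, y3, y4, y2 (0-based indices)
    y_in = x[i] + y @ W[:, i]              # y_in_i = x_i + sum_j y_j * w_ji
    if y_in > 0:
        y[i] = 1.0
    elif y_in < 0:
        y[i] = 0.0
print(y)                                   # [1. 1. 1. 0.] -> converges to the stored pattern
```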

Continuous Hopfield Network

Unlike the discrete Hopfield network, here time is treated as a continuous variable. So, instead of binary/bipolar outputs, the network produces values that lie between 0 and 1. It can be used to solve constrained optimization and associative memory problems. The output is defined as follows (a short sketch of a typical choice of g is given after the definitions):

[Tex]v_i = g(u_i) [/Tex]

where,

  • vi = output of the continuous Hopfield network
  • ui = internal activity of a node in the continuous Hopfield network
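Here g is typically a smooth, monotonically increasing squashing function. A common choice, shown only as an illustrative assumption, is a sigmoid with a gain parameter:

```python
import numpy as np

def g(u, gain=1.0):
    """Sigmoid activation: maps internal activity u to an output v in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-gain * u))
```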

Energy Function

Hopfield networks have an energy function associated with them. Its value either decreases or remains unchanged with every update (feedback) iteration. The energy function of a continuous Hopfield network is defined as:

[Tex]E = -0.5 \sum_{i=1}^n \sum_{j=1}^n w_{ij}v_i v_j - \sum_{i=1}^n \theta_i v_i [/Tex]

To determine whether the network will converge to a stable configuration, we check that the energy function is non-increasing over time:

[Tex]\frac{d}{dt} E \leq 0   [/Tex]

The network is guaranteed to converge if the activity of each neuron with respect to time is given by the following differential equation:

[Tex]\frac{d}{dt}u_i = \frac{-u_i}{\tau} + \sum_{j=1}^n w_{ij} v_j + \theta_i [/Tex]
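As a rough illustration of these dynamics, the sketch below integrates the differential equation with a simple Euler step and records the energy along the way. The step size dt, time constant tau, and sigmoid gain are assumptions chosen only for demonstration.

```python
import numpy as np

def energy(W, theta, v):
    """E = -0.5 * sum_ij w_ij v_i v_j - sum_i theta_i v_i."""
    return -0.5 * v @ W @ v - theta @ v

def simulate(W, theta, u0, tau=1.0, dt=0.01, steps=2000, gain=1.0):
    """Euler-integrate du_i/dt = -u_i/tau + sum_j w_ij v_j + theta_i,
    recording the energy at every step."""
    u = np.asarray(u0, dtype=float)
    energies = []
    for _ in range(steps):
        v = 1.0 / (1.0 + np.exp(-gain * u))        # v_i = g(u_i)
        energies.append(energy(W, theta, v))       # should be (roughly) non-increasing
        u = u + dt * (-u / tau + W @ v + theta)    # Euler update of the internal activity
    v = 1.0 / (1.0 + np.exp(-gain * u))            # final output vector
    return v, energies
```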


