Hopfield Neural Network


Prerequisites: RNN

The Hopfield Neural Network, invented by Dr. John J. Hopfield, consists of one layer of 'n' fully connected recurrent neurons. It is generally used for auto-association and optimization tasks. Its state is computed through a converging iterative process, and it generates a response different from that of our normal neural nets.

Discrete Hopfield Network: It is a fully interconnected neural network where each unit is connected to every other unit. It behaves in a discrete manner, i.e. it gives finite, distinct outputs, generally of two types:

  • Binary (0/1)
  • Bipolar (-1/1)

The weights associated with this network are symmetric in nature and have the following properties:

1.\ w_{ij} = w_{ji} \\ 2.\ w_{ii} = 0



Structure & Architecture  

  • Each neuron has an inverting and a non-inverting output.
  • Being fully connected, the output of each neuron is an input to all other neurons, but not to itself.

Fig 1 shows a sample representation of a Discrete Hopfield Neural Network architecture having the following elements.
 

Fig 1: Discrete Hopfield Network Architecture

[ x1 , x2 , ... , xn ] -> Input to the n given neurons.
[ y1 , y2 , ... , yn ] -> Output obtained from the n given neurons
Wij -> weight associated with the connection between the ith and the jth neuron.

Training Algorithm

For storing a set of input patterns S(p) [p = 1 to P], where S(p) = (S1(p), …, Si(p), …, Sn(p)), the weight matrix is given by:

  • For binary patterns

w_{ij} = \sum_{p = 1}^{P} [2s_{i}(p) - 1][2s_{j}(p) - 1] \quad (\text{where } w_{ij} = 0 \text{ for all } i = j)

  • For bipolar patterns 

w_{ij} = \sum_{p = 1}^{P} s_{i}(p)\,s_{j}(p) \quad (\text{where } w_{ij} = 0 \text{ for all } i = j)

(i.e. the weights have no self-connections)
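As a concrete illustration, here is a minimal NumPy sketch of the bipolar training rule above. The function name `train_hopfield` is our own choice for illustration, not a standard API; binary patterns would first be mapped to bipolar form via 2s(p) - 1, exactly as in the binary formula.

```python
import numpy as np

def train_hopfield(patterns):
    # Hebbian training for a Discrete Hopfield Network.
    # patterns: list of bipolar vectors (+1/-1 entries), each of length n.
    # Returns the n x n symmetric weight matrix with zero diagonal.
    n = len(patterns[0])
    w = np.zeros((n, n))
    for s in patterns:
        s = np.asarray(s, dtype=float)
        w += np.outer(s, s)       # w_ij += s_i(p) * s_j(p)
    np.fill_diagonal(w, 0.0)      # no self-connections: w_ii = 0
    return w
```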

Steps Involved



Step 1 - Initialize weights (wij) to store patterns (using training algorithm).
Step 2 - For each input vector x, perform steps 3-7.
Step 3 - Make the initial activations of the network equal to the external input vector x.

y_i = x_i \quad (\text{for } i = 1 \text{ to } n)

Step 4 - For each unit yi, perform steps 5-7.
Step 5 - Calculate the net input y_{in_i} of the unit using the equation given below.

y_{in_{i}} = x_i + \sum_{j} [y_jw_{ji}]

Step 6 - Apply activation over the total input to calculate the output as per the equation given below: 

y_i = \begin{cases} 1 & \text{if } y_{in_i} > \theta_i \\ y_i & \text{if } y_{in_i} = \theta_i \\ 0 & \text{if } y_{in_i} < \theta_i \end{cases}

(where θi is the threshold of unit i, and is normally taken as 0)

Step 7 - Now feed the obtained output yi back to all other units. Thus, the activation vectors are updated.
Step 8 - Test the network for convergence.
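Collecting steps 1-8 into code, here is a minimal sketch of the asynchronous recall loop (continuing from the training sketch above; `recall` is a hypothetical helper name, and the 0/1 activations and zero threshold follow this article's conventions):

```python
def recall(w, x, theta=0.0, max_iters=100):
    # w: weight matrix from training; x: (possibly corrupted) binary input.
    x = np.asarray(x, dtype=float)
    y = x.copy()                                  # Step 3: y_i = x_i
    for _ in range(max_iters):
        y_prev = y.copy()
        for i in np.random.permutation(len(y)):   # Step 4: update order is arbitrary
            y_in = x[i] + y @ w[:, i]             # Step 5: net input y_in_i
            if y_in > theta:                      # Step 6: threshold activation
                y[i] = 1.0
            elif y_in < theta:
                y[i] = 0.0
            # if y_in == theta, y[i] stays unchanged; Step 7 is implicit,
            # since the updated y[i] feeds into the subsequent unit updates
        if np.array_equal(y, y_prev):             # Step 8: convergence test
            break
    return y
```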

Example Problem

Consider the following problem. We are required to create a Discrete Hopfield Network in which the bipolar input vector [1 1 1 -1] (or [1 1 1 0] in binary representation) is stored. Test the Hopfield network with missing entries in the first and second components of the stored vector (i.e. [0 0 1 0]).

Step by Step Solution

Step 1 - Given the input vector x = [1 1 1 -1] (bipolar), we initialize the weight matrix (wij) as:

w_{ij} = \sum [s^T(p)t(p)] \\ = \begin{bmatrix} 1 \\ 1 \\ 1 \\ -1 \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 & -1 \end{bmatrix} \\ = \begin{bmatrix} 1 & 1 & 1 & -1 \\  1 & 1 & 1 & -1 \\   1 & 1 & 1 & -1 \\    -1 & -1 & -1 & 1 \\\end{bmatrix}

and weight matrix with no self connection is:

w_{ij} = \begin{bmatrix} 0 & 1 & 1 & -1 \\  1 & 0 & 1 & -1 \\   1 & 1 & 0 & -1 \\    -1 & -1 & -1 & 0 \\\end{bmatrix}
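Using the `train_hopfield` sketch from the training section, this matrix can be checked in a couple of lines:

```python
w = train_hopfield([[1, 1, 1, -1]])
print(w)
# [[ 0.  1.  1. -1.]
#  [ 1.  0.  1. -1.]
#  [ 1.  1.  0. -1.]
#  [-1. -1. -1.  0.]]
```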

Step 2 - We have a single test vector, so perform steps 3-7 for it.
Step 3 - As per the question, the input vector x with missing entries is x = [0 0 1 0] ([x1 x2 x3 x4]) (binary).
       - Make yi = x = [0 0 1 0] ([y1 y2 y3 y4]).
Step 4 - Choose a unit yi to update (the order doesn't matter).
       - Take the ith column of the weight matrix for the calculation.
 
(we will repeat steps 5-7 for every unit yi and then check whether there is convergence)

y_{in_{1}} = x_1 + \sum_{j = 1}^4 y_j w_{j1} = 0 + \begin{bmatrix} 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ 1 \\ -1 \end{bmatrix} = 0 + 1 = 1

Applying the activation, y_{in_1} > 0, so y1 = 1. Feeding this back to the other units gives y = [1 0 1 0], which is not equal to the stored vector [1 1 1 0]. Hence, no convergence yet.



Now, for the next unit, we take the updated activation vector obtained via feedback (i.e. y = [1 0 1 0]).

y_{in_{3}} = x_3 + \sum_{j = 1}^4 y_j w_{j3} = 1 + \begin{bmatrix} 1 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 0 \\ -1 \end{bmatrix} = 1 + 1 = 2

Applying the activation, y_{in_3} > 0, so y3 = 1. Feeding this back gives y = [1 0 1 0], which is still not equal to the stored vector [1 1 1 0]. Hence, no convergence yet.

Now, for the next unit, we again take the updated activation vector via feedback (i.e. y = [1 0 1 0]).

y_{in_{4}} = x_4 + \sum_{j = 1}^4 y_j w_{j4} = 0 + \begin{bmatrix} 1 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} -1 \\ -1 \\ -1 \\ 0 \end{bmatrix} = 0 + (-1) + (-1) = -2

Applying the activation, y_{in_4} < 0, so y4 = 0. Feeding this back gives y = [1 0 1 0], which is still not equal to the stored vector [1 1 1 0]. Hence, no convergence yet.

Now, for the last remaining unit, we again take the updated activation vector via feedback (i.e. y = [1 0 1 0]).

y_{in_{2}} = x_2 + \sum_{j = 1}^4 y_j w_{j2} = 0 + \begin{bmatrix} 1 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 1 \\ -1 \end{bmatrix} = 0 + 1 + 1 = 2

Applying the activation, y_{in_2} > 0, so y2 = 1. Feeding this back gives y = [1 1 1 0], which is equal to the stored vector [1 1 1 0]. Hence, the network has converged to the stored pattern.
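Putting the two earlier sketches together reproduces this worked example end to end (the code may visit the units in a different order than the hand calculation, but it settles at the same fixed point):

```python
w = train_hopfield([[1, 1, 1, -1]])
y = recall(w, [0, 0, 1, 0])
print(y)   # [1. 1. 1. 0.] -- the stored pattern, recovered from the corrupted input
```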

Continuous Hopfield Network: Unlike the discrete Hopfield network, here the time parameter is treated as a continuous variable. So, instead of getting binary/bipolar outputs, we obtain values that lie between 0 and 1. It can be used to solve constrained optimization and associative memory problems. The output is defined as:

v_i = g(u_i)

where,
vi = output of the ith neuron in the continuous Hopfield network
ui = internal activity of the ith neuron
g = a monotonically increasing activation (squashing) function, typically a sigmoid

Energy Function

Hopfield networks have an energy function associated with them. It either decreases or remains unchanged on every update (feedback) iteration. The energy function for a continuous Hopfield network is defined as:

E = -0.5 \sum_{i=1}^n \sum_{j=1}^n w_{ij}v_i v_j - \sum_{i=1}^n \theta_i v_i

To determine whether the network will converge to a stable configuration, we check that the energy function is non-increasing over time:

\frac{d}{dt} E \leq 0

The network is bound to converge if the activity of each neuron with respect to time is given by the following differential equation:

\frac{d}{dt}u_i = \frac{-u_i}{\tau} + \sum_{j=1}^n w_{ij} v_j + \theta_i
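As a rough illustration, the sketch below integrates this differential equation with Euler steps. The logistic sigmoid for g, the step size dt, and all function names are our own assumptions for illustration; the energy helper uses the simplified form above, omitting the leak (integral) term of the full continuous-network energy.

```python
def g(u):
    # Output function v = g(u): a logistic sigmoid squashing u into (0, 1).
    return 1.0 / (1.0 + np.exp(-u))

def energy(w, v, theta):
    # Simplified energy: E = -0.5 * sum_ij w_ij v_i v_j - sum_i theta_i v_i
    return -0.5 * v @ w @ v - theta @ v

def simulate(w, theta, u0, tau=1.0, dt=0.01, steps=1000):
    # Euler integration of du_i/dt = -u_i/tau + sum_j w_ij v_j + theta_i.
    # Along the trajectory, energy(w, g(u), theta) should be non-increasing.
    u = np.asarray(u0, dtype=float).copy()
    for _ in range(steps):
        v = g(u)
        u += dt * (-u / tau + w @ v + theta)
    return g(u)   # final outputs v_i, each in (0, 1)
```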

 
