Neural Networks | A beginner's guide

Neural networks are artificial systems inspired by biological neural networks. These systems learn to perform tasks by being exposed to various datasets and examples, without any task-specific rules. The idea is that the system derives identifying characteristics from the data it is given, rather than being programmed with a built-in understanding of those datasets.

Neural networks are based on computational models for threshold logic, which combines algorithms and mathematics. The field draws either on the study of the brain or on the application of neural networks to artificial intelligence, and this work has also led to improvements in finite automata theory.

The components of a typical neural network are neurons, connections, weights, biases, a propagation function, and a learning rule. A neuron j receives an input p_j(t) from predecessor neurons, each of which has an activation a_j(t), a threshold θ_j, an activation function f, and an output function f_out. Connections carry weights and biases that govern how neuron i transfers its output to neuron j. The propagation function computes a neuron's input by summing the outputs of its predecessor neurons multiplied by the corresponding weights, and the activation function turns that input into the neuron's output. The learning rule modifies the weights and thresholds of the network.
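
As a rough illustration of how these components fit together, here is a minimal sketch of a single neuron in Python. The weights, bias, and sigmoid activation function below are arbitrary values chosen only for demonstration, not part of any particular library or of the example later in this article.

import numpy as np

def neuron_output(inputs, weights, bias):
    # propagation function: weighted sum of predecessor outputs plus a bias
    net_input = np.dot(inputs, weights) + bias
    # activation function: a sigmoid squashes the net input into (0, 1)
    return 1 / (1 + np.exp(-net_input))

# example values (arbitrary, for illustration only)
inputs = np.array([0.5, 0.1, 0.9])    # outputs of predecessor neurons
weights = np.array([0.4, 0.7, 0.2])   # connection weights into this neuron
bias = -0.3

print(neuron_output(inputs, weights, bias))  # a single activation in (0, 1)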

Supervised vs Unsupervised Learning:
Neural networks learn via supervised learning. Supervised machine learning involves an input variable x and an output variable y. The algorithm learns from a training dataset: it iteratively makes predictions on the data and is corrected using the known answers. Learning stops when the algorithm reaches an acceptable level of performance.
Unsupervised machine learning has input data X and no corresponding output variables. The goal is to model the underlying structure of the data in order to learn more about it. The keywords for supervised machine learning are classification and regression. For unsupervised machine learning, the keywords are clustering and association.
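
A minimal sketch of the difference, using scikit-learn purely for illustration (the toy data and the choice of LogisticRegression and KMeans are assumptions made here, not part of this article's own example):

import numpy as np
from sklearn.linear_model import LogisticRegression  # supervised: classification
from sklearn.cluster import KMeans                   # unsupervised: clustering

X = np.array([[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [6.0, 9.0]])  # toy inputs
y = np.array([0, 0, 1, 1])                                      # known labels

# supervised: learn the mapping X -> y from labelled examples
clf = LogisticRegression().fit(X, y)
print(clf.predict(X))          # predictions checked against the known answers

# unsupervised: only X is given; the model looks for structure (clusters)
km = KMeans(n_clusters=2, n_init=10).fit(X)
print(km.labels_)              # cluster assignments, no labels required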

Evolution of Neural Networks:
Hebbian learning deals with neural plasticity. It is unsupervised and is related to long-term potentiation. Hebbian learning has been applied to pattern recognition and exclusive-or circuits, and it works with if-then rules.
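
The classic Hebbian update strengthens a weight in proportion to the correlated activity of the two neurons it connects ("neurons that fire together wire together"). A minimal sketch, with a made-up learning rate and toy activations:

import numpy as np

def hebbian_update(weights, pre, post, lr=0.1):
    # Hebbian rule: delta_w[i, j] = lr * pre[i] * post[j]
    return weights + lr * np.outer(pre, post)

pre = np.array([1.0, 0.0, 1.0])    # activations of presynaptic neurons
post = np.array([1.0, 1.0])        # activations of postsynaptic neurons
w = np.zeros((3, 2))               # weights from pre to post, initially zero

w = hebbian_update(w, pre, post)
print(w)   # weights grow only where both connected neurons were active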

Back propagation solved the exclusive-or problem that Hebbian learning could not handle, and it made multi-layer networks feasible and efficient. When an error is found, it is corrected at each layer by modifying the weights at each node. This era also led to the development of support vector machines, linear classifiers, and max-pooling. The vanishing gradient problem affects feedforward networks and recurrent neural networks that use back propagation; training such many-layered networks is what is now known as deep learning.
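
To make "modifying the weights at each node" concrete, here is a minimal sketch of one gradient step for a single sigmoid layer; the full two-layer example appears later in this article, and the data and learning rate below are arbitrary illustration values:

import numpy as np

X = np.array([[0., 1.], [1., 0.], [1., 1.]])   # toy inputs
y = np.array([[1.], [1.], [0.]])               # toy targets
W = np.zeros((2, 1))                           # weights of a single layer
lr = 0.5                                       # learning rate (arbitrary)

out = 1 / (1 + np.exp(-X.dot(W)))              # forward pass through a sigmoid
error = y - out                                # how far off each prediction is
grad = X.T.dot(error * out * (1 - out))        # error scaled by the sigmoid slope
W = W + lr * grad                              # move the weights to reduce the error
print(W)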

Hardware-based designs are used for biophysical simulation and neuromorphic computing. They support large-scale component analysis and convolution, creating a new class of analog neural computing, and this work also made back-propagation practical for many-layered feedforward neural networks.

Convolutional networks alternate convolutional layers and max-pooling layers, followed by connected layers (fully or sparsely connected) and a final classification layer. The learning is done without unsupervised pre-training. Each filter is equivalent to a weight vector that has to be trained. Shift invariance has to be guaranteed when dealing with small and large neural networks; this is being addressed in Developmental Networks.
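
The alternation of convolution and max-pooling can be sketched in a few lines of numpy. The toy image, the single 2 x 2 filter, and the pooling size below are arbitrary choices for illustration, not taken from any particular network:

import numpy as np

def conv2d(image, kernel):
    # valid 2-D convolution: slide the filter over the image
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

def max_pool(feature_map, size=2):
    # non-overlapping max-pooling over size x size blocks
    h, w = feature_map.shape
    h, w = h - h % size, w - w % size
    blocks = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

image = np.arange(36, dtype=float).reshape(6, 6)   # toy "image"
kernel = np.array([[1., 0.], [0., -1.]])           # one trainable filter
print(max_pool(conv2d(image, kernel)))             # conv layer then pooling layer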

Types of Neural Networks

There are seven types of neural networks that can be used.

  • The first is a multilayer perceptron which has three or more layers and uses a nonlinear activation function.
  • The second is the convolutional neural network that uses a variation of the multilayer perceptron.
  • The third is the recursive neural network that uses weights to make structured predictions.
  • The fourth is a recurrent neural network that makes connections between the neurons in a directed cycle. The long short-term memory neural network uses the recurrent architecture but does not apply a squashing activation function within its memory cells.
  • The final two are sequence-to-sequence models, which use two recurrent networks, and shallow neural networks, which produce a vector space from a body of text. These neural networks are applications of the basic neural network demonstrated below.

For the example, the neural network works with three vectors: a vector of attributes X, a vector of classes y, and a vector of weights W. The code uses 100 iterations to fit the attributes to the classes. The predictions are generated, weighted, and then output after iterating through the vector of weights W. The neural network handles the back propagation.

Examples:

Input :
X { 2.6, 3.1, 3.0,
    3.4, 2.1, 2.5,
    2.6, 1.3, 4.9, 
    0.1, 0.3, 2.3 };
y {1, 1, 1};
W {0.3, 0.4, 0.6}; 

Output :
0.990628 
0.984596 
0.994117 

Below is the implementation:

import numpy as np

# matrix of attributes: three samples with three features each
X = np.array([[1, 2, 3],
              [3, 4, 1],
              [2, 5, 3]])

# target class values, transposed into a column vector
y = np.array([[.5, .3, .2]])
y = y.T

# first-layer weights; this starts as a scalar and becomes a
# 3 x 3 matrix after the first update below
sigm = 2

# second-layer weights, initialised randomly in [-1, 0)
delt = np.random.random((3, 3)) - 1

# the logistic (sigmoid) activation function
def sigmoid(z):
    return 1 / (1 + np.exp(-z))

for j in range(100):
    # forward pass through the two sigmoid layers
    l1 = sigmoid(np.dot(X, sigm))
    l2 = sigmoid(np.dot(l1, delt))

    # output-layer error term, scaled by the sigmoid derivative
    m1 = (y - l2) * l2 * (1 - l2)

    # error term propagated back to the first layer
    m2 = m1.dot(delt.T) * l1 * (1 - l1)

    # update both sets of weights
    delt = delt + l1.T.dot(m1)
    sigm = sigm + X.T.dot(m2)

# print the first-layer activations after training
print(sigmoid(np.dot(X, sigm)))

Output:

[[ 0.99999294  0.99999379  0.99999353]
 [ 0.99999987  0.99999989  0.99999988]
 [ 1.          1.          1.        ]]

Limitations:
This neural network, based on Andrew Trask's neural network, is a supervised model. It does not handle unsupervised machine learning and does not cluster or associate data. It also lacks the level of accuracy found in more computationally expensive neural networks. Finally, it does not work with matrices where X's number of rows and columns does not match the number of rows in Y and W.

The next steps would be to create an unsupervised neural network and to increase computational power for the supervised model with more iterations and threading.
