
AHA: Artificial Hippocampal Algorithm

Last Updated : 01 Nov, 2022

The majority of ML research concerns slow, statistical learning from large, labeled datasets. Humans and animals do not learn this way. An important characteristic of animal learning is episodic learning, i.e. the ability to rapidly memorize a specific experience as a composition of existing concepts, without being given labels.

Biological learning has the following qualities:

  • Able to learn and reason about specific instances, even when they are very similar.
  • Able to generalize to other experiences.
  • Robust at recognizing entities from a partial cue.
  • Able to learn without labels.
  • Sample-efficient (one-shot learning).
  • Able to continually learn new knowledge.

There has been growing interest in the ML community in algorithms that possess these qualities. AHA is one such algorithm: it can perform one-shot learning and achieve several of them. Unlike most machine learning models, AHA is trained without any external labels and uses only local and immediate rewards.

Hippocampal Formation in the Brain

The hippocampus is contained within the Medial Temporal Lobe, a brain area widely recognized to be critical for learning and memory. It is understood to learn quickly, retain knowledge over short time spans on the order of days, and selectively consolidate memories over that time into the neocortex, which performs slow statistical learning. Below is the design of the Complementary Learning Systems (CLS) framework: it contains two distinct parts, the hippocampal formation and the neocortex, for processing short-term and long-term memory respectively.

CLS explanation

Below is the design of the hippocampal formation of the brain:

Hippocampal formation of Brain

The Entorhinal Cortex (EC) acts as the main gateway between the neocortex and the hippocampus. It sends input from its superficial layers (ECin) to the hippocampal layers and receives output from the hippocampus at its deep layers (ECout). Within the hippocampus there are several functional units, the most significant of which are the DG, CA3 and CA1. ECin forms a sparse, distributed, overlapping pattern that combines input from all over the neocortex and subcortical structures.

This pattern becomes sparser and less overlapping through the Dentate Gyrus (DG) and CA3, owing to increasing inhibition and sparse connectivity. That provides distinct representations for similar inputs, and therefore an ability to separate patterns. The DG-CA3 connections are non-associative and are responsible for encoding engrams. The EC-CA3 connections comprise a pattern-association network and are responsible for providing a cue for retrieval. Recurrent connections within CA3 create an auto-associative memory. The EC has bilateral connections to CA1, and the CA3-CA1-EC pathway forms a hierarchical pattern-association network.

During encoding, the activated neurons in CA1 form associative connections with the active EC neurons. During retrieval, the sparse CA3 pattern becomes denser and more overlapping through CA1, so the original, complete pattern that was present during encoding is replayed, reinstating activation in ECout, which in turn drives the neocortex through reciprocal feedback connections.
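To make the encode/retrieve cycle concrete, below is a minimal numpy sketch of Hebbian hetero-association, the general mechanism described above (an illustration only, not a model of real hippocampal circuitry): an outer-product update binds a sparse CA3-like code to an EC-like pattern during encoding, and a matrix multiply reinstates the EC pattern during retrieval. All sizes and sparsity levels are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative patterns: a sparse binary "CA3" code and a denser "EC" pattern.
ca3 = np.zeros(50)
ca3[rng.choice(50, size=3, replace=False)] = 1.0      # 3 active units
ec = (rng.random(200) < 0.3).astype(float)            # denser grounded pattern

# Encoding: one-shot, local Hebbian update (outer product of the two patterns).
W = np.outer(ec, ca3)

# Retrieval: the recalled CA3 code reinstates the associated EC activation.
ec_recalled = (W @ ca3 > 0).astype(float)
print(np.array_equal(ec_recalled, ec))                # True for one stored pair
```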

In summary, the hippocampus comprises an auto-associative memory that can operate effectively with partial cues (pattern completion) while distinguishing very similar inputs (pattern separation).

A framework similar to the one described above can be used to enhance standard machine learning algorithms. Such a model comprises a long-term memory capable of slow statistical learning (similar to the neocortex in the brain) and a fast learner. The fast learner creates very distinct representations so that they do not interfere with the slow statistical learning.

Moreover, the representations generated by the fast learner are highly compressed, which makes it practical to store experiences consisting of multiple high-dimensional sensory streams. The system can reconstruct the high-dimensional inputs for recognition, i.e. it is able to recreate the original stimulus when exposed to something similar, and for consolidation, i.e. experiences are replayed to the long-term memory, which learns the important information and forgets the rest.

To memorize an episode in one shot, we need to create distinct, non-interfering patterns; to do that, we must exaggerate the differences between them. But that makes it difficult for the model to generalize, i.e. to represent similar things in similar ways. These two capabilities are seemingly at odds with each other.
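The tension can be illustrated numerically. In the toy numpy sketch below (an assumption-laden illustration, not AHA's code), two nearly identical inputs are passed through a fixed random projection followed by top-k sparsening; the resulting codes share far fewer active units than the inputs' similarity would suggest, which is good for separation but bad for generalization.

```python
import numpy as np

rng = np.random.default_rng(42)

def sparse_code(x, W, k=10):
    """Random projection followed by top-k binarization."""
    z = W @ x
    code = np.zeros_like(z)
    code[np.argsort(z)[-k:]] = 1.0
    return code

# Two very similar inputs: b is a lightly perturbed copy of a.
a = rng.random(100)
b = a + 0.1 * rng.standard_normal(100)

W = rng.standard_normal((500, 100))     # fixed, randomly initialized projection
ca, cb = sparse_code(a, W), sparse_code(b, W)

cos_in = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
overlap_out = (ca * cb).sum() / 10      # fraction of shared active units
print(f"input cosine similarity: {cos_in:.2f}")       # near 1.0
print(f"code overlap:            {overlap_out:.2f}")  # noticeably lower
```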

Artificial Hippocampal Algorithm Architecture

AHA architecture; brown arrows represent target connections

The AHA architecture is inspired by the hippocampal formation described above. It exploits the fast-learning quality and models the distinct functional pathways of the hippocampal region. The pathways are:

  • Pattern Separation (PS/DG): produces the sparse, orthogonal representations that appear in the EC-DG-CA3 pathway. The ideal function for this is hashing, i.e. very similar (but not identical) inputs should produce dissimilar outputs. The authors implemented it with a randomly initialized, fixed, single-layer fully connected network with a sparsity constraint.
  • Pattern Completion (PC/CA3): recognition of a complete pattern from a partial cue. It is implemented with a Hopfield network, a biologically inspired auto-associative memory (see the sketch after this list).
  • Pattern Mapping (PM/CA1): reconstruction of the original complete pattern in a grounded form.
  • Pattern Retrieval (PR/EC-CA3): models the connectivity between the EC (i.e. the VC) and CA3 (the PC). Its role is to provide a cue to the PC. It is implemented with a 2-layer fully connected artificial neural network (FC-ANN).
  • Vision Component (VC): performs a role similar to the EC, i.e. it processes high-dimensional sensory input and outputs abstract visual features. It is implemented with a single-layer convolutional sparse encoder with an interest filter to suppress background encoding.
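To illustrate the PC pathway, here is a minimal sketch of a classical binary Hopfield network, the general technique named above (a toy version, not AHA's implementation): patterns are stored with a Hebbian rule, and a corrupted cue is iteratively cleaned up into the nearest stored pattern.

```python
import numpy as np

rng = np.random.default_rng(7)

class Hopfield:
    """Minimal binary (+/-1) Hopfield auto-associative memory."""

    def __init__(self, n):
        self.W = np.zeros((n, n))

    def store(self, patterns):
        # Hebbian learning: sum of outer products, no self-connections.
        for p in patterns:
            self.W += np.outer(p, p)
        np.fill_diagonal(self.W, 0)

    def recall(self, cue, steps=10):
        # Synchronous updates; settles quickly for a lightly loaded network.
        s = cue.copy()
        for _ in range(steps):
            s = np.where(self.W @ s >= 0, 1.0, -1.0)
        return s

n = 100
patterns = [rng.choice([-1.0, 1.0], size=n) for _ in range(3)]
net = Hopfield(n)
net.store(patterns)

# Corrupt 20% of a stored pattern and complete it from the partial cue.
cue = patterns[0].copy()
cue[rng.choice(n, size=20, replace=False)] *= -1
recalled = net.recall(cue)
print("bits recovered:", int((recalled == patterns[0]).sum()), "/", n)
```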

Memory storage (training): In this phase, the PS pathway converts the non-symbolic input into a symbolic form for memory storage. These symbols are used for a form of self-supervised learning in the PC pathway, so that it can recognize them in the future. The PM pathway learns to map from these symbols back to the grounded, non-symbolic form.

Memory recall (inference): The PR pathway maps the non-symbolic input to a symbolic cue, and the PC's auto-associative memory provides robust recall of the stored symbols from that cue. The PM pathway then reconstructs the original output.
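Putting the two phases together, the sketch below wires toy versions of the PS, PC and PM pathways into the storage/recall loop just described (the VC and PR components are omitted for brevity). The class interfaces (`encode`, `store`, `recall`, `learn`, `decode`) and all sizes are assumptions for illustration; AHA's actual components are the neural networks listed earlier.

```python
import numpy as np

rng = np.random.default_rng(0)

class PS:
    """Pattern separation: fixed random projection + top-k (+/-1 code)."""
    def __init__(self, n_in, n_out=200, k=10):
        self.W, self.k = rng.standard_normal((n_out, n_in)), k
    def encode(self, x):
        z = self.W @ x
        code = -np.ones_like(z)
        code[np.argsort(z)[-self.k:]] = 1.0
        return code

class PC:
    """Pattern completion: one-shot Hebbian Hopfield memory."""
    def __init__(self, n):
        self.W = np.zeros((n, n))
    def store(self, p):
        self.W += np.outer(p, p)
        np.fill_diagonal(self.W, 0)
    def recall(self, cue, steps=5):
        s = cue.copy()
        for _ in range(steps):
            s = np.where(self.W @ s >= 0, 1.0, -1.0)
        return s

class PM:
    """Pattern mapping: associate active symbol units with the grounded input."""
    def __init__(self, n_sym, n_in):
        self.W = np.zeros((n_in, n_sym))
    def learn(self, symbol, x):
        self.W += np.outer(x, symbol > 0)
    def decode(self, symbol):
        active = (symbol > 0).astype(float)
        return self.W @ active / max(active.sum(), 1)

ps, pc, pm = PS(n_in=64), PC(200), PM(200, 64)

# Memory storage (training): one-shot write of a single experience.
x = rng.random(64)
symbol = ps.encode(x)
pc.store(symbol)
pm.learn(symbol, x)

# Memory recall (inference): cue with a noisy view, complete, reconstruct.
cue = ps.encode(x + 0.05 * rng.standard_normal(64))
x_hat = pm.decode(pc.recall(cue))
print("reconstruction error:", round(float(np.linalg.norm(x_hat - x)), 4))
```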

Experiments

The experiments are based on a one-shot classification test using a dataset of handwritten characters. There are two main parts to the experiment:

One-shot classification (Lake 2015 dataset):

  • First, a “memorize” set of handwritten characters is presented to the model.
  • Second, a “recognize” set is presented. It consists of the same characters, handwritten by a different person or rendered in a different font.
  • The system must find the matching characters, e.g. the 2nd character in “memorize” matches the 5th character in “recognize”. Despite seeing only one version of each character, the system must be able to recognize it (a toy matching sketch follows the figure below).

One-shot classification test example
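For a sense of how such a test can be scored, here is a hedged toy sketch in numpy: each image is assumed to have already been encoded into a feature vector (random vectors stand in for real encodings), and each “recognize” item is matched to its most similar “memorize” item by cosine similarity. None of this is AHA's actual evaluation code.

```python
import numpy as np

rng = np.random.default_rng(1)

def match_sets(memorize, recognize):
    """Match each 'recognize' vector to the nearest 'memorize' vector."""
    M = np.stack(memorize)
    M = M / np.linalg.norm(M, axis=1, keepdims=True)
    return [int(np.argmax(M @ (r / np.linalg.norm(r)))) for r in recognize]

# Toy stand-ins for encoded characters: the recognize set contains noisy,
# shuffled versions of the memorized encodings ("different handwriting").
memorize = [rng.random(64) for _ in range(20)]
order = rng.permutation(20)
recognize = [memorize[i] + 0.1 * rng.standard_normal(64) for i in order]

pred = match_sets(memorize, recognize)
accuracy = float(np.mean([p == t for p, t in zip(pred, order)]))
print(f"one-shot matching accuracy: {accuracy:.2f}")
```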

Instance one-shot classification:

  • This is similar to the steps above, except that every character in the set is a variant of the same letter.
  • The task is still to match corresponding characters, but now the model must learn to distinguish between very similar characters.

Instance one-shot classification

In the image, rows 1-6 represent the following stages:

  • The memorized character.
  • The separated, symbolic representation of the character.
  • The test character as shown to the system.
  • The first recall of the symbol.
  • A refined version from the auto-associative memory.
  • The reconstructed input.
