Adaptive Resonance Theory
- Example 1
- Example 2: Character Recognition
- Example 3: in C++
- Example 4: Customer Personalization Application
ART1 Neural Networks
ART1 neural networks cluster binary input vectors using unsupervised learning. A distinctive feature of adaptive resonance theory is that it gives the user control over the degree of similarity required of patterns placed on the same cluster.
An ART1 net achieves stability when it cannot return any patterns to previous clusters (conversely, a pattern oscillating among different clusters at different stages of training indicates an unstable net). Some nets achieve stability by gradually reducing the learning rate as the same set of training patterns is presented many times. However, this does not allow the net to readily learn a new pattern that is presented for the first time after a number of training epochs have already taken place. The ability of a net to respond to (learn) a new pattern equally well at any stage of learning is called plasticity, a computational counterpart of the biological notion of neural plasticity. Adaptive resonance theory nets are designed to be both stable and plastic.
The basic structure of an ART1 neural network involves:
- an input processing field (called the F1 layer), which consists of two parts:
- an input portion (F1(a))
- an interface portion (F1(b))
- the cluster units (the F2 layer)
- a reset mechanism to control the degree of similarity of patterns placed on the same cluster
- weighted bottom-up connections between the F1 and F2 layers
- weighted top-down connections between the F2 and F1 layers
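The components listed above can be collected into a data structure. The following is an illustrative sketch (the name `Art1Net` and all member names are my own, not from the text), assuming n units in the F1 layers and m cluster units, with the conventional ART1 initialization of small bottom-up weights and top-down weights of 1:

```cpp
#include <vector>

// Illustrative sketch of the ART1 components described above.
// Names and the parameter L are assumptions, not from the text.
struct Art1Net {
    int n;                               // units in F1 (input/interface) layer
    int m;                               // units in F2 (cluster) layer
    double vigilance;                    // reset-mechanism threshold, 0 < rho <= 1
    std::vector<std::vector<double>> b;  // bottom-up weights b[i][j], F1(b) -> F2
    std::vector<std::vector<double>> t;  // top-down weights t[j][i], F2 -> F1(b)

    // Common initialization: b[i][j] = L / (L - 1 + n) with L > 1, t[j][i] = 1.
    Art1Net(int n_, int m_, double rho, double L = 2.0)
        : n(n_), m(m_), vigilance(rho),
          b(n_, std::vector<double>(m_, L / (L - 1.0 + n_))),
          t(m_, std::vector<double>(n_, 1.0)) {}
};
```

Initializing all top-down weights to 1 ensures an uncommitted cluster unit can match any binary input on its first learning trial.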
F1(b), the interface portion, combines signals from the input portion and the F2 layer, for use in comparing the similarity of the input signal to the weight vector for the cluster unit that has been selected as a candidate for learning.
Figure 1. A simple ART1 structure.
To control the similarity of patterns placed on the same cluster, there are two sets of connections (each with its own weights) between each unit in the interface portion of the input field and the cluster unit. The F1(b) layer is connected to the F2 layer by bottom-up weights (bij). The F2 layer is connected to the F1(b) layer by top-down weights (tij).
The F2 layer is a competitive layer: the cluster unit with the largest net input becomes the candidate to learn the input pattern, and the activations of all other F2 units are set to zero. The interface units, F1(b), now combine information from the input and cluster units. Whether or not this cluster unit is allowed to learn the input pattern depends on how similar its top-down weight vector is to the input vector. This decision is made by the reset unit, based on signals it receives from the input F1(a) and interface F1(b) layers. If the cluster unit is not allowed to learn, it is inhibited and a new cluster unit is selected as the candidate. If a cluster unit is allowed to learn, it is said to classify a pattern class. Sometimes there is a tie for the winning neuron in the F2 layer; when this happens, an arbitrary rule, such as taking the first tied unit in serial order, can be used to choose the winner.
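The competition described above can be sketched as a winner-take-all selection. This is a minimal sketch (function and variable names are my own), which skips units already inhibited by the reset mechanism and breaks ties by serial order:

```cpp
#include <vector>

// Winner-take-all over the F2 layer: returns the index of the
// non-inhibited unit with the largest net input. Ties are broken by
// serial order (the lowest-indexed unit wins). Returns -1 if every
// unit is inhibited. Names are illustrative.
int selectWinner(const std::vector<double>& netInput,
                 const std::vector<bool>& inhibited) {
    int winner = -1;
    for (std::size_t j = 0; j < netInput.size(); ++j) {
        if (inhibited[j]) continue;
        if (winner < 0 || netInput[j] > netInput[winner]) winner = (int)j;
    }
    return winner;
}
```

Using a strict `>` comparison is what implements the "first in serial order" tie rule: a later unit with an equal net input never displaces an earlier winner.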
During the operation of an ART1 net, patterns that emerge in the F1(a) and F1(b) layers are called traces of STM (short-term memory). Traces of LTM (long-term memory) are the connection weights between the input layers (F1) and the output layer (F2).
Similarity and the Vigilance Parameter
The degree of similarity required for patterns to be assigned to the same cluster unit is controlled by a user-defined gain control, known as the vigilance parameter. For ART1, the reset mechanism is designed to control the state of each node in the F2 layer. At any time, an F2 node is in one of three states:
- Active, currently participating in the competition
- Inactive, but available to participate in competition
- Inhibited, and prevented from participating in competition
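The vigilance test itself can be sketched as follows. In standard ART1 (this formulation follows Fausett rather than anything stated explicitly above), the interface activation is the component-wise AND of the binary input s and the winning unit's top-down weight vector, and the winner passes the test only if the match ratio meets the vigilance; otherwise the reset unit inhibits it:

```cpp
#include <vector>

// ART1 vigilance (reset) test: x_i = s_i AND tJ_i is the interface
// activation; the candidate may learn only if ||x|| / ||s|| >= rho.
// Names are illustrative; s and tJ are binary (0/1) vectors.
bool passesVigilance(const std::vector<int>& s,
                     const std::vector<int>& tJ, double rho) {
    int normS = 0, normX = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        normS += s[i];               // ||s|| = number of 1s in the input
        normX += s[i] & tJ[i];       // ||x|| = number of 1s s shares with tJ
    }
    if (normS == 0) return false;    // an all-zero input matches nothing
    return (double)normX / normS >= rho;
}
```

With high vigilance (rho near 1) a candidate must match the input almost exactly, producing many fine-grained clusters; with low vigilance a few coarse clusters absorb most inputs.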
One problem that ART1 runs into is that the final weight values and the clusters that are created depend, to some degree, on the order in which the input vectors are presented. The vigilance parameter helps to solve this: the higher the vigilance is raised, the less dependent the clusters become on the order of input.
In adaptive resonance theory, the changes in activations of units and in weights are governed by coupled differential equations. The net is a continuously changing (dynamic) system, but the process can be simplified because the activations are assumed to change much more rapidly than the weights. Once an acceptable cluster unit has been selected for learning, the bottom-up and top-down signals are maintained for an extended period, during which time the weight changes occur. This is the "resonance" that gives the net its name.
ART1 Learning: Fast vs. Slow
ART nets generally use one of two types of learning: fast and slow. In the fast learning mode, it is assumed that weight updates during resonance occur rapidly, relative to the length of time a pattern is presented on any particular trial; the weights reach equilibrium on each trial. In slow learning mode, the weight changes occur slowly relative to the duration of a learning trial, and the weights do not reach equilibrium on a particular trial. Many more presentations of the patterns are required for slow learning than for fast, but fewer calculations occur on each learning trial: generally, only one weight update, with a relatively small learning rate, occurs per trial in slow learning mode.
In fast learning, the net is considered stabilized when each pattern chooses the correct cluster unit when it is presented (without causing any unit to reset). For ART1, because the patterns are binary, the weights associated with each cluster unit also stabilize in the fast learning mode. The resulting weight vectors are appropriate for the type of input patterns used in ART1. Also, the equilibrium weights are easy to determine, so the iterative solution of the differential equations that control the weight updates is not necessary (as it is for ART2).
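Those easy-to-determine equilibrium weights can be written in closed form. The following sketch uses the standard fast-learning ART1 update (from the Fausett reference rather than the text above), where x is the binary interface activation of the winning unit J and L > 1 is a fixed parameter, commonly L = 2; names are illustrative:

```cpp
#include <vector>

// Fast-learning ART1 equilibrium weight update for winning unit J:
//   b[i][J] = L * x_i / (L - 1 + ||x||)     (bottom-up)
//   t[J][i] = x_i                           (top-down)
// where x is the interface activation (input AND old top-down vector)
// and ||x|| counts its 1s. Names and L's default are assumptions.
void fastLearnUpdate(std::vector<std::vector<double>>& b,
                     std::vector<std::vector<double>>& t,
                     const std::vector<int>& x, int J, double L = 2.0) {
    double normX = 0.0;
    for (int xi : x) normX += xi;            // ||x||
    for (std::size_t i = 0; i < x.size(); ++i) {
        b[i][J] = L * x[i] / (L - 1.0 + normX);
        t[J][i] = x[i];
    }
}
```

Because x is binary, repeated presentations of the same patterns reproduce the same x and hence the same weights, which is why the weights stabilize in fast-learning ART1 without iterating the underlying differential equations.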
Fausett, Laurene. (1994). Fundamentals of Neural Networks: Architectures, Algorithms, and Applications. Englewood Cliffs, NJ: Prentice Hall. pp. 169-175.
Carpenter, G. A., & Grossberg, S. (1987). "A Massively Parallel Architecture for a Self-Organizing Neural Pattern Recognition Machine." Computer Vision, Graphics, and Image Processing, 37, pp. 54-115.
Rao, Valluru B. (1995). C++ Neural Networks and Fuzzy Logic. MTBooks, IDG Books Worldwide, Inc. ISBN 1558515526.