Main Principles of the General Theory of Neural Network with Internal Feedback

D. Pescianschi


In this paper, a new model of the formal neuron, analog mechanisms of neuron training, and a new model of biological feedback are proposed. The proposal is supported by neurobiological data published by other authors and by our in silico experiments. Key qualitative and quantitative differences between the proposed neural network model and the currently accepted concept are discussed. The new concept reflects the mechanisms of memory formation. The model bridges the gap between the micro-level of molecular processes in a neuron and the macro-level of information processing and storage in the brain. This opens an opportunity to model the processes occurring in the brain and to develop artificial neural networks that are trained in real time and are not limited in their structure or in the complexity of their connections. The proposed model is easily implemented, both as a virtual emulation and as digital or analog artificial neural networks.

1. Introduction

Despite a wide variety of neural network models and their training algorithms developed to date [1], the cornerstone of all models (with small variations) is the concept of a formal neuron (FN) introduced by McCulloch and Pitts (Fig. 1) [2].

Fig. 1 Formal neuron

x1, x2, … xn – Input signals of a neuron
ω1, ω2, … ωn – Weights of a neuron for input signals
S – Sum of weighted signals
Y – Output signal
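The formal neuron of Fig. 1 can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper: the function names `step` and `formal_neuron`, the default threshold of 0.0 and the example values are assumptions chosen for the sketch.

```python
from typing import Callable, Sequence


def step(s: float, threshold: float = 0.0) -> float:
    """Threshold activation: 1.0 when the sum reaches the threshold, else 0.0."""
    return 1.0 if s >= threshold else 0.0


def formal_neuron(
    x: Sequence[float],
    w: Sequence[float],
    activation: Callable[[float], float] = step,
) -> float:
    """Return Y = f(S), where S is the sum of the weighted input signals."""
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    s = sum(wi * xi for wi, xi in zip(w, x))  # S: sum of weighted signals
    return activation(s)                      # Y: output signal


# Hypothetical inputs and weights: S = 0.5 + 0.0 + 0.2 = 0.7, so Y = 1.0.
y = formal_neuron([1.0, 0.0, 1.0], [0.5, -0.3, 0.2])
```

Passing a different `activation` (for example `lambda s: s`) turns the same unit into a linear neuron.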

The FN is the lowest information level considered in the modern theory of artificial neural networks (ANNs). This theory does not take into account processes taking place at lower levels, such as the molecular one.

It is assumed that signals in the FN propagate only in the forward direction, namely from the dendrites through the soma and axon toward the downstream synapses. In a number of models, for instance the Hopfield network and the Hamming net (reviewed in [1]), such a neuron can also have a recurrent connection. In this case an axon (output) forms a synapse with a dendrite of the same neuron (input) or with dendrites of upstream neurons. However, the signal direction in the network remains the same. Moreover, such connections are the exception rather than the rule in the structure of the brain.

Modeling the processes taking place in a biological brain was the initial purpose of ANN theory. However, the FN model is greatly simplified compared to the biological neuron, and this excessive simplification does not allow realistic models to be built.

One of the least plausible features of known neural network models and training algorithms is their “digital” computing nature. The question arises: what in the brain could perform all these algorithmic calculations for training a network? In almost all models known to date, the neural network is a passive object of training. Neurons themselves do nothing for the training; it is carried out by a program external to the neural network. The deficiencies of the current approach call for alternative solutions.

2. Statement of the new concept

2.1. Locality of processes

According to Hebb's theory [3], in the absence of a global trainer, only local interaction between neurons is assumed. In this case training is an exclusively local phenomenon involving only two neurons and the synapse connecting them. A global feedback system for training the synapse is not required. In other words, there can be no mechanisms of passive training of neurons (i.e., there are no programs external to a neuron that could apply information about other neurons). Only the neuron itself can train itself, using for this purpose the information it receives through its connections with other neurons (synapses). All processes occurring in a neuron are localized. Apart from the information coming through its physical connections, the neuron has no other source of information about the rest of the network. Based on the acquired information, the neuron regulates its own activity and is trained in this way. Although there are other mechanisms of mass regulation of neurons, for example hormone-based ones, these mechanisms do not provide training as such; they only regulate the mass activity of groups of neurons (so-called global switches).

The neuron is a completely autonomous, self-sufficient, minimal adaptive unit of a network. Training of the whole network is the sum of the training of its individual neurons, each trained separately. This provides full parallelism of the training process. Thus, both main neural network processes (training and recognition) are parallel by nature.

2.2. Internal feedback on the basis of retrograde signal

According to one of the basic principles of cybernetics, any adaptive system, and a neuron is one, must have feedback. If we exclude an external recurrent connection, since it is absent in the majority of biological neurons, we must assume that feedback is implemented in the neuron in a different way, namely as internal feedback.

The forward connection in a neuron is implemented by the action potential (AP), a charge that travels from the soma to the axon terminal and further, through synapses, to the dendrites of downstream neurons. At the same time, a number of facts suggest that an internal feedback of a different nature is also present in a neuron. For instance, presynaptic long-term potentiation (LTP) has been shown to occur in response to activation of postsynaptic receptors, which implies the existence of a retrograde messenger in the synapse [4, Ch.16]. The interconnected microtubules of the cytoskeleton could play the role of the internal feedback, delivering a retrograde signal from the presynaptic axon terminal through the axon and soma to the dendrites of the neuron. Within a neuron, including the axon and dendrites, the microtubules form an interconnected network covering the whole cell. Microtubules, which may reach several millimeters in length, are interconnected by protein MAP bridges. Via dendritic spines, microtubules are connected even to synapses by protein actin filaments (Fig. 2).

Each microtubule is a hollow cylinder consisting of rows of tubulin dimers. Each tubulin molecule can exist in two conformations. It has been shown that microtubules can work as cellular automata, transmitting complex signals as waves of different electric polarization states of tubulin molecules. In other words, each microtubule is capable of transmitting messages at high speed [5]. In addition, it has been suggested that microtubules are also responsible for maintaining and changing synaptic strength.

Fig. 2 Network of microtubules in neuron

2.3. Pulse-frequency modulation of output signal of a neuron

As has been shown [4, Ch.1], the AP has a certain threshold above which its amplitude and duration do not depend on stimulation parameters. A single AP is therefore essentially incapable of describing fine intermediate states over a wide range. It has been shown [4, Ch.1] that the intensity of neuron excitation is coded not by a single impulse (AP) but by the frequency of impulses. More effective stimuli cause greater depolarization and, as a result, a higher frequency of AP generation in a neuron. Pulse-frequency modulation is also confirmed by the fact that the AP frequency affects the secretion of mediators into the synaptic cleft [4, Ch.15].

Thus, if we assume that the output signal of a neuron is based not on a single AP, which is binary by nature, but on the frequency of AP generation, we obtain an activation function quite different from a threshold one. The firing frequency of a neuron can change smoothly from several hertz to kilohertz. Thus, the activation function of the neuron is smooth, continuous and increasing. Notably, the activation function is not necessarily a sigmoid; it can even be a linear function.
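As a rough illustration of frequency coding, the sketch below maps an excitation level to an AP frequency with a linear function that saturates at an upper bound. The name `firing_rate_hz`, the gain of 100 Hz per unit of excitation and the 1 kHz ceiling are hypothetical values chosen only for the example; the text above fixes neither the shape nor the scale of the function.

```python
def firing_rate_hz(
    excitation: float,
    gain_hz: float = 100.0,      # assumed slope: Hz per unit of excitation
    max_rate_hz: float = 1000.0  # assumed ceiling, on the order of a kilohertz
) -> float:
    """Continuous, non-decreasing map from excitation level to AP frequency."""
    rate = gain_hz * excitation
    # Clamp to the physically meaningful range [0, max_rate_hz].
    return min(max(rate, 0.0), max_rate_hz)


# A stronger stimulus never yields a lower frequency.
rates = [firing_rate_hz(e) for e in (0.0, 0.5, 2.0, 50.0)]
```

Unlike a threshold unit, intermediate excitation levels produce intermediate output values.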

2.4. New model of the neuron in equilibrium

2.4.1. Soma, adder, activation function

The soma of a neuron, as was assumed earlier, performs analog summation of the signals arriving through synapses, i.e. summation of all dendritic APs. Based on the sum of signals, the neuron creates its own excitation level, which is expressed by a certain frequency of single APs. Thus, the monotonically increasing activation function of the neuron is frequency-encoded. In principle, it can be a sigmoid function, but the type of this function is not important; for instance, it can be a straight line. The natural resistance of dendrites can be described as a vector of non-negative constants:

C = {c1, c2, … cn},

where n is a number of dendrites.

The signals passing through dendrites can be described as a vector: Α = {α1, α2, … αn}.

Then their weighted sum is:

$$S = \sum_{i=1}^{n} c_i \alpha_i \qquad (1)$$

The activation function of the neuron is a function of this sum.
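The summation in the soma can be sketched as follows, assuming the weighted sum has the form S = c1·α1 + c2·α2 + … + cn·αn, with C the vector of dendrite constants and Α the vector of dendritic signals. The function names and the numeric values are illustrative assumptions, and the linear activation is just one of the admissible choices mentioned in the text.

```python
from typing import Sequence


def soma_sum(c: Sequence[float], alpha: Sequence[float]) -> float:
    """Weighted sum S of the dendritic signals alpha with dendrite constants c."""
    if len(c) != len(alpha):
        raise ValueError("c and alpha must have the same length")
    if any(ci < 0 for ci in c):
        raise ValueError("dendrite constants must be non-negative")
    return sum(ci * ai for ci, ai in zip(c, alpha))


def linear_activation(s: float, slope: float = 1.0) -> float:
    """A straight-line activation function of the sum (slope is an assumption)."""
    return slope * s


# Hypothetical values: S = 1.0*0.2 + 0.5*(-0.4) + 2.0*0.1 = 0.2
s = soma_sum([1.0, 0.5, 2.0], [0.2, -0.4, 0.1])
y = linear_activation(s, slope=2.0)
```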

2.4.2. Synaptic mechanism

In the modern theory of ANNs, the model of the synapse reflects the so-called Dale's Principle, namely “one neuron, one transmitter” [6]. However, it is now known that more than one type of mediator can be released in one chemical synapse. Moreover, their set is constant for a particular cell. Several tens of different mediators are known, and it is very probable that a larger number of mediator substances remain unexplored [4, Ch.15].

Various mediators act synergistically on the same synapse. For the release of low-molecular-weight mediators single impulses are sufficient, whereas the release of neuropeptide cotransmitters often requires pulse trains. The release of each mediator on the i-th synapse can be represented by a release function vi(xi) (Fig. 3).

Fig. 3 Release of mediators (V) depending on a signal (X)

x1, x2, x3 – values of signal at which release of a new mediator begins.

Thus, a sequence of intervals is formed. On each interval, one more type of mediator is added to the total sum of mediator signals. In other words, each interval corresponds to one particular newly added mediator or to a subgroup of newly added mediators.
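The interval structure can be illustrated with a small sketch that reports which mediators are being released at a given signal level. The onset values and the name `active_mediators` are hypothetical; the only property taken from the text is that each further mediator joins once the signal reaches its onset value.

```python
from bisect import bisect_right
from typing import List, Sequence


def active_mediators(x: float, onsets: Sequence[float]) -> List[int]:
    """Indices of mediators whose release has begun at signal level x.

    `onsets` must be sorted in ascending order; onsets[j] is the signal
    value at which release of mediator j begins.
    """
    # bisect_right counts the onsets that are less than or equal to x.
    return list(range(bisect_right(onsets, x)))


onsets = [0.0, 0.3, 0.6]                 # assumed onset values x1, x2, x3
weak = active_mediators(0.1, onsets)     # [0]
medium = active_mediators(0.4, onsets)   # [0, 1]
strong = active_mediators(0.9, onsets)   # [0, 1, 2]
```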

Many mechanisms by which mediators regulate signal transfer have been identified. For example, for each type of mediator there are specific receptors in the postsynaptic cell. Moreover, the same mediator can have different types of postsynaptic receptors, both excitatory (positive signal transmission) and inhibitory (negative signal transmission). The density of such chemoreceptors in the synaptic zone determines the level of the signal transmitted by the mediator, whether positive or negative [4, Ch.16].

To simplify the model, let us combine all stages of signal transmission regulation for the j-th mediator on the i-th synapse into one signal transfer function µji(vi), where vi = vi(xi). Hence, the function of signal transfer by a mediator can be represented as a function of the signal X: µji(xi) (Fig. 4a).

Fig. 4 Mediator functions of signal transfer (a) and synaptic function of signal transfer (b)

X – input signal of synapse
x1, x2, x3 – values of signal at which release of a new mediator begins.
µ – level of signal transfer function by mediator
α – level of synaptic signal transmission function

Just as APs from all dendrites are summed up in soma, the contribution of each mediator to forming of dendritic AP is summed up in a dendrite. Therefore, the total signal transmitted through synapse can be described as follows:

$$\alpha_i(x_i) = \sum_{j=1}^{m} \mu_{ji}(x_i) \qquad (2)$$

where m is the number of mediators perceived by receptors on the synapse of the i-th dendrite; xi is the output signal of the upstream neuron with which the i-th dendrite has formed a synapse; and µji(xi) is the transfer function of the j-th mediator for a signal on the i-th synapse (Fig. 4b).

Thus, the transmitted signal αi(xi) can be described by a vector: µi = {µ1i(xi), µ2i(xi), … µmi(xi)}

Substituting (2) into (1), the sum of signals in a neuron is as follows:

$$S = \sum_{i=1}^{n} c_i \sum_{j=1}^{m} \mu_{ji}(x_i) \qquad (3)$$
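A minimal numerical sketch of this double sum is given below. It assumes that the signal through a synapse is the sum of its mediator transfer functions and that each transfer function is a simple step: mediator j contributes a constant weight once the input signal reaches its onset value and nothing below it. The step shape, the shared onset values and all names are assumptions made for the example; the curves of Fig. 4 need not be steps.

```python
from typing import Sequence


def synapse_signal(
    x: float, onsets: Sequence[float], weights: Sequence[float]
) -> float:
    """Signal through one synapse: sum of the active mediator contributions."""
    # Assumed step-shaped transfer function: weight if x >= onset, else 0.
    return sum(w for onset, w in zip(onsets, weights) if x >= onset)


def neuron_sum(
    xs: Sequence[float],
    c: Sequence[float],
    onsets: Sequence[float],
    weights: Sequence[Sequence[float]],
) -> float:
    """Sum over synapses i of c[i] times the signal through synapse i."""
    return sum(
        ci * synapse_signal(xi, onsets, wi)
        for xi, ci, wi in zip(xs, c, weights)
    )


onsets = [0.0, 0.5]                   # assumed onsets of the two mediators
weights = [[1.0, 2.0], [0.5, -1.0]]   # weights[i][j]: synapse i, mediator j
c = [1.0, 2.0]                        # dendrite constants
# Synapse 0 (x = 0.7): both mediators active, signal 3.0.
# Synapse 1 (x = 0.2): only the first mediator active, signal 0.5.
S = neuron_sum([0.7, 0.2], c, onsets, weights)  # 1.0*3.0 + 2.0*0.5 = 4.0
```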

2.4.3. Training as an equilibrium process

As shown by Hebb [3], stimulation of a presynaptic fiber without supporting stimuli from the postsynaptic side leads to synaptic depression, i.e. a decrease of postsynaptic potentials in response to presynaptic stimuli. Conversely, presynaptic stimuli supported by postsynaptic ones lead to synaptic potentiation, i.e. an increase of potentials.

Taking into consideration the retrograde signal (under stimulation of the postsynaptic side), it may be assumed that the direct AP leads to synaptic depression and the retrograde signal results in synaptic potentiation. Notably, the extent of the changes in both cases depends directly on the stimulus size.

It has been shown that the AP of a cell arises at a point of the axon just beyond the soma and propagates not only along the axon in the forward direction but also through the soma to the dendrites in the backward direction. The retrograde calcium dendritic AP, in turn, can cause local changes in calcium concentration and influence synaptic transfer [4, Ch.8]. This mechanism may underlie synaptic depression under the direct AP.

One of the main ideas of the current work is the following: the mechanism of opposing neuron signals brings synaptic weights to an equilibrium state by reducing and increasing them. This is the basis of training. As mentioned above, only direct signals arriving from upstream synapses and retrograde signals coming from downstream synapses should be used for training a neuron. It can be said that the retrograde signal carries information about the expected level of neuron excitation for the given input signals. The nature of the retrograde signals will be the subject of a separate publication.

For a certain synapse the following situations are possible:

  1. The input image leads to a weak output signal of the neuron because of its previous training. However, a strong output signal was expected, which is expressed in a strong retrograde signal. As stated above, a weak direct AP results in weak depression of synapses. If the retrograde signal exceeds the direct one, the level of synaptic potentiation exceeds the level of depression. Next time, for the same input image, the excitation level of the neuron will be higher.
  2. The input image results in a strong output signal of the neuron. However, a weak output signal was expected, which is expressed in a weak retrograde signal. Thus, the strong direct AP leads to strong depression of synapses, and the weak retrograde signal results in their weak potentiation. The level of synaptic depression in this case exceeds the level of potentiation. Next time, for the same input image, the excitation level of the neuron will be lower.
  3. A strong direct AP and a strong retrograde signal lead to strong depression on the one hand and strong potentiation on the other, which neutralize each other and leave excitability at its former (high) level.
  4. A weak direct AP and a weak retrograde signal are similar to case 3, the only difference being that weak depression and weak potentiation neutralize each other, leaving the neuron at its former level of weak excitability.

In other words, training a neuron means establishing a balance between the depression and potentiation of its synapses under direct and retrograde signals.
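The four situations can be reproduced with a very small sketch, assuming that depression and potentiation are each simply proportional to the size of the signal that causes them. The proportionality constant `rate` and the signal values 0.2 (weak) and 0.8 (strong) are hypothetical; the text states only that the extent of each change depends directly on the stimulus size.

```python
def weight_change(direct: float, retrograde: float, rate: float = 0.1) -> float:
    """Net change of a synaptic weight under opposing signals.

    Assumes depression is proportional to the direct AP and potentiation
    is proportional to the retrograde signal, with the same constant.
    """
    depression = rate * direct
    potentiation = rate * retrograde
    return potentiation - depression


WEAK, STRONG = 0.2, 0.8  # assumed signal levels

case1 = weight_change(direct=WEAK, retrograde=STRONG)    # > 0: weights grow
case2 = weight_change(direct=STRONG, retrograde=WEAK)    # < 0: weights shrink
case3 = weight_change(direct=STRONG, retrograde=STRONG)  # 0: equilibrium
case4 = weight_change(direct=WEAK, retrograde=WEAK)      # 0: equilibrium
```

Under these assumptions the weights stop changing exactly when the direct signal matches the retrograde (expected) one.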

The proposed equilibrium-based model of memory formation correlates well with Hebb's rules. However, Hebb's rules do not cover the situation in which the retrograde signal is stronger than the direct one, which in the proposed model leads to strengthening of the connection.

The described mechanism is greatly simplified. It seems very probable that this equilibrium process is distributed unevenly among the synapses of a neuron. For example, it has been shown that the backpropagation of the AP from the soma depends on the input resistance of the different branches (dendrites). The input resistance, in turn, depends on the degree of activity of excitatory and inhibitory synapses. Thus, the backpropagation of excitation to dendrites depends on synaptic activity [4, Ch.8].

2.5. The localized equilibrium-based process

The equilibrium-based mechanism of training described above operates not on the whole synapse but only on the appropriate subset of its synaptic mediators.

As stated above, different groups of mediators correspond to different levels of the presynaptic signal. This means that, for a given image, the equilibrium-based mechanism of error compensation operates on a particular group of mediators. For another image the group of mediators will be different, i.e. the sets of mediators for different images overlap only partially.

2.6. Training of a neuron

Training of a neuron can be described by the following simple procedure:

  1. The input image is fed to the synapses of the neuron.
  2. Input signals on the synapses activate certain groups of mediators.
  3. The synaptic weights of the selected mediators are summed in the soma, creating the direct AP.
  4. The size of the direct AP determines the depression level of the selected mediators (decrease of weights).
  5. A retrograde (expected) signal is fed to the output of the neuron, causing potentiation of the mediators selected by the input image (increase of weights). The level of this potentiation depends on the size of the retrograde signal. As a result of the direct AP and the retrograde signal, the neuron comes to an equilibrium state and is thus trained on the current image.
  6. The next image is fed to the inputs and outputs of the neuron, and steps 1 to 5 are repeated for the new image.
  7. After all images have been processed (one epoch), the process can be repeated for these images until the required training accuracy is achieved or until the convergence speed falls to its minimum.
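The procedure above can be sketched for a single neuron as follows. This is an illustration under strong simplifying assumptions, not the author's implementation: input signals lie in [0, 1], each synapse has `m` mediators, the signal selects exactly one mediator per synapse (the interval it falls into), the activation is linear, and depression and potentiation are proportional to the direct and retrograde signals with a hypothetical constant `rate`. Under these assumptions the update coincides with a normalized least-mean-squares step on the selected weights.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], float]  # (input image, expected output)


def select_mediators(image: Sequence[float], m: int) -> List[int]:
    """Step 2: one mediator per synapse, chosen by the signal's interval."""
    return [min(int(x * m), m - 1) for x in image]


def recall(weights: List[List[float]], image: Sequence[float]) -> float:
    """Step 3: direct signal, the sum of the selected mediator weights."""
    chosen = select_mediators(image, len(weights[0]))
    return sum(weights[i][j] for i, j in enumerate(chosen))


def train(samples: Sequence[Sample], m: int = 4, rate: float = 0.5,
          epochs: int = 1000, tol: float = 1e-9) -> List[List[float]]:
    n = len(samples[0][0])                        # number of synapses
    weights = [[0.0] * m for _ in range(n)]
    for _ in range(epochs):                       # step 7: repeat epochs
        worst = 0.0
        for image, expected in samples:           # steps 1 and 6
            chosen = select_mediators(image, m)   # step 2
            direct = recall(weights, image)       # step 3
            depression = rate * direct / n        # step 4
            potentiation = rate * expected / n    # step 5 (retrograde signal)
            for i, j in enumerate(chosen):
                weights[i][j] += potentiation - depression
            worst = max(worst, abs(expected - direct))
        if worst < tol:                           # equilibrium reached
            break
    return weights


# Hypothetical training set: three images on two synapses.
samples = [([0.1, 0.9], 1.0), ([0.1, 0.3], 0.0), ([0.8, 0.9], 0.5)]
trained = train(samples)
outputs = [recall(trained, image) for image, _ in samples]
```

Because the first two images share a mediator on the first synapse, learning one image slightly disturbs the other, and repeated epochs redistribute the weights until all three are balanced.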

When a new image is learned, the weights forming the previous memory are only partially modified. With a sufficiently large number of synapses and possible signal levels (mediators), the influence of such distortions is extremely small. Moreover, under repeated training the weights are gradually redistributed toward the state with the maximum level of overall balance. Such a system finds the state of absolute minimum by itself.

3. Conclusions

New models of the formal neuron, of neural network training and recognition, and of memory formation are proposed; they are in good agreement with experimental data. The described neural network model is much more similar to the biological neural network than the ANNs known to date.

The proposed new model of formal neuron and network includes:

  • A synapse with a set of mediators (weights) and a nonlinear function of signal transmission.
  • Internal feedback implementing retrograde signal transmission from the axon terminal to the dendrites and to synapses with upstream neurons.

The model of training of this neuron and network includes:

  • Selection of the mediators in a synapse that participate in training and recognition. This selection depends on the input signal.
  • Correction of synaptic weights by depression and potentiation under direct and retrograde signals. The decrease in synaptic weights depends directly on the forward AP of the neuron, and the increase depends directly on the retrograde signal.

Experiments [7] with virtual models of the proposed network have shown that the network has the following characteristics:

  • High training speed: training time depends linearly on the size of the network and the volume of data, unlike other ANN models with their exponential dependence. The proposed network requires about a dozen times fewer training epochs than classical ANNs.
  • Scalability that allows building networks of any size and complexity.
  • Simplicity of implementation in analog or digital form. It does not require an external trainer in the form of a computer or chip with long and complex digital calculations [8].

The proposed concept is an important step towards understanding the mechanisms of memory formation in the brain. The described ideas are supported by virtual experiments, which indicate a breakthrough in information processing and storage technologies.

4. Acknowledgements

I appreciate fruitful discussions and support by A. Boudichevskaia, V. Proseanic, B. Zlotin, V. Pesceanskaia, S. Visnepolschi, and A. Zusman. I am also very grateful for the support by the Progress Inc. working team and investors.

5. References

[1] S.O. Haykin “Neural Networks and Learning Machines” (3rd Edition). Pearson Education, Inc., 2009

[2] W.S. McCulloch, W.H. Pitts “A logical calculus of the ideas immanent in nervous activity”. Bulletin of Mathematical Biophysics, Vol. 5, p. 115-133, 1943

[3] D.O. Hebb “The organization of behavior”. Wiley, New York, 1949

[4] J.G. Nicholls, A.R. Martin, P.A. Fuchs, D.A. Brown, M.E. Diamond, D.A. Weisblat “From Neuron to Brain” (5th Edition). Sinauer Associates, 2012

[5] T.J.A. Craddock, D. Friesen, J. Mane, S. Hameroff, J.A. Tuszynski “The feasibility of coherent energy transfer in microtubules”. Journal of the Royal Society Interface, 11(100):20140677, 2014

[6] H.H. Dale “Pharmacology and nerve endings”. Proc Roy Soc Med 28:319-332, 1935

[7] D. Pescianschi, A. Boudichevskaia, B. Zlotin, V. Proseanic “Analog and digital modeling of a scalable neural network”, presented at the current conference.

[8] D. Pescianschi “Neural network and method of neural network training” PCT patent application US1519236, 2015