WO2014060001A1

WO2014060001A1 - Multitransmitter model of the neural network with an internal feedback

Info

Publication number: WO2014060001A1
Application number: PCT/EP2012/003756
Authority: WO
Inventors: Dmitri PESCIANSCHI
Original assignee: FRENKEL, Christina; Frenkel, Daniel; HAEMMERLING, Valentin
Priority date: 2012-09-13
Filing date: 2012-09-13
Publication date: 2014-04-24

Abstract

The invention relates to the field of information processing, including of parallel and neural network calculations, and may be used in the development and creation of programmable and physically realizable fast neural networks. The efficiency of the calculations is ensured by the high training rate of the given neural network, as well as the parallelism of the calculation streams not only in the recognition of images, but in the neural network training. Furthermore, the described neural network model can easily be realized in the form of an analogue arrangement, which significantly improves the efficiency of its apparative realization. The said neural network training is based on a novel structure and a novel principle of functioning of the neural synapse forming the neural network structure and performing the function of transmission and transformation of a signal. The training takes place under counter-propagation of direct and backward signals, where the direct signal performs the recognition of the image, and the counter (reverse) signal ensures the neural network training. The difference in the direct and reverse signals at the output of the neuron is the basis for error compensation and, consequently, leads to the training of the neuron and the whole neural network. The information carrier (memory) in the given neural network is dendrites (neuron inputs), and also neuron synapses with dendrites of other neurons. The structure of the connections between the neurons is in principle unimportant. The contribution of dendrites is defined not only by weight coefficient, and by weight ranges. Training is carried out by modification of parameters of synapses and weight ranges of dendrites of a neuron. Thus, plasticity of a neural network is ensured not only by identical weights of dendrites, but through ranges of the weights of dendrites and adaptation of the functions of signal transformation (transfer functions) on synapses. Synapses ensure not simple signal transmission between neurons, but also transform these signals. Notably, transformation of signals on synapses has the complex non-linear functional nature depending on value of a signal coming at the synapse.

Description

Description:

MULTITRANSMITTER MODEL OF THE NEURAL NETWORK WITH AN INTERNAL FEEDBACK FIELD OF INVENTION

The present invention relates generally to artificial neural networks and their operation, and relates particularly, though not exclusively, to an improved neural network training method.

BACKGROUND OF THE INVENTION

A neural network is a network consisting of neurons connected to other neurons via synapses. A neuron is the basic element of a neural network. A neuron contains dendrites that receive input signals. A neuron changes the input signals by means of weights on the dendrites, and transmits the altered signals to the central part, termed the soma (Fig. 1 a). The soma performs the neuron summation and activation.

The functional nature of the activation is defined by the activation function, which may be different. The level of excitation of a neuron depends on the sum of the input signals received via the dendrites. The output signal (the result of neuron activation) is transmitted through the axon to the synapses.

The synapse is the part of the neuron (node) that forms a direct connection between a neuron and a dendrite of another neuron, in order to transmit an output signal to this dendrite (Fig. 1 a). At the present time various neural network models and training algorithms exist.

The majority of them are modifications of several basic models, for example the J.J. Hopfield model [1][2][3][4][5][6], T. Kohonen "Self-organizing map" (SOM) [19][20][21], S. Grossberg [7][8][9], "Back- propagation neural networks" [10][1 1 ][12][13][14][15], Boltzmann machines [22], "Radial basis function nets (RBF)" [23][24], "Adaptive resonant theory nets (ART)" [16][17][18].

Basically all these models use the same functions for the neuron model. Neurons transmit signals via synapses to dendrites of other neurons. One neuron transmits via the axon signals of the same strength to all dendrites connected by synapses to the given axon and the strength of the transmitted signal is always equal to the neuron activation level.

Furthermore, in the previous models the signal can travel only in one direction, namely from dendrites through the soma to the axon and thence through the synapses to the dendrites. In the proposed model a signal can also travel in the opposite direction. The backward signal in our new model forms the basis for neuron training. Existing models of neural networks have a number of disadvantages, which complicates their use for solving many real problems. For example:

1. The neural network training time depends exponentially on the size of the network (number of neurons and connections between them) and the volumes of data for the training. Furthermore, the neural network training time often cannot be determined beforehand. The process of training other neural networks requires a multiplicity of iterations for memorising the whole set of images. It is impossible to determine the number of necessary cycles (epochs).

2. Other neural network models analyse data (recognize images) in parallel at all neurons, but training takes place sequentially since the training algorithms are iterative and changes in the weights of the dendrites depend on other neurons and previous iterations.

3. Mostly the memory of neural networks is unstable. The network training in a new image destroys or changes the results of the previous training. In order to train the network as a whole to •recognize a new image, the network must be retrained for the whole package, after including also the new image. This makes it impossible to use neural networks in real time operations. The training in other neural network models is carried out by optimizing the weights by a step-by-step selection of values with the purpose to detect the minimum for the given data sample. The addition of one new image removes the whole system from the minimum condition. In other words, in order to memorize one more new image a complete retraining of the network is required, otherwise the previous information will be destroyed. Specific attempts to solve this problem were made in "The adaptive resonant theory nets (ART)" [16][17][18]. This approach is only a partial solution of the problem, since ART nets have a number of serious disadvantages:

• ART nets lose one of the basic advantages of neural networks, namely great stability. In this approach the loss of one module destroys the whole memory.

• ART nets have a low stability to noise. This leads to a degradation of the memory of ART nets.

• ART nets have a fairly low information capacity (memory).

• ART nets have a fairly low generalization ability.

• ART nets are difficult to implement in the form of a microchip.

4. Neural network training can lead to "paralysis", i.e. the impossibility to continue its training. It is impossible to determine whether a neural network could be trained for a given set of data beforehand. In other models paralysis during training can occur when a local minimum is reached or there is a displacement from the minimum (divergence).

5. A random sequence of samples when training in other models of neural networks sometimes leads to a divergence of the network, in other words training in these networks is critical as regards the sequence of the data presented for training - the network can be trained for one sequence of presented data, but cannot be trained for another sequence. The specific sequence cannot be predicted beforehand.

6. The processes of recognition and training in other models of neural networks cannot proceed in parallel. This is connected with the necessity of iterative training in the whole package of images, whereas recognition takes place in one step and only for one image. These two stages cannot occur in parallel, since in other models of neural networks the training process excludes the possibility of a simultaneous correct recognition of images. This is one further reason why it is often impossible to employ neural networks in real-time operations.

7. Almost all types of neural networks cannot contain more than two functionally significant layers.

The exception is the "Back-propagation neural networks" [10][1 1][12][13][14][15]. However, "back- propagating neural networks" are not free of practically all the disadvantages mentioned above.

8. A distinguishing feature of all existing neural network models is the impossibility of the selective removal (omission) of some specific information. The very method of storing the information excludes such a possibility. The reason for this is that the same synapses and weights of the dendrites are repeatedly used for different images. The re-recording of weights for one image inevitably leads to the destruction of other data that were not associated with the removed information. If a certain part of the recordings in the overall selection of data appears incorrect or unnecessary, then to remove it from the whole memory of the neural network requires complete retraining of the network in the whole selection of data, with the exception of the removed data. Accordingly, this operation cannot be described as the removal of separate recordings. In fact, it is a simple retraining of the network.

The proposed neural network model avoids the aforementioned disadvantages.

Definitions

The neural network model denotes here the principle for transmitting and transforming signals similarly to the way in which this occurs in the brain of living organisms. This model simulates the actual biological neural network of the brain. The said model can be realized by various methods. For example, electrical or optical signals can be transmitted and processed.

A formal neuron (Fig. 1 a) is the minimal computing element of a neural network, which performs the selection and processing of input information, its transformation, and the formation of an output result. A formal neuron consists of:

• Dendrites are channels for obtaining input signals of a neuron. Input signals are fed as an input vector X (x^ x₂ x_n). Each dendrite has a weight (ω^ ω₂ ω_η). The weight of a dendrite denotes here a specific numerical characteristic of the dendrite, which influences the strength of the signal transmitted by the dendrite, thereby transforming this signal. Each dendrite has its own specific weight.

The soma is the processor of the neuron. The processing of input data of the neuron is performed in the soma. The soma, being the processor of a formal neuron, carries out the following functions:

Summer - performs the summation of the input signals obtained from the dendrites. The summer can be non-linear, i.e. performs not a simple summation of the obtained signals, but also their preliminary processing. For example, such a non-linear summer can carry out the summation with a preliminary logarithmic conversion of the data.

Activation Function - this is the element that uses the result of the summer (NET) as input parameter, transforming this parameter by a defined method and transmitting the result of this transformation as an output signal of the neuron. The value of this output signal is the neuron activation level (OUT). The activation function can be expressed in an arbitrary form of any possible function. For example, it can be a linear ( OUT = t - NET ), threshold ( or any other function.

Furthermore, this function does not necessarily even have to be increasing. In any case, the activation function should be defined over the whole range of possible values of the argument (sum NET).

Axon - the channel along which the result of the neuron activation is transmitted to the synapses.

Synapses - the point of contact of a neuron (at its output) with a dendrite of another neuron (at its input). The results of the neuron activation are transmitted via the synapses to the inputs of other neurons. In fact, the synapses are responsible for the formation of a network from individually taken neurons (Fig. 1 b). The difference of the given model of a neural network, in comparison with other models is the complex constitution of a synapse (Fig. 2). In other models the signal as unchanged is transmitted from a neuron to a dendrite of other neuron via a synapse. In the given model the synapse is one of the objectives realizing transformation of the information. Besides the synapse in the given model is one of the main objectives of training of the neurons and the whole neural network. The complex mechanism of a synapse is ensured by following factors.

o Mediators are the objects forming a multilevel structure of a synapse, ensuring training the neuron during error generation at a backward signal. Being trained, mediators accumulate the statistical information which is a training basis. The signal of different intensity coming on a synapse defines a group of the active mediators for the given synapse. These active mediators define a non-uniform signal transmission to the dendrite bridged to the given synapse. Each mediator in a synapse possesses its weight. These weights are various for all mediators and for all synapses. Thus, a set of all mediators in a synapse with their weights define a synapse transfer function.

o A synapse Transfer function is a function, defining association of a signal on an output of a synapse from a signal on a synapse input. The size of the transmitted signal depends on the weights of the active mediators corresponding to an excitation level of a neuron, and also from properties of a transfer function. The transfer function can be realized by different methods, but, anyway, this function depends on the weights of mediators (Fig. 2). For example, this function can be a threshold where a level of thresholds is set according to the weights of mediators.

o A direct signal is a signal that is propagated along a neuron from the dendrites (input) through the soma and axon to the synapses (output), and from the synapses to the dendrites of other neurons. A direct signal is responsible for the recognition of input images.

o Backward signal - this is a signal that is propagated along a neuron from the synapses (output), through the axon and soma to the dendrites (input), i.e. in the opposite direction to the direct signal. The backward signal is responsible for training the neuron and neural network as a whole. A backward signal is not a propagation of an error in algorithms with "Back-propagation neural networks". In algorithms with "Back- propagation neural networks" the reduction of an error in the neural network occurs as a result of a gradual iterative propagation of changes in the reverse direction. A backward signal in the present model of a neural network is literally an information signal that is propagated in the opposite direction to the direct signal. This signal is not propagated gradually - as a result of numerous training iterations, as for example in algorithms with "Back-propagation neural networks", but in one step (one iteration), i.e. at the speed of the direct signal.

Terms "mediator", "the active mediator", "a synapse transfer function" are new in the theory of neural networks. Necessity of their application is defined by fundamental novelty of the given model of a neural network and its training method.

SUMMARY OF THE INVENTION

As is known, the majority of existing neural network models are modifications of several basic models such as, for example, J.J. Hopfield model [1][2][3][4][5][6], T. Kohonen "Self-organizing map" (SOM) [19][20][21], S. Grossberg [7][8][9], "Back-propagation neural networks" [10][1 1 ][12][13][14][15], Boltzmann machines [22], "Radial basis function nets (RBF)" [23][24], "Adaptive resonant theory nets (ART)" [16][17][18].

Basically, all these models use the same functions for the model of the neuron. Also, although these models use a different neural network structure and different training algorithms, they all have one or other disadvantages, which do not enable a neural network to be used actively to solve a multiplicity of practical problems.

For example, it is not possible to construct fully an analogue neural chip containing a significant number of neurons and connections (for example hundreds of thousands of neurons) that could recognize images and, having been trained, could simply use an analogue circuit. The following may be mentioned as some examples of the main disadvantages:

1. The neural network training time depends exponentially on the size of the network (number of neurons and connections between them) and the volumes of data for training. Furthermore, the neural network training time often cannot be determined beforehand.

2. Other neural network models analyse data (recognize images) in parallel at all neurons, but training takes place sequentially since the training algorithms are iterative.

3. Very often the memory of neural networks is unstable. The network training in a new image destroys or changes the results of the previous training.

4. Neural network training can lead to "paralysis", i.e. the impossibility to continue its training. It is impossible to determine whether a neural network could be trained in a given set of data beforehand.

5. Neural network training is critical with respect to the sequence of data presented for training. The optimal sequence cannot be predicted beforehand.

6. Neural network recognition and training processes cannot take place simultaneously.

The exception is the "Back-propagation neural networks" [10][1 1][12][13][14][15].

8. A distinguishing feature of all existing neural network models is the impossibility of the selective removal (omission) of certain specific information.

The proposed neural network model obviates the aforementioned disadvantages. For this purpose a new structure of the neuron and synapses, as well as a new training algorithm, are used.

The given invention presents a new model of the neural networks including neurons with non-uniform multi-level synapses. The synapse in the given model does not simply conduct a signal from one neuron to another and non-linearly transforms this signal, depending on strength of a signal reached this synapse. Each synapse contains a set of the mediators corresponding to a different level of a signal. Each such mediator is characterized by its weight coefficient. All these synaptic weight coefficients are parameters of a synaptic transfer function which describes the transformation of a signal transmitted by a synapse. Training the given neural network involves the choice of such synaptic weights at which the output signal of the neuron will match the expected signal.

Furthermore, a reciprocal propagation of signals is employed in training a neuron. In this connection, movement of a backward signal is envisaged not according to a reverse (recurrent) connection, but directly through an axon to the dendrites.

The given neural network model enables several different types of training algorithms to be realized:

• Training with memory fixation is a training when there is a step-by-step decrease of plasticity of weights of synapses and weights of dendrites. In order to remember an image, the weights of the active mediators are modified, converting a transfer function of synapses, and in such a way compensate errors. In a case when the plasticity of synapses is not enough for error compensation, its uncompensated part is spread to overlying neurons and compensated already by them. For this purpose the plastic (unfixed) mediator is selected in a synapse. This mediator is capable to compensate a neuron error and a level of the given mediator is transmitted as a backward signal to the overlying neuron that leads to the error generation already at this overlying neuron. Application of the given algorithm in a combination with additional statistical utilization coefficient of a mediator allows implementing an additional possibility, namely "forgetting". The given algorithm ensures the most rapid training method. Besides, the remembered information is never distorted by the new images. The given algorithm is the most convenient for using for storage and information search without information loss and distortions, for example, for creation of the neural network content addressable memory (NNCAM). Another application of the given neural network with such algorithm is the development of the neural network database. Such database, even in not sorted data will find the necessary information in one step simply recognizing it.

• Training without memory fixation is the training when there is no decrease of plasticity of the synaptic weights. In such training algorithm the plasticity of weights of the active mediators is not fixed after short-term memory formation. Memorization of the whole packet of patterns by such not fixed neural network yields that weights of all mediators come by themselves to statistically optimal (equilibrum) status, the closest to the whole packet of images. The given algorithm ensures the most effective approximation and classification of the data. The training based on given algorithm is slightly more slowly, but however is many-fold faster than the training of other neural networks models. Normally, neural network training based on the given algorithm demands only some cycles. The given algorithm is the most convenient for using for operations with the fuzzy and noisy data, for solving a classification problem, approximation, interpolation, for recognition tasks and tasks from other intellectual areas. • Training with the step-by-step fixation is the algorithm representing the compromise between the first two training algorithms. Fixation of the plasticity of synapses in the given algorithm happens, but it takes place not right after memorizing a pattern. There can be a number of versions of the given algorithm. Such approach leads to a combination of advantages of both training algorithms, namely high quality of statistical analysis of a neural network combined with information saving without loss.

ADVANTAGES OF THE INVENTION

As described above, previous neural network models have some disadvantages, which are absent in the present model:

1. Exponential growth of the training time

The training the proposed neural network requires only one to a few (maximum) cycles for each image. It is possible to determine the necessary training time beforehand.

In our model the training speed is dependent on the same factors that influence the speed in other models but differently:

• The training time of a given neural network does not depend on the size of the neural network but depends only on the speed of the system on which the network is based. Of course, in the implementation of the given model of a neural network in the form of a program simulation on a conventional computer, its training speed will depend linearly on the size of the neural network, since all the calculations will be performed consecutively on conventional processors. However, the basic effect is achieved by creating a micro-circuit operating on a new principle. Other neural network models analyse data (recognize images) in parallel at all the neurons, though the training takes place sequentially since the training algorithms are iterative and changes in the weights of the dendrites depend on other neurons and previous iterations. Our neural network is trained in the same way as it recognizes, namely completely in parallel - the training of each neuron takes place in one step and does not depend on the state of other neurons. Such a parallel mode of operation of neurons in the neural network training makes the training time independent of the size of the neural network. This makes the neural networks can be scaled up and enables neural chips to be created containing thousands of neurons.

• The training time of the given neural network depends linearly (or almost linearly) on the amount of data used for the training, in contrast to the exponential dependence in the case of other model (Fig. 3). Such an effect is achieved by virtue of the fact that the given neural network model employs a number of principles that exist in biological neural networks, which were not taken into account in all the other models of neural networks:

• Each neuron in a given neural network is encapsulated, in other words each neuron is a completely autonomous system. In its operation a neuron uses only that information that reaches its inputs. Only input data can be used to train a neuron. In all previous neural networks, even those that learn by definition without a "teacher", for example in a self- organizing map (SOM), during the training time the state of a number of other neurons is taken into account to form the state of each of the neurons. For example, the rule "a winner- takes-all" in a SOM denotes a comparison of the state of each neuron with the states of a number of other neurons. Who in a biological brain can perform such a comparison of neurons? The result of such an approach is an iterative selection of the weights simultaneously for all neurons, which gradually approaches the optimal state. This approach requires a multiplicity of optimising iterations even for an insignificant volume of data. The idea of a "teacher" and a common "regulator" for all neurons is completely rejected in our model of a neural network. Each neuron considered separately does not know the states of all the other neurons of the neural network. This neuron receives only those input data that reach it through the synapses in the form of signals. On the basis of this local information the neuron adapts to the information, arriving at a homeostatic state and completely disregarding the state of other neurons. The adaptation takes place in one step. As a result, in order to train a whole neural network in one image only one cycle is required, which will be sufficient for the complete adaptation of each neuron taken individually.

• A new feedback principle is used in the present model. In other models feedback can be absent in a manifest form and its role is performed by an "external teacher". For example in networks with "Back-propagation neural networks" [10][1 1][12][13][14][15] the "external teacher", which checks the obtained result against the expected result and corrects the weights of the dendrites, actually performs the feedback. Recurrent connections, for example in Hopfield networks, are one further method of forming a feedback. This type of feedback in a neural network is more like a biological feedback. However, recurrent connections are encountered not everywhere in biological neural networks, although neurons learn everywhere, which suggests that in biological networks another feedback principle is employed.

In some models, for example in SOMs, it is assumed that there is no feedback at all. However, the result of training such a network is unpredictable selection of output images, which require an additional classification for their application, since an absolutely random result does not carry any information. Such a classification of output images in an implicit form fulfils the role of a feedback.

In the described model the feedback is realized as the propagation of a backward signal directly through a neuron: the expected output signal of the neuron is sent in the reverse direction - to the output of the neuron (to the axon). The axon in this case is an input for the backward signal. After that the backward signal is transmitted through the soma to the dendrites; through the dendrites the backward signal is transmitted to the synapses with overlying neurons.

The difference between the direct and backward signals is for a neuron a sufficient source of information about an error and its compensation (restoration of homeostasis). Compensation of an error in a backward signal means training of the neuron. The neuron does not require any additional information about other neurons with the exception of input signals at dendrites in the case of a direct signal, and at the input at the axon in the case of a backward signal. In the described model this is the total amount of information required for the complete training of each neuron in the network as a whole. As a result, the training of a neural network in one image requires almost the same number of calculations for the backward signal as for the recognition of one input image in the case of a direct signal. The information independence and indifference of neurons to the properties of other neurons ensures complete parallelism of the calculations in neural network training.

Instability

In the presented model of a neural network the new principle of memory based on a multi-level synapse (with a set of mediators) synapse with the dynamic weights of mediators is implemented. In the given model the errors of a neural network during training in different images are statistically accumulated on the weights of different mediators. Training the network in a new image very slightly modifies the weight of one mediator of each synapse without affecting the weights of remaining mediators. Thus, training the network in a new image does not demand a transformation of the whole system of weights and does not lead to an erasure of the previous information. Besides, the given model allows enhancing stability of memory, reducing plasticity of selective weights of mediators.

The memory remains plastic, i.e. is able to learn new images and at the same time preserve its stability, thereby ensuring that old images will be not destroyed during the training. The given neural network is resistant to damage - damage of the network up to a certain level does not lead to the complete loss of the preserved images. The given neural network is very stable to noise, has a high information capacity, high indices as regards generalization and approximation of data, and is also easily realized in the form of a chip. Accordingly, the present neural network model is free of all the disadvantages of ART nets. 3. ^"Paralysis".

Paralysis of the described neural network is impossible during its training.

4. Criticality to the image sequence.

In the proposed neural network the order of presentation of data for training is unimportant, in other words the order of the images does not influence the training result.

5. Impossibility of simultaneous neural network training and image recognition by the neural network. In the proposed model, as has already been mentioned before, the process of training the network in a new image requires one backward signal step. In this, earlier information is not destroyed. The recognition of the sample itself requires one direct signal step. The speed of training of the given network in one image in the case of a backward signal is comparable to the speed of recognition of one sample in the case of a direct signal. This makes parallel neural network training and image recognition by the network possible in real-time operations.

6. Structural constraints on an amount of the functional layers and topology.

The proposed neural networks can contain any number of layers. The present model is a new model of trainable multilayer neural networks. Each neuron learns independently of others in the case of a backward signal, then in this case not only the number of levels in the network is not critical for the training algorithm, but also the structure of the network as a whole.

7. Forgetting

The proposed neural network allows the selective removal of information without damaging other information. The removal of one specific image, just like the training in one specific image, is performed in one cycle and does not influence (or virtually not) other regions of the neural network memory.

The present invention enables various technological areas associated with information processing to be improved.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 shows schematically the structure of a neural network:

a. Formal neuron

b. Typical neural network

Figure 2 shows schematically the distribution of the signals on synapses depending on a level of the neuron excitation.

Figure 3 shows dependence of training time on the sizes of a network (amount of neurons).

Figure 4 shows UML model of the given example of the neural network. DETAILED DESCRIPTION OF THE INVENTION

The training of the given neural network is based on a new structure of the synapses forming a given neural network. Information carriers in the neural network are not only dendrites at the inputs of the neurons, but also synapses. The structure of the connections between the neurons is not of primary importance.

A difference of a synapse in the given model of a neuron from synapses in other models will be that the signal is transmitted from the neurons through synapses to dendrites nonuniformly. What signal will be transmitted through a synapse depends on a signal value arriving at a synapse, namely on a level of activation of an overlying neuron. The synapse does not only transmit a signal to a dendrite of another neuron, but also converts this signal according to a synapse transfer function. Each synapse has its own transfer function. Synaptic plasticity consists in a capability of tuning its transfer function. In the transfer function parameters are weights of mediators (Fig. 2).

Modification of excitation strength of a neuron according to the activation function leads to a choice of a new active signal on a synapse, corresponding to a current active mediator (Fig. 2).

One further principal distinction of the present neural network model from other models is the reciprocal transmission of a signal by the neuron. By this is meant not a reverse recurrent connection, but a backward signal going directly along the axon to the dendrites of the neuron. The subsequent transmission of the signal in the direct (from the dendrites to the axon) and reverse (from the axon to the dendrites) directions is necessary for the neuron training. The difference between the direct and backward signals at the axon forms the basis for compensating an error by means of training, namely by means of the change of the weights of the synapses and weights of the dendrites of a neuron. After compensation of the error, i.e. its correction, a partial fixation of the neuron can take place. Furthermore plasticity of weights of mediators is narrowed down to a level on which the further fluctuations of weights of mediators, within the limits of new plasticity, will not lead to generation of a new error. Values of dendritic weights also can be fixed. If plasticity of synapses and dendrites is more incapable to compensate an error in a feedback signal, in that case a synapse signal is transferred to a level in which a mediator with a plastic (unfixed) status, capable to compensate an error is situated. The level of the given plastic mediator as the backward signal is transmitted to the overlying neuron. This leads to an error generation at the overlying neuron. Thus, uncompensated residual of an error is spread with a backward signal through synapses to overlying neurons which will compensate this error. Such mode of neural network training allows training a network in a new image in just one cycle, adding a new image to a packet of previous images, without retraining a network to all images again.

According to the activation function, a neuron is excited, attaining a certain excitation level. The current excitation level is in correspondence with one certain mediator (the active mediator). Each of mediators is in correspondence with its own weight which defines a signal transmitted through a synapse.

The sum of all mediators of lower level can supplements a signal of the active mediator. The signals released by mediators of the lower level in this case are not disconnected and bring the contribution to a signal transmitted by a synapse. In this case summation of signals of all active mediators is made:

The case when mediators of the lower level are inactive can be a special case of such signal transmission by a synapse.

In this case all underlaying weights do not supplement a weight of the active mediator. The weight of the active mediator in that case independently defines a signal transmitted by a synapse:

Si=Oi (Fig. 2).

Signals of mediators define a basic structure of a synaptic memory. Step-by-step transition from one excitation level of a neuron to another leads to modifications of a signal transmitted by a synapse. Such modification can be carried out by various methods. The synapse transfer function is a method of a signal modification for one synapse, corresponding to one of outputs. The function value depends from the weights of mediators of a synapse and changes from the weight of one mediator to the weight of another one. This transfer function can have any functionality, for example, threshold where a level of thresholds is set by a size of weights of mediators. Such function can be also smooth. In case of a threshold transfer function a transition from one excitation level of a neuron to another one is accompanied by sharp non-uniform modifications of a signal transmitted by a synapse. In case of smooth function, for example similar to a polynomial, the step-by-step modification of an excitation level of a neuron is accompanied by a smooth modification of a signal transmitted by a synapse.

The dendrites of a neuron, as in other models, can have their own weights, which correct the signals transmitted to the neuron.

The dendrite weight as well as synapse weights can be fixed or possess plasticity.

A particular case is when the weights of all the dendrites are identical and fixed. In such a situation the memory is formed only by the synaptic weights.

Training of the neural network involves such a correction of the weights of the dendrites and synapses, in which to each input image of the neural network there corresponds (for a specific group of images) a strictly defined output image.

The input image is the vector of the signals (numerical values) lying in defined limits and transmitted to the dendrites of high-level neurons. The output image is the vector of the signals (numerical values) lying in defined limits and received at the axons of low-level neurons. The morphology of the given neural network does not differ from that of other models. The structure of the interneural connections may be any arbitrary structure.

The structure and function of a synapse and the training method of a neuron, instead of the morphology of connections have the fundamental value.

According to the present invention, the model of the functioning of an individually considered neuron and its synapses will differ from other models:

Incapsulation

Only information received at the inputs of the neuron itself can influence the training of the neuron. The state of other neurons, if this information does not reach the inputs of the given neuron, does not influence this neuron. Only neurons directly connected to a given neuron can directly act on this neuron.

Self-teaching system

A neuron is a self-regulating, self-adapting, autonomous system. The neuron itself is corrected (learns) depending on the information that reaches its inputs, and does not require an external "teacher".

Backward signal conduction

In order to correct the state of a neuron, depending on the output error (training) the neuron should transmit the backward signal (from the axon to the dendrites) having a real (expected) value. This backward signal will be the basis for training the neuron.

Principle of superpbsitioning

A neutron emits at the output (axon), as a possible result, not only a signal or its absence (superpositioning), but any value within the limits of a defined range. Accordingly, the set of decisions of a neuron is the set of the possible levels of its activation (output signals of different strength at the axon). The result can be arbitrary within the limits of the specified range.

Accuracy of a neuron

A neuron has a defined accuracy. Fluctuations of a result in a specified range from the expected value do not count as an error. Any possible result of a neuron activation located in the same accuracy range is the same neuron decision. Dendrite memory

One of the carriers of the memory of a neuron, as in the case of a conventional neuron model, are dendrites, and more specifically their weight, namely the coefficient that filters the input signal and changes it. The weight of a dendrite can have any arbitrary value within the limits of a defined range (dendrite plasticity). The value can also be a negative quantity. In this case the dendrite becomes inhibiting.

Synaptic memory

One more information carrier of a neuron, unlike a classical model, is synapses. The signal is transmitted through a synapse to the connected with it dendrite nonuniformly. Synapses have several independent mediators, each of which possesses its own weight coefficient. Each of possible levels of neuron activation activates its own group of mediators of a synapse, depending on the strength of an output signal of an axon. The axon output signal causes switching between various groups of the active mediators (fig. 2).

Short-term memory

Each mediator of a synapse possesses certain plasticity, namely it is capable to vary its weight coefficient influencing the strength of a transmittable signal within certain ranges. If the backward signal of an axon differs from the direct one this forms an error of neuron training. On the basis of activation function the demanded modification of the sum of input signals is computed. Depending on the preset modification of the sum of input signals, weights of mediators can be modified (within the limits of plasticity of these mediators) in such a manner that a value of the corresponding transmittable signals compensates an error. In such a manner a neuron remembers a current image. If the allowed level of plasticity of the active mediators does not allow compensating completely an error, its uncompensated residual can be transmitted through synapses to overlying neurons.

This memory is the short-term one as the next remembered image is capable to vary this information, again modifying weights of synapses. Therefore such memory type is short-term memory. Without short-term memory the long-term memory is impossible.

Long-term memory

The short-term memory can be fixed, becoming long-term one. The short-term memory fixation is a decrease of synaptic plasticity of mediators. In such a case intervals of possible modifications of weights of mediators are narrowed down. New ranges are those wherein the modification of the weight of any mediator in synapses, within the limits of its new plasticity, will not lead to excess of the neuron accuracy. Any new information remembered by a neuron, does not affect previous memory at all. Fixation happens only on the active mediator corresponding to the activation level. Transfer function of synapse

Each synapse changes a value of transmitted through it signals depending on a neuron excitation level according to a transfer function. This function sizes up the signal transmitted through a synapse to a dendrite bridged to it. The size of the transmitted signal depends as on weight of the active mediator corresponding to an excitation level of a neuron, and from the transfer function form. The transfer function can have any form, for example, threshold where a level of thresholds is installed by a size of weights of mediators. The transfer function can be also smooth.

Function of activation

The synapses transmit immediate spontaneous signals to the dendrites of the neurons. Since the strength of these signals does not depend directly, but only indirectly, on the strength of the axon signal (level of activation of the neuron), the activation function can be arbitrary. The only condition is that the activation function should cover the whole range of possible solutions.

Memory overflow

The moment when neuron synapses are not capable to compensate any more an error is memory overflow. Such neuron has attained a level of the full saturation, and cannot remember the new information any more, without having changed the previous long-term one.

Statistical memory of the mediators and "forgetting"

As it was specified above, a basis of the given neural network memory is not only weights of dendrites, but also synaptic weights of mediators. The information of the trained neuron is accumulated in the weights of a group of mediators of each synapse. Each weight from such groups, unlike dendrites of the classical model, can be used not only multiply, but also once (only for one image) or not to be used generally. Each mediator of any synapse has an additional parameter, namely coefficient of utilization. It is the usage counter of a mediator for the neural network training. During the neural network training with the next image, at usage of a certain mediator as an active one, its coefficient of utilization grows, in other words the usage counter of a mediator grows. It is obvious that the incrementation of counters at all mediators will be non-uniform. Weight modification of a certain mediator with coefficient of utilization larger than one can lead to a destruction of other helpful information using the given mediator. For the mediators with the coefficient of utilization equal to one, removal of information will lead to a selective destruction of only one image.

Therefore, a removal process of the separately taken image in the given model represents a process of reduction of the usage counters for each of mediators used by this image. The counter of each used mediator is reduced by one. If this counter is already equal to one, it is voids, and the weight of the given mediator can be changed that can lead to destruction of only the given current image.

However a special re-recording of the information is not required since it is enough to enlarge plasticity of a current mediator to the greatest possible range. The information will not be erased, but will lose fixity. Memorizing the new information will be able to modify the weight of the given mediator that already really can lead to recording erasure. Thus, only those weights can be safety changed, whose mediators are used only in deleted entries.

The given model of synapses, neurons and a neural network enables several different types of training algorithms to be realized:

• Training with memory fixation is training when there is a step-by-step decrease of plasticity of the weights of synapses and weights of dendrites. As it was described above, in order to remember the next image, the weights of the active mediators are modified, converting a transfer function of synapses, and in such a way compensate errors. In a case when the plasticity of synapses is not enough for error compensation, its uncompensated part is spread to overlying neurons and compensated already by them. For this purpose the plastic (unfixed) mediator is selected in a synapse. This mediator is capable to compensate a neuron error and a level of the given mediator is transmitted as a backward signal to the overlying neuron that leads to the error generation already at this overlying neuron. Application of the given algorithm in a combination with additional statistical utilization coefficient of a mediator allows implementing an additional possibility, namely "forgetting". The given algorithm ensures the most rapid training method. Besides, the remembered information is never distorted by the new images. The given algorithm is the most convenient for using for storage and information search without information loss and distortions, for example, for creation of the neural network content addressable memory (NNCAM). Another application of the given neural network with such algorithm is the development of the neural network database. Such database, even in not sorted data will find the necessary information in one step simply recognizing it.

• Training without memory fixation is the training when there is no decrease of plasticity of the

synaptic weights. In such training algorithm the plasticity of weights of the active mediators is not fixed after short-term memory formation. Memorization of the whole packet of images by such not fixed neural network yields that weights of all mediators come by themselves to statistically optimal (equilibrum) status, the closest to the whole packet of images. The given algorithm ensures the most effective approximation and classification of the data. The training based on given algorithm is slightly more slowly, but however is many-fold faster than the training of other neural networks models. Normally, neural network training based on the given algorithm demands only some cycles. The given algorithm is the most convenient for using for operations with the fuzzy and noisy data, for solving a classification problem, approximation, interpolation, for recognition tasks and tasks from other intellectual areas.

• Training with step-by-step fixation is the algorithm representing the compromise between the first two training algorithms. Fixation of the plasticity of synapses in the given algorithm happens, but it takes place not right after memorizing a pattern. There can be a number of versions of the given algorithm. Such approach leads to a combination of advantages of both training algorithms, namely high quality of statistical analysis of a neural network is combined with information saving without loss.

The described model bf a neural network and its training algorithms can easily be realized in various ways. The main advantage of all similar realizations is the possibility of constructing analogue (not digital) systems. The following are some examples:

• A virtual neural network, which is implemented in the form of a computer program. Example 1 of the present patent provides a description of a programmed realization of a neural network.

• A neural network that is implemented as an electronic circuit or microcircuit. For example, the offered neural network model allows developing the analogue neural network with incorporated training mechanism and the neural network content addressable memory (NNCAM).

• A suitable neural network can be implemented in the form of an optical arrangement (optical

neural network).

• The analogue nature of the given neural network model and its training algorithms enable the said neural network to be realized even in the form of a system of chemical processes. For example, the given model of a neural network can be realized on the basis of a system of biochemical reactions with DNA and RNA molecules.

• Other methods of the practical implementation of the described model are also possible.

Some conventional examples of the possible practical use of the present neural network are given hereinafter:

• Recognition of graphical images, including images with strong "noise" pollution and distortions.

• Recognition of speech in a dynamic range independently of the tone quality (timbre), sex, age and pronunciation of the speaker.

• Automated speech synthesis.

• Machine translation.

• Statistical analysis of data of any level of complexity.

• Filtering signals out from background noise.

• Data compression.

• Neuroprocessors for neurocomputers.

• Machine-type sensor organs, including machine-type visual systems with image recognition. The possibility of the selective removal of information in a neural network without having to completely retrain the network opens up completely new possibilities. For example, there is a possibility of creating an associative intellectual memory based on the given neural network (the neural network content addressable memory, NNCAM). NNCAM should have, apart from a faster speed, a greater capacity and low energy consumption, as well as a high resistance to information damage, since a neural network can store information and is able to survive even if a significant proportion of the neurons and synapses is lost.

Another possibility is the creation of a new type of management systems for databases (neural network databases). Up to now the creation of such databases was impossible for a number of reasons: firstly, the sizes of the neural networks and their capacity were limited, since the training time of the neural network increased exponentially depending on the volumes; secondly, there was no possibility of selected removal of information. It was impossible to carry out all the basic operations with the database without retraining its whole volume. Our new model of a neural network allows new data to be recorded and information to be selectively removed, which permits the creation of a neural network database in which all basic database operations will be realized: addition, editing, removal and, of course, searching records. The rate of information searching in such a database does not depend on the volumes, since the information is always available in just one step.

The creation of neural network databases enables another important problem to be solved. This is the development of antivirus programs. The growth of antivirus databases will not be associated with a decrease in the speed of their operation and, consequently will not lead to slower computing speeds. The search for virus signatures will be based not on an enumeration of an antivirus database, but on its recognition.

Based on the aforedescribed neural network model it is possible to create a new type of search engine, which searches for information not only in texts but also among graphical and sound files. By using the mechanism to recognize speech, words and phrases, it is possible to create for communications companies instantaneous identification of conversations relating to a specific topic.

Neural networks based on the aforedescribed model can have a high capacity and speed. Such networks can be used to create expert systems that forecast fluctuations in currency and share markets.

Robot control.

Autopilots for transportation means.

Self-training expert systems for various sectors, for example medicine, seismology, etc. The aforedescribed neural network permits the creation of a new system for developing computer applications based not on programming but on training.

ALTERNATIVE WAYS OF REALIZATION OF THE INVENTION

The aforedescribed neural network model is sufficiently flexible and permits significant variations in its construction and functioning, which overall do not affect the basic possibility for training or only influence the quality of this training. The neuron itself can be constructed in various ways. The employed terminology is relative and does not have to be strictly implemented. The following examples of such variations may be given:

• It is possible not to use an axon transmitting a signal from an excited neuron directly to the synapses. An axon in the present description is an abstract structure, and is used to imply a large biological resemblance.

• Dendrites may be not only a carrier of information (weights), but may also and only be inputs for input signals. In other words dendrites can have their own memory and plasticity (as described above), but these characteristics of dendrites are not obligatory. In this case dendrites in fact only transmit input signals, without even converting them. This is similar to dendrites without any plasticity and with identical weights. The memory of a neuron in this case will be based only on synapses. It is possible to consider such dendrites where the plasticity is cancelled, and the weights have arbitrary values. The neuron in this case does not lose the capability for training.

• The weights of the dendrites can be understood not only, as some constant quantity, but also as a function of the input signal. In other words, the weight of a dendrite may not be constant but may be a dynamic quantity, functionally changing with a value transforming an input signal. This function may also not be linear. For example, this function may reflect the dependence of the resistance of a dendrite (as a factor transforming a signal) on any external conditions.

• The soma may be divided into a summer and an activator (activation function). Thus, the soma as well as the axon is used for its resemblance to a biological neuron.

• The neuron summer can perform not only linear summation of input signals, but also summation with simultaneous transformation, and also non-linear summation. The term "summation" should not be understood literally, as a mathematical operation of summation. Instead of summation of input signals it is possible to perform other operations on them, the result of which is transmitted as an argument to the activation function. For example, instead of summation it is possible to multiply input signals or even perform some other operation.

• The activation function may also be different. There is no need to employ a sigmoid curve or a threshold activation function. The activation function may even be linear. Possibly, one of the few restrictions on the activation function should be its continuity over the whole range of its possible values.

As it has been mentioned above, the synapse can be considerably modified without loss of learning capability. So, the signals defined by mediators, corresponding to the lower level of the neuron excitation are not disconnected, and can save the activity. Modifications in this case are made at the expense of the active mediator, namely the mediator corresponding to a current excitation level of neuron. The signals of the active mediator supplement the sum of all mediators of the lower level:

Si=ai+bi+ ... +0| (Fig. 2).

The case when mediators of the lower level are inactive can be a special case of such signal transmission by a synapse. In this case all underlaying weights do not supplement a weight of the active mediator:

In both cases the synapse to the full ensures a possibility for the neural network training. It is possible to apply other mechanisms of the synaptic training. But a basis in all cases is a choice of the active mediator.

The Amount of mediators can be arbitrary. Moreover, their amount can dynamically vary during the neural network work. The neural network with one mediator in each synapse is degenerated in a classical model of a neural network.

The synapse transfer function can be arbitrary. For example, it can be a threshold function where a transition to the weights of a new mediator occurs abruptly, or it can be a polynomial function or a linear transformation function where the weights vary linearly depending on a neuron excitation level, being gradually transformed from the weight value of one mediator to the weight value of another mediator.

The synapse transfer function can consider the values of weights of unfixed mediators, but can also not consider them, doing the values of weights of only fixed mediators as the reference points. It does not affect the basic ability of a neuron to training. It influences only the quality of the training.

The structure of the connections between neurons of the neuron network may be arbitrary. Recurrent connections are possible.

The narrowing of plasticity of the mediators on synapses and the weights of the dendrites (fixation) can be carried out arbitrarily. Moreover, fixation can not be carried out completely. It is possible such a variation when the fixation can be instant and full that is when a level of new plasticity becomes at once equal to null. Thus, the further modification of the fixed weights becomes in general impossible. It does training more rigid, but does not deprive learning capability of a neuron. Methods of plasticity fixation influence not on an ability to be trained, and on neural network properties.

• Implementation of statistical memory of mediators can be also to a large degree arbitrary. It can be a simple counter of remembered images (coefficient of utilization of a mediator). It can be also a simple counter considering a certain frequency. It can be a general very simple variety of a logical fixation flag where the training fact is marked with a flag and is hot changed any more. In this case the possibility of selective removal of information is lost, but all other properties of a neural network are saved. It is also possible to refuse usage of this parameter. In this case the possibility of selective removal of information is also lost. The possibility of fixation of the plasticity of synapses is also lost, but all other properties of a neural network are saved. Normally, the statistical memory of mediators as the parameter can be freely enough modified, depending on needs for development of a network with certain properties.

References

A list of references is provided corresponding to the bracketed reference numbers appearing throughout the specification. Each reference listed in the following list or otherwise identified in this patent application is incorporated by reference into this application.

Hopfield model of neural networks (without the teacher):

[1] Abu-Mostafa YS, Jacques JSt (1985) Information capacity of the Hopfield model. IEEE

Transactions on Information Theory 31 (4): 461-64.

[2] Hopfield J J (1982) Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Science 79: 2554-58.

[3] Hopfield JJ (1984) Neural with graded response have collective computational properties like those of two-state neurons. Proceedings of the National Academy of Science 81 : 3088-92.

[4] Hopfield J. J., Tank DW (1985) Neural computation of decisions in optimization problems.

Biological Cybernetics 52: 141-52.

[5] Hopfield JJ, Tank DW (1986) Computing with neural circuits: a model. Science 233: 625-33

[6] Tank DW, Hopfield JJ (1986) Simple «neural» optimization networks: An A/D converter, signal decision circuit, and a linear programming circuit. Circuits and Systems IEEE Transactions on CAS- 33(5): 533-41. Counterpropagation networks (the model of Grossberg S and Kohonen T):

[7] Grossberg S (1969) Some networks that can learn remember and reproduce any number of complicated space-time patterns. Journal of Mathematics and Mechanics, 19: 53-91.

[8] Grossberg S (1971) Embedding fields: Underlying philosophy, mathematics, and applications of psychology, physiology, and anatomy. Journal of Cybernetics, 1 : 28-50.

[9] Grossberg S (1982) Studies of mind and brain. Boston: Reidel.

Back-propagation neural networks (multilayered neural network):

[10] Cottrell GW, Munro P, Zipser D (1987) Image compression by back propagation: An example of extensional programming. ICS Report 8702, University of California, San Diego.

[11] Parker DB (1987) Second order back propagation: Implementing an optimal 0(n) approximation to Newton's method as an artificial neural network. Manuscript submitted for publication.

[12] Pineda FJ (1988) Generalization of back-propagation to recurrent and higher order networks. In Neural information processing systems, ed. Dana Z Anderson, pp. 602-11. New York: American Institute of Physics.

[13] Stornetta WS, Huberman BA (1987) An improved three-layer, back-propagation algorithm. In Proceedings of the IEEE First International Conference on Neural Networks, eds. M. Caudill and C. Butler. San Diego, CA: SOS Printing.

[14] Wasserman, PD (1988a) Combined back-propagation/Cauchy machine. Proc. International Neural Networks Society, Pergamon Press, New York.

[15] Wasserman PD. (1988b) Experiments in translating Chinese characters using back-propagation. The adaptive resonant theory

[16] Abraham A et al. (2004) Innovations in intelligent systems. Berlin; New York : Springer.

[17] Crossberg S (1987) Competitive learning: From interactive activation to adaptive resonance. Cognitive Science 11 : 23-63.

[18] Lippman RP (1987) An introduction to computing with neural nets. IEEE ASSP Magazine, vol. 4, pp. 4-22, April.

Self-organizing map (The Kohonen Model)

[19] Allinson N (2001) Advances in self-organising maps. London; New York: Springer. [20] Obermayer K and Sejnowski TJ (2001 ) Self-organizing map formation: foundations of neural computation. Cambridge, Mass.: MIT Press.

[21] Vishwanathan SUN and Murty MN (2000) ohonen's SOM with cache. Pattern Recognition, 33:1927-1929.

Boltzmann Machines

[22] Ackley, D. H.; Hinton, G. E.; Sejnowski, T. J. (1985). "A Learning Algorithm for Boltzmann Machines". Cognitive Science 9 (1 ): 147-169. DOI:10.1207/s15516709cog0901_7.

http://dl.acm. org/citation.cfm?id=54465

Radial Basis Function nets (RBF)

[23] Martin D. Buhmann (2003). Radial Basis Functions: Theory and Implementations. Cambridge University. ISBN 0-521-63338-9.

[24] Yee, Paul V. and Haykin, Simon (2001). Regularized Radial Basis Function Networks: Theory and Applications. John Wiley, ISBN 0-471-35349-3.

Claims

1. A method of constructing a neural network, in which the neural network consists of neurons connected to one another via synapses, wherein the neural network may have any morphology, for example it may be multi-layered, have recurrent connections or may overall not have an ordered structure; neurons include inputs, a summer, an activation function; the summer can be simple or complex, for example, it can be adding signals with their simultaneous transformation; an activation function can have the different form, for example, it can be sigmoid function, a threshold function, a linear function or other function; the neuron output forms synapses with inputs of other neurons or with inputs of the same neuron; in this respect the synapses represent not a simple contact for a signal transmission, and are a complex multilayered structure where each level is formed by the separate mediators possessing such parameter as the weight that does each synapse as a multilayered set of weights which all together are parameters of a transfer function of a synapse, namely such function that transform a signal transmitted by a synapse; notably, each synaptic weight is a variable value and is one of the neural network carrier of information (memory); the neuron provides at the output not only two possible values: (1) a signal and (0) its absence, but also includes the possibility of intermediate values.which match to different synaptical levels and to the corresponding mediators with their weights; besides synaptical weights which are the parameters for a transfer function of the synapse, exist also the weights of dendrites influencing the signal being transmitted from the synapse, and also having a plasticity, i.e. they can be modified in a defined range; recognition of the image by a neuron occurs as follows: signals of different strength come on each synapse of the neuron; to a signal of a certain strength there matches a defined value of the synaptic transfer function which parameters are the synaptic weights of the mediators; the obtained value of the synaptic transfer function from each synapse after a filtering by the dendritic weights reaches a neuron summer; the outcome of summation is an argument for the activation function of the neuron, the outcome of the activation function of the neuron is the output signal of the neuron which can be fed to its synapses if they exist; such a tuning of the synaptic and dendritic weights, in which the expected result is obtained at the output of each neuron and the whole neural network is the training process of the neural network, i.e. the process of compensating an error between a direct (output) signal at the neuron output and a backward (expected) signal at the neuron output; for error compensation the weights of the active mediators on synapses are modified, that is the weights of such mediators whose level corresponds to the level of a signal, reaching a synapse; modification is made so that the new values of transfer functions modified by the dendric weights, form in the summer a sum that is the argument for the activation function, provides at the output a direct signal equal to the backward signal; the said algorithm for training neurons enables them to be trained in parallel in one step; modification of the weights of the dendrites is necessary as a means for influencing the properties of the neurons and network, as well as a means for influencing the training process.

2. Method according to the claim 1 in which the weights of the mediators lying at levels below the active mediator, that is corresponding to more weak signals reaching a synapse, continue to have summarized impact on a synapse transfer function; notably, the training error is compensated by the weight of the active mediator.

3. Method according to the claim 1 in which the weights of the mediators lying at levels below the active mediator, that is corresponding to more weak signals reaching a synapse, do not exert any influence on a synapse transfer function; notably, the training error is compensated by the weight of the active mediator.

4. Method according to claim 1 , in which the plasticity of the dendrites may be reduced; this is effected by constricting the range of plasticity, which leads to the fixation of the weights of the dendrites of the neuron; as a particular case the fixation of the weights of the dendrites is a fixation in which all the dendritic weights are fixed in an identical state, wherein the dendrites do not exert any influence on the signals transmitted via the synapses.

5. Method according to claim 1 in which the plasticity of the weights of mediators of synapses can decrease; it is carried out by a narrowing of a range of plasticity of the weights of mediators that leads to their fixation; the fixation of the weights of mediators allows in the training to new images saving the information in the neural network memory without loss, however, there can to step the moment of saturation of the neuron at which the neuron synapses will be incapable to compensate a training error, in that case the synapse level with a plastic (unfixed) mediator is selected, this mediator is capable to compensate an error, and this level as the backward signal, is transmitted to the overlying neuron that leads to the error generation at the overlying neuron; thus, uncompensated residual of an error is spread with an backward signal through the synapses to overlying neurons which will compensate this error.

6. Method according to claim 1 in which the plasticity of the weights of mediators of synapses does not decrease, i.e. mediators during the training are not fixed (a range of plasticity of the weights of mediators is not narrowed down); notably, memorizing the whole packet of images by such unfixed network leads to that the weights of all mediators by themselves come to statistically optimal (equilibrum) status, the closest to the whole packet of images.

7. Method according to claim 1 in which mediators in synapses possess the statistical coefficient of utilization containing the information about the usage frequency of the weight of the given mediator in the neural network training to various images; notably, this coefficient can have various methods of realization, for example, this coefficient can be an absolute or a relative index of the usage frequency of the weight of a mediator, or it can be a simple logical index of the weight usage of weight in the training: (0) - it was not used and (1) - it was used;

8. Method in which for forgetting of a certain image the fixation of the weights of mediators of synapses according to claim 5 and also statistical coefficients of utilization of the active mediators according to claim 7 are used; for image forgetting statistical coefficients of utilization of the active mediators are decreased and in case when the statistical coefficient is dropped to null, the fixation of the weight plasticity is taken out from the corresponding active mediator that allows to modify the weight of the given active mediator during memorizing the new images that can lead to loss of the image for which statistical coefficients were decreased, thus, it is possible to delete (forget) selectively this image from the neural network without loss of the other information.