CN113298231A - Graph representation space-time back propagation algorithm for impulse neural network - Google Patents
- Publication number: CN113298231A
- Application number: CN202110548714.6A
- Authority: CN (China)
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06N3/00 Computing arrangements based on biological models; G06N3/02 Neural networks
- G06N3/044 Recurrent networks, e.g. Hopfield networks
- G06N3/045 Combinations of networks
- G06N3/061 Physical realisation, i.e. hardware implementation of neural networks using biological neurons, e.g. biological neurons connected to an integrated circuit
- G06N3/084 Backpropagation, e.g. using gradient descent
Abstract
The invention relates to the technical field of spiking (impulse) neural networks, and in particular to a graph-representation spatio-temporal back-propagation algorithm for spiking neural networks. The algorithm obtains the spiking neural network through the forward propagation of neurons in a network structure; evaluates the network's error on the task through a loss function; trains the network through error back propagation; and completes the parameter updates of the training process through a neural-network optimization algorithm. The invention improves the accuracy of the spiking neural network through error back propagation, reduces the pulse firing rate through sparse regularization so as to improve energy efficiency under pulse- (event-) driven computation, and, through the graph-representation method, is applicable to the training of a wide range of biologically inspired network structures.
Description
Technical Field
The invention relates to the technical field of impulse neural networks, in particular to a graph representation space-time back propagation algorithm for an impulse neural network.
Background
In recent years, artificial neural networks (ANNs), inspired by biological nervous systems, have developed rapidly and advanced greatly, and are widely used in object detection, face recognition, autonomous driving, speech recognition, translation, and other fields. However, traditional ANNs still lack a faithful simulation of neuronal behavior and of nervous-system structure, so a gap remains between ANNs and living organisms on intelligent tasks such as reasoning and decision making, and their energy efficiency falls far short of the biological brain's.
The spiking neural network (SNN) is known as the third generation of artificial neural network. SNNs have great potential for processing signals rich in spatio-temporal features, owing to their simulation of complex neuronal dynamics and to various structural designs inspired by the functional regions of the biological nervous system. Since SNNs, like biological neural systems, transfer information between neurons through pulses, a neuron need not perform extensive computation when it receives no pulse, maintaining a low resting energy overhead. This pulse- (event-) driven computational feature helps SNNs achieve higher energy efficiency.
Like other ANNs, an SNN must be trained to suit its assigned task. Existing training algorithms fall into three classes: conversion-based algorithms, synaptic plasticity algorithms, and back-propagation algorithms. A conversion-based algorithm converts the parameters of a conventional ANN into an SNN with the same structure; but because an ANN transmitting information as floating-point numbers and an SNN transmitting information as pulses cannot be matched exactly, the converted SNN suffers information loss and reduced network accuracy. Moreover, the conversion-based approach still restricts the SNN to traditional ANN structures and lacks further modeling of biological nervous-system structure. A synaptic plasticity algorithm is a training algorithm based on a physiological phenomenon: it adjusts the synaptic weights, i.e. the parameters of the SNN, according to the relative timing of the pulses before and after each synapse. Synaptic plasticity algorithms suit a wide variety of network structures and require little computation during learning, but the traditional form supports only unsupervised learning, which limits SNN performance to a certain extent. Emerging improved synaptic plasticity algorithms modulate the weight adjustment with a global reward signal to achieve a degree of supervised learning, yet their performance still falls short of back propagation. The back-propagation algorithm, by contrast, adjusts the network parameters precisely through back-propagated errors and thus attains higher network performance. Current back-propagation algorithms for SNNs propagate the error to each network parameter by constructing differentiable back-propagation paths or by approximately substituting the non-differentiable links, and update the parameters by gradient descent and similar methods.
However, similar to the conversion-based methods, existing back-propagation algorithms suit only feedforward structures resembling traditional ANNs and lack support for the varied, complex networks that simulate biological nervous-system structures. Meanwhile, most of these algorithms focus on network accuracy alone and do not explore SNN energy efficiency, so the advantages of SNNs are not fully exploited.
Disclosure of Invention
The present invention aims to overcome the above problems of the prior art by providing a graph-representation spatio-temporal back-propagation algorithm for spiking neural networks, so as to solve the training problem of spiking neural networks with varied, complex structures, and to fully exploit the pulse- (event-) driven characteristic of spiking neural networks to achieve higher energy efficiency.
The above purpose is realized by the following technical scheme:
a graph representation spatiotemporal back propagation algorithm for an impulse neural network, comprising:
obtaining a spiking neural network through network forward propagation of neurons in a network structure;
evaluating the error of the impulse neural network on the task through a loss function;
training the impulse neural network through error back propagation;
and completing parameter updating in the training process through a neural network optimization algorithm.
Further, the network forward propagation of the neuron in the network structure includes the network forward propagation of the neuron in a feed-forward network structure and the network forward propagation of the neuron in a recurrent network structure.
Further, in the feedforward network structure, the forward process of a neuron is:

u_i^t = τ·u_i^{t-1}·f_r(s_i^{t-1}) + Σ_j w_ij·x_j^t + b_i
s_i^t = f_s(u_i^t)

where u_i^t represents the membrane potential of the i-th neuron in the feedforward layer at time t; x_j^t represents the input pulse of the j-th neuron at time t, taking values x ∈ {0, 1}, where 0 denotes no input pulse and 1 an input pulse; s_i^t represents the output pulse of the i-th neuron in the feedforward layer at time t, likewise taking values s ∈ {0, 1}; w_ij represents the synaptic weight from input neuron j to feedforward-layer neuron i, a value of 0 in w indicating the absence of that synapse; b_i represents the bias of the neuron, a value of 0 in b indicating no bias; τ represents the leakage constant of the LIF model, i.e. the rate at which the membrane potential decays per unit time (Δt = 1); U_th represents the pulse firing threshold of the neuron, which emits a pulse when its membrane potential exceeds the threshold;

f_r denotes the reset function of the membrane potential, which drives the potential back to the resting potential (0) after a pulse is emitted; f_s denotes the firing function, which controls whether the neuron emits a pulse:

f_r(s) = 1 - s,  f_s(u) = 1 if u ≥ U_th, 0 otherwise.
further, in the feedforward network structure, the forward process of the neuron can be significantly accelerated by a matrix operation, as follows:
for t=0~T-1 do:
S(:,:,t)m×n×1=U(:,:,t)m×n×1≥Uth
in the formula (I), the compound is shown in the specification,is an input pulseIn the form of a matrix, the dimensions of the matrix are indicated in the corner marks, m denotes the batch size (batch size), ninRepresenting the number of input neurons, and T representing the time step of algorithm operation; calculating and storing the membrane potentialAnd neuronal impulsesIs in matrix form Um×n×TAnd Sm×n×TAnd then S ism×n×TPassing as the output of that layer as the input of the next layer;a matrix of synaptic weights is represented, and,a bias matrix representing one dimension; an as hadamard product, representing a element-by-element multiplication between matrices; u (: t) represents the slicing operation of the matrix.
Further, in the recurrent network structure, the forward process of a neuron is:

u_i^t = τ·u_i^{t-1}·f_r(s_i^{t-1}) + Σ_j w_ij·x_j^t + Σ_k w_ik·s_k^{t-1} + b_i
s_i^t = f_s(u_i^t)

where u_i^t represents the membrane potential of the i-th neuron in the recurrent layer at time t; x_j^t represents the input pulse of the j-th neuron at time t, taking values x ∈ {0, 1}, where 0 denotes no input pulse and 1 an input pulse; s_i^t represents the output pulse of the i-th neuron at time t, likewise taking values s ∈ {0, 1}; w_ij represents the synaptic weight from input neuron j to recurrent-layer neuron i, a value of 0 in w indicating the absence of that synapse; w_ik represents the synaptic weights between the neurons within the layer; b_i represents the bias of the neuron, a value of 0 in b indicating no bias; τ represents the leakage constant of the LIF model, i.e. the rate at which the membrane potential decays per unit time (Δt = 1); U_th represents the pulse firing threshold of the neuron, which emits a pulse when its membrane potential exceeds the threshold;

f_r denotes the reset function of the membrane potential, which drives the potential back to the resting potential (0) after a pulse is emitted; f_s denotes the firing function, which controls whether the neuron emits a pulse:

f_r(s) = 1 - s,  f_s(u) = 1 if u ≥ U_th, 0 otherwise.
further, in the cyclic network structure, the forward process of the neuron can be significantly accelerated by a matrix operation, as follows:
for t=0~T-1 do:
S(:,:,t)m×n×1=U(:,:,t)m×n×1≥Uth
in the formula (I), the compound is shown in the specification,is an input pulseIn the form of a matrix, the dimensions of the matrix are indicated in the corner marks, m denotes the batch size (batch size), ninRepresenting the number of input neurons, and T representing the time step of algorithm operation; calculating and storing the membrane potentialAnd neuronal impulsesIs in matrix form Um×n×TAnd Sm×n×TAnd then S ism×n×TPassing as the output of that layer as the input of the next layer;a matrix of synaptic weights is represented, and,a bias matrix representing one dimension; an as hadamard product, representing a element-by-element multiplication between matrices; u (: t) represents the slicing operation of the matrix; [ | · C]Which represents the combination of the two matrices,the synaptic weight matrix of the input neuron to the recurrent layer neuron and the synaptic weight matrix inside the recurrent layer are merged.
Further, the loss function includes, but is not limited to, a squared loss function, an exponential loss function, or a cross-entropy loss function in the form of a pulse.
Furthermore, sparse regularization is added into the loss function to reduce the pulse release rate of the impulse neural network.
- Further, the neural network optimization algorithm includes, but is not limited to, batch gradient descent, stochastic gradient descent, momentum, Adagrad, Adam, or AdamW.
Further, the behavior of the neurons in the spiking neural network follows the LIF neuron dynamical model and its corresponding variants.
Advantageous effects
The graph-representation spatio-temporal back-propagation algorithm for spiking neural networks improves network accuracy through error back propagation, reduces the pulse firing rate through sparse regularization so as to improve energy efficiency under pulse- (event-) driven computation, and, through the graph-representation method, is applicable to the training of a wide range of biologically inspired network structures.
Drawings
FIG. 1 is a schematic diagram of the network layers of the feedforward structure and the recurrent structure in the graph-representation spatio-temporal back-propagation algorithm for spiking neural networks according to the present invention;
FIG. 2 is a diagram of the forward- and back-propagation processes of the algorithm in a feedforward network structure according to the present invention;
FIG. 3 is a schematic diagram of the forward- and back-propagation processes of the algorithm in a recurrent network structure according to the present invention;
FIG. 4 shows the results of sparse regularization in the graph-representation spatio-temporal back-propagation algorithm for spiking neural networks according to the present invention;
FIG. 5 is an algorithmic flow diagram representing a spatiotemporal back propagation algorithm for an impulse neural network in accordance with the present invention.
Detailed Description
The invention is explained in more detail below with reference to the figures and examples. The described embodiments are only some embodiments of the invention, not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention provides a graph-representation spatio-temporal back-propagation algorithm for spiking neural networks, aiming to improve the accuracy of the spiking neural network while reducing the pulse firing rate to improve energy efficiency. Existing spiking-neural-network learning algorithms fall into three categories: conversion-based algorithms, synaptic plasticity algorithms, and back-propagation algorithms, wherein:
the conversion-based algorithms are limited to traditional network structures and cannot realize spiking neural networks that simulate the biological nervous system;
the synaptic plasticity algorithms are limited in network performance (accuracy) by the local nature of their weight adjustment;
back propagation achieves better network performance, but existing back-propagation algorithms still lack support for more flexible biological nervous-system structures and give no consideration to the pulse firing rate, so the energy efficiency of spiking neural networks cannot be fully exploited and explored.
To solve the above problems, the present invention supports arbitrary network structures and balances network accuracy against pulse firing rate; the scheme is as follows:
obtaining a spiking neural network through network forward propagation of neurons in a network structure;
evaluating the error of the impulse neural network on the task through a loss function;
training the impulse neural network through error back propagation;
and completing parameter updating in the training process through a neural network optimization algorithm.
The scheme is divided into two processes, network forward propagation and error back propagation, and classifies the network structure into a feedforward structure and a recurrent structure, as shown in FIG. 1. Wherein:
in the feedforward structure, only the input neurons connect to the neurons inside the network layer (the feedforward layer);
in the recurrent structure, synapses also exist from internal neurons to other internal neurons.
It is noted that although the feedforward structure is a special case of the recurrent structure, the distinction made in the algorithm helps accelerate the operations relating to the feedforward structure.
For the LIF (Leaky Integrate-and-Fire) neuron dynamical model in a feedforward network structure, the forward process follows the equations:

u_i^t = τ·u_i^{t-1}·f_r(s_i^{t-1}) + Σ_j w_ij·x_j^t + b_i
s_i^t = f_s(u_i^t)

In the above formulas, u_i^t represents the membrane potential of the i-th neuron in the feedforward layer at time t; x_j^t represents the input pulse of the j-th neuron at time t, taking values x ∈ {0, 1}, where 0 denotes no input pulse and 1 an input pulse; s_i^t represents the output pulse of the i-th neuron in the feedforward layer at time t, likewise taking values s ∈ {0, 1}; w_ij represents the synaptic weight from input neuron j to feedforward-layer neuron i, a value of 0 in w indicating the absence of that synapse; b_i represents the bias of the neuron, a value of 0 in b indicating no bias; τ represents the leakage constant of the LIF model, i.e. the rate at which the membrane potential decays per unit time (Δt = 1); U_th represents the pulse firing threshold of the neuron, which emits a pulse when its membrane potential exceeds the threshold;

f_r denotes the reset function of the membrane potential, which drives the potential back to the resting potential (0) after a pulse is emitted; f_s denotes the firing function, which controls whether the neuron emits a pulse:

f_r(s) = 1 - s,  f_s(u) = 1 if u ≥ U_th, 0 otherwise.
the above formula describes the dynamics of the standard LIF model, and other neuron model variants can realize corresponding processes by adjusting the above equation. For example, IF model is a special case of LIF model with τ ═ 1. Thus, the methods provided by the present invention are applicable to, but not limited to, LIF models, and also include a series of derived model variants.
The above process can be significantly accelerated by matrix operations.
The matrix-operation form of the algorithm accepts the input pulses x_j^t in matrix form Xm×nin×T, the dimensions of each matrix being indicated in its subscript; m is the batch size, n_in the number of input neurons, and T the number of time steps of the algorithm; the membrane potentials u and the neuronal pulses s are computed and stored in matrix form Um×n×T and Sm×n×T, and Sm×n×T is passed on as this layer's output to serve as the next layer's input; Wnin×n is the synaptic weight matrix and B1×n a one-dimensional bias matrix. The specific matrix algorithm of the forward process is as follows:

for t=0~T-1 do:
  U(:,:,t)m×n×1 = τ·U(:,:,t-1)m×n×1 ⊙ (1-S(:,:,t-1)m×n×1) + X(:,:,t)m×nin×1·Wnin×n + B1×n
  S(:,:,t)m×n×1 = U(:,:,t)m×n×1 ≥ Uth

The above process is an iterative calculation over time, where ⊙ is the Hadamard product, denoting element-wise multiplication between matrices, and U(:,:,t) denotes a slicing operation on the matrix. Through these equations, the forward process of the feedforward layer can be calculated and accelerated.
The forward process of a recurrent network structure follows the equations:

u_i^t = τ·u_i^{t-1}·f_r(s_i^{t-1}) + Σ_j w_ij·x_j^t + Σ_k w_ik·s_k^{t-1} + b_i
s_i^t = f_s(u_i^t)

These equations differ from the feedforward layer in that synaptic connections exist between the neurons inside the layer, with synaptic weights w_ik. They show that the state of a neuron depends not only on its input pulses and its own historical state, but also on the pulse firing of the other neurons in the same layer. Because connections exist within the network layer, it is called a recurrent layer. In addition, as the formulas show, the feedforward layer is the special case of the recurrent layer with w_ik = 0; in the actual calculation, distinguishing the two layers gives the feedforward layer a more concise computation.
The above process can still benefit from matrix operations, as follows:

for t=0~T-1 do:
  U(:,:,t)m×n×1 = τ·U(:,:,t-1)m×n×1 ⊙ (1-S(:,:,t-1)m×n×1) + [X(:,:,t)m×nin×1 | S(:,:,t-1)m×n×1]·W(nin+n)×n + B1×n
  S(:,:,t)m×n×1 = U(:,:,t)m×n×1 ≥ Uth

In the above, [·|·] denotes the merging of two matrices, and W(nin+n)×n merges the synaptic weight matrix from the input neurons to the recurrent-layer neurons with the synaptic weight matrix inside the recurrent layer. The above is a matrix calculation iterated over time, through which the forward process of the recurrent layer can be calculated and accelerated.
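The recurrent-layer iteration can be sketched in NumPy as follows. The patent merges the input and intra-layer weights into one matrix [W_in | W_rec]; they are kept separate here for clarity (names and sizes are illustrative):

```python
import numpy as np

def recurrent_layer(X, W_in, W_rec, B, tau=0.5, u_th=1.0):
    # X: (m, n_in, T) input pulses; W_in: (n_in, n); W_rec: (n, n); B: (n,).
    # Each neuron also receives the previous step's pulses of its own layer
    # through W_rec, as in the recurrent forward equations above.
    m, _, T = X.shape
    n = W_in.shape[1]
    U = np.zeros((m, n, T))
    S = np.zeros((m, n, T))
    for t in range(T):
        if t > 0:
            leak = tau * U[:, :, t - 1] * (1 - S[:, :, t - 1])
            rec = S[:, :, t - 1] @ W_rec      # intra-layer recurrence
        else:
            leak = rec = 0.0
        U[:, :, t] = leak + X[:, :, t] @ W_in + rec + B
        S[:, :, t] = (U[:, :, t] >= u_th).astype(float)
    return U, S
```

With W_rec = 0 this reduces exactly to the feedforward layer, matching the special-case remark above.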
Through the forward-propagation process shown in FIG. 5, the state of the spiking neural network is calculated layer by layer and time step by time step, yielding the pulse outputs.
The inference process of the spiking neural network is consistent with the above-described process.
The spiking neural network evaluates how well it has learned the task through a loss function, which is divided into two parts: the loss of the classification task and a sparse regularization term.
Taking the classification task as an example, the available loss functions include, but are not limited to, the squared loss function, the point-wise squared loss function, and the Softmax cross-entropy loss function, all in pulse form, specifically:

(1) the squared loss function:

L = 1/2·Σ_{i∈output} ((1/T)·Σ_t s_i^t - Y_i)² + λ·Σ_{i,t} s_i^t

where Y is the class label (label) and i belongs to the output layer of the spiking neural network, i.e. the classification loss is computed only on the output layer; λ is the regularization coefficient, and the regularization term penalizes frequent pulse firing, thereby forcing the pulses to be sparse and encouraging the neural network to express information through more efficient pulse firing.

The partial derivative of the loss function with respect to the output-layer pulses is:

∂L/∂s_i^t = (1/T)·((1/T)·Σ_{t'} s_i^{t'} - Y_i) + λ

Although s_i^t is itself a function of u_i^t, and therefore related to the states at earlier times, this temporal dependence is handled by the back-propagation process and is not considered here.
The error of the output layer is obtained by the above formula, and the error is transmitted to other layers through a back propagation process.
(2) the point-wise squared loss function:

L = 1/2·Σ_{i∈output} Σ_t (s_i^t - Y_i^t)² + λ·Σ_{i,t} s_i^t

The partial derivative of the loss function with respect to the output layer is:

∂L/∂s_i^t = s_i^t - Y_i^t + λ
(3) the Softmax cross-entropy loss function:

L = -Σ_{i∈output} Y_i·log p_i + λ·Σ_{i,t} s_i^t,  with p_i = exp(r_i)/Σ_k exp(r_k) and r_i = (1/T)·Σ_t s_i^t

The partial derivative of the loss function with respect to the output layer is:

∂L/∂s_i^t = (1/T)·(p_i - Y_i) + λ
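The squared loss on output firing rates together with the sparse penalty, as in loss (1) above, can be sketched as follows. The exact regularizer form is not fully legible in the text; an L1 penalty counting every emitted pulse of every layer is assumed here:

```python
import numpy as np

def squared_loss_with_sparsity(S_out, Y, S_all, lam=1e-4):
    # S_out: (m, n_out, T) output pulses; Y: (m, n_out) one-hot labels;
    # S_all: list of the pulse tensors of all layers, each pulse penalized.
    rate = S_out.mean(axis=2)                    # per-neuron firing rate
    task = 0.5 * np.sum((rate - Y) ** 2)         # squared loss on the rates
    reg = lam * sum(s.sum() for s in S_all)      # sparse regularization term
    return task + reg
```

Raising lam pushes the network toward sparser firing, which is the mechanism FIG. 4 evaluates.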
the propagation path of the error back propagation process is opposite to that of the forward process.
For the feedforward layer, the computation graph between the state quantities and parameters of the forward process is shown in FIG. 2, the arrows representing computation paths between states and parameters. Back propagation reverses the paths of this computation graph and solves the partial derivative of the loss function along each forward path; the formula for the corresponding partial-derivative path in the graph is:

∂L/∂u_i^t = (∂L/∂s_i^t)·(∂s_i^t/∂u_i^t) + (∂L/∂u_i^{t+1})·(∂u_i^{t+1}/∂u_i^t)

To establish the back-propagation path, the derivatives of the non-differentiable functions f_s and f_r are calculated approximately, for example with a rectangular window:

∂f_s/∂u ≈ (1/a)·1(|u - U_th| < a/2),  ∂f_r/∂s ≈ -b

a and b are two parameters that control the shape of the approximate derivatives; in general, a = 1 and b = 1 may be taken.
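The surrogate derivative of the firing function can be sketched as follows. The patent's exact surrogate expression is not legible in this text, so a rectangular window of width a around the threshold, a common choice, is assumed:

```python
def surrogate_spike_grad(u, u_th=1.0, a=1.0):
    # Approximate derivative of the firing function f_s(u) = 1[u >= u_th]:
    # a rectangular window of width a centered at the threshold (assumed
    # form; the patent's formula with parameters a, b is not shown here).
    return float(abs(u - u_th) < a / 2) / a
```

During back propagation this function replaces the true (zero-almost-everywhere) derivative of the step function, letting gradients flow through the firing events.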
The partial derivative ∂L/∂s of the loss function with respect to the output layer being known, ∂L/∂u can be obtained through the partial-derivative path:

∂L/∂u_i^t = (∂L/∂s_i^t)·(∂s_i^t/∂u_i^t) + (∂L/∂u_i^{t+1})·(∂u_i^{t+1}/∂u_i^t)

As shown in FIG. 2, the above formula is an iteration backward in time: by iterative calculation the algorithm folds the errors of all times after t into ∂L/∂u_i^t (accounting for the temporal dependence introduced by the reset function), and since the complete error-propagation path is covered by this iteration, lengthy path enumeration is avoided and the calculation is more concise.
Substituting all the formulas into the above expression gives:

∂L/∂u_i^t = (∂L/∂s_i^t)·g'(u_i^t) + (∂L/∂u_i^{t+1})·τ·(1 - s_i^t + α·u_i^t·s_i^t·g'(u_i^t))

where g' denotes the approximate derivative of f_s and α is a constant determined by the approximate derivative of the reset function. Since u_i^t is a function of w_ij and b_i, the partial derivatives of the error function with respect to w_ij and b_i can further be obtained:

∂L/∂w_ij = Σ_t (∂L/∂u_i^t)·x_j^t,  ∂L/∂b_i = Σ_t ∂L/∂u_i^t
∂L/∂w and ∂L/∂b are the gradients of the parameters of the spiking neural network, with which the network parameters can be updated through optimization algorithms such as SGD, Adam, and AdamW for the learning and training of the network. The partial derivative ∂L/∂x with respect to the input pulses serves as the error function passed on to the previous layer, so the error can be propagated back through the entire network layer by layer. It should be noted that the error of each non-output layer needs to be corrected by the sparse regularization term:

∂L/∂s_i^t ← ∂L/∂s_i^t + λ
the back propagation calculation process can simplify the calculation process through matrix operation and fully utilize calculation resources. Accepted input quantity of algorithmIs composed ofA matrix representation of (a); output ofIs composed ofIn the form of a matrix; and updateAndis composed ofAndthe corresponding matrix of (a). The specific algorithm process is as follows:
ΔUm×n×T = ΔSm×n×T ⊙ G′m×n×T
U′m×n×T = τ·(1-Sm×n×T + α·Um×n×T⊙Sm×n×T⊙G′m×n×T)
for t=T-2~0 do:
ΔU(:,:,t)m×n×1+=ΔU(:,:,t+1)m×n×1⊙U′(:,:,t)m×n×1
where sum (-) represents the summation of the matrix over dimension axis.
By optimizing away the time dependence of the intermediate quantities, the algorithm markedly speeds up back propagation, and thus benefits significantly from matrix-operation libraries designed for large-scale matrix computation (such as Intel's MKL on the CPU) and from computing devices such as GPUs.
For the back-propagation process of the recurrent layer, whose computation graph is shown in FIG. 3, the difference from the feedforward layer is that neurons in the recurrent layer connect to other neurons in the same layer, so an additional error-propagation path exists:

∂u_k^{t+1}/∂s_i^t = w_ki
thus, errors are also propagated through the membrane potential of a neuron in the layer to the impulses of other neurons in the layer. The derivative of the error function to membrane potential is calculated as:
the above formula is still an inversion with respect to time, and error propagation at all times is accounted for by iterative calculationsIn (1). Substituting into all formulas to obtainIs calculated as:
the other layers except the output layer still need to be corrected by a sparse regular term when error is propagated layer by layer:
the above-described correlation calculation process can still be accelerated by matrix operations. An algorithm receives inputOutput ofAnd performing (a) onAndthe gradient of (2) is updated.
It should be noted that the above-mentioned materials,combining input neurons into neurons in layers and synaptic weight gradients between layers, the matrix operation of the above process can be expressed as follows:
U′_{m×n×T} = τ·(1 − S_{m×n×T} + α·U_{m×n×T} ⊙ S_{m×n×T} ⊙ G′_{m×n×T})
for t = T−2~0 do:
    ΔU(:,:,t)_{m×n×1} += ΔU(:,:,t+1)_{m×n×1} ⊙ U′(:,:,t)_{m×n×1} + ΔU(:,:,t+1)_{m×n×1} ⊙ W(:,−n:)_{1×n×n} ⊙ G′(:,:,t)_{m×n×1}
sum(·, axis=[0,3])
The calculation of the intermediate quantity is likewise optimized for the back propagation of the recurrent layer, and the algorithm achieves a significant acceleration on devices such as CPUs and GPUs.
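A corresponding sketch for the recurrent layer adds the intra-layer error path to the loop body above. Treating that extra term as a matrix product with an intra-layer weight block W_rec is an assumption on my part, since the broadcast form of the ⊙ notation in the formula is ambiguous:

```python
import numpy as np

def recurrent_backward(dU, Uprime, Gp, W_rec):
    """Time-reversed error accumulation for a recurrent layer.

    dU, Uprime, Gp: (m, n, T) arrays; W_rec: (n, n) intra-layer weights.
    """
    T = dU.shape[2]
    for t in range(T - 2, -1, -1):
        # Error arriving through intra-layer synapses, scaled by the
        # surrogate spike derivative (assumed matrix-product form).
        rec = (dU[:, :, t + 1] @ W_rec.T) * Gp[:, :, t]
        dU[:, :, t] += dU[:, :, t + 1] * Uprime[:, :, t] + rec
    return dU
```

With W_rec set to zero this reduces exactly to the feed-forward accumulation, which matches the text's statement that the recurrent layer only adds an extra error path.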
Through the error back propagation process shown in fig. 5, the gradients of the parameters of the impulse neural network are calculated layer by layer in reverse time order. In the network training process, the parameters may then be updated by a network optimization algorithm such as SGD, Adam or AdamW.
Taking SGD as an example, the parameter update satisfies the following equations:
w ← w − α·(∂L/∂w)
b ← b − α·(∂L/∂b)
where w and b represent all synaptic weights and biases in the network, whose gradients have been calculated by the above process, and α is the learning-rate constant of the training process (a variable under a dynamic learning-rate strategy).
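A minimal sketch of the SGD rule just described, assuming the parameters and their precomputed gradients are held in plain dictionaries (the layout and function name are illustrative):

```python
def sgd_step(params, grads, lr):
    """Plain SGD: p <- p - lr * gradient, applied to every parameter."""
    for name in params:
        params[name] = params[name] - lr * grads[name]
    return params
```

A dynamic learning-rate strategy, as the text notes, would simply vary lr between calls.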
On the one hand, the method provides a more flexible back propagation training method for impulse neural network structures; on the other hand, it balances the accuracy and the pulse firing rate of the impulse neural network.
As shown in fig. 4, by adjusting the sparse regularization term, the pulse firing rate can be greatly reduced at the cost of only a small loss in accuracy. On pulse (event)-driven hardware platforms, such as dedicated spiking neural network hardware or brain-inspired computing accelerators, a lower firing rate incurs a smaller computational overhead, so the energy efficiency of the spiking neural network is improved while high accuracy is maintained.
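The trade-off can be made concrete with a small sketch, assuming the sparse regularization term penalizes the mean firing rate of the spike tensor and that λ is a tunable weight (both assumptions for illustration):

```python
import numpy as np

def regularized_loss(task_loss, spikes, lam):
    """Task loss plus a sparse regularization term on the firing rate.

    spikes: (m, n, T) binary spike tensor; lam: trade-off weight.
    """
    # Mean firing rate: fraction of (sample, neuron, step) entries that fired.
    firing_rate = spikes.mean()
    return task_loss + lam * firing_rate
```

Increasing lam pushes the optimizer toward sparser spiking, trading a small accuracy loss for lower event counts, as fig. 4 illustrates.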
The above description is only for the specific embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (10)
1. A graph representation spatio-temporal back propagation algorithm for an impulse neural network, characterized by:
obtaining a spiking neural network through network forward propagation of neurons in a network structure;
evaluating the error of the impulse neural network on the task through a loss function;
training the impulse neural network through error back propagation;
and completing parameter updating in the training process through a neural network optimization algorithm.
2. A graph representation spatiotemporal back propagation algorithm for spiking neural networks according to claim 1, characterized by the network forward propagation of the neurons in the network structure, comprising the network forward propagation of the neurons in a feed-forward network structure and the network forward propagation of the neurons in a recurrent network structure.
3. A graph representation spatiotemporal back propagation algorithm for an impulse neural network as claimed in claim 2, wherein in said feed forward network structure the forward course of said neurons is as follows:
in the formula, u_i^t represents the membrane potential of the ith neuron in the feedforward layer at time t; x_j^t represents the input pulse of the jth neuron at time t, taking a value x ∈ {0, 1}, where 0 denotes no input pulse and 1 denotes an input pulse; s_i^t represents the output pulse of the ith neuron in the feedforward layer at time t, likewise taking a value s ∈ {0, 1} to indicate the absence or presence of an output pulse; w_ij represents the synaptic weight from input neuron j to feedforward-layer neuron i, a value of 0 in w indicating the absence of that synapse; b_i represents the bias of a neuron, a value of 0 in b indicating that the neuron has no bias; τ represents the leakage constant of the LIF model, i.e. the rate of decrease of the membrane potential per unit time (Δt = 1); U_th represents the pulse firing threshold of a neuron, which fires a pulse when its membrane potential exceeds the threshold;
the reset function of the membrane potential controls the membrane potential to drop to the resting potential (0 potential) after a pulse is fired; the firing function controls whether the neuron fires a pulse, with the specific function values as follows:
4. A graph representation spatiotemporal back propagation algorithm for an impulse neural network as claimed in claim 3, wherein in said feed-forward network structure the forward process of said neurons can also be significantly accelerated by matrix operations as follows:
for t = 0~T−1 do:
    S(:,:,t)_{m×n×1} = U(:,:,t)_{m×n×1} ≥ U_th
in the formula, the matrix form of the input pulses is X_{m×n_in×T}, with the matrix dimensions indicated in the subscripts: m denotes the batch size, n_in denotes the number of input neurons, and T denotes the number of time steps of the algorithm; the membrane potentials and neuron pulses are calculated and stored in matrix form as U_{m×n×T} and S_{m×n×T}, and S_{m×n×T} is then passed on as the output of the layer to serve as the input of the next layer; W represents the synaptic weight matrix and b a one-dimensional bias matrix; ⊙ is the Hadamard product, representing element-by-element multiplication between matrices; U(:,:,t) represents a slicing operation on the matrix.
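A sketch of the looped forward pass of claim 4, assuming a standard discrete LIF update with hard reset to the resting potential; the exact update equation of the claim is not reproduced in this text, so the update rule here is an assumption:

```python
import numpy as np

def lif_forward(X, W, b, tau, U_th):
    """Vectorized LIF forward pass for a feed-forward layer.

    X: (m, n_in, T) input spikes; W: (n_in, n) weights; b: (n,) bias.
    Returns membrane potentials U and output spikes S, both (m, n, T).
    """
    m, n_in, T = X.shape
    n = W.shape[1]
    U = np.zeros((m, n, T))
    S = np.zeros((m, n, T))
    u = np.zeros((m, n))                   # membrane potential, starts at rest
    for t in range(T):
        u = tau * u + X[:, :, t] @ W + b   # leak plus synaptic input
        s = (u >= U_th).astype(float)      # fire when threshold is reached
        U[:, :, t], S[:, :, t] = u, s      # store matrix forms per the claim
        u = u * (1.0 - s)                  # hard reset to resting potential 0
    return U, S
```

S would then be passed on as the layer's output, as the claim describes.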
5. A graph representation spatiotemporal back propagation algorithm for an impulse neural network as claimed in claim 2, wherein in said recurrent network structure the forward process of said neurons is as follows:
in the formula, u_i^t represents the membrane potential of the ith neuron in the recurrent layer at time t; x_j^t represents the input pulse of the jth neuron at time t, taking a value x ∈ {0, 1}, where 0 denotes no input pulse and 1 denotes an input pulse; s_i^t represents the output pulse of the ith neuron in the recurrent layer at time t, likewise taking a value s ∈ {0, 1} to indicate the absence or presence of an output pulse; w_ij represents the synaptic weight from input neuron j to recurrent-layer neuron i, a value of 0 in w indicating the absence of that synapse; b_i represents the bias of a neuron, a value of 0 in b indicating that the neuron has no bias; τ represents the leakage constant of the LIF model, i.e. the rate of decrease of the membrane potential per unit time (Δt = 1); U_th represents the pulse firing threshold of a neuron, which fires a pulse when its membrane potential exceeds the threshold; w_ik represents the synaptic weights between neurons within the layer;
the reset function of the membrane potential controls the membrane potential to drop to the resting potential (0 potential) after a pulse is fired; the firing function controls whether the neuron fires a pulse, with the specific function values as follows:
6. A graph representation spatiotemporal back propagation algorithm for spiking neural networks according to claim 5, characterized in that the forward process of the neurons in the recurrent network structure is also significantly accelerated by matrix operations as follows:
for t = 0~T−1 do:
    S(:,:,t)_{m×n×1} = U(:,:,t)_{m×n×1} ≥ U_th
in the formula, the matrix form of the input pulses is X_{m×n_in×T}, with the matrix dimensions indicated in the subscripts: m denotes the batch size, n_in denotes the number of input neurons, and T denotes the number of time steps of the algorithm; the membrane potentials and neuron pulses are calculated and stored in matrix form as U_{m×n×T} and S_{m×n×T}, and S_{m×n×T} is then passed on as the output of the layer to serve as the input of the next layer; W represents the synaptic weight matrix and b a one-dimensional bias matrix; ⊙ is the Hadamard product, representing element-by-element multiplication between matrices; U(:,:,t) represents a slicing operation on the matrix; [·|·] represents the merging of two matrices, whereby the synaptic weight matrix from the input neurons to the recurrent-layer neurons and the synaptic weight matrix inside the recurrent layer are merged.
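The merged-matrix idea of claim 6 can be sketched as a single time step: the input spikes and the layer's previous output spikes are concatenated and multiplied by the stacked weight matrix in one product (the names and the hard-reset update rule are illustrative assumptions):

```python
import numpy as np

def recurrent_lif_step(x_t, s_prev, u, W_merged, b, tau, U_th):
    """One recurrent LIF step using the merged weight matrix.

    W_merged stacks input-to-layer weights (n_in rows) on top of the
    intra-layer weights (n rows); x_t is (m, n_in), s_prev and u are (m, n).
    """
    inp = np.concatenate([x_t, s_prev], axis=1)  # (m, n_in + n), the [·|·] merge
    u = tau * u + inp @ W_merged + b             # one product covers both paths
    s = (u >= U_th).astype(float)                # fire at threshold
    u_reset = u * (1.0 - s)                      # hard reset after firing
    return u, s, u_reset
```

The merge replaces two separate matrix products (inputs and intra-layer spikes) with one larger product, which is what makes the recurrent forward pass as vectorizable as the feed-forward one.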
7. A graph representation spatio-temporal back propagation algorithm for impulse neural networks according to claim 1, characterized in that the loss function includes but is not limited to a squared loss function, an exponential loss function or a cross entropy loss function in the form of impulses.
8. The graph representation spatiotemporal back propagation algorithm for an impulse neural network of claim 7, wherein sparse regularization is further added to the loss function to reduce the pulse firing rate of the impulse neural network.
9. A graph representation spatio-temporal back propagation algorithm for an impulse neural network as claimed in claim 1, characterized in that the neural network optimization algorithm includes but is not limited to batch gradient descent, stochastic gradient descent, momentum, AdaGrad, Adam or AdamW.
10. A graph representation spatiotemporal back propagation algorithm for an impulse neural network as claimed in claim 1, wherein the neuron behavior in said impulse neural network follows LIF neuron dynamical models and their corresponding variants.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110548714.6A CN113298231A (en) | 2021-05-19 | 2021-05-19 | Graph representation space-time back propagation algorithm for impulse neural network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113298231A true CN113298231A (en) | 2021-08-24 |
Family
ID=77322896
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110548714.6A Pending CN113298231A (en) | 2021-05-19 | 2021-05-19 | Graph representation space-time back propagation algorithm for impulse neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113298231A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113723594A (en) * | 2021-08-31 | 2021-11-30 | 绍兴市北大信息技术科创中心 | Impulse neural network target identification method |
CN113723594B (en) * | 2021-08-31 | 2023-12-05 | 绍兴市北大信息技术科创中心 | Pulse neural network target identification method |
CN113792857A (en) * | 2021-09-10 | 2021-12-14 | 中国人民解放军军事科学院战争研究院 | Impulse neural network training method based on membrane potential self-increment mechanism |
CN113792857B (en) * | 2021-09-10 | 2023-10-20 | 中国人民解放军军事科学院战争研究院 | Pulse neural network training method based on membrane potential self-increasing mechanism |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20210824 |