CN113298231A - Graph representation space-time back propagation algorithm for impulse neural network - Google Patents

Graph representation space-time back propagation algorithm for impulse neural network Download PDF

Info

Publication number
CN113298231A
CN113298231A CN202110548714.6A CN202110548714A CN113298231A CN 113298231 A CN113298231 A CN 113298231A CN 202110548714 A CN202110548714 A CN 202110548714A CN 113298231 A CN113298231 A CN 113298231A
Authority
CN
China
Prior art keywords
neuron
neural network
representing
pulse
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110548714.6A
Other languages
Chinese (zh)
Inventor
闫钰龙
褚皓明
环宇翔
梁龙飞
邹卓
郑立荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai New Helium Brain Intelligence Technology Co ltd
Fudan University
Original Assignee
Shanghai New Helium Brain Intelligence Technology Co ltd
Fudan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai New Helium Brain Intelligence Technology Co ltd, Fudan University filed Critical Shanghai New Helium Brain Intelligence Technology Co ltd
Priority to CN202110548714.6A priority Critical patent/CN113298231A/en
Publication of CN113298231A publication Critical patent/CN113298231A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/061Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using biological neurons, e.g. biological neurons connected to an integrated circuit
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Neurology (AREA)
  • Microelectronics & Electronic Packaging (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of an impulse neural network, in particular to a graph representation space-time back propagation algorithm for the impulse neural network, which obtains the impulse neural network through network forward propagation of neurons in a network structure; evaluating the error of the impulse neural network on the task through a loss function; training the impulse neural network through error back propagation; and completing parameter updating in the training process through a neural network optimization algorithm. The invention improves the accuracy of the impulse neural network through error back propagation, reduces the pulse release rate through sparse regularization so as to improve the energy efficiency under impulse (event) driving calculation, and is suitable for the training process of various bionic network structures through a graph representation method.

Description

Graph representation space-time back propagation algorithm for impulse neural network
Technical Field
The invention relates to the technical field of impulse neural networks, in particular to a graph representation space-time back propagation algorithm for an impulse neural network.
Background
In recent years, an Artificial Neural Network (ANN) inspired by a biological nervous system has been rapidly developed and greatly advanced, and is widely used in the fields of object detection, face recognition, automatic driving, voice recognition, translation, and the like. However, the traditional ANN still lacks reliable simulation of neuron behaviors and the structure of a nervous system, so that the ANN has a certain gap from organisms on intelligent tasks such as reasoning, decision making and the like, and the energy efficiency is far less efficient than that of the biological brain.
The impulse neural network (SNN) is known as a third generation artificial neural network. SNN has great potential to process signals rich in temporal-spatial domain features due to its simulation of complex neuronal dynamics and various structural designs inspired by the functional regions of the biological nervous system. Since SNNs transfer information between neurons through impulses as in biological neural systems, neurons may not perform extensive computations when they do not receive impulses to maintain a low resting energy overhead. This pulse (event) driven computational feature helps SNN achieve higher energy efficiency.
As one of the ANN's, the SNN also needs to be trained to suit the assigned task. Existing training algorithms include three classes, conversion-based algorithms, synaptic plasticity algorithms, and back propagation algorithms. The conversion-based algorithm converts the parameters of the conventional ANN into the SNN with the same structure, but because the ANN for transmitting information by floating point number and the SNN for transmitting information by pulse cannot be completely matched, the SNN after the parameter conversion has information loss, and the network accuracy is reduced. While the transform-based approach still limits the structure of SNNs to traditional ANN structures, there is a lack of further modeling of biological nervous system structures. The synaptic plasticity algorithm is a physiological phenomenon-based training algorithm that adjusts synaptic weights, i.e., parameters of SNNs, by the timing of pulses before and after synapses by neurons. The synaptic plasticity algorithm is suitable for various network structures and only needs less calculation in the learning process. The traditional synaptic plasticity algorithm is suitable for unsupervised learning, so that the SNN performance is limited to a certain extent. The emerging improved synaptic plasticity algorithm adds a global reward signal to the synaptic weight adjustment for modulation so as to realize a certain degree of supervised learning, but is still lower than the back propagation algorithm. Because the back propagation algorithm makes the network parameter be adjusted accurately through the back propagation of the error, and the higher network performance is improved. At present, the back propagation algorithm suitable for the SNN is used for propagating errors to each network parameter by constructing various guided back propagation paths or approximately substituting back propagation links, and updating and adjusting the parameters in modes of gradient descent and the like. However, similar to the transformation-based method, the existing back propagation algorithm is only suitable for the feedforward structure similar to the traditional ANN, and lacks support for various complex networks simulating biological nervous system structures. Meanwhile, most of the algorithms focus on network accuracy, and the SNN energy efficiency is not explored, so that the advantages of the SNN are not fully utilized.
Disclosure of Invention
The present invention is directed to overcome the problems of the prior art, and provides a graph representation space-time back propagation algorithm for an impulse neural network, so as to solve the training problems of various impulse neural networks with complex structures, and to fully utilize the impulse (event) driving characteristics of the impulse neural network to achieve higher energy efficiency.
The above purpose is realized by the following technical scheme:
a graph representation spatiotemporal back propagation algorithm for an impulse neural network, comprising:
obtaining a spiking neural network through network forward propagation of neurons in a network structure;
evaluating the error of the impulse neural network on the task through a loss function;
training the impulse neural network through error back propagation;
and completing parameter updating in the training process through a neural network optimization algorithm.
Further, the network forward propagation of the neuron in the network structure includes the network forward propagation of the neuron in a feed-forward network structure and the network forward propagation of the neuron in a recurrent network structure.
Further, in the feedforward network structure, the forward process of the neuron is as follows:
Figure BDA0003072896970000031
Figure BDA0003072896970000032
in the formula (I), the compound is shown in the specification,
Figure BDA0003072896970000033
representing the membrane potential of the ith neuron in the feedforward layer at the time t;
Figure BDA0003072896970000034
representing the input pulse of the jth neuron at the time t, wherein the value of the input pulse of the jth neuron meets x ∈ {0, 1}, wherein 0 represents no input pulse, and 1 represents an input pulse;
Figure BDA0003072896970000035
the output pulse of the ith neuron in the feedforward layer at the time t is represented, the value also meets s ∈ {0, 1}, and whether the output pulse exists or not is represented respectively; w is aijRepresents the synaptic weight from input neuron j to feed-forward layer neuron i, with a 0 value in w indicating the absence of the synapse; biRepresents a bias of a neuron, a value of 0 in b indicates that the neuron is not biased; τ represents a leakage constant in the LIF model, and represents a rate of decrease in membrane potential per unit time (Δ t ═ 1); u shapethRepresenting a pulse firing threshold of the neuron, the neuron firing when the membrane potential is greater than the firing thresholdPulse output;
Figure BDA0003072896970000036
representing a reset function of the membrane potential, controlling the membrane potential to drop to a resting potential (0 potential) after the pulse is issued;
Figure BDA0003072896970000037
expressing a release function and controlling whether the neuron releases pulses, wherein the specific function values are as follows:
Figure BDA0003072896970000038
further, in the feedforward network structure, the forward process of the neuron can be significantly accelerated by a matrix operation, as follows:
for t=0~T-1 do:
Figure BDA0003072896970000039
S(:,:,t)m×n×1=U(:,:,t)m×n×1≥Uth
in the formula (I), the compound is shown in the specification,
Figure BDA00030728969700000310
is an input pulse
Figure BDA00030728969700000312
In the form of a matrix, the dimensions of the matrix are indicated in the corner marks, m denotes the batch size (batch size), ninRepresenting the number of input neurons, and T representing the time step of algorithm operation; calculating and storing the membrane potential
Figure BDA00030728969700000311
And neuronal impulses
Figure BDA00030728969700000313
Is in matrix form Um×n×TAnd Sm×n×TAnd then S ism×n×TPassing as the output of that layer as the input of the next layer;
Figure BDA0003072896970000041
a matrix of synaptic weights is represented, and,
Figure BDA0003072896970000042
a bias matrix representing one dimension; an as hadamard product, representing a element-by-element multiplication between matrices; u (: t) represents the slicing operation of the matrix.
Further, in the cyclic network structure, the forward process of the neuron is as follows:
Figure BDA0003072896970000043
Figure BDA0003072896970000044
in the formula (I), the compound is shown in the specification,
Figure BDA0003072896970000045
representing the membrane potential of the ith neuron in the feedforward layer at the time t;
Figure BDA0003072896970000046
representing the input pulse of the jth neuron at the time t, wherein the value of the input pulse of the jth neuron meets x ∈ {0, 1}, wherein 0 represents no input pulse, and 1 represents an input pulse;
Figure BDA0003072896970000047
the output pulse of the ith neuron in the feedforward layer at the time t is represented, the value also meets s ∈ {0, 1}, and whether the output pulse exists or not is represented respectively; w is aijRepresents the synaptic weight from input neuron j to feed-forward layer neuron i, with a 0 value in w indicating the absence of the synapse; biRepresents a bias of a neuron, a value of 0 in b indicates that the neuron is not biased; τ denotes the leakage constant in the LIF modelThe ratio of (Δ t ═ 1) decrease in membrane potential per unit time is expressed; u shapethA pulse firing threshold representing a neuron that fires a pulse when the membrane potential is greater than the firing threshold; w is aikRepresenting synaptic weights between neurons within the layer;
Figure BDA0003072896970000048
representing a reset function of the membrane potential, controlling the membrane potential to drop to a resting potential (0 potential) after the pulse is issued;
Figure BDA0003072896970000049
expressing a release function and controlling whether the neuron releases pulses, wherein the specific function values are as follows:
Figure BDA00030728969700000410
further, in the cyclic network structure, the forward process of the neuron can be significantly accelerated by a matrix operation, as follows:
for t=0~T-1 do:
Figure BDA0003072896970000051
S(:,:,t)m×n×1=U(:,:,t)m×n×1≥Uth
in the formula (I), the compound is shown in the specification,
Figure BDA0003072896970000052
is an input pulse
Figure BDA0003072896970000053
In the form of a matrix, the dimensions of the matrix are indicated in the corner marks, m denotes the batch size (batch size), ninRepresenting the number of input neurons, and T representing the time step of algorithm operation; calculating and storing the membrane potential
Figure BDA0003072896970000057
And neuronal impulses
Figure BDA0003072896970000058
Is in matrix form Um×n×TAnd Sm×n×TAnd then S ism×n×TPassing as the output of that layer as the input of the next layer;
Figure BDA0003072896970000054
a matrix of synaptic weights is represented, and,
Figure BDA0003072896970000055
a bias matrix representing one dimension; an as hadamard product, representing a element-by-element multiplication between matrices; u (: t) represents the slicing operation of the matrix; [ | · C]Which represents the combination of the two matrices,
Figure BDA0003072896970000056
the synaptic weight matrix of the input neuron to the recurrent layer neuron and the synaptic weight matrix inside the recurrent layer are merged.
Further, the loss function includes, but is not limited to, a squared loss function, an exponential loss function, or a cross-entropy loss function in the form of a pulse.
Furthermore, sparse regularization is added into the loss function to reduce the pulse release rate of the impulse neural network.
Further, the neural network optimization algorithm includes, but is not limited to, bulk gradient descent, random gradient descent, momentum, adarad, Adam, or AdamW.
Further, the behavior of the neurons in the spiking neural network follows the LIF neuron dynamical model and its corresponding variants.
Advantageous effects
The graph representation space-time back propagation algorithm for the impulse neural network improves the accuracy rate of the impulse neural network through error back propagation, reduces the pulse emitting rate through sparse regularization so as to improve the energy efficiency under pulse (event) driving calculation, and is suitable for the training process of various bionic network structures through the graph representation method.
Drawings
FIG. 1 is a schematic diagram of network layers showing a feedforward structure and a circular structure in a spatiotemporal back propagation algorithm for an impulse neural network according to the present invention;
FIG. 2 is a diagram of forward and backward propagation processes of a spatio-temporal back propagation algorithm in a feedforward network structure according to the present invention;
FIG. 3 is a schematic diagram of a graph representing forward and backward propagation processes of a spatio-temporal back propagation algorithm in a circular network structure for an impulse neural network according to the present invention;
FIG. 4 is a graph representing the results of sparse regularization in a spatiotemporal back propagation algorithm for a spiking neural network according to the present invention;
FIG. 5 is an algorithmic flow diagram representing a spatiotemporal back propagation algorithm for an impulse neural network in accordance with the present invention.
Detailed Description
The invention is explained in more detail below with reference to the figures and examples. The described embodiments are only some embodiments of the invention, not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention provides a graph representation space-time back propagation algorithm for a pulse neural network, aiming at improving the accuracy rate of the pulse neural network and reducing the pulse emitting rate to improve the energy efficiency. Existing impulse neural network learning algorithms are classified into three categories, namely, conversion-based algorithms, synaptic plasticity algorithms and back propagation algorithms, wherein:
the algorithm based on conversion is limited by the traditional network structure, and the impulse neural network for simulating the biological nervous system cannot be realized;
algorithms for synaptic plasticity are limited in network performance (accuracy) due to the nature of their local weight adjustment;
the back propagation achieves better network performance, but the existing back propagation algorithm still lacks support for a more flexible biological nervous system structure and consideration for a pulse delivery rate, so that the energy efficiency of the pulse neural network cannot be fully utilized and explored.
In order to solve the above problems, the present invention can support all network structures and balance the network accuracy rate and the burst rate, and the scheme is as follows:
obtaining a spiking neural network through network forward propagation of neurons in a network structure;
evaluating the error of the impulse neural network on the task through a loss function;
training the impulse neural network through error back propagation;
and completing parameter updating in the training process through a neural network optimization algorithm.
The scheme is divided into two processes of network forward propagation and error backward propagation, and the network structure is classified into a feedforward structure and a loop structure, as shown in fig. 1. Wherein:
the feedforward structure is connected to a feedforward layer of the network layer internal neurons only by the input neurons;
the circulating structure presents synapses where internal neurons connect to internal neurons.
It is noted that although the feedforward structure is a special case of a circular structure, the classification made in the algorithm helps speed up the correlation operation of the feedforward structure.
For LIF neurons (Leaky Integrated-and-Fire) neuron dynamical models in a feedforward network structure), the forward process follows the following equation:
Figure BDA0003072896970000071
Figure BDA0003072896970000072
in the above formula, the first and second carbon atoms are,
Figure BDA0003072896970000073
representing the membrane potential of the ith neuron in the feedforward layer at the time t;
Figure BDA0003072896970000074
representing the input pulse of the jth neuron at the time t, wherein the value of the input pulse of the jth neuron meets x ∈ {0, 1}, wherein 0 represents no input pulse, and 1 represents an input pulse;
Figure BDA0003072896970000075
the output pulse of the ith neuron in the feedforward layer at the time t is represented, the value also meets s ∈ {0, 1}, and whether the output pulse exists or not is represented respectively; w is aijRepresents the synaptic weight from input neuron j to feed-forward layer neuron i, with a 0 value in w indicating the absence of the synapse; biRepresents a bias of a neuron, a value of 0 in b indicates that the neuron is not biased; τ represents a leakage constant in the LIF model, and represents a rate of decrease in membrane potential per unit time (Δ t ═ 1); u shapethA pulse firing threshold representing a neuron that fires a pulse when the membrane potential is greater than the firing threshold;
Figure BDA0003072896970000081
representing a reset function of the membrane potential, controlling the membrane potential to drop to a resting potential (0 potential) after the pulse is issued;
Figure BDA0003072896970000082
expressing a release function and controlling whether the neuron releases pulses, wherein the specific function values are as follows:
Figure BDA0003072896970000083
the above formula describes the dynamics of the standard LIF model, and other neuron model variants can realize corresponding processes by adjusting the above equation. For example, IF model is a special case of LIF model with τ ═ 1. Thus, the methods provided by the present invention are applicable to, but not limited to, LIF models, and also include a series of derived model variants.
The above process can be significantly accelerated by matrix operations.
The matrix operation process of the algorithm accepts input pulses
Figure BDA0003072896970000084
In the form of a matrix
Figure BDA0003072896970000085
The dimensions of the matrix are noted in the corner marks; wherein m is the batch size (batch size), ninInputting the number of neurons, wherein T is the time step length of the operation of the algorithm; calculating and storing the membrane potential
Figure BDA0003072896970000086
And neuronal impulses
Figure BDA0003072896970000087
Is in matrix form Um×n×TAnd Sm×n×TAnd then S ism×n×TPassing as the output of that layer as the input of the next layer;
Figure BDA0003072896970000088
is a matrix of synaptic weights that is,
Figure BDA0003072896970000089
a bias matrix that is one-dimensional; the specific matrix algorithm flow of the forward process is as follows:
for t=0~T-1 do:
Figure BDA00030728969700000810
S(:,:,t)m×n×1=U(:,:,t)m×n×1≥Uth
the above process is an iterative calculation over time, where | _ is a Hadamard product (Hadamard product), indicating element-by-element multiplication between matrices; u (: t) represents the slicing operation of the matrix. By the above equation, the forward process of the feedforward layer can be calculated and accelerated.
For the forward process of a cyclic network structure, it follows the equation:
Figure BDA0003072896970000091
Figure BDA0003072896970000092
the above equation differs from the feedforward layer in that there is a synaptic connection between neurons inside the layer, and the synaptic weight is given by wikRepresents; the above process shows that the state of a neuron depends not only on the input pulse and the historical state of the neuron, but also on the pulse firing of other neurons in the same layer. Connections within a layer exist in the network layer and are therefore called a loop layer; in addition, as can be seen from the above formula, the feedforward layer is a cyclic layer at wikIn a special form under 0, in the actual calculation process, distinguishing the two layers brings a more concise calculation mode for the feedforward layer.
The above process can still benefit from the matrix operation, and the process is as follows:
for t=0~T-1 do:
Figure BDA0003072896970000093
S(:,:,t)m×n×1=U(:,:,t)m×n×1≥Uth
in the above formula [. cndot. ]]Represents the merging of two matrices;
Figure BDA0003072896970000094
the synaptic weight matrix of the input neuron to the recurrent layer neuron and the synaptic weight matrix inside the recurrent layer are merged. The above equation is a matrix calculation that iterates over time, and the forward process of the loop layer can be calculated and added through the above processAnd (4) speed.
Through the forward propagation process shown in fig. 5, the state of the impulse neural network is calculated layer by layer time by time, and the impulse output is also obtained.
The inference process of the spiking neural network is consistent with the above-described process.
The impulse neural network evaluates the learning condition of the tasks through a loss function, and the loss function is divided into two parts of loss of classification tasks and sparse regular terms.
Taking the classification task as an example, the available loss functions include, but are not limited to, a square loss function in the form of a pulse, an exponential loss function, or a Softmax cross-entropy loss function, specifically:
(1) the square loss function:
Figure BDA0003072896970000101
wherein Y is a class label (label), i belongs to an output layer of the impulse neural network, namely, the classification loss is only calculated on the output layer; lambda is a regular coefficient, and a regular term punishs frequent pulse delivery, so that pulses are forced to be sparse, and a neural network is encouraged to express information by more efficient pulse delivery.
The partial derivatives of the loss function for the output layer pulses are as follows:
Figure BDA0003072896970000102
although present
Figure BDA0003072896970000103
Is about
Figure BDA0003072896970000104
A function of, i.e.
Figure BDA0003072896970000105
And
Figure BDA0003072896970000106
related, but the above effects are taken into account
Figure BDA0003072896970000107
Without being considered here.
The error of the output layer is obtained by the above formula, and the error is transmitted to other layers through a back propagation process.
(2) Point-wise squared loss function:
Figure BDA0003072896970000108
the partial derivatives of the loss function to the output layer are:
Figure BDA0003072896970000109
(3) softmax cross entropy loss function:
Figure BDA00030728969700001010
Figure BDA0003072896970000111
the partial derivatives of the loss function to the output layer are:
Figure BDA0003072896970000112
the propagation path of the error back propagation process is opposite to that of the forward process.
For the feedforward layer, a calculation graph between each state quantity and each parameter in the forward process is shown in fig. 2, and an arrow represents a calculation path between the state and the parameter; the backward propagation reverses the path of the computation graph, and the partial derivatives of the loss function of each forward process are solved, wherein the computation formula of the corresponding partial derivative path in the graph is as follows:
Figure BDA0003072896970000113
Figure BDA0003072896970000114
Figure BDA0003072896970000115
to establish a counter-propagating path, a non-derivable function
Figure BDA0003072896970000116
And
Figure BDA0003072896970000117
the corresponding derivative of (a) is approximately calculated:
Figure BDA0003072896970000118
Figure BDA0003072896970000119
a and b are two parameters that control the approximate derivative shape, and in general, a-1 and b-1 may be taken.
Partial derivative of loss function for output layer
Figure BDA00030728969700001110
In the known manner, it is known that,
Figure BDA00030728969700001111
can pass through
Figure BDA00030728969700001112
And the partial deviation path is obtained:
Figure BDA0003072896970000121
as shown in FIG. 2, the above formula is an inversion in time, and the algorithm takes the error of all times after t into account by means of iterative calculation
Figure BDA0003072896970000122
In (taking into account and correcting for errors involved in the calculation of the function)
Figure BDA0003072896970000123
Time sequence dependency) of the error propagation path, while the complete error propagation path is calculated by the iterative calculation mode, the lengthy propagation is avoided, and the calculation process is more concise.
Substituting all formulas into the above formula can obtain:
Figure BDA0003072896970000124
Figure BDA0003072896970000125
is composed of
Figure BDA0003072896970000126
wijAnd biSo that further error function pairs can be found
Figure BDA0003072896970000127
wijAnd biPartial derivatives of (a):
Figure BDA0003072896970000128
Figure BDA0003072896970000129
Figure BDA00030728969700001210
Figure BDA00030728969700001211
and
Figure BDA00030728969700001212
for the gradient of the parameters of the impulse neural network, network parameters can be updated through optimization algorithms such as SGD, Adam, AdamW and the like, and learning training of the network is carried out.
Figure BDA00030728969700001213
Using the partial derivatives of the input pulses as error functions for the next-layer network
Figure BDA00030728969700001214
The error can be propagated back to the entire network layer by layer. It should be noted that the error of the non-output layer needs to be corrected by a sparse regularization term:
Figure BDA0003072896970000131
the back propagation calculation process can simplify the calculation process through matrix operation and fully utilize calculation resources. Accepted input quantity of algorithm
Figure BDA00030728969700001314
Is composed of
Figure BDA0003072896970000132
A matrix representation of (a); output of
Figure BDA0003072896970000133
Is composed of
Figure BDA0003072896970000134
In the form of a matrix; and update
Figure BDA0003072896970000135
And
Figure BDA0003072896970000136
is composed of
Figure BDA0003072896970000137
And
Figure BDA0003072896970000138
the corresponding matrix of (a). The specific algorithm process is as follows:
Figure BDA0003072896970000139
U′m×n×T=τ·(1-Sm×n×T+α·Um×n×T⊙Sm×n×T⊙G′m×n×T)
Figure BDA00030728969700001315
for t=T-2~0 do:
ΔU(:,:,t)m×n×1+=ΔU(:,:,t+1)m×n×1⊙U′(:,:,t)m×n×1
Figure BDA00030728969700001310
Figure BDA00030728969700001311
Figure BDA00030728969700001312
where sum (-) represents the summation of the matrix over dimension axis.
The algorithm calculates the intermediate quantity without time dependence through process optimization, and remarkably optimizes the operation speed of backward propagation, so that the algorithm is remarkably benefited from computing equipment such as a matrix operation library (such as MKL of Intel) running in a CPU, a GPU and the like designed for large-scale matrix operation and the like.
For the back propagation process of the loop layer, the computation diagram of which is shown in fig. 3, it is distinguished from the feed-forward layer that there is a connection of neurons in the loop layer to other neurons, so that there is an additional error propagation path:
Figure BDA00030728969700001313
thus, errors are also propagated through the membrane potential of a neuron in the layer to the impulses of other neurons in the layer. The derivative of the error function to membrane potential is calculated as:
Figure BDA0003072896970000141
the above formula is still an inversion with respect to time, and error propagation at all times is accounted for by iterative calculations
Figure BDA0003072896970000142
In (1). Substituting into all formulas to obtain
Figure BDA0003072896970000143
Is calculated as:
Figure BDA0003072896970000144
further obtain error function pair
Figure BDA00030728969700001414
wij、wikAnd biPartial derivatives of (a):
Figure BDA0003072896970000145
Figure BDA0003072896970000146
Figure BDA0003072896970000147
Figure BDA0003072896970000148
the other layers except the output layer still need to be corrected by a sparse regular term when error is propagated layer by layer:
Figure BDA0003072896970000149
the above-described correlation calculation process can still be accelerated by matrix operations. An algorithm receives input
Figure BDA00030728969700001415
Output of
Figure BDA00030728969700001410
And performing (a) on
Figure BDA00030728969700001411
And
Figure BDA00030728969700001412
the gradient of (2) is updated.
It should be noted that the above-mentioned materials,
Figure BDA00030728969700001413
combining input neurons into neurons in layers and synaptic weight gradients between layers, the matrix operation of the above process can be expressed as follows:
Figure BDA0003072896970000151
U′m×n×T=τ·(1-Sm×n×T+α·Um×n×T⊙Sm×n×T⊙G′m×n×T)
Figure BDA0003072896970000159
for t=T-2~0 do:
ΔU(:,:,t)m×n×1+=ΔU(:,:,t+1)m×n×1⊙U′(:,:,t)m×n×1U(:,:,t+1)m×n×1⊙W(:,-n:)1×n×n⊙G′(:,:,t)m×n×1
Figure BDA0003072896970000152
Figure BDA0003072896970000153
axis=[0,3])
Figure BDA0003072896970000154
the computation process of intermediate quantity is still optimized for the back propagation of the loop layer, and the algorithm obtains remarkable acceleration effect on devices such as a CPU (central processing unit), a GPU (graphics processing unit) and the like.
Through the error back propagation process shown in fig. 5, the gradients of the parameters of the impulse neural network have been calculated by time-reversal layer-by-layer. In the network training learning process, the parameters may be updated by a network optimization algorithm such as SGD, Adam, AdamW, etc.
Taking SGD as an example, the parameter update satisfies the following equation:
Figure BDA0003072896970000155
Figure BDA0003072896970000156
where w and b represent all synaptic weights and biases in the network,
Figure BDA0003072896970000157
and
Figure BDA0003072896970000158
has been calculated by the above process. Alpha is a learning rate constant (variable under the strategy of dynamic learning rate) in the training process.
The method provides a more flexible back propagation training method of the impulse neural network structure on one hand, and balances the accuracy rate and the issuing rate of the impulse neural network on the other hand.
As shown in fig. 4, by adjusting the sparse regularization term, the pulse delivery rate can be greatly reduced while causing only a small loss in accuracy. On a hardware platform driven by pulse (event) such as special pulse neural network hardware or a brain-like computing accelerator, lower pulse emitting rate can bring smaller computing overhead, so that the energy efficiency of the pulse neural network is improved while higher accuracy is kept.
The above description is only for the specific embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A graph representation spatio-temporal back propagation algorithm for an impulse neural network, characterized by:
obtaining a spiking neural network through network forward propagation of neurons in a network structure;
evaluating the error of the impulse neural network on the task through a loss function;
training the impulse neural network through error back propagation;
and completing parameter updating in the training process through a neural network optimization algorithm.
2. A graph representation spatiotemporal back propagation algorithm for spiking neural networks according to claim 1, characterized by the network forward propagation of the neurons in the network structure, comprising the network forward propagation of the neurons in a feed-forward network structure and the network forward propagation of the neurons in a recurrent network structure.
3. A graph representation spatiotemporal back propagation algorithm for an impulse neural network as claimed in claim 2, wherein in said feed forward network structure the forward course of said neurons is as follows:
Figure FDA0003072896960000011
Figure FDA0003072896960000012
in the formula (I), the compound is shown in the specification,
Figure FDA0003072896960000013
representing the membrane potential of the ith neuron in the feedforward layer at the time t;
Figure FDA0003072896960000014
representing the input pulse of the jth neuron at the time t, wherein the value of the input pulse of the jth neuron meets x ∈ {0, 1}, wherein 0 represents no input pulse, and 1 represents an input pulse;
Figure FDA0003072896960000015
the output pulse of the ith neuron in the feedforward layer at the time t is represented, the value also meets s ∈ {0, 1}, and whether the output pulse exists or not is represented respectively; w is aijRepresents the synaptic weight from input neuron j to feed-forward layer neuron i, with a 0 value in w indicating the absence of the synapse; biRepresents a bias of a neuron, a value of 0 in b indicates that the neuron is not biased; τ represents a leakage constant in the LIF model, and represents a rate of decrease in membrane potential per unit time (Δ t ═ 1); u shapethA pulse firing threshold representing a neuron that fires a pulse when the membrane potential is greater than the firing threshold;
Figure FDA0003072896960000016
representing a reset function of the membrane potential, controlling the membrane potential to drop to a resting potential (0 potential) after the pulse is issued;
Figure FDA0003072896960000017
expressing a release function and controlling whether the neuron releases pulses, wherein the specific function values are as follows:
Figure FDA0003072896960000021
4. a graph representation spatiotemporal back propagation algorithm for an impulse neural network as claimed in claim 3, wherein in said feed forward network structure the forward process of said neurons can also be significantly accelerated by matrix operations as follows:
for t=0~T-1 do:
Figure FDA0003072896960000022
S(:,:,t)m×n×1=U(:,:,t)m×n×1≥Uth
in the formula (I), the compound is shown in the specification,
Figure FDA0003072896960000023
is an input pulse
Figure FDA0003072896960000024
In the form of a matrix, the dimensions of the matrix are indicated in the corner marks, m denotes the batch size (batch size), ninRepresenting the number of input neurons, and T representing the time step of algorithm operation; calculating and storing the membrane potential
Figure FDA0003072896960000025
And neuronal impulses
Figure FDA0003072896960000026
Is in matrix form Um×n×TAnd Sm×n×TAnd then S ism×n×TPassing as the output of that layer as the input of the next layer;
Figure FDA00030728969600000213
a matrix of synaptic weights is represented, and,
Figure FDA00030728969600000212
a bias matrix representing one dimension; an as hadamard product, representing a element-by-element multiplication between matrices; u (: t) represents the slicing operation of the matrix.
5. A graph representation spatiotemporal back propagation algorithm for an impulse neural network as claimed in claim 2, wherein in said cyclic network structure the forward course of said neurons is as follows:
Figure FDA0003072896960000027
Figure FDA0003072896960000028
in the formula (I), the compound is shown in the specification,
Figure FDA0003072896960000029
representing the membrane potential of the ith neuron in the feedforward layer at the time t;
Figure FDA00030728969600000210
representing the input pulse of the jth neuron at the time t, wherein the value of the input pulse of the jth neuron meets x ∈ {0, 1}, wherein 0 represents no input pulse, and 1 represents an input pulse;
Figure FDA00030728969600000211
the output pulse of the ith neuron in the feedforward layer at the time t is represented, the value also meets s ∈ {0, 1}, and whether the output pulse exists or not is represented respectively; w is aijRepresents the synaptic weight from input neuron j to feed-forward layer neuron i, with a 0 value in w indicating the absence of the synapse; biRepresents a bias of a neuron, a value of 0 in b indicates that the neuron is not biased; τ represents a leakage constant in the LIF model, and represents a rate of decrease in membrane potential per unit time (Δ t ═ 1); u shapethA pulse firing threshold representing a neuron that fires a pulse when the membrane potential is greater than the firing threshold; w is aikRepresenting synaptic weights between neurons within the layer;
Figure FDA0003072896960000031
representing a reset function of the membrane potential, controlling the membrane potential to drop to a resting potential (0 potential) after the pulse is issued;
Figure FDA0003072896960000032
expressing a release function and controlling whether the neuron releases pulses, wherein the specific function values are as follows:
Figure FDA0003072896960000033
6. a graph representation spatiotemporal back propagation algorithm for spiking neural networks according to claim 5, characterized in that the forward process of the neurons in the recurrent network structure is also significantly accelerated by matrix operations as follows:
for t=0~T-1 do:
Figure FDA0003072896960000034
S(:,:,t)m×n×1=U(:,:,t)m×n×1≥Uth
in the formula (I), the compound is shown in the specification,
Figure FDA0003072896960000035
is an input pulse
Figure FDA0003072896960000036
In the form of a matrix, the dimensions of the matrix are indicated in the corner marks, m denotes the batch size (batch size), ninRepresenting the number of input neurons, and T representing the time step of algorithm operation; calculating and storing the membrane potential
Figure FDA0003072896960000037
And neuronal impulses
Figure FDA0003072896960000038
Is in matrix form Um×n×TAnd Sm×n×TAnd then S ism×n×TPassing as the output of that layer as the input of the next layer;
Figure FDA0003072896960000039
a matrix of synaptic weights is represented, and,
Figure FDA00030728969600000311
a bias matrix representing one dimension; an as Hadamard product, representing a multiplication between matrices by elements(ii) a U (: t) represents the slicing operation of the matrix; [ | · C]Which represents the combination of the two matrices,
Figure FDA00030728969600000310
the synaptic weight matrix of the input neuron to the recurrent layer neuron and the synaptic weight matrix inside the recurrent layer are merged.
7. A graph representation spatio-temporal back propagation algorithm for impulse neural networks according to claim 1, characterized in that the loss function includes but is not limited to a squared loss function, an exponential loss function or a cross entropy loss function in the form of impulses.
8. The graph representation spatiotemporal back propagation algorithm for an impulse neural network of claim 7, wherein sparse regularization is further added to the loss function to reduce the pulse firing rate of the impulse neural network.
9. A graph representation spatio-temporal back propagation algorithm for an impulse neural network as claimed in claim 1, characterized in that the neural network optimization algorithm includes but is not limited to bulk gradient descent, random gradient descent, momentum, adarad, Adam or AdamW.
10. A graph representation spatiotemporal back propagation algorithm for an impulse neural network as claimed in claim 1, wherein the neuron behavior in said impulse neural network follows LIF neuron dynamical models and their corresponding variants.
CN202110548714.6A 2021-05-19 2021-05-19 Graph representation space-time back propagation algorithm for impulse neural network Pending CN113298231A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110548714.6A CN113298231A (en) 2021-05-19 2021-05-19 Graph representation space-time back propagation algorithm for impulse neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110548714.6A CN113298231A (en) 2021-05-19 2021-05-19 Graph representation space-time back propagation algorithm for impulse neural network

Publications (1)

Publication Number Publication Date
CN113298231A true CN113298231A (en) 2021-08-24

Family

ID=77322896

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110548714.6A Pending CN113298231A (en) 2021-05-19 2021-05-19 Graph representation space-time back propagation algorithm for impulse neural network

Country Status (1)

Country Link
CN (1) CN113298231A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113723594A (en) * 2021-08-31 2021-11-30 绍兴市北大信息技术科创中心 Impulse neural network target identification method
CN113792857A (en) * 2021-09-10 2021-12-14 中国人民解放军军事科学院战争研究院 Impulse neural network training method based on membrane potential self-increment mechanism

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113723594A (en) * 2021-08-31 2021-11-30 绍兴市北大信息技术科创中心 Impulse neural network target identification method
CN113723594B (en) * 2021-08-31 2023-12-05 绍兴市北大信息技术科创中心 Pulse neural network target identification method
CN113792857A (en) * 2021-09-10 2021-12-14 中国人民解放军军事科学院战争研究院 Impulse neural network training method based on membrane potential self-increment mechanism
CN113792857B (en) * 2021-09-10 2023-10-20 中国人民解放军军事科学院战争研究院 Pulse neural network training method based on membrane potential self-increasing mechanism

Similar Documents

Publication Publication Date Title
US11308392B2 (en) Fixed-point training method for deep neural networks based on static fixed-point conversion scheme
US10929744B2 (en) Fixed-point training method for deep neural networks based on dynamic fixed-point conversion scheme
Noriega Multilayer perceptron tutorial
CN113298231A (en) Graph representation space-time back propagation algorithm for impulse neural network
CN110659730A (en) Method for realizing end-to-end functional pulse model based on pulse neural network
CN114037047A (en) Training method of impulse neural network
JP7240650B2 (en) Spiking neural network system, learning processing device, learning processing method and program
CN111382840B (en) HTM design method based on cyclic learning unit and oriented to natural language processing
Bodyanskiy et al. Multilayer radial-basis function network and its learning
Alonso et al. Tightening the biological constraints on gradient-based predictive coding
Musakulova et al. Synthesis of the backpropagation error algorithm for a multilayer neural network with nonlinear synaptic inputs
CN113902092A (en) Indirect supervised training method for impulse neural network
Hong et al. A cooperative method for supervised learning in spiking neural networks
GB2611681A (en) Drift regularization to counteract variation in drift coefficients for analog accelerators
Dao Image classification using convolutional neural networks
CN110853707A (en) Gene regulation and control network reconstruction method based on deep learning
CN115936070A (en) Low-delay low-power-consumption pulse neural network conversion method
Kumar et al. Neural networks and fuzzy logic
CN111582470B (en) Self-adaptive unsupervised learning image identification method and system based on STDP
CN115204350A (en) Training method and training device for impulse neural network
KR102090109B1 (en) Learning and inference apparatus and method
Ikuta et al. Multi-layer perceptron with glial network for solving two-spiral problem
JP7336710B2 (en) Neural network system, learning method and program
Kubo et al. Biologically-inspired neuronal adaptation improves learning in neural networks
RU2774625C1 (en) Modified intelligent controller with fuzzy rules and neural network training unit

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20210824