CN113902092A - Indirect supervised training method for spiking neural network - Google Patents
Indirect supervised training method for spiking neural network
- Publication number
- CN113902092A
- Authority
- CN
- China
- Prior art keywords
- neural network
- ann
- snn
- training
- converting
- Prior art date
- 2021-09-02
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Measuring Volume Flow (AREA)
Abstract
The invention discloses an indirect supervised training method for a spiking neural network, which comprises a method for converting an artificial neural network (ANN) into a spiking neural network (SNN), and is characterized in that the method for converting the ANN into the SNN comprises the following steps: select ReLU() as the activation function; set all biases in the ANN to zero, then train; fix the weights. The advantage of the invention is that it applies the indirect supervision scheme of ANNs to SNNs.
Description
Technical Field
The invention relates to the technical field of spiking neural networks, and in particular to an indirect supervised training method for spiking neural networks.
Background
Whether in an ANN or an SNN, training (that is, learning) is achieved by adjusting the connection weights between neurons, which corresponds to synaptic plasticity in biological neural networks. The weight-adjustment algorithm is crucial to the learning of artificial neural networks. In ANNs, the back-propagation algorithm with gradient descent has achieved great success, but in SNNs back-propagation is no longer applicable. The conflict has two main aspects. First, in a spiking neural network the activation function of the ANN becomes a weighted sum of multiple spikes, and a spike can be regarded as a Dirac delta function, which has no derivative, so the back-propagation algorithm cannot be applied in an SNN. The second problem is biological plausibility, also known as the weight transport problem, which exists for both ANNs and SNNs: the back-propagation computation requires the weight values of the forward connections, yet such backward connections do not exist in biological organisms, which makes back-propagation biologically implausible.
At present, no universally recognized training algorithm has emerged for spiking neural networks. Training algorithms can be classified into unsupervised learning and supervised learning according to whether labels are used.
(1) Unsupervised learning
The spiking neural network adopts a structure closer to that of biological neural networks. Although the back-propagation algorithm that is so successful in ANNs cannot be applied to it, this also makes it possible to apply learning rules with biological interpretability, whose biological basis is spike-timing-dependent plasticity (STDP). Its main characteristic is that the connection weight between a pre-synaptic and a post-synaptic neuron is adjusted according to the relative firing times of the two neurons (on the order of 10 ms); a mathematical approximation is

$$\Delta\omega = \begin{cases} A_{+}\, e^{-\Delta t/\tau}, & \Delta t > 0 \\ -A_{-}\, e^{\Delta t/\tau}, & \Delta t < 0 \end{cases} \qquad \Delta t = t_{\mathrm{post}} - t_{\mathrm{pre}},$$

where $\Delta\omega$ is the change in the weight and $\tau$ is the time-window constant: the weight becomes larger when the pre-synaptic neuron fires before the post-synaptic neuron and smaller when it fires after it. The magnitude of the change is governed by the hyper-parameters $\tau$, $A_{+}$ and $A_{-}$; $A_{+}$ and $A_{-}$ play a role similar to the learning rate in gradient-descent algorithms. Unsupervised learning methods designed with the STDP rule perform well at feature extraction.
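As a concrete illustration, the update above can be written as a few lines of Python. This is a minimal sketch; the function name and the default hyper-parameter values are hypothetical choices, not values given in the text.

```python
import numpy as np

def stdp_update(t_pre, t_post, a_plus=0.01, a_minus=0.01, tau=10.0):
    """Exponential STDP window: weight change for one pre/post spike
    pair, with spike times in milliseconds."""
    dt = t_post - t_pre
    if dt > 0:
        return a_plus * np.exp(-dt / tau)    # pre fires first: potentiation
    if dt < 0:
        return -a_minus * np.exp(dt / tau)   # post fires first: depression
    return 0.0

# Pre-synaptic spike at 2 ms, post-synaptic spike at 7 ms: the weight grows.
print(stdp_update(t_pre=2.0, t_post=7.0))   # 0.01 * exp(-0.5), roughly 0.0061
```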
(2) Supervised learning
SpikeProp is the earliest learning algorithm to adopt error back-propagation in a spiking neural network. It uses the spike response model as the neuron model, treats the change of a neuron's activation state value as a linear increase over a very short time, requires each neuron to output only a single spike, and defines the error as the mean squared error of the output neurons' spike firing times. Learning algorithms such as ReSuMe and SPAN emerged later; they are characterized by a single neuron receiving the input of many neurons and producing a desired spike time sequence.
In deep spiking neural networks, supervised learning can be divided into indirect and direct approaches. Indirect supervised learning means first training an ANN and then converting it into an SNN; the labels are used for supervised training during the ANN training phase. The core idea is to interpret the continuous activation values of the ANN as spike firing rates in the SNN. Studies in this direction concern constraints on the ANN structure, conversion methods, and so on. For direct supervised learning, several solutions to the conflict between back-propagation and SNNs have been proposed; the non-differentiability problem is generally handled by substituting an approximate, differentiable function. A study of the weight transport problem found that using random weights in place of the forward weights during back-propagation does not significantly affect the results. It should be noted, however, that directly supervised training still reaches lower accuracy than indirectly supervised training.
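To make the rate-coding idea concrete, here is a small sketch; it is an illustration of the general idea only, not code from the patent, and the function and parameter names are hypothetical. A non-negative ANN activation value is encoded as a binary spike train whose mean firing rate tracks the activation.

```python
import numpy as np

def rate_encode(activation, n_steps=1000, max_rate=1.0, seed=0):
    """Turn a non-negative activation value into a binary spike train:
    at each time step a spike occurs with probability proportional to
    the activation, so the firing rate approximates the activation."""
    rng = np.random.default_rng(seed)
    p = min(max(activation, 0.0), max_rate)  # ReLU outputs are already >= 0
    return (rng.random(n_steps) < p).astype(np.int8)

spikes = rate_encode(0.3)
print(spikes.mean())  # close to 0.3: the spike frequency stands in for the activation
```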
Disclosure of Invention
To solve the above problems, the invention provides an indirect supervised training method for a spiking neural network, which applies the indirect supervision scheme of ANNs to SNNs.
To solve the technical problem, the technical solution provided by the invention is as follows: an indirect supervised training method for a spiking neural network (SNN), comprising a method for converting an artificial neural network (ANN) into an SNN, where the method for converting the ANN into the SNN comprises the following steps:
Step 1: select ReLU() as the activation function;
Step 2: set all biases in the ANN to zero, then train;
Step 3: fix the weights.
Finally, the steps for generating the SNN logic network are summarized as follows:
Step 1: according to the chosen network structure and hyper-parameters, train an ANN with the back-propagation (BP) algorithm to obtain the input weights of all neurons;
Step 2: convert the multiply-add operation between weights and inputs in the second-generation neuron model into the addition operation of the third-generation neuron model, where the addition is triggered by the arrival of a spike and the spike's ID is checked to determine whether it is connected to the current neuron;
Step 3: convert the nonlinear activation process of the second-generation neuron model into a threshold judgment: when the membrane potential exceeds the threshold, generate a new spike and reset the membrane potential to zero; otherwise keep the membrane potential unchanged;
Step 4: fix all weights.
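As an illustration of the ANN side of the list above (ReLU activation, no bias), the sketch below defines a bias-free fully connected ReLU layer in Python. The training loop is omitted, and the class name and initialization scheme are hypothetical, not prescribed by the patent.

```python
import numpy as np

class BiasFreeReLULayer:
    """Fully connected layer with ReLU activation and no bias term.
    Dropping the bias is what later allows the layer to be converted
    into pure spike-triggered additions in the SNN."""

    def __init__(self, n_in, n_out, rng=None):
        rng = rng or np.random.default_rng(0)
        # He initialization (an assumption; any scheme works here).
        self.w = rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out))

    def forward(self, x):
        return np.maximum(0.0, x @ self.w)  # ReLU(x W), deliberately no "+ b"

layer = BiasFreeReLULayer(784, 128)
y = layer.forward(np.random.rand(1, 784))
assert (y >= 0).all()  # non-negative activations can map onto firing rates
```

After BP training of a stack of such layers (step 1 above), the learned weights are frozen and reused unchanged in the SNN (step 4).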
Further, the ReLU () activation function is different from Sigmoid () and Tanh (), and the output of the ReLU () activation function is a non-negative value, so that the problem that the activation value is a negative number can be solved. Meanwhile, when the input is greater than 0, the ReLU () activation function is linear, and the characteristic can reduce the performance loss from ANN to SNN to a certain extent.
Compared with the prior art, the invention has the advantages that: unlike Sigmoid () and Tanh (), the output of the ReLU () activation function is a non-negative value, which can solve the problem that the activation value is negative. Meanwhile, when the input is greater than 0, the ReLU () activation function is linear, and the characteristic can reduce the performance loss from ANN to SNN to a certain extent.
Drawings
FIG. 1 is a schematic diagram comparing the ANN numerical output with the SNN spike output in the indirect supervised training method for a spiking neural network according to the present invention.
FIG. 2 is a plot of the Sigmoid() and Tanh() functions for the indirect supervised training method for a spiking neural network according to the present invention.
FIG. 3 is a plot of the ReLU() function for the indirect supervised training method for a spiking neural network according to the present invention.
FIG. 4 is a schematic diagram of the ANN-to-SNN conversion in the indirect supervised training method for a spiking neural network according to the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
Converting from ANN to SNN by indirect supervision faces several problems, including the following.
First, the activation values of the neurons in each layer of a traditional neural network can be either positive or negative, where a negative activation value means inhibition of neurons in the next layer, and such inhibition is difficult to express accurately in a spiking neural network. This problem arises mainly for the following reasons:
(1) The input to the network may be negative
Generally, the input of an artificial neural network is preprocessed, and a common preprocessing step is normalization, that is, mapping the input data into the range from -1 to 1 through a transformation (for example, min-max scaling x' = 2(x - x_min)/(x_max - x_min) - 1). The purpose is, on the one hand, to increase the generalization ability of the network and, on the other hand, to accelerate the convergence of the network during training.
(2) Multiply-add operations
In an artificial neural network, a neuron converts its inputs into an output by multiply-add operations of the inputs with the weights plus a bias, followed by the activation function; both the weights and the bias may be negative.
(3) Activation functions
The nonlinear activation functions commonly used in artificial neural networks, Sigmoid() and Tanh(), have output ranges of (0, 1) and (-1, 1) respectively, so Tanh() in particular can produce negative activation values.
The second problem is that a spiking neural network cannot represent a bias in the way an artificial neural network does. In an ANN, each operation of a neuron adds the multiply-add result of the inputs and the weights to a bias and then passes it through the activation function; in an SNN, by contrast, the neuron's operation becomes spike-triggered: every time a new spike arrives, the corresponding weight is added to the activation level, so there is no place for the bias to act.
Third, the weights of the trained ANN are usually floating-point numbers, which are difficult to process on an FPGA.
With reference to FIGS. 1 to 4 of the accompanying drawings:
examples
An indirect supervised training method for a spiking neural network (SNN) comprises a method for converting an artificial neural network (ANN) into an SNN, and the method for converting the ANN into the SNN comprises the following steps:
Step 1: select ReLU() as the activation function;
Step 2: set all biases in the ANN to zero, then train;
Step 3: fix the weights.
Finally, the steps for generating the SNN logic network are summarized as follows:
Step 1: according to the chosen network structure and hyper-parameters, train an ANN with the back-propagation (BP) algorithm to obtain the input weights of all neurons;
Step 2: convert the multiply-add operation between weights and inputs in the second-generation neuron model into the addition operation of the third-generation neuron model, where the addition is triggered by the arrival of a spike and the spike's ID is checked to determine whether it is connected to the current neuron;
Step 3: convert the nonlinear activation process of the second-generation neuron model into a threshold judgment: when the membrane potential exceeds the threshold, generate a new spike and reset the membrane potential to zero; otherwise keep the membrane potential unchanged;
Step 4: fix all weights.
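The following is a minimal Python sketch of the converted third-generation (integrate-and-fire) neuron described in steps 2 and 3 above: multiply-add is replaced by spike-triggered addition, and the nonlinear activation by a threshold test with reset to zero. The class name, the threshold value, and the spike-ID representation are illustrative assumptions, not specified by the patent.

```python
import numpy as np

class IFNeuron:
    """Integrate-and-fire unit of the converted SNN: step 2 replaces the
    multiply-add by spike-triggered additions, step 3 replaces the
    nonlinear activation by a threshold test with reset to zero."""

    def __init__(self, weights, threshold=1.0):
        self.w = np.asarray(weights)  # weights fixed after ANN training (step 4)
        self.threshold = threshold    # illustrative value, not given in the patent
        self.v = 0.0                  # membrane potential

    def receive(self, spike_ids):
        """spike_ids: indices of the pre-synaptic neurons connected to
        this neuron that fired in the current time step."""
        self.v += self.w[spike_ids].sum()  # addition triggered by spike arrival
        if self.v > self.threshold:
            self.v = 0.0                   # membrane potential reset to zero
            return 1                       # a new spike is generated
        return 0                           # otherwise the potential is kept

# Example: two input spikes push the membrane potential past the threshold.
neuron = IFNeuron(weights=[0.4, 0.7, 0.2])
print(neuron.receive(np.array([0, 1])))   # 0.4 + 0.7 = 1.1 > 1.0, so prints 1
```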
Unlike Sigmoid () and Tanh (), the output of the ReLU () activation function is a non-negative value, which can solve the problem that the activation value is negative. Meanwhile, when the input is greater than 0, the ReLU () activation function is linear, and the characteristic can reduce the performance loss from ANN to SNN to a certain extent.
The present invention and its embodiments have been described above; the description is not limiting, and the drawings show only one of the embodiments of the present invention, to which the actual structure is not limited. In summary, those skilled in the art should appreciate that they can readily use the disclosed conception and specific embodiments as a basis for designing or modifying other structures for carrying out the same purposes of the present invention without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (2)
1. An indirect supervised training method for a spiking neural network, comprising a method for converting an artificial neural network (ANN) into a spiking neural network (SNN), characterized in that the method for converting the ANN into the SNN comprises the following steps:
Step 1: select ReLU() as the activation function;
Step 2: set all biases in the ANN to zero, then train;
Step 3: fix the weights.
Finally, the steps for generating the SNN logic network are summarized as follows:
Step 1: according to the chosen network structure and hyper-parameters, train an ANN with the back-propagation (BP) algorithm to obtain the input weights of all neurons;
Step 2: convert the multiply-add operation between weights and inputs in the second-generation neuron model into the addition operation of the third-generation neuron model, where the addition is triggered by the arrival of a spike and the spike's ID is checked to determine whether it is connected to the current neuron;
Step 3: convert the nonlinear activation process of the second-generation neuron model into a threshold judgment: when the membrane potential exceeds the threshold, generate a new spike and reset the membrane potential to zero; otherwise keep the membrane potential unchanged;
Step 4: fix all weights.
2. The indirect supervised training method for a spiking neural network according to claim 1, characterized in that: unlike Sigmoid() and Tanh(), the output of the ReLU() activation function is non-negative, which solves the problem of negative activation values; moreover, when the input is greater than 0 the ReLU() activation function is linear, a property that reduces, to a certain extent, the performance loss in going from ANN to SNN.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111024733.5A CN113902092A (en) | 2021-09-02 | 2021-09-02 | Indirect supervised training method for spiking neural network
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111024733.5A CN113902092A (en) | 2021-09-02 | 2021-09-02 | Indirect supervised training method for spiking neural network
Publications (1)
Publication Number | Publication Date |
---|---|
CN113902092A true CN113902092A (en) | 2022-01-07 |
Family
ID=79188381
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111024733.5A Pending CN113902092A (en) | Indirect supervised training method for spiking neural network | 2021-09-02 | 2021-09-02
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113902092A (en) |
- 2021
- 2021-09-02 CN CN202111024733.5A patent/CN113902092A/en active Pending

Cited By (4)

Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN115100458A (en) * | 2022-06-02 | 2022-09-23 | 西安电子科技大学 | Image classification method and related device
CN114781633A (en) * | 2022-06-17 | 2022-07-22 | 电子科技大学 | Processor fusing artificial neural network and spiking neural network
CN114781633B (en) * | 2022-06-17 | 2022-10-14 | 电子科技大学 | Processor fusing artificial neural network and spiking neural network
CN114861892A (en) * | 2022-07-06 | 2022-08-05 | 深圳时识科技有限公司 | Chip-in-the-loop agent training method and device, chip and electronic device
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |