CN113902092A - Indirect supervised training method for impulse neural network - Google Patents

Indirect supervised training method for impulse neural network

Info

Publication number
CN113902092A
Authority
CN
China
Prior art keywords
neural network
ann
snn
training
converting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111024733.5A
Other languages
Chinese (zh)
Inventor
黄漪婧
吴志刚
沈伟
戴靠山
吴建军
廖光明
卫军名
周林
杨斌
张丁凡
张辉
周成刚
魏莞月
向光明
童波
朱瑞蒙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Shengjinhui Technology Co ltd
Original Assignee
Sichuan Shengjinhui Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Shengjinhui Technology Co ltd
Priority to CN202111024733.5A
Publication of CN113902092A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Measuring Volume Flow (AREA)

Abstract

The invention discloses an indirect supervised training method for a spiking neural network (SNN), comprising a method for converting an artificial neural network (ANN) into an SNN. The conversion method comprises the following steps: selecting ReLU() as the activation function; setting all biases in the ANN to zero and then training; and fixing the weights. The advantage of the invention is that the indirect supervision approach of the ANN is applied to the SNN.

Description

Indirect supervised training method for impulse neural network
Technical Field
The invention relates to the technical field of spiking neural networks, and in particular to an indirect supervised training method for a spiking neural network.
Background
Whether in an ANN or an SNN, training (learning) is achieved by adjusting the connection weights between neurons, which corresponds to synaptic plasticity in biological neural networks. The weight-adjustment algorithm is therefore crucial to learning. In ANNs, the back-propagation algorithm combined with gradient descent has achieved great success, but in SNNs back-propagation is no longer directly applicable. The conflict shows up in two ways. First, in a spiking neural network the activation function of the ANN becomes a weighted sum of spikes; a spike can be regarded as a Dirac delta function, which has no derivative, so back-propagation cannot be applied directly in the SNN. Second, there is the question of biological plausibility, also known as the weight transport problem, which exists in both ANNs and SNNs: the backward pass of back-propagation needs the weight values of the forward connections, but no such backward connections exist in biological organisms, so back-propagation is not biologically plausible.
At present there is no generally accepted training algorithm for spiking neural networks. Existing training algorithms can be classified as unsupervised or supervised learning according to whether labels are used.
(1) Unsupervised learning
The spiking neural network adopts a structure closer to that of the biological neural network. Although the back-propagation algorithm that has been so successful in ANNs cannot be applied to it, this structure allows the use of biologically interpretable learning rules, whose biological basis is Spike-Timing-Dependent Plasticity (STDP). Its main characteristic is that the connection weight between the pre-synaptic and post-synaptic neurons is adjusted according to the relative firing times of the two neurons (on the order of 10 ms). A mathematical approximation is as follows:
[Equation image BDA0003242851750000011 (STDP weight-update rule) is not reproduced in this text.]
where Δω represents the change in weight and τ represents the time-window constant. The weight between the pre-synaptic and post-synaptic neurons increases when the pre-synaptic neuron fires before the post-synaptic neuron and decreases otherwise. The size of the change is governed by the hyperparameters τ, a+ and a-, which play a role similar to the learning rate in gradient descent. Unsupervised learning methods designed with the STDP rule perform well at feature extraction.
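The equation referred to above is not recoverable from this text. As an illustration only, a widely used exponential form of the STDP rule that uses the same symbols (Δω, τ, a+, a-) is given below. The sign convention Δt = t_post - t_pre and the exact functional form are assumptions, not taken from the source.

```latex
\Delta\omega =
\begin{cases}
a_{+}\,\exp\!\left(-\dfrac{\Delta t}{\tau}\right), & \Delta t > 0,\\[6pt]
-a_{-}\,\exp\!\left(\dfrac{\Delta t}{\tau}\right), & \Delta t < 0,
\end{cases}
\qquad \Delta t = t_{\mathrm{post}} - t_{\mathrm{pre}}
```

Under this convention, a pre-synaptic spike that precedes the post-synaptic spike (Δt > 0) strengthens the connection, and the effect decays as the two spikes move further apart in time.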
(2) Supervised learning
SpikeProp is the earliest learning algorithm to use error back-propagation in a spiking neural network. It adopts the spike response model as the neuron model, treats the change in a neuron's activation state as linear over a very short time, requires each neuron to output only a single spike, and defines the error as the mean squared error of the firing times of the output neurons. Learning algorithms such as ReSuMe and SPAN emerged later; in these, a single neuron receives input from multiple neurons and generates a desired spike time sequence.
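The spike-time error described above can be sketched in a few lines. This is a minimal illustration, assuming one output spike per neuron and times in milliseconds; the function name `spike_time_mse` is hypothetical and is not part of SpikeProp itself.

```python
from typing import Sequence


def spike_time_mse(actual: Sequence[float], desired: Sequence[float]) -> float:
    """Mean squared error between actual and desired output spike times (ms)."""
    if len(actual) != len(desired):
        raise ValueError("one spike time is expected per output neuron")
    return sum((a - d) ** 2 for a, d in zip(actual, desired)) / len(actual)


# Two output neurons: the first fires on time, the second fires 4 ms early.
print(spike_time_mse([10.0, 12.0], [10.0, 16.0]))  # prints 8.0
```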
In deep spiking neural networks, supervised learning can be divided into indirect and direct approaches. Indirect supervised learning first trains an ANN, using labels for supervision during that training, and then converts the ANN into an SNN. The core idea is to interpret a continuous activation value in the ANN as a spike firing frequency in the SNN. Research in this direction covers constraints on the ANN structure, conversion methods, and so on. Direct supervised learning has proposed several solutions to the conflict between back-propagation and SNNs: the non-differentiability problem is generally handled by substituting an approximate, differentiable function. One study of the weight transport problem found that using random weights in place of the forward weights during back-propagation does not significantly affect the results. It should be noted, however, that directly supervised training is still less accurate than indirectly supervised training.
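The core idea of reading a continuous activation as a firing frequency can be illustrated with a small sketch. It assumes the activation has already been scaled into [0, 1] and uses per-step Bernoulli sampling, which is one common encoding and is not specified by the source; `rate_encode` and `firing_rate` are hypothetical helper names.

```python
import random
from typing import List


def rate_encode(activation: float, steps: int, rng: random.Random) -> List[int]:
    """Emit a 0/1 spike train whose per-step firing probability equals `activation`."""
    if not 0.0 <= activation <= 1.0:
        raise ValueError("activation must be scaled into [0, 1] first")
    return [1 if rng.random() < activation else 0 for _ in range(steps)]


def firing_rate(spikes: List[int]) -> float:
    """Fraction of time steps that contain a spike."""
    return sum(spikes) / len(spikes)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
train = rate_encode(0.3, 10_000, rng)
print(round(firing_rate(train), 2))  # close to the encoded activation of 0.3
```

Over a long enough window the measured firing rate approaches the activation value, which is why a longer simulation generally gives a closer match between the ANN and the converted SNN.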
Disclosure of Invention
To solve the above problems, the invention provides an indirect supervised training method for a spiking neural network, which applies the indirect supervision approach of the ANN to the SNN.
To solve the above technical problems, the technical solution provided by the invention is as follows: an indirect supervised training method for a spiking neural network (SNN), comprising a method for converting an artificial neural network (ANN) into an SNN, the conversion method comprising the following steps:
Step one: select ReLU() as the activation function;
Step two: set all biases in the ANN to zero, then train;
Step three: fix the weights.
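The constraints in steps one and two can be illustrated with a forward pass through a small bias-free ReLU network. This is only a sketch: the weight values are made up, training is omitted, and the helper names `dense_relu` and `forward` are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def relu(x: float) -> float:
    return x if x > 0.0 else 0.0


def dense_relu(weights: Matrix, inputs: List[float]) -> List[float]:
    """One bias-free layer: ReLU(sum_i w[j][i] * x[i]) for each output neuron j."""
    return [relu(sum(w * x for w, x in zip(row, inputs))) for row in weights]


def forward(layers: List[Matrix], inputs: List[float]) -> List[float]:
    """Apply each bias-free ReLU layer in turn."""
    for weights in layers:
        inputs = dense_relu(weights, inputs)
    return inputs


# Illustrative weights only (not trained values).
layers = [
    [[0.5, -0.25], [1.0, 0.5]],  # hidden layer: 2 inputs -> 2 neurons
    [[1.0, 0.5]],                # output layer: 2 neurons -> 1 output
]
x = [1.0, 2.0]
y = forward(layers, x)
y_scaled = forward(layers, [2.0 * v for v in x])
print(y, y_scaled)
```

With no bias terms, every activation is non-negative, a zero input gives a zero output, and scaling the input by a positive factor scales the output by the same factor. These properties are what make the activations easy to read as firing rates.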
Finally, the steps for generating the SNN logical network are summarized as follows:
Step one: train an ANN with the back-propagation (BP) algorithm according to the chosen network structure and hyperparameters to obtain the input weights of all neurons;
Step two: convert the multiply-add operation between weights and inputs in the second-generation neuron model into the addition operation of the third-generation neuron model, where each addition is triggered by the arrival of a spike after checking whether the ID of the spike is connected to the current neuron;
Step three: convert the nonlinear activation of the second-generation neuron model into a threshold comparison: when the membrane potential exceeds the threshold, generate a new spike and reset the membrane potential to zero; otherwise, keep the membrane potential unchanged;
Step four: fix all weights.
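Steps two and three above can be sketched as an event-driven integrate-and-fire neuron. This is a minimal illustration under stated assumptions: the class name `IFNeuron`, the threshold of 1.0, and the weight values are hypothetical, and the weights stand in for values taken from a trained ANN.

```python
from typing import Dict


class IFNeuron:
    """Integrate-and-fire neuron driven by incoming spike events (illustrative)."""

    def __init__(self, weights: Dict[int, float], threshold: float = 1.0) -> None:
        self.weights = weights      # source neuron ID -> weight taken from the ANN
        self.threshold = threshold  # assumed value; the source does not give one
        self.potential = 0.0

    def receive(self, source_id: int) -> bool:
        """Handle one incoming spike; return True if this neuron fires."""
        if source_id not in self.weights:
            return False  # the spike's ID is not connected to this neuron
        self.potential += self.weights[source_id]  # addition only, no multiplication
        if self.potential > self.threshold:
            self.potential = 0.0  # reset the membrane potential to zero
            return True
        return False  # below threshold: the potential is kept


neuron = IFNeuron({0: 0.5, 1: 0.75, 2: -0.25})
events = [0, 1, 3, 2, 0, 0, 1]  # IDs of the neurons that emitted each spike
fired = [neuron.receive(source) for source in events]
print(fired)  # [False, True, False, False, False, False, True]
```

Because a spike carries no magnitude, each event costs one lookup and one addition, which is the saving over the multiply-add of the second-generation neuron model.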
Further, unlike Sigmoid() and Tanh(), the ReLU() activation function outputs non-negative values, which avoids the problem of negative activation values. In addition, ReLU() is linear when its input is greater than 0, a property that reduces the performance loss of the ANN-to-SNN conversion to some extent.
Compared with the prior art, the invention has the following advantages: the output of the ReLU() activation function is non-negative, which avoids the problem of negative activation values, and its linearity for inputs greater than 0 reduces the performance loss of the ANN-to-SNN conversion to some extent.
Drawings
FIG. 1 is a schematic diagram comparing the numerical output of the ANN with the spike output of the SNN in the indirect supervised training method of the present invention.
FIG. 2 is a plot of the Sigmoid() and Tanh() functions referred to in the indirect supervised training method of the present invention.
FIG. 3 is a plot of the ReLU() function used in the indirect supervised training method of the present invention.
FIG. 4 is a schematic diagram of the ANN-to-SNN conversion in the training method.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
Converting an ANN to an SNN by indirect supervision faces several problems, including the following.
First, the activation values of the neurons in each layer of a conventional neural network can be positive or negative. A negative activation value means inhibition of the neurons in the following layer, and such inhibition is difficult to express accurately in a spiking neural network. The main causes of this problem are:
(1) The input to the network may be negative
The input of an artificial neural network is generally preprocessed. A common preprocessing step is normalization, that is, mapping the input data into the range -1 to 1 by a transformation. The purpose is, on the one hand, to improve the generalization ability of the network and, on the other hand, to accelerate convergence during training.
(2) Multiply-add operation
In an artificial neural network, a neuron converts its inputs into an output through a multiply-add operation over the input activations, the weights, and the bias; both the weights and the biases may be negative.
(3) Activation function
Among the nonlinear activation functions commonly used in artificial neural networks, Tanh() has an output range of -1 to 1 and can therefore produce negative values; Sigmoid() has an output range of 0 to 1.
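A quick numerical check makes the output ranges of the three activation functions concrete. The sketch uses the standard definitions and only Python's `math` module; note that Sigmoid() stays strictly between 0 and 1, Tanh() spans -1 to 1, and ReLU() is never negative.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x: float) -> float:
    return math.tanh(x)


def relu(x: float) -> float:
    return max(0.0, x)


# Compare the three functions on a negative, a zero, and a positive input.
for x in (-5.0, 0.0, 5.0):
    print(f"x={x:+.1f}  sigmoid={sigmoid(x):.4f}  tanh={tanh(x):+.4f}  relu={relu(x):.1f}")
```

Only Tanh() returns a negative value for a negative input, while ReLU() returns exactly zero and passes positive inputs through unchanged.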
Second, a spiking neural network cannot represent a bias in the way an artificial neural network does. In an ANN, each neuron operation adds the bias to the multiply-add result of the inputs and weights and then applies the activation function. In an SNN, the neuron operation is triggered by spikes: each time a new spike arrives, the corresponding weight is added to the activation level, so the bias cannot be represented.
Third, the weights of a trained ANN are usually floating-point numbers, which are difficult for an FPGA to process.
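One common way to make floating-point weights FPGA-friendly is to convert them to signed fixed-point integers. Whether the "fixing the weights" step of this method means exactly this is an interpretation, not something the text states. The sketch below assumes a 16-bit format with 8 fractional bits; the format and the helper names `to_fixed_point` and `from_fixed_point` are hypothetical.

```python
from typing import List


def to_fixed_point(weights: List[float], frac_bits: int = 8, total_bits: int = 16) -> List[int]:
    """Round each weight to a signed integer with `frac_bits` fractional bits, saturating."""
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    return [max(lo, min(hi, round(w * scale))) for w in weights]


def from_fixed_point(codes: List[int], frac_bits: int = 8) -> List[float]:
    """Convert fixed-point integer codes back to floating-point values."""
    return [c / (1 << frac_bits) for c in codes]


weights = [0.5, -0.25, 0.3, 200.0]  # illustrative values; 200.0 exceeds the range
codes = to_fixed_point(weights)
print(codes)                    # [128, -64, 77, 32767]
print(from_fixed_point(codes))
```

Values that are not exact multiples of 1/256 are rounded (0.3 becomes 77/256), and values outside the representable range saturate, so the choice of bit widths trades accuracy against hardware cost.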
With reference to FIGS. 1 to 4 of the accompanying drawings:
Example
An indirect supervised training method for a spiking neural network (SNN), comprising a method for converting an artificial neural network (ANN) into an SNN, the conversion method comprising the following steps:
Step one: select ReLU() as the activation function;
Step two: set all biases in the ANN to zero, then train;
Step three: fix the weights.
Finally, the steps for generating the SNN logical network are summarized as follows:
Step one: train an ANN with the back-propagation (BP) algorithm according to the chosen network structure and hyperparameters to obtain the input weights of all neurons;
Step two: convert the multiply-add operation between weights and inputs in the second-generation neuron model into the addition operation of the third-generation neuron model, where each addition is triggered by the arrival of a spike after checking whether the ID of the spike is connected to the current neuron;
Step three: convert the nonlinear activation of the second-generation neuron model into a threshold comparison: when the membrane potential exceeds the threshold, generate a new spike and reset the membrane potential to zero; otherwise, keep the membrane potential unchanged;
Step four: fix all weights.
Unlike Sigmoid() and Tanh(), the ReLU() activation function outputs non-negative values, which avoids the problem of negative activation values. In addition, ReLU() is linear when its input is greater than 0, a property that reduces the performance loss of the ANN-to-SNN conversion to some extent.
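The relationship between ReLU() and the firing rate of a threshold neuron can be explored with a small simulation. It is a sketch under assumptions: a constant input `drive` per time step, a threshold of 1.0, the strict "greater than" comparison, and reset to zero as described in step three; `if_firing_rate` is a hypothetical helper.

```python
def relu(x: float) -> float:
    return max(0.0, x)


def if_firing_rate(drive: float, steps: int = 1000, threshold: float = 1.0) -> float:
    """Fraction of steps on which a neuron with constant input `drive` fires."""
    potential, spikes = 0.0, 0
    for _ in range(steps):
        potential += drive
        if potential > threshold:
            potential = 0.0  # reset to zero after firing
            spikes += 1
    return spikes / steps


# Compare the ReLU activation with the simulated firing rate for each input.
for drive in (-0.5, 0.0, 0.125, 0.25, 0.5):
    print(f"drive={drive:+.3f}  relu={relu(drive):.3f}  rate={if_firing_rate(drive):.3f}")
```

As with ReLU(), the rate is zero for non-positive input and grows monotonically for positive input. It falls below the ReLU value, though, because the strict comparison and the reset to zero discard charge, which is one source of the residual conversion loss.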
The present invention and its embodiments have been described above, and the description is not intended to be limiting, and the drawings are only one embodiment of the present invention, and the actual structure is not limited thereto. In summary, those skilled in the art should appreciate that they can readily use the disclosed conception and specific embodiments as a basis for designing or modifying other structures for carrying out the same purposes of the present invention without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (2)

1. An indirect supervised training method for a spiking neural network, comprising a method for converting an artificial neural network (ANN) into a spiking neural network (SNN), characterized in that the method for converting the ANN into the SNN comprises the following steps:
Step one: select ReLU() as the activation function;
Step two: set all biases in the ANN to zero, then train;
Step three: fix the weights.
Finally, the steps for generating the SNN logical network are summarized as follows:
Step one: train an ANN with the back-propagation (BP) algorithm according to the chosen network structure and hyperparameters to obtain the input weights of all neurons;
Step two: convert the multiply-add operation between weights and inputs in the second-generation neuron model into the addition operation of the third-generation neuron model, where each addition is triggered by the arrival of a spike after checking whether the ID of the spike is connected to the current neuron;
Step three: convert the nonlinear activation of the second-generation neuron model into a threshold comparison: when the membrane potential exceeds the threshold, generate a new spike and reset the membrane potential to zero; otherwise, keep the membrane potential unchanged;
Step four: fix all weights.
2. The indirect supervised training method for a spiking neural network according to claim 1, characterized in that: unlike Sigmoid() and Tanh(), the ReLU() activation function outputs non-negative values, which avoids the problem of negative activation values; in addition, ReLU() is linear when its input is greater than 0, a property that reduces the performance loss of the ANN-to-SNN conversion to some extent.
CN202111024733.5A 2021-09-02 2021-09-02 Indirect supervised training method for impulse neural network Pending CN113902092A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111024733.5A CN113902092A (en) 2021-09-02 2021-09-02 Indirect supervised training method for impulse neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111024733.5A CN113902092A (en) 2021-09-02 2021-09-02 Indirect supervised training method for impulse neural network

Publications (1)

Publication Number Publication Date
CN113902092A (en) 2022-01-07

Family

ID=79188381

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111024733.5A Pending CN113902092A (en) 2021-09-02 2021-09-02 Indirect supervised training method for impulse neural network

Country Status (1)

Country Link
CN (1) CN113902092A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115100458A (en) * 2022-06-02 2022-09-23 西安电子科技大学 Image classification method and related device
CN114781633A (en) * 2022-06-17 2022-07-22 电子科技大学 Processor fusing artificial neural network and pulse neural network
CN114781633B (en) * 2022-06-17 2022-10-14 电子科技大学 Processor fusing artificial neural network and impulse neural network
CN114861892A (en) * 2022-07-06 2022-08-05 深圳时识科技有限公司 Chip on-loop agent training method and device, chip and electronic device

Similar Documents

Publication Publication Date Title
CN113902092A (en) Indirect supervised training method for impulse neural network
CN107301864B (en) Deep bidirectional LSTM acoustic model based on Maxout neuron
US10832123B2 (en) Compression of deep neural networks with proper use of mask
CN112633497A (en) Convolutional pulse neural network training method based on reweighted membrane voltage
CN114037047A (en) Training method of impulse neural network
CN114266351A (en) Pulse neural network training method and system based on unsupervised learning time coding
CN115936070A (en) Low-delay low-power-consumption pulse neural network conversion method
Bodyanskiy et al. Multilayer radial-basis function network and its learning
CN114662644A (en) Image identification method of deep pulse neural network based on dynamic threshold neurons
Dai et al. Fast training and model compression of gated RNNs via singular value decomposition
CN113298231A (en) Graph representation space-time back propagation algorithm for impulse neural network
CN107273971B (en) Feed-forward neural network structure self-organization method based on neuron significance
CN113723594A (en) Impulse neural network target identification method
CN113628615A (en) Voice recognition method and device, electronic equipment and storage medium
CN116702865A (en) Pulse neural network training method, device and storage medium based on knowledge migration
CN116629332A (en) Signal compensation method based on optical reserve tank calculation
CN111582470B (en) Self-adaptive unsupervised learning image identification method and system based on STDP
Bondarev Training a digital model of a deep spiking neural network using backpropagation
Li et al. A Hybrid Training Framework for Speeding up the Inference Process of Spiking Neural Networks
Sheel et al. Accelerated learning in MLP using adaptive learning rate with momentum coefficient
Depenau Automated design of neural network architecture for classification
Neukart et al. A Machine Learning Approach for Abstraction Based on the Idea of Deep Belief Artificial Neural Networks
CN115936088A (en) Efficient multi-pulse learning algorithm combining synaptic weight and delay plasticity
CN116579389A (en) Method for enhancing expression capacity of impulse neural network
Chen et al. Exploiting memristive autapse and temporal distillation for training spiking neural networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination