WO2023284142A1 - Signal processing method for neuron in spiking neural network and method for training said network - Google Patents

Signal processing method for neuron in spiking neural network and method for training said network Download PDF

Info

Publication number
WO2023284142A1
WO2023284142A1 PCT/CN2021/123091 CN2021123091W WO2023284142A1 WO 2023284142 A1 WO2023284142 A1 WO 2023284142A1 CN 2021123091 W CN2021123091 W CN 2021123091W WO 2023284142 A1 WO2023284142 A1 WO 2023284142A1
Authority
WO
WIPO (PCT)
Prior art keywords
neural network
neuron
loss
pulse
membrane voltage
Prior art date
Application number
PCT/CN2021/123091
Other languages
English (en)
French (fr)
Inventor
西克萨迪克·尤艾尔阿明
邢雁南
魏德尔菲利普
鲍尔菲利克斯·克里斯琴
Original Assignee
成都时识科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 成都时识科技有限公司 filed Critical 成都时识科技有限公司
Priority to US18/251,000 priority Critical patent/US20230385617A1/en
Publication of WO2023284142A1 publication Critical patent/WO2023284142A1/zh

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/049Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Definitions

  • the invention relates to a pulse neuron, in particular to a signal processing method of a neuron in a pulse neural network and the network training method.
  • The spiking neural network is currently the neural network that best simulates the working principle of biological nerves.
  • SNN Spiking neural network
  • a popular approach is to use proxy gradients to solve this problem, such as prior art 1:
  • this type of technology only supports a single-pulse mechanism at each time step.
  • pulse data with extremely high time-resolution inputs such as DVS data
  • using a single-pulse mechanism will result in a very large and unacceptable number of simulation time steps. This will lead to the fact that the network training method of the single-pulse mechanism will become extremely inefficient when facing complex tasks, especially in the face of the increasing scale of configuration parameters.
  • the present invention proposes an automatic differentiable spiking neuron model and training method capable of generating multiple pulses in one simulation time step.
  • This model/training method can greatly improve training efficiency.
  • the present invention achieves the object in the following manner: a signal processing method for neurons in a spiking neural network, the spiking neural network comprising several layers, each of said layers comprising several of said neurons,
  • characterized in that the signal processing method includes the following steps: receiving step: at least one of the neurons receives at least one input pulse train; accumulation step: a membrane voltage is obtained based on the weighted summation of the at least one input pulse train; activation step: when the membrane voltage exceeds the threshold, the amplitude of the pulse fired by the neuron is determined based on the ratio of the membrane voltage to the threshold.
  • determining the amplitude of the pulse fired by the neuron based on the ratio of the membrane voltage to the threshold is specifically: in a single simulation time step, the amplitude of the fired pulse is related to the ratio of the membrane voltage to the threshold.
  • the determination of the amplitude of the pulse fired by the neuron based on the ratio of the membrane voltage to the threshold is specifically: in a single simulation time step, the ratio of the amplitude of the fired pulse to the unit pulse amplitude is equal to the value of the ratio of the membrane voltage to the threshold rounded down.
  • the weighted summation based on the at least one input pulse train to obtain the membrane voltage specifically includes: obtaining the membrane voltage based on the weighted summation of the post-synaptic potential kernel convolved with each input pulse train.
  • the weighted summation based on the at least one input pulse train to obtain the membrane voltage specifically includes: obtaining the membrane voltage based on the weighted summation of the post-synaptic potential kernel convolved with each input pulse train, and the convolution of the refractory kernel with the neuron's output pulse train.
  • ⁇ (t) is the neuronal membrane voltage
  • ⁇ j is the jth synaptic weight
  • ∈(t) is the post-synaptic potential kernel
  • sj (t) is the jth input pulse train
  • '*' is the convolution operation
  • t is time.
  • ⁇ (t) is the membrane voltage of the neuron
  • ⁇ (t) is the refractory nucleus
  • s'(t) is the output pulse sequence of the neuron
  • ⁇ j is the jth synaptic weight
  • ⁇ (t) is Post-synaptic potential kernel
  • s j (t) is the j-th input pulse train
  • '*' is the convolution operation
  • t is time.
  • a spiking neural network training method, the spiking neural network comprising several layers, each of said layers comprising several neurons, characterized in that when said neurons process signals during network training, the following steps are included: receiving step: at least one of the neurons receives at least one input pulse train; accumulation step: a membrane voltage is obtained based on the weighted summation of the at least one input pulse train; activation step: when the membrane voltage exceeds a threshold, the amplitude of the pulse fired by the neuron is determined based on the ratio of the membrane voltage to the threshold; the total loss of the spiking neural network includes a first loss and a second loss, wherein the first loss reflects the gap between the expected output of the spiking neural network and the actual output of the spiking neural network, and the second loss reflects the activity or activity level of the neurons.
  • the training method further includes: detecting the peak of the output trace; calculating the first loss at the moment corresponding to the peak of the output trace; calculating the second loss, which reflects the activity/activity level of the neurons; combining the first loss and the second loss into the total loss; and training the neural network with the error backpropagation algorithm according to the function corresponding to the total loss.
  • the merging of the first loss and the second loss into the total loss is specifically: total loss = first loss + α · second loss, where the parameter α is a tuning parameter
  • the second loss depends on the total excess number of spikes, where T is the duration, N_neurons is the size of the neuron population, and H(·) is the Heaviside function
  • the first loss is a cross-entropy loss calculated at the peak time of the output trace
  • a training device includes a memory, and at least one processor coupled to the memory, configured to execute the neural network training method included in any one of the above.
  • a storage device configured to store the source code written by programming the neural network training method included in any one of the above, or/and the machine code that can be directly run on the machine.
  • a neural network accelerator on which the neural network configuration parameters trained by the neural network training method included in any one of the above items are deployed.
  • a neuromorphic chip on which the neural network configuration parameters trained by the neural network training method included in any one of the above items are deployed.
  • a neural network configuration parameter deployment method deploying the neural network configuration parameters trained by any one of the neural network training methods included in the above to a neural network accelerator.
  • a neural network configuration parameter deployment device stores the neural network configuration parameters trained by any one of the neural network training methods mentioned above, and transmits the configuration parameters to the neural network accelerator through a channel.
  • a neural network accelerator, in which the neurons included in the neural network accelerator apply the aforementioned neuron signal processing method when performing inference.
  • integers are included in the spike events in the neural network accelerator.
  • in addition to faster training, the accuracy of the model/training method can also be improved
  • Figure 1 is a schematic diagram of the SNN neural network architecture
  • Fig. 2 is a schematic diagram of a single-pulse neuron signal processing mechanism
  • Fig. 3 is a schematic diagram of multi-pulse neuron signal processing mechanism
  • Figure 4 is a function graph of the proxy gradient
  • Figure 5 is a flow chart of loss function construction during training
  • Fig. 6 is a schematic diagram of output trace and peak time
  • Fig. 7 is a schematic diagram of neurons firing pulses at precise moments and patterns generated after neuron clusters are trained.
  • the "pulse" mentioned anywhere in the present invention refers to a spike in the neuromorphic field, which is also called a "peak", not a pulse in ordinary circuits.
  • the described training algorithm can be written as a computer program in the form of computer code, stored in a storage medium, and read by the processor of a computer (such as a high-performance GPU device, FPGA, ASIC, etc.); trained on the training data (various data sets) with the training algorithm, neural network configuration parameters are obtained that can be deployed to neuromorphic devices (such as brain-like chips).
  • the neuromorphic device configured with these parameters acquires inference capability.
  • the neuromorphic device performs inference and outputs the results (e.g. via wires, wireless communication modules, etc.) to other external electronic devices (such as an MCU) to achieve linkage effects.
  • SNN has a similar topology to traditional artificial neural networks, but has a completely different information processing mechanism.
  • the speech signal is encoded by the encoding layer (including several encoding neurons), and the encoding neuron transmits the output pulse to the hidden layer of the next layer.
  • the hidden layer includes a number of neurons (shown as circles in the figure); each neuron performs a weighted summation of each input pulse train according to the synaptic weights, then outputs a pulse train based on the activation (also called excitation) function, and passes it to the next layer.
  • the neuron model is the basic unit of the neural network, and different neural network architectures can be constructed by using the basic unit.
  • the present invention is not aimed at a specific network architecture, but at any SNN that uses this neuron model.
  • the learned neural network configuration parameters are obtained.
  • deploying the trained configuration parameters to a neural network accelerator such as a brain-like chip,
  • the neural network can easily complete the inference work for any input and realize artificial intelligence.
  • the LIF neuron model uses a synaptic time constant ⁇ s , a membrane time constant ⁇ ⁇ .
  • the subthreshold dynamics of neurons can be described using the following formula:
  • the present invention simulates LIF neurons through the following impulse response (SRM) model:
  • the non-leaking IAF (Integrate And Fire) neuron is:
  • post-synaptic potential kernel ∈(t) = (∈ s*∈ υ)(t)
  • synaptic dynamic function and membrane dynamic function; "*" is the convolution operation
  • j is the index. That is, the membrane voltage is obtained based on the weighted summation of the post-synaptic potential kernel convolved with each input pulse train.
  • in traditional solutions, the spike excitation function is evaluated in a loop for every time step to calculate the membrane voltage, which is a time-consuming operation.
  • in the present invention, the above-mentioned kernel function is used to convolve the input pulses of, for example, 100 time steps, so that the membrane voltages corresponding to these 100 time steps are obtained at once, thereby greatly improving the information processing efficiency of the neuron.
  • usually in the prior art, the "multi-pulse" mechanism described later is not used in a single simulation time step; in particular, when the time step is small enough, the multi-pulse mechanism is not needed.
  • however, the single-pulse regime with smaller time steps means a large, unaffordable number of simulation time steps, which makes the training algorithm extremely inefficient.
  • a threshold ⁇ which is a fixed value, and can also be set as a dynamic value in some embodiments. If the membrane voltage exceeds N ⁇ , this neuron will generate a pulse with N times the unit pulse amplitude (it can be called N pulses, multi-pulse, which refers to the superposition of the amplitude at the same time step), and the membrane voltage will be proportional to Subtract, where N is a positive integer value.
  • N pulses the unit pulse amplitude
  • multi-pulse which refers to the superposition of the amplitude at the same time step
  • the amplitude of the generated pulse is determined according to the relationship between the membrane voltage and the threshold in a simulated time step, that is, the "multi-pulse" of the present invention (multi-spikes) mechanism (the “multi” pulse here can be understood as multiple unit amplitude pulses superimposed on the same time step).
  • the pulse amplitude generated by the specific multi-pulse mechanism can be determined according to the ratio relationship between the membrane voltage and a fixed value (such as a threshold), for example, it can be the Gaussian function of ⁇ (t)/ ⁇ in the above formula (rounded down), It can also be some other function transformation relationship, such as Gaussian function rounded up, or some kind of linear or nonlinear transformation of the aforementioned rounded value, that is, in a single simulation time step, the amplitude of the excited pulse is related to the membrane voltage and
  • the thresholds are ratio dependent.
  • the neuron at this time step (t 1 ⁇ t 4 ) generates afterpulses with a height that is several times (or related to) the unit amplitude, and constitutes a neuron output pulse sequence.
  • This mechanism of generating multiple pulses allows for more robustness when simulating time steps.
  • the advantage brought by this mechanism also includes that relatively larger time steps can be selected in the simulation. In practice, we have found that some neurons produce this so-called multi-spiking from time to time.
  • the above describes the signal processing method of neurons in the training phase/method on the training device.
  • the concept of a (simulation) time step does not exist in neuromorphic hardware (such as brain-like chips), and the above-mentioned "multi-pulse" cannot be generated there; therefore, in actual neuromorphic hardware, the aforementioned amplitude-domain multi-pulse will appear as multiple pulses (equal in number to the aforementioned unit-amplitude multiple) consecutive on the time axis. For example, a pulse with an amplitude of 5 units generated in the training algorithm corresponds to 5 pulses of fixed amplitude generated consecutively in the neuromorphic device.
  • the multi-pulse information can also be carried (or contained) by a spike event in the neural network accelerator (such as a neuromorphic chip), e.g. a spike event carries (or contains) an integer to indicate that it delivers a multi-pulse.
  • the above discloses a signal processing method for neurons in a spiking neural network.
  • the spiking neural network includes several layers, and each layer includes several neurons.
  • the signal processing method includes the following steps: receiving Step: at least one of the neurons receives at least one input pulse sequence; accumulation step: based on the weighted summation of the at least one input pulse sequence to obtain a membrane voltage; activation step: when the membrane voltage exceeds a threshold, based on the membrane The ratio of the voltage to the threshold determines the amplitude of the pulse that the neuron fires.
  • the above neuron signal processing method can exist as a basic module/step of the spiking neural network training method.
  • the spiking neural network may include several above-mentioned neurons, and thus constitute several layers of the network.
  • the above-mentioned neuron signal processing method can also be applied in the reasoning stage of the neural network.
  • the neurons included in the neural network accelerator (such as a neuromorphic chip) apply the above-mentioned signal processing method of neurons.
  • the above neuron model can be applied to various neural network architectures, such as various existing network architectures and a new neural network architecture.
  • the present invention does not limit the specific neural network architecture.
  • the network prediction error needs to be transmitted to each layer of the network to adjust configuration parameters such as weights, so that the loss function value of the network can be minimized.
  • This is the error back propagation training method of the network.
  • Different training methods will lead to different network training performance and efficiency.
  • there are many training schemes in the prior art, but these training methods are basically based on the concept of gradients, especially for traditional ANN networks.
  • the spike neural network training method in the present invention relates to the following technical means:
  • the present invention uses a surrogate gradient scheme.
  • the scheme selects the periodic exponential function as the surrogate gradient in the backpropagation stage of the training process, and the present invention does not limit the specific parameters of the periodic exponential function.
  • this periodic exponential function spikes when the membrane voltage exceeds the neuron's threshold N (≥1) times.
  • the gradient function maximizes the influence of parameters when a neuron is about to fire or has fired, and is a variant of the periodic exponential function.
  • a minimalist form of the periodic exponential function is the Heaviside function in Figure 4.
  • the Heaviside function is similar to a ReLU unit: it has zero gradient over a limited range of membrane voltages, which would likely prevent a neural network with a low level of activity from learning.
  • the above-mentioned Heaviside function is used as the surrogate gradient during the backpropagation phase of the training process.
  • the above proxy gradient scheme can be applied to various backpropagation training models, such as a brand new training model, and the present invention does not limit the specific training scheme.
  • the pulse neural network training method involves the following technical means:
  • a kind of training method of pulse neural network comprises several layers, and each described layer comprises several neurons, is characterized in that:
  • Receiving step at least one neuron receives at least one input pulse sequence
  • Accumulation step obtain the membrane voltage based on the weighted summation of the at least one input pulse sequence
  • Activation step when the membrane voltage exceeds a threshold, determine the amplitude of the pulse excited by the neuron based on the ratio of the membrane voltage to the threshold;
  • the total loss of the spiking neural network includes a first loss and a second loss, wherein the first loss reflects the gap between the expected output of the spiking neural network and the actual output of the spiking neural network, and the second loss It reflects the activity or activity level of neurons.
  • the cross entropy of the sum of the outputs over the sample length is calculated for each output neuron to determine the class of the output. While this yields decent classification accuracy, the magnitude of the output trace at a given moment is not indicative of the network's prediction. In other words, this approach does not work in streaming mode.
  • the total loss of the spiking neural network includes a first loss and a second loss, wherein the first loss reflects the gap between the expected output of the spiking neural network and the actual output of the spiking neural network, and the second loss reflects the activity/activity level of the neurons. Specifically, this includes:
  • Step 31 Detect the peak value of the output trace
  • Step 33 Calculate the first loss at the moment corresponding to the peak value of the output trace
  • the first loss is determined according to a cross entropy loss function.
  • the cross-entropy loss function is:
  • the first loss reflects the gap between the expected output of the spiking neural network and the actual output of the spiking neural network.
  • the moment corresponding to the peak value of the output trace may be referred to as the peak time; referring to FIG. 6, the output trace is maximally activated at this moment.
  • the indication p_c of the relative likelihood, predicted by the above-mentioned neural network, that the current input belongs to category c can be calculated by the softmax function
  • i is the index of the i-th category
  • the numerator corresponds to the score of the input data belonging to category c
  • e is the base of the natural logarithm
  • the denominator is the summation over all categories
  • Θ is the configuration parameters of the neural network
  • the network output at time t also depends on the internal state of the network at time t
  • the present invention feeds the peak of each output trace into the softmax function, and the peak is obtained at the peak time
  • the above-mentioned peak time is the time at which the output trace is maximally activated.
  • the activity of LIF neurons can change dramatically during the course of learning. This can occur by sending spikes at a high rate at each time step, potentially eliminating the advantage of using spiking neurons since sparsity is lost. This may result in high energy consumption of neuromorphic devices implementing such networks.
  • Step 35 Calculate the second loss This second loss reflects the activity/level of activity of the neurons.
  • the second loss also known as activation loss, is a loss set to punish activation of too many neurons.
  • the second loss is defined as follows: the second loss depends on the total excess number of spikes produced by a population of N_neurons neurons in response to an input of duration T
  • H(·) is the Heaviside function, and its argument involves the i-th neuron at time step t; that is, the sum over each time bin of the spikes of all neurons N_i exceeding 1.
  • Step 37: combine the first loss and the second loss into the total loss.
  • the above-mentioned combination method is: total loss = first loss + α · second loss, where the parameter α is a tuning parameter, optionally equal to 0.01.
  • the above combining manner also includes any other reasonable manner that takes the second loss into consideration, such as combining the first loss and the second loss in a non-linear manner.
  • the total loss, the first loss and the second loss all refer to the values of the corresponding loss functions; these losses are calculated from the corresponding loss functions given above.
  • Step 39: according to the function corresponding to the total loss,
  • the neural network is trained using the error backpropagation algorithm.
  • BPTT Backpropagation through time
  • according to the value of the loss function (in this invention, the total loss function), the configuration parameters such as the weights of the neural network are adjusted in feedback, and finally the value of the loss function is optimized towards minimization to complete the learning/training process.
  • any reasonable BPTT algorithm can be applied to the above training, and the present invention does not limit the specific form of the BPTT algorithm.
  • the present invention also discloses the following products related to neural networks. Due to space limitations, the aforementioned neural network architecture and training methods will not be repeated here. All of the following are referenced, and any one or more of the aforementioned neural network architectures and their training methods are included in related products as part of the product.
  • a training device includes a memory, and at least one processor coupled to the memory, configured to execute the neural network training method included in any one of the above.
  • the training device can be an ordinary computer, a server, a training device dedicated to machine learning (such as a computing device including a high-performance GPU), a high-performance computer, or an FPGA device, an ASIC device, and the like.
  • a storage device configured to store the source code written by programming the neural network training method included in any one of the above, or/and the machine code that can be directly run on the machine.
  • the storage device includes but is not limited to memory carriers such as RAM, ROM, magnetic disk, solid-state hard disk, and optical disk. It may be a part of the training device, or it may be remotely separated from the training device.
  • a neural network accelerator on which the neural network configuration parameters trained by the neural network training method included in any one of the above items are deployed.
  • a neural network accelerator characterized in that: when the neurons included in the neural network accelerator perform reasoning functions, the aforementioned neuron signal processing method is applied.
  • integers are included in the spike events in the neural network accelerator.
  • a neural network accelerator is a hardware device used to accelerate the computation of a neural network model. It may be a coprocessor configured at the side of the CPU and configured to perform specific tasks, such as event-triggered detection, e.g. keyword spotting.
  • a neuromorphic chip on which the neural network configuration parameters trained by the neural network training method included in any one of the above items are deployed.
  • a neuromorphic chip/brain-like chip is a chip developed by simulating the working mode of biological neurons; it is usually event-triggered and has the characteristics of low power consumption, low-latency response, and no privacy disclosure.
  • Existing mimetic chips include Intel's Loihi, IBM's TrueNorth, Synsense's Dynap-CNN, etc.
  • a neural network configuration parameter deployment method deploying the neural network configuration parameters trained by any one of the neural network training methods included in the above to a neural network accelerator.
  • the configuration data generated during the training phase (which may be stored directly in the training device, or in a dedicated deployment device not shown) is passed through channels (such as cables, various types of networks, etc.) to the storage unit of a neural network accelerator (such as an artificial intelligence chip or a mixed-signal brain-like chip), e.g. a storage unit that models a synapse.
  • a neural network accelerator such as an artificial intelligence chip, a mixed-signal brain-like chip
  • the configuration parameter deployment process of the neural network accelerator can be completed.
  • a neural network configuration parameter deployment device stores the neural network configuration parameters trained by any one of the neural network training methods mentioned above, and transmits the configuration parameters to the neural network accelerator through a channel.
  • the multi-pulse mechanism proposed by the present invention will not affect the normal function of the network model.
  • the applicant repeated the pulse pattern task of prior art 1; the replication model included 250 input neurons to receive random/frozen inputs, and 25 hidden neurons to learn precise spike times.
  • the SNN can reproduce the precise spike timing after about 400 epochs, while the original model needs 739 epochs to reach the convergence state.
  • RGB images were used this time to train neuron clusters to fire pulses.
  • the target image has 350*355 pixels in 3 channels, with the first dimension defined as time and the other dimensions as neurons. From this, we trained 1065 neurons to fire spikes reflecting the pixel values of all 3 channels, and plotted their output spike trains as an RGB map. As shown in part B of Fig. 7, the spike pattern can accurately reproduce the Logo, which proves that the neuron cluster can accurately learn both the spike timing and the number of spikes.
  • Table 1 shows the performance of different models on the N-MNIST dataset.
  • for the scheme using the IAF neuron model, the performance on this data set is the best, on both the training and the test set, followed by the LIF model; the training time of both is 6.5 hours.
  • the model in the prior art 1 shown in the last row takes 42.5 hours to train, which is about 6-7 times that of the proposed scheme, and the accuracy is not as good as the proposed new scheme.
  • Table 2 Effects of pulse generation mechanisms of different coding layers on accuracy performance at different time step lengths
  • Table 2 shows, for the small N-MNIST data set, a comparison of network performance when the rest of the network structure is identical but, at different time step lengths (1-100 ms), only the encoding layer encodes the input signal with different mechanisms (i.e. generating multi-pulses or single pulses). It can be seen from the table that, even when only the encoding layer differs, as the time step increases the network performance of the single-pulse mechanism decreases most obviously, in both the training and the testing phase and especially for the test set. This result also highlights the performance advantage of the multi-pulse mechanism in terms of accuracy.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Neurology (AREA)
  • Measurement And Recording Of Electrical Phenomena And Electrical Characteristics Of The Living Body (AREA)
  • Image Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A signal processing method for a neuron in a spiking neural network, and a method for training the network. Unlike the single-spike mechanism in common use today, it is designed as a multi-spike mechanism. The signal processing method of the neuron comprises: a receiving step: at least one of the neurons receives at least one input pulse train; an accumulation step: a membrane voltage is obtained based on a weighted summation of the at least one input pulse train; an activation step: when the membrane voltage exceeds a threshold, the amplitude of the pulse fired by the neuron is determined based on the ratio of the membrane voltage to the threshold. To solve the time-consuming and inefficient training caused by the ever-growing scale of configuration parameters, the network training method achieves efficient training of the spiking neural network through the multi-spike mechanism, a periodic exponential function as surrogate gradient, and an added loss that suppresses the activity level of the neurons; it can keep the power consumption of neuromorphic hardware low, and also improves accuracy and convergence speed.

Description

Signal processing method for neuron in spiking neural network and method for training said network
This application claims priority to Chinese patent application No. 202110808342.6, entitled "Signal processing method for neuron in spiking neural network and method for training said network", filed with the China National Intellectual Property Administration on July 16, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to a spiking neuron, and in particular to a signal processing method for a neuron in a spiking neural network and a method for training the network.
Background
The spiking neural network (SNN) is currently the neural network that best simulates the working principle of biological nerves. However, owing to its inherent discontinuity and non-linear mechanisms, it is difficult to construct an efficient supervised learning algorithm for SNNs, which is a very important topic in this field. The spike generation function is non-differentiable, so the traditional standard error backpropagation algorithm is not directly compatible with SNNs. A popular approach is to use a surrogate gradient to solve this problem, for example prior art 1:
Prior art 1: Shrestha S B, Orchard G. SLAYER: Spike layer error reassignment in time [J]. arXiv preprint arXiv:1810.08646, 2018.
However, this type of technique only supports a single-spike mechanism at each time step. For spike data with extremely high temporal resolution, such as DVS data, the single-spike mechanism leads to an extremely large, unaffordable number of simulation time steps. As a result, the single-spike training approach becomes extremely inefficient when facing complex tasks, especially with the ever-growing scale of configuration parameters.
To solve/alleviate the above technical problems, the present invention proposes an automatically differentiable spiking neuron model and training method capable of generating multiple spikes within one simulation time step. This model/training method can greatly improve training efficiency.
Summary of the Invention
To improve the training efficiency of spiking neural networks, the present invention achieves this object in the following manner: a signal processing method for a neuron in a spiking neural network, the spiking neural network comprising several layers, each of said layers comprising several of said neurons, characterized in that the signal processing method comprises the following steps: a receiving step: at least one of said neurons receives at least one input pulse train; an accumulation step: a membrane voltage is obtained based on a weighted summation of said at least one input pulse train; an activation step: when said membrane voltage exceeds a threshold, the amplitude of the pulse fired by the neuron is determined based on the ratio of said membrane voltage to said threshold.
In a class of embodiments: determining the amplitude of the pulse fired by the neuron based on the ratio of said membrane voltage to said threshold is specifically: within a single simulation time step, the amplitude of the fired pulse is related to the ratio of said membrane voltage to said threshold.
In a class of embodiments: determining the amplitude of the pulse fired by the neuron based on the ratio of said membrane voltage to said threshold is specifically: within a single simulation time step, the ratio of the amplitude of the fired pulse to the unit pulse amplitude equals the value of the ratio of said membrane voltage to said threshold rounded down.
In a class of embodiments: obtaining the membrane voltage based on the weighted summation of said at least one input pulse train specifically comprises: obtaining the membrane voltage based on a weighted summation of the post-synaptic potential kernel convolved with each input pulse train.
In a class of embodiments: obtaining the membrane voltage based on the weighted summation of said at least one input pulse train specifically comprises: obtaining the membrane voltage based on a weighted summation of the post-synaptic potential kernel convolved with each input pulse train, and a convolution of the refractory kernel with the output pulse train of said neuron.
In a class of embodiments:
υ(t) = Σ_j ω_j·(∈*s_j)(t)
where υ(t) is the neuron's membrane voltage, ω_j is the j-th synaptic weight, ∈(t) is the post-synaptic potential kernel, s_j(t) is the j-th input pulse train, '*' is the convolution operation, and t is time.
In a class of embodiments:
υ(t) = (η*s')(t) + Σ_j ω_j·(∈*s_j)(t)
where υ(t) is the neuron's membrane voltage, η(t) is the refractory kernel, s'(t) is the output pulse train of said neuron, ω_j is the j-th synaptic weight, ∈(t) is the post-synaptic potential kernel, s_j(t) is the j-th input pulse train, '*' is the convolution operation, and t is time.
In a class of embodiments: the post-synaptic potential kernel ∈(t) = (∈_s*∈_υ)(t), with synaptic dynamic function ∈_s(t) and membrane dynamic function ∈_υ(t) [exponential-decay kernels, formulas as in the original filing], where τ_s is the synaptic time constant, τ_υ is the membrane time constant, and t is time. The refractory kernel is η(t) [a negative exponential kernel, formula as in the original filing]; θ is the threshold, and when υ(t) ≥ θ, s'(t) = ⌊υ(t)/θ⌋, otherwise s'(t) = 0.
A spiking neural network training method, the spiking neural network comprising several layers, each of said layers comprising several neurons, characterized in that when said neurons process signals during network training, the following steps are included: a receiving step: at least one of said neurons receives at least one input pulse train; an accumulation step: a membrane voltage is obtained based on a weighted summation of said at least one input pulse train; an activation step: when said membrane voltage exceeds a threshold, the amplitude of the pulse fired by the neuron is determined based on the ratio of said membrane voltage to said threshold; the total loss of the spiking neural network comprises a first loss and a second loss, wherein the first loss reflects the gap between the expected output of the spiking neural network and the actual output of the spiking neural network, and the second loss reflects the activity or activity level of the neurons.
In a class of embodiments: the training method further comprises: detecting the peak of the output trace; calculating the first loss at the moment corresponding to the peak of the output trace; calculating the second loss, which reflects the activity/activity level of the neurons; merging the first loss and the second loss into the total loss; and training the neural network with the error backpropagation algorithm according to the function corresponding to the total loss.
In a class of embodiments: merging the first loss and the second loss into the total loss is specifically: total loss = first loss + α · second loss, where the parameter α is a tuning parameter.
In a class of embodiments: the second loss is [formula as in the original filing], where T is the duration, N_neurons is the size of the neuron population, H(·) is the Heaviside function, and s_i^t denotes the i-th neuron at time step t.
In a class of embodiments: the first loss is −Σ_c λ_c·log(p_c), where λ_c = 1 when the class label c matches the current input, otherwise λ_c = 0; p_c is an indication of the relative likelihood, predicted by the neural network, that the current input belongs to class c.
In a class of embodiments: a periodic exponential function or the Heaviside function is used as the surrogate gradient.
A training device, comprising a memory and at least one processor coupled to the memory, configured to execute the neural network training method included in any one of the above.
A storage device, configured to store source code written in a programming language implementing the neural network training method included in any one of the above, or/and machine code that can run directly on a machine.
A neural network accelerator, on which the neural network configuration parameters trained by the neural network training method included in any one of the above are deployed.
A neuromorphic chip, on which the neural network configuration parameters trained by the neural network training method included in any one of the above are deployed.
A neural network configuration parameter deployment method, deploying the neural network configuration parameters trained by the neural network training method included in any one of the above to a neural network accelerator.
A neural network configuration parameter deployment device, which stores the neural network configuration parameters trained by the neural network training method included in any one of the above, and transmits the configuration parameters to a neural network accelerator through a channel.
A neural network accelerator, in which the neurons included in the neural network accelerator apply the aforementioned neuron signal processing method when performing inference.
In a class of embodiments, spike events in said neural network accelerator contain integers.
In addition to the above objects, compared with the prior art, some different embodiments of the present invention also have one or more of the following advantages:
1. Besides faster training, for the same model and training method, the accuracy of the model/training method can also be improved;
2. The activity level of neurons is suppressed, keeping computation sparse and reducing the power consumption of neuromorphic chips;
3. The learning of spike timing can converge faster;
4. When computing the membrane voltage, the amount of computation over a whole time period via the convolution operation is far lower than computing time step by time step.
The technical solutions, features and means disclosed above may not be completely identical or consistent with those described in the following detailed description. However, the new technical solutions disclosed in this part likewise belong to the numerous technical solutions disclosed in this document; the new technical features and means disclosed here, reasonably combined with the technical features and means disclosed in the following detailed description, disclose further technical solutions and are a beneficial supplement to the detailed description. Likewise, some details in the drawings may not be explicitly described in the specification; however, if a person skilled in the art can infer their technical meaning from other related text or drawings of the present invention, common general knowledge in the art, or other prior art (such as conferences, journal papers, etc.), then such technical solutions, features and means not explicitly recorded in words likewise belong to the technical content disclosed by the present invention and, as described above, may be used in combination to obtain corresponding new technical solutions. Technical solutions combined from all technical features disclosed anywhere in the present invention are used to support the generalization of technical solutions, amendment of the patent document, and disclosure of technical solutions.
Brief Description of the Drawings
Figure 1 is a schematic diagram of an SNN network architecture;
Figure 2 is a schematic diagram of the single-spike neuron signal processing mechanism;
Figure 3 is a schematic diagram of the multi-spike neuron signal processing mechanism;
Figure 4 is a graph of the surrogate gradient functions;
Figure 5 is a flow chart of the loss function construction during training;
Figure 6 is a schematic diagram of the output trace and the peak time;
Figure 7 is a schematic diagram of neurons firing spikes at precise times after training, and of the pattern generated by a neuron population after training.
Detailed Description of Embodiments
Wherever the term "pulse" appears in the present invention, it refers to a spike in the neuromorphic field, also called a "peak", not a pulse in ordinary circuits. The described training algorithm can be written as a computer program in the form of computer code, stored in a storage medium, and read by the processor of a computer (e.g. one with high-performance GPU devices, FPGAs, ASICs, etc.); trained on training data (various data sets) with the training algorithm, neural network configuration parameters are obtained that can be deployed to neuromorphic devices (such as brain-like chips). A neuromorphic device configured with these parameters acquires inference capability: based on signals acquired by sensors (such as a DVS that senses changes in light intensity, or dedicated sound signal acquisition devices), the neuromorphic device performs inference and outputs (e.g. via wires, wireless communication modules, etc.) the inference results to other external electronic devices (such as an MCU) to achieve linkage effects. Technical solutions and details related to neural networks that are not disclosed in detail below generally belong to conventional technical means/common knowledge in the art; due to space limitations, the present invention does not describe them in detail. The expression "based on ..." or similar herein indicates that at least the technical features described are used to achieve a certain purpose; this does not imply that only the described technical features are used, and other technical features may also be included, especially in the claims. Unless it denotes division, "/" anywhere in the present invention denotes a logical "or".
An SNN has a topology similar to that of a traditional artificial neural network, but a completely different information processing mechanism. Referring to the SNN network structure shown in Figure 1, after a speech signal is acquired, it is encoded by the encoding layer (containing several encoding neurons), and the encoding neurons pass the output spikes to the hidden layer of the next layer. The hidden layer includes several neurons (shown as circles in the figure); each neuron performs a weighted summation of each input pulse train according to the synaptic weights, then outputs a pulse train based on the activation (also called excitation) function, and passes it to the next layer. The figure shows only a network structure with one hidden layer; the network may be designed with multiple hidden layers. Finally, the result is output at the output layer of the network.
1. Neuron model
The neuron model is the basic unit of a neural network; different neural network architectures can be built from this basic unit. The present invention is not aimed at a specific network architecture, but at any SNN that uses this neuron model. After a network model with a specific structure is trained according to a data set and a training/learning algorithm, the learned neural network configuration parameters are obtained. With a neural network accelerator (such as a brain-like chip) on which the trained configuration parameters are deployed, the neural network can easily complete the inference work for any input, such as sound or image signals, and realize artificial intelligence.
In a class of embodiments, the LIF neuron model uses a synaptic time constant τ_s and a membrane time constant τ_υ. The subthreshold dynamics of the neuron can be described by the following formulas (two differential equations for the synaptic current and the membrane voltage, as in the original filing):
τ_s·(di_s/dt) = −i_s(t) + Σ_j ω_j·s_j(t)
τ_υ·(dυ/dt) = −υ(t) + i_s(t)
where the dotted quantities denote derivatives (i.e. d/dt), υ(t) is the membrane voltage, i_s(t) is the synaptic current, ω_j is the j-th synaptic weight, s_j(t) is the j-th input pulse train ("/" denotes a logical "or"), and t is time.
To further improve simulation efficiency, in a class of embodiments the present invention simulates the LIF neuron with the following spike response model (SRM):
υ(t) = (η*s')(t) + Σ_j ω_j·(∈*s_j)(t)
where: the post-synaptic potential (PSP) kernel ∈(t) = (∈_s*∈_υ)(t); the synaptic dynamic function ∈_s(t) and membrane dynamic function ∈_υ(t) are [exponential kernels, formulas as in the original filing]; the refractory kernel η(t) likewise belongs to the negative exponential kernel functions and carries the same time constant τ_υ as the membrane potential; "*" is the convolution operation; j is the index; s' or s'(t) is the neuron's output pulse train; t is time. That is, the membrane voltage is obtained from the weighted summation of the post-synaptic potential kernel convolved with each input pulse train, plus the convolution of the refractory kernel with the neuron's output pulse train.
In an alternative embodiment, the non-leaky IAF (Integrate And Fire) neuron is:
υ(t) = Σ_j ω_j·(∈*s_j)(t)
where: the post-synaptic potential kernel ∈(t) = (∈_s*∈_υ)(t), with synaptic dynamic function ∈_s(t) and membrane dynamic function ∈_υ(t) [formulas as in the original filing]; "*" is the convolution operation and j is the index. That is, the membrane voltage is obtained from the weighted summation of the post-synaptic potential kernel convolved with each input pulse train.
In traditional SNN solutions, the spike excitation function is evaluated in a loop for every time step to compute the membrane voltage, which is a time-consuming operation. In the present invention, however, for e.g. 100 time steps the above kernel functions are used to convolve the input spikes of these 100 time steps, so that the membrane voltages corresponding to these 100 time steps are obtained at once, thereby greatly improving the information processing efficiency of the neuron.
In the traditional LIF model, once the membrane voltage exceeds the threshold θ it is reset to the resting potential. Referring to Figure 2, a neuron with the single-spike mechanism receives multiple/at least one pulse train (pre-spikes) s_j, sums them under the weighting of the synaptic weights ω_j, and the obtained membrane voltage is compared with the threshold θ; if the threshold is exceeded, the neuron generates one post-spike at that time step (t_1 ~ t_4). All generated spikes have a uniform, fixed unit amplitude and constitute the neuron's output pulse train — the so-called "single-spike mechanism".
Usually in the prior art, the "multi-spike" mechanism described below is not used within a single simulation time step; in particular, when the time step is small enough, the multi-spike mechanism is not needed. But a single-spike mechanism with smaller time steps means a huge, unaffordable number of simulation time steps, which makes the training algorithm extremely inefficient.
In a class of embodiments, however, we subtract a threshold θ, which is a fixed value and in some embodiments can also be set to a dynamic value. If the membrane voltage exceeds Nθ, the neuron generates a pulse of N times the unit pulse amplitude (which can figuratively be called N pulses, or a multi-pulse, referring to the superposition of amplitudes at the same time step), and the membrane voltage is decreased proportionally, where N is a positive integer. The benefit of this is improved simulation time and computational efficiency. Described mathematically, the neuron's output pulse train is:
s'(t) = ⌊υ(t)/θ⌋ when υ(t) ≥ θ, and s'(t) = 0 otherwise.
That is, in a class of embodiments, when the neuron's membrane voltage satisfies a certain condition, the amplitude of the pulse generated within one simulation time step is determined from the relationship between the membrane voltage and the threshold — the "multi-spike" mechanism of the present invention (the "multi" pulse here can be understood as several unit-amplitude pulses superimposed on the same time step). The pulse amplitude generated by a specific multi-spike mechanism can be determined from the ratio of the membrane voltage to a fixed value (such as the threshold); for example, it can be the floor (Gauss bracket) of υ(t)/θ in the above formula, or some other functional transformation, such as rounding up (ceiling), or some linear or non-linear transformation of the aforementioned rounded value — that is, within a single simulation time step, the amplitude of the fired pulse is related to the ratio of the membrane voltage to the threshold. Here "s'(t) = 1" means a pulse of unit amplitude (a unit pulse). The above formula thus discloses: within a single simulation time step, the ratio of the amplitude of the fired pulse to the unit pulse amplitude equals the value of the ratio of the membrane voltage to the threshold rounded down.
Referring to Figure 3, unlike a single-spike neuron, after receiving at least one pre-spike train (input pulse train), if the neuron's membrane voltage exceeds the threshold θ by some multiple, the neuron generates at that time step (t_1 ~ t_4) a post-spike whose height is that multiple of the unit amplitude (or related to that multiple), constituting the neuron's output pulse train.
This mechanism of generating multiple spikes allows more robustness with respect to the simulation time step. A further benefit of this mechanism is that relatively larger time steps can be chosen in the simulation. In practice, we have found that some neurons produce this so-called multi-spiking from time to time.
What is described above is the signal processing method of the neuron in the training phase/method on a training device. It should be noted that in neuromorphic hardware (such as brain-like chips) the concept of a (simulation) time step does not exist and the above "multi-spike" cannot be generated; therefore, in actual neuromorphic hardware, the aforementioned amplitude-domain multi-spike appears as multiple pulses (equal in number to the aforementioned unit-amplitude multiple) consecutive on the time axis. For example, a pulse with an amplitude of 5 units generated in the training algorithm corresponds to 5 consecutive pulses of fixed amplitude generated in the neuromorphic device. In another class of embodiments, however, the multi-spike information can also be carried (or contained) by a spike event in the neural network accelerator (such as a neuromorphic chip), e.g. a spike event carries (or contains) an integer to indicate that it delivers a multi-spike.
In summary, the above discloses a signal processing method for a neuron in a spiking neural network, the spiking neural network comprising several layers, each of said layers comprising several of said neurons, the signal processing method comprising the following steps: a receiving step: at least one of said neurons receives at least one input pulse train; an accumulation step: a membrane voltage is obtained based on a weighted summation of said at least one input pulse train; an activation step: when said membrane voltage exceeds a threshold, the amplitude of the pulse fired by the neuron is determined based on the ratio of said membrane voltage to said threshold.
The above neuron signal processing method can exist as a basic module/step of the spiking neural network training method. The spiking neural network may contain several of the above neurons, which in turn constitute the layers of the network.
In fact, the above neuron signal processing method can equally be applied in the inference stage of the neural network. In other words, when performing inference, the neurons included in a neural network accelerator (such as a neuromorphic chip) apply the above neuron signal processing method.
The above neuron model can be applied to a wide variety of neural network architectures, such as various existing network architectures or some brand-new neural network architecture; the present invention does not limit the specific neural network architecture.
2. Surrogate gradient
In the network training phase, the network's prediction error needs to be propagated to each layer of the network to adjust configuration parameters such as weights, so that the value of the network's loss function is minimized — this is the error backpropagation training method of the network. Different training methods lead to different training performance and efficiency; many training schemes exist in the prior art, but they are basically based on the concept of gradients, especially for traditional ANNs. To this end, the spiking neural network training method of the present invention involves the following technical means:
To solve the non-differentiability of the SNN spike gradient, the present invention uses a surrogate gradient scheme. In a class of embodiments, referring to Figure 4, to accommodate the multi-spike behaviour of the neurons, the scheme selects a periodic exponential function as the surrogate gradient in the backpropagation stage of the training process; the present invention does not limit the specific parameters of the periodic exponential function. This periodic exponential function spikes whenever the membrane voltage exceeds the neuron's threshold by N (≥1) times. The gradient function maximizes the influence of the parameters when a neuron is about to fire or has just fired, and is a variant of the periodic exponential function.
A minimalist form of the periodic exponential function is the Heaviside function in Figure 4. This Heaviside function is similar to a ReLU unit: it has zero gradient over a limited range of membrane voltages, which may prevent a neural network with a low level of activity from learning. In an alternative embodiment, the above Heaviside function is used as the surrogate gradient in the backpropagation stage of the training process.
The above surrogate gradient scheme can be applied to various backpropagation training models, including some brand-new training model; the present invention does not limit the specific training scheme.
3. Loss function
A spiking neural network training method generally involves a loss function, which is an evaluation metric of the current training result of the network. The larger the loss value, the worse the network performance, and vice versa. The spiking neural network training method of the present invention involves the following technical means:
A spiking neural network training method, the spiking neural network comprising several layers, each of said layers comprising several neurons, characterized in that:
when said neurons process signals during network training, the following steps are included:
a receiving step: at least one of said neurons receives at least one input pulse train;
an accumulation step: a membrane voltage is obtained based on a weighted summation of said at least one input pulse train;
an activation step: when said membrane voltage exceeds a threshold, the amplitude of the pulse fired by the neuron is determined based on the ratio of said membrane voltage to said threshold;
the total loss of the spiking neural network comprises a first loss and a second loss, wherein the first loss reflects the gap between the expected output of the spiking neural network and the actual output of the spiking neural network, and the second loss reflects the activity or activity level of the neurons.
In classification tasks, generally, the cross entropy of the sum of the outputs over the sample length is computed for each output neuron to decide the output class. Although this gives decent classification accuracy, the magnitude of the output trace at a given moment does not represent the network's prediction; in other words, this approach does not work in streaming mode. For this reason, referring to Figure 5, we designed a brand-new total loss function and spiking neural network training method, in which the total loss of the spiking neural network comprises a first loss and a second loss, wherein the first loss reflects the gap between the expected output of the spiking neural network and the actual output of the spiking neural network, and the second loss reflects the activity/activity level of the neurons. Specifically, this comprises:
Step 31: detect the peak of the output trace;
Step 33: at the moment corresponding to the peak of the output trace, calculate the first loss.
In a class of specific embodiments, the first loss is determined according to a cross-entropy loss function. Specifically, the cross-entropy loss function is:
−Σ_c λ_c·log(p_c)
where λ_c = 1 when the class label c (i.e. class c) matches the current input, otherwise λ_c = 0; p_c is an indication of the relative likelihood, predicted by the neural network, that the current input belongs to class c (e.g. a probability/odds or some functional mapping thereof). The first loss reflects the gap between the expected output of the spiking neural network and the actual output of the spiking neural network.
The moment corresponding to the peak of the output trace may be called the peak time. Referring to Figure 6, the output trace is maximally activated at this moment.
In a class of specific embodiments, the above indication p_c that the neural network predicts the current input to belong to class c can be computed by the softmax function:
p_c = e^{ô_c} / Σ_i e^{ô_i}
where ô_c and ô_i are logits output by the neural network, i is the index of the i-th class, ô_c is the score of the input data belonging to class c, ô_i is the score of the input data belonging to the i-th class, e is the base of the natural logarithm, and the denominator is the summation of e^{ô_i} over all classes.
For temporal tasks, with input x = x^T = x^{1,2,3...T}, the output of the neural network (the logits) is a time series over the duration T. The neural network output at time t is:
o^t = f_Θ(x^t, h^t)
where f is the neural network transform, Θ is the configuration parameters of the neural network, and h^t is the internal state of the network at time t.
For the peak loss, the present invention feeds the peak of each output trace into the softmax function, and the peak is obtained as follows:
ô = o^{t̂}
where t̂ is the above-mentioned peak time; referring to Figure 6, it is the time at which the output trace is maximally activated.
The applicant has found that the activity of LIF neurons can change drastically during learning. This may manifest as firing spikes at a high rate at every time step, potentially eliminating the advantage of using spiking neurons since sparsity is lost. This may cause neuromorphic devices implementing such a network to have high energy consumption.
Step 35: calculate the second loss. This second loss reflects the activity/activity level of the neurons.
To suppress/limit the activity of the neurons while still maintaining sparse activity, the total loss also includes the second loss; that is, the total loss is the loss obtained by merging/including the first loss and the second loss. The second loss, also called the activation loss, is a loss set to penalize activating too many neurons.
Optionally, the second loss is defined as follows: [formula as in the original filing]. This second loss depends on the total excess number of spikes produced by a neuron population of size N_neurons in response to an input of duration T, where H(·) is the Heaviside function and s_i^t is the i-th neuron at time step t — that is, the sum, over every time bin, of the spikes of all neurons N_i that exceed 1.
Step 37: merge the first loss and the second loss into the total loss.
In a class of embodiments, the above merging is: total loss = first loss + α · second loss, where the parameter α is a tuning parameter, optionally equal to 0.01. In alternative embodiments, the above merging also includes any other reasonable way of taking the second loss into account, e.g. merging the first loss and the second loss in a non-linear way.
Here the total loss, the first loss and the second loss all refer to the values of the corresponding loss functions; these losses are computed from the corresponding loss functions given above.
Step 39: according to the function corresponding to the total loss, train the neural network using the error backpropagation algorithm.
Backpropagation through time (BPTT) is a gradient-based neural network training (sometimes also called learning) method well known in the art. Usually, the configuration parameters of the neural network, such as its weights, are adjusted in feedback according to the value of the loss function (in the present invention, the total loss function), and finally the value of the loss function is optimized towards a minimum, completing the learning/training process.
For the present invention, any reasonable BPTT algorithm can be applied to the above training; the present invention does not limit the specific form of the BPTT algorithm.
Although each of the above Steps is followed by a number for distinction, the magnitude of these numbers does not imply an absolute order of execution, nor does the difference between the numbers imply the number of other steps that may exist.
4. Neural-network-related products
In addition to the aforementioned neural network architectures and training methods, the present invention also discloses the following neural-network-related products. For reasons of space, the aforementioned neural network architectures and training methods are not repeated here; all of the following incorporate, by reference, any one or more of the aforementioned neural network architectures and their training methods as a part of the product.
A training device, comprising a memory and at least one processor coupled to the memory, configured to execute the neural network training method included in any one of the above.
The training device can be an ordinary computer, a server, a training device dedicated to machine learning (such as a computing device including high-performance GPUs), a high-performance computer, or an FPGA device, an ASIC device, and the like.
A storage device, configured to store source code written in a programming language implementing the neural network training method included in any one of the above, or/and machine code that can run directly on a machine.
The storage device includes, but is not limited to, memory carriers such as RAM, ROM, magnetic disks, solid-state drives and optical discs; it may be part of the training device, or it may be remote from and separate from the training device.
A neural network accelerator, on which the neural network configuration parameters trained by the neural network training method included in any one of the above are deployed.
A neural network accelerator, characterized in that the neurons included in the neural network accelerator apply the aforementioned neuron signal processing method when performing inference.
In a class of embodiments, spike events in said neural network accelerator contain integers.
A neural network accelerator is a hardware device for accelerating the computation of neural network models; it may be a coprocessor arranged at the side of a CPU and configured to perform specific tasks, such as event-triggered detection, e.g. keyword spotting.
A neuromorphic chip, on which the neural network configuration parameters trained by the neural network training method included in any one of the above are deployed.
A neuromorphic chip/brain-like chip is a chip developed by mimicking the way biological neurons work; it is usually event-triggered and is characterized by low power consumption, low-latency response and no privacy leakage. Existing neuromorphic chips include Intel's Loihi, IBM's TrueNorth and Synsense's Dynap-CNN, among others.
A neural network configuration parameter deployment method, deploying the neural network configuration parameters trained by the neural network training method included in any one of the above to a neural network accelerator.
With the aid of dedicated deployment software, the deployment phase transmits the configuration data generated in the training phase (which may be stored directly in the training device, or in a dedicated deployment device not shown) through a channel (such as a cable, various types of networks, etc.) to the storage units of a neural network accelerator (such as an artificial-intelligence chip or a mixed-signal brain-like chip), e.g. storage units that model synapses. This completes the configuration parameter deployment process of the neural network accelerator.
A neural network configuration parameter deployment device, which stores the neural network configuration parameters trained by the neural network training method included in any one of the above, and transmits the configuration parameters to a neural network accelerator through a channel.
5. Performance tests
First, the multi-spike mechanism proposed by the present invention does not affect the normal functioning of a network model. To verify this conclusion, as an example, using the network and training method described in prior art 1, the applicant repeated the spike pattern task of prior art 1; the replication model includes 250 input neurons receiving random/frozen inputs and 25 hidden neurons learning precise spike times. Referring to part A of Figure 7, the SNN can reproduce the precise spike timing after about 400 epochs, whereas the original model needs 739 epochs to reach convergence.
Similarly, besides the spike timing being learned precisely, to further verify that the number of spikes can also be learned accurately, analogous to the previous experiment, this time we trained a neuron population to fire spikes according to the pattern of an RGB image. The target image has 350*355 pixels in 3 channels; the first dimension is defined as time and the other dimensions as neurons. We thus trained 1065 neurons to fire spikes reflecting the pixel values of all 3 channels, and plotted their output spike trains as an RGB map. As shown in part B of Figure 7, the spike pattern accurately reproduces the Logo, which proves that the neuron population can accurately learn both the spike timing and the number of spikes.
Table 1: Performance of different models on the N-MNIST dataset
[Table 1 as reproduced in the original publication]
Table 1 shows the performance of different models on the N-MNIST dataset. The scheme using the IAF neuron model performs best on this dataset, on both the training and the test set, followed by the LIF model; both take 6.5 hours to train. The model of prior art 1 shown in the last row takes 42.5 hours to train, about 6-7 times that of the proposed scheme, and its accuracy is also inferior to the proposed new scheme.
Table 2: Effect of the spike generation mechanism of the encoding layer on accuracy at different time step lengths
[Table 2 as reproduced in the original publication]
Table 2 compares network performance on the small N-MNIST dataset when the rest of the network structure is identical but, at different time step lengths (1-100 ms), only the encoding layer encodes the input signal with different mechanisms (i.e. generating multi-spikes or single spikes). The table shows that, even when only the encoding layer differs, as the time step grows the performance of the single-spike mechanism degrades most noticeably, in both the training and the testing phase and especially on the test set. This result also highlights the accuracy advantage of the multi-spike mechanism.
Although the present invention has been described with reference to specific features and embodiments thereof, various modifications and combinations can be made without departing from the invention. Accordingly, the specification and drawings should simply be regarded as an illustration of some embodiments of the invention as defined by the appended claims, and are intended to cover any and all modifications, variations, combinations or equivalents falling within the scope of the invention. Therefore, although the invention and its advantages have been described in detail, various changes, substitutions and alterations can be made without departing from the invention as defined by the appended claims. Moreover, the scope of the present application is not intended to be limited to the particular embodiments of the processes, machines, manufacture, compositions of matter, means, methods and steps described in the specification.
A person of ordinary skill in the art will readily appreciate from the disclosure of the present invention that presently existing or later developed processes, machines, manufacture, compositions of matter, means, methods or steps that perform substantially the same function or achieve substantially the same result as the corresponding embodiments described herein may be utilized according to the present invention. Accordingly, the appended claims are intended to include within their scope such processes, machines, manufacture, compositions of matter, means, methods or steps.
To achieve better technical effects or to meet the requirements of certain applications, a person skilled in the art may make further improvements to the technical solutions on the basis of the present invention. However, even if such improvements/designs are inventive or/and progressive, as long as they make use of technical features covered by the claims of the present invention, such technical solutions shall likewise fall within the protection scope of the present invention according to the "doctrine of full coverage".
Some technical features mentioned in the appended claims may have alternative technical features, or the order of certain technical processes or the order of organization of materials may be rearranged. After learning of the present invention, a person of ordinary skill in the art will easily think of such substitutions, or of changing the order of technical processes or material organization, and then adopt substantially the same means to solve substantially the same technical problems and achieve substantially the same technical effects. Therefore, even if the above means or/and order are explicitly defined in the claims, such modifications, changes and substitutions shall fall within the protection scope of the claims according to the "doctrine of equivalents".
Where a claim specifies explicit numerical values, a person skilled in the art will normally understand that other reasonable values close to those values can likewise be applied in a particular embodiment. Such designs that evade details without departing from the inventive concept of the present invention likewise fall within the protection scope of the claims.
The method steps and units described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the steps and composition of each embodiment have been described generally in terms of function in the above description. Whether these functions are performed in hardware or software depends on the particular application and the design constraints of the technical solution. A person of ordinary skill in the art may use different methods to implement the described functions for each particular application, but such implementation should not be considered beyond the scope claimed by the present invention.

Claims (20)

  1. A signal processing method for a neuron in a spiking neural network, the spiking neural network comprising several layers, each of said layers comprising several of said neurons, characterized in that the signal processing method comprises the following steps:
    a receiving step: at least one of said neurons receives at least one input pulse train;
    an accumulation step: a membrane voltage is obtained based on a weighted summation of said at least one input pulse train;
    an activation step: when said membrane voltage exceeds a threshold, the amplitude of the pulse fired by the neuron is determined based on the ratio of said membrane voltage to said threshold.
  2. The signal processing method for a neuron in a spiking neural network according to claim 1, characterized in that determining the amplitude of the pulse fired by the neuron based on the ratio of said membrane voltage to said threshold is specifically: within a single simulation time step, the amplitude of the fired pulse is related to the ratio of said membrane voltage to said threshold.
  3. The signal processing method for a neuron in a spiking neural network according to claim 1, characterized in that determining the amplitude of the pulse fired by the neuron based on the ratio of said membrane voltage to said threshold is specifically: within a single simulation time step, the ratio of the amplitude of the fired pulse to the unit pulse amplitude equals the value of the ratio of said membrane voltage to said threshold rounded down.
  4. The signal processing method for a neuron in a spiking neural network according to any one of claims 1-3, characterized in that obtaining the membrane voltage based on the weighted summation of said at least one input pulse train specifically comprises: obtaining the membrane voltage based on a weighted summation of the post-synaptic potential kernel convolved with each input pulse train.
  5. The signal processing method for a neuron in a spiking neural network according to claim 4, characterized in that obtaining the membrane voltage based on the weighted summation of said at least one input pulse train specifically comprises: obtaining the membrane voltage based on a weighted summation of the post-synaptic potential kernel convolved with each input pulse train, and a convolution of the refractory kernel with the output pulse train of said neuron.
  6. The signal processing method for a neuron in a spiking neural network according to claim 4, characterized in that:
    υ(t) = Σ_j ω_j·(∈*s_j)(t)
    where υ(t) is the neuron's membrane voltage, ω_j is the j-th synaptic weight, ∈(t) is the post-synaptic potential kernel, s_j(t) is the j-th input pulse train, '*' is the convolution operation, and t is time.
  7. The signal processing method for a neuron in a spiking neural network according to claim 5, characterized in that:
    υ(t) = (η*s')(t) + Σ_j ω_j·(∈*s_j)(t)
    where υ(t) is the neuron's membrane voltage, η(t) is the refractory kernel, s'(t) is the output pulse train of said neuron, ω_j is the j-th synaptic weight, ∈(t) is the post-synaptic potential kernel, s_j(t) is the j-th input pulse train, '*' is the convolution operation, and t is time.
  8. The signal processing method for a neuron in a spiking neural network according to claim 6, characterized in that the post-synaptic potential kernel ∈(t) = (∈_s*∈_υ)(t), with synaptic dynamic function ∈_s(t) and membrane dynamic function ∈_υ(t) [formulas as reproduced in the original claim], where τ_s is the synaptic time constant, τ_υ is the membrane time constant, and t is time.
  9. The signal processing method for a neuron in a spiking neural network according to claim 7, characterized in that the post-synaptic potential kernel ∈(t) = (∈_s*∈_υ)(t), with synaptic dynamic function ∈_s(t) and membrane dynamic function ∈_υ(t) [formulas as reproduced in the original claim], where τ_s is the synaptic time constant, τ_υ is the membrane time constant, and t is time; the refractory kernel is η(t) [formula as reproduced in the original claim]; θ is the threshold, and when υ(t) ≥ θ, s'(t) = ⌊υ(t)/θ⌋, otherwise s'(t) = 0.
  10. A spiking neural network training method, the spiking neural network comprising several layers, each of said layers comprising several neurons, characterized in that:
    when said neurons process signals during network training, the following steps are included:
    a receiving step: at least one of said neurons receives at least one input pulse train;
    an accumulation step: a membrane voltage is obtained based on a weighted summation of said at least one input pulse train;
    an activation step: when said membrane voltage exceeds a threshold, the amplitude of the pulse fired by the neuron is determined based on the ratio of said membrane voltage to said threshold;
    the total loss of the spiking neural network comprises a first loss and a second loss, wherein the first loss reflects the gap between the expected output of the spiking neural network and the actual output of the spiking neural network, and the second loss reflects the activity or activity level of the neurons.
  11. The spiking neural network training method according to claim 10, characterized in that the training method further comprises:
    detecting the peak of the output trace;
    calculating the first loss at the moment corresponding to said peak of the output trace;
    calculating the second loss, which reflects the activity or activity level of the neurons;
    merging the first loss and the second loss into the total loss;
    training said neural network with the error backpropagation algorithm according to the function corresponding to the total loss.
  12. The spiking neural network training method according to claim 11, characterized in that merging the first loss and the second loss into the total loss is specifically: total loss = first loss + α · second loss, where the parameter α is a tuning parameter.
  13. The spiking neural network training method according to claim 10, characterized in that the second loss is [formula as reproduced in the original claim], where T is the duration, N_neurons is the size of the neuron population, H(·) is the Heaviside function, and s_i^t is the i-th neuron at time step t.
  14. The spiking neural network training method according to claim 10, characterized in that the first loss is −Σ_c λ_c·log(p_c), where λ_c = 1 when the class label c matches the current input, otherwise λ_c = 0; p_c is an indication of the relative likelihood, predicted by the neural network, that the current input belongs to class c.
  15. The spiking neural network training method according to any one of claims 10-14, characterized in that a periodic exponential function or the Heaviside function is used as the surrogate gradient.
  16. A training device, comprising a memory and at least one processor coupled to the memory, characterized in that it is configured to execute the neural network training method included in any one of claims 10-15.
  17. A storage device, characterized in that it is configured to store source code written in a programming language implementing the neural network training method included in any one of claims 10-15, or/and machine code that can run directly on a machine.
  18. A neural network accelerator, characterized in that the neurons included in the neural network accelerator apply the neuron signal processing method according to claim 1 when performing inference.
  19. The neural network accelerator according to claim 18, characterized in that spike events in said neural network accelerator contain integers.
  20. A neuromorphic chip, characterized in that the neural network configuration parameters trained by the neural network training method included in any one of claims 10-15 are deployed thereon.
PCT/CN2021/123091 2021-07-16 2021-10-11 脉冲神经网络中神经元的信号处理方法及该网络训练方法 WO2023284142A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/251,000 US20230385617A1 (en) 2021-07-16 2021-10-11 Signal processing method for neuron in spiking neural network and method for training said network

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110808342.6 2021-07-16
CN202110808342.6A CN113255905B (zh) 2021-07-16 2021-07-16 脉冲神经网络中神经元的信号处理方法及该网络训练方法

Publications (1)

Publication Number Publication Date
WO2023284142A1 true WO2023284142A1 (zh) 2023-01-19

Family

ID=77180574

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/123091 WO2023284142A1 (zh) 2021-07-16 2021-10-11 脉冲神经网络中神经元的信号处理方法及该网络训练方法

Country Status (3)

Country Link
US (1) US20230385617A1 (zh)
CN (1) CN113255905B (zh)
WO (1) WO2023284142A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115862338A (zh) * 2023-03-01 2023-03-28 天津大学 一种机场交通流量预测方法、系统、电子设备及介质
CN116056285A (zh) * 2023-03-23 2023-05-02 浙江芯源交通电子有限公司 一种基于神经元电路的信号灯控制系统及电子设备
CN116306857A (zh) * 2023-05-18 2023-06-23 湖北大学 一种基于神经元膜高低电位采样的脉冲电路

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113255905B (zh) * 2021-07-16 2021-11-02 成都时识科技有限公司 脉冲神经网络中神经元的信号处理方法及该网络训练方法
CN113408713B (zh) * 2021-08-18 2021-11-16 成都时识科技有限公司 消除数据副本的方法、神经网络处理器及电子产品
CN113408671B (zh) * 2021-08-18 2021-11-16 成都时识科技有限公司 一种对象识别方法及装置、芯片及电子设备
CN113627603B (zh) * 2021-10-12 2021-12-24 成都时识科技有限公司 在芯片中实现异步卷积的方法、类脑芯片及电子设备
CN114936331A (zh) * 2022-04-18 2022-08-23 北京大学 位置预测方法、装置、电子设备及存储介质
CN114970829B (zh) * 2022-06-08 2023-11-17 中国电信股份有限公司 脉冲信号处理方法、装置、设备及存储
CN114998996B (zh) * 2022-06-14 2024-04-05 中国电信股份有限公司 具有运动属性信息的信号处理方法、装置、设备及存储
CN114861892B (zh) * 2022-07-06 2022-10-21 深圳时识科技有限公司 芯片在环代理训练方法及设备、芯片及电子设备
TWI832406B (zh) * 2022-09-01 2024-02-11 國立陽明交通大學 反向傳播訓練方法和非暫態電腦可讀取媒體
CN115169547B (zh) * 2022-09-09 2022-11-29 深圳时识科技有限公司 神经形态芯片及电子设备
CN115456149B (zh) * 2022-10-08 2023-07-25 鹏城实验室 脉冲神经网络加速器学习方法、装置、终端及存储介质
CN116205784B (zh) * 2023-05-04 2023-08-01 北京科技大学 一种基于事件时间触发神经元的光流识别系统
CN117556877B (zh) * 2024-01-11 2024-04-02 西南交通大学 基于数据脉冲特征评估的脉冲神经网络训练方法

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108304913A (zh) * 2017-12-30 2018-07-20 北京理工大学 一种利用脉冲神经元阵列来实现卷积功能的方法
US20200019850A1 (en) * 2018-07-12 2020-01-16 Commissariat à l'énergie atomique et aux énergies alternatives Circuit neuromorphique impulsionnel implementant un neurone formel
CN111639754A (zh) * 2020-06-05 2020-09-08 四川大学 一种神经网络的构建、训练、识别方法和系统、存储介质
CN112465134A (zh) * 2020-11-26 2021-03-09 重庆邮电大学 一种基于lif模型的脉冲神经网络神经元电路
CN113033795A (zh) * 2021-03-29 2021-06-25 重庆大学 基于时间步的二值脉冲图的脉冲卷积神经网络硬件加速器
CN113255905A (zh) * 2021-07-16 2021-08-13 成都时识科技有限公司 脉冲神经网络中神经元的信号处理方法及该网络训练方法

Family Cites Families (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH06274661A (ja) * 1993-03-18 1994-09-30 Hitachi Ltd シナプス回路およびそれを用いたニューラルネットワークシステム
CN105760930B (zh) * 2016-02-18 2018-06-05 天津大学 用于aer的多层脉冲神经网络识别系统
US10341669B2 (en) * 2016-12-20 2019-07-02 Intel Corporation Temporally encoding a static spatial image
US10956811B2 (en) * 2017-07-31 2021-03-23 Intel Corporation Variable epoch spike train filtering
CN108681772B (zh) * 2018-04-02 2020-09-29 北京大学 多模态神经元电路及神经元实现方法
CN108710770B (zh) * 2018-05-31 2022-03-25 杭州电子科技大学 一种面向多脉冲神经网络监督学习的精确突触调整方法
US20200019839A1 (en) * 2018-07-11 2020-01-16 The Board Of Trustees Of The Leland Stanford Junior University Methods and apparatus for spiking neural network computing based on threshold accumulation
US11861483B2 (en) * 2018-11-20 2024-01-02 Electronics And Telecommunications Research Institute Spike neural network circuit including comparator operated by conditional bias current
CN109948504B (zh) * 2019-03-13 2022-02-18 东软睿驰汽车技术(沈阳)有限公司 一种车道线识别方法及装置
US20220253674A1 (en) * 2019-05-30 2022-08-11 Nec Corporation Spiking neural network system, learning processing device, learning method, and recording medium
CN110210563B (zh) * 2019-06-04 2021-04-30 北京大学 基于Spike cube SNN的图像脉冲数据时空信息学习及识别方法
CN110647034B (zh) * 2019-09-04 2020-08-14 北京航空航天大学 一种脉冲等离子体推力器的神经网络控制方法
CN110705428B (zh) * 2019-09-26 2021-02-02 北京智能工场科技有限公司 一种基于脉冲神经网络的脸部年龄识别系统及方法
CN110659730A (zh) * 2019-10-10 2020-01-07 电子科技大学中山学院 基于脉冲神经网络的端到端功能性脉冲模型的实现方法
CN112130118B (zh) * 2020-08-19 2023-11-17 复旦大学无锡研究院 基于snn的超宽带雷达信号处理系统及处理方法
CN112101535B (zh) * 2020-08-21 2024-04-09 深圳微灵医疗科技有限公司 脉冲神经元的信号处理方法及相关装置
CN112183739B (zh) * 2020-11-02 2022-10-04 中国科学技术大学 基于忆阻器的低功耗脉冲卷积神经网络的硬件架构
CN112328398A (zh) * 2020-11-12 2021-02-05 清华大学 任务处理方法及装置、电子设备和存储介质
CN112529176A (zh) * 2020-12-03 2021-03-19 鹏城实验室 一种加速脉冲神经网络的训练方法、终端及存储介质
CN112633497B (zh) * 2020-12-21 2023-08-18 中山大学 一种基于重加权膜电压的卷积脉冲神经网络的训练方法
CN112699956B (zh) * 2021-01-08 2023-09-22 西安交通大学 一种基于改进脉冲神经网络的神经形态视觉目标分类方法
CN112990429A (zh) * 2021-02-01 2021-06-18 深圳市华尊科技股份有限公司 机器学习方法、电子设备及相关产品
CN113111758B (zh) * 2021-04-06 2024-01-12 中山大学 一种基于脉冲神经网络的sar图像舰船目标识别方法
CN112906828A (zh) * 2021-04-08 2021-06-04 周士博 一种基于时域编码和脉冲神经网络的图像分类方法

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108304913A (zh) * 2017-12-30 2018-07-20 北京理工大学 一种利用脉冲神经元阵列来实现卷积功能的方法
US20200019850A1 (en) * 2018-07-12 2020-01-16 Commissariat à l'énergie atomique et aux énergies alternatives Circuit neuromorphique impulsionnel implementant un neurone formel
CN111639754A (zh) * 2020-06-05 2020-09-08 四川大学 一种神经网络的构建、训练、识别方法和系统、存储介质
CN112465134A (zh) * 2020-11-26 2021-03-09 重庆邮电大学 一种基于lif模型的脉冲神经网络神经元电路
CN113033795A (zh) * 2021-03-29 2021-06-25 重庆大学 基于时间步的二值脉冲图的脉冲卷积神经网络硬件加速器
CN113255905A (zh) * 2021-07-16 2021-08-13 成都时识科技有限公司 脉冲神经网络中神经元的信号处理方法及该网络训练方法

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115862338A (zh) * 2023-03-01 2023-03-28 天津大学 一种机场交通流量预测方法、系统、电子设备及介质
CN116056285A (zh) * 2023-03-23 2023-05-02 浙江芯源交通电子有限公司 一种基于神经元电路的信号灯控制系统及电子设备
CN116306857A (zh) * 2023-05-18 2023-06-23 湖北大学 一种基于神经元膜高低电位采样的脉冲电路
CN116306857B (zh) * 2023-05-18 2023-07-18 湖北大学 一种基于神经元膜高低电位采样的脉冲电路

Also Published As

Publication number Publication date
CN113255905A (zh) 2021-08-13
US20230385617A1 (en) 2023-11-30
CN113255905B (zh) 2021-11-02

Similar Documents

Publication Publication Date Title
WO2023284142A1 (zh) 脉冲神经网络中神经元的信号处理方法及该网络训练方法
US10671912B2 (en) Spatio-temporal spiking neural networks in neuromorphic hardware systems
US10339447B2 (en) Configuring sparse neuronal networks
US9330355B2 (en) Computed synapses for neuromorphic systems
US9558442B2 (en) Monitoring neural networks with shadow networks
US20150170028A1 (en) Neuronal diversity in spiking neural networks and pattern classification
JP2017525038A (ja) ニューラルネットワークにおける畳込み演算の分解
US20150134582A1 (en) Implementing synaptic learning using replay in spiking neural networks
US9959499B2 (en) Methods and apparatus for implementation of group tags for neural models
US10902311B2 (en) Regularization of neural networks
US9721204B2 (en) Evaluation of a system including separable sub-systems over a multidimensional range
WO2023010663A1 (zh) 计算设备及电子设备
CN113609773B (zh) 基于小样本的数据可靠性评估结果预测性能的方法及系统
US20140310216A1 (en) Method for generating compact representations of spike timing-dependent plasticity curves
JP6193509B2 (ja) 可塑性シナプス管理
JP2016537712A (ja) シナプス遅延を動的に割り当てることおおよび検査すること
US9449272B2 (en) Doppler effect processing in a neural network model
WO2024103639A1 (zh) 支持在线学习的气体识别方法、装置、设备、介质和产品
JP6881693B2 (ja) ニューロモーフィック回路、ニューロモーフィックアレイの学習方法およびプログラム
US9342782B2 (en) Stochastic delay plasticity
CN115293249A (zh) 一种基于动态时序预测的电力系统典型场景概率预测方法
Gerlinghoff et al. Desire backpropagation: A lightweight training algorithm for multi-layer spiking neural networks based on spike-timing-dependent plasticity
TWI832406B (zh) 反向傳播訓練方法和非暫態電腦可讀取媒體
Chalasani et al. Application of artificial neural networks to forecast ITK inhibitor activity data
Singh Exploring Column Update Elimination Optimization for Spike-Timing-Dependent Plasticity Learning Rule

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21949915

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 18251000

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE