CN114492773A - Neural network batch standardization layer hardware implementation method, device, equipment and medium - Google Patents

Neural network batch standardization layer hardware implementation method, device, equipment and medium

Info

Publication number
CN114492773A
CN114492773A
Authority
CN
China
Prior art keywords
layer
result
neural network
convolution
calculation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111601714.4A
Other languages
Chinese (zh)
Inventor
高滨
周颖
刘琪
唐建石
张清天
钱鹤
吴华强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Beijing Superstring Academy of Memory Technology
Original Assignee
Tsinghua University
Beijing Superstring Academy of Memory Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University, Beijing Superstring Academy of Memory Technology filed Critical Tsinghua University
Priority to CN202111601714.4A priority Critical patent/CN114492773A/en
Publication of CN114492773A publication Critical patent/CN114492773A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Neurology (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to the technical field of neural network computing, and in particular to a neural network batch standardization layer hardware implementation method, device, equipment and medium, wherein the method comprises the following steps: storing weight parameters of the neural network in the form of conductance into a memristor array; obtaining a corresponding quantization result according to the actual current flowing through each source line of the memristor array, based on the convolution result of the last convolution layer; and sending the quantization result to the next convolution layer for convolution layer calculation. In this way, the ADC module commonly used in compute-in-memory tasks is implemented based on the memristor array, the BN layer calculation and the activation function module are realized through it, the extra overhead of the processor for BN layer calculation is saved, and the system energy efficiency is improved.

Description

Neural network batch standardization layer hardware implementation method, device, equipment and medium
Technical Field
The present application relates to the field of neural network computing technologies, and in particular, to a method, an apparatus, a device, and a medium for implementing a neural network batch normalization layer in hardware.
Background
The Batch Normalization (BN) layer is a common module in deep neural network training. It concentrates output results that would otherwise be widely dispersed into a certain range, which avoids the vanishing-gradient problem and accelerates network training. Hardware implementations of the BN layer often fold the BN computation into a comparison against an improved threshold.
In the related art, as shown in fig. 1, in a binary neural network the output value of each layer is 1 or -1. X_N, Y_N and Z_N are the results of the Nth convolution layer, the BN layer and the sign bit judgment module, respectively. After the output vector of the convolutional layer passes through the BN layer, its sign bit is judged and sent to the next convolutional layer for calculation, as shown in the following formula:

Z_N = sign(Y_N) = 1 if Y_N ≥ 0, and -1 otherwise,

wherein Y_N = k·X_N + b is the BN layer output and k, b are the linear coefficients of the BN layer.

The improved threshold scheme merges the BN layer calculation into the subsequent sign bit judgment module. Since sign(k·X_N + b) depends only on how X_N compares with the threshold -b/k, the combined module computes:

Z_N = 1 if (k > 0 and X_N ≥ -b/k) or (k < 0 and X_N ≤ -b/k); Z_N = -1 otherwise.

Because only the comparison direction flips according to the sign of k, the method avoids the complicated calculation steps of the BN layer and greatly reduces the calculation cost.
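The equivalence between "BN then sign bit" and the single threshold comparison can be sketched as follows. This is a minimal illustration with hypothetical values of the BN linear coefficients k and b; x denotes the convolution output fed to the BN layer:

```python
def bn_then_sign(xs, k, b):
    """Reference path: apply the BN linear transform z = k*x + b, then take the sign bit."""
    return [1 if k * x + b >= 0 else -1 for x in xs]

def improved_threshold(xs, k, b):
    """Improved path: fold the BN transform into one comparison against -b/k.
    k*x + b >= 0  <=>  x >= -b/k when k > 0, and x <= -b/k when k < 0."""
    t = -b / k
    if k > 0:
        return [1 if x >= t else -1 for x in xs]
    return [1 if x <= t else -1 for x in xs]

xs = [-3.0, -1.0, 0.0, 2.0, 5.0]
print(bn_then_sign(xs, 2.0, -1.0))        # -> [-1, -1, -1, 1, 1]
print(improved_threshold(xs, 2.0, -1.0))  # -> [-1, -1, -1, 1, 1]
```

The two paths produce identical sign bits for any k ≠ 0, which is why the binary-network scheme can discard the BN arithmetic entirely.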
However, the improved-threshold implementation of the BN layer is designed for binary neural networks, where each layer outputs only two values and the calculation steps are relatively simple. For higher-precision neural network hardware implementations the method is not applicable, and this problem urgently needs to be solved.
Content of application
The application provides a method, a device, equipment and a medium for hardware implementation of a neural network batch standardization layer, wherein the analog-to-digital converter (ADC) module commonly used in compute-in-memory tasks is implemented based on a memristor array; the BN layer calculation and the activation function module are realized through it, the extra overhead of the processor for BN layer calculation is saved, and the system energy efficiency is improved.
An embodiment of a first aspect of the present application provides a hardware implementation method for a neural network batch normalization layer, including the following steps:
storing weight parameters of the neural network in a form of conductance into the memristor array;
obtaining a corresponding quantization result according to the actual current flowing through each source line of the memristor array based on the convolution result of the last convolution layer; and
and sending the quantization result to the next convolution layer for convolution layer calculation.
Optionally, obtaining the corresponding quantization result according to the actual current flowing through each source line of the memristor array includes:
performing 8-bit quantization over a preset range to obtain the quantization result.
Optionally, the preset range is:

((0 − b)/k, (Z_max − b)/k)

wherein

k = γ/√(σ² + ε), b = β − γ·μ/√(σ² + ε),

Z_max is the upper limit of the batch normalization layer calculation, β, γ, σ, μ are parameters of the batch normalization layer, and ε is a small constant introduced to prevent the denominator from being zero.
Optionally, before obtaining the corresponding quantized result according to an actual current flowing through each source line of the memristor array, the method further includes:
and charging and discharging the capacitor in the integrator according to a preset charging and discharging strategy so as to integrate the convolution result of the last convolution layer.
Optionally, performing 8-bit quantization over the preset range to obtain the quantization result includes:
sending the output voltage of the integrator to an 8-bit ADC and quantizing the convolution result to a plurality of levels within a preset voltage range.
An embodiment of a second aspect of the present application provides a hardware implementation apparatus for a neural network batch normalization layer, including:
the storage module is used for storing the weight parameters of the neural network into the memristor array in a conductance mode;
the obtaining module is used for obtaining a corresponding quantization result according to actual current flowing through each source line of the memristor array based on the convolution result of the last convolution layer; and
and the quantization module is used for sending the quantization result to the next convolutional layer so as to calculate the convolutional layer.
Optionally, the obtaining module is specifically configured to:
and 8bit quantization is carried out on the preset range to obtain the quantization result.
Optionally, the preset range is:

((0 − b)/k, (Z_max − b)/k)

wherein

k = γ/√(σ² + ε), b = β − γ·μ/√(σ² + ε),

Z_max is the upper limit of the batch normalization layer calculation, β, γ, σ, μ are parameters of the batch normalization layer, and ε is a small constant introduced to prevent the denominator from being zero.

Optionally, before obtaining the corresponding quantization result according to the actual current flowing through each source line of the memristor array, the obtaining module is further configured to:
and charging and discharging the capacitor in the integrator according to a preset charging and discharging strategy so as to integrate the convolution result of the last convolution layer.
Optionally, the obtaining module is specifically configured to:
and sending the output voltage of the integrator to an 8-bit ADC, and quantizing the convolution result to a plurality of levels in a preset voltage.
An embodiment of a third aspect of the present application provides an electronic device, including: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the neural network batch standardization layer hardware implementation method according to the above embodiments.
A fourth aspect of the present application provides a computer-readable storage medium, on which a computer program is stored, the program being executed by a processor for implementing the neural network batch normalization layer hardware implementation method according to any one of claims 1 to 5.
Therefore, the weight parameters of the neural network can be stored into the memristor array in the form of conductance, the corresponding quantization result is obtained according to the actual current flowing through each source line of the memristor array on the basis of the convolution result of the previous convolution layer, and the quantization result is sent to the next convolution layer for convolution layer calculation. In this way, the ADC module commonly used in compute-in-memory tasks is implemented based on the memristor array, the BN layer calculation and the activation function module are realized through it, the extra overhead of the processor for BN layer calculation is saved, and the system energy efficiency is improved.
Additional aspects and advantages of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.
Drawings
The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
fig. 1 is a flowchart illustrating a convolutional layer with a BN layer according to an embodiment of the present disclosure;
FIG. 2 is a comparative illustration of BN layer calculation and ReLU function implemented in two ways;
FIG. 3 is a diagram illustrating the result of the Nth layer of convolution and the BN calculation result;
FIG. 4 is a flowchart of a hardware implementation method of a neural network batch normalization layer according to an embodiment of the present disclosure;
FIG. 5 is a diagram illustrating an implementation of a convolutional layer, a BN layer, and a ReLU activation function according to an embodiment of the present application;
FIG. 6 is an exemplary diagram of a BN layer implementation with quantization modules and ReLU activation function calculation;
FIG. 7 is an exemplary diagram of an apparatus for a neural network batch normalization layer hardware implementation according to an embodiment of the present application;
fig. 8 is an exemplary diagram of an electronic device according to an embodiment of the application.
Detailed Description
Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary and intended to be used for explaining the present application and should not be construed as limiting the present application.
A neural network batch normalization layer hardware implementation method, apparatus, device, and medium according to an embodiment of the present application are described below with reference to the accompanying drawings.
Before introducing the hardware implementation method of the neural network batch standardization layer according to the embodiments of the present application, the reason why 8-bit quantization is performed over the range ((0 − b)/k, (Z_max − b)/k), so that the original BN layer calculation and activation function steps can be skipped, will be described with reference to fig. 2 to 4.
As shown in fig. 2, fig. 2(a) is a flowchart of the calculation of the Nth layer of the neural network, and fig. 2(b) is a flowchart of the improved scheme that implements the BN layer calculation and the ReLU activation function.
Specifically, three modules, namely an Nth convolutional layer, a BN layer and a ReLU activation function in the neural network are taken as examples.
The convolution result of the Nth layer is:

Y_N = W_N · X_N

wherein Y_N is the convolution result of the Nth layer, W_N is the weight matrix obtained by flattening and splicing together the n convolution kernels, and X_N is the input vector of the Nth layer; Y_N is quantized by an 8-bit ADC.
The BN layer calculation formula is as follows:

Z_N = γ·(Y_N − μ)/√(σ² + ε) + β

The BN layer calculation step can therefore be viewed as a linear transformation function Z_N = k·Y_N + b, wherein

k = γ/√(σ² + ε), b = β − γ·μ/√(σ² + ε).
The calculation result of the BN layer needs to be sent to the ReLU activation function module:

X_{N+1} = ReLU(Z_N) = max(0, Z_N)

Suppose Z_N of the BN layer lies in the range (Z_min, Z_max) with Z_min < 0; then X_{N+1} lies within the range (0, Z_max). After the BN layer calculation, Z_N within the range (0, Z_max) needs to be quantized to 8 bits and sent to the ReLU activation function module; for example, Z_N = 0 corresponds to quantization result 0, and Z_N = Z_max corresponds to 255.
According to the conversion relationship between Z_N and Y_N above, Y_N lies in the range

((0 − b)/k, (Z_max − b)/k).
As can be seen from fig. 3, fig. 3(a) is a schematic diagram of the convolution layer output quantized to 8 bits (i.e. the result after the Nth layer convolution), and fig. 3(b) is a schematic diagram of quantizing the BN layer output value to 8 bits, as the improved network requires (i.e. the calculation result of the Nth BN layer). Therefore, to obtain the same quantization result as in fig. 3(b), the quantization range of Y_N needs to be corrected: assuming k > 0, the range (0, Z_max) corresponds to ((0 − b)/k, (Z_max − b)/k).
Therefore, the embodiments of the present application perform 8-bit quantization over the range ((0 − b)/k, (Z_max − b)/k), skipping the original BN layer calculation and activation function steps, and realize the BN layer calculation and the ReLU function in hardware through an improved output result quantization scheme.
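This range correction can be checked with a small numerical sketch. The BN parameter values γ, β, μ, σ and the limit Z_max below are hypothetical and chosen only for illustration, and k > 0 is assumed:

```python
import math

def bn(y, gamma, beta, mu, sigma, eps=1e-5):
    """Batch normalization: Z = gamma * (Y - mu) / sqrt(sigma^2 + eps) + beta."""
    return gamma * (y - mu) / math.sqrt(sigma ** 2 + eps) + beta

def quantize(x, lo, hi, levels=256):
    """Uniform 8-bit quantization of x over [lo, hi] to integer codes 0..levels-1."""
    x = min(max(x, lo), hi)
    return round((x - lo) / (hi - lo) * (levels - 1))

# Hypothetical BN parameters and range limit.
gamma, beta, mu, sigma, eps = 1.5, 0.3, 0.2, 2.0, 1e-5
k = gamma / math.sqrt(sigma ** 2 + eps)               # k > 0 here
b = beta - gamma * mu / math.sqrt(sigma ** 2 + eps)
z_max = 10.0

for y in [-5.0, 0.0, 3.0, 7.0, 15.0]:
    # Baseline pipeline: BN layer, ReLU, then 8-bit quantization over (0, Z_max).
    baseline = quantize(max(bn(y, gamma, beta, mu, sigma, eps), 0.0), 0.0, z_max)
    # Improved pipeline: quantize Y directly over ((0 - b)/k, (Z_max - b)/k).
    improved = quantize(y, (0.0 - b) / k, (z_max - b) / k)
    # The two codes agree (up to floating-point rounding at a code boundary).
    assert abs(baseline - improved) <= 1
```

Because Z_N = k·Y_N + b is monotone for k > 0, quantizing Y_N over the corrected range yields the same 8-bit codes as running BN, ReLU, and quantization explicitly.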
Specifically, fig. 4 is a schematic flowchart of a hardware implementation method of a neural network batch normalization layer according to an embodiment of the present disclosure.
As shown in fig. 4, the hardware implementation method of the neural network batch standardization layer includes the following steps:
in step S401, the weight parameters of the neural network are stored in the form of conductances into the memristor array.
In step S402, based on the convolution result of the last convolution layer, a corresponding quantization result is obtained according to the actual current flowing through each source line of the memristor array.
Optionally, in some embodiments, obtaining the corresponding quantization result according to the actual current flowing through each source line of the memristor array includes: performing 8-bit quantization over the preset range to obtain the quantization result.
Optionally, in some embodiments, the preset range is:

((0 − b)/k, (Z_max − b)/k)

wherein

b = β − γ·μ/√(σ² + ε), k = γ/√(σ² + ε),

and Z_max is the upper limit value of the batch normalization layer calculation result. β, γ, σ, μ are parameters of the batch normalization layer, where β and γ are trainable parameters that gradually converge during training, σ and μ are respectively the standard deviation and mean of the output values, determined by the training samples, and ε is a negligibly small constant introduced to prevent the denominator from being zero.
In some embodiments, performing 8-bit quantization over the preset range to obtain the quantization result includes: sending the output voltage of the integrator to an 8-bit ADC and quantizing the convolution result to a plurality of levels within a preset voltage range.
Optionally, in some embodiments, before obtaining the corresponding quantized result according to the actual current flowing through each source line of the memristor array, further includes: and charging and discharging the capacitor in the integrator according to a preset charging and discharging strategy so as to integrate the convolution result of the previous convolution layer.
Specifically, the weight parameters of the neural network can be stored in a 2T2R memristor array in the form of conductance, and the current value flowing through each source line needs to be converted into a voltage value and quantized. Taking the output quantization module of fig. 5 (comprising an integrator and an 8-bit ADC) as an example, the current-to-voltage module may convert the source line current into a voltage through a transimpedance amplifier or an integrator; the reference voltage of the integrator is 2.5 V. Integrating the convolution layer result means charging and discharging the capacitor C_integ in the integrator. Suppose that when Y_N takes one endpoint of its quantization range, (0 − b)/k or (Z_max − b)/k, the voltage on the capacitor drops to 2 V, and when it takes the other endpoint the capacitor integrates to 5 V. The output voltage of the integrator is then sent to an 8-bit ADC, and the result is quantized to 256 levels within 2-5 V. Compared with the implementation method shown in fig. 6, the subsequent BN layer and ReLU activation function calculation steps are no longer needed, which saves the processor the extra overhead of BN layer calculation and improves system energy efficiency.
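The integrator-plus-ADC chain above can be modeled as a linear map from the convolution result to a 2-5 V voltage followed by an 8-bit conversion. This is an idealized sketch: the linear current-to-voltage mapping and the saturation behavior are assumptions of the illustration, not details of the circuit:

```python
def integrator_adc(y, y_lo, y_hi, v_lo=2.0, v_hi=5.0, levels=256):
    """Idealized output quantization module: map a convolution result in
    [y_lo, y_hi] linearly to an integrator voltage in [v_lo, v_hi], then
    quantize the voltage with an 8-bit ADC to integer codes 0..levels-1."""
    y = min(max(y, y_lo), y_hi)  # results outside the range saturate
    v = v_lo + (y - y_lo) / (y_hi - y_lo) * (v_hi - v_lo)
    code = round((v - v_lo) / (v_hi - v_lo) * (levels - 1))
    return v, code

# Endpoints of a hypothetical Y_N range map to the 2 V and 5 V rails.
print(integrator_adc(0.0, 0.0, 10.0))   # -> (2.0, 0)
print(integrator_adc(10.0, 0.0, 10.0))  # -> (5.0, 255)
```

Feeding the ADC codes straight to the next layer is what replaces the explicit BN and ReLU steps.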
It should be noted that, for each convolution kernel in the convolution layer, the BN layer parameters k1, b1, k2, b2, ..., kn, bn corresponding to its output neuron are all different, so the value of each integration capacitor needs to be finely designed so that the subsequent voltage range falls between 2 V and 5 V, which enables multiplexing of the 8-bit ADC.
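Because every output column has its own (k_i, b_i), the per-column integration capacitor can be sized so that each column's full-scale current swing lands on the same 2-5 V window. A sketch of that sizing rule, assuming an ideal integrator where ΔV = ΔI·t/C (the numbers are hypothetical):

```python
def required_capacitance(i_min, i_max, t_int, v_swing=3.0):
    """Size C so that integrating source-line currents in [i_min, i_max] for
    t_int seconds sweeps the capacitor voltage across exactly v_swing volts:
    delta_V = delta_I * t / C  ->  C = (i_max - i_min) * t_int / v_swing."""
    return (i_max - i_min) * t_int / v_swing

# Two hypothetical columns with different current ranges can share one 8-bit
# ADC because each capacitor maps its own range onto the same 3 V (2-5 V) swing.
c1 = required_capacitance(0.0, 3e-6, 1e-6)   # 1.0e-12 F
c2 = required_capacitance(0.0, 6e-6, 1e-6)   # 2.0e-12 F
print(c1, c2)
```

A column with twice the current range simply gets twice the capacitance, so its integrator output still spans exactly the ADC's input window.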
In step S403, the quantization result is sent to the next convolutional layer for convolutional layer calculation.
Therefore, with this BN layer implementation scheme combined with the ADC, the BN layer calculation and ReLU activation function steps are no longer needed, and the quantization result can be directly sent to the next convolutional layer for calculation.
According to the neural network batch standardization layer hardware implementation method provided by the embodiment of the application, the weight parameters of the neural network can be stored in the memristor array in the form of conductance, the corresponding quantization result is obtained according to the actual current flowing through each source line of the memristor array on the basis of the convolution result of the previous convolution layer, and the quantization result is sent to the next convolution layer for convolution layer calculation. In this way, the ADC module commonly used in compute-in-memory tasks is implemented based on the memristor array, the BN layer calculation and the activation function module are realized through it, the extra overhead of the processor for BN layer calculation is saved, and the system energy efficiency is improved.
Next, a neural network batch normalization layer hardware implementation apparatus proposed according to an embodiment of the present application is described with reference to the drawings.
Fig. 7 is a block diagram illustrating an apparatus for implementing hardware of a neural network batch normalization layer according to an embodiment of the present application.
As shown in fig. 7, the apparatus 10 for implementing neural network batch normalization layer hardware includes: a storage module 100, an acquisition module 200 and a quantization module 300.
The storage module 100 is configured to store the weight parameters of the neural network in a conductance form into the memristor array;
the obtaining module 200 is configured to obtain a corresponding quantization result according to an actual current flowing through each source line of the memristor array based on a convolution result of the last convolution layer; and
the quantization module 300 is used to send the quantization result to the next convolutional layer for convolutional layer calculation.
Optionally, the obtaining module 200 is specifically configured to:
and 8bit quantization is carried out on the preset range to obtain a quantization result.
Optionally, the preset range is:

((0 − b)/k, (Z_max − b)/k)

wherein

k = γ/√(σ² + ε), b = β − γ·μ/√(σ² + ε),

Z_max is the upper limit of the batch normalization layer calculation, β, γ, σ, μ are parameters of the batch normalization layer, and ε is a small constant introduced to prevent the denominator from being zero.

Optionally, before obtaining the corresponding quantization result according to the actual current flowing through each source line of the memristor array, the obtaining module 200 is further configured to:
and charging and discharging the capacitor in the integrator according to a preset charging and discharging strategy so as to integrate the convolution result of the previous convolution layer.
Optionally, the obtaining module 200 is specifically configured to:
and (4) sending the output voltage of the integrator to an 8-bit ADC, and quantizing the convolution result to a plurality of levels in a preset voltage.
It should be noted that the foregoing explanation of the embodiment of the hardware implementation method for neural network batch standardization layer is also applicable to the hardware implementation device for neural network batch standardization layer of this embodiment, and is not repeated here.
According to the neural network batch standardization layer hardware implementation device provided by the embodiment of the application, the weight parameters of the neural network can be stored in the memristor array in the form of conductance, the corresponding quantization result is obtained according to the actual current flowing through each source line of the memristor array on the basis of the convolution result of the previous convolution layer, and the quantization result is sent to the next convolution layer for convolution layer calculation. In this way, the ADC module commonly used in compute-in-memory tasks is implemented based on the memristor array, the BN layer calculation and the activation function module are realized through it, the extra overhead of the processor for BN layer calculation is saved, and the system energy efficiency is improved.
Fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application. The electronic device may include:
a memory 801, a processor 802, and a computer program stored on the memory 801 and executable on the processor 802.
The processor 802, when executing the program, implements the neural network batch normalization layer hardware implementation method provided in the above embodiments.
Further, the electronic device further includes:
a communication interface 803 for communicating between the memory 801 and the processor 802.
A memory 801 for storing computer programs operable on the processor 802.
The memory 801 may include high-speed RAM memory, and may also include non-volatile memory, such as at least one disk memory.
If the memory 801, the processor 802 and the communication interface 803 are implemented independently, the communication interface 803, the memory 801 and the processor 802 may be connected to each other via a bus and communicate with each other. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended ISA (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in FIG. 8, but that does not indicate only one bus or one type of bus.
Optionally, in a specific implementation, if the memory 801, the processor 802, and the communication interface 803 are integrated on one chip, the memory 801, the processor 802, and the communication interface 803 may complete communication with each other through an internal interface.
The processor 802 may be a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more Integrated circuits configured to implement embodiments of the present Application.
The present embodiment also provides a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the neural network batch normalization layer hardware implementation method as above.
In the description herein, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, those skilled in the art may combine different embodiments or examples and the features of different embodiments or examples described in this specification without contradiction.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "a plurality" means at least two, e.g., two, three, etc., unless explicitly defined otherwise.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing steps of a custom logic function or process, and alternate implementations are included within the scope of the preferred embodiment of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of implementing the embodiments of the present application.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable storage medium, and when the program is executed, the program includes one or a combination of the steps of the method embodiments.
In addition, the functional units in the embodiments of the present application may be integrated into one processing module, or each unit may exist alone physically, or two or more units may be integrated into one module. The integrated module may be implemented in hardware or as a software functional module. If implemented as a software functional module and sold or used as a stand-alone product, the integrated module may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc. Although embodiments of the present application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present application, and that variations, modifications, substitutions and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.

Claims (12)

1. A hardware implementation method for a neural network batch normalization layer, comprising the following steps:
storing weight parameters of the neural network, in the form of conductance values, in a memristor array;
obtaining a corresponding quantization result according to the actual current flowing through each source line of the memristor array, based on the convolution result of the previous convolutional layer; and
sending the quantization result to the next convolutional layer for convolutional-layer calculation.
2. The method of claim 1, wherein obtaining the corresponding quantization result according to the actual current flowing through each source line of the memristor array comprises:
performing 8-bit quantization over a preset range to obtain the quantization result.
3. The method of claim 2, wherein the preset range is:

$\left[-\dfrac{b}{k},\ \dfrac{Z_{max}-b}{k}\right]$

wherein

$k=\dfrac{\gamma}{\sqrt{\sigma^{2}+\varepsilon}}$, $b=\beta-\dfrac{\gamma\mu}{\sqrt{\sigma^{2}+\varepsilon}}$,

$Z_{max}$ is the upper limit of the batch normalization layer calculation, β, γ, σ, μ are parameters of the batch normalization layer, and ε is a small constant that prevents division by zero.
4. The method of claim 2, further comprising, before obtaining the corresponding quantization result according to the actual current flowing through each source line of the memristor array:
charging and discharging a capacitor in an integrator according to a preset charging and discharging strategy, so as to integrate the convolution result of the previous convolutional layer.
5. The method of claim 4, wherein performing 8-bit quantization over the preset range to obtain the quantization result comprises:
sending the output voltage of the integrator to an 8-bit ADC, which quantizes the convolution result to a plurality of levels within a preset voltage range.
6. A hardware implementation apparatus for a neural network batch normalization layer, comprising:
a storage module configured to store weight parameters of the neural network, in the form of conductance values, in a memristor array;
an obtaining module configured to obtain a corresponding quantization result according to the actual current flowing through each source line of the memristor array, based on the convolution result of the previous convolutional layer; and
a quantization module configured to send the quantization result to the next convolutional layer for convolutional-layer calculation.
7. The apparatus of claim 6, wherein the obtaining module is specifically configured to:
perform 8-bit quantization over a preset range to obtain the quantization result.
8. The apparatus of claim 7, wherein the preset range is:

$\left[-\dfrac{b}{k},\ \dfrac{Z_{max}-b}{k}\right]$

wherein

$b=\beta-\dfrac{\gamma\mu}{\sqrt{\sigma^{2}+\varepsilon}}$, $k=\dfrac{\gamma}{\sqrt{\sigma^{2}+\varepsilon}}$,

$Z_{max}$ is the upper limit of the batch normalization layer calculation, β, γ, σ, μ are parameters of the batch normalization layer, and ε is a small constant that prevents division by zero.
9. The apparatus of claim 7, wherein, before obtaining the corresponding quantization result according to the actual current flowing through each source line of the memristor array, the obtaining module is further configured to:
charge and discharge a capacitor in an integrator according to a preset charging and discharging strategy, so as to integrate the convolution result of the previous convolutional layer.
10. The apparatus of claim 9, wherein the obtaining module is specifically configured to:
send the output voltage of the integrator to an 8-bit ADC, which quantizes the convolution result to a plurality of levels within a preset voltage range.
11. An electronic device, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor executes the program to implement the neural network batch normalization layer hardware implementation method of any one of claims 1-5.
12. A computer-readable storage medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the neural network batch normalization layer hardware implementation method of any one of claims 1-5.
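The method of claims 1-5 replaces an explicit batch normalization layer with a quantization step over a preset range. The following is an editor's numerical sketch of that idea, not text from the patent: batch normalization is the affine map z = k·x + b (with k = γ/√(σ²+ε) and b = β − γμ/√(σ²+ε)), so clipping z to [0, Z_max] is equivalent to clipping the raw convolution result x to a preset range, which an idealized 8-bit ADC then maps to 256 levels. All function and parameter names below are illustrative assumptions; the exact range formula stands in for the equation images referenced as Figure FDA placeholders in the claims.

```python
import numpy as np

def fold_bn(gamma, beta, mu, sigma, eps=1e-5):
    """Fold batch-norm parameters into a linear map z = k*x + b."""
    k = gamma / np.sqrt(sigma ** 2 + eps)
    b = beta - k * mu
    return k, b

def preset_range(gamma, beta, mu, sigma, z_max, eps=1e-5):
    """Input range whose batch-norm image is [0, z_max]."""
    k, b = fold_bn(gamma, beta, mu, sigma, eps)
    return (-b / k, (z_max - b) / k)

def quantize_8bit(x, lo, hi):
    """Uniform 8-bit quantization of x over [lo, hi] (idealized ADC)."""
    x = np.clip(x, lo, hi)
    return np.round((x - lo) / (hi - lo) * 255).astype(np.uint8)

# Example (illustrative values): gamma=2, beta=1, mu=0, sigma=1, Z_max=6.
lo, hi = preset_range(2.0, 1.0, 0.0, 1.0, 6.0)
codes = quantize_8bit(np.array([lo, 0.0, hi]), lo, hi)
```

Values at or below the lower range bound map to code 0 and values at or above the upper bound map to code 255; the batch normalization arithmetic itself never has to be performed at inference time, which is what makes the scheme attractive for memristor hardware.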
CN202111601714.4A 2021-12-24 2021-12-24 Neural network batch standardization layer hardware implementation method, device, equipment and medium Pending CN114492773A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111601714.4A CN114492773A (en) 2021-12-24 2021-12-24 Neural network batch standardization layer hardware implementation method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111601714.4A CN114492773A (en) 2021-12-24 2021-12-24 Neural network batch standardization layer hardware implementation method, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN114492773A true CN114492773A (en) 2022-05-13

Family

ID=81497025

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111601714.4A Pending CN114492773A (en) 2021-12-24 2021-12-24 Neural network batch standardization layer hardware implementation method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN114492773A (en)

Similar Documents

Publication Publication Date Title
CN112085186B (en) Method for determining quantization parameter of neural network and related product
CN108701250B (en) Data fixed-point method and device
US10853721B2 (en) Multiplier accumulator, network unit, and network apparatus
US11385863B2 (en) Adjustable precision for multi-stage compute processes
CN111723901A (en) Training method and device of neural network model
US11727277B2 (en) Method and apparatus for automatically producing an artificial neural network
CN112287968A (en) Image model training method, image processing method, chip, device and medium
WO2018119143A1 (en) Reference disturbance mitigation in successive approximation register analog to digtal converter
US20220236909A1 (en) Neural Network Computing Chip and Computing Method
CN113408715A (en) Fixed-point method and device for neural network
CN113157076B (en) Electronic equipment and power consumption control method
CN109649361B (en) Automobile electronic control brake gain adjusting method, system, equipment and storage medium
CN111027684A (en) Deep learning model quantification method and device, electronic equipment and storage medium
CN114492773A (en) Neural network batch standardization layer hardware implementation method, device, equipment and medium
CN112307850A (en) Neural network training method, lane line detection method, device and electronic equipment
CN112766397B (en) Classification network and implementation method and device thereof
KR20200054759A (en) Method and apparatus of quantization for weights of batch normalization layer
CN111383157A (en) Image processing method and device, vehicle-mounted operation platform, electronic equipment and system
CN115099396B (en) Full-weight mapping method and device based on memristor array
KR102155060B1 (en) Multi level memory device and its data sensing method
US20230058500A1 (en) Method and machine learning system to perform quantization of neural network
CN111783444B (en) Text vector generation method and device
CN112115825B (en) Quantification method, device, server and storage medium of neural network
CN114757348A (en) Model quantitative training method and device, storage medium and electronic equipment
CN113326914A (en) Neural network computing method and neural network computing device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination