CN110880033A - Neural network operation module and method - Google Patents

Neural network operation module and method

Info

Publication number
CN110880033A
CN110880033A (application CN201811041573.3A)
Authority
CN
China
Prior art keywords
precision
gradient
output neuron
neuron
output
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811041573.3A
Other languages
Chinese (zh)
Inventor
Inventor not disclosed (不公告发明人)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Cambricon Information Technology Co Ltd
Original Assignee
Shanghai Cambricon Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Cambricon Information Technology Co Ltd filed Critical Shanghai Cambricon Information Technology Co Ltd
Priority to CN201811041573.3A priority Critical patent/CN110880033A/en
Priority to EP19803375.5A priority patent/EP3624020A4/en
Priority to PCT/CN2019/085844 priority patent/WO2019218896A1/en
Priority to US16/718,742 priority patent/US11409575B2/en
Priority to US16/720,145 priority patent/US11442785B2/en
Priority to US16/720,171 priority patent/US11442786B2/en
Publication of CN110880033A publication Critical patent/CN110880033A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Feedback Control In General (AREA)

Abstract

The invention discloses a neural network operation module comprising a storage unit, a controller unit, and an operation unit. The controller unit acquires the input neuron precision, the weight precision, and the output neuron gradient precision of the L-th layer from the storage unit; obtains a gradient update precision T according to the input neuron precision, the weight precision, and the output neuron gradient precision; and, when the gradient update precision T is smaller than a preset precision Tr, adjusts the input neuron precision, the weight precision, and the output neuron gradient precision. The operation unit represents the output neurons and weights of the L-th layer according to the adjusted input neuron precision and weight precision, and represents the L-th layer output neuron gradients obtained by operation according to the adjusted output neuron gradient precision, so as to perform subsequent operations. Embodiments of the invention meet the operation requirements while reducing the error of the operation result and the operation overhead, thereby saving operation resources.

Description

Neural network operation module and method
Technical Field
The invention relates to the field of neural networks, in particular to a neural network operation module and a method.
Background
A fixed-point number is a data format in which the position of the decimal point can be specified; the bit width is usually used to denote the data length of a fixed-point number. For example, the bit width of a 16-bit fixed-point number is 16. For a fixed-point number of given bit width, the precision of the representable data and the range of representable numbers trade off against each other: the finer the precision that can be represented, the smaller the range of numbers that can be represented. As shown in FIG. 1a, for a fixed-point data format with bit width bitnum, the first bit is the sign bit, the integer part occupies x bits, and the fractional part occupies s bits; the finest precision S that this fixed-point data format can represent is 2^(-s). The fixed-point data format can represent the range [neg, pos], where pos = (2^(bitnum-1) - 1) * 2^(-s) and neg = -(2^(bitnum-1)) * 2^(-s).
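As a minimal illustration of these formulas (not part of the patent text), the following Python sketch computes the precision and representable range of such a format; the function name is an assumption for this example only:

```python
def fixed_point_range(bitnum: int, s: int):
    """Precision and representable range of a signed fixed-point format.

    Layout assumed from FIG. 1a: 1 sign bit, (bitnum - 1 - s) integer
    bits, and s fractional bits.
    """
    precision = 2.0 ** -s                      # S = 2^(-s)
    pos = (2 ** (bitnum - 1) - 1) * precision  # largest representable value
    neg = -(2 ** (bitnum - 1)) * precision     # smallest representable value
    return precision, neg, pos

# A 16-bit format with 10 fractional bits:
S, neg, pos = fixed_point_range(16, 10)
print(S, neg, pos)  # 0.0009765625 -32.0 31.9990234375
```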
In neural network operations, data can be represented and operated on in a fixed-point data format. For example, during the forward operation, the data of the L-th layer includes the input neurons X(l), the output neurons Y(l), and the weights W(l). During the inverse operation, the data of the L-th layer includes the input neuron gradients ∇X(l), the output neuron gradients ∇Y(l), and the weight gradients ∇W(l). All of the above data may be represented by fixed-point numbers and operated on as fixed-point numbers.
The training process of a neural network generally comprises two steps: a forward operation and an inverse operation. During the inverse operation, the precision required by the input neuron gradients, the weight gradients, and the output neuron gradients may change, and may decrease as training proceeds. If the precision of the fixed-point numbers is redundant, the operation overhead increases and operation resources are wasted.
Disclosure of Invention
The technical problem to be solved by the embodiments of the present invention is that, in the neural network operation process, insufficient input neuron precision, weight precision, or output neuron gradient precision causes errors in the result of the operation or the training.
In a first aspect, the present invention provides a neural network operation module, configured to perform operations on a multilayer neural network, including:
the storage unit is used for storing the input neuron precision, the weight precision and the output neuron gradient precision;
a controller unit, configured to obtain the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇x(l) of the L-th layer of the multilayer neural network from the storage unit, wherein L is an integer greater than 0; obtain a gradient update precision T according to the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇x(l); and, when the gradient update precision T is smaller than a preset precision Tr, adjust the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇x(l) so that the absolute value of the difference between the gradient update precision T and the preset precision Tr is minimized;
an operation unit, configured to represent the output neurons and weights of the L-th layer according to the adjusted input neuron precision Sx(l) and weight precision Sw(l), and to represent the L-th layer output neuron gradients obtained by operation according to the adjusted output neuron gradient precision S∇x(l), so as to perform subsequent operations.
In a possible embodiment, the controller unit being configured to obtain the gradient update precision T according to the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇x(l) specifically comprises:

the controller unit is configured to calculate the gradient update precision T from the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇x(l) according to a first preset formula, which relates the three precisions to T (the formula itself is shown only as an image in the original).
In a possible embodiment, the controller unit adjusting the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇x(l) comprises:

the controller unit keeps the input neuron precision Sx(l) and the weight precision Sw(l) unchanged, and increases the output neuron gradient precision S∇x(l).
In a possible embodiment, when increasing the output neuron gradient precision S∇x(l), the controller unit reduces the bit width of the fixed-point data format representing the output neuron gradients.
In a possible embodiment, after the controller unit increases the output neuron gradient precision S∇x(l), the controller unit is further configured to:

judge whether the output neuron gradient precision S∇x(l) is smaller than a required precision, wherein the required precision is the minimum precision of the output neuron gradients when performing the multilayer neural network operation;

when the output neuron gradient precision S∇x(l) is smaller than the required precision, reduce the bit width of the fixed-point data format representing the output neuron gradients.
In a possible embodiment, the controller unit reducing the bit width of the fixed-point data format representing the output neuron gradients comprises:

the controller unit reduces the bit width of the fixed-point data format representing the output neuron gradients according to a first preset step N1, where the first preset step N1 is 1, 2, 4, 6, 7, 8, or another positive integer.
In a possible embodiment, the controller unit reducing the bit width of the fixed-point data format representing the output neuron gradients comprises:

the controller unit reduces the bit width of the fixed-point data format representing the output neuron gradients in a 2-fold decreasing manner, i.e., by halving it.
In a possible embodiment, the controller unit is further configured to:

obtain the preset precision Tr according to a machine learning method; or

obtain the preset precision Tr according to the number of output neurons of the (L-1)-th layer, the learning rate, and the batch size (the number of samples in batch processing); the larger the number of output neurons of the (L-1)-th layer and the batch size, and the higher the learning rate, the larger the preset precision Tr.
In a second aspect, an embodiment of the present invention provides a neural network operation method, including:
obtaining the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇x(l) of the L-th layer of the neural network;

calculating a gradient update precision T according to the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇x(l);

when the gradient update precision T is smaller than a preset precision Tr, adjusting the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇x(l) so that the absolute value of the difference between the gradient update precision T and the preset precision Tr is minimized;

representing the output neurons and weights of the L-th layer according to the adjusted input neuron precision Sx(l) and weight precision Sw(l), and representing the L-th layer output neuron gradients obtained by operation according to the adjusted output neuron gradient precision S∇x(l), so as to perform subsequent operations.
In a possible embodiment, the calculating the gradient update precision T according to the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇x(l) comprises:

calculating the gradient update precision T from the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇x(l) according to a preset formula, which relates the three precisions to T (the formula itself is shown only as an image in the original).
In a possible embodiment, the adjusting the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇x(l) comprises:

keeping the input neuron precision Sx(l) and the weight precision Sw(l) unchanged, and increasing the output neuron gradient precision S∇x(l).
In a possible embodiment, the increasing the output neuron gradient precision S∇x(l) comprises reducing the bit width of the fixed-point data format representing the output neuron gradients.
In a possible embodiment, after the increasing the output neuron gradient precision S∇x(l), the method further comprises:

judging whether the output neuron gradient precision S∇x(l) is smaller than a required precision, wherein the required precision is the minimum precision of the output neuron gradients when performing the multilayer neural network operation;

when the output neuron gradient precision S∇x(l) is smaller than the required precision, reducing the bit width of the fixed-point data format representing the output neuron gradients.
In a possible embodiment, the reducing the bit width of the fixed-point data format representing the output neuron gradients comprises:

reducing the bit width of the fixed-point data format representing the output neuron gradients according to a first preset step N1, where the first preset step N1 is 1, 2, 4, 6, 7, 8, or another positive integer.
In a possible embodiment, the reducing the bit width of the fixed-point data format representing the output neuron gradients comprises:

reducing the bit width of the fixed-point data format representing the output neuron gradients in a 2-fold decreasing manner, i.e., by halving it.
In a possible embodiment, the method further comprises:

obtaining the preset precision Tr according to a machine learning method; or

obtaining the preset precision Tr according to the number of output neurons of the (L-1)-th layer, the learning rate, and the batch size; the larger the number of output neurons of the (L-1)-th layer and the batch size, and the higher the learning rate, the larger the preset precision Tr.
It can be seen that, in the solutions of the embodiments of the present invention, the input neuron precision Sx, the weight precision Sw, and the output neuron gradient precision S∇x are dynamically adjusted (increased or decreased) during the neural network operation, so that precision redundancy is reduced while the operation requirements are met, which reduces the operation overhead and avoids wasting operation resources.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. It is obvious that the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1a is a schematic diagram of a fixed-point data format;
fig. 1b is a schematic structural diagram of a neural network operation module according to an embodiment of the present invention;
fig. 2 is a schematic flow chart of a neural network operation method according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It is to be understood that the terminology used in the embodiments of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the examples of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
During neural network operations, as a series of operations such as addition, subtraction, multiplication, division, and convolution are performed, the input neurons, weights, and output neurons involved in the forward operation, as well as the input neuron gradients, weight gradients, and output neuron gradients involved in the reverse training, keep changing. The precision with which input neurons, weights, output neurons, input neuron gradients, weight gradients, and output neuron gradients are represented in the fixed-point data format may therefore need to be increased or decreased. If the precision of these data is insufficient, a large error occurs in the operation result, and the reverse training may even fail; if the precision of these data is redundant, unnecessary operation overhead is added and operation resources are wasted. The present application provides a neural network operation module and method that dynamically adjust the precision of these data during the neural network operation, so as to reduce the error of the operation result and improve its precision while meeting the operation requirements.
In the embodiments of the present application, the data precision is adjusted by adjusting the bit width of the data. Since the precision S = 2^(-s) of a fixed-point data format is determined by the bit width s of its fractional part, the precision can be adjusted by increasing or decreasing the fractional bit width. For example, when the precision of the fixed-point data format exceeds what the operation requires, i.e., when 2^(-s) is smaller than the required precision, the fractional bit width can be reduced (reducing s in FIG. 1a), which increases the precision value 2^(-s); this removes the precision redundancy of the fixed-point data format, reduces the operation overhead, and avoids wasting operation resources.
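As a concrete illustration (not taken from the patent), the following Python sketch shows how shrinking the fractional bit width s coarsens the grid on which a value can be represented; the function name and the sample value are assumptions for this example:

```python
def quantize(value: float, s: int) -> float:
    """Round `value` to the grid of a fixed-point format with s fractional bits."""
    step = 2.0 ** -s          # precision value S = 2^(-s)
    return round(value / step) * step

g = 0.337
for s in (10, 8, 6):          # shrinking the fractional bit width coarsens the grid
    print(s, quantize(g, s))
# 10 0.3369140625
#  8 0.3359375
#  6 0.34375
```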
Referring to fig. 1b, fig. 1b is a schematic structural diagram of a neural network operation module according to an embodiment of the present invention. The neural network operation module is used for performing operation of a multilayer neural network. As shown in fig. 1b, the neural network operation module 100 includes:
The storage unit 101 is configured to store the input neuron precision, the weight precision, and the output neuron gradient precision.
The controller unit 102 is configured to obtain the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇x(l) of the L-th layer of the multilayer neural network from the storage unit 101, wherein L is an integer greater than 0; obtain the gradient update precision T according to the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇x(l); and, when the gradient update precision T is smaller than the preset precision Tr, adjust the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇x(l).
In a possible embodiment, the storage unit 101 is further configured to store the input neurons, the weights, the output neurons, and the output neuron gradients. The controller unit 102 obtains the L-th layer input neurons, weights, and output neuron gradients from the storage unit 101, and obtains the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇x(l) from them. The bit width of the fixed-point data format used to represent the input neurons and of the fixed-point data format used to represent the weights is a first bit width, and the bit width of the fixed-point data format used to represent the output neuron gradients is a second bit width.
Optionally, the second bit width is greater than the first bit width.
Further, the second bit width is twice the first bit width, so as to facilitate processing by an electronic computer.
Further, the first bit width is preferably 8 bits, and the second bit width is preferably 16 bits.
The controller unit 102 may set the preset precision Tr empirically in advance; or obtain a Tr matched to the input parameters by a second preset formula as the input parameters change; or obtain Tr by a machine learning method.

Optionally, the controller unit 102 sets the preset precision Tr according to the learning rate and the batch size (the number of samples in batch processing).

Further, if there is a layer with shared parameters in the neural network (such as a convolutional layer or a recurrent neural network layer), the controller unit 102 sets the preset precision Tr according to the number of output neurons of the previous layer, the batch size, and the learning rate: the larger the number of output neurons of the previous layer, the larger the batch size, and the higher the learning rate, the larger the preset precision Tr.
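The patent does not give the concrete dependence, but a heuristic with the stated monotonicity (larger previous-layer width, batch size, and learning rate give a larger Tr) could look like the following sketch; the `base` factor and the logarithmic form are purely illustrative assumptions:

```python
import math

def preset_precision_tr(n_prev_out: int, batch_size: int, learning_rate: float,
                        base: float = 2.0 ** -10) -> float:
    """Illustrative heuristic only: Tr grows with the number of output neurons
    of the previous layer, the batch size, and the learning rate, as the text
    requires; the exact dependence is an assumption, not the patented formula."""
    return base * math.log2(1 + n_prev_out * batch_size) * learning_rate

print(preset_precision_tr(n_prev_out=256, batch_size=32, learning_rate=0.01))
```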
Specifically, after obtaining the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇x(l), the controller unit 102 calculates the gradient update precision T from them according to the first preset formula (shown only as an image in the original).
The controller unit 102 adjusting the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇x(l) comprises:

the controller unit 102 keeps the input neuron precision Sx(l) and the weight precision Sw(l) unchanged, and increases the output neuron gradient precision S∇x(l).
It should be noted that, since the output neuron gradient precision S∇x(l) equals 2^(-s1), the controller unit 102 increasing the output neuron gradient precision S∇x(l) means reducing the fractional bit width s1 of the fixed-point data format representing the output neuron gradients.
Optionally, the controller unit 102 reduces the fractional bit width s1 of the fixed-point data format representing the output neuron gradients by a first preset step N1, according to the value of Tr - T.
Specifically, the controller unit 102 reduces the fractional bit width s1 of the fixed-point data format representing the output neuron gradients by N1 bits at a time. After the first reduction the fractional bit width is s1 - N1, giving a new output neuron gradient precision S∇x(l) = 2^(-(s1-N1)); the controller unit 102 then recomputes the gradient update precision T according to the preset formula and judges whether the absolute value of the difference between T and the preset precision Tr has become smaller. If it has, the controller unit 102 continues, reducing the fractional bit width to s1 - 2*N1, obtaining the new output neuron gradient precision, and again judging whether the absolute value of the difference between T and Tr has decreased. If it keeps decreasing, the processing continues in the same way; if at the n-th reduction the absolute value of the difference between T and Tr increases, the controller unit 102 takes the bit width obtained at the (n-1)-th reduction, i.e., s1 - (n-1)*N1, as the fractional bit width of the fixed-point data format representing the output neuron gradients, and the output neuron gradient precision after the reduction is 2^(-(s1-(n-1)*N1)).
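A minimal Python sketch of this search loop, assuming a caller-supplied `compute_T` standing in for the patent's preset formula (which is shown only as an image in the original), might look like this:

```python
from typing import Callable

def shrink_fraction_bits(s1: int, n1: int, t_r: float,
                         compute_T: Callable[[int], float]) -> int:
    """Reduce the fractional bit width of the output-neuron-gradient format
    in steps of n1, stopping when |T - Tr| stops decreasing.

    `compute_T(frac_bits)` is a placeholder for the patent's preset formula,
    which maps the current precisions to the gradient update precision T.
    """
    best_gap = abs(compute_T(s1) - t_r)
    while s1 - n1 > 0:
        candidate = s1 - n1                 # try one more reduction of N1 bits
        gap = abs(compute_T(candidate) - t_r)
        if gap >= best_gap:                 # |T - Tr| grew: keep the previous width
            break
        s1, best_gap = candidate, gap
    return s1
```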
Optionally, the first preset step N1 is 1, 2, 4, 6, 7, 8 or other positive integer.
Alternatively, the controller unit 102 may reduce the fractional bit width of the fixed-point data format representing the output neuron gradients in a 2-fold decreasing manner, i.e., by halving it.

For example, if the fractional bit width of the fixed-point data format representing the output neuron gradients is 4, i.e., the output neuron gradient precision is 2^(-4), then after halving, the fractional bit width is 2, i.e., the output neuron gradient precision after the reduction is 2^(-2).
In a possible embodiment, after the controller unit 102 determines the reduction amount b of the fractional bit width of the fixed-point data format representing the output neuron gradients, the controller unit 102 may perform the reduction in several steps. For example, it may reduce the fractional bit width twice, with a first reduction of b1 and a second reduction of b2, where b = b1 + b2; b1 and b2 may be equal or different.
Optionally, when increasing the output neuron gradient precision S∇x(l), the controller unit 102 also reduces the bit width of the fixed-point data format representing the output neuron gradients.
Further, the output neuron gradient precision S∇x(l) is increased by reducing the fractional bit width of the fixed-point data format representing the output neuron gradients while the total bit width of that format stays the same; as a result, the integer part gains bits, the data range represented by the format grows, but the precision value represented by the format also grows. Therefore, after increasing the precision S∇x(l), the controller unit 102 reduces the total bit width of the fixed-point data format such that the bit width of the integer part remains unchanged, i.e., the reduction in the total bit width equals the reduction in the fractional bit width; this guarantees that the maximum value represented by the format stays the same even though the fractional bit width has changed.

For example, suppose the bit width of the fixed-point data format is 9, of which the sign bit occupies 1 bit, the integer part 5 bits, and the fractional part 4 bits. After the controller unit 102 reduces the fractional bit width and the total bit width, the fractional part occupies 2 bits while the integer part still occupies 5 bits; that is, the fractional bit width is reduced and the integer bit width remains unchanged.
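A small sketch of this bookkeeping (illustrative only; the type and field names are not from the patent):

```python
from dataclasses import dataclass

@dataclass
class FixedPointFormat:
    int_bits: int    # integer part (excluding the sign bit)
    frac_bits: int   # fractional part; precision value is 2^(-frac_bits)

    @property
    def total_bits(self) -> int:
        return 1 + self.int_bits + self.frac_bits  # 1 sign bit

    def coarsen(self, d: int) -> "FixedPointFormat":
        """Drop d fractional bits and d total bits, keeping int_bits fixed,
        so the maximum representable value is unchanged."""
        return FixedPointFormat(self.int_bits, self.frac_bits - d)

fmt = FixedPointFormat(int_bits=5, frac_bits=4)   # 9 bits total
print(fmt.total_bits, fmt.coarsen(2).total_bits)  # 9 7
```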
In a possible embodiment, after the controller unit 102 increases the output neuron gradient precision S∇x(l), the controller unit 102 is further configured to:

judge whether the output neuron gradient precision S∇x(l) is smaller than the required precision, wherein the required precision is the minimum precision of the output neuron gradients when performing the multilayer neural network operation;

when the output neuron gradient precision S∇x(l) is smaller than the required precision, reduce the bit width of the fixed-point data format representing the output neuron gradients.
It should be noted that the reason the controller unit 102 increases the output neuron gradient precision S∇x(l) is that the output neuron gradient precision S∇x(l) is smaller than the required precision, i.e., there is precision redundancy, which increases the operation overhead and wastes operation resources. Therefore, in order to reduce the operation overhead and avoid wasting operation resources, the output neuron gradient precision S∇x(l) needs to be increased.
Specifically, as described above, after the controller unit 102 increases the output neuron gradient precision S∇x(l), it is further necessary to judge whether precision redundancy remains, i.e., to judge whether the output neuron gradient precision S∇x(l) is still smaller than the required precision. When the output neuron gradient precision S∇x(l) is determined to be smaller than the required precision, the bit width of the fixed-point data format representing the output neuron gradients is reduced in order to further increase the output neuron gradient precision S∇x(l) and reduce the precision redundancy.
It should be noted that when the controller unit 102 reduces the bit width of the fixed-point data format here, it specifically reduces the bit width of the integer part of the fixed-point data format.
Further, the controller unit 102 reducing the bit width of the fixed-point data format representing the output neuron gradients comprises:

the controller unit 102 reduces the bit width of the fixed-point data format representing the output neuron gradients according to a second preset step N2, where the second preset step N2 may be 1, 2, 3, 4, 5, 7, 8, or another positive integer. Specifically, each time the controller unit 102 determines to reduce the bit width of the fixed-point data format, the reduction amount is the second preset step N2.
In a possible embodiment, the controller unit 102 reducing the bit width of the fixed-point data format representing the output neuron gradients comprises:

the controller unit 102 reduces the bit width of the fixed-point data format representing the output neuron gradients in a 2-fold decreasing manner, i.e., by halving it.

For example, if the bit width of the fixed-point data format excluding the sign bit is 16, then after one halving the bit width excluding the sign bit is 8, and after a further halving it is 4.
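A sketch of the two reduction policies just described (fixed step N2 versus halving), with illustrative function names:

```python
def reduce_by_step(bits: int, n2: int) -> int:
    """Fixed-step policy: drop N2 bits per reduction."""
    return bits - n2

def reduce_by_halving(bits: int) -> int:
    """2-fold decreasing policy: halve the non-sign bit width."""
    return bits // 2

print(reduce_by_halving(16), reduce_by_halving(8))  # 8 4
```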
In a possible embodiment, the controller unit 102 adjusting the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇x(l) comprises:

the controller unit 102 increases the input neuron precision Sx(l) and/or the output neuron gradient precision S∇x(l) while keeping the weight precision Sw(l) unchanged; or

the controller unit 102 increases the input neuron precision Sx(l) and reduces the output neuron gradient precision S∇x(l) while keeping the weight precision Sw(l) unchanged, the increase of the input neuron precision Sx(l) being greater than the decrease of the output neuron gradient precision S∇x(l); or

the controller unit 102 reduces the output neuron gradient precision S∇x(l) and increases the input neuron precision Sx(l) while keeping the weight precision Sw(l) unchanged, the decrease of the output neuron gradient precision S∇x(l) being smaller than the increase of the input neuron precision Sx(l); or

the controller unit 102 increases or decreases the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇x(l) so that the absolute value of the difference between the gradient update precision T and the preset precision Tr is minimized.
It should be noted here that the specific process by which the controller unit 102 reduces any of the weight precision Sw(l), the input neuron precision Sx(l), and the output neuron gradient precision S∇x(l) is analogous to the increasing operations of the controller unit 102 described above, and is not repeated here.
After the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇x(l) have been adjusted according to the above method, the operation unit 103 represents the L-th layer input neurons, weights, and output neuron gradients in the fixed-point data formats of the adjusted precisions during the operation, and then performs the subsequent operations.
It should be noted that the frequency with which the controller unit 102 calculates the gradient update precision T can be set flexibly according to requirements.

The controller unit 102 may adjust the frequency of calculating the gradient update precision T according to the number of training iterations in the neural network training process.

Optionally, in the neural network training process the controller unit 102 recalculates the gradient update precision T every iteration, or every preset number of iterations; or it sets the frequency according to how the gradient update precision T changes.

Optionally, the controller unit 102 sets the frequency of calculating the gradient update precision T according to the number of training iterations in the neural network training.
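For instance (an illustrative sketch, not the patent's implementation), recomputing T every `k` iterations inside a training loop could look like the following, where `compute_T` and `adjust_precisions` are placeholders for the patent's formula and adjustment policy:

```python
def train(num_iters: int, k: int, compute_T, adjust_precisions, t_r: float):
    """Recompute the gradient update precision T every k iterations and
    adjust the precisions when T < Tr."""
    for it in range(num_iters):
        # ... forward operation, inverse operation, weight update ...
        if it % k == 0:
            T = compute_T()
            if T < t_r:
                adjust_precisions()
```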
The operation unit 103 is configured to represent the L-th layer input neurons and weights according to the increased or decreased input neuron precision Sx(l) and weight precision Sw(l), and to represent the L-th layer output neuron gradients obtained by operation according to the increased or decreased output neuron gradient precision S∇x(l).

In other words, the operation unit is configured to represent the L-th layer input neurons in a fixed-point data format of the increased or decreased input neuron precision Sx(l), the L-th layer weights in a fixed-point data format of the increased or decreased weight precision Sw(l), and the L-th layer output neuron gradients in a fixed-point data format of the increased or decreased output neuron gradient precision S∇x(l), for the subsequent operations.
By dynamically adjusting (increasing or decreasing) the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇x(l) during the neural network operation, precision redundancy is reduced while the operation requirements are met, which reduces the operation overhead and avoids wasting operation resources.
Referring to fig. 2, fig. 2 is a schematic flow chart of a neural network operation method according to an embodiment of the present invention, and as shown in fig. 2, the method includes:
s201, the neural network operation module acquires the precision, the weight precision and the gradient precision of output neurons of the L-th layer of the neural network.
The input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇x(l) may all be equal, or partially equal, or pairwise unequal.

The neural network is a multilayer neural network, and the L-th layer input neuron precision Sx(l), weight precision Sw(l), and output neuron gradient precision S∇x(l) are, respectively, the input neuron precision, weight precision, and output neuron gradient precision of any layer of the multilayer neural network.
In a possible embodiment, the neural network operation module obtains the L-th layer input neurons, weights, and output neurons, and obtains the L-th layer input neuron precision Sx(l), weight precision Sw(l), and output neuron gradient precision S∇x(l) from them.
S202, the neural network operation module calculates the gradient update precision T according to the L-th layer input neuron precision, weight precision, and output neuron gradient precision.

Specifically, the neural network operation module calculates the gradient update precision T from the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇x(l) according to the first preset formula (shown only as an image in the original).
S203, when the gradient update precision T is smaller than the preset precision Tr, the neural network operation module adjusts the L-th layer input neuron precision, weight precision, and output neuron gradient precision so that the absolute value of the difference between the gradient update precision T and the preset precision Tr is minimized.
The bit width of the fixed-point data format used for representing the input neuron and the fixed-point data format used for representing the weight is a first bit width, and the bit width of the fixed-point data format used for representing the gradient of the output neuron is a second bit width.
Optionally, the second bit width is greater than the first bit width.
Further, the second bit width is twice the first bit width, so as to facilitate processing by an electronic computer.
Further, the first bit width is preferably 8 bits, and the second bit width is preferably 16 bits.
The preset precision Tr may be set empirically in advance; or a Tr matched to the input parameters may be obtained by a second preset formula as the input parameters change; or Tr may be obtained by a machine learning method.

Optionally, the neural network operation module sets the preset precision Tr according to the learning rate and the batch size (the number of samples in batch processing).

Further, if there is a layer with shared parameters in the neural network (such as a convolutional layer or a recurrent neural network layer), the preset precision Tr is set according to the number of output neurons of the previous layer, the batch size, and the learning rate: the larger the number of output neurons of the previous layer, the larger the batch size, and the higher the learning rate, the larger the preset precision Tr.
The neural network operation module adjusting the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇x(l) comprises:

the neural network operation module keeps the input neuron precision Sx(l) and the weight precision Sw(l) unchanged, and increases the output neuron gradient precision S∇x(l).
It should be noted that, since the output neuron gradient precision S∇x(l) equals 2^(-s1), the neural network operation module increasing the output neuron gradient precision S∇x(l) means reducing the fractional bit width s1 of the fixed-point data format representing the output neuron gradients.
Optionally, the controller unit of the neural network operation module reduces the fractional bit width s1 of the fixed-point data format representing the output neuron gradients by the first preset step N1, according to the value of Tr - T.
Specifically, the neural network operation module reduces the fractional bit width s1 of the fixed-point data format representing the output neuron gradients by N1 bits at a time. After the first reduction the fractional bit width is s1 - N1, giving a new output neuron gradient precision S∇x(l) = 2^(-(s1-N1)); the module then recomputes the gradient update precision T according to the preset formula and judges whether the absolute value of the difference between T and the preset precision Tr has become smaller. If it has, the neural network operation module continues, reducing the fractional bit width to s1 - 2*N1, obtaining the new output neuron gradient precision, and again judging whether the absolute value of the difference between T and Tr has decreased. If it keeps decreasing, the processing continues in the same way; if at the n-th reduction the absolute value of the difference between T and Tr increases, the module takes the bit width obtained at the (n-1)-th reduction, i.e., s1 - (n-1)*N1, as the fractional bit width of the fixed-point data format representing the output neuron gradients, and the output neuron gradient precision after the reduction is 2^(-(s1-(n-1)*N1)).
Optionally, the first preset step N1 is 1, 2, 4, 6, 7, 8 or other positive integer.
Optionally, the neural network operation module reduces the fractional bit width of the fixed-point data format representing the output neuron gradients in a 2-fold decreasing manner, i.e., by halving it.

For example, if the fractional bit width of the fixed-point data format representing the output neuron gradients is 4, i.e., the output neuron gradient precision is 2^(-4), then after halving, the fractional bit width is 2, i.e., the output neuron gradient precision after the reduction is 2^(-2).
In a possible embodiment, after the neural network operation module determines the reduction amount b of the fractional bit width of the fixed-point data format representing the output neuron gradients, it may perform the reduction in several steps; for example, it may reduce the fractional bit width twice, with a first reduction of b1 and a second reduction of b2, where b = b1 + b2; b1 and b2 may be equal or different.
Optionally, when increasing the output neuron gradient precision S∇x(l), the neural network operation module also reduces the bit width of the fixed-point data format representing the output neuron gradients.
Further, the output neuron gradient precision S∇x(l) is increased by reducing the fractional bit width of the fixed-point data format representing the output neuron gradients while the total bit width of that format stays the same; as a result, when the fractional bit width is reduced, the integer part gains bits, the data range represented by the format grows, but the precision value represented by the format also grows. Therefore, after increasing the precision S∇x(l), the neural network operation module reduces the total bit width of the fixed-point data format such that the bit width of the integer part remains unchanged, i.e., the reduction in the total bit width equals the reduction in the fractional bit width; this guarantees that the maximum value represented by the format stays the same even though the fractional bit width has changed.

For example, suppose the bit width of the fixed-point data format is 9, of which the sign bit occupies 1 bit, the integer part 5 bits, and the fractional part 4 bits. After the neural network operation module reduces the fractional bit width and the total bit width, the fractional part occupies 2 bits while the integer part still occupies 5 bits; that is, the fractional bit width is reduced and the integer bit width remains unchanged.
In a possible embodiment, after the neural network operation module increases the output neuron gradient precision S∇x(l), the neural network operation module is further configured to:

judge whether the output neuron gradient precision S∇x(l) is smaller than the required precision, wherein the required precision is the minimum precision of the output neuron gradients when performing the multilayer neural network operation;

when the output neuron gradient precision S∇x(l) is smaller than the required precision, reduce the bit width of the fixed-point data format representing the output neuron gradients.
It should be noted that the reason the neural network operation module increases the output neuron gradient precision S∇x(l) is that the output neuron gradient precision S∇x(l) is smaller than the required precision, i.e., there is precision redundancy, which increases the operation overhead and wastes operation resources. Therefore, in order to reduce the operation overhead and avoid wasting operation resources, the output neuron gradient precision S∇x(l) needs to be increased.
Specifically, as described above, after the neural network operation module increases the output neuron gradient precision S∇x(l), it is further necessary to judge whether precision redundancy remains, i.e., to judge whether the output neuron gradient precision S∇x(l) is still smaller than the required precision. When the output neuron gradient precision S∇x(l) is determined to be smaller than the required precision, the bit width of the fixed-point data format representing the output neuron gradients is reduced in order to further increase the output neuron gradient precision S∇x(l) and reduce the precision redundancy.
It should be noted that when the neural network operation module reduces the bit width of the fixed-point data format here, it specifically reduces the bit width of the integer part of the fixed-point data format.
Further, the neural network operation module reducing the bit width of the fixed-point data format representing the output neuron gradients comprises:

the neural network operation module reduces the bit width of the fixed-point data format representing the output neuron gradients according to the second preset step N2, where the second preset step N2 may be 1, 2, 3, 4, 5, 7, 8, or another positive integer. Specifically, each time the neural network operation module determines to reduce the bit width of the fixed-point data format, the reduction amount is the second preset step N2.
In a possible embodiment, the neural network operation module reducing the bit width of the fixed-point data format representing the output neuron gradients comprises:

the neural network operation module reduces the bit width of the fixed-point data format representing the output neuron gradients in a 2-fold decreasing manner, i.e., by halving it.

For example, if the bit width of the fixed-point data format excluding the sign bit is 16, then after one halving the bit width excluding the sign bit is 8, and after a further halving it is 4.
In one embodiment, the neural network operation module adjusting the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇x(l) comprises:

the neural network operation module increases the input neuron precision Sx(l) and/or the output neuron gradient precision S∇x(l) while keeping the weight precision Sw(l) unchanged; or

the neural network operation module increases the input neuron precision Sx(l) and reduces the output neuron gradient precision S∇x(l) while keeping the weight precision Sw(l) unchanged, the increase of the input neuron precision Sx(l) being greater than the decrease of the output neuron gradient precision S∇x(l); or

the neural network operation module reduces the output neuron gradient precision S∇x(l) and increases the input neuron precision Sx(l) while keeping the weight precision Sw(l) unchanged, the decrease of the output neuron gradient precision S∇x(l) being smaller than the increase of the input neuron precision Sx(l); or

the neural network operation module increases or decreases the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇x(l) so that the absolute value of the difference between the gradient update precision T and the preset precision Tr is minimized.
It should be noted that the specific process by which the neural network operation module reduces any of the weight precision Sw(l), the input neuron precision Sx(l), and the output neuron gradient precision S∇x(l) is analogous to the increasing operations of the neural network operation module described above, and is not repeated here.
S204, the neural network operation module represents the output neurons and weights of the L-th layer according to the adjusted input neuron precision and weight precision, and represents the L-th layer output neuron gradients obtained by operation according to the adjusted output neuron gradient precision, so as to perform the subsequent operations.
In other words, the operation unit represents the L-th layer input neurons in a fixed-point data format of the increased or decreased input neuron precision Sx(l), the L-th layer weights in a fixed-point data format of the increased or decreased weight precision Sw(l), and the L-th layer output neuron gradients in a fixed-point data format of the increased or decreased output neuron gradient precision S∇x(l), for the subsequent operations.
After adjusting the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇x(l) according to the above method, the neural network operation module recalculates the gradient update precision T during the operation; when the gradient update precision T is no longer greater than the preset precision Tr, the neural network operation module reduces the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇x(l) by referring to the method of step S203.
It should be noted that the frequency with which the neural network operation module calculates the gradient update precision T can be set flexibly according to requirements.

The neural network operation module may adjust the frequency of calculating the gradient update precision T according to the number of training iterations in the neural network training process.

Optionally, in the neural network training process the neural network operation module recalculates the gradient update precision T every iteration, or every preset number of iterations; or it sets the frequency according to how the gradient update precision T changes.

Optionally, the neural network operation module sets the frequency of calculating the gradient update precision T according to the number of training iterations in the neural network training.
It can be seen that, in the solutions of the embodiments of the present invention, the input neuron precision Sx, the weight precision Sw, and the output neuron gradient precision S∇x are dynamically adjusted during the neural network operation, so that precision redundancy is reduced while the operation requirements are met, which reduces the operation overhead and avoids wasting operation resources.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
While the invention has been described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (16)

1. A neural network operation module, wherein the neural network operation module is used for performing operations of a multilayer neural network, and comprises:
a storage unit, used for storing the input neuron precision, the weight precision and the output neuron gradient precision;
a controller unit, used for obtaining the input neuron precision Sx(l), the weight precision Sw(l) and the output neuron gradient precision S∇x(l) of the L-th layer of the multilayer neural network from the storage unit, wherein L is an integer greater than 0; obtaining a gradient update precision T according to the input neuron precision Sx(l), the weight precision Sw(l) and the output neuron gradient precision S∇x(l); and, when the gradient update precision T is smaller than a preset precision Tr, adjusting the input neuron precision Sx(l), the weight precision Sw(l) and the output neuron gradient precision S∇x(l) so that the absolute value of the difference between the gradient update precision T and the preset precision Tr is minimized;
an arithmetic unit, used for representing the input neurons and weights of the L-th layer according to the adjusted input neuron precision Sx(l) and weight precision Sw(l), and for representing the L-th layer output neuron gradient obtained by operation according to the adjusted output neuron gradient precision S∇x(l), so as to perform subsequent operations.
2. The module of claim 1, wherein the controller unit obtaining the gradient update precision T according to the input neuron precision Sx(l), the weight precision Sw(l) and the output neuron gradient precision S∇x(l) specifically comprises:
the controller unit calculating the gradient update precision T from the input neuron precision Sx(l), the weight precision Sw(l) and the output neuron gradient precision S∇x(l) according to a preset formula;
wherein the preset formula is: [formula image in the original, relating T to Sx(l), Sw(l) and S∇x(l)].
3. The module of claim 2, wherein the controller unit adjusting the input neuron precision Sx(l), the weight precision Sw(l) and the output neuron gradient precision S∇x(l) comprises:
the controller unit keeping the input neuron precision Sx(l) and the weight precision Sw(l) unchanged, and increasing the output neuron gradient precision S∇x(l).
4. The module of claim 3, wherein, when increasing the output neuron gradient precision S∇x(l), the controller unit reduces the bit width of the fixed-point data format representing the output neuron gradient.
5. The module of claim 3 or 4, wherein, after increasing the output neuron gradient precision S∇x(l), the controller unit is further configured to:
judge whether the output neuron gradient precision S∇x(l) is smaller than a required precision, the required precision being the minimum precision of the output neuron gradient when performing the multilayer neural network operation; and
when the output neuron gradient precision S∇x(l) is smaller than the required precision, reduce the bit width of the fixed-point data format representing the output neuron gradient.
6. The module of claim 4 or 5, wherein the controller unit reducing the bit width of the fixed-point data format representing the output neuron gradient comprises:
the controller unit reducing the bit width of the fixed-point data format representing the output neuron gradient according to a preset step length N1,
wherein the preset step length N1 is 1, 2, 4, 6, 7, 8 or another positive integer.
7. The module of claim 4 or 5, wherein the controller unit reducing the bit width of the fixed-point data format representing the output neuron gradient comprises:
the controller unit reducing the bit width of the fixed-point data format representing the output neuron gradient in a 2-fold decreasing manner.
8. The module of any one of claims 1 to 7, wherein the controller unit is further configured to:
obtain the preset precision Tr according to a machine learning method; or
obtain the preset precision Tr according to the number of output neurons of the (L-1)-th layer, the learning rate and the number of samples in a batch, wherein the greater the number of (L-1)-th layer output neurons, the number of samples in a batch and the learning rate, the greater the preset precision Tr.
9. A neural network operation method, comprising:
obtaining the input neuron precision Sx(l), the weight precision Sw(l) and the output neuron gradient precision S∇x(l) of the L-th layer of a neural network;
calculating a gradient update precision T according to the input neuron precision Sx(l), the weight precision Sw(l) and the output neuron gradient precision S∇x(l);
when the gradient update precision T is smaller than a preset precision Tr, adjusting the input neuron precision Sx(l), the weight precision Sw(l) and the output neuron gradient precision S∇x(l) so that the absolute value of the difference between the gradient update precision T and the preset precision Tr is minimized; and
representing the output neurons and weights of the L-th layer according to the adjusted input neuron precision Sx(l) and weight precision Sw(l), and representing the L-th layer output neuron gradient obtained by operation according to the adjusted output neuron gradient precision S∇x(l), for subsequent operations.
10. The method of claim 9, wherein the calculating the gradient update precision T according to the input neuron precision Sx(l), the weight precision Sw(l) and the output neuron gradient precision S∇x(l) comprises:
calculating the gradient update precision T from the input neuron precision Sx(l), the weight precision Sw(l) and the output neuron gradient precision S∇x(l) according to a preset formula;
wherein the preset formula is: [formula image in the original, relating T to Sx(l), Sw(l) and S∇x(l)].
11. The method of claim 10, wherein the adjusting the input neuron precision Sx(l), the weight precision Sw(l) and the output neuron gradient precision S∇x(l) comprises:
keeping the input neuron precision Sx(l) and the weight precision Sw(l) unchanged, and increasing the output neuron gradient precision S∇x(l).
12. The method of claim 11, wherein, when increasing the output neuron gradient precision S∇x(l), the bit width of the fixed-point data format representing the output neuron gradient is reduced.
13. The method of claim 12, wherein, after the increasing the output neuron gradient precision S∇x(l), the method further comprises:
judging whether the output neuron gradient precision S∇x(l) is smaller than a required precision, the required precision being the minimum precision of the output neuron gradient when performing the multilayer neural network operation; and
when the output neuron gradient precision S∇x(l) is smaller than the required precision, reducing the bit width of the fixed-point data format representing the output neuron gradient.
14. The method of claim 12 or 13, wherein the reducing the bit width of the fixed-point data format representing the output neuron gradient comprises:
reducing the bit width of the fixed-point data format representing the output neuron gradient according to a preset step length N1,
wherein the preset step length N1 is 1, 2, 4, 6, 7, 8 or another positive integer.
15. The method of claim 12 or 13, wherein the reducing the bit width of the fixed-point data format representing the output neuron gradient comprises:
reducing the bit width of the fixed-point data format representing the output neuron gradient in a 2-fold decreasing manner.
16. The method of any one of claims 9 to 15, further comprising:
obtaining the preset precision Tr according to a machine learning method; or
obtaining the preset precision Tr according to the number of output neurons of the (L-1)-th layer, the learning rate and the number of samples in a batch, wherein the greater the number of (L-1)-th layer output neurons, the number of samples in a batch and the learning rate, the greater the preset precision Tr.
CN201811041573.3A 2018-05-18 2018-09-06 Neural network operation module and method Pending CN110880033A (en)

Priority Applications (6)

Application Number Priority Date Filing Date Title
CN201811041573.3A CN110880033A (en) 2018-09-06 2018-09-06 Neural network operation module and method
EP19803375.5A EP3624020A4 (en) 2018-05-18 2019-05-07 Computing method and related product
PCT/CN2019/085844 WO2019218896A1 (en) 2018-05-18 2019-05-07 Computing method and related product
US16/718,742 US11409575B2 (en) 2018-05-18 2019-12-18 Computation method and product thereof
US16/720,145 US11442785B2 (en) 2018-05-18 2019-12-19 Computation method and product thereof
US16/720,171 US11442786B2 (en) 2018-05-18 2019-12-19 Computation method and product thereof

Publications (1)

Publication Number Publication Date
CN110880033A true CN110880033A (en) 2020-03-13

Family

ID=69727245

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811041573.3A Pending CN110880033A (en) 2018-05-18 2018-09-06 Neural network operation module and method

Country Status (1)

Country Link
CN (1) CN110880033A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination