CN110880037A - Neural network operation module and method - Google Patents
Neural network operation module and method
- Publication number
- CN110880037A (application CN201811040961.XA)
- Authority
- CN
- China
- Prior art keywords
- precision
- gradient
- output neuron
- neuron
- output
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
Abstract
The invention discloses a neural network operation module comprising a storage unit, a controller unit, and an operation unit. The controller unit acquires the input neuron precision, the weight precision, and the output neuron gradient precision of the L-th layer from the storage unit; obtains a gradient update precision T according to the input neuron precision, the weight precision, and the output neuron gradient precision; and, when the gradient update precision T is larger than a preset precision Tr, adjusts the input neuron precision, the weight precision, and the output neuron gradient precision. The operation unit represents the output neurons and weights of the L-th layer according to the adjusted input neuron precision and weight precision, and represents the L-th layer output neuron gradients obtained by operation according to the adjusted output neuron gradient precision, so as to perform subsequent operations. The embodiments of the invention meet the operation requirements while reducing both the error of the operation result and the operation overhead, thereby saving operation resources.
Description
Technical Field
The invention relates to the field of neural networks, in particular to a neural network operation module and a method.
Background
A fixed-point number is a data format in which the position of the decimal point is specified, and the bit width is usually used to denote the data length of a fixed-point number. For example, a 16-bit fixed-point number has a bit width of 16. For a given bit width, the representable precision and the representable range of numbers trade off against each other: the finer the precision that can be represented, the smaller the range of numbers that can be represented. As shown in FIG. 1a, for a fixed-point data format with bit width bitnum, the first bit is the sign bit, the integer part occupies x bits, and the fractional part occupies s bits; the maximum fixed-point precision S that this format can represent is 2^(-s). The format can represent values in the range [neg, pos], where pos = (2^(bitnum-1) - 1) * 2^(-s) and neg = -2^(bitnum-1) * 2^(-s).
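The fixed-point representation described above can be illustrated with a short sketch; the helper names encode and decode are not from the patent, and the rounding mode is an assumption:

```python
def encode(value: float, bitnum: int, s: int) -> int:
    """Quantize `value` into a signed fixed-point integer with `bitnum`
    total bits and `s` fractional bits, saturating at the format's range."""
    q = round(value * 2 ** s)                      # scale by 2^s and round
    lo, hi = -(2 ** (bitnum - 1)), 2 ** (bitnum - 1) - 1
    return max(lo, min(hi, q))                     # saturate to [neg, pos]

def decode(q: int, s: int) -> float:
    """Recover the real value represented by the fixed-point integer."""
    return q * 2.0 ** -s

# 16-bit format with 8 fractional bits: precision 2**-8 = 0.00390625.
x = decode(encode(3.14159, 16, 8), 8)   # 3.140625, within 2**-8 of the input
```

A value outside the range [neg, pos], such as 300.0 in this 16-bit format, saturates to pos rather than wrapping; handling such overflow is what the bit-width adjustments later in the text address.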
In neural network operations, data can be represented and operated on in a fixed-point data format. For example, during the forward operation, the data of the L-th layer includes the input neurons X(l), the output neurons Y(l), and the weights W(l). During the inverse operation, the data of the L-th layer includes the input neuron gradients, the output neuron gradients, and the weight gradients. All of the above data may be represented by fixed-point numbers and operated on as fixed-point numbers.
The training process of a neural network generally comprises two steps: a forward operation and an inverse operation. During the inverse operation, the precision required by the input neuron gradients, the weight gradients, and the output neuron gradients may change, and may increase as training proceeds. If the precision of the fixed-point numbers is insufficient, a large error occurs in the operation result, and training may even fail.
Disclosure of Invention
The technical problem to be solved by the embodiments of the present invention is that, during neural network operation, insufficient input neuron precision, weight precision, or output neuron gradient precision causes errors in the operation or training result.
In a first aspect, the present invention provides a neural network operation module, configured to perform operations on a multilayer neural network, including:
the storage unit is used for storing the input neuron precision, the weight precision and the output neuron gradient precision;
a controller unit, for obtaining the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision of the L-th layer of the multilayer neural network from the storage unit, wherein L is an integer greater than 0; for obtaining a gradient update precision T according to the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision; and, when the gradient update precision T is larger than a preset precision Tr, for adjusting the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision so that the absolute value of the difference between the gradient update precision T and the preset precision Tr is minimized;
an arithmetic unit, for representing the output neurons and weights of the L-th layer according to the adjusted input neuron precision Sx(l) and weight precision Sw(l), and for representing the L-th layer output neuron gradients obtained by operation according to the adjusted output neuron gradient precision, so as to perform subsequent operations.
In a possible embodiment, the controller unit obtaining the gradient update precision T according to the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision specifically comprises:
the controller unit calculating the gradient update precision T from the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision according to a preset formula;
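The preset formula itself is not reproduced in this text. The sketch below therefore uses a purely hypothetical stand-in, taking T to be the coarsest of the three precisions expressed as fractional bit widths; the patent's actual formula may differ:

```python
def gradient_update_precision(s_x: int, s_w: int, s_grad: int) -> float:
    """Hypothetical stand-in for the patent's preset formula: combine the
    input neuron, weight, and output neuron gradient precisions (given as
    fractional bit widths) into one gradient update precision T by taking
    the coarsest of the three, i.e. 2 to the minus smallest bit width."""
    return 2.0 ** -min(s_x, s_w, s_grad)

T = gradient_update_precision(8, 8, 10)   # 2**-8 == 0.00390625
T_r = 2.0 ** -6                           # illustrative preset precision Tr
needs_adjustment = T > T_r                # False for these widths
```

Under this stand-in, shrinking any fractional bit width coarsens T, which is the direction the adjustment logic in the claims reacts to.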
In a possible embodiment, the controller unit adjusting the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision comprises:
the controller unit keeping the input neuron precision Sx(l) and the weight precision Sw(l) unchanged while reducing the output neuron gradient precision.
In a possible embodiment, the controller unit reduces the output neuron gradient precision by increasing the bit width of the fixed-point data format representing the output neuron gradient.
In a possible embodiment, after the controller unit reduces the output neuron gradient precision, the controller unit is further configured to:
determine whether the output neuron gradient overflows when represented in the fixed-point data format representing the output neuron gradient; and,
when overflow is determined, increase the bit width of that fixed-point data format.
In one possible embodiment, the controller unit increases a bit width of a fixed-point data format representing the output neuron gradient, including:
the controller unit increases the bit width of the fixed point data format representing the gradient of the output neuron according to a first preset step length N1;
the first preset step length N1 is 1, 2, 4, 6, 7, 8, or another positive integer.
In one possible embodiment, the controller unit increases a bit width of a fixed-point data format representing the output neuron gradient, including:
the controller unit increases the bit width of the fixed-point data format representing the output neuron gradient by doubling it.
In a possible embodiment, the controller unit is further configured to:
obtain the preset precision Tr according to a machine learning method; or
obtain the preset precision Tr according to the number of output neurons of the (L-1)-th layer, the learning rate, and the number of samples in batch processing, wherein the greater the number of (L-1)-th layer output neurons, the number of samples in batch processing, and the learning rate, the larger the preset precision Tr.
In a second aspect, an embodiment of the present invention provides a neural network operation module, where the neural network operation module is configured to perform operations on a multilayer neural network, and includes:
a storage unit for storing output neuron gradients of the multilayer neural network;
a controller unit, for acquiring the output neuron gradients of the L-th layer of the multilayer neural network from the storage unit, wherein L is an integer greater than 0; for obtaining the number n1 of output neuron gradients whose absolute values are smaller than a first preset threshold among the L-th layer output neuron gradients; for obtaining proportion data a according to the number n1 and the number n2 of L-th layer output neuron gradients, where a = n1/n2; and, when the proportion data a is larger than a second preset threshold, for reducing the L-th layer output neuron gradient precision;
an arithmetic unit, for representing the L-th layer output neuron gradients according to the reduced output neuron gradient precision, so as to perform subsequent operations.
In one possible embodiment, the controller unit reduces the L-th layer output neuron gradient precision by increasing the bit width of the fixed-point data format representing the L-th layer output neuron gradient.
In one possible embodiment, after the controller unit reduces the L-th layer output neuron gradient precision, the controller unit is further configured to:
determine whether the L-th layer output neuron gradient overflows when represented in the fixed-point data format representing the L-th layer output neuron gradient; and,
when overflow is determined, increase the bit width of that fixed-point data format.
In one possible embodiment, the increasing the bit width of the fixed-point data format representing the L-th layer output neuron gradient includes:
the controller unit increases the bit width of the fixed-point data format representing the L-th layer output neuron gradient according to a second preset step length N2.
In one possible embodiment, the controller unit increasing a bit width of the fixed-point data format representing the L-th layer output neuron gradient comprises:
the controller unit increases the bit width of the fixed-point data format representing the L-th layer output neuron gradient by doubling it.
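The count-and-ratio test of this second aspect can be sketched as follows; the threshold values are illustrative, and note that reducing the gradient precision in the patent's sense means representing the gradients with more fractional bits:

```python
def should_reduce_gradient_precision(grads, t1=1e-3, t2=0.5):
    """Second-aspect test: count the output neuron gradients whose absolute
    value is below the first preset threshold t1 (n1), form the proportion
    a = n1/n2 over all n2 gradients, and signal that the gradient precision
    should be reduced (finer representation) when a exceeds the second
    preset threshold t2."""
    n1 = sum(1 for g in grads if abs(g) < t1)
    n2 = len(grads)
    a = n1 / n2
    return a > t2

# Most gradients are tiny, so a finer fractional representation is warranted:
print(should_reduce_gradient_precision([1e-5, 2e-4, 0.7, 1e-6]))  # True (a = 0.75)
```

When most gradient magnitudes fall below t1, the current fixed-point step is too coarse to resolve them, which is exactly the condition this test detects.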
In a third aspect, an embodiment of the present invention provides a neural network operation method, including:
obtaining the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision of the L-th layer of the neural network;
calculating a gradient update precision T according to the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision;
when the gradient update precision T is larger than a preset precision Tr, adjusting the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision so that the absolute value of the difference between the gradient update precision T and the preset precision Tr is minimized;
representing the output neurons and weights of the L-th layer according to the adjusted input neuron precision Sx(l) and weight precision Sw(l), and representing the L-th layer output neuron gradients obtained by operation according to the adjusted output neuron gradient precision, so as to perform subsequent operations.
In one possible embodiment, the calculating of the gradient update precision T according to the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision comprises:
calculating the gradient update precision T from the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision according to a preset formula;
In one possible embodiment, the adjusting of the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision comprises:
keeping the input neuron precision Sx(l) and the weight precision Sw(l) unchanged while reducing the output neuron gradient precision.
In one possible embodiment, the reducing of the output neuron gradient precision comprises increasing the bit width of the fixed-point data format representing the output neuron gradient.
In one possible embodiment, after the reducing of the output neuron gradient precision, the method further comprises:
determining whether the output neuron gradient overflows when represented in the fixed-point data format representing the output neuron gradient; and,
when overflow is determined, increasing the bit width of that fixed-point data format.
In one possible embodiment, the increasing the bit width of the fixed-point data format representing the output neuron gradient comprises:
increasing the bit width of the fixed point data format representing the gradient of the output neuron according to a first preset step length N1;
the first preset step length N1 is 1, 2, 4, 6, 7, 8, or another positive integer.
In one possible embodiment, the increasing the bit width of the fixed-point data format representing the output neuron gradient comprises:
increasing the bit width of the fixed-point data format representing the output neuron gradient by doubling it.
In a possible embodiment, the method further comprises:
obtaining the preset precision Tr according to a machine learning method; or
obtaining the preset precision Tr according to the number of output neurons of the (L-1)-th layer, the learning rate, and the number of samples in batch processing, wherein the greater the number of (L-1)-th layer output neurons, the number of samples in batch processing, and the learning rate, the larger the preset precision Tr.
In a fourth aspect, an embodiment of the present invention provides a neural network operation method, including:
obtaining the output neuron gradients of the L-th layer of the multilayer neural network, wherein L is an integer greater than 0;
obtaining the number n1 of output neuron gradients whose absolute values are smaller than a first preset threshold among the L-th layer output neuron gradients;
obtaining proportion data a according to the number n1 and the number n2 of L-th layer output neuron gradients, where a = n1/n2;
when the proportion data a is larger than a second preset threshold, reducing the L-th layer output neuron gradient precision;
representing the L-th layer output neuron gradients according to the reduced output neuron gradient precision, so as to perform subsequent operations.
In one possible embodiment, the reducing of the L-th layer output neuron gradient precision comprises increasing the bit width of the fixed-point data format representing the L-th layer output neuron gradient.
In one possible embodiment, after the reducing of the L-th layer output neuron gradient precision, the method further comprises:
determining whether the L-th layer output neuron gradient overflows when represented in the fixed-point data format representing the L-th layer output neuron gradient; and,
when overflow is determined, increasing the bit width of that fixed-point data format.
In one possible embodiment, the increasing the bit width of the fixed-point data format representing the L-th layer output neuron gradient includes:
increasing the bit width of the fixed-point data format representing the L-th layer output neuron gradient according to a third preset step length N2.
In one possible embodiment, the increasing the bit width of the fixed-point data format representing the L-th layer output neuron gradient includes:
increasing the bit width of the fixed-point data format representing the L-th layer output neuron gradient by doubling it.
It can be seen that, in the solutions of the embodiments of the present invention, the input neuron precision Sx, the weight precision Sw, and the output neuron gradient precision are dynamically adjusted (increased or decreased) during neural network operation, so as to reduce the error of the operation result and improve its precision while meeting the operation requirements.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1a is a schematic diagram of a fixed-point data format;
fig. 1b is a schematic structural diagram of a neural network operation module according to an embodiment of the present invention;
fig. 2 is a schematic flow chart illustrating a neural network operation method according to an embodiment of the present invention;
fig. 3 is a schematic flow chart of another neural network operation method according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It is to be understood that the terminology used in the embodiments of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the examples of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
During neural network operation, as a result of a series of operations such as addition, subtraction, multiplication, division, and convolution, the input neurons, weights, and output neurons involved in the forward operation and the input neuron gradients, weight gradients, and output neuron gradients involved in the reverse training process keep changing. The precision with which these data are represented in a fixed-point data format may therefore need to be increased or decreased. If their precision is insufficient, a large error occurs in the operation result, and reverse training may even fail; if their precision is redundant, unnecessary operation overhead is incurred and operation resources are wasted. The present application provides a neural network operation module and method that dynamically adjust the precision of these data during neural network operation, so as to reduce the error of the operation result and improve its precision while meeting the operation requirements.
In the embodiments of the present application, data precision is adjusted by adjusting the bit width of the data. For example, when the precision of the fixed-point data format cannot meet the requirements of the operation, the precision can be increased by increasing the bit width of the fractional part of the format, i.e. increasing s in FIG. 1a. However, since the total bit width of the fixed-point data format is fixed, increasing the bit width of the fractional part decreases the bit width of the integer part, and therefore the data range that the format can represent shrinks.
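The tradeoff described in this paragraph can be shown numerically; the helper below is an illustration, not the patent's implementation:

```python
def precision_and_range(bitnum: int, s: int):
    """For a signed fixed-point format with `bitnum` total bits and `s`
    fractional bits, return (precision, largest representable value)."""
    return 2.0 ** -s, (2 ** (bitnum - 1) - 1) * 2.0 ** -s

# With the total width fixed at 16 bits, raising s refines the precision
# but shrinks the representable range:
coarse = precision_and_range(16, 4)    # (0.0625, 2047.9375)
fine = precision_and_range(16, 10)     # (0.0009765625, 31.9990234375)
```

This is why the later embodiments pair a fractional-width increase with either an overflow check or a matching growth of the total bit width.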
Referring to fig. 1b, fig. 1b is a schematic structural diagram of a neural network operation module according to an embodiment of the present invention. The neural network operation module is used for performing operation of a multilayer neural network. As shown in fig. 1b, the neural network operation module 100 includes:
and the storage unit 101 is used for storing the input neuron precision, the weight precision and the output neuron gradient precision.
a controller unit 102, for obtaining the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision of the L-th layer of the multilayer neural network from the storage unit 101, wherein L is an integer greater than 0; for obtaining a gradient update precision T according to the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision; and, when the gradient update precision T is larger than a preset precision Tr, for adjusting the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision.
In a possible embodiment, the storage unit 101 is further configured to store input neurons, weights, output neurons, and output neuron gradients. The controller unit 102 obtains the L-th layer input neurons, weights, and output neuron gradients from the storage unit 101, and derives the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision from them.
The bit width of the fixed-point data format used for representing the input neurons and the bit width of the fixed-point data format used for representing the weights are a first bit width, and the bit width of the fixed-point data format used for representing the output neuron gradients is a second bit width.
Optionally, the second bit width is greater than the first bit width.
Further, the second bit width is twice the first bit width, so as to facilitate processing by an electronic computer.
Further, the first bit width is preferably 8 bits, and the second bit width is preferably 16 bits.
The controller unit 102 may set the preset precision Tr empirically in advance; may obtain a preset precision Tr matched to the input parameters by means of a second preset formula as the input parameters change; or may obtain Tr by a machine learning method.
Alternatively, the controller unit 102 sets the preset precision Tr according to the learning rate and the batch size (the number of samples in batch processing).
Further, if the neural network contains parameter-sharing layers (such as convolutional layers and recurrent neural network layers), the controller unit 102 sets the preset precision Tr according to the number of output neurons of the previous layer, the batch size, and the learning rate; that is, the greater the number of output neurons of the previous layer, the batch size, and the learning rate, the larger the preset precision Tr.
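The exact dependence of Tr on these quantities is not given in this text; the sketch below is an entirely illustrative monotone heuristic that merely respects the stated direction of the relationship:

```python
import math

def preset_precision_tr(n_out_prev: int, batch_size: int, lr: float,
                        base: float = 2.0 ** -10) -> float:
    """Illustrative heuristic, not the patent's formula: Tr grows with the
    previous layer's output neuron count, the batch size, and the learning
    rate, matching the qualitative rule stated in the text."""
    return base * math.log2(1 + n_out_prev * batch_size * lr)

# More neurons, larger batches, and a higher learning rate give a larger Tr:
loose = preset_precision_tr(1024, 64, 0.1)
tight = preset_precision_tr(256, 32, 0.01)
```

A larger Tr tolerates a coarser gradient update precision T before triggering an adjustment, which matches the intuition that bigger layers and batches average away more quantization noise.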
Specifically, after obtaining the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision, the controller unit 102 calculates the gradient update precision T from them according to a first preset formula.
The controller unit 102 adjusting the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision comprises:
the controller unit 102 keeping the input neuron precision Sx(l) and the weight precision Sw(l) unchanged while reducing the output neuron gradient precision.
It should be noted that, because the output neuron gradient precision is 2^(-s1), the controller unit 102 reducing the output neuron gradient precision means increasing the fractional part bit width s1 of the fixed-point data format representing the output neuron gradient.
Alternatively, the controller unit 102 may increase the fractional part bit width s1 of the fixed-point data format representing the output neuron gradient by a first preset step length N1 according to the value of Tr - T.
Specifically, for the fractional part bit width s1 of the fixed-point data format representing the output neuron gradient, the controller unit 102 increases it by N1 bits at a time, so the fractional bit width becomes s1 + N1, yielding a new output neuron gradient precision. It then uses the preset formula to judge whether the absolute value of the difference between the gradient update precision T and the preset precision Tr has decreased. If it has decreased, the controller unit 102 continues: it increases the fractional bit width by N1 again, to s1 + 2*N1, obtains the new output neuron gradient precision, and again judges whether the absolute value of the difference between T and Tr has decreased, proceeding in the same way. If at the n-th adjustment the absolute value of the difference between T and Tr increases, the controller unit 102 takes the bit width obtained at the (n-1)-th adjustment, namely s1 + (n-1)*N1, as the fractional part bit width of the fixed-point data format representing the output neuron gradient, and the output neuron gradient precision after the increase is determined by that bit width.
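The stepwise search described above, which widens the fractional part by N1 while the gap |T - Tr| keeps shrinking and backs off one step when it grows, can be sketched as follows; update_precision_T stands in for the patent's unreproduced preset formula:

```python
def adjust_fraction_width(s1: int, n1: int, t_r: float, update_precision_T) -> int:
    """Increase the fractional bit width s1 in steps of n1 while the gap
    |T - Tr| keeps shrinking; return the width from the last step that
    still improved (the back-off-by-one behaviour described in the text)."""
    best_s = s1
    best_gap = abs(update_precision_T(s1) - t_r)
    while True:
        cand = best_s + n1
        gap = abs(update_precision_T(cand) - t_r)
        if gap >= best_gap:      # gap grew or stalled: keep the previous width
            return best_s
        best_s, best_gap = cand, gap

# With T modelled simply as 2**-s and Tr = 2**-8, the search settles at s = 8:
s = adjust_fraction_width(3, 1, 2.0 ** -8, lambda s: 2.0 ** -s)
print(s)  # 8
```

The lambda here is a deliberately simple model of the preset formula; any monotone update_precision_T would drive the same back-off logic.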
Optionally, the first preset step N1 is 1, 2, 4, 6, 7, 8 or other positive integer.
Alternatively, the controller unit 102 increases the fractional part bit width of the fixed-point data format representing the output neuron gradient by doubling it.
For example, if the fractional part bit width of the fixed-point data format representing the output neuron gradient is 3, i.e. the output neuron gradient precision is 2^(-3), then after doubling, the fractional part bit width is 6, i.e. the reduced output neuron gradient precision is 2^(-6).
In one possible embodiment, after the controller unit 102 determines the total increase b of the fractional part bit width of the fixed-point data format representing the output neuron gradient, the controller unit 102 increases the fractional part bit width in multiple steps; for example, in two steps, with a first increase of b1 and a second increase of b2, where b = b1 + b2.
The values b1 and b2 may be the same or different.
Optionally, when the controller unit 102 reduces the output neuron gradient precision, the total bit width of the fixed-point data format representing the output neuron gradient is increased.
Further, reducing the output neuron gradient precision means increasing the fractional part bit width of the fixed-point data format representing the output neuron gradient. Since the total bit width of that format is otherwise fixed, increasing the fractional bit width alone would decrease the integer bit width and shrink the representable data range. Therefore, after the controller unit 102 reduces the output neuron gradient precision, it increases the total bit width of the fixed-point data format so that the integer part bit width remains unchanged; that is, the total bit width grows by the same amount as the fractional bit width.
For example, suppose the bit width of the fixed-point data format is 9, of which the sign bit occupies 1 bit, the integer part 5 bits, and the fractional part 3 bits. After the controller unit 102 increases the fractional part bit width to 6, the integer part bit width remains 5; that is, the fractional bit width is increased while the integer bit width stays unchanged.
In one possible embodiment, after the controller unit 102 reduces the output neuron gradient precision, the controller unit 102 is further configured to:
determine whether the output neuron gradient overflows when represented in the fixed-point data format representing the output neuron gradient; and,
when overflow is determined, increase the bit width of that fixed-point data format.
Specifically, as can be seen from the above description, when the controller unit 102 reduces the output neuron gradient precision, the range of data representable by the fixed-point data format representing the output neuron gradient shrinks. Therefore, after reducing the output neuron gradient precision, the controller unit 102 judges whether the output neuron gradient overflows when represented in that fixed-point data format; when overflow is determined, the controller unit 102 increases the total bit width of the format, thereby expanding the representable data range so that the output neuron gradient no longer overflows.
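The overflow check and bit-width expansion described here can be sketched as follows; the step size of 8 bits is an assumption for illustration:

```python
def overflows(value: float, bitnum: int, s: int) -> bool:
    """True if `value` falls outside the range representable by a signed
    fixed-point format with `bitnum` total bits and `s` fractional bits."""
    pos = (2 ** (bitnum - 1) - 1) * 2.0 ** -s
    neg = -(2 ** (bitnum - 1)) * 2.0 ** -s
    return value > pos or value < neg

def widen_until_fits(value: float, bitnum: int, s: int, step: int = 8) -> int:
    """If raising the fractional width s made the gradient overflow, grow
    the total bit width by `step` bits until the value fits again; the
    added bits restore range on the integer side."""
    while overflows(value, bitnum, s):
        bitnum += step
    return bitnum

bits = widen_until_fits(300.0, 16, 8)  # a 16-bit/8-fraction format tops out near 128
print(bits)  # 24
```

Checking overflow before committing to a narrower range is what lets the module refine precision without corrupting large gradient values.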
It should be noted that the controller unit 102 increases the bit width of the fixed-point data format, specifically, increases the bit width of the integer part of the fixed-point data format.
Further, the increasing, by the controller unit 102, the bit width of the fixed-point data format indicating the gradient of the output neuron includes:
the controller unit 102 increases the bit width of the fixed-point data format representing the gradient of the output neuron according to a second preset step N2, where the second preset step N2 may be 1, 2, 3, 4, 5, 7, 8 or other positive integer.
Specifically, when determining to increase the bit width of the fixed-point data format, the controller unit 102 increases the bit width of the fixed-point data format by the second preset step N2 each time.
In one possible embodiment, the controller unit 102 increases the bit width of the fixed-point data format representing the gradient of the output neuron, including:
the controller unit 102 increases the bit width of the fixed-point data format indicating the gradient of the output neuron in increments of 2 times.
For example, if the bit width of the fixed-point data format excluding the sign bit is 8, then after the bit width is increased in a 2-fold manner, the bit width excluding the sign bit is 16; after the bit width is increased again in a 2-fold manner, the bit width excluding the sign bit is 32.
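The overflow check and the 2-fold widening can be combined in a short sketch. This is assumed logic for illustration, not the patent's code: the new bits from each doubling are given to the integer part, matching the note above that the integer-part bit width is what grows when overflow must be eliminated.

```python
# Sketch (assumed logic, not the patent's code): check whether a value
# overflows a signed fixed-point format, and if so double the bit width
# (excluding the sign bit) until the value fits.

def overflows(value: float, int_bits: int, frac_bits: int) -> bool:
    """True if |value| exceeds the format's representable magnitude."""
    max_magnitude = 2.0 ** int_bits - 2.0 ** -frac_bits
    return abs(value) > max_magnitude

def widen_until_fits(value: float, int_bits: int, frac_bits: int):
    """Double the non-sign bit width; the growth goes to the integer part."""
    while overflows(value, int_bits, frac_bits):
        total = int_bits + frac_bits
        int_bits += total          # doubled width, new bits are integer bits
    return int_bits, frac_bits

# 8 non-sign bits (5 integer + 3 fractional) cannot hold 40.0;
# one doubling to 16 bits (13 integer + 3 fractional) can.
assert widen_until_fits(40.0, 5, 3) == (13, 3)
```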
In one possible embodiment, the controller unit 102 adjusting the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇(l) includes:
the controller unit 102 reduces the input neuron precision Sx(l) and/or the output neuron gradient precision S∇(l) while keeping the weight precision Sw(l) unchanged; or
the controller unit 102 reduces the input neuron precision Sx(l) and increases the output neuron gradient precision S∇(l) while keeping the weight precision Sw(l) unchanged, where the magnitude by which the input neuron precision Sx(l) is reduced is greater than the magnitude by which the output neuron gradient precision S∇(l) is increased; or
the controller unit 102 increases the output neuron gradient precision S∇(l) and reduces the input neuron precision Sx(l) while keeping the weight precision Sw(l) unchanged, where the magnitude by which the output neuron gradient precision S∇(l) is increased is smaller than the magnitude by which the input neuron precision Sx(l) is reduced; or
the controller unit 102 increases or decreases the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇(l) so that the absolute value of the difference between the gradient update precision T and the preset precision Tr is minimized.
Here, it should be noted that the specific process by which the controller unit 102 increases or decreases any one of the weight precision Sw(l), the input neuron precision Sx(l), and the output neuron gradient precision S∇(l) in the above embodiments can be found in the related operations of the controller unit 102 described above and is not repeated here.
After the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇(l) are adjusted according to the above method, the arithmetic unit 103 represents the L-th layer input neurons, weights, and output neuron gradients in fixed-point data formats according to the adjusted input neuron precision Sx(l), weight precision Sw(l), and output neuron gradient precision S∇(l) during the operation, and then performs the subsequent operations.
It should be noted that the frequency of calculating the gradient update accuracy T by the controller unit 102 can be flexibly set according to the requirement.
The controller unit 102 may adjust and calculate the frequency of the gradient update precision T according to the number of training iterations in the neural network training process.
Optionally, during neural network training, the controller unit 102 recalculates the gradient update precision T once per iteration, or once every preset number of iterations, or sets the frequency according to the change in the gradient update precision T.
Alternatively, the controller unit 102 sets the frequency of calculating the gradient update accuracy T according to the number of training iterations in the neural network training.
The arithmetic unit 103 is configured to represent the L-th layer input neurons and weights according to the increased or decreased input neuron precision Sx(l) and weight precision Sw(l), and to represent the computed L-th layer output neuron gradient according to the increased or decreased output neuron gradient precision S∇(l).
In other words, the arithmetic unit represents the L-th layer input neurons in a fixed-point data format with the increased or decreased input neuron precision Sx(l), represents the L-th layer weights in a fixed-point data format with the increased or decreased weight precision Sw(l), and represents the L-th layer output neuron gradient in a fixed-point data format with the increased or decreased output neuron gradient precision S∇(l), for subsequent operations.
By dynamically adjusting (increasing or decreasing) the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇(l) during neural network operation, the operation requirements can be met while the error of the operation result and the operation overhead are reduced, saving operation resources.
In another alternative embodiment, the controller unit 102 obtains the L-th layer output neuron gradient of the multilayer neural network.
In one possible embodiment, the controller unit 102 acquires the output neurons of the L-th layer and the output neurons of the L-1 th layer, and then acquires the gradient of the L-th layer output neurons according to the output neurons of the L-th layer and the output neurons of the L-1 th layer.
The controller unit 102 obtains the proportion data a of gradient values in the L-th layer output neuron gradient whose absolute values are smaller than a first preset threshold.
Alternatively, the first preset threshold may be 0, 0.01, 0.05, 0.1, 0.12, or another value.
Specifically, after acquiring the L-th layer output neuron gradient, the controller unit 102 obtains the number n1 of gradient values in the L-th layer output neuron gradient whose absolute values are smaller than the first preset threshold, and then obtains the proportion data a from n1 and the total number n2 of gradient values in the L-th layer output neuron gradient, i.e., a = n1/n2.
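The proportion computation above can be sketched directly. The threshold and gradient values below are made-up illustrative numbers, not values from the patent:

```python
# Sketch of the proportion computation described above: a = n1 / n2,
# where n1 counts gradient values whose absolute value is below the
# first preset threshold and n2 is the total number of gradient values.

def proportion_small_gradients(gradients, threshold: float) -> float:
    n2 = len(gradients)
    n1 = sum(1 for g in gradients if abs(g) < threshold)
    return n1 / n2

grads = [0.001, -0.002, 0.3, 0.0, -0.15, 0.004, 0.002, -0.001]
a = proportion_small_gradients(grads, threshold=0.01)
assert a == 6 / 8   # six of the eight gradients fall below the threshold
```

A large a indicates that most gradient values are tiny relative to the current fixed-point step, which is the signal used to reduce the output neuron gradient precision.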
Alternatively, the second preset threshold may be 50%, 60%, 65%, 70%, 80%, 85%, 90%, or another value.
Optionally, the second preset threshold is 80%.
When the proportion data a is larger than a second preset threshold, the controller unit 102 reduces the L-th layer output neuron gradient precision S∇(l).
In one possible embodiment, when the controller unit 102 reduces the L-th layer output neuron gradient precision S∇(l), it increases the bit width of the fixed-point data format representing the L-th layer output neuron gradient.
In one possible embodiment, after the controller unit 102 reduces the L-th layer output neuron gradient precision S∇(l), the controller unit 102 is further configured to:
judging whether overflow occurs when the L-th layer output neuron gradient is in a fixed point data format for representing the L-th layer output neuron gradient;
when the overflow is determined, the bit width of the fixed point data format indicating the gradient of the L-th layer output neuron is increased.
In a possible embodiment, the controller unit 102 increases the bit width of the fixed-point data format representing the gradient of the L-th layer output neuron, including:
the controller unit 102 increases the bit width of the fixed point data format indicating the gradient of the L-th layer output neuron according to a third preset step N3.
In a possible embodiment, the controller unit 102 increasing the bit width of the fixed-point data format indicating the L-th layer output neuron gradient includes:
the controller unit 102 increases the bit width of the fixed-point data format indicating the gradient of the L-th layer output neuron by a 2-fold increment.
It should be noted that the specific process by which the controller unit 102 reduces the output neuron gradient precision S∇(l) can be seen from the above description and is not repeated here.
After the output neuron gradient precision S∇(l) is adjusted according to the above method, the arithmetic unit 103 represents the L-th layer output neuron gradient in fixed-point form according to the adjusted output neuron gradient precision S∇(l) during the operation, and then performs the subsequent operations.
In this way, the output neuron precision is adjusted according to the output neuron gradient during neural network operation, which reduces the error of the output neurons and ensures that training proceeds normally.
Referring to fig. 2, fig. 2 is a schematic flow chart of a neural network operation method according to an embodiment of the present invention, and as shown in fig. 2, the method includes:
S201, the neural network operation module acquires the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇(l) of the L-th layer of the neural network.
Wherein the values of the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇(l) may all be the same, partially the same, or pairwise different.
Wherein the neural network is a multilayer neural network, and the L-th layer input neuron precision Sx(l), weight precision Sw(l), and output neuron gradient precision S∇(l) are respectively the input neuron precision, weight precision, and output neuron gradient precision of any one layer of the multilayer neural network.
In a possible embodiment, the neural network operation module obtains the L-th layer input neurons, weights, and output neurons, and obtains the L-th layer input neuron precision Sx(l), weight precision Sw(l), and output neuron gradient precision S∇(l) according to the L-th layer input neurons, weights, and output neurons.
S202, the neural network operation module calculates to obtain gradient updating precision T according to the precision of the L-th layer input neurons, the weight precision and the gradient precision of the output neurons.
Specifically, the neural network operation module calculates the gradient update precision T from the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇(l) according to a first preset formula.
S203, when the gradient update precision T is larger than the preset precision Tr, the neural network operation module adjusts the L-th layer input neuron precision, weight precision, and output neuron gradient precision so that the absolute value of the difference between the gradient update precision T and the preset precision Tr is minimized.
The bit widths of the fixed-point data formats used for representing the input neurons and the weights are a first bit width, and the bit width of the fixed-point data format used for representing the output neuron gradient is a second bit width.
Optionally, the second bit width is greater than the first bit width.
Further, the second bit width is twice the first bit width, so as to facilitate processing by an electronic computer.
Further, the first bit width is preferably 8 bits, and the second bit width is preferably 16 bits.
Wherein the preset precision Tr can be set empirically in advance; a Tr matching the input parameters can also be obtained through a second preset formula by varying the input parameters; or Tr can be obtained by a machine learning method.
Optionally, the neural network operation module sets the preset precision Tr according to the learning rate and the batch size (the number of samples in batch processing).
Further, if the neural network has parameter-sharing layers (such as convolutional layers and recurrent neural network layers), the preset precision Tr is set according to the number of output neurons of the previous layer, the batch size, and the learning rate; that is, the larger the number of output neurons of the previous layer, the larger the batch size, and the higher the learning rate, the larger the preset precision Tr.
Wherein the neural network operation module adjusting the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇(l) includes:
keeping the input neuron precision Sx(l) and the weight precision Sw(l) unchanged, and reducing the output neuron gradient precision S∇(l).
It should be noted that the neural network operation module reducing the output neuron gradient precision S∇(l) means increasing the fractional-part bit width s1 of the fixed-point data format representing the output neuron gradient.
Optionally, the neural network operation module controller unit increases the decimal part bit width s1 in the fixed point data format indicating the gradient of the output neuron according to the value of Tr-T by a first preset step N1.
Specifically, the neural network operation module increases the fractional-part bit width s1 of the fixed-point data format representing the output neuron gradient by N1 each time, i.e., the fractional-part bit width becomes s1 + N1, yielding a new output neuron gradient precision S∇(l). It then recalculates the gradient update precision T according to the above preset formula and judges whether the absolute value of the difference between the gradient update precision T and the preset precision Tr has become smaller. When the absolute value of the difference is determined to have become smaller, the neural network operation module continues to increase the fractional-part bit width by N1, i.e., to s1 + 2*N1, obtains the new output neuron gradient precision S∇(l), and again judges whether the absolute value of the difference between the gradient update precision T and the preset precision Tr has decreased; if it has, processing continues in the same way. If, at the n-th adjustment, the absolute value of the difference between the gradient update precision T and the preset precision Tr increases instead, the neural network operation module takes the bit width obtained at the (n-1)-th adjustment, i.e., s1 + (n-1)*N1, as the fractional-part bit width of the fixed-point data format representing the output neuron gradient, and the output neuron gradient precision after the increase of the fractional-part bit width is 2^-(s1+(n-1)*N1).
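The stepping procedure above can be sketched as a small search loop. Note that the patent's first preset formula for T is not reproduced in this excerpt, so `gradient_update_precision` below is a hypothetical stand-in passed in as a parameter; only the stepping logic (grow the fractional bit width by N1 until |T − Tr| stops shrinking, then back off one step) follows the text.

```python
# Sketch of the iterative adjustment described above. The function
# `gradient_update_precision` is a hypothetical stand-in for the
# patent's first preset formula, which is not reproduced here.

def adjust_frac_width(s1: int, n1_step: int, t_r: float,
                      gradient_update_precision) -> int:
    """Return the fractional bit width whose T is closest to Tr."""
    best_width = s1
    best_err = abs(gradient_update_precision(s1) - t_r)
    while True:
        candidate = best_width + n1_step
        err = abs(gradient_update_precision(candidate) - t_r)
        if err >= best_err:          # error grew: keep the previous width
            return best_width
        best_width, best_err = candidate, err

# Hypothetical T for illustration: the representable step 2^-s1.
t_of = lambda frac_bits: 2.0 ** -frac_bits
assert adjust_frac_width(3, 1, 0.01, t_of) == 7
```

With this illustrative T, widths 3 through 7 bring |T − Tr| down monotonically and width 8 overshoots, so the loop returns 7, mirroring the "use the (n−1)-th bit width" rule in the text.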
Optionally, the first preset step N1 is 1, 2, 4, 6, 7, 8 or other positive integer.
Optionally, the neural network operation module increases a bit width of a decimal part in a fixed point data format indicating the gradient of the output neuron in a 2-fold increasing manner.
For example, if the fractional-part bit width of the fixed-point data format representing the output neuron gradient is 3, the output neuron gradient precision is 2^-3; after the fractional-part bit width is increased in a 2-fold manner, it becomes 6, i.e., the reduced output neuron gradient precision is 2^-6.
In one possible embodiment, after the neural network operation module determines the total increment b of the fractional-part bit width of the fixed-point data format representing the output neuron gradient, it may apply the increment in multiple steps; for example, in two steps, with a first increment b1 and a second increment b2, such that b = b1 + b2.
Wherein, the b1 and the b2 can be the same or different.
Optionally, when the neural network operation module reduces the output neuron gradient precision, it increases the bit width of the fixed-point data format representing the output neuron gradient.
Further, reducing the output neuron gradient precision S∇(l) means increasing the fractional-part bit width of the fixed-point data format representing the output neuron gradient. If the total bit width of that format stays the same, increasing the fractional-part bit width reduces the integer-part bit width, which narrows the range of data the format can represent. Therefore, after the neural network operation module reduces the output neuron gradient precision S∇(l), it increases the total bit width of the fixed-point data format so that the integer-part bit width remains unchanged; that is, the total bit width is increased by the same amount as the fractional-part bit width.
For example, suppose the bit width of the fixed-point data format is 9, of which the sign bit occupies 1 bit, the integer portion 5 bits, and the fractional portion 3 bits. After the neural network operation module increases the bit width of the fractional portion and the total bit width accordingly, the fractional portion occupies 6 bits and the integer portion still occupies 5 bits; that is, the fractional-part bit width is increased and the integer-part bit width remains unchanged.
In a possible embodiment, after the neural network operation module reduces the gradient precision of the output neuron, the neural network operation module is further configured to:
judging whether the output neuron gradient overflows when in a fixed point data format for representing the output neuron gradient;
when overflow is determined, increasing a bit width of a fixed-point data format representing the output neuron gradient.
Specifically, as can be seen from the above description, when the neural network operation module reduces the output neuron gradient precision, the range of data representable by the fixed-point data format for the output neuron gradient is narrowed. Therefore, after the neural network operation module reduces the output neuron gradient precision, it judges whether the output neuron gradient overflows when represented in the fixed-point data format; when overflow is determined, the neural network operation module increases the bit width of the fixed-point data format, thereby expanding the range of data the format can represent so that the output neuron gradient no longer overflows when represented in the fixed-point data format.
It should be noted that, the neural network operation module increases the bit width of the fixed-point data format, specifically, increases the bit width of the integer part of the fixed-point data format.
Further, the increasing, by the neural network operation module, a bit width of the fixed-point data format indicating the gradient of the output neuron includes:
the neural network operation module increases the bit width of the fixed point data format representing the gradient of the output neuron according to a second preset step N2, wherein the second preset step N2 may be 1, 2, 3, 4, 5, 7, 8 or other positive integers.
Specifically, when determining to increase the bit width of the fixed point data format, the neural network operation module increases the bit width of the fixed point data format by the second preset step N2 each time.
In one possible embodiment, the neural network operation module increases the bit width of the fixed-point data format representing the gradient of the output neuron, including:
the neural network operation module increases the bit width of the fixed point data format representing the gradient of the output neuron in a 2-fold increasing manner.
For example, if the bit width of the fixed-point data format excluding the sign bit is 8, the bit width of the fixed-point data format excluding the sign bit is 16 after the bit width of the fixed-point data format is increased in a 2-time increasing manner; after the bit width of the fixed-point data format is increased again in a 2-time increasing mode, the bit width of the fixed-point data format excluding the sign bit is 32.
In one embodiment, the neural network operation module adjusting the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇(l) includes:
reducing the input neuron precision Sx(l) and/or the output neuron gradient precision S∇(l) while keeping the weight precision Sw(l) unchanged; or
reducing the input neuron precision Sx(l) and increasing the output neuron gradient precision S∇(l) while keeping the weight precision Sw(l) unchanged, where the magnitude by which the input neuron precision Sx(l) is reduced is greater than the magnitude by which the output neuron gradient precision S∇(l) is increased; or
increasing the output neuron gradient precision S∇(l) and reducing the input neuron precision Sx(l) while keeping the weight precision Sw(l) unchanged, where the magnitude by which the output neuron gradient precision S∇(l) is increased is smaller than the magnitude by which the input neuron precision Sx(l) is reduced; or
increasing or decreasing the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇(l) so that the absolute value of the difference between the gradient update precision T and the preset precision Tr is minimized.
It should be noted that the specific process by which the neural network operation module increases or decreases any one of the weight precision Sw(l), the input neuron precision Sx(l), and the output neuron gradient precision S∇(l) in the above embodiments can be found in the related operations described above and is not repeated here.
S204, the neural network operation module represents the L-th layer input neurons and weights according to the adjusted input neuron precision and weight precision, and represents the L-th layer output neuron gradient obtained by operation according to the adjusted output neuron gradient precision, so as to perform the subsequent operations.
In other words, the neural network operation module represents the L-th layer input neurons in a fixed-point data format with the increased or decreased input neuron precision Sx(l), represents the L-th layer weights in a fixed-point data format with the increased or decreased weight precision Sw(l), and represents the L-th layer output neuron gradient in a fixed-point data format with the increased or decreased output neuron gradient precision S∇(l), for subsequent operations.
After adjusting the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇(l) according to the above method, the neural network operation module recalculates the gradient update precision T; when the gradient update precision T is still greater than the preset precision Tr, the neural network operation module continues to adjust the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇(l) by referring to the method of step S203.
It should be noted that the frequency of calculating the gradient update precision T by the neural network operation module can be flexibly set according to requirements.
The neural network operation module can adjust and calculate the frequency of the gradient updating precision T according to the training iteration times in the neural network training process.
Optionally, during neural network training, the neural network operation module recalculates the gradient update precision T once per iteration, or once every preset number of iterations, or sets the frequency according to the change in the gradient update precision T.
Optionally, the neural network operation module sets a frequency of calculating the gradient update precision T according to a training iteration number in the neural network training.
It can be seen that, in the solution of the embodiment of the present invention, the input neuron precision Sx, the weight precision Sw, and the output neuron gradient precision S∇ are dynamically adjusted during neural network operation, so that the operation requirements are met while the error of the operation result and the operation overhead are reduced, saving operation resources.
Referring to fig. 3, fig. 3 is a schematic flow chart of a neural network operation method according to an embodiment of the present invention. As shown in fig. 3, the method includes:
s301, the neural network operation module obtains the L-th layer output neuron gradient.
In a possible embodiment, the neural network operation module obtains the output neurons of the L-th layer and the output neurons of the L-1 th layer, and then obtains the gradient of the L-th layer output neurons according to the output neurons of the L-th layer and the output neurons of the L-1 th layer.
S302, the neural network operation module obtains proportion data a of which the absolute value in the L-th layer output neuron gradient is smaller than a first preset threshold value.
Alternatively, the first preset threshold may be 0, 0.01, 0.05, 0.1, 0.12, or another value.
Specifically, after acquiring the L-th layer output neuron gradient, the neural network operation module obtains the number n1 of gradient values in the L-th layer output neuron gradient whose absolute values are smaller than the first preset threshold, and then obtains the proportion data a from n1 and the total number n2 of gradient values in the L-th layer output neuron gradient, i.e., a = n1/n2.
Alternatively, the second preset threshold may be 50%, 60%, 65%, 70%, 80%, 85%, 90%, or another value.
Optionally, the second preset threshold is 80%.
And S303, when the proportion data a is larger than a second preset threshold value, reducing the gradient precision of the L-th layer output neuron by the neural network operation module.
In one possible embodiment, when the neural network operation module reduces the L-th layer output neuron gradient precision S∇(l), it increases the bit width of the fixed-point data format representing the L-th layer output neuron gradient.
In one possible embodiment, after the neural network operation module reduces the L-th layer output neuron gradient precision S∇(l), the neural network operation module is further configured to:
judging whether overflow occurs when the L-th layer output neuron gradient is in a fixed point data format for representing the L-th layer output neuron gradient;
when the overflow is determined, the bit width of the fixed point data format indicating the gradient of the L-th layer output neuron is increased.
In a possible embodiment, the neural network operation module increases a bit width of a fixed-point data format representing a gradient of the L-th layer output neuron, including:
and increasing the bit width of the fixed point data format representing the gradient of the L-th layer output neuron according to a third preset step N3.
In a possible embodiment, the neural network operation module increases a bit width of a fixed-point data format representing a gradient of the L-th layer output neuron, including:
and increasing the bit width of the fixed point data format representing the gradient of the L-th layer output neuron according to a 2-time incremental mode.
It should be noted that the specific process by which the neural network operation module reduces the output neuron gradient precision S∇(l) can be seen from the above description and is not repeated here.
After the output neuron gradient precision S∇(l) is adjusted according to the above method, the neural network operation module represents the L-th layer output neuron gradient in a fixed-point data format according to the adjusted output neuron gradient precision during the operation, and then performs the subsequent operations.
It can be seen that in the scheme of the embodiment of the invention, the precision of the output neurons is adjusted according to the gradient of the output neurons in the operation process of the neural network, so that the error of the output neurons is reduced, and the normal training is ensured.
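The S301–S303 flow can be summarized in one sketch. All thresholds and step sizes below are assumed illustrative constants, not values fixed by the patent: compute the proportion a of small-magnitude output neuron gradients, and when a exceeds the second preset threshold, reduce the output neuron gradient precision by widening the fractional part of its fixed-point format.

```python
# End-to-end sketch of the Fig. 3 flow under assumed constants:
# S302 computes the proportion a of small-magnitude gradients; S303
# widens the fractional part (finer precision 2^-frac_bits) when a
# exceeds the second preset threshold. All constants are illustrative.

FIRST_THRESHOLD = 0.01    # gradients smaller than this count toward a
SECOND_THRESHOLD = 0.8    # reduce precision once a exceeds this
FRAC_STEP = 3             # assumed widening step for the fractional part

def maybe_reduce_gradient_precision(gradients, frac_bits: int) -> int:
    n1 = sum(1 for g in gradients if abs(g) < FIRST_THRESHOLD)
    a = n1 / len(gradients)
    if a > SECOND_THRESHOLD:
        frac_bits += FRAC_STEP    # precision 2^-frac_bits becomes finer
    return frac_bits

mostly_small = [0.001] * 9 + [0.5]          # a = 0.9 > 0.8: widen
assert maybe_reduce_gradient_precision(mostly_small, 3) == 6
mixed = [0.001] * 5 + [0.5] * 5             # a = 0.5: no change
assert maybe_reduce_gradient_precision(mixed, 3) == 3
```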
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
While the invention has been described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (16)
1. A neural network operation module, wherein the neural network operation module is used for performing operations of a multilayer neural network, and comprises:
a storage unit, configured to store the input neuron precision, the weight precision, and the output neuron gradient precision;
a controller unit, configured to: obtain the L-th layer input neuron precision Sx(l), weight precision Sw(l), and output neuron gradient precision S∇(l) of the multilayer neural network from the storage unit, where L is an integer greater than 0; obtain a gradient update precision T according to the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇(l); and when the gradient update precision T is greater than a preset precision Tr, adjust the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇(l) so that the absolute value of the difference between the gradient update precision T and the preset precision Tr is minimized;
an arithmetic unit, configured to represent the L-th layer input neurons and weights according to the adjusted input neuron precision Sx(l) and weight precision Sw(l), and to represent the L-th layer output neuron gradient obtained by operation according to the adjusted output neuron gradient precision S∇(l), so as to perform subsequent operations.
2. The module of claim 1, wherein the controller unit obtaining the gradient update precision T according to the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇(l) specifically includes:
the controller unit is configured to calculate the gradient update precision T from the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇(l) according to a preset formula;
3. The module of claim 2, wherein the controller unit adjusting the input neuron precision Sx(l), the weight precision Sw(l), and the output neuron gradient precision S∇(l) includes:
5. The module of claim 3 or 4, wherein after the controller unit reduces the output neuron gradient precision S∇(l), the controller unit is further configured to:
judging whether the output neuron gradient overflows when in a fixed point data format for representing the output neuron gradient;
when overflow is determined, increasing a bit width of a fixed-point data format representing the output neuron gradient.
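The overflow check of claim 5 can be sketched as follows (an assumed realization: the claim states only the check and the widening, not this code). A signed fixed-point format with `total_bits` bits and `frac_bits` fractional bits can hold magnitudes below 2^(total_bits-1) / 2^frac_bits:

```python
def overflows(values, frac_bits, total_bits):
    """True if any value exceeds the signed fixed-point range."""
    limit = (1 << (total_bits - 1)) / (1 << frac_bits)  # max magnitude
    return any(abs(v) >= limit for v in values)

def ensure_fits(values, frac_bits, total_bits):
    # Widen the format one bit at a time until the gradients are representable.
    while overflows(values, frac_bits, total_bits):
        total_bits += 1
    return total_bits

# Q8.8-style 16-bit format holds magnitudes below 128, so these gradients fit:
print(ensure_fits([100.0, -3.5], frac_bits=8, total_bits=16))  # 16
```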
6. The module according to claim 4 or 5, wherein the controller unit increasing the bit width of the fixed-point data format representing the output neuron gradient comprises:
the controller unit increases the bit width of the fixed-point data format representing the output neuron gradient according to a preset step size N1;
wherein the preset step size N1 is 1, 2, 4, 6, 7, 8, or another positive integer.
7. The module of claim 4 or 5, wherein the controller unit increasing the bit width of the fixed-point data format representing the output neuron gradient comprises:
the controller unit increases the bit width of the fixed-point data format representing the output neuron gradient by doubling it.
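The two widening policies of claims 6 and 7, sketched side by side (function names are illustrative; the claims specify only the step size or the doubling, not an implementation):

```python
def widen_by_step(total_bits, n1):
    # Claim 6: grow the bit width by a preset step N1 (1, 2, 4, 6, 7, 8, ...).
    return total_bits + n1

def widen_by_doubling(total_bits):
    # Claim 7: grow the bit width by a factor of 2 each time.
    return total_bits * 2

print(widen_by_step(16, n1=4))  # 20
print(widen_by_doubling(16))    # 32
```

The fixed step converges on the needed width gradually; doubling reaches a sufficient width in fewer adjustments at the cost of possible over-allocation.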
8. The module according to any one of claims 1 to 7, wherein the controller unit is further configured to:
obtain the preset precision T_r according to a machine learning method; or
obtain the preset precision T_r according to the number of output neurons of the (L-1)-th layer, the learning rate, and the number of samples in a batch, wherein the greater the number of (L-1)-th layer output neurons, the number of samples in a batch, and the learning rate, the larger the preset precision T_r.
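Claim 8's second alternative only constrains T_r to grow with all three quantities; any monotonically increasing combination satisfies it. The log-based form below is a hypothetical example, not the patented formula:

```python
import math

def preset_precision(n_out_prev, batch_size, learning_rate):
    # Monotonically increasing in the (L-1)-layer output neuron count, the
    # batch size, and the learning rate, as the claim requires; the specific
    # combination is an assumption for illustration.
    return math.log2(n_out_prev * batch_size) + learning_rate

# More neurons, larger batches, or a higher learning rate yield a larger T_r:
assert preset_precision(512, 64, 0.1) > preset_precision(256, 32, 0.01)
```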
9. A neural network operation method, comprising:
obtaining the input neuron precision S_x(l), the weight precision S_w(l), and the output neuron gradient precision of the L-th layer of the neural network;
calculating a gradient update precision T according to the input neuron precision S_x(l), the weight precision S_w(l), and the output neuron gradient precision;
when the gradient update precision T is greater than a preset precision T_r, adjusting the input neuron precision S_x(l), the weight precision S_w(l), and the output neuron gradient precision so that the absolute value of the difference between the gradient update precision T and the preset precision T_r is minimized.
10. The method of claim 9, wherein the calculating the gradient update precision T according to the input neuron precision S_x(l), the weight precision S_w(l), and the output neuron gradient precision comprises:
calculating the gradient update precision T from the input neuron precision S_x(l), the weight precision S_w(l), and the output neuron gradient precision according to a preset formula;
11. The method of claim 10, wherein the adjusting the input neuron precision S_x(l), the weight precision S_w(l), and the output neuron gradient precision comprises:
13. The method of claim 11 or 12, wherein after the reducing the output neuron gradient precision, the method further comprises:
determining whether the output neuron gradient overflows when represented in the fixed-point data format representing the output neuron gradient;
when an overflow is determined, increasing the bit width of the fixed-point data format representing the output neuron gradient.
14. The method of claim 12 or 13, wherein the increasing the bit width of the fixed-point data format representing the output neuron gradient comprises:
increasing the bit width of the fixed-point data format representing the output neuron gradient according to a preset step size N1;
wherein the preset step size N1 is 1, 2, 4, 6, 7, 8, or another positive integer.
15. The method of claim 12 or 13, wherein the increasing the bit width of the fixed-point data format representing the output neuron gradient comprises:
increasing the bit width of the fixed-point data format representing the output neuron gradient by doubling it.
16. The method according to any one of claims 9-15, further comprising:
obtaining the preset precision T_r according to a machine learning method; or
obtaining the preset precision T_r according to the number of output neurons of the (L-1)-th layer, the learning rate, and the number of samples in a batch, wherein the greater the number of (L-1)-th layer output neurons, the number of samples in a batch, and the learning rate, the larger the preset precision T_r.
Priority Applications (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811040961.XA CN110880037A (en) | 2018-09-06 | 2018-09-06 | Neural network operation module and method |
EP19803375.5A EP3624020A4 (en) | 2018-05-18 | 2019-05-07 | Computing method and related product |
PCT/CN2019/085844 WO2019218896A1 (en) | 2018-05-18 | 2019-05-07 | Computing method and related product |
US16/718,742 US11409575B2 (en) | 2018-05-18 | 2019-12-18 | Computation method and product thereof |
US16/720,145 US11442785B2 (en) | 2018-05-18 | 2019-12-19 | Computation method and product thereof |
US16/720,171 US11442786B2 (en) | 2018-05-18 | 2019-12-19 | Computation method and product thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811040961.XA CN110880037A (en) | 2018-09-06 | 2018-09-06 | Neural network operation module and method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110880037A true CN110880037A (en) | 2020-03-13 |
Family
ID=69727298
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811040961.XA Pending CN110880037A (en) | 2018-05-18 | 2018-09-06 | Neural network operation module and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110880037A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020192582A1 (en) * | 2019-03-26 | 2020-10-01 | 上海寒武纪信息科技有限公司 | Neural network operation module and method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||