CN110880037A - Neural network operation module and method - Google Patents


Info

Publication number
CN110880037A
CN110880037A (application CN201811040961.XA)
Authority
CN
China
Prior art keywords
precision
gradient
output neuron
neuron
output
Prior art date
Legal status
Pending
Application number
CN201811040961.XA
Other languages
Chinese (zh)
Inventor
Not disclosed
Current Assignee
Shanghai Cambricon Information Technology Co Ltd
Original Assignee
Shanghai Cambricon Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Shanghai Cambricon Information Technology Co Ltd filed Critical Shanghai Cambricon Information Technology Co Ltd
Priority to CN201811040961.XA priority Critical patent/CN110880037A/en
Priority to EP19803375.5A priority patent/EP3624020A4/en
Priority to PCT/CN2019/085844 priority patent/WO2019218896A1/en
Priority to US16/718,742 priority patent/US11409575B2/en
Priority to US16/720,145 priority patent/US11442785B2/en
Priority to US16/720,171 priority patent/US11442786B2/en
Publication of CN110880037A publication Critical patent/CN110880037A/en

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/06 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Neurology (AREA)
  • Feedback Control In General (AREA)

Abstract

The invention discloses a neural network operation module, which comprises a storage unit, a controller unit and an operation unit. The controller unit is used for acquiring the input neuron precision, the weight precision and the output neuron gradient precision of the L-th layer from the storage unit; acquiring a gradient update precision T according to the input neuron precision, the weight precision and the output neuron gradient precision; and, when the gradient update precision T is greater than a preset precision Tr, adjusting the input neuron precision, the weight precision and the output neuron gradient precision. The operation unit is used for representing the output neurons and weights of the L-th layer according to the adjusted input neuron precision and weight precision, and representing the L-th layer output neuron gradients obtained by operation according to the adjusted output neuron gradient precision, so as to perform subsequent operations. By adopting the embodiments of the invention, the operation requirements can be met while the error of the operation result and the operation overhead are reduced and operation resources are saved.

Description

Neural network operation module and method
Technical Field
The invention relates to the field of neural networks, in particular to a neural network operation module and a method.
Background
The fixed-point number is a data format in which the position of the decimal point can be specified, and the bit width is usually used to denote the data length of a fixed-point number. For example, the bit width of a 16-bit fixed-point number is 16. For a fixed-point number of given bit width, the precision of the representable data and the range of representable numbers trade off against each other: the greater the representable precision, the smaller the representable range of numbers. As shown in FIG. 1a, for a fixed-point data format with bit width bitnum, the first bit is the sign bit, the integer part occupies x bits, and the fractional part occupies s bits; the maximum precision the format can represent is 2^-s. The format can represent the range [neg, pos], where pos = (2^(bitnum-1) - 1) * 2^-s and neg = -(2^(bitnum-1)) * 2^-s.
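As an illustration of this trade-off, the following Python sketch (illustrative code, not part of the patent) computes the precision and representable range of the format in FIG. 1a:

def fixed_point_range(bitnum: int, s: int):
    """Return (precision, neg, pos) for a fixed-point format with
    `bitnum` total bits and `s` fractional bits."""
    precision = 2.0 ** -s                      # smallest representable step, 2^-s
    pos = (2 ** (bitnum - 1) - 1) * precision  # largest representable value
    neg = -(2 ** (bitnum - 1)) * precision     # smallest representable value
    return precision, neg, pos

# 16-bit format with 8 fractional bits:
print(fixed_point_range(16, 8))   # (0.00390625, -128.0, 127.99609375)
# Same bit width with more fractional bits: finer precision, smaller range.
print(fixed_point_range(16, 12))  # (0.000244140625, -8.0, 7.999755859375)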
In neural network operations, data can be represented and operated on in the fixed-point data format. For example, during the forward operation, the data of the L-th layer includes the input neurons X(l), the output neurons Y(l) and the weights W(l); during the reverse operation, the data of the L-th layer includes the input neuron gradients ∇X(l), the output neuron gradients ∇Y(l) and the weight gradients ∇W(l). All of the above data may be represented by fixed-point numbers and operated on as fixed-point numbers.
The training process of a neural network generally comprises two steps, a forward operation and a reverse operation. During the reverse operation, the precision required by the input neuron gradients, the weight gradients and the output neuron gradients may change, and may increase as training proceeds. If the precision of the fixed-point numbers is insufficient, a large error occurs in the operation result, and training may even fail.
Disclosure of Invention
The technical problem to be solved by the embodiments of the present invention is that, during the neural network operation, insufficient input neuron precision, weight precision or output neuron gradient precision causes errors in the operation or training result.
In a first aspect, the present invention provides a neural network operation module, configured to perform operations on a multilayer neural network, including:
a storage unit, configured to store the input neuron precision, the weight precision and the output neuron gradient precision;
a controller unit, configured to acquire the input neuron precision S_x(l), the weight precision S_w(l) and the output neuron gradient precision S_∇(l) of the L-th layer of the multilayer neural network from the storage unit, wherein L is an integer greater than 0; obtain a gradient update precision T according to the input neuron precision S_x(l), the weight precision S_w(l) and the output neuron gradient precision S_∇(l); and, when the gradient update precision T is greater than a preset precision Tr, adjust the input neuron precision S_x(l), the weight precision S_w(l) and the output neuron gradient precision S_∇(l) so that the absolute value of the difference between the gradient update precision T and the preset precision Tr is minimized;
an operation unit, configured to represent the output neurons and weights of the L-th layer according to the adjusted input neuron precision S_x(l) and weight precision S_w(l), and to represent the L-th layer output neuron gradients obtained by operation according to the adjusted output neuron gradient precision S_∇(l), so as to perform subsequent operations.
In a possible embodiment, the controller unit obtaining the gradient update precision T according to the input neuron precision S_x(l), the weight precision S_w(l) and the output neuron gradient precision S_∇(l) specifically includes:
the controller unit calculating the gradient update precision T from the input neuron precision S_x(l), the weight precision S_w(l) and the output neuron gradient precision S_∇(l) according to a first preset formula;
wherein the first preset formula is a preset combination of S_x(l), S_w(l) and S_∇(l) (the formula itself appears only as an image in the original publication).
in a possible embodiment, the controller unit adjusts the input neuron precision Sx(l)Weight accuracy Sw(l)And output neuron gradient accuracy
Figure BDA0001791206350000024
The method comprises the following steps:
the controller unit maintains the input neuron precision Sx(l)And the weight precision Sw(l)Unchanged, reduced gradient precision of the output neuron
Figure BDA0001791206350000025
In a possible embodiment, the controller unit reducing the output neuron gradient precision S_∇(l) includes increasing the bit width of the fixed-point data format representing the output neuron gradients.
In a possible embodiment, after the controller unit reduces the output neuron gradient precision S_∇(l), the controller unit is further configured to:
judge whether the output neuron gradients overflow when represented in the fixed-point data format representing the output neuron gradients;
and, when overflow is determined, increase the bit width of the fixed-point data format representing the output neuron gradients.
In one possible embodiment, the controller unit increases a bit width of a fixed-point data format representing the output neuron gradient, including:
the controller unit increases the bit width of the fixed point data format representing the gradient of the output neuron according to a first preset step length N1;
the first preset step N1 is 1, 2, 4, 6, 7, 8 or other positive integer.
In one possible embodiment, the controller unit increases a bit width of a fixed-point data format representing the output neuron gradient, including:
the controller unit increases the bit width of the fixed-point data format representing the gradient of the output neuron in a 2-fold increasing manner.
In a possible embodiment, the controller unit is further configured to:
obtain the preset precision Tr according to a machine learning method; or
obtain the preset precision Tr according to the number of output neurons of the (L-1)-th layer, the learning rate and the number of samples in batch processing, wherein the greater the number of (L-1)-th layer output neurons, the larger the number of samples in batch processing and the higher the learning rate, the larger the preset precision Tr.
In a second aspect, an embodiment of the present invention provides a neural network operation module, where the neural network operation module is configured to perform operations on a multilayer neural network, and includes:
a storage unit, configured to store the output neuron gradients of the multilayer neural network;
a controller unit, configured to acquire the output neuron gradients of the L-th layer of the multilayer neural network from the storage unit, wherein L is an integer greater than 0; obtain the number n1 of output neuron gradients whose absolute values are smaller than a first preset threshold among the L-th layer output neuron gradients; obtain proportion data a according to the number n1 and the number n2 of the L-th layer output neuron gradients, wherein a = n1/n2; and, when the proportion data a is greater than a second preset threshold, reduce the L-th layer output neuron gradient precision S_∇(l);
an operation unit, configured to represent the L-th layer output neuron gradients according to the reduced output neuron gradient precision S_∇(l), so as to perform subsequent operations.
In one possible embodiment, the controller unit reducing the L-th layer output neuron gradient precision S_∇(l) includes increasing the bit width of the fixed-point data format representing the L-th layer output neuron gradients.
In one possible embodiment, after the controller unit reduces the L-th layer output neuron gradient precision S_∇(l), the controller unit is further configured to:
judge whether the L-th layer output neuron gradients overflow when represented in the fixed-point data format representing the L-th layer output neuron gradients;
and, when overflow is determined, increase the bit width of the fixed-point data format representing the L-th layer output neuron gradients.
In one possible embodiment, the increasing the bit width of the fixed-point data format representing the L-th layer output neuron gradient includes:
and the controller unit increases the bit width of the fixed point data format representing the L-th layer output neuron gradient according to a second preset step length N2.
In one possible embodiment, the controller unit increasing a bit width of the fixed-point data format representing the L-th layer output neuron gradient comprises:
the controller unit increasing the bit width of the fixed-point data format representing the L-th layer output neuron gradients in a 2-fold increasing manner.
In a third aspect, an embodiment of the present invention provides a neural network operation method, including:
obtaining the input neuron precision S_x(l), the weight precision S_w(l) and the output neuron gradient precision S_∇(l) of the L-th layer of the neural network;
calculating a gradient update precision T according to the input neuron precision S_x(l), the weight precision S_w(l) and the output neuron gradient precision S_∇(l);
when the gradient update precision T is greater than a preset precision Tr, adjusting the input neuron precision S_x(l), the weight precision S_w(l) and the output neuron gradient precision S_∇(l) so that the absolute value of the difference between the gradient update precision T and the preset precision Tr is minimized;
representing the output neurons and weights of the L-th layer according to the adjusted input neuron precision S_x(l) and weight precision S_w(l), and representing the L-th layer output neuron gradients obtained by operation according to the adjusted output neuron gradient precision S_∇(l), so as to perform subsequent operations.
In one possible embodiment, the calculating the gradient update precision T according to the input neuron precision S_x(l), the weight precision S_w(l) and the output neuron gradient precision S_∇(l) includes:
calculating the gradient update precision T from the input neuron precision S_x(l), the weight precision S_w(l) and the output neuron gradient precision S_∇(l) according to a preset formula;
wherein the preset formula is a preset combination of S_x(l), S_w(l) and S_∇(l) (the formula itself appears only as an image in the original publication).
in one possible embodiment, the adjusting the input neuron precision Sx(l)Weight accuracy Sw(l)And output neuron gradient accuracy
Figure BDA0001791206350000039
The method comprises the following steps:
maintaining the input neuron precision Sx(l)And the weight precision Sw(l)Unchanged, reduced gradient precision of the output neuron
Figure BDA00017912063500000310
In one possible embodiment, the reducing the output neuron gradient precision S_∇(l) includes increasing the bit width of the fixed-point data format representing the output neuron gradients.
In one possible embodiment, after the reducing the output neuron gradient precision S_∇(l), the method further includes:
judging whether the output neuron gradients overflow when represented in the fixed-point data format representing the output neuron gradients;
and, when overflow is determined, increasing the bit width of the fixed-point data format representing the output neuron gradients.
In one possible embodiment, the increasing the bit width of the fixed-point data format representing the output neuron gradient comprises:
increasing the bit width of the fixed point data format representing the gradient of the output neuron according to a first preset step length N1;
the first preset step N1 is 1, 2, 4, 6, 7, 8 or other positive integer.
In one possible embodiment, the increasing the bit width of the fixed-point data format representing the output neuron gradient comprises:
and increasing the bit width of the fixed point data format representing the output neuron gradient in a 2-time increment mode.
In a possible embodiment, the method further includes:
obtaining the preset precision Tr according to a machine learning method; or
obtaining the preset precision Tr according to the number of output neurons of the (L-1)-th layer, the learning rate and the number of samples in batch processing, wherein the greater the number of (L-1)-th layer output neurons, the larger the number of samples in batch processing and the higher the learning rate, the larger the preset precision Tr.
In a fourth aspect, an embodiment of the present invention provides a neural network operation method, including:
obtaining the output neuron gradients of the L-th layer of the multilayer neural network, wherein L is an integer greater than 0;
acquiring the number n1 of output neuron gradients whose absolute values are smaller than a first preset threshold among the L-th layer output neuron gradients;
acquiring proportion data a according to the number n1 and the number n2 of the L-th layer output neuron gradients, wherein a = n1/n2;
when the proportion data a is greater than a second preset threshold, reducing the L-th layer output neuron gradient precision S_∇(l);
representing the L-th layer output neuron gradients according to the reduced output neuron gradient precision S_∇(l), so as to perform subsequent operations.
In one possible embodiment, the reducing the L-th layer output neuron gradient precision S_∇(l) includes increasing the bit width of the fixed-point data format representing the L-th layer output neuron gradients.
In one possible embodiment, after the reducing the L-th layer output neuron gradient precision S_∇(l), the method further includes:
judging whether the L-th layer output neuron gradients overflow when represented in the fixed-point data format representing the L-th layer output neuron gradients;
and, when overflow is determined, increasing the bit width of the fixed-point data format representing the L-th layer output neuron gradients.
In one possible embodiment, the increasing the bit width of the fixed-point data format representing the L-th layer output neuron gradients includes:
increasing the bit width of the fixed-point data format representing the L-th layer output neuron gradients by a third preset step N3.
In one possible embodiment, the increasing the bit width of the fixed-point data format representing the L-th layer output neuron gradient includes:
and increasing the bit width of the fixed point data format representing the L-th layer output neuron gradient in a 2-time incremental mode.
It can be seen that, in the solutions of the embodiments of the present invention, the input neuron precision S_x(l), the weight precision S_w(l) and the output neuron gradient precision S_∇(l) are dynamically adjusted (increased or decreased) during the neural network operation, so that the error of the operation result is reduced and the precision of the operation result is improved while the operation requirements are met.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention; for those skilled in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1a is a schematic diagram of a fixed-point data format;
fig. 1b is a schematic structural diagram of a neural network operation module according to an embodiment of the present invention;
fig. 2 is a schematic flow chart illustrating a neural network operation method according to an embodiment of the present invention;
fig. 3 is a schematic flow chart of another neural network operation method according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It is to be understood that the terminology used in the embodiments of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the examples of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
In the neural network operation process, because of a series of operations such as addition, subtraction, multiplication, division and convolution, the input neurons, weights and output neurons involved in the forward operation and the input neuron gradients, weight gradients and output neuron gradients involved in the reverse training keep changing. The precision with which the input neurons, weights, output neurons, input neuron gradients, weight gradients and output neuron gradients are represented in the fixed-point data format may therefore need to be increased or decreased. If the precision of these data is insufficient, a large error occurs in the operation result, and reverse training may even fail; if the precision is redundant, unnecessary operation overhead is incurred and operation resources are wasted. The present application provides a neural network operation module and method that dynamically adjust the precision of these data during the neural network operation, so as to reduce the error of the operation result and improve its precision while meeting the operation requirements.
In the embodiments of the present application, data precision is adjusted by adjusting the bit width of the data. For example, when the precision of the fixed-point data format does not meet the requirements of the operation, the precision can be increased by increasing the bit width of the fractional part of the fixed-point data format, i.e., increasing s in FIG. 1a. However, since the total bit width of the fixed-point data format is fixed, increasing the fractional bit width decreases the integer bit width, and the data range that the fixed-point data format can represent therefore shrinks.
Referring to fig. 1b, fig. 1b is a schematic structural diagram of a neural network operation module according to an embodiment of the present invention. The neural network operation module is used for performing operation of a multilayer neural network. As shown in fig. 1b, the neural network operation module 100 includes:
and the storage unit 101 is used for storing the input neuron precision, the weight precision and the output neuron gradient precision.
The controller unit 102 is configured to acquire the input neuron precision S_x(l), the weight precision S_w(l) and the output neuron gradient precision S_∇(l) of the L-th layer of the multilayer neural network from the storage unit 101, wherein L is an integer greater than 0; obtain a gradient update precision T according to the input neuron precision S_x(l), the weight precision S_w(l) and the output neuron gradient precision S_∇(l); and, when the gradient update precision T is greater than a preset precision Tr, adjust the input neuron precision S_x(l), the weight precision S_w(l) and the output neuron gradient precision S_∇(l).
In a possible embodiment, the storage unit 101 is further configured to store the input neurons, weights, output neurons and output neuron gradients. The controller unit 102 acquires the L-th layer input neurons, weights and output neuron gradients from the storage unit 101, and obtains the input neuron precision S_x(l), the weight precision S_w(l) and the output neuron gradient precision S_∇(l) according to the L-th layer input neurons, weights and output neuron gradients. The bit width of the fixed-point data format used to represent the input neurons and of the fixed-point data format used to represent the weights is a first bit width, and the bit width of the fixed-point data format used to represent the output neuron gradients is a second bit width.
Optionally, the second bit width is greater than the first bit width.
Further, the second bit width is twice the first bit width, so as to facilitate processing by an electronic computer.
Further, the first bit width is preferably 8 bits, and the second bit width is preferably 16 bits.
The controller unit 102 may set the preset precision Tr empirically in advance; or obtain, by changing an input parameter, a Tr matched with the input parameter through a second preset formula; or obtain Tr by a machine learning method.
Alternatively, the controller unit 102 sets the preset precision Tr according to the learning rate and the batch size (the number of samples in batch processing).
Further, if there are parameter-sharing layers in the neural network (such as convolutional layers and recurrent neural network layers), the controller unit 102 sets the preset precision Tr according to the number of output neurons of the previous layer, the batch size and the learning rate: the greater the number of output neurons of the previous layer, the larger the batch size and the higher the learning rate, the larger the preset precision Tr.
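The text states only the direction of this dependence, not a formula. The following Python sketch implements one such monotonic rule as an illustration; the concrete expression and the constant k are assumptions for demonstration, not the patent's rule:

def preset_precision_tr(n_prev_out: int, batch_size: int,
                        learning_rate: float, k: float = 1e-6) -> float:
    """Hypothetical heuristic: Tr grows with the number of output neurons of
    the previous layer, the batch size and the learning rate; k is a tuning
    constant chosen for illustration."""
    return k * n_prev_out * batch_size * learning_rate

print(preset_precision_tr(n_prev_out=512, batch_size=128, learning_rate=0.01))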
Specifically, after obtaining the input neuron precision S_x(l), the weight precision S_w(l) and the output neuron gradient precision S_∇(l), the controller unit 102 calculates the gradient update precision T from the input neuron precision S_x(l), the weight precision S_w(l) and the output neuron gradient precision S_∇(l) according to the first preset formula, which is a preset combination of S_x(l), S_w(l) and S_∇(l) (the formula itself appears only as an image in the original publication).
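Since the first preset formula is reproduced only as an image in the source, any executable form of it is an assumption. The Python sketch below uses an assumed combination that simply adds the fractional bit widths of the three precisions, chosen only so that reducing the output neuron gradient precision (i.e., increasing its fractional bit width) reduces T, as the surrounding text requires:

def gradient_update_precision(s_x: int, s_w: int, s_grad: int) -> float:
    """Assumed stand-in for the first preset formula.
    s_x, s_w, s_grad are fractional bit widths; the precisions are 2^-s."""
    return 2.0 ** -(s_x + s_w + s_grad)

T = gradient_update_precision(s_x=8, s_w=8, s_grad=8)
Tr = 2.0 ** -26  # preset precision (illustrative value)
if T > Tr:
    # per the text: keep S_x(l) and S_w(l) unchanged and reduce S_grad(l),
    # i.e. increase the fractional bit width of the gradient format
    print("reduce the output neuron gradient precision")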
wherein the controller unit 102 adjusts the input neuron precision Sx(l)Weight accuracy Sw(l)And output neuron gradient accuracy
Figure BDA0001791206350000066
The method comprises the following steps:
the controller unit 102 maintains the input neuron precision Sx(l)Sum weight precision Sw(l)The gradient precision of the output neuron is not changed, and the gradient precision of the output neuron is reduced
Figure BDA0001791206350000067
It should be noted that, since the output neuron gradient precision S_∇(l) is 2^(-s1), the controller unit 102 reducing the output neuron gradient precision S_∇(l) means increasing the fractional bit width s1 of the fixed-point data format representing the output neuron gradients.
Optionally, the controller unit 102 may increase the fractional bit width s1 of the fixed-point data format representing the output neuron gradients by a first preset step N1 according to the value of Tr - T.
Specifically, the controller unit 102 increases the fractional bit width s1 of the fixed-point data format representing the output neuron gradients by N1 bits at a time, i.e., to s1 + N1, obtaining a new output neuron gradient precision 2^(-(s1+N1)), and then determines, according to the above preset formula, whether the absolute value of the difference between the gradient update precision T and the preset precision Tr has decreased. When it has, the controller unit 102 continues to increase the fractional bit width of the fixed-point data format representing the output neuron gradients by N1, i.e., to s1 + 2*N1, obtains the new output neuron gradient precision, and again judges whether the absolute value of the difference between the gradient update precision T and the preset precision Tr has decreased; if it has, it continues processing in this manner. If at the n-th processing the absolute value of the difference between the gradient update precision T and the preset precision Tr increases instead, the controller unit 102 takes the bit width obtained at the (n-1)-th processing, i.e., s1 + (n-1)*N1, as the fractional bit width of the fixed-point data format representing the output neuron gradients, and the output neuron gradient precision after the fractional bit width increase is 2^(-(s1+(n-1)*N1)).
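The following Python sketch illustrates this stepwise search. The inner function T() inlines the same assumed stand-in formula as the earlier sketch, not the patent's actual first preset formula:

def adjust_fractional_width(s_x: int, s_w: int, s1: int, Tr: float,
                            N1: int = 2, max_steps: int = 32) -> int:
    def T(s_grad: int) -> float:
        return 2.0 ** -(s_x + s_w + s_grad)  # assumed formula, see above

    best_err = abs(T(s1) - Tr)
    for _ in range(max_steps):
        err = abs(T(s1 + N1) - Tr)
        if err >= best_err:   # |T - Tr| stopped shrinking: keep previous width
            break
        s1, best_err = s1 + N1, err
    return s1

s1 = adjust_fractional_width(s_x=8, s_w=8, s1=8, Tr=2.0 ** -30)
print(s1, 2.0 ** -s1)  # final fractional width and output neuron gradient precision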
Optionally, the first preset step N1 is 1, 2, 4, 6, 7, 8 or other positive integer.
Alternatively, the controller unit 102 increases the bit width of the decimal part in the fixed point data format indicating the gradient of the output neuron in increments of 2 times.
For example, if the fractional bit width of the fixed-point data format representing the output neuron gradients is 3, i.e., the output neuron gradient precision is 2^(-3), then after the fractional bit width is increased in a 2-fold increasing manner it is 6, i.e., the reduced output neuron gradient precision is 2^(-6).
In one possible embodiment, after the controller unit 102 determines the total increase b of the fractional bit width of the fixed-point data format representing the output neuron gradients, the controller unit 102 increases the fractional bit width in multiple steps; for example, in two steps, with a first increase b1 and a second increase b2, where b = b1 + b2.
The b1 and b2 may be the same or different.
Optionally, when the controller unit 102 reduces the output neuron gradient precision S_∇(l), the bit width of the fixed-point data format representing the output neuron gradients is increased.
Further, reducing the output neuron gradient precision S_∇(l) means increasing the fractional bit width of the fixed-point data format representing the output neuron gradients. If the total bit width of that format stayed unchanged, increasing the fractional bit width would decrease the integer bit width, so the data range the format can represent would shrink. Therefore, after the controller unit 102 reduces the output neuron gradient precision S_∇(l), the controller unit 102 increases the total bit width of the fixed-point data format so that the integer bit width remains unchanged, i.e., the total bit width is increased by the same amount as the fractional bit width.
For example, the bit width of the fixed-point data format is 9, of which the sign bit occupies 1 bit, the integer part 5 bits and the fractional part 3 bits. After the controller unit 102 increases the fractional bit width to 6, the integer bit width is still 5 bits; that is, the fractional bit width is increased while the integer bit width remains unchanged.
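A Python sketch of this rule (the dataclass and field names are illustrative, not from the patent): the integer width is preserved, so the total width grows by exactly the fractional-width increase.

from dataclasses import dataclass

@dataclass
class FixedPointFormat:
    int_bits: int   # integer part bit width (sign bit kept separate)
    frac_bits: int  # fractional part bit width

    @property
    def total_bits(self) -> int:
        return 1 + self.int_bits + self.frac_bits  # 1 sign bit

def widen_fraction(fmt: FixedPointFormat, extra: int) -> FixedPointFormat:
    # integer width unchanged; total width grows by `extra`
    return FixedPointFormat(fmt.int_bits, fmt.frac_bits + extra)

fmt = FixedPointFormat(int_bits=5, frac_bits=3)  # 9-bit format from the example
wide = widen_fraction(fmt, 3)                    # fractional part 3 -> 6
print(wide.total_bits, wide.int_bits)            # 12 5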
In one possible embodiment, after the controller unit 102 reduces the output neuron gradient precision S_∇(l), the controller unit 102 is further configured to:
judge whether the output neuron gradients overflow when represented in the fixed-point data format representing the output neuron gradients;
and, when overflow is determined, increase the bit width of the fixed-point data format representing the output neuron gradients.
Specifically, as can be seen from the above description, when the controller unit 102 reduces the output neuron gradient precision S_∇(l), the range representable by the fixed-point data format representing the output neuron gradients is narrowed. Therefore, after reducing the output neuron gradient precision S_∇(l), the controller unit 102 judges whether the output neuron gradients overflow when represented in the fixed-point data format; when overflow is determined, the controller unit 102 increases the bit width of the fixed-point data format, thereby expanding the representable data range so that the output neuron gradients do not overflow when represented in the fixed-point data format.
It should be noted that the controller unit 102 increases the bit width of the fixed-point data format, specifically, increases the bit width of the integer part of the fixed-point data format.
Further, the increasing, by the controller unit 102, the bit width of the fixed-point data format indicating the gradient of the output neuron includes:
the controller unit 102 increases the bit width of the fixed-point data format representing the gradient of the output neuron according to a second preset step N2, where the second preset step N2 may be 1, 2, 3, 4, 5, 7, 8 or other positive integer.
Specifically, when determining to increase the bit width of the fixed-point data format, the controller unit 102 increases the bit width of the fixed-point data format by the second preset step N2 each time.
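A Python sketch of this overflow handling, using the range formula from the background section; the gradient list and starting widths are assumed for demonstration:

def format_range(int_bits: int, frac_bits: int):
    step = 2.0 ** -frac_bits
    bitnum = 1 + int_bits + frac_bits
    return -(2 ** (bitnum - 1)) * step, (2 ** (bitnum - 1) - 1) * step

def widen_until_fits(grads, int_bits: int, frac_bits: int, N2: int = 2) -> int:
    neg, pos = format_range(int_bits, frac_bits)
    while any(g < neg or g > pos for g in grads):  # overflow in current format
        int_bits += N2                             # widen the integer part by step N2
        neg, pos = format_range(int_bits, frac_bits)
    return int_bits

print(widen_until_fits([3.5, -40.0, 130.0], int_bits=5, frac_bits=6))  # 9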
In one possible embodiment, the controller unit 102 increases the bit width of the fixed-point data format representing the gradient of the output neuron, including:
the controller unit 102 increases the bit width of the fixed-point data format indicating the gradient of the output neuron in increments of 2 times.
For example, if the bit width of the fixed-point data format excluding the sign bit is 8, then after the bit width is increased in a 2-fold increasing manner, the bit width excluding the sign bit is 16; after it is increased in a 2-fold increasing manner again, the bit width excluding the sign bit is 32.
In one possible embodiment, the controller unit 102 adjusting the input neuron precision S_x(l), the weight precision S_w(l) and the output neuron gradient precision S_∇(l) includes:
the controller unit 102 reducing the input neuron precision S_x(l) and/or the output neuron gradient precision S_∇(l) while keeping the weight precision S_w(l) unchanged; or
the controller unit 102 reducing the input neuron precision S_x(l) and increasing the output neuron gradient precision S_∇(l) while keeping the weight precision S_w(l) unchanged, wherein the reduction of the input neuron precision S_x(l) is greater than the increase of the output neuron gradient precision S_∇(l); or
the controller unit 102 increasing the output neuron gradient precision S_∇(l) and reducing the input neuron precision S_x(l) while keeping the weight precision S_w(l) unchanged, wherein the increase of the output neuron gradient precision S_∇(l) is smaller than the reduction of the input neuron precision S_x(l); or
the controller unit 102 increasing or decreasing the input neuron precision S_x(l), the weight precision S_w(l) and the output neuron gradient precision S_∇(l) so that the absolute value of the difference between the gradient update precision T and the preset precision Tr is minimized.
It should be noted that, for the specific process by which the controller unit 102 increases or decreases any one of the weight precision S_w(l), the input neuron precision S_x(l) and the output neuron gradient precision S_∇(l), reference may be made to the related operations of the controller unit 102 above, which are not repeated here.
After the input neuron precision S_x(l), the weight precision S_w(l) and the output neuron gradient precision S_∇(l) are adjusted according to the above method, the operation unit 103 represents the L-th layer input neurons, weights and output neuron gradients in fixed-point data format according to the adjusted input neuron precision S_x(l), weight precision S_w(l) and output neuron gradient precision S_∇(l) during the operation, and then performs subsequent operations.
It should be noted that the frequency of calculating the gradient update accuracy T by the controller unit 102 can be flexibly set according to the requirement.
The controller unit 102 may adjust and calculate the frequency of the gradient update precision T according to the number of training iterations in the neural network training process.
Optionally, the controller unit 102 recalculates the gradient update precision T once per iteration during the neural network training process; or recalculates it every preset number of iterations; or sets the frequency according to the change of the gradient update precision T.
Alternatively, the controller unit 102 sets the frequency of calculating the gradient update accuracy T according to the number of training iterations in the neural network training.
The operation unit 103 is configured to represent the L-th layer input neurons and weights according to the increased or decreased input neuron precision S_x(l) and weight precision S_w(l), and to represent the L-th layer output neuron gradients obtained by operation according to the increased or decreased output neuron gradient precision S_∇(l).
In other words, the operation unit represents the L-th layer input neurons in a fixed-point data format with the increased or decreased input neuron precision S_x(l), represents the L-th layer weights in a fixed-point data format with the increased or decreased weight precision S_w(l), and represents the L-th layer output neuron gradients in a fixed-point data format with the increased or decreased output neuron gradient precision S_∇(l), for subsequent operations.
By dynamically adjusting (increasing or decreasing) the input neuron precision S_x(l), the weight precision S_w(l) and the output neuron gradient precision S_∇(l) during the neural network operation, the error of the operation result and the operation overhead are reduced and operation resources are saved while the operation requirements are met.
In another alternative embodiment, the controller unit 102 obtains the L-th layer output neuron gradients of the multilayer neural network.
In one possible embodiment, the controller unit 102 acquires the L-th layer output neurons and the (L-1)-th layer output neurons, and then obtains the L-th layer output neuron gradients according to the L-th layer output neurons and the (L-1)-th layer output neurons.
The controller unit 102 obtains the proportion data a of the output neuron gradients whose absolute values are smaller than a first preset threshold.
Optionally, the first preset threshold may be 0, 0.01, 0.05, 0.1, 0.12 or another value.
Specifically, after acquiring the L-th layer output neuron gradients, the controller unit 102 obtains the number n1 of gradient values whose absolute values are smaller than the first preset threshold among the L-th layer output neuron gradients, and then obtains the proportion data a according to the number n1 and the number n2 of the L-th layer output neuron gradients, i.e., a = n1/n2.
Optionally, the second preset threshold may be 50%, 60%, 65%, 70%, 80%, 85%, 90% or another value.
Optionally, the second preset threshold is 80%.
When the proportion data a is greater than the second preset threshold, the controller unit 102 reduces the L-th layer output neuron gradient precision S_∇(l).
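A Python sketch of this proportion test; the threshold values follow the examples given in the text, and the gradient list is illustrative:

def should_reduce_gradient_precision(grads, first_threshold: float = 0.01,
                                     second_threshold: float = 0.80) -> bool:
    n1 = sum(1 for g in grads if abs(g) < first_threshold)  # near-zero gradients
    n2 = len(grads)
    a = n1 / n2                                              # proportion data
    return a > second_threshold

grads = [0.0001, -0.002, 0.9, 0.003, -0.0005, 0.004]
print(should_reduce_gradient_precision(grads))  # True: 5 of 6 are below 0.01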
In one possible embodiment, after reducing the L-th layer output neuron gradient precision S_∇(l), the controller unit 102 increases the bit width of the fixed-point data format representing the L-th layer output neuron gradients.
In one possible embodiment, after the controller unit 102 reduces the L-th layer output neuron gradient precision S_∇(l), the controller unit 102 is further configured to:
judge whether the L-th layer output neuron gradients overflow when represented in the fixed-point data format representing the L-th layer output neuron gradients;
and, when overflow is determined, increase the bit width of the fixed-point data format representing the L-th layer output neuron gradients.
In a possible embodiment, the controller unit 102 increases the bit width of the fixed-point data format representing the gradient of the L-th layer output neuron, including:
the controller unit 102 increases the bit width of the fixed point data format indicating the gradient of the L-th layer output neuron according to a third preset step N3.
In a possible embodiment, the controller unit 102 increasing the bit width of the fixed-point data format indicating the L-th layer output neuron gradient includes:
the controller unit 102 increases the bit width of the fixed-point data format indicating the gradient of the L-th layer output neuron by a 2-fold increment.
It should be noted that, for the specific process by which the controller unit 102 reduces the output neuron gradient precision S_∇(l), reference may be made to the above description, which is not repeated here.
After the output neuron gradient precision S_∇(l) is adjusted according to the above method, the operation unit 103 represents the L-th layer output neuron gradients in fixed-point form according to the adjusted output neuron gradient precision S_∇(l) during the operation, and then performs subsequent operations.
By adjusting the output neuron gradient precision according to the output neuron gradients during the neural network operation, the error of the output neurons is reduced and normal training is ensured.
Referring to fig. 2, fig. 2 is a schematic flow chart of a neural network operation method according to an embodiment of the present invention, and as shown in fig. 2, the method includes:
s201, the neural network operation module acquires the precision, the weight precision and the gradient precision of output neurons of the L-th layer of the neural network.
The input neuron precision S_x(l), the weight precision S_w(l) and the output neuron gradient precision S_∇(l) may all be the same, partially the same, or pairwise different.
The neural network is a multilayer neural network, and the L-th layer input neuron precision S_x(l), weight precision S_w(l) and output neuron gradient precision S_∇(l) are the input neuron precision, the weight precision and the output neuron gradient precision of any layer of the multilayer neural network, respectively.
In a possible embodiment, the neural network operation module acquires the L-th layer input neurons, weights and output neurons, and obtains the L-th layer input neuron precision S_x(l), weight precision S_w(l) and output neuron gradient precision S_∇(l) according to the L-th layer input neurons, weights and output neurons.
S202, the neural network operation module calculates a gradient update precision T according to the L-th layer input neuron precision, weight precision and output neuron gradient precision.
Specifically, the neural network operation module calculates the gradient update precision T from the input neuron precision S_x(l), the weight precision S_w(l) and the output neuron gradient precision S_∇(l) according to the first preset formula, which is a preset combination of S_x(l), S_w(l) and S_∇(l) (the formula itself appears only as an image in the original publication).
S203, when the gradient update precision T is greater than the preset precision Tr, the neural network operation module adjusts the L-th layer input neuron precision, weight precision and output neuron gradient precision so that the absolute value of the difference between the gradient update precision T and the preset precision Tr is minimized.
The bit width of the fixed-point data format used for representing the input neuron and the fixed-point data format used for representing the weight is a first bit width, and the bit width of the fixed-point data format used for representing the gradient of the output neuron is a second bit width.
Optionally, the second bit width is greater than the first bit width.
Further, the second bit width is twice the first bit width, so as to facilitate processing by an electronic computer.
Further, the first bit width is preferably 8 bits, and the second bit width is preferably 16 bits.
The preset precision Tr may be set empirically in advance; or a Tr matched with an input parameter may be obtained by changing the input parameter through a second preset formula; or Tr may be obtained by a machine learning method.
Optionally, the neural network operation module sets the preset precision Tr according to the learning rate and the batch size (the number of samples in batch processing).
Further, if there are parameter-sharing layers in the neural network (such as convolutional layers and recurrent neural network layers), the preset precision Tr is set according to the number of output neurons of the previous layer, the batch size and the learning rate: the greater the number of output neurons of the previous layer, the larger the batch size and the higher the learning rate, the larger the preset precision Tr.
The neural network operation module adjusting the input neuron precision S_x(l), the weight precision S_w(l) and the output neuron gradient precision S_∇(l) includes:
keeping the input neuron precision S_x(l) and the weight precision S_w(l) unchanged and reducing the output neuron gradient precision S_∇(l).
It should be noted that the neural network operation module reducing the output neuron gradient precision S_∇(l) means increasing the fractional bit width s1 of the fixed-point data format representing the output neuron gradients.
Optionally, the controller unit of the neural network operation module increases the fractional bit width s1 of the fixed-point data format representing the output neuron gradients by a first preset step N1 according to the value of Tr - T.
Specifically, the neural network operation module increases the fractional bit width s1 of the fixed-point data format representing the output neuron gradients by N1 at a time, i.e., to s1 + N1, obtaining a new output neuron gradient precision 2^(-(s1+N1)), and then determines, according to the above preset formula, whether the absolute value of the difference between the gradient update precision T and the preset precision Tr has decreased. When it has, the neural network operation module continues to increase the fractional bit width of the fixed-point data format representing the output neuron gradients by N1, i.e., to s1 + 2*N1, obtains the new output neuron gradient precision, and again judges whether the absolute value of the difference between the gradient update precision T and the preset precision Tr has decreased; if it has, it continues processing in this manner. If at the n-th processing the absolute value of the difference between the gradient update precision T and the preset precision Tr increases instead, the neural network operation module takes the bit width obtained at the (n-1)-th processing, i.e., s1 + (n-1)*N1, as the fractional bit width of the fixed-point data format representing the output neuron gradients, and the output neuron gradient precision after the fractional bit width increase is 2^(-(s1+(n-1)*N1)).
Optionally, the first preset step N1 is 1, 2, 4, 6, 7, 8 or other positive integer.
Optionally, the neural network operation module increases a bit width of a decimal part in a fixed point data format indicating the gradient of the output neuron in a 2-fold increasing manner.
For example, if the fractional bit width of the fixed-point data format representing the output neuron gradients is 3, i.e., the output neuron gradient precision is 2^(-3), then after the fractional bit width is increased in a 2-fold increasing manner it is 6, i.e., the reduced output neuron gradient precision is 2^(-6).
In one possible embodiment, after the neural network operation module determines the total increase b of the fractional bit width of the fixed-point data format representing the output neuron gradients, the neural network operation module increases the fractional bit width in multiple steps; for example, in two steps, with a first increase b1 and a second increase b2, where b = b1 + b2.
The b1 and b2 may be the same or different.
Optionally, when the neural network operation module reduces the output neuron gradient precision, the bit width of the fixed-point data format representing the output neuron gradients is increased.
Further, reducing the output neuron gradient precision S_∇(l) means increasing the fractional bit width of the fixed-point data format representing the output neuron gradients. If the total bit width of that format stayed unchanged, increasing the fractional bit width would decrease the integer bit width, so the data range the format can represent would shrink. Therefore, after the neural network operation module reduces the output neuron gradient precision S_∇(l), the neural network operation module increases the total bit width of the fixed-point data format so that the integer bit width remains unchanged, i.e., the total bit width is increased by the same amount as the fractional bit width.
For example, the bit width of the fixed-point data format is 9, of which the sign bit occupies 1 bit, the integer part 5 bits and the fractional part 3 bits. After the neural network operation module increases the fractional bit width to 6, the integer bit width is still 5 bits; that is, the fractional bit width is increased while the integer bit width remains unchanged.
In a possible embodiment, after the neural network operation module reduces the gradient precision of the output neuron, the neural network operation module is further configured to:
judging whether the output neuron gradient overflows when in a fixed point data format for representing the output neuron gradient;
when overflow is determined, increasing a bit width of a fixed-point data format representing the output neuron gradient.
Specifically, as can be seen from the above description, when the neural network operation module reduces the output neuron gradient precision, the range representable by the fixed-point data format representing the output neuron gradients is narrowed. Therefore, after reducing the output neuron gradient precision, the neural network operation module judges whether the output neuron gradients overflow when represented in the fixed-point data format; when overflow is determined, the neural network operation module increases the bit width of the fixed-point data format, thereby expanding the representable data range so that the output neuron gradients do not overflow when represented in the fixed-point data format.
It should be noted that, the neural network operation module increases the bit width of the fixed-point data format, specifically, increases the bit width of the integer part of the fixed-point data format.
Further, the increasing, by the neural network operation module, a bit width of the fixed-point data format indicating the gradient of the output neuron includes:
the neural network operation module increases the bit width of the fixed point data format representing the gradient of the output neuron according to a second preset step N2, wherein the second preset step N2 may be 1, 2, 3, 4, 5, 7, 8 or other positive integers.
Specifically, when determining to increase the bit width of the fixed point data format, the neural network operation module increases the bit width of the fixed point data format by the second preset step N2 each time.
In one possible embodiment, the neural network operation module increases the bit width of the fixed-point data format representing the gradient of the output neuron, including:
the neural network operation module increases the bit width of the fixed point data format representing the gradient of the output neuron in a 2-fold increasing manner.
For example, if the bit width of the fixed-point data format excluding the sign bit is 8, the bit width of the fixed-point data format excluding the sign bit is 16 after the bit width of the fixed-point data format is increased in a 2-time increasing manner; after the bit width of the fixed-point data format is increased again in a 2-time increasing mode, the bit width of the fixed-point data format excluding the sign bit is 32.
In one embodiment, the neural network operation module adjusting the input neuron precision S_x(l), the weight precision S_w(l), and the output neuron gradient precision S_∇(l) includes:
reducing the input neuron precision S_x(l) and/or the output neuron gradient precision S_∇(l) while keeping the weight precision S_w(l) unchanged; or
reducing the input neuron precision S_x(l) and increasing the output neuron gradient precision S_∇(l) while keeping the weight precision S_w(l) unchanged, where the magnitude by which the input neuron precision S_x(l) is reduced is greater than the magnitude by which the output neuron gradient precision S_∇(l) is increased; or
increasing the output neuron gradient precision S_∇(l) and reducing the input neuron precision S_x(l) while keeping the weight precision S_w(l) unchanged, where the magnitude by which the output neuron gradient precision S_∇(l) is increased is smaller than the magnitude by which the input neuron precision S_x(l) is reduced; or
increasing or reducing the input neuron precision S_x(l), the weight precision S_w(l), and the output neuron gradient precision S_∇(l) so that the absolute value of the difference between the gradient update precision T and the preset precision T_r is minimized.
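Purely as an illustration of the "minimize |T − T_r|" criterion, the Python sketch below brute-forces small bit-width adjustments. Note that the gradient-update-precision formula used here is a stand-in that simply sums the three precisions, because the patent's preset formula appears only as an image; all names are hypothetical.

    import itertools

    def gradient_update_precision(sx, sw, sg):
        # STAND-IN formula only: the patent's preset formula is given as an image.
        return sx + sw + sg

    def adjust_precisions(sx, sw, sg, t_r, max_delta=3):
        # search all small adjustments and keep the one minimizing |T - T_r|
        deltas = range(-max_delta, max_delta + 1)
        dx, dw, dg = min(
            itertools.product(deltas, deltas, deltas),
            key=lambda d: abs(gradient_update_precision(sx + d[0], sw + d[1], sg + d[2]) - t_r))
        return sx + dx, sw + dw, sg + dg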
It should be noted that the specific process by which the neural network operation module increases any one of the weight precision S_w(l), the input neuron precision S_x(l), and the output neuron gradient precision S_∇(l) may refer to the related increasing operations of the neural network operation module described above, and is not repeated here.
S204: the neural network operation module represents the output neurons and weights of the L-th layer according to the adjusted input neuron precision and the adjusted weight precision, and represents the L-th layer output neuron gradient obtained by operation according to the adjusted output neuron gradient precision, for subsequent operations.
In other words, the above arithmetic unit represents the L-th layer input neurons in a fixed-point data format with the increased or decreased input neuron precision S_x(l), represents the L-th layer weights in a fixed-point data format with the increased or decreased weight precision S_w(l), and represents the L-th layer output neuron gradient in a fixed-point data format with the increased or decreased output neuron gradient precision S_∇(l), for subsequent operations.
After adjusting the input neuron precision S_x(l), the weight precision S_w(l), and the output neuron gradient precision S_∇(l) according to the above method, the neural network operation module recalculates the gradient update precision T; when the gradient update precision T is no longer greater than the preset precision T_r, the neural network operation module reduces the input neuron precision S_x(l), the weight precision S_w(l), and the output neuron gradient precision S_∇(l) with reference to the method of step S203.
It should be noted that the frequency at which the neural network operation module calculates the gradient update precision T can be flexibly set as required.
The neural network operation module may adjust the frequency of calculating the gradient update precision T according to the number of training iterations in the neural network training process.
Optionally, during neural network training, the neural network operation module recalculates the gradient update precision T once per iteration; or once every preset number of iterations; or sets the frequency according to the change in the gradient update precision T.
Optionally, the neural network operation module sets a frequency of calculating the gradient update precision T according to a training iteration number in the neural network training.
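A compact sketch of this scheduling, reusing the stand-in helpers from the previous sketch (the loop and all names are hypothetical): T is recomputed once every k iterations, with k = 1 corresponding to the once-per-iteration option.

    def train(num_iters=100, recompute_every=10):
        sx, sw, sg = 8, 8, 8      # example fractional bit widths for S_x, S_w, S_grad
        t_r = 20                  # example preset precision T_r
        for it in range(num_iters):
            # ... one forward/backward training iteration would run here ...
            if it % recompute_every == 0:     # recompute T once every k iterations
                t = gradient_update_precision(sx, sw, sg)
                if t > t_r:                   # adjust only while T exceeds T_r
                    sx, sw, sg = adjust_precisions(sx, sw, sg, t_r)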
It can be seen that, in the solution of the embodiment of the present invention, the input neuron precision S_x, the weight precision S_w, and the output neuron gradient precision S_∇ are dynamically adjusted during neural network operation, so that the operation requirement is met while the error of the operation result and the operation overhead are reduced, saving operation resources.
Fig. 3 is a schematic flow chart of a neural network operation method according to an embodiment of the present invention. As shown in Fig. 3, the method includes:
S301: the neural network operation module obtains the L-th layer output neuron gradient.
In a possible embodiment, the neural network operation module obtains the output neurons of the L-th layer and the output neurons of the L-1 th layer, and then obtains the gradient of the L-th layer output neurons according to the output neurons of the L-th layer and the output neurons of the L-1 th layer.
S302: the neural network operation module obtains proportion data a of the L-th layer output neuron gradients whose absolute value is smaller than a first preset threshold.
Optionally, the first preset threshold may be 0, 0.01, 0.05, 0.1, 0.12, or another value.
Specifically, after acquiring the L-th layer output neuron gradient, the neural network operation module counts the number n1 of gradient values whose absolute value is smaller than the first preset threshold, and then obtains the proportion data a from n1 and the total number n2 of L-th layer output neuron gradients, that is, a = n1/n2.
Optionally, the second preset threshold (against which the proportion data a is compared in step S303) may be 50%, 60%, 65%, 70%, 80%, 85%, 90%, or another value.
Optionally, the second preset threshold is 80%.
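The computation of a in S302 is a simple counting operation; a self-contained sketch follows (names and values are hypothetical):

    def proportion_small(gradients, first_threshold=0.01):
        # a = n1 / n2: the fraction of gradient values with |g| below the threshold
        n1 = sum(1 for g in gradients if abs(g) < first_threshold)
        return n1 / len(gradients)

    grads = [0.0, 0.004, -0.2, 0.03, -0.001]
    a = proportion_small(grads)        # 3/5 = 0.6
    # if a exceeds the second preset threshold (e.g. 0.8), reduce the
    # L-th layer output neuron gradient precision as described in S303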
S303: when the proportion data a is greater than a second preset threshold, the neural network operation module reduces the L-th layer output neuron gradient precision.
In one possible embodiment, when the neural network operation module reduces the L-th layer output neuron gradient precision S_∇(l), it increases the bit width of the fixed-point data format representing the L-th layer output neuron gradient.
In one possible embodiment, after the neural network operation module reduces the L-th layer output neuron gradient precision S_∇(l), the neural network operation module is further configured to:
judge whether the L-th layer output neuron gradient overflows when represented in the fixed-point data format used for the L-th layer output neuron gradient;
when overflow is determined, increase the bit width of the fixed-point data format representing the L-th layer output neuron gradient.
In a possible embodiment, the neural network operation module increasing the bit width of the fixed-point data format representing the L-th layer output neuron gradient includes:
increasing the bit width of the fixed-point data format representing the L-th layer output neuron gradient by a third preset step N3.
In a possible embodiment, the neural network operation module increasing the bit width of the fixed-point data format representing the L-th layer output neuron gradient includes:
increasing the bit width of the fixed-point data format representing the L-th layer output neuron gradient in a 2-fold increasing manner.
It should be noted that the specific process by which the controller unit 102 reduces the output neuron gradient precision S_∇(l) can be seen from the above description and is not repeated here.
After adjusting the output neuron gradient precision S_∇(l) according to the above method, the neural network operation module represents the L-th layer output neuron gradient in a fixed-point data format with the adjusted output neuron gradient precision S_∇(l) during operation, and then performs subsequent operations.
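Representing a gradient "according to" an adjusted precision amounts to rounding it to the format's step size and clamping it to the representable range; a self-contained round-trip sketch (the bit widths are example values, not taken from the patent):

    def quantize(value, int_bits=5, frac_bits=6):
        # round to the nearest multiple of 2**-frac_bits, clamp to the signed range
        step = 2.0 ** -frac_bits
        lo = -(2.0 ** int_bits)
        hi = 2.0 ** int_bits - step
        return min(hi, max(lo, round(value / step) * step))

    print(quantize(0.7071))   # 0.703125, the nearest multiple of 2**-6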
It can be seen that, in the scheme of the embodiment of the present invention, the output neuron gradient precision is adjusted according to the output neuron gradient during neural network operation, which reduces the error of the output neurons and ensures normal training.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
While the invention has been described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (16)

1. A neural network operation module, wherein the neural network operation module is configured to perform operations of a multilayer neural network, and comprises:
a storage unit configured to store input neuron precision, weight precision, and output neuron gradient precision;
a controller unit configured to obtain, from the storage unit, input neuron precision S_x(l), weight precision S_w(l), and output neuron gradient precision S_∇(l) of an L-th layer of the multilayer neural network, wherein L is an integer greater than 0; obtain a gradient update precision T according to the input neuron precision S_x(l), the weight precision S_w(l), and the output neuron gradient precision S_∇(l); and, when the gradient update precision T is greater than a preset precision T_r, adjust the input neuron precision S_x(l), the weight precision S_w(l), and the output neuron gradient precision S_∇(l) so that the absolute value of the difference between the gradient update precision T and the preset precision T_r is minimized; and
an arithmetic unit configured to represent the input neurons and weights of the L-th layer according to the adjusted input neuron precision S_x(l) and weight precision S_w(l), and to represent the L-th layer output neuron gradient obtained by operation according to the adjusted output neuron gradient precision S_∇(l), for subsequent operations.
2. The module of claim 1, wherein the controller unit obtaining the gradient update precision T according to the input neuron precision S_x(l), the weight precision S_w(l), and the output neuron gradient precision S_∇(l) specifically comprises:
the controller unit calculating the gradient update precision T from the input neuron precision S_x(l), the weight precision S_w(l), and the output neuron gradient precision S_∇(l) according to a preset formula,
wherein the preset formula is given in the source only as an image and is not reproduced here.
3. The module of claim 2, wherein the controller unit adjusting the input neuron precision S_x(l), the weight precision S_w(l), and the output neuron gradient precision S_∇(l) comprises:
the controller unit keeping the input neuron precision S_x(l) and the weight precision S_w(l) unchanged and reducing the output neuron gradient precision S_∇(l).
4. The module of claim 3, wherein, when reducing the output neuron gradient precision S_∇(l), the controller unit increases the bit width of the fixed-point data format representing the output neuron gradient.
5. The module of claim 3 or 4, wherein, after the controller unit reduces the output neuron gradient precision S_∇(l), the controller unit is further configured to:
judge whether the output neuron gradient overflows when represented in the fixed-point data format used for the output neuron gradient;
when overflow is determined, increase the bit width of the fixed-point data format representing the output neuron gradient.
6. The module of claim 4 or 5, wherein the controller unit increasing the bit width of the fixed-point data format representing the output neuron gradient comprises:
the controller unit increasing the bit width of the fixed-point data format representing the output neuron gradient by a preset step N1,
wherein the preset step N1 is 1, 2, 4, 6, 7, 8, or another positive integer.
7. The module of claim 4 or 5, wherein the controller unit increasing the bit width of the fixed-point data format representing the output neuron gradient comprises:
the controller unit increasing the bit width of the fixed-point data format representing the output neuron gradient in a 2-fold increasing manner.
8. The module of any one of claims 1 to 7, wherein the controller unit is further configured to:
obtain the preset precision T_r according to a machine learning method; or
obtain the preset precision T_r according to the number of output neurons of the L-1-th layer, the learning rate, and the number of samples in a batch, wherein the greater the number of L-1-th layer output neurons and the number of samples in the batch, and the higher the learning rate, the larger the preset precision T_r.
9. A neural network operation method, comprising:
obtaining input neuron precision S_x(l), weight precision S_w(l), and output neuron gradient precision S_∇(l) of an L-th layer of a neural network;
calculating a gradient update precision T according to the input neuron precision S_x(l), the weight precision S_w(l), and the output neuron gradient precision S_∇(l);
when the gradient update precision T is greater than a preset precision T_r, adjusting the input neuron precision S_x(l), the weight precision S_w(l), and the output neuron gradient precision S_∇(l) so that the absolute value of the difference between the gradient update precision T and the preset precision T_r is minimized; and
representing the output neurons and weights of the L-th layer according to the adjusted input neuron precision S_x(l) and weight precision S_w(l), and representing the L-th layer output neuron gradient obtained by operation according to the adjusted output neuron gradient precision S_∇(l), for subsequent operations.
10. The method of claim 9, wherein the calculating the gradient update precision T according to the input neuron precision S_x(l), the weight precision S_w(l), and the output neuron gradient precision S_∇(l) comprises:
calculating the gradient update precision T from the input neuron precision S_x(l), the weight precision S_w(l), and the output neuron gradient precision S_∇(l) according to a preset formula,
wherein the preset formula is given in the source only as an image and is not reproduced here.
11. The method of claim 10, wherein the adjusting the input neuron precision S_x(l), the weight precision S_w(l), and the output neuron gradient precision S_∇(l) comprises:
keeping the input neuron precision S_x(l) and the weight precision S_w(l) unchanged and reducing the output neuron gradient precision S_∇(l).
12. The method of claim 11, wherein, when reducing the output neuron gradient precision S_∇(l), the bit width of the fixed-point data format representing the output neuron gradient is increased.
13. The method of claim 11 or 12, wherein, after the reducing the output neuron gradient precision S_∇(l), the method further comprises:
judging whether the output neuron gradient overflows when represented in the fixed-point data format used for the output neuron gradient;
when overflow is determined, increasing the bit width of the fixed-point data format representing the output neuron gradient.
14. The method of claim 12 or 13, wherein the increasing the bit width of the fixed-point data format representing the output neuron gradient comprises:
increasing the bit width of the fixed-point data format representing the output neuron gradient by a preset step N1,
wherein the preset step N1 is 1, 2, 4, 6, 7, 8, or another positive integer.
15. The method of claim 12 or 13, wherein the increasing the bit width of the fixed-point data format representing the output neuron gradient comprises:
increasing the bit width of the fixed-point data format representing the output neuron gradient in a 2-fold increasing manner.
16. The method of any one of claims 9 to 15, further comprising:
obtaining the preset precision T_r according to a machine learning method; or
obtaining the preset precision T_r according to the number of output neurons of the L-1-th layer, the learning rate, and the number of samples in a batch, wherein the greater the number of L-1-th layer output neurons and the number of samples in the batch, and the higher the learning rate, the larger the preset precision T_r.
CN201811040961.XA 2018-05-18 2018-09-06 Neural network operation module and method Pending CN110880037A (en)

Priority Applications (6)

Application Number Priority Date Filing Date Title
CN201811040961.XA CN110880037A (en) 2018-09-06 2018-09-06 Neural network operation module and method
EP19803375.5A EP3624020A4 (en) 2018-05-18 2019-05-07 Computing method and related product
PCT/CN2019/085844 WO2019218896A1 (en) 2018-05-18 2019-05-07 Computing method and related product
US16/718,742 US11409575B2 (en) 2018-05-18 2019-12-18 Computation method and product thereof
US16/720,145 US11442785B2 (en) 2018-05-18 2019-12-19 Computation method and product thereof
US16/720,171 US11442786B2 (en) 2018-05-18 2019-12-19 Computation method and product thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811040961.XA CN110880037A (en) 2018-09-06 2018-09-06 Neural network operation module and method

Publications (1)

Publication Number Publication Date
CN110880037A true CN110880037A (en) 2020-03-13

Family

ID=69727298

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811040961.XA Pending CN110880037A (en) 2018-05-18 2018-09-06 Neural network operation module and method

Country Status (1)

Country Link
CN (1) CN110880037A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020192582A1 (en) * 2019-03-26 2020-10-01 上海寒武纪信息科技有限公司 Neural network operation module and method

Similar Documents

Publication Publication Date Title
JP7146955B2 (en) DATA PROCESSING METHOD, APPARATUS, COMPUTER DEVICE, AND STORAGE MEDIUM
JP7146952B2 (en) DATA PROCESSING METHOD, APPARATUS, COMPUTER DEVICE, AND STORAGE MEDIUM
CN112085183A (en) Neural network operation method and device and related product
CN111656315A (en) Data processing method and device based on convolutional neural network architecture
CN111985523A (en) Knowledge distillation training-based 2-exponential power deep neural network quantification method
CN108537327B (en) Neural network prediction method and device based on time series BP
CN111160531B (en) Distributed training method and device for neural network model and electronic equipment
CN111758104B (en) Neural network parameter optimization method and neural network calculation method and device suitable for hardware implementation
CN114462594A (en) Neural network training method and device, electronic equipment and storage medium
US20230037498A1 (en) Method and system for generating a predictive model
CN110109646A (en) Data processing method, device and adder and multiplier and storage medium
CN110880037A (en) Neural network operation module and method
CN113642711B (en) Processing method, device, equipment and storage medium of network model
CN107666107B (en) Method of correcting laser power, laser, storage medium, and electronic apparatus
CN115759238B (en) Quantization model generation method and device, electronic equipment and storage medium
US10984163B1 (en) Systems and methods for parallel transient analysis and simulation
CN110880033A (en) Neural network operation module and method
WO2020021396A1 (en) Improved analog computing implementing arbitrary non-linear functions using chebyshev-polynomial- interpolation schemes and methods of use
CN111753971A (en) Neural network operation module and method
CN111753972A (en) Neural network operation module and method
US20220156562A1 (en) Neural network operation module and method
TWI743710B (en) Method, electric device and computer program product for convolutional neural network
CN111753970A (en) Neural network operation module and method
CN110580523B (en) Error calibration method and device for analog neural network processor
US20200371746A1 (en) Arithmetic processing device, method for controlling arithmetic processing device, and non-transitory computer-readable storage medium for storing program for controlling arithmetic processing device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination