CN109800877A - Parameter adjustment method, apparatus and device for a neural network - Google Patents
Parameter adjustment method, apparatus and device for a neural network
- Publication number: CN109800877A
- Application number: CN201910127149.9A
- Authority: CN (China)
- Prior art keywords: layer, parameter, bit width, network, decimal
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
- Landscapes: Image Analysis (AREA)
Abstract
An embodiment of the present application discloses a parameter adjustment method, apparatus and device for a neural network, belonging to the field of computer technology. The method includes: obtaining the parameters and a training sample of a trained neural network, where the training sample includes input data, and the parameters and the input data are floating-point numbers of the same precision; for the i-th layer of the neural network, performing a predetermined operation on the i-th layer input data and the i-th layer parameters to obtain a first operation result; converting the i-th layer input data and the i-th layer parameters into fixed-point numbers that satisfy the decimal bit width of the i-th layer, and performing the predetermined operation on the converted i-th layer input data and i-th layer parameters to obtain a second operation result; and when the error between the second operation result and the first operation result is less than an error threshold, determining the converted i-th layer parameters as the target parameters of the i-th layer. The embodiments of the present application can improve the efficiency of parameter adjustment while guaranteeing the accuracy of the neural network.
Description
Technical field
The embodiments of the present application relate to the field of computer technology, and in particular to a parameter adjustment method, apparatus and device for a neural network.
Background Art
Neural networks have high recognition accuracy and good parallelism, and in recent years have been widely used in fields such as image recognition, object classification and pattern recognition. The application of a neural network involves two stages, training and inference: training refers to training the parameters of the neural network with massive training samples, while inference refers to processing input data with the trained parameters to obtain an inference result.
Since the training stage has high requirements on data precision while a neural network has a certain robustness, the data-precision requirement of the inference stage can be lowered to reduce the hardware resources used in the inference stage. In the related art, a server retrains the parameters of the neural network for the data precision of the inference stage. For example, the server first performs a nonlinear transformation and a low-bit-width conversion on the parameters of the neural network to obtain low-bit-width transformed parameters; it then obtains, through the backward pass of the neural network, the gradient values with which the low-bit-width transformed parameters are to be updated; finally, it updates the parameters according to those gradient values.
Because retraining a neural network is computationally complex, training takes a long time and is difficult, which makes parameter adjustment inefficient.
Summary of the invention
Embodiments of the present application provide a parameter adjustment method, apparatus and device for a neural network, to solve the problem that parameter adjustment is inefficient when the parameters of a neural network are retrained for the data precision of the inference stage. The technical solution is as follows:
In one aspect, a parameter adjustment method for a neural network is provided, the method comprising:
obtaining the parameters and a training sample of a trained neural network, where the training sample includes input data, and the parameters and the input data are floating-point numbers of the same precision;
for the i-th layer of the neural network, obtaining the i-th layer input data according to the input data, obtaining the i-th layer parameters from the parameters, and performing a predetermined operation on the i-th layer input data and the i-th layer parameters to obtain a first operation result, where i is a positive integer;
converting the i-th layer input data and the i-th layer parameters into fixed-point numbers that satisfy the decimal bit width of the i-th layer, and performing the predetermined operation on the converted i-th layer input data and i-th layer parameters to obtain a second operation result, where the decimal bit width indicates the number of fractional digits in a fixed-point number;
when the error between the second operation result and the first operation result is less than an error threshold, determining the converted i-th layer parameters as the target parameters of the i-th layer.
In one aspect, a parameter adjustment apparatus for a neural network is provided, the apparatus comprising:
an obtaining module, configured to obtain the parameters and a training sample of a trained neural network, where the training sample includes input data, and the parameters and the input data are floating-point numbers of the same precision;
an operation module, configured to, for the i-th layer of the neural network, obtain the i-th layer input data according to the input data obtained by the obtaining module, obtain the i-th layer parameters from the parameters, and perform a predetermined operation on the i-th layer input data and the i-th layer parameters to obtain a first operation result, where i is a positive integer;
the operation module being further configured to convert the i-th layer input data and the i-th layer parameters into fixed-point numbers that satisfy the decimal bit width of the i-th layer, and to perform the predetermined operation on the converted i-th layer input data and i-th layer parameters to obtain a second operation result, where the decimal bit width indicates the number of fractional digits in a fixed-point number;
a determining module, configured to, when the error between the second operation result and the first operation result obtained by the operation module is less than the error threshold, determine the converted i-th layer parameters as the target parameters of the i-th layer.
In one aspect, a parameter adjustment device for a neural network is provided. The parameter adjustment device includes a processor and a memory; the memory stores at least one instruction, and the instruction is loaded and executed by the processor to implement the parameter adjustment method for a neural network described above.
The beneficial effects of the technical solutions provided by the embodiments of the present application include at least the following:
A predetermined operation is performed on the i-th layer input data and the i-th layer parameters in floating-point format to obtain a first operation result, and the same predetermined operation is performed on the i-th layer input data and the i-th layer parameters in fixed-point format to obtain a second operation result; when the error between the second operation result and the first operation result is less than the error threshold, the converted parameters are determined as the target parameters of the i-th layer. Since the parameters of each layer can be converted directly from floating-point to fixed-point numbers without retraining the neural network, the efficiency of parameter adjustment can be improved.
In addition, since a neural network has many layers, it is not very sensitive to noise and precision loss, so adjusting the parameters of a single layer has little influence on the final accuracy of the neural network. The efficiency of parameter adjustment can therefore be improved while the accuracy of the neural network is guaranteed.
Brief Description of the Drawings
To explain the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic structural diagram of a parameter adjustment system of a neural network according to an exemplary embodiment;
Fig. 2 is a flowchart of a parameter adjustment method for a neural network provided by an embodiment of the present application;
Fig. 3 is a flowchart of a parameter adjustment method for a neural network provided by another embodiment of the present application;
Fig. 4 is a block diagram of a bit-width adjustment unit provided by another embodiment of the present application;
Fig. 5 is a block diagram of the two-stage parameter adjustment provided by another embodiment of the present application;
Fig. 6 is a block diagram of a parameter adjustment system of a neural network provided by another embodiment of the present application;
Fig. 7 is a structural block diagram of a parameter adjustment apparatus for a neural network provided by an embodiment of the present application;
Fig. 8 is a structural block diagram of a terminal provided by an embodiment of the present application;
Fig. 9 is a structural block diagram of a server provided by another embodiment of the present application.
Specific embodiment
To make the objectives, technical solutions and advantages of the embodiments of the present application clearer, the embodiments of the present application are described in further detail below with reference to the drawings.
The terms involved in the embodiments of the present application are explained first.
Fixed-point number: a fixed-point number is data whose decimal point position is fixed. A computer usually agrees on the fixed position of the decimal point in advance, without representing the decimal point in the data itself. In the present embodiment, the decimal bit width can be used to agree on the fixed position of the decimal point in the data; the decimal bit width indicates the number of fractional digits in a fixed-point number. For example, a decimal bit width of 5 indicates that the data has 5 fractional digits, i.e., there are 5 digits after the decimal point.
Fixed-point numbers come in several formats, such as the int64, int32 and int8 formats, where the number after "int" is the number of bits occupied by a fixed-point number of that format.
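For ease of understanding, the following minimal Python sketch, an editorial illustration rather than part of the original disclosure, shows a fixed-point conversion governed by a decimal bit width; it assumes binary fractional digits, i.e., a scaling factor of 2^frac_bits, whereas the example above counts decimal digits.

```python
# Minimal sketch of fixed-point conversion with a given decimal bit width.
# Assumption: the width counts binary fractional digits (scale = 2**frac_bits).

def float_to_fixed(x: float, frac_bits: int) -> int:
    """Quantize a float to an integer carrying `frac_bits` fractional bits."""
    return round(x * (1 << frac_bits))

def fixed_to_float(q: int, frac_bits: int) -> float:
    """Recover the (approximate) real value of the fixed-point integer."""
    return q / (1 << frac_bits)

q = float_to_fixed(3.14159, frac_bits=5)   # stored integer: 101
print(fixed_to_float(q, frac_bits=5))      # quantized value: 3.15625
```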
Floating-point number: a floating-point number is data whose decimal point position is not fixed. A floating-point number consists of a mantissa and an exponent, the exponent achieving the effect of a floating decimal point. For example, 123.45 can be expressed as 12345 × 10⁻², the exponent indicating that there are 2 digits after the decimal point; 12.345 can be expressed as 12345 × 10⁻³, the exponent indicating that there are 3 digits after the decimal point.
Floating-point numbers also come in several formats, such as the double-precision FP64 format, the single-precision FP32 format and the half-precision FP16 format, where the number after "FP" is the number of bits occupied by a floating-point number of that format.
It should be noted that the double-precision FP64 format includes 1 sign bit, 11 exponent bits and 52 mantissa bits; the single-precision FP32 format includes 1 sign bit, 8 exponent bits and 23 mantissa bits; and the half-precision FP16 format includes 1 sign bit, 5 exponent bits and 10 mantissa bits.
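As an editorial illustration of the 1/8/23 split of the FP32 format named above (assuming standard IEEE-754 single precision), the fields can be extracted as follows:

```python
# Extract the sign, exponent and mantissa fields of an FP32 value.
import struct

def fp32_fields(x: float):
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31                  # 1 sign bit
    exponent = (bits >> 23) & 0xFF     # 8 exponent bits (biased by 127)
    mantissa = bits & 0x7FFFFF         # 23 mantissa bits
    return sign, exponent, mantissa

print(fp32_fields(12.0))  # (0, 130, 4194304): 12.0 = +1.5 * 2**(130 - 127)
```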
Since a GPU (Graphics Processing Unit) has high precision and good parallelism, the training of a neural network is usually completed on a GPU, so that the parameters of the neural network can be trained quickly. After the training of the neural network is completed, the neural network can be applied on an inference platform. Since a neural network has a certain robustness and the data-precision requirement of an inference platform is not as high as that of the training stage, inference platforms show a trend of diversification; for example, an inference platform may be a platform on which a CPU (Central Processing Unit), an FPGA (Field Programmable Gate Array), an ASIC (Application Specific Integrated Circuit) and other kinds of platforms coexist.
To reduce the hardware resources occupied by a neural network when it is used in the inference stage and to improve its performance in the inference stage, attempts have been made to reduce the precision of the computing units in the neural network, for example by designing computing units based on the single-precision FP32 format, the half-precision FP16 format, the fixed-point int16 or int8 format, or a binary network format. In this case, the high-precision parameters in the neural network also need to be converted into low-precision parameters to suit the low-precision computing units.
At present, the common practice for converting the high-precision parameters of a neural network into low-precision parameters is to retrain the neural network for the precision required by the inference platform. Its cost, however, is a long training time and high training difficulty, which leads to high training power consumption and hence low parameter-adjustment efficiency. Moreover, considering that the training platform and the inference platform are provided by different vendors, the retraining of the neural network requires coordination between different vendors, so that the training often remains on paper.
Referring to Fig. 1, in the present embodiment, the parameters of every layer can first be pre-obtained with a bit-width adjustment parameter algorithm according to the parameters and training sample of the trained neural network; then the network layers whose parameter-adjustment results are poor are selected by a bit-width adjustment acceleration engine, and the parameters of these network layers are adjusted end-to-end based on bit-width adjustment intervals; after the adjustment, the parameters of every layer are pre-obtained again with the adjusted parameters, the training sample and the bit-width adjustment parameter algorithm, and the process stops when the parameter-adjustment results of every layer are good. Afterwards, the neural network can be used on the inference platform, i.e., the existing network data on the inference platform is fed into the neural network to obtain a data result, for example an image-recognition or image-classification result. Here, the bit-width adjustment parameter algorithm is used to determine the parameters for adjusting every layer, the bit-width adjustment acceleration engine is used to select the network layers whose parameter-adjustment results are poor, and the bit-width adjustment interval indicates the adjustment range of the decimal bit width of a layer relative to that of the previous layer, as described in detail below.
Since the parameters of each layer can be converted directly from floating-point to fixed-point numbers without retraining the neural network, the efficiency of parameter adjustment can be improved. In addition, since a neural network has many layers, it is not very sensitive to noise and precision loss, so adjusting the parameters of a single layer has little influence on the final accuracy of the neural network; the efficiency of parameter adjustment can therefore be improved while the accuracy of the neural network is guaranteed.
Since the present embodiment can reduce the power consumption of parameter adjustment for a neural network, it can be applied in power-consumption-sensitive scenarios; for example, it can be applied to a device-side apparatus such as a terminal. The terminal may be a mobile phone, a computer, a tablet computer, a wearable device, an unmanned aerial vehicle and so on, which is not limited in this embodiment. Of course, the present embodiment can also be applied to a cloud-side apparatus such as a server, which is likewise not limited. Hereinafter, device-side apparatuses and cloud-side apparatuses are collectively referred to as devices.
Referring to Fig. 2, it shows a flowchart of a parameter adjustment method for a neural network provided by an embodiment of the present application. The parameter adjustment method for a neural network comprises:
Step 201: obtain the parameters and a training sample of a trained neural network, where the training sample includes input data, and the parameters and the input data are floating-point numbers of the same precision.
The neural network is a neural network whose training has been completed by the training platform using massive training samples, so the parameters of the neural network are known data. Since processing input data with a neural network usually involves a convolution operation and a bias operation, that is, the input data and the weight parameters (weights) are first convolved to obtain a convolution result, and the bias parameters (biases) are then added to the convolution result to obtain the output, the parameters of the neural network include weight parameters and bias parameters. A training sample is a sample used to train the neural network in the training stage and includes input data.
Optionally, the device may obtain at least one training sample. In the present embodiment, the number of training samples the device obtains can be on the order of hundreds or thousands, far smaller than the order of magnitude of the massive training samples of the training stage.
It should be noted that the parameters and the input data in the present embodiment are floating-point numbers of the same precision. For example, the parameters and the input data are both FP64 floating-point numbers; or the parameters and the input data are both FP32 floating-point numbers; or the parameters and the input data are both FP16 floating-point numbers.
Assume that the neural network includes n layers; the following description takes adjusting the parameters of the i-th layer as an example, where i takes the values 1 to n and i and n are positive integers. That is, i is first set to 1 and steps 202-204 are executed; i is then updated to i+1 and steps 202-204 are executed again, and the process stops after the parameters of the n-th layer are obtained.
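The per-layer loop of steps 202-204 can be summarized with the following sketch; it is an editorial outline, and every name in it (forward_float, quantize, forward_fixed, error, and the attributes) is a hypothetical stand-in for the operations detailed in the steps below.

```python
# Outline of steps 202-204 for layers i = 1..n (helper names are hypothetical).

def adjust_parameters(network, sample, error_threshold):
    x = sample.input_data                         # layer-1 input (floating point)
    for layer in network.layers:
        first = forward_float(layer, x)           # step 202: first operation result
        frac_bits = layer.frac_bits               # decimal bit width of this layer
        x_q = quantize(x, frac_bits)              # step 203: convert input data
        w_q = quantize(layer.params, frac_bits)   # ... and layer parameters
        second = forward_fixed(x_q, w_q, frac_bits)  # second operation result
        if error(second, first) < error_threshold:   # step 204
            layer.target_params = w_q             # accept converted parameters
        x = first                                 # output feeds the next layer
```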
Step 202: for the i-th layer of the neural network, obtain the i-th layer input data according to the input data, obtain the i-th layer parameters from the parameters, and perform a predetermined operation on the i-th layer input data and the i-th layer parameters to obtain a first operation result.
When i is 1, the i-th layer input data is the input data in the training sample, and the device can obtain it directly from the training sample; when i is greater than 1, the i-th layer input data is the output data of the (i-1)-th layer, and the device can obtain that output data directly.
Since the parameters of the neural network include the parameters of every layer, the device can obtain the i-th layer parameters directly from the parameters.
In one possible implementation, the predetermined operation includes the convolution operation and the bias operation described in step 201. In this case, the device first convolves the i-th layer input data with the weight parameters in the i-th layer parameters, and then adds the bias parameters in the i-th layer parameters to the convolution result to obtain the first operation result.
It should be noted that since the i-th layer input data and the i-th layer parameters are floating-point numbers, the first operation result is also a floating-point number.
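A small sketch of this floating-point reference computation is given below; the use of a 2D "valid" correlation as the convolution is an editorial assumption for illustration only.

```python
# Floating-point reference: convolution followed by bias (illustrative shapes).
import numpy as np
from scipy.signal import correlate2d

def first_operation_result(x, weight, bias):
    conv = correlate2d(x, weight, mode="valid")  # convolution result
    return conv + bias                           # add bias -> first operation result

x = np.random.rand(8, 8).astype(np.float32)     # i-th layer input data
w = np.random.rand(3, 3).astype(np.float32)     # weight parameters
print(first_operation_result(x, w, bias=0.1).shape)  # (6, 6)
```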
Step 203: convert the i-th layer input data and the i-th layer parameters into fixed-point numbers that satisfy the decimal bit width of the i-th layer, and perform the predetermined operation on the converted i-th layer input data and i-th layer parameters to obtain a second operation result. The decimal bit width indicates the number of fractional digits in a fixed-point number.
In the present embodiment, the device can determine the decimal bit width of the i-th layer in various ways; three of these implementations are described below as examples.
In the first implementation, the device presets a default decimal bit width and uses the default decimal bit width as the decimal bit width of the i-th layer. The default decimal bit width may be an empirical value.
In the second implementation, the device determines the maximum and minimum values in the i-th layer input data and determines the decimal bit width of the i-th layer from the maximum and minimum values, as sketched after this list.
In the third implementation, the device determines the data distribution of the i-th layer input data and determines the decimal bit width of the i-th layer from the data interval with the higher distribution density.
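The second implementation can be sketched as follows; the rule used here (reserve enough integer bits for the largest magnitude, give the remaining bits of a signed 16-bit word to the fraction) is one plausible reading, not a formula stated in the patent.

```python
# Derive a decimal bit width from the max/min of the layer input (one reading).
import math

def decimal_bit_width_from_range(x_min, x_max, total_bits=16):
    magnitude = max(abs(x_min), abs(x_max), 1e-12)
    int_bits = max(0, math.floor(math.log2(magnitude)) + 1)  # integer digits needed
    return total_bits - 1 - int_bits                          # 1 bit for the sign

print(decimal_bit_width_from_range(-3.7, 5.2))  # 12 fractional bits for int16
```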
After the decimal bit width of the i-th layer has been determined, the device can convert the i-th layer input data and the i-th layer parameters separately, so that the converted i-th layer input data and i-th layer parameters are fixed-point numbers, the number of fractional digits in the converted i-th layer input data equals the number indicated by the decimal bit width, and the number of fractional digits in the converted i-th layer parameters equals the number indicated by the decimal bit width.
It should be noted that the i-th layer input data and the i-th layer parameters can be converted by the device; alternatively, since the i-th layer parameters can be imported into the device from outside, the i-th layer parameters can optionally also be converted before being imported, so that the i-th layer parameters received by the device are already the converted fixed-point numbers, in which case the device only needs to convert the i-th layer input data.
After the i-th layer input data and the i-th layer parameters have been converted, the device can compute the second operation result using the predetermined operation; the computation is described in step 202 and is not repeated here.
Step 204: when the error between the second operation result and the first operation result is less than the error threshold, determine the converted i-th layer parameters as the target parameters of the i-th layer.
The device computes the error of the second operation result relative to the first operation result and compares the error with a preset error threshold. When the error is less than the error threshold, the influence of the converted i-th layer parameters on the processing of the input data is within an acceptable error range, and the converted i-th layer parameters can be determined as the final target parameters of the i-th layer; when the error is greater than or equal to the error threshold, the i-th layer parameters are readjusted, as described in detail below. The error may be at least one of a mean and a variance, which is not limited in this embodiment.
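Since the embodiment allows the error to be at least one of a mean and a variance, the check of step 204 can be sketched as follows; treating both statistics of the elementwise difference as the error is an editorial choice.

```python
# Error check of step 204: compare the second vs. the first operation result.
import numpy as np

def error_below_threshold(second, first, threshold):
    diff = second.astype(np.float64) - first.astype(np.float64)
    return abs(diff.mean()) < threshold and diff.var() < threshold
```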
In conclusion the parameter regulation means of neural network provided by the embodiments of the present application, by floating number format
I-th layer of input data and the i-th layer parameter carry out predetermined operation and obtain the first operation result, i-th layer of input to fixed point number format
Data and the i-th layer parameter carry out predetermined operation and obtain the second operation result, when between the second operation result and the first operation result
Error when being less than error threshold, which is determined as to the target component of the i-th layer network, due to can directly will be every
The parameter of layer network is converted to fixed-point number by floating number, without re -training neural network, so, parameter tune can be improved
Whole efficiency.
In addition, the number of plies due to neural network is deep, it is not very sensitive for noise and loss of significance, so, single layer
The precision influence that the parameter adjustment of network is final on neural network is smaller, so as in the premise for the precision for guaranteeing neural network
The lower efficiency for improving parameter adjustment.
Referring to Fig. 3, it shows a flowchart of a parameter adjustment method for a neural network provided by another embodiment of the present application. The parameter adjustment method for a neural network comprises:
Step 301: obtain the parameters and a training sample of a trained neural network, where the training sample includes input data, and the parameters and the input data are floating-point numbers of the same precision.
The parameters, the training sample and the input data are explained in step 201.
Assume that the neural network includes n layers; the following description takes adjusting the parameters of the i-th layer as an example, where i takes the values 1 to n and i and n are positive integers. That is, when 1 ≤ i < n, steps 301-303 are executed; when the resulting error is less than the error threshold, step 304 is executed, i is updated to i+1 and steps 301-303 are executed again; when the resulting error is greater than or equal to the error threshold, steps 305-307 are executed, i is updated to i+1 and steps 301-303 are executed again. When i = n, steps 301-303 are executed; step 304 is executed when the resulting error is less than the error threshold, and steps 308-310 are executed when the resulting error is greater than or equal to the error threshold.
Step 302: for the i-th layer of the neural network, obtain the i-th layer input data according to the input data, obtain the i-th layer parameters from the parameters, and perform a predetermined operation on the i-th layer input data and the i-th layer parameters to obtain a first operation result.
The process of obtaining the i-th layer input data, the i-th layer parameters and the first operation result is described in step 202.
Step 303: convert the i-th layer input data and the i-th layer parameters into fixed-point numbers that satisfy the decimal bit width of the i-th layer, and perform the predetermined operation on the converted i-th layer input data and i-th layer parameters to obtain a second operation result.
After the decimal bit width of the i-th layer has been determined, the device can convert the i-th layer input data and the i-th layer parameters separately, so that the converted i-th layer input data and i-th layer parameters are fixed-point numbers whose number of fractional digits equals the number indicated by the decimal bit width.
It should be noted that the i-th layer input data and the i-th layer parameters can be converted by the device; alternatively, since the i-th layer parameters can be imported into the device from outside, the i-th layer parameters can optionally also be converted before being imported, so that the i-th layer parameters received by the device are already the converted fixed-point numbers, in which case the device only needs to convert the i-th layer input data.
When converting the i-th layer input data, if i = 1, the bit-width change k of the decimal bit width of the layer-1 input data relative to the decimal bit width of layer 1 can be computed, and k cascaded shift units are selected from the bit-width adjustment unit, each shift unit being used to adjust the data by one decimal bit; the layer-1 input data is then fed through the k cascaded shift units to obtain the converted layer-1 input data. Here, k is a positive integer. Optionally, the bit-width adjustment unit may also include a selector for selecting the k cascaded shift units.
For example, when k is 3, the device selects 3 cascaded shift units from the bit-width adjustment unit. After the layer-1 input data is fed into the 1st shift unit, the 1st shift unit adjusts the layer-1 input data by one decimal bit and outputs the adjusted data to the 2nd shift unit; the 2nd shift unit adjusts the data by one decimal bit and outputs the adjusted data to the 3rd shift unit; the 3rd shift unit adjusts the data by one decimal bit, yielding the converted layer-1 input data.
In general, the decimal bit width does not need to be adjusted over a large range when a floating-point number is converted into a fixed-point number, so there is no need to provide as many shift units as the fixed-point number has bits; providing only a small number of shift units is enough for neural network applications, so that the conversion from floating-point to fixed-point numbers can be completed at low cost on the underlying acceleration platform. For example, taking the int16 fixed-point format as an example, there is no need to provide 16 shift units; providing only 4 shift units is enough for neural network applications, so that the bit-width adjustment unit occupies 1/4 of the original amount, see Fig. 4.
It should be noted that a bit-width adjustment may need to increase the decimal bit width or to decrease it, so the bit-width adjustment unit may include both shift units for increasing the decimal bit width (also called left-shift units) and shift units for decreasing the decimal bit width (also called right-shift units). When selecting k cascaded shift units from the bit-width adjustment unit, the adjustment direction of the decimal bit width must then be determined first, and the k cascaded shift units in that adjustment direction are selected.
When i > 1, that is, when the i-th layer input data is the output data of the (i-1)-th layer, the bit-width change k of the decimal bit width of the i-th layer relative to the decimal bit width of the (i-1)-th layer can be obtained; k cascaded shift units are selected from the bit-width adjustment unit, each shift unit being used to adjust the data by one decimal bit; and the i-th layer input data is fed through the k cascaded shift units to obtain the converted i-th layer input data. For example, if the decimal bit width of the (i-1)-th layer input data is 7 and the decimal bit width of the i-th layer is 5, the bit-width change is a decrease of 2 decimal bits, i.e., the decimal point moves right by 2 positions, and the 2 cascaded right-shift units can be selected.
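In software terms the cascaded shift units amount to one-bit arithmetic shifts of the stored integer, as the following editorial sketch shows:

```python
# Re-express a fixed-point integer under a new decimal bit width by k one-bit
# shifts, one per cascaded shift unit.

def adjust_decimal_bit_width(q: int, from_bits: int, to_bits: int) -> int:
    k = to_bits - from_bits
    for _ in range(abs(k)):              # one shift unit per decimal bit
        q = (q << 1) if k > 0 else (q >> 1)
    return q

# Layer (i-1) output carries 7 fractional bits; layer i expects 5 -> 2 right shifts.
print(bin(adjust_decimal_bit_width(0b1011010, from_bits=7, to_bits=5)))  # 0b10110
```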
After the i-th layer input data and the i-th layer parameters have been converted, the device can compute the second operation result using the predetermined operation. Since the i-th layer input data and the i-th layer parameters are both fixed-point numbers, and the decimal bit width of a fixed-point number expands after a convolution operation, the device also needs to adjust the convolution result. In this case, performing the predetermined operation on the converted i-th layer input data and i-th layer parameters to obtain the second operation result comprises: performing the convolution operation on the converted i-th layer input data and i-th layer parameters to obtain a convolution result, the decimal bit width of the convolution result being larger than the decimal bit width of the i-th layer; adjusting the convolution result into intermediate data that satisfies the decimal bit width of the i-th layer; and performing the bias operation on the intermediate data to obtain the second operation result.
For example, if the decimal bit widths of the i-th layer input data and the i-th layer parameters are both 2, the decimal bit width of the convolution result is 4; the device then needs to truncate the last two fractional digits of the convolution result, so that the resulting intermediate data has a decimal bit width of 2.
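The following sketch illustrates why the truncation is needed: a fixed-point product carries twice the fractional bits, so the convolution result is shifted back to the layer's decimal bit width before the bias is added. A 1D dot product stands in for the convolution here, as an editorial simplification.

```python
# Fixed-point second operation result with truncation of the extra fraction bits.

def second_operation_result(x_q, w_q, bias_q, frac_bits):
    acc = sum(a * b for a, b in zip(x_q, w_q))  # carries 2*frac_bits fraction bits
    intermediate = acc >> frac_bits             # truncate back to frac_bits
    return intermediate + bias_q                # bias already at frac_bits

# frac_bits = 2: 1.25 -> 5 and 0.75 -> 3; the product 15 has 4 fraction bits.
print(second_operation_result([5], [3], bias_q=4, frac_bits=2))  # 7, i.e. 1.75
```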
Step 304: when the error between the second operation result and the first operation result is less than the error threshold, determine the converted i-th layer parameters as the target parameters of the i-th layer.
The device computes the error of the second operation result relative to the first operation result and compares the error with a preset error threshold. When the error is less than the error threshold, the influence of the converted i-th layer parameters on the processing of the input data is within an acceptable error range, and the converted i-th layer parameters can be determined as the final target parameters of the i-th layer; when the error is greater than or equal to the error threshold, the i-th layer parameters are readjusted, as described in detail below. The error may be at least one of a mean and a variance, which is not limited in this embodiment.
It should be noted that the device can store the target parameters of the i-th layer directly, in which case the target parameters of every layer are stored in the device. Alternatively and optionally, the device can compute the bit-width change of the target parameters of the i-th layer relative to the target parameters of the (i-1)-th layer and store that bit-width change, in which case the device stores the decimal bit width of layer 1 and the bit-width change of the decimal bit width between every two adjacent layers. See Table 1, where the bit-width adjustment interval of the i-th layer is denoted delta_i and the decimal bit width of the i-th layer parameters is denoted w_i.
Table 1

| Layer | Bit-width adjustment interval | Decimal bit width of parameters |
| 1 | delta_1 | w_1 |
| 2 | delta_2 | w_2 |
| … | … | … |
| n | delta_n | w_n |
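The second storage scheme can be sketched as follows; the values and the reconstruction rule are an editorial reading of Table 1, for illustration only.

```python
# Store the layer-1 decimal bit width plus per-layer changes; reconstruct on demand.

layer1_frac_bits = 7
bit_width_changes = [-2, 0, 1]   # changes for layers 2..4 (illustrative values)

def frac_bits_of_layer(i: int) -> int:
    """Decimal bit width of layer i (1-indexed)."""
    return layer1_frac_bits + sum(bit_width_changes[: i - 1])

print(frac_bits_of_layer(3))  # 7 + (-2) + 0 = 5
```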
Step 305: when the i-th layer is not the last layer of the neural network and the error between the second operation result and the first operation result is greater than or equal to the error threshold, increase the decimal bit width of the i-th layer.
When the error is greater than or equal to the error threshold, the influence of the converted i-th layer parameters on the processing of the input data is not within an acceptable error range: the precision of the fixed-point numbers is too low, and the decimal bit width can be increased to improve the precision of the fixed-point numbers.
When increasing the decimal bit width, the device can add a preset value, such as 1, 2 or 3, to the original decimal bit width, which is not limited in this embodiment.
Step 306: convert the i-th layer input data and the i-th layer parameters again into fixed-point numbers that satisfy the increased decimal bit width of the i-th layer, and perform the predetermined operation on the reconverted i-th layer input data and i-th layer parameters to obtain a second operation result.
The device can update the original decimal bit width of the i-th layer with the decimal bit width obtained in step 305 and execute step 303 again to compute the second operation result; the computation is described in step 303.
Step 307: when the error between the newly obtained second operation result and the first operation result is less than the error threshold, determine the reconverted i-th layer parameters as the target parameters of the i-th layer.
When the error between the newly obtained second operation result and the first operation result is less than the error threshold, the reconverted i-th layer parameters are determined as the target parameters of the i-th layer; when the error between the newly obtained second operation result and the first operation result is greater than or equal to the error threshold, steps 305-306 are executed in a loop, stopping when the error between the newly obtained second operation result and the first operation result is less than the error threshold.
Step 308: if the error of the last layer of the neural network is greater than or equal to the error threshold, select the m numerically largest errors from all the errors.
The device can sort the errors produced by all the layers and select the m numerically largest errors from them, where m is a positive integer. Here m may be an empirical value, or a value computed according to a predetermined algorithm, which is not limited in the present embodiment.
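Step 308 reduces to selecting the m largest of the per-layer errors, for example as follows; using the standard-library heapq here is an editorial choice, and the values are illustrative.

```python
# Select the m numerically largest layer errors (editorial sketch).
import heapq

layer_errors = {1: 0.02, 2: 0.31, 3: 0.08, 4: 0.27}   # illustrative values
m = 2
worst = heapq.nlargest(m, layer_errors.items(), key=lambda kv: kv[1])
print(worst)  # [(2, 0.31), (4, 0.27)]: layers whose parameters are readjusted
```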
Step 309: for each of the m errors, determine the bit-width adjustment interval of the j-th layer that produced the error, and increase the decimal bit width of the j-th layer according to the bit-width adjustment interval and the decimal bit width of the (j-1)-th layer.
A bit-width adjustment interval includes at least one bit-width adjustment datum, and a bit-width adjustment datum indicates the bit-width change of a layer's decimal bit width relative to that of the previous layer. For example, if the decimal bit width of the (j-1)-th layer is 7 and the bit-width adjustment interval of the j-th layer is [-2, 2], the range of the decimal bit width of the j-th layer is [5, 9]. Here j ≥ 2.
Optionally, increasing the decimal bit width of the j-th layer according to the bit-width adjustment interval and the decimal bit width of the (j-1)-th layer comprises: selecting one bit-width adjustment datum from the bit-width adjustment interval; and adding the bit-width adjustment datum to, or subtracting it from, the decimal bit width of the (j-1)-th layer to obtain the decimal bit width of the j-th layer, the updated decimal bit width of the j-th layer being larger than the decimal bit width of the j-th layer before the update.
The selected bit-width adjustment datum must satisfy the following condition: the updated decimal bit width of the j-th layer is larger than the decimal bit width of the j-th layer before the update. For example, if the decimal bit width of the (j-1)-th layer is 7 and the datum selected before the update was 1, the decimal bit width of the j-th layer before the update is 6; the selectable bit-width adjustment data then lie in [-2, 0], so that the value interval of the updated decimal bit width of the j-th layer is [7, 9], larger than the decimal bit width 6 of the j-th layer before the update.
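Under the reading of the example above (the new width is obtained by subtracting the selected datum from the (j-1)-th layer's width, and only data that strictly enlarge the current width are admissible), step 309 can be sketched as:

```python
# Admissible bit-width adjustment data for step 309 (editorial reading).

def admissible_data(prev_layer_bits, current_bits, interval):
    """Data from the interval whose resulting width exceeds the current width."""
    return [d for d in interval if prev_layer_bits - d > current_bits]

# (j-1)-th layer width 7, current j-th layer width 6, interval [-2, 2]:
print(admissible_data(7, 6, range(-2, 3)))  # [-2, -1, 0] -> new widths 9, 8, 7
```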
In the present embodiment, obtaining the decimal bit widths in step 303 can be called the parameter pre-obtaining step, i.e., the bit-width adjustment parameter algorithm described in the implementation environment above obtains the parameters of every layer; the adjustment of the parameters of the network layers corresponding to the above m errors can be called the end-to-end adjustment, i.e., the bit-width adjustment acceleration engine described in the implementation environment above selects the network layers whose parameter-adjustment results are poor. Referring to Fig. 5, it shows a block diagram of the parameter adjustment.
Step 310: recompute the target parameters of every layer according to the input data, the parameters and the decimal bit width of every layer of the neural network.
After the parameters of the network layer corresponding to each of the m errors have been updated, the device executes the method provided by the present embodiment again to recompute the target parameters of every layer.
The first point to note is that when convolving floating-point numbers, different convolution units need to be designed for floating-point numbers of different formats, whereas when convolving fixed-point numbers, since the computer does not represent the fractional digits explicitly, the convolution can be performed by a single convolution unit no matter what the decimal bit width of the fixed-point numbers is, the fractional digits being determined again after the convolution result is obtained. The convolutions of all network layers can thus be performed by cycling through a single convolution unit, achieving the effect of a normalized computing unit.
The second point to note is that the device may include two bit-width adjustment units, where bit-width adjustment unit 1 is used to convert the input data into fixed-point numbers according to the decimal bit width, and bit-width adjustment unit 2 is used to adjust the convolution result according to the decimal bit width of the parameters, so that the decimal bit width of the adjusted convolution result equals the decimal bit width of the parameters.
In the present embodiment, because the computing unit is normalized and the number of bit-width adjustment units is small, the consumption of underlying hardware resources can be reduced to the greatest extent. In addition, the core of the present embodiment consists of two cooperating parts, the bit-width adjustment parameter algorithm and the bit-width adjustment acceleration engine. That is, the bit-width adjustment parameter algorithm first performs parameter pre-obtaining and end-to-end parameter adjustment based on some of the training samples and the parameters, and fast parameter adjustment is performed within the bit-width adjustment intervals supported by the bit-width adjustment acceleration engine. Compared with the scheme of retraining the neural network to adjust the parameters, the computational complexity is greatly reduced, and the parameter adjustment can be completed within a few minutes on an ordinary computer.
Referring to Fig. 6, it shows a block diagram of the parameter adjustment system of the neural network, where the input cache unit is used to cache the input data, the output cache unit is used to cache the output data, the parameter cache unit is used to cache the weight parameters and the bias parameters, and the bit-width adjustment acceleration engine is used to obtain the parameters of every layer according to the lightweight bit-width adjustment parameter algorithm and then drive the bit-width adjustment units to perform the parameter adjustment. Optionally, the parameter adjustment system of the neural network may also include other computing units such as a pooling unit, which is not limited in the present embodiment.
In conclusion the parameter regulation means of neural network provided by the embodiments of the present application, by floating number format
I-th layer of input data and the i-th layer parameter carry out predetermined operation and obtain the first operation result, i-th layer of input to fixed point number format
Data and the i-th layer parameter carry out predetermined operation and obtain the second operation result, when between the second operation result and the first operation result
Error when being less than error threshold, which is determined as to the target component of the i-th layer network, due to can directly will be every
The parameter of layer network is converted to fixed-point number by floating number, without re -training neural network, so, parameter tune can be improved
Whole efficiency.
In addition, the number of plies due to neural network is deep, it is not very sensitive for noise and loss of significance, so, single layer
The precision influence that the parameter adjustment of network is final on neural network is smaller, so as in the premise for the precision for guaranteeing neural network
The lower efficiency for improving parameter adjustment.
In the present embodiment, because the computing unit is normalized and the number of bit-width adjustment units is small, the consumption of underlying hardware resources can be reduced to the greatest extent.
When a floating-point number is converted into a fixed-point number, the decimal bit width usually does not need to be adjusted over a large range, so there is no need to provide as many shift units as the fixed-point number has bits; providing only a small number of shift units is enough for neural network applications, so that the conversion from floating-point to fixed-point numbers can be completed at low cost on the underlying acceleration platform.
Referring to Fig. 7, it shows a structural block diagram of a parameter adjustment apparatus for a neural network provided by an embodiment of the present application. The parameter adjustment apparatus for a neural network comprises:
an obtaining module 701, configured to obtain the parameters and a training sample of a trained neural network, where the training sample includes input data, and the parameters and the input data are floating-point numbers of the same precision;
an operation module 702, configured to, for the i-th layer of the neural network, obtain the i-th layer input data according to the input data obtained by the obtaining module 701, obtain the i-th layer parameters from the parameters, and perform a predetermined operation on the i-th layer input data and the i-th layer parameters to obtain a first operation result, where i is a positive integer;
the operation module 702 being further configured to convert the i-th layer input data and the i-th layer parameters into fixed-point numbers that satisfy the decimal bit width of the i-th layer, and to perform the predetermined operation on the converted i-th layer input data and i-th layer parameters to obtain a second operation result, where the decimal bit width indicates the number of fractional digits in a fixed-point number;
a determining module 703, configured to, when the error between the second operation result and the first operation result obtained by the operation module 702 is less than the error threshold, determine the converted i-th layer parameters as the target parameters of the i-th layer.
In one possible implementation, the apparatus further comprises:
a selecting module, configured to, when the error of the last layer of the neural network is greater than or equal to the error threshold, select the m numerically largest errors from all the errors, where m is a positive integer;
a first adjustment module, configured to, for each of the m errors, determine the bit-width adjustment interval of the j-th layer that produced the error selected by the selecting module, and increase the decimal bit width of the j-th layer according to the bit-width adjustment interval and the decimal bit width of the (j-1)-th layer, where j ≥ 2;
the operation module, configured to recompute the target parameters of every layer according to the input data, the parameters and the decimal bit width of every layer of the neural network.
In one possible implementation, the first adjustment module is further configured to:
select one bit-width adjustment datum from the bit-width adjustment interval;
add the bit-width adjustment datum to, or subtract it from, the decimal bit width of the (j-1)-th layer to obtain the decimal bit width of the j-th layer, the updated decimal bit width of the j-th layer being larger than the decimal bit width of the j-th layer before the update.
In one possible implementation, when the i-th layer is not the last layer of the neural network, the apparatus further comprises:
a second adjustment module, configured to increase the decimal bit width of the i-th layer when the error between the second operation result and the first operation result is greater than or equal to the error threshold;
the operation module 702 being further configured to convert the i-th layer input data and the i-th layer parameters again into fixed-point numbers that satisfy the increased decimal bit width of the i-th layer, and to perform the predetermined operation on the reconverted i-th layer input data and i-th layer parameters to obtain a second operation result;
the determining module 703 being further configured to determine the reconverted i-th layer parameters as the target parameters of the i-th layer when the error between the newly obtained second operation result and the first operation result is less than the error threshold.
In one possible implementation, when the predetermined operation includes a convolution operation and a bias operation, the operation module 702 is further configured to:
perform the convolution operation on the converted i-th layer input data and i-th layer parameters to obtain a convolution result, the decimal bit width of the convolution result being larger than the decimal bit width of the i-th layer;
adjust the convolution result into intermediate data that satisfies the decimal bit width of the i-th layer;
perform the bias operation on the intermediate data to obtain the second operation result.
In one possible implementation, when the i-th layer input data is the output data of the (i-1)-th layer, the operation module 702 is further configured to:
obtain the bit-width change k of the decimal bit width of the i-th layer relative to the decimal bit width of the (i-1)-th layer, where k is a positive integer;
select k cascaded shift units from the bit-width adjustment unit, each shift unit being used to adjust the data by one decimal bit;
feed the i-th layer input data through the k cascaded shift units to obtain the converted i-th layer input data.
In conclusion the parameter adjustment controls of neural network provided by the embodiments of the present application, by floating number format
I-th layer of input data and the i-th layer parameter carry out predetermined operation and obtain the first operation result, i-th layer of input to fixed point number format
Data and the i-th layer parameter carry out predetermined operation and obtain the second operation result, when between the second operation result and the first operation result
Error when being less than error threshold, which is determined as to the target component of the i-th layer network, due to can directly will be every
The parameter of layer network is converted to fixed-point number by floating number, without re -training neural network, so, parameter tune can be improved
Whole efficiency.
In addition, the number of plies due to neural network is deep, it is not very sensitive for noise and loss of significance, so, single layer
The precision influence that the parameter adjustment of network is final on neural network is smaller, so as in the premise for the precision for guaranteeing neural network
The lower efficiency for improving parameter adjustment.
It, can be maximum by having normalized computing unit, and the negligible amounts of bit wide adjustment unit in this present embodiment
Reduce to degree the consumption of bottom hardware resource.
When floating number is converted to fixed-point number, will not usually adjust decimal bit wide on a large scale, so, do not need setting with
The equal quantity shift unit of the digit of fixed-point number, and neural network can be met by only needing to be arranged a small amount of shift unit
Using so as to complete the conversion that floating-point counts to fixed-point number using a small amount of cost on bottom acceleration platform.
Fig. 8 shows a structural block diagram of a terminal 800 provided by an exemplary embodiment of the present application. The terminal 800 may be a portable mobile terminal, such as a smartphone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a laptop or a desktop computer. The terminal 800 may also be called user equipment, a portable terminal, a laptop terminal, a desktop terminal or by other names.
In general, the terminal 800 includes a processor 801 and a memory 802.
The processor 801 may include one or more processing cores, for example a 4-core processor or an 8-core processor. The processor 801 may be implemented in at least one hardware form of a DSP (Digital Signal Processor), an FPGA (Field-Programmable Gate Array) and a PLA (Programmable Logic Array). The processor 801 may also include a main processor and a coprocessor: the main processor is a processor for processing data in the awake state, also called a CPU (Central Processing Unit); the coprocessor is a low-power processor for processing data in the standby state. In some embodiments, the processor 801 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 801 may also include an AI (Artificial Intelligence) processor for handling computing operations related to machine learning.
The memory 802 may include one or more computer-readable storage media, which may be non-transient. The memory 802 may also include high-speed random access memory and non-volatile memory, such as one or more disk storage devices or flash storage devices. In some embodiments, the non-transient computer-readable storage medium in the memory 802 is used to store at least one instruction, which is executed by the processor 801 to implement the parameter adjustment method for a neural network provided by the method embodiments of the present application.
In some embodiments, the terminal 800 optionally also includes a peripheral interface 803 and at least one peripheral. The processor 801, the memory 802 and the peripheral interface 803 may be connected by buses or signal lines. Each peripheral may be connected to the peripheral interface 803 by a bus, a signal line or a circuit board. Specifically, the peripherals include at least one of a radio-frequency circuit 804, a touch display screen 805, a camera 806, an audio circuit 807, a positioning component 808 and a power supply 809.
The peripheral interface 803 may be used to connect at least one I/O (Input/Output) related peripheral to the processor 801 and the memory 802. In some embodiments, the processor 801, the memory 802 and the peripheral interface 803 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 801, the memory 802 and the peripheral interface 803 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
The radio-frequency circuit 804 is used to receive and transmit RF (Radio Frequency) signals, also called electromagnetic signals. The radio-frequency circuit 804 communicates with communication networks and other communication devices through electromagnetic signals. The radio-frequency circuit 804 converts electrical signals into electromagnetic signals for transmission, or converts received electromagnetic signals into electrical signals. Optionally, the radio-frequency circuit 804 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card and the like. The radio-frequency circuit 804 can communicate with other terminals through at least one wireless communication protocol, including but not limited to metropolitan area networks, mobile communication networks of every generation (2G, 3G, 4G and 5G), wireless local area networks and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio-frequency circuit 804 may also include circuits related to NFC (Near Field Communication), which is not limited in the present application.
The display screen 805 is configured to display a UI (User Interface), which may include graphics, text, icons, videos, and any combination thereof. When the display screen 805 is a touch display screen, the display screen 805 also has the ability to acquire touch signals on or above the surface of the display screen 805. The touch signal may be input to the processor 801 as a control signal for processing. In this case, the display screen 805 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 805, arranged on the front panel of the terminal 800; in other embodiments, there may be at least two display screens 805, respectively arranged on different surfaces of the terminal 800 or in a folded design; in still other embodiments, the display screen 805 may be a flexible display screen arranged on a curved surface or a folded surface of the terminal 800. The display screen 805 may even be arranged in a non-rectangular irregular shape, namely a special-shaped screen. The display screen 805 may be made of materials such as an LCD (Liquid Crystal Display) or an OLED (Organic Light-Emitting Diode).
The camera assembly 806 is configured to capture images or videos. Optionally, the camera assembly 806 includes a front camera and a rear camera. Generally, the front camera is arranged on the front panel of the terminal, and the rear camera is arranged on the back of the terminal. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so as to realize background blurring through fusion of the main camera and the depth-of-field camera, panoramic shooting and VR (Virtual Reality) shooting through fusion of the main camera and the wide-angle camera, or other fused shooting functions. In some embodiments, the camera assembly 806 may further include a flash. The flash may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash refers to a combination of a warm-light flash and a cold-light flash, and can be used for light compensation under different color temperatures.
The audio circuit 807 may include a microphone and a speaker. The microphone is configured to collect sound waves from the user and the environment, and convert the sound waves into electric signals that are input to the processor 801 for processing, or input to the radio frequency circuit 804 to realize voice communication. For stereo collection or noise reduction purposes, there may be multiple microphones, respectively arranged at different parts of the terminal 800. The microphone may also be an array microphone or an omnidirectional collection microphone. The speaker is configured to convert electric signals from the processor 801 or the radio frequency circuit 804 into sound waves. The speaker may be a conventional membrane speaker or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, it can convert electric signals not only into sound waves audible to humans but also into sound waves inaudible to humans for purposes such as ranging. In some embodiments, the audio circuit 807 may further include a headphone jack.
The positioning component 808 is configured to locate the current geographic position of the terminal 800 to implement navigation or LBS (Location Based Service). The positioning component 808 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
The power supply 809 is configured to supply power to the various components in the terminal 800. The power supply 809 may be an alternating current, a direct current, a disposable battery, or a rechargeable battery. When the power supply 809 includes a rechargeable battery, the rechargeable battery may support wired charging or wireless charging. The rechargeable battery may also support fast charging technology.
In some embodiments, the terminal 800 further includes one or more sensors 810, including but not limited to an acceleration sensor 811, a gyroscope sensor 812, a pressure sensor 813, a fingerprint sensor 814, an optical sensor 815, and a proximity sensor 816.
The acceleration sensor 811 can detect the magnitude of acceleration on three coordinate axes of a coordinate system established with the terminal 800. For example, the acceleration sensor 811 may be used to detect the components of gravitational acceleration on the three coordinate axes. The processor 801 may control the touch display screen 805 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 811. The acceleration sensor 811 may also be used to collect motion data of games or of the user.
The gyroscope sensor 812 can detect the body direction and rotation angle of the terminal 800, and may cooperate with the acceleration sensor 811 to collect the user's 3D actions on the terminal 800. Based on the data collected by the gyroscope sensor 812, the processor 801 can implement functions such as motion sensing (for example, changing the UI according to the user's tilt operation), image stabilization during shooting, game control, and inertial navigation.
The pressure sensor 813 may be arranged on a side frame of the terminal 800 and/or on a lower layer of the touch display screen 805. When the pressure sensor 813 is arranged on the side frame of the terminal 800, the user's grip signal on the terminal 800 can be detected, and the processor 801 performs left/right-hand recognition or shortcut operations according to the grip signal collected by the pressure sensor 813. When the pressure sensor 813 is arranged on the lower layer of the touch display screen 805, the processor 801 controls the operable controls on the UI according to the user's pressure operations on the touch display screen 805. The operable controls include at least one of a button control, a scroll bar control, an icon control, and a menu control.
The fingerprint sensor 814 is configured to collect the user's fingerprint, and the processor 801 identifies the user's identity according to the fingerprint collected by the fingerprint sensor 814, or the fingerprint sensor 814 identifies the user's identity according to the collected fingerprint. When the user's identity is identified as a trusted identity, the processor 801 authorizes the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, changing settings, and the like. The fingerprint sensor 814 may be arranged on the front, back, or side of the terminal 800. When a physical button or a manufacturer logo is provided on the terminal 800, the fingerprint sensor 814 may be integrated with the physical button or the manufacturer logo.
The optical sensor 815 is configured to collect ambient light intensity. In one embodiment, the processor 801 may control the display brightness of the touch display screen 805 according to the ambient light intensity collected by the optical sensor 815: when the ambient light intensity is high, the display brightness of the touch display screen 805 is increased; when the ambient light intensity is low, the display brightness of the touch display screen 805 is decreased. In another embodiment, the processor 801 may also dynamically adjust the shooting parameters of the camera assembly 806 according to the ambient light intensity collected by the optical sensor 815.
The proximity sensor 816, also referred to as a distance sensor, is generally arranged on the front panel of the terminal 800. The proximity sensor 816 is configured to collect the distance between the user and the front of the terminal 800. In one embodiment, when the proximity sensor 816 detects that the distance between the user and the front of the terminal 800 gradually decreases, the processor 801 controls the touch display screen 805 to switch from the screen-on state to the screen-off state; when the proximity sensor 816 detects that the distance between the user and the front of the terminal 800 gradually increases, the processor 801 controls the touch display screen 805 to switch from the screen-off state to the screen-on state.
Those skilled in the art will understand that the structure shown in Fig. 8 does not constitute a limitation on the terminal 800, which may include more or fewer components than illustrated, combine certain components, or adopt a different component arrangement.
The present application also provides a server, which includes a processor and a memory. At least one instruction is stored in the memory, and the at least one instruction is loaded and executed by the processor to implement the parameter adjustment method for a neural network provided by each of the above method embodiments. It should be noted that the server may be the server provided in Fig. 9 below.
Referring to Fig. 9, it shows a structural schematic diagram of the server provided by an exemplary embodiment of the present application. Specifically, the server 900 includes a central processing unit (CPU) 901, a system memory 904 including a random access memory (RAM) 902 and a read-only memory (ROM) 903, and a system bus 905 connecting the system memory 904 and the central processing unit 901. The server 900 further includes a basic input/output system (I/O system) 906 that helps transmit information between devices in the computer, and a mass storage device 907 for storing an operating system 913, application programs 914, and other program modules 915.
The basic input/output 906 includes display 908 for showing information and inputs letter for user
The input equipment 909 of such as mouse, keyboard etc of breath.Wherein the display 908 and input equipment 909 are all by being connected to
The input and output controller 910 of system bus 905 is connected to central processing unit 901.The basic input/output 906
Can also include input and output controller 910 with for receive and handle from keyboard, mouse or electronic touch pen etc. it is multiple its
The input of his equipment.Similarly, input and output controller 910 also provides output to display screen, printer or other kinds of defeated
Equipment out.
The mass storage device 907 is connected to the central processing unit 901 through a mass storage controller (not shown) connected to the system bus 905. The mass storage device 907 and its associated computer-readable storage medium provide non-volatile storage for the server 900. That is, the mass storage device 907 may include a computer-readable storage medium (not shown) such as a hard disk or a CD-ROM drive.
Without loss of generality, the computer-readable storage medium may include a computer storage medium and a communication medium. The computer storage medium includes volatile and non-volatile, removable and non-removable media implemented by any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data. The computer storage medium includes RAM, ROM, EPROM, EEPROM, flash memory or other solid-state storage technologies, CD-ROM, DVD or other optical storage, magnetic tape cassettes, magnetic tape, disk storage, or other magnetic storage devices. Of course, those skilled in the art will appreciate that the computer storage medium is not limited to the above. The above system memory 904 and mass storage device 907 may be collectively referred to as memory.
The memory stores one or more programs configured to be executed by one or more central processing units 901. The one or more programs contain instructions for implementing the above parameter adjustment method for a neural network, and the central processing unit 901 executes the one or more programs to implement the parameter adjustment method for a neural network provided by each of the above method embodiments.
According to various embodiments of the present invention, the server 900 may also operate through a remote computer connected to a network such as the Internet. That is, the server 900 may be connected to a network 912 through a network interface unit 911 connected to the system bus 905; in other words, the network interface unit 911 may also be used to connect to another type of network or remote computer system (not shown).
The memory further includes one or more programs stored in the memory, and the one or more programs include the steps performed by the server in the parameter adjustment method for a neural network provided by the embodiments of the present invention.
The embodiment of the present application also provides a computer-readable storage medium storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the parameter adjustment method for a neural network as described above.
The present application also provides a computer program product which, when run on a computer, causes the computer to execute the parameter adjustment method for a neural network provided by each of the above method embodiments.
An embodiment of the present application provides a computer-readable storage medium storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the parameter adjustment method for a neural network as described above.
An embodiment of the present application provides a parameter adjustment device for a neural network. The parameter adjustment device includes a processor and a memory; at least one instruction is stored in the memory, and the instruction is loaded and executed by the processor to implement the parameter adjustment method for a neural network as described above.
It should be understood that when the parameter adjustment apparatus for a neural network provided by the above embodiments performs parameter adjustment, the division of the above functional modules is merely used as an example for illustration. In practical applications, the above functions may be allocated to different functional modules as needed; that is, the internal structure of the parameter adjustment apparatus is divided into different functional modules to complete all or part of the functions described above. In addition, the parameter adjustment apparatus provided by the above embodiments and the embodiments of the parameter adjustment method for a neural network belong to the same concept; the specific implementation process is detailed in the method embodiments and is not repeated here.
Those of ordinary skill in the art will appreciate that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
The above is not intended to limit the embodiments of the present application. Any modification, equivalent replacement, improvement, and the like made within the spirit and principle of the embodiments of the present application shall be included within the protection scope of the embodiments of the present application.
Claims (13)
1. A parameter adjustment method for a neural network, characterized in that the method comprises:
obtaining parameters and training samples of a trained neural network, wherein the training samples comprise input data, and the parameters and the input data are floating-point numbers of the same precision;
for an i-th layer network in the neural network, obtaining i-th layer input data according to the input data, obtaining i-th layer parameters from the parameters, and performing a predetermined operation on the i-th layer input data and the i-th layer parameters to obtain a first operation result, where i is a positive integer;
converting the i-th layer input data and the i-th layer parameters respectively into fixed-point numbers that satisfy a decimal bit width of the i-th layer network, and performing the predetermined operation on the converted i-th layer input data and i-th layer parameters to obtain a second operation result, wherein the decimal bit width is used to indicate the number of decimal places in a fixed-point number;
when the error between the second operation result and the first operation result is less than an error threshold, determining the converted i-th layer parameters as target parameters of the i-th layer network.
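(For illustration only, not part of the claims: the following is a minimal Python sketch of the conversion-and-verification step of claim 1. All names, the 16-bit word width, and the use of a maximum absolute error are assumptions of this sketch, not details taken from the patent.)

```python
import numpy as np

def to_fixed_point(x, frac_bits, total_bits=16):
    # Quantize a float array to fixed-point with frac_bits fractional bits,
    # then return the dequantized value actually used in the error check.
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    return np.clip(np.round(x * scale), lo, hi) / scale

def accept_layer(layer_input, layer_params, op, frac_bits, err_threshold):
    # First operation result: the predetermined operation on floating-point data.
    ref = op(layer_input, layer_params)
    # Second operation result: the same operation on the fixed-point conversions.
    q_in = to_fixed_point(layer_input, frac_bits)
    q_params = to_fixed_point(layer_params, frac_bits)
    approx = op(q_in, q_params)
    err = np.abs(approx - ref).max()
    # Accept the converted parameters as target parameters when the error is small.
    return (q_params if err < err_threshold else None), err
```

For a one-dimensional convolution layer, op could for example be lambda x, w: np.convolve(x, w).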
2. The method according to claim 1, characterized in that the method further comprises:
if the error of the last layer network in the neural network is greater than or equal to the error threshold, selecting the m errors with the largest values from all the errors, where m is a positive integer;
for each of the m errors, determining a bit width adjustment interval of the j-th layer network that produced the error, and increasing the decimal bit width of the j-th layer network according to the bit width adjustment interval and the decimal bit width of the (j-1)-th layer network, where j >= 2;
recalculating the target parameters of each layer network according to the input data, the parameters, and the decimal bit width of each layer network in the neural network.
3. The method according to claim 2, characterized in that the increasing the decimal bit width of the j-th layer network according to the bit width adjustment interval and the decimal bit width of the (j-1)-th layer network comprises:
selecting a piece of bit width adjustment data from the bit width adjustment interval;
adding or subtracting the decimal bit width of the (j-1)-th layer network and the bit width adjustment data to obtain the decimal bit width of the j-th layer network, wherein the updated decimal bit width of the j-th layer network is greater than the decimal bit width of the j-th layer network before the update.
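(For illustration only, not part of the claims: a hedged sketch of the widening step of claims 2 and 3. Picking the first candidate from the adjustment interval, and enforcing the strictly-wider condition with a max(), are assumptions of this sketch.)

```python
def widen_worst_layers(errors, frac_bits, adjust_intervals, m):
    # Claim 2: pick the indices of the m largest per-layer errors.
    worst = sorted(range(len(errors)), key=lambda j: errors[j], reverse=True)[:m]
    for j in worst:
        # Claim 3: take one adjustment value from layer j's interval and combine
        # it with layer j-1's decimal bit width (the addition case is shown;
        # the claim requires j >= 2, i.e. a preceding layer exists).
        delta = adjust_intervals[j][0]
        widened = frac_bits[j - 1] + delta
        # The updated width must exceed the width before the update.
        frac_bits[j] = max(widened, frac_bits[j] + 1)
    return frac_bits
```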
4. The method according to claim 1, characterized in that when the i-th layer network is not the last layer network in the neural network, the method further comprises:
if the error between the second operation result and the first operation result is greater than or equal to the error threshold, increasing the decimal bit width of the i-th layer network;
converting the i-th layer input data and the i-th layer parameters again respectively into fixed-point numbers that satisfy the increased decimal bit width of the i-th layer network, and performing the predetermined operation on the re-converted i-th layer input data and i-th layer parameters to obtain a second operation result;
when the error between the newly obtained second operation result and the first operation result is less than the error threshold, determining the re-converted i-th layer parameters as the target parameters of the i-th layer network.
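(For illustration only, not part of the claims: claim 4 amounts to a retry loop. A minimal sketch, reusing the hypothetical accept_layer helper from the claim 1 sketch and assuming the width grows one bit at a time up to an assumed cap.)

```python
def quantize_with_retry(layer_input, layer_params, op, frac_bits,
                        err_threshold, max_frac_bits=15):
    # Grow the decimal bit width until the second operation result is
    # close enough to the first, or the word width is exhausted.
    while frac_bits <= max_frac_bits:
        target, err = accept_layer(layer_input, layer_params, op,
                                   frac_bits, err_threshold)
        if target is not None:
            return target, frac_bits
        frac_bits += 1  # increase the decimal bit width and convert again
    raise ValueError("no decimal bit width met the error threshold")
```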
5. The method according to any one of claims 1 to 4, characterized in that when the predetermined operation comprises a convolution operation and a bias operation, the performing the predetermined operation on the converted i-th layer input data and i-th layer parameters to obtain the second operation result comprises:
performing the convolution operation on the converted i-th layer input data and i-th layer parameters to obtain a convolution result, wherein the decimal bit width of the convolution result is greater than the decimal bit width of the i-th layer network;
adjusting the convolution result into intermediate data that satisfies the decimal bit width of the i-th layer network;
performing the bias operation on the intermediate data to obtain the second operation result.
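(For illustration only, not part of the claims: in integer fixed-point arithmetic, multiplying two Q(f) values yields a Q(2f) product, which matches the claim's observation that the convolution result is wider than the layer's decimal bit width. A sketch under that assumption, with integer mantissas and a 1-D convolution standing in for the layer operation.)

```python
import numpy as np

def conv_then_bias(q_input, q_weights, q_bias, frac_bits):
    # Convolution of two Q(frac_bits) integer mantissas produces a
    # Q(2 * frac_bits) accumulator: wider than the layer's decimal bit width.
    acc = np.convolve(q_input.astype(np.int64), q_weights.astype(np.int64))
    # Adjust the convolution result back to the layer's decimal bit width.
    intermediate = acc >> frac_bits
    # The bias operation on the intermediate data gives the second operation result.
    return intermediate + q_bias
```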
6. The method according to any one of claims 1 to 4, characterized in that when the i-th layer input data is the (i-1)-th layer output data, the converting the i-th layer input data into a fixed-point number that satisfies the decimal bit width of the i-th layer network comprises:
obtaining a bit width variation k of the decimal bit width of the i-th layer network relative to the decimal bit width of the (i-1)-th layer network, where k is a positive integer;
selecting k cascaded shift units from bit width adjustment units, each shift unit being used to perform a one-decimal-bit-width adjustment on data;
inputting the i-th layer input data into the k cascaded shift units to obtain the converted i-th layer input data.
7. A parameter adjustment apparatus for a neural network, characterized in that the apparatus comprises:
an obtaining module, configured to obtain parameters and training samples of a trained neural network, wherein the training samples comprise input data, and the parameters and the input data are floating-point numbers of the same precision;
a computing module, configured to, for an i-th layer network in the neural network, obtain i-th layer input data according to the input data obtained by the obtaining module, obtain i-th layer parameters from the parameters, and perform a predetermined operation on the i-th layer input data and the i-th layer parameters to obtain a first operation result, where i is a positive integer;
the computing module being further configured to convert the i-th layer input data and the i-th layer parameters respectively into fixed-point numbers that satisfy a decimal bit width of the i-th layer network, and perform the predetermined operation on the converted i-th layer input data and i-th layer parameters to obtain a second operation result, wherein the decimal bit width is used to indicate the number of decimal places in a fixed-point number;
a determining module, configured to, when the error between the second operation result and the first operation result obtained by the computing module is less than an error threshold, determine the converted i-th layer parameters as target parameters of the i-th layer network.
8. The apparatus according to claim 7, characterized in that the apparatus further comprises:
a selecting module, configured to, when the error of the last layer network in the neural network is greater than or equal to the error threshold, select the m errors with the largest values from all the errors, where m is a positive integer;
a first adjustment module, configured to, for each of the m errors selected by the selecting module, determine a bit width adjustment interval of the j-th layer network that produced the error, and increase the decimal bit width of the j-th layer network according to the bit width adjustment interval and the decimal bit width of the (j-1)-th layer network, where j >= 2;
the computing module being configured to recalculate the target parameters of each layer network according to the input data, the parameters, and the decimal bit width of each layer network in the neural network.
9. The apparatus according to claim 8, characterized in that the first adjustment module is further configured to:
select a piece of bit width adjustment data from the bit width adjustment interval;
add or subtract the decimal bit width of the (j-1)-th layer network and the bit width adjustment data to obtain the decimal bit width of the j-th layer network, wherein the updated decimal bit width of the j-th layer network is greater than the decimal bit width of the j-th layer network before the update.
10. The apparatus according to claim 7, characterized in that when the i-th layer network is not the last layer network in the neural network, the apparatus further comprises:
a second adjustment module, configured to, when the error between the second operation result and the first operation result is greater than or equal to the error threshold, increase the decimal bit width of the i-th layer network;
the computing module being further configured to convert the i-th layer input data and the i-th layer parameters again respectively into fixed-point numbers that satisfy the increased decimal bit width of the i-th layer network, and perform the predetermined operation on the re-converted i-th layer input data and i-th layer parameters to obtain a second operation result;
the determining module being further configured to, when the error between the newly obtained second operation result and the first operation result is less than the error threshold, determine the re-converted i-th layer parameters as the target parameters of the i-th layer network.
11. The apparatus according to any one of claims 7 to 10, characterized in that when the predetermined operation comprises a convolution operation and a bias operation, the computing module is further configured to:
perform the convolution operation on the converted i-th layer input data and i-th layer parameters to obtain a convolution result, wherein the decimal bit width of the convolution result is greater than the decimal bit width of the i-th layer network;
adjust the convolution result into intermediate data that satisfies the decimal bit width of the i-th layer network;
perform the bias operation on the intermediate data to obtain the second operation result.
12. The apparatus according to any one of claims 7 to 10, characterized in that when the i-th layer input data is the (i-1)-th layer output data, the computing module is further configured to:
obtain a bit width variation k of the decimal bit width of the i-th layer network relative to the decimal bit width of the (i-1)-th layer network, where k is a positive integer;
select k cascaded shift units from bit width adjustment units, each shift unit being used to perform a one-decimal-bit-width adjustment on data;
input the i-th layer input data into the k cascaded shift units to obtain the converted i-th layer input data.
13. A parameter adjustment device for a neural network, characterized in that the parameter adjustment device comprises a processor and a memory, wherein at least one instruction is stored in the memory, and the instruction is loaded and executed by the processor to implement the parameter adjustment method for a neural network according to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910127149.9A CN109800877B (en) | 2019-02-20 | 2019-02-20 | Parameter adjustment method, device and equipment of neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109800877A true CN109800877A (en) | 2019-05-24 |
CN109800877B CN109800877B (en) | 2022-12-30 |
Family
ID=66562255
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910127149.9A Active CN109800877B (en) | 2019-02-20 | 2019-02-20 | Parameter adjustment method, device and equipment of neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109800877B (en) |
Patent Citations (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU2048799A (en) * | 1998-11-09 | 2000-05-29 | Widex A/S | Method for in-situ measuring and correcting or adjusting the output signal of a hearing aid with a model processor and hearing aid employing such a method |
US20160379112A1 (en) * | 2015-06-29 | 2016-12-29 | Microsoft Technology Licensing, Llc | Training and operation of computational models |
CN106412687A (en) * | 2015-07-27 | 2017-02-15 | 腾讯科技(深圳)有限公司 | Interception method and device of audio and video clips |
CN106570559A (en) * | 2015-10-09 | 2017-04-19 | 阿里巴巴集团控股有限公司 | Data processing method and device based on neural network |
CN105760933A (en) * | 2016-02-18 | 2016-07-13 | 清华大学 | Method and apparatus for fixed-pointing layer-wise variable precision in convolutional neural network |
US20170286830A1 (en) * | 2016-04-04 | 2017-10-05 | Technion Research & Development Foundation Limited | Quantized neural network training and inference |
CN107239829A (en) * | 2016-08-12 | 2017-10-10 | 北京深鉴科技有限公司 | A kind of method of optimized artificial neural network |
CN106897734A (en) * | 2017-01-12 | 2017-06-27 | 南京大学 | K average clusters fixed point quantization method heterogeneous in layer based on depth convolutional neural networks |
WO2018140294A1 (en) * | 2017-01-25 | 2018-08-02 | Microsoft Technology Licensing, Llc | Neural network based on fixed-point operations |
US20180307950A1 (en) * | 2017-04-24 | 2018-10-25 | Intel Corporation | Compute optimizations for neural networks |
US20180349758A1 (en) * | 2017-06-06 | 2018-12-06 | Via Alliance Semiconductor Co., Ltd. | Computation method and device used in a convolutional neural network |
CN107688849A (en) * | 2017-07-28 | 2018-02-13 | 北京深鉴科技有限公司 | A kind of dynamic strategy fixed point training method and device |
CN107679618A (en) * | 2017-07-28 | 2018-02-09 | 北京深鉴科技有限公司 | A kind of static policies fixed point training method and device |
US20190034784A1 (en) * | 2017-07-28 | 2019-01-31 | Beijing Deephi Intelligence Technology Co., Ltd. | Fixed-point training method for deep neural networks based on dynamic fixed-point conversion scheme |
US20190042948A1 (en) * | 2017-08-04 | 2019-02-07 | Samsung Electronics Co., Ltd. | Method and apparatus for generating fixed-point quantized neural network |
US20190050710A1 (en) * | 2017-08-14 | 2019-02-14 | Midea Group Co., Ltd. | Adaptive bit-width reduction for neural networks |
CN108229663A (en) * | 2018-01-29 | 2018-06-29 | 百度在线网络技术(北京)有限公司 | For generating the method and apparatus of convolutional neural networks |
CN108898168A (en) * | 2018-06-19 | 2018-11-27 | 清华大学 | The compression method and system of convolutional neural networks model for target detection |
CN109102064A (en) * | 2018-06-26 | 2018-12-28 | 杭州雄迈集成电路技术有限公司 | A kind of high-precision neural network quantization compression method |
US20190041961A1 (en) * | 2018-09-27 | 2019-02-07 | Intel Corporation | Power savings for neural network architecture with zero activations during inference |
Non-Patent Citations (5)
Title |
---|
ELDAD MELLER et al.: "Same, Same But Different - Recovering Neural Network Quantization Error Through Weight Factorization", arXiv *
JACOB B et al.: "Quantization and training of neural networks for efficient integer-arithmetic-only inference", Computer Vision and Pattern Recognition *
YU Yang et al.: "Quantization and compression method of convolutional neural networks for 'edge' applications", Journal of Computer Applications *
WANG Lei et al.: "A survey of deep neural network model compression techniques for embedded applications", Journal of Beijing Jiaotong University *
CHEN Junbao et al.: "Research on fixed-point quantization of convolutional neural networks", Information Technology *
Cited By (35)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11437032B2 (en) | 2017-09-29 | 2022-09-06 | Shanghai Cambricon Information Technology Co., Ltd | Image processing apparatus and method |
US11397579B2 (en) | 2018-02-13 | 2022-07-26 | Shanghai Cambricon Information Technology Co., Ltd | Computing device and method |
US11507370B2 (en) | 2018-02-13 | 2022-11-22 | Cambricon (Xi'an) Semiconductor Co., Ltd. | Method and device for dynamically adjusting decimal point positions in neural network computations |
US11513586B2 (en) | 2018-02-14 | 2022-11-29 | Shanghai Cambricon Information Technology Co., Ltd | Control device, method and equipment for processor |
US11789847B2 (en) | 2018-06-27 | 2023-10-17 | Shanghai Cambricon Information Technology Co., Ltd | On-chip code breakpoint debugging method, on-chip processor, and chip breakpoint debugging system |
CN110135568A (en) * | 2019-05-28 | 2019-08-16 | 赵恒锐 | A kind of full integer nerve network system using Bounded Linear rectification unit |
US12093148B2 (en) | 2019-06-12 | 2024-09-17 | Shanghai Cambricon Information Technology Co., Ltd | Neural network quantization parameter determination method and related products |
US11676029B2 (en) | 2019-06-12 | 2023-06-13 | Shanghai Cambricon Information Technology Co., Ltd | Neural network quantization parameter determination method and related products |
US11676028B2 (en) | 2019-06-12 | 2023-06-13 | Shanghai Cambricon Information Technology Co., Ltd | Neural network quantization parameter determination method and related products |
US11675676B2 (en) | 2019-06-12 | 2023-06-13 | Shanghai Cambricon Information Technology Co., Ltd | Neural network quantization parameter determination method and related products |
CN112085150A (en) * | 2019-06-12 | 2020-12-15 | 安徽寒武纪信息科技有限公司 | Quantization parameter adjusting method and device and related product |
WO2020248423A1 (en) * | 2019-06-12 | 2020-12-17 | 上海寒武纪信息科技有限公司 | Quantization parameter determination method for neural network, and related product |
CN112308216A (en) * | 2019-07-26 | 2021-02-02 | 杭州海康威视数字技术股份有限公司 | Data block processing method and device and storage medium |
EP4339841A3 (en) * | 2019-07-26 | 2024-04-03 | Microsoft Technology Licensing, LLC | Techniques for conformance testing computational operations |
US11704231B2 (en) | 2019-07-26 | 2023-07-18 | Microsoft Technology Licensing, Llc | Techniques for conformance testing computational operations |
WO2021021304A1 (en) * | 2019-07-26 | 2021-02-04 | Microsoft Technology Licensing, Llc | Conformance testing machine learning operations executed on gpus |
WO2021016932A1 (en) * | 2019-07-31 | 2021-02-04 | 深圳市大疆创新科技有限公司 | Data processing method and apparatus, and computer-readable storage medium |
US12001955B2 (en) | 2019-08-23 | 2024-06-04 | Anhui Cambricon Information Technology Co., Ltd. | Data processing method, device, computer equipment and storage medium |
US12112257B2 (en) | 2019-08-27 | 2024-10-08 | Anhui Cambricon Information Technology Co., Ltd. | Data processing method, device, computer equipment and storage medium |
WO2021036892A1 (en) * | 2019-08-27 | 2021-03-04 | 安徽寒武纪信息科技有限公司 | Method and apparatus for adjusting quantization parameter of recurrent neural network, and related product |
CN112508167A (en) * | 2019-09-13 | 2021-03-16 | 富士通株式会社 | Information processing apparatus and method, and recording medium |
CN110852434A (en) * | 2019-09-30 | 2020-02-28 | 成都恒创新星科技有限公司 | CNN quantization method, forward calculation method and device based on low-precision floating point number |
CN110852416B (en) * | 2019-09-30 | 2022-10-04 | 梁磊 | CNN hardware acceleration computing method and system based on low-precision floating point data representation form |
CN110852434B (en) * | 2019-09-30 | 2022-09-23 | 梁磊 | CNN quantization method, forward calculation method and hardware device based on low-precision floating point number |
CN110852416A (en) * | 2019-09-30 | 2020-02-28 | 成都恒创新星科技有限公司 | CNN accelerated computing method and system based on low-precision floating-point data expression form |
CN113033787A (en) * | 2019-12-24 | 2021-06-25 | 中科寒武纪科技股份有限公司 | Method and equipment for quantizing neural network matrix, computer product and board card |
CN111831356A (en) * | 2020-07-09 | 2020-10-27 | 北京灵汐科技有限公司 | Weight precision configuration method, device, equipment and storage medium |
CN111831354A (en) * | 2020-07-09 | 2020-10-27 | 北京灵汐科技有限公司 | Data precision configuration method, device, chip array, equipment and medium |
US11797850B2 (en) | 2020-07-09 | 2023-10-24 | Lynxi Technologies Co., Ltd. | Weight precision configuration method and apparatus, computer device and storage medium |
CN111831355A (en) * | 2020-07-09 | 2020-10-27 | 北京灵汐科技有限公司 | Weight precision configuration method, device, equipment and storage medium |
WO2022007880A1 (en) * | 2020-07-09 | 2022-01-13 | 北京灵汐科技有限公司 | Data accuracy configuration method and apparatus, neural network device, and medium |
CN112836806B (en) * | 2021-02-26 | 2023-12-22 | 上海阵量智能科技有限公司 | Data format adjustment method, device, computer equipment and storage medium |
CN112836806A (en) * | 2021-02-26 | 2021-05-25 | 上海阵量智能科技有限公司 | Data format adjusting method and device, computer equipment and storage medium |
CN113593538B (en) * | 2021-09-02 | 2024-05-03 | 北京声智科技有限公司 | Voice characteristic classification method, related equipment and readable storage medium |
CN113593538A (en) * | 2021-09-02 | 2021-11-02 | 北京声智科技有限公司 | Voice feature classification method, related device and readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN109800877B (en) | 2022-12-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109800877A (en) | Parameter adjustment method, device and equipment of neural network | |
CN110097019A (en) | Character identifying method, device, computer equipment and storage medium | |
CN108304265A (en) | Memory management method, device and storage medium | |
CN108629747A (en) | Image enchancing method, device, electronic equipment and storage medium | |
CN110045960A (en) | Instruction set processing method, device and storage medium based on chip | |
CN109816042B (en) | Data classification model training method and device, electronic equipment and storage medium | |
CN109828802A (en) | List View display methods, device and readable medium | |
CN107978321A (en) | Audio-frequency processing method and device | |
CN110147852A (en) | Method, apparatus, equipment and the storage medium of image recognition | |
CN109840584A (en) | Convolutional neural networks model, data processing method and device | |
CN108281152A (en) | Audio-frequency processing method, device and storage medium | |
CN110210573A (en) | Fight generation method, device, terminal and the storage medium of image | |
CN108320756A (en) | It is a kind of detection audio whether be absolute music audio method and apparatus | |
CN110211593B (en) | Voice recognition method and device, electronic equipment and storage medium | |
CN110070143A (en) | Obtain method, apparatus, equipment and the storage medium of training data | |
CN110147533A (en) | Coding method, device, equipment and storage medium | |
CN107958672A (en) | The method and apparatus for obtaining pitch waveform data | |
CN109192218A (en) | The method and apparatus of audio processing | |
CN109003621A (en) | A kind of audio-frequency processing method, device and storage medium | |
CN110163296A (en) | Method, apparatus, equipment and the storage medium of image recognition | |
CN108922531A (en) | Slot position recognition methods, device, electronic equipment and storage medium | |
CN109102811A (en) | Generation method, device and the storage medium of audio-frequency fingerprint | |
CN109547843A (en) | The method and apparatus that audio-video is handled | |
CN109218751A (en) | The method, apparatus and system of recommendation of audio | |
CN110535890A (en) | The method and apparatus that file uploads |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |