CN112446461A - Neural network model training method and device - Google Patents

Neural network model training method and device

Info

Publication number
CN112446461A
Authority
CN
China
Prior art keywords
network
weight
network layer
activation quantity
training
Prior art date
Legal status
Pending
Application number
CN201910808066.6A
Other languages
Chinese (zh)
Inventor
张渊
谢迪
浦世亮
Current Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd filed Critical Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN201910808066.6A priority Critical patent/CN112446461A/en
Priority to PCT/CN2020/111912 priority patent/WO2021037174A1/en
Publication of CN112446461A publication Critical patent/CN112446461A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Abstract

Embodiments of the present application provide a neural network model training method and device: training samples are obtained, and the neural network model is trained using them. During training, integer fixed-point coding is performed on the first activation quantity input to each network layer and on the network weight of each network layer, so that both become integer fixed-point data with a specified bit width. The operations involved, such as matrix multiplication and matrix addition, are then carried out in integer fixed-point format. Because the bit width of integer fixed-point data is significantly smaller than that of single-precision floating-point data, the hardware resource overhead required to run the neural network model can be greatly reduced.

Description

Neural network model training method and device
Technical Field
The application relates to the technical field of machine learning, in particular to a neural network model training method and device.
Background
Deep neural networks are an emerging field of machine learning research: they analyze data by simulating the mechanism of the human brain, and constitute an intelligent model that analyzes and learns by modeling the human brain. At present, deep neural networks, such as convolutional neural networks, recurrent neural networks, and long short-term memory networks, have been successfully applied to target detection and segmentation, behavior detection and recognition, voice recognition, and other tasks.
At present, neural network models are usually trained with single-precision floating-point data to ensure the precision of model convergence. However, single-precision floating-point data has a large bit width and the amount of data participating in the operations is large, so running the neural network model requires high hardware resource overhead.
Disclosure of Invention
The embodiment of the application aims to provide a neural network model training method and device so as to reduce hardware resource overhead required by running a neural network model. The specific technical scheme is as follows:
in a first aspect, an embodiment of the present application provides a neural network model training method, where the method includes:
obtaining a training sample;
training the neural network model by using the training sample, wherein when the neural network model is trained, aiming at each network layer in the neural network model, the following steps are respectively executed:
acquiring a first activation quantity input into a network layer and a network weight of the network layer;
performing integer fixed point coding on the first activation quantity and the network weight, and coding the first activation quantity and the network weight into integer fixed point data with a specified bit width;
and calculating a second activation quantity output by the network layer according to the coded first activation quantity and the network weight.
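The three per-layer steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names are invented, the 8-bit width and the power-of-two scale selection are assumptions (the patent's exact encoding formulas appear only as images), and a fully connected layer stands in for an arbitrary network layer.

```python
import numpy as np

def encode_fixed_point(x, bit_width=8):
    """Encode a float tensor as (sp, ip): one shared power-of-two scale sp
    and signed integer fixed-point values ip of the given bit width.
    (Illustrative rule; the patent's formulas (1)-(2) are images.)"""
    max_abs = np.max(np.abs(x))
    if max_abs == 0:
        return 1.0, np.zeros_like(x, dtype=np.int32)
    # smallest power of two such that max_abs / sp fits the signed range
    E = int(np.ceil(np.log2(max_abs / (2 ** (bit_width - 1) - 1))))
    sp = 2.0 ** E
    ip = np.clip(np.round(x / sp),
                 -(2 ** (bit_width - 1)), 2 ** (bit_width - 1) - 1)
    return sp, ip.astype(np.int32)

def layer_forward(activation_in, weight, bit_width=8):
    """S301-S303 for one layer: get the first activation quantity and the
    network weight, encode both, and compute the second activation quantity."""
    sp_a, ip_a = encode_fixed_point(activation_in, bit_width)
    sp_w, ip_w = encode_fixed_point(weight, bit_width)
    acc = ip_a.astype(np.int64) @ ip_w.astype(np.int64)  # integer MACs only
    return (sp_a * sp_w) * acc  # rescale once to obtain the output
```

Because every scalar is sp · ip with one shared sp per tensor, the matrix product accumulates entirely in integer arithmetic and the two scales are applied once at the end.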
Optionally, the method is applied to a camera; the training sample is a training sample with a specified target; the neural network model is a target detection model for detecting a specified target;
before the step of training the neural network model using the training samples, the method further comprises:
starting a target detection function;
judging whether a model online training function is started or not;
the step of training the neural network model by using the training samples includes:
and if the model on-line training function is started, training the target detection model by using a training sample with a specified target.
Optionally, the step of training the neural network model by using the training sample includes:
inputting a training sample into the neural network model and performing a forward operation on it through the network layers in front-to-back order to obtain the forward operation result of the neural network model, wherein, during the forward operation, integer fixed-point coding is performed, for each network layer, on the first activation quantity input to the network layer and on the network weight of the network layer, coding them into integer fixed-point data with a specified bit width; the second activation quantity output by the network layer is calculated from the coded first activation quantity and network weight and serves as the first activation quantity input to the next network layer, until the second activation quantity output by the last network layer is determined as the forward operation result;
comparing the forward operation result with a preset nominal value to obtain a loss value;
inputting the loss value into the neural network model and performing a reverse operation on it through the network layers in back-to-front order to obtain the weight gradient of each network layer, wherein, during the reverse operation, integer fixed-point coding is performed, for each network layer, on the first activation quantity input to the network layer, the first activation quantity gradient, and the network weight, coding them into integer fixed-point data with the specified bit width; the second activation quantity gradient and the weight gradient output by the network layer are calculated from the coded first activation quantity, first activation quantity gradient, and network weight, and the second activation quantity gradient serves as the first activation quantity gradient input to the next network layer, until the weight gradients of all network layers have been calculated;
and adjusting the network weight of each network layer according to the weight gradient of each network layer.
Optionally, the step of adjusting the network weight of each network layer according to the weight gradient of each network layer includes:
performing integer fixed point coding on the weight gradient of each network layer, and coding the weight gradient of each network layer into integer fixed point data with a specified bit width;
and calculating the adjusted network weight of each network layer by using a preset optimization algorithm according to the encoded weight gradient of each network layer and the encoded network weight of each network layer.
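A sketch of this weight adjustment, with plain SGD standing in for the unspecified "preset optimization algorithm"; the quantize helper, its 8-bit width, and the power-of-two scale rule are illustrative assumptions, not the patent's exact formulas.

```python
import numpy as np

def quantize(x, bit_width=8):
    """Round a tensor onto an sp * ip grid with one shared power-of-two
    scale sp (illustrative; not the patent's exact formulas)."""
    max_abs = np.max(np.abs(x))
    if max_abs == 0:
        return np.zeros_like(x)
    sp = 2.0 ** np.ceil(np.log2(max_abs / (2 ** (bit_width - 1) - 1)))
    return sp * np.round(x / sp)

def sgd_update(weight, weight_grad, lr=0.1, bit_width=8):
    """Adjust one layer's network weight from its weight gradient after
    both have been integer fixed-point encoded."""
    return quantize(weight, bit_width) - lr * quantize(weight_grad, bit_width)
```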
Optionally, after calculating a second activation amount output by the network layer according to the encoded first activation amount and the network weight, the method provided in the embodiment of the present application further includes:
and performing integer fixed point encoding on the second activation quantity, and encoding the second activation quantity into integer fixed point data with the specified bit width.
Optionally, the step of performing integer fixed point coding on the first activation quantity and the network weight, and coding the first activation quantity and the network weight into integer fixed point data with a specified bit width includes:
and respectively coding each scalar numerical value in the first activation quantity and the network weight as a product of a parameter value representing the global dynamic range and an integer fixed-point value of the specified bit width.
Optionally, if the network layer is a convolutional layer, the size of the network weight is C × R × R × N, and the corresponding parameter values are the same for all scalar values within each three-dimensional tensor of size C × R × R;
if the network layer is a fully connected layer, the size of the network weight is M × N, and the corresponding parameter values are the same for all scalar values within each column vector of size 1 × N;
the parameter values corresponding to all scalar values in the first activation quantity are the same.
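The grouping this clause describes, one shared parameter value per C × R × R sub-tensor of a convolution kernel and one per 1 × N vector of a fully connected weight, might be sketched as follows; the (N, C, R, R) array layout, the 8-bit range (127), and the power-of-two scale rule are assumptions for illustration.

```python
import numpy as np

def _scale(max_abs, bit_width=8):
    # shared power-of-two scale for a group with largest magnitude max_abs
    if max_abs == 0:
        return 1.0
    return 2.0 ** float(np.ceil(np.log2(max_abs / (2 ** (bit_width - 1) - 1))))

def shared_scales_conv(weight):
    """One parameter value per output filter of a conv kernel stored as
    (N, C, R, R), i.e. per three-dimensional C x R x R sub-tensor."""
    return np.array([_scale(np.max(np.abs(weight[p])))
                     for p in range(weight.shape[0])])

def shared_scales_fc(weight):
    """One parameter value per 1 x N vector of an M x N fully connected
    weight (per row in this layout)."""
    return np.array([_scale(np.max(np.abs(row))) for row in weight])
```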
In a second aspect, an embodiment of the present application provides a neural network model training apparatus, including:
the acquisition module is used for acquiring a training sample;
the training module is used for training the neural network model by utilizing the training sample, wherein when the training module trains the neural network model, the training module respectively executes the following steps aiming at each network layer in the neural network model:
acquiring a first activation quantity input into a network layer and a network weight of the network layer;
performing integer fixed point coding on the first activation quantity and the network weight, and coding the first activation quantity and the network weight into integer fixed point data with a specified bit width;
and calculating a second activation quantity output by the network layer according to the coded first activation quantity and the network weight.
Optionally, the apparatus is applied to a camera; the training sample is a training sample with a specified target; the neural network model is a target detection model for detecting a specified target;
the device also includes:
the starting module is used for starting a target detection function;
the judging module is used for judging whether a model on-line training function is started or not;
the training module is specifically configured to:
and if the judgment result of the judgment module is that the on-line model training function is started, training the target detection model by using a training sample with a specified target.
Optionally, the training module is specifically configured to:
inputting a training sample into the neural network model and performing a forward operation on it through the network layers in front-to-back order to obtain the forward operation result of the neural network model, wherein, during the forward operation, integer fixed-point coding is performed, for each network layer, on the first activation quantity input to the network layer and on the network weight of the network layer, coding them into integer fixed-point data with a specified bit width; the second activation quantity output by the network layer is calculated from the coded first activation quantity and network weight and serves as the first activation quantity input to the next network layer, until the second activation quantity output by the last network layer is determined as the forward operation result;
comparing the forward operation result with a preset nominal value to obtain a loss value;
inputting the loss value into the neural network model and performing a reverse operation on it through the network layers in back-to-front order to obtain the weight gradient of each network layer, wherein, during the reverse operation, integer fixed-point coding is performed, for each network layer, on the first activation quantity input to the network layer, the first activation quantity gradient, and the network weight, coding them into integer fixed-point data with the specified bit width; the second activation quantity gradient and the weight gradient output by the network layer are calculated from the coded first activation quantity, first activation quantity gradient, and network weight, and the second activation quantity gradient serves as the first activation quantity gradient input to the next network layer, until the weight gradients of all network layers have been calculated;
and adjusting the network weight of each network layer according to the weight gradient of each network layer.
Optionally, the training module, when configured to adjust the network weights of the network layers according to the weight gradients of the network layers, is specifically configured to:
performing integer fixed point coding on the weight gradient of each network layer, and coding the weight gradient of each network layer into integer fixed point data with a specified bit width;
and calculating the adjusted network weight of each network layer by using a preset optimization algorithm according to the encoded weight gradient of each network layer and the encoded network weight of each network layer.
Optionally, the training module is further configured to:
and performing integer fixed point encoding on the second activation quantity, and encoding the second activation quantity into integer fixed point data with the specified bit width.
Optionally, when performing integer fixed-point coding on the first activation quantity and the network weight and coding them into integer fixed-point data with a specified bit width, the training module is specifically configured to:
respectively code each scalar numerical value in the first activation quantity and the network weight as a product of a parameter value representing the global dynamic range and an integer fixed-point value of the specified bit width.
Optionally, if the network layer is a convolutional layer, the size of the network weight is C × R × R × N, and the corresponding parameter values are the same for all scalar values within each three-dimensional tensor of size C × R × R;
if the network layer is a fully connected layer, the size of the network weight is M × N, and the corresponding parameter values are the same for all scalar values within each column vector of size 1 × N;
the parameter values corresponding to all scalar values in the first activation quantity are the same.
In a third aspect, an embodiment of the present application provides a computer device comprising a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor, the machine-executable instructions causing the processor to implement the method provided in the first aspect of the embodiments of the present application.
In a fourth aspect, an embodiment of the present application provides a machine-readable storage medium storing machine-executable instructions that, when invoked and executed by a processor, implement the method provided in the first aspect of the embodiments of the present application.
The neural network model training method and device provided by the embodiments of the present application obtain training samples and train the neural network model with them. During training, the following steps are performed for each network layer in the neural network model: obtain the first activation quantity input to the network layer and the network weight of the network layer; perform integer fixed-point coding on the first activation quantity and the network weight, coding them into integer fixed-point data with a specified bit width; and calculate the second activation quantity output by the network layer from the coded first activation quantity and network weight. Because the operations involved, such as matrix multiplication and matrix addition, are carried out in integer fixed-point format, and the bit width of integer fixed-point data is significantly smaller than that of single-precision floating-point data, the hardware resource overhead required to run the neural network model can be greatly reduced.
Drawings
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flow chart illustrating a neural network model training method according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a neural network model training process according to an embodiment of the present application;
fig. 3 is a schematic diagram of an execution flow of each network layer in a neural network model in a process of training the neural network model according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a tensor space structure corresponding to a convolution kernel of a four-dimensional tensor of size C × R × R × N according to an embodiment of the present application;
FIG. 5 is a diagram illustrating the encoding of each scalar value within a three-dimensional tensor of size C × R × R according to an embodiment of the present application;
fig. 6 is a schematic diagram of the tensor space structure corresponding to a two-dimensional matrix of size M × N according to an embodiment of the present application;
FIG. 7 is a diagram illustrating the encoding of each scalar value within a column vector of size 1 × N according to an embodiment of the present application;
FIG. 8 is a schematic diagram of the manner in which each scalar value is encoded within the activation volume and activation volume gradient three-dimensional tensor according to an embodiment of the present application;
FIG. 9 is a schematic flowchart of a target detection model training method applied to a camera according to an embodiment of the present disclosure;
FIG. 10 is a schematic structural diagram of a neural network model training apparatus according to an embodiment of the present application;
fig. 11 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art from the embodiments herein without creative effort fall within the protection scope of the present application.
In order to reduce hardware resource overhead required for running a neural network model, embodiments of the present application provide a neural network model training method, apparatus, computer device, and machine-readable storage medium. Next, a neural network model training method provided in the embodiment of the present application is first described.
The execution body of the neural network training method provided by the embodiments of the present application may be a computer device with a neural network model training function: for example, a computer device implementing functions such as target detection and segmentation, behavior detection and recognition, or voice recognition; a camera with target detection and segmentation or behavior detection and recognition functions; or a microphone with a voice recognition function. The execution body includes at least a core processing chip with data processing capability. The neural network training method provided by the embodiments of the present application may be implemented by at least one of software, a hardware circuit, or a logic circuit in the execution body.
As shown in fig. 1, a neural network model training method provided in an embodiment of the present application may include the following steps.
And S101, obtaining a training sample.
When neural network training is performed, a large number of training samples generally need to be collected, and the samples collected differ according to the function to be realized by the neural network model. For example, if a detection model is trained for face detection, the collected training samples are face samples; if a tracking model is trained for vehicle tracking, the collected training samples are vehicle samples.
And S102, training the neural network model by using the training sample.
Training samples are input into the neural network model and processed with the Back Propagation (BP) algorithm or another model training algorithm; the calculation result is compared with a set nominal value, and the network weights of the neural network model are adjusted based on the comparison result. Different training samples are input in turn, the above steps are iterated, and the network weights are continually adjusted, so that the output of the neural network model approaches the nominal value more and more closely. Training is considered complete when the difference between the output of the neural network model and the nominal value is small enough, or when the output of the neural network model converges.
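The iterate-compare-adjust loop just described can be outlined schematically; everything here (the function names, the squared-error loss, and the stopping tolerance) is illustrative rather than taken from the patent.

```python
import numpy as np

def train(forward_backward, update, weights, samples, nominals,
          epochs=100, tol=1e-4):
    """Schematic training loop: run each sample through the model, compare
    the output with its nominal value, and adjust the weights until the
    epoch loss is small enough or the epoch budget runs out."""
    loss = float("inf")
    for _ in range(epochs):
        loss = 0.0
        for x, y in zip(samples, nominals):
            out, grad = forward_backward(weights, x, y)
            loss += 0.5 * float(np.sum((np.asarray(out) - y) ** 2))
            weights = update(weights, grad)
        if loss < tol:
            break
    return weights, loss
```

With a tiny linear model (out = w · x) and an SGD update, the loop drives the output toward the nominal values within a few epochs.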
Taking the BP algorithm as an example, the main calculation operations and data flow in the neural network model training process are shown in fig. 2. During the forward operation, each network layer mainly performs the convolution operation Y_i = W_i * Y_{i-1}; during the reverse operation, each network layer mainly performs the convolution operation dY_{i-1} = dY_i * W_i and the matrix multiplication operation dW_i = dY_i * Y_{i-1}. The forward operation proceeds from the first network layer backward, and the reverse operation proceeds from the last network layer forward. W_i denotes the network weight of the i-th network layer, e.g., convolutional-layer or fully-connected-layer parameters; Y_i denotes the activation quantity input to or output from the i-th network layer; dW_i denotes the weight gradient corresponding to the i-th network layer; and dY_i denotes the activation quantity gradient input to the i-th network layer.
As shown in fig. 2, when the neural network model is trained with the BP algorithm, the training sample X is input into the neural network model, and the forward operation proceeds through the k network layers, which perform convolution operations in front-to-back order, to obtain the model output Y_k. The model output is compared with a nominal value through a loss function to obtain the loss value dY_k. The reverse operation of the neural network model is then performed: the k network layers sequentially perform convolution and matrix multiplication operations in back-to-front order to obtain the weight gradient corresponding to each network layer, and the network weights are adjusted according to the weight gradients. Through continuous iteration, the output of the neural network model approaches the nominal value ever more closely.
In the embodiment of the present application, in the process of training the neural network model, each network layer in the neural network model needs to perform each step shown in fig. 3.
S301, obtaining the first activation quantity input into the network layer and the network weight of the network layer.
During the forward operation, the first activation quantity input to the i-th network layer is Y_i; during the reverse operation, the first activation quantity gradient input to the i-th network layer is dY_i.
S302, integer fixed point coding is carried out on the first activation quantity and the network weight, and the first activation quantity and the network weight are coded into integer fixed point data with a designated bit width.
For the i-th network layer, integer fixed-point coding must be performed on the first activation quantities Y_i and dY_i input to the network layer and on the network weight W_i of the network layer; integer fixed-point coding encodes data in floating-point format into data in integer fixed-point format.
Optionally, S302 may specifically be:
and respectively coding each scalar numerical value in the first activation quantity and the network weight as a product of a parameter value representing the global dynamic range and an integer fixed-point value of the specified bit width.
Specifically, each scalar value in the first activation quantity and the network weight is encoded as the product of a parameter value sp characterizing the global dynamic range and an integer fixed-point value ip of the specified bit width, where sp = 2^E, E is a signed binary number with bit width EB, EB is a set bit width, ip is a signed binary number with bit width IB, and IB is a bit width set according to the size of the original floating-point data. The integer fixed-point value ip and the parameter value sp are calculated as follows:
[Formulas (1) and (2), which define the integer fixed-point value ip and the parameter value sp, appear only as images in the original publication.]
where s is the sign bit of the binary number x, taking the value 0 or 1, and x_i is the i-th bit of the binary number x, also taking the value 0 or 1.
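Since formulas (1) and (2) are reproduced only as images in the source, the following is one plausible realization consistent with the surrounding text: sp = 2^E with E a signed EB-bit exponent and ip a signed IB-bit integer. The exponent-selection rule and the defaults IB=8, EB=5 are assumptions, not the patent's exact definitions.

```python
import numpy as np

def encode_scalar(x, IB=8, EB=5):
    """Encode float x as (E, ip) with sp = 2**E: E a signed EB-bit
    exponent, ip a signed IB-bit integer. One plausible reading of
    formulas (1)-(2); the defaults IB=8, EB=5 are assumptions."""
    if x == 0:
        return 0, 0
    # pick E so that |x| / 2**E fits the signed IB-bit range
    E = int(np.ceil(np.log2(abs(x) / (2 ** (IB - 1) - 1))))
    E = max(-(2 ** (EB - 1)), min(2 ** (EB - 1) - 1, E))  # clamp to EB bits
    ip = int(np.clip(round(x / 2.0 ** E),
                     -(2 ** (IB - 1)), 2 ** (IB - 1) - 1))
    return E, ip

def decode_scalar(E, ip):
    # recover the encoded value sp * ip
    return ip * 2.0 ** E
```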
Optionally, if the network layer is a convolutional layer, the size of the network weight is C × R × R × N, and the corresponding parameter values are the same for all scalar values within each three-dimensional tensor of size C × R × R; if the network layer is a fully connected layer, the size of the network weight is M × N, and the corresponding parameter values are the same for all scalar values within each column vector of size 1 × N; the parameter values corresponding to all scalar values in the first activation quantity are the same.
W_i is the network weight corresponding to the i-th layer of the neural network model, and the network layer type is convolutional layer or fully connected layer. If the i-th layer is a convolutional layer, W_i is a four-dimensional tensor convolution kernel of size C × R × R × N, whose corresponding tensor space structure is shown in fig. 4, where C denotes the input-channel dimension of the convolution kernel, R denotes its spatial dimension, and N denotes its output-channel dimension. For each three-dimensional tensor W_i^p of size C × R × R, each scalar value w within it can be expressed as:
w=sp·ip (3)
where each three-dimensional tensor W_i^p shares one sp, and each scalar value w corresponds to one integer fixed-point value ip. The encoding of each scalar value within a three-dimensional tensor of size C × R × R is shown in fig. 5. ip and sp are calculated as in formulas (1) and (2), which are not repeated here.
Similarly, if the i-th layer is a fully connected layer, W_i is a two-dimensional matrix of size M × N, whose corresponding tensor space structure is shown in fig. 6; the M × N matrix is divided into M column vectors of size 1 × N. For each column vector W_i^p of size 1 × N, each scalar value w within it is expressed by equation (3) above. Each column vector W_i^p shares one sp, and each scalar value w corresponds to one integer fixed-point value ip. The encoding of each scalar value within a column vector of size 1 × N is shown in fig. 7. ip and sp are calculated as in formulas (1) and (2), which are not repeated here.
Y_i and dY_i are the activation quantity and activation quantity gradient corresponding to the i-th layer of the neural network model; both are three-dimensional tensors of size C × H × W. Each scalar value y or dy within Y_i or dY_i can be expressed as:
y=sp·ip (4)
dy=sp·ip (5)
where each three-dimensional tensor Y_i or dY_i shares one sp, and each scalar value y or dy corresponds to one integer fixed-point value ip. The encoding of each scalar value within the activation quantity and activation quantity gradient three-dimensional tensors is shown in fig. 8. ip and sp are calculated as in formulas (1) and (2), which are not repeated here.
S303, calculating a second activation quantity output by the network layer according to the coded first activation quantity and the network weight.
As described above, each scalar value in the first activation quantity and the network weight is integer fixed-point encoded, so every coded value is an integer fixed-point value. The operations that account for most of the resource overhead, such as convolution and matrix multiplication, are thereby converted from floating-point operations to integer fixed-point operations in both the forward and reverse passes, which greatly improves the training efficiency of the neural network on a hardware platform.
Optionally, S102 may specifically be implemented by the following steps:
the method comprises the steps of firstly, inputting training samples into a neural network model, carrying out forward operation on the training samples according to the sequence of each network layer in the neural network model from front to back to obtain a forward operation result of the neural network model, wherein when the forward operation is carried out, aiming at each network layer, respectively carrying out integer fixed point coding on a first activation quantity input into the network layer and a network weight of the network layer, coding the first activation quantity and the network weight into integer fixed point data with a specified bit width, calculating a second activation quantity output by the network layer according to the coded first activation quantity and the coded network weight of each network layer, and calculating the second activation quantity as the first activation quantity input into the next network layer until the second activation quantity output by the last network layer is determined as the forward operation result.
And secondly, comparing the forward operation result with a preset nominal value to obtain a loss value.
Thirdly, the loss value is input into the neural network model, and reverse operation is performed on the loss value according to the back-to-front order of the network layers to obtain the weight gradient of each network layer in the neural network model. During the reverse operation, for each network layer, integer fixed-point coding is performed on the first activation quantity, the first activation quantity gradient and the network weight of the network layer, coding them into integer fixed-point data with the specified bit width; the second activation quantity gradient and the weight gradient output by the network layer are calculated according to the coded first activation quantity, first activation quantity gradient and network weight, and the second activation quantity gradient is used as the first activation quantity gradient input into the next network layer, until the weight gradients of all the network layers are calculated.
And fourthly, adjusting the network weight of each network layer according to the weight gradient of each network layer.
The process from the first step to the fourth step is the operation process of the BP algorithm; the training of the neural network model is realized by continuously and cyclically executing these four steps. The forward operation calculates the second activation quantity by multiplying the first activation quantity with the network weight, Y_i = W_i * Y_{i-1}; the reverse operation calculates the second activation quantity gradient by multiplying the first activation quantity gradient with the network weight, dY_{i-1} = dY_i * W_i, and calculates the weight gradient by multiplying the first activation quantity gradient with the first activation quantity, dW_i = dY_i * Y_{i-1}. With integer fixed-point encoding, these floating-point operations become integer fixed-point operations:
f32(Y_{k+1}) = f32(Y_k) * f32(W_k) → int_YB(Y_{k+1}) = int_YB(Y_k) * int_WB(W_k) (6)
f32(dY_{k-1}) = f32(dY_k) * f32(W_k) → int_dYB(dY_{k-1}) = int_dYB(dY_k) * int_WB(W_k) (7)
f32(dW_k) = f32(dY_k) * f32(Y_{k-1}) → int_dWB(dW_k) = int_dYB(dY_k) * int_YB(Y_{k-1}) (8)
where YB, WB, dYB and dWB are integer bit-width values, and f32(·) and int(·) denote the 32-bit floating-point format and the integer fixed-point format, respectively.
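The conversion in equations (6) to (8) can be illustrated with a small sketch. Because each operand is a shared scale times an integer tensor, a matrix product reduces to an integer-only multiply-accumulate plus one scalar scale multiplication; the function name and the explicit scale values below are assumptions for illustration, not the patent's exact scheme.

```python
import numpy as np

def fixed_point_matmul(ip_a, sp_a, ip_b, sp_b):
    """Product of two fixed-point-encoded tensors, as in equations (6)-(8).

    Since A = sp_a * ip_a and B = sp_b * ip_b, it holds that
    A @ B = (sp_a * sp_b) * (ip_a @ ip_b): the expensive multiply-
    accumulate runs entirely on integers, and only the two shared
    scales are multiplied in floating point.
    """
    ip_out = ip_a.astype(np.int64) @ ip_b.astype(np.int64)  # integer-only MAC
    sp_out = sp_a * sp_b
    return ip_out, sp_out

# Forward pass of equation (6): Y_{k+1} = Y_k * W_k with encoded operands.
ip_y = np.array([[2, -1], [4, 3]])   # int_YB(Y_k), shared scale 0.25
ip_w = np.array([[1], [2]])          # int_WB(W_k), shared scale 0.25
ip_out, sp_out = fixed_point_matmul(ip_y, 0.25, ip_w, 0.25)
```

Note that the integer accumulator grows beyond the operands' bit widths, which is why the output activation is typically re-encoded to the specified bit width before being fed to the next layer.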
Optionally, the fourth step may be specifically implemented by the following steps:
performing integer fixed point coding on the weight gradient of each network layer, and coding the weight gradient of each network layer into integer fixed point data with a specified bit width; and calculating the adjusted network weight of each network layer by using a preset optimization algorithm according to the encoded weight gradient of each network layer and the encoded network weight of each network layer.
After the weight gradients of each network layer are calculated, they may be encoded; the specific encoding process may refer to the process of encoding the network weights above, and is not described herein again. After encoding, the network weight needs to be adjusted based on the weight gradient. The adjustment process mainly involves matrix addition; specifically, by adopting an optimization algorithm such as SGD (Stochastic Gradient Descent), the network weight update can be converted from the floating-point format to the integer fixed-point format. Taking the SGD optimization algorithm as an example, the conversion of the network weight update is shown in equations (9) to (11).
f32(dW) = f32(dW) + f32(λ_w)·f32(W) → int_dWB(dW) = int_dWB(dW) + int_λB(λ_w)·int_WB(W) (9)
f32(W_old) = f32(m)·f32(dW_old) + f32(η)·f32(dW) → int_WB(W_old) = int_mB(m)·int_dWB(dW_old) + int_ηB(η)·int_dWB(dW) (10)
f32(W) = f32(W) + f32(W_old) → int_WB(W) = int_WB(W) + int_WB(W_old) (11)
where dW is the weight gradient of the network layer at the current moment, dW_old is the weight gradient of the network layer at the previous moment, W is the network weight of the network layer at the current moment, and λ_w, η and m are training hyperparameters (which may be set as required).
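Under the same encoding idea, equations (9) to (11) can be sketched as an SGD step on integer tensors. This is a simplification: the patent encodes each hyperparameter with its own bit width (int_λB, int_mB, int_ηB), whereas the sketch keeps λ_w, η and m as plain floats and rounds their products back to integers; the function name is hypothetical, and the sign convention follows equations (10) and (11), where the update term W_old is added to W.

```python
import numpy as np

def sgd_fixed_point_step(ip_W, ip_dW, ip_dW_old, lam, eta, m):
    """One SGD weight update in the spirit of equations (9)-(11).

    All tensors are the integer parts of fixed-point encodings sharing
    compatible scales; lam, eta and m stay as floats here, a
    simplification of the patent's fully fixed-point scheme.
    """
    # (9)  weight decay folded into the gradient: dW <- dW + lam * W
    ip_dW = ip_dW + np.round(lam * ip_W).astype(np.int64)
    # (10) momentum-style update term: W_old <- m * dW_old + eta * dW
    ip_W_old = np.round(m * ip_dW_old + eta * ip_dW).astype(np.int64)
    # (11) apply the update term: W <- W + W_old
    ip_W = ip_W + ip_W_old
    return ip_W, ip_W_old
```

Because the update term is added to W, a negative η moves the weights against the gradient, giving the usual descent direction.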
Optionally, after step S303 is executed, the neural network model training method provided in the embodiment of the present application may further execute:
and performing integer fixed point encoding on the second activation quantity, and encoding the second activation quantity into integer fixed point data with the specified bit width.
After the operation of each network layer, the bit width of the resulting integer fixed-point data generally grows. If such data were input into a subsequent network layer for operation, the longer bit width could reduce operation efficiency. To ensure operation efficiency, the calculated second activation quantity may therefore be integer fixed-point encoded again to reduce its bit width, so that the bit width of the second activation quantity meets the calculation requirement of the next network layer.
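This re-encoding step can be sketched as follows: the widened accumulator values are decoded back to the values they represent, then re-quantized to the specified bit width with a fresh shared scale. The max-abs scale choice and the function name are assumptions for illustration.

```python
import numpy as np

def requantize(sp, ip, bit_width):
    """Re-encode a fixed-point tensor whose integer part has outgrown
    the specified bit width back into `bit_width`-bit integers.

    Assumption: the new shared scale is chosen from the maximum absolute
    represented value, so the integer parts fit the target width again.
    """
    qmax = 2 ** (bit_width - 1) - 1
    x = sp * np.asarray(ip, dtype=np.float64)      # values represented
    m = np.max(np.abs(x))
    sp_new = m / qmax if m > 0 else 1.0
    ip_new = np.clip(np.round(x / sp_new), -qmax - 1, qmax).astype(np.int32)
    return sp_new, ip_new
```

For example, a second activation quantity held in wide accumulators (sp = 0.001, ip values in the thousands) can be brought back to 8-bit integers with one call, at the cost of a small rounding error.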
By applying the embodiment of the application, the training sample is obtained, and the neural network model is trained by utilizing the training sample. When the neural network model is trained, aiming at each network layer in the neural network model, respectively executing: the method comprises the steps of obtaining a first activation quantity input into a network layer and a network weight of the network layer, conducting integer fixed point coding on the first activation quantity and the network weight, coding the first activation quantity and the network weight into integer fixed point data with a specified bit width, and calculating a second activation quantity output by the network layer according to the coded first activation quantity and the network weight. When the neural network model is trained, integer fixed-point coding is carried out on the first activation quantity input into each network layer and the network weight of each network layer, the first activation quantity and the network weight after coding are integer fixed-point data with specified bit width, when operation is carried out, the involved operations such as matrix multiplication, matrix addition and the like all adopt integer fixed-point formats, and the bit width of the integer fixed-point data is obviously less than the bit width of single-precision floating-point data, so that the hardware resource overhead required by running the neural network model can be greatly reduced.
The neural network model training method described above is mainly suitable for resource-constrained edge devices such as cameras. For a camera, the intelligent inference functions mainly include target detection, target tracking, face recognition and the like. Taking target detection as an example, the following introduces a training method for a target detection model deployed on a camera; as shown in fig. 9, the method mainly includes the following steps:
and S901, starting a target detection function.
When target detection is required according to the actual needs of the user, the camera can start the target detection function based on the user's selection.
And S902, judging whether to start the model on-line training function, if so, executing S903, and otherwise, waiting for starting the model on-line training function.
Before the target detection model is used for target detection, it needs to be trained. Whether online training is performed can be selected by the user; in general, only after the model online training function is started does the camera train the target detection model according to the steps of the embodiment shown in fig. 1.
And S903, training the target detection model by using the obtained training sample with the specified target.
When the target detection model is trained, the training sample input into the target detection model is a training sample with a specified target, so that the trained target detection model can detect the specified target. The specific way of training the target detection model is the same as the way of training the neural network model in the embodiment shown in fig. 3, and details are not repeated here.
Because the camera trains the target detection model in the training mode of the embodiment shown in fig. 3, the first activation quantity input into each network layer and the network weight of each network layer are integer fixed-point encoded during training, and the coded first activation quantity and network weight are integer fixed-point data with a specified bit width. During operation, the involved matrix multiplication, matrix addition and other operations all use the integer fixed-point format, and the bit width of integer fixed-point data is significantly less than that of single-precision floating-point data, so the hardware resource overhead of the camera can be greatly reduced. Moreover, performing online training of the target detection model on the camera gives the camera a scene self-adaptation capability.
Corresponding to the above method embodiment, an embodiment of the present application provides a neural network model training apparatus, as shown in fig. 10, the apparatus may include:
an obtaining module 1010, configured to obtain a training sample;
a training module 1020, configured to train the neural network model by using the training samples, where the training module 1020 executes the following steps for each network layer in the neural network model when training the neural network model:
acquiring a first activation quantity input into a network layer and a network weight of the network layer;
performing integer fixed point coding on the first activation quantity and the network weight, and coding the first activation quantity and the network weight into integer fixed point data with a specified bit width;
and calculating a second activation quantity output by the network layer according to the coded first activation quantity and the network weight.
Optionally, the apparatus is applied to a camera; the training sample is a training sample with a specified target; the neural network model is a target detection model for detecting a specified target;
the apparatus may further include:
the starting module is used for starting a target detection function;
the judging module is used for judging whether a model on-line training function is started or not;
training module 1020 may be specifically configured to:
and if the judgment result of the judgment module is that the on-line model training function is started, training the target detection model by using a training sample with a specified target.
Optionally, the training module 1020 may be specifically configured to:
inputting a training sample into a neural network model, and performing forward operation on the training sample according to the sequence of each network layer in the neural network model from front to back to obtain a forward operation result of the neural network model, wherein when performing forward operation, a first activation quantity input into the network layer and a network weight of the network layer are respectively subjected to integer fixed point coding aiming at each network layer, the first activation quantity and the network weight are coded into integer fixed point data with a specified bit width, a second activation quantity output by the network layer is calculated according to the coded first activation quantity and the network weight, and the second activation quantity is used as the first activation quantity input into the next network layer for calculation until the second activation quantity output by the last network layer is determined as the forward operation result;
comparing the forward operation result with a preset nominal value to obtain a loss value;
inputting the loss value into the neural network model, and performing reverse operation on the loss value according to the back-to-front order of the network layers to obtain the weight gradient of each network layer in the neural network model, wherein, during the reverse operation, for each network layer, integer fixed-point coding is performed on the first activation quantity, the first activation quantity gradient and the network weight of the network layer, coding them into integer fixed-point data with the specified bit width, a second activation quantity gradient and a weight gradient output by the network layer are calculated according to the coded first activation quantity, first activation quantity gradient and network weight, and the second activation quantity gradient is used as the first activation quantity gradient input into the next network layer for calculation, until the weight gradients of all the network layers are calculated;
and adjusting the network weight of each network layer according to the weight gradient of each network layer.
Optionally, the training module 1020, when being configured to adjust the network weights of the network layers according to the weight gradients of the network layers, may specifically be configured to:
performing integer fixed point coding on the weight gradient of each network layer, and coding the weight gradient of each network layer into integer fixed point data with a specified bit width;
and calculating the adjusted network weight of each network layer by using a preset optimization algorithm according to the encoded weight gradient of each network layer and the encoded network weight of each network layer.
Optionally, the training module 1020 may further be configured to:
and performing integer fixed point encoding on the second activation quantity, and encoding the second activation quantity into integer fixed point data with the specified bit width.
Optionally, the training module 1020, when being configured to perform integer fixed point coding on the first activation quantity and the network weight, and code the first activation quantity and the network weight as integer fixed point data with a specified bit width, may be specifically configured to:
and respectively coding each scalar numerical value in the first activation quantity and the network weight as a product of a parameter value representing the global dynamic range and an integer fixed point value of the designated bit width.
Optionally, if the network layer is a convolutional layer, the size of the network weight is C × R × R × N, and for each scalar numerical value in each three-dimensional tensor with the size of C × R × R, the corresponding parameter values are the same;
if the network layer is a full connection layer, the network weight is MxN, and corresponding parameter values are the same for each scalar numerical value in each column vector with the size of 1 xN;
the parameter values corresponding to the scalar values in the first activation quantity are the same.
By applying the embodiment of the application, the training sample is obtained, and the neural network model is trained by utilizing the training sample. When the neural network model is trained, aiming at each network layer in the neural network model, respectively executing: the method comprises the steps of obtaining a first activation quantity input into a network layer and a network weight of the network layer, conducting integer fixed point coding on the first activation quantity and the network weight, coding the first activation quantity and the network weight into integer fixed point data with a specified bit width, and calculating a second activation quantity output by the network layer according to the coded first activation quantity and the network weight. When the neural network model is trained, integer fixed-point coding is carried out on the first activation quantity input into each network layer and the network weight of each network layer, the first activation quantity and the network weight after coding are integer fixed-point data with specified bit width, when operation is carried out, the involved operations such as matrix multiplication, matrix addition and the like all adopt integer fixed-point formats, and the bit width of the integer fixed-point data is obviously less than the bit width of single-precision floating-point data, so that the hardware resource overhead required by running the neural network model can be greatly reduced.
The present application embodiment provides a computer device, as shown in fig. 11, which may include a processor 1101 and a machine-readable storage medium 1102, where the machine-readable storage medium 1102 stores machine-executable instructions capable of being executed by the processor 1101, and the processor 1101 is caused by the machine-executable instructions to: all steps of the neural network model training method described above are implemented.
The machine-readable storage medium may include a RAM (Random Access Memory) and a NVM (Non-Volatile Memory), such as at least one disk Memory. Alternatively, the machine-readable storage medium may be at least one memory device located remotely from the processor.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The machine-readable storage medium 1102 and the processor 1101 may be in data communication by way of a wired or wireless connection, and the computer device may communicate with other devices by way of a wired or wireless communication interface. Fig. 11 shows only an example of data transmission between the processor 1101 and the machine-readable storage medium 1102 through a bus, and the connection manner is not limited in particular.
In this embodiment, the processor 1101 can realize that by reading the machine executable instructions stored in the machine readable storage medium 1102 and by executing the machine executable instructions: and acquiring a training sample, and training the neural network model by using the training sample. When the neural network model is trained, aiming at each network layer in the neural network model, respectively executing: the method comprises the steps of obtaining a first activation quantity input into a network layer and a network weight of the network layer, conducting integer fixed point coding on the first activation quantity and the network weight, coding the first activation quantity and the network weight into integer fixed point data with a specified bit width, and calculating a second activation quantity output by the network layer according to the coded first activation quantity and the network weight. When the neural network model is trained, integer fixed-point coding is carried out on the first activation quantity input into each network layer and the network weight of each network layer, the first activation quantity and the network weight after coding are integer fixed-point data with specified bit width, when operation is carried out, the involved operations such as matrix multiplication, matrix addition and the like all adopt integer fixed-point formats, and the bit width of the integer fixed-point data is obviously less than the bit width of single-precision floating-point data, so that the hardware resource overhead required by running the neural network model can be greatly reduced.
The embodiment of the application also provides a machine-readable storage medium, which stores machine executable instructions and realizes all the steps of the neural network model training method when being called and executed by a processor.
In this embodiment, the machine-readable storage medium stores machine-executable instructions for executing the neural network model training method provided in this embodiment when running, so that the following can be implemented: and acquiring a training sample, and training the neural network model by using the training sample. When the neural network model is trained, aiming at each network layer in the neural network model, respectively executing: the method comprises the steps of obtaining a first activation quantity input into a network layer and a network weight of the network layer, conducting integer fixed point coding on the first activation quantity and the network weight, coding the first activation quantity and the network weight into integer fixed point data with a specified bit width, and calculating a second activation quantity output by the network layer according to the coded first activation quantity and the network weight. When the neural network model is trained, integer fixed-point coding is carried out on the first activation quantity input into each network layer and the network weight of each network layer, the first activation quantity and the network weight after coding are integer fixed-point data with specified bit width, when operation is carried out, the involved operations such as matrix multiplication, matrix addition and the like all adopt integer fixed-point formats, and the bit width of the integer fixed-point data is obviously less than the bit width of single-precision floating-point data, so that the hardware resource overhead required by running the neural network model can be greatly reduced.
For the embodiments of the computer device and the machine-readable storage medium, the contents of the related methods are substantially similar to those of the foregoing method embodiments, so that the description is relatively simple, and for the relevant points, reference may be made to partial descriptions of the method embodiments.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the apparatus, computer device, and machine-readable storage medium embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and in relation to the description, reference may be made to some portions of the method embodiments.
The above description is only for the preferred embodiment of the present application, and is not intended to limit the scope of the present application. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application are included in the protection scope of the present application.

Claims (14)

1. A neural network model training method, the method comprising:
obtaining a training sample;
training a neural network model by using the training sample, wherein when the neural network model is trained, the following steps are respectively executed for each network layer in the neural network model:
acquiring a first activation quantity input into the network layer and a network weight of the network layer;
performing integer fixed point coding on the first activation quantity and the network weight, and coding the first activation quantity and the network weight into integer fixed point data with a specified bit width;
and calculating a second activation quantity output by the network layer according to the coded first activation quantity and the network weight.
2. The method of claim 1, wherein the method is applied to a camera; the training sample is a training sample with a specified target; the neural network model is a target detection model used for detecting the specified target;
before the training of the neural network model using the training samples, the method further comprises:
starting a target detection function;
judging whether a model online training function is started or not;
the training of the neural network model by using the training samples comprises:
and if the model on-line training function is started, training the target detection model by using the training sample with the specified target.
3. The method of claim 1, wherein training a neural network model using the training samples comprises:
inputting the training sample into a neural network model, and performing forward operation on the training sample according to the sequence of each network layer in the neural network model from front to back to obtain a forward operation result of the neural network model, wherein when performing forward operation, for each network layer, performing integer fixed point coding on a first activation quantity input into the network layer and a network weight of the network layer, coding the first activation quantity and the network weight into integer fixed point data with a specified bit width, calculating a second activation quantity output by the network layer according to the coded first activation quantity and the network weight, and calculating the second activation quantity as a first activation quantity input into a next network layer until a second activation quantity output by a last network layer is determined as a forward operation result;
comparing the forward operation result with a preset nominal value to obtain a loss value;
inputting the loss value into the neural network model, and performing reverse operation on the loss value according to the sequence of each network layer in the neural network model from back to front to obtain a weight gradient of each network layer in the neural network model, wherein during the reverse operation, a first activation quantity, a first activation quantity gradient and a network weight of the network layer are respectively subjected to integer fixed-point coding for each network layer, the first activation quantity, the first activation quantity gradient and the network weight are coded into integer fixed-point data with a specified bit width, a second activation quantity gradient and a weight gradient output by the network layer are calculated according to the coded first activation quantity, the first activation quantity gradient and the network weight, and the second activation quantity gradient is used as a first activation quantity gradient input into the next network layer for calculation, until the weight gradients of all network layers are calculated;
and adjusting the network weight of each network layer according to the weight gradient of each network layer.
4. The method according to claim 3, wherein the adjusting the network weight of each network layer according to the weight gradient of each network layer comprises:
performing integer fixed point coding on the weight gradient of each network layer, and coding the weight gradient of each network layer into integer fixed point data with a specified bit width;
and calculating the adjusted network weight of each network layer by using a preset optimization algorithm according to the encoded weight gradient of each network layer and the encoded network weight of each network layer.
5. The method according to claim 1, wherein after the calculating a second activation amount output by the network layer according to the encoded first activation amount and the network weight, the method further comprises:
and performing integer fixed point encoding on the second activation quantity, and encoding the second activation quantity into integer fixed point data with a specified bit width.
6. The method according to claim 1, wherein the performing integer fixed point coding on the first activation quantity and the network weight, and coding the first activation quantity and the network weight as integer fixed point data with a specified bit width comprises:
and respectively coding each scalar numerical value in the first activation quantity and the network weight as a product of a parameter value representing a global dynamic range and an integer fixed point value of a specified bit width.
7. The method of claim 6, wherein if the network layer is a convolutional layer, the network weight has a size of C × R × R × N, and the corresponding parameter values are the same for each scalar value in each three-dimensional tensor having a size of C × R × R;
if the network layer is a full connection layer, the size of the network weight is MxN, and the corresponding parameter values are the same for each scalar numerical value in each column vector with the size of 1 xN;
and the parameter values corresponding to the scalar numerical values in the first activation quantity are the same.
8. An apparatus for neural network model training, the apparatus comprising:
the acquisition module is used for acquiring a training sample;
a training module, configured to train a neural network model using the training samples, where the training module, when training the neural network model, respectively executes the following steps for each network layer in the neural network model:
acquiring a first activation quantity input into the network layer and a network weight of the network layer;
performing integer fixed point coding on the first activation quantity and the network weight, and coding the first activation quantity and the network weight into integer fixed point data with a specified bit width;
and calculating a second activation quantity output by the network layer according to the coded first activation quantity and the network weight.
9. The apparatus according to claim 8, wherein the apparatus is applied to a camera; the training sample is a training sample with a specified target; the neural network model is a target detection model used for detecting the specified target;
the device further comprises:
the starting module is used for starting a target detection function;
the judging module is used for judging whether a model on-line training function is started or not;
the training module is specifically configured to:
and if the judgment result of the judgment module is that the model on-line training function is started, training the target detection model by using the training sample with the specified target.
10. The apparatus of claim 8, wherein the training module is specifically configured to:
inputting the training sample into a neural network model, and performing forward operation on the training sample according to the sequence of each network layer in the neural network model from front to back to obtain a forward operation result of the neural network model, wherein when performing the forward operation, for each network layer, performing integer fixed point coding on a first activation quantity input into the network layer and a network weight of the network layer, coding the first activation quantity and the network weight into integer fixed point data with a specified bit width, calculating a second activation quantity output by the network layer according to the coded first activation quantity and the network weight, and taking the second activation quantity as a first activation quantity input into a next network layer for calculation, until a second activation quantity output by a last network layer is determined as a forward operation result;
comparing the forward operation result with a preset nominal value to obtain a loss value;
inputting the loss value into the neural network model, and performing reverse operation on the loss value according to the sequence of each network layer in the neural network model from back to front to obtain a weight gradient of each network layer in the neural network model, wherein during the reverse operation, for each network layer, a first activation quantity gradient and a network weight of the network layer are respectively subjected to integer fixed point coding, the first activation quantity gradient and the network weight are coded into integer fixed point data with a specified bit width, a second activation quantity gradient and a weight gradient output by the network layer are calculated according to the coded first activation quantity, the first activation quantity gradient and the network weight, and the second activation quantity gradient is used as a first activation quantity gradient input into the next network layer for calculation, until the weight gradients of all network layers are calculated;
and adjusting the network weight of each network layer according to the weight gradient of each network layer.
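The forward/reverse flow of claim 10, reduced to a single toy full connection layer: quantize the input activation and weight, compute the output activation, then quantize the activation quantity gradient on the way back and form the weight gradient from the encoded operands. The squared-error loss, absence of bias, and all names are illustrative assumptions, not the patent's method.

```python
import numpy as np

def quantize(x, bits=8):
    """Shared-scale integer fixed point encoding (illustrative)."""
    qmax = 2 ** (bits - 1) - 1
    scale = max(float(np.max(np.abs(x))), 1e-12) / qmax
    return scale, np.clip(np.round(x / scale), -qmax - 1, qmax)

def train_step(x, w, target, bits=8):
    """One forward + reverse pass over a single fully connected layer,
    with activations, weights, and gradients integer fixed point encoded."""
    # forward: encode first activation and weight, compute second activation
    sx, qx = quantize(x, bits)
    sw, qw = quantize(w, bits)
    y = (sx * sw) * (qx @ qw)            # second activation quantity
    loss = 0.5 * np.sum((y - target) ** 2)
    # reverse: encode the activation quantity gradient, then propagate
    sg, qg = quantize(y - target, bits)  # dL/dy, encoded
    g_w = (sx * sg) * np.outer(qx, qg)   # weight gradient for this layer
    g_x = (sw * sg) * (qw @ qg)          # gradient handed to the previous layer
    return loss, g_w, g_x
```

With `x` of shape `(in,)` and `w` of shape `(in, out)`, `g_w` matches `w`'s shape and `g_x` matches `x`'s, so layers chain back to front exactly as the claim describes.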
11. The apparatus according to claim 10, wherein the training module, when configured to adjust the network weights of the network layers according to the weight gradients of the network layers, is specifically configured to:
performing integer fixed point coding on the weight gradient of each network layer, and coding the weight gradient of each network layer into integer fixed point data with a specified bit width;
and calculating the adjusted network weight of each network layer by using a preset optimization algorithm according to the encoded weight gradient of each network layer and the encoded network weight of each network layer.
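Claim 11's update — encode the weight gradient, then let a preset optimization algorithm consume the encoded values — could look like this, with SGD-plus-momentum standing in for the unspecified optimizer; everything here is an illustrative assumption.

```python
import numpy as np

def quantize(g, bits=8):
    """Shared-scale integer fixed point encoding of a gradient (illustrative)."""
    qmax = 2 ** (bits - 1) - 1
    scale = max(float(np.max(np.abs(g))), 1e-12) / qmax
    return scale, np.clip(np.round(g / scale), -qmax - 1, qmax)

def adjust_weights(w, grad, velocity, lr=0.01, momentum=0.9, bits=8):
    """Adjust one network layer's weight from its encoded weight gradient."""
    scale, q = quantize(grad, bits)               # encoded weight gradient
    velocity = momentum * velocity - lr * (scale * q)
    return w + velocity, velocity
```

The optimizer only ever sees `scale * q`, so the update itself can be carried out on the compact encoded form of the gradient.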
12. The apparatus of claim 8, wherein the training module is further configured to:
and performing integer fixed point encoding on the second activation quantity, and encoding the second activation quantity into integer fixed point data with a specified bit width.
13. The apparatus according to claim 8, wherein the training module, when configured to perform integer fixed point coding on the first activation quantity and the network weight, and encode the first activation quantity and the network weight as integer fixed point data having a specified bit width, is specifically configured to:
and respectively coding each scalar numerical value in the first activation quantity and the network weight as a product of a parameter value representing a global dynamic range and an integer fixed point value of a specified bit width.
14. The apparatus of claim 13, wherein if the network layer is a convolutional layer, the network weight has a size of C x R x R x N, and the corresponding parameter values are the same for each scalar value in each three-dimensional tensor of size C x R x R;
if the network layer is a full connection layer, the size of the network weight is M x N, and the corresponding parameter values are the same for each scalar numerical value in each column vector with the size of 1 x N;
and the parameter values corresponding to the scalar numerical values in the first activation quantity are the same.
CN201910808066.6A 2019-08-29 2019-08-29 Neural network model training method and device Pending CN112446461A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910808066.6A CN112446461A (en) 2019-08-29 2019-08-29 Neural network model training method and device
PCT/CN2020/111912 WO2021037174A1 (en) 2019-08-29 2020-08-27 Neural network model training method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910808066.6A CN112446461A (en) 2019-08-29 2019-08-29 Neural network model training method and device

Publications (1)

Publication Number Publication Date
CN112446461A true CN112446461A (en) 2021-03-05

Family

ID=74685187

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910808066.6A Pending CN112446461A (en) 2019-08-29 2019-08-29 Neural network model training method and device

Country Status (2)

Country Link
CN (1) CN112446461A (en)
WO (1) WO2021037174A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106575379A (en) * 2014-09-09 2017-04-19 英特尔公司 Improved fixed point integer implementations for neural networks
CN109902745A (en) * 2019-03-01 2019-06-18 Chengdu Kangqiao Electronics Co., Ltd. CNN-based low-precision training and 8-bit integer quantization inference method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109934331B * 2016-04-29 2020-06-19 Cambricon Technologies Corporation Limited Apparatus and method for performing artificial neural network forward operations
CN110096968B (en) * 2019-04-10 2023-02-07 西安电子科技大学 Ultra-high-speed static gesture recognition method based on depth model optimization

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WU, SHUANG et al.: "Training and Inference with Integers in Deep Neural Networks", ICLR 2018, pages 1-14 *

Also Published As

Publication number Publication date
WO2021037174A1 (en) 2021-03-04

Similar Documents

Publication Publication Date Title
US10713818B1 (en) Image compression with recurrent neural networks
US20240119286A1 (en) Adaptive artificial neural network selection techniques
US20190130255A1 (en) Method and apparatus for generating fixed-point type neural network
CN106897254B (en) Network representation learning method
JP2019528502A (en) Method and apparatus for optimizing a model applicable to pattern recognition and terminal device
CN112149797B (en) Neural network structure optimization method and device and electronic equipment
KR20190034985A (en) Method and apparatus of artificial neural network quantization
WO2022027937A1 (en) Neural network compression method, apparatus and device, and storage medium
CN111105017B (en) Neural network quantization method and device and electronic equipment
CN112508125A (en) Efficient full-integer quantization method of image detection model
CN113132723B (en) Image compression method and device
CN111401550A (en) Neural network model quantification method and device and electronic equipment
WO2021042857A1 (en) Processing method and processing apparatus for image segmentation model
CN110084250B (en) Image description method and system
CN110647974A (en) Network layer operation method and device in deep neural network
CN114698395A (en) Quantification method and device of neural network model, and data processing method and device
WO2022246986A1 (en) Data processing method, apparatus and device, and computer-readable storage medium
CN110874635A (en) Deep neural network model compression method and device
CN112561050B (en) Neural network model training method and device
CN112446461A (en) Neural network model training method and device
RU62314U1 (en) FORMAL NEURON
CN111091495A (en) High-resolution compressive sensing reconstruction method for laser image based on residual error network
CN111916049B (en) Voice synthesis method and device
CN115906941B (en) Neural network adaptive exit method, device, equipment and readable storage medium
CN116030537A (en) Three-dimensional human body posture estimation method based on multi-branch attention-seeking convolution

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination