WO2021037174A1 - Neural network model training method and device - Google Patents

Neural network model training method and device

Info

Publication number
WO2021037174A1
Authority
WO
WIPO (PCT)
Prior art keywords
network
weight
network layer
activation
activation amount
Prior art date
Application number
PCT/CN2020/111912
Other languages
English (en)
French (fr)
Inventor
张渊
谢迪
浦世亮
Original Assignee
杭州海康威视数字技术股份有限公司
Priority date
Filing date
Publication date
Application filed by 杭州海康威视数字技术股份有限公司 filed Critical 杭州海康威视数字技术股份有限公司
Publication of WO2021037174A1 publication Critical patent/WO2021037174A1/zh

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent

Definitions

  • This application relates to the field of machine learning technology, and in particular to a neural network model training method and device.
  • As an emerging field in machine learning research, deep neural networks analyze data by imitating the mechanism of the human brain; they are intelligent models that analyze and learn by establishing and simulating the human brain.
  • Deep neural networks, such as convolutional neural networks, recurrent neural networks, and long short-term memory networks, have been well applied in many types of data processing technology, for example the detection and segmentation of target objects in images and behavior detection and recognition in the field of video image processing, and voice recognition in the field of audio data processing.
  • At present, neural network model training usually uses single-precision floating-point data for operations. Because single-precision floating-point data has a relatively high bit width and the amount of data involved in calculations is relatively large, high hardware resource overhead is required to run the neural network model.
  • the purpose of the embodiments of the present application is to provide a neural network model training method and device, so as to reduce the hardware resource overhead required to run the neural network model.
  • the specific technical solutions are as follows:
  • In a first aspect, an embodiment of the present application provides a neural network model training method, which includes: obtaining training samples; and using the training samples to train the neural network model, where during training the following is performed separately for each network layer: the first activation amount input to the network layer and the network weight of the network layer are obtained, integer fixed-point encoding is performed on the first activation amount and the network weight to encode them into integer fixed-point data with a specified bit width, and the second activation amount output by the network layer is calculated according to the encoded first activation amount and network weight.
  • In a second aspect, an embodiment of the present application provides a neural network model training device, which includes:
  • an acquisition module, used to acquire training samples; and
  • a training module, used to train the neural network model by using the training samples.
  • When the training module trains the neural network model, it performs the following steps for each network layer in the neural network model: obtaining the first activation amount input to the network layer and the network weight of the network layer; performing integer fixed-point encoding on the first activation amount and the network weight, encoding them into integer fixed-point data with a specified bit width; and calculating the second activation amount output by the network layer according to the encoded first activation amount and network weight.
  • In a third aspect, an embodiment of the present application provides a computer device, including a processor and a machine-readable storage medium. The machine-readable storage medium stores machine-executable instructions that can be executed by the processor, and the machine-executable instructions prompt the processor to implement the method provided in the first aspect of the embodiments of the present application.
  • In a fourth aspect, an embodiment of the present application provides a machine-readable storage medium storing machine-executable instructions that, when called and executed by a processor, implement the method provided in the first aspect of the embodiments of the present application.
  • In a fifth aspect, an embodiment of the present application provides a computer program product which, when run, executes the method provided in the first aspect of the embodiments of the present application.
  • In the neural network model training method and device provided by the embodiments of the present application, training samples are obtained and used to train the neural network model.
  • When training the neural network model, the following is performed separately for each network layer in the neural network model: obtain the first activation amount input to the network layer and the network weight of the network layer; perform integer fixed-point encoding on the first activation amount and the network weight, encoding them into integer fixed-point data with a specified bit width; and calculate the second activation amount output by the network layer according to the encoded first activation amount and network weight.
  • When training the neural network model, integer fixed-point encoding is performed on the first activation amount of each network layer and the network weight of each network layer, so the encoded first activation amounts and network weights are integer fixed-point data with a specified bit width; when performing operations, the matrix multiplications, matrix additions and other operations involved all use the integer fixed-point format.
  • Since the bit width of integer fixed-point data is significantly less than that of single-precision floating-point data, this can greatly reduce the hardware resource overhead required to run the neural network model.
  • FIG. 1 is a schematic flowchart of a neural network model training method according to an embodiment of the application
  • FIG. 2 is a schematic diagram of a neural network model training process according to an embodiment of the application;
  • FIG. 3 is a schematic diagram of the execution process for each network layer in the neural network model in the process of training the neural network model according to an embodiment of the application;
  • FIG. 4 is a schematic diagram of a tensor space structure corresponding to a four-dimensional tensor convolution kernel with a size of C ⁇ R ⁇ R ⁇ N according to an embodiment of the application;
  • FIG. 5 is a schematic diagram of an encoding method of each scalar value in a three-dimensional tensor with a size of C ⁇ R ⁇ R according to an embodiment of the application;
  • FIG. 6 is a schematic diagram of a tensor space structure corresponding to a two-dimensional matrix with a size of M ⁇ N according to an embodiment of the application;
  • FIG. 7 is a schematic diagram of an encoding method of each scalar value in a column vector with a size of 1 ⁇ N according to an embodiment of the application;
  • FIG. 8 is a schematic diagram of the encoding method of each scalar value in the activation amount and the activation amount gradient three-dimensional tensor according to an embodiment of the application;
  • FIG. 9 is a schematic flowchart of a method for training a target detection model applied to a camera according to an embodiment of the application.
  • FIG. 10 is a schematic structural diagram of a neural network model training device according to an embodiment of the application.
  • FIG. 11 is a schematic structural diagram of a computer device according to an embodiment of the application.
  • embodiments of the present application provide a neural network model training method, device, computer equipment, and machine-readable storage medium.
  • the neural network model training method provided by the embodiments of the present application is first introduced.
  • The execution subject of the neural network model training method provided by the embodiments of the application may be a computer device with a neural network model training function, or a computer device that implements functions such as target detection and segmentation, behavior detection and recognition, or speech recognition; it may also be a camera with functions such as target detection and segmentation or behavior detection and recognition, or a microphone with a voice recognition function, and so on.
  • the execution body includes at least a core processing chip with data processing capabilities.
  • The manner of implementing the neural network training method provided by the embodiment of the present application may be at least one of software, a hardware circuit, and a logic circuit provided in the execution body.
  • the neural network model training method provided by this embodiment of the application may include the following steps.
  • S101: Obtain training samples. Different models to be trained require different training samples. For example, when training a detection model for moving target detection, the collected training samples are sample images containing moving targets; when training a recognition model for vehicle model recognition, the collected training samples are sample images containing vehicles of different models; and when training a recognition model for speech recognition, the collected training samples are audio sample data.
  • S102 Use the training samples to train the neural network model.
  • Input the training samples into the neural network model, use the BP (Back Propagation) algorithm or another model training algorithm to perform operations on the training samples, compare the operation results with the set nominal value, and adjust the network weights of the neural network model based on the comparison result.
  • By inputting different training samples into the neural network model in turn, performing the above steps iteratively, and continuously adjusting the network weights, the output of the neural network model gets closer and closer to the nominal value, until the difference between the output of the neural network model and the nominal value is small enough (less than a preset threshold), or the output of the neural network model converges, at which point training of the neural network model is considered complete.
  • When performing the reverse operation, the matrix multiplication operation dW_i = dY_i * Y_{i-1} is mainly involved. Here, forward operation refers to operating in order from the first network layer, from front to back;
  • reverse operation refers to operating in order from the last network layer, from back to front.
  • W_i represents the network weight of the i-th network layer, such as a convolutional layer parameter or a fully connected layer parameter;
  • Y_i represents the activation amount input to or output by the i-th network layer;
  • dW_i represents the weight gradient corresponding to the i-th network layer;
  • dY_i represents the activation gradient input to the i-th network layer;
  • 1 ≤ i ≤ k, where k is the total number of network layers.
  • Specifically, the training sample X is input into the neural network model; through the forward operation of the neural network model, the k network layers perform convolution operations from front to back to obtain the model output Y_k; the model output is compared with the nominal value through the loss function to obtain the loss value dY_k; then, through the reverse operation of the neural network model, the k network layers perform convolution and matrix multiplication operations from back to front to obtain the weight gradient corresponding to each network layer, and the network weights are adjusted according to the weight gradients. After a continuous iterative process, the output of the neural network model gets closer and closer to the nominal value.
  • In the process of training the neural network model, for each network layer in the neural network model, each step shown in FIG. 3 needs to be performed respectively.
  • S301: Obtain the first activation amount input to the network layer and the network weight of the network layer.
  • In the forward operation, the first activation amount input to the i-th network layer is Y_i; in the reverse operation, the first activation amount input to the i-th network layer is dY_i.
  • S302 Perform integer fixed-point encoding on the first activation amount and network weight, and encode the first activation amount and network weight into integer fixed-point data with a specified bit width.
  • That is, integer fixed-point encoding is performed on the first activation amount Y_i (or dY_i) input to the network layer and on the network weight W_i of the network layer.
  • Before integer fixed-point encoding the data is in floating-point format, and after encoding the data is in integer fixed-point format.
  • S302 may specifically be: encoding each scalar value in the first activation amount and the network weight as the product of a parameter value representing the global dynamic range and an integer fixed-point value with the specified bit width.
  • In the integer fixed-point value ip and the parameter value sp calculated by formulas (1) and (2):
  • s is the sign bit of the binary number x, taking the value 0 or 1;
  • x_i is the i-th bit of the binary number x, taking the value 0 or 1.
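  • The exact formulas (1) and (2) are not reproduced in this excerpt. Purely as a hedged illustration of the sp × ip decomposition described above, a common symmetric max-abs scheme (the scale rule here is an assumption, not the patent's exact formula) can be sketched as follows:

```python
import numpy as np

def fixed_point_encode(x, bit_width=8):
    # qmax is the largest magnitude representable by a signed integer
    # of the given bit width, e.g. 127 for 8 bits
    qmax = 2 ** (bit_width - 1) - 1
    # sp: one shared parameter capturing the global dynamic range
    sp = float(np.max(np.abs(x))) / qmax
    if sp == 0.0:
        sp = 1.0
    # ip: per-scalar integer fixed-point values of the specified bit width
    ip = np.clip(np.round(x / sp), -qmax, qmax).astype(np.int32)
    return sp, ip

sp, ip = fixed_point_encode(np.array([0.5, -1.0, 0.25]))
x_hat = sp * ip  # decoded approximation: each scalar is sp * ip
```

  • Decoding is simply the product sp × ip, so the quantization error of each scalar is bounded by about half of sp.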
  • If the network layer is a convolutional layer, the size of the network weight is C×R×R×N, and for the scalar values in each three-dimensional tensor of size C×R×R, the corresponding parameter value is the same; if the network layer is a fully connected layer, the size of the network weight is M×N, and for the scalar values in each column vector of size 1×N, the corresponding parameter value is the same; the parameter values corresponding to the scalar values in the first activation amount are all the same.
  • W_i is the network weight corresponding to the i-th layer of the neural network model, where the type of the network layer is a convolutional layer or a fully connected layer. If the i-th layer is a convolutional layer, W_i is a four-dimensional tensor convolution kernel of size C×R×R×N, and the corresponding tensor space structure is shown in Figure 4, where C represents the dimension of the convolution kernel in the input channel direction, R represents the spatial dimension of the convolution kernel, and N represents the dimension of the convolution kernel in the output channel direction. Each scalar value w in each three-dimensional tensor W_i^p of size C×R×R can be expressed as the product w = sp × ip (formula (3)).
  • All scalar values w in each three-dimensional tensor W_i^p share one sp, and each scalar value w corresponds to its own integer fixed-point value ip, where 1 ≤ p ≤ N.
  • the encoding method of each scalar value in a three-dimensional tensor of size C ⁇ R ⁇ R is shown in Figure 5.
  • Each scalar value corresponds to one ip (ip1, ip2, ip3 in Figure 5), and all scalar values in a three-dimensional tensor share one sp.
  • the calculation methods of ip and sp are as formulas (1) and (2), which will not be repeated here.
  • If the i-th layer is a fully connected layer, W_i is a two-dimensional matrix of size M×N, and the corresponding tensor space structure is shown in Figure 6.
  • Specifically, the matrix can be divided into the following structure: the two-dimensional matrix of size M×N is divided into M column vectors, each of size 1×N.
  • Each scalar value w in each column vector W_i^q of size 1×N is expressed by the above formula (3), where 1 ≤ q ≤ M.
  • All scalar values in each column vector W_i^q share one sp, and each scalar value w corresponds to an integer fixed-point value ip.
  • In this case, the encoding method of each scalar value in a column vector of size 1×N is shown in FIG. 7, where the calculation methods of ip and sp are as in formulas (1) and (2) and are not repeated here.
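  • To make the sharing granularity concrete: for a convolutional layer, each C×R×R sub-tensor (one per output channel) shares a single sp while every scalar keeps its own ip. A minimal numpy sketch, assuming an (N, C, R, R) memory layout and a max-abs scale rule (both are assumptions for illustration, since formulas (1) and (2) are not reproduced here):

```python
import numpy as np

def encode_conv_weight(w, bit_width=8):
    # w has shape (N, C, R, R): N output channels, each a C x R x R tensor
    qmax = 2 ** (bit_width - 1) - 1
    n = w.shape[0]
    # one sp per output channel: all scalars of W_i^p share it
    sp = np.abs(w).reshape(n, -1).max(axis=1) / qmax
    sp = np.where(sp == 0.0, 1.0, sp)
    # each scalar gets its own integer fixed-point value ip
    ip = np.round(w / sp[:, None, None, None]).astype(np.int32)
    return sp, ip

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 3, 3, 3))
sp, ip = encode_conv_weight(w)
w_hat = sp[:, None, None, None] * ip  # per-channel decoded approximation
```

  • A fully connected weight would be handled the same way, with one sp per 1×N column vector instead of one per output channel.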
  • Y_i and dY_i are the activation amount and activation gradient corresponding to the i-th layer of the neural network model; each is a three-dimensional tensor of size C×H×W. Each scalar value y or dy in the three-dimensional tensor Y_i or dY_i can likewise be expressed as the product sp × ip.
  • Each three-dimensional tensor Y_i or dY_i shares one sp, and each scalar value y or dy corresponds to an integer fixed-point value ip.
  • the encoding method of each scalar value in the activation and activation gradient three-dimensional tensor is shown in Figure 8.
  • Each scalar value corresponds to one ip (ip1, ip2, ip3 in Figure 8), and all scalar values in a three-dimensional tensor share one sp.
  • the calculation methods of ip and sp are as formulas (1) and (2), which are not repeated here.
  • S303 Calculate the second activation amount output by the network layer according to the encoded first activation amount and the network weight.
  • In this way, each scalar value in the first activation amount and the network weight is encoded into an integer fixed-point value, so that the operations involved in the forward and reverse operations that consume the most computing resources, such as convolution operations and matrix multiplication operations, change from floating-point operations to integer fixed-point operations, greatly improving the training efficiency of neural networks on hardware platforms.
  • Specifically, for each network layer, the first activation amount (for the first network layer of the neural network model, the first activation amount is the training sample input to the neural network model; for the other network layers, the first activation amount is the input of the network layer) and the network weight of the network layer are obtained; integer fixed-point encoding is performed on the first activation amount and the network weight, encoding them into integer fixed-point data with a specified bit width; the encoded first activation amount is input into the network layer, and the network layer uses the encoded network weight to perform a convolution operation on the encoded first activation amount to obtain the second activation amount output by the network layer.
  • If the network layer is not the last network layer, the second activation amount is used as the first activation amount to be input to the next network layer.
  • S102 may be specifically implemented through the following steps:
  • The first step is to input the training samples into the neural network model, and perform forward operations on the training samples in the order of the network layers in the neural network model from front to back to obtain the forward operation result of the neural network model.
  • When performing the forward operation, for each network layer, integer fixed-point encoding is performed on the first activation amount input to the network layer and the network weight of the network layer, encoding them into integer fixed-point data with a specified bit width; the second activation amount output by the network layer is calculated according to the encoded first activation amount and network weight, and is used as the first activation amount input to the next network layer, until the second activation amount output by the last network layer is determined as the forward operation result.
  • the second step is to compare the result of the forward operation with the preset nominal value to obtain the loss value.
  • The third step is to input the loss value into the neural network model, and perform a reverse operation on the loss value in the order of the network layers from back to front to obtain the weight gradient of each network layer in the neural network model.
  • When performing the reverse operation, for each network layer, integer fixed-point encoding is performed on the first activation amount, the first activation gradient input to the network layer, and the network weight of the network layer, encoding them into integer fixed-point data with a specified bit width; the second activation gradient output by the network layer and the weight gradient are calculated according to the encoded first activation amount, first activation gradient and network weight, and the second activation gradient is used as the first activation gradient input to the next network layer, until the weight gradients of all network layers are calculated.
  • the fourth step is to adjust the network weight of each network layer according to the weight gradient of each network layer.
  • The process from the first step to the fourth step is the calculation process of the BP algorithm, and training of the neural network model is realized by performing these four steps in a continuous loop.
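  • The four-step loop can be sketched end to end on a toy one-layer model. Everything here (the toy least-squares objective, the learning rate, and the max-abs encoder) is illustrative scaffolding rather than the patent's exact procedure; the point is only the shape of the loop: encode, forward, compare, encode gradients, reverse, adjust.

```python
import numpy as np

def encode(x, bit_width=8):
    # quantize-dequantize stand-in for integer fixed-point encoding:
    # one shared scale sp for the whole tensor (max-abs rule assumed)
    qmax = 2 ** (bit_width - 1) - 1
    sp = max(float(np.abs(x).max()) / qmax, 1e-12)
    return sp * np.round(x / sp)

rng = np.random.default_rng(1)
X = rng.standard_normal((64, 3))      # training samples
w_true = np.array([0.5, -1.2, 0.3])
t = X @ w_true                        # nominal values
w = np.zeros(3)                       # network weight to be trained

for _ in range(200):
    y = encode(X) @ encode(w)         # first step: encoded forward operation
    dy = y - t                        # second step: compare with nominal value
    dw = encode(X.T @ dy / len(t))    # third step: reverse, dW = dY * Y, encoded
    w = w - 0.1 * dw                  # fourth step: adjust the network weight
```

  • Despite every operand passing through the narrow encoder, the weight still converges to the neighborhood of the true value, which is the behavior the text claims for fixed-point training.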
  • For the weight gradient dW_i, through the above integer fixed-point encoding, the above floating-point operations become integer fixed-point operations, where:
  • YB, WB, dYB and dWB are integer bit-width values;
  • f32() and int() indicate the 32-bit floating-point format and the integer fixed-point format, respectively.
  • The fourth step described above can be specifically implemented by the following steps: performing integer fixed-point encoding on the weight gradient of each network layer, encoding the weight gradient of each network layer into integer fixed-point data with a specified bit width; then, according to the encoded weight gradient of each network layer and the encoded network weight of each network layer, using a preset optimization algorithm to calculate the adjusted network weight of each network layer.
  • Similarly, the weight gradient can be encoded; the specific encoding process can refer to the process of encoding the network weights above, and will not be repeated here.
  • After the weight gradient is obtained, the network weights need to be adjusted based on the weight gradient; the adjustment process mainly involves matrix addition.
  • Optimization algorithms such as SGD (Stochastic Gradient Descent) are used, with the network weights converted from floating-point format to integer fixed-point format. Taking the SGD optimization algorithm as an example, the transformation of the network weights is shown in formulas (9) to (11).
  • int_WB(W) = int_WB(W_old) − int_ηB(η_w) × [int_mB(m) × int_dWB(dW_old) + int_εB(ε) × int_dWB(dW)]    (10)
  • dW is the weight gradient of the network layer at the current moment
  • dW old is the weight gradient of the network layer at the previous moment
  • W is the network weight of the network layer at the current moment
  • η_w, ε and m are training hyperparameters (which can be set).
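  • A hedged sketch of the momentum-style update around formulas (9) to (11). The formulas are not fully legible in this excerpt, so the update form below (momentum accumulation of the previous and current weight gradients, followed by a learning-rate step, with every operand passed through the integer fixed-point encoder) is a plausible reading rather than a verbatim reproduction; `encode` stands in for the int_B(·) operators:

```python
import numpy as np

def encode(x, bit_width=8):
    # stand-in for the int_B(.) operators: quantize to the given bit
    # width and return the dequantized value sp * ip (max-abs rule assumed)
    qmax = 2 ** (bit_width - 1) - 1
    sp = max(float(np.max(np.abs(x))) / qmax, 1e-12)
    return sp * np.round(np.asarray(x, dtype=float) / sp)

def sgd_momentum_step(w, dw, dw_old, eta_w=0.1, eps=1.0, m=0.9):
    # momentum accumulation in the spirit of formula (10): combine the
    # previous and current weight gradients, all in encoded form
    dw_new = encode(m * encode(dw_old) + eps * encode(dw))
    # weight step: move the encoded weight against the accumulated gradient
    w_new = encode(encode(w) - eta_w * dw_new)
    return w_new, dw_new

w = np.array([0.5, -0.5])
dw = np.array([0.2, -0.1])
w_new, dw_new = sgd_momentum_step(w, dw, dw_old=np.zeros(2))
```

  • The hyperparameter names eta_w, eps and m mirror η_w, ε and m in the text; their default values here are arbitrary.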
  • After calculating the second activation amount output by the network layer, the neural network model training method provided in the embodiment of the present application may further perform: integer fixed-point encoding on the second activation amount, encoding the second activation amount into integer fixed-point data with a specified bit width.
  • After operations such as convolution, the bit width of the resulting integer fixed-point data generally becomes longer, and the longer bit width may reduce calculation efficiency.
  • Therefore, the calculated second activation amount can be subjected to integer fixed-point encoding again, with the purpose of reducing its bit width so that it meets the calculation requirements of the next network layer.
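  • Concretely, after a layer's integer multiply-accumulate, the accumulator is wider than the operands (for example, products of two 8-bit operands are summed into a 32-bit accumulator), which is why the second activation amount is re-encoded. A minimal requantization sketch under the same assumed max-abs scheme:

```python
import numpy as np

def requantize(acc, acc_scale, bit_width=8):
    # acc is a wide integer accumulator; its real value is acc_scale * acc.
    # Re-encode it as sp * ip with the narrower specified bit width.
    qmax = 2 ** (bit_width - 1) - 1
    y = acc_scale * acc.astype(np.float64)        # decode to real values
    sp = max(float(np.abs(y).max()) / qmax, 1e-12)
    ip = np.clip(np.round(y / sp), -qmax, qmax).astype(np.int32)
    return sp, ip

acc = np.array([50_000, -120_000, 7_500], dtype=np.int64)  # wide sums
sp, ip = requantize(acc, acc_scale=1e-4)
y_hat = sp * ip   # second activation amount, back at the specified bit width
```

  • The re-encoded sp × ip pair is what the next network layer consumes as its first activation amount.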
  • In summary, training samples are obtained, and the training samples are used to train the neural network model.
  • When training the neural network model, the following is performed separately for each network layer in the neural network model: obtain the first activation amount input to the network layer and the network weight of the network layer; perform integer fixed-point encoding on the first activation amount and the network weight, encoding them into integer fixed-point data with a specified bit width; and calculate the second activation amount output by the network layer according to the encoded first activation amount and network weight.
  • the following describes the neural network model training method of the embodiment of the present application in combination with a specific scene of target recognition from an image.
  • the target recognition model includes three convolutional layers and a fully connected layer, and each network layer is set with an initial network weight.
  • The sample image is input into the neural network model to obtain the output result of the model, specifically including the following steps:
  • A. Use the first convolutional layer as the current network layer, and use the pixel value of each pixel in the sample image as the first activation of the first convolutional layer;
  • B. Perform integer fixed-point encoding on the first activation amount, encoding it into integer fixed-point data with a specified bit width; obtain the network weight of the current network layer and perform integer fixed-point encoding on it, encoding it into integer fixed-point data with a specified bit width; input the encoded first activation amount into the current network layer, and the current network layer uses the encoded network weight to perform the current layer's convolution operation on the encoded first activation amount to obtain the second activation amount output by the current network layer;
  • C. Use the second activation amount output by the current layer as the first activation amount of the next network layer, and return to step B, until the last network layer, that is, the fully connected layer, outputs its second activation amount; the second activation amount output by the fully connected layer is used as the output result of the target recognition model.
  • The output result of the target recognition model is compared with the marked target information to obtain the loss value; then, following the reverse of the above process, convolution operations and matrix multiplication operations are performed sequentially from back to front to obtain the weight gradient corresponding to each network layer, and the network weights are adjusted according to the weight gradients. After a continuous iterative process, training of the target recognition model is realized.
  • the above neural network model training method is mainly suitable for edge devices with limited resources, such as cameras.
  • the camera's intelligent inference functions mainly include target detection and recognition.
  • The following takes target detection as an example to introduce the training method of the target detection model deployed on the camera, as shown in Figure 9, which mainly includes the following steps:
  • S901: Start the target detection function. The camera can turn on the target detection function based on the user's selection when target detection is required according to the actual needs of the user.
  • S902 Judge whether to start the model online training function, if yes, execute S903, otherwise, wait to start the model online training function.
  • Before using the target detection model for target detection, the target detection model needs to be trained. Whether to perform online training can be selected by the user. Normally, the camera trains the target detection model following the steps of the embodiment shown in Figure 1 only after the model online training function has been started.
  • S903 Train the target detection model by using the acquired training sample with the specified target.
  • the training sample input to the target detection model is a sample image containing the specified target, so that the trained target detection model can detect the specified target.
  • the specific method of training the target detection model is the same as the method of training the neural network model in the embodiment shown in FIG. 3, and will not be repeated here.
  • In this way, the first activation amount input to each network layer and the network weight of each network layer undergo integer fixed-point encoding during the training process, so the encoded first activation amounts and network weights are integer fixed-point data with a specified bit width; when performing operations, the matrix multiplications, matrix additions and other operations involved all use the integer fixed-point format.
  • Since the bit width of integer fixed-point data is significantly less than that of single-precision floating-point data, the hardware resource overhead of the camera can be greatly reduced. Online training of the target detection model is performed on the camera, so that the camera can have a scene-adaptive capability.
  • an embodiment of the present application provides a neural network model training device.
  • the device may include:
  • the obtaining module 1010 is used to obtain training samples
  • the training module 1020 is used to train the neural network model by using the training samples.
  • When training the neural network model, the training module 1020 performs the following steps for each network layer in the neural network model: obtain the first activation amount input to the network layer and the network weight of the network layer; perform integer fixed-point encoding on the first activation amount and the network weight, encoding them into integer fixed-point data with a specified bit width; and calculate the second activation amount output by the network layer according to the encoded first activation amount and network weight.
  • the device is applied to a camera;
  • the training sample is a training sample with a designated target;
  • the neural network model is a target detection model used to detect the designated target;
  • the device may also include:
  • the judgment module is used to judge whether to start the online training function of the model
  • the training module 1020 can be specifically used to: if the judgment result of the judgment module is to start the model online training function, use the training sample with the specified target to train the target detection model.
  • In an embodiment, the training module 1020 can be specifically used to: input the training samples into the neural network model, and perform forward operations on the training samples in the order of the network layers in the neural network model from front to back to obtain the forward operation result of the neural network model, where, when performing the forward operation, for each network layer, integer fixed-point encoding is performed on the first activation amount input to the network layer and the network weight of the network layer, encoding them into integer fixed-point data with a specified bit width, the second activation amount output by the network layer is calculated according to the encoded first activation amount and network weight, and the second activation amount is used as the first activation amount input to the next network layer, until the second activation amount output by the last network layer is determined as the forward operation result; compare the forward operation result with the preset nominal value to obtain the loss value; input the loss value into the neural network model, and perform a reverse operation on the loss value in the order of the network layers in the neural network model from back to front, where, when performing the reverse operation, for each network layer, integer fixed-point encoding is performed on the first activation amount, the first activation gradient input to the network layer, and the network weight of the network layer, encoding them into integer fixed-point data with a specified bit width, the second activation gradient output by the network layer and the weight gradient are calculated according to the encoded first activation amount, first activation gradient and network weight, and the second activation gradient is used as the first activation gradient input to the next network layer, until the weight gradients of all network layers are calculated; and adjust the network weight of each network layer according to the weight gradient of each network layer.
  • When used to adjust the network weight of each network layer according to the weight gradient of each network layer, the training module 1020 may be specifically configured to: perform integer fixed-point encoding on the weight gradient of each network layer, encoding the weight gradient of each network layer into integer fixed-point data with a specified bit width; and calculate the adjusted network weight of each network layer according to the encoded weight gradient of each network layer and the encoded network weight of each network layer, using a preset optimization algorithm.
  • The training module 1020 may also be configured to: perform integer fixed-point encoding on the second activation, encoding the second activation into integer fixed-point data with a specified bit width.
  • When used to perform integer fixed-point encoding on the first activation and the network weight, encoding the first activation and the network weight into integer fixed-point data with a specified bit width, the training module 1020 may be specifically configured to: encode each scalar value in the first activation and the network weight as the product of a parameter value characterizing the global dynamic range and an integer fixed-point value with the specified bit width.
  • If the network layer is a convolution layer, the size of the network weight is C×R×R×N, and for the scalar values in each three-dimensional tensor of size C×R×R, the corresponding parameter value is the same; if the network layer is a fully connected layer, the size of the network weight is M×N, and for the scalar values in each column vector of size 1×N, the corresponding parameter value is the same; the scalar values in the first activation correspond to the same parameter value.
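The shared-scale encoding the module applies, where each scalar becomes ip·sp with one sp shared per group, can be sketched as follows. This is a minimal illustration under assumptions, not the patent's implementation: the helper names `encode_group`/`decode_group`, the parameter `ib`, and the rounding and clamping policy are hypothetical; only the constraint that the scale is a power of two, sp = 2^E, is taken from the description.

```python
import math

def encode_group(values, ib=8):
    """Encode a group of floats that share one scale sp = 2**E
    (per 3-D kernel slice, per 1xN weight vector, or per activation
    tensor in the patent's grouping).  Hypothetical helper."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return [0] * len(values), 1.0
    # Pick E so the largest magnitude fits in a signed ib-bit integer.
    e = math.ceil(math.log2(max_abs / (2 ** (ib - 1) - 1)))
    sp = 2.0 ** e
    lo, hi = -(2 ** (ib - 1)), 2 ** (ib - 1) - 1
    ips = [min(hi, max(lo, round(v / sp))) for v in values]
    return ips, sp

def decode_group(ips, sp):
    """Recover the (quantized) floating-point values: w = ip * sp."""
    return [ip * sp for ip in ips]
```

For values that are exact binary fractions, the round trip is lossless; in general the decode only recovers the quantized approximation.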
  • Training samples are obtained and used to train the neural network model. When training the neural network model, the following is performed separately for each network layer in the neural network model: obtain the first activation input to the network layer and the network weight of the network layer, perform integer fixed-point encoding on the first activation and the network weight, encode the first activation and the network weight into integer fixed-point data with a specified bit width, and calculate the second activation output by the network layer according to the encoded first activation and network weight.
  • Because the first activation of each network layer and the network weight of each network layer are integer fixed-point encoded during training, the encoded first activations and network weights are integer fixed-point data with a specified bit width, and the matrix multiplications, matrix additions and other operations involved are performed in integer fixed-point format.
  • The bit width of integer fixed-point data is significantly smaller than that of single-precision floating-point data; therefore, the hardware resource overhead required to run the neural network model can be greatly reduced.
  • An embodiment of the present application provides a computer device. As shown in FIG. 11, it may include a processor 1101 and a machine-readable storage medium 1102.
  • the machine-readable storage medium 1102 stores machine executable instructions that can be executed by the processor 1101.
  • The processor 1101 is caused by the machine-executable instructions to implement the steps of the neural network model training method described above.
  • The above-mentioned machine-readable storage medium may include RAM (Random Access Memory) and may also include NVM (Non-Volatile Memory), for example at least one disk memory. Optionally, the machine-readable storage medium may also be at least one storage device located remotely from the above-mentioned processor.
  • The above-mentioned processor may be a general-purpose processor, including a CPU (Central Processing Unit), an NP (Network Processor), etc.; it may also be a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field-Programmable Gate Array) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the machine-readable storage medium 1102 and the processor 1101 may transmit data through a wired connection or a wireless connection, and the computer device may communicate with other devices through a wired communication interface or a wireless communication interface. What is shown in FIG. 11 is only an example of data transmission between the processor 1101 and the machine-readable storage medium 1102 through a bus, and is not intended to limit the specific connection manner.
  • The processor 1101 reads the machine-executable instructions stored in the machine-readable storage medium 1102 and runs them to achieve: obtaining training samples and using the training samples to train the neural network model.
  • When training the neural network model, the following is performed separately for each network layer in the neural network model: obtain the first activation input to the network layer and the network weight of the network layer, perform integer fixed-point encoding on the first activation and the network weight, encode them into integer fixed-point data with a specified bit width, and calculate the second activation output by the network layer according to the encoded first activation and network weight.
  • Because the first activation of each network layer and the network weight of each network layer are integer fixed-point encoded during training, the encoded first activations and network weights are integer fixed-point data with a specified bit width, and the matrix multiplications, matrix additions and other operations involved are performed in integer fixed-point format.
  • The bit width of integer fixed-point data is significantly smaller than that of single-precision floating-point data; therefore, the hardware resource overhead required to run the neural network model can be greatly reduced.
  • The embodiments of the present application also provide a machine-readable storage medium that stores machine-executable instructions which, when called and executed by a processor, implement the steps of the neural network model training method described above.
  • The machine-readable storage medium stores machine-executable instructions that, at runtime, execute the neural network model training method provided in the embodiments of this application, and can therefore achieve: obtaining training samples and using the training samples to train the neural network model.
  • When training the neural network model, the following is performed separately for each network layer in the neural network model: obtain the first activation input to the network layer and the network weight of the network layer, perform integer fixed-point encoding on the first activation and the network weight, encode them into integer fixed-point data with a specified bit width, and calculate the second activation output by the network layer according to the encoded first activation and network weight.
  • Because the first activation of each network layer and the network weight of each network layer are integer fixed-point encoded during training, the encoded first activations and network weights are integer fixed-point data with a specified bit width, and the matrix multiplications, matrix additions and other operations involved are performed in integer fixed-point format.
  • The bit width of integer fixed-point data is significantly smaller than that of single-precision floating-point data; therefore, the hardware resource overhead required to run the neural network model can be greatly reduced.
  • the embodiment of the present application also provides a computer program product for executing the steps of the neural network model training method described above at runtime.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium. For example, the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center in a wired manner (for example, coaxial cable, optical fiber, or DSL (Digital Subscriber Line)) or a wireless manner (for example, infrared, radio, or microwave).
  • The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or a data center integrating one or more available media.
  • The usable medium may be a magnetic medium (such as a floppy disk, a hard disk or a magnetic tape), an optical medium (such as a DVD (Digital Versatile Disc)), or a semiconductor medium (such as an SSD (Solid State Disk)), etc.
  • The program may be stored in a computer-readable storage medium; the storage medium referred to here is, for example, a ROM/RAM, a magnetic disk or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Image Analysis (AREA)

Abstract

A neural network model training method and apparatus: training samples are obtained and used to train a neural network model. During training, the first activation input to each network layer and the network weight of each network layer are integer fixed-point encoded, so that the encoded first activations and network weights are integer fixed-point data with a specified bit width; the matrix multiplications, matrix additions and other operations involved in the computation are therefore carried out in integer fixed-point format. Since the bit width of integer fixed-point data is significantly smaller than that of single-precision floating-point data, the hardware resource overhead required to run the neural network model can be greatly reduced.

Description

Neural network model training method and apparatus
This application claims priority to Chinese patent application No. 201910808066.6, filed with the Chinese Patent Office on August 29, 2019 and entitled "Neural network model training method and apparatus", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the technical field of machine learning, and in particular to a neural network model training method and apparatus.
Background
As an emerging area of machine learning research, deep neural networks parse data by imitating the mechanisms of the human brain; they are intelligent models that analyze and learn by building and simulating the human brain. At present, deep neural networks such as convolutional neural networks, recurrent neural networks and long short-term memory networks have been applied successfully in many kinds of data processing. For example, in the field of video and image processing they are widely used for the detection and segmentation of target objects in images and for behavior detection and recognition, and in the field of audio data processing they are widely used for speech recognition.
At present, because the data to be processed, such as image data or audio data, is itself large in volume, neural network model training usually performs its computation with single-precision floating-point data in order to guarantee the convergence accuracy of the model. However, since single-precision floating-point data has a high bit width, a large amount of data takes part in the computation, and running the neural network model therefore requires a high hardware resource overhead.
Summary
The purpose of the embodiments of this application is to provide a neural network model training method and apparatus so as to reduce the hardware resource overhead required to run a neural network model. The specific technical solutions are as follows:
In a first aspect, an embodiment of this application provides a neural network model training method, the method including:
obtaining training samples;
training a neural network model using the training samples, wherein, when training the neural network model, the following steps are performed separately for each network layer in the neural network model:
obtaining a first activation input to the network layer and a network weight of the network layer;
performing integer fixed-point encoding on the first activation and the network weight, encoding the first activation and the network weight into integer fixed-point data with a specified bit width; and
calculating, according to the encoded first activation and network weight, a second activation output by the network layer.
In a second aspect, an embodiment of this application provides a neural network model training apparatus, the apparatus including:
an obtaining module, configured to obtain training samples; and
a training module, configured to train a neural network model using the training samples, wherein, when training the neural network model, the training module performs the following steps separately for each network layer in the neural network model:
obtaining a first activation input to the network layer and a network weight of the network layer;
performing integer fixed-point encoding on the first activation and the network weight, encoding the first activation and the network weight into integer fixed-point data with a specified bit width; and
calculating, according to the encoded first activation and network weight, a second activation output by the network layer.
In a third aspect, an embodiment of this application provides a computer device, including a processor and a machine-readable storage medium, the machine-readable storage medium storing machine-executable instructions executable by the processor, the processor being caused by the machine-executable instructions to implement the method provided in the first aspect of the embodiments of this application.
In a fourth aspect, an embodiment of this application provides a machine-readable storage medium storing machine-executable instructions that, when called and executed by a processor, implement the method provided in the first aspect of the embodiments of this application.
In a fifth aspect, an embodiment of this application provides a computer program product for executing, at runtime, the method provided in the first aspect of the embodiments of this application.
The neural network model training method and apparatus provided by the embodiments of this application obtain training samples and use them to train a neural network model. When training the neural network model, the following is performed separately for each network layer in the neural network model: obtain the first activation input to the network layer and the network weight of the network layer, perform integer fixed-point encoding on the first activation and the network weight, encode the first activation and the network weight into integer fixed-point data with a specified bit width, and calculate the second activation output by the network layer according to the encoded first activation and network weight. Since, during training, the first activation input to each network layer and the network weight of each network layer are integer fixed-point encoded and the encoded first activations and network weights are integer fixed-point data with a specified bit width, the matrix multiplications, matrix additions and other operations involved in the computation are performed in integer fixed-point format. The bit width of integer fixed-point data is significantly smaller than that of single-precision floating-point data; therefore, the hardware resource overhead required to run the neural network model can be greatly reduced.
Brief Description of the Drawings
To explain the technical solutions of the embodiments of this application and of the prior art more clearly, the drawings needed for the embodiments and the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of this application; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a flow diagram of the neural network model training method of an embodiment of this application;
FIG. 2 is a schematic diagram of the neural network model training process of an embodiment of this application;
FIG. 3 is a flow diagram of the steps executed for each network layer in the neural network model during training, according to an embodiment of this application;
FIG. 4 is a schematic diagram of the tensor space structure corresponding to a four-dimensional tensor convolution kernel of size C×R×R×N, according to an embodiment of this application;
FIG. 5 is a schematic diagram of the encoding of each scalar value within a three-dimensional tensor of size C×R×R, according to an embodiment of this application;
FIG. 6 is a schematic diagram of the tensor space structure corresponding to a two-dimensional matrix of size M×N, according to an embodiment of this application;
FIG. 7 is a schematic diagram of the encoding of each scalar value within a column vector of size 1×N, according to an embodiment of this application;
FIG. 8 is a schematic diagram of the encoding of each scalar value within the activation and activation-gradient three-dimensional tensors, according to an embodiment of this application;
FIG. 9 is a flow diagram of the target detection model training method applied to a camera, according to an embodiment of this application;
FIG. 10 is a schematic structural diagram of the neural network model training apparatus of an embodiment of this application;
FIG. 11 is a schematic structural diagram of the computer device of an embodiment of this application.
Detailed Description
To make the purpose, technical solutions and advantages of this application clearer, this application is described in further detail below with reference to the drawings and embodiments. Obviously, the described embodiments are only some of the embodiments of this application, not all of them. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of this application.
To reduce the hardware resource overhead required to run a neural network model, the embodiments of this application provide a neural network model training method, apparatus, computer device and machine-readable storage medium. The neural network model training method provided by the embodiments of this application is introduced first.
The execution subject of the neural network training method provided by the embodiments of this application may be a computer device with a neural network model training function; it may also be a computer device implementing functions such as target detection and segmentation, behavior detection and recognition, or speech recognition, a camera with functions such as target detection and segmentation or behavior detection and recognition, or a microphone with a speech recognition function. The execution subject at least includes a core processing chip with data processing capability. The neural network training method provided by the embodiments of this application may be implemented by at least one of software, a hardware circuit and a logic circuit provided in the execution subject.
As shown in FIG. 1, the neural network model training method provided by the embodiments of this application may include the following steps.
S101: obtain training samples.
When training a neural network, a large number of training samples usually need to be collected, and the collected training samples differ according to the function the neural network model is to implement. For example, to train a detection model for moving-target detection, the collected training samples are sample images containing moving targets; to train a recognition model for vehicle model recognition, the collected training samples are sample images containing vehicles of different models; to train a recognition model for speech recognition, the collected training samples are audio sample data.
S102: train the neural network model using the training samples.
The training samples are input into the neural network model and computed with the BP (Back Propagation) algorithm or another model training algorithm; the computation result is compared with a preset nominal value, and the network weights of the neural network model are adjusted based on the comparison result. By inputting different training samples into the neural network model in turn, iteratively executing the above steps and continually adjusting the network weights, the output of the neural network model approaches the nominal value more and more closely. When the difference between the output of the neural network model and the nominal value is sufficiently small (smaller than a preset threshold), or when the output of the neural network model converges, training of the neural network model is considered complete.
Taking the BP algorithm as an example, the main computation operations and data flow during neural network model training are shown in FIG. 2. In the forward operation, each network layer mainly performs the convolution operation Y_i = W_i * Y_{i-1}; in the backward operation, each network layer mainly performs the convolution operation dY_{i-1} = dY_i * W_i and the matrix multiplication dW_i = dY_i * Y_{i-1}. Here, the forward operation refers to the front-to-back computation order starting from the first network layer, and the backward operation refers to the back-to-front computation order starting from the last network layer; W_i denotes the network weight of the i-th network layer, such as convolution layer parameters or fully connected layer parameters; Y_i denotes the activation input to or output by the i-th network layer; dW_i denotes the weight gradient corresponding to the i-th network layer; dY_i denotes the activation gradient input to the i-th network layer; 1 ≤ i ≤ k, where k is the total number of network layers.
As shown in FIG. 2, in the process of training the neural network model with the BP algorithm, a training sample X is input into the neural network model. Through the forward operation of the model, the k network layers perform convolution operations in order from front to back to obtain the model output Y_k; the output of the model is compared with the nominal value through a loss function to obtain the loss value dY_k; then, through the backward operation of the model, the k network layers perform convolution and matrix multiplication operations in order from back to front to obtain the weight gradient corresponding to each network layer, and the network weights are adjusted according to the weight gradients. Through continuous iteration, the output of the neural network model approaches the nominal value more and more closely.
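The iterative process of FIG. 2 can be illustrated with a toy chain of scalar "layers". This sketch only mirrors the data flow, Y_i = W_i·Y_{i-1} in the forward pass and dY_{i-1} = dY_i·W_i, dW_i = dY_i·Y_{i-1} in the backward pass, using plain floats and a squared-error loss; the function name and learning rate are illustrative, and the integer fixed-point encoding is omitted here.

```python
def train_step(weights, x, target, lr=0.1):
    """One BP iteration over a chain of scalar 'layers' y_i = w_i * y_{i-1},
    a toy stand-in for the per-layer convolutions in FIG. 2."""
    # Forward pass: keep every layer's input activation for the backward pass.
    ys = [x]
    for w in weights:
        ys.append(w * ys[-1])
    # Loss function: compare the output with the nominal value (squared error).
    dy = ys[-1] - target                     # dY_k
    # Backward pass, from the last layer to the first:
    grads = [0.0] * len(weights)
    for i in reversed(range(len(weights))):
        grads[i] = dy * ys[i]                # dW_i = dY_i * Y_{i-1}
        dy = dy * weights[i]                 # dY_{i-1} = dY_i * W_i
    new_weights = [w - lr * g for w, g in zip(weights, grads)]
    return new_weights, ys[-1]
```

Repeating `train_step` over many samples plays the role of the continuous iteration described above.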
In the embodiments of this application, during the training of the neural network model, the steps shown in FIG. 3 need to be performed separately for each network layer in the neural network model.
S301: obtain the first activation input to the network layer and the network weight of the network layer.
During the forward operation, the first activation input to the i-th network layer is Y_i; during the backward operation, the first activation gradient input to the i-th network layer is dY_i.
S302: perform integer fixed-point encoding on the first activation and the network weight, encoding the first activation and the network weight into integer fixed-point data with a specified bit width.
For the i-th network layer, integer fixed-point encoding is performed on the first activations Y_i and dY_i input to the network layer and on the network weight W_i of the network layer; integer fixed-point encoding means encoding data in floating-point format into data in integer fixed-point format.
In one implementation of the embodiments of this application, S302 may specifically be: encoding each scalar value in the first activation and the network weight as the product of a parameter value characterizing the global dynamic range and an integer fixed-point value with the specified bit width.
Specifically, the encoding may encode each scalar value in the first activation and the network weight as the product of a parameter value sp characterizing the global dynamic range and an integer fixed-point value ip with the specified bit width, where sp = 2^E, E is a signed binary number with bit width EB, EB is a preset bit width, ip is a signed binary number with bit width IB, and IB is a bit width set according to the magnitude of the original floating-point data. The integer fixed-point value ip and the parameter value sp are computed as follows:
[Formulas (1) and (2), which define the integer fixed-point value ip and the parameter value sp respectively, appear as images in the original publication and are not reproduced here.]
where s is the sign bit of the binary number x, taking the value 0 or 1, and x_i is the value of the i-th bit of the binary number x, taking the value 0 or 1.
In one implementation of the embodiments of this application, if the network layer is a convolution layer, the size of the network weight is C×R×R×N, and for the scalar values in each three-dimensional tensor of size C×R×R, the corresponding parameter value is the same; if the network layer is a fully connected layer, the size of the network weight is M×N, and for the scalar values in each column vector of size 1×N, the corresponding parameter value is the same; the scalar values in the first activation correspond to the same parameter value.
W_i is the network weight corresponding to the i-th layer of the neural network model, the layer type being a convolution layer or a fully connected layer. If the i-th layer is a convolution layer, W_i is a four-dimensional tensor convolution kernel of size C×R×R×N; the corresponding tensor space structure is shown in FIG. 4, where C is the dimension of the kernel in the input-channel direction, R is the spatial dimension of the kernel, and N is the dimension of the kernel in the output-channel direction. Each scalar value w within each three-dimensional tensor W_i^p of size C×R×R can be expressed as:
w = ip * sp    (3)
where each three-dimensional tensor W_i^p shares one sp and each scalar value w corresponds to one integer fixed-point value ip, with 1 ≤ p ≤ N. The encoding of each scalar value within a three-dimensional tensor of size C×R×R is shown in FIG. 5: each scalar value corresponds to its own ip (such as ip1, ip2, ip3 in FIG. 5), and all the scalar values of one three-dimensional tensor share one sp. ip and sp are computed as in formulas (1) and (2), which will not be repeated here.
Similarly, if the i-th layer is a fully connected layer, W_i is a two-dimensional matrix of size M×N; the corresponding tensor space structure is shown in FIG. 6. The matrix can be partitioned into the following structure: the two-dimensional matrix of size M×N is split into M column vectors of size 1×N. Each scalar value w within each column vector W_i^q of size 1×N is expressed by formula (3) above, with 1 ≤ q ≤ M. Each column vector W_i^q shares one sp and each scalar value w corresponds to one integer fixed-point value ip. The encoding of each scalar value within a column vector of size 1×N is shown in FIG. 7. ip and sp are computed as in formulas (1) and (2), which will not be repeated here.
Y_i and dY_i are the activation and the activation gradient corresponding to the i-th layer of the neural network model; they are three-dimensional tensors of size C×H×W. Each scalar value y or dy within the three-dimensional tensor Y_i or dY_i can be expressed as:
y = ip * sp    (4)
dy = ip * sp    (5)
where each three-dimensional tensor Y_i or dY_i shares one sp and each scalar value y or dy corresponds to one integer fixed-point value ip. The encoding of each scalar value within the activation and activation-gradient three-dimensional tensors is shown in FIG. 8: each scalar value corresponds to its own ip (such as ip1, ip2, ip3 in FIG. 8), and all the scalar values of one three-dimensional tensor share one sp. ip and sp are computed as in formulas (1) and (2), which will not be repeated here.
S303: calculate, according to the encoded first activation and network weight, the second activation output by the network layer.
As described above, every scalar value in the first activation and the network weight has been integer fixed-point encoded, and the encoded values are integer fixed-point values. As a result, the operations that dominate the computational resource overhead of the forward and backward passes, such as convolution and matrix multiplication, change from floating-point operations into integer fixed-point operations, which greatly improves the training efficiency of the neural network on hardware platforms.
Specifically, during the training of the neural network model, for any network layer in the model, the first activation to be input to the network layer is obtained (for the first network layer of the neural network model, the first activation is the training sample input to the model; for the other network layers, the first activation is the input of that layer), as well as the network weight of the network layer; integer fixed-point encoding is performed on the first activation and the network weight, encoding them into integer fixed-point data with a specified bit width; the encoded first activation is input to the network layer, and the network layer performs a convolution operation on the encoded first activation using the encoded network weight to obtain the second activation output by the network layer. If the network layer is not the last network layer, the second activation serves as the first activation to be input to the next network layer.
In one implementation of the embodiments of this application, S102 may specifically be implemented through the following steps.
Step 1: input the training samples into the neural network model and perform a forward operation on the training samples in the front-to-back order of the network layers in the neural network model to obtain the forward operation result of the neural network model, wherein, during the forward operation, for each network layer, integer fixed-point encoding is performed on the first activation input to the network layer and on the network weight of the network layer, the first activation and the network weight are encoded into integer fixed-point data with a specified bit width, the second activation output by the network layer is calculated according to the encoded first activation and network weight, and the second activation is used for calculation as the first activation input to the next network layer, until the second activation output by the last network layer is determined as the forward operation result.
Step 2: compare the forward operation result with a preset nominal value to obtain a loss value.
Step 3: input the loss value into the neural network model and perform a backward operation on the loss value in the back-to-front order of the network layers in the neural network model to obtain the weight gradient of each network layer in the neural network model, wherein, during the backward operation, for each network layer, integer fixed-point encoding is performed on the first activation input to the network layer, the first activation gradient and the network weight of the network layer, the first activation, first activation gradient and network weight are encoded into integer fixed-point data with a specified bit width, the second activation gradient and weight gradient output by the network layer are calculated according to the encoded first activation, first activation gradient and network weight, and the second activation gradient is used for calculation as the first activation gradient input to the next network layer, until the weight gradients of all network layers are calculated.
Step 4: adjust the network weight of each network layer according to the weight gradient of each network layer.
The above steps 1 to 4 constitute the computation process of the BP algorithm; by repeatedly looping through these four steps, the training of the neural network model is achieved. The forward operation computes the second activation Y_i by multiplying the first activation by the network weight, Y_i = W_i * Y_{i-1}; the backward operation computes the second activation gradient dY_{i-1} by multiplying the first activation gradient by the network weight, dY_{i-1} = dY_i * W_i, and computes the weight gradient dW_i by multiplying the first activation gradient by the first activation, dW_i = dY_i * Y_{i-1}. Through the integer fixed-point encoding above, these floating-point operations become integer fixed-point operations:
f32(Y_{k+1}) = f32(Y_k) * f32(W_k) → int_YB(Y_{k+1}) = int_YB(Y_k) * int_WB(W_k)    (6)
f32(dY_{k-1}) = f32(dY_k) * f32(W_k) → int_dYB(dY_{k-1}) = int_dYB(dY_k) * int_WB(W_k)    (7)
f32(dW_k) = f32(dY_k) * f32(Y_{k-1}) → int_dWB(dW_k) = int_dYB(dY_k) * int_YB(Y_{k-1})    (8)
where YB, WB, dYB and dWB are the integer bit-width values, and f32() and int() denote the 32-bit floating-point format and the integer fixed-point format, respectively.
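Formulas (6) to (8) work because the scales factor out of the products: if Y = ip_Y·sp_Y and W = ip_W·sp_W, then Y·W = (ip_Y·ip_W)·(sp_Y·sp_W), so the multiply-accumulate loop runs entirely on integers and only a final rescale touches the scales. A sketch for a matrix-vector product under that assumption (the function name is illustrative; a real kernel would also re-quantize the accumulator to the specified bit width):

```python
def int_matvec(ip_w, ip_y, sp_w, sp_y):
    """Compute W @ Y where W and Y are stored as integer fixed-point
    data with shared scales; the accumulation itself is pure integer."""
    acc = [sum(w * y for w, y in zip(row, ip_y)) for row in ip_w]  # integers
    sp_out = sp_w * sp_y   # scales are powers of two, so this product is exact
    return acc, sp_out
```

Decoding the integer accumulator with the combined scale reproduces the floating-point result of the unquantized product.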
In one implementation of the embodiments of this application, the above step 4 may specifically be implemented through the following steps: perform integer fixed-point encoding on the weight gradient of each network layer, encoding the weight gradient of each network layer into integer fixed-point data with a specified bit width; and calculate the adjusted network weight of each network layer according to the encoded weight gradient of each network layer and the encoded network weight of each network layer, using a preset optimization algorithm.
After the weight gradient of each network layer is calculated, the weight gradient may be encoded; for the specific encoding process, refer to the encoding of network weights described above, which is not repeated here. After encoding, the network weights need to be adjusted based on the weight gradients. The adjustment mainly involves matrix additions; using an optimization algorithm such as SGD (Stochastic Gradient Descent), the network weights can be converted from floating-point format to integer fixed-point format. Taking the SGD optimization algorithm as an example, the conversion of network weights is shown in formulas (9) to (11).
f32(dW) = f32(dW) + f32(λ_w)·f32(W) → int_dWB(dW) = int_dWB(dW) + int_λB(λ_w)·int_WB(W)    (9)
f32(W_old) = f32(m)·f32(dW_old) + f32(η)·f32(dW) → int_WB(W_old) = int_mB(m)·int_dWB(dW_old) + int_ηB(η)·int_dWB(dW)    (10)
f32(W) = f32(W) + f32(W_old) → int_WB(W) = int_WB(W) + int_WB(W_old)    (11)
where dW is the weight gradient of the network layer at the current moment, dW_old is the weight gradient of the network layer at the previous moment, W is the network weight of the network layer at the current moment, and λ_w, η and m are training hyperparameters (which may be preset).
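The three update steps of formulas (9) to (11), weight decay, momentum smoothing, then adding the update term, can be sketched for a single weight as follows. Plain floats are used here so the algebra stays visible; in the patent every operand is integer fixed-point encoded, and the names and sign conventions below follow formulas (9) to (11) literally (including the plus sign in (11)), with illustrative hyperparameter values.

```python
def sgd_update(w, dw, dw_old, lam_w=1e-4, m=0.9, eta=0.01):
    """The three steps of formulas (9)-(11) for a single weight."""
    dw = dw + lam_w * w            # (9)  add the weight-decay term
    w_old = m * dw_old + eta * dw  # (10) momentum-smoothed update term
    w = w + w_old                  # (11) apply the update term to the weight
    return w, w_old
```

The returned `w_old` would be carried into the next iteration as `dw_old`.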
In one implementation of the embodiments of this application, after S303 is executed, the neural network model training method provided by the embodiments of this application may further perform: integer fixed-point encoding of the second activation, encoding the second activation into integer fixed-point data with a specified bit width.
After the computation of each network layer, the bit width of the resulting integer fixed-point data generally grows; when it is input to subsequent network layers for computation, the longer bit width may reduce computational efficiency. To guarantee computational efficiency, the computed second activation can be integer fixed-point encoded once more; the purpose is to reduce the bit width of the second activation so that it satisfies the computation requirements of the next network layer.
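That re-encoding step can be sketched as narrowing the accumulator back to the specified bit width: shift the integers down and scale sp up by the same power of two until every value fits again. A hypothetical helper under assumptions; the patent only states that the second activation is integer fixed-point encoded again to a specified bit width, not this particular shift scheme.

```python
def requantize(ips, sp, ib=8):
    """Re-encode integer fixed-point data (ips, sp) to a narrower
    signed ib-bit width by right-shifting and scaling up sp."""
    hi = 2 ** (ib - 1) - 1
    shift = 0
    while max((abs(v) for v in ips), default=0) >> shift > hi:
        shift += 1
    # Arithmetic right shift divides by 2**shift (rounding toward -inf),
    # and multiplying sp by 2**shift keeps the decoded values aligned.
    return [v >> shift for v in ips], sp * (2 ** shift)
```

Values that already fit within the target bit width pass through unchanged.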
Applying the embodiments of this application, training samples are obtained and used to train a neural network model. When training the neural network model, the following is performed separately for each network layer in the neural network model: obtain the first activation input to the network layer and the network weight of the network layer, perform integer fixed-point encoding on the first activation and the network weight, encode the first activation and the network weight into integer fixed-point data with a specified bit width, and calculate the second activation output by the network layer according to the encoded first activation and network weight. Since, during training, the first activation input to each network layer and the network weight of each network layer are integer fixed-point encoded and the encoded first activations and network weights are integer fixed-point data with a specified bit width, the matrix multiplications, matrix additions and other operations involved in the computation are performed in integer fixed-point format; the bit width of integer fixed-point data is significantly smaller than that of single-precision floating-point data. Therefore, while guaranteeing the convergence accuracy of the neural network model, the hardware resource overhead required to run the neural network model can be greatly reduced.
For ease of understanding, the neural network model training method of the embodiments of this application is introduced below in connection with a concrete scenario of target recognition in images.
First, an initial target recognition model, for example a convolutional neural network model, is built; the target recognition model includes three convolution layers and one fully connected layer, and each network layer is provided with initial network weights.
Then, a large number of sample images are obtained, the sample images being annotated with target information; a sample image is read at random to obtain the pixel value of each pixel in the sample image (as single-precision floating-point data). The sample image is input into the neural network model to obtain the model output result, specifically including the following steps:
A. take the first convolution layer as the current network layer and take the pixel values of the pixels in the sample image as the first activation of the first convolution layer;
B. perform integer fixed-point encoding on the first activation, encoding the first activation into integer fixed-point data with a specified bit width; obtain the network weight of the current layer and perform integer fixed-point encoding on the network weight of the current network layer, encoding the network weight of the current network layer into integer fixed-point data with a specified bit width; input the encoded first activation into the current network layer, where the current network layer performs the current layer's convolution operation on the encoded first activation using the encoded network weight, obtaining the second activation output by the current network layer;
C. take the second activation output by the current layer as the first activation of the next network layer and return to step B, until the last network layer, i.e. the fully connected layer, outputs a second activation; the second activation output by the fully connected layer is taken as the output result of the target recognition model.
Then, the output of the target recognition model is compared with the annotated target information through a loss function to obtain the loss value; following the backward operation of the above process, convolution and matrix multiplication operations are performed in order from back to front to obtain the weight gradient corresponding to each network layer, and the network weights are adjusted according to the weight gradients. Through continuous iteration, the training of the target recognition model is achieved.
The above neural network model training method is mainly suitable for resource-constrained edge devices, such as cameras. For a camera, its intelligent inference functions mainly include target detection, recognition, and so on. Taking target detection as an example, the training method for a target detection model deployed on a camera is introduced below; as shown in FIG. 9, it mainly includes the following steps.
S901: enable the target detection function.
According to the user's actual requirements, the camera may enable the target detection function based on the user's selection when target detection is needed.
S902: determine whether the model online training function is started; if so, execute S903; otherwise, wait for the model online training function to be started.
Before using the target detection model for target detection, the target detection model needs to be trained; whether to perform online training can be chosen by the user. Usually, only after the model online training function is started does the camera train the target detection model according to the steps of the embodiment shown in FIG. 1.
S903: train the target detection model using the obtained training samples containing a specified target.
When training the target detection model, the training samples input to the target detection model are sample images containing the specified target, so that the trained target detection model can detect the specified target. The specific way of training the target detection model is the same as the way of training the neural network model in the embodiment shown in FIG. 3 and is not repeated here.
Since the camera trains the target detection model with the training method of the embodiment shown in FIG. 3, during training the first activation input to each network layer and the network weight of each network layer are integer fixed-point encoded, and the encoded first activations and network weights are integer fixed-point data with a specified bit width; the matrix multiplications, matrix additions and other operations involved in the computation are therefore performed in integer fixed-point format. The bit width of integer fixed-point data is significantly smaller than that of single-precision floating-point data, so the hardware resource overhead of the camera can be greatly reduced. Training the target detection model online on the camera enables the camera to have a scene-adaptive capability.
Corresponding to the above method embodiments, the embodiments of this application provide a neural network model training apparatus; as shown in FIG. 10, the apparatus may include:
an obtaining module 1010, configured to obtain training samples; and
a training module 1020, configured to train a neural network model using the training samples, wherein, when training the neural network model, the training module 1020 performs the following steps separately for each network layer in the neural network model: obtaining the first activation input to the network layer and the network weight of the network layer; performing integer fixed-point encoding on the first activation and the network weight, encoding the first activation and the network weight into integer fixed-point data with a specified bit width; and calculating, according to the encoded first activation and network weight, the second activation output by the network layer.
In one implementation of the embodiments of this application, the apparatus is applied to a camera; the training samples are training samples containing a specified target; and the neural network model is a target detection model for detecting the specified target.
The apparatus may further include:
an enabling module, configured to enable the target detection function; and
a judging module, configured to determine whether the model online training function is started.
The training module 1020 may specifically be configured to: if the judging result of the judging module is that the model online training function is started, train the target detection model using the training samples containing the specified target.
In one implementation of the embodiments of this application, the training module 1020 may specifically be configured to: input the training samples into the neural network model and perform a forward operation on the training samples in the front-to-back order of the network layers in the neural network model to obtain the forward operation result of the neural network model, wherein, during the forward operation, for each network layer, integer fixed-point encoding is performed on the first activation input to the network layer and on the network weight of the network layer, the first activation and the network weight are encoded into integer fixed-point data with a specified bit width, the second activation output by the network layer is calculated according to the encoded first activation and network weight, and the second activation is used for calculation as the first activation input to the next network layer, until the second activation output by the last network layer is determined as the forward operation result; compare the forward operation result with a preset nominal value to obtain a loss value; input the loss value into the neural network model and perform a backward operation on the loss value in the back-to-front order of the network layers in the neural network model to obtain the weight gradient of each network layer in the neural network model, wherein, during the backward operation, for each network layer, integer fixed-point encoding is performed on the first activation input to the network layer, the first activation gradient and the network weight of the network layer, the first activation, first activation gradient and network weight are encoded into integer fixed-point data with a specified bit width, the second activation gradient and weight gradient output by the network layer are calculated according to the encoded first activation, first activation gradient and network weight, and the second activation gradient is used for calculation as the first activation gradient input to the next network layer, until the weight gradients of all network layers are calculated; and adjust the network weight of each network layer according to the weight gradient of each network layer.
In one implementation of the embodiments of this application, when used to adjust the network weight of each network layer according to the weight gradient of each network layer, the training module 1020 may specifically be configured to: perform integer fixed-point encoding on the weight gradient of each network layer, encoding the weight gradient of each network layer into integer fixed-point data with a specified bit width; and calculate the adjusted network weight of each network layer according to the encoded weight gradient of each network layer and the encoded network weight of each network layer, using a preset optimization algorithm.
In one implementation of the embodiments of this application, the training module 1020 may further be configured to: perform integer fixed-point encoding on the second activation, encoding the second activation into integer fixed-point data with a specified bit width.
In one implementation of the embodiments of this application, when used to perform integer fixed-point encoding on the first activation and the network weight, encoding the first activation and the network weight into integer fixed-point data with a specified bit width, the training module 1020 may specifically be configured to: encode each scalar value in the first activation and the network weight as the product of a parameter value characterizing the global dynamic range and an integer fixed-point value with the specified bit width.
In one implementation of the embodiments of this application, if the network layer is a convolution layer, the size of the network weight is C×R×R×N, and for the scalar values in each three-dimensional tensor of size C×R×R, the corresponding parameter value is the same; if the network layer is a fully connected layer, the size of the network weight is M×N, and for the scalar values in each column vector of size 1×N, the corresponding parameter value is the same; the scalar values in the first activation correspond to the same parameter value.
Applying the embodiments of this application, training samples are obtained and used to train a neural network model. When training the neural network model, the following is performed separately for each network layer in the neural network model: obtain the first activation input to the network layer and the network weight of the network layer, perform integer fixed-point encoding on the first activation and the network weight, encode the first activation and the network weight into integer fixed-point data with a specified bit width, and calculate the second activation output by the network layer according to the encoded first activation and network weight. Since, during training, the first activation input to each network layer and the network weight of each network layer are integer fixed-point encoded and the encoded first activations and network weights are integer fixed-point data with a specified bit width, the matrix multiplications, matrix additions and other operations involved in the computation are performed in integer fixed-point format; the bit width of integer fixed-point data is significantly smaller than that of single-precision floating-point data. Therefore, the hardware resource overhead required to run the neural network model can be greatly reduced.
The embodiments of this application provide a computer device; as shown in FIG. 11, it may include a processor 1101 and a machine-readable storage medium 1102, the machine-readable storage medium 1102 storing machine-executable instructions executable by the processor 1101, the processor 1101 being caused by the machine-executable instructions to implement the steps of the neural network model training method described above.
The above machine-readable storage medium may include RAM (Random Access Memory) and may also include NVM (Non-Volatile Memory), for example at least one disk memory. Optionally, the machine-readable storage medium may also be at least one storage device located remotely from the above processor.
The above processor may be a general-purpose processor, including a CPU (Central Processing Unit), an NP (Network Processor), etc.; it may also be a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field-Programmable Gate Array) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
Data may be transmitted between the machine-readable storage medium 1102 and the processor 1101 through a wired or wireless connection, and the computer device may communicate with other devices through a wired or wireless communication interface. FIG. 11 shows only an example of data transmission between the processor 1101 and the machine-readable storage medium 1102 through a bus, and is not intended to limit the specific connection manner.
In this embodiment, by reading the machine-executable instructions stored in the machine-readable storage medium 1102 and running them, the processor 1101 can achieve: obtaining training samples and using the training samples to train the neural network model. When training the neural network model, the following is performed separately for each network layer in the neural network model: obtain the first activation input to the network layer and the network weight of the network layer, perform integer fixed-point encoding on the first activation and the network weight, encode the first activation and the network weight into integer fixed-point data with a specified bit width, and calculate the second activation output by the network layer according to the encoded first activation and network weight. Since, during training, the first activation input to each network layer and the network weight of each network layer are integer fixed-point encoded and the encoded first activations and network weights are integer fixed-point data with a specified bit width, the matrix multiplications, matrix additions and other operations involved in the computation are performed in integer fixed-point format; the bit width of integer fixed-point data is significantly smaller than that of single-precision floating-point data. Therefore, the hardware resource overhead required to run the neural network model can be greatly reduced.
The embodiments of this application also provide a machine-readable storage medium storing machine-executable instructions that, when called and executed by a processor, implement the steps of the neural network model training method described above.
In this embodiment, the machine-readable storage medium stores machine-executable instructions that, at runtime, execute the neural network model training method provided by the embodiments of this application, and can therefore achieve: obtaining training samples and using the training samples to train the neural network model. When training the neural network model, the following is performed separately for each network layer in the neural network model: obtain the first activation input to the network layer and the network weight of the network layer, perform integer fixed-point encoding on the first activation and the network weight, encode the first activation and the network weight into integer fixed-point data with a specified bit width, and calculate the second activation output by the network layer according to the encoded first activation and network weight. Since, during training, the first activation input to each network layer and the network weight of each network layer are integer fixed-point encoded and the encoded first activations and network weights are integer fixed-point data with a specified bit width, the matrix multiplications, matrix additions and other operations involved in the computation are performed in integer fixed-point format; the bit width of integer fixed-point data is significantly smaller than that of single-precision floating-point data. Therefore, the hardware resource overhead required to run the neural network model can be greatly reduced.
The embodiments of this application also provide a computer program product for executing, at runtime, the steps of the above neural network model training method.
The above embodiments may be implemented in whole or in part by software, hardware, firmware or any combination thereof. When implemented by software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of this application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center in a wired manner (for example, coaxial cable, optical fiber, or DSL (Digital Subscriber Line)) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device such as a server or data center integrating one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, hard disk or magnetic tape), an optical medium (for example, a DVD (Digital Versatile Disc)), or a semiconductor medium (for example, an SSD (Solid State Disk)), etc.
Since the apparatus, electronic device, computer-readable storage medium and computer program product embodiments are basically similar to the method embodiments, their description is relatively brief; for relevant parts, refer to the description of the method embodiments.
It should be noted that, in this document, the terms "comprise", "include" or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device including a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article or device including that element.
Those of ordinary skill in the art can understand that all or part of the steps in the above method implementations can be completed by a program instructing related hardware; the program can be stored in a computer-readable storage medium, referred to here as a storage medium, such as a ROM/RAM, a magnetic disk or an optical disc.
The above are only preferred embodiments of this application and are not intended to limit the protection scope of this application. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of this application are included in the protection scope of this application.

Claims (17)

  1. A neural network model training method, wherein the method comprises:
    obtaining training samples;
    training a neural network model using the training samples, wherein, when training the neural network model, the following steps are performed separately for each network layer in the neural network model:
    obtaining a first activation input to the network layer and a network weight of the network layer;
    performing integer fixed-point encoding on the first activation and the network weight, encoding the first activation and the network weight into integer fixed-point data with a specified bit width; and
    calculating, according to the encoded first activation and network weight, a second activation output by the network layer.
  2. The method according to claim 1, wherein the method is applied to a camera; the training samples are training samples containing a specified target; and the neural network model is a target detection model for detecting the specified target;
    before the training a neural network model using the training samples, the method further comprises:
    enabling a target detection function;
    determining whether a model online training function is started;
    the training a neural network model using the training samples comprises:
    if the model online training function is started, training the target detection model using the training samples containing the specified target.
  3. The method according to claim 1, wherein the training a neural network model using the training samples comprises:
    inputting the training samples into the neural network model and performing a forward operation on the training samples in a front-to-back order of the network layers in the neural network model to obtain a forward operation result of the neural network model, wherein, during the forward operation, for each network layer, integer fixed-point encoding is performed on a first activation input to the network layer and on a network weight of the network layer, the first activation and the network weight are encoded into integer fixed-point data with a specified bit width, a second activation output by the network layer is calculated according to the encoded first activation and network weight, and the second activation is used for calculation as a first activation input to a next network layer, until a second activation output by a last network layer is determined as the forward operation result;
    comparing the forward operation result with a preset nominal value to obtain a loss value;
    inputting the loss value into the neural network model and performing a backward operation on the loss value in a back-to-front order of the network layers in the neural network model to obtain a weight gradient of each network layer in the neural network model, wherein, during the backward operation, for each network layer, integer fixed-point encoding is performed on a first activation input to the network layer, a first activation gradient and a network weight of the network layer, the first activation, the first activation gradient and the network weight are encoded into integer fixed-point data with a specified bit width, a second activation gradient and a weight gradient output by the network layer are calculated according to the encoded first activation, first activation gradient and network weight, and the second activation gradient is used for calculation as a first activation gradient input to a next network layer, until the weight gradients of all network layers are calculated; and
    adjusting the network weight of each network layer according to the weight gradient of each network layer.
  4. The method according to claim 3, wherein the adjusting the network weight of each network layer according to the weight gradient of each network layer comprises:
    performing integer fixed-point encoding on the weight gradient of each network layer, encoding the weight gradient of each network layer into integer fixed-point data with a specified bit width; and
    calculating the adjusted network weight of each network layer according to the encoded weight gradient of each network layer and the encoded network weight of each network layer, using a preset optimization algorithm.
  5. The method according to claim 1, wherein, after the calculating, according to the encoded first activation and network weight, a second activation output by the network layer, the method further comprises:
    performing integer fixed-point encoding on the second activation, encoding the second activation into integer fixed-point data with a specified bit width.
  6. The method according to claim 1, wherein the performing integer fixed-point encoding on the first activation and the network weight, encoding the first activation and the network weight into integer fixed-point data with a specified bit width, comprises:
    encoding each scalar value in the first activation and the network weight as a product of a parameter value characterizing a global dynamic range and an integer fixed-point value with the specified bit width.
  7. The method according to claim 6, wherein, if the network layer is a convolution layer, the size of the network weight is C×R×R×N, and for the scalar values in each three-dimensional tensor of size C×R×R, the corresponding parameter value is the same;
    if the network layer is a fully connected layer, the size of the network weight is M×N, and for the scalar values in each column vector of size 1×N, the corresponding parameter value is the same;
    the scalar values in the first activation correspond to the same parameter value.
  8. A neural network model training apparatus, wherein the apparatus comprises:
    an obtaining module, configured to obtain training samples; and
    a training module, configured to train a neural network model using the training samples, wherein, when training the neural network model, the training module performs the following steps separately for each network layer in the neural network model:
    obtaining a first activation input to the network layer and a network weight of the network layer;
    performing integer fixed-point encoding on the first activation and the network weight, encoding the first activation and the network weight into integer fixed-point data with a specified bit width; and
    calculating, according to the encoded first activation and network weight, a second activation output by the network layer.
  9. The apparatus according to claim 8, wherein the apparatus is applied to a camera; the training samples are training samples containing a specified target; and the neural network model is a target detection model for detecting the specified target;
    the apparatus further comprises:
    an enabling module, configured to enable a target detection function; and
    a judging module, configured to determine whether a model online training function is started;
    the training module is specifically configured to:
    if the judging result of the judging module is that the model online training function is started, train the target detection model using the training samples containing the specified target.
  10. The apparatus according to claim 8, wherein the training module is specifically configured to:
    input the training samples into the neural network model and perform a forward operation on the training samples in a front-to-back order of the network layers in the neural network model to obtain a forward operation result of the neural network model, wherein, during the forward operation, for each network layer, integer fixed-point encoding is performed on a first activation input to the network layer and on a network weight of the network layer, the first activation and the network weight are encoded into integer fixed-point data with a specified bit width, a second activation output by the network layer is calculated according to the encoded first activation and network weight, and the second activation is used for calculation as a first activation input to a next network layer, until a second activation output by a last network layer is determined as the forward operation result;
    compare the forward operation result with a preset nominal value to obtain a loss value;
    input the loss value into the neural network model and perform a backward operation on the loss value in a back-to-front order of the network layers in the neural network model to obtain a weight gradient of each network layer in the neural network model, wherein, during the backward operation, for each network layer, integer fixed-point encoding is performed on a first activation input to the network layer, a first activation gradient and a network weight of the network layer, the first activation, the first activation gradient and the network weight are encoded into integer fixed-point data with a specified bit width, a second activation gradient and a weight gradient output by the network layer are calculated according to the encoded first activation, first activation gradient and network weight, and the second activation gradient is used for calculation as a first activation gradient input to a next network layer, until the weight gradients of all network layers are calculated; and
    adjust the network weight of each network layer according to the weight gradient of each network layer.
  11. The apparatus according to claim 10, wherein, when used to adjust the network weight of each network layer according to the weight gradient of each network layer, the training module is specifically configured to:
    perform integer fixed-point encoding on the weight gradient of each network layer, encoding the weight gradient of each network layer into integer fixed-point data with a specified bit width; and
    calculate the adjusted network weight of each network layer according to the encoded weight gradient of each network layer and the encoded network weight of each network layer, using a preset optimization algorithm.
  12. The apparatus according to claim 8, wherein the training module is further configured to:
    perform integer fixed-point encoding on the second activation, encoding the second activation into integer fixed-point data with a specified bit width.
  13. The apparatus according to claim 8, wherein, when used to perform integer fixed-point encoding on the first activation and the network weight, encoding the first activation and the network weight into integer fixed-point data with a specified bit width, the training module is specifically configured to:
    encode each scalar value in the first activation and the network weight as a product of a parameter value characterizing a global dynamic range and an integer fixed-point value with the specified bit width.
  14. The apparatus according to claim 13, wherein, if the network layer is a convolution layer, the size of the network weight is C×R×R×N, and for the scalar values in each three-dimensional tensor of size C×R×R, the corresponding parameter value is the same;
    if the network layer is a fully connected layer, the size of the network weight is M×N, and for the scalar values in each column vector of size 1×N, the corresponding parameter value is the same;
    the scalar values in the first activation correspond to the same parameter value.
  15. A computer device, comprising a processor and a machine-readable storage medium, the machine-readable storage medium storing machine-executable instructions executable by the processor, the processor being caused by the machine-executable instructions to implement the method according to any one of claims 1 to 7.
  16. A machine-readable storage medium, wherein the machine-readable storage medium stores machine-executable instructions that, when called and executed by a processor, implement the method according to any one of claims 1 to 7.
  17. A computer program product, for executing, at runtime, the method according to any one of claims 1 to 7.
PCT/CN2020/111912 2019-08-29 2020-08-27 一种神经网络模型训练方法及装置 WO2021037174A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910808066.6A CN112446461A (zh) 2019-08-29 2019-08-29 一种神经网络模型训练方法及装置
CN201910808066.6 2019-08-29

Publications (1)

Publication Number Publication Date
WO2021037174A1 true WO2021037174A1 (zh) 2021-03-04

Family

ID=74685187

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/111912 WO2021037174A1 (zh) 2019-08-29 2020-08-27 一种神经网络模型训练方法及装置

Country Status (2)

Country Link
CN (1) CN112446461A (zh)
WO (1) WO2021037174A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
  • CN113935470A (zh) * 2021-10-27 2022-01-14 安谋科技(中国)有限公司 Method for running a neural network model, medium, and electronic device
  • CN117557244A (zh) * 2023-09-27 2024-02-13 国网江苏省电力有限公司信息通信分公司 Knowledge-graph-based electric power operation and maintenance alert system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
  • CN109902745A (zh) * 2019-03-01 2019-06-18 成都康乔电子有限责任公司 CNN-based low-precision training and 8-bit integer quantized inference method
  • CN109934331A (zh) * 2016-04-29 2019-06-25 北京中科寒武纪科技有限公司 Apparatus and method for performing a forward operation of an artificial neural network
  • CN110096968A (zh) * 2019-04-10 2019-08-06 西安电子科技大学 Ultra-high-speed static gesture recognition method based on deep model optimization

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
  • CN106575379B (zh) * 2014-09-09 2019-07-23 英特尔公司 Improved fixed-point integer implementations for neural networks

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
  • CN109934331A (zh) * 2016-04-29 2019-06-25 北京中科寒武纪科技有限公司 Apparatus and method for performing a forward operation of an artificial neural network
  • CN109902745A (zh) * 2019-03-01 2019-06-18 成都康乔电子有限责任公司 CNN-based low-precision training and 8-bit integer quantized inference method
  • CN110096968A (zh) * 2019-04-10 2019-08-06 西安电子科技大学 Ultra-high-speed static gesture recognition method based on deep model optimization

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SHUANG WU, GUOQI LI, FENG CHEN, LUPING SHI: "TRAINING AND INFERENCE WITH INTEGERS IN DEEP NEURAL NETWORKS", ARXIV, 13 February 2018 (2018-02-13), pages 1 - 14, XP002798214, Retrieved from the Internet <URL:https://arxiv.org/pdf/1802.04680.pdf> [retrieved on 20200310] *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
  • CN113935470A (zh) * 2021-10-27 2022-01-14 安谋科技(中国)有限公司 Method for running a neural network model, medium, and electronic device
  • CN117557244A (zh) * 2023-09-27 2024-02-13 国网江苏省电力有限公司信息通信分公司 Knowledge-graph-based electric power operation and maintenance alert system

Also Published As

Publication number Publication date
CN112446461A (zh) 2021-03-05

Similar Documents

Publication Publication Date Title
US11373087B2 (en) Method and apparatus for generating fixed-point type neural network
CN109949255B (zh) 图像重建方法及设备
KR102469261B1 (ko) 적응적 인공 신경 네트워크 선택 기법들
JP2019528502A (ja) パターン認識に適用可能なモデルを最適化するための方法および装置ならびに端末デバイス
TW201915839A (zh) 對人工神經網路及浮點神經網路進行量化的方法及裝置
CN111105017B (zh) 神经网络量化方法、装置及电子设备
CN112200057B (zh) 人脸活体检测方法、装置、电子设备及存储介质
WO2021135715A1 (zh) 一种图像压缩方法及装置
CN110689599A (zh) 基于非局部增强的生成对抗网络的3d视觉显著性预测方法
WO2021037174A1 (zh) 一种神经网络模型训练方法及装置
CA3137297C (en) Adaptive convolutions in neural networks
WO2020164189A1 (zh) 图像复原方法及装置、电子设备、存储介质
CN112561028A (zh) 训练神经网络模型的方法、数据处理的方法及装置
TW202143164A (zh) 圖像處理方法、電子設備和電腦可讀儲存介質
WO2022242122A1 (zh) 一种视频优化方法、装置、终端设备及存储介质
CN116030792A (zh) 用于转换语音音色的方法、装置、电子设备和可读介质
CN114698395A (zh) 神经网络模型的量化方法和装置、数据处理的方法和装置
CN110211017B (zh) 图像处理方法、装置及电子设备
WO2022246986A1 (zh) 数据处理方法、装置、设备及计算机可读存储介质
WO2021057926A1 (zh) 一种神经网络模型训练方法及装置
CN117173269A (zh) 一种人脸图像生成方法、装置、电子设备和存储介质
JP7352243B2 (ja) コンピュータプログラム、サーバ装置、端末装置、学習済みモデル、プログラム生成方法、及び方法
CN114792388A (zh) 图像描述文字生成方法、装置及计算机可读存储介质
CN110852202A (zh) 一种视频分割方法及装置、计算设备、存储介质
WO2021093780A1 (zh) 一种目标识别方法及装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20857527

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20857527

Country of ref document: EP

Kind code of ref document: A1