WO2020118553A1 - Quantization method and apparatus for a convolutional neural network, and electronic device - Google Patents

Quantization method and apparatus for a convolutional neural network, and electronic device

Info

Publication number
WO2020118553A1
Authority
WO
WIPO (PCT)
Prior art keywords
layer
neural network
convolutional neural
ssdlite
quantization
Prior art date
Application number
PCT/CN2018/120560
Other languages
English (en)
French (fr)
Inventor
范鸿翔
Original Assignee
深圳鲲云信息科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳鲲云信息科技有限公司 filed Critical 深圳鲲云信息科技有限公司
Priority to PCT/CN2018/120560 priority Critical patent/WO2020118553A1/zh
Priority to CN201880083718.8A priority patent/CN111542838B/zh
Publication of WO2020118553A1 publication Critical patent/WO2020118553A1/zh

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • This application relates to the field of artificial intelligence, and more specifically, to a quantization method, device and electronic equipment for convolutional neural networks.
  • The lightweight single-shot multi-object SSDLite convolutional neural network is a recently proposed type of convolutional neural network for intelligent object detection tasks, and it achieves high accuracy.
  • Its algorithm complexity is nevertheless still very high, which is a challenge for the hardware resources of small embedded systems and similar platforms. Therefore, when the SSDLite convolutional neural network is applied to small embedded systems, there is a problem of high algorithm complexity.
  • In view of the above defects in the prior art, the purpose of this application is to provide a quantization method and apparatus for a convolutional neural network, and an electronic device, which are applied to the SSDLite convolutional neural network and solve the problem of high algorithm complexity when the SSDLite convolutional neural network is applied to small embedded systems.
  • A quantization method for a convolutional neural network includes: acquiring an initial SSDlite convolutional neural network, the initial SSDlite convolutional neural network including a feature processor for feature extraction and a position predictor for predicting feature locations; quantizing the network layer parameters in the feature processor to obtain a quantized feature processor; keeping the network layer parameter state of the position predictor and obtaining a quantized SSDlite convolutional neural network based on the quantized feature processor and the position predictor; and outputting the target SSDlite convolutional neural network based on the quantized SSDlite convolutional neural network.
  • the quantizing the network layer parameters in the feature processor to obtain the quantized feature processor includes:
  • the quantizing the parameters of at least one convolutional layer in the feature processor to obtain at least one quantized convolutional layer includes:
  • the quantizing the parameters of at least one convolutional layer in the feature processor to obtain at least one convolutional layer includes:
  • the quantizing the parameters of at least one convolutional layer in the feature processor to obtain at least one quantized convolutional layer includes:
  • the method further includes:
  • a quantization device for a convolutional neural network for an SSDlite convolutional neural network, characterized in that the device includes:
  • An acquisition module for acquiring an initial SSDlite convolutional neural network including a feature processor for feature extraction and a position predictor for predicting feature locations;
  • a quantization module, used to quantize the network layer parameters in the feature processor to obtain a quantized feature processor;
  • a holding module, used to keep the network layer parameter state of the position predictor and to obtain a quantized SSDlite convolutional neural network based on the quantized feature processor and the position predictor;
  • an output module, used to output the target SSDlite convolutional neural network based on the quantized SSDlite convolutional neural network.
  • An electronic device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the computer program, the steps of the quantization method for a convolutional neural network provided in the embodiments of the present application are implemented.
  • A computer-readable storage medium is provided, on which a computer program is stored; when the computer program is executed by a processor, the steps of the quantization method for a convolutional neural network provided in an embodiment of the present application are implemented.
  • Beneficial effects brought by the present application: an initial SSDlite convolutional neural network is acquired, the initial SSDlite convolutional neural network including a feature processor for feature extraction and a position predictor for predicting feature locations; the network layer parameters in the feature processor are quantized to obtain a quantized feature processor; the network layer parameter state of the position predictor is kept, and a quantized SSDlite convolutional neural network is obtained based on the quantized feature processor and the position predictor; based on the quantized SSDlite convolutional neural network, the target SSDlite convolutional neural network is output.
  • By quantizing the feature processor in the SSDlite convolutional neural network, the complexity of the feature processing algorithm in the feature processor is reduced, which in turn reduces the complexity of the SSDlite convolutional neural network and solves the problem of high algorithm complexity when the SSDLite convolutional neural network is applied to small embedded systems.
  • At the same time, because the network layer parameter state of the position predictor is kept, the accuracy loss of detection can be reduced.
  • FIG. 1 is a schematic flowchart of a quantization method for a convolutional neural network according to an embodiment of the present application;
  • FIG. 2 is a schematic flowchart of another quantization method for a convolutional neural network provided by an embodiment of the present application;
  • FIG. 3 is a schematic diagram of a quantization device for a convolutional neural network according to an embodiment of the present application;
  • FIG. 4 is a schematic diagram of another quantization device for a convolutional neural network provided by an embodiment of the present application;
  • FIG. 5 is a schematic diagram of another quantization device for a convolutional neural network according to an embodiment of the present application;
  • FIG. 6 is a schematic diagram of another quantization device for a convolutional neural network according to an embodiment of the present application;
  • FIG. 7 is a schematic diagram of quantization of a convolutional neural network provided by an embodiment of the present application;
  • FIG. 8 is a schematic diagram of feature processor quantization provided by an embodiment of the present application.
  • This application provides a quantization method, device and electronic equipment for convolutional neural networks.
  • FIG. 1 is a schematic flowchart of a quantization method for a convolutional neural network according to an embodiment of the present application. As shown in FIG. 1, the method includes the following steps:
  • the initial SSDlite convolutional neural network includes a feature processor for feature extraction and a position predictor for predicting feature locations.
  • The above initial SSDlite convolutional neural network can be established by the user, for example by inputting the corresponding data and program; alternatively, an already established SSDlite convolutional neural network can be used as the initial SSDlite convolutional neural network, for example by reproducing neural network data prepared by others through copying.
  • the above initial SSDlite convolutional neural network may be an already trained SSDlite convolutional neural network or an untrained SSDlite convolutional neural network.
  • The above feature processor can be understood as the network region in the initial SSDlite convolutional neural network used for extracting image features, and the above position predictor can be understood as the network region in the initial SSDlite convolutional neural network used for predicting feature positions in the image.
  • Object detection can be achieved through feature processors and position predictors.
  • the above network layer parameters can be weights, input data, etc.
  • The above quantization can be linear quantization. Specifically, the maximum and minimum values of a network layer parameter can be obtained first, and the maximum and minimum values form an interval, giving the network layer parameter interval. It can be understood that different parameters may correspond to different parameter intervals.
  • The above quantization may use 8-bit floating-point numbers, or floating-point numbers of other bit widths, such as 10-bit or 16-bit floating-point numbers. Taking quantization with 8-bit floating-point data as an example, an 8-bit floating-point number can represent the 256 discrete values in [0, 255].
  • Suppose a parameter has a maximum value of 9 and a minimum value of -3; the interval of this parameter is then the linear interval [-3, 9]. Represented in a coordinate system, the linear interval [-3, 9] is a line segment.
  • This line segment is divided evenly into 256 segments, each covering an interval of the form [X, X+0.046875]; that is, the interval [-3, 9] can be divided into [-3, -2.953125], (-2.953125, -2.90625] ... (8.953125, 9], or alternatively into [-3, -2.953125), [-2.953125, -2.90625) ... [8.953125, 9].
  • Parameters falling into [-3, -2.953125] are quantized to 0 in [0, 255], parameters falling into (-2.953125, -2.90625] are quantized to 1, ..., and parameters falling into (8.953125, 9] are quantized to 255. In this way, the linear interval [-3, 9] is quantized to the discrete interval [0, 255], where each discrete element represents a linear sub-interval.
  • In some possible embodiments, to increase the influence of the end values (the maximum and minimum), the interval [-3, 9] can instead be divided into 256 segments in which the two end values each form their own segment; each interior segment then covers an interval of the form [X, X+0.04724409449], giving (-∞, -3], (-3, -2.95275590551] ... (8.95275590551, 9), [9, +∞).
  • The parameter -3 is then quantized to 0 in [0, 255], the parameter 9 is quantized to 255, parameters falling into (-3, -2.95275590551] are quantized to 1, and parameters falling into (8.95275590551, 9) are quantized to 254.
  • For example, a parameter equal to -2.953124 is quantized to 1 under either scheme.
  • The above retention can be understood as not quantizing the network layer parameters of the position predictor, so that the network layers of the position predictor can use the original parameters to detect the positions of features. For example, if each layer in the initial SSDlite convolutional neural network uses 32-bit floating-point numbers for calculation, the parameters of the network layers of the position predictor are kept as 32-bit floating-point numbers, so that the network layers in the position predictor calculate with 32-bit floating-point numbers.
  • It can be understood that a type converter may exist in the quantized feature processing. The above type converter is used to convert the original parameter type to the quantized parameter type and to convert the quantized parameter type back to the original parameter type. For example, inside the feature processor, 32-bit floating-point numbers can be converted to 8-bit floating-point numbers, and before the feature processor produces its output, the 8-bit floating-point numbers are converted back to 32-bit floating-point numbers for output, so that the position predictor calculates on 32-bit floating-point data and the accuracy of feature detection is maintained.
  • In this way, the network region of the feature extraction part of the SSDlite convolutional neural network is quantized, reducing the complexity of the network.
  • The above quantized SSDlite convolutional neural network refers to the SSDlite convolutional neural network in which the feature processor parameters are quantized in step 103 while the position predictor parameters are kept unchanged; outputting the quantized SSDlite convolutional neural network yields the target SSDlite convolutional neural network.
  • After the feature processor of this target SSDlite convolutional neural network has been quantized, its algorithm complexity is greatly reduced; correspondingly, the required on-chip memory resources and hardware computing resources are also greatly reduced.
  • The above target SSDlite convolutional neural network may be a trained SSDlite convolutional neural network or an untrained one.
  • It should be noted that if the above target SSDlite convolutional neural network is a trained SSDlite convolutional neural network, the initial SSDlite convolutional neural network can be trained so that the output target SSDlite convolutional neural network is a trained one; alternatively, if the above target SSDlite convolutional neural network is an untrained SSDlite convolutional neural network, the target SSDlite convolutional neural network can be trained with pre-prepared sample data before being output.
  • In this embodiment, an initial SSDlite convolutional neural network is acquired, the initial SSDlite convolutional neural network including a feature processor for feature extraction and a position predictor for predicting feature locations; the network layer parameters in the feature processor are quantized to obtain a quantized feature processor; the network layer parameter state of the position predictor is kept, and a quantized SSDlite convolutional neural network is obtained based on the quantized feature processor and the position predictor; based on the quantized SSDlite convolutional neural network, the target SSDlite convolutional neural network is output.
  • By quantizing the feature processor in the SSDlite convolutional neural network, the complexity of the feature processing algorithm in the feature processor is reduced, which in turn reduces the complexity of the SSDlite convolutional neural network and solves the problem of high algorithm complexity when the SSDLite convolutional neural network is applied to small embedded systems.
  • At the same time, because the network layer parameter state of the position predictor is kept, the accuracy loss of detection can be reduced.
  • The quantization method for a convolutional neural network provided in the embodiments of the present application can be applied to devices that quantize convolutional neural networks, such as computers, servers, mobile phones, and other devices capable of quantizing a convolutional neural network.
  • FIG. 2 is a schematic flowchart of another quantization method of a convolutional neural network provided by an embodiment of the present application. As shown in FIG. 2, the method includes the following steps:
  • the initial SSDlite convolutional neural network includes a feature processor for feature extraction and a position predictor for predicting feature locations.
  • the above feature processor includes at least one convolutional layer.
  • the convolution layer in the feature processor is used to extract features in the image.
  • The above convolutional layer parameters may be parameters such as weights and inputs. The above convolutional layer parameters can be obtained when the initial SSDlite convolutional neural network is established; they may be untrained hyperparameters or parameters obtained after training.
  • The obtained convolutional layer parameters can be represented by a set of values of each parameter, or by a continuous parameter interval of each parameter. For example, if the weights of a convolutional layer include 1.4362, 0.8739, ... 0.3857, they can be represented by the set {1.4362, 0.8739, ... 0.3857}, which is a set of discrete values. Assuming that 0.3857 is the minimum value and 1.4362 is the maximum value, the weights can also be represented by the closed interval [0.3857, 1.4362]; in this case, 0.8739 ∈ [0.3857, 1.4362].
  • In some possible embodiments, only the end values of the convolutional layer parameters may be obtained, that is, only the maximum and minimum values of each parameter of the convolutional layer, thereby establishing the parameter interval of each parameter. For example, if the maximum value of a certain parameter is 9 and the minimum value is 0, the corresponding parameter interval can be established as [0, 9], and the remaining values of this parameter all belong to the parameter interval [0, 9].
  • Quantizing the convolutional layer parameters can be linear quantization. Through linear quantization, the convolutional layer parameters can be simplified, thereby reducing the algorithmic complexity of the convolutional layer.
  • Specifically, if the convolutional layer parameters obtained in step 202 form a discrete parameter set, the discrete parameter set can be linearized and then converted through linear quantization into the quantized floating-point type. Alternatively, according to the number of discrete values of the linear quantization, the parameter set can be sorted by size and divided into subsets corresponding to that number of discrete values.
  • For example, taking 8-bit floating-point numbers as an example, an 8-bit floating-point number can represent 256 discrete values. The parameter set can then be represented on a one-dimensional coordinate axis, with the distance between the starting point and the end point (the absolute value of the difference between the maximum and minimum values) as the length; this length is divided evenly into 256 segments, so that the parameter set is divided into 256 subsets.
  • The 256 subsets correspond one-to-one to the elements of the discrete interval [0, 255], and the elements in each subset correspond to the corresponding region on the one-dimensional coordinate axis. Each of the subsets is then clustered; after clustering the elements in each subset, a parameter value representing the subset can be obtained, and that parameter value is associated with the corresponding element in the discrete interval [0, 255].
  • The remaining network layers mentioned above may be bias layers, regularization layers, and the like. The above retention can be understood as not quantizing the parameters of the remaining network layers, so that the remaining network layers can calculate with the original floating-point numbers. For example, if each layer uses 32-bit floating-point numbers for calculation, the parameters of the remaining network layers in the feature processor other than the convolutional layers are kept as 32-bit floating-point numbers, so that the remaining network layers calculate with 32-bit floating-point numbers.
  • In some possible embodiments, the quantized data type of a convolutional layer can be converted back to the original floating-point type through a type converter and stored in memory, so that the original data is read when needed.
  • Alternatively, the quantized data can be stored in memory; when the data needs to be read, the quantized data is read and fed into the type converter to be converted back to the original floating-point type for calculation.
  • In this way, the algorithm complexity of the SSDlite convolutional neural network can be reduced while reducing the accuracy loss, further ensuring the detection accuracy of the SSDlite convolutional neural network.
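The memory trade-off between the two storage strategies can be made concrete with Python's standard `array` module. This is an illustrative sketch only: the layer size, the [-3, 9] interval, and the `read_as_float` helper are assumptions for the example, not details from the patent.

```python
import array

# Illustrative only: a convolutional layer with one million parameters
# whose values span the interval [-3.0, 9.0] used in the examples above.
n = 1_000_000
lo, hi = -3.0, 9.0
step = (hi - lo) / 255.0

# Strategy A described above: convert back and store the original floats.
floats = array.array('f', [0.0]) * n      # 32-bit values: roughly 4 MB

# Strategy B: keep the 8-bit quantized codes and convert on read.
codes = array.array('B', [0]) * n         # 8-bit codes: roughly 1 MB

def read_as_float(i):
    """Type-converter read path: 8-bit code -> original floating-point type."""
    return lo + codes[i] * step
```

Storing the codes cuts the layer's in-memory footprint to roughly a quarter, at the cost of one multiply-add through the type converter on every read.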
  • It should be noted that steps 202, 203, and 204 can be regarded as further optional refinements of step 102 in the embodiment of FIG. 1. In some possible implementations, all network layers in the entire feature processor can also be quantized.
  • the quantizing the parameters of at least one convolutional layer in the feature processor to obtain at least one quantized convolutional layer includes:
  • The convolutional layer to be quantized may be one convolutional layer, multiple convolutional layers, or all convolutional layers in the feature processor; the maximum and minimum weight values of the convolutional layer to be quantized can be obtained to establish the weight value interval of the convolutional layer to be quantized.
  • The above quantization of the weight values may be linear quantization, and the above quantization interval may be determined according to the chosen floating-point format. After the weight values of the convolutional layer to be quantized are quantized, a quantized convolutional layer can be obtained, thereby realizing the quantization of the feature processor.
  • the quantizing the parameters of at least one convolutional layer in the feature processor to obtain at least one convolutional layer includes:
  • the input data may be image data, and quantizing the image data may reduce the calculation amount of the algorithm.
  • input data can be quantized by linear quantization.
  • the quantizing the parameters of at least one convolutional layer in the feature processor to obtain at least one quantized convolutional layer includes:
  • Acquiring the parameter interval of the convolutional layer to be quantized may mean acquiring the maximum and minimum values of each parameter to form a continuous parameter interval.
  • The above linear quantization of the parameter interval with 8-bit floating-point numbers can be understood as dividing the continuous parameter interval into 256 sub-intervals, so that each sub-interval is associated one-to-one with an element of the discrete interval [0, 255]. The parameter quantization interval is thereby discretized, reducing the algorithm complexity of the convolutional layer.
  • the method further includes:
  • the trained SSDlite convolutional neural network can be obtained, and after quantization, there is no need to retrain.
  • the initial SSDlite convolutional neural network may have been trained before input.
  • the quantization method of the convolutional neural network in the embodiments corresponding to FIG. 1 and FIG. 2 can be implemented to achieve the same effect, and details are not described herein again.
  • FIG. 3 is a schematic structural diagram of a quantization device for a convolutional neural network according to an embodiment of the present application. As shown in FIG. 3, the device includes:
  • An acquisition module for acquiring an initial SSDlite convolutional neural network including a feature processor for feature extraction and a position predictor for predicting feature locations;
  • a quantization module, used to quantize the network layer parameters in the feature processor to obtain a quantized feature processor;
  • a holding module, used to maintain the initial parameter state of the network layers of the position predictor and to combine the quantized feature processor and the position predictor to obtain a quantized SSDlite convolutional neural network;
  • an output module, used to output the target SSDlite convolutional neural network based on the quantized SSDlite convolutional neural network.
  • the quantization module includes:
  • an obtaining unit, configured to obtain the parameters of at least one convolutional layer in the feature processor;
  • a quantization unit, configured to quantize the parameters of at least one convolutional layer in the feature processor to obtain at least one quantized convolutional layer;
  • a holding unit, configured to maintain the initial parameter states of the remaining network layers in the feature processor, and to combine the at least one quantized convolutional layer and the remaining network layers to obtain a quantized feature processor.
  • the quantization unit includes:
  • an acquisition subunit, used to acquire the weight value interval of the convolutional layer to be quantized;
  • a quantization subunit, configured to quantize the weight value interval of the convolutional layer to be quantized to obtain a quantization interval of the weight values;
  • an output subunit, used to obtain a quantized convolutional layer according to the quantization interval of the weight values.
  • the quantization unit includes:
  • a quantization subunit configured to quantize the input data interval of the convolutional layer to be quantized to obtain a quantization interval of the input data
  • the output subunit is used to obtain a quantized convolution layer according to the quantization interval of the input data.
  • the quantization unit includes:
  • a quantization subunit used to linearly quantize the parameter interval with 8-bit floating point numbers to obtain a parameter quantization interval of [0, 255];
  • the output subunit is used to obtain a quantized convolution layer according to the parameter quantization interval.
  • the device further includes:
  • the training module is used to train the initial SSDlite convolutional neural network using preset training sample data.
  • An embodiment of the present application provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the computer program, the steps of the quantization method for a convolutional neural network provided in the embodiments of the present application are implemented.
  • An embodiment of the present application provides a computer-readable storage medium that stores a computer program; when the computer program is executed by a processor, the steps of the quantization method for a convolutional neural network provided by an embodiment of the present application are implemented.
  • The processors and chips in the embodiments of the present application may be integrated into one processing unit, may exist separately physically, or two or more pieces of hardware may be integrated into one unit.
  • The computer-readable storage medium or computer-readable program may be stored in a computer-readable memory.
  • Based on this understanding, the technical solution of the present application, in essence, or the part contributing to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a memory and includes several instructions that enable a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present application.
  • The aforementioned memory includes media that can store program code, such as a USB flash drive, read-only memory (ROM), random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
  • The program may be stored in a computer-readable memory, and the memory may include a flash disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, and the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

A quantization method and apparatus for a convolutional neural network, and an electronic device. The method includes: acquiring an initial SSDlite convolutional neural network, the initial SSDlite convolutional neural network including a feature processor for feature extraction and a position predictor for predicting feature positions (101); quantizing the network layer parameters in the feature processor to obtain a quantized feature processor (102); keeping the initial parameter state of the network layers of the position predictor, and obtaining a quantized SSDlite convolutional neural network based on the quantized feature processor and the position predictor (103); and outputting a target SSDlite convolutional neural network based on the quantized SSDlite convolutional neural network (104). By quantizing the feature processor in the SSDlite convolutional neural network, the complexity of the feature processing algorithm in the feature processor is reduced, which in turn reduces the complexity of the SSDlite convolutional neural network and solves the problem of high algorithm complexity when the SSDLite convolutional neural network is applied to small embedded systems.

Description

Quantization method and apparatus for a convolutional neural network, and electronic device

Technical Field
This application relates to the field of artificial intelligence, and more specifically to a quantization method and apparatus for a convolutional neural network, and an electronic device.
Background Art
With the development of artificial intelligence, intelligent object detection technology has been applied in a large number of scenarios, such as autonomous driving, face detection, and traffic monitoring. The lightweight single-shot multi-object SSDLite convolutional neural network is a recently proposed type of convolutional neural network for intelligent object detection tasks. Although the SSDLite convolutional neural network has high accuracy, its algorithm complexity is still very high, which is a challenge for the hardware resources of small embedded systems and similar platforms. Therefore, when the SSDLite convolutional neural network is applied to small embedded systems, there is a problem of high algorithm complexity.
Summary of the Application
The purpose of this application is to address the above defects in the prior art by providing a quantization method and apparatus for a convolutional neural network, and an electronic device, applied to the SSDLite convolutional neural network, which solve the problem of high algorithm complexity when the SSDLite convolutional neural network is applied to small embedded systems.
The purpose of this application is achieved through the following technical solutions:
In a first aspect, a quantization method for a convolutional neural network is provided, the method including:
acquiring an initial SSDlite convolutional neural network, the initial SSDlite convolutional neural network including a feature processor for feature extraction and a position predictor for predicting feature positions;
quantizing the network layer parameters in the feature processor to obtain a quantized feature processor;
keeping the network layer parameter state of the position predictor, and obtaining a quantized SSDlite convolutional neural network based on the quantized feature processor and the position predictor;
outputting a target SSDlite convolutional neural network based on the quantized SSDlite convolutional neural network.
Optionally, quantizing the network layer parameters in the feature processor to obtain a quantized feature processor includes:
acquiring the parameters of at least one convolutional layer in the feature processor;
quantizing the parameters of the at least one convolutional layer in the feature processor to obtain at least one quantized convolutional layer;
keeping the parameter states of the other network layers in the feature processor, and combining the at least one quantized convolutional layer with the other network layers to obtain a quantized feature processor.
Optionally, quantizing the parameters of the at least one convolutional layer in the feature processor to obtain at least one quantized convolutional layer includes:
acquiring the weight value interval of the convolutional layer to be quantized;
quantizing the weight value interval of the convolutional layer to be quantized to obtain a quantization interval of the weight values;
obtaining a quantized convolutional layer according to the quantization interval of the weight values.
Optionally, quantizing the parameters of the at least one convolutional layer in the feature processor to obtain at least one quantized convolutional layer includes:
acquiring the input data interval of the convolutional layer to be quantized;
quantizing the input data interval of the convolutional layer to be quantized to obtain a quantization interval of the input data;
obtaining a quantized convolutional layer according to the quantization interval of the input data.
Optionally, quantizing the parameters of the at least one convolutional layer in the feature processor to obtain at least one quantized convolutional layer includes:
acquiring the parameter interval of the convolutional layer to be quantized;
linearly quantizing the parameter interval with 8-bit floating-point numbers to obtain a parameter quantization interval of [0, 255];
obtaining a quantized convolutional layer according to the parameter quantization interval.
Optionally, after establishing the initial SSDlite convolutional neural network, the method further includes:
training the initial SSDlite convolutional neural network using preset training sample data.
In a second aspect, a quantization apparatus for a convolutional neural network is provided, for an SSDlite convolutional neural network, the apparatus including:
an acquisition module, used to acquire an initial SSDlite convolutional neural network, the initial SSDlite convolutional neural network including a feature processor for feature extraction and a position predictor for predicting feature positions;
a quantization module, used to quantize the network layer parameters in the feature processor to obtain a quantized feature processor;
a holding module, used to keep the network layer parameter state of the position predictor and to obtain a quantized SSDlite convolutional neural network based on the quantized feature processor and the position predictor;
an output module, used to output a target SSDlite convolutional neural network based on the quantized SSDlite convolutional neural network.
In a third aspect, an electronic device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the computer program, the steps of the quantization method for a convolutional neural network provided in the embodiments of the present application are implemented.
In a fourth aspect, a computer-readable storage medium is provided, on which a computer program is stored; when the computer program is executed by a processor, the steps of the quantization method for a convolutional neural network provided in the embodiments of the present application are implemented.
Beneficial effects brought by the present application: an initial SSDlite convolutional neural network is acquired, the initial SSDlite convolutional neural network including a feature processor for feature extraction and a position predictor for predicting feature positions; the network layer parameters in the feature processor are quantized to obtain a quantized feature processor; the network layer parameter state of the position predictor is kept, and a quantized SSDlite convolutional neural network is obtained based on the quantized feature processor and the position predictor; based on the quantized SSDlite convolutional neural network, a target SSDlite convolutional neural network is output. By quantizing the feature processor in the SSDlite convolutional neural network, the complexity of the feature processing algorithm in the feature processor is reduced, which in turn reduces the complexity of the SSDlite convolutional neural network and solves the problem of high algorithm complexity when the SSDLite convolutional neural network is applied to small embedded systems; at the same time, because the network layer parameter state of the position predictor is kept, the accuracy loss of detection can be reduced.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a quantization method for a convolutional neural network provided by an embodiment of the present application;
FIG. 2 is a schematic flowchart of another quantization method for a convolutional neural network provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of a quantization apparatus for a convolutional neural network provided by an embodiment of the present application;
FIG. 4 is a schematic diagram of another quantization apparatus for a convolutional neural network provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of another quantization apparatus for a convolutional neural network provided by an embodiment of the present application;
FIG. 6 is a schematic diagram of another quantization apparatus for a convolutional neural network provided by an embodiment of the present application;
FIG. 7 is a schematic diagram of quantization of a convolutional neural network provided by an embodiment of the present application;
FIG. 8 is a schematic diagram of feature processor quantization provided by an embodiment of the present application.
Detailed Description of the Embodiments
Preferred embodiments of the present application are described below. Those of ordinary skill in the art will be able to implement them using the relevant techniques in the art in light of the following description, and will better understand the innovations and benefits of the present application.
The present application provides a quantization method and apparatus for a convolutional neural network, and an electronic device.
The purpose of this application is achieved through the following technical solutions:
In a first aspect, please refer to FIG. 1, which is a schematic flowchart of a quantization method for a convolutional neural network provided by an embodiment of the present application. As shown in FIG. 1, the method includes the following steps:
101. Acquire an initial SSDlite convolutional neural network, the initial SSDlite convolutional neural network including a feature processor for feature extraction and a position predictor for predicting feature positions.
In this step, the above initial SSDlite convolutional neural network can be established by the user, for example by inputting the corresponding data and program; alternatively, an already established SSDlite convolutional neural network can be used as the initial SSDlite convolutional neural network, for example by reproducing neural network data prepared by others through copying. The above initial SSDlite convolutional neural network may be an already trained SSDlite convolutional neural network or an untrained one. The above feature processor can be understood as the network region in the initial SSDlite convolutional neural network used for extracting image features, and the above position predictor can be understood as the network region in the initial SSDlite convolutional neural network used for predicting feature positions in an image. Object detection can be achieved through the feature processor and the position predictor.
102. Quantize the network layer parameters in the feature processor to obtain a quantized feature processor.
The above network layer parameters can be weights, input data, and so on. The above quantization can be linear quantization. Specifically, the maximum and minimum values of a network layer parameter can be obtained first, and the maximum and minimum values form an interval, giving the network layer parameter interval. It can be understood that different parameters may correspond to different parameter intervals. The above quantization may use 8-bit floating-point numbers, or floating-point numbers of other bit widths, such as 10-bit or 16-bit floating-point numbers. Taking quantization with 8-bit floating-point data as an example, an 8-bit floating-point number can represent the 256 discrete values in [0, 255]. Suppose a parameter has a maximum value of 9 and a minimum value of -3; the interval of this parameter is then the linear interval [-3, 9]. Represented in a coordinate system, the linear interval [-3, 9] is a line segment. This line segment is divided evenly into 256 segments, each covering an interval of the form [X, X+0.046875]; that is, the interval [-3, 9] can be divided into [-3, -2.953125], (-2.953125, -2.90625] ... (8.953125, 9], or alternatively into [-3, -2.953125), [-2.953125, -2.90625) ... [8.953125, 9]. Parameters falling into [-3, -2.953125] are quantized to 0 in [0, 255], parameters falling into (-2.953125, -2.90625] are quantized to 1, ..., and parameters falling into (8.953125, 9] are quantized to 255; a parameter equal to -2.953124, for example, is quantized to 1. In this way, the linear interval [-3, 9] can be quantized to the discrete interval [0, 255], where each discrete element represents a linear sub-interval. The parameter data volume of one network layer is very large and tends to be linear; after quantizing the parameters of this layer as above, only 256 discrete values remain. In some possible embodiments, to increase the influence of the end values (the maximum and minimum), the interval [-3, 9] can be divided into 256 segments in which the two end values each form their own segment; each interior segment then covers an interval of the form [X, X+0.04724409449], giving (-∞, -3], (-3, -2.95275590551] ... (8.95275590551, 9), [9, +∞). The parameter -3 is quantized to 0 in [0, 255], the parameter 9 is quantized to 255, parameters falling into (-3, -2.95275590551] are quantized to 1, and parameters falling into (8.95275590551, 9) are quantized to 254; likewise, a parameter equal to -2.953124 is quantized to 1. By quantizing the network layers in the feature processor in this way, the algorithm complexity of those layers can be greatly reduced, which in turn reduces the demand for on-chip memory resources and hardware computing resources.
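The two bucketing schemes described in this paragraph can be sketched in a few lines of Python. This is a minimal illustration of the arithmetic only; the function names and the clamping behavior at the interval ends are choices made for the sketch, not details taken from the patent:

```python
import math

def quantize_uniform(x, lo=-3.0, hi=9.0, levels=256):
    """Even split: [lo, hi] is cut into 256 segments of width 0.046875."""
    step = (hi - lo) / levels
    code = int((x - lo) / step)              # index of the segment x falls in
    return min(max(code, 0), levels - 1)     # clamp so hi itself maps to 255

def quantize_end_values(x, lo=-3.0, hi=9.0, levels=256):
    """Variant: the two end values get their own codes (0 and 255), and the
    open interval (lo, hi) is cut into 254 segments of width ~0.04724409449."""
    if x <= lo:
        return 0
    if x >= hi:
        return levels - 1
    step = (hi - lo) / (levels - 2)
    return min(math.ceil((x - lo) / step), levels - 2)

print(quantize_uniform(-2.953124))      # second segment -> 1
print(quantize_end_values(-2.953124))   # -> 1 as well, as the text notes
print(quantize_end_values(8.99))        # last interior segment -> 254
```

Both functions reproduce the worked numbers above: -3 maps to 0, 9 maps to 255, and -2.953124 maps to 1 under either scheme.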
103. Keep the initial network-layer parameter state of the position predictor, and obtain a quantized SSDlite convolutional neural network based on the quantized feature processor and the position predictor.
Referring to FIG. 7, "keeping" here means not quantizing the network-layer parameters of the position predictor, so that the network layers of the position predictor detect feature positions using their original parameters. For example, if every layer of the initial SSDlite convolutional neural network computes with 32-bit floating-point numbers, the parameters of the position predictor's network layers are kept as 32-bit floating-point numbers, so that those layers compute in 32-bit floating point. It can be understood that a type converter may exist in the quantized feature processor; the type converter converts the original parameter type to the quantized parameter type, and the quantized parameter type back to the original parameter type. For example, 32-bit floating-point numbers may be converted to 8-bit numbers inside the feature processor, and before the feature processor produces its output, the 8-bit numbers are converted back to 32-bit floating-point numbers, so that the position predictor computes on 32-bit floating-point data and the accuracy of feature detection is maintained. In this way, the feature-extraction region of the SSDlite convolutional neural network is quantized, reducing the complexity of the network.
104. Based on the quantized SSDlite convolutional neural network, output a target SSDlite convolutional neural network.
The quantized SSDlite convolutional neural network refers to the SSDlite convolutional neural network of step 103, whose feature-processor parameters have been quantized while the position-predictor parameters are kept unchanged. Outputting the quantized SSDlite convolutional neural network yields the target SSDlite convolutional neural network. After its feature processor has been quantized, the algorithmic complexity of the target SSDlite convolutional neural network is greatly reduced, and correspondingly so are the required on-chip memory and hardware computing resources. The target SSDlite convolutional neural network may be a trained SSDlite convolutional neural network or one that has not been trained. It should be noted that if the target SSDlite convolutional neural network is to be a trained one, the initial SSDlite convolutional neural network may be trained so that the output target network is a trained network; alternatively, if the target SSDlite convolutional neural network is untrained, it may be trained with pre-prepared sample data before being output.
In this embodiment, an initial SSDlite convolutional neural network is obtained, the initial SSDlite convolutional neural network comprising a feature processor for feature extraction and a position predictor for predicting feature positions; the network-layer parameters in the feature processor are quantized to obtain a quantized feature processor; the network-layer parameter state of the position predictor is kept unchanged, and a quantized SSDlite convolutional neural network is obtained based on the quantized feature processor and the position predictor; based on the quantized SSDlite convolutional neural network, a target SSDlite convolutional neural network is output. By quantizing the feature processor in the SSDlite convolutional neural network, the complexity of the feature-processing algorithm in the feature processor is reduced, thereby lowering the complexity of the SSDlite convolutional neural network and solving the problem of high algorithmic complexity when the SSDlite convolutional neural network is applied to small embedded systems; meanwhile, because the network-layer parameter state of the position predictor is kept unchanged, the loss of detection accuracy can be reduced.
It should be noted that the quantization method for a convolutional neural network provided by the embodiments of the present application may be applied to devices that quantize convolutional neural networks, for example computers, servers, mobile phones, and other devices capable of quantizing a convolutional neural network.
Referring to FIG. 2, FIG. 2 is a schematic flowchart of another quantization method for a convolutional neural network according to an embodiment of the present application. As shown in FIG. 2, the method includes the following steps:
201. Obtain an initial SSDlite convolutional neural network, the initial SSDlite convolutional neural network comprising a feature processor for feature extraction and a position predictor for predicting feature positions.
202. Obtain the parameters of at least one convolutional layer in the feature processor.
In this step, the feature processor includes at least one convolutional layer, used to extract features from an image. The convolutional-layer parameters may be weights, inputs, and the like; they may be obtained when the initial SSDlite convolutional neural network is built, and may be untrained hyperparameters or trained parameters. The obtained convolutional-layer parameters may be represented by a set of the parameter values, or by a continuous parameter interval. For example, suppose the weights of a convolutional layer include 1.4362, 0.8739, ..., 0.3857; they may be represented by the set {1.4362, 0.8739, ..., 0.3857}, a discrete set of values. Supposing 0.3857 is the minimum and 1.4362 the maximum, they may also be represented by the closed interval [0.3857, 1.4362], in which case 0.8739 ∈ [0.3857, 1.4362]. In some possible embodiments, only the endpoint values of the convolutional-layer parameters may be obtained, i.e., only the maximum and minimum of each parameter, from which the parameter interval of each parameter is built. For example, if a parameter's maximum is 9 and its minimum is 0, the corresponding parameter interval is [0, 9], and all other values of that parameter belong to the interval [0, 9].
203. Quantize the parameters of the at least one convolutional layer in the feature processor to obtain at least one quantized convolutional layer.
In this step, the quantization of the convolutional-layer parameters may be linear quantization, which simplifies the parameters and thus reduces the algorithmic complexity of the convolutional layer. Specifically, if the parameters obtained in step 202 form a discrete set, that set may be linearized and then converted, through linear quantization, into the quantized floating-point type. Alternatively, according to the number of discrete values of the linear quantization, the parameter set may be sorted by magnitude and divided into subsets corresponding to that number of discrete values. For example, taking conversion to 8-bit numbers, which can represent 256 discrete values: the parameter set may be represented on a one-dimensional axis; using the distance between the start and end points (the absolute difference between the maximum and the minimum) as the length, that length is divided evenly into 256 segments, dividing the parameter set into 256 subsets that correspond one-to-one to the elements of the discrete interval [0, 255], the elements of each subset corresponding to the matching region on the axis. Clustering is then applied to each subset: after the elements of a subset are clustered, a value representing that subset is obtained and associated with the corresponding element of [0, 255]. For quantization of a continuous parameter interval, see the quantization in step 102 of the embodiment of FIG. 1, which is not repeated here.
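The subset-and-cluster scheme just described can be sketched minimally, under our own simplifying assumptions: the parameter range is split into 256 equal segments, and instead of a full clustering step, each non-empty subset is represented by the mean of the values that fall into it:

```python
import numpy as np

def bucket_representatives(params, levels=256):
    # Split [min, max] into `levels` equal segments; each parameter gets the
    # index (code) of its segment, and each non-empty segment is represented
    # by the mean of its members (a simple stand-in for per-subset clustering).
    lo, hi = float(params.min()), float(params.max())
    step = (hi - lo) / levels
    codes = np.clip(np.floor((params - lo) / step), 0, levels - 1).astype(int)
    reps = {c: float(params[codes == c].mean()) for c in np.unique(codes)}
    return codes, reps

rng = np.random.default_rng(0)
layer_weights = rng.normal(size=10_000)       # hypothetical layer parameters
codes, reps = bucket_representatives(layer_weights)
# Every weight is now one of at most 256 representative values: reps[codes[i]]
```

Replacing the per-subset mean with, say, a k-means centroid would be closer to a true clustering step; the association of each representative value with its element of [0, 255] is the mapping the text describes.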
204. Keep the initial parameter state of the remaining network layers in the feature processor, and combine the at least one quantized convolutional layer with the remaining network layers to obtain a quantized feature processor.
Referring to FIG. 8, the remaining network layers in this step may be bias layers, normalization layers, and the like; "keeping" means not quantizing their parameters, so that the remaining network layers still compute with the original floating-point numbers. For example, if every layer of the initial SSDlite convolutional neural network computes with 32-bit floating-point numbers, the parameters of the layers other than the convolutional layers in the feature processor are kept as 32-bit floating-point numbers, so that those layers compute in 32-bit floating point. In some possible embodiments, the quantized data of a convolutional layer may be converted back to the original floating-point type by the type converter before being stored in memory, so that the next layer reads data of the original floating-point type when it computes; alternatively, the quantized data may be stored in memory, and when the next layer computes, the quantized data it reads is fed into the type converter and converted to the original floating-point type for computation.
205. Keep the initial network-layer parameter state of the position predictor, and obtain a quantized SSDlite convolutional neural network based on the quantized feature processor and the position predictor.
206. Based on the quantized SSDlite convolutional neural network, output a target SSDlite convolutional neural network.
In this embodiment, by quantizing the convolutional layers in the feature processor while the other network layers keep computing with the original floating-point numbers, the algorithmic complexity of the SSDlite convolutional neural network is reduced while the accuracy loss is lowered, further preserving the detection accuracy of the SSDlite convolutional neural network. It should be noted that steps 202, 203 and 204 may be regarded as a further optional refinement of step 102 in the embodiment of FIG. 1; in some possible implementations, all network layers in the entire feature processor may also be quantized.
Optionally, quantizing the parameters of the at least one convolutional layer in the feature processor to obtain at least one quantized convolutional layer includes:
obtaining a weight-value interval of a convolutional layer to be quantized;
quantizing the weight-value interval of the convolutional layer to be quantized to obtain a quantized interval of weight values;
obtaining a quantized convolutional layer according to the quantized interval of weight values.
In this implementation, the convolutional layer to be quantized may be one, several, or all of the convolutional layers in the feature processor, and its weight-value interval may be built by obtaining the maximum and minimum weight values of that layer. The quantization of the weight values may be linear quantization, and the quantized interval may be determined by the floating-point representation to be chosen. After the weight values of the convolutional layer to be quantized are quantized, a quantized convolutional layer is obtained, thereby realizing quantization of the feature processor.
Optionally, quantizing the parameters of the at least one convolutional layer in the feature processor to obtain at least one quantized convolutional layer includes:
obtaining an input-data interval of a convolutional layer to be quantized;
quantizing the input-data interval of the convolutional layer to be quantized to obtain a quantized interval of input data;
obtaining a quantized convolutional layer according to the quantized interval of input data.
In this implementation, the input data may be image data, and quantizing the image data reduces the amount of computation of the algorithm. As with weight quantization, the input data may be quantized by linear quantization.
Optionally, quantizing the parameters of the at least one convolutional layer in the feature processor to obtain at least one quantized convolutional layer includes:
obtaining a parameter interval of a convolutional layer to be quantized;
linearly quantizing the parameter interval with 8-bit floating-point numbers to obtain a parameter quantization interval of [0, 255];
obtaining a quantized convolutional layer according to the parameter quantization interval.
In this implementation, obtaining the parameter interval of the convolutional layer to be quantized may mean obtaining the maximum and minimum of each parameter to form a continuous parameter interval. Linearly quantizing the parameter interval with 8-bit numbers may be understood as dividing the continuous parameter interval into 256 sub-intervals and associating each sub-interval one-to-one with an element of the discrete interval [0, 255], thereby discretizing the parameter quantization interval and reducing the algorithmic complexity of the convolutional layer.
Optionally, after the initial SSDlite convolutional neural network is built, the method further includes:
training the initial SSDlite convolutional neural network with preset training sample data.
In this embodiment, training the built initial SSDlite convolutional neural network yields a trained SSDlite convolutional neural network, so that no retraining is needed after quantization. Of course, in some possible embodiments, the initial SSDlite convolutional neural network may already have been trained before being input.
The above optional implementations can realize the quantization methods for a convolutional neural network of the embodiments corresponding to FIG. 1 and FIG. 2 and achieve the same effects, which are not repeated here.
In a second aspect, referring to FIG. 3, FIG. 3 is a schematic structural diagram of a quantization apparatus for a convolutional neural network according to an embodiment of the present application. As shown in FIG. 3, the apparatus includes:
an obtaining module, configured to obtain an initial SSDlite convolutional neural network, the initial SSDlite convolutional neural network comprising a feature processor for feature extraction and a position predictor for predicting feature positions;
a quantization module, configured to quantize the network-layer parameters in the feature processor to obtain a quantized feature processor;
a keeping module, configured to keep the initial network-layer parameter state of the position predictor, and combine the quantized feature processor with the position predictor to obtain a quantized SSDlite convolutional neural network;
an output module, configured to output a target SSDlite convolutional neural network based on the quantized SSDlite convolutional neural network.
Optionally, as shown in FIG. 4, the quantization module includes:
an obtaining unit, configured to obtain the parameters of at least one convolutional layer in the feature processor;
a quantization unit, configured to quantize the parameters of the at least one convolutional layer in the feature processor to obtain at least one quantized convolutional layer;
a keeping unit, configured to keep the initial parameter state of the remaining network layers in the feature processor, and combine the at least one quantized convolutional layer with the remaining network layers to obtain a quantized feature processor.
Optionally, as shown in FIG. 5, the quantization unit includes:
an obtaining subunit, configured to obtain the weight-value interval of a convolutional layer to be quantized;
a quantization subunit, configured to quantize the weight-value interval of the convolutional layer to be quantized to obtain a quantized interval of weight values;
an output subunit, configured to obtain a quantized convolutional layer according to the quantized interval of weight values.
Optionally, as shown in FIG. 5, the quantization unit includes:
an obtaining subunit, configured to obtain the input-data interval of a convolutional layer to be quantized;
a quantization subunit, configured to quantize the input-data interval of the convolutional layer to be quantized to obtain a quantized interval of input data;
an output subunit, configured to obtain a quantized convolutional layer according to the quantized interval of input data.
Optionally, as shown in FIG. 5, the quantization unit includes:
an obtaining subunit, configured to obtain the parameter interval of a convolutional layer to be quantized;
a quantization subunit, configured to linearly quantize the parameter interval with 8-bit floating-point numbers to obtain a parameter quantization interval of [0, 255];
an output subunit, configured to obtain a quantized convolutional layer according to the parameter quantization interval.
Optionally, as shown in FIG. 6, the apparatus further includes:
a training module, configured to train the initial SSDlite convolutional neural network with preset training sample data.
In a third aspect, an embodiment of the present application provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein when the processor executes the computer program, the steps of the quantization method for a convolutional neural network provided by the embodiments of the present application are implemented.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program, wherein when the computer program is executed by a processor, the steps of the quantization method for a convolutional neural network provided by the embodiments of the present application are implemented.
It should be noted that, for brevity of description, the foregoing method embodiments are all expressed as a series of combined actions; however, a person skilled in the art should know that the present application is not limited by the described order of actions, since according to the present application some steps may be performed in other orders or simultaneously. Furthermore, a person skilled in the art should also know that the embodiments described in the specification are all optional embodiments, and the actions and modules involved are not necessarily required by the present application.
In the above embodiments, each embodiment is described with its own emphasis; for parts not detailed in one embodiment, reference may be made to the relevant descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative.
In addition, the processors and chips in the embodiments of the present application may be integrated into one processing unit, may exist separately as physical units, or two or more pieces of hardware may be integrated into one unit. The computer-readable storage medium or computer-readable program may be stored in a computer-readable memory. Based on this understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a memory and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present application. The aforementioned memory includes various media that can store program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
A person of ordinary skill in the art can understand that all or some of the steps in the various methods of the above embodiments may be completed by a program instructing related hardware; the program may be stored in a computer-readable memory, and the memory may include a flash drive, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or the like.
The foregoing is a further detailed description of the present application in conjunction with specific preferred embodiments, and the specific implementation of the present application shall not be deemed limited to these descriptions. For a person of ordinary skill in the technical field of the present application, several simple deductions or substitutions may be made without departing from the concept of the present application, all of which shall be deemed to fall within the protection scope of the present application.

Claims (10)

  1. A quantization method for a convolutional neural network, for a lightweight single-shot multi-object SSDlite convolutional neural network, characterized in that the method comprises:
    obtaining an initial SSDlite convolutional neural network, the initial SSDlite convolutional neural network comprising a feature processor for feature extraction and a position predictor for predicting feature positions;
    quantizing network-layer parameters in the feature processor to obtain a quantized feature processor;
    keeping an initial network-layer parameter state of the position predictor, and obtaining a quantized SSDlite convolutional neural network based on the quantized feature processor and the position predictor;
    based on the quantized SSDlite convolutional neural network, outputting a target SSDlite convolutional neural network.
  2. The method according to claim 1, characterized in that quantizing the network-layer parameters in the feature processor to obtain a quantized feature processor comprises:
    obtaining parameters of at least one convolutional layer in the feature processor;
    quantizing the parameters of the at least one convolutional layer in the feature processor to obtain at least one quantized convolutional layer;
    keeping an initial parameter state of remaining network layers in the feature processor, and combining the at least one quantized convolutional layer with the remaining network layers to obtain the quantized feature processor.
  3. The method according to claim 2, characterized in that quantizing the parameters of the at least one convolutional layer in the feature processor to obtain at least one quantized convolutional layer comprises:
    obtaining a weight-value interval of a convolutional layer to be quantized;
    quantizing the weight-value interval of the convolutional layer to be quantized to obtain a quantized interval of weight values;
    obtaining a quantized convolutional layer according to the quantized interval of weight values.
  4. The method according to claim 2, characterized in that quantizing the parameters of the at least one convolutional layer in the feature processor to obtain at least one quantized convolutional layer comprises:
    obtaining an input-data interval of a convolutional layer to be quantized;
    quantizing the input-data interval of the convolutional layer to be quantized to obtain a quantized interval of input data;
    obtaining a quantized convolutional layer according to the quantized interval of input data.
  5. The method according to claim 2, characterized in that quantizing the parameters of the at least one convolutional layer in the feature processor to obtain at least one quantized convolutional layer comprises:
    obtaining a parameter interval of a convolutional layer to be quantized;
    linearly quantizing the parameter interval with 8-bit floating-point numbers to obtain a parameter quantization interval of [0, 255];
    obtaining a quantized convolutional layer according to the parameter quantization interval.
  6. The method according to any one of claims 1 to 5, characterized in that after building the initial SSDlite convolutional neural network, the method further comprises:
    training the initial SSDlite convolutional neural network with preset training sample data.
  7. A quantization apparatus for a convolutional neural network, for an SSDlite convolutional neural network, characterized in that the apparatus comprises:
    an obtaining module, configured to obtain an initial SSDlite convolutional neural network, the initial SSDlite convolutional neural network comprising a feature processor for feature extraction and a position predictor for predicting feature positions;
    a quantization module, configured to quantize network-layer parameters in the feature processor to obtain a quantized feature processor;
    a keeping module, configured to keep an initial network-layer parameter state of the position predictor, and combine the quantized feature processor with the position predictor to obtain a quantized SSDlite convolutional neural network;
    an output module, configured to output a target SSDlite convolutional neural network based on the quantized SSDlite convolutional neural network.
  8. The apparatus according to claim 7, characterized in that the quantization module comprises:
    an obtaining unit, configured to obtain parameters of at least one convolutional layer in the feature processor;
    a quantization unit, configured to quantize the parameters of the at least one convolutional layer in the feature processor to obtain at least one quantized convolutional layer;
    a keeping unit, configured to keep an initial parameter state of remaining network layers in the feature processor, and combine the at least one quantized convolutional layer with the remaining network layers to obtain the quantized feature processor.
  9. An electronic device, characterized by comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein when the processor executes the computer program, the steps of the quantization method for a convolutional neural network according to any one of claims 1 to 6 are implemented.
  10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, wherein when the computer program is executed by a processor, the steps of the quantization method for a convolutional neural network according to any one of claims 1 to 6 are implemented.
PCT/CN2018/120560 2018-12-12 2018-12-12 Quantization method and apparatus for convolutional neural network, and electronic device WO2020118553A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2018/120560 WO2020118553A1 (zh) Quantization method and apparatus for convolutional neural network, and electronic device
CN201880083718.8A CN111542838B (zh) Quantization method and apparatus for convolutional neural network, and electronic device


Publications (1)

Publication Number Publication Date
WO2020118553A1 true WO2020118553A1 (zh) 2020-06-18

Family

ID=71075935

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/120560 WO2020118553A1 (zh) Quantization method and apparatus for convolutional neural network, and electronic device

Country Status (2)

Country Link
CN (1) CN111542838B (zh)
WO (1) WO2020118553A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112101284A * 2020-09-25 2020-12-18 北京百度网讯科技有限公司 Image recognition method, and training method, apparatus and system for an image recognition model
CN112232491A * 2020-10-29 2021-01-15 深兰人工智能(深圳)有限公司 Feature extraction method and apparatus based on a convolutional neural network model

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016168235A1 * 2015-04-17 2016-10-20 Nec Laboratories America, Inc. Fine-grained image classification by exploring bipartite-graph labels
CN107480770A * 2017-07-27 2017-12-15 中国科学院自动化研究所 Method and apparatus for neural network quantization and compression with adjustable quantization bit width
CN108510067A * 2018-04-11 2018-09-07 西安电子科技大学 Convolutional neural network quantization method based on engineering implementation

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017031630A1 (zh) * 2015-08-21 2017-03-02 中国科学院自动化研究所 Acceleration and compression method for deep convolutional neural network based on parameter quantization
CN107239826A (zh) * 2017-06-06 2017-10-10 上海兆芯集成电路有限公司 Computation method and apparatus in a convolutional neural network
CN108304919A (zh) * 2018-01-29 2018-07-20 百度在线网络技术(北京)有限公司 Method and apparatus for generating a convolutional neural network
CN108596143B (zh) * 2018-05-03 2021-07-27 复旦大学 Face recognition method and apparatus based on residual-quantized convolutional neural network


Also Published As

Publication number Publication date
CN111542838B (zh) 2024-02-20
CN111542838A (zh) 2020-08-14


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 18942725; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established (Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 29.09.2021))
122 Ep: pct application non-entry in european phase (Ref document number: 18942725; Country of ref document: EP; Kind code of ref document: A1)