WO2023015674A1 - Method for multi-bit-width quantization of a deep convolutional neural network - Google Patents

Method for multi-bit-width quantization of a deep convolutional neural network

Info

Publication number
WO2023015674A1
WO2023015674A1 PCT/CN2021/119006 CN2021119006W WO2023015674A1 WO 2023015674 A1 WO2023015674 A1 WO 2023015674A1 CN 2021119006 W CN2021119006 W CN 2021119006W WO 2023015674 A1 WO2023015674 A1 WO 2023015674A1
Authority
WO
WIPO (PCT)
Prior art keywords
bit
width
quantization
model
constraints
Prior art date
Application number
PCT/CN2021/119006
Other languages
English (en)
French (fr)
Inventor
王东
李浥东
许柯
冯乾泰
Original Assignee
北京交通大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京交通大学
Publication of WO2023015674A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/086 Learning methods using evolutionary algorithms, e.g. genetic algorithms or genetic programming

Definitions

  • The invention relates to the technical field of convolutional neural networks, and in particular to a method for multi-bit-width quantization of deep convolutional neural networks.
  • Neural network quantization refers to compressing a neural network model from 32-bit floating-point format into an 8-bit to 1-bit fixed-point format in order to reduce storage and computation costs.
  • Neural network quantization is currently a popular technique for compressing deep neural networks so that they can be deployed on edge devices that perform fixed-point computation.
  • The technical route of quantizing once and deploying in multiple scenarios is a new direction in quantization.
  • Current technical solutions include apq, oqa, coquant, any precision, and robust quantization.
  • A multi-bit-width-aware quantization method for one-time quantization and multi-scenario deployment supports multiple deployments after a single round of quantization training, which removes the training cost that traditional quantization methods incur by training a separate quantized model for each scenario.
  • Prior-art neural network compression and quantization methods all focus on quantized models with a fixed bit width (single precision). The model must be quantized and compressed independently for each set of hardware characteristics (processor computing precision) and constraints (model accuracy), which easily leads to large overheads in computing resources, human resources, and time when different deployment scenarios must be served (for example, sometimes cloud computing and sometimes edge computing).
  • Embodiments of the present invention provide a method for multi-bit-width quantization of a deep convolutional neural network, so as to overcome the problems of the prior art.
  • The present invention adopts the following technical solutions.
  • A method for multi-bit-width quantization of deep convolutional neural networks, comprising:
  • Establishing the weight-sharing multi-bit-width-aware quantization model includes:
  • A weight-sharing multi-bit-width-aware quantization model is established.
  • The multi-bit-width-aware quantization model is a supernet with a multi-layer structure.
  • The sub-networks of the multi-bit-width-aware quantization model include a lowest-bit-width model, a highest-bit-width model, and random-bit-width models; multiple sub-networks in the model are quantized and trained simultaneously;
  • Let the quantization configuration of the multi-bit-width-aware quantization model be expressed as a set of per-layer bit widths that give the weight bit width and the activation bit width of each layer l. Given floating-point weights w and activations v, a set of learnable quantization step sizes, and a set of zero points, the objective function for training the multi-bit-width-aware quantization model is given by Formula 1;
  • Q(·) denotes the quantization function.
  • Performing multi-bit-width-aware quantization supernet training on the multi-bit-width-aware quantization model includes:
  • The training target is the objective function shown in Formula 1, and the M+2 different models are represented by different settings in Formula 1;
  • A^e is a square matrix with N rows and N columns, and each column of A^e corresponds to the soft label of one category.
  • p_L(x_i), p_R(x_i), and p_H(x_i) are the logit outputs of the highest-bit-width model, the random-bit-width model, and the lowest-bit-width model, respectively;
  • Formula 3 is updated once in every iteration; A^e is normalized at the end of each epoch and used in Formula 4 during the next epoch. The training process of the multi-bit-width-aware quantization model ends when the model converges or the set number of training rounds is reached.
  • Setting the target constraint according to requirements, performing a mixed-precision search on the trained multi-bit-width-aware quantization model according to the target constraint to obtain sub-networks that satisfy the constraint, and using the sub-networks that satisfy the constraint to form a multi-bit-width quantized deep convolutional neural network includes:
  • The trained multi-bit-width-aware quantization model is regarded as a model pool containing many sub-networks, and the target constraint is set according to the required multi-bit-width quantized deep convolutional neural network.
  • The target constraint includes an average-bit constraint. According to the target constraint, three methods (Monte Carlo sampling, a quantization-aware accuracy predictor, and a genetic algorithm) are used to perform a mixed-precision search on the trained multi-bit-width-aware quantization model and find sub-networks that satisfy the constraint;
  • Each target sub-network serves separately as an independent unit in the multi-bit-width quantized deep convolutional neural network.
  • Using the three methods of Monte Carlo sampling, a quantization-aware accuracy predictor, and a genetic algorithm to perform a mixed-precision search on the trained multi-bit-width-aware quantization model and find sub-networks that satisfy the constraint includes:
  • Monte Carlo sampling is used to generate several chromosomes, which serve as the initial Pareto solution set, and Monte Carlo sampling is also used to generate architecture-accuracy data pairs.
  • For each chromosome, the prediction output of the quantization-aware accuracy predictor is used as the chromosome's fitness score; the chromosome with the highest fitness score is saved and added to the elite set, and elites are selected for mutation and crossover according to a predetermined probability to obtain a new population.
  • The selection-mutation-crossover process is repeated until the algorithm reaches a Pareto solution that satisfies the average bit-width targets for weights and activations.
  • Through min-random-max bit-width co-training and adaptive label softening, the embodiments of the present invention resolve the competition among sub-networks of different bit widths during training and achieve higher model accuracy under different average bit-width constraints.
  • FIG. 1 is a processing flowchart of a method for performing multi-bit width quantization on a deep convolutional neural network according to an embodiment of the present invention.
  • Embodiments of the present invention provide a multi-bit-width-aware quantization method oriented to multi-scenario deployment (each application scenario has different requirements for neural network computing precision). The quantized deep convolutional neural network needs to be trained only once to obtain the all-in-once network, a multi-bit-width-aware quantization model that meets the needs of any number of deployments. This greatly reduces the time and computation cost of compressing deep convolutional neural networks, achieves higher model accuracy under different average-bit constraints, forms a better Pareto-optimal frontier, and makes neural network deployment lighter and better.
  • Under the premise of weight sharing, multi-bit-width awareness of the model is achieved through min-random-max bit-width co-training, and a quantized model for one-time quantization and multi-scenario deployment is constructed.
  • The performance of the quantization-aware accuracy predictor is improved through Monte Carlo search.
  • Step S10: establishing a weight-sharing multi-bit-width-aware quantization model.
  • The all-in-once quantization model supports diverse quantization bit-width configurations.
  • Assume that the quantization configuration of a model is expressed as a set of per-layer bit widths that give the weight bit width and the activation bit width of each layer l, and that floating-point weights w and activations v, a set of learnable quantization step sizes, and a set of zero points are given.
  • The objective function of supernet training is then given by Formula 1.
  • Q(·) denotes the quantization function.
  • The goal of multi-bit quantization is to learn a robust weight distribution together with independent quantization step sizes and zero-point sets under different bit-width configurations.
  • LSQ: Learned Step-size Quantization, a low-bit quantization scheme based on a trainable step size.
  • Formula 1 expresses the objective function of supernet training.
  • Formula 2 is the LSQ quantization formula. It can be regarded as a concrete description of Q(·) in Formula 1, where k means quantization to k bits.
  • The multi-bit-width-aware quantization model decouples model weights from quantization step sizes in order to build a model structure in which, across multiple bit widths, the weights are shared and the quantization step sizes are independent.
  • The multi-bit-width-aware quantization model predefines a quantization step size for every bit width of every layer, and setting the quantization bit width of each layer activates the corresponding step size and quantization bounds. The model can therefore be switched flexibly between uniform quantization and mixed-precision quantization for different bit-width scenarios.
  • Step S20: performing multi-bit-width-aware quantization supernet training on the multi-bit-width-aware quantization model.
  • This method proposes min-random-max bit-width co-training and adaptive label softening to train the multi-bit-width-aware quantization model iteratively.
  • Training of the multi-bit-width-aware quantization model covers the lowest-bit-width model, the highest-bit-width model, and M random-bit-width models.
  • These M+2 sub-networks are optimized simultaneously.
  • The training target is the objective function shown in Formula 1, and the M+2 different models are represented by different settings in Formula 1.
  • With min-random-max bit-width co-training, in each training iteration the lowest-bit-width model (for example, a fixed 2 bits per layer), the highest-bit-width model (for example, a fixed 8 bits per layer), and two random-bit-width models are trained at the same time to improve the overall performance of the supernet model.
  • Adaptive label softening. Given a dataset containing N categories, x_i denotes an input image and y_i denotes the corresponding ground-truth label. A^e is defined as the class-level soft label of each epoch; it is a square matrix with N rows and N columns, and each column of A^e corresponds to the soft label of one category.
  • When an input sample (x_i, y_i) is classified correctly by any of the quantized models, we construct {p_L(x_i), p_R(x_i), p_H(x_i)} to update column y_i of A^e; M denotes the number of random sub-networks and n denotes the predicted value.
  • p_L(x_i), p_R(x_i), and p_H(x_i) all describe the same object.
  • The balance coefficient ζ is generally set to 0.5.
  • p_L(x_i), p_R(x_i), and p_H(x_i) are the logit outputs of the highest-bit-width model, the random-bit-width model, and the lowest-bit-width model mentioned above, respectively.
  • Formula 3 is updated once in every iteration; A^e is normalized at the end of each epoch and used in Formula 4 during the next epoch.
  • The total number of epochs is set manually.
  • The training process of the multi-bit-width-aware quantization model ends when the model converges or the set number of training epochs is reached.
  • One condition for judging that the multi-bit-width-aware quantization model has converged is that accuracy no longer improves as the number of training epochs increases.
  • Step S30: the trained multi-bit-width-aware quantization model is regarded as a large model pool that contains many sub-networks, from which sub-networks that meet the requirements can be selected. For example, if a quantized deep convolutional neural network with an average bit width of 4 is required, the target constraint is set to 4. According to the target constraint, three methods (Monte Carlo sampling, a quantization-aware accuracy predictor, and a genetic algorithm) are used to perform a mixed-precision search on the trained multi-bit-width-aware quantization model and find the target sub-network.
  • The target constraint includes an average-bit constraint.
  • Under the average-bit constraint, the activations and weights of each layer may use different bit widths; the average bit is the value obtained by weighting the bit widths of the activations and weights of all layers by their respective proportions.
  • A multi-bit-width quantized deep convolutional neural network is formed from the target sub-networks that satisfy the constraint, and each target sub-network serves separately as an independent unit in that network.
  • Monte Carlo sampling. We first explain Monte Carlo sampling.
  • In a supernet, a sampling pool of (sub-network architecture, average bit) pairs is obtained through uniform random sampling. For example, by randomly collecting 500,000 sub-network models and computing the average number of bits of each, an empirical distribution of the per-layer bit widths under each average number of bits is obtained. Sampling from this empirical distribution yields results that satisfy the target with higher probability.
  • Monte Carlo sampling is applied in two places: constructing the training dataset for the quantization-aware accuracy predictor, and sampling the initial constraint-satisfying population in the genetic algorithm for mixed-precision search.
  • We propose a quantization-aware accuracy predictor to provide an accurate estimate of the network's accuracy; it predicts the accuracy of a model given its configuration. More specifically, it is a 7-layer feed-forward neural network with each embedding dimension equal to 150.
  • The bit-width configuration is encoded into a one-hot vector and fed into the predictor, which outputs the predicted accuracy. For example, in a weight bit-width configuration such as [2,4,6,4,8], each number is the quantization bit width of the weights of one layer; activations are handled in the same way.
  • In particular, we use Monte Carlo sampling to generate architecture-accuracy data pairs, which avoids imbalance in the dataset and improves prediction performance at lower and higher bit widths, such as accuracy prediction for models below 3 bits or above 7 bits.
  • Specifically, an average number of bits, such as 5 bits, is first sampled uniformly at random, and Monte Carlo sampling is then used to draw from the empirical distribution for 5 bits, so that the sampled model easily meets the 5-bit constraint. The resulting dataset is more uniform, whereas uniform random sampling concentrates most sampled sub-networks in the middle of the bit range.
  • The genetic algorithm for mixed-precision search first uses Monte Carlo sampling to generate several chromosomes (that is, sub-network configurations: the bit-width settings of the different layers) as the initial Pareto solution set. Monte Carlo sampling greatly shortens the time needed to construct the initial solution set.
  • Then, for each chromosome, the prediction output of the quantization-aware accuracy predictor is used as the chromosome's fitness score.
  • Finally, the chromosome with the highest fitness score is saved and added to the elite set, and elites are then selected for mutation and crossover according to a predetermined probability to obtain a new population.
  • The selection-mutation-crossover process is repeated until the algorithm reaches a Pareto solution that satisfies the average bit-width targets for weights and activations.
  • Through min-random-max bit-width co-training and adaptive label softening, the embodiments of the present invention resolve the competition among sub-networks of different bit widths during training and achieve higher model accuracy under different average bit-width constraints.
  • High-performance models can be deployed quickly in application scenarios with different quantization constraints without repeating quantization training, which saves a large amount of computing resources and time.
  • Through an evolutionary algorithm optimized by Monte Carlo sampling, the embodiments of the present invention improve the performance of the quantization-aware accuracy predictor, greatly improve search efficiency, and reduce the time needed to obtain the target sub-network.
  • The embodiments in this specification are described in a progressive manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others.
  • In particular, because the device or system embodiments are basically similar to the method embodiments, their description is relatively brief; for relevant details, refer to the corresponding description of the method embodiments.
  • The device and system embodiments described above are only illustrative. The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of an embodiment. A person of ordinary skill in the art can understand and implement this without creative effort.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Neurology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Physiology (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

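A method for multi-bit-width quantization of a deep convolutional neural network. The method comprises: establishing a weight-sharing multi-bit-width-aware quantization model; performing multi-bit-width-aware quantization supernet training on the multi-bit-width-aware quantization model; setting a target constraint according to requirements; performing a mixed-precision search on the trained multi-bit-width-aware quantization model according to the target constraint to obtain sub-networks that satisfy the constraint; and using the sub-networks that satisfy the constraint to form a multi-bit-width quantized deep convolutional neural network. Through min-random-max bit-width co-training and adaptive label softening, the method resolves the competition among sub-networks of different bit widths during training and achieves higher model accuracy under different average bit-width constraints.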

Description

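Method for multi-bit-width quantization of a deep convolutional neural network
Technical Field
The present invention relates to the technical field of convolutional neural networks, and in particular to a method for multi-bit-width quantization of a deep convolutional neural network.
Background Art
Neural network quantization refers to compressing a neural network model from 32-bit floating-point format into an 8-bit to 1-bit fixed-point format in order to reduce storage and computation costs. It is currently a popular technique for compressing deep neural networks so that they can be deployed on edge devices that perform fixed-point computation. Quantizing once and deploying in multiple scenarios is a new direction in quantization; current technical solutions include apq, oqa, coquant, any precision, and robust quantization. A multi-bit-width-aware quantization method for one-time quantization and multi-scenario deployment needs only one round of quantization training to support multiple deployments, which removes the training cost that traditional quantization methods incur by performing quantization training on a separate model for each scenario.
At present, prior-art neural network compression and quantization methods all focus on quantized models with a fixed bit width (single precision). The model must be quantized and compressed independently for each set of hardware characteristics (processor computing precision) and constraints (model accuracy), which easily leads to large overheads in computing resources, human resources, and time when different deployment scenarios must be served (for example, sometimes cloud computing and sometimes edge computing).
Other prior-art solutions for one-time quantization and multi-scenario deployment also have a number of shortcomings. The apq method cannot achieve lower-bit quantization: it only supports mixed-precision quantization among three bit widths (4, 6, and 8 bits) and does not quantize below 4 bits. oqa only supports uniform bit-width quantization and cannot perform mixed-bit quantization (that is, all neural network layers must share the same bit precision, and different layers cannot be compressed to different bit precisions), so its flexibility is poor. Others, such as coquant, any precision, and robust quantization, suffer a large accuracy loss in low-bit quantization.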
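Summary of the Invention
Embodiments of the present invention provide a method for multi-bit-width quantization of a deep convolutional neural network, so as to overcome the problems of the prior art.
To achieve the above objective, the present invention adopts the following technical solutions.
A method for multi-bit-width quantization of a deep convolutional neural network, comprising:
establishing a weight-sharing multi-bit-width-aware quantization model;
performing multi-bit-width-aware quantization supernet training on the multi-bit-width-aware quantization model;
setting a target constraint according to requirements, performing a mixed-precision search on the trained multi-bit-width-aware quantization model according to the target constraint to obtain sub-networks that satisfy the constraint, and using the sub-networks that satisfy the constraint to form a multi-bit-width quantized deep convolutional neural network.
Preferably, establishing the weight-sharing multi-bit-width-aware quantization model comprises:
establishing a weight-sharing multi-bit-width-aware quantization model, which is a supernet with a multi-layer structure, wherein the sub-networks of the multi-bit-width-aware quantization model include a lowest-bit-width model, a highest-bit-width model, and random-bit-width models, and multiple sub-networks in the multi-bit-width-aware quantization model are quantized and trained simultaneously;
letting the quantization configuration of the multi-bit-width-aware quantization model be expressed as
Figure PCTCN2021119006-appb-000001
which respectively denote the weight bit width and the activation bit width of layer l; given floating-point weights w and activations v, a set of learnable quantization step sizes
Figure PCTCN2021119006-appb-000002
and a set of zero points
Figure PCTCN2021119006-appb-000003
the objective function for training the multi-bit-width-aware quantization model is expressed as:
Figure PCTCN2021119006-appb-000004
Q(·) denotes the quantization function.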
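Preferably, performing multi-bit-width-aware quantization supernet training on the multi-bit-width-aware quantization model comprises:
using min-random-max bit-width co-training, in each training iteration simultaneously optimizing the M+2 sub-networks of the multi-bit-width-aware quantization model, namely the lowest-bit-width model, the highest-bit-width model, and M random-bit-width models, wherein the training target is the objective function shown in Formula 1, and the M+2 different models are represented by different
Figure PCTCN2021119006-appb-000005
in Formula 1;
adaptive label softening: given a dataset
Figure PCTCN2021119006-appb-000006
containing N categories, where x_i denotes an input image and y_i denotes the corresponding ground-truth label, defining
Figure PCTCN2021119006-appb-000007
as the class-level soft label of each epoch, where A^e is a square matrix with N rows and N columns and each column of A^e corresponds to the soft label of one category; when an input sample (x_i, y_i) is classified correctly by any of the quantized models, constructing {p_L(x_i), p_R(x_i), p_H(x_i)} to update column y_i of A^e, where M denotes the number of random sub-networks, n denotes the predicted value, and p_L(x_i), p_R(x_i), and p_H(x_i) all describe the same object, as follows:
Figure PCTCN2021119006-appb-000008
the Adaptive Soft Label Loss is then expressed as:
Figure PCTCN2021119006-appb-000009
Figure PCTCN2021119006-appb-000010
denotes the value of matrix A at coordinate (n, y_i) in epoch e, and the balance coefficient ζ is set to 0.5;
p_L(x_i), p_R(x_i), and p_H(x_i) are the logit outputs of the highest-bit-width model, the random-bit-width model, and the lowest-bit-width model, respectively;
Formula 3 is updated once in every iteration, A^e is normalized at the end of each epoch and used in Formula 4 during the next epoch, and the training process of the multi-bit-width-aware quantization model ends when the model converges or the set number of training rounds is reached.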
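Preferably, setting the target constraint according to requirements, performing a mixed-precision search on the trained multi-bit-width-aware quantization model according to the target constraint to obtain sub-networks that satisfy the constraint, and using the sub-networks that satisfy the constraint to form a multi-bit-width quantized deep convolutional neural network comprises:
regarding the trained multi-bit-width-aware quantization model as a model pool containing many sub-networks, setting the target constraint, which includes an average-bit constraint, according to the required multi-bit-width quantized deep convolutional neural network, and, according to the target constraint, using three methods (Monte Carlo sampling, a quantization-aware accuracy predictor, and a genetic algorithm) to perform a mixed-precision search on the trained multi-bit-width-aware quantization model and find sub-networks that satisfy the constraint;
forming the required multi-bit-width quantized deep convolutional neural network from the target sub-networks that satisfy the constraint, each target sub-network serving separately as an independent unit in the multi-bit-width quantized deep convolutional neural network.
Preferably, using the three methods of Monte Carlo sampling, a quantization-aware accuracy predictor, and a genetic algorithm according to the target constraint to perform a mixed-precision search on the trained multi-bit-width-aware quantization model and find sub-networks that satisfy the constraint comprises:
using Monte Carlo sampling to construct the training dataset of the quantization-aware accuracy predictor and to sample the initial constraint-satisfying population of the genetic algorithm for mixed-precision search, and using the quantization-aware accuracy predictor to estimate accuracy during the mixed-precision search;
generating several chromosomes by Monte Carlo sampling according to the sub-network configuration and the bit-width settings of the different layers, taking these chromosomes as the initial Pareto solution set, using Monte Carlo sampling to generate architecture-accuracy data pairs, using, for each chromosome, the prediction output of the quantization-aware accuracy predictor as the chromosome's fitness score, saving the chromosome with the highest fitness score and adding it to the elite set, selecting elites for mutation and crossover according to a predetermined probability to obtain a new population, and repeating the selection-mutation-crossover process until the algorithm reaches a Pareto solution that satisfies the average bit-width targets for weights and activations.
As can be seen from the technical solutions provided by the above embodiments of the present invention, the embodiments resolve the competition among sub-networks of different bit widths during training through min-random-max bit-width co-training and adaptive label softening, and achieve higher model accuracy under different average bit-width constraints.
Additional aspects and advantages of the present invention will be set forth in part in the description that follows, and will become apparent from that description or be learned through practice of the invention.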
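Brief Description of the Drawings
To describe the technical solutions of the embodiments of the present invention more clearly, the accompanying drawings required for describing the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a processing flowchart of a method for multi-bit-width quantization of a deep convolutional neural network according to an embodiment of the present invention.
Detailed Description
Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary, are intended only to explain the present invention, and shall not be construed as limiting it.
A person skilled in the art will understand that, unless expressly stated otherwise, the singular forms "a", "an", "said", and "the" used herein may also include the plural forms. It should further be understood that the word "comprising" used in the specification of the present invention indicates the presence of the stated features, integers, steps, operations, elements, and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It should be understood that when an element is said to be "connected" or "coupled" to another element, it may be directly connected or coupled to the other element, or intermediate elements may be present. In addition, "connected" or "coupled" as used herein may include wireless connection or coupling. The expression "and/or" as used herein includes any unit and all combinations of one or more of the associated listed items.
A person skilled in the art will understand that, unless otherwise defined, all terms used herein (including technical and scientific terms) have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs. It should also be understood that terms such as those defined in general dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the prior art and, unless defined as herein, will not be interpreted in an idealized or overly formal sense.
To facilitate understanding of the embodiments of the present invention, several specific embodiments are further explained below with reference to the drawings; these embodiments do not limit the embodiments of the present invention.
Embodiments of the present invention provide a multi-bit-width-aware quantization method oriented to multi-scenario deployment (each application scenario has different requirements for neural network computing precision). The quantized deep convolutional neural network needs to be trained only once to obtain the all-in-once network, a multi-bit-width-aware quantization model that meets the needs of any number of deployments. This greatly reduces the time and computation cost of compressing deep convolutional neural networks, achieves higher model accuracy under different average-bit constraints, forms a better Pareto-optimal frontier, and makes neural network deployment lighter and better.
Under the premise of weight sharing, multi-bit-width awareness of the model is achieved through min-random-max bit-width co-training, and a quantized model for one-time quantization and multi-scenario deployment is constructed. Adaptive label softening resolves the vicious competition among sub-networks of different bit widths. Monte Carlo search improves the performance of the quantization-aware accuracy predictor.
The processing flow of the method for multi-bit-width quantization of a deep convolutional neural network provided by an embodiment of the present invention is shown in FIG. 1 and includes the following processing steps:
Step S10: establishing a weight-sharing multi-bit-width-aware quantization model.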
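First, the multi-bit-width-aware quantization training problem of this method is modeled. Unlike ordinary quantization, which targets one individual model, this method needs to quantize and train multiple sub-networks simultaneously within the same model. Taking resnet18 as an example, a resnet18 model with a bit-width range of 2 to 8 contains 7^42 sub-network models. Training multiple sub-network models simultaneously requires a new multi-bit-width-aware quantization formulation of the network training problem. "Supernet" is short for "supernet model", and "multi-bit-width-aware quantization model" describes the supernet at the functional level; the three terms refer to the same object, and the supernet comprises multiple layers. resnet18 has 21 layers, the activation and weight of each layer can be set independently, and the quantization bit width can be chosen from 2 to 8 bits, so there are 7^(21×2) sub-network models.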
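The size of this search space follows directly from the figures in the paragraph above: 21 layers, two independently configurable bit widths per layer (weights and activations), and 7 candidate bit widths (2 to 8). A minimal arithmetic sketch, with variable names chosen here purely for illustration:

```python
# Count the bit-width configurations available to a ResNet-18 supernet.
num_layers = 21                    # layers in resnet18, as stated in the text
settings_per_layer = 2             # one weight bit width and one activation bit width
bit_choices = list(range(2, 9))    # candidate bit widths 2..8, i.e. 7 options

num_settings = num_layers * settings_per_layer     # 42 independent choices
num_subnets = len(bit_choices) ** num_settings     # 7 ** 42

print(num_settings)
print(f"{num_subnets:.3e}")
```

The result is on the order of 10^35 configurations, which is why the sub-networks cannot be trained or evaluated one by one and a shared-weight supernet is used instead.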
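The all-in-once quantization model supports diverse quantization bit-width configurations. Assume that the quantization configuration of a model can be expressed as
Figure PCTCN2021119006-appb-000011
where
Figure PCTCN2021119006-appb-000012
respectively denote the weight bit width and the activation bit width of layer l; given floating-point weights w and activations v, a set of learnable quantization step sizes
Figure PCTCN2021119006-appb-000013
and a set of zero points
Figure PCTCN2021119006-appb-000014
the objective function of supernet training can be expressed as:
Figure PCTCN2021119006-appb-000015
Q(·) denotes the quantization function. The goal of multi-bit quantization is to learn a robust weight distribution together with independent quantization step sizes and zero-point sets under different bit-width configurations. To train quantized models efficiently, we adopt the low-bit quantization training method LSQ (Learned Step-size Quantization, low-bit quantization based on a trainable step size). Taking the quantization of activation v to k bits as an example, the weight-sharing quantization function is as follows:
Figure PCTCN2021119006-appb-000016
Formula 1 expresses the objective function of supernet training. Formula 2 is the LSQ quantization formula. It can be regarded as a concrete description of Q(·) in Formula 1, where k means quantization to k bits.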
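Formula 2 itself is only available as an image, so the following is a minimal sketch of a generic uniform quantizer with a step size and a zero point, not a reproduction of the patent's formula. The function name `quantize`, the `signed` flag, and the way the zero point enters are assumptions made for illustration. In LSQ the step size is a trainable parameter updated by gradient descent; only the forward computation is shown here.

```python
def quantize(v, step, zero_point, k, signed=False):
    """Uniformly quantize the float v to k bits and map it back to a float.

    Assumed form (not taken from the source): scale by the step size, shift by
    the zero point, round, clip to the k-bit integer range, then dequantize.
    """
    if signed:
        q_min, q_max = -(2 ** (k - 1)), 2 ** (k - 1) - 1
    else:
        q_min, q_max = 0, 2 ** k - 1
    q = round(v / step) + zero_point          # integer grid position
    q = max(q_min, min(q_max, q))             # clip to the representable range
    return (q - zero_point) * step            # back to floating point


# A 2-bit unsigned grid with step 0.1 can represent 0.0, 0.1, 0.2 and 0.3.
for value in (-1.0, 0.26, 5.0):
    print(value, quantize(value, step=0.1, zero_point=0, k=2))
```

Values outside the representable range saturate at the nearest bound, which is why the step size chosen for each bit width matters so much at 2 or 3 bits.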
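The multi-bit-width-aware quantization model decouples model weights from quantization step sizes in order to build a model structure in which, across multiple bit widths, the weights are shared and the quantization step sizes are independent. The model predefines a quantization step size for every bit width of every layer, and setting the quantization bit width of each layer activates the corresponding step size and quantization bounds. The model can therefore be switched flexibly between uniform quantization and mixed-precision quantization for different bit-width scenarios.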
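The structure described above, one set of weights plus one step size per candidate bit width, can be sketched as follows. The class name `SharedWeightLayer`, the step-size initialisation, and the symmetric signed grid are hypothetical choices for illustration; the source does not specify them, and in training the step sizes would be learned rather than fixed.

```python
class SharedWeightLayer:
    """One supernet layer: a single weight list and one step size per bit width."""

    def __init__(self, weights, bit_widths=range(2, 9)):
        self.weights = list(weights)              # shared by every bit width
        max_abs = max(abs(w) for w in self.weights) or 1.0
        # Hypothetical initialisation: spread the weight range over the
        # positive levels of each signed k-bit grid.
        self.steps = {k: max_abs / (2 ** (k - 1) - 1) for k in bit_widths}
        self.active_bits = max(self.steps)

    def set_bits(self, k):
        """Activate the step size and clipping bounds that belong to k bits."""
        if k not in self.steps:
            raise ValueError(f"unsupported bit width: {k}")
        self.active_bits = k

    def quantized_weights(self):
        k = self.active_bits
        s = self.steps[k]
        lo, hi = -(2 ** (k - 1)), 2 ** (k - 1) - 1
        return [max(lo, min(hi, round(w / s))) * s for w in self.weights]


layer = SharedWeightLayer([0.5, -1.0, 0.25])
for bits in (2, 4, 8):
    layer.set_bits(bits)
    print(bits, layer.quantized_weights())
```

Choosing the same bit width for every layer gives uniform quantization, while choosing a different value per layer gives a mixed-precision sub-network; in both cases the underlying `weights` are untouched.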
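Step S20: performing multi-bit-width-aware quantization supernet training on the multi-bit-width-aware quantization model.
This method proposes min-random-max bit-width co-training and adaptive label softening to train the multi-bit-width-aware quantization model iteratively.
Training of the multi-bit-width-aware quantization model simultaneously optimizes M+2 sub-networks: the lowest-bit-width model, the highest-bit-width model, and M random-bit-width models. The training target is the objective function shown in Formula 1, and the M+2 different models are represented by different
Figure PCTCN2021119006-appb-000017
in Formula 1. With min-random-max bit-width co-training, in each training iteration the lowest-bit-width model (for example, a fixed 2 bits per layer), the highest-bit-width model (for example, a fixed 8 bits per layer), and two random-bit-width models are trained at the same time to improve the overall performance of the supernet model.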
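The selection of sub-networks in one training iteration can be sketched as below. The function name `sample_training_configs` and the representation of a configuration as a list of (weight bits, activation bits) pairs are assumptions for illustration; how the losses of the M+2 sub-networks are combined is defined by Formula 1 and is not reproduced here.

```python
import random


def sample_training_configs(num_layers, m_random, min_bits=2, max_bits=8, rng=None):
    """Return the M+2 bit-width configurations used in one training iteration."""
    rng = rng or random.Random()
    # Each entry is a (weight bits, activation bits) pair for one layer.
    lowest = [(min_bits, min_bits)] * num_layers
    highest = [(max_bits, max_bits)] * num_layers
    randoms = [
        [(rng.randint(min_bits, max_bits), rng.randint(min_bits, max_bits))
         for _ in range(num_layers)]
        for _ in range(m_random)
    ]
    return [lowest, highest] + randoms


configs = sample_training_configs(num_layers=21, m_random=2, rng=random.Random(0))
print(len(configs))        # lowest + highest + 2 random configurations
print(configs[2][:3])      # first three layers of one random configuration
```

Including the two extremes in every iteration keeps the boundary sub-networks trained, while the random configurations cover the mixed-precision cases in between.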
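Adaptive label softening. Given a dataset
Figure PCTCN2021119006-appb-000018
containing N categories, x_i denotes an input image and y_i denotes the corresponding ground-truth label. Define
Figure PCTCN2021119006-appb-000019
as the class-level soft label of each epoch. A^e is a square matrix with N rows and N columns, and each column of A^e corresponds to the soft label of one category. When an input sample (x_i, y_i) is classified correctly by any of the quantized models, we construct {p_L(x_i), p_R(x_i), p_H(x_i)} to update column y_i of A^e; M denotes the number of random sub-networks and n denotes the predicted value. p_L(x_i), p_R(x_i), and p_H(x_i) all describe the same object.
This can be described as follows:
Figure PCTCN2021119006-appb-000020
The Adaptive Soft Label Loss can then be expressed as:
Figure PCTCN2021119006-appb-000021
Figure PCTCN2021119006-appb-000022
denotes the value of matrix A at coordinate (n, y_i) in epoch e. The balance coefficient ζ is generally set to 0.5.
p_L(x_i), p_R(x_i), and p_H(x_i) are the logit outputs of the highest-bit-width model, the random-bit-width model, and the lowest-bit-width model mentioned above, respectively.
Formula 3 is updated once in every iteration; A^e is normalized at the end of each epoch and used in Formula 4 during the next epoch. The total number of epochs is set manually. The training process of the multi-bit-width-aware quantization model ends when the model converges or the set number of training epochs is reached. One condition for judging that the model has converged is that accuracy no longer improves as the number of training epochs increases.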
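Formulas 3 and 4 are only available as images, so the sketch below shows one plausible reading of the procedure and should not be taken as the patent's exact formulas. The assumptions are: the update accumulates the softmax outputs of those sub-networks that classify the sample correctly into column y_i; normalisation makes each column sum to 1; and the loss blends soft-label and hard-label cross-entropy with weight ζ. The class name `AdaptiveSoftLabels` and all method names are hypothetical.

```python
import math


def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


class AdaptiveSoftLabels:
    def __init__(self, num_classes, zeta=0.5):
        self.n = num_classes
        self.zeta = zeta
        # current[n][c]: accumulated mass for class n in the column of true class c
        self.current = [[0.0] * num_classes for _ in range(num_classes)]
        self.previous = None          # normalised matrix from the previous epoch

    def update(self, logits_per_subnet, label):
        """Per-iteration update: accumulate outputs of sub-networks that are correct."""
        for logits in logits_per_subnet:
            probs = softmax(logits)
            if max(range(self.n), key=probs.__getitem__) == label:
                for n in range(self.n):
                    self.current[n][label] += probs[n]

    def end_epoch(self):
        """Normalise every column to sum to 1 and start a fresh accumulator."""
        normalised = [[0.0] * self.n for _ in range(self.n)]
        for c in range(self.n):
            total = sum(self.current[n][c] for n in range(self.n))
            for n in range(self.n):
                # A class with no correct prediction falls back to its one-hot label.
                normalised[n][c] = self.current[n][c] / total if total > 0 else float(n == c)
        self.previous = normalised
        self.current = [[0.0] * self.n for _ in range(self.n)]

    def loss(self, logits, label):
        """Blend soft-label and hard-label cross-entropy with weight zeta (assumed form)."""
        probs = [max(p, 1e-12) for p in softmax(logits)]
        hard = -math.log(probs[label])
        if self.previous is None:     # first epoch: no soft labels available yet
            return hard
        soft = -sum(self.previous[n][label] * math.log(probs[n]) for n in range(self.n))
        return self.zeta * soft + (1.0 - self.zeta) * hard
```

The soft labels used in an epoch come from the matrix normalised at the end of the previous epoch, which matches the description that A^e is normalised after each epoch and used in the next one.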
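Step S30: the trained multi-bit-width-aware quantization model is regarded as a large model pool that contains many sub-networks, from which sub-networks that meet the requirements can be selected. For example, if a quantized deep convolutional neural network with an average bit width of 4 is required, the target constraint is set to 4. According to the target constraint, three methods (Monte Carlo sampling, a quantization-aware accuracy predictor, and a genetic algorithm) are used to perform a mixed-precision search on the trained multi-bit-width-aware quantization model and find the target sub-network.
The target constraint includes an average-bit constraint. Under the average-bit constraint, the activations and weights of each layer may use different bit widths; the average bit is the value obtained by weighting the bit widths of the activations and weights of all layers by their respective proportions.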
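The average bit described above is a weighted mean of per-layer bit widths. The source does not say what the proportions are; the sketch below assumes they are each layer's share of the total element count (for example, the number of weights in the layer), and the function name `average_bits` is illustrative.

```python
def average_bits(bit_widths, sizes):
    """Weighted average bit width; `sizes` gives each layer's share (assumed: element counts)."""
    if len(bit_widths) != len(sizes):
        raise ValueError("bit_widths and sizes must have the same length")
    total = sum(sizes)
    return sum(b * s for b, s in zip(bit_widths, sizes)) / total


# Three layers: the largest layer dominates the average.
print(average_bits([2, 4, 8], [1, 1, 2]))   # (2*1 + 4*1 + 8*2) / 4 = 5.5
```

The same function can be applied separately to the weight bit widths and the activation bit widths, since the search targets an average for each.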
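A multi-bit-width quantized deep convolutional neural network is formed from the target sub-networks that satisfy the constraint, and each target sub-network serves separately as an independent unit in that network.
Monte Carlo sampling. We first explain Monte Carlo sampling. In a supernet, a sampling pool of (sub-network architecture, average bit) pairs is obtained through uniform random sampling. For example, by randomly collecting 500,000 sub-network models and computing the average number of bits of each, an empirical distribution of the per-layer bit widths under each average number of bits is obtained. Sampling from this empirical distribution yields results that satisfy the target with higher probability.
Monte Carlo sampling is applied in two places: constructing the training dataset for the quantization-aware accuracy predictor, and sampling the initial constraint-satisfying population in the genetic algorithm for mixed-precision search.
The technical details are as follows: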
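Given the average-bit constraints τ_w and τ_a for weights and activations respectively, the empirical approximation of
Figure PCTCN2021119006-appb-000023
is
Figure PCTCN2021119006-appb-000024
For ease of computing statistics,
Figure PCTCN2021119006-appb-000025
is calculated as follows:
Figure PCTCN2021119006-appb-000026
To construct the above distribution, we randomly sample a large number of architecture-average-bit data pairs
Figure PCTCN2021119006-appb-000027
from the sampling space to build the sampling pool. Let #(τ_w = τ_0) denote the total number of sub-networks in the sampling pool whose average bit width is τ_0, and let
Figure PCTCN2021119006-appb-000028
denote the total number of occurrences of the data pair
Figure PCTCN2021119006-appb-000029
in the sampling pool; then
Figure PCTCN2021119006-appb-000030
can be estimated as follows:
Figure PCTCN2021119006-appb-000031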
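The counting estimate described above can be sketched as follows. Because the formulas are only available as images, this is a simplified illustration rather than the patent's exact estimator: it handles a single bit width per layer, uses an unweighted mean rounded to the nearest integer as the average bit, and factorises the distribution layer by layer. The names `build_empirical` and `sample_for_target` are hypothetical.

```python
import random
from collections import Counter, defaultdict

BITS = list(range(2, 9))


def build_empirical(num_layers, pool_size, rng):
    """Estimate P(bit width of layer l = b | rounded average bit = tau) by counting."""
    counts = defaultdict(lambda: [Counter() for _ in range(num_layers)])
    for _ in range(pool_size):
        config = [rng.choice(BITS) for _ in range(num_layers)]   # uniform random sub-network
        tau = round(sum(config) / num_layers)
        for layer, bits in enumerate(config):
            counts[tau][layer][bits] += 1
    return counts


def sample_for_target(counts, tau, rng, tolerance=0.5, max_tries=1000):
    """Draw layer by layer from the empirical distribution until the average is near tau."""
    if tau not in counts:
        raise KeyError(f"no pooled sub-network has average bit {tau}")
    per_layer = counts[tau]
    for _ in range(max_tries):
        config = [rng.choices(list(c.keys()), weights=list(c.values()))[0] for c in per_layer]
        if abs(sum(config) / len(config) - tau) <= tolerance:
            return config
    raise RuntimeError("no configuration met the target within max_tries")


rng = random.Random(0)
counts = build_empirical(num_layers=8, pool_size=20000, rng=rng)
config = sample_for_target(counts, tau=3, rng=rng)
print(config, sum(config) / len(config))
```

Uniform sampling rarely produces a sub-network with a low average such as 3 bits, whereas drawing from the distribution conditioned on that average reaches the target in far fewer attempts.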
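During the search, it is very important to speed up the evaluation of candidate models. We propose a quantization-aware accuracy predictor that gives an accurate estimate of a network's accuracy; it predicts the accuracy of a model given its configuration. More specifically, it is a 7-layer feed-forward neural network with each embedding dimension equal to 150. The bit-width configuration is encoded into a one-hot vector and fed into the predictor, which outputs the predicted accuracy. For example, in a weight bit-width configuration such as [2,4,6,4,8], each number is the quantization bit width of the weights of one layer; activations are handled in the same way.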
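The input encoding can be sketched as below. The layout, one 7-way one-hot slot per weight setting followed by one per activation setting, is an assumption, since the source only states that the configuration is one-hot encoded; the function name `one_hot_encode` is illustrative. The predictor network itself is not shown.

```python
BITS = [2, 3, 4, 5, 6, 7, 8]


def one_hot_encode(weight_bits, activation_bits):
    """Concatenate one 7-way one-hot slot per weight and per activation setting."""
    vector = []
    for b in list(weight_bits) + list(activation_bits):
        slot = [0.0] * len(BITS)
        slot[BITS.index(b)] = 1.0      # raises ValueError for an unsupported bit width
        vector.extend(slot)
    return vector


encoded = one_hot_encode([2, 4, 6, 4, 8], [8, 8, 8, 8, 8])
print(len(encoded))      # 10 settings x 7 candidate bit widths = 70
```

For the 21-layer example in the text this layout would give an input of 21 × 2 × 7 = 294 values.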
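In particular, we use Monte Carlo sampling to generate architecture-accuracy data pairs, which avoids imbalance in the dataset and improves prediction performance at lower and higher bit widths, such as accuracy prediction for models below 3 bits or above 7 bits.
Specifically, an average number of bits, such as 5 bits, is first sampled uniformly at random, and Monte Carlo sampling is then used to draw from the empirical distribution for 5 bits, so that the sampled model easily meets the 5-bit constraint. The resulting dataset is more uniform, whereas uniform random sampling concentrates most sampled sub-networks in the middle of the bit range.
The genetic algorithm for mixed-precision search first uses Monte Carlo sampling to generate several chromosomes (that is, sub-network configurations: the bit-width settings of the different layers) as the initial Pareto solution set. Monte Carlo sampling greatly shortens the time needed to construct the initial solution set.
Then, for each chromosome, the prediction output of the quantization-aware accuracy predictor is used as the chromosome's fitness score.
Finally, the chromosome with the highest fitness score is saved and added to the elite set, and elites are then selected for mutation and crossover according to a predetermined probability to obtain a new population. The selection-mutation-crossover process is repeated until the algorithm reaches a Pareto solution that satisfies the average bit-width targets for weights and activations.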
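The selection-mutation-crossover loop can be sketched as follows. This is a generic elitist genetic algorithm under stated assumptions, not the patent's exact procedure: the population size, elite size, mutation probability, fixed generation count, and uniform crossover are all illustrative choices, and in the described method the `fitness` callable would be the quantization-aware accuracy predictor while `feasible` would check the average-bit targets.

```python
import random

BITS = list(range(2, 9))


def evolve(initial, fitness, feasible, generations=20, elite_size=4,
           mutate_prob=0.1, population_size=16, rng=None):
    """Elitist search over bit-width chromosomes; `initial` needs at least two entries."""
    rng = rng or random.Random()
    population = [list(c) for c in initial]
    for _ in range(generations):
        # Selection: keep the chromosomes with the highest fitness scores.
        elites = sorted(population, key=fitness, reverse=True)[:elite_size]
        children, attempts = [], 0
        while len(elites) + len(children) < population_size and attempts < 10000:
            attempts += 1
            a, b = rng.sample(elites, 2)
            # Crossover: take each gene from one of the two parents.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Mutation: resample a gene with probability mutate_prob.
            child = [rng.choice(BITS) if rng.random() < mutate_prob else g for g in child]
            if feasible(child):            # discard children that break the constraint
                children.append(child)
        population = elites + children
    return max(population, key=fitness)


# Toy stand-ins: earlier layers benefit more from extra bits; average must stay at or below 4.
layer_gain = [3, 2, 1, 1, 1, 1]
toy_fitness = lambda c: sum(g * b for g, b in zip(layer_gain, c))
toy_feasible = lambda c: sum(c) / len(c) <= 4.0

start = [[4] * 6 for _ in range(8)]
best = evolve(start, toy_fitness, toy_feasible, rng=random.Random(0))
print(best, toy_fitness(best))
```

Because the elites are carried over unchanged, the best fitness never decreases from one generation to the next, and every child is checked against the constraint before it joins the population.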
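In summary, through min-random-max bit-width co-training and adaptive label softening, the embodiments of the present invention resolve the competition among sub-networks of different bit widths during training and achieve higher model accuracy under different average bit-width constraints. High-performance models can therefore be deployed quickly in application scenarios with different quantization constraints without repeating quantization training, which saves a large amount of computing resources and time.
Through an evolutionary algorithm optimized by Monte Carlo sampling, the embodiments of the present invention improve the performance of the quantization-aware accuracy predictor, greatly improve search efficiency, and reduce the time needed to obtain the target sub-network.
A person of ordinary skill in the art will understand that the accompanying drawing is only a schematic diagram of one embodiment, and that the modules or processes in the drawing are not necessarily required for implementing the present invention.
From the description of the above embodiments, a person skilled in the art will clearly understand that the present invention can be implemented by means of software plus a necessary general-purpose hardware platform. Based on this understanding, the essence of the technical solution of the present invention, or the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product can be stored in a storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments, or in certain parts of the embodiments, of the present invention.
The embodiments in this specification are described in a progressive manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the device or system embodiments are basically similar to the method embodiments, their description is relatively brief; for relevant details, refer to the corresponding description of the method embodiments. The device and system embodiments described above are only illustrative. The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of an embodiment. A person of ordinary skill in the art can understand and implement this without creative effort.
The above are only preferred specific embodiments of the present invention, but the scope of protection of the present invention is not limited to them. Any change or replacement that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be subject to the scope of protection of the claims.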

Claims (5)

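  1. A method for multi-bit-width quantization of a deep convolutional neural network, characterized by comprising:
    establishing a weight-sharing multi-bit-width-aware quantization model;
    performing multi-bit-width-aware quantization supernet training on the multi-bit-width-aware quantization model;
    setting a target constraint according to requirements, performing a mixed-precision search on the trained multi-bit-width-aware quantization model according to the target constraint to obtain sub-networks that satisfy the constraint, and using the sub-networks that satisfy the constraint to form a multi-bit-width quantized deep convolutional neural network.
  2. The method according to claim 1, characterized in that establishing the weight-sharing multi-bit-width-aware quantization model comprises:
    establishing a weight-sharing multi-bit-width-aware quantization model, which is a supernet with a multi-layer structure, wherein the sub-networks of the multi-bit-width-aware quantization model include a lowest-bit-width model, a highest-bit-width model, and random-bit-width models, and multiple sub-networks in the multi-bit-width-aware quantization model are quantized and trained simultaneously;
    letting the quantization configuration of the multi-bit-width-aware quantization model be expressed as
    Figure PCTCN2021119006-appb-100001
    which respectively denote the weight bit width and the activation bit width of layer l; given floating-point weights w and activations v, a set of learnable quantization step sizes
    Figure PCTCN2021119006-appb-100002
    and a set of zero points
    Figure PCTCN2021119006-appb-100003
    the objective function for training the multi-bit-width-aware quantization model is expressed as:
    Figure PCTCN2021119006-appb-100004
    Q(·) denotes the quantization function.
  3. The method according to claim 2, characterized in that performing multi-bit-width-aware quantization supernet training on the multi-bit-width-aware quantization model comprises:
    using min-random-max bit-width co-training, in each training iteration simultaneously optimizing the M+2 sub-networks of the multi-bit-width-aware quantization model, namely the lowest-bit-width model, the highest-bit-width model, and M random-bit-width models, wherein the training target is the objective function shown in Formula 1, and the M+2 different models are represented by different
    Figure PCTCN2021119006-appb-100005
    in Formula 1;
    adaptive label softening: given a dataset
    Figure PCTCN2021119006-appb-100006
    containing N categories, where x_i denotes an input image and y_i denotes the corresponding ground-truth label, defining
    Figure PCTCN2021119006-appb-100007
    as the class-level soft label of each epoch, where A^e is a square matrix with N rows and N columns and each column of A^e corresponds to the soft label of one category; when an input sample (x_i, y_i) is classified correctly by any of the quantized models, constructing {p_L(x_i), p_R(x_i), p_H(x_i)} to update column y_i of A^e, where M denotes the number of random sub-networks, n denotes the predicted value, and p_L(x_i), p_R(x_i), and p_H(x_i) all describe the same object, as follows:
    Figure PCTCN2021119006-appb-100008
    the Adaptive Soft Label Loss is then expressed as:
    Figure PCTCN2021119006-appb-100009
    Figure PCTCN2021119006-appb-100010
    denotes the value of matrix A at coordinate (n, y_i) in epoch e, and the balance coefficient ζ is set to 0.5;
    p_L(x_i), p_R(x_i), and p_H(x_i) are the logit outputs of the highest-bit-width model, the random-bit-width model, and the lowest-bit-width model, respectively;
    Formula 3 is updated once in every iteration, A^e is normalized at the end of each epoch and used in Formula 4 during the next epoch, and the training process of the multi-bit-width-aware quantization model ends when the model converges or the set number of training rounds is reached.
  4. The method according to claim 3, characterized in that setting the target constraint according to requirements, performing a mixed-precision search on the trained multi-bit-width-aware quantization model according to the target constraint to obtain sub-networks that satisfy the constraint, and using the sub-networks that satisfy the constraint to form a multi-bit-width quantized deep convolutional neural network comprises:
    regarding the trained multi-bit-width-aware quantization model as a model pool containing many sub-networks, setting the target constraint, which includes an average-bit constraint, according to the required multi-bit-width quantized deep convolutional neural network, and, according to the target constraint, using three methods (Monte Carlo sampling, a quantization-aware accuracy predictor, and a genetic algorithm) to perform a mixed-precision search on the trained multi-bit-width-aware quantization model and find sub-networks that satisfy the constraint;
    forming the required multi-bit-width quantized deep convolutional neural network from the target sub-networks that satisfy the constraint, each target sub-network serving separately as an independent unit in the multi-bit-width quantized deep convolutional neural network.
  5. The method according to claim 4, characterized in that using the three methods of Monte Carlo sampling, a quantization-aware accuracy predictor, and a genetic algorithm according to the target constraint to perform a mixed-precision search on the trained multi-bit-width-aware quantization model and find sub-networks that satisfy the constraint comprises:
    using Monte Carlo sampling to construct the training dataset of the quantization-aware accuracy predictor and to sample the initial constraint-satisfying population of the genetic algorithm for mixed-precision search, and using the quantization-aware accuracy predictor to estimate accuracy during the mixed-precision search;
    generating several chromosomes by Monte Carlo sampling according to the sub-network configuration and the bit-width settings of the different layers, taking these chromosomes as the initial Pareto solution set, using Monte Carlo sampling to generate architecture-accuracy data pairs, using, for each chromosome, the prediction output of the quantization-aware accuracy predictor as the chromosome's fitness score, saving the chromosome with the highest fitness score and adding it to the elite set, selecting elites for mutation and crossover according to a predetermined probability to obtain a new population, and repeating the selection-mutation-crossover process until the algorithm reaches a Pareto solution that satisfies the average bit-width targets for weights and activations.
PCT/CN2021/119006 2021-08-12 2021-09-17 Method for multi-bit-width quantization of a deep convolutional neural network WO2023015674A1 (zh)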

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110923119.6A CN113762489A (zh) 2021-08-12 2021-08-12 Method for multi-bit-width quantization of a deep convolutional neural network
CN202110923119.6 2021-08-12

Publications (1)

Publication Number Publication Date
WO2023015674A1 (zh) 2023-02-16

Family

ID=78789120

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/119006 WO2023015674A1 (zh) 2021-08-12 2021-09-17 Method for multi-bit-width quantization of a deep convolutional neural network

Country Status (2)

Country Link
CN (1) CN113762489A (zh)
WO (1) WO2023015674A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115357554B (zh) * 2022-10-24 2023-02-24 浪潮电子信息产业股份有限公司 Graph neural network compression method and apparatus, electronic device, and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180046896A1 (en) * 2016-08-12 2018-02-15 DeePhi Technology Co., Ltd. Method and device for quantizing complex artificial neural network
US20200302271A1 (en) * 2019-03-18 2020-09-24 Microsoft Technology Licensing, Llc Quantization-aware neural architecture search
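CN111931906A (zh) * 2020-07-14 2020-11-13 北京理工大学 Mixed-precision quantization method for deep neural networks based on architecture search
CN112101524A (zh) * 2020-09-07 2020-12-18 上海交通大学 Method and system for a quantized neural network with online-switchable bit width
CN112364981A (zh) * 2020-11-10 2021-02-12 南方科技大学 Differentiable search method and apparatus for mixed-precision neural networks
CN112926570A (zh) * 2021-03-26 2021-06-08 上海交通大学 Adaptive-bit network quantization method and system, and image processing method
US11029958B1 (en) * 2019-12-28 2021-06-08 Intel Corporation Apparatuses, methods, and systems for configurable operand size operations in an operation configurable spatial accelerator
CN113033784A (zh) * 2021-04-18 2021-06-25 沈阳雅译网络技术有限公司 Method for searching neural network architectures for CPU and GPU devices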

Also Published As

Publication number Publication date
CN113762489A (zh) 2021-12-07

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21953283

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE