CN114037051A - A Deep Learning Model Compression Method Based on Decision Boundary - Google Patents

A Deep Learning Model Compression Method Based on Decision Boundary

Info

Publication number
CN114037051A
CN114037051A · CN202111242448.0A
Authority
CN
China
Prior art keywords
decision
decision boundary
deep learning
learning model
function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111242448.0A
Other languages
Chinese (zh)
Inventor
董航程
刘国栋
刘炳国
叶东
廖敬骁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute of Technology Shenzhen
Original Assignee
Harbin Institute of Technology Shenzhen
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute of Technology Shenzhen filed Critical Harbin Institute of Technology Shenzhen
Priority to CN202111242448.0A
Publication of CN114037051A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a deep learning model compression method based on decision boundaries, belonging to the technical field of model compression for deep learning. The method comprises the following steps: step 1, perform feature mapping; step 2, perform piecewise linearization of the activation function; step 3, perform sub-decision region calculation: compute the sub-decision regions of the fully connected layers; step 4, perform decision network construction: compute the corresponding decision boundaries from the sub-decision regions and use them to construct a new decision network. The invention achieves efficient model compression of fully connected layers. For models whose activation function is piecewise linear, where existing methods typically incur a loss of accuracy, the invention achieves lossless compression; for other nonlinear activation functions with horizontal asymptotes, it achieves model compression with controllable accuracy.

[Figure 202111242448: overview figure of the application]

Description

Deep learning model compression method based on decision boundary
Technical Field
The invention relates to a deep learning model compression method based on decision boundaries, and belongs to the technical field of deep learning model compression.
Background
Deep learning models are the core algorithms of current artificial intelligence technology: they rely on large amounts of labeled data and achieve nonlinear fitting of complex problems through hierarchical modeling. In current practice, deep learning has been successful in fields such as image recognition and speech processing, and continues to influence other industries.
To handle complex data, current deep learning models often have hundreds of millions of parameters. Besides the large amount of time and computing resources consumed in the training phase, the deployment and inference of such models occupy substantial storage resources, and inference is slow. Where computing resources are limited, such as on mobile terminals, the application of deep learning systems is therefore restricted.
Deep learning model compression mainly targets the problem of excessive model parameter counts. Current research in this field focuses on the following four directions:
(1) Low-rank matrix decomposition: deep learning models involve a large number of matrix operations; by decomposing a large low-rank matrix into several small matrices, the data volume of the matrix can be greatly reduced while the computation results remain essentially unchanged.
(2) Model pruning and parameter quantization: the main premise of model pruning is that deep learning models are often over-parameterized, so the network contains redundant structures and parameters; redundant parameters and neurons are deleted according to rules such as importance. Quantization simplifies the data type in which weights are stored, for example converting floating-point numbers to integers, thereby reducing storage. This type of approach tends to degrade model performance.
(3) Network architecture search (NAS): within a given model design space, a machine automatically searches for an optimal structure, thereby achieving model compression. Such methods can be computationally expensive during the search process.
(4) Knowledge distillation (KD): a smaller student model is trained under the guidance of a trained teacher model, improving the small model's performance while requiring fewer model parameters.
Disclosure of Invention
The invention aims to provide a deep learning model compression method based on decision boundaries, which solves the problems in the prior art.
A deep learning model compression method based on decision boundaries comprises the following steps:
step 1, performing feature mapping;
step 2, performing piecewise linearization of the activation function;
step 3, performing sub-decision region calculation: computing the sub-decision regions of the fully connected layers;
step 4, performing decision network construction: computing the corresponding decision boundaries from the sub-decision regions and using them to construct a new decision network.
Further, in step 1, if the object of model compression is a fully connected neural network, this step is not executed and step 2 is executed directly.
Further, in step 1, if the object is the fully connected part of a CNN model, the model is regarded as a composite of two parts, f = g_MLP(g_cnn(x_0)); g_cnn(x_0) is treated as a feature map, a new sample set D' = {x' = g_cnn(x)} is constructed, and it is then operated on as a fully connected neural network.
Further, in step 2, if the activation function is a piecewise linear function, this step is not executed and step 3 is executed directly.
Further, in step 2, for an activation function that is not piecewise linear, an activation-function piecewise-linearization technique is adopted: a piecewise linear function close to the activation function is found and substituted as an approximation, converting the activation into a piecewise linear function.
Further, in step 2, for an activation function that is not piecewise linear, specifically:
First, a hard approximation function hard-σ(x) of the activation function σ(x) is generated [the defining formulas appear only as images in the original publication; outside a central interval, hard-σ(x) takes the horizontal-asymptote values of σ(x)].
According to the required number of segments L = n + 2 and the acceptable error δ > 0, two segmentation points a_0 and a_n are first selected such that |σ(x) - hard-σ(x)| ≤ δ holds on the two intervals (-∞, a_0] and [a_n, +∞); on the interval [a_0, a_n], division points a_1, a_2, ..., a_{n-1} are taken directly at equal spacing, and the point pairs (a_1, hard-σ(a_1)), (a_2, σ(a_2)), (a_3, σ(a_3)), ..., (a_{n-2}, σ(a_{n-2})), (a_{n-1}, hard-σ(a_{n-1})) are connected in sequence, yielding the L = n + 2 segment piecewise linear approximation of the original activation function.
Further, in step 3, the decision boundary is calculated first. Specifically, the piecewise linear activation function in use is denoted σ(x), a piecewise linear function with breakpoints μ_0 < μ_1 < ... < μ_n [the formula appears only as an image in the original publication].
Step 3.1: according to the training sample set, traverse the samples, i.e., input each sample into the deep learning model f(x) in sequence without executing the back-propagation process, while recording the activation states of all fully connected layer activation functions;
Step 3.2: collect the activation states of all fully connected layer neurons and arrange them in order into an overall state vector S = [s_1, s_2, ..., s_m]; following step 3.1, collect the overall state vectors of all samples to obtain the set of overall state vectors φ = {S_1, S_2, ..., S_N};
Step 3.3: consolidate φ, merging identical overall state vectors to obtain the reformed set φ' = {S'_1, S'_2, ..., S'_q}; according to its number of elements q, samples with the same activation state S'_p (1 ≤ p ≤ q) are assigned to the same subregion; samples belonging to the same subregion are described by the same linear model g_i(x) = w_i x + b_i (i = 1, 2, ..., q), and this equivalent linear model is computed directly from the fully connected layer parameters and the overall activation state vector; denote the set of all submodels by G = {g_1, g_2, ..., g_q};
Step 3.4: compute the decision boundaries of all submodels; by the definition of decision boundaries, an N-class classification problem has C(N, 2) = N(N-1)/2 classes of decision boundaries, and for the linear model g_i(x) on one subregion, C(N, 2) decision boundaries are computed, namely, for each class pair (a, b), the hyperplane (w_i^(a) - w_i^(b)) x + (b_i^(a) - b_i^(b)) = 0, where w_i^(a) and b_i^(a) are the a-th row of w_i and the a-th component of b_i [reconstructed; the original formula appears only as an image]; computing the decision boundaries of all subregion models forms the decision boundary hyperplane set DB = {P_1, P_2, ..., P_{q·C(N,2)}}.
Further, in step 4: specifically, a decision network is constructed according to the decision boundary hyperplane set DB obtained in step 3. The network contains only one hidden layer; unlike an ordinary neural network, the output of the decision network DNet is a position code relative to the decision boundaries, recorded as 0/1. Specifically, for a hyperplane P_l and a sample x_0, the sample is substituted directly into the hyperplane formula to compute the output; if the result is positive the code is 1, if negative the code is 0.
Through the decision network, a relative position code of the data with respect to all elements of the decision boundary set DB is obtained (a 0/1 vector with one component per hyperplane in DB).
By the properties of decision boundaries, samples with the same position code must belong to the same class.
The training set data D = {x_i, C_i | i = 1, 2, ..., N} is then traversed, completing the class labeling of the decision network's position codes; when a new sample is input, its class can be known simply by comparing it with the labeled position codes.
The invention has the following beneficial effects:
1. The invention achieves efficient model compression of the fully connected layers.
2. For models whose activation function is piecewise linear, existing methods typically reduce accuracy, whereas the present invention achieves lossless compression. For other nonlinear activation functions with horizontal asymptotes, model compression with controllable accuracy can be achieved.
Drawings
Fig. 1 is a schematic diagram of a decision network.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention provides a deep learning model compression method based on decision boundaries, which comprises the following steps:
step 1, performing feature mapping;
step 2, performing piecewise linearization of the activation function;
step 3, performing sub-decision region calculation: computing the sub-decision regions of the fully connected layers;
step 4, performing decision network construction: computing the corresponding decision boundaries from the sub-decision regions and using them to construct a new decision network.
Further, in step 1, if the object of model compression is a fully connected neural network, this step is not executed and step 2 is executed directly.
Further, in step 1, if the object is the fully connected part of a CNN model, the model is regarded as a composite of two parts, f = g_MLP(g_cnn(x_0)); g_cnn(x_0) is treated as a feature map, a new sample set D' = {x' = g_cnn(x)} is constructed, and the original model f is then operated on as a fully connected neural network. A sketch of this feature-mapping step is given below.
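As a minimal sketch of the feature-mapping step (not part of the original patent text; it assumes a torchvision-style CNN whose `features`/`classifier` split plays the role of g_cnn and g_MLP), the convolutional part can be run once over the data set to materialize D' = {x' = g_cnn(x)}:

```python
import torch
import torchvision

# Split a CNN f into g_cnn (convolutional feature extractor) and
# g_MLP (fully connected head); only g_MLP will be compressed.
model = torchvision.models.vgg11(weights=None).eval()
g_cnn = torch.nn.Sequential(model.features, model.avgpool, torch.nn.Flatten())
g_mlp = model.classifier

@torch.no_grad()  # inference only; no back-propagation is executed
def build_feature_dataset(loader):
    """Materialize the new sample set D' = {x' = g_cnn(x)}."""
    feats, labels = [], []
    for x, y in loader:
        feats.append(g_cnn(x))
        labels.append(y)
    return torch.cat(feats), torch.cat(labels)
```

The fully connected head g_mlp, applied to D', is then treated exactly as a fully connected neural network in steps 2 to 4.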
Further, in step 2, if the activation function is a piecewise linear function, this step is not executed and step 3 is executed directly.
Further, in step 2, for an activation function that is not piecewise linear, an activation-function piecewise-linearization technique is adopted: a piecewise linear function close to the activation function is found and substituted as an approximation, converting the activation into a piecewise linear function.
Further, in step 2, for activation functions that are not piecewise linear, such as the sigmoid and tanh functions, the activation-function piecewise-linearization technique can be adopted, performing approximate substitution by finding a piecewise linear function close to the activation function. The specific procedure is as follows:
Existing activation functions generally have horizontal asymptotes; according to these asymptotes, a hard approximation function hard-σ(x) of the activation function σ(x) is first generated [the defining formulas appear only as images in the original publication; outside a central interval, hard-σ(x) takes the asymptote values of σ(x)].
According to the required number of segments L = n + 2 and the acceptable error δ > 0, two segmentation points a_0 and a_n are first selected such that |σ(x) - hard-σ(x)| ≤ δ holds on the two intervals (-∞, a_0] and [a_n, +∞). On the interval [a_0, a_n], division points a_1, a_2, ..., a_{n-1} are taken directly at equal spacing, and the point pairs (a_1, hard-σ(a_1)), (a_2, σ(a_2)), (a_3, σ(a_3)), ..., (a_{n-2}, σ(a_{n-2})), (a_{n-1}, hard-σ(a_{n-1})) are connected in sequence, yielding the L = n + 2 segment piecewise linear approximation of the original activation function. The decision network can then be generated according to steps 1 to 4, thereby achieving model compression. A sketch of this construction for the sigmoid function is given below.
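A minimal sketch of this piecewise linearization for σ(x) = sigmoid (an illustration under stated assumptions, not the patent's exact construction: here the two outer pieces are the constant asymptote values 0 and 1, and the interior polyline interpolates σ at the equidistant division points):

```python
import numpy as np

def piecewise_linearize(sigma, a0, an, n, lo=0.0, hi=1.0):
    """Build the L = n + 2 segment approximation sketched above.

    sigma  : activation with horizontal asymptotes lo (x -> -inf) and hi (x -> +inf)
    a0, an : outer segmentation points, chosen so the asymptote error is <= delta
    n      : number of interior segments; a_1..a_{n-1} are taken at equal spacing
    """
    xs = np.linspace(a0, an, n + 1)      # a_0, a_1, ..., a_n
    ys = sigma(xs)
    ys[0], ys[-1] = lo, hi               # endpoints follow the hard approximation
    def approx(x):
        x = np.asarray(x, dtype=float)
        return np.where(x <= a0, lo, np.where(x >= an, hi, np.interp(x, xs, ys)))
    return approx

sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
f = piecewise_linearize(sigmoid, -6.0, 6.0, 10)   # delta ~ sigmoid(-6) ~ 2.5e-3
grid = np.linspace(-12.0, 12.0, 4001)
print(np.max(np.abs(f(grid) - sigmoid(grid))))    # small, controllable error
```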
Further, in step 3: for a classification model, what is compressed is in essence the model's decision boundary. Taking image classification as an example, given a data set D = {x_i, C_i | i = 1, 2, ..., N}, a classifier f: R^d → R^c is trained, where the classification labels are C = {C_i | i = 1, 2, ..., N, N ∈ Z+}. The decision boundary of f between classes C_i and C_j is the set of points x every spherical open neighborhood U(x, δ) of which contains samples classified as C_i as well as samples classified as C_j [the formal definition appears only as an image in the original publication].
A deep learning model that solves a classification problem is also a classifier f(x), so its decision boundary is calculated first; because of the model's high nonlinearity, the decision boundary of a deep learning model is in general difficult to compute. Specifically, the piecewise linear activation function in use is denoted σ(x), with breakpoints μ_0 < μ_1 < ... < μ_n [the formula appears only as an image in the original publication].
Step 3.1: according to the training sample set, traverse the samples, i.e., input each sample into the deep learning model f(x) in sequence without executing the back-propagation process (i.e., inference only), while recording the activation states of all fully connected layer activation functions: if, for example, the output a_ij of a neuron satisfies μ_k < a_ij < μ_{k+1} (k = 0, 1, 2, ..., n-1), the activation state of that neuron is s_ij = k, and so on;
step three and two, counting the activation states of all neurons in the full connecting layer, and sequentially arranging the activation states into an overall state vector S ═ S1,s2,...,sm]According to the steps in the third step and the first step, the integral state vectors of all the samples are counted to obtain an integral state vector set phi of the samples, wherein phi is { S ═ S }1,S2,...SN};
Thirdly, finishing phi, and combining the completely same integral state vectors to obtain the final productTo
Figure BDA0003319699360000071
From the reformed ensemble of state vectors
Figure BDA0003319699360000079
The number q of elements of (1) will have the same activation state S'pDividing samples of (p is more than or equal to 1 and less than or equal to q) into the same subinterval, and classifying the samples belonging to the same subinterval by the same vector linear model gi(x)=wix+bi(i ═ 1, 2.. q.) description, directly through the parameters of the full connection layer and the overall activation state vector, the equivalent linear model g is obtained by calculationi(x)=wix+biLet all submodels be G ═ G1,g2,...,gq};
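As an illustration of how a subregion's equivalent linear model follows from the layer parameters and the recorded activation states (a sketch for the special case of ReLU activations, where each state s_ij is 0/1; the general n-state piecewise linear case replaces the 0/1 mask with the slope and intercept of the active segment):

```python
import numpy as np

def equivalent_affine(weights, biases, states):
    """Compose a ReLU MLP with a frozen activation pattern into one
    affine map g(x) = W x + b, valid on the subregion where the
    pattern `states` (one 0/1 vector per hidden layer) holds."""
    W = np.eye(weights[0].shape[1])
    b = np.zeros(weights[0].shape[1])
    for l, (Wl, bl) in enumerate(zip(weights, biases)):
        W, b = Wl @ W, Wl @ b + bl          # pre-activation of layer l
        if l < len(weights) - 1:            # hidden layers only
            d = states[l].astype(float)     # 1 = neuron active, 0 = inactive
            W, b = d[:, None] * W, d * b    # ReLU with its on/off state frozen
    return W, b

# Usage: record one sample's pattern, then verify the equivalent model.
rng = np.random.default_rng(0)
Ws = [rng.standard_normal((8, 4)), rng.standard_normal((3, 8))]
bs = [rng.standard_normal(8), rng.standard_normal(3)]
x = rng.standard_normal(4)
h = Ws[0] @ x + bs[0]
W, b = equivalent_affine(Ws, bs, [h > 0])
assert np.allclose(W @ x + b, Ws[1] @ np.maximum(h, 0) + bs[1])
```

Within one subregion the network is exactly this affine map, which is why the compression is lossless for piecewise linear activations.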
Step 3.4: compute the decision boundaries of all submodels. By the definition of decision boundaries, an N-class classification problem has C(N, 2) = N(N-1)/2 classes of decision boundaries, and for the linear model g_i(x) on one subregion, C(N, 2) decision boundaries can be computed; specifically, the boundary between classes a and b is the hyperplane (w_i^(a) - w_i^(b)) x + (b_i^(a) - b_i^(b)) = 0, where w_i^(a) and b_i^(a) denote the a-th row of w_i and the a-th component of b_i [reconstructed; the original formula appears only as an image in the publication]. Computing the decision boundaries of all subregion models forms the decision boundary hyperplane set DB = {P_1, P_2, ..., P_{q·C(N,2)}}. A sketch of this step is given below.
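Continuing the ReLU sketch above, the hyperplane set DB can be assembled from the pairwise row differences of each affine submodel (again a sketch; the pairwise-difference form is the standard boundary of an affine classifier and is stated here as an assumption, since the patent's own formula is available only as an image):

```python
from itertools import combinations
import numpy as np

def decision_hyperplanes(W, b):
    """For one affine submodel g(x) = W x + b over N = W.shape[0] classes,
    return the C(N, 2) pairwise boundaries as (w, c) with w . x + c = 0."""
    return [(W[a] - W[k], b[a] - b[k])
            for a, k in combinations(range(W.shape[0]), 2)]

def build_DB(submodels):
    """Union of the boundaries of all q submodels G = {g_1, ..., g_q}."""
    DB = []
    for W, b in submodels:
        DB.extend(decision_hyperplanes(W, b))
    return DB
```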
Further, in step 4: specifically, the decision boundary hyperplane set DB obtained in step 3 is used to construct a decision network (DNet) containing only one hidden layer. Unlike an ordinary neural network, the output of the decision network (DNet) is a position code relative to the decision boundaries, recorded as 0/1: for a hyperplane P_l and a sample x_0, the sample is substituted directly into the hyperplane formula to compute its output; if the result is positive the code is 1, if negative the code is 0. Thus, through the decision network, a relative position code of the data with respect to all elements of the decision boundary set DB is obtained (a 0/1 vector with one component per hyperplane in DB).
Referring to fig. 1, by the properties of decision boundaries, samples having the same position code must belong to the same class.
Therefore, the training set data D = {x_i, C_i | i = 1, 2, ..., N} only needs to be traversed once more to label the position codes of the decision network with classes. When a new sample is input, its class is known simply by comparing it with the labeled position codes. A sketch of this encoding and lookup is given below.
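A minimal sketch of the position coding and class lookup (continuing the sketches above; `build_DB` and the (w, c) hyperplane representation are the assumptions introduced there):

```python
import numpy as np

def position_code(DB, x):
    """0/1 code of sample x relative to every hyperplane (w, c) in DB."""
    return tuple(int(w @ x + c > 0) for w, c in DB)

def label_codes(DB, X_train, y_train):
    """Second pass over the training set: map each observed code to its class."""
    table = {}
    for x, y in zip(X_train, y_train):
        table[position_code(DB, x)] = y   # same code => same class
    return table

def predict(DB, table, x):
    return table.get(position_code(DB, x))   # None if the code was never seen
```

The compressed model is then just DB plus the code-to-class table: one matrix multiplication, a sign pattern, and a lookup.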
The invention provides a compression method for deep learning models (including CNNs and MLPs) based on decision boundaries. Because the parameters of a deep learning model largely come from its fully connected layers, the method compresses the fully connected layers without the large numbers of experiments required by pruning, architecture search, or distillation; only two passes over the training set samples are needed. If the activation function is a commonly used piecewise linear function such as ReLU, the resulting model achieves lossless compression; if the activation function is another nonlinear activation function, compression at any given accuracy can be achieved through linear approximation.
The above embodiments are only intended to help understand the method of the present invention and its core idea. A person skilled in the art may make several modifications and refinements to the specific embodiments and application scope according to the idea of the present invention, and such modifications and refinements shall also fall within the protection scope of the present invention.

Claims (8)

1. A deep learning model compression method based on decision boundaries, characterized in that the method comprises the following steps: step 1, performing feature mapping; step 2, performing piecewise linearization of the activation function; step 3, performing sub-decision region calculation: computing the sub-decision regions of the fully connected layers; step 4, performing decision network construction: computing the corresponding decision boundaries from the sub-decision regions and using them to construct a new decision network.

2. The deep learning model compression method based on decision boundaries according to claim 1, characterized in that, in step 1, if the object of model compression is a fully connected neural network, this step is not executed and step 2 is executed directly.

3. The deep learning model compression method based on decision boundaries according to claim 1, characterized in that, in step 1, if the object is the fully connected part of a CNN model, the model is regarded as a composite of two parts, f = g_MLP(g_cnn(x_0)); g_cnn(x_0) is regarded as a feature map, a new sample set D' = {x' = g_cnn(x)} is constructed, and it is then operated on as a fully connected neural network.

4. The deep learning model compression method based on decision boundaries according to claim 1, characterized in that, in step 2, if the activation function is a piecewise linear function, this step is not executed and step 3 is executed directly.

5. The deep learning model compression method based on decision boundaries according to claim 1, characterized in that, in step 2, for an activation function that is not piecewise linear, an activation-function piecewise-linearization technique is adopted: a piecewise linear function close to the activation function is found and substituted as an approximation, converting the activation into a piecewise linear function.

6. The deep learning model compression method based on decision boundaries according to claim 5, characterized in that, in step 2, for an activation function that is not piecewise linear, specifically: first, a hard approximation function hard-σ(x) of the activation function σ(x) is generated [the defining formulas appear only as images in the original publication]; according to the required number of segments L = n + 2 and the acceptable error δ > 0, two segmentation points are first selected such that |σ(x) - hard-σ(x)| ≤ δ holds on the two intervals (-∞, a_0] and [a_n, +∞), while on the interval [a_0, a_n] division points a_1, a_2, ..., a_{n-1} are taken directly at equal spacing, and the point pairs (a_1, hard-σ(a_1)), (a_2, σ(a_2)), (a_3, σ(a_3)), ..., (a_{n-2}, σ(a_{n-2})), (a_{n-1}, hard-σ(a_{n-1})) are connected in sequence, obtaining the L = n + 2 segment piecewise linear approximation of the original activation function.

7. The deep learning model compression method based on decision boundaries according to claim 1, characterized in that, in step 3, the decision boundary is first calculated; specifically, the piecewise linear activation function in use is denoted σ(x) [the formula appears only as an image in the original publication];
step 3.1: according to the training sample set, traverse the samples, i.e., input each sample into the deep learning model f(x) in sequence without executing the back-propagation process, while recording the activation states of all fully connected layer activation functions;
step 3.2: collect the activation states of all fully connected layer neurons and arrange them in order into an overall state vector S = [s_1, s_2, ..., s_m]; following step 3.1, collect the overall state vectors of all samples to obtain the set of overall state vectors φ = {S_1, S_2, ..., S_N};
step 3.3: consolidate φ, merging identical overall state vectors to obtain the reformed set φ' = {S'_1, S'_2, ..., S'_q}; according to its number of elements q, samples with the same activation state S'_p (1 ≤ p ≤ q) are assigned to the same subinterval; samples belonging to the same subinterval are described by the same linear model g_i(x) = w_i x + b_i (i = 1, 2, ..., q), and this equivalent linear model is computed directly from the fully connected layer parameters and the overall activation state vector; denote all submodels by G = {g_1, g_2, ..., g_q};
step 3.4: compute the decision boundaries of all submodels; by the definition of decision boundaries, an N-class classification problem has C(N, 2) classes of decision boundaries, and for the linear model g_i(x) on one subinterval, C(N, 2) decision boundaries are computed [the boundary formula appears only as an image in the original publication]; computing the decision boundaries of all subinterval models forms the decision boundary hyperplane set DB.

8. The deep learning model compression method based on decision boundaries according to claim 1, characterized in that, in step 4, specifically: according to the decision boundary hyperplane set DB obtained in step 3, a decision network is constructed; the network contains only one hidden layer; unlike an ordinary neural network, the output of the decision network DNet is a position code relative to the decision boundaries, recorded as 0/1; specifically, for a hyperplane P_l and a sample x_0, the sample is substituted directly into the hyperplane formula to compute its output; if the result is positive the code is 1, if negative the code is 0;
through the decision network, the relative position codes of the data with respect to all elements of the decision boundary set DB are obtained; by the properties of decision boundaries, samples with the same position code must belong to the same class;
the training set data D = {x_i, C_i | i = 1, 2, ..., N} is then traversed, completing the class labeling of the decision network's position codes; when a new sample is input, its class can be known simply by comparing it with the labeled position codes.
CN202111242448.0A 2021-10-25 2021-10-25 A Deep Learning Model Compression Method Based on Decision Boundary Pending CN114037051A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111242448.0A CN114037051A (en) 2021-10-25 2021-10-25 A Deep Learning Model Compression Method Based on Decision Boundary

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111242448.0A CN114037051A (en) 2021-10-25 2021-10-25 A Deep Learning Model Compression Method Based on Decision Boundary

Publications (1)

Publication Number Publication Date
CN114037051A true CN114037051A (en) 2022-02-11

Family

ID=80135285

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111242448.0A Pending CN114037051A (en) 2021-10-25 2021-10-25 A Deep Learning Model Compression Method Based on Decision Boundary

Country Status (1)

Country Link
CN (1) CN114037051A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115859091A (en) * 2022-11-01 2023-03-28 哈尔滨工业大学 Bearing fault feature extraction method, electronic device and storage medium



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination