CN114037051A - Deep learning model compression method based on decision boundary - Google Patents
- Publication number: CN114037051A
- Application number: CN202111242448.0A
- Authority
- CN
- China
- Prior art keywords
- decision
- function
- activation
- activation function
- deep learning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a deep learning model compression method based on decision boundaries, belonging to the technical field of deep learning model compression. The method comprises the following steps: step one, feature mapping; step two, piecewise linearization of the activation function; step three, sub-decision region calculation: computing the sub-decision regions of the fully connected layers; step four, decision network construction: computing the corresponding decision boundaries from the sub-decision regions and constructing a new decision network. The invention achieves efficient model compression of the fully connected layers. Whereas prior-art methods lose precision even for models whose activation functions are piecewise linear, the invention achieves lossless compression for such models; for other nonlinear activation functions with horizontal asymptotes, model compression with controllable accuracy can be achieved.
Description
Technical Field
The invention relates to a deep learning model compression method based on decision boundaries, and belongs to the technical field of deep learning model compression.
Background
The deep learning model is the core algorithm of current artificial intelligence technology; relying on large amounts of labeled data, it achieves nonlinear fitting of complex problems through hierarchical modeling. In current practice, deep learning techniques have succeeded in fields such as image recognition and speech processing, and continue to influence other industries.
To process complex data, current deep learning models often have hundreds of millions of parameters. Besides the substantial time and computing resources consumed during training, the models occupy large amounts of storage during deployment and inference, and inference is slow. Where computing resources are limited, for example on mobile terminals, the application of deep learning systems is therefore restricted.
Deep learning model compression mainly addresses the problem of excessive model parameter counts; current research in this field focuses on the following four directions:
(1) Matrix low-rank decomposition: deep learning models involve a large number of matrix operations; by decomposing a large low-rank matrix into several small matrices, the data volume of the matrix can be greatly reduced while the computation result remains essentially unchanged (see the sketch after this list).
(2) Model pruning and parameter quantization: the premise of pruning is that deep learning models are often over-parameterized, so the network contains redundant structures and parameters; redundant parameters and neurons are deleted according to rules such as importance. Quantization simplifies the data type in which weights are stored, for example converting floating point numbers to integers, thereby reducing storage. Methods of this type tend to degrade model performance.
(3) Network Architecture Search (NAS): within a given model design space, a machine automatically searches for an optimal structure, thereby achieving model compression. Such methods can be computationally expensive during the search.
(4) Knowledge Distillation (KD): a trained teacher model is used to train a smaller student model, improving the small model's performance while using fewer model parameters.
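To make direction (1) concrete, the following is a minimal sketch of low-rank factorization of a single weight matrix via truncated SVD; the matrix size and rank are illustrative values chosen here, not taken from the patent:

```python
import numpy as np

def low_rank_factors(W, r):
    """Approximate W (m x n) by rank-r factors A (m x r) and B (r x n),
    storing r * (m + n) values instead of m * n."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :r] * s[:r]   # fold the singular values into the left factor
    B = Vt[:r, :]
    return A, B            # W ≈ A @ B

W = np.random.randn(512, 512)
A, B = low_rank_factors(W, 32)
print(np.linalg.norm(W - A @ B) / np.linalg.norm(W))  # relative error
```

For a genuinely low-rank W the relative error is small even at r far below min(m, n), which is what makes the storage saving possible.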
Disclosure of Invention
The invention aims to provide a deep learning model compression method based on decision boundaries, which solves the problems in the prior art.
A deep learning model compression method based on decision boundaries comprises the following steps:
step one, feature mapping;
step two, piecewise linearization of the activation function;
step three, sub-decision region calculation: computing the sub-decision regions of the fully connected layers;
step four, decision network construction: computing the corresponding decision boundaries from the sub-decision regions and constructing a new decision network.
Further, in step one, if the object of model compression is a fully connected neural network, this step is skipped and step two is executed directly.
Further, in step one, if the object is the fully connected part of a CNN model, the model is regarded as the composite of two parts, f = g_MLP(g_CNN(x)); taking g_CNN(x) as a feature map, a new sample set D' = {x' = g_CNN(x)} is constructed, which is then processed as a fully connected neural network.
Further, in step two, if the activation function is already a piecewise linear function, this step is skipped and step three is executed directly.
Further, in step two, for an activation function that is not piecewise linear, the activation function piecewise linearization technique is applied: a piecewise linear function close to the activation function is found and substituted as an approximation, converting the activation into a piecewise linear function.
Further, in step two, for an activation function that is not piecewise linear, specifically:
first, a hard approximation function hard-σ(x) of the activation function σ(x) is generated, as follows:
given the required number of segments L = n + 2 and an acceptable error δ > 0, two breakpoints a_0 and a_n are first selected such that |σ(x) − hard-σ(x)| ≤ δ holds on the two intervals (−∞, a_0] and [a_n, +∞); on the interval [a_0, a_n], division points a_1, a_2, ..., a_{n−1} are taken equidistantly, and the point pairs (a_0, hard-σ(a_0)), (a_1, σ(a_1)), (a_2, σ(a_2)), ..., (a_{n−1}, σ(a_{n−1})), (a_n, hard-σ(a_n)) are connected in sequence, yielding an L = n + 2 segment piecewise linear approximation of the original activation function.
Further, in step three (stated for the case where the activation function was not originally piecewise linear): the decision boundary is calculated first, using the piecewise linear activation function obtained in step two, specifically:
step 3.1: traverse the samples: input each sample of the training sample set into the deep learning model f(x) in turn, without executing the back-propagation process, while recording the activation states of all fully connected layer activation functions;
step 3.2: count the activation states of all fully connected layer neurons and arrange them in sequence into an overall state vector S = [s_1, s_2, ..., s_m]; following step 3.1, collect the overall state vectors of all samples into the set Φ = {S_1, S_2, ..., S_N};
step 3.3: reduce Φ by merging identical overall state vectors into Φ' = {S'_1, S'_2, ..., S'_q}; according to the number q of elements of the reduced set, samples sharing the same activation state S'_p (1 ≤ p ≤ q) are assigned to the same subinterval, and all samples of one subinterval are described by the same linear model g_i(x) = w_i·x + b_i (i = 1, 2, ..., q), obtained by calculation directly from the parameters of the fully connected layers and the overall activation state vector; denote the set of all sub-models by G = {g_1, g_2, ..., g_q};
step 3.4: calculate the decision boundaries of all sub-models; by the definition of decision boundaries, an N-class problem has C(N, 2) = N(N − 1)/2 class decision boundaries, so for the linear model g_i(x) of one subinterval, N(N − 1)/2 decision boundaries can be calculated; specifically:
the decision boundaries of all subinterval models are computed, forming the decision boundary hyperplane set DB.
Further, in step four, specifically, a decision network is constructed from the decision boundary hyperplane set DB obtained in step three. The network contains only one hidden layer and, unlike an ordinary neural network, the output of the decision network DNet is a position code relative to the decision boundaries, recorded as 0/1: for a hyperplane P_l and a sample x_0, x_0 is substituted directly into the hyperplane formula and the output is computed; if the result is positive it is marked 1, and if negative it is marked 0.
Through the decision network, a relative position code of the data with respect to all elements of the decision boundary set DB is obtained. By the nature of decision boundaries, samples with the same position code must belong to the same class.
The training set data D = {x_i, C_i | i = 1, 2, ..., N} is then traversed to complete the class labeling of the decision network's position codes; when a new sample is input, its class is obtained simply by comparing its position code with the labeled codes.
The invention has the following beneficial effects:
1. The invention achieves efficient model compression of the fully connected layers.
2. Whereas prior-art methods lose precision even for models whose activation functions are piecewise linear, the invention achieves lossless compression for such models. For other nonlinear activation functions with horizontal asymptotes, model compression with controllable accuracy can be achieved.
Drawings
Fig. 1 is a schematic diagram of a decision network.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention provides a deep learning model compression method based on decision boundaries, which comprises the following steps:
step one, feature mapping;
step two, piecewise linearization of the activation function;
step three, sub-decision region calculation: computing the sub-decision regions of the fully connected layers;
step four, decision network construction: computing the corresponding decision boundaries from the sub-decision regions and constructing a new decision network.
Further, in step one, if the object of model compression is a fully connected neural network, this step is skipped and step two is executed directly.
Further, in step one, if the object is the fully connected part of a CNN model, the model is regarded as the composite of two parts, f = g_MLP(g_CNN(x)); taking g_CNN(x) as a feature map, a new sample set D' = {x' = g_CNN(x)} is constructed, and the original model f is then processed as a fully connected neural network.
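As an illustration of this feature-mapping step, here is a minimal PyTorch sketch. It assumes the CNN exposes its convolutional part and its fully connected part as `model.features` and `model.classifier`; these attribute names are hypothetical, and the split point depends on the concrete architecture:

```python
import torch

def build_feature_sample_set(model, dataset):
    """Treat the convolutional part g_cnn as a fixed feature map and build
    the new sample set D' = {x' = g_cnn(x)}; the fully connected part is
    then compressed on D' as an ordinary fully connected network."""
    model.eval()
    features, labels = [], []
    with torch.no_grad():  # inference only, no back-propagation
        for x, y in dataset:
            x_prime = model.features(x.unsqueeze(0)).flatten(1)  # g_cnn(x)
            features.append(x_prime.squeeze(0))
            labels.append(y)
    return torch.stack(features), torch.tensor(labels)
```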
Further, in step two, if the activation function is already a piecewise linear function, this step is skipped and step three is executed directly.
Further, in step two, for an activation function that is not piecewise linear, such as the sigmoid or tanh function, the activation function piecewise linearization technique can be applied: a piecewise linear function close to the activation function is found and substituted as an approximation. The specific steps are as follows:
Existing activation functions generally have horizontal asymptotes. Using the asymptotes of the activation function σ(x), a hard approximation function hard-σ(x) is first generated, as follows:
Given the required number of segments L = n + 2 and an acceptable error δ > 0, two breakpoints a_0 and a_n are first selected such that |σ(x) − hard-σ(x)| ≤ δ holds on the two intervals (−∞, a_0] and [a_n, +∞). On the interval [a_0, a_n], division points a_1, a_2, ..., a_{n−1} are taken equidistantly, and the point pairs (a_0, hard-σ(a_0)), (a_1, σ(a_1)), (a_2, σ(a_2)), ..., (a_{n−1}, σ(a_{n−1})), (a_n, hard-σ(a_n)) are connected in sequence, yielding an L = n + 2 segment piecewise linear approximation of the original activation function. The decision network can then be generated according to the process of steps one to four, thereby realizing model compression.
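A minimal sketch of this construction for the sigmoid function, whose horizontal asymptotes 0 and 1 serve as the hard approximation on the two tails. The breakpoint formula below is specific to sigmoid (solving σ(a_0) = δ); the error of the n interior segments additionally depends on the chosen n and is not checked here:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def piecewise_linear_sigmoid(delta, n):
    """Build the L = n + 2 segment approximation of sigmoid: solving
    sigmoid(a0) = delta gives the tail breakpoints, so that
    |sigma(x) - hard_sigma(x)| <= delta on (-inf, a0] and [an, +inf)."""
    a0 = np.log(delta / (1.0 - delta))  # sigmoid(a0) = delta
    an = -a0                            # by symmetry, 1 - sigmoid(an) = delta
    xs = np.linspace(a0, an, n + 1)     # a0, a1, ..., an, equidistant
    ys = sigmoid(xs)
    ys[0], ys[-1] = 0.0, 1.0            # hard-sigma values at a0 and an
    return xs, ys

def hard_sigma(x, xs, ys):
    """Evaluate the approximation; np.interp is constant (the asymptote
    values) outside [a0, an], which realizes the two tail segments."""
    return np.interp(x, xs, ys)
```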
Further, in step three: the essence of a classification model is its decision boundary. Taking image classification as an example, given a data set D = {x_i, C_i | i = 1, 2, ..., N}, a classifier f: R^d → R^c is trained, where the classification labels are C = {C_i | i = 1, 2, ..., N, N ∈ Z^+}. The decision boundary of f between classes C_i and C_j is
DB_{i,j} = {x | ∀δ > 0, ∃ x_1, x_2 ∈ U(x, δ) such that f(x_1) = C_i and f(x_2) = C_j},
where U(x, δ) is the open ball neighborhood of radius δ around sample x.
The deep learning model solving a classification problem is itself a classifier f(x), so its decision boundary is calculated first; owing to the high nonlinearity of deep models, this boundary is in general difficult to compute directly. The invention therefore works with the piecewise linear activation function, whose breakpoints μ_0, μ_1, ..., μ_n define the activation states used below:
step 3.1: traverse the samples: input each sample of the training sample set into the deep learning model f(x) in turn, without executing the back-propagation process (i.e., performing inference only), while recording the activation states of all fully connected layer activation functions; for example, if the output a_ij of a neuron satisfies μ_k < a_ij < μ_{k+1} (k = 0, 1, 2, ..., n − 1), the activation state of that neuron is s_ij = k, and so on;
step 3.2: count the activation states of all fully connected layer neurons and arrange them in sequence into an overall state vector S = [s_1, s_2, ..., s_m]; following step 3.1, collect the overall state vectors of all samples into the set Φ = {S_1, S_2, ..., S_N};
step 3.3: reduce Φ by merging identical overall state vectors into Φ' = {S'_1, S'_2, ..., S'_q}; according to the number q of elements of the reduced set, samples sharing the same activation state S'_p (1 ≤ p ≤ q) are assigned to the same subinterval, and all samples of one subinterval are described by the same linear model g_i(x) = w_i·x + b_i (i = 1, 2, ..., q), obtained by calculation directly from the parameters of the fully connected layers and the overall activation state vector; denote the set of all sub-models by G = {g_1, g_2, ..., g_q};
step 3.4: calculate the decision boundaries of all sub-models; by the definition of decision boundaries, an N-class problem has C(N, 2) = N(N − 1)/2 class decision boundaries, so for the linear model g_i(x) of one subinterval, N(N − 1)/2 decision boundaries can be calculated; together, the boundaries of all subinterval models form the decision boundary hyperplane set DB (the two sketches below illustrate steps 3.1–3.4 for the ReLU case).
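The traversal of step three can be sketched as follows for the common ReLU case (a single breakpoint at 0, so each neuron has two activation states). `weights` and `biases` are assumed to hold the fully connected layer parameters in order, with the last pair being the output layer:

```python
import numpy as np

def collect_state_vectors(weights, biases, X):
    """Steps 3.1-3.3 aggregation: inference-only pass over the sample set,
    recording each neuron's activation state, concatenating them into the
    overall state vector S, and merging samples with identical S."""
    regions = {}
    for x in X:
        states, h = [], np.asarray(x, dtype=float)
        for W, b in zip(weights[:-1], biases[:-1]):  # hidden layers only
            z = W @ h + b
            states.append((z > 0).astype(np.int8))   # s_ij: 1 if active
            h = np.maximum(z, 0.0)                   # ReLU forward pass
        S = tuple(np.concatenate(states))            # overall state vector
        regions.setdefault(S, []).append(x)          # merge identical S
    return regions   # q = len(regions) distinct sub-decision regions
```

Continuing this sketch, the equivalent linear model g_i(x) = w_i·x + b_i of one sub-decision region (step 3.3) can be read off directly from the layer parameters and the recorded state vector, since freezing the 0/1 pattern turns each ReLU into a fixed diagonal matrix:

```python
def region_linear_model(weights, biases, layer_states):
    """Collapse the ReLU network into its equivalent affine map on one
    region: g_i(x) = W_eff x + b_eff with
    W_eff = W_L D_{L-1} W_{L-1} ... D_1 W_1, where D_k = diag(s_k)."""
    in_dim = weights[0].shape[1]
    W_eff, b_eff = np.eye(in_dim), np.zeros(in_dim)
    for W, b, s in zip(weights[:-1], biases[:-1], layer_states):
        W_eff, b_eff = W @ W_eff, W @ b_eff + b      # affine layer
        D = np.diag(s.astype(float))                 # frozen ReLU pattern
        W_eff, b_eff = D @ W_eff, D @ b_eff
    return weights[-1] @ W_eff, weights[-1] @ b_eff + biases[-1]
```

The row differences of each resulting (W_eff, b_eff) pair then give the hyperplanes of step 3.4, e.g. (w_c − w_c')·x + (b_c − b_c') = 0 for the boundary between output classes c and c'.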
Further, in step four, specifically, a decision network (DNet) containing only one hidden layer is constructed from the decision boundary hyperplane set DB obtained in step three. Unlike an ordinary neural network, the output of the decision network is a position code relative to the decision boundaries, recorded as 0/1: for a hyperplane P_l and a sample x_0, x_0 is substituted directly into the hyperplane formula and the output is computed; if the result is positive it is marked 1, and if negative it is marked 0. Thus, through the decision network, a relative position code of the data with respect to all elements of the decision boundary set DB is obtained. Referring to fig. 1, by the nature of decision boundaries, samples with the same position code must belong to the same class.
Therefore, the training set data D = {x_i, C_i | i = 1, 2, ..., N} only needs to be traversed once more to complete the class labeling of the decision network's position codes. When a new sample is input, its class is obtained simply by comparing its position code with the labeled codes.
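A minimal sketch of the decision network's position coding and class labeling, assuming the decision boundary set DB from step three is available as (u_l, c_l) pairs representing the hyperplanes u_l·x + c_l = 0:

```python
import numpy as np

def position_code(x, hyperplanes):
    """0/1 code of x relative to every hyperplane (u, c) in DB:
    1 if the hyperplane formula is positive, 0 otherwise."""
    return tuple(int(u @ x + c > 0) for u, c in hyperplanes)

def label_codes(X_train, y_train, hyperplanes):
    """Second pass over the training set: attach a class to each code."""
    table = {}
    for x, y in zip(X_train, y_train):
        table[position_code(x, hyperplanes)] = y
    return table

def classify(x, hyperplanes, table):
    """Classify a new sample by comparing its code with the labelled
    ones; returns None for a code never seen during training."""
    return table.get(position_code(x, hyperplanes))

# example with two hyperplanes in R^2:
DB = [(np.array([1.0, 0.0]), 0.0), (np.array([0.0, 1.0]), -0.5)]
print(position_code(np.array([0.3, 0.2]), DB))  # -> (1, 0)
```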
The invention provides a decision-boundary-based compression method for deep learning models (including CNNs and MLPs). Because the parameters of a deep learning model largely reside in its fully connected layers, the method is designed to compress the fully connected layers without extensive experimentation such as pruning, searching, or distillation; only two passes over the training set samples are required. If the activation function is a commonly used piecewise linear function such as ReLU, the resulting model achieves lossless compression; if it is another nonlinear activation function, compression to any given accuracy can be achieved through linear approximation.
The above embodiments are only intended to aid understanding of the method of the present invention and its core idea. A person skilled in the art may make several modifications and refinements to the specific embodiments and application scope according to the idea of the invention; such modifications and refinements shall also fall within the protection scope of the present invention.
Claims (8)
1. A deep learning model compression method based on decision boundaries, characterized by comprising the following steps:
step one, feature mapping;
step two, piecewise linearization of the activation function;
step three, sub-decision region calculation: computing the sub-decision regions of the fully connected layers;
step four, decision network construction: computing the corresponding decision boundaries from the sub-decision regions and constructing a new decision network.
2. The method as claimed in claim 1, wherein in step one, if the object of model compression is a fully connected neural network, this step is skipped and step two is executed directly.
3. The method as claimed in claim 1, wherein in step one, if the object is the fully connected part of a CNN model, the model is regarded as the composite of two parts, f = g_MLP(g_CNN(x)); taking g_CNN(x) as a feature map, a new sample set D' = {x' = g_CNN(x)} is constructed, which is then processed as a fully connected neural network.
4. The method as claimed in claim 1, wherein in step two, if the activation function is already a piecewise linear function, this step is skipped and step three is executed directly.
5. The deep learning model compression method based on decision boundaries as claimed in claim 1, wherein in step two, for an activation function that is not piecewise linear, the activation function piecewise linearization technique is applied: a piecewise linear function close to the activation function is found and substituted as an approximation, converting the activation into a piecewise linear function.
6. The deep learning model compression method based on decision boundaries as claimed in claim 5, wherein in step two, for an activation function that is not piecewise linear, specifically:
first, a hard approximation function hard-σ(x) of the activation function σ(x) is generated, as follows:
given the required number of segments L = n + 2 and an acceptable error δ > 0, two breakpoints a_0 and a_n are first selected such that |σ(x) − hard-σ(x)| ≤ δ holds on the two intervals (−∞, a_0] and [a_n, +∞); on the interval [a_0, a_n], division points a_1, a_2, ..., a_{n−1} are taken equidistantly, and the point pairs (a_0, hard-σ(a_0)), (a_1, σ(a_1)), ..., (a_{n−1}, σ(a_{n−1})), (a_n, hard-σ(a_n)) are connected in sequence, yielding an L = n + 2 segment piecewise linear approximation of the original activation function.
7. The deep learning model compression method based on decision boundaries as claimed in claim 1, wherein step three, stated for the case where the activation function was not originally piecewise linear, first calculates the decision boundary using the piecewise linear activation function obtained in step two, specifically:
step 3.1: traverse the samples: input each sample of the training sample set into the deep learning model f(x) in turn, without executing the back-propagation process, while recording the activation states of all fully connected layer activation functions;
step 3.2: count the activation states of all fully connected layer neurons and arrange them in sequence into an overall state vector S = [s_1, s_2, ..., s_m]; following step 3.1, collect the overall state vectors of all samples into the set Φ = {S_1, S_2, ..., S_N};
step 3.3: reduce Φ by merging identical overall state vectors into Φ' = {S'_1, S'_2, ..., S'_q}; according to the number q of elements of the reduced set, samples sharing the same activation state S'_p (1 ≤ p ≤ q) are assigned to the same subinterval, and all samples of one subinterval are described by the same linear model g_i(x) = w_i·x + b_i (i = 1, 2, ..., q), obtained by calculation directly from the parameters of the fully connected layers and the overall activation state vector; denote the set of all sub-models by G = {g_1, g_2, ..., g_q};
step 3.4: calculate the decision boundaries of all sub-models; by the definition of decision boundaries, an N-class problem has C(N, 2) = N(N − 1)/2 class decision boundaries, so for the linear model g_i(x) of one subinterval, N(N − 1)/2 decision boundaries can be calculated; together these form the decision boundary hyperplane set DB.
8. The method of claim 1, wherein in step four, specifically, a decision network is constructed from the decision boundary hyperplane set DB obtained in step three; the network contains only one hidden layer, and, unlike an ordinary neural network, the output of the decision network DNet is a position code relative to the decision boundaries, recorded as 0/1: for a hyperplane P_l and a sample x_0, x_0 is substituted directly into the hyperplane formula and the output is computed; if the result is positive it is marked 1, and if negative it is marked 0;
through the decision network, a relative position code of the data with respect to all elements of the decision boundary set DB is obtained; by the nature of decision boundaries, samples with the same position code must belong to the same class;
the training set data D = {x_i, C_i | i = 1, 2, ..., N} is then traversed to complete the class labeling of the decision network's position codes; when a new sample is input, its class is obtained simply by comparing its position code with the labeled codes.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111242448.0A | 2021-10-25 | 2021-10-25 | Deep learning model compression method based on decision boundary |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114037051A | 2022-02-11 |
Family
ID=80135285
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111242448.0A (Pending) | Deep learning model compression method based on decision boundary | 2021-10-25 | 2021-10-25 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114037051A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115859091A (en) * | 2022-11-01 | 2023-03-28 | 哈尔滨工业大学 | Bearing fault feature extraction method, electronic device and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||