CN111368976A - Data compression method based on neural network feature recognition - Google Patents
Data compression method based on neural network feature recognition
- Publication number
- CN111368976A (application CN202010126059.0A)
- Authority
- CN
- China
- Prior art keywords
- data
- model
- class
- feature
- neural network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Abstract
The invention discloses a data compression method based on neural network feature recognition. The method first constrains the feature data of a neural network recognition model and then retrains the model. The retrained model is tested on a validation set; if the test result meets the requirement, the model parameters are frozen to obtain an inference model, and the distribution of the feature data output by the inference model then suits the subsequent compression. Class feature data are derived from the inference model and the library data. The class feature data are quantized, sorted in sequence by feature similarity, and stored as differences according to the sorted order, thereby compressing the class feature data. This yields a compressed feature comparison library, which is deployed to actual devices together with the inference model. For models deployed on storage-limited offline or mobile terminal devices, the method reduces the memory required by the library and the model without adding decoding computation, while preserving the model's robust recognition performance.
Description
Technical Field
The invention belongs to the technical field of computers, in particular to the field of neural network recognition, and specifically relates to a data compression method based on neural network feature recognition.
Background
In recent years, deep learning techniques represented by neural networks have made great breakthroughs in many fields, but many challenges arise when putting the technology into practice. For example, a model's parameter count and computation may be too large for the device's memory, making actual deployment difficult; or the inference latency of the deployed model may be too long to meet real-time requirements. In particular, when edge-computing tasks must run on an offline terminal, the device's storage capacity and storage cost are limited and the storage space available to the model is often constrained, so it is necessary to reduce the memory the model requires.
Neural networks generally have a training mode and an inference mode, and what is actually deployed is the inference model. To make the inference model more robust and reduce the risk of overfitting, constraints such as dropout and regularization are usually added to the neural network model in training mode; these strategies constrain the weights of the model.
For a model based on neural network feature recognition, each class in the feature comparison library contains multiple features in practical use, so that the model's recognition is more robust. A successfully matched class is output only when the input features to be matched agree with a sufficient number of features in the comparison library. That is, when such a model is actually deployed, the storage space for the feature comparison library must be considered in addition to the parameter size and computation memory of the neural network model itself. If the feature comparison library is too large, deployment becomes very difficult. One remedy is to enlarge the device's memory, but that raises the device's cost; moreover, the device may not be able to expand its memory at all, in which case the memory required by the model must be compressed.
Many methods have been proposed for neural network model compression, including: pruning (structured and unstructured), which requires searching for a reasonable pruning ratio; knowledge distillation, which requires rebuilding and retraining the model; tensor-decomposition methods, for which the decomposition itself is computationally expensive and the decomposed model is hard to optimize to a global optimum; and directly designing lightweight neural networks. All of these methods operate on the network model itself.
For database compression, existing methods can achieve very high compression rates, but the higher the compression rate, the more complex the compression algorithm and the corresponding decoding process. If a complex compression method is adopted, the model's comparison task incurs extra decoding computation. This extra computation increases the latency of the inference model, and in the worst case makes the model impossible to deploy in practice.
Disclosure of Invention
The invention aims to overcome the above shortcomings of the prior art by providing a data compression method based on neural network feature recognition that compresses the memory required by the feature comparison library without adding extra feature-comparison computation.
The method specifically comprises the following steps:
step (1), constraining the output feature data of a neural network recognition model;
the neural network is a Convolutional Neural Network (CNN), a Recurrent Neural Network (RNN) or a Deep Neural Network (DNN), and at least one layer of network characteristics is selected as output characteristic data.
The convolutional layers of the CNN may take the form of conventional convolution, grouped convolution, or separable convolution.
Constraint conditions are selected according to the feature data: if the feature data are required to follow a normal distribution, an L2 norm constraint is adopted; if the feature data are required to be sparse, an L0 or L1 norm constraint is adopted. L0, L1, and L2 denote vector norms.
Step (2): perform model training on the neural network model after the output feature data are constrained. The steps are as follows:
The constraint value L_reg obtained by applying the norm constraint to the feature data is added to the model's total objective function S; the model's other constraint terms are denoted L_pre, so S = L_pre + L_reg. The training data are fed into S and the model is retrained. After the retrained model is evaluated on the validation set, when |A1 − A2| < 0.01 is satisfied, the retrained model's parameters are frozen and it serves as the inference model; A1 is the accuracy before retraining and A2 is the accuracy after retraining.
If the influence of L_reg is too large or too small, i.e. L_reg ≫ L_pre or L_reg ≪ L_pre, a weighting factor α is applied to L_reg so that α·L_reg is of the same order of magnitude as the network's objective value (loss), and S is adjusted to S = L_pre + α·L_reg.
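As a minimal sketch (in NumPy, with hypothetical names — the patent gives no implementation), the total objective with the weighted feature-norm constraint could be written as:

```python
import numpy as np

def combined_objective(l_pre, features, alpha=1.0):
    """S = L_pre + alpha * L_reg, where L_reg = ||E||_2 is the L2 norm
    constraint on the feature data E output by the recognition model."""
    l_reg = np.linalg.norm(features, ord=2)  # L_reg = ||E||_2
    return l_pre + alpha * l_reg
```

In a real training loop this scalar would be the loss fed to back-propagation, and α would be tuned until α·L_reg is of the same order as L_pre.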
Step (3): input the database data into the inference model and output the class feature data.
E_n denotes the intra-class features of the n-th class, containing M intra-class data items, with n ∈ [1, N]. The database data therefore has structure size N × M; inputting it into the inference model yields class feature data of the same structure size.
Step (4): compress the intra-class feature data of the output class feature data.
Uniform 8-bit quantization is applied to the class feature data; then, for each intra-class feature E_n in turn, the M corresponding intra-class data items are sorted from most to least similar according to feature similarity.
Feature similarity is represented by the number of zeros in the difference between two features: the more zeros, the greater the similarity; the fewer zeros, the lower the similarity.
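This similarity measure can be sketched as follows (function name hypothetical), assuming the features are already quantized to integer vectors:

```python
import numpy as np

def feature_similarity(f_i, f_j):
    """Number of zeros in the element-wise difference of two feature
    vectors: more zeros means the features agree in more positions."""
    diff = f_i.astype(np.int16) - f_j.astype(np.int16)  # widen to avoid uint8 wraparound
    return int(np.count_nonzero(diff == 0))
```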
The intra-class feature sorting algorithm uses Prim's algorithm for generating a minimum spanning tree, or a greedy algorithm.
The sorted intra-class features are differenced in sequence and stored compressed as those differences. That is, when storing each class's feature data, only the first intra-class feature is stored in full; every subsequently stored intra-class feature is its difference from the preceding one.
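The differential storage scheme might look like the following sketch (names hypothetical); note that decompression is a running subtraction, so no complex decoder is needed:

```python
import numpy as np

def compress_class(sorted_feats):
    """Keep the first feature whole; store each later feature as
    (previous - current), following the sorted order."""
    stored = [sorted_feats[0].copy()]
    for prev, cur in zip(sorted_feats, sorted_feats[1:]):
        stored.append(prev - cur)
    return stored

def decompress_class(stored):
    """Invert the scheme: current = previous - stored difference."""
    feats = [stored[0].copy()]
    for diff in stored[1:]:
        feats.append(feats[-1] - diff)
    return feats
```

When the sorted neighbors are similar, most entries of each stored difference are zero, which is what makes the representation compressible.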
Step (5): generate a feature comparison library from the compressed class feature data;
After the intra-class feature data are compressed, they are stored in dictionary-index form, i.e. class (key) – intra-class features (value), generating the feature comparison library.
The method compresses the feature comparison library by constraining the feature data during training, so that the features output by the model follow a particular distribution, which makes the feature library easier to compress. For models deployed on storage-limited offline or mobile terminal devices, the method reduces the storage size of the feature comparison library while preserving the model's robust recognition performance, and it does not add the extra decoding computation that a complex compression method would incur.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a flow chart of training in an embodiment;
FIG. 3 is a diagram of feature comparison library generation in an embodiment.
Detailed description of the preferred embodiments
The invention is described in detail below through embodiments with reference to the accompanying drawings. Note that, unless otherwise specifically stated, the relative arrangements, numerical expressions, alphabetic expressions, and numerical values in the embodiments do not limit the scope of the invention; the embodiments are provided only to help the reader understand the invention. Technical methods well known to those skilled in the relevant art may not be described in detail. Where appropriate, the techniques, methods, and systems described should be considered part of the specification.
For clarity, the invention is illustrated in detail by taking a neural network model A as an example; specific implementations are shown in FIGS. 1, 2, and 3.
Fig. 2 is the training flowchart for the applicable neural network model A and mainly describes the model's training process.
First, an L2 norm constraint on the model's feature data E is added to the model, namely L_reg = ||E||_2. The model's own cross-entropy loss and other constraints are denoted L_pre, and α serves as a hyper-parameter adjusting the strength of the constraint on E, giving S = L_pre + α·L_reg. The pre-trained model's parameter values are loaded into the model, and the model is retrained using deep learning's back-propagation (BP) mechanism.
After the model is retrained, it is evaluated on a validation-set test. When |A1 − A2| < 0.01 is satisfied, the retrained model's parameters are frozen to obtain the inference model; A1 is the accuracy before retraining and A2 is the accuracy after retraining.
FIG. 3 is the flowchart for generating the feature comparison library based on model A, mainly describing the generation and compression of the class feature data.
In the model's inference mode, the library data are input into the inference model, and class feature data F are output. In multi-target recognition, to improve the robustness of recognition, each class in the feature comparison library holds D features; the target under test is considered successfully recognized only when K features match, with 1 ≤ K ≤ D.
The class feature data F is the set of feature data of all classes. The structure of F can be described as {E_1, …, E_n, …, E_N}, where E_1 denotes the feature data of class 1 and, correspondingly, E_n those of class n. Each class contains M intra-class data items; for example, the structure of E_n can be described as {e_n1, …, e_nm, …, e_nM}, where e_nm denotes the m-th intra-class feature of class n.
To further reduce the storage size of the feature comparison library, the feature data F are quantized; here F is quantized uniformly to 8 bits.
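The patent states only that F is quantized uniformly to 8 bits; one common uniform (affine) scheme, shown here as an assumption rather than the patent's exact method, maps the feature range linearly onto [0, 255]:

```python
import numpy as np

def quantize_uint8(f):
    """Uniform affine quantization of a float feature vector to uint8.
    Returns the codes plus (scale, offset) for dequantization."""
    lo, hi = float(f.min()), float(f.max())
    scale = (hi - lo) / 255.0 if hi > lo else 1.0
    q = np.round((f - lo) / scale).astype(np.uint8)
    return q, scale, lo

def dequantize(q, scale, lo):
    """Approximate reconstruction of the original floats."""
    return q.astype(np.float64) * scale + lo
```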
The quantized class feature data F are then compressed, which is essentially compression of each E_n.
For each intra-class feature E_n in turn, the M corresponding intra-class data items are sorted from most to least similar according to feature similarity. Here, similarity is the number of zero values accumulated after subtracting one feature from the other, described by the formula count_zeros(e_ni − e_nj), where count_zeros is the number of zeros: the more zeros, the greater the similarity between the two features, and vice versa.
Prim's algorithm for generating a minimum spanning tree is adopted here to sort the intra-class features E_n. For example, after the M feature data of E_n are sorted, a structure {e_a, e_b, e_c, e_d, …} is obtained, where 1 ≤ a, b, c, d ≤ M and a, b, c, d are pairwise distinct.
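A simplified greedy version of this ordering can be sketched as follows (the function name is hypothetical; Prim's algorithm on the full similarity graph would produce a tree rather than this plain chain, but the idea of always attaching the most similar remaining feature is the same):

```python
import numpy as np

def order_by_similarity(feats):
    """Start from the first feature; repeatedly append the remaining
    feature most similar to the last chosen one, where similarity is
    the number of zeros in the element-wise difference."""
    order, remaining = [0], list(range(1, len(feats)))
    while remaining:
        last = feats[order[-1]]
        best = max(remaining,
                   key=lambda i: np.count_nonzero((feats[i] - last) == 0))
        order.append(best)
        remaining.remove(best)
    return order
```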
After the intra-class features are sorted, they are stored as differences. That is, when storing each class's feature data, only the first feature is stored in full; every subsequently stored feature is its difference from the preceding one. With E_n sorted as above into {e_a, e_b, e_c, e_d, …}, the stored form is e_a, (e_a − e_b), (e_b − e_c), (e_c − e_d), ….
After the class feature data are compressed, they are stored in dictionary-index form, i.e. class n (key) – compressed intra-class features E_n (value), generating the feature comparison library.
Finally, the feature comparison library is obtained and, combined with the inference model, can be used for actual deployment. Note that the class feature data were quantized when the feature comparison library was generated; therefore, when the inference model is actually used for inference, its output features are likewise quantized to 8 bits before the feature comparison is computed.
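The K-of-D comparison at inference time might look like the following sketch, under the assumption (not specified by the patent) that "a feature matches" means its zero-count similarity to the query exceeds some threshold; all names are hypothetical:

```python
import numpy as np

def class_matches(query_q, stored_feats_q, k, sim_threshold):
    """Return True if at least k of the class's D stored (quantized)
    features are similar enough to the quantized query feature."""
    hits = 0
    for f in stored_feats_q:
        # similarity = number of zeros in the difference (widened to avoid wraparound)
        zeros = np.count_nonzero((f.astype(np.int16) - query_q.astype(np.int16)) == 0)
        if zeros >= sim_threshold:
            hits += 1
    return hits >= k
```

Because the stored features are decompressed by a running subtraction and compared with simple integer arithmetic, this matching adds no complex decoding step.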
Claims (5)
1. A data compression method based on neural network feature recognition, characterized by comprising the following steps:
step (1), constraining the output feature data of a neural network recognition model;
selecting at least one layer of network features as the output feature data; selecting constraint conditions according to the feature data: if the feature data are required to follow a normal distribution, adopting an L2 norm constraint, and if the feature data are required to be sparse, adopting an L0 or L1 norm constraint;
step (2), performing model training on the neural network model after the output feature data are constrained, as follows:
adding the constraint value L_reg, obtained by applying the norm constraint to the feature data, to the model's total objective function S, and denoting the model's other constraint terms L_pre, so that S = L_pre + L_reg; inputting the training data into S and retraining the model; after the retrained model is evaluated on a validation set, when |A1 − A2| < 0.01 is satisfied, freezing the retrained model's parameters so that it serves as the inference model, where A1 is the accuracy before retraining and A2 is the accuracy after retraining;
step (3), inputting the database data into the inference model and outputting class feature data;
E_n denotes the intra-class features of the n-th class, containing M intra-class data items, with n ∈ [1, N]; the database data has structure size N × M and is input into the inference model to obtain class feature data of the same structure size;
step (4), compressing the output class feature data;
uniformly quantizing the class feature data, and, for each intra-class feature E_n in turn, sorting the M corresponding intra-class data items from most to least similar according to feature similarity;
the feature similarity being represented by the number of zeros in the difference between two features: the more zeros, the greater the similarity, and the fewer zeros, the lower the similarity;
differencing the sorted intra-class features in sequence and storing them compressed as those differences: when storing each class's feature data, only the first intra-class feature is stored in full, and every subsequently stored intra-class feature is its difference from the preceding one;
and step (5), after the intra-class feature data are compressed, storing the compressed class feature data in dictionary-index form to generate a feature comparison library.
2. The data compression method based on neural network feature recognition of claim 1, characterized in that the neural network is a convolutional neural network, a recurrent neural network, or a deep neural network.
3. The data compression method based on neural network feature recognition of claim 1, characterized in that if, in step (2), the influence of L_reg is too large or too small, i.e. L_reg ≫ L_pre or L_reg ≪ L_pre, a weighting factor α is applied to L_reg so that α·L_reg is of the same order of magnitude as the network's objective value (loss), and S is adjusted to S = L_pre + α·L_reg.
4. The data compression method based on neural network feature recognition of claim 1, characterized in that in step (4), the class feature data are quantized uniformly to 8 bits.
5. The data compression method based on neural network feature recognition of claim 1, characterized in that in step (4), the intra-class feature sorting algorithm uses Prim's minimum-spanning-tree algorithm or a greedy algorithm.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010126059.0A CN111368976B (en) | 2020-02-27 | 2020-02-27 | Data compression method based on neural network feature recognition |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111368976A (en) | 2020-07-03 |
CN111368976B CN111368976B (en) | 2022-09-02 |
Family
ID=71211537
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010126059.0A Active CN111368976B (en) | 2020-02-27 | 2020-02-27 | Data compression method based on neural network feature recognition |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111368976B (en) |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104281616A (en) * | 2013-07-10 | 2015-01-14 | 北京旋极信息技术股份有限公司 | Data processing method |
CN105160400A (en) * | 2015-09-08 | 2015-12-16 | 西安交通大学 | L21 norm based method for improving convolutional neural network generalization capability |
CN107967516A (en) * | 2017-10-12 | 2018-04-27 | 中科视拓(北京)科技有限公司 | A kind of acceleration of neutral net based on trace norm constraint and compression method |
US20190095699A1 (en) * | 2017-09-28 | 2019-03-28 | Nec Laboratories America, Inc. | Long-tail large scale face recognition by non-linear feature level domain adaption |
CN109635936A (en) * | 2018-12-29 | 2019-04-16 | 杭州国芯科技股份有限公司 | A kind of neural networks pruning quantization method based on retraining |
CN109840530A (en) * | 2017-11-24 | 2019-06-04 | 华为技术有限公司 | The method and apparatus of training multi-tag disaggregated model |
CN109949437A (en) * | 2019-03-13 | 2019-06-28 | 东北大学 | Isomeric data based on rarefaction cooperates with industrial method for diagnosing faults |
US20190294929A1 (en) * | 2018-03-20 | 2019-09-26 | The Regents Of The University Of Michigan | Automatic Filter Pruning Technique For Convolutional Neural Networks |
EP3570221A1 (en) * | 2018-05-15 | 2019-11-20 | Hitachi, Ltd. | Neural networks for discovering latent factors from data |
CN110807514A (en) * | 2019-10-25 | 2020-02-18 | 中国科学院计算技术研究所 | Neural network pruning method based on LO regularization |
Non-Patent Citations (1)
Title |
---|
Yu Cheng et al., "Model Compression and Acceleration for Deep Neural Networks", IEEE *
Also Published As
Publication number | Publication date |
---|---|
CN111368976B (en) | 2022-09-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Liang et al. | Pruning and quantization for deep neural network acceleration: A survey | |
Zhou et al. | Adaptive quantization for deep neural network | |
Li et al. | Towards compact cnns via collaborative compression | |
US20210089922A1 (en) | Joint pruning and quantization scheme for deep neural networks | |
US10984308B2 (en) | Compression method for deep neural networks with load balance | |
CN110287983B (en) | Single-classifier anomaly detection method based on maximum correlation entropy deep neural network | |
CN109445935B (en) | Self-adaptive configuration method of high-performance big data analysis system in cloud computing environment | |
CN114169330A (en) | Chinese named entity identification method fusing time sequence convolution and Transformer encoder | |
CN114493755B (en) | Self-attention sequence recommendation method fusing time sequence information | |
CN109871749B (en) | Pedestrian re-identification method and device based on deep hash and computer system | |
CN109918507B (en) | textCNN (text-based network communication network) improved text classification method | |
CN110851654A (en) | Industrial equipment fault detection and classification method based on tensor data dimension reduction | |
US20230073669A1 (en) | Optimising a neural network | |
Shi et al. | Structured Word Embedding for Low Memory Neural Network Language Model. | |
TW202006612A (en) | Machine learning method and machine learning device | |
Zhang et al. | Data Independent Sequence Augmentation Method for Acoustic Scene Classification. | |
CN110288002B (en) | Image classification method based on sparse orthogonal neural network | |
CN111368976B (en) | Data compression method based on neural network feature recognition | |
CN116578699A (en) | Sequence classification prediction method and system based on Transformer | |
Huang et al. | Flow of renyi information in deep neural networks | |
Wu et al. | Mirex 2017 submission: Automatic audio chord recognition with miditrained deep feature and blstm-crf sequence decoding model | |
Zhang et al. | Compressing knowledge graph embedding with relational graph auto-encoder | |
CN112735604A (en) | Novel coronavirus classification method based on deep learning algorithm | |
CN112465054A (en) | Multivariate time series data classification method based on FCN | |
Xu et al. | Batch-normalization-based soft filter pruning for deep convolutional neural networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||