CN111368976A - Data compression method based on neural network feature recognition - Google Patents

Data compression method based on neural network feature recognition

Info

Publication number
CN111368976A
Authority
CN
China
Prior art keywords
data
model
class
feature
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010126059.0A
Other languages
Chinese (zh)
Other versions
CN111368976B (en)
Inventor
杨常星 (Yang Changxing)
梁骏 (Liang Jun)
钟宇清 (Zhong Yuqing)
宋蕴 (Song Yun)
宋一平 (Song Yiping)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Nationalchip Science & Technology Co ltd
Original Assignee
Hangzhou Nationalchip Science & Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Nationalchip Science & Technology Co ltd filed Critical Hangzhou Nationalchip Science & Technology Co ltd
Priority to CN202010126059.0A
Publication of CN111368976A
Application granted
Publication of CN111368976B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/082Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Abstract

The invention discloses a data compression method based on neural network feature recognition. The method first constrains the feature data of a neural network recognition model and then retrains the model. The retrained model is tested on a validation set; if the test result meets the requirements, the model parameters are frozen to obtain an inference model whose output feature data follow a distribution suited to subsequent compression. Class feature data are derived from the inference model and the library data. The class feature data are quantized, sorted in sequence by feature similarity, and stored as differences according to the sorted order, thereby compressing the class feature data. The compressed feature comparison library, combined with the inference model, is then deployed on actual devices. For models deployed on offline or mobile devices with limited storage, the method reduces the memory required by the library and the model without adding extra decoding computation, while preserving the model's robust recognition performance.

Description

Data compression method based on neural network feature recognition
Technical Field
The invention belongs to the technical field of computers, in particular the technical field of neural network recognition, and specifically relates to a data compression method based on neural network feature recognition.
Background
In recent years, deep learning techniques represented by neural networks have achieved great breakthroughs in many fields, but many challenges arise when the technology is actually deployed. For example, the parameter count and computation of a model may be too large for the available device memory, making actual deployment difficult; or the inference latency of the deployed model may be too long to meet real-time requirements. In particular, when edge computing tasks must be performed on an offline device, the storage capacity and storage cost of the device are limited and the storage space available to the model is often restricted, so reducing the memory required by the model is necessary.
Neural networks generally have a training mode and an inference mode, and what is actually deployed is the inference model. To make the inference model more robust and reduce the risk of overfitting, constraints such as dropout and regularization are usually added to the neural network model in training mode; these strategies constrain the weights of the model.
For a model based on neural network feature recognition, each class in the feature comparison library contains multiple features in practical use so that the model's recognition is more robust. Only when the input features to be matched reach a certain number of matches with the features in the comparison library is the successfully matched class output. That is, when such a model is actually deployed, additional storage space for the feature comparison library must be considered on top of the parameter size and computation memory of the neural network model. If the feature comparison library is too large, model deployment becomes very difficult. There are two ways to address this: on one hand, the memory of the device can be increased, but this raises device cost; on the other hand, the device may not be able to expand its memory at all, which requires compressing the memory needed by the model.
Many methods have been proposed for neural network model compression, including: model pruning (structured and unstructured), which requires searching for a reasonable pruning ratio; knowledge distillation, which requires reconstructing and retraining the model; tensor-decomposition-based methods, in which the decomposition itself is computationally expensive and the decomposed model is difficult to optimize to a global optimum; and directly designing lightweight neural networks. All of these methods operate on the network model itself.
For database compression, existing methods can achieve very high compression rates, but the higher the compression rate, the more complex the compression algorithm and the corresponding decoding process. If a complex compression method is adopted, extra decoding computation is added to the model's comparison task. This extra computation increases the latency of the inference model and, in the worst case, makes the model impossible to deploy in practice.
Disclosure of Invention
The invention aims to overcome the above deficiencies of the prior art and provides a data compression method based on neural network feature recognition. The method compresses the memory required by the feature comparison library without adding extra feature comparison computation.
The method specifically comprises the following steps:
Step (1), constraining the output feature data of the neural network recognition model;
the neural network is a Convolutional Neural Network (CNN), a Recurrent Neural Network (RNN) or a Deep Neural Network (DNN), and at least one layer of network characteristics is selected as output characteristic data.
For a convolutional neural network (CNN), the network layers may take the form of conventional convolution, grouped convolution or separable convolution.
Constraint conditions are selected according to the feature data: if the feature data are required to follow a normal distribution, an L2-norm constraint is used; if the feature data are required to follow a sparse distribution, an L0-norm or L1-norm constraint is used. L0, L1 and L2 denote vector norms.
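As an illustrative sketch only (PyTorch is assumed, and the function name feature_norm_penalty is hypothetical), the norm constraints on a batch of output feature data E could be computed as follows:

```python
import torch

def feature_norm_penalty(E: torch.Tensor, kind: str = "l2") -> torch.Tensor:
    """Hypothetical sketch of the norm constraint L_reg on output feature data E.

    E is assumed to be a batch of features of shape (batch, feature_dim).
    """
    E = E.flatten(1)
    if kind == "l2":                 # pushes feature values toward a normal-like distribution
        return E.norm(p=2, dim=1).mean()
    if kind == "l1":                 # encourages sparse feature values
        return E.abs().sum(dim=1).mean()
    if kind == "l0":                 # L0 is non-differentiable; shown only for completeness
        return (E != 0).float().sum(dim=1).mean()
    raise ValueError(f"unknown norm kind: {kind}")
```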
Step (2), performing model training on the neural network model after the output feature data have been constrained. The procedure is as follows:
The constraint value L_reg obtained by applying the norm constraint to the feature data is added to the model's total objective function S; the model's other constraint values are denoted L_pre, so that S = L_pre + L_reg. The training data are fed in and the model is retrained with this objective. After the retrained model is tested and evaluated on a validation set, when the condition |A1 − A2| < 0.01 is satisfied, the parameters of the retrained model are frozen and the model is used as the inference model; A1 is the accuracy before retraining and A2 is the accuracy after retraining.
If L_reg has too large or too small an influence, i.e. L_reg >> L_pre or L_reg << L_pre, a weighting factor α is applied to L_reg so that α·L_reg is of the same order of magnitude as the network's objective value L_pre, and S is adjusted to S = L_pre + α·L_reg.
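A minimal sketch of the resulting total objective S, assuming a classification model whose own loss L_pre is cross-entropy and reusing the hypothetical feature_norm_penalty above; α is the weighting factor described:

```python
import torch.nn.functional as F

def total_objective(logits, labels, features, alpha: float = 0.1):
    """S = L_pre + alpha * L_reg as described in step (2).

    L_pre: the model's own losses (cross-entropy is used here as an example).
    L_reg: the norm constraint on the output feature data.
    alpha: weighting factor chosen so the two terms are of comparable magnitude.
    """
    l_pre = F.cross_entropy(logits, labels)
    l_reg = feature_norm_penalty(features, kind="l2")   # from the earlier sketch
    return l_pre + alpha * l_reg
```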
Step (3), inputting the database data into the inference model and outputting the class feature data.
E_n denotes the intra-class features of the nth class and contains M intra-class data items, with n ∈ [1, N]. The database data therefore have a structure of size N × M; feeding them into the inference model yields class feature data with the same structure.
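A sketch of how the N × M class feature data might be generated from the library data, assuming each library sample yields one feature vector of length feature_dim (all names are illustrative):

```python
import numpy as np
import torch

@torch.no_grad()
def build_class_features(model, library, feature_dim: int) -> np.ndarray:
    """library: a list of N classes, each a list of M input tensors.

    Returns class feature data F of shape (N, M, feature_dim), mirroring the
    N x M structure of the database data described in step (3).
    """
    N, M = len(library), len(library[0])
    F = np.zeros((N, M, feature_dim), dtype=np.float32)
    model.eval()
    for n, samples in enumerate(library):
        for m, x in enumerate(samples):
            F[n, m] = model(x.unsqueeze(0)).squeeze(0).cpu().numpy()
    return F
```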
Step (4), compressing the intra-class feature data of the output class feature data.
The class feature data are first uniformly quantized to 8 bits; then, for each intra-class feature E_n in turn, the M corresponding intra-class data items are sorted in descending order of feature similarity.
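A sketch of the uniform 8-bit quantization, assuming a single scale and offset shared by the whole library (the text only specifies that quantization is uniform and 8-bit):

```python
import numpy as np

def quantize_uint8(F: np.ndarray):
    """Uniform 8-bit quantization of the class feature data F.

    Returns the quantized array plus the (scale, offset) needed so that query
    features can be quantized the same way at comparison time.
    """
    f_min, f_max = float(F.min()), float(F.max())
    scale = (f_max - f_min) / 255.0
    if scale == 0.0:
        scale = 1.0
    q = np.round((F - f_min) / scale).astype(np.uint8)
    return q, scale, f_min
```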
Feature similarity is represented by the number of 0s in the difference between two features: the more 0s, the higher the similarity; the fewer 0s, the lower the similarity.
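A direct sketch of this similarity measure on quantized features, matching the count_zeros formula given later in the embodiment:

```python
import numpy as np

def count_zeros(e_i: np.ndarray, e_j: np.ndarray) -> int:
    """Similarity of two quantized features: the number of zero entries in
    their difference (more zeros means more similar, and a cheaper stored diff)."""
    diff = e_i.astype(np.int16) - e_j.astype(np.int16)
    return int(np.count_nonzero(diff == 0))
```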
The intra-class feature data are sorted using Prim's algorithm for minimum spanning tree generation or a greedy algorithm.
The sorted intra-class features are then differenced in sequence and stored in compressed form as these differences. That is, when storing the feature data of each class, only the first intra-class feature is stored in full; each subsequently stored intra-class feature is the difference from the previous one.
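A sketch of the sorting and differential storage for one class, using a simple greedy nearest-neighbour ordering as a stand-in for the Prim/greedy ordering named above, and reusing the count_zeros sketch:

```python
import numpy as np

def compress_class(E_n: np.ndarray):
    """E_n: (M, feature_dim) uint8 intra-class features of one class.

    Returns the first feature stored in full plus M-1 signed difference
    vectors, following the ordering-then-differencing scheme of step (4).
    Uses count_zeros from the earlier sketch.
    """
    M = len(E_n)
    order = [0]                                   # start from the first feature
    remaining = list(range(1, M))
    while remaining:
        prev = order[-1]
        # greedily pick the remaining feature most similar to the previous one
        best = max(remaining, key=lambda m: count_zeros(E_n[m], E_n[prev]))
        order.append(best)
        remaining.remove(best)
    first = E_n[order[0]]
    diffs = [E_n[order[i - 1]].astype(np.int16) - E_n[order[i]].astype(np.int16)
             for i in range(1, M)]                # stored as (previous - current)
    return first, diffs
```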
Step (5), generating a feature comparison library from the compressed class feature data;
After the intra-class feature data are compressed, they are stored in dictionary-index form, i.e. class (key) → intra-class features (value), and the feature comparison library is generated.
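A sketch of the resulting dictionary-index layout, reusing the compress_class sketch above (the exact layout of the value is an assumption):

```python
def build_feature_library(F_q) -> dict:
    """F_q: (N, M, feature_dim) quantized class feature data.

    Returns the dictionary-index library {class_id: (first_feature, diffs)},
    i.e. class (key) -> compressed intra-class features (value).
    Uses compress_class from the earlier sketch.
    """
    return {n: compress_class(F_q[n]) for n in range(len(F_q))}
```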
The method of the invention compresses the feature comparison library by constraining the feature data during training so that the features output by the model follow a particular data distribution, which makes the feature library easier to compress. For a model deployed on an offline or mobile device with limited storage, the method reduces the storage size of the feature comparison library while preserving the model's robust recognition performance, and does not add the extra decoding computation a complex compression method would incur.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a flow chart of training in an embodiment;
FIG. 3 is a diagram of feature comparison library generation in the embodiment.
Detailed description of the preferred embodiments
The invention is described in detail below through embodiments with reference to the accompanying drawings. It should be noted that, unless otherwise specifically stated, the relative arrangements, numerical expressions, alphabetic expressions and numerical values given in the embodiments do not limit the scope of the invention; the embodiments are provided only to help the reader understand the invention. Technical methods well known to those skilled in the relevant art may not be described in detail; where appropriate, such techniques, methods and systems should be considered part of the specification.
For clarity, the invention is illustrated in detail by taking a neural network model A as an example; specific implementations are shown in FIGS. 1, 2 and 3.
FIG. 2 is a training flowchart for the neural network model A and mainly describes the training process of the model.
First, an L2-norm constraint on the model's feature data E is added to the model, i.e. L_reg = ||E||_2. The model's own cross-entropy loss and other constraints are denoted L_pre, and α is a hyperparameter that adjusts the strength of the constraint on the feature data E, giving S = L_pre + α·L_reg. The parameter values of a pre-trained model are loaded into the model, and the model is retrained using the back-propagation (BP) mechanism of deep learning.
After the model is retrained, it is evaluated on a validation set. When the condition |A1 − A2| < 0.01 is satisfied, the parameters of the retrained model are frozen to obtain the inference model; A1 is the accuracy before retraining and A2 is the accuracy after retraining.
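A trivial sketch of this acceptance check, with A1 and A2 as the accuracies before and after retraining:

```python
def accept_retrained_model(a1: float, a2: float, tol: float = 0.01) -> bool:
    """Freeze the retrained parameters only when |A1 - A2| < tol."""
    return abs(a1 - a2) < tol
```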
FIG. 3 is a flow chart of generating the feature comparison library based on model A, mainly describing the generation of the class feature data and their compression.
In the model inference mode, the library data are input into the inference model and the class feature data F are output. In multi-target recognition, to improve the robustness of model recognition, each class in the feature comparison library holds D features, and the target under test is considered successfully recognized only when K of them are matched, where 1 ≤ K ≤ D.
The class feature data F is the set of feature data of all classes. The structure of F can be described as {E_1, …, E_n, …, E_N}, where E_1 denotes the feature data of class 1 and, correspondingly, E_n denotes the feature data of class n. Each class contains M intra-class data items; for example, E_n can be described as {e_n1, …, e_nm, …, e_nM}, where e_nm denotes the mth intra-class feature of class n.
To further reduce the storage size of the feature comparison library, the feature data F are quantized; here F is quantized uniformly to 8 bits.
The quantized class feature data F are then compressed, which is essentially compression of each E_n.
For each intra-class feature E_n in turn, the M corresponding intra-class data items are sorted in descending order of feature similarity. Here, similarity refers to the number of accumulated 0 values after subtracting two feature data items; the formula is count_zeros(e_ni − e_nj), where count_zeros counts the number of 0s. The more 0s, the greater the similarity between the two features, and vice versa.
Prim's algorithm for minimum spanning tree generation is used here to sort the intra-class features E_n. For example, after the M feature data items of E_n are sorted, a structure {e_a, e_b, e_c, e_d, …} is obtained, where 1 ≤ a, b, c, d ≤ M and a, b, c, d are pairwise distinct.
After the intra-class features are sorted, they are stored as differences. That is, when storing the feature data of each class, only the first feature is stored in full; each subsequently stored feature is the difference from the previous feature. With E_n sorted as {e_a, e_b, e_c, e_d, …} above, the stored data are e_a, (e_a − e_b), (e_b − e_c), (e_c − e_d), …, giving compressed storage.
After the class feature data are compressed, they are stored in dictionary-index form, i.e. class n (key) → compressed intra-class features E_n (value), and the feature comparison library is generated.
Finally, the feature comparison library is obtained and, combined with the inference model, can be used for actual deployment of the model. Note that the class feature data are quantized when the feature comparison library is generated; therefore, when the inference model is actually used for inference, its output features must undergo the corresponding 8-bit quantization before the feature comparison is computed.
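A sketch of the matching side under the above assumptions: the stored differences are undone on the fly, the query feature is quantized with the same parameters as the library, and a class is accepted when at least K of its stored features match. The zero-count threshold is illustrative, since the text does not fix one; count_zeros is reused from the earlier sketch.

```python
import numpy as np

def iter_class_features(first: np.ndarray, diffs):
    """Undo the differential storage on the fly: e_next = e_prev - (e_prev - e_next)."""
    cur = first.astype(np.int16)
    yield cur
    for d in diffs:
        cur = cur - d
        yield cur

def match_query(query_q: np.ndarray, library: dict, K: int, zero_thresh: int) -> list:
    """Return ids of classes for which at least K stored features match the
    (already 8-bit quantized) query feature."""
    q = query_q.astype(np.int16)
    hits = []
    for class_id, (first, diffs) in library.items():
        matches = sum(count_zeros(q, f) >= zero_thresh
                      for f in iter_class_features(first, diffs))
        if matches >= K:
            hits.append(class_id)
    return hits
```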

Claims (5)

1. A data compression method based on neural network feature recognition, characterized by comprising the following steps:
step (1), constraining the output feature data of a neural network recognition model;
selecting at least one layer of network features as the output feature data; selecting constraint conditions according to the feature data: if the feature data are required to follow a normal distribution, an L2-norm constraint is used, and if the feature data are required to follow a sparse distribution, an L0-norm or L1-norm constraint is used;
step (2), performing model training on the neural network model after the output feature data have been constrained, as follows:
adding the constraint value L_reg obtained by applying the norm constraint to the feature data to the model's total objective function S, denoting the model's other constraint values L_pre, so that S = L_pre + L_reg; inputting the training data and retraining the model with this objective; after the retrained model is tested and evaluated on a validation set, when the condition |A1 − A2| < 0.01 is satisfied, freezing the parameters of the retrained model and using it as the inference model; A1 is the accuracy before retraining, A2 is the accuracy after retraining;
step (3), inputting the database data into the inference model and outputting the class feature data;
E_n denotes the intra-class features of the nth class and contains M intra-class data items, with n ∈ [1, N]; the database data have a structure of size N × M, and feeding them into the inference model yields class feature data with the same structure;
step (4), compressing the output class feature data;
the class feature data are uniformly quantized, and for each intra-class feature E_n in turn, the M corresponding intra-class data items are sorted in descending order of feature similarity;
the feature similarity is represented by the number of 0s in the difference between two features: the more 0s, the greater the similarity, and the fewer 0s, the lower the similarity;
differencing the sorted intra-class features in sequence and storing them in compressed form as these differences; when storing the feature data of each class, only the first intra-class feature is stored in full, and each subsequently stored intra-class feature is the difference from the previous one;
and step (5), after the intra-class feature data are compressed, storing the compressed class feature data in dictionary-index form to generate the feature comparison library.
2. The data compression method based on neural network feature recognition of claim 1, characterized in that: the neural network is a convolutional neural network, a recurrent neural network or a deep neural network.
3. The data compression method based on neural network feature recognition of claim 1, wherein in step (2), if L_reg has too large or too small an influence, i.e. L_reg >> L_pre or L_reg << L_pre, a weighting factor α is applied to L_reg so that α·L_reg is of the same order of magnitude as the network's objective value L_pre, and S is adjusted to S = L_pre + α·L_reg.
4. The data compression method based on neural network feature recognition of claim 1, wherein in step (4), the class feature data are uniformly quantized to 8 bits.
5. The data compression method based on neural network feature recognition of claim 1, wherein in step (4), the intra-class feature data are sorted using Prim's algorithm for minimum spanning tree generation or a greedy algorithm.
CN202010126059.0A 2020-02-27 2020-02-27 Data compression method based on neural network feature recognition Active CN111368976B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010126059.0A CN111368976B (en) 2020-02-27 2020-02-27 Data compression method based on neural network feature recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010126059.0A CN111368976B (en) 2020-02-27 2020-02-27 Data compression method based on neural network feature recognition

Publications (2)

Publication Number Publication Date
CN111368976A (en) 2020-07-03
CN111368976B CN111368976B (en) 2022-09-02

Family

ID=71211537

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010126059.0A Active CN111368976B (en) 2020-02-27 2020-02-27 Data compression method based on neural network feature recognition

Country Status (1)

Country Link
CN (1) CN111368976B (en)

Citations (10)

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104281616A (en) * 2013-07-10 2015-01-14 北京旋极信息技术股份有限公司 Data processing method
CN105160400A (en) * 2015-09-08 2015-12-16 西安交通大学 L21 norm based method for improving convolutional neural network generalization capability
US20190095699A1 (en) * 2017-09-28 2019-03-28 Nec Laboratories America, Inc. Long-tail large scale face recognition by non-linear feature level domain adaption
CN107967516A (en) * 2017-10-12 2018-04-27 中科视拓(北京)科技有限公司 A kind of acceleration of neutral net based on trace norm constraint and compression method
CN109840530A (en) * 2017-11-24 2019-06-04 华为技术有限公司 The method and apparatus of training multi-tag disaggregated model
US20190294929A1 (en) * 2018-03-20 2019-09-26 The Regents Of The University Of Michigan Automatic Filter Pruning Technique For Convolutional Neural Networks
EP3570221A1 (en) * 2018-05-15 2019-11-20 Hitachi, Ltd. Neural networks for discovering latent factors from data
CN109635936A (en) * 2018-12-29 2019-04-16 杭州国芯科技股份有限公司 A kind of neural networks pruning quantization method based on retraining
CN109949437A (en) * 2019-03-13 2019-06-28 东北大学 Isomeric data based on rarefaction cooperates with industrial method for diagnosing faults
CN110807514A (en) * 2019-10-25 2020-02-18 中国科学院计算技术研究所 Neural network pruning method based on LO regularization

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YU CHENG ET AL: "Model Compression and Acceleration for Deep Neural Networks", IEEE *

Also Published As

Publication number Publication date
CN111368976B (en) 2022-09-02

Similar Documents

Publication Publication Date Title
Liang et al. Pruning and quantization for deep neural network acceleration: A survey
Zhou et al. Adaptive quantization for deep neural network
Li et al. Towards compact cnns via collaborative compression
US20210089922A1 (en) Joint pruning and quantization scheme for deep neural networks
US10984308B2 (en) Compression method for deep neural networks with load balance
CN110287983B (en) Single-classifier anomaly detection method based on maximum correlation entropy deep neural network
CN109445935B (en) Self-adaptive configuration method of high-performance big data analysis system in cloud computing environment
CN114169330A (en) Chinese named entity identification method fusing time sequence convolution and Transformer encoder
CN114493755B (en) Self-attention sequence recommendation method fusing time sequence information
CN109871749B (en) Pedestrian re-identification method and device based on deep hash and computer system
CN109918507B (en) textCNN (text-based network communication network) improved text classification method
CN110851654A (en) Industrial equipment fault detection and classification method based on tensor data dimension reduction
US20230073669A1 (en) Optimising a neural network
Shi et al. Structured Word Embedding for Low Memory Neural Network Language Model.
TW202006612A (en) Machine learning method and machine learning device
Zhang et al. Data Independent Sequence Augmentation Method for Acoustic Scene Classification.
CN110288002B (en) Image classification method based on sparse orthogonal neural network
CN111368976B (en) Data compression method based on neural network feature recognition
CN116578699A (en) Sequence classification prediction method and system based on Transformer
Huang et al. Flow of renyi information in deep neural networks
Wu et al. Mirex 2017 submission: Automatic audio chord recognition with miditrained deep feature and blstm-crf sequence decoding model
Zhang et al. Compressing knowledge graph embedding with relational graph auto-encoder
CN112735604A (en) Novel coronavirus classification method based on deep learning algorithm
CN112465054A (en) Multivariate time series data classification method based on FCN
Xu et al. Batch-normalization-based soft filter pruning for deep convolutional neural networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant