CN111368976A - Data compression method based on neural network feature recognition - Google Patents

Data compression method based on neural network feature recognition

Info

Publication number
CN111368976A
Authority
CN
China
Prior art keywords
data
model
class
feature
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010126059.0A
Other languages
Chinese (zh)
Other versions
CN111368976B (en)
Inventor
杨常星 (Yang Changxing)
梁骏 (Liang Jun)
钟宇清 (Zhong Yuqing)
宋蕴 (Song Yun)
宋一平 (Song Yiping)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Nationalchip Science & Technology Co ltd
Original Assignee
Hangzhou Nationalchip Science & Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Nationalchip Science & Technology Co ltd filed Critical Hangzhou Nationalchip Science & Technology Co ltd
Priority to CN202010126059.0A
Publication of CN111368976A
Application granted
Publication of CN111368976B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/082Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Abstract

The invention discloses a data compression method based on neural network feature recognition. The method first constrains the feature data of a neural network recognition model and then retrains the model. The retrained model is tested on a validation set; if the test result meets the requirements, the model parameters are frozen to obtain an inference model whose output feature data follow a distribution suited to subsequent compression. Class feature data are derived from the inference model and the library data. The class feature data are quantized, sorted in sequence by feature similarity, and stored as differences according to the sorted order, thereby compressing the class feature data. The compressed feature comparison library, combined with the inference model, is then deployed on actual devices. For models deployed on offline or mobile devices with limited storage, the method reduces the memory required by the library and the model without adding extra decoding computation, while preserving the model's robust recognition performance.

Description

Data compression method based on neural network feature recognition
Technical Field
The invention belongs to the technical field of computers, in particular the technical field of neural network recognition, and specifically relates to a data compression method based on neural network feature recognition.
Background
In recent years, deep learning techniques represented by neural networks have achieved great breakthroughs in many fields, but many challenges arise when the technology is actually deployed. For example, the parameter count and computation of a model may be too large for the available device memory, making actual deployment difficult; or the inference latency of the deployed model may be too long to meet real-time requirements. In particular, when edge computing tasks must be performed on an offline device, the storage capacity and storage cost of the device are limited and the storage space available to the model is often restricted, so reducing the memory required by the model is necessary.
Neural networks generally have a training mode and an inference mode, and what is actually deployed is the inference model. To make the inference model more robust and reduce the risk of overfitting, constraints such as dropout and regularization are usually added to the neural network model in training mode; these strategies constrain the weights of the model.
For a model based on neural network feature recognition, each class in the feature comparison library contains multiple features in practical use so that the model's recognition is more robust. Only when the input features to be matched reach a certain number of matches with the features in the comparison library is the successfully matched class output. That is, when such a model is actually deployed, additional storage space for the feature comparison library must be considered on top of the parameter size and computation memory of the neural network model. If the feature comparison library is too large, model deployment becomes very difficult. There are two ways to address this: on one hand, the memory of the device can be increased, but this raises device cost; on the other hand, the device may not be able to expand its memory at all, which requires compressing the memory needed by the model.
Many methods have been proposed for neural network model compression, including: model pruning (structured and unstructured), which requires searching for a reasonable pruning ratio; knowledge distillation, which requires reconstructing and retraining the model; tensor-decomposition-based methods, in which the decomposition itself is computationally expensive and the decomposed model is difficult to optimize to a global optimum; and directly designing lightweight neural networks. All of these methods operate on the network model itself.
For database compression, existing methods can achieve very high compression rates, but the higher the compression rate, the more complex the compression algorithm and the corresponding decoding process. If a complex compression method is adopted, extra decoding computation is added to the model's comparison task. This extra computation increases the latency of the inference model and, in the worst case, makes the model impossible to deploy in practice.
Disclosure of Invention
The invention aims to overcome the above deficiencies of the prior art and provides a data compression method based on neural network feature recognition. The method compresses the memory required by the feature comparison library without adding extra feature comparison computation.
The method specifically comprises the following steps:
Step (1), constraining the output feature data of the neural network recognition model;
the neural network is a Convolutional Neural Network (CNN), a Recurrent Neural Network (RNN) or a Deep Neural Network (DNN), and at least one layer of network characteristics is selected as output characteristic data.
For a convolutional neural network (CNN), the network layers may take the form of conventional convolution, grouped convolution or separable convolution.
Constraint conditions are selected according to the feature data: if the feature data are required to follow a normal distribution, an L2-norm constraint is used; if the feature data are required to follow a sparse distribution, an L0-norm or L1-norm constraint is used. L0, L1 and L2 denote vector norms.
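As an illustrative sketch only (PyTorch is assumed, and the function name feature_norm_penalty is hypothetical), the norm constraints on a batch of output feature data E could be computed as follows:

```python
import torch

def feature_norm_penalty(E: torch.Tensor, kind: str = "l2") -> torch.Tensor:
    """Hypothetical sketch of the norm constraint L_reg on output feature data E.

    E is assumed to be a batch of features of shape (batch, feature_dim).
    """
    E = E.flatten(1)
    if kind == "l2":                 # pushes feature values toward a normal-like distribution
        return E.norm(p=2, dim=1).mean()
    if kind == "l1":                 # encourages sparse feature values
        return E.abs().sum(dim=1).mean()
    if kind == "l0":                 # L0 is non-differentiable; shown only for completeness
        return (E != 0).float().sum(dim=1).mean()
    raise ValueError(f"unknown norm kind: {kind}")
```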
Step (2), performing model training on the neural network model after the output feature data have been constrained. The procedure is as follows:
The constraint value L_reg obtained by applying the norm constraint to the feature data is added to the model's total objective function S; the model's other constraint values are denoted L_pre, so that S = L_pre + L_reg. The training data are fed in and the model is retrained with this objective. After the retrained model is tested and evaluated on a validation set, when the condition |A1 − A2| < 0.01 is satisfied, the parameters of the retrained model are frozen and the model is used as the inference model; A1 is the accuracy before retraining and A2 is the accuracy after retraining.
If L_reg has too large or too small an influence, i.e. L_reg >> L_pre or L_reg << L_pre, a weighting factor α is applied to L_reg so that α·L_reg is of the same order of magnitude as the network's objective value L_pre, and S is adjusted to S = L_pre + α·L_reg.
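A minimal sketch of the resulting total objective S, assuming a classification model whose own loss L_pre is cross-entropy and reusing the hypothetical feature_norm_penalty above; α is the weighting factor described:

```python
import torch.nn.functional as F

def total_objective(logits, labels, features, alpha: float = 0.1):
    """S = L_pre + alpha * L_reg as described in step (2).

    L_pre: the model's own losses (cross-entropy is used here as an example).
    L_reg: the norm constraint on the output feature data.
    alpha: weighting factor chosen so the two terms are of comparable magnitude.
    """
    l_pre = F.cross_entropy(logits, labels)
    l_reg = feature_norm_penalty(features, kind="l2")   # from the earlier sketch
    return l_pre + alpha * l_reg
```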
Step (3), inputting the database data into the inference model and outputting the class feature data.
E_n denotes the intra-class features of the nth class and contains M intra-class data items, with n ∈ [1, N]. The database data therefore have a structure of size N × M; feeding them into the inference model yields class feature data with the same structure.
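A sketch of how the N × M class feature data might be generated from the library data, assuming each library sample yields one feature vector of length feature_dim (all names are illustrative):

```python
import numpy as np
import torch

@torch.no_grad()
def build_class_features(model, library, feature_dim: int) -> np.ndarray:
    """library: a list of N classes, each a list of M input tensors.

    Returns class feature data F of shape (N, M, feature_dim), mirroring the
    N x M structure of the database data described in step (3).
    """
    N, M = len(library), len(library[0])
    F = np.zeros((N, M, feature_dim), dtype=np.float32)
    model.eval()
    for n, samples in enumerate(library):
        for m, x in enumerate(samples):
            F[n, m] = model(x.unsqueeze(0)).squeeze(0).cpu().numpy()
    return F
```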
Step (4), compressing the intra-class feature data of the output class feature data.
The class feature data are first uniformly quantized to 8 bits; then, for each intra-class feature E_n in turn, the M corresponding intra-class data items are sorted in descending order of feature similarity.
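A sketch of the uniform 8-bit quantization, assuming a single scale and offset shared by the whole library (the text only specifies that quantization is uniform and 8-bit):

```python
import numpy as np

def quantize_uint8(F: np.ndarray):
    """Uniform 8-bit quantization of the class feature data F.

    Returns the quantized array plus the (scale, offset) needed so that query
    features can be quantized the same way at comparison time.
    """
    f_min, f_max = float(F.min()), float(F.max())
    scale = (f_max - f_min) / 255.0
    if scale == 0.0:
        scale = 1.0
    q = np.round((F - f_min) / scale).astype(np.uint8)
    return q, scale, f_min
```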
Feature similarity is represented by the number of 0s in the difference between two features: the more 0s, the higher the similarity; the fewer 0s, the lower the similarity.
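A direct sketch of this similarity measure on quantized features, matching the count_zeros formula given later in the embodiment:

```python
import numpy as np

def count_zeros(e_i: np.ndarray, e_j: np.ndarray) -> int:
    """Similarity of two quantized features: the number of zero entries in
    their difference (more zeros means more similar, and a cheaper stored diff)."""
    diff = e_i.astype(np.int16) - e_j.astype(np.int16)
    return int(np.count_nonzero(diff == 0))
```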
The intra-class feature data are sorted using Prim's algorithm for minimum spanning tree generation or a greedy algorithm.
The sorted intra-class features are then differenced in sequence and stored in compressed form as these differences. That is, when storing the feature data of each class, only the first intra-class feature is stored in full; each subsequently stored intra-class feature is the difference from the previous one.
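A sketch of the sorting and differential storage for one class, using a simple greedy nearest-neighbour ordering as a stand-in for the Prim/greedy ordering named above, and reusing the count_zeros sketch:

```python
import numpy as np

def compress_class(E_n: np.ndarray):
    """E_n: (M, feature_dim) uint8 intra-class features of one class.

    Returns the first feature stored in full plus M-1 signed difference
    vectors, following the ordering-then-differencing scheme of step (4).
    Uses count_zeros from the earlier sketch.
    """
    M = len(E_n)
    order = [0]                                   # start from the first feature
    remaining = list(range(1, M))
    while remaining:
        prev = order[-1]
        # greedily pick the remaining feature most similar to the previous one
        best = max(remaining, key=lambda m: count_zeros(E_n[m], E_n[prev]))
        order.append(best)
        remaining.remove(best)
    first = E_n[order[0]]
    diffs = [E_n[order[i - 1]].astype(np.int16) - E_n[order[i]].astype(np.int16)
             for i in range(1, M)]                # stored as (previous - current)
    return first, diffs
```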
Step (5), generating a feature comparison library from the compressed class feature data;
After the intra-class feature data are compressed, they are stored in dictionary-index form, i.e. class (key) → intra-class features (value), and the feature comparison library is generated.
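A sketch of the resulting dictionary-index layout, reusing the compress_class sketch above (the exact layout of the value is an assumption):

```python
def build_feature_library(F_q) -> dict:
    """F_q: (N, M, feature_dim) quantized class feature data.

    Returns the dictionary-index library {class_id: (first_feature, diffs)},
    i.e. class (key) -> compressed intra-class features (value).
    Uses compress_class from the earlier sketch.
    """
    return {n: compress_class(F_q[n]) for n in range(len(F_q))}
```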
The method of the invention compresses the feature comparison library by constraining the feature data during training so that the features output by the model follow a particular data distribution, which makes the feature library easier to compress. For a model deployed on an offline or mobile device with limited storage, the method reduces the storage size of the feature comparison library while preserving the model's robust recognition performance, and does not add the extra decoding computation a complex compression method would incur.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a flow chart of training in an embodiment;
FIG. 3 is a diagram of feature comparison library generation in the embodiment.
Detailed description of the preferred embodiments
The invention is described in detail below through embodiments with reference to the accompanying drawings. It should be noted that, unless otherwise specifically stated, the relative arrangements, numerical expressions, alphabetic expressions and numerical values given in the embodiments do not limit the scope of the invention; the embodiments are provided only to help the reader understand the invention. Technical methods well known to those skilled in the relevant art may not be described in detail; where appropriate, such techniques, methods and systems should be considered part of the specification.
For clarity, the invention is illustrated in detail by taking a neural network model A as an example; specific implementations are shown in FIGS. 1, 2 and 3.
FIG. 2 is a training flowchart for the neural network model A and mainly describes the training process of the model.
First, an L2-norm constraint on the model's feature data E is added to the model, i.e. L_reg = ||E||_2. The model's own cross-entropy loss and other constraints are denoted L_pre, and α is a hyperparameter that adjusts the strength of the constraint on the feature data E, giving S = L_pre + α·L_reg. The parameter values of a pre-trained model are loaded into the model, and the model is retrained using the back-propagation (BP) mechanism of deep learning.
After the model is retrained, it is evaluated on a validation set. When the condition |A1 − A2| < 0.01 is satisfied, the parameters of the retrained model are frozen to obtain the inference model; A1 is the accuracy before retraining and A2 is the accuracy after retraining.
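A trivial sketch of this acceptance check, with A1 and A2 as the accuracies before and after retraining:

```python
def accept_retrained_model(a1: float, a2: float, tol: float = 0.01) -> bool:
    """Freeze the retrained parameters only when |A1 - A2| < tol."""
    return abs(a1 - a2) < tol
```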
FIG. 3 is a flow chart of generating the feature comparison library based on model A, mainly describing the generation of the class feature data and their compression.
In the model inference mode, the library data are input into the inference model and the class feature data F are output. In multi-target recognition, to improve the robustness of model recognition, each class in the feature comparison library holds D features, and the target under test is considered successfully recognized only when K of them are matched, where 1 ≤ K ≤ D.
The class feature data F is the set of feature data of all classes. The structure of F can be described as {E_1, …, E_n, …, E_N}, where E_1 denotes the feature data of class 1 and, correspondingly, E_n denotes the feature data of class n. Each class contains M intra-class data items; for example, E_n can be described as {e_n1, …, e_nm, …, e_nM}, where e_nm denotes the mth intra-class feature of class n.
To further reduce the storage size of the feature comparison library, the feature data F are quantized; here F is quantized uniformly to 8 bits.
The quantized class feature data F are then compressed, which is essentially compression of each E_n.
For each intra-class feature E_n in turn, the M corresponding intra-class data items are sorted in descending order of feature similarity. Here, similarity refers to the number of accumulated 0 values after subtracting two feature data items; the formula is count_zeros(e_ni − e_nj), where count_zeros counts the number of 0s. The more 0s, the greater the similarity between the two features, and vice versa.
Prim's algorithm for minimum spanning tree generation is used here to sort the intra-class features E_n. For example, after the M feature data items of E_n are sorted, a structure {e_a, e_b, e_c, e_d, …} is obtained, where 1 ≤ a, b, c, d ≤ M and a, b, c, d are pairwise distinct.
After the intra-class features are sorted, they are stored as differences. That is, when storing the feature data of each class, only the first feature is stored in full; each subsequently stored feature is the difference from the previous feature. With E_n sorted as {e_a, e_b, e_c, e_d, …} above, the stored data are e_a, (e_a − e_b), (e_b − e_c), (e_c − e_d), …, giving compressed storage.
After the class feature data are compressed, they are stored in dictionary-index form, i.e. class n (key) → compressed intra-class features E_n (value), and the feature comparison library is generated.
Finally, the feature comparison library is obtained and, combined with the inference model, can be used for actual deployment of the model. Note that the class feature data are quantized when the feature comparison library is generated; therefore, when the inference model is actually used for inference, its output features must undergo the corresponding 8-bit quantization before the feature comparison is computed.
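A sketch of the matching side under the above assumptions: the stored differences are undone on the fly, the query feature is quantized with the same parameters as the library, and a class is accepted when at least K of its stored features match. The zero-count threshold is illustrative, since the text does not fix one; count_zeros is reused from the earlier sketch.

```python
import numpy as np

def iter_class_features(first: np.ndarray, diffs):
    """Undo the differential storage on the fly: e_next = e_prev - (e_prev - e_next)."""
    cur = first.astype(np.int16)
    yield cur
    for d in diffs:
        cur = cur - d
        yield cur

def match_query(query_q: np.ndarray, library: dict, K: int, zero_thresh: int) -> list:
    """Return ids of classes for which at least K stored features match the
    (already 8-bit quantized) query feature."""
    q = query_q.astype(np.int16)
    hits = []
    for class_id, (first, diffs) in library.items():
        matches = sum(count_zeros(q, f) >= zero_thresh
                      for f in iter_class_features(first, diffs))
        if matches >= K:
            hits.append(class_id)
    return hits
```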

Claims (5)

1. A data compression method based on neural network feature recognition, characterized by comprising the following steps:
step (1), constraining the output feature data of a neural network recognition model;
selecting at least one layer of network features as the output feature data; selecting constraint conditions according to the feature data: if the feature data are required to follow a normal distribution, an L2-norm constraint is used, and if the feature data are required to follow a sparse distribution, an L0-norm or L1-norm constraint is used;
step (2), performing model training on the neural network model after the output feature data have been constrained, as follows:
adding the constraint value L_reg obtained by applying the norm constraint to the feature data to the model's total objective function S, denoting the model's other constraint values L_pre, so that S = L_pre + L_reg; inputting the training data and retraining the model with this objective; after the retrained model is tested and evaluated on a validation set, when the condition |A1 − A2| < 0.01 is satisfied, freezing the parameters of the retrained model and using it as the inference model; A1 is the accuracy before retraining, A2 is the accuracy after retraining;
step (3), inputting the database data into the inference model and outputting the class feature data;
E_n denotes the intra-class features of the nth class and contains M intra-class data items, with n ∈ [1, N]; the database data have a structure of size N × M, and feeding them into the inference model yields class feature data with the same structure;
step (4), compressing the output class feature data;
the class feature data are uniformly quantized, and for each intra-class feature E_n in turn, the M corresponding intra-class data items are sorted in descending order of feature similarity;
the feature similarity is represented by the number of 0s in the difference between two features: the more 0s, the greater the similarity, and the fewer 0s, the lower the similarity;
differencing the sorted intra-class features in sequence and storing them in compressed form as these differences; when storing the feature data of each class, only the first intra-class feature is stored in full, and each subsequently stored intra-class feature is the difference from the previous one;
and step (5), after the intra-class feature data are compressed, storing the compressed class feature data in dictionary-index form to generate the feature comparison library.
2. The data compression method based on neural network feature recognition of claim 1, characterized in that: the neural network is a convolutional neural network, a recurrent neural network or a deep neural network.
3. The data compression method based on neural network feature recognition of claim 1, wherein in step (2), if L_reg has too large or too small an influence, i.e. L_reg >> L_pre or L_reg << L_pre, a weighting factor α is applied to L_reg so that α·L_reg is of the same order of magnitude as the network's objective value L_pre, and S is adjusted to S = L_pre + α·L_reg.
4. The data compression method based on neural network feature recognition of claim 1, wherein in step (4), the class feature data are uniformly quantized to 8 bits.
5. The data compression method based on neural network feature recognition of claim 1, wherein in step (4), the intra-class feature data are sorted using Prim's algorithm for minimum spanning tree generation or a greedy algorithm.
CN202010126059.0A 2020-02-27 2020-02-27 Data compression method based on neural network feature recognition Active CN111368976B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010126059.0A CN111368976B (en) 2020-02-27 2020-02-27 Data compression method based on neural network feature recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010126059.0A CN111368976B (en) 2020-02-27 2020-02-27 Data compression method based on neural network feature recognition

Publications (2)

Publication Number Publication Date
CN111368976A (en) 2020-07-03
CN111368976B CN111368976B (en) 2022-09-02

Family

ID=71211537

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010126059.0A Active CN111368976B (en) 2020-02-27 2020-02-27 Data compression method based on neural network feature recognition

Country Status (1)

Country Link
CN (1) CN111368976B (en)

Citations (10)

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104281616A (en) * 2013-07-10 2015-01-14 北京旋极信息技术股份有限公司 Data processing method
CN105160400A (en) * 2015-09-08 2015-12-16 西安交通大学 L21 norm based method for improving convolutional neural network generalization capability
US20190095699A1 (en) * 2017-09-28 2019-03-28 Nec Laboratories America, Inc. Long-tail large scale face recognition by non-linear feature level domain adaption
CN107967516A (en) * 2017-10-12 2018-04-27 中科视拓(北京)科技有限公司 A kind of acceleration of neutral net based on trace norm constraint and compression method
CN109840530A (en) * 2017-11-24 2019-06-04 华为技术有限公司 The method and apparatus of training multi-tag disaggregated model
US20190294929A1 (en) * 2018-03-20 2019-09-26 The Regents Of The University Of Michigan Automatic Filter Pruning Technique For Convolutional Neural Networks
EP3570221A1 (en) * 2018-05-15 2019-11-20 Hitachi, Ltd. Neural networks for discovering latent factors from data
CN109635936A (en) * 2018-12-29 2019-04-16 杭州国芯科技股份有限公司 A kind of neural networks pruning quantization method based on retraining
CN109949437A (en) * 2019-03-13 2019-06-28 东北大学 Isomeric data based on rarefaction cooperates with industrial method for diagnosing faults
CN110807514A (en) * 2019-10-25 2020-02-18 中国科学院计算技术研究所 Neural network pruning method based on LO regularization

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YU CHENG ET AL: "Model Compression and Acceleration for Deep Neural Networks", IEEE *

Also Published As

Publication number Publication date
CN111368976B (en) 2022-09-02

Similar Documents

Publication Publication Date Title
Liang et al. Pruning and quantization for deep neural network acceleration: A survey
Zhou et al. Adaptive quantization for deep neural network
Li et al. Towards compact cnns via collaborative compression
US20210089922A1 (en) Joint pruning and quantization scheme for deep neural networks
US10984308B2 (en) Compression method for deep neural networks with load balance
CN110287983B (en) Single-classifier anomaly detection method based on maximum correlation entropy deep neural network
CN109445935B (en) Self-adaptive configuration method of high-performance big data analysis system in cloud computing environment
CN114169330A (en) Chinese named entity identification method fusing time sequence convolution and Transformer encoder
CN114493755B (en) Self-attention sequence recommendation method fusing time sequence information
CN109871749B (en) Pedestrian re-identification method and device based on deep hash and computer system
CN109918507B (en) textCNN (text-based network communication network) improved text classification method
CN110851654A (en) Industrial equipment fault detection and classification method based on tensor data dimension reduction
US20230073669A1 (en) Optimising a neural network
Shi et al. Structured Word Embedding for Low Memory Neural Network Language Model.
TW202006612A (en) Machine learning method and machine learning device
Zhang et al. Data Independent Sequence Augmentation Method for Acoustic Scene Classification.
CN110288002B (en) Image classification method based on sparse orthogonal neural network
CN111368976B (en) Data compression method based on neural network feature recognition
CN116578699A (en) Sequence classification prediction method and system based on Transformer
Huang et al. Flow of renyi information in deep neural networks
Wu et al. Mirex 2017 submission: Automatic audio chord recognition with miditrained deep feature and blstm-crf sequence decoding model
Zhang et al. Compressing knowledge graph embedding with relational graph auto-encoder
CN112735604A (en) Novel coronavirus classification method based on deep learning algorithm
CN112465054A (en) Multivariate time series data classification method based on FCN
Xu et al. Batch-normalization-based soft filter pruning for deep convolutional neural networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant